Octrees, Occlusion Culling, and Solid Node BSP's

 

Introduction:

Well, the title pretty much says it all. This tutorial will take you through the process of creating a hybrid 3d engine that uses octrees for rendering, along with occlusion and view frustrum culling for HSR ( hidden surface removal ), and runtime compiled solid node BSP's within the octree leaves for collision detection, line of sight testing, and correct sorting of translucent polygons. Don't worry if any of that sounds intimidating, each of these processes will be described in some detail by the end of this tutorial. The main motivation for writing this is to expose people to some lesser documented visibility solutions, and to perhaps spark creativity in mixing techniques to divide and conquor the goals of your 3d engine. First, here are some introductions to the techniques that we will put to use.

Octrees:

I suppose it would be proper to start a discussion of octrees with a quick overview of spatial partitioning in general. The idea behind spatial partitioning is that it's preferable to perform operations ( such as collision detection or rendering ) only on the geometry that is relevent. Spatial partitioning basically carves up a polygonal mesh ( ie. your game level. I will use "mesh" from here on out ) into managable chunks. There are many ways to do this, such as dividing your mesh into an even grid, but the most useful spatial partitioning techniques are those that are hierarchical in nature. Enter the octree. Note that this is only going to be a refresher. Hopefully you've already read about octrees before, but maybe you haven't implemented them yet. Ok, like most hierarchical spatial partitioning techniques, the octree is very recursive. When generating the octree from your mesh, at every iteration of the recursive build function, the following is done:

The ordering of this, and the way you carry out these steps is entirely up to you, but for this tutorial we are going to use the algorithm above which was layed out by Kurt Miller ( Click here ) with only a slight modification that is discussed later. Now, for some specifics:

Generating bounding boxes for the 8 child nodes. The main feature of an octree is that it divides the mesh into 8 smaller chunks recursively. Another way to look at it is that is divides the mesh on all three axis with three split planes, creating 8 AABB’s ( axis aligned bounding boxes ). The bounding boxes for these nodes can be generated many different ways. For instance, you could take the median of all the poly's for each axis and create the splitters there. Or you could just create 8 equal sized cubes.

Test each poly for containment in each child node’s bounding box. This is a simple poly in AABB test. A problem arises however, when the poly is partially in more than one bounding box ( sorry, I’ve been using ‘bounding box’ and ‘AABB’ interchangably, the bounding boxes are assumed to be axis aligned though ). One technique is to split the poly into fragments and insert those fragments into the child nodes that contain them. Another technique is to insert the poly into all nodes that partially contain it, and store a bool for each poly to flag whether it has been processed or not so it won’t get rendered more than once.

 The method we will use is a little unorthodox though, and it avoids splitting and simplifies rendering a bit. The bounding boxes for the 8 child nodes are first generated as 8 equal sized cubes. As the poly’s are tested for containment in the child nodes bounding boxes, if they are partially contained they will be removed from the list so that no other child nodes add this poly to their list even if they partially contain them. This could of course cause problems when rendering, for example if a node is culled that had a poly that was partially in another node, and that other node is partially visible, there might be a dropout. This is fixed as described in the next section.

Test for ending conditions. The recursion of course has to stop at some point, and here are some of the most often used conditions. You may want to keep the node size from going below a certain size, say 8x8x8 or whatever would best suit the mesh and its unit scale. You could also stop recursion when the node contains less than a set number of poly’s, say somewhere between 200-500. This number is best found by trial and error, I’ve found that around 300 is a good number for fast rendering of a high poly mesh. Another condition you could test for is recursion depth. This makes sure you don’t have to descend too far into the tree just to find a leaf. This is the same as limiting the node size, it’s just another way of doing that. When one of these conditions is met, instead of calling the build function again for each child node, this node is marked as having no children ( making it a leaf ), and the poly list that was passed to this node is stored at this node. Now, since this node may have poly’s that were only partially contained as described in the previous section, this node must then adjust it’s bounding box, resizing it to make sure it completely contains all of its poly’s, eliminating any rendering artifacts. This makes for a somewhat funky looking octree, but in practice it’s worth it do to the fact that it simplifies the rendering code, and it does away with splitting. Here’s a picture of this type of octree:

[ Goofy octree pic ]

So what good does this octree do? First of all it’s splitless, so the original mesh stays intact. Also there are no duplicate poly’s, so it doesn’t taint the rendering code with per poly ‘hasBeenRendered’ checks. Most importantly though, it creates a hierarchy of AABB’s, and for rendering this can be used to cull away large chunks of poly’s that are outside of the view frustrum. Do to the fact that the AABB’s are hierarchical, if a parent node’s AABB is outside of the view frustrum, so are all it’s child nodes. If a parent node is partially visible, each of it’s child nodes becomes the parent node in the same test recursively, until a completely visible node is found, in which case it renders all contained leaves.

 

Occlusion Culling:

( Note: the culling methods I discuss are conservative. In other words, they don’t solve exact visibility. I assume the use of a z buffer for that )

Of course there is a problem with the rendering method described above. While it does cull away nodes that aren’t in the view frustrum, it does nothing about the fact that some nodes in the view frustrum will obscure others. This can result in a lot of overdraw. Occlusion culling methods attempt to reduce this overdraw by avoiding rendering polygons that are totally obscured by other polygons. Actually, I shouldn’t have used “polygons” to describe occluders and occludees. In the interest of speed, the occlusion culling technique I show you later has occluders that are generally a simple convex polygon rather than complex higher poly objects, and the occludees are  tested using only their bounding volume instead of checking visibility per poly which would be too slow. Here I’ll quickly describe some popular techniques ( over simplifications really ) that we won’t be using.

Portals. Here the focus is actually shifted to the areas of the mesh that don’t have polygons. More specifically, portals are usually polygons generated from the gaps between sets of convex sub-spaces. You can picture this as two rooms connected by a doorway. That doorway would form a portal. First the renderer would draw the room the viewer is in. Then a frustrum would be created using the portal polygon. Any object in the second room that is outside of this frustrum ( remember, only the bounding volumes are tested, not polygons )  would be culled. Any object at least partially within this frustrum would be rendered.

PVS ( potentially visible set ). This method usually generates portals as discussed above, but instead of using them at runtime, they are used to precompute for each convex sub-space ( leaf ) the set of every other leaf that is at least partially visible from it. At runtime, the renderer determines which leaf the viewer is in, and it draws all the leaves that were marked as being potentially visible from that leaf.

 

Hierarchical Z Buffering, and Hierarchical Occlusion Maps ( HOM ). The basic idea is to render the occluders, and test the bounding volume of an occludee against this silhouette and it’s z value. Both of these methods generally involve reading back from the frame buffer, which is too slow to be useful. HOM can still be useful if the silhouette rendering is done in software however, as a modified version of this algorithm using a sort of silhouette rasterizer is used by Hybrids dPvs engine, which works very well.

 

Now we get to the type of occlusion culling that this tutorial focuses on.

 

Shadow culling. The technique we are going to exploit uses a large polygon as an occluder, and an AABB as an occludee. The first step is to determine which occludees are on the opposite side of the occluder from the viewpoint. This can be done by finding the closest point on the AABB to the viewer, and using the dot product to determine if the viewpoint and this closest AABB point are on opposite sides of the plane the occluder lies on. From here, there are at least two methods of finding out whether the occluder blocks the occludee. One method creates a frustrum using the view point and the edges of the occluder polygon to create the side planes, and the plane of the occluder polygon for the near plane. If the occludees bounding box is completely within the side planes of the frustrum, and in front of the near plane, it is in the occluders shadow and can be culled. See the picture below if you’re having trouble visualizing this:

 

[ Shadow frustrum pic ]

 

The problem with this is that we are using a hierarchy of AABB’s, so we need to know not only if the occludee is completely in the shadow, but if the occludee is partially shadowed as well, so it can recursively perform this test on it’s child nodes. This means that all cases of partial containment of an AABB in a frustrum must be considered, and if you are doing this test for an average of 10 occluders, 25 occludees per occluder, that’s 250 times you have to do this containment test each frame. Ok, I may be exaggerating the impact of this on performance a little, but that’s only because there is a better way! J

 

Separating and supporting planes. While the prevous method used the relationship between the viewpoint and the occluder to determine if the occludee was in the occluders shadow, this method uses the relationship between the occluder and the occludee to determine if the viewpoint is in the occluders shadow. This is done by creating separating planes, and supporting planes. The separating and supporting planes are formed by taking an edge of the occluder, and a corner of the AABB. For separating planes, the planes must have the property that the occluder and occludee are completely on opposite sides of the plane. For the supporting planes, the occluder and occludee must be on the same side of the planes. If the view point is in front of  the entire set of supporting planes, the occludee is completely occluded. The same test performed with the separating planes will determine whether an occludee is at least partially occluded. I’m sure you can immediately see why this goes perfectly with our octree. If a node is not completely occluded, test if it is at least partially occluded. If it is, recursively perform this test on all child nodes. If it was completely occluded however, mark the node as not visible. You should cache these planes for at least partially occluded nodes since the planes are view independant, so they don’t have to be reconstructed each frame. This can greatly speed up the occlusion culling process. Check out the picture below if this is a bit hazy:

 

[ Supporting/separating plane pic ]