Octrees, Occlusion Culling, and Solid
Node BSP's
Introduction:
Well,
the title pretty much says it all. This tutorial will take you through the
process of creating a hybrid 3d engine that uses octrees
for rendering, along with occlusion and view frustrum culling for HSR
( hidden surface removal ), and runtime compiled solid
node BSP's within the octree leaves for collision detection, line of
sight testing, and correct sorting of translucent polygons. Don't worry if any
of that sounds intimidating, each of these processes will be described in some
detail by the end of this tutorial. The main motivation for writing this is to
expose people to some lesser documented visibility solutions, and to
perhaps spark creativity in mixing techniques to divide and conquor the goals
of your 3d engine. First, here are some introductions to the techniques that we
will put to use.
Octrees:
I
suppose it would be proper to start a discussion of octrees with a
quick overview of spatial partitioning in general. The idea behind spatial
partitioning is that it's preferable to perform operations ( such as collision
detection or rendering ) only on the geometry that is relevent. Spatial
partitioning basically carves up a polygonal mesh ( ie. your game
level. I will use "mesh" from here on out ) into
managable chunks. There are many ways to do this, such as dividing
your mesh into an even grid, but the most useful spatial partitioning
techniques are those that are hierarchical in nature. Enter the octree. Note
that this is only going to be a refresher. Hopefully you've already read about
octrees before, but maybe you haven't implemented them yet. Ok, like most
hierarchical spatial partitioning techniques, the octree is very recursive.
When generating the octree from your mesh, at every iteration of the
recursive build function, the following is done:
The
ordering of this, and the way you carry out these steps is entirely up to
you, but for this tutorial we are going to use the algorithm above which was
layed out by Kurt Miller ( Click
here ) with only a slight modification that is discussed later. Now,
for some specifics:
Generating bounding
boxes for the 8 child nodes. The
main feature of an octree is that it divides the mesh into 8 smaller chunks
recursively. Another way to look at it is that is divides the mesh on all three
axis with three split planes, creating 8 AABB’s ( axis aligned bounding boxes
). The bounding boxes for these nodes can be generated many different ways. For
instance, you could take the median of all the poly's for each axis and create
the splitters there. Or you could just create 8 equal sized cubes.
Test each poly for
containment in each child node’s bounding box. This is a simple poly in AABB test. A problem
arises however, when the poly is partially in more than one bounding box (
sorry, I’ve been using ‘bounding box’ and ‘AABB’ interchangably, the bounding
boxes are assumed to be axis aligned though ). One technique is to split the
poly into fragments and insert those fragments into the child nodes that
contain them. Another technique is to insert the poly into all nodes that
partially contain it, and store a bool for each poly to flag whether it has
been processed or not so it won’t get rendered more than once.
The
method we will use is a little unorthodox though, and it avoids splitting
and simplifies rendering a bit. The bounding boxes for the 8 child nodes are
first generated as 8 equal sized cubes. As the poly’s are tested for
containment in the child nodes bounding boxes, if they are partially contained
they will be removed from the list so that no other child nodes add this poly
to their list even if they partially contain them. This could of course cause
problems when rendering, for example if a node is culled that had a poly that
was partially in another node, and that other node is partially visible, there
might be a dropout. This is fixed as described in the next section.
Test for ending
conditions. The recursion of course has to stop at some point,
and here are some of the most often used conditions. You may want to keep the
node size from going below a certain size, say 8x8x8 or whatever would best
suit the mesh and its unit scale. You could also stop recursion when the node
contains less than a set number of poly’s, say somewhere between 200-500. This
number is best found by trial and error, I’ve found that around 300 is a good
number for fast rendering of a high poly mesh. Another condition you could test
for is recursion depth. This makes sure you don’t have to descend too far into
the tree just to find a leaf. This is the same as limiting the node size, it’s
just another way of doing that. When one of these conditions is met, instead of
calling the build function again for each child node, this node is marked as
having no children ( making it a leaf ), and the poly list that was passed to
this node is stored at this node. Now, since this node may have poly’s that
were only partially contained as described in the previous section, this node
must then adjust it’s bounding box, resizing it to make sure it completely
contains all of its poly’s, eliminating any rendering artifacts. This makes for
a somewhat funky looking octree, but in practice it’s worth it do to the fact
that it simplifies the rendering code, and it does away with splitting. Here’s
a picture of this type of octree:
[
Goofy octree pic ]
So what good does this
octree do? First of all it’s splitless, so the original mesh
stays intact. Also there are no duplicate poly’s, so it doesn’t taint the
rendering code with per poly ‘hasBeenRendered’ checks. Most importantly though,
it creates a hierarchy of AABB’s, and for rendering this can be used to cull
away large chunks of poly’s that are outside of the view frustrum. Do to the
fact that the AABB’s are hierarchical, if a parent node’s AABB is outside of
the view frustrum, so are all it’s child nodes. If a parent node is partially
visible, each of it’s child nodes becomes the parent node in the same test
recursively, until a completely visible node is found, in which case it renders
all contained leaves.
Occlusion Culling:
(
Note: the culling methods I discuss are conservative. In other words, they
don’t solve exact visibility. I assume the use of a z buffer for that )
Of
course there is a problem with the rendering method described above. While it
does cull away nodes that aren’t in the view frustrum, it does nothing about
the fact that some nodes in the view frustrum will obscure others. This can
result in a lot of overdraw. Occlusion culling methods attempt to reduce this
overdraw by avoiding rendering polygons that are totally obscured by other
polygons. Actually, I shouldn’t have used “polygons” to describe occluders and
occludees. In the interest of speed, the occlusion culling technique I show you
later has occluders that are generally a simple convex polygon rather than
complex higher poly objects, and the occludees are tested using only their bounding volume instead of checking
visibility per poly which would be too slow. Here I’ll quickly describe some
popular techniques ( over simplifications really ) that we won’t be using.
Portals.
Here the focus is actually shifted to the areas of the mesh that don’t have
polygons. More specifically, portals are usually polygons generated from the
gaps between sets of convex sub-spaces. You can picture this as two rooms
connected by a doorway. That doorway would form a portal. First the renderer
would draw the room the viewer is in. Then a frustrum would be created using
the portal polygon. Any object in the second room that is outside of this
frustrum ( remember, only the bounding volumes are tested, not polygons ) would be culled. Any object at least
partially within this frustrum would be rendered.
PVS ( potentially visible set ). This
method usually generates portals as discussed above, but instead of using them
at runtime, they are used to precompute for each convex sub-space ( leaf ) the
set of every other leaf that is at least partially visible from it. At runtime,
the renderer determines which leaf the viewer is in, and it draws all the
leaves that were marked as being potentially visible from that leaf.
Hierarchical Z Buffering, and
Hierarchical Occlusion Maps ( HOM ). The basic idea is to render the occluders,
and test the bounding volume of an occludee against this silhouette and it’s z
value. Both of these methods generally involve reading back from the frame
buffer, which is too slow to be useful. HOM can still be useful if the
silhouette rendering is done in software however, as a modified version of this
algorithm using a sort of silhouette rasterizer is used by Hybrids dPvs engine,
which works very well.
Now we get to the type of occlusion culling that
this tutorial focuses on.
Shadow culling. The
technique we are going to exploit uses a large polygon as an occluder, and an
AABB as an occludee. The first step is to determine which occludees are on the
opposite side of the occluder from the viewpoint. This can be done by finding
the closest point on the AABB to the viewer, and using the dot product to
determine if the viewpoint and this closest AABB point are on opposite sides of
the plane the occluder lies on. From here, there are at least two methods of
finding out whether the occluder blocks the occludee. One method creates a
frustrum using the view point and the edges of the occluder polygon to create
the side planes, and the plane of the occluder polygon for the near plane. If
the occludees bounding box is completely within the side planes of the
frustrum, and in front of the near plane, it is in the occluders shadow and can
be culled. See the picture below if you’re having trouble visualizing this:
[
Shadow frustrum pic ]
The problem with this is that we are using a hierarchy of AABB’s, so we need to know not only if the occludee is completely in the shadow, but if the occludee is partially shadowed as well, so it can recursively perform this test on it’s child nodes. This means that all cases of partial containment of an AABB in a frustrum must be considered, and if you are doing this test for an average of 10 occluders, 25 occludees per occluder, that’s 250 times you have to do this containment test each frame. Ok, I may be exaggerating the impact of this on performance a little, but that’s only because there is a better way! J
Separating and supporting planes. While
the prevous method used the relationship between the viewpoint and the occluder
to determine if the occludee was in the occluders shadow, this method uses the
relationship between the occluder and the occludee to determine if the
viewpoint is in the occluders shadow. This is done by creating separating
planes, and supporting planes. The separating and supporting planes are formed
by taking an edge of the occluder, and a corner of the AABB. For separating
planes, the planes must have the property that the occluder and occludee are
completely on opposite sides of the plane. For the supporting planes, the occluder
and occludee must be on the same side of the planes. If the view point is in
front of the entire set of supporting
planes, the occludee is completely occluded. The same test performed with the
separating planes will determine whether an occludee is at least partially
occluded. I’m sure you can immediately see why this goes perfectly with our
octree. If a node is not completely occluded, test if it is at least partially
occluded. If it is, recursively perform this test on all child nodes. If it was
completely occluded however, mark the node as not visible. You should cache
these planes for at least partially occluded nodes since the planes are view
independant, so they don’t have to be reconstructed each frame. This can
greatly speed up the occlusion culling process. Check out the picture below if
this is a bit hazy:
[
Supporting/separating plane pic ]