
Visibility Sorting and Compositing for Image-Based Rendering

John Snyder and Jed Lengyel

April 2, 1997

Technical Report MSR-TR-97-11

Microsoft Research, Advanced Technology Division
Microsoft Corporation, One Microsoft Way
Redmond, WA 98052


Abstract

We present an efficient algorithm for analyzing synthetic scenes involving a set of moving objects and changing view parameters into a sorted sequence of image layers. The final image is generated by evaluating a 2D image compositing expression on these layers. Unlike previous visibility approaches, we detect and report occlusion cycles rather than splitting to remove them. Such an algorithm has many applications in computer graphics; we demonstrate two: rendering acceleration using image interpolation, and visibility-correct depth of field using image blurring.

The algorithm involves an incremental method for identifying mutually occluding sets of objects and computing a visibility sort of these sets. Occlusion queries are accelerated by testing on convex bounding hulls; less conservative tests are also discussed. Kd-trees formed by combinations of directions in object or image space provide an initial cull on potential occluders, and a new incremental algorithm is employed to resolve pairwise occlusions. Mutual occluders are further analyzed to generate an image compositing expression; in the case of nonbinary cycles an expression can always be generated without aggregating the objects into a single layer. Results demonstrate that the algorithm is practical for interactive animation of scenes involving hundreds of objects each comprising hundreds or thousands of polygons.

1 Introduction

The problem addressed by this paper is: given dynamic geometry and viewpoint, how can parts of the geometry be efficiently sorted into layers so that the final image is realizable by 2D image operations on the set of images for each layer? Applications of the above problem include:

1. image-based rendering acceleration – by using image warping techniques rather than re-rendering to approximate appearance, rendering resources can be conserved [Shade96, Schaufler96, Torborg96, Lengyel97].

2. image stream compression – segmenting a synthetic image stream into visibility-sorted layers yields greater compression by exploiting the greater coherence present in the segmented layers [Wang94, Ming97].

3. fast special effects generation – effects such as motion blur and depth-of-field can be efficiently computed via image post-processing techniques [Potmesil81, Potmesil83, Max85, Rokita93]. Visibility sorting corrects errors due to the lack of information on occluded surfaces in [Potmesil81, Potmesil83, Rokita93] (see [Cook84] for a discussion of these errors), and uses a correct visibility sort instead of the simple depth sort proposed in [Max85].

4. animation playback with selective display – by storing the image layers associated with each object and the image compositing expression for these layers, layers may be selectively added or removed when playing back the recorded animation to allow interactive insertion/deletion of scene elements without re-rendering.

5. animation design preview – selected layers can also represent modified elements of an animation which can be previewed quickly without re-rendering the unchanged elements.

6. incorporation of external image streams – external image streams, such as hand-drawn character animation, recorded video, or off-line rendered images, can be inserted into a 3D animation using a geometric proxy which is ordered along with the 3D synthetic elements, but drawn using the 2D image stream.

7. rendering with transparency – while standard z-buffers fail to properly render arbitrarily-ordered transparent objects, visibility sorting solves this problem provided the (possibly grouped) objects themselves can be properly rendered.

8. fast hidden-line rendering – by factoring the geometry into sorted layers, we reduce the hidden line problem [Appel67, Markosian97] into a set of much simpler problems. Rasterized hidden line renderings of occluding layers simply overwrite occluded layers beneath.

9. rendering without or with reduced z-buffer use – the software visibility sort produced by the methods described here allows elimination or reduced use of hardware z-buffers. Z-buffer resolution can also be targeted to the extent of small groups of objects, rather than the whole scene's.

To understand the usefulness of visibility sorting, we briefly focus on rendering acceleration. The goal is to render each coherent object at the appropriate spatial and temporal resolution and interpolate with image warps between renderings. Several approaches have been used to compose the set of object images. We use pure image sprites without z information,


Figure 1: This configuration can be visibility sorted even though no non-splitting partitioning plane exists.

requiring software visibility sorting [Lengyel97]. Another approach caches images with sampled (per-pixel) z information [Molnar92, Regan94, Mark97, Schaufler97], but incurs problems with antialiasing and depth uncovering (disocclusion). A third approach is to use texture-mapped geometric impostors like single quadrilaterals [Shade96, Schaufler96] or polygonal meshes [Maciel95, Sillion97]. Such approaches use complex 3D rendering rather than simple 2D image transformations and require geometric impostors suitable for visibility determination, especially demanding for dynamic scenes. By separating visibility determination from appearance approximation, we exploit the simplest appearance representation (a 2D image without z) and warp (affine transformation), without sacrificing correct visibility results.
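To suggest how little machinery such a sprite warp needs, the following Python sketch inverse-maps each destination pixel through a 2×3 affine matrix with nearest-neighbour sampling. This is an illustrative example only, not the paper's implementation; the same-size canvas and `None` for uncovered pixels are simplifying assumptions.

```python
def affine_warp(src, M):
    """Warp a small 2-D 'sprite' by a 2x3 affine matrix M (mapping src -> dst).

    Uses inverse mapping with nearest-neighbour sampling: each destination
    pixel is mapped back through the inverse of M to find its source sample.
    src is a list of rows of pixel values; pixels mapping outside src get None.
    """
    (a, b, tx), (c, d, ty) = M
    det = a * d - b * c                       # warp must be invertible
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    h, w = len(src), len(src[0])
    out = [[None] * w for _ in range(h)]      # same-size canvas for simplicity
    for y in range(h):
        for x in range(w):
            sx = ia * (x - tx) + ib * (y - ty)    # inverse-map dst -> src
            sy = ic * (x - tx) + id_ * (y - ty)
            xi, yi = round(sx), round(sy)
            if 0 <= xi < w and 0 <= yi < h:
                out[y][x] = src[yi][xi]
    return out

# Shifting a 2x2 sprite one pixel right leaves the first column uncovered.
sprite = [[1, 2],
          [3, 4]]
shifted = affine_warp(sprite, [[1, 0, 1], [0, 1, 0]])  # → [[None, 1], [None, 3]]
```

In practice the warp would be filtered rather than nearest-neighbour, but the inverse-mapping structure is the same.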

In our approach, the content author identifies geometry that forms the lowest level layers, called parts. Parts (e.g., a tree, car, space vehicle, or joint in an articulated figure) typically contain many polygons and form a perceptual object or object component that is expected to have coherent motion. Very large continuous objects, like terrain, are a priori split into component objects to detect occlusions efficiently and control growth of occlusion cycles. At runtime, for every frame, the visibility relations between parts are incrementally analyzed to generate a sorted list of layers, each containing one or more parts, and an image compositing expression (from [Porter84]) on these layers that produces the final image. We assume the renderer can correctly produce hidden-surface-eliminated images for each layer when necessary, regardless of whether the layer contains one or more parts.

Once defined, our approach never splits parts at run-time as in BSP-tree or octree visibility algorithms; the parts are rendered alone or in groups. There are two reasons for this. First, real-time software visibility sorting is practical for hundreds of parts but not for the millions of polygons they contain. Second, splitting is undesirable and often unnecessary. In dynamic situations, the number of splits and their location in image space varies as the corresponding split object or other objects in its environment move. Not only is this a major computational burden, but it also destroys coherence, thereby reducing the reuse rate in image-based rendering acceleration or the compression ratio in a layered image stream.

BSP and octree decompositions require global separating planes, which induce unnecessary splitting, even though a global separating plane is not required for a valid visibility ordering (Figure 1). We use pairwise occlusion tests between convex hulls or unions of convex hulls around each part. Such visibility testing is conservative, since it fills holes in objects and counts intersections as occlusions even when the intersection occurs only in the invisible (backfacing) part of one object. This compromise permits fast sorting and, in practice, does not cause undue occlusion cycle growth. Less conservative special cases can also be developed, such as between a sphere/cylinder joint (see Appendix C). Our algorithm always finds a correct visibility sort if it exists, with respect to a pairwise occlusion test, or aggregates mutual occluders and sorts among the resulting groups. Moreover, we show that splitting is unnecessary even in the presence of occlusion cycles having no mutually occluding pairs (i.e., no binary occlusion cycles).

The main contribution of this work is the identification of a new and useful problem in computer graphics, that of visibility sorting and occlusion cycle detection on dynamic, multi-polygon objects without splitting, and the description of a fast algorithm for its solution. We introduce the notion of an occlusion graph, which defines the "layerability" criterion using pairwise occlusion relations without introducing unnecessary global partitioning planes. We present a fast method for occlusion culling, and a hybrid incremental algorithm for performing occlusion testing on convex bounding polyhedra. We show how non-binary occlusion cycles can be dynamically handled without grouping the participating objects, by compiling and evaluating an appropriate image compositing expression. We also show how binary occlusion cycles can be eliminated by pre-splitting geometry. Finally, we demonstrate the practicality of these ideas in several situations and applications. Visibility sorting of collections of several hundred parts can be computed at more than 60Hz on a PC.


2 Previous Work

The problem of visibility has many guises. Recent work has considered invisibility culling [Greene93, Zhang97], analytic hidden surface removal [McKenna87, Mulmuley89, Naylor92], and global visibility [Teller93, Durand97]. The problem we solve, production of a layered decomposition that yields the hidden-surface eliminated result, is an old problem in computer graphics of particular importance before hardware z-buffers became widely available [Schumacker69, Newell72, Sutherland74, Fuchs80]. In our approach, we do not wish to eliminate occluded surfaces, but to find the correct layering order, since occluded surfaces in the current frame might be revealed in the next. Unlike the early work, we handle dynamic, multi-polygon objects without splitting; we call this variant of the visibility problem non-splitting layered decomposition.

Much previous work in visibility focuses on walkthroughs of static scenes, but a few do consider dynamic situations. [Sudarsky96] uses octrees for the invisibility culling problem, while [Torres90] uses dynamic BSP trees to compute a visibility ordering on all polygons in the scene. Neither technique treats the non-splitting layered decomposition problem, and the algorithms of [Torres90] remain impractical for real-time animations. Visibility algorithms can not simply segregate the dynamic and static elements of the scene and process them independently. A dynamic object can form an occlusion cycle with static objects that were formerly orderable. Our algorithms detect such situations without expending much computation on static components.

To accelerate occlusion testing, we use a spatial hierarchy (kd-tree) to organize parts. Such structures (octrees, bintrees, kd-trees, and BSP trees) are a staple of computer graphics algorithms [Fuchs80, Teller91, Funkhouser92, Naylor92, Greene93, Sudarsky96, Shade96]. Our approach generalizes octrees [Greene93, Sudarsky96] and 3D kd-trees [Teller91, Funkhouser92, Shade96] in that it allows a fixed but arbitrarily chosen number of directions in both object and image space. This allows maximum flexibility to tightly bound scenes with a few directions. Our hierarchy is also dynamic, allowing fast rebalancing, insertion, and deletion of objects.

Collision and occlusion detection are similar. We use convex bounding volumes to accelerate occlusion testing, as in [Baraff90, Cohen95, Ponamgi97], and track extents with vertex descent on the convex polyhedra, as in [Cohen95] (although this technique is generalized to angular, or image space, extent tracking as well as spatial). Still, occlusion detection has several peculiarities, among them that an object A can occlude B even if they are nonintersecting, or in fact, very far apart. For this reason, the sweep and prune technique of [Cohen95, Ponamgi97] is inapplicable to occlusion detection. We instead use kd-trees that allow dynamic deactivation of objects as the visibility sort proceeds. Pairwise collision of convex bodies can be applied to occlusion detection; we hybridize techniques from [Chung96a, Chung96b, Gilbert88].

The work of [Max85] deserves special mention as an early example of applying visibility sorting and image compositing to special effects generation. Our work develops the required sorting theory and algorithms.

3 Occlusion Graphs

The central notion for our visibility sorting algorithms is the pairwise occlusion relation. We use the notation A →_E B, meaning object A occludes object B with respect to eye point E. Mathematically, this relation signifies that there exists a ray emanating from E such that the ray intersects A and then B. It is useful notationally to make the dependence on the eye point implicit, so that A → B means that A occludes B with respect to an implicit eye point. The arrow "points" to the object that is occluded. The notation A ↛ B means that A does not occlude B.

The set of occlusion relations between pairs of the n parts comprising the entire scene forms a directed graph, called the occlusion graph. This notion of occlusion graph is very similar to the priority graph of [Schumacker69], but uses actual occlusion of the objects rather than the results of plane equation tests for pairwise separating planes chosen a priori (view independently). Figure 2 illustrates some example occlusion graphs. When the directed occlusion graph is acyclic, visibility sorting is equivalent to topological sorting of the occlusion graph, and produces a (front-to-back) ordering of the objects ⟨O₁, O₂, …, Oₙ⟩ such that i < j implies Oⱼ ↛ Oᵢ. Objects so ordered can thus be rendered with correct hidden surface elimination simply by using "Painter's algorithm"; i.e., by rendering Oₙ, followed by Oₙ₋₁, and so on until O₁. Thus the final image, I, can be constructed by a sequence of "over" operations on the image layers of each of the objects:

I ≡ I₁ over I₂ over … over Iₙ   (1)

where Iᵢ is the shaped image of Oᵢ: an image containing both color and coverage/transparency information [Porter84]. Cycles in the occlusion graph mean that no visibility ordering exists (see Figure 2c). In this case, parts in the cycle are grouped together and analyzed further to generate an image compositing expression (Section 8). The resulting image for the cycle can then be composited in the chain of over operators as above.
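Given shaped layers represented as premultiplied RGBA values, expression (1) is simply a fold of the over operator. The following Python sketch operates on single pixels rather than images; the pixel representation is an illustrative assumption, not the paper's.

```python
from functools import reduce

def over(top, bottom):
    """Porter-Duff 'over' for premultiplied pixels: ((r, g, b), alpha)."""
    (ct, at), (cb, ab) = top, bottom
    alpha = at + ab * (1.0 - at)
    color = tuple(t + b * (1.0 - at) for t, b in zip(ct, cb))
    return (color, alpha)

def composite(layers):
    """Evaluate I = I1 over I2 over ... over In (front-to-back layer list)."""
    return reduce(over, layers)   # 'over' is associative on premultiplied pixels

# An opaque front layer hides everything behind it:
red = ((1.0, 0.0, 0.0), 1.0)
blue = ((0.0, 0.0, 1.0), 1.0)
half_green = ((0.0, 0.5, 0.0), 0.5)        # premultiplied 50%-coverage green
print(composite([red, half_green, blue]))  # → ((1.0, 0.0, 0.0), 1.0)
```

Because over is associative for premultiplied colors, the front-to-back fold above yields the same pixel as compositing back-to-front with Painter's algorithm.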


[Figure 2 diagram: three geometric configurations of objects A, B, and C, each paired with its occlusion graph, in panels (a), (b), and (c).]

Figure 2: Occlusion graphs: The figure illustrates the occlusion graphs for some simple configurations. (a) and (b) are acyclic, while (c) contains a cycle.


[Figure 3 diagram: eye point E, objects A and B, and their minimum depths z_A and z_B along the Z axis.]

Figure 3: Depth ordering does not indicate visibility ordering: While the minimum depth of object B is smaller than A's (z_B < z_A), A occludes B as seen from eye point E. Similarly, by placing E on the right side of the diagram, it can be seen that maximum depth ordering also fails to correspond to visibility ordering.

Note that this notion of occlusion ignores the viewing direction; only the eye point matters. By taking account of visibility relationships all around the eye, the algorithms described here can respond to rapid shifts in view direction common in interactive settings and critical in VR applications [Regan94].

4 Incremental Visibility Sorting

Our algorithm for incremental visibility sorting and occlusion cycle detection (IVS) is related to the Newell, Newell, and Sancha (NNS) algorithm for visibility ordering a set of polygons [Newell72, Sutherland74]. In brief, NNS sorts a set of polygons by furthest depth and tests whether the resulting order is actually a visibility ordering. NNS traverses the depth-sorted list of polygons; if the next polygon does not overlap in depth with the remaining polygons in the list, the polygon can be removed and placed in the ordered output. Otherwise, NNS examines the collection of polygons that overlap in depth using a series of occlusion tests of increasing complexity. If the polygon is not occluded by any of these overlapping polygons, it can be sent to the output; otherwise, it is marked and reinserted behind the overlapping polygons. When NNS encounters a marked polygon, a cyclic occlusion is indicated and NNS splits the polygon to remove the cycle.

IVS differs from NNS in that it orders aggregate geometry composed of many polygons rather than individual polygons, and identifies and groups occlusion cycles rather than splitting to remove them. Most important, IVS orders incrementally, based on the visibility ordering computed previously, rather than starting from an ordering based on depth.

This fundamental change has both advantages and disadvantages. It is advantageous because depth sorting is an unreliable indicator of visibility order, as shown in Figure 3. Applying the NNS algorithm to a coherently changing scene repeatedly computes the same object reorderings (with their attendant costly occlusion tests) to convert the initial depth sort to a visibility sort. The disadvantage is that the sort from the last invocation provides no restriction on the set of objects that can occlude a given object for the current invocation. The NNS depth sort, in contrast, has the useful property that an object Q further in the list from a given object P, and all objects after Q, can not occlude P if Q's min depth exceeds the max depth of P. Naively, IVS requires testing potentially all n objects to see if any occlude a given one, resulting in an O(n²) algorithm. Fortunately, we will see in the next section how the occlusion culling may be sped up using simple hierarchical techniques, actually improving upon NNS occlusion culling (see Section 12).

The IVS algorithm is presented in Figures 4 and 5. Mathematically, the IVS algorithm computes an incremental topological sort on the strongly connected components of the directed occlusion graph (see [Sedgewick83] for background on directed graphs, strongly connected components, and topological sort). A strongly connected component (SCC) in the occlusion graph is a set of mutually occluding objects, in that for any object pair A and B in the SCC, either A → B or there exist objects X₁, X₂, …, Xₛ, also in the SCC, such that

A → X₁ → X₂ → … → Xₛ → B.


IVS(L, G) [computes visibility sort]

Input: ordering of non-grouped objects from previous invocation (L)
Output: front-to-back ordering with cyclic elements grouped together (G)
Algorithm:

    G ← ∅
    unmark all elements of L
    while L is nonempty
        pop off top(L): A
        if A is unmarked
            if nothing else in L occludes A
                [insert A onto output]
                insert A onto G
                unmark everything in L
            else
                [reinsert A onto L]
                mark A
                find element in L occluding A furthest from top(L): F_A
                reinsert A into L after F_A
            endif
        else [A is marked]
            form list S ≡ ⟨A, L₁, L₂, …, Lₙ⟩ where L₁, …, Lₙ are the largest
                consecutive sequence of marked elements, starting from top(L)
            if detect_cycle(S) then
                [insert cycle as grouped object onto L]
                group cycle-forming elements of S into grouped object C
                delete all members of C from L
                insert C (unmarked) as top(L)
            else
                [reinsert A onto L]
                find element in L occluding A furthest from top(L): F_A
                reinsert A into L after F_A
            endif
        endif
    endwhile

Figure 4: IVS algorithm. The top object, A, in the current ordering (list L) is examined for occluders. If nothing occludes A, it is inserted in the output list G. Otherwise, A is marked and reinserted behind the furthest object in the list that occludes it, F_A. When a marked object is encountered, the sequence of consecutively marked objects starting at the top of the list is checked for an occlusion cycle using detect_cycle. If an occlusion cycle is found, the participating objects are grouped and reinserted on top of L. This loop is repeated until L is empty; G then contains the sorted list of parts with mutual occluders grouped together.

detect_cycle(S) [finds a cycle]

Input: list of objects S ≡ ⟨S₁, S₂, …, Sₙ⟩
Output: determination of existence of a cycle and a list of cycle-forming objects, if a cycle is found.
Algorithm:

    if n ≤ 1 return NO_CYCLE
    i₁ ← 1
    for j = 2 to n + 1
        if S_{i_k} occludes S_{i_{j−1}} for some k < j − 1 then
            cycle is ⟨S_{i_k}, S_{i_{k+1}}, …, S_{i_{j−1}}⟩
            return CYCLE
        else if no occluder of S_{i_{j−1}} exists in S then
            return NO_CYCLE
        else
            let S_k be an occluder of S_{i_{j−1}}
            i_j ← k
        endif
    endfor

Figure 5: Cycle detection algorithm used in IVS. This algorithm will find a cycle if any initial contiguous subsequence of 1 < m ≤ n vertices ⟨S₁, S₂, …, S_m⟩ forms a cyclically-connected subgraph; i.e., a subgraph in which every part is occluded by at least one other member of the subgraph. It is easy to see that such a subgraph always contains a cycle. For subgraphs which are not cyclically connected, the algorithm can fail to find existing cycles, but this is not necessary for the correctness of IVS (for example, consider the occlusion graph with three nodes A, B, and C where A → B → A and initial list ⟨C, A, B⟩).
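As a concrete illustration, the pseudocode of Figures 4 and 5 can be transliterated into runnable form. The Python sketch below makes simplifying assumptions not in the paper: occlusion is an abstract predicate over an explicit arc set rather than a geometric test, and grouped objects are represented as tuples.

```python
def detect_cycle(S, occludes):
    """Figure 5 analogue: follow occluders backward; a repeat closes a cycle."""
    if len(S) <= 1:
        return None
    chain = [0]                                # indices into S; start at S[0]
    for _ in range(len(S) + 1):
        cur = chain[-1]
        for pos, k in enumerate(chain[:-1]):   # an earlier chain member that
            if occludes(S[k], S[cur]):         # occludes S[cur] closes a cycle
                return [S[i] for i in chain[pos:]]
        occ = [k for k in range(len(S)) if k != cur and occludes(S[k], S[cur])]
        if not occ:
            return None                        # no occluder in S: NO_CYCLE
        chain.append(occ[0])                   # step to any occluder
    return None

def ivs(order, occludes):
    """Figure 4 analogue: incremental visibility sort; cycles become tuples."""
    L, marked, G = list(order), set(), []
    while L:
        A = L.pop(0)
        occluders = [i for i, B in enumerate(L) if occludes(B, A)]
        if A not in marked:
            if not occluders:
                G.append(A)                    # nothing occludes A: output it
                marked.clear()
            else:
                marked.add(A)                  # reinsert after furthest occluder
                L.insert(occluders[-1] + 1, A)
        else:
            S = [A]                            # A plus marked run from top of L
            for B in L:
                if B not in marked:
                    break
                S.append(B)
            cyc = detect_cycle(S, occludes)
            if cyc:
                C = tuple(cyc)                 # group the cycle; put on top of L
                for m in cyc:
                    if m in L:
                        L.remove(m)
                    marked.discard(m)
                L.insert(0, C)
            elif occluders:
                L.insert(occluders[-1] + 1, A)
            else:
                G.append(A)
                marked.clear()
    return G

def make_occludes(edges):
    """occludes(a, b) from an arc set; tuples (groups) occlude memberwise."""
    atoms = lambda x: x if isinstance(x, tuple) else (x,)
    return lambda a, b: any((p, q) in edges
                            for p in atoms(a) for q in atoms(b))

# Figure 2(b): A -> B -> C is acyclic; Figure 2(c): A -> B -> C -> A cycles.
acyclic = make_occludes({("A", "B"), ("B", "C")})
cyclic = make_occludes({("A", "B"), ("B", "C"), ("C", "A")})
print(ivs(["C", "B", "A"], acyclic))  # → ['A', 'B', 'C']  (mirrors Figure 7)
print(ivs(["A", "B", "C"], cyclic))   # → [('A', 'C', 'B')]  (mirrors Figure 8)
```

Running the sketch on the graph of Figure 2(b) with initial ordering CBA reproduces the trace of Figure 7, and the cyclic graph of Figure 2(c) yields a single grouped component as in Figure 8.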


The IVS algorithm also identifies the SCCs, forming a group of member parts for each SCC. It can optionally compute the occlusion subgraph of the members of each SCC, useful for resolving nonbinary cycles without aggregating layers (Section 8).

A series of example invocations of the IVS algorithm for some of the occlusion graphs in Figure 2 are presented in Figures 6, 7, and 8.

When the animation is started, and at major changes of scene, there is no previous visibility sort to be exploited. In this case, we use an initial sort by distance from the eye point to the centroid of each part's bounding hull. Using a sort by depth is less effective because it sorts objects behind the eye in reverse order; sorting by distance is effective even if the view direction swings around rapidly.

4.1 IVS Correctness

To demonstrate the correctness of the IVS algorithm, we note that if a (possibly grouped) object A is encountered as top(L) which nothing remaining on L occludes, then A is correctly placed on the output. This is the only time the algorithm inserts an object to the sorted output. Objects are grouped only when the cycle detection algorithm finds that they form a connected component.

To see that the algorithm halts, assume the contrary. Then the algorithm will return from a state T ≡ ⟨Q₁, Q₂, …, Q_N⟩ to T again having visited (popped from top(L) and then reinserted) m elements, where N is the total number of objects remaining in list L at state T. That is, Q_m is the furthest element in T's sequence that was visited in progressing from state T through a finite sequence of steps back to T. But since no object was output, all the elements before Q_m from T must also have been visited and are thus marked. Let Q be the set of elements from Q₁ to Q_m. Since every element in Q was reinserted, but no element was reinserted further than Q_m, all elements in Q are occluded by some other element of Q. But then the cycle detection algorithm (Figure 5) is guaranteed to find a cycle within elements of Q, leading to the formation of a grouped object.¹ Since there are only finitely many deletion or grouping operations that can happen on a finite set of elements, the algorithm eventually halts, with the topologically sorted list of SCCs.

4.2 IVS Complexity

The IVS algorithm takes advantage of coherence in the visibility ordering from the previous frame. When a given object A is popped off the list, it is likely that few objects further in the list will occlude it. Typically, no objects will be found to occlude A and it will be immediately inserted onto G. If we can quickly determine that no objects occlude A, and the new ordering requires no rearrangements, the algorithm verifies that the new order is identical to the old with computation O(n log n) in the total number of objects. In essence, the algorithm's incrementality allows it to examine only a small subset of the potentially O(n²) arcs in the occlusion graph.

We assume that occlusion cycles (SCCs) will be small and of limited duration in typical scenes. This assumption is important since the cycle detection algorithm has quadratic complexity in the number of cycle elements. The visibility sorting algorithm does not attempt to exploit coherence in persistent occlusion cycles.

As the number of re-arrangements required in the new order increases (i.e., as the coherence of the ordering decreases), the IVS algorithm slows down, until a worst case scenario of starting from what is now a completely reversed ordering requires O(n²) outer while loop iterations. This is analogous to using insertion sort for repeatedly sorting a coherently changing list: typically, the sort is O(n), but can be O(n²) in pathologically incoherent situations.

The algorithm's complexity is bounded by (n + r)(s + co + c²), where r is the number of reinsertions required, c is the maximum number of objects involved in an occlusion cycle, o is the maximum number of primitive occluders of a (possibly grouped) object, and s is the complexity of the search for occluders of a given object. The first factor represents the number of outer while-loop iterations of IVS. In the second factor, the three terms represent time to find potential occluders, to reduce this set to actual occluders (see Section 5.3), and to detect occlusion cycles. Typically, r ∈ O(n), c ∈ O(1), o ∈ O(1), and s ∈ O(log n), resulting in an O(n log n) algorithm. In the worst case, many reinsertions are required, many objects are involved in occlusion cycles, and many objects occlude any given object, so that r ∈ O(n²), c ∈ O(n), o ∈ O(n), and s ∈ O(n), resulting in an O(n⁴) algorithm. This analysis assumes that occlusion detection between a pair of parts requires constant time.

¹There is a subtlety in the proof in that at state T some of the objects after Q_m could have been marked, but not visited again in progressing from T back to T. This is the reason for the requirement that the cycle detection algorithm find cycles whenever any initial subsequence of marked objects forms a cyclically connected subgraph.


L      O      comment
ABC    ∅      initial state
BC     A      insert A onto O
C      AB     insert B onto O
∅      ABC    insert C onto O

Figure 6: IVS Example 1: Each line shows the state of L and O after the next while loop iteration, using the graph of Figure 2(b) and initial ordering ABC.

L        O      comment
CBA      ∅      initial state
BC*A     ∅      mark C and reinsert after B
C*AB*    ∅      mark B and reinsert after A
AB*C*    ∅      A unmarked, so reinsert C
BC       A      insert A onto O, unmark everything
C        AB     insert B onto O
∅        ABC    insert C onto O

Figure 7: IVS Example 2: Using the graph of Figure 2(b), this time with initial ordering CBA. The notation P* is used to signify marking. The step from 3 to 4 reinserts C into L because there is an unmarked element, A, between C and the furthest element occluding it, B.

L         O        comment
ABC       ∅        initial state
BCA*      ∅        mark A and reinsert
CA*B*     ∅        mark B and reinsert
A*B*C*    ∅        mark C and reinsert
(ABC)     ∅        group cycle
∅         (ABC)    insert (ABC) onto O

Figure 8: IVS Example 3: Using the graph of Figure 2(c) with initial ordering ABC. The notation (P₁, P₂, …, P_r) denotes grouping.


[Figure 9 diagram: left, an angular extent [α₀, α₁] measured about eye point E in the frame (X₁, X₂, Z); right, a spatial extent [D₀, D₁] along direction D.]

Figure 9: Angular and Spatial Extents: Angular extents are defined with respect to an eye point E and an orthogonal coordinate frame (X₁, X₂, Z), where X₂ (out of the page) is perpendicular to the plane in which angles are measured, Z defines the zero angle, and X₁ defines an angle of +π/2 radians. Spatial extents are defined with respect to a direction D. In either case the resulting extent is simply an interval: [α₀, α₁] for angular extents, and [D₀, D₁] for spatial.

5 Occlusion Culling

The fundamental query of the IVS algorithm determines which objects not already inserted into the output order occlude a given (possibly grouped) object. To quickly cull the list of candidate occluders to as small a set as possible, we bound each part with a convex polyhedron and determine the spatial extent of this convex bound with respect to a predetermined set of directions, as in [Kay86]. These directions are of two types: spatial extents, or projection along a given 3D vector, and angular extents, or projected angle with respect to a given (eye) point and axis (Figure 9). Spatial extents are defined by extremizing (maximizing and minimizing) S(P) ≡ D · P over all points P in the convex hull. Angular extents are defined similarly by extremizing²

A(P) � tan�1

�(P� E) � Z(P� E) � X1

�: (2)

where E is the "eye" point, Z defines the zero angle direction, and X1 defines the positive angles.

Given two objects, A and B, with interval bounds for each of their extents, occlusion relationships can be tested with simple interval intersection tests performed independently for each extent, as shown in Figure 10. The content author chooses the number of spatial (ks) and angular (ka) extents and their directions; let k ≡ ks + ka be the total number. If any of the k tests finds that B ↛ A, then the test can be concluded and B rejected as an occluder without testing more extents. Note the expansion of A's extent by the eye point E required in the occlusion test (B → A) for spatial extents. Three cases in the occlusion test for spatial extents are further illustrated in Figure 11. When objects are grouped, their k extents are "unioned" via

    [a0, a1] ∪ [b0, b1] ≡ [min(a0, b0), max(a1, b1)].
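The per-extent rejection logic above can be sketched in Python. This is a minimal illustration, not the paper's implementation; the function and parameter names are hypothetical:

```python
def expand(interval, c):
    # the "wedge" operator of Figure 10: [a, b] ^ c = [min(a, c), max(b, c)]
    a, b = interval
    return (min(a, c), max(b, c))

def disjoint(i1, i2):
    # empty intersection of two closed 1D intervals
    return i1[1] < i2[0] or i2[1] < i1[0]

def may_occlude(b_angular, b_spatial, a_angular, a_spatial, eye_proj):
    """Conservative test: True unless some extent proves B cannot occlude A.

    b_angular/a_angular: (lo, hi) angular intervals, one per angular direction.
    b_spatial/a_spatial: (lo, hi) spatial intervals, one per spatial direction.
    eye_proj: the scalars E . D_i, one per spatial direction D_i.
    """
    for ba, aa in zip(b_angular, a_angular):
        if disjoint(ba, aa):
            return False      # an angular extent proves B does not occlude A
    for bs, as_, ed in zip(b_spatial, a_spatial, eye_proj):
        if disjoint(bs, expand(as_, ed)):   # enlarge A's interval toward the eye
            return False      # a spatial extent proves B does not occlude A
    return True               # B remains a candidate occluder
```

A single disjoint extent suffices to reject B, so the loops return early; only when every extent overlaps does B survive as a candidate.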

5.1 Tracking Extents on Convex Hulls

Spatial extent directions can be fixed in space (e.g., the coordinate axes, though arbitrary directions are allowed) or tied to the camera. Camera-independent spatial extents need to be updated only when the object moves; camera-dependent spatial extents must be updated when the object or the camera moves. Angular extents must also be updated whenever the object or camera moves. For the results in Section 12, in one case (Tumbling Toothpicks) we used two orthogonal angular extents and the orthogonal camera-dependent spatial extent (Z). In another case with many unmoving objects (Canyon Flyby), we used three mutually orthogonal camera-independent spatial extents.

For convex bounding polyhedra, spatial and angular extents can be found simply by "sliding downhill" (i.e., gradient descent) from vertex to neighboring vertex, evaluating the objective function (S or A) at each vertex. An optimization for angular extent evaluation avoids evaluating the inverse tangent for all but the minimizing vertex, because the descent only needs to check whether any neighbor has a value smaller than the current minimum; details can be found in Appendix A. At each iteration, the neighboring vertex having the minimum value is accepted as the starting point for the next iteration. If no

²Care must be taken when the denominator is close to 0. This is easily handled in the C math library by using the function atan2.



[Figure 10 diagram: angular extents [α0, α1] for A and [β0, β1] for B about eye point E in frame (X1, X2, Z); spatial extents [a0, a1] for A and [b0, b1] for B along direction D.]

    angular: [α0, α1] ∩ [β0, β1] = ∅ ⇒ B ↛ A        spatial: ([a0, a1] ∧ E · D) ∩ [b0, b1] = ∅ ⇒ B ↛ A

Figure 10: Occlusion Testing with Angular and Spatial Extents: To determine that B ↛ A for angular extents (left side of figure), we test for empty interval intersection. For spatial extents, the test must enlarge A's interval with E · D to account for the intervening space between the eye and A. The ∧ operator is defined via [a, b] ∧ c ≡ [min(a, c), max(b, c)]. Note that the k query extents with respect to A can be formed once, with the spatial extents computed by enlarging via E · Di, so that the problem of finding potential occluders of A reduces to finding all nondisjoint extents with respect to this query.

[Figure 11 diagram: three configurations of A's eye-expanded extent [a′0, a′1] and B's extent [b0, b1] along direction D, for eye point E.]

    (a) B ↛ A        (b) B → A        (c) B → A

Figure 11: Occlusion Testing with Spatial Extents: To test whether B → A, A's spatial extent [a0, a1] is expanded by E · D to yield [a′0, a′1]. Three cases can occur. In (a), [a′0, a′1] is disjoint from B's extent [b0, b1], so B ↛ A. In (b), [a′0, a′1] overlaps with [b0, b1], so B → A is possible. In (c), [a′0, a′1] overlaps with [b0, b1] even though A's extent [a0, a1] is disjoint from B's. Again, B → A is possible. Note that in cases (b) and (c), the spatial test must determine B → A for all ks spatial extents before we conclude B is a possible occluder of A. In other words, a single extent suffices to demonstrate B ↛ A; failure of this test for all extents leads to the conclusion that B is a possible occluder of A.



[Figure 12 diagram: left, frame (X1, X2, Z) with an object straddling the branch cut, giving angular extent [−π, π]; right, rotated frame (X′1, X2, Z′), giving angular extent [ω0, ω1].]

Figure 12: Branch Cut Problem for Determining Angular Extents: An object that straddles the branch cut (dashed line) has a "full" angular extent, [−π, π], even though its visible extent is clearly smaller. By rotating the reference frame so that the branch cut is behind the object (Z maps to Z′, X1 to X′1), the true angular extent can be found ([ω0, ω1] on the right side of the figure).

neighbors have a smaller objective function, then the computation halts with the current vertex returned as the minimizer. Small motions of the convex hull or of the spatial/angular reference frame move the new extremal vertex at most a few neighbors away from the last one.

By starting with the extremal vertex from the last query, coherence in object and camera motions is thus exploited. In the case of spatial extents, the convexity of the bounding polyhedra assures us that such a locally minimizing vertex is also globally minimizing [Cohen95]. The same property can be shown for angular extents. For each of the k dimensions, we cache the two vertices associated with the last extremizing query: one is the maximizing vertex and the other the minimizing vertex.
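A minimal sketch of the downhill vertex tracking, assuming the hull is given as a vertex list plus an adjacency structure (names hypothetical). It starts from the cached extremal vertex of the previous query:

```python
def track_extreme_vertex(verts, neighbors, f, start):
    """Slide downhill from `start` to a local minimizer of f over the hull
    vertices; by convexity this is also the global minimizer.  `neighbors[i]`
    lists the vertices adjacent to vertex i; `f` is the objective, e.g.
    lambda p: dot(D, p) for a spatial extent direction D."""
    cur = start
    cur_val = f(verts[cur])
    while True:
        best, best_val = cur, cur_val
        for n in neighbors[cur]:          # examine all neighboring vertices
            v = f(verts[n])
            if v < best_val:
                best, best_val = n, v
        if best == cur:                   # no neighbor improves: done
            return cur
        cur, cur_val = best, best_val     # step to the best neighbor
```

With coherent motion, the start vertex is already at or near the minimizer, so the loop typically terminates after examining only a few neighbors.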

Branch Cuts in Angular Extent Tracking

Angular extent tracking must manage the branch cut problem, as shown in Figure 12. The branch cut is the direction where the angles π and −π are identical. In the canonical frame for angular extent measurement, Z corresponds to the line of sight and X1 to the perpendicular direction to the right. The branch cut is thus the −Z direction. When an object straddles this branch cut, one approach is to rotate the coordinate frame about X2 so that the new branch cut direction, −Z′, falls away from the object (Figure 12, right). The resulting angular extent can be converted to the canonical coordinate frame by adding to the extent an offset equal to the angle formed by Z′ with Z in the canonical coordinate frame. Objects whose projection with respect to the X2 plane contains the projection of the eye have a full angular extent: [−π, +π].

To determine a direction for which the branch cut falls away from an object, we perform point inclusion tracking (Appendix B) on the eye point E, which tests whether E falls inside or outside the hull and, if outside, returns a plane separating E from the object. The perpendicular projection of this plane's normal onto X2 determines Z′. If the eye is found to be inside the hull, the full angular range is returned. This approach does not necessarily produce a separating direction between the eye point and the object (because of the perpendicular projection), even if one exists. Nevertheless, a correct angular extent is computed as long as the object doesn't straddle the branch cut at the computed −Z′. If the object does straddle the branch cut, we conservatively return the full extent even though the object may subtend a smaller angular range. An alternative is to also try a branch cut in the opposite direction: if this yields a nonfull angular range, it is returned; otherwise, the object necessarily has a full angular extent.

Branch cut rotation ensures that angular overlap is correctly tested for the visible region [−π/2, π/2] with respect to the current camera. We use a canonical angular extent [ω0, ω1], in which −π ≤ ω0 ≤ +π. Thus, angular extents that contain the branch cut always "slop over" into an angular range containing angles greater than +π. Angular extents that do not contain but are near the branch cut (i.e., entirely greater than −π) can have intersections with extents that do contain the branch cut.



These overlaps are ignored because the extents, as 1D intervals, are disjoint. Such missed occlusions happen outside the visible region and therefore do not invalidate the results. The problem here essentially stems from the fact that the space of angles (i.e., points on the circle) is topologically different from the space of points on the real line. An alternative is thus to conduct correct overlap tests on periodic intervals; this is feasible, but complicates the implementation of the kd-tree, which must differentiate between periodic (angular extent) and nonperiodic (spatial extent) intervals. The current solution of testing angular extents as 1D intervals may cause some extra "spikiness" in the computation if the view direction rapidly changes.

Note also that objects known to be in front of E with respect to the canonical direction Z need not be further tested using point inclusion testing or branch cut rotation. Similarly, we can compute angular extents for objects that are entirely behind E with respect to Z by using Z′ ≡ −Z and rotating the resulting angular range by π. In either case, the object is guaranteed not to have a full angular extent. This makes the viewing (Z) direction a good choice for one of the spatial extents, though such an extent is tied to the moving camera and thus must be updated even for unmoving objects.

Angular extent tracking is computed from vertex to vertex. Thus the detection of a full angular extent (or more properly, of the object straddling the branch cut) is not obvious: the algorithm simply returns an angular range in which the lower bound is near −π and the upper near +π, representing the vertex with negative angle nearest to −π and the vertex with positive angle nearest to +π, respectively. One approach to this problem is to keep track of the edges emanating from the tracked extreme vertices, to see whether neighboring vertices are on the other side of the branch cut. Since objects are convex, a much simpler method tests whether the width of the angular interval exceeds π. An additional complication is that tracked angular ranges may "switch", so that ω0 > ω1. This simply represents the angular range [ω0, π] ∪ [−π, ω1], or in canonical form [ω0, ω1 + 2π].
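The canonical-form bookkeeping can be illustrated in a few lines of Python (hypothetical helper names; the width-exceeds-π check is the simple straddle detection just described):

```python
import math

def canonical_angular(w0, w1):
    """Return the canonical form of a tracked angular range, with
    -pi <= w0 <= pi and w1 possibly 'slopping over' past +pi."""
    if w0 > w1:                        # switched range: [w0, pi] U [-pi, w1]
        w1 += 2.0 * math.pi            # canonical form [w0, w1 + 2*pi]
    return (w0, w1)

def is_full_extent(w0, w1):
    """A convex hull not containing the eye subtends less than pi, so a
    tracked width exceeding pi signals a branch-cut straddle (treat the
    extent as full)."""
    w0, w1 = canonical_angular(w0, w1)
    return (w1 - w0) > math.pi
```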

5.2 Accelerating Occlusion Queries with Kd-Trees

We've reduced the problem of finding all potential occluders of an object A to

1. forming a query extent for A, in which a k-dimensional interval is created by taking the angular extents without change and the spatial extents after enlarging by E · D, and

2. finding all objects that overlap this query.

Hierarchically organizing part extents using a kd-tree accelerates finding the set of overlapping extents for a given query.

A kd-tree [Bentley75, Preparata85] is a binary tree which subdivides along k fixed dimensions. Each node T in the tree stores both the dimension subdivided (T.i) and the location of the partitioning point (T.v). In our case, k = ka + ks. Object extents whose T.i-th dimension interval lower bound is less than T.v are placed in the left child of node T; those whose upper bound is greater than T.v are placed in the right child. This implies that objects whose T.i-th interval contains T.v exist in both subtrees.

5.2.1 Computing Partitioning Cost

A simple minimum-cost metric is used to determine a subdivision point for a list of intervals representing the 1D extents of the set of objects with respect to one of the ka angular or ks spatial directions. Our cost metric sums the length of the longer of the left and right sublists and the number of intervals shared between left and right. Avoiding lopsided trees, and trees in which many objects are repeated in both subtrees, is desirable since such trees tend to degrade query performance in the average case. The cost can be computed with a simple traversal of a sorted list containing both upper and lower bounds.

To compute the cost, we assume the lower and upper sides of all intervals are placed in a sorted list {si} having 2n entries (n parts), called a 1D sorted bound list. For example, if the set of objects is given by ⟨A, B, C, D, E, F⟩, the interval list for one of the spatial or angular extents might look like:

    A: [-----]
    B:    [-----]
    C:          [--------------]
    D:            [---]
    E:                  [---]
    F:                            [---]

This configuration yields the 1D sorted bound list

    A0, B0, A1, B1, C0, D0, D1, E0, E1, C1, F0, F1.



In the following algorithm for computing cost, we use the notation si.v to denote the lower or upper bound location of the i-th element of the 1D sorted bound list, and si.b to indicate whether the bound is lower (si.b = LOWER) or upper (si.b = UPPER). The algorithm MinCost computes the minimum cost with a single traversal of {si}.³

MinCost({si})   [computes cost of optimal partitioning of 1D sorted bound list]

    C ← ∞
    nl ← 0; nu ← 0
    for i = 1 to 2n
        if si.b = UPPER then
            nu ← nu + 1
            [cost to partition at i is length of longer child list plus number shared]
            ci ← max(nl, n − nu) + nl − nu
            C ← min(C, ci)
        else
            nl ← nl + 1
        endif
    endfor
    return C

The partitioning point always occurs immediately after an upper bound. nl and nu represent the number of lower and upper bounds traversed so far. ci measures the cost of splitting at si.v and is formed from two terms: max(nl, n − nu), representing the length of the longer of the two sublists, and nl − nu, representing the number of intervals shared between the two sublists.
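A runnable Python transcription of MinCost (the LOWER/UPPER encoding is mine):

```python
LOWER, UPPER = 0, 1

def min_cost(bounds, n):
    """bounds: 1D sorted bound list of 2n (value, kind) pairs, sorted by value.
    Returns the minimum partitioning cost over all candidate split points."""
    best = float("inf")
    nl = nu = 0
    for value, kind in bounds:
        if kind == UPPER:
            nu += 1
            # length of the longer child list, plus the number of intervals
            # shared between the two children
            cost = max(nl, n - nu) + (nl - nu)
            best = min(best, cost)
        else:
            nl += 1
    return best

# the example configuration from the text: A and B overlap, C contains D and E,
# F is separate; bound values 0..11 stand in for the sorted positions
example = [(0, LOWER), (1, LOWER), (2, UPPER), (3, UPPER), (4, LOWER), (5, LOWER),
           (6, UPPER), (7, LOWER), (8, UPPER), (9, UPPER), (10, LOWER), (11, UPPER)]
```

On this example the minimum cost is 4, attained by splitting immediately after B1: ⟨A, B⟩ goes left, ⟨C, D, E, F⟩ goes right, and no interval is shared.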

5.2.2 Building the Kd-Tree

To build the kd-tree, we begin by sorting each of the k interval sets to produce 1D sorted bound lists. The kd-tree is then built recursively in a top-down fashion. To subdivide a node, the MinCost algorithm is invoked for each of the k dimensions, and the dimension of lowest cost is used to partition. Object extents for the left and right subtrees can then be re-sorted to produce valid 1D sorted bound lists using a single traversal of the original list, by inserting into either or both child lists according to the computed partitioning. We then recurse on the left and right sides of the kd-tree. The algorithm terminates when the longer child list is insufficiently smaller than its parent (we use a threshold of 10). A node T in the final kd-tree stores the 1D sorted bound list only for dimension T.i, which is used to update the subdivision value T.v in future queries and to shift objects between left and right subtrees as they move. The other lists are deleted.

5.2.3 Rebalancing the Kd-Tree

Since rebuilding is relatively expensive, the algorithm also incorporates a quick kd-tree rebalancing pass. To rebalance the kd-tree as object extents change, we visit all its nodes depth-first. At each node T, the 1D sorted bound list is re-sorted using insertion sort, and the cost algorithm is invoked to find a new optimal subdivision point, T.v. Extents are then repartitioned with respect to the new T.v, shifting extents between left and right subtrees. Extent addition is done lazily (i.e., only to the immediate child), with further insertion occurring when the child nodes are visited. Extent deletion is done immediately for all subtrees in which the extent appears, an operation that can be done efficiently by recording a (possibly null) left and right child pointer for each extent stored in T.

Note that coherent changes to the object extents yield an essentially linear re-sort of the bound lists, and few objects must be shifted between subtrees. One optimization is to note that when no bound moves in the bound list re-sort at T, there is no need for further processing at T, including the cost invocation or repartitioning. Child nodes of T, however, must still be traversed in case their ordering needs to be updated (since they sort based on other dimensions). Another optimization is to segregate newly added extents in the 1D sorted bound list from previously added extents. Newly added extents are sorted separately using quicksort, since the order of insertion is likely to be incoherent (the parent kd node sorted based on a different dimension). Old extents are sorted using insertion sort, since they remain from a previous query. The two groups of extents are then merge-sorted to produce the final order.

It is important to realize that the coherence of kd-tree rebalancing depends on fixing the subdivision dimension T.i at each node. If changes in the subdivided dimension were allowed, large numbers of extents could be shifted between left and right

³The algorithm assumes that all intervals in the sorted list have nonzero width. It also assumes that upper bounds come before lower bounds in the ordering whenever they are equal.



[Figure 13 diagram: mutually occluding objects A and B, with a dashed bound around their union occluded by object C.]

Figure 13: Occlusion Cycle Growth with Grouped Objects: In this example, objects A and B have been grouped because they are mutually occluding. A simple bound around their union, shown by the dashed lines, is occluded by object C, even though the objects themselves are not. We therefore use the bounding extents around grouped objects for a quick cull of nonoccluders, but further test objects which are not so culled to make sure they occlude at least one primitive element of the grouped object.

subtrees, eliminating coherence in all descendants. Fixing T.i but not T.v restores coherence, but since T.i is computed only once, the tree can gradually become less efficient for query acceleration as object extents change. This problem can be dealt with by rebuilding the tree after a specified number of frames, or after measures of tree effectiveness (e.g., tree balance) so indicate. A new kd-tree can then be rebuilt as a background process over many frames while simultaneously rebalancing and querying an older version.

5.2.4 Querying the Kd-Tree

Querying the kd-tree involves a simple descent guided by the query. At a given node T, if the query's T.i-th interval lower bound is less than T.v, then the left subtree is recursively visited. Similarly, if the query's T.i-th interval upper bound is greater than T.v, then the right subtree is recursively visited. When a terminal node is reached, extents stored there are tested for overlap with respect to all k dimensions. Overlapping extents are accumulated by inserting into an occluder list. An extent is inserted only once in the occluder list, though it may occur in multiple leaf nodes of the kd-tree.
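For illustration, here is a much-simplified kd-tree in Python. It follows the child rule above (lower bound < T.v goes left, upper bound > T.v goes right, straddlers go to both) and the descent rule for queries, but splits at the median interval midpoint rather than by MinCost and rebuilds rather than rebalances; all names are hypothetical:

```python
class KdNode:
    def __init__(self, dim, value, left, right, objects=None):
        self.dim, self.value = dim, value
        self.left, self.right = left, right
        self.objects = objects                  # object list, leaves only

def build(extents, k, depth=0, leaf_size=2):
    """extents: {obj: [(lo, hi), ...]} with k intervals per object."""
    if len(extents) <= leaf_size:
        return KdNode(None, None, None, None, objects=list(extents))
    dim = depth % k                             # simplification: cycle dims
    mids = sorted((lo + hi) / 2.0 for (lo, hi)
                  in (e[dim] for e in extents.values()))
    v = mids[len(mids) // 2]
    left = {o: e for o, e in extents.items() if e[dim][0] < v}
    right = {o: e for o, e in extents.items() if e[dim][1] > v}
    if len(left) == len(extents) or len(right) == len(extents):
        return KdNode(None, None, None, None, objects=list(extents))
    return KdNode(dim, v,
                  build(left, k, depth + 1, leaf_size),
                  build(right, k, depth + 1, leaf_size))

def query(node, q, extents, found):
    """Collect all objects whose k-dimensional extent overlaps query q."""
    if node.objects is not None:                # terminal node: exact k-D test
        for o in node.objects:
            if all(lo <= qhi and qlo <= hi
                   for (lo, hi), (qlo, qhi) in zip(extents[o], q)):
                found.add(o)                    # a set, so inserted only once
        return
    if q[node.dim][0] < node.value:
        query(node.left, q, extents, found)
    if q[node.dim][1] > node.value:
        query(node.right, q, extents, found)
```

Using a set for the result reproduces the rule that an extent reachable through several leaves is reported only once.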

An additional concern is that the occlusion query should return only occluders of an object A that have not already been inserted into the output list. Restricting the set of occluders to the set remaining in L can be accomplished by activating/deactivating extents in the kd-tree. When A is popped off the list L in the IVS algorithm, all objects grouped within it are deactivated. Deactivated objects are handled by attaching a flag to each object in the list stored at each terminal node of the kd-tree. Deactivating an object involves following its left and right subtree pointers, beginning at the kd-tree root, to arrive at the terminal lists containing the object to be deactivated. Activation is done similarly, with the flag set oppositely. Objects are activated in the IVS algorithm when it is determined that they must be reinserted in the list L, rather than appended to the output order, because of the existence of occluders. The reason objects are deactivated and then reactivated if occluders are found (rather than deactivated only if no occluders are found) is that deactivation is necessary to ensure that the objects themselves are not returned as self-occluders.

The current implementation simply flags deactivated objects at the leaf nodes of the kd-tree so that they are not returned as occluders; the tree structure is not changed as objects are deactivated. However, counts of active objects within each kd-tree node are kept so that nodes in which all objects have been deactivated can be ignored during queries. The counts are maintained during descent of the tree for activation (count increment) and deactivation (count decrement).

5.3 Avoiding Occlusion Cycle Growth

The occlusion testing described so far is conservative, in the sense that possible occluders of an object can be returned which do not in fact occlude it. There are two sources of this conservatism. First, occlusion is tested with respect to a fixed set of spatial and/or angular extents, which essentially creates an object larger than the original convex hull and thus more likely to be occluded. Second, extents for grouped objects are computed by simple unioning of the extents of the members, even though the unioned bound may contain much empty space, as shown in Figure 13. The next section will show how to compute an exact



occlusion test between a pair of convex objects, thus handling the first problem. This section describes a more stringent test forgrouped objects which removes the second problem.

Occlusion testing that is too conservative can lead to very large groupings of objects in occlusion cycles. In the extreme case, every object is inserted into a single SCC. This is especially problematic because of the second source of conservatism: bounds essentially grow to encompass all members of the current SCC, which in turn occlude further objects, and so on, until the SCC becomes very large.

To handle this problem, we perform additional tests when a grouped object A is tested for occlusion. A's unioned extents are used to return a candidate list of possible occluders, as usual. Then the list of occluders is scanned to make sure each occludes at least one of the primitive members of A, using a simple k-dimensional interval intersection test. Any elements of the list that do not occlude at least one member of A are rejected, thus ensuring that "holes" within the grouped object can be seen through without causing occlusions. Finally, remaining objects can be tested against primitive members of A using the exact occlusion test.
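The grouped-object filtering step might be sketched as follows (hypothetical names; `extent_of` is assumed to return an object's k intervals, already eye-expanded where appropriate):

```python
def overlap_k(e1, e2):
    """Simple k-dimensional interval intersection test."""
    return all(a0 <= b1 and b0 <= a1
               for (a0, a1), (b0, b1) in zip(e1, e2))

def filter_group_occluders(candidates, members, extent_of):
    """Keep only candidates that occlude at least one primitive member of the
    group, so 'holes' in the unioned bound do not cause false occlusions."""
    return [c for c in candidates
            if any(overlap_k(extent_of(c), extent_of(m)) for m in members)]
```

A candidate lying entirely within a gap between the group's members overlaps the unioned bound but no member, and is correctly rejected.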

6 Occlusion Testing

The algorithms in Section 5 provide a fast but conservative pruning of the set of objects that can possibly occlude a given object A. To produce the set of objects that actually occlude A with respect to the convex hull bounds, we apply an exact test of occlusion for primitive object pairs (A, B), which determines whether B → A. The test is used in the IVS algorithm by scanning the list of primitive elements of the possibly grouped object A and ensuring that at least one occluder in the returned list occludes it, with respect to the exact test. The exact test is thus used as a last resort when the faster methods fail to reject occluders.

The exact occlusion test algorithm is as follows:

ExactConvexOcclusion(A, B, E)   [returns whether B →E A]

    if ks-dimensional extents of A and B intersect, or ks = 0
        initiate 3D collision tracking for ⟨A, B⟩, if not already initiated
        if A and B collide, return B → A
        if E on same side of separating plane as A, return B ↛ A
    endif
    if B contains eyepoint E, return B → A   [B occludes everything]
    initiate occlusion tracking for ⟨B, A⟩, if not already initiated
    return result of occlusion test

Note that the test of ks-dimensional extent intersection does not compute eye-expanded intervals as does the test of Section 5; the intervals are tested as is.

Both the collision and occlusion queries used in the above algorithm can be computed using the algorithm in Appendix B. While the collision query is not strictly necessary, it is more efficient in the case of a pair of colliding objects to track the colliding pair once rather than to track two queries which bundle the eye point with each of the respective objects. For scenes in which collisions are rare, the direct occlusion test should be used.

The IVS algorithm is extended to make use of a hash table of object pairs for which 3D collision or occlusion tracking has been initialized, allowing fast access to the information. Tracking is discontinued for a pair if the information is not accessed in the IVS algorithm after more than a parameterized number of invocations (we have used a threshold of 1).
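The pair cache with expiry could be sketched as follows (a hypothetical class; the tracking state itself is stubbed, since it comes from the Appendix B algorithms):

```python
class PairTracker:
    """Caches incremental collision/occlusion tracking state per object pair,
    discarding pairs not touched for more than `max_idle` IVS invocations
    (the text uses a threshold of 1)."""
    def __init__(self, max_idle=1):
        self.max_idle = max_idle
        self.cache = {}               # (idA, idB) -> [state, last_used]
        self.clock = 0                # counts IVS invocations

    def get(self, a, b, init_state):
        """Return (initiating if needed) the tracking state for pair (a, b)."""
        entry = self.cache.setdefault((a, b), [init_state(), self.clock])
        entry[1] = self.clock         # mark as accessed this invocation
        return entry[0]

    def end_invocation(self):
        """Advance the clock and drop pairs idle longer than max_idle."""
        self.clock += 1
        stale = [k for k, (_, t) in self.cache.items()
                 if self.clock - t > self.max_idle]
        for k in stale:
            del self.cache[k]
```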

Note that further occlusion resolution is also possible with respect to the actual objects rather than convex bounds around them. It is also possible to inject special knowledge into the occlusion resolution process, such as the fact that a given separating plane is known to exist between certain pairs of objects, like joints in an articulated character or adjacent cells in a pre-partitioned terrain. Special-purpose pairwise visibility codes can also be developed; Appendix C provides an example, for a cylinder with endcap tangent to a sphere, that provides a visibility heuristic for articulated joints in animal-like creatures.

7 Conditioning Sort

After each IVS invocation, we have found it useful to perform a conditioning sort on the output that "bubbles up" SCCs based on their midpoint with respect to a given extent. More precisely, we reorder according to the absolute value of the difference between the midpoint and the projection of the eye point along the spatial extents. The camera-dependent Z direction is typically used as the ordering extent, but other choices also provide benefit. An SCC is only moved up in the order if doing so does not violate the visibility ordering; i.e., the object does not occlude the object before which it is inserted. This conditioning sort



[Figure 14 diagram: original sprite images A, B, C; intermediate images A2 = C atop A, A1 = A out C, B1 = B out A, C1 = C out B; results A2 over B over C (top right) and A1 + B1 + C1 (bottom row).]

Figure 14: Compositing Expressions for Cycle Breaking: The original sprite images are shown as A, B, C. Using "over-atop", the final image is formed by (C atop A) over B over C. Using "sum-of-outs", the final image is formed by (A out C) + (B out A) + (C out B).

smooths out computation over many queries. Without it, unoccluding objects near the eye can remain well back in the ordering until they finally occlude something, at which point they must be moved in front of many objects in the order, reducing coherence. The conditioning sort also sorts parts within SCCs according to extent midpoint, but ignoring occlusion relationships (since the SCC is not visibility sortable).
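A sketch of the conditioning pass (hypothetical names; `key(s)` stands for the |midpoint − eye projection| value along the ordering extent, and `occludes(a, b)` for the a → b test):

```python
def conditioning_sort(order, key, occludes):
    """Stable 'bubble up' of SCCs by distance key.  An SCC slides forward in
    the output only while its key improves and it does not occlude the SCC
    it would be inserted before, preserving the visibility ordering."""
    result = []
    for scc in order:
        i = len(result)
        while (i > 0 and key(scc) < key(result[i - 1])
               and not occludes(scc, result[i - 1])):
            i -= 1
        result.insert(i, scc)
    return result
```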

8 Resolving Non-Binary Cycles

Following [Porter84], we represent a shaped image as a 2D array of 4-tuples, written

    A ≡ [Ar, Ag, Ab, Aα]

where Ar, Ag, Ab are the color components of the image and Aα is the transparency, in the range [0, 1].

Consider the cyclic occlusion graph and geometric situation shown in Figure 2c. Clearly,

A over B over C

produces an incorrect image, because C → A but no part of C comes before A in the ordering. A simple modification, though, produces the correct answer:

    A out C + B out A + C out B

where "out" is defined as

    A out B ≡ A(1 − Bα).

This expression follows from the idea that objects should be attenuated by the images of all occluding objects. Another correct expression is

    (C atop A) over B over C

where "atop" is defined as

    A atop B ≡ A·Bα + (1 − Aα)B.

In either case, the expression correctly overlays the relevant parts of occluding objects over the occluded objects, using onlyshaped images for the individual objects (refer to Figure 14). Technically, the result is not correct at any pixels partially covered



by all three objects, since the matte channel encodes coverage as well as transparency. Such pixels tend to be isolated, if they exist, and the resulting errors are of little significance.⁴
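The two cycle-breaking expressions can be checked numerically on premultiplied RGBA arrays. The sketch below (NumPy; helper names are mine, operator definitions follow [Porter84]) builds three sprites in a cyclic occlusion with no triple-covered pixels, where the two expressions agree exactly:

```python
import numpy as np

# Porter-Duff operators on premultiplied RGBA images (alpha in channel 3)
def over(a, b): return a + (1.0 - a[..., 3:]) * b
def out(a, b):  return a * (1.0 - b[..., 3:])
def atop(a, b): return a * b[..., 3:] + (1.0 - a[..., 3:]) * b

def layer(coverage, rgb):
    """Build a premultiplied RGBA image (here: a single row of pixels)."""
    a = np.asarray(coverage)[:, None]
    return np.concatenate([c * a for c in rgb] + [a], axis=1)

# cyclic occlusion C -> A, A -> B, B -> C as in Figure 2c; no pixel is
# covered by all three sprites, so both expressions agree
A = layer([0.5, 0.7, 0.0], (1.0, 0.0, 0.0))
B = layer([0.0, 0.6, 0.4], (0.0, 1.0, 0.0))
C = layer([0.3, 0.0, 0.8], (0.0, 0.0, 1.0))

sum_of_outs = out(A, C) + out(B, A) + out(C, B)
over_atop = over(over(atop(C, A), B), C)
assert np.allclose(sum_of_outs, over_atop)
```

Algebraically, the two results differ by C·Aα·Bα per pixel, which vanishes except where all three sprites overlap, matching the caveat above.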

The above example can be generalized to any collection of objects with a known occlusion graph having no binary cycles: cycles of the form A → B, B → A. The reason binary cycles cannot be handled is that in the region of intersection of the bounding hulls of A and B, we simply have no information about which object occludes which. Note also that the compositing expression in this case reduces to A out B + B out A, which incorrectly eliminates the part of the image where A ∩ B projects.

A correct compositing expression for n shaped images Ii is given by

    Σ(i=1..n)  Ii  OUT{j | Oj → Oi}  Ij        (3)

The notation OUT with a set subscript is analogous to the multiplication accumulator operator Π, creating a chain of "out" operations, as in

    D OUT{A, B, C} = D out A out B out C.

In words, (3) sums the image for each object Oi, attenuated by the "out" chain of products for each object Oj that occludes Oi (Figure 14, bottom row).

An alternate recursive formulation using atop is harder to compile but generates simpler expressions. As before, we have a set of objects O = {Oi} together with an occlusion graph G for O containing no binary cycles. The subgraph of G induced by an object subset X ⊆ O is written GX. Then for any O* ∈ O,

    I(GO) = ( I(G{Oi | Oi → O*}) atop I(O*) ) over I(GO − {O*})        (4)

where I(G) represents the shaped image of the collection of objects using the occlusion graph G. In other words, to render the scene, we can pick any isolated object O*, find the expression for the subgraph induced by those objects occluding O*, and compute that expression "atop" O* (Figure 14, top right). That result is then placed "over" the expression for the subgraph induced by removing O* from the set of objects O. Note also that the above expression assumes I(G∅) = 0.

Proofs of correctness of the two expressions are available in a companion technical report [Snyder98].

Compositing Expression Compilation

An efficient approach to generating an image compositing expression for the scene uses the IVS algorithm to produce a visibility-sorted list of SCCs. The images for each SCC can thus be combined using a simple sequence of "over" operations, as in Expression (1). Most SCCs are singletons (containing a single object). Non-singleton SCCs are further processed to merge binary cycles, using the occlusion subgraph of the parts comprising the SCC. Merging must take place iteratively, in case binary cycles are present between objects that were merged in a previous step, until there are no binary cycles between merged objects. We call such merged groups BMCs, for binary merged components. Expression (3) or (4) is then evaluated using the resulting merged occlusion graph to produce an expression for the SCC. Each BMC must be grouped into a single layer, but not the entire SCC. For example, Figure 2(c) involves one SCC but three BMCs, since there are no binary cycles.

It is clear that Expression (3) can be evaluated using two image registers: one for accumulating a series of "out" operations for all image occluders of a given object, and another for summing the results. It is less obvious, but nevertheless true [Snyder98], that Expression (4) can be similarly compiled into an expression using two image registers: one for "in" or "out" operations and one for sum accumulation. Two image registers thus suffice to produce the image result for any SCC. An efficient evaluation of the scene's image requires a third register to accumulate the results of the "over" operator on the sorted sequence of SCCs, as shown in Figure 15. This third register allows segregation of the SCCs into separately compilable units.
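A sketch of a two-register evaluator for Expression (3) within one SCC (NumPy; names hypothetical), demonstrated on the three-object cycle of Figure 2c:

```python
import numpy as np

def eval_sum_of_outs(images, occluders):
    """Evaluate Expression (3) for one SCC with two image registers:
    r_mul (the R* register) accumulates the chain of "out" operations,
    r_sum (the R+ register) accumulates the summed result.
    images: {obj: premultiplied RGBA array}
    occluders: {obj: iterable of objects occluding it} (no binary cycles)"""
    r_sum = None
    for o, img in images.items():
        r_mul = img                                        # load R*
        for j in occluders[o]:
            r_mul = r_mul * (1.0 - images[j][..., 3:])     # R* <- R* out I_j
        r_sum = r_mul if r_sum is None else r_sum + r_mul  # R+ <- R+ + R*
    return r_sum

# the ring cycle C -> A, A -> B, B -> C on single-pixel sprites
imgs = {'A': np.array([[0.5, 0.0, 0.0, 0.5]]),
        'B': np.array([[0.0, 0.6, 0.0, 0.6]]),
        'C': np.array([[0.0, 0.0, 0.4, 0.4]])}
ring = {'A': ['C'], 'B': ['A'], 'C': ['B']}
frame = eval_sum_of_outs(imgs, ring)
```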

Given such a three-register implementation, it can be seen why Expression (4) is more efficient. For example, for a simple ring cycle of n objects, i.e., a graph

    O1 → O2 → ⋯ → On → O1

the "sum-of-outs" formulation (Expression 3) produces

    I(O1) out I(On) + I(O2) out I(O1) + I(O3) out I(O2) + ⋯ + I(On) out I(On−1)

⁴Recall too that overlaying shaped images where the matte channel encodes coverage is itself an approximation, since it assumes uncorrelated silhouette edges.



[Figure 15 diagram: shaped images flow from memory through registers R*, R+, and Rover to the display.]

Figure 15: Composition Expression Evaluator: Three registers are used in this realization of an image composition expression evaluator useful for resolving non-binary cyclic occlusions. Each register can be loaded with a shaped image from memory. R* accumulates chains of "in" or "out" expressions. R+ accumulates sums, either from memory or from the current contents of R*. Finally, Rover accumulates sequences of "over" operations, either from memory or from the current contents of R+.

with n "out" and n − 1 addition operations, while the "over-atop" formulation (Expression 4) produces

(I(On) in I(O1) + I(O1) out I(On)) over I(O2) over ⋯ over I(On)

with n − 1 "over", 1 "in", 1 "out", and 1 addition operator. Assuming "over" is an indispensable operator for hardware implementations and is thus atomic, the second formulation takes advantage of "over" to reduce the expression complexity.
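To make the operator accounting concrete, here is a minimal single-pixel sketch. Images are premultiplied (color, alpha) pairs and the operators follow [Porter84]; the function names `over`, `inside`, `out`, and the register comments are ours, not the paper's code.

```python
# Porter-Duff operators on single premultiplied (color, alpha) pixels.
def over(a, b):      # a composited over b
    return (a[0] + b[0] * (1 - a[1]), a[1] + b[1] * (1 - a[1]))

def inside(a, b):    # "in": a restricted to b's coverage
    return (a[0] * b[1], a[1] * b[1])

def out(a, b):       # "out": a restricted to b's complement
    return (a[0] * (1 - b[1]), a[1] * (1 - b[1]))

def add(a, b):       # plain sum of premultiplied images
    return (a[0] + b[0], a[1] + b[1])

def ring_over_atop(i1, i2, i3):
    """Over-atop form for the ring O1 -> O2 -> O3 -> O1:
    (I(O3) in I(O1) + I(O1) out I(O3)) over I(O2) over I(O3).
    One register (R*) holds the in/out chain and sum; another (Rover)
    accumulates the "over" sequence."""
    r_star = add(inside(i3, i1), out(i1, i3))
    return over(over(r_star, i2), i3)
```

Counting operators for n = 3: this uses 1 "in", 1 "out", 1 addition, and 2 "over"s, versus 3 "out"s and 2 additions for the sum-of-outs form of the same cycle.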

9 Pre-Splitting to Remove Binary Cycles

The use of convex bounding hulls in occlusion testing is sometimes overly conservative. For example, consider a pencil being placed inside a cup, or an aircraft flying within a narrow valley. If the cup or valley forms a single part, our visibility sorting algorithm will always group the pencil and cup, and the aircraft and valley, into a single layer (BMC), because their convex hulls intersect. In fact, in the case of the valley it is likely that nearly all of the scene's geometry will be contained inside the convex hull of the valley, yielding a single layer for the entire scene.

To solve this problem, we pre-split objects that are likely to cause unwanted aggregation of parts. Objects that are very large, like terrain, are obvious candidates. Foreground objects that require large rendering resources and are known to be "containers", like the cup, may also be pre-split. The object is thus transformed before run-time into a set of many parts, called split parts, whose convex hulls are less likely to intersect other moving parts. With enough splitting, the layer aggregation problem can be sufficiently reduced or eliminated.

Simple methods for splitting usually suffice. Terrain height fields can be split using a 2D grid of splitting planes, while rotationally symmetric containers, like a cup, can be split using a cylindrical grid. A 3D grid of splitting planes can be used for objects without obvious projection planes or symmetry (e.g., trees). On the other hand, more sophisticated methods that split more in less convex regions can reduce the number of split objects, improving performance. Such methods remain to be investigated in future work.
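A minimal sketch of a 2D-grid splitter for terrain-like geometry, under simplifying assumptions of ours: triangles are binned (not clipped) into every grid cell whose footprint, inflated to produce overlap, their xy bounding box touches. A production splitter would clip triangles exactly to the cell boundaries.

```python
# Hypothetical 2D-grid pre-splitter: bin triangles into inflated,
# overlapping grid cells.  Each cell's triangle list becomes one
# "split part"; adjacent parts share triangles in the overlap region.
def grid_split(triangles, bounds, nx, ny, inflate=0.2):
    """triangles: lists of three (x, y, z) vertices;
    bounds: (xmin, ymin, xmax, ymax) footprint of the terrain."""
    xmin, ymin, xmax, ymax = bounds
    dx, dy = (xmax - xmin) / nx, (ymax - ymin) / ny
    pad_x, pad_y = inflate * dx / 2, inflate * dy / 2
    cells = {(i, j): [] for i in range(nx) for j in range(ny)}
    for tri in triangles:
        txmin = min(v[0] for v in tri); txmax = max(v[0] for v in tri)
        tymin = min(v[1] for v in tri); tymax = max(v[1] for v in tri)
        for i in range(nx):
            cx0 = xmin + i * dx - pad_x
            cx1 = xmin + (i + 1) * dx + pad_x
            if txmax < cx0 or txmin > cx1:
                continue
            for j in range(ny):
                cy0 = ymin + j * dy - pad_y
                cy1 = ymin + (j + 1) * dy + pad_y
                if not (tymax < cy0 or tymin > cy1):
                    cells[(i, j)].append(tri)
    return cells
```

A triangle straddling a cell boundary lands in both neighboring cells, which models the overlapping split objects discussed next.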

Pre-splitting introduces a problem, however. At the seam between split neighbors, the compositor produces a pixel-wide gap, because its assumption of uncorrelated edges is incorrect. The split geometry exactly tessellates any split surfaces; thus alpha (coverage) values should be added at the seam, not over'ed. The result is that seams become visible.

To solve this problem, we extend the region included in each split object to produce overlapping split objects, a technique also used in [Shade96, Schaufler96]. While this eliminates the visible seam artifact, it causes split objects to intersect, and the layer aggregation problem recurs. Fortunately, adjacent split objects contain the same geometry in their region of overlap. We therefore add pairwise separating planes between neighbors: because both agree on the appearance within the region of overlap, either may be drawn. This breaks the mutual occlusion relationship between neighbors, avoiding catastrophic layer growth. But we use the convex hulls around the "inflated" split objects for testing with all other objects, so that the correct occlusion relationship is still computed.

Note that the occlusion sort does not preclude splitting arrangements, like hexagonal terrain cells, that permit no global partitioning planes. All that is required is pairwise separation.


10 Visibility Correct Depth of Field

2D image blurring is a fast method for simulating depth of field effects amenable to hardware implementation [Rokita93]. Unfortunately, as observed in [Cook84], any approximation that uses a single hidden-surface-eliminated image, including [Potmesil81, Rokita93], causes artifacts, because no information is available for occluded surfaces made visible by depth of field. The worst case is when a blurry foreground object occludes a background object in focus (Figure 16). As shown in the figure, the approximation of [Potmesil81] sharpens the edge between the foreground and background objects, greatly reducing the illusion. Following [Max85], but using correct visibility sorting, we take advantage of the information in layer images that would ordinarily be eliminated to correct these problems.

The individual image layers are still approximated by spatially invariant blurring in the case of objects having small depth extent, or by the spatially varying convolution from [Potmesil81]. Image compositing is used between layers. Since a substantial portion of depth of field cues comes from edge relations between foreground and background objects, integrating over the lens only approximates the appearance of individual parts.

Consider two image layers A and B, where A occludes B. Using shaped images for these objects derived from standard renderings (i.e., hidden-surface-eliminated and from a single point of view), the convolution of [Potmesil81] (Figure 16(a)) is equivalent to

[blur_A(A) + blur_B(B out A)]^⊘

where ⊘ denotes normalization by the sum of the matte channels of the two addends, and "blur" denotes an image transformation that blurs both color and matte channels using some depth of field approximation. [Rokita93] more properly uses an unnormalized form (Figure 16(b)), which translates simply to

blur_A(A) + blur_B(B out A).

Note the presence of the "out" operator, which accounts for the absence of information about occluded surfaces. The visibility compositing approximation (Figure 16(c)) merely draws blurred versions of the sorted sequence of layer images, resulting in

blur_A(A) over blur_B(B).

Note the correct blurriness of the image of A where it covers B in Figure 16(c).

It can be seen that grouping parts because of occlusion undecomposability exacts a penalty. Such grouping increases the depth extent of the members of the group, so that the constant blur approximation, or even the more expensive depth-based convolution, incurs substantial error. In cases of groupings of very large extent, the renderer could even resort to rendering integration techniques, such as the accumulation buffer [Haeberli90]. Because such integration requires many image samples (23 rendering samples were used in images from [Haeberli90]), this represents a large allocation of system resources, to be avoided when simple blurring suffices. Still, accumulation buffer integration need be applied only to the objects comprising a BMC, not to the whole scene. We can selectively apply spatially invariant blur, depth-based blur, or accumulation buffer integration to individual layers, depending on the importance of the layer and the accuracy of the cheaper methods.
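The two-layer approximation blur_A(A) over blur_B(B) can be sketched on a 1D scanline of premultiplied (color, alpha) pixels; the box blur standing in for a circle-of-confusion kernel is our simplification. Dividing by the full kernel width treats out-of-bounds pixels as transparent, which is the correct behavior for shaped images.

```python
# Sketch of the visibility compositing depth-of-field approximation:
# blur each shaped layer (color AND matte) independently, then
# composite the blurred foreground over the blurred background.
def box_blur(layer, radius):
    """Box blur of a scanline of premultiplied (color, alpha) pixels;
    pixels outside the scanline are treated as fully transparent."""
    n, width = len(layer), 2 * radius + 1
    out = []
    for x in range(n):
        lo, hi = max(0, x - radius), min(n, x + radius + 1)
        c = sum(p[0] for p in layer[lo:hi]) / width
        a = sum(p[1] for p in layer[lo:hi]) / width
        out.append((c, a))
    return out

def over(top, bot):
    return [(ct + cb * (1 - at), at + ab * (1 - at))
            for (ct, at), (cb, ab) in zip(top, bot)]

def depth_of_field(a_layer, b_layer, blur_a, blur_b):
    # Blurred foreground composited over blurred background: the
    # softened silhouette of A correctly covers a sharp, complete B.
    return over(box_blur(a_layer, blur_a), box_blur(b_layer, blur_b))
```

Because B's layer is complete (not hidden-surface-eliminated against A), the region behind A's soft edge is filled correctly, unlike the single-image approximations.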

11 Image Generation

The mode of image generation for parts depends on the application. For z-bufferless rendering or rendering with transparency, we can use a priori BSP trees to render singleton SCCs. Unfortunately, for multi-object BMCs, a dynamic BSP merge of the BMC components must be computed, an expensive operation for geometry of even modest complexity [Torres90]. Still, this approach is viable for non-real-time software rendering, and limits BSP-merging to the components of the smaller BMCs rather than SCCs. Note also that the special case of singleton convex objects can be correctly rendered simply with backface culling.

The greatest applicability of the approach is in rendering acceleration, animation design, selective display, and reduced z-buffer use. Here, we are not prohibited from hardware rendering (e.g., z-buffering) within each BMC. While small BMCs are desirable so that relative motion doesn't destroy coherence, the existence of some multi-object BMCs is not catastrophic. We simply re-render these few BMCs at higher rates. In the case of design preview, we have to render each changed object plus the objects in its associated BMC, but this represents a small fraction of the entire scene's geometry. Such rendering may be in hardware for greater speed or in software for greater realism, but is accelerated in either case by these methods.


(a) no depth of field

(b) single layer depth of field approximation

(c) two layer visibility compositing depth of field approximation

Figure 16: Simulating Depth of Field with Image Blurring.


Figure 17: Toothpick Example (nobjs=800): This image shows a frame from the first experiment, drawn with hidden line elimination.

For special effects generation and incorporation of external image streams, the animation is commonly designed to eliminate multi-object BMCs involving the special objects. The methods of this paper find application during design to quickly verify that a layered decomposition exists. In some cases, such as computing depth of field blur by globally blurring image layers [Potmesil81], some multi-object BMCs may be tolerable, since the proximity of BMC components makes it likely that a single blur factor across all components is not an unreasonable approximation. In cases where this approximation incurs too much error, depth of field must be computed by slower means, such as by using per-pixel z-information to generate per-pixel blur.

For the hidden-line elimination problem, we simply draw each SCC in back-to-front order, essentially using Painter's algorithm. For each SCC, we first draw the filled outlines of all components in white to block occluded lines of previously drawn SCCs. We then draw the visible line segments of the SCC. Here the components of the SCC tell us the complete (and generally very small) set of relevant occluders. For each segment, we use ray casting to see which portions are visible, considering only elements of the SCC. An example is shown in Figure 17.
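The control flow of this hidden-line pass can be sketched structurally; the callbacks stand in for the white-fill rasterization and the per-segment ray-cast visibility test, and their names are ours.

```python
# Back-to-front hidden-line pass over a visibility-sorted SCC list.
def draw_hidden_line(sccs_back_to_front, fill_white, draw_visible_edges):
    """sccs_back_to_front: visibility-sorted list of SCCs, each a list
    of parts.  First mask lines of earlier (farther) SCCs with filled
    white outlines, then draw this SCC's visible segments, resolving
    visibility only against members of the same SCC."""
    for scc in sccs_back_to_front:
        for part in scc:
            fill_white(part)
        for part in scc:
            draw_visible_edges(part, occluders=scc)
```

The point of the structure is that `draw_visible_edges` receives only the current SCC as its occluder set, which is generally a very small fraction of the scene.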

12 Results

All timings are reported for one processor of a Gateway E5000-2300MMX PC with dual Pentium II 300MHz processors and 128MB of memory. Measured computation includes visibility sorting and kd-tree building and rebalancing. The kd-tree was built only once at the start of each animation; the amortized cost to build it is included in the "average" cpu times reported.

12.1 Tumbling Toothpicks

The first results involve a simulation of tumbling "toothpicks", eccentric ellipsoids moving in a cubical volume (Figure 17). The toothpicks bounce off the cube sides, but are allowed to pass through each other.5 Each toothpick contains 328 polygons and forms a single part. There are 250 frames in the animation.

Performance with Changing Complexity

In the first series of experiments, we measured cpu time per frame as a function of the number of toothpicks (Figure 18). Time per frame averaged over the entire animation and maximum time for any frame are both reported. One experiment, labeled "us" for uniform scale in the figure, adds more and more toothpicks of the same size to the volume. This biases the occlusion

5Note that the algorithm computes all intersecting pairs, a useful computational by-product for simulation. View frustum culling is also made trivial by computing the angular extents of the visible region once at the start of each frame and determining whether each object's angular extents intersect it.



objects             25    50    100   200   400   800   1600
avg. cpu (ms) [ud]  1.51  2.51  4.81  9.97  23.1  51.7  122
max cpu (ms) [ud]   2.40  3.55  6.28  11.8  26.3  56.9  131
avg. cpu (ms) [us]  1.32  2.17  4.21  10.1  27.0  92.5  849
max cpu (ms) [us]   2.28  3.11  5.57  11.9  30.5  134   3090

Figure 18: IVS performance with increasing number of objects.


objects          25    50    100   200   400   800   1600
ud cpu (ms)      6.61  9.52  17.5  36.6  75.9  173   387
us max cpu (ms)  6.14  9.51  17.6  36.7  77.4  181   442

Figure 19: Kd build performance with increasing number of objects.



velocity scale  .125   .25    .5     1      2      4      8      16     32     48     64
avg cpu (ms)    8.73   9.15   9.52   9.78   11.0   12.6   13.8   16.1   19.3   22.5   25.2
max cpu (ms)    10.3   10.9   11.2   11.6   13.2   15.3   16.7   19.5   25.0   26.9   32.0
percent diff    17.7%  19.3%  17.5%  18.5%  19.6%  21.7%  20.8%  21.0%  29.6%  19.9%  27.0%

Figure 20: Performance with increasing velocity (decreasing coherence).

complexity superlinearly with the number of objects, since there are many more collisions and the size of the average occlusion cycle increases. With enough toothpicks, the volume becomes filled with a solid mass of moving geometry, forming a single SCC. As previously discussed, the algorithm is designed for situations in which occlusion cycles are relatively small.

A more suitable measure of the algorithm's complexity preserves the average complexity per unit volume and simply increases the visible volume. This effect can be achieved by scaling the toothpicks by the cube root of their number ratio, so as to preserve average distance between toothpicks as a fraction of their length. The second experiment, labeled "ud" for uniform density, presents these results. The results indicate the expected roughly O(n) or O(n log n) rate of growth. The two experiments are normalized so that the simulations are identical within timing noise for nobjs=200: the uniform density experiment applies the scale (200/nobjs)^(1/3) to the toothpicks of the other trials. In particular, note that a simulation with 200 toothpicks (220 total objects including cube parts) can be computed at over 61Hz, making it practical for real-time applications. To verify the above scaling assumptions, the following table summarizes some visibility statistics (averaged over all frames of the animation) for the baseline scene with 200 toothpicks and the two scenes with 1600 toothpicks (uniform density, uniform scale):

measurement                             nobjs=200  nobjs=1600 [ud]  nobjs=1600 [us]
fraction of SCCs that are nonsingleton  .0454      .04633           .2542
fraction of parts in nonsingleton SCCs  .0907      0.0929           .5642
average size of nonsingleton SCCs       2.097      2.107            3.798
max size of nonsingleton SCCs           2.64       3.672            45.588

With the exception of the "max size of nonsingleton SCCs", which we would expect to increase given that the 1600-object simulation produces a greater probability that bigger SCCs will develop, the first two columns in the table are comparable, indicating a reasonable scaling, while the third indicates much greater complexity. Note also that the large maximum cpu time for the 1400 and 1600 uniform scale trials is due to the brief existence of much larger than average sized occlusion cycles.

We also measured time to build the kd-tree as a function of the number of objects for the uniform scale and uniform density simulations. The results are shown in Figure 19.

Performance with Changing Coherence

The second experiment measures cpu time with varying coherence. We globally scale the rate of camera movement and the linear and angular velocities of the toothpicks (Figure 20). The number of toothpicks was fixed at 200; the trial with velocity



objects              50     100    200    400    800    1600
k = 3 cpu (ms) [ud]  2.51   4.81   9.97   23.1   51.7   122
k = 1 cpu (ms) [ud]  2.65   5.48   12.7   38.0   124    447
% diff. [ud]         5.55%  14.0%  27.7%  64.6%  140%   267%
k = 3 cpu (ms) [us]  2.17   4.21   10.1   27.0   92.5   849
k = 1 cpu (ms) [us]  2.28   4.79   12.9   47.2   215    1780
% diff. [us]         5.05%  13.6%  28.7%  74.4%  132%   109%

Figure 21: Comparison of kd-tree culling with different numbers of extents. Cpu times are for the average case.

scale of 1 is thus identical to the trial with nobjs=200 in Figure 18. The algorithm is clearly sensitive to changing coherence, but exhibits only slow growth as the velocities become very large. Not surprisingly, the difference between average and worst case query times increases as coherence decreases, but the percentage difference remains fairly constant, between 17% and 30%.

To calibrate these results, for the unit scale trial the velocity measured at one toothpick end and averaged over all frames and toothpicks was 0.117% of S per frame (image space) and 0.269% of W per frame (world space), where S represents the length of the image window and W the length of the cube side containing the toothpicks. This amounts to an average of 14.2 and 6.2 seconds to traverse the image or cube side, respectively, at a 60Hz frame rate.6
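The traversal times quoted above follow directly from the per-frame velocities; a quick check (our arithmetic, not code from the paper):

```python
# An object covering p percent of a span per frame at the given frame
# rate traverses the whole span in 1 / (p/100 * hz) seconds.
def traverse_seconds(percent_per_frame, hz=60.0):
    return 1.0 / (percent_per_frame / 100.0 * hz)

image_secs = traverse_seconds(0.117)  # image window: about 14.2 s
world_secs = traverse_seconds(0.269)  # cube side: about 6.2 s
```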

Performance with Number of Kd Dimensions

In a third experiment (Figure 21), we compared performance of the algorithm using kd-trees that sort by different numbers of extents. The same simulations were run as in the first experiment, either exactly as before (k = 3, using two angular extents and the perpendicular camera-dependent spatial extent Z), or using kd-tree partitioning only in a single dimension (k = 1, using only Z). In the second case, the two angular extents were still used for occlusion culling, but not for kd-tree partitioning. This roughly simulates the operation of the NNS algorithm, which first examines objects that overlap in depth before applying further culls using screen bounding boxes. It can be seen that simultaneously searching all dimensions is much preferable, especially as the number of objects increases. For example, in the uniform density case, using a single direction rather than three degrades performance by 14% for 100 objects, 28% for 200, 65% for 400, up to 267% for 1600. The differences in the uniform scale case are still significant but less dramatic, since occlusion culling plays a less important role than layer reordering and occlusion cycle detection.

6While this baseline may seem somewhat slow-moving, it should be noted that small movements of the parts in this simulation can cause large changes in their occlusion graph, with attendant computational cost. We believe this situation to be more difficult than typical computer graphics animations. Stated another way, most computer graphics animations will produce similar occlusion topology changes only at much higher velocities.


Depth of Field Effects

We used the visibility sorting results to create a depth of field blurred result using compositing operations as described in Section 10, and compared it to a version created with 21 accumulation buffer passes. The results are shown in Figure 22. For the visibility compositing result, a constant blur factor was determined from the circle of confusion at the centroid of the object or object group, for all objects except the cube sides. Because of the large depth extent of the cube sides, these few parts were generated using the accumulation buffer technique on the individual layer parts and composited into the result with the rest.

12.2 Canyon Flyby

The second results involve a set of aircraft flying in formation inside a winding valley (Figure 23). We pre-split the valley terrain (see Section 9) into split parts using 2D grids of separating planes and an inflation factor of 20%. The animation involves six aircraft, each divided into six parts (body, wing, rudder, engine, hinge, and tail); polygon counts are given in the table below:

object             polygons  hull polygons
body               1558      192
engine             1255      230
wing               1421      80
tail               22        22
rudder             48        28
hinge              64        46
sky (sphere)       480       N/A
terrain (unsplit)  2473      N/A

Using terrain splitting grids of various resolutions, we investigated rendering acceleration using image-based interpolation of part images. The following table shows average polygon counts per split part for experimental terrain splits using 2D grids of 20×20, 14×14, 10×10, 7×7, and 5×5:

grid   split objects  polygons/object  hull polygons/object
20×20  390            31.98            29.01
14×14  191            48.91            37.78
10×10  100            76.48            42.42
7×7    49             130.45           57.51
5×5    25             225.32           72.72

Note that the "polygons" and "polygons/object" columns in the above tables are a measure of the average rendering cost of each part, while the "hull polygons" and "hull polygons/object" columns are an indirect measure of computational cost for the visibility sorting algorithm, since it deals with hulls rather than actual geometry.

Following results from [Lengyel97], we assumed the 6 parts of each aircraft required a 20% update rate (i.e., could be rendered every fifth frame and interpolated the rest), the terrain a 70% update rate, and the sky a 40% update rate. These choices produce a result in which the interpolation artifacts are almost imperceptible. To account for the loss of coherence which occurs when parts are aggregated into a single layer, we conservatively assumed that all parts so aggregated must be rendered every frame (100% update rate), which we call the aggregation penalty. The results are summarized below:

split   cpu (ms)       terrain  agg.      agg.      update rate,        update rate,
        avg     max    expan.   layers    parts     unit weighting      poly weighting
                       fac.     fraction  fraction  (agg)    (no agg)   (agg)    (no agg)
20×20   17.33   29.82  5.04     0.1%      0.2%      58.3%    58.1%      41.8%    40.7%
14×14   8.08    14.59  3.78     0.4%      1.0%      51.3%    50.2%      38.0%    36.0%
10×10   5.17    9.88   3.09     1.9%      6.4%      48.7%    40.9%      40.4%    30.7%
7×7     4.51    9.68   2.58     5.2%      22.5%     51.9%    37.8%      42.4%    27.8%
5×5     5.37    11.01  2.28     13.0%     53.1%     71.5%    32.3%      72.0%    26.1%

The column "cpu" shows average and maximum cpu time per frame in milliseconds. The next column ("terrain expansion factor") is the factor increase in the number of polygons due to splitting and overlap; this is equal to the total number of polygons


(a) Depth of Field with the accumulation buffer (21 rendering passes)

(b) Depth of Field with visibility compositing

Figure 22: Comparison of Depth of Field Generation Methods: The images show two different depth of field renderings from the tumbling toothpicks experiment. Toothpicks comprising a multi-object layer share a common color; singleton layers are drawn in white. Note the occlusion relationships between the sphere/cylinder joints at the cube sides, computed using the algorithm in C. While pairs of spheres and cylinders are sometimes mutually occluding, the algorithm is able to prevent any further occlusion cycle growth.


Figure 23: Scene from canyon flyby: computed using image compositing of sorted layers with a 14×14 terrain split. The highlighted terrain portion is one involved in a detected occlusion cycle with the ship above it, with respect to the bounding convex hulls.

in the split terrain divided by the original number, 2473. The next columns show the fraction of visible layers that include more than one part ("aggregate layers fraction"), followed by the fraction of visible parts that are aggregated ("aggregated parts fraction"). Visible in this context means not outside the viewable volume. Average re-rendering (update) rates under various weightings and assumptions follow: unit weighting per part with and without the aggregation penalty ("update rate, unit weighting, (agg)" and "… (no agg)"), followed by the analogs for polygon number weighting. Smaller rates are better, in that they indicate greater reuse of image layers through interpolation and less actual rendering. The factors without aggregation are included to show how much the rendering rate is affected by the presence of undecomposable multi-object layers. The polygon-weighted rates account for the fact that the terrain has been decomposed into an increased number of polygons. This is done by scaling the rates of all terrain objects by the terrain expansion factor.
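The polygon-weighted rate can be sketched as follows; the part tuples and the exact weighting are our reading of the text, not the paper's code.

```python
# Hypothetical polygon-weighted update rate: each part's rate is
# weighted by its polygon count, with terrain rates scaled by the
# expansion factor to charge for the extra polygons from splitting.
def polygon_weighted_rate(parts, terrain_expansion):
    """parts: list of (polygons, update_rate, is_terrain) tuples."""
    total = sum(polys for polys, _, _ in parts)
    weighted = sum(polys * rate * (terrain_expansion if is_terrain else 1.0)
                   for polys, rate, is_terrain in parts)
    return weighted / total
```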

In summary, the best polygon-weighted reuse rate in this experiment, 38%, is achieved by the 14×14 split. Finer splitting incurs a penalty for increasing the number of polygons in the terrain, without enough payoff in terms of reducing aggregation. Coarser splitting decreases the splitting penalty but also increases the number of layer aggregations, in turn reducing the reuse rate via the aggregation penalty. Note the dramatic increase from 7×7 to 5×5: splits below this level fill up concavities in the valley too much, greatly increasing the portion of aggregated objects.

It should be noted that the reuse numbers in this experiment become higher if the fraction of polygons in objects with more coherence (in this case, the aircraft) is increased, or more such objects are added. Allowing independent update of the terrain's layers would also improve reuse, although as pointed out in [Lengyel97] this results in artificial apparent motion between terrain parts.

13 Conclusion

Many applications exist in image-based rendering for an algorithm that performs visibility sorting without splitting, including rendering acceleration, fast special effects generation, animation design, and incorporation of external image streams into a synthetic animation. These techniques all derive from the observation that 2D image processing is cheaper than 3D rendering and often suffices. By avoiding oftentimes unnecessary splitting, these techniques better exploit the temporal coherence present in most animations, and allow sorting at the level of objects rather than polygons. We've shown that the non-splitting visibility sorting required in these applications can be computed in real-time on PCs, for scenes of high geometric and occlusion complexity, and demonstrated a few of the many applications.

Much future work remains. Using more adaptive ways of splitting terrain and container objects is a straightforward extension. Incorporation of space-time volumes would allow visibility-correct motion blur using 2D image processing techniques. We believe animation design tools can fruitfully use visibility sorting to quickly preview modified parts of an animation in their complete context without re-rendering the unmodified parts, but this application requires further development. Opportunities


also exist to invent faster and less conservative occlusion tests for special geometric cases. Finally, further development is needed for fast hardware which exploits software visibility sorting and performs real-time image operations, like compositing with multiple image registers, blurring, warping, and interpolation, combined with 3D rendering.

Acknowledgments

We thank the Siggraph reviewers for their careful reading and many helpful suggestions. Jim Blinn suggested the "sum of outs" resolution for nonbinary cycles. Brian Guenter provided a most helpful critical reading. Susan Temple has been an early adopter of a system based on these ideas and has contributed many helpful suggestions. Jim Kajiya and Conal Elliot were involved in many discussions during the formative phase of this work.

References

[Appel67] Appel, A., "The Notion of Quantitative Invisibility and the Machine Rendering of Solids," Proceedings of the ACM National Conference, pp. 387-393, 1967.

[Baraff90] Baraff, David, "Curved Surfaces and Coherence for Non-Penetrating Rigid Body Simulation," Siggraph '90, August 1990, pp. 19-28.

[Bentley75] Bentley, J.L., "Multidimensional Binary Search Trees Used for Associative Searching," Communications of the ACM, 18(1975), pp. 509-517.

[Chen96] Chen, Han-Ming, and Wen-Teng Wang, "The Feudal Priority Algorithm on Hidden-Surface Removal," Siggraph '96, August 1996, pp. 55-64.

[Chung96a] Chung, Kelvin, and Wenping Wang, "Quick Collision Detection of Polytopes in Virtual Environments," ACM Symposium on Virtual Reality Software and Technology 1996, July 1996, pp. 1-4.

[Chung96b] Chung, Tat Leung (Kelvin), "An Efficient Collision Detection Algorithm for Polytopes in Virtual Environments," M. Phil Thesis at the University of Hong Kong, 1996 [www.cs.hku.hk/ tlchung/collision library.html].

[Cohen95] Cohen, D.J., M.C. Lin, D. Manocha, and M. Ponamgi, "I-Collide: An Interactive and Exact Collision Detection System for Large-Scale Environments," Proceedings of the Symposium on Interactive 3D Graphics, 1995, pp. 189-196.

[Cook84] Cook, Robert, "Distributed Ray Tracing," Siggraph '84, July 1984, pp. 137-145.

[Durand97] Durand, Fredo, George Drettakis, and Claude Puech, "The Visibility Skeleton: A Powerful and Efficient Multi-Purpose Global Visibility Tool," Siggraph '97, August 1997, pp. 89-100.

[Fuchs80] Fuchs, H., Z.M. Kedem, and B.F. Naylor, "On Visible Surface Generation by A Priori Tree Structures," Siggraph '80, July 1980, pp. 124-133.

[Funkhouser92] Funkhouser, T.A., C.H. Sequin, and S.J. Teller, "Management of Large Amounts of Data in Interactive Building Walkthroughs," Proceedings of 1992 Symposium on Interactive 3D Graphics, 1992, pp. 11-20.

[Gilbert88] Gilbert, Elmer G., Daniel W. Johnson, and S. Sathiya Keerthi, "A Fast Procedure for Computing the Distance Between Complex Objects in Three-Dimensional Space," IEEE Journal of Robotics and Automation, 4(2), April 1988, pp. 193-203.

[Greene93] Greene, N., M. Kass, and G. Miller, "Hierarchical Z-buffer Visibility," Siggraph '93, August 1993, pp. 231-238.

[Haeberli90] Haeberli, Paul, and Kurt Akeley, "The Accumulation Buffer: Hardware Support for High-Quality Rendering," Siggraph '90, August 1990, pp. 309-318.

[Kay86] Kay, Tim, and J. Kajiya, "Ray Tracing Complex Scenes," Siggraph '86, August 1986, pp. 269-278.

[Lengyel97] Lengyel, Jed, and John Snyder, "Rendering with Coherent Layers," Siggraph '97, August 1997, pp. 233-242.

Page 31: Visibility Sorting and Compositing for Image-Based Rendering · PDF fileVisibility Sorting and Compositing for Image-Based Rendering ... is practical for interactive animation

[Maciel95] Maciel, Paolo W.C. and Peter Shirley, “Visual Navigation of Large Environments Using Textured Clusters,”Proceedings 1995 Symposium on Interactive 3D Graphics, April 1995, pp. 95-102.

[Mark97] Mark, William R., Leonard McMillan, and Gary Bishop, “Post-Rendering 3D Warping,” Proceedings 1997 Sympo-sium on Interactive 3D Graphics, April 1997, pp. 7-16.

[Markosian97] Markosian, Lee, M.A. Kowalski, S.J. Trychin, L.D. Bourdev, D. Goodstein, and J.F. Hughes, “Real-TimeNonphotorealistic Rendering,” Siggraph ’97, August 1997, pp. 415-420.

[Max85] Max, Nelson, and Douglas Lerner, “A Two-and-a-Half-D Motion-Blur Algorithm,” Siggraph ’85, July 1985, pp.85-93.

[McKenna87] McKenna, M., “Worst-Case Optimal Hidden Surface Removal,” ACM Transactions on Graphics, 1987, 6, pp.19-28.

[Ming97] Ming-Chieh Lee, Wei-ge Chen, Chih-lung Bruce Lin, Chunag Gu, Tomislav Markoc, Steven I. Zabinsky, and RichardSzeliski, “A Layered Video Object Coding System Using Sprite and Affine Motion Model,” IEEE Transactions onCircuits and Systems for Video Technology, 7(1), February 1997, pp. 130-145.

[Molnar92] Molnar, Steve, John Eyles, and John Poulton, “PixelFlow: High-Speed Rendering Using Image Compositing,” Siggraph ’92, August 1992, pp. 231-240.

[Mulmuley89] Mulmuley, K., “An Efficient Algorithm for Hidden Surface Removal,” Siggraph ’89, July 1989, pp. 379-388.

[Naylor92] Naylor, B.F., “Partitioning Tree Image Representation and Generation from 3D Geometric Models,” Proceedings of Graphics Interface ’92, May 1992, pp. 201-212.

[Newell72] Newell, M. E., R. G. Newell, and T. L. Sancha, “A Solution to the Hidden Surface Problem,” Proc. ACM National Conf., 1972.

[Ponamgi97] Ponamgi, Madhav K., Dinesh Manocha, and Ming C. Lin, “Incremental Algorithms for Collision Detection between Polygonal Models,” IEEE Transactions on Visualization and Computer Graphics, 3(1), March 1997, pp. 51-64.

[Porter84] Porter, Thomas, and Tom Duff, “Compositing Digital Images,” Siggraph ’84, July 1984, pp. 253-258.

[Potmesil81] Potmesil, Michael, and Indranil Chakravarty, “A Lens and Aperture Camera Model for Synthetic Image Generation,” Siggraph ’81, August 1981, pp. 297-305.

[Potmesil83] Potmesil, Michael, and Indranil Chakravarty, “Modeling Motion Blur in Computer-Generated Images,” Siggraph ’83, July 1983, pp. 389-399.

[Preparata85] Preparata, Franco P., and Michael Ian Shamos, Computational Geometry, Springer-Verlag, New York, NY, 1985.

[Regan94] Regan, Matthew, and Ronald Pose, “Priority Rendering with a Virtual Address Recalculation Pipeline,” Siggraph ’94, August 1994, pp. 155-162.

[Rokita93] Rokita, Przemyslaw, “Fast Generation of Depth of Field Effects in Computer Graphics,” Computers and Graphics,17(5), 1993, pp. 593-595.

[Schaufler96] Schaufler, Gernot, and Wolfgang Sturzlinger, “A Three Dimensional Image Cache for Virtual Reality,” Proceedings of Eurographics ’96, August 1996, pp. 227-235.

[Schaufler97] Schaufler, Gernot, “Nailboards: A Rendering Primitive for Image Caching in Dynamic Scenes,” Proceedings of the 8th Eurographics Workshop on Rendering ’97, St. Etienne, France, June 16-18, 1997, pp. 151-162.

[Schumacker69] Schumacker, R.A., B. Brand, M. Gilliland, and W. Sharp, “Study for Applying Computer-Generated Imagesto Visual Simulation,” AFHRL-TR-69-14, U.S. Air Force Human Resources Laboratory, Sept. 1969.

[Sedgewick83] Sedgewick, Robert, Algorithms, Addison-Wesley, Reading, MA, 1983.



Figure 24: Quadrant Labeling for Angular Measurement. (Quadrants are labeled 0–3 in order of increasing angle over the range −π+ε to π−ε.)

[Shade96] Shade, Jonathan, Dani Lischinski, David H. Salesin, Tony DeRose, and John Snyder, “Hierarchical Image Caching for Accelerated Walkthroughs of Complex Environments,” Siggraph ’96, August 1996, pp. 75-82.

[Sillion97] Sillion, Francois, George Drettakis, and Benoit Bodelet, “Efficient Impostor Manipulation for Real-Time Visualization of Urban Scenery,” Proceedings of Eurographics ’97, Sept 1997, pp. 207-218.

[Snyder98] Snyder, John, Jed Lengyel, and Jim Blinn, “Resolving Non-Binary Cyclic Occlusions with Image Compositing,”Microsoft Technical Report, MSR-TR-98-05, March 1998.

[Sudarsky96] Sudarsky, Oded, and Craig Gotsman, “Output-Sensitive Visibility Algorithms for Dynamic Scenes with Applications to Virtual Reality,” Computer Graphics Forum, 15(3), Proceedings of Eurographics ’96, pp. 249-258.

[Sutherland74] Sutherland, Ivan E., Robert F. Sproull, and Robert A. Schumacker, “A Characterization of Ten Hidden-Surface Algorithms,” Computing Surveys, 6(1), March 1974, pp. 1-55.

[Teller91] Teller, Seth, and C.H. Sequin, “Visibility Preprocessing for Interactive Walkthroughs,” Siggraph ’91, July 1991, pp. 61-69.

[Teller93] Teller, Seth, and P. Hanrahan, “Global Visibility Algorithms for Illumination Computations,” Siggraph ’93, August 1993, pp. 239-246.

[Torborg96] Torborg, Jay, and James T. Kajiya, “Talisman: Commodity Realtime 3D Graphics for the PC,” Siggraph ’96, August 1996, pp. 353-364.

[Torres90] Torres, E., “Optimization of the Binary Space Partition Algorithm (BSP) for the Visualization of Dynamic Scenes,”Proceedings of Eurographics ’90, Sept. 1990, pp. 507-518.

[Wang94] Wang, J.Y.A., and E.H. Adelson, “Representing Moving Images with Layers,” IEEE Trans. Image Processing, vol.3, September 1994, pp. 625-638.

[Zhang97] Zhang, Hansong, Dinesh Manocha, Thomas Hudson, and Kenneth Hoff III, “Visibility Culling Using Hierarchical Occlusion Maps,” Siggraph ’97, August 1997, pp. 77-88.

A Optimizing the Angular Extent Objective Function

Equation 2 measures angular extents using the inverse tangent function. But the angular extent minimization procedure (Section 5.1) can avoid this evaluation by observing that we need only find whether a given angle is less than the current



Figure 25: Chung algorithm: On the left, D is a separating direction between hulls A and B. vA is the vertex on A maximizing the projection onto D; vB is the vertex on B minimizing the projection onto D. On the right, D fails to separate. The Chung algorithm computes a new separating direction, D′, by reflecting D about the vector R from vB to vA.

minimum, postponing a single inverse tangent evaluation until the global minimum is attained. To compute whether the angle formed by a given 2D vector (x, y) is less than the current minimum represented by (x₀, y₀), the quadrant labeling scheme shown in Figure 24 is used. Here, x represents the denominator in the argument of arctan in Equation 2, and y the numerator. Clearly, if the new angle is in a different quadrant than the old minimum, then whether the angle is smaller or larger can be determined by whether the quadrant label is smaller or larger. Quadrant labels are computed by simple sign testing of x and y.

If the angles are in the same quadrant, then the following test determines whether the angle of (x, y) is smaller:

y₀ x > x₀ y.

That this formula is correct in quadrants 1 and 2 follows because the tangent function is monotonically increasing on (−π/2, π/2), so

y₀ x > x₀ y  ⟹  y/x < y₀/x₀  ⟹  arctan(y/x) < arctan(y₀/x₀),

since x and x₀ are positive in quadrants 1 and 2. In quadrants 0 and 3, the test also holds, which can be seen by negating x (which is always negative in these quadrants) and maximizing rather than minimizing the resulting reflected angle.
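As a concrete sketch of this comparison (helper names are hypothetical; the quadrant labels follow Figure 24, with x > 0 in quadrants 1 and 2):

```python
def quadrant(x, y):
    # Quadrant labels per Figure 24: increasing label means increasing
    # angle over (-pi, pi); x > 0 in quadrants 1 and 2, x < 0 in 0 and 3.
    if y < 0:
        return 0 if x < 0 else 1
    return 2 if x > 0 else 3

def angle_less(x, y, x0, y0):
    """True if the angle of (x, y) is less than that of (x0, y0),
    with no inverse tangent evaluations."""
    q, q0 = quadrant(x, y), quadrant(x0, y0)
    if q != q0:
        return q < q0        # different quadrants: compare labels
    return y0 * x > x0 * y   # same quadrant: the cross-product test
```

For unit vectors in the same quadrant, y₀x − x₀y = sin(θ₀ − θ), which is positive exactly when θ < θ₀, matching the derivation above.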

B Incremental Convex Collision Detection

To incrementally detect collisions and occlusions between moving 3D convex polyhedra, we use a modification of Chung’s algorithm [Chung96a, Chung96b]. The main idea is to iterate over a potential separating plane direction between the two objects. Given a direction, it is easy to find the extremal vertices with respect to that direction, as already discussed in Section 5.1. Given a direction D, if vA is the vertex on object A maximizing the projection onto D, and vB is the vertex on B minimizing the projection onto D, then D is a separating direction if

D · vA < D · vB.

In other words, D is a separating direction if it points in the same direction as vB − vA (i.e., has positive projection onto it). This situation is shown on the left in Figure 25. If D fails to separate the objects, then it is updated by reflecting with respect to the line joining the two extremal points. Mathematically,

D′ = D − 2(R · D)R

where R is the unit vector in the direction vB − vA. This situation is shown on the right in Figure 25. [Chung96b] proves that if the objects are indeed disjoint, then this algorithm converges to a separating direction for the objects A and B. Coherence is

31

Page 34: Visibility Sorting and Compositing for Image-Based Rendering · PDF fileVisibility Sorting and Compositing for Image-Based Rendering ... is practical for interactive animation

achieved for disjoint objects because the separating direction from the previous invocation often suffices as a witness to their disjointness in the current invocation, or suffices after a few of the above iterations.
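A minimal sketch of this iteration, operating on convex hulls given as vertex lists (helper names are hypothetical, and the brute-force support search stands in for the coherent vertex gradient descent of Section 5.1):

```python
import math

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def support(verts, d):
    # Extremal vertex of a convex vertex set in direction d (brute force).
    return max(verts, key=lambda v: dot(v, d))

def chung_separating_direction(A, B, d=(1.0, 0.0, 0.0), max_iters=20):
    """Return a separating direction for disjoint convex hulls A and B,
    or None if none is found within max_iters (the hulls may intersect)."""
    for _ in range(max_iters):
        vA = support(A, d)                     # maximizes projection on A
        vB = support(B, tuple(-c for c in d))  # minimizes projection on B
        if dot(d, vA) < dot(d, vB):
            return d                           # witness to disjointness
        r = tuple(b - a for a, b in zip(vA, vB))  # vB - vA
        n = math.sqrt(dot(r, r))
        if n == 0.0:
            return None                        # extremal points coincide
        r = tuple(c / n for c in r)
        # Reflect about the line joining the extremals: D' = D - 2(R.D)R
        d = tuple(dc - 2.0 * dot(r, d) * rc for dc, rc in zip(d, r))
    return None
```

Passing the previous frame’s separating direction as the starting guess d is what makes the query coherent: for disjoint objects the loop typically exits after at most a few reflections.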

While it is well known that collisions between linearly transforming and translating convex polyhedra can be detected with efficient, coherent algorithms, Chung’s algorithm has several advantages over previous methods, notably the Voronoi feature tracking algorithm [Ponamgi97] and Gilbert’s algorithm [Gilbert88]. The inner loop of Chung’s algorithm finds the extremal vertex with respect to a current direction, a very fast operation for convex polyhedra. Also, the direction can be transformed to the local space of each convex hull once and then used in the vertex gradient descent algorithm. Chung found a substantial speedup factor in experiments comparing his algorithm with its fastest competitors. Furthermore, Chung found that most queries were resolved with only a few iterations (< 4) of the separating direction.

To detect the case of object collision, Chung’s algorithm keeps track of the directions from vA to vB generated at each iteration and detects when these vectors span more than a hemispherical set of directions in S². This approach works well in the 3D simulation domain, where collision responses are generated that tend to keep objects from interpenetrating, making collisions relatively evanescent. In the visibility sorting domain, however, there is no guarantee that a collision between the convex hulls of some object pair will not persist in time. For example, a terrain cell’s convex hull may encompass several objects for many frames. In this case, Chung’s algorithm is quite inefficient.

To achieve coherence for colliding objects, we use a variant of Gilbert’s algorithm [Gilbert88]. In brief, Gilbert’s algorithm iterates over vertices on the Minkowski difference of the two objects, by finding extremal vertices on the two objects with respect to computed directions. A set of up to four vertex pairs is stored, and the closest point to the origin on the convex hull of these points is computed at each iteration, using Johnson’s algorithm for computing the closest point on a simplex to the origin. If the convex hull contains the origin, then the two objects intersect. Otherwise, the direction to this point becomes the direction used to locate extremal vertices for the next iteration. In the case of collision, a tetrahedron on the Minkowski difference serves as a coherent witness to the objects’ collision. We also note that the extremal vertex search employed in Gilbert’s algorithm can be made more spatially coherent by caching the vertex from the previous search on each of the two objects and always starting from that vertex in a search query.

The final, hybrid algorithm uses up to 4 Chung iterations if the objects were disjoint in the previous invocation. If the algorithm finds a separating plane, it halts. Otherwise, Gilbert’s iteration is used to find a witness to collision or a separating plane. In the case in which Chung iteration fails, Gilbert’s algorithm is initialized with the 4 pairs of vertices found in the Chung iterations. The result is an algorithm that functions incrementally for both colliding and disjoint objects and requires only a single query on geometry: return the extremal vertex on the object given a direction.

The algorithm can be used to detect collision between two convex polyhedra, or for point inclusion queries (i.e., single point vs. convex polyhedron). It can also be used for occlusion detection between convex polyhedra given an eye point E. To detect whether A → B, we can test whether B′ ≡ ch(B ∪ {E}) intersects A, where ch(X) represents the convex hull of X. Fortunately, there is no need to actually compute the polytope B′. Instead, the extremal vertex search over B′ is computed by first searching B as before. We then simply compare that result with the dot product of the direction with E to see if E is more extreme and, if so, return E.
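This support query on B′ might be sketched as follows (hypothetical names; vertices as 3-tuples):

```python
def dot3(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def support_hull_with_eye(B_verts, E, d):
    """Extremal vertex of B' = ch(B u {E}) in direction d, computed
    without ever building B': search B as usual, then return E instead
    if E is more extreme along d."""
    vB = max(B_verts, key=lambda v: dot3(v, d))
    return E if dot3(E, d) > dot3(vB, d) else vB
```

Substituting this query for the plain support search on B lets the same collision machinery answer occlusion queries unchanged.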

C Exact Occlusion Testing for Sphere/Cylinder Joints

This section presents a method for testing occlusion between a sphere and a cylinder tangent to it at its endcap. Let the sphere have center O and radius s. The cylinder has unit-length central axis in direction V away from the sphere, and radius r, r ≤ s. Note that the convex hulls of such a configuration intersect (one cylindrical endcap is entirely inside the sphere), and thus the methods of Section 6 always indicate mutual occlusion. However, two exact tests can be used to “split” the cycle, indicating a single occlusion arc between the sphere and cylinder.⁷

The cylinder occludes the sphere (and not vice versa) if the eye is on the cylinder side of the endcap plane; i.e.,

V · (E − O) − h ≥ 0

where E is the eye point, and where h ≡ √(s² − r²) is the distance from O to the plane of intersection.

The sphere occludes the cylinder (and not vice versa) if the circle where the sphere and cylinder intersect is invisible. This can be tested using the cone with apex P along the cylinder’s central axis whose emanating rays are tangent to the sphere at the circle of intersection. If the eye point is inside this cone, then the circle of intersection is entirely occluded by the

⁷We assume the eye point E is not inside either object.



Figure 26: Sphere/Cylinder Joint Occlusion: A cylinder of radius r in direction V is tangent to the sphere of radius s with center O. An occlusion relationship can be derived using the plane through the intersection, at distance h from O along V, and the cone with apex at P such that lines tangent to the sphere through P pass through the circle of intersection.

sphere, and so the cylinder is behind the sphere. We define l ≡ r²/h + h = s²/h, representing the distance from P to O; P is thus given by

O + lV. Then the sphere completely occludes the circle of intersection if

(E − P) · (O − P) ≥ 0

and

[(E − P) · (O − P)]² ≥ (l² − s²) (E − P) · (E − P)

where the first test indicates whether E is in front of the cone apex, and the second efficiently tests the square of the cosine of the angle, without using square roots. Note that h and l can be computed once as a preprocess, even if O and V vary as the joint moves.

If both of these tests fail, then the sphere and cylinder are mutually occluding.
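The two tests combine into a small classifier; the sketch below follows the section’s assumptions (unit axis V, r ≤ s, eye outside both objects), with hypothetical names and string labels for the three outcomes:

```python
import math

def sub3(a, b):
    return (a[0]-b[0], a[1]-b[1], a[2]-b[2])

def dot3(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def sphere_cylinder_occlusion(O, s, V, r, E):
    """Classify occlusion between a sphere (center O, radius s) and a
    tangent cylinder (unit axis V, radius r <= s) seen from eye E.
    Returns 'cylinder' (cylinder occludes sphere), 'sphere', or 'mutual'."""
    h = math.sqrt(s*s - r*r)   # distance from O to the endcap plane
    l = s*s / h                # distance from O to the cone apex P
    # Test 1: eye on the cylinder side of the endcap plane.
    if dot3(V, sub3(E, O)) - h >= 0.0:
        return 'cylinder'
    # Test 2: eye inside the tangent cone with apex P = O + l*V, so the
    # circle of intersection is hidden behind the sphere.
    P = tuple(Oc + l*Vc for Oc, Vc in zip(O, V))
    EP = sub3(E, P)
    OP = sub3(O, P)
    c = dot3(EP, OP)
    if c >= 0.0 and c*c >= (l*l - s*s) * dot3(EP, EP):
        return 'sphere'
    return 'mutual'
```

As in the text, h and l depend only on s and r and could be precomputed once per joint.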
