Accelerated Occlusion Culling using Shadow Frusta

†

T. Hudson, D. Manocha, J. Cohen, M. Lin, K. Hoff, H. Zhang

Department of Computer Science

University of North Carolina

Chapel Hill, NC 27599-3175

{hudson,manocha,cohenj,lin,hoff,[email protected]

Abstract: Many applications in [...] and virtual environments need to render datasets with large numbers of primitives and high depth complexity at interactive rates. However, standard techniques like view-frustum culling and a hardware z-buffer are unable to display datasets composed of hundreds of thousands of polygons at interactive frame rates on current high-end graphics systems. We add a "conservative" visibility culling stage to the rendering pipeline, attempting to identify and avoid processing of occluded polygons. Given a moving viewpoint, the algorithm dynamically chooses a set of occluders. Each occluder is used to compute a shadow frustum, and all primitives contained within this frustum are culled. The algorithm hierarchically traverses the model, culling out parts not visible from the current viewpoint using efficient, robust, and in some cases specialized interference detection algorithms. The algorithm's performance varies with the location of the viewpoint and the depth complexity of the model. In the worst case it is linear in the input size with a small constant. In this paper, we demonstrate its performance on a city model composed of 500,000 polygons and possessing varying depth complexity. We are able to cull an average of 55% of the polygons that would not be culled by view-frustum culling and obtain a commensurate improvement in frame rate. The overall approach is effective and scalable, is applicable to all polygonal models, and can be easily implemented on top of view-frustum culling.

* Supported in part by a Sloan fellowship, ARO Contract P-34982-MA, NSF grant CCR-9319957, NSF grant CCR-9625217, ONR Young Investigator Award, DARPA contract DABT63-93-C-0048, NSF/ARPA Science and Technology Center for Computer Graphics & Scientific Visualization, NSF Prime contract No. 8920219.

† Also with U.S. Army Research Office.

To appear in the proceedings of the Thirteenth ACM Symposium on Computational Geometry, June 1997, Nice, France.

1 Introduction

Interactive display of extremely large and complex geometric data sets has long been an important problem in computer graphics. Although throughput of graphics systems has improved considerably over the years, the size and complexity of models has grown even faster. In many walkthrough and virtual environment applications, models commonly consist of millions of primitives. Rendering such models at interactive rates is a major challenge. There is a great body of literature in computer graphics and computational geometry using techniques based on visibility culling and model simplification to render such large models. In this paper, we address visibility culling.

Given a large model and a viewpoint, the goal of visibility culling and hidden surface removal algorithms is to determine the set of primitives visible from that viewpoint. No general purpose, interactive algorithms are known for exact visibility determination on large models composed of hundreds of thousands of polygons. Current graphics hardware provides support for visibility computations on a per-pixel basis with a z-buffer. However, high-end systems are only able to render a few tens of thousands of polygons at reasonable frame rates, due to bottlenecks in their hardware before the per-pixel analyses can be carried out. As a result, many applications use the following high-level software techniques to cull away a subset of the polygons which are not visible from the current viewpoint before they are needlessly sent to the rendering hardware:

• View-Frustum Culling: View-frustum culling uses a traversal of spatial data structures to cull out portions of the model not lying in the current view frustum. The viewing volume is represented as a frustum of six planes (near plane, far plane, and four sides). At runtime the display algorithm checks whether each node of the spatial structure overlaps this frustum; only nodes partially or completely contained within the frustum are rendered.

• Occlusion Culling: Hidden-surface removal algorithms and occlusion culling techniques are commonly used for models with high depth complexity to further reject the portion of geometry obscured by other objects in the scene. Some of the existing techniques are based on backface culling, binary space partition trees, or partitioning the models into cells and portals.

Desiderata: In designing algorithms for occlusion culling, we recognize the importance of generality and robustness. We make no assumptions about the structure of our input models; they may be arbitrary sets of polygons with no other topological information, also known as "polygon soup". The problem of 3D occlusion culling involves the computation of some geometric relationship between two or more objects. In our case, we reduce the problem to performing overlap tests between convex objects in 2D or 3D. Based on our experience in developing two interference detection systems, I-COLLIDE [CLMP95] and RAPID [GLM96], as well as that of other authors in implementing algorithms for interference detection [HKM95, BCG+96] and intersection computation for solid modeling [For96, HHK89], we have realized that robustness is an important issue in the design and implementation of interference detection algorithms. Our goal is to develop algorithms which are relatively simple, efficient, and not prone to geometric degeneracies.

In this paper, we present object-space techniques for occlusion culling. This involves computation of a set of polygons that are within the view frustum but are not visible from the current viewpoint. We add this conservative visibility culling stage to the

rendering pipeline in order to reduce the number of polygons sent to the graphics hardware.

As the viewpoint changes, the algorithm dynamically chooses a set of occluders. Each occluder is a convex polytope or a union of convex polytopes. For each occluder the algorithm computes a shadow frustum and uses fast interference detection and a hierarchical representation to find those portions of the model within the shadow frustum. The idea of shadow volumes was first introduced by Crow [Cro77] to generate shadows by creating for each object a shadow volume that the object blocks from the light source.

Main contribution: We present geometric algorithms for

1. Occluder selection using off-line and on-line techniques

2. Robust and efficient occlusion culling based on specialized interference detection algorithms, given the occluders and a hierarchical decomposition of scene geometry into bounding volumes.

The resulting algorithms have been implemented and we report their performance on a large model.

Organization: The rest of the paper is organized in the following manner. We survey related work in Section 2 and give an overview of the algorithm in Section 3. The occluder selection algorithm is presented in Section 4 and visibility culling based on occluders is described in Section 5. We present implementation details and the performance in Section 6 and analyse its complexity in Section 7.

2 Related Work

There has been a significant amount of research on visibility and hidden surface removal in the fields of computer graphics and computational geometry. Many asymptotically efficient algorithms have been proposed for exact visibility and hidden surface removal [SSS74, Mul89, BDEG94, McK87]. [Dor94] provides a recent survey of object-space hidden surface removal algorithms. McKenna and Seidel [MS85] have presented an algorithm for computing optimal shadows of a convex polytope. For static models, it is possible to precompute the visibility from all the viewpoints in space based on aspect graphs [GCS91]. In the worst case, for input models composed of $n$ polygons, this algorithm can decompose the space into $O(n^9)$ regions, making it impractical for large models. The utility of all these algorithms for complex models is currently unclear.

In the last few years a number of techniques have been proposed for efficiently computing conservative visibility. These can be classified into two categories, object-space and image-space algorithms. More details on this classification are presented in [SSS74]. In object space, [Cla76] proposed view-frustum culling of a hierarchy of bounding volumes. Garlick et al. [GBW90] proposed using octree-based spatial subdivision to render polygons contained in the viewing frustum.

Several recent algorithms structure the database into cells or regions, and use a combination of off-line and on-line algorithms for cell-to-cell visibility and the conservative computation of the potentially visible set (PVS) of polygons [ARB90, TS91, LG95]. Such approaches have been successfully used in architectural walkthrough systems, where the division of a building into discrete rooms lends itself to a natural division of the database into cells. It is not apparent that cell-based approaches can be generalized to an arbitrary model, which may come with no structure information. Decomposing an arbitrary polygonal model into appropriate cells is rather difficult. Other algorithms for densely-occluded but somewhat less-structured models have been proposed by Yagel and Ray [YR96]. They use regular spatial subdivision to partition the model into cells and describe a 2D implementation. Some algorithms are based on binary space partition trees [FKN80, Nay92].

The hierarchical Z-buffer algorithm operates in both object-space and image-space [GKM93]. It combines spatial and temporal coherence with hierarchical structures. The algorithm exploits coherence by performing visibility queries on the Z-buffer. Currently, most graphics systems do not support this capability in hardware, and simulating the hierarchical Z-buffer in software is relatively expensive.

The work most directly related to our approach is that of Coorg and Teller [CT96, CT97]. Given two convex objects (an occluder and an occludee), their early work required the construction and maintenance of a linearized portion of an aspect graph. They use this structure to track the viewpoint and determine whether one convex polytope occludes the other from a given viewpoint. This involves enumeration of all visual events and data structures for dynamic plane maintenance. In the worst case, the number of planes used to form a cell of the arrangement can be $O(m^2)$, where $m$ is the number of vertices of the convex polytopes, though dynamic and hierarchical data structures are used in [CT96] to speed up the computation of relevant planes. Each arrangement cell classifies all polytopes as completely, partially, or un-occluded. This approach is similar to earlier shadow computation algorithms which, given a light source and an occluder, decompose space into penumbra and umbra volumes. In their newer work they reduce the amount of coherence used and simplify the structure of the arrangement. This yields a considerable speedup, eliminating the overhead cost of maintaining complex data structures.

3 Algorithm Overview

Occlusion culling can be divided into two subproblems. First, for a given viewpoint we must select a small set of good occluders to use. Second, given good occluders, we must use them to cull away occluded portions of the model. These two problems are partially independent, and so we treat them separately in this paper.

In general, any primitive in the model to be rendered or any combination of such primitives can be used as an occluder. In this paper, we restrict occluders to be either convex objects or those which can be expressed as a union of two convex objects.

3.1 Data Structures

To perform occlusion culling, we require that the model be represented both in a spatial partition and in a spatial hierarchy. The spatial partitioning structure is constructed as part of preprocessing and used for occluder selection, while the spatial hierarchy is used for fast culling of occluded geometry. These may be separate data structures, or may be united into a single structure, based on the implementation.

The spatial partition is a discrete structure that spans space, mapping every point in the space of the model onto one of a finite number of partition regions. If the space of the model has volume $v$, and no region has a volume larger than $v_r$, we have at least $v / v_r$ regions in the partition.

The hierarchy is not a data structure concerned with the space of the model, but rather with the geometry of the model. Every piece of geometry is mapped to exactly one leaf volume of the hierarchy, and every volume of the hierarchy encloses all the geometry mapped to it or its children. Let the entire model consist of $n$ pieces of geometry, and let each leaf volume contain no more than $n_l$ geometry elements. Then, if we want a binary hierarchy (each non-leaf volume having two children), we need at least $2n / n_l - 1$ volumes in the hierarchy.

3.2 Occluder Selection

Finding good occluders in real time for an arbitrary model is a difficult task. Among all of those objects in the database which we might use, we must select the few which will occlude the most geometry from the current viewpoint with the least computational overhead. We make this task feasible by using a preprocess to discard all occluders which are not likely to be good when viewed from a given set of viewpoints. Then, at runtime, the results of this preprocess give us a list of occluders that are good over the local region of space, and we further reduce this to select those occluders that are best from the current viewpoint.
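The counting bounds of Section 3.1 and the two-stage selection of Section 3.2 can be illustrated with a small sketch. Everything in it is an assumption made for illustration rather than the implementation described in this paper: the partition is taken to be a uniform grid, occluders are taken to be spheres given as (center, radius) pairs, the ranking uses viewed solid angle only, and the names `UniformGridPartition`, `sphere_solid_angle`, and `select_occluders` are hypothetical. The bounds are rounded up to whole regions and volumes.

```python
import math


def min_partition_regions(v, v_r):
    """Lower bound v / v_r on the region count, rounded up to a whole region."""
    return math.ceil(v / v_r)


def min_hierarchy_volumes(n, n_l):
    """Lower bound 2 * (n / n_l) - 1 on the volumes of a binary hierarchy."""
    leaves = math.ceil(n / n_l)  # each leaf holds at most n_l elements
    return 2 * leaves - 1        # a full binary tree with k leaves has 2k - 1 nodes


class UniformGridPartition:
    """Hypothetical spatial partition: an axis-aligned grid of equal cells."""

    def __init__(self, lo, hi, cells_per_axis):
        self.lo, self.hi, self.cells = lo, hi, cells_per_axis
        # region index -> list of (center, radius) candidates from a preprocess
        self.candidates = {}

    def region_of(self, p):
        """Map a point to the index of the grid cell containing it (clamped)."""
        idx = []
        for a in range(3):
            t = (p[a] - self.lo[a]) / (self.hi[a] - self.lo[a])
            idx.append(min(self.cells - 1, max(0, int(t * self.cells))))
        return tuple(idx)


def sphere_solid_angle(viewpoint, center, radius):
    """Solid angle (steradians) subtended by a sphere seen from viewpoint."""
    d = math.dist(viewpoint, center)
    if d <= radius:
        # Assumption: a viewpoint inside or on the sphere sees it everywhere.
        return 4.0 * math.pi
    return 2.0 * math.pi * (1.0 - math.sqrt(1.0 - (radius / d) ** 2))


def select_occluders(grid, viewpoint, n_occ):
    """Look up the region's candidate list, then keep the n_occ best by solid angle."""
    pool = grid.candidates.get(grid.region_of(viewpoint), [])
    ranked = sorted(
        pool,
        key=lambda o: sphere_solid_angle(viewpoint, o[0], o[1]),
        reverse=True,
    )
    return ranked[:n_occ]
```

With a 10 x 10 x 10 model space split into 5 cells per axis, each region has volume 8, so the grid has exactly the 125 regions that `min_partition_regions(1000.0, 8.0)` gives as the minimum. A model of 500,000 polygons with at most 100 per leaf needs at least `min_hierarchy_volumes(500000, 100)`, that is 9,999, volumes.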

3.3 Visibility Culling

Once we have selected a few good occluders, they can be used by our culling algorithm. For each occluder we construct a shadow frustum: a frustum with its apex at the viewpoint, near plane determined by the occluder, and sides determined by the occluder's silhouette (as shown in Fig. 1). The space contained within this frustum is that which is not visible from the viewpoint due to the occluder. The pixels rendered for objects in this space would be discarded during depth comparison by the hardware z-buffer. Thus, model geometry which is completely contained within any of these shadow frusta (as is object A) need not be rendered.

[Figure: a viewpoint, an occluder, the occluder's shadow volume, and three bounding volumes labeled A, B, and C.]

Figure 1: Relationship of bounding volumes to frusta.

4 Occluder Selection

Finding good occluders from a given viewpoint in a general unstructured model is a hard problem. By definition, a good occluder should occlude a large fraction of the geometry in the view frustum. For many viewpoints in the scene, no such good occluders may exist. In the worst case, computation of good occluders may correspond to computing the visible surface from that viewpoint using hidden surface removal algorithms. Some theoretically efficient algorithms of $O(n \log n + q)$ complexity, where $q$ is the number of edges in the visibility map, have been proposed by [Mul89]. However, not much is known about their practical performance.

The other alternative is to preprocess the entire model to determine a set of useful occluders at every viewpoint. However, the computation of such global visibility information is more difficult than hidden surface removal from a particular viewpoint. The fastest known algorithm for computing the effects on global visibility due to a single polyhedron with $m$ vertices requires $O(m^6 \log m)$ time [GCS91].

Given the overall complexity of finding good occluders, we propose an approximation algorithm to find good occluders using a combination of online and offline techniques. Our algorithms work reasonably well in practice, but are not guaranteed to find good occluders all the time.

We pose occluder selection as an optimization problem. Our goal is to use as few occluders as possible to keep CPU overhead down, but to occlude as many polygons as possible to keep the graphics pipeline lightly loaded. We take advantage of the empirical observation that for the data sets we are interested in, a few occluders cause most of the occlusion from most viewpoints, and using other occluders contributes little additional benefit.¹

¹ We performed experiments to confirm this. A typical result is shown in Figure 4. When we graph the fraction of polygons culled as a function of the number of occluders used, the optimum number of occluders is near the knee of the graph; in this case, approximately 8.

We use the following guiding principles in defining the optimization function to select good occluders:

1. Solid Angle: The viewed solid angle of a convex object is easily computable and measures the fraction of the visual field that it occupies. If we assume that the geometry of the scene is uniformly distributed about the viewpoint in all directions, the viewed solid angle is also directly proportional to the amount of geometry occluded.

2. Depth Complexity: We want to use occluders which occlude the maximum amount of geometry. In addition to using solid angle at runtime, we estimate the actual value of any occluder in the preprocess by random sampling. The algorithm selects some random viewpoints in each partition region, constructs a shadow frustum from that viewpoint through each potentially good occluder, and determines the number of objects contained in the frustum. The average of several samples is a direct estimate of the value of the occluder.

3. Coherence: An occluder that does well or poorly at occluding for some frame will likely perform similarly for the next few frames. Similarly, objects that lie near a good occluder in screen space are themselves likely to be good occluders. The algorithm keeps track of the geometry culled by each occluder at each frame.

These criteria are integrated into our algorithm for both the online and offline processing of the geometry database. Rather than consider every polygon in the model as a potential occluder from every viewpoint, we use auxiliary data structures to quickly reduce the set of potential occluders to a manageable quantity. We can then further refine this set of potentially good occluders based on the exact viewing parameters of a particular frame.

4.1 Preprocess

The goal of the preprocess is to associate a set of potentially good occluders with every viewpoint in the model. We begin by constructing a spatial partition which divides the model into regions. Each region will store a list of potentially good occluders. At runtime, we determine which region contains the current viewpoint and use this region's associated list as a set of potentially good occluders. Thus, the region sizes control the granularity with which we sample the model's geometry for occlusion properties. Using small regions in our partition may shorten the list of potentially good occluders at each region, but this increases the number of regions in the data structure, thereby increasing its total memory usage.

Having constructed our spatial partition, we consider every potential occluder in the model.² For each potential occluder, we compute the set of viewpoints from which its viewed solid angle exceeds a threshold (a user-defined value), which we use as the cutoff for consideration as an occluder. Objects whose viewed solid angles are less than the threshold cover only a small amount of screen space and, under the assumption that geometry is evenly distributed, are unlikely to occlude much geometry. For a single-sided ellipse, this set of viewpoints forms an ellipsoid and for a sphere, a concentric sphere. More complex occluders will have significantly more complex sets, which we approximate with the above. Each region of the partition which intersects this set of viewpoints adds the potential occluder to its list of potentially good occluders.

² A separate preprocess may filter out small objects or simplify highly detailed objects to equivalently-occluding versions. Thus, the set of potential occluders need not be the same as the set of objects to be rendered.

This solid-angle approximation is the first criterion of the runtime analysis we detail later. It is an indirect measure of the effectiveness of the occluder. We also pre-sample the second criterion, the actual amount of geometry behind the occluder, during the preprocess and store it in the region with the reference to the potentially good occluder. To do this, we choose a small number of viewpoints in the region and calculate the shadow frustum

cast by the occluder from each viewpoint. If we determine what fraction of the model geometry is actually occluded from each of these viewpoints and average our results together over all the viewpoints sampled, we have a direct estimate of the occluder's efficacy. Our selection of sample viewpoints can be either random or guided by results on choosing optimal sampling patterns for antialiasing.

4.2 Runtime Computation

At every frame, we find the region of the spatial partition which contains the viewpoint. That region has an associated list of potentially good occluders obtained from the preprocess. The list is first narrowed by performing preliminary view-frustum culling to determine which potential occluders lie within the field of view. These potential occluders are then sorted based on our optimization function. The $n_{occ}$ occluders which produce the highest value of the function are used as that particular frame's occluders.

Throughout the process we exploit coherence. The algorithm exploits temporal coherence of occlusion. When the viewpoint only moves a little between frames, the value of a given occluder only changes a little. Using coherence, we improve our expected sorting performance to linear time. Similarly, due to spatial coherence and the fact that we use convex occluders, the algorithm easily tracks the silhouette of each occluder during the overlap tests.

5 Visibility Culling Using Occluders

In order to use occluders, we must have constructed a hierarchy of bounding volumes that contains the entire model. Having selected $n_{occ}$ good occluders, we now proceed to perform occlusion culling with them. The critical requirement is to do so efficiently.

We begin by constructing a shadow frustum for each occluder. Taking the viewpoint as the apex of a frustum, we define the near plane of the frustum as a plane passing through the farthest point of the occluder's silhouette and whose normal points in the direction from that point towards the viewpoint. Each side of the frustum is a plane containing the viewpoint and an edge (two adjacent vertices) of the object's silhouette. This frustum contains the space which the occluder occludes from the viewpoint. Any geometry completely contained in the frustum is occluded and need not be rendered.

Using these shadow frusta requires an inorder traversal of the hierarchy. In fact, we incorporate view-frustum culling, shadow-frustum culling, and rendering into a single traversal of the hierarchy. We begin by marking all shadow frusta as active. As we encounter each volume during our traversal of the hierarchy, we first test for interference with the view frustum. If the volume is outside the view frustum, we are done with that sub-tree. Otherwise, we test for interference with all of the active shadow frusta. If the volume is entirely within any active frustum, the algorithm stops the current branch of the traversal without rendering any of the geometry in the volume. If the volume does not intersect any active frustum, we stop the current branch of the traversal and render all the geometry contained in the volume. Only if the volume partially overlaps some of the active frusta need we continue the traversal with that volume's children.

5.1 Overlap Tests

The algorithm determines the portions of the model occluded by a shadow frustum using interference tests with the bounding volume hierarchy. As the algorithm traverses the hierarchy, the intersection test between each bounding volume and the shadow frustum becomes a time-critical operation which directly influences the overall performance of occlusion culling. As shown in Figure 1, we need to quickly decide if a bounding volume is: (A) completely inside the shadow frustum, (B) partially overlapping the shadow frustum, or (C) completely outside the shadow frustum.

Consider the case when the occluder is a convex polytope. The shadow frustum must therefore be a convex volume. A number of efficient algorithms have been proposed in the geometry and robotics literature for collision detection between convex polytopes. These are based on Minkowski sums [GJK88], closest features computation based on external Voronoi regions [LC91] and linear programming [Sei90]. All of them have been implemented and work reasonably well in practice. However, these algorithms are not directly applicable to our requirements for two reasons:

• Limitation: All these algorithms only check whether or not two objects are overlapping. They cannot differentiate between partial overlap of two objects and complete containment of one object inside another, which is required for our algorithm.

• Robustness: Although the basic idea in all these algorithms is simple, their practical implementations are prone to robustness problems due to floating point arithmetic and geometric degeneracies.

In the geometry literature, other linear time algorithms have been proposed to compute intersections between convex polytopes [Cha89]. Unlike collision detection algorithms, they can unambiguously distinguish the three distinct cases shown in Figure 1. However, not much is known about their performance on real-world models.

5.2 General Algorithm

We project the silhouette of the occluder and the occludee onto the image plane. Each projection is a convex polygon and we refer to them as the occluder polygon (A) and the occludee polygon (B). Based on this projection, we reduce the problem to a 2D overlap test between two convex polygons. To check whether B is totally or partially contained inside A, the algorithm initially checks whether they are overlapping or not. Our algorithm uses a modified Cyrus-Beck clipping algorithm [FDHF90], which quickly determines if two polygons intersect and robustly computes an edge and a common point (call it O) contained in the intersection of two polygons if they are overlapping. There are also other robust implementations available for computing the intersection of two planar polygons.

To check whether B is totally contained inside A or not, we need to check whether any of their edges intersect. Given O, we use a sweep-line approach to check whether or not B is totally contained inside A. Therefore, the number of edge pairs we need to test for overlap is linear in the number of edges of the two polygons. In terms of robustness, this algorithm only requires a robust edge-edge overlap test.

5.3 Specialized Overlap Tests

Although the hierarchy of volumes may consist of arbitrarily-shaped convex volumes, this is rarely the case. Our work is a companion algorithm to view frustum culling, which normally uses hierarchies based on octrees, k-d trees, sphere-trees, OBB-Trees, R-trees etc. In all of these each node of the hierarchy is a sphere or a rectangular box. The rectangular box may be axis-aligned (an axis-aligned bounding box, or AABB) or be arbitrarily oriented (OBB).

In this section, we present robust and specialized overlap tests between a shadow frustum and an AABB or an OBB. Their overall running time is linear in the number of faces of the shadow frustum. However, we have also optimized the constant terms and overall operation count. The performance of the occlusion culling algorithm is dominated by these overlap tests.

The naive approach to exact interference detection requires an enormous amount of computation in the worst case: 54 "inside-outside" half-plane tests and 108 edge-face intersection tests (see Table 1). In the best case, it trivially rejects a box in 8 half-plane tests (dot products). We would like to maintain an exact test while drastically reducing the worst case cost and improving the best case cost. Our approach combines improved box-plane overlap tests and a fast edge-box intersection test using a parallel slabs representation for the bounding volume [Gre94, KK86].

Our box-plane overlap test uses the box polygon's normal to quickly find the box vertices which are closest to and farthest

Interference Test               Dot Products   Scalar-Vector Mult/Divide   Vector Add/Sub   Cross Products
                                (worst/best)   (worst/best)                (worst/best)     (worst/best)
Naive with far plane            270 / 8        216 / 0                     1080 / 0         432 / 0
Naive without far plane         216 / 8        168 / 0                     600 / 0          336 / 0
Specialized with far plane      294 / 4        144 / 0                     144 / 0          0 / 0
Specialized without far plane   240 / 4        144 / 0                     132 / 0          0 / 0

Table 1: Worst and best case computational cost for the overlap tests

from the frustum's plane. Using these extremal points we need to check each plane against no more than two vertices, rather than eight. Oriented bounding boxes impose an overhead of only three additional dot products per plane tested at this stage.

We improve the worst case dramatically by speeding up expensive edge intersection tests. We can easily speed up the twelve tests between the frustum's edges and a box by representing our bounding volumes as three sets of slabs. This lets us avoid the edge-face test against each face individually by performing only a single edge-"solid" intersection test. Previously we required six expensive edge-face tests to test an edge against an entire box, but now we require only three edge-slab tests that completely eliminate the more expensive cross-product calculations.

Improving the twelve box-edge/frustum intersection tests proceeds similarly, using techniques from the ray-tracing literature. We represent the frustum as a set of planes forming a convex polyhedron and perform the edge test against the entire frustum at once. This also removes the expensive cross-product calculations entirely since we only have to perform six edge-plane intersection tests.

We further improve the performance by removing the far plane; this avoids having to perform the far-plane edge/box intersection tests and the view-frustum containment tests entirely (an infinite frustum can never be contained in the bounding box). The improvements resulting from the new overlap test are shown in Table 1. Far planes are useful for view frusta, but are undesirable for shadow frusta.

6 Implementation and Performance

6.1 Data Structures

We have implemented the algorithms described in this paper. They scale efficiently and robustly to large models. Our implementation has been interfaced within an unoptimized graphics viewer written using OpenGL. The viewer runs on top of UNC's view-frustum culling library and uses either exact or conservative versions of our specialized interference detection algorithm. We

other hierarchical spatial partition to contain the polygons of the model, thus using a single data structure to serve both purposes. However, in our experience that approach has two drawbacks:

1. Most spatial partition algorithms subdivide the polygons or other primitives, such that each polygon lies in exactly one region. Polygon count is typically increased by a factor of two by this type of subdivision. Any increase in polygon count can be expected to slow rendering, which is the process we are fundamentally trying to speed up.

2. The two data structures capture two different properties of the model which may not be spatially similar. The spatial partition should be dense, i.e. have the smallest volumes, in regions of the model where the best set of occluders changes frequently. The bounding volume hierarchy needs to capture the geometry of the model as closely as possible for best results in culling. Occlusion culling as a field would benefit from further investigation of the tradeoffs in the number of spatial data structures used by algorithms.

6.2 Comparison with Other Algorithms

Our specialized shadow frustum/bounding box interference test has been compared against efficient algorithms and implementations for collision detection between convex polytopes [CLMP95, GJK88]. Our overlap test is at least two times faster and more robust than efficient implementations of these algorithms. Notice that some of the earlier algorithms for collision detection between convex polytopes utilize temporal and spatial coherence. Since the shadow frusta may change between frames, our algorithm uses no coherence in performing the overlap tests.

6.3 Performance on a Large Model

To test our algorithm in environments of varying high depth complexity, we built a composite model of more than 500,000 polygons by fusing together a model of central London, a model of the Aztec ceremonial center at Tenochtitlan, and a partially-developed region of hills in England. See plates I - IV at the

assume that any large mo del viewer will use view-frustum culling;

end of the pap er. The input mo del came to us as a collection

results rep orted ignoring it are meaningless.

of p olygons with no adjacency or any other structure. We used

In our implementations,we also used the area-angle approxi-

uniform grids for o ccluder computation and a b ounding volume

mation presented by Co org and Teller[CT96]. This is

hierarchy based on axis-aligned b ounding b oxes for view frusta

~ ~

and o cclusion culling.

aN  V

On our SGI Onyx RealityEngine 2, along a 2748-frame path

~

2

jVj

through the central regions of this mo del wesaw up to 80 of the

mo del that passed view-frustum culling o ccluded with roughly

~

where a is the area of the p olygon in ob ject space, N the p oly-

40 of total p olygons o ccluded on average. Our measured aver-

~

age sp eedup in rendering was 55. The theoretical maximum is

gon's normal vector, and V the vector from the viewp oint to the

65, with the discrepancy due to the overhead of managing the

center of the p olygon. This gives a go o d approximation of the

o ccluders and their shadow frusta.

subtended solid angle of the p olygon.

Occlusion culling is fundamentally a viewp oint-dep endent op-

The results rep orted in this pap er were obtained using area-

timization; Figures 2 and 3 show the frame time and rendered

angle as a go o dness criterion; wehave seen b etter results using

p olygon count for each frame in our long path through the city

the full three-part go o dness criterion solid angle, prepro cess es-

mo del. On some frames o cclusion culling gives extremely go o d

timation, and coherence, but have not yet tested it on the large

results, while on others it do es not obtain signi cant culling, de-

mo del.

p ending on the depth complexity.

We use a uniform division of the b ounding b ox of the en-

Occlusion culling yields go o d results with surprisingly few o c-

tire database into rectangular voxels as our spatial partition for

cluders. We rapidly see decreasing returns b eyond ve or six

o ccluder selection, and a tree of axis-aligned or oriented b ound-

o ccluders active. Figure 4 shows the results of one exp eriment

ing b oxes as our hierarchy for view-frustum culling and o cclu-

sion culling. We could alternatively use an o ctree, BSP tree, or

Method                                 Polygons     Time per      Frames
                                       per Frame    Frame (ms)    per Second
View-Frustum Culling                   49196        276           3.6
View-Frustum plus Occlusion Culling    29749        178           5.6

Table 2: Comparison of average statistics over the 2748-frame path down streets of the 500,000-polygon city model. Our implementation improves average frame rate by 55%.
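As a consistency check on Table 2, the short sketch below recomputes the derived percentages from the tabulated averages. The variable and function names are illustrative only, and the per-frame times are assumed to be in milliseconds, since that is the unit that agrees with the reported frame rates.

```python
# Averages transcribed from Table 2 (names are illustrative, not from the paper).
vfc = {"polygons": 49196, "time_ms": 276, "fps": 3.6}  # view-frustum culling only
occ = {"polygons": 29749, "time_ms": 178, "fps": 5.6}  # plus occlusion culling


def percent_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return 100.0 * (after / before - 1.0)


# Measured gain in frame rate.
measured_speedup = percent_gain(vfc["fps"], occ["fps"])

# Gain that would result if frame time scaled exactly with polygon count,
# i.e. if occlusion culling itself were free.
polygon_bound = percent_gain(occ["polygons"], vfc["polygons"])

# Share of the polygons surviving view-frustum culling that were occluded.
fraction_culled = 100.0 * (1.0 - occ["polygons"] / vfc["polygons"])

print(f"measured frame-rate gain: {measured_speedup:.1f}%")
print(f"gain if time scaled with polygon count: {polygon_bound:.1f}%")
print(f"polygons culled after view-frustum culling: {fraction_culled:.1f}%")
```

The three values come out close to the 55%, 65%, and 40% figures quoted in Section 6.3, which supports reading the gap between the first two as culling overhead.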

[Plot: frame time in microseconds (0 to 500000) versus frame number (0 to 2500); series: View-Frustum Culling Only, Occlusion Culling]

Figure 2: Time per frame, in microseconds, for each frame in the 2748-frame path through the large city model. Some viewpoints are good for occlusion culling, and at many of these points our implementation is seen to yield speedup.
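The area-angle goodness criterion of Section 6.1 can be sketched in a few lines. This is a minimal illustration rather than the implementation used for the reported results: the function name `area_angle`, the tuple-based vectors, and the use of the absolute value of the dot product (so that the score does not depend on which way the normal is oriented) are all assumptions of this sketch.

```python
def area_angle(area, normal, viewpoint, center):
    """Area-angle score a * |N . V| / |V|^2 for one candidate occluder.

    area      -- polygon area a in object space
    normal    -- unit normal N of the polygon
    viewpoint -- eye position
    center    -- polygon center; V is the vector from viewpoint to center
    """
    v = [c - e for c, e in zip(center, viewpoint)]
    v_len_sq = sum(x * x for x in v)
    if v_len_sq == 0.0:
        raise ValueError("viewpoint coincides with the polygon center")
    n_dot_v = sum(n * x for n, x in zip(normal, v))
    # Assumption of this sketch: only the magnitude of N . V is used, so the
    # result is independent of the orientation chosen for the normal.
    return area * abs(n_dot_v) / v_len_sq


eye = (0.0, 0.0, 0.0)
scores = {
    "near, facing the eye": area_angle(4.0, (0.0, 0.0, 1.0), eye, (0.0, 0.0, -2.0)),
    "far, facing the eye": area_angle(4.0, (0.0, 0.0, 1.0), eye, (0.0, 0.0, -4.0)),
    "near, edge-on": area_angle(4.0, (1.0, 0.0, 0.0), eye, (0.0, 0.0, -2.0)),
}
# Candidates ordered best first.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Large, nearby polygons that face the viewer score highest, while distant or edge-on polygons score low, which is the behaviour wanted from an occluder ranking.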

with the city model, and we have seen similar results for other paths and other models.

6.4 Generality and Robustness

Generality and robustness have been main concerns in the design and implementation of our algorithm. The algorithm and implementation are applicable to all unstructured polygonal models. The algorithm requires no adjacency or connectivity information between the polygons. Only the silhouette tracking algorithm used for shadow frusta computation requires connectivity information. On unstructured models, we treat each convex polygon as a separate occluder.

Our specialized algorithms for overlap tests between the shadow frusta and bounding boxes are robust and not prone to degeneracies. They do not need to check for non-generic conditions such as parallel faces or edges. These are not special cases for the test and do not need to be handled separately. As a series of comparisons between linear combinations, the test is numerically stable. Our current implementation uses IEEE 64-bit floating point arithmetic instead of exact arithmetic for these overlap tests. Speed is an important concern in our system. We realize that floating point arithmetic can be a potential source of problems (though we have not experienced any so far), especially when one face of the box is just touching the shadow frusta. Many recent papers by Fortune, van Wijk, Clarkson, Boissonnat, et al. have demonstrated that clever use of floating point arithmetic can yield very effective and yet correct implementations of geometric primitives. Many libraries like LEDA [MN95] have implementations of such algorithms for line segment intersections. As a result, it should be possible to have fast, correct implementations of our specialized overlap test. For the general algorithm, we reduce the problem to 2D overlap tests between two convex polygons and use robust public domain implementations available in Graphics Gems.

7 Algorithm Analysis

The overall asymptotic running time of the algorithm is at most linear in $n$, where $n$ is the total number of polygons in the model. The worst-case running times for the various phases of the algorithm are:

Occluder Selection: We select the $n_{occ}$ active occluders by sorting a list of $n_{pot}$ potentially good occluders. In practice, $n_{occ}$ is typically a small constant between five and ten, and $n_{pot}$ is a small fraction of the input primitives. The total complexity of this phase is $O(n_{occ} \, n_{pot})$ or $O(n_{pot} \log n_{pot})$.

Shadow Frusta Computation: Constructing shadow frusta based on silhouette tracking for these occluders requires time linear in the number of edges per occluder, which is a small constant $k$, for a phase complexity of $O(k \, n_{occ})$.

Visibility Culling: We use the $n_{occ}$ shadow frusta in the same manner as view frusta; each view-frustum cull is worst-case linear in the size of the model. The complexity of this phase is $O(n_{occ} \, n)$.

Since $n_{occ}$ is a small constant, and $n_{pot} \ll n$, our algorithm is in the worst case linear: $O(n)$. The spatial hierarchy constructed for occluder use keeps the constant in this term very small. We also use spatial and temporal coherence in resorting the list of potentially good occluders by using insertion sort. Since the ordering of this list only changes slightly between frames, that phase has an expected time linear in the number of potentially good occluders $n_{pot}$.
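The coherence argument for occluder selection can be illustrated with a small sketch. Insertion sort does work proportional to the list length plus the number of out-of-order pairs, so a list that was sorted in the previous frame and has changed only slightly is re-sorted in roughly linear time. The function names and the dictionary of per-frame scores below are hypothetical, chosen only for illustration.

```python
def resort_by_goodness(occluders, goodness):
    """Re-sort `occluders` in place, best first, using insertion sort.

    Returns the number of element shifts, which equals the number of
    out-of-order pairs in the incoming list.
    """
    shifts = 0
    for i in range(1, len(occluders)):
        item = occluders[i]
        key = goodness(item)
        j = i - 1
        # Move lower-scoring entries one slot down to make room for `item`.
        while j >= 0 and goodness(occluders[j]) < key:
            occluders[j + 1] = occluders[j]
            j -= 1
            shifts += 1
        occluders[j + 1] = item
    return shifts


def select_active(occluders, goodness, n_occ):
    """Re-sort the candidate list and return the n_occ best occluders."""
    resort_by_goodness(occluders, goodness)
    return occluders[:n_occ]


# Candidate list as left sorted by the previous frame.
candidates = ["a", "b", "c", "d"]
# Hypothetical goodness scores for the current frame: only neighbours swap.
frame_scores = {"a": 9.0, "b": 9.5, "c": 5.0, "d": 5.5}

shifts = resort_by_goodness(candidates, frame_scores.get)
active = select_active(candidates, frame_scores.get, n_occ=2)
print(candidates, shifts, active)
```

In this example two adjacent pairs have swapped rank since the previous frame, so the re-sort performs two shifts in total, far fewer than a sort from scratch would need on a long list.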

[Plot: number of polygons rendered (0 to 100000) versus frame number (0 to 2500); series: View-Frustum Culling Only, Occlusion Culling]

Figure 3: Polygons rendered per frame along a 2748-frame path through the large city model. Some viewpoints are good for occlusion culling, and at many of these points our implementation is seen to reduce the number of polygons rendered, which translates into less bus bandwidth consumed.
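Testing an edge against an entire convex region at once, as described for the frustum represented as a set of planes, can be sketched with parametric clipping of the segment against each plane, a standard technique from ray tracing. This is an illustration under stated assumptions, not the authors' code: the plane convention (a point x is inside when n . x + d <= 0), the function names, and the example region `OPEN_PRISM` (a stand-in for a frustum with no far plane) are all hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def segment_hits_convex(p0, p1, planes):
    """True if segment p0 -> p1 meets the convex region bounded by `planes`.

    Each plane is (normal, offset); a point x is inside that plane's
    half-space when dot(normal, x) + offset <= 0.
    """
    t_enter, t_exit = 0.0, 1.0  # parameter interval still inside all planes
    for normal, offset in planes:
        f0 = dot(normal, p0) + offset
        f1 = dot(normal, p1) + offset
        if f0 > 0.0 and f1 > 0.0:
            return False  # whole segment is outside this plane
        if f0 > 0.0:
            # Segment enters the half-space; f0 - f1 > 0, so no zero division.
            t_enter = max(t_enter, f0 / (f0 - f1))
        elif f1 > 0.0:
            # Segment leaves the half-space; f0 - f1 < 0, so no zero division.
            t_exit = min(t_exit, f0 / (f0 - f1))
        if t_enter > t_exit:
            return False  # clipped interval is empty
    return True


# Semi-infinite square prism x >= 0, |y| <= 1, |z| <= 1: five planes, no far plane.
OPEN_PRISM = [
    ((-1.0, 0.0, 0.0), 0.0),
    ((0.0, 1.0, 0.0), -1.0),
    ((0.0, -1.0, 0.0), -1.0),
    ((0.0, 0.0, 1.0), -1.0),
    ((0.0, 0.0, -1.0), -1.0),
]

crossing = segment_hits_convex((5.0, -3.0, 0.0), (5.0, 3.0, 0.0), OPEN_PRISM)
behind = segment_hits_convex((-2.0, -3.0, 0.0), (-1.0, 3.0, 0.0), OPEN_PRISM)
past_corner = segment_hits_convex((-1.0, 0.5, 0.0), (0.5, 2.0, 0.0), OPEN_PRISM)
print(crossing, behind, past_corner)
```

A segment parallel to a plane has equal signed distances at both endpoints, so it falls into either the "both outside" branch or the "no clipping" case. The division is reached only when the signs differ, so parallel edges need no separate handling in this formulation.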

7.1 Main Features

Our main goals in developing these algorithms and systems have been robustness, applicability to large unstructured models, and scalability. None of the earlier conservative visibility algorithms has been applied to large and unstructured models composed of hundreds of thousands of polygons. As a result, it is somewhat difficult to make exact comparisons in terms of robustness and efficiency.

Some of the main features of our approach are:

• Robustness: Since we use specialized interference detection algorithms for determining occluder-occludee relationships, our algorithm is not prone to robustness problems and geometric degeneracies.

• Efficiency: The additional CPU overhead due to occlusion culling varies with the viewpoint and is a linear function of the number of occluders chosen at that particular frame. In most cases, we use five to ten occluders. We use efficient and specialized algorithms for overlap tests between a shadow frustum and a hierarchy of bounding volumes. As a result, our algorithm performs interactively on large models composed of hundreds of thousands of polygons.

• Generality: The algorithm is applicable to all polygonal and unstructured models. In many CAD applications, the input models come as "polygon soup," with no structure or adjacency information available for constructing winged-edge or similar data structures. The presence of structure increases the effectiveness of our approach, but the results reported herein were obtained on unstructured data sets.

• Ease of Implementation: The implementation requires only simple data structures plus a general view-frustum culling library. Since the occlusion culling algorithm sees the model as composed of simple bounding volumes like spheres, rectangular boxes, etc., we use simple and easily-specialized overlap tests.

7.2 Ongoing Work

Often geometry is not occluded by any single convex occluder, but is occluded by the combination of several occluders. This is made more difficult to take advantage of in an object-space solution by the fact that the union of several convex occluders is not necessarily convex. We have developed and are implementing extensions to our interference detection algorithms to combine occluders in object space.

In [ZMHH97], Zhang et al. have successfully used the graphics pipeline to merge these occluders in image space and form occlusion maps. Based on that, they have proposed a two-pass algorithm. In the first pass, the algorithm constructs an occlusion map hierarchy, and in the second pass this hierarchy is used to cull away portions of the model not visible from the current viewpoint. This algorithm is able to cull away a higher percentage of the model than our object-space approaches, but requires a high-performance graphics pipeline. On the other hand, the algorithm presented in this paper does not use the graphics pipeline for visibility culling and should thus be useful for low-end hardware or personal computers.

8 Acknowledgements

Some of the models used in this work were originally constructed by:

• Dr. Vasilis Bourdakos of the Centre for Advanced Studies in Architecture, Bath University.

• Bob Galbraith Computer Graphics, East Longmeadow, MA.

References

[ARB90] J. Airey, J. Rohlf, and F. Brooks. Towards image realism with interactive update rates in complex virtual building environments. In Symposium on Interactive 3D Graphics, pages 41-50, 1990.

[BCG+96] G. Barequet, B. Chazelle, L. Guibas, J. Mitchell, and A. Tal. Boxtree: A hierarchical representation of surfaces in 3d. In Proc. of Eurographics'96, 1996.

[BDEG94] M. Bern, D. Dobkin, D. Eppstein, and R. Grossman. Visibility with a moving point of view. Algorithmica, 11:360-78, 1994.

[Cha89] B. Chazelle. An optimal algorithm for intersecting three-dimensional convex polyhedra. Proc. 30th Annu. IEEE Sympos. Found. Comput. Sci., pages 586-591, 1989.

[Plot: percent occlusion as a function of number of occluders; vertical axis: percent occlusion (0 to 50); horizontal axis: average number of occluders used per frame (0 to 25)]

Figure 4: Average percentage of polygons within the view frustum culled by occlusion culling as a function of the average number of occluders active, along a 2748-frame path through the large city model. This supports the observation that for any given viewpoint a small number of occluders provide most of the occlusion.

[Cla76] J. H. Clark. Hierarchical geometric models for visible surface algorithms. Communications of the ACM, 19(10):547-554, 1976.

[Cla88] K. L. Clarkson. Applications of random sampling in computational geometry, II. In Proc. 4th Annu. ACM Sympos. Comput. Geom., pages 1-11, 1988.

[CLMP95] J. Cohen, M. Lin, D. Manocha, and M. Ponamgi. I-collide: An interactive and exact collision detection system for large-scale environments. In Proc. of ACM Interactive 3D Graphics Conference, pages 189-196, 1995.

[Cro77] F. C. Crow. Shadow algorithms for computer graphics. ACM Computer Graphics, 11(3):242-248, 1977.

[CT96] S. Coorg and S. Teller. Temporally coherent conservative visibility. In Proc. of 12th ACM Symposium on Computational Geometry, 1996.

[CT97] S. Coorg and S. Teller. Real-time occlusion culling for models with large occluders. In Proc. of ACM Symposium on Interactive 3D Graphics, 1997.

[Dor94] S. E. Dorward. A survey of object-space hidden surface removal. Internat. J. Comput. Geom. Appl., 4:325-362, 1994.

[FDHF90] J. Foley, A. Van Dam, J. Hughes, and S. Feiner. Computer Graphics: Principles and Practice. Addison Wesley, Reading, Mass., 1990.

[FKN80] H. Fuchs, Z. Kedem, and B. Naylor. On visible surface generation by a priori tree structures. Proc. of ACM Siggraph, 14(3):124-133, 1980.

[For96] S. Fortune. Robustness issues in geometric algorithms. In M. C. Lin and D. Manocha, editors, Applied Computational Geometry, pages 9-14. Springer-Verlag, 1996.

[GBW90] B. Garlick, D. Baum, and J. Winget. Interactive viewing of large geometric databases using multiprocessor graphics workstations. Siggraph'90 course notes: Parallel Algorithms and Architectures for 3D Image Generation, 1990.

[GCS91] Z. Gigus, J. Canny, and R. Seidel. Efficiently computing and representing aspect graphs of polyhedral objects. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(6):542-551, 1991.

[GJK88] E. G. Gilbert, D. W. Johnson, and S. S. Keerthi. A fast procedure for computing the distance between objects in three-dimensional space. IEEE J. Robotics and Automation, vol RA-4:193-203, 1988.

[GKM93] N. Greene, M. Kass, and G. Miller. Hierarchical z-buffer visibility. In Proc. of ACM Siggraph, pages 231-238, 1993.

[GLM96] S. Gottschalk, M. Lin, and D. Manocha. Obb-tree: A hierarchical structure for rapid interference detection. In Proc. of ACM Siggraph'96, pages 171-180, 1996.

[Gre94] N. Greene. Detecting intersection of a rectangular solid and a convex polyhedron. In Graphics Gems IV, pages 74-82. Academic Press, 1994.

[HHK89] C. Hoffmann, J. Hopcroft, and M. Karasick. Robust set operations on polyhedral solids. IEEE Computer Graphics and Applications, 9(6):50-59, 1989.

[HKM95] M. Held, J. T. Klosowski, and J. S. B. Mitchell. Evaluation of collision detection methods for virtual reality fly-throughs. In Canadian Conference on Computational Geometry, 1995.

[KK86] T. Kay and J. Kajiya. Ray tracing complex scenes. Computer Graphics, pages 269-278, 1986.

[LC91] M. C. Lin and John F. Canny. Efficient algorithms for incremental distance computation. In IEEE Conference on Robotics and Automation, pages 1008-1014, 1991.

[LG95] D. Luebke and C. Georges. Portals and mirrors: Simple, fast evaluation of potentially visible sets. In ACM Interactive 3D Graphics Conference, Monterey, CA, 1995.

[McK87] M. McKenna. Worst-case optimal hidden-surface removal. ACM Trans. Graph., 6:19-28, 1987.

[MN95] K. Mehlhorn and S. Näher. LEDA: a platform for combinatorial and geometric computing. Commun. ACM, 38:96-102, 1995.

[MS85] M. McKenna and R. Seidel. Finding the optimal shadows of a convex polytope. In Proc. 1st Annu. ACM Sympos. Comput. Geom., pages 24-28, 1985.

[Mul89] K. Mulmuley. An efficient algorithm for hidden surface removal. Computer Graphics, 23(3):379-388, 1989.

[Nay92] B. Naylor. Interactive solid geometry via partitioning trees. In Proc. of Graphics Interface, pages 11-18, 1992.

[Sei90] R. Seidel. Linear programming and convex hulls made easy. In Proc. 6th Ann. ACM Conf. on Computational Geometry, pages 211-215, Berkeley, California, 1990.

[SSS74] I. Sutherland, R. Sproull, and R. Schumaker. A characterization of ten hidden-surface algorithms. Computing Surveys, 6(1):1-55, 1974.

[TS91] S. Teller and C. H. Séquin. Visibility preprocessing for interactive walkthroughs. In Proc. of ACM Siggraph, pages 61-69, 1991.

[YR96] R. Yagel and W. Ray. Visibility computations for efficient walkthrough of complex environments. Presence, 5(1):1-16, 1996.

[ZMHH97] H. Zhang, D. Manocha, T. Hudson, and K. Hoff. Visibility culling using hierarchical occlusion maps. Technical Report TR97-004, Department of Computer Science, University of North Carolina, 1997. To appear in Proc. of ACM Siggraph'97.