Acceleration techniques

Acceleration

Digggyital Image Synthesis Yung-Yu Chuang

with slides by Mario Costa Sousa, Gordon Stoll and Pat Hanrahan

Bounding volume hierarchy Bounding volume hierarchy

1) Find bounding box of objects Bounding volume hierarchy Bounding volume hierarchy

1) Find bounding box of objects 1) Find bounding box of objects 2) SliSplit objects into two groups 2) SliSplit objects into two groups 3) Recurse

Bounding volume hierarchy Bounding volume hierarchy

1) Find bounding box of objects 1) Find bounding box of objects 2) SliSplit objects into two groups 2) SliSplit objects into two groups 3) Recurse 3) Recurse Bounding volume hierarchy Where to split?

1) Find bounding box of objects • At midpoint 2) SliSplit objects into two groups • Sort, and put half of the objects on each side 3) Recurse • Use modeling hierarchy

BVH traversal BVH traversal • If hit parent, then check all children • Don't return intersection immediately because the other subvolumes may have a closer intersection Bounding volume hierarchy Bounding volume hierarchy

Space subdivision approaches Space subdivision approaches

Unifrom grid (2D) KD BSP tree Octree (3D) Uniform grid Uniform grid

Preprocess scene 1. Find bounding box

Uniform grid Uniform grid

Preprocess scene Preprocess scene 1. Find bounding box 1. Find bounding box 2. Determine grid resolution 2. Determine grid resolution 3. Place object in cell if its bounding box overlaps the cell Uniform grid Uniform grid traversal

Preprocess scene Preprocess scene 1. Find bounding box Traverse grid 2. Determine grid resolution 3D line = 3D-DDA 3. Place object in cell if its (Digital Differential bounding box overlaps the cell Analyzer) 4. Check that object overlaps cell (expensive!) y  y y  mx  b m  2 1 x2  x1

xi1  xi 1

yi1  mxi1  b naive

yi1  yi  m DDA

octree Octree K-d tree

A

A

Leaf nodes correspond to unique regions in space K-d tree K-d tree

A A

B

B B

A A

Leaf nodes correspond to unique regions in space Leaf nodes correspond to unique regions in space

K-d tree K-d tree

C A C A

B B

C B B

A A K-d tree K-d tree

C A C A

D B D B

C C B B D A A

K-d tree K-d tree traversal

A A

D B D B

B C C B C C

D D

A A

Leaf nodes correspond to unique regions in space Leaf nodes correspond to unique regions in space BSP tree BSP tree

6 6 5 5 1

9 9 inside outside ones ones 7 7 10 8 10 8 1 1 11 11 2 2

4 4

3 3

BSP tree BSP tree

6 6 5 1 5 1

2 5 5 9 9b 9 3 6 4 7 7 6 8 7 10 10 11b 8 8 8 9a 7 9b 1 9 1 9a 11b 11 10 11 10 11a 2 11 2 11a

4 4

3 3 BSP tree BSP tree traversal

6 5 6 5 1 1

9b 9b 9 2 5 9 2 5 7 7 10 11b 8 8 10 11b11b 8 9a 3 6 9a9a 3 6 8 1 1 11 9b 11 9b 11a 4 7 11a 4 7 2 2

4 9a 11b 4 9a 11b poitint 3 3 10 10

11a 11a

BSP tree traversal BSP tree traversal

6 5 6 5 1 1

9b 9b 9 2 5 9 2 5 7 11b 7 10 11b11b 10 11b 8 8 3 8 9a9a 3 6 8 9a9a 6 1 1 11 9b 11 9b 11a 4 7 11a 4 7 2 2

4 9a 11b 4 9a 11b poitint point 3 3 10 10

11a 11a Classes Hierarchy

• Primitive (in core/primitive.*) Primitive – Geome tiPiititricPrimitive – InstancePrimitive – AtAggregate • Three types of accelerators are provided (in Geometric Transformed Aggregate accelerators/*.cpp) PiPrimiti ve PiPrimiti ve – GridAccel T – BVHAccel – KdTreeAccel Material Shape Support both Store intersectable Instancing and primitives, call animation Refine If necessary

Primitive Interface class Primitive : public ReferenceCounted { BBox WorldBound(); geometry bool CanIntersect(); const int primitiveId; bool Intersect(const Ray &r, // update maxt static int nextprimitiveId; Intersection *in); } bool IntersectP(const Ray &r); void Refine(vector> Refine(vector> &refined); class TransformedPrimitive: public Primitive void FullyRefine(vector> &refined); { AreaLight *GetAreaLight(); material … BSDF *GetBSDF((yg,const DifferentialGeometry &dg, Re ference< Pr im itive> itinstance; Transform &WorldToObject); } BSSRDF *GetBSSRDF(DifferentialGeometry &dg, Transform &WorldToObject); Intersection GeometricPrimitive struct Intersection { • represents a single shape • hldholds a ref erence to a Shape and its Materilial, DifferentialGeometry dg; and a pointer to an AreaLight const Primitive *primitive; Reference shape; Transform WorldToObject, ObjectToWorld; Reference material; // BRDF int shapeId, primitiveId; AreaLight *areaLight; // emittance float rayEpsilon; • Most operations are forwarded to shape }; adaptiv ely estimated primitive stores the actual intersecting primitive, hence Primitive->GetAreaLight and GetBSDF can only be called for GeometricPrimitive

GeometricPrimitive Object instancing bool Intersect(Ray &r,Intersection *isect) { float thit, rayEpsilon; if (!shape->Intersect(r, &thit, &rayypEpsilon , &isect->dg)) return false; isect->primitive = this; isect->WorldToObject = *shape->WorldToObject; isect->ObjectToWorld = *shape->ObjectToWorld; isect->shapeId = shape->shapeId; isect->primitiveId = primitiveId; isect->rayEpsilon = rayEpsilon; r.maxt = thit; 61 uniqqpue plants , 4000 individual plants , 19.5M trian gles return true; With instancing, store only 1.1M triangles, 11GB->600MB } TransformedPrimitive TransformedPrimitive

bool Intersect(Ray &r, Intersection *isect){ Re ference pri m itive; Transform w2p; AnimatedTransform WorldToPrimitive; WorldToPrimitive.Interpolate(r.time,&w2p); for instancing and animation Ray ray = w2p(r); if (!primitive->Intersect(ray, isect)) Ray ray = WorldToPrimitive(r); return false; if (!instance->Intersect(ray, isect)) r.maxt = ray.maxt; return false; itisect->iitiId>primitiveId = pr iitiIdimitiveId; r.maxt = ray.maxt; if (!w2p.IsIdentity()) { // Compute world-to-object transformation for instance isect->WorldToObject = isect->WorldToObject isect->WorldToObject=isect->WorldToObject*w2p; *WorldToInstance; isect->ObjectToWorld=Inverse( isect->WorldToObject);

TransformedPrimitive Aggregates // Transform instance's differential geometry to world space • Acceleration is a heart component of a ray Transform PrimitiveToWorld = Inverse((p);w2p); tracer because ray/scene intersection accounts isect->dg.p = PrimitiveToWorld(isect->dg.p); for the majority of execution time isect->dg.nn = Normalize( PrimitiveToWorld(isect- >dg.nn)); • GlGoal: reduce the num ber of ray/iiti/primitive isect->dg.dpdu=PrimitiveToWorld(isect->dg.dpdu); intersections by quick simultaneous rejection of isect->dg.dpdv=PrimitiveToWorld(isect->dg.dpdv); groups of pr im itives and the fact tha t near by isect->dg.dndu=PrimitiveToWorld(isect->dg.dndu); intersections are likely to be found first isect->dg.dndv=PrimitiveToWorld(isect->dg.dndv); • Two main approaches: spatial subdivision, } object subdivision return true; • No clear winner } Ray-Box intersections Ray-Box intersections • Almost all acclerators require it 1 • QikQuick rejecti on, use enter and exit point to traverse the hierarchy Dx

• AABB is the intersection of three slabs t1 x1  Ox t1  Dx x1  Ox

x=x0 x=x1

Ray-Box intersections Grid accelerator bool BBox::IntersectP(const Ray &ray, • Uniform grid float *hitt0, float *hitt1) { float t0 = ray.mint, t1 = ray.maxt; for (int i = 0; i < 3; ++i) { float invRayDir = 1.f / ray.d[i]; float tNear = (pMin[i] - ray.o[i]) * invRayDir; float tFar = (pMax [i ] -rayy[]).o[i]) * invRa yDir ;

if (tNear > tFar) swap(tNear, tFar); t0 = tNear > t0 ? tNear : t0; segment iiintersection t1 = tFar < t1 ? tFar : t1; if (t0 > t1) return false; intersection is empty } if (hitt0) *hitt0 = t0; if (hitt1) *hitt1 = t1; return true; } Teapot in a stadium problem GridAccel

• Not adaptive to distribution of primitives. Class GridAccel:public Aggregate { • Have to determine the number of . (problem with too many or too few) u_int nMailboxes; MailboxPrim *mailboxes; vector> primitives; int nVoxels[3]; BBox bounds; VtVector Width WidthIWidth, InvWidth; **voxels; MAMemoryArena voxe lArena; RWMutex *rwMutex; }

mailbox GridAccel struct MailboxPrim { GridAccel(vector > &p, Reference primitive; bool forRefined, bool refineImmediately) Int lastMailboxId; : gridForRefined(forRefined) { } // Initialize with primitives for grid if (refineImmediately) for (int i = 0; i < p. size(); ++i) p[i]->FullyRefine(primitives); else primitives = p; for (iti(int i = 0 ; i < pri iitimitives.s ize (); ++ i) bounds = Union(bounds, primitives[i]->WorldBound()); Determine number of voxels Calculate voxel size and allocate voxels • Too many voxels → slow traverse, large for (int axis=0; axis<3; ++axis) { memory consumption (bad cache performance) nVoxels[axis]=Round2Int(delta[axis]*voxelsPerUnitDist); nVoxels[axis]=Clamp(nVoxels[axis], 1, 64); • Too few voxels → too many primitives in a } voxel • Let the axis with the largest extent have 3 3 N for (int axis=0; axis<3; ++axis) { width[axis]=delta[axis]/nVoxels[axis]; partitions (N:number of primitives) invWidth[axis]= (width[axis]==0.f)?0.f:1.f/width[axis]; Vector delta = bounds.pMax - bounds.pMin; } int maxAxis=bounds.MaximumExtent(); float invMaxWidth=1.f/delta[maxAxis]; int nv = nVoxels[0] * nVoxels[1] * nVoxels[2]; float cubeRoot=3.f*powf(float(prims.size()),1.f/3.f); voxels=AllocAligned(nv); float voxelsPerUnitDist=cubeRoot * invMaxWidth; memset(voxels, 0, nv * sizeof(Voxel *));

Conversion between voxel and position Add primitives into voxels int posToVoxel(const Point &P, int axis) { for (u_int i=0; i return Clamp(v, 0, NVoxels[axis]-1); } } float voxelToPos(int p, int axis) const { return bounds. pMin[axis]+p* Width[axis]; } Point voxelToPos(int x, int y, int z) const { return bounds. pMin+ Vector(x*Width[0], y*Width[1], z*Width[2]); } inline int offset(int x, int y, int z) { return z*NVoxels[0]*NVoxels[1] + y*NVoxels[0] + x; }

BBox pb = prims[i]->WorldBound(); for (int z = vmin[2]; z <= vmax[2]; ++z) for (int y = vmin[1]; y <= vmax[1]; ++y) int vmin[3] , vmax[3]; for (int x = vmin[0]; x <= vmax[0]; ++x) { for (int axis = 0; axis < 3; ++axis) { int o = offset(x, y, z); if (!voxels[o]) { vmin[axis] = posToVoxel(pb .pMin , axis); voxels[o] = voxelArena.Alloc(); vmax[axis] = posToVoxel(pb.pMax, axis); *voxels[o] = Voxel(primitives[i]); } } else { // Add primitive to already-allocated voxel voxels[o]->AddPrimitive(primitives[i]); } }

Voxel structure GridAccel traversal struct Voxel { bool GridAccel::Intersect( Ray &ray, Intersection *isect) { vector> primitives; bool allCanIntersect; } Voxel(Reference op) { } allCanIntersect = false; prim itives.push_b h back( op ); } void AddP ri miti ve(R ef erence

p ) { primitives.push_back(p); } Check against overall bound up 3D DDA (Digital Differential Analyzer) float rayT; • Similar to Bresenhm’s line drawing algorithm if (bounds. Inside(ray(ray. mint))) rayT = ray.mint; else if (!bounds. IntersectP(ray, &rayT)) return false; Point gridIntersect = ray(rayT);

Set up 3D DDA (Digital Differential Analyzer) Set up 3D DDA

for (int axis=0; axis<3; ++axis) { blue values changgges along the traversal Pos[axis]=posToVoxel(gridIntersect, axis); if (ray.d[axis]>=0) { Out NextCrossingT[1] NextCrossingT[axis] = rayT+ voxel (voxelToPos(Pos[axis]+1, axis)-gridIntersect[axis]) index /ray.d[axis];

DeltaT[axis] = width[axis] / ray.d[axis]; Step[axis] = 1; 1 DeltaT[0] Out[axis] = nVoxels[axis];

Step[0]=1 } else { Dx rayT ... Step[axis] = -1; Pos NextCrossingT[0] Out[axis] = -1; } DeltaT: the distance change when voxel changes 1 in that direction } width[0] Walk through grid Advance to next voxel for (;;) { int bits=((NextCrossingT[0]Intersect(ray,isect,rayId); int stepAxis=cmpToAxis[bits];

if (ray.maxt < NextCrossingT[stepAxis]) break; } return hitSome thing; Pos[stepAxis]+= Step[stepAxis];

if (Pos[stepAxis] == Out[stepAxis]) break; Do not return; cut tmax instead Return when entering a voxel NextCrossingT[stepAxis] += DeltaT[stepAxis]; that is beyond the closest found intersection.

conditions Bounding volume hierarchies • Object subdivision. Each primitive appears in the hierarchy exactly once. Additionally, the xy 1 BVH provides much faster intersection. 010 - 011z>x≥y1 • BVH v.s. Kd-tree: Kd-tree could be slightly 1 0 0 y>x≥z 2 faster for intersection, but takes much longer to build. In addition, BVH is generally more 101 - numerically robust and less prone to subtle 1 1 0 y≥z>x 0 round-off bugs. 1 1 1 z>y>x 0 • accelerators/bvh.* BVHAccel BVHAccel construction class BVHAccel : public Aggregate { BVHAccel::BVHAccel(vector > &p, uint32_t mp, const string &sm) uint32_t maxPrimsInNode; { enum SplitMethod { SPLIT_MIDDLE, SPLIT_EQUAL_COUNTS, maxPrimsInNode = min(255u, mp); SPLIT_ SAH }; for (uint3(uint322ti=0;iFullyRefine(primitives); vector > primitives; if (sm=="sah") splitMethod =SPLIT_SAH; LinearBVHNode *nodes; else if (sm=="middle") splitMethod =SPLIT_MIDDLE; } else if (sm=="equal") splitMethod=SPLIT_EQUAL_COUNTS; else { Warning("BVH split method \"%s\" unknown. Using \"sah\".", sm.c_str()); splitMethod = SPLIT_SAH; }

BVHAccel construction Initialize buildData array

vector buildData; buildData.reserve(primitives.size ()); for (int i = 0; i < primitives.size(); ++i) BBox bbox = primitives[i]->WorldBound(); } It is possible to construct a pointer-less BVH tree buildData.push_back( directly, but it is less straightforward. BVHPrimitiveInfo(i, bbox)); } struct BVHPrimitiveInfo { BVHPrimitiveInfo() { } BVHPrimitiveInfo(int pn, const BBox &b) : primitiveNumber(pn), bounds(b) { centroid = .5f * b.pMin + .5f * b.pMax; } int primitiveNumber; Point centroid; BBox bounds; }; Recursively build BVH tree BVHBuildNode

MemoryArena buildArena; struct BVHBuildNode { uint32_; t totalNodes = 0; void InitLeaf((,,){int first, int n, BBox &b) { vector > orderedPrims; firstPrimOffset = first; orderedPrims.reserve(primitives.size()); nPrimitives = n; bounds = b; } BVHBuildNode *root = recursiveBuild(buildArena, void InitInterior(int axis, BVHBuildNode *c0, buildData, 0, primitives. size(), &totalNodes, BVHBuildNode *c1) { orderedPrims); children[0] = c0; children[1] = c1; [start end) bounds = Union(c0->bounds, c1->bounds); primitives.swap(orderedPrims); splitAxis = axis; nPrimitives = 0; } The leaf contains primitives from BVHAccel::primitives[firstPrimOffset] BBox bounds; to [firstPrimOffset+nPrimitives-1] BVHBuildNode *children[2]; int splitAxis , firstPrimOffset, nPrimitives; };

recursiveBuild Choose axis • Given n primitives, there are in general 2n-2 BBox cBounds; possible ways to partition them into two non- for (int i = start; i < end; ++i) empty groups. In practice, one considers cBounds=Union(cBounds, buildData[i].centroid); int dim = centroidBounds.MaximumExtent(); partitions along a coordinate axis, resulting in If cBounds has zreo volume, create a leaf 6n candidate partitions.

1. Choose axis 2. Choose split 3. Interior(dim, recursiildiveBuild(id)(.., start, mid, ..), recursiveBuild(.., mid, end, ..) ) Choose split (split_middle) Choose split (split_equal_count) float pmid = .5f * (centroidBounds.pMin[dim] + mid = (start + end) / 2; centroidBounds.pMax[dim]); std::nthstd::nth_element(&buildData[start], element(&buildData[start], BVHPrimitiveInfo *midPtr = std::partition( &buildData[mid], &buildData[end-1]+1, &buildData[start], &buildData[end-1]+1, Compp());arePoints(dim)); CompareToMid(dim, pmid)); mid = midPtr - &buildData[0]; It orders the array so that the middle pointer has median, Return true if the given primitive’s bound’s the first half is smaller and the second half is larger in O(n). centroid is below the given midpoint

Choose split Choose split (split_SAH)

Do not split N tisect (i) i1 split

N A N B c(A, B)  ttrav  pA tisect (ai )  pB tisect (bi ) both heuristics work well both are sub-optimal i1 i1

s sA B B p  pB  A s sC C A C better solution Choose split (split_SAH) Choose split (split_SAH)

• If there are no more than 4 primitives, use equal size const int nBuckets = 12; heuristics instead. struct BucketInfo { • Instead of testing 2n candidates, the extend is divided int count; BBox bounds; into a small number (()12) of buckets of e qual extent. }; Only buck boundaries are considered. BucketInfo buckets[nBuckets]; for (int i=start; i

Choose split (split_SAH) Choose split (split_SAH) float cost[nBuckets-1]; float minCost = cost[0]; uint32_t minCostSplit = 0; for (int i = 0; i < nBuckets-1; ++i) { for ((;int i = 1; i < nBuckets-1;;){ ++i) { BBox b0, b1; if (cost[i] < minCost) { int count0 = 0,,; count1 = 0; minCost = cost[i]; minCostSplit = i; for (int j = 0; j <= i; ++j) { } b0 = Union(b0, buckets[j].bounds); } count0 += buckets[j].count; } if (nPrimitives > maxPrimsInNode || for (int j = i+1; j < nBuckets; ++j) { minCost < nPrimitives)){ { b1 = Union(b1, buckets[j].bounds); BVHPrimitiveInfo *pmid = count1 += buckets[j].count; } std::partition(&buildData[start],&buildData[end- cost[i] = .125f + (count0*b0.SurfaceArea() + 1]+1, Compare To Buc ke t(m in Cos tSp lit, n Buc ke ts, dim, count1*b1.SurfaceArea())/bbox.SurfaceArea(); centroidBounds)); mid = pmid - &buildData[0]; } Traverse cost : Intersection cost = 1 : 8 } else Compact BVH BVHAccel traversal • The last step is to convert the BVH tree into a bool BVHAccel::Intersect(const Ray &ray, Intersection *isect) const { compact representation which improves cache, if (!nodes) return false; memory and thus overall performance. blhiflbool hit = false; Point origin = ray(ray.mint); Vector invDir(1.f / ray.d.x, 1.f / ray.d.y, 1.f / ray.d.z); uint32_t dirIsNeg[3]={ invDir.x < 0, invDir.y < 0, invDir.z < 0 };

uin t32_ t no de Num = 0; offset into the nodes array to be visited uint32_t todo[64]; nodes to be visited; acts like a stack uint32_ t todoOffset = 0; next free element in the stack

BVHAccel traversal BVHAccel traversal while (true) { else { interior node const LinearBVHNode *node = &nodes[[];nodeNum]; if ((g[dirIsNeg[node->axis]) { if (::IntersectP(node->bounds,ray,invDir,dirIsNeg)){ todo[todoOffset++] = nodeNum + 1; if (node->nPrimitives > 0) { leaf node nodeNum = node->secondChildOffset; // Intersect ray with primitives in leaf BVH node } for (uint32_t i = 0; i < node->nPrimitives; ++i){ else { if (primitives[node->primitivesOffset+i] todo[todoOffset++] = node->secondChildOffset; ->Intersect(ray, isect)) nodeNum = nodeNum + 1; hit = true; } } } if (todoOffset == 0) break; } nodeNum = todo[--todoOffset]; else { Do not hit the bounding box; retrieve the next one if any } if (todoOffset == 0) break; nodeNum = todo[--todoOffset]; } KD-Tree accelerator Spatial hierarchies • Non-uniform space subdivision (for example, kd-tree and octree) is better than uniform grid A if the scene is irregularly distributed.

A

Letters correspond to planes (A) Point Location by recursive search

Spatial hierarchies Spatial hierarchies

A A

B D B

B B C C

D

A A

Letters correspond to planes (A, B) Letters correspond to planes (A, B, C, D) Point Location by recursive search Point Location by recursive search Variations “Hack” kd-tree building • Split axis – RdRound-robin; largest ex ten t • Split location – Middle of extent; median of geometry (balanced tree) • Termination – Target # of primitives, limited tree depth • All of these techniques stink.

kd-tree octree bsp-tree

Building good kd-trees Splitting with cost in mind • What split do we really want? – Clever Idea: the one tha t ma kes ray trac ing c – Write down an expression of cost and minimize it – Gree dy cost optim iza tion • What is the cost of tracing a ray through a cell?

Cost(cell) = C_trav + Prob(hit L) * Cost(L) + Prob(hit R) * Cost(R) Split in the middle Split at the median To get through this part of empty space, you need to test all triangles on the right.

• Makes the L & R probabilities equal • Makes the L & R costs equal • Pays no a tten tion to the L & R cos ts • Pays no a tten tion to the L & R pro ba bilities

Cost-optimized split Building good kd-trees Since Cost(R) is much higher, make it as small as possible • Need the probabilities – Turns out to be propor tiona l to sur face area • Need the child cell costs – Simple triangle count works great (very rough approx.) –Empty cell “boost ” Cost(cell) = C_trav + Prob(hit L) * Cost(L) + Prob(hit R) * Cost(R) = C_trav + SA(L) * TriCount(L) + SA(R) * TriCount(R)

C_trav is the ra tio o f the cos t to traverse to the cos t to • Automatically and rapidly isolates complexity intersect • PdProduces large c hun ks of emp ty space C_trav = 1:80 in pbrt (found by experiments) Surface area heuristic Termination criteria • When should we stop splitting? – BdBad: dep th liitlimit, num ber o f titriang les – Good: when split does not help any more. •Thhldfhreshold of cost improvement 2n splits; must coincides – Stretch over multiple levels with object – For example, if cost does not go down after three boundary. Why? splits in a row, terminate a b • Threshold of cell size – Absolute probability SA(node)/SA(scene) small Sa Sb pa  p  S b S

Basic building algorithm Ray traversal algorithm 1. Pick an axis, or optimize across all three • Recursive inorder traversal 2. Build a set of can didate sp lit locat ions (cost

extrema must be at bbox vertices) tmax t * 3. Sort or bin the triangles 4. Sweeppy to incrementally track L/R counts , cost t * 5. Output position of minimum cost split t Running time: T (N)  N log N  2T (N / 2) min t * T (N)  N log2 N tt * tttmin* max tt*  min • Characteristics of highly optimized tree max Intersect(L,tmin,tmax) Intersect(L,tmin,t*) Intersect(R,tmin,tmax) – very deep, very small leaves, big empty cells Intersect((,R,t* ,tmax ) a video for kdtree Tree representation Tree representation

8-byte (reduced from 16-byte, 20% gain) float is irrelevant in pbrt2 interior 18 23 struct KdAccelNode { SE M ... union { float split; // Interior n flags u_int onePrimitive; // Leaf leaf 2 uu_int int *primitives; // Leaf n }; Flag: 0,1,2 (interior x, y, z) 3 (leaf) union { u_int flags; // Both u_int nPrims; // Leaf u_int aboveChild; // Interior }; }

KdTreeAccel construction Interior node • Recursive top-down algorithm • Choose split axis position •max dhdepth = 8 1.3l(log(N) – MdMedpoi itnt – Medium cut If (nPrims <= maxPrims || depth==0) { – Area hhitieuristic • Create leaf if no good splits were found } • Classify primitives with respect to split Choose split axis position Choose split axis position

N Start from the axis with maximum extent, sort cost of no split: ti (k) k 1 N B N A all edge events and process them in order cost of split: tt  PB ti (bk )  PA ti (ak ) k 1 k 1 assumptions: A 1. ti is the same for all primitives C 2. ti : tt = 80 : 1 (determined by experiments, B main factor for the pp)erformance) cost of no split: ti N

cost of split: tt  ti (1 be )( pB N B  pA N A )

a0 b0 a1 b1 c0 c1 sB A B C p(B | A)  sA

Choose split axis position KdTreeAccel traversal

If there is no split along this axis, try other axes. When all fail, create a leaf. KdTreeAccel traversal KdTreeAccel traversal

tmax

ToDo stack ToDo stack tplane tmax

tmin tmin far near far near

KdTreeAccel traversal KdTreeAccel traversal

far

tmax

tplane tmax ToDo stack ToDo stack tmin tmin

near KdTreeAccel traversal KdTreeAccel traversal bool KdTreeAccel::Intersect (const Ray &ray, Intersection *isect) {

tmax if (!bounds.IntersectP (y,,))(ray, &tmin, &tmax)) tmin return false; ToDo stack KdAccelNode *node=&nodes[0]; while (node!=NULL) { if (ray.maxtIsLeaf()) else } ToDo stack } (max depth)

Leaf node Interior node 1. Check whether ray intersects primitive(s) 1. Determine near and far (by testing which side inside the node; update ray’ s maxt O is) 2. Grab next node from ToDo queue below above

node+1 &(nodes[node->aboveChild]) 2. Determine whether we can skip a node t t max tplane tplane min tmin tmax

near far near far

tplane Acceleration techniques Best efficiency scheme

References

• J. Goldsmith and J. Salmon, Automatic Creation of Object Hierarchies for Ray Tracing, IEEE CG&A, 1987. •Brian Smits, Efficiency Issues for Ray Tracing, Journal of Grapp,hics Tools, 1998.

• K. Klimaszewski and T. Sederberg, Faster Ray Tracing Using Adaptive Grids, IEEE CG&A Jan/Feb 1999.

• Whang et. al., Octree-R: An Adaptive Octree for efficient ray tracing, IEEE TVCG 1(4), 1995. • A. Glassner, Space subdivision for fast ray tracing. IEEE CG&A, 4(10), 1984