15-853: Algorithms in the Real World - Spatial Decompositions


15-853: Algorithms in the Real World
Nearest Neighbors and Spatial Decompositions

– Introduction
– Quad/Oct/k-d/BSP trees
– Nearest neighbor search
– Metric spaces
– Ball/cover trees
– Callahan-Kosaraju
– Bounding volume hierarchies

Spatial Decompositions

Goal: partition a low- or medium-"dimensional" space into a hierarchy so that various operations can be performed efficiently.

Examples:
• Quad/oct trees
• k-d trees
• Binary space partitioning (BSP)
• Bounding volume hierarchies (BVH)
• Cover trees
• Ball trees
• Well-separated pair decompositions
• R-trees

Applications

Simulation
• N-body simulation in astronomy, molecular dynamics, and solving PDEs
• Collision detection

Databases
• Range searching
• Spatial indexing and joins

Robotics
• Path finding

Graphics
• Ray tracing
• Radiosity
• Occlusion culling

Geometry
• Range searching
• All nearest neighbors

Machine learning and statistics
• Clustering and nearest neighbors
• Kernel density estimation
• Classifiers

Trees in Euclidean Space: Quad/Oct Trees

Quad (2-d) and oct (3-d) trees:
– Find an axis-aligned bounding box for all the points
– Recursively cut it into 2^d equal parts

If the points are "nice" the tree will be O(log n) deep and will take O(n log n) time to build. In the worst case it can be O(n) deep. With care in how the tree is built, construction can still run in O(n log n) time for constant dimension.

Trees in Euclidean Space: k-d Trees

Similar to quad/oct trees, but cut only one dimension at a time:
– Typically cut at the median along the selected dimension
– Typically pick the longest dimension of the bounding box
– Could cut the same dimension multiple times in a row

If cut along medians, the tree is no more than O(log n) deep and takes O(n log n) time to build (although this requires a linear-time median, or presorting the data along all dimensions). But the partitions themselves are much more irregular.

Trees in Euclidean Space: BSP

Binary space partitioning (BSP):
– Cuts are not axis aligned
– Typically pick a cut based on a feature, e.g. a line segment in the input

Tree depth and runtime depend on how the cuts are selected.

Nearest Neighbors on a Decomposition Tree

Searching in a k-d tree (P(N) denotes the parent of node N):

  NearestNeighbor(p, T) =
    N = the leaf that p belongs to
    p' = nearest neighbor of p in N (if any, otherwise empty)
    while (N is not the root)
      for each child C of P(N) except N
        p' = Nearest(p, p', C)
      N = P(N)
    return p'

  Nearest(p, p', T) =
    if anything in T can be closer to p than p'
      if T is a leaf, return the nearest point in the leaf, or p'
      else for each child C of T
        p' = Nearest(p, p', C)
    return p'
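As a concrete illustration, here is a minimal Python sketch of this branch-and-bound search on a k-d tree built by median splits. It is an assumed implementation written for these notes (the names `KdNode`, `build`, and `nearest` are illustrative), not code from the slides.

```python
import math

class KdNode:
    def __init__(self, point=None, dim=None, split=None, left=None, right=None):
        self.point = point    # set only at leaves
        self.dim = dim        # splitting dimension (internal nodes)
        self.split = split    # splitting value (the median)
        self.left, self.right = left, right

def build(points, depth=0):
    """Cut at the median, cycling through the dimensions."""
    if len(points) == 1:
        return KdNode(point=points[0])
    d = depth % len(points[0])
    points = sorted(points, key=lambda p: p[d])
    mid = len(points) // 2
    return KdNode(dim=d, split=points[mid][d],
                  left=build(points[:mid], depth + 1),
                  right=build(points[mid:], depth + 1))

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(node, q, best=None):
    """Visit the near side first; prune the far side unless it can be closer."""
    if node.point is not None:                                   # leaf
        if best is None or dist(q, node.point) < dist(q, best):
            return node.point
        return best
    if q[node.dim] <= node.split:
        near, far = node.left, node.right
    else:
        near, far = node.right, node.left
    best = nearest(near, q, best)
    # the "can anything in T be closer to p than p'?" test from the pseudocode
    if best is None or abs(q[node.dim] - node.split) < dist(q, best):
        best = nearest(far, q, best)
    return best

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
print(nearest(build(points), (6, 2)))   # -> (7, 2)
```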
What about Other Metric Spaces?

Consider a set of points S in some metric space (X, d), where d satisfies:
1. d(x, y) ≥ 0 (non-negativity)
2. d(x, y) = 0 iff x = y (identity)
3. d(x, y) = d(y, x) (symmetry)
4. d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality)

Some metric spaces:
• Euclidean metric
• Manhattan distance (l1)
• Edit distance
• Shortest paths in a graph

Ball Trees

Divide a metric space into a hierarchy of balls. The union of the balls of the children of a node covers all the elements of the node.

(Figures: a ball-tree at levels 1 through 5; slides from Alex Gray.)

We need to decide how to find balls that cover the children. There can be a lot of waste due to the overlap of balls. A point can be copied to all the children it belongs to, or put in just one; which is better depends on the application. Ball trees can be used for our nearest neighbor search. It is hard to say anything about the costs of ball trees in general for arbitrary metric spaces.

Measures of Dimensionality

The ball of radius r around a point p, over a point set S taken from a metric space X:

  B_S(p, r) = {q ∈ S : d(p, q) ≤ r}

A point set has a (t, c)-expansion if for all p in X and r > 0,

  |B_S(p, r)| ≥ t implies |B_S(p, 2r)| ≤ c |B_S(p, r)|.

c is called the expansion constant; t is typically O(log |S|). If S is uniform in Euclidean space then c is proportional to 2^d, suggesting that dim(S) = log c. This is referred to as the KR (Karger-Ruhl) dimension.

The doubling constant for a metric space X is the minimum value c such that every ball in X can be covered by c balls in X of half the radius. It is more general than the KR dimension, but it is harder to prove bounds based on it.

Cover Trees

The following slides are from a presentation by Victoria Choi. Cover trees work for arbitrary metrics, but the bounds depend on the expansion or doubling constants.

Cover Tree Data Structure

A cover tree T on a dataset S is a leveled tree where each level is indexed by an integer scale i, which decreases as the tree is descended. C_i denotes the set of nodes at level i, and d(p, q) denotes the distance between points p and q. A valid tree satisfies the following properties:
– Nesting: C_i ⊂ C_{i-1}
– Covering: for every node p ∈ C_{i-1} there exists a node q ∈ C_i satisfying d(p, q) ≤ 2^i, and exactly one such q is the parent of p
– Separation: for all nodes p, q ∈ C_i, d(p, q) > 2^i

The nesting property (C_i ⊂ C_{i-1}) means:
– Each node in set C_i has a self-child
– All nodes in set C_i are also nodes in the sets C_j with j < i
– Set C_{-∞} contains all the nodes in dataset S

Tree Construction

Single node insertion (recursive; d(p, Q) denotes the minimum distance from p to any point in Q):

  Insert(point p, cover set Q_i, level i) =
    Q = {Children(q) : q ∈ Q_i}
    if d(p, Q) > 2^i then return "no parent found"
    else
      Q_{i-1} = {q ∈ Q : d(p, q) ≤ 2^i}
      if Insert(p, Q_{i-1}, i-1) = "no parent found" and d(p, Q_i) ≤ 2^i
        pick q ∈ Q_i satisfying d(p, q) ≤ 2^i
        insert p into Children(q)
        return "parent found"
      else return "no parent found"

A batch insertion algorithm is also available.

Searching

Iterative method to find the nearest neighbor of a query point p:

  Q_∞ = C_∞
  for i from ∞ down to -∞
    Q = {Children(q) : q ∈ Q_i}                      (children of the current cover set)
    Q_{i-1} = {q ∈ Q : d(p, q) ≤ d(p, Q) + 2^i}      (form the next cover set)
  return argmin_{q ∈ Q_{-∞}} d(p, q)
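To make the descent concrete, here is a minimal Python sketch of the iterative search over an explicit cover tree, assumed for these notes rather than taken from the slides (the node encoding, the `children` map, and the hand-built example tree are all illustrative).

```python
def cover_tree_nn(p, root, children, d, min_level):
    """Descend one level at a time, keeping exactly those nodes whose
    descendants could still be closer to p than the current candidates."""
    Q = [root]                       # nodes are (point, level) pairs
    level = root[1]
    while level > min_level:
        # children of the current cover set (missing entry = self-child only)
        kids = [c for q in Q for c in children.get(q, [(q[0], level - 1)])]
        d_min = min(d(p, q[0]) for q in kids)
        # keep q with d(p, q) <= d(p, Q) + 2^i; anything farther away cannot
        # have a descendant that beats the best candidate found so far
        Q = [q for q in kids if d(p, q[0]) <= d_min + 2 ** level]
        level -= 1
    return min(Q, key=lambda q: d(p, q[0]))[0]

# Hand-built explicit cover tree on the 1-d points {0, 1, 5}
# (root 0 at level 3; the point 1 first appears at level -1):
children = {
    (0, 3): [(0, 2), (5, 2)],
    (0, 2): [(0, 1)], (5, 2): [(5, 1)],
    (0, 1): [(0, 0)], (5, 1): [(5, 0)],
    (0, 0): [(0, -1), (1, -1)], (5, 0): [(5, -1)],
}
print(cover_tree_nn(1.3, (0, 3), children, lambda a, b: abs(a - b), -1))  # -> 1
```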
Why can you always find the nearest neighbour? When searching at level i, the bound for the nodes included in the next cover set Q_{i-1} is set to d(p, Q) + 2^i, where d(p, Q) is the minimum distance from p to the nodes in Q. Q therefore always centers around the query point and contains at least one of its nearest neighbours. Why? All the descendants of a node q ∈ C_{i-1} are at most 2^i away from q (since 2^{i-1} + 2^{i-2} + 2^{i-3} + … < 2^i). So by setting the bound to d(p, Q) + 2^i, we include in Q_{i-1} every node with a descendant that might do better than the current candidates, and eliminate everything else.

Implicit vs. Explicit

The theory is based on an implicit implementation, but the tree is built with a condensed explicit implementation to preserve the O(n) space bound.

Bounds on Cover Trees

For an expansion constant c:
– The number of children of any node is bounded by c^4 (width bound)
– The maximum depth of any point in the explicit tree is O(c^2 log n) (depth bound)
– The runtime of a search is O(c^12 log n)
– The runtime of an insertion or deletion is O(c^6 log n)

Callahan-Kosaraju

Well-separated pair decompositions:
– A decomposition of points in d-dimensional space

Applications:
– N-body codes (calculating the interaction forces among n bodies)
– k-nearest neighbors

Runs in O(n log n) time. Similar to k-d trees, but with better theoretical properties.

Tree Decompositions

A spatial decomposition of the points.

(Figure: a tree decomposition of the point set {a, b, c, d, e, f, g} into {a, b, c, d} and {e, f, g}, then {a, b, c}, {b, c}, {e, f}, and so on down to single points, shown next to the layout of the points in the plane.)

A "Realization"

A single path between any two leaves, consisting of tree edges up, an interaction edge across, and tree edges down.

(Figure: the tree decomposition with interaction edges added. Quiz: one edge is missing.)

A "Well-Separated Realization"

A realization such that the endpoints of each interaction edge are "well separated."

For both n-body and nearest neighbors we only need to look at the interaction edges.

Overall Approach

– Build the tree decomposition: O(n log n) time
– Build the well-separated realization: O(n) time
– Depth of the tree is O(n) in the worst case, but not in practice
– Goal: show that the number of interaction edges is O(n)

Callahan-Kosaraju Outline

– Some definitions
– Building the tree
– Generating the well-separated realization
– Bounding the size of the realization
– Using it for nearest neighbors

Some Definitions

Bounding rectangle R(P): the smallest rectangle that contains a set of points P. l_max denotes the maximum side length of the rectangle.

Well separated: let r be the smallest radius such that a ball of that radius can contain either rectangle, let d be the distance between the two balls, and let s be the separation constant. The two rectangles are well separated if d > s · r.

More Definitions

Interaction product: A ⊗ B = {{p, p'} : p ∈ A, p' ∈ B, p ≠ p'}

A realization of A ⊗ B is a set {{A_1, B_1}, {A_2, B_2}, …, {A_k, B_k}} such that …

A well-separated realization is a realization such that R(A_i) and R(B_i) are well separated.

A well-separated pair decomposition = a tree decomposition of P …
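As a small illustration of the well-separated test, here is a Python sketch assumed for these notes (the helper names and the circumscribed-ball construction are illustrative): each bounding rectangle is enclosed in the smallest ball of common radius r, and the pair is well separated when the gap d between the two balls exceeds s · r.

```python
import math

def bounding_rectangle(P):
    """Axis-aligned bounding rectangle R(P), as (lower corner, upper corner)."""
    dims = range(len(P[0]))
    return (tuple(min(p[i] for p in P) for i in dims),
            tuple(max(p[i] for p in P) for i in dims))

def well_separated(A, B, s):
    (alo, ahi), (blo, bhi) = bounding_rectangle(A), bounding_rectangle(B)
    # r: smallest radius such that a ball of radius r contains either rectangle
    r = max(math.dist(alo, ahi), math.dist(blo, bhi)) / 2
    ca = tuple((l + h) / 2 for l, h in zip(alo, ahi))   # centers of the balls
    cb = tuple((l + h) / 2 for l, h in zip(blo, bhi))
    d = math.dist(ca, cb) - 2 * r                       # gap between the balls
    return d > s * r

A = [(0.0, 0.0), (1.0, 1.0)]
B = [(10.0, 0.0), (11.0, 1.0)]
print(well_separated(A, B, 2.0))   # -> True
```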