Simple and Fast Nearest Neighbor Search
Birn et al.: Nearest Neighbors
Marcel Birn, Manuel Holtgrewe, Peter Sanders, Johannes Singler

2D Nearest Neighbor Search
Preprocess S = {(x1,y1), ..., (xn,yn)}.
Query: q = (x,y) → closest point in S (Euclidean distance).

Application Example
Geographic Information Systems, e.g., Route Planning.

A Greedy Algorithm
Preprocessing: Delaunay triangulation (not Yao graph etc.!).
Query: Start at any point in S. Follow any edge towards q. Iterate until no improvement is possible.
Still linear time.

Multiple Levels
Start with a sample of points from S.
Closest point in sample → starting point in next level.
How many levels? Complicated data structure?

n Levels?
Randomly number the points in S.
Function nlevelNN(q)
    u := s1
    for k := 1 to n do
        u := greedyNN(u, q, DT(s1..sk))
    return u
Sounds like quadratic space and at least linear time???

Full Delaunay Hierarchy (FDH)
$G := \left(1..n,\ \bigcup_{k=1}^{n} E_k\right)$, where $G_k = (1..k, E_k)$ := Delaunay triangulation of points 1..k.
G has a linear expected number of edges, e.g., [Guibas et al. 92].

FDH-Based n Level Search!
Procedure nn(q, {s1, ..., sn})
    u := s1
    repeat
        foreach neighbor v of u in G with v > u, in increasing order, do
            invariant: u is the nearest neighbor of q wrt 1..v − 1
            if ||q − s_v||_2 < ||q − s_u||_2 then
                u := v
                break for loop
    until no improvement found
    return u

Analysis
Theorem: Expected number of nodes visited ≤ ln n + 1.
Proof idea: s_i is visited ⇔ s_i is closest to q wrt 1..i. With random numbering, this event has probability 1/i. Summing over i gives the harmonic sum $\sum_{i=1}^{n} 1/i \le \ln n + 1$, which is the claim.

Bad Inputs
The number of inspected edges might be Ω(n).

Implementation
Store coordinates redundantly in the edge array.
Scan a logarithmic number of pieces.
Never worse than the greedy scanning algorithm.
Example (points on a line, y = 0), query q = (2.8, 0):
    x:  1  2  3  4  5  6  7  8
    #:  5  3  8  1  6  4  2  7
Edge array (for each node, its higher-numbered neighbors with their coordinates):
    1: 2(7,0) 3(2,0) 4(6,0) 6(5,0) 8(3,0)
    2: 4(6,0) 7(8,0)
    3: 5(1,0) 8(3,0)
    4: 6(5,0)

Other Approaches
Voronoi diagram + point location: optimal.
kd-tree: currently preferred in practice [CGAL, Mount et al.].
Delaunay hierarchy: relatively complicated implementation? [Devillers92]

Instances [Devillers02]
Point distributions: in square, on circle, on parabola, mixed (95% / 5%).

Experiments: Square
[Plot: time per query [µs] (log scale) vs. number of points (n), from 128 to 2^23. Series: CGAL Delaunay hierarchy; FDH, exact; ANN kd-tree; CGAL kd-tree; FDH array, inexact.]

Experiments: Circle (parabola similar)
[Plot: time per query [µs] (log scale) vs. number of points (n), from 128 to 2^23. Series: ANN kd-tree; CGAL Delaunay hierarchy; CGAL kd-tree; FDH, exact; FDH, inexact.]

Experiments: Mixed
[Plot: time per query [µs] (log scale) vs. number of points (n), from 128 to 2^23. Series: ANN kd-tree; CGAL Delaunay hierarchy; CGAL kd-tree; FDH, exact; FDH, inexact.]

More Related Work
Several n-level geometric data structures on more complex objects (triangles, ...) and for point location: [BoissonatTeillaud86], [Guibas et al. 92], ...
Contraction Hierarchies: route planning with n levels [Geisberger et al. 2008].
n-level Graph Partitioning [with Vitaly Osipov].

Conclusions
Full Delaunay Hierarchies are (mostly) fast, simple, and elegant.

Open Problems
Comparison with worst-case efficient algorithms.
Make worst-case efficient, e.g., replace high-degree nodes? Move high-degree nodes in the hierarchy?
Higher dimensions???
More n-level algorithms?
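
The FDH search and the eight-point example from the Implementation slide can be sketched in a few lines of Python. This is only an illustrative sketch under a strong simplifying assumption: all points lie on a line, so the "Delaunay triangulation" of any prefix is just the path connecting consecutive points, and no geometry library is needed. The function names build_fdh_1d and fdh_nn are hypothetical and do not come from the slides; a real 2D version would need an actual Delaunay triangulation for each prefix.

```python
from bisect import bisect_left

def build_fdh_1d(xs):
    """Build upward FDH adjacency lists for points on a line.

    xs[i] is the coordinate of point i+1 (points are numbered in insertion
    order). ASSUMPTION: 1D input with distinct coordinates, so the Delaunay
    graph of points 1..k is the path through them in sorted order.
    Returns {u: [neighbors v > u, in increasing order]}.
    """
    up = {u: [] for u in range(1, len(xs) + 1)}
    placed = []  # (coordinate, label) pairs of points inserted so far, sorted
    for k, x in enumerate(xs, start=1):
        pos = bisect_left(placed, (x, k))
        # Point k becomes adjacent to its left and right neighbors among 1..k.
        for j in (pos - 1, pos):
            if 0 <= j < len(placed):
                up[placed[j][1]].append(k)
        placed.insert(pos, (x, k))
    return up

def fdh_nn(q, xs, up):
    """Search as in procedure nn: scan higher-numbered neighbors in order,
    jump to the first one that is closer to q, stop when none improves."""
    u = 1
    visited = [u]
    improved = True
    while improved:
        improved = False
        for v in up[u]:
            if abs(q - xs[v - 1]) < abs(q - xs[u - 1]):
                u = v
                visited.append(u)
                improved = True
                break
    return u, visited

# The example from the Implementation slide: point number -> x coordinate.
xs = [4, 7, 2, 6, 1, 5, 8, 3]
up = build_fdh_1d(xs)
nearest, visited = fdh_nn(2.8, xs, up)
print(up[1], nearest, visited)
```

For the query q = 2.8 the search starts at point 1 (x = 4), moves to point 3 (x = 2), then to point 8 (x = 3), and stops there because point 8 has no higher-numbered neighbors. The adjacency lists produced for nodes 1 to 4 match the edge array shown on the slide.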