15-853: Algorithms in the Real World - Spatial Decompositions


15-853: Algorithms in the Real World
Nearest Neighbors and Spatial Decompositions

– Introduction
– Quad/Oct/k-d/BSP trees
– Nearest neighbor search
– Metric spaces
– Ball/cover trees
– Callahan-Kosaraju
– Bounding volume hierarchies

Spatial Decompositions

Goal: partition a low- or medium-"dimensional" space into a hierarchy so that various operations can be performed efficiently.

Examples:
• Quad/oct trees
• k-d trees
• Binary space partitioning (BSP)
• Bounding volume hierarchies (BVH)
• Cover trees
• Ball trees
• Well-separated pair decompositions
• R-trees

Applications

Simulation
• N-body simulation in astronomy, molecular dynamics, and solving PDEs
• Collision detection

Databases
• Range searching
• Spatial indexing and joins

Robotics
• Path finding

Graphics
• Ray tracing
• Radiosity
• Occlusion culling

Geometry
• Range searching
• All nearest neighbors

Machine learning and statistics
• Clustering and nearest neighbors
• Kernel density estimation
• Classifiers

Trees in Euclidean Space: Quad/Oct Trees

Quad (2-d) and oct (3-d) trees:
– Find an axis-aligned bounding box for all the points
– Recursively cut it into 2^d equal parts

If the points are "nice" the tree will be O(log n) deep and will take O(n log n) time to build. In the worst case it can be O(n) deep. With care in how the tree is built, construction can still run in O(n log n) time for constant dimension.

Trees in Euclidean Space: k-d Trees

Similar to quad/oct trees, but cut only one dimension at a time:
– Typically cut at the median along the selected dimension
– Typically pick the longest dimension of the bounding box
– Could cut the same dimension multiple times in a row

If cut along medians, the tree is no more than O(log n) deep and takes O(n log n) time to build (although this requires a linear-time median, or presorting the data along all dimensions). But the partitions themselves are much more irregular.

Trees in Euclidean Space: BSP

Binary space partitioning (BSP):
– Cuts are not axis aligned
– Typically pick a cut based on a feature, e.g. a line segment in the input

Tree depth and runtime depend on how the cuts are selected.

Nearest Neighbors on a Decomposition Tree

Searching in a k-d tree (P(N) denotes the parent of node N):

  NearestNeighbor(p, T) =
    N = the leaf that p belongs to
    p' = nearest neighbor of p in N (if any, otherwise empty)
    while (N is not the root)
      for each child C of P(N) except N
        p' = Nearest(p, p', C)
      N = P(N)
    return p'

  Nearest(p, p', T) =
    if anything in T can be closer to p than p'
      if T is a leaf, return the nearest point in the leaf, or p'
      else for each child C of T
        p' = Nearest(p, p', C)
    return p'
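As a concrete illustration, here is a minimal Python sketch of this branch-and-bound search on a k-d tree built by median splits. It is an assumed implementation written for these notes (the names `KdNode`, `build`, and `nearest` are illustrative), not code from the slides.

```python
import math

class KdNode:
    def __init__(self, point=None, dim=None, split=None, left=None, right=None):
        self.point = point    # set only at leaves
        self.dim = dim        # splitting dimension (internal nodes)
        self.split = split    # splitting value (the median)
        self.left, self.right = left, right

def build(points, depth=0):
    """Cut at the median, cycling through the dimensions."""
    if len(points) == 1:
        return KdNode(point=points[0])
    d = depth % len(points[0])
    points = sorted(points, key=lambda p: p[d])
    mid = len(points) // 2
    return KdNode(dim=d, split=points[mid][d],
                  left=build(points[:mid], depth + 1),
                  right=build(points[mid:], depth + 1))

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(node, q, best=None):
    """Visit the near side first; prune the far side unless it can be closer."""
    if node.point is not None:                                   # leaf
        if best is None or dist(q, node.point) < dist(q, best):
            return node.point
        return best
    if q[node.dim] <= node.split:
        near, far = node.left, node.right
    else:
        near, far = node.right, node.left
    best = nearest(near, q, best)
    # the "can anything in T be closer to p than p'?" test from the pseudocode
    if best is None or abs(q[node.dim] - node.split) < dist(q, best):
        best = nearest(far, q, best)
    return best

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
print(nearest(build(points), (6, 2)))   # -> (7, 2)
```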
What about Other Metric Spaces?

Consider a set of points S in some metric space (X, d), where d satisfies:
1. d(x, y) ≥ 0 (non-negativity)
2. d(x, y) = 0 iff x = y (identity)
3. d(x, y) = d(y, x) (symmetry)
4. d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality)

Some metric spaces:
• Euclidean metric
• Manhattan distance (l1)
• Edit distance
• Shortest paths in a graph

Ball Trees

Divide a metric space into a hierarchy of balls. The union of the balls of the children of a node covers all the elements of the node.

(Figures: a ball-tree at levels 1 through 5; slides from Alex Gray.)

We need to decide how to find balls that cover the children. There can be a lot of waste due to the overlap of balls. A point can be copied to all the children it belongs to, or put in just one; which is better depends on the application. Ball trees can be used for our nearest neighbor search. It is hard to say anything about the costs of ball trees in general for arbitrary metric spaces.

Measures of Dimensionality

The ball of radius r around a point p, over a point set S taken from a metric space X:

  B_S(p, r) = {q ∈ S : d(p, q) ≤ r}

A point set has a (t, c)-expansion if for all p in X and r > 0,

  |B_S(p, r)| ≥ t implies |B_S(p, 2r)| ≤ c |B_S(p, r)|.

c is called the expansion constant; t is typically O(log |S|). If S is uniform in Euclidean space then c is proportional to 2^d, suggesting that dim(S) = log c. This is referred to as the KR (Karger-Ruhl) dimension.

The doubling constant for a metric space X is the minimum value c such that every ball in X can be covered by c balls in X of half the radius. It is more general than the KR dimension, but it is harder to prove bounds based on it.

Cover Trees

The following slides are from a presentation by Victoria Choi. Cover trees work for arbitrary metrics, but the bounds depend on the expansion or doubling constants.

Cover Tree Data Structure

A cover tree T on a dataset S is a leveled tree where each level is indexed by an integer scale i, which decreases as the tree is descended. C_i denotes the set of nodes at level i, and d(p, q) denotes the distance between points p and q. A valid tree satisfies the following properties:
– Nesting: C_i ⊂ C_{i-1}
– Covering: for every node p ∈ C_{i-1} there exists a node q ∈ C_i satisfying d(p, q) ≤ 2^i, and exactly one such q is the parent of p
– Separation: for all nodes p, q ∈ C_i, d(p, q) > 2^i

The nesting property (C_i ⊂ C_{i-1}) means:
– Each node in set C_i has a self-child
– All nodes in set C_i are also nodes in the sets C_j with j < i
– Set C_{-∞} contains all the nodes in dataset S

Tree Construction

Single node insertion (recursive; d(p, Q) denotes the minimum distance from p to any point in Q):

  Insert(point p, cover set Q_i, level i) =
    Q = {Children(q) : q ∈ Q_i}
    if d(p, Q) > 2^i then return "no parent found"
    else
      Q_{i-1} = {q ∈ Q : d(p, q) ≤ 2^i}
      if Insert(p, Q_{i-1}, i-1) = "no parent found" and d(p, Q_i) ≤ 2^i
        pick q ∈ Q_i satisfying d(p, q) ≤ 2^i
        insert p into Children(q)
        return "parent found"
      else return "no parent found"

A batch insertion algorithm is also available.

Searching

Iterative method to find the nearest neighbor of a query point p:

  Q_∞ = C_∞
  for i from ∞ down to -∞
    Q = {Children(q) : q ∈ Q_i}                      (children of the current cover set)
    Q_{i-1} = {q ∈ Q : d(p, q) ≤ d(p, Q) + 2^i}      (form the next cover set)
  return argmin_{q ∈ Q_{-∞}} d(p, q)
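To make the descent concrete, here is a minimal Python sketch of the iterative search over an explicit cover tree, assumed for these notes rather than taken from the slides (the node encoding, the `children` map, and the hand-built example tree are all illustrative).

```python
def cover_tree_nn(p, root, children, d, min_level):
    """Descend one level at a time, keeping exactly those nodes whose
    descendants could still be closer to p than the current candidates."""
    Q = [root]                       # nodes are (point, level) pairs
    level = root[1]
    while level > min_level:
        # children of the current cover set (missing entry = self-child only)
        kids = [c for q in Q for c in children.get(q, [(q[0], level - 1)])]
        d_min = min(d(p, q[0]) for q in kids)
        # keep q with d(p, q) <= d(p, Q) + 2^i; anything farther away cannot
        # have a descendant that beats the best candidate found so far
        Q = [q for q in kids if d(p, q[0]) <= d_min + 2 ** level]
        level -= 1
    return min(Q, key=lambda q: d(p, q[0]))[0]

# Hand-built explicit cover tree on the 1-d points {0, 1, 5}
# (root 0 at level 3; the point 1 first appears at level -1):
children = {
    (0, 3): [(0, 2), (5, 2)],
    (0, 2): [(0, 1)], (5, 2): [(5, 1)],
    (0, 1): [(0, 0)], (5, 1): [(5, 0)],
    (0, 0): [(0, -1), (1, -1)], (5, 0): [(5, -1)],
}
print(cover_tree_nn(1.3, (0, 3), children, lambda a, b: abs(a - b), -1))  # -> 1
```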
Why can you always find the nearest neighbour? When searching at level i, the bound for the nodes included in the next cover set Q_{i-1} is set to d(p, Q) + 2^i, where d(p, Q) is the minimum distance from p to the nodes in Q. Q therefore always centers around the query point and contains at least one of its nearest neighbours. Why? All the descendants of a node q ∈ C_{i-1} are at most 2^i away from q (since 2^{i-1} + 2^{i-2} + 2^{i-3} + … < 2^i). So by setting the bound to d(p, Q) + 2^i, we include in Q_{i-1} every node with a descendant that might do better than the current candidates, and eliminate everything else.

Implicit vs. Explicit

The theory is based on an implicit implementation, but the tree is built with a condensed explicit implementation to preserve the O(n) space bound.

Bounds on Cover Trees

For an expansion constant c:
– The number of children of any node is bounded by c^4 (width bound)
– The maximum depth of any point in the explicit tree is O(c^2 log n) (depth bound)
– The runtime of a search is O(c^12 log n)
– The runtime of an insertion or deletion is O(c^6 log n)

Callahan-Kosaraju

Well-separated pair decompositions:
– A decomposition of points in d-dimensional space

Applications:
– N-body codes (calculating the interaction forces among n bodies)
– k-nearest neighbors

Runs in O(n log n) time. Similar to k-d trees, but with better theoretical properties.

Tree Decompositions

A spatial decomposition of the points.

(Figure: a tree decomposition of the point set {a, b, c, d, e, f, g} into {a, b, c, d} and {e, f, g}, then {a, b, c}, {b, c}, {e, f}, and so on down to single points, shown next to the layout of the points in the plane.)

A "Realization"

A single path between any two leaves, consisting of tree edges up, an interaction edge across, and tree edges down.

(Figure: the tree decomposition with interaction edges added. Quiz: one edge is missing.)

A "Well-Separated Realization"

A realization such that the endpoints of each interaction edge are "well separated."

For both n-body and nearest neighbors we only need to look at the interaction edges.

Overall Approach

– Build the tree decomposition: O(n log n) time
– Build the well-separated realization: O(n) time
– Depth of the tree is O(n) in the worst case, but not in practice
– Goal: show that the number of interaction edges is O(n)

Callahan-Kosaraju Outline

– Some definitions
– Building the tree
– Generating the well-separated realization
– Bounding the size of the realization
– Using it for nearest neighbors

Some Definitions

Bounding rectangle R(P): the smallest rectangle that contains a set of points P. l_max denotes the maximum side length of the rectangle.

Well separated: let r be the smallest radius such that a ball of that radius can contain either rectangle, let d be the distance between the two balls, and let s be the separation constant. The two rectangles are well separated if d > s · r.

More Definitions

Interaction product: A ⊗ B = {{p, p'} : p ∈ A, p' ∈ B, p ≠ p'}

A realization of A ⊗ B is a set {{A_1, B_1}, {A_2, B_2}, …, {A_k, B_k}} such that …

A well-separated realization is a realization such that R(A_i) and R(B_i) are well separated.

A well-separated pair decomposition = a tree decomposition of P …
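As a small illustration of the well-separated test, here is a Python sketch assumed for these notes (the helper names and the circumscribed-ball construction are illustrative): each bounding rectangle is enclosed in the smallest ball of common radius r, and the pair is well separated when the gap d between the two balls exceeds s · r.

```python
import math

def bounding_rectangle(P):
    """Axis-aligned bounding rectangle R(P), as (lower corner, upper corner)."""
    dims = range(len(P[0]))
    return (tuple(min(p[i] for p in P) for i in dims),
            tuple(max(p[i] for p in P) for i in dims))

def well_separated(A, B, s):
    (alo, ahi), (blo, bhi) = bounding_rectangle(A), bounding_rectangle(B)
    # r: smallest radius such that a ball of radius r contains either rectangle
    r = max(math.dist(alo, ahi), math.dist(blo, bhi)) / 2
    ca = tuple((l + h) / 2 for l, h in zip(alo, ahi))   # centers of the balls
    cb = tuple((l + h) / 2 for l, h in zip(blo, bhi))
    d = math.dist(ca, cb) - 2 * r                       # gap between the balls
    return d > s * r

A = [(0.0, 0.0), (1.0, 1.0)]
B = [(10.0, 0.0), (11.0, 1.0)]
print(well_separated(A, B, 2.0))   # -> True
```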