CCCG 2016, Vancouver, British Columbia, August 3–5, 2016

Transforming Hierarchical Trees on Metric Spaces∗

Mahmoodreza Jahanseir† Donald R. Sheehy‡

∗ Partially supported by the National Science Foundation under grant numbers CCF-1464379 and CCF-1525978.
† University of Connecticut, [email protected]
‡ University of Connecticut, [email protected]

Abstract

We show how a simple hierarchical tree called a cover tree can be turned into an asymptotically more efficient one known as a net-tree in linear time. We also introduce two linear-time operations to manipulate cover trees called coarsening and refining. These operations make a trade-off between tree height and node degree.

1 Introduction

There are many very similar data structures for searching and navigating n points in a metric space M. Most such structures support range queries and (approximate) nearest neighbor searches, among others. For computational geometers, two of the most important such structures for general metric spaces are the cover tree [2] and the net-tree [8]. Cover trees, by virtue of their simplicity, have found wide adoption, especially for machine learning applications. Net-trees, on the other hand, provide much stronger theoretical guarantees and can be used to solve a much wider class of problems, but they come at the cost of unrealistic constant factors and complex algorithms. In this paper, we generalize these two data structures and show how to convert a cover tree into a net-tree in linear time. In fact, we show that a cover tree with the right parameters satisfies the stronger conditions of a net-tree, thus finding some middle ground between the two. In Section 5, we give efficient algorithms for modifying these parameters for an existing tree.

Related Work

For Euclidean points, quadtrees [5] and k-d trees [1] are perhaps the two most famous data structures. Most data structures for general metric spaces are generalizations of these.

Uhlmann [11] proposed ball trees to solve the proximity search problem on metric spaces. Ball trees are generalizations of k-d trees. Yianilos [12] proposed a structure similar to ball trees called a vp-tree that allows O(log n)-time queries in expectation for restricted classes of inputs.

Clarkson [3] proposed two randomized data structures to answer nearest neighbor queries in metric spaces that satisfy a certain sphere packing property. These data structures assume that the input and query points are drawn from the same probability distribution. The nearest neighbor query time of these structures depends on the spread ∆ of the input, which is the ratio of the diameter to the distance between the closest pair of points.

Karger & Ruhl [9] devised a dynamic data structure for nearest neighbor queries in growth restricted metrics. A closed metric ball centered at p with radius r is denoted B(p, r) := {q ∈ P | d(p, q) ≤ r}. Karger & Ruhl defined the expansion constant as the minimum µ such that for all p ∈ M and r > 0, |B(p, 2r)| ≤ µ|B(p, r)|. A growth restricted metric space has constant µ (independent of n). Karger & Ruhl proved that their data structure has size µ^{O(1)} n log n and answers nearest neighbor queries in µ^{O(1)} log n time.

Gupta et al. [7] defined the doubling constant ρ of a metric space as the minimum ρ such that every ball in M can be covered by ρ balls of half the radius. The doubling dimension is defined as γ = lg ρ. A metric is called doubling when it has a constant doubling dimension.

Krauthgamer & Lee [10] proposed navigating nets to answer (approximate) nearest neighbor queries in 2^{O(γ)} log ∆ + (1/ε)^{O(γ)} time for doubling metrics. Navigating nets require 2^{O(γ)} n space.

Gao et al. [6] proposed a (1 + ε)-spanner with size O(n/ε^d) for a set of n points in R^d. Their data structure is similar to navigating nets and can be constructed in O(n log ∆ / ε^d) time, and it answers approximate nearest neighbor queries in O(log ∆) time. They also maintained the spanner under dynamic updates and continuous motion of the points.

Har-Peled & Mendel [8] devised a data structure called a net-tree to address approximate nearest neighbor search and some other problems in doubling metrics. They proposed a linear-time algorithm to construct a net-tree of size O(ρ^{O(1)} n) starting from a specific ordering of the points called an approximate greedy permutation. Constructing a greedy permutation requires O(ρ^{O(1)} n log(∆n)) time. To beat the spread, they proposed a randomized algorithm to generate an approximate greedy permutation in O(ρ^{O(1)} n log n) time. Net-trees support approximate nearest neighbor search in O(2^{O(γ)} log n) + (1/ε)^{O(γ)} time.

Beygelzimer et al. [2] presented cover trees to solve the nearest neighbor problem in growth restricted metrics. Cover trees are a simplification of navigating nets and can be constructed incrementally in O(µ^6 n log n) time. The space complexity of cover trees is O(n), independent of the doubling constant or the expansion constant.

Cole & Gottlieb [4] extended the notion of navigating nets to construct a dynamic data structure to support approximate nearest neighbor search in O(log n) + (1/ε)^{O(1)} time for doubling metrics. Similar to net-trees, their data structure provides strong packing and covering properties. To insert a new point, they used biased skip lists to make the search process faster. They proved that the data structure requires O(n) space, independent of the doubling dimension of the metric space.
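To make the quantities defined above concrete, here is a small Python sketch (our own illustration; none of the cited papers provide code) that computes the spread of a finite point set and an empirical lower bound on its expansion constant, obtained by sampling the radii at which ball sizes can change.

from itertools import combinations

def spread(points, d):
    # spread = diameter / closest-pair distance
    dists = [d(p, q) for p, q in combinations(points, 2)]
    return max(dists) / min(dists)

def expansion_lower_bound(points, d):
    # empirical lower bound on the minimum mu with |B(p,2r)| <= mu*|B(p,r)|
    mu = 1.0
    for p in points:
        radii = {d(p, q) for q in points if q != p}
        radii |= {r / 2 for r in radii}
        for r in radii:
            small = sum(1 for q in points if d(p, q) <= r)
            big = sum(1 for q in points if d(p, q) <= 2 * r)
            mu = max(mu, big / small)
    return mu

# Example on a 1-D point set with the usual metric d(x, y) = |x - y|.
pts = [0.0, 1.0, 2.0, 4.0, 8.0]
d = lambda x, y: abs(x - y)
print(spread(pts, d), expansion_lower_bound(pts, d))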

Figure 1: A hierarchical tree on P = {a, b, c, d, e, f}. Squares and ovals illustrate points and nodes, respectively.

2 Definitions

Hierarchical trees. Cover trees and net-trees are both examples of hierarchical trees. In these trees, the input points are leaves and each point p can be associated with many internal nodes. Each node is uniquely identified by its associated point and an integer called its level. Leaves are in level −∞ and the root is in level +∞. The node in level ℓ associated with a point p is denoted p^ℓ. Let par(p^ℓ) be the parent of a node p^ℓ ∈ T. Also, let ch(p^ℓ) be the children of p^ℓ. Each non-leaf has a child with the same associated point. Similar to compressed quadtrees, a node skips a level iff it is the only child of its parent and it has only one child. Let L_ℓ be the points associated with nodes in level at least ℓ. Let P_{p^ℓ} denote the leaves of the subtree rooted at p^ℓ. The levels of the tree represent the metric space at different scales. The constant τ > 1, called the scale factor of the tree, determines the change in scale between levels. Figure 1 shows an example of a hierarchical tree. Note that the tree in this figure is neither a cover tree nor a net-tree, because there are no restrictions on the distances between points.

Cover Trees. A cover tree T is a hierarchical tree with the following properties.

• Packing: For all distinct p, q ∈ L_ℓ, d(p, q) > c_p τ^ℓ.
• Covering: For each r^h ∈ ch(p^ℓ), d(p, r) ≤ c_c τ^ℓ.

We call c_p and c_c the packing constant and the covering constant, respectively, and c_c ≥ c_p > 0. We represent all cover trees with the same scale factor, packing constant, and covering constant by CT(τ, c_p, c_c). Note that the cover tree definition by Beygelzimer et al. [2] results in a tree in CT(2, 1, 1).

Net-trees. A net-tree is a hierarchical tree in which the following invariants hold for each node p^ℓ.

• Packing: B(p, c_p τ^ℓ) ∩ P ⊂ P_{p^ℓ}.
• Covering: P_{p^ℓ} ⊂ B(p, c_c τ^ℓ).¹

Here, c_p and c_c are defined similarly to cover trees. Figure 2 illustrates both the packing and the covering balls for a point p at some level ℓ in a net-tree. Let NT(τ, c_p, c_c) denote the set of net-trees. The algorithm in [8] constructs a tree in NT(11, (τ−5)/(2(τ−1)), 2τ/(τ−1)).

Figure 2: Packing and covering balls for a point p at level ℓ in a net-tree. White points belong to the subtree rooted at node p^ℓ.

The main difference in the definitions is in the packing conditions. The net-tree requires the packing to be consistent with the hierarchical structure of the tree, a property not necessarily satisfied by cover trees. Also, Har-Peled and Mendel [8] set τ = 11, whereas optimized cover tree code sets τ = 1.3.

A net-tree can be augmented to maintain a list of nearby nodes called relatives, defined for each node p^ℓ as follows.

Rel(p^ℓ) = {x^f ∈ T with y^g = par(x^f) | f ≤ ℓ < g and d(p, x) ≤ c_r τ^ℓ}

We call c_r the relative constant; Har-Peled and Mendel set c_r = 13.

¹ The packing condition we give is slightly different from [8], but it is an easy exercise to prove this (more useful) version is equivalent.
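The following Python sketch makes the bookkeeping above concrete: a minimal node type for a hierarchical tree and a brute-force check of the CT(τ, c_p, c_c) packing and covering properties. The class and function names are ours; the paper gives only the mathematical definitions.

class Node:
    def __init__(self, point, level, parent=None):
        self.point = point          # associated point p
        self.level = level          # integer level ell
        self.parent = parent        # par(p^ell)
        self.children = []          # ch(p^ell)
        if parent is not None:
            parent.children.append(self)

def is_cover_tree(nodes, d, tau, c_p, c_c):
    """Brute-force check of the packing and covering properties (points must be hashable)."""
    for ell in {n.level for n in nodes}:
        # L_ell: points associated with nodes in level at least ell
        L = {n.point for n in nodes if n.level >= ell}
        # Packing: distinct p, q in L_ell must be farther apart than c_p * tau^ell
        for p in L:
            for q in L:
                if p != q and d(p, q) <= c_p * tau**ell:
                    return False
    # Covering: every child r of p^ell satisfies d(p, r) <= c_c * tau^ell
    for n in nodes:
        for child in n.children:
            if d(n.point, child.point) > c_c * tau**n.level:
                return False
    return True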

In this paper, we add a new and easy-to-implement condition on cover trees. We require that the children of a node p^ℓ are closer to p than to any other point in L_ℓ.

3 From cover trees to net-trees

In this section, we first show that for every node in a cover tree, the number of children and relatives of that node is constant. Then, we prove that a cover tree with a sufficiently large scale factor satisfies both stronger packing and covering properties of net-trees.

Lemma 1 For each node p^ℓ in T ∈ CT(τ, c_p, c_c), |ch(p^ℓ)| = O(ρ^{lg(c_c τ/c_p)}).

Proof. When |ch(p^ℓ)| > 1, all children of p^ℓ are in level ℓ − 1. From the packing property, the distance between every two nodes in this list is greater than c_p τ^{ℓ−1}. We know that all children of p^ℓ are within distance c_c τ^ℓ of p. By the definition of the doubling constant, the ball centered at p with radius c_c τ^ℓ will be covered by O(ρ^{lg(c_c τ/c_p)}) balls of radius c_p τ^{ℓ−1}. □

Lemma 2 Let p^ℓ ∈ T and T ∈ CT(τ, c_p, c_c). For each two nodes s^e, t^f ∈ Rel(p^ℓ), d(s, t) > c_p τ^ℓ.

Proof. Let r^h = par(s^e). By the definition of relatives, e ≤ ℓ < h. If e < ℓ, then s = r and s ∈ L_h. Because L_h ⊂ L_ℓ, the distance of s to all points in L_ℓ is greater than c_p τ^ℓ. Otherwise, s is in level ℓ. The same argument holds for t^f. Therefore, s, t ∈ L_ℓ, and it implies d(s, t) > c_p τ^ℓ. □

Lemma 3 For each node p^ℓ in T ∈ CT(τ, c_p, c_c), |Rel(p^ℓ)| = O(ρ^{lg(c_r/c_p)}).

Proof. By the definition of relatives, all nodes in Rel(p^ℓ) are within distance c_r τ^ℓ of the point p. From Lemma 2, the distance between any two points in Rel(p^ℓ) is greater than c_p τ^ℓ. Therefore, the total size of Rel(p^ℓ) is O(ρ^{lg(c_r/c_p)}). □

Lemma 4 For each descendant x^f of p^ℓ in T ∈ CT(τ, c_p, c_c), d(p, x) < (c_c τ/(τ−1)) τ^ℓ.

Proof. The covering property and the triangle inequality imply that

d(p, x) ≤ Σ_{i=f}^{ℓ} c_c τ^i < c_c Σ_{i=0}^{∞} τ^{ℓ−i} = c_c τ^{ℓ+1}/(τ − 1). □

Theorem 5 For all τ > 2c_c/c_p + 1, if T ∈ CT(τ, c_p, c_c), then T ∈ NT(τ, (c_p(τ−1) − 2c_c)/(2(τ−1)), c_c τ/(τ−1)).

Proof. From Lemma 4, for a node p^ℓ ∈ T, P_{p^ℓ} ⊂ B(p, (c_c τ/(τ−1)) τ^ℓ). Suppose for contradiction there exists a point r ∈ B(p, ((c_p(τ−1) − 2c_c)/(2(τ−1))) τ^ℓ) such that r ∉ P_{p^ℓ}. Then, there exists a node x^f ∈ T which is the lowest node with f ≥ ℓ and r ∈ P_{x^f}. Let y^g be the child of x^f such that r ∈ P_{y^g}. It is clear that g < ℓ.

First, let g < f − 1. So, x = y and d(p, x) = d(p, y) > c_p τ^ℓ. By the triangle inequality,

d(y, r) ≥ d(y, p) − d(p, r) > c_p τ^ℓ − ((c_p(τ−1) − 2c_c)/(2(τ−1))) τ^ℓ = ((c_p(τ−1) + 2c_c)/(2(τ−1))) τ^ℓ.

Also, from Lemma 4, d(y, r) < (c_c τ/(τ−1)) τ^g ≤ c_c τ^ℓ/(τ−1). Therefore,

((c_p(τ−1) + 2c_c)/(2(τ−1))) τ^ℓ < c_c τ^ℓ/(τ−1).

This implies that c_p(τ−1) < 0, which is a contradiction.

Now, let g = f − 1. In this case, we have f = ℓ and g = ℓ − 1. By the parent property, d(y, p) > d(y, x). So,

d(y, p) ≥ d(p, x) − d(x, y) > c_p τ^ℓ − d(y, p), and hence d(y, p) > c_p τ^ℓ/2.

Also, by the triangle inequality,

d(y, r) ≥ d(y, p) − d(p, r) > (c_p/2) τ^ℓ − ((c_p(τ−1) − 2c_c)/(2(τ−1))) τ^ℓ = c_c τ^ℓ/(τ−1).

We get a contradiction because, by Lemma 4, d(y, r) < c_c τ^ℓ/(τ−1). Therefore, r ∈ P_{p^ℓ}. □
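Read as code, Theorem 5 is a statement about constants: any tree in CT(τ, c_p, c_c) with τ > 2c_c/c_p + 1 is already a net-tree with the constants computed below. The helper is our own illustration, not an algorithm from the paper.

def net_tree_constants(tau, c_p, c_c):
    # Theorem 5: CT(tau, c_p, c_c) is contained in NT(tau, packing, covering)
    # whenever tau > 2*c_c/c_p + 1.
    if tau <= 2 * c_c / c_p + 1:
        raise ValueError("Theorem 5 requires tau > 2*c_c/c_p + 1")
    packing = (c_p * (tau - 1) - 2 * c_c) / (2 * (tau - 1))
    covering = c_c * tau / (tau - 1)
    return packing, covering

# Example: a tree in CT(16, 1, 1) is also in NT(16, 13/30, 16/15).
print(net_tree_constants(16, 1, 1))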

4 Augment cover trees

Theorem 5 shows that, for a sufficiently large scale factor, the packing and covering properties of a cover tree imply the packing and covering properties of net-trees. However, net-tree nodes also maintain a list of nearby nodes called relatives. Algorithm 1 is a procedure that adds a list of relatives to each node of a cover tree. Note that Relatives is similar to the find relative algorithm in [8], but it gives a smaller relative constant.

Algorithm 1 Augmenting a given cover tree with relatives
1: procedure Augment(T, c_r)
2:   for all p^ℓ ∈ T in decreasing order of level ℓ do
3:     Rel(p^ℓ) ← p^ℓ
4:     if p^ℓ is not the root then
5:       Relatives(p^ℓ, c_r, true)

Algorithm 2 Finding relatives of a node p^ℓ
1: procedure Relatives(p^ℓ, c_r, update)
2:   Let q^m = parent(p^ℓ)
3:   for all x^f ∈ Rel(q^m) do
4:     Let y^g = par(x^f)
5:     if d(p, x) ≤ c_r τ^ℓ and f ≤ ℓ < g then
6:       Add x^f to Rel(p^ℓ)
7:     else if update = true and ℓ ≤ f < m and d(p, x) ≤ c_r τ^f then
8:       Add p^ℓ to Rel(x^f)
9:   candidates ← ⋃_{r^e ∈ Rel(q^m)} ch(r^e) \ {p^ℓ}
10:  for all x^f ∈ candidates do
11:    Let y^g = par(x^f)
12:    if d(p, x) ≤ c_r τ^ℓ and f ≤ ℓ < g then
13:      Add x^f to Rel(p^ℓ)
14:    else if d(p, x) ≤ c_r τ^f and ℓ ≤ f < m then
15:      if update = true then
16:        Add p^ℓ to Rel(x^f)
17:      candidates ← candidates ∪ ch(x^f)

Theorem 6 For each node p^ℓ in T ∈ CT(τ, c_p, c_c) and c_r = c_c τ²/(τ−1)², Relatives correctly finds Rel(p^ℓ).

Proof. Suppose for contradiction there exists x^f with y^g = par(x^f) such that x^f ∈ Rel(p^ℓ) and Relatives does not find it. Then either x^f ∉ Rel(q^m), or x^f has an ancestor s^h with h ≤ m − 1 such that p^ℓ ∉ Rel(s^h). We consider each case separately.

Case 1: x^f ∉ Rel(q^m). In this case, at least one of the two conditions of relatives does not hold for x^f. If d(q, x) > (c_c τ²/(τ−1)²) τ^m, then by the triangle inequality,

d(p, x) ≥ d(q, x) − d(p, q) > (c_c τ²/(τ−1)²) τ^m − c_c τ^m ≥ c_c ((2τ−1)/(τ−1)²) τ^{ℓ+1}.

We assumed that x^f ∈ Rel(p^ℓ), so d(p, x) ≤ (c_c τ²/(τ−1)²) τ^ℓ. These inequalities imply τ < 1, a contradiction. If d(q, x) ≤ (c_c τ²/(τ−1)²) τ^m and g > f > m, then f > ℓ, which is also a contradiction. The last case, d(q, x) ≤ (c_c τ²/(τ−1)²) τ^m and f < g ≤ m, is a special case of p^ℓ ∉ Rel(s^h), which is described in the following.

Case 2: p^ℓ ∉ Rel(s^h). We know that ℓ < h < m, so d(p, s) > (c_c τ²/(τ−1)²) τ^h. Also, τ^{h−1} ≥ τ^ℓ because h ≥ ℓ + 1. Using the triangle inequality and then applying Lemma 4,

d(p, s) ≤ d(p, x) + d(x, s) < (c_c τ²/(τ−1)²) τ^ℓ + (c_c τ/(τ−1)) τ^h ≤ (c_c τ²/(τ−1)²) τ^{h−1} + (c_c τ/(τ−1)) τ^h = c_c (τ²/(τ−1)²) τ^h.

This is a contradiction. □

Theorem 7 Algorithm 1 has time complexity O(ρ^{lg((c_c τ/(c_p(τ−1)))² τ)} n).

Proof. We use an amortized analysis for the time complexity of Relatives. While finding relatives, if a node is inserted into the relative list of another node, we decrease one credit from the node whose relative list has grown. Note that in Algorithm 2, for a node x^f in Rel(q^m) or among the children of Rel(q^m), when x^f ∉ Rel(p^ℓ) and p^ℓ ∉ Rel(x^f), p^ℓ is responsible for checking x^f. Also, a child of a node x^f is required to be checked against the relative conditions if p^ℓ ∈ Rel(x^f). In this case, we charge node x^f one credit. From Lemma 3, the relative list for each node has size O(ρ^{lg(c_c τ²/(c_p(τ−1)²))}). Also, Lemma 1 implies that each node has at most O(ρ^{lg(c_c τ/c_p)}) children. Therefore, the total required credit for each node of the tree is O(ρ^{lg(c_c τ²/(c_p(τ−1)²))}) + 2 · O(ρ^{lg(c_c τ²/(c_p(τ−1)²))} ρ^{lg(c_c τ/c_p)}) = O(ρ^{lg((c_c τ/(c_p(τ−1)))² τ)}). So, the total time complexity is O(ρ^{lg((c_c τ/(c_p(τ−1)))² τ)} n). □
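A direct transliteration of Algorithms 1 and 2 into Python may be helpful. It assumes each node carries point, level, parent, children and a relative list rel, and that d is the metric; those field names, the up-front initialization of the relative lists, and the worklist used for the growing candidates set are our choices, but the logic follows the pseudocode above.

def augment(nodes, d, tau, c_r):
    # Algorithm 1: initialize every relative list, then process nodes
    # from the highest level down.
    for n in nodes:
        n.rel = [n]
    for p in sorted(nodes, key=lambda n: n.level, reverse=True):
        if p.parent is not None:
            relatives(p, d, tau, c_r, update=True)

def relatives(p, d, tau, c_r, update):
    # Algorithm 2: relatives of p are found among the relatives of its
    # parent q and among descendants of those relatives.
    q = p.parent
    for x in q.rel:
        g = x.parent.level if x.parent is not None else float('inf')
        if d(p.point, x.point) <= c_r * tau**p.level and x.level <= p.level < g:
            p.rel.append(x)
        elif update and p.level <= x.level < q.level and \
                d(p.point, x.point) <= c_r * tau**x.level:
            x.rel.append(p)
    # Children of the parent's relatives, explored as a growing worklist.
    candidates = [c for r in q.rel for c in r.children if c is not p]
    while candidates:
        x = candidates.pop()
        g = x.parent.level if x.parent is not None else float('inf')
        if d(p.point, x.point) <= c_r * tau**p.level and x.level <= p.level < g:
            p.rel.append(x)
        elif d(p.point, x.point) <= c_r * tau**x.level and p.level <= x.level < q.level:
            if update:
                x.rel.append(p)
            candidates.extend(x.children)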

5 Transform cover trees

For a cover tree, there is a trade-off between the height of the tree and the scale factor. It is not hard to see that the height of a cover tree is bounded above by O(log_τ ∆). So, increasing the scale factor decreases the height of the tree. Also, from Lemma 1, increasing the scale factor results in more children for each node of a cover tree.

In this section, we define two operations to change the scale factor of a given tree. A coarsening operation modifies the tree to increase the scale factor. Similarly, a refining operation results in a tree with a smaller scale factor. Note that in Theorem 5, we assumed that τ > 2c_c/c_p + 1. However, in many cases we may have τ ≤ 2c_c/c_p + 1. For example, Beygelzimer et al. [2] set τ = 2, and they found τ = 1.3 to be even more efficient in practice. In these situations, we can use the coarsening operation to get a cover tree with the stronger packing and covering conditions of net-trees.

5.1 Coarsening

The coarsening operation can be seen as combining every k levels of T into one level in T′. We define a mapping between nodes of T and T′. In this mapping, each node p^ℓ in T maps to a node p^{ℓ′} = p^{⌊ℓ/k⌋} in T′. Here, we use prime as a function that indicates the level of the node in T′ that corresponds to p^ℓ, i.e. ℓ′ = ⌊ℓ/k⌋. We also assume that each point p in T′ maintains high(p), which is the highest node of T′ associated with the point p. Algorithm 3 describes the coarsening operation.

Coarsening uses three procedures: Relatives, RestrictedNN, and FindNode. The first procedure is described in Algorithm 2. Note that in Algorithm 3, T does not have the node q^{k(⌊ℓ/k⌋+2)−1}. This node is a dummy, and we set q^m as its parent. The only reason to use this dummy node is to bound the running time of the algorithm. The next procedure is RestrictedNN, and it returns the nearest neighbor to the point p among those nodes of the subtree rooted at x^f whose levels in T′ are greater than ℓ′. Finally, FindNode receives a node p^ℓ and a level m′ in T′, and it tries to find the node p^{m′}. If it finds that node, the node is returned. Otherwise, p^{m′} will be inserted into T′ such that it satisfies all properties of a hierarchical tree. More specifically, if p^{m′} has only one child p^{e′} and p^{e′} has only one child, then p^{e′} will be removed from T′ and the only child of p^{e′} will be added as a child of p^{m′}. Then, this new node will be returned.

Algorithm 3 Coarsening operation for a given cover tree
1: procedure Coarsening(T, k)
2:   T′ ← ∅
3:   Augment(T, 3c_c τ²/(τ−1)²)
4:   for all p^ℓ ∈ T in increasing order of level ℓ do
5:     q^m ← the lowest ancestor of p^ℓ with m > ℓ
6:     if p = q then
7:       p^{ℓ′} ← FindNode(high(p), ℓ′)
8:       q^{ℓ′+1} ← FindNode(high(q), ℓ′ + 1)
9:     else
10:      par ← q^m
11:      relatives ← Rel(q^m)
12:      if m′ > ℓ′ + 1 then
13:        h ← k(⌊ℓ/k⌋ + 2) − 1
14:        Relatives(q^h, 3c_c τ²/(τ−1)², false)
15:        relatives ← Rel(q^h) ∪ {q^h}
16:      for all x^f ∈ relatives do
17:        if f′ = ℓ′ + 1 then
18:          x^f ← RestrictedNN(p, x^f, k)
19:        if d(p, x) < d(p, par) then
20:          par ← x^f
21:      par^{ℓ′} ← FindNode(high(par), ℓ′)
22:      par^{ℓ′+1} ← FindNode(high(par), ℓ′ + 1)
23:      p^{ℓ′} ← FindNode(high(p), ℓ′)
24:      Add p^{ℓ′} as a child of par^{ℓ′+1}

Theorem 8 Algorithm 3 converts a tree T ∈ CT(τ, c_p, c_c) into T′ ∈ CT(τ^k, c_p, c_c τ/(τ−1)).

Proof. The theorem requires showing that three invariants hold: the covering property, the packing property, and the parent relation.

First, we prove that T′ satisfies the covering property. If r^e ∈ T is a descendant at most k levels down from some p^ℓ, then from Lemma 4,

d(p, r) < (c_c τ/(τ−1)) τ^ℓ < (c_c τ/(τ−1)) (τ′)^{ℓ′},

where τ′ = τ^k is the scale factor of T′. Because we are combining sets of k consecutive levels, it follows that each node in T′ will have a node in the level above whose distance is at most this amount. It follows that T′ has covering constant c_c τ/(τ−1).

Next, we prove that the packing constant is correct. If ℓ = kℓ′, then the minimum distance between points in level ℓ′ of T′ is equal to the minimum distance between points in level ℓ of T, which is at least c_p τ^ℓ = c_p (τ′)^{ℓ′}. Thus, the points in level ℓ′ of T′ satisfy the packing condition and the packing constant is c_p.

Now, we prove that the algorithm correctly finds the parent of p^{ℓ′} in T′. Without loss of generality, let ℓ be divisible by k. Also, let s be the closest point to p among all points in L_{ℓ+k}, and suppose s appears for the first time in level e such that e′ > ℓ′. So, s is the right parent for p^{ℓ′}. For contradiction, assume that s^e ∉ Rel(q^m) and s^e is not returned by RestrictedNN for any node in Rel(q^m). Let t^h be the parent of s^e. From Lemma 4,

d(p, s) < d(p, q) ≤ (c_c τ/(τ−1)) τ^m.

We have the following cases:

Case 1: d(s, q) > (3c_c τ²/(τ−1)²) τ^m. By the triangle inequality,

d(s, q) ≤ d(p, s) + d(p, q) < 2 d(p, q) ≤ (2c_c τ/(τ−1)) τ^m,

which is a contradiction, because τ > 1.

Case 2: d(s, q) ≤ (3c_c τ²/(τ−1)²) τ^m and e > m. In this case, there exists a node s^g with g ≤ m such that it satisfies both conditions of relatives. So, s^g ∈ Rel(q^m) and the algorithm correctly finds s^g.²

Case 3: d(s, q) ≤ (3c_c τ²/(τ−1)²) τ^m, m ≥ h, and RestrictedNN does not find s^e. Let x^f be the highest ancestor of s^e such that f ≤ m. Then, Lemma 4 implies

d(x, s) ≤ (c_c τ/(τ−1)) τ^f < (c_c τ/(τ−1)) τ^m.

Also, by the triangle inequality,

d(x, q) ≤ d(x, s) + d(p, s) + d(p, q) < d(x, s) + 2 d(p, q) < (c_c τ/(τ−1)) τ^m + (2c_c τ/(τ−1)) τ^m = (3c_c τ/(τ−1)) τ^m < (3c_c τ²/(τ−1)²) τ^m.

Therefore, x^f ∈ Rel(q^m). Because e′ > ℓ′, RestrictedNN returns s^e as the nearest neighbor to p^ℓ, which is a contradiction. □

² g can be equal to −∞; in this case, we have a long edge from a node s in a level greater than m to the point s in level −∞.
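The level arithmetic behind Algorithm 3 is easy to get wrong, so here is a small sketch of the mapping ℓ′ = ⌊ℓ/k⌋ and of the dummy level k(⌊ℓ/k⌋ + 2) − 1 used to bound the running time. The helper names are ours.

def coarse_level(ell, k):
    # ell' = floor(ell / k); Python's // floors toward -infinity, matching
    # the definition even for negative levels.
    return ell // k

def dummy_level(ell, k):
    # Level of the dummy ancestor q used by Algorithm 3, line 13.
    return k * (ell // k + 2) - 1

# Example with k = 3: levels 0, 1, 2 of T collapse to level 0 of T',
# levels 3, 4, 5 collapse to level 1, and so on.
assert [coarse_level(e, 3) for e in range(6)] == [0, 0, 0, 1, 1, 1]
assert dummy_level(4, 3) == 8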

Theorem 9 The time complexity of Algorithm 3 is O(ρ^{lg((c_c τ/(c_p(τ−1)))² τ)} n lg k).

Proof. From Theorem 7, T can be augmented with relatives in O(ρ^{lg((c_c τ/(c_p(τ−1)))² τ)} n) time. As a preprocessing step, we can maintain the lowest ancestor of all nodes in T in O(n) time, which gives constant-time access in the algorithm. By Lemma 3, the size of each list of relatives is O(ρ^{lg(c_c τ²/(c_p(τ−1)²))}). The time complexity of RestrictedNN is O(lg k), because the height of the subtree is O(k). When Algorithm 3 is processing all nodes of level ℓ in T, for each point p in T′, the level of high(p) is at most ℓ′ + 1. So, FindNode requires O(1) time to return a node of T′ in level ℓ′ or ℓ′ + 1. Since the number of edges in T is O(n), finding relatives of dummy nodes will be done O(n) times for the entire algorithm. Consequently, because c_c ≥ c_p > 0, the total time complexity of the algorithm is O(ρ^{lg((c_c τ/(c_p(τ−1)))² τ)} n lg k). □

5.2 Refining

Decreasing the scale factor is another useful operation for cover trees, and we call this operation refining. To refine a given cover tree T, each level ℓ in T is split into at most k levels kℓ, . . . , kℓ + k − 1 in T′. Note that by this division, a node p^ℓ in T may appear at most k times in levels kℓ, . . . , kℓ + k − 1 of T′. Similar to the coarsening operation, suppose that each point p in T′ maintains high(p), which is the highest node associated with the point p. Algorithm 4 describes the refining operation.

Algorithm 4 Refining operation for a given cover tree
1: procedure Refining(T, k)
2:   T′ ← ∅
3:   Augment(T, 3c_c τ²/(τ−1)²)
4:   for all p^ℓ ∈ T in increasing order of level ℓ do
5:     Let q^m = par(p^ℓ)
6:     p^{h′} ← high(p)
7:     if p = q and h′ < kℓ then
8:       p^{kℓ} ← FindNode(high(p), kℓ)
9:       q^{k(ℓ+1)} ← FindNode(high(q), k(ℓ + 1))
10:    else if p ≠ q then
11:      par ← q^m
12:      list ← ⋃_{s^h ∈ Rel(q^m)} ch(s^h) \ {q^ℓ}
13:      for all x^f ∈ list where f = ℓ do
14:        x^{h′} ← high(x)
15:        if d(p, x) ≤ c_c τ^{h′/k} and d(p, x) < d(p, par) then
16:          par ← x^f
17:      Find an i such that c_p τ^{ℓ+i/k} < d(p, par) ≤ c_c τ^{ℓ+(i+1)/k}
18:      par^{kℓ+i+1} ← FindNode(high(par), kℓ + i + 1)
19:      par^{kℓ+i} ← FindNode(high(par), kℓ + i)
20:      p^{kℓ+i} ← FindNode(high(p), kℓ + i)
21:      Add p^{kℓ+i} as a child of par^{kℓ+i+1}
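The sketch below illustrates the level arithmetic of Algorithm 4: level ℓ of T splits into levels kℓ, . . . , kℓ + k − 1 of T′ with scale factor τ^{1/k}, and line 17 selects the sub-level i at which the chosen parent covers p. The function name is ours, and the search is a simple linear scan rather than the O(lg k) search assumed in the analysis below.

def find_sublevel(dist, ell, k, tau, c_p, c_c):
    """Return an i in [0, k) with c_p*tau^(ell + i/k) < dist <= c_c*tau^(ell + (i+1)/k),
    or None if no such i exists for this distance."""
    for i in range(k):
        if c_p * tau**(ell + i / k) < dist <= c_c * tau**(ell + (i + 1) / k):
            return i
    return None

# Example: with tau = 2, c_p = c_c = 1, k = 4, and ell = 0, a parent at distance
# 1.3 lands in sub-level i = 1, since 2^(1/4) < 1.3 <= 2^(2/4).
print(find_sublevel(1.3, 0, 4, 2.0, 1.0, 1.0))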

Theorem 10 Algorithm 4 turns T ∈ CT(τ, c_p, c_c) into T′ ∈ CT(τ^{1/k}, c_p, c_c).

Proof. First, we prove that the algorithm correctly finds the parent of each node in T′. Note that q, the current parent of p^ℓ in T, may not be the right parent of it in T′, because there may exist a node x^ℓ such that d(p, x) < d(p, q) and the level of x in T′ is greater than the level of p in T′. In this case, p should be inserted as a child of x. To find the right parent of p^ℓ, we search its nearby nodes and select the closest node that satisfies the covering property with constant c_c.

Now, we show that only those nodes of T that appear for the first time in level ℓ need to be checked. Let x appear for the first time in level h > ℓ. By the parent property of T, d(p, q) < d(p, x); otherwise, p would have x as its parent. Therefore, p cannot be closer to x than to q, and we can ignore x in the search process.

We also show that the right parent of p in T′ is in the set of children of relatives of q^{ℓ+1}. Let x^ℓ serve as the parent of p in T′. So, d(p, x) < d(p, q) ≤ c_c τ^{ℓ+1}. From the previous part, we know that x^ℓ has parent y^{ℓ+1} and x ≠ y. By the triangle inequality,

d(q, y) ≤ d(q, p) + d(p, x) + d(x, y) < 3c_c τ^{ℓ+1} < (3c_c τ²/(τ−1)²) τ^{ℓ+1}.

It implies that y^{ℓ+1} ∈ Rel(q^{ℓ+1}).

Now, we should find the right level kℓ + i such that inserting p at that level as a child of x satisfies both the packing and covering conditions with constants c_p and c_c, respectively. In this way, we guarantee that these constants in T′ will be the same as in T. □

Theorem 11 Algorithm 4 has time complexity O((ρ^{lg((c_c τ/(c_p(τ−1)))² τ)} + k) n).

Proof. By Theorem 7, we can augment T in O(ρ^{lg((c_c τ/(c_p(τ−1)))² τ)} n) time. From Lemma 3 and Lemma 1, the number of nearby nodes to p^ℓ is O(ρ^{lg((c_c τ/(c_p(τ−1)))² τ)}). Note that for each node p^ℓ ∈ T in this algorithm, the level of high(p) in T′ is at most k(ℓ + 1). Therefore, FindNode requires O(k) time to return a node whose level is between k(ℓ + 1) and kℓ in T′. Also, finding the right interval i requires O(lg k) time. Therefore, the time complexity of the refining algorithm is O((ρ^{lg((c_c τ/(c_p(τ−1)))² τ)} + k) n). □
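Reading Theorems 5 and 8 together suggests a simple recipe: coarsen T ∈ CT(τ, c_p, c_c) with the smallest k for which τ^k exceeds 2(c_c τ/(τ−1))/c_p + 1, and the resulting tree satisfies the net-tree conditions of Theorem 5. The following sketch computes that k; it is our own back-of-the-envelope illustration, not a procedure from the paper.

def coarsening_k_for_net_tree(tau, c_p, c_c):
    # After coarsening with parameter k, Theorem 8 gives a tree in
    # CT(tau^k, c_p, c_c*tau/(tau-1)); Theorem 5 applies once the new
    # scale factor exceeds 2*(c_c*tau/(tau-1))/c_p + 1. Assumes tau > 1.
    cc_prime = c_c * tau / (tau - 1)
    threshold = 2 * cc_prime / c_p + 1
    k = 1
    while tau**k <= threshold:
        k += 1
    return k

# Example: with the practical setting tau = 1.3 and c_p = c_c = 1 mentioned
# earlier, k = 9 levels must be merged before Theorem 5 applies to the
# coarsened tree.
print(coarsening_k_for_net_tree(1.3, 1.0, 1.0))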

6 Conclusion

In this paper, we added an easy-to-implement condition to cover trees and showed that a cover tree with a large enough scale factor is a net-tree. We also proposed a linear-time algorithm to augment the nodes of a cover tree with relatives. Furthermore, we presented two linear-time algorithms to transform a cover tree into a coarser or finer cover tree. These two operations make it possible to trade off between the depth and the node degree of a cover tree.

References

[1] J. L. Bentley. Multidimensional binary search trees used for associative searching. Commun. ACM, 18(9):509–517, Sept. 1975.
[2] A. Beygelzimer, S. Kakade, and J. Langford. Cover trees for nearest neighbor. In Proceedings of the 23rd International Conference on Machine Learning, pages 97–104, 2006.
[3] K. L. Clarkson. Nearest neighbor queries in metric spaces. Discrete & Computational Geometry, 22(1):63–93, 1999.
[4] R. Cole and L.-A. Gottlieb. Searching dynamic point sets in spaces with bounded doubling dimension. In Proceedings of the Thirty-eighth Annual ACM Symposium on Theory of Computing, pages 574–583, 2006.
[5] R. A. Finkel and J. L. Bentley. Quad trees: a data structure for retrieval on composite keys. Acta Informatica, 4(1):1–9, 1974.
[6] J. Gao, L. J. Guibas, and A. Nguyen. Deformable spanners and applications. Comput. Geom. Theory Appl., 35(1-2):2–19, Aug. 2006.
[7] A. Gupta, R. Krauthgamer, and J. R. Lee. Bounded geometries, fractals, and low-distortion embeddings. In Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, pages 534–, 2003.
[8] S. Har-Peled and M. Mendel. Fast construction of nets in low dimensional metrics, and their applications. SIAM Journal on Computing, 35(5):1148–1184, 2006.
[9] D. R. Karger and M. Ruhl. Finding nearest neighbors in growth-restricted metrics. In Proceedings of the Thirty-fourth Annual ACM Symposium on Theory of Computing, pages 741–750, 2002.
[10] R. Krauthgamer and J. R. Lee. Navigating nets: Simple algorithms for proximity search. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 798–807, 2004.
[11] J. K. Uhlmann. Satisfying general proximity/similarity queries with metric trees. Information Processing Letters, 40(4):175–179, 1991.
[12] P. N. Yianilos. Data structures and algorithms for nearest neighbor search in general metric spaces. In Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 311–321, 1993.