Node

Representation of multidimensional point data
• How can we store multidimensional data, such as points in the plane, to also allow fast retrieval, updates and queries?
• Quadtrees
• k-d trees
• Range trees and priority search trees

Queries to point databases
• Point query - determine if a specific point is in the database (rarely used)
• Range query - identify the set of data points whose coordinate values fall within a certain range, e.g., all points within distance k of a given point, all points within a rectangular region, etc.
• Nearest neighbor query - find the nearest neighbor(s) to a given point, independent of how far away they might be.

Range searching
• Find all records whose key values fall between two limiting values
• Easy to do for one-dimensional data using binary search trees
• Let L and U specify the lower and upper bounds for the search
• Start at the root of the BST holding key K
• 1) If L <= K <= U, include K in the answer set
• 2) If L <= K, then search the left subtree
• 3) If K <= U, search the right subtree

Range searching
procedure BSTRangeSearch(L, U, P, Op);
{Perform Op on each node, n, in the BST pointed at by P if L <= key(n) <= U}
  if P = nil then return;
  K <-- Key(P);
  if L <= K <= U then Op(P);
  if L <= K then BSTRangeSearch(L, U, Left(P), Op);
  if K <= U then BSTRangeSearch(L, U, Right(P), Op)
[Figure: BST with letter keys, illustrating a range search from e to p]

Sets of points
• Goal is to develop data structures for representing sets of points in the plane. S = { (xi, yi) }
• Simplest solution for bounded sets - an array of {0,1} with 1’s inserted at locations of points in S.
• Need to specify two parameters to determine size of the array:
• 1) Range of x and y coordinates. For example, we could have 0 <= x <= 1 and 0 <= y <= 1.
• 2) Precision of x and y coordinates.
For example, x and y can be represented as 6-bit numbers - .b1b2b3b4b5b6
• Combining the range and precision, we can determine the maximum number of points in any set chosen from that range and represented with that precision; this is the size of the array.

Sets of points
• So, for example, if 0 <= x <= 100 and 0 <= y <= 100 and the precision of a coordinate is 1.0, then we can have a maximum of about 10,000 distinct points (101 x 101 = 10,201 if both endpoints of each range are included).
• Insertion and deletion using this data structure are trivial
• But answering queries takes time proportional to the size of the array, rather than the number of points stored.
• And typically points are “sparse” - think restaurants, banks, …

Trees and tries (again)
• Suppose we want a data structure to store one-dimensional points (locations on a line segment) for fast lookup.
• Let the points be 6-bit numbers (think of dividing the line segment into 64 bins)
• Consider the points 100010 (34), 110010 (50), 001001 (9), 011100 (28) and insert them into a BST and a trie in that order (for the trie, the final structure is independent of the order in which the points are inserted).
[Figure sequence: inserting 34, 50, 9 and 28 into a BST over the interval 0 to 64. 34 becomes the root, 50 its right son, 9 its left son, and 28 the right son of 9.]
[Figure sequence: inserting the same four points into a trie over the interval 0 to 64.]

BSTs and tries break up the search key space in different ways.
• The BST breaks up space based on the VALUES of the keys, so the root can break space into an arbitrary pair of intervals.
• The trie breaks up space by regular decomposition of the interval: it splits subintervals in half (0, 1) whenever a new point is inserted.
• We will use both of these methods for two-dimensional points.
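For the one-dimensional points above, the BSTRangeSearch procedure given earlier can be sketched in Python. This is a minimal illustration, not a definitive implementation: the names `BSTNode`, `bst_insert` and `bst_range_search` are hypothetical, and sending equal keys to the right subtree on insertion is an assumption the slides do not state.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BSTNode:
    key: int
    left: Optional["BSTNode"] = None
    right: Optional["BSTNode"] = None


def bst_insert(root: Optional[BSTNode], key: int) -> BSTNode:
    """Insert key into an unbalanced BST (assumption: equal keys go right)."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        root.right = bst_insert(root.right, key)
    return root


def bst_range_search(lo: int, hi: int, p: Optional[BSTNode],
                     op: Callable[[BSTNode], None]) -> None:
    """Apply op to every node whose key k satisfies lo <= k <= hi."""
    if p is None:
        return
    k = p.key
    if lo <= k <= hi:
        op(p)
    if lo <= k:   # the left subtree may still hold keys >= lo
        bst_range_search(lo, hi, p.left, op)
    if k <= hi:   # the right subtree may still hold keys <= hi
        bst_range_search(lo, hi, p.right, op)


# Insert the four example points in the order used on the slide.
root = None
for value in (0b100010, 0b110010, 0b001001, 0b011100):  # 34, 50, 9, 28
    root = bst_insert(root, value)

found = []
bst_range_search(10, 40, root, lambda n: found.append(n.key))
print(found)  # [34, 28]
```

Nodes are reported in the order they are visited (root before subtrees), so the result is not sorted. The search skips the left subtree of 9 and the right subtree of 50 without looking at them, which is the pruning that the two comparisons buy.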
Quadtrees
• Point quadtree - like a binary search tree, space will be decomposed at the values of the coordinates.
• Each data point is represented as a node that contains seven fields
• Four fields are pointers to the sons (NW, NE, SW, SE)
• Two fields hold the coordinates (x, y)
• One field points to a record containing information about the point (name)

Example data for quadtree
Name      X    Y
Chicago   35   40
Mobile    50   10
Toronto   60   75
Buffalo   80   65
Denver     5   45
Omaha     25   35
Atlanta   85   15
Miami     90    5

Point quadtree
[Figure: point quadtree for the example cities and the corresponding partition of the plane]

Insertion into a point quadtree
• Insertion is similar to binary search trees
• Compare the (x,y) coordinates of the new point to those of the quadtree nodes, branching until encountering a nil pointer
procedure which_quadrant(newp, quadp);
  return (if x(newp) < x(quadp)
          then {West} if y(newp) < y(quadp) then SW else NW
          else {East} if y(newp) < y(quadp) then SE else NE)
• This procedure is used to choose the appropriate son as we descend through the point quadtree

Example
[Figure sequence: Chicago (35,40), Mobile (50,10), Toronto (60,75) and Buffalo (80,65) are inserted in turn]

Proximity search
• It is all about pruning - ruling out entire subtrees that cannot possibly contain any points that fall in the query region
• In the BST case, if we are visiting a node, N, holding key k, our search interval is [l,u], and k < l, we can prune the left subtree because all keys, k’, in the left subtree of N are by construction less than k, so k’ < k < l and they cannot lie in the interval.
• In a point quadtree the logic of pruning is slightly more complicated, but it depends on which quadrants defined by the point at a node intersect the search region.
• So, if the search region is contained completely within the NE quadrant, we only need to search the NE quadrant.

Example
[Figure: a search region and the numbered regions 1 to 9 around it, drawn over the quadtree containing Chicago (35,40), Mobile (50,10), Toronto (60,75) and Buffalo (80,65). Quadrants to search for a node in each region: 1. SE; 2. SE, SW; 3. SW; 4. SE, NE; 5. SW, NW; 6. NE]

Proximity search
• Find all points in a point quadtree within distance r of a given point, A.
• The quadtree allows us to rule out large parts of the data structure from the search

Which quadrants to search based on location of root of quadtree
If a quadtree node falls in region i relative to the query point, then search the quadrants listed:
Region   Quadrants to search
1        SE
2        SE, SW
3        SW
4        SE, NE
5        SW, NW
6        NE
7        NE, NW
8        NW
9        All but NW (all, for a box)
10       All but NE
11       All but SW
12       All but SE
13       All
[Figure: regions 1 to 13 laid out around a circular search region of radius r]

Search algorithm
• Let (u,v) be the coordinates of the point whose fixed radius neighbors we are searching for.
• The algorithm will construct a list, V, of quadtree nodes to visit, and a list, N, of fixed radius neighbors.
• Put the root, R, onto V; initialize N to empty
• If V is empty, halt; otherwise remove a node, T, from V. Let (x,y) be the coordinates of T
• Compute the region, r, that (x,y) falls in relative to (u,v) using the mask.
• Case on r
• r = 1: Add SE(T) to V
• …
• r = 13: Add T to N, add SE(T), SW(T), NE(T), NW(T) to V
• Repeat

Example - Find all cities within 10 units of (83,10)
• Look only in the SE quadrant of the root (Chicago), since Chicago is in region 1 relative to the search point
• Look at the SE and NE quadrants of Mobile (SE son of Chicago), since Mobile is in region 4 relative to the search point
[Figure: the example cities with the 13 regions drawn around the search point]

Example
1. Chicago is in R4: visit Toronto, Mobile
2. Mobile is in R6: visit Atlanta
3. Toronto is in R2: visit Buffalo
[Figure: the example cities with regions 1 to 9 drawn around a search region]
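The insertion procedure and the fixed-radius search above can be sketched in Python as follows. This is a sketch under stated assumptions: the names `QuadNode`, `quad_insert`, `build` and `radius_search` are hypothetical, and instead of the 13-region mask the pruning test uses the bounding box of the search circle (a quadrant is searched if the box [u-r, u+r] x [v-r, v+r] overlaps it). That simplification never skips a needed quadrant, but in the corner regions 9 to 12 it may visit one quadrant more than the mask would.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class QuadNode:
    name: str
    x: float
    y: float
    sons: Dict[str, Optional["QuadNode"]] = field(
        default_factory=lambda: {"NW": None, "NE": None, "SW": None, "SE": None}
    )


def which_quadrant(px: float, py: float, node: QuadNode) -> str:
    """Same rule as the slide: x < node.x is West, y < node.y is South."""
    if px < node.x:
        return "SW" if py < node.y else "NW"
    return "SE" if py < node.y else "NE"


def quad_insert(root: Optional[QuadNode], name: str, x: float, y: float) -> QuadNode:
    """Descend by which_quadrant until a nil pointer is found."""
    new = QuadNode(name, x, y)
    if root is None:
        return new
    node = root
    while True:
        q = which_quadrant(x, y, node)
        if node.sons[q] is None:
            node.sons[q] = new
            return root
        node = node.sons[q]


def build(points: List[Tuple[str, float, float]]) -> Optional[QuadNode]:
    root = None
    for name, x, y in points:
        root = quad_insert(root, name, x, y)
    return root


def radius_search(root: Optional[QuadNode], u: float, v: float, r: float):
    """Return (neighbors within r of (u, v), names of all nodes visited)."""
    to_visit = [root] if root is not None else []   # the list V
    neighbors, visited = [], []                      # the list N, plus a trace
    while to_visit:
        t = to_visit.pop()
        visited.append(t.name)
        if math.hypot(t.x - u, t.y - v) <= r:
            neighbors.append(t.name)
        # West sons hold x < t.x, east sons x >= t.x; likewise south/north.
        west, east = u - r < t.x, u + r >= t.x
        south, north = v - r < t.y, v + r >= t.y
        for q, keep in (("NW", north and west), ("NE", north and east),
                        ("SW", south and west), ("SE", south and east)):
            if keep and t.sons[q] is not None:
                to_visit.append(t.sons[q])
    return neighbors, visited


CITIES = [("Chicago", 35, 40), ("Mobile", 50, 10), ("Toronto", 60, 75),
          ("Buffalo", 80, 65), ("Denver", 5, 45), ("Omaha", 25, 35),
          ("Atlanta", 85, 15), ("Miami", 90, 5)]

tree = build(CITIES)
neighbors, visited = radius_search(tree, 83, 10, 10)
print(sorted(neighbors))  # ['Atlanta', 'Miami']
print(visited)            # ['Chicago', 'Mobile', 'Miami', 'Atlanta']
```

On the example query the trace matches the slide: only the SE son of Chicago is searched, then the SE and NE sons of Mobile, so four of the eight nodes are visited.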
Deletion from point quadtrees
• Difficult to achieve efficiently, because deleting a node changes the partition lines for all of the descendants of that node
• When deleting a node, remove the subtree rooted at the node and reinsert all of the nodes in the subtree
• This is an expensive process, and often results in a deeper tree, yielding longer search times
• Samet contains a better, but more complicated algorithm

Point quadtree summary
• Lookup and insertion are straightforward extensions of binary tree search
• We just have to make a four-way rather than a two-way decision
• Deletion is much more complicated and expensive
• For higher dimensional data, storage requirements can be large
• (x1) - binary search tree, and each node has 2 pointers to children
• (x1, x2) - 2D point quadtree, and each node has 4 pointers
• (x1, x2, x3) - 3D point quadtree, and each node has 8 pointers
• (x1, …, xn) - nD point quadtree, and each node has 2^n pointers
• Can overcome this problem by storing only non-null pointers at each node, but this makes the implementation more complex
• Can use a linked list of (pointer name, pointer value) pairs for the non-null pointers

k-d trees - 2 pointers per node
• k - dimensionality of the data being represented
• 2-d tree - representation for points in the plane
• A k-d tree is a binary tree
1) at each stage, a different coordinate of the key is used to determine the branching
2) first branch on the x coordinate of a key (ties break left)
3) then branch on the y coordinate (ties break down)
4) then alternate between the x and y coordinates
5) at a branch, the left son will have all keys smaller than the tested attribute, and the right son will have all keys greater than or equal to the tested attribute

k-d tree example
[Figure sequence: Chicago (35,40) is inserted first and partitions on x; Denver (5,45) is inserted next, becomes the left son of Chicago (5 < 35), and partitions on y]
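The branching rules for a 2-d tree can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: `KDNode` and `kd_insert` are hypothetical names, the code follows rule 5 literally (keys smaller than the tested coordinate go left, keys greater than or equal go right), and the insertion order after Chicago and Denver is an assumption chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class KDNode:
    name: str
    point: Tuple[float, float]
    left: Optional["KDNode"] = None
    right: Optional["KDNode"] = None


def kd_insert(root: Optional[KDNode], name: str,
              point: Tuple[float, float], depth: int = 0) -> KDNode:
    """Insert point, testing x at even depths and y at odd depths."""
    if root is None:
        return KDNode(name, point)
    axis = depth % len(point)          # 0 -> x, 1 -> y, then alternate
    if point[axis] < root.point[axis]:
        root.left = kd_insert(root.left, name, point, depth + 1)
    else:                              # greater than or equal goes right
        root.right = kd_insert(root.right, name, point, depth + 1)
    return root


kd_root = None
for city, xy in [("Chicago", (35, 40)), ("Denver", (5, 45)),
                 ("Mobile", (50, 10)), ("Toronto", (60, 75))]:
    kd_root = kd_insert(kd_root, city, xy)

print(kd_root.left.name)         # Denver
print(kd_root.right.name)        # Mobile
print(kd_root.right.right.name)  # Toronto
```

Each node carries only two pointers regardless of k, which is the storage advantage over the 2^n pointers of an n-dimensional point quadtree.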