Point Enclosure and the Interval Tree

C.S. 252, Computational Geometry
Prof. Roberto Tamassia
Sem. II, 1992-1993
Lecture 8
Date: March 3, 1993
Scribe: Dzung T. Hoang

Point Enclosure

We consider the 1-D point enclosure problem: given a collection of segments on a line and a query point, find the segments that contain the query point. In a previous lecture we saw a solution based on the segment tree data structure that requires O(n log n) space and O(log n + k) query time, where n is the number of segments and k is the number of segments reported. In this lecture we introduce a data structure that solves the 1-D point enclosure problem using O(n) space and O(log n + k) query time.

Interval Tree

The data structure we introduce is called the interval tree. Given a set X of points on the line and a set S of segments with endpoints in X, the interval tree is a binary tree that stores the segments in its internal and leaf nodes. An internal node may hold no segments, but a leaf node contains at least one segment. Each segment is stored in exactly one node.

It is easiest to describe an interval tree by showing how one is constructed. Below are the steps for constructing an interval tree given X and S, with the restriction that segments have distinct endpoints. We later discuss how to remove this restriction.

1. Find the median, x_m, of X. (If there are two medians, let x_m be their average.)
2. Draw the root above x_m.
3. Store x_m at the root.
4. Store all segments in S that contain x_m at the root.
5. Remove the above segments from S and their endpoints from X.
6. Recursively construct the subtree for the left (right) child by considering only the segments with both endpoints to the left (right) of x_m.

To illustrate the process, we go through a step-by-step construction of an interval tree for 8 segments (Figures 1 to 5 in the Appendix). At each step, the tree constructed so far is shown along with the remaining segments and endpoints.
Each node is annotated with the set of segments it contains. The recursive constructions of the left and right subtrees are shown in the same figure to reduce the number of figures required.

As promised earlier, we show how to handle the case where segments share common endpoints. Only step (5) is affected by this case. After removing the segments containing x_m from S, we remove from X only those elements that are not endpoints of any of the remaining segments in S.

Query

For query processing, each node is augmented with two linked lists: one stores the left endpoints of the segments associated with the node, sorted left-to-right, and the other stores their right endpoints, sorted right-to-left.

When performing a point enclosure query, the interval tree is treated as a binary search tree, comparing the query point against the median x-coordinate stored at each node. At each node visited, the query point is checked for enclosure in the segments stored at the node by scanning one of the two sorted lists. Every segment stored at a node contains the node's median, so if the query point lies to the right of the median, a stored segment contains it exactly when the segment's right endpoint is not to the left of the query point. In that case the list of right endpoints is scanned right-to-left and the next node visited is the right child. Similarly, if the query point lies to the left of the median, the list of left endpoints is scanned left-to-right and the next node visited is the left child. Each scan is halted at the first segment encountered that does not contain the query point, so the number of endpoints scanned at a node is at most one more than the number of segments reported there.

Since the tree is constructed by successively dividing the endpoints into two sets of about equal size, the height of the tree is O(log n). If k segments enclose the query point, the total query time is O(log n + k).

Each segment is stored in exactly one node of the tree, and the amount of data stored for each segment is constant. Therefore the space requirement is O(n).

Preprocessing

We now examine the time needed to construct the tree.
In the worst case, the tree will contain n leaves and have 2n - 1 nodes. Since the number of endpoints considered at each node is reduced by about half with every level of recursion, a node on level m (with the root on level 0) considers at most (2n + 1)/2^m endpoints.

Step (1) requires finding the median of the set of endpoints under consideration. The median of a set of size n can be found in O(n) time. Noting that there are at most 2^m nodes on level m, the total time required to execute Step (1) for all the nodes created is bounded by the following sum:

\[ \sum_{m=0}^{\log n} 2^m \cdot \frac{2n + 1}{2^m} = (\log n + 1)(2n + 1) = O(n \log n). \]

The sorted lists of left and right endpoints of the segments stored at each node can be constructed in O(n log n) time as follows:

1. Create two sorted binary trees, one for all left endpoints and one for all right endpoints. Provide threads for left-to-right and right-to-left orderings. This takes O(n log n) time.
2. In the construction of the linked list of left (right) endpoints at each node, search the corresponding binary tree for x_m and follow the left-to-right (right-to-left) threads to retrieve the endpoints. In this step, also "remove" the reported endpoints by updating the threads. Each binary search takes O(log n) time. Since there may be O(n) such searches, the total search time is O(n log n).

The total time required to construct the interval tree is O(n log n).

Appendix

[Figure 1: Initial set of segments and endpoints.]

[Figure 2: The root node is added above the average of the two median points. A ray is projected from the root to mark those segments to be added to the root.]

[Figure 3: The segments intersecting the ray projected from the root are added to the root and then removed from consideration along with the endpoints. The left (right) child is placed above the median of the points to the left (right) of the root.]

[Figure 4: Next level down.]

[Figure 5: Final interval tree with all segments and endpoints shown.]
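
The construction steps and the query procedure described in these notes can be illustrated with a minimal Python sketch. It is not the implementation from the lecture. The names Node, build, query, by_left and by_right are hypothetical. Segments are assumed to be closed, X is taken to be the set of endpoints of the segments still under consideration (which also covers shared endpoints), and plain sorting replaces the linear-time median and the threaded trees, so the sketch does not aim for the O(n log n) preprocessing bound.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Segment = Tuple[float, float]  # closed segment (left, right), left <= right


@dataclass
class Node:
    x_m: float                 # median stored at the node
    by_left: List[Segment]     # segments stored here, left endpoints ascending
    by_right: List[Segment]    # same segments, right endpoints descending
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(segments: List[Segment]) -> Optional[Node]:
    """Recursive construction following steps 1 to 6 (assumption: sorting
    is used instead of a linear-time median)."""
    if not segments:
        return None
    # Assumption: X is the set of endpoints of the remaining segments.
    pts = sorted({p for seg in segments for p in seg})
    mid = len(pts) // 2
    # One median if the count is odd, otherwise the average of the two.
    x_m = pts[mid] if len(pts) % 2 else (pts[mid - 1] + pts[mid]) / 2
    here = [s for s in segments if s[0] <= x_m <= s[1]]
    node = Node(
        x_m,
        sorted(here, key=lambda s: s[0]),
        sorted(here, key=lambda s: s[1], reverse=True),
    )
    node.left = build([s for s in segments if s[1] < x_m])
    node.right = build([s for s in segments if s[0] > x_m])
    return node


def query(node: Optional[Node], q: float) -> List[Segment]:
    """Report every stored segment that contains the query point q."""
    out: List[Segment] = []
    while node is not None:
        if q <= node.x_m:
            # Right endpoints are >= x_m >= q, so only left endpoints matter.
            for s in node.by_left:
                if s[0] > q:
                    break  # halt at the first segment that misses q
                out.append(s)
            # If q equals the median, no segment in a subtree can contain it.
            node = node.left if q < node.x_m else None
        else:
            # Left endpoints are <= x_m < q, so only right endpoints matter.
            for s in node.by_right:
                if s[1] < q:
                    break
                out.append(s)
            node = node.right
    return out


segs = [(1, 4), (2, 9), (3, 5), (6, 8), (7, 12), (10, 11)]
root = build(segs)
print(sorted(query(root, 3.5)))  # [(1, 4), (2, 9), (3, 5)]
```

Keeping each node's segments in two orders is what lets a scan stop after at most one unsuccessful comparison per node, which is the source of the O(log n + k) query bound.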