Point Enclosure and the Interval Tree

C.S. 252: Computational Geometry
Prof. Roberto Tamassia
Sem. II, 1992-1993
Lecture 8: Point Enclosure and the Interval Tree
Date: March 3, 1993
Scribe: Dzung T. Hoang

Point Enclosure

We consider the 1-D point enclosure problem: given a collection of segments on a line and a query point, find the segments that contain the query point. In a previous lecture, we saw a solution using the segment tree data structure that requires O(n log n) space and O(log n + k) time, where n is the number of segments and k is the number of segments reported. In this lecture, we introduce a data structure that solves the 1-D point enclosure problem using O(n) space and O(log n + k) time.

Interval Tree

The data structure we introduce is called the interval tree. Given a set X of points on the line and a set S of segments with endpoints in X, the interval tree is a binary tree that stores the segments in its internal and leaf nodes. An internal node may hold no segments, but a leaf node contains at least one segment. Each segment is stored in exactly one node.

It is easiest to describe an interval tree by showing how one is constructed. Below are the steps for constructing an interval tree given X and S, with the restriction that segments have distinct endpoints. We later discuss how to remove this restriction.

1. Find the median, x_m, of X. (If there are two medians, let x_m be their average.)
2. Draw the root above x_m.
3. Store x_m at the root.
4. Store all segments in S that contain x_m at the root.
5. Remove the above segments from S and their endpoints from X.
6. Recursively construct the subtree for the left (right) child by considering only the segments with both endpoints to the left (right) of x_m.

To illustrate the process, we go through the step-by-step construction of an interval tree for 8 segments (Figures 1 to 5 in the Appendix). At each step, the tree constructed so far is shown along with the remaining segments and endpoints. Each node is annotated with the set of segments it contains.
The recursive constructions of the left and right subtrees are shown in the same figure to reduce the number of figures required.

As promised earlier, we show how to handle the case where segments share common endpoints. Only step (5) is affected. After removing the segments containing x_m from S, we remove from X only those elements that are not endpoints of any segment remaining in S.

Query

For query processing, each node is augmented with two linked lists: one stores the left endpoints of the segments associated with the node, sorted left-to-right, and the other stores their right endpoints, sorted right-to-left. Every segment stored at a node contains the node's median, so each list starts with the endpoint farthest from the median.

When performing a point enclosure query, the interval tree is treated as a binary search tree: the query point is compared against the median x-coordinate stored at each node. At each node visited, the query point is checked for enclosure in the segments stored at the node. This check is done by scanning one of the two sorted lists. If the query point lies to the right of the median, the next node to be visited is the right child and the list of right endpoints is scanned right-to-left. Similarly, if the next node to be visited is the left child, the list of left endpoints is scanned left-to-right. Each scan halts at the first segment encountered that does not contain the query point. At each node, the number of endpoints scanned is therefore at most one more than the number of segments reported there.

Since the tree is constructed by successively dividing the endpoints into two sets of about equal size, the height of the tree is O(log n). A query visits one node per level and spends constant time at each node plus time proportional to the number of segments reported there. If k segments enclose the query point, the total query time is O(log n + k).

Each segment is stored in exactly one node of the tree, and the amount of data stored for each segment is constant. Therefore the space requirement is O(n).

Preprocessing

We now examine the time needed to construct the tree. In the worst case, the tree will contain n leaves and have 2n - 1 nodes.
Since the number of endpoints considered by each node is reduced by about half with every recursion, a node on level m (with the root on level 0) considers at most (2n + 1)/2^m endpoints. Step (1) requires finding the median of the set of endpoints under consideration. The median of a set of size n can be found in O(n) time. Noting that there may be at most 2^m nodes on level m, the total time required to execute step (1) for all the nodes created can be bounded by the following sum:

$$\sum_{m=0}^{\log n} 2^m \cdot \frac{2n+1}{2^m} = (\log n + 1)(2n + 1) = O(n \log n).$$

The sorted lists of left and right endpoints of the segments stored at each node can be constructed in O(n log n) time as follows:

1. Create two sorted binary trees, one for all left endpoints and one for all right endpoints. Provide threads for left-to-right and right-to-left orderings. This takes O(n log n) time.
2. In the construction of the linked list of left (right) endpoints at each node, search the corresponding binary tree for x_m and follow the left-to-right (right-to-left) threads to retrieve the endpoints. In this step, also "remove" the reported endpoints by updating the threads. Each binary search takes O(log n) time. Since there may be O(n) such searches, the total search time is O(n log n).

The total time required to construct the interval tree is O(n log n).

References

[1] Yi-Jen Chiang and Roberto Tamassia, "Dynamic Algorithms in Computational Geometry," Proceedings of the IEEE, vol. 80, no. 9, pp. 1412-1434, 1992.
[2] Franco P. Preparata and Michael I. Shamos, Computational Geometry, Springer-Verlag, 1985.

Figure 1: Initial set of segments and endpoints. (Drawing: segments A to H and their 16 endpoints.)

Figure 2: The root node is added above the average of the two median points. A ray is projected from the root to mark those segments to be added to the root. (Drawing: the root is annotated with B, H.)
Figure 3: The segments intersecting the ray projected from the root are added to the root and then removed from consideration along with the endpoints. The left (right) child is placed above the median of the points to the left (right) of the root. (Drawing: root annotated with B, H; its children are annotated with A and C; segments A, C, D, E, F, G remain.)

Figure 4: Next level down. (Drawing: four nodes are added below the children of the root; segments D, E, F, G remain.)

Figure 5: Final interval tree with all segments and endpoints shown. (Drawing: root annotated with B, H; its children with A and C; segments D, E, F, G are stored on the level below.)
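The construction steps and the query procedure described in these notes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation analyzed above. The names `Node`, `build`, `query`, `by_left`, and `by_right` are hypothetical. The set X is recomputed from the endpoints of the segments still under consideration, which also covers shared endpoints. Plain Python lists and sorting stand in for the linked lists, the linear-time median, and the threaded trees, so this sketch does not achieve the O(n log n) preprocessing bound. Each node keeps its segments ordered by increasing left endpoint and by decreasing right endpoint, so every scan starts at the endpoint farthest from the median and can stop at the first segment that misses the query point.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Segment = Tuple[float, float]  # closed segment (left, right) with left <= right


@dataclass
class Node:
    median: float
    by_left: List[Segment]   # segments containing median, increasing left endpoint
    by_right: List[Segment]  # same segments, decreasing right endpoint
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(segments: List[Segment]) -> Optional[Node]:
    """Build an interval tree; returns None for an empty set of segments."""
    if not segments:
        return None
    # X: the distinct endpoints of the segments still under consideration.
    xs = sorted({x for seg in segments for x in seg})
    mid = len(xs) // 2
    # Step 1: the median of X, or the average of the two medians.
    median = xs[mid] if len(xs) % 2 else (xs[mid - 1] + xs[mid]) / 2
    # Step 4: segments containing the median are stored at this node.
    here = [s for s in segments if s[0] <= median <= s[1]]
    node = Node(
        median,
        sorted(here, key=lambda s: s[0]),
        sorted(here, key=lambda s: s[1], reverse=True),
    )
    # Step 6: recurse on segments lying entirely to one side of the median.
    node.left = build([s for s in segments if s[1] < median])
    node.right = build([s for s in segments if s[0] > median])
    return node


def query(node: Optional[Node], q: float) -> List[Segment]:
    """Report every stored segment that contains the point q."""
    found: List[Segment] = []
    while node is not None:
        if q < node.median:
            # Every segment here ends at or after the median, so only the
            # left endpoint decides; stop at the first one right of q.
            for s in node.by_left:
                if s[0] > q:
                    break
                found.append(s)
            node = node.left
        elif q > node.median:
            # Symmetric case: stop at the first right endpoint left of q.
            for s in node.by_right:
                if s[1] < q:
                    break
                found.append(s)
            node = node.right
        else:
            # q equals the median: all segments here contain q, and no
            # segment in either subtree can.
            found.extend(node.by_left)
            break
    return found


# Usage with made-up coordinates (not the segments of Figures 1 to 5).
tree = build([(1, 4), (2, 9), (3, 5), (6, 8), (7, 12), (10, 11)])
print(query(tree, 5))  # [(2, 9), (3, 5)]
```

Keeping both orderings at each node stores every segment twice, but still a constant number of times, so the space remains linear in the number of segments.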
