Algorithms and Theory of Computation Handbook
Total Page:16
File Type:pdf, Size:1020Kb

Load more
Recommended publications
-
Interval Trees Storing and Searching Intervals
Interval Trees Storing and Searching Intervals • Instead of points, suppose you want to keep track of axis-aligned segments: • Range queries: return all segments that have any part of them inside the rectangle. • Motivation: wiring diagrams, genes on genomes Simpler Problem: 1-d intervals • Segments with at least one endpoint in the rectangle can be found by building a 2d range tree on the 2n endpoints. - Keep pointer from each endpoint stored in tree to the segments - Mark segments as you output them, so that you don’t output contained segments twice. • Segments with no endpoints in range are the harder part. - Consider just horizontal segments - They must cross a vertical side of the region - Leads to subproblem: Given a vertical line, find segments that it crosses. - (y-coords become irrelevant for this subproblem) Interval Trees query line interval Recursively build tree on interval set S as follows: Sort the 2n endpoints Let xmid be the median point Store intervals that cross xmid in node N intervals that are intervals that are completely to the completely to the left of xmid in Nleft right of xmid in Nright Another view of interval trees x Interval Trees, continued • Will be approximately balanced because by choosing the median, we split the set of end points up in half each time - Depth is O(log n) • Have to store xmid with each node • Uses O(n) storage - each interval stored once, plus - fewer than n nodes (each node contains at least one interval) • Can be built in O(n log n) time. • Can be searched in O(log n + k) time [k = # -
14 Augmenting Data Structures
14 Augmenting Data Structures Some engineering situations require no more than a “textbook” data struc- ture—such as a doubly linked list, a hash table, or a binary search tree—but many others require a dash of creativity. Only in rare situations will you need to cre- ate an entirely new type of data structure, though. More often, it will suffice to augment a textbook data structure by storing additional information in it. You can then program new operations for the data structure to support the desired applica- tion. Augmenting a data structure is not always straightforward, however, since the added information must be updated and maintained by the ordinary operations on the data structure. This chapter discusses two data structures that we construct by augmenting red- black trees. Section 14.1 describes a data structure that supports general order- statistic operations on a dynamic set. We can then quickly find the ith smallest number in a set or the rank of a given element in the total ordering of the set. Section 14.2 abstracts the process of augmenting a data structure and provides a theorem that can simplify the process of augmenting red-black trees. Section 14.3 uses this theorem to help design a data structure for maintaining a dynamic set of intervals, such as time intervals. Given a query interval, we can then quickly find an interval in the set that overlaps it. 14.1 Dynamic order statistics Chapter 9 introduced the notion of an order statistic. Specifically, the ith order statistic of a set of n elements, where i 2 f1;2;:::;ng, is simply the element in the set with the ith smallest key. -
Finding Neighbors in a Forest: a B-Tree for Smoothed Particle Hydrodynamics Simulations
Finding Neighbors in a Forest: A b-tree for Smoothed Particle Hydrodynamics Simulations Aurélien Cavelan University of Basel, Switzerland [email protected] Rubén M. Cabezón University of Basel, Switzerland [email protected] Jonas H. M. Korndorfer University of Basel, Switzerland [email protected] Florina M. Ciorba University of Basel, Switzerland fl[email protected] May 19, 2020 Abstract Finding the exact close neighbors of each fluid element in mesh-free computational hydrodynamical methods, such as the Smoothed Particle Hydrodynamics (SPH), often becomes a main bottleneck for scaling their performance beyond a few million fluid elements per computing node. Tree structures are particularly suitable for SPH simulation codes, which rely on finding the exact close neighbors of each fluid element (or SPH particle). In this work we present a novel tree structure, named b-tree, which features an adaptive branching factor to reduce the depth of the neighbor search. Depending on the particle spatial distribution, finding neighbors using b-tree has an asymptotic best case complexity of O(n), as opposed to O(n log n) for other classical tree structures such as octrees and quadtrees. We also present the proposed tree structure as well as the algorithms to build it and to find the exact close neighbors of all particles. arXiv:1910.02639v2 [cs.DC] 18 May 2020 We assess the scalability of the proposed tree-based algorithms through an extensive set of performance experiments in a shared-memory system. Results show that b-tree is up to 12× faster for building the tree and up to 1:6× faster for finding the exact neighbors of all particles when compared to its octree form. -
L11: Quadtrees CSE373, Winter 2020
L11: Quadtrees CSE373, Winter 2020 Quadtrees CSE 373 Winter 2020 Instructor: Hannah C. Tang Teaching Assistants: Aaron Johnston Ethan Knutson Nathan Lipiarski Amanda Park Farrell Fileas Sam Long Anish Velagapudi Howard Xiao Yifan Bai Brian Chan Jade Watkins Yuma Tou Elena Spasova Lea Quan L11: Quadtrees CSE373, Winter 2020 Announcements ❖ Homework 4: Heap is released and due Wednesday ▪ Hint: you will need an additional data structure to improve the runtime for changePriority(). It does not affect the correctness of your PQ at all. Please use a built-in Java collection instead of implementing your own. ▪ Hint: If you implemented a unittest that tested the exact thing the autograder described, you could run the autograder’s test in the debugger (and also not have to use your tokens). ❖ Please look at posted QuickCheck; we had a few corrections! 2 L11: Quadtrees CSE373, Winter 2020 Lecture Outline ❖ Heaps, cont.: Floyd’s buildHeap ❖ Review: Set/Map data structures and logarithmic runtimes ❖ Multi-dimensional Data ❖ Uniform and Recursive Partitioning ❖ Quadtrees 3 L11: Quadtrees CSE373, Winter 2020 Other Priority Queue Operations ❖ The two “primary” PQ operations are: ▪ removeMax() ▪ add() ❖ However, because PQs are used in so many algorithms there are three common-but-nonstandard operations: ▪ merge(): merge two PQs into a single PQ ▪ buildHeap(): reorder the elements of an array so that its contents can be interpreted as a valid binary heap ▪ changePriority(): change the priority of an item already in the heap 4 L11: Quadtrees CSE373, -
Search Trees
Lecture III Page 1 “Trees are the earth’s endless effort to speak to the listening heaven.” – Rabindranath Tagore, Fireflies, 1928 Alice was walking beside the White Knight in Looking Glass Land. ”You are sad.” the Knight said in an anxious tone: ”let me sing you a song to comfort you.” ”Is it very long?” Alice asked, for she had heard a good deal of poetry that day. ”It’s long.” said the Knight, ”but it’s very, very beautiful. Everybody that hears me sing it - either it brings tears to their eyes, or else -” ”Or else what?” said Alice, for the Knight had made a sudden pause. ”Or else it doesn’t, you know. The name of the song is called ’Haddocks’ Eyes.’” ”Oh, that’s the name of the song, is it?” Alice said, trying to feel interested. ”No, you don’t understand,” the Knight said, looking a little vexed. ”That’s what the name is called. The name really is ’The Aged, Aged Man.’” ”Then I ought to have said ’That’s what the song is called’?” Alice corrected herself. ”No you oughtn’t: that’s another thing. The song is called ’Ways and Means’ but that’s only what it’s called, you know!” ”Well, what is the song then?” said Alice, who was by this time completely bewildered. ”I was coming to that,” the Knight said. ”The song really is ’A-sitting On a Gate’: and the tune’s my own invention.” So saying, he stopped his horse and let the reins fall on its neck: then slowly beating time with one hand, and with a faint smile lighting up his gentle, foolish face, he began.. -
Skip-Webs: Efficient Distributed Data Structures for Multi-Dimensional Data Sets
Skip-Webs: Efficient Distributed Data Structures for Multi-Dimensional Data Sets Lars Arge David Eppstein Michael T. Goodrich Dept. of Computer Science Dept. of Computer Science Dept. of Computer Science University of Aarhus University of California University of California IT-Parken, Aabogade 34 Computer Science Bldg., 444 Computer Science Bldg., 444 DK-8200 Aarhus N, Denmark Irvine, CA 92697-3425, USA Irvine, CA 92697-3425, USA large(at)daimi.au.dk eppstein(at)ics.uci.edu goodrich(at)acm.org ABSTRACT name), a prefix match for a key string, a nearest-neighbor We present a framework for designing efficient distributed match for a numerical attribute, a range query over various data structures for multi-dimensional data. Our structures, numerical attributes, or a point-location query in a geomet- which we call skip-webs, extend and improve previous ran- ric map of attributes. That is, we would like the peer-to-peer domized distributed data structures, including skipnets and network to support a rich set of possible data types that al- skip graphs. Our framework applies to a general class of data low for multiple kinds of queries, including set membership, querying scenarios, which include linear (one-dimensional) 1-dim. nearest neighbor queries, range queries, string prefix data, such as sorted sets, as well as multi-dimensional data, queries, and point-location queries. such as d-dimensional octrees and digital tries of character The motivation for such queries include DNA databases, strings defined over a fixed alphabet. We show how to per- location-based services, and approximate searches for file form a query over such a set of n items spread among n names or data titles. -
Octree Generating Networks: Efficient Convolutional Architectures for High-Resolution 3D Outputs
Octree Generating Networks: Efficient Convolutional Architectures for High-resolution 3D Outputs Maxim Tatarchenko1 Alexey Dosovitskiy1,2 Thomas Brox1 [email protected] [email protected] [email protected] 1University of Freiburg 2Intel Labs Abstract Octree Octree Octree dense level 1 level 2 level 3 We present a deep convolutional decoder architecture that can generate volumetric 3D outputs in a compute- and memory-efficient manner by using an octree representation. The network learns to predict both the structure of the oc- tree, and the occupancy values of individual cells. This makes it a particularly valuable technique for generating 323 643 1283 3D shapes. In contrast to standard decoders acting on reg- ular voxel grids, the architecture does not have cubic com- Figure 1. The proposed OGN represents its volumetric output as an octree. Initially estimated rough low-resolution structure is gradu- plexity. This allows representing much higher resolution ally refined to a desired high resolution. At each level only a sparse outputs with a limited memory budget. We demonstrate this set of spatial locations is predicted. This representation is signifi- in several application domains, including 3D convolutional cantly more efficient than a dense voxel grid and allows generating autoencoders, generation of objects and whole scenes from volumes as large as 5123 voxels on a modern GPU in a single for- high-level representations, and shape from a single image. ward pass. large cell of an octree, resulting in savings in computation 1. Introduction and memory compared to a fine regular grid. At the same 1 time, fine details are not lost and can still be represented by Up-convolutional decoder architectures have become small cells of the octree. -
Evaluation of Spatial Trees for Simulation of Biological Tissue
Evaluation of spatial trees for simulation of biological tissue Ilya Dmitrenok∗, Viktor Drobnyy∗, Leonard Johard and Manuel Mazzara Innopolis University Innopolis, Russia ∗ These authors contributed equally to this work. Abstract—Spatial organization is a core challenge for all large computation time. The amount of cells practically simulated on agent-based models with local interactions. In biological tissue a single computer generally stretches between a few thousands models, spatial search and reinsertion are frequently reported to a million, depending on detail level, hardware and simulated as the most expensive steps of the simulation. One of the main methods utilized in order to maintain both favourable algorithmic time range. complexity and accuracy is spatial hierarchies. In this paper, we In center-based models, movement and neighborhood de- seek to clarify to which extent the choice of spatial tree affects tection remains one of the main performance bottlenecks [2]. performance, and also to identify which spatial tree families are Performances in these bottlenecks are deeply intertwined with optimal for such scenarios. We make use of a prototype of the the spatial structure chosen for the simulation. new BioDynaMo tissue simulator for evaluating performances as well as for the implementation of the characteristics of several B. BioDynaMo different trees. The BioDynaMo project [3] is developing a new general I. INTRODUCTION platform for computer simulations of biological tissue dynam- The high pace of neuroscientific research has led to a ics, with a brain development as a primary target. The platform difficult problem in synthesizing the experimental results into should be executable on hybrid cloud computing systems, effective new hypotheses. -
The Peano Software—Parallel, Automaton-Based, Dynamically Adaptive Grid Traversals
The Peano software|parallel, automaton-based, dynamically adaptive grid traversals Tobias Weinzierl ∗ December 4, 2018 Abstract We discuss the design decisions, design alternatives and rationale behind the third generation of Peano, a framework for dynamically adaptive Cartesian meshes derived from spacetrees. Peano ties the mesh traversal to the mesh storage and supports only one element-wise traversal order resulting from space-filling curves. The user is not free to choose a traversal order herself. The traversal can exploit regular grid subregions and shared memory as well as distributed memory systems with almost no modifications to a serial application code. We formalize the software design by means of two interacting automata|one automaton for the multiscale grid traversal and one for the application- specific algorithmic steps. This yields a callback-based programming paradigm. We further sketch the supported application types and the two data storage schemes real- ized, before we detail high-performance computing aspects and lessons learned. Special emphasis is put on observations regarding the used programming idioms and algorith- mic concepts. This transforms our report from a \one way to implement things" code description into a generic discussion and summary of some alternatives, rationale and design decisions to be made for any tree-based adaptive mesh refinement software. 1 Introduction Dynamically adaptive grids are mortar and catalyst of mesh-based scientific computing and thus important to a large range of scientific and engineering applications. They enable sci- entists and engineers to solve problems with high accuracy as they invest grid entities and computational effort where they pay off most. -
The Skip Quadtree: a Simple Dynamic Data Structure for Multidimensional Data
The Skip Quadtree: A Simple Dynamic Data Structure for Multidimensional Data David Eppstein† Michael T. Goodrich† Jonathan Z. Sun† Abstract We present a new multi-dimensional data structure, which we call the skip quadtree (for point data in R2) or the skip octree (for point data in Rd , with constant d > 2). Our data structure combines the best features of two well-known data structures, in that it has the well-defined “box”-shaped regions of region quadtrees and the logarithmic-height search and update hierarchical structure of skip lists. Indeed, the bottom level of our structure is exactly a region quadtree (or octree for higher dimensional data). We describe efficient algorithms for inserting and deleting points in a skip quadtree, as well as fast methods for performing point location and approximate range queries. 1 Introduction Data structures for multidimensional point data are of significant interest in the computational geometry, computer graphics, and scientific data visualization literatures. They allow point data to be stored and searched efficiently, for example to perform range queries to report (possibly approximately) the points that are contained in a given query region. We are interested in this paper in data structures for multidimensional point sets that are dynamic, in that they allow for fast point insertion and deletion, as well as efficient, in that they use linear space and allow for fast query times. Related Previous Work. Linear-space multidimensional data structures typically are defined by hierar- chical subdivisions of space, which give rise to tree-based search structures. That is, a hierarchy is defined by associating with each node v in a tree T a region R(v) in Rd such that the children of v are associated with subregions of R(v) defined by some kind of “cutting” action on R(v). -
2 Dynamization
6.851: Advanced Data Structures Spring 2010 Lecture 4 — February 11, 2010 Prof. Andr´eSchulz Scribe: Peter Caday 1 Overview In the previous lecture, we studied range trees and kd-trees, two structures which support efficient orthogonal range queries on a set of points. In this second of four lectures on geometric data structures, we will tie up a loose end from the last lecture — handling insertions and deletions. The main topic, however, will be two new structures addressing the vertical line stabbing problem: interval trees and segment trees. We will conclude with an application of vertical line stabbing to a windowing problem. 2 Dynamization In our previous discussion of range trees, we assumed that the point set was static. On the other hand, we may be interested in supporting insertions and deletions as time progresses. Unfortunately, it is not trivial to modify our data structures to handle this dynamic scenario. We can, however, dynamize them using general techniques; the techniques and their application here are due to Overmars [1]. Idea. Overmars’ dynamization is based on the idea of decomposable searches. A search (X,q) for the value q among the keys X is decomposable if the search result can be obtained in O(1) time from the results of searching (X1,q) and (X2,q), where X1 ∪ X2 = X. For instance, in the 2D kd-tree, the root node splits R2 into two halves. If we want to search for all points lying in a rectangle q, we may do this by combining the results of searching for q in each half. -
Lecture Notes of CSCI5610 Advanced Data Structures
Lecture Notes of CSCI5610 Advanced Data Structures Yufei Tao Department of Computer Science and Engineering Chinese University of Hong Kong July 17, 2020 Contents 1 Course Overview and Computation Models 4 2 The Binary Search Tree and the 2-3 Tree 7 2.1 The binary search tree . .7 2.2 The 2-3 tree . .9 2.3 Remarks . 13 3 Structures for Intervals 15 3.1 The interval tree . 15 3.2 The segment tree . 17 3.3 Remarks . 18 4 Structures for Points 20 4.1 The kd-tree . 20 4.2 A bootstrapping lemma . 22 4.3 The priority search tree . 24 4.4 The range tree . 27 4.5 Another range tree with better query time . 29 4.6 Pointer-machine structures . 30 4.7 Remarks . 31 5 Logarithmic Method and Global Rebuilding 33 5.1 Amortized update cost . 33 5.2 Decomposable problems . 34 5.3 The logarithmic method . 34 5.4 Fully dynamic kd-trees with global rebuilding . 37 5.5 Remarks . 39 6 Weight Balancing 41 6.1 BB[α]-trees . 41 6.2 Insertion . 42 6.3 Deletion . 42 6.4 Amortized analysis . 42 6.5 Dynamization with weight balancing . 43 6.6 Remarks . 44 1 CONTENTS 2 7 Partial Persistence 47 7.1 The potential method . 47 7.2 Partially persistent BST . 48 7.3 General pointer-machine structures . 52 7.4 Remarks . 52 8 Dynamic Perfect Hashing 54 8.1 Two random graph results . 54 8.2 Cuckoo hashing . 55 8.3 Analysis . 58 8.4 Remarks . 59 9 Binomial and Fibonacci Heaps 61 9.1 The binomial heap .