Design and Analysis of Data Structures for Dynamic Trees

Renato F. Werneck

A Dissertation Presented to the Faculty of Princeton University in Candidacy for the Degree of Doctor of Philosophy

Recommended for Acceptance by the Department of Computer Science

June, 2006

© Copyright 2006 by Renato F. Werneck. All rights reserved.

Abstract

The dynamic trees problem is that of maintaining a forest that changes over time through edge insertions and deletions. We can associate data with vertices or edges and manipulate this data, individually or in bulk, with operations that deal with whole paths or trees. Efficient solutions to this problem have numerous applications, particularly in algorithms for network flows and dynamic graphs in general.

Several data structures capable of logarithmic-time dynamic tree operations have been proposed. The first was Sleator and Tarjan's ST-tree, which represents a partition of the tree into paths. Although reasonably fast in practice, adapting ST-trees to different applications is nontrivial. Frederickson's topology trees, Alstrup et al.'s top trees, and Acar et al.'s RC-trees are based on tree contractions: they progressively combine vertices or edges to obtain a hierarchical representation of the tree. This approach is more flexible in theory, but all known implementations assume the trees have bounded degree; arbitrary trees are supported only after ternarization.

This thesis shows how these two approaches can be combined (with very little overhead) to produce a data structure that is at least as generic as any other, very easy to adapt, and as practical as ST-trees. It can be seen as a self-adjusting implementation of top trees and provides a logarithmic bound per operation in the amortized sense. We also discuss a pure contraction-based implementation of top trees, which is more involved but guarantees a logarithmic bound in the worst case.
Finally, an experimental evaluation of these two data structures, including a comparison with previous methods, is presented.

Acknowledgements

I am deeply indebted to my advisor, Bob Tarjan, for his guidance and patience. I thought highly of him before I came to Princeton, but now I realize it was not nearly enough. Working with him was a privilege and, above all, a pleasure.

I thank my readers, Adam Buchsbaum and Bernard Chazelle, for their numerous comments on this dissertation. I also thank Robert Sedgewick and Nicholas Pippenger for taking their time to participate in the thesis committee, and for their questions and suggestions during the presentation of the thesis proposal.

I had important discussions about dynamic trees with Umut Acar, Guy Blelloch, Jorge Vittes, and Loukas Georgiadis. I thank Phil Klein for his many helpful comments on the self-adjusting data structure presented in Chapter 4. Mikkel Thorup and Jakob Holm are coauthors (with Bob Tarjan and me) of the worst-case data structure presented in Chapter 3. Diego Nehab has provided me with the computational resources for conducting the experiments described in Chapter 5 (i.e., he let me borrow his computer). Kevin Wayne generously taught me how to use the binding machine, all in exchange for a mere acknowledgement. I am eternally indebted to him.

I thank everybody at Microsoft Research Silicon Valley for giving me the peace of mind to write this dissertation. There is nothing like having a job.

I would also like to thank my coauthors in projects I worked on while at Princeton but that are not part of this dissertation. In particular, I thank Ricardo Fukasawa, Loukas Georgiadis, Andrew Goldberg, Haim Kaplan, Jens Lysgaard, Marcus Poggi de Aragão, Mauricio Resende, and Eduardo Uchoa. As great as dynamic trees are, it is always nice to work on other topics.
I thank the administrative staff at the Computer Science department, in particular Melissa Lawson and Mitra Kelly, for shielding me from the bureaucracy of real life.

During more than a decade as a Computer Science student, I was fortunate to have great mentors and advisors: João Carlos Setubal at Unicamp, Marcus Poggi de Aragão at PUC-Rio, Mauricio Resende at AT&T Labs Research, Andrew Goldberg at Microsoft Research Silicon Valley, and, of course, Bob Tarjan at Princeton. I cannot thank them enough for their guidance.

On a personal note, I thank my friends for making my years at Princeton unforgettable. This includes Adrian, Diego, Diogo, Loukas, Thomas, Tony, and especially anyone who is considering not talking to me anymore just because I did not mention you by name.

Most importantly, I dedicate this dissertation to my parents, Dorothea and Rogério. Without them I would be nothing, literally.

My work was funded by Princeton University and the Aladdin Project (National Science Foundation grant no. CCR-0122581). Additional summer funding (through internships) was provided by AT&T (in 2002 and 2003) and Microsoft (in 2004 and 2005).

Contents

Abstract
Acknowledgements
Contents
List of Figures
List of Tables

1 Introduction
2 Existing Data Structures
  2.1 Path Decomposition
    2.1.1 Representation
    2.1.2 Updating the Tree
    2.1.3 Dealing with Values
    2.1.4 Undirected Trees
    2.1.5 Aggregating Information over Trees
    2.1.6 Other Extensions
  2.2 Tree Contraction
    2.2.1 The Parallel Setting
    2.2.2 Topology Trees
    2.2.3 RC-Trees
    2.2.4 Top Trees
  2.3 Euler Tours
3 Contraction-Based Top Trees
  3.1 Number of Levels
  3.2 Updating the Contraction
    3.2.1 Updates: Basic Notions
    3.2.2 Proof Outline
    3.2.3 Replicated Moves
    3.2.4 Stable Subtours
    3.2.5 Unstable Subtours
    3.2.6 Running Time
  3.3 Implementation
    3.3.1 Representation
    3.3.2 Identifying Valid Moves
    3.3.3 Updating the Tree
    3.3.4 Other Details
    3.3.5 Implementing Expose
  3.4 Alternative Design Choices
    3.4.1 No Circular Order
    3.4.2 Back Rakes
    3.4.3 Alternating Rounds
    3.4.4 Randomization
4 Self-Adjusting Top Trees
  4.1 Representation
    4.1.1 Order within Binary Trees
    4.1.2 Handles
  4.2 Updates
    4.2.1 Soft Expose
    4.2.2 Hard Expose
    4.2.3 Cuts
    4.2.4 Links
    4.2.5 Implementation Issues
  4.3 Analysis
  4.4 Alternative Representations
    4.4.1 Possible Simplifications
    4.4.2 Unit Trees
  4.5 Path Decomposition and Tree Contraction
    4.5.1 Contraction to Decomposition
    4.5.2 Decomposition to Contraction
  4.6 Final Remarks
5 Experimental Analysis
  5.1 Experimental Setup
  5.2 Data Structures
    5.2.1 ST-trees
    5.2.2 ET-trees
    5.2.3 Top Trees
  5.3 Maximum Flows
    5.3.1 Basics
    5.3.2 The Shortest Augmenting Path Algorithm
    5.3.3 Experimental Results
  5.4 Online Minimum Spanning Forests
    5.4.1 The Algorithm
    5.4.2 Experimental Setup
    5.4.3 Random Graphs
    5.4.4 Circular Meshes
    5.4.5 High-Degree Vertices
    5.4.6 Memory Usage and Cache Effects
  5.5 Single-Source Shortest Paths
    5.5.1 Algorithm
    5.5.2 Experiments
  5.6 Random Operations
  5.7 Previous Work
  5.8 Final Remarks
6 Final Remarks
References

List of Figures

2.1 Example of an ST-tree
2.2 Ternarization
2.3 Topology tree
2.4 A free tree
2.5 Top trees: Basic operations
2.6 A contraction and the corresponding top tree
2.7 A top tree without dummy nodes
2.8 An Euler tour
2.9 A binary tree representing an Euler tour
3.1 Bad configurations
3.2 Good configurations
