Design of Data Structures for Mergeable Trees∗

Loukas Georgiadis1, 2 Robert E. Tarjan1, 3 Renato F. Werneck1

Abstract

Motivated by an application in computational topology, we consider a novel variant of the problem of efficiently maintaining dynamic rooted trees. This variant allows an operation that merges two tree paths. In contrast to the standard problem, in which only one tree arc at a time changes, a single merge operation can change many arcs. In spite of this, we develop a data structure that supports merges and all other standard tree operations in O(log² n) amortized time on an n-node forest. For the special case that occurs in the motivating application, in which arbitrary arc deletions are not allowed, we give a data structure with an O(log n) amortized time bound per operation, which is asymptotically optimal. The analysis of both algorithms is not straightforward and requires ideas not previously used in the study of dynamic trees. We explore the design space of algorithms for the problem and also consider lower bounds for it.

∗ Research partially supported by the Aladdin Project, NSF Grant No 112-0188-1234-12.
1 Dept. of Computer Science, Princeton University, Princeton, NJ 08544. {lgeorgia,ret,rwerneck}@cs.princeton.edu.
2 Current address: Dept. of Computer Science, University of Aarhus, IT-parken, Aabogade 34, DK-8200 Aarhus N, Denmark.
3 Office of Strategy and Technology, Hewlett-Packard, Palo Alto, CA 94304.

1 Introduction and Overview

A heap-ordered forest is a set of node-disjoint rooted trees, each node v of which has a real-valued label ℓ(v) satisfying heap order: for every node v with parent p(v), ℓ(v) ≥ ℓ(p(v)). We consider the problem of maintaining a heap-ordered forest, initially empty, subject to a sequence of the following kinds of operations:

• parent(v): Given a node v, return its parent, or null if it is a root.

• nca(v, w): Given nodes v and w, return the nearest common ancestor of v and w, or null if v and w are in different trees.

• make(v, x): Create a new, one-node tree consisting of node v with label x.

• link(v, w): Given a root v and a node w such that ℓ(v) ≥ ℓ(w) and w is not a descendent of v, combine the trees containing v and w by making w the parent of v.

• delete(v): Delete leaf v from the forest.

• merge(v, w): Given nodes v and w, let P be the path from v to the root of its tree, and let Q be the path from w to the root of its tree. Restructure the tree or trees containing v and w by merging the paths P and Q in a way that preserves the heap order. The merge order is unique if all node labels are distinct, which we can assume without loss of generality: if necessary, we can break ties by node identifier. See Figure 1.

Figure 1: Two successive merges (merge(6,11), then merge(7,8)). The nodes are identified by label.

We call this the mergeable trees problem. In stating complexity bounds we denote by n the number of make operations (the total number of nodes) and by m the number of operations of all kinds; we assume n ≥ 2.

Our motivating application is an algorithm of Agarwal et al. [2] that computes the structure of 2-manifolds embedded in R³. In this application, the tree nodes are the critical points of the manifold (local minima, local maxima, and saddle points), with labels equal to their heights. The algorithm computes the critical points and their heights during a sweep of the manifold, and pairs up the critical points into so-called critical pairs using mergeable tree operations. This use of mergeable trees is actually a special case of the problem: parent, nca, and merge are applied only to leaves, link is used only to attach a leaf to a tree, and each nca(v, w) is followed immediately by merge(v, w). None of these restrictions simplifies the problem significantly except for the restriction of merges to leaves.
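To pin down these semantics before any efficiency concerns, the operations can be prototyped naively: each node stores only a parent pointer, every operation walks whole root paths, and merge simply relinks the union of two root paths in label order. This sketch is our own illustration, not the paper's data structure; all function names are ours, and delete is omitted since it merely detaches a leaf.

```python
class Node:
    def __init__(self, label):
        self.label = label
        self.parent = None   # parent(v) is just v.parent

def make(label):
    return Node(label)

def link(v, w):
    # v must be a root with l(v) >= l(w); w must not descend from v.
    assert v.parent is None and v.label >= w.label
    v.parent = w

def root_path(v):
    # P[v, null): the path from v to the root of its tree.
    p = [v]
    while p[-1].parent is not None:
        p.append(p[-1].parent)
    return p

def nca(v, w):
    # Bottom-most common vertex of the two root paths (None if disjoint).
    on_pv = set(map(id, root_path(v)))
    return next((x for x in root_path(w) if id(x) in on_pv), None)

def merge(v, w):
    # Relink the union of the two root paths in decreasing label order,
    # which preserves heap order (labels are assumed distinct).
    nodes = list({id(x): x for x in root_path(v) + root_path(w)}.values())
    nodes.sort(key=lambda x: -x.label)
    for child, par in zip(nodes, nodes[1:]):
        child.parent = par

# Demo: root 1 with children 2 and 3; 4 under 2, 5 under 3.
a, b, c, d, e = make(1), make(2), make(3), make(4), make(5)
link(b, a); link(c, a); link(d, b); link(e, c)
assert nca(d, e) is a
merge(d, e)   # merged path, bottom to top: 5, 4, 3, 2, 1
```

Every operation here costs time proportional to the path lengths involved; the point of the paper is to support exactly these semantics in polylogarithmic amortized time.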
Since each such merge eliminates a leaf, there can be at most n − 1 merges. Indeed, in the 2-manifold application the total number of operations, as well as the number of merges, is O(n). On the other hand, if arbitrary merges can occur, the number of merges can be Θ(n²).

The mergeable trees problem is a new variant of the well-studied dynamic trees problem. This problem calls for the maintenance of a forest of rooted trees subject to make and link operations as well as operations of the following kind:

• cut(v): Given a nonroot node v, make it a root by deleting the arc connecting v to its parent, thereby breaking its tree in two.

The trees are not (necessarily) heap-ordered and merges are not supported as single operations. Instead, each node and/or arc has some associated value or values that can be accessed and changed one node and/or arc at a time, an entire path at a time, or even an entire tree at a time. For example, in the original application of dynamic trees [21], which was to network flows, each arc has an associated real value representing a residual capacity. The maximum value of all arcs on a path can be computed in a single operation, and a constant can be added to all arc values on a path in a single operation. There are several known versions of the dynamic trees problem that differ in what kinds of values are allowed, whether values can be combined over paths or over entire trees at a time, whether the trees are unrooted, rooted, or ordered, and what operations are allowed. For all these versions of the problem, there are algorithms that perform a sequence of tree operations in O(log n) time per operation, either amortized [22, 25], worst-case [3, 12, 21], or randomized [1].

The main novelty, and the main difficulty, in the mergeable trees problem is the merge operation. Although dynamic trees support global operations on node and arc values, the underlying trees change only one arc at a time, by links and cuts. In contrast, a merge operation can delete and add many arcs (even Θ(n)) simultaneously, thereby causing global structural change. Nevertheless, we have developed an algorithm that performs a sequence of mergeable tree operations in O(log n) amortized time per operation, matching the best bound known for dynamic trees. The time bound depends on the absence of arbitrary cut operations, but the algorithm can handle all the other operations that have been proposed for dynamic (rooted) trees. A variant of the algorithm can handle arbitrary cuts as well, but the amortized time bound per operation becomes O(log² n). Both the design and the analysis of our algorithms require new ideas that have not been previously used in the study of the standard dynamic trees problem.

This paper is organized as follows. Section 2 introduces some terminology used to describe our algorithms. Section 3 shows how to extend Sleator and Tarjan's data structure for dynamic trees to solve the mergeable trees problem in O(log² n) amortized time per operation. Section 4 presents an alternative data structure that supports all operations except cut in O(log n) time. Section 5 discusses lower bounds, and Section 6 contains some final remarks.

2 Terminology

We consider forests of rooted trees, heap-ordered with respect to distinct node labels. We view arcs as being directed from child to parent, so that paths lead from leaves to roots. A vertex v is a descendent of a vertex w (and w is an ancestor of v) if the path from v to the root of its tree contains w. (This includes the case v = w.) If v is neither an ancestor nor a descendent of w, then v and w are unrelated. We denote by size(v) the number of descendents of v, including v itself. The path from a vertex v to a vertex w is denoted by P[v, w]. We denote by P[v, w) the subpath of P[v, w] from v to the child of w on the path; if v = w, this path is empty. By extension, P[v, null) is the path from v to the root of its tree. Similarly, P(v, w] is the subpath of P[v, w] from its second node to w, empty if v = w. Along a path P, labels strictly decrease. We denote by bottom(P) and top(P) the first and last vertices of P, respectively. The length of P, denoted by |P|, is the number of arcs on P. The nearest common ancestor of two vertices v and w is bottom(P[v, null) ∩ P[w, null)); if the intersection is empty (that is, if v and w are in different trees), then their nearest common ancestor is null.

3 Mergeable Trees via Dynamic Trees

The dynamic tree (or link-cut tree) data structure of Sleator and Tarjan [21, 22] represents a forest to which arcs can be added (by the link operation) and removed (by the cut operation). These operations can happen in any order, as long as the existing arcs define a forest. The primary goal of the data structure is to allow efficient manipulation of paths. To this end, the structure implicitly maintains a partition of the arcs into solid and dashed. The solid arcs partition the forest into node-disjoint paths, which are interconnected by the dashed arcs. To manipulate a specific path P[v, null), this path must first be exposed, which is done by making all its arcs solid and all its incident arcs dashed. Such an operation is called exposing v.
An expose operation is performed by doing a sequence of split and join operations on paths. The operation split(Q, x) is given a path Q = P[v, w] and a node x on Q. It splits Q into R = P[v, x] and S = P(x, w] and then returns the pair (R, S). The inverse operation, join(R, S), is given two node-disjoint paths R and S. It catenates them (with R preceding S) and then returns the catenated path. Sleator and Tarjan proved that a sequence of expose operations can be done in an amortized number of O(log n) joins and splits per expose. Each operation on the dynamic tree structure requires a constant number of expose operations.

To allow join and split operations (and other operations on solid paths) to be done in sublinear time, the algorithm represents the actual forest by a virtual forest, containing the same nodes as the actual forest but with different structure. Each solid path P is represented by a binary tree, with bottom-to-top order along the path corresponding to symmetric order in the tree. Additionally, the root of the binary tree representing P has a virtual parent equal to the parent of top(P) in the actual forest (unless top(P) is a root in the actual forest). The arcs joining roots of binary trees to their virtual parents represent the dashed arcs in the actual forest. For additional details see [22].

The running time of the dynamic tree operations depends on the type of binary tree used to represent the solid paths. Ordinary balanced trees (such as red-black trees [15, 23]) suffice to obtain an amortized O(log² n) time bound per dynamic tree operation: each join or split takes O(log n) time. Splay trees [23, 22] give a better amortized bound of O(log n), because the splay tree amortization can be combined with the expose amortization to save a log factor. Sleator and Tarjan also showed [21] that the use of biased trees [5] gives an O(log n) time bound per dynamic tree operation, amortized for locally biased trees and worst-case for globally biased trees. The use of biased trees results in a more complicated data structure than the use of either balanced trees or splay trees, however.

Merge operations can be performed on a virtual tree representation in a straightforward way. To perform merge(v, w), we first expose v and then expose w. As Sleator and Tarjan show [21], the nearest common ancestor u of v and w can be found during the exposure of w, without changing the running time by more than a constant factor. Having found u, we expose it. Now P[v, u) and P[w, u) are solid paths represented as single binary trees, and we complete the merge by combining these trees into a tree representing the merged path. In the remainder of this section we focus on the implementation and analysis of the path-merging process.

3.1 Merging by Insertions. A simple way to merge two paths represented by binary trees is to delete the nodes from the shorter path one-by-one and successively insert them into (the tree representing) the longer path, as proposed in a preliminary journal version of [2]. This can be done with only insertions and deletions on binary trees, thereby avoiding the extra complication of joins and splits. (The latter are still needed for expose operations, however.) Let p_i and q_i be the numbers of nodes on the shorter and longer paths combined in the i-th merge. If the binary trees are balanced, the total time for all the merges is O(Σ_i p_i log(q_i + 1)). Even if insertions could be done in constant time, the time for this method would still be Ω(Σ_i p_i). In the preliminary journal version of [2], the authors claimed an O(n log n) bound on Σ_i p_i for the case in which no cuts occur. Unfortunately this is incorrect: even without cuts, Σ_i p_i can be Ω(n^{3/2}), and this bound is tight (see Section A.1). We therefore turn to a more complicated but faster method.
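The gap between per-node insertion and the subpath-based approach of the next subsection can be made concrete by counting, for a single merge, how many nodes move versus how many maximal contiguous runs the merged label order contains. The cost accounting below is our own illustration, not the paper's analysis:

```python
def naive_insertions(P, Q):
    # Section 3.1's method deletes each node of the shorter path and
    # re-inserts it into the longer one, so it pays per node: p_i.
    return min(len(P), len(Q))

def contiguous_blocks(P, Q):
    # Number of maximal runs of same-path nodes in the merged label
    # order; the interleaved method of Section 3.2 pays per run
    # (one split plus one join each), not per node.
    merged = sorted([(x, 0) for x in P] + [(x, 1) for x in Q],
                    reverse=True)
    return 1 + sum(1 for s, t in zip(merged, merged[1:]) if s[1] != t[1])

P = [10, 9, 8, 7, 6]              # labels, bottom (largest) first
Q = [5, 4, 3, 2, 1]
assert naive_insertions(P, Q) == 5   # five individual insertions...
assert contiguous_blocks(P, Q) == 2  # ...but only two contiguous runs
```

When the two paths interleave completely (e.g., odd versus even labels), the run count degenerates to one per node and nothing is saved; the amortized analysis of Section 3.2 shows such bad merges cannot happen too often.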
3.2 Interleaved Merges. To obtain a faster algorithm, we must merge paths represented by binary trees differently. Instead of inserting the nodes of one path into the other one-by-one, we find subpaths that remain contiguous after the merge, and insert each of these subpaths as a whole. This approach leads to an O(log² n) time bound per operation.

This algorithm uses the following variant of the split operation. If Q is a path with strictly decreasing node labels and k is a label value, the operation split(Q, k) splits Q into the bottom part R, containing nodes with labels greater than or equal to k, and the top part S, containing nodes with labels less than k, and then returns the pair (R, S).

We implement merge(v, w) as before, first exposing v and w and identifying their nearest common ancestor u, and then exposing u itself. We now need to merge the (now-solid) paths R = P[v, u) and S = P[w, u), each represented as a binary tree. Let Q be the merged path (the one we need to build). Q is initially empty. As the algorithm progresses, Q grows, and R and S shrink. The algorithm proceeds top-to-bottom along R and S, repeatedly executing the appropriate one of the following steps until all nodes are on Q:

1. If R is empty, let Q ← join(S, Q).

2. If S is empty, let Q ← join(R, Q).

3. If ℓ(top(R)) > ℓ(top(S)), remove the top portion of S and add it to the bottom of Q. More precisely, perform (S, A) ← split(S, ℓ(top(R))) and then Q ← join(A, Q).

4. If ℓ(top(R)) < ℓ(top(S)), perform the symmetric steps, namely (R, A) ← split(R, ℓ(top(S))) and then Q ← join(A, Q).
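The four steps above can be prototyped on paths stored as Python lists (bottom to top, labels strictly decreasing and distinct), with split implemented by label threshold as just defined. This is a sketch of the control flow only: in the real structure R, S, and Q are binary trees, so each split and join costs O(log n) rather than time linear in the path length.

```python
def split(path, k):
    # Paths are lists, bottom to top, labels strictly decreasing.
    # Returns (bottom part: labels >= k, top part: labels < k).
    i = 0
    while i < len(path) and path[i] >= k:
        i += 1
    return path[:i], path[i:]

def join(a, b):
    # Catenate, with a preceding (below) b.
    return a + b

def merge_paths(R, S):
    Q = []
    while R or S:
        if not R:
            Q, S = join(S, Q), []          # step 1
        elif not S:
            Q, R = join(R, Q), []          # step 2
        elif R[-1] > S[-1]:                # step 3: top portion of S moves
            S, A = split(S, R[-1])
            Q = join(A, Q)
        else:                              # step 4: top portion of R moves
            R, A = split(R, S[-1])
            Q = join(A, Q)
    return Q

assert merge_paths([10, 8, 6], [9, 7, 5]) == [10, 9, 8, 7, 6, 5]
```

Each loop iteration moves a whole contiguous subpath onto Q, which is exactly what lets the amortized analysis below charge broken arcs rather than individual nodes.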
To bound the running time of this method, we use an amortized analysis [24]. Each state of the data structure has a non-negative potential, initially zero. We define the amortized cost of an operation to be its actual cost plus the net increase in potential it causes. Then the total cost of a sequence of operations is at most the sum of their amortized costs.

Our potential function is a sum of two parts. The first is a variant of the one used in [21] to bound the number of joins and splits during exposes. The second, which is new, allows us to bound the number of joins and splits during merges (not counting the three exposes that start each merge).

We call an arc (v, w) heavy if size(v) > size(w)/2 and light otherwise. This definition implies that each node has at most one incoming heavy arc, and any tree path contains at most lg n light arcs (where lg is the binary logarithm). The expose potential of the actual forest is the number of heavy dashed arcs. We claim that an operation expose(v) that does k splits and joins increases the expose potential by at most 2 lg n − k + 1. Each split and join during the expose, except possibly one split, converts a dashed arc along the exposed path to solid. For each of these arcs that is heavy, the potential decreases by one; for each of these arcs that is light, the potential can go up by one, but there are at most lg n such arcs. This gives the claim. The claim implies that the amortized number of joins and splits per expose is O(log n). An operation link(v, w) can be done merely by creating a new dashed arc from v to w, without any joins or splits. This increases the expose potential by at most lg n: the only dashed arcs that can become heavy are those on P[v, null) that were previously light, of which there are at most lg n. A cut of an arc (v, w) requires at most one split and can also increase the expose potential only by lg n, at most one for each newly light arc on the path P[w, null). Thus the amortized number of joins and splits per link and cut is O(log n). Finally, consider a merge operation. After the initial exposes, the rest of the merge cannot increase the expose potential, because all the arcs on the merged path are solid, and arcs that are not on the merged path can only switch from heavy to light, and not vice-versa.

The second part of the potential, the merge potential, we define as follows. Given all the make operations, assign to each node a fixed ordinal in the range [1, n] corresponding to the order of the labels. Identify each node with its ordinal. The actual trees are heap-ordered with respect to these ordinals, as they are with respect to node labels. Assign each arc (x, y) a potential of 2 lg(x − y). The merge potential is the sum of all the arc potentials. Each arc potential is at most 2 lg n; thus a link increases the merge potential by at most 2 lg n, and a cut decreases it or leaves it the same. An expose has no effect on the merge potential. Finally, consider the effect of a merge on the merge potential. If the merge combines two different trees, it is convenient to view it as first linking the tree root of larger label to the root of smaller label, and then performing the merge. Such a link increases the merge potential by at most 2 lg n. We claim that the remainder of the merge decreases the merge potential by at least one for each arc broken by a split.

To verify the claim, we reallocate the arc potentials to the nodes, as follows: given an arc (x, y), allocate half of its potential to x and half to y. No such allocation can increase when a new path is inserted in place of an existing arc during a merge, because the node difference corresponding to the arc can only decrease. Suppose a merge breaks some arc (x, y) by inserting some path from v to w between x and y. Since x > v ≥ w > y, either v ≥ (x + y)/2 or w ≤ (x + y)/2, or both. In the former case, the potential allocated to x from the arc to its parent (y originally, v after the merge) decreases by at least one; in the latter case, the potential allocated to y from the arc from its child (x originally, w after the merge) decreases by at least one. In either case, the merge potential decreases by at least one.

Combining these results, we obtain the following lemma:

Lemma 3.1. The amortized number of joins and splits is O(log n) per link, cut, and expose, O(log n) per merge that combines two trees, and O(1) for a merge that does not combine two trees.
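The two structural facts the expose-potential argument relies on, at most one incoming heavy arc per node and at most lg n light arcs on any path, can be checked mechanically on an example tree. This is an illustrative sketch of the definitions only, not part of the paper's machinery:

```python
import math

# An 8-node tree given by child lists; node 0 is the root.
children = {0: [1, 2], 1: [3, 4], 3: [5, 6], 5: [7]}
parent_of = {c: v for v, cs in children.items() for c in cs}

size = {}
def compute_size(v):
    size[v] = 1 + sum(compute_size(c) for c in children.get(v, []))
    return size[v]
n = compute_size(0)

def is_heavy(child):
    # Arc (child, parent) is heavy iff size(child) > size(parent)/2.
    return size[child] > size[parent_of[child]] / 2

# Each node has at most one incoming heavy arc.
for v in size:
    assert sum(1 for c in children.get(v, []) if is_heavy(c)) <= 1

def light_arcs_to_root(v):
    count = 0
    while v in parent_of:
        count += 0 if is_heavy(v) else 1
        v = parent_of[v]
    return count

# Any path to the root contains at most lg n light arcs.
assert all(light_arcs_to_root(v) <= math.log2(n) for v in size)
```

Both facts follow from sizes at least halving along light arcs; the code merely exercises them on one tree.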
The actual trees are heap- trees, we do not know whether our analysis is tight; ordered with respect to these ordinals, as they are with it is possible that the amortized time per operation is respect to node labels. Assign each arc (x, y) a potential O(log n). We can obtain a small improvement for splay trees by using the amortized analysis for dynamic trees on solid paths is limited enough to allow their represen- given in [22] in place of the one given in [21]. With this tation by data structures as simple as heaps (see Sec- idea we can get an amortized time bound of O(log n) tion A.3). The data structures that give us the best for every mergeable tree operation, except for links and running times, however, are finger trees and splay trees. merges that combine two trees; for these operations, We discuss their use within our algorithm in Sections 4.2 the bound remains O(log2 n). The approach presented and 4.3. here can also be applied to other versions of dynamic trees, such as top trees [3], to obtain mergeable trees 4.1 Basic Algorithm with an amortized O(log2 n) time bound per operation. Nearest common ancestors. To find the nearest 4 Mergeable Trees via Partition by Rank common ancestor of v and w, we concurrently traverse The path decomposition maintained by the algorithm the paths from v and w bottom-up, rank-by-rank, and of Section 3.2 is structurally unconstrained; it depends stop when we reach the same solid path P (or null, if only on the sequence of tree operations. If arbitrary v and w are in different trees). If x and y are the first cuts are disallowed, we can achieve a better time bound vertices on P reached during the traversals from v and by maintaining a more-constrained path decomposition. w, respectively, then the nearest common ancestor of v The effect of the constraint is to limit the ways in which and w is whichever of x and y has smaller label. solid paths change. 
Instead of arbitrary joins and splits, Each of these traversals visits a sequence of at most the constrained solid paths are subject only to arbitrary 1 + lg n solid paths linked by dashed arcs. To perform insertions of single nodes, and deletions of single nodes nca in O(log n) time, we need the ability to jump in from the top. constant time from any vertex on a solid path P to We define the rank of a vertex v by rank(v) = top(P ), and from there follow the outgoing dashed arc. lg size(v) . To decompose the forest into solid paths, We cannot afford to maintain an explicit pointer from wb e define anc arc (v, w) to be solid if rank(v) = rank(w) each node to the top of its solid path, since removing and dashed otherwise. Since a node can have at most the top node would require updating more than a one solid arc from a child, the solid arcs partition the constant number of pointers, but we can use one level forest into node-disjoint solid paths. See Figure 2. of indirection for this purpose. We associate with each solid path P a record P ∗ that points to the top of P ; 24(4) every node in P points to P ∗. When top(P ) is removed, 16(4) we just make P ∗ point to its (former) predecessor on P . 7(2) When an element is inserted into P , we make it point 7(2) 2(1) 6(2) 6(2) to P ∗. With the solid path representations described in Sections 4.2 and 4.3, both operations take constant 6(2) 1(0) 5(2) 5(2) time.

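The rank-based partition is easy to see on a small example. The sketch below (our own illustration) computes ranks for a four-node chain and classifies each arc; the ranks come out 2, 1, 1, 0, so exactly one arc is solid and the chain splits into three solid paths:

```python
import math

# A four-node chain: d -> c -> b -> a ('a' is the root).
children = {'a': ['b'], 'b': ['c'], 'c': ['d']}
parent = {c: p for p, cs in children.items() for c in cs}

size = {}
def compute_size(v):
    size[v] = 1 + sum(compute_size(c) for c in children.get(v, []))
    return size[v]
compute_size('a')            # sizes: a=4, b=3, c=2, d=1

# rank(v) = floor(lg size(v))
rank = {v: math.floor(math.log2(s)) for v, s in size.items()}
assert rank == {'a': 2, 'b': 1, 'c': 1, 'd': 0}

# An arc is solid iff child and parent have equal rank.
solid = {(c, p) for c, p in parent.items() if rank[c] == rank[p]}
assert solid == {('c', 'b')}
# Resulting solid paths: {a}, {b, c}, {d}.
```

Because ranks never decrease when cuts are disallowed, each arc can switch from dashed to solid and back at most O(log n) times in total, which is what the complexity accounting below exploits.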
Merging. The first pass of merge(v, w) is similar to nca(v, w): we follow the paths from v and w bottom-up until their nearest common ancestor u is found. Then we do a second pass, in which we traverse the paths P[v, u) and P[w, u) in a top-down fashion and merge them. To avoid special cases in the detailed description of the merging process, we assume the existence of dummy leaves v′ and w′, children of v and w respectively, each with label ∞ and rank −1. For a node x on P[v, u) or P[w, u), we denote by pred(x) the node preceding x on P[v, u) or P[w, u), respectively. To do the merging, we maintain two current nodes, p and q. Initially, p = top(P[v, u)) and q = top(P[w, u)); if rank(p) < rank(q), we swap p and q, so that rank(p) ≥ rank(q). We also maintain r, the bottommost node for which the merge has already been performed. Initially r = u. The following invariants will hold at the beginning of every step: (1) the current forest is correctly partitioned into solid paths; (2) rank(p) ≥ rank(q); (3) r is either null or is the parent of both p and q. The last two invariants imply that the arc from q to r, if it exists, is dashed.

To perform the merge, we repeat the appropriate one of steps 1 and 2 below until p = v′ and q = w′. The choice of step depends on whether p or q goes on top.

1. ℓ(p) > ℓ(q). The node with lower rank (q) will be on top, with new rank rank′(q) = ⌊lg(size(p) + size(q))⌋. Let t = pred(q). Since rank′(q) > rank(q), we remove q from the top of its solid path and insert it between p and r. There are two possibilities:

(a) If rank′(q) < rank(r), the arc from q to r, if it exists, will be dashed. If rank′(q) > rank(p), the arc from p to q will also be dashed, and q forms a one-node solid path. Otherwise, we add q to the top of the solid path containing p.

(b) If rank′(q) = rank(r), we insert q into the solid path containing r (just below r itself); the arc from q to r will be solid.

In both cases, we set r ← q and q ← t.

2. ℓ(p) < ℓ(q). The node with higher rank (p) goes on top. Let rank′(p) = ⌊lg(size(p) + size(q))⌋ be the rank of p after the merge. We must consider the following subcases:

(a) rank′(p) > rank(p): This case happens only if p is the top of its solid path (otherwise the current partition would not be valid). We remove it from this path, and make the arc from t = pred(p) to p dashed, if it is not dashed already.

   i. If rank′(p) = rank(r), we insert p into the solid path containing r (note that r cannot have a solid predecessor already, or else its rank would be greater than rank′(p)).

   ii. If rank′(p) < rank(r), we keep the arc from p to r, if it exists, dashed.

   We then set r ← p and p ← t.

(b) rank′(p) = rank(p): p remains on the same solid path (call it P). Let e be the first (lowest) vertex on P reached during the traversals from v and w and let t = pred(e). The arc (t, e) is dashed. There are two subcases:

   i. ℓ(e) < ℓ(q): e will be above q after the merge. We set r ← e and p ← t.

   ii. ℓ(e) > ℓ(q): we perform a search on P for the bottommost vertex x whose label is smaller than ℓ(q). We set r ← x and set p ← pred(x). The new node p is on P.

In all cases above, we make q a (dashed) child of the new r. If now rank(p) < rank(q) (which can only happen if the arc from p to r is dashed), we swap pointers p and q to preserve Invariant (2).

Maintaining sizes. The algorithm makes some decisions based on the value of ⌊lg(size(p) + size(q))⌋. We cannot maintain all node sizes explicitly, since many nodes may change size during each merge operation. Fortunately, the only sizes actually needed are those of the top vertices of solid paths. This is trivially true for q. For p, we claim that, if the arc from p to r is solid, then ⌊lg(size(p) + size(q))⌋ = rank(p). This follows from rank(p) ≤ ⌊lg(size(p) + size(q))⌋ ≤ rank(r) = rank(p), since p is a child of r, and p and r are on the same solid path.

These observations are enough to identify when cases 1(a) and 2(a) apply, and one can decide between the remaining cases (1(b) and 2(b)) based only on node labels. Therefore, we keep explicitly only the sizes of the top nodes of the solid paths. To update these efficiently, we also maintain the dashed size of every vertex x, defined to be the sum of the sizes of the dashed children of x (i.e., the children connected to x by a dashed arc) plus one (which accounts for x itself). These values can be updated in constant time whenever the merging algorithm makes a structural change to the tree. Section A.2 explains how.

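The reason dashed sizes suffice can be sketched as follows (this is our own reading of the idea, not the actual content of Section A.2): since each vertex has at most one solid child, size(v) equals the sum of the dashed sizes of the vertices at or below v on its solid path, so the size of a path's top vertex is recoverable from dashed sizes alone.

```python
import math

# Root 'a' with children 'b' and 'e'; 'c' under 'b'; 'd', 'f' under 'c'.
children = {'a': ['b', 'e'], 'b': ['c'], 'c': ['d', 'f']}
parent = {c: p for p, cs in children.items() for c in cs}

size = {}
def compute_size(v):
    size[v] = 1 + sum(compute_size(c) for c in children.get(v, []))
    return size[v]
compute_size('a')            # sizes: a=6, b=4, c=3, d=f=e=1

rank = {v: math.floor(math.log2(s)) for v, s in size.items()}
def is_solid(c):
    return rank[c] == rank[parent[c]]

# dashed size: one for the vertex itself plus sizes of dashed children.
dashed_size = {v: 1 + sum(size[c] for c in children.get(v, [])
                          if not is_solid(c)) for v in size}

# The root's solid path is a <- b (both rank 2); summing dashed sizes
# along it recovers the size of its top vertex.
assert rank['b'] == rank['a'] and not is_solid('c')
assert dashed_size['a'] + dashed_size['b'] == size['a'] == 6
```

The identity holds for any partition with at most one solid child per vertex, which is why a constant number of dashed-size updates per structural change is enough bookkeeping.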
Complexity. To simplify our analysis, we assume that there are no leaf deletions. We discuss leaf deletions in Section 4.4. The running time of the merging procedure depends on which data structure is used to represent solid paths. But at this point we can at least count the number of basic operations performed. There will be at most O(n log n) removals from the top of a solid path, since they only happen when a node's rank increases. Similarly, there will be at most O(n log n) insertions. The number of searches (case 2(b)ii) will also be O(n log n), since every search precedes an insertion of q, and therefore an increase in rank. The only case unaccounted for is 2(b)i, which will happen at most O(log n) times per operation, taking O(m log n) time in total. This is also the time for computing nearest common ancestors.

4.2 Solid Paths as Finger Trees. We show how to achieve an O(log n) amortized bound per operation by representing each solid path as a finger search tree [7, 26] (finger tree for short). A finger tree is a balanced search tree that supports fast access in the vicinity of certain preferred positions, indicated by pointers called fingers. Specifically, if there are d items between the target and the starting point (an item with a finger), then the search takes O(log(d + 2)) time. Furthermore, given the target position, finger trees support insertions and deletions in constant amortized time. (Some finger trees achieve constant-time finger updates in the worst case [11, 6].) When solid paths are represented as ordinary balanced binary trees, the cost of inserting a node w into a path P is proportional to log |P|. With finger trees, this cost is reduced to O(log(|P[v, x]| + 2)), where w is inserted just below v and x is the most-recently-accessed node of P (unless w is the first node inserted during the current merge operation, in which case x = top(P)). We refer to P[v, x] as the insertion path of w into P. After w is inserted, it becomes the most-recently-accessed node. To bound the running time of the merging algorithm, we first give an upper bound on the sum of the logarithms of the lengths of all insertion paths during the sequence of merges. We denote this sum by L.

Lemma 4.1. For any sequence of merge operations, L = O(n log n).

Proof. Let P and Q be two solid paths being merged, with rank(P) = r_p ≥ rank(Q) = r_q (i.e., nodes of Q are inserted into P). Let J be the set of integers in [0, ⌈lg lg n⌉], and let j be any integer in J. Define β(j) = 2^j. We consider insertions that satisfy:

(4.1)   β(j) ≤ r_p − r_q < β(j + 1).

Every element of Q that gets inserted into P increases its rank by at least β(j); hence there can be at most (n lg n)/β(j) such insertions. Let δ(j) be the fraction of these insertions that actually occur, and let λ(j) = δ(j)(n lg n)/β(j) be the number of actual insertions that satisfy condition (4.1). Since the total number of rank increases is at most n lg n, we have

(4.2)   Σ_{j∈J} λ(j)β(j) ≤ n lg n   ⇒   Σ_{j∈J} δ(j) ≤ 1.

Next we calculate an upper bound for the total length of the insertion paths that satisfy (4.1). Whenever a node x is included in such an insertion path, it acquires at least 2^{r_q} descendants. Since r_p − r_q < β(j + 1), x can be on at most 2^{β(j+1)} such paths while maintaining rank r_p. Then the total number of occurrences of x on all insertion paths that satisfy (4.1) is at most (lg n)2^{β(j+1)}, since the rank of a node can change at most lg n times. This implies an overall bound of (n lg n)2^{β(j+1)}. This length is distributed over λ(j) insertions and, because the log function is concave, the sum S(j) of the logarithms of the lengths of the insertion paths is bounded above by

   S(j) ≤ λ(j) lg( (n lg n) 2^{β(j+1)} / λ(j) )
        = δ(j) (n lg n / β(j)) lg( β(j) 2^{2β(j)} / δ(j) )
        < δ(j) (n lg n / β(j)) lg( 2^{3β(j)} / δ(j) )
        ≤ δ(j) (n lg n / β(j)) lg(1/δ(j)) + 3 δ(j) n lg n
        ≤ n lg n ( 1/β(j) + 3 δ(j) ).

We need to bound the sum of this expression for all j. The 1/β(j) terms form a geometric series that sums to at most 2. The 3δ(j) terms add up to at most 3 by Inequality (4.2). Therefore, L ≤ 5n lg n. □

Lemma 4.2. Any sequence of m merge and nca operations takes O((n + m) log n) time.

Proof. We have already established that each nca query takes O(log n) time, so we only have to bound the time necessary to perform merges. According to our previous analysis, there are at most O(n log n) insertions and deletions on solid paths. Finger trees allow each to be performed in constant amortized time, as long as we have a finger to the appropriate position. Lemma 4.1 bounds the total time required to find these positions by O(n log n), which completes the proof. □

This lemma implies an O(log n) amortized time bound for every operation but cut. The analysis for the remaining operations is trivial; in particular, link can be interpreted as a special case of merge. This bound is tight for data structures that maintain parent pointers explicitly: there are sequences of operations in which parent pointers change Ω(n log n) times.

4.3 Solid Paths as Splay Trees. The bound given in Lemma 4.2 also holds if we represent each solid path as a splay tree. To prove this, we have to appeal to Cole's dynamic finger theorem for splay trees:

Theorem 4.1. [9, 8] The total time to perform m_t accesses (including inserts and deletes) on an arbitrary splay tree t, initially of size n_t, is O(m_t + n_t + Σ_{j=1}^{m_t} log(d_t(j) + 2)), where, for 1 ≤ j ≤ m_t, the j-th and (j−1)-st accesses are performed on items whose distance in the list of items stored in the splay tree is d_t(j). For j = 0, the j-th item is interpreted to be the item originally at the root of the splay tree.

To bound the total running time of our data structure we simply add the contribution of each solid path using the bound given in the theorem above. (We also have to consider the time to find the nearest common ancestors, but this has already been bounded by O(m log n).) Moreover, Σ_t (n_t + m_t) = O(n log n), where the sum is taken over all splay trees (i.e., solid paths) ever present in the virtual tree. Finally, we can use Lemma 4.1 to bound Σ_t Σ_j log(d_t(j) + 2) by O(n log n). Note, however, that we cannot apply the lemma directly: it assumes that the most-recently-accessed node at the beginning of a merge is the top vertex of the corresponding solid path. In general, this does not coincide with the root of the corresponding splay tree, but it is not hard to prove that the bound still holds.

4.4 Leaf Deletions. Although the results of Sections 4.2 and 4.3 do not (necessarily) hold if we allow arbitrary cuts (because the ranks can decrease), we can in fact allow leaf deletions. The simplest way to handle leaf deletions is merely to mark leaves as deleted, without otherwise changing the structure. Then a leaf deletion takes O(1) time, and the time bounds for the other operations are unaffected.

5 Lower Bounds

A data structure that solves the mergeable trees problem can be used to sort a sequence of numbers; hence, in a comparison-based model of computation, some mergeable tree operations must take Ω(log n) amortized time, and our O(log n) bound is asymptotically optimal. Mergeable trees can also solve a version of the union-find problem, with an operation union(A, B, C) that unites sets A and B into a new set named C, and an operation find(x, A) that returns true if element x belongs to set A and false otherwise. Kaplan, Shafrir, and Tarjan [17] proved an Ω(m α(m, n, log n)) lower bound for this problem in the cell probe model with cells of size log n. Here n is the total number of elements and m is the number of find operations. Function α(m, n, ℓ) is defined as min{k : A_k(⌈m/n⌉) > ℓ}, where A_k(·) is the k-th row of Ackermann's function.

We can solve this problem using mergeable trees as follows. Start with a tree with n leaves directly connected to the root. Each leaf represents an element (the root does not represent anything in the original problem). As the algorithm progresses, each path from a leaf to the root represents a set whose elements are the vertices on the path (except the root itself). Without loss of generality, we use the only leaf of a set as its unique identifier. To perform union(A, B, C), we execute merge(A, B) and set C ← max{A, B}. To perform find(x, A), we perform v ← nca(x, A). If v = x, we return true; otherwise v will be the root, and we return false. This reduction also implies some tradeoffs for the worst-case update (merge) time t_u and the worst-case query (nca) time t_q. Again it follows from [17] that for t_u ≥ max{140b/log n, 140} we have t_q = Ω(log n / log(t_u + 1)).

Finally, we note that [19] gives an Ω(m log n) lower bound for any data structure that supports a sequence of m link and cut operations. This lower bound is smaller by a log factor than our upper bound for any sequence of m mergeable tree operations that contains arbitrary links and cuts.
a sequence of n values, we can build a tree in which For completeness, we compare these lower bounds each value is associated with a leaf directly connected to the complexity of m on-line nearest common ancestor to the root. If we merge pairs of leaves in any order, queries on a static forest of n nodes. For pointer after n 1 operations we will end up with the values in machines, Harel and Tarjan [16] give a worst-case sorted order.− As a result, any lower bound for sorting Ω(log log n) bound per query. An O(n + m log log n)- applies to our problem. In particular, in the comparison time pointer-machine algorithm was given in [28]; the and pointer machine models, a worst-case lower bound same bound has been achieved for the case where of Ω(n log n) for executing n operations holds, and the arbitrary links are allowed [4]. On RAMs, the static data structures of Sections 4.2 and 4.3 are optimal. on-line problem can be solved in O(n + m) time [16, This lower bound is not necessarily valid if we 20]; when arbitrary links are allowed, the best known are willing to forgo the parent operation (as in the algorithm runs in O(n + mα(m + n, n)) time [14]. case of Agarwal et al.’s original problem), or consider more powerful machine models. Still, we can prove 6 Final Remarks a superlinear lower bound in the cell-probe model of There are several important questions related to the computation [31, 13], which holds even for the restricted mergeable trees problem that remain open. Perhaps the case in which only the nca and merge operations are most interesting one is to devise a data structure that supported, and furthermore the arguments of merges supports all the mergeable tree operations, including are always leaves. This result follows from a reduction arbitrary cuts, in O(log n) amortized time. In this of the boolean union-find problem to the mergeable most general setting only the data structure of Section trees problem. 
The boolean union-find problem is 3.2 gives an efficient solution, supporting all operations that of maintaining a collection of disjoint sets under in O(log2 n) amortized time. The data structure of two operations: union(A, B, C), which creates a new Section 4 relies on the assumption that the rank of set C representing the union of sets A and B (which each node is nondecreasing; in the presence of arbitrary are destroyed), and find(x, A), which returns true if x cuts (which can be followed by arbitrary links) explicitly maintaining the partition by rank would require linear [7] M. R. Brown and R. E. Tarjan. Design and analysis time per link and cut in the worst case. We believe, of a data structure for representing sorted lists. SIAM however, that there exists an optimal data structure Journal on Computing, 9(3):594–614, 1980. (at least in the amortized sense) that can support all [8] R. Cole. On the dynamic finger conjecture for splay operations in O(log n) time. Specifically, we conjecture trees. Part II: The proof. SIAM Journal on Computing, that Sleator and Tarjan’s link-cut trees implemented 30(1):44–85, 2000. [9] R. Cole, B. Mishra, J. Schmidt, and A. Siegel. On with splay trees do support m mergeable tree operations the dynamic finger conjecture for splay trees. Part I: (including cut) in O(m log n) time using the algorithm Splay sorting log n-block sequences. SIAM Journal on of Section 3.2. Computing, 30(1):1–43, 2000. Another issue has to do with the definition of n. We [10] P. Dietz and D. Sleator. Two algorithms for maintain- can easily modify our algorithms so that n is the number ing order in a list. In Proc. 19th ACM Symp. on Theory of nodes in the current forest, rather than the total of Computing, pages 365–372, 1987. number of nodes ever created. Whether the bounds [11] P. F. Dietz and R. Raman. A constant update time remain valid if n is the number of nodes in the tree or finger search tree. 
Information Processing Letters, trees involved in the operation is open. 52(3):147–154, 1994. Another direction for future work is devising data [12] G. N. Frederickson. Data structures for on-line update structures that achieve good bounds in the worst case. of minimum spanning trees, with applications. SIAM Journal of Computing, 14:781–798, 1985. This would require performing merges implicitly. [13] M. Fredman and M. Saks. The cell probe complexity The mergeable trees problem has several interesting of dynamic data structures. In Proc. 21th ACM Symp. variants. When the data structure needs to support on Theory of Computing, pages 345–354, 1989. only a restricted set of operations, it remains an open [14] H. N. Gabow. Data structures for weighted matching question whether we can beat the O(log n) bound on and nearest common ancestors with linking. In Proc. more powerful machine models (e.g. on RAMs). When 1st ACM-SIAM Symp. on Discrete Algorithms, pages both merge and nca but not parent operations are 434–443, 1990. allowed, the gap between our current upper and lower [15] L. J. Guibas and R. Sedgewick. A dichromatic fram- bounds is more than exponential. work for balanced trees. In Proc. 19th Symp. on Foun- dations of Computer Science, pages 8–21, 1978. Acknowledgement. We thank Herbert Edelsbrunner for [16] D. Harel and R. E. Tarjan. Fast algorithms for posing the mergeable trees problem and for sharing the finding nearest common ancestors. SIAM Journal on preliminary journal version of [2] with us. Computing, 13(2):338–355, 1984. [17] H. Kaplan, N. Shafrir, and R. E. Tarjan. Meldable References heaps and boolean union-find. In Proc. 34th ACM Symp. on Theory of Computing, pages 573–582, 2002. [18] K. Mehlhorn and S. N¨aher. Bounded ordered dictio- O N O n [1] U. A. Acar, G. E. Blelloch, R. Harper, J. L. Vittes, and naries in (log log ) time and ( ) space. Informa- S. L. M. Woo. Dynamizing static algorithms, with ap- tion Processing Letters, 35(4):183–189, 1990. 
plications to dynamic trees and history independence. [19] Mihai P˘atra¸scu and Erik D. Demaine. Lower bounds In Proc. 15th ACM-SIAM Symp. on Discrete Algo- for dynamic connectivity. In Proc. 36th ACM Symp. rithms, pages 524–533, 2004. on Theory of Computing, pages 546–553, 2004. [2] P. K. Agarwal, H. Edelsbrunner, J. Harer, and [20] B. Schieber and U. Vishkin. On finding lowest common Y. Wang. Extreme elevation on a 2-manifold. In Proc. ancestors: Simplification and parallelization. SIAM 20th Symp. on Comp. Geometry, pages 357–365, 2004. Journal on Computing, 17(6):1253–1262, 1988. [3] S. Alstrup, J. Holm, K. de Lichtenberg, and M. Tho- [21] D. D. Sleator and R. E. Tarjan. A data structure rup. Maintaining diameter, center, and median of for dynamic trees. Journal of Computer and System fully-dynamic trees with top trees. Unpublished Sciences, 26:362–391, 1983. manuscript, http://arxiv.org/abs/cs/0310065, 2003. [22] D. D. Sleator and R. E. Tarjan. Self-adjusting binary [4] S. Alstrup and M. Thorup. Optimal algorithms for search trees. Journal of the ACM, 32(3):652–686, 1985. finding nearest common ancestors in dynamic trees. [23] R. E. Tarjan. Data Structures and Network Algo- Journal of Algorithms, 35:169–188, 2000. rithms. SIAM Press, Philadelphia, PA, 1983. [5] S. W. Bent, D. D. Sleator, and R. E. Tarjan. Biased [24] R. E. Tarjan. Amortized computational complexity. search trees. SIAM Journal of Computing, 14(3):545– SIAM J. Alg. Disc. Meth., 6(2):306–318, 1985. 568, 1985. [25] R. E. Tarjan and R. F. Werneck. Self-adjusting top [6] G. S. Brodal, G. Lagogiannis, C. Makris, A. Tsakalidis, trees. In Proc. 16th ACM-SIAM Symp. on Discrete and K. Tsichlas. Optimal finger search trees in the Algorithms, pages 813–822, 2005. O n n pointer machine. Journal of Computer and System [26] R. E. Tarjan and C. J. Van Wyk. An ( log log )- Sciences, 67(2):381–418, 2003. time algorithm for triangulating a simple polygon. SIAM Journal on Computing, 17(1):143–173, 1988. 
[27] M. Thorup. On RAM priority queues. SIAM Journal on Computing, 30(1):86–109, 2000.
[28] A. K. Tsakalides and J. van Leeuwen. An optimal pointer machine algorithm for finding nearest common ancestors. Technical Report RUU-CS-88-17, U. Utrecht Dept. of Computer Science, 1988.
[29] P. van Emde Boas. Preserving order in a forest in less than logarithmic time and linear space. Information Processing Letters, 6(3):80–82, 1977.
[30] P. van Emde Boas, R. Kaas, and E. Zijlstra. Design and implementation of an efficient priority queue. Mathematical Systems Theory, 10:99–127, 1977.
[31] A. Yao. Should tables be sorted? Journal of the ACM, 28(3):615–628, 1981.

A Appendix

A.1 Bounds on Iterated Insertions. This section shows that there exists a sequence of operations on which Agarwal et al.'s algorithm makes Θ(n√n) insertions on binary trees. We start with a tree consisting of 2k + 1 nodes, v_0, v_1, ..., v_{2k}. For every i in [1, k], let v_{i−1} be the parent of both v_i and v_{k+i}.

Consider a sequence of k − √k merges such that the i-th merge combines v_{k+i} with v_{k+i+√k} (for simplicity, assume k is a perfect square). In other words, we always combine the topmost leaf with the leaf that is √k levels below. On each merge, the longest path is the one starting from the bottom leaf: it will have 1 + √k arcs. The shortest path will have size 1 for the first √k merges, 2 for the following √k, 3 for the next √k, and so on, up to √k − 1. Let p_i be the length of the shortest path during the i-th merge. Over all merges, these lengths add up to

    Σ_{i=1}^{k−√k} p_i = Σ_{i=1}^{√k−1} i√k = √k Σ_{i=1}^{√k−1} i = (k√k − k)/2 = Θ(n√n).

A simple potential function argument (counting the number of unrelated vertices) shows that this is actually the worst case of the algorithm. For any sequence of merges in an n-node forest, Σ_{i=1}^{n} p_i < n√n.

A.2 Maintaining Sizes. This section explains how to update the s(·) (size) and d(·) (dashed size) fields during the merge operation when the tree is partitioned by rank. There are three basic operations we must handle: removal of the topmost node of a solid path; insertion of an isolated node (a node with no adjacent solid arcs) into a solid path; and replacement of the (dashed) parent of q (moving the subtree rooted at q down the tree). We consider each case in turn.

Consider removals first. Let x be a node removed from a solid path and let c be its solid child. Node c will become the top of a solid path, so we keep its size explicitly: s′(c) ← s(x) − d(x). Also, because all arcs entering x become dashed, we must update its dashed size: d′(x) ← s(x). All other fields (including s(x) and d(c)) remain unchanged.

Now consider the insertion of a node x into a solid path S. Whenever this happens, the parent of x does not change (it is always r). There are two cases to consider. If x = top(S), it will gain a solid child c, the former top of the path; we set s′(x) ← s(c) + s(x), and s′(c) becomes undefined (the values of d(x) and d(c) remain unchanged). If x ≠ top(S), we set d′(r) ← d(r) − s(x) (both s(r) and d(x) remain unchanged, and s(x) becomes undefined).

The third operation we must handle is moving q. Let r_0 be its original parent and r_1 be the new one (recall that r_1 is always a descendant of r_0). We must update their dashed sizes: d′(r_0) ← d(r_0) − s(q) and d′(r_1) ← d(r_1) + s(q). Furthermore, if r_0 and r_1 belong to different solid paths, we set s′(r_1) ← s(r_1) + s(q) (r_1 will be the topmost vertex of its path in this case).

A.3 Solid Paths as Heaps. In the special case where we do not need to maintain parent pointers, we can use the algorithm described in Section 4 and represent each solid path as a heap, with higher priority given to elements of smaller label. To insert an element into the path, we use the standard insert heap operation. Note that it is not necessary to search for the appropriate position within the path: the position will not be completely defined until the element becomes the topmost vertex of the path, when it can be retrieved with the deletemin operation. If each heap operation can be performed in f(n) time, the data structure will be able to execute m operations in O((m + nf(n)) log n) time. Thus we get an O(n log² n + m log n)-time algorithm using ordinary heaps, and an O(n log n log log n + m log n)-time algorithm using the fastest known RAM priority queues [27].

We can also represent each solid path with a van Emde Boas tree [29, 30]. They support insertions, deletions, and predecessor queries in O(log log n) amortized time if the labels belong to a small universe (polynomial in n). Arbitrary labels can be mapped to a small universe with the Dietz-Sleator data structure [10] (and a balanced search tree) or, if they are known offline, with a simple preprocessing step. We must also use hashing [18] to ensure that the total space taken by all trees (one for each solid path) is O(n). With these measures, we can solve the mergeable trees problem (including parent queries) in O(n log n log log n + m log n) total time.
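The three bookkeeping cases of Section A.2 are mechanical enough to state as code. The sketch below is ours, not the paper's: the names PathNode, remove_top, insert_into_path, and move_subtree are hypothetical, and all the surrounding solid-path machinery is elided; only the field updates s′(·) and d′(·) from the text are shown.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PathNode:
    s: int = 1  # size field s(.): meaningful while the node is the top of a solid path
    d: int = 0  # dashed size d(.): total size of subtrees hanging off by dashed arcs

def remove_top(x: PathNode, c: PathNode) -> None:
    """Case 1: x leaves its solid path; its solid child c becomes a path top."""
    c.s = x.s - x.d   # s'(c) <- s(x) - d(x), using the old value of d(x)
    x.d = x.s         # d'(x) <- s(x): every arc entering x is now dashed

def insert_into_path(x: PathNode, r: PathNode,
                     former_top: Optional[PathNode]) -> None:
    """Case 2: isolated node x joins a solid path S; its parent r is unchanged."""
    if former_top is not None:    # x = top(S): x gains the former top as solid child
        x.s = former_top.s + x.s  # s'(x) <- s(c) + s(x); s'(c) becomes undefined
    else:                         # x lands strictly below top(S)
        r.d = r.d - x.s           # d'(r) <- d(r) - s(x); s(x) becomes undefined

def move_subtree(q: PathNode, r0: PathNode, r1: PathNode,
                 different_paths: bool) -> None:
    """Case 3: q is reparented from r0 to its descendant r1."""
    r0.d -= q.s                   # d'(r0) <- d(r0) - s(q)
    r1.d += q.s                   # d'(r1) <- d(r1) + s(q)
    if different_paths:           # r1 becomes the topmost vertex of its path
        r1.s += q.s               # s'(r1) <- s(r1) + s(q)
```

The order of assignments matters in remove_top: d′(x) uses the old s(x) and s′(c) uses the old d(x), so c is updated before x.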
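The reduction from boolean union-find in Section 5 can also be made concrete. The sketch below is ours and deliberately naive: merge and nca walk whole root paths in linear time, nothing like the efficient structures of Section 4, so it only illustrates the correctness of the reduction. All names (MTNode, root_path, and so on) are hypothetical.

```python
class MTNode:
    """Node of a heap-ordered forest; smaller labels lie closer to the root."""
    def __init__(self, label):
        self.label = label
        self.parent = None

def root_path(v):
    """Nodes on the path from v to the root, bottom up."""
    path = []
    while v is not None:
        path.append(v)
        v = v.parent
    return path

def merge(v, w):
    """Merge the root paths of v and w, preserving heap order by label."""
    nodes = sorted(set(root_path(v)) | set(root_path(w)), key=lambda u: u.label)
    for child, par in zip(nodes[1:], nodes):  # relink consecutive merged nodes
        child.parent = par

def nca(v, w):
    """Nearest common ancestor of v and w (naive upward walk)."""
    anc = set(root_path(v))
    while w not in anc:
        w = w.parent
    return w

# Boolean union-find on top: one tree, n leaves under a root with the smallest
# label; each set is identified by the unique leaf of its root path.
def union(a, b):
    merge(a, b)
    return max(a, b, key=lambda u: u.label)  # the deeper leaf names the new set

def find(x, set_id):
    return nca(x, set_id) is x  # true iff x lies on set_id's path to the root
```

With distinct labels, the bottommost node of the merged path is the larger-labeled of the two old leaves, matching the paper's C ← max{A, B}; find returns true exactly when nca(x, A) = x, and otherwise nca returns the root.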