ECE750-TXB Lecture 7: Red-Black Trees, Heaps, and Treaps

Todd L. Veldhuizen
[email protected]

Electrical & Computer Engineering
University of Waterloo
Canada

February 14, 2007

Binary Search Trees

- Recall that in a binary search tree of height h, the time required to find or insert an element is O(h).
- In the worst case h = n, the number of elements.
- To keep h ∈ O(log n) one needs a balancing strategy.
- Balancing strategies may be either:
  - Randomized: e.g. a random insert order results in an expected height of c ln n with c ≈ 4.311.
  - Deterministic (in the sense of not random).
- Today we will see an example of each:
  - Red-black trees: deterministic balancing.
  - Treaps: randomized. They also demonstrate persistence and unique representation.

Red-black trees

- Red-black trees are a popular form of binary search tree with a deterministic balancing strategy.
- Nodes are coloured red or black.
- Properties of the node-colouring ensure that the longest path to a leaf is no more than twice the length of the shortest path.
- This ensures a height of ≤ 2 log2(n + 1), which implies search, min, max in O(log n) worst-case time.
- Insert and Delete can also be performed in O(log n) worst-case time.
- Invented by Bayer [2]; red-black formulation due to Guibas and Sedgewick [9]. Other sources: [5, 10].
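To give a feel for the height bound 2 log2(n + 1), here is the arithmetic for one example size. The value n = 10^6 is an assumption chosen only for illustration.

```latex
h \;\le\; 2\log_2(n+1) \;=\; 2\log_2(10^6 + 1) \;\approx\; 2 \times 19.93 \;\approx\; 39.9
\quad\Longrightarrow\quad h \le 39 .
```

So a red-black tree holding a million keys never needs more than 39 comparisons along a search path, whereas an unbalanced binary search tree could need a million.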

Red-Black Trees: Invariants

Balance invariants:
1. No red node has a red child.
2. Every path from a node down to a leaf of its subtree contains the same number of black nodes.

Red-Black Trees

[Figure not recovered]
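A minimal sketch of how the two balance invariants could be checked mechanically. The `Node` layout, the `make` helper and the function names are assumptions made for this illustration (null pointers stand in for leaves); they are not taken from the lecture.

```cpp
#include <memory>
#include <utility>

// Hypothetical node layout; a null pointer plays the role of a leaf.
enum class Colour { Red, Black };

struct Node {
    int key;
    Colour colour;
    std::unique_ptr<Node> left, right;
};

// Returns the number of black nodes on every path from x down to a leaf
// (the null leaf itself is not counted), or -1 if an invariant fails at or below x.
int checked_black_height(const Node* x) {
    if (x == nullptr) return 0;
    if (x->colour == Colour::Red) {
        // Invariant 1: no red node has a red child.
        if (x->left && x->left->colour == Colour::Red) return -1;
        if (x->right && x->right->colour == Colour::Red) return -1;
    }
    int lh = checked_black_height(x->left.get());
    int rh = checked_black_height(x->right.get());
    // Invariant 2: both sides must agree on the number of black nodes.
    if (lh < 0 || rh < 0 || lh != rh) return -1;
    return lh + (x->colour == Colour::Black ? 1 : 0);
}

bool satisfies_invariants(const Node* root) {
    return checked_black_height(root) >= 0;
}

// Helper for building small trees by hand.
std::unique_ptr<Node> make(int key, Colour c,
                           std::unique_ptr<Node> l = nullptr,
                           std::unique_ptr<Node> r = nullptr) {
    auto n = std::make_unique<Node>();
    n->key = key;
    n->colour = c;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}
```

For instance, a black root with two red children passes the check, while a red node with a red child, or a black root with a single black child, does not.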

Red-Black Trees: Balance I

Let bh(x) be the number of black nodes along any path from a node x to a leaf, excluding the leaf.

Lemma. The number of internal nodes in the subtree rooted at x is at least $2^{bh(x)} - 1$.

Red-Black Trees: Balance II

Proof. By induction on height:
1. Base case: If x has height 0, then x is a leaf, and bh(x) = 0; the number of internal (non-leaf) descendants of x is $0 = 2^{bh(x)} - 1$.
2. Induction step: assume the hypothesis is true for height ≤ h. Consider a node x of height h + 1. From invariant (2), the children have black height either bh(x) − 1 (if the child is black) or bh(x) (if the child is red). By the induction hypothesis, each child subtree has at least $2^{bh(x)-1} - 1$ internal nodes. The total number of internal nodes in the subtree rooted at x is therefore
   $\ge (2^{bh(x)-1} - 1) + 1 + (2^{bh(x)-1} - 1) = 2^{bh(x)} - 1$.

Red-Black Trees: Balance

Theorem. A red-black tree with n internal nodes has height at most 2 log2(n + 1).

Proof. Let h be the tree height. From invariant 1 (a red node must have both children black), at least half of the nodes on any path from the root to a leaf are black, so the black-height of the root must be ≥ h/2. Applying the Lemma, the number of internal nodes n of the tree satisfies $n \ge 2^{h/2} - 1$. Rearranging, h ≤ 2 log2(n + 1).

Red-Black Trees: Balance

- As with all non-randomized binary search trees, balance must be maintained when insert or delete operations are performed.
- These operations may disrupt the invariants, so rotations and recolourings are needed to restore them.
- Insert for red-black tree:
  1. Insert the new key as a red node, using the usual binary search tree insert.
  2. Perform restructurings and recolourings along the path from the newly added leaf to the root to restore invariants.
  3. Root is always coloured black.

Red-Black Trees: Balance

- Four cases for red nodes with red children:

[Figure not recovered: the four red-red configurations]

- Restructure/recolour to correct: each of the above cases becomes

[Figure not recovered: the restructured subtree]

Red-Black Trees: Example

- Insertion of [1,2,3,4,5] into a red-black tree:

[Figure not recovered: the tree after each insertion]

- Implementation of rebalancing is straightforward but a bit involved.
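The restructuring step can be written compactly in a functional style. The sketch below follows the formulation usually credited to Okasaki [10], in which each of the four red-red cases is rebuilt into the same shape (a red node with two black children). It is one possible implementation, not necessarily the one used in the lecture: it builds new nodes instead of rotating in place, it covers insert only, and all names (`Tree`, `node`, `balance`, `ins`, `black_height`, `height`) are chosen here for illustration.

```cpp
#include <algorithm>
#include <memory>
#include <utility>

enum class Colour { Red, Black };

struct Node;
using Tree = std::shared_ptr<const Node>;   // nullptr stands for an empty leaf

struct Node {
    Colour colour;
    Tree left;
    int key;
    Tree right;
};

Tree node(Colour c, Tree l, int k, Tree r) {
    return std::make_shared<Node>(Node{c, std::move(l), k, std::move(r)});
}

bool is_red(const Tree& t) { return t && t->colour == Colour::Red; }

// If a black node has a red child with a red child, rebuild the three nodes
// as a red node with two black children; otherwise build the node unchanged.
Tree balance(Colour c, const Tree& l, int k, const Tree& r) {
    if (c == Colour::Black) {
        if (is_red(l) && is_red(l->left))        // left-left
            return node(Colour::Red,
                        node(Colour::Black, l->left->left, l->left->key, l->left->right),
                        l->key,
                        node(Colour::Black, l->right, k, r));
        if (is_red(l) && is_red(l->right))       // left-right
            return node(Colour::Red,
                        node(Colour::Black, l->left, l->key, l->right->left),
                        l->right->key,
                        node(Colour::Black, l->right->right, k, r));
        if (is_red(r) && is_red(r->left))        // right-left
            return node(Colour::Red,
                        node(Colour::Black, l, k, r->left->left),
                        r->left->key,
                        node(Colour::Black, r->left->right, r->key, r->right));
        if (is_red(r) && is_red(r->right))       // right-right
            return node(Colour::Red,
                        node(Colour::Black, l, k, r->left),
                        r->key,
                        node(Colour::Black, r->right->left, r->right->key, r->right->right));
    }
    return node(c, l, k, r);
}

Tree ins(const Tree& t, int k) {
    if (!t) return node(Colour::Red, nullptr, k, nullptr);    // step 1: new red node
    if (k < t->key) return balance(t->colour, ins(t->left, k), t->key, t->right);
    if (k > t->key) return balance(t->colour, t->left, t->key, ins(t->right, k));
    return t;                                                 // key already present
}

Tree insert(const Tree& t, int k) {
    Tree s = ins(t, k);                                       // step 2 happens inside ins
    return node(Colour::Black, s->left, s->key, s->right);    // step 3: root is black
}

// Black height of t, or -1 if either balance invariant is violated.
int black_height(const Tree& t) {
    if (!t) return 0;
    if (is_red(t) && (is_red(t->left) || is_red(t->right))) return -1;
    int lh = black_height(t->left), rh = black_height(t->right);
    if (lh < 0 || lh != rh) return -1;
    return lh + (is_red(t) ? 0 : 1);
}

// Number of nodes on the longest path from t to a leaf.
int height(const Tree& t) {
    return t ? 1 + std::max(height(t->left), height(t->right)) : 0;
}
```

Because nodes are never modified after construction, every call to `insert` returns a new tree and leaves the old one usable.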

Heaps and Treaps

- Treaps are a randomized search tree that combine TRees and hEAPS.
- First, let's look at heaps.
- Consider determining the maximum element of a set.
  - We could iterate through the array and keep track of the maximum element seen so far. Time taken: Θ(n).
  - We could build a binary search tree (e.g. red-black). We can obtain the maximum (minimum) element in O(h) time by following rightmost (leftmost) branches. If the tree is balanced, this requires O(n log n) time to build the tree, and O(log n) time to retrieve the maximum element.
  - A heap is a highly efficient data structure for maintaining the maximum element of a set. It is a rudimentary example of a dynamic algorithm/data structure.

Dynamic Algorithms

- A static problem is one where we are given an instance of a problem to solve, we solve it, and are done (e.g., an array).
- A dynamic problem is one where we are given a problem to solve, we solve it.
  - Then the problem is changed slightly and we re-solve.
  - ...ad infinitum.
- The challenge goes from solving a single instance of a problem to maintaining a solution as the problem is modified.
- It is usually more efficient to update the solution than to recompute from scratch.
- e.g., binary search trees can be viewed as a method for dynamically maintaining an ordered list as elements are inserted and removed.

Heaps

- A heap dynamically maintains the maximum element in a collection (or, dually, the minimum element). A binary heap can:
  - Obtain the maximum element in O(1) time;
  - Remove the maximum element in O(log n) time;
  - Insert a new element in O(log n) time.
  Heaps are a natural implementation of the PriorityQueue ADT.
- There are several flavours of heaps: binary heaps, binomial heaps, Fibonacci heaps, pairing heaps. The more sophisticated of these support merging (melding) two heaps.
- We will look at binary heaps.

Binary Heap Invariants

1. A binary heap is a complete binary tree of height h − 1, plus a possibly incomplete level of height h filled from left to right.
2. The key stored at each node is ≥ the key(s) stored in its children.

Binary Heap

- A binary heap may be stored as a (1-based) array, where
  - Parent(j) = ⌊j/2⌋
  - LeftChild(i) = 2 ∗ i
  - RightChild(i) = 2 ∗ i + 1
- e.g., [17, 11, 13, 9, 6, 2, 12, 4, 3, 1] is an array representation of the heap:

                17
           11        13
         9    6    2    12
        4 3  1

Heap operations

- To insert a key k into the heap:
  - Place k at the next available position.
  - Swap k with its parent(s) until the heap invariant is satisfied. (Takes O(log n) time.)
- The maximum element is just the key stored at the root, which can be read off in O(1) time.
- To delete the maximum element:
  - Place the key at the last heap position at the root (overwriting the current maximum), and decrease the size of the heap by one.
  - Choose the largest of the root and its two children, and make this the root; perform this procedure recursively until the heap invariant is satisfied.
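The operations above can be sketched as a small class over a 1-based array. This is an illustrative sketch, not the lecture's code: the class name `BinaryHeap`, the method names, the use of `int` keys and the unused slot 0 (which keeps the index arithmetic identical to the slide) are all assumptions. The sift-down is written as a loop where the slide describes it recursively.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Max-heap stored in a 1-based array: slot 0 is unused so that
// Parent(j) = j/2, LeftChild(i) = 2*i, RightChild(i) = 2*i + 1.
class BinaryHeap {
public:
    BinaryHeap() : a_(1) {}

    std::size_t size() const { return a_.size() - 1; }
    bool empty() const { return size() == 0; }

    int max() const {                       // O(1): the key at the root
        assert(!empty());
        return a_[1];
    }

    void insert(int k) {                    // O(log n)
        a_.push_back(k);                    // next available position
        std::size_t j = size();
        while (j > 1 && a_[j / 2] < a_[j]) {
            std::swap(a_[j / 2], a_[j]);    // swap with parent
            j /= 2;
        }
    }

    int delete_max() {                      // O(log n)
        assert(!empty());
        int top = a_[1];
        a_[1] = a_.back();                  // last key overwrites the root
        a_.pop_back();                      // heap shrinks by one
        std::size_t i = 1, n = size();
        for (;;) {
            std::size_t largest = i, l = 2 * i, r = 2 * i + 1;
            if (l <= n && a_[l] > a_[largest]) largest = l;
            if (r <= n && a_[r] > a_[largest]) largest = r;
            if (largest == i) break;        // invariant restored
            std::swap(a_[i], a_[largest]);
            i = largest;
        }
        return top;
    }

    // Invariant 2: every key is >= the keys of its children.
    bool is_heap() const {
        for (std::size_t j = 2; j <= size(); ++j)
            if (a_[j / 2] < a_[j]) return false;
        return true;
    }

private:
    std::vector<int> a_;
};
```

Invariant 1 (the shape) holds automatically here, because the array is always filled contiguously from position 1.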

Heap: insert example

- Example: insert 23 into the heap and restore the heap invariant.

[Figure not recovered]

Heap: delete-max example

[Figure not recovered]

- To delete the max element, move the element from the last position (2) to the root;
- To restore the heap invariant, swap the root with its largest child if that child is greater than it, and repeat down the heap.

Treaps

Treaps (binary TRee + hEAP)
- a randomized binary search tree
- with O(log n) average-case insert, delete, search
- with O(∆ log n) average-case union, intersection, ⊆, ⊇, where ∆ = |(A \ B) ∪ (B \ A)| is the difference between the sets
- uniquely represented (to be explained)
- easily made persistent (to be explained)
- Due to Vuillemin [14] and, independently, Seidel and Aragon [11]. Additional references: [3, 16, 15].

Treaps: Basics

- Keys are assigned (randomly chosen) priorities.
- Two total orders on keys:
  - The usual key order;
  - A randomly chosen priority order, often obtained by assigning each key a random integer, or using an appropriate hash function.
- Treaps are kept sorted by key in the usual way (inorder traversal visits keys in order).
- The heap property is maintained with respect to the priority order.

Treap ordering

- Each node has key k and priority p.
- Ordering invariants:

            (k2, p2)
           /        \
    (k1, p1)        (k3, p3)

    Key order:       $k_1 \le k_2 \le k_3$
    Priority order:  $p_2 \ge_p p_1$ and $p_2 \ge_p p_3$

Every node has a higher priority than its descendants.

Treaps: Basics

- If priorities are chosen randomly, the tree is on average balanced, and insert, delete, search take O(log n) time.
- Random priorities behave like a random insertion order: the structure of the treap is exactly that obtained by inserting the keys into a binary search tree in descending order of heap priority.
- If keys are unique (no duplicates), and priorities are unique, then the treap has the unique representation property.
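A minimal sketch of treap insertion: insert as in an ordinary binary search tree, then rotate the new node upward while its priority exceeds its parent's. Everything named here is an assumption for illustration. In particular, `priority_of` is a simple invertible integer mix standing in for "an appropriate hash function", chosen so that distinct keys always get distinct priorities; `shape` and `is_treap` exist only to inspect the result.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

struct Node {
    int key;
    std::uint32_t priority;
    std::unique_ptr<Node> left, right;
};
using Link = std::unique_ptr<Node>;

// Assumed priority function: multiplication by an odd constant and an
// xor-shift are both invertible on 32 bits, so priorities never collide.
std::uint32_t priority_of(int key) {
    std::uint32_t x = static_cast<std::uint32_t>(key);
    x *= 2654435761u;
    x ^= x >> 16;
    return x;
}

void rotate_right(Link& t) {        // left child becomes the subtree root
    Link l = std::move(t->left);
    t->left = std::move(l->right);
    l->right = std::move(t);
    t = std::move(l);
}

void rotate_left(Link& t) {         // right child becomes the subtree root
    Link r = std::move(t->right);
    t->right = std::move(r->left);
    r->left = std::move(t);
    t = std::move(r);
}

void insert(Link& t, int key) {
    if (!t) {
        t = std::make_unique<Node>();
        t->key = key;
        t->priority = priority_of(key);
        return;
    }
    if (key < t->key) {
        insert(t->left, key);
        if (t->left->priority > t->priority) rotate_right(t);   // restore heap order
    } else if (key > t->key) {
        insert(t->right, key);
        if (t->right->priority > t->priority) rotate_left(t);
    }                                                           // equal key: nothing to do
}

// Preorder listing with explicit empty subtrees; equal strings mean equal shapes.
std::string shape(const Link& t) {
    if (!t) return ".";
    return "(" + std::to_string(t->key) + " " + shape(t->left) + " " + shape(t->right) + ")";
}

// Checks the ordering invariants between every node and its children.
bool is_treap(const Link& t) {
    if (!t) return true;
    if (t->left && (t->left->key >= t->key || t->left->priority > t->priority)) return false;
    if (t->right && (t->right->key <= t->key || t->right->priority > t->priority)) return false;
    return is_treap(t->left) && is_treap(t->right);
}
```

Inserting the same keys in two different orders and comparing `shape` of the results is a quick way to observe the unique representation property.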

Unique representation

- Unique representation: each set is represented by a unique data structure [1, 13, 12].
- Most tree data structures do not have this property: depending on the order of inserts, deletes, etc., the tree can have different forms for the same set of keys.
- Recall there are $C_n \sim 4^n n^{-3/2} \pi^{-1/2}$ ways to place n keys in a binary search tree (Catalan numbers), e.g. $C_{20} = 6564120420$.
- Deterministic (i.e., not randomized) uniquely represented search trees are known to require $\Omega(\sqrt{n})$ worst-case time for insert, delete, search [12].
- Treaps are randomized (not deterministic), and have O(log n) average-case time for insert, delete, search.
- If you memoize or cache the constructors of a uniquely represented data structure, you can do equality testing in O(1) time by comparing pointers.

Treap: Example

    Treap A1 = R.insert("f");   // Insert the key f into R
    Treap A2 = A1.insert("u");  // Insert the key u into A1

    Treap B1 = R.insert("u");   // Insert the key u into R
    Treap B2 = B1.insert("f");  // Insert the key f into B1
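One way a `Treap` type with this interface could be realized is sketched below. It is an assumption-laden illustration, not the lecture's implementation: nodes are immutable and shared through `std::shared_ptr`, `insert` copies only the nodes on the search path, priorities come from `std::hash<std::string>` with ties broken by key, and `operator==` compares structure node by node (it is not the O(1) pointer comparison that memoized constructors would give).

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>

class Treap {
public:
    Treap() = default;                                   // the empty treap

    // Returns a new version; *this is left unchanged.
    Treap insert(const std::string& key) const { return Treap(ins(root_, key)); }

    // Structural equality: same keys arranged in the same shape.
    friend bool operator==(const Treap& a, const Treap& b) {
        return Treap::same(a.root_, b.root_);
    }

    std::size_t size() const { return count(root_); }

private:
    struct Node;
    using Link = std::shared_ptr<const Node>;
    struct Node {
        std::string key;
        std::size_t priority;
        Link left, right;
    };

    explicit Treap(Link r) : root_(std::move(r)) {}

    static Link make(const std::string& k, std::size_t p, Link l, Link r) {
        return std::make_shared<Node>(Node{k, p, std::move(l), std::move(r)});
    }

    // Priority order; hash ties are broken by key so the order is total.
    static bool higher(const Node& a, const Node& b) {
        return a.priority != b.priority ? a.priority > b.priority : a.key > b.key;
    }

    static Link ins(const Link& t, const std::string& key) {
        if (!t) return make(key, std::hash<std::string>{}(key), nullptr, nullptr);
        if (key < t->key) {
            Link l = ins(t->left, key);
            if (higher(*l, *t))      // right rotation, done by rebuilding two nodes
                return make(l->key, l->priority, l->left,
                            make(t->key, t->priority, l->right, t->right));
            return make(t->key, t->priority, l, t->right);
        }
        if (t->key < key) {
            Link r = ins(t->right, key);
            if (higher(*r, *t))      // left rotation
                return make(r->key, r->priority,
                            make(t->key, t->priority, t->left, r->left), r->right);
            return make(t->key, t->priority, t->left, r);
        }
        return t;                    // key already present: share the subtree
    }

    static bool same(const Link& a, const Link& b) {
        if (a == b) return true;     // shared subtree, or both empty
        if (!a || !b) return false;
        return a->key == b->key && same(a->left, b->left) && same(a->right, b->right);
    }

    static std::size_t count(const Link& t) {
        return t ? 1 + count(t->left) + count(t->right) : 0;
    }

    Link root_;
};
```

With this sketch, building {f, u} as R, A1, A2 and as R, B1, B2 yields two treaps that compare equal, while R, A1 and B1 all remain valid and unchanged.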

[Figures not recovered: three slides illustrating the treap example]

Canonical forms

- The structure of the treap does not depend on the order in which the operations are carried out.
- Treaps give a canonical form for sets: if A, B are sets, we can determine whether A = B by constructing treaps containing the elements of A and B, and comparing them. If the treaps are the same, the sets are equal.
- Treaps give an easy decision procedure for equality of terms modulo associativity, commutativity, and idempotency.
- Treaps are very useful in program analysis (e.g., for compilers) for solving fixpoint equations on sets.

Persistent Data Structures

Literature: [7, 8, 4, 6]

- Partially persistent: Can access previous versions of a data structure, but cannot derive new versions from them (read-only access to a linear past).
- Fully persistent: Can make changes in previous versions of the data structure: versions can "fork."
  - Any linked data structure with constant bounded in-degree can be made fully persistent with amortized O(1) space and time overhead, and worst case O(1) overhead for access [7].
- Confluently persistent: Can branch into two versions of the data structure, and later reconcile these branches.

The Version Graph

The version graph shows how versions of a data structure are derived from one another.

- Vertices: Data structures
- Edges: Show how one data structure was derived from another
- Treaps example:

          R
        /   \
      A1     B1
      |      |
      A2     B2

Version graph

- Partial persistence: version graph is a linear sequence of versions, each derived from the previous version.
- Partial/full persistence: get a version tree
- Confluent persistence: get a version DAG (directed acyclic graph)

          X
        /   \
      Y1     Z
      |      |
      Y2     |
        \   /
          W

Purely Functional Data Structures

- Literature: [10]
- Functional data structures: cannot modify a node of the data structure once it is created. (One implication: no cyclic data structures.)
- Functional data structures are by nature partially persistent: we can always hold onto pointers to old versions of the data structure.

Scopes

- Partial persistence is very useful for managing scopes in compilers and program analysis.
- A scope is a representation of the names that are visible at a given program point:

    int foo(int a, int b)
    {                                    // S1
        int x = a*a, y = b*b, z = 0;     // S2

        for (int k = 0; k < x; ++k)      // S3
            for (int l = 0; l < y; ++l)  // S4
                ++z;                     // S5
        return x;
    }

Scopes Example

[Figure not recovered]
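A minimal sketch of scopes as a persistent structure. For brevity it uses an immutable linked list of bindings, not a treap, and the names `Binding`, `Scope`, `declare` and `lookup` are assumptions made for this illustration. The point it shows is that entering a nested scope creates a new version that shares all of the enclosing scope, so leaving the scope is just a matter of going back to the old pointer.

```cpp
#include <memory>
#include <string>
#include <utility>

// One declaration per node; a scope is a pointer to its most recent binding.
struct Binding {
    std::string name;
    std::string type;
    std::shared_ptr<const Binding> outer;   // enclosing bindings, shared and never modified
};
using Scope = std::shared_ptr<const Binding>;

// Extending a scope yields a new version; the old version stays valid.
Scope declare(const Scope& s, std::string name, std::string type) {
    return std::make_shared<Binding>(Binding{std::move(name), std::move(type), s});
}

// The innermost declaration wins; returns nullptr if the name is not visible.
const Binding* lookup(const Scope& s, const std::string& name) {
    for (const Binding* b = s.get(); b != nullptr; b = b->outer.get())
        if (b->name == name) return b;
    return nullptr;
}
```

For the function `foo`, one would start from the empty scope (a null `Scope`), declare `a` and `b` to obtain S1, extend S1 with `x`, `y`, `z` to obtain S2, extend S2 with `k` for S3, and extend S3 with `l` for S4. A lookup of `k` then succeeds in S3 and S4 but fails in S2, even though all three versions share the same underlying nodes.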

Bibliography

[1] A. Andersson and T. Ottmann. Faster uniquely represented dictionaries. In Proceedings: 32nd Annual Symposium on Foundations of Computer Science, San Juan, Puerto Rico, October 1–4, 1991, pages 642–649. IEEE Computer Society Press, 1991.

[2] Rudolf Bayer. Symmetric binary B-trees: Data structure and maintenance algorithms. Acta Informatica, 1:290–306, 1972.

[3] Guy E. Blelloch and Margaret Reid-Miller. Fast set operations using treaps. In Proceedings of the 10th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 16–26, Puerto Vallarta, Mexico, June 1998.

[4] Adam L. Buchsbaum and Robert E. Tarjan. Confluently persistent deques via data-structural bootstrapping. In Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 155–164. ACM Press, 1993.

[5] Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms. McGraw Hill, 1991.

[6] P. F. Dietz. Fully persistent arrays. In F. Dehne, J.-R. Sack, and N. Santoro, editors, Proceedings of the Workshop on Algorithms and Data Structures, volume 382 of LNCS, pages 67–74, Berlin, August 1989. Springer.

[7] James R. Driscoll, Neil Sarnak, Daniel Dominic Sleator, and Robert Endre Tarjan. Making data structures persistent. In ACM Symposium on Theory of Computing, pages 109–121, 1986.

[8] Amos Fiat and Haim Kaplan. Making data structures confluently persistent. In Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA-01), pages 537–546, New York, January 7–9 2001. ACM Press.

[9] Leonidas J. Guibas and Robert Sedgewick. A dichromatic framework for balanced trees. In FOCS, pages 8–21. IEEE, 1978.

[10] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, Cambridge, UK, 1998.

[11] Raimund Seidel and Cecilia R. Aragon. Randomized search trees. Algorithmica, 16(4/5):464–497, 1996.

[12] Lawrence Snyder. On uniquely representable data structures. In 18th Annual Symposium on Foundations of Computer Science, pages 142–146, Long Beach, Ca., USA, October 1977. IEEE Computer Society Press.

[13] R. Sundar and R. E. Tarjan. Unique binary search tree representations and equality-testing of sets and sequences. In Baruch Awerbuch, editor, Proceedings of the 22nd Annual ACM Symposium on the Theory of Computing, pages 18–25, Baltimore, MD, May 1990. ACM Press.

[14] Jean Vuillemin. A unifying look at data structures. Communications of the ACM, 23(4):229–239, 1980.

[15] M. A. Weiss. A note on construction of treaps and Cartesian trees. Information Processing Letters, 54(2):127–127, April 1995.

[16] Mark Allen Weiss. Linear-time construction of treaps and Cartesian trees. Information Processing Letters, 52(5):253–257, December 1994.