ECE750-TXB Lecture 8: Treaps, Tries, and Hash Tables
Todd L. Veldhuizen
Electrical & Computer Engineering, University of Waterloo, Canada
February 1, 2007

Review: Treaps

- Recall that a binary search tree has keys drawn from a totally ordered structure ⟨K, ≤⟩.
- An inorder traversal of the tree recovers the keys in ascending order.

              d
            /   \
           b     h
          / \   / \
         a   c f   i

- Recall that a heap has priorities drawn from a totally ordered structure ⟨P, ≤⟩.
- The priority of a parent is ≥ that of its children (for a max heap).
- The largest priority is at the root.

              23
            /    \
          11      14
         /  \    /  \
        7    1  6    13

- In a treap, nodes contain a pair (k, p), where k ∈ K is a key and p ∈ P is a priority.
- A treap is a mixture of a binary search tree and a heap:
    - a binary search tree with respect to keys;
    - a heap with respect to priorities.

                  (d,23)
                /        \
           (b,11)        (h,14)
           /    \        /    \
       (a,7)  (c,1)  (f,6)  (i,13)

Review: Unique Representation

- If the keys and priorities are unique, then treaps have the unique representation property: given a set of (k, p) pairs, there is only one way to build the tree.
- For the heap property to be satisfied, there is only one (k, p) pair that can be the root: the one with the highest priority.
- The left subtree of the root will contain all keys < k, and the right subtree of the root will contain all keys > k.
- Of the keys < k, the one with the highest priority must occupy the left child of the root. This then splits constructing the left subtree into two subproblems.
- And so on, recursively, for each remaining subtree.

- Example: to build a treap from {(i, 13), (c, 1), (d, 23), (b, 11), (h, 14), (a, 7), (f, 6)}, the unique choice of root is (d, 23):

                          (d, 23)
                        /         \
    {(c, 1), (b, 11), (a, 7)}   {(i, 13), (h, 14), (f, 6)}

- To build the left subtree, pick out the highest priority element: (b, 11). And so forth.

                  (d, 23)
                /         \
           (b, 11)      {(i, 13), (h, 14), (f, 6)}
           /     \
       (a, 7)  (c, 1)

- Data structures with the unique representation property can be checked for equality in O(1) time by using caching (also known as memoization):
    - Implement the data structure in a purely functional style: a node's fields are never altered after construction, and any change requires creating a new node.
    - Maintain a map from (key, priority, lchild, rchild) tuples to already constructed nodes.
    - Before constructing a node, check the cache to see if it already exists; if so, return the pointer to that node. Otherwise, construct the node and add it to the cache.
    - If two treaps contain the same (k, p) pairs, their root pointers will be equal: this can be checked in O(1) time.
    - Checking and maintaining the cache requires additional time overhead.

Review: Balance of treaps

- Treaps are balanced if the priorities are chosen randomly.
- Recall that building a binary search tree with a random insertion order results in a tree of expected height c log n, with c ≈ 4.311.
- A treap with random priorities assigned to keys has exactly the same structure as a binary search tree created by inserting keys in descending order of priority.
    - Descending order of priority is a random order;
    - therefore treaps have expected height c log n with c ≈ 4.311.

Insertion into treaps

- Insertion for treaps is much simpler than that for red-black trees.
    1. Insert the (k, p) pair as for a binary search tree, by key alone: the new node will be placed somewhere at the bottom of the tree.
    2. Perform rotations along the path from the new leaf to the root to restore invariants:
        - If there is a node x whose right child has a higher priority, rotate left at x.
        - If there is a node x whose left child has a higher priority, rotate right at x.

- Example: the treap below has just had (e, 19) inserted as a new leaf. Rotations have not yet been performed.

                  (d,23)
                /        \
           (b,11)        (h,14)
           /    \        /    \
       (a,7)  (c,1)  (f,6)  (i,13)
                     /
                 (e,19)

- f has a left child with greater priority: rotate right at f.
- After rotating right at f:

                  (d,23)
                /        \
           (b,11)        (h,14)
           /    \        /    \
       (a,7)  (c,1)  (e,19)  (i,13)
                         \
                        (f,6)

- h has a left child with greater priority: rotate right at h.
- After rotating right at h:

                  (d,23)
                /        \
           (b,11)        (e,19)
           /    \             \
       (a,7)  (c,1)          (h,14)
                             /    \
                         (f,6)  (i,13)

- The heap invariant is satisfied: all done.

- Treaps are easily made persistent (retaining previous versions) by implementing them in a purely functional style. Insertion requires duplicating at most a sequence of nodes from the root to a leaf: an O(log n) space overhead. The remaining parts of the tree are shared.
- E.g., the previous insert done in a purely functional style:

    [Figure: Version 1 and Version 2 of the treap, each with its own root (d,23). Version 2 adds (e,19); the nodes (b,11), (h,14), (a,7), (c,1), (f,6), (i,13) appear only once in the figure.]

Strings

- A string is a sequence of characters drawn from some alphabet Σ. We will often use Σ = {0, 1}: binary strings.
- We write Σ* to mean all finite strings¹ composed of characters from Σ. (* is the Kleene closure.)
- Σ* contains the empty string ε.
- If w, v ∈ Σ* are strings, we write w · v or just wv to mean the concatenation of w and v.
- Example: given w = 010 and v = 11, w · v = 01011.

⟨Σ*, ·, ε⟩ is an example of a monoid: a set (Σ*) together with an associative binary operator (·) and an identity element (ε). For any strings u, v, w ∈ Σ*,

    u · (v · w) = (u · v) · w
    εv = vε = v

¹ Infinite strings are very useful also: if we write a real number x ∈ [0, 1] as a binary number, e.g. 0.101100101000···, this is a representation of x by an infinite string from Σ^ω.

Tries

- Recall that we may label the left and right links of a binary tree with 0 (for left) and 1 (for right):

             /\
          0 /  \ 1
           x   /\
            0 /  \ 1
             y    z

- To describe a path in the tree, one can list the sequence of left/right branches to take from the root. E.g., 10 gives y, 11 gives z.
- The set of all paths from the root to leaves is P◦ = {0, 10, 11}.
- The set of all paths from the root to leaves or internal nodes is P• = {ε, 0, 1, 10, 11}, where ε is the empty string indicating the path starting and ending at the root.
- The set P◦ is prefix-free: no string is an initial segment of any other string. Otherwise, there would be a path to a leaf passing through another leaf!
- The set P• is prefix-closed: if wv ∈ P•, then w ∈ P• also; i.e., P• contains all prefixes of all strings in P•.²

² We can define • as an operator by A• ≡ {w : wv ∈ A}. • is a closure operator. A useful fact: every closure operator has as its range a complete lattice, where meet and join are given by (X ⊓ Y)• = X• ∩ Y• and (X ⊔ Y)• = (X• ∪ Y•)•.
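
The prefix-free and prefix-closed properties can be checked mechanically on the three-leaf example tree. The following is a minimal Python sketch, not part of the lecture: it assumes paths are represented as ordinary strings over {0, 1}, with the empty string "" standing for ε, and the function names (`prefixes`, `prefix_closure`, `is_prefix_free`, `is_prefix_closed`) are hypothetical names chosen for illustration.

```python
def prefixes(w):
    """Return every prefix of w, including the empty string and w itself."""
    return {w[:i] for i in range(len(w) + 1)}


def prefix_closure(strings):
    """Closure operator from the footnote: all prefixes of all given strings."""
    closed = set()
    for w in strings:
        closed |= prefixes(w)
    return closed


def is_prefix_free(strings):
    """True if no string is a proper initial segment of another string."""
    return not any(
        u != w and w.startswith(u) for u in strings for w in strings
    )


def is_prefix_closed(strings):
    """True if the set already contains every prefix of its members."""
    return prefix_closure(strings) == set(strings)


# Leaf paths of the example tree: x at 0, y at 10, z at 11.
leaf_paths = {"0", "10", "11"}

# Closing the leaf paths under prefixes adds the root ("") and the
# internal node reached by path 1.
all_paths = prefix_closure(leaf_paths)
```

Here `all_paths` plays the role of P• and `leaf_paths` the role of P◦. Applying `prefix_closure` a second time changes nothing, which is the idempotence expected of a closure operator.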