Balanced Trees Part One
Total Page:16
File Type:pdf, Size:1020Kb
Balanced Trees Part One Balanced Trees ● Balanced search trees are among the most useful and versatile data structures. ● Many programming languages ship with a balanced tree library. ● C++: std::map / std::set ● Java: TreeMap / TreeSet ● Python: OrderedDict ● Many advanced data structures are layered on top of balanced trees. ● We'll see them used to build y-Fast Tries later in the quarter. (They’re really cool, trust me!) Where We're Going ● B-Trees (Today) ● A simple type of balanced tree developed for block storage. ● Red/Black Trees (Today) ● The canonical balanced binary search tree. ● Augmented Search Trees (Tuesday) ● Adding extra information to balanced trees to supercharge the data structure. ● Two Advanced Operations (Tuesday) ● Splitting and joining BSTs. Outline for Today ● BST Review ● Refresher on basic BST concepts and runtimes. ● Overview of Red/Black Trees ● What we're building toward. ● B-Trees and 2-3-4 Trees ● A simple balanced tree in depth. ● Intuiting Red/Black Trees ● A much better feel for red/black trees. A Quick BST Review Binary Search Trees ● A binary search tree is a binary tree with the following properties: ● Each node in the BST stores a key, and optionally, some auxiliary information. ● The key of every node in a BST is strictly greater than all keys to its left and strictly smaller than all keys to its right. ● The height of a binary search tree is the length of the longest path from the root to a leaf, measured in the number of edges. ● A tree with one node has height 0. ● A tree with no nodes has height -1, by convention. Searching a BST 137 73 271 42 161 314 60 Inserting into a BST 137 73 271 42 161 314 60 Inserting into a BST 137 73 271 42 161 314 60 166 Deleting from a BST 137 73 271 42 161 314 CaseCase 1: 1: If If the the node node has has justjust no no children, children, just just remove it. 166 remove it. Deleting from a BST 137 42 271 161 314 CaseCase 2: 2: If If the the node node has has justjust one one child, child, remove remove itit and and replace replace it it with with itsits child. child. 166 Deleting from a BST 161 42 271 166 314 CaseCase 3: 3: If If the the node node has has two two children,children, find find its its inorder inorder successorsuccessor (which (which has has zero zero or or oneone child), child), replace replace the the node's node's key with its successor's key, key with its successor's key, thenthen delete delete its its successor. successor. Runtime Analysis ● The time complexity of all these operations is O(h), where h is the height of the tree. ● That’s the longest path we can take. ● In the best case, h = O(log n) and all operations take time O(log n). ● In the worst case, h = Θ(n) and some operations will take time Θ(n). ● Challenge: How do you efficiently keep the height of a tree low? A Glimpse of Red/Black Trees Red/Black Trees ● A red/black tree is a BST with the 110 following properties: ● Every node is either red or black. 107 166 ● The root is black. ● No red node has a red child. 106 161 261 ● Every root-null path in the tree passes through the same number of black nodes. 140 Red/Black Trees ● A red/black tree is a BST with the 5 following properties: ● Every node is either red or black. 2 7 ● The root is black. ● No red node has a red child. 1 4 8 ● Every root-null path in the tree passes through the same number of black nodes. Red/Black Trees ● Theorem: Any red/black tree with n nodes has height O(log n). ● We could prove this now, but there's a much simpler proof of this we'll see later on. ● Given a fixed red/black tree, lookups can be done in time O(log n). Mutating Red/Black Trees 17 7 31 3 11 23 37 WhatWhat are are we we 13 supposedsupposed to to do do with with thisthis new new node? node? Mutating Red/Black Trees 17 7 37 3 11 23 HowHow do do we we fix fix upup thethe black-heightblack-height property? property? Fixing Up Red/Black Trees ● The Good News: After doing an insertion or deletion, can locally modify a red/black tree in time O(log n) to fix up the red/black properties. ● The Bad News: There are a lot of cases to consider and they're not trivial. ● Some questions: ● How do you memorize / remember all the different types of rotations? ● How on earth did anyone come up with red/black trees in the first place? B-Trees Generalizing BSTs ● In a binary search tree, each node stores a single key. ● That key splits the “key space” into two pieces, and each subtree stores the keys in those halves. 2 -1 4 -2 0 3 6 Values less than two Values greater than two Generalizing BSTs ● In a multiway search tree, each node stores an arbitrary number of keys in sorted order. ● A node with k keys splits the key space into k+1 regions, with subtrees for keys in each region. 0 3 5 (-∞, 0) (0, 3) (3, 5) (5, ∞) Generalizing BSTs ● In a multiway search tree, each node stores an arbitrary number of keys in sorted order. 43 46 5 19 31 45 71 83 2 3 7 11 13 17 23 29 37 41 47 53 59 61 67 73 79 89 97 ● Surprisingly, it’s a bit easier to build a balanced multiway tree than it is to build a balanced BST. Let’s see how. Balanced Multiway Trees ● In some sense, building a balanced multiway tree isn’t all that hard. ● We can always just cram more keys into a single node! 23 26 31 41 53 58 59 62 84 93 97 ● At a certain point, this stops being a good idea – it’s basically just a sorted array. Balanced Multiway Trees ● What could we do if our nodes get too big? ● Option 1: Push keys 23 26 31 41 53 58 59 84 93 97 down into new nodes. ● Option 2: Split big 62 nodes, kicking keys higher up. 58 ● What are some advantages of each approach? 23 26 31 41 53 59 62 84 93 97 ● What are some disadvantages? Balanced Multiway Trees ● Option 1: Push keys down into new nodes. ● Simple to implement. 10 50 99 ● Can lead to tree imbalances. 20 30 40 Option 2: Split big nodes, kicking values higher up. 31 35 39 Keeps the tree balanced. 32 33 34 Keeps most nodes near the bottom. Balanced Multiway Trees ● Option 1: Push keys down into new nodes. ● Simple to implement. ● Can lead to tree imbalances. ● Option 2: Split big nodes, kicking keys higher up. ● Keeps the tree balanced. ● Slightly trickier to 10 20 50 99 implement. 40 30 31 39 35 32 33 34 Balanced Multiway Trees ● Option 1: Push keys down into new nodes. ● Simple to implement. ● Can lead to tree imbalances. ● Option 2: Split big nodes, kicking keys higher up. ● Keeps the tree 50 balanced. ● Slightly trickier to 10 20 99 implement. 40 30 31 39 35 32 33 34 Balanced Multiway Trees ● Option 1: Push keys down into new nodes. ● Simple to implement. ● Can lead to tree imbalances. ● Option 2: Split big nodes, kicking keys higher up. ● Keeps the tree 50 balanced. ● Slightly trickier to 10 20 30 40 99 implement. 31 39 35 32 33 34 Balanced Multiway Trees ● Option 1: Push keys down into new nodes. ● Simple to implement. ● Can lead to tree imbalances. ● Option 2: Split big nodes, kicking keys higher up. ● Keeps the tree 30 50 balanced. ● Slightly trickier to 10 20 40 99 implement. 31 39 35 32 33 34 Balanced Multiway Trees ● Option 1: Push keys down into new nodes. ● Simple to implement. ● Can lead to tree imbalances. ● Option 2: Split big nodes, kicking keys higher up. ● Keeps the tree 30 50 balanced. ● Slightly trickier to 10 20 31 35 39 40 99 implement. 32 33 34 Balanced Multiway Trees ● Option 1: Push keys down into new nodes. ● Simple to implement. ● Can lead to tree imbalances. ● Option 2: Split big nodes, kicking keys higher up. ● Keeps the tree 30 39 50 balanced. ● Slightly trickier to 10 20 31 35 40 99 implement. 32 33 34 Balanced Multiway Trees ● Option 1: Push keys down into new nodes. ● Simple to implement. ● Can lead to tree imbalances. ● Option 2: Split big nodes, kicking keys higher up. ● Keeps the tree 30 39 50 balanced. ● Slightly trickier to 10 20 31 32 33 35 40 99 implement. 34 Balanced Multiway Trees ● Option 1: Push keys down into new nodes. ● Simple to implement. ● Can lead to tree imbalances. ● Option 2: Split big nodes, kicking keys higher up. ● Keeps the tree 30 33 39 50 balanced. ● Slightly trickier to 10 20 31 32 35 40 99 implement. 34 Balanced Multiway Trees ● Option 1: Push keys down into new nodes. ● Simple to implement. ● Can lead to tree imbalances. ● Option 2: Split big nodes, kicking keys higher up. ● Keeps the tree 30 33 39 50 balanced. ● Slightly trickier to 10 20 31 32 35 40 99 implement. 34 Balanced Multiway Trees ● Option 1: Push keys down into new nodes. ● Simple to implement. ● Can lead to tree imbalances. ● Option 2: Split big nodes, kicking keys 39 higher up. ● Keeps the tree 30 33 50 balanced.