2-3-4 Trees and Red-Black Trees

CS 16: Balanced Trees

2-3-4 Trees Revealed
• Nodes store 1, 2, or 3 keys and have 2, 3, or 4 children, respectively
• All leaves have the same depth
• $\frac{1}{2}\log(N + 1) \le \text{height} \le \log(N + 1)$
[Figure: an example 2-3-4 tree]

2-3-4 Tree Nodes
• Introduction of nodes with more than 1 key, and more than 2 children
• 2-node (key a): same as a binary node; links to the subtrees <a and >a
• 3-node (keys a b): 2 keys, 3 links; subtrees <a, between a and b (>a and <b), and >b
• 4-node (keys a b c): 3 keys, 4 links; subtrees <a, between a and b, between b and c, and >c

Why 2-3-4?
• Why not minimize height by maximizing children in a “d-tree”?
• Let each node have d children so that we get $O(\log_d N)$ search time! Right? $\log_d N = \log N / \log d$
• That means if $d = N^{1/2}$, we get a height of 2
• However, searching out the correct child on each level requires $O(\log N^{1/2})$ by binary search
• $2 \log N^{1/2} = O(\log N)$, which is not as good as we had hoped for!
• 2-3-4 trees will guarantee O(log N) height using only 2, 3, or 4 children per node

Insertion into 2-3-4 Trees
• Insert the new key at the lowest internal node reached in the search
• 2-node becomes 3-node (figure: inserting d into the node g gives d g)
• 3-node becomes 4-node (figure: inserting f into the node d g gives d f g)
• What about a 4-node?
• We can’t insert another key!

Top Down Insertion
• On our way down the tree, whenever we reach a 4-node, we break it up into two 2-nodes, and move the middle element up into the parent node
[Figure: inserting e; the 4-node d f g below n is split, and f moves up to form the node f n with children d and g]
• Now we can perform the insertion using one of the previous two cases
[Figure: e joins d, giving the node d e below f n]
• Since we follow this method from the root down to the leaf, it is called top down insertion

Splitting the Tree
As we travel down the tree, if we encounter any 4-node we will break it up into 2-nodes. This guarantees that we will never have the problem of inserting the middle element of a former 4-node into its parent 4-node.
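The top-down rule described above can be sketched in Python. This is a minimal illustration, not code from the course: the names `Node`, `split_child`, `insert`, `inorder`, and `leaf_depths` are hypothetical, a node is assumed to hold a sorted list of keys plus a (possibly empty) list of children, and equal keys are assumed to go to the left.

```python
class Node:
    """A 2-3-4 tree node: 1 to 3 sorted keys; `children` is empty at the bottom level."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])

    def is_leaf(self):
        return not self.children

    def is_four_node(self):
        return len(self.keys) == 3


def split_child(parent, i):
    """Split the 4-node parent.children[i] into two 2-nodes; the middle key moves up."""
    child = parent.children[i]
    left = Node(child.keys[:1], child.children[:2])
    right = Node(child.keys[2:], child.children[2:])
    parent.keys.insert(i, child.keys[1])
    parent.children[i:i + 1] = [left, right]


def insert(root, key):
    """Insert key top-down and return the (possibly new) root."""
    if root is None:
        return Node([key])
    if root.is_four_node():
        # Splitting the root is the only step that makes the tree taller.
        root = Node([], [root])
        split_child(root, 0)
    node = root
    while not node.is_leaf():
        i = sum(1 for k in node.keys if k < key)  # child to descend into
        if node.children[i].is_four_node():
            split_child(node, i)       # node is never a 4-node here, so it has room
            if key > node.keys[i]:     # the key that moved up decides the side
                i += 1
        node = node.children[i]
    node.keys.append(key)              # the bottom node has at most 2 keys here
    node.keys.sort()
    return root


def inorder(node):
    """Return all keys in sorted order."""
    if node is None:
        return []
    if node.is_leaf():
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out += inorder(child)
        if i < len(node.keys):
            out.append(node.keys[i])
    return out


def leaf_depths(node, depth=0):
    """Return the set of depths at which bottom-level nodes occur."""
    if node.is_leaf():
        return {depth}
    return set().union(*(leaf_depths(c, depth + 1) for c in node.children))
```

Because every 4-node is split before the search descends into it, the parent always has room for the key that moves up, which is the guarantee the slide states. After any sequence of calls to `insert`, `leaf_depths(root)` should contain a single value.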
[Figures: a step-by-step example in which each 4-node met on the way down the tree (flagged “Whoa, cowboy”) is split before the descent continues]

Time Complexity of Insertion in 2-3-4 Trees
Time complexity:
• A search visits O(log N) nodes
• An insertion requires O(log N) node splits
• Each node split takes constant time
• Hence, operations Search and Insert each take time O(log N)
Notes:
• Instead of doing splits top-down, we can perform them bottom-up starting at the insertion node, and only when needed. This is called bottom-up insertion.
• A deletion can be performed by fusing nodes (inverse of splitting), and takes O(log N) time

Beyond 2-3-4 Trees
What do we know about 2-3-4 Trees?
• Balanced
• O(log N) search time
• Different node structures
[Figure: pictures labeled Siskel, Ebert, and Roberto, one beside each point]
Can we get 2-3-4 tree advantages in a binary tree format???
Welcome to the world of Red-Black Trees!!!

Red-Black Tree
A red-black tree is a binary search tree with the following properties:
• edges are colored red or black
• no two consecutive red edges on any root-leaf path
• same number of black edges on any root-leaf path (= black height of the tree)
• edges connecting leaves are black
[Figure: legend showing a black edge and a red edge]

2-3-4 Tree Evolution
Note how 2-3-4 trees relate to red-black trees
[Figure: 2-3-4 nodes beside their red-black equivalents; one node type is shown with two alternative forms (“or”)]
Now we see red-black trees are just a way of representing 2-3-4 trees!

More Red-Black Tree Properties
N: # of internal nodes
L: # of leaves (= N + 1)
H: height
B: black height
Property 1: $2^B \le N + 1 \le 4^B$
Property 2: $\frac{1}{2}\log(N + 1) \le B \le \log(N + 1)$
Property 3: $\log(N + 1) \le H \le 2\log(N + 1)$
This implies that searches take time O(log N)!
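One way to see how Properties 2 and 3 follow from Property 1 is sketched below. The derivation assumes base-2 logarithms (the slides write only "log") and uses two facts: a binary tree of height H has at most $2^H$ leaves, and H is at most 2B because red edges never occur consecutively and the last edge on every root-leaf path is black.

```latex
\begin{align*}
2^{B} \le N + 1 \le 4^{B}
  &\;\Longrightarrow\; B \le \log_2 (N + 1) \le 2B \\
  &\;\Longrightarrow\; \tfrac{1}{2}\log_2 (N + 1) \le B \le \log_2 (N + 1)
  && \text{(Property 2)} \\[4pt]
N + 1 = L \le 2^{H}
  &\;\Longrightarrow\; \log_2 (N + 1) \le H \\
H \le 2B \le 2\log_2 (N + 1)
  &\;\Longrightarrow\; \log_2 (N + 1) \le H \le 2\log_2 (N + 1)
  && \text{(Property 3)}
\end{align*}
```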
Insertion into Red-Black Trees
1. Perform a standard search to find the leaf where the key should be added
2. Replace the leaf with an internal node with the new key
3. Color the incoming edge of the new node red
4. Add two new leaves, and color their incoming edges black
5. If the parent had an incoming red edge, we now have two consecutive red edges! We must reorganize the tree to remove that violation. What must be done depends on the sibling of the parent.
[Figure: a new node below a parent whose incoming edge is red]

Insertion - Plain and Simple
Let:
• n be the new node
• p be its parent
• g be its grandparent
Case 1: Incoming edge of p is black
• No violation
• STOP!
[Figure: g, p, and n with a black edge into p]
Pretty easy, huh? Well... it gets messier...

Restructuring
Case 2: Incoming edge of p is red, and the incoming edge of p's sibling is black
[Figure: before (“Uh oh!”), n below p below g; after (“Much Better!”), p on top with children n and g]
We call this a “right rotation”
• No further work on tree is necessary
• Inorder remains unchanged
• Tree becomes more balanced
• No two consecutive red edges!

More Rotations
Similar to a right rotation, we can do a “left rotation”...
[Figure: g, p, and n before and after a left rotation; p ends up on top with children g and n]
Simple, huh?

Double Rotations
What if the new node is between its parent and grandparent in the inorder sequence?
We must perform a “double rotation” (which is no more difficult than a “single” one)
[Figure: g, p, and n before and after; n ends up on top with children p and g]
This would be called a “left-right double rotation”

Last of the Rotations
And this would be called a “right-left double rotation”
[Figure: g, p, and n before and after; n ends up on top with children g and p]

Bottom-Up Rebalancing
Case 3: Incoming edge of p is red and the incoming edge of its sibling is also red
[Figure: g, p, and n before and after the recoloring]
• We call this a “promotion”
• Note how the black depth remains unchanged for all of the descendants of g
• This process will continue upward beyond g if necessary: rename g as n and repeat.
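The four rotations used for restructuring in Case 2 can be sketched in Python. This is only an illustration under simple assumptions: nodes have no parent pointers or colors, each function returns the new root of the rotated subtree, and the names `Node`, `rotate_right`, `rotate_left`, `rotate_left_right`, `rotate_right_left`, and `inorder` are hypothetical.

```python
class Node:
    """A plain binary search tree node (colors omitted for brevity)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_right(g):
    """p is g's left child: p moves up and g becomes p's right child."""
    p = g.left
    g.left = p.right      # p's old right subtree lies between p and g
    p.right = g
    return p


def rotate_left(g):
    """Mirror image: p is g's right child and g becomes p's left child."""
    p = g.right
    g.right = p.left
    p.left = g
    return p


def rotate_left_right(g):
    """n is the right child of g's left child p: rotate left at p, then right at g."""
    g.left = rotate_left(g.left)
    return rotate_right(g)


def rotate_right_left(g):
    """n is the left child of g's right child p: rotate right at p, then left at g."""
    g.right = rotate_right(g.right)
    return rotate_left(g)


def inorder(node):
    """Return the keys in inorder; every rotation leaves this list unchanged."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)
```

For example, with g = 3, p = 1, and n = 2 (so n lies between p and g), `rotate_left_right` returns the node holding 2 with children 1 and 3, and `inorder` gives the same list before and after.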
Summary of Insertion
• If two red edges are present, we do either
  • a restructuring (with a simple or double rotation) and stop, or
  • a promotion and continue
• A restructuring takes constant time and is performed at most once. It reorganizes an off-balanced section of the tree.
• Promotions may continue up the tree and are executed O(log N) times.
• The time complexity of an insertion is O(log N).

An Example
Start by inserting “REDSOX” into an empty tree
[Figure: the red-black tree holding E, D, R, O, S, X]
Now, let’s insert “C U B S”...

A Cool Example
[Figure: C is inserted; the tree before and after]

An Unbelievable Example
[Figure: U is inserted below X, producing two consecutive red edges]
Oh No! What should we do?
Double Rotation
[Figure: the tree after the double rotation, with U above S and X]

A Beautiful Example
[Figure: B is inserted below C]
What now?
Rotation
[Figure: the tree after the rotation, with C above B and D]

A Super Example
[Figure: a second S is inserted]
We could’ve placed it on either side
Holy Consecutive Red Edges, Batman!
Use the Bat-Promoter!!
BIFF!
[Figure: the tree after the promotion]
Rotation
[Figure: the final tree, with R at the root and E and U as its children]
The SUN lab and Red-Bat trees are safe... for now!!!
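The whole insertion procedure can be sketched end to end in Python and replayed on the keys of the example. This is a sketch under stated assumptions rather than the course's implementation: each node stores the color of its incoming edge, `None` stands for a leaf (whose incoming edge counts as black), equal keys go to the right, and all names (`Node`, `restructure`, `insert`, `black_height`) are hypothetical. A single `restructure` routine covers all four single and double rotations.

```python
RED, BLACK = "red", "black"

class Node:
    """Internal node; `color` is the color of the edge coming from its parent."""
    def __init__(self, key, parent=None):
        self.key, self.parent = key, parent
        self.color = RED                 # step 3: the new incoming edge is red
        self.left = self.right = None    # None stands for a leaf (black edge)

def is_red(node):
    return node is not None and node.color == RED

def restructure(n, p, g):
    """Case 2: put the middle key of n, p, g on top (single or double rotation)."""
    if p is g.left and n is p.left:      # right rotation
        a, b, c, t = n, p, g, (n.left, n.right, p.right, g.right)
    elif p is g.left:                    # left-right double rotation
        a, b, c, t = p, n, g, (p.left, n.left, n.right, g.right)
    elif n is p.right:                   # left rotation
        a, b, c, t = g, p, n, (g.left, p.left, n.left, n.right)
    else:                                # right-left double rotation
        a, b, c, t = g, n, p, (g.left, n.left, n.right, p.right)
    top = g.parent
    if top is not None:                  # hook b in where g used to hang
        if top.left is g:
            top.left = b
        else:
            top.right = b
    b.parent, b.left, b.right, b.color = top, a, c, BLACK
    for node, left, right in ((a, t[0], t[1]), (c, t[2], t[3])):
        node.parent, node.left, node.right, node.color = b, left, right, RED
        for child in (left, right):
            if child is not None:
                child.parent = node
    return b

def insert(root, key):
    """Insert key and return the root of the tree."""
    if root is None:
        root = Node(key)
        root.color = BLACK               # the root has no incoming edge
        return root
    p = root                             # steps 1-2: standard search
    while True:
        side = "left" if key < p.key else "right"
        if getattr(p, side) is None:
            break
        p = getattr(p, side)
    n = Node(key, parent=p)
    setattr(p, side, n)
    while is_red(n.parent):              # two consecutive red edges
        p, g = n.parent, n.parent.parent
        sibling = g.right if p is g.left else g.left
        if is_red(sibling):              # Case 3: promotion, then continue upward
            p.color = sibling.color = BLACK
            if g.parent is None:
                break
            g.color = RED
            n = g
        else:                            # Case 2: restructure once and stop
            b = restructure(n, p, g)
            if b.parent is None:
                root = b
            break
    return root

def black_height(node):
    """Return the number of black edges below node; raise if a property fails."""
    if node is None:
        return 0
    heights = []
    for child in (node.left, node.right):
        if is_red(node) and is_red(child):
            raise ValueError("two consecutive red edges")
        heights.append(black_height(child) + (0 if is_red(child) else 1))
    if heights[0] != heights[1]:
        raise ValueError("unequal black heights")
    return heights[0]

root = None
for ch in "REDSOXCUBS":
    root = insert(root, ch)
print(root.key, black_height(root))      # prints: R 2
```

With these assumptions the final tree has R at the root, which agrees with the last figure of the example.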
