Balanced Search Trees

CPS 100, slides 12.1 to 12.8

● Binary search trees keep keys ordered, with efficient lookup
  ➤ Insert, Delete, and Find are all O(log n) in the average case
  ➤ Worst case is bad: O(n) when the tree degenerates into a list (e.g., keys inserted in sorted order)
  ➤ Compared to hashing? Advantages?
● Balanced trees are guaranteed O(log n) in the worst case
  ➤ Fundamental operation is a rotation: it keeps the tree roughly balanced
  ➤ AVL tree was the first one, still studied because it is conceptually simple
  ➤ Red-black tree also uses rotations; harder to code, better in practice
  ➤ B-trees are used when data is stored on disk rather than in memory

Rotations and balanced trees

● Height-balanced trees
  ➤ For every node, left and right subtree heights differ by at most 1
  ➤ After insertion/deletion, need to rebalance
  ➤ Every operation leaves the tree in a balanced state: an invariant property of the tree
● Find the deepest node that’s unbalanced
  ➤ It lies on the path from the root to the inserted/deleted node
  ➤ Rebalance at this unbalanced point only

[Figure: example trees. Are these trees height-balanced?]

Rotation to rebalance

● When a node N is unbalanced, the heights of its subtrees differ by 2 (must be more than one)
  ➤ Change in N->left->left: doLeft
  ➤ Change in N->left->right: doLeftRight
  ➤ Change in N->right->left: doRightLeft
  ➤ Change in N->right->right: doRight
● First/last cases are symmetric
● Middle cases require two rotations
  ➤ First of the two puts the tree into a doLeft or doRight case

[Figure: tree rooted at N with subtrees A, B, C, before and after doLeft]

    Tree * doLeft(Tree * root)
    {
        Tree * newRoot = root->left;     // left child moves up
        root->left = newRoot->right;     // its right subtree is re-attached under the old root
        newRoot->right = root;           // old root becomes the right child
        return newRoot;
    }

Rotation to rebalance: double rotation

● Suppose we add a new node in the right subtree of the left child of the root
  ➤ A single rotation can’t fix this
  ➤ Need to rotate twice
● First stage is shown at bottom
  ➤ Rotate the blue node right (doRight)
  ➤ This is the left child of the unbalanced node

[Figure: tree rooted at N with subtrees A, B, C; below, the first stage, with B split into B1 and B2]

    Tree * doRight(Tree * root)
    {
        Tree * newRoot = root->right;    // right child moves up
        root->right = newRoot->left;     // its left subtree is re-attached under the old root
        newRoot->left = root;            // old root becomes the left child
        return newRoot;
    }

Double rotation complete

● Calculate where to rotate and which case applies, then do the rotations
  ➤ Here: doRight on the left child, then doLeft on the unbalanced node, using the two functions shown above

[Figure: the tree after each of the two rotations, with subtrees A, B1, B2, C]

Other trees

● Red-black tree uses the same rotations, but can rebalance in one pass, in contrast to the AVL tree
  ➤ In the AVL case: insert, calculate balance factors, rebalance
  ➤ A red-black tree can rebalance on the way down; the code is more complex, but doable
● Red-black trees are used in practice: efficient, guaranteed O(log n)
  ➤ Typical C++ STL implementations use a red-black tree for the map and set classes
  ➤ Standard java.util.TreeMap/TreeSet use a red-black tree
● B-trees, 2-3 trees, 2-3-4 trees (all variants of the same thing)
  ➤ Data stored on disk, need a higher branching factor than a binary search tree
  ➤ O(log n), but much faster in practice, since a shallower tree means fewer disk reads

Trie: efficient search of words/suffixes

● A trie (from retrieval, but pronounced “try”) supports
  ➤ Insertion of a word into the trie (also deletion and lookup)
  ➤ These operations are O(size of string) regardless of how many strings are stored in the trie!
● In some ways a trie is like a 128-ary (or 26-ary, or alphabet-size) tree, with one branch/edge for each character/letter
  ➤ Node stores branches to other nodes
  ➤ Node stores whether it ends the string spelled out from the root to it
● Extremely useful in DNA/string processing
  ➤ e.g., the monkeys-and-typewriter simulation, which is similar to some methods used in natural language understanding

Trie picture and code (see trie.cpp)

● To add a string
  ➤ Start at the root; for each char create a node as needed, go down the tree, mark the last node
● To find a string
  ➤ Start at the root, follow links
  ➤ If a link is Null/0, the string is not contained
  ➤ Check the word flag in the final node
● To print all nodes
  ➤ Visit every node, build the string as nodes are traversed
● What about union and intersection?

[Figure: example trie with one letter per edge; marked nodes indicate that a word ends here]
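The add, find, and print-all steps for a trie can be sketched in a few lines of C++. This is a minimal illustration, not the course's trie.cpp: the names TrieNode, trieInsert, trieContains, trieCollect, and trieWords are hypothetical, and a std::map is assumed for the branches instead of a fixed 128-entry array, which keeps the example short at the cost of an extra lookup per character.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node layout (an assumption, not taken from trie.cpp):
// one branch per character, plus a flag marking that a word ends here.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> next;
    bool isWord = false;
};

// Add a word: walk down from the root, creating nodes as needed,
// then mark the last node.
void trieInsert(TrieNode& root, const std::string& word) {
    TrieNode* node = &root;
    for (char c : word) {
        std::unique_ptr<TrieNode>& child = node->next[c];
        if (!child) child = std::make_unique<TrieNode>();
        node = child.get();
    }
    node->isWord = true;
}

// Find a word: follow links; a missing link means "not contained".
// Reaching the last character is not enough, the word flag must be set.
bool trieContains(const TrieNode& root, const std::string& word) {
    const TrieNode* node = &root;
    for (char c : word) {
        auto it = node->next.find(c);
        if (it == node->next.end()) return false;
        node = it->second.get();
    }
    return node->isWord;
}

// Visit every node, building the string as nodes are traversed.
void trieCollect(const TrieNode& node, std::string& prefix,
                 std::vector<std::string>& out) {
    if (node.isWord) out.push_back(prefix);
    for (const auto& entry : node.next) {
        prefix.push_back(entry.first);
        trieCollect(*entry.second, prefix, out);
        prefix.pop_back();
    }
}

// Convenience wrapper: all stored words, in the map's character order.
std::vector<std::string> trieWords(const TrieNode& root) {
    std::vector<std::string> out;
    std::string prefix;
    trieCollect(root, prefix, out);
    return out;
}
```

After inserting "car", "cat", and "ca" into an empty TrieNode, trieContains returns true for "cat" but false for "c" (a prefix whose node is not marked) and for "cab" (a missing link). Both insert and lookup touch one node per character, which is where the O(size of string) bound comes from.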
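Returning to the four rotation cases (doLeft, doLeftRight, doRightLeft, doRight), the sketch below shows one way they might fit into an AVL-style insert that rebalances on the way back up the recursion. The slides do not show the fields of Tree, so the info and height members, the helpers heightOf and fixHeight, and the functions avlInsert and destroy are all assumptions made for illustration. The pointer changes in doLeft and doRight follow the slide code, with height bookkeeping added.

```cpp
#include <algorithm>

// Hypothetical node type: the slides use Tree* but do not show its fields.
struct Tree {
    int info;
    Tree* left = nullptr;
    Tree* right = nullptr;
    int height = 1;                 // height of the subtree rooted here
    explicit Tree(int v) : info(v) {}
};

int heightOf(const Tree* t) { return t ? t->height : 0; }

void fixHeight(Tree* t) {
    t->height = 1 + std::max(heightOf(t->left), heightOf(t->right));
}

// Same pointer changes as the slide's doLeft, plus height updates.
Tree* doLeft(Tree* root) {
    Tree* newRoot = root->left;
    root->left = newRoot->right;
    newRoot->right = root;
    fixHeight(root);                // old root is now lower, so fix it first
    fixHeight(newRoot);
    return newRoot;
}

// Mirror image of doLeft.
Tree* doRight(Tree* root) {
    Tree* newRoot = root->right;
    root->right = newRoot->left;
    newRoot->left = root;
    fixHeight(root);
    fixHeight(newRoot);
    return newRoot;
}

// Middle cases: the first rotation turns the tree into a doLeft/doRight case.
Tree* doLeftRight(Tree* root) {
    root->left = doRight(root->left);
    return doLeft(root);
}

Tree* doRightLeft(Tree* root) {
    root->right = doLeft(root->right);
    return doRight(root);
}

// Recursive insert; each node on the path is checked while unwinding,
// so the deepest unbalanced node is repaired first. Duplicates are ignored.
Tree* avlInsert(Tree* t, int key) {
    if (t == nullptr) return new Tree(key);
    if (key < t->info)      t->left = avlInsert(t->left, key);
    else if (key > t->info) t->right = avlInsert(t->right, key);
    else return t;

    fixHeight(t);
    int diff = heightOf(t->left) - heightOf(t->right);
    if (diff > 1) {
        // Left side too tall: did the change happen in left->left or left->right?
        t = (key < t->left->info) ? doLeft(t) : doLeftRight(t);
    } else if (diff < -1) {
        // Right side too tall: right->right or right->left?
        t = (key > t->right->info) ? doRight(t) : doRightLeft(t);
    }
    return t;
}

// Free every node (the sketch allocates with new).
void destroy(Tree* t) {
    if (t == nullptr) return;
    destroy(t->left);
    destroy(t->right);
    delete t;
}
```

Inserting the keys 1 through 7 in sorted order, which would turn a plain binary search tree into a list, leaves this tree with 4 at the root and a height of 3. Inserting 3, 1, 2 exercises the doLeftRight case and ends with 2 at the root.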
