File Processing & Organization Course No: 5901227-3

File Processing & Organization Course No: 5901227-3 Lecture 8 Tries and Splay Trees Tries • A trie, also called digital tree/radix tree/prefix tree. • It is an ordered tree that is used to store a set of strings. • The term trie comes from retrieval. • Properties of a trie: – It is a multi-way tree. – The root of the tree is associated with the empty string . – Each edge of the tree is labeled with a character. – Each leaf node corresponds to the stored string, which is a concatenation of characters on a path from the root to this node. Trie Example 1 • Set of strings: {bear, bid, bulk, bull, sun, sunday}. s • all strings end with “$” b e u u i a n d r l d $ $ l a $ k y $ $ $ • In some trees the nodes can store the keys, that is, the characters on the path from root to the node. Trie Example 2 [nodes store the keys] A trie for keys "A", "tea", "to", "ted", "ten", "i", "in", and "inn". Trie Example 3 • For strings – { bear, bell, bid, bull, buy, sell, stock, stop } b s e i u e t a l d l y l o r l l l c p k • Tries are most commonly used for character strings. But they can be used for other data as well. • A bitwise trie is keyed on the individual bits making up an integer number or memory address. Compact/ Compressed Tries – Replace a chain of one-child nodes with an edge labeled with a string – Each non-leaf node (except root) has at least two children b s b sun e u u i ear$ a n day$ id$ ul $ r d l d $ $ l l$ $ a k k$ y $ $ $ Compact Tries • Compact representation of a compressed trie • Approach – For an array of strings S = S[0], … S[S-1] – Store ranges of indices at each node • Instead of substring – Represent as a triplet of integers (i, j, k) • Such that X = s[i][j..k] – Example: S[0] = “abcd”, (0,1,2) = “bc” • Properties – Uses O(s) space, where s = number of strings in the array – Serves as an auxiliary index structure Compact Representation Example • . 0 1 2 3 4 0 1 2 3 0 1 2 3 S[0] = s e e S[4] = b u l l S[7] = h e a r S[1] = b e a r S[5] = b u y S[8] = b e l l S[2] = s e l l S[6] = b i d S[9] = s t o p S[3] = s t o c k 1, 0, 0 7, 0, 3 0, 0, 0 1, 1, 1 6, 1, 2 4, 1, 1 0, 1, 1 3, 1, 2 1, 2, 3 8, 2, 3 4, 2, 3 5, 2, 2 0, 2, 2 2, 2, 3 3, 3, 4 9, 3, 3 Search and Insertion in Tries Trie-Search(t, P[k..m]) //searches string P in t 01 if t is leaf then return true 02 else if t.child(P[k])=nil then return false 03 else return Trie-Search(t.child(P[k]), P[k+1..m]) ■ The search algorithm just follows the path down the tree (starting with Trie-Search(root, P[0..m])) Trie-Insert(t, P[k..m]) 01 if t is not leaf then //otherwise P is already present 02 if t.child(P[k])=nil then 03 Create a new child of t and a “branch” starting with that child and storing P[k..m] 04 else Trie-Insert(t.child(P[k]), P[k+1..m]) 10 Splay Trees • Splay trees are binary search trees (BSTs) that: – Are not perfectly balanced all the time – Allow search and insertion operations to try to balance the tree so that future operations may run faster • Based on the heuristic: – If X is accessed once, it is likely to be accessed again. – After node X is accessed, perform “splaying” operations to bring X up to the root of the tree. – Do this in a way that leaves the tree more or less balanced as a whole. 11 Motivating Example Root Root 15 After Search(12) 12 2 6 18 Splay idea: Get 12 6 15 1 up to the root 3 12 using rotations 3 9 14 18 9 14 After splaying with 12 Initial tree • Not only splaying with 12 makes the tree balanced, subsequent accesses for 12 will take O(1) time. • Active (recently accessed) nodes will move towards the root and inactive nodes will slowly move further from the root 12 Splay Tree Terminology • Let X be a non-root node, i.e., has at least 1 ancestor. • Let P be its parent node. • Let G be its grandparent node (if it exists) • Consider a path from G to X: – Each time we go left, we say that we “zig” – Each time we go right, we say that we “zag” • There are 6 possible cases: G G G G P P P P P P X X X X X X 1. zig 2. zig-zig 3. zig-zag 4. zag-zig 5. zag-zag 6. zag 13 Splay Tree Operations • When node X is accessed, apply one of six rotation operations: – Single Rotations (X has a P but no G) • zig, zag – Double Rotations (X has both a P and a G) • zig-zig, zig-zag • zag-zig, zag-zag 14 Splay Trees: Zig Operation • “Zig” is just a single rotation, as in an AVL tree • Suppose 6 was the node that was accessed (e.g. using Search) 15 6 Zig-Right 6 18 3 15 3 12 12 18 • “Zig-Right” moves 6 to the root. • Can access 6 faster next time: O(1) • Notice that this is simply a right rotation in AVL tree terminology. 15 Splay Trees: Zig-Zig Operation • “Zig-Zig” consists of two single rotations of the same type • Suppose 3 was the node that was accessed (e.g., using Search) 15 6 3 Zig-Right Zig-Right 1 6 6 18 3 15 4 15 3 12 1 4 12 18 1 4 12 18 • Due to “zig-zig” splaying, 3 has bubbled to the top! • Note: Parent-Grandparent is rotated first. 16 Splay Trees: Zig-Zag Operation • “Zig-Zag” consists of two rotations of the opposite type • Suppose 12 was the node that was accessed (e.g., using Search) 15 Zag-Left 15 Zig-Right 12 6 18 12 18 6 15 3 10 14 18 3 12 6 14 10 14 3 10 • Due to “zig-zag” splaying, 12 has bubbled to the top! • Notice that this is simply an LR imbalance correction in AVL tree terminology (first a left rotation, then a right rotation) 17 Splay Trees: Zag-Zig Operation • “Zag-Zig” consists of two rotations of the opposite type • Suppose 17 was the node that was accessed (e.g., using Search) 15 15 Zig-Right Zag-Left 17 15 6 20 6 17 20 6 16 18 30 17 30 16 20 18 30 16 18 • Due to “zag-zig” splaying, 17 has bubbled to the top! • Notice that this is simply an RL imbalance correction in AVL tree terminology (first a right rotation, then a left rotation) 18 Splay Trees: Zag-Zag Operation • “Zag-Zag” consists of two single rotations of the same type • Suppose 30 was the node that was accessed (e.g., using Search) 15 Zag-Left 20 Zag-Left 30 15 30 6 20 20 40 6 17 25 40 15 25 17 30 6 17 25 40 • Due to “zag-zag” splaying, 30 has bubbled to the top! • Note: Parent-Grandparent is rotated first. 19 Splay Trees: Zag Operation • “Zag” is just a single rotation, as in an AVL tree • Suppose 15 was the node that was accessed (e.g., using Search) 6 15 Zag-Left 3 15 6 18 12 18 3 12 • “Zag-Left”moves 15 to the root. • Can access 15 faster next time: O(1) • Notice that this is simply a left rotation in AVL tree terminology 20 Splaying during other operations • Splaying can be done not just after Search, but also after other operations such as Insert/Delete. • Insert X: After inserting X at a leaf node (as in a regular BST), splay X up to the root • Delete X: Do a Search on X and get X up to the root. Delete X at the root and move the largest item in its left sub-tree, i.e, its predecessor, to the root using splaying. • Note on Search X: If X was not found, splay the leaf node that the Search ended up with to the root. 21.

File Processing & Organization Course No: 5901227-3

KP-Trie Algorithm for Update and Search Operations

Amortized Analysis Worst-Case Analysis

Search Trees

Splay Trees Last Changed: January 28, 2017

Leftist Heap: Is a Binary Tree with the Normal Heap Ordering Property, but the Tree Is Not Balanced. in Fact It Attempts to Be Very Unbalanced!

SPLAY Trees • Splay Trees Were Invented by Daniel Sleator and Robert Tarjan

Amortized Complexity Analysis for Red-Black Trees and Splay Trees

Performance Analysis of Bsts in System Software∗

Lecture 4 1 Overview 2 Splay Tree Properties

The Adaptive Radix Tree

Artful Indexing for Main-Memory Databases

Data Structures Lecture 11