4.3 Binary Search Trees Symbol Table

Symbol Table Challenges 4.3 Binary Search Trees Symbol table. Key-value pair abstraction. ! Insert a value with specified key. ! Search for value given key. ! Delete value with given key. Binary search trees Challenge 1. Guarantee symbol table performance. Randomized BSTs hashing analysis depends on input distribution Challenge 2. Expand API when keys are ordered. find the kth largest Reference: Chapter 12, Algorithms in Java, 3rd Edition, Robert Sedgewick. Robert Sedgewick and Kevin Wayne • Copyright © 2005 • http://www.Princeton.EDU/~cos226 2 Binary Search Trees Binary Search Trees in Java Def. A binary search tree is a binary tree in symmetric order. A BST is a reference to a node. 51 A Node is comprised of four fields: Binary tree is either: ! A key and a value. 14 72 ! Empty. ! A reference to the left and right subtree. ! A key-value pair and two binary trees. 33 53 97 smaller larger 25 43 64 84 99 root private class Node { Key key; 51 node Val val; Symmetric order: right Node l, r; left ! Keys in nodes. x } ! No smaller than left subtree. 14 68 Key and Val are generic types; ! No larger than right subtree. subtrees Key is Comparable A B 12 54 79 smaller larger 3 4 Java Implementation of BST: Skeleton Search Get. Return val corresponding to given key, or null if no such key. public class BST<Key extends Comparable, Val> { private Node root; private class Node { private Key key; public Val get(Key key) { private Val val; Node x = root; private Node l, r; while (x != null) { if ( eq(key, x.key)) return x.val; private Node(Key key, Val val) { else if (less(key, x.key)) x = x.l; this.key = key; else le ss(key, x.key))x = x.r; this.val = val; } } return null; } } private boolean less(Key k1, Key k2) { … } private boolean eq (Key k1, Key k2) { … } public void put(Key key, Val val) { … } public Val get(Key key) { … } } 5 6 BST: Insert BST: Construction Put. Associate val with key. Insert the following keys into BST. A S E R C H I N G X M P L ! Search, then insert. ! Concise (but tricky) recursive code. public void put(Key key, Val val) { root = insert(root, key, val); } private Node insert(Node x, Key key, Val val) { if (x == null) return new Node(key, val); else if ( eq(key, x.key)) x.val = val; else if (less(key, x.key)) x.l = insert(x.l, key, val); else less(key, x.key)) x.r = insert(x.r, key, val); return x; } 7 8 Tree Shape BST: Analysis Tree shape. Theorem. If keys are inserted in random order, height of tree ! Many BSTs correspond to same input data. is !(log N), except with exponentially small probability. ! Cost of search/insert proportional to depth of node. mean " 4.311 ln N, variance = O(1) ! 1-1 correspondence between BST and quicksort partitioning. depth of node corresponds to depth of function call stack when node is partitioned Property. If keys are inserted in random order, expected number of comparisons for a search/insert is about 2 ln N. 51 25 14 72 14 72 43 97 33 53 97 But… Worst-case for search/insert/height is N. e.g., keys inserted in ascending order 25 43 64 84 99 33 53 84 99 51 64 9 10 Symbol Table: Implementations Cost Summary BST: Eager Delete Delete a node in a BST. [Hibbard] ! Zero children: just remove it. Worst Case Average Case ! One child: pass the child up. Implementation Get Put Remove Get Put Remove ! Two children: find the next largest node using right-left* or Sorted array log N N N log N N/2 N/2 left-right*, swap with next largest, remove as above. Unsorted list N N N N/2 N N Hashing N 1 N 1* 1* 1* BST N N N log N log N ??? * assumes hash function is random zero children one child two children BST. O(log N) insert and search if keys arrive in random order. Problem. Eager deletion strategy clumsy, not symmetric. Consequence. Trees not random (!) # sqrt(N) per op. 11 12 BST: Lazy Delete Symbol Table: Implementations Cost Summary Lazy delete. To delete node with a given key, set its value to null. Worst Case Average Case Cost. O(log N') per insert, search, and delete, where N' is the number Implementation Get Put Remove Get Put Remove of elements ever inserted in the BST. Sorted array log N N N log N N/2 N/2 under random input assumption Unsorted list N N N N/2 N N Hashing N 1 N 1* 1* 1* 25 25 tombstone BST N N N log N † log N † log N † 14 72 14 * assumes hash function is random delete 72 † assumes N is number of keys ever inserted 43 97 43 97 BST. O(log N) insert and search if keys arrive in random order. 33 53 84 99 33 53 84 99 Q. Can we achieve O(log N) independent of input distribution? 13 14 Right Rotate, Left Rotate Right Rotate, Left Rotate Two fundamental ops to rearrange nodes in a tree. Rotation. Fundamental operation to rearrange nodes in a tree. ! Maintains symmetric order. ! Easier done than said. ! Local transformations, change just 3 pointers. left rotate 'A' right rotate 'S' private Node rotL(Node h) { Node x = h.r; h.r = x.l; x.l = h; x y return x; } y x private Node rotR(Node h) { A y = left(x) C Node x = h.l; h.l = x.r; x.r = h; B C x = right(y) A B return x; } 15 16 Recursive BST Root Insertion insert G BST Construction: Root Insertion Root insertion: insert a node and make it the new root. Ex. A S E R C H I N G X M P L ! Insert using standard BST. ! Rotate it up to the root. Why bother? ! Faster if searches are for recently inserted keys. ! Basis for advanced algorithms. private Node rootInsert(Node h, Key key, Val val) { if (h == null) return new Node(key, val); if (less(key, h.key)) { h.l = rootInsert(h.l, key, val); h = rotR(h); } else { h.r = rootInsert(h.r, key, val); h = rotL(h); } return h; } 17 18 Randomized BST Randomized BST Example Intuition. If keys are inserted in random order, height is logarithmic. Ex: Insert keys in ascending order. Idea. When inserting a new node, make it the root (via root insertion) with probability 1/(N+1), and do so recursively. private Node insert(Node h, Key key, Val val) { if (h == null) return new Node(key, val); if (Math.random()*(h.N + 1) < 1) return rootInsert(h, key, val); else if (less(key, h.key)) h.l = insert(h.l, key, val); else h.r = insert(h.r, key, val); h.N++; return h; maintain size of subtree rooted at h } Fact. Tree shape distribution is identical to tree shape of inserting keys in random order. but now, no assumption made on the input distribution 19 20 Randomized BST Randomized BST: Delete Property. Always "looks like" random binary tree. Delete. Delete node containing given key; join two broken subtrees. G delete 'S' B S O U J Q T W ! As before, expected height is !(log N). H L P R V Z ! Exponentially small chance of bad balance. Implementation. Need to maintain subtree size in each node. 21 22 Randomized BST: Delete Randomized BST: Join Delete. Delete node containing given key; join two broken subtrees. Join. Merge T1 (of size N1) and T2 (of size N2) assuming all keys in T1 are less than all keys in T2. ! Use root of T1 as root with probability N1 / (N1 + N2), G and recursively join right subtree of T1 with T2. ! B Use root of T2 as root with probability N2 / (N1 + N2), and recursively join left subtree of T2 with T1. O U prob = 7/12 O J Q T W O U U H L P R V Z J Q T W J Q T W T1 T2 H L P R V Z H L P R V Z Goal. Join T and T , where all keys in T are less than all keys in T . 1 2 1 2 T T1 2 T1 T2 23 24 Randomized BST: Join Randomized BST: Delete Join. Merge T1 (of size N1) and T2 (of size N2) assuming all keys in T1 Join. Merge T1 (of size N1) and T2 (of size N2) assuming all keys in T1 are less than all keys in T2. are less than all keys in T2. ! Use root of T1 as root with probability N1 / (N1 + N2), and recursively join right subtree of T1 with T2. Delete. Delete node containing given key; join two broken subtrees. ! Use root of T2 as root with probability N2 / (N1 + N2), and recursively join left subtree of T2 with T1. Analysis. Running time bounded by height of tree. prob = 5/12 U Theorem. Tree still random after delete. O U O Corollary. Expected number of comparisons for a search/insert/delete is !(log N). J Q T W J Q T W H L P R V Z H L P R V Z T1 T2 T1 T2 25 26 Symbol Table: Implementations Cost Summary BST: Advanced Operations Sort. Iterate over keys in ascending order. Worst Case Average Case ! Inorder traversal. Implementation Search Insert Delete Search Insert Delete ! Same comparisons as quicksort, but pay space for extra links.

Load more