<<

From doubly-linked lists to binary trees Basic definitions

! Instead of using prev and next to point to a linear ! is a : arrangement, use them to divide the universe in half ! empty ! root with left and right subtrees ! Similar to binary search, everything less goes left, everything greater goes right ! terminology: parent, children, leaf node, internal node, depth, “koala” height, path “llama” “koala” • link from node N to M then N is parent of M ! How do we search? “giraffe” “tiger” – M is child of N A “koala” • leaf node has no children ! How do we insert? B “elephant” “jaguar” “monkey” – internal node has 1 or 2 children • path is sequence of nodes, N , N , … N “koala” 1 2 k D E F – Ni is parent of Ni+1 “hippo” “leopard” “pig” – sometimes edge instead of node G • depth (level) of node: length of root-to-node path “koala” – level of root is 1 (measured in nodes) • height of node: length of longest node-to-leaf path – height of tree is height of root

CompSci 100e 9.1 CompSci 100e 9.2

Implementing binary trees Printing a search tree in order

! Trees can have many shapes: short/bushy, long/stringy ! When is root printed? h ! if height is h, number of nodes is between h and 2 -1 ! After left subtree, before right subtree. ! single node tree: height = 1, if height = 3 void visit(TreeNode ) { if (t != null) { visit(t.left); System.out.println(t.info); ! Java implementation, similar to doubly- visit(t.right); “llama” public class TreeNode } { } String info; “giraffe” “tiger” TreeNode left; TreeNode right; “elephant” “jaguar” “monkey” TreeNode(String s, TreeNode llink, TreeNode rlink){ ! Inorder traversal info = s; left = llink; right = rlink; } “hippo” “leopard” “pig” };

CompSci 100e 9.3 CompSci 100e 9.4 Tree traversals Tree exercises

! Different traversals useful in different contexts 1. Build a tree ! Inorder prints search tree in order " People standing up are nodes that are currently in the • Visit left-subtree, process root, visit right-subtree tree " Point at a sitting down person to make them your child ! Preorder useful for reading/writing trees " Is it a binary tree? Is it a BST? • Process root, visit left-subtree, visit right-subtree " Traversals, height, deepest leaf? 2. How many different binary search trees are there with ! Postorder useful for destroying trees specified elements? • Visit left-subtree, visit right-subtree, process root " E.g given elements {90,13,2,3}, how many possible

“llama” legal BSTs are there? 3. Convert a to a in O(n) “giraffe” “tiger” time without creating any new nodes.

“elephant” “jaguar” “monkey”

CompSci 100e 9.5 CompSci 100e 9.6

Insertion and Find? Complexity? What does insertion look like?

! How do we search for a value in a tree, starting at root? ! Simple recursive insertion into tree (accessed by root) ! Can do this both iteratively and recursively, contrast to ! root = insert("foo", root); printing which is very difficult to do iteratively ! How is insertion similar to search? public TreeNode insert(TreeNode t, String s) { if (t == null) t = new Tree(s,null,null); else if (s.compareTo(t.info) <= 0) ! What is complexity of print? Of insertion? t.left = insert(t.left,s); ! Is there a worst case for trees? else t.right = insert(t.right,s); return t; ! Do we use best case? Worst case? Average case? } ! Note: in each recursive call, the parameter t in the called clone is ! How do we define worst and average cases either the left or right pointer of some node in the original tree ! For trees? For vectors? For linked lists? For arrays of ! Why is this important? linked-lists? ! Why must the idiom t = treeMethod(t,…) be used? ! What would removal look like?

CompSci 100e 9.7 CompSci 100e 9.8 Tree functions Balanced Trees and Complexity

! Compute height of a tree, what is complexity? ! A tree is height-balanced if ! Left and right subtrees are height-balanced int height(Tree root) ! Left and right heights differ by at most one { if (root == null) return 0; else { return 1 + Math.max(height(root.left), height(root.right) ); } } boolean isBalanced(Tree root) { ! Modify function to compute number of nodes in a tree, does if (root == null) return true; complexity change? return isBalanced(root.left) && isBalanced(root.right) && ! What about computing number of leaf nodes? Math.abs(height(root.left) – height(root.right)) <= 1; } }

CompSci 100e 9.9 CompSci 100e 9.10

Rotations and balanced trees What is complexity?

! Height-balanced trees ! Assume trees are “balanced” in analyzing complexity ! For every node, left and ! Roughly half the nodes in each subtree right subtree heights differ by at most 1 ! Leads to easier analysis ! After insertion/deletion need to rebalance Are these trees height- ! How to develop recurrence relation? ! Every operation leaves tree balanced? in a balanced state: invariant ! What is T(n)? property of tree ! What other work is done? ! Find deepest node that’s unbalanced then make sure: ! On path from root to ! How to solve recurrence relation inserted/deleted node ! Plug, expand, plug, expand, find pattern ! Rebalance at this unbalanced point only ! A real proof requires induction to verify correctness

CompSci 100e 9.11 CompSci 100e 9.12 Balanced trees we won't study Balanced trees we will study

! B-trees are used when is both in memory and on disk ! Both kinds have worst-case O(log n) time for tree operations ! File systems, really large data sets ! AVL (Adel’son-Velskii and Landis), 1962 ! Rebalancing guarantees good performance both ! Nodes are “height-balanced”, subtree heights differ by 1 asymptotically and in practice. Differences between ! Rebalancing requires per-node bookkeeping of height cache, memory, disk are important ! http://www.seanet.com/users/arsen/avltree.html

! Splay trees rebalance during insertion and during search, ! Red-black tree uses same rotations, but can rebalance in one nodes accessed often more closer to root pass, contrast to AVL tree ! Other nodes can move further from root, consequences? ! In AVL case, insert, calculate balance factors, rebalance • Performance for some nodes gets better, for others … ! In Red-black tree can rebalance on the way down, code ! No guarantee running time for a single operation, but is more complex, but doable guaranteed good performance for a sequence of operations, this is good amortized cost (vector ! Standard java.util.TreeMap/TreeSet use red-black push_back)

CompSci 100e 9.13 CompSci 100e 9.14

Rotation to rebalance Rotation up close (doLeft)

N ! When a node N (root) is ! Why is this called doLeft? N unbalanced height differs by 2 N ! N will no longer be root, (must be more than one) N new value in left.left C A ! Change N.left.left subtree • doLeft A B C C ! Left child becomes new B ! Change N.left.right A root • doLeftRight A B B C Node doLeft(Node root) ! Change N.right.left { • doRightLeft ! Rotation isn’t “to the left”, but rather “brings left child up” Node newRoot = root.left; ! Change N.right.right Node doLeft(Node root) ! doLeftChildRotate? root.left = newRoot.right; • doRight { newRoot.right = root; ! First/last cases are symmetric Node newRoot = root.left; return newRoot; ! Middle cases require two rotations root.left = newRoot.right; } ! First of the two puts tree into newRoot.right = root; doLeft or doRight return newRoot; }

CompSci 100e 9.15 CompSci 100e 9.16 Rotation to rebalance Double rotation complete ????? N ! Suppose we add a new node in ! Calculate where to rotate and what N right subtree of left child of root case, do the rotations ! Node doRight(Node root) Single rotation can’t fix { C A ! Need to rotate twice Node newRoot = root.right; C root.right = newRoot.left; B2 C A B B !C First stage is shown at bottom newRoot.left = root; A B1 B2 B1 return newRoot; ! Rotate blue node right } A

• (its right child takes its place) Node doLeft(Node root) ! This is left child of unbalanced { Node newRoot = root.left; Node doRight(Node root) root.left = newRoot.right; newRoot.right = root; { return newRoot; C B B 2 Node newRoot = root.right; } 1 B1 B2 A B1 B2 root.right = newRoot.left; A A C newRoot.left = root; return newRoot; }

CompSci 100e 9.17 CompSci 100e 9.18

AVL tree practice AVL practice: continued, and finished

! ! !"#$%&'("&)'*+,'&%$$- 18 After adding 13, ok 16 ! . / '. 0 '. 1 '. 2 '1 '3 '/ '. 3 ! After adding14, not ok 10 . 4 10 ! doRight at 12 10 18 6 16 *5&$%'677("8 '. 1 - ! doLeft 16 7),$5&9(8 :18& 3 8 12 18 16 10 18

10 doRight 16 10 18 10 18 6 16 16 10 6 12 3 8 12 18 10

10 13 6 16 16 10 ! *5&$%'3 ;'7),$5&')"'. 1 14 3 10 18 6 16 8 13 18 6 16 6 12 3 12 18 3 8 12 18 12 14 3

CompSci 100e 9.19 CompSci 100e 9.20 : efficient search of words/suffixes Trie picture and code (see trie.cpp)

! A trie (from retrieval, but pronounced “try”) supports ! To add string a ! Insertion: a word into the trie (delete and look up) ! Start at root, for each char c p r create node as needed, go ! These operations are O(size of string) regardless of n s a r a how many strings are stored in the trie! Guaranteed! down tree, mark last node ! To find string h s t c d ! In some ways a trie is like a 128 (or 26 or alphabet-size) tree, ! Start at root, follow links a o one branch/edge for each /letter • If NULL, not found ! Node stores branches to other nodes ! Check word flag at end ! Node stores whether it ends the string from root to it ! To print all nodes ! Visit every node, build string as nodes traversed ! Extremely useful in DNA/string processing ! What about union and ! monkeys and typewriter simulation which is similar to intersection? some methods used in Natural Language understanding Indicates word ends here (n-gram methods)

CompSci 100e 9.21 CompSci 100e 9.22

Scoreboard Boggle: , , structure

Find words on 4x4 grid Algorithm Insertion Deletion Search ! Adjacent letters: Unsorted Vector/array ! ! " # $ % & ' ( H E S I ! No re-use in same word Sorted vector/array

Linked list Two approaches to find all words W L B D ! Try to form every word on Hash Maps board ! Look up prefix as you go A B I M Binary search tree • Trie is useful for prefixes AVL tree S ! Look up every word in Z E V dictionary ! What else might we want to do with a ? ! For each word: on board? ! ZEAL and SMILES

CompSci 100e 9.23 CompSci 100e 9.24