trees

1 are lists enough? for correctness — sure want to efficiently access items better than linear time to find something want to represent relationships more naturally

2 inter-item relationships in lists

1 2 3 4 5 List: nodes related to predecessor/successor

3 trees trees: allow representing more relationships (but not arbitrary relationships — see graphs later in semester) restriction: single path from root to every node implies single path from every node to every other node (possibly through root)

4 natural trees: phylogenetic tree

image: Ivicia Letunic and Mariana Ruiz Villarreal, via the tool iTOL (Interative Tree of Life), via Wikipedia 5 natural trees: phylogenetic tree (zoom)

image: Ivicia Letunic and Mariana Ruiz Villarreal, via the tool iTOL (Interative Tree of Life), via Wikipedia 6 natural trees: Indo-European languages

INDO-EUROPEAN

ANATOLIAN

Luwian Hittite Carian Lydian Lycian Palaic Pisidian

HELLENIC INDO-IRANIAN DORIAN Mycenaean AEOLIC INDO-ARYAN Doric Attic ACHAEAN Aegean Northwest Greek Ionic Beotian Vedic Sanskrit Classical Greek Arcado Thessalian Tsakonian Koine Greek Epic Greek Cypriot Sanskrit Prakrit

Greek Maharashtri Gandhari Shauraseni Magadhi Niya ITALIC INSULAR INDIC Konkani Paisaci Oriya Assamese BIHARI CELTIC Pali Bengali LATINO-FALISCAN SABELLIC Dhivehi Marathi Halbi Chittagonian Bhojpuri CONTINENTAL Sinhalese CENTRAL INDIC Magahi Faliscan Oscan Vedda Maithili Umbrian Celtiberian WESTERN INDIC HINDUSTANI PAHARI INSULAR Galatian Classical Latin Aequian Gaulish NORTH Bhil DARDIC Hindi Urdu CENTRAL EASTERN Vulgar Latin Marsian GOIDELIC BRYTHONIC Lepontic Domari Ecclesiastical Latin Volscian Noric Dogri Gujarati Kashmiri Haryanvi Dakhini Garhwali Nepali Irish Common Brittonic Lahnda Rajasthani Nuristani Rekhta Kumaoni Palpa Manx Ivernic Potwari Romani Pashayi Scottish Gaelic Pictish Breton Punjabi Shina Cornish Sindhi IRANIAN ROMANCE Cumbric ITALO-WESTERN Welsh EASTERN Avestan WESTERN Sardinian EASTERN ITALO-DALMATIAN Corsican NORTH SOUTH NORTH Logudorese Aromanian Dalmatian Scythian Sogdian Campidanese Istro-Romanian Istriot Bactrian CASPIAN Megleno-Romanian Italian Khotanese Romanian GALLO-IBERIAN Neapolitan Ossetian Khwarezmian Yaghnobi Deilami Sassarese Saka Gilaki IBERIAN Sicilian Sarmatian Old Persian Mazanderani GALLIC SOUTH Shahmirzadi Alanic Middle Persian Lurish Talysh Aragonese Arpitan CISALPINE LANGUE D'OÏL OCCITAN Rhaetian Pamiri Pashto Median Astur-Leonese Bukhori Bakhtiari Galician-Portuguese Emilian French Catalan Friulian Sarikoli Waziri Dari Kumzari Parthian Zaza-Gorani Mozarabic Ligurian Gallo Occitan Ladin Vanji Persian Old Spanish Eonavian Lombard Norman Romansh Yidgha Shughni Tat Balochi Gorani Asturian Fala Piedmontese Walloon Yazgulami Hazaragi Kurdish Zazaki Extremaduran Ladino Galician Venetian ARMENIAN Tajik Juhuru Leonese Spanish Portuguese Mirandese GERMANIC Armenian TOCHARIAN EAST BALTO-SLAVIC Kuchean Old West Norse Old East Norse Burgundian Turfanian BALTIC SLAVIC Old Faroese Danish Gothic WEST EAST EAST Swedish Vandalic Icelandic WEST Galindan Latvian Old Novgorod Old East Slavic Norn Prussian Lithuanian Norwegian Sudovian Selonian Russian Ruthenian Semigallian WEST SOUTH LOW ALBANIAN Belorussian ANGLO-FRISIAN Old West Slavic WESTERN Rusyn WEST EAST Albanian EASTERN Ukrainian Old East LECHITIC Slovene Czech-Slovak Old Church Slavonic Old Polish Knaanic Serbo-Croatian Polabian Sorbian Czech Bulgarian Dutch Alemannic North Frisian English Polish Pomeranian Slovak Bosnian Church Slavonic image: via Wikipedia/Mandrak Ripuarian Austro-Bavarian Saterland Frisian Scots Silesian Croatian Macedonian 7 Thuringian Cimbrian West Frisian Yola Kashubian Serbian list to tree

list — up to 2 related nodes

predecessor element successor binary tree — up to 3 related nodes (list is special-case)

parent

element

left child left child 8 more general trees tree — any number of relationships (binary tree is special case) at most one parent

parent

element

child 1 child 2 … child n

9 tree terms (1)

parent child A root: node with no parents

B C siblings: nodes with the same parent D E F G

leafs: nodes with no children H

10 paths and path lengths

A path: sequence of nodes n1, n2, . . . , nk such that ni is parent of ni+1 example: {B,D,H} B C

D E F G

length (of path): number of edges in path H example: 2 (B → D and D → H)

internal path length: sum of depth of nodes example: 6 = 1 + 2 + 3 11 tree/node height

parent child A 3 height (of a node): length of longest path to leaf

B 2 C 1 height (of a tree): height of tree’s root (this example: 3) D 1 E 0 F 0 G 0

H 0

12 tree/node depth

parent child A 0 depth (of a node): length of path to root

B 1 C 1

D 2 E 2 F 2 G 2

H 3

13 first child/next sibling

home class TreeNode { private: string element; aaron TreeNode *firstChild; * TreeNode nextSibling; nextSibling public: ... cs2150 cs4970 mail }; firstChild lab1 lab2 proj1

coll.h coll.cpp proj.h

14 another tree representations

class TreeNode { private: string element; vector children; public: ... };

// and more --- see when we talk about graphs

15 tree traversal

/

× ×

+ - 5 6

1 2 3 4

pre-order: /*+12-34*56 in-order: (((1+2) * (3-4)) / (5*6)) (parenthesis optional?) post-order: 12+34-*56*/

16 pre/post-order traversal printing

(this is pseudocode) TreeNode::printPreOrder() { this−>print(); for each child c of this: c−>printPreOrder() }

TreeNode::printPostOrder() { for each child c of this: c−>printPostOrder() this−>print(); }

17 in-order traversal printing

(this is pseudocode) BinaryTreeNode::printInOrder() { if (this−>left) this−>left−>printInOrder(); cout << this−>element << "␣"; if (this−>right) this−>right−>printInOrder(); }

18 post-order traversal counting

(this is pseudocode) int numNodes(TreeNode *tnode) { if ( tnode == NULL ) return 0; else { sum=0; for each child c of tnode sum += numNodes(c); return 1 + sum; } }

19 expression tree and traversals

+ (a + ((b + c) * d)) a *

+ d

b c

20 expression tree and traversals

+ infix: (a + ((b + c) * d)) a * postfix: a b c + d * + prefix: + a * + b c d + d

b c

21 postfix expression to tree use a stack of trees number n → push( n ) operator OP → pop into A, B; then push OP B A

22 e * + * + * d d e c + a b c + top of stack d bc e d e

+ a a b

example a b + c d e + * *

23 e * + * + * d d e c + a b c +

d c e d e

+

a b

example a b + c d e + * *

top of stack b

a

23 e * + * + * d d e c + a b c +

d bc e d e

a

example a b + c d e + * *

top of stack

+

a b 23 * + * + * d e c + a b c +

d bc e d e

a

example a b + c d e + * *

e

d

top of stack c

+

a b 23 e *

* + * d c + a b c +

d bc e d e

a

example a b + c d e + * *

+

d e

top of stack c

+

a b 23 e * + + * d d e a b c +

bc d e

a

example a b + c d e + * *

* c + top of stack d e

+

a b 23 e +

d d e

top of stack bc

a

example a b + c d e + * *

*

* + *

c + a b c +

d e d e

+

a b 23 element = 2 left = NULL right = addr of node 3

element = 7 left = NULL right = NULL

binary trees all nodes have at most 2 children 1 class BinaryNode { ... int element; 2 BinaryNode *left; BinaryNode *right; 3 }; 4 1 5 2 3 6 4 5 6 7 7 24 element = 7 left = NULL right = NULL

binary trees all nodes have at most 2 children 1 class BinaryNode { element = 2 ... int element; 2 left = NULL BinaryNode *left; right = addr of node 3 BinaryNode *right; 3 }; 4 1 5 2 3 6 4 5 6 7 7 24 element = 2 left = NULL right = addr of node 3

binary trees all nodes have at most 2 children 1 class BinaryNode { ... int element; 2 BinaryNode *left; BinaryNode *right; 3 }; 4 1 5 2 3 element =67 4 5 6 7 left = NULL right = NULL7 24 right subtree of 4 left subtree of 4 right subtree of 5

binary search trees binary tree and… each node has a key for each node: keys in node’s left subtree are less than node’s keys in node’s right subtree are greater than node’s 4 2 5 1 3 7 6 8

25 right subtree of 5

binary search trees binary tree and… each node has a key for each node: keys in node’s left subtree are less than node’s keys in node’s right subtree are greater than node’s 4 2 5 1 3 7 right subtree of 4 left subtree of 4 6 8

25 right subtree of 4 left subtree of 4

binary search trees binary tree and… each node has a key for each node: keys in node’s left subtree are less than node’s keys in node’s right subtree are greater than node’s 4 2 5 1 3 7 6 8 right subtree of 5 25 not a binary search tree

8 5 11 2 6 10 18 4 15 20

21

26 binary search tree versus binary tree binary search trees are a kind of binary tree …but — often people say “binary tree” to mean “binary search tree”

27 BST: find

(pseudocode) find(node, key) { if (node == NULL) return NULL; else if (key < node−>key) return find(node−>left, key) else if (key > node−>key) return find(node−>right, key) else // if (key == node->key) return node; }

28 BST: insert

(pseudocode) insert(Node *&node, key) { if (node == NULL) node = new BinaryNode(key); else if (key < node−>key) insert(node−>left, key); else if (key < root−>key) insert(node−>right, key); else // if (key > root->key) ; // duplicate -- no new node needed }

29 BST: findMin

(pseudocode) findMin(Node *node, key) { if (node−>left == NULL) return node; else insert(node−>left, key); }

30 BST: remove (1)

case 1: no children 5 5

4 9 4 9

1 7 11 1 11

3 3

31 BST: remove (2)

case 2: one child 5 5

4 9 4 9

1 7 11 3 7 11

3

32 BST: remove (3)

case 3: two children 5 7

4 9 4 9

1 7 11 3 11

3

replace with minimum of right subtree (alternately: maximum of left subtree, …)

33 binary tree: worst-case height

1

2 n-node BST: worst-case height/depth n − 1 3

4

5

6

7

34 binary tree: best-case height height h: at most 2h+1 − 1 nodes 4

2 6

1 3 5 7

35 binary tree: proof best-case height is possible proof by induction: can have 2h+1 − 1 nodes in h-height tree h = 0: h = 0: exactly one node; 2h+1 − 1 = 1 nodes h = k → h = k + 1: start with two copies of a maximum tree of height k create a new tree as follows: create a new root node add edges from the root node to the roots of the copies the height of this new tree is k + 1 path of length k in old tree + either new edge the number of nodes is 36 2(2k+1 − 1) + 1 = 2k+1+1 − 2 + 1 = 2k+1+1 − 1 binary tree: best-case height is best

(informally) property of trees in root: except for the leaves, every node in tree has 2 children no way to add nodes without increasing height add below leaf — longer path to root — longer height add above root — every old node has longer path to root

37 binary tree height formula n: number of nodes h: height

n + 1 ≤ 2h+1  h+1 log2(n + 1) ≤ log2 2 log(n + 1) ≤ h + 1

h ≥ log2 (n + 1) − 1

shortest tree of n nodes: ∼ log2(n) height 38 perfect binary trees

4

2 6

1 3 5 7 a binary tree is perfect if all leaves have same depth all nodes have zero children (leaf) or two children exactly the trees that achieve 2h+1 − 1 nodes

39 AVL animation tool http://webdiis.unizar.es/asignaturas/EDA/ AVLTree/avltree.html

40 AVL tree idea

AVL trees: one of many balanced trees — search tree balanced to keep height Θ(log n) avoid “tree is just a long linked list” scenarios gaurentees Θ(log n) for find, insert, remove AVL = Adelson-Velskii and Landis

41 AVL gaurentee the height of and right subtrees of every node differs by at most one

42 AVL state normal binary search tree stuff: data; and left, right, parent pointers additional AVL stuff: height of right subtree minus height of left subtree called “balance factor” -1, 0, +1 (kept up to date on insert/delete — computing on demand is too slow)

43 example AVL tree

5

4 9

1 7 11

8

44 example AVL tree

5 b: +1

4 9 b: -1 b: -1

1 7 11 b: 0 b: +1 b: 0

8 b: 0

44 example non-AVL tree

5 b: -2

4 8 b: -3 b: 0

1 7 11 b: +2 b: 0 b: 0

3 b: -1

2 b: 0 45 runtime for both Θ(d) where d is depth of node found/inserted max balance factor ±1 at root max depth of node is Θ(log2 n + 1) = Θ(log n)

AVL tree algorithms find — exactly the same as binary search tree just ignore balance factors insert — two extra steps: update balance factors “fix” tree if it became unbalanced

46 AVL tree algorithms find — exactly the same as binary search tree just ignore balance factors insert — two extra steps: update balance factors “fix” tree if it became unbalanced runtime for both Θ(d) where d is depth of node found/inserted max balance factor ±1 at root max depth of node is Θ(log2 n + 1) = Θ(log n)

46 AVL insertion cases simple case: tree remains balanced otherwise: let x be deepest imbalanced node (+2/-2 balance factor) insert in left subtree of left child of x: single rotation right insert in right subtree of right child of x: single rotation left insert in right subtree of left child of x: double left-right rotation insert in left subtree of right child of x: double right-left rotation

47 AVL: simple right rotation

just inserted 0 unbalanced root becomes new left child

3 2 b: -2 b: 0

2 1 3 b: -1 b: 0 b: 0

1 b: 0

48 AVL: less simple right rotation (1)

just inserted 0 unbalanced root becomes new left child

5 3 b: -2 b: 0

3 10 2 5 b: -1 b: 0 b: -1 b: 0

2 4 1 4 10 b: -1 b: 0 b: 0 b: 0 b: 0

1 b: 0 49 AVL: simple left rotation

just inserted 1 deepest unbalanced node is 3

1 2 b: +2 b: 0

2 1 3 b: +1 b: 0 b: 0

3 b: 0

50 AVL rotation: up and down

at least one node moves up (this case: 1 and 2) at least one node moves down (this case: 3)

3 2 b: -2 b: 0

2 1 3 b: -1 b: 0 b: 0

1 b: 0

51 15 b: -1

3 20 b: 0 b: 0

2 5 17 21 b: -1 b: 0 b: 0 b: 0

1 4 10 deepest unbalancedb: 0 b: subtree 0 b: 0

AVL: less simple right rotation (2)

just inserted 1 15 b: -2

5 20 b: -2 b: 0

3 10 17 21 b: -1 b: 0 b: 0 b: 0

2 4 b: -1 b: 0

1 b: 0 52 15 b: -1

3 20 b: 0 b: 0

2 5 17 21 b: -1 b: 0 b: 0 b: 0

1 4 10 b: 0 b: 0 b: 0

AVL: less simple right rotation (2)

just inserted 1 15 b: -2

5 20 b: -2 b: 0

3 10 17 21 b: -1 b: 0 b: 0 b: 0

2 4 b: -1 b: 0 deepest unbalanced subtree 1 b: 0 52 15 b: -1

3 20 b: 0 b: 0

2 5 17 21 b: -1 b: 0 b: 0 b: 0

1 4 10 b: 0 b: 0 b: 0

AVL: less simple right rotation (2)

just inserted 1 15 b: -2

5 20 b: -2 b: 0

3 10 17 21 b: -1 b: 0 b: 0 b: 0

2 4 b: -1 b: 0 deepest unbalanced subtree 1 b: 0 52 deepest unbalanced subtree

AVL: less simple right rotation (2)

just inserted 1 15 15 b: -2 b: -1

5 20 3 20 b: -2 b: 0 b: 0 b: 0

3 10 17 21 2 5 17 21 b: -1 b: 0 b: 0 b: 0 b: -1 b: 0 b: 0 b: 0

2 4 1 4 10 b: -1 b: 0 b: 0 b: 0 b: 0

1 b: 0 52 general single rotation

a rotate b right b Z a (h) rotate X left (h+1)

X Y (h+1) (h) Y Z X < b < Y < a < Z (h) (h)

53 step 1: rotate subtree left step 2: rotate imbalanced tree right

8 b: 8

6 10 b: -1 b: 0

4 5 b: -1 b: 0

3 b: 0

double rotation

15 15 b: -2 b: -1

8 17 6 17 b: -2 b: -1 b: 0 b: -1

4 10 16 4 8 16 b: 1 b: 0 b: 0 b: 0 b: 1 b: 0

3 6 3 5 10 b: 0 b: -1 b: 0 b: 0 b: 0

5 b: 0

54 8 b: 8

6 10 b: -1 b: 0

4 5 b: -1 b: 0

3 b: 0

double rotation step 1: rotate subtree left step 2: rotate imbalanced tree right 15 15 b: -2 b: -1

8 17 6 17 b: -2 b: -1 b: 0 b: -1

4 10 16 4 8 16 b: 1 b: 0 b: 0 b: 0 b: 1 b: 0

3 6 3 5 10 b: 0 b: -1 b: 0 b: 0 b: 0

5 b: 0

54 double rotation step 1: rotate subtree left step 2: rotate imbalanced tree right 15 15 b: -2 b: -1

8 17 6 17 b: -2 b: -1 b: 0 b: -1 8 4 10 16 b: 8 4 8 16 b: 1 b: 0 b: 0 b: 0 b: 1 b: 0 6 10 3 6 b: -1 b: 0 3 5 10 b: 0 b: -1 b: 0 b: 0 b: 0 4 5 5 b: -1 b: 0 b: 0 3 b: 0 54 double rotation step 1: rotate subtree left step 2: rotate imbalanced tree right 15 15 b: -2 b: -1

8 17 6 17 b: -2 b: -1 b: 0 b: -1 8 4 10 16 b: 8 4 8 16 b: 1 b: 0 b: 0 b: 0 b: 1 b: 0 6 10 3 6 b: -1 b: 0 3 5 10 b: 0 b: -1 b: 0 b: 0 b: 0 4 5 5 b: -1 b: 0 b: 0 3 b: 0 54 c becomes root, so its children X and Y both switch parents

general double rotation

a rotate c

b b a

Z (h) Y X Z (h-1) c (h) (h) (h)

W (h) W

Y X (h-1) (h) 55 general double rotation

a rotate c

b b a

Z (h) Y W X Z (h-1) c (h) (h) (h)

W (h) W

56 AVL insertion cases simple case: tree remains balanced otherwise: let x be deepest imbalanced node (+2/-2 balance factor) insert in left subtree of left child of x: single rotation right insert in right subtree of right child of x: single rotation left insert in right subtree of left child of x: double left-right rotation insert in left subtree of right child of x: double right-left rotation

57 choose rotation based on lowest imbalanced node and on direction of insertion (inserted node is green+dashed)

AVL insert cases (revisited)

3 1 3 1 b: -2 b: +2 b: -2 b: +2

2 2 1 3 b: -1 b: +1 b: +1 b: -1

1 3 2 2 b: 0 b: 0 b: 0 b: 0

single left single right left-right right-left

58 AVL insert cases (revisited)

3 1 3 1 b: -2 b: +2 b: -2 b: +2

2 2 1 3 b: -1 b: +1 b: +1 b: -1

1 3 2 2 b: 0 b: 0 b: 0 b: 0

single left single right left-right right-left choose rotation based on lowest imbalanced node and on direction of insertion

(inserted node is green+dashed) 58 AVL insert case: detail (1)

3 7 b: -2 b: -2

2 3 9 b: -1 b: -2 b: 0

1 2 b: 0 b: -1 choose rotation based on 1 b: 0 lowest imbalanced node and on direction of insertion (inserted node is green+dashed) 59 AVL insert case: detail (2)

3 7 b: -2 b: -2

2 4 8 b: -1 b: -1 b: 0 choose using lowest imbalanced node 0 2 5 and on direction of insertion b: 0 b: -1 b: +1 (inserted node is 1 3 6 green+dashed) b: -1 b: 0 b: 0

0 b: 0 60 61 AVL tree: runtime

worst depth of node: Θ(log2 n + 2) = Θ(log n) find: Θ(log n) worst case: traverse from root to worst depth leaf insert: Θ(log n) worst case: traverse from root to worst depth leaf then back up (update balance factors) then perform constant time rotation remove: Θ(log n) left as exercise (similar to insert) print: Θ(n) visit each of n nodes 62 other types of trees many kinds of balanced trees not all binary trees different ways of tracking balance factors, etc. different ways of doing tree rotations or equivalent

63 red-black trees each node is red or black null leafs considered nodes to aid analysis (still null pointers…) rules about when nodes can be red/black gaurentee maximum depth

13

8 17

1 11 15 25

NULLn 6 NULLn NULLn NULLn NULLn 22 27

NULLn NULLn NULLn NULLn NULLn NULLn 64 red-black tree rules root is black counting null pointers as nodes, leaves are black a red node’s children are black → a red node’s parents are black every simple path from node to leaf under it contains same number of black nodes (property holds regardless of whether null pointers are considered nodes)

65 worst red-black tree imbalance same number of black nodes on paths to leaves → factor of 2 imbalance max A

B C

D E F G

H I N O

J K L M

66 property: “root is black” noproperty: children → “childrenno worries of red aboutnode # are blackblack nodes” onno different change in paths # of black nodes on paths

red-black insert default: insert as red (no change to black node ), but… (1) if new node is root: color black (2) if parent is black: keep child red (3) if parent and uncle is red: adjust several colors (4) if parent is red, uncle is black, new node is right child perform a rotation, then go to case 5 (5) if parent is red, uncle is black, new node is left child perform a rotation

67 property: “root is black” no children → no worries about # black nodes on different paths

red-black insert default: insert as red (no change to black node count), but… (1) if new node is root: color black (2) if parent is black: keep child red (3) if parent and uncle is red: adjust several colors (4) if parent is red, uncle is black, new node is right child perform a rotation, then go to case 5

(5) if parent is property:red, uncle “children is black of, newred node are is leftblack child” perform a rotationno change in # of black nodes on paths

68 property: “root is black” noproperty: children → “childrenno worries of red aboutnode # are blackblack nodes” onno different change in paths # of black nodes on paths

red-black insert default: insert as red (no change to black node count), but… (1) if new node is root: color black (2) if parent is black: keep child red (3) if parent and uncle is red: adjust several colors (4) if parent is red, uncle is black, new node is right child perform a rotation, then go to case 5 (5) if parent is red, uncle is black, new node is left child perform a rotation

69 but…what if grandparent’s parent is red? (property: children of red node are black) solution: recurse to the grandparent, as if it was just inserted

case 3: parent, uncle are red

G G

P U P U

N N 3 4 5 3 4 5

1 2 1 2 make grandparent red, parent and uncle black (property: every path to leaf has same number of black nodes) just swapped grandparent and parent/uncle in those paths

image: Wikipedia/Abloomfi 70 case 3: parent, uncle are red

G G

P U P U

N N 3 4 5 3 4 5

1 2 1 2 make grandparent red, parent and uncle black (property: every path to leaf has same number of black nodes) just swapped grandparent and parent/uncle in those paths but…what if grandparent’s parent is red? (property: children of red node are black) solution: recurse to the grandparent, as if it was just inserted

image: Wikipedia/Abloomfi 70 case 3: parent, uncle are red

G G

P U P U

N N 3 4 5 3 4 5

1 2 1 2 make grandparent red, parent and uncle black (property: every path to leaf has same number of black nodes) just swapped grandparent and parent/uncle in those paths but…what if grandparent’s parent is red? (property: children of red node are black) solution: recurse to the grandparent, as if it was just inserted

image: Wikipedia/Abloomfi 70 property: “root is black” noproperty: children → “childrenno worries of red aboutnode # are blackblack nodes” onno different change in paths # of black nodes on paths

red-black insert default: insert as red (no change to black node count), but… (1) if new node is root: color black (2) if parent is black: keep child red (3) if parent and uncle is red: adjust several colors (4) if parent is red, uncle is black, new node is right child perform a rotation, then go to case 5 (5) if parent is red, uncle is black, new node is left child perform a rotation

71 case 4: parent red, uncle black, right child

G G

P U N U

N P 1 4 5 3 4 5

2 3 1 2 perform left rotation on parent subtree and new node now case 5 (but new node is P , not N)

image: Wikipedia/Abloomfi 72 property: “root is black” noproperty: children → “childrenno worries of red aboutnode # are blackblack nodes” onno different change in paths # of black nodes on paths

red-black insert default: insert as red (no change to black node count), but… (1) if new node is root: color black (2) if parent is black: keep child red (3) if parent and uncle is red: adjust several colors (4) if parent is red, uncle is black, new node is right child perform a rotation, then go to case 5 (5) if parent is red, uncle is black, new node is left child perform a rotation

73 case 5: parent red, uncle black, left child

G P

P U N G

N U 3 4 5 1 2 3

1 2 4 5 perform right rotation of grandparent and parent swap colors of parent and grandparent preserves properties: red parent’s children are black every path to leaf has same number of black nodes

image: Wikipedia/Abloomfi 74 case 3 (parent,100 uncle are red) continued: caseinsert 3:8 parent, uncle are red: recusively examine grandparent 3 grandparentinitially make becomesred red 10 case 5:10 parent (of 3)200 is red parent/unclecase 3: parent,black uncle are red uncle is black, left child 3 100 3 before:15

… … 3 1 5 15 200 case 5: parent is red uncle1 is black, left5 child: 8 perform right rotation of parent + grandparent (of8 3) (and swap parent/grandparent colors)

example recursive case

initially: leaves are black X red node’s children are black X 100 same number of black nodes in every path from node to leaves X 10 200

3 15

1 5

8

75 initially: case 3 (parent,100 uncle are red) continued: leaves arecaseblack 3: parent,X uncle are red: recusively examine grandparent 3 red node’sgrandparent children becomesare blackredX 10 case 5:10 parent (of 3)200 is red same numberparent/uncle of blackblack nodes uncle is black, left child in every path from node to leaves X 3 100 3 before:15

… … 3 1 5 15 200 case 5: parent is red uncle1 is black, left5 child: 8 perform right rotation of parent + grandparent (of8 3) (and swap parent/grandparent colors)

example recursive case

insert 8 initially make red 100 case 3: parent, uncle are red 10 200

3 15

1 5

8

75 initially: case 3 (parent,100 uncle are red) continued: leaves areinsertblack8 X recusively examine grandparent 3 red node’sinitially children make areredblack X 10 case 5:10 parent (of 3)200 is red same numbercase 3: of parent, black nodes uncle are red uncle is black, left child in every path from node to leaves X 3 100 3 15

… … 1 5 15 200 case 5: parent is red uncle is black, left child: 8 perform right rotation of parent + grandparent (of 3) (and swap parent/grandparent colors)

example recursive case

case 3: parent, uncle are red: grandparent becomes red 100 parent/uncle black 10 200 before: 3 3 15

1 5 1 5

8 8

75 initially: 100 leaves arecaseinsertblack 3:8 parent,X uncle are red: red node’sgrandparentinitially children make becomesareredblackred 10 10 200 X same numberparent/unclecase 3: of parent, blackblack nodes uncle are red in every path from node to leaves X 3 100 3 before:15

… … 3 1 5 15 200 case 5: parent is red uncle1 is black, left5 child: 8 perform right rotation of parent + grandparent (of8 3) (and swap parent/grandparent colors)

example recursive case

case 3 (parent, uncle are red) continued: recusively examine grandparent 3 100 case 5: parent (of 3) is red uncle is black, left child 10 200

3 15

1 5

8

75 initially: case 3 (parent, uncle are red) continued: leaves arecaseinsertblack 3:8 parent,X uncle are red: recusively examine grandparent 3 red node’sgrandparentinitially children make becomesareredblackredX 100 case 5: parent (of 3) is red same numberparent/unclecase 3: of parent, blackblack nodes uncle are red uncle is black, left child in every path from node to leaves X 10 200 before: 3 3 15

1 5 1 5

8 8

example recursive case

100

10 10 200

3 100 3 15

… … 1 5 15 200 case 5: parent is red uncle is black, left child: 8 perform right rotation of parent + grandparent (of 3) (and swap parent/grandparent colors) 75 initially: case 3 (parent,100 uncle are red) continued: leaves arecaseinsertblack 3:8 parent,X uncle are red: recusively examine grandparent 3 red node’sgrandparentinitially children make becomesareredblackredX 100 case 5:10 parent (of 3)200 is red same numberparent/unclecase 3: of parent, blackblack nodes uncle are red uncle is black, left child in every path from node to leaves X 10 200 3 before:15

… … 3 3 15 case 5: parent is red uncle1 is black, left5 child: 1 5 perform right rotation of parent + grandparent (of8 3) 8 (and swap parent/grandparent colors)

example recursive case

10

3 100

1 5 15 200

8

75 RB-tree: removal start with normal BST remove of x, but… instead find next highest/lowest node y can choose node with at most one child (“bottom” of a left or right subtree) swap x and y’s value, then replace y with its child several cases for color maintainence/rotations

76 RB tree: removal cases

N: node just replaced with child; S: its sibling; P: its parent

(1): N is new root (2): S is red (3): P, S, and S’s children are black (4): S and S’s children are black (5): S is black, S’s left child is red, S’s right child is black, N is left child of P (6): S is black, S’s right child is red, N is left child

77 details: see, e.g., Wikipedia why red-black trees? a lot more cases…but a lot less rotations …because tree is kept less rigidly balanced red-black trees end up being faster in practice

78 more balanced trees several other kinds of balanced trees one notable kind: non-binary balanced trees commonly used in databases more efficient to store multiple nodes together on disk/SSD

79 splay trees tree that’s fast for recently used nodes self-balancing binary search tree keeps recent nodes near the top simpler to implement than AVL or RB trees

80 worst-case height: Θ(n) — linked-list case

Θ(h) time — where h is tree height

‘splaying’ every time node is accessed (find, insert, delete)…

“splay” tree around that node make the node the new tree root

81 worst-case height: Θ(n) — linked-list case

‘splaying’ every time node is accessed (find, insert, delete)…

“splay” tree around that node make the node the new tree root

Θ(h) time — where h is tree height

81 ‘splaying’ every time node is accessed (find, insert, delete)…

“splay” tree around that node make the node the new tree root

Θ(h) time — where h is tree height worst-case height: Θ(n) — linked-list case

81 splay tree operations

insert 4 find 2 3 4 2

2 3 1 3

1 2 4

1

82 amortized complexity splay tree insert/find/delete is amortized O(log n) time informally: average insert/find/delete: O(log n) more formally: m operations: O(m log n) time (where n: max size of tree)

83 splay tree pro/con can be faster than AVL, RB-trees in practice take advantage of frequently accessed items simpler to implement but worst case find/insert is Θ(n) time

84 last time red-black trees less well-balanced than AVL trees track color instead of balance factor rules about colors to limit possible imbalance algorithm for insertion, etc. that makes tree always obey rules usually faster in practice — less rotations splay trees optimized for repeated accesses keep recently accessed items near top of tree find rearranges tree! amortized logarithmic time (but worst case is linear)

85 doubling size — requires copying! — Θ(n) time Θ(n) worst case per insert but average…?

amortized analysis: vector growth vector insert algorithm: if not big enough, double capacity write to end of vector

86 amortized analysis: vector growth vector insert algorithm: if not big enough, double capacity write to end of vector doubling size — requires copying! — Θ(n) time Θ(n) worst case per insert but average…?

86 counting copies (1) suppose initial capacity 100 + insert 1600 elements 100 → 200: 100 copies 200 → 400: 200 copies 400 → 800: 400 copies 800 → 1600: 800 copies total: 1500 copies total operations: 1500 copies + 1600 writes of new elements about 2 operations per insert

87 Θ(n) worst case but Θ(n) time for n inserts → O(1) amortized time per insert

counting copies (2) more generally: for N inserts about N copies + N writes why? K to 2K elements: K copies N inserts: 1 + 2 + 4 + ... + N/4 + N/2 = N − 1 copies (and a bit better if initial capacity isn’t 1)

88 counting copies (2) more generally: for N inserts about N copies + N writes why? K to 2K elements: K copies N inserts: 1 + 2 + 4 + ... + N/4 + N/2 = N − 1 copies (and a bit better if initial capacity isn’t 1) Θ(n) worst case but Θ(n) time for n inserts → O(1) amortized time per insert

88 increase by constant: linear worst-case and amortized

other vector capacity increases? (1) instead of doubling…add 1000 N inserts: 1000 + 2000 + 3000 + ... + N ∼ N 2 → Θ(N 2) total — O(N) amortized time per insert

89 other vector capacity increases? (1) instead of doubling…add 1000 N inserts: 1000 + 2000 + 3000 + ... + N ∼ N 2 → Θ(N 2) total — O(N) amortized time per insert increase by constant: linear worst-case and amortized

89 1 − klogk N N inserts: 1 + k + k2 + k3 + ... + klogk N = ∼ N 1 − k → Θ(N) total — O(1) amortized time per insert amortized constant time for all k > 1

instead of doubling…multiply by k > 1 e.g. k = 1.1 — increase by 10%

90 → Θ(N) total — O(1) amortized time per insert amortized constant time for all k > 1

instead of doubling…multiply by k > 1 e.g. k = 1.1 — increase by 10%

1 − klogk N N inserts: 1 + k + k2 + k3 + ... + klogk N = ∼ N 1 − k

90 instead of doubling…multiply by k > 1 e.g. k = 1.1 — increase by 10%

1 − klogk N N inserts: 1 + k + k2 + k3 + ... + klogk N = ∼ N 1 − k → Θ(N) total — O(1) amortized time per insert amortized constant time for all k > 1

90 trees are not great for… ordered, unsorted lists list of TODO tasks being easy/simple to implement compare, e.g., stack/queue

Θ(1) time compare vector compare hashtables (almost)

91 programs as trees

program int z; vars functions int foo (int x) { int z foo() main() for (int y = 0; params vars body params vars y < x; y++) int z for int z cout << y << endl; int y y < x y++ body =5 } cout int main() { y endl int z = 5; body cout << "enter x" << endl; cin >> z; cout cin foo() foo(z); "enter z" endl z z } 92 programs as trees

program int z; vars functions int foo (int x) { int z foo() main() for (int y = 0; params vars body params vars y < x; y++) int z for int z cout << y << endl; int y y < x y++ body =5 } cout int main() { y endl int z = 5; body cout << "enter x" << endl; cin >> z; cout cin foo() foo(z); "enter z" endl z z } 92 for loop: four children init, condition, update, body

class ASTNode { ... };

// public class ForNode extends ASTNode class ForNode : public ASTNode { ... private: ASTNode *init, *condition, *update, *body; };

abstract syntax tree

program

vars functions 93 int z foo() main()

params vars body params vars body

int z for int z cout cin foo()

int y y < x y++ body =5 "enter z" endl z z cout y endl class ASTNode { ... };

// public class ForNode extends ASTNode class ForNode : public ASTNode { ... private: ASTNode *init, *condition, *update, *body; };

abstract syntax tree

program

vars functions

int z foo() main() for loop: four children params vars body init,params condition,vars update, bodybody int z for int z cout cin foo()

int y y < x y++ body =5 "enter z" endl z z cout y endl

93 abstract syntax tree

program

vars functions

int z foo() main() for loop: four children params vars body init,params condition,vars update, bodybody int z for int z cout cin foo() class ASTNode { int y y < x y++ body =5... "enter z" endl z z }; cout // public class ForNode extends ASTNode y endlclass ForNode : public ASTNode { ... private: ASTNode *init, *condition, *update, *body; }; 93 AST applications

“abstract syntax tree” = “parse tree” part of how compilers work do some tree traversal to do… code generation — e.g. ASTNode::outputCode() method optimization type checking…

94 using AST to compare programs comparing trees is a good way to compare programs… while ignoring: function/method order (e.g. sort function nodes by length) variable names (e.g. ignore variable names when comparing) comments … part of many software plagerism/copy+paste detection tools

95