MA 513: Data Structures Lecture Note http://www.iitg.ernet.in/psm/indexing_ma513/y09/index.html

Partha Sarathi Mandal [email protected] Dept. of Mathematics, IIT Guwahati
• Tue 9:00-9:55 Wed 10:00-10:55 Thu 11:00-11:55 Class Room: 1G2
• MA514 Lab: Tue 14:00-16:55

Constructing an expression Tree

• Convert a postfix expression to an expression tree.
• Example: a b + c d e + * *

Scan the postfix string from left to right, keeping a stack of subtrees:

1. Read a, b: push each operand as a one-node tree. Stack: a, b
2. Read +: pop b and a, make them the right and left children of a new + node, and push it. Stack: (a + b)
3. Read c, d, e: push three one-node trees. Stack: (a + b), c, d, e
4. Read +: pop e and d, push (d + e). Stack: (a + b), c, (d + e)
5. Read *: pop (d + e) and c, push (c * (d + e)). Stack: (a + b), (c * (d + e))
6. Read *: pop both trees and push the final tree. Stack: ((a + b) * (c * (d + e)))

Final expression tree:

            *
          /   \
         +     *
        / \   / \
       a   b c   +
                / \
               d   e
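The stack-based construction above can be sketched in C++ as follows. This is a minimal illustration, not the course's reference code: the names `ExprNode`, `buildFromPostfix`, and `toInfix` are hypothetical, tokens are assumed to be separated by spaces, and every operator is assumed to be binary.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <stack>
#include <string>

// Hypothetical node type: leaves hold operands, internal nodes hold operators.
struct ExprNode {
    std::string token;
    std::unique_ptr<ExprNode> left, right;
};

// Build an expression tree from a space-separated postfix string.
// Assumption: every operator is binary and is one of + - * /.
std::unique_ptr<ExprNode> buildFromPostfix(const std::string& postfix) {
    std::stack<std::unique_ptr<ExprNode>> st;
    std::istringstream in(postfix);
    std::string tok;
    while (in >> tok) {
        auto node = std::make_unique<ExprNode>();
        node->token = tok;
        if (tok == "+" || tok == "-" || tok == "*" || tok == "/") {
            assert(st.size() >= 2 && "operator needs two operands");
            node->right = std::move(st.top());  // popped first: right operand
            st.pop();
            node->left = std::move(st.top());   // popped second: left operand
            st.pop();
        }
        st.push(std::move(node));               // operand leaf or new subtree
    }
    assert(st.size() == 1 && "malformed postfix expression");
    auto root = std::move(st.top());
    st.pop();
    return root;
}

// Fully parenthesized infix form, handy for checking the tree shape.
std::string toInfix(const ExprNode* n) {
    if (!n->left) return n->token;  // leaf (operand)
    return "(" + toInfix(n->left.get()) + " " + n->token + " " +
           toInfix(n->right.get()) + ")";
}
```

Calling `toInfix(buildFromPostfix("a b + c d e + * *").get())` reproduces the tree built step by step above in parenthesized form.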

AVL-Trees

Objectives

• Understand the concept of an AVL tree.
• Understand how AVL trees and BSTs differ.
• Understand the issues involved in balancing an AVL tree.

Introduction

• Named for its inventors (Adelson-Velsky and Landis), an AVL tree is a binary search tree in which the heights of the two subtrees of every node differ by no more than 1.
• It is thus a balanced BST.
• To understand the significance of the tree being balanced, let’s look at two different trees containing the same data.

Introduction

• This “tree” is just a linked list in binary tree clothing.
• It takes 2 tests to locate 12, 3 to locate 14, and 8 to locate 52.
• Hence, the search effort for this binary tree is O(n).

Introduction

• This BST is an AVL tree.
• It takes 2 tests to locate 18, 3 to locate 12, and 4 to locate 8.
• Hence, the search effort for this binary tree is O(log2 n).

Introduction

• For a tree with 1000 nodes, the worst case for a completely unbalanced tree is 1000 tests.
• However, the worst case for a balanced tree is 10 tests, since a balanced tree with 1000 nodes has about log2 1000 ≈ 10 levels.
• Hence, balancing a tree can lead to significant improvements.

Balanced Binary Search Trees

• height is O(log n), where n is the number of elements in the tree
• AVL (Adelson-Velsky and Landis) trees
• red-black trees
• get, put, and remove take O(log n) time

Balanced Binary Search Trees

• Indexed AVL trees
• Indexed red-black trees
• Indexed operations also take O(log n) time

Balanced Search Trees

• weight balanced binary search trees
• 2-3 & 2-3-4 trees
• B-trees
• etc.

AVL Tree

• binary tree
• for every node x, define its balance factor:
  balance factor of x = height of left subtree of x - height of right subtree of x
• balance factor of every node x is -1, 0, or 1

Balance Factors

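• This is an AVL tree.

[Figure: a binary tree with each node labelled by its balance factor. Labels by level, top to bottom: -1 / 1 1 / -1 0 0 1 / 0 0 0 -1 0 / 0]

Height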

The height of an AVL tree that has n nodes is at most 1.44 log2 (n+2).
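The upper bound above can be checked numerically against the sparsest possible AVL trees. The sketch below is only an illustration under stated assumptions: it measures height with a leaf at height 0, and uses the standard recurrence for the minimum node count (a root plus the sparsest subtrees of heights h-1 and h-2). The names `minNodes` and `withinAvlBound` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// result[i] = fewest nodes an AVL tree of height i can have, assuming
// height(leaf) = 0: N_0 = 1, N_1 = 2, N_i = N_{i-1} + N_{i-2} + 1.
std::vector<long long> minNodes(int maxHeight) {
    assert(maxHeight >= 1);
    std::vector<long long> n(maxHeight + 1);
    n[0] = 1;
    n[1] = 2;
    for (int h = 2; h <= maxHeight; ++h) {
        n[h] = n[h - 1] + n[h - 2] + 1;  // root + two sparsest subtrees
    }
    return n;
}

// True if a tree of height h with n nodes respects h <= 1.44 * log2(n + 2).
bool withinAvlBound(int h, long long n) {
    return h <= 1.44 * std::log2(static_cast<double>(n) + 2.0);
}
```

For instance, the sparsest AVL tree of height 10 has 232 nodes and satisfies the bound, while a degenerate chain of 8 nodes (height 7) does not.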

The height of every n-node binary tree is at least log2 (n+1).

AVL Search Tree

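[Figure: an AVL search tree with root 10; its children are 7 and 40; the next level holds 3, 8, 30, 45; below them are 1, 5, 20, 35, 60; and 25 is the right child of 20. Balance factors: 10: -1, 7: 1, 40: 1, 3: 0, 8: 0, 30: 1, 45: -1, 20: -1, and 0 for 1, 5, 35, 60, 25.]

put(9)

9 is inserted as the right child of 8. The balance factor of 8 becomes -1 and that of 7 becomes 0. No node is unbalanced.

put(29)

29 is inserted into the original tree as the right child of 25. The balance factor of 25 becomes -1 and that of 20 becomes -2.
RR imbalance => new node is in right subtree of right subtree of blue node (node with bf = -2, here 20).

put(29)

RR rotation at 20: 25 becomes the root of that subtree with children 20 and 29, all three with balance factor 0. The rest of the tree is unchanged.

Balanced binary tree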

• The disadvantage of a binary search tree is that its height can be as large as N-1.
• This means that the time needed to perform insertion, deletion, and many other operations can be O(N) in the worst case.
• We want a tree with small height.
• A binary tree with N nodes has height at least ⌊log2 N⌋, which is Θ(log N).
• Thus, our goal is to keep the height of a binary search tree O(log N).
• Such trees are called balanced binary search trees. Examples are the AVL tree and the red-black tree.

AVL tree

Height of a node
• The height of a leaf is zero.
• The height of an internal node is the maximum height of its children plus 1.

AVL tree

• An AVL tree is a binary search tree in which
  – for every node in the tree, the heights of the left and right subtrees differ by at most 1.
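The height definition and the AVL condition above translate directly into two short recursive functions. This is a sketch only: the `Node` type and the function names are hypothetical, and an empty subtree is assumed to have height -1 so that a leaf comes out at height 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Hypothetical minimal node type for illustration.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Height as defined above: a leaf has height 0. An empty subtree is
// given height -1 (an assumption that makes the leaf case work out).
int height(const Node* n) {
    if (!n) return -1;
    return 1 + std::max(height(n->left), height(n->right));
}

// True if, at every node, the two subtree heights differ by at most 1.
// Only the balance condition is checked, not the BST ordering.
bool isBalanced(const Node* n) {
    if (!n) return true;
    int diff = height(n->left) - height(n->right);
    return std::abs(diff) <= 1 && isBalanced(n->left) && isBalanced(n->right);
}
```

Recomputing heights at every node makes this check quadratic in the worst case; practical implementations usually store a height or balance factor in each node instead.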

[Figure: example tree with a node marked "AVL property violated here"]

AVL tree

• Let x be the root of an AVL tree of height h.
• Let $N_h$ denote the minimum number of nodes in an AVL tree of height h.
• Clearly, $N_i \ge N_{i-1}$ by definition.
• We have
  $N_h \ge N_{h-1} + N_{h-2} + 1$
  $\ge 2N_{h-2} + 1$
  $> 2N_{h-2}$
• By repeated substitution, we obtain the general form $N_h > 2^i N_{h-2i}$.
• The boundary conditions are: $N_0 = 1$ and $N_1 = 2$. Taking $i \approx h/2$ gives $N_h > 2^{h/2 - 1}$, so $h < 2 \log_2 N_h + 2$. This implies that $h = O(\log N_h)$.
• Thus, many operations (searching, insertion, deletion) on an AVL tree will take O(log N) time.

Rotations

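• When the tree structure changes (e.g., insertion or deletion), we need to transform the tree to restore the AVL tree property.
• This is done using single rotations or double rotations.

e.g. Single Rotation
Before rotation: y is the root, with left child x and right subtree C; x has subtrees A and B.
After rotation: x is the root, with left subtree A and right child y; y has subtrees B and C.

Rotations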

• Since an insertion/deletion involves adding/deleting a single node, this can only increase/decrease the height of some subtree by 1.
• Thus, if the AVL tree property is violated at a node x, it means that the heights of left(x) and right(x) differ by exactly 2.
• Rotations will be applied to x to restore the AVL tree property.

Insertion

• First, insert the new key as a new leaf just as in an ordinary binary search tree.
• Then trace the path from the new leaf towards the root. For each node x encountered, check if the heights of left(x) and right(x) differ by at most 1.
• If yes, proceed to parent(x). If not, restructure by doing either a single rotation or a double rotation [next slide].
• For insertion, once we perform a rotation at a node x, we won’t need to perform any rotation at any ancestor of x.

Insertion

• Let x be the node at which the heights of left(x) and right(x) differ by more than 1.
• Assume that the height of x is h+3.
• There are 4 cases:
  – Height of left(x) is h+2 (i.e. height of right(x) is h)
    • LL: Height of left(left(x)) is h+1 → single rotate with left child
    • LR: Height of right(left(x)) is h+1 → double rotate with left child
  – Height of right(x) is h+2 (i.e. height of left(x) is h)
    • RR: Height of right(right(x)) is h+1 → single rotate with right child
    • RL: Height of left(right(x)) is h+1 → double rotate with right child

AVL Rotations

• RR: right-right
• LL: left-left
• RL: right-left
• LR: left-right

Single rotation (LL)

The new key is inserted in the subtree A. The AVL-property is violated at x:
• height of left(x) is h+2
• height of right(x) is h.

Single rotation (RR)

The new key is inserted in the subtree C. The AVL-property is violated at x.
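The two single rotations above amount to a pair of pointer updates each. The sketch below is illustrative only: the `Node` type and the function names are hypothetical, x is the node where the AVL property is violated, y is its taller child, and the caller is assumed to re-attach the returned node to x's former parent.

```cpp
#include <cassert>

// Hypothetical minimal node type; heights and balance factors are omitted.
struct Node {
    double key;
    Node* left;
    Node* right;
};

// LL case: rotate x with its left child y. Returns the new subtree root (y).
Node* rotateWithLeftChild(Node* x) {
    assert(x && x->left);
    Node* y = x->left;
    x->left = y->right;  // y's right subtree (B) moves under x
    y->right = x;        // x becomes the right child of y
    return y;
}

// RR case: mirror image, rotate x with its right child y.
Node* rotateWithRightChild(Node* x) {
    assert(x && x->right);
    Node* y = x->right;
    x->right = y->left;  // y's left subtree (B) moves under x
    y->left = x;         // x becomes the left child of y
    return y;
}
```

Each function touches a constant number of pointers, and the two rotations undo each other: applying `rotateWithRightChild` to the result of `rotateWithLeftChild` restores the original shape.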

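Single rotation takes O(1) time. Insertion takes O(log N) time.

AVL Tree

Example (LL): the tree has root 5 (x) with children 3 (y) and 8 (subtree C); 3 has children 1 (subtree A) and 4 (subtree B).
Insert 0.8: it becomes the left child of 1, so the AVL property is violated at 5.
After rotation: 3 is the root with children 1 and 5; 0.8 is the left child of 1; 5 has children 4 and 8.

Double rotation (LR)

The new key is inserted in the subtree B1 or B2. The AVL-property is violated at x. x-y-z forms a zig-zag shape.
Also called left-right rotate.

Double rotation (RL)

The new key is inserted in the subtree B1 or B2. The AVL-property is violated at x.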
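Both double rotations above can be written as two single rotations: first at the child y (lifting z above y), then at x (lifting z above x). This is a sketch under the same assumptions as any pointer-only illustration: the `Node` type and function names are hypothetical, and height or balance-factor bookkeeping is left out.

```cpp
#include <cassert>

// Hypothetical minimal node type; heights and balance factors are omitted.
struct Node {
    double key;
    Node* left;
    Node* right;
};

// Single rotation of x with its left child.
Node* rotateWithLeftChild(Node* x) {
    Node* y = x->left;
    x->left = y->right;
    y->right = x;
    return y;
}

// Single rotation of x with its right child.
Node* rotateWithRightChild(Node* x) {
    Node* y = x->right;
    x->right = y->left;
    y->left = x;
    return y;
}

// LR case: y = left(x), z = right(y). Returns the new subtree root (z).
Node* doubleRotateWithLeftChild(Node* x) {
    assert(x && x->left && x->left->right);
    x->left = rotateWithRightChild(x->left);  // z rises above y
    return rotateWithLeftChild(x);            // z rises above x
}

// RL case: y = right(x), z = left(y). Mirror image of the LR case.
Node* doubleRotateWithRightChild(Node* x) {
    assert(x && x->right && x->right->left);
    x->right = rotateWithLeftChild(x->right);  // z rises above y
    return rotateWithRightChild(x);            // z rises above x
}
```

After either double rotation, z is the subtree root with x and y as its children, and z's former subtrees B1 and B2 are split between them.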

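The RL double rotation is also called right-left rotate.

AVL Tree

Example (left-right double rotation): the tree has root 5 (x) with children 3 (y) and 8 (subtree C); 3 has children 1 (subtree A) and 4 (z).
Insert 3.5: it becomes the left child of 4 (subtree B), so the AVL property is violated at 5.
After rotation: 4 is the root with children 3 and 5; 3 has children 1 and 3.5; 5 has right child 8.

An Extended Example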

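Insert 3, 2, 1, 4, 5, 6, 7, 16, 15, 14 into an initially empty AVL tree. Trees are written as root(left, right), with - for an empty subtree.

Fig 1: insert 3: 3
Fig 2: insert 2: 3(2, -)
Fig 3: insert 1: 3(2(1, -), -); the AVL property is violated at 3
Fig 4: single rotation: 2(1, 3)
Fig 5: insert 4: 2(1, 3(-, 4))
Fig 6: insert 5: 2(1, 3(-, 4(-, 5))); violated at 3
Fig 7: single rotation: 2(1, 4(3, 5))
Fig 8: insert 6: 2(1, 4(3, 5(-, 6))); violated at 2
Fig 9: single rotation: 4(2(1, 3), 5(-, 6))
Fig 10: insert 7: 4(2(1, 3), 5(-, 6(-, 7))); violated at 5
Fig 11: single rotation: 4(2(1, 3), 6(5, 7))
Fig 12: insert 16: 4(2(1, 3), 6(5, 7(-, 16)))
Fig 13: insert 15: 4(2(1, 3), 6(5, 7(-, 16(15, -)))); violated at 7
Fig 14: double rotation: 4(2(1, 3), 6(5, 15(7, 16)))
Fig 15: insert 14: 4(2(1, 3), 6(5, 15(7(-, 14), 16))); violated at 6
Fig 16: double rotation: 4(2(1, 3), 7(6(5, -), 15(14, 16)))

AVL Tree template class

    enum Boolean { FALSE, TRUE };

    template <class keyType> class AVL;

    template <class keyType>
    class AvlNode {
        friend class AVL<keyType>;
    private:
        keyType data;
        AvlNode *LeftChild, *RightChild;
        int bf;  // balance factor: height(left) - height(right)
    };

    template <class keyType>
    class AVL {
        AvlNode<keyType>* root;
    public:
        AVL(AvlNode<keyType>* init = 0) : root(init) {}
        Boolean Insert(const keyType&);
        Boolean Delete(const keyType&);
        AvlNode<keyType>* Search(const keyType&);
    };

Insert in AVL Tree

    // keyType is assumed to have a comparable member named key.
    template <class keyType>
    Boolean AVL<keyType>::Insert(const keyType& x)
    {
        AvlNode<keyType> *a, *b, *c, *f, *p, *q, *y;
        Boolean Found, Unbalanced;
        int d;

        if (!root) {  // special case: empty tree
            y = new AvlNode<keyType>; y->data = x; root = y; root->bf = 0;
            root->LeftChild = root->RightChild = 0;
            return TRUE;
        }

        // Phase 1: locate the insertion point.
        // a = most recent node with bf +1/-1, f = parent of a, q follows p.
        f = 0; a = p = root; q = 0; Found = FALSE;
        while (p && !Found) {
            if (p->bf) { a = p; f = q; }
            if (x.key < p->data.key) { q = p; p = p->LeftChild; }
            else if (x.key > p->data.key) { q = p; p = p->RightChild; }
            else { y = p; Found = TRUE; }
        }

        // Phase 2: insert and rebalance.
        if (!Found) {
            y = new AvlNode<keyType>;
            y->data = x; y->bf = 0;
            y->LeftChild = y->RightChild = 0;
            if (x.key < q->data.key) q->LeftChild = y;
            else q->RightChild = y;

            // Adjust balance factors on the path from a to q.
            // d = +1: x went into the left subtree of a; d = -1: right subtree.
            if (x.key > a->data.key) { p = a->RightChild; b = p; d = -1; }
            else { p = a->LeftChild; b = p; d = 1; }
            while (p != y)
                if (x.key > p->data.key) { p->bf = -1; p = p->RightChild; }
                else { p->bf = 1; p = p->LeftChild; }

            Unbalanced = TRUE;
            if (!(a->bf) || !(a->bf + d)) {  // tree still balanced
                a->bf += d; Unbalanced = FALSE;
            }

            if (Unbalanced) {  // tree unbalanced
                if (d == 1) {  // left imbalance
                    if (b->bf == 1) {  // rotation LL
                        a->LeftChild = b->RightChild;
                        b->RightChild = a; a->bf = 0; b->bf = 0;
                    }
                    else {  // rotation LR
                        c = b->RightChild;
                        b->RightChild = c->LeftChild;
                        a->LeftChild = c->RightChild;
                        c->LeftChild = b;
                        c->RightChild = a;
                        switch (c->bf) {
                            case 1: a->bf = -1; b->bf = 0; break;
                            case -1: b->bf = 1; a->bf = 0; break;
                            case 0: b->bf = 0; a->bf = 0; break;
                        }
                        c->bf = 0; b = c;  // c is the new subtree root
                    }  // end of LR
                }  // end of left imbalance
                else {  // right imbalance: mirror image of the left case
                    if (b->bf == -1) {  // rotation RR
                        a->RightChild = b->LeftChild;
                        b->LeftChild = a; a->bf = 0; b->bf = 0;
                    }
                    else {  // rotation RL
                        c = b->LeftChild;
                        b->LeftChild = c->RightChild;
                        a->RightChild = c->LeftChild;
                        c->RightChild = b;
                        c->LeftChild = a;
                        switch (c->bf) {
                            case -1: a->bf = 1; b->bf = 0; break;
                            case 1: b->bf = -1; a->bf = 0; break;
                            case 0: b->bf = 0; a->bf = 0; break;
                        }
                        c->bf = 0; b = c;
                    }  // end of RL
                }  // end of right imbalance

                // The subtree rooted at b replaces the one rooted at a.
                if (!f) root = b;
                else if (a == f->LeftChild) f->LeftChild = b;
                else if (a == f->RightChild) f->RightChild = b;
            }  // end of tree unbalanced
            return TRUE;
        }  // end of if (!Found)
        return FALSE;  // x is already in the tree
    }

Deletion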

• Delete a node x as in an ordinary binary search tree. Note that the last node deleted is a leaf.
• Then trace the path from the parent of the deleted leaf towards the root.
• For each node x encountered, check if the heights of left(x) and right(x) differ by at most 1. If yes, proceed to parent(x). If not, perform an appropriate rotation at x. There are 4 cases as in the case of insertion.
• For deletion, after we perform a rotation at x, we may have to perform a rotation at some ancestor of x. Thus, we must continue to trace the path until we reach the root.

Deletion

• On closer examination: the single rotations for deletion can be divided into 4 cases (instead of 2 cases)
  – Two cases for rotate with left child
  – Two cases for rotate with right child

Single rotations in deletion

In both figures, a node is deleted in subtree C, causing the height to drop to h. The height of y is h+2. When the height of subtree A is h+1, the height of B can be h or h+1. Fortunately, the same single rotation can correct both cases.

[Figures: rotate with left child]

Single rotations in deletion

In both figures, a node is deleted in subtree A, causing the height to drop to h. The height of y is h+2. When the height of subtree C is h+1, the height of B can be h or h+1. A single rotation can correct both cases.

[Figures: rotate with right child]

Rotations in deletion

• There are 4 cases for single rotations, but we do not need to distinguish among them.
• There are exactly two cases for double rotations (as in the case of insertion).
• Therefore, we can reuse exactly the same procedure for insertion to determine which rotation to perform.
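One way to see this reuse is a single `rebalance` routine called on the way back up from both insertion and deletion. The sketch below is an assumption-laden illustration, not the lecture's code: it stores a height in each node instead of a balance factor, uses recursion instead of parent tracing, and all names (`Node`, `h`, `update`, `rebalance`, `insert`, `erase`) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

struct Node {
    int key;
    int height;  // height of the subtree rooted here; a leaf has height 0
    Node* left;
    Node* right;
};

int h(const Node* n) { return n ? n->height : -1; }
void update(Node* n) { n->height = 1 + std::max(h(n->left), h(n->right)); }

Node* rotateWithLeftChild(Node* x) {
    Node* y = x->left;
    x->left = y->right; y->right = x;
    update(x); update(y);
    return y;
}

Node* rotateWithRightChild(Node* x) {
    Node* y = x->right;
    x->right = y->left; y->left = x;
    update(x); update(y);
    return y;
}

// Shared by insert and erase: restore the AVL property at x if violated.
Node* rebalance(Node* x) {
    update(x);
    if (h(x->left) - h(x->right) == 2) {
        if (h(x->left->left) < h(x->left->right))    // LR: double rotation
            x->left = rotateWithRightChild(x->left);
        return rotateWithLeftChild(x);               // LL, incl. the tie seen in deletion
    }
    if (h(x->right) - h(x->left) == 2) {
        if (h(x->right->right) < h(x->right->left))  // RL: double rotation
            x->right = rotateWithLeftChild(x->right);
        return rotateWithRightChild(x);              // RR, incl. the tie seen in deletion
    }
    return x;
}

Node* insert(Node* t, int key) {
    if (!t) return new Node{key, 0, nullptr, nullptr};
    if (key < t->key) t->left = insert(t->left, key);
    else if (key > t->key) t->right = insert(t->right, key);
    return rebalance(t);  // duplicate keys are ignored
}

Node* erase(Node* t, int key) {
    if (!t) return nullptr;
    if (key < t->key) t->left = erase(t->left, key);
    else if (key > t->key) t->right = erase(t->right, key);
    else if (t->left && t->right) {
        Node* m = t->right;              // in-order successor
        while (m->left) m = m->left;
        t->key = m->key;
        t->right = erase(t->right, m->key);
    } else {
        Node* child = t->left ? t->left : t->right;
        delete t;
        return child;                    // at most one child remains
    }
    return rebalance(t);                 // continues all the way up to the root
}
```

Because `rebalance` compares only the heights of the two grandchild subtrees, the extra deletion case (both grandchildren equally tall) falls into the single-rotation branch without special handling.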