<<

Binary Search - Best AVL Trees Time • All BST operations are O(d), where d is CSE 326 tree depth • minimum d is = for a Data Structures d log2N⁄ with N nodes Unit 5 › What is the best case tree? › What is the worst case tree? Reading: Section 4.4 • So, best case running time of BST operations is O(log N)

2

Binary - Worst Balanced and unbalanced BST Time

• Worst case running time is O(N) 1 4 › What happens when you Insert elements in 2 ascending order? 2 5 3 • Insert: 2, 4, 6, 8, 10, 12 into an empty BST 1 3 › Problem: Lack of “balance”: 4 • compare depths of left and right subtree 4 Is this “balanced”? 5 › Unbalanced degenerate tree 2 6 6

1 3 5 7 7

3 4 Balancing Binary Search Approaches to balancing trees Trees • Don't balance • Many algorithms exist for keeping › May end up with some nodes very deep binary search trees balanced • Strict balance › Adelson-Velskii and Landis (AVL) trees › The tree must always be balanced perfectly (height-balanced trees) • Pretty good balance › Splay trees and other self-adjusting trees › Only allow a little out of balance › B-trees and other multiway search trees • Adjust on access › Self-adjusting

5 6

AVL - Good but not Perfect Perfect Balance Balance • Want a complete tree after every operation • AVL trees are height-balanced binary › tree is full except possibly in the lower right search trees • This is expensive • Balance factor of a node › For example, insert 2 in the tree on the left and › height(left subtree) - height(right subtree) then rebuild as a complete tree • An AVL tree has balance factor 6 5 calculated at every node Insert 2 & › For every node, heights of left and right 4 9 complete tree 2 8 subtree can differ by no more than 1: For ∈ 1 5 8 1 4 6 9 every node t h(t.left)-h(t.right) {-1, 0, 1}

7 › Store current heights in each node 8 Height of an AVL Tree Height of an AVL Tree

• N(h) = minimum number of nodes in an • N(h) > φh (φ ≈ 1.62) AVL tree of height h. • Suppose we have n nodes in an AVL • Basis tree of height h. › N(0) = 1, N(1) = 2 › n > N(h) (because N(h) was the minimum) h • Induction h › n > φ hence logφ n > h (relatively well › N(h) = N(h-1) + N(h-2) + 1 balanced tree!!) • Solution (recall Fibonacci analysis) › h < 1.44 log2n (i.e., Find takes O(logn)) h-2 › N(h) > φh (φ ≈ 1.62) h-1

9 10

Node Heights Node Heights after Insert 7

Tree A (AVL) Tree B (AVL) Tree B (AVL) Tree (not AVL) 2 2 2 balance factor 3 1-(-1) = 2 6 6 6 6 1 0 1 1 1 1 1 2 4 9 4 9 4 9 4 9 0 0 0 0 0 0 0 0 0 0 1 -1 1 5 1 5 8 1 5 8 1 5 8 0 7 height of node = h height of node = h

balance factor = hleft-hright balance factor = hleft-hright empty height = -1 empty height = -1

11 12 Insert and Rotation in AVL Single Rotation in an AVL Trees Tree

• Insert operation may cause balance factor 2 2 to become 2 or –2 for some node 6 6 1 2 1 1 › only nodes on the path from insertion point to 4 9 4 8 root node have possibly changed in height 0 0 1 0 0 0 0 › So after the Insert, go back up to the root 1 5 8 1 5 7 9 node by node, updating heights 0 7 › If a new balance factor (the difference hleft- hright) is 2 or –2, adjust tree by rotation around the node

13 14

Insertions in AVL Trees AVL Insertion: Outside Case

Let the node that needs rebalancing be α. Consider a valid h+2 AVL subtree j There are 4 cases: Outside Cases (require single rotation) : 1. Insertion into left subtree of left child of α. h h+1 2. Insertion into right subtree of right child of α. k Inside Cases (require double rotation) : h 3. Insertion into right subtree of left child of α. h Z 4. Insertion into left subtree of right child of α. The rebalancing is performed through four X Y separate rotation algorithms.

15 16 AVL Insertion: Outside Case AVL Insertion: Outside Case

j Inserting into X j Do a “right rotation” destroys the AVL Becomes property at node j h + 2 k h (h+2) - h k h

h+1 h Z h+1 h Z Y Y X X

17 18

Single right rotation Outside Case Completed

j Do a “right rotation” “Right rotation” done! k (“Left rotation” is mirror symmetric)

h k h+1 j

h+1 h Z h h Y X Y Z X AVL property has been restored! 19 20 AVL Insertion: Inside Case AVL Insertion: Inside Case

Consider a valid Inserting into Y Does “right rotation” j h+2 AVL subtree destroys the j restore balance? AVL property at node j Becomes h h h + 2 k h+1 k

h h h+1 Z h Z X X Y Y

21 22

AVL Insertion: Inside Case AVL Insertion: Inside Case

“Right rotation” Consider the structure k does not restore of subtree Y… j balance… now k is h j out of balance k h h X h+1 h h+1 Z Z X Y Y

23 24 AVL Insertion: Inside Case AVL Insertion: Inside Case

Y = node i and j j We will do a left-right subtrees V and W “double rotation” . . . h+2 k h k h i h+1 Z i Z

X h or h-1 X V W V W

25 26

Double rotation : second Double rotation : first rotation rotation left rotation complete j j Now do a right rotation i i k Z k Z W W X V X V

27 28 Double rotation : second rotation Implementation

double rotation complete hight key Balance has been h+2 left right i restored h+1 Another possible implementation: do not keep the height; just the h+1 k j difference in height, i.e. the balance factor (1,0,-1). In both implementation, this has to be modified on the path of h h h or h-1 insertion even if you don’t perform rotations Once you have performed a rotation (single or double) you won’t X V W Z need to go back up the tree

29 30

Single Rotation Double Rotation

RotateFromRight(n : reference node pointer) { • Implement Double Rotation in two lines. p : node pointer; p := n.right; n n.right := p.left; DoubleRotateFromRight(n : reference node pointer) { p.left := n; p ???? n n := p } }

We also need to X modify the heights X or balance factors Insert Y Z of n and p Z

31 V W 32 Insertion in AVL Trees Insert in BST

Insert(T : reference tree pointer, x : element) : integer { • Insert at the leaf (as for all BST) if T = null then T := new tree; T.data := x; return 1;//the links to › only nodes on the path from insertion point to //children are null case root node have possibly changed in height T.data = x : return 0; //Duplicate do nothing T.data > x : return Insert(T.left, x); › So after the Insert, go back up to the root T.data < x : return Insert(T.right, x); node by node, updating heights endcase } › If a new balance factor (the difference hleft- hright) is 2 or –2, adjust tree by rotation around the node

33 34

Example of Insertions in an Insert in AVL trees AVL Tree

Insert(T : reference tree pointer, x : element) : integer{ int temp; 2 if T = null then 20 Insert 5, 40 T := new tree; T.data := x; T.height=0; return 1; 0 1 else { case T.data = x : return 0; //Duplicate do nothing 10 30 T.data > x : temp = Insert(T.left, x); 0 0 if ((height(T.left)- height(T.right)) = 2){ 25 35 if (T.left.data > x ) then //outside case T = RotatefromLeft (T); else //inside case T = DoubleRotatefromLeft (T); return temp; } T.data < x : temp = Insert(T.right, x); code similar to the left case Endcase } T.height := max(height(T.left),height(T.right)) +1; return 1; 35 36 } Example of Insertions in an Single rotation (outside case) AVL Tree

2 3 3 3 20 20 20 20 1 1 1 2 1 2 1 2 10 30 10 30 10 30 10 30 0 0 0 1 0 0 2 0 0 0 0 5 25 35 5 25 35 5 25 35 5 25 40 1 0 0 0 40 35 45 Now Insert 45 Imbalance 1 40

0 45 Now Insert 34

37 38

Double rotation (inside case) AVL Tree Deletion

3 3 • Similar but more complex than insertion 20 20 1 3 1 2 › Rotations and double rotations needed to 10 30 10 35 rebalance 0 0 2 0 1 › Imbalance may propagate upward so that 5 Imbalance 25 40 5 30 40 1 0 many rotations may be needed. 1 35 45 0 0 25 34 45 Insertion of 34 0 34

39 40 Pros and of AVL Trees Double Rotation Solution Arguments for AVL trees:

1. Search is O(log N) since AVL trees are always balanced. DoubleRotateFromRight(n : reference node pointer) { 2. Insertion and deletions are also O(logn) RotateFromLeft(n.right); 3. The height balancing adds no more than a constant factor to the RotateFromRight(n); n speed of insertion. } Arguments against using AVL trees: 1. Difficult to program & debug; more space for balance factor. 2. Asymptotically faster but rebalancing costs time. X 3. Most large searches are done in database systems on disk and use other structures (e.g. B-trees). Z 4. May be OK to have O(N) for a single operation if total run time for many consecutive operations is fast (e.g. Splay trees). V W

41 42