Binary Search Tree

• A BST converts a static binary search into a dynamic binary search, allowing data items to be inserted and deleted efficiently
• Compare the left-to-right ordering in a BST to the bottom-up ordering in a heap, where the key of each parent node is greater than or equal to the key of any child node
• Left-to-right ordering in a BST: for every node x, the keys k_left of all nodes in the left subtree are smaller than the key k_parent in x, and the keys k_right of all nodes in the right subtree are larger than the key in x:

    k_left < k_parent < k_right

Binary Search Tree (BST): find / insert operations

• No duplicates! (attach them all to a single item)
• Basic operations:
– find: find a given search key or detect that it is not present in the tree (find is the successful binary search)
– insert: insert a node with a given key into the tree if it is not found (insert creates a new node at the point at which the unsuccessful search stops)
– findMin: find the minimum key
– findMax: find the maximum key
– remove: remove a node with a given key and restore the tree if necessary
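A minimal sketch of find and insert in Python (illustrative code, not from the lecture; the Node class and function names are assumptions):

    # Minimal BST sketch: find is a successful binary search; insert
    # creates a new node where the unsuccessful search stops.
    class Node:
        def __init__(self, key):
            self.key, self.left, self.right = key, None, None

    def find(root, key):
        while root is not None:
            if key < root.key:
                root = root.left
            elif key > root.key:
                root = root.right
            else:
                return root        # successful search
        return None                # key not present

    def insert(root, key):
        if root is None:
            return Node(key)       # the point where the search stopped
        if key < root.key:
            root.left = insert(root.left, key)
        elif key > root.key:
            root.right = insert(root.right, key)
        # equal key: no duplicates are stored
        return root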


Binary Search Trees: findMin / findMax / sort

• Extremely simple: starting at the root, branch repeatedly left (findMin) or right (findMax) as long as a corresponding child exists
• As in QuickSort, the recursive traversal of the tree can sort the items:
– First visit the left subtree
– Then visit the root
– Then visit the right subtree

Binary Search Tree: running time

• Running time for find, insert, findMin, findMax, sort: O(log n) average-case and O(n) worst-case complexity (just as in QuickSort)
• The root of the tree plays the role of the pivot in QuickSort
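A sketch of findMin, findMax, and the in-order traversal that yields the keys in sorted order (Python; reuses the illustrative Node/insert sketch above):

    # findMin / findMax: branch repeatedly left or right while possible.
    def find_min(root):
        while root.left is not None:
            root = root.left
        return root

    def find_max(root):
        while root.right is not None:
            root = root.right
        return root

    # In-order traversal: left subtree, root, right subtree -> sorted keys.
    def in_order(root, out):
        if root is not None:
            in_order(root.left, out)
            out.append(root.key)
            in_order(root.right, out)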

[Figures: the same seven keys 1, 3, 4, 5, 8, 10, 15 arranged as a BST of depth about log n and as a BST of depth about n]

Binary Search Tree: node removal

• remove is the most complex operation:
– The removal may disconnect parts of the tree
– The reattachment of the tree must maintain the binary search tree property
– The reattachment should not make the tree unnecessarily deeper, as the depth specifies the running time of the tree operations

BST: how to remove a node

• If the node k to be removed is a leaf, delete it
• If the node k has only one child, remove it after linking its child to its parent node
• Thus, removeMin and removeMax are not complex, because the affected nodes are either leaves or have only one child
• If the node k to be removed has two children, then replace the item in this node with the item with the smallest key in the right subtree, and remove this latter node from the right subtree (Exercise: if possible, how can the nodes in the left subtree be used instead?)
• The second removal is very simple, as the node with the smallest key does not have a left child
• The smallest node is easily found as in findMin
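A removal sketch covering the three cases above (Python; reuses the illustrative find_min from earlier):

    def remove(root, key):
        if root is None:
            return None                        # key not found
        if key < root.key:
            root.left = remove(root.left, key)
        elif key > root.key:
            root.right = remove(root.right, key)
        elif root.left is not None and root.right is not None:
            # Two children: copy the smallest key of the right subtree
            # here, then remove that node (it has no left child).
            smallest = find_min(root.right)
            root.key = smallest.key
            root.right = remove(root.right, smallest.key)
        else:
            # Leaf or one child: link the child (or None) to the parent.
            root = root.left if root.left is not None else root.right
        return root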


BST: an Example of Node Removal

[Figure: a step-by-step example of removing a node from a BST]

Average-Case Performance of Binary Search Tree Operations

• The internal path length (IPL) of a binary tree is the sum of the depths of its nodes; for example, for an 8-node tree with nodes at depths 0, 1, 1, 2, 2, 3, 3, 3:
IPL = 0 + 1 + 1 + 2 + 2 + 3 + 3 + 3 = 15
• The average internal path length T(n) over the binary search trees with n nodes is O(n log n)


Average-Case Performance of Binary Search Tree Operations (continued)

• If the n-node tree contains the root, the i-node left subtree, and the (n−i−1)-node right subtree, then
T(n) = n − 1 + T(i) + T(n−i−1)
because the root contributes 1 to the path length of each of the other n − 1 nodes
• Averaging over all i, 0 ≤ i < n, yields the same recurrence as for QuickSort:
T(n) = (n − 1) + (2/n)·(T(0) + T(1) + … + T(n−1))
so that T(n) is O(n log n)
• Therefore, the average complexity of the find or insert operations is T(n)/n = O(log n)
• For n pairs of random insert / remove operations, the expected depth is O(n^0.5)
• In practice, for random input, all operations are about O(log n), but the worst-case performance can be O(n)!
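A quick numeric check of the recurrence (a Python sketch; the exact solution behaves like 2n·ln n, so the printed ratios drift slowly toward 2):

    # T(n) = (n - 1) + (2/n) * (T(0) + ... + T(n-1)), with T(0) = 0.
    import math

    T, total = [0.0], 0.0
    for n in range(1, 2001):
        total += T[n - 1]
        T.append((n - 1) + 2.0 * total / n)
    for n in (500, 1000, 2000):
        print(n, T[n] / (n * math.log(n)))   # slowly approaches 2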


Balanced Trees

• Balancing ensures that the internal path lengths are close to the optimal n log n
• The average-case and the worst-case complexity are about O(log n) due to the balanced structure
• But insert and remove operations take more time on average than for the standard binary search trees
• Balanced trees:
– AVL tree (1962: Adelson-Velskii, Landis)
– Red-black and AA-trees
– B-tree (1972: Bayer, McCreight)

AVL Tree

• An AVL tree is a binary search tree with the following additional balance property:
– for any node in the tree, the heights of the left and right subtrees can differ by at most 1
– the height of an empty subtree is −1
• The AVL balance guarantees that an AVL tree of height h has at least c^h nodes, c > 1, and that the maximum depth of an n-item tree is about log_c n

[Figure: minimal AVL trees of heights 0, 1, 2, 3, of sizes S_0 = 1, S_1 = 2, S_2 = 4, S_3 = 7]

Lemma 3.19: The height of an AVL tree is O(log n).

Proof:
1) An AVL tree of height h has at most 2^(h+1) − 1 nodes.
2) We calculate the maximum height of an AVL tree with n nodes. Let S_h be the size of the smallest AVL tree of height h (it is obvious that S_0 = 1, S_1 = 2). We can set up and solve the following recurrence relation:
S_h = S_(h−1) + S_(h−2) + 1, which gives S_h = F_(h+3) − 1,
where F_i is the i-th Fibonacci number.

[Figures: a binary search tree of height 6 with 7 nodes (keys 1, 5, 10, 15, 20, 25, 30), and the minimal complete binary search tree of height 3]

Definition (Fibonacci numbers): F(n) = F(n−1) + F(n−2), where F_i is the i-th Fibonacci number; F_1 = 1, F_2 = 1, F_3 = 2, F_4 = F_2 + F_3 = 3.

Proof of S_h = F_(h+3) − 1 by mathematical induction:
1) Base case: S_0 = F_3 − 1 = 1, S_1 = F_4 − 1 = 2
2) Assumption: S_i = F_(i+3) − 1 holds up to height i
3) Inductive step: S_(i+1) = S_i + S_(i−1) + 1 = (F_(i+3) − 1) + (F_(i+2) − 1) + 1 = F_(i+4) − 1

AVL Tree

• Therefore, for each n-node AVL tree:
n ≥ S_h = F_(h+3) − 1 ≈ φ^(h+3)/√5 − 1, where φ = (1 + √5)/2 ≅ 1.618, or
h ≤ 1.44·log₂(n + 1) − 1.328
• Thus, the worst-case height is at most 44% more than the minimum height of a complete binary tree
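A short sanity check of S_h = F_(h+3) − 1 (a Python sketch; Fibonacci indexing follows the slide's F_1 = F_2 = 1):

    # Smallest-AVL-tree sizes: S_0 = 1, S_1 = 2, S_h = S_{h-1} + S_{h-2} + 1.
    def fib(i):                    # F_1 = F_2 = 1, F_3 = 2, ...
        a, b = 1, 1
        for _ in range(i - 1):
            a, b = b, a + b
        return a

    S = [1, 2]
    for h in range(2, 12):
        S.append(S[h - 1] + S[h - 2] + 1)
    assert all(S[h] == fib(h + 3) - 1 for h in range(12))  # the lemma holds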


Balancing an AVL Tree

• There are two mirror-symmetric pairs of cases to rebalance the tree if, after the insertion of a new key at the bottom, the AVL property is invalidated
• Only one single or double rotation is sufficient
• Deletions are more complicated: O(log n) rotations can be required to restore the balance
• AVL balancing is not computationally efficient; better balanced search trees: red-black trees, AA-trees, B-trees

Single Rotation

• The inserted new key invalidates the AVL property
• To restore the AVL tree, the root of the unbalanced subtree is moved to the node B, and the rest of the tree is reorganised as a BST
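A sketch of one of the two mirror-symmetric single rotations (Python; node fields as in the earlier illustrative Node class):

    # Single rotation for the left-left case: the unbalanced root A is
    # replaced by its left child B; B's former right subtree becomes
    # A's new left subtree, preserving the BST ordering.
    def rotate_with_left_child(a):
        b = a.left
        a.left = b.right
        b.right = a
        return b          # the caller reattaches b as the subtree root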


Example of inserting a new key and rebalancing the AVL tree:

[Figure: inserting key 12 unbalances the tree rooted at 40 (children 15 and 60; 15's children 10 and 20; 12 below 10); a single rotation makes 15 the root, with children 10 (child 12) and 40 (children 20 and 60)]

Insertion algorithm (informal):
1) Ordinary binary search tree insertion
2) Rebalance the tree if necessary

Red-Black Tree

• A red-black tree is a binary search tree with the following ordering properties:
– Every node is coloured either red or black; the root is black
– Red Rule: If a node is red, its children must be black
– Path Rule: Every path from the root to a leaf or to a node with one child must contain the same number of black nodes


[Figure: a red-black tree with 8 nodes]
1) The Red Rule is satisfied.
2) There are 2 black nodes in each of the 5 paths from the root to a leaf or to a node with 1 child, so the Path Rule is satisfied.

[Figure: a red-black tree that is not balanced]
A new node below 10 is not possible! This shows that red-black trees are somewhat balanced.

Red-Black Tree

Statement: Because every path from the root to a leaf contains b black nodes, there are at least 2^b − 1 nodes in the tree.
• Proof (mathematical induction):
– Base case: it is valid for b = 1 (the tree is only the black root, possibly with 1 or 2 red children)
– Assumption: let it be valid for all red-black trees with b black nodes per path
– Inductive step: if a tree contains b+1 black nodes per path and the root has 2 black children, then it contains at least 2·(2^b − 1) + 1 = 2^(b+1) − 1 nodes

Red-Black Tree

If a red-black tree is complete, with all black nodes except for red leaves at the lowest level, the height of that tree is minimal, approximately log₂ n.
To get the maximum height for a given n, we would have as many red nodes as possible on one path, and all other nodes black. The path with all the red nodes would be about twice as long as the paths with no red nodes.
This leads us to the hypothesis that the maximum height of a red-black tree is less than 2·log₂ n.


Red-Black Tree

Proof of the hypothesis that the height h of any red-black tree with n nodes is O(log n):
By the Red Rule, at most half of the nodes in a root-to-leaf path can be red, so at least half of the nodes must be black. That means: b ≥ h/2.
Previous proof: n ≥ 2^b − 1
We replace b: n ≥ 2^(h/2) − 1
We get: h ≤ 2·log₂(n + 1)

Summary: AVL, Red-Black Trees

1) Red-black trees never get far out of balance.
2) Maximum height of an AVL tree:
h ≤ 1.44·log₂(n + 1) − 1.328
Maximum height of a red-black tree:
h ≤ 2·log₂(n + 1)


AA-Trees

• The software implementation of the insert and remove operations for red-black trees is a rather tricky process
• A balanced search AA-tree is a method of choice if deletions are needed
• The AA-tree adds one extra condition to the red-black tree: left children may not be red
• This condition greatly simplifies the red-black tree remove operation

B-Trees: Efficient External Search

• For very big databases, even log₂ n search steps may be unacceptable
• To reduce the number of disk accesses: an optimal m-ary search tree of height about log_m n
• m-way branching lowers the optimal tree height by the factor log₂ m (i.e., by 3.3 if m = 10)

n             10^5   10^6   10^7   10^8   10^9
⌈log₂ n⌉       17     20     24     27     30
⌈log₁₀ n⌉       5      6      7      8      9
⌈log₁₀₀ n⌉      3      3      4      4      5
⌈log₁₀₀₀ n⌉     2      2      3      3      3

Multiway Search Tree of Order m = 4

• In an m-ary search tree, at most m−1 keys are used to decide which branch to take
• The data records are associated only with leaves, so the worst-case and average-case searches involve the tree height and the average leaf depth, respectively

B-Tree Definition

• A B-tree of order m is defined as an m-ary balanced tree with the following properties:
– The root is either a leaf or it has between 2 and m children inclusive
– Every nonleaf node (except possibly the root) has between ⌈m/2⌉ and m children inclusive


B-Tree Definition (continued)

• A nonleaf node with μ children has μ−1 keys (key_i: i = 1, …, μ−1) to guide the search
• key_i represents the smallest key in subtree i+1
• All leaves are at the same depth
• The data items are stored at leaves, and every leaf contains between ⌈l/2⌉ and l data items, for some l that may be chosen independently of the tree order m
• Assuming that each node represents a disk block, the choice is based on the size of the items that are being stored

Naming the B-Trees

• B-trees are named after their branching factors, that is, a ⌈m/2⌉−m-tree
• A 4−7-tree is the B-tree of order m = 7
– 2..7 children per root
– 4..7 children per each non-root node
• A 2−4-tree is the B-tree of order m = 4
– 2..4 children per root
– 2..4 children per each non-root node


2−4-Tree

[Figure: a 2−4-tree; the root has between 2 and m children (here 3), the nonleaf nodes have between 2 and 4 children and 1 to 3 keys, and the leaves have a maximum of 7 records]

Example of choosing m, l

• Let one disk block hold 8192 bytes, and let each key use 32 bytes
• A branch is the number of another disk block, so let a branch be 4 bytes
• A B-tree of order m has m−1 keys per node plus m branches
• So the largest order m such that 1 node fits in 1 disk block: 32(m−1) + 4m = 36m − 32 ≤ 8192 → m = 228
• Let there be 10^7 data records, 256 bytes per record
• Because the block size is 8192 bytes, l = 32 records of size 256 bytes fit in a single block
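The slide's block-size arithmetic as a Python sketch (the constants are the example's assumptions):

    # One 8192-byte block per node: m-1 keys of 32 bytes + m 4-byte branches.
    BLOCK, KEY, BRANCH, RECORD = 8192, 32, 4, 256
    m = (BLOCK + KEY) // (KEY + BRANCH)   # 36m - 32 <= 8192  ->  m = 228
    l = BLOCK // RECORD                   # 256-byte records per leaf block
    print(m, l)                           # 228 32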

Example of choosing m, l (continued)

• Each leaf has between 16 and 32 data records inclusive, and each internal node, except the root, branches in 114 to 228 ways
• 10^7 records can be stored in 312,500 to 625,000 leaves (= 10^7 / (16..32))
• In the worst case the leaves would be on level 4, since 114² = 12,996 < 625,000 < 114³ = 1,481,544

Analysis of B-Trees

• A search or an insertion in a B-tree of order m with n data items requires fewer than ⌈log_m n⌉ disk accesses
• In practice, ⌈log_m n⌉ is almost constant as long as m is not small
• Data insertion is simple unless the corresponding leaf is already full; then it must be split into 2 leaves, and the parent(s) should be updated


Analysis of B-Trees (continued)

• Additional disk writes for data insertion and deletion are extremely rare
• An algorithm analysis beyond the scope of this course shows that insertions, deletions, and retrievals of data need only ⌈log_(⌈m/2⌉) n⌉ disk accesses in the worst case (e.g., ⌈log₁₁₄ 625,000⌉ = 3 in the above example)

Symbol Table and Hashing

• A (symbol) table is a set of table entries, (K, V)
• Each entry contains:
– a unique key, K, and
– a value (information), V
• Each key uniquely identifies its entry


Example table (key K = 3-letter code with its integer code k; associated value V = city, country, state):

Key K    k      Associated value V
AKL      271    Auckland, NZ
DCA      2080   Washington, USA, DC
FRA      3822   Frankfurt, Germany
GLA      4342   Glasgow, UK, Scotland
HKG      4998   Hong Kong, China
LAX      7459   Los Angeles, USA, California

Here k = 26²·c₀ + 26·c₁ + c₂, where c₀, c₁, c₂ are the integer codes of the letters in the English alphabet: A=0, B=1, C=2, …
A table is a mapping of keys to values.

Example: We have a table with 1000 elements. Each element has a security number as a key, and we need to transform the key into an index in our array. To allow fast access, we can take the rightmost 3 digits of the number. For example, 214303261 would be stored at index 261, and 033518000 would be stored at index 0. As you can see, two distinct keys can have the same 3 digits at the end: a collision!
Hashing is the process of transforming a key into a table index. A hash function performs an easily computable operation on the key and returns the hash value.
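The 3-letter code from the table, as a Python sketch (the function name is illustrative):

    # k = 26^2*c0 + 26*c1 + c2, with A=0, B=1, ..., Z=25.
    def letter_code(code):
        c0, c1, c2 = (ord(ch) - ord('A') for ch in code)
        return 26 * 26 * c0 + 26 * c1 + c2

    print(letter_code("AKL"), letter_code("DCA"))   # 271 2080, as in the table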


Symbol Table and Hashing (continued)

• Once the entry (K, V) is found, its value V may be updated, it may be retrieved, or the entire entry (K, V) may be removed from the table
• If no entry with key K exists in the table, a new table entry having K as its key may be inserted in the table
• Hashing is a technique for storing values in tables and searching for them in linear, O(n), worst-case and extremely fast, O(1), average-case time

Basic Features of Hashing

• Hashing computes an integer, called the hash code, for each object (key)
• The computation, called the hash function, h(K), maps objects (e.g., keys K) to array indices (e.g., 0, 1, …, i_max)
• An object having a key value K should be stored at location h(K), and the hash function must always return a valid index for the array


Basic Features of Hashing (continued)

• A perfect hash function produces a different index value for every key, but such a function cannot always be found
• Collision: two distinct keys, K₁ ≠ K₂, map to the same table address, h(K₁) = h(K₂)
• Collision resolution policy: how to find additional storage in which to store one of the collided table entries

How Common Are Collisions?

• Von Mises Birthday Paradox: if there are more than 23 people in a room, the chance is greater than 50% that two or more of them will have the same birthday
• Thus, in a table that is only 6.3% full (since 23/365 = 0.063) there is a "good" chance of a collision!


How Common Are Collisions? Probability P_N(n) of One or More Collisions

• Probability Q_N(n) of no collision (that is, that none of the n items collides when randomly tossed into a table of size N):

Q_N(1) = 1
Q_N(2) = Q_N(1)·(N−1)/N = N(N−1)/N²
Q_N(3) = Q_N(2)·(N−2)/N = N(N−1)(N−2)/N³
…
Q_N(n) = Q_N(n−1)·(N−n+1)/N = N(N−1)…(N−n+1)/Nⁿ = N! / ((N−n)!·Nⁿ)

• Probability of one or more collisions:
P_N(n) = 1 − Q_N(n) = 1 − N! / (Nⁿ·(N−n)!)

For N = 365:

n     n/N (%)   P_365(n)
10     2.7      0.1169
20     5.5      0.4114
30     8.2      0.7063
40    11.0      0.8912
50    13.7      0.9704
60    16.4      0.9941
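The product recurrence for Q_N(n), computed directly (a Python sketch):

    # P_N(n) = 1 - Q_N(n), where Q_N(n) = prod_{i=0}^{n-1} (N - i) / N.
    def collision_probability(n, N=365):
        q = 1.0
        for i in range(n):
            q *= (N - i) / N
        return 1.0 - q

    for n in (10, 20, 30, 40, 50, 60):
        print(n, round(collision_probability(n), 4))   # reproduces the table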


Open Addressing with Linear Probing (OALP)

• The simplest collision resolution policy:
– successively search for the first empty entry at a lower location
– if there is no such entry, "wrap around" the table
• Drawbacks: clustering of keys in the table

[Figure: OALP example with N = 10 and h(k) = int(k/10)]
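An OALP sketch in Python (probing toward lower locations and h(k) = int(k/10) follow the slide's example; names are illustrative):

    # Probe successively lower locations, wrapping around the table.
    def oalp_insert(table, key, h):
        i = h(key)
        while table[i] is not None:      # assumes at least one free slot
            i = (i - 1) % len(table)
        table[i] = key

    h = lambda k: k // 10                # N = 10, h(k) = int(k/10)
    table = [None] * 10
    for k in (42, 45, 41):               # all three hash to index 4
        oalp_insert(table, k, h)
    print(table)                         # 41, 45, 42 cluster at indices 2, 3, 4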


Open Addressing with Double Hashing (OADH)

• A better collision resolution policy, reducing the likelihood of clustering:
– hash the collided key again using a different hash function, and
– use the result of the second hashing as an increment for probing table locations (including wraparound)

[Figure: OADH example with N = 10, h(k) = int(k/10), and d(k) = (h(k) + k) mod 10]
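An OADH sketch (Python; the slide's h and d formulas are partly garbled in the source, so the versions below are assumptions in the same spirit):

    # The second hash value serves as the probe step.
    def oadh_insert(table, key):
        N = len(table)
        i = (key // 10) % N              # primary hash h(k) = int(k/10)
        d = max(1, (i + key) % N)        # slide's d(k) = (h(k)+k) mod 10,
                                         # guarded against a zero step
        while table[i] is not None:      # assumes at least one free slot
            i = (i - d) % N              # key-dependent step, with wraparound
        table[i] = key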


Two More Collision Resolution Techniques

• Open addressing has a problem when a significant number of items needs to be deleted, as logically deleted items must remain in the table until the table can be reorganised
• Two techniques to attenuate this drawback:
– Chaining
– Hash bucket

Chaining and Hash Bucket

• Chaining: all keys collided at a single hash address are placed on a linked list, or chain, started at that address
• Hash bucket: a big hash table is divided into a number of small sub-tables, or buckets
– the hash function maps a key into one of the buckets
– the keys are stored in each bucket sequentially in increasing order
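A chaining sketch (Python; bucket-local ordering is omitted for brevity):

    # Each table slot holds the chain of all keys that hashed to it.
    def chain_insert(table, key, h):
        table[h(key)].append(key)

    def chain_find(table, key, h):
        return key in table[h(key)]      # search only one chain

    h = lambda k: k % 10
    table = [[] for _ in range(10)]
    for k in (42, 52, 62):               # all collide at address 2
        chain_insert(table, k, h)
    print(chain_find(table, 52, h))      # True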


Choosing a hash function

• Four basic methods: division, folding, middle-squaring, and truncation
• Division:
– choose a prime number as the table size N
– convert keys, K, into integers
– use the remainder h(K) = K mod N as the hash value of the key K
– find a double hashing decrement using the quotient, ΔK = max{1, (K/N) mod N}

Choosing a hash function (continued)

• Folding:
– divide the integer key, K, into sections
– add, subtract, and/or multiply them together for combining into the final value, h(K)
• Example: K = 013402122 → sections 013, 402, 122 → h(K) = 013 + 402 + 122 = 537
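Division and folding sketches (Python; the example key is the slide's 013402122, and function names are illustrative):

    # Division: h(K) = K mod N (N prime), with the double-hashing
    # decrement taken from the quotient.
    def h_division(K, N):
        return K % N

    def delta_division(K, N):
        return max(1, (K // N) % N)

    # Folding: split the 9-digit key into three sections and add them.
    def h_folding(K):
        s = f"{K:09d}"
        return int(s[0:3]) + int(s[3:6]) + int(s[6:9])

    print(h_folding(13402122))           # 13 + 402 + 122 = 537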


Choosing a hash function (continued)

• Middle-squaring:
– choose a middle section of the integer key, K
– square the chosen section
– use a middle section of the result as h(K)
• Example: K = 013402122 → middle section: 402 → 402² = 161604 → middle: h(K) = 6160
• Truncation:
– delete part of the key, K
– use the remaining digits (bits, characters) as h(K)
• Example: K = 013402122 → last 3 digits: h(K) = 122
• Notice that truncation does not spread keys uniformly into the table; thus it is often used in conjunction with other methods
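Middle-squaring and truncation sketches (Python; the digit positions follow the slide's example):

    # Middle-squaring: square the middle section, keep the middle digits.
    def h_middle_square(K):
        mid = int(f"{K:09d}"[3:6])       # 013402122 -> 402
        return int(str(mid * mid)[1:5])  # 402^2 = 161604 -> 6160

    # Truncation: keep only the last 3 digits.
    def h_truncate(K):
        return K % 1000                  # 013402122 -> 122

    print(h_middle_square(13402122), h_truncate(13402122))   # 6160 122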


Universal Class by Division

• Theorem (universal class of hash functions by division):
– Let the size of a key set, K, be a prime number: |K| = M
– Let the members of K be regarded as the integers {0, …, M−1}
– For any numbers a ∈ {1, …, M−1} and b ∈ {0, …, M−1}, let h_(a,b)(k) = ((a·k + b) mod M) mod N

Table ADT Representations: Comparative Performance

Operation     Sorted array   AVL tree    Hash table
Initialize    O(N)           O(1)        O(N)
Is full?      O(1)           O(1)        O(1)
Search*       O(log N)       O(log N)    O(1)
Insert        O(N)           O(log N)    O(1)
Delete        O(N)           O(log N)    O(1)
Enumerate     O(N)           O(N)        O(N log N)**

* also: Retrieve, Update
** To enumerate a hash table, the entries must first be sorted in ascending order of keys, which takes O(N log N) time
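A sketch of the universal class h_(a,b) (Python; the prime M below is an arbitrary illustration, not from the slides):

    # h_{a,b}(k) = ((a*k + b) mod M) mod N, with a, b drawn at random.
    import random

    def make_universal_hash(M, N):
        a = random.randrange(1, M)       # a in {1, ..., M-1}
        b = random.randrange(0, M)       # b in {0, ..., M-1}
        return lambda k: ((a * k + b) % M) % N

    h = make_universal_hash(M=1_000_000_007, N=1000)   # 10^9 + 7 is prime
    print(h(214303261))                  # an index in 0..999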
