Binary Search Tree

Total Page:16

File Type:pdf, Size:1020Kb

Binary Search Tree Binary Search Tree - Best AVL Trees Time • All BST operations are O(d), where d is CSE 326 tree depth • minimum d is = for a binary tree Data Structures d log2N⁄ with N nodes Unit 5 › What is the best case tree? › What is the worst case tree? Reading: Section 4.4 • So, best case running time of BST operations is O(log N) 2 Binary Search Tree - Worst Balanced and unbalanced BST Time • Worst case running time is O(N) 1 4 › What happens when you Insert elements in 2 ascending order? 2 5 3 • Insert: 2, 4, 6, 8, 10, 12 into an empty BST 1 3 › Problem: Lack of “balance”: 4 • compare depths of left and right subtree 4 Is this “balanced”? 5 › Unbalanced degenerate tree 2 6 6 1 3 5 7 7 3 4 Balancing Binary Search Approaches to balancing trees Trees • Don't balance • Many algorithms exist for keeping › May end up with some nodes very deep binary search trees balanced • Strict balance › Adelson-Velskii and Landis (AVL) trees › The tree must always be balanced perfectly (height-balanced trees) • Pretty good balance › Splay trees and other self-adjusting trees › Only allow a little out of balance › B-trees and other multiway search trees • Adjust on access › Self-adjusting 5 6 AVL - Good but not Perfect Perfect Balance Balance • Want a complete tree after every operation • AVL trees are height-balanced binary › tree is full except possibly in the lower right search trees • This is expensive • Balance factor of a node › For example, insert 2 in the tree on the left and › height(left subtree) - height(right subtree) then rebuild as a complete tree • An AVL tree has balance factor 6 5 calculated at every node Insert 2 & › For every node, heights of left and right 4 9 complete tree 2 8 subtree can differ by no more than 1: For ∈ 1 5 8 1 4 6 9 every node t h(t.left)-h(t.right) {-1, 0, 1} 7 › Store current heights in each node 8 Height of an AVL Tree Height of an AVL Tree • N(h) = minimum number of nodes in an • N(h) > φh (φ ≈ 1.62) AVL tree of height h. • Suppose we have n nodes in an AVL • Basis tree of height h. › N(0) = 1, N(1) = 2 › n > N(h) (because N(h) was the minimum) h • Induction h › n > φ hence logφ n > h (relatively well › N(h) = N(h-1) + N(h-2) + 1 balanced tree!!) • Solution (recall Fibonacci analysis) › h < 1.44 log2n (i.e., Find takes O(logn)) h-2 › N(h) > φh (φ ≈ 1.62) h-1 9 10 Node Heights Node Heights after Insert 7 Tree A (AVL) Tree B (AVL) Tree B (AVL) Tree C (not AVL) 2 2 2 balance factor 3 1-(-1) = 2 6 6 6 6 1 0 1 1 1 1 1 2 4 9 4 9 4 9 4 9 0 0 0 0 0 0 0 0 0 0 1 -1 1 5 1 5 8 1 5 8 1 5 8 0 7 height of node = h height of node = h balance factor = hleft-hright balance factor = hleft-hright empty height = -1 empty height = -1 11 12 Insert and Rotation in AVL Single Rotation in an AVL Trees Tree • Insert operation may cause balance factor 2 2 to become 2 or –2 for some node 6 6 1 2 1 1 › only nodes on the path from insertion point to 4 9 4 8 root node have possibly changed in height 0 0 1 0 0 0 0 › So after the Insert, go back up to the root 1 5 8 1 5 7 9 node by node, updating heights 0 7 › If a new balance factor (the difference hleft- hright) is 2 or –2, adjust tree by rotation around the node 13 14 Insertions in AVL Trees AVL Insertion: Outside Case Let the node that needs rebalancing be α. Consider a valid h+2 AVL subtree j There are 4 cases: Outside Cases (require single rotation) : 1. Insertion into left subtree of left child of α. h h+1 2. Insertion into right subtree of right child of α. k Inside Cases (require double rotation) : h 3. Insertion into right subtree of left child of α. h Z 4. Insertion into left subtree of right child of α. The rebalancing is performed through four X Y separate rotation algorithms. 15 16 AVL Insertion: Outside Case AVL Insertion: Outside Case j Inserting into X j Do a “right rotation” destroys the AVL Becomes property at node j h + 2 k h (h+2) - h k h h+1 h Z h+1 h Z Y Y X X 17 18 Single right rotation Outside Case Completed j Do a “right rotation” “Right rotation” done! k (“Left rotation” is mirror symmetric) h k h+1 j h+1 h Z h h Y X Y Z X AVL property has been restored! 19 20 AVL Insertion: Inside Case AVL Insertion: Inside Case Consider a valid Inserting into Y Does “right rotation” j h+2 AVL subtree destroys the j restore balance? AVL property at node j Becomes h h h + 2 k h+1 k h h h+1 Z h Z X X Y Y 21 22 AVL Insertion: Inside Case AVL Insertion: Inside Case “Right rotation” Consider the structure k does not restore of subtree Y… j balance… now k is h j out of balance k h h X h+1 h h+1 Z Z X Y Y 23 24 AVL Insertion: Inside Case AVL Insertion: Inside Case Y = node i and j j We will do a left-right subtrees V and W “double rotation” . h+2 k h k h i h+1 Z i Z X h or h-1 X V W V W 25 26 Double rotation : second Double rotation : first rotation rotation left rotation complete j j Now do a right rotation i i k Z k Z W W X V X V 27 28 Double rotation : second rotation Implementation double rotation complete hight key Balance has been h+2 left right i restored h+1 Another possible implementation: do not keep the height; just the h+1 k j difference in height, i.e. the balance factor (1,0,-1). In both implementation, this has to be modified on the path of h h h or h-1 insertion even if you don’t perform rotations Once you have performed a rotation (single or double) you won’t X V W Z need to go back up the tree 29 30 Single Rotation Double Rotation RotateFromRight(n : reference node pointer) { • Implement Double Rotation in two lines. p : node pointer; p := n.right; n n.right := p.left; DoubleRotateFromRight(n : reference node pointer) { p.left := n; p ???? n n := p } } We also need to X modify the heights X or balance factors Insert Y Z of n and p Z 31 V W 32 Insertion in AVL Trees Insert in BST Insert(T : reference tree pointer, x : element) : integer { • Insert at the leaf (as for all BST) if T = null then T := new tree; T.data := x; return 1;//the links to › only nodes on the path from insertion point to //children are null case root node have possibly changed in height T.data = x : return 0; //Duplicate do nothing T.data > x : return Insert(T.left, x); › So after the Insert, go back up to the root T.data < x : return Insert(T.right, x); node by node, updating heights endcase } › If a new balance factor (the difference hleft- hright) is 2 or –2, adjust tree by rotation around the node 33 34 Example of Insertions in an Insert in AVL trees AVL Tree Insert(T : reference tree pointer, x : element) : integer{ int temp; 2 if T = null then 20 Insert 5, 40 T := new tree; T.data := x; T.height=0; return 1; 0 1 else { case T.data = x : return 0; //Duplicate do nothing 10 30 T.data > x : temp = Insert(T.left, x); 0 0 if ((height(T.left)- height(T.right)) = 2){ 25 35 if (T.left.data > x ) then //outside case T = RotatefromLeft (T); else //inside case T = DoubleRotatefromLeft (T); return temp; } T.data < x : temp = Insert(T.right, x); code similar to the left case Endcase } T.height := max(height(T.left),height(T.right)) +1; return 1; 35 36 } Example of Insertions in an Single rotation (outside case) AVL Tree 2 3 3 3 20 20 20 20 1 1 1 2 1 2 1 2 10 30 10 30 10 30 10 30 0 0 0 1 0 0 2 0 0 0 0 5 25 35 5 25 35 5 25 35 5 25 40 1 0 0 0 40 35 45 Now Insert 45 Imbalance 1 40 0 45 Now Insert 34 37 38 Double rotation (inside case) AVL Tree Deletion 3 3 • Similar but more complex than insertion 20 20 1 3 1 2 › Rotations and double rotations needed to 10 30 10 35 rebalance 0 0 2 0 1 › Imbalance may propagate upward so that 5 Imbalance 25 40 5 30 40 1 0 many rotations may be needed. 1 35 45 0 0 25 34 45 Insertion of 34 0 34 39 40 Pros and Cons of AVL Trees Double Rotation Solution Arguments for AVL trees: 1. Search is O(log N) since AVL trees are always balanced.
Recommended publications
  • Trees • Binary Trees • Newer Types of Tree Structures • M-Ary Trees
    Data Organization and Processing Hierarchical Indexing (NDBI007) David Hoksza http://siret.ms.mff.cuni.cz/hoksza 1 Outline • Background • prefix tree • graphs • … • search trees • binary trees • Newer types of tree structures • m-ary trees • Typical tree type structures • B-tree • B+-tree • B*-tree 2 Motivation • Similarly as in hashing, the motivation is to find record(s) given a query key using only a few operations • unlike hashing, trees allow to retrieve set of records with keys from given range • Tree structures use “clustering” to efficiently filter out non relevant records from the data set • Tree-based indexes practically implement the index file data organization • for each search key, an indexing structure can be maintained 3 Drawbacks of Index(-Sequential) Organization • Static nature • when inserting a record at the beginning of the file, the whole index needs to be rebuilt • an overflow handling policy can be applied → the performance degrades as the file grows • reorganization can take a lot of time, especially for large tables • not possible for online transactional processing (OLTP) scenarios (as opposed to OLAP – online analytical processing) where many insertions/deletions occur 4 Tree Indexes • Most common dynamic indexing structure for external memory • When inserting/deleting into/from the primary file, the indexing structure(s) residing in the secondary file is modified to accommodate the new key • The modification of a tree is implemented by splitting/merging nodes • Used not only in databases • NTFS directory structure
    [Show full text]
  • 1 Suffix Trees
    This material takes about 1.5 hours. 1 Suffix Trees Gusfield: Algorithms on Strings, Trees, and Sequences. Weiner 73 “Linear Pattern-matching algorithms” IEEE conference on automata and switching theory McCreight 76 “A space-economical suffix tree construction algorithm” JACM 23(2) 1976 Chen and Seifras 85 “Efficient and Elegegant Suffix tree construction” in Apos- tolico/Galil Combninatorial Algorithms on Words Another “search” structure, dedicated to strings. Basic problem: match a “pattern” (of length m) to “text” (of length n) • goal: decide if a given string (“pattern”) is a substring of the text • possibly created by concatenating short ones, eg newspaper • application in IR, also computational bio (DNA seqs) • if pattern avilable first, can build DFA, run in time linear in text • if text available first, can build suffix tree, run in time linear in pattern. • applications in computational bio. First idea: binary tree on strings. Inefficient because run over pattern many times. • fractional cascading? • realize only need one character at each node! Tries: • used to store dictionary of strings • trees with children indexed by “alphabet” • time to search equal length of query string • insertion ditto. • optimal, since even hashing requires this time to hash. • but better, because no “hash function” computed. • space an issue: – using array increases stroage cost by |Σ| – using binary tree on alphabet increases search time by log |Σ| 1 – ok for “const alphabet” – if really fussy, could use hash-table at each node. • size in worst case: sum of word lengths (so pretty much solves “dictionary” problem. But what about substrings? • Relevance to DNA searches • idea: trie of all n2 substrings • equivalent to trie of all n suffixes.
    [Show full text]
  • Lecture 26 Fall 2019 Instructors: B&S Administrative Details
    CSCI 136 Data Structures & Advanced Programming Lecture 26 Fall 2019 Instructors: B&S Administrative Details • Lab 9: Super Lexicon is online • Partners are permitted this week! • Please fill out the form by tonight at midnight • Lab 6 back 2 2 Today • Lab 9 • Efficient Binary search trees (Ch 14) • AVL Trees • Height is O(log n), so all operations are O(log n) • Red-Black Trees • Different height-balancing idea: height is O(log n) • All operations are O(log n) 3 2 Implementing the Lexicon as a trie There are several different data structures you could use to implement a lexicon— a sorted array, a linked list, a binary search tree, a hashtable, and many others. Each of these offers tradeoffs between the speed of word and prefix lookup, amount of memory required to store the data structure, the ease of writing and debugging the code, performance of add/remove, and so on. The implementation we will use is a special kind of tree called a trie (pronounced "try"), designed for just this purpose. A trie is a letter-tree that efficiently stores strings. A node in a trie represents a letter. A path through the trie traces out a sequence ofLab letters that9 represent : Lexicon a prefix or word in the lexicon. Instead of just two children as in a binary tree, each trie node has potentially 26 child pointers (one for each letter of the alphabet). Whereas searching a binary search tree eliminates half the words with a left or right turn, a search in a trie follows the child pointer for the next letter, which narrows the search• Goal: to just words Build starting a datawith that structure letter.
    [Show full text]
  • Balanced Trees Part One
    Balanced Trees Part One Balanced Trees ● Balanced search trees are among the most useful and versatile data structures. ● Many programming languages ship with a balanced tree library. ● C++: std::map / std::set ● Java: TreeMap / TreeSet ● Many advanced data structures are layered on top of balanced trees. ● We’ll see several later in the quarter! Where We're Going ● B-Trees (Today) ● A simple type of balanced tree developed for block storage. ● Red/Black Trees (Today/Thursday) ● The canonical balanced binary search tree. ● Augmented Search Trees (Thursday) ● Adding extra information to balanced trees to supercharge the data structure. Outline for Today ● BST Review ● Refresher on basic BST concepts and runtimes. ● Overview of Red/Black Trees ● What we're building toward. ● B-Trees and 2-3-4 Trees ● Simple balanced trees, in depth. ● Intuiting Red/Black Trees ● A much better feel for red/black trees. A Quick BST Review Binary Search Trees ● A binary search tree is a binary tree with 9 the following properties: 5 13 ● Each node in the BST stores a key, and 1 6 10 14 optionally, some auxiliary information. 3 7 11 15 ● The key of every node in a BST is strictly greater than all keys 2 4 8 12 to its left and strictly smaller than all keys to its right. Binary Search Trees ● The height of a binary search tree is the 9 length of the longest path from the root to a 5 13 leaf, measured in the number of edges. 1 6 10 14 ● A tree with one node has height 0.
    [Show full text]
  • Heaps a Heap Is a Complete Binary Tree. a Max-Heap Is A
    Heaps Heaps 1 A heap is a complete binary tree. A max-heap is a complete binary tree in which the value in each internal node is greater than or equal to the values in the children of that node. A min-heap is defined similarly. 97 Mapping the elements of 93 84 a heap into an array is trivial: if a node is stored at 90 79 83 81 index k, then its left child is stored at index 42 55 73 21 83 2k+1 and its right child at index 2k+2 01234567891011 97 93 84 90 79 83 81 42 55 73 21 83 CS@VT Data Structures & Algorithms ©2000-2009 McQuain Building a Heap Heaps 2 The fact that a heap is a complete binary tree allows it to be efficiently represented using a simple array. Given an array of N values, a heap containing those values can be built, in situ, by simply “sifting” each internal node down to its proper location: - start with the last 73 73 internal node * - swap the current 74 81 74 * 93 internal node with its larger child, if 79 90 93 79 90 81 necessary - then follow the swapped node down 73 * 93 - continue until all * internal nodes are 90 93 90 73 done 79 74 81 79 74 81 CS@VT Data Structures & Algorithms ©2000-2009 McQuain Heap Class Interface Heaps 3 We will consider a somewhat minimal maxheap class: public class BinaryHeap<T extends Comparable<? super T>> { private static final int DEFCAP = 10; // default array size private int size; // # elems in array private T [] elems; // array of elems public BinaryHeap() { .
    [Show full text]
  • Assignment 3: Kdtree ______Due June 4, 11:59 PM
    CS106L Handout #04 Spring 2014 May 15, 2014 Assignment 3: KDTree _________________________________________________________________________________________________________ Due June 4, 11:59 PM Over the past seven weeks, we've explored a wide array of STL container classes. You've seen the linear vector and deque, along with the associative map and set. One property common to all these containers is that they are exact. An element is either in a set or it isn't. A value either ap- pears at a particular position in a vector or it does not. For most applications, this is exactly what we want. However, in some cases we may be interested not in the question “is X in this container,” but rather “what value in the container is X most similar to?” Queries of this sort often arise in data mining, machine learning, and computational geometry. In this assignment, you will implement a special data structure called a kd-tree (short for “k-dimensional tree”) that efficiently supports this operation. At a high level, a kd-tree is a generalization of a binary search tree that stores points in k-dimen- sional space. That is, you could use a kd-tree to store a collection of points in the Cartesian plane, in three-dimensional space, etc. You could also use a kd-tree to store biometric data, for example, by representing the data as an ordered tuple, perhaps (height, weight, blood pressure, cholesterol). However, a kd-tree cannot be used to store collections of other data types, such as strings. Also note that while it's possible to build a kd-tree to hold data of any dimension, all of the data stored in a kd-tree must have the same dimension.
    [Show full text]
  • CSCI 333 Data Structures Binary Trees Binary Tree Example Full And
    Notes with the dark blue background CSCI 333 were prepared by the textbook author Data Structures Clifford A. Shaffer Chapter 5 Department of Computer Science 18, 20, 23, and 25 September 2002 Virginia Tech Copyright © 2000, 2001 Binary Trees Binary Tree Example A binary tree is made up of a finite set of Notation: Node, nodes that is either empty or consists of a children, edge, node called the root together with two parent, ancestor, binary trees, called the left and right descendant, path, subtrees, which are disjoint from each depth, height, level, other and from the root. leaf node, internal node, subtree. Full and Complete Binary Trees Full Binary Tree Theorem (1) Full binary tree: Each node is either a leaf or Theorem: The number of leaves in a non-empty internal node with exactly two non-empty children. full binary tree is one more than the number of internal nodes. Complete binary tree: If the height of the tree is d, then all leaves except possibly level d are Proof (by Mathematical Induction): completely full. The bottom level has all nodes to the left side. Base case: A full binary tree with 1 internal node must have two leaf nodes. Induction Hypothesis: Assume any full binary tree T containing n-1 internal nodes has n leaves. 1 Full Binary Tree Theorem (2) Full Binary Tree Corollary Induction Step: Given tree T with n internal Theorem: The number of null pointers in a nodes, pick internal node I with two leaf children. non-empty tree is one more than the Remove I’s children, call resulting tree T’.
    [Show full text]
  • Search Trees
    Lecture III Page 1 “Trees are the earth’s endless effort to speak to the listening heaven.” – Rabindranath Tagore, Fireflies, 1928 Alice was walking beside the White Knight in Looking Glass Land. ”You are sad.” the Knight said in an anxious tone: ”let me sing you a song to comfort you.” ”Is it very long?” Alice asked, for she had heard a good deal of poetry that day. ”It’s long.” said the Knight, ”but it’s very, very beautiful. Everybody that hears me sing it - either it brings tears to their eyes, or else -” ”Or else what?” said Alice, for the Knight had made a sudden pause. ”Or else it doesn’t, you know. The name of the song is called ’Haddocks’ Eyes.’” ”Oh, that’s the name of the song, is it?” Alice said, trying to feel interested. ”No, you don’t understand,” the Knight said, looking a little vexed. ”That’s what the name is called. The name really is ’The Aged, Aged Man.’” ”Then I ought to have said ’That’s what the song is called’?” Alice corrected herself. ”No you oughtn’t: that’s another thing. The song is called ’Ways and Means’ but that’s only what it’s called, you know!” ”Well, what is the song then?” said Alice, who was by this time completely bewildered. ”I was coming to that,” the Knight said. ”The song really is ’A-sitting On a Gate’: and the tune’s my own invention.” So saying, he stopped his horse and let the reins fall on its neck: then slowly beating time with one hand, and with a faint smile lighting up his gentle, foolish face, he began..
    [Show full text]
  • Splay Trees Last Changed: January 28, 2017
    15-451/651: Design & Analysis of Algorithms January 26, 2017 Lecture #4: Splay Trees last changed: January 28, 2017 In today's lecture, we will discuss: • binary search trees in general • definition of splay trees • analysis of splay trees The analysis of splay trees uses the potential function approach we discussed in the previous lecture. It seems to be required. 1 Binary Search Trees These lecture notes assume that you have seen binary search trees (BSTs) before. They do not contain much expository or backtround material on the basics of BSTs. Binary search trees is a class of data structures where: 1. Each node stores a piece of data 2. Each node has two pointers to two other binary search trees 3. The overall structure of the pointers is a tree (there's a root, it's acyclic, and every node is reachable from the root.) Binary search trees are a way to store and update a set of items, where there is an ordering on the items. I know this is rather vague. But there is not a precise way to define the gamut of applications of search trees. In general, there are two classes of applications. Those where each item has a key value from a totally ordered universe, and those where the tree is used as an efficient way to represent an ordered list of items. Some applications of binary search trees: • Storing a set of names, and being able to lookup based on a prefix of the name. (Used in internet routers.) • Storing a path in a graph, and being able to reverse any subsection of the path in O(log n) time.
    [Show full text]
  • Tree Structures
    Tree Structures Definitions: o A tree is a connected acyclic graph. o A disconnected acyclic graph is called a forest o A tree is a connected digraph with these properties: . There is exactly one node (Root) with in-degree=0 . All other nodes have in-degree=1 . A leaf is a node with out-degree=0 . There is exactly one path from the root to any leaf o The degree of a tree is the maximum out-degree of the nodes in the tree. o If (X,Y) is a path: X is an ancestor of Y, and Y is a descendant of X. Root X Y CSci 1112 – Algorithms and Data Structures, A. Bellaachia Page 1 Level of a node: Level 0 or 1 1 or 2 2 or 3 3 or 4 Height or depth: o The depth of a node is the number of edges from the root to the node. o The root node has depth zero o The height of a node is the number of edges from the node to the deepest leaf. o The height of a tree is a height of the root. o The height of the root is the height of the tree o Leaf nodes have height zero o A tree with only a single node (hence both a root and leaf) has depth and height zero. o An empty tree (tree with no nodes) has depth and height −1. o It is the maximum level of any node in the tree. CSci 1112 – Algorithms and Data Structures, A.
    [Show full text]
  • Comparison of Dictionary Data Structures
    A Comparison of Dictionary Implementations Mark P Neyer April 10, 2009 1 Introduction A common problem in computer science is the representation of a mapping between two sets. A mapping f : A ! B is a function taking as input a member a 2 A, and returning b, an element of B. A mapping is also sometimes referred to as a dictionary, because dictionaries map words to their definitions. Knuth [?] explores the map / dictionary problem in Volume 3, Chapter 6 of his book The Art of Computer Programming. He calls it the problem of 'searching,' and presents several solutions. This paper explores implementations of several different solutions to the map / dictionary problem: hash tables, Red-Black Trees, AVL Trees, and Skip Lists. This paper is inspired by the author's experience in industry, where a dictionary structure was often needed, but the natural C# hash table-implemented dictionary was taking up too much space in memory. The goal of this paper is to determine what data structure gives the best performance, in terms of both memory and processing time. AVL and Red-Black Trees were chosen because Pfaff [?] has shown that they are the ideal balanced trees to use. Pfaff did not compare hash tables, however. Also considered for this project were Splay Trees [?]. 2 Background 2.1 The Dictionary Problem A dictionary is a mapping between two sets of items, K, and V . It must support the following operations: 1. Insert an item v for a given key k. If key k already exists in the dictionary, its item is updated to be v.
    [Show full text]
  • Binary Search Tree
    ADT Binary Search Tree! Ellen Walker! CPSC 201 Data Structures! Hiram College! Binary Search Tree! •" Value-based storage of information! –" Data is stored in order! –" Data can be retrieved by value efficiently! •" Is a binary tree! –" Everything in left subtree is < root! –" Everything in right subtree is >root! –" Both left and right subtrees are also BST#s! Operations on BST! •" Some can be inherited from binary tree! –" Constructor (for empty tree)! –" Inorder, Preorder, and Postorder traversal! •" Some must be defined ! –" Insert item! –" Delete item! –" Retrieve item! The Node<E> Class! •" Just as for a linked list, a node consists of a data part and links to successor nodes! •" The data part is a reference to type E! •" A binary tree node must have links to both its left and right subtrees! The BinaryTree<E> Class! The BinaryTree<E> Class (continued)! Overview of a Binary Search Tree! •" Binary search tree definition! –" A set of nodes T is a binary search tree if either of the following is true! •" T is empty! •" Its root has two subtrees such that each is a binary search tree and the value in the root is greater than all values of the left subtree but less than all values in the right subtree! Overview of a Binary Search Tree (continued)! Searching a Binary Tree! Class TreeSet and Interface Search Tree! BinarySearchTree Class! BST Algorithms! •" Search! •" Insert! •" Delete! •" Print values in order! –" We already know this, it#s inorder traversal! –" That#s why it#s called “in order”! Searching the Binary Tree! •" If the tree is
    [Show full text]