Balanced Search Trees Made Simple
Arne Andersson
Department of Computer Science, Lund University, Lund, Sweden

Abstract

As a contribution to the recent debate on simple implementations of dictionaries, we present new maintenance algorithms for balanced trees. In terms of code simplicity, our algorithms compare favourably with those for deterministic and probabilistic skip lists.

Introduction

It is well known that there is a huge gap between theory and practice in computer programming. While companies producing computers or cars are anxious to use the best technology available (at least they try to convince their customers that they do so), the philosophy is often different in software engineering: it is enough to produce programs that work. Efficient algorithms for sorting and searching, which are taught in introductory courses, are often replaced by poor methods, such as bubble sorting and linked lists. If your program turns out to require too much time or space, you advise your customer to buy a new, heavier and faster computer.

This situation strongly motivates an extensive search for simple and short-coded solutions to common computational tasks, for instance worst-case efficient maintenance of dictionaries. Due to its fundamental character, this problem is one of the most well-studied in algorithm design. Until recently, all solutions have been based on balanced search trees, such as AVL-trees, symmetric binary B-trees (also denoted red-black trees), SBB(k)-trees, weight-balanced trees, half-balanced trees, and k-neighbour trees. However, none of them has become the structure of choice among programmers, as they are all cumbersome to implement. To cite Munro, Papadakis and Sedgewick: "the traditional source code for a balanced search tree contains numerous cases involving single and double rotations to the left and the right".

In recent years, the search for simpler dictionary algorithms has taken a new and quite successful direction by the introduction of some non-tree structures. The
first one, the skip list, introduced by Pugh, is a simple and elegant randomized data structure. A worst-case efficient variant, the deterministic skip list, was recently introduced by Munro, Papadakis and Sedgewick. The important feature of the deterministic skip list is its code simplicity. As pointed out by Munro et al., the source code for insertion into a deterministic skip list is simpler (or at least shorter) than any previously presented code for a balanced search tree. In addition, the authors claimed that the code for deletion would be simple, although it was not presented.

Footnote: Arne.Andersson@dna.lth.se

In this article we demonstrate that the binary B-tree, introduced by Bayer, may be maintained by very simple algorithms. The simplicity is achieved from three observations:

1. The large number of cases occurring in traditional balancing methods may be replaced by two simple operations, called Skew and Split. Making a Skew followed by a Split at each node on the traversed path is enough to maintain balance during updates.
2. The representation of balance information as one bit per node creates a significant amount of bookkeeping. One short integer (O(log log n) bits) in each node makes the algorithms simpler.
3. The deletion of an internal node from a binary search tree has always been cumbersome, even without balancing. We show how to simplify the deletion algorithm by the use of two global pointers.

As a matter of fact, the coding of our algorithms is even shorter than the code for skip lists, both probabilistic and deterministic. In our opinion, it is also simpler and clearer. Hence, the binary search tree may compete very well with skip lists in terms of simplicity.

We first present the new maintenance algorithms for binary B-trees and then discuss their implementation. After that, we compare binary B-trees with deterministic skip lists and make a brief comparison with the probabilistic skip list. Finally, we summarize our results.

Simple Algorithms

The binary B-tree, or BB-tree, was introduced by Bayer as a binary representation of 2-3 trees. Using the terminology from the literature, we say that a node in a 2-3 tree is represented by a pseudo-node containing one or two binary nodes. Edges inside pseudo-nodes are horizontal and edges between pseudo-nodes are vertical. Only right-edges are allowed to be horizontal.

In order to maintain balance during updates, we have to store balance information in the nodes. One bit, telling whether the incoming edge (or the outgoing right-edge) is horizontal or not, would be enough. However, in our implementation we chose to store an integer level in each node, corresponding to the vertical height of the node. Nodes at the bottom of the tree are on level 1.

Briefly, all representations of B-trees, including binary B-trees, red-black trees, and deterministic skip lists, are maintained by two basic operations: joining and splitting of B-tree nodes. In a binary tree representation, this is performed by rotations and change of balance information. The reason why the algorithms become complicated is that a pseudo-node may take many different shapes, causing many special cases. For example, adding a new horizontal edge to a pseudo-node of one or two binary nodes may result in five different shapes. These possibilities give rise to a large number of different cases, involving more or less complicated restructuring operations. Fortunately, there is a simple rule of thumb that can be applied to reduce the number of cases:

Make sure that only right-edges are horizontal before you check the size of the pseudo-node.

Using this rule, the five possible shapes above reduce to two, namely a right-going chain of two binary nodes and a right-going chain of three, and only the last one will require splitting.

In order to apply our rule of thumb in a simple manner, we define two basic restructuring operations (p is a binary node):

Skew(p): Eliminate horizontal left-edges below p. This is performed by following the right path from p, making a right rotation whenever a horizontal left-edge is found.

Split(p): If the
pseudo-node rooted at p is too large, split it by increasing the level of every second node. This is performed by following the right path from p, making left rotations.

These operations are simple to implement as short and elegant procedures. They also allow a conceptually simple description of the maintenance algorithms.

Insertion
1. Add a new node at level 1.
2. Follow the path from the new node to the root. At each binary node p perform the following:
   (a) Skew(p)
   (b) Split(p)

Deletion
1. Remove a node from level 1 (the problem of removing internal nodes is discussed in the section "Simple implementation").
2. Follow the path from the removed node to the root. At each binary node p perform the following:
   (a) If a pseudo-node is missing below p, i.e. if one of p's children is two levels below p, decrease the level of p by one. If p's right-child belonged to the same pseudo-node as p, we decrease the level of that node too.
   (b) Skew(p)
   (c) Split(p)

The algorithms are illustrated in Figures 1 and 2.

[Fig. 1. Example of insertion into a BB-tree. The levels are separated by horizontal lines. Panels (a) to (e), with steps labelled Insert, Skew and Split.]

[Fig. 2. Example of deletion. Panels (a) to (e), with steps labelled Delete, Decrease level, Skew and Split.]

Simple implementation

Below we give the complete code for declarations and maintenance of a BB-tree. We use the well-known technique of having a sentinel at the bottom of the tree. In this way, we do not have to consider the existence of a node before examining its level. Each time we try to find the level of a node outside the tree, we examine the level of the sentinel, which is initialized to zero.

The declaration and initialization are straightforward. As a sentinel
we use the global variable bottom, which has to be initialized. Note that more than one tree can share the sentinel. A pointer variable is initialized as an empty tree simply by making it point to the sentinel. In our implementation we use the following code for declarations of data types and global variables and for initialization:

    type data = ...;                  { key type, left unspecified }
         Tree = ^node;
         node = record
                  left, right: Tree;
                  level: integer;
                  key: data;
                end;

    var bottom, deleted, last: Tree;

    procedure InitGlobalVariables;
    begin
      new(bottom);
      bottom^.level := 0;             { the sentinel is on level zero }
      bottom^.left := bottom;         { both children point back to the sentinel }
      bottom^.right := bottom;
      deleted := bottom;
    end;

The restructuring operations Skew and Split may be coded in several ways: they may be in-line coded into the insertion and deletion procedures, or they may be coded as separate procedures traversing a pseudo-node, making rotations whenever needed. The code given here is a third possibility, where Skew and Split are coded as procedures operating on a single binary node. This is enough for the restructuring required during insertion, but during deletion we need three calls of Skew and two calls of Split. The fact that these calls are sufficient is not hard to show; we leave the details as an exercise.

In order to handle deletion of internal nodes without a lot of code, we use two global pointers, deleted and last. These pointers are set during the top-down traversal in the following simple manner: at each node we make a binary comparison; if the key to be deleted is less than the node's value we turn left, otherwise we turn right (i.e. even if the searched element is present