4.1 Motivation and Background
Advanced Algorithms
Lecture, September
Lecturer: David Karger
Scribes: Jaime Teevan, Arvind Sankar, Jason Rennie

Splay Trees

With splay trees we leave heaps behind and move on to trees. Trees support operations that heaps typically do not, such as find element, find predecessor, find successor, and print nodes in order. Binary trees are a common and basic form of tree. As long as a binary tree is balanced, insert and delete take logarithmic time. The goal of a splay tree is to maintain logarithmic depth in an amortized sense by adjusting its structure with each access. This creates a powerful data structure that can be proven to be, in the limit and up to a constant factor, as good as or better than any static tree optimized for a particular sequence of accesses.

Previous Work

After binary search trees were first proposed, a number of variants were developed to improve on their poor worst-case behavior. These include AVL trees, 2-3 trees, red-black trees, and many more. Each improved the performance of a simple binary search tree but left something to be desired. Most require augmentation of the simple tree data structure, and none can claim theoretical performance as good as that of splay trees.

The two basic elements of splay trees, self-adjustment and rotation, are hardly new. Many variants on binary trees use some form of rotation, and self-adjusting linked lists and heaps had been introduced before splay trees. Splay trees were basically the right orchestration of ideas that were known for some time. While variants on binary search trees often aggressively pursue balance, splay trees deal with the problem lazily, doing a little bit of work with each tree operation but making little obvious effort to maintain a nicely balanced tree.

Intuition

This laziness is instantiated in the splay tree's self-modification with every access. It would be nice if we could show that the work done for accessing an element is at most $O(\log n)$. This isn't possible, so we do the next best thing: we show that any additional work that we do beyond $O(\log n)$ can be accounted for by our past laziness. In other words, if one operation does more than $O(\log n)$ work, we want to show that there were a lot of operations before it where we did less work than we were allowed, and thus we can amortize away the work beyond $O(\log n)$ by spreading it around to previously performed operations. As we all know by now, this technique is called amortization.

If the path to find a node is long, the tree is poorly balanced. Thus, as you step from a node to the next node on the path, most of the subtree lies below that next node. A double rotation distributes some of the weight below the queried node over its siblings. The total amount of imbalance, which is measured by the sum of ranks, is a potential function against which we charge the cost of long searches.

Tree Rotation

Tree rotation can be thought of as a way to move a node to a higher position in a binary search tree without affecting the ordering properties. The simplest rotations are called single rotations; they involve two nodes and their corresponding subtrees. The zig rotation figure displays such a simple rotation. When read from left to right, the rotation brings node $x$ to the top of the tree.

Figure: A zig rotation. Before: $y$ is the root with left child $x$ and right subtree $C$; $x$ has subtrees $A$ and $B$. After: $x$ is the root with left subtree $A$ and right child $y$; $y$ has subtrees $B$ and $C$.

Double rotations

A slightly more complicated rotation is a double rotation. Here three nodes and their subtrees are involved; a double rotation essentially performs two single rotations in sequence. The zig-zig rotation figure depicts one kind of double rotation. When following the arrow from left to right, the node $x$ is brought up two levels to the top of the subtree. The zig-zag rotation figure depicts the other kind. Any weight below $x$ is spread more evenly as $x$ is brought to the root of the subtree.

Figure: A zig-zig rotation. Before: $z$ is the root with left child $y$ and right subtree $D$; $y$ has left child $x$ and right subtree $C$; $x$ has subtrees $A$ and $B$. After: $x$ is the root with left subtree $A$ and right child $y$; $y$ has left subtree $B$ and right child $z$; $z$ has subtrees $C$ and $D$.

The double rotation is one of the key elements that make splay trees the success that they are. While
single rotations can move a queried element to the top of a tree without affecting key ordering, only double rotations yield the balancing properties that give splay trees static optimality and $O(\log n)$ operations.

Figure: A zig-zag rotation. Before: $z$ is the root with left child $y$ and right subtree $D$; $y$ has left subtree $A$ and right child $x$; $x$ has subtrees $B$ and $C$. After: $x$ is the root with left child $y$ and right child $z$; $y$ has subtrees $A$ and $B$; $z$ has subtrees $C$ and $D$.

Running time

Splay trees run all basic operations in $O(\log n)$ amortized time. We can prove this through manipulation of weights and a potential function. We define $w(x)$ to be the weight of node $x$. Then

$$s(x) = \sum_{y \in \mathrm{descendants}(x)} w(y)$$

$$r(x) = \log_2 s(x)$$

$$\Phi(DS) = \sum_{x \in DS} r(x)$$

$s(x)$ is the sum of the weights of all descendants of $x$, including the weight of $x$ itself. $r(x)$ is called the rank of $x$. $\Phi$ is the potential function that we use for proving properties about splay trees. $DS$ represents the splay tree data structure. We can use this notation as a basis for proving the theorem that underlies the $O(\log n)$ running time of splay trees.

Theorem (Access Theorem). The amortized time to access $x$ from root $t$ is at most $3(r(t) - r(x)) + 1$.

Proof. Using the simple lemma that we prove below, we will show that the amortized cost of a double rotation is $\le 3(r(t) - r(x))$ and the amortized cost of a single rotation is $\le 3(r(t) - r(x)) + 1$. We will show that a sequence of rotations yields a telescoping sum that results in the bound described in the Access Theorem.

Lemma. Given $b$ as a root with two children $a$ and $c$, $r(a) + r(c) \le 2r(b) - 2$.

Proof. Consider the equivalent inequality among the sizes of the three nodes:

$$4\,s(a)\,s(c) \le s(b)^2$$

We know that $s(b) = s(a) + s(c) + w(b)$ by definition. Since the weights are nonnegative, we obtain $s(b) \ge s(a) + s(c)$. Hence

$$s(b)^2 - 4\,s(a)\,s(c) \ge (s(a) + s(c))^2 - 4\,s(a)\,s(c)$$

or

$$s(b)^2 - 4\,s(a)\,s(c) \ge (s(a) - s(c))^2$$

Since squares are always nonnegative, we get the desired inequality

$$s(b)^2 - 4\,s(a)\,s(c) \ge 0$$

Consider the amortized cost of performing one rotation. Let's say that $z$ is the root of a subtree and that node $x$ is either a child or a grandchild of $z$. If we can show that the amortized cost of any double rotation is $\le 3(r(z) - r(x))$ and the amortized cost of any single rotation is $\le 3(r(z) - r(x)) + 1$, then the amortized cost of
splaying an element $x_0$ to the root $x_k$ is

$$C \le 3(r(x_k) - r(x_{k-1})) + 3(r(x_{k-1}) - r(x_{k-2})) + \cdots + 3(r(x_1) - r(x_0)) + 1 = 3(r(x_k) - r(x_0)) + 1$$

where the $x_i$ ($i \ge 1$) are the roots of the subtrees that are rotated. The single $+1$ appears because at most one single rotation is needed, as the last step.

Now we will show that the amortized time for each double rotation involved in the splaying of node $x$ is at most $3(r'(x) - r(x))$ and the time for a single rotation is at most $3(r'(x) - r(x)) + 1$, where $r'$ denotes the rank of a node after the rotation. The amortized cost of a rotation is its actual cost (1 for a single rotation, 2 for a double rotation) plus the change in potential. There are three cases to be considered.

ZIG. This is the rotation shown in the zig rotation figure. The nodes that change ranks are $x$ and $y$. So the amortized cost is

$$1 + r'(x) + r'(y) - r(x) - r(y)$$

But $r'(x) = r(y)$ and $r'(y) \le r'(x)$. So the cost is at most

$$1 + r'(x) - r(x) \le 1 + 3(r'(x) - r(x))$$

ZIG-ZAG. This is the double rotation shown in the zig-zag rotation figure. Now three nodes change rank, namely $x$, $y$ and $z$. The amortized cost is

$$2 + r'(x) + r'(y) + r'(z) - r(x) - r(y) - r(z)$$

We note that $r'(x) = r(z)$ and $r(y) \ge r(x)$, so that the cost is at most

$$2 + r'(y) + r'(z) - 2r(x)$$

which we rewrite as

$$2(r'(x) - r(x)) + 2 + r'(y) + r'(z) - 2r'(x)$$

which, by the lemma applied to $x$ with its new children $y$ and $z$, is at most

$$2(r'(x) - r(x)) \le 3(r'(x) - r(x))$$

ZIG-ZIG. The other kind of double rotation is shown in the zig-zig rotation figure. Again three nodes, $x$, $y$ and $z$, change ranks. The amortized cost is therefore

$$2 + r'(x) + r'(y) + r'(z) - r(x) - r(y) - r(z)$$

We have $r'(x) = r(z)$, so the cost simplifies to

$$2 + r'(y) + r'(z) - r(x) - r(y)$$

Since $r(x) \le r(y)$ and $r'(y) \le r'(x)$, this expression is at most

$$2 + r'(x) + r'(z) - 2r(x)$$

We want this to be less than or equal to $3(r'(x) - r(x))$, so we need to show that

$$r(x) + r'(z) - 2r'(x) \le -2$$

This looks somewhat like the inequality we proved in the lemma, so we try to see if that can be applied. The proof depended only on the fact that the sum of the sizes of the two subtrees was at most the size of the entire tree. Clearly that remains true:

$$s(x) + s'(z) \le s'(x)$$

So we can still apply the lemma and get the desired inequality.

Now we can sum over all the rotations that are needed to splay $x$ to the root. The sum telescopes and we get the desired bound. $r(t) = \log s(t)$ and $r(x) = \log s(x)$, so

$$3(r(t) - r(x)) + 1 = O(r(t) - r(x)) = O(\log s(t) - \log s(x)) = O\!\left(\log \frac{s(t)}{s(x)}\right)$$

In the future we will denote $s(t)$ as
$W$, which is the sum of all of the weights in the tree rooted at $t$. We also use $w_x$ to denote the weight of a single node $x$. Note that $w_x \le s(x)$. As a result, the amortized time to splay a node $x$ to the root $t$ is $O\!\left(\log \frac{W}{w_x}\right)$.

Now we will show that the basic tree operations insert and delete take $O(\log n)$ amortized time. We will do this by defining two operations, split and join, that can be used to easily implement insert and delete.

Join($T_1$, $x$, $T_2$) is a function of trees $T_1$ and $T_2$ and element $x$, where $t_1 < x < t_2$ $\forall t_1 \in T_1, t_2 \in T_2$. Join($T_1$, $x$, $T_2$) returns a tree where $x$ is the root with $T_1$ as its left subtree and $T_2$ as its right subtree. Join($T_1$, $x$, $T_2$) has amortized cost $O(\log n)$. Note that we can also join two trees $T_1$ and $T_2$ directly: they can be joined by splaying the rightmost element of $T_1$ and making $T_2$ its right subtree. This has amortized cost $O(\log n)$.

Split($T$, $x$) is a function of a tree $T$ and an element $x$. Split($T$, $x$) returns two trees $T_1 \le x < T_2$. Note that $x$ simply defines a boundary between the two trees; $x$ is not necessarily contained in $T_1$. Split($T$, $x$) is implemented by splaying the greatest element less than or equal to $x$ to the root. We remove the right subtree and call it $T_2$. The remaining tree is named $T_1$.

We are now prepared to implement insert and delete with $O(\log n)$ amortized cost.

Delete($T$, $x$) is a function of a tree $T$ and an element $x$. We perform Split($T$, $x$) to yield two subtrees $T_1$ and $T_2$. Note that since $x$ was an element of $T$, it must