Skip Lists: a Probabilistic Alternative to Balanced Trees

ARTICLES
Algorithms and Data Structures
Jeffrey Vitter, Editor

Skip lists are data structures that use probabilistic balancing rather than strictly enforced balancing. As a result, the algorithms for insertion and deletion in skip lists are much simpler and significantly faster than equivalent algorithms for balanced trees.

William Pugh

Binary trees can be used for representing abstract data types such as dictionaries and ordered lists. They work well when the elements are inserted in a random order. Some sequences of operations, such as inserting the elements in order, produce degenerate data structures that perform very poorly. If it were possible to randomly permute the list of items to be inserted, trees would work well with high probability for any input sequence. In most cases queries must be answered on-line, so randomly permuting the input is impractical. Balanced tree algorithms rearrange the tree as operations are performed to maintain certain balance conditions and assure good performance.

Skip lists are a probabilistic alternative to balanced trees. Skip lists are balanced by consulting a random number generator. Although skip lists have bad worst-case performance, no input sequence consistently produces the worst-case performance (much like quicksort when the pivot element is chosen randomly). It is very unlikely a skip list data structure will be significantly unbalanced (e.g., for a dictionary of more than 250 elements, the chance that a search will take more than three times the expected time is less than one in a million). Skip lists have balance properties similar to those of search trees built by random insertions, yet do not require insertions to be random.

It is easier to balance a data structure probabilistically than to explicitly maintain the balance. For many applications, skip lists are a more natural representation than trees, and they lead to simpler algorithms. The simplicity of skip list algorithms makes them easier to implement and provides significant constant factor speed improvements over balanced tree and self-adjusting tree algorithms. Skip lists are also very space efficient. They can easily be configured to require an average of 1 1/3 pointers per element (or even less) and do not require balance or priority information to be stored with each node.

SKIP LISTS

We might need to examine every node of the list when searching a linked list (Figure 1a). If the list is stored in sorted order and every other node of the list also has a pointer to the node two ahead of it in the list (Figure 1b), we have to examine no more than ⌈n/2⌉ + 1 nodes (where n is the length of the list). Also giving every fourth node a pointer four ahead (Figure 1c) requires that no more than ⌈n/4⌉ + 2 nodes be examined. If every (2^i)th node has a pointer 2^i nodes ahead (Figure 1d), the number of nodes that must be examined can be reduced to ⌈log₂ n⌉ while only doubling the number of pointers. This data structure could be used for fast searching, but insertion and deletion would be impractical.

A node that has k forward pointers is called a level k node. If every (2^i)th node has a pointer 2^i nodes ahead, then levels of nodes are distributed in a simple pattern: 50 percent are level 1, 25 percent are level 2, 12.5 percent are level 3, and so on. What would happen if the levels of nodes were chosen randomly, but in the same proportions (e.g., as in Figure 1e)? A node's ith forward pointer, instead of pointing 2^(i-1) nodes ahead, points to the next node of level i or higher. Insertions or deletions would require only local modifications; the level of a node, chosen randomly when the node is inserted, need never change. Some arrangements of levels would give poor execution times, but we will see that such arrangements are rare. Because these data structures are linked lists with extra pointers that skip over intermediate nodes, I named them skip lists.

This work was partially supported by an AT&T Bell Labs Fellowship and by NSF grant CCR-8908900.
© 1990 ACM 0001-0782/90/0600-0668 $1.50
Communications of the ACM, June 1990, Volume 33, Number 6, page 668

FIGURE 1. Linked Lists with Additional Pointers

SKIP LIST ALGORITHMS

This section gives algorithms to search for, insert, and delete elements in a dictionary or symbol table. The Search operation returns the contents of the value associated with the desired key or failure if the key is not present. The Insert operation associates a specified key with a new value (inserting the key if it had not already been present). The Delete operation deletes the specified key. It is easy to support additional operations such as "find the minimum key" or "find the next key."

Each element is represented by a node, the level of which is chosen randomly when the node is inserted without regard for the number of elements in the data structure. A level i node has i forward pointers, indexed 1 through i. We do not need to store the level of a node in the node. Levels are capped at some appropriate constant MaxLevel. The level of a list is the maximum level currently in the list (or 1 if the list is empty). The header of a list has forward pointers at levels one through MaxLevel. The forward pointers of the header at levels higher than the current maximum level of the list point to NIL.

Initialization

An element NIL is allocated and given a key greater than any legal key. All levels of all skip lists are terminated with NIL. A new list is initialized so that the level of the list is equal to 1 and all forward pointers of the list's header point to NIL.

Search Algorithm

We search for an element by traversing forward pointers that do not overshoot the node containing the element being searched for (Figure 2). When no more progress can be made at the current level of forward pointers, the search moves down to the next level. When we can make no more progress at level 1, we must be immediately in front of the node that contains the desired element (if it is in the list).

    Search(list, searchKey)
        x := list→header
        -- loop invariant: x→key < searchKey
        for i := list→level downto 1 do
            while x→forward[i]→key < searchKey do
                x := x→forward[i]
        -- x→key < searchKey ≤ x→forward[1]→key
        x := x→forward[1]
        if x→key = searchKey then return x→value
        else return failure

FIGURE 2. Skip List Search Algorithm

Insertion and Deletion Algorithms

To insert or delete a node, we simply search and splice, as shown in Figure 3. Figure 4 gives algorithms for insertion and deletion. A vector update is maintained so that when the search is complete (and we are ready to perform the splice), update[i] contains a pointer to the rightmost node of level i or higher that is to the left of the location of the insertion/deletion.

If an insertion generates a node with a level greater than the previous maximum level of the list, we update the maximum level of the list and initialize the appropriate portions of the update vector. After each deletion, we check to see if we have deleted the maximum element of the list and if so, decrease the maximum level of the list.

FIGURE 3. Pictorial Description of Steps Involved in Performing an Insertion (panels: search path and update[i]→forward[i]; original list, 17 to be inserted; list after insertion, updated pointers in grey)

    Insert(list, searchKey, newValue)
        local update[1..MaxLevel]
        x := list→header
        for i := list→level downto 1 do
            while x→forward[i]→key < searchKey do
                x := x→forward[i]
            -- x→key < searchKey ≤ x→forward[i]→key
            update[i] := x
        x := x→forward[1]
        if x→key = searchKey then x→value := newValue
        else
            newLevel := randomLevel()
            if newLevel > list→level then
                for i := list→level + 1 to newLevel do
                    update[i] := list→header
                list→level := newLevel
            x := makeNode(newLevel, searchKey, newValue)
            for i := 1 to newLevel do
                x→forward[i] := update[i]→forward[i]
                update[i]→forward[i] := x

    Delete(list, searchKey)
        local update[1..MaxLevel]
        x := list→header
        for i := list→level downto 1 do
            while x→forward[i]→key < searchKey do
                x := x→forward[i]

Choosing a Random Level

Initially, we discussed a probability distribution where half of the nodes that have level i pointers also have level i + 1 pointers. To get away from magic constants, we say that a fraction p of the nodes with level i pointers also have level i + 1 pointers (p = 1/2 for our original discussion). Levels are generated randomly by an algorithm equivalent to the one in Figure 5. Levels are generated without reference to the number of elements in the list.

    randomLevel()
        newLevel := 1
        -- random() returns a random value in [0...1)
        while random() < p do
            newLevel := newLevel + 1
        return min(newLevel, MaxLevel)

FIGURE 5. Algorithm to Calculate a Random Level

At What Level do We Start a Search? Defining L(n)

In a skip list of 16 elements generated with p = 1/2, we might happen to have 9 elements of level 1; 3 elements of level 2; 3 elements of level 3; and 1 element of level 14 (this would be very unlikely, but it could happen).
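
As an illustration of the search, insertion, and random-level procedures described in this article, here is a minimal Python sketch. It is not the author's implementation. The names SkipList, Node, MAX_LEVEL, and P are assumptions chosen for this example, and so is the cap of 16 levels. The sketch uses 0-based pointer indices where the article uses 1-based ones, it uses None in place of the NIL sentinel with a maximal key, and search returns None to signal failure.

```python
import random

MAX_LEVEL = 16  # assumed value for the article's MaxLevel constant
P = 0.5         # fraction of level-i nodes that also have level i+1 pointers


class Node:
    def __init__(self, level, key, value):
        self.key = key
        self.value = value
        # forward[i] is the pointer the article calls forward[i + 1]
        self.forward = [None] * level


class SkipList:
    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.level = 1  # level of an empty list is 1
        self.header = Node(MAX_LEVEL, None, None)

    def random_level(self):
        # Follows the article's randomLevel(): count successes, then cap.
        level = 1
        while self.rng.random() < P:
            level += 1
        return min(level, MAX_LEVEL)

    def search(self, key):
        x = self.header
        for i in range(self.level - 1, -1, -1):
            # Move right while the next node does not overshoot the key.
            while x.forward[i] is not None and x.forward[i].key < key:
                x = x.forward[i]
        x = x.forward[0]
        if x is not None and x.key == key:
            return x.value
        return None  # failure

    def insert(self, key, value):
        # update[i] is the rightmost node at level i left of the insert point;
        # levels above the current list level default to the header.
        update = [self.header] * MAX_LEVEL
        x = self.header
        for i in range(self.level - 1, -1, -1):
            while x.forward[i] is not None and x.forward[i].key < key:
                x = x.forward[i]
            update[i] = x
        x = x.forward[0]
        if x is not None and x.key == key:
            x.value = value  # key already present: replace its value
            return
        new_level = self.random_level()
        if new_level > self.level:
            self.level = new_level
        node = Node(new_level, key, value)
        for i in range(new_level):
            # Splice the new node in after update[i] at each of its levels.
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node
```

Passing a seeded generator, for example SkipList(random.Random(1)), makes the chosen levels reproducible from run to run, which helps when inspecting the structure by hand.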
