Chapter 43

Deterministic Skip Lists*

J. Ian Munro†    Thomas Papadakis†    Robert Sedgewick‡

* This research was supported in part by the Natural Sciences and Engineering Research Council of Canada under grant No. A-8237, the Information Technology Research Centre of Ontario, and the National Science Foundation under Grant No. CCR-8918152.
† Department of Computer Science, University of Waterloo, Waterloo, Ont. N2L 3G1, Canada
‡ Department of Computer Science, Princeton University, Princeton, NJ 08544, USA

Abstract

We explore techniques based on the notion of a skip list to guarantee logarithmic search, insert and delete costs. The basic idea is to insist that between any pair of elements above a given height are a small number of elements of precisely that height. The desired behaviour can be achieved either by using some extra space for pointers, or by adding the constraint that the physical sizes of the nodes be exponentially increasing. The first approach leads to simpler code, whereas the second is ideally suited to a buddy system of memory allocation. Our techniques are competitive in terms of time and space with balanced tree schemes, and, we feel, inherently simpler when taken from first principles.

1 Introduction

We address the classic problem of designing a data structure to support the operations of search, insert, delete and find-closest value in logarithmic time in the worst case. There are several well-known solutions, the AVL tree [1], the B-tree [2] and its special case the 2-3 tree, the red-black tree [4], etc., all of which are balanced search trees, but which are rarely used for main memory applications in (non-research) computational practice. Evidently, they seem too hard or complicated to implement for the "root mean square programmer". Our general question then becomes "how hard is it to understand and implement such a data structure?" or "how many neurons must fire in the head of the programmer to solve the dictionary problem in logarithmic time?" Intriguing as a lower bound for the latter form may be, we restrict ourselves to what we feel is a new upper bound.

We start with the skip list, which was recently introduced by Pugh [8] as an alternative to search trees. It is a very simple data structure; Lewis and Denenberg, for example, in their introductory data structures textbook [5] introduce the skip list as a lead-in to the binary search tree. Unlike other data structures, the skip list is probabilistic in nature; that is, given a set of values to be stored in a skip list and the insertion sequence of these values, the shape of the skip list that will be formed is not determined, but depends on the outcomes of coin flips. It therefore makes sense to talk about the average cost of the search for the mth element in a skip list of n elements (where this average is taken over all outcomes of coin flips) and about the average cost of the search for the average element in a skip list of n elements (where this average is taken over all outcomes of coin flips and over all search keys). These costs are [8, 7] $\log_{1/p} n + \frac{1-p}{p}\log_{1/p} m + \Theta(1)$ and $\frac{1}{p}\log_{1/p} n + \Theta(1)$ respectively, where p is the probability that the coin used to build the skip list decides to increase the height of the new node to be inserted. Similar results hold for the average insert and delete costs.
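To make the role of the coin flips concrete, the following minimal sketch (our illustration, not code from the paper; the node layout, the height cap MAX_HEIGHT and the function names are chosen for exposition) shows how an ordinary probabilistic skip list picks the height of a new node: the height is 1 plus the number of consecutive heads obtained from a coin that comes up heads with probability p.

    #include <stdlib.h>

    #define MAX_HEIGHT 32          /* cap on node height (skip lists are usually capped) */

    typedef struct node {
        int          key;
        struct node *next[1];      /* forward pointers, one per level; the node is   */
                                   /* allocated with as many slots as its height     */
    } node;

    /* Flip a coin that comes up heads with probability p and grow the height */
    /* by one for every consecutive head.  With p = 1/2 the expected height   */
    /* is 2, so a list of n such nodes holds about 2n forward pointers.       */
    static int random_height(double p)
    {
        int h = 1;
        while (h < MAX_HEIGHT && (double)rand() / RAND_MAX < p)
            h++;
        return h;
    }

    /* Allocate a node whose next[] array has exactly h slots. */
    static node *make_node(int key, int h)
    {
        node *x = malloc(sizeof(node) + (h - 1) * sizeof(node *));
        x->key = key;
        for (int i = 0; i < h; i++)
            x->next[i] = NULL;
        return x;
    }

The shape of the list is thus entirely determined by these random heights; the deterministic versions discussed below replace the coin with an explicit structural invariant.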
Despite the fact that the average insert and update costs for skip lists are low, the worst case insert and update costs are technically unbounded as functions of n, and they remain linear even if the skip list is capped at some level, as suggested by Pugh [8]. Thus the balanced search tree schemes (AVL, red-black, etc.) remain the refuge of one who wants a firm logarithmic upper bound for any individual operation.

In this paper, we present several deterministic versions of the skip list that have a Θ(lg n) worst case search cost. We begin by demonstrating our discipline in its simplest form, which requires a total of at most 2n pointers and has a Θ(lg² n) worst case update cost. This update behaviour is improved to Θ(lg n) with a total of at most 6n pointers in one version, or a total of at most 2.3n pointers in another. The 6n pointers version has the advantage of easier coding. The 2.3n version is ideally suited to an environment in which the buddy memory allocation system is employed (e.g. BSD Unix), since in this version there is no increase in the storage requirement over the initial form having the Θ(lg² n) worst case update cost, as we are given this extra storage whether we want it or not.

Finally, we present deterministic versions of skip lists in which updates are done in a top-down manner. Although these new skip lists are essentially simple extensions of the previous versions, their implementations are simpler, and if we become a bit cavalier with storage, we are able to achieve an implementation simpler than that of 2-3-4 trees implemented directly or through the red-black formalism.

2 The starting point

A natural starting point is the idea of a perfectly balanced skip list, namely one in which every kth node of height at least h is of height at least h + 1 (for some fixed k). Although searches in balanced skip lists take logarithmic time, the insert and delete costs are prohibitive. We should therefore examine the consequences of relaxing a bit the strict requirement "every kth node".

Assuming that in a skip list of n elements there exist a 0th and an (n+1)st node of height 1 higher than the height of the skip list, we require that between any two nodes of height h (h > 1) or higher, there exist either 1 or 2 nodes of height h − 1. We call this skip list a 1-2 skip list. As we see from Fig. 1, there exists a one-to-one correspondence between 1-2 skip lists and 2-3 trees. We feel that, taken from first principles, the skip list view of the structure is simpler than the search tree view.

What is the number of (horizontal) pointers in a 1-2 skip list in the worst case? Clearly, the header has at most ⌊lg(n + 1)⌋ pointers. In addition to these, there exist exactly n pointers at level 1, at most ⌊(n − 1)/2⌋ pointers at level 2, at most ⌊(n − 3)/4⌋ pointers at level 3, and, in general, at most

    $\left\lfloor \frac{n - (1 + 2 + 4 + \cdots + 2^{i-2})}{2^{i-1}} \right\rfloor = \left\lfloor \frac{n + 1}{2^{i-1}} \right\rfloor - 1$

pointers at level i. Therefore, the maximum number of pointers, including the pointers of the header, is $\sum_{i=1}^{\lfloor \lg(n+1) \rfloor} \lfloor (n+1)/2^{i-1} \rfloor$. This sum is well known (see, for example, [3, pp. 113–114]), and so we get that the maximum number of pointers (including those of the header) in a 1-2 skip list of n elements is exactly $2n + 1 - \nu(n+1)$, where $\nu(m)$ denotes the number of 1's in the binary representation of m; since $\nu(n+1) \ge 1$, this is at most 2n.
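As a quick sanity check of this count (a worked instance added for illustration, not taken from the original text), take n = 7, so that ⌊lg(n + 1)⌋ = 3 and n + 1 = 8:

    $\sum_{i=1}^{3} \left\lfloor \frac{8}{2^{i-1}} \right\rfloor = 8 + 4 + 2 = 14 = 2 \cdot 7 + 1 - \nu(8).$

The bound is attained, for instance, by 7 elements of heights 1, 2, 1, 3, 1, 2, 1: this is a valid 1-2 skip list using 7 + 3 + 1 = 11 node pointers plus 3 header pointers, 14 in all.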
A search in a 1-2 skip list proceeds as in an ordinary skip list. With a little care in the order in which key comparisons are made, at each level we are guaranteed to drop down a level without looking at the 3rd element ahead at the same level; this brings the worst case number of key comparisons down from 3⌊lg(n + 1)⌋ to 2⌊lg(n + 1)⌋.

An insertion in a 1-2 skip list is made by initially adding an element of height 1 in the appropriate spot. This may, of course, trigger the invalid configuration of 3 elements of height 1 in a row. This is easily rectified by letting the middle of these 3 elements grow to height 2. If this now results in 3 elements of height 2 in a row, we let the middle of these three elements grow to height 3, and so on. For example, the insertion of element 15 in the skip list of Fig. 1(a) will cause 13 to grow from height 1 to height 2, that will cause 17 to grow from 2 to 3, and that will cause 17 again to grow from 3 to 4.

In general, the insertion algorithm described above may cause up to ⌊lg(n + 1)⌋ elements to grow. In the 2-3 tree analogy, this is not a great problem, since each increase of an element's height takes constant time. However, in our setting, if we insist on implementing the sets of horizontal pointers of the nodes as arrays (rather than as linked lists) of pointers, then, when a node grows, we have to allocate space for a larger node, and we have to copy all pointers into and out of the old node over to the new node. Therefore, the growth of a single node from height h to height h + 1 requires the change of Θ(h) pointers. Since in the worst case all nodes of heights 1, 2, ..., ⌊lg(n + 1)⌋ will have to be raised, an insertion may take up to Θ(lg² n) time.

A deletion of an element from a 1-2 skip list is done in a way completely analogous to the way it is done in the corresponding 2-3 tree. The deletion of 5, for example, from the 1-2 skip list of Fig. 1(a) will cause 3 to shrink, and this will cause 10 to shrink and 17 to grow. In general, up to ⌊lg(n + 1)⌋ elements may have to grow, and up to ⌊lg(n + 1)⌋ may have to shrink, resulting thus in a Θ(lg² n) worst case cost as well.

3 Achieving logarithmic worst case update costs

One solution to the problem of the Θ(lg² n) worst case update costs encountered in the preceding section would be to implement the set of horizontal pointers of each node as a linked list of pointers.
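To make this alternative representation concrete, here is a minimal sketch of our own (the type and function names and the exact record layout are assumptions for illustration, not the paper's code). Each node owns a vertical linked list with one constant-size record per level, so raising the node by one level splices in a single new record instead of reallocating and copying an array of pointers:

    #include <stdlib.h>

    typedef struct node node;

    /* One record per (node, level) pair. */
    typedef struct level_rec {
        struct level_rec *next;   /* horizontal pointer at this level            */
        struct level_rec *up;     /* same node, one level higher (NULL at top)   */
        node             *owner;  /* back pointer so a search can reach the key  */
    } level_rec;

    struct node {
        int        key;
        level_rec *levels;        /* level-1 record; higher levels via 'up' */
    };

    /* Grow a node by one level.  'top' is the node's current highest record   */
    /* and 'pred' is the record after which the node must appear in the        */
    /* horizontal list of the new level (both are at hand when an insertion    */
    /* retraces its search path).  Only a constant number of pointers change,  */
    /* and nothing is copied, so each growth step costs O(1).                  */
    static level_rec *grow(level_rec *top, level_rec *pred)
    {
        level_rec *t = malloc(sizeof(level_rec));
        t->owner = top->owner;
        t->up    = NULL;
        t->next  = pred->next;    /* splice into the new level's list */
        pred->next = t;
        top->up  = t;             /* extend this node's vertical list */
        return t;                 /* the node's new top record        */
    }

With such a representation, each of the up to ⌊lg(n + 1)⌋ promotions caused by an insertion costs O(1) pointer changes rather than Θ(h), at the price of a few extra pointers per level.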
