Dynamic concurrent van Emde Boas array

Konrad Kułakowski

AGH University of Science and Technology, al. Mickiewicza 30, Kraków, Poland

arXiv:1509.06948v1 [cs.DS] 23 Sep 2015

Abstract. The growing popularity of shared-memory multiprocessor machines has caused significant changes in the design of concurrent software. In this approach, the concurrently running threads communicate and synchronize with each other through data structures in shared memory. Hence, the efficiency of these structures is essential for the performance of concurrent applications. The need to find new concurrent data structures prompted the author some time ago to propose the cvEB array modeled on the van Emde Boas structure as a dynamic alternative. This paper describes an improved version of that structure - the dcvEB array (Dynamic Concurrent van Emde Boas Array). One of the improvements involves memory usage optimization. This enhancement required the design of a tree which grows and shrinks at both the top (root) and the bottom (leaves) level. Another enhancement concerns the successor (and predecessor) search strategy. The tests performed seem to confirm the high performance of the dcvEB array. They are especially visible when the range of keys is significantly larger than the number of elements in the collection.

1 Introduction

The rapid rise in the popularity of multi-core shared-memory processor systems makes concurrent programs increasingly common and desirable. Following the growing market for concurrent software, an increasing demand for the use of concurrent data structures can be observed. Such a situation makes the search for new concurrent data structures particularly important. One of the attempts to find such a structure is the work [15] in which the author proposed the early version of the concurrent van Emde Boas (cvEB) array. The structure presented there is an example of concurrent dynamic set implementation providing, in addition to the standard methods insert(), remove() and find(), also the method successor(), which allows users to determine the first greater element from the specified one. It has very good theoretical and practical properties, as confirmed by tests and analyses carried out. Unfortunately, one of the shortcomings of that solution is the need to allocate all the required memory at the very beginning, as in the case of a regular array. Another limitation of that structure is the implementation of successor(), which in the case of massive interference with remove() operating in different threads might be delayed or might fail due to the search repetition. These deficiencies led the author to propose a new dynamic concurrent van Emde Boas (dcvEB) array, which, on the one hand, retains the good properties of its antecedent, and on the other hand is deprived of its shortcomings. Hence, the new structure presented in this article allocates and deallocates memory dynamically, depending on the amount of data stored in it. In addition, a new strategy for the successor() and predecessor() methods has been adopted. The new structure, rather than repeating the successor or predecessor search, continues searching until an appropriate element is found or the absence of such an element is decided. The assumed strategy is more robust and less susceptible to interference. It also seems to be more intuitive and justifiable in the context of user expectation.

The article consists of several sections, where, except for introductory ones (Sec. 1 - 3), the dcvEB array (Subsection 4.2) and its implementation (Subsection 4.3) are discussed. Next, the mechanisms of concurrent expanding and shrinking are explained (Subsection 5.1). Other enhancements, such as dynamic memory allocation and the new search strategy, are explained (Subsections: 5.2 and 5.3). Then, the successor search running time and the structure correctness are discussed (Section 5). The experimental results are examined in Section 6. The comments and discussion (Section 7) and a brief summary (Section 8) close the article.

2 Background

A dynamic set is one of the basic data structures in computer science. Usually, it is assumed that a dynamic set supports the following operations: insert(), delete(), search(), minimum(), maximum(), successor() and predecessor() [5, p. 230]. The first two of them are included in the category of modifying operations, while others are queries, which do not modify the structure. Due to increasing demands for data format, different structures support dynamic set operations to varying degrees. In particular, good dynamic set operation performance is provided by balanced search trees. For instance, all the dynamic set operations can be handled by RB-Trees [1] in a sequential running time $O(\lg \alpha)$, whilst van Emde Boas trees [25] need barely $O(\lg \lg \alpha)$ time to complete any of the mentioned operations [5]. Unfortunately, transition from the sequential to the concurrent objects is not easy [24]. Hence, many concurrent dynamic set implementations (e.g. [6,13]) do not support all the dynamic set operations and instead focus on dictionary operations.

The early works on the concurrent balanced search trees with dictionary operations began to emerge in the 70s [23,2]. In the subsequent years, the topic was studied in [7,16,19]. The studies, initially focusing on lock strategy [2] and lock coupling [19,17], began to deal with the relaxed (delayed) re-balancing [21,9] and the non-blocking synchronization schemes [3,6,4,13].

Skip List, proposed by Pugh [22], is an alternative to balanced search trees. It provides several linked lists arranged in a hierarchy, so that the single list corresponds to the set of nodes at the same depth in a tree. The structure avoids additional re-balancing due to the randomized fashion of the insertion algorithm. SkipList is suitable for both the sequential and concurrent applications. Very efficient SkipList implementation [10], based on Fraser [8], is part of a standard Java API 5. The Java SkipList implementation as one of the few (the second is a SnapTree Map by Bronson [3]) supports all the dynamic set operations including successor() and predecessor().

3 van Emde Boas tree

The tree structure proposed by van Emde Boas [25] is not a typical search tree. It supports all the dynamic set operations, such as insert(), delete(), search(), minimum(), maximum(), successor() and predecessor() [5, p. 230] in $O(\ln \ln \alpha)$. This tremendous speed involves the requirement that the keys must be unique integers in the range $0$ to $\alpha - 1$. Thus, from a practical point of view, the van Emde Boas (vEB) tree is something between an array and a search tree. Assuming that the number of stored elements is essentially smaller than $\alpha$, the vEB tree is better than the array as regards the speed of successor(), predecessor(), minimum() and maximum(). Of course, the efficiency of the array operations insert(), delete() and search() remains unchallenged regardless of the stored data size. In general, the vEB tree operates faster than the other search trees. However, the strong constraint on the key values makes it unusable if the stored objects cannot be represented as unique integers.

The key to the efficiency of the vEB tree operations is the uneven number of subtrees on different levels of the vEB tree. Thus, the root node has $\alpha^{1/2}$ subtrees, whereas each next level of the vEB tree shrinks the number of children in the nodes by the square root. Assuming that an operation over the vEB tree performs $O(1)$ work at each level of the hierarchy, the running time of a method is $O(h)$, where $h$ is the height of the vEB tree. Reducing the number of subtrees cannot be carried out indefinitely. Thus, at the last but one level of the tree, the nodes have at most two single-element subtrees, i.e. $\alpha^{1/2^{h-1}} = 2$. Hence, we obtain $\ln \alpha = 2^{h-1}$, and finally $h = \ln \ln \alpha + 1$. Thus, the asymptotic running time of an exemplary operation is $O(\ln \ln \alpha + 1) = O(\ln \ln \alpha)$.

To be able to traverse each level of a tree in $O(1)$ the vEB tree methods use the arrays of references to the subtrees. For this reason the root node $T_{root}$ needs to store $T_{root}.arr$ - an $\alpha^{1/2}$-element array of subtrees, their children $T$ store $T.arr$ - $\alpha^{1/4}$-element arrays of their subtrees, and so on. With this construction, every method can calculate in which subtree the given key can be found. For example, in the case of the root node, the key $x$ is expected to be in the $\lfloor x / \alpha^{1/2} \rfloor$ subtree etc.

To achieve $O(1)$ level traversing time, the more complex methods like successor() and predecessor() need further information about the subtrees. Thus, with every node $T$ the next three variables are assigned: T.max, T.min and T.summary, where T.max, T.min denote correspondingly the maximal and the minimal value of a key in the subtree rooted in $T$. The summary is an auxiliary search structure. Intuitively speaking, the search() method traverses down the vEB tree along a well-defined path from the root to the given key. The successor() must deviate from this path to the right (predecessor() to the left). The decision whether to go down into the subtree according to the predetermined path or go to the right at the same level is taken on the basis of the value T.max. Thus, if $T$ is a subtree in which, according to the path calculation, the key $x$ should be stored, then the successor() goes down into $T$ only if $x < T.max$, i.e. when the maximal key in $T$ is greater than $x$. If $x \geq T.max$ the successor() method needs to move horizontally to the right in search of the first non-empty subtree. Of course, such a horizontal search might be time consuming. For instance, the linear browsing of $T_{root}.arr$ may take up to $O(\alpha^{1/2})$. In order to shorten the horizontal search, the same mechanism as in the case of the whole structure is used. T.summary is an auxiliary tree that holds information about the occupancy of the array T.arr in the same manner as the main tree holds the keys. Thus, traversing T.summary takes at most $O(\ln \ln |T.arr|)$. As a result, the overall asymptotic running time of successor() and predecessor() is $O(\ln \ln \alpha)$.

A good and systematic introduction into the vEB trees theory can be found in [5].

4 Construction of the dcvEB array

4.1 From the vEB Tree to the dcvEB array

One of the reasons why vEB trees are not so popular in practice are space requirements [5]. The need to allocate one continuous block of memory in the root of a structure capable of holding an $\alpha^{1/2}$-element array might be inconvenient. The problem can be addressed in different ways [20,5]. One of them implemented in the dcvEB array proposes the use of a fixed number of subtrees per node. It results in a worse theoretical time complexity, however, in many practical applications the achieved speed appears to be quite sufficient. For the same reason, the summary structure is simplified to a bit-vector aligned to the length of a machine-word. The use of high-speed non-blocking bitwise operations on the summary vector allows users to avoid the use of T.min and T.max.

The logic behind some methods of the dcvEB array is also changed. For example, in the vEB tree, the delete() method performs one single pass from the top to bottom. Due to synchronization issues in the dcvEB array [15] the delete() method proceeds bottom-up. Similarly, successor() and predecessor() first reach the bottom of the tree, then start to traverse the tree moving up and down in search of the appropriate element. The dcvEB array tries to use the non-blocking synchronization mechanisms as often as possible. For example, the get() method uses only the lock-free synchronization mechanisms, which results in its very good performance in the tests (Sec. 6). The only exception is the mutual synchronization of insert() and delete(). In this case, in order to ensure data consistency [15, p. 373] the readers-writer lock [11] is used.

Despite the fact that the creation of the dcvEB array was inspired by the vEB tree, the differences between these two structures seem to be fundamental. Therefore the dcvEB array should be treated, not as a concurrent extension of the sequential vEB tree but, as the new and original data structure.

4.2 Structure organization

The dcvEB and cvEB arrays can be seen as a tree of arrays [15]. Each array's cell holds the reference to ArrayHolder (AH) - a tree node structure, which wraps the lower-level array or stores a specific value if AH is a leaf. The leaves are kept at the lower level of the tree. Each array corresponds to an associated summary - a bit vector, in which the $i$-th bit is enabled only if the appropriate array's cell holds the lower level AH. Besides an array and the associated bit vector, every AH also contains the readers-writers lock object [11]. The leaf AH instead of an array reference holds an element and an integer index value as its key. The value of the key determines the path from the root to the leaf understood as a sequence of positions on the various levels of the tree. The path positions are calculated according to the following recurrent formula: $i_k = i_{k-1} - lp_{k-1} \cdot n^{h-k}$, $lp_k = \lfloor i_k / n^{h-k-1} \rfloor$, where $n$ is the length of a bit vector, $h$ is the height of the tree, $lp_k$ is the path position on the $k$-th level, and $i_0$ is the key of the element. A position at the root level $lp_0$ is defined as $lp_0 = \lfloor i_0 / n^{h-1} \rfloor$. The dcvEB array of the height $h$ can hold elements within the range $[0, \ldots, n^h - 1]$. If there is a need to store an element with a key greater than $n^h - 1$ or by removing the item there are no elements with the keys within the range $[n^{h-1}, \ldots, n^h - 1]$ the tree has to be vertically resized. The concurrent tree resizing algorithms as integral parts of the insert() and remove() procedures are discussed later. If the dcvEB array does not contain a particular element, and its key fits the current key range, the insert procedure recreates the missing AH along the path from the root to the leaf. Similarly, the remove procedure deletes AHs if the appropriate summaries are 0.

4.3 dcvEB array methods

The dcvEB array is designed to support all the dynamic set methods as specified in [5, p. 230]. Not all of them are extensively discussed in the article, although all of them are implemented¹. In particular, the basics of the missing predecessor() method are very similar to successor(), which is discussed below, whilst the methods minimum() and maximum() have straightforward implementation using successor() and predecessor()¹. It is assumed that the stored objects are uniquely identified by integer keys. Thus, the key appears in most of the dcvEB array methods as an input parameter, whilst the return value of all the query methods is the pair consisting of the key and the stored element. The presented implementation uses locks as well as lock-free synchronization mechanisms. Hence, wherever an atomic, lock-free element is used, an appropriate object or variable is declared as atomic. The main purpose of this section is to allow the reader to understand the general idea behind the presented algorithms and the data structures they use. For this reason some issues connected with synchronization and concurrency are only indicated, and will be discussed later.

¹ The minimum can be determined by the call successor(0), whilst the maximum by the call predecessor($n^h - 1$).

Fig. 1. dcvEB array scheme

The methods presented below use two additional structures: ArrayHolder (Listing: 1), ArrayParam, and one atomic common variable ap, which holds the current ArrayParam value. The ArrayHolder contains five fields: array - atomic array of references to the lower-level AHs, summary - an atomic bit vector implemented as any integer type available on the current hardware platform, index - an atomic key value of the stored object, data - an atomic reference to the stored object, and lock - reader-writer lock object associated with the given AH.

    1 ArrayHolder
    2   AtomicRefArray array
    3   AtomicInt summary
    4   AtomicInt index
    5   AtomicRef data
    6   RWLock lock

Listing 1: Array Holder structure

The second structure ArrayParam (AP) contains the fields: size - the number of indices assignable at the moment in the dcvEB array (i.e. the maximal object stored in the dcvEB array cannot have a key greater than size − 1), height - the number of levels of a tree implementing the dcvEB array structure except the last leaf level, and root - a root's AH (Listing: 2).

    7 ArrayParam
    8   int size
    9   int height
    10  ArrayHolder root

Listing 2: Array Parameters structure

The current value of ArrayParam is stored in the common atomic variable ap. Except for initialization, the fields of AP are read-only, hence they do not need to be synchronized.

The first presented method discussed in this section is insert() (Listing: 3). At the very beginning it locks the common atomic variable ap (in order to prevent altering the current ArrayParam reference by remove()), then it makes the local copy of the current array parameters (Line: 12). Next, it locks the root (Line: 13), and unlocks ap (Line: 13). Then, it checks whether the key value fits the current array size and, if not, it tries to extend the array (Listing: 3, Lines: 15 - 17). Array growing is implemented by adding successive levels above the current root (Listing: 4). When the new top of the dcvEB array tree is ready, the algorithm tries to set it as the new root within the newly created ArrayParam record (Listing: 3, Line: 17). Then, irrespectively of the result of the CAS² invoke, it unlocks cAP's root (Listing: 3, Line: 18). If, due to concurrent interference with other threads, CAS fails and the common array parameters are not changed, the root locking guarded by the ap lock is repeated (Listing: 3, Lines: 20 - 21), and the loop condition is re-evaluated (Listing: 3, Line: 14). If CAS succeeds, then cAP is updated (Line: 22), and the loop is interrupted.

² CAS(a,b,c) - compare and swap atomic action operating under the scheme: if a = b then a ← c and return true. Return false otherwise.

After the size of the array has been adapted to the size of a key, the algorithm traverses the tree structure starting from the current root (Listing: 3, Line: 24) to the leaf. On every step of the loop while (Listing: 3, Lines: 25 - 38) a subsequent level of the tree is visited. The loop starts from calculating the level position lp, then, if it is not the top level, cAH becomes read locked, and the previous node pAH is unlocked (Listing: 3, Line: 28). Next, pAH is set to cAH, and cAH is atomically updated (Listing: 3, Lines: 29 - 31). This update is to set the $n - lp$ bit in summary corresponding to the lp cell of the cAH's array field. After setting the bit indicating that at lp position in cAH's array there is a subtree, iteration moves to the lower level of the tree, i.e. the current value of cAH is replaced by the reference to its lp children (Listing: 3, Line: 32).

    11 insert(key, data)
    12   apLock.rLock(); cAP ← ap;
    13   cAP.root.rLock(); apLock.rUnlock();
    14   while (key >= cAP.size)
    15     ArrayParam newAP ← grow(key);
    16     newAP.rLock();
    17     tmp ← CAS(ap,cAP,newAP);
    18     cAP.root.rUnlock();
    19     if not tmp then
    20       apLock.rLock(); cAP ← ap;
    21       cAP.root.rLock(); apLock.rUnlock();
    22     else cAP ← newAP; break;
    23   end while;
    24   cAH ← cAP.root; cl ← 0; pAH ← nil;
    25   while (cl < cAP.height)
    26     lp ← lvlPos(cl, key);
    27     if cl ≠ 0 then
    28       cAH.rLock(); pAH.rUnlock();
    29     pAH ← cAH; cAH.summary ←
    30       0_n ... 0_{n-lp+1} 1_{n-lp} 0_{n-lp-1} ... 0_1
    31       OR_bit cAH.summary;
    32     cAH ← cAH.array[lp];
    33     if (cAH = nil) then
    34       cAH ← createAH();
    35       CAS(pAH.array[lp],nil,cAH)
    36       cAH ← pAH.array[lp];
    37     cl ← cl+1;
    38   end while
    39   cAH.data ← data; cAH.index ← key;
    40   pAH.rUnlock();

Listing 3: Insert method

Of course, it is possible that the subtree has not yet been initialized (Listing: 3, Line: 33). In such a case the new cAH is created, atomically assigned to the parent AH's array when possible (Listing: 3, Line: 35), then due to the possible interference with another insert thread (but not remove thread) the final value of cAH is re-read from the parent AH's array (Listing: 3, Line: 36). At the end of the loop, the variable determining the current level of iteration is incremented (Listing: 3, Line: 37). The loop ends when cAH is pointing at some leaf AH. Hence, at the end of the method both leaf AH's fields: data and index, are updated. In the last line of insert() the leaf's parent node lock is released (Listing: 3, Line: 40).

An important routine used within the insert() method is grow(). It is responsible for extending the dcvEB array, when it is too small to hold an element with the given key. Enlarging the array relies on adding additional levels above the existing root so that the total height $h$ of the dcvEB array tree increases. Hence, the dcvEB array becomes capacious enough to encompass the key i.e. it requires $n^h > key$. As a result of this operation, a new AP record is created (Listing: 4, Line: 42). Then, the grow() procedure calculates the appropriate new height and size (Listing: 4, Lines: 43 - 44). The number of levels to create is determined as the difference between the previous height and the new height of the tree (Listing: 4, Line: 46). Then, the procedure starts the loop while (Listing: 4, Lines: 48 - 57), and within every turn of the loop the new AH is generated. The first generated AH becomes a new root of the tree (Listing: 4, Line: 52), each further one becomes the leftmost child of its predecessor (Listing: 4, Line: 54), and finally the last generated AH takes the previous root AH as its leftmost child (Listing: 4, Line: 56). The sequential running time of grow() is $O(\log_n \alpha)$. Since the number of iterations of the loop while (Listing: 3, Lines: 14 - 23) depends on the interferences with the concurrently operating delete threads, whilst the number of iterations of while (Listing: 3, Lines: 25 - 40) is limited by the height $h = \log_n \alpha$ of the tree, then the overall sequential running time of insert() is $O(\log_n \alpha)$.

    41 grow(key)
    42   nAP ← createAP()
    43   nAP.height ← ⌈log_d key⌉
    44   nAP.size ← d^{nAP.height}
    45   cAP ← ap;
    46   topSize ← nAP.height - cAP.height
    47   cl ← 0
    48   while (cl < topSize)
    49     cAH ← createAH()
    50     cAH.summary ← 1_n 0_{n-1} ... 0_1
    51     if cl = 0
    52       nAP.root ← cAH
    53     else
    54       pAH.array[0] ← cAH
    55     if cl = topSize - 1
    56       cAH.array[0] ← cAP.root
    57     pAH ← cAH
    58   return nAP

Listing 4: Array growing

The next method get(), similarly to insert(), first retrieves the current snapshot of the dcvEB array parameters (Listing: 5, Line: 60), then traverses the structure down from the root to the leaf following the subsequent level positions. The main difference between get() in the dcvEB array and get() from the previous version of the structure [15] is that currently the enabled bit in a summary does not guarantee the existence of the corresponding lower level array holder. Hence, the additional check whether the next AH is not actually nil is necessary (Listing: 5, Lines: 67 - 68). As can be seen, the sequential running time of get() is determined by the loop (Listing: 5, Lines: 61 - 70) and is $O(\log_n \alpha)$.

    72 delete(key)
    73   cAP ← ap; cAH ← cAP.root;
    74   pAH ← nil; cl ← 0;
    75   ahol ← makeEmptyArray(cAP.height)
    76   pos ← makeEmptyArray(cAP.height)
    77   makePath(key,cAP,cAH,pAH,cl,ahol,pos)
    78   if (!is_filed(ahol,cAP.height)) then
    79     return;
    80   delIntern(key,cAP,cAH,pAH,cl,ahol,pos);
    81   while (cAP != ap and rep < maxRep)
    82     deleteClean(key); rep ← rep + 1;
    83   topTrim();

Listing 6: Delete method

    59 get(key)
    60   cAP ← ap; cAH ← cAP.root; cl ← 0;
    61   while (cl < cAP.height)
    62     lp ← lvlPos(cl, key)
    63     if (0_n 0_{n-1} ... 0_{n-lp+1} 1_{n-lp} 0_{n-lp-1} ... 0_1
    64         AND_bit cAH.summary) = 0
    65       return nil
    66     cAH ← cAH.array[lp]
    67     if cAH = nil
    68       return nil
    69     cl++
    70   end while
    71   return (cAH.data, cAH.index)
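The level positions that insert() and get() obtain from lvlPos() follow the recurrence given in Subsection 4.2. As a minimal sketch only (not the author's implementation), the Python function below computes the whole root-to-leaf path for a key. The name level_positions, and the choice to pass n and h explicitly, are assumptions made for illustration.

```python
def level_positions(key, n, h):
    """Return [lp_0, ..., lp_{h-1}] for a key in a tree of height h
    whose nodes have n cells each (illustrative helper, not from the paper)."""
    if not 0 <= key < n ** h:
        raise ValueError("key must lie in [0, n**h - 1]")
    positions = []
    i = key                          # i_0 is the key itself
    for k in range(h):
        weight = n ** (h - k - 1)    # number of keys covered by one cell on level k
        lp = i // weight             # lp_k = floor(i_k / n^(h-k-1))
        positions.append(lp)
        i -= lp * weight             # i_{k+1} = i_k - lp_k * n^(h-k-1)
    return positions
```

Read this way, the positions are simply the h digits of the key written in base n, most significant digit first, which is why a key outside [0, n^h - 1] forces the tree to grow.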

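Section 4.1 states that bitwise operations on a machine-word summary replace T.min and T.max. The sketch below illustrates, under stated assumptions, how such a summary can be marked, tested, cleared and scanned to the right. It is sequential Python on plain integers and deliberately ignores the atomic updates and locks the paper relies on. All function names are hypothetical. Cell lp is mapped to bit n - lp counted from the right, as in the masks of Listings 3 and 5.

```python
def cell_mask(lp, n):
    """Mask with only the bit of array cell lp set; cell 0 is the leftmost bit."""
    if not 0 <= lp < n:
        raise ValueError("cell position must lie in [0, n - 1]")
    return 1 << (n - lp - 1)

def mark_cell(summary, lp, n):
    """Set the bit of cell lp (the OR step used while inserting)."""
    return summary | cell_mask(lp, n)

def is_marked(summary, lp, n):
    """Test the bit of cell lp (the AND test used while looking a key up)."""
    return (summary & cell_mask(lp, n)) != 0

def unmark_cell(summary, lp, n):
    """Clear the bit of cell lp (what a removal does before freeing a node)."""
    return summary & ~cell_mask(lp, n)

def next_marked(summary, lp, n):
    """Return the first marked cell strictly to the right of lp, or None."""
    right = summary & (cell_mask(lp, n) - 1)   # keep only bits of cells > lp
    if right == 0:
        return None
    return n - right.bit_length()              # leftmost remaining bit -> cell index
```

With n = 8, marking cells 2 and 5 gives the word 0b00100100, and next_marked on cell 2 yields 5 without visiting the cells in between, which is the horizontal step a successor search needs inside a single node.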
Listing 5: Get method

Changes resulting from the introduction of dynamic memory allocation also affected the delete() method. Since insert() is able to expand the top of the dcvEB array tree and to generate missing lower level AHs, then delete() needs to be able to trim the top of the tree and to remove redundant nodes. The delete() method implementation can be logically divided into three stages: preparing a path towards a leaf, deleting the leaf with the deletion propagation and cleaning, and the dcvEB array top trimming. Like almost all presented dynamic set methods, delete() also starts from fetching the snapshot of the current dcvEB array parameters (Listing: 6, Lines: 73 - 74). Then, after the creation of the two empty tables ahol and pos for holding the path between the root and the node for disposal, the method makePath is invoked (Listing: 6, Line: 77). The purpose of this method is to fill these tables with the subsequent AHs and their positions along the way from the root to the leaf node being removed according to the formula for $lp_k$. During the iteration, similarly to get(), the presence of the child must be checked twice. Firstly, by checking a summary bit vector (Listing: 7, Lines: 87 - 88), the second time by checking whether the retrieved subsequent AH is not nil (Listing: 7, Line: 91). It is assumed that the arguments of makePath are in-out, which means that the changes made inside the method are visible outside.

It is noteworthy that makePath may not contain a complete path between the root and the leaf designed to be disposed. This happens when there is no such path i.e. because the desired element has just been removed. In such a case the procedure stops, and leaves the arrays ahol and pos partially filled. In the case of delete(), not entirely filled arrays indicate that there is no element to delete (Listing: 6, Line: 78). Hence the method can finalize its operation (Listing: 6, Line: 79).

    84 makePath(key,cAP,cAH,pAH,cl,ahol,pos)
    85   while (cl < cAP.height)
    86     lp ← lvlPos(cl, key);
    87     if (0_n ... 0_{n-lp+1} 1_{n-lp} 0_{n-lp-1} ... 0_1 AND_bit
    88         cAH.summary) = 0 then break;
    89     pos[cl] ← lp; ahol[cl] ← cAH;
    90     pAH ← cAH; cAH ← cAH.array[lp];
    91     if (cAH = nil) then break;
    92     cl ← cl + 1;
    93   end while

Listing 7: makePath - delete auxiliary method

The second and the major subroutine of delete() is delIntern(). In terms of the synchronization structure, it is similar to the original delete() method presented in [15]. The need, however, for effective array holder removal caused the necessity to introduce a few new elements into the code of the algorithm. The delIntern() method is executed only if the array holder structure is correctly filled, which takes place only if makePath() (Listing: 7) does not break its while loop. Hence, at the very beginning of delIntern(), it is assumed that the variable pAH refers to some AH from the last but one (cAP.height-1) level, whilst cAH points at the element from the last level containing pairs (index, data).

Thus, after locking appropriate array holders (Listing: 8, Line: 95), the stored data are overwritten by nil (Listing: 8, Line: 96). Afterwards delIntern() begins its arduous journey towards the root iterating within the loop while (Listing: 8, Lines: 98 - 121). It starts from the last but one level (Listing: 8, Line: 100). First, it sets an appropriate bit in cAH's summary to 0 (Listing: 8, Line: 102). Therefore, the data was logically removed from the structure (data field is set to nil, index to −1, and AH is not indicated by the parent's summary), although an appropriate array holder still exists. Such an array holder will be physically removed only when the whole cAH's summary is 0 (Listing: 8, Line: 107). Otherwise, if cAH's summary is not 0, the previously locked nodes are released and the method exits (Listing: 8, Lines: 103 - 105).

    94 delIntern(key,cAP,cAH,pAH,cl,ahol,pos)
    95   pAH.wLock(); cAH.wLock();
    96   cAH.data ← nil; cAH.index ← −1;
    97   pAH ← cAH;
    98   while (cl ≥ 0)
    99     cAH ← ahol[cl]; lp ← pos[cl];
    100    if (cl = cAP.height - 1)
    101      cAH.summary ← 1_n ... 0_{n-lp} ... 1_1
    102        AND_bit cAH.summary;
    103      if (cAH.summary ≠ 0)
    104        pAH.wUnlock(); cAH.wUnlock();
    105        return;
    106      else
    107        cAH.array ← {nil_0, ..., nil_{n-1}};
    108    else
    109      cAH.wLock(); pAH.wLock();
    110      isSummaryAltered ← false
    111      if (pAH.summary = 0)
    112        cAH.summary ← 1_n ... 0_{n-lp} ... 1_1
    113          AND_bit pAH.summary;
    114        isSummaryAltered ← true
    115      if (cAH.summary = 0)
    116        cAH.array ← {nil_0, ..., nil_{n-1}}
    117      cAH.wUnlock(); pAH.wUnlock();
    118      if (not isSummaryAltered)
    119        return;
    120    cl ← cl - 1; pAH ← cAH;
    121  end while

Listing 8: Delete internal - delete auxiliary method

This, "my brother keeps me alive", lazy strategy aims to reduce the amount of memory allocation performed during the course of the algorithm. If delIntern() processes the element on the level cAP.height − 2 or higher, then it first locks the current and previous AH's node, and next alters the cAH summary by removing the bit corresponding to the removed children (Listing: 8, Line: 113). As in the previous case, if cAH summary is 0 then the child node is dereferenced (Listing: 8, Lines: 115 - 116). Then, after unlocking cAH and pAH (Listing: 8, Line: 117) and checking whether it makes sense to propagate a delete action towards the root (Listing: 8, Lines: 118 - 119) the else block ends. At the end of the procedure the variables cl (current level) and pAH (previous array holder) are updated (Listing: 8, Line: 120).

The next subroutine of delete() is deleteClean(). It is called from delete() just after delIntern() (Listing: 6, Lines: 81 - 82). The main reason for which it is introduced is the danger of not removing all the required AH when the delete action interferes with the insert action. The idea of deleteClean() implementation and further explanations are in Subsection 5.1.

    122 topTrim()
    123   cAP ← ap;
    124   while (cAP.root.summary = 1_n 0_{n-1} ... 0_1)
    125     if (cAP.height = 1) then break;
    126     nAP ← createAP();
    127     nAP.height ← cAP.height - 1;
    128     nAP.size ← d^{nAP.height};
    129     theLonelyChild ←
    130       cAP.root.array[0];
    131     if (theLonelyChild = nil) then
    132       return nil;
    133     nAP.root ← theLonelyChild;
    134     apLock.wLock(); cAP.root.wLock();
    135     if (cAP.root.summary = 1_n 0_{n-1} ... 0_1)
    136       then CAS(ap,cAP,nAP);
    137     cAP.root.wUnlock(); apLock.wUnlock();
    138     cAP ← ap;

Listing 9: TopTrim - delete auxiliary method

The purpose of topTrim() - the last auxiliary method involved in delete() implementation is to cut the top of the dcvEB array tree if it is reduced to the list (Listing: 9). It is possible that, after the delIntern() call, the root and a few nodes below have only one, the leftmost, child. In such a case, the sequence of such vertices starting from the root needs to be safely removed. The topTrim() reduces the top of the tree iteratively. It removes only one node (root) in every course of the loop while (Listing: 9, Lines: 124 - 138). If the loop while condition is met, i.e. the root has only one child at the leftmost cell in the array, then the new AP candidate is prepared (Listing: 9, Lines: 126 - 128). Next the "lonely" child is retrieved (Listing: 9, Line: 130). If it is not nil (it might be nil due to another delete thread), it is promoted to a new root candidate of the whole dcvEB array tree (Listing: 9, Line: 133). Finally, if the array properties are not changed during the course of the topTrim routine (i.e. the assertion that the root has only one leftmost child still holds) the newly prepared nAP becomes the main array parameters reference. The topTrim() method does not trim the trees shallower than the ones composed of the root and leaves (Listing: 9, Line: 125). The CAS call (Listing: 9, Line: 136) responsible for the ArrayParam altering is guarded by two locks (Listing: 9, Line: 134). They prevent a situation in which insert() adds the new element into the subtree rooted in the node, which is subject to removal by topTrim().

The sequential running time of delete() depends on the complexity of its subroutines. The first of them makePath() (Listing: 7) comprises one loop while. Due to the loop condition (Listing: 7, Line: 85) it is clear that the sequential running time of makePath() is limited by the height of the tree i.e. $O(\log_n \alpha)$. The methods delIntern(), deleteClean() and topTrim() also need at most to visit all the nodes on a single path between the root and a leaf. Therefore, their sequential running time is $O(\log_n \alpha)$. Furthermore, if only one thread is up and running, the loop while (Listing: 6, Line: 81) executes only once. Thus the overall sequential running time of delete() equals the maximum of the running time of all its subcomponents, and is $O(\log_n \alpha)$.

The pseudo code of the last method successor() was divided into two parts. The first one (Listing: 10) is responsible for the attempt to reach the leaf AH holding the data indexed by a key. If such a leaf exists, it will be returned as its own successor. The second part (Listing: 11) contains a loop which consists of two other loops, where the first internal loop is responsible for traversing the dcvEB array tree up, whilst the second traverses the tree down. Such a structure of the code in the second part corresponds to the successor()'s searching strategy. In other words, first the method tries to go a little bit higher to check where a successor leaf could be (the first internal loop), then tries to go towards the leaf in order to retrieve the stored data and key (the second internal loop). Of course, sometimes during the gliding down the tree the successor candidate might be removed. In such a case, the second loop must be aborted and the method once again starts to follow up the tree in order to find another potential successor candidate.

    139 successor(key)
    140   cAP ← ap; cAH ← cAP.root;
    141   if (cAH.summary = 0) then return nil;
    142   pAH ← nil; cl ← 0;
    143   ahol ← makeEmptyArray(cAP.height);
    144   pos ← makeEmptyArray(cAP.height);
    145   makePath(key,cAP,cAH,pAH,cl,ahol,pos);
    146   if (cl = cAP.height) then
    147     cl ← cl - 1;
    148     if (cAH ≠ nil) then
    149       data ← cAH.data; index ← cAH.index;
    150       if (data ≠ nil and index ≠ -1)
    151         then return pair;

Listing 10: Successor method (part 1)

At the very beginning, the successor() sets its own local copy of the array parameters (Listing: 10, Line: 140), then it prepares a pair of holders, cAH and pAH, used to traverse the structure. Then the cl vari-

the leaf level, (the condition cl = cAP.height is true, see Fig. 1) this means that the element indexed by the key exists. Hence, if only the successor() procedure manages to fetch the stored data, then the appropriate (data, key) pair is returned by the method (Listing: 10, Lines: 146 - 151). Of course, makePath() may not reach the leaf level (the condition cl = cAP.height does not hold) or even if the leaf is reached, its removal might start before the leaf data are extracted (one of the following three conditions is true: cAH = nil, data = nil, index = -1). In such a case, the method control goes to the while loop (Listing: 11) and the algorithm starts to explore other successor candidates.

    152 mark: while(true)
    153   while (cl ≥ 0)
    154     cAH ← ahol[cl]; lp ← pos[cl];
    155     tmpSum = 0_n ... 0_{n-lp} 1_{n-lp-1} ... 1_1
    156       AND_bit cAH.summary;
    157     if (tmpSum = 0) then cl ← cl - 1;
    158     else break;
    159   end while
    160   if (cl = -1) then return nil;
    161   while(true)
    162     if (tmpSum = 0) then
    163       cl ← cl - 1; goto mark;
    164     bp ← mostLeftBitPos(tmpSum);
    165     lp ← lvlPos(bp); pos[cl] ← lp;
    166     ahol[cl] ← cAH; cAH ← cAH.array[lp];
    167     if (cAH = nil)
    168       cl ← cl - 1; goto mark;
    169     tmpSum ← cAH.summary; cl ← cl + 1;
    170     if (cAP.height = cl) then break;
    171   end while
    172   data ← cAH.data; index ← cAH.index;
    173   if (data ≠ nil and index ≠ -1) then
    174     return (data, index);
    175   else
    176     cl ← cl - 1; goto mark;
    177 end while;

Listing 11: Successor method (part 2)

The second part of the successor() method (Listing: 11) is responsible for traversing the structure up and down looking for the next successor candidate. The first inner loop (Listing: 11, Lines: 153 - 159) is responsible for traversing the structure up until the node with the non-empty subtree further to the right is found or the root level is achieved, i.e. cl =