Non-Blocking Interpolation Search Trees with Doubly-Logarithmic Running Time

Trevor Brown, University of Waterloo, Canada
Aleksandar Prokopec, Oracle Labs, Switzerland
Dan Alistarh, Institute of Science and Technology, Austria

Abstract

Balanced search trees typically use key comparisons to guide their operations, and achieve logarithmic running time. By relying on numerical properties of the keys, interpolation search achieves lower search complexity and better performance. Although interpolation-based data structures were investigated in the past, their non-blocking concurrent variants have received very little attention so far.

In this paper, we propose the first non-blocking implementation of the classic interpolation search tree (IST) data structure. For arbitrary key distributions, the data structure ensures worst-case O(log n + p) amortized time for search, insertion and deletion traversals. When the input key distributions are smooth, lookups run in expected O(log log n + p) time, and insertion and deletion run in expected amortized O(log log n + p) time, where p is a bound on the number of threads. To improve the scalability of concurrent insertion and deletion, we propose a novel parallel rebuilding technique, which should be of independent interest.

We evaluate whether the theoretical improvements translate to practice by implementing the concurrent interpolation search tree, and benchmarking it on uniform and non-uniform key distributions, for dataset sizes in the millions to billions of keys. Relative to the state-of-the-art concurrent data structures, the concurrent interpolation search tree achieves performance improvements of up to 15% under high update rates, and of up to 50% under moderate update rates. Further, ISTs exhibit up to 2× fewer cache misses, and consume 1.2× to 2.6× less memory compared to the next best alternative on typical dataset sizes. We find that the results are surprisingly robust to distributional skew, which suggests that our data structure can be a promising alternative to classic concurrent search structures.

CCS Concepts • Theory of computation → Concurrent algorithms; Shared memory algorithms; • Computing methodologies → Concurrent algorithms;

Keywords concurrent data structures, search trees, interpolation, non-blocking algorithms

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
PPoPP '20, February 22–26, 2020, San Diego, CA, USA
© 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-6818-6/20/02...$15.00
https://doi.org/10.1145/3332466.3374542

1 Introduction

Efficient search data structures are critical in practical settings such as databases, where the large amounts of underlying data are usually paired with high search volumes, and with high amounts of concurrency on the hardware side, via tens or even hundreds of parallel threads. Consequently, there has been a significant amount of research on efficient concurrent implementations of search data structures.

For search data structures supporting predecessor queries, which are the focus of this work, such as binary search trees (BSTs) or balanced search trees, efficient implementations have been well researched and are relatively well understood, e.g. [9, 13, 22, 36]. However, these classic search data structures are subject to the fundamental logarithmic complexity thresholds (in the number of keys n), even in the average case, which limits their performance for large key sets, in the order of millions or even billions of keys. In the sequential case, elegant and non-trivial techniques have been proposed to reduce average-case complexity, by leveraging properties of the key space, or of the key distribution. With one notable exception [37], these techniques are significantly less well understood for concurrent implementations.

This paper revisits this area, and provides the first efficient, non-blocking concurrent implementation of an interpolation search tree data structure [34], called the C-IST. The C-IST is dynamic, in that it supports concurrent searches, insertions and deletions. Interpolation search trees, presented in the next section, have amortized worst-case O(log n) time for standard operations, but achieve O(log log n) expected amortized time complexity for insert and delete, and O(log log n) expected time for search, by leveraging smoothness properties of the key distribution [34]. Our concurrent implementation preserves these properties with high probability.

To ensure correctness, non-blocking progress, and scalability in the concurrent setting, we introduce several new techniques relative to sequential ISTs. Specifically, our contributions are as follows:

• We describe the first non-blocking concurrent interpolation search tree (C-IST) based on atomic compare-and-swap (CAS) instructions (Section 2), with expected lookup time O(log log n + p), and expected amortized O(log log n + p) time for insert and delete.
• We design a parallel, non-blocking rebuilding algorithm to provide fast and scalable periodic rebuilding for C-ISTs (Section 3). We believe that this technique is applicable to other concurrent data structures that require rebuilding.
• We prove the correctness, non-blocking and complexity properties of the C-IST (Section 4).
• We provide a C-IST implementation in C++, and compare its performance against concurrent (a,b)-trees [13], Natarajan and Mittal's concurrent BSTs [36], and Bronson's concurrent AVL trees [10] (Section 5). We report performance improvements of 15% to 50% compared to (a,b)-trees (the prior best-performing concurrent search tree) on large datasets, and improvements of up to 3.5× compared to the other concurrent trees, depending on the proportion of updates. We also analyze the average depth and cache-miss behavior, present a breakdown of the execution time, show the impact of the parallel rebuilding algorithm, and compare memory footprints.

2 Concurrent Interpolation Search Tree

2.1 Examples and Overview

We illustrate how concurrent interpolation search trees work using several examples. Examine the first tree in the following figure. Each inner node consists of a set of d pointers to child nodes, and d − 1 keys that are used to drive the search. We say that the node's degree is d. The top node usually has the highest degree, and the degree of a node decreases as it gets deeper in the tree (explained precisely below). The tree is external, meaning that the keys are stored in the leaf nodes. The illustration shows a subset of nodes; the missing nodes are represented with ··· symbols.

[Figure: the tree before (left) and after (right) inserting k_m; a CAS replaces the parent's pointer to the leaf k_j with a pointer to a new inner node that holds k_j and k_m.]

Consider the task of inserting a key k_m, such that k_j < [...] such that k_m is the successor of k_j, and then allocates a new inner node that holds both k_j and k_m. Finally, the old pointer in the parent is atomically changed with a CAS instruction to point to the new node.

Without rebalancing, the tree can become arbitrarily deep. Therefore, insertion must periodically rebalance parts of the tree. The following figure shows the tree after inserting an additional key k_n, such that k_i < k_j < k_m < k_n. The subtree at the bottom, which contains the keys k_i, k_j, k_m and k_n, is sufficiently imbalanced, and it should be replaced with a more balanced tree. Rebalancing creates a new subtree that contains the same set of keys. After rebalancing, the subtree consists of a single inner node of degree 4, as shown on the right. Note that deletions also periodically rebalance the subtrees.

[Figure: the subtree containing k_i, k_j, k_m and k_n before (left) and after (right) rebalancing into a single inner node of degree 4.]

There are several challenges with making this approach concurrent. First, concurrent modifications and rebalancing must correctly synchronize so that all operations remain non-blocking, while searches remain wait-free. Second, the rebalancing of any subtree must not compromise the scalability of the other operations. Finally, concurrent rebalancing must, when the probability distribution of the input keys is smooth [34], ensure that the operations run in amortized O(log log n) time.

2.2 Data Types

The concurrent interpolation search tree consists of the data types shown in Figure 1. The IST data type represents the interpolation search tree with the single member root, which points to the root node. Initially, the root node points to an empty leaf node, whose type is Empty. The Single data type represents a leaf node with a single key and an associated value, and the Inner data type represents inner nodes, as illustrated on the right of Figure 1.

In addition to holding the search keys, and the pointers to the child nodes, the Inner data type contains the node's degree, and a field called initSize, which contains the number of keys that were in the corresponding subtree when this node was created. Apart from the child pointers, these fields are set on creation, and not subsequently modified. Inner also contains two volatile fields, count and status, which are used to coordinate rebuilding. The count field holds the number of updates that were performed in the subtree rooted at this node since it was created.
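
The data types of Section 2.2 and the CAS-based leaf insertion of Section 2.1 can be sketched in C++ roughly as follows. This is an illustrative sketch, not the authors' implementation: the names Kind, Node, childIndex and expandLeaf are hypothetical, keys and values are assumed to be 64-bit integers, the representation of status is a placeholder because the text above does not specify it, and the separator convention (child i holds keys below keys[i]) is an assumption. Memory reclamation, rebuilding and the maintenance of count are omitted.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical tag used to tell node types apart without RTTI.
enum class Kind { Empty, Single, Inner };

struct Node {
    const Kind kind;
    explicit Node(Kind k) : kind(k) {}
    virtual ~Node() = default;
};

// Leaf node that holds no key.
struct Empty : Node {
    Empty() : Node(Kind::Empty) {}
};

// Leaf node with a single key and its associated value (immutable).
struct Single : Node {
    const std::int64_t key;
    const std::int64_t value;
    Single(std::int64_t k, std::int64_t v)
        : Node(Kind::Single), key(k), value(v) {}
};

// Inner node: degree child pointers and degree - 1 search keys.
struct Inner : Node {
    const std::size_t degree;                  // set on creation
    const std::size_t initSize;                // keys in subtree at creation
    const std::vector<std::int64_t> keys;      // set on creation
    std::vector<std::atomic<Node*>> children;  // only these are CAS targets
    std::atomic<std::size_t> count{0};         // updates since creation
    std::atomic<std::uintptr_t> status{0};     // placeholder representation (assumption)

    Inner(std::size_t d, std::size_t size, std::vector<std::int64_t> ks)
        : Node(Kind::Inner), degree(d), initSize(size),
          keys(std::move(ks)), children(d) {}
};

// The tree itself: a single pointer to the root node, initially an Empty leaf.
struct IST {
    std::atomic<Node*> root;
    IST() : root(new Empty()) {}
};

// Picks the child to follow for `key`. A linear scan stands in for
// interpolation search here. Assumed convention: child i holds the keys k
// with keys[i-1] <= k < keys[i].
std::size_t childIndex(const Inner& n, std::int64_t key) {
    std::size_t i = 0;
    while (i < n.keys.size() && key >= n.keys[i]) ++i;
    return i;
}

// Replaces the leaf `old` stored in `slot` with a new degree-2 inner node
// that holds both the old key and the new key. Returns false if the CAS
// loses a race, in which case the caller would retry from the root.
bool expandLeaf(std::atomic<Node*>& slot, Single* old,
                std::int64_t key, std::int64_t value) {
    assert(old->key != key);  // duplicates are not handled in this sketch
    Single* fresh = new Single(key, value);
    Single* lo = old->key < key ? old : fresh;
    Single* hi = old->key < key ? fresh : old;
    Inner* repl = new Inner(2, 2, {hi->key});
    repl->children[0].store(lo);
    repl->children[1].store(hi);
    Node* expected = old;
    if (slot.compare_exchange_strong(expected, repl)) return true;
    delete repl;   // never published, so safe to free immediately
    delete fresh;
    return false;
}
```

Keeping every field except the child pointers const mirrors the statement that those fields are set on creation and not modified afterwards, so the only location an inserting thread contends on is the single parent slot it swaps.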
