
Toward Coordination-free and Reconfigurable Mixed Concurrency Control
Dixin Tang and Aaron J. Elmore, University of Chicago
https://www.usenix.org/conference/atc18/presentation/tang

This paper is included in the Proceedings of the 2018 USENIX Annual Technical Conference (USENIX ATC ’18). July 11–13, 2018 • Boston, MA, USA ISBN 978-1-939133-02-1

Open access to the Proceedings of the 2018 USENIX Annual Technical Conference is sponsored by USENIX.


Abstract

Recent studies show that mixing concurrency control protocols within a single database can significantly outperform a single protocol. However, prior projects to mix concurrency control either are limited to specific pairs of protocols (e.g., mixing two-phase locking (2PL) and optimistic concurrency control (OCC)) or introduce extra concurrency control overhead to guarantee their general applicability, which can be a performance bottleneck. In addition, due to unknown and shifting access patterns within a workload, candidate protocols should be chosen dynamically in response to workload changes. This requires changing candidate protocols online without having to stop the whole system, which prior work does not fully address. To resolve these two issues, we present CormCC, a general mixed concurrency control framework with no coordination overhead across candidate protocols that also supports changing a protocol online with minimal overhead. Based on this framework, we build a prototype main-memory multi-core database to dynamically mix three popular protocols. Our experiments show CormCC achieves significantly higher throughput than single protocols and state-of-the-art mixed concurrency control approaches.

1 Introduction

With an increase in CPU core counts and main-memory capacity, concurrency control has become a new bottleneck in multicore main-memory databases due to the elimination of disk stalls [8, 34]. New concurrency control protocols and architectures focus on enabling high throughput by fully leveraging available computation capacity while supporting ACID transactions. Some protocols try to minimize the overhead of concurrency control [10], while other protocols strive to avoid single contention points across many cores [27, 35]. However, these single protocols are typically designed for specific workloads, may only exhibit high performance under their optimized scenarios, and have poor performance in others. Consider H-Store's concurrency control protocol, which uses coarse-grained exclusive partition locks and a simple single-threaded executor per partition [10]. This approach is ideal for partitionable workloads, where tuples are partitioned such that a transaction is highly likely to access only one partition. But this approach suffers decreasing throughput with increasing cross-partition transactions [34, 27, 20]. Here, an optimistic protocol may be preferred if the workload mainly consists of read operations, or a pessimistic protocol may be ideal if the workload exhibits high conflicts. An appealing solution to this tension is to combine different protocols such that each protocol processes the part of the workload it is optimized for, avoiding brittleness in scenarios where single protocols suffer.

Efficiently mixing multiple concurrency control protocols is challenging in several ways. First, the mix should not be limited to a specific set of protocols (e.g., OCC and 2PL [22, 28]), but should extend to new protocols under reasonable assumptions. Second, the overhead of mixing multiple protocols should be minimized so that it is not a performance bottleneck in any scenario; a robust design should ensure that the mixed execution never performs worse than any single candidate protocol involved. Finally, as many transactional databases back user-facing applications, dynamically switching protocols online in response to workload changes is necessary to maintain performance.

While several recent studies focus on mixed concurrency control, they address only part of the above challenges. For example, MOCC [28] and HSync [22] are designed to mix OCC and 2PL with minimal mixing overhead but fail to extend to other protocols; Callas [31] and Tebaldi [23] provide a general framework that can cover a large number of protocols, but their overhead of mixing candidate protocols is non-trivial in some scenarios, and they do not support online protocol switching, thus failing

to address workload changes. Our prior work [25] envisions an adaptive database that mixes concurrency control, but does not include a general framework to address these challenges.

In this paper, we present a general mixed concurrency control scheme, CormCC (Coordination-free and Reconfigurable Mixed Concurrency Control), to systematically address all the aforementioned challenges. CormCC decomposes a database into partitions according to workload access patterns and assigns a specific protocol to each partition, such that each protocol processes the parts of a workload it is optimized for. We then develop several criteria to regulate the mixed execution of multiple forms of concurrency control to maintain ACID properties. We show that under reasonable assumptions, this method allows correct mixed execution without coordination across different protocols, which minimizes the mixing overhead. In addition, we develop a general protocol switching method (i.e., reconfiguration) to support changing protocols online while multiple protocols run together; the key idea is to compose a mediated protocol compatible with both the old and new protocols such that the switch does not have to stop all transaction workers, minimizing the impact on throughput and latency. To validate the efficiency and effectiveness of CormCC, we develop a prototype main-memory database for multi-core systems that supports mixed execution and dynamic switching of three widely used protocols: a single-threaded partition-based concurrency control (PartCC) from H-Store [10], an optimistic concurrency control (OCC) based on Silo [27], and a two-phase locking protocol based on VLL (2PL) [21].¹

The main contributions of this paper over our vision paper [25] are the following:

• A detailed analysis of the mixed concurrency control design space, and a general framework that is not limited to a specific set of protocols and mixes multiple forms of concurrency control without introducing extra overhead for coordinating conflicts across protocols.

• A general protocol switching method to reconfigure a protocol for parts of a workload without stopping the system or introducing significant overhead.

• A thorough evaluation of state-of-the-art mixed concurrency control approaches, CormCC's end-to-end performance over varied workloads, and the performance benefits and overhead of CormCC's mixed execution and online reconfiguration.

2 Related Work

With the increase of main-memory capacity and the number of cores in a single node, recent research focuses on improving traditional concurrency control protocols on modern hardware. We classify these works into optimizing single protocols, mixing multiple protocols, and adaptable concurrency control.

Optimizing Single Protocols: Many recent projects consider optimizing a single protocol, which we believe is orthogonal to this project. H-Store [8, 10] developed a partition-based concurrency control (PartCC) to minimize concurrency control overhead. The basic idea is to divide databases into disjoint partitions, where each partition is protected by a lock and managed by a single thread. A transaction can execute when it has acquired the locks of all partitions it needs to access. Exploiting data partitioning and the single-threaded model is also adopted by several research systems [8, 10, 32, 18, 19, 11]. Other projects propose new concurrency control protocols optimized for multi-core main-memory databases [6, 14, 27, 17, 35, 20, 13, 29, 30, 36, 12, 37, 16] or new data structures that remove the bottlenecks of traditional concurrency control [34, 21, 15, 9].

Mixing Multiple Protocols: Callas [31] and its successor Tebaldi [23] provide a modular concurrency control mechanism that groups stored procedures and provides an optimized concurrency control protocol for each group based on offline workload analysis; for transaction conflicts across groups, Callas and Tebaldi introduce one or more protocols, in addition to the per-group protocols, to resolve them. While stored-procedure-oriented protocol assignment can process conflicts within the same group more efficiently, the additional concurrency control overhead of executing both in-group and cross-group protocols can become the performance bottleneck for a main-memory database on a multi-core server; Section 3 gives a detailed discussion. In addition, grouping based on offline analysis assumes workload conflicts are known upfront, which may not be true in real applications. Our work differs from Callas and Tebaldi in that our mixed concurrency control execution does not introduce any extra concurrency control overhead, which greatly reduces the mixing overhead. Additionally, we do not assume a static workload or require any knowledge about workload conflicts in advance, but allow protocols to be chosen and reconfigured online in response to dynamic workloads. Some other projects exploit the mixed execution of 2PL and OCC [33, 6, 28, 22]. CormCC differs in that our framework is more general and can be extended to more protocols.

Adaptable Concurrency Control: Adaptable concurrency control has been studied in several research works.

¹Note that we use strong 2PL (SS2PL), but refer to it as 2PL.

At the hardware level, ProteusTM [7] adaptively switches across multiple transactional memory algorithms for different workloads, but cannot mix them to process different parts of a workload. Tai et al. [24] show the benefits of adaptively switching between OCC and 2PL. RAID [4, 3] proposes a general way to change a single protocol online. CormCC differs in that it allows multiple protocols to run in the same system and supports reconfiguring a protocol for parts of a workload. In this scenario, the presence of multiple protocols during reconfiguration presents the new challenges we address.

3 Design Choices

While combining multiple protocols in a single system can potentially allow more concurrency and improve overall throughput, it comes at the cost of higher concurrency control overhead. In this section, we discuss two key design choices that explore this trade-off and show how the trade-off motivates the design of CormCC. Specifically, we consider whether a record can be accessed (i.e., read/written) by multiple protocols and whether a single transaction involves multiple protocols. Given these design choices, the overhead can be defined as the cost of executing more than one protocol for each transaction plus the cost of synchronizing concurrent read/write operations across different protocols for each record in the database. Note that in this paper we assume all the transaction logic and the corresponding concurrency control logic of a transaction are executed by a single thread (i.e., a transaction worker), which is a common model in the design of mixed concurrency control [31, 23, 28, 22, 6]. We now discuss the four possible designs shown in Figure 1.

Figure 1: Design choices of mixed concurrency control: 1) one protocol per record/transaction; 2) one protocol per record, multiple protocols per transaction; 3) multiple protocols per record, one protocol per transaction; 4) multiple protocols per record/transaction.

One Protocol per Record and per Transaction: This is the simplest case, shown in the leftmost part of Figure 1. Each transaction (denoted by T) chooses only one protocol (denoted CC in Figure 1), and each record (denoted by R) can be accessed via only one protocol. While the mixing overhead is minimal (by our overhead definition), this design has very limited applicability, since each transaction can only access the partition of records managed by a specific protocol. To the best of our knowledge, no previous work adopts it.

One Protocol per Record, Multiple Protocols per Transaction: This design additionally allows one transaction to execute multiple protocols (the second design in Figure 1); it provides the flexibility that transactions can access any record, while a specific protocol processes all accesses to each record according to its access pattern. On the other hand, the execution of a single transaction may involve a larger set of instructions from multiple protocols, which makes the CPU instruction cache less efficient. However, according to our experiment in Section 6.5, this mixing overhead is very low. MOCC [28] adopts this design to mix OCC and 2PL, but does not have a general framework.

Multiple Protocols per Record, One Protocol per Transaction: An alternative design (the third in Figure 1) is that each record can be accessed via multiple protocols while each transaction executes one protocol. This design is useful when the semantics of a subset of transactions (e.g., from stored procedures) can be leveraged by an optimized protocol. It raises a problem, however: the co-existence of multiple protocols on the same set of records must be carefully synchronized. One solution is to let all protocols share the same set of concurrency control metadata and carefully design each protocol such that all concurrent accesses to the same records can be synchronized without introducing additional coordination. This solution requires specialized design for all protocols and thus is limited in its applicability. Prior works Hekaton [6] and HSync [22] adopt this design, but can only combine 2PL and OCC.

Multiple Protocols per Record and per Transaction: This design (the fourth in Figure 1) provides the most fine-grained and flexible mixed concurrency control; each transaction can mix multiple protocols, and each record can be accessed via different protocols. To process concurrent accesses from different protocols over the same records, additional protocols are introduced. For example, Callas [31] and its successor Tebaldi [23] organize protocols into a tree, where the protocols in leaf nodes process conflicts of the assigned transactions and the protocols in interior nodes process conflicts across their children. The overhead here is that for each record access, multiple pieces of concurrency control logic must be executed to resolve conflicts across the different protocols over that record. Such overhead, as we show in Section 6.3, can become a performance bottleneck in a multi-core main-memory database.

Summary: The above discussion (and our experiments) shows that the second design strikes a good trade-off between leveraging the performance benefits of single protocols and minimizing mixing overhead: it lets each protocol exclusively process the accesses of a subset of records to minimize synchronization cost, while allowing protocols to be mixed within a transaction for flexibility. CormCC draws on this spirit and builds a general, coordination-free framework.

4 CormCC Design

We consider a main-memory database with multiple forms of concurrency control on a multi-core machine. Each table includes one primary index and zero or more secondary indices. A transaction can be composed of read, write (i.e., update), delete, and insert operations that access the database via either primary or secondary indices in a key-value way. The database is (logically) partitioned with respect to the candidate protocols within the system; that is, each partition is assigned a single protocol. Each protocol maintains an independent set of metadata for all records. For all operations on the records of a partition, the associated protocol executes its own concurrency control logic to process these operations (e.g., preprocess, reading/writing, commit), which we refer to as the protocol managing this partition. We use a concurrency control lookup table to store the mapping from primary keys to protocols. The lookup table maintains the mapping for the whole key space and is shared by all transaction workers. Prior work [26] shows that such a lookup table can be implemented in a memory-efficient and fast way, so it will not be the performance bottleneck of CormCC. CormCC regards secondary indices as logically additional tables storing entries that point to primary keys (not pointers to records); it adopts a dedicated protocol (e.g., OCC) to process all concurrent operations over secondary indices. Transactions are routed to a global pool of transaction workers, each of which is a thread or a process occupying one physical core. A worker executes its transaction to the end (i.e., commit or abort) without interruption. We additionally use a coordinator to manage online protocol reconfiguration for all transaction workers. It periodically collects statistical information from all workers, builds a new lookup table accordingly, and finally lets all workers use it. We first outline the CormCC protocol and then show the correctness of CormCC. After that, we discuss online protocol reconfiguration within CormCC.

4.1 CormCC Protocol

CormCC divides a transaction's life cycle into four phases: Preprocess, Execute transaction logic (Execution), Validation, and Commit. We adopt this four-phase model because most concurrency control protocols fit into it. We use the transaction execution example in Figure 2 to explain the four phases.

Figure 2: An example of CormCC execution (operations OP1 through OPn dispatched across candidate protocols CC1, CC2, and CC3 over the Preprocess, Execution, and Validation phases).

Preprocess: The preprocess phase executes concurrency control logic that must run before the transaction logic. Figure 2 shows that CormCC iterates over all candidate protocols (denoted CC) and executes their preprocess phases respectively. A typical preprocess phase initializes protocol-specific metadata. For example, 2PL Wait-die acquires a transaction timestamp to determine the relative order of concurrent transactions, and a partition-based single-threaded protocol (e.g., PartCC) acquires locks in a predefined order for the partitions the transaction needs to access.

Execution: Transaction logic is executed in this phase. As shown in Figure 2, for each operation (denoted OP) issued by the transaction, CormCC first finds the protocol managing the record using the concurrency control lookup table. Then, CormCC uses that protocol's concurrency control logic to process the operation. For example, if the transaction reads an attribute of a record managed by 2PL, it acquires the record's read lock and returns the attribute's value. Note that insert operations can find the corresponding protocol in the lookup table even though the record to be inserted is not yet in the database, because the table stores the mapping for the whole key space.

Validation: The validation phases of all protocols are executed sequentially in this phase. If the transaction passes all validations, it enters the Commit phase; otherwise, CormCC aborts it. For example, if OCC is involved, CormCC executes its validation to verify whether records read by OCC during Execution have been modified by other transactions. If so, the transaction is aborted; otherwise, it passes this validation.

Commit: Finally, CormCC begins an atomic Commit phase by executing the commit phases of all protocols. For example, a typical commit phase, like OCC's, applies all writes to the database and makes them visible to other transactions by releasing all locks. Note that an abort can happen in the Execution or Validation phase, in which case CormCC calls the abort functions of all protocols respectively.

4.2 Correctness

We first show the criteria for CormCC to generate a serializable schedule for a set of concurrent transactions (i.e., a result equivalent to some serial history of these transactions) to guarantee the consistency of the database.
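To make the phase structure concrete, the following is a minimal single-threaded sketch of the four-phase execution loop. The class names, method signatures, and the `lookup` dictionary standing in for the concurrency control lookup table are our own illustration under the paper's assumptions, not CormCC's actual implementation.

```python
from dataclasses import dataclass

class Protocol:
    """Minimal candidate-protocol interface (hypothetical names)."""
    def preprocess(self, txn): pass        # e.g., take timestamps or partition locks
    def access(self, txn, key, op): pass   # per-record Execution logic
    def validate(self, txn): return True   # e.g., OCC read-set validation
    def commit(self, txn): pass            # apply writes, release locks
    def abort(self, txn): pass

@dataclass
class Txn:
    ops: list  # [(primary_key, "r" or "w"), ...]

def run_transaction(txn, protocols, lookup):
    """Preprocess -> Execution -> Validation -> Commit.

    `lookup` maps every primary key to the protocol managing its
    partition, standing in for the concurrency control lookup table."""
    for p in protocols:                    # Preprocess phase of every candidate
        p.preprocess(txn)
    for key, op in txn.ops:                # Execution: dispatch each op by record
        lookup[key].access(txn, key, op)
    for p in protocols:                    # Validation phases run sequentially
        if not p.validate(txn):
            for q in protocols:            # failed validation: abort everywhere
                q.abort(txn)
            return False
    for p in protocols:                    # atomic Commit across all protocols
        p.commit(txn)
    return True
```

The key point the sketch captures is that per-record dispatch happens only in Execution; the whole-transaction phases simply iterate over the candidate protocols, with no cross-protocol coordination.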

We then discuss how CormCC avoids deadlocks, and show that CormCC is recoverable (i.e., no committed transaction has read data written by an aborted transaction).

To guarantee that CormCC is serializable, we require that all candidate protocols are commit-ordering conflict serializable (COCSR). This means that if two transactions ti and tj conflict on a record r (i.e., ti and tj read/write r and at least one of them writes) and tj depends on ti (i.e., ti accesses r before tj), then ti must commit earlier than tj.

Given that all candidate protocols are COCSR, we now show that CormCC is also COCSR. Suppose two transactions ti and tj conflict on a record set R; we consider two cases. First, if for every conflicted record r ∈ R the conflicting operation of ti accesses r before tj, then tj depends on ti. Since every candidate protocol is COCSR, for each conflicted record r it ensures that ti commits before tj. Therefore, in this case CormCC maintains the COCSR property. In the second case, there exist two records r1, r2 ∈ R such that ti accesses r1 before tj and tj accesses r2 before ti. Then tj depends on ti and ti depends on tj. Here, r1 and r2 must be managed by two separate protocols p1 and p2, since each single protocol is COCSR and a conflict cycle cannot form within one protocol. Consider p1, which manages r1. Because it is COCSR, it enforces that ti commits earlier than tj, since tj depends on ti based on their conflict on r1. On the other hand, p2 enforces that tj commits earlier than ti, since ti depends on tj based on their conflict on r2. Therefore, there is no valid commit order for ti and tj that satisfies both constraints; committing both transactions is impossible in this case, and the COCSR property of CormCC is maintained. Finally, since COCSR is a sufficient condition for conflict serializability [2], CormCC is conflict serializable.

Deadlock Avoidance: While each individual protocol can provide mechanisms to avoid or detect deadlocks, mixing them in CormCC without extra regulation may not make the system deadlock-free. One potential solution is a global deadlock detection mechanism; however, this contradicts our spirit of not coordinating candidate protocols. Alternatively, we examine the causes of deadlock and find that, under reasonable assumptions, CormCC can mix single protocols without coordination. Specifically, a deadlock can happen within a single phase or across phases when two transactions wait for each other due to conflicts on records managed by separate protocols. Figure 3 shows two such cases. In the first case, a single-phase deadlock happens because both protocols CC1 and CC2 can make transactions T1 and T2 wait within one phase. In the second case, T1 waits for T2 in the Execution phase due to CC1, while additional conflicts make T2 wait for T1 based on CC2 in the Validation phase. Such a deadlock is possible because a protocol (e.g., CC2) can introduce conflicts across phases; that is, the conflicts introduced by T1 in Execution are detected by T2 in Validation.

CormCC avoids the two cases by requiring that each protocol make transactions wait due to conflicts in no more than one phase, and that in each phase only one protocol make transactions wait because of conflicts. If CormCC meets these two criteria and each protocol can avoid or detect its own deadlocks, then CormCC is deadlock-free.

Recoverable: To guarantee that CormCC is recoverable, we require that each candidate protocol is strict, which means that a record modified by a transaction is not visible to concurrent transactions until the transaction commits. This ensures transactions never read dirty writes (i.e., uncommitted writes) and thus are recoverable [2]. Since each record written by a transaction is managed by a single strict protocol, the execution of CormCC never reads dirty writes and is recoverable.

Supported Protocols: A candidate protocol incorporated in CormCC should be COCSR and strict. A wide range of protocols meets these criteria, including traditional 2PL and OCC [2], VLL [21], Orthrus [20], PartCC from H-Store [10], and Silo [27]. To enforce the deadlock-freedom criteria, CormCC requires each protocol to specify the phases in which it makes transactions wait; based on these specifications, it is easy to detect whether the above criteria hold for a given set of protocols.
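The criteria check just described can be mechanical. Below is a small sketch under the paper's phase model; the function name and the input format (each protocol mapped to its declared conflict-wait phases) are our own, hypothetical choices.

```python
def deadlock_free_mix(wait_phases):
    """Check CormCC's two deadlock-freedom criteria, given each candidate
    protocol's declared conflict-wait phases, e.g. {"PartCC": ["Preprocess"]}.

    Criterion 1: a protocol makes transactions wait in at most one phase.
    Criterion 2: in each phase, at most one protocol makes transactions wait.
    """
    claimed = set()
    for protocol, phases in wait_phases.items():
        if len(phases) > 1:          # waits in more than one phase
            return False
        for phase in phases:
            if phase in claimed:     # two protocols wait in the same phase
                return False
            claimed.add(phase)
    return True
```

For instance, `deadlock_free_mix({"PartCC": ["Preprocess"], "2PL": ["Execution"], "OCC": ["Validation"]})` holds, while adding a second lock-based protocol that also waits in Execution would fail the check.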

Figure 3: Examples of deadlock across protocols.

4.3 Online Reconfiguration

Online reconfiguration switches the protocol for a subset of records without stopping all transaction workers. CormCC uses a coordinator to manage this process. First, the coordinator periodically collects statistical information from all workers and generates a new concurrency control lookup table. When a worker completes a transaction, it adopts the new lookup table, while workers that have not finished their current transactions still use the old one. After all workers use the new table, the old one is deleted. We now introduce the challenge of supporting such an online protocol switch.

Challenge: The potential problem is that some transaction workers may use the new lookup table while others

USENIX Association 2018 USENIX Annual Technical Conference 813 W W W1 T1 Read Validate Commit Switch T3 Write W1 W2 1 2 Switch OCC 2PL T T3 2 OPCC R1 OCC T 2PL OCC OCC 5 Switch T1 Switch R1 R1 W2 T2 Write Read Read Time Time

are using the old one. Figure 4 shows an example of this problem. Consider two workers, W1 and W2, which execute transactions T1 and T2 respectively. Assume that both T1 and T2 access a record R1 that is managed by OCC. During their execution, the coordinator announces that the protocol managing R1 should be switched to 2PL. Therefore, after T1 finishes, W1 checks this message, performs the switch (i.e., using 2PL to access R1), and starts a new transaction T3. Since OCC and 2PL maintain different sets of metadata, T3 and T2 are not aware of the conflict between T2 reading R1 and T3 writing R1, which may leave the database in an inconsistent state.

Figure 4: Problems during protocol reconfiguration.

Mediated Switching: To address this issue, we propose mediated switching; it adopts a mediated protocol that is compatible with both the old and new protocols. During reconfiguration, the coordinator lets all workers asynchronously change the old protocol to the mediated one; after all workers use the mediated protocol, the coordinator then informs them to adopt the new protocol.

We compose a mediated protocol that executes the concurrency control logic of both the old and new protocols. Specifically, a mediated protocol first executes the Preprocess logic of both protocols; then it enters the Execution phase, where for each record access it executes the Execution logic of both protocols. The mediated protocol's Validation, Commit, and Abort likewise execute the corresponding logic of both protocols. While it is easy to compose a protocol that executes the logic of both protocols, one problem is how to unify the different ways protocols apply modifications (i.e., insert/delete/write). There are two ways to apply modifications: in-place modification during execution, and lazy modification during the commit phase. In the mediated protocol, we always opt for lazy modification, which means storing modifications in a local buffer during execution and applying them when the transaction commits. Since all protocols are strict, deferring the actual modification to the commit phase does not violate the correctness of the protocols. For example, 2PL performs in-place writes, while OCC writes the new value into a local buffer and applies it in the commit phase. In our mediated protocol, we choose the OCC approach to applying writes.

We use the example in Figure 5 to illustrate this process, where we need to switch the protocol managing R1 from OCC to 2PL. The mediated protocol here executes the logic of both OCC and 2PL (denoted OPCC). OPCC adopts the following logic:

• If OPCC reads a record, it acquires a read lock (2PL) and reads the record's value and timestamp into the read set (OCC).

• If OPCC writes a record, it acquires a write lock (2PL) and stores the record along with the new data in the write set (OCC).

• In the validation phase, it locks all records in the write set (e.g., for the critical section of Silo OCC)² and then validates the read set using OCC logic.

• In the commit phase, it applies all writes and releases the locks acquired by OCC and 2PL respectively.

The switch via the mediated protocol is composed of two phases: upgrade and degrade. The protocol switch is initiated when the coordinator finds that the protocol for a record set RS should be changed. It starts the upgrade phase by issuing a message to all workers to let them switch the protocol for RS to the mediated protocol. Each transaction worker checks for this message between transactions and acknowledges it to the coordinator. During this asynchronous process, workers that have received the message access RS using the mediated protocol, while other workers may access RS using the old protocol, which happens when they are running transactions that started before the switch. The left part of Figure 5 shows an example of the upgrade phase. Worker W2 finished T2 first; thus, it begins to access R1 using OPCC (i.e., in transaction T3). At the same time, T1 still accesses R1 using OCC. According to the OPCC description above, the conflicts on R1 from W1 and W2 can be serialized because OPCC runs the full logic of OCC. After all workers acknowledge the switch for RS, the degrade phase begins with the coordinator messaging workers about the degrade to the new protocol. Workers then use either the mediated protocol or the new protocol, and serializability is guaranteed by the new protocol logic (e.g., 2PL in our example) that runs in both execution modes. The right part of Figure 5 shows an example of the degrade phase, where the conflicts over R1 can be serialized because OPCC also executes the full logic of 2PL.

Figure 5: An example of mediated switching (upgrade from OCC to OPCC, then degrade from OPCC to 2PL).

²Note that OCC and 2PL use different sets of metadata; no deadlock can exist between them.
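As an illustration of the OPCC logic above, here is a simplified, single-threaded sketch. The timestamp-per-record store, the per-transaction local lock view, and all names are our own simplifications, not the paper's code; a real implementation would block on a shared lock table instead of merely recording locks locally.

```python
class OPCC:
    """Mediated-protocol sketch running both 2PL and OCC logic.

    `store` maps key -> (value, timestamp); writes are buffered
    OCC-style and applied lazily at commit."""
    def __init__(self, store):
        self.store = store
        self.locks = {}       # key -> "r" or "w"  (2PL state, local view)
        self.read_set = {}    # key -> timestamp at read time (OCC state)
        self.write_set = {}   # key -> buffered new value (lazy modification)

    def read(self, key):
        self.locks.setdefault(key, "r")       # 2PL: acquire read lock
        value, ts = self.store[key]
        self.read_set[key] = ts               # OCC: remember the timestamp
        return value

    def write(self, key, value):
        self.locks[key] = "w"                 # 2PL: acquire/upgrade write lock
        self.write_set[key] = value           # OCC: buffer, apply at commit

    def commit(self):
        # OCC validation: read-set timestamps must be unchanged
        if any(self.store[k][1] != ts for k, ts in self.read_set.items()):
            self.locks.clear()                # abort: release locks, drop buffer
            return False
        for k, v in self.write_set.items():   # apply lazy writes, bump timestamps
            self.store[k] = (v, self.store[k][1] + 1)
        self.locks.clear()                    # release all 2PL and OCC locks
        return True
```

Because OPCC maintains both protocols' metadata on every access, a transaction running plain OCC (or plain 2PL) on the same records still observes the conflicts it expects, which is what makes the asynchronous upgrade and degrade phases safe.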

5 Prototype Design

We build a prototype main-memory multi-core database that can dynamically mix PartCC from H-Store [10], OCC from Silo [27], and 2PL from VLL [21] using CormCC. PartCC partitions the database and associates an exclusive lock with each partition. Every transaction first acquires all locks for the partitions it needs to read or write, in a predefined order, before the transaction logic is executed. Then, the transaction is executed by a single thread to the end without additional coordination. In Silo OCC, each record is assigned a timestamp. During transaction execution, Silo OCC tracks read/write operations, storing the records read by the transaction along with their associated timestamps in a local read set, and all writes in a local write set. In the validation phase, Silo OCC locks the write set and uses the timestamps to validate whether the records in the read set have changed. If the validation succeeds, the transaction commits; otherwise, it aborts. VLL is an optimized 2PL that co-locates each lock with its record to remove the contention of a centralized lock manager.

Our prototype partitions primary indices and corresponding records using an existing partitioning algorithm [5]. Each partition is assigned a transaction worker (i.e. thread or process), and each worker is only assigned transactions that access some data in its partition (the base partition), although these transactions may also access data in other partitions (remote partitions). CormCC selects a protocol for each partition according to its access pattern. A partition routing table maintains the mapping from primary keys to partition numbers and the corresponding protocols. We use stored procedures, i.e. sets of predefined and parameterized SQL statements, as the primary interface. Stored procedures provide a quick mapping from database operations to the corresponding partitions by annotating the parameters that identify the base partition that executes the transaction and the other involved partitions [10].

The mixed execution of the three protocols starts with the Preprocess phase, where PartCC acquires all partition locks in a predefined order. Transactions may wait in this phase because of partition lock requests. Next, transactions enter Execution, where CormCC uses PartCC, OCC, and 2PL to process record access operations according to the partition each record belongs to. Note that only 2PL makes transactions wait in this phase, so transactions in the Execution phase never wait for those blocked in Preprocess, which means deadlocks across PartCC and 2PL are impossible. After the Execution phase, Validation begins, and OCC acquires write locks and validates the read set [27]. Only OCC requires transactions to wait here; thus, they are not blocked by previous phases. Finally, there is no waiting in the commit phase; all protocols apply writes and release all locks.

We show that such mixed execution is correct. First, for PartCC and 2PL, if a transaction ti conflicts with another transaction tj, ti cannot proceed until tj commits, or vice versa, so PartCC and 2PL are COCSR. Shang et al. [22] have proven that Silo OCC is also COCSR. Therefore, their mixed execution using CormCC maintains COCSR. In addition, each of PartCC, 2PL, and OCC can independently either avoid or detect deadlocks, and each makes transactions conflict-wait in only one mutually exclusive phase among Preprocess, Execution, and Validation. Thus, their mixed execution can also prevent or detect deadlocks. Finally, all protocols are strict, and so is their mixed execution.

To enable dynamic protocol reconfiguration, we build two binary classifiers to predict the ideal protocol for each partition. A detailed discussion of the classifiers is presented in a technical report [1].

6 Experiments

We now evaluate the effectiveness of mixed execution and online reconfiguration in CormCC. Our experiments answer four questions: 1) How does CormCC perform compared to state-of-the-art mixed concurrency control approaches (Section 6.3)? 2) How does CormCC adaptively mix protocols under workloads that vary over time (Section 6.4)? 3) What is the performance benefit and overhead of mixed execution (Section 6.5)? 4) What is the performance benefit and overhead of online reconfiguration (Section 6.6)?

All experiments are run on a single server with four NUMA nodes, each of which has an 8-core Intel Xeon E7-4830 processor (2.13 GHz), 64 GB of DRAM, and 24 MB of shared L3 cache, yielding 32 physical cores and 256 GB of DRAM in total. Each core has a private 32 KB L1 cache and 256 KB L2 cache. We disable hyperthreading so that each worker occupies a physical core. To eliminate network client latency, each worker combines a client transaction generator.

6.1 Prototype Implementation

We develop a prototype based on Doppel [17], an open-source multi-core main-memory transactional database. Clients issue transaction requests using predefined stored procedures, where all parameters are provided when a transaction begins, and transactions are executed to completion without interacting with clients. Stored procedures issue read/write operations using interfaces provided by the prototype. Each transaction is dispatched to a worker that runs the transaction to the end (commit or abort).

Workers access records via key-value hash tables. Each worker thread occupies a physical core and maintains its own memory pool to avoid memory allocation contention across many cores [34]. A coordinator thread extracts features for the prediction classifiers from statistics collected by workers and predicts the ideal protocol to be used. Our prototype supports automatically selecting PartCC [10], Silo OCC [27], or No-Wait VLL [21] for each partition. Note that we have compared the 2PL variants No-Wait and Wait-Die, and find that 2PL No-Wait performs best in most cases because of the lower synchronization overhead of lock management [20]. We do not implement logging in CormCC since prior work shows that logging is not a performance bottleneck [17, 38]. Our comparison additionally includes a general mixed concurrency control framework based on Tebaldi [23] (denoted as Tebaldi) and a hybrid approach of OCC and 2PL [22, 28] (denoted as Hybrid) that adopts locks to protect highly conflicted records but uses validation for the rest. We statically tune the set of highly conflicted records to give Hybrid the highest throughput. Specifically, for our highly conflicted workload we protect the 1000 most conflicted records in each partition, and for our lowly conflicted workload no records are locked and we use OCC for all of them.

6.2 Benchmarks & Experiment Settings

We use YCSB and TPC-C in our experiments. We generate one table for YCSB that includes 10 million records, each with 25 columns and 20 bytes per column. Transactions are composed of mixed read and read-modify-write operations. The partitioning of YCSB is based on hashing its primary keys. TPC-C simulates an order processing application. We generate 32 warehouses and partition the database according to warehouse IDs, except the Item table, which is shared by all workers. We use the full mix of five stored procedures.

To generate varied workloads, we tune three parameters. The first is the percentage of cross-partition transactions, ranging from 0 to 100; we set the number of partitions a cross-partition transaction accesses to 2. The second is the mix of stored procedures for a workload. Note that YCSB only has one stored procedure, so we tune the number of operations per transaction and the ratio of read operations. Finally, we vary data access skewness. We use a Zipf distribution to generate the record access distribution within a partition; its theta can be varied from 0 to 1.5. For TPC-C this means that, within a partition determined by WarehouseID, we skew record access for the related tables. We also vary these parameters to train our classifiers for protocol prediction. The detailed configuration for training the classifiers is given in [1].

6.3 Comparison with Tebaldi

Tebaldi [23] is a general mixed concurrency control framework that groups stored procedures according to their conflicts and builds a hierarchy of concurrency control protocols to address in-group and cross-group conflicts. Figure 6 shows a three-layer configuration for TPC-C. NewOrder (NO) and Payment (PM) are in the same group, and their conflicts are managed by Runtime Pipeline (RL), an optimized 2PL that leverages the semantics of stored procedures to pipeline conflicting transactions. Delivery (DEL) alone is in another group, also run by Runtime Pipeline. Conflicts between the two groups are processed by 2PL. OrderStatus (OS) and StockLevel (SL) are executed in a separate group; their conflicts with the rest of the groups are processed by Serializable Snapshot Isolation (SSI).

Figure 6: A three-layer configuration of Tebaldi
Figure 7: Comparison under different partitionability
Figure 8: Comparison under different conflicts

For every operation issued by a transaction, Tebaldi needs to execute all concurrency control logic from the root node to the leaf node to delegate the conflicts to specific protocols. For example, any operation issued by NewOrder needs to go through the logic of SSI, 2PL, and RL. While this approach can process conflicts in a fine-grained way, the overhead of running multiple protocols for every operation can limit performance. Another restriction of Tebaldi is that it relies on the semantics of stored procedures to assign protocols. For workloads that include data-dependent behaviour (e.g. hot keys or affinity between keys) and that lack stored procedures with rich semantics (i.e. YCSB in our test), Tebaldi can only use one protocol to process the whole workload. In our test, we use this three-layer configuration for Tebaldi, which has the best performance for TPC-C [23]. Note that we use the optimized Runtime Pipeline reported in [31], which can eliminate the conflicts between Payment and NewOrder.

We first compare CormCC with Tebaldi, implemented in our prototype, using TPC-C over a mix of well-partitionable and non-partitionable workloads.
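The root-to-leaf delegation that Tebaldi performs for every operation can be sketched as follows. The `Layer` class is a hypothetical stand-in that only counts invocations; real layers would run SSI, 2PL, or Runtime Pipeline logic. The paths mirror the three-layer TPC-C configuration of Figure 6.

```python
# Sketch of Tebaldi-style hierarchical delegation: every operation issued
# by a stored procedure runs the concurrency-control logic of every layer
# on its root-to-leaf path, so conflicts are delegated to the most
# specific protocol, but every access pays for all layers above it.

class Layer:
    def __init__(self, name):
        self.name = name
        self.ops_seen = 0

    def process(self, op):
        # A real layer would lock, validate, or pipeline here.
        self.ops_seen += 1

# Root-to-leaf paths from the three-layer TPC-C configuration (Figure 6).
ssi, twopl, rl = Layer("SSI"), Layer("2PL"), Layer("RL")
PATHS = {
    "NewOrder":    [ssi, twopl, rl],
    "Payment":     [ssi, twopl, rl],
    "Delivery":    [ssi, twopl, rl],
    "OrderStatus": [ssi],
    "StockLevel":  [ssi],
}

def execute_op(procedure, op):
    for layer in PATHS[procedure]:  # every layer's logic runs per operation
        layer.process(op)

execute_op("NewOrder", ("read", "warehouse", 1))
execute_op("OrderStatus", ("read", "order", 7))
assert ssi.ops_seen == 2 and twopl.ops_seen == 1 and rl.ops_seen == 1
```

The per-operation cost of a NewOrder access is three layers of concurrency control logic, which is the overhead the comparison below measures.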

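By contrast, CormCC routes each access through the partition routing table described in Section 5, so exactly one protocol's logic runs per operation. The sketch below is a simplified illustration: the hash partitioning and the contents of the protocol table are invented for the example, standing in for the classifier-driven selection.

```python
# Sketch of CormCC's coordination-free dispatch: the partition routing
# table maps a primary key to its partition and each partition to the
# single protocol that currently owns it.

N_PARTITIONS = 32

def partition_of(key):
    # YCSB-style hash partitioning on the primary key (integers hash to
    # themselves in CPython, keeping this sketch deterministic).
    return hash(key) % N_PARTITIONS

# Hypothetical per-partition protocol choices, as the routing table would
# record after classifier-driven selection.
protocol_table = {p: "PartCC" for p in range(N_PARTITIONS)}
protocol_table[3] = "OCC"
protocol_table[7] = "2PL"

calls = []

def access(key, op):
    # Exactly one protocol's logic runs per record access.
    proto = protocol_table[partition_of(key)]
    calls.append((proto, op))
    # ...dispatch to that protocol's read/write implementation here.

access(35, "read")   # key 35 -> partition 3 -> OCC
access(7, "write")   # key 7  -> partition 7 -> 2PL
assert [p for p, _ in calls] == ["OCC", "2PL"]
```

Changing a partition's entry in `protocol_table` is how a reconfiguration takes effect once the mediated switch completes.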
Figure 9: Holistic test for CormCC under YCSB varied workloads over time
Figure 10: Holistic test for CormCC under TPC-C varied workloads over time
Figure 11: CormCC throughput ratio to single protocols for YCSB
Figure 12: CormCC throughput ratio to single protocols for TPC-C

We partition the database into 32 warehouses and start our test with a well-partitionable workload (i.e. each partition receives 100% single-partition transactions), and then increase the number of non-partitionable warehouses (i.e. each receives 100% cross-partition transactions) in steps of 4. Throughout this test, we use the default transaction mix of TPC-C. Since both Tebaldi and CormCC use 2PL as a candidate protocol, our test additionally includes the results of 2PL.

Figure 7 shows the performance results of the three approaches. CormCC first adopts PartCC and then proceeds to mix PartCC and 2PL for workloads with mixed partitionability. When the workload becomes non-partitionable, CormCC uses 2PL. We see that CormCC always performs better than Tebaldi and 2PL because it can leverage the partitionable workloads. Tebaldi always performs slightly worse than 2PL because of its concurrency control overhead from multiple protocols. While such overhead is not substantial in a distributed environment, as shown in the original paper [23], it can become a bottleneck in a main-memory multi-core database due to the elimination of network I/O operations.

To highlight Tebaldi's performance benefits of efficiently processing conflicts, we increase the access skewness within each warehouse by varying the theta of the Zipf distribution from 0 to 1.5 in steps of 0.3. Here, we choose 16 warehouses as partitionable and the rest as non-partitionable. Figure 8 shows that the throughput of all protocols increases at first, because more access skewness introduces better access locality and improves CPU cache efficiency. Then, high conflicts dominate the performance and the throughput decreases for all protocols. We see that with higher conflicts Tebaldi gradually outperforms 2PL, and it suffers less throughput loss in the workload with very high conflicts.

These tests show that while Tebaldi can efficiently process conflicts, it comes with non-trivial concurrency control overhead. In addition, Tebaldi needs to know the conflicts of a workload a priori so that it can use static analysis [23] to make an efficient configuration offline. In contrast, CormCC mixes protocols with minimal overhead, requires no knowledge of conflicts beforehand, and can dynamically choose protocols online.

6.4 Tests on Varied Workloads

We evaluate the holistic benefits of CormCC by running YCSB with randomized benchmark parameters every 5 seconds. We compare the same randomized run (i.e. same parameters at each interval) with fixed protocols and Hybrid. CormCC collects features every second and concurrently selects the ideal protocol for each partition. We randomly vary five parameters: i) read rate, chosen from 50%, 80%, and 100%; ii) number of operations per transaction, selected between 15 and 25; iii) theta of Zipf for data access distribution, chosen from 0, 0.5, 1, and

Figure 13: Measuring partitionability
Figure 14: Measuring read/write ratio
Figure 15: Testing the overhead of mixed execution

1.5; iv) the number of partitions that have cross-partition transactions (with the remaining partitions well-partitionable), randomly chosen from 0 to 32 in steps of 4; v) the percentage of cross-partition transactions for the partitions in (iv), randomly selected between 50% and 100%.

The test starts with a well-partitionable workload of 80% read rate, 15 operations per transaction, and a uniform access distribution (i.e. theta = 0). Figure 9 shows the test results at every 2 seconds for 100 seconds in total. We see that in almost all cases CormCC can either choose the best protocol or find a mixed execution that outperforms every candidate protocol and the Hybrid approach, while not experiencing long periods of throughput degradation due to switching. CormCC can achieve at most 2.5x, 1.9x, 1.8x, and 1.7x the throughput of PartCC, OCC, 2PL, and Hybrid respectively.

We additionally test the performance variations of CormCC under randomized varied workloads of TPC-C. We partition the database into 32 warehouses, and have each worker collect features every second and select the ideal protocol for each warehouse at runtime. We permute four parameters to generate varied workloads. First, we randomly select a transaction mix from 10 candidates, where one is the default transaction mix of TPC-C and the other nine are randomly generated. Then, we vary three parameters — record skew for the related tables, the number of warehouses that have cross-partition transactions, and the percentage of cross-partition transactions — in the same way as in the YCSB test. We report the results at every 2 seconds in Figure 10. The test starts with the well-partitionable default transaction mix of TPC-C and varies the workload every 5 seconds. We see behaviour similar to Figure 9, where CormCC almost always performs best. In this test, CormCC can achieve at most 2.8x, 2.4x, 1.7x, and 1.8x the throughput of PartCC, OCC, 2PL, and Hybrid respectively.

We then report the ratios of the mean throughput of CormCC (after protocol switching) to that of the worst and best single protocols (labeled max and min respectively) for each varied workload (i.e. every 5s) of both benchmarks in Figure 11 and Figure 12. We additionally report the ratio of the mean throughput of CormCC to the average throughput of the three single fixed protocols (labeled avg) for each varied workload. We see that the highest ratio CormCC can achieve for YCSB and TPC-C is 2.2x and 2.6x respectively. For 55% of the YCSB workloads and 85% of the TPC-C workloads, the average ratio is at least 1.2x. The lowest ratio in the YCSB and TPC-C tests is 0.91x and 0.94x respectively, which means that for the two benchmarks CormCC achieves at least 91% and 94% of the throughput of the best protocol even when it selects a wrong protocol for some partitions. These results show that CormCC can achieve significant performance gains over running a wrongly selected single protocol for a workload, can improve throughput over single protocols for a wide range of varied workloads, and is robust to dynamic workloads.

6.5 Evaluating Mixed Execution

In this subsection, we first evaluate the performance benefits of CormCC over single protocols and the Hybrid approach, and then test the overhead of CormCC.

We first show how mixing well-partitionable and non-partitionable workloads based on the YCSB benchmark influences the performance of CormCC relative to other protocols. We partition the database into 32 partitions, and start our test with a well-partitionable workload and then increase the number of non-partitionable partitions in steps of 4. In this test, each transaction includes 80% read operations. For these tests, we use transactions consisting of 20 operations and skew record access within each partition using a Zipf distribution with theta = 1.5.

Figure 13 shows that CormCC always performs best because it starts with PartCC and then adaptively mixes PartCC for the partitionable part of the workload and 2PL for the highly conflicted non-partitionable counterpart. Compared to CormCC, the performance of PartCC degrades rapidly due to high partition conflicts, and the other protocols cannot take advantage of partitionable workloads.

We then test workloads with an increasing percentage of read operations. Initially, operations accessing each partition include 80% reads; we increase the number of partitions receiving 100% read operations in steps of 4. In this test, our workload includes 16 partitions having 100% cross-partition transactions among them, while the others are only accessed by single-partition transactions. Figure 14 shows that CormCC has remarkable

throughput improvement over the other protocols by combining the benefits of PartCC, OCC, and 2PL. Specifically, CormCC first mixes PartCC and 2PL, and then applies OCC for non-partitionable and read-only partitions. While Hybrid can adaptively mix OCC and 2PL, it is sub-optimal because it fails to leverage the benefits of PartCC. In these tests, the speed-ups of CormCC over PartCC, OCC, 2PL, and Hybrid can be up to 3.4x, 2.2x, 1.9x, and 2.0x respectively.

Next, we test the overhead of CormCC. We first execute transactions using CormCC and track the percentage of each transaction's operations executed on records owned by a specific protocol (e.g. 1/2 of the transaction's records use OCC and 1/2 use 2PL). With the percentages collected, we execute a mix of transactions where a corresponding percentage of the transactions are executed exclusively on a single protocol (e.g. 1/2 of the transactions are only OCC and 1/2 are only 2PL), and compare the throughputs of the two approaches. Note that to test the overhead without involving the performance advantages of CormCC over single protocols, we use a single core to execute all transactions.

Figure 15 shows a micro-benchmark to evaluate mixed execution overhead. We execute 50,000 transactions, each having 20 operations, with 50% read operations and the rest read-modify-write operations. Key access distribution is uniform. The dataset is partitioned into 32 partitions; 10 of them are managed by OCC, 10 by 2PL, and 12 by PartCC. The "mixed protocols" results show the average throughput of CormCC with different percentages of operations executed by different protocols; the "single protocols" results show the average throughput of a single protocol (e.g. 100% of transactions use OCC), or of using single protocols to exclusively execute a corresponding percentage of transactions. We find that our method has roughly the same throughput as a mix of single protocols, which shows that the overhead of mixed concurrency control is minimal in CormCC. This is largely because we do not add extra metadata operations to synchronize conflicts across protocols.

6.6 Evaluating Mediated Switching

To evaluate the performance benefits and overhead of mediated switching (denoted as Mediated), we compare it with a method that stops all protocol execution before applying the new protocol (denoted as StopAll). In our test, we perform a protocol switch from OCC to 2PL using five YCSB workloads with a uniform key access distribution. The first workload only includes short-lived transactions, each having 10 read and 10 read-modify-write operations. The other workloads include a mix of short and long-running transactions. We generate long-running transactions by adding client think/wait time to short transactions. The long transactions last 0.5s, 1s, 2s, and 4s for the four workloads respectively and are dedicated to one worker. We collect throughput every second and report the average throughput during protocol switching. We ensure that the switch happens at the start, end, and middle of a long-running transaction — representing a switch that waits for little, all, or half of the transaction respectively — and report the three test cases for each mixed workload.

Figure 16: Testing mediated switching

As shown in Figure 16, Mediated and StopAll have a minimal throughput drop compared to 2PL when the workload only includes short transactions. When long-running transactions are introduced, StopAll suffers because it waits for the completion of long-running transactions, while Mediated can still maintain high throughput during the switch: Mediated does not stop all workers, but lets them adopt both 2PL and OCC (i.e. the upgrade phase); then, after the long transaction ends, the coordinator notifies all workers to adopt 2PL alone (i.e. the degrade phase). The mediated protocol achieves at least 93% of the throughput of OCC or 2PL, the gap being the overhead of executing the logic of two protocols.

In addition, we perform the same test for all other pairwise protocol switches. We find that the overhead is minimal under the short-only workload. When long-running transactions are introduced, the maximum throughput drop is about 20%, during protocol switching from PartCC to OCC. This is acceptable compared to StopAll, which cannot process new transactions during the switch. These experiments show that Mediated can maintain reasonable throughput during a protocol switch, even in the presence of long transactions.

7 Conclusion

By exploring the design space of mixed concurrency control, CormCC presents a new approach to generally mixing multiple concurrency control protocols without introducing coordination overhead. In addition, CormCC proposes a novel way to reconfigure the protocol for parts of a workload online while multiple protocols are running. Our experiments show that CormCC can greatly outperform static protocols and state-of-the-art mixed concurrency control approaches on various workloads.

References

[1] Toward Coordination-free and Reconfigurable Mixed Concurrency Control (Technical Report). https://newtraell.cs.uchicago.edu/files/tr_authentic/TR-2018-06.pdf.

[2] Bernstein, P. A., Hadzilacos, V., and Goodman, N. Concurrency Control and Recovery in Database Systems. Addison-Wesley Longman Publishing, 1986.

[3] Bhargava, B. K., Helal, A., Friesen, K., and Riedl, J. Adaptability experiments in the RAID distributed database system. In SRDS 1990, Huntsville, Alabama, USA (1990), pp. 76–85.

[4] Bhargava, B. K., and Riedl, J. A model for adaptable systems for transaction processing. IEEE Trans. Knowl. Data Eng. 1, 4 (1989), 433–449.

[5] Curino, C., Zhang, Y., Jones, E. P. C., and Madden, S. Schism: a workload-driven approach to database replication and partitioning. PVLDB 3, 1 (2010), 48–57.

[6] Diaconu, C., Freedman, C., Ismert, E., Larson, P., Mittal, P., Stonecipher, R., Verma, N., and Zwilling, M. Hekaton: SQL server's memory-optimized OLTP engine. In SIGMOD 2013, New York, NY, USA (2013), pp. 1243–1254.

[7] Didona, D., Diegues, N., Kermarrec, A., Guerraoui, R., Neves, R., and Romano, P. ProteusTM: Abstraction meets performance in transactional memory. In ASPLOS '16, Atlanta, GA, USA (2016), pp. 757–771.

[8] Harizopoulos, S., Abadi, D. J., Madden, S., and Stonebraker, M. OLTP through the looking glass, and what we found there. In SIGMOD 2008, Vancouver, BC, Canada (2008), pp. 981–992.

[9] Jung, H., Han, H., Fekete, A. D., Heiser, G., and Yeom, H. Y. A scalable lock manager for multicores. In SIGMOD 2013, New York, NY, USA (2013), pp. 73–84.

[10] Kallman, R., Kimura, H., Natkins, J., Pavlo, A., Rasin, A., Zdonik, S. B., Jones, E. P. C., Madden, S., Stonebraker, M., Zhang, Y., Hugg, J., and Abadi, D. J. H-Store: a high-performance, distributed main memory transaction processing system. PVLDB 1, 2 (2008), 1496–1499.

[11] Kemper, A., and Neumann, T. HyPer: A hybrid OLTP&OLAP main memory database system based on virtual memory snapshots. In ICDE 2011, Hannover, Germany (2011), pp. 195–206.

[12] Kim, K., Wang, T., Johnson, R., and Pandis, I. ERMIA: fast memory-optimized database system for heterogeneous workloads. In SIGMOD 2016, San Francisco, CA, USA (2016), pp. 1675–1687.

[13] Kimura, H. FOEDUS: OLTP engine for a thousand cores and NVRAM. In SIGMOD 2015, Melbourne, Victoria, Australia (2015), pp. 691–706.

[14] Larson, P., Blanas, S., Diaconu, C., Freedman, C., Patel, J. M., and Zwilling, M. High-performance concurrency control mechanisms for main-memory databases. PVLDB 5, 4 (2011), 298–309.

[15] Levandoski, J. J., Lomet, D. B., and Sengupta, S. The Bw-Tree: A B-tree for new hardware platforms. In ICDE 2013, Brisbane, Australia (2013), pp. 302–313.

[16] Mu, S., Cui, Y., Zhang, Y., Lloyd, W., and Li, J. Extracting more concurrency from distributed transactions. In OSDI '14, Broomfield, CO, USA (2014), pp. 479–494.

[17] Narula, N., Cutler, C., Kohler, E., and Morris, R. Phase reconciliation for contended in-memory transactions. In OSDI '14, Broomfield, CO, USA (2014), pp. 511–524.

[18] Pandis, I., Johnson, R., Hardavellas, N., and Ailamaki, A. Data-oriented transaction execution. PVLDB 3, 1 (2010), 928–939.

[19] Pandis, I., Tözün, P., Johnson, R., and Ailamaki, A. PLP: page latch-free shared-everything OLTP. PVLDB 4, 10 (2011), 610–621.

[20] Ren, K., Faleiro, J. M., and Abadi, D. J. Design principles for scaling multi-core OLTP under high contention. In SIGMOD 2016, San Francisco, CA, USA (2016), pp. 1583–1598.

[21] Ren, K., Thomson, A., and Abadi, D. J. Lightweight locking for main memory database systems. PVLDB 6, 145–156.

[22] Shang, Z., Li, F., Yu, J. X., Zhang, Z., and Cheng, H. Graph analytics through fine-grained parallelism. In SIGMOD 2016, San Francisco, CA, USA (2016), pp. 463–478.

[23] Su, C., Crooks, N., Ding, C., Alvisi, L., and Xie, C. Bringing modular concurrency control to the next level. In SIGMOD 2017, Chicago, IL, USA (2017), pp. 283–297.

[24] Tai, A. T., and Meyer, J. F. Performability management in distributed database systems: An adaptive concurrency control protocol. In MASCOTS '96, San Jose, California, USA (1996), pp. 212–216.

[25] Tang, D., Jiang, H., and Elmore, A. J. Adaptive concurrency control: Despite the looking glass, one concurrency control does not fit all. In CIDR 2017, Chaminade, CA, USA (2017).

[26] Tatarowicz, A., Curino, C., Jones, E. P. C., and Madden, S. Lookup tables: Fine-grained partitioning for distributed databases. In ICDE 2012, Washington, DC, USA (2012), pp. 102–113.

[27] Tu, S., Zheng, W., Kohler, E., Liskov, B., and Madden, S. Speedy transactions in multicore in-memory databases. In SOSP '13, Farmington, PA, USA (2013), pp. 18–32.

[28] Wang, T., and Kimura, H. Mostly-optimistic concurrency control for highly contended dynamic workloads on a thousand cores. PVLDB 10, 2 (2016), 49–60.

[29] Wang, Z., Mu, S., Cui, Y., Yi, H., Chen, H., and Li, J. Scaling multicore databases via constrained parallel execution. In SIGMOD 2016, San Francisco, CA, USA (2016), pp. 1643–1658.

[30] Wu, Y., Chan, C. Y., and Tan, K. Transaction healing: Scaling optimistic concurrency control on multicores. In SIGMOD 2016, San Francisco, CA, USA (2016), pp. 1689–1704.

[31] Xie, C., Su, C., Littley, C., Alvisi, L., Kapritsos, M., and Wang, Y. High-performance ACID via modular concurrency control. In SOSP 2015, Monterey, CA, USA (2015), pp. 279–294.

[32] Yao, C., Agrawal, D., Chen, G., Lin, Q., Ooi, B. C., Wong, W., and Zhang, M. Exploiting single-threaded model in multi-core in-memory systems. IEEE Trans. Knowl. Data Eng. 28, 10 (2016), 2635–2650.

[33] Yu, P. S., and Dias, D. M. Analysis of hybrid concurrency control schemes for a high data contention environment. IEEE Trans. Software Eng. 18, 2 (1992), 118–129.

[34] Yu, X., Bezerra, G., Pavlo, A., Devadas, S., and Stonebraker, M. Staring into the abyss: An evaluation of concurrency control with one thousand cores. PVLDB 8, 3 (2014), 209–220.

[35] Yu, X., Pavlo, A., Sanchez, D., and Devadas, S. TicToc: Time traveling optimistic concurrency control. In SIGMOD 2016, San Francisco, CA, USA (2016), pp. 1629–1642.

[36] Yuan, Y., Wang, K., Lee, R., Ding, X., Xing, J., Blanas, S., and Zhang, X. BCC: reducing false aborts in optimistic concurrency control with low cost for in-memory databases. PVLDB 9, 6 (2016), 504–515.

[37] Zhang, Y., Power, R., Zhou, S., Sovran, Y., Aguilera, M. K., and Li, J. Transaction chains: achieving serializability with low latency in geo-distributed storage systems. In SOSP '13, Farmington, PA, USA (2013), pp. 276–291.

[38] Zheng, W., Tu, S., Kohler, E., and Liskov, B. Fast databases with fast durability and recovery through multicore parallelism. In OSDI '14, Broomfield, CO, USA (2014), pp. 465–477.