Chiller: Contention-Centric Transaction Execution and Data Partitioning For

Research 6: Transaction Processing and Query Optimization SIGMOD ’20, June 14–19, 2020, Portland, OR, USA Chiller: Contention-centric Transaction Execution and Data Partitioning for Modern Networks Erfan Zamanian Julian Shun [email protected] [email protected] Brown University MIT CSAIL Carsten Binnig Tim Kraska [email protected] [email protected] TU Darmstadt MIT CSAIL Abstract SIGMOD International Conference on Management of Data (SIG- Distributed transactions on high-overhead TCP/IP-based net- MOD’20), June 14–19, 2020, Portland, OR, USA. ACM, New York, NY, works were conventionally considered to be prohibitively USA, 16 pages. https://doi.org/10.1145/3318464.3389724 expensive and thus were avoided at all costs. To that end, 1 Introduction the primary goal of almost any existing partitioning scheme The common wisdom is to avoid distributed transactions is to minimize the number of cross-partition transactions. at almost all costs as they represent the dominating bot- However, with the new generation of fast RDMA-enabled tleneck in distributed database systems. As a result, many networks, this assumption is no longer valid. In fact, recent partitioning schemes have been proposed with the goal of work has shown that distributed databases can scale even minimizing the number of cross-partition transactions [8, 28, when the majority of transactions are cross-partition. 29, 34, 37, 44]. Yet, a recent result [43] has shown that with In this paper, we first make the case that the new bottle- the advances of high-bandwidth RDMA-enabled networks, neck which hinders truly scalable transaction processing in neither the message overhead nor the network bandwidth modern RDMA-enabled databases is data contention, and that are limiting factors anymore, significantly mitigating the optimizing for data contention leads to different partition- scalability issues of traditional systems. This raises the fun- ing layouts than optimizing for the number of distributed damental question of how data should be partitioned across transactions. We then present Chiller, a new approach to machines given high-bandwidth low-latency networks. In data partitioning and transaction execution, which aims to this paper, we argue that the new optimization goal should be minimize data contention for both local and distributed trans- to minimize contention rather than distributed transactions. actions. Finally, we evaluate Chiller using various workloads, In this paper, we present Chiller, a new partitioning scheme and show that our partitioning and execution strategy out- and execution model based on 2-phase-locking which aims to performs traditional partitioning techniques which try to minimize contention. Chiller is based on two complementary avoid distributed transactions, by up to a factor of 2. ideas: (1) a novel commit protocol based on re-ordering transaction operations with the goal of minimizing the lock ACM Reference Format: duration for contended records through committing such Erfan Zamanian, Julian Shun, Carsten Binnig, and Tim Kraska. 2020. records early, and (2) contention-aware partitioning so Chiller: Contention-centric Transaction Execution and Data Par- titioning for Modern Networks. In Proceedings of the 2020 ACM that the most critical records can be updated without addi- tional coordination. For example, assume a simple scenario with three servers in which each server can store up to two Permission to make digital or hard copies of all or part of this work for records, and a workload consisting of three transactions C1, personal or classroom use is granted without fee provided that copies C2, and C3 (Figure 1a). All transactions update A1. In addition, are not made or distributed for profit or commercial advantage and that C A C A A C A A copies bear this notice and the full citation on the first page. Copyrights 1 updates 2, 2 updates 3 and 4, and 3 updates 4 and 5. for components of this work owned by others than the author(s) must The common wisdom would dictate partitioning the data be honored. Abstracting with credit is permitted. To copy otherwise, or in a way that the number of cross-cutting transactions is republish, to post on servers or to redistribute to lists, requires prior specific minimized; in our example, this would mean co-locating all permission and/or a fee. Request permissions from [email protected]. data for C1 on a single server as shown in Figure 1b, and SIGMOD’20, June 14–19, 2020, Portland, OR, USA having distributed transactions for C2 and C3. © 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM. However, as shown in Figure 2a, if we re-order each trans- ACM ISBN 978-1-4503-6735-6/20/06...$15.00 action’s operations such that the updates to the most con- https://doi.org/10.1145/3318464.3389724 tended items (A1 and A4) are done last, we argue that it is better 511 Research 6: Transaction Processing and Query Optimization SIGMOD ’20, June 14–19, 2020, Portland, OR, USA Server 3 t1 t1 t1: t2: t3: t1: t2: t3: r2 r2 Read r1 Read r1 Read r4 t2 t3 Read r2 Read r3 Read r5 t2 t3 Write r1 Read r3 Write r4 Outer Region Write r2 Write r3 Write r5 r3 r1 r5 r3 r1 r5 Server 1 Read r2 Write r1 Read r5 Read r1 Read r4 Read r4 Server 1 Write r2 Write r3 Write r5 Write r1 Write r4 Write r4 Server 2 Read r4 Read r1 Read r1 Read r1 Server 2 r4 Inner Region r4 Write r4 Write r1 Write r1 Write r1 Server 3 (a) Original Txn. Execution (b) Distr. Txn. Avoiding Partitioning (a) Re-ordered Txn. Execution (b) Contention-aware Partitioning Figure 1: Traditional Execution and Partitioning. Figure 2: Chiller Execution and Partitioning. to place A1 and A4 on the same machine, as in Figure 2b. At (3) We show that Chiller outperforms existing techniques first this might seem counter-intuitive as it increases the total by up to a factor of 2 on various workloads. number of distributed transactions. However, this partition- 2 Overview ing scheme decreases the likelihood of conflicts and therefore The throughput of distributed transactions is limited by three increase the total transaction throughput. The idea is that factors: (1) message overhead, (2) network bandwidth, and re-ordering the transaction operations minimizes the lock (3) increased contention [3]. The first two limitations are duration for the “hot” items and subsequently the chance significantly alleviated with the new generation of high- of conflicting with concurrent transactions. More impor- speed RDMA-enabled networks. However, what remains is tantly, after the re-ordering, the success of a transaction re- the increased contention likelihood, as message delays are lies entirely on the success of acquiring the lock for the most still significantly longer than local memory accesses. contended records. That is, if a distributed transaction has already acquired the necessary locks for all non-contended 2.1 Transaction Processing with 2PL & 2PC records (referred to as the outer region), the commit out- To understand the impact of contention in distributed trans- come depends solely on the contended records (referred to actions, let us consider a traditional 2PL with 2PC. Here, we as the inner region). This allows us to make all updates to use transaction C3 from Figure 1, and further assume that its the records in the inner region without any further coordina- coordinator is on Server 1, as shown in Figure 3a. The green tion. Note that this partitioning technique primarily targets circle on each partition’s timeline shows when it releases its high-bandwidth low-latency networks, which mitigates the locks and commits. We refer to the time span between acqui- two most common bottlenecks for distributed transactions: sition and release of a record lock as the record’s contention message overhead and limited network bandwidth. span (depicted by thick blue lines), during which all con- To provide such a contention-aware scheme, Chiller is current accesses to the record would be conflicting. In this based on two complementary ideas that go hand-in-hand: a example, the contention span for all records is 2 messages contention-aware data partitioning algorithm and an operation- long with piggybacking optimization (when merging the last reordering execution scheme. First, different from existing step of execution with the prepare phase) and 4 without it. partitioning algorithms that aim to minimize the number of While our example used 2PL, other concurrency control distributed transactions (such as Schism [8]), Chiller’s par- (CC) methods suffer from this issue to various extents16 [ ]. titioning algorithm explicitly takes record contention into For example in OCC, transactions must pass a validation account to co-locate hot records. Second, at runtime, Chiller phase before committing. If another transaction has modified uses a novel execution scheme which goes beyond existing the data accessed by a validating transaction, it has to abort work on re-ordering operations (e.g., QURO [40]). By taking and all its work will be wasted [9, 16]. advantage of the co-location of hot records, Chiller’s exe- 2.2 Contention-Aware Transactions cution scheme reorders operations such that it can release We propose a new partition and execution scheme that aims locks on hot records early and thus reduce the overall con- to minimize the contention span for contended records. The tention span on those records. As we will show, these two partitioning layout shown in Figure 2b opens new possibili- complementary ideas together provide significant perfor- ties. As shown in Figure 3b, the coordinator requests locks for mance benefits over existing state-of-the-art approaches on all the non-contended records in C3, which is A5. If successful, various workloads.

Chiller: Contention-Centric Transaction Execution and Data Partitioning For

Improving High Contention OLTP Performance Via Transaction

A High-Performance, Distributed Main Memory Transaction Processing System

Arxiv:1612.05566V1 [Cs.DB] 16 Dec 2016

Jennie Duggan (Née Rogers)

Scheduling OLTP Transactions Via Learned Abort Prediction

Handling Highly Contended OLTP Workloads Using Fast Dynamic

H-Store: a High-Performance, Distributed Main Memory Transaction Processing System

Cross-Engine Query Execution in Federated Database Systems Ankush M. Gupta

Uses Data Partitioning to Distribute Execution » on Shared

P-Store: an Elastic Database System with Predictive Provisioning

Faster: a Concurrent Key-Value Store with In-Place Updates

A Queue-Oriented Transaction Processing Paradigm