A Queue-oriented Transaction Processing Paradigm

Thamir M. Qadah∗†
Exploratory Systems Lab, School of Electrical and Computer Engineering, Purdue University, West Lafayette
[email protected]

∗The author is co-advised by Prof. Mohammad Sadoghi.
†Also with Umm Al-Qura University, Makkah, Saudi Arabia.

Abstract
Transaction processing has been an active area of research for several decades. A fundamental characteristic of classical transaction processing protocols is non-determinism, which causes them to suffer from performance issues in modern computing environments such as main-memory databases using many-core and multi-socket CPUs, and distributed environments. Recent proposals of deterministic transaction processing techniques have shown great potential in addressing these performance issues. In this position paper, I argue for a queue-oriented transaction processing paradigm that leads to better design and implementation of deterministic transaction processing protocols. I support my approach with extensive experimental evaluations and demonstrate significant performance gains.

CCS Concepts • Information systems → Database transaction processing; Distributed database transactions; Main memory engines; • Computer systems organization → Multicore architectures; Distributed architectures.

Keywords database systems, transaction processing, concurrency control, distributed database systems, performance evaluation

ACM Reference Format:
Thamir M. Qadah. 2019. A Queue-oriented Transaction Processing Paradigm. In Proceedings of ACM Conference (Conference'17). ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

1 Introduction
Transaction processing is an age-old problem that has been an active area of research for the past 40 years [8]. Classical transaction processing is characterized as non-deterministic because the final state cannot be entirely determined by the input database state and the input set of transactions. The output database state is acceptable as long as the resulting history of concurrent transaction execution is equivalent to some serial history of execution according to serializability theory.

The goal of transaction processing protocols is to ensure ACID properties and increase the concurrency of executed transactions. Serializable isolation ensures anomaly-free execution. Using other isolation levels (e.g., read-committed) improves concurrency but is prone to producing anomalies that defy users' intentions and leave the database in an undesirable, inconsistent state.

Due to the non-deterministic nature of classical transaction processing protocols, they suffer from performance issues in modern computing environments such as main-memory databases that use many-core and multi-socket CPUs, and cloud-based distributed environments. In this Ph.D. dissertation, I look into ways to impose determinism to improve the performance of transaction processing in modern computing environments.

2 Transaction Processing in Modern Computing Environments
This section describes two major performance issues that arise when running database transactions using non-deterministic transaction processing protocols. Throughout this section, our discussion assumes the requirement of a serializable isolation model.

2.1 High-contention Workloads
Under high-contention workloads, non-deterministic transaction processing protocols suffer from high abort rates because their concurrency control algorithms need to ensure serializable histories.
Pessimistic concurrency control algorithms abort transactions to avoid deadlocks, and optimistic concurrency control algorithms abort transactions during the validation phase. Ensuring deadlock-free execution and validating transactions require extensive coordination among the concurrent threads executing transactions while guaranteeing serializability. The main research questions for this problem are: Is it possible to process high-contention workloads in a concurrency-control-free manner with minimal coordination while ensuring serializability? What are the right abstraction and principles to achieve that?

2.2 Distributed Commit Protocols
In distributed transaction processing, agreement protocols introduce significant overhead because all participant nodes need to agree on the fate of an executed distributed transaction. Achieving this agreement involves multiple rounds of communication messages exchanged among the participating nodes.

The state of the art for solving the agreement problem on the fate of transactions in database systems is the two-phase commit protocol (2PC) [9]. In the general case, 2PC is required to ensure atomicity when processing distributed transactions. Note that 2PC by itself does not ensure serializable histories; a distributed concurrency control protocol augments it to guarantee serializable execution of transactions. Therefore, the research questions for this problem are as follows: Can we reduce the cost of commitment in distributed transaction processing protocols? What conditions are needed to avoid using the costly 2PC-based protocol?

Fortunately, in many useful and practical cases, we can do away with 2PC, as the work on deterministic transaction processing protocols has demonstrated. The next section describes how determinism is a step toward overcoming this obstacle. However, the proposed deterministic transaction processing protocols suffer from inefficiencies. Another step toward eliminating these inefficiencies is the proposed queue-oriented paradigm, which addresses the following additional research questions: What is the best way to abstract deterministic transaction processing? Is it possible to provide a unified framework for both centralized and distributed transaction processing?

2.3 Potentials and Limitations of Determinism
Work on deterministic transaction processing protocols has demonstrated great potential for improving the performance of transaction processing systems [2]. In distributed transaction processing systems, recently proposed deterministic approaches almost eliminate the need to perform a costly 2PC protocol [18]. In other words, they rely on commit protocols that minimize the overhead of committing a distributed transaction because they perform the agreement ahead of time, which avoids aborting transactions for non-deterministic reasons (e.g., deadlocks, validation, or node failures).

In deterministic databases, the output database state is entirely determined by the input database state and the input set of transactions. Thus, full knowledge of the read/write set is required to process transactions deterministically, which is the main weakness of deterministic transaction processing protocols. Despite this limitation, there are commercial offerings that adopt the deterministic philosophy [7, 21], which indicates that the approach has found practical use cases.

3 Approach
Our goal is to process transactions efficiently in modern computing environments with minimal coordination among the threads running in our system. The proposed approach addresses the research questions presented in the previous section. The answers to these questions rely on three principles: transaction fragmentation, deterministic two-phase processing, and a priority-based, queue-oriented representation of the transactional workload. The essence of the approach is to minimize the overhead of transactional concurrency control and coordination across the whole system. A second goal is to provide a unified, extensible abstraction for deterministic transaction processing that seamlessly admits various configurations (e.g., speculative execution, conservative execution, serializable isolation, and read-committed isolation). To lay the foundations for describing the queue-oriented transaction processing paradigm, we start by describing the transaction fragmentation model.

3.1 Transaction Fragmentation Model
Here, I briefly describe the transaction fragmentation model; for a more formal specification of this model, I refer the reader to [17]. In this model, a transaction is broken into fragments containing the relevant transaction logic and abort conditions. A fragment can perform multiple operations on the same record, such as read, modify, and write operations. A fragment can cause the transaction to abort, and in this case, we refer to such fragments as abortable fragments. Table 1 summarizes the kinds of dependencies that may exist among fragments.
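To make the fragmentation model concrete, the following minimal Rust sketch shows one possible encoding of fragments and transactions. The type and field names (Op, Fragment, Transaction, abortable) are illustrative assumptions and are not the identifiers used in our implementation [17].

```rust
// A minimal sketch of the transaction fragmentation model; all names here are
// illustrative and are not taken from the QueCC code base [17].

/// Operations a fragment may apply to the single record it targets.
/// A read-modify-write is expressed as a Read followed by a Write
/// inside the same fragment.
#[derive(Debug, Clone, Copy)]
enum Op {
    Read,
    Write(u64),
}

/// A fragment bundles the operations and the abort condition for one record.
#[derive(Debug, Clone)]
struct Fragment {
    txn_id: u64,
    record_key: u64,
    ops: Vec<Op>,
    /// True if this fragment's logic may decide to abort its transaction
    /// (an "abortable" fragment in the terminology above).
    abortable: bool,
}

/// A transaction is an identifier plus the fragments it was broken into.
#[derive(Debug)]
struct Transaction {
    id: u64,
    fragments: Vec<Fragment>,
}

fn main() {
    // Example: transaction 1 reads record 42 and then conditionally updates record 7;
    // the second fragment may abort the transaction, so it is marked abortable.
    let txn = Transaction {
        id: 1,
        fragments: vec![
            Fragment { txn_id: 1, record_key: 42, ops: vec![Op::Read], abortable: false },
            Fragment { txn_id: 1, record_key: 7, ops: vec![Op::Read, Op::Write(99)], abortable: true },
        ],
    };
    println!("{:#?}", txn);
}
```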

Table 1. Summary of dependencies in the transaction fragmentation model.

Name                   | Fragment relation       | Notes
Data dependency        | Same transaction        | The dependent fragment requires values read by the dependee fragment.
Conflict dependency    | Different transactions  | The fragments access the same record.
Commit dependency      | Same transaction        | The dependee fragment may abort, and the dependent fragment updates the database.
Speculation dependency | Different transactions  | The dependent fragment uses data values updated by an abortable dependee fragment.
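Most of the structural conditions in Table 1 can be checked mechanically for a pair of fragments. The sketch below is a hypothetical classifier for the cross-transaction dependencies (conflict and speculation); it conservatively treats any same-record access as a potential use of the dependee's update, and it is not the dependency-tracking code of the actual system.

```rust
// A hypothetical classifier for the dependency kinds of Table 1; it is not the
// dependency-tracking code of the actual system [17].

/// The four dependency kinds summarized in Table 1.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Dependency {
    /// Same transaction: the dependent fragment needs values read by the dependee.
    Data,
    /// Different transactions: both fragments access the same record.
    Conflict,
    /// Same transaction: the dependee may abort while the dependent updates the database.
    Commit,
    /// Different transactions: the dependent uses values updated by an abortable dependee.
    Speculation,
}

/// Structural facts about a (dependee, dependent) fragment pair.
struct FragmentPair {
    same_transaction: bool,
    same_record: bool,
    dependee_abortable: bool,
}

/// Derives the cross-transaction dependencies of Table 1. Intra-transaction
/// dependencies (Data, Commit) also depend on the fragments' logic, so they
/// are not derived from these structural facts alone. Any same-record access
/// is conservatively treated as a potential use of the dependee's update.
fn cross_txn_dependencies(p: &FragmentPair) -> Vec<Dependency> {
    let mut deps = Vec::new();
    if !p.same_transaction && p.same_record {
        deps.push(Dependency::Conflict);
        if p.dependee_abortable {
            // Materializes only under speculative execution (Section 3.2).
            deps.push(Dependency::Speculation);
        }
    }
    deps
}

fn main() {
    // Two fragments from different transactions touching the same record,
    // where the dependee may still abort.
    let pair = FragmentPair {
        same_transaction: false,
        same_record: true,
        dependee_abortable: true,
    };
    assert_eq!(
        cross_txn_dependencies(&pair),
        vec![Dependency::Conflict, Dependency::Speculation]
    );
    println!("dependencies: {:?}", cross_txn_dependencies(&pair));
}
```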

3.2 Queue-oriented Transaction Processing
The essence of this paradigm is to process batches of transactions in two deterministic phases; Figure 1 depicts the basic flow. The first phase is a planning phase, in which planning threads deterministically create queues of transaction fragments tagged with deterministic priorities. Dependencies among fragments are not shown in Figure 1; the dependency information is maintained in a shared, lock-free, thread-safe distributed data structure. In the second phase, the execution phase, execution threads receive their assigned queues (filled with fragments) and use the tagged priorities to determine the processing order of queues from different planning threads. At this point, execution threads are not aware of the actual transactions. They simply execute the logic associated with the fragments in the queues and obey the FIFO property of the queues when processing fragments with conflict dependencies. Processing all queues is equivalent to processing the whole batch of planned transactions and committing them. Other than the necessary communication to resolve dependencies among fragments, no other coordination is needed.

Figure 1. Queue-oriented Transaction Processing Architecture
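The sketch below illustrates the two deterministic phases of Figure 1 on a single node: planning deterministically routes a batch of fragments into per-worker queues (here, one queue per planner priority and worker), and execution drains each worker's queues in ascending planner priority while obeying FIFO order within a queue. The planner and worker counts, the key-based routing rule, and the single shared map standing in for storage are simplifying assumptions rather than the design of the prototype in [17].

```rust
// A single-node sketch of the two-phase, queue-oriented flow depicted in Figure 1.
// Planner/worker counts, key-based routing, and the single shared map standing in
// for storage are simplifying assumptions; this is not the QueCC prototype [17].

use std::collections::{HashMap, VecDeque};
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Debug, Clone, Copy)]
enum Op {
    Read,
    Write(u64),
}

/// A fragment reduced to one record key and one operation.
#[derive(Debug, Clone, Copy)]
struct Fragment {
    txn_id: u64,
    key: u64,
    op: Op,
}

const PLANNERS: usize = 2; // each planner owns one priority level
const WORKERS: usize = 2; // execution threads

/// Planning phase: each planner deterministically routes its share of the batch
/// into one queue per execution worker. Fragments on the same key always map to
/// the same worker, so conflicting fragments end up in that worker's queues.
fn plan(batch: &[Fragment]) -> Vec<Vec<VecDeque<Fragment>>> {
    // queues[planner][worker]
    let mut queues = vec![vec![VecDeque::new(); WORKERS]; PLANNERS];
    for (i, frag) in batch.iter().enumerate() {
        let planner = i * PLANNERS / batch.len(); // contiguous, deterministic split of the batch
        let worker = (frag.key as usize) % WORKERS; // deterministic key-based routing
        queues[planner][worker].push_back(*frag);
    }
    queues
}

/// Execution phase: each worker drains its queues in ascending planner priority
/// and in FIFO order within each queue, which fixes the relative order of
/// conflicting fragments at planning time.
fn execute(queues: Vec<Vec<VecDeque<Fragment>>>, store: Arc<Mutex<HashMap<u64, u64>>>) {
    let mut handles = Vec::new();
    for worker in 0..WORKERS {
        // This worker's queues, ordered by planner priority.
        let my_queues: Vec<VecDeque<Fragment>> = queues
            .iter()
            .map(|per_planner| per_planner[worker].clone())
            .collect();
        let store = Arc::clone(&store);
        handles.push(thread::spawn(move || {
            for mut queue in my_queues {
                while let Some(frag) = queue.pop_front() {
                    // The single mutex stands in for storage; the paradigm itself needs no
                    // per-record concurrency control because conflicting fragments are
                    // already serialized within one worker's queues.
                    let mut db = store.lock().unwrap();
                    match frag.op {
                        Op::Read => {
                            let v = db.get(&frag.key).copied().unwrap_or(0);
                            println!("txn {} read key {} -> {}", frag.txn_id, frag.key, v);
                        }
                        Op::Write(v) => {
                            db.insert(frag.key, v);
                        }
                    }
                }
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
}

fn main() {
    // Four single-fragment transactions; transactions 1 and 3 conflict on key 4.
    let batch = vec![
        Fragment { txn_id: 1, key: 4, op: Op::Write(10) },
        Fragment { txn_id: 2, key: 7, op: Op::Write(20) },
        Fragment { txn_id: 3, key: 4, op: Op::Read },
        Fragment { txn_id: 4, key: 7, op: Op::Read },
    ];
    let store = Arc::new(Mutex::new(HashMap::new()));
    let queues = plan(&batch); // phase 1: planning
    execute(queues, Arc::clone(&store)); // phase 2: execution
    println!("final store: {:?}", *store.lock().unwrap());
}
```

Because fragments that touch the same record are always planned into the same worker's queues, their relative order is fixed deterministically at planning time, which is what lets the execution phase proceed without transactional concurrency control.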

Queue Execution Mechanisms. The proposed paradigm supports multiple execution mechanisms, such as speculative and conservative execution. When using speculative execution, additional speculation dependencies occur, and resolving them may cause cascading aborts. Conservative execution, on the other hand, ensures that uncommitted updates are not processed until all abortable fragments have completed without aborting, which requires additional synchronization and coordination among threads.

Isolation Levels. The queue-oriented paradigm admits the read-committed isolation level in addition to serializable isolation. Supporting read-committed isolation with speculative execution is interesting, as it requires maintaining a speculative version and a committed version of records. Other than the storage requirements, the planning phase would create additional queues for read operations. In the execution phase, multiple threads can execute these read operations using committed data.
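To illustrate the versioning requirement above, the following sketch keeps one committed and one speculative value per record: speculative execution reads and writes the speculative slot, read-committed readers see only the committed slot, and commit or abort promotes or discards the speculative value. The record layout and method names are assumptions made for illustration, not our storage design.

```rust
// A sketch of the per-record versioning needed to serve read-committed reads
// alongside speculative execution. The layout and method names are illustrative
// assumptions, not the actual storage design of QueCC [17].

/// A record holding the last committed value and, possibly, a speculative
/// value produced by an in-flight (not yet committed) fragment.
#[derive(Debug, Default)]
struct Record {
    committed: u64,
    speculative: Option<u64>,
}

impl Record {
    /// Speculative execution writes into the speculative slot only.
    fn speculative_write(&mut self, value: u64) {
        self.speculative = Some(value);
    }

    /// Serializable readers under speculative execution observe the latest
    /// (possibly uncommitted) value, which creates a speculation dependency.
    fn speculative_read(&self) -> u64 {
        self.speculative.unwrap_or(self.committed)
    }

    /// Read-committed readers never observe uncommitted data, so they can run
    /// from separate read queues without waiting for in-flight writers.
    fn read_committed_read(&self) -> u64 {
        self.committed
    }

    /// Commit promotes the speculative value; abort simply discards it.
    fn commit(&mut self) {
        if let Some(v) = self.speculative.take() {
            self.committed = v;
        }
    }

    fn abort(&mut self) {
        self.speculative = None;
    }
}

fn main() {
    let mut rec = Record { committed: 10, speculative: None };

    rec.speculative_write(42);                 // in-flight update
    assert_eq!(rec.speculative_read(), 42);    // speculative readers see 42
    assert_eq!(rec.read_committed_read(), 10); // read-committed readers still see 10

    rec.commit();                              // the writer commits
    assert_eq!(rec.read_committed_read(), 42);

    rec.speculative_write(7);
    rec.abort();                               // the next writer aborts; its update is discarded
    assert_eq!(rec.read_committed_read(), 42);
    println!("final record: {:?}", rec);
}
```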

4 Evaluation
For the evaluation, I implemented the queue-oriented processing protocol in ExpoDB [10, 11]. I also ported state-of-the-art non-deterministic and deterministic protocols into ExpoDB. Using a single test-bed implementation allows an apples-to-apples comparison among the different protocols. I used industry-standard macro-benchmarks such as YCSB [4] and TPC-C [19]. Table 2 summarizes the experimental results obtained from the centralized implementation running on multi-core hardware with speculative execution; more details on our centralized implementation are available in [17]. Furthermore, Table 2 reports results for our distributed implementation against a state-of-the-art distributed deterministic transaction processing protocol. The key performance metrics for evaluating transaction processing protocols are throughput and latency.

Another criterion for evaluating this paradigm is its applicability and broader impact. For this criterion, the queue-oriented paradigm scores high because it is the first deterministic transaction processing paradigm that allows different execution models and isolation levels. It also has the potential to guide implementations that improve blockchain systems.

Table 2. Experimental results using TPC-C and YCSB for the centralized implementation of the queue-oriented paradigm [17] and a distributed deterministic database.

Environment  | Compared protocols                                                                              | Throughput improvement   | Macro-benchmark | Notes
Centralized  | H-Store [13] (deterministic)                                                                    | Two orders of magnitude  | YCSB            | Multi-partition workload
Distributed  | Calvin [18] (deterministic)                                                                     | 22×                      | YCSB            | Low-contention workload (uniform access)
Centralized  | Cicada [16], TicToc [25], FOEDUS [15], ERMIA [14], Silo [20], 2PL-NoWait [24] (non-deterministic) | 3×                       | TPC-C           | High-contention workload (1 warehouse)

5 Related Work
The related work to my Ph.D. dissertation falls into two categories. In the first category, many centralized deterministic transaction processing protocols have been proposed. LADS by Yao et al. [23] creates multiple sub-graphs representing the transaction dependencies of a batch of transactions and executes these transactions according to the dependency sub-graphs. The main issue with this approach is that graph-based processing is not efficient. Using a different graph-based approach, Faleiro et al. [6] process transactions deterministically and introduce the notion of “early write visibility”, which allows transactions to read uncommitted data safely. In our approach, we use queues of transaction fragments with different dependency semantics, which allows us to process transactions more efficiently compared to a purely graph-based approach. BOHM [5] started re-thinking multi-version concurrency control for deterministic multi-core in-memory data stores. BOHM relies on pessimistic transactional concurrency control, while our proposed paradigm avoids transactional concurrency control during execution. Some ideas presented in [5, 6] are complementary to our approach. For example, our current implementation is single-version but can be extended to multi-version in the future.

In the second category, one of the first proposed distributed deterministic database systems is H-Store [13], which focuses on partitioned workloads. The design of H-Store does not lend itself to working well with multi-partition transactional workloads because of its partition-level locking mechanism and 2PC. To improve the performance of multi-partition workloads, Jones et al. [12] introduced the idea of speculative execution in H-Store while still relying on 2PC as a distributed commit protocol. In contrast to these proposals, the use of speculative execution in the proposed paradigm is different because speculative execution is at the level of fragments. Furthermore, the proposed paradigm does not require 2PC to commit distributed multi-partition transactions.

As mentioned previously, Calvin [18] greatly reduces the overhead of distributed transactions because it does not rely on 2PC. Wu et al. propose T-Part [22], which uses the same fundamental design as Calvin. T-Part optimizes the handling of remote reads by using a forward-pushing technique at the cost of more complex scheduling that involves solving a graph-partitioning problem. The key characteristic of Calvin and T-Part is that they use thread-to-transaction assignment, while our approach uses thread-to-queue assignment. Therefore, these systems cannot exploit intra-transaction parallelism within a single node.

6 Conclusion
In this paper, I argued for a queue-oriented transaction processing paradigm, which improves the performance of deterministic databases. Ongoing work includes using this paradigm to design and implement distributed transaction processing with Byzantine fault tolerance.

Future work includes using the proposed paradigm to realize a deterministic version of production-ready NewSQL databases such as TiDB [1]. Moreover, I believe that this paradigm can also improve the performance of blockchain systems. In particular, the queue-oriented paradigm can lead to a design and implementation that improves the performance of the ordering service in Hyperledger Fabric [3].

Acknowledgments
I want to thank my co-advisors, Prof. Arif Ghafoor, for his continuous support during my Ph.D. journey, and Prof. Mohammad Sadoghi, for his valuable comments that helped me develop the ideas in my thesis. The author would also like to thank the anonymous referees and Yahya Javed for their valuable comments and helpful suggestions. The work is supported in part by a scholarship from Umm Al-Qura University, Makkah, Saudi Arabia.

References
[1] 2019. TiDB | SQL at Scale. https://pingcap.com/en/.
[2] Daniel J. Abadi and Jose M. Faleiro. 2018. An Overview of Deterministic Database Systems. Commun. ACM 61, 9 (Aug. 2018), 78–88. https://doi.org/10.1145/3181853
[3] Elli Androulaki, Artem Barger, Vita Bortnikov, Christian Cachin, Konstantinos Christidis, Angelo De Caro, David Enyeart, Christopher Ferris, Gennady Laventman, Yacov Manevich, Srinivasan Muralidharan, Chet Murthy, Binh Nguyen, Manish Sethi, Gari Singh, Keith Smith, Alessandro Sorniotti, Chrysoula Stathakopoulou, Marko Vukolić, Sharon Weed Cocco, and Jason Yellick. 2018. Hyperledger Fabric: A Distributed Operating System for Permissioned Blockchains. In Proceedings of the Thirteenth EuroSys Conference (EuroSys '18). ACM, New York, NY, USA, 30:1–30:15. https://doi.org/10.1145/3190508.3190538
[4] Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2010. Benchmarking Cloud Serving Systems with YCSB. In Proc. SoCC. ACM, 143–154. https://doi.org/10.1145/1807128.1807152
[5] Jose M. Faleiro and Daniel J. Abadi. 2015. Rethinking Serializable Multiversion Concurrency Control. Proc. VLDB Endow. 8, 11 (July 2015), 1190–1201. https://doi.org/10.14778/2809974.2809981
[6] Jose M. Faleiro, Daniel J. Abadi, and Joseph M. Hellerstein. 2017. High Performance Transactions via Early Write Visibility. Proc. VLDB Endow. 10, 5 (Jan. 2017), 613–624. https://doi.org/10.14778/3055540.3055553
[7] FaunaDB. 2019. FaunaDB Website. https://fauna.com/.
[8] Jim Gray and Andreas Reuter. 1992. Transaction Processing: Concepts and Techniques (1st ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
[9] J. N. Gray. 1978. Notes on Data Base Operating Systems. In Operating Systems: An Advanced Course, R. Bayer, R. M. Graham, and G. Seegmüller (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 393–481.
[10] Suyash Gupta and Mohammad Sadoghi. 2018. Blockchain Transaction Processing. In Encyclopedia of Big Data Technologies, Sherif Sakr and Albert Zomaya (Eds.). Springer International Publishing, Cham, 1–11. https://doi.org/10.1007/978-3-319-63962-8_333-1
[11] Suyash Gupta and Mohammad Sadoghi. 2018. EasyCommit: A Non-Blocking Two-Phase Commit Protocol. In EDBT. https://doi.org/10.5441/002/edbt.2018.15
[12] Evan P. C. Jones, Daniel J. Abadi, and Samuel Madden. 2010. Low Overhead Concurrency Control for Partitioned Main Memory Databases. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD '10). ACM, New York, NY, USA, 603–614. https://doi.org/10.1145/1807167.1807233
[13] Robert Kallman, Hideaki Kimura, Jonathan Natkins, Andrew Pavlo, Alexander Rasin, Stanley Zdonik, Evan P. C. Jones, Samuel Madden, Michael Stonebraker, Yang Zhang, John Hugg, and Daniel J. Abadi. 2008. H-Store: A High-Performance, Distributed Main Memory Transaction Processing System. Proc. VLDB Endow. 1, 2 (Aug. 2008), 1496–1499. https://doi.org/10.14778/1454159.1454211
[14] Kangnyeon Kim, Tianzheng Wang, Ryan Johnson, and Ippokratis Pandis. 2016. ERMIA: Fast Memory-Optimized Database System for Heterogeneous Workloads. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16). ACM, San Francisco, California, USA, 1675–1687. https://doi.org/10.1145/2882903.2882905
[15] Hideaki Kimura. 2015. FOEDUS: OLTP Engine for a Thousand Cores and NVRAM. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD '15). ACM, Melbourne, Victoria, Australia, 691–706. https://doi.org/10.1145/2723372.2746480
[16] Hyeontaek Lim, Michael Kaminsky, and David G. Andersen. 2017. Cicada: Dependably Fast Multi-Core In-Memory Transactions. In Proc. SIGMOD. ACM, 21–35. https://doi.org/10.1145/3035918.3064015
[17] Thamir M. Qadah and Mohammad Sadoghi. 2018. QueCC: A Queue-Oriented, Control-Free Concurrency Architecture. In Proceedings of the 19th International Middleware Conference (Middleware '18). ACM, New York, NY, USA, 13–25. https://doi.org/10.1145/3274808.3274810
[18] Alexander Thomson, Thaddeus Diamond, Shu C. Weng, Kun Ren, Philip Shao, and Daniel J. Abadi. 2012. Calvin: Fast Distributed Transactions for Partitioned Database Systems. In Proc. SIGMOD. ACM, 1–12. https://doi.org/10.1145/2213836.2213838
[19] TPC. 2010. TPC-C, On-Line Transaction Processing Benchmark, Version 5.11.0. TPC Corporation.
[20] Stephen Tu, Wenting Zheng, Eddie Kohler, Barbara Liskov, and Samuel Madden. 2013. Speedy Transactions in Multicore In-Memory Databases. In SOSP. ACM, 18–32. https://doi.org/10.1145/2517349.2522713
[21] VoltDB. 2019. VoltDB. https://www.voltdb.com/.
[22] Shan-Hung Wu, Tsai-Yu Feng, Meng-Kai Liao, Shao-Kan Pi, and Yu-Shan Lin. 2016. T-Part: Partitioning of Transactions for Forward-Pushing in Deterministic Database Systems. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16). ACM, New York, NY, USA, 1553–1565. https://doi.org/10.1145/2882903.2915227
[23] C. Yao, D. Agrawal, G. Chen, Q. Lin, B. C. Ooi, W. F. Wong, and M. Zhang. 2016. Exploiting Single-Threaded Model in Multi-Core In-Memory Systems. IEEE TKDE 28, 10 (2016), 2635–2650. https://doi.org/10.1109/TKDE.2016.2578319
[24] Xiangyao Yu, George Bezerra, Andrew Pavlo, Srinivas Devadas, and Michael Stonebraker. 2014. Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores. Proc. VLDB Endow. 8, 3 (Nov. 2014), 209–220. https://doi.org/10.14778/2735508.2735511
[25] Xiangyao Yu, Andrew Pavlo, Daniel Sanchez, and Srinivas Devadas. 2016. TicToc: Time Traveling Optimistic Concurrency Control. In Proc. SIGMOD. ACM, 1629–1642. https://doi.org/10.1145/2882903.2882935