On Transactional Memory, Spinlocks, and Database Transactions

On Transactional Memory, Spinlocks, and Database Transactions Khai Q. Tran Spyros Blanas Jeffrey F. Naughton University of Wisconsin University of Wisconsin University of Wisconsin Madison Madison Madison 1210 W. Dayton 1210 W. Dayton 1210 W. Dayton Madison, WI, 53706 Madison, WI, 53706 Madison, WI, 53706 [email protected] [email protected] [email protected] ABSTRACT most of the power of the multiprocessor. If we give up on Currently, hardware trends include a move toward multi- true serialized execution and run the transactions concur- core processors, cheap and persistent variants of memory, rently, we once again face the need to implement some kind and even sophisticated hardware support for mutual exclu- of concurrency control. The traditional locking approach sion in the form of transactional memory. These trends, cou- may be inefficient for short-lived transactions due to the pled with a growing desire for extremely high performance high overhead of acquiring and releasing locks, and the cost on short database transactions, raise the question of whether of resolving contention, including thread scheduling and syn- the hardware primitives developed for mutual exclusion can chronization. be exploited to run database transactions. In this paper, These trends are converging to place extreme pressure on we present a preliminary exploration of this question. We techniques enforcing the isolation requirements of database conduct a set of experiments on both a hardware prototype transactions. Accordingly, in this paper we investigate us- and a simulator of a multi-core processor with Transactional ing hardware primitives directly as a concurrency control Memory (TM.) Our results show that TM is attractive under mechanism for workloads consisting of very short transac- low contention workloads, while spinlocks can tolerate high tions. We evaluate and compare the performance of three contention workloads well, and that in some cases these ap- Concurrency Control (CC) mechanisms: Hardware Transac- proaches can beat a simple implementation of a traditional tional Memory (HTM), test-and-set spinlocks, and a simple database lock manager by an order of magnitude. implementation of a database lock manager. We then use both an early hardware prototype implementation of trans- 1. INTRODUCTION actional memory and a simulator supporting TM to conduct a set of experiments on each CC approach under different To the best of our knowledge, there has been no published workload conditions. consideration of using hardware synchronization primitives We emphasize what we are not studying in this paper. We such as transactional memory to directly support concur- are not studying the question \Given traditional DBMS ar- rency control for database transactions. One reason for chitectures and workloads, can we improve performance by this is perhaps that these hardware primitives have little or replacing the lock manager with concurrency control based nothing to offer in the context of arbitrarily complex, heavy- upon hardware primitives?" The answer to that question weight database transactions running over disk-resident data. is almost certainly \no," since current architectures present However, the convergence of a number of important trends substantial overhead due to their generality and the legacy lead us to believe that this issue should be revisited. assumptions in their design. Rather, we are studying the The first such trend is the demand for special-purpose sys- question \what will happen with respect to the performance tems targeted at very fast processing of very simple trans- of concurrency control if we build a stripped-down system actions. The DBMS industry has already begun reacting optimized for the execution of short transactions over main- to this demand | there are startups targeting this market memory data?" (including VoltDB and other, stealth-mode companies.) An Building such systems will require advances including re- important characteristic of these workloads is that they con- thinking storage structures (slotted pages of variable-length sist of very short transactions that contain no I/O or think records are probably not optimal) and transaction logging. time. For such workloads, the overhead due to traditional While it is too early to be certain, it is possible that the concurrency control methods will be by far the major part of emergence of very fast persistent memory such as phase- the transaction execution path [5]. On a uniprocessor, there change memory [2], perhaps in addition to new software is no reason to incur this overhead [4], since there is no wait techniques like \k-safety" [15], will eventually lead to ex- or think time in one transaction to be overlapped with the tremely efficient mechanisms for supporting transaction per- execution of another. In other words, on a uniprocessor, sistence. We do not pretend to address all of these issues in the best concurrency control approach for such workloads is this paper; rather, our intent is to shed light on what will likely to be to truly serialize the transactions. be a crucial component in any such system, namely, very However, true serialization collides with the second trend, light-weight concurrency control for short-lived transactions that is, the growing dominance of multicore processors. Clearly, on multi-core processors. on a multicore processor, true serialization of transactions is We also wish to acknowledge that at this point it is not not an attractive option, since true serialization will waste clear when, if ever, transactional memories will become widely partition transaction causes cascading aborts of following available. Furthermore, due to the extremely primitive na- speculatively executed transactions. ture of the hardware transactional memory prototype to In recent work, Pandis et al. showed that a conventional which we had access, the workloads we explore in this pa- lock manger severely limits the performance and scalability per are greatly oversimplified even when compared to the of database systems on multicores [14]. By using thread-to- short transactions that are the target of extreme transac- data instead of thread-to-transaction assignments, they can tion processing. Hence, again, we are not making claims reduce contention among threads. However, since this ap- about the practicality of our work in any near-term sense. proach is still based on database locks, it does not avoid the Rather, we hope to raise the point that from the database problem of high overhead in acquiring and releasing locks. transaction processing perspective, a move by the hardware In other recent work, Johnson et al. studied the trade-off community away from the development of hardware transac- between spinning and blocking synchronization [7]. They tional memory would be unfortunate, as a sufficiently pow- pointed out that while the blocking approach is widely used erful future implementation of HTM could be of great ben- in database systems, it negatively affects system perfor- efit to database transactional processing. To express this mance in multicore environments due to the overhead of co- differently, one sometimes hears computer architects won- ordination with the operating system for scheduling threads. dering \what are we going to do with all the transistors we The spinning approach, on the other hand, does not incur will have on future chips?" In this paper we would like to that cost and offers a much lower overhead, but system at least suggest \consider hardware support for transaction performance degrades under high load since the spinning isolation." threads do not release the CPU even if they do not make The main contributions of the paper are as follows: any progress. This work did not consider the option of using spinning as an alternative approach for transaction concur- • We develop timing models for estimating the execution rency control. time of a transaction using TM and spinlocks as CC There has been also a great deal of work exploring meth- mechanisms. The accuracy of these models is verified ods of combining hardware and software facilities to en- through a set of experiments. sure mutual exclusion and to coordinate concurrent access to shared data structures [8]. However, to the best of our • Through both the analytical model and the experi- knowledge there has been no published work considering us- mental results, we show that TM and spinlocks are ing HTM and spinlocks in the context of database transac- promising CC approaches for workloads consisting of tions. a high volume of very short transactions. TM deliv- ers high performance under low contention. Perhaps 3. BACKGROUND surprisingly, although spinlocks do not differentiate between read and write access, and employ busy-waiting In this section, we explore HTM and spinlocks as CC ap- instead of sleeping on a queue, spinlocks appear to proaches for database transactions. For each approach, we be more attractive than TM or database locks under give a simple timing model to provide insight into their ex- higher contention. We show that these hardware prim- pected performance. The model will be used to help explain itives can beat the performance of our simple lock man- the results of our experiments on a hardware prototype and ager by an order of magnitude. a software simulator in Section 4. We also describe the implementation of our simple lock manager. The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 describes the analytical 3.1 Workloads and timing model model for each CC approach. Section 4 presents experiments The workloads we study in this paper have following char- and results. Finally, Section 5 concludes the paper. acteristics: 2. RELATED WORK • Transactions are very short and do not contain I/Os The idea of running short main memory transactions se- or think time. rially was first proposed by Garcia-Molina and Salem in [4]. • The active portion of the database is stored in main In that paper, the authors briefly mentioned that even if all memory.

Load more