Serializable Snapshot Isolation for Replicated Databases in High-Update Scenarios
Total Page:16
File Type:pdf, Size:1020Kb
Serializable Snapshot Isolation for Replicated Databases in High-Update Scenarios Hyungsoo Jungy Hyuck Han∗ Alan Feketey Uwe Rohm¨ y University of Sydney Seoul National University University of Sydney University of Sydney yffi[email protected] ∗[email protected] ABSTRACT pects of this work. Among these is the value of Snapshot Many proposals for managing replicated data use sites run- Isolation (SI) rather than serializability. SI is widely avail- ning the Snapshot Isolation (SI) concurrency control mech- able in DBMS engines like Oracle DB, PostgreSQL, and Mi- anism, and provide 1-copy SI or something similar, as the crosoft SQL Server. By using SI in each replica, and deliver- global isolation level. This allows good scalability, since only ing 1-copy SI as the global behavior, most recent proposals ww-conflicts need to be managed globally. However, 1-copy have obtained improved scalability, since local SI can be SI can lead to data corruption and violation of integrity con- combined into (some variant of) global 1-copy SI by han- straints [5]. 1-copy serializability is the global correctness dling ww-conflicts, but ignoring rw-conflicts. This has been condition that prevents data corruption. We propose a new the dominant replication approach, in the literature [10, 11, algorithm Replicated Serializable Snapshot Isolation (RSSI) 12, 13, 23, 27]. that uses SI at each site, and combines this with a certifi- While there are good reasons for replication to make use cation algorithm to guarantee 1-copy serializable global ex- of SI as a local isolation level (in particular, it is the best ecution. Management of ww-conflicts is similar to what is currently available in widely deployed systems including Or- done in 1-copy SI. But unlike previous designs for 1-copy se- acle and PostgreSQL), the decision to provide global SI is rializable systems, we do not need to prevent all rw-conflicts less convincing. Global SI (like SI in a single site) does not among concurrent transactions. We formalize this in a theo- prevent data corruption or interleaving anomalies. If data rem that shows that many rw-conflicts are indeed false pos- integrity is important to users, they should expect global (1- itives that do not risk non-serializable behavior. Our pro- copy) serializable execution, which will maintain the truth posed RSSI algorithm will only abort a transaction when it of any integrity constraint that is preserved by each trans- detects a well-defined pattern of two consecutive rw-edges action considered alone. in the serialization graph. We have built a prototype that Thus in this paper, we propose a new replication algo- integrates our RSSI with the existing open-source Postgres- rithm that uses sites running SI, and provides users with a R(SI) system. Our performance evaluation shows that there global serializable execution. is a worst-case overhead of about 15% for getting full 1- Recently, Bornea et al. [6] proposed a replica management copy serializability as compared to 1-copy SI in a cluster of system that uses sites running SI, and provides users with a 8 nodes, with our proposed RSSI clearly outperforming the global serializable execution. They use an essentially opti- previous work [6] for update-intensive workloads. mistic approach (following the basic principles from [17]) in which each transaction is performed locally at one site, and then (for update transactions) the writeset and readset are 1. INTRODUCTION transmitted (by a totally ordered broadcast) to remote sites In 1996, a seminal paper by Gray et al. [15] showed that where a certification step at the end of the transaction can there were performance bottlenecks that limited scalability check for conflicts between the completing transaction and of all the then-known approaches to managing replicated all concurrent committed transactions; if a conflict is found, data. These approaches involved a transaction performing the later transaction is aborted. All the designs that offer SI reads at one site, performing writes at all sites, and ob- as global isolation level, must do essentially the same checks taining locks at all sites. This led to a busy research area, to prevent ww-conflicts; the extra work for 1-copy serializ- proposing novel system designs to get better scalability. In ability lies in preventing rw-conflicts. Bornea et al. follow 2010, in a reflection [19] on winning a Ten-Year-Best-Paper this approach in a middleware-based design, with readsets award, Kemme and Alonso identified some important as- calculated by the middleware from analysis of SQL text, in order to reduce transmission cost. We refer to this conflict management approach, of aborting an update transaction Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are when it has a rw- or ww-conflict with a committed concur- not made or distributed for profit or commercial advantage and that copies rent transaction, as the “conflict-prevention with read-only bear this notice and the full citation on the first page. To copy otherwise, to optimisation" approach (abbreviated as CP-ROO). Our pa- republish, to post on servers or to redistribute to lists, requires prior specific per offers a more subtle approach, that looks at pairs of permission and/or a fee. Articles from this volume were invited to present consecutive conflict edges in order to reduce the number of their results at The 37th International Conference on Very Large Data Bases, unnecessary aborts; as a trade-off however, we must also cer- August 29th - September 3rd 2011, Seattle, Washington. Proceedings of the VLDB Endowment, Vol. 4, No. 11 tify read-only transactions, and so our technique is suited to Copyright 2011 VLDB Endowment 2150-8097/11/08... $ 10.00. 783 cases of high update rate (as we demonstrate in the evalua- Snapshot Isolation. Snapshot Isolation (SI) is a method tion). These cases are not well treated by existing replication of multiversion concurrency control in a single DBMS engine strategies which focus instead on read-mostly environments, that ensures non-blocking reads and avoids many anomalies. and they scale badly when updates are common. SI was described in Berenson et al. [5] and it has been im- A similar issue was previously studied in single-site DBMS plemented in several production systems including Oracle concurrency control. Theory from Adya [1] as extended by DB, PostgreSQL and Microsoft SQL Server. When SI is the Fekete et al. [14] showed that checking each rw-conflict in an concurrency control mechanism, and each transaction Ti be- SI-based system gives many unnecessary aborts (\false pos- gins its execution, it gets a begin timestamp bi. Whenever itives") in situations where non-serializable execution will Ti reads a data record x, it does not necessarily see the lat- not arise even in presence of a rw-conflict; instead, non- est value written to x; instead Ti sees the version of x that serializable execution only arises from a pair of consecutive was produced by the last to commit among the transactions rw-conflicts within a cycle in the serialization graph. Cahill that committed before Ti started and also modified x (there et al. [7] used this to propose Serializable Snapshot Isola- is one exception to this: if Ti has itself modified x, it sees tion (SSI), a concurrency control mechanism that is a mod- its own version). Owing to this rule, Ti appears to run on a ification of an SI-based (single site) engine. SSI guarantees snapshot of the database, storing the last committed version serializable execution, with performance close to that of SI of each record at bi. SI also enforces another rule on writes, itself (note that a pair of consecutive edges is much less fre- called the First-Committer-Wins (FCW) rule: two concur- quent than a single edge, so SSI has far fewer aborts than a rent transactions that both modify the same data record standard optimistic approach). In this paper we do a similar cannot both commit. This prevents lost updates in a single thing for a replicated system, with SI as the mechanism in site DBMS. each site, and 1-copy serializability as the global property of The theoretical properties of SI have been studied ex- executions. We prevent ww-conflicts in the same way as 1- tensively. Berenson et al. [5] showed that SI allows some copy SI designs, and we prevent certain pairs of consecutive non-serializable executions. Adya [1] showed that in any rw-conflicts. non-serializable execution of SI, the serialization graph has Contributions of This Work. We prove a new the- a cycle with two consecutive rw-edges. This was extended orem that gives a sufficient condition for 1-copy serializ- by Fekete et al. [14], who found that any non-serializable able execution, based on absence of ww-conflicts between execution had consecutive rw-edges where the transactions concurrent transactions, and absence of certain patterns of joined by the edge were concurrent with one another. Our consecutive rw-conflicts. We propose a concurrency con- new theorem is similar in format to those in [1, 14], but trol algorithm that applies the theorem, to abort trans- with different side-conditions that are suited to a replicated actions during a certification step when these situations are system. found. We design and implement a prototype system This theory has been used to enforce serializability based called Postgres-RSSI, using this algorithm, as a modification on SI mechanisms. In Fekete et al. [14] and in Alomari of the existing Postgres-R(SI) open source replication sys- et al.