Concurrency Control Basics


Chapter 2: Concurrency Control Basics
Klemens Böhm, Distributed Data Management: Concurrency Control Basics

Outline
• Introduction/problems,
• definitions (transaction, history, conflict, equivalence, serializability, ...),
• locking.

Atomicity, Isolation
• Transactional guarantees – in particular, atomicity and isolation.
• Atomicity
  - Example, "bank scenario":
      Number   Person    Balance
               Klemens   5000
               Gunter    200
  - Money transfer – two elementary operations:
      debit(Klemens, 500),
      credit(Gunter, 500).
• Isolation – can be explained with this example, too.
• Transactions.

Synchronisation, Distributed (1)
• Essential feature of databases: many users can access the same data concurrently – be it read, be it write.
• Consistency must be guaranteed – task of the synchronization component.
• Multi-user mode shall be hidden from users as far as possible: concurrent processing of requests shall be transparent ('illusion' of being the only user).

Synchronisation, Distributed (2)
• Serial execution of application programs
  - achieves that illusion without any synchronization effort,
  - gives database consistency at the end of each program,
  - but causes extremely long delays and insufficient utilization of resources. (Processor is idle during communication and I/O.)

Synchronization in General
Uncontrolled non-serial execution leads to other problems, notably inconsistency:
• lost updates,
• inconsistent analysis ("non-repeatable read"),
• dirty reads, i.e., reads of uncommitted updates,
• phantoms.
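The bank scenario from the atomicity slide can be sketched in a few lines. This is a minimal illustration, not how a DBMS implements atomicity: the in-memory dictionary `accounts` and the helper `transfer` are hypothetical names introduced here, and the undo-by-snapshot strategy is an assumption chosen for brevity. Only `debit`, `credit`, the persons and the balances come from the slide.

```python
# Hypothetical in-memory "database"; balances taken from the bank scenario.
accounts = {"Klemens": 5000, "Gunter": 200}


def debit(db, person, amount):
    """Elementary operation: withdraw amount, failing on insufficient funds."""
    if db[person] < amount:
        raise ValueError(f"insufficient funds for {person}")
    db[person] -= amount


def credit(db, person, amount):
    """Elementary operation: deposit amount into an existing account."""
    if person not in db:
        raise KeyError(person)
    db[person] += amount


def transfer(db, source, target, amount):
    """Run debit and credit as one unit: both take effect or neither does."""
    snapshot = dict(db)          # remember the state before the transfer
    try:
        debit(db, source, amount)
        credit(db, target, amount)
    except Exception:
        db.clear()
        db.update(snapshot)      # undo the partial effects (abort)
        raise


transfer(accounts, "Klemens", "Gunter", 500)
print(accounts)  # {'Klemens': 4500, 'Gunter': 700}
```

If `credit` fails after `debit` has already run, the snapshot is restored, so no money disappears. Isolation is a separate concern: the sketch says nothing about what a second, concurrent program may observe between the two steps.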
Lost Update
• Program T1 transfers EUR 300,- from Account A to Account B; Program T2 credits 3 % interest to Account A.
• Interest credited in Step 5 by T2 is lost, because the value is overwritten in Step 6 by T1.

  Step   T1                 T2
  1      Read(A, a1)
  2      a1 := a1 - 300
  3                         Read(A, a2)
  4                         a2 := a2 * 1.03
  5                         Write(A, a2)
  6      Write(A, a1)
  7      Read(B, b1)
  8      b1 := b1 + 300
  9      Write(B, b1)

Dirty Read
• Program T2 credits interest based on a value that is not part of a consistent state.
• Namely, T1 is aborted later on.

  Step   T1                 T2
  1      Read(A, a1)
  2      a1 := a1 - 300
  3      Write(A, a1)
  4                         Read(A, a2)
  5                         a2 := a2 * 1.03
  6                         Write(A, a2)
  7                         commit
  8      Read(B, b1)
  9      …
  10     abort (not necessarily invoked by user)

Non-Repeatable Reads
• Program reads a data object more than once and sees the modification of another program.

  Step   T1                 T2
  1      Read(A, a1)
  2      a1 := a1 - 300
  3      Write(A, a1)
  4                         Read(A, a2)
  5                         a2 := a2 * 1.03
  6                         Write(A, a2)
  7      Read(A, a3)
  8      …

• Explain why this is neither a lost update nor a dirty read.

Transactions
• Execution of a program that manipulates the database.
• Representation of the execution that identifies
  - the reads and writes,
  - the order of their execution,
  - whether or not there is a commit at the end (or abort).

Transactions
• Example:
    Procedure P begin
      Start;
      temp := Read(x);
      temp := temp + 1;
      Write(x, temp);
      Commit
    end
• Representation: r1[x] → w1[x] → c1
• Transaction is partial order (Σ, <). (Σ will mostly be omitted in what follows.)

Conflict
• Two operations p, q conflict := p, q operate on the same data object, and p or q is a write.
• Further operations – definition of conflict must be extended.
• Example. Compatibility matrix:

               Read   Write   Increment   Decrement
  Read         y      n       n           n
  Write        n      n       n           n
  Increment    n      n       y           y
  Decrement    n      n       y           y

Transaction – Formal Definition
A transaction $T_i$ is a partial order with ordering relation $<_i$, such that
1. $T_i \subseteq \{r_i[x], w_i[x] \mid x \text{ is a data object}\} \cup \{a_i, c_i\}$,
2. $a_i \in T_i \Leftrightarrow c_i \notin T_i$;
3. if $t$ is $c_i$ or $a_i$, then for each other operation $p \in T_i$: $p <_i t$; and
4. if $r_i[x], w_i[x] \in T_i$, then $r_i[x] <_i w_i[x]$ or $w_i[x] <_i r_i[x]$.
Examples (diagrams; the ordering arrows did not survive extraction):
• operations r[x], w[x], r[y], w[y], c: is a transaction.
• operations r[x], w[x], r[y], w[y], r[z]: is not a TA.
• operations r[x], w[y], r[y], w[x], c: is not a TA.

Reads-from Relationship between Transactions
• Transaction Ti reads-from transaction Tj in a certain execution if
  1. Ti reads x after Tj has written x;
  2. Tj does not abort before Ti reads x; and
  3. each transaction that writes x before Ti reads x and after Tj writes x aborts before Ti reads x.
• Examples:
  - w1[x] w2[x] r3[x] r4[x] c1 c2 c3 c4
    reads-from relationships: T3 from T2, T4 from T2. Nothing else.
  - w1[x] w2[x] r3[x] r4[x] c1 a2 c3 c4
    reads-from relationships: T3 from T2, T4 from T2. Nothing else.

Histories (1)
• Execution of the operations of several transactions that are 'intertangled' with each other, i.e., concurrent.
• Formally – let $T = \{T_1, T_2, \ldots, T_n\}$ be a set of transactions. Complete history $H$ over $T$ := partial order with order relation $<_H$, such that
  1. $H = \bigcup_{i=1}^{n} T_i$,
  2. $<_H \supseteq \bigcup_{i=1}^{n} <_i$,
  3. if $p, q \in H$ conflict, then $p <_H q$ or $q <_H p$.
  - history := prefix of a complete history.

Histories – Examples
• Two transactions given:
  - T1: r1[x] w1[x] and r1[y] w1[y], followed by c1,
  - T2: r2[x] w2[x] c2.
• An OK complete history (diagram over r2[x] w2[x] c2, r1[x] w1[x], r1[y] w1[y], c1; ordering arrows not recoverable).
• Not complete histories (two diagrams over the same operations; ordering arrows not recoverable).

Histories (2)
• Committed projection of history H – abbreviation: C(H) := results from H by removing all operations that are not committed.
• Illustration (diagram over r2[x] w2[x] c2, r1[x] w1[x], r1[y] w1[y], c1).

Histories (3)
• History need not be 'correct'. E.g., it may contain lost updates.
• Objective in what follows: formal definition of correctness.

Prefix Commit-Closedness (1)
• A characteristic of histories is prefix commit-closed if the following holds: if the characteristic holds for H, then it holds for C(H'), where H' is a prefix of H. (C(H) := committed projection, i.e., only operations from committed transactions.)
• Note that we do not take just any prefix into account, only committed projections.

Prefix Commit-Closedness (2)
H = o1 ... on (linear, to keep things simple). Prefixes: H' = o1 ... ol, H'' = o1 ... om.
• α = "history contains fewer than 10 operations": if α holds for H, it also holds for H' and H'' and any other prefix. α is prefix commit-closed.
• β = "all operations are reads": if β holds for H, it also holds for H' and H'' and any other prefix. β is prefix commit-closed.
• γ = "history contains more than 10 operations": γ may not hold for H' even if it holds for H. γ is not prefix commit-closed.
• Come up with an example of your own.

Prefix Commit-Closedness (3)
• Rationale:
  - A correctness criterion for histories must have this characteristic.
  - The scheduler generates the history, but also each prefix.
  - Failure of the DBMS – the history after recovery has the characteristic as well. I.e., that history should be correct as well.

Equivalence of Histories
• More than one definition:
  - conflict equivalence,
  - view equivalence.
• Definition 'conflict equivalence': histories H, H' are (conflict) equivalent if
  1. same transactions, same operations;
  2. they establish the same order of operations with conflict. I.e., let pi, qj be conflicting operations that belong to Ti and Tj, respectively, with ai, aj ∉ H. If pi <H qj then pi <H' qj.

Equivalence of Histories – Examples (1)
• Two transactions given:
  - T1: r1[x] w1[x] c1,
  - T2: r2[x] w2[x] c2.

Equivalence of Histories – Examples (2)
• History 1 (table with columns Step, T1, T2, T3; only step 1, Read(A), survives). All transactions eventually commit.
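The conflict and conflict-equivalence definitions can be tried out with a short check. The following is a sketch under stated assumptions, not a definitive implementation: histories are given as linear lists of strings such as `"r1[x]"`, every operation occurs at most once, all transactions commit, and only operations of different transactions are compared. The function names and the sample histories are made up for illustration; the two transactions are the ones from Examples (1).

```python
from itertools import combinations


def parse(op):
    """Split an operation such as 'r1[x]' into (kind, transaction, item)."""
    kind, rest = op[0], op[1:]
    txn, item = rest.rstrip("]").split("[")
    return kind, txn, item


def conflicts(p, q):
    """Same data object, at least one write, different transactions."""
    (pk, pt, px), (qk, qt, qx) = parse(p), parse(q)
    return px == qx and pt != qt and "w" in (pk, qk)


def conflict_order(history):
    """Set of pairs (p, q) of conflicting operations with p before q."""
    ops = [op for op in history if op[0] in "rw"]  # skip commits like 'c1'
    return {(p, q) for p, q in combinations(ops, 2) if conflicts(p, q)}


def conflict_equivalent(h1, h2):
    """Same operations and same order of all conflicting operations."""
    return sorted(h1) == sorted(h2) and conflict_order(h1) == conflict_order(h2)


# T1 = r1[x] w1[x] c1 and T2 = r2[x] w2[x] c2
serial_12 = ["r1[x]", "w1[x]", "c1", "r2[x]", "w2[x]", "c2"]
serial_21 = ["r2[x]", "w2[x]", "c2", "r1[x]", "w1[x]", "c1"]
commit_moved = ["r1[x]", "w1[x]", "r2[x]", "c1", "w2[x]", "c2"]
interleaved = ["r1[x]", "r2[x]", "w1[x]", "c1", "w2[x]", "c2"]

print(conflict_equivalent(serial_12, commit_moved))  # True
print(conflict_equivalent(serial_12, serial_21))     # False
print(conflict_equivalent(interleaved, serial_12))   # False
print(conflict_equivalent(interleaved, serial_21))   # False
```

The history `interleaved` follows the lost-update pattern: both transactions read x before either writes it, so its conflict order matches neither serial execution of T1 and T2.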