Strongly Consistent Replication for a Bargain


Konstantinos Krikellas #‡, Sameh Elnikety †, Zografoula Vagena ∗‡, Orion Hodson †
# School of Informatics, University of Edinburgh
† Microsoft Research
∗ Concentra Consulting Ltd
‡ Work done while author was at Microsoft Research.

Abstract: Strong consistency is an important correctness property for replicated databases. It ensures that each transaction accesses the latest committed database state as provided in centralized databases. Achieving strong consistency in replicated databases is a major performance challenge and is typically not provided, exposing inconsistent data to client applications. We propose two scalable techniques that exploit lazy update propagation and workload information to guarantee strong consistency by delaying transaction start. We implement a prototype replicated database system and incorporate the proposed techniques for providing strong consistency. Extensive experiments using both a micro-benchmark and the TPC-W benchmark demonstrate that our proposals are viable and achieve considerable scalability while maintaining strong consistency.

I. INTRODUCTION

Database systems are an integral part of enterprise IT infrastructures. In these environments, replication is a commonly used technique in databases to increase availability and performance. In many cases, the database clients are other computer systems. This creates a large and complex distributed system with intricate dependencies and hidden communication channels.

Traditionally, when a database system is replicated to scale its performance, i.e., all replicas execute client transactions, there is an explicit trade-off between performance and consistency. A replicated database may be configured to expose inconsistent data to clients by providing weaker consistency guarantees. Some inconsistencies stem from the fact that there is a (usually considerable) delay between when an update transaction is executed at a replica and when its effects have been propagated to all the other replicas. Weakening consistency allows the replicated database system to achieve higher performance; providing strong consistency penalizes the scalability of the replicated system.

Many real-world applications would benefit from replicated databases being transparent, providing the same semantics as a centralized database. In particular, centralized databases provide strong consistency, which guarantees that each transaction sees the latest committed state of the database. Therefore, replicated database systems should ideally provide the same strong consistency guarantees [7].

We highlight the importance of strong consistency using the following example. Consider two automated clients, AgentA and AgentB, running under separate administrative domains which implement a simple hidden communication channel. AgentA executes a transaction T1 to update a particular database, based on a request received from AgentB, e.g., to trade shares for AgentB. When the transaction commits, AgentA notifies AgentB that the operation has completed. Both agents presume that the updates of T1 are reflected in any direct interaction – or indirect interaction via a third party – with that database. Here communication between AgentA and AgentB occurs via a hidden channel, which is increasingly common in applications that span multiple administrative domains. In a centralized database the latest committed database state is immediately available, making strong consistency natively guaranteed. However, replacing a centralized database with a replicated database system exposes these subtle inconsistencies (e.g., AgentB may not observe the effects of T1 for a considerable interval), unless strong consistency is provided.

This challenge has been recognized by the database community for years, with early replicated databases using techniques such as eager update propagation to provide strong consistency [20]. According to this eager approach, the commit of an update transaction is executed on all replicas synchronously before responding to the client. In this case, any new transaction from any client observes the committed changes. However, committing transactions on all replicas synchronously is expensive, requiring tight replica synchronization. Many replicated systems use weaker consistency guarantees, such as session consistency [12] or relaxed currency [6], [21], to improve performance. Still, correct programs using replicated databases without strong consistency are explicitly written to handle the increased complexity of data inconsistency; handling consistency at the application level is tricky and error prone.

Replicated database systems should ideally provide strong consistency to clients while preserving their scalability potential. To that end, we propose two approaches that guarantee strong consistency in replicated databases while lazily propagating the effects of update transactions. The first approach monitors the progress of propagated updates at each replica and ensures that replicas are updated before starting a new transaction. The second approach extends the first approach and exploits workload specific information, which is easily available in automated environments, to execute transactions as soon as the needed subset of data is current. Both approaches mask the latency required to reflect the effects of an update transaction on all replicas, but might introduce a delay at the start of a transaction to apply the needed updates. This delay allows new transactions to receive the latest committed state of the data they are going to access.

We build a prototype replicated database system and implement our proposals to experimentally evaluate them. The results show that the proposed techniques guarantee strong consistency with scalable performance, outperforming the eager approach for providing strong consistency and matching the performance of session consistency. The latter is a weaker form of consistency guaranteeing that a single client sees its own committed updates.

Strong consistency is concerned with the effects of committed transactions and not with overlapping transactions. It is different from transaction isolation levels [5], [18], such as serializability and snapshot isolation. This work focuses on providing strong consistency: this issue has received much less attention than serializability from the replicated database research community. We delineate the differences between strong consistency and serializability and give examples in the next section.

The main contributions of this paper are the following:
• we propose two scalable techniques that exploit lazy update propagation and workload characteristics to guarantee strong consistency,
• we present a prototype replicated database system and show how to implement the two proposed techniques, and
• we experimentally evaluate the proposed techniques and compare them with eager strong consistency and session consistency, proving the viability and the efficiency of our proposals.

The paper proceeds as follows: in Section II we provide the database model and define strong consistency. We describe the proposed lazy approaches to provide strong consistency in Section III. In Section IV we present the architecture and design of our replicated database prototype and explain how the proposed techniques for strong consistency are implemented. The experimental evaluation of our techniques is presented in

We adopt the definition of strong consistency in replicated systems from prior work [7], [12], [16], [17]:

Definition 1: Strong consistency: For every committed transaction Ti, there is a single-copy history H′ such that (a) H′ is view equivalent to a history H over a single copy of the database, and (b) for any transaction Tj: if (Ti commits before Tj starts in H) then Ti precedes Tj in H′.

Strong consistency guarantees that after a transaction Ti commits, any new transaction Tj following Ti observes the updates of Ti. This property is immediately provided in a standalone database system.

Next, we describe session consistency. Each client submits a sequence of transactions to the database that defines the client’s session (S). H is an interleaved sequence of the transaction operations contained in all sessions. Session consistency is then defined as follows:

Definition 2: Session consistency: For every committed transaction Ti, there is a single-copy history H′ such that (a) H′ is view equivalent to a history H over a single copy of the database, and (b) for any transaction Tj: if (Ti commits before Tj starts in H, and session(Ti) = session(Tj)) then Ti precedes Tj in H′.

The definition above shows that session consistency is a relaxed version of strong consistency: it ensures that transactions access a consistent state of the database within the bounds of the session they belong to. Here, a client is guaranteed that the updates of its last transaction are visible to its new one. Session consistency is weaker than strong consistency, in the sense that there is no guarantee for a client’s transaction to access the updates of committed transactions originating from other clients.

Although the definitions of strong and session consistency are associated with serializability in prior work [7], [12], it is important to distinguish between strong consistency and the isolation level. Both are correctness properties and they are different: both, one, or none can be provided. Strong consistency captures the externally visible order of transaction commits that clients
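
Condition (b) of Definitions 1 and 2 can be illustrated with a small check. The sketch below is not taken from the paper: the `Txn` record, the `violations` function, and the use of integer logical timestamps for start and commit events are assumptions made purely for illustration. It tests only the ordering condition (b) against a candidate single-copy order H′ and does not attempt the view-equivalence condition (a).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    """A committed transaction as observed in the history H (hypothetical model)."""
    name: str
    session: str
    start: int   # logical time at which the transaction starts in H
    commit: int  # logical time at which the transaction commits in H


def violations(history, serial_order, same_session_only=False):
    """Return (Ti, Tj) name pairs that break condition (b) for the order H'.

    same_session_only=False checks the strong consistency condition;
    same_session_only=True restricts it to pairs in the same session,
    which is the session consistency condition.
    """
    position = {name: i for i, name in enumerate(serial_order)}
    bad = []
    for ti in history:
        for tj in history:
            # Condition (b) only applies when Ti commits before Tj starts.
            if ti is tj or ti.commit >= tj.start:
                continue
            if same_session_only and ti.session != tj.session:
                continue
            # Ti must precede Tj in H'; record the pair if it does not.
            if position[ti.name] > position[tj.name]:
                bad.append((ti.name, tj.name))
    return bad


# T1 (AgentA's session) commits before T2 (AgentB's session) starts, but T2
# ran against stale data, so the only equivalent serial order puts T2 first.
history = [Txn("T1", "A", start=1, commit=2), Txn("T2", "B", start=3, commit=4)]
stale_order = ["T2", "T1"]

print(violations(history, stale_order))                          # [('T1', 'T2')]
print(violations(history, stale_order, same_session_only=True))  # []
```

In this example the stale order fails the strong consistency condition but passes the session consistency condition, because the two transactions belong to different sessions. This mirrors the AgentA/AgentB scenario from the introduction.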