Strongly Consistent Replication for a Bargain
Total Page:16
File Type:pdf, Size:1020Kb
Strongly consistent replication for a bargain Konstantinos Krikellas #z, Sameh Elnikety y, Zografoula Vagena ∗‡, Orion Hodson y #School of Informatics, University of Edinburgh yMicrosoft Research ∗Concentra Consulting Ltd Abstract— Strong consistency is an important correctness AgentA executes a transaction T1 to update a particular property for replicated databases. It ensures that each transaction database, based on a request received from AgentB, e.g., accesses the latest committed database state as provided in to trade shares for Agent . When the transaction commits, centralized databases. Achieving strong consistency in replicated B databases is a major performance challenge and is typically not AgentA notifies AgentB that the operation has completed. provided, exposing inconsistent data to client applications. We Both agents presume that the updates of T1 are reflected in propose two scalable techniques that exploit lazy update propaga- any direct interaction – or indirect interaction via a third tion and workload information to guarantee strong consistency by party – with that database. Here communication between delaying transaction start. We implement a prototype replicated Agent and Agent occurs via a hidden channel, which database system and incorporate the proposed techniques for A B providing strong consistency. Extensive experiments using both is increasingly common in applications that span multiple a micro-benchmark and the TPC-W benchmark demonstrate administrative domains. In a centralized database the latest that our proposals are viable and achieve considerable scalability committed database state is immediately available, making while maintaining strong consistency. strong consistency natively guaranteed. However, replacing a centralized database with a replicated database system exposes I. INTRODUCTION these subtle inconsistencies (e.g., AgentB may not observe Database systems are an integral part of enterprise IT the effects of T1 for a considerable interval), unless strong infrastructures. In these environments, replication is a com- consistency is provided. monly used technique in databases to increase availability and This challenge has been recognized by the database commu- performance. In many cases, the database clients are other nity for years, with early replicated databases using techniques computer systems. This creates a large and complex distributed such as eager update propagation to provide strong consistency system with intricate dependencies and hidden communication [20]. According to this eager approach, the commit of an channels. update transaction is executed on all replicas synchronously Traditionally, when a database system is replicated to scale before responding to the client. In this case, any new trans- its performance, i.e., all replicas execute client transactions, action from any client observes the committed changes. How- there is an explicit trade-off between performance and con- ever, committing transactions on all replicas synchronously sistency. A replicated database may be configured to expose is expensive, requiring tight replica synchronization. Many inconsistent data to clients by providing weaker consistency replicated systems use weaker consistency guarantees, such guarantees. Some inconsistencies stem from the fact that there as session consistency [12] or relaxed currency [6], [21], to is a (usually considerable) delay between when an update improve performance. Still, correct programs using replicated transaction is executed at a replica and when its effects databases without strong consistency are explicitly written have been propagated to all the other replicas. Weakening to handle the increased complexity of data inconsistency; consistency allows the replicated database system to achieve handling consistency at the application level is tricky and error higher performance; providing strong consistency penalizes prone. the scalability of the replicated system. Replicated database systems should ideally provide strong Many real–world applications would benefit from replicated consistency to clients while preserving their scalability poten- databases being transparent, providing the same semantics as tial. To that end, we propose two approaches that guarantee a centralized database. In particular, centralized databases pro- strong consistency in replicated databases while lazily prop- vide strong consistency, which guarantees that each transaction agating the effects of update transactions. The first approach sees the latest committed state of the database. Therefore, monitors the progress of propagated updates at each replica replicated database systems should ideally provide the same and ensures that replicas are updated before starting a new strong consistency guarantees [7]. transaction. The second approach extends the first approach We highlight the importance of strong consistency using the and exploits workload specific information, which is easily following example. Consider two automated clients, AgentA available in automated environments, to execute transactions as and AgentB, running under separate administrative domains soon as the needed subset of data is current. Both approaches which implement a simple hidden communication channel. mask the latency required to reflect the effects of an update transaction on all replicas, but might introduce a delay at the zWork done while author was at Microsoft Research. start of a transaction to apply the needed updates. This delay allows new transactions to receive the latest committed state We adopt the definition of strong consistency in replicated of the data they are going to access. systems from prior work [7], [12], [16], [17]: We build a prototype replicated database system and im- Definition 1: Strong consistency: For every committed 0 plement our proposals to experimentally evaluate them. The transaction Ti, there is a single-copy history H such that results show that the proposed techniques guarantee strong (a) H0 is view equivalent to a history H over a single copy consistency with scalable performance, outperforming the ea- of the database, and (b) for any transaction Tj: if (Ti commits 0 ger approach for providing strong consistency and matching before Tj starts in H) then Ti precedes Tj in H . the performance of session consistency. The latter is a weaker Strong consistency guarantees that after a transaction Ti form of consistency guaranteeing that a single client sees its commits, any new transaction Tj following Ti observes the own committed updates. updates of Ti. This property is immediately provided in a Strong consistency is concerned with the effects of com- standalone database system. mitted transactions and not with overlapping transactions. It Next, we describe session consistency. Each client submits is different from transaction isolation levels [5], [18], such a sequence of transactions to the database that defines the as serializability and snapshot isolation. This work focuses client’s session (S). H is an interleaved sequence of the trans- on providing strong consistency: this issue has received much action operations contained in all sessions. Session consistency less attention than serializability from the replicated database is then defined as follows: research community. We delineate the differences between Definition 2: Session consistency: For every committed 0 strong consistency and serializability and give examples in the transaction Ti, there is a single-copy history H such that next section. (a) H0 is view equivalent to a history H over a single copy The main contributions of this paper are the following: of the database, and (b) for any transaction Tj: if (Ti commits before Tj starts in H, and session(Ti) = session(Tj)) then • we propose two scalable techniques that exploit lazy T precedes T in H0. update propagation and workload characteristics to guar- i j The definition above shows that session consistency is a re- antee strong consistency, laxed version of strong consistency: it ensures that transactions • we present a prototype replicated database system and access a consistent state of the database within the bounds of show how to implement the two proposed techniques, the session they belong to. Here, a client is guaranteed that and the updates of its last transaction are visible to its new one. • we experimentally evaluate the proposed techniques and Session consistency is weaker than strong consistency, in the compare them with eager strong consistency and session sense that there is no guarantee for a client’s transaction to consistency, proving the viability and the efficiency of access the updates of committed transactions originating from our proposals. other clients. The paper proceeds as follows: in Section II we provide the Although the definitions of strong and session consistency database model and define strong consistency. We describe are associated with serializability in prior work [7], [12], it the proposed lazy approaches to provide strong consistency in is important to distinguish between strong consistency and Section III. In Section IV we present the architecture and de- the isolation level. Both are correctness properties and they sign of our replicated database prototype and explain how the are different: both, one, or none can be provided. Strong proposed techniques for strong consistency are implemented. consistency captures the externally visible order of transaction The experimental evaluation of our techniques is presented in commits that clients