Efficient Protocols for Replicated Transactional Systems

Sapienza Università di Roma
Dottorato di Ricerca in Ingegneria Informatica
XXVI Ciclo - 2014

Efficient Protocols for Replicated Transactional Systems

Sebastiano Peluso

Thesis Committee
  Prof. Francesco Quaglia (Co-Advisor)
  Prof. Paolo Romano (Co-Advisor)
  Prof. Leonardo Querzoni

Reviewers
  Prof. Pascal Felber
  Prof. Fernando Pedone

Author's address:
Sebastiano Peluso
Dipartimento di Ingegneria Informatica, Automatica e Gestionale
Sapienza Università di Roma
Via Ariosto 25, I-00185 Roma, Italy
e-mail: [email protected]
www: http://www.dis.uniroma1.it/~peluso

To my family...

Abstract

Over the last few years, several technological trends have significantly increased the relative impact of inter-replica synchronization costs on the performance of transactional systems. Indeed, the emergence of technologies like Transactional Memory, Solid-State Drives and Cloud computing has raised the ratio between the latency of replication coordination and that of transaction processing. The requirements of these environments severely challenge state-of-the-art techniques for the replication of transactional systems, raising the need to rethink existing approaches to this problem.

This dissertation advances the state of the art on replicated transactional systems by presenting a set of innovative replication protocols designed to achieve high efficiency even in such challenging scenarios.

In more detail, four transactional replication protocols are proposed, which tackle the aforementioned issues from various angles. The first two cope with full replication scenarios and exploit orthogonal techniques, namely speculation and transaction migration, which amortize, in different ways, the impact of distributed coordination on system performance.
The other two proposals explicitly address scalability by introducing the first genuine partial replication techniques that support abort-free read-only transactions while ensuring, respectively, One-Copy Serializability and Extended Update Serializability. The core of these protocols is a distributed multi-version concurrency control algorithm, which relies on a novel logical clock synchronization mechanism to track, in a totally decentralized (and consequently scalable) way, both data and causal dependency relations among transactions.

The trade-offs arising across the different solutions are also discussed and experimentally evaluated by integrating them into state-of-the-art academic and industrial transactional platforms.

Most of the material presented in this dissertation can also be found in the following papers:

1. Sebastiano Peluso, Pedro Ruivo, Paolo Romano, Francesco Quaglia and Luís Rodrigues
   When Scalability Meets Consistency: Genuine Multiversion Update-Serializable Partial Data Replication
   In Proc. of the 32nd IEEE International Conference on Distributed Computing Systems (ICDCS), Macau, China, June 2012.

2. Sebastiano Peluso, João Fernandes, Paolo Romano, Francesco Quaglia and Luís Rodrigues
   SPECULA: Speculative Replication of Software Transactional Memory
   In Proc. of the 31st IEEE International Symposium on Reliable Distributed Systems (SRDS), Irvine, California, USA, October 2012.

3. Sebastiano Peluso, Paolo Romano and Francesco Quaglia
   SCORe: a Scalable One-Copy Serializable Partial Replication Protocol
   In Proc. of the ACM/IFIP/USENIX 13th International Conference on Middleware (Middleware), Montréal, Québec, Canada, December 2012.

4. Danny Hendler, Alex Naiman, Sebastiano Peluso, Paolo Romano, Francesco Quaglia and Adi Suissa
   Exploiting Locality in Lease-Based Replicated Transactional Memory via Task Migration
   In Proc. of the 27th International Symposium on Distributed Computing (DISC), Jerusalem, Israel, October 2013.

5. Hugo Pimentel, Paolo Romano, Sebastiano Peluso and Pedro Ruivo
   Enhancing locality via caching in the GMU protocol
   In Proc. of the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Chicago, IL, USA, May 2014.

Contents

Abstract

1 Introduction
  1.1 The Need for Rethinking Transactional Replication
  1.2 Outline of Innovative Contributions

2 State of the Art
  2.1 Primary Copy
  2.2 Update Everywhere
    2.2.1 Active Replication
    2.2.2 Deferred Update Replication

3 Model of the Target Systems and Preliminary Definitions
  3.1 Distributed Processes and Communication Primitives
  3.2 Data Model
  3.3 Transaction Model
    3.3.1 History and Direct Serialization Graph
  3.4 Consistency Model
    3.4.1 Extended Update Serializability
    3.4.2 Serializability
    3.4.3 Opacity

4 Exploiting Speculation to Overlap Computation and Distributed Coordination in Fully Replicated Systems
  4.1 Correctness Criteria
  4.2 The SPECULA protocol
    4.2.1 Protocol Overview
    4.2.2 High Level Software Architecture
    4.2.3 Speculative Execution of Transactions
    4.2.4 Speculative Execution of Non-Transactional Code
    4.2.5 Correctness Arguments
  4.3 Experimental Evaluation

5 Reducing Full Replication Costs by Leveraging Transactions Migration
  5.1 Overview of ALC
  5.2 Lilac-TM
    5.2.1 Fine-Grained Leases
    5.2.2 Transaction Forwarder
    5.2.3 Distributed Transaction Dispatching
  5.3 Correctness Arguments
  5.4 Experimental Evaluation

6 Changing the Viewpoint: a Scalable Multi-Version Protocol under Genuine Partial Replication
  6.1 The GMU protocol
    6.1.1 Transaction execution phase
    6.1.2 Transaction commit phase
    6.1.3 Garbage Collection
    6.1.4 Failure Handling and Dynamic Process Groups
    6.1.5 On the support for read operations
  6.2 Correctness Proof
    6.2.1 Unidirectional flow of information
    6.2.2 No-update-conflict-misses
  6.3 On the Data Freshness
  6.4 Experimental Evaluation

7 Additional Tradeoffs in the Design of Multi-Version GPR Protocols
  7.1 The SCORe Protocol
    7.1.1 Overview
    7.1.2 Handling of Read and Write Operations
    7.1.3 Commit Phase
    7.1.4 Garbage Collection and Fault-Tolerance
  7.2 Correctness Proof
  7.3 Experimental Evaluation

8 Concluding Remarks

Bibliography

Chapter 1 Introduction

The explosion in the usage of web applications allows companies to easily break through national boundaries, making their services available to any user in the world. On the one hand, the popularity and the productivity of those services increase significantly thanks to this easy and wide deployment. On the other hand, the IT systems behind these services have to face the challenge of processing an ever-growing volume of requests.

Despite the different types of requests, almost all workloads trigger the execution of procedures for querying (i.e., read interactions) or manipulating (i.e., write interactions) common application state. In this context, a crucial and long-studied problem is handling concurrent data manipulations efficiently while preserving the consistency of the application state, despite multiple simultaneous requests. Another objective is to keep the service usable, i.e., guaranteeing low user-perceived latency while yielding high service throughput. In addition, since many companies and enterprises base their success on the IT market or, more generally, rely on computing systems to enlarge their user base, these systems should meet dependability requirements and ensure the survival of both data and services in case of failures.
Concurrent data accesses are widely managed by means of the transaction abstraction, a well-established technique in Database Management Systems (DBMS) that has recently emerged also in the context of parallel programming via the transactional memory paradigm. In this way, applications enclose accesses to shared data, e.g., write and read operations on tables in a database or on simple objects in main memory, within the boundaries of so-called transactions. The concurrency control module is then responsible for ensuring that, despite their parallel activation, transactions appear as if they were executed in isolation and atomically (i.e., either all or none of a transaction's operations take effect), thereby allowing only safe and consistent transitions of the application state [12].

On the other hand, data and service replication is a widely adopted technique for dependability, and it is recognized as a practical and effective way of enhancing the availability and fault-tolerance of computer systems. User-perceived latency can be reduced by exploiting application locality or by migrating data closer to the source of the requests.

Replication applied to transactional systems has already been consolidated in the literature as the reference methodology for building available, fault-tolerant and high-performance data management systems. However, current replication protocols for transactional systems do not represent a definitive solution when new requirements arise from changes in architectural trends. Section 1.1 discusses (i) the most relevant trends currently driving the reorganization of the architecture of transactional systems and (ii) the major shortcomings of state-of-the-art transactional replication protocols when combined with these new trends.

1.1 The Need for Rethinking Transactional Replication

During the
