Copyright and Use of This Thesis This Thesis Must Be Used in Accordance with the Provisions of the Copyright Act 1968
Total Page:16
File Type:pdf, Size:1020Kb
COPYRIGHT AND USE OF THIS THESIS This thesis must be used in accordance with the provisions of the Copyright Act 1968. Reproduction of material protected by copyright may be an infringement of copyright and copyright owners may be entitled to take legal action against persons who infringe their copyright. Section 51 (2) of the Copyright Act permits an authorized officer of a university library or archives to provide a copy (by communication or otherwise) of an unpublished thesis kept in the library or archives, to a person who satisfies the authorized officer that he or she requires the reproduction for the purposes of research or study. The Copyright Act grants the creator of a work a number of moral rights, specifically the right of attribution, the right against false attribution and the right of integrity. You may infringe the author’s moral rights if you: - fail to acknowledge the author of this thesis if you quote sections from the work - attribute this thesis to another author - subject this thesis to derogatory treatment which may prejudice the author’s reputation For further information contact the University’s Copyright Service. sydney.edu.au/copyright CHERRY GARCIA: LinkingTRANSACTIONS Named Entities to ACROSS Wikipedia HETEROGENEOUS DATA STORES Will Radford Supervisor: Dr. James R. Curran A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy A thesis submittedSchool in offulfilment Information Technologiesof the requirements for the Faculty of Engineering & IT degree of Doctor of Philosophy in the School of Information Technologies at The University of Sydney The University of Sydney 2015 Akon Samir Dey October 2015 © Copyright by Akon Samir Dey 2016 All Rights Reserved ii Abstract In recent years, cloud or utility computing has revolutionised the way software, hardware and network infrastructure is provisioned and deployed into production. A key component of these vast, diverse and heterogeneous systems is the per- sistence layer provided by a variety of data store and database services, broadly categorised into what is referred to as NoSQL (Not only SQL) databases or data stores. These come in many flavours from simple key-value stores and column stores to database services with support for SQL-like interfaces. These systems are primarily designed to operate at internet-scale with high scalability and fault-tolerance in mind. As a result, they typically sacrifice consis- tency guarantees and often support only single-item consistent operations or no transactions at all. While these consistency limitations are fine for a wide class of applications, there are a few or sometimes only parts of larger applications that need ACID transactional guarantees in order to function correctly. To address this, we define a data store client API, we call REST+T (REST with Transactions), an extension of HTTP that supports transactions on one store. Then, we use this to define a client-coordinated transaction commitment protocol and library, called Cherry Garcia, to enable easy applications development across diverse, heterogeneous data stores that each support single-item transactions. We extend the well-known YCSB benchmark, to present YCSB+T, to enable us to group multiple data store operations into ACID transactions and evaluate proper- ties such as throughput. YCSB+T also provides the ability to detect and quantify data store anomalies that result from the execution of the workload. Finally, we describe our prototype implementations of REST+T in a system called Tora, and our client-coordinated transaction library, also called Cherry Garcia, that supports transactions across Windows Azure Storage (WAS), Google Cloud Storage (GCS) and Tora. We evaluate these using both YCSB+T and micro-benchmarks. iii Acknowledgements This thesis has enabled me to discover my strengths and overcome my weaknesses. It has taught me to appreciate the friendship and support of family, friends and colleagues who have enabled, empowered and encouraged me in this endeavour. I would like to express my deepest gratitude to my supervisor, Prof. Alan Fekete, for his support and guidance. His kindness, resourcefulness and willingness to help in my research work and outside of it, made this dissertation possible. My heartfelt thanks go to my co-supervisor, Assoc. Prof. Uwe R¨ohm,for his guidance, support, encouragement and ideas on presenting my work. He has complemented Alan as a guiding force behind this thesis. I thank members of the Database Research Group, Michael Cahill, Jack Galilee, Hyungsoo Jung, Meena Rajani, Ying Zhou, Vincent Gramoli and Paul Greenfield for their feedback and support. I also grateful to Arunmoezhi Ramachandran, Raghunath Nambiar, Sherif Sakr, David Bermbach, J¨ornKuhlemkamp for their collaboration. Many thanks to Lynne Hutton-Williams and Robyn Kemmis for their friendship, kindness and timely help. I am eternally grateful to my parents, Bharati and Samir Dey, for their uncon- ditional love, a wonderful childhood, and for instilling in me the curiosity to learn. I thank my brother, Kautuk Dey, for his encouragement and support during this work. Most of all, I would like to thank my wife, Aditi, for her love and understanding during this very challenging time of our life. I could not have done this without you. v Contents Abstract iii Acknowledgements v List of Figures xvi List of Algorithms xvii 1 Introduction 1 1.1 Problem definition . .2 1.2 Contributions . .2 1.3 Relevant publications . .3 1.4 Organisation of the Thesis . .4 2 Background 5 2.1 Motivating example . .5 2.2 Database Management Systems . .6 2.2.1 Transactions in Database Management Systems . .7 2.2.2 Properties of Transactions . .8 2.2.3 Transaction Programming Interface . .9 2.2.4 Requirements of Transactional Systems . .9 2.3 Distributed Concurrency Control . 10 2.4 Serializability . 10 2.4.1 Serializability in Distributed Databases . 12 2.4.2 Serializability in Heterogeneous Federation . 13 2.4.3 Weak Isolation . 14 2.5 Distributed Transaction Recovery . 15 vii 2.5.1 The Basic Two-Phase Commit Algorithm . 16 2.5.2 The Transaction Tree Two-Phase Commit Algorithm . 26 2.5.3 Optimised Algorithms for Distributed Commit . 28 2.6 Distributed systems . 33 2.7 NoSQL and NewSQL database systems . 35 2.8 Transactions in NoSQL database systems . 38 2.9 Evaluating database systems . 39 2.9.1 TPC . 40 2.9.2 Benchmarks . 41 2.9.3 Evaluation metrics . 42 2.9.4 Properties of a good benchmark . 43 2.10 Chapter Summary . 45 3 Related Work 47 3.1 Cloud and web services . 47 3.2 Distributed Concurrency Control . 48 3.3 Distributed Transaction Recovery . 49 3.4 Transaction Models . 50 3.5 Transactions over HTTP . 51 3.6 Transaction support in NoSQL and Cloud Storage Systems . 52 3.7 Middleware coordinated transactions . 54 3.8 Client coordinated transactions . 54 3.8.1 Comparison to our approach . 56 3.9 Time in Distributed Systems . 56 3.10 Database benchmarks . 58 3.10.1 Cloud services benchmarks . 58 3.10.2 YCSB . 59 3.11 Chapter Summary . 61 4 REST+T: Scalable transactions over HTTP 63 4.1 State and state change . 65 4.2 Challenges . 66 4.3 REST with Transaction support . 69 4.4 The REST+T Proposal . 69 viii 4.4.1 REST+T data header metadata . 72 4.4.2 REST+T methods . 75 4.5 Limiting REST+T to HTTP verbs . 77 4.6 A Case Study - Tora: a transaction-aware NoSQL data store . 78 4.6.1 REST+T interface . 79 4.6.2 Storage layer . 80 4.7 Chapter Summary . 80 5 Cherry Garcia - The Protocol 81 5.1 Challenges . 82 5.2 Intuition . 84 5.2.1 Overview of Two-Phase Commit . 84 5.2.2 Distributed Two-Phase Commit . 86 5.2.3 Write-Ahead Log (WAL) Protocol . 87 5.3 An example application . 88 5.4 Assumptions on the platforms . 90 5.5 Protocol overview . 90 5.5.1 Phase 1 . 91 5.5.2 Phase 2 . 92 5.6 Time . 93 5.7 Data item records . 94 5.8 Data store abstraction . 95 5.9 Transaction abstraction . 96 5.10 Cherry Garcia Protocol . 97 5.10.1 Start transaction . 97 5.10.2 Transactional read . 97 5.10.3 Transactional write . 98 5.10.4 Transaction commit . 100 5.10.5 Transaction abort . 102 5.10.6 Transaction recovery . 103 5.10.7 Performance Analysis of the Protocol . 103 5.11 Deadlock detection and avoidance . 104 5.11.1 First preparer wins . 104 5.11.2 Ensuring at least one winner . 105 ix 5.12 Optimisations . 105 5.12.1 One-phase transaction commit optimisation . 105 5.12.2 Parallel commit phase . 106 5.12.3 Read-only transaction optimisation . 106 5.13 Failure scenarios . 106 5.13.1 Performance implications of retries . 107 5.14 Correctness . 107 5.14.1 Effect of Violations of Data Store Assumptions . 110 5.15 Implementing serializablity . 113 5.16 Correctness for Serializable transactions . 115 5.17 Chapter Summary . 116 6 Implementation 119 6.1 Implementation Overview . 119 6.2 Datastore abstraction . 121 6.3 Datastore abstraction for WAS . 121 6.4 Datastore abstraction for GCS . 123 6.5 Datastore abstraction for Tora . 124 6.6 Tora: a REST+T implementation . 125 6.6.1 Request-response loop . 127 6.7 Apache HttpClient REST+T Extension . 130 6.8 Chapter Summary . 130 7 The YCSB+T Benchmark 133 7.1 Background . 134 7.1.1 Brief overview of YCSB . 136 7.2 Benchmark tiers . 137 7.2.1 Tier 5 - Transactional overhead . 137 7.2.2 Tier 6 - Consistency . 138 7.3 Benchmark details . ..