Aware, Workstation-Based Distributed Database System
Total Page:16
File Type:pdf, Size:1020Kb
THE ARCHITECTURE OF AN AUTONOMIC, RESOURCE- AWARE, WORKSTATION-BASED DISTRIBUTED DATABASE SYSTEM Angus Macdonald PhD Thesis February 2012 Abstract Distributed software systems that are designed to run over workstation machines within organisations are termed workstation-based. Workstation-based systems are characterised by dynamically changing sets of machines that are used primarily for other, user-centric tasks. They must be able to adapt to and utilize spare capacity when and where it is available, and ensure that the non-availability of an individual machine does not affect the availability of the system. This thesis focuses on the requirements and design of a workstation-based database system, which is motivated by an analysis of existing database architectures that are typically run over static, specially provisioned sets of machines. A typical clustered database system — one that is run over a number of specially provisioned machines — executes queries interactively, returning a synchronous response to applications, with its data made durable and resilient to the failure of machines. There are no existing workstation-based databases. Furthermore, other workstation-based systems do not attempt to achieve the requirements of interactivity and durability, because they are typically used to execute asynchronous batch processing jobs that tolerate data loss — results can be re-computed. These systems use external servers to store the final results of computations rather than workstation machines. This thesis describes the design and implementation of a workstation-based database system and investigates its viability by evaluating its performance against existing clustered database systems and testing its availability during machine failures. ACKNOWLEDGEMENTS I’d like to thank my supervisors, Professor Alan Dearle and Dr Graham Kirby, for the opportunities, the support, and the education that they have given me. Words cannot express how grateful I am for this. To Greg, Davie, Stuart, Alex, and the many others who provided support and entertainment away from work, and who pushed me (mostly) in the right direction, thank you. Thanks to my many officemates through the years, particularly Rob and Andy for helping me start, and Masih for helping me finish. Many thanks to the countless students I’ve had the pleasure to teach, for providing challenging discussions and inspiration in equal measure. I learned more from you, than you did from me. Finally, the biggest of thanks to my sister Julie, and to my parents, Anne and John Macdonald, for their love and support, and for making me who I am today. This is for them. DECLARATIONS I, Angus Macdonald, hereby certify that this thesis, which is approximately 55,000 words in length, has been written by me, that it is the record of work carried out by me and that it has not been submitted in any previous application for a higher degree. I was admitted as a research student in September 2007 and as a candidate for the degree of Doctor of Philosophy in September 2007; the higher study for which this is a record was carried out in the University of St Andrews between 2007 and 2012. Date: Signature of candidate: I hereby certify that the candidate has fulfilled the conditions of the Resolution and Regulations appropriate for the degree of Doctor of Philosophy in the University of St Andrews and that the candidate is qualified to submit this thesis in application for that degree. Date: Signature of supervisor: In submitting this thesis to the University of St Andrews I understand that I am giving permission for it to be made available for use in accordance with the regulations of the University Library for the time being in force, subject to any copyright vested in the work not being affected thereby. I also understand that the title and the abstract will be published, and that a copy of the work may be made and supplied to any bona fide library or research worker, that my thesis will be electronically accessible for personal or research use unless exempt by award of an embargo as requested below, and that the library has the right to migrate my thesis into new electronic forms as required to ensure continued access to the thesis. I have obtained any third-party copyright permissions that may be required in order to allow such access and migration, or have requested the appropriate embargo below. The following is an agreed request by candidate and supervisor regarding the electronic publication of this thesis: Access to printed copy and electronic publication of thesis through the University of St Andrews. Date: Signature of candidate: Date: Signature of supervisor: TABLE OF CONTENTS 1 Introduction ........................................................................................................................ 1 1.1 Database Systems ....................................................................................................... 2 1.2 Existing Approaches .................................................................................................. 3 1.2.1 Limitations of Existing Approaches ...................................................................... 3 1.3 Thesis Contribution .................................................................................................... 3 1.4 Thesis Structure .......................................................................................................... 5 2 Background Concepts ..................................................................................................... 10 2.1 Terminology .............................................................................................................. 10 2.2 Aspects of Distributed Database Architectures ................................................... 12 2.2.1 Heterogeneity ...................................................................................................... 12 2.2.2 Hardware ............................................................................................................. 12 2.2.3 Locale ................................................................................................................... 13 2.3 Database Components ............................................................................................. 15 2.3.1 Fault Tolerance .................................................................................................... 15 2.3.2 Transactional Properties ..................................................................................... 15 2.3.3 Replication ........................................................................................................... 26 2.3.4 Placement ............................................................................................................ 35 2.3.5 Caching ............................................................................................................... 39 2.3.6 Concurrency Control .......................................................................................... 39 2.3.7 Specialization ...................................................................................................... 42 2.3.8 Benchmarking ..................................................................................................... 43 2.3.9 Data Model .......................................................................................................... 43 2.4 Distributed Systems Components ......................................................................... 47 2.4.1 Failure Detection ................................................................................................. 47 2.4.2 Local Point of Presence ........................................................................................ 48 2.5 Resource Monitoring ............................................................................................... 50 2.5.1 Resource Monitoring Architectures .................................................................... 51 2.5.2 Querying Interfaces ............................................................................................ 54 2.5.3 Challenges ........................................................................................................... 55 2.6 Autonomic Computing............................................................................................ 56 2.7 Workstation-based Computing .............................................................................. 58 2.7.1 Ad Hoc Cloud Computing .................................................................................. 60 2.8 Summary ................................................................................................................... 61 2.8.1 Database Design .................................................................................................. 61 2.8.2 Distributed Systems ............................................................................................ 61 2.8.3 Resource Monitoring .......................................................................................... 62 2.8.4 Autonomic Computing ....................................................................................... 62 3 Related Work .................................................................................................................... 66 3.1 Distributed Databases .............................................................................................. 66 3.1.1 Clustered Databases (SQL) ................................................................................