A Study of the Availability and Serializability in a Distributed Database System
Total Page:16
File Type:pdf, Size:1020Kb
A STUDY OF THE AVAILABILITY AND SERIALIZABILITY IN A DISTRIBUTED DATABASE SYSTEM David Wai-Lok Cheung B.Sc., Chinese University of Hong Kong, 1971 M.Sc., Simon Fraser University, 1985 A THESIS SUBMI'ITED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHLLOSOPHY , in the School of Computing Science 0 David Wai-Lok Cheung 1988 SIMON FRASER UNIVERSITY January 1988 All rights reserved. This thesis may not be reproduced in whole or in part, by photocopy or other means, without permission of the author. 1 APPROVAL Name: David Wai-Lok Cheung Degree: Doctor of Philosophy Title of Thesis: A Study of the Availability and Serializability in a Distributed Database System Examining Committee: Chairperson: Dr. Binay Bhattacharya Senior Supervisor: Dr. TikoJameda WJ ru p v Dr. Arthur Lee Liestman Dr. Wo-Shun Luk Dr. Jia-Wei Han External Examiner: Toshihide Ibaraki Department of Applied Mathematics and Physics Kyoto University, Japan Bate Approved: January 15, 1988 PARTIAL COPYRIGHT LICENSE I hereby grant to Simon Fraser University the right to lend my thesis, project or extended essay (the title of which is shown below) to users of the Simon Fraser University Library, and to make partial or single copies only for such users or in response to a request from the library of any other university, or other educational institution, on its own behalf or for one of its users. I further agree that permission for multiple copying of this work for scholarly purposes may be granted by me or the Dean of Graduate Studies. It is understood that copying or publication of this work for financial gain shall not be allowed without my written permission, T it l e of Thes i s/Project/Extended Essay Author: (signature) ( name (date) ABSTRACT Replication of data objects enhances the reliability and availability of a distributed database system. However, due to the inherent conflict between serializability and availability, if serializability is to be guaranteed in a partitioned database system, degradation of availability is inevitable. We first characterize serializable transaction executions in a partitioned database system, by means of a graph theoretical method. We then derive an upper bound on the availability of a database system with two partitions. This upper bound holds for a general class of transaction distributions that satisfy the "weak uniformity assumption". Since it is impossible to simultaneously achieve serializability and high availability in a general database system, we investigate database systems in which constraints are imposed on the readlwrite activity of the transactions. In particular, we propose a fragmented database system, in which transactions are classified as either local or global. This model can be used to realize wide-area distributed database systems, in which messages encounter substantial communication delays. We argue that a transaction scheduling policy that favors local transactions over global transactions should be adopted in wide-area distributed database systems. Two schemes are proposed for concurrency control in this kind of system. The first scheme, Global Timestamp Order Certification, uses an "active" approach, and sends out requests for global transactions to be certified by remote sites. The second scheme, Global Timestamp Order Synchronization, adopts a "passive" approach in which a global transaction is . made to wait until it is known that the transaction has read consistent data object values from other sites. In both approaches, no global transaction blocks a local transaction. Moreover, local transactions are executed as if they were in the single-site environment, in which communication delays are nkgligible. ACKNOWLEDGEMENTS I wish to express special appreciation and gratitude to Professor Tiko Kameda, my senior supervisor, for his constant guidance, advice, support and availability. Without his support and supervision which I enjoyed during the last four years of my study at Simon Fraser, the completion of this thesis would not have been possible. His insistence on precision and clarity in writing has been immensely helpful in the preparation of this thesis. I am also indebted to Dr. Wo-shun Luk, Dr. Arthur Liestman and Dr. Jia-wei Han, not only for their reading and commenting on this thesis, but also for their constant advice and encouragement. Thanks are also due to Professor Toshihide Ibaraki, the external examiner of this thesis, for his valuable and stimulating suggestions and comments. My thanks go also to many of the graduate students in the School of Computing Science at Simon Fraser, who either read and commented parts of my thesis, or provided me with various kinds of help when they were most needed: Ada Fu, Steven Yap, Mimi Kao, Frank Tong, Patta Pattabhiraman, Siu-Cheung Chau, Wuyi Wu and many others. Finally, this work is dedicated to my wife, Juana. She gave me love and support, and encouraged my endeavour to study at an age when most men have to work. This thesis is also dedicated to my parents, brothers and sisters. They have been caring so much about me, my wife and my children. TABLE OF CONTENTS Approval ... .. ... .. .. .. .. .. ... .. .. .. .. .. .. .. .. .. ... .. .. .. .. .. .. .. .. .. ... .. , . .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ii Abstract ....... ... .. .. .. .. ... .. .. .. .. .. .. .. ... .. , . ... .. .. .. ... .. .. .. .. .. .. .. ... .. .. .. ... .. .. .. .. .. iii Acknowledgements ...... .. .. .. ... .. .. .. .. .. ... .. .. .. .. .. .. .. .. ... .. .. .. iv Table of Contents ... .. ... .. ............. .. .. ..... .. ......... .. ...... .. ... ............... .. ....... .. .. .. .. ... .. .. .. .. .. ......... ....... ... v List of Figures ............................................................................................................................. viii Chapter One. INTRODUCTION .. .. ........................... .. ...... ... .. .. .. ....... .. .. .... ... .. .. ...... ......... ....... ... 1 1.1. Replicated Database Management ..................................................................................... 1 1.2. Goals of the thesis .............................................................................................................. 2 1.3. Overview of the thesis ....................................................................................................... 3 Chapter Two. CONCURRENCY CONTROL IN DISTRIBUTED DATABASE SYSTEM 2.1. Concurrency Control ......................................................................................................... 2.2. Serializability in a Single Site Database System ......;........................................................ 2.3. Disjoint-Interval Topological Sort .. .. ... .. .. ...... ... .. .. .. .. ... .. .. .. .. .. ... .. .. .... ... .. .... .. .. ... ............... 2.4. Serializability in a Distributed Database System .............................................................. Chapter Three. COPING WITH PARTITION FAILURES .. .. .. .. .. .. ... ........ ... ...... .. ......... .. .......... 3.1. Partition Failure ................................................................................................................. \ 3.2. Pessimistic Approach to Dealing with Partition Failures .................................................. 3.3. Optimistic Approach to Dealing with Partition Failures .............................................. ..... 1 3.4. Transaction-Cluster Log .................................................................................................... 3.5. Characterization of Executions Admissible under Partition Failures ............................... Chapter Four. LIMITATION ON AVAILABILITY .................................................................. 4.1. Availability in Partitioned Database .................................................................................. 4.2. Uniform Transaction Distribution and Availability .......................................................... 4.3. A General Upper Bound on Availability ........................................................................... Chapter Five . TECHNIQUES FOR ACHIEVING HIGH AVAILABILITY ............................ 5.1. Trade-off between Serializability and Availability ........................................................... 5.2. A Highly Available Distributed Database System ............................................................ 5.3. Fragmented Distributed Database System ......................................................................... 5.4. Transaction Processing with a Static Read Access Graph ................................................ Chapter Six. A CONCURRENCY CONTROL SCHEME FOR WIDE-AREA DISTRI- BUTED DATABASE SYSTEM ........................................................................................ 6.1. A Model for Wide-Area Distributed Database Systems .................................................... 6.2. Architecture for Wide-Area Distributed Database System (WADDS) ............................. 6.3. Correctness of Fragmented Executions ............................................................................. 6.4. An Algorithm to Control Fragmented Execution .............................................................. 6.5. Performance Analysis ........................................................................................................ 6.6. Partition Failures in a Fragmented Database System ........................................................ Chapter Seven. ANOTHER