State Machine Replication for Wide-Area Networks

UC San Diego
UC San Diego Electronic Theses and Dissertations

Title: State machine replication for wide area networks
Permalink: https://escholarship.org/uc/item/9m759181
Author: Mao, Yanhua
Publication Date: 2010
Peer reviewed | Thesis/dissertation

eScholarship.org, powered by the California Digital Library, University of California

UNIVERSITY OF CALIFORNIA, SAN DIEGO

State Machine Replication for Wide Area Networks

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Computer Science

by

Yanhua Mao

Committee in charge:
  Professor Keith Marzullo, Chair
  Professor Chung-Kuan Cheng
  Professor Curt Schurgers
  Professor Amin Vahdat
  Professor Geoffrey M. Voelker

2010

Copyright Yanhua Mao, 2010. All rights reserved.

The dissertation of Yanhua Mao is approved, and it is acceptable in quality and form for publication on microfilm and electronically:

Chair

University of California, San Diego, 2010

TABLE OF CONTENTS

Signature Page
Table of Contents
List of Figures
List of Tables
Acknowledgements
Vita and Publications
Abstract of the Dissertation

Chapter 1 Introduction

Chapter 2 Background
  2.1 System Model
    2.1.1 Asynchronous message passing model
    2.1.2 Failure models
  2.2 Consensus
  2.3 Replicated state machine
  2.4 FLP impossibility result
  2.5 Failure detectors
    2.5.1 The concept
    2.5.2 Failure detector classes
    2.5.3 The weakest failure detector for solving Consensus
  2.6 Lower bounds
  2.7 Paxos
  2.8 PBFT

Chapter 3 Benefits and challenges of wide-area networks
  3.1 Benefits
  3.2 Challenges
  3.3 Paxos vs. Fast Paxos
    3.3.1 Fast Paxos
    3.3.2 Protocol Analysis
    3.3.3 Simulation
    3.3.4 Summary
  3.4 Design considerations
  3.5 Summary
  3.6 Acknowledgement

Chapter 4 Rotating leader design for multi-site systems
  4.1 Multi-site systems
  4.2 Simple Consensus
  4.3 R: a rotating leader protocol
    4.3.1 Coordinated Protocols
    4.3.2 A generic rotating leader protocol
  4.4 Performance analysis of R
    4.4.1 Delayed commit
  4.5 Optimizations for R
    4.5.1 Revocation in large blocks
    4.5.2 Out-of-order commit
  4.6 Summary

Chapter 5 Crash Failure: Protocol Mencius
  5.1 Performance issues
  5.2 Why not existing protocols?
  5.3 Deriving Mencius
    5.3.1 Assumptions
    5.3.2 Coordinated Paxos
    5.3.3 P: A simple state machine
    5.3.4 Optimizations
  5.4 Implementation considerations
    5.4.1 Prototype implementation
    5.4.2 Recovery
    5.4.3 Revocation
    5.4.4 Commit delay and out-of-order commit
    5.4.5 Parameters
  5.5 Evaluation
    5.5.1 Experimental settings
    5.5.2 Throughput
    5.5.3 Throughput under failure
    5.5.4 Scalability
    5.5.5 Latency
    5.5.6 Other possible optimizations
  5.6 Related work
  5.7 Future directions and open issues
  5.8 Summary
  5.9 Acknowledgement

Chapter 6 Byzantine Failure: Protocol RAM
  6.1 Assumptions
  6.2 Design goals
    6.2.1 Low latency as a main goal
    6.2.2 Discouraging uncivil rational behavior
    6.2.3 Exposing irrational behavior
  6.3 Reducing latency for PBFT
    6.3.1 Mutually Suspicious Domains
    6.3.2 Rotating leader design
    6.3.3 A2M for identical Byzantine failure
  6.4 Coordinated Byzantine Paxos
    6.4.1 Revocation sub-protocol
    6.4.2 Correctness proof
    6.4.3 Performance metrics
  6.5 Deriving RAM
    6.5.1 Revocation in RAM
    6.5.2 Protocol Q
    6.5.3 Optimizations
  6.6 Discourage uncivil rational behavior
    6.6.1 Failure detection under network jitter
    6.6.2 Turnaround time advertisement
    6.6.3 Unverifiable suspicion
    6.6.4 Verifiable suspicion
  6.7 Combating irrational behavior
    6.7.1 Compromised A2M
    6.7.2 Untrustworthy local servers
    6.7.3 Ignoring revocation
    6.7.4 Latency upper bound
  6.8 Evaluation
    6.8.1 Prototype
    6.8.2 Experimental setup
    6.8.3 Latency
    6.8.4 Throughput
    6.8.5 Revocation
  6.9 Related work
  6.10 Future direction
  6.11 Summary
  6.12 Acknowledgement

Chapter 7 Conclusion

Bibliography

Appendix A Pseudo code of Coordinated Paxos
Appendix B Pseudo code of Protocol P
Appendix C Pseudo code of Mencius
Appendix D Pseudo code of Coordinated Byzantine Paxos

LIST OF FIGURES

Figure 2.1: The asynchronous message passing model
Figure 2.2: The asynchronous message passing model enhanced with failure detectors
Figure 2.3: The message flow of Paxos during failure-free runs
Figure 2.4: The message flow of Paxos leader change
Figure 2.5: The message flow of Paxos when ACCEPT messages are broadcasted
Figure 2.6: The message flow of PBFT in the steady state.
Figure 2.7: The message flow of PBFT view change
Figure 3.1: The structure of a {W×L} system
Figure 3.2: Cumulative distribution comparing Paxos and Fast Paxos, learning latency.
Figure 3.3: Cumulative distribution comparing Paxos and Fast Paxos, client latency.
Figure 4.1: A multi-site system
Figure 4.2: Delayed commit
Figure 5.1: The communication pattern in Paxos
Figure 5.2: The message flow of suggest, skip and revoke in Coordinated Paxos.
Figure 5.3: Liveness problem can occur when using Optimization M.2 without Accelerator M.1
Figure 5.4: A comparison of the communication pattern in Paxos and Mencius
Figure 5.5: Delayed commit in Mencius.
Figure 5.6: Throughput for 20 Mbps bandwidth
Figure 5.7: Mencius with asymmetric bandwidth (ρ = 4,000)
Figure 5.8: Mencius dynamically adapts to changing network bandwidth (ρ = 4,000)
Figure 5.9: The throughput of Mencius when a server crashes
Figure 5.10: The throughput of Paxos when the leader crashes
Figure 5.11: The throughput of Paxos when a follower crashes
Figure 5.12: The scalability of Paxos and Mencius
Figure 5.13: Mencius’s commit latency when client load shifts from one site to another
Figure 5.14: Commit latency distribution under low and medium load
Figure 5.15: Commit latency vs offered client load (ρ = 4,000, no network variance)
Figure 5.16: Commit latency vs offered client load (ρ = 0, no network variance)
Figure 5.17: Commit latency vs offered client load (ρ = 0, with network variance)
Figure 6.1: The relationship between different kinds of server behavior
Figure 6.2: A space time diagram showing the message flows of the suggest and skip actions in Coordinated Byzantine Paxos.
Figure 6.3: A space time diagram showing the message flow of the revoke action in Coordinated Byzantine Paxos.
Figure 6.4: Algorithm A1
Figure 6.5: Commit latency
Recommended publications
  • Failure Detectors for Wireless Sensor-Actuator Systems
    Cleveland State University, EngagedScholarship@CSU. Electrical Engineering & Computer Science Faculty Publications, Electrical Engineering & Computer Science Department. 7-2009. Failure Detectors for Wireless Sensor-Actuator Systems. Hamza A. Zia (Cleveland State University), Nigamanth Sridhar (Cleveland State University), Shivakumar Sastry (University of Akron). Part of the Digital Communications and Networking Commons. Publisher's Statement: NOTICE: this is the author’s version of a work that was accepted for publication in Ad Hoc Networks. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Ad Hoc Networks, 7, 5, (07-01-2009); 10.1016/j.adhoc.2008.09.003. Original Citation: Zia, H. A., Sridhar, N., & Sastry, S. (2009). Failure detectors for wireless sensor-actuator systems. Ad Hoc Networks, 7(5), 1001-1013. doi:10.1016/j.adhoc.2008.09.003. Repository Citation: Zia, Hamza A.; Sridhar, Nigamanth; and Sastry, Shivakumar, "Failure Detectors for Wireless Sensor-Actuator Systems" (2009). Electrical Engineering & Computer Science Faculty Publications. 64. https://engagedscholarship.csuohio.edu/enece_facpub/64
  • Understanding Distributed Consensus
    What’s Live? Understanding Distributed Consensus. Saksham Chand, Yanhong A. Liu. Computer Science Department, Stony Brook University, Stony Brook, NY 11794, USA. {schand,liu}@cs.stonybrook.edu. arXiv:2001.04787v2 [cs.DC], 21 Jun 2021. Abstract: Distributed consensus algorithms such as Paxos have been studied extensively. They all use the same definition of safety. Liveness is especially important in practice despite well-known theoretical impossibility results. However, many different liveness properties and assumptions have been stated, and there are no systematic comparisons for better understanding of these properties. This paper systematically studies and compares different liveness properties stated for over 30 prominent consensus algorithms and variants. We introduce a precise high-level language and formally specify these properties in the language. We then create a hierarchy of liveness properties combining two hierarchies of the assumptions used and a hierarchy of the assertions made, and compare the strengths and weaknesses of algorithms that ensure these properties. Our formal specifications and systematic comparisons led to the discovery of a range of problems in various stated liveness properties, from too weak assumptions for which no liveness assertions can hold, to too strong assumptions making it trivial to achieve the assertions. We also developed TLA+ specifications of these liveness properties, and we use model checking of execution steps to illustrate liveness patterns for Paxos. 1 Introduction: Distributed systems are pervasive, where processes work with each other via message passing. For example, this article is available to the current reader owing to some distributed system. At the same time, distributed systems are highly complex, due to their unpredictable asynchronous nature in the presence of failures: processes can fail and later recover, and messages can be lost, delayed, reordered, and duplicated.
  • Distributed Consensus Revised
    Technical Report UCAM-CL-TR-935, ISSN 1476-2986, Number 935. Computer Laboratory. Distributed consensus revised. Heidi Howard. April 2019. 15 JJ Thomson Avenue, Cambridge CB3 0FD, United Kingdom, phone +44 1223 763500, https://www.cl.cam.ac.uk/ © 2019 Heidi Howard. This technical report is based on a dissertation submitted September 2018 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Pembroke College. Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet: https://www.cl.cam.ac.uk/techreports/ Summary: We depend upon distributed systems in every aspect of life. Distributed consensus, the ability to reach agreement in the face of failures and asynchrony, is a fundamental and powerful primitive for constructing reliable distributed systems from unreliable components. For over two decades, the Paxos algorithm has been synonymous with distributed consensus. Paxos is widely deployed in production systems, yet it is poorly understood and it proves to be heavyweight, unscalable and unreliable in practice. As such, Paxos has been the subject of extensive research to better understand the algorithm, to optimise its performance and to mitigate its limitations. In this thesis, we re-examine the foundations of how Paxos solves distributed consensus. Our hypothesis is that these limitations are not inherent to the problem of consensus but instead specific to the approach of Paxos. The surprising result of our analysis is a substantial weakening to the requirements of this widely studied algorithm. Building on this insight, we are able to prove an extensive generalisation over the Paxos algorithm.
  • Distributed Consensus: Performance Comparison of Paxos and Raft
    Degree Project in Information and Communication Technology, Second Cycle, 30 Credits. Stockholm, Sweden 2020. Distributed Consensus: Performance Comparison of Paxos and Raft. Harald Ng. KTH Royal Institute of Technology, School of Electrical Engineering and Computer Science. Master in Software Engineering of Distributed Systems. Date: September 10, 2020. Supervisor: Lars Kroll (RISE), Max Meldrum (KTH). Examiner: Seif Haridi. Host company: RISE. Swedish title: Distribuerad Konsensus: Prestandajämförelse mellan Paxos och Raft. Abstract: With the growth of the internet, distributed systems have become increasingly important in order to provide more available and scalable applications. Consensus is a fundamental problem in distributed systems where multiple processes have to agree on the same proposed value in the presence of partial failures. Distributed consensus allows for building various applications such as lock services, configuration manager services or distributed databases. Two well-known consensus algorithms for building distributed logs are Multi-Paxos and Raft. Multi-Paxos was published almost three decades before Raft and gained a lot of popularity. However, critics of Multi-Paxos consider it difficult to understand. Raft was therefore published with the motivation of being an easily understood consensus algorithm. The Raft algorithm shares similar characteristics with a practical version of Multi-Paxos called Leader-based Sequence Paxos. However, the algorithms differ in important aspects such as leader election and reconfiguration. Existing work mainly compares Multi-Paxos and Raft in theory, but there is a lack of performance comparisons in practice. Hence, prototypes of Leader-based Sequence Paxos and Raft have been designed and implemented in this thesis.
  • The Dynamic Enterprise Bus
    Fourth International Conference on Autonomic and Autonomous Systems. The Dynamic Enterprise Bus. Dag Johansen, Håvard Johansen. Dept. of Computer Science, University of Tromsø, Norway. Abstract: In this paper we present a prototype enterprise information run-time heavily inspired by the fundamental principles of autonomic computing. Through self-configuration, self-optimization, and self-healing techniques, this run-time targets next-generation extreme scale information-access systems. We demonstrate these concepts in a processing cluster that allocates resources dynamically upon demand. 1 Introduction: Enterprise search systems are ripe for change. In their infancy, these systems were primarily used for information retrieval purposes, with main data sources being internal corporate archives. This is where autonomic computing [18] comes into play. Our goal is that information access software should keep manual control and interventions outside the computational loop as much as possible. Closely related is that such autonomic solutions should consolidate and utilize resources better, with the net effect that large-scale information access systems can be built in smaller scale. In this vein, self-configuration, self-optimization, and self-healing become important. We are building the next generation run-time for future generation information access systems. Autonomic behavior is fundamental in order to dynamically adapt and reconfigure to accommodate changing situations. Hence, the run-time needs efficient mechanisms for monitoring and controlling applications and their resource, moving applications transparently while in execution, scheduling resources in accordance to end-to-end service-level agree-
  • Paxos Made Moderately Complex
    Paxos Made Moderately Complex. ROBBERT VAN RENESSE and DENIZ ALTINBUKEN, Cornell University. This article explains the full reconfigurable multidecree Paxos (or multi-Paxos) protocol. Paxos is by no means a simple protocol, even though it is based on relatively simple invariants. We provide pseudocode and explain it guided by invariants. We initially avoid optimizations that complicate comprehension. Next we discuss liveness, list various optimizations that make the protocol practical, and present variants of the protocol. Categories and Subject Descriptors: C.2.4 [Computer-Communication Networks]: Distributed Systems - Network operating systems; D.4.5 [Operating Systems]: Reliability - Fault-tolerance. General Terms: Design, Reliability. Additional Key Words and Phrases: Replicated state machines, consensus, voting. ACM Reference Format: Robbert van Renesse and Deniz Altinbuken. 2015. Paxos made moderately complex. ACM Comput. Surv. 47, 3, Article 42 (February 2015), 36 pages. DOI: http://dx.doi.org/10.1145/2673577 1. INTRODUCTION: Paxos [Lamport 1998] is a protocol for state machine replication in an asynchronous environment that admits crash failures. It is useful to consider the terms in this sentence carefully: A state machine consists of a collection of states, a collection of transitions between states, and a current state. A transition to a new current state happens in response to an issued operation and produces an output. Transitions from the current state to the same state are allowed and are used to model read-only operations. In a deterministic state machine, for any state and operation, the transition enabled by the operation is unique and the output is a function only of the state and the operation.
  • Low-Overhead Accrual Failure Detector
    Sensors 2012, 12, 5815-5823; doi:10.3390/s120505815. Open Access. Sensors, ISSN 1424-8220, www.mdpi.com/journal/sensors. Article: Low-Overhead Accrual Failure Detector. Xiao Ren, Jian Dong *, Hongwei Liu, Yang Li and Xiaozong Yang. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China. * Author to whom correspondence should be addressed; Tel.: +86-451-8640-3317; Fax: +86-451-8641-3309. Received: 6 March 2012; in revised form: 20 April 2012 / Accepted: 25 April 2012 / Published: 4 May 2012. Abstract: Failure detectors are one of the fundamental components for building a distributed system with high availability. In order to maintain the efficiency and scalability of failure detection in a complicated large-scale distributed system, accrual failure detectors that can adapt to multiple applications have been studied extensively. In this paper, a new accrual failure detector, LA-FD, with low system overhead is proposed specifically for current mobile network equipment on the Internet whose processing power, memory space and power supply are all constrained. It does not rely on the probability distribution of message transmission time, or on the maintenance of a history message window. By simple calculation, LA-FD provides adaptive failure detection service with high accuracy to multiple upper applications. The related experiments and results are also presented. Keywords: failure detection; accrual failure detector; adaptive. 1. Introduction: A failure detector is one of the fundamental components for building a distributed system with high availability [1]. By providing the processes’ failure information to the system, it supports the solution of many basic issues (such as consensus and atomic broadcasting, etc.) in an asynchronous system.
  • A Literature Review of Failure Detection Within the Context of Solving the Problem of Distributed Consensus
    A literature review of failure detection: Within the context of solving the problem of distributed consensus, by Michael Duy-Nam Phan-Ba. B.S., The University of Washington, 2010. An essay submitted in partial fulfillment of the requirements for the degree of Master of Science in The Faculty of Graduate and Postdoctoral Studies (Computer Science), The University of British Columbia (Vancouver), April 2015. © Michael Duy-Nam Phan-Ba 2015. Abstract: As modern data centers grow in both size and complexity, the probability that components fail becomes significant enough to affect user-facing services [3]. These failures have the apparent consequence of invoking the impossibility result for distributed consensus in the presence of even one failure [15]. One way to solve the impossibility result is to use failure detectors [8]. In this essay, we present the theoretical models that allow us to solve consensus. Then, we discuss practical refinements to the models for the purposes of implementing failure detectors in practice. Finally, we conclude by surveying common design patterns for building distributed failure detectors. Table of Contents: Abstract; Table of Contents; List of Tables; List of Figures; List of Algorithms; Acknowledgements; Dedication; 1 Introduction; 2 The theory behind failure detection and consensus (2.1 The impossibility result for consensus; 2.2 A model of failure detection; 2.3 The weakest failure detector); 3 Binary failure detection with partial synchrony (3.1 Pull failure detection; 3.2 Push failure detection; 3.3 Push-pull failure detection)
  • CS5412: Lecture II How It Works
    CS5412 Spring 2016 (Cloud Computing: Birman). CS5412: PAXOS, Lecture XIII, Ken Birman. Leslie Lamport’s vision: Centers on state machine replication. We have a set of replicas that each implement some given, deterministic, state machine and we start them in the same state. Now we apply the same events in the same order. The replicas remain in the identical state. To tolerate ≤ t failures, deploy 2t+1 replicas (e.g. Paxos with 3 replicas can tolerate 1 failure). How best to implement this model? Two paths forwards: One option is to build a totally ordered reliable multicast protocol, also called an “atomic broadcast” protocol in some papers. To send a request, you give it to the library implementing that protocol (for cs5412: probably Vsync). Eventually it does upcalls to event handlers in the replicated application and they apply the event. In this approach the application “is” the state machine and the multicast “is” the replication mechanism. Use “state transfer” to initialize a joining process if we want to replace replicas that crash. A second option, explored in Lamport’s Paxos protocol, achieves a similar result but in a very different way. We’ll look at Paxos first because the basic protocol is simple and powerful, but we’ll see that Paxos is slow. Can speed it up... but doing so makes it very complex! The basic, slower form of Paxos is currently very popular. Then will look at faster but more complex reliable multicast options (many of them...). Key idea in Paxos: Quorums. Starts with a simple observation: Suppose that we lock down the membership of a system: It has replicas {P, Q, R, ..
  • INFORMATION to USERS This Manuscript Has Been Reproduced from the Microfilm Master. UMI Films the Text Directly from the Origina
    INFORMATION TO USERS. This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer. The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleed through, substandard margins, and improper alignment can adversely affect reproduction. In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion. Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-hand corner and continuing from left to right in equal sections with small overlaps. Each original is also photographed in one exposure and is included in reduced form at the back of the book. Photographs included in the original manuscript have been reproduced xerographically in this copy. Higher quality 6" x 9" black and white photographic prints are available for any photographs or illustrations appearing in this copy for an additional charge. Contact UMI directly to order. University Microfilms International, A Bell & Howell Information Company, 300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA, 313 761-4700, 800 521-0600. Order Number 9605332. Fault-tolerant distributed algorithms for consensus and termination detection. Wu, Li-Fen, Ph.D. The Ohio State University, 1994. UMI 300 N.
  • Self-Healing Distributed Systems
    Self-healing distributed systems. Dissertation for the degree of Doctor of Natural Sciences (Dr. rer. nat.) submitted to the Department of Computer Science of the University of Augsburg, presented by Benjamin Satzger in 2008. Examiner: Prof. Dr. rer. nat. Theo Ungerer. Co-examiner: Prof. Dr. rer. nat. Bernhard Bauer. Date of oral examination: 2008-12-18. Abstract: The growing complexity of distributed systems demands new ways of control. This work addresses self-healing in distributed environments. The term self-healing represents a quite new area of research and is used in a fairly broad way, but can be seen as dynamic fault tolerance. This work proposes generic concepts and algorithms to build self-healing systems. The detection of node failures in distributed environments is a non-trivial problem. Failure detectors are an important component of many fault tolerant distributed systems. In this work a new failure detection algorithm is proposed with noteworthy features like a high flexibility and good performance. Furthermore an approach is presented to save the message overhead of failure detectors. New grouping algorithms are introduced in this work to enable a scalable self-monitoring property. This allows an autonomous installation of monitoring relations in complex large scale distributed systems. A failure recovery engine based on automated planning, which manages a distributed system according to user-defined objectives, is proposed. It is able to generate and execute plans to autonomously recover a system from unwanted states. Finally, ideas for a generic self-healing architecture for highly complex distributed systems are presented. The design is based on psychological and sociological concepts.