Specification-Based Design of Self-Stabilization
Total Page:16
File Type:pdf, Size:1020Kb
1 Specification-based Design of Self-Stabilization Murat Demirbas, Member, IEEE, and Anish Arora, Member, IEEE Abstract—Research in system stabilization has traditionally reliedontheavailabilityofacompletesystemimplementation.Assuch,it would appear that the scalability and reusability of stabilization is limited in practice. To redress this perception, in this paper, we show for the first time that system stabilization may be designed knowing only the system specification but not the system implementation. We refer to stabilization designed thus as specification-based design of stabilization and identify “local everywhere specifications” and “convergence refinements” as being amenable to the specification-based design of stabilization. Using our approach, we present the design of Dijkstra’s 4-state stabilizing token-ring systemstartingfromanabstractfault-intoleranttoken-ringsystem. We also present an illustration of automated design of specification-based stabilization on a 3-state token-ring system. Index Terms—Self-stabilization, Fault-tolerance preserving refinements, Distributed systems. ✦ 1INTRODUCTION we show for the first time that system stabilization can be achieved without knowledge of implementation ELF-STABILIZATION research [7], [8], [10], [11], [12] details. We eschew “whitebox” knowledge—of system focuses on how systems deal with arbitrary state S implementation—in favor of “graybox” knowledge— corruption. A stabilizing system is such that every com- of system specification—for the design of stabilization. putation of the system, upon starting from an arbitrary Since specifications are typically more succinct than state, eventually reaches a state from where the compu- implementations, specification-based design of fault- tation is “correct”. The motivation for considering arbi- tolerance offers the promise of scalability when the trary state-corruption is to account for the unanticipated design effort for adding fault-tolerance is proportional faults that results from complex interactions of faults to the size of the specification. Also, since specifications and system components. Self-stabilization offers a formal admit multiple implementations and since system com- state-based verification and design technique that shuns ponents are often reused, specification-based design of case-by-case analysis of faults and recovery in favor fault-tolerance offers the promise of reusability. Finally, of a uniform mechanism. To this end, research in self- for closed-source situations where exploiting a specifica- stabilization has traditionally relied on the availability tion is warranted, specification-based approach allows of a complete system implementation. The standard ap- the design of efficient fault-tolerance in contrast to a proach uses knowledge of all implementation variables blackbox design. and actions to exhibit an “invariant” condition such that Given a high-level system specification A,the if the system is properly initialized then the invariant specification-based approach is to design a tolerance is always satisfied and if the system is placed in an wrapper W such that adding W to A yields a fault- arbitrary state then continued execution of the system tolerant system. The goal is to ensure that for any low- eventually reaches a state from where the invariant is level refinement (implementation) C of A adding a low- always satisfied. level refinement W ! of W would also yield a fault- The apparently intimate connection between stabiliza- tolerant system. tion and the details of implementation has raised the fol- Note that since the refinements from A to C and W lowing serious concerns: (1) Stabilization is not feasible ! for many applications whose implementation details are to W can be done independently, specification-based not available, for instance, closed-source applications. (2) design enables a posteriori or dynamic addition of fault- Even if implementation details are available, stabiliza- tolerance. That is, given a concrete implementation C,it tion is not scalable as the complexity of calculating the is possible to add fault-tolerance to C as follows: invariant of large implementations may be exorbitant. • First, design an abstract (high-level) tolerance wrap- (3) Stabilization lacks reusability since it is specific to a per W using solely an abstract specification A of C, particular implementation. and then ! Towards addressing these concerns, in this paper, • add a concrete (low-level) refinement W of W to C. The goal of specification-based fault-tolerance is not • M. Demirbas is with the Department of Computer Science and Engineer- readily achieved for all refinements. The refinements ing, University at Buffalo, SUNY, Buffalo, NY, 14260. E-mail: [email protected] we need for achieving specification-based fault-tolerance • A. Arora is with the Department of Computer Science and Engineering, should not only preserve fault-tolerance but also have The Ohio State University, Columbus, OH, 43210. nice composability features so that the refinements E-mail: [email protected] from A to C and W to W ! can be done indepen- 2 dently. In this paper we present special classes of re- systems use a different state space than the abstract. finement, “everywhere refinements”, “local-everywhere Definition. C is a refinement of A,denoted[C ⊆ A]init, refinements”, and “convergence refinements”, that en- iff every computation of C that starts from an initial state able specification-based design of stabilization. These is a computation of A. refinements ensure that if A composed with W is fault- Definition. C is an everywhere refinement [2] of A,denoted tolerant, then for any everywhere or convergence re- [C ⊆ A],iffeverycomputationofC is a computation finement C of A adding an everywhere or convergence of A. ! refinement W of W would also yield a fault-tolerant sys- Definition.Astatesequencec is a convergence isomorphism tem. Using these refinements, in this paper, we present of a state sequence a iff c is a subsequence of a with at the design of Dijkstra’s 4-state and 3-state stabilizing most a finite number of omissions and with the same token-ring systems as illustrations. initial and final (if any) state as a. Outline of the rest of the paper. In Section 2, For instance, c = s1 s3 s6 is a convergence iso- we give preliminaries. In Section 3 we show that ev- morphism of a = s1 s2 s3 s4 s5 s6. However, erywhere and convergence refinements are stabilization c = s1 s3 s5 s6 is not aconvergenceisomorphism preserving and in Section 4 that they are amenable for of a = s1 s2 s5 s6 since c can only drop states in a, specification-based design of stabilization. In Section 5, and cannot insert states to a. Intuitively, the convergence using specification-based approach we present a deriva- isomorphism requirement corresponds to the notion of tion of Dijkstra’s 4-state stabilizing token-ring algorithm using similar recovery paths: c should use a similar as a refinement of a simple, abstract token-ring algo- recovery path with a and not any arbitrary recovery rithm. We present related work in Section 6 and make path. concluding remarks in Section 7. For reasons of space, Definition. C is a convergence refinement of A,denoted we relegate most of the proofs and an illustration of [C # A],iff: automated synthesis of specification-based stabilization • C is a refinement of A, on the token-ring example to the supporting material • every computation of C is a convergence isomor- associated with this article. phism of some computation of A. Convergence refinements are more general than every- where refinements: [C ⊆ A] ⇒ [C # A],butnotvice 2PRELIMINARIES versa. A fault is a perturbation of the system state. Here, Let Σ be a state space. we focus on transient faults that may arbitrarily corrupt Definition.Asystem S is a finite-state automaton (Σ, T , the process states. The following definition captures a I)whereT ,thesetoftransitions,isasubsetof{(s0,s1): standard tolerance to transient faults. s0,s1 ∈ Σ} and I,thesetofinitialstates,isasubsetof Definition. C is stabilizing to A iff every computation of Σ. C has a suffix that is a suffix of some computation of A AcomputationofS is a maximal sequence of states that starts at an initial state of A. such that every state is related to the subsequent one This definition of stabilization allows the possibility with a transition in T ,i.e.,ifacomputationisfinitethere that A is stabilizing to A,thatis,A is self-stabilizing. are no transitions in T that start at the final state. We refer to an abstract system as a specification,andto 3STABILIZATION PRESERVING REFINEMENTS aconcretesystemasanimplementation.Forconvenience in our definitions and theorems, we pretend that the Refinement tools such as compilers, program transform- specification and the implementation use the same state ers, and code optimizers generally do not preserve the space. In general, the state space of the implementation fault-tolerance properties of their input programs. Con- can be different than that of the specification since the sider, for example, a program that is trivially tolerant implementations often introduce some components of to the corruption of a variable x in that it eventually states that are not used by the specifications. We handle ensures x is always 0. this by relating the states of the concrete implementation int x=0; with the abstract specification via an abstraction func- while(x==x) { tion. The abstraction function is a total mapping from x=0;} ΣC ,thestatespaceoftheimplementationC, onto ΣA, the state space of the specification A.Thatis,everystate