Power-Hierarchy of Dependability-Model Types 495

IEEE TRANSAnIONS ON RELIABILITY, VOL. 43, NO. 3, 1994 SEPTEMBER 493 Power-Hierarchy of Dependability -Model Types Manish Malhotra Over the years, several model types such as reliability block AT&T Bell Laboratories, Holmdel diagrams, fault trees, and Markov chains, have been used to Kishor S. Trivedi, Fellow IEEE model fault-tolerant systems and to evaluate various dependabili- Duke University, Durham ty measures. These model types differ from one another not only in the ease of use in a particular application but in terms of modeling power. For instance, a series-parallel2 system is Key Wonis - Combinatorial-model type, dependability, fault- reasonably modeled by a series-parallel reliability block tree, generalized stodmhc' Petri net, Markov-model type, reliabii diagram. Similarly, fault trees are more intuitive in capturing block diagram, reliability graph, stochastic reward net. how a component failure propagates into a higher level subsystem or system failure. Thus some model types lend Reader Aids - themselves easily to model certain kind of behavior of systems. General purpose: Widen the state-of-the-art Modeling power of a model type is determined by the kinds Special math needed for explanations: Discrete mathematics of dependencies within subsystems that can be modeled and the Special math needed to use results: Same kind of dependability measures that can be computed. For in- Results useful to: Reliability modelers stance, if various components of a system share a repair per- son (repair dependency among components), then FT or RBD Summary & Conclusions - This paper formally establishes cannot easily be used to model the availability of this system. a hierarchy, among the most commonly used types of dependability Markov chains and stochastic Petri nets can easily model such models, according to their modeling power. Among the com- a repair dependency. binatorial (non-state-space) model types, we show that fault trees From a variety of model types, a particular model type with repeated events are the most powerful in terms of kinds of is chosen to specify a model. The choice of a suitable model dependencies among various system components that can be model- type is determined by factors such as: ed (which is one metric of modeling power). Reliability graphs are less powerful than fault trees with repeated events but more powerful than reliabrrity block diagrams and fault trees without repeated Constraints events. By virtue of the constructive nature of our proofs, we provide algorithms for converting from one model type to another. Familiarity of the user with the model type Among the Markov (state-space) model types, we consider The model type supported by the available modeling tool-kit continuous-time Markov chains, generalized stochastic Petri nets, Markov reward models, and &ochasW' reward wts. These are more Choices powerful than combinatorial-modeltypes in that they can capture dependencies such as a shared repair facility between system com- Ease of use in a particular application ponents. However, they are analytically tractable only under cer- The kind of system and system behavior to be modeled tain distributional assumptions such as exponential failure- & The measure of system behavior to be computed repair-time distributions. They are also subject to an exponential- Conciseness and ease of model specification ly large state space. The equivalence among various Markov-model types is well known and thus only briefly discussed. This paper analyzes the choices category and ignores the constraints category (although it is obviously important in some situations). The modeler's decision process can be greatly 1. INTRODUCTION' simplified by comparing model types according to: modeling power Fault-tolerant computer systems are used in a variety of conciseness of model specification. applications that require high reliability or availability. For instance, computer systems in flight-control in aircraft & Little has been done to compare formally the dependability- spacecraft require that the system provide service, without fail- model types. Ref [2,3] summarize the dependability-model types. ing, until the end of mission. Such systems have a high reliability These studies informally discuss model types, the kinds of requirement. On the other hand, computer systems in database dependencies that can be modeled by them, and dependability applications and communication networks are required to be measures that can be evaluated using these model types. However, operational for as high a fraction of time as possible (there is no critical mission time in this case). Such systems are required to possess high availability. Laprie [ 11 coined the term depen- 'Acronyms, nomenclature, and notation are given at the end of the dability as a measure of the quality, correctness, and continui- Introduction. ty of service delivered by a system. Dependability encompasses 'The terms, series / parallel are used in their logicdiagram sense, ir- measures such as reliability, availability, and safety. respective of the schematicdiagram or physical-layout. 0018-9529/94/$4.00 01994 IEEE 494 IEEE TRANSACTIONS ON RELIABILITY, VOL. 43, NO. 3, 1994 SEPTEMBER to the best of our knowledge, there has been no formal com- Notation parative evaluation of model types except for the following studies. Mi, Pi [memory, processor] module i N interconnection network Using probabilistic arguments, Shooman [4] showed the U, I/ set of [nodes, edges] in a digraph equivalence of RBD & FT (without repeated events); ie, any G reliability graph system that can be modeled by RBD can also be modeled by e;, wi [edge, node] i in a reliability graph FT and vice-versa. Gi gate i in a fault tree Hura & Atwood [5] showed how Petri net models can repre- Dij disk j of processing subsystem i sent s-coherent fault trees. They showed that an equivalent e number of edges in a reliability graph Petri net representation allows study of dynamic behavior of px,y, pi,.x,y [path, subpath i] from node n to node y in a the model and offers more insightful treatment of fault detec- reliability graph tion & propagation. However, they do not show that Petri si state i of a CTMC nets can model certain systems which can not be modeled by oi initial probability of being in si FT. 4 Pi place i in a Petri net transition from pi (i # 0) to pj in a Petri net This paper is mainly concerned with the modeling power fij of the following dependability-model types: ri rate associated with ti immediate transition between place po and place pi reliability block diagrams Ci component i fault trees without repeated events hi, pi [failure, repair] rate of Ci fault trees with repeated events ri reward rate associated with state si of CTMC. reliability graphs continuous-time Markov chains Other, standard notation is given in “Information for Readers generalized stochastic Petri nets & Authors” at the rear of each issue. Markov reward models stochastic reward nets. 2. FAULT-TOLERANT MULTIPROCESSOR SYSTEM We compare the model types and establish a hierarchy of model AN EXAMPLE types on the basis of their modeling power. For example, to compare model types A & B, we - A fault-tolerant multiprocessor system is a running exam- either provide an algorithm that converts any instance of ple in this paper. Figure 1 shows the basic multiprocessor ar- model type A to an equivalent instance of model type B (and chitecture; it consists of two processors PI & P2, each with a vice-versa) , private memory MI& M2 respectively. A processor and its or prove that not every instance of model type A can be con- memory form a processing unit. Each processing unit is con- verted to an equivalent instance of model type B. nected to a mirrored-disk system. This forms a processing subsystem. Both processing units are connected via an intercon- Some of the relationships our study reveal are obvious and some nection network N. The system is functional while N is func- are not so obvious. Our aim is to provide a modeler with a power- tional and at least one of the processing subsystems is functional. hierarchy of dependability-modeltypes which enable the modeler For a processing subsystem to be functional, the processor, to select from a variety of model types for a given problem. memory module, and at least one of the two disks must be func- Section 2 describes the fault-tolerant multiprocessor system tional. For simplicity & illustration, we restrict ourselves to this that is the illustrative example in this paper. Section 3 describes 2-processor system. This architecture and the corresponding combinatorial-model types. Section 4 establishes power- models are easily scaled to many processors. hierarchy among combinatorial-model types. Section 5 briefly discusses Markov-model types and compares them to combinatorial-model types. Section 6 shows the overall power 3. COMBINATORIAL MODEL TYPES hierarchy. Acronyms-’ 3.1 Reliability Block Diagrams CTMC continuous time Markov chain RBD fall into the category of combinaforiul (also known FT fault tree (without repeated events) as non-sfufe-space)model types [2, 31. They map the opera- FTRE fault tree with repeated events tional dependency of a system on its components and not the GSPN generalized stochastic Petri net actual physical structure. In Shooman’s [4] words, RBD repre- MRM Markov reward model sent the probability-of-success approach to system modeling. RBD reliability block diagram RG reliability graph SRN stochastic reward net. %e singular / plural of an acronym are always spelled the same. MALHOTWRIVEDI: POWER-HIERARCHY OF DEPENDABILITY-MODEL TYPES 495 Some researchers have used RBD with repeated blocks [6, 71. However, we use RBD without repeated blocks. 3.2 Fault Trees Without Repeated Events Like RBD, a FT is also a combinatorial-model type and maps the operational dependency of a system on its components. However, unlike RBD, FT represent a probability-of-failure approach to system modeling [4].

Power-Hierarchy of Dependability-Model Types 495

Balancing Dependability Quality Attributes Relationships for Increased Embedded Systems Dependability

A Reasoning Framework for Dependability in Software Architectures Tacksoo Im Clemson University, [email protected]

Fundamental Concepts of Dependability

Dependability Assessment of Software- Based Systems: State of the Art

Software Reliability and Dependability: a Roadmap Bev Littlewood & Lorenzo Strigini

Critical Systems

A Practical Framework for Eliciting and Modeling System Dependability Requirements: Experience from the NASA High Dependability Computing Project

A Brief Essay on Software Testing

Dependability Analysis of a Safety Critical System: the LHC Beam Dumping System at CERN

Empirical Evaluation of the Effectiveness and Reliability of Software Testing Adequacy Criteria and Reference Test Systems

Dependability in Cloud Computing

Model-Driven Tools for Dependability Management in Component-Based Distributed Systems