Fundamental Concepts of Dependability

Algirdas Avižienis
UCLA, Los Angeles, CA, USA, and Vytautas Magnus U., Kaunas, Lithuania
[email protected]

Jean-Claude Laprie
LAAS-CNRS, Toulouse, France
[email protected]

Brian Randell
Dept. of Computing Science, U. of Newcastle upon Tyne, U.K.
[email protected]

(The alphabetic ordering of the authors' names does not imply any seniority of authorship.)

1. Origins and Integration of the Concepts

The concept of dependable computing first appears in the 1830s in the context of Babbage's Calculating Engine [1,2]. The first generation of electronic computers (late 1940s to mid-1950s) used rather unreliable components; practical techniques were therefore employed to improve their reliability, such as error control codes, duplexing with comparison, triplication with voting, diagnostics to locate failed components, etc. [3-5].

At the same time J. von Neumann [6], E. F. Moore and C. E. Shannon [7], and their successors developed theories of using redundancy to build reliable logic structures from less reliable components, whose faults were masked by the presence of multiple redundant components. The theories of masking redundancy were unified by W. H. Pierce as the concept of failure tolerance in 1965 [8]. In 1967, A. Avizienis integrated masking with the practical techniques of error detection, fault diagnosis, and recovery into the concept of fault-tolerant systems [9]. In the reliability modeling field, the major event was the introduction of the coverage concept by Bouricius, Carter and Schneider [10]. Seminal work on software fault tolerance was initiated by B. Randell [11,12]; it was later complemented by N-version programming [13].

The formation of the IEEE-CS TC on Fault-Tolerant Computing in 1970 and of IFIP WG 10.4 Dependable Computing and Fault Tolerance in 1980 accelerated the emergence of a consistent set of concepts and terminology. Seven position papers were presented in 1982 at FTCS-12 [14], and J.-C. Laprie formulated a synthesis in 1985 [15]. Further work by members of IFIP WG 10.4, led by J.-C. Laprie, resulted in the 1992 book Dependability: Basic Concepts and Terminology [16], in which the English text was also translated into French, German, Italian, and Japanese.

In [16], intentional faults (malicious logic, intrusions) were listed along with accidental faults (physical, design, or interaction faults). Exploratory research on the integration of fault tolerance and the defenses against intentional faults, i.e., security threats, was started at the RAND Corporation [17], the University of Newcastle [18], LAAS [19], and UCLA [20,21] in the mid-1980s. The 1st IFIP Working Conference on Dependable Computing for Critical Applications was held in 1989. This and the six working conferences that followed fostered the interaction of the dependability and security communities, and advanced the integration of security (confidentiality, integrity and availability) into the framework of dependable computing [22]. A summary of [22] is presented next.

2. The Principal Concepts: a Summary

A systematic exposition of the concepts of dependability consists of three parts: the threats to, the attributes of, and the means by which dependability is attained, as shown in Figure 1.

    DEPENDABILITY
      ATTRIBUTES: AVAILABILITY, RELIABILITY, SAFETY, CONFIDENTIALITY, INTEGRITY, MAINTAINABILITY
      MEANS:      FAULT PREVENTION, FAULT TOLERANCE, FAULT REMOVAL, FAULT FORECASTING
      THREATS:    FAULTS, ERRORS, FAILURES

Figure 1 - The dependability tree

Computing systems are characterized by four fundamental properties: functionality, performance, cost, and dependability. Dependability of a computing system is the ability to deliver service that can justifiably be trusted. The service delivered by a system is its behavior as it is perceived by its user(s); a user is another system (physical, human) that interacts with the former at the service interface. The function of a system is what the system is intended for, and is described by the system specification.

2.1. The Threats: Faults, Errors, and Failures

Correct service is delivered when the service implements the system function. A system failure is an event that occurs when the delivered service deviates from correct service. A system may fail either because it does not comply with the specification, or because the specification did not adequately describe its function. A failure is a transition from correct service to incorrect service, i.e., to not implementing the system function. A transition from incorrect service to correct service is service restoration. The time interval during which incorrect service is delivered is a service outage. An error is that part of the system state that may cause a subsequent failure: a failure occurs when an error reaches the service interface and alters the service. A fault is the adjudged or hypothesized cause of an error. A fault is active when it produces an error; otherwise it is dormant.

A system does not always fail in the same way. The ways a system can fail are its failure modes. As shown in Figure 2, the modes characterize incorrect service according to three viewpoints: a) the failure domain, b) the perception of a failure by system users, and c) the consequences of failures on the environment.

    DOMAIN:                      VALUE FAILURES, TIMING FAILURES
    PERCEPTION BY SEVERAL USERS: CONSISTENT FAILURES, INCONSISTENT FAILURES
    CONSEQUENCES ON ENVIRONMENT: MINOR FAILURES, ..., CATASTROPHIC FAILURES

Figure 2 - The failure modes

A system consists of a set of interacting components; therefore the system state is the set of its component states. A fault originally causes an error within the state of one (or more) components, but system failure will not occur as long as the error does not reach the service interface of the system. A convenient classification of errors is to describe them in terms of the component failures that they cause, using the terminology of Figure 2: value vs. timing errors; consistent vs. inconsistent ('Byzantine') errors when the output goes to two or more components; errors of different severities: minor vs. ordinary vs. catastrophic errors. An error is detected if its presence in the system is indicated by an error message or error signal that originates within the system. Errors that are present but not detected are latent errors.

Faults and their sources are very diverse. Their classification according to six major criteria is presented in Figure 3.

    PHENOMENOLOGICAL CAUSE:          NATURAL FAULTS, HUMAN-MADE FAULTS
    INTENT:                          ACCIDENTAL FAULTS, DELIBERATE, NON-MALICIOUS FAULTS, DELIBERATELY MALICIOUS FAULTS
    PHASE OF CREATION OR OCCURRENCE: DEVELOPMENTAL FAULTS, PRODUCTION FAULTS, OPERATIONAL FAULTS
    DOMAIN:                          PHYSICAL FAULTS, INFORMATION FAULTS
    SYSTEM BOUNDARIES:               INTERNAL FAULTS, EXTERNAL FAULTS
    PERSISTENCE:                     PERMANENT FAULTS, TRANSIENT FAULTS

Figure 3 - Elementary fault classes

Of special interest for this Workshop are faults that result from attacks on a system. They are classified as deliberately malicious (d.m.) faults and include malicious logic [23] (Trojan horses, logic bombs and trapdoors are d.m. design faults, while viruses and worms are d.m. operational faults), intrusions (d.m. external operational faults), and physical attacks on a system (d.m. physical operational faults). Figure 4 shows the combined fault classes for which defenses need to be devised.

Fault pathology, i.e., the relationship between faults, errors, and failures, is summarized by Figure 5, which gives the fundamental chain of threats to dependability. The arrows in this chain express a causality relationship between faults, errors and failures. They should be interpreted generically: by propagation, several errors can be generated before a failure occurs.

2.2. The Attributes of Dependability

Dependability is an integrative concept that encompasses the following attributes: availability: readiness for correct service; reliability: continuity of correct service; safety: absence of catastrophic consequences on the user(s) and the environment; confidentiality: absence of unauthorized disclosure of information; integrity: absence of improper system state alterations; maintainability: ability to undergo repairs and modifications.

Security is the concurrent existence of a) availability for authorized users only, b) confidentiality, and c) integrity, with 'improper' meaning 'unauthorized'.

The above attributes may be emphasized to a greater or lesser extent depending on the application: availability is always required, although to a varying degree, whereas reliability, safety and confidentiality may or may not be required. The extent to which a system possesses the attributes of dependability should be interpreted in a relative, probabilistic sense, and not in an absolute, deterministic sense: due to the unavoidable presence or occurrence of faults, systems are never totally available, reliable, safe, or secure.

Integrity is a prerequisite for availability, reliability and safety, but may not be so for confidentiality (for instance, attacks via covert channels or passive listening can lead to a loss of confidentiality without impairing integrity). The definition given for integrity (absence of improper system state alterations) extends the usual definition as follows: (a) when a system implements an authorization policy, 'improper' encompasses 'unauthorized'; (b) 'improper alterations' encompass actions resulting in preventing (correct) upgrades of information; (c) 'system state' encompasses hardware modifications or damages. The definition given for maintainability goes beyond corrective and preventive maintenance, and encompasses two other forms of maintenance: adaptive and perfective maintenance.
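The probabilistic reading of the attributes of section 2.2 can be made concrete for availability, which is quantified from the alternation of correct and incorrect service. A minimal sketch (the function and the log format are ours, not the paper's): availability is estimated as the fraction of observed time during which correct service was delivered.

```python
# Illustrative sketch (not from the paper): availability estimated from a log
# of (delivered_correct_service, duration_in_hours) intervals.

def availability(service_log):
    """Estimate availability from an alternation of correct/incorrect service."""
    uptime = sum(hours for ok, hours in service_log if ok)
    total = sum(hours for _, hours in service_log)
    return uptime / total if total else 0.0

# Two outage cycles: 99 h of correct service followed by a 1 h service outage.
service_log = [(True, 99.0), (False, 1.0), (True, 99.0), (False, 1.0)]
print(availability(service_log))  # 0.99
```

This is the simplest possible estimator; section 2.3.4 describes the modeling and evaluation-testing approaches used in practice.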

Figure 4 - Combined fault classes
[Matrix, not reproduced here: its rows are the elementary fault classes of Figure 3 (natural, human-made; accidental, deliberate non-malicious, deliberately malicious; developmental, production, operational; physical, information; internal, external; permanent, transient) and its columns group them into the three combined classes for which defenses need to be devised: design faults, physical faults, and interaction faults.]
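The six classification criteria of Figure 3 are orthogonal, so a fault class can be encoded as one choice per criterion. A sketch of such an encoding (the encoding is ours; section 2.1 supplies the classification of Trojan horses as deliberately malicious design faults, and the remaining three criteria are our reading):

```python
# Illustrative encoding (ours) of the six fault-class criteria of Figure 3.
from dataclasses import dataclass
from enum import Enum

class Cause(Enum):
    NATURAL = "natural"
    HUMAN_MADE = "human-made"

class Intent(Enum):
    ACCIDENTAL = "accidental"
    NON_MALICIOUS = "deliberate, non-malicious"
    MALICIOUS = "deliberately malicious"

class Phase(Enum):
    DEVELOPMENTAL = "developmental"
    PRODUCTION = "production"
    OPERATIONAL = "operational"

class Domain(Enum):
    PHYSICAL = "physical"
    INFORMATION = "information"

class Boundary(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"

class Persistence(Enum):
    PERMANENT = "permanent"
    TRANSIENT = "transient"

@dataclass
class FaultClass:
    cause: Cause
    intent: Intent
    phase: Phase
    domain: Domain
    boundary: Boundary
    persistence: Persistence

# Section 2.1 classifies Trojan horses as d.m. design (developmental) faults;
# the domain, boundary and persistence choices below are our reading.
trojan = FaultClass(Cause.HUMAN_MADE, Intent.MALICIOUS, Phase.DEVELOPMENTAL,
                    Domain.INFORMATION, Boundary.INTERNAL, Persistence.PERMANENT)
print(trojan.intent.value, trojan.phase.value)  # deliberately malicious developmental
```

A combined class such as "design fault" then corresponds to the subset of tuples with `phase` in the developmental range, which is how the rows of Figure 4 map onto its columns.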

... -> fault -(activation)-> error -(propagation)-> failure -(causation)-> fault -> ...

Figure 5 - The fundamental chain of threats to dependability
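The chain can be illustrated with a toy program (the example is entirely ours, not the paper's). Suppose a component is specified to return 0 for inputs below 10 and twice the input otherwise, but the developer wrote `<=` where the specification says `<`: a dormant development fault, activated only by the input 10.

```python
# Toy illustration (ours) of the chain fault -> error -> failure.

def buggy_component(x):
    # Dormant fault: '<=' was written where the specification says '<'.
    if x <= 10:        # activation: the input x == 10 exercises the fault
        return 0       # error: a wrong value enters the system state
    return x

def service(x):
    # The component's output propagates toward the service interface.
    y = buggy_component(x)
    return y * 2       # for x == 10 the delivered service deviates: a failure

print(service(3))   # 0  (correct service)
print(service(10))  # 0  (failure: the specification requires 20)
print(service(11))  # 22 (correct service)
```

For every other input the fault stays dormant and correct service is delivered, which is why the arrows of Figure 5 must be read as a causal chain, not as an inevitability.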

The variations in the emphasis on the different attributes of dependability directly affect the appropriate balance of the techniques (fault prevention, tolerance, removal and forecasting) to be employed in order to make the resulting system dependable. This problem is all the more difficult as some of the attributes conflict (e.g. availability and safety, availability and security), necessitating design trade-offs.

Security has not been introduced as a single attribute of dependability. This is in agreement with the usual definitions of security, which view it as a composite notion, namely [24] "the combination of (1) confidentiality (the prevention of the unauthorized disclosure of information), (2) integrity (the prevention of the unauthorized amendment or deletion of information), and (3) availability (the prevention of the unauthorized withholding of information)". A single definition for security could be: the absence of unauthorized access to, or handling of, system state.

In their definitions, availability and reliability emphasize the avoidance of failures, while safety and security emphasize the avoidance of a specific class of failures (catastrophic failures, and unauthorized access or handling of information, respectively). Reliability and availability are thus closer to each other than they are to safety on the one hand, and to security on the other; reliability and availability can be grouped together, and collectively defined as the avoidance or minimization of service outages.

2.3. The Means to Attain Dependability

The development of a dependable computing system calls for the combined utilization of a set of four techniques: fault prevention: how to prevent the occurrence or introduction of faults; fault tolerance: how to deliver correct service in the presence of faults; fault removal: how to reduce the number or severity of faults; fault forecasting: how to estimate the present number, the future incidence, and the likely consequences of faults.

2.3.1. Fault Prevention

Fault prevention is attained by quality control techniques employed during the design and manufacturing of hardware and software. They include structured programming, information hiding, modularization, etc., for software, and rigorous design rules for hardware. Operational physical faults are prevented by shielding, radiation hardening, etc., while interaction faults are prevented by training, rigorous procedures for maintenance, and "foolproof" packages. Malicious faults are prevented by firewalls and similar defenses.

2.3.2. Fault Tolerance

Fault tolerance is intended to preserve the delivery of correct service in the presence of active faults. It is generally implemented by error detection and subsequent system recovery.

Error detection originates an error signal or message within the system. An error that is present but not detected is a latent error. There exist two classes of error detection techniques: (a) concurrent error detection, which takes place during service delivery; and (b) preemptive error detection, which takes place while service delivery is suspended; it checks the system for latent errors and dormant faults.

Recovery transforms a system state that contains one or more errors and (possibly) faults into a state without detected errors and without faults that can be activated again. Recovery consists of error handling and fault handling. Error handling eliminates errors from the system state. It may take two forms: (a) rollback, where the state transformation consists of returning the system to a saved state that existed prior to error detection; that saved state is a checkpoint; (b) rollforward, where the state without detected errors is a new state.

Fault handling prevents located faults from being activated again. Fault handling involves four steps: (a) fault diagnosis, which identifies and records the cause(s) of error(s), in terms of both location and type; (b) fault isolation, which performs physical or logical exclusion of the faulty components from further participation in service delivery, i.e., it makes the fault dormant; (c) system reconfiguration, which either switches in spare components or reassigns tasks among non-failed components; (d) system reinitialization, which checks, updates and records the new configuration and updates system tables and records. Usually, fault handling is followed by corrective maintenance that removes faults isolated by fault handling. The factor that distinguishes fault tolerance from maintenance is that maintenance requires the participation of an external agent.

The use of sufficient redundancy may allow recovery without explicit error detection. This form of recovery is called fault masking.

Preemptive error detection and handling (often called BIST: built-in self-test), possibly followed by fault handling, is performed at system power-up. It also comes into play during operation, under various forms such as spare checking, memory scrubbing, audit programs, or the so-called software rejuvenation, aimed at removing the effects of software aging before they lead to failure.

The choice of error detection, error handling and fault handling techniques, and of their implementation, is directly related to the underlying fault assumption: the classes of faults that can actually be tolerated depend on the fault assumption made in the design process. A widely used method of fault tolerance is to perform multiple computations in multiple channels, either sequentially or concurrently. When tolerance of operational physical faults is required, the channels may be of identical design, based on the assumption that hardware components fail independently. Such an approach has proven to be adequate for elusive design faults, via rollback; however, it is not suitable for the tolerance of solid design faults, which necessitates that the channels implement the same function via separate designs and implementations, i.e., through design diversity.

Fault tolerance is a recursive concept: it is essential that the mechanisms that implement fault tolerance be protected against the faults that might affect them. Examples are voter replication, self-checking checkers, 'stable' memory for recovery programs and data, etc. Systematic introduction of fault tolerance is facilitated by the addition of support systems specialized for fault tolerance, such as software monitors, service processors, and dedicated communication links.

Fault tolerance is not restricted to accidental faults. Some mechanisms of error detection are directed towards both malicious and accidental faults (e.g. memory access protection techniques), and schemes have been proposed for the tolerance of both intrusions and physical faults, via information fragmentation and dispersal, as well as for the tolerance of malicious logic, and more specifically of viruses, either via control flow checking or via design diversity.

2.3.3. Fault Removal

Fault removal is performed both during the development phase and during the operational life of a system. Fault removal during the development phase of a system life-cycle consists of three steps: verification, diagnosis, correction. Verification is the process of checking whether the system adheres to given properties, termed the verification conditions. If it does not, the other two steps follow: diagnosing the fault(s) that prevented the verification conditions from being fulfilled, and then performing the necessary corrections.

Checking the specification is usually referred to as validation. Uncovered specification faults can happen at any stage of the development, either during the specification phase itself, or during subsequent phases when evidence is found that the system will not implement its function, or that the implementation cannot be achieved in a cost-effective way.

Verification techniques can be classified according to whether or not they involve exercising the system. Verifying a system without actual execution is static verification. Verifying a system through exercising it constitutes dynamic verification; the inputs supplied to the system can be either symbolic, in the case of symbolic execution, or actual, in the case of verification testing, usually simply termed testing. An important aspect is the verification of fault tolerance mechanisms, especially a) formal static verification, and b) testing that necessitates faults or errors to be part of the test patterns, which is usually referred to as fault injection. Verifying that the system cannot do more than what is specified is especially important with respect to what the system should not do, thus with respect to safety and security. Designing a system in order to facilitate its verification is termed design for verifiability. This approach is well developed for hardware with respect to physical faults, where the corresponding techniques are termed design for testability.

Fault removal during the operational life of a system is corrective or preventive maintenance. Corrective maintenance aims to remove faults that have produced one or more errors and have been reported, while preventive maintenance aims to uncover and remove faults before they might cause errors during normal operation. The latter faults include a) physical faults that have occurred since the last preventive maintenance actions, and b) design faults that have led to errors in other similar systems. Corrective maintenance for design faults is usually performed in stages: the fault may first be isolated (e.g., by a workaround or a patch) before the actual removal is completed. These forms of maintenance apply to non-fault-tolerant systems as well as to fault-tolerant systems, which can be maintainable on-line (without interrupting service delivery) or off-line (during service outage).

2.3.4. Fault Forecasting

Fault forecasting is conducted by performing an evaluation of the system behavior with respect to fault occurrence or activation. Evaluation has two aspects: (1) qualitative, or ordinal, evaluation, which aims to identify, classify and rank the failure modes, or the event combinations (component failures or environmental conditions) that would lead to system failures; (2) quantitative, or probabilistic, evaluation, which aims to evaluate in terms of probabilities the extent to which some of the attributes of dependability are satisfied; those attributes are then viewed as measures of dependability. The methods for qualitative and quantitative evaluation are either specific (e.g., failure mode and effect analysis for qualitative evaluation, or Markov chains and stochastic Petri nets for quantitative evaluation), or they can be used to perform both forms of evaluation (e.g., reliability block diagrams, fault trees).

The evolution of dependability over a system's life-cycle is characterized by the notions of stability, growth and decrease, which can be stated for the various attributes of dependability. These notions are illustrated by failure intensity, i.e., the number of failures per unit of time. It is a measure of the frequency of system failures, as noticed by its user(s). Failure intensity typically first decreases (reliability growth), then stabilizes (stable reliability) after a certain period of operation, then increases (reliability decrease), and the cycle resumes.

The alternation of correct-incorrect service delivery is quantified to define reliability, availability and maintainability as measures of dependability: (1) reliability: a measure of the continuous delivery of correct service, or, equivalently, of the time to failure; (2) availability: a measure of the delivery of correct service with respect to the alternation of correct and incorrect service; (3) maintainability: a measure of the time to service restoration since the last failure occurrence, or, equivalently, a measure of the continuous delivery of incorrect service; (4) safety is an extension of reliability: when the state of correct service and the states of incorrect service due to non-catastrophic failure are grouped into a safe state (in the sense of being free from catastrophic damage, not from danger), safety is a measure of continuous safeness, or, equivalently, of the time to catastrophic failure. Safety is thus reliability with respect to catastrophic failures.

Generally, a system delivers several services, and there often are two or more modes of service quality, e.g. ranging from full capacity to emergency service. These modes distinguish less and less complete service deliveries. Performance-related measures of dependability are usually subsumed into the notion of performability.

The two main approaches to probabilistic fault forecasting, aimed at deriving probabilistic estimates of dependability measures, are modeling and (evaluation) testing. These approaches are complementary, since modeling needs data on the basic processes modeled (failure process, maintenance process, system activation process, etc.), which may be obtained either by testing or by the processing of failure data.

When evaluating fault-tolerant systems, the coverage provided by error and fault handling mechanisms has a drastic influence on dependability measures. The evaluation of coverage can be performed either through modeling or through testing, i.e. fault injection.

3. Relating Survivability and Dependability

The dependability concepts outlined above are the results of nearly twenty years of activity. Survivability can be traced back to the late sixties and early seventies in the military standards, where it was defined as a system's capacity to resist a hostile environment so that it can fulfill its mission (see, e.g., MIL-STD-721 or DOD-D-5000.3).

Dependability has evolved from reliability/availability concerns, along with the technological developments of the computing and communications field, in order to respond adequately to the challenges posed by increasingly networked applications, and by the increase in the necessary reliance we have to place on ubiquitous computing. Survivability, as it is understood in the workshop (according to the call for participation: "ability of a system to continue to fulfill its mission in the presence of attacks, failures, or accidents"), has evolved from pure security concerns [25]; it has gained much prominence with the increase in frequency and severity of attacks by intelligent adversaries on mission-critical networked information systems.

From the perspective of the dependability framework, survivability is dependability in the presence of active faults, having in mind all the classes of faults discussed in section 2.1. Indeed, dependability and survivability

are actually very close to each other, especially when looking at the primitives for implementing survivability, i.e., the "3 R's": Resistance, Recognition, and Recovery [25]. Resistance, i.e., the ability to repel attacks, relates, in dependability terms, to fault prevention. Recognition, i.e., the ability to recognize attacks and the extent of damage, together with Recovery, i.e., the ability to restore essential services during attack and to recover full service after attack, have much in common with fault tolerance. Clearly, dependability and survivability both go beyond the traditional approaches based on fault avoidance, and have recognized the necessity of fault tolerance. Dependability and survivability, via independent evolutions, have actually converged and are much in agreement.

It is evident that the coordinated best efforts of dependability and survivability researchers and practitioners will be needed to devise defenses against the many threats to mission-critical systems.

Acknowledgment

We thank all colleagues who contributed to the 1992 book [16] and added their advice on desirable improvements and extensions during the writing of the current document [22]. Many thanks to Dr. Yutao He, and to Jackie Furgal and Christine Fourcade, who contributed to the preparation of this text while the authors were scattered around the globe.

References

[1] D. Lardner. Babbage's calculating engine. Edinburgh Review, July 1834. Reprinted in P. Morrison and E. Morrison, editors, Charles Babbage and His Calculating Engines. Dover, 1961.
[2] C. Babbage. On the mathematical powers of the calculating engine (December 1837). Unpublished manuscript, Buxton MS7, Museum of the History of Science. In B. Randell, editor, The Origins of Digital Computers: Selected Papers, pages 17-52. Springer, 1974.
[3] Proceedings of the Joint AIEE-IRE Computer Conference, December 1951.
[4] Information processing systems - reliability and requirements. In Proceedings of the Eastern Joint Computer Conference, December 1953.
[5] Session 14: Symposium: Diagnostic programs and marginal checking for large scale digital computers. In Convention Record of the IRE 1953 National Convention, part 7, pages 48-71, March 1953.
[6] J. von Neumann. Probabilistic logics and the synthesis of reliable organisms from unreliable components. In C. E. Shannon and J. McCarthy, editors, Annals of Mathematics Studies, number 34, pages 43-98. Princeton Univ. Press, 1956.
[7] E. F. Moore and C. E. Shannon. Reliable circuits using less reliable relays. J. Franklin Institute, 262:191-208 and 281-297, Sept./Oct. 1956.
[8] W. H. Pierce. Failure-Tolerant Computer Design. Academic Press, 1965.
[9] A. Avizienis. Design of fault-tolerant computers. In Proc. 1967 Fall Joint Computer Conf., AFIPS Conf. Proc. Vol. 31, pages 733-743, 1967.
[10] W. G. Bouricius, W. C. Carter, and P. R. Schneider. Reliability modeling techniques for self-repairing computer systems. In Proceedings of the 24th National Conference of ACM, pages 295-309, 1969.
[11] B. Randell. Operating systems: The problems of performance and reliability. In Proc. IFIP Congress, Vol. 1, pages 281-290. North-Holland, 1971.
[12] B. Randell. System structure for software fault tolerance. IEEE Transactions on Software Engineering, SE-1(2):220-232, 1975.
[13] A. Avizienis and L. Chen. On the implementation of N-version programming for software fault tolerance during execution. In Proc. IEEE COMPSAC 77, pages 149-155, November 1977.
[14] Special session: Fundamental concepts of fault tolerance. In Digest of FTCS-12, pages 3-38, June 1982.
[15] J.-C. Laprie. Dependable computing and fault tolerance: concepts and terminology. In Digest of FTCS-15, pages 2-11, June 1985.
[16] J.-C. Laprie, editor. Dependability: Basic Concepts and Terminology. Springer-Verlag, 1992.
[17] R. Turn and J. Habibi. On the interactions of security and fault tolerance. In Proc. 9th National Computer Security Conf., pages 138-142, September 1986.
[18] J. E. Dobson and B. Randell. Building reliable secure computing systems out of unreliable insecure components. In Proc. of the 1986 IEEE Symp. on Security and Privacy, pages 187-193, April 1986.
[19] J.-M. Fray, Y. Deswarte, and D. Powell. Intrusion tolerance using fine-grain fragmentation-scattering. In Proc. 1986 IEEE Symp. on Security and Privacy, Oakland, April 1986, pages 194-201.
[20] M. K. Joseph. Towards the elimination of the effects of malicious logic: Fault tolerance approaches. In Proc. 10th National Computer Security Conf., pages 238-244, September 1987.
[21] M. K. Joseph and A. Avizienis. A fault tolerance approach to computer viruses. In Proc. of the 1988 IEEE Symposium on Security and Privacy, pages 52-58, April 1988.
[22] A. Avizienis, J.-C. Laprie, and B. Randell. Dependability of computer systems: Fundamental concepts, terminology, and examples. Technical report, LAAS-CNRS, October 2000.
[23] C. E. Landwehr et al. A taxonomy of computer program security flaws. ACM Computing Surveys, 26(3):211-254, September 1994.
[24] Commission of the European Communities. Information Technology Security Evaluation Criteria, 1991.
[25] H. F. Lipson. Survivability - A new security paradigm for protecting highly distributed mission-critical systems. Collection of transparencies, 38th meeting of IFIP WG 10.4, Kerhonson, New York, June 28-July 2, 2000, pp. 63-89. Available from LAAS-CNRS.
