Safety III: a Systems Approach to Safety and Resilience

MIT ENGINEERING SYSTEMS LAB

Prof. Nancy Leveson
Aeronautics and Astronautics Dept., MIT
7/1/2020

Abstract: Recently, there has been a lot of interest in some ideas proposed by Prof. Erik Hollnagel and labeled as “Safety-II” and argued to be the basis for achieving system resilience. He contrasts Safety-II to what he describes as Safety-I, which he claims to be what engineers do now to prevent accidents. What he describes as Safety-I, however, has very little or no resemblance to what is done today or to what has been done in safety engineering for at least 70 years. This paper describes the history of safety engineering, provides a description of safety engineering as actually practiced in different industries, shows the flaws and inaccuracies in Prof. Hollnagel’s arguments and the flaws in the Safety-II concept, and suggests that a systems approach (Safety-III) is a way forward for the future.

Contents

Preface
Does Safety-I Exist?
Differences between Workplace Safety and Product/System Safety
Workplace and Product/System Safety History
A Brief Legal View of the History of Safety
A Technical View of the History of Safety
An Engineer’s View of Workplace Safety
An Engineer’s View of Product/System Safety
Activities Common among Different Industries
Commercial Aviation
Nuclear Power
Chemical Industry
Defense and “System Safety”
SUBSAFE: The U.S. Nuclear Submarine Program
Astronautics and Space
Healthcare/Hospital Safety
Summary
A Comparison of Safety-I, Safety-II and Safety-III
Definition of Safety
“Goes Wrong” vs. “Goes Right”
Safety is a Different Property than Reliability
What is a System?
Sociotechnical Systems
Decomposition and Emergence
“Bimodality”
Predictability
“Intractability”
Safety Management “Principle”
Investigation/Reporting Databases
Learning from Failure in Engineering
Accident Causality and Causality Models
Causality in General
Models of Accident Causality
The Linear Chain-of-Failure Events Model
Domino Model
Swiss Cheese Model
Hollnagel’s Resonance Model and FRAM
Limitations of the Linear Chain-of-Events Model in General
Epidemiological Models
System Theory and STAMP
A Brief Introduction to Systems Theory
The STAMP Model of Accident Causality
Attitude Toward Human Factors
Role of Performance Variability
Summary
The Future
References
Appendix: System Theory vs. Complexity Theory

Figures

Fig. 1: Safety depends on context
Fig. 2: The terminology used in engineering
Fig. 3: (Hollnagel Figure 3.2 on Page 50): “Hypothesis of different causes”
Fig. 4: Causality in System Engineering
Fig. 5: Operators learn from crossing the boundaries of safe behavior
Fig. 6: (Hollnagel Figure 73 on Page 137): “The Safety-II view of failures and successes”
Fig. 7: Chain of events model for a tank explosion
Fig. 8: Tank explosion example shown with added protections
Fig. 9: Heinrich’s Domino Model of accident causation
Fig. 10: Reason’s Swiss Cheese Model
Fig. 11: Two examples of a FRAM specification of the steps in a process
Fig. 12: The FRAM process and “model”
Fig. 13: General process for creating safety-related analyses
Fig. 14: Analytic decomposition
Fig. 15: Emergent properties arise from complex interactions
Fig. 16: Control of emergent properties
Fig. 17: An example of a safety control structure
Fig. 18: Four types of causality included in Systems Theory
Fig. 19: Three types of causal loop structures
Fig. 20: Some of the factors in the Space Shuttle Columbia accident
Fig. 21: The basic building block for a safety control structure
Fig. 22: A representation of the STAMP model of accident causality

Preface

Recently, there has been a lot of interest in some ideas proposed by Prof. Erik Hollnagel and labeled as “Safety-II” and argued to be the basis for achieving system resilience. He contrasts Safety-II to what he describes as Safety-I, which he claims to be what engineers do now to prevent accidents. What he describes, however, has very little or no resemblance to what is done today or to what has been done in safety engineering for at least 70 years.

First, should you take my word for this? I have worked in safety engineering for 40 years. Here’s a little of my relevant background. I have degrees in mathematics, management, and computer science and did graduate work in cognitive and behavioral psychology. I have written two books on system safety (Safeware [Leveson, 1995] and Engineering a Safer World [Leveson, 2012]) and hundreds of papers on the topic. My efforts have been rewarded with many awards, most recently an IEEE Medal for Environmental and Safety Technologies. I am an elected member of the National Academy of Engineering. I also am fascinated by engineering history and have read much about how engineers handled safety for the past hundred or so years. In practice, I have worked in almost all aspects of aerospace and defense and, to a lesser extent, nuclear power, petrochemicals, patient safety and medical devices, most forms of transportation (particularly aircraft and automobiles), etc. I have also participated in writing some major accident reports (Deepwater Horizon, the Columbia Space Shuttle, and Texas City) and many less well-known ones. Finally, in the past few years, I have been encouraged to look into workplace safety because people felt that the engineering approaches that I have created might be useful there.

I provide this background because I don’t recognize Prof.
Hollnagel’s definition of Safety-I in my 40 years of experience in safety engineering. It is just not what is done in practice except, perhaps, in a very few organizations with the least sophisticated safety approaches. His analysis also confuses the almost totally different fields of product/system safety and workplace safety.

Prof. Hollnagel tears apart his strawman Safety-I and recommends an alternative, which he calls Safety-II. In my experience, again, Safety-II is a giant step backward, particularly if it takes resources and attention away from more successful approaches. It contains the types of practices used in the past, mostly very long ago but also more recently in industries that have many accidents and that usually blame them all on the human operators. These practices have led to many unnecessary deaths and injuries. The Safety-II approach was rejected long ago in sophisticated engineering projects because it is not effective. Goals such as resilience, flexibility, and adaptability are important, but they are much more likely to be achieved using approaches other than Safety-II. These properties must be built into the system as a whole; they are not a function simply of the behavior of human operators, which seems to be the almost total emphasis in Safety-II. There certainly are a few aspects of Safety-II that might be useful in limited ways, but following the overall approach, I believe, is likely to lead to unnecessary accidents.

In this paper, I explain these very strong statements and note that Prof. Hollnagel and his followers seem unaware of the successful use of a systems approach to safety, which is called “System Safety” [1] by its practitioners. They may not know about it; it was developed and used primarily in the United States. System Safety was created for and has been used over the past 70 years in aerospace and defense to cope with the most dangerous systems being created. In my work and writings, I have extended this very successful practice to handle the evolution (and sometimes revolutionary change) of engineering practices over time. These changes include greatly increasing complexity, the extensive and growing use of computers and other forms of new technology, and a changing role of humans in complex systems. In this paper, I call this general approach Safety-III to put it into the Hollnagel context. It is not new, however; the general practices have been around for a very long time, but primarily used in the most sophisticated and sometimes secretive engineering contexts. It can provide a template for advances in all industries, including product/system safety and workplace safety, going forward. Changes and advances will be needed to keep it relevant for engineering in the future, of course, as our technology and society change.

[1] The term “system safety” has been adopted recently as a general term for safety engineering by people not familiar with the special field of System Safety developed long ago. I will differentiate them here by using capital letters to denote the specialized field of System Safety.

One of the dangers of critiquing someone’s approach is determining exactly what that approach is. Our views evolve over time as more is learned, and we all change them in small or even major ways with more experience. In addition, many people write papers about someone else’s concept and interpret it differently than the original author, introducing their own slant and representing their own experiences. I’ve seen this in papers by the proponents of Safety-II other than Prof. Hollnagel, particularly in healthcare safety. To try to stay as close as possible to the original conceptions of Safety-I and Safety-II, as defined by Prof. Hollnagel, in this paper I use only the writings of Prof.
Hollnagel himself, namely his two books Safety-I and Safety-II: The Past and Future of Safety Management [Hollnagel, 2014] and Safety-II in Practice [Hollnagel, 2018].