How Do Systems Manage Their Adaptive Capacity to Successfully Handle Disruptions? A Resilience Engineering Perspective
Complex Adaptive Systems – Resilience, Robustness, and Evolvability: Papers from the AAAI Fall Symposium (FS-10-03)

Matthieu Branlat and David D. Woods
Cognitive Systems Engineering Laboratory, The Ohio State University
Integrated Systems Engineering, 210 Baker Systems, 1971 Neil Avenue, Columbus OH 43210
[email protected], [email protected]

Abstract

A large body of research describes the importance of adaptability for systems to be resilient in the face of disruptions. However, adaptive processes can be fallible, either because systems fail to adapt in situations requiring new ways of functioning, or because the adaptations themselves produce undesired consequences. A central question is then: how can systems better manage their capacity to adapt to perturbations, and constitute intelligent adaptive systems? Based on studies conducted in different high-risk domains (healthcare, mission control, military operations, urban firefighting), we have identified three basic patterns of adaptive failures or traps: (1) decompensation – when a system exhausts its capacity to adapt as disturbances and challenges cascade; (2) working at cross-purposes – when sub-systems or roles exhibit behaviors that are locally adaptive but globally maladaptive; (3) getting stuck in outdated behaviors – when a system over-relies on past successes although conditions of operation change. The identification of such basic patterns then suggests ways in which a work organization, as an example of a complex adaptive system, needs to behave in order to see and avoid or recognize and escape the corresponding failures. The paper will present how expert practitioners exhibit such resilient behaviors in high-risk situations, and how adverse events can occur when systems fail to do so. We will also explore how various efforts in research related to complex adaptive systems provide fruitful directions to advance both the necessary theoretical work and the development of concrete solutions for improving systems’ resilience.

Introduction

Resilience engineering (Hollnagel, Woods and Leveson, 2006) focuses on work systems as an instance of human systems and complex adaptive systems. In particular, it studies how the performance of systems is affected by disruptions, and seeks to understand how the characteristics of these systems make them more prone to success or to failure in their respective environments (especially in terms of the safety of their components or of the elements they are responsible for). The field borrows concepts and results from a variety of other fields related to the study of complexity, such as control theory, organizational and policy sciences. In return, resilience engineering aims at contributing to the study and improvement of work systems, as well as to the development of knowledge about complex adaptive systems.

A large body of research describes the importance of adaptability and adaptive behavior for systems to be able to face the variability of their environment without losing control or totally collapsing (Ashby, 1956; Ostrom, 1999; Woods and Hollnagel, 2006; Woods and Wreathall, 2008; Weick and Sutcliffe, 2001). Work systems pursue various goals that require them to control various processes in a given environment (e.g., a hospital treats sick patients, nuclear power plants produce energy from a nuclear reaction, aviation crews fly passengers, firefighters respond to fires).
In order to do so, systems rely on models of their tasks and of the environment. Any model is however a necessary simplification that aims at reducing and managing the uncertainty and complexity of the real world for pragmatic purposes (Hollnagel, 2009). As a result, the conditions for operation are underspecified. The predictability associated with the processes under control and the environment is never perfect, even in predominantly closed worlds like manufacturing plants. Any work system therefore needs to be able to adapt to a variety of perturbations. Given the underspecification, these perturbations may be surprising. In addition to the fundamental limitation of the models they rely on, work systems are also confronted with the evolving nature of their environment, as well as their own evolutions (e.g., when new technology, resources or working rules are introduced). Both dynamics constitute sources of new challenges to their normal way of operating, i.e., new – and often unforeseeable – vulnerabilities.

At some level of analysis systems are human systems, since it is people who create, operate, and modify that system for human purposes, and since people, not machines, gain or suffer from the operation of that system. One of the central efforts of the resilience engineering field consists of studying and understanding challenges to adaptation in complex socio-technical systems in order to better support human adaptive behavior, in part through technological development (Hollnagel, Woods and Leveson, 2006). From both a performance and safety standpoint, we are interested in how systems adapt their plans in progress, i.e., the way they function when expected or unexpected perturbations occur. To a large extent, such adaptations are managed by the human elements of the work systems, even when they make ample use of advanced technology. An important reason for this is the context-sensitivity issue (Woods and Hollnagel, 2006). Operators fill the gaps by adapting to the real conditions of operations and their dynamics (Cook and Rasmussen, 2005). In particular, we are interested in situations where plan adaptations are not successful, either because systems fail to recognize the need for adaptation, or because the adaptive processes themselves produce undesired consequences (Woods and Shattuck, 2000; Dekker, 2003; Lagadec, 2010). A central question is then: how can systems better manage their capacity to adapt to perturbations, and constitute intelligent adaptive systems? We will present how expert practitioners exhibit such resilient behaviors in high-risk situations, and how adverse events can occur when systems fail to do so. We will also explore how various efforts in research related to complex adaptive systems provide fruitful directions to advance both the necessary theoretical work and the development of solutions for improving systems’ resilience.

Basic Patterns of How Adaptive Systems Fail

We have identified three basic patterns of adaptive failures or traps based on studies conducted in different high-risk domains such as healthcare, mission control, military operations and urban firefighting (Woods and Branlat, 2011).

(1) Decompensation: Exhausting Capacity to Adapt as Disturbances/Challenges Cascade

This first pattern is observed in settings implementing some form of supervisory control, in which a supervising role is in charge of controlling a dynamic process through a variety of actors (human, technological, biological). The pattern corresponds to an escalation of demands while the system monitoring the process of interest is not capable of adapting and acting fast enough upon an initial set of disturbances. In a complex system, the amount of coupling between elements or sub-systems creates the conditions for a cascade of disturbances resulting in rapidly increasing demands. Under such circumstances, components are stretched to their performance limits and the system’s overall control of the situation collapses abruptly. This loss of control creates conditions that can develop into an incident, i.e., severely impair performance and compromise safety. From the perspective of the supervisor, the critical information is how hard the agents under supervision are working to maintain control of their parts of the process. The difficulty in dealing with escalating demands consists of avoiding falling behind the tempo of events: recognizing early enough the trend towards a loss of control, and being able to switch to new modes of functioning in time, such as investing the needed resources (including in the case of unanticipated demands).
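The decompensation dynamic lends itself to a toy simulation. The sketch below is illustrative only and is not part of the paper: the model, the parameter values (CAPACITY, COUPLING, BASE_DEMAND), and the collapse threshold are assumptions chosen to make the pattern visible. A supervisory process absorbs disturbances up to a fixed capacity; whatever it cannot absorb propagates through coupled sub-systems and returns as additional demand, so once a shock pushes demand past capacity the cascade grows until control is lost abruptly. The printed spare capacity is the kind of signal identified above as critical for the supervisor: how hard the system is working to keep up.

    # Illustrative sketch only (not from the paper): a minimal model of the
    # decompensation pattern. All parameter values are assumptions.
    import random

    CAPACITY = 10.0     # disturbance the supervisory system can absorb per step (assumed)
    COUPLING = 1.8      # unabsorbed disturbance spawns this much new demand next step (assumed)
    BASE_DEMAND = 3.0   # routine demand arriving every step (assumed)

    def simulate(steps=25, shock_step=5, shock_size=20.0, seed=1):
        random.seed(seed)
        backlog = 0.0  # disturbance that was not handled and now cascades
        for t in range(steps):
            demand = BASE_DEMAND + random.uniform(0.0, 1.0) + backlog
            if t == shock_step:
                demand += shock_size             # initial set of disturbances
            unabsorbed = max(0.0, demand - CAPACITY)
            backlog = unabsorbed * COUPLING      # coupling turns the remainder into new demands
            spare = max(0.0, CAPACITY - demand)  # margin left before saturation
            print(f"t={t:2d}  demand={demand:6.1f}  spare capacity={spare:5.1f}")
            if demand > 3 * CAPACITY:            # arbitrary stand-in for abrupt loss of control
                print("decompensation: cascading demands have exhausted the capacity to adapt")
                return

    simulate()

In this sketch, raising CAPACITY or lowering COUPLING merely delays the collapse; what prevents it is recognizing the shrinking spare capacity early enough to switch to a different mode of functioning before the cascade takes over, which is the point made above.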
(2) Working at Cross-Purposes: Behavior that is Locally Adaptive, but Globally Maladaptive

This pattern is directly related to the complexity of the environments in which real-world systems exist, and of the systems themselves. Systems’ or subsystems’ behaviors, which exist in the context of networks of interdependencies (functional, structural, temporal) and cross-scale interactions, have implications at a larger scale than simply at the level of the elements producing the behaviors. The pattern is exemplified by the widely discussed tragedy of the commons (Hardin, 1968; Ostrom, 1999), in which populations gradually deplete shared physical resources by over-exploiting them in the pursuit of short-term production goals. Since the subsistence of these populations also depends on these resources, this behavior turns out over time to be counter-productive. More generally, this second pattern relates to the difficulty of designing systems that provide both stability, in order to ease the synchronization of distributed actions to reach a common goal, and flexibility, in order to adapt in the face of changing conditions of operation (Branlat, Fern, Voshell and Trent, 2009; Grote, Zala-Mezö and Grommes, 2004). Adaptation in tightly coupled systems indeed increases needs for coordination in order to avoid working at cross-purposes. Critical challenges here correspond to the