Improving the Reliability of Decision-Support Systems for Nuclear Emergency Management by Leveraging Software Design Diversity
CIT. Journal of Computing and Information Technology, Vol. 24, No. 1, March 2016, 45–63
doi: 10.20532/cit.2016.1002700

Tudor B. Ionescu and Walter Scheuermann
Institut für Kernenergetik und Energiesysteme (IKE), Stuttgart, Germany

This paper introduces a novel method of continuous verification of simulation software used in decision-support systems for nuclear emergency management (DSNE). The proposed approach builds on methods from the field of software reliability engineering, such as N-Version Programming, Recovery Blocks, and Consensus Recovery Blocks. We introduce a new acceptance test for dispersion simulation results and a new voting scheme based on taxonomies of simulation results rather than individual simulation results. The acceptance test and the voter are used in a new scheme, which extends the Consensus Recovery Block method by a database of result taxonomies to support machine-learning. This enables the system to learn how to distinguish correct from incorrect results, with respect to the implemented numerical schemes. Considering that decision-support systems for nuclear emergency management are used in a safety-critical application context, the methods introduced in this paper help improve the reliability of the system and the trustworthiness of the simulation results used by emergency managers in the decision making process. The effectiveness of the approach has been assessed using the atmospheric dispersion forecasts of two test versions of the widely used RODOS DSNE system.

ACM CCS (2012) Classification: Information systems → Information systems applications → Decision support systems → Expert systems

Keywords: software reliability, decision-support, simulation, safety-critical, machine-learning

1. Introduction

An atmospheric dispersion simulation code (ADSC) is a computer program which implements an atmospheric dispersion model (ADM). In the aftermath of the Chernobyl nuclear accident in 1986, atmospheric dispersion models and simulation codes were used to assess the radiological situation in Western Europe, which was one of the worst affected areas. On this occasion, at least two important aspects concerning ADMs and ADSCs became evident.

Firstly, atmospheric dispersion models can be used for (1) assessing the radiological situation any time after the release of nuclear trace species, (2) supporting decision-makers in the process of deciding upon and implementing appropriate counter-measures, such as the distribution of iodine tablets or the evacuation of severely affected areas, while the accident is unfolding, and (3) training for nuclear emergencies (i.e., drills). These systems produce forecasts of the atmospheric dispersion of radioactive materials using input data from various sources, including nuclear power-plants, weather forecast centers, and radiological measurement stations. Dispersion forecasts are used by government emergency response committees and nuclear power-plant operators in preparing for and managing nuclear emergencies.

Secondly, the forecasts of different ADMs, as well as the results of different ADSCs implementing even the same model, differ to a great extent qualitatively and quantitatively. In response to this situation, the International Atomic Energy Agency (IAEA) recommended the review and inter-calibration of the models and simulation codes for the short-ranged and long-ranged atmospheric transport of radioactive nuclides. More recently, the European Commission also funded a project, called ENSEMBLE [1], with the aim of developing a methodology for assessing the reliability of several medium and long-range atmospheric dispersion models currently in use in the European Union and other countries of the world. The outcome of this work was a multi-model approach, called ensemble dispersion modeling (EDM). The system aggregates results produced by different ADSCs implementing various models, and outputs a unified result based on selecting the p-th percentile result from the ensemble for each time step and each location of the monitored area.

1.1. Motivation

In Germany, as well as in other countries, in case of a significant release of radioactive materials into the atmosphere, the decision on which countermeasures to take and which sectors of the monitored area to evacuate first is taken based on these forecasts. Thus, simulation-based decision-support software systems must be regarded as safety critical. Therefore, it is necessary to investigate the reliability of such systems from a software reliability point of view. Section 4.1.1.2 of the NASA-STD-8719.13B standard for safety-critical software [2] provides a list of criteria for determining whether a software component or system must be considered safety critical. Accordingly, it suffices for the software component in question to meet just one of the listed criteria in order for it to be regarded as safety critical. Paragraph 4.1.1.2.a states that as soon as the software “processes data or analyzes trends that lead directly to safety decisions”, it must be considered safety critical. Moreover, Note 4-1 from the same standard specifies that “if data is used to make safety decisions (either by a human or the system), then the data is safety-critical, as is all the software that acquires, processes, and transmits the data”. Decision support systems for managing nuclear emergencies (henceforth referred to as DSNE systems) are in use in all European countries producing nuclear energy (see, for example, [1] for a comprehensive list). They are safety critical because erroneous dispersion simulation results could lead to false assumptions about the dispersion of radioactive pollutants and may thus bias evacuation priorities.

1.2. Proposed Approach

Considering the simple fact that ensemble forecasting systems are a paradigmatic example of the software design diversity principle, the primary aim of this work is to assess the possibility of detecting latent software faults in the dispersion models participating in ensemble forecasting systems using a new software design diversity method. The current approach helps improve the reliability of ADSCs by allowing the effective automated continuous verification [3] of these simulation codes. The proposed approach is based on combining the N-Version Programming (NVP) [4] and Recovery Blocks (RB) [5] methods while developing a completely new voting scheme as well as a new acceptance test for dispersion simulation results.

The remainder of the paper is organized as follows. We first briefly review the ADMs implemented in the RODOS system [6] and then provide a literature review on verification methods for atmospheric dispersion simulation codes. We then review the current debate concerning software design diversity methods and argue that they are well suited for the case of dispersion forecasting systems. We introduce a new acceptance test and taxonomy-based voter, which enable the application of the NVP and RB methods for the continuous verification of dispersion simulation codes. Finally, we evaluate the approach using two ensembles of artificially generated data and discuss the results of the evaluation.

2. Materials and Methods

2.1. Atmospheric Dispersion Modeling

The atmospheric dispersion of trace species is governed by meteorological events. Wind and turbulence are the most influential forces in the dispersion process. In the following, three well-established ADMs will be briefly reviewed. All these models share the property that they represent analytical or numerical solutions to the diffusion-advection (also called transport) equation [7], the most common physical model used to simulate the atmospheric dispersion of trace species. This property is an essential prerequisite for the applicability of the software design diversity methods proposed in this work to the simulation codes implementing any of the models considered here.

2.1.1. The Gaussian Plume Model

The Gaussian Plume Model (GPM) is the oldest dispersion model with practical applicability. The model is based on an analytical solution to the transport equation, which accounts for the airborne transport of trace species. This transport occurs through the mean air flow (i.e., wind), in which case it is called advection, and through turbulent movement. Turbulence can disperse trace species in all three directions of space represented by the three axes of the Cartesian system. This process is called diffusion. Any dispersion of substances which does not take place along the main flow is referred to as turbulent diffusion.

2.1.3. Puff Models

Puff dispersion models are more accurate than Gaussian-Plume models and less time-consuming than the Lagrangian particle model. These models simulate the dispersion of trace species using a series of Gaussian puffs of different sizes. A puff contains a quantity of pollutant (characterized by its concentration) which is subject to turbulent diffusion and advection by a wind field in a way that is similar to how particles are transported. RIMPUFF [11] and ATSTEP [12] are the two codes implementing puff models used to evaluate the current approach.

2.2. The Verification and Validation of Atmospheric Dispersion Simulation Codes