Uncertainty and Sensitivity Analysis for Long-Running Computer Codes: a Critical Review

UNCERTAINTY AND SENSITIVITY ANALYSIS FOR LONG-RUNNING COMPUTER CODES: A CRITICAL REVIEW By MASSACHUSE~s INSTrTUTE' Dustin R. Langewisch OF TECHNOLOGY B.S. Mechanical Engineering and B.S. Mathematics MAR 122010 University of Missouri-Columbia, 2007 L A [ LIBRARIES SUBMITTED TO THE DEPARTMENT OF NUCLEAR SCIENCE AND ENGINEERING IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF ARCHIVES MASTER OF SCIENCE IN NUCLEAR SCIENCE AND ENGINEERING AT THE MASSACHUSETTS INSTITUTE OF TECHNOLOGY FEBRUARY 2010 The author hereby grants MIT permission to reproduce and distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created. Copyright 0 Massachusetts Institute of Technology All rights reserved. Signature of Author: "-1epartment of Nuclear Science and Engineering January 28, 2010 Certified by: Dr. Gofge E p-p'stolakis, Thesis Supervisor KEPCO Professor of Nuclear Science and Engineering Professor of Engineering Systems Certified by: Don d Helton, Thesis Reader of ear Regulatory Research a. Regulatory Commission Accepted by: Dr. Jacquelyn C. Yanch Chair, Department Committee on Graduate Students UNCERTAINTY AND SENSITIVITY ANALYSIS FOR LONG-RUNNING COMPUTER CODES: A CRITICAL REVIEW Dustin R. Langewisch SUBMITTED TO THE DEPARTMENT OF NUCLEAR SCIENCE AND ENGINEERING IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN NUCLEAR SCIENCE AND ENGINEERING AT THE MASSACHUSETTS INSTITUTE OF TECHNOLOGY FEBRUARY 2010 Abstract This thesis presents a critical review of existing methods for performing probabilistic uncertainty and sensitivity analysis for complex, computationally expensive simulation models. Uncertainty analysis (UA) methods reviewed include standard Monte Carlo simulation, Latin Hypercube sampling, importance sampling, line sampling, and subset simulation. Sensitivity analysis (SA) methods include scatter plots, Monte Carlo filtering, regression analysis, variance-based methods (Sobol' sensitivity indices and Sobol' Monte Carlo algorithms), and Fourier amplitude sensitivity tests. In addition, this thesis reviews several existing metamodeling techniques that are intended provide quick-running approximations to the computer models being studied. Because stochastic simulation-based UA and SA rely on a large number (e.g., several thousands) of simulations, metamodels are recognized as a necessary compromise when UA and SA must be performed with long-running (i.e., several hours or days per simulation) computational models. This thesis discusses the use of polynomial Response Surfaces (RS), Artificial Neural Networks (ANN), and Kriging/Gaussian Processes (GP) for metamodeling. Moreover, two methods are discussed for estimating the uncertainty introduced by the metamodel. The first of these methods is based on a bootstrap sampling procedure, and can be utilized for any metamodeling technique. The second method is specific to GP models, and is based on a Bayesian interpretation of the underlying stochastic process. Finally, to demonstrate the use of these methods, the results from two case studies involving the reliability assessment of passive nuclear safety systems are presented. The general conclusions of this work are that polynomial RSs are frequently incapable of adequately representing the complex input/output behavior exhibited by many mechanistic models. In addition, the goodness- of-fit of the RS should not be misinterpreted as a measure of the predictive capability of the metamodel, since RSs are necessarily biased predictors for deterministic computer models. Furthermore, the extent of this bias is not measured by standard goodness-of-fit metrics (e.g., coefficient of determination, R 2), so these methods tend to provide overly optimistic indications of the quality of the metamodel. The bootstrap procedure does provide indication as to the extent of this bias, with the bootstrap confidence intervals for the RS estimates generally being significantly wider than those of the alternative metamodeling methods. It has been found that the added flexibility afforded by ANNs and GPs can make these methods superior for approximating complex models. In addition, GPs are exact interpolators, which is an important feature when the underlying computer model is deterministic (i.e., when there is no justification for including a random error component in the metamodel). On the other hand, when the number of observations from the computer model is sufficiently large, all three methods appear to perform comparably, indicating that in such cases, RSs can still provide useful approximations. Thesis Supervisor: George E. Apostolakis KEPCO Professor of Nuclear Science and Engineering Professor of Engineering Systems MIT Department of Nuclear Science and Engineering Thesis Reader: Donald Helton Office of Nuclear Regulatory Research US Nuclear Regulatory Commission Acknowledgments First of all, I would like to thank my advisor, Professor George Apostolakis, for his guidance and encouragement throughout the duration of this project, not to mention his countless suggested revisions to this thesis. I am grateful to all of the NRC staff who have supported this work. Specifically, I would like to thank Don Helton, who graciously agreed to read this thesis, as well as Nathan Siu, both of whom provided numerous helpful suggestions and comments. Furthermore, I am deeply indebted to Nicola Pedroni, who has provided numerous contributions to this thesis. The results from the artificial neural network studies were obtained from Nicola. In addition, I was first introduced to several of the advanced Monte Carlo methods through Nicola's research. I am also grateful to Qiao Liu, who provided much-needed assistance on the topic of sensitivity analysis. I would like to thank Anna Nikiforova for helping me work out numerous issues with the FCRR RELAP model, and C.J. Fong, whose Master's thesis inspired a substantial portion of this research. Finally, I would like to thank all of the friends and family members who have supported me over the past couple of years. While there are too many to list, I am especially grateful for the support and encouragement that I received from my parents. In addition, I would like to thank my officemate, Tim Lucas, for providing an incessant source of distraction. Without him, this thesis would, undoubtedly, be far more comprehensive, though I probably would not have maintained my sanity. Table of Contents I INTRODUCTION..............................................................................................................................................1 I UNCERTAINTY ANALYSIS...........................................................................................................................6 11.1 Classification of Uncertainty.............................................................................................. 7 11.2 M easures of Uncertainty ................................................................................................. 10 11.3 Computation of Uncertainty M easures ........................................................................... 12 I1.3.A StandardMonte Carlo Simulation (MCS)............................. ....... ....... 12 II.3.B Latin Hypercube Sampling (LHS)............................................................................................................15 11.3. C Imp ortance Samp ling (IS).......................................................................................................................16 II.3.D Line Sampling & Subset Simulation ..................................................................................................... 18 III SENSITIVITY ANALYSIS.............................................................................................................................23 111. 1 Scatter Plots ...................................................................................................................... 24 111.2 M onte Carlo Filtering (M CF) ........................................................................................ 26 111.3 Regression Analysis....................................................................................................... 28 111.4 Variance-Based M ethods................................................................................................ 34 III.4.A The Sobol' Sensitivity Indices ................................................................................................................. 35 III.4.B The Sobol'Monte Carlo Algorithm ..................................................................................................... 38 111.5 Fourier Amplitude Sensitivity Test (FAST).................................................................. 40 111.6 Additional M ethods and Factor Screening .................................................................... 41 IV METAMODELING.........................................................................................................................................43 IV. 1 A General M etamodeling Framework.............................................................................. 46 IV.2 Polynomial Response Surfaces (RS) ............................................................................. 48 IV.2.A M athematical F orm ulation ..................................................................................................................... 48 IV.2.B Goodness-of-Fit and Predictive Power..............................................................................................

Uncertainty and Sensitivity Analysis for Long-Running Computer Codes: a Critical Review

Latin Hypercube Sampling in Bayesian Networks

Sensitivity Analysis

Design of Experiments for Sensitivity Analysis of a Hydrogen Supply Chain Design Model

1 Global Sensitivity and Uncertainty Analysis Of

(MLHS) Method in the Estimation of a Mixed Logit Model for Vehicle Choice

Analysis on the Thermal Striping and Thermal Shock of the Lower Head of Central Measuring Shroud in a Fast Reactor

Efficient Experimental Design for Regularized Linear Models

Some Large Deviations Results for Latin Hypercube Sampling

Security Operational Skills 2 (Tracing).P65

A Selection Bias Approach to Sensitivity Analysis for Causal Effects*

Licensing Small Modular Reactors an Overview of Regulatory and Policy Issues

Sensitivity Analysis Methods for Identifying Influential Parameters in a Problem with a Large Number of Random Variables