Parallel Inverse Modeling and Uncertainty Quantification for Computationally Demanding Groundwater-Flow Models Using Covariance Matrix Adaptation

Parallel Inverse Modeling and Uncertainty Quantification for Computationally Demanding Groundwater-Flow Models Using Covariance Matrix Adaptation Ahmed S. Elshall1; Hai V. Pham2; Frank T.-C. Tsai, M.ASCE3; Le Yan4; and Ming Ye, A.M.ASCE5 Abstract: This study investigates the performance of the covariance matrix adaptation-evolution strategy (CMA-ES), a stochastic optimization method, in solving groundwater inverse problems. The objectives of the study are to evaluate the computational efficiency of the parallel CMA-ES and to investigate the use of the empirically estimated covariance matrix in quantifying model prediction uncertainty due to parameter estimation uncertainty. First, the parallel scaling with increasing number of processors up to a certain limit is discussed for synthetic and real-world groundwater inverse problems. Second, through the use of the empirically estimated covariance matrix of parameters from the CMA-ES, the study adopts the Monte Carlo simulation technique to quantify model prediction uncertainty. The study shows that the parallel CMA-ES is an efficient and powerful method for solving the groundwater inverse problem for computationally demanding groundwater flow models and for deriving covariances of estimated parameters for uncertainty analysis. DOI: 10.1061/(ASCE)HE.1943- 5584.0001126. © 2014 American Society of Civil Engineers. Author keywords: Groundwater; Inverse modeling; Stochastic optimization; Covariance matrix; Evolution strategy; Uncertainty quantification; Parallel computing. Introduction Karpouzos et al. 2001; Solomatine et al. 1999; Bastani et al. 2010) and particle swarm optimization (Scheerlinck et al. 2009; Jiang The use of optimization algorithms for solving the inverse problem et al. 2010; Krauße and Cullmann 2012) to avoid entrapment at in subsurface hydrology is a common practice. The classes of local minima. This class of algorithms might experience poor local optimization algorithms include local derivative algorithms, global convergence properties. Thus, a third class of algorithms for solv- heuristic algorithms, hybrid global-heuristic local-derivative algo- ing the inverse problem in subsurface modeling is to use the hybrid rithms, and global-local heuristic algorithms. While the local global-heuristic local-derivative algorithms (Tsai et al. 2003a, b; derivative algorithms are computationally efficient and can handle Blasone et al. 2007; Matott and Rabideau 2008; Zhang et al. a larger number of unknown model parameters, this can be at the 2009), which run a global heuristic algorithm for exploring the cost of finding local solutions instead of a near global solution. The search landscape followed by a local derivative algorithm for ex- second class of algorithms is the global heuristic algorithms, which ploiting favorable search regions. The fourth class of algorithms are generally implemented when a gradient search is not successful. is the global-local heuristic algorithms, which can perform both The heuristic algorithms are experience-based techniques that global search and local convergence without the need of combin- utilize simple to complex forms of learning to escape local optima ing two different algorithms. For solving the inverse problem in and improve the solution. Many studies use global heuristic groundwater modeling and quantifying model parameter uncer- algorithms such as the genetic algorithm (ElHarrouni et al. 1996; tainty, this study investigates the sequential and parallel performance of the covariance matrix adaptation-evolution strategy 1Postdoctoral Fellow, Dept. of Scientific Computing, Florida State (CMA-ES) (Hansen and Ostermeier 2001; Hansen et al. 2003)as Univ., 400 Dirac Science Library, Tallahassee, FL 32306; formerly, Grad- a global-local stochastic derivative-free algorithm for groundwater uate Student, Dept. of Civil and Environmental Engineering, Louisiana State model calibration and parameter uncertainty quantification. Univ., Baton Rouge, LA 70803. E-mail: [email protected] 2 The CMA-ES is becoming popular within the environmental Graduate Student, Dept. of Civil and Environmental Engineering, modeling community particularly for solving groundwater design- Louisiana State Univ., 3418G Patrick F. Taylor Hall, Baton Rouge, LA 70803. E-mail: [email protected] optimization problems (Bayer and Finkel 2004, 2007; Bürger et al. 3Associate Professor, Dept. of Civil and Environmental Engineering, 2007; Bayer et al. 2008, 2010) and for solving parameter estimation problems (Skahill et al. 2009; Keating et al. 2010; Bledsoe et al. Downloaded from ascelibrary.org by LOUISIANA STATE UNIV on 11/18/14. Copyright ASCE. For personal use only; all rights reserved. Louisiana State Univ., 3418G Patrick F. Taylor Hall, Baton Rouge, LA 70803 (corresponding author). E-mail: [email protected] 2011; Elshall et al. 2013; Razavi and Tolson 2013; Tsai and Elshall 4Manager, HPC User Services, Louisiana State Univ., High Perfor- 2013; Yu et al. 2013; Arsenault et al. 2014; Elshall and Tsai 2014). mance Computing, Frey Computing Services Center Building, Baton For parameter estimation, few studies (Skahill et al. 2009; Bledsoe Rouge, LA 70803. E-mail: [email protected] et al. 2011; Arsenault et al. 2014) compared the performance of 5 Associate Professor, Dept. of Scientific Computing, Florida State CMA-ES to other derivative and derivative-free algorithms. How- Univ., 400 Dirac Science Library, Tallahassee, FL 32306. E-mail: ever, these studies used CMA-ES with the default population size. [email protected] Note. This manuscript was submitted on June 29, 2014; approved on CMA-ES is quasi-parameter free with the population size being the October 14, 2014; published online on November 17, 2014. Discussion only parameter to be tuned by the user. Increasing the population period open until April 17, 2015; separate discussions must be submitted size will generally improve the solution precision since the local for individual papers. This paper is part of the Journal of Hydrologic En- search of CMA-ES will become more global. However, increasing gineering, © ASCE, ISSN 1084-0699/04014087(11)/$25.00. the population size encourages global exploration at the expense of © ASCE 04014087-1 J. Hydrol. Eng. J. Hydrol. Eng. convergence speed (Skahill et al. 2009). While this is true for the performance against five other commonly used derivative and sequential implementation, this study shows that increasing the pop- derivative-free algorithms. Then, the solution of the CMA-ES is ulation size will significantly improve the computational efficiency analyzed to assess the plausibility of using the empirically esti- for the parallel CMA-ES implementation. mated covariance matrix to quantify head prediction uncertainty. Parallel inverse modeling with CMA-ES can be implemented by Impact of the population size is evaluated for the sequential and an embarrassingly parallel technique. Algorithms that utilize inde- parallel performance of the CMA-ES. Finally, the parallel CMA- pendent solutions in each iteration allow for embarrassingly paral- ES is applied to calibrate two computationally demanding ground- lel computation (Vrugt et al. 2006, 2008; Tang et al. 2007, 2010). water models of the Baton Rouge aquifer system, Louisiana, and to This is the most efficient parallel technique since the solutions in analyze head prediction uncertainty. each iteration do not communicate, as explained later. The first objective of this study is to show that the parallel CMA-ES superiorly improves the calibration speed over the sequential CMA-ES. In ad- Method dition, the speedup of parallel runs scales variably with increasing the number of processors, which is equal to the population size, For complex groundwater models that generally take hours to run, up to a certain limit. This study makes an addition to the aforemen- using the sequential CMA-ES for solving the groundwater inverse tioned literature on the CMA-ES since this is the first work to the problem is impractical due to the prohibitive computational cost. authors’ knowledge that examines the parallel performance of the This study resolves this computational issue by implementing CMA-ES in subsurface hydrology. the CMA-ES in a high-performance computing (HPC) cluster using The second objective of this study is to investigate the use of the an embarrassingly parallel master/slave technique. The embarrass- CMA-ES to quantify the model prediction uncertainty due to the ingly parallel master/slave technique treats the individual solutions parameter estimation error. The solution of the CMA-ES, which as explicit tasks that do not communicate with each other, and consists of a maximum likelihood estimate and a full covariance assigns each task to a processor. Thus, embarrassingly parallel matrix of estimated model parameters, can be used for uncertainty problems are the easiest to parallelize and have negligible paralle- analysis. Several studies have proposed the utilization of the covari- lization overhead. This section explains the parallelization scheme ance matrix for sampling target distributions (Haario et al. 1999, and the role of population size in increasing the parallelization 2001; Kavetski et al. 2006a, b; Gallagher and Doherty 2007; Smith efficiency. and Marshall 2008; Cui et al. 2011; Lu et al. 2012). However, all Given an optimization problem with a search space dimension λ μ studies used a covariance matrix within the Markov chain Monte n, candidate solutions, and w best solutions, the iterative solu- Carlo (MCMC) scheme such

Parallel Inverse Modeling and Uncertainty Quantification for Computationally Demanding Groundwater-Flow Models Using Covariance Matrix Adaptation

Genetic Algorithm

Randomized Derivative-Free Optimization Institute of Theoretical Computer Science Swiss Institute of Bioinformatics

Gaussian Adaptation Revisited – an Entropic View on Covariance Matrix Adaptation

Recent Advances in Differential Evolution – an Updated Survey

Outline of Machine Learning

Data Reduction of Digital Twin Simulation Experiments Using Different Optimisation Methods

On Spectral Invariance of Randomized Hessian and Covariance Matrix

Evolutionary Search Algorithm Based on Hypercube Optimization for High-Dimensional Functions

Implementation Aspects Regarding Closed-Loop Control Systems Using Evolutionary Algorithms

On Proving Linear Convergence of Comparison-Based Step-Size Adaptive Randomized Search on Scaling-Invariant Functions Via Stabil

Using Mutation Analysis to Evolve Subdomains for Random Testing

Characterization and Uncertainty Analysis of Siliciclastic Aquifer-Fault