<<

M&C 2017 - International Conference on Mathematics & Computational Methods Applied to Nuclear Science & Engineering, Jeju, Korea, April 16-20, 2017, on USB (2017)

Kriging-based Surrogate Models for Quantification and Sensitivity Analysis

Xu Wu, Chen Wang, Tomasz Kozlowski

Department of Nuclear, Plasma and Radiological Engineering University of Illinois at Urbana-Champaign 224 Talbot Laboratory, 104 South Wright Street, Urbana, Illinois, 61801, USA [email protected], [email protected], [email protected]

Abstract - In this paper, we propose to use (also called emulators) surrogate models for uncertainty and sensitivty analysis in nuclear engineering. The motivation is to significantly reduce the computational cost while maintaining a desirable accuracy. Kriging modeling only requires the input/output relations of the full model and thus can treat the full model as a black box. We investigated the Point Reactor Kinetics Equation (PRKE) with lumped parameter thermal-hydraulics feedback model to compare Monte Carlo (MC) , Polynomial Chaos Expansion (PCE) and Kriging surrogate model for uncertainty and sensitivity analysis. We applied Principal Component Analysis (PCA) for the dimension reduction. The Kriging surrogate models are built for the principal component scores. Comparison of the statistical moments of results from MC sampling of the full model, PCE and Kriging surroagte model shows that the Kriging surrogate model can accurately predict the statistical moments of the full model outputs. It only requires tens to a few hundreds of full model runs for the training and after that it can be executed with negligible cost.

I. INTRODUCTION 4. Brute force Monte Carlo (MC) sampling is robust and in- dependent of the model dimension. But it has extremely Within the BEPU (Best Estimate plus Uncertainty) [1] slow convergence rate and requires large number of sim- methodology must be quantified in order to prove ulations, which is impractical for many simulation codes that the investigated design remains within acceptance criteria. in nuclear engineering even with high performance com- This requires the propagation of input uncertainties to the puting. output predictions, known as Uncertainty Quantification (UQ). Sensitivity analysis (SA) is the study of how uncertain input In this paper, we propose to use Kriging-based surrogate parameters contribute to the variation in the output. Reliable models [2][3][4][5] for UQ and SA in nuclear engineer- UQ and SA are significant parts of modelling and simulation. ing. Surrogate model is an approximation of the input/output Modern nuclear system simulations emphasize improve- relation of a compututer code/model. It is also called meta- ments of computational accuracy. One of the main directions model, response surface or emulator. Surroagte models usually to achieve this is to use coupled multi-physics approach with take much less computational time than the full model (like high fidelity simulators. The term “multi-physics” means the TRACE, BISON, etc.) while maintaining the input/output requirement of coupling of discrete physics. Coupled systems relation of the original model to a desirable accuracy. Con- that integrate relevant phenomena in reactor systems such as structing the surrogate models normally requires small number neutronics, thermal-hydraulics, and fuel performance analysis of executions of the original model. Once validated, surro- are needed to improve design, operation and safety method- gate models can be used to perform uncertainty and sensitivity ologies of modern reactors. analysis, Bayesian inversion, optimization, etc. Extensive UQ and SA efforts are subsequently required for such multi-physics coupled simulations, which poses chal- II. THEORY lenges to the current UQ and SA methodologies as listed 1. Essential Components of Kriging below: In statistics, Gaussian process regression [6][7] is widely 1. The nuclear reactor physics simulation normally involves used as a method of interpolation for which the interpolated high dimensionality spaces (e. g. cross-sections), while values are modeled by a Gaussian process. In geostatistics thermal-hydraulics and fuel performance modeling com- (also known as spatial statistics), regression with Gaussian monly includes strong non-linear phenomena. processes is known as Kriging [3]. Kriging has been widely 2. and adjoint approaches are less suit- used as surrogate models for deterministic computer models. able when large uncertainty and non-linear effects are A Kriging model is a generalized model significant. that accounts for the correlation in the residuals between the regression model and the observations. 3. Stochastic spectral methods like Polymial Choas Expan- Consider a computer model y = G(x). Without loss of gen- sion and Stochastic Collocation are very efficient and erality, consider y as a scalar and x is a d-dimensional vector can have spectral convergence if well-designed, but their representing d input parameters. Assuming that the computer > performance is greatly limited from the notorious "Curse model output is known at m design sites X = (x1, x2,..., xm) . > of Dimensionality". The corresponding output values are y = (y1, y2,..., ym) = M&C 2017 - International Conference on Mathematics & Computational Methods Applied to Nuclear Science & Engineering, Jeju, Korea, April 16-20, 2017, on USB (2017)

> ∗ [G(x1), G(x2),..., G(xm)] . 3. The correlation of x with design points X: The mathematical form of a Kriging model has two parts: ∗  ∗ ∗ ∗ > n r(x ) = R (x , x1) , R (x , x2) ,..., R (x , xm) X > yˆ(x) = β j f j(x) + z(x) = f (x)β + z(x) (1) j=1 4. The correlation matrix R of design points X:  > where f = f1, f2,..., fn . The first part is a linear regression R (x , x ) R (x , x ) ···R (x , x ) of the data with n regressors modeling the drift of the process  1 1 1 2 1 m  mean. The second part z(x) is a stationary Gaussian random R (x , x ) R (x , x ) ···R (x , x )  2 1 2 2 2 m  process with zero mean and covariance: R =  . . . .   . . .. .  h i 2     Cov z(xi), z(x j) = σ R xi, x j (2)   R (xm, x1) R (xm, x2) ···R (xm, xm) 2 where σ is the process variance.   The spatial correlation function (SCF) R xi, x j controls Then, the best linear unbiased predictor (BLUP) of yˆ(x∗) the smoothness of the resulting Kriging model and the influ- is: ence of nearby points. The Gaussian function is the most ∗ > ∗ > ∗ −   yˆ(x ) = f (x )βˆ + r (x )R 1 y − Fβˆ (5) commonly used SCF as it provides a relatively smooth and infinitely differentiable surface: where βˆ is the least sqaures estimate of β: |x −x |2     − i j R xi, x j = R |xi − x j| = e θ (3)  −1 βˆ = F>R−1F F>R−1y (6)

TABLE I: Common spatial correlation functions ( or correla- The MSE or variance of the estimatey ˆ(x∗) is: tion kernels)    ∗  2 n > ∗ −1 ∗ Name Expression h = |xi − x j| MSE yˆ(x ) = σ 1 − r (x )R r(x )+    >  −1   |h| r> x∗ R−1F − f> x∗ F>R−1F r> x∗ R−1F − f> x∗ Exponential R(h) = exp − θ ( ) ( ) ( ) ( )  |h|p  Power-exponential R(h) = exp − θp (7)  |h|2  Gaussian R(h) = exp − 2  √2θ   √  Equations 5 to 7 formulate the so called Universal Kriging 3|h| 3|h| Matern ν = 3/2 R(h) = 1 + θ exp − θ (UK). Ordinary Kriging (OK) is a special case of UK when the √ √  2    basis functions reduce to a unique constant function, f(x) = [1]. Matern ν = 5/2 R(h) = 1 + 5|h| + 5h exp − 5|h| θ 3θ2 θ Simple Kriging (SK) is the case when the mean function is a known deterministic function: The correlation function scale parameter θ is essentially a width parameter which affects how far a sample point’s > − yˆ(x) = µ(X) + r (x)R 1 (y − µ(X)) (8) influence extends. Different θ’s can be chosen for different input dimensions: n o MSE yˆ(x) = σ2 1 − r>(x)R−1r(x) (9) d 2   Y |xi,k−x j,k| − θ R |xi − x j| = e k (4) The Kriging model can be proved to be interpolating all k=1 the design points. The predicted variance increases as new 2. Prediction using Kriging Surrogate Model point gets further away from existing points. To build Kriging surrogate model, we only need to run the original model y = Given the design sites X and corresponding output values G(x) m times, which depends on the desired accuracy level Y, to predict the output values at a new input location x∗ using (m is typically a few hundreds). Once the Kriging surrogate the Kriging model, we first define the following notations: model is available, we can use it for prediction at new point ∗  > x , which will run practically instantaneously. 1. The set of regression functions f = f1, f2,..., fn eval- ∗ uated at the unknown point x : III. SIMULATION MODEL ∗  ∗ ∗ ∗ > f(x ) = f1(x ), f2(x ),..., fn(x ) To test the proposed methodology, the Point Reactor 2. The set of regression functions evaluated at m known Kinetics Equation (PRKE) with lumped parameter thermal- design points: hydraulics feedback model [8][9] will be used in this study. > It describes the transient behavior of the normalized power F = [f(x1), f(x2),..., f(xm)] level p(t), the delayed neutron precursor concentrations C(t),  f (x ) f (x ) ··· f (x )  1 1 2 1 n 1  and the core effective fuel and coolant temperatures, Tfuel(t)  f (x ) f (x ) ··· f (x )  1 2 2 2 n 2  and Tcool(t). The model employs the standard neutron point =  . . . .  kinetic equations and couples them to simple 0-D (core aver-  . . .. .    age) fuel heat conduction and fluid energy balance models via f1(xm) f2(xm) ··· fn(xm) the reactivity function. In this model, the reactivity depends M&C 2017 - International Conference on Mathematics & Computational Methods Applied to Nuclear Science & Engineering, Jeju, Korea, April 16-20, 2017, on USB (2017)

TABLE II: Symbols used in Kriging formulation Symbol Dimension Description d 1 × 1 number of input factors m 1 × 1 number of design points n 1 × 1 number of basis functions β n × 1 vector of regression coefficients βˆ n × 1 estimate of β  > f n × 1 vector of regression functions f = f1, f2,..., fn th  | xi d × 1 vector of i design point, xi = xi,1, xi,2,..., xi,p th yi 1 × 1 output value of i design point, yi = G(xi) X m × d collection of m design points > y m × 1 vector of output values at X, y = (y1, y2,..., ym) x∗ d × 1 unknown input point, to be predicted yˆ(x∗) 1 × 1 predicted output at x∗ f(x∗) n × 1 f evaluated at x∗ F m × n f evaluated at X r(x∗) m × 1 correlation between x∗ and design points X R m × m correlation between design points X on both fuel and coolant temperatures. The system of coupled are treated as uncertain input parameters in the model. The nonlinear Ordinary Differential Equations (ODEs) is given by: Quantity of Interests (QoIs) in this model are p(t), C(t), Tfuel(t) and T (t). dp(t) ρ(t, T , T ) − β cool = fuel cool p(t) + λC(t) (10) dt Λ TABLE III: Uncertainties for three input parameters dC(t) β = p(t) − λC(t) (11) Parameters mean std/mean dt Λ dT (t) Ωpow p(t) [T (t) − T (t)] ρ 0.8 · ρ 20% fuel = − fuel cool (12) ext ext,0 · dt ρfuelcp, fuel ρfuelcp, fuelRˆth αD 1.5 αD,0 20% dT (t) 2u h i A [T (t) − T (t)] αc 1.5 · αc,0 20% cool = − T − T in + fuel fuel cool dt H cool cool A ˆ flow ρcoolcp, coolRth We arbitrarily chose some uncertainties for the input pa- (13) rameters as shown in Table III, as the main purpose is to com- pare MC sampling, PCE and Kriging-based surrogate model Note that only one-group effective delayed neutron is con- for uncertainty and sensitivty analysis. In Table III, ρ = β, sidered. The detailed definitions of the parameters and values ext,0 α = 3.18×10−3β, α = 8.2186×10−3β are nominal values used in the model can be found in [9]. The fuel and cladding D,0 c,0 used in [8]. Figure 1 shows the Probability Density Functions material properties are based on LWR fuel and cladding prop- (PDFs) of the three uncertain input parameters. erties [10][11]. The lumped parameter (i.e. averaging the #10 4 #10 4 unknown values over the whole domain) description of the 400 8 2.5

2

reactor fuel and coolant temperatures allows the elimination 300 6

t

c

x

D

e

, ,

; 1.5

f f

of the spatial dependencies and therefore focuses on the time- f

o o

o 200 4

F

F F

D 1 D

dependent part. D

P

P P 100 2 In this model, the neutronics equations (the first two equa- 0.5 tions) are coupled to the thermal-hydraulics equations (the last 0 0 0 0 0.005 0.01 0 2 4 6 0 0.5 1 1.5 two equations) via the temperature dependent total reactivity ;ext ,D #10 -5 ,c #10 -4 ρ(t, Tfuel, Tcool) that contains the external reactivity term (e.g., Fig. 1: PDFs for three random input parameters control rod movement), the fuel temperature reactivity and the coolant temperature reactivity: The reasons of choosing this model for testing the Kriging surrogate modeling are listed below: ρ(t, T , T ) = ρ − α [T (t) − T (0)] fuel cool ext D fuel fuel 1. This model is essentially a mini-multiphysics problem, − αc[Tcool(t) − Tcool(0)] (14) which has been designed as close to a real reactor as possiblel. All the material properties are based on West- In the above equation, ρext, αD and αc are external re- inghouse PWR [9]. activity insertion, Doppler reactivity coefficient and coolant temperature coefficient, respectively. These three parameters 2. It takes very short simulation time to run (about 0.5 sec- M&C 2017 - International Conference on Mathematics & Computational Methods Applied to Nuclear Science & Engineering, Jeju, Korea, April 16-20, 2017, on USB (2017)

onds), which makes MC sampling available as a reference 4 1650 solution.

3 1600 p 3. Solutions from PCE [9] are available for comparison. C Intrusive Galerkin projection can be directly applied to 2 1550 this model since we know the exact form of the ODEs. 1 1500 4. This model involves multiple (three) uncertainty input 0 0.5 1 0 0.5 1 time (s) time (s) parameters, as well as multiple (four) QoIs. 950 588

5. The four QoIs are time-dependent, which is a challenging )

topic for surrogate modeling. Normally Principal Com- 900 ) 586

K

K

(

( l

ponent Analysis (PCA) is required for data reduction. l

e

o

u

o

c

f T T 850 584 IV. RESULTS AND ANALYSIS 800 582 1. Simulation at Nominal Input Values and Input Uncer- 0 0.5 1 0 0.5 1 tainties time (s) time (s) Fig. 3: Simulation results for the four QoIs at nominal values This model starts from an initial condition of p(t = 0) = of the random input parameters 1.0, while initial conditions for C(t), Tfuel(t) and Tcool(t) can be found in [9]. Figures 2 and Figure 3 show the simulation results at mean values of the three uncertain input parameters. mentation can be found in [9]. In this paper we only provide With the introduction of external reactivity, the power quickly the results. increases to about four times of the initial state. Consequently, Figure 4 shows the comparison of the mean values for the fuel and coolant temperature increase introduces negative four QoIs predicted by the MC sampling and PCE with order feedback to counter-interact the external reactivity, and the up to 5. Here we only compare results at 21 time points which power returns to a new lower value in a short time. are uniformly distributed between 0 and 1 second. There is an excellent agreement of these results. PCE can accurately 1.2 predict the mean values even at a very low expansion order.

1 12 1750

0.8 10 MC sampling 1700 PC1 8 PC2 1650

0.6 PC3 p 6 PC4 C

) total PC5 1600 $ 0.4

( 4

external y

t 1550 i Doppler 2

v 0.2

i coolant t

c 0 1500 a e 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 R Time (s) Time (s) -0.2 1000 592 -0.4 590

950 )

-0.6 ) 588

K

K

(

( l

900 l

e

o

u

o c

-0.8 f 586 T 0 0.2 0.4 0.6 0.8 1 T 850 time (s) 584

Fig. 2: Reactivity evolution at nominal values of the random 800 582 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 uncertain input parameters Time (s) Time (s) Fig. 4: Mean values of four QoIs by MC sampling and PCE The increase in coolant temperature is small and the neg- ative reactivity feedback caused by the coolant is negligible, Figure 5 shows the comparison of the standard deviations which will cause the system to be insensitive to coolant tem- for four QoIs predicted by the MC sampling and PCE with perature coefficient αc. order up to 5 for different priors. For p (normalized power) and C (delay neutron precursors) the PCE predicted standard 2. UQ Solution using Monte Carlo Sampling and Polyno- deviations converge to the MC sampling values, while for Tfuel mial Choas Expansion and Tcool PCE predictions converge to values that are slightly different than MC sampling standard deviations. They are To provide reference solutions for Kriging-based surro- likely to be caused by the PCE low order truncation in both gate models, we performed the MC sampling (with 5000 inputs and outputs. The ODEs for T and T include many samples) and PCE with intrusive Galerkin approach up to fuel cool material properties that depend on T and T themselves, polynomial order 5. Details of the PCE derivation and imple- fuel cool M&C 2017 - International Conference on Mathematics & Computational Methods Applied to Nuclear Science & Engineering, Jeju, Korea, April 16-20, 2017, on USB (2017)

causing them to be more sensitive to small errors than p and first look at the Singular Value Decomposition (SVD) of A: C, which are only affected by Tfuel and Tcool. However, these > discrepancies are very small compared to the mean values of A = UΣV (16) those QoIs.

10 80 where

8 1. U is a p × p orthogonal matrix. The columns of U are MC sampling 60 PC1 called the left-singular vectors of A. The columns of U 6 PC2 > PC3

p AA C 40 are a set of orthonormal eigenvectors of . PC4 4 PC5

20 2. V is a N × N orthogonal matrix. The columns of V are 2 called the right-singular vectors of A. The columns of > 0 0 V are a set of orthonormal eigenvectors of A A. 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Time (s) Time (s) 3. Σ is a p×N rectangular diagonal matrix with non-negative

80 4 real numbers on the diagonal. The diagonal entries σi of Σ are called the singular values of A and are arranged 60 3 in descending order. Singular values are square roots of

> >

) )

K the eigenvalues of AA and A A. Large singular values

K

(

( l

40 l 2

e

o u o A

c point to important features in matrix .

f

T T > 20 1 If we choose P = U , we have:

0 0 > > 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 PA = U A = ΣV = B (17) Time (s) Time (s) Fig. 5: Standard deviations of four QoIs by MC sampling and > 1 > 1 2 CB = Cov(ΣV ) = ΣV VΣ = Σ (18) PCE N N It can be seen that if we choose P = U>, whose columns are the left-singular vectors of A, the covariance matrix of Y 3. Principal Component Analysis for Dimension Reduc- is diagonalized. Define the following: tion 1. The rows of P are called principal components, also The remaining issue before building surrogate models by known as loadings. Each row of P contains coefficients Kriging is that the four QoIs are time-dependent, which causes for one principal component (a vector of length p). each QoI to be multivariate. If we are interested in many time steps of these four QoIs, the dimension of the outputs can 2. The transformed variables (rows of B) are called princi- be very high. It is not efficient to build surrogate models for pal component scores. Principal component scores are each QoI at each time step. So we used PCA as a dimension the representations of the original data A in the princi- reduction technique to reduce the number of outputs. In the pal component space. The columns of B correspond to observations. following, we will use Tfuel as an example to demonstrate this process. 3. The eigenvalues of the covariance matrix of A are called Suppose now we are interested in p = 21 time points principal component variances, which correspond to in the T evolution. These p time points are uniformly dis- fuel the sqaure of the diagonal elements in Σ. tributed between 0 and 1 second. The intuition for dimension reduction is that by observing the evolution of Tfuel shown in Note that diagonal entries of Σ are arranged in decend- Figure 3, we noticed that the shape is nearly linear or close ing order. It is quite normal that the values of such entries to a quadratic polynomial. We do not need so many (p = 21) decrease very quickly. We can remove the entries that has “variables” to represent such a shape. To learn the underlying very small values, which corresponds to only keeping the first ∗ pattern of the evolution of Tfuel, we sample the input param- few principal components. Suppose we only keep the first p eters N times, run the model and collect the Tfuel data and principal components: center the data to form a p × N data matrix A. Define the following linear transformation: P∗A = B∗ (19)

PA = B (15) where P is a p∗ × p transformation matrix and B is a p∗ ×N data martrix whos rows represented new variables after reduction. where P is a p × p transformation matrix and B is a p × N data Now we have reduced the the number of variables from p to martrix whose rows represented new variables. p∗. A commonly accepted criteria is that the reserved variables Now the covariance matrix of A is a full matrix. After have variances that account for over 95% (some others choose the transformation, we would like matrix B to have a diagonal 99%) of the total variance. covariance matrix, which means that the new variables in B To find the principal components in the current study, we are independent. To find such a transformation matrix P, we run the model with N = 25, 50, 100, 200 samples. Figure 6 M&C 2017 - International Conference on Mathematics & Computational Methods Applied to Nuclear Science & Engineering, Jeju, Korea, April 16-20, 2017, on USB (2017)

shows the decay of the principal component variances. Clearly, 10 5 the variances become trivial after first few dimensions. 25 LHS samples 50 LHS samples Figure 7 shows the percentage of total variance explained 100 LHS samples by each dimension. Figure 8 shows the cumulative percentage 200 LHS samples of total variance explained. Clearly the first two variables can 10 0 account for over 99.8% percent of the total variation, which

means that we only need the first two principal components.

e c

Therefore, the dimension of the output Tfuel has been reduced n a -5

i 10 r from 21 to 2 and in the following we only need to build Kriging a surrogate models for these two new variables. v Figure 9 shows the first two (p∗) principal components

(each one is a p-vector). The first principal component is 10 -10 similar in shape with the evolution of Tfuel and it dominates the change of Tfuel with time. The second principal component represents the influence of power p on Tfuel. Tfuel first increases because of the power surge. Then thn increasing rate becomes 10 -15 0 5 10 15 20 smaller because of the negative weighting from the second index principal component. Fig. 6: Decay of principal component variances 4. Build and Validate the Kriging Surrogate Model 100 In the last section we have found the principal compo- 25 LHS samples 90 50 LHS samples nents for Tfuel and performed the dimension reduction. In this 100 LHS samples section we will build the Kriging surrogate models follow the 80 200 LHS samples steps below:

70

d

e n

1. Create the N training samples by Latin Hypercube Sam- i a

l 60 p

pling (LHS) [12]. x e

e 50 g

2. Run the PRKE model at the training samples, get the a

t n

× e 40

p N data matrix A. Calculate the mean and center this c

r e data matrix by substracting the mean. p 30

3. Perform PCA of the centered data matrix. Find the first 20 p∗ principal components which forms a p∗ × p matrix P∗, and the corresponding principal component scores. Note 10

that each score will have N samples. 0 0 5 10 15 20 25 4. Build Kriging surrogate model for each principal compo- index nent score based on the N samples. Fig. 7: Percentage of total variation explained 5. Use the Kriging surrogate model to make prediction. At every new sample of the inputs, the Kriging model pro- 100 duce prediction for each score which forms a p∗ ×1 vector 25 LHS samples 99.8 50 LHS samples b. 100 LHS samples

d 99.6 200 LHS samples

∗> e n

6. With a = P b we can get a p × 1 vector a, which i a l 99.4 represents the time evolution T after adding the mean p

fuel x e

vector back. e 99.2

g

a

t n

As we have a few options for correlations kernels and e 99

c

r e

many choices for the trend functions, we need to find the most p

e 98.8

appropriate combinations for the following study. This can be v

i

t a

done by a simple test to see if the Kriging model can reproduce l 98.6 u the T results at mean values of the uncertain inputs. Figure m fuel u 10 shows the predictions of Kriging surrogate models built c 98.4 using different kind of correlation kernels and same training 98.2 samples (N = 100) and trend functions (constant). The first 98 three subfigures show the predictions along with the standard 0 5 10 15 20 deviations of the prediction. And the last subfigure shows the index comparison of Tfuel from various Kriging predictions with re- Fig. 8: Cumulative percentage of total variation explained sults from running the full model. Even though all the Kriging M&C 2017 - International Conference on Mathematics & Computational Methods Applied to Nuclear Science & Engineering, Jeju, Korea, April 16-20, 2017, on USB (2017)

mdoels can accurately reproduce the Tfuel evolution, the Krig- ing models built with linear and exponential correlation kernel have large variances in their predictions. In the following we 0.3 will choose the Matern 5/2 correlation kernel.

t 0.25 n

e Constant trend function Linear trend function

n 940 940

o 0.2

p m

o 920 920

c 0.15

l

a

p

i

) )

c 0.1 900 900

K K

n

( (

i

r

l l

e e

p u

0.05 u

f f

t T

T 880 880

s

r i

F 0 860 860 -0.05 0 5 10 15 20 25 840 840 index 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 time (s) time (s)

0.6 Quadratic trend function 940 940

t 25 LHS samples n

e 50 LHS samples n 0.4

o 100 LHS samples 920 920 p

m 200 LHS samples

o

c ) 0.2 )

l 900 900

K K

a

( (

p

l l

i

e e

c

u u f

f PRKE

n

i T 0 T 880 880 r Kriging, constant

p Kriging, linear d

n 860 860 Kriging, quadratic

o -0.2

c

e S 840 840 -0.4 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 5 10 15 20 25 time (s) time (s) index Fig. 9: First and second principal components Fig. 11: Comparison of Tfuel from PRKE and Kriging surro- gate models with different trend functions

Figure 11 shows comparison for different trend functions: constant, linear and quadratic. No obvious difference can be observed for different cases. Actually Ordinary Kriging (with constant trend functions) is the most widely used version of Kriging surrogate model. In the following we will choose the

Matern 5/2 correlation kernel Exponential correlation kernel constant trend function. 940 960 training size 25 training size 50 940 940 940 920 920 920

920 )

900 )

K

K

(

(

) ) l

l 900 900 K

900 K

e

e

( (

u

u

f

f

l l

e e T

880 T

u u f

880 f 880 880

T T 860 860 860 860 840 840 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 840 840 time (s) time (s) 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 time (s) time (s) Linear correlation kernel 940 940 training size 100 940 940

920 920

920 920 )

) 900 900

K K

) ) (

( 900 900

K K

l l

e e

( (

u u

l l

f f

e e

u u T

T 880 880 PRKE

f f T Kriging, Matern 5/2 T 880 880 Kriging, exponential PRKE 860 860 Kriging, linear Kriging, training size 25 860 860 Kriging, training size 50 Kriging, training size 100 840 840 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 840 840 time (s) time (s) 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 time (s) time (s) Fig. 10: Comparison of T from PRKE and Kriging surro- fuel Fig. 12: Comparison of T from PRKE and Kriging surro- gate models with different correlation kernels fuel gate models with different number of training samples

Figure 12 shows the comparison for different sample sizes. We can see that even with a small training sample size (N = 25) M&C 2017 - International Conference on Mathematics & Computational Methods Applied to Nuclear Science & Engineering, Jeju, Korea, April 16-20, 2017, on USB (2017) the Kriging predictions are very accurate. To better evaluate training sizes. It is observed that most of the points fall very the performance of Kriging model with training sample sizes close to the diagonal line, with only a few exceptions for N = 25, 50, 100, we can use the sample set with size 200 for N = 25. validation (we can also use Cross Validation). The validation The primary assumption of Kriging surrogate model is is performed based on the accuracy of the Surrogate model’s that the response of the computer model under consideration is capability to predict the responses at samples other than the a sample path of an underlying Gaussian random field. It indi- training sites. The predictivity coefficient Q2 is usually used cates that the output responses at different samples follow joint to evaluate the fitted Kriging surrogate model: Gaussian distributions. Kriging surrogate model can not only provide the response prediction, but also the Mean Squared   PNval ˆ Error (MSE, or variance) of its prediction. The differences i=1 Yi − Yi Q2 = 1 −   (20) between full model simulations and Kriging metamodel pre- PNval ¯ i=1 Y − Yi dictions are called residuals. And by dividing the residuals by the corresponding standard deviations of the prediction we get where Nval is the size of the validation set (in this case 200). the standardized residuals. By assumption the standardized Yi denotes full model simulation outputs of the validation set residuals follow the standard normal distribution. Therefore, and Y¯ is their empirical mean. Yˆi represents the prediction 99.7% of all the standardized residuals are expected to fall from Kriging surrogate model. In practical situations, a meta- within [−3, 3]. This is also verified by Figure 14. As expected, model with a predictivity oefficient above than 0.7 is often the case with N = 25 has the most points that fall above 3 or considered as a satisfactory approximation of the computer below -3. In the following we will use the training sample set code [13]. Table IV shows the predictivity coefficients for of size 50 to build the Kriging surrogate model for uncertainty both principal component scores and all the training samples. and sensitivity analysis. The predictivity coefficient is above 97.5% even with a samll First score number of training samples (N = 25).

l 5

a

u d

TABLE IV: Predictivity coefficients Q2 i

s

e

r

d e

Training sample size score 1 score 2 z 0

i

d

r

a d

25 0.9984 0.9753 n

a t 50 0.9988 0.9975 S -5 100 0.9995 0.9988 0 20 40 60 80 100 120 140 160 180 200 Validation samples First score 400 Second score training size 25 training size 25 training size 50 training size 50 200 l 5 training size 100 a

training size 100

u

d

i

s

g

e

n

r

i

g d

i 0

e r

z 0

i

K

d

r

a d

-200 n

a t

S -5

-400 -400 -300 -200 -100 0 100 200 300 400 0 20 40 60 80 100 120 140 160 180 200 PRKE Validation samples Second score Fig. 14: Standardized residuals for first and second PCA scores 80 from Kriging surrogate models with different training sizes. 60

40 g

n 20

i 5. Uncertainty and Sensitivity Analysis using Kriging Sur-

g i r 0

K rogate Model -20 In the last section we have selected Matern 5/2 correla- -40 tion kernel, constant trend function and training samples of -60 -60 -40 -20 0 20 40 60 80 size 50 to build the Kriging surrogate model and successfully PRKE validated the surrogate model. Now we can use Kriging model Fig. 13: Comparison of first and second PCA scores from for uncertainty and sensitivity anlaysis by sampling it at ar- PRKE and Kriging surrogate models with different training bitrary input locations. The advantage and primary purpose sizes. of using Kriging surrogate model is that it can be evaluated in a negligible time. For example. 10,000 evaluations of the Figure 13 shows the comparison of first and second PCA current Kriging model only take about 19 seconds using a scores from PRKE and Kriging surrogate models with different single processor. M&C 2017 - International Conference on Mathematics & Computational Methods Applied to Nuclear Science & Engineering, Jeju, Korea, April 16-20, 2017, on USB (2017)

Figure 15 and 16 show the comparison of mean values and standard deviations of Tfuel w.r.t. time, by MC sampling of the full model (5000 samples), PCE of order 5 and MC 35 PRKE MC sampling sampling of the Kriging surrogate model (5000 samples). We PCE of order 5 observed excellent agreements between full model and Krig- 30 Kriging ing surrogate model for both the mean values and standard deviations. The standard deviations estimated by Kriging are 25 more accurate than PCE. The computational cost of various approaches are:

) 20

K

( l

1. MC sampling of full model: 5000 samples take about 43 e

u f

minutes to run. T 15 2. PCE of order 5: re-coding efforts caused by intrusive Galerkin, plus 9 seconds to to solve for PCE coefficients. 10

3. Kriging surrogate model: 50 full model evaluations for 5 the training set take about 26 seconds, plus 10 seconds to sample the Kriging surrogate models 5000 times. 0 0 0.2 0.4 0.6 0.8 1 960 Time (s) Fig. 16: Comparison of standard deviations of Tfuel computed PRKE MC sampling by sampling the full PRKE model, PCE and Kriging surrogate 940 PCE of order 5 Kriging model

920

)

K (

l 900

e

u

f T First score main e,ect 1

880 t

n

e i c ;

/ ext e

860 o ,

c 0.5 D y

t ,

i c

v

i

t i

840 s

0 0.2 0.4 0.6 0.8 1 n 0

e s

Time (s) '

l o

T b Fig. 15: Comparison of mean values of fuel computed by o sampling the full PRKE model, PCE and Kriging surrogate S -0.5 1 1.5 2 2.5 3 3.5 4 4.5 model log10(Sample size) First score total e,ect

We performed global sensitivity analysis [14] by calcu- 1.5 t

lating the Sobol’ indices. Sobol’ indices [15][16] represent n

e i

the part of output variance that can be attributed to each input c

/ e parameter. The detailed process and results are presented in o

c 1

a companion paper [17]. To calculate the Sobol’ indices, we y

t

i v used the suggested “D ” sampling method proposed in [18]. i

3 t

i s

The total computational cost of such method is N(2d + 2) n 0.5

e s

where d is the input dimension and N is the number of sam- '

l o

ples. Usually N is expected to be at least a few hundred to b o a few thousand. This is only practical by using some cheap S 0 surrogate models. Figure 17 and Figure 18 show the conver- 1 1.5 2 2.5 3 3.5 4 4.5 gence of the main and total effect Sobol’ indices for score 1 log10(Sample size) and score 2 by using different sample size N. We can see that Fig. 17: Convergence of the main and total effect Sobol’ in- the Sobol’ indices are converging to certain equilibrium values dices calculated with GP surrogate model for the first principal when enough samples are used. component score Figure 19 shows the converged Sobol’ indices for prin- cipal component score 1 and score 2. As expected, αc has negligible effect on both scores. ρext accounts for over 90% M&C 2017 - International Conference on Mathematics & Computational Methods Applied to Nuclear Science & Engineering, Jeju, Korea, April 16-20, 2017, on USB (2017)

Second score main e,ect

1 comparable with the results shown in Figure 19. The Sobol’

t n

e indices shown in Figure 19 are for the principal component

i c

/ scores, while PCE results are for Tfuel values. The direct

e o

c 0.5 comparison of Sobol’ indices calculated with PCE and Kriging

y t

i is presented in a companion paper [17] using a different model,

v

i t

i in which they are shown to be very close. s

n 0

e

s '

l V. CONCLUSIONS

o

b o S -0.5 In this paper, we propose to use Kriging-based surrogate 1 1.5 2 2.5 3 3.5 4 4.5 models for UQ and SA in nuclear engineering. The motivation log10(Sample size) is to significantly reduce the computational cost while main- Second score total e,ect taining a desirable accuracy. It only needs the input/output

1 t

n ; relations of the full model and thus can treat the full model

e ext i c 0.8 , as a black box. Also, Kriging surrogate models can deal

/ D e

o , with higher-dimensional problem than stochastic spectral tech-

c c

y 0.6 niques like PCE.

t

i v

i We investigated the Point Reactor Kinetics Equation

t i

s 0.4

n (PRKE) with lumped parameter thermal-hydraulics feedback

e s

' model to compare MC sampling, PCE and Kriging surrogate

l 0.2 o

b model for uncertainty and sensitivity analysis. We applied o S 0 PCA for the dimension reduction and it was demonstrated 1 1.5 2 2.5 3 3.5 4 4.5 that the evolution of fuel temperature can be represented by log10(Sample size) only two principal components. The Kriging surrogate models Fig. 18: Convergence of the main and total effect Sobol’ in- are built for the principal component scores and the Kriging dices calculated with GP surrogate model for the second prin- predictions of the scores can be mapped back to the original cipal component score space, forming a time-series for the fuel temperature. Comparison of the mean values and standard deviations of the variation in score 1 and αD is more important for score of results from MC sampling of the full model, PCE and 2 than ρext. Recall what we observed in Figure 9: The first Kriging surroagte model shows that the Kriging surrogate principal component dominates the change of Tfuel with time model can accurately predict the statistical moments of the and the second principal component represents the influence full model outputs. It only requires tens to a few hundreds of of power p on Tfuel. This is consistant with Sobol’ indices full model runs for the training and after that it can be executed shown in Figure 19 since ρext changes the power directly and in a negligible time. Kriging also has a few advantages over αD affects the power by a negative feedback. stochastic spectral methods like PCE: it is always non-intrusive and can deal with much higher input dimensions. Stochastic Main e,ect Total e,ect spectral methods can be non-intrusive but they always suffer PCA first score from the “curse of dimensionality”. To sum up, we recommend PCA second score Kriging-based surrogate models for uncertainty and sensitivity , c analysis in nuclear engineering, especially when the compute code is very expensive.

REFERENCES

, 1. G. E. WILSON, “Historical insights in the development D of Best Estimate Plus Uncertainty safety analysis,” Annals of Nuclear Energy, 52, 2–9 (2013). 2. J. D. MARTIN and T. W. SIMPSON, “Use of kriging models to approximate deterministic computer models,” AIAA Journal, 43, 4, 853–863 (2005). ; ext 3. N. CRESSIE, Statistics for spatial data, John Wiley & Sons, revised ed. (2015). 4. M. L. STEIN, Interpolation of spatial data: some theory 0 0.5 1 0 0.5 1 for kriging, Springer Science & Business Media (2012). Fig. 19: Sobol’ indices (main and total effects) for first and 5. J. P. KLEIJNEN, “Kriging metamodeling in simulation: a second principal component scores. review,” European Journal of Operational Research, 192, 3, 707–716 (2009). Finally, even though we have Sobol’ indices available 6. J. SACKS, W. J. WELCH, T. J. MITCHELL, and H. P. by post-processing of PCE coefficients, they are not directly WYNN, “Design and analysis of computer experiments,” M&C 2017 - International Conference on Mathematics & Computational Methods Applied to Nuclear Science & Engineering, Jeju, Korea, April 16-20, 2017, on USB (2017)

Statistical Science, pp. 409–423 (1989). 7. T. J. SANTNER, B. J. WILLIAMS, and W. I. NOTZ, The design and analysis of computer experiments, Springer Science & Business Media (2003). 8. C. HOUSIADAS, “Lumped parameters analysis of cou- pled kinetics and thermal-hydraulics for small reactors,” Annals of Nuclear Energy, 29, 11, 1315–1325 (2002). 9. X. WU and T. KOZLOWSKI, “Inverse uncertainty quan- tification of reactor simulations under the Bayesian frame- work using surrogate models constructed by polynomial chaos expansion,” Nuclear Engineering and Design, 313, 29–52 (2017). 10. X. WU, P. SABHARWALL, J. HALES, and T. KO- ZLOWSKI, “Neutronics and Fuel Performance Evalu- ation of Accident Tolerant Fuel under Normal Operation Conditions,” Tech. rep., INL/EXT-14-32591, Idaho Na- tional Laboratory, Fuel Modeling and Simulation Depart- ment, Idaho Falls, Idaho (2014). 11. X. WU, T. KOZLOWSKI, and J. D. HALES, “Neutronics and fuel performance evaluation of accident tolerant Fe- CrAl cladding under normal operation conditions,” Annals of Nuclear Energy, 85, 763–775 (2015). 12. M. D. MCKAY, R. J. BECKMAN, and W. J. CONOVER, “Comparison of three methods for selecting values of input variables in the analysis of output from a computer code,” Technometrics, 21, 2, 239–245 (1979). 13. A. MARREL, B. IOOSS, B. LAURENT, and O. ROUS- TANT, “Calculations of sobol indices for the gaussian process metamodel,” Reliability Engineering & System Safety, 94, 3, 742–751 (2009). 14. A. SALTELLI, M. RATTO, T. ANDRES, F. CAMPO- LONGO, J. CARIBONI, D. GATELLI, M. SAISANA, and S. TARANTOLA, Global sensitivity analysis: the primer, John Wiley & Sons (2008). 15. I. M. SOBOL’, “On sensitivity estimation for nonlinear mathematical models,” Matematicheskoe Modelirovanie, 2, 1, 112–118 (1990). 16. I. M. SOBOL, “Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates,” Mathematics and computers in simulation, 55, 1, 271–280 (2001). 17. X. WU, C. WANG, and T. KOZLOWSKI, “Global Sen- sitivity Analysis of TRACE Physical Model Parameters based on BFBT benchmark,” in “Proceedings of M&C- 2017,” Jeju, Korea, April 16-20. (2017). 18. G. GLEN and K. ISAACS, “Estimating Sobol sensitivity indices using correlations,” Environmental Modelling & Software, 37, 157–166 (2012).