Design and Analysis of Experiments for Screening Input Variables

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Hyejung Moon, M.S.

Graduate Program in Statistics

The Ohio State University

2010

Dissertation Committee:

Thomas J. Santner, Co-Adviser
Angela M. Dean, Co-Adviser
William I. Notz

© Copyright by

Hyejung Moon

2010

ABSTRACT

A computer model is a computer code that implements a mathematical model of a physical process. A computer code is often complicated and can involve a large number of inputs, so it may take hours or days to produce a single response. Screening to determine the most active inputs is critical for reducing the number of future code runs required to understand the detailed input-output relationship, since the computer model is typically complex and the exact functional form of the input-output relationship is unknown. This dissertation proposes a new screening method that identifies active inputs in a computer experiment setting. It describes a Bayesian computation of sensitivity indices as screening measures. It provides algorithms for generating desirable designs for successful screening.

The proposed screening method is called GSinCE (Group Screening in Computer Experiments). The GSinCE procedure is based on a two-stage group screening approach, in which groups of inputs are investigated in the first stage, and then only the inputs within those groups identified as active at the first stage are investigated individually at the second stage. Two-stage designs with desirable properties are constructed to implement the procedure. Sensitivity indices are used to measure the effects of inputs on the response. Inputs with large sensitivity indices are identified by comparison with a benchmark null distribution constructed from user-specified, low-impact inputs. The use of low-impact inputs makes it possible to screen out inputs having small effects as well as those that are totally inert. Simulated examples show that, compared with one-stage procedures, the GSinCE procedure provides accurate screening while reducing computational effort.

In this dissertation, the sensitivity indices used as screening measures are computed in a Gaussian process model framework. This approach is computationally efficient, requiring only small numbers of expensive computer code runs for the estimation of sensitivity indices. The existing approach for quantitative inputs is extended so that sensitivity indices can be computed when the inputs include a qualitative input in addition to quantitative inputs.

An orthogonal design, in which the design matrix has uncorrelated columns, is important for estimating the effects of inputs. Moreover, a space-filling design, for which the design points are well spread out, is needed to explore the experimental region thoroughly. New algorithms for achieving such orthogonal space-filling designs are proposed in this dissertation. Software is provided for the proposed GSinCE procedure, the computation of sensitivity indices, and the design search algorithms.

This is dedicated to my daughter Moonyoung, son Nathan, husband Jungick, and parents.

ACKNOWLEDGMENTS

I would first like to express my gratitude to my co-advisors, Professor Thomas Santner and Professor Angela Dean. They have given me tremendous help in my professional development and great guidance in my life. They are very special teachers and mentors to me. I am truly grateful for the effort that they have put into my education and the time that they have shared with me. I would also like to thank Professor William Notz for helpful comments and support as a member of my dissertation committee.

I want to give special thanks to my parents for their love and support. Without their help and sacrifices, my husband Jungick and I could not have finished our Ph.D. studies at the same time. I would also like to thank Jungick for his love and for every moment that we have shared during our Ph.D. studies. I am most thankful to my precious little ones, daughter Moonyoung and son Nathan. They have given me all the happiness, hope, and strength to do my best in my life.

VITA

October 1977 ...... Korea

2000 ...... B.S. Statistics, Korea University

2000 to 2004 ...... Statistician, The Bank of Korea

2006 ...... M.S. Statistics, The Ohio State University

2005 to present ...... Graduate Research Associate, Graduate Teaching Associate, The Ohio State University

FIELDS OF STUDY

Major Field: Statistics

TABLE OF CONTENTS

Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures

Chapters:

1. Introduction
   1.1 Computer Experiments
   1.2 Gaussian Stochastic Process Model
   1.3 Screening Procedure
       1.3.1 Screening in Computer Experiments
       1.3.2 Group Screening in Physical Experiments
   1.4 Design of Computer Experiments
   1.5 Overview of Dissertation

2. Two-stage Sensitivity-based Group Screening in Computer Experiments
   2.1 Introduction
       2.1.1 Background
       2.1.2 Overview of the Proposed Procedure
   2.2 GSinCE Initialization Stage
   2.3 GSinCE Procedure Stage 1
       2.3.1 Stage 1 Sampling Phase
       2.3.2 Stage 1 Grouping Phase
       2.3.3 Stage 1 Analysis Phase
   2.4 GSinCE Procedure Stage 2
       2.4.1 Stage 2 Sampling Phase
       2.4.2 Stage 2 Analysis Phase

3. Performance of GSinCE
   3.1 Simulation Studies to Set τ
       3.1.1 Simulations for f = 20
       3.1.2 Simulations for f = 30
       3.1.3 Simulations for f = 10
       3.1.4 Summary of Simulation Studies
   3.2 Application of GSinCE in Least Favorable Cases
       3.2.1 Small Percentage of Active Inputs
       3.2.2 Non-linear Functions
       3.2.3 Detecting Large Effects
   3.3 Properties of Two-stage Designs
       3.3.1 Augmented Design
       3.3.2 Combined Design at Stage 2

4. Application of GSinCE
   4.1 Examples from the Literature
       4.1.1 Borehole Model
       4.1.2 A Model for the Weight of an Aircraft Wing
       4.1.3 OTL Circuit Model
       4.1.4 Piston Simulator Model
       4.1.5 Summary
   4.2 A Real Computer Experiment: FRAPCON Model
       4.2.1 Description of Code
       4.2.2 Use of GSinCE
       4.2.3 Implementations

5. Computation of Sensitivity Indices
   5.1 Sensitivity Indices of Quantitative Inputs
       5.1.1 Definition of Sensitivity Indices
       5.1.2 Estimation in Gaussian Process Framework
       5.1.3 The Integrals: sgint, dbint, mxint
       5.1.4 Example
   5.2 Sensitivity Indices of Mixed Inputs
       5.2.1 Setup
       5.2.2 Correlation Function for Mixed Inputs
       5.2.3 Estimation of Sensitivity Indices for Mixed Inputs
       5.2.4 Example

6. Algorithms for Generating Maximin Latin Hypercube and Orthogonal Designs
   6.1 Introduction
   6.2 Maximin Criteria for Space-filling Designs
   6.3 Algorithms for Space-filling Latin Hypercube Designs
       6.3.1 Complete Search and Random Generation
       6.3.2 Random Swap Methods for Maximin LHDs
       6.3.3 A Smart Swap Method for Maximin LHDs
   6.4 Algorithms for Orthogonal Maximin Designs
       6.4.1 Orthogonal Maximin LHDs
       6.4.2 Orthogonal Maximin Gram-Schmidt Designs
   6.5 Comparisons
       6.5.1 Maximin LHDs
       6.5.2 Orthogonal Maximin Designs
   6.6 Summary

7. Alternative Two-stage Designs
   7.1 Orthogonal Array-based Latin Hypercube Design
   7.2 Stage 1 Design for a Two-stage Group Screening Procedure
       7.2.1 Construction
       7.2.2 Secondary Criteria
   7.3 Stage 2 Design for a Two-stage Group Screening Procedure
   7.4 Limitations
       7.4.1 Availability of OA-based LHD
       7.4.2 Group Variable Defined by Averaging

8. Software
   8.1 GSinCE Code
   8.2 Sensitivity Code
   8.3 Maximin Code

Bibliography

LIST OF TABLES

3.1 Marginal probabilities and coefficient distributions for the simulation study

3.2 Six combinations used to recommend τ

3.3 Median and IQR values of the performance measures, and average number of groups and average total runs, over 200 test functions with about 25% of active inputs among f = 20 inputs, for each τ in each combination; the value in parentheses is the number of test functions generated with no active inputs

3.4 Modified values of q_L and q_{×|NN} to achieve about 25% of f = 30 inputs active; other probabilities are as in Table 3.1

3.5 Median values of the performance measures, and median/average values of true/claimed active inputs, over 50 test functions with about 25% of active inputs among f = 30 inputs; the value in parentheses is the number of test functions generated with no active inputs

3.6 Modified values of q_L to achieve about 25% and 35% of f = 10 inputs active, while keeping other probabilities as in Table 3.1

3.7 Median values of the performance measures, and median/average values of true/claimed active inputs, over 100 test functions with about 25% of active inputs among f = 10 inputs; the value in parentheses is the number of test functions generated with no active inputs

3.8 Median values of the performance measures, and median/average values of true/claimed active inputs, over 100 test functions with about 35% of active inputs among f = 10 inputs; the value in parentheses is the number of test functions generated with no active inputs

3.9 Median values of the performance measures, and median/average values of true/claimed active inputs, over 100 test functions with about 20% of active inputs among f = 10 inputs; the value in parentheses is the number of test functions generated with no active inputs

3.10 Median values of FDR, FNDR, specificity, and sensitivity over 30 functions having small percentages of active inputs

3.11 All coefficients of test function (3.5)

3.12 Results of automatic grouping and applying GSinCE for test function (3.5)

3.13 Results of the original and modified procedures and the one-stage method

3.14 Comparisons of median values of FDR, FNDR, specificity, and sensitivity over 30 non-linear functions for the original and modified procedures

3.15 Result of automatic grouping and applying GSinCE in favorable situations

3.16 Minimum inter-point distance and computation time of two design methods, and the estimated TESIs based on each of these designs

4.1 Grouping and active effect selection by GSinCE for the borehole model

4.2 Computation times and active effect selection of the four procedures for the borehole model using 70 runs

4.3 Grouping and active effect selection by GSinCE for the aircraft wing weight model

4.4 Computation times and active effect selection of the four procedures for the aircraft wing weight model using 65 runs

4.5 Grouping and active effect selection by GSinCE for the OTL circuit model

4.6 Computation times and active effect selection of the four procedures for the OTL circuit model using 60 runs

4.7 Grouping and active effect selection by GSinCE for the piston model

4.8 Computation times and active effect selection of the four procedures for the piston model using 85 runs

4.9 Summary of screening for all outputs based on grouping by EDA

4.10 Stage 1 grouping by EDA and selection for y1

4.11 Stage 1 grouping by EDA and selection for y2

4.12 Stage 1 grouping by EDA and selection for y3

4.13 Stage 1 grouping by EDA and selection for y4

4.14 Grouping by expert

4.15 Summary of screening for all outputs based on grouping by expert

4.16 Construction of subgroups within a group made by expert

4.17 Summary of screening for all outputs based on grouping by expert and EDA

4.18 Summary of screening for all groupings

5.1 Estimated sensitivity indices for the example function in (5.46) using different correlation functions and approaches

5.2 Estimated sensitivity indices for the example function in (5.84)

6.1 Characteristics of best (n, k) = (9, 4) designs formed using criterion d_min^(2): φ_15 is the Morris and Mitchell (1995) objective function with p = 15; ρ²_ave is the average squared correlation; d_min^(4) is the minimum 4-dimensional rectangular distance; |ρ|_max is the maximum absolute correlation; T is the number of starting designs

6.2 Characteristics of best (n, k) = (40, 5) designs formed using criterion d_min^(2): φ_15 is the Morris and Mitchell (1995) objective function with p = 15; ρ²_ave is the average squared correlation; d_min^(5) is the minimum 5-dimensional rectangular distance; |ρ|_max is the maximum absolute correlation; T is the number of starting designs

6.3 Best orthogonal maximin 9 × 4 designs found by the OMLHD, OSGSD-φ_15, and OSGSD-d_min^(2) algorithms based on 7 minutes of computational time, and scatterplot matrices of these designs

6.4 Comparisons of best designs found by the OMLHD, OSGSD-φ_15, and OSGSD-d_min^(2) algorithms based on 7 minutes of computational time: φ_15 is the Morris and Mitchell (1995) objective function with p = 15; ρ²_ave is the average squared correlation; d_min^(4) is the minimum 4-dimensional rectangular distance; |ρ|_max is the maximum absolute correlation; T is the number of starting designs

6.5 Distributions of φ_15 and d_min^(4) values in 100 9 × 4 designs produced by the OSGSD-φ_15 algorithm (4 seconds of computation), with corresponding values of the best scaled OMLHD design indicated by horizontal lines

6.6 Comparisons of best designs found by the OMLHD, OSGSD-φ_15, and OSGSD-d_min^(2) algorithms based on 9 hours and 42 minutes of computational time: φ_15 is the Morris and Mitchell (1995) objective function with p = 15; ρ²_ave is the average squared correlation; d_min^(5) is the minimum 5-dimensional rectangular distance; |ρ|_max is the maximum absolute correlation; T is the number of starting designs

6.7 Distributions of φ_15 and d_min^(5) values in 100 40 × 5 designs produced by the OSGSD-φ_15 algorithm (349 seconds of computation), with corresponding values of the best scaled OMLHD design indicated by horizontal lines

7.1 Secondary design criteria for the Stage 1 design

7.2 40 × 4 Stage 1 design X^(1)

7.3 Ranges and estimated TESIs of 2 groups under 3 different groupings

LIST OF FIGURES

3.1 Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), and sensitivity (line with diamond) over 200 test functions versus τ × 100%, for functions with about 25% of active inputs among f = 20 inputs

3.2 Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), and sensitivity (line with diamond) over 50 test functions versus τ × 100%, for functions with about 25% of active inputs among f = 30 inputs

3.3 Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), and sensitivity (line with diamond) over 100 test functions versus τ × 100%, for functions with about 25% of active inputs among f = 10 inputs

3.4 Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), and sensitivity (line with diamond) over 100 test functions versus τ × 100%, for functions with about 35% of active inputs among f = 10 inputs

3.5 Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), and sensitivity (line with diamond) over 100 test functions versus τ × 100%, for functions with about 20% of active inputs among f = 10 inputs

5.1 Description of dbint of the cubic correlation function

5.2 Description of R^Y(x, η_1; θ) R^Y(x, η_2; θ) of the cubic correlation function

7.1 Rotate Method

7.2 Shrink Method with h = 0.25

CHAPTER 1

INTRODUCTION

1.1 Computer Experiments

There are many complex physical phenomena that are impossible or too expensive to study using physical experiments. However, some of these physical processes can be described by means of a mathematical model which relates inputs to output. A computer model is the implementation of such mathematical models in computer code. A computer experiment is the use of the computer code as an experimental tool in which the experimenter seeks to determine the computational "response" of the code to the inputs. Computer experiments are prevalent in a wide range of studies, for example, in engineering (Fang, Li, and Sudjianto (2005)), in biomechanics (Ong, Lehman, Notz, Santner, and Bartel (2006)), in the physical sciences (Higdon, Kennedy, Cavendish, Cafeo, and Ryne (2004)), in the life sciences (Upton, Guilak, Laursen, and Setton (2006), Fogelson, Kuharsky, and Yu (2003)), in economics (Lempert, Williams, and Hendrickson (2002)), and other areas of natural science.

The output from most computer codes is deterministic; that is, two runs of a computer code at the same set of input values give an identical output value. Hence the traditional principles of blocking, randomization, and replication of the physical experiment are not required for the design of a computer experiment. A computer code is often complicated and can involve a large number of inputs, so it may take hours or days to produce a single output. Thus there is a need for efficient screening methods for detecting inputs that have influential impacts on an input-output system. A flexible predictor (see Section 1.2) is often fitted to the outputs to provide a rapidly-computable surrogate predictor (a metamodel for the code). The performance of the predictor depends upon the choice of the training design points used to develop the predictor, so there is a need for careful design.

In this dissertation, a new screening method is proposed for computer experiments. The computation of sensitivity indices as screening measures is discussed, together with the construction of designs for successful screening.

1.2 Gaussian Stochastic Process Model

Although the output from a computer code is deterministic, uncertainty arises since the exact functional form of the input-output relationship is unknown from a limited number of runs, so statistical models are needed to characterize the uncertainty. The Gaussian process (GP) model has been widely used to model the output from a computer experiment, because it provides a flexible framework, producing a large class of potential response surfaces, and easily adapts to the presence of nonlinearity and interactions. In the following, the GP model (see Sacks, Welch, Mitchell, and Wynn (1989) and Santner, Williams, and Notz (2003), chapters 2 and 3) is reviewed briefly.

Let y(x) be a scalar output which is a function of a k-dimensional vector of inputs, x = (x_1, x_2, ..., x_k). Then the GP model treats the deterministic output y(x) as a realization of a random function Y(x),

$$Y(x) = f^\top(x)\,\beta + Z(x) \qquad (1.1)$$

where f(x) = (f_1(x), ..., f_q(x))^⊤ is a q × 1 vector of known regression functions at x, and β = (β_1, ..., β_q)^⊤ is a q × 1 vector of unknown regression coefficients. Z(·) is a stationary Gaussian process with mean zero, variance 1/λ_Z, and covariance function

$$\mathrm{Cov}(Z(x), Z(\tilde{x})) = \frac{1}{\lambda_Z}\, R(x - \tilde{x}) \qquad (1.2)$$

where x and x̃ are two input sites, and R(x − x̃) is a correlation function of Z(·).

Valid correlation functions must have R(0) = 1, must be symmetric about the origin, i.e., R(h) = R(−h), and must be positive definite. One popular correlation function is the product power exponential correlation function,

$$R(x - \tilde{x}) = \prod_{j=1}^{k} \exp\!\left(-\theta_j\, |x_j - \tilde{x}_j|^{p_j}\right) \qquad (1.3)$$

where θ_j > 0 and 0 < p_j ≤ 2. The Gaussian correlation function is the special case with p_j = 2. Cubic and Matérn correlation functions are also widely used (see Santner et al. (2003), chapter 2).
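To make the formula concrete, here is a minimal sketch of the product power exponential correlation (1.3) in Python. This is not the dissertation's software; the function name and the illustrative θ and p values are assumptions for the example.

```python
import numpy as np

def power_exp_corr(x, x_tilde, theta, p):
    """Product power exponential correlation R(x - x~) as in (1.3).

    Requires theta[j] > 0 and 0 < p[j] <= 2; taking p[j] = 2 for all j
    gives the Gaussian correlation function as a special case.
    """
    x, x_tilde = np.asarray(x, float), np.asarray(x_tilde, float)
    # |x_j - x~_j|^{p_j} keeps the product well defined for non-integer powers
    return np.exp(-np.sum(theta * np.abs(x - x_tilde) ** p))

# R -> 1 as the two sites coincide, R -> 0 as they move apart
r = power_exp_corr([0.2, 0.5], [0.3, 0.1],
                   theta=np.array([1.0, 2.0]), p=np.array([2.0, 2.0]))
```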

Suppose that the output y(x_0) is to be predicted at a new input site x_0, based on the training data y_n = (y(x_1), ..., y(x_n))^⊤ at the n input sites x_1, ..., x_n. Then Y_0 = Y(x_0), Y_1 = Y(x_1), ..., Y_n = Y(x_n) also follow a GP from (1.1), and hence the joint distribution of Y_0 and Y_n = (Y_1, ..., Y_n) is the multivariate normal distribution

$$\begin{pmatrix} Y_0 \\ \boldsymbol{Y}_n \end{pmatrix} \sim N_{1+n}\!\left( \begin{pmatrix} \boldsymbol{f}_0^\top \\ F \end{pmatrix} \boldsymbol{\beta},\; \frac{1}{\lambda_Z} \begin{pmatrix} 1 & \boldsymbol{r}_0^\top \\ \boldsymbol{r}_0 & R \end{pmatrix} \right) \qquad (1.4)$$

where f_0 = f(x_0) is a q × 1 vector of regression functions at x_0, F is an n × q matrix of regression functions with (i, j)th element f_j(x_i) for 1 ≤ i ≤ n, 1 ≤ j ≤ q, r_0 = (R(x_0 − x_1), ..., R(x_0 − x_n))^⊤, and R is an n × n matrix with (i, j)th element R(x_i − x_j). Let ψ be the vector of parameters of the correlation function R(·). Given the training data y_n and the model parameters (β, λ_Z, ψ), it follows from (1.4) that Y_0 has the conditional normal distribution

$$[Y_0 \mid \boldsymbol{y}_n, \boldsymbol{\beta}, \lambda_Z, \boldsymbol{\psi}] \sim N\!\left[\, \boldsymbol{f}_0^\top \boldsymbol{\beta} + \boldsymbol{r}_0^\top R^{-1}(\boldsymbol{y}_n - F\boldsymbol{\beta}),\; \frac{1}{\lambda_Z}\left(1 - \boldsymbol{r}_0^\top R^{-1} \boldsymbol{r}_0\right) \right]. \qquad (1.5)$$

The minimum MSPE linear unbiased predictor, or best linear unbiased predictor (BLUP), of Y_0 (see Sacks et al. (1989)) is

$$\hat{Y}_0 = \boldsymbol{f}_0^\top \hat{\boldsymbol{\beta}} + \boldsymbol{r}_0^\top R^{-1}(\boldsymbol{y}_n - F\hat{\boldsymbol{\beta}}) \qquad (1.6)$$

where β̂ = (F^⊤ R^{−1} F)^{−1} F^⊤ R^{−1} y_n is the generalized least squares estimator of β. In practice, the parameters ψ in the correlation function R(·) are unknown, so the estimates R̂ and r̂_0 can be used instead of R and r_0 in (1.6). Such a predictor is called an empirical best linear unbiased predictor (EBLUP) of Y_0. According to the method used for estimating ψ, different EBLUPs, such as the maximum likelihood EBLUP, restricted maximum likelihood EBLUP, cross-validation EBLUP, and posterior mode EBLUP, can be obtained. See Santner et al. (2003), chapter 3, for more details.

In a fully Bayesian approach, prior distributions for the model parameters (β, λ_Z, ψ) are specified, and the predictor of Y_0 is obtained as the mean of the predictive distribution [Y_0 | y_n], i.e.,

$$E(Y_0 \mid \boldsymbol{y}_n) = E_{\boldsymbol{\beta}, \lambda_Z, \boldsymbol{\psi}}\!\left( E_{Y_0}(Y_0 \mid \boldsymbol{y}_n, \boldsymbol{\beta}, \lambda_Z, \boldsymbol{\psi}) \right). \qquad (1.7)$$

To compute the Bayesian predictor numerically, one can take draws of the model parameters (β, λ_Z, ψ) from the posterior distribution [β, λ_Z, ψ | y_n] using a Markov chain Monte Carlo (MCMC) sampling method, compute E(Y_0 | y_n, β, λ_Z, ψ) = f_0^⊤ β + r_0^⊤ R^{−1}(y_n − Fβ) for each draw of (β, λ_Z, ψ), and take the sample mean of these estimates over the different draws of the parameters.
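A compact sketch of the linear algebra in (1.6)-(1.7) may help fix ideas. This is not the dissertation's Chapter 8 software; the function name is illustrative, and the inputs R, F, f0, r0 are assumed to have been built from a chosen correlation function such as (1.3).

```python
import numpy as np

def blup(f0, r0, R, F, y):
    """BLUP (1.6) of Y0 given training data y, with the generalized
    least squares estimate of beta computed internally."""
    Ri_y = np.linalg.solve(R, y)                      # R^{-1} y_n
    Ri_F = np.linalg.solve(R, F)                      # R^{-1} F
    beta = np.linalg.solve(F.T @ Ri_F, F.T @ Ri_y)    # GLS estimator of beta
    return f0 @ beta + r0 @ np.linalg.solve(R, y - F @ beta)

# Bayesian predictor (1.7): with MCMC draws of (beta, lambda_Z, psi),
# compute f0' beta + r0(psi)' R(psi)^{-1} (y - F beta) for each draw
# and average the results over draws.
```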

1.3 Screening Procedure

A computer code can involve a huge number of inputs and may take hours or days to produce a single output at a given set of input values because of the complexity of the code. Thus there is a need for efficient methods for detecting inputs that have major impacts on an input-output system. Also, in physical experiments, a conventional factorial experiment may not be economically feasible when numerous factors are considered. So it is necessary to identify influential inputs at an early stage of experimentation; these can be investigated further at a later stage.

Screening methods developed in the setting of computer experiments are reviewed in Section 1.3.1, and group screening in physical experiments in Section 1.3.2. A new two-stage group screening procedure that identifies active inputs in a computer experiment setting, and that borrows ideas from group screening, is proposed in Chapter 2.

1.3.1 Screening in Computer Experiments

Most of the screening methods in computer experiments are based on the GP model described in Section 1.2. For example, Sacks et al. (1989) used a decomposition of the output function y(x) into an average effect, main effects for each input, two-factor interactions, and high-order interactions; estimated the effects by replacing y(x) by the predictor based on the GP model; and plotted the estimated effects to investigate the importance of the inputs. Welch, Buck, Sacks, Wynn, Mitchell, and Morris (1992) extended the Sacks et al. (1989) method so as to build an accurate predictor and identify important inputs when there are up to 30-40 inputs. Oakley and O'Hagan (2004) presented a Bayesian approach to probabilistic sensitivity analysis, which formulates uncertainty in the model inputs by a joint probability distribution and then analyzes the induced uncertainty in the output. Schonlau and Welch (2006) described the implementation of visualizing the estimated effects and quantifying the importance of the inputs via an ANOVA type of decomposition. Linkletter, Bingham, Hengartner, Higdon, and Ye (2006) proposed a Bayesian method to select active inputs based on the posterior distribution of the parameters of the Gaussian correlation function. Campbell, McKay, and Williams (2006) suggested sensitivity analysis for functional computer model outputs by expanding the functional outputs in terms of an appropriate set of basis functions and performing sensitivity analysis on the coefficients of the expansion. Higdon, Gattiker, Williams, and Rightley (2008) performed sensitivity analysis for high-dimensional output using basis representations to reduce the dimensionality.

1.3.2 Group Screening in Physical Experiments

Group screening methodology was first described by Dorfman (1943) for blood screening and was later adapted to the setting of physical experiments by Watson (1961) for identifying active factors in cases where there are many potentially influential input factors. In two-stage group screening, the first stage of experimentation is done on groups of factors; the individual factors within the groups identified as active in the first stage are then investigated individually in a second-stage experiment. The early work in this area considered models with main effects only, but the methodology has now been extended to handle interactions. Lewis and Dean (2001) investigated two-stage group screening strategies for detecting interactions. Vine, Lewis, and Dean (2005) developed methodology for handling groups of unequal sizes as well as unequal probabilities for factors being active. Morris (2006) gave a survey of group screening and its use in searching for active factors. Vine, Lewis, Dean, and Brunson (2008) discussed practical aspects involved in running a two-stage group screening experiment for investigating interactions.

1.4 Design of Computer Experiments

The output from most computer codes is deterministic and hence no replications are required at any design point. Moreover, for thorough exploration of the experimental region, the design points should be spread evenly throughout the region. Designs for which the points are unreplicated and well spread out are called "space-filling."

McKay, Beckman, and Conover (1979) introduced Latin hypercube designs for use in computer experiments. In its simplest form, an n × k Latin hypercube design (LHD) has hth column ξ_h = [ξ_{1h}, ξ_{2h}, ..., ξ_{nh}]^⊤, which can be obtained from a random permutation π_h = [π_{1h}, π_{2h}, ..., π_{nh}]^⊤ of 1, ..., n. Then ξ_{ih} is the midpoint of the interval [(π_{ih} − 1)/n, π_{ih}/n]. In a slightly more sophisticated approach, a random point in this interval may be taken; the latter procedure is used in this dissertation.
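As a sketch of this construction (the function name and seed are illustrative, not the dissertation's code):

```python
import numpy as np

def latin_hypercube(n, k, rng, midpoints=False):
    """n x k Latin hypercube design on [0, 1]^k (McKay et al. (1979)).

    Column h comes from a random permutation pi_h of 1, ..., n; entry i
    is the midpoint of [(pi_ih - 1)/n, pi_ih/n], or, as used in this
    dissertation, a uniform random point within that interval.
    """
    pi = np.column_stack([rng.permutation(n) + 1 for _ in range(k)])
    if midpoints:
        return (pi - 0.5) / n
    return (pi - rng.uniform(size=(n, k))) / n

X = latin_hypercube(8, 3, np.random.default_rng(1))
```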

All LHDs have the one-dimensional space-filling property that an observation is taken in every one of the n evenly spaced intervals over the [0, 1] range of each input. However, they need not have space-filling properties in higher dimensions. Several criteria for generating space-filling designs are described in Santner et al. (2003), chapter 5. In particular, the "maximin distance" criterion, which seeks a design that maximizes the minimum inter-point distance, was first introduced by Johnson, Moore, and Ylvisaker (1990) and extended by Morris and Mitchell (1995).
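A minimal illustration of the maximin distance criterion, reusing the latin_hypercube sketch above; this crude best-of-many search is only for intuition, not one of the algorithms proposed in Chapter 6.

```python
import numpy as np
from scipy.spatial.distance import pdist

def min_interpoint_distance(X):
    """Minimum Euclidean distance between pairs of design points."""
    return pdist(X).min()

# Keep the randomly generated LHD whose minimum inter-point
# distance is largest (the maximin criterion).
rng = np.random.default_rng(0)
best = max((latin_hypercube(8, 3, rng) for _ in range(1000)),
           key=min_interpoint_distance)
```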

Another desirable property of a design for a computer experiment is that of orthogonality, where the design matrix has uncorrelated columns. If the values of two inputs are highly correlated, then it is difficult to distinguish their effects on the output. An orthogonal design allows one to assess the effects of the different inputs independently. Tang (1993) proposed a method of constructing orthogonal array-based LHDs by combining the desirable properties of both orthogonal arrays and LHDs. Owen (1994) proposed an algorithm for generating LHDs with small pairwise correlations between input variables. Tang (1998) developed an algorithm for reducing polynomial canonical correlations of LHDs by extending Owen (1994)'s algorithm. Ye (1998) proposed a construction method for orthogonal LHDs with n = 2^m + 1 runs and k = 2m − 2 input variables, and used an improvement algorithm for selecting designs within this class under space-filling and other criteria. Butler (2001) presented a construction method for LHDs which are orthogonal with respect to models based on trigonometric functions. Steinberg and Lin (2006) constructed orthogonal LHDs with n = 2^k runs, where the number k of inputs is a power of 2, which can include more inputs than those proposed by Ye (1998). Cioppa and Lucas (2007) extended Ye (1998)'s approach to construct orthogonal LHDs that can accommodate more inputs and presented a method that improves the space-filling properties of the resulting LHD at the expense of inducing small correlations. Joseph and Hung (2008) proposed an exchange algorithm for efficient generation of LHDs under a weighted combination of orthogonality and space-filling criteria. Lin, Mukerjee, and Tang (2009) proposed a method for constructing large dimensional orthogonal and nearly orthogonal LHDs by using an orthogonal array and a small LHD. Bingham, Sitter, and Tang (2009) constructed a class of orthogonal designs which have various choices for the number of levels and flexible run sizes by relaxing the LHD constraint. In Chapter 6, a new method for achieving orthogonal and space-filling designs, based on Gram-Schmidt orthogonalization, is proposed.

1.5 Overview of Dissertation

The rest of the dissertation is organized as follows. Chapter 2 proposes a new two-stage group screening procedure that identifies active inputs in a computer experiment setting. The performance of the proposed method is discussed in Chapter 3, and application of the new method is demonstrated in Chapter 4. The computation of sensitivity indices as screening measures is discussed in Chapter 5. Chapter 6 presents a new algorithm for generating maximin LHDs and a new algorithm for achieving orthogonal maximin designs using Gram-Schmidt orthogonalization. Alternative approaches for creating two-stage designs are given in Chapter 7. Chapter 8 provides software for the proposed screening procedure, computation of sensitivity indices, and design search algorithms.

CHAPTER 2

TWO-STAGE SENSITIVITY-BASED GROUP SCREENING IN COMPUTER EXPERIMENTS

This chapter proposes a new two-stage group screening procedure that identifies active inputs in computer experiments. The whole procedure is explained here in detail. Further discussion related to the performance of the procedure appears in Chapter 3, and applications using various examples are shown in Chapter 4.

2.1 Introduction

2.1.1 Background

A computer model is a numerical implementation of a mathematical description of an input-output relationship of a physical process. Modelling through computer codes is prevalent in a wide range of applications, for example, in engineering (Fang et al. (2005)), in biomechanics (Ong et al. (2006)), in the physical sciences (Higdon et al. (2004)), in the life sciences (Upton et al. (2006), Fogelson et al. (2003)), in economics (Lempert et al. (2002)), and other areas of natural science.

Over the past 20 years, the use of computer codes as experimental tools has become increasingly sophisticated. In addition to inputs that describe different "treatments," computer models can allow the user to vary environmental inputs that describe the conditions in which the process operates, and calibration inputs, which are unknown physical constants in the underlying mathematical model and for which expert-based subjective distributions are available. For example, Ong, Santner, and Bartel (2008) presented an application in the biomechanical engineering design of a prosthetic acetabular cup, in which the hip socket of a prosthetic total hip replacement rotates. In addition to inputs defining the cup geometry, their study included inputs representing environmental conditions such as the patient bone quality and loading patterns, inputs describing mis-alignments from nominal cup insertion values (which represent the level of surgeon skill), and inputs describing unknown aspects of the true physical setting such as the interface friction between the bone and prosthesis. The finite-element codes used in this and other complex applications can require up to 24 hours for a single run. Consequently, there is a need for efficient methods for detecting inputs that have major impacts on an input-output system. These are called the active or influential inputs. Once identified, researchers can restrict attention to varying only the active inputs (while setting other inputs to nominal values), thus reducing the number of future code runs needed to understand the detailed input-output relationship.

The literature contains several proposals for screening inputs in computer experiments where the deterministic output is modelled as a realization of a random function. An approach that decomposes a Gaussian random function approximator of a computer model into an average effect, main effects for each input, two-factor interactions, and high-order interactions, and plots the estimated effects or quantifies the importance of the effects, has been applied by many authors (Section 1.3.1), for example, Sacks et al. (1989), Welch et al. (1992), Oakley and O'Hagan (2004), and Schonlau and Welch (2006). Linkletter et al. (2006) proposed a Bayesian method to select active inputs based on the posterior distribution of the parameters of the Gaussian correlation function. Campbell et al. (2006) and Higdon et al. (2008) performed sensitivity analysis for multiple outputs.

For complex computer codes that are expensive to run and that must account for many inputs, the standard screening methods described in Section 1.3 can be time-consuming. A screening method is presented below that incorporates experimental design considerations and group screening and allows the user to identify influential inputs in a computer experiment with computational efficiency.

As mentioned in Section 1.3.2, group screening methodology was first described by Dorfman (1943) for blood screening and was later adapted to the setting of physical experiments by Watson (1961) for identifying active factors in cases where there are many potentially influential input factors. The early work in this area considered models with main effects only, but the methodology has now been extended to handle interactions (see Morris and Mitchell (1983), Lewis and Dean (2001), and Vine et al. (2005); see also reviews by Kleijnen (1987) and Morris (2006)). Group screening works well under "effect sparsity," where the proportion of active main effects and interactions is small. In two-stage group screening, the first stage of experimentation is done on groups of factors. The individual factors within the groups identified as active in the first stage are then investigated individually in a second-stage experiment.

The proposed method in this dissertation provides a two-stage group screening procedure that identifies active inputs in a computer experiment setting; it eliminates non-active inputs having small effects, not merely those having zero effects. This approach reduces the number of experimental runs needed to understand the input-output relationship because groups of inputs with small effects are dropped at an early stage of the procedure.

2.1.2 Overview of the Proposed Procedure

For clarity of exposition, the description of each stage of the proposed procedure is divided into sampling, grouping (for Stage 1 only), and analysis phases. The proposed procedure is called the GSinCE (Group Screening in Computer Experiments) procedure. An outline is given below, with further details in Sections 2.2-2.4.

Initialization Given that n runs of the computer code are to be made in Stage 1, a matrix X* with n rows and (n − 1) columns satisfying certain desirable properties is generated, as described in Section 2.2. The choice of n is discussed in Section 2.3.1.

Stage 1 In the sampling phase, a set of columns from X* is selected to produce a design matrix X^(1). The computer code is run at the design points (rows) in X^(1) and a Gaussian process (GP) model is fitted to the output. In the grouping phase, the output is used to place the inputs into disjoint sets (groups). All inputs in the same group are set equal to the same level, defined by a design matrix G (Section 2.3.2), and the fitted GP model is used to predict the output at the design points in G. The analysis phase (Section 2.3.3) uses total effect sensitivity indices to determine which groups of inputs are inactive and which potentially contain active inputs. To judge whether a group is active or non-active, an additional "low-impact" input is created to use as a "benchmark" (c.f. Linkletter et al. (2006), Wu, Boos, and Stefanski (2007)).

Stage 2 The inputs in the groups selected as active in Stage 1 are investigated individually in Stage 2. In the Stage 2 sampling phase (Section 2.4.1), a new design matrix X^(2) is selected in such a way that the design points in the combined X^(1), X^(2) retain, as closely as possible, the desirable properties identified in Section 2.2. The computer code is run at the design points in X^(2). The Stage 2 analysis phase uses the outputs from both stages in a second sensitivity analysis to make the final selection of active inputs (Section 2.4.2).

2.2 GSinCE Initialization Stage

Suppose there are f experimental inputs, where the range of the jth input is [a_j, b_j] and a_j and b_j are known constants, for j = 1, ..., f. Assume that the domain for the vector of the f inputs is the entire hyper-rectangle ∏_{j=1}^{f} [a_j, b_j]. The design will be obtained from the scaled input space [0, 1]^f, and x_tj ∈ [0, 1] will denote the value of the jth scaled input on the tth run of the design. Then the computer code is run to obtain the output using the unscaled input, z_tj ≡ x_tj × (b_j − a_j) + a_j, for t = 1, ..., n and j = 1, ..., f.

In the Initialization Stage, a preliminary design matrix X* is constructed with n rows and n − 1 columns as below. Denote the jth column of X* by ξ_j = (ξ_{1j}, ..., ξ_{nj})^⊤, where ⊤ denotes transpose, j = 1, ..., n − 1, and the ith row by x_i = (x_{i1}, ..., x_{i(n−1)}), i = 1, ..., n. The design matrices for the Stage 1 sampling and grouping phases will be drawn from this matrix, as will those for the low-impact inputs. There are three requirements for the design matrix X*. First, the columns of X* are required to be uncorrelated to allow independent assessment of the effects of the different inputs. Second, the minimum and maximum values in each column must be 0 and 1, respectively; if this is not the case, then those variables whose scaled input values in the design have larger ranges will have a larger impact on the response, artificially induced by the design (see Section 3.3.2). Third, the design X* should be "space-filling" at each stage, in the sense that the selected design maximizes the minimum inter-point distance in all 2-dimensional subspaces of the input space. This helps to ensure that all regions of the input space are explored (c.f. Sacks et al. (1989), and Santner et al. (2003), chapter 5). These three properties are referred to, respectively, as (P.1), (P.2), and (P.3). An algorithm for generating X*, which satisfies (P.1) and (P.2) and approximately satisfies (P.3), follows.

Step 1 Randomly generate an n × (n − 1) Latin hypercube design matrix Λ = (λ_1, ..., λ_{n−1}) with rank n − 1 (see McKay et al. (1979)).

Step 2 Center each column of Λ: v_h = λ_h − (λ_h^⊤ 1/n) 1 for h = 1, ..., n − 1, where 1 is a vector of n unit elements.

Step 3 Apply the Gram-Schmidt algorithm to form orthogonal columns u_h = (u_{1h}, ..., u_{nh})^⊤:

$$u_h = \begin{cases} v_1, & h = 1; \\[4pt] v_h - \displaystyle\sum_{i=1}^{h-1} \frac{u_i^\top v_h}{\|u_i\|^2}\, u_i, & h = 2, \ldots, n-1. \end{cases}$$

Step 4 Scale the values of u_h to [0, 1] to give ξ_h = (ξ_{1h}, ..., ξ_{nh})^⊤, where ξ_{ih} = (u_{ih} − min{u_{1h}, ..., u_{nh}})/(max{u_{1h}, ..., u_{nh}} − min{u_{1h}, ..., u_{nh}}), for i = 1, ..., n and h = 1, ..., n − 1. Set X = (ξ_1, ..., ξ_{n−1}).

Step 5 Select the design matrix X* = (ξ_1, ..., ξ_{n−1}) which maximizes the minimum inter-point distance over all projections of the design into 2-dimensional space, i.e., maximizes

$$\min_{i < j \in \{1, \ldots, n\}} \;\; \min_{h < \ell \in \{1, \ldots, n-1\}} \sqrt{(\xi_{ih} - \xi_{jh})^2 + (\xi_{i\ell} - \xi_{j\ell})^2}.$$

Step 5 can be carried out (approximately) in a brute-force manner by repeating Steps 1-4 many times and selecting the best maximin design among the candidate designs generated. Alternatively, some form of genetic exchange algorithm (see Bartz-Beielstein (2006)) could be used to find an approximate maximin design, for example, the evolutionary operation (EVOP) method used in Forrester, Sóbester, and Keane (2008), chapter 1.

Step 4 of the algorithm guarantees (P.2), while (P.1) can be verified as follows. Let ξ̄_h = ξ_h^⊤ 1/n be the arithmetic mean of the elements in the hth column of X*; then, by construction, the correlation of ξ_h and ξ_ℓ, h ≠ ℓ, is

$$r(\xi_h, \xi_\ell) = \frac{(\xi_h - \bar{\xi}_h \mathbf{1})^\top (\xi_\ell - \bar{\xi}_\ell \mathbf{1})}{\sqrt{(\xi_h - \bar{\xi}_h \mathbf{1})^\top (\xi_h - \bar{\xi}_h \mathbf{1})}\, \sqrt{(\xi_\ell - \bar{\xi}_\ell \mathbf{1})^\top (\xi_\ell - \bar{\xi}_\ell \mathbf{1})}} = \frac{u_h^\top u_\ell}{\sqrt{u_h^\top u_h}\, \sqrt{u_\ell^\top u_\ell}} = 0,$$

where u_h and u_ℓ are defined in Step 3 and satisfy u_h^⊤ u_ℓ = 0 and u_h^⊤ 1 = u_ℓ^⊤ 1 = 0 from Step 2. Alternative distance criteria, such as the average distance criterion over all low-dimensional projections, could also be used (see, for example, Welch (1985)).
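The algorithm above can be sketched in a few lines of Python. This is a rough, brute-force rendering of Steps 1-5 under my own naming, not the Chapter 8 software.

```python
import numpy as np
from itertools import combinations

def candidate(n, rng):
    """Steps 1-4: one n x (n-1) candidate satisfying (P.1) and (P.2)."""
    # Step 1: random LHD (uniform draw within each cell); almost surely full rank
    lam = (np.column_stack([rng.permutation(n) + 1 for _ in range(n - 1)])
           - rng.uniform(size=(n, n - 1))) / n
    v = lam - lam.mean(axis=0)            # Step 2: center the columns
    u = np.empty_like(v)                  # Step 3: Gram-Schmidt
    for h in range(n - 1):
        uh = v[:, h].copy()
        for i in range(h):
            uh -= (u[:, i] @ v[:, h]) / (u[:, i] @ u[:, i]) * u[:, i]
        u[:, h] = uh
    # Step 4: scale each column to [0, 1]
    return (u - u.min(axis=0)) / (u.max(axis=0) - u.min(axis=0))

def min_2d_dist(X):
    """Step 5 criterion: minimum inter-point distance over all
    2-dimensional projections of the design."""
    return min(np.hypot(*(X[i, [h, l]] - X[j, [h, l]]))
               for h, l in combinations(range(X.shape[1]), 2)
               for i, j in combinations(range(X.shape[0]), 2))

# Brute-force Step 5: best of many candidate designs
rng = np.random.default_rng(0)
X_star = max((candidate(10, rng) for _ in range(200)), key=min_2d_dist)
```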

2.3 GSinCE Procedure Stage 1

2.3.1 Stage 1 Sampling Phase

The GSinCE procedure is to be used in a screening situation where it is reasonable to assume that only a small fraction (say 25% or less) of the inputs are active. Loeppky, Sacks, and Welch (2009) justified "10 × (number of inputs)" as a reasonable rule of thumb for the number of runs in an effective initial computer experiment. Using this base value, 5 runs for each active input in each stage is reasonable. As an example, with f = 20 inputs and a conservative assumption of a maximum of 40% active inputs, one may take n = 5 × (f × 0.4) = 2f runs in Stage 1.

The Stage 1 design matrix, X^(1), is taken to be the first f columns of the n × (n − 1) preliminary design matrix X*; thus X^(1) = (ξ_1, ..., ξ_f). Denote the vector of outputs from the Stage 1 code runs by y(X^(1)). A Bayesian GP model (see Higdon et al. (2004) and Higdon et al. (2008)),

$$Y(x) = Z(x) + \epsilon(x) \qquad (2.1)$$

is fitted to the data y(X^(1)). Z(·) is taken to be a stationary Gaussian process with zero mean, variance 1/λ_Z, and covariance function

$$\mathrm{Cov}(Z(x), Z(\tilde{x})) = \frac{1}{\lambda_Z}\, R(x, \tilde{x}) = \frac{1}{\lambda_Z} \prod_{j=1}^{f} \rho_j^{\,4 (x_j - \tilde{x}_j)^2}, \qquad (2.2)$$

where x = (x_1, ..., x_f) and x̃ = (x̃_1, ..., x̃_f) are two design points. The GP model (2.1) is a special case of (1.1) with f^⊤(x)β = 0. The term ϵ(x) in (2.1) is added to represent numerical or other small-scale noise and is modeled by a white noise process that is independent of Z(·) and has mean 0 and (small) prior variance 1/λ_ϵ. The output y(X^(1)) is centered to have sample mean 0 and unit variance to conform to the prior specification when this model is fitted. The Bayesian model can be fitted using the GPM/SA (Gaussian Process Models for Simulation Analysis) software of Gattiker (2005). The posterior distributions of the model parameters will be used to predict output, as in (1.7), for the group variables in the grouping phase in Section 2.3.2.

2.3.2 Stage 1 Grouping Phase

Initial grouping of the inputs into groups that have similar effects on the response is critical for efficient group screening. The individual inputs can be divided into groups using information from subject experts, or using exploratory data analysis of the Stage 1 data, or a combination of the two. Alternatively, an automatic grouping procedure can be used, as described below, where M is the user-selected maximum group size. The method uses the Fisher-transformed Pearson correlation coefficients r*_j = tanh^{−1}(r(ξ_j, y(X^(1)))), j = 1, ..., f (see Fisher (1921)), where the correlation coefficient r(ξ_j, y(X^(1))) measures the strength of the linear relationship between the jth input and the output.

Step 1 Set q = f.

Step 2 Compute the sample mean r̄* and the sample standard deviation s_{r*} of r*_1, ..., r*_q. Let the reference distribution for r*_1, ..., r*_q be N(r̄*, s²_{r*}).

Step 3 Divide the reference distribution into ν = ⌈q/M⌉ intervals, where the ith boundary is defined to be the (i/ν × 100)th percentile of N(r̄*, s²_{r*}), i.e., Φ^{−1}_{r̄*, s²_{r*}}(i/ν), for i = 1, ..., ν − 1.

Step 4 Group r*_1, ..., r*_q into ν groups based on the boundaries of the reference distribution and count the number of elements observed in each group, h_1, ..., h_ν.

Step 5 If h_1 > M and h_ν > M, then go to Step 6. Otherwise, go to Step 7.

Step 6 Subdivide each of groups 1 and ν, repeating Steps 1-4, first setting q = h_1 for the leftmost group, and then q = h_ν for the rightmost group. Update ν as the total number of groups, so that the corresponding group sizes are h_1, ..., h_ν. Go to Step 5 with the updated groups.

Step 7 Sequentially examine h_1, h_2, ..., h_ν. Let i be the smallest index for which h_i > M. If there exists no i for which h_i > M, then stop and set m = ν. Otherwise, sequentially examine h_ν, h_{ν−1}, ..., h_{ν−(ν−i)}. Let j be the smallest index for which h_{ν−j} > M. Let q = h_i + h_{i+1} + ... + h_{ν−j} and go to Step 8.

Step 8 Relabel r*_1, ..., r*_q corresponding to the inputs to be re-grouped and go to Step 2.
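A simplified sketch of the first pass of this grouping procedure (Steps 1-4 only, omitting the iterative refinement in Steps 5-8); the function name is illustrative:

```python
import numpy as np
from scipy.stats import norm

def initial_groups(X1, y, M):
    """Steps 1-4: bin the Fisher-transformed input-output correlations
    using quantile boundaries of the fitted normal reference distribution."""
    r = np.array([np.corrcoef(X1[:, j], y)[0, 1] for j in range(X1.shape[1])])
    z = np.arctanh(r)                       # r*_j = tanh^{-1}(r_j)
    nu = int(np.ceil(z.size / M))           # Step 3: number of intervals
    bounds = norm.ppf(np.arange(1, nu) / nu, loc=z.mean(),
                      scale=z.std(ddof=1))  # interval boundaries
    return np.digitize(z, bounds)           # group label (0, ..., nu-1) per input
```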

After the f individual inputs have been divided into the m groups, a design matrix G = (g_1, ..., g_m) is formed from a random selection of m columns from X* for the group variables. From G, a design matrix X^P = (ξ^P_1, ..., ξ^P_f) is constructed in terms of the f individual inputs, where all the inputs in group i are set to the levels defined by g_i, i = 1, ..., m. For example, if inputs 1, 5, and 6 are assigned to group 1, then ξ^P_1 = ξ^P_5 = ξ^P_6 = g_1. The design matrix X^P is used to predict the output based on the fitted GP model (Section 2.3.1). The resulting values, denoted by ŷ(X^P) (or, more simply, by ŷ(G)), are used in Section 2.3.3 to select the active groups. The training data, y(X^(1)), will be used again in Section 2.4.2 to select active individual inputs within the active groups.
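The expansion of G into X^P is a simple column-indexing operation; a sketch with a hypothetical group assignment (not from the dissertation):

```python
import numpy as np

def expand_groups(G, group_of):
    """Build X^P from the group design G: input j receives the column
    of its group, so all inputs in one group vary together."""
    return G[:, group_of]

# Hypothetical assignment for f = 6 inputs: inputs 1, 5, 6 in group 1
# and inputs 2, 3, 4 in group 2 (0-based group labels)
group_of = np.array([0, 1, 1, 1, 0, 0])
# X_P = expand_groups(G, group_of)   # columns 0, 4, 5 of X_P equal g_1
```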

2.3.3 Stage 1 Analysis Phase

Sensitivity Indices

In this section and Section 2.4.2, the total effect sensitivity index (TESI) is used to detect active effects. This subsection reviews the definition of sensitivity indices when the input region is [0, 1]^f; see Chapter 5 for more details. Sobol´ (1993) showed that the function y(x) can be uniquely decomposed as

$$y(x) = y_0 + \sum_{j=1}^{f} y_j(x_j) + \sum_{1 \le j < h \le f} y_{jh}(x_j, x_h) + \cdots + y_{1,2,\ldots,f}(x_1, \ldots, x_f) \qquad (2.3)$$

where

$$y_0 = \int_{[0,1]^f} y(x_1, \ldots, x_f)\, dx_1 \cdots dx_f,$$

$$y_j(x_j) = \int_{[0,1]^{f-1}} y(x_1, \ldots, x_f)\, dx_{-j} - y_0,$$

$$y_{jh}(x_j, x_h) = \int_{[0,1]^{f-2}} y(x_1, \ldots, x_f)\, dx_{-jh} - y_j(x_j) - y_h(x_h) - y_0,$$

and so on. Here dx_{−j} denotes integration over all inputs except x_j, and dx_{−jh} denotes integration over all inputs except x_j and x_h. The individual components of Sobol's decomposition are centered; that is, they satisfy

$$\int_0^1 y_{j_1,\ldots,j_s}(x_{j_1}, \ldots, x_{j_s})\, dx_{j_k} = 0, \quad \text{for any } 1 \le k \le s,$$

and orthogonal; that is, they satisfy

$$\int_{[0,1]^f} y_{j_1,\ldots,j_s}(x_{j_1}, \ldots, x_{j_s})\, y_{h_1,\ldots,h_t}(x_{h_1}, \ldots, x_{h_t})\, dx_1 \cdots dx_f = 0,$$

for any (j_1, ..., j_s) ≠ (h_1, ..., h_t). Variance-based indices are obtained by squaring both sides of the Sobol' decomposition (2.3) and integrating over [0, 1]^f (Sobol´ (1993)). This leads to the variance decomposition

$$V = \sum_{j=1}^{f} V_j + \sum_{1 \le j < h \le f} V_{jh} + \cdots + V_{1,2,\ldots,f}. \qquad (2.4)$$

Sensitivity indices are obtained by dividing each component in the variance decomposition (2.4) by the total variance V. The main effect sensitivity index of the jth input is defined to be S_j = V_j/V. The two-factor sensitivity index of the jth and hth inputs is defined to be S_{jh} = V_{jh}/V. Higher-order sensitivity indices are defined similarly. The TESI of the jth input (Homma and Saltelli (1996)) is the sum of all sensitivity indices involving the jth input, i.e.,

$$T_j = S_j + \sum_{h \ne j} S_{jh} + \cdots + S_{1,2,\ldots,f}. \qquad (2.5)$$

The sensitivity indices are computed using the Bayesian method of Oakley and O'Hagan (2004) as implemented in GPM/SA; the sensitivity index is estimated by the mean of the posterior distribution, which is obtained from the posterior draws of the parameters in the GP model (2.1) via Markov chain Monte Carlo (MCMC).
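For intuition only, these indices can also be approximated by plain Monte Carlo when the function is cheap to evaluate (Saltelli-/Jansen-type estimators). This is not the GP-based computation of Chapter 5, which GSinCE actually uses because code runs are expensive; the sketch below is a hedged illustration with an assumed toy function.

```python
import numpy as np

def sobol_indices(f, d, N, rng):
    """Monte Carlo main-effect (S_j) and total-effect (T_j) indices
    on [0, 1]^d; f maps an (N, d) array of inputs to N outputs."""
    A, B = rng.uniform(size=(N, d)), rng.uniform(size=(N, d))
    fA, fB = f(A), f(B)
    V = np.var(np.concatenate([fA, fB]), ddof=1)    # total variance
    S, T = np.empty(d), np.empty(d)
    for j in range(d):
        ABj = A.copy()
        ABj[:, j] = B[:, j]          # resample only input j
        fABj = f(ABj)
        S[j] = np.mean(fB * (fABj - fA)) / V        # main effect S_j
        T[j] = 0.5 * np.mean((fA - fABj) ** 2) / V  # total effect T_j
    return S, T

S, T = sobol_indices(lambda X: X[:, 0] + 2 * X[:, 1] * X[:, 2],
                     d=3, N=200_000, rng=np.random.default_rng(0))
```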

A Benchmark Null Distribution for Sensitivity Indices

Linkletter et al. (2006) used a reference distribution for variable selection obtained by augmenting the experimental inputs with an input known to be inert. This approach is attractive since it removes the need for subjective assessment about which inputs have "large" indicators of activity. However, the use of an input which has zero effect on the output can lead to the selection of inputs with very small effects as being active. In practice, it is desirable to eliminate low-impact inputs as well as totally inert ones. Linkletter et al. (2006) addressed this issue in their discussion and suggested that the response could be spiked with some very small effect (see also Wu et al. (2007)). The latter approach is used here, and the output is modified by adding an input having a small, user-determined effect. Any group of inputs whose TESI is smaller than that of the added low-impact input is treated as non-active.

Linkletter et al. (2006) selected random columns from the input space for the inert input. However, since randomly generated columns can be correlated with the columns of the design matrix for the inputs, here multiple uncorrelated columns are drawn from the preliminary design matrix X* for the low-impact inputs. There are (n − 1) − m columns of X* that are both uncorrelated with the m columns of G and with each other. Thus, by augmenting G with these n − m − 1 columns in turn, n − m − 1 augmented group design matrices are constructed. Denote the wth such design matrix by

$$G(w) = (g_1, \ldots, g_m, g_{m+1}^{(w)})$$

where g_i is the column of G for the ith group variable, 1 ≤ i ≤ m, and g_{m+1}^{(w)} is the column selected from X* for the wth low-impact input, 1 ≤ w ≤ n − m − 1. Each G(w) satisfies the properties (P.1) and (P.2), and approximately (P.3), of Section 2.2.

The magnitude of the low-impact input is set to a fraction τ, 0 < τ < 1, of the range of the output; choices of τ that optimize the performance of the GSinCE procedure are discussed in Section 3.1. Let

$$\beta = \left( \max_{1 \le t \le n} \hat{y}_t - \min_{1 \le t \le n} \hat{y}_t \right) \times \tau \qquad (2.6)$$

define the magnitude of the effect of the low-impact input, where ŷ_t is the tth value in the predicted output ŷ(G) defined in Section 2.3.2. For the design G(w), a perturbation of the predicted data ŷ(G) is computed using g_{m+1}^{(w)}:

$$\hat{y}^{(w)} = \hat{y}(G) + \beta\, g_{m+1}^{(w)}. \qquad (2.7)$$
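Equations (2.6)-(2.7) amount to one rescaling and one vector addition; a minimal sketch (the function name is mine):

```python
import numpy as np

def perturb(y_hat, g_low, tau):
    """Spike the predicted output with a low-impact input, (2.6)-(2.7):
    beta is the fraction tau of the predicted-output range, and the
    w-th low-impact column g_low is added with that coefficient."""
    beta = (y_hat.max() - y_hat.min()) * tau
    return y_hat + beta * g_low
```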

The perturbed output ŷ^(w) is used to compute the TESIs corresponding to the m group inputs and the wth low-impact input; these quantities are denoted by T_1^(w), ..., T_m^(w), T_{m+1}^(w), for 1 ≤ w ≤ n − m − 1.

The GSinCE procedure selects the ith group variable as being active if its TESI is larger than that of the low-impact input. This decision is made based on the pairs of TESIs (T_{m+1}^(w), T_i^(w)), 1 ≤ w ≤ n − m − 1, using the sign test (see Conover (1999)). The sign test is applied to the differences T_i^(w) − T_{m+1}^(w). The n − m − 1 pairs (T_{m+1}^(w), T_i^(w)) are mutually independent because each pair of TESIs is estimated from independent MCMC draws of the parameters given ŷ^(w). The pairs are assumed to be internally consistent, in that if P[T_{m+1}^(w) < T_i^(w)] ≤ P[T_{m+1}^(w) > T_i^(w)] for one w then it is true for all w. Then the hypotheses for the ith TESI, 1 ≤ i ≤ m, are formulated as

$$H_{0i}: P[T_{m+1}^{(w)} < T_i^{(w)}] \le P[T_{m+1}^{(w)} > T_i^{(w)}], \text{ for all } w$$
$$H_{1i}: P[T_{m+1}^{(w)} < T_i^{(w)}] > P[T_{m+1}^{(w)} > T_i^{(w)}], \text{ for all } w \qquad (2.8)$$

and tested at significance level α/m to account for the multiple groups, for a chosen "familywise" significance level α.

To perform the sign test, let the test statistic E be the number of "plus" pairs, that is, the number of pairs for which the TESI of the ith group variable is larger than that of the low-impact input. First, disregard all tied pairs among the n − m − 1 pairs, and let ζ be the number of pairs that are not ties (ζ ≤ n − m − 1). Large values of E indicate that a plus is more probable than a minus, as stated by H_{1i}. So the p-value is computed as

$$P[W > E] = 1 - \sum_{w=0}^{E} \binom{\zeta}{w} \left(\frac{1}{2}\right)^{w} \left(1 - \frac{1}{2}\right)^{\zeta - w} \qquad (2.9)$$

where W is a random variable with the binomial distribution B(ζ, 1/2). Thus H_{0i} is rejected if the p-value is smaller than the significance level α/m. In principle, the test can be conducted whenever ζ ≥ 1, and as ζ increases, the power of the test increases. But a very large ζ can lead to computational inefficiency by repeating the sensitivity analysis more times than necessary.
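The p-value (2.9) is the upper tail of a Binomial(ζ, 1/2) distribution, so one line of scipy suffices; the counts in the usage example are hypothetical.

```python
from scipy.stats import binom

def sign_test_pvalue(E, zeta):
    """One-sided sign test p-value (2.9): P[W > E] for W ~ B(zeta, 1/2),
    where E counts the non-tied pairs with T_i^(w) > T_{m+1}^(w)."""
    return binom.sf(E, zeta, 0.5)   # sf(E) = 1 - P[W <= E]

# e.g. 10 "plus" pairs out of zeta = 12 non-tied pairs; declare the
# group active when this p-value falls below alpha / m
p = sign_test_pvalue(E=10, zeta=12)
```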

The number of pairs in the test is determined by the number of low-impact inputs. In the GSinCE procedure, the number of low-impact inputs is n − m − 1 at Stage 1, which can be small or large according to the choice of n and m. The number of low-impact columns should therefore be set by considering the computational effort in the sampling and analysis phases of the GSinCE procedure as well as the power of the test.

2.4 GSinCE Procedure Stage 2

2.4.1 Stage 2 Sampling Phase

Suppose that there are p inputs in total in the groups identified as active at Stage 1; each of these p inputs is potentially active. Using a sample size justification similar to that in Section 2.3.1, a Stage 2 design with n_2 runs is constructed. For example, n_1 = 5(f × 0.4) = 2f and n_2 = 5p are taken using conservative assumptions about the proportion of active inputs. The maximum number of uncorrelated low-impact benchmark columns that can be included in the design matrix X^c defined below is δ = min(n_1 − f − 1, n_2 − p − 1). Rearrange the columns of X* in order of potentially active (A), non-active (N), followed by δ of the low-impact benchmark (B) columns, to obtain the matrix (X_A^(1), X_N^(1), X_B^(1)). When n_2 − p − 1 < n_1 − f − 1, X_B^(1) is obtained from δ = n_2 − p − 1 randomly selected columns of the n_1 − f − 1 columns in X* used for the low-impact inputs at Stage 1.

Next, construct an n_2 × p Stage 2 design matrix X_A^(2) for the inputs from the active groups, an n_2 × (f − p) matrix X_N^(2) for the inputs from the non-active groups, and an n_2 × δ matrix X_B^(2) for the low-impact inputs. The values in the jth column of X_N^(2) are all set equal to the median value of the jth column of X_N^(1). The n_2 × (p + δ) matrix (X_A^(2), X_B^(2)) is constructed using Steps 1-4 of the design algorithm in Section 2.2, so that its columns satisfy (P.1) and (P.2); Step 5 is implemented so that the combined (n_1 + n_2) × (p + δ) matrix

$$X^c = \begin{pmatrix} X_A^{(1)} & X_B^{(1)} \\ X_A^{(2)} & X_B^{(2)} \end{pmatrix} = (X_A^c, X_B^c)$$

is (approximately) maximin. The columns of X^c will satisfy (P.2) but need not satisfy (P.1), i.e., the columns of X^c need not be uncorrelated. However, in the examples that have been investigated for this dissertation, their correlations are very small. The computer code is now run at each set of input values defined by the rows of X^(2) = (X_A^(2), X_N^(2)) to obtain the output y(X^(2)).

2.4.2 Stage 2 Analysis Phase

(1) The n1 output values y(X ) from Stage 1 are used together with the n2 output

(2) c (1) ⊤ (2) ⊤ ⊤ c c ⊤ values y(X ) from Stage 2; let y = (y(X ) , y(X ) ) = (y1, . . . , yn1+n2 ) denote the combined data. Based on Xc, construct δ augmented design matrices

c(1) c(2) c(δ) c th using, one-by-one, the (low-impact) columns ξp+1, ξp+1,..., ξp+1 from XB. The w such design is represented by c c c(w) X (w) = (XA, ξp+1 )

≤ ≤ c(w) for 1 w δ. The perturbed output for the Stage 2 analysis using ξp+1 is defined as c(w) c c c(w) y = y + β ξp+1 (2.10) ( ) where c c − c × β = max yt min yt τ (2.11) 1≤t≤(n1+n2) 1≤t≤(n1+n2) using the same value of τ that was used at Stage 1. The output yc(w) is used to

compute the TESIs $T_1^{(w)}, \ldots, T_p^{(w)}, T_{p+1}^{(w)}$ corresponding to the p individual inputs and the wth low-impact input. There are δ pairs of TESIs that can be used to test the activity level for any given individual input among the p potentially active inputs.

Following the procedure used in Section 2.3.3, the hypotheses to test that the jth input is active are formulated as
\[
\begin{aligned}
H_{0j} &: P[T_{p+1}^{(w)} < T_j^{(w)}] \le P[T_{p+1}^{(w)} > T_j^{(w)}], \text{ for all } w \\
H_{1j} &: P[T_{p+1}^{(w)} < T_j^{(w)}] > P[T_{p+1}^{(w)} > T_j^{(w)}], \text{ for all } w
\end{aligned} \tag{2.12}
\]
and tested at significance level α/p, for j = 1, . . . , p, for a selected familywise significance level α.
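As an illustration of the Stage 2 perturbation in (2.10)-(2.11), the short Python sketch below builds one perturbed output vector; the data and the low-impact column are randomly generated stand-ins, not output of the actual procedure:

```python
import numpy as np

def perturbed_output(y_c, xi_w, tau):
    """y^{c(w)} = y^c + beta^c * xi_w, with beta^c from (2.11):
    beta^c = (max_t y_t^c - min_t y_t^c) * tau."""
    beta_c = (y_c.max() - y_c.min()) * tau
    return y_c + beta_c * xi_w

# Stand-in combined data for n1 + n2 = 70 runs and one low-impact column:
rng = np.random.default_rng(0)
y_c = rng.normal(size=70)
xi_w = rng.uniform(size=70)
y_cw = perturbed_output(y_c, xi_w, tau=0.14)   # same tau as at Stage 1
```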

CHAPTER 3

PERFORMANCE OF GSINCE

This chapter discusses the factors that determine the performance of the GSinCE procedure proposed in Chapter 2. Through simulation studies, Section 3.1 investigates the optimal value of τ used to perturb the output via the low-impact input. In Section 3.2, more challenging screening situations are investigated via extended simulation studies, and modifications of the GSinCE procedure are suggested for such cases. Section 3.3 gives details of the desirable design properties and the construction of the two-stage designs that improve the performance of the GSinCE procedure.

3.1 Simulation Studies to Set τ

This section determines a setting of τ in (2.6) and (2.11) to control the operating characteristics of the GSinCE procedure using a stochastic test bed of second-order polynomials:
\[
y(z_1, \ldots, z_f) = \sum_{j=1}^{f} \gamma_j z_j + \sum_{j=1}^{f} \sum_{h=j}^{f} \gamma_{jh} z_j z_h, \tag{3.1}
\]
where $z_j \in [0, 1]$, j = 1, . . . , f. Let L, Q, and I denote the set of inputs involved in active linear, quadratic, and interaction effects, respectively. In the simulation studies, the values of the regression coefficients $\gamma_j$, $\gamma_{jj}$, and $\gamma_{jh}$ were assigned under

the principles of effect sparsity, hierarchy, and heredity described in Chipman (2006).

For j = 1, . . . , f, the linear coefficient $\gamma_j$ was drawn from the following distributions:
\[
\gamma_j \sim \begin{cases} N(\mu_A, \sigma_A^2), & \text{with prob } q_L; \\ N(\mu_N, \sigma_N^2), & \text{with prob } 1 - q_L, \end{cases}
\]

where qL = P [j ∈ L] is the probability that the linear effect of input j is active and

µA > 0, µA > µN , σA > σN . Let qQ|A and qQ|N be the conditional probabilities

that the quadratic effect of input j is active given that the linear effect is active or

non-active, i.e., $q_{Q|A} = P[j \in Q \mid j \in L]$ and $q_{Q|N} = P[j \in Q \mid j \notin L]$. The value of $\gamma_{jj}$ was drawn from the following distributions:
\[
\gamma_{jj} \sim \begin{cases} N(\mu_A, \sigma_A^2), & \text{with prob } q_Q/2; \\ N(-\mu_A, \sigma_A^2), & \text{with prob } q_Q/2; \\ N(\mu_N, \sigma_N^2), & \text{with prob } 1 - q_Q, \end{cases}
\qquad \text{where } q_Q = \begin{cases} q_{Q|A}, & \text{if } j \in L; \\ q_{Q|N}, & \text{if } j \notin L. \end{cases}
\]

Similarly, let $q_{\times|AA}$, $q_{\times|AN}$, $q_{\times|NA}$, $q_{\times|NN}$ be the conditional probabilities that the interaction between inputs j and h is active given that the linear effects of each of these inputs is active or non-active. The value of $\gamma_{jh}$ was drawn from the following distributions:
\[
\gamma_{jh} \sim \begin{cases} N(\mu_A, \sigma_A^2), & \text{with prob } q_\times/2; \\ N(-\mu_A, \sigma_A^2), & \text{with prob } q_\times/2; \\ N(\mu_N, \sigma_N^2), & \text{with prob } 1 - q_\times, \end{cases}
\qquad \text{where } q_\times = \begin{cases} q_{\times|AA}, & \text{if } j \in L,\ h \in L; \\ q_{\times|AN}, & \text{if } j \in L,\ h \notin L; \\ q_{\times|NA}, & \text{if } j \notin L,\ h \in L; \\ q_{\times|NN}, & \text{if } j \notin L,\ h \notin L. \end{cases}
\]
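A minimal sketch of this coefficient-generation scheme is given below, assuming σ is passed as a standard deviation and using q×|NA = q×|AN as in the proof of Lemma 3.1.1; the function name and argument order are illustrative, not those of the dissertation's software:

```python
import numpy as np

rng = np.random.default_rng(1)

def draw_coefficients(f, qL, qQA, qQN, qxAA, qxAN, qxNN, muA, sdA, muN, sdN):
    """Draw gamma_j, gamma_jj, gamma_jh per the mixtures of Section 3.1."""
    active = rng.random(f) < qL                      # is j in L?
    gamma = np.where(active, rng.normal(muA, sdA, f),
                             rng.normal(muN, sdN, f))
    gamma2 = {}                                      # gamma_jh for h >= j
    for j in range(f):
        for h in range(j, f):
            if h == j:                               # quadratic: q_Q
                q = qQA if active[j] else qQN
            elif active[j] and active[h]:            # interaction: q_x
                q = qxAA
            elif active[j] or active[h]:
                q = qxAN                             # q_x|NA = q_x|AN assumed
            else:
                q = qxNN
            u = rng.random()
            if u < q / 2:
                gamma2[(j, h)] = rng.normal(muA, sdA)
            elif u < q:
                gamma2[(j, h)] = rng.normal(-muA, sdA)
            else:
                gamma2[(j, h)] = rng.normal(muN, sdN)
    return gamma, gamma2

# P2 with C2 from Table 3.1 (sigma_A^2 = 5^2, sigma_N^2 = 2^2):
gamma, gamma2 = draw_coefficients(20, 0.10, 0.10, 0.010,
                                  0.10, 0.014, 0.008, 20, 5, 0, 2)
```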

Lemma 3.1.1 The expected proportion of inputs that are active is
\[
q_L + q_{Q|N}(1 - q_L) + (f - 1)\left( q_{\times|AN}\, q_L + q_{\times|NN}(1 - q_L) \right)(1 - q_{Q|N})(1 - q_L). \tag{3.2}
\]

Proof Let Z denote the number of active inputs; then in terms of indicator functions,
\[
Z = \sum_{j=1}^{f} I(\text{input } j \text{ is active}) = \sum_{j=1}^{f} \left\{ I[j \in L] + I[j \notin L, j \in Q] + I[j \notin L, j \notin Q, j \in I] \right\}. \tag{3.3}
\]
The expected value of the random variable Z is
\[
E(Z) = \sum_{j=1}^{f} \left\{ P[j \in L] + P[j \notin L, j \in Q] + P[j \notin L, j \notin Q, j \in I] \right\}. \tag{3.4}
\]

By definition, $P[j \in I \mid j \in Q] = P[j \in I]$ and $P[j \in I \mid j \notin L, h \in L] = q_{\times|NA} = q_{\times|AN}$.

Then the second component of (3.4) is written as
\[
P[j \notin L, j \in Q] = P[j \in Q \mid j \notin L]\, P[j \notin L] = q_{Q|N}(1 - q_L).
\]

The third component of (3.4) is
\[
\begin{aligned}
P[j \notin L, j \notin Q, j \in I] &= P[j \in I \mid j \notin L, j \notin Q]\, P[j \notin Q \mid j \notin L]\, P[j \notin L] \\
&= P[j \in I \mid j \notin L]\, (1 - P[j \in Q \mid j \notin L])\, (1 - P[j \in L]) \\
&= (f - 1)\left\{ q_{\times|AN}\, q_L + q_{\times|NN}(1 - q_L) \right\} (1 - q_{Q|N})(1 - q_L),
\end{aligned}
\]
where the last line follows from
\[
\begin{aligned}
P[j \in I \mid j \notin L] &= \sum_{h \ne j}^{f} \left\{ P[j \in I \mid j \notin L, h \in L]\, P[h \in L] + P[j \in I \mid j \notin L, h \notin L]\, P[h \notin L] \right\} \\
&= \sum_{h \ne j}^{f} \left\{ q_{\times|AN}\, q_L + q_{\times|NN}(1 - q_L) \right\} = (f - 1)\left\{ q_{\times|AN}\, q_L + q_{\times|NN}(1 - q_L) \right\}.
\end{aligned}
\]
Thus the expected number of active inputs in (3.4) simplifies to
\[
E(Z) = f\left[ q_L + q_{Q|N}(1 - q_L) + (f - 1)\left( q_{\times|AN}\, q_L + q_{\times|NN}(1 - q_L) \right)(1 - q_{Q|N})(1 - q_L) \right],
\]
hence the expected proportion of active inputs in Lemma 3.1.1 is obtained.
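The closed form (3.2) is easy to check numerically; the small sketch below (function name illustrative) reproduces the roughly 25% figure used in the next section for the P1 probabilities with f = 20:

```python
def expected_active_proportion(f, qL, qQN, qxAN, qxNN):
    """Expected proportion of active inputs, formula (3.2)."""
    return (qL + qQN * (1 - qL)
            + (f - 1) * (qxAN * qL + qxNN * (1 - qL))
            * (1 - qQN) * (1 - qL))

# P1 of Table 3.1 with f = 20:
print(expected_active_proportion(20, 0.15, 0.005, 0.010, 0.005))  # ~0.2467
```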

3.1.1 Simulations for f = 20

Table 3.1 lists four sets, P1 − P4, of marginal probabilities that were selected to generate second-order polynomials with the expected percentage of active inputs (Lemma 3.1.1) approximately equal to 25% (about 5 inputs) when f = 20. P1 produces the largest number of active linear effects and the fewest active quadratic and interaction effects. P3 produces more active quadratic and interaction effects whose linear effects are also active. P2 and P4 produce more active quadratic and interaction effects whose corresponding linear effect is not active, but fewer active linear effects; P4 is more extreme in this regard. Thus, P2, P3 and P4 can create more complicated test functions than P1, while having a similar expected percentage of active inputs.

Table 3.1 also lists three sets of coefficient distributions labelled C1 − C3. Normal distributions were selected for the active effects, and the distribution for the non-active effects was fixed as $N(0, 2^2)$. The coefficients drawn for the active effects under C1 are the easiest to differentiate from those of the non-active effects, and those under C2 are the hardest.

Choice of Marginal Probabilities:

        qL     qQ|A   qQ|N    q×|AA   q×|AN   q×|NN
  P1    0.15   0.10   0.005   0.10    0.010   0.005
  P2    0.10   0.10   0.010   0.10    0.014   0.008
  P3    0.15   0.90   0.005   0.90    0.010   0.005
  P4    0.05   0.10   0.050   0.10    0.010   0.009

Choice of Coefficient Distributions:

        µA    σA²     µN   σN²
  C1    40    10²     0    2²
  C2    20    5²      0    2²
  C3    30    7.5²    0    2²

Table 3.1: Marginal probabilities and coefficient distributions for the simulation study

Combinations of Pi and Cj were used to generate a wide variety of second-order polynomial functions, ranging from those having only a large linear effect to those

having interaction or quadratic effects of different signs. Of these twelve possible combinations of marginal probabilities and coefficient distributions, the six listed in Table 3.2 proved to be the most challenging for screening. (In the combinations involving P1 or C1, the performance was good overall for every value of τ, with little variability across τ values.) The

recommendation of τ below is based on these six test bed generators and 200 randomly

generated test functions under each combination.

Combination                 1    2    3    4    5    6
Marginal Probabilities      P2   P2   P3   P3   P4   P4
Coefficient Distributions   C2   C3   C2   C3   C2   C3

Table 3.2: Six combinations used to recommend τ

In a screening problem, there are two kinds of errors: one can falsely select non-active inputs, or falsely fail to select active inputs. Based on the claimed inputs, the false

discovery rate (FDR, see Benjamini and Hochberg (1995)) and false non-discovery rate

(FNDR) are defined as:

• FDR = (number of non-active inputs that are claimed to be active) / (number of inputs claimed to be active)

• FNDR = (number of active inputs that are claimed to be non-active) / (number of inputs claimed to be non-active)

Two standard positive performance measures are specificity and sensitivity (see, for

example, Altman and Bland (1994)) which are defined to be:

• specificity = (number of true non-active inputs that are claimed to be non-active) / (number of true non-active inputs)

• sensitivity = (number of true active inputs that are claimed to be active) / (number of true active inputs)

There is usually a trade-off between these measures. For example, one may be willing to risk selecting some non-active inputs (low specificity) in order to increase the chance of identifying nearly all active inputs (high sensitivity). When the denominator is 0, FDR or FNDR is defined to be 0, and specificity or sensitivity is defined to be 1. The objective is to select a value of τ for the low-impact inputs, as used in (2.6) and (2.11), which leads to low FDR and FNDR and high specificity and sensitivity.
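The four measures, with the zero-denominator conventions just stated, can be computed as in the following sketch (the function name, input labels, and example sets are hypothetical):

```python
def screening_measures(true_active, claimed_active, f):
    """FDR, FNDR, specificity, sensitivity for inputs labelled 0..f-1.
    Zero-denominator conventions: FDR/FNDR -> 0, spec/sens -> 1."""
    T, C = set(true_active), set(claimed_active)
    N  = set(range(f)) - T        # truly non-active
    NC = set(range(f)) - C        # claimed non-active
    fdr  = len(C - T) / len(C)  if C  else 0.0
    fndr = len(T - C) / len(NC) if NC else 0.0
    spec = len(N & NC) / len(N) if N  else 1.0
    sens = len(T & C) / len(T)  if T  else 1.0
    return fdr, fndr, spec, sens

# Hypothetical result: 4 true active inputs, one false discovery.
print(screening_measures({0, 1, 2, 3}, {0, 1, 2, 3, 7}, 20))
# (0.2, 0.0, 0.9375, 1.0)
```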


Figure 3.1: Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), sensitivity (line with diamond) over 200 test functions versus τ × 100% for functions with about 25% of active inputs among f = 20 inputs

Comb    τ     FDR            FNDR           specificity    sensitivity    Average  Average
              Med    IQR     Med    IQR     Med    IQR     Med    IQR     Groups   Runs
1 (5)   0.11  0.500  0.387   0      0.091   0.750  0.232   1      0.200   3.9      114.2
        0.12  0.429  0.444   0      0.091   0.800  0.234   1      0.250   3.8      111.8
        0.13  0.375  0.458   0      0.106   0.833  0.241   1      0.250   3.7      109.9
        0.14  0.333  0.433   0.056  0.129   0.857  0.188   0.882  0.286   3.5      107.5
        0.15  0.333  0.429   0.063  0.133   0.882  0.163   0.833  0.333   3.5      106.0
        0.16  0.250  0.500   0.067  0.143   0.920  0.188   0.750  0.333   3.4      104.0
2 (2)   0.11  0.286  0.500   0      0.071   0.875  0.214   1      0.125   3.7      107.2
        0.12  0.200  0.400   0      0.077   0.929  0.177   1      0.143   3.6      105.4
        0.13  0.143  0.400   0      0.077   0.938  0.133   1      0.167   3.5      103.7
        0.14  0      0.333   0      0.080   1      0.118   1      0.222   3.4      101.3
        0.15  0      0.286   0.028  0.118   1      0.087   0.944  0.250   3.3      99.0
        0.16  0      0.250   0.061  0.125   1      0.069   0.857  0.286   3.2      97.2
3 (2)   0.11  0.286  0.583   0      0.091   0.875  0.278   1      0.250   3.9      109.2
        0.12  0.211  0.556   0.057  0.106   0.929  0.231   0.873  0.286   3.8      106.3
        0.13  0.200  0.500   0.063  0.125   0.933  0.182   0.800  0.333   3.7      104.2
        0.14  0.155  0.500   0.063  0.125   0.944  0.156   0.800  0.333   3.5      101.5
        0.15  0      0.429   0.067  0.125   1      0.121   0.750  0.354   3.4      99.0
        0.16  0      0.400   0.071  0.138   1      0.111   0.750  0.500   3.3      97.7
4 (1)   0.11  0      0.333   0      0.077   1      0.133   1      0.250   3.8      102.3
        0.12  0      0.250   0      0.101   1      0.067   1      0.250   3.6      98.9
        0.13  0      0.200   0.057  0.121   1      0.063   0.857  0.317   3.5      96.5
        0.14  0      0.143   0.059  0.125   1      0.053   0.800  0.333   3.4      94.9
        0.15  0      0       0.063  0.125   1      0       0.800  0.400   3.3      92.4
        0.16  0      0       0.063  0.143   1      0       0.800  0.414   3.1      90.3
5 (6)   0.11  0.458  0.381   0      0.091   0.750  0.242   1      0.200   4.0      114.6
        0.12  0.429  0.386   0      0.096   0.806  0.213   1      0.250   3.8      112.3
        0.13  0.400  0.433   0      0.114   0.828  0.195   1      0.286   3.7      110.8
        0.14  0.333  0.431   0.063  0.125   0.867  0.172   0.866  0.333   3.6      109.0
        0.15  0.268  0.528   0.067  0.143   0.909  0.200   0.800  0.333   3.6      107.4
        0.16  0.250  0.500   0.067  0.143   0.923  0.167   0.750  0.388   3.5      106.0
6 (5)   0.11  0.250  0.450   0      0.067   0.889  0.160   1      0.125   3.9      109.8
        0.12  0.200  0.400   0      0.074   0.929  0.158   1      0.143   3.8      107.1
        0.13  0.167  0.400   0      0.077   0.933  0.125   1      0.200   3.6      104.5
        0.14  0.143  0.333   0      0.118   0.944  0.111   1      0.250   3.5      101.4
        0.15  0      0.333   0      0.129   1      0.080   1      0.286   3.3      99.1
        0.16  0      0.268   0.059  0.129   1      0.071   0.857  0.333   3.2      96.5

Table 3.3: Median and IQR values of the performance measures, and average number of groups and average total runs over 200 test functions with about 25% of active inputs among f = 20 inputs for each τ in each combination; value in parentheses is the number of test functions generated with no active inputs

The GSinCE procedure was applied to each of the 200 randomly generated second-order polynomials in each of the six combinations of Table 3.2 using

τ ∈ {0.11, 0.12, 0.13, 0.14, 0.15, 0.16}. The Stage 1 grouping was performed by the automatic grouping procedure of Section 2.3.2 with the maximum group size

M = 5. The hypothesis tests (2.8) and (2.12) were performed with a familywise significance level α = 0.2 at each stage. In Figure 3.1, the median values of FDR,

FNDR, specificity, and sensitivity are plotted for combinations 1 to 6 of Table 3.2.

The details of the simulation results related to Figure 3.1 are shown in Table 3.3.

For these combinations, the value of τ = 0.14 seems to give a reasonable compromise in achieving low median FDR and FNDR as well as high median specificity and sensitivity. Thus, τ = 0.14 is recommended when about 25% of the total inputs are active.

3.1.2 Simulations for f = 30

Here, consider a new situation in which the expected percentage of active inputs is approximately equal to 25% (7.5 inputs) when f = 30. The expected proportion of active inputs is a function of f and the probabilities qL, qQ|N, q×|AN, q×|NN, as shown in Lemma 3.1.1. The probabilities in Table 3.1 were chosen for f = 20 to obtain an expected 25% active inputs. For f = 30, the values of qL and q×|NN were modified as shown in Table 3.4, while keeping the other probabilities the same, to achieve an expected 25% active inputs.

The new sets of probabilities in Table 3.4 were combined with the coefficient distributions in Table 3.1 to make the six combinations in Table 3.2. In each combination,

50 test functions were generated. The GSinCE procedure was applied to each test function using τ ∈ {0.13, 0.14, 0.15}. As Figure 3.2 and Table 3.5 show, the median FNDR and the median specificity show a similar pattern for each τ value across all

Choice   qL     q×|NN
P1       0.10   0.005
P2       0.08   0.006
P3       0.10   0.005
P4       0.01   0.007

Table 3.4: Modified values of qL and q×|NN . Other probabilities are as in Table 3.1 to achieve about 25% of f = 30 inputs active

combinations. Overall, the median FDR decreases more when the τ value is increased from 0.13 to 0.14 than when it is increased from 0.14 to 0.15. The decrease in the median sensitivity due to the change of the τ value from 0.13 to 0.14 is not large. Thus the recommended value of τ is τ = 0.14 when the expected percentage of active inputs is approximately 25% and f = 30 (the same recommendation as for f = 20).

comb    τ     Median                                                    Average
              FDR    FNDR   specificity  sensitivity  True  Claimed    True  Claimed
1 (0)   0.13  0.354  0.091  0.878        0.732        7.0   8.0        7.4   7.8
        0.14  0.286  0.091  0.907        0.714        7.0   7.5        7.4   7.2
        0.15  0.250  0.093  0.935        0.700        7.0   7.0        7.4   6.4
2 (0)   0.13  0.236  0.049  0.923        0.838        8.0   7.0        6.7   7.0
        0.14  0.183  0.048  0.960        0.775        8.0   6.0        6.7   6.3
        0.15  0.134  0.074  0.964        0.750        8.0   6.0        6.7   5.6
3 (0)   0.13  0.250  0.085  0.950        0.667        7.0   6.0        6.6   6.3
        0.14  0.200  0.097  0.954        0.586        7.0   5.0        6.6   5.5
        0.15  0.167  0.111  0.958        0.571        7.0   5.0        6.6   5.0
4 (0)   0.13  0      0.082  1            0.750        7.0   6.0        7.2   5.7
        0.14  0      0.085  1            0.667        7.0   5.0        7.2   5.3
        0.15  0      0.085  1            0.667        7.0   5.0        7.2   5.0
5 (2)   0.13  0.472  0.095  0.833        0.707        7.0   9.0        6.7   8.6
        0.14  0.429  0.098  0.863        0.700        7.0   8.0        6.7   7.7
        0.15  0.333  0.098  0.885        0.600        7.0   7.0        6.7   6.8
6 (0)   0.13  0.143  0.044  0.956        0.866        6.0   7.0        6.9   7.2
        0.14  0      0.044  1            0.833        6.0   7.0        6.9   6.5
        0.15  0      0.045  1            0.800        6.0   6.0        6.9   6.1

Table 3.5: Median values of the performance measures, and median/average values of true/claimed active inputs over 50 test functions with about 25% of active inputs among f = 30 inputs; value in parentheses is the number of test functions generated with no active inputs


Figure 3.2: Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), sensitivity (line with diamond) over 50 test functions versus τ × 100% for functions with about 25% of active inputs among f = 30 inputs

3.1.3 Simulations for f = 10

Here, consider a different simulation study for f = 10 inputs. The expected percentages of active inputs, 25% (2.5 inputs), 35% (3.5 inputs), and 20% (2 inputs), were investigated. These percentages can be achieved by changing the values of the probabilities in Table 3.1 that were chosen for the f = 20 inputs to achieve approximately 25% active inputs. In particular, the values of qL in Table 3.1 were modified to make the expected percentages of active inputs 25% and 35%, as shown in Table 3.6, while keeping the other probabilities the same. The original values of the probabilities in Table 3.1 create approximately 20% active inputs when f = 10, so the probabilities in Table 3.1 were directly used for the 20% case. The probability choices were combined with the coefficient distributions in Table 3.1. Then the six combinations in Table 3.2 were investigated using 100 test functions in each combination.

Choice   qL (25%)   qL (35%)
P1       0.20       0.30
P2       0.18       0.28
P3       0.20       0.30
P4       0.15       0.25

Table 3.6: Modified values of qL to achieve about 25%, and 35% of f = 10 inputs active, while keeping other probabilities as in Table 3.1

25% of active inputs

The GSinCE procedure was applied using τ ∈ {0.11, 0.12, . . . , 0.17, 0.18}. As

Figure 3.3 and Table 3.7 show, the median FNDR is 0 and the median sensitivity

is 1 for all τ values in all combinations considered. The performance of FDR and

specificity depends on the choice of τ and the combination. In combination 4, all values 0.11 ≤ τ ≤ 0.18 show the same performance. In combinations 2, 3 and 6, 0.14 ≤ τ ≤ 0.18,

0.16 ≤ τ ≤ 0.18, and 0.15 ≤ τ ≤ 0.18 show the best performance, respectively. In

combinations 1 and 5, the median FDR and the median specificity become stable

between τ = 0.14 and τ = 0.16. Overall, the previous recommendation of τ = 0.14

remains reasonable when the percentage of active inputs is about 25% for f = 10 inputs.


Figure 3.3: Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), sensitivity (line with diamond) over 100 test functions versus τ × 100% for functions with about 25% of active inputs among f = 10 inputs

comb    τ     Median                                              Average
              FDR    FNDR  specificity  sensitivity  True  Claimed  True  Claimed
1 (9)   0.11  0.500  0     0.691        1            2.0   5.0      2.3   5.0
        0.12  0.500  0     0.750        1            2.0   4.0      2.3   4.7
        0.13  0.400  0     0.778        1            2.0   4.0      2.3   4.4
        0.14  0.367  0     0.817        1            2.0   4.0      2.3   4.2
        0.15  0.333  0     0.857        1            2.0   4.0      2.3   4.0
        0.16  0.333  0     0.857        1            2.0   4.0      2.3   3.7
        0.17  0.292  0     0.875        1            2.0   3.0      2.3   3.5
        0.18  0.250  0     0.889        1            2.0   3.0      2.3   3.3
2 (8)   0.11  0.333  0     0.875        1            2.0   4.0      2.4   4.4
        0.12  0.333  0     0.875        1            2.0   4.0      2.4   4.1
        0.13  0.250  0     0.882        1            2.0   4.0      2.4   4.0
        0.14  0      0     1            1            2.0   3.0      2.4   3.7
        0.15  0      0     1            1            2.0   3.0      2.4   3.5
        0.16  0      0     1            1            2.0   3.0      2.4   3.3
        0.17  0      0     1            1            2.0   3.0      2.4   3.2
        0.18  0      0     1            1            2.0   3.0      2.4   3.0
3 (7)   0.11  0.400  0     0.764        1            2.0   4.0      2.4   4.6
        0.12  0.333  0     0.857        1            2.0   4.0      2.4   4.4
        0.13  0.292  0     0.857        1            2.0   4.0      2.4   4.1
        0.14  0.292  0     0.875        1            2.0   3.0      2.4   3.9
        0.15  0.250  0     0.875        1            2.0   3.0      2.4   3.6
        0.16  0      0     1            1            2.0   3.0      2.4   3.4
        0.17  0      0     1            1            2.0   3.0      2.4   3.3
        0.18  0      0     1            1            2.0   3.0      2.4   3.0
4 (5)   0.11  0      0     1            1            3.0   3.5      2.9   3.9
        0.12  0      0     1            1            3.0   3.0      2.9   3.8
        0.13  0      0     1            1            3.0   3.0      2.9   3.6
        0.14  0      0     1            1            3.0   3.0      2.9   3.4
        0.15  0      0     1            1            3.0   3.0      2.9   3.3
        0.16  0      0     1            1            3.0   3.0      2.9   3.2
        0.17  0      0     1            1            3.0   3.0      2.9   3.1
        0.18  0      0     1            1            3.0   3.0      2.9   2.9
5 (9)   0.11  0.500  0     0.691        1            2.0   5.0      2.6   5.1
        0.12  0.429  0     0.750        1            2.0   5.0      2.6   4.9
        0.13  0.333  0     0.817        1            2.0   4.5      2.6   4.6
        0.14  0.333  0     0.833        1            2.0   4.0      2.6   4.4
        0.15  0.333  0     0.857        1            2.0   4.0      2.6   4.2
        0.16  0.292  0     0.857        1            2.0   4.0      2.6   4.1
        0.17  0.250  0     0.875        1            2.0   4.0      2.6   3.8
        0.18  0.250  0     0.875        1            2.0   4.0      2.6   3.7
6 (10)  0.11  0.250  0     0.857        1            2.0   4.0      2.4   4.4
        0.12  0.250  0     0.857        1            2.0   4.0      2.4   4.2
        0.13  0.250  0     0.875        1            2.0   4.0      2.4   3.9
        0.14  0.200  0     0.889        1            2.0   4.0      2.4   3.8
        0.15  0      0     1            1            2.0   3.0      2.4   3.7
        0.16  0      0     1            1            2.0   3.0      2.4   3.5
        0.17  0      0     1            1            2.0   3.0      2.4   3.3
        0.18  0      0     1            1            2.0   3.0      2.4   3.1

Table 3.7: Median values of the performance measures, and median/average values of true/claimed active inputs over 100 test functions with about 25% of active inputs among f = 10 inputs; value in parentheses is the number of test functions generated with no active inputs

35% of active inputs

The GSinCE procedure was applied to each of 100 test functions using 4 different

values of τ ∈ {0.11, 0.12, 0.13, 0.14}. As Figure 3.4 and Table 3.8 show, the median FNDR is 0 and the median sensitivity is 1 for all τ values in all combinations considered. Moreover, in all combinations, the median FDR/FNDR is 0 and the median sensitivity/specificity is 1 under τ = 0.14. Thus again, the recommendation of τ = 0.14 remains reasonable when about 35% active inputs exist among f = 10 inputs.


Figure 3.4: Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), sensitivity (line with diamond) over 100 test functions versus τ × 100% for functions with about 35% of active inputs among f = 10 inputs

comb    τ     Median                                              Average
              FDR    FNDR  specificity  sensitivity  True  Claimed  True  Claimed
1 (1)   0.11  0.250  0     0.833        1            4.0   5.0      3.5   4.9
        0.12  0.200  0     0.833        1            4.0   5.0      3.5   4.6
        0.13  0.167  0     0.857        1            4.0   4.0      3.5   4.3
        0.14  0      0     1            1            4.0   4.0      3.5   4.1
2 (2)   0.11  0      0     1            1            3.0   4.0      3.5   4.1
        0.12  0      0     1            1            3.0   4.0      3.5   4.0
        0.13  0      0     1            1            3.0   4.0      3.5   3.9
        0.14  0      0     1            1            3.0   4.0      3.5   3.8
3 (3)   0.11  0.083  0     0.938        1            3.0   4.0      3.3   4.3
        0.12  0      0     1            1            3.0   4.0      3.3   4.1
        0.13  0      0     1            1            3.0   4.0      3.3   4.0
        0.14  0      0     1            1            3.0   4.0      3.3   3.7
4 (5)   0.11  0      0     1            1            3.5   4.0      3.5   4.2
        0.12  0      0     1            1            3.5   4.0      3.5   4.0
        0.13  0      0     1            1            3.5   4.0      3.5   3.9
        0.14  0      0     1            1            3.5   4.0      3.5   3.7
5 (0)   0.11  0.250  0     0.857        1            3.0   5.0      3.6   5.0
        0.12  0.200  0     0.875        1            3.0   5.0      3.6   4.7
        0.13  0      0     1            1            3.0   4.0      3.6   4.4
        0.14  0      0     1            1            3.0   4.0      3.6   4.3
6 (5)   0.11  0      0     1            1            3.5   4.0      3.5   4.5
        0.12  0      0     1            1            3.5   4.0      3.5   4.3
        0.13  0      0     1            1            3.5   4.0      3.5   4.1
        0.14  0      0     1            1            3.5   4.0      3.5   4.0

Table 3.8: Median values of the performance measures, and median/average values of true/claimed active inputs over 100 test functions with about 35% of active inputs among f = 10 inputs; value in parentheses is the number of test functions generated with no active inputs

20% of active inputs

For each of 100 test functions in each combination, the GSinCE procedure was performed using the extended values of τ ∈ {0.13, 0.14, . . . , 0.21, 0.22}. As Figure 3.5 and Table 3.9 show, the median FNDR is 0 and the median sensitivity is 1 for all τ values in all combinations considered. On the other hand, the performance of FDR and specificity depends on the choice of τ and the combination. In combination 4, all values 0.13 ≤ τ ≤ 0.22 show the same performance. In combinations 2 and 3, 0.19 ≤ τ ≤ 0.22 and 0.18 ≤ τ ≤ 0.22 show the best performance, respectively. In combinations 1, 5, and 6, the median FDR and the median specificity become stable from τ = 0.20. Thus, when the proportion of active inputs is about 20% among f = 10 inputs, increasing the value of τ to at least 0.2 is recommended.


Figure 3.5: Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), sensitivity (line with diamond) over 100 test functions versus τ × 100% for functions with about 20% of active inputs among f = 10 inputs

comb     τ     Median                                              Average
               FDR    FNDR  specificity  sensitivity  True  Claimed  True  Claimed
1 (14)   0.13  0.600  0     0.667        1            2.0   5.0      1.8   4.8
         0.14  0.600  0     0.714        1            2.0   4.0      1.8   4.6
         0.15  0.500  0     0.750        1            2.0   4.0      1.8   4.3
         0.16  0.500  0     0.750        1            2.0   4.0      1.8   4.1
         0.17  0.500  0     0.806        1            2.0   4.0      1.8   3.8
         0.18  0.500  0     0.857        1            2.0   3.0      1.8   3.5
         0.19  0.500  0     0.875        1            2.0   3.0      1.8   3.3
         0.20  0.333  0     0.875        1            2.0   3.0      1.8   3.1
         0.21  0.333  0     0.889        1            2.0   3.0      1.8   2.9
         0.22  0.333  0     0.889        1            2.0   3.0      1.8   2.8
2 (17)   0.13  0.500  0     0.778        1            1.5   4.0      1.8   4.4
         0.14  0.500  0     0.845        1            1.5   4.0      1.8   4.2
         0.15  0.500  0     0.875        1            1.5   4.0      1.8   3.9
         0.16  0.333  0     0.889        1            1.5   3.0      1.8   3.6
         0.17  0.333  0     0.889        1            1.5   3.0      1.8   3.4
         0.18  0.333  0     0.889        1            1.5   3.0      1.8   3.2
         0.19  0      0     1            1            1.5   3.0      1.8   3.1
         0.20  0      0     1            1            1.5   3.0      1.8   2.9
         0.21  0      0     1            1            1.5   2.5      1.8   2.8
         0.22  0      0     1            1            1.5   2.0      1.8   2.7
3 (21)   0.13  0.429  0     0.750        1            2.0   4.0      2.0   4.5
         0.14  0.333  0     0.845        1            2.0   4.0      2.0   4.3
         0.15  0.333  0     0.857        1            2.0   4.0      2.0   4.1
         0.16  0.292  0     0.857        1            2.0   4.0      2.0   3.9
         0.17  0.250  0     0.875        1            2.0   3.0      2.0   3.7
         0.18  0      0     1            1            2.0   3.0      2.0   3.5
         0.19  0      0     1            1            2.0   3.0      2.0   3.3
         0.20  0      0     1            1            2.0   3.0      2.0   3.2
         0.21  0      0     1            1            2.0   3.0      2.0   3.0
         0.22  0      0     1            1            2.0   3.0      2.0   2.9
4 (14)   0.13  0      0     1            1            2.0   3.0      2.1   3.8
         0.14  0      0     1            1            2.0   3.0      2.1   3.6
         0.15  0      0     1            1            2.0   3.0      2.1   3.4
         0.16  0      0     1            1            2.0   3.0      2.1   3.2
         0.17  0      0     1            1            2.0   3.0      2.1   3.1
         0.18  0      0     1            1            2.0   3.0      2.1   3.0
         0.19  0      0     1            1            2.0   3.0      2.1   2.8
         0.20  0      0     1            1            2.0   3.0      2.1   2.7
         0.21  0      0     1            1            2.0   2.0      2.1   2.5
         0.22  0      0     1            1            2.0   2.0      2.1   2.5
5 (21)   0.13  0.667  0     0.667        1            2.0   5.0      1.8   4.8
         0.14  0.600  0     0.714        1            2.0   4.5      1.8   4.6
         0.15  0.550  0     0.764        1            2.0   4.0      1.8   4.3
         0.16  0.500  0     0.778        1            2.0   4.0      1.8   4.1
         0.17  0.500  0     0.833        1            2.0   4.0      1.8   3.9
         0.18  0.500  0     0.857        1            2.0   4.0      1.8   3.6
         0.19  0.500  0     0.866        1            2.0   3.0      1.8   3.4
         0.20  0.333  0     0.875        1            2.0   3.0      1.8   3.2
         0.21  0.333  0     0.882        1            2.0   3.0      1.8   3.0
         0.22  0.333  0     0.889        1            2.0   3.0      1.8   2.9
6 (30)   0.13  0.500  0     0.764        1            1.0   4.0      1.5   4.7
         0.14  0.500  0     0.778        1            1.0   4.0      1.5   4.4
         0.15  0.500  0     0.806        1            1.0   4.0      1.5   4.2
         0.16  0.500  0     0.833        1            1.0   4.0      1.5   4.0
         0.17  0.367  0     0.875        1            1.0   4.0      1.5   3.8
         0.18  0.333  0     0.875        1            1.0   3.0      1.5   3.6
         0.19  0.333  0     0.875        1            1.0   3.0      1.5   3.5
         0.20  0.292  0     0.889        1            1.0   3.0      1.5   3.3
         0.21  0.292  0     0.889        1            1.0   3.0      1.5   3.2
         0.22  0.100  0     0.944        1            1.0   3.0      1.5   3.0

Table 3.9: Median values of the performance measures, and median/average values of true/claimed active inputs over 100 test functions with about 20% of active inputs among f = 10 inputs; value in parentheses is the number of test functions generated with no active inputs

3.1.4 Summary of Simulation Studies

The optimal value of τ has been investigated via the various simulation studies for f = 20, f = 30, and f = 10 inputs in the previous sections. When f = 20 and the expected percentage of active inputs is approximately 25%, τ = 0.14 is recommended in Section 3.1.1. This value of τ also seems reasonable for f = 30 and f = 10 inputs when the expected percentage of active inputs is approximately 25%. Different expected percentages of active inputs, namely 35% and 20%, were investigated when f = 10. The value τ = 0.14 still works well for the 35% case. However, when the expected percentage of active inputs decreases to 20%, the value of τ is recommended to increase to around 0.2. In general, as the proportion of active inputs decreases, larger values of τ are needed to achieve a good compromise between high specificity and low FDR. If τ is set too high, however, the sensitivity begins to decrease. The situation of few active inputs will be investigated further in Section 3.2.1.

3.2 Application of GSinCE in Least Favorable Cases

In the simulation studies of Section 3.1, the test functions were generated under

a selection of marginal probabilities and coefficient distributions. Compared to other

choices of marginal probabilities, P4 produces fewer active linear main effects, but more active quadratic and interaction effects whose corresponding linear main effect

is not active. The coefficients drawn for the active effects are hardest to differentiate

from the non-active effects under C2. Thus combination 5, using P4 and C2, represents the “least favorable” situation for success of the GSinCE procedure among the combinations in Table 3.2. The test functions generated under combination 5 are used in Sections 3.2.1 and 3.2.2. In particular, the screening situation having a small percentage of active inputs is investigated in Section 3.2.1, and the screening situation where active quadratic main effects or interaction effects exist without corresponding

active linear main effects is investigated in Section 3.2.2. Modifications of the GSinCE

procedure for these situations are suggested in each section. Section 3.2.3 shows the

success of GSinCE in detecting large effects even in the situation where quadratic main effects or interaction effects are active without corresponding active linear main

effects.

3.2.1 Small Percentage of Active Inputs

Here, a way to improve the specificity is considered in the situation when a very

small percentage of inputs is active, say, less than 15%. To simulate such a situation, 30 test functions having only 1, 2, or 3 active inputs among 20 inputs were generated

under combination 5 of P4 and C2. As suggested in Section 3.1.3, a value of τ larger than 0.14 is needed in order to improve the specificity by avoiding selection of non-active inputs. To investigate a reasonable value of τ in this situation, the 5 values 0.14, 0.16, 0.18, 0.20, and 0.24 were applied at Stage 2, while keeping τ = 0.14 at Stage 1 of the GSinCE procedure. The τ value at Stage 1 was retained to maintain a low risk of missing groups containing active inputs at Stage 1 due to a large τ value.

τ at Stage 1   τ at Stage 2   FDR     FNDR    specificity   sensitivity
0.14           0.14           0.667   0       0.722         1
0.14           0.16           0.586   0       0.801         1
0.14           0.18           0.500   0       0.886         1
0.14           0.20           0.333   0       0.941         1
0.14           0.24           0       0.056   1             0.667

Table 3.10: Median values of FDR, FNDR, specificity, sensitivity over 30 functions having small percentages of active inputs

As the value of τ at Stage 2 increases up to 0.20, the median FDR and specificity are improved while still keeping a zero median FNDR and median sensitivity of 1.0

(see Table 3.10). When τ = 0.24 is used at Stage 2, the median FDR and specificity have the optimal values of 0 and 1 respectively, but the median FNDR increases from

0 to 0.056 and the median sensitivity sharply decreases from 1 to 0.667. So τ = 0.20 at Stage 2 seems to be a better choice than using the same value of τ = 0.14 in both stages, based on these 30 test functions.

3.2.2 Non-linear Functions

Here is one example of a non-linear test function for which it is difficult for GSinCE to correctly screen input variables. It has 20 inputs, each ranging over [0,1], and it has 5 active inputs z1, z7, z12, z18, z19 and 4 active effects $z_1 z_{18}$, $z_1 z_{19}$, $z_{19}^2$, $z_7 z_{12}$:
\[
y = 19.71\, z_1 z_{18} + 23.72\, z_1 z_{19} - 13.34\, z_{19}^2 + 28.99\, z_7 z_{12} + \text{many terms with small coefficients}. \tag{3.5}
\]

The complete set of all coefficients corresponding to the linear, quadratic, and linear by linear interaction effects is given in Table 3.11. As expected from the construction of test functions under C2, the magnitudes of some non-active effects, for example,

$-7.15\, z_1 z_5$ or $-6.71\, z_9 z_{17}$, are not remarkably different from the magnitudes of the active effects, for example, $-13.34\, z_{19}^2$. Moreover, many small coefficients can cause an amalgamation of effects at Stage 1 of the GSinCE procedure. Thus a clear differentiation of the active inputs from the non-active inputs is difficult in this test function.

In the test function (3.5), the 5 active inputs are involved only in interaction or quadratic effects, without the corresponding active linear main effects. In such a case, the automatic grouping procedure of Section 2.3.2 which is based on linear

Table 3.11: All coefficients of test function (3.5)

input-output relationships has difficulty grouping together the important individual inputs with non-linear relationships. Table 3.12 shows the 5 groups constructed by the automatic grouping procedure with M = 5. Three of the important inputs, z1, z19, z12, are grouped together in g4. But z18 (in g1) is separated from z1, and z7 (in g5) is separated from z12, so their interaction effects are missed. The GSinCE procedure was performed using τ = 0.14 at both stages for the low-impact inputs and α = 0.2 for the hypothesis tests. The results are shown in Table 3.12. Four groups, not including g4, were selected at Stage 1, so the inputs z1, z19, z12 had no chance to proceed to Stage 2. At Stage 2, the input z7, which has the interaction effect with the input z12, was not selected because input z12 had been screened out at Stage 1. Finally, only the one input z18 was correctly selected, but 6 inactive inputs were additionally chosen.

rj* range        Individual Inputs        Group   Stage 1 Selection   Stage 2 Selection
-0.35 to -0.30   …, …, z18                g1      √                   z4, z5, z18
-0.20 to -0.17   …, z17, z16              g2      √                   z17
-0.10 to -0.04   z6, z10, z13, z8         g3      √
0.01 to 0.12     z1, z19, z12, …, z20     g4
0.14 to 0.49     z14, z7, z9, z15, z3     g5      √                   z3, z14, z15

Table 3.12: Results of automatic grouping and applying GSinCE for test function (3.5)

To improve the performance of the GSinCE procedure in such a least favorable situation, the following modifications of the procedure were investigated.

M-1: Increase the number of Stage 1 runs (here from n1 = 2f = 40 to n1 = 4f = 80) and decrease the number of additional runs at Stage 2 to keep fixed the total

number of runs

M-2: Use partial correlation coefficients instead of Pearson correlation coefficients to create the initial groups

M-3: Decrease the maximum group size M from 5 to 4

There are 7 combinations of M-1 to M-3 to compare with the original procedure. The results of the GSinCE procedure performed using τ = 0.14 and α = 0.2 in each combination are summarized in Table 3.13. The increase in the number of

Stage 1 runs improves the sensitivity from 0.2 to 1 for any choice of grouping measure and maximum group size, even though the number of additional runs at Stage 2 is decreased.

Stage 1 Runs    Grouping Measure   M   Selected Inputs                                    p    Total Runs   spec   sens
40 (Original)   Pearson            5   z3, z4, z5, z14, z15, z17, z18                     15   115          0.60   0.2
                                   4   z2, z3, z4, z5, z15, z18                           14   110          0.67   0.2
                Partial            5   z3, z4, z5, z14, z15, z17, z18                     15   115          0.60   0.2
                                   4   z3, z5, z15, z18                                   15   115          0.80   0.2
80              Pearson            5   z1, z3, z7, z12, z15, z17, z18, z19                20   115          0.80   1
                                   4   z1, z3, z7, z12, z15, z16, z17, z18, z19           18   110          0.73   1
                Partial            5   z1, z3, z7, z12, z15, z16, z17, z18, z19           18   115          0.73   1
                                   4   z1, z3, z7, z12, z15, z16, z17, z18, z19, z20      20   115          0.67   1

One-stage Method                       z1, z3, z5, z7, z12, z14, z15, z16, z17, z18, z19  -    115          0.53   1

Table 3.13: Results of original, modified procedures and one-stage method

One method of interpreting the partial correlation between X and Y given a set of k controlling variables Z = {Z1, Z2, . . . , Zk}, written $\rho_{XY \cdot Z}$, is as the correlation between the residuals resulting from the linear regression of X on Z and of Y on Z, respectively. When the sample partial correlation coefficients were used in the automatic grouping procedure with 40 Stage 1 runs and M = 5, there was no change in the result from the original procedure. Decreasing the maximum group size improved the procedure specificity from 0.60 to 0.67 or 0.80 for the two grouping methods in M-2, when 40 runs were used at Stage 1. In contrast, when 80 runs were used at Stage 1, the use of the partial correlation coefficients and the decrease in the maximum group size decreased the procedure specificity from 0.80 to 0.73 or 0.73 to 0.67. Thus the changes in the grouping measure and the maximum group size show different improvements or degradations according to the number of runs at Stage 1. These results indicate the importance of the number of Stage 1 runs.
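The residual-based interpretation of ρXY·Z described above can be sketched in a few lines of Python (the function name and the simulated data are hypothetical):

```python
import numpy as np

def partial_corr(x, y, Z):
    """Sample partial correlation of x and y given the columns of Z:
    the correlation between the residuals of the regressions of x on Z
    and of y on Z (with an intercept)."""
    Z1 = np.column_stack([np.ones(len(x)), Z])
    rx = x - Z1 @ np.linalg.lstsq(Z1, x, rcond=None)[0]
    ry = y - Z1 @ np.linalg.lstsq(Z1, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(2)
Z = rng.normal(size=(200, 3))
x = Z[:, 0] + rng.normal(size=200)
y = Z[:, 0] + rng.normal(size=200)
# Marginally correlated through Z[:, 0], nearly uncorrelated given Z:
print(np.corrcoef(x, y)[0, 1], partial_corr(x, y, Z))
```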

A one-stage screening method is also compared with the two-stage methods in

Table 3.13. The one-stage method selected all active inputs but also selected 6 inactive inputs. Some of these inactive inputs were also selected by the two-stage methods.

Such a screening result, leading to a low specificity, can happen when the function has many small to moderate coefficients corresponding to the non-active linear main effects, quadratic main effects, and linear by linear interaction effects, as, for example, in

Table 3.11.

To investigate the effect of increasing the Stage 1 runs further, random functions of the form
\[
y = \gamma_{aa} z_a^2 + \gamma_{ab} z_a z_b + \gamma_{bc} z_b z_c + \gamma_{de} z_d z_e + \text{small coefficients}
\]

were generated having similar features to the test function (3.5). Each of these functions has 5 active inputs za, zb, zc, zd, ze arising from the 4 active non-linear effects $\gamma_{aa}, \gamma_{ab}, \gamma_{bc}, \gamma_{de}$. The 5 active inputs were randomly selected, and all coefficients corresponding to the active or non-active linear, quadratic, and linear by linear interaction effects were drawn from the coefficient distribution C2, as was done for the test function (3.5). The original GSinCE procedure and the modified procedure with the increased number of Stage 1 runs were applied to each test function. In the original procedure,

the number of additional runs at Stage 2 was determined by n2 = 5p where p is the number of inputs in the groups identified as active at Stage 1. In the modified

procedure, the number of runs n2 at Stage 2 was determined by subtracting 80 runs at Stage 1 of the modified procedure from the total number of runs 40+5p of the original

n1 = 40 procedure. If p ≤ 8 inputs proceed to Stage 2 in the original procedure, then the total number of runs of the original procedure is ≤ 80, and hence Stage 2 is not implemented in the modified procedure. Such cases were removed for the comparisons of the two procedures using different numbers of runs in each stage. The comparison results based on 30 test functions for both designs are summarized in Table 3.14.

Increasing the number of Stage 1 runs turns out to improve the FNDR and sensitivity by better returning active inputs at Stage 1. This improvement seems reasonable because the increase in the number of Stage 1 runs can produce a better

Procedure   Stage 1 Runs (n1)   FDR     FNDR    specificity   sensitivity
Original    2f = 40             0.444   0.077   0.733         0.800
Modified    4f = 80             0.444   0       0.733         1

Table 3.14: Comparisons of median values of FDR, FNDR, specificity, sensitivity over 30 non-linear functions for the original and modified procedures

prediction of the output at groups of inputs and lead to a more precise detection of

groups which have active effects on the output.

3.2.3 Detecting Large Effects

In Sections 3.2.1 and 3.2.2, some least favorable cases have been considered where

the GSinCE procedure can work poorly and possible modifications of the procedure

have been proposed. In this section, the success of the GSinCE procedure in a favorable situation will be shown. To simulate such a screening situation, the original test

function (3.5) was modified as follows, keeping z1, z7, z12, z18 and z19 as active inputs.

C-1: Set all small coefficients to 0, while keeping the 4 original active coefficients:
\[
y = 19.71\, z_1 z_{18} + 23.72\, z_1 z_{19} - 13.34\, z_{19}^2 + 28.99\, z_7 z_{12}
\]

C-2: Double the 4 active coefficients, while keeping all original small coefficients:
\[
y = 39.42\, z_1 z_{18} + 47.44\, z_1 z_{19} - 26.68\, z_{19}^2 + 57.98\, z_7 z_{12} + \text{many small coefficients}
\]

C-3: Triple the 4 active coefficients, while keeping all original small coefficients:
\[
y = 59.13\, z_1 z_{18} + 71.16\, z_1 z_{19} - 40.02\, z_{19}^2 + 86.97\, z_7 z_{12} + \text{many small coefficients}
\]

The original GSinCE procedure was applied to each of the 3 test functions, using

n1 = 2f = 40 runs at Stage 1, τ = 0.14 to create the low-impact inputs, automatic Stage 1 grouping with a maximum of M = 5 inputs per group, and hypothesis tests with α = 0.2 at each stage. The screening results for the 3 test functions are

Case   rj* range         Individual Inputs         Group   Stage 1 Selection   Stage 2 Selection    Total Runs
C-1    -0.31 to -0.13    z18, z19, z5, z13, z4     g1      √                   z18, z19             80
       -0.07 to -0.06    z14, z10                  g2
       -0.02 to 0.01     z11, z20, z6, z9, z15     g3
       0.06 to 0.08      z8, z16, z2, z17, z3      g4
       0.25 to 0.72      z1, z12, z7               g5      √                   z1, z7, z12
C-2    -0.36 to -0.29    z18, z5, z4               g1      √                   z18                  135
       -0.09             z13, z2                   g2      √
       -0.08 to -0.07    z10, z19, z17, z16, z6    g3      √                   z19
       0.01              z8                        g4
       0.03 to 0.14      z11, z14, z20, z9, z1     g5      √                   z1
       0.24 to 0.45      z15, z12, z3, z7          g6      √                   z3, z7, z12, z15
C-3    -0.36 to -0.25    z18, z5, z4               g1      √                   z18                  100
       -0.12 to -0.05    z19, z13, z10, z6         g2      √                   z19
       -0.04 to 0.01     z2, z16, z17, z11, z14    g3
       0.03 to 0.07      z8, z20, z9               g4
       0.17 to 0.57      z15, z1, z3, z12, z7      g5      √                   z1, z7, z12

Table 3.15: Result of automatic grouping and applying GSinCE in favorable situations

summarized in Table 3.15 and can be compared with the screening results for the function (3.5) shown in Table 3.12 of Section 3.2.2.

For the function C-1, the GSinCE procedure easily identified the 5 active inputs with 80 total runs, which is smaller than the 115 runs needed for screening the function

(3.5). In the situation of C-2, the 5 true active inputs were all selected, but z3 and

z15 were additionally chosen, and 135 total runs were required. However, in C-3 where active effects are easier to differentiate from non-active effects, only the 5 true

active inputs were correctly chosen using 100 total runs. This example demonstrates

that the GSinCE procedure successfully detects the active inputs when the magnitude

of the active effects is sufficiently large to be separated from the non-active effects.

3.3 Properties of Two-stage Designs

In this section, details of the two-stage designs are discussed.

3.3.1 Augmented Design

In the GSinCE procedure, a set of design matrix columns for low-impact inputs is generated at the same time as the experimental input columns. For example, the preliminary design matrix X∗ is a maximin orthogonal design for all experimental input columns and the maximum possible number of low-impact input columns. Each of the low-impact columns is used to form separate augmented designs, and these augmented designs are used one-by-one to compute a set of TESIs. The augmented design is a subset of the full design, so it is not necessarily a maximin design although the full design is a maximin design. It would be reasonable to impose a space-filling property on the augmented designs directly. However, implementing this has the following problems.

The augmented designs should have different columns only for each low-impact input while retaining the same columns for the experimental inputs. Because of this restriction, it is hard to get multiple good space-filling augmented designs. In this case, one possibility is to generate a space-filling design for the experimental inputs first and then choose the columns to add so that the added columns maximize the minimum inter-point distance between the added columns and the experimental inputs. Augmented designs generated in this way can have a greater minimum inter- point distance than the subsets of the maximin orthogonal design X∗. The use of a better space-filling augmented design can produce more precise estimates of TESIs.

However, this method takes more time than the current GSinCE procedure, since additional steps are needed to construct the added columns orthogonal to the preceding common columns and to scale them to [0,1], and then the maximin distance criterion must be applied for each case.

To compare the different augmented designs, consider a simple example in which $y^*$ is a perturbed version of y via the “low-impact input” z11,
\[
y^* = y + 0.1 z_{11} = z_1 + z_2 + 0.1 z_{11} \tag{3.6}
\]
where the original output y is a function of only two inputs among 10 inputs, $z_i \in [0, 1]$, i = 1, 2, . . . , 10. Assuming that $Z_1, Z_2, \ldots, Z_{11}$ are independent U[0, 1] random variables, the TESIs using the true function of $y^*$ are computed as
\[
T_{z_1} = \frac{Var(E(y^* \mid Z_1))}{Var(y^*)} = \frac{1^2}{1^2 + 1^2 + 0.1^2} = 0.4975 = T_{z_2}, \qquad
T_{z_{11}} = \frac{Var(E(y^* \mid Z_{11}))}{Var(y^*)} = \frac{0.1^2}{1^2 + 1^2 + 0.1^2} = 0.0050. \tag{3.7}
\]
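These values are easy to verify: for the additive function (3.6) with independent U[0,1] inputs, every Var(Z_j) = 1/12 cancels in the ratio, as the short check below illustrates (a sketch, not part of the GSinCE software):

```python
import numpy as np

# Analytic check of (3.7): index_j = c_j^2 / sum_k c_k^2 after the common
# factor Var(Z_j) = 1/12 cancels.
c = {"z1": 1.0, "z2": 1.0, "z11": 0.1}
total = sum(v**2 for v in c.values())
for name, v in c.items():
    print(name, v**2 / total)     # z1, z2 -> 0.4975...; z11 -> 0.0049...

# Quick Monte Carlo confirmation of Var(y*):
rng = np.random.default_rng(3)
Z = rng.uniform(size=(100_000, 3))
y_star = Z[:, 0] + Z[:, 1] + 0.1 * Z[:, 2]
print(y_star.var(), (1**2 + 1**2 + 0.1**2) / 12)   # both ~0.1675
```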

Now consider estimation of these TESIs using the data obtained from two different designs. Here the number of rows in both designs is equal to 20, and the number of low-impact input columns is set to 9, so that 9 augmented designs are formed to produce 9 sets of TESIs. Design method 1 obtains the augmented designs from the subset of a 20 × 19 maximin orthogonal design as in the GSinCE procedure, while Design method 2 obtains the augmented designs by finding a 20 × 10 maximin orthogonal design and then applying the maximin distance criterion for each augmented design separately.

Comparison                       Design Method 1   Design Method 2   True TESIs
Design: Min Distance             0.0126            0.0243            -
Design: Time (seconds)           125.7             198.1             -
TESI (mean estimates): z1        0.4919            0.4987            0.4975
                       z2        0.4990            0.4874            0.4975
                       z11       0.0090            0.0135            0.0050

Table 3.16: Minimum inter-point distance and computation time of two design meth- ods and the estimated TESIs based on each of these designs

In Table 3.16, the generation of these designs and the estimates of TESIs are compared. Here the estimates of TESIs are the averages of the TESIs obtained using the 9 augmented designs. Although Design method 2 produces a design with a larger minimum inter-point distance, it does not produce better estimates of the TESIs in this example. Design method 1 is quicker than Design method 2. This advantage is important in screening in a high-dimensional input space. Thus the simple way of generating the low-impact input columns all at one time is chosen for the GSinCE procedure. It is computationally efficient and avoids the possibility of getting duplicate low-impact columns.

3.3.2 Combined Design at Stage 2

In the GSinCE procedure, a new set of data is obtained at Stage 2 and then it is combined with Stage 1 data for Stage 2 analysis. The reuse of Stage 1 data is useful in reducing the required number of computationally expensive code runs by taking only a small number of additional runs at Stage 2. The GSinCE procedure tries to ensure that the desirable design properties are preserved in the combined Stage 1 and Stage 2 designs. Here it is described how the design properties (P.1)-(P.3) are affected for the combined design.

Uncorrelated Columns (P.1)

When the columns of the design matrix are highly correlated, the TESIs of the corresponding inputs cannot be estimated independently. For example, when y = z1 + z2 (z1, z2 ∈ [0, 1]), so that y is a sum of two inputs with the same effect, the TESIs of the two inputs are both 0.5, assuming that Z1 and Z2 are independent U[0, 1] random variables. However, if the output is computed at input values in a design matrix whose two input columns are correlated, the estimated TESIs of the two inputs will not be 0.5. This is the reason why a design matrix with uncorrelated columns (P.1) is desired.

One possible way to form a combined design with uncorrelated columns is to combine the Stage 1 and Stage 2 design matrices without scaling the columns of each matrix to [0,1] as in Step 4 of Section 2.2. If this is done, the columns in each stage have zero mean due to the centering in Step 2 and zero inner product due to the

Gram-Schmidt orthogonalization. Then the combined columns still have zero mean and zero inner product, that is,

\[
\xi_j^{(1)\top} \xi_h^{(1)} + \xi_j^{(2)\top} \xi_h^{(2)} = 0, \qquad j \ne h,
\]
where $(\xi_j^{(1)\top}, \xi_j^{(2)\top})^\top$ and $(\xi_h^{(1)\top}, \xi_h^{(2)\top})^\top$ are the jth and hth columns of the combined design without scaling, respectively. So the correlation between any two columns in this combined design matrix is zero.
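This zero inner product property is easy to demonstrate numerically. In the sketch below, QR factorization plays the role of the Gram-Schmidt step, and the block sizes are illustrative stand-ins for n1 and n2:

```python
import numpy as np

def centered_orthogonal(n, k, rng):
    """Columns with zero means and zero pairwise inner products.
    After centering, the columns are orthogonal to the ones vector,
    and the QR factor Q (a Gram-Schmidt equivalent) keeps that
    property while orthogonalizing the columns."""
    M = rng.uniform(size=(n, k))
    M -= M.mean(axis=0)
    Q, _ = np.linalg.qr(M)
    return Q

rng = np.random.default_rng(4)
X1 = centered_orthogonal(40, 5, rng)    # Stage 1 block (unscaled)
X2 = centered_orthogonal(30, 5, rng)    # Stage 2 block (unscaled)
Xc = np.vstack([X1, X2])                # combined design
# Off-diagonal inner products (hence correlations) are ~0:
print(np.round(Xc.T @ Xc, 10))
```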

However, in the GSinCE procedure in Chapter 2, the Stage 1 and Stage 2 design matrices are combined after scaling each column of these matrices to [0,1]. So pairs of columns in the combined design matrix are not guaranteed to be uncorrelated. The correlations in the combined design matrix need to be checked before it is used for the Stage 2 analysis. If the correlations are not negligible, then a criterion for minimizing correlation can be added to the existing distance criterion in the design search step so that the resulting combined design is nearly uncorrelated and approximately space-filling (see for example Section 7.2.2).

Scaled Columns (P.2)

In Section 2.2, the application of the Gram-Schmidt algorithm produces columns which have different minimum and maximum values in each column. If a design matrix with different minimum and maximum values is used to generate the output, then the variables whose scaled input values in the design have larger ranges will have a larger impact on the output, which is artificially induced by the design. To avoid this problem, the design matrices in each stage are made to have the common range [0,1] via Step 4 of Section 2.2. This ensures that the columns of the combined design matrix $X^c$ will have the same range [0,1]. However, this scaling produces non-zero means in each stage, so the correlations are not zero. Thus there is a trade-off between having uncorrelated columns and having columns with the same scale for the combined design matrix $X^c$. So, a scaling step was added to the GSinCE procedure, sacrificing uncorrelatedness.

CHAPTER 4

APPLICATION OF GSINCE

This chapter demonstrates how the GSinCE procedure proposed in Chapter 2 is

implemented. In Section 4.1, examples in the computer experiment literature are used

for the demonstration of the GSinCE procedure and for the comparisons with other screening procedures. Section 4.2 shows the application of the GSinCE procedure in

a real computer experiment.

4.1 Examples from the Literature

The proposed GSinCE procedure is illustrated for four examples drawn from the

computer experiment literature. In each example, the output is described by a known

non-linear model with added inert inputs so that each output is regarded as a function

of a total of f = 20 inputs. The same preliminary design matrix X∗ and Stage 1

(1) design X with n1 = 2f = 40 runs is used for all four examples. The groups in Stage 1 are determined from the automatic grouping procedure in Section 2.3.2 with a maximum of M = 5 inputs per group. To create the low-impact inputs, τ = 0.14 is used, following the recommendation in Section 3.1.1. The hypothesis tests in (2.8) and (2.12) are conducted with α = 0.2 at each stage.

For each example, the GSinCE procedure is compared with (i) the one-stage

method that uses TESIs and the test (2.12) with p = f, τ = 0.14, and the same

number of runs as the GSinCE procedure, (ii) the one-stage screening method of Linkletter et al. (2006), denoted by LBHHY, based on 100 randomly selected inert columns and τ = 0, and (iii) the LBHHY procedure with 100 randomly selected low-impact columns and τ = 0.14. The initial design for the LBHHY procedures is selected using the maximin LHD criterion. As in Linkletter et al. (2006), 100 randomly selected columns are added to form a benchmark distribution. The LBHHY procedure judges whether an input is active by the magnitudes of the posterior draws of the parameters of the correlation function $R(x, \tilde{x}) = \prod_{j=1}^{f} \rho_j^{4(x_j - \tilde{x}_j)^2}$ in (2.2); as in Linkletter et al. (2006), the 10th percentile of the benchmark distribution is selected as the cut-off for the selection decision.
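This Gaussian correlation form can be written as a one-line function; the sketch below is illustrative (the input points and ρ values are made up), with ρ_j near 1 corresponding to an input with little influence on the correlation:

```python
import numpy as np

def corr_gaussian(x, x_tilde, rho):
    """R(x, x~) = prod_j rho_j ** (4 * (x_j - x~_j)**2), rho_j in (0, 1]."""
    d = np.asarray(x) - np.asarray(x_tilde)
    return float(np.prod(np.asarray(rho) ** (4.0 * d**2)))

# Two points in [0, 1]^3; the third input is nearly inert (rho ~ 1):
print(corr_gaussian([0.2, 0.5, 0.9], [0.4, 0.1, 0.3], [0.5, 0.8, 0.99]))
```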

4.1.1 Borehole Model

Worley (1987) used the function

\[
y(z_1, \ldots, z_8) = \frac{2\pi z_3 (z_4 - z_6)}{\ln(z_2/z_1)\left[\, 1 + \dfrac{2 z_7 z_3}{\ln(z_2/z_1)\, z_1^2\, z_8} + \dfrac{z_3}{z_5} \right]}
\]
to describe the rate of flow of water (in m³/yr) through a borehole that is drilled from the ground surface through two aquifers, where z1 ∈ [0.05, 0.15] is the radius of the borehole (m); z2 ∈ [100, 50000] is the radius of influence (m); z3 ∈ [63070, 115600] is the transmissivity of the upper aquifer (m²/yr); z4 ∈ [990, 1110] is the potentiometric head of the upper aquifer (m); z5 ∈ [63.1, 116] is the transmissivity of the lower aquifer (m²/yr); z6 ∈ [700, 820] is the potentiometric head of the lower aquifer (m); z7 ∈ [1120, 1680] is the length of the borehole (m); z8 ∈ [9855, 12045] is the hydraulic conductivity of the borehole (m/yr). Joseph, Hung, and Sudjianto (2008) identified z1 as the only important variable among the 8 inputs, using a 27-run experimental design and a Bayesian variable selection technique.
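For reference, the borehole function is simple to evaluate in code; the following sketch (a plain Python stand-in, not the dissertation's MATLAB implementation) computes one run at the centre of the input region:

```python
import numpy as np

def borehole(z):
    """Borehole flow rate (m^3/yr) for z = (z1, ..., z8) as defined above."""
    z1, z2, z3, z4, z5, z6, z7, z8 = z
    lnr = np.log(z2 / z1)
    return (2 * np.pi * z3 * (z4 - z6)
            / (lnr * (1 + 2 * z7 * z3 / (lnr * z1**2 * z8) + z3 / z5)))

lo = [0.05, 100, 63070, 990, 63.1, 700, 1120, 9855]
hi = [0.15, 50000, 115600, 1110, 116, 820, 1680, 12045]
mid = [(a + b) / 2 for a, b in zip(lo, hi)]
print(borehole(mid))   # one evaluation of the "computer code"
```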

To illustrate the GSinCE procedure, the twelve inert inputs z9, . . . , z20 are added, which serve only to add “noise” to the variable screening process. Then, in the analysis, y(·) is considered as a function of f = 20 inputs. The GSinCE procedure was applied using n1 = 2f = 40 runs at Stage 1. Using the Stage 1 data, the automatic grouping procedure with M = 5 places the inputs into 7 groups as shown in Table 4.1; for example, g1 consists of inputs z6 and z7. The GSinCE analysis phase selects groups g1, g6, g7 at the conclusion of Stage 1, and the p = 6 inputs in these groups (z1, z4, z6, z7, z8, z19) proceed to Stage 2, where a new design matrix is obtained as in Section 2.4.1. Following the computation of n2 = 5p = 30 additional code runs to obtain $y(X^{(2)})$, inputs z1, z4, z6, z7 are selected as active via the analysis in Stage 2.

rj* range        Individual Inputs       Group   Stage 1 Selection   Stage 2 Selection
-0.25 to -0.21   z7, z6                  g1      √                   z6, z7
-0.07 to -0.03   z17, z3, z10, z14       g2
-0.02 to -0.00   z13, z9, z12, z11       g3
0.02             z16, z18, z2, z15       g4
0.03 to 0.04     z20, z5                 g5
0.07 to 0.20     z19, z8, z4             g6      √                   z4
1.49             z1                      g7      √                   z1

Table 4.1: Grouping and active effect selection by GSinCE for the borehole model

Table 4.2 compares the results of the GSinCE procedure with those of the one-stage procedure with 70 runs and the two LBHHY procedures described above. All three procedures that use τ = 0.14 to construct the benchmark select the same set (z1, z4, z6, z7) of inputs as active. However, the amount of computation to reach this conclusion differs (the computation times measure the execution of the MATLAB code used to implement the four procedures on a 64-bit Linux machine with 8 cores, 32 GB of RAM, and 2.66 GHz). When used with an inert benchmark τ = 0 (Section 2.3.3), LBHHY selects as active the inert input z19 as well as input z8; this shows the importance of using a non-inert benchmark (τ > 0) to screen out inputs that are essentially noise. The GSinCE procedure is considerably faster than each of the one-stage procedures.

Procedure   τ      Selected Inputs             Computation Time (1,000 seconds)
GSinCE      0.14   z1, z4, z6, z7              3.0
One-stage   0.14   z1, z4, z6, z7              7.1
LBHHY       0.14   z1, z4, z6, z7              4.6
LBHHY       0      z1, z4, z6, z7, z8, z19     4.5

Table 4.2: Computation times and active effect selection of the four procedures for the borehole model using 70 runs

4.1.2 A Model for the Weight of an Aircraft Wing

Forrester et al. (2008) used the function

\[
y(z_1, \ldots, z_{10}) = 0.036\, z_1^{0.758}\, z_2^{0.0035} \left( \frac{z_3}{\cos^2 z_4} \right)^{0.6} z_5^{0.006}\, z_6^{0.04} \left( \frac{100\, z_7}{\cos z_4} \right)^{-0.3} (z_8 z_9)^{0.49} + z_1 z_{10}
\]

as the estimate of the weight of a light aircraft wing, where z1 ∈ [150, 200] is the wing area (ft²); z2 ∈ [220, 300] is the weight of fuel in the wing (lb); z3 ∈ [6, 10] is the aspect ratio; z4 ∈ [−10, 10] is the quarter-chord sweep (deg); z5 ∈ [16, 45] is the dynamic pressure at cruise (lb/ft²); z6 ∈ [0.5, 1] is the taper ratio; z7 ∈ [0.08, 0.18] is the aerofoil thickness to chord ratio; z8 ∈ [2.5, 6] is the ultimate load factor; z9 ∈ [1700, 2500] is the flight design gross weight (lb); z10 ∈ [0.025, 0.08] is the paint weight (lb/ft²). Using the distribution of “elementary effects” (estimated partial derivatives), the authors concluded that inputs z9, z1, z8, z3, z7, and perhaps z2, have significant impact on wing weight.

For illustration of the GSinCE procedure using the automatic grouping procedure (Section 2.3.2) with M = 5, the function y(·) is taken to be a function of the original ten inputs plus ten additional inert inputs z11, . . . , z20. The GSinCE procedure was applied using n1 = 2f = 40 runs at Stage 1. Using the Stage 1 data, the individual inputs are divided into 7 groups as shown in Table 4.3. The GSinCE procedure selects groups g1 and g7 at Stage 1, and the p = 5 inputs in these groups (z1, z3, z7, z8, z9) proceed to

Stage 2. Following the computation of n2 = 5p = 25 additional code runs to obtain Stage 2 output, these five inputs are all selected as active via the analysis in Stage 2.

rj* range        Individual Inputs          Group   Stage 1 Selection   Stage 2 Selection
-0.39            z7                         g1      √                   z7
-0.03            z16, z17                   g2
-0.02 to -0.01   z4, z12, z13, z18, z2      g3
0 to 0.01        z19, z11, z14, z20, z15    g4
0.02 to 0.03     z5, z6                     g5
0.09             z10                        g6
0.31 to 0.71     z9, z1, z3, z8             g7      √                   z1, z3, z8, z9

Table 4.3: Grouping and active effect selection by GSinCE for the aircraft wing weight model

Table 4.4 compares the GSinCE procedure with the three one-stage procedures described above. All procedures using τ = 0.14 select the same set (z1, z3, z7, z8, z9) of inputs which is in accordance with the analysis done by Forrester et al. (2008).

In contrast, LBHHY with the inert benchmark (τ = 0) also selects z6, z10, and the inert input z14. No procedure selects the input z2, which Forrester et al. (2008) identified as possibly active. The GSinCE procedure performs about 1.5 to 2 times faster than the one-stage procedures.

Procedure τ Selected Inputs Computation Times (1,000 seconds) GSinCE 0.14 z1, z3, z7, z8, z9 2.8 One-stage 0.14 z1, z3, z7, z8, z9 5.8 LBHHY 0.14 z1, z3, z7, z8, z9 4.4 LBHHY 0 z1, z3, z6, z7, z8, z9, z10, z14 4.3

Table 4.4: Computation times and active effect selection of the four procedures for the aircraft wing weight model using 65 runs

4.1.3 OTL Circuit Model

Ben-Ari and Steinberg (2007) used the function ( ) 12z2 + 0.74 z6(z5 + 9) z1+z2 11.35z3 0.74z3z6(z5 + 9) y(z1, . . . , z6) = + + z6(z5 + 9) + z3 z6(z5 + 9) + z3 (z6(z5 + 9) + z3)z4 describing an output transformerless (OTL) push-pull circuit in order to show a good performance of method for non-parametric smoothing of high-dimensional data. The output is the midpoint voltage (Vm), where z1 ∈ [50, 150] is the resistance

1 (K-Ohms); z2 ∈ [25, 70] is the resistance 2 (K-Ohms); z3 ∈ [0.5, 3] is the resistance 3

(K-Ohms); z4 ∈ [1.2, 2.5] is the resistance 4 (K-Ohms); z5 ∈ [0.25, 1.2] is the resistance

5 (K-Ohms); z6 ∈ [50, 300] is the current gain (Amperes).

Twelve additional inputs z7, . . . , z20 are added so that y(·) is taken to be a function of f = 20 inputs. The GSinCE procedure was applied to y(·) with n1 = 2f = 40 runs at Stage 1. The results in Table 4.5 show that groups g1 and g6 are selected st Stage 1 and the p = 4 inputs (z1, z2, z3, z4) in these groups proceed to Stage 2. As a result

62 of adding n2 = 5p = 20 runs computed from the Stage 2 design, inputs z1, z2, z3 are selected as active in the analysis at Stage 2.

∗ rj range Individual Group Stage 1 Stage 2 Inputs Selection√ Selection -0.83 to -0.18 z1, z4 g1 z1 -0.02 to -0.01 z16, z14, z9 g2 0.00 to 0.01 z20, z6, z11, z10, z7 g3 0.02 to 0.03 z15, z12, z13 g4 0.04 to 0.07 z18, z17, z5, z8, z19 g5 √ 0.21 to 0.79 z3, z2 g6 z2, z3

Table 4.5: Grouping and active effect selection by GSinCE for the OTL circuit model

Table 4.6 compares the results of the GSinCE procedure with those of the three

one-stage procedures. The GSinCE and one-stage procedures whose analyses use

τ = 0.14 and TeSIs select the same set (z1, z2, z3) of inputs. LBHHY with τ = 0.14

selects one more input z4 and LBHHY with τ = 0 selects z4 as well as inert inputs z9, z19. The GSinCE procedure performs about 1.7 or 2 times faster than the other procedures.

Procedure τ Selected Inputs Computation Times (1,000 seconds) GSinCE 0.14 z1, z2, z3 2.3 One-stage 0.14 z1, z2, z3 4.5 LBHHY 0.14 z1, z2, z3, z4 3.9 LBHHY 0 z1, z2, z3, z4, z9, z19 3.9

Table 4.6: Computation times and active effect selection of the four procedures for the OTL circuit model using 60 runs

63 4.1.4 Piston Simulator Model

Kenett and Zacks (1998) developed a model simulating a piston moving within a

cylinder. The piston’s performance is measured by the time it takes to complete one

cycle, in seconds. The cycle time is determined by a chain of nonlinear functions, √ z y(z , . . . , z ) = 2π 1 1 7 2 z5z3 z6 z4 + z2 z V 2 (√ 7 ) z2 2 z5z3 z4z3 where V = A + 4z4 z6 − A and A = z5z2 + 19.62z1 − 2z4 z7 z2

where z1 ∈ [30, 60] is the piston weight (kg); z2 ∈ [0.005, 0.020] is the piston surface

2 3 area (m ); z3 ∈ [0.002, 0.010] is the initial gas volumn (m ); z4 ∈ [1000, 5000] is

4 4 the spring coefficient (N/m); z5 ∈ [9 × 10 , 11 × 10 ] is the atmospheric pressure

2 (N/m ); z6 ∈ [290, 296] is the ambient temperature (K); z7 ∈ [340, 360] is the filling gas temperature (K).

∗ rj range Individual Group Stage 1 Stage 2 Inputs Selection√ Selection -0.95 z2 g1 √ z2 -0.16 to -0.05 z4, z17, z16, z14 g2 z4 -0.04 to -0.01 z12, z7, z20, z5, z10 g3 0.00 to 0.02 z13, z15, z9 g4 0.04 z11, z6, z18 g5 √ 0.06 to 0.16 z19, z8, z1 g6 √ z1 0.63 z3 g7 z3

Table 4.7: Grouping and active effect selection by GSinCE for the piston model

Thirteen inert inputs z8, . . . , z20 are added to the original seven inputs. The

GSinCE procedure was applied to y(·) with n1 = 2f = 40 runs at Stage 1. Table

4.7 shows that the groups g1, g2, g6, g7 are selected at Stage 1 and p = 9 inputs

(z1, z2, z3, z4, z8, z14, z16, z17, z19) in these groups proceed to Stage 2. After collecting

64 additional n2 = 5p = 45 runs at Stage 2, only four inputs z1, z2, z3, z4 are selected as active.

Table 4.8 compares the four procedures with the same number, 85, of runs. All procedures using τ = 0.14 select the same set (z1, z2, z3, z4) of inputs. In contrast,

LBHHY with inert benchmark (τ = 0) also selects z5 and inert inputs z14, z19. The GSinCE procedure performs about 1.5 or 3 times faster than the other procedures.

Procedure τ Selected Inputs Computation Times (1,000 seconds) GSinCE 0.14 z1, z2, z3, z4 4.0 One-stage 0.14 z1, z2, z3, z4 13.0 LBHHY 0.14 z1, z2, z3, z4 5.8 LBHHY 0 z1, z2, z3, z4, z5, z14, z19 5.8

Table 4.8: Computation times and active effect selection of the four procedures for the piston model using 85 runs

4.1.5 Summary

As shown in Sections 4.1.1, 4.1.2, and 4.1.4, the GSinCE procedure identifies the same set of active inputs as the one-stage procedures. But the GSinCE procedure has the advantage over other procedures by being computationally efficient. This is because the number of inputs is reduced through grouping at Stage 1 and then only potentially active inputs are considered at Stage 2. This decreases the number of computationally expensive computer code runs required to determine the active inputs.

65 4.2 A Real Computer Experiment: FRAPCON Model

4.2.1 Description of Code

In this section, a computer code used at Los Alamos National Laboratory (LANL)

called FRAPCON code is considered. This code is “a steady-state fuel behavior

code designed to analyze fuel behavior from beginning-of-life to burnup levels of 65

GWd/MTU” (see Lanning, Beyer, and Berna (1997) for details). There is a total of f = 61 inputs; the first 44 inputs (1-44) are “calibration” parameters corresponding

to physics model uncertainties, while the last 16 inputs (45-61) are “design” inputs,

the nominal values of which correspond to Rod 1. From each FRAPCON code run,

four outputs are computed as follows.

y1: Average of the total fuel radius change (in microns) for axial regions 1-4 at time-step 38

y2: Average of the total fuel radius change (in microns) for axial regions 1-4 at time-step 44

y3: Centerline temperature (Kelvin) for axial region 4 at time-step 38

y4: Fission gas release at time-step 44

The subject matter experts for this study wanted to identify the active inputs that

have major impacts on each of these four outputs. This problem is challenging since

there are 61 inputs and four outputs to consider in the input-output relationship.

However, the subject matter experts determined that the outputs y1 and y3 are most important and y2 is least important.

66 4.2.2 Use of GSinCE

This problem is a good example to show that the GSinCE procedure works well in a real world problem. The details of the GSinCE procedure given for this problem are described in the following.

Multiple Output

The GSinCE procedure is developed for a univariate output, but this problem has four outputs from the FRAPCON code runs. Since there is no clear functional rela- tionship between these outputs, they were analyzed separately via univariate analyses and the results were combined.

Number of Code Runs at Stage 1

In the examples of Section 4.1, the number of code runs in Stage 1 of the GSinCE procedure is selected to be n1 = 2f. Here, however, the computer code runs at some sets of input values cannot produce output values because of numerical problems in the codes. So n1 was set larger than 2f = 122. The FRAPCON code is computationally inexpensive, so n1 = 500 code runs was requested to the subject matter experts to guard against a large number of missing outputs.

Preliminary Design

∗ The GSinCE procedure first creates a preliminary design matrix X with n1 rows and (n1 − 1) columns. When n1 is large such as 500 in this problem, the search for the best design matrix X∗ in terms of the properties discussed in Section 2.2 is time-consuming. The maximum number of columns which can be assigned for the low-impact inputs is n1 − 1 − f = 438. Using a large number of low-impact inputs can reduce the computational efficiency of the procedure by repeating the sensitivity

67 analysis (Section 2.3.3) more times than necessary. Thus, here, the number of low- impact inputs was selected to be 50, and the dimension of the preliminary design

matrix X∗ was reduced from 500 × 499 to 500 × (61 + 50) = 500 × 111.

Stage 1 Sampling

The Stage 1 design matrix, X(1), was taken to be the first f = 61 columns of the 500 × 111 preliminary design matrix X∗. The columns in X(1) were converted into

the ranges of the inputs (as described in Section 2.3.1) and then the computer code

was run at the 500 sets of input values. From these, 336 sets of output values for the

four outputs were able to be obtained; that is, outputs were not obtained at 164 sets

of input values because of numerical code problems. At this time point, the design matrices X∗ and X(1) were updated so that they contained the 336 design points at which the output values were produced. The Stage 1 grouping and analysis were then done using the updated design matrices having these 336 design points.

Stage 1 Grouping

The groups can be made in many different ways. The choice of the grouping

affects the efficiency and precision of the GSinCE procedure. Here three different

approaches are discussed; (i) grouping using exploratory data analysis (EDA) of the

Stage 1 data using automatic procedure of Section 2.3.2 and modified by hands, (ii)

grouping using information from the subject matter experts, and (iii) grouping using a combination of these. The details of the grouping and screening results in each case

are described in Section 4.2.3.

Once the grouping is chosen, the design matrix for the group variables and the

low-impact inputs must be constructed. In Section 2.3.2, the design matrix for the

group is created by randomly selecting columns from the preliminary design matrix

68 X∗. However, the updated X∗ need not retain the desirable properties (P.1)-(P.3) because some rows were deleted. Thus in this example the design matrix for the group variables and 50 low-impact inputs was newly constructed so that it satisfies the properties (P.1)-(P.3).

Stage 2 Sampling

Even though some outputs may be missing due to numerical problems with the

code, the GSinCE rule was followed and n2 = 5p runs were taken for the additional computer code runs at Stage 2, where p is the number of inputs identified as poten-

tially active at Stage 1. In this experiment, there are plenty of output values since

the 336 Stage 1 outputs are re-used at Stage 2.

The Stage 2 design matrix with n2 = 5p rows was constructed using the updated Stage 1 design matrix with 336 rows. The number of low-impact inputs for Stage 2

was selected to be δ = min(50, n2 − p − 1) since 50 low-impact inputs were used at

(2) Stage 1. Then as described in Section 2.4.1, the matrix XA for p potentially active (2) inputs and the matrix XB for δ low-impact inputs were selected so that a combined c (336+n2)×(p+δ) matrix X is approximately maximin, while each column of matrix (2) − XN for f p potentially non-active inputs was set to equal to the median value of (1) the corresponding column of the updated Stage 1 design matrix XN . The FRAPCON code was run at each set of input values defined by the rows

(2) (2) (2) of X = (XA , XN ) to obtain the Stage 2 output. When some outputs were missing again with the same numerical problem of the code (the number of missing output values are specified in Section 4.2.3), the Stage 2 design was updated so that it contained only design points corresponding to available output values. Accordingly, the combined design matrix Xc must also be updated for Stage 2 analysis.

69 4.2.3 Implementations

For each of the three grouping approaches, τ = 0.14 is used to perturb the outputs as recommended via the simulation studies in Section 3.1. The hypothesis tests (2.8) and (2.12) in each stage are conducted at familywise significance level α = 0.2.

Screening based on Grouping by EDA

This approach divides the individual inputs into groups using an empirical study of the Stage 1 data. The automatic grouping procedure developed in Section 2.3.2 based on Fisher-transformed Pearson correlation coefficients is one way to obtain possible groups quickly. This was done first using the automatic grouping procedure with maximum group size M = 5. Then the groups were modified so that the group sizes were similar and each group consists of inputs having similar correlations. The groups made separately for each output (y1, y2, y3, y4) are given in Tables 4.10–4.13 in detail. The grouping procedure led to 15 groups for y1, 16 groups for y2, 18 groups for y3, and 17 groups for y4.

In the Stage 1 analysis, groups g1 with inputs 17, 19, 20 and g15 with inputs 1,

38, 39 are selected for output y1, groups g1 with input 31 and g16 with input 1 are selected for output y2, groups g1 with inputs 17, 19, 20 and g18 with inputs 1, 31 are selected for output y3, and groups g1 with inputs 17, 19, 20, g2 with inputs 36,

37, g17 with inputs 1, 38, 39 are selected for output y4. The inputs in these groups are summarized for each output in Table 4.9. Interestingly, input 1 is selected for all outputs, inputs 17, 19, 20 are selected for three of the outputs (y1, y3, y4), inputs 31, 38, 39 are selected for two of the outputs, and inputs 36, 37 are selected for only one output. Since it was necessary to reduce the possibility of missing active inputs at

Stage 1, a conservative approach was taken and the union of the p = 9 inputs (1, 17, 19, 20, 31, 36, 37, 38 39) proceeded to the second stage for each output.

70 Output Stage 1 Selection Stage 2 Selection y1 1 17 19 20 38 39 1 17 19 20 y2 1 31 1 31 y3 1 17 19 20 31 1 17 19 20 y4 1 17 19 20 36 37 38 39 1 17 19 20 36 38 39 Union 1 17 19 20 31 36 37 38 39 1 17 19 20 31 36 38 39

Table 4.9: Summary of screening for all outputs based on grouping by EDA

To obtain a second data set for the detailed investigation of the p = 9 inputs, a

(2) (2) (2) Stage 2 design X = (XA , XN ) with n2 = 5p = 45 rows was constructed. Here, − (2) (1) f p = 52 columns in XN were set to the median values from XN and p = 9 (2) (2) columns in XA were newly generated in combination with XB . The number of low-impact inputs at Stage 2 was selected to be δ = min(50, n2 − p − 1) = 35. So

(1) XB was selected from the first 35 columns among 50 low-impact columns of Stage 1 (2) in order to combine with XB . The FRAPCON code was run at these n2 = 45 input values in X(2). Of these, 38 outputs were obtained and 7 were missing due to the numerical code problems. The Stage 2 analysis was done for the combined 336+38 = 374 outputs from both

stages using p = 9 inputs and δ = 35 low-impact inputs. As Table 4.9 shows, inputs

1, 17, 19, 20 are selected for outputs y1, y3 and y4. In addition, inputs 36, 38, 39 are selected for y4. Inputs 1, 31 are selected for output y2. Then the 8 inputs (1, 17, 19, 20, 31, 36, 38, 39) among the p = 9 potentially active inputs are declared to be active for at least one output. Thus only 8 inputs among 61 inputs is finally selected to be active.

71 ∗ Output rj range Individual Inputs Groups Selection√ y1 -0.467 to -0.398 19 17 20 g1 -0.149 to -0.109 36 10 9 g2 -0.093 to -0.069 6 37 45 25 g3 -0.056 to -0.042 41 60 59 56 g4 -0.038 to -0.031 31 51 4 28 g5 -0.025 to -0.021 21 13 34 54 11 g6 -0.018 to -0.012 47 53 42 49 40 g7 -0.009 to 0.000 18 30 43 12 29 5 g8 0.005 to 0.013 24 2 55 61 g9 0.022 to 0.029 35 33 50 g10 0.030 to 0.050 22 14 26 48 7 g11 0.050 to 0.058 27 23 3 57 g12 0.062 to 0.081 58 46 8 15 g13 0.091 to 0.108 44 52 32 16 g14 √ 0.164 to 0.304 38 39 1 g15

Table 4.10: Stage 1 grouping by EDA and selection for y1

∗ Output rj range Individual Inputs Groups Selection√ y2 -1.942 31 g1 -0.060 to -0.054 43 54 6 g2 -0.045 to -0.033 36 30 27 g3 -0.029 to -0.023 3 4 9 g4 -0.019 to -0.011 25 14 41 50 23 g5 -0.010 to -0.006 12 56 5 58 g6 -0.003 to -0.001 26 42 37 40 49 g7 0.000 to 0.003 29 19 34 61 55 22 g8 0.004 to 0.006 24 7 13 g9 0.010 to 0.017 15 2 18 17 16 g10 0.020 to 0.029 59 11 52 35 21 39 g11 0.032 to 0.034 46 48 60 g12 0.035 to 0.039 10 8 45 38 g13 0.040 to 0.048 57 28 51 g14 0.056 to 0.073 32 53 47 33 20 g15 √ 0.241 1 g16

Table 4.11: Stage 1 grouping by EDA and selection for y2

72 ∗ Output rj range Individual Inputs Groups Selection√ y3 -0.524 to -0.481 17 19 20 g1 -0.108 to -0.088 10 9 g2 -0.067 to -0.060 6 51 47 25 45 g3 -0.057 to -0.051 56 37 36 g4 -0.048 to -0.042 41 60 21 g5 -0.037 to -0.033 13 59 11 g6 -0.028 to -0.023 4 42 34 53 g7 -0.017 to -0.014 28 33 49 g8 -0.006 to -0.003 29 24 5 g9 0.000 55 35 40 g10 0.004 to 0.007 18 30 2 g11 0.010 to 0.024 54 43 61 12 g12 0.027 to 0.035 50 22 46 27 48 g13 0.047 to 0.060 14 52 38 8 g14 0.062 to 0.069 26 7 58 3 g15 0.071 to 0.088 44 15 23 57 g16 0.097 to 0.111 32 16 39 g17 √ 0.221 to 0.252 1 31 g18

Table 4.12: Stage 1 grouping by EDA and selection for y3

∗ Output rj range Individual Inputs Groups Selection√ y4 -0.416 to -0.371 19 17 20 g1 √ -0.253 to -0.137 36 37 g2 -0.098 to -0.073 10 25 33 g3 -0.062 to -0.052 45 56 6 60 g4 -0.049 to -0.042 59 41 47 28 g5 -0.030 to -0.020 21 9 53 12 51 g6 -0.020 to -0.014 11 13 2 g7 -0.009 to -0.004 4 30 54 42 35 55 g8 0.000 to 0.010 49 40 46 5 g9 0.012 to 0.019 61 24 34 18 29 g10 0.024 to 0.027 14 48 8 g11 0.033 to 0.040 44 27 58 3 g12 0.041 to 0.044 22 7 43 50 g13 0.053 to 0.058 57 26 52 g14 0.072 to 0.095 23 15 16 g15 0.105 to 0.143 32 31 g16 √ 0.224 to 0.417 1 38 39 g17

Table 4.13: Stage 1 grouping by EDA and selection for y4

73 Screening based on Grouping by Expert

Based on physics considerations and preliminary study, the subject matter experts provided their own grouping. They divided the 61 inputs into 11 groups as shown in Table 4.14. Their grouping was made by considering all four outputs simultane- ously. Unlike the previous grouping, this grouping was commonly applied to the four outputs.

Group Individual inputs g1 1 2 3 g2 4 5 6 7 8 9 g3 10 11 12 13 14 15 g4 16 17 18 19 20 21 g5 22 23 24 25 g6 26 27 28 29 30 g7 31 g8 32 33 34 35 36 37 38 39 40 41 42 43 g9 44 g10 45 46 48 50 51 53 56 57 59 60 61 g11 47 49 52 54 55 58

Table 4.14: Grouping by expert

Using this grouping the selections at Stage 1 and Stage 2 are summarized in Table

4.15. At Stage 1, groups g1 and g4 are selected for outputs y1 and y3, groups g1, g4, g7 are selected for output y2, and groups g1, g4, g8 are selected for output y4. That is, inputs 1–3, 16–21 from the groups g1 and g4 are selected for all outputs, while input

31 in the group g7 is selected only for output y2 and input 32–43 in the group g8 are selected for only output y4. The inputs in these groups selected for at least one output proceed to Stage 2. Thus p = 22 inputs (1–3, 16–21, 31, 32–43) are examined in detail at Stage 2.

(2) (2) (2) For a detailed investigation of p = 22 inputs, a Stage 2 design X = (XA , XN ) − (2) with n2 = 5p = 110 rows was constructed. Here, f p = 39 columns in XN were set

74 Group Stage 1 Selection Stage 2 Selection y√1 y√2 y√3 y√4 y1 y2 y3 y4 g1 1 1 1 1 g2 g3 √ √ √ √ g4 17 19 20 17 19 20 17 19 20 g5 g6 √ g7 √ 31 g8 38 39 36 38 39 g9 g10 g11

Table 4.15: Summary of screening for all outputs based on grouping by expert

(1) (2) to the median values from XN and p = 22 columns in XA were newly generated in (2) combination with XB . The number of low-impact inputs at Stage 2 was selected to − − (1) (2) be δ = min(50, n2 p 1) = 50. So all 50 columns in XB were combined with XB . (2) The FRACPCON code was run at these n2 = 110 input values in X . Of these, 95 outputs were obtained and 15 were missing due to numerical code problems.

The Stage 2 analysis was done for the combined 336 + 95 = 431 outputs from

both stages using only p = 22 inputs and δ = 50 low-impact inputs. Table 4.15 shows inputs 1, 17, 19, 20, 38, 39 are selected for output y1, inputs 1 and 31 are selected for output y2, inputs 1, 17, 19, 20 are selected for output y3, and inputs 1, 17, 19,

20, 36, 38, 39 are selected for output y4. Then the 8 inputs (1, 17, 19, 20, 31, 36, 38, 39) among p = 22 are declared to be active for at least one output. Thus the 8 inputs among 61 inputs are finally selected to be active. Compared to the results of grouping using EDA, this approach selects inputs 38 and 39 additionally for output y1. Other than that, the screening result of the two approaches is the same.

75 Screening based on Grouping by Expert and EDA

The groupings given by the subject matter experts in Table 4.14 have large vari- ability in group size; the sizes of the groups vary from 1 to 12. The group having many inputs has more chance to be declared as active because of the larger chance of accumulation of small effects of the same sign. Cancellation of effects can also be more likely to happen when a group of big size consists of many inputs of different signs of effects. So the grouping with similar sizes of groups is preferred to reduce the side effect due to the big difference in the group size. The EDA technique was used to modify the initial grouping given by the subject matter experts.

input expert Trans Corr sub new input expert Trans Corr sub new group y1 y3 group group group y1 y3 group group 1 1 0.3042 0.2210 1-1 g1 32 8 0.1073 0.0965 8-1 g14 2 1 0.0073 0.0069 1-2 g2 33 8 0.0256 -0.0164 8-2 g15 3 1 0.0562 0.0692 1-2 g2 34 8 -0.0215 -0.0257 8-2 g15 4 2 -0.0344 -0.0280 2-1 g3 35 8 0.0219 -0.0001 8-2 g15 5 2 -0.0004 -0.0032 2-1 g3 36 8 -0.1486 -0.0507 8-3 g16 6 2 -0.0932 -0.0665 2-1 g3 37 8 -0.0919 -0.0564 8-3 g16 7 2 0.0495 0.0636 2-2 g4 38 8 0.1635 0.0553 8-1 g14 8 2 0.0710 0.0599 2-2 g4 39 8 0.2157 0.1112 8-1 g14 9 2 -0.1085 -0.0878 2-1 g3 40 8 -0.0122 0.0000 8-2 g15 10 3 -0.1099 -0.1077 3-1 g5 41 8 -0.0556 -0.0480 8-3 g16 11 3 -0.0205 -0.0325 3-1 g5 42 8 -0.0155 -0.0276 8-2 g15 12 3 -0.0039 0.0241 3-2 g6 43 8 -0.0044 0.0169 8-2 g15 13 3 -0.0219 -0.0365 3-1 g5 44 9 0.0906 0.0705 9 g17 14 3 0.0335 0.0473 3-2 g6 45 10 -0.0725 -0.0602 10-1 g18 15 3 0.0811 0.0763 3-2 g6 46 10 0.0642 0.0294 10-2 g19 16 4 0.1075 0.0974 4-1 g7 48 10 0.0437 0.0352 10-2 g19 17 4 -0.4604 -0.5240 4-2 g8 50 10 0.0291 0.0266 10-2 g19 18 4 -0.0087 0.0036 4-1 g7 51 10 -0.0345 -0.0646 10-1 g18 19 4 -0.4668 -0.4853 4-2 g8 53 10 -0.0178 -0.0233 10-1 g18 20 4 -0.3977 -0.4813 4-2 g8 56 10 -0.0422 -0.0565 10-1 g18 21 4 -0.0245 -0.0422 4-1 g7 57 10 0.0584 0.0882 10-2 g19 22 5 0.0301 0.0284 5-1 g9 59 10 -0.0487 -0.0350 10-1 g18 23 5 0.0523 0.0779 5-1 g9 60 10 -0.0538 -0.0472 10-1 g18 24 5 0.0052 -0.0052 5-2 g10 61 10 0.0126 0.0223 10-2 g19 25 5 -0.0687 -0.0615 5-2 g10 47 11 -0.0180 -0.0627 11-1 g20 26 6 0.0346 0.0616 6-1 g11 49 11 -0.0149 -0.0143 11-1 g20 27 6 0.0503 0.0342 6-1 g11 52 11 0.0990 0.0536 11-2 g21 28 6 -0.0311 -0.0170 6-2 g12 54 11 -0.0210 0.0102 11-1 g20 29 6 -0.0015 -0.0062 6-2 g12 55 11 0.0121 -0.0003 11-2 g21 30 6 -0.0085 0.0052 6-2 g12 58 11 0.0622 0.0688 11-2 g21 31 7 -0.0382 0.2523 7 g13

Table 4.16: Construction of subgroups within a group made by expert

76 Here groups were possibly divided into subgroups, but were not merged to form a combined group. To form the subgroups within each group of Table 4.14, the

Fisher-transformed correlation coefficients between inputs and the output y1 and

those between inputs and the output y3 were used simultaneously, since these outputs are known as most important among the 4 outputs. For example, the first group of the initial grouping in Table 4.14 was divided into two groups, one group with input

1 and another group with inputs 2 and 3. Input 1 was separated into a group of size

1 since it has distinguishably big correlations with y1 and y3; while inputs 2 and 3 were grouped into a different subgroup since they have small correlations with both

y1 and y3. Table 4.16 shows how the 11 initial groups are divided into 21 groups. The GSinCE procedure was performed for each output separately but using the

common grouping with 21 groups. As Table 4.17 shows, groups g1, g7, g8, g14, g16 are

selected for y1 and y4, groups g1, g7, g8, g13 are selected for output y2, and groups

g1, g7, g8 are selected for output y3. So inputs 1, 16, 17, 18, 19, 20, 21 from the groups

g1, g7, g8 are chosen for all outputs, input 31 is chosen only for output y2, and inputs

32, 36, 37, 38, 39, 41 are chosen only for outputs y1 and y4. Thus p = 14 inputs (1, 16, 17, 18, 19, 20, 21, 31, 32, 36, 37, 38, 39, 41) are selected at least for one output,

and hence proceed to Stage 2.

For a further investigation of the p = 14 inputs, a Stage 2 design X(2) = (2) (2) − (XA , XN ) with n2 = 5p = 70 rows was constructed. Here, f p = 47 columns in (2) (1) (2) XN were set to the median values from XN and p = 14 columns in XA were newly (2) generated in combination with XB . The number of low-impact inputs at Stage 2 − − (1) was selected to be δ = min(50, n2 p 1) = 50. So all 50 columns in XB were (2) combined with XB . The FRACPCON code was run at these n2 = 70 input values in X(2). Of these, 59 outputs were obtained and 11 were missing due to numerical code problems.

77 Individual inputs Groups Stage 1 Selection Stage 2 Selection y√1 y√2 y√3 y√4 y1 y2 y3 y4 1 g1 1 1 1 1 2 3 g2 4 5 6 9 g3 7 8 g4 10 11 13 g5 12 14 15 g6 √ √ √ √ 16 17 18 g7 √ √ √ √ 17 17 17 19 20 21 g8 19 20 19 20 19 20 22 23 g9 24 25 g10 26 27 g11 28 29 30 g12 √ 31 g13 √ √ 31 32 38 39 g14 38 38 39 33 34 35 40 42 43 g15 √ √ 36 37 41 g16 36 44 g17 45 51 53 56 59 60 g18 46 48 50 57 61 g19 47 49 54 g20 52 55 58 g21

Table 4.17: Summary of screening for all outputs based on grouping by expert and EDA

The Stage 2 analysis was done using the combined 336 + 59 = 395 outputs from both stages for the p = 14 inputs and δ = 50 low-impact inputs. Table 4.17 shows inputs 1, 17, 19, 20, 38 are selected for output y1, inputs 1 and 31 are selected for output y2, inputs 1, 17, 19, 20 are selected for output y3, and inputs 1, 17, 19, 20,

36, 38, 39 are selected for output y4. Thus as before the 8 inputs (1, 17, 19, 20, 31, 36, 38, 39) are declared to be active for at least one output and they are identified as active inputs.

Summary

Table 4.18 shows the summary of the GSinCE procedure for all three grouping methods. The three groupings conclude the same sets of the inputs to be active

78 for outputs y2, y3, and y4. The difference in the screening decision happens only in output y1. For output y1, inputs 1, 17, 19, 20 are commonly selected but the selection of inputs 38 and 39 is different in the three grouping approaches. However, inputs

38 and 39 are always selected for output y4. Thus the final screening decision is the same across all grouping methods after combining the results from all outputs. That is, the same set of inputs, 1, 17, 19, 20, 31, 36, 38, 39, are selected as active for at least one output, based on the three different grouping approaches. Interestingly, the

8 active inputs are all calibration parameters.

Output Grouping by EDA Grouping by Expert Grouping Expert and EDA y1 1 17 19 20 1 17 19 20 38 39 1 17 19 20 38 y2 1 31 1 31 1 31 y3 1 17 19 20 1 17 19 20 1 17 19 20 y4 1 17 19 20 36 38 39 1 17 19 20 36 38 39 1 17 19 20 36 38 39 Union 1 17 19 20 31 36 38 39 1 17 19 20 31 36 38 39 1 17 19 20 31 36 38 39 Stage 1 Runs 336 among 500 336 among 500 336 among 500 Stage 2 Runs 38 among 45 95 among 110 59 among 70

Table 4.18: Summary of screening for all groupings

The computational efficiencies of the three methods of grouping are different,

since they declare different number of inputs to be potentially active at Stage 1. For

example, the grouping by Expert selected 22 inputs at Stage 1, so it requires many additional computer code runs at Stage 2. It leads to the computational burden both

in running the computer code and in analyzing high-dimensional data. The grouping

by EDA provides the same screening result for outputs y2, y3, y4 compared to the

results based on the other groupings. For output y1, the grouping by EDA does

not select inputs 38 and 39, but these two inputs are selected for output y4, so the final selection of 8 inputs agrees with the other groupings. Thus the grouping using

79 EDA performs in the most efficient way by using a smallest total number of runs for screening inputs of the FRAPCON code.

80 CHAPTER 5

COMPUTATION OF SENSITIVITY INDICES

This chapter describes the computation of sensitivity indices which are used as screening measures in the GSinCE procedure in Chapter 2. The computation of sensitivity indices of quantitative inputs based on the Gaussian Process (GP) model is explained in Section 5.1. In Section 5.2, a new approach for computing the sensitivity indices is proposed for the situation in which both quantitative and qualitative inputs exist.

5.1 Sensitivity Indices of Quantitative Inputs

Oakley and O’Hagan (2004) described the computation of sensitivity indices in the GP model framework (described in Section 1.2) using Bayesian methodology and showed the computational efficiency compared to a based on a sampling plan. The GPM/SA code developed by Los Alamos National Laboratory implements the sensitivity analysis using this GP model framework. In particular, the

GPM/SA code was constructed based on a Gaussian correlation function to model the output. Here this approach is developed for the cubic correlation function. First, the definition of the effect of inputs on the output is reviewed, followed by the sensitivity indices corresponding to a set of quantitative inputs (see Saltelli, Chan, and Scott

81 (2000) and Santner et al. (2003), chapter 7). Finally the estimation of sensitivity indices in the GP model framework is described and application results via a simple example are given.

5.1.1 Definition of Sensitivity Indices

Suppose the computer code output y(x) = y(x1, . . . , xf ) is a function of f quan- titative inputs which are defined on a hyper-rectangle X = X1 × ... × Xf = [l1, u1] ×

... × [lf , uf ]. An effect of an input can be calculated as an average, which can be done by “integrating out” the other inputs. Suppose that integration is with re- ∏ f spect to g(x) = k=1 gk(xk), which is a product of functions of one input at a time.

The weight functions gk’s can be chosen to be equal, representing uniform interest across the input region Xk for each input xk, k = 1, . . . , f. The g(x) can be a joint probability density with independent components. First the Sobol´ decomposition is reviewed and its relationship is demonstrated to joint effect functions which are used in practice for an efficient computation of sensitivity indices. In the following, a general weight function gk(xk) on Xk = [lk, uk], k = 1, . . . , f, is used. Section 2.3.3 describes a simplified version when gk(xk) = 1 for k = 1, . . . , f.

Sobol´ Decomposition ∫

Let y0 = y(x)g(x) dx X denote the overall mean of y(x). Sobol´ (1993) showed that there is a unique decom- position of y(x),

∑f ∑ y(x) = y0 + yi(xi) + yij(xi, xj) + ... + y1,2,...,f (x1, . . . , xf ) (5.1) i=1 1≤i

82 ∫ that satisfies yi1,...,is (xi1 , . . . , xis )gik (xik ) dxik = 0 (5.2) X ik

for any 1 ≤ ik ≤ s, and that has orthogonal components; that is, for any (i1, . . . , is) ̸ = (j1, . . . , jt), ∫

yi1,...,is (xi1 , . . . , xis )yj1,...,jt (xj1 , . . . , xjt )g(x) dx = 0. (5.3) X

The component terms in (5.1) are defined by setting

∫ ∏f yi(xi) = y(x) gk(xk) dx−i − y0 k≠ i X−i

to be the “main effect” function of input xi, for 1 ≤ i ≤ f, and

∫ ∏f yij(xi, xj) = y(x) gk(xk) dx−{ij} − yi(xi) − yj(xj) − y0 k≠ i,j X−{ij}

to be the interaction effect function of inputs xi and xj, where 1 ≤ i < j ≤ f. Here ∏ X X dx−i denotes integration of all inputs except xi over the input region −i = k≠ i k and analogously dx−{ij} denotes integration of all inputs except xi and xj over the ∏ X X input region −{ij} = k≠ i,j k. Higher-order interaction terms are defined similarly. The total variance of y(x) (with respect to g(·)) is defined to be

∫ 2 − 2 V = V arg[y(X)] = y (x)g(x)dx y0. X

Using (5.1), the variance, V , of y(x) can be partitioned as follows

∑f ∑ V = Vi + Vij + ... + V1,2,...,f (5.4) i=1 1≤i

83 where ∫ ∫ 2 2 Vi = yi (xi)gi(xi)dxi,Vij = yij(xi, xj)gi(xi)gj(xj)dxidxj,

Xi Xi×Xj

≤ ≤ for 1 i < j f and, in general, the variance Vk1,...,ks is defined as

Vk1,...,ks = V arg[yk1,...,ks (Xk1 ,...,Xks )] ∫ ∏s 2 = yk1,...,ks (xk1 , . . . , xks ) gki (xki ) dxk1 . . . dxks X × ×X i=1 k1 ... ks

∈ because each yk1,...,ks has mean zero by construction for any subset (k1, . . . , ks) {1, . . . , f}.

The sensitivity of y(·) with respect to the (k1, . . . , ks) interaction (or main effect)

is defined by V S = k1,...,ks (5.5) k1,...,ks V

which, in words, is the proportion of total variance explained by the (xk1 , . . . , xks ) interaction (or main effect) function. In particular, S1,...,Sf are called the main effect sensitivity indices and Sij is a two-factor sensitivity index. The total effect sensitivity index of the input xi is defined to be the sum of all sensitivity indices involving the input xi, ∑ ∑ Ti = Si + Sij + Sji + ... + S1,2,...,f . (5.6) j>i j

For example, if there are f = 3 inputs, then T1 = S1 + S12 + S13 + S123. Notice that,

by construction, Si ≤ Ti for all i ∈ {1, . . . , f}; if Si and Ti are nearly equal, then one

can infer that input xi is involved in very few “important” interaction effects.

84 Joint Effect Functions

When discussing the main effect and higher-order effect terms, the following no-

tation will be used. If the effect of a subset of input variables S = {k1, . . . , ks} ⊂ { } 1, 2, . . . , f is of interest, let xS denote the vector of inputs (xk1 , . . . , xks ). The vec-

tor of the remaining inputs will be denoted by x−S . By rearranging the order of the

input variables, one may write x = (xS , x−S ). Similarly let XS denote the input

region of the inputs xS and let X−S denote the input region of the remaining inputs.

Then X = XS × X−S .

Now a set of functions closely related to the corrected effect function yS (·) are introduced. For any non-empty S = {k1, . . . , ks} ⊂ {1, 2, . . . , f}, define the joint

effect of the inputs xS to be

jS (xS ) = Eg[y(X)|XS = xS ] = y(xS , x−S )g(x−S )dx−S . (5.7)

XS

The joint effects jS (xS ) need not be centered nor need jS1 (xS1 ) and jS2 (xS2 ) be orthogonal for S1 ≠ S2. The joint effect and Sobol´ component are closely related.

For example, the main effect of input xi is

yi(xi) = ji(xi) − y0, 1 ≤ i ≤ f (5.8)

and the interaction effect of inputs xi and xj satisfies

yij(xi, xj) = jij(xi, xj) − yi(xi) − yj(xj) − y0. (5.9)

Equation (5.9) shows that the joint effect jij(xi, xj) includes the interaction effects as

well as the main effects of xi and xj.

85 For arbitrary S, define

j VS = V arg[jS (XS )] = V arg [Eg[y(X)|XS ]] (5.10)

j to be the variance of the joint effect. The quantity VS has the interpretation that it

measures the expected reduction in uncertainty due to observing xS , since

j VS = V arg[y(X)] − Eg [V arg[y(X)|XS )]]. Consider two special cases of the joint effect function. The variance of the effect

function ji(xi) of the individual input xi is

j Vi = V arg[yi(Xi) + y0] = Vi (5.11)

j from (5.8), so Vi is the same as Vi in (5.4). The main effect sensitivity index of input

j xi, Si, can be computed from V S = i . (5.12) i V

Using (5.9), (5.10), and (5.3), the variance of the joint effect of inputs xi and xj

j Vij = V arg[yi(Xi) + yj(Xj) + yij(Xi,Xj) + y0] = Vi + Vj + Vij (5.13)

because the components are orthogonal. Equation (5.13) contains both the variance

of the main effects and the variance of the interaction effect of inputs xi and xj. Thus

j VS ≠ VS when S contains more than one input.

Let X−i denote the vector of all components of X except Xi. Then

j V−i = V arg[j−i(X−i)]

= V arg[y1,2,...,i−1,i+1,...,f + ... + y1 + y2 + ... + yi−1 + yi+1 + ... + yf + y0]

= V1,2,...,i−1,i+1,...,f + ... + V1 + V2 + ... + Vi−1 + Vi+1 + ... + Vf ∑ = VS (5.14) S: i ̸∈ S

86 is the sum of all VS components not involving the subscript i in the variance decom- − j position (5.4). Thus V V−i is the sum of all VS components involving the input xi and can be used to compute the total effect sensitivity index (5.6) by

V − V j T = −i . (5.15) i V

Thus, in the following, to estimate the main effect and total effect sensitivity indices

Si and Ti, joint effect variances rather than the variances of the corrected effects in the Sobol’ decomposition need to be estimated.

5.1.2 Estimation in Gaussian Process Framework

Suppose the unknown function (output of a computer code) y(x) is viewed as

a draw from a stationary Gaussian process Y (x) with a constant mean β, variance

1/λY , and covariance function

∏f 1 2 1 Y 1 2 1 Y 1 2 Covp[Y (x ),Y (x )] = R (x , x ; θ) = R (xk, xk; θk) (5.16) λY λY k=1

where θ = (θ1, . . . , θf ) is a vector of parameters of the Gaussian or cubic correlation

functions. Here Ep and Covp denote expectation and covariance with respect to the

1 2 1 2 random GP, x and x denote two input sites, and xk and xk indicate the values corresponding to the kth input of these two input sites. In particular, assume the

code calculation at a specific input site x can be viewed as a realization of

Zsim(x) = Y (x) + ϵsim(x)

by adding ϵsim(x) which is a white noise process with mean zero and variance 1/λϵ and

is independent of Y (x). The term ϵsim(x) should be thought of as explicitly modeling

87 non-deterministic behavior of the computer output or enhancing numerical stability in

1 n ⊤ the estimation of the correlation parameters. Then Zsim = (Zsim(x ),...,Zsim(x ))

Z has a mean vector msim = β1 and a covariance matrix

Z 1 1 Σsim = R + I λY λϵ where the (i, j)th element of the n × n matrix R is RY (xi, xj; θ) and I is the n × n identity matrix.

Analogous to the joint effect function jS (xS ) defined in (5.7) for y(·), one can view ∫

JS (xS ) = Eg[Y (X)|XS = xS ] = Y (xS , x−S )g(x−S )dx−S (5.17)

X−S as a prior process for jS (xS ).

Viewed as a process with index xS , JS (xS ) is a GP with mean function ∫ J mS (xS ) ≡ Ep[JS (xS )] = Ep[Y (x)]g(x−S )dx−S = β, (5.18)

X−S covariance function

J 1 2 1 2 CS (xS , xS ) ≡ Covp[JS (xS ),JS (xS )]   ∫ ∫  1 1 1 2 2 2  = Covp  Y (x )g(x−S )dx−S , Y (x )g(x−S )dx−S  X X ∫ ∫−S −S 1 Y 1 2 1 2 1 2 = R (x , x ; θ)g(x−S )g(x−S )dx−S dx−S λY X X −S [−S∫ ∫ ] ∏ uk uk ∏ 1 Y 1 2 1 2 1 2 Y 1 2 = R (xk, xk; θk)g(xk)g(xk)dxkdxk R (xk, xk; θk) λY k̸∈S | lk lk {z } k∈S

dbint(lk,uk;θk) ∏ ∏ 1 Y 1 2 = dbint(lk, uk; θk) R (xk, xk; θk), (5.19) λY k̸∈S k∈S

88 and variance function ∏ J 1 CS (xS , xS ) ≡ Covp[JS (xS ),JS (xS )] = dbint(lk, uk; θk) (5.20) λY k̸∈S

Y since R (xk, xk; θk) = 1, where the double integral dbint is derived for the Gaussian and cubic correlation functions in Section 5.1.3.

⊤ Now [JS (xS ), Zsim] has the multivariate normal distribution       J J   mS (xS )   CS (xS , xS ) Covp[JS (xS ), Zsim]   N    ,    . (5.21) Z ⊤ Z msim Covp[JS (xS ), Zsim] Σsim

P Define JS (xS ) to be the process whose distribution is that of the posterior predictive

1 n ⊤ process of JS (xS ) given n code calculations zsim = (zsim(x ), . . . , zsim(x )) and

P whose marginal distribution can be derived from (5.21). Further JS (xS ) is a GP (see Johnson and Wichern (1998), pages 160–161) with mean function ( ) ( ) P Z −1 − Z mS (xS ) = β + Covp[JS (xS ), Zsim] Σsim zsim msim (5.22) and covariance function

P 1 2 CS (xS , xS ) ( ) J 1 2 − 1 Z −1 2 ⊤ = CS (xS , xS ) Covp[JS (xS ), Zsim] Σsim Covp[JS (xS ), Zsim] [( ) ] J 1 2 − Z −1 2 ⊤ 1 = CS (xS , xS ) trace Σsim Covp[JS (xS ), Zsim] Covp[JS (xS ), Zsim] (5.23) and variance function

P CS (xS , xS ) [( ) ] J − Z −1 ⊤ = CS (xS , xS ) trace Σsim Covp[JS (xS ), Zsim] Covp[JS (xS ), Zsim] . (5.24)

89 To derive a more explicit expression for (5.22)–(5.24), note that Covp[JS (xS ), Zsim] is a 1 × n vector with the ith element   ∫  i  Covp[JS (xS ), Zsim]i = Covp  Y (x)g(x−S )dx−S , Zsim(x ) X ∫ −S i i = Covp[Y (x),Y (x ) + ϵ(x )]g(x−S )dx−S X ∫−S ∫ i 1 Y i = Covp[Y (x),Y (x )]g(x−S )dx−S = R (x, x ; θ)g(x−S )dx−S λY X X −S [∫ ] −S ∏ uk ∏ 1 Y i Y i = R (xk, xk; θk)g(xk)dxk R (xk, xk; θk) λY k̸∈S | lk {z } k∈S i sgint(lk,uk,xk;θk) ∏ ∏ 1 i Y i = sgint(lk, uk, xk; θk) R (xk, xk; θk) (5.25) λY k̸∈S k∈S where the single integral sgint is derived in Section 5.1.3 for the Gaussian and cubic correlation functions.

Below the 1 × n vector of integrals Covp[JS (xS ), Zsim] with respect to xS ,

∫ ⊤ q ≡ Covp[JS (xS ), Zsim]g(xS )dxS (5.26)

XS is required. From (5.25), the ith element of q is written as ∫

qi = Covp[JS (xS ), Zsim]ig(xS )dxS X S ∫ ∏ ∏ uk 1 i Y i = sgint(lk, uk, xk; θk) R (xk, xk; θk)g(xk)dxk λY k̸∈S k∈S | lk {z } i sgint(lk,uk,xk;θk) ∏f 1 i = sgint(lk, uk, xk; θk). (5.27) λY k=1

90 ⊤ Also the n×n matrix of integrals Covp[JS (xS ), Zsim] Covp[JS (xS ), Zsim] with respect to xS , ∫ ⊤ C ≡ Covp[JS (xS ), Zsim] Covp[JS (xS ), Zsim]g(xS )dxS (5.28)

XS

is required below. Then from (5.25) the (i, j)th element of C is

(C)ij = Covp[JS (xS ), Zsim]iCovp[JS (xS ), Zsim]jg(xS )dxS

XS ∫ ∏ ∏ uk 1 i j Y i Y j = 2 sgint(lk, uk, xk; θk) sgint(lk, uk, xk; θk) R (xk, xk; θk)R (xk, xk; θk)g(xk)dxk λ l Y k̸∈S k∈S | k {z } i j mxint(lk,uk,x ,x ;θk) ∏ ∏ k k 1 j j = sgint(l , u , xi ; θ ) sgint(l , u , x ; θ ) mxint(l , u , xi , x ; θ ) (5.29) λ2 k k k k k k k k k k k k k Y k̸∈S k∈S

where the integral mxint is derived in Section 5.1.3 for the Gaussian and cubic

correlation functions.

j The joint effect variance VS = V arg[jS (XS )] in (5.10) is estimated by the posterior

j b j P predictive mean of VS given the data zsim, i.e., VS = E [V arg[JS (XS )]|Zsim]. By b j the mean-variance identity, VS is written as

[ ] [ ] b j P 2 P 2 VS = E Eg[JS (XS )]|Zsim − E Eg[JS (XS )] |Zsim [ ] [ ] P 2 P 2 = Eg E [JS (XS )|Zsim] − E Eg[JS (XS )] |Zsim [ ] [ ] P P 2 = Eg V ar [JS (XS )|Zsim] + Eg E [JS (XS )|Zsim] { } P 2 P − Eg[E [JS (XS )|Zsim]] − V ar [Eg[JS (XS )]|Zsim] [ ] [ ] P P P = Eg V ar [JS (XS )|Zsim] + V arg E [JS (XS )|Zsim] − V ar [Eg[JS (XS )]|Zsim]

and hence [ ] [ ] b j P P P VS = Eg CS (XS , XS ) + V arg mS (XS ) − V ar [Eg[JS (XS )]|Zsim] (5.30)

91 using (5.24) and (5.22). The three terms in (5.30) are calculated as follows. The first component of (5.30) is

∫ P P Eg[CS (XS , XS )] = CS (xS , xS )g(xS )dxS X ∫ S J = CS (xS , xS )g(xS )dxS X S ∫ [( ) ] − Z −1 ⊤ trace Σsim Covp[JS (xS ), Zsim] Covp[JS (xS ), Zsim] g(xS )dxS (5.31) XS 1 ∏ = dbint(lk, uk; θk) λY k̸∈S   ∫ ( )−  − Z 1 ⊤ trace  Σsim Covp[JS (xS ), Zsim] Covp[JS (xS ), Zsim]g(xS )dxS  (5.32) XS ∏ [ ] 1 ( )−1 − Z = dbint(lk, uk; θk) trace Σsim C (5.33) λY k̸∈S where (5.31) follows from (5.24), (5.32) follows from (5.20), and (5.33) follows from

(5.28). The second component of (5.30) is

P P P 2 V arg[mS (XS )] = Eg[(mS (XS ) − Eg[mS (XS )]) ] [( ) ] ( ) ( ) 2 − ⊤ Z −1 − Z = Eg (Covp[JS (XS ), Zsim] q ) Σsim zsim msim (5.34) ( ) − Z ⊤ Z −1 = (zsim msim) Σsim [ ] ⊤ ⊤ ⊤ × Eg (Covp[JS (XS ), Zsim] − q ) (Covp[JS (XS ), Zsim] − q ) (5.35) ( ) ( ) × Z −1 − Z Σsim zsim msim ( ) ( ) ( ) − Z ⊤ Z −1 − ⊤ Z −1 − Z = (zsim msim) Σsim (C qq ) Σsim zsim msim (5.36) where (5.34) follows because

92 [ ] ⊤ Eg Covp[JS (XS ), Zsim] = q from (5.26) ( ) ( ) P ⊤ Z −1 − Z Eg[mS (XS )] = β + q Σsim zsim msim from (5.22) ( ) ( ) P − P − ⊤ Z −1 − Z mS (XS ) Eg[mS (XS )] = (Covp[JS (XS ), Zsim] q ) Σsim zsim msim ,

(5.35) follows by algebra, and (5.36) follows because [ ] ⊤ Eg Covp[JS (XS ), Zsim] Covp[JS (XS ), Zsim] = C from (5.28). The third component of (5.30) is

P P V ar [Eg[JS (XS )]|Zsim] = Cov [Eg[JS (XS )],Eg[JS (XS )]|Zsim]   ∫ ∫ P  1 1 1 2 2 2  = Cov  JS (xS )g(xS )dxS , JS (xS )g(xS )dxS Zsim X X ∫ ∫ S S P 1 2 1 2 1 2 = CS (xS , xS )g(xS )g(xS )dxS dxS X X ∫S ∫S J 1 2 1 2 1 2 = CS (xS , xS )g(xS )g(xS )dxS dxS

XS XS ∫ ∫ [( ) ] − Z −1 2 ⊤ 1 1 2 1 2 trace Σsim Covp[JS (xS ), Zsim] Covp[JS (xS ), Zsim] g(xS )g(xS )dxS dxS XS XS (5.37) ∫ ∫ ∏ ∏ uk uk 1 Y 1 2 1 2 1 2 = dbint(lk, uk; θk) R (xk, xk; θk)g(xk)g(xk)dxkdxk λY k̸∈S k∈S | lk lk {z }  dbint(lk,uk;θk)  ∫ ∫ ( )−  − Z 1 2 ⊤ 2 2 1 1 1 trace  Σsim Covp[JS (xS ), Zsim] g(xS )dxS Covp[JS (xS ), Zsim]g(xS )dxS  XS XS (5.38) ∏f [ ] 1 ( )−1 − Z ⊤ = dbint(lk, uk; θk) trace Σsim qq (5.39) λY k=1

93 where (5.37) follows from (5.23), (5.38) follows from (5.19), and (5.39) follows from

b j (5.26). Thus VS in (5.30) can be expressed as { } ∏ [ ] 1 ( )−1 b j − Z VS = dbint(lk, uk; θk) trace Σsim C λY k̸∈S { ( ) ( ) } − Z ⊤ Z −1 − ⊤ Z −1 − Z + (zsim msim) Σsim (C qq ) Σsim (zsim msim) (5.40) { } ∏f [ ] 1 ( )−1 − − Z ⊤ dbint(lk, uk; θk) trace Σsim qq . λY k=1

The estimate Vˆ of the total variance V can be expressed as (5.40) when S =

{1, . . . , f}. The main effect sensitivity index for the individual input xi is estimated

b j by b Vi Si = (5.41) Vb

where S = {i}, and the total effect sensitivity index is estimated by b − b j b V V−i Ti = , (5.42) Vb

b j S { − } where V−i is obtained from (5.40) when = 1, . . . , i 1, i + 1, . . . , f .

5.1.3 The Integrals: sgint, dbint, mxint

Analytical forms of sgint, dbint, and mxint are derived for the Gaussian cor-

relation function and the cubic correlation functions. Let RY (x, η; θ) denote one of

these correlation functions with correlation parameter θ > 0 and two input values x, η ∈ [l, u]. Then the three integrals are ∫ 1 u ≡ Y sgint(l, u, η; θ) − R (x, η; θ)dx u l ∫l ∫ 1 u u dbint(l, u; θ) ≡ RY (x, η; θ)dηdx (5.43) − 2 (u l) l ∫l 1 u ≡ Y Y mxint(l, u, η1, η2; θ) − R (x, η1; θ)R (x, η2; θ)dx. u l l

94 For the Gaussian correlation function,

[ ] RY (x, η; θ) = exp −θ(x − η)2 the closed forms of the three integrals are ∫ 1 u − − 2 sgint(l, u, η; θ) = − exp[ θ(x η) ]dx (u l) l √ ∫ √ ( ) 2θ(u−η) 2 1 √2π √1 −x = − √ exp dx (u l) 2θ 2π 2θ(l−η) 2 √ [ ( ) ( )] 1 π √ √ = Φ (u − η) 2θ − Φ (l − η) 2θ (u − l) θ where Φ(·) denotes a cumulative distribution function (cdf) of the standard normal distribution, ∫ ∫ 1 u u dbint(l, u; θ) = exp[−θ(x − η)2]dηdx − 2 (u l) l l ∫ √ [ ( ) ( )] 1 u π √ √ = (u − l) Φ (u − x) 2θ − Φ (l − x) 2θ dx − 2 (u l) l θ { [ ( ) ] √ [ ( ) ]} 1 1 √ √ π √ = 2πϕ (u − l) 2θ − 1 + (u − l) 2Φ (u − l) 2θ − 1 (u − l)2 θ θ where ϕ(·) denotes a probability density function (pdf) of the standard normal dis- tribution, and ∫ 1 u mxint(l, u, η , η ; θ) = exp[−θ(x − η )2] exp[−θ(x − η )2]dx 1 2 u − l 1 2 ∫ [ ( l ) ] 1 u η + η 2 1 − − 1 2 − − 2 = − exp 2θ x θ(η1 η2) dx u l l 2 2 [ ] ∫ [ ( ) ] 1 1 u η + η 2 = exp − θ(η − η )2 exp −2θ x − 1 2 dx 2 1 2 u − l 2 [ ] (l ) 1 η + η = exp − θ(η − η )2 sgint l, u, 1 2 ; 2θ . 2 1 2 2

95 The cubic correlation function is  ( ) ( )  2 | − | 3  1 − 6 x−η + 6 x η , |x − η| ≤ θ ;  ( θ ) θ 2 3 RY (x, η; θ) = − |x−η| θ ≤ | − | ≤ (5.44)  2 1 θ , 2 x η θ;   0, θ < |x − η|.

The closed forms of the three integrals are as follows.

1) sgint

The integral of RY (x, η; θ) with respect to x depends on the value of η in determining the form of correlation function. Here the two non-zero correlation functions in (5.44) for given parameter θ are denoted by ( ) ( ) x − η 2 |x − η| 3 R (x, η) ≡ RY (x, η; θ) = 1 − 6 + 6 1 1 θ θ ( ) (5.45) |x − η| 3 R (x, η) ≡ RY (x, η; θ) = 2 1 − . 2 2 θ ∫ ∗ − ∗ ≡ u Y Let l = max(η θ, l) and u = min(η + θ, u). Let A l R (x, η; θ)dx, then A sgint(l, u, η; θ) = u−l and A is solved as follows.

− θ ≤ θ Case 1 : l > η 2 and u η + 2 ∫ u − − 2[(u−η)3−(l−η)3] 3[(u−η)4+(l−η)4] A = l R1(x, η)dx = (u l) θ2 + 2θ3

− θ θ Case 2 : l > η 2 and u > η + 2 ∫ ∫ η+ θ u∗ − 3 − 4− − ∗ 4 2 − 3θ 2(l η) 3(l η) (θ u +η) A = R1(x, η)dx + θ R2(x, η)dx = (η l) + + 2 + 3 l η+ 2 8 θ 2θ

≤ − θ ≤ θ Case 3 : l η 2 and u η + 2 ∫ ∫ η− θ u − 3 − 4− ∗− 4 2 − 3θ − 2(u η) 3(u η) (θ+l η) A = ∗ R2(x, η)dx+ − θ R1(x, η)dx = (u η)+ 2 + 3 l η 2 8 θ 2θ

≤ − θ θ Case 4 : l η 2 and u > η + 2 ∫ ∫ ∫ η− θ η+ θ u∗ − ∗ 4 ∗− 4 2 2 3θ − (θ u +η) +(θ+l η) A = l∗ R2(x, η)dx+ − θ R1(x, η)dx+ η+ θ R2(x, η)dx = 4 2θ3 η 2 2

96 2) dbint ∫ ∫ ≡ u u Y B Let B l l R (x, η; θ)dηdx, then dbint(l, u; θ) = (u−l)2 . The B depends on magnitude of u − l and is solved in the following 3 cases illustrated in Figure 5.1.

Case 3

Case 2 η

Case 1

x

Figure 5.1: Description of dbint of the cubic correlation function

Case 1 : u − l > θ ∫ − ∫ ∫ − θ ∫ ∫ ∫ − θ u θ x+θ u 2 u l+θ x 2 B = l x+ θ R2(x, η)dηdx+ u−θ x+ θ R2(x, η)dηdx+ l+ θ l R2(x, η)dηdx ∫ ∫ 2 ∫ ∫ 2 ∫ 2 ∫ u x− θ l+ θ x+ θ u− θ x+ θ + 2 R (x, η)dηdx + 2 2 R (x, η)dηdx + 2 2 R (x, η)dηdx l+θ x−θ 2 l l 1 l+ θ x− θ 1 ∫ ∫ 2 2 u u + − θ − θ R1(x, η)dηdx u 2 x 2 23 2 3 − − = 40 θ + 4 θ( l θ + u)

θ − ≤ Case 2 : 2 < u l θ ∫ − θ ∫ ∫ ∫ − θ ∫ − θ ∫ θ u 2 u u x 2 u 2 x+ 2 B = θ R2(x, η)dηdx+ θ R2(x, η)dηdx+ R1(x, η)dηdx l x+ 2 l+ 2 l l l ∫ θ ∫ ∫ ∫ l+ 2 u u u + − θ l R1(x, η)dηdx + l+ θ x− θ R1(x, η)dηdx u 2 2 2 (l+θ−u)5 3 − − 7 2 = 5θ3 + 4 θ(u l) 40 θ

− ≤ θ Case 3 : u l 2 ∫ ∫ u u − 2 − (l−u)4(3l−3u+5θ) B = l l R1(x, η)dηdx = (l u) 5θ3 97 3) mxint

Y Y The form of R (x, η1; θ)R (x, η2; θ) with respect to x depends on the values of η1 and

η2 as well as x. As in (5.45), denote possible forms of correlation function by ( ) ( ) x − η 2 |x − η | 3 R (η ) ≡ RY (x, η ; θ) = 1 − 6 1 + 6 1 1 1 1 1 θ θ ( ) |x − η | 3 R (η ) ≡ RY (x, η ; θ) = 2 1 − 1 2 1 2 1 θ ( ) ( ) x − η 2 |x − η | 3 R (η ) ≡ RY (x, η ; θ) = 1 − 6 2 + 6 2 1 2 1 2 θ θ ( ) |x − η | 3 R (η ) ≡ RY (x, η ; θ) = 2 1 − 2 2 2 2 2 θ ∫ ≡ u Y Y C Let C l R (x, η1; θ)R (x, η2; θ)dx, then mxint(l, u, η1, η2; θ) = u−l . The C is

solved in the following 5 cases as shown in Figure 5.2. Assume η1 ≤ η2 and let

∗ ∗ δ = η2 − η1, l = max(η2 − θ, l), and u = min(η1 + θ, u). Now the integral has different forms according to the magnitude of δ. Moreover, the integral depends on the location of l and u as done in sgint(l, u, η; θ), so there are many subcases in this case. In each case, it is straightforward to solve the integral in a closed form.

Case 1 : δ = 0

− θ ≤ θ (a) if l > η1 2 and u η1 + 2 , ∫ u C = l R1(η1)R1(η2)dx

− θ θ (b) if l > η1 2 and u > η1 + 2 , ∫ θ ∫ ∗ η1+ 2 u C = R1(η1)R1(η2)dx + θ R2(η1)R2(η2)dx l η1+ 2 ≤ − θ ≤ θ (c) if l η1 2 and u η1 + 2 , ∫ − θ ∫ η1 2 u C = ∗ R2(η1)R2(η2)dx + − θ R1(η1)R1(η2)dx l η1 2 ≤ − θ θ (d) if l η1 2 and u > η1 + 2 , ∫ − θ ∫ η + θ ∫ ∗ η1 2 1 2 u C = ∗ R2(η1)R2(η2)dx+ θ R1(η1)R1(η2)dx+ θ R2(η1)R2(η2)dx l − η1+ η1 2 2

98 R R R R R R Case 1 2 2 1 1 2 2

η −θ=η −θ η −θ/2=η −θ/2 η +θ/2=η +θ/2 η +θ=η +θ 1 2 1 2 1 2 1 2

R R R R R R R R R R Case 2 2 2 1 2 1 1 2 1 2 2

η −θ η −θ/2 η −θ/2 η +θ/2 η +θ/2 η +θ 2 1 2 1 2 1

R R R R R R Case 3 1 2 1 1 2 1

η −θ η −θ/2 η +θ/2 η +θ 2 2 1 1

R R R R R R Case 4 1 2 2 2 2 1

η −θ η +θ/2 η −θ/2 η +θ 2 1 2 1

R R Case 5 2 2

η −θ η +θ 2 1

Y Y Figure 5.2: Description of R (x, η1; θ)R (x, η2; θ) of the cubic correlation function

≤ θ Case 2 : 0 < δ 2

− θ ≤ θ (a) if l > η2 2 and u η1 + 2 , ∫ u C = l R1(η1)R1(η2)dx

− θ θ ≤ θ (b) if l > η2 2 and η1 + 2 < u η2 + 2 , ∫ θ ∫ η1+ 2 u C = R1(η1)R1(η2)dx + θ R2(η1)R1(η2)dx l η1+ 2 − θ θ (c) if l > η2 2 and u > η2 + 2 , ∫ θ ∫ η + θ ∫ ∗ η1+ 2 2 2 u C = R1(η1)R1(η2)dx+ θ R2(η1)R1(η2)dx+ θ R2(η1)R2(η2)dx l η2+ η1+ 2 2 − θ ≤ − θ ≤ θ (d) if η1 2 < l η2 2 and u η1 + 2 , ∫ − θ ∫ η2 2 u C = R1(η1)R2(η2)dx + − θ R1(η1)R1(η2)dx l η2 2

99 − θ ≤ − θ θ ≤ θ (e) if η1 2 < l η2 2 and η1 + 2 < u η2 + 2 , ∫ − θ ∫ η + θ ∫ η2 2 1 2 u C = R1(η1)R2(η2)dx+ θ R1(η1)R1(η2)dx+ θ R2(η1)R1(η2)dx l − η1+ η2 2 2 − θ ≤ − θ θ (f) if η1 2 < l η2 2 and u > η2 + 2 , ∫ θ ∫ θ ∫ θ η − η1+ η2+ C = 2 2 R (η )R (η )dx+ 2 R (η )R (η )dx+ 2 R (η )R (η )dx l 1 1 2 2 η − θ 1 1 1 2 η + θ 2 1 1 2 ∫ 2 2 1 2 u∗ + θ R2(η1)R2(η2)dx η2+ 2 − θ ≤ θ (g) if l < η1 2 and u η1 + 2 , ∫ − θ ∫ η − θ ∫ η1 2 2 2 u C = ∗ R2(η1)R2(η2)dx+ θ R1(η1)R2(η2)dx+ θ R1(η1)R1(η2)dx l − η2− η1 2 2 − θ θ ≤ θ (h) if l < η1 2 and η1 + 2 < u η2 + 2 , ∫ − θ ∫ η − θ ∫ η + θ η1 2 2 2 1 2 C = ∗ R (η )R (η )dx+ R (η )R (η )dx+ R (η )R (η )dx l 2 1 2 2 η − θ 1 1 2 2 η − θ 1 1 1 2 ∫ 1 2 2 2 u + θ R2(η1)R1(η2)dx η1+ 2 − θ θ (i) if l < η1 2 and u > η2 + 2 , ∫ − θ ∫ η − θ ∫ η + θ η1 2 2 2 1 2 C = l∗ R2(η1)R2(η2)dx+ − θ R1(η1)R2(η2)dx+ − θ R1(η1)R1(η2)dx η1 2 η2 2 ∫ θ ∫ ∗ η2+ 2 u + θ R2(η1)R1(η2)dx + θ R2(η1)R2(η2)dx η2+ η1+ 2 2

θ ≤ Case 3 : 2 < δ θ ∫ − θ ∫ η + θ ∫ ∗ η2 2 1 2 u C = ∗ R1(η1)R2(η2)dx+ θ R1(η1)R1(η2)dx+ θ R2(η1)R1(η2)dx l − η1+ η2 2 2

≤ 3 Case 4 : θ < δ 2 θ ∫ θ ∫ η − θ ∫ η1+ 2 2 2 η1+θ C = R1(η1)R2(η2)dx+ θ R2(η1)R2(η2)dx+ θ R2(η1)R1(η2)dx η2−θ η − η1+ 2 2 2

3 ≤ Case 5 : 2 θ < δ 2θ ∫ η1+θ C = R2(η1)R2(η2)dx η2−θ

5.1.4 Example

Here consider an output function with f = 3 quantitative inputs;

y(x1, x2, x3) = x1 + x2 + 3x1x3 (5.46)

100 where x1, x2, x3 ∈ [0, 1]. First the theoretical sensitivity indices are derived following the definition in Section 5.1.1 and then the sensitivity indices are estimated using the data.

Theoretical Derivation

Suppose g(x) = 1 on [0, 1]3, so the overall mean of the output y is ∫ ∫ ∫ 1 1 1 7 y0 = ydx1dx2dx3 = , 0 0 0 4 and the total variance V is ∫ ∫ ∫ 1 1 1 2 41 V = (y − y0) dx1dx2dx3 = . 0 0 0 48

From (5.7), the joint effect functions of the individual inputs are ∫ ∫ 1 1 5x 1 j (x ) = y dx dx = 1 + 1 1 2 3 2 2 ∫0 ∫0 1 1 5 j (x ) = y dx dx = x + 2 2 1 3 2 4 ∫0 ∫0 1 1 3x3 j3(x3) = y dx1dx2 = + 1. 0 0 2

Their variances are ∫ 1 25 V j = (j (x ) − y )2 dx = 1 1 1 0 1 48 ∫0 1 4 V j = (j (x ) − y )2 dx = 2 2 2 0 2 48 ∫0 1 9 j − 2 V3 = (j3(x3) y0) dx3 = 0 48 because E[j1(X1)] = E[j2(X2)] = E[j3(X3)] = y0. So the main effect sensitivity indices of the individual inputs are

25 4 9 S = = 0.6100,S = = 0.0976,S = = 0.2195. (5.47) 1 41 2 41 3 41

101 To compute the total effect sensitivity indices, the joint effect functions are computed ∫ 1 as 3x3 1 j (x , x ) = j− (x− ) = y dx = x + + 23 2 3 1 1 1 2 2 2 ∫0 1 1 j (x , x ) = j− (x− ) = y dx = x + 3x x + 13 1 3 2 2 2 1 1 3 2 ∫0 1 5x1 j12(x1, x2) = j−3(x−3) = y dx2 = + x2. 0 2

Their variances are ∫ ∫ 1 1 j 2 13 V = (j− (x− ) − y ) dx dx = −1 1 1 0 2 3 48 ∫0 ∫0 1 1 j 2 37 V = (j− (x− ) − y ) dx dx = −2 2 2 0 1 3 48 ∫0 ∫0 1 1 29 j − 2 V−3 = (j−3(x−3) y0) dx1dx2 = , 0 0 48 because E[j23(X2,X3) = E[j13(X1,X3)] = E[j12(X1,X2)] = y0. So the total effect sensitivity indices are

41 − 13 41 − 37 41 − 29 T = = 0.6829,T = = 0.0976,T = = 0.2927. (5.48) 1 41 2 41 3 41

Empirical Results

The data for this example were created by sampling n = 30 input values from [0, 1]3. To do this, a 30 × 3 design matrix whose columns range over [0,1] and are uncorrelated each other was generated. The output y was computed at the input values of the design matrix. Using n = 30 data, the GP model was fitted to the output, the estimated model parameters were plugged in the closed form (5.40) to compute the joint effect variances and the total variance, and then the sensitivity indices were computed from (5.41) and (5.42).

The two software, GPM/SA and MPERK codes, were used to compute the sen- sitivity indices. Based on the Bayesian methodology, the GPM/SA code draws the

102 parameters from the posterior distribution of the parameter given data via MCMC sampling under the Gaussian correlation function. It computes the sensitivity indices for each draw of the parameters, and then suggests a mean of these sensitivity indices as a final answer. On the other hand, the MPERK code estimates the model parame- ters by the maximum likelihood. The MPERK code computes the sensitivity indices by plugging the ML estimates of parameters into formula (5.40). The MPERK code can compute the sensitivity indices for both Gaussian and cubic correlation functions.

Table 5.1 shows that the empirical sensitivity indices obtained from the different ap- proaches are close to the theoretical values of the sensitivity indices in (5.47) and

(5.48).

Effect GPM/SA(Gaussian) MPERK(Gaussian) MPERK(Cubic) x1 x2 x3 x1 x2 x3 x1 x2 x3 Main 0.6098 0.0955 0.2200 0.6108 0.0965 0.2199 0.6105 0.0974 0.2194 Total 0.6844 0.0957 0.2947 0.6840 0.0961 0.2929 0.6832 0.0974 0.2921

Table 5.1: Estimated sensitivity indices for the example function in (5.46) using different correlation functions and approaches

5.2 Sensitivity Indices of Mixed Inputs

Here the computer experiments with quantitative and qualitative inputs are con- sidered. Qian, Wu, and Wu (2008) proposed a framework for building GP models that incorporate both types of inputs by constructing correlation functions of the quantitative and qualitative inputs. To predict the output from computer experi- ments having the quantitative and qualitative inputs, Han, Santner, Notz, and Bartel

(2009) developed a Bayesian methodology, Kennedy and O’Hagan (2000) proposed an autoregressive model, and McMillan, Sacks, Welch, and Gao (1999) proposed a

103 proportionality model. Moreover, the methodology developed by Conti and O’Hagan (2006) for predicting the multivariate output can be applied to the prediction of the scalar output when both the quantitative and qualitative inputs exist.

The main goal here is to estimate the sensitivity indices for a qualitative input as well as for quantitative inputs. Prediction models for the output could be considered separately at the different levels of the qualitative input and then combined. In that case, sensitivity indices can be obtained for the different levels, but this increases the number of correlation parameters, so more data are required for precise estimates. Instead, a correlation function involving both the quantitative and qualitative inputs is defined, and then the approach developed in Section 5.1 is followed.

5.2.1 Setup

Suppose the output y(w) = y(x, t) = y(x_1, ..., x_f, t) is a function of f quantitative inputs x_1, ..., x_f and one qualitative input t, where x_k ∈ [l_k, u_k] for k = 1, ..., f and t ∈ {1, 2, ..., ν}. To extend the approach in Section 5.1, set
\[
g(w) = \frac{1}{\nu} \times \prod_{k=1}^{f} \frac{1}{u_k - l_k} \tag{5.49}
\]
so that all inputs are weighted equally.

As in Section 5.1, S ⊂ {1, ..., f} denotes the index set of the quantitative inputs of interest. Similarly, define Q ⊂ {1} to denote the index of the qualitative input. Then A = (S, Q) represents the index set of the mixed inputs of interest. For example, w_A = x_S if Q = ∅ and w_A = (x_S, t) if Q = {1}.

The joint effect of the mixed inputs w_A is defined to be
\[
j_A(w_A) = E_g[y(W) \mid W_A = w_A] =
\begin{cases}
\dfrac{1}{\nu} \sum_{t_*} \int_{\mathcal{X}_{-S}} y(x_S, x_{-S}, t_*)\, g(x_{-S})\, dx_{-S}, & \text{if } Q = \emptyset \\[6pt]
\int_{\mathcal{X}_{-S}} y(x_S, x_{-S}, t)\, g(x_{-S})\, dx_{-S}, & \text{if } Q = \{1\}.
\end{cases} \tag{5.50}
\]

Inference about j_A(w_A) depends on the integration over the quantitative inputs x_{-S} and/or summation over the qualitative input t.

For arbitrary A = (S, Q), define

\[
V^j_A = \mathrm{Var}_g[j_A(W_A)] \tag{5.51}
\]
to be the variance of the joint effect of the mixed inputs. The total variance V is a special case of (5.51) when S = {1, ..., f} and Q = {1}. For clarity, a subscript x_i is used instead of i to indicate the joint effect, joint effect variance, and sensitivity index corresponding to the quantitative input x_i, and the subscript t to indicate the qualitative input. The main effect sensitivity index of input x_i and that of input t are computed by
\[
S_{x_i} = \frac{V^j_{x_i}}{V}, \qquad S_t = \frac{V^j_t}{V}, \tag{5.52}
\]
respectively, where V^j_{x_i} is obtained from (5.51) when S = {i} and Q = ∅, and V^j_t is obtained from (5.51) when S = ∅ and Q = {1}. The total effect sensitivity index of input x_i and that of input t are defined by
\[
T_{x_i} = \frac{V - V^j_{-x_i}}{V}, \qquad T_t = \frac{V - V^j_{-t}}{V}, \tag{5.53}
\]
where V^j_{-x_i} is obtained from (5.51) when S = {1, ..., i−1, i+1, ..., f} and Q = {1}, and V^j_{-t} is obtained from (5.51) when S = {1, ..., f} and Q = ∅.

5.2.2 Correlation Function for Mixed Inputs

As in Section 5.1.2, the unknown function y(w) is treated as a realization of the GP, Y(w), which has constant mean β, variance 1/λ_Y, and covariance function
\[
\mathrm{Cov}_p[Y(w^1), Y(w^2)] = \frac{1}{\lambda_Y} R(w^1, w^2) \tag{5.54}
\]
for any two input sites w^1 = (x^1, t^1) and w^2 = (x^2, t^2).

Assume the code calculation at a specific input site w can be viewed as a realization of
\[
Z_{sim}(w) = Y(w) + \epsilon_{sim}(w),
\]
obtained by adding a white noise process ε_sim(w) to represent numerical noise; ε_sim(w) is assumed to have mean zero and variance 1/λ_ε and to be independent of Y(w). Then the output obtained from n input sites, Z_sim = (Z_sim(w^1), ..., Z_sim(w^n))^⊤, has mean vector m^Z_sim = β1 and covariance matrix
\[
\Sigma^Z_{sim} = \frac{1}{\lambda_Y} R + \frac{1}{\lambda_\epsilon} I,
\]
where the (i, j)th element of the n × n matrix R is R(w^i, w^j) and I is the n × n identity matrix.

In Section 5.2, a separable correlation function for the mixed inputs,
\[
R(w^1, w^2; c, \theta) = R^T(t^1, t^2; c)\, R^Y(x^1, x^2; \theta), \tag{5.55}
\]
is assumed, as done in Qian et al. (2008). For the qualitative input, set
\[
R^T(t^1, t^2; c) =
\begin{cases}
1, & \text{if } t^1 = t^2 \\
c, & \text{if } t^1 \neq t^2,
\end{cases} \tag{5.56}
\]
where 0 < c < 1, following Joseph and Delaney (2007). Here the equal correlation c is assigned to any two different levels, assuming no information about the relative difference between any pair of levels. Then the ν × ν correlation matrix among all pairs of levels of the qualitative input has the compound symmetry form
\[
\begin{pmatrix}
1 & c & \cdots & c \\
c & 1 & \cdots & c \\
\vdots & \vdots & \ddots & \vdots \\
c & c & \cdots & 1
\end{pmatrix}
= (1 - c)I + c\,\mathbf{1}\mathbf{1}^\top.
\]

Then R^T(t^1, t^2; c) is positive definite since
\[
a^\top[(1 - c)I + c\,\mathbf{1}\mathbf{1}^\top]a = (1 - c)a^\top a + c(a^\top \mathbf{1})^2 > 0
\]
for any non-zero vector a. So R^T(t^1, t^2; c) in (5.56) is a valid correlation function.

For the quantitative inputs, the Gaussian correlation function
\[
R^Y(x^1, x^2; \theta) = \prod_{k=1}^{f} \exp\!\left[-\theta_k (x^1_k - x^2_k)^2\right],
\]
which is known to be valid, is used. Then (5.55) is written as
\[
R(w^1, w^2; c, \theta) = \exp[I(t^1 \neq t^2) \ln c]\, \prod_{k=1}^{f} \exp\!\left[-\theta_k (x^1_k - x^2_k)^2\right] \tag{5.57}
\]

because (5.56) can be expressed as R^T(t^1, t^2; c) = exp[I(t^1 ≠ t^2) ln c]. The separable correlation function for the mixed inputs in (5.57) is valid since it is a product of valid correlation functions (Santner et al. (2003), chapter 2).

If any two inputs w^1 and w^2 have different levels of the qualitative input, then the correlation between the outputs at these two input sites is c R^Y(x^1, x^2; θ). Thus

the parameter c measures the similarity between the outputs at any two input values

that differ only in the level of the qualitative input. On the other hand, if any two

inputs have the same level for the qualitative input, then the correlation between the

outputs at these input sites is R^Y(x^1, x^2; θ), so it depends only on the correlation function of the quantitative inputs.
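For concreteness, (5.55)–(5.57) can be evaluated directly. The following MATLAB sketch is an illustration only; it is not taken from the GPM/SA or MPERK codes, and the function name mixedcorr is invented here.

    % Sketch of the separable mixed-input correlation (5.57):
    % R(w1,w2; c,theta) = c^{I(t1 ~= t2)} * prod_k exp(-theta_k (x1_k - x2_k)^2).
    % Illustrative only; not taken from any released code.
    function r = mixedcorr(x1, t1, x2, t2, c, theta)
        rT = 1;
        if t1 ~= t2
            rT = c;                         % compound-symmetry part (5.56)
        end
        rY = exp(-sum(theta(:)' .* (x1(:)' - x2(:)').^2));  % Gaussian part
        r  = rT * rY;
    end

For example, with θ = (1, 2) and c = 0.4, mixedcorr([0.1 0.3], 1, [0.2 0.5], 2, 0.4, [1 2]) returns 0.4 · exp(−0.09) ≈ 0.366.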

5.2.3 Estimation of Sensitivity Indices for Mixed Inputs

Analogous to the joint effect function j_A(w_A) defined for y(·), one can view
\[
J_A(w_A) =
\begin{cases}
\dfrac{1}{\nu} \sum_{t_*} \int_{\mathcal{X}_{-S}} Y(x_S, x_{-S}, t_*)\, g(x_{-S})\, dx_{-S}, & \text{if } Q = \emptyset \\[6pt]
\int_{\mathcal{X}_{-S}} Y(x_S, x_{-S}, t)\, g(x_{-S})\, dx_{-S}, & \text{if } Q = \{1\}
\end{cases} \tag{5.58}
\]

as an estimate of j_A(w_A) defined in (5.50), where A = (S, Q). Note that J_A(w_A) and j_A(w_A) depend on w = (x_S, x_{-S}, t) or (x_S, x_{-S}) only through (x_S, t) or x_S, according as Q = {1} or Q = ∅, respectively. Viewed as a process with index w_A, J_A(w_A) is a GP with mean function
\[
m^J_A(w_A) = E_p[J_A(w_A)] = \beta \tag{5.59}
\]

and covariance function
\[
C^J_A(w^1_A, w^2_A) \equiv \mathrm{Cov}_p[J_A(w^1_A), J_A(w^2_A)]
= \begin{cases}
\dfrac{1}{\nu^2}\displaystyle\sum_{t^1_*}\sum_{t^2_*} R^T(t^1_*, t^2_*; c)\, C^J_S(x^1_S, x^2_S), & \text{if } Q = \emptyset \\[6pt]
R^T(t^1, t^2; c)\, C^J_S(x^1_S, x^2_S), & \text{if } Q = \{1\},
\end{cases} \tag{5.60}
\]
where \(C^J_S(x^1_S, x^2_S) = \int_{\mathcal{X}_{-S}}\!\int_{\mathcal{X}_{-S}} \mathrm{Cov}_p[Y(x^1), Y(x^2)]\, g(x^1_{-S})\, g(x^2_{-S})\, dx^1_{-S}\, dx^2_{-S}\), so that
\[
C^J_A(w^1_A, w^2_A)
= \begin{cases}
\dfrac{1 + c(\nu - 1)}{\nu}\, \dfrac{1}{\lambda_Y} \displaystyle\prod_{k \notin S} \mathrm{dbint}(l_k, u_k; \theta_k) \prod_{k \in S} R^Y(x^1_k, x^2_k; \theta_k), & \text{if } Q = \emptyset \\[8pt]
\exp[I(t^1 \neq t^2) \ln c]\, \dfrac{1}{\lambda_Y} \displaystyle\prod_{k \notin S} \mathrm{dbint}(l_k, u_k; \theta_k) \prod_{k \in S} R^Y(x^1_k, x^2_k; \theta_k), & \text{if } Q = \{1\},
\end{cases} \tag{5.61}
\]

where (5.60) follows because
\[
\mathrm{Cov}_p[Y(w^1), Y(w^2)] = R^T(t^1, t^2; c)\, \mathrm{Cov}_p[Y(x^1), Y(x^2)] \tag{5.62}
\]
due to the separable correlation function in (5.57), and (5.61) follows from (5.19) and
\[
\sum_{t^1_*} \sum_{t^2_*} R^T(t^1_*, t^2_*; c) = \sum_{t^1_*} \sum_{t^2_*} \exp[I(t^1_* \neq t^2_*) \ln c] = \nu + c\nu(\nu - 1). \tag{5.63}
\]

So the covariance function of J_A(w_A) in (5.61) is written as a function of the covariance function of J_S(x_S) in (5.19). In particular, the variance function of J_A(w_A) is
\[
C^J_A(w_A, w_A) =
\begin{cases}
\dfrac{1 + c(\nu - 1)}{\nu}\, \dfrac{1}{\lambda_Y} \displaystyle\prod_{k \notin S} \mathrm{dbint}(l_k, u_k; \theta_k), & \text{if } Q = \emptyset \\[8pt]
\dfrac{1}{\lambda_Y} \displaystyle\prod_{k \notin S} \mathrm{dbint}(l_k, u_k; \theta_k), & \text{if } Q = \{1\}.
\end{cases} \tag{5.64}
\]

Similar to Section 5.1.2, define J^P_A(w_A) to be the process whose distribution is that of the posterior predictive process of J_A(w_A) given the n code calculations z_sim = (z_sim(w^1), ..., z_sim(w^n))^⊤, where w^i = (x^i, t^i), 1 ≤ i ≤ n. Then J^P_A(w_A) is a GP with mean function
\[
m^P_A(w_A) = \beta + \mathrm{Cov}_p[J_A(w_A), Z_{sim}] \left(\Sigma^Z_{sim}\right)^{-1} \left(z_{sim} - m^Z_{sim}\right) \tag{5.65}
\]

and covariance function
\[
C^P_A(w^1_A, w^2_A) = C^J_A(w^1_A, w^2_A) - \mathrm{trace}\!\left[\left(\Sigma^Z_{sim}\right)^{-1} \mathrm{Cov}_p[J_A(w^2_A), Z_{sim}]^\top\, \mathrm{Cov}_p[J_A(w^1_A), Z_{sim}]\right] \tag{5.66}
\]
and, in particular, variance function
\[
C^P_A(w_A, w_A) = C^J_A(w_A, w_A) - \mathrm{trace}\!\left[\left(\Sigma^Z_{sim}\right)^{-1} \mathrm{Cov}_p[J_A(w_A), Z_{sim}]^\top\, \mathrm{Cov}_p[J_A(w_A), Z_{sim}]\right]. \tag{5.67}
\]

To determine a more explicit expression for (5.65)–(5.67), note that for w = (x_S, x_{-S}) or (x_S, x_{-S}, t), according as Q = ∅ or Q = {1}, Cov_p[J_A(w_A), Z_sim] is a 1 × n vector with ith element
\[
\mathrm{Cov}_p[J_A(w_A), Z_{sim}]_i = \mathrm{Cov}_p[J_A(w_A), Y(w^i) + \epsilon(w^i)]
= \begin{cases}
\dfrac{1}{\nu}\displaystyle\sum_{t_*} R^T(t_*, t^i; c) \int_{\mathcal{X}_{-S}} \mathrm{Cov}_p[Y(x_S, x_{-S}), Y(w^i)]\, g(x_{-S})\, dx_{-S}, & \text{if } Q = \emptyset \\[8pt]
R^T(t, t^i; c) \displaystyle\int_{\mathcal{X}_{-S}} \mathrm{Cov}_p[Y(x_S, x_{-S}), Y(w^i)]\, g(x_{-S})\, dx_{-S}, & \text{if } Q = \{1\}
\end{cases}
\]
\[
= \begin{cases}
\dfrac{1 + c(\nu - 1)}{\nu}\, \mathrm{Cov}_p[J_S(x_S), Z_{sim}]_i, & \text{if } Q = \emptyset \\[6pt]
\exp[I(t \neq t^i) \ln c]\, \mathrm{Cov}_p[J_S(x_S), Z_{sim}]_i, & \text{if } Q = \{1\}
\end{cases} \tag{5.68}
\]

from (5.25) and
\[
\sum_{t_*=1}^{\nu} R^T(t_*, t^i; c) = \sum_{t_*=1}^{\nu} \exp[I(t_* \neq t^i) \ln c] = 1 + c(\nu - 1). \tag{5.69}
\]

Below, the 1 × n vector of integrals (or sums) of Cov_p[J_A(w_A), Z_sim] with respect to x_S (or (x_S, t)) if Q = ∅ (or Q = {1}), i.e.,
\[
q_A^\top =
\begin{cases}
\displaystyle\int_{\mathcal{X}_S} \mathrm{Cov}_p[J_A(w_A), Z_{sim}]\, g(x_S)\, dx_S, & \text{if } Q = \emptyset \\[8pt]
\dfrac{1}{\nu}\displaystyle\sum_{t_*} \int_{\mathcal{X}_S} \mathrm{Cov}_p[J_A(w_A), Z_{sim}]\, g(x_S)\, dx_S, & \text{if } Q = \{1\},
\end{cases} \tag{5.70}
\]
is required. From (5.27), (5.68), and (5.69), the ith element of q_A is written as
\[
(q_A)_i =
\begin{cases}
\dfrac{1 + c(\nu - 1)}{\nu} \displaystyle\int_{\mathcal{X}_S} \mathrm{Cov}_p[J_S(x_S), Z_{sim}]_i\, g(x_S)\, dx_S, & \text{if } Q = \emptyset \\[8pt]
\dfrac{1}{\nu}\displaystyle\sum_{t_*} \exp[I(t_* \neq t^i) \ln c] \int_{\mathcal{X}_S} \mathrm{Cov}_p[J_S(x_S), Z_{sim}]_i\, g(x_S)\, dx_S, & \text{if } Q = \{1\}
\end{cases}
= \frac{1 + c(\nu - 1)}{\nu}\, q_i,
\]

where q_i is defined by (5.27). The n × n matrix of integrals (or sums) of Cov_p[J_A(w_A), Z_sim]^⊤ Cov_p[J_A(w_A), Z_sim] with respect to x_S or (x_S, t), according as Q = ∅ or Q = {1}, respectively,
\[
C_A =
\begin{cases}
\displaystyle\int_{\mathcal{X}_S} \mathrm{Cov}_p[J_A(w_A), Z_{sim}]^\top\, \mathrm{Cov}_p[J_A(w_A), Z_{sim}]\, g(x_S)\, dx_S, & \text{if } Q = \emptyset \\[8pt]
\dfrac{1}{\nu}\displaystyle\sum_{t_*} \int_{\mathcal{X}_S} \mathrm{Cov}_p[J_A(w_A), Z_{sim}]^\top\, \mathrm{Cov}_p[J_A(w_A), Z_{sim}]\, g(x_S)\, dx_S, & \text{if } Q = \{1\},
\end{cases} \tag{5.71}
\]

is also required below. Then from (5.29) and (5.68), the (i, j)th element of C_A is written as
\[
(C_A)_{ij} =
\begin{cases}
\left(\dfrac{1 + c(\nu - 1)}{\nu}\right)^{2} (C)_{ij}, & \text{if } Q = \emptyset \\[6pt]
\dfrac{1 + c^2(\nu - 1)}{\nu}\, (C)_{ij}, & \text{if } Q = \{1\} \text{ and } t^i = t^j \\[6pt]
\dfrac{2c + c^2(\nu - 2)}{\nu}\, (C)_{ij}, & \text{if } Q = \{1\} \text{ and } t^i \neq t^j,
\end{cases}
\]

where (C)_{ij} is defined by (5.29).

Similar to (5.30) in Section 5.1.2, the joint effect variance V^j_A = Var_g[j_A(W_A)] is estimated by the posterior predictive mean of V^j_A given the data z_sim, i.e., V̂^j_A = E^P[Var_g[J_A(W_A)] | Z_sim]. Then V̂^j_A is
\[
\hat{V}^j_A = E_g[C^P_A(W_A, W_A)] + \mathrm{Var}_g[m^P_A(W_A)] - \mathrm{Var}^P[E_g[J_A(w_A)] \mid Z_{sim}] \tag{5.72}
\]

using an argument analogous to that needed to derive (5.30). The three terms in (5.72) are calculated as follows. The first component of (5.72) is
\[
E_g[C^P_A(W_A, W_A)] =
\begin{cases}
\displaystyle\int_{\mathcal{X}_S} C^P_A(w_A, w_A)\, g(x_S)\, dx_S, & \text{if } Q = \emptyset \\[8pt]
\dfrac{1}{\nu}\displaystyle\sum_{t_*} \int_{\mathcal{X}_S} C^P_A(w_A, w_A)\, g(x_S)\, dx_S, & \text{if } Q = \{1\}
\end{cases} \tag{5.73}
\]
\[
= \begin{cases}
\dfrac{1 + c(\nu - 1)}{\nu}\, \dfrac{1}{\lambda_Y} \displaystyle\prod_{k \notin S} \mathrm{dbint}(l_k, u_k; \theta_k) - \mathrm{trace}\!\left[\left(\Sigma^Z_{sim}\right)^{-1} C_A\right], & \text{if } Q = \emptyset \\[8pt]
\dfrac{1}{\lambda_Y} \displaystyle\prod_{k \notin S} \mathrm{dbint}(l_k, u_k; \theta_k) - \mathrm{trace}\!\left[\left(\Sigma^Z_{sim}\right)^{-1} C_A\right], & \text{if } Q = \{1\}
\end{cases} \tag{5.74}
\]
\[
= \left[\frac{1 + c(\nu - 1)}{\nu}\, I(Q = \emptyset) + I(Q = \{1\})\right] \frac{1}{\lambda_Y} \prod_{k \notin S} \mathrm{dbint}(l_k, u_k; \theta_k) - \mathrm{trace}\!\left[\left(\Sigma^Z_{sim}\right)^{-1} C_A\right], \tag{5.75}
\]
where (5.73) follows from (5.67), and (5.74) follows from (5.64) and (5.71). The second component of (5.72) is

\[
\mathrm{Var}_g[m^P_A(W_A)] = E_g\!\left[\left(m^P_A(W_A) - E_g[m^P_A(W_A)]\right)^2\right]
\]
\[
= E_g\!\left[\left(\left(\mathrm{Cov}_p[J_A(W_A), Z_{sim}] - q_A^\top\right)\left(\Sigma^Z_{sim}\right)^{-1}\left(z_{sim} - m^Z_{sim}\right)\right)^{2}\right] \tag{5.76}
\]
\[
= \left(z_{sim} - m^Z_{sim}\right)^\top \left(\Sigma^Z_{sim}\right)^{-1} E_g\!\left[\left(\mathrm{Cov}_p[J_A(W_A), Z_{sim}] - q_A^\top\right)^\top \left(\mathrm{Cov}_p[J_A(W_A), Z_{sim}] - q_A^\top\right)\right] \left(\Sigma^Z_{sim}\right)^{-1} \left(z_{sim} - m^Z_{sim}\right) \tag{5.77}
\]
\[
= \left(z_{sim} - m^Z_{sim}\right)^\top \left(\Sigma^Z_{sim}\right)^{-1} \left(C_A - q_A q_A^\top\right) \left(\Sigma^Z_{sim}\right)^{-1} \left(z_{sim} - m^Z_{sim}\right) \tag{5.78}
\]

where (5.76) follows because
\[
E_g\!\left[\mathrm{Cov}_p[J_A(W_A), Z_{sim}]\right] = q_A^\top \quad \text{from (5.70)},
\]
\[
E_g[m^P_A(W_A)] = \beta + q_A^\top \left(\Sigma^Z_{sim}\right)^{-1} \left(z_{sim} - m^Z_{sim}\right) \quad \text{from (5.65)},
\]
\[
m^P_A(W_A) - E_g[m^P_A(W_A)] = \left(\mathrm{Cov}_p[J_A(W_A), Z_{sim}] - q_A^\top\right)\left(\Sigma^Z_{sim}\right)^{-1}\left(z_{sim} - m^Z_{sim}\right),
\]
Equation (5.77) follows by algebra, and (5.78) follows because
\[
E_g\!\left[\mathrm{Cov}_p[J_A(W_A), Z_{sim}]^\top\, \mathrm{Cov}_p[J_A(W_A), Z_{sim}]\right] = C_A \quad \text{from (5.71)}.
\]
The third component of (5.72) is

\[
\mathrm{Var}^P[E_g[J_A(W_A)] \mid Z_{sim}]
= \begin{cases}
\displaystyle\int_{\mathcal{X}_S}\!\int_{\mathcal{X}_S} C^P_A(w^1_A, w^2_A)\, g(x^1_S)\, g(x^2_S)\, dx^1_S\, dx^2_S, & \text{if } Q = \emptyset \\[8pt]
\dfrac{1}{\nu^2}\displaystyle\sum_{t^1_*}\sum_{t^2_*} \int_{\mathcal{X}_S}\!\int_{\mathcal{X}_S} C^P_A(w^1_A, w^2_A)\, g(x^1_S)\, g(x^2_S)\, dx^1_S\, dx^2_S, & \text{if } Q = \{1\}
\end{cases} \tag{5.79}
\]
\[
= \frac{1 + c(\nu - 1)}{\nu}\, \frac{1}{\lambda_Y} \prod_{k=1}^{f} \mathrm{dbint}(l_k, u_k; \theta_k) - \mathrm{trace}\!\left[\left(\Sigma^Z_{sim}\right)^{-1} q_A q_A^\top\right] \tag{5.80}
\]

where (5.79) follows from (5.66), and (5.80) follows from (5.70), (5.63), and
\[
\int_{\mathcal{X}_S}\!\int_{\mathcal{X}_S} C^J_A(w^1_A, w^2_A)\, g(x^1_S)\, g(x^2_S)\, dx^1_S\, dx^2_S
= \begin{cases}
\dfrac{1 + c(\nu - 1)}{\nu}\, \dfrac{1}{\lambda_Y} \displaystyle\prod_{k=1}^{f} \mathrm{dbint}(l_k, u_k; \theta_k), & \text{if } Q = \emptyset \\[8pt]
\exp[I(t^1 \neq t^2) \ln c]\, \dfrac{1}{\lambda_Y} \displaystyle\prod_{k=1}^{f} \mathrm{dbint}(l_k, u_k; \theta_k), & \text{if } Q = \{1\},
\end{cases}
\]
from (5.61), using \(\int_{l_k}^{u_k}\!\int_{l_k}^{u_k} R^Y(x^1_k, x^2_k; \theta_k)\, g(x^1_k)\, g(x^2_k)\, dx^1_k\, dx^2_k = \mathrm{dbint}(l_k, u_k; \theta_k)\) for k ∈ S. Thus V̂^j_A in (5.72) is written in the closed form
\[
\begin{aligned}
\hat{V}^j_A ={}& \left\{\left[\frac{1 + c(\nu - 1)}{\nu}\, I(Q = \emptyset) + I(Q = \{1\})\right] \frac{1}{\lambda_Y} \prod_{k \notin S} \mathrm{dbint}(l_k, u_k; \theta_k) - \mathrm{trace}\!\left[\left(\Sigma^Z_{sim}\right)^{-1} C_A\right]\right\} \\
&+ \left\{\left(z_{sim} - m^Z_{sim}\right)^\top \left(\Sigma^Z_{sim}\right)^{-1} \left(C_A - q_A q_A^\top\right) \left(\Sigma^Z_{sim}\right)^{-1} \left(z_{sim} - m^Z_{sim}\right)\right\} \\
&- \left\{\frac{1 + c(\nu - 1)}{\nu}\, \frac{1}{\lambda_Y} \prod_{k=1}^{f} \mathrm{dbint}(l_k, u_k; \theta_k) - \mathrm{trace}\!\left[\left(\Sigma^Z_{sim}\right)^{-1} q_A q_A^\top\right]\right\}.
\end{aligned} \tag{5.81}
\]

The estimate V̂ of the total variance can be expressed as (5.81) when S = {1, ..., f} and Q = {1}. The main effect sensitivity index of input x_i and that of input t are estimated by
\[
\hat{S}_{x_i} = \frac{\hat{V}^j_{x_i}}{\hat{V}}, \qquad \hat{S}_t = \frac{\hat{V}^j_t}{\hat{V}}, \tag{5.82}
\]
where V̂^j_{x_i} is obtained from (5.81) when S = {i} and Q = ∅, and V̂^j_t is obtained from (5.81) when S = ∅ and Q = {1}. The total effect sensitivity index of input x_i and that of input t are estimated by
\[
\hat{T}_{x_i} = \frac{\hat{V} - \hat{V}^j_{-x_i}}{\hat{V}}, \qquad \hat{T}_t = \frac{\hat{V} - \hat{V}^j_{-t}}{\hat{V}}, \tag{5.83}
\]
where V̂^j_{-x_i} is obtained from (5.81) when S = {1, ..., i−1, i+1, ..., f} and Q = {1}, and V̂^j_{-t} is obtained from (5.81) when S = {1, ..., f} and Q = ∅. Note that the above approach for computing the sensitivity indices of the mixed inputs can be used with correlation functions for the qualitative input other than (5.56). For example, a correlation function that assigns different correlations to different pairs of levels of the qualitative input can be used; then sums of the correlation function such as (5.63) and (5.69) change accordingly. However, the assumption of a separable correlation function for the mixed inputs must hold for the above derivation to be valid.

5.2.4 Example

Here consider an output function with f = 2 quantitative inputs and one qualitative input,
\[
y = (x_1 + x_2)\, I(t = 1) + 3x_1\, I(t = 2), \tag{5.84}
\]

where x1, x2 ∈ [0, 1] and t ∈ {1, 2}. First of all, the theoretical sensitivity indices are derived using the definition in Section 5.2.1, and these values are compared with the empirical sensitivity indices using the data.

Theoretical Derivation

Using function averages, i.e., g(w) = g(x_1, x_2, t) = 1/2 on [0, 1]^2 for each level of the qualitative input, the overall mean of the output y is
\[
y_0 = \frac{1}{2} \sum_{t=1}^{2} \int_0^1\!\!\int_0^1 y\, dx_1\, dx_2 = \frac{5}{4},
\]
and the total variance V is
\[
V = \frac{1}{2} \sum_{t=1}^{2} \int_0^1\!\!\int_0^1 (y - y_0)^2\, dx_1\, dx_2 = \frac{25}{48}.
\]

From (5.50), the joint effect functions of the individual inputs are
\[
j_{x_1} = \frac{1}{2}\sum_{t=1}^{2} \int_0^1 y\, dx_2 = 2x_1 + \frac{1}{4}, \qquad
j_{x_2} = \frac{1}{2}\sum_{t=1}^{2} \int_0^1 y\, dx_1 = \frac{x_2}{2} + 1, \qquad
j_t = \int_0^1\!\!\int_0^1 y\, dx_1\, dx_2 = I(t = 1) + \frac{3\,I(t = 2)}{2},
\]
and their variances are
\[
V^j_{x_1} = \int_0^1 (j_{x_1} - y_0)^2\, dx_1 = \frac{16}{48}, \qquad
V^j_{x_2} = \int_0^1 (j_{x_2} - y_0)^2\, dx_2 = \frac{1}{48}, \qquad
V^j_t = \frac{1}{2}\sum_{t=1}^{2} (j_t - y_0)^2 = \frac{3}{48},
\]
so the main effect sensitivity indices of the individual inputs are

\[
S_{x_1} = \frac{16}{25} = 0.6400, \qquad S_{x_2} = \frac{1}{25} = 0.0400, \qquad S_t = \frac{3}{25} = 0.1200. \tag{5.85}
\]

To compute the total effect sensitivity indices, the joint effect functions are computed as
\[
j_{-x_1} = \int_0^1 y\, dx_1 = \left(x_2 + \frac{1}{2}\right) I(t = 1) + \frac{3}{2}\, I(t = 2),
\]
\[
j_{-x_2} = \int_0^1 y\, dx_2 = \left(x_1 + \frac{1}{2}\right) I(t = 1) + 3x_1\, I(t = 2),
\]
\[
j_{-t} = \frac{1}{2}\sum_{t=1}^{2} y = 2x_1 + \frac{x_2}{2}.
\]

Then their variances are
\[
V^j_{-x_1} = \frac{1}{2}\sum_{t=1}^{2} \int_0^1 (j_{-x_1} - y_0)^2\, dx_2 = \frac{5}{48}, \qquad
V^j_{-x_2} = \frac{1}{2}\sum_{t=1}^{2} \int_0^1 (j_{-x_2} - y_0)^2\, dx_1 = \frac{23}{48},
\]
\[
V^j_{-t} = \int_0^1\!\!\int_0^1 (j_{-t} - y_0)^2\, dx_1\, dx_2 = \frac{17}{48},
\]

so the total effect sensitivity indices are

\[
T_{x_1} = \frac{25 - 5}{25} = 0.8000, \qquad T_{x_2} = \frac{25 - 23}{25} = 0.0800, \qquad T_t = \frac{25 - 17}{25} = 0.3200. \tag{5.86}
\]
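These values can also be confirmed numerically before any model fitting; below is a minimal Monte Carlo sketch in MATLAB that assumes only the function (5.84) itself (the estimator choices, sample size, and seed are illustrative).

    % Monte Carlo check of (5.85)-(5.86) for y = (x1+x2) I(t=1) + 3 x1 I(t=2).
    % Verification sketch only.
    rng(0);
    N  = 1e6;
    f  = @(x1,x2,t) (x1 + x2).*(t == 1) + 3.*x1.*(t == 2);

    x1 = rand(N,1);  x2 = rand(N,1);  t = randi(2, N, 1);
    y  = f(x1, x2, t);
    V  = var(y);                                % approx 25/48

    % Main effects (pick-and-freeze): redraw everything except the input of interest
    Sx1 = (mean(y .* f(x1, rand(N,1), randi(2,N,1))) - mean(y)^2) / V;  % ~0.64
    Sx2 = (mean(y .* f(rand(N,1), x2, randi(2,N,1))) - mean(y)^2) / V;  % ~0.04
    St  = (mean(y .* f(rand(N,1), rand(N,1), t))     - mean(y)^2) / V;  % ~0.12

    % Total effects (Jansen): redraw only the input of interest
    Tx1 = mean((y - f(rand(N,1), x2, t)).^2) / (2*V);                   % ~0.80
    Tx2 = mean((y - f(x1, rand(N,1), t)).^2) / (2*V);                   % ~0.08
    Tt  = mean((y - f(x1, x2, randi(2,N,1))).^2) / (2*V);               % ~0.32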

Empirical Results

To create data for an empirical study, a 30 × 3 design matrix was generated whose first two columns range over [0, 1] and are uncorrelated, and whose last column has levels 1 or 2, each with probability 0.5. The output was computed at the input values of the design matrix. Using the n = 30 data, the GP model was fitted and the sensitivity indices were then estimated using the estimates of the model parameters.

For the correlation function (5.57) of the mixed inputs, there are three correlation parameters to estimate: θ_1, θ_2, and c. These correlation parameters were estimated by maximum likelihood using a modified version of the MPERK code (see Section 8.2 for details). The estimated parameters were plugged into (5.81) and then into (5.82) and (5.83) to get the main effect and total effect sensitivity indices. The estimated sensitivity indices in Table 5.2 are very close to the theoretical values given in (5.85) and (5.86).

Effect   x1       x2       t
Main     0.6399   0.0401   0.1202
Total    0.7998   0.0800   0.3201

Table 5.2: Estimated sensitivity indices for the example function in (5.84)

CHAPTER 6

ALGORITHMS FOR GENERATING MAXIMIN LATIN

HYPERCUBE AND ORTHOGONAL DESIGNS

This chapter proposes new algorithms for generating maximin designs. Various proposals for implementing the maximin criterion for space-filling designs for computer experiments are reviewed. A new, well-performing algorithm is presented for the construction of maximin Latin hypercube designs under a 2-dimensional maximin Euclidean distance criterion. An additional criterion, design orthogonality, is important for estimation of the effects of the inputs. A new algorithm for determining orthogonal maximin designs is proposed, and it is shown to outperform existing algorithms.

6.1 Introduction

A computer model is a computer code that implements a mathematical model of a physical process. A computer experiment is the use of the computer code as an experimental tool in which the experimenter seeks to determine the computational

“response” of the code to the inputs. Because a computer code may take hours or days to produce a single output, a flexible nonparametric predictor is often fitted to the outputs to provide a rapidly-computable surrogate predictor (a metamodel for the code) which can be used to explore the experimental region in detail. The performance of the predictor depends upon the choice of the training design points used to develop the predictor; in particular, the predictive performance depends on how well these training inputs are spread throughout the experimental region.

The output from most computer codes is deterministic and, hence, no replications are required at, or near, any previously run value. In particular, statements of predictive uncertainty cannot be obtained using replication. Designs for which the points are unreplicated and well spread out are called “space-filling.”

To be specific, let 𝒳 be a k-dimensional input region of interest, where k is the number of inputs, and let x ∈ 𝒳 be a design point at which the computer code will be run. Let D(n, k) denote the set of possible designs, each design consisting of n distinct points selected from 𝒳. A design in D(n, k) will be represented by an n × k matrix X with rows x_1^⊤, x_2^⊤, ..., x_n^⊤ specifying the n design points, and columns ξ_1, ..., ξ_k specifying the input values for each of the k inputs in the n runs. Throughout this chapter it is assumed that each input has been scaled to [0, 1] and that the inputs are functionally independent, so that 𝒳 = [0, 1]^k.

McKay et al. (1979) introduced Latin hypercube designs for use in computer experiments. In its simplest form, the n × k design matrix X of a Latin hypercube design (LHD) has hth column ξ_h = (ξ_{1h}, ξ_{2h}, ..., ξ_{nh})^⊤, which can be obtained from a random permutation π_h = (π_{1h}, π_{2h}, ..., π_{nh})^⊤ of 1, ..., n. Then ξ_{ih} is the midpoint of the interval [(π_{ih} − 1)/n, π_{ih}/n]. In a slightly more sophisticated approach, a random point in this interval may be taken; the latter procedure is used by the algorithms proposed in this chapter.
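A randomly generated LHD of this kind takes only a few lines of code; the following MATLAB sketch (an illustration, not the software released with this dissertation; the function name randlhd is invented here) implements the random-point-in-interval variant.

    % Random n x k Latin hypercube design, with a random point taken in each
    % of the n equi-spaced intervals (rather than the midpoint). Sketch only.
    function X = randlhd(n, k)
        X = zeros(n, k);
        for h = 1:k
            p = randperm(n)';                  % random permutation pi_h
            X(:,h) = (p - rand(n,1)) ./ n;     % random point in ((pi-1)/n, pi/n)
        end
    end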

All LHDs have the one-dimensional space-filling property that an observation is taken in every one of the n evenly spaced intervals over the [0, 1] range of each input. However, they need not have space-filling properties in higher dimensions.

Consequently, there has been much work in the literature on deriving methods for selecting LHDs which also have good higher-dimensional projection properties. Several distance metrics and associated “maximin” criteria for achieving such space-fillingness are discussed in Section 6.2. Some existing algorithms and a new algorithm for the generation of maximin LHDs are described in Section 6.3. Another desirable property of a design for a computer experiment is that of orthogonality, where the design matrix X has uncorrelated columns. Ye (1998) proposed a construction method for orthogonal LHDs with n = 2^m + 1 runs and k = 2m − 2 input variables, and used an improvement algorithm for selecting designs within this class under space-filling and other criteria. Ye (1998)'s method was extended by Cioppa and Lucas (2007) to achieve nearly-orthogonal and space-filling LHDs for up to m + (m − 1)(m − 2)/2 factors. Other combinatorial construction methods and algorithmic search methods for orthogonal and nearly orthogonal LHDs have been proposed by, for example, Owen (1994), Tang (1998), Butler (2001), Steinberg and Lin (2006), Sun, Liu, and Lin (2009), Lin et al. (2009), and Bingham et al. (2009).

Joseph and Hung (2008) proposed an exchange algorithm for efficient generation of

LHDs under a weighted combination of orthogonality and space-filling criteria. Their method is described in Section 6.4, together with a new efficient algorithm for achieving orthogonal maximin designs using Gram-Schmidt orthogonalization. The various algorithms are compared in Section 6.5. It is shown that the proposed maximin LHD algorithm performs better than that of Forrester et al. (2008) under a 2-dimensional distance metric and that the new algorithm for orthogonal maximin designs outperforms that of Joseph and Hung (2008) under a variety of criteria.

Conclusions and discussion are given in Section 6.6.

6.2 Maximin Criteria for Space-filling Designs

The “maximin distance” design criterion was first introduced by Johnson et al.

(1990). The method of Morris and Mitchell (1995) for finding a design that is optimal

according to this criterion is described.

For a given n × k design X = [x_1, ..., x_n]^⊤ ∈ D(n, k), let d^{(k)}(x_i, x_j) denote a k-dimensional distance between x_i and x_j for a given metric, such as the k-dimensional rectangular or Euclidean distances, which are
\[
d^{(k)}_R(x_i, x_j) = \sum_{h=1}^{k} |x_{ih} - x_{jh}| \qquad \text{and} \qquad d^{(k)}_E(x_i, x_j) = \sqrt{\sum_{h=1}^{k} (x_{ih} - x_{jh})^2}, \tag{6.1}
\]

respectively. For the selected metric, let d_1 be the minimum inter-point distance over all pairs of points in design X, and let J_1 denote the number of pairs of points that are distance d_1 apart (the “index” of d_1). Johnson et al. (1990) defined a design to be maximin if the design maximizes d_1 over all designs in D(n, k) and, among such designs, has minimum index J_1.

Morris and Mitchell (1995) refined the definition of a maximin design by considering distances other than d_1. Let d^{(k)}_1 < d^{(k)}_2 < ... < d^{(k)}_m denote the distinct distances d^{(k)}(x_i, x_j) between all n(n − 1)/2 pairs of points in design X. Define J_h to be the number of pairs of points in X separated by distance d^{(k)}_h, 1 ≤ h ≤ m. Morris and Mitchell (1995) defined X to be a maximin design in D(n, k) if X sequentially maximizes {d^{(k)}_1, J_1^{-1}, d^{(k)}_2, J_2^{-1}, ..., d^{(k)}_m, J_m^{-1}}. Note that if the design points are selected at random within the one-dimensional equi-spaced intervals, ties in the minimum distance value rarely occur, so that d^{(k)}_min = d^{(k)}_1 is sufficient to define the maximin design without considering d^{(k)}_2 < d^{(k)}_3 < ... < d^{(k)}_m.

Morris and Mitchell (1995) proposed finding an (approximate) maximin design X^{(k)}_{Mm} for k-dimensional rectangular or Euclidean distance by selecting the design to minimize
\[
\phi_p(X) = \left[\sum_{h=1}^{m} \frac{J_h}{\left(d^{(k)}_h\right)^{p}}\right]^{1/p} = \left[\sum_{i<j} \frac{1}{\left[d^{(k)}(x_i, x_j)\right]^{p}}\right]^{1/p}. \tag{6.2}
\]
Thus,
\[
X^{(k)}_{Mm} = \mathop{\mathrm{argmin}}_{X \in D(n,k)} \phi_p(X). \tag{6.3}
\]

Morris and Mitchell (1995) investigated values of p to achieve an approximately correct ranking for various settings. The use of (6.3) has been adopted by several authors for ranking designs, including Forrester et al. (2008) (see Section 6.3) and Joseph and Hung (2008) (see Section 6.4).
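In code, (6.2) is a single pass over all inter-point distances. A MATLAB sketch follows (illustrative only; the rectangular metric of (6.1) is used, and the function name phip is invented here).

    % phi_p criterion (6.2) under the k-dimensional rectangular metric (6.1).
    % Smaller phi_p indicates a more nearly maximin design. Sketch only.
    function phi = phip(X, p)
        n = size(X, 1);
        d = zeros(n*(n-1)/2, 1);
        m = 0;
        for i = 1:(n-1)
            for j = (i+1):n
                m = m + 1;
                d(m) = sum(abs(X(i,:) - X(j,:)));   % rectangular distance
            end
        end
        phi = sum(d .^ (-p)) ^ (1/p);
    end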

In addition, for an n × k design to be space-filling in k dimensions, it is usually the case that the projections of the design onto lower-dimensional spaces are desired to all be space-filling. For example, this is the case in Moon, Santner, and Dean (2010), who discussed the screening setting in which only a small (unknown) subset of inputs has a substantial effect on the output (i.e., few inputs are “active”). In this situation, the space-filling properties of the projections of the design are of great importance. In order to determine designs with good projected space-filling properties, Welch (1985) proposed an “average reciprocal distance” criterion which minimizes the average reciprocal distance over any user-selected collection of subspaces of the k-dimensional input space.

Here, a maximin criterion over all 2-dimensional projections is used as follows. For an n × k design X = (ξ_1, ..., ξ_k) ∈ D(n, k) with ξ_h = (ξ_{1h}, ..., ξ_{nh})^⊤, h = 1, ..., k, define d^{(2)}_{h,ℓ}(x_i, x_j) to be the inter-point Euclidean distance between two design points x_i and x_j of X projected into dimensions h and ℓ; that is,
\[
d^{(2)}_{h,\ell}(x_i, x_j) = \sqrt{(\xi_{ih} - \xi_{jh})^2 + (\xi_{i\ell} - \xi_{j\ell})^2} \tag{6.4}
\]

is the distance between (ξ_{ih}, ξ_{iℓ}) and (ξ_{jh}, ξ_{jℓ}). Then the minimum inter-point distance d^{(2)}_{min}(X) over all projections of the design X onto every 2-dimensional subspace is
\[
d^{(2)}_{min}(X) \equiv \min_{i<j \in \{1,\ldots,n\}}\ \min_{h<\ell \in \{1,\ldots,k\}} d^{(2)}_{h,\ell}(x_i, x_j). \tag{6.5}
\]
A design X^{(2)}_{Mm} is defined to be a maximin design provided X^{(2)}_{Mm} maximizes d^{(2)}_{min}(X) over all designs in D(n, k); that is,
\[
X^{(2)}_{Mm} = \mathop{\mathrm{argmax}}_{X \in D(n,k)} d^{(2)}_{min}(X). \tag{6.6}
\]
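The criterion (6.5) can be evaluated by brute force over all row pairs and column pairs, which is adequate for the design sizes considered in this chapter; a MATLAB sketch (the function name d2min is invented here):

    % Minimum 2-dimensional projected Euclidean inter-point distance (6.5).
    % Brute force over all row pairs and column pairs; sketch only.
    function dmin2 = d2min(X)
        [n, k] = size(X);
        dmin2 = Inf;
        for h = 1:(k-1)
            for l = (h+1):k
                for i = 1:(n-1)
                    for j = (i+1):n
                        d = sqrt((X(i,h)-X(j,h))^2 + (X(i,l)-X(j,l))^2); % (6.4)
                        dmin2 = min(dmin2, d);
                    end
                end
            end
        end
    end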

6.3 Algorithms for Space-filling Latin Hypercube Designs

Let DL(n, k) ⊂ D(n, k) denote the class of LHDs. Three algorithms for selecting

maximin designs from DL(n, k) are studied in Sections 6.3.1–6.3.3.

6.3.1 Complete Search and Random Generation

With large numbers of inputs, the set of Latin hypercube designs D_L(n, k) is too large to allow a complete search for a maximin design under criteria (6.2) or (6.6) in reasonable time. For example, even reducing the search to designs based on midpoints, there are (15!)^9 = 1.12 × 10^{109} distinct Latin hypercube designs when there are k = 10 inputs and n = 15 runs are required (and an infinite number of LHDs if randomly selected points in the intervals are used).

The simplest method of finding an approximate maximin design over D_L(n, k) is to generate a very large number of LHDs, evaluate each design under the maximin criterion of choice, and select the best of these. This is called the “Random Generation” (RGLHD) method. This method is not efficient in finding an approximate maximin design quickly, and more sophisticated methods, such as those discussed in Section 6.3.2, have been suggested in the literature for improving on RGLHD.

6.3.2 Random Swap Methods for Maximin LHDs

Morris and Mitchell (1995) proposed a simulated annealing search algorithm to

find an X that satisfies criterion (6.3) using the rectangular or Euclidean distance metrics (6.1). Their search algorithm begins with a randomly generated LHD, X, and

generates a sequence of “nearby” designs by perturbing X. A perturbed design, Xtry, is formed from X by exchanging two randomly chosen elements within a randomly

selected column of X. The perturbed design Xtry is compared to the initial design

X in terms of the value of ϕp in (6.2). Then X is set equal to Xtry with probability

1.0 if ϕp(Xtry) < ϕp(X), and with probability π = exp{−[ϕp(Xtry) − ϕp(X)]/t} if

ϕp(Xtry) > ϕp(X). The parameter t, called the “temperature,” is decreased as the algorithm proceeds. The algorithm keeps track of the “best” design while a given number of perturbations are tried at a given t value and t is decreased once a new best design is identified. When no exchange of the current design has lower value of

ϕp after a large given number of tries at a given temperature, the algorithm stops and the best design is reported. This algorithm forms the basis of the algorithm of

Joseph and Hung (2008) discussed in Section 6.4.1.

Forrester et al. (2008) took a slightly different approach and applied an evolu-

tionary operation (EVOP) algorithm to search for a maximin design according to criterion (6.3). Their search process starts with a single randomly generated LHD,

X, which is called a “parent.” The parent design is mutated to obtain an offspring Xtry as follows: Xtry is the result of m mutations, each of which consists of swapping two randomly chosen elements within a randomly chosen column of the design, starting with the parent and applied successively until m mutations have been made.

This process is repeated, again starting with the original parent, until a population of offspring has been produced. Finally, the best design based on (6.2), among all the offspring and the parent, is selected and becomes the new parent for the next generation of offspring. The procedure is iterated a given number of times, I. The number of stages of mutation used to construct each offspring from each parent is decreased during the iteration process.

In the comparison of algorithms, ϕ_p is replaced by d^{(2)}_{min} in (6.5), and multiple (random) starting designs are used in Forrester et al. (2008)'s search algorithm to provide a more global search. This is called the “Random Swap Latin hypercube design”

(RSLHD) algorithm. To execute the RSLHD algorithm, the number of offspring constructed from each parent in each iteration, the number of mutations used to produce each offspring, the number of iterations, I, and the number of starting designs must be specified in advance.

6.3.3 A Smart Swap Method for Maximin LHDs

The use of random swaps can be inefficient since many worse designs are likely to be generated for every better design found. For the 2-dimensional distance metric in (6.4), some computational effort can be spent to identify swaps that alter the pair of rows and columns that have minimum inter-point distance. Although this does not guarantee that a better design will result, it greatly increases the chance of finding a better design. The following “smart swap” algorithm (called the SSLHD algorithm)

searches for the maximin LHD X^{(2)}_{Mm} satisfying (6.6).

Step 0: Choose the total number of starting designs T. Set s = 0 and d^{(2)}_{min,best} = 0.0.

Step 1: Set s = s + 1 and the choice number c = 0. Generate an n × k random

LHD, X, as a parent design.

Step 2: Calculate the minimum inter-point distance d^{(2)}_{min}(X) of X defined in (6.5). If s = 1, set X_best = X and d^{(2)}_{min,best} = d^{(2)}_{min}(X). Record the pair of rows (i*, j*) and the pair of columns (h*, ℓ*) giving rise to d^{(2)}_{min}(X).

Step 3: For each selection of one of the columns h∗ or ℓ∗ and one of the rows i∗ or

j∗, there are (n − 2) remaining rows from which another row may be selected.

There are cmax = 2 × 2 × (n − 2) = 4(n − 2) possible (column, row, row) triples

so obtained. Label these “choices” as 1, 2, . . . , cmax.

Step 4: Set c = c + 1 and create Xtry by swapping the two elements obtained from the cth choice of (column, row, row) combination listed in Step 3.

Step 5: Calculate the new inter-point distances of Xtry. Only (k − 1) × (n − 2) × 2 distances need to be recalculated since there are (k − 1) columns to be paired

with the selected column and (n − 2) rows to be paired with each of the two selected rows. Let Δ^{(2)}_{min,try} denote the minimum of the 2(k − 1)(n − 2) new inter-point distances and let a = Δ^{(2)}_{min,try} − d^{(2)}_{min}(X).

Step 6:

6.1) If a > 0, set X = Xtry and recompute d^{(2)}_{min}(X), i*, j*, h*, ℓ*. If, in addition, a > ϵ, for a pre-specified ϵ, then go to Step 3. Otherwise, if the improvement is smaller than or equal to ϵ, go to Step 7.

6.2) If a < 0 and c < cmax, then go to Step 4 to consider the next possible swap in the parent design X.

126 6.3) If a < 0 and c = cmax so that there are no more possible swaps, go to Step 7.

Step 7: If d^{(2)}_{min}(X) > d^{(2)}_{min,best}, then set X_best = X and d^{(2)}_{min,best} = d^{(2)}_{min}(X). If s ≤ T, go to Step 1 for a new parent design. If s > T, select X_best as the final design and stop.
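To make Steps 3–6 concrete, the following MATLAB fragment is a simplified sketch: critpair is a hypothetical helper standing in for the Step 2 computation (returning the critical distance together with the rows and columns attaining it), the d2min and randlhd sketches given earlier are reused, and the full distance set is recomputed rather than updated incrementally as in Step 5. The full algorithm also re-derives the critical pair after each accepted swap (Step 6.1); this sketch does not.

    % One smart-swap improvement pass over the critical pair (sketch only).
    X = randlhd(9, 4);                                   % parent design
    [dBest, iStar, jStar, hStar, lStar] = critpair(X);   % hypothetical helper
    n = size(X, 1);
    for col = [hStar, lStar]
        for row = [iStar, jStar]
            for other = setdiff(1:n, [iStar jStar])
                Xtry = X;
                Xtry([row other], col) = Xtry([other row], col);  % Step 4 swap
                if d2min(Xtry) > dBest                   % Step 6.1: improvement
                    X = Xtry;
                    dBest = d2min(X);
                end
            end
        end
    end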

The SSLHD method is compared with the RSLHD and RGLHD methods in Section 6.5.1.

6.4 Algorithms for Orthogonal Maximin Designs

As described in Section 6.1, several authors (e.g. Owen (1994), Tang (1998)) have replaced the maximin criterion by a criterion based on minimizing the correlation of the columns of the input design matrix X. Below, a proposal of Joseph and Hung

(2008) is reviewed which uses a weighted average of maximin and correlation criteria.

Then the use of a maximin criterion is considered within a class of orthogonal designs originally developed for the screening situation.

6.4.1 Orthogonal Maximin LHDs

Joseph and Hung (2008) proposed a modification of the simulated annealing search algorithm of Morris and Mitchell (1995) described in Section 6.3.2. Rather than selecting a column and two rows at random for a mutation, they used a “smart swap” method to generate an orthogonal maximin LHD in an efficient way under the k-dimensional rectangular distance metric d^{(k)}_R(·, ·) in (6.1). Specifically, Joseph and Hung (2008) desired their selected design to have (approximately) minimum average pairwise column correlation as well as minimum ϕ_p in (6.2). They proposed minimizing the weighted objective function
\[
\psi_p = w\,\rho^2_{ave} + (1 - w)\, \frac{\phi_p - \phi_{p,L}}{\phi_{p,U} - \phi_{p,L}} \tag{6.7}
\]

where w ∈ (0, 1) is a pre-specified positive weight, ϕp is defined by (6.2), ϕp,U and

ϕp,L are scaling factors for ϕp, and

\[
\rho^2_{ave} = \frac{\sum_{i=2}^{k} \sum_{j=1}^{i-1} \rho^2_{ij}}{k(k-1)/2} \tag{6.8}
\]
is the average of the k(k − 1)/2 squared correlations ρ²_{ij}, where ρ_{ij} is the sample correlation coefficient between columns i and j.
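The correlation part of (6.7) is inexpensive to evaluate; a MATLAB sketch of (6.8) follows (the function name rho2ave is invented here).

    % Average squared pairwise column correlation (6.8). Sketch only.
    function r2 = rho2ave(X)
        k = size(X, 2);
        R = corrcoef(X);                  % k x k sample correlation matrix
        s = 0;
        for i = 2:k
            for j = 1:(i-1)
                s = s + R(i,j)^2;
            end
        end
        r2 = s / (k*(k-1)/2);
    end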

To select the column and one of the rows with which to make a swap, Joseph and Hung (2008) chose the column ℓ* which, stochastically, has greatest average correlation with the other columns; that is, they drew ℓ* from the multinomial distribution with probabilities
\[
P(\ell) = \frac{\rho_\ell^{\alpha}}{\sum_{\ell=1}^{k} \rho_\ell^{\alpha}}, \qquad \text{where } \rho_\ell = \sqrt{\frac{1}{k-1}\sum_{j \neq \ell} \rho^2_{\ell j}}, \quad 1 \leq \ell \leq k, \tag{6.9}
\]
with α ∈ [1, ∞). Similarly, they selected the row i* which, stochastically, has smallest average distance from the other rows; that is, they drew i* from the multinomial distribution with probabilities
\[
P(i) = \frac{\phi_{pi}^{\alpha}}{\sum_{i=1}^{n} \phi_{pi}^{\alpha}}, \qquad \text{where } \phi_{pi} = \left(\sum_{j \neq i} \frac{1}{d_{ij}^{p}}\right)^{1/p}, \quad 1 \leq i \leq n, \tag{6.10}
\]
where d_{ij} is the k-dimensional rectangular distance between rows i and j. Note that when α = ∞ in (6.9) or (6.10), the column with maximum average correlation and the row with maximum average reciprocal distance value ϕ_{pi} are chosen for the swap. Joseph and Hung (2008) took p = 15 for a reasonably accurate ordering of the designs and α = ∞. Having selected i* and ℓ*, the element in row i* was exchanged with the element in a randomly chosen row in column ℓ* to give Xtry. The perturbed LHD, Xtry, was evaluated under the criterion ψ_p in (6.7) with w = 0.5. If ψ_p(Xtry) < ψ_p(X), then X was replaced by Xtry; otherwise X was replaced by Xtry with probability π = exp{−[ψ_p(Xtry) − ψ_p(X)]/t} using the simulated annealing algorithm, as in Morris and Mitchell (1995).

6.4.2 Orthogonal Maximin Gram-Schmidt Designs

A new class of designs, which is called the class of Gram-Schmidt designs (GSD)

in this dissertation, was constructed by Moon et al. (2010) for the GSinCE screening procedure. For k inputs and n observations, a randomly generated GSD is created

from a randomly generated LHD, Λ = (λ1,..., λk) via centering, orthogonalization, and scaling as follows:

Center: Center each column of Λ as:

\[
v_h = \lambda_h - \left(\lambda_h^\top \mathbf{1}/n\right)\mathbf{1}, \qquad \text{for } h = 1, \ldots, k,
\]

where 1 is a vector of n unit elements.

Orthogonalize: Apply the Gram-Schmidt algorithm to form orthogonal columns

u_1, u_2, ..., u_k, where
\[
u_h =
\begin{cases}
v_1, & h = 1; \\[4pt]
v_h - \displaystyle\sum_{i=1}^{h-1} \frac{u_i^\top v_h}{\|u_i\|^2}\, u_i, & h = 2, \ldots, k.
\end{cases} \tag{6.11}
\]

Scale: Scale u_h = (u_{1h}, ..., u_{nh})^⊤ so that its minimum element is 0 and its maximum element is 1 to give column ξ_h = (ξ_{1h}, ..., ξ_{nh})^⊤, where
\[
\xi_{ih} = \frac{u_{ih} - \min\{u_{1h}, \ldots, u_{nh}\}}{\max\{u_{1h}, \ldots, u_{nh}\} - \min\{u_{1h}, \ldots, u_{nh}\}},
\]

for i = 1, . . . , n and h = 1, . . . , k. Set X = (ξ1,..., ξk).
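The center/orthogonalize/scale construction can be written compactly; the following MATLAB sketch is an illustration of these three steps only (the released GSinCE software may differ in details, and the function name gsd is invented here).

    % Gram-Schmidt design (GSD): center, orthogonalize (6.11), and scale an
    % n x k LHD so that every pair of columns of X has zero correlation.
    % Illustrative sketch of the Section 6.4.2 construction.
    function X = gsd(L)
        [n, k] = size(L);
        V = L - repmat(mean(L, 1), n, 1);          % center each column
        U = zeros(n, k);
        U(:,1) = V(:,1);
        for h = 2:k                                % Gram-Schmidt step (6.11)
            proj = zeros(n, 1);
            for i = 1:(h-1)
                proj = proj + (U(:,i)' * V(:,h)) / (U(:,i)' * U(:,i)) * U(:,i);
            end
            U(:,h) = V(:,h) - proj;
        end
        X = (U - repmat(min(U,[],1), n, 1)) ./ ...
            repmat(max(U,[],1) - min(U,[],1), n, 1);   % scale to [0,1]
    end

For example, X = gsd(randlhd(9, 4)) produces a 9 × 4 design whose off-diagonal entries of corrcoef(X) are zero up to rounding, since affine scaling of each column preserves the zero correlation achieved by the centered Gram-Schmidt step.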

Note that X is not necessarily in the class of LHDs. Every pair of columns of X has zero correlation, but X need not have any 2-dimensional (or higher dimensional) space-filling properties. In the following, X is modified using smart swap methods in an attempt to increase the minimum inter-point distance. After each swap, the perturbed column is moved to the last (right-most) position and is re-orthogonalized with respect to the remaining columns 1, . . . , k −1. In this way, it is hoped to achieve both space-fillingness and orthogonality.

Two versions of an algorithm OSGSD (orthogonal swap Gram-Schmidt design) using different distance metrics and different criteria are proposed. The first version uses the distance (6.4) projected into 2-dimensional subspaces and criterion (6.6) as in Section 6.3, while the second version uses the k-dimensional rectangular distance in (6.1) and criterion (6.3). In both cases, all pairwise column correlations are zero by construction.

OSGSD using the d^{(2)}_{min} criterion

When d^{(2)}_{min}, defined in (6.5), is used as the design criterion, the orthogonal swap method is similar to the SSLHD method of Section 6.3.3, with the additional step of transforming the LHD into a GSD. So Steps 1, 4, and 5 of the algorithm in Section 6.3.3 are modified for the OSGSD-d^{(2)}_{min} algorithm as follows.

Step 1: Set s = s + 1 and the choice number c = 0. Generate an n × k random LHD and then convert it into the GSD via centering, orthogonalization, and scaling.

Call the result the parent design X.

Step 4: Set c = c + 1 and create Xtry by swapping two elements obtained from the cth (column, row, row) combination listed in Step 3. Exchange the column in

choice c with the last column, so that the perturbed column becomes column k. Apply the center, orthogonalize and scale steps of the GSD to column k (only).

Step 5: Calculate the new inter-point distances of Xtry. Only (k − 1)n(n − 1)/2 distances need to be recalculated since there are (k − 1) columns to be paired

with column k, and n elements within column k are changed via the orthogonalization step. Let Δ^{(2)}_{min,try} denote the minimum of the (k − 1)n(n − 1)/2 new inter-point distances and set a = Δ^{(2)}_{min,try} − d^{(2)}_{min}(X).

OSGSD using the ϕp criterion

The OSGSD algorithm is described below for an arbitrary k-dimensional distance metric d^{(k)}(·, ·) and objective function ϕ_p in (6.2), although ϕ_p can be replaced by any other real-valued objective function. In Section 6.5, the designs produced by the OSGSD algorithm based on ϕ_p with distance metric d^{(k)}_R(·, ·) in (6.1) are compared to the designs produced by the algorithm of Joseph and Hung (2008); the latter uses a criterion based on minimizing a weighted combination of ϕ_p and ρ²_ave as in (6.7).

Step 0: Choose the total number of starting designs T. Set s = 0 and ϕ_{p,best} = ∞.

Step 1: Set s = s+1 and the choice number c = 0. Generate an n×k random LHD

and then convert it into the GSD via centering, orthogonalization, and scaling.

Call the result the parent design X.

Step 2: Calculate ϕ_p(X) as in (6.2). If s = 1, set X_best = X and ϕ_{p,best} = ϕ_p(X). Record the pair of rows (i*, j*) giving rise to min_{1≤i<j≤n} d^{(k)}(x_i, x_j).

Step 3: For each selection of a single column from the k columns of X and one of

the rows i∗ or j∗, there are (n − 2) remaining rows from which another row may

be selected. There are cmax = k × 2 × (n − 2) = 2k(n − 2) possible (column,

row, row) triples so obtained. Label these as 1, 2, . . . , cmax.

Step 4: Set c = c + 1 and create Xtry by swapping two elements obtained from the cth choice of (column, row, row) combination listed in Step 3. Exchange the column in choice c with the last column, so that the perturbed column

becomes column k. Apply the center, orthogonalize and scale steps of the GSD

to column k only.

Step 5: Calculate ϕp,try = ϕp(Xtry) for Xtry and let a = ϕp(X) − ϕp,try.

Step 6:

∗ ∗ 6.1) If a > 0, set X = Xtry and reset ϕp(X), i , j . If, in addition, a > ϵ, for a pre-specified ϵ, then go to Step 3. Otherwise, if the improvement is

smaller than or equal to ϵ, go to Step 7.

6.2) If a < 0 and c < cmax, then go to Step 4 to consider the next possible swap in the parent design X.

6.3) If a < 0 and c = cmax so that there are no more possible swaps, go to Step 7.

Step 7: If ϕp(X) < ϕp,best, then set Xbest = X and ϕp,best = ϕp(X). If s ≤ T , go

to Step 1 for a new parent design. If s > T, select X_best as the final design and stop.

The OSGSD-ϕ_p algorithm here is similar to the SSLHD algorithm in Section 6.3.3 and the previous OSGSD-d^{(2)}_{min} algorithm in its swap rule for mutations. In Step 3, the OSGSD-ϕ_p algorithm selects one of the two rows which give rise to the minimum distance d^{(k)}(·, ·), in order to determine one of the rows to swap. On the other hand,

Joseph and Hung (2008) use ϕpi in (6.10) for choosing one of the rows to swap. These differences in determining the elements to swap can lead to different performances of the algorithms.

6.5 Comparisons

This section compares the various algorithms described in this chapter. A zip file containing the new algorithms is available at http://www.stat.osu.edu/∼comp exp/∼moon dean santner 2010. The algorithms for maximin LHD are compared with respect to their ability to generate a maximin design based on the 2-dimensional maximin criterion (6.6), while the algorithms for orthogonal maximin designs are compared under criteria based on ϕp using different distance metrics and criteria based on minimum correlation.

6.5.1 Maximin LHDs

First, the algorithms in Sections 6.3.1, 6.3.2, and 6.3.3 are compared with respect to their ability to generate a maximin LHD, using an example of n = 9 rows and k = 4 columns and the 2-dimensional maximin distance criterion in (6.6), when each input is scaled to [0, 1]. As a baseline, the design that maximized d^{(2)}_{min} using the random generation RGLHD method with 10,000 randomly generated LHDs (with random points selected in each of the nk subintervals) was determined. This search required 12.5 seconds using MATLAB code on a 64 bit Linux machine with 8 cores, 32 GB of RAM, and a 2.66 GHz processor. For the same criterion (6.6), the best design produced by the Forrester et al. (2008) random swap algorithm (RSLHD) was determined with 100 iterations and 100 offspring in each iteration. Within the same time limit as the RGLHD algorithm, the RSLHD algorithm used T = 3 starting designs. Similarly, the best design produced by the smart swap algorithm (SSLHD) was determined so that the running time of the SSLHD algorithm again approximately matched that of the RGLHD algorithm (T = 404 starts).

Algorithm   d(2)min   ϕ15      ρ²ave    d(4)min   |ρ|max   T
RGLHD       0.1935    1.2342   0.0845   0.8194    0.4309   -
RSLHD       0.2174    1.0981   0.0594   0.9529    0.3551   3
SSLHD       0.2497    1.0566   0.0246   0.9825    0.2858   404

Table 6.1: Characteristics of best (n, k) = (9, 4) designs formed using criterion d^{(2)}_{min}: ϕ15 is the Morris and Mitchell (1995) objective function with p = 15; ρ²ave is average squared correlation; d(4)min is minimum 4-dimensional rectangular distance; |ρ|max is maximum absolute correlation; T is number of starting designs

The results are compared according to various characteristics in Table 6.1, where the design that maximized d^{(2)}_{min} for each algorithm was selected for comparison. The SSLHD algorithm produced the design with the largest value of d^{(2)}_{min}, and also the best design in terms of the other criteria considered, even though it did not aim to optimize them. The SSLHD design has the smallest value of ϕ15 (the Morris and Mitchell (1995) objective function in (6.2) with p = 15), the smallest average squared correlation ρ²_ave in (6.8), the largest minimum 4-dimensional rectangular distance d^{(4)}_{min}, and the smallest value of the maximum absolute correlation between two columns, |ρ|_max = max_{1≤i<j≤k} |ρ_{ij}|.

To check performance on a larger example, a design with n = 40, k = 5 was constructed. Again, as a baseline, one run of the RGLHD algorithm with 10,000 candidate designs was made, which took 20.4 seconds. The numbers of starts for the RSLHD and SSLHD algorithms were chosen to provide comparable computational effort. The best designs determined by all three algorithms are shown in Table 6.2.

The results are similar to those for the (n, k) = (9, 4) case.

Algorithm   d(2)min   ϕ15      ρ²ave    d(5)min   |ρ|max   T
RGLHD       0.0394    2.1519   0.0353   0.4957    0.2824   -
RSLHD       0.0567    2.1429   0.0165   0.4755    0.3034   2
SSLHD       0.0834    1.8254   0.0153   0.5894    0.2507   11

Table 6.2: Characteristics of best (n, k) = (40, 5) designs formed using criterion d^{(2)}_{min}: ϕ15 is the Morris and Mitchell (1995) objective function with p = 15; ρ²ave is average squared correlation; d(5)min is minimum 5-dimensional rectangular distance; |ρ|max is maximum absolute correlation; T is number of starting designs

6.5.2 Orthogonal Maximin Designs

The performance of the Joseph and Hung (2008) algorithm for obtaining an orthogonal maximin LHD (OMLHD) was compared with that of the OSGSD algorithms in Section 6.4.2 using different distance metrics and criteria. The OMLHD optimal design for (n, k) = (9, 4) of Joseph and Hung (2008) is shown in Table 6.3 after scaling, by transforming an integer value i in their original design into (i − 1)/8 for i = 1, ..., 9. It took approximately 7 minutes to execute the OMLHD algorithm using C++ code downloaded from http://stat.rutgers.edu/∼yhung/ on the 64 bit Linux machine with 8 cores, 32 GB of RAM, and a 2.66 GHz processor. The algorithms of Section 6.4.2 were

coded in MATLAB. Despite the possible differences in actual computations performed by the compiled Joseph and Hung (2008) C++ program and interpreted MATLAB

programs, the same 7 minute time limit was used for all algorithms when constructing

designs.

[Table 6.3 panels: the best 9 × 4 design matrices from the OMLHD (scaled), OSGSD-ϕ15, and OSGSD-d^{(2)}_{min} algorithms, each shown with the scatterplot matrix of its four columns.]

Table 6.3: Best orthogonal maximin 9 × 4 designs found by the OMLHD, OSGSD-ϕ15, and OSGSD-d^{(2)}_{min} algorithms based on 7 minutes of computational time, and scatterplot matrices of these designs.

The resulting designs are shown in Table 6.3. The OMLHD design is constructed to minimize ψ_p in (6.7) with p = 15 and w = 0.5; the OSGSD-ϕ15 design is constructed to minimize ϕ_p in (6.2) with p = 15 and distance metric d^{(k)}_R(·, ·); and the OSGSD-d^{(2)}_{min} design is constructed to maximize d^{(2)}_{min}. Table 6.4 summarizes the performance of the three designs under several criteria.

The OSGSD-d^{(2)}_{min} design has the largest value of d^{(2)}_{min}, which is not surprising since this

136 2 (4) | | (2) Algorithms ϕ15 ρave dmin ρ max dmin T OMLHD(scaled) 0.8391 0.0040 1.3750 0.1167 0.1768 - OSGSD-ϕ15 0.7562 0.0000 1.4840 0.0000 0.0305 3,564 (2) OSGSD-dmin 1.0590 0.0000 0.9564 0.0000 0.2456 5,718

Table 6.4: Comparisons of best designs found by the OMLHD, OSGSD-ϕ15, and OSGSD-d^{(2)}_{min} algorithms based on 7 minutes of computational time: ϕ15 is the Morris and Mitchell (1995) objective function with p = 15; ρ²ave is average squared correlation; d(4)min is minimum 4-dimensional rectangular distance; |ρ|max is maximum absolute correlation; T is number of starting designs

is the only one of the three algorithms that uses a d^{(2)}_{min}-based criterion. The OSGSD-d^{(2)}_{min} design has zero correlation, so it outperforms the OMLHD design with respect to the average and maximum correlation measures. However, the OSGSD-d^{(2)}_{min} design performs less well in minimizing ϕ15 based on the 4-dimensional rectangular distance, which the other two algorithms use as an optimizing criterion. Under both the correlation criteria and the 4-dimensional rectangular distance criterion, the OSGSD-ϕ15 design outperforms the OMLHD design.

To investigate the stability and performance of OSGSD-ϕ15 when run for a limited time, the algorithm was run 100 times for 4 seconds per run; this time limit was selected to be 1% of the 7 minutes required for the OMLHD algorithm. The distributions of ϕ15 and d^{(4)}_{min} for the 100 designs produced by the algorithm are shown in Table 6.5. It is shown that 98% of designs obtained in 4 seconds have smaller ϕ15 values than the ϕ15 = 0.8391 of the scaled OMLHD, and 74% of designs have larger minimum 4-dimensional rectangular distance than the d^{(4)}_{min} = 1.3750 of the scaled OMLHD. Thus it can be concluded that the OSGSD-ϕ15 algorithm outperforms the OMLHD algorithm in terms of k-dimensional rectangular distance and correlation measures, as well as computational efficiency, for the (n, k) = (9, 4) design studied thus far.

Measure   Min      1st Q    Median   3rd Q    Max      OMLHD (scaled)   Proportion of better OSGSDs
ϕ15       0.7582   0.7907   0.8053   0.8171   0.8446   0.8391           98%
d(4)min   1.3059   1.3732   1.3929   1.4271   1.5136   1.3750           74%

[Boxplots of the ϕ15 and d(4)min distributions for the 100 OSGSD-ϕ15 designs, with the scaled OMLHD values marked by horizontal lines.]

Table 6.5: Distributions of ϕ15 and d^{(4)}_{min} values in 100 9 × 4 designs produced by the OSGSD-ϕ15 algorithm (4 seconds of computation) and corresponding values of the best scaled OMLHD design indicated by horizontal lines

Now consider the larger problem in which (n, k) = (40, 5). The Joseph and Hung (2008) C++ code required 9 hours and 42 minutes to execute the OMLHD algorithm. One run each of the OSGSD-d^{(2)}_{min} and OSGSD-ϕ15 algorithms was made based on the same time limit. The result is shown in Table 6.6. The OSGSD-d^{(2)}_{min} design is again best in terms of maximizing d^{(2)}_{min}. The OSGSD-ϕ15 design again outperforms the scaled OMLHD in terms of minimizing ϕ15, ρ²_ave, and |ρ|_max, and maximizing d^{(5)}_{min}.

To investigate the stability and performance of OSGSD-ϕ15 when run for a limited time for this larger problem, multiple (100) runs of the algorithm were again conducted, each for 349 seconds, which is 1% of 9 hours and 42 minutes. The distributions of ϕ15 and d^{(5)}_{min} over the 100 runs are shown in Table 6.7. All 100 designs obtained with 349 seconds of computational effort have smaller ϕ15 and larger d^{(5)}_{min} values than the ϕ15 = 1.3195 and d^{(5)}_{min} = 0.9231 values of the scaled OMLHD design.

138 2 (5) | | (2) Algorithms ϕ15 ρave dmin ρ max dmin T OMLHD(scaled) 1.3195 0.00001 0.9231 0.0585 0.0363 - OSGSD-ϕ15 1.1542 0.00000 1.1186 0.0000 0.0066 1,888 (2) OSGSD-dmin 2.0802 0.00000 0.4906 0.0000 0.0655 27,160

Table 6.6: Comparisons of best designs found by the OMLHD, OSGSD-ϕ15, and OSGSD-d^{(2)}_{min} algorithms based on 9 hours and 42 minutes of computational time: ϕ15 is the Morris and Mitchell (1995) objective function with p = 15; ρ²ave is average squared correlation; d(5)min is minimum 5-dimensional rectangular distance; |ρ|max is maximum absolute correlation; T is number of starting designs

Thus the OSGSD-ϕ15 algorithm outperforms the OMLHD algorithm in terms of ϕ15 and d^{(5)}_{min}, as well as being considerably more computationally efficient.

6.6 Summary

This chapter proposes the SSLHD algorithm for obtaining maximin LHDs under a 2-dimensional distance criterion, and also proposes the two algorithms, OSGSD-d^{(2)}_{min} and OSGSD-ϕ_p, for obtaining orthogonal maximin designs under a 2-dimensional distance and a ϕ_p-based criterion, respectively.

The SSLHD algorithm improves upon the RSLHD algorithm of Forrester et al.

(2008) by using a “smart swap” which tries to increase the minimum inter-point distance as quickly as possible. The OSGSD algorithms aim to achieve both the

space-filling property and orthogonality. Through a construction using Gram-Schmidt

orthogonalization in (6.11), all OSGSD designs are guaranteed to have zero correlation

between any pair of columns. For the space-filling property, both the minimum inter-

point distance criterion in (6.6) using 2-dimensional Euclidean distance and Morris

and Mitchell (1995)’s ϕp criterion using k-dimensional distances can be applied to the class of GSDs. In examples, the OSGSD-ϕp algorithm with rectangular distance

Measure   Min      1st Q    Median   3rd Q    Max      OMLHD (scaled)   Proportion of better OSGSDs
ϕ15       1.1562   1.1806   1.1923   1.1992   1.2242   1.3195           100%
d(5)min   1.0481   1.0743   1.0853   1.0986   1.1273   0.9231           100%

[Boxplots of the ϕ15 and d(5)min distributions for the 100 OSGSD-ϕ15 designs, with the scaled OMLHD values marked by horizontal lines.]

Table 6.7: Distributions of ϕ15 and d^{(5)}_{min} values in 100 40 × 5 designs produced by the OSGSD-ϕ15 algorithm (349 seconds of computation) and corresponding values of the best scaled OMLHD design indicated by horizontal lines

outperforms the OMLHD algorithm by producing a better design in terms of both distance and correlation measures within a considerably shorter time.

CHAPTER 7

ALTERNATIVE TWO-STAGE DESIGNS

This chapter describes an alternative approach to constructing two-stage designs. This straightforward approach was considered prior to the improved two-stage GSinCE procedure in Chapter 2. The purpose of this chapter is to review the shortcomings of the more straightforward procedure. The main feature of this approach is that the Stage 1 and Stage 2 designs are constructed from orthogonal array-based Latin hypercube designs (OA-based LHDs). First, OA-based LHDs are reviewed, and then the construction of the Stage 1 and Stage 2 designs is explained. Finally, the problems detected in this approach are discussed.

7.1 Orthogonal Array-based Latin Hypercube Design

An n × k matrix A, with elements from a set of s ≥ 2 symbols, is called an orthogonal array (OA) of strength r, size n, with k constraints and s levels if each n × r submatrix of A contains all possible 1 × r row vectors with the same frequency λ. The number λ is called the index of the array, and n = λs^r. The array is denoted by OA(n, k, s, r). Thus the n × k Latin hypercube design (LHD), whose columns are permutations of 1, ..., n, is an OA of strength 1, that is, OA(n, k, n, 1).

OA designs are used extensively for physical experiments because of their orthog-

onality property. But OA designs can result in replication of points when they are

141 projected onto a subspace defined by a few important inputs. When effect sparsity exists, this kind of design is not desirable for screening in computer experiments.

LHDs have been used extensively in computer experiments because they spread the design points uniformly over the range of each input separately. But LHDs are not guaranteed to be space-filling in higher dimensions. Tang (1993) proposed a method of constructing the OA-based LHD by combining the desirable properties of both OAs and LHDs. Such OA-based LHDs are more space-filling than randomly generated LHDs.

The following procedure generates an OA-based LHD from a given OA(n, k, s, r).

Step 1 Select an OA(n, k, s, r) = A and randomize its rows, columns, and symbols.

Step 2 For each column of the randomized A, replace the λs^(r−1) positions with element w by some random permutation of

        (w − 1)λs^(r−1) + 1, (w − 1)λs^(r−1) + 2, . . . , (w − 1)λs^(r−1) + λs^(r−1) = wλs^(r−1),

for all w = 1, . . . , s. This generates an OA-based LHD, denoted by U.

Step 3 For the jth column of U, replace each entry equal to i by a value randomly generated from the uniform distribution U((i − 1)/n, i/n), for all i = 1, . . . , n and j = 1, . . . , k. This gives a random OA-based LHD.
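The following is a minimal MATLAB sketch of Steps 1-3, assuming A is a given OA(n, k, s, r) with symbols 1, . . . , s; the function name oa2lhd is hypothetical, and the symbol randomization of Step 1 is omitted for brevity.

function X = oa2lhd(A, s)
% OA2LHD  Sketch of Tang (1993)'s construction: turn an OA(n,k,s,r) with
% symbols 1,...,s into a random OA-based LHD (symbol randomization omitted).
[n, k] = size(A);
m = n / s;                            % m = lambda*s^(r-1) positions per symbol
A = A(randperm(n), randperm(k));      % Step 1: randomize rows and columns
U = zeros(n, k);
for j = 1:k                           % Step 2: spread the positions holding
    for w = 1:s                       % symbol w over levels (w-1)*m+1,...,w*m
        pos = (A(:,j) == w);
        U(pos, j) = (w-1)*m + randperm(m)';
    end
end
X = (U - rand(n, k)) / n;             % Step 3: jitter entry u into ((u-1)/n, u/n)
end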

To obtain a design matrix that is as uncorrelated and/or as space-filling as possible, the resulting OA-based LHDs are evaluated under a secondary criterion, such as one of the correlation and/or distance criteria shown in Table 7.1, and the best design is selected from among the OA-based LHDs generated, as in the sketch below.
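As an illustration, the following sketch selects, from a pool of r candidate OA-based LHDs, the one maximizing the minimum 2-dimensional projected inter-point distance; it assumes the hypothetical oa2lhd function above and an available orthogonal array A with s symbols.

% keep the best of r random OA-based LHDs under the maximin
% 2-dimensional projected distance criterion (a sketch)
r = 100; best = -Inf;
for c = 1:r
    U = oa2lhd(A, s);
    [n, k] = size(U);
    dmin2 = Inf;
    for h = 1:k-1                     % all 2-dimensional projections
        for l = h+1:k
            dmin2 = min(dmin2, min(pdist(U(:,[h l]))));
        end
    end
    if dmin2 > best                   % maximize the minimum distance
        best = dmin2; Ubest = U;      % Ubest holds the selected design
    end
end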

7.2 Stage 1 Design for a Two-stage Group Screening Procedure

Suppose that f individual inputs are divided into m groups and the number of runs at Stage 1 is n. Let X^(1) = (ξ1, . . . , ξf) be the design matrix at Stage 1 for the f individual inputs, where ξj = (ξ1j, . . . , ξnj)^⊤, j = 1, . . . , f. Let G = (g1, . . . , gm), where gi = (g1i, . . . , gni)^⊤ for i = 1, . . . , m, be the design matrix for the m group variables. At Stage 1, both designs X^(1) and G are important because X^(1) is needed for computation of the output at Stage 1 and G is needed for the analysis of groups at Stage 1. Alternative methods to those used by GSinCE for constructing these two design matrices are discussed in Section 7.2.1.

7.2.1 Construction

Here, four different methods of constructing the design matrices X^(1) and G are discussed.

OAGrp Method

This method first generates a group design matrix G = (g1, . . . , gm) from a randomly generated OA-based LHD and then assigns the design points of a group variable to every individual input within the same group. For example, if inputs 1, 5, 6 are assigned to group 1, then columns 1, 5, and 6 (ξ1, ξ5, ξ6) of X^(1) are all set equal to g1. This method assumes that the grouping is determined from subject-matter expert opinion or other information.

In this case, the analysis of the relationship between the group variables and the output at Stage 1 is straightforward, since the group variables are directly related to the output via the individual inputs having exactly the same values as the group variable. However, the design points in X^(1) projected onto 2-dimensional subspaces of the inputs within the same group are located along the diagonal line, so the design points in X^(1) cannot be space-filling, and the design cannot explore the experimental region well. A further limitation of this method is that grouping based on an empirical study of the inputs and output is impossible, so if there is no prior grouping information, the most reasonable course of action might be to create groups of randomly selected inputs.

OAInd Method

This method first determines the design matrix X^(1) from a randomly generated OA-based LHD for the individual inputs and then defines the group design matrix G based on it. For example, if inputs 1, 5, 6 are assigned to group 1, then the ith level of the group 1 variable is defined to be

        g_i1 = (ξ_i1 + ξ_i5 + ξ_i6)/3,

where ξ_i1, ξ_i5, ξ_i6 are the ith input values of these 3 inputs, for i = 1, . . . , n.

This method allows the output to be computed from the space-filling design X^(1) and groups to be made from an empirical analysis of the individual inputs defined in X^(1) and the output, as for GSinCE. However, the group variable defined by the average has a problem in estimating the group effects correctly (see Section 7.4.2 for details).

The following two methods, the Rotate and Shrink methods, are proposed by combining the OAGrp and OAInd methods. The main idea of these methods is to move the individual input design points generated from the OA-based LHD toward a group design point, so that the individual input design points within a group are scattered around the resulting group design value. This modification of X^(1) is sought to improve the resulting G without removing key properties of X^(1).

Rotate Method

This method rotates the design points of the individual inputs in the matrix X^(1) toward the diagonal line, so that the rotated design points are located inside the middle square of the 2-dimensional input region. The Rotate method is now described in detail for two inputs.

[Figure 7.1 illustrates the movement of a point (a1, b1) to (a2, b2), then (a3, b3), and finally (a4, b4).]

Figure 7.1: Rotate Method

Step 1 Generate an OA-based LHD in [0, 1] for two individual inputs within a group. Let one point of this design be (a1, b1).

Step 2 Re-scale each column of the OA-based LHD to [0, 1/√2] and move the point (a1, b1) to (a2, b2) = (a1/√2, b1/√2).

Step 3 Connect the point (a2, b2) with the origin (0, 0) as shown in Figure 7.1. Let the angle between this line and the x-axis be θ.

Step 4 Rotate the point (a2, b2) by 45° in the counter-clockwise direction. Then a new point is defined by (a3, b3) = ((a1 − b1)/2, (a1 + b1)/2), since

        a3 = (√(a1² + b1²)/√2) cos(θ + 45°) = (√(a1² + b1²)/√2)(cos 45° cos θ − sin 45° sin θ)
           = (√(a1² + b1²)/2)(cos θ − sin θ) = (a1 − b1)/2,

        b3 = (√(a1² + b1²)/√2) sin(θ + 45°) = (√(a1² + b1²)/√2)(sin θ cos 45° + cos θ sin 45°)
           = (√(a1² + b1²)/2)(cos θ + sin θ) = (a1 + b1)/2.

Step 5 Move the point (a3, b3) a distance 1/2 parallel to the x-axis. Then the resulting point is (a4, b4) = ((a1 − b1 + 1)/2, (a1 + b1)/2).

Thus the original point (a1, b1) is finally moved to ((a1 − b1 + 1)/2, (a1 + b1)/2), as shown in Figure 7.1. In the same way, any design point located in the original square on [0, 1] can be moved into the inner square with side length 1/√2. As a result, the design points are scattered in the middle part of the original square on [0, 1], and there are no points in its four corners. The levels of the group variables are obtained by averaging the rotated individual design points.
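A minimal MATLAB sketch of this transformation, assuming X is an n × 2 OA-based LHD on [0, 1]² for one pair of within-group inputs (the helper name rotate_pair is hypothetical):

function [Xr, g] = rotate_pair(X)
% ROTATE_PAIR  Apply the Rotate method to an n x 2 design on [0,1]^2
% (a sketch); Steps 2-5 above collapse to one linear map per point.
a = X(:,1); b = X(:,2);
Xr = [(a - b + 1)/2, (a + b)/2];      % final points (a4, b4)
g  = mean(Xr, 2);                     % group levels: average of rotated points
end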

Shrink Method

This method moves the design points of the individual inputs toward the diagonal line by a shrinkage parameter. The shrinkage parameter h specifies the distance between the original design point and the new design point as a proportion of the distance from the original point to its perpendicular projection onto the diagonal line. This method is flexible in that it can produce different designs through the choice of the shrinkage parameter. For example, if the shrinkage parameter h is 0, the method is the same as the OAInd method. If h is 1, it is similar to the OAGrp method, although not identical, because the OAGrp method starts with the OA-based LHD for G while the Shrink method starts with the OA-based LHD for X^(1). The Shrink method is now described in detail for two inputs.

[Figure 7.2 illustrates the movement of the points (a1, b1) and (c1, d1) toward the diagonal line.]

Figure 7.2: Shrink Method with h = 0.25

Step 1 Draw a diagonal line to connect the points (0, 0) and (1, 1) as shown in Figure 7.2.

Step 2 Draw a line to connect the original point (a1, b1) with the diagonal line perpendicularly and then set the intersection point to (a2, b2) = ((a1 + b1)/2, (a1 + b1)/2).

Step 3 Move the point (a1, b1) by the proportion h of the distance between (a1, b1) and (a2, b2), and set the result to (a3, b3). The new point is defined by

        (a3, b3) = (a1 − (|a1 − b1|/2)h, b1 + (|a1 − b1|/2)h)   for a1 > b1;
        (a3, b3) = (a1 + (|a1 − b1|/2)h, b1 − (|a1 − b1|/2)h)   for a1 ≤ b1,

since the distance to move in the x-axis or y-axis direction is (|a1 − b1|/√2) × h × (1/√2) = (|a1 − b1|/2) × h.

Figure 7.2 displays the movement of the design point by the Shrink method when h = 0.25. It shows the two cases with a1 > b1 and a1 < b1. The second case is displayed using symbols (c1, d1) instead of (a1, b1). In both cases, the original points are moved toward the diagonal line. The levels of group variables are obtained by averaging the shifted individual design points.
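A corresponding MATLAB sketch, again assuming an n × 2 OA-based LHD X on [0, 1]² and a shrinkage parameter 0 ≤ h ≤ 1 (the helper name shrink_pair is hypothetical):

function [Xs, g] = shrink_pair(X, h)
% SHRINK_PAIR  Apply the Shrink method to an n x 2 design on [0,1]^2
% (a sketch); each point moves a fraction h of the way toward its
% perpendicular projection onto the diagonal line.
a = X(:,1); b = X(:,2);
d = (abs(a - b)/2) * h;               % coordinate-wise distance to move
Xs = [a - sign(a - b).*d, b + sign(a - b).*d];
g  = mean(Xs, 2);                     % group levels: average of shifted points
end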

7.2.2 Secondary Criteria

The selection of the best designs for both X^(1) and G is done by applying one of the criteria in Table 7.1. In Table 7.1, distance means the inter-point distance projected into 2-dimensional space, and correlation means the Pearson correlation coefficient between any two columns. The distance criterion is used to ensure that no two points are “too close” and hence that the design points are spread out. The correlation criterion is used so that no two variables are “too related.” The sum of squared distances or correlations is used to achieve good performance on average. On the other hand, the minimum distance and the maximum correlation are used to avoid the worst case, by maximizing the former and minimizing the latter, respectively. Table 7.1 describes the various choices of design criteria for Stage 1. In particular, the distance and correlation criteria can be applied simultaneously, as shown in the third case of Table 7.1. The “individual” criteria are applied to all points in X^(1) and the group criteria to all points in G.

Measure        Method   Explanation
Distance       MmD      Maximize the minimum distance over all pairs of
                        individuals and groups
               MsD      Maximize the sum of squared distances over all pairs of
                        individuals and groups
Correlation    mMC      Minimize the maximum absolute correlation over all pairs
                        of individuals and groups
               msC      Minimize the sum of squared correlations over all pairs
                        of individuals and groups
Distance and   MmDmMC   Maximize the minimum distance over all pairs of
Correlation             individuals and groups and minimize the maximum absolute
                        correlation over all pairs of individuals and groups
               MsDmsC   Maximize the sum of squared distances over all pairs of
                        individuals and groups and minimize the sum of squared
                        correlations over all pairs of individuals and groups

Table 7.1: Secondary design criteria for Stage 1 design

7.3 Stage 2 Design for a Two-stage Group Screening Procedure

The idea of the Stage 2 design construction is the same as that of the GSinCE procedure in Section 2.4.1. The only difference is that OA-based LHDs are used instead of the Gram-Schmidt designs (GSDs) of Section 6.4.2.

7.4 Limitations

7.4.1 Availability of OA-based LHD

The main problem with the above four methods originates in the availability of the OA-based LHD. There are many restrictions on the possible dimensions of OA-based LHD design matrices, because there is a limit on the number of possible rows of an OA for a given number of columns. It is hard to find a suitable OA when the computer experiment of interest involves a large number of inputs and the screening method requires additional input columns for the low-impact inputs. Consequently, the number of Stage 1 runs and the number of low-impact benchmark inputs for a given number of inputs cannot be chosen flexibly. This restriction arises again when a Stage 2 design is constructed: a new OA-based LHD for Stage 2 that allows the combination of the individual OA-based LHDs from both stages to also be an OA-based LHD is hard to find.

A two-stage design as proposed in Chapter 2 can overcome these problems by using Gram-Schmidt orthogonalization instead of the OA-based LHD. Gram-Schmidt orthogonalization can convert a design matrix of any dimension into an orthogonal design matrix whenever the number of rows is larger than the number of columns; the maximum number of columns is then n − 1, where n is the number of rows.
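A minimal sketch of this conversion, mirroring the centering, QR, and rescaling steps used in the job files of Chapter 8 (the dimensions here are illustrative):

% convert a random LHD into a column-orthogonal design on [0,1] via
% Gram-Schmidt (thin QR); centering first makes the orthogonal columns
% uncorrelated, and the per-column rescaling preserves zero correlation
n = 40; k = 20;                       % illustrative dimensions (k <= n-1)
A = lhsdesign(n, k);                  % random starting LHD
B = A - repmat(mean(A,1), n, 1);      % center the columns
[Q, R] = qr(B, 0);                    % Gram-Schmidt via thin QR
X = (Q - repmat(min(Q), n, 1)) ./ repmat(range(Q), n, 1);  % rescale to [0,1]
max(max(abs(corr(X) - eye(k))))       % off-diagonal correlations are ~ 0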

7.4.2 Group Variable Defined by Averaging

For the OAInd, Rotate, and Shrink methods, the group variables are defined by averaging. Suppose that the n × f Stage 1 design matrix is re-ordered

        X^(1) = (ξ1, . . . , ξf) = (ξ_1^(1), ξ_2^(1), . . . , ξ_s1^(1), ξ_1^(2), . . . , ξ_sm^(m)),

where ξ_ℓ^(i) = (ξ_1ℓ^(i), . . . , ξ_nℓ^(i))^⊤ is an n × 1 column vector corresponding to the ℓth input within the ith group, for i = 1, . . . , m, so that the first s1 inputs form group 1, the next s2 inputs form group 2, and so on, to the mth group of size sm. Define the n × 1 column of levels of the ith group variable to be

        g_i^A = (1/s_i) Σ_{ℓ=1}^{s_i} ξ_ℓ^(i).    (7.1)

That is, the levels of the group variable are determined from the average of the individual input values within the group. Then the design for the group variables is G^A = (g_1^A, . . . , g_m^A). The relationship between the output and the group variables defined in this way is analyzed via sensitivity analysis, as in Section 2.3.3, to select important groups at Stage 1.
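A one-loop MATLAB sketch of (7.1), where X1 denotes the n × f Stage 1 design matrix and the cell array groups (hypothetical, shown for f = 7) lists the input indices in each group:

% group levels as within-group column averages of the Stage 1 design, per (7.1)
groups = {[1 5 6], [2 3], [4 7]};     % illustrative grouping of f = 7 inputs
m = numel(groups);
GA = zeros(size(X1,1), m);
for i = 1:m
    GA(:,i) = mean(X1(:, groups{i}), 2);  % g_i^A: average of the group's columns
end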

Although the definition of the group inputs using (7.1) is simple, the use of G^A directly to analyze the relationship between the group variables and the output can produce inaccurate estimates of group effects, as follows.

Range of Group Variables

Suppose the Stage 1 design matrix X^(1) for the f individual inputs satisfies the desirable properties (P.1)-(P.3) in Section 2.2. Then the columns in the group design matrix G^A = (g_1^A, . . . , g_m^A) defined by the average (7.1) are uncorrelated (satisfying (P.1)), since the group variables are defined from mutually exclusive sets of uncorrelated columns in X^(1). Good space-filling properties for both X^(1) and G^A can be obtained by applying the distance criterion to both designs simultaneously. However, the columns of G^A do not necessarily have a common range.

As a simple example, let the output be a function of s1 + s2 inputs, each with the same coefficient,

        y = z_1 + z_2 + · · · + z_s1 + z_{s1+1} + · · · + z_{s1+s2},    (7.2)

where z_i ∈ [0, 1], i = 1, . . . , s1 + s2. Suppose that the s1 + s2 inputs are divided into 2 groups of sizes s1 and s2, respectively. Then the function in (7.2), rewritten as a function of the 2 group variables, is

        y = s1 g_1^A + s2 g_2^A.    (7.3)

Suppose the 2 group inputs in (7.3) range over g_1^A ∈ [a1, b1] and g_2^A ∈ [a2, b2] after the averaging in (7.1). Assuming that G_1^A and G_2^A are independent, uniformly distributed random variables on [a1, b1] × [a2, b2], so that Var(s_i G_i^A) = s_i²(b_i − a_i)²/12, the TESI of the ith group variable is computed as

        T_{g_i^A} = Var(E(y | G_i^A)) / Var(y) = s_i²(b_i − a_i)² / [s_1²(b_1 − a_1)² + s_2²(b_2 − a_2)²].    (7.4)

Thus the TESI of the ith group variable depends on the group range (b_i − a_i) as well as on the group size s_i (see Chapter 5 for details).

For simplicity, set s1 = s2 = 2. For an empirical study, the 40 × 4 Stage 1 design X^(1) shown in Table 7.2 was taken, and 3 different groupings were made from it. For each grouping, the ranges of the pairs of groups and their estimated TESIs using the GPM/SA code are shown in Table 7.3. The group ranges vary for the different groupings, and the group with the larger range has the larger estimate of the TESI. For example, in row 1 of Table 7.3, the TESI estimate of group 1, which has the larger range, is larger than the TESI estimate of group 2. This difference is due solely to the different ranges of the group variables, not to the magnitudes of the effects of any of the inputs. This kind of error can be reduced by making the ranges of the group variables as close as possible. As shown in the second grouping of Table 7.3, the 2 group inputs have very similar TESI estimates, 0.4999 and 0.5002, when their ranges are very close, 0.8253 and 0.8262.

152 ***************************************************************************************** obs z1 z2 z3 z4 * obs z1 z2 z3 z4 1 0.2386 0.9364 0.6214 0.8138 * 21 0.5839 0.5037 0.3789 1.0000 2 0.7991 0.1443 0.5785 0.7198 * 22 0.7122 0.2982 0.7767 0.5357 3 0.0230 0.9406 0.3100 0.4070 * 23 0.0863 0.4049 0.1270 0.8811 4 0.4878 0.2035 0.6589 0.2019 * 24 0.0941 0.9770 0.4523 0.3193 5 0.9364 0.3561 0.4680 0.2495 * 25 0.4312 0.5618 0.9398 0 6 0.5133 0.4265 0.4804 0.9356 * 26 0.1353 1.0000 1.0000 0.8170 7 0.5361 0.7130 0.6768 0.9648 * 27 0.3088 0.1005 0.8182 0.6992 8 0.5697 0.7694 0.9210 0.8800 * 28 0.6753 0.5005 0.1384 0.6780 9 0.1705 0.0101 0.4702 0.1546 * 29 0.4093 0.6103 0.7516 0.3488 10 0.2099 0.6415 0.3220 0.0637 * 30 0.3862 0.2171 0 0.2767 11 0.2865 0.5090 0.5446 0.7353 * 31 1.0000 0.6045 0.2927 0.1613 12 0.7658 0.9123 0.2446 0.3988 * 32 0.6060 0.1781 0.6707 0.6099 13 0.1627 0 0.5837 0.3410 * 33 0.9553 0.1295 0.8275 0.0538 14 0.8297 0.7756 0.8385 0.1279 * 34 0 0.2924 0.1322 0.6409 15 0.0397 0.5973 0.2474 0.0330 * 35 0.4461 0.5875 0.3515 0.1047 16 0.9270 0.8274 0.2777 0.4860 * 36 0.8665 0.9525 0.1098 0.5214 17 0.2570 0.8457 0.4012 0.4430 * 37 0.3186 0.8224 0.9423 0.0559 18 0.6684 0.3952 0.8183 0.9100 * 38 0.8828 0.9817 0.5813 0.4176 19 0.8186 0.4284 0.1237 0.8592 * 39 0.3452 0.1611 0.7010 0.7759 20 0.6298 0.8336 0.4524 0.6071 * 40 0.7273 0.2769 0.0848 0.4777 *****************************************************************************************

Table 7.2: 40 ×4 Stage 1 design X(1)

Grouping              Ranges of Groups    Estimates of TESIs   Adjusted TESIs
Group 1    Group 2    Group 1   Group 2   Group 1   Group 2    Group 1   Group 2
(z1, z2)   (z3, z4)   0.8509    0.7702    0.5503    0.4498     0.5255    0.4745
(z1, z3)   (z2, z4)   0.8253    0.8262    0.4999    0.5002     0.5001    0.4999
(z1, z4)   (z2, z3)   0.8025    0.8915    0.4474    0.5527     0.4734    0.5266

Table 7.3: Ranges and estimated TESIs of 2 groups under 3 different groupings
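As a quick numerical check of (7.4), the theoretical TESIs implied by the group ranges in row 1 of Table 7.3 can be computed in a few lines of MATLAB; with s1 = s2 = 2 they come out to approximately 0.550 and 0.450, close to the GPM/SA estimates 0.5503 and 0.4498.

% theoretical TESIs from (7.4) for the first grouping in Table 7.3
s = [2 2];                            % group sizes s_1, s_2
rng1 = [0.8509 0.7702];               % group ranges b_i - a_i from row 1
v = s.^2 .* rng1.^2;                  % each group's contribution to Var(y)
T = v / sum(v)                        % approximately [0.5497 0.4503]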

Adjustment of Group Effect

In the special case when the inputs in the same group have the same effect and the output is a linear function of the inputs, it is possible to adjust the TESIs so that they are less affected by the group ranges. The idea is to decrease a TESI inflated by a large group range, or to increase a TESI deflated by a small group range, by dividing the TESI T_{g_i^A} by its squared range (b_i − a_i)² and re-weighting. That is, the adjusted TESI of the ith group variable is defined to be

        T̃_{g_i^A} = [T_{g_i^A}/(b_i − a_i)²] / [T_{g_1^A}/(b_1 − a_1)² + T_{g_2^A}/(b_2 − a_2)²]
                  = (s_i²/D) / [(s_1²/D) + (s_2²/D)],  where D = s_1²(b_1 − a_1)² + s_2²(b_2 − a_2)²,
                  = s_i² / (s_1² + s_2²)    (7.5)

for i = 1, 2. Then, from (7.5), the adjusted TESI T̃_{g_i^A} depends only on the group sizes. The adjusted TESIs corresponding to the groupings in Table 7.3 are given in the last 2 columns of that table; they lessen the effect of the different ranges of the group variables.

However, this special case arises only occasionally. The output function cannot be written as a function of the group variables in the form of (7.3) unless the inputs within a group have the same magnitude of effect and the output is a linear function of the inputs, as in (7.2). Although one may try to collect inputs with similar effects into the same group, this may not be possible in practice. Moreover, the output in computer experiments is often complicated and likely to be non-linear.

Because of these limitations, the use of group variables defined by averaging was replaced by the new scheme (GSinCE) described in Section 2.3.

CHAPTER 8

SOFTWARE

This chapter provides software to (i) perform the GSinCE procedure of Chapter 2 (GSinCE Code), (ii) compute the sensitivity indices of Chapter 5 (Sensitivity Code), and (iii) generate maximin designs as in Chapter 6 (Maximin Code). The software is all implemented in MATLAB, and the codes are saved under the SOFTWARE directory of the computer experiment journal club (/home/comp_exp/SOFTWARE/GSinCE, /home/comp_exp/SOFTWARE/Sensitivity, and /home/comp_exp/SOFTWARE/Maximin) and are reproduced here.

8.1 GSinCE Code

This section describes the MATLAB code that implements the GSinCE procedure of Chapter 2. There are 13 functions which implement the Stage 1 phases (sampling, grouping, analysis) and the Stage 2 phases (sampling, analysis). In particular, GSinCE calls the GPM/SA code in the subdirectory gpmsa_070308 to predict the output using the GP model and to do the sensitivity analysis at each stage. The subdirectory Examples has 4 directories (ModifiedBorehole, ModifiedAircraftWing, ModifiedOTLC, ModifiedPistonSimulator), each of which contains a job file to perform the GSinCE procedure for the corresponding example described in Section 4.1.

Here, it is shown how to implement the GSinCE procedure for the borehole model of Section 4.1.1. The job file GS_ModifiedBorehole.m consists of two parts. The first part enters the information of the problem and assigns parameter values for the implementation of the procedure. The default values of these parameters are given below, but the user can change them as desired. All pre-specified information is saved as the MATLAB structure info. The second part is the implementation of the GSinCE procedure, which is executed by calling the source codes step by step. The results are saved as the MATLAB structures stage1 and stage2. See Section 4.1.1 for more details.

%%%%%%%%%%%%%%%%%%%%%%% Information of Problem %%%%%%%%%%%%%%%%%%%%%%%%%%%%
% specify the path of source files
addpath('../..')
addpath('../../gpmsa_070308/matlab')

% name of problem
info.problem = 'ModifiedBorehole';
info.exname = 'bore';

% number of inputs
info.f = 20;

% original range of the inputs
info.min = zeros(1,info.f);
info.max = ones(1,info.f);
info.min(1:8) = [0.05 100 63070 990 63.1 700 1120 9855];
info.max(1:8) = [0.15 50000 115600 1110 116 820 1680 12045];

%%%%%%%%%%%%% Parameters to be assigned for GSinCE procedure %%%%%%%%%%%%%%
% number of candidate designs to find the best maximin design in each stage
info.r = 10000;

% number of final MCMC draws
info.nburn = 100;

% sample interval for the production MCMC draws
info.nlev = 11;

% maximum group size for automatic grouping
info.maxsi = 5;

% familywise significance level in each stage for sign test
info.alpha = 0.2;

% fraction of the range of output for perturbation
info.tau = 1/4;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Stage 1 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% number of runs for Stage 1
stage1.n1 = 40

% preliminary design matrix
stage1.design =
          x: [40x39 double]
    maximin: 0.0024
       time: 535.1773

% output for Stage 1
stage1.output =
       y: [40x1 double]
    time: 0.0168

% automatic grouping with the maximum group size of 5
stage1.grouping =
       maxsi: 5
    grouping: [2 3 1 4 4 4 2]
     element: [7 6 19 8 4 1 17 3 10 14 13 9 12 11 16 18 2 15 20 5]
          gn: 7
        time: 0.0872

% number of low-impact inputs for Stage 1
stage1.dn1 = 32

% group design matrix for 7 groups and 32 low-impact columns
stage1.groupdesign =
       g: [40x7 double]
       l: [40x32 double]
    time: 0.0210

% prediction of GP model at group design points
stage1.prediction =
    yhat: [40x1 double]
    time: 30.0088
    coef: 25.1390

% sensitivity analysis for perturbed predicted output
% using 32 augmented designs of 7 group variables
stage1.sensitivity =
     ste: [32x8 double]
    time: [32x1 double]

% selection of active groups at Stage 1
stage1.selection =
     actg: [2 3 1 0 0 0 0]
     actx: [7 6 19 8 4 1]
    actxn: 6
     nctx: [17 3 10 14 13 9 12 11 16 18 2 15 20 5]
    nctxn: 14
     time: 0.0447
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Stage 2 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% total number of runs at Stage 2 (40 Stage 1 runs plus 30 new runs)
stage2.n2 = 70

% number of low-impact columns for Stage 2
stage2.dn2 = 19

% design matrix newly generated for Stage 2
stage2.design =
          x: [30x39 double]
    maximin: 0.0017
       time: 325.7332

% output for Stage 2
stage2.output =
        y: [30x1 double]
     time: 0.0011
    ycomb: [70x1 double]
     coef: 26.4138

% sensitivity analysis for perturbed combined output
% using 19 augmented designs of 6 potentially active inputs
stage2.sensitivity =
     ste: [19x7 double]
    time: [19x1 double]

% selection of a final set of active inputs
stage2.selection =
     finx: [7 6 4 1]
    finxn: 4
     time: 0.0037
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

8.2 Sensitivity Code

This section describes the MATLAB code that computes the sensitivity indices of Chapter 5. There are 10 functions required for the computation. The subdirectory Examples has 2 directories, quan_input and mixed_input, each of which contains a job file to compute the sensitivity indices for the corresponding example described in Sections 5.1.4 and 5.2.4.

The job file example_quan_comp.m generates data and computes the sensitivity indices for the example y = x1 + x2 + 3x1x3 in Section 5.1.4. In this case, the sensitivity indices can be computed by directly calling the MPERK code, since the MPERK code includes the function file for computing sensitivity indices of quantitative inputs. The sensitivity indices based on the Gaussian correlation function are shown below; the sensitivity indices based on the cubic correlation function can be obtained similarly, as shown in Table 5.2. Note that a constant mean for the GP model is assumed here, as described in Section 5.1.2. The MATLAB structure quan_mperk_gaussian.sens saves the estimated main effect and total effect sensitivity indices, the estimated total variance, and the estimated main effects over the grids.

%%%%%%%%%%%%% Sensitivity Analysis for Only Quantitative Inputs %%%%%%%%%%%%
% specify the path of MPERK code
addpath('/home/comp_exp/SOFTWARE/PERK/MPERK')

% generate design matrix
n = 30; f = 3;
A = lhsdesign(n,f);
B = A-repmat(mean(A,1),n,1);
[Q R] = qr(B,0);
x = (Q-repmat(min(Q),n,1)) ./ repmat(range(Q),n,1);

% compute output
y = x(:,1)+x(:,2)+3*x(:,1).*x(:,3);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%% Sensitivity Analysis for Only Quantitative Inputs %%%%%%%%%%%%
% inputs and output
    x1      x2      x3      y
  0       0.0552  0.8646  0.0552
  0.6106  0.0878  0.4434  1.5105
  0.9357  0.1525  0.0979  1.3630
  0.8007  0.7704  0.2298  2.1231
  0.1518  0.4108  0.1233  0.6188
  0.3799  1.0000  0.2446  1.6586
  0.5315  0.5452  0.4674  1.8218
  0.9588  0.7082  0.5626  3.2852
  0.1174  0.8290  0.4922  1.1197
  0.9162  0.7870  0.9376  4.2803
  0.7751  0.5855  0.5259  2.5834
  0.4156  0.8971  0.9557  2.5044
  0.1855  0.3959  0.6445  0.9400
  0.5032  0.5154  1.0000  2.5281
  0.5649  0.2559  0.1945  1.1505
  0.0787  0.2157  0.7786  0.4782
  0.7064  0.3289  0.3165  1.7059
  0.2106  0.9711  0.6601  1.5986
  0.4660  0.9170  0.7914  2.4894
  0.6135  0.6924  0.0493  1.3966
  0.2606  0.4904  0.2153  0.9194
  0.3119  0.2865  0.1784  0.7653
  0.7405  0.6332  0.6567  2.8327
  0.2941  0.1087  0.9290  1.2224
  0.8802  0.1757  0.3608  2.0086
  1.0000  0       0.8979  3.6937
  0.8259  0.4666  0.7387  3.1229
  0.3534  0.6608  0       1.0142
  0.6643  0.8563  0.5686  2.6538
  0.0433  0.3583  0.4018  0.4538
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%% Sensitivity Analysis for Only Quantitative Inputs %%%%%%%%%%%%
% do the sensitivity analysis when the Gaussian correlation function is used
quan_mperk_gaussian = mperk('X',x,'Y',y,'CorrelationFamily','Gaussian',...
    'SensitivityAnalysis','Yes')

quan_mperk_gaussian.sens =
           sme: [0.6108 0.0965 0.2199]
           ste: [0.6840 0.0961 0.2929]
      totalvar: 0.8535
         ngrid: 21
    maineffect: [21x3 double]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

The job file example_mixed_comp.m generates data and computes the sensitivity indices for the example y = (x1 + x2)I(t = 1) + 3x1 I(t = 2) in Section 5.2.4. In this case, the 3 correlation parameters (θ1, θ2, c) and the mean and variance parameters are estimated by maximum likelihood using an extension of the existing MPERK code. The estimated parameters are plugged into the formulas (5.81), (5.82), and (5.83) derived in Section 5.2.3. The MATLAB structure sens saves the estimated main effect and total effect sensitivity indices, the estimated total variance, and the estimated correlation parameters.

%%%%%%%%%%%%% Sensitivity Analysis for Mixed Inputs %%%%%%%%%%%%%%%%%%
% generate design matrix
n = 30; f = 2; r = 1;
A = lhsdesign(n,f+r);
B = A-repmat(mean(A,1),n,1);
[Q R] = qr(B,0);
C = (Q-repmat(min(Q),n,1)) ./ repmat(range(Q),n,1);
x = C(:,1:f);        % quantitative inputs
t = zeros(n,r);      % qualitative input
for i = 1:n
    if C(i,3) <= 1/2, t(i,1) = 1; else, t(i,1) = 2; end
end

% compute output
y = zeros(n,1);
for i=1:n
    if t(i,1) == 1, y(i) = x(i,1)+x(i,2);
    elseif t(i,1) == 2, y(i) = 3*x(i,1);
    end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%% Sensitivity Analysis for Mixed Inputs %%%%%%%%%%%%%%%%%%
% inputs and output
    x1      x2      t       y
  0       0.0552  2.0000  0
  0.6106  0.0878  1.0000  0.6984
  0.9357  0.1525  1.0000  1.0882
  0.8007  0.7704  1.0000  1.5710
  0.1518  0.4108  1.0000  0.5626
  0.3799  1.0000  1.0000  1.3799
  0.5315  0.5452  1.0000  1.0766
  0.9588  0.7082  2.0000  2.8764
  0.1174  0.8290  1.0000  0.9464
  0.9162  0.7870  2.0000  2.7485
  0.7751  0.5855  2.0000  2.3252
  0.4156  0.8971  2.0000  1.2469
  0.1855  0.3959  2.0000  0.5565
  0.5032  0.5154  2.0000  1.5095
  0.5649  0.2559  1.0000  0.8209
  0.0787  0.2157  2.0000  0.2361
  0.7064  0.3289  1.0000  1.0353
  0.2106  0.9711  2.0000  0.6317
  0.4660  0.9170  2.0000  1.3981
  0.6135  0.6924  1.0000  1.3059
  0.2606  0.4904  1.0000  0.7510
  0.3119  0.2865  1.0000  0.5984
  0.7405  0.6332  2.0000  2.2216
  0.2941  0.1087  2.0000  0.8823
  0.8802  0.1757  1.0000  1.0559
  1.0000  0       2.0000  3.0000
  0.8259  0.4666  2.0000  2.4778
  0.3534  0.6608  1.0000  1.0142
  0.6643  0.8563  2.0000  1.9929
  0.0433  0.3583  1.0000  0.4016
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%% Sensitivity Analysis for Mixed Inputs %%%%%%%%%%%%%%%%%%
% estimate correlation parameters
ctype = 0;          % Gaussian correlation function
fittype = 0;        % Maximum Likelihood method
F = ones(n,1);      % constant mean model
corparms.ctype = ctype;
corparms.fittype = fittype;
initnum = 5;
[corparms, loglik] = estgaussiancor_mixed(x,t,y,F,corparms,initnum);
theta = corparms.scale';
power = ones(f,1)*2;
c = corparms.c;
R = cormatexp_mixed(x,t,theta,power,c);   % correlation matrix
invR = inv(R);

% calculate the estimates of beta and sigma2
beta = inv(F'*invR*F)*F'*invR*y;
sigma2 = (1/n)*((y-F*beta)'*invR*(y-F*beta));
invSigma = invR/sigma2;

% do the sensitivity analysis
sens = sensitivity_mixed(x,t,y,beta,sigma2,invSigma,theta,c,ctype)

sens =
         sme: [0.6399 0.0401 0.1202]
         ste: [0.7998 0.0800 0.3201]
    totalvar: 0.5206
      params: [0.4502 0.0321 0.1731]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

8.3 Maximin Code

This section describes the MATLAB code that implements the algorithms for obtaining maximin designs in Chapter 6. The files SSLHD.m, OSGSD_d2.m, and OSGSD_dk.m are functions which find a best design from one starting design based on the three different algorithms: SSLHD.m implements the smart swap algorithm; OSGSD_d2.m implements the orthogonal swap method using d^(2)_min; and OSGSD_dk.m implements the orthogonal swap method using ϕp. There are three example files (example_SSLHD.m, example_OSGSD_d2.m, example_OSGSD_dk.m) that implement SSLHD.m, OSGSD_d2.m, and OSGSD_dk.m, respectively, each example being set to run for a user-specified time limit.

The following is the output from an example implementation of the OSGSD-ϕp algorithm based on p = 15 and 4-dimensional rectangular distance for a best 9 × 4 design, with a 420-second upper bound on the computation time, as shown in Tables 6.3 and 6.4. The MATLAB structure finalOS saves the resulting design, the number of starting designs, the computation times, and various performance measures of the resulting design, such as ϕp, ρ^2_ave, d^(k)_min, |ρ|_max, and d^(2)_min.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%% specify a dimension of design
n = 9;              % number of rows
k = 4;              % number of columns

%%%%% specify parameters for distance metric and phi_p
metric = 1;
p = 15;

%%%%% specify a time limit
timelimit = 420;

%%%%%%%%%%%%%%%%%%%%% OSGSD algorithm using phi_p %%%%%%%%%%%%%%%%%%%%%%%%%
% try multiple starting designs within time limit
OS_time = 0;
w = 0;

% repeat until the time limit is reached
% call OSGSD_dk.m function file
while OS_time < timelimit
    w = w+1;
    OS(w) = OSGSD_dk(n,k,metric,p);
    OS_time = OS_time + OS(w).time;
end

% if the last run exceeds the time limit, then trim total elapsed time
if OS_time == timelimit
    OS_w = w;
else
    OS_time = OS_time - OS(w).time;
    OS_w = w-1;
end

% determine the global best design among the best designs
% corresponding to each starting design
temp = zeros(OS_w,1);
for i=1:OS_w
    temp(i) = OS(i).phi;
end
% find the design which minimizes the phi_p value
[OS_phi OS_index] = min(temp);
finalOS.X.orig = OS(OS_index).X;
finalOS.starting = OS_w;
finalOS.totaltime = OS_time;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%% Calculate various performance measures based on scaled design %%%%%%
finalOS.X.scaled.design = finalOS.X.orig;
finalOS.X.scaled.p = p;

% calculate (d^{(k)}(x_i,x_j))^{-p} for all pairs of rows
dist = zeros(nchoosek(n,2),1);
A = zeros(n,n);
t = 1;
for i = 1:n-1
    for j = i+1:n
        dist(t) = (norm(finalOS.X.scaled.design(i,:)...
            -finalOS.X.scaled.design(j,:),metric))^(-p);
        A(i,j) = dist(t);
        t = t+1;
    end
end

% calculate phi_p value
finalOS.X.scaled.phi = sum(dist)^(1/p);

% calculate minimum k-dim distance d^{(k)}_{min}
finalOS.X.scaled.dkmin = (max(dist))^(-1/p);

% calculate average of squared correlation
temp = corr(finalOS.X.scaled.design).^2;
sumcorr = 0;
for i = 1:k-1
    for j = i+1:k
        sumcorr = sumcorr + temp(i,j);
    end
end
finalOS.X.scaled.avgsqrcorr = sumcorr / (k*(k-1)/2);

% calculate maximum of absolute correlation
temp = abs(corr(finalOS.X.scaled.design));
for i = 1:k
    temp(i,i) = 0;
end
finalOS.X.scaled.maxabscorr = max(max(temp));

% calculate all inter-point distances over all projections of the design
% onto every 2-dimensional subspace
dist = zeros(nchoosek(n,2),nchoosek(k,2));
t = 1;
for h = 1:k-1
    for l = h+1:k
        dist(:,t) = pdist(finalOS.X.scaled.design(:,[h l]));
        t = t+1;
    end
end

% calculate d^{(2)}_{min}
finalOS.X.scaled.d2min = min(min(dist));
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% show the results

>> finalOS.X.scaled.design
    0.1437    1.0000    0.8647    0.8883
    0.1371    0.0703    0         0.8546
    0.9717    0.0413    0.8623    0.8454
    0         0.0096    0.7763    0.1000
    0.8642    0.6996    1.0000    0.0057
    0.6881    0.7096    0.2875    1.0000
    0.0017    0.9269    0.2941    0.0000
    0.7737    0         0.1756    0
    1.0000    0.9604    0.0156    0.2457

>> finalOS.X.scaled.p
    15
>> finalOS.X.scaled.phi
    0.7562
>> finalOS.X.scaled.avgsqrcorr
    7.8642e-33
>> finalOS.X.scaled.dkmin
    1.4840
>> finalOS.X.scaled.maxabscorr
    1.6653e-16
>> finalOS.X.scaled.d2min
    0.0305
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

BIBLIOGRAPHY

Altman, D. G. and Bland, J. M. (1994), “Diagnostic tests. 1: Sensitivity and specificity,” British Medical Journal, 308, 1552.

Bartz-Beielstein, T. (2006), Experimental Research in Evolutionary Computation - The New Experimentalism, Berlin: Springer.

Ben-Ari, E. N. and Steinberg, D. M. (2007), “Modeling Data from Computer Experiments: An Empirical Comparison of Kriging with MARS and Projection Pursuit Regression,” Quality Engineering, 19, 327–338.

Benjamini, Y. and Hochberg, Y. (1995), “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing,” Journal of the Royal Statistical Society, Series B: Methodological, 57, 289–300.

Bingham, D., Sitter, R. R., and Tang, B. (2009), “Orthogonal and Nearly Orthogonal Designs for Computer Experiments,” Biometrika, 96, 51–65.

Butler, N. A. (2001), “Optimal and orthogonal Latin hypercube designs for computer experiments,” Biometrika, 88, 847–857.

Campbell, K., McKay, M. D., and Williams, B. J. (2006), “Sensitivity analysis when model outputs are functions,” Reliability Engineering and System Safety, 91, 1468–1472.

Chipman, H. (2006), “Prior Distributions for Bayesian Analysis of Screening Experiments,” in Screening: Methods for Experimentation in Industry, Drug Discovery and Genetics, eds. Dean, A. M. and Lewis, S. M., New York: Springer Verlag, pp. 235–267.

Cioppa, T. M. and Lucas, T. W. (2007), “Efficient Nearly Orthogonal and Space-Filling Latin Hypercubes,” Technometrics, 49, 45–55.

Conover, W. J. (1999), Practical Nonparametric Statistics, John Wiley & Sons.

Conti, S. and O’Hagan, A. (2006), “Bayesian Emulation of Complex Multi-Output and Dynamic Computer Models.”

Dorfman, R. (1943), “The detection of defective members of large populations,” The Annals of Mathematical Statistics, 14, 436–440.

Fang, K. T., Li, R., and Sudjianto, A. (2005), Design and Modeling for Computer Experiments, London: Chapman and Hall.

Fisher, R. (1921), “On the ‘probable error’ of a coefficient of correlation deduced from a small sample,” Metron, 1, 3–32.

Fogelson, A., Kuharsky, A., and Yu, H. (2003), “Computational Modeling of Blood Clotting: Coagulation and Three-dimensional Platelet Aggregation,” in Polymer and Cell Dynamics: Multiscale Modeling and Numerical Simulations, eds. Alt, W., Chaplain, M., Griebel, M., and Lenz, J., Basel: Birkhauser-Verlag, pp. 145–154.

Forrester, A., Sóbester, A., and Keane, A. (2008), Engineering Design via Surrogate Modelling, Chichester: Wiley.

Gattiker, J. (2005), “Using the Gaussian Process Model for Simulation Analysis (GPM/SA) Code,” Tech. Rep. LA-UR-05-5215, Los Alamos National Laboratory.

Han, G., Santner, T. J., Notz, W. I., and Bartel, D. L. (2009), “Prediction for Computer Experiments Having Quantitative and Qualitative Input Variables,” Technometrics, 51, 278–288.

Higdon, D., Gattiker, J., Williams, B., and Rightley, M. (2008), “Computer Model Calibration using High Dimensional Output,” Journal of the American Statistical Association, 103, 570–583.

Higdon, D., Kennedy, M., Cavendish, J., Cafeo, J., and Ryne, R. (2004), “Combining field data and computer simulations for calibration and prediction,” SIAM Journal of Scientific Computing, 26, 448–466.

Homma, T. and Saltelli, A. (1996), “Importance measures in global sensitivity analysis of model output,” Reliability Engineering and System Safety, 52, 1–17.

Johnson, M. E., Moore, L. M., and Ylvisaker, D. (1990), “Minimax and Maximin Distance Designs,” Journal of Statistical Planning and Inference, 26, 131–148.

Johnson, R. A. and Wichern, D. W. (1998), Applied Multivariate Statistical Analysis, Prentice-Hall Inc.

Joseph, V. R. and Delaney, J. D. (2007), “Functionally Induced Priors for the Analysis of Experiments,” Technometrics, 49, 1–11.

Joseph, V. R. and Hung, Y. (2008), “Orthogonal-Maximin Latin Hypercube Designs,” Statistica Sinica, 18, 171–186.

Joseph, V. R., Hung, Y., and Sudjianto, A. (2008), “Blind Kriging: A New Method for Developing Metamodels,” ASME Journal of Mechanical Design, 130, 031102-1–8.

Kenett, R. and Zacks, S. (1998), Modern Industrial Statistics: Design and Control of Quality and Reliability, Duxbury Press.

Kennedy, M. C. and O’Hagan, A. (2000), “Predicting the Output from a Complex Computer Code When Fast Approximations Are Available,” Biometrika, 87, 1–13.

Kleijnen, J. P. C. (1987), “Review of Random and Group-screening Designs,” Communications in Statistics: Theory and Methods, 16, 2885–2900.

Lanning, D. D., Beyer, C. E., and Berna, G. A. (1997), “FRAPCON-3: Integral Assessment,” Tech. rep., Pacific Northwest National Laboratory.

Lempert, R., Williams, B. J., and Hendrickson, J. (2002), “Using Global Sensitivity Analysis to Understand Policy Effects and to Aid in New Policy Construction in Integrated Assessment Models,” Unpublished technical report, RAND.

Lewis, S. M. and Dean, A. M. (2001), “Detection of Interactions in Experiments on Large Numbers of Factors,” Journal of the Royal Statistical Society, Series B: Statistical Methodology, 63, 633–672.

Lin, C. D., Mukerjee, R., and Tang, B. (2009), “Construction of Orthogonal and Nearly Orthogonal Latin Hypercubes,” Biometrika, 96, 243–247.

Linkletter, C., Bingham, D., Hengartner, N., Higdon, D., and Ye, K. Q. (2006), “Variable Selection for Gaussian Process Models in Computer Experiments,” Technometrics, 48, 478–490.

Loeppky, J. L., Sacks, J., and Welch, W. (2009), “Choosing the Sample Size of a Computer Experiment: A Practical Guide,” Technometrics, 51, 366–376.

McKay, M. D., Beckman, R. J., and Conover, W. J. (1979), “A comparison of three methods for selecting values of input variables in the analysis of output from a computer code,” Technometrics, 21, 239–245.

McMillan, N. J., Sacks, J., Welch, W. J., and Gao, F. (1999), “Analysis of Protein Activity Data by Gaussian Stochastic Process Models,” Journal of Biopharmaceutical Statistics, 9, 145–160.

Moon, H., Santner, T. J., and Dean, A. M. (2010), “Two-stage Sensitivity-based Group Screening in Computer Experiments,” Submitted for Publication.

Morris, M. D. (2006), “An Overview of Group Factor Screening,” in Screening: Methods for Experimentation in Industry, Drug Discovery and Genetics, eds. Dean, A. M. and Lewis, S. M., New York: Springer Verlag, pp. 191–206.

Morris, M. D. and Mitchell, T. J. (1983), “Two-level Multifactor Designs for Detecting the Presence of Interactions,” Technometrics, 25, 345–355.

— (1995), “Exploratory Designs for Computational Experiments,” Journal of Statistical Planning and Inference, 43, 381–402.

Oakley, J. E. and O’Hagan, A. (2004), “Probabilistic Sensitivity Analysis of Complex Models: A Bayesian Approach,” Journal of the Royal Statistical Society, Series B: Statistical Methodology, 66, 751–769.

Ong, K., Lehman, J., Notz, W. I., Santner, T. J., and Bartel, D. L. (2006), “Acetabular Cup Geometry and Bone-Implant Interference have More Influence on Initial Periprosthetic Joint Space than Joint Loading and Surgical Cup Insertion,” Journal of Biomechanical Engineering, 148, 169–175.

Ong, K., Santner, T. J., and Bartel, D. L. (2008), “Robust Design for Acetabular Cup Stability Accounting for Patient and Surgical Variability,” Journal of Biomechanical Engineering, 130, 031001-11.

Owen, A. B. (1994), “Controlling Correlations in Latin Hypercube Samples,” Journal of the American Statistical Association, 89, 1517–1522.

Qian, P. Z. G., Wu, H., and Wu, C. F. J. (2008), “Gaussian Process Models for Computer Experiments With Qualitative and Quantitative Factors,” Technometrics, 50, 383–396.

Sacks, J., Welch, W. J., Mitchell, T. J., and Wynn, H. P. (1989), “Design and analysis of computer experiments,” Statistical Science, 4, 409–423.

Saltelli, A., Chan, K., and Scott, E. (2000), Sensitivity Analysis, Chichester: John Wiley & Sons.

Santner, T. J., Williams, B. J., and Notz, W. I. (2003), The Design and Analysis of Computer Experiments, New York: Springer Verlag.

Schonlau, M. and Welch, W. J. (2006), “Screening the Input Variables to a Computer Model Via Analysis of Variance and Visualization,” in Screening: Methods for Experimentation in Industry, Drug Discovery and Genetics, eds. Dean, A. M. and Lewis, S. M., New York: Springer Verlag, pp. 308–327.

Sobol', I. M. (1993), “Sensitivity analysis for non-linear mathematical models,” Mathematical Modeling and Computational Experiment, 1, 407–414.

Steinberg, D. M. and Lin, D. K. J. (2006), “A Construction Method for Orthogonal Latin Hypercube Designs,” Biometrika, 93, 279–288, correction p. 1025.

Sun, F., Liu, M.-Q., and Lin, D. K. J. (2009), “Construction of Orthogonal Latin Hypercube Designs,” Biometrika, 96, 971–974.

Tang, B. (1993), “Orthogonal array-based latin hypercubes,” Journal of the American Statistical Association, 88, 1392–1397.

— (1998), “Selecting Latin Hypercubes Using Correlation Criteria,” Statistica Sinica, 8, 965–978.

Upton, M. L., Guilak, F., Laursen, T. A., and Setton, L. A. (2006), “Finite Element Modeling Predictions of Region-specific Cell-matrix Mechanics in the Meniscus,” Biomechanics and Modeling in Mechanobiology, 5, 140–149.

Vine, A. E., Lewis, S. M., and Dean, A. M. (2005), “Two-stage Group Screening in the Presence of Noise Factors and Unequal Probabilities of Active Effects,” Statistica Sinica, 15, 871–888.

Vine, A. E., Lewis, S. M., Dean, A. M., and Brunson, D. (2008), “A Critical Assessment of Two-Stage Group Screening Through Industrial Experimentation,” Technometrics, 50, 15–25.

Watson, G. S. (1961), “A Study of the Group Screening Method (Com: V5 P397-398; V7 P444-446),” Technometrics, 3, 371–388.

Welch, W. J. (1985), “ACED: Algorithms for the Construction of Experimental Designs,” The American Statistician, 39, 146.

Welch, W. J., Buck, R. J., Sacks, J., Wynn, H. P., Mitchell, T. J., and Morris, M. D. (1992), “Screening, predicting, and computer experiments,” Technometrics, 34, 15–25.

Worley, B. A. (1987), “Deterministic Uncertainty Analysis,” ORNL-6428, available from National Technical Information Service, 5285 Port Royal Road, Springfield, VA 22161.

Wu, Y., Boos, D. D., and Stefanski, L. A. (2007), “Controlling Variable Selection by the Addition of Pseudovariables,” Journal of the American Statistical Association, 102, 235–243.

Ye, K. Q. (1998), “Orthogonal Column Latin Hypercubes and Their Application in Computer Experiments,” Journal of the American Statistical Association, 93, 1430–1439.