© 1985 American Statistical Association and the American Society for Quality Control

TECHNOMETRICS, MAY 1985, VOL. 27, NO. 2

An Examination of Response-Surface Methodologies for Uncertainty Analysis in Assessment Models

D. J. Downing (Computer Services), R. H. Gardner (Environmental Sciences Division), and F. O. Hoffman (Health and Safety Research Division)

Oak Ridge National Laboratory, Martin Marietta Energy Systems, Inc., Oak Ridge, TN 37831

Two techniques of uncertainty analysis were applied to a mathematical model that estimates the dose-equivalent to man from the concentration of radioactivity in air, water, and food. The response-surface method involved screening the model to determine the important parameters, developing the response-surface equation, calculating the moments from the response-surface model, and fitting a Pearson or Johnson distribution using the calculated moments. The second method sampled model inputs by Latin hypercube methods and iteratively simulated the model to obtain an empirical estimate of the cdf. Comparison of the two methods indicates that it is often difficult to ascertain the adequacy or reliability of the response-surface method. The empirical method is simpler to implement and, because all model inputs are included in the analysis, it is also a more reliable estimator of the cumulative distribution function of the model output than the response-surface method.

KEY WORDS: Uncertainty analysis; Experimental design; Response surfaces; Factorial design; Latin hypercube; Distribution fitting.

1. INTRODUCTION

Mathematical models are important tools in predicting the fate and effect of various environmental pollutants (e.g., Bartell et al. 1981). Many different models are now being used to guide decisions concerning environmental regulations, the cost effectiveness of new studies and data, the comparative risks associated with new technologies, and the documentation of potential environmental impacts. The uncertainties associated with model-based decisions are a direct consequence of the adequacy of the model as a descriptor of the relevant physical and biological processes and of the accuracy of estimation of the model parameters. Validation studies are the most satisfactory method of determining the usefulness and reliability of these models (see Snee 1977 or Mankin et al. 1977), but because validation studies are expensive and time consuming, few have been done.

The need to quantitatively describe the effects of model uncertainties has led to the development and improvement of Monte Carlo methods (see Gardner 1983). Monte Carlo methods propagate the uncertainties associated with model parameters (inputs) by iteratively selecting random values from their respective probability density functions, simulating the system to obtain the unique set of predictions (outputs) associated with each parameter set, and then analyzing the uncertainties of the model outputs as a function of their inputs. Monte Carlo methods are most useful when the number of inputs to be varied is small and each model simulation is inexpensive. Special sampling methods such as those developed by Iman and Conover (1980) can be used to reduce the cost and extend the usefulness of Monte Carlo methods, but some models are still too expensive for Monte Carlo methods.

The purpose of this article is to examine a model of dose assessment developed by Hoffman et al. (1982) to determine the relative merits of two statistical techniques of evaluating model uncertainties. The first technique can be called the response-surface technique and consists of the following steps: (a) screening to determine the subset of important inputs, (b) response-surface modeling to achieve a proxy to the original code, (c) obtaining moments of the response-surface model, and (d) fitting a Pearson or Johnson distribution to the moments to obtain a statistical model of the proxy to the output distribution. The second technique may be termed the empirical approach and consists of two steps: (a) obtaining a Latin hypercube sample from the set of all of the inputs and (b) obtaining the empirical cumulative distribution function of the output resulting from using the Latin hypercube sample on the inputs.
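To make the Monte Carlo procedure just described concrete, the following is a minimal sketch (in Python, a language chosen purely for illustration; the original work predates it). The two-input `model` function and the distributions assigned to its inputs are invented stand-ins, not the dose-assessment code examined in this article.

```python
import numpy as np

rng = np.random.default_rng(42)

def model(x1, x2):
    """Stand-in for a deterministic assessment code h(X1, ..., Xk)."""
    return x1 * np.exp(0.5 * x2)

n_runs = 10_000

# (1) Iteratively select random values from each input's assumed pdf.
x1 = rng.lognormal(mean=0.0, sigma=0.5, size=n_runs)
x2 = rng.normal(loc=1.0, scale=0.2, size=n_runs)

# (2) Simulate the system once per parameter set.
y = model(x1, x2)

# (3) Analyze the uncertainty of the output as a function of the inputs.
print(f"mean = {y.mean():.3f}, sd = {y.std(ddof=1):.3f}")
print(f"95th percentile = {np.percentile(y, 95):.3f}")
```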


Table 1. Input Parameters for the SRDOSE Model

Variable                                             Distribution   Mean       Standard    Minimum   Maximum
                                                                               Deviation
Dose Conversion Factor (DFI, rem/µCi 90Sr)           lognormal      1.575      .289
Weathering Loss
  Leaf (TW leaf, days)                               lognormal      12.795     4.5291
  Nonleaf (TW nonleaf, days)                         lognormal      21.926     9.5982
  Pasture (TW pasture, days)                         lognormal      12.799     4.5291
Interception Fraction Normalized for Biomass
  Leaf (RY leaf, m2/kg)                              lognormal      .1201      .0790
  Nonleaf (RY nonleaf, m2/kg)                        lognormal      .0725      .0486
  Pasture (RY pasture, m2/kg)                        lognormal      1.9523     .6911
Soil Surface Density (P, kg/m2)                      lognormal      214.0159   23.6135
Loss Rate From Soil Root Zone (LAMS, 1/days)         lognormal      2.70E-4    4.05E-4
Milk Transfer Coefficient (FM, days/L)               lognormal      1.38E-3    7.03E-4
Meat Transfer Coefficient (FF, days/kg)              lognormal      4.73E-4    2.73E-4
Amount of Feed Consumed per Day
  Dairy Cows (QF milk, kg/d)                         normal         9.700      2.300
  Cattle (QF meat, kg/d)                             normal         9.100      1.600
Annual Rate of Consumption of Food by Humans
  Leaf (U leaf, kg/yr)                               lognormal      19.1815    7.0861
  Nonleaf (U nonleaf, kg/yr)                         lognormal      54.7179    37.4622
  Milk (U milk, L/yr)                                lognormal      112.739    72.540
  Meat (U meat, kg/yr)                               lognormal      100.304    37.7599
Plant/Soil Concentration Ratio
  Leaf (B leaf, Sr/kg leaf per Sr/kg soil)           lognormal      .3754      .1955
  Nonleaf (B nonleaf, Sr/kg nonleaf per Sr/kg soil)  lognormal      .1292      .1490
  Pasture (B pasture, Sr/kg pasture per Sr/kg soil)  lognormal      2.2848     2.9805
Time Period Crops Exposed
  Leaf (TE leaf, days)                               triangular     75.0                   40.0      120.0
  Nonleaf (TE nonleaf, days)                         triangular     100.0                  60.0      180.0
  Milk (TE milk, days)                               triangular     30.0                   15.0      200.0
  Meat (TE meat, days)                               triangular     40.0                   15.0      200.0

2. DESCRIPTION OF MODEL

Compliance with radiation-protection standards typically is determined by mathematical models that convert measurable quantities of radioactivity in air, water, and food into an estimate of the dose-equivalent to man (absorbed energy per gram of human tissue, expressed in rems). The computer model SRDOSE calculates the dose due to the ingestion of 90Sr from four possible pathways: leafy crops, nonleafy crops, milk, and meat. The program contains 24 inputs that may vary and five additional inputs that are fixed and thus are of no concern. The model is of moderate complexity, but it is efficient, and large Monte Carlo runs are not overly expensive.

The conceptual structure and algorithms of the models used for these pathways are essentially identical with those in U.S. Nuclear Regulatory Commission (1977). An additional term is included to account for the downward migration of a deposited radionuclide out of the assumed 15 cm root zone of soil. Loss terms to account for radiological decay of the radioisotopes between harvest and human consumption of vegetation or animal food products are excluded. This is because the radiological half-life of 90Sr is on the order of 30 years, and thus only a negligible amount of the nuclide is lost through radiological decay during the comparatively short time period between harvest and consumption of food products. Table 1 contains the inputs and distributions used in the analysis. It is assumed that the inputs are independent of one another. This is physically untrue, but it is a simplifying assumption made for lack of data about the true nature of the dependence. A more detailed description of the model and its associated algorithms can be found in Hoffman et al. (1982).

In general we may let X_1, X_2, ..., X_k denote the k inputs (for the SRDOSE model, k = 24) and Y denote the output (for the SRDOSE model, the dose due to ingestion of 90Sr). There is a deterministic relationship between Y and X_1, X_2, ..., X_k, so that

Y = h(X_1, X_2, ..., X_k).  (2.1)

The problem is that it is not feasible to investigate h directly, due to time, cost, and sheer physical complexity. For those reasons we need to utilize methods that will allow us to approximate the behavior of h (or, equivalently, Y) while remaining ignorant of the true form of h.
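Table 1 uses only three distribution families, so drawing a full input vector is mechanical. A sketch follows, using three representative rows of the table; the conversion of Table 1's mean and standard deviation to log-scale parameters, and the reading of the triangular rows' first column as the mode, are our assumptions about the table's conventions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

def lognormal_from_moments(mean, sd, size):
    """Draw lognormal variates given the mean and sd of the variate itself
    (assumed to be what Table 1 reports, not the log-scale parameters)."""
    sigma2 = np.log(1.0 + (sd / mean) ** 2)
    mu = np.log(mean) - sigma2 / 2.0
    return rng.lognormal(mean=mu, sigma=np.sqrt(sigma2), size=size)

# Three of the SRDOSE inputs, with parameters read off Table 1:
dfi = lognormal_from_moments(1.575, 0.289, n)        # dose conversion factor
qf_milk = rng.normal(loc=9.7, scale=2.3, size=n)     # feed consumed, dairy cows
te_leaf = rng.triangular(left=40.0, mode=75.0, right=120.0, size=n)  # exposure time

# The inputs are sampled independently, matching the paper's assumption.
print(dfi.mean(), qf_milk.mean(), te_leaf.mean())
```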

In complex computer codes where k is very large, we need to determine the input variables whose effect on h is substantial. This reduces the number of input variables to be considered and is termed screening. The importance of screening is that it reduces the dimensionality of the problem and decreases the effort needed to find some approximating function to h. Inputs considered unimportant in the screening are held fixed at their nominal values in the subsequent analysis, where we are trying to approximate h by some simpler function, usually a polynomial in the inputs called a response-surface model. Obviously, poor screening will leave us with a response-surface model that is a poor approximation to the true function h.

Another concept in the uncertainty analysis of a computer model is that the inputs are random variables, usually with some unknown distribution. In some cases data exist to estimate this distribution, but this is the exception rather than the rule. The choice of distribution is a matter of great concern in uncertainty analysis, as it affects the distribution of the output. Furthermore, the assumption of independence in this example is chosen more from ignorance than from necessity. Programs exist (see Iman and Conover 1982) that allow correlations between input variables to be specified. In this problem we did not specify any correlations because no data exist to estimate them. A robustness study of the effect of correlations would be a valuable exercise and would point to improvement in the overall understanding of the model.

Since Y is a function of X_1, X_2, ..., X_k, which are random variables, Y is also a random variable. The uncertainty analysis centers on the distribution of Y and statements about it. In risk assessment, concern is usually with the extremes of Y. In our example, when the dosage to man exceeds some critical value, say x, the risk to man from cancer is serious. Thus uncertainty analysis may be concerned with calculating the probability that Y will exceed x, given our present knowledge (choice of model) and current conditions (input distribution). The focus of uncertainty analysis is not entirely on the output distribution function, but it is the major concern of our article. For more information on uncertainty analysis, see Gardner (1983).

The reader may wonder why a response-surface model should be built if what one wants is the pdf of Y. The answer is that these complex computer models are costly to run on a routine basis, and the number of runs needed to estimate the pdf of Y to within practical limits of error would deplete both the budget and the computer resources. The number of runs is even larger if the objective is to estimate the extreme percentiles of the pdf. What one would like is a model that closely approximates the complex code and is simple and inexpensive to run. The response-surface model answers this need, and an established methodology exists for its analysis. Later sections show how it can be utilized to approximate the pdf of Y. The next section deals with the first step in implementing the response-surface technique: screening.

3. SCREENING TECHNIQUES

Let the input variables be denoted by X_j (j = 1, 2, ..., k) and the output variable by Y. We assume that the distribution of X_j is known, and our objective is to select a subset p of the k inputs so that the response-surface model will contain the fewest terms possible and still be a good representation of the computer model. The complexity of the response surface is controlled by the complexity of the computer model; the aim here is not to add parameters to the response-surface model unnecessarily. An initial evaluation (screening) of the system is useful to determine the most influential inputs. If resources for measuring inputs are limited, the screening indicates which inputs should receive the greatest portion of those resources. Several commonly used screening techniques are: (a) subjective, (b) differential sensitivity analysis, (c) one-at-a-time design, (d) rank order correlation, and (e) the adjoint method.

The subjective method involves modelers and experienced investigators working together to discard inputs thought to be unimportant. This method is the least scientific and is prone to large personal biases, but it may be necessary for a first cut at a computer code with a large number of inputs. The use of this method might necessitate creating a designed experiment in which one could check for inadequacies of the initial screening decisions.

In differential sensitivity analysis, one calculates the partial derivative for each input variable. The sensitivity coefficient, a_j, is defined as the partial derivative of Y with respect to the input X_j. That is,

a_j = ∂Y/∂X_j.  (3.1)

Assuming that Y is linear in X_j, we can estimate the sensitivity coefficient by the ratio of the percentage change in the output Y from its nominal value Y_0 (the value of the output when all of the inputs are set at their nominal values) to the percentage change in the input X_j from its nominal value X_j0,

â_j = (ΔY/Y_0) / (ΔX_j/X_j0),  (3.2)

and treat â_j as an estimate of the sensitivity coefficient. This technique is simple and intuitively appealing, but it depends strongly on the assumption that the input-output relationship is linear. In practice, the percentage change in X_j is arbitrarily set at 1% of its nominal value. A 1% change is meaningful in some applications but unrealistic in others.
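The estimator in Equation (3.2) amounts to a one-run-per-input finite difference. A minimal sketch, with an invented three-input polynomial standing in for h:

```python
import numpy as np

def h(x):
    """Stand-in model; deliberately nonlinear in x[0] (illustrative)."""
    return x[0] ** 2 + 10.0 * x[1] + x[0] * x[2]

nominal = np.array([2.0, 1.0, 3.0])   # assumed nominal (mean) input values
y0 = h(nominal)

# Equation (3.2): percentage change in Y per 1% change in each X_j.
a_hat = np.empty(nominal.size)
for j in range(nominal.size):
    x = nominal.copy()
    x[j] *= 1.01                      # the conventional 1% perturbation
    a_hat[j] = ((h(x) - y0) / y0) / 0.01
print(a_hat)  # one sensitivity estimate per input; valid only locally
```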

Let the model be described by the function h(·). If the relationship is not linear, then we have an estimate of the slope of the function h(·) along the X_j axis near the point (μ_1, μ_2, ..., μ_k). This sensitivity estimate may change drastically at some other point in the space of the input variables, depending on the nonlinearity of h(·).

To reduce the number of runs necessary to obtain the sensitivity coefficients, Krieger et al. (1977) attempted to obtain an estimate of the sensitivity coefficients by solving the set of underdetermined equations

Xa = ΔY,  (3.3)

where a = (k x 1) vector of sensitivity coefficients, X = (n x k) rectangular matrix (n < k) specifying the changes from the mean in the input variables for each of the n computer runs, and ΔY = (n x 1) vector of output changes from the mean output. Each input is randomly perturbed 1% about its mean value. Alsmiller et al. (1980) found that this method ranked several inputs as having higher sensitivities than their values computed using Equation (3.2), but they could not determine why the method was inadequate.

An extension of the differential sensitivity method is to estimate sensitivity coefficients in the one-at-a-time design, where each input is evaluated at its mean and then at its mean plus or minus some multiple of its standard deviation (typically μ ± 4σ). This necessitates (2k + 1) runs, and for large costly codes it would be impractical. A benefit of this method is that the results can be saved and used with a fractional factorial experimental design to form a central composite design for estimating a response surface. The information from the one-at-a-time design can be used to rank the input variables as to their effect on the output.

Let s_j be the standard deviation of the output resulting from changing X_j from μ_j to (μ_j - cσ_j) and (μ_j + cσ_j), where μ_j and σ_j are the mean and standard deviation of the jth input variable and c is some constant. Then s_j is a useful measure of the effect X_j has on the output. Large values of s_j indicate a marked effect whereas small values indicate little or no monotonic effect. Another simple measure of the importance of X_j is the absolute value of the difference between the output values at the extreme values of X_j, that is, when X_j = μ_j - cσ_j and X_j = μ_j + cσ_j. This value, denoted by I_2 for the importance of X_j, can be used with s_j to measure departures from linearity. If we assume that Y and X_j have a linear relationship, then it can be shown that I_2 and s_j are constant multiples of each other. For example, in the preceding three-point design with c = 4, we have I_2 = 2s_j under the assumption that the output is linear with respect to the jth input. To see this, suppose that y = a + bx and that we sample x at μ + nσ for n = 0, ±1, ±2, ..., ±k. Then I_2 = 2kbσ and

s² = (1/2k) Σ_{n=-k}^{k} [a + b(μ + nσ) - (a + bμ)]² = b²σ²(k + 1)(2k + 1)/6,

so that the ratio I_2/s = 2k/[(k + 1)(2k + 1)/6]^(1/2). Later in the article we use the five-point design μ ± 2nσ for n = 0, 1, 2, which translates into a ratio of I_2/s = 8/√10 = 2.53. This identity might hold for some pathological nonlinear relationships, but departures from equality imply that the input-output relationship is not linear. A major drawback to the use of the one-at-a-time design is the absence of any data to assess the interaction between variables in the way they affect the response.

McKay et al. (1979) and Iman and Conover (1980) suggested that Latin hypercube sampling may be used to conduct sensitivity analyses. Latin hypercube sampling is similar to stratified random sampling in which the strata are chosen as equal probability intervals and then values are randomly selected from each interval. The n subintervals of each X_j are randomly permuted relative to the other X_j's so that every combination of subintervals is equally likely to be obtained. Iman and Conover (1982) have developed a methodology that controls the random pairing in such a way that it eliminates spurious correlations between the X_j's and, in addition, can allow for correlations to be built into the Latin hypercube sample. Iman and Conover's methodology is used in deriving the Latin hypercube sample in this article. Evaluation of the model for each independent set of inputs provides a data set from which the sensitivities can be estimated.

Using the Latin hypercube sample, Iman and Conover use partial rank correlation analysis to indicate the sensitivity of the output to each of the inputs. The method computes the usual Pearson product-moment correlation between two variables using ranks, to reduce the influence of extreme observations on the calculations and to give a truer measure of the strength of a nonlinear relationship between X_j and Y by converting it to a measure of monotonicity rather than a measure of linearity as is obtained with raw data. Let D denote the correlation matrix obtained using the ranks of the input variables and the output Y; then the partial rank correlation coefficient between the jth input variable and the output Y (where Y is given as the (k + 1)st variable) is defined by

r_{j,k+1} = -d_{j,k+1}/(d_{jj} d_{k+1,k+1})^(1/2),  j = 1, 2, ..., k,  (3.4)

where d_{j,k+1} = (j, k + 1) element of D^(-1). The partial rank correlation is a measure of the correlation between two variables removing the effect of the other variables. Partial rank correlations with absolute value near 1 indicate strong monotonic relationships. These can be used to indicate which inputs have a strong monotonic effect on the output.
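Equation (3.4) is easy to compute once the rank correlation matrix D is formed. The sketch below pairs a plain Latin hypercube (independent random permutations, not the Iman and Conover pairing control used in the article) with the partial rank correlation of Equation (3.4); the three-input model is an invented monotone stand-in.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def latin_hypercube(n, k):
    """n-run Latin hypercube on [0,1]^k: one draw per equal-probability
    stratum, independently permuted across columns."""
    u = (rng.uniform(size=(n, k)) + np.arange(n)[:, None]) / n
    for j in range(k):
        u[:, j] = rng.permutation(u[:, j])
    return u

def prcc(x, y):
    """Partial rank correlation of each column of x with y, via the
    inverse of the rank correlation matrix D (Eq. 3.4)."""
    r = np.column_stack([stats.rankdata(c) for c in (*x.T, y)])
    d = np.linalg.inv(np.corrcoef(r, rowvar=False))
    return -d[:-1, -1] / np.sqrt(np.diag(d)[:-1] * d[-1, -1])

n, k = 100, 3
x = stats.norm.ppf(latin_hypercube(n, k))  # map strata through inverse cdfs
y = np.exp(x[:, 0]) + 0.5 * x[:, 1]        # monotone stand-in; input 3 is inert
print(prcc(x, y))   # large |PRCC| for inputs 1 and 2, near 0 for input 3
```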


Table 2. Some Advantages and Disadvantages of Methods of Screening

Subjective
  Advantages: Simple, economical; applicable to any size computer model.
  Disadvantages: Nonquantitative; large biases possible; relation to reality difficult to assess.

Differential Sensitivity
  Advantages: Widely applicable; allows ranking of inputs.
  Disadvantages: Difficult for large computer codes; results dependent on assumption of linearity.

One-at-a-time
  Advantages: Widely applicable; yields information on linearity; can be used with response-surface-model design; allows ranking of inputs.
  Disadvantages: Not feasible for large computer codes due to large number of runs.

Rank Order Correlation
  Advantages: Widely applicable; allows ranking of inputs; assumes only monotonicity between input and output; allows for correlated inputs.
  Disadvantages: Good estimates of the coefficients can be obtained in as few as k + 2 or k + 3 runs, but not feasible when k is large.

Adjoint Method
  Advantages: Applicable to all computer codes; gives exact results; theoretically not dependent on assumption of linearity.
  Disadvantages: Very difficult and costly to obtain adjoint equations; in practice the nonlinear aspects usually ignored.

A drawback of this technique is that the ranking makes it difficult to distinguish between the relative sensitivity of two variables when the response is a plane with no interaction but the rate of increase in one direction is markedly greater than the rate of increase in the other. Those two variables would appear to have equal sensitivities with regard to the output when indeed they are different. The manner in which the variables are sampled affects this result. When either random sampling or Latin hypercube sampling is used, the partial rank correlation will reflect the true sensitivities with regard to the variables' effect on the output. We point out that the approach may be run on the raw data as well as the ranked data and the results compared. Large discrepancies between the two analyses might indicate departures from linearity. If the partial rank correlation is high while the corresponding partial correlation on the raw data is low, this would indicate a nonlinear relationship between the input and output.

The adjoint method provides a rigorous mathematical method for sensitivity analysis (Oblow 1978). This method yields the exact sensitivities as defined by Equation (3.1). The adjoint equations yield the sensitivity coefficients for any value of the X_j's, not just around the mean value, so no assumption about linearity between the input and output is made. We point out that for each point (x_1, x_2, ..., x_k) at which we wish to determine the sensitivities, we must rerun the computer code. The adjoint equations yield the exact solution, but that solution still depends on the point at which we are seeking a solution. The major drawback with the adjoint method is the difficulty in constructing the adjoint equations.

Table 2 gives a brief description of the advantages and disadvantages of the preceding techniques. The table shows that most of the techniques are applicable to complex computer models. Most techniques allow one to rank the inputs in a quantitative manner. The more important aspect of Table 2 is the list of disadvantages, which one must weigh heavily given that the results of the sensitivity analysis depend on them. All of the methods, except the subjective one, are impractical on large codes. This means that for large codes the subjective method should be used, at least for a first cut. Methods to check the adequacy of the decisions must then be applied. The methods that we found to be most revealing for models with a moderate number of inputs are the one-at-a-time method and the rank order correlation. The adjoint method is theoretically the most appealing, but even with moderate-sized computer codes it is impractical due to the difficulty in obtaining the adjoint equations.

Table 3 gives summary statistics obtained by changing each input variable individually from its mean value to its mean value plus and minus two standard deviations and its mean value plus and minus four standard deviations. The uncertainty value U is twice the standard deviation of the input variable. The change in output (ΔY) is the difference between the output value when the input is at its mean value plus two standard deviations and the output value when all inputs are at their mean values. Thus, negative values of ΔY indicate that the output is a decreasing function with regard to the input X_j. The sensitivity S is the ratio of the change in the output to the change in the input; it is a crude sensitivity estimate because of the large change in the input. Using the 1% change would yield more precise sensitivity measures but would not be useful for later response-surface model building.
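The quantities in Table 3 come from five design points per input (the center point is shared). A sketch of that bookkeeping follows, using an invented model and invented input moments, and simplifying the paper's ΔX to 2σ_j for every input (the table's ΔX values differ for the nonnormal inputs):

```python
import numpy as np

def h(x):
    """Stand-in response; replace with the assessment code (illustrative)."""
    return 100.0 * np.exp(x[0]) + 5.0 * x[1] - x[2] ** 2

mu = np.array([1.0, 10.0, 2.0])      # assumed input means
sigma = np.array([0.2, 3.0, 0.5])    # assumed input standard deviations
y0 = h(mu)

for j in range(mu.size):
    # Five-point one-at-a-time design: mu_j + n*sigma_j, n = -4, -2, 0, 2, 4.
    ys = []
    for n in (-4, -2, 0, 2, 4):
        x = mu.copy()
        x[j] = mu[j] + n * sigma[j]
        ys.append(h(x))
    ys = np.array(ys)
    U = 2.0 * sigma[j]               # uncertainty, U = 2 sigma_j
    dX = 2.0 * sigma[j]              # simplified change in input (see above)
    dY = ys[3] - y0                  # output change at mu_j + 2 sigma_j
    S = dY / dX                      # crude sensitivity, S = dY/dX
    I1 = U * S                       # importance: uncertainty times sensitivity
    I2 = ys.max() - ys.min()         # importance: range over the five points
    I3 = ys.var(ddof=1)              # importance: sample variance, s^2
    print(f"X{j+1}: U={U:.3g}  dY={dY:.4g}  S={S:.4g}  "
          f"I1={I1:.4g}  I2={I2:.4g}  I3={I3:.4g}")
```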


Table 3. Summary Statistics for the One-at-a-Time Design Screening Analysis

Variable     Uncertainty   Change in     Change in    Sensitivity   Importance Measures
             U             Input = ΔX    Output = ΔY  S = ΔY/ΔX     I1 = U x S   I2 = maxY - minY   I3 = s²

DFI          .578          .68           221.9        326.3         188.6        802.6              102,528
TW leaf      9.058         11.94         37.6         3.2           29.0         115.0              2,220
TW nonleaf   19.196        26.35         101.2        3.8           72.9         286.4              14,122
TW pasture   9.058         11.94         26.6         2.2           19.9         89.8               1,352
RY leaf      .158          .23           111.3        483.9         76.5         524.3              48,363
RY nonleaf   .097          .14           281.5        2,010.7       195.0        1,342.1            317,561
RY pasture   1.382         1.82          60.9         33.5          46.3         227.9              8,579
P            47.226        52.35         -54.9        -1.1          -51.9        252.5              10,033
TE leaf      32.745        16.00         .4           .03           1.0          4.0                3
TE nonleaf   49.889        20.00         1.9          .1            5.0          14.3               34
TE milk      83.898        7.50          3.5          .5            41.9         19.4               60
TE meat      81.955        12.50         .9           .07           5.7          6.9                8
LAMS         8.1E-4        1.2E-3        -215.2       -1.79E5       -145.0       388.3              32,473
FM           1.4E-3        2.0E-3        133.4        6.67E4        93.4         552.3              52,188
FF           5.4E-4        .8E-3         51.9         6.49E4        35.0         227.5              8,973
QF milk      4.600         4.60          39.3         8.5           39.1         157.0              3,851
QF meat      3.200         3.20          9.5          3.0           9.6          37.0               225
U leaf       14.172        18.83         196.7        10.5          148.8        742.1              91,320
U nonleaf    74.924        110.87        509.4        4.6           344.7        2,459.8            1,069,110
U milk       144.508       212.11        185.4        .9            130.1        860.9              130,040
U meat       75.520        100.73        29.0         .3            22.7         109.7              1,999
B leaf       .391          .55           233.5        424.5         166.0        976.2              163,408
B nonleaf    .298          .45           473.8        1,052.9       313.8        3,544.4            2,337,424
B pasture    5.961         8.82          305.4        34.6          206.3        2,595.2            1,265,256

NOTE: Y = output; s² = sample variance of the output; U = 2σ; ΔX = change in input; ΔY = change in output corresponding to the change in input; U x S = product of uncertainty (U) and sensitivity (S).

Three importance measures based on the data from the five-point one-at-a-time design are also included in Table 3. Large values of the importance measures indicate that the input has a strong effect on the output and should be retained for use in building a response-surface model. I_1 is simply the product of the uncertainty of the input times its sensitivity. Large uncertainty may denote an absence of knowledge, which must be adjusted for. A large sensitivity (in absolute value) is a direct expression of the effect of the input on the output. The product of the two measures reflects a combined lack-of-knowledge and effect measure. I_2 is simply the range of the output over the five input values. This measure is effective if the input-output relationship is monotonic, but it becomes less informative if the relationship is nonmonotonic. For example, a U-shaped relationship between the input and output would not be adequately measured by the range estimate. The final importance measure I_3 is the sample variance of the output, s², which will be large for strong monotonic or nonmonotonic relationships. As indicated earlier, if the input-output relationship is exactly linear then there is a direct relationship between I_2 and √I_3. We note that the dose conversion factor (DFI) has a ratio of .991, indicating a very strong linear relationship, whereas the loss rate from the soil root zone (LAMS) has a ratio of .852, indicating little linear relationship with the output. These relationships are exhibited in Figure 1.

Figure 1. Results of the One-at-a-Time Design: output plotted against standard deviations from the mean (-4 to 4) for DFI, B nonleaf, B pasture, RY nonleaf, and LAMS.

Table 4 contains the rankings of the input variables on each of the three importance measures as well as their ranking based on the partial rank-order correlation obtained using a Latin hypercube sample (see Iman and Conover 1982). All four measures agree on their ranking with the exception of LAMS (loss rate from soil root zone), which the partial rank-order correlation ranked as 2 and the other methods ranked as 8, 11, and 11. Further analysis indicated that LAMS is a very important input to the model. It has a monotonic nonlinear relationship with the output, which is evident regardless of the values of the other


Table 4. Rankings of Input Variables Using Importance Measures and the Partial Rank Order Correlation From the Latin Hypercube Sample

Variable      I1 = U x S   I2 = maxY - minY   I3 = s²   Partial Rank Order Correlation

U nonleaf     1            3                  3         1
B nonleaf     2            1                  1         4
B pasture     3            2                  2         6
RY nonleaf    4            4                  4         5
DFI           5            7                  7         3
B leaf        6            5                  5         8
U leaf        7            8                  8         7
LAMS          8            11                 11        2
U milk        9            6                  6         10
FM            10           9                  9         9
RY leaf       11           10                 10        13
TW nonleaf    12           12                 12        11
P             13           13                 13        12
RY pasture    14           14                 15        14
TE milk       15           21                 21        22
QF milk       16           16                 16        15
FF            17           15                 14        16
TW leaf       18           17                 17        23
U meat        19           18                 18        24
TW pasture    20           19                 19        17
QF meat       21           20                 20        21
TE meat       22           23                 23        19
TE nonleaf    23           22                 22        20
TE leaf       24           24                 24        18

inputs. All four measures select the same top 10 inputs, with only the partial rank-order correlation differing on the eleventh. This is a good example of where an important variable may be missed by measures based on linearity assumptions. The partial rank-order correlation measure accounts for the nonlinearity and correctly incorporates the monotonicity, assessing LAMS to be an important variable. Because of this, the first 11 variables listed in Table 4 were retained for use in the response-surface modeling.

4. CHOOSING AN EFFECTIVE DESIGN

Once the input variables have been screened and the most important subset of inputs has been selected, a response-surface model may be fitted to the output using them. If the response-surface model yields an adequate fit, it can be used as a proxy for the actual computer code. Many of the larger models are expensive to run, whereas a computer code based on a response-surface model is relatively inexpensive. This proxy for the actual model can be run using Monte Carlo techniques to obtain an empirical distribution of the output or to obtain moments of the output to be used in fitting either the Pearson or Johnson system of curves. One can also obtain the moments of the response-surface model analytically. Cox and Miller (1976) developed a computer code that will calculate the first four moments of a function using a second-order Taylor series approximation, which yields the exact moments for a second-order response-surface model. As yet, no comparison has been made between the results obtained using Monte Carlo methods to obtain the empirical distribution and fitting a distribution to the exact moments using Cox and Miller's program.

To construct the response-surface model, the p input variables selected from the k original variables must be fit to some approximating function, usually a polynomial, that adequately describes the response (output) variable's surface. In most analyses a systematic approach is taken using a factorial design. Even for moderate p, however, the number of computer runs for a complete factorial design (2^p) is too large, and some fraction (1/2^q of the runs) must be substituted. These fractional factorial designs (see Box and Hunter 1961) can be ordered by the level of confounding of effects. Box and Hunter (1961) classified fractional factorial designs into designs of resolution R, where R = 3, 4, 5, .... A design of resolution R is one in which no j-factor effect is confounded with any other effect containing less than R - j factors. Designs of resolution 3 exist that require only p + 1 runs to study up to p variables, where (p + 1) is a multiple of four. Designs developed by Plackett and Burman (1946) allow the investigation of 11 variables in 12 runs, 19 variables in 20 runs, and so forth. Designs of this type are called saturated.

The problem with resolution 3 designs is that they are very poor when any interaction exists between the input variables. The problem is that the interaction can either inflate the apparent importance of unimportant variables or reduce the apparent importance of important variables. Draper and Mitchell (1968) discussed the construction of saturated 2^(k-p) designs. An important aspect of response-surface designs is that they may be run in successive blocks, with each block increasing the resolution. Thus, additional blocks are run only if they are needed.

As mentioned before, the usual response-surface model is a polynomial (usually of order 2) in the inputs. Recently, Pike and Smith (1981) applied inverse polynomials as their approximating function to nuclear-safety computer codes. The idea of inverse polynomials was first introduced by Nelder (1966) as a useful group of multifactor response functions. Letting X_1, X_2, ..., X_p denote the levels of the p input variables and Y denote the corresponding output, the inverse polynomial family is given by

(Π_{i=1}^{p} X_i) / Y = P(X),  (4.1)

where P(X) is a polynomial in X_1, X_2, ..., X_p.
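For the design actually used later in the article (a 2^(11-3) fraction augmented with 22 axial points and a center point, 279 runs in all), the construction can be sketched as follows. The three generators below are our own choice, not necessarily the paper's; each word in the resulting defining relation has length at least six, so this particular fraction has resolution 6.

```python
import itertools
import numpy as np

# 2^(11-3) fraction: a full 2^8 factorial in factors A-H (256 runs) plus
# three generated factors. Illustrative generators (assumed, not the
# paper's): J = ABCDE, K = BCEFG, L = ACDGH.
base = np.array(list(itertools.product((-1, 1), repeat=8)))
j_col = base[:, 0] * base[:, 1] * base[:, 2] * base[:, 3] * base[:, 4]
k_col = base[:, 1] * base[:, 2] * base[:, 4] * base[:, 5] * base[:, 6]
l_col = base[:, 0] * base[:, 2] * base[:, 3] * base[:, 6] * base[:, 7]
frac = np.column_stack([base, j_col, k_col, l_col])   # 256 x 11, coded +/-1

# Augment to a central composite design: two axial points per factor
# (the axial distance, 2 here, is a design choice) and one center point.
p = 11
axial = np.zeros((2 * p, p))
for i in range(p):
    axial[2 * i, i], axial[2 * i + 1, i] = -2.0, 2.0
design = np.vstack([frac, axial, np.zeros((1, p))])
print(design.shape)   # (279, 11), matching the run count reported later
```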

Unfortunately, the authors were unaware of the inverse polynomial technique at the time this study was undertaken, and consequently inverse polynomials are not used in this article. The preceding family of functions possesses greater flexibility and more realism with regard to behavior in the extreme regions of the factor space, where extreme behavior in the response generally occurs. A further discussion and comparison of experimental designs for estimating response-surface models (polynomial or inverse polynomial) can be found in Pike and Wetherill (1983). A conclusion they make in their comparison is the relative inefficiency of Latin hypercube sampling compared with standard factorial designs (in particular, they looked at a 3^k factorial design).

In contrast to Pike and Wetherill is a paper by Steck et al. (1976), who reported on a problem in which the Latin hypercube sampling technique was used to fit a response-surface model. The same problem was also considered using a fractional factorial design but with less satisfactory results, because the fractional factorial design took too many observations at rarely occurring large values of the output, whereas the Latin hypercube technique covered the range more uniformly. In the mind of at least one of the authors of this article, the comparison of Pike and Wetherill is objective and fair, but the point is moot. We mean by this that Latin hypercube sampling, as used by the authors, is intended to obviate the building of a response-surface model and to use the results generated by the sample to directly furnish an empirical cdf for the output. It is in this avenue of attacking the problem of risk assessment that Latin hypercube sampling has a very definite role.

An indication of the appropriateness of the first- or second-order response-surface model to the data can be obtained by evaluating the results of the one-at-a-time design, in which the input values are chosen symmetrically about the mean value. The deviation from linearity, expressed as a percent, can be estimated by

d_i = [(Σ_{j=1}^{k} y_ij)/k - Y_0] / Y_0 x 100,  (4.2)

where y_ij = the value of the output when the ith input is at level j (k = 5 levels here), and Y_0 = the value of the output when all inputs are at their mean values. If the input-output relationship is exactly linear, then d_i = 0. Thus, small values of d_i indicate a linear relationship. A second statistic, indicated earlier in the text, is the ratio of I_2 to the appropriate multiple of the sample standard deviation. If the relationship is linear, the ratio should be near unity, although it could be near unity for some nonlinear cases. If the ratio deviates from unity, this indicates that the relationship is not linear. Table 5 contains the values of d_i and the ratio I_2/2.53s, which indicate departures from linearity. Note that the three most influential inputs as measured by s² (B nonleaf, B pasture, and U nonleaf) have large values of d_i and low values of the ratio. This indicates that a second-order model is more appropriate than a first-order model.

The deviation from a second-degree equation can be estimated by utilizing the fact that, for a design with equally spaced points, the slope of the line through the endpoints of the design is equal to the slope of the line formed from the interior design points when the quadratic model is correct. Thus for the five-point design (μ_i - 4σ_i, μ_i - 2σ_i, μ_i, μ_i + 2σ_i, μ_i + 4σ_i) corresponding to X_i, let the output values be denoted by y_i1, y_i2, y_i3, y_i4, and y_i5. Then the deviation from a quadratic relationship is measured by the deviation from unity of the statistic

r_i = [(y_i5 - y_i1)/8σ_i] / [(y_i4 - y_i2)/4σ_i].  (4.3)

Table 5 indicates a substantial spread in the measure r_i across the 11 variables.
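A sketch of the two diagnostics, Equations (4.2) and (4.3), applied to three synthetic five-point responses. The invented outputs show the expected behavior: a linear response gives d = 0 and r = 1; a quadratic response gives r = 1 but d ≠ 0; and an odd (cubic) response gives d = 0 by symmetry while r moves far from unity, mirroring the LAMS case discussed below where d_i fails but r_i does not.

```python
import numpy as np

def diagnostics(ys, sigma=1.0):
    """d_i (Eq. 4.2) and r_i (Eq. 4.3) for five-point outputs ordered as
    [y(mu-4s), y(mu-2s), y(mu), y(mu+2s), y(mu+4s)]."""
    y1, y2, y0, y4, y5 = ys
    d = (np.mean(ys) - y0) / y0 * 100.0                        # % deviation from linearity
    r = ((y5 - y1) / (8 * sigma)) / ((y4 - y2) / (4 * sigma))  # endpoint vs. interior slope
    return d, r

xs = np.array([-4.0, -2.0, 0.0, 2.0, 4.0])
print(diagnostics(10 + 2 * xs))                # linear:    d = 0,  r = 1
print(diagnostics(10 + xs + 0.1 * xs ** 2))    # quadratic: d = 8,  r = 1
print(diagnostics(10 + 0.1 * xs ** 3))         # cubic:     d = 0,  r = 4
```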

Table 5. Deviation From Linear and Quadratic Models of Variables Selected for Response-Surface Modeling

Variable     Rank Based   Deviation From    Deviation From   I2 = maxY - minY   2.53s     I2/2.53s
             on s²        Linearity, d_i    Quadratic, r_i

B nonleaf    1            186.4             3.23             3,544.4            3,868.0   .916
B pasture    2            136.8             3.74             2,595.2            2,845.8   .912
U nonleaf    3            120.8             1.87             2,459.8            2,616.0   .940
RY nonleaf   4            66.5              1.84             1,342.1            1,425.7   .941
B leaf       5            43.6              1.52             976.2              1,022.7   .955
U milk       6            41.5              1.77             860.9              912.3     .944
DFI          7            17.2              1.07             802.6              810.1     .991
U leaf       8            27.5              1.27             742.1              764.5     .971
FM           9            24.4              1.50             552.3              578.0     .956
RY leaf      10           25.5              1.81             524.3              556.4     .942
LAMS         11           -13.1             .61              388.3              455.9     .852


The largest values occur for B nonleaf and B pasture, and referring to Figure 1 it is obvious that they are certainly not linear. The values of d_i, r_i, and I_2/2.53s for DFI indicate that DFI is both linear and quadratic, a seeming contradiction until we realize that if the relationship is linear, then r_i will yield values near unity also. This points out a drawback to the use of the statistic r_i by itself. The values of the statistics for LAMS in Table 5 are also interesting. As shown in Figure 1, the output is a decreasing function of LAMS. The relationship is definitely not linear, but this is not reflected well by d_i. The reason is that the slope of the line from -4 to -2 is nearly the same as the slope of the line from 2 to 4, and similarly the lines from -2 to 0 and from 0 to 2 are nearly equal. This symmetry causes the mean value of the sum of the doses of all pathways (SUMDOS) across the levels to be nearly equal to the value of SUMDOS at the origin (the mean value of LAMS) and the value of d_i to be small. In this case we find that r_i and I_2/2.53s are better indicators of the true relationship between LAMS and SUMDOS. The point is that seldom does one statistic handle all cases well, and reliance on two or more will increase our chances of correctly identifying the relationship.

The 2^(11-3) fractional factorial design of resolution 6 was employed, and the 22 axial points plus a center point were added from the screening design. A second-order model was fitted to the response using the 11 variables selected from the screening process plus 28 additional square and cross-product terms. The R² value was .98, indicating a good fit. We point out that although the R² is high, it is impossible to assess its practical significance, since the computer model is deterministic and the residual error is not a measure of random error but a measure of lack of fit. A suggestion by Pike (personal correspondence) is to run some repeat points (preferably in the center), varying the input variables that have not been varied in the experimental design. This would provide some measure of random error to compare to the lack of fit and give an indication of the extent to which the response-surface model does not fit. This was not performed in this study, but it is certainly a worthwhile idea.

5. ESTIMATION OF THE OUTPUT PDF

Using the techniques covered in Section 4, one can fit a second-order response-surface model to the output, Y. Letting X_i (i = 1, 2, ..., p) denote the selected input variables and Y denote the output, the model relating Y to X_1, X_2, ..., X_p is typically of the form

Y_k = β_0 + Σ_{i=1}^{p} β_i ξ_ik + Σ_{i=1}^{p} Σ_{j=i}^{p} β_ij ξ_ik ξ_jk,  (5.1)

where Y_k = the kth response of the model; ξ_ik = the kth coded value of the ith input, (X_ik - μ_i)/σ_i; X_ik = the kth value of the ith input; μ_i = the mean of the ith input distribution; and σ_i = the standard deviation of the ith input distribution.

Given that a satisfactory response surface can be found to approximate the output, it can be used to approximate the probability density function (pdf) of Y. There are two major methods of estimating the pdf of Y using the response-surface model (5.1): (a) the moment-matching technique and (b) the Monte Carlo technique.

The moment-matching technique relies on the principle that the first four moments of a random variable will adequately describe the pdf of that random variable. The Pearson family of distributions is defined by

df/dt = -(a + t) f / (c_0 + c_1 t + c_2 t²),  (5.2)

where f = pdf of the random variable T evaluated at the point t, and a, c_0, c_1, and c_2 are specified constants. This family of distributions is capable of describing a rich variety of curves. In moment matching, values of the four coefficients a, c_0, c_1, and c_2 are found so that the first four moments of the Pearson distribution are equal to the first four moments of the output, Y. Procedures for doing this are described in McGrath et al. (1975) and Johnson et al. (1963). An alternative to the Pearson family of distributions is the Johnson family. The unbounded family of distributions, called Johnson's S_U, is defined by the mapping

Z = γ + δ sinh^(-1)[(Y - ξ)/λ],  (5.3)

where Z is distributed normally with mean zero and unit variance. The bounded family of distributions, called Johnson's S_B, is defined by the mapping

Z = γ + δ ln[U/(1 - U)],  0 < U < 1,  (5.4)

where Z is again standard normal, and

U = (Y - ξ)/λ,  (5.5)

where Y is the variate we are approximating. Bowman et al. (1981) and Bowman and Shenton (1980) described methods for evaluating the parameters of S_U and S_B. These methods again use the first four moments of Y to estimate γ, δ, ξ, and λ.

In order to use the preceding techniques we must obtain the moments. There are two methods that can be used to determine the moments of Y given the response-surface function (5.1) and the distributions of the inputs X_1, X_2, ..., X_p. The first method evaluates the jth moment of Y using Equation (5.1) and the input densities,

μ'_j = ∫_{-∞}^{∞} ... ∫_{-∞}^{∞} Y^j f(x_1) ... f(x_p) dx_1 dx_2 ... dx_p.  (5.6)
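The second, "crude" Monte Carlo route to the first four moments, run on a cheap quadratic proxy of the form (5.1), can be sketched as follows. The coefficients and the standard normal coded inputs are invented for illustration; the four printed statistics are exactly the quantities a Pearson or Johnson fit would consume.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# A toy second-order response surface in two coded inputs (Eq. 5.1 form);
# coefficients are illustrative, not the fitted SRDOSE surface.
b0, b = 700.0, np.array([150.0, 60.0])
B = np.array([[20.0, 10.0],
              [0.0,  5.0]])   # upper triangle: square and cross-product terms

def surface(xi):
    return b0 + xi @ b + np.einsum('ni,ij,nj->n', xi, B, xi)

# Repeated simulation on the inexpensive proxy, then the first four
# sample moments of the output.
xi = rng.standard_normal((100_000, 2))
y = surface(xi)
print("mean     =", y.mean())
print("sd       =", y.std(ddof=1))
print("sqrt(b1) =", stats.skew(y))                   # skewness moment ratio
print("b2       =", stats.kurtosis(y, fisher=False)) # kurtosis (not excess)
```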


Table 6. Moments and Percentage Points Obtained From Various Simulation Scenarios

                     Monte Carlo   LHS              Monte Carlo of     Monte Carlo of Model,
                     of Model      of Model         Response Surface   11 Most Sensitive
                                                                       Inputs Varied

n                    1,000         279              1,000              1,000
Mean                 763.0         763.8 (.1%)      672.3 (-11.9%)     709.1 (-7.1%)
Standard Deviation   505.5         458.7 (-9.3%)    355.1 (-29.8%)     455.3 (-9.9%)
√b1                  3.442         1.790 (-48.0%)   1.184 (-65.6%)     2.841 (-17.5%)
b2                   32.548        7.626 (-76.6%)   5.627 (-82.7%)     18.788 (-42.3%)
X.90                 1,325.8       1,332.4 (.5%)    1,136.3 (-14.3%)   1,248.1 (-5.9%)
X.95                 1,658.0       1,647.7 (-.6%)   1,334.7 (-19.5%)   1,575.6 (-5.0%)
X.975                2,008.0       1,994.0 (-.7%)   1,529.6 (-23.8%)   1,928.3 (-4.0%)
X.99                 2,543.4       2,335.0 (-8.2%)  1,786.8 (-29.8%)   2,437.4 (-4.2%)

NOTE: X_p is the value such that Pr(X < X_p) = p. Percentage error from the Monte Carlo of the model is in parentheses.

Equation (5.6) assumes that the input variables are statistically independent, so that the joint density is equal to the product of the individual densities. Since the first four moments of Y are required and Equation (5.1) may be quadratic in the X_i, the first eight moments of the X_i are needed. Cox and Miller (1976) have developed a computer program that allows the computation of the moments of Y in this setting. Their program does not allow for the case in which some of the X_i's may be correlated.

The other method uses Equation (5.1) and Monte Carlo techniques to estimate the distribution of Y by repeated simulation. This method has been called the "crude" Monte Carlo method by Cox (1977). Repeated simulation on the response-surface model (5.1) should be extremely fast and inexpensive compared to running the original model. The set of output values from the simulation can be used to create the empirical distribution of Y. In addition, the first four sample moments can be used to fit either the Pearson or Johnson family of distributions. One problem with this crude Monte Carlo technique is the determination of a sufficient number of runs. In addition, random selection of the input variables may not be as efficient as using the Latin hypercube sampling techniques of Iman and Conover (1980).

An interesting application of the crude Monte Carlo technique applied to a computer model was given by Bowman (1980). Bowman calculated the skewness and kurtosis statistics (√b1 and b2) of the Monte Carlo generated distribution and then used sampling theory to place 75% probability regions on the computed statistics. The extremes of the probability region identify extremes of the family of Pearson Type I distributions, and from these the 95th percentile point was calculated. These percentiles can be ordered from the smallest to the largest, yielding intervals in which one is 75% certain that 95% of the distribution of Y lies below some number in the interval.

Once the pdf of Y is estimated, the percentage points may be calculated. Tables by Pearson and Hartley (1972) are available for doing this, and approximate methods that work very well are given by Bowman and Shenton (1979) for the Pearson family. Percentage points for the Johnson family are straightforward, since the transformed variable has the standard normal density.

The Monte Carlo approach can also be used to obtain the percentage points of Y directly from the computer code or from the response-surface model. The cost in the former approach may be prohibitive. The Latin hypercube sampling technique can be substituted for the Monte Carlo, and the empirical distribution of Y can be obtained using fewer computer runs than are needed to fit the response-surface model.

In our analysis we took the response-surface model obtained in the previous section and used it as a proxy for the original code, and 1,000 Monte Carlo simulations were performed to obtain the first four moments of the output. The same inputs were used in the original code, and the first four moments were again computed. Two additional sets of simulations were run for comparison. The first used a Latin hypercube sample varying all inputs, and the second was a Monte Carlo varying the 11 most sensitive inputs. The percentage points obtained from the Monte Carlo simulation varying all inputs can be viewed as the best estimates of the true percentage points and the ones to be compared against. The Latin hypercube sample was of size n = 279. This sample size was identical to the number of observations used to fit the response-surface model and was selected to make a fair comparison between the two techniques. It should be pointed out that this sample size is considerably larger than would be necessary to permit sensitivity analysis and obtain good estimates of the partial rank correlations. In fact, one of the pluses of using Latin hypercube sampling is that it covers the sample space with a minimum of runs. The concern here is that the cumulative distribution function obtained from each technique should be based on equal amounts of information.
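A sketch of the comparison just described, with the same run counts as Table 6 (1,000 crude Monte Carlo runs versus a 279-run Latin hypercube) applied to an invented two-input stand-in for the code:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

def model(x):
    """Cheap stand-in for the assessment code (illustrative)."""
    return np.exp(x[:, 0]) * (1.0 + 0.5 * x[:, 1] ** 2)

def lhs_normal(n, k):
    """Latin hypercube sample mapped through standard normal inverse cdfs."""
    u = (rng.uniform(size=(n, k)) + np.arange(n)[:, None]) / n
    for j in range(k):
        u[:, j] = rng.permutation(u[:, j])
    return stats.norm.ppf(u)

y_mc = model(rng.standard_normal((1000, 2)))   # 1,000-run crude Monte Carlo
y_lhs = model(lhs_normal(279, 2))              # 279-run Latin hypercube
for q in (90, 95, 97.5, 99):                   # the percentiles of Table 6
    print(f"X.{q}: MC={np.percentile(y_mc, q):8.2f}  "
          f"LHS={np.percentile(y_lhs, q):8.2f}")
```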


The first four moments obtained from these simulations, along with the percentage points from the appropriate Pearson distribution, are given in Table 6. The percentiles chosen are .90, .95, .975, and .99. The results indicate that large differences among some of the techniques in the values of the percentile-point estimates occur, especially for the larger percentiles. The response-surface model underestimates the percentile points obtained from the Monte Carlo simulation letting all inputs vary. The results from the response-surface model and the Monte Carlo simulation holding the less sensitive inputs fixed are more comparable, but the response-surface estimates of the percentiles are still consistently lower. The Monte Carlo results using the Latin hypercube sample are closer to the Monte Carlo results varying all inputs, but the estimates of the two approaches disagree more at the larger percentiles.

The percentage errors indicate that some of the techniques have large differences, especially at the .99 percentile point. The heavy point selection in the tails of the input distributions for fitting the response surface does not guarantee good prediction in the tails of the output distribution. Even though R² = .98, indicative of a good fit, the percentage error is largest for the response-surface model estimates. The differences between the Latin hypercube percentile-point estimates and the Monte Carlo percentile-point estimates seem large, but viewing the entire cdf shown in Figure 2, they appear nearly identical. In this exercise it is clear that the Latin hypercube sample generated a cdf closest to the standard obtained from the large Monte Carlo sample.

Figure 2. Probability Plot for SUMDOS Using Monte Carlo and Latin Hypercube Sampling (probability, 0 to 1.0, versus SUMDOS, 0 to 3,500): Latin hypercube; Monte Carlo.

6. SUMMARY

The methodology for performing an uncertainty analysis of complicated computer codes is still under development. Many of the methods in use today are based on computer codes and have been used successfully on radiation pathway problems, probabilistic risk assessment in nuclear safety, and related areas. The question is not whether the statistical methods are good, but whether we can quantify the degree of approximation to the actual code. The problem of uncertainty analysis asks the question: Do we want an exact answer to an approximate problem, or do we want an approximate answer to the exact problem? Those choosing an exact answer to an approximate problem will follow the route of screening variables, fitting a response-surface model to the output varying only the "important" variables from the screening, then calculating the moments of this response model (either exactly or using Monte Carlo or Latin hypercube sampling methods to obtain sample estimates), and fitting either a Pearson or Johnson distribution.

The pitfalls in this approach are several. The uncertainty about whether we selected the most important variables, especially when interactions cannot be neglected, is a major problem in this approach. After selecting the important variables and fitting a response surface to the output varying only these variables, one must ask this question: What is the cumulative effect on the output of those variables that were held fixed? This cumulative effect may be large when one is working with a computer code containing several hundred input variables. The fitting of a response-surface model is not as straightforward as many contributors to the literature lead us to believe. It is still very much an art; with highly nonlinear functional forms, the second-order response-surface model may not be an acceptable approximation except over a very limited range. In addition, there is always the question of what is an acceptable fit to the output. Is an R² of 99.9% an indication of an excellent fit? Is it meaningful, since lack of fit cannot be tested? Finally, if we accept the response-surface model as representative of the output, then we can use it to obtain the first four moments and fit a Pearson or Johnson distribution to it.


In contrast to the preceding course of action is the philosophy espoused by Iman and Conover (1980). Their method is to use the Latin hypercube sampling methodology to obtain an estimate of the cdf of the output. The Latin hypercube sampling allows a representative sample of the input variables to be selected and in this way yields a more complete description of the model behavior. Using the empirical cdf, one can then obtain estimates of the percentile points of the output. This methodology is straightforward and does not suffer the pitfalls mentioned earlier. It can be used to screen variables, in the sense of forming a hierarchy from most important to least important, and no variables need be dropped from the analysis. Moreover, the use of Latin hypercube sampling and partial rank order correlation can uncover strong monotonic (highly nonlinear) relationships better than standard techniques. As was shown earlier in the article, the identification of LAMS as an important variable was missed by all of the importance measures except the partial rank-order correlation coefficient. In addition, plots of the input variables versus the output are more meaningful than in the systematic experimental-design approach.

Both approaches offer different insights into the input-output relationship of the computer code. Few studies exist that indicate the superiority of one approach over the other.

In general, it appears that uncertainty analysis requires a good approximation to the code over the complete set of possible input values. Response-surface approximations are designed to be local in nature and, therefore, do not perform well in uncertainty analysis. They are more suited to situations where local behavior of the code is of interest. Sensitivity analysis is a good example of this type of application. In some applications, qualitative information about the relationship of Y to the inputs is all that is required. In such cases, the approximation provided by fitting a response surface to Y is adequate. More work needs to be done in assessing the strengths and weaknesses of the two approaches in the areas of sensitivity and uncertainty analysis.

ACKNOWLEDGMENTS

We are grateful to the editor, an associate editor, and two referees whose suggestions and comments greatly improved this article. Their interest and concern helped to make the article more understandable and readable. We would also like to thank Collene Ownby, who had to take a thoroughly revised paper and recast it in its present form. The research was supported by the National Science Foundation's Ecosystem Studies Program under Interagency Agreement DEB80-21014 with the U.S. Department of Energy and under Contract DE-AC05-84OR21400 with Martin Marietta Energy Systems, Inc. The Oak Ridge National Laboratory is operated by Martin Marietta Energy Systems, Inc., under Contract DE-AC05-84OR21400 with the U.S. Department of Energy.

[Received October 1983. Revised December 1984.]

REFERENCES

ALSMILLER, R. G., JR., BARISH, J., BJORNSTAD, D., DOWNING, D. J., FORD, W. E., HORWEDEL, J., LEE, C. J., McADOO, J., OBLOW, E. M., PEELE, R. W., PEREY, F. G., STEWART, L., and WEISBIN, C. R. (1980), "Interim Report on Model Evaluation Methodology and the Evaluation of LEAP," Technical Report ORNL/TM-7245, Oak Ridge National Laboratory.

BARTELL, S. M., LANDRUM, P. F., GIESY, J. P., and LEVERSEE, G. J. (1981), "Simulating the Fate of Anthracene in Artificial Streams," in Energy and Ecological Modeling, eds. W. Mitsch, W. R. Basserman, and J. Klopatek, Amsterdam: Elsevier, pp. 133-143.

BOWMAN, K. O. (1980), "One Aspect of the Statistical Evaluation of a Computer Model," Technical Report ORNL/CSD-52, Oak Ridge National Laboratory.

BOWMAN, K. O., SERBIN, C. A., and SHENTON, L. R. (1981), "Explicit Approximate Solutions for S_B," Communications in Statistics, Part B, Simulation and Computation, 10, 1-15.

BOWMAN, K. O., and SHENTON, L. R. (1979), "Approximate Percentage Points for Pearson Distributions," Biometrika, 66, 147-151.

--- (1980), "Evaluation of the Parameters of S_U by Rational Fractions," Communications in Statistics, Part B, Simulation and Computation, 9, 127-132.

BOX, G. E. P., and HUNTER, J. S. (1961), "The 2^(k-p) Fractional Factorial Designs, Parts I and II," Technometrics, 3, 311-352.

COX, N. D. (1977), "Comparison of Two Uncertainty Analysis Methods," Nuclear Science and Engineering, 64, 258-265.

COX, N. D., and MILLER, C. F. (1976), "User's Description of Second-Order Error Propagation Computer Code for Statistically Independent Variables (SOERP)," Technical Report RE-S-76-138, Idaho National Engineering Laboratory (code available from Argonne Code Center).

DRAPER, N. R., and MITCHELL, T. J. (1968), "The Construction of Saturated 2^(k-p) Designs," Annals of Mathematical Statistics, 39, 246-255.

GARDNER, R. H. (1983), "Error Analysis and Sensitivity Analysis in Ecology," in Encyclopedia of Systems and Control, ed. Madan Singh, London: Pergamon Press.

HOFFMAN, F. O., GARDNER, R. H., and ECKERMAN, K. F. (1982), "Variability in Dose Estimation Associated With the Food Chain Transport and Ingestion of Selected Radionuclides," Technical Report NUREG/CR-2612, U.S. Nuclear Regulatory Commission.

IMAN, R. L., and CONOVER, W. J. (1980), "Small Sample Sensitivity Analysis Techniques for Computer Models, With an Application to Risk Assessment," Communications in Statistics, Part A, Theory and Methods, 9, 1749-1842.

--- (1982), "A Distribution-Free Approach to Inducing Rank Correlation Among Input Variables," Communications in Statistics, Part B, Simulation and Computation, 11, 311-334.

JOHNSON, N. L., NIXON, E., AMOS, D. E., and PEARSON, E. S. (1963), "Table of Percentage Points of Pearson Curves, for Given √β1 and β2, Expressed in Standard Measure," Biometrika, 50, 459-498.

KRIEGER, T. J., DURSTON, C., and ALBRIGHT, D. C. (1977), "Statistical Determination of Effective Variables in Sensitivity Analysis," Transactions of the American Nuclear Society, 28, 515-516.


MANKIN, J. B., O'NEILL, R. V., SHUGART, H. H., and RUST, B. W. (1977), "The Importance of Validation in Ecosystem Analysis," in New Directions in the Analysis of Ecological Systems (Part 1), ed. G. S. Innis, La Jolla, Calif.: Simulation Councils, Inc.

McGRATH, E. J., BASIN, S. L., BURTON, R. W., IRVING, D. C., JAQUETTE, S. C., KETLER, W. R., and SMITH, C. A. (1975), "Techniques for Efficient Monte Carlo Simulation, Vol. 1, Selecting Probability Distributions," Technical Report ORNL-RSIC-38, Oak Ridge National Laboratory.

McKAY, M. D., CONOVER, W. J., and BECKMAN, R. J. (1979), "A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output From a Computer Code," Technometrics, 21, 239-245.

NELDER, J. A. (1966), "Inverse Polynomials, a Useful Group of Multifactor Response Functions," Biometrics, 22, 128-141.

OBLOW, E. M. (1978), "Sensitivity Theory for General Nonlinear Algebraic Equations With Constraints," Nuclear Science and Engineering, 65, 187-191.

PEARSON, E. S., and HARTLEY, H. O. (1972), Biometrika Tables for Statisticians (Vol. 2), New York: Cambridge University Press.

PIKE, D. J., and SMITH, J. R. (1981), "Response Surface Methodology in Simulation Studies of Nuclear Reactor Safety," Proceedings of the First International Conference on Applied Modeling and Simulation, 3, 130-132.

PIKE, D. J., and WETHERILL, G. (1983), "Comparative Experimental Design for Response Surface Analysis of Reactor Safety Codes," paper presented at the Annual Meeting of the American Statistical Association.

PLACKETT, R. L., and BURMAN, J. P. (1946), "The Design of Optimum Multifactorial Experiments," Biometrika, 33, 305-325.

SNEE, R. D. (1977), "Validation of Regression Models: Methods and Examples," Technometrics, 19, 415-428.

STECK, G. L., IMAN, R. L., and DAHLGREN, D. A. (1976), "Probabilistic Analysis of LOCA: Annual Report for FY 1976," Technical Report SAND76-0535, Sandia Laboratories, pp. 75-82.

U.S. NUCLEAR REGULATORY COMMISSION (1977), Calculation of Annual Doses to Man From Routine Releases of Reactor Effluents for the Purpose of Evaluating Compliance With 10 CFR (Pt. 50, App. I), Regulatory Guide 1.109, Washington, DC: Author.
