Latin Hypercube Sampling

Chapter 16 Latin Hypercube Sampling Jaroslav Menčík Additional information is available at the end of the chapter http://dx.doi.org/10.5772/62370 Abstract The simultaneous influence of several random quantities can be studied by the Latin hy‐ percube sampling method (LHS). The values of distribution functions of each quantity are distributed uniformly in the interval (0; 1) and these values of all variables are ran‐ domly combined. This method yields statistical characteristics with less simulation ex‐ periments than the Monte Carlo method. In this chapter, the creation of the randomized input values is explained. Keywords: Probability, Monte Carlo method, Latin Hypercube Sampling, probabilistic transformation, randomization The Monte Carlo method has two disadvantages. First, it usually needs a very high number of simulations. If the output quantity must be obtained by time-consuming numerical com‐ putations, the simulations can last a very long time, and the response surface method is not always usable. Second, it can happen that the generated random numbers of distribution function F (which serves for the creation of random numbers with nonstandard distribu‐ tions) are not distributed sufficiently and regularly in the definition interval (0; 1). Some‐ times, more numbers are generated in one region than in others, and the generated quantity has thus somewhat different distribution than demanded. This problem can appear especial‐ ly if the output function depends on many input variables. A method called Latin Hypercube Sampling (LHS) removes this drawback. The basic idea of LHS is similar to the generation of random numbers via the inverse probabilistic transforma‐ tion (3) and Figure 2 shown in Chapter 15 [1, 2]. The difference is that LHS creates the values of F not by generating random numbers dispersed in chaotic way in the interval (0; 1), but by assigning them certain fix values. The interval (0; 1) is divided into several layers of the same width, and the x values are calculated via the inverse transformation (F–1) for the F values © 2016 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 118 Concise Reliability for Engineers corresponding to the center of each layer. With reasonably high number of layers (tens or hundreds), the created quantity x will have the proper probability distribution. This approach is called stratified sampling. If the output variable y depends on several input quantities, x1, x2,..., xm, it is necessary that each quantity is assigned values of all layers and that the quantities and layers of individual variables are randomly combined. This is done by random assigning the order numbers of layers to the individual input quantities. 1.0 F(x) Layer 5 0.8 Layer 4 0.6 Layer 3 0.4 F(L2) Layer 2 > 0.2 Layer 1 < 0.0 x (L2) x Figure 1. Principle of the LHS method. Figure 1. Principle of the LHS method. The procedure is as follows. The definition interval of the distribution function F of each of m variables is divided into N N The procedure layers.is as follows. , the same The for definition all variables, interval also corresponds of the distributionto the number offunction trials (= simulationF of each experiments). of m In each trial, the order numbers of layers are assigned randomly to the individual variables (X1, X2, ..., Xm). In this way, various layers of variables is dividedthe individual into N layers.variables N are, the always same randomly for all variables, combined. I alson practice, corresponds this is achieved to the by number means of random numbers and of trials (= simulationtheir rank-ordering. experiments). Then, each In each input trial,variable the is assignordered numbers the value corresponding of layers are to assignedthe center of the pertinent layer of randomly to theits individual distribution variablesfunction. (X1, X2,..., Xm). In this way, various layers of the individual variables are always randomly combined. In practice, this is achieved by means of random X X X X F numbers and theirThe application rank-ordering. is illustrated Then, on each a case input with four variable random is quantities assigned ( the1, 2value, 3, and correspond‐ 4) and the definition interval of divided into five layers (Fig. 34). Only five layers are used here for simplicity; usually, several tens of layers are used. In ing to the centerour of case, the Y pertinent will be calculated layer offor its five distribution combinations function.of the four input quantities. Thus, 5 × 4 = 20 random numbers with uniform distribution in interval (0; 1) are generated (see the table in the left part of Fig. 35). Then, the layer numbers for The application is illustrated on a case with four random quantities (X , X , X , and X ) and the variable X1 (for example) for individual trials are assigned with resp1 ect 2to the3 order of4 random values (for X1) ranked by definition intervalsize from of F the divided maximum into to minimum. five layers Here, (Fig. layer 1). no. Only3 (with five the highest layers number are used 0.83) herefor the for first trial, layer no. 1 for the simplicity; usually,second, several no. 5 for tens the third, of layers no. 2 for are the used. fourth, In and our no. case,4 for the Y fifth,will correspondingbe calculated to thefor numbers five 0.56 - 0.25 - 0.83 - combinations of0.17 the and four 0.30 ininput the column quantities. for X1. SimilarThus, operations5 × 4 = 20 are random done for each numbers variable. with Thus, uniform in the first trial, variables X1, X2, X3, and X4 are assigned the values corresponding to the second, fourth, second, and fifth layers of their distribution distribution in interval (0; 1) are generated (see the table in the left part of Fig. 2). Then, the functions, respectively. Inverse probabilistic transformation F–1 is then used for the determination X1 from F1,1 , etc.; see layer numbers thefor table variable on the Xright.1 (for Now, example) the investigated for individual quantity Y trials = Y(X 1are, X2, assignedX3, X4) is calculated with respect five times. to The obtained values Y1, the order of randomY2, Y3, Y 4values, and Y5 can(for be X used1) ranked for the determinationby size from of thestatistical maximum characteristics to minimum. (mean, standard Here, deviation, ...). layer no. 3 (with the highest number 0.83) for the first trial, layer no. 1 for the second, no. 5 for the third, no. 2 for the fourth, and no. 4 for the fifth, corresponding to the numbers 0.56 - 0.25 - 0.83 - 0.17 and 0.30 in the column for X1. Similar operations are done for each variable. Thus, in the first trial, variables X1, X2, X3, and X4 are assigned the values corresponding to the second, Figure 2. LHS method: assignment of layers to individual variables and trials.Usually, several tens or hundreds of trials are made, which enable the construction of distribution function F(Y) and determination of the mean value, standard deviation, various quantiles, and other characteristics. References [1] Florian A. An efficient sampling scheme: Updated Latin Hypercube Sampling. Probabilistic Engineering Mechanics, 7 (1992), issue 2, 123 – 130. [2] Olsson A, Sandberg G, Dahlblom G. On Latin hypercube sampling for structural reliability analysis. Probabilistic Engineering Mechanics. 2002; 25: issue 1, 47 – 68. Latin Hypercube Sampling 119 http://dx.doi.org/10.5772/62370 fourth, second, and fifth layers of their distribution functions, respectively. Inverse probabil‐ –1 istic transformation F is then used for the determination X1 from F1,1, etc.; see the table on the right. Now, the investigated quantity Y = Y(X1, X2, X3, X4) is calculated five times. The obtained values Y1, Y2, Y3, Y4, and Y5 can be used for the determination of statistical characteristics (mean, standard deviation,...). Random numbers (RN) Layer numbers for individual trials Variable X1 X2 X3 X4 Variable X1 X2 X3 X4 -------------- -------------- Trial RN RN RN RN Trial layer layer layer layer 1 0.56 0,12 0,72 0,13 1 2 4 2 5 2 0.25 0.40 0.84 0.60 2 4 3 1 3 3 0.83 0.05 0.21 0.55 3 1 5 4 4 4 0.17 0.62 0.03 0.99 4 5 2 5 1 5 0.30 0.70 0.61 0.73 5 3 1 3 2 Fig. 35. LHS method – assignment of layers to individual variables and trials. Figure 2. LHS method: assignment of layers to individual variables and trials. Usually, several tens or hundreds of trials are made, which enable the construction of distri‐ bution function F(Y) and determination of the mean value, standard deviation, various The titles “Random numbers” and “Layer numbers” should be placed quantiles, and other characteristics. above the tables - as you see it. Author details Jaroslav Menčík Address all correspondence to: [email protected] Department of Mechanics, Materials and Machine Parts, Jan Perner Transport Faculty, University of Pardubice, Czech Republic References [1] Florian A. An efficient sampling scheme: Updated Latin Hypercube Sampling. Prob‐ abilistic Engineering Mechanics, 7 (1992), issue 2, 123 – 130. [2] Olsson A, Sandberg G, Dahlblom G. On Latin hypercube sampling for structural reli‐ ability analysis. Probabilistic Engineering Mechanics. 2002; 25: issue 1, 47 – 68. .

Latin Hypercube Sampling

Latin Hypercube Sampling in Bayesian Networks

(MLHS) Method in the Estimation of a Mixed Logit Model for Vehicle Choice

Efficient Experimental Design for Regularized Linear Models

Some Large Deviations Results for Latin Hypercube Sampling

Quantile Estimation with Latin Hypercube Sampling

Sampling Based on Sobol Sequences for Monte Carlo Techniques

Uncertainty and Sensitivity Analysis for Long-Running Computer Codes: a Critical Review

Optimization of Latin Hypercube Samples and Subprojection Properties

Comparison of Latin Hypercube Sampling and Simple Random Sampling Applied to Neural Network Modeling of Hfoz Thin Film Fabrication

Modified Latin Hypercube Sampling Monte Carlo

Assessing the Reliability of Species Distribution Projections in Climate

A First Course in Experimental Design Notes from Stat 263/363