

Comparison of Latin Hypercube and Quasi Monte Carlo Sampling Techniques

Sergei Kucherenko (1), Daniel Albrecht (2), Andrea Saltelli (2)
(1) Imperial College London, UK
(2) The European Commission, Joint Research Centre, ISPRA (VA), Italy

Outline

Monte Carlo integration methods
Latin Hypercube sampling design
Quasi Monte Carlo methods. Sobol' sequences and their properties
Comparison of sample distributions generated by different techniques
Do QMC methods lose their efficiency in higher dimensions?
Global Sensitivity Analysis and effective dimensions
Comparison results

Monte Carlo integration methods

$I[f] = \int_{H^n} f(x)\,dx$, seen as an expectation: $I[f] = E[f(x)]$.
Monte Carlo: $I_N[f] = \frac{1}{N} \sum_{i=1}^{N} f(z_i)$, where $\{z_i\}$ is a sequence of random points in $H^n$.
Error: $\varepsilon = |I[f] - I_N[f]|$.
$\varepsilon_N = (E(\varepsilon^2))^{1/2} = \sigma(f) / N^{1/2}$
Convergence does not depend on dimensionality, but it is slow.
Improve MC convergence by decreasing $\sigma(f)$.
Use variance reduction techniques: antithetic variables; control variates; stratified sampling → LHS sampling.

Latin Hypercube sampling

Latin Hypercube sampling is a type of stratified sampling. To sample N points in d dimensions:
Divide each dimension into N equal intervals => $N^d$ subcubes.
Place the N points in N of the subcubes so that their projections onto lower dimensions do not overlap.

Latin Hypercube sampling

$\{\pi_k\}$, $k = 1, \ldots, n$: independent random permutations of $\{1, \ldots, N\}$, each uniformly distributed over all $N!$ possible permutations.
LHS coordinates: $x_i^k = \frac{\pi_k(i) - 1 + U_i^k}{N}$, $i = 1, \ldots, N$, $k = 1, \ldots, n$, $U_i^k \sim U(0,1)$.
LHS is built by superimposing well-stratified one-dimensional samples. It cannot be expected to provide good uniformity properties in an n-dimensional unit hypercube.

Deficiencies of LHS sampling

1) Space is badly explored (a)
2) Possible correlation between variables (b)
3) Points cannot be sampled sequentially => not suited for integration

Discrepancy. Quasi Monte Carlo.

Discrepancy is a measure of deviation from uniformity.
Definitions: $Q(y) \in H^n$, $Q(y) = [0, y_1) \times [0, y_2) \times \ldots \times [0, y_n)$, $m(Q)$ is the volume of $Q$.
$D_N^* = \sup_{Q(y) \in H^n} \left| \frac{N_{Q(y)}}{N} - m(Q) \right|$
Random sequences: $D_N^* \to (\ln \ln N)^{1/2} / N^{1/2} \sim 1/N^{1/2}$
Low-discrepancy sequences (LDS): $D_N^* \le c(d) \frac{(\ln N)^n}{N}$
Convergence: $\varepsilon_{QMC} = |I[f] - I_N[f]| \le V(f) D_N^*$, so $\varepsilon_{QMC} = \frac{O((\ln N)^n)}{N}$.
Asymptotically $\varepsilon_{QMC} \sim O(1/N)$, a much higher convergence rate than $\varepsilon_{MC} \sim O(1/\sqrt{N})$.

QMC. Sobol' sequences

Convergence: $\varepsilon = \frac{O((\ln N)^n)}{N}$ for all LDS.
For Sobol' LDS: $\varepsilon = \frac{O((\ln N)^{n-1})}{N}$ if $N = 2^k$, $k$ integer.
Sobol' LDS:
1. Best uniformity of distribution as N goes to infinity.
2. Good distribution for fairly small initial sets.
3. A very fast computational algorithm.
"Preponderance of the experimental evidence amassed to date points to Sobol' sequences as the most effective quasi-Monte Carlo method for application in financial engineering." Paul Glasserman, Monte Carlo Methods in Financial Engineering, Springer, 2003

Sobol' LDS. Property A and Property A'

A low-discrepancy sequence is said to satisfy Property A if for any binary segment (not an arbitrary subset) of the n-dimensional sequence of length $2^n$ there is exactly one point in each of the $2^n$ hyper-octants that result from subdividing the unit hypercube along each of its length extensions into half.
A low-discrepancy sequence is said to satisfy Property A' if for any binary segment (not an arbitrary subset) of the n-dimensional sequence of length $4^n$ there is exactly one point in each of the $4^n$ hyper-octants that result from subdividing the unit hypercube along each of its length extensions into four equal parts.
[Figure: illustrations of Property A and Property A']

Distributions of 4 points in two dimensions

Property A: MC -> No; LHS -> No; Sobol' -> Yes

Distributions of 16 points in two dimensions

Property A': MC -> No; LHS -> No; Sobol' -> Yes

Comparison of Discrepancy I. Low Dimensions

[Figure: two panels, log(DN_L2) vs. log2(N) for N = 128 to 32768; curves: MC, LHS, QMC-Sobol]
Use standard MC and LHS generators.
Sobol' sequence generator: SobolSeq (www.broda.co.uk).
Sobol' sequences satisfy Properties A and A'.
Result: QMC in low dimensions shows much smaller discrepancy than MC and LHS.

Comparison of Discrepancy II. High Dimensions

[Figure: log(DN_L2) vs. log2(N) for N = 128 to 32768; curves: MC, LHS, QMC-Sobol]
All sampling methods in high dimensions have comparable discrepancy.

Do QMC methods lose their efficiency in higher dimensions?

$\varepsilon_{QMC} = \frac{O((\ln N)^n)}{N}$
Asymptotically $\varepsilon_{QMC} \sim O(1/N)$, but $\varepsilon_{QMC}$ increases with $N$ until $N^* \approx \exp(n)$.
$n = 50$: $N^* \approx 5 \cdot 10^{21}$, not achievable for practical applications.
Is QMC better than MC and LHS in higher dimensions (≥ 20)?

ANOVA decomposition and Sensitivity Indices

Consider a model $Y = f(x)$, where $x = (x_1, x_2, \ldots, x_k)$ is a vector of input variables, $0 \le x_i \le 1$, and $f(x)$ is integrable.
ANOVA decomposition:
$Y = f(x) = f_0 + \sum_{i=1}^{k} f_i(x_i) + \sum_{i} \sum_{j>i} f_{ij}(x_i, x_j) + \ldots + f_{1,2,\ldots,k}(x_1, x_2, \ldots, x_k)$,
$\int_0^1 f_{i_1 \ldots i_s}(x_{i_1}, \ldots, x_{i_s})\,dx_{i_k} = 0, \ \forall k, \ 1 \le k \le s$
Variance decomposition: $\sigma^2 = \sum_i \sigma_i^2 + \sum_{i<j} \sigma_{ij}^2 + \ldots + \sigma_{1,2,\ldots,k}^2$
Sobol' SI: $1 = \sum_{i=1}^{k} S_i + \sum_{i<j} S_{ij} + \sum_{i<j<l} S_{ijl} + \ldots + S_{1,2,\ldots,k}$

Sobol' Sensitivity Indices (SI)

Definition: $S_{i_1 \ldots i_s} = \sigma_{i_1 \ldots i_s}^2 / \sigma^2$
Partial variances: $\sigma_{i_1 \ldots i_s}^2 = \int_0^1 f_{i_1 \ldots i_s}^2(x_{i_1}, \ldots, x_{i_s})\,dx_{i_1} \ldots dx_{i_s}$
Total variance: $\sigma^2 = \int_0^1 (f(x) - f_0)^2\,dx$
Sensitivity indices for subsets of variables: $x = (y, z)$,
$\sigma_y^2 = \sum_{s=1}^{m} \sum_{(i_1 < \ldots < i_s) \in K} \sigma_{i_1, \ldots, i_s}^2$
Total variance for a subset: $(\sigma_y^{tot})^2 = \sigma^2 - \sigma_z^2$
Corresponding global sensitivity indices: $S_y = \sigma_y^2 / \sigma^2$, $S_y^{tot} = (\sigma_y^{tot})^2 / \sigma^2$.

Effective dimensions

Let $|u|$ be the cardinality of a set of variables $u$.
The effective dimension of $f(x)$ in the superposition sense is the smallest integer $d_S$ such that
$\sum_{0 < |u| \le d_S} S_u \ge (1 - \varepsilon), \ \varepsilon \ll 1$
It means that $f(x)$ is almost a sum of $d_S$-dimensional functions.
The function $f(x)$ has effective dimension in the truncation sense $d_T$ if
$\sum_{u \subseteq \{1, 2, \ldots, d_T\}} S_u \ge (1 - \varepsilon), \ \varepsilon \ll 1$
Important property: $d_S \le d_T$
Example: $f(x) = \sum_{i=1}^{n} x_i \ \to \ d_S = 1, \ d_T = n$

Classification of functions

Type A. Variables are not equally important: $\frac{S_y^T}{n_y} \gg \frac{S_z^T}{n_z} \leftrightarrow d_T \ll n$
Type B, C. Variables are equally important: $S_i \approx S_j \leftrightarrow d_T \approx n$
Type B. Dominant low order indices: $\sum_{i=1}^{n} S_i \approx 1 \leftrightarrow d_S \ll n$
Type C. Dominant higher order indices: $\sum_{i=1}^{n} S_i \ll 1 \leftrightarrow d_S \approx n$

When is LHS more effective than MC?

ANOVA: $f(x) = f_0 + \sum_i f_i(x_i) + r(x)$, where $r(x)$ collects the high-order interaction terms.
LHS: $E(\varepsilon_{LHS}^2) = \frac{1}{N} \int_{H^n} [r(x)]^2\,dx + O(\frac{1}{N})$ (Stein, 1987)
MC: $E(\varepsilon_{MC}^2) = \frac{1}{N} \sum_i \int_{H^n} [f_i(x_i)]^2\,dx + \frac{1}{N} \int_{H^n} [r(x)]^2\,dx + O(\frac{1}{N})$
If $\int_{H^n} [r(x)]^2\,dx$ is small ⇔ $d_S \ll n$ (Type B functions), then $E(\varepsilon_{LHS}^2) < E(\varepsilon_{MC}^2)$.

Classification of functions

| Function type | Description | Relationship between $S_i$ and $S_i^{tot}$ | $d_T$ | $d_S$ | QMC is more efficient than MC | LHS is more efficient than MC |
| A | A few dominant variables | $S_y^{tot}/n_y \gg S_z^{tot}/n_z$ | $\ll n$ | $\ll n$ | Yes | No |
| B | No unimportant subsets; only low-order interaction terms are present | $S_i \approx S_j, \forall i, j$; $S_i / S_i^{tot} \approx 1, \forall i$ | $\approx n$ | $\ll n$ | Yes | Yes |
| C | No unimportant subsets; high-order interaction terms are present | $S_i \approx S_j, \forall i, j$; $S_i / S_i^{tot} \ll 1, \forall i$ | $\approx n$ | $\approx n$ | No | No |

How to monitor convergence of MC, LHS and QMC calculations?

The root mean square error is defined as
$\varepsilon = \left( \frac{1}{K} \sum_{k=1}^{K} (I - I_N^k)^2 \right)^{1/2}$
$K$ is the number of independent runs.
MC and LHS: all runs should be statistically independent (use a different seed point).
QMC: for each run a different part of the Sobol' LDS was used (start from a different index number).
The root mean square error is approximated by the formula $cN^{-\alpha}$, $0 < \alpha < 1$.
MC: $\alpha \approx 0.5$
QMC: $\alpha \le 1$
LHS: $\alpha \sim$ ?

Integration error vs. N. Type A

(a) $f(x) = \sum_{i=1}^{n} (-1)^i \prod_{j=1}^{i} x_j$, $n = 360$
(b) $f(x) = \prod_{i=1}^{n} \frac{|4x_i - 2| + a_i}{1 + a_i}$, $n = 100$
$\frac{S_y^T}{n_y} \gg \frac{S_z^T}{n_z} \leftrightarrow d_T \ll n$
$\varepsilon = \left( \frac{1}{K} \sum_{k=1}^{K} (I - I_N^k)^2 \right)^{1/2}$, $\varepsilon \sim N^{-\alpha}$, $0 < \alpha < 1$
[Figure: log2(error) vs. log2(N), two panels. Panel (a) legend: MC (-0.50), QMC-SOB (-0.94), LHS (-0.52). Panel (b) legend: MC (0.49), QMC-SOB (0.65), LHS (0.50)]

Integration error. Type A

$\frac{S_y^T}{n_y} \gg \frac{S_z^T}{n_z} \leftrightarrow d_T \ll n$
$\varepsilon = \left( \frac{1}{K} \sum_{k=1}^{K} (I - I_N^k)^2 \right)^{1/2}$, $\varepsilon \sim N^{-\alpha}$, $0 < \alpha < 1$

| Index | Function | Dim n | Slope MC | Slope QMC | Slope LHS |
| 1A | $\sum_{i=1}^{n} (-1)^i \prod_{j=1}^{i} x_j$ | 360 | 0.50 | 0.94 | 0.52 |
| 2A | $\prod_{i=1}^{n} \frac{|4x_i - 2| + a_i}{1 + a_i}$, $a_1 = a_2 = 0$, $a_3 = \ldots = a_{100} = 6.52$ | 100 | 0.49 | 0.65 | 0.50 |

Integration error vs. N. Type B

Dominant low order indices: $\sum_{i=1}^{n} S_i \approx 1 \leftrightarrow d_S \ll n$
(a) $f(x) = \prod_{i=1}^{n} \frac{n - x_i}{n - 0.5}$, $n = 360$
[Figure: log2(error) vs. log2(N). Legend: MC (0.52), QMC-SOB (0.96), LHS (0.69)]
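The LHS coordinate formula and the root mean square error over K independent runs can be tried out on the Type B test function with a short sketch. This is an illustration under stated assumptions, not the authors' code: it uses only the Python standard library, the helper names (`lhs_sample`, `mc_sample`, `type_b`, `rmse`) are hypothetical, each run simply gets its own seed, and the example call uses n = 20 rather than n = 360 to keep the run short. Sobol' sequences are left out because they would need an external generator.

```python
import math
import random


def lhs_sample(n_points, n_dims, rng):
    """Latin Hypercube sample using x_i^k = (pi_k(i) - 1 + U_i^k) / N."""
    columns = []
    for _ in range(n_dims):
        perm = list(range(1, n_points + 1))  # pi_k: a permutation of {1, ..., N}
        rng.shuffle(perm)
        columns.append([(p - 1 + rng.random()) / n_points for p in perm])
    # Transpose so that each row is one n_dims-dimensional point.
    return [list(point) for point in zip(*columns)]


def mc_sample(n_points, n_dims, rng):
    """Plain Monte Carlo sample: independent uniform points in the unit hypercube."""
    return [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]


def type_b(x):
    """Type B test function prod_i (n - x_i) / (n - 0.5); its exact integral is 1."""
    n = len(x)
    return math.prod((n - xi) / (n - 0.5) for xi in x)


def rmse(sampler, f, exact, n_points, n_dims, n_runs, seed):
    """Root mean square integration error over n_runs independently seeded runs."""
    total = 0.0
    for k in range(n_runs):
        rng = random.Random(seed + k)  # a different seed for every run
        points = sampler(n_points, n_dims, rng)
        estimate = sum(f(p) for p in points) / n_points
        total += (exact - estimate) ** 2
    return math.sqrt(total / n_runs)


if __name__ == "__main__":
    for n_points in (64, 256, 1024):
        e_mc = rmse(mc_sample, type_b, 1.0, n_points, 20, 20, seed=1)
        e_lhs = rmse(lhs_sample, type_b, 1.0, n_points, 20, 20, seed=1)
        print(f"N={n_points:5d}  MC rmse={e_mc:.2e}  LHS rmse={e_lhs:.2e}")
```

Since LHS removes the contribution of the first-order terms $f_i(x_i)$ from the leading error term, one would expect the LHS error for this function to come out below the MC error at the same N. Fitting a line to log2(error) against log2(N) would give a rough estimate of the slope $\alpha$.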