Monte Carlo Methods and Importance Sampling

Total Page:16

File Type:pdf, Size:1020Kb

Monte Carlo Methods and Importance Sampling Lecture Notes for Stat 578C c Eric C. Anderson Statistical Genetics 20 October 1999 (subbin’ for E.A Thompson) Monte Carlo Methods and Importance Sampling History and denition: The term “Monte Carlo” was apparently rst used by Ulam and von Neumann as a Los Alamos code word for the stochastic simulations they applied to building better atomic bombs. Their methods, involving the laws of chance, were aptly named after the inter- national gaming destination; the moniker stuck and soon after the War a wide range of sticky problems yielded to the new techniques. Despite the widespread use of the methods, and numerous descriptions of them in articles and monographs, it is virtually impossible to nd a succint deni- tion of “Monte Carlo method” in the literature. Perhaps this is owing to the intuitive nature of the topic which spawns many denitions by way of specic examples. Some authors prefer to use the term “stochastic simulation” for almost everything, reserving “Monte Carlo” only for Monte Carlo Integration and Monte Carlo Tests (cf. Ripley 1987). Others seem less concerned about blurring the distinction between simulation studies and Monte Carlo methods. Be that as it may, a suitable denition can be good to have, if for nothing other than to avoid the awkwardness of trying to dene the Monte Carlo method by appealing to a whole bevy of examples of it. Since I am (so Elizabeth claims!) unduly inuenced by my advisor’s ways of thinking, I like to dene Monte Carlo in the spirit of denitions she has used before. In particular, I use: Definition: Monte Carlo is the art of approximating an expectation by the sample mean of a function of simulated random variables. We will nd that this denition is broad enough to cover everything that has been called Monte Carlo, and yet makes clear its essence in very familiar terms: Monte Carlo is about invoking laws of large numbers to approximate expectations.1 While most Monte Carlo simulations are done by computer today, there were many applications of Monte Carlo methods using coin-ipping, card-drawing, or needle-tossing (rather than computer- generated pseudo-random numbers) as early as the turn of the century—long before the name Monte Carlo arose. In more mathematical terms: Consider a (possibly multidimensional) random variable X having probability mass function or probability density function fX (x) which is greater than zero on a set of values X . Then the expected value of a function g of X is X E(g(X)) = g(x)fX (x) (1) x∈X if X is discrete, and Z E(g(X)) = g(x)fX (x)dx (2) x∈X if X is continuous. Now, if we were to take an n-sample of X’s, (x1,...,xn), and we computed the mean of g(x) over the sample, then we would have the Monte Carlo estimate Xn 1 ge (x)= g(x ) n n i i=1 1This applies when the simulated variables are independent of one another, and might apply when they are correlated with one another (for example if they are states visited by an ergodic Markov chain). For now we will just deal with independent simulated random variables, but all of this extends to samples from Markov chains via the weak law of large numbers for the number of passages through a recurrent state in an ergodic Markov chain (see Feller 1957). You will encounter this later when talking about MCMC. 1 2 c Eric C. Anderson. You may duplicate this freely for pedagogical purposes. of E(g(X)). We could, alternatively, speak of the random variable Xn 1 ge (X)= g(X) n n i=1 which we call the Monte Carlo estimator of E(g(X)). If E(g(X)), exists, then the weak law of large numbers tells us that for any arbitrarily small lim P (|gen(X) E(g(X))|)=0. n→∞ This tells us that as n gets large, then there is small probability that gen(X) deviates much from E(g(X)). For our purposes, the strong law of large numbers says much the same thing—the important part being that so long as n is large enough, gen(x) arising from a Monte Carlo experiment shall be close to E(g(X)), as desired. One other thing to note at this point is that gen(X) is unbiased for E(g(X)): ! Xn Xn 1 1 E(ge (X)) = E g(X ) = E(g(X )) = E(g(X)). n n i n i i=1 i=1 Making this useful: The preceding section comes to life and becomes useful when one realizes that very many quantities of interest may be cast as expectations. Most importantly for applica- tions in statistical genetics, it is possible to express all probabilities, integrals, and summations as expectations: Probabilities: Let Y be a random variable. The probability that Y takes on some value in a set A can be expressed as an expectation using the indicator function: ∈ E P (Y A)= (I{A}(Y )) (3) ∈ 6∈ where I{A}(Y ) is the indicator function that takes the value 1 when Y A and 0 when Y A. Integrals: Consider a problem now which is completely deterministic—integratingR a function q(x) b from a to b (as in high-school calculus). So we have a q(x)dx. This can be expressed as an expectation with respect to a uniformly distributed, continuous random variable U between a and b. U has density function fU (u)=1/(b a), so if we rewrite the integral we get Z Z b b 1 E (b a) q(x) dx =(b a) q(x)fU (x)dx =(b a) (q(U)) ...voila! a b a a Discrete Sums: The discrete version of the above is just the sum of a function q(x) over the count- ably many values of x in a set A.P If we have a random variable W which takes values in A all with equal probability p (so that w∈A p = 1 then the sum may be cast as the expectation X X 1 1 q(x)= q(x)p = E(q(W )). p p x∈A x∈A The immediate consequence of this is that all probabilities, integrals, and summations can be approximated by the Monte Carlo method. A crucial thing to note, however, is that there is no restriction that says U or W above must have uniform distributions. This is just for easy illustration of the points above. We will explore this point more while considering importance sampling. c Eric C. Anderson. Corrections, comments?=⇒[email protected] 3 Example I: Approximating probabilities by Monte Carlo. Consider a Wright-Fisher population of size N diploid individuals in which Xt counts the numbers of copies of a certain allelic type in the population at time t. At time zero, there are x0 copies of the allele. Given this, what is the probability that the allele will be lost from the population in t generations, i.e., P (Xt =0|X0 = x0)? This can be computed exactly by multiplying transition probability matrices together, or by employing the Baum (1972) algorithm (which you will learn about later), but it can also be approximated by Monte Carlo. It is simple to simulate genetic drift in a Wright-Fisher population; thus we can easily simulate values for Xt given X0 = x0. Then, P (Xt =0|X0 = x0)=E(I{0}(Xt)|X0 = x0) th where I{0}(Xt) takes the value 1 when Xt = 0 and 0 otherwise. Denoting the i simulated (i) value of Xt by xt our Monte Carlo estimate would be Xn e 1 (i) P (X =0|X = x ) I{ }(x ). t 0 0 n 0 t i=1 Example II: Monte Carlo approximations to distributions. A simple extension of the above example is to approximate the whole probability distribution P (Xt|X0 = x0) by Monte Carlo. Consider the histogram below: 0.04 0.03 0.02 0.01 0 Monte Carlo Estimate of Probability 0 20 40 60 80 100 120 140 160 180 200 Number of Copies of Allele It represents the results of simulations in which n =50, 000, x0 = 60, t = 14, the Wright-Fisher population size N = 100 diploids, and each rectangle represents a Monte Carlo approximation to P (a X14 <a+2|X0 = 60), a =0, 2, 4,...,200. For each such probability, the approximation follows from Xn 1 (i) P (a X <a+2|X = 60) = E(I{ }(X )) I{ }(x ) ,a=0, 2,...,200 14 0 a X<a+2 14 n a X<a+2 14 i=1 Example III: A discrete sum over latent variables. In many applications in statistical genetics, the probability P (Y ) of an observed event Y must be computed as the sum over very many latent variables X of the joint probability P (Y,X). In such a case, Y is typically xed, i.e., we have observed Y = y so we are interested in P (Y = y), but we can’t observe the values of the latent variables which may take values in the space X . Though it follows from the laws of probability that X P (Y = y)= P (Y = y, X = x), x∈X quite often X is such a large space (contains so many elements) that it is impossible to compute the sum. Application of the law of conditional probability, however, gives X X P (Y = y)= P (Y = y, X = x)= P (Y = y|X = x)P (X = x).
Recommended publications
  • Kalman and Particle Filtering
    Abstract: The Kalman and Particle filters are algorithms that recursively update an estimate of the state and find the innovations driving a stochastic process given a sequence of observations. The Kalman filter accomplishes this goal by linear projections, while the Particle filter does so by a sequential Monte Carlo method. With the state estimates, we can forecast and smooth the stochastic process. With the innovations, we can estimate the parameters of the model. The article discusses how to set a dynamic model in a state-space form, derives the Kalman and Particle filters, and explains how to use them for estimation. Kalman and Particle Filtering The Kalman and Particle filters are algorithms that recursively update an estimate of the state and find the innovations driving a stochastic process given a sequence of observations. The Kalman filter accomplishes this goal by linear projections, while the Particle filter does so by a sequential Monte Carlo method. Since both filters start with a state-space representation of the stochastic processes of interest, section 1 presents the state-space form of a dynamic model. Then, section 2 intro- duces the Kalman filter and section 3 develops the Particle filter. For extended expositions of this material, see Doucet, de Freitas, and Gordon (2001), Durbin and Koopman (2001), and Ljungqvist and Sargent (2004). 1. The state-space representation of a dynamic model A large class of dynamic models can be represented by a state-space form: Xt+1 = ϕ (Xt,Wt+1; γ) (1) Yt = g (Xt,Vt; γ) . (2) This representation handles a stochastic process by finding three objects: a vector that l describes the position of the system (a state, Xt X R ) and two functions, one mapping ∈ ⊂ 1 the state today into the state tomorrow (the transition equation, (1)) and one mapping the state into observables, Yt (the measurement equation, (2)).
    [Show full text]
  • Lecture 15: Approximate Inference: Monte Carlo Methods
    10-708: Probabilistic Graphical Models 10-708, Spring 2017 Lecture 15: Approximate Inference: Monte Carlo methods Lecturer: Eric P. Xing Name: Yuan Liu(yuanl4), Yikang Li(yikangl) 1 Introduction We have already learned some exact inference methods in the previous lectures, such as the elimination algo- rithm, message-passing algorithm and the junction tree algorithms. However, there are many probabilistic models of practical interest for which exact inference is intractable. One important class of inference algo- rithms is based on deterministic approximation schemes and includes methods such as variational methods. This lecture will include an alternative very general and widely used framework for approximate inference based on stochastic simulation through sampling, also known as the Monte Carlo methods. The purpose of the Monte Carlo method is to find the expectation of some function f(x) with respect to a probability distribution p(x). Here, the components of x might comprise of discrete or continuous variables, or some combination of the two. Z hfi = f(x)p(x)dx The general idea behind sampling methods is to obtain a set of examples x(t)(where t = 1; :::; N) drawn from the distribution p(x). This allows the expectation (1) to be approximated by a finite sum N 1 X f^ = f(x(t)) N t=1 If the samples x(t) is i.i.d, it is clear that hf^i = hfi, and the variance of the estimator is h(f − hfi)2i=N. Monte Carlo methods work by drawing samples from the desired distribution, which can be accomplished even when it it not possible to write out the pdf.
    [Show full text]
  • A Fourier-Wavelet Monte Carlo Method for Fractal Random Fields
    JOURNAL OF COMPUTATIONAL PHYSICS 132, 384±408 (1997) ARTICLE NO. CP965647 A Fourier±Wavelet Monte Carlo Method for Fractal Random Fields Frank W. Elliott Jr., David J. Horntrop, and Andrew J. Majda Courant Institute of Mathematical Sciences, 251 Mercer Street, New York, New York 10012 Received August 2, 1996; revised December 23, 1996 2 2H k[v(x) 2 v(y)] l 5 CHux 2 yu , (1.1) A new hierarchical method for the Monte Carlo simulation of random ®elds called the Fourier±wavelet method is developed and where 0 , H , 1 is the Hurst exponent and k?l denotes applied to isotropic Gaussian random ®elds with power law spectral the expected value. density functions. This technique is based upon the orthogonal Here we develop a new Monte Carlo method based upon decomposition of the Fourier stochastic integral representation of the ®eld using wavelets. The Meyer wavelet is used here because a wavelet expansion of the Fourier space representation of its rapid decay properties allow for a very compact representation the fractal random ®elds in (1.1). This method is capable of the ®eld. The Fourier±wavelet method is shown to be straightfor- of generating a velocity ®eld with the Kolmogoroff spec- ward to implement, given the nature of the necessary precomputa- trum (H 5 Ad in (1.1)) over many (10 to 15) decades of tions and the run-time calculations, and yields comparable results scaling behavior comparable to the physical space multi- with scaling behavior over as many decades as the physical space multiwavelet methods developed recently by two of the authors.
    [Show full text]
  • Estimating Standard Errors for Importance Sampling Estimators with Multiple Markov Chains
    Estimating standard errors for importance sampling estimators with multiple Markov chains Vivekananda Roy1, Aixin Tan2, and James M. Flegal3 1Department of Statistics, Iowa State University 2Department of Statistics and Actuarial Science, University of Iowa 3Department of Statistics, University of California, Riverside Aug 10, 2016 Abstract The naive importance sampling estimator, based on samples from a single importance density, can be numerically unstable. Instead, we consider generalized importance sampling estimators where samples from more than one probability distribution are combined. We study this problem in the Markov chain Monte Carlo context, where independent samples are replaced with Markov chain samples. If the chains converge to their respective target distributions at a polynomial rate, then under two finite moment conditions, we show a central limit theorem holds for the generalized estimators. Further, we develop an easy to implement method to calculate valid asymptotic standard errors based on batch means. We also provide a batch means estimator for calculating asymptotically valid standard errors of Geyer’s (1994) reverse logistic estimator. We illustrate the method via three examples. In particular, the generalized importance sampling estimator is used for Bayesian spatial modeling of binary data and to perform empirical Bayes variable selection where the batch means estimator enables standard error calculations in high-dimensional settings. arXiv:1509.06310v2 [math.ST] 11 Aug 2016 Key words and phrases: Bayes factors, Markov chain Monte Carlo, polynomial ergodicity, ratios of normalizing constants, reverse logistic estimator. 1 Introduction Let π(x) = ν(x)=m be a probability density function (pdf) on X with respect to a measure µ(·). R Suppose f : X ! R is a π integrable function and we want to estimate Eπf := X f(x)π(x)µ(dx).
    [Show full text]
  • Three Problems of the Theory of Choice on Random Sets
    DIVISION OF THE HUMANITIES AND SOCIAL SCIENCES CALIFORNIA INSTITUTE OF TECHNOLOGY PASADENA, CALIFORNIA 91125 THREE PROBLEMS OF THE THEORY OF CHOICE ON RANDOM SETS B. A. Berezovskiy,* Yu. M. Baryshnikov, A. Gnedin V. Institute of Control Sciences Academy of Sciences of the U.S.S.R, Moscow *Guest of the California Institute of Technology SOCIAL SCIENCE WORKING PAPER 661 December 1987 THREE PROBLEMS OF THE THEORY OF CHOICE ON RANDOM SETS B. A. Berezovskiy, Yu. M. Baryshnikov, A. V. Gnedin Institute of Control Sciences Academy of Sciences of the U.S.S.R., Moscow ABSTRACT This paper discusses three problems which are united not only by the common topic of research stated in thetitle, but also by a somewhat surprising interlacing of the methods and techniques used. In the first problem, an attempt is made to resolve a very unpleasant metaproblem arising in general choice theory: why theconditions of rationality are not really necessary or, in other words, why in every-day life we are quite satisfied with choice methods which are far from being ideal. The answer, substantiated by a number of results, is as follows: situations in which the choice function "misbehaves" are very seldom met in large presentations. the second problem, an overview of our studies is given on the problem of statistical propertiesIn of choice. One of themost astonishing phenomenon found when we deviate from scalar­ extremal choice functions is in stable multiplicity of choice. If our presentation is random, then a random number of alternatives is chosen in it. But how many? The answer isn't trival, and may be sought in many differentdirections.
    [Show full text]
  • Distilling Importance Sampling
    Distilling Importance Sampling Dennis Prangle1 1School of Mathematics, University of Bristol, United Kingdom Abstract in almost any setting, including in the presence of strong posterior dependence or discrete random variables. How- ever it only achieves a representative weighted sample at a Many complicated Bayesian posteriors are difficult feasible cost if the proposal is a reasonable approximation to approximate by either sampling or optimisation to the target distribution. methods. Therefore we propose a novel approach combining features of both. We use a flexible para- An alternative to Monte Carlo is to use optimisation to meterised family of densities, such as a normal- find the best approximation to the posterior from a family of ising flow. Given a density from this family approx- distributions. Typically this is done in the framework of vari- imating the posterior, we use importance sampling ational inference (VI). VI is computationally efficient but to produce a weighted sample from a more accur- has the drawback that it often produces poor approximations ate posterior approximation. This sample is then to the posterior distribution e.g. through over-concentration used in optimisation to update the parameters of [Turner et al., 2008, Yao et al., 2018]. the approximate density, which we view as dis- A recent improvement in VI is due to the development of tilling the importance sampling results. We iterate a range of flexible and computationally tractable distribu- these steps and gradually improve the quality of the tional families using normalising flows [Dinh et al., 2016, posterior approximation. We illustrate our method Papamakarios et al., 2019a]. These transform a simple base in two challenging examples: a queueing model random distribution to a complex distribution, using a se- and a stochastic differential equation model.
    [Show full text]
  • Approaching Mean-Variance Efficiency for Large Portfolios
    Approaching Mean-Variance Efficiency for Large Portfolios ∗ Mengmeng Ao† Yingying Li‡ Xinghua Zheng§ First Draft: October 6, 2014 This Draft: November 11, 2017 Abstract This paper studies the large dimensional Markowitz optimization problem. Given any risk constraint level, we introduce a new approach for estimating the optimal portfolio, which is developed through a novel unconstrained regression representation of the mean-variance optimization problem, combined with high-dimensional sparse regression methods. Our estimated portfolio, under a mild sparsity assumption, asymptotically achieves mean-variance efficiency and meanwhile effectively controls the risk. To the best of our knowledge, this is the first time that these two goals can be simultaneously achieved for large portfolios. The superior properties of our approach are demonstrated via comprehensive simulation and empirical studies. Keywords: Markowitz optimization; Large portfolio selection; Unconstrained regression, LASSO; Sharpe ratio ∗Research partially supported by the RGC grants GRF16305315, GRF 16502014 and GRF 16518716 of the HKSAR, and The Fundamental Research Funds for the Central Universities 20720171073. †Wang Yanan Institute for Studies in Economics & Department of Finance, School of Economics, Xiamen University, China. [email protected] ‡Hong Kong University of Science and Technology, HKSAR. [email protected] §Hong Kong University of Science and Technology, HKSAR. [email protected] 1 1 INTRODUCTION 1.1 Markowitz Optimization Enigma The groundbreaking mean-variance portfolio theory proposed by Markowitz (1952) contin- ues to play significant roles in research and practice. The optimal mean-variance portfolio has a simple explicit expression1 that only depends on two population characteristics, the mean and the covariance matrix of asset returns. Under the ideal situation when the underlying mean and covariance matrix are known, mean-variance investors can easily compute the optimal portfolio weights based on their preferred level of risk.
    [Show full text]
  • Importance Sampling & Sequential Importance Sampling
    Importance Sampling & Sequential Importance Sampling Arnaud Doucet Departments of Statistics & Computer Science University of British Columbia A.D. () 1 / 40 Each distribution πn (dx1:n) = πn (x1:n) dx1:n is known up to a normalizing constant, i.e. γn (x1:n) πn (x1:n) = Zn We want to estimate expectations of test functions ϕ : En R n ! Eπn (ϕn) = ϕn (x1:n) πn (dx1:n) Z and/or the normalizing constants Zn. We want to do this sequentially; i.e. …rst π1 and/or Z1 at time 1 then π2 and/or Z2 at time 2 and so on. Generic Problem Consider a sequence of probability distributions πn n 1 de…ned on a f g sequence of (measurable) spaces (En, n) n 1 where E1 = E, f F g 1 = and En = En 1 E, n = n 1 . F F F F F A.D. () 2 / 40 We want to estimate expectations of test functions ϕ : En R n ! Eπn (ϕn) = ϕn (x1:n) πn (dx1:n) Z and/or the normalizing constants Zn. We want to do this sequentially; i.e. …rst π1 and/or Z1 at time 1 then π2 and/or Z2 at time 2 and so on. Generic Problem Consider a sequence of probability distributions πn n 1 de…ned on a f g sequence of (measurable) spaces (En, n) n 1 where E1 = E, f F g 1 = and En = En 1 E, n = n 1 . F F F F F Each distribution πn (dx1:n) = πn (x1:n) dx1:n is known up to a normalizing constant, i.e.
    [Show full text]
  • A Family of Skew-Normal Distributions for Modeling Proportions and Rates with Zeros/Ones Excess
    S S symmetry Article A Family of Skew-Normal Distributions for Modeling Proportions and Rates with Zeros/Ones Excess Guillermo Martínez-Flórez 1, Víctor Leiva 2,* , Emilio Gómez-Déniz 3 and Carolina Marchant 4 1 Departamento de Matemáticas y Estadística, Facultad de Ciencias Básicas, Universidad de Córdoba, Montería 14014, Colombia; [email protected] 2 Escuela de Ingeniería Industrial, Pontificia Universidad Católica de Valparaíso, 2362807 Valparaíso, Chile 3 Facultad de Economía, Empresa y Turismo, Universidad de Las Palmas de Gran Canaria and TIDES Institute, 35001 Canarias, Spain; [email protected] 4 Facultad de Ciencias Básicas, Universidad Católica del Maule, 3466706 Talca, Chile; [email protected] * Correspondence: [email protected] or [email protected] Received: 30 June 2020; Accepted: 19 August 2020; Published: 1 September 2020 Abstract: In this paper, we consider skew-normal distributions for constructing new a distribution which allows us to model proportions and rates with zero/one inflation as an alternative to the inflated beta distributions. The new distribution is a mixture between a Bernoulli distribution for explaining the zero/one excess and a censored skew-normal distribution for the continuous variable. The maximum likelihood method is used for parameter estimation. Observed and expected Fisher information matrices are derived to conduct likelihood-based inference in this new type skew-normal distribution. Given the flexibility of the new distributions, we are able to show, in real data scenarios, the good performance of our proposal. Keywords: beta distribution; centered skew-normal distribution; maximum-likelihood methods; Monte Carlo simulations; proportions; R software; rates; zero/one inflated data 1.
    [Show full text]
  • Random Processes
    Chapter 6 Random Processes Random Process • A random process is a time-varying function that assigns the outcome of a random experiment to each time instant: X(t). • For a fixed (sample path): a random process is a time varying function, e.g., a signal. – For fixed t: a random process is a random variable. • If one scans all possible outcomes of the underlying random experiment, we shall get an ensemble of signals. • Random Process can be continuous or discrete • Real random process also called stochastic process – Example: Noise source (Noise can often be modeled as a Gaussian random process. An Ensemble of Signals Remember: RV maps Events à Constants RP maps Events à f(t) RP: Discrete and Continuous The set of all possible sample functions {v(t, E i)} is called the ensemble and defines the random process v(t) that describes the noise source. Sample functions of a binary random process. RP Characterization • Random variables x 1 , x 2 , . , x n represent amplitudes of sample functions at t 5 t 1 , t 2 , . , t n . – A random process can, therefore, be viewed as a collection of an infinite number of random variables: RP Characterization – First Order • CDF • PDF • Mean • Mean-Square Statistics of a Random Process RP Characterization – Second Order • The first order does not provide sufficient information as to how rapidly the RP is changing as a function of timeà We use second order estimation RP Characterization – Second Order • The first order does not provide sufficient information as to how rapidly the RP is changing as a function
    [Show full text]
  • Fourier, Wavelet and Monte Carlo Methods in Computational Finance
    Fourier, Wavelet and Monte Carlo Methods in Computational Finance Kees Oosterlee1;2 1CWI, Amsterdam 2Delft University of Technology, the Netherlands AANMPDE-9-16, 7/7/2016 Kees Oosterlee (CWI, TU Delft) Comp. Finance AANMPDE-9-16 1 / 51 Joint work with Fang Fang, Marjon Ruijter, Luis Ortiz, Shashi Jain, Alvaro Leitao, Fei Cong, Qian Feng Agenda Derivatives pricing, Feynman-Kac Theorem Fourier methods Basics of COS method; Basics of SWIFT method; Options with early-exercise features COS method for Bermudan options Monte Carlo method BSDEs, BCOS method (very briefly) Kees Oosterlee (CWI, TU Delft) Comp. Finance AANMPDE-9-16 1 / 51 Agenda Derivatives pricing, Feynman-Kac Theorem Fourier methods Basics of COS method; Basics of SWIFT method; Options with early-exercise features COS method for Bermudan options Monte Carlo method BSDEs, BCOS method (very briefly) Joint work with Fang Fang, Marjon Ruijter, Luis Ortiz, Shashi Jain, Alvaro Leitao, Fei Cong, Qian Feng Kees Oosterlee (CWI, TU Delft) Comp. Finance AANMPDE-9-16 1 / 51 Feynman-Kac theorem: Z T v(t; x) = E g(s; Xs )ds + h(XT ) ; t where Xs is the solution to the FSDE dXs = µ(Xs )ds + σ(Xs )d!s ; Xt = x: Feynman-Kac Theorem The linear partial differential equation: @v(t; x) + Lv(t; x) + g(t; x) = 0; v(T ; x) = h(x); @t with operator 1 Lv(t; x) = µ(x)Dv(t; x) + σ2(x)D2v(t; x): 2 Kees Oosterlee (CWI, TU Delft) Comp. Finance AANMPDE-9-16 2 / 51 Feynman-Kac Theorem The linear partial differential equation: @v(t; x) + Lv(t; x) + g(t; x) = 0; v(T ; x) = h(x); @t with operator 1 Lv(t; x) = µ(x)Dv(t; x) + σ2(x)D2v(t; x): 2 Feynman-Kac theorem: Z T v(t; x) = E g(s; Xs )ds + h(XT ) ; t where Xs is the solution to the FSDE dXs = µ(Xs )ds + σ(Xs )d!s ; Xt = x: Kees Oosterlee (CWI, TU Delft) Comp.
    [Show full text]
  • Efficient Monte Carlo Methods for Estimating Failure Probabilities
    Reliability Engineering and System Safety 165 (2017) 376–394 Contents lists available at ScienceDirect Reliability Engineering and System Safety journal homepage: www.elsevier.com/locate/ress ffi E cient Monte Carlo methods for estimating failure probabilities MARK ⁎ Andres Albana, Hardik A. Darjia,b, Atsuki Imamurac, Marvin K. Nakayamac, a Mathematical Sciences Dept., New Jersey Institute of Technology, Newark, NJ 07102, USA b Mechanical Engineering Dept., New Jersey Institute of Technology, Newark, NJ 07102, USA c Computer Science Dept., New Jersey Institute of Technology, Newark, NJ 07102, USA ARTICLE INFO ABSTRACT Keywords: We develop efficient Monte Carlo methods for estimating the failure probability of a system. An example of the Probabilistic safety assessment problem comes from an approach for probabilistic safety assessment of nuclear power plants known as risk- Risk analysis informed safety-margin characterization, but it also arises in other contexts, e.g., structural reliability, Structural reliability catastrophe modeling, and finance. We estimate the failure probability using different combinations of Uncertainty simulation methodologies, including stratified sampling (SS), (replicated) Latin hypercube sampling (LHS), Monte Carlo and conditional Monte Carlo (CMC). We prove theorems establishing that the combination SS+LHS (resp., SS Variance reduction Confidence intervals +CMC+LHS) has smaller asymptotic variance than SS (resp., SS+LHS). We also devise asymptotically valid (as Nuclear regulation the overall sample size grows large) upper confidence bounds for the failure probability for the methods Risk-informed safety-margin characterization considered. The confidence bounds may be employed to perform an asymptotically valid probabilistic safety assessment. We present numerical results demonstrating that the combination SS+CMC+LHS can result in substantial variance reductions compared to stratified sampling alone.
    [Show full text]