Importance Sampling & Sequential Importance Sampling

Arnaud Doucet
Departments of Statistics & Computer Science, University of British Columbia

Generic Problem

Consider a sequence of probability distributions $\{\pi_n\}_{n \geq 1}$ defined on a sequence of (measurable) spaces $\{(E_n, \mathcal{F}_n)\}_{n \geq 1}$, where $E_1 = E$, $\mathcal{F}_1 = \mathcal{F}$ and $E_n = E_{n-1} \times E$, $\mathcal{F}_n = \mathcal{F}_{n-1} \times \mathcal{F}$.

Each distribution $\pi_n(dx_{1:n}) = \pi_n(x_{1:n})\,dx_{1:n}$ is known up to a normalizing constant, i.e.

$$\pi_n(x_{1:n}) = \frac{\gamma_n(x_{1:n})}{Z_n}.$$

We want to estimate expectations of test functions $\varphi_n : E_n \to \mathbb{R}$,

$$E_{\pi_n}(\varphi_n) = \int \varphi_n(x_{1:n})\,\pi_n(dx_{1:n}),$$

and/or the normalizing constants $Z_n$.

We want to do this sequentially; i.e. first $\pi_1$ and/or $Z_1$ at time 1, then $\pi_2$ and/or $Z_2$ at time 2, and so on.
Using Monte Carlo Methods

Problem 1: For most problems of interest, we cannot sample from $\pi_n(x_{1:n})$. A standard approach to sampling from a high-dimensional distribution is to use iterative Markov chain Monte Carlo algorithms, but this is not appropriate in our context.

Problem 2: Even if we could sample exactly from $\pi_n(x_{1:n})$, the computational complexity of the algorithm would most likely increase with $n$, whereas we typically want an algorithm of fixed computational complexity at each time step.

Summary: We cannot use standard MC sampling in our case and, even if we could, this would not solve our problem.

Plan of the Lectures

Review of Importance Sampling.
Sequential Importance Sampling.
Applications.

Importance Sampling

Importance Sampling (IS) identity.
For any distribution $q$ such that $\pi(x) > 0 \Rightarrow q(x) > 0$,

$$\pi(x) = \frac{w(x)\,q(x)}{\int w(x)\,q(x)\,dx} \quad \text{where} \quad w(x) = \frac{\gamma(x)}{q(x)},$$

where $q$ is called the importance distribution and $w$ the importance weight.

$q$ can be chosen arbitrarily; in particular, it can be chosen to be easy to sample from:

$$X^{(i)} \overset{\text{i.i.d.}}{\sim} q(\cdot) \;\Rightarrow\; \hat{q}(dx) = \frac{1}{N} \sum_{i=1}^{N} \delta_{X^{(i)}}(dx).$$

Plugging this expression into the IS identity gives

$$\hat{\pi}(dx) = \sum_{i=1}^{N} W^{(i)}\,\delta_{X^{(i)}}(dx) \quad \text{where} \quad W^{(i)} \propto w\left(X^{(i)}\right),$$

$$\hat{Z} = \frac{1}{N} \sum_{i=1}^{N} w\left(X^{(i)}\right).$$

$\pi(x)$ is now approximated by a weighted sum of delta-masses; the weights compensate for the discrepancy between $\pi$ and $q$.

Practical recommendations

Select $q$ as close to $\pi$ as possible.

The variance of the weights is bounded if and only if

$$\int \frac{\gamma^2(x)}{q(x)}\,dx < \infty.$$

In practice, try to ensure that the weights are bounded:

$$w(x) = \frac{\gamma(x)}{q(x)} < \infty.$$

Note that in this case, rejection sampling could be used to sample from $\pi(x)$.

Example

Figure: Target double exponential distribution and two IS distributions (panels: double exponential, Gaussian, heavy-tailed Student-t).

Figure: IS approximation obtained using a Gaussian IS distribution (values of the weights against values of x).

Figure: IS approximation obtained using a Student-t IS distribution (values of the weights against values of x).
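The example above can be reproduced numerically. The following is a minimal sketch, not the code behind the figures: it assumes the unnormalized target $\gamma(x) = e^{-|x|}$ (so that $Z = 2$), a standard Gaussian proposal, and a Student-t proposal with 3 degrees of freedom. All function names, the sample size, and the choice of degrees of freedom are illustrative assumptions. The effective sample size $1 / \sum_i (W^{(i)})^2$ is a diagnostic added here and does not appear in the slides.

```python
import math
import random


def laplace_unnormalized(x):
    """Unnormalized double exponential target gamma(x) = exp(-|x|), so Z = 2."""
    return math.exp(-abs(x))


def gaussian_pdf(x):
    """Density of the standard Gaussian proposal."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def student_t_pdf(x, nu=3.0):
    """Density of the Student-t proposal (nu = 3 is an assumption)."""
    c = math.gamma((nu + 1.0) / 2.0) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2.0))
    return c * (1.0 + x * x / nu) ** (-(nu + 1.0) / 2.0)


def sample_student_t(rng, nu=3.0):
    """Draw t_nu as Z / sqrt(V / nu), with Z ~ N(0, 1) and V ~ chi-square(nu)."""
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(nu / 2.0, 2.0)  # Gamma(shape nu/2, scale 2) is chi-square(nu)
    return z / math.sqrt(v / nu)


def importance_sampling(gamma, q_pdf, q_sample, n, rng):
    """Return samples, normalized weights W, the estimate of Z, and the ESS."""
    xs = [q_sample(rng) for _ in range(n)]
    ws = [gamma(x) / q_pdf(x) for x in xs]       # unnormalized weights w(X)
    total = sum(ws)
    z_hat = total / n                            # (1/N) * sum of w(X)
    norm_ws = [w / total for w in ws]            # W proportional to w(X)
    ess = 1.0 / sum(w * w for w in norm_ws)      # added diagnostic, not in the slides
    return xs, norm_ws, z_hat, ess


def run(n=50_000, seed=0):
    """Compare the two proposals on the estimate of Z and of E[X^2] (both equal 2)."""
    rng = random.Random(seed)
    proposals = {
        "gaussian": (gaussian_pdf, lambda r: r.gauss(0.0, 1.0)),
        "student_t": (student_t_pdf, sample_student_t),
    }
    results = {}
    for name, (pdf, sampler) in proposals.items():
        xs, ws, z_hat, ess = importance_sampling(laplace_unnormalized, pdf, sampler, n, rng)
        second_moment = sum(w * x * x for x, w in zip(xs, ws))
        results[name] = (z_hat, second_moment, ess)
    return results


if __name__ == "__main__":
    for name, (z_hat, m2, ess) in run().items():
        print(f"{name}: Z_hat={z_hat:.4f}  E[X^2]={m2:.4f}  ESS={ess:.0f}")
```

With the Gaussian proposal, $\gamma(x)/q(x)$ grows like $e^{x^2/2 - |x|}$, so $\int \gamma^2(x)/q(x)\,dx$ diverges and a few samples in the tails carry very large weights. With the Student-t proposal the ratio stays bounded, which is the situation the practical recommendations aim for.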
