Particle Filters
Frank Schorfheide University of Pennsylvania EABCN Training School
May 10, 2016
From Linear to Nonlinear DSGE Models
• While DSGE models are inherently nonlinear, the nonlinearities are often small and decision rules are approximately linear.
• One can add certain features that generate more pronounced nonlinearities:
• stochastic volatility;
• Markov-switching coefficients;
• asymmetric adjustment costs;
• occasionally binding constraints.
From Linear to Nonlinear DSGE Models
• Linear DSGE model leads to
y_t = Ψ_0(θ) + Ψ_1(θ)t + Ψ_2(θ)s_t + u_t,  u_t ∼ N(0, Σ_u),
s_t = Φ_1(θ)s_{t-1} + Φ_ε(θ)ε_t,  ε_t ∼ N(0, Σ_ε).
• Nonlinear DSGE model leads to
y_t = Ψ(s_t, t; θ) + u_t,  u_t ∼ F_u(·; θ),
s_t = Φ(s_{t-1}, ε_t; θ),  ε_t ∼ F_ε(·; θ).
Particle Filters
• There are many particle filters...
• We will focus on three types:
• Bootstrap PF
• A generic PF
• A conditionally-optimal PF
Filtering - General Idea
• State-space representation of nonlinear DSGE model
Measurement Eq.: y_t = Ψ(s_t, t; θ) + u_t,  u_t ∼ F_u(·; θ)
State Transition: s_t = Φ(s_{t-1}, ε_t; θ),  ε_t ∼ F_ε(·; θ).
• Likelihood function:

  p(Y_{1:T}|θ) = ∏_{t=1}^{T} p(y_t|Y_{1:t-1}, θ)
• A filter generates a sequence of conditional distributions s_t|Y_{1:t}.
• Iterations:
  • Initialization at time t-1: p(s_{t-1}|Y_{1:t-1}, θ)
  • Forecasting t given t-1:
    1. Transition equation: p(s_t|Y_{1:t-1}, θ) = ∫ p(s_t|s_{t-1}, Y_{1:t-1}, θ) p(s_{t-1}|Y_{1:t-1}, θ) ds_{t-1}
    2. Measurement equation: p(y_t|Y_{1:t-1}, θ) = ∫ p(y_t|s_t, Y_{1:t-1}, θ) p(s_t|Y_{1:t-1}, θ) ds_t
  • Updating with Bayes theorem. Once y_t becomes available:

    p(s_t|Y_{1:t}, θ) = p(s_t|y_t, Y_{1:t-1}, θ) = p(y_t|s_t, Y_{1:t-1}, θ) p(s_t|Y_{1:t-1}, θ) / p(y_t|Y_{1:t-1}, θ)
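The forecasting and updating steps can be sketched for a model whose state takes only two values, so the integrals reduce to sums. The transition matrix and densities below are hypothetical illustrative numbers, not taken from any DSGE model.

```python
import numpy as np

# One step of the filtering recursion when s_t takes two values, so the
# integrals become sums. All numbers are hypothetical illustrations.
P_trans = np.array([[0.9, 0.1],      # row i: p(s_t = j | s_{t-1} = i)
                    [0.2, 0.8]])
p_prev = np.array([0.5, 0.5])        # p(s_{t-1} | Y_{1:t-1})
p_y_given_s = np.array([0.3, 0.9])   # p(y_t | s_t) at the observed y_t

# Forecasting the state: sum out s_{t-1}
p_forecast = P_trans.T @ p_prev
# Forecasting the observation: p(y_t | Y_{1:t-1})
p_y = p_y_given_s @ p_forecast
# Updating with Bayes theorem: p(s_t | Y_{1:t})
p_update = p_y_given_s * p_forecast / p_y
```

Both `p_forecast` and `p_update` remain proper probability distributions, which mirrors the role of the normalizing constant p(y_t|Y_{1:t-1}, θ) in the Bayes update.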
Bootstrap Particle Filter
1. Initialization. Draw the initial particles from the distribution s_0^j ∼iid p(s_0) and set W_0^j = 1, j = 1, ..., M.
2. Recursion. For t = 1, ..., T:
   1. Forecasting s_t. Propagate the period t-1 particles {s_{t-1}^j, W_{t-1}^j} by iterating the state-transition equation forward:

      s̃_t^j = Φ(s_{t-1}^j, ε_t^j; θ),  ε_t^j ∼ F_ε(·; θ).  (1)

   An approximation of E[h(s_t)|Y_{1:t-1}, θ] is given by

      ĥ_{t,M} = (1/M) ∑_{j=1}^{M} h(s̃_t^j) W_{t-1}^j.  (2)
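The forecasting step can be sketched with a hypothetical scalar AR(1) transition standing in for Φ; the coefficient and the functional h below are illustrative choices, not part of the algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 1000
phi = 0.9                              # hypothetical AR(1) stand-in for Phi
s_prev = rng.normal(size=M)            # particles {s_{t-1}^j}
W_prev = np.ones(M)                    # equal weights {W_{t-1}^j}

eps = rng.normal(size=M)               # eps_t^j ~ F_eps(.; theta)
s_tilde = phi * s_prev + eps           # forward iteration, eq. (1)

h = lambda s: s                        # example functional h: the mean
h_hat = np.mean(h(s_tilde) * W_prev)   # eq. (2)
```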
Bootstrap Particle Filter
1. Initialization.
2. Recursion. For t = 1, ..., T:
   1. Forecasting s_t.
   2. Forecasting y_t. Define the incremental weights

      w̃_t^j = p(y_t|s̃_t^j, θ).  (3)

   The predictive density p(y_t|Y_{1:t-1}, θ) can be approximated by

      p̂(y_t|Y_{1:t-1}, θ) = (1/M) ∑_{j=1}^{M} w̃_t^j W_{t-1}^j.  (4)

   If the measurement errors are N(0, Σ_u), then the incremental weights take the form

      w̃_t^j = (2π)^{-n/2} |Σ_u|^{-1/2} exp{ -(1/2) (y_t - Ψ(s̃_t^j, t; θ))' Σ_u^{-1} (y_t - Ψ(s̃_t^j, t; θ)) },  (5)

   where n here denotes the dimension of y_t.
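A minimal sketch of the Gaussian incremental weight (5); the measurement function `Psi` and all inputs below are hypothetical placeholders.

```python
import numpy as np

# Incremental weight (5) under N(0, Sigma_u) measurement error.
def incremental_weight(y, s_tilde, Sigma_u, Psi):
    nu = y - Psi(s_tilde)                        # measurement-error residual
    n = y.shape[0]
    quad = nu @ np.linalg.solve(Sigma_u, nu)     # nu' Sigma_u^{-1} nu
    return ((2 * np.pi) ** (-n / 2)
            * np.linalg.det(Sigma_u) ** (-0.5)
            * np.exp(-0.5 * quad))

Psi = lambda s: s[:1]                            # toy measurement function
w = incremental_weight(np.array([0.0]), np.array([0.0]), np.eye(1), Psi)
```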
Bootstrap Particle Filter
1. Initialization.
2. Recursion. For t = 1, ..., T:
   1. Forecasting s_t.
   2. Forecasting y_t. Define the incremental weights

      w̃_t^j = p(y_t|s̃_t^j, θ).  (6)

   3. Updating. Define the normalized weights

      W̃_t^j = w̃_t^j W_{t-1}^j / ( (1/M) ∑_{j=1}^{M} w̃_t^j W_{t-1}^j ).  (7)

   An approximation of E[h(s_t)|Y_{1:t}, θ] is given by

      h̃_{t,M} = (1/M) ∑_{j=1}^{M} h(s̃_t^j) W̃_t^j.  (8)
Bootstrap Particle Filter
1. Initialization.
2. Recursion. For t = 1, ..., T:
   1. Forecasting s_t.
   2. Forecasting y_t.
   3. Updating.
   4. Selection (Optional). Resample the particles via multinomial resampling. Let {s_t^j}_{j=1}^{M} denote M iid draws from a multinomial distribution characterized by support points and weights {s̃_t^j, W̃_t^j}, and set W_t^j = 1 for j = 1, ..., M. An approximation of E[h(s_t)|Y_{1:t}, θ] is given by

      h̄_{t,M} = (1/M) ∑_{j=1}^{M} h(s_t^j) W_t^j.  (9)
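A minimal end-to-end sketch of steps 1-4, assuming a hypothetical scalar linear Gaussian state-space model as a stand-in (the model and all numbers are illustrative, not from the slides):

```python
import numpy as np

# Bootstrap PF for a hypothetical scalar model
#   s_t = phi s_{t-1} + eps_t,  y_t = s_t + u_t,
# with forecasting, weighting, updating, and multinomial resampling.
def bootstrap_pf(y, M=2000, phi=0.9, sig_e=1.0, sig_u=1.0, seed=1):
    rng = np.random.default_rng(seed)
    s = rng.normal(0.0, sig_e / np.sqrt(1 - phi**2), size=M)  # draws from p(s_0)
    W = np.ones(M)
    loglik = 0.0
    for yt in y:
        s = phi * s + sig_e * rng.normal(size=M)              # forecasting s_t
        w = np.exp(-0.5 * ((yt - s) / sig_u) ** 2) / np.sqrt(2 * np.pi * sig_u**2)
        loglik += np.log(np.mean(w * W))                      # likelihood increment
        W_til = w * W / np.mean(w * W)                        # normalized weights (7)
        idx = rng.choice(M, size=M, p=W_til / M)              # multinomial resampling
        s, W = s[idx], np.ones(M)                             # selection: reset weights
    return loglik

ll = bootstrap_pf(np.array([0.5, -0.3, 0.8]))
```

Because resampling resets all weights to one, the algorithm restarts each period from an equally weighted swarm drawn from the current posterior approximation.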
Likelihood Approximation
• The approximation of the log likelihood function is given by
ln p̂(Y_{1:T}|θ) = ∑_{t=1}^{T} ln( (1/M) ∑_{j=1}^{M} w̃_t^j W_{t-1}^j ).  (10)
• One can show that the approximation of the likelihood function is unbiased.
• By Jensen's inequality, this implies that the approximation of the log likelihood function is biased downward.
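The downward bias is just Jensen's inequality, E[ln p̂] ≤ ln E[p̂]. A quick numerical check with lognormal draws standing in for unbiased likelihood estimates (purely illustrative):

```python
import numpy as np

# If p_hat is an unbiased, noisy estimate of the likelihood, ln p_hat is
# biased downward: E[ln p_hat] <= ln E[p_hat] by Jensen's inequality.
rng = np.random.default_rng(2)
p_hat = np.exp(rng.normal(size=100_000))  # lognormal "likelihood estimates"
mean_of_log = np.log(p_hat).mean()        # approximates E[ln p_hat] = 0
log_of_mean = np.log(p_hat.mean())        # approximates ln E[p_hat] = 0.5
gap = log_of_mean - mean_of_log           # positive: the downward bias
```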
The Role of Measurement Errors
• Measurement errors may not be intrinsic to the DSGE model.
• The bootstrap filter needs a non-degenerate density p(y_t|s_t, θ) for the incremental weights to be well defined.
• Decreasing the measurement error variance Σ_u, holding everything else fixed, increases the variance of the particle weights and reduces the accuracy of the Monte Carlo approximations.
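A one-step illustration of this effect, with a hypothetical swarm of forecast particles and a scalar Gaussian measurement density:

```python
import numpy as np

# Shrinking the measurement-error std dev makes p(y_t | s_t) spiky in s_t
# and the normalized particle weights far more uneven.
rng = np.random.default_rng(3)
s_forecast = rng.normal(size=5000)       # particles from p(s_t | Y_{1:t-1})
y_obs = 0.0                              # hypothetical observation

def weight_variance(sig_u):
    w = np.exp(-0.5 * ((y_obs - s_forecast) / sig_u) ** 2)
    return (w / w.mean()).var()          # variance of normalized weights

var_small_sigma = weight_variance(0.1)
var_large_sigma = weight_variance(1.0)
```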
Generic Particle Filter
1. Initialization. Same as BS PF.
2. Recursion. For t = 1, ..., T:
   1. Forecasting s_t. Draw s̃_t^j from the density g_t(s̃_t|s_{t-1}^j, θ) and define

      ω_t^j = p(s̃_t^j|s_{t-1}^j, θ) / g_t(s̃_t^j|s_{t-1}^j, θ).  (11)

   An approximation of E[h(s_t)|Y_{1:t-1}, θ] is given by

      ĥ_{t,M} = (1/M) ∑_{j=1}^{M} h(s̃_t^j) ω_t^j W_{t-1}^j.  (12)

   2. Forecasting y_t. Define the incremental weights

      w̃_t^j = p(y_t|s̃_t^j, θ) ω_t^j.  (13)
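A sketch of the importance-weight correction in (11) and (13), assuming a scalar Gaussian transition density and an over-dispersed Gaussian proposal (all numbers hypothetical):

```python
import numpy as np

def npdf(x, mu, sig):                    # scalar normal density
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))

rng = np.random.default_rng(4)
M = 1000
s_prev = rng.normal(size=M)
mu = 0.9 * s_prev                        # transition p(.|s_{t-1}) = N(mu, 1)
sig_g = 1.5                              # over-dispersed proposal g = N(mu, 1.5^2)
s_tilde = mu + sig_g * rng.normal(size=M)

omega = npdf(s_tilde, mu, 1.0) / npdf(s_tilde, mu, sig_g)   # eq. (11)
w_tilde = npdf(0.0, s_tilde, 1.0) * omega                    # eq. (13) with y_t = 0
```

By construction E[ω_t^j] = 1 under the proposal, so the ω's only reweight the draws, they do not change the target.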
   The predictive density p(y_t|Y_{1:t-1}, θ) can be approximated by

      p̂(y_t|Y_{1:t-1}, θ) = (1/M) ∑_{j=1}^{M} w̃_t^j W_{t-1}^j.  (14)

   3. Updating. Same as BS PF.
   4. Selection. Same as BS PF.
3. Likelihood Approximation. Same as BS PF.

Asymptotics
• The convergence results can be established recursively, starting from the assumption

  h̄_{t-1,M} →a.s. E[h(s_{t-1})|Y_{1:t-1}],
  √M ( h̄_{t-1,M} - E[h(s_{t-1})|Y_{1:t-1}] ) ⇒ N(0, Ω_{t-1}(h)).

• Forward iteration: draw s̃_t^j from g_t(s_t|s_{t-1}^j) = p(s_t|s_{t-1}^j).
• Decompose

  ĥ_{t,M} - E[h(s_t)|Y_{1:t-1}]  (15)
  = (1/M) ∑_{j=1}^{M} ( h(s̃_t^j) - E_{p(·|s_{t-1}^j)}[h] ) W_{t-1}^j
    + (1/M) ∑_{j=1}^{M} E_{p(·|s_{t-1}^j)}[h] W_{t-1}^j - E[h(s_t)|Y_{1:t-1}]
  = I + II.

• Both I and II converge to zero (and potentially satisfy a CLT).
Asymptotics
• The updating step approximates

  E[h(s_t)|Y_{1:t}] = ∫ h(s_t) p(y_t|s_t) p(s_t|Y_{1:t-1}) ds_t / ∫ p(y_t|s_t) p(s_t|Y_{1:t-1}) ds_t
  ≈ ( (1/M) ∑_{j=1}^{M} h(s̃_t^j) w̃_t^j W_{t-1}^j ) / ( (1/M) ∑_{j=1}^{M} w̃_t^j W_{t-1}^j ).  (16)

• Define the normalized incremental weights as

  v_t(s_t) = p(y_t|s_t) / ∫ p(y_t|s_t) p(s_t|Y_{1:t-1}) ds_t.  (17)

• Under suitable regularity conditions, the Monte Carlo approximation satisfies a CLT of the form

  √M ( h̃_{t,M} - E[h(s_t)|Y_{1:t}] ) ⇒ N(0, Ω̃_t(h)),  Ω̃_t(h) = Ω̂_t( v_t(s_t)( h(s_t) - E[h(s_t)|Y_{1:t}] ) ).  (18)

• The distribution of the particle weights matters for accuracy! ⇒ Resampling!
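A standard diagnostic for "the distribution of particle weights matters" is the effective sample size, ESS = M / ( (1/M) ∑_j (W^j)^2 ) for weights normalized to average one. The ESS is not defined on the slides above; it is a common addition that motivates when to resample.

```python
import numpy as np

# Effective sample size: M when weights are equal, 1 when a single
# particle carries all the mass (weights normalized to mean one).
def ess(W):
    return len(W) / np.mean(W ** 2)

W_even = np.ones(100)
W_degenerate = np.zeros(100)
W_degenerate[0] = 100.0                  # all mass on one particle

ess_even = ess(W_even)                   # = 100
ess_degenerate = ess(W_degenerate)       # = 1
```

A common rule of thumb is to trigger the selection step only when the ESS falls below some fraction of M, e.g. M/2.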
Adapting the Generic PF
• Conditionally-optimal importance distribution:
j j gt (˜st s ) = p(˜st yt , s ). | t−1 | t−1 j This is the posterior of st given st−1. Typically infeasible, but a good benchmark.
• Approximately conditionally-optimal distributions: from a linearized version of the DSGE model or from approximate nonlinear filters.
• Conditionally-linear models: do Kalman filter updating on a subvector of st . Example:
y_t = Ψ_0(m_t) + Ψ_1(m_t)t + Ψ_2(m_t)s_t + u_t,  u_t ∼ N(0, Σ_u),
s_t = Φ_0(m_t) + Φ_1(m_t)s_{t-1} + Φ_ε(m_t)ε_t,  ε_t ∼ N(0, Σ_ε),

where m_t follows a discrete Markov-switching process.
More on Conditionally-Linear Models
• State-space representation is linear conditional on mt .
• Write

  p(m_t, s_t|Y_{1:t}) = p(m_t|Y_{1:t}) p(s_t|m_t, Y_{1:t}),  (19)

  where

  s_t|(m_t, Y_{1:t}) ∼ N( s̄_{t|t}(m_t), P_{t|t}(m_t) ).  (20)

• The vector of means s̄_{t|t}(m_t) and the covariance matrix P_{t|t}(m_t) are sufficient statistics for the conditional distribution of s_t.
• Approximate (m_t, s_t)|Y_{1:t} by the swarm {m_t^j, s̄_{t|t}^j, P_{t|t}^j, W_t^j}_{j=1}^{M}.
• The swarm of particles approximates

  ∫ h(m_t, s_t) p(m_t, s_t|Y_{1:t}) d(m_t, s_t)  (21)
  = ∫ [ ∫ h(m_t, s_t) p(s_t|m_t, Y_{1:t}) ds_t ] p(m_t|Y_{1:t}) dm_t
  ≈ (1/M) ∑_{j=1}^{M} [ ∫ h(m_t^j, s_t) p_N( s_t|s̄_{t|t}^j, P_{t|t}^j ) ds_t ] W_t^j.
More on Conditionally-Linear Models
• We used Rao-Blackwellization to reduce variance:

  V[h(s_t, m_t)] = E[ V[h(s_t, m_t)|m_t] ] + V[ E[h(s_t, m_t)|m_t] ] ≥ V[ E[h(s_t, m_t)|m_t] ].

• To forecast the states in period t, generate m̃_t^j from g_t(m̃_t|m_{t-1}^j) and define

  ω_t^j = p(m̃_t^j|m_{t-1}^j) / g_t(m̃_t^j|m_{t-1}^j).  (22)

• The Kalman filter forecasting step can be used to compute

  s̃_{t|t-1}^j = Φ_0(m̃_t^j) + Φ_1(m̃_t^j) s̄_{t-1|t-1}^j,
  P_{t|t-1}^j = Φ_1(m̃_t^j) P_{t-1|t-1}^j Φ_1(m̃_t^j)' + Φ_ε(m̃_t^j) Σ_ε(m̃_t^j) Φ_ε(m̃_t^j)',
  ỹ_{t|t-1}^j = Ψ_0(m̃_t^j) + Ψ_1(m̃_t^j)t + Ψ_2(m̃_t^j) s̃_{t|t-1}^j,  (23)
  F_{t|t-1}^j = Ψ_2(m̃_t^j) P_{t|t-1}^j Ψ_2(m̃_t^j)' + Σ_u.
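The Kalman forecasting step (23) for a single particle in a scalar conditionally linear model; the regime-dependent coefficients below are hypothetical placeholders, not from any calibrated model.

```python
# Scalar Kalman forecasting step (23); regime-dependent coefficients are
# hypothetical, indexed by m_t in {0, 1}.
Phi0 = {0: 0.0, 1: 0.5}                  # intercepts by regime
Phi1 = {0: 0.9, 1: 0.5}                  # AR coefficients by regime
Sig_eps = 1.0                            # Phi_eps Sigma_eps Phi_eps'
Psi0, Psi2, Sig_u = 0.0, 1.0, 0.25       # no deterministic trend in this toy

def kalman_forecast(m, s_bar_prev, P_prev):
    s_pred = Phi0[m] + Phi1[m] * s_bar_prev
    P_pred = Phi1[m] * P_prev * Phi1[m] + Sig_eps
    y_pred = Psi0 + Psi2 * s_pred
    F_pred = Psi2 * P_pred * Psi2 + Sig_u
    return s_pred, P_pred, y_pred, F_pred

s_pred, P_pred, y_pred, F_pred = kalman_forecast(0, 1.0, 0.5)
```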
More on Conditionally-Linear Models
• Then,

  ∫ h(m_t, s_t) p(m_t, s_t|Y_{1:t-1}) d(m_t, s_t)  (24)
  = ∫ [ ∫ h(m_t, s_t) p(s_t|m_t, Y_{1:t-1}) ds_t ] p(m_t|Y_{1:t-1}) dm_t
  ≈ (1/M) ∑_{j=1}^{M} [ ∫ h(m̃_t^j, s_t) p_N( s_t|s̃_{t|t-1}^j, P_{t|t-1}^j ) ds_t ] ω_t^j W_{t-1}^j.

• The likelihood approximation is based on the incremental weights

  w̃_t^j = p_N( y_t|ỹ_{t|t-1}^j, F_{t|t-1}^j ) ω_t^j.  (25)

• Conditional on m̃_t^j we can use the Kalman filter once more to update the information about s_t in view of the current observation y_t:

  s̃_{t|t}^j = s̃_{t|t-1}^j + P_{t|t-1}^j Ψ_2(m̃_t^j)' (F_{t|t-1}^j)^{-1} (y_t - ỹ_{t|t-1}^j),
  P̃_{t|t}^j = P_{t|t-1}^j - P_{t|t-1}^j Ψ_2(m̃_t^j)' (F_{t|t-1}^j)^{-1} Ψ_2(m̃_t^j) P_{t|t-1}^j.  (26)
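The updating step (26) for one particle in a scalar model; all inputs are hypothetical numbers chosen only to make the arithmetic concrete.

```python
# Scalar Kalman updating step (26): condition the state on y_t.
def kalman_update(s_pred, P_pred, y, y_pred, F_pred, Psi2=1.0):
    gain = P_pred * Psi2 / F_pred            # P Psi2' F^{-1}
    s_upd = s_pred + gain * (y - y_pred)
    P_upd = P_pred - gain * Psi2 * P_pred
    return s_upd, P_upd

s_upd, P_upd = kalman_update(s_pred=0.9, P_pred=1.405, y=1.5,
                             y_pred=0.9, F_pred=1.655)
```

Conditioning on y_t always shrinks the state uncertainty, so P_upd < P_pred, and the mean moves toward the observation.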
Particle Filter For Conditionally Linear Models
1. Initialization.
2. Recursion. For t = 1, ..., T:
   1. Forecasting s_t. Draw m̃_t^j from the density g_t(m̃_t|m_{t-1}^j, θ), calculate the importance weights ω_t^j in (22), and compute s̃_{t|t-1}^j and P_{t|t-1}^j according to (23). An approximation of E[h(s_t, m_t)|Y_{1:t-1}, θ] is given by (24).
   2. Forecasting y_t. Compute the incremental weights w̃_t^j according to (25). Approximate the predictive density p(y_t|Y_{1:t-1}, θ) by

      p̂(y_t|Y_{1:t-1}, θ) = (1/M) ∑_{j=1}^{M} w̃_t^j W_{t-1}^j.  (27)

   3. Updating. Define the normalized weights

      W̃_t^j = w̃_t^j W_{t-1}^j / ( (1/M) ∑_{j=1}^{M} w̃_t^j W_{t-1}^j )  (28)

   and compute s̃_{t|t}^j and P̃_{t|t}^j according to (26). An approximation of E[h(m_t, s_t)|Y_{1:t}, θ] can be obtained from {m̃_t^j, s̃_{t|t}^j, P̃_{t|t}^j, W̃_t^j}.
   4. Selection.
Nonlinear and Partially Deterministic State Transitions
• Example:
s_{1,t} = Φ_1(s_{t-1}, ε_t),  s_{2,t} = Φ_2(s_{t-1}),  ε_t ∼ N(0, 1).

• The generic filter requires an evaluation of p(s_t|s_{t-1}).
• Define ς_t = [s_t', ε_t']' and add the identity ε_t = ε_t to the state transition.
• Factorize the density p(ς_t|ς_{t-1}) as

  p(ς_t|ς_{t-1}) = p_ε(ε_t) p(s_{1,t}|s_{t-1}, ε_t) p(s_{2,t}|s_{t-1}),

  where p(s_{1,t}|s_{t-1}, ε_t) and p(s_{2,t}|s_{t-1}) are point masses.
• Sample the innovation ε_t from g_t^ε(ε_t|s_{t-1}).
• Then

  ω_t^j = p(ς̃_t^j|ς_{t-1}^j) / g_t(ς̃_t^j|ς_{t-1}^j)
        = [ p_ε(ε̃_t^j) p(s̃_{1,t}^j|s_{t-1}^j, ε̃_t^j) p(s̃_{2,t}^j|s_{t-1}^j) ] / [ g_t^ε(ε̃_t^j|s_{t-1}^j) p(s̃_{1,t}^j|s_{t-1}^j, ε̃_t^j) p(s̃_{2,t}^j|s_{t-1}^j) ]
        = p_ε(ε̃_t^j) / g_t^ε(ε̃_t^j|s_{t-1}^j).
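Because the point masses cancel from the ratio, the weight reduces to a ratio of innovation densities. A Monte Carlo sketch with a hypothetical over-dispersed Gaussian proposal for ε_t:

```python
import numpy as np

def npdf(x, mu, sig):                    # scalar normal density
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))

rng = np.random.default_rng(5)
M = 10_000
sig_g = 1.2                              # proposal g_eps = N(0, 1.2^2)
eps_tilde = sig_g * rng.normal(size=M)   # draws from g_eps

# omega_t^j = p_eps(eps) / g_eps(eps): the point masses have cancelled
omega = npdf(eps_tilde, 0.0, 1.0) / npdf(eps_tilde, 0.0, sig_g)
```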
Degenerate Measurement Error Distributions
• Our discussion of the conditionally-optimal importance distribution suggests that in the absence of measurement errors, one has to solve the system of equations