
Lecture 19


Introduction

• We observe (measure) economic data, {zt}, over time, but these measurements are noisy. There is an unobservable variable, yt, that drives the observations. We call yt the state.

• The Kalman filter (KF) uses the observed data to learn about the unobservable state variables, which describe the state of the model.

• KF models dynamically what we measure, zt, and the state, yt.

yt = g(yt-1, ut, wt)   (state or transition equation)

zt = f(yt, xt, vt)   (measurement equation)

ut, xt: exogenous variables. wt, vt: error terms.

Rudolf (Rudi) Emil Kalman (1930 – 2016, Hungary/USA)

Intuition: GDP and the State of the Economy

[Block diagram: a black-box system, driven by external controls {ut} (monetary policy) and system error sources, produces the unobserved system state {yt} (state of the economy); measuring devices (the Fed), subject to measurement error sources, yield observed measurements {zt = GDPt}; the filter produces an optimal estimate ŷt of the (desired, but unknown) system state.]

• The business cycle (the system state), yt, cannot be measured directly.
• We measure the system at time t: GDP (zt).
• Need to estimate "optimally" the business cycle from GDP.

How does the KF work?

• It is a recursive data processing algorithm. As new information arrives, it updates its estimates (it is a Bayesian algorithm).

• It generates optimal estimates of yt, given measurements {zt}.

• Optimal?
– For a linear system with white Gaussian errors, the KF gives the "best" estimate based on all previous measurements (in the MSE sense).
– For a non-linear system, optimality is "qualified" (say, "best linear").

• Recursive?
– No need to store all previous measurements and re-estimate the system as new information arrives.
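The recursive idea can be illustrated outside the KF: a running mean can fold in each new observation without storing the history, which is the same property the Kalman filter exploits (a minimal sketch; the measurements are made up, and this is not the KF itself):

```python
# Recursive estimation sketch: the sample mean can be updated one
# observation at a time, without storing past data.

def update_mean(prev_mean, n, z):
    """Fold a new measurement z into the mean of the first n observations."""
    return prev_mean + (z - prev_mean) / (n + 1)

measurements = [2.0, 4.0, 6.0, 8.0]
est, n = 0.0, 0
for z in measurements:
    est = update_mean(est, n, z)
    n += 1

print(est)  # same as sum(measurements)/len(measurements) = 5.0
```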

Notation

• For any vector st, we define the prediction of st at time t as: st|t-1 = E(st|It-1)

That is, it is the best guess of st based on all the information available at time t-1, which we denote by It-1= {zt-1, ..., z1; ut-1, ..., u1; xt-1, ..., x1; ...}.

• As new information is released, we update our prediction:

st|t = E(st|It)

• The Kalman filter predicts zt|t-1 , yt|t-1 , and updates yt|t.

At time t, we define a prediction error: et|t-1 = zt -zt|t-1

The conditional variance of et|t-1: Ft|t-1 = E[et|t-1 et|t-1′]

Intuitive Example: Prediction and Updating

yt (true value of a company)

• Distribution of the true value, yt, is unobservable.
• Assume Gaussian distributed measurements.


- Observed measurement at t1: Mean = z1 and Variance = σ²z1
- Optimal estimate of true value: ŷ(t1) = z1
- Variance of error in estimate: σ²y(t1) = σ²z1
- Predicted value at time t2, using t1 info: ŷt2|t1 = z1

Intuitive Example: Prediction and Updating

• We have the prediction ŷt2|t1. At t2, we have a new measurement, z(t2).


• New t2 measurement:

- Measurement at t2: Mean = z2 and Variance = σ²z2
- Update the prediction due to the new measurement: ŷt2|t2

• Closer to the more trusted measurement – linear interpolation?

Intuitive Example: Prediction and Updating

[Figure: the corrected optimal estimate ŷt2|t2 lies between the prediction ŷt2|t1 and the measurement z(t2), with a tighter density than either.]

• Corrected mean is the new optimal estimate of the true value

or Optimal (updated) estimate: ŷt|t = ŷt|t-1 + (Kalman gain) * (zt – zt|t-1)

• New variance is smaller than either of the previous two

or Variance of estimate = Variance of prediction * (1 – Kalman gain)
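This blending of a Gaussian prediction with a Gaussian measurement can be sketched numerically (a scalar illustration with H = 1; the means and variances below are made up, not from the lecture):

```python
# Scalar sketch of the update step: blend a Gaussian prediction with a
# Gaussian measurement via the Kalman gain.

def kalman_update(y_pred, p_pred, z, r):
    """Blend prediction N(y_pred, p_pred) with measurement N(z, r)."""
    k = p_pred / (p_pred + r)          # Kalman gain (scalar, H = 1)
    y_upd = y_pred + k * (z - y_pred)  # corrected mean
    p_upd = (1 - k) * p_pred           # corrected variance
    return y_upd, p_upd

y, p = kalman_update(y_pred=10.0, p_pred=4.0, z=12.0, r=1.0)
print(y, p)  # mean pulled toward the more precise measurement; p < min(4, 1)
```

Note that the updated variance is smaller than both input variances, exactly as the slide states.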

Intuitive Example: Prediction and Updating

[Figure: the t2 estimate ŷt2|t2 shifted to the right to form the naïve prediction ŷt3|t2.]

• At time t3, the true value changes at the rate dy/dt = u.
• Naïve approach: Shift the probability to the right to predict.
• This would work if we knew the rate of change (perfect model). But this is unrealistic.

Intuitive Example: Prediction and Updating

[Figure: relative to the naïve prediction, the prediction ŷt3|t2 both shifts and spreads out.]

• Then, we assume an imperfect model by adding Gaussian noise: dy/dt = u + w.
• The distribution for the prediction moves and spreads out.

Intuitive Example: Prediction and Updating

[Figure: the measurement z(t3) and the prediction ŷt3|t2 combine into the updated optimal estimate ŷt3|t3.]

• Now we take a measurement at t3.
• Need to once again correct the prediction.
• Same as before.

Intuition: Prediction and Updating

• From the intuitive example, we see a simple process:

– Initial conditions (ŷt-1 and σt-1)

– Prediction (ŷt|t-1, σt|t-1)
  • Use initial conditions and model (say, constant rate of change) to make a prediction.

– Measurement (zt)
  • Take measurement zt and learn the forecast error, et|t-1 = zt - zt|t-1.

– Updating (ŷt|t, σt|t)
  • Use the measurement to update the prediction by 'blending' prediction and residual – always a case of merging only two Gaussians.
  • Optimal estimate with smaller variance.
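The full cycle above can be sketched for a scalar model with a constant rate of change, y_t = y_{t-1} + u + w_t and z_t = y_t + v_t (all parameter values and measurements below are made up for illustration):

```python
# One prediction + update cycle of a scalar Kalman filter, repeated over a
# short made-up sample. With q = 0.5 and r = 1, the filtered variance
# converges to its steady-state value of 0.5.

def kalman_step(y, p, z, u=1.0, q=0.5, r=1.0):
    """One prediction + update cycle of a scalar Kalman filter."""
    # Prediction: push the estimate through the model; variance grows by q.
    y_pred = y + u
    p_pred = p + q
    # Update: blend prediction and measurement via the Kalman gain.
    k = p_pred / (p_pred + r)
    y_new = y_pred + k * (z - y_pred)
    p_new = (1 - k) * p_pred
    return y_new, p_new

y, p = 0.0, 1.0                      # initial conditions (ŷ0, σ0²)
for z in [1.2, 1.9, 3.1, 4.0]:       # made-up measurements
    y, p = kalman_step(y, p, z)
print(round(y, 3), round(p, 3))
```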

Terminology: Prediction, Filtering and Smoothing

• Prediction is an a priori form of estimation. It attempts to provide information about what the quantity of interest will be at some time t+τ in the future by using data measured up to and including time t-1 (usually, KF refers to one-step ahead prediction).

• Filtering is an operation that involves the extraction of information about a quantity of interest at time t, by using data measured up to and including t.

• Smoothing is an a posteriori form of estimation. Data measured after the time of interest are used for the estimation. Specifically, the smoothed estimate at time t is obtained by using data measured over the interval [0, T], where t < T.

Bayesian Optimal Filter

• A Bayesian optimal filter computes the distribution

P(yt|zt, zt-1, ..., z1) = P(yt|{zt})

• Given the following:

1. Prior distribution: P(y0)
2. State space model:

yt ~ P(yt|yt-1)

zt|t ~P(zt |yt)

3. Measurement sequence: {zt}= {z1, z2, ..., zt}

• Computation is based on a recursion rule for incorporating the new measurement zt into the posterior:

P(yt-1|zt-1, ..., z1) → P(yt|zt, ..., z1)

Bayesian Optimal Filter: Prediction Step

• Assume we know the posterior distribution of the previous time step, t-1:

P(yt-1|zt-1, ..., z1)

• The joint pdf P(yt, yt-1|{zt-1}) can be computed as (using the Markov property):

P(yt, yt-1|{zt-1}) = P(yt|yt-1,{zt-1}) P(yt-1|{zt-1})

= P(yt|yt-1) P(yt-1|{zt-1})

• Integrating over yt-1 gives the Chapman-Kolmogorov equation: P(yt|{zt-1}) = ∫ P(yt|yt-1) P(yt-1|{zt-1}) dyt-1

• This is the prediction step of the optimal filter.


Bayesian Optimal Filter: Update Step

• Now we have:
1. Prior distribution from the Chapman-Kolmogorov equation:
P(yt|{zt-1})
2. Measurement likelihood:
P(zt|yt)

• The posterior distribution (= Likelihood x Prior):

P(yt|{zt}) ∝ P(zt|yt) x P(yt|{zt-1})

(ignoring the normalizing constant, P(zt|{zt-1}) = ∫ P(zt|yt) P(yt|{zt-1}) dyt).

• This is the update step of the optimal filter.
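The prediction (Chapman-Kolmogorov) and update (Bayes) steps can be sketched by brute force on a discretized state grid. Below is an illustrative Gaussian random-walk model with made-up noise variances and measurements; the grid integral replaces the ∫ in both steps:

```python
# Brute-force Bayesian filter on a state grid (illustrative sketch).
import math

def gauss(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

grid = [i * 0.1 for i in range(-100, 101)]   # state grid on [-10, 10]
dy = 0.1
post = [gauss(y, 0.0, 1.0) for y in grid]    # prior P(y0) = N(0, 1)

def predict(post, q=0.5):
    """Chapman-Kolmogorov: P(yt|{zt-1}) = ∫ P(yt|yt-1) P(yt-1|{zt-1}) dyt-1."""
    return [sum(gauss(y, yk, q) * pk for yk, pk in zip(grid, post)) * dy
            for y in grid]

def update(prior, z, r=1.0):
    """Bayes: posterior ∝ likelihood x prior, then normalize on the grid."""
    unnorm = [gauss(z, y, r) * p for y, p in zip(grid, prior)]
    c = sum(unnorm) * dy                     # normalizing constant P(zt|{zt-1})
    return [u / c for u in unnorm]

for z in [1.0, 1.5]:                         # made-up measurements
    post = update(predict(post), z)

mean = sum(y * p for y, p in zip(grid, post)) * dy
print(round(mean, 3))
```

Because this model is linear-Gaussian, the grid filter's posterior mean matches the closed-form Kalman recursion; the grid version, however, works for any transition and likelihood densities.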


State Space Form

• Model to be estimated:

yt = A yt-1 + B ut + wt,   wt: state noise ~ WN(0,Q)   (transition equation)

zt = H yt + vt,   vt: measurement noise ~ WN(0,R)   (measurement equation)

ut: exogenous variable. A: state transition matrix. B: coefficient matrix for ut. H: measurement matrix.

Initial conditions: y0, usually a RV.

We call both equations state space form. Many economic models can be written in this form.

Note: The model is linear, with constant coefficient matrices, A, B, and H. It can be generalized – see Harvey (1989).

State Space Form - Examples

• Example: VAR(1)

Yt = A1 Yt-1 + εt

Transition equation:
[Y1,t]   [a11  a12] [Y1,t-1]   [1  0] [w1,t]
[Y2,t] = [a21  a22] [Y2,t-1] + [0  1] [w2,t]

Measurement equation:
zt = H [Y1,t, Y2,t]' + vt


State Space Form - Examples

• Example: VAR(p) written in state-space form.

Yt = Φ1 Yt-1 + Φ2 Yt-2 + ... + Φp Yt-p + at

Stack the state as ξt = (Yt, Yt-1, ..., Yt-p+1)' and define the companion matrix

    [Φ1  Φ2  ...  Φp-1  Φp]
    [In  0   ...  0     0 ]
F = [0   In  ...  0     0 ]
    [............        ]
    [0   0   ...  In    0 ]

with wt = (at, 0, ..., 0)'.

Then, we write the transition equation as: ξt = F ξt-1 + wt

and the measurement equation as: zt = ξt

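The companion-form trick can be checked numerically: for a scalar AR(2) with made-up coefficients (and no noise, to keep the check exact), iterating the first-order system ξt = F ξt-1 reproduces the direct AR(2) recursion:

```python
# Companion-form sketch: write y_t = phi1 y_{t-1} + phi2 y_{t-2} as a
# first-order system xi_t = F xi_{t-1}, with xi_t = (y_t, y_{t-1})'.

phi1, phi2 = 0.5, 0.3
F = [[phi1, phi2],
     [1.0,  0.0]]   # companion matrix: second row just shifts y_{t-1} down

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

# Direct AR(2) recursion
y = [1.0, 2.0]                       # y_0, y_1 (made-up starting values)
for t in range(2, 6):
    y.append(phi1 * y[t - 1] + phi2 * y[t - 2])

# Companion-form recursion from the same starting point
xi = [y[1], y[0]]                    # state at t = 1
for t in range(2, 6):
    xi = matvec(F, xi)

print(xi[0], y[5])  # the two recursions agree
```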

State Space Form - Examples

• Example: Time-varying coefficients.

zt = αt + βt ut + vt,   vt ~ N(0, Vt)    (Measurement equation)

αt = αt-1 + wα,t,   wα,t ~ N(0, Wα,t)    (Transition equation)
βt = βt-1 + wβ,t,   wβ,t ~ N(0, Wβ,t)

The state of the system is yt = (αt, βt)'. Ht = [1, ut].

• Example: Stochastic volatility.

zt = H ht + C xt + vt vt~WN(0, R) (Measurement equation) ht = A ht-1 + B ut + wt wt~WN(0, Q) (Transition equation)


State Space Form - Variations

• When {wt} is WN and independent of y0, {yt} is Markov. - Linearity is not essential for this property.

-{zt} is usually not Markov. zt is conditionally independent given yt: P(zt|zt-1, ..., z1, yt, ..., y1 ) = P(zt|yt)

• If, in addition, {wt, vt, y0} are jointly normal, the model is called Gauss-Markov Model.

• Usually, we assume that only the 1st and 2nd order moments of {wt}, {vt} are known. Under the following assumptions, we have the standard second-order model:

- {wt} moments: E[wt] = 0; Cov[wk, wl] = Qk δk,l
- {vt} moments: E[vt] = 0; Cov[vk, vl] = Rk δk,l
- Cov[wt, vt] = 0
- y0 independent of {wt} & {vt}, with E[y0] = ȳ0 and Cov(y0) = P0.

Mean and Variance of State Vector yt

• yt is a random variable following: yt = Ayt-1 + But + wt
- It may be unobservable, with no data for it.
- It is normally distributed (it is a sum of normal variables, wt ~ N(0,Q)):
P(yt|It-1) = N(E[yt|It-1], Var[yt|It-1])

• Conditional Mean: E[yt|It-1] = yt|t-1 = A yt-1|t-1 + B ut|t-1
In an AR(1) model: E[yt|It-1] = μ + ϕ yt-1|t-1.

• Conditional Variance: Var[yt|It-1] = Pt|t-1 = A Pt-1|t-1 A' + Q

Note: There are 2 sources of noise: 1) wt & 2) yt-1 – the difference between yt-1 and yt-1|t-1 may not be zero.

• Cov(yt-1, wt) = 0.

Mean and Variance of zt & Joint (yt, zt)

• zt is a random variable following: zt = Hyt + vt.
- It is normally distributed (it is a sum of normal variables, vt ~ N(0,R)):
P(zt|It-1) = N(E[zt|It-1], Var[zt|It-1])

• Conditional Mean: E[zt|It-1] = zt|t-1 = H yt|t-1

• Conditional Variance: Var[zt|It-1] = H Pt|t-1 H' + R
Note: Cov(yt-1, vt) = 0 (since E[wt vt'] = 0).

• Covariance between zt & yt: Cov[zt, yt|It-1] = Pt|t-1 H'

• Joint pdf of (zt, yt|It-1):

(yt|It-1)     ( (A yt-1|t-1 + B ut)   ( Pt|t-1     Pt|t-1 H'        ) )
(zt|It-1) ~ N ( (H yt|t-1)         ,  ( H Pt|t-1   H Pt|t-1 H' + R  ) )

Kalman Filter

• Given y0|0 & P0|0 (initialization), the Kalman filter solves the following equations for t = 1, ..., T.

Prediction: yt|t-1 is the estimate based on measurements up to the previous time, t-1:
yt|t-1 = A yt-1|t-1 + B ut
Pt|t-1 = A Pt-1|t-1 A' + Q

Update: yt|t has additional information – the measurement at time t:
yt|t = yt|t-1 + Kt (zt - H yt|t-1)
Pt|t = Pt|t-1 - Kt H Pt|t-1 = (I - Kt H) Pt|t-1
Kt = Pt|t-1 H' (H Pt|t-1 H' + R)^-1   ("Kalman gain")

• The forecast error is: et|t-1 = zt - zt|t-1 = zt - H yt|t-1

• The variance of et|t-1: Ft|t-1 = H Pt|t-1 H' + R
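These matrix recursions can be sketched directly in Python with plain lists (a 2-state example; A, H, Q, R and the measurements are all made up, and F here is 1x1 so the inversion is a scalar division):

```python
# Minimal matrix implementation of the prediction/update equations above.

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    return [list(r) for r in zip(*a)]

A = [[0.9, 0.1], [0.0, 0.8]]      # state transition (made up)
H = [[1.0, 0.0]]                  # we observe only the first state
Q = [[0.1, 0.0], [0.0, 0.1]]
R = [[0.5]]

y = [[0.0], [0.0]]                # y_{0|0}
P = [[1.0, 0.0], [0.0, 1.0]]      # P_{0|0}

for z in [0.8, 1.1, 0.9]:         # made-up measurements
    # Prediction step
    y_pred = mat_mul(A, y)
    P_pred = mat_add(mat_mul(mat_mul(A, P), transpose(A)), Q)
    # Update step (F = H P_pred H' + R is 1x1 here)
    F = mat_add(mat_mul(mat_mul(H, P_pred), transpose(H)), R)
    K = [[row[0] / F[0][0]] for row in mat_mul(P_pred, transpose(H))]
    e = z - mat_mul(H, y_pred)[0][0]          # forecast error e_t|t-1
    y = [[yi[0] + ki[0] * e] for yi, ki in zip(y_pred, K)]
    P = mat_sub(P_pred, mat_mul(mat_mul(K, H), P_pred))

print(round(y[0][0], 3), round(P[0][0], 3))
```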

Kalman Filter

• Recall: et|t-1 = zt - zt|t-1 = zt - H yt|t-1   (forecast error)
Ft|t-1 = H Pt|t-1 H' + R   (variance of et|t-1)

• The initial values, y0|0 & P0|0, are set to unconditional mean and variance, and reflect prior beliefs about the distribution of yt.

• The update of the state variable, yt|t, and its variance, Pt|t, are linear combinations of the previous guess and the forecast error:

yt|t = yt|t-1 + Kt (zt - H yt|t-1) = yt|t-1 + Kt et|t-1
Pt|t = Pt|t-1 - Kt H Pt|t-1   (conditional variance)
Kt = Pt|t-1 H' (H Pt|t-1 H' + R)^-1 = Cov[zt, yt|It-1] (Ft|t-1)^-1

Since we observe zt, the uncertainty (measured by Pt|t) declines.

Note: The bigger Ft|t-1, the smaller Kt and the less weight put on updating.

Kalman Gain

• Kt depends on the relationship between (zt & yt) and Ft|t-1:

Kt = Pt|t-1 H' (H Pt|t-1 H' + R)^-1 = Cov[zt, yt|It-1] (Ft|t-1)^-1

- The stronger Cov[zt, yt|It-1], the more relevant Kt is in the update.
- If the relationship is weaker, we do not put much weight on the residual, as it is likely not driven by yt.
- If we are sure about the measurements, R decreases to zero; Kt increases and we weight residuals more heavily than the prediction.

• If the model is time-invariant (Q and R are, in fact, constant) the

Kalman gain quickly converges to a constant: Kt → K. In this case, the filter becomes stationary.
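This convergence can be sketched for the scalar time-invariant case (A = H = 1; Q and R values are illustrative), iterating the variance recursion until the gain settles at its steady-state value:

```python
# Riccati-style iteration for the steady-state Kalman gain in a scalar,
# time-invariant model: P_pred = P_upd + Q, K = P_pred/(P_pred + R),
# P_upd = (1 - K) P_pred.

Q, R = 0.5, 1.0
p = 10.0                      # start from a large initial variance
gains = []
for _ in range(50):
    p_pred = p + Q            # prediction variance
    k = p_pred / (p_pred + R) # Kalman gain
    p = (1 - k) * p_pred      # updated variance
    gains.append(k)

print(round(gains[-1], 4))    # the gain has converged (to 0.5 here)
```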


MLE

• Once we have et|t-1 & Ft|t-1, under normality, we can do MLE:

ln L = -(Tn/2) ln(2π) - (1/2) Σ_{t=p+1..T} ln|Ft| - (1/2) Σ_{t=p+1..T} e't Ft^-1 et + ln f(zp, zp-1, ..., z1)

Note: Under normality, the KF is optimal in an MSE sense.

• Algorithm (after initialization & ignoring first observations):

for (i in 1:T) {
  zhat = t(H) %*% state10
  zvar = t(H) %*% P10 %*% H + R
  zvarinv = inv(zvar)
  eps = t(z[i,]) - zhat
  f0 = f0 - ln(det(zvar)) - t(eps) %*% zvarinv %*% eps
  state11 = state10 + P10 %*% H %*% zvarinv %*% eps
  P11 = P10 - P10 %*% H %*% zvarinv %*% t(H) %*% P10
  state10 = A %*% state11
  P10 = A %*% P11 %*% t(A) + Q
}
f0 = -(T*n/2) * ln(2*pi) + f0/2
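The same prediction-error decomposition of the likelihood can be sketched in Python for a scalar local level model (A = H = 1, no exogenous input; the data and parameter values are made up). The returned log-likelihood could then be maximized over (q, r):

```python
# Gaussian log-likelihood built from the KF prediction errors e_t|t-1 and
# their variances F_t|t-1 (scalar sketch).
import math

def kalman_loglik(zs, q, r, y0=0.0, p0=1.0):
    """Log-likelihood of measurements zs under a scalar local level model."""
    y, p, ll = y0, p0, 0.0
    for z in zs:
        y_pred, p_pred = y, p + q          # prediction step
        f = p_pred + r                     # F_t|t-1 = H P_pred H' + R
        e = z - y_pred                     # forecast error
        ll += -0.5 * (math.log(2 * math.pi) + math.log(f) + e * e / f)
        k = p_pred / f                     # update step
        y, p = y_pred + k * e, (1 - k) * p_pred
    return ll

zs = [0.5, 1.0, 0.7, 1.3]                  # made-up data
print(round(kalman_loglik(zs, q=0.3, r=1.0), 3))
```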

Derivation of Update

• We wrote the joint distribution of (zt, yt)|It-1. But we could have also written the joint distribution of (et, yt)|It-1:

(yt|It-1)     ( (yt|t-1)   ( Pt|t-1     Pt|t-1 H' ) )
(et|It-1) ~ N ( (0)      , ( H Pt|t-1   Ft|t-1    ) )

• Recall a property of the multivariate normal:

If x1 and x2 are jointly normally distributed, the conditional distribution of x1|x2 is also normal, with mean μ1|2 and variance Σ1|2:

μ1|2 = μ1 + Σ12 Σ22^-1 (x2 - μ2)
Σ1|2 = Σ11 - Σ12 Σ22^-1 Σ21

Then, yt|t = yt|t-1 + Pt|t-1 H' (Ft|t-1)^-1 et|t-1

Pt|t = Pt|t-1 - Pt|t-1 H' (Ft|t-1)^-1 H Pt|t-1


Prediction (Time Update):
(1) Project the state ahead: yt|t-1 = A yt-1|t-1 + B ut|t-1
(2) Project the error covariance ahead: Pt|t-1 = A Pt-1|t-1 A' + Q

Correction (Measurement Update):
(1) Compute the Kalman gain: Kt = Pt|t-1 H' (H Pt|t-1 H' + R)^-1
(2) Update the estimate with measurement zt: yt|t = yt|t-1 + Kt (zt - H ŷt|t-1)
(3) Update the error covariance: Pt|t = (I – Kt H) Pt|t-1


Kalman Filter: Remarks

• The KF works by minimizing E[(yt – yt|t-1)' (yt – yt|t-1)]. Under this metric, the KF is the optimal estimator.

• Conditions for optimality (MSE) – Anderson and Moore (1979):
- the DGP (linear system & its state space model) is exactly known;
- the noise vector (wt, vt) is white;
- the noise covariances, Q and R, are known.

• The KF is the minimum-MSE estimator among all linear estimators, but in the case of a Gaussian model it is the minimum-MSE estimator among all estimators.

• In practice, it is difficult to meet the three conditions. A lot of tuning by ad-hoc methods is needed to get a KF that works "sufficiently well."

Kalman Filter: Practical Considerations

• Estimating Q and R usually involves NW-style SEs. Estimating Q is, in general, complicated. Tuning Q and R is usual practice to improve performance.

• The inversion of Ft|t-1 can be difficult. Usually, the problem comes from having Q singular. In practice, approximations (pseudo-inversion) are used.

• When knowledge/confidence about y0 is low, a diffuse prior for y0 would set P0|0 high.

• The noise vector (wt, vt) should be WN. Autocorrelograms and LB tests can be used to check this.


Kalman Filter: Problems and Variations

• When the model – i.e., g(·) and f(·) – is non-linear, the extended Kalman filter (EKF) works by linearizing the model (similar to NLLS; A & H are Jacobian matrices of partial derivatives).

Problem: The distributions of the RVs are no longer normal after their respective nonlinear transformations. EKF is an ad-hoc method.

• When the model is highly non-linear, the EKF will not work well. The unscented Kalman filter (UKF), which propagates a deterministic set of sample points (the unscented transform) through the model to calculate the updates, works better.

• The KF struggles under non-normality and when the dimensions of the state vector increase. Particle filters (sequential Monte Carlo), another class of Bayesian filters, are very popular in these cases.

Kalman Filter: Smoothing

• We can use all the data to re-estimate our prediction. That is, yt|T.

• Inputs: Initial distribution y0 and data z1, …, zT

• Algorithm: Forward-backward pass (Rauch-Tung-Striebel algorithm)

- Forward pass:

Kalman filter: compute yt+1|t and yt+1|t+1 for 0 ≤ t < T

- Backward pass:

Compute yt|T for 0 ≤ t < T. This reverses the process in our intuitive example.


Kalman Filter: Smoothing – Backward Pass

• Compute yt|T given yt+1|T ~ N(yt+1|T, Pt+1|T )

- Reverse movement from the filter: yt|t → yt+1|t

- Same as incorporating measurement in filter

1. Compute the joint pdf of (yt|t, yt+1|t)
2. Compute the conditional distribution (yt|t | yt+1|t = yt+1)

- But: yt+1 is not "known"; we only know its distribution.

3. "Uncondition" on yt+1 to compute yt|T, using the laws of total expectation and variance


Kalman Filter: Smoothing – Backward Pass

• Step 1. Compute the joint pdf of (yt|t, yt+1|t):

(yt|t  )     ( (yt|t  )   ( Var(yt|t)           Cov(yt|t, yt+1|t) ) )
(yt+1|t) ~ N ( (yt+1|t) , ( Cov(yt+1|t, yt|t)   Var(yt+1|t)       ) )

             ( (yt|t  )   ( Pt|t      Pt|t A'  ) )
           = N ( (yt+1|t) , ( A Pt|t    Pt+1|t   ) )

• Step 2. Compute the conditional pdf of (yt|t | yt+1|t = yt+1):

yt|t | (yt+1|t = yt+1) ~ N( yt|t + Pt|t A' Pt+1|t^-1 (yt+1 - yt+1|t),  Pt|t - Pt|t A' Pt+1|t^-1 A Pt|t )

where we used the conditional normal result:
μ1|2 = μ1 + Σ12 Σ22^-1 (x2 - μ2)
Σ1|2 = Σ11 - Σ12 Σ22^-1 Σ21

Kalman Filter: Smoothing – Backward Pass

• Step 3. "Uncondition" on yt+1 to compute yt|T. We do not know its value, only its distribution: yt+1 ~ N(yt+1|T, Pt+1|T).

• Uncondition on yt+1 using the law of total expectation and the law of total variance:

Law of total expectation: E[X] = E_Z[E(X|Z)]
E[yt|T] = E_{yt+1|T}[E(yt|t | yt+1|t = yt+1|T)]

Law of total variance: Var(X) = E_Z[Var(X|Z)] + Var_Z(E[X|Z])
Var(yt|T) = E_{yt+1|T}[Var(yt|t | yt+1|t = yt+1|T)] + Var_{yt+1|T}(E[yt|t | yt+1|t = yt+1|T])

Kalman Filter: Smoothing – Backward Pass

• Step 3 (continuation). From Step 2 we know (with Lt ≡ Pt|t A' Pt+1|t^-1):

E(yt|t | yt+1|t = yt+1|T) = yt|t + Lt (yt+1|T - yt+1|t)
Var(yt|t | yt+1|t = yt+1|T) = Pt|t - Lt Pt+1|t Lt'

Then,

E(yt|T) = E_{yt+1|T}[E(yt|t | yt+1|t = yt+1|T)]
        = yt|t + Lt (yt+1|T - yt+1|t)

Var(yt|T) = E_{yt+1|T}[Var(yt|t | yt+1|t = yt+1|T)] + Var_{yt+1|T}(E(yt|t | yt+1|t = yt+1|T))
          = Pt|t - Lt Pt+1|t Lt' + Lt Pt+1|T Lt'
          = Pt|t + Lt (Pt+1|T - Pt+1|t) Lt'

Kalman Filter: Smoothing – Backward Pass

• Summary:

Lt = Pt|t A' Pt+1|t^-1
yt|T = yt|t + Lt (yt+1|T - yt+1|t)
Pt|T = Pt|t + Lt (Pt+1|T - Pt+1|t) Lt'

• Algorithm:

for (it in 1:T-1) {
  P0T = reshape(Psmo[T-it+1,], rx, rx)
  state0T = statesmo[T-it+1,]
  Pfilt = reshape(P[T-it,], rx, rx)        # this is P_t given t
  Pnext = A %*% Pfilt %*% t(A) + Q         # this is P_t+1 given t
  Lt = Pfilt %*% t(A) %*% inv(Pnext)
  state0T = t(state[T-it,]) + Lt %*% (state0T - A %*% t(state[T-it,]))
  P0T = Pfilt + Lt %*% (P0T - Pnext) %*% t(Lt)
  statesmo[T-it,] = t(state0T)
  Psmo[T-it,] = t(vec(P0T))
}

Kalman Filter: Smoothing – Algorithm

• for (t = 0; t < T; ++t)   (Kalman filter)

yt+1|t = A yt|t
Pt+1|t = A Pt|t A' + Q
Kt+1 = Pt+1|t H' (H Pt+1|t H' + R)^-1
yt+1|t+1 = yt+1|t + Kt+1 (zt+1 - H yt+1|t)
Pt+1|t+1 = Pt+1|t - Kt+1 H Pt+1|t

• for (t = T – 1; t ≥ 0; --t)   (Backward pass)

Lt = Pt|t A' Pt+1|t^-1
yt|T = yt|t + Lt (yt+1|T - yt+1|t)
Pt|T = Pt|t + Lt (Pt+1|T - Pt+1|t) Lt'

Kalman Filter: Smoothing - Remarks

• Kalman smoother is a post-processing method.

• Use the yt|T's as the optimal estimate of the state at time t, and use Pt|T as a measure of its uncertainty.

• The smoothing recursion consists of the backward recursion that uses the filtered values of y and P.
