Warm-up: Recursive Least Squares Kalman Filter Nonlinear State Space Models Particle Filtering
Time Series Analysis 5. State space models and Kalman filtering
Andrew Lesniewski
Baruch College New York
Fall 2019
Outline
1 Warm-up: Recursive Least Squares
2 Kalman Filter
3 Nonlinear State Space Models
4 Particle Filtering
OLS regression
As motivation for the remainder of this lecture, we consider the standard linear model
$$Y = X^T\beta + \varepsilon, \tag{1}$$
where $Y \in \mathbb{R}$, $X \in \mathbb{R}^k$, and $\varepsilon \in \mathbb{R}$ is noise (this includes the model with an intercept as a special case, in which the first component of $X$ is assumed to be 1).
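Given $n$ observations $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ of $X$ and $Y$, respectively, ordinary least squares (OLS) regression leads to the following estimate of the coefficient $\beta$:
$$\hat\beta_n = (X_n^T X_n)^{-1} X_n^T Y_n. \tag{2}$$
The matrices $X_n$ and $Y_n$ above are defined as
$$X_n = \begin{pmatrix} x_1^T \\ \vdots \\ x_n^T \end{pmatrix} \in \mathrm{Mat}_{n,k}(\mathbb{R}) \quad\text{and}\quad Y_n = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} \in \mathbb{R}^n, \tag{3}$$
respectively.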
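As a small numerical illustration of (2), the sketch below fits $\beta$ on synthetic data. The sample size, the coefficient vector `beta_true`, and the noise level are assumptions chosen only for this example. The normal equations are solved with `np.linalg.solve` instead of forming the inverse explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 200, 3                            # illustrative sample size and dimension
beta_true = np.array([1.0, -2.0, 0.5])   # hypothetical coefficients

X = rng.normal(size=(n, k))              # rows are x_i^T, as in (3)
X[:, 0] = 1.0                            # first component fixed at 1: intercept
Y = X @ beta_true + 0.1 * rng.normal(size=n)

# Equation (2): solve (X^T X) beta = X^T Y for beta
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)
```

With this much data and little noise, `beta_hat` should land close to `beta_true`.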
Recursive least squares
Suppose now that the data in $X_n$ and $Y_n$ arrive as a stream, and each new observation leads to an updated estimate of $\beta$. Redoing the entire calculation above each time a new observation arrives is inefficient. Instead, we shall derive a recursive algorithm that stores the last calculated estimate of $\beta$ and updates it to incorporate the latest observation. To this end, we introduce the notation:
$$\begin{aligned} P_n &= (X_n^T X_n)^{-1}, \\ B_n &= X_n^T Y_n, \end{aligned} \tag{4}$$
and rewrite (2) as
$$\hat\beta_n = P_n B_n. \tag{5}$$
Using (3), we can write
$$X_{n+1} = \begin{pmatrix} X_n \\ x_{n+1}^T \end{pmatrix}, \qquad Y_{n+1} = \begin{pmatrix} Y_n \\ y_{n+1} \end{pmatrix}.$$
Therefore, Pn and Bn obey the following recursive relations:
$$\begin{aligned} P_{n+1}^{-1} &= P_n^{-1} + x_{n+1}x_{n+1}^T, \\ B_{n+1} &= B_n + x_{n+1}y_{n+1}. \end{aligned}$$
We shall now use the matrix inversion formula:
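$$(A + BD)^{-1} = A^{-1} - A^{-1}B(I + DA^{-1}B)^{-1}DA^{-1}, \tag{6}$$
valid for a square invertible matrix $A$ and matrices $B$ and $D$ such that the operations above are defined. Applying it with $A = P_n^{-1}$, $B = x_{n+1}$, and $D = x_{n+1}^T$ yields the following relation:
$$\begin{aligned} P_{n+1} &= P_n - P_n x_{n+1}(1 + x_{n+1}^T P_n x_{n+1})^{-1} x_{n+1}^T P_n \\ &= P_n - K_{n+1}x_{n+1}^T P_n, \end{aligned}$$
where we have introduced the notation
$$K_{n+1} = P_n x_{n+1}(1 + x_{n+1}^T P_n x_{n+1})^{-1}.$$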
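The rank-one update can be checked numerically. The sketch below compares the recursive formula for $P_{n+1}$ with a direct inversion of $P_n^{-1} + x_{n+1}x_{n+1}^T$. The dimension and the random inputs are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 4
X = rng.normal(size=(10, k))
P = np.linalg.inv(X.T @ X)         # plays the role of P_n
x = rng.normal(size=k)             # plays the role of x_{n+1}

K = P @ x / (1.0 + x @ P @ x)      # gain K_{n+1}; the denominator is a scalar
P_next = P - np.outer(K, x @ P)    # P_n - K_{n+1} x_{n+1}^T P_n

# Direct (expensive) route: invert P_n^{-1} + x x^T
P_direct = np.linalg.inv(np.linalg.inv(P) + np.outer(x, x))
max_diff = np.max(np.abs(P_next - P_direct))
print(max_diff)
```

The two routes should agree up to floating-point rounding. Only a scalar is inverted in the recursive version.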
Now, define the a priori error
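$$\hat\varepsilon_{n+1} = y_{n+1} - x_{n+1}^T\hat\beta_n.$$
Then the recursion for $B_n$ can be recast as
$$B_{n+1} = B_n + x_{n+1}x_{n+1}^T\hat\beta_n + x_{n+1}\hat\varepsilon_{n+1}.$$
Using (5), we see that
$$\begin{aligned} P_{n+1}^{-1}\hat\beta_{n+1} &= P_n^{-1}\hat\beta_n + x_{n+1}x_{n+1}^T\hat\beta_n + x_{n+1}\hat\varepsilon_{n+1} \\ &= (P_n^{-1} + x_{n+1}x_{n+1}^T)\hat\beta_n + x_{n+1}\hat\varepsilon_{n+1} \\ &= P_{n+1}^{-1}\hat\beta_n + x_{n+1}\hat\varepsilon_{n+1}. \end{aligned}$$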
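In other words,
$$\hat\beta_{n+1} = \hat\beta_n + P_{n+1}x_{n+1}\hat\varepsilon_{n+1}.$$
However, the definition of $K_{n+1}$ gives $P_n x_{n+1} = K_{n+1}(1 + x_{n+1}^T P_n x_{n+1})$, and so the recursion for $P_{n+1}$ implies
$$P_{n+1}x_{n+1} = P_n x_{n+1} - K_{n+1}x_{n+1}^T P_n x_{n+1} = K_{n+1}.$$
Therefore,
$$\hat\beta_{n+1} = \hat\beta_n + K_{n+1}\hat\varepsilon_{n+1}.$$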
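The algorithm can be summarized as follows. We initialize $\hat\beta_0$ (e.g. at $0$) and $P_0$ (e.g. at $I$), and iterate:
$$\begin{aligned} \hat\varepsilon_{n+1} &= y_{n+1} - x_{n+1}^T\hat\beta_n, \\ K_{n+1} &= P_n x_{n+1}(1 + x_{n+1}^T P_n x_{n+1})^{-1}, \\ P_{n+1} &= P_n - K_{n+1}x_{n+1}^T P_n, \\ \hat\beta_{n+1} &= \hat\beta_n + K_{n+1}\hat\varepsilon_{n+1}. \end{aligned} \tag{7}$$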
Note that (i) we no longer have to store the (potentially large) matrices $X_n$ and $Y_n$, and (ii) the computationally expensive operation of inverting the matrix $X_n^T X_n$ is replaced with a small number of simpler operations.
We now move on to the main topic of these notes, the Kalman filter and its generalizations.
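The recursion (7) can be sketched in a few lines of NumPy. The function name `rls_update` and the synthetic data are assumptions made for this illustration. One point worth noting: starting from $\hat\beta_0 = 0$ and $P_0 = I$, the recursion reproduces $(I + X_n^T X_n)^{-1}X_n^T Y_n$ and not the exact OLS estimate (2). The two are close when $n$ is large.

```python
import numpy as np

def rls_update(beta, P, x, y):
    """One step of recursion (7); returns the updated (beta, P)."""
    err = y - x @ beta                 # a priori error
    K = P @ x / (1.0 + x @ P @ x)      # gain K_{n+1}
    P_new = P - np.outer(K, x @ P)     # P_n - K_{n+1} x_{n+1}^T P_n
    beta_new = beta + K * err
    return beta_new, P_new

rng = np.random.default_rng(2)
n, k = 500, 3
beta_true = np.array([0.5, 1.5, -1.0])   # hypothetical coefficients
X = rng.normal(size=(n, k))
Y = X @ beta_true + 0.1 * rng.normal(size=n)

beta, P = np.zeros(k), np.eye(k)         # initialization suggested in the text
for x, y in zip(X, Y):                   # process observations one at a time
    beta, P = rls_update(beta, P, x, y)

# Batch counterpart of the recursion with beta_0 = 0, P_0 = I
beta_batch = np.linalg.solve(np.eye(k) + X.T @ X, X.T @ Y)
print(beta, beta_batch)
```

Each update costs on the order of $k^2$ operations and needs only the current `beta` and `P`, regardless of how many observations have been processed.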
State space models
A state space model (SSM) is a time series model in which the time series Yt is interpreted as the result of a noisy observation of a stochastic process Xt .
The values of the variables Xt and Yt can be continuous (scalar or vector) or discrete. Graphically, an SSM is represented as follows:
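$$\begin{array}{cccccccccl} X_1 & \to & X_2 & \to & \cdots & \to & X_t & \to & \cdots & \quad\text{(state process)} \\ \downarrow & & \downarrow & & & & \downarrow & & & \\ Y_1 & & Y_2 & & \cdots & & Y_t & & \cdots & \quad\text{(observation process)} \end{array} \tag{8}$$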
SSMs belong to the realm of Bayesian inference, and they have been successfully applied in many fields to solve a broad range of problems. Our discussion of SSMs follows largely [2].
It is usually assumed that the state process Xt is Markovian, i.e. Xt depends on the history only through Xt−1, and Yt depends only on Xt :
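$$\begin{aligned} X_t &\sim p(X_t \mid X_{t-1}), \\ Y_t &\sim p(Y_t \mid X_t). \end{aligned} \tag{9}$$
The best-studied SSM is the linear Gaussian model underlying the Kalman filter, for which the processes above are linear and the sources of randomness are Gaussian. Namely, a linear state space model has the form:
$$\begin{aligned} X_{t+1} &= GX_t + \varepsilon_{t+1}, \\ Y_t &= HX_t + \eta_t. \end{aligned} \tag{10}$$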
Here, the state vector $X_t \in \mathbb{R}^r$ is possibly unobservable, and it can be observed only through the observation vector $Y_t \in \mathbb{R}^n$.
The matrices $G \in \mathrm{Mat}_r(\mathbb{R})$ and $H \in \mathrm{Mat}_{n,r}(\mathbb{R})$ are assumed to be known. For example, their values may be given by (economic) theory, or they may have been obtained through maximum likelihood estimation (MLE). In fact, the matrices $G$ and $H$ may depend deterministically on time, i.e. $G$ and $H$ may be replaced by known matrices $G_t$ and $H_t$, respectively.
We also assume that the distribution of the initial value X1 is known and Gaussian.
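The vectors of residuals $\varepsilon_t \in \mathbb{R}^r$ and $\eta_t \in \mathbb{R}^n$ have zero mean and satisfy
$$\begin{aligned} E(\varepsilon_t \varepsilon_s^T) &= \delta_{ts}Q, \\ E(\eta_t \eta_s^T) &= \delta_{ts}R, \end{aligned} \tag{11}$$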
where δts denotes Kronecker’s delta, and where Q and R are known positive definite (covariance) matrices.
We also assume that the components of εt and ηs are independent of each other for all t and s. The matrices Q and R may depend deterministically on time. The first of the equations in (10) is called the state equation, while the second one is referred to as the observation equation.
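To make the model (10)-(11) concrete, the following sketch draws one sample path of the state and observation processes. The function name `simulate_lgssm` and all numerical values (a two-dimensional state observed through its first component) are hypothetical choices for illustration and are not taken from the lecture.

```python
import numpy as np

def simulate_lgssm(G, H, Q, R, mu1, P1, T, rng):
    """Draw X_1..X_T and Y_1..Y_T from the linear Gaussian model (10)-(11)."""
    r, n = G.shape[0], H.shape[0]
    X = np.empty((T, r))
    Y = np.empty((T, n))
    X[0] = rng.multivariate_normal(mu1, P1)        # Gaussian initial state X_1
    for t in range(T):
        if t > 0:
            eps = rng.multivariate_normal(np.zeros(r), Q)
            X[t] = G @ X[t - 1] + eps              # state equation
        eta = rng.multivariate_normal(np.zeros(n), R)
        Y[t] = H @ X[t] + eta                      # observation equation
    return X, Y

# Hypothetical example: state = (level, slope), only the level is observed
G = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])
X, Y = simulate_lgssm(G, H, Q, R, np.zeros(2), np.eye(2), T=100,
                      rng=np.random.default_rng(3))
print(X.shape, Y.shape)
```

Row `t` of `X` and `Y` corresponds to time $t+1$ in the notation of the text, because Python arrays are indexed from zero.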
Inference for state space models
Let T denote the time horizon.
Our broad goal is to make inference about the states $X_t$ based on a set of observations $Y_1, \ldots, Y_t$. Three questions are of particular interest:
(i) Filtering: $t < T$. What can we infer about the current state of the system based on all available observations?
(ii) Smoothing: $t = T$. What can be inferred about the system based on the information contained in the entire data sample? In particular, how can we backfill missing observations?
(iii) Forecasting: $t > T$. What is the optimal prediction of a future observation and/or a future state of the system?
Kalman filter
In principle, any inference for this model can be done using the standard methods of multivariate statistics. However, these methods require storing large amounts of data and inverting tn × tn matrices. Notice that, as new data arrive, the storage requirements and matrix dimensionality increase. This is frequently computationally intractable and impractical. Instead, the Kalman filter relies on a recursive approach which does not require significant storage resources and involves inverting n × n matrices only. We will go through a detailed derivation of this recursion.
The purpose of filtering is to update the knowledge of the system each time a new observation is made.
We define the one period predictor µt+1, when the observation Yt is made, and its covariance Pt+1:
$$\begin{aligned} \mu_{t+1} &= E(X_{t+1} \mid Y_{1:t}), \\ P_{t+1} &= \mathrm{Var}(X_{t+1} \mid Y_{1:t}), \end{aligned} \tag{12}$$
as well as the filtered estimator µt|t and its covariance Pt|t :
$$\begin{aligned} \mu_{t|t} &= E(X_t \mid Y_{1:t}), \\ P_{t|t} &= \mathrm{Var}(X_t \mid Y_{1:t}). \end{aligned} \tag{13}$$
Our objective is to compute these quantities recursively.
We let
$$v_t = Y_t - E(Y_t \mid Y_{1:t-1}) \tag{14}$$
denote the one period prediction error or innovation.
In Homework Assignment #6 we show that the $v_t$'s are mutually independent. Note that
$$\begin{aligned} v_t &= Y_t - E(HX_t + \eta_t \mid Y_{1:t-1}) \\ &= Y_t - H\mu_t. \end{aligned}$$
As a consequence, we have, for $t = 2, 3, \ldots$,
$$\begin{aligned} E(v_t \mid Y_{1:t-1}) &= E(HX_t + \eta_t - H\mu_t \mid Y_{1:t-1}) \\ &= 0. \end{aligned} \tag{15}$$
Now we notice that
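$$\begin{aligned} \mathrm{Var}(v_t \mid Y_{1:t-1}) &= \mathrm{Var}(HX_t + \eta_t - H\mu_t \mid Y_{1:t-1}) \\ &= \mathrm{Var}(H(X_t - \mu_t) \mid Y_{1:t-1}) + \mathrm{Var}(\eta_t \mid Y_{1:t-1}) \\ &= E(H(X_t - \mu_t)(X_t - \mu_t)^T H^T \mid Y_{1:t-1}) + E(\eta_t\eta_t^T \mid Y_{1:t-1}) \\ &= HP_tH^T + R. \end{aligned}$$
For convenience we denote
$$F_t = \mathrm{Var}(v_t \mid Y_{1:t-1}). \tag{16}$$
We will assume in the following that the matrix $F_t$ is invertible. The result of the calculation above can thus be stated as:
$$F_t = HP_tH^T + R. \tag{17}$$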
This relation allows us to derive a relation between µt and µt|t .
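In code, the innovation (14) and its covariance (17) amount to two lines of linear algebra. The helper name `innovation` and the numerical inputs below are hypothetical and serve only as an illustration.

```python
import numpy as np

def innovation(y, mu_pred, P_pred, H, R):
    """Return v_t = y_t - H mu_t and F_t = H P_t H^T + R, as in (14) and (17)."""
    v = y - H @ mu_pred
    F = H @ P_pred @ H.T + R
    return v, F

# Hypothetical one-step-ahead predictor and model matrices
H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])
mu_pred = np.array([1.0, 0.5])
P_pred = np.array([[2.0, 0.3], [0.3, 1.0]])

v, F = innovation(np.array([1.4]), mu_pred, P_pred, H, R)
print(v, F)
```

Here only the first state component is observed, so $F_t$ is the predicted variance of that component plus the observation noise variance: $2.0 + 0.25 = 2.25$.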
Lemma
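First, we will establish the following Lemma: Let $X$ and $Y$ be jointly Gaussian random vectors with
$$E\begin{pmatrix} X \\ Y \end{pmatrix} = \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}$$
and
$$\mathrm{Var}\begin{pmatrix} X \\ Y \end{pmatrix} = \begin{pmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{XY}^T & \Sigma_{YY} \end{pmatrix}.$$
Then
$$E(X \mid Y) = \mu_X + \Sigma_{XY}\Sigma_{YY}^{-1}(Y - \mu_Y), \tag{18}$$
and
$$\mathrm{Var}(X \mid Y) = \Sigma_{XX} - \Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{XY}^T. \tag{19}$$
Proof: Consider the random variable
$$Z = X - \Sigma_{XY}\Sigma_{YY}^{-1}(Y - \mu_Y).$$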
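Since $Z$ is linear in $X$ and $Y$, the vector $(Y, Z)$ is jointly Gaussian. Furthermore,
$$\begin{aligned} E(Z) &= \mu_X, \\ \mathrm{Var}(Z) &= E((Z - \mu_X)(Z - \mu_X)^T) \\ &= \Sigma_{XX} - \Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{XY}^T. \end{aligned}$$
Finally,
$$\begin{aligned} \mathrm{Cov}(Y, Z) &= E(Y(Z - \mu_X)^T) \\ &= E(Y(X - \mu_X)^T - Y(Y - \mu_Y)^T\Sigma_{YY}^{-1}\Sigma_{XY}^T) \\ &= \Sigma_{XY}^T - \Sigma_{YY}\Sigma_{YY}^{-1}\Sigma_{XY}^T \\ &= 0. \end{aligned}$$
Because $(Y, Z)$ is jointly Gaussian, zero covariance means that $Z$ and $Y$ are independent.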
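Consequently, $E(Z \mid Y) = E(Z)$ and $\mathrm{Var}(Z \mid Y) = \mathrm{Var}(Z)$. Since
$$X = Z + \Sigma_{XY}\Sigma_{YY}^{-1}(Y - \mu_Y),$$
we have
$$E(X \mid Y) = \mu_X + \Sigma_{XY}\Sigma_{YY}^{-1}(Y - \mu_Y),$$
which proves (18). Also, conditioned on $Y$, $X$ and $Z$ differ by a constant vector, and so
$$\begin{aligned} \mathrm{Var}(X \mid Y) &= \mathrm{Var}(Z \mid Y) \\ &= \mathrm{Var}(Z) \\ &= \Sigma_{XX} - \Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{XY}^T, \end{aligned}$$
which proves (19). QED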
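The Lemma translates directly into code. The sketch below is a minimal implementation of (18) and (19); the function name `gaussian_condition` and the joint covariance matrix are hypothetical and chosen only so that the example is positive definite.

```python
import numpy as np

def gaussian_condition(mu_x, mu_y, S_xx, S_xy, S_yy, y):
    """Conditional mean and covariance of X given Y = y, following (18)-(19)."""
    # Compute S_xy S_yy^{-1} by solving a linear system (S_yy is symmetric)
    gain = np.linalg.solve(S_yy, S_xy.T).T
    mean = mu_x + gain @ (y - mu_y)
    cov = S_xx - gain @ S_xy.T
    return mean, cov

# Hypothetical joint covariance of (X, Y) with X in R^2 and Y in R^1
S = np.array([[2.0, 0.3, 0.8],
              [0.3, 1.0, 0.2],
              [0.8, 0.2, 1.5]])
S_xx, S_xy, S_yy = S[:2, :2], S[:2, 2:], S[2:, 2:]

mean, cov = gaussian_condition(np.zeros(2), np.zeros(1), S_xx, S_xy, S_yy,
                               y=np.array([1.0]))
print(mean)
print(cov)
```

Note that the conditional covariance does not depend on the observed value `y`, only on the joint covariance. This is what allows the covariance recursion of a linear Gaussian filter to be computed without reference to the data.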
Kalman filter
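Now, going back to the main calculation, we have
$$\begin{aligned} \mu_{t|t} &= E(X_t \mid Y_{1:t}) \\ &= E(X_t \mid Y_{1:t-1}, v_t), \end{aligned}$$
and
$$\begin{aligned} \mu_{t+1} &= E(X_{t+1} \mid Y_{1:t}) \\ &= E(X_{t+1} \mid Y_{1:t-1}, v_t). \end{aligned}$$
Applying the Lemma to the joint distribution of $X_t$ and $v_t$ conditioned on $Y_{1:t-1}$, and using $E(v_t \mid Y_{1:t-1}) = 0$ from (15), yields
$$\mu_{t|t} = E(X_t \mid Y_{1:t-1}) + \mathrm{Cov}(X_t, v_t \mid Y_{1:t-1})\,\mathrm{Var}(v_t \mid Y_{1:t-1})^{-1}v_t. \tag{20}$$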
Note that