
The Kalman Filter

Max Welling

California Institute of Technology

Pasadena, CA

welling@vision.caltech.edu

Introduction

Until now we have only looked at systems without any dynamics or structure over time. The models had spatial structure but were defined at one moment in time, i.e. they were static. In this lecture we will analyse a powerful dynamic model, which could be characterized as factor analysis over time, although the number of observed features is not necessarily larger than the number of factors. The idea is again to compute only the mean and the covariance, i.e. to characterize the probabilities by Gaussians. This has the advantage of being completely tractable; for strongly nonlinear systems, however, the Gaussian assumption can no longer hold. The power of the Kalman Filter (KF) is that it operates online: to compute the best estimate of the state and its uncertainty, we can update the previous estimates with the new measurement. This implies that we do not have to consider all the previous data again to compute the optimal estimates; we only need the estimates from the previous time step and the new measurement.

So what are KFs usually used for, or what do they model? It is not hard to motivate the KF, because in practice it can be used for almost everything that moves. Popular applications include guidance, tracking, sonar ranging, satellite orbit computation, stock price prediction, etc. These applications can be summarized as denoising, tracking and control. The KF is used in all sorts of fields, like seismology, bioengineering, etc. For instance, when the Eagle landed on the moon, it did so with a KF. Also, gyroscopes in airplanes use KFs. And the list goes on and on.

The main idea is that we would like to estimate a state of some sort (say, the location and velocity of an airplane) and its uncertainty. However, we do not directly observe these states. We only observe some measurements from an array of sensors, which are noisy. As an additional complication, the states evolve in time, also with their own noise or uncertainties. The question now becomes: how can we still optimally use our measurements to estimate the unobserved (hidden) states and their uncertainties?

In the following we will first describe the Kalman Filter and derive the KF equations. We assume that the parameters of the system are fixed (known). Then we derive the Kalman Smoother equations, which allow us to use measurements that lie ahead in time to better estimate the state at the current time. Because these estimates are usually less noisy than the ones we would obtain if we used measurements up to the current time only, we say that we smooth the state estimates. Next we will show how to employ the Kalman Filter and Smoother equations to efficiently estimate the parameters of the model from training data. This will then be, not surprisingly, another instance of the EM algorithm. In the appendix you will find some useful lemmas and equalities, together with the derivation of the lag-one smoother, which is needed for learning.

Introductory Example

Let me briefly describe an example. Consider a ship at sea which has lost its bearings. The crew needs to estimate the current position using the positions of the stars. In the meantime, the ship moves on on a wavy sea. The question becomes: how can we incorporate the measurements optimally to estimate the ship's location at sea? Let's assume that the model for the ship's location over time is given by

$$y_{t+1} = y_t + c + w$$

This contains a drift term (a constant velocity) $c$ and a noise term $w$. (The constant drift term will not be used in the main body of these class notes.) A noisy measurement is described by

$$x_t = y_t + v_t$$

Let's also assume we have some estimate $\hat{y}_t$ of the location at some time $t$, with some uncertainty $\sigma_t^2$. How does this change as the ship sails for one second? Of course it will drift in the direction of its velocity by a distance $c$. However, the uncertainty grows due to the waves. This is expressed as

$$\hat{y}_{t+1} = \hat{y}_t + c$$

The uncertainty is given by

$$\sigma_{t+1}^2 = \sigma_t^2 + \sigma_w^2$$

If we do not do any measurements, the uncertainty in the position will keep growing until we have no clue anymore as to where we are. If we add the information of a measurement, the final estimate is a weighted average between the observed position and the previous estimate:

$$\hat{y}'_{t+1} = \frac{\sigma_{t+1}^2}{\sigma_{t+1}^2 + \sigma_v^2}\, x_{t+1} + \frac{\sigma_v^2}{\sigma_{t+1}^2 + \sigma_v^2}\, \hat{y}_{t+1}$$

We observe that if we have infinite confidence in the measurement ($\sigma_v^2 \rightarrow 0$), then the new location estimate is simply equal to the measurement. Also, if we have infinite confidence in the previous estimate ($\sigma_{t+1}^2 \rightarrow 0$), the measurement is ignored. For the new uncertainty we find

$$\sigma'^2_{t+1} = \frac{\sigma_{t+1}^2\, \sigma_v^2}{\sigma_{t+1}^2 + \sigma_v^2}$$

This is also easy to interpret, since it says that if one of the uncertainties disappears, the total uncertainty disappears, since the other estimate can simply be ignored in that case. Notice that the uncertainty always decreases (or stays equal) by adding the measurement. The estimates for the location and uncertainty incorporating the measurement can be rewritten as follows:

$$\hat{y}'_{t+1} = \hat{y}_{t+1} + K_{t+1}\,(x_{t+1} - \hat{y}_{t+1})$$
$$\sigma'^2_{t+1} = (1 - K_{t+1})\,\sigma_{t+1}^2$$
$$K_{t+1} = \frac{\sigma_{t+1}^2}{\sigma_{t+1}^2 + \sigma_v^2}$$

From this we can see that the state estimate is corrected by the measurement error multiplied by a gain factor, the Kalman gain. If the gain is zero, no attention is paid to the measurement; if it is one, we simply use the measurement as our new state estimate. Similarly for the uncertainties: if the measurement is infinitely accurate, the gain is one, which implies that there is no uncertainty left. On the other hand, if the measurement is worthless, the gain is zero, which therefore does not decrease the overall uncertainty. The above one-dimensional example will be generalized to higher dimensions in what follows; a small numerical illustration of the one-dimensional recursion is sketched below.
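As an aside (not part of the original notes), here is a minimal Python sketch of the one-dimensional predict/update recursion above. The drift $c$, the noise levels and the number of steps are arbitrary illustrative choices.

```python
import numpy as np

def kf_1d(x_meas, c=1.0, var_w=0.5, var_v=2.0, y0=0.0, var0=1.0):
    """1-D Kalman filter for y_{t+1} = y_t + c + w,  x_t = y_t + v.
    Default noise levels are illustrative assumptions, not values from the notes."""
    y_hat, var = y0, var0
    estimates = []
    for x in x_meas:
        # Predict: drift by c, uncertainty grows by the evolution noise.
        y_hat, var = y_hat + c, var + var_w
        # Update: weighted average of prediction and measurement (Kalman gain K).
        K = var / (var + var_v)
        y_hat = y_hat + K * (x - y_hat)
        var = (1.0 - K) * var
        estimates.append((y_hat, var))
    return estimates

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_y, xs = 0.0, []
    for _ in range(20):
        true_y += 1.0 + rng.normal(scale=np.sqrt(0.5))      # true dynamics
        xs.append(true_y + rng.normal(scale=np.sqrt(2.0)))  # noisy measurement
    for t, (m, v) in enumerate(kf_1d(xs), start=1):
        print(f"t={t:2d}  estimate={m:6.2f}  variance={v:4.2f}")
```

Note how the posterior variance settles to a fixed point: the gain and uncertainty do not depend on the actual measurements, only on the noise levels.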

The Model

Let us first introduce the state of the KF, $y_t$, at time $t$. The state is a vector of dimension $d$ and remains unobserved. At every time $t$ we also have a $k$-dimensional vector of observations $x_t$, which depends on the state and some additive Gaussian noise. We will assume the following dynamical model for the KF:

$$y_{t+1} = A\,y_t + w_t$$
$$x_t = B\,y_t + v_t$$

Note that the dynamics is governed by a Markov process, i.e. the state $y_{t+1}$ is independent of all other states given $y_t$. The evolution noise and the measurement noise are assumed white and Gaussian, i.e. distributed according to

$$w \sim \mathcal{G}_w[0, Q], \qquad v \sim \mathcal{G}_v[0, R]$$

The noise vectors $v_t$ and $w_t$ are also assumed to be uncorrelated with the state $y_t$. From this we simply derive

$$E[y_t v_k^T] = 0 \quad \forall\, t, k$$
$$E[y_t w_k^T] = 0 \quad \forall\, k \geq t$$
$$E[x_t v_k^T] = 0 \quad \forall\, k \neq t$$
$$E[x_t w_k^T] = 0 \quad \forall\, k \geq t$$
$$E[v_t w_k^T] = 0 \quad \forall\, t, k$$
$$E[v_t v_k^T] = \delta_{t,k}\, R \quad \forall\, t, k$$
$$E[w_t w_k^T] = \delta_{t,k}\, Q \quad \forall\, t, k$$

The above model could be considered as a factor analysis model over time, i.e. at every time instant we have a FA model where the factors now depend on the factors of the previous time step. The initial state $y_1$ is distributed according to

$$y_1 \sim \mathcal{G}[\mu, \Sigma]$$

It is easy to generalize this model to include a drift and external inputs. The drift is a constant change, expressed by adding $\epsilon$ to the dynamical equation:

$$y_{t+1} = A\,y_t + \epsilon + w_t = A'\,y'_t + w_t$$

where we incorporated the constant again through the redefinitions $A' = [A \;\; \epsilon]$ and $y'_t = [y_t^T \;\; 1]^T$. For the external inputs we let the evolution depend on an $l$-dimensional vector of inputs $u_t$ as follows:

$$y_{t+1} = A\,y_t + C\,u_t + w_t$$

This model is used when we want to control the system. We could even let the parameters $\{A, \epsilon, C\}$ depend on time. However, in the rest of this chapter we will assume the simplest case, i.e. a linear evolution without drift or inputs, and a linear measurement equation with white, uncorrelated noise. Since the initial state is Gaussian and the evolution equation is linear, this implies that the state at later times will remain Gaussian. A small sketch of sampling from this generative model is given below.
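As an illustration (not part of the original notes), the following sketch samples one sequence from the linear-Gaussian model $y_{t+1} = Ay_t + w_t$, $x_t = By_t + v_t$. The particular matrices $A$, $B$, $Q$, $R$, $\mu$, $\Sigma$ are arbitrary example values.

```python
import numpy as np

def sample_lgssm(A, B, Q, R, mu, Sigma, T, rng=None):
    """Draw one sequence (y_1..y_T, x_1..x_T) from the linear-Gaussian state-space model."""
    rng = rng or np.random.default_rng()
    d, k = A.shape[0], B.shape[0]
    ys, xs = np.zeros((T, d)), np.zeros((T, k))
    y = rng.multivariate_normal(mu, Sigma)                       # y_1 ~ G[mu, Sigma]
    for t in range(T):
        ys[t] = y
        xs[t] = B @ y + rng.multivariate_normal(np.zeros(k), R)  # x_t = B y_t + v_t
        y = A @ y + rng.multivariate_normal(np.zeros(d), Q)      # y_{t+1} = A y_t + w_t
    return ys, xs

# Arbitrary example: 2-d state (position, velocity), 1-d position measurement.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[1.0, 0.0]])
Q, R = 0.1 * np.eye(2), np.array([[1.0]])
mu, Sigma = np.zeros(2), np.eye(2)
ys, xs = sample_lgssm(A, B, Q, R, mu, Sigma, T=50)
```

The same matrices are reused in the later sketches of the filter, smoother and likelihood computations.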

General Properties

We want to be able to estimate the state and the covariance of the state at any time $t$, given a set of observations $x^\tau = \{x_1, \ldots, x_\tau\}$. If $\tau$ is equal to the current time $t$, we say that we filter the state. If $\tau$ is smaller than $t$, we say that we predict the state, and finally, if $\tau$ is larger than $t$, we say that we smooth the state. The main probability of interest is

$$p(y_t \,|\, x^\tau)$$

since it conveys all the information about the state $y_t$ at time $t$ given all the observations up to time $\tau$. Since we are assuming that this probability is Gaussian, we only need to calculate its mean and covariance, denoted by

$$\hat{y}_t^\tau = E[y_t \,|\, x^\tau]$$
$$P_t^\tau = E[\tilde{y}_t^\tau (\tilde{y}_t^\tau)^T \,|\, x^\tau]$$

where we defined $\tilde{y}_t^\tau = y_t - \hat{y}_t^\tau$, i.e. the state prediction error. Notice that these quantities still depend on the random variables $x^\tau$ and are therefore random variables themselves. We will now prove, however, that the covariance $P_t^\tau$ does actually not depend on $x^\tau$; $P_t^\tau$ may therefore be considered as a parameter in the following. To prove the above claim, we simply show that the correlation between the random variables $\tilde{y}_t^\tau$ and $x^\tau$ vanishes. For normally distributed random variables this implies that they are independent.

Lemma 1: The random variables $\tilde{y}_t^\tau = y_t - \hat{y}_t^\tau$ and $x^\tau = \{x_1, \ldots, x_\tau\}$ are independent.

Proof:

$$E[\tilde{y}_t^\tau\,(x^\tau)^T] = E[y_t\,(x^\tau)^T] - E[\hat{y}_t^\tau\,(x^\tau)^T]$$
$$= \int\!\!\int dy_t\, dx^\tau\; p(y_t, x^\tau)\; y_t\,(x^\tau)^T - \int dx^\tau\; p(x^\tau)\left[\int dy_t\; p(y_t|x^\tau)\; y_t\right](x^\tau)^T$$
$$= \int\!\!\int dy_t\, dx^\tau\; p(y_t, x^\tau)\; y_t\,(x^\tau)^T - \int\!\!\int dy_t\, dx^\tau\; p(x^\tau)\, p(y_t|x^\tau)\; y_t\,(x^\tau)^T = 0$$

since $p(y_t, x^\tau) = p(x^\tau)\, p(y_t|x^\tau)$.

From this we derive the following corollary:

$$E[\tilde{y}_{t_1}^{\tau}\, y_{t_2}^T] = E[\tilde{y}_{t_1}^{\tau}\,(\tilde{y}_{t_2}^{\tau} + \hat{y}_{t_2}^{\tau})^T] = E[\tilde{y}_{t_1}^{\tau}\,(\tilde{y}_{t_2}^{\tau})^T]$$

since $\hat{y}_{t_2}^{\tau}$ is a function of $x^{\tau}$ and is therefore independent of $\tilde{y}_{t_1}^{\tau}$; in particular, $E[\tilde{y}_t^{t-1}\, y_t^T] = P_t^{t-1}$.

Another useful result we will need is the fact that the predicted measurement error $\tilde{x}_t^{t-1} = x_t - B\hat{y}_t^{t-1}$ is independent of the measurements $x^{t-1}$. The predicted measurement error is also called the innovation, since it represents the part of the new measurement $x_t$ that cannot be predicted from the measurements $x^{t-1}$ (since they are independent).

Lemma 2: The random variables $\tilde{x}_t^{t-1} = x_t - B\hat{y}_t^{t-1}$ and $x^{t-1} = \{x_1, \ldots, x_{t-1}\}$ are independent.

Proof: The proof is simple and proceeds again by proving that they are uncorrelated:

$$E[\tilde{x}_t^{t-1}\,(x^{t-1})^T] = E[x_t\,(x^{t-1})^T] - B\,E[\hat{y}_t^{t-1}\,(x^{t-1})^T]$$
$$= B\,E[y_t\,(x^{t-1})^T] + E[v_t\,(x^{t-1})^T] - B\,E[\hat{y}_t^{t-1}\,(x^{t-1})^T]$$
$$= B\,E[\tilde{y}_t^{t-1}\,(x^{t-1})^T] + E[v_t\,(x^{t-1})^T] = 0$$

where we used the result that $\tilde{y}_t^{t-1}$ is independent of $x^{t-1}$ (proved directly above) and that $v_t$ is independent of $x^{t-1}$.

Notice that since the innovation $\tilde{x}_\tau^{\tau-1}$ is a function of $x^{\tau}$, this also implies the following corollary:

$$E[\tilde{x}_t^{t-1}\,(\tilde{x}_\tau^{\tau-1})^T] = 0 \quad \text{for } \tau < t$$

Before we proceed, we would like to remark that we have not well motivated $\hat{y}_t^\tau$ as the preferred estimate of the state $y_t$ given the data $x^\tau$. Other possible choices could be the most likely state given the data, or the most likely sequence of states $y_1, \ldots, y_t$ given the data. It turns out that for Gaussian distributed random variables these objectives are equivalent. On top of that, it turns out that $\hat{y}_t^\tau$ is also the minimum variance estimate, i.e. it minimizes $P_t^\tau$. A small Monte Carlo sanity check of Lemma 1 is sketched below.
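As an aside (not in the original notes), Lemma 1 can be checked numerically for a simple scalar Gaussian model: the sample correlation between the estimation error $y - E[y|x]$ and the observation $x$ should vanish. The particular model below, with $y \sim \mathcal{N}(0,1)$ and $x = y + v$, is an arbitrary illustrative choice.

```python
import numpy as np

# Arbitrary scalar example: y ~ N(0, 1), x = y + v with v ~ N(0, r).
rng = np.random.default_rng(1)
n, r = 200_000, 0.5
y = rng.normal(size=n)
x = y + rng.normal(scale=np.sqrt(r), size=n)

# For jointly Gaussian (y, x): E[y|x] = cov(y,x)/var(x) * x = x / (1 + r).
y_hat = x / (1.0 + r)
err = y - y_hat                      # the estimation error  \tilde{y}

print("corr(err, x) =", np.corrcoef(err, x)[0, 1])                  # ~ 0 (Lemma 1)
print("var(err)     =", err.var(), " vs  r/(1+r) =", r / (1.0 + r))  # matches P
```

The error variance matches the analytic posterior variance, and the correlation with the data is numerically zero, illustrating that $P$ does not depend on the observed data.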

Kalman Filter Equations

We will now proceed to derive the Kalman filter equations, i.e. $\tau = t$. First write

$$p(y_t \,|\, x^t) = \frac{p(x_t \,|\, y_t)\; p(y_t \,|\, x^{t-1})}{p(x_t \,|\, x^{t-1})}$$

where

$$p(y_t \,|\, x^{t-1}) = \int dy_{t-1}\; p(y_t \,|\, y_{t-1})\; p(y_{t-1} \,|\, x^{t-1})$$

The denominator in the first equation is an unimportant normalization factor. The remaining densities are given by

$$p(x_t \,|\, y_t) = \mathcal{G}_{x_t}[B y_t,\, R]$$
$$p(y_t \,|\, y_{t-1}) = \mathcal{G}_{y_t}[A y_{t-1},\, Q]$$

The two equations above have interesting interpretations: a reactive reinforcement due to an observation, and a diffusion equation for the Gaussian probability between observations. The first equation is basically applying Bayes' rule in the same way as we did for Bayesian learning, i.e. we are updating the probability distribution of an unknown random variable by including evidence. This has the effect of making the distribution peakier, i.e. less uncertain. The second equation evolves the hidden state from one time instant to the next without considering more evidence. This has the effect of making the distribution less peaky, i.e. introducing more uncertainty. Together these equations express $p(y_t|x^t)$ in terms of $p(y_{t-1}|x^{t-1})$ and may be used recursively. It is now easy to verify that in the case of prediction only the second equation remains, i.e.

$$p(y_t \,|\, x^\tau) = \int dy_{t-1}\; p(y_t \,|\, y_{t-1})\; p(y_{t-1} \,|\, x^\tau), \qquad \tau < t$$

The case $\tau > t$, i.e. smoothing, is more difficult and will be dealt with later. Let us return to the filtering case. We would like to calculate the mean $\hat{y}_t^{t-1}$ and covariance $P_t^{t-1}$ of the pdf $p(y_t|x^{t-1})$, expressed in terms of the mean $\hat{y}_{t-1}^{t-1}$ and covariance $P_{t-1}^{t-1}$ of the density $p(y_{t-1}|x^{t-1})$. Notice that these two moments determine the densities completely, since they are Gaussian. For the mean we find

$$\hat{y}_t^{t-1} = E[y_t \,|\, x^{t-1}] = A\,E[y_{t-1} \,|\, x^{t-1}] + E[w_{t-1} \,|\, x^{t-1}] = A\,\hat{y}_{t-1}^{t-1}$$

where we have used the evolution equation and the fact that $w_{t-1}$ has zero mean. For the covariance we first write

$$\tilde{y}_t^{t-1} = y_t - \hat{y}_t^{t-1} = A y_{t-1} + w_{t-1} - A\hat{y}_{t-1}^{t-1} = A\,\tilde{y}_{t-1}^{t-1} + w_{t-1}$$

and notice that $w_{t-1}$ is independent of $\tilde{y}_{t-1}^{t-1}$, since the latter is a function of $y_{t-1}$ and $x^{t-1}$, while $w_{t-1}$ is independent of both. Thus we can write

$$P_t^{t-1} = E[\tilde{y}_t^{t-1}(\tilde{y}_t^{t-1})^T]$$
$$= E\big[(A\tilde{y}_{t-1}^{t-1} + w_{t-1})(A\tilde{y}_{t-1}^{t-1} + w_{t-1})^T\big]$$
$$= A\,E[\tilde{y}_{t-1}^{t-1}(\tilde{y}_{t-1}^{t-1})^T]\,A^T + E[w_{t-1}w_{t-1}^T]$$
$$= A\,P_{t-1}^{t-1}\,A^T + Q$$

Next we wish to calculate $\hat{y}_t^t$ and $P_t^t$ in terms of the above calculated quantities, using Bayes' rule as given at the start of this section. We write

$$p(y_t \,|\, x^t) = \frac{\mathcal{G}_{x_t}[B y_t,\, R]\;\; \mathcal{G}_{y_t}[\hat{y}_t^{t-1},\, P_t^{t-1}]}{p(x_t \,|\, x^{t-1})}$$

We now use the Gaussian identities of appendix A, applying the first identity to the factor $\mathcal{G}_{x_t}[By_t, R]$ in the numerator and then the product identity to the result. We find

$$p(y_t|x^t) = k_2(x^t)\;\mathcal{G}_{y_t}\!\Big[\big[(P_t^{t-1})^{-1} + B^TR^{-1}B\big]^{-1}\big[(P_t^{t-1})^{-1}\hat{y}_t^{t-1} + B^TR^{-1}x_t\big],\;\; \big[(P_t^{t-1})^{-1} + B^TR^{-1}B\big]^{-1}\Big]$$

Because we know that the multiplication of two Gaussians is again a Gaussian, and moreover that the result must be normalized with respect to $y_t$, we deduce that the factor $k_2(x^t)$ is precisely the required normalization constant. Finally, we use the matrix inversion lemma of appendix A to show that $p(y_t|x^t)$ is a Gaussian distribution with the following mean and covariance:

$$\hat{y}_t^t = \hat{y}_t^{t-1} + K_t\,(x_t - B\hat{y}_t^{t-1})$$
$$P_t^t = (I - K_t B)\,P_t^{t-1}$$
$$K_t = P_t^{t-1}B^T\,\big(R + BP_t^{t-1}B^T\big)^{-1}$$

where $K_t$ is called the Kalman gain. These equations, initialized by $\hat{y}_1^0 = \mu$ and $P_1^0 = \Sigma$ and combined with the prediction equations above, constitute the celebrated Kalman Filter equations, and allow one to estimate the state of the system online, i.e. every new observation can be used recursively, given the information that was already received before. Notice that the gain factor $K_t$ grows if the measurement covariance $R$ becomes smaller, thus putting more weight on the measurement residual (the difference between the predicted and actual measurement). Also, if the noise covariance $P_t^{t-1}$ becomes smaller, less emphasis is put on the measurement residual. It is also instructive to notice that the evolution of the state covariance $P_t^t$, and therefore the Kalman gain $K_t$, is independent of the measurements and states, and may be precomputed.

From a numerical point of view, the covariance update above is not preferable, since it is a difference of two positive definite matrices, which is not guaranteed to result in a positive definite matrix and may lead to numerical instabilities. This, however, is easily fixed by noting that from the expression for the Kalman gain we can derive

$$(I - K_t B)\,P_t^{t-1}B^T K_t^T = K_t R K_t^T$$

which can be used to rewrite the covariance update as

$$P_t^t = (I - K_t B)\,P_t^{t-1}\,(I - K_t B)^T + K_t R K_t^T$$

which is a sum of two positive definite matrices. The complete filter recursion is illustrated in the sketch below.
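The following Python sketch (not part of the original notes) runs the filter recursion as derived above, one predict/update step per observation, using the numerically stable form of the covariance update just given; the example matrices are the same arbitrary ones used in the earlier sampling sketch.

```python
import numpy as np

def kalman_filter(xs, A, B, Q, R, mu, Sigma):
    """Return filtered means/covariances and the predicted ones (needed for smoothing)."""
    d = A.shape[0]
    y_pred, P_pred = mu.copy(), Sigma.copy()          # \hat{y}_1^0, P_1^0
    y_filt, P_filt, preds = [], [], []
    for x in xs:
        preds.append((y_pred.copy(), P_pred.copy()))  # store \hat{y}_t^{t-1}, P_t^{t-1}
        # Update with the new measurement x_t.
        S = B @ P_pred @ B.T + R                      # innovation covariance
        K = P_pred @ B.T @ np.linalg.inv(S)           # Kalman gain
        y = y_pred + K @ (x - B @ y_pred)
        I_KB = np.eye(d) - K @ B
        P = I_KB @ P_pred @ I_KB.T + K @ R @ K.T      # sum of two PSD terms
        y_filt.append(y); P_filt.append(P)
        # Predict the next state.
        y_pred, P_pred = A @ y, A @ P @ A.T + Q
    return y_filt, P_filt, preds

# Usage with the example model from the previous sketch:
# y_filt, P_filt, preds = kalman_filter(xs, A, B, Q, R, mu, Sigma)
```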

Kalman Smoother Equations

Next we want to solve the smoothing problem. This implies that we are going to include later measurements $x_s$ with $s > t$ to improve our estimates of the states $y_t$. The resulting estimates will be smoother (less noisy). First concentrate on the mean $E[y_{t-1}|x^\tau]$ for $\tau \geq t$. We will now invoke the corollary of the Gaussian lemma in appendix A, identifying $y \leftarrow y_{t-1}$, $x \leftarrow y_t$ and $\mathcal{G}_z \leftarrow p(y_{t-1}, y_t | x^\tau)$. Using these identifications we find

$$\hat{y}_{t-1}^\tau = E[y_{t-1} \,|\, x^\tau] = E\big[\,E[y_{t-1} \,|\, y_t, x^\tau]\;\big|\; x^\tau\,\big]$$

Next we write

$$p(y_{t-1}, y_t \,|\, x^\tau) = \frac{p(y_{t-1}, y_t, x_t, \ldots, x_\tau, x^{t-1})}{p(x^\tau)}$$
$$= \frac{p(x_t, \ldots, x_\tau \,|\, y_{t-1}, y_t, x^{t-1})\; p(y_t \,|\, y_{t-1}, x^{t-1})\; p(y_{t-1} \,|\, x^{t-1})\; p(x^{t-1})}{p(x^\tau)}$$
$$= \frac{p(x_t, \ldots, x_\tau \,|\, y_t)\; p(y_t \,|\, y_{t-1})\; p(y_{t-1} \,|\, x^{t-1})}{p(x_t, \ldots, x_\tau \,|\, x^{t-1})}$$
$$= k_1(y_t, x^\tau)\; p(y_t \,|\, y_{t-1})\; p(y_{t-1} \,|\, x^{t-1})$$
$$= k_1(y_t, x^\tau)\; \mathcal{G}_{y_t}[A y_{t-1},\, Q]\; \mathcal{G}_{y_{t-1}}[\hat{y}_{t-1}^{t-1},\, P_{t-1}^{t-1}]$$

In the same spirit as the derivation for the Kalman filter above, we now use the Gaussian identities of appendix A, applying the first identity to the factor $\mathcal{G}_{y_t}[Ay_{t-1}, Q]$ and then the product identity to the result:

$$p(y_{t-1} \,|\, y_t, x^\tau) = \frac{p(y_{t-1}, y_t \,|\, x^\tau)}{p(y_t \,|\, x^\tau)}$$
$$= k_2(y_t, x^\tau)\;\mathcal{G}_{y_{t-1}}\!\Big[\big[(P_{t-1}^{t-1})^{-1} + A^TQ^{-1}A\big]^{-1}\big[(P_{t-1}^{t-1})^{-1}\hat{y}_{t-1}^{t-1} + A^TQ^{-1}y_t\big],\;\; \big[(P_{t-1}^{t-1})^{-1} + A^TQ^{-1}A\big]^{-1}\Big]$$

Notice the similarity between this expression and the one derived for the filter. Because we know that the multiplication of two Gaussians is again a Gaussian, and moreover that the result must be normalized with respect to $y_{t-1}$, we deduce that the factor $k_2(y_t, x^\tau)$ is precisely the required normalization constant. Finally, we invoke the iterated-expectation relation above and the matrix inversion lemma of appendix A to find

$$\hat{y}_{t-1}^\tau = \hat{y}_{t-1}^{t-1} + J_{t-1}\,(\hat{y}_t^\tau - \hat{y}_t^{t-1})$$
$$J_{t-1} = P_{t-1}^{t-1}A^T\,(P_t^{t-1})^{-1}$$

where we used $\hat{y}_t^{t-1} = A\hat{y}_{t-1}^{t-1}$ in the last line. This recursion is initialized with $\hat{y}_\tau^\tau$, computed from the Kalman Filter equations.

For the covariance we first observe

$$\tilde{y}_{t-1}^\tau + J_{t-1}\hat{y}_t^\tau = \tilde{y}_{t-1}^{t-1} + J_{t-1}A\hat{y}_{t-1}^{t-1}$$

where we used the smoother mean equation and $\hat{y}_t^{t-1} = A\hat{y}_{t-1}^{t-1}$. Multiplying both sides with their respective transpose from the right, taking expectations, and using the fact that $\tilde{y}_{t-1}^{t-1}$ is independent of $\hat{y}_{t-1}^{t-1}$ (since the latter is a function of $x^{t-1}$; Lemma 1), and similarly for $\tilde{y}_{t-1}^\tau$ and $\hat{y}_t^\tau$, gives us

$$P_{t-1}^\tau + J_{t-1}\,E[\hat{y}_t^\tau(\hat{y}_t^\tau)^T]\,J_{t-1}^T = P_{t-1}^{t-1} + J_{t-1}A\,E[\hat{y}_{t-1}^{t-1}(\hat{y}_{t-1}^{t-1})^T]\,A^TJ_{t-1}^T$$

Then we use

$$E[\hat{y}_t^\tau(\hat{y}_t^\tau)^T] = E[y_t y_t^T] - P_t^\tau$$
$$= E\big[(Ay_{t-1} + w_{t-1})(Ay_{t-1} + w_{t-1})^T\big] - P_t^\tau$$
$$= A\,E[y_{t-1}y_{t-1}^T]\,A^T + Q - P_t^\tau$$

where the evolution equation, the whiteness of the noise and the corollary following Lemma 1 were used. Analogously,

$$E[\hat{y}_{t-1}^{t-1}(\hat{y}_{t-1}^{t-1})^T] = E[y_{t-1}y_{t-1}^T] - P_{t-1}^{t-1}$$

Putting these together and using $P_t^{t-1} = AP_{t-1}^{t-1}A^T + Q$, we find

$$P_{t-1}^\tau = P_{t-1}^{t-1} + J_{t-1}\,\big(P_t^\tau - P_t^{t-1}\big)\,J_{t-1}^T$$

which is initialized by $P_\tau^\tau$, computed from the Kalman Filter equations. The two recursions above are the so-called Kalman smoother equations. If we want to estimate a state $y_t$ given data $x^\tau$ with $\tau > t$, then we first apply the Kalman filter equations recursively until we have reached the state $y_\tau$. While moving forward we store the values of $\hat{y}_t^{t-1}$, $\hat{y}_t^t$, $P_t^{t-1}$ and $P_t^t$ for all $t \leq \tau$. Then we move backward by applying the smoother equations until we have reached the state $t$ we would like to estimate. Because we include more observations in the estimation of the state, the result will be less noisy compared to the Kalman filter result (hence the name smoother). A sketch of the backward pass is given below.
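As a companion to the filter sketch above (again, not part of the original notes), this backward pass implements the smoother recursions for the means and covariances, consuming the quantities stored during the forward (filter) pass.

```python
import numpy as np

def kalman_smoother(y_filt, P_filt, preds, A):
    """Backward pass: returns smoothed means/covariances and the gains J_t."""
    T = len(y_filt)
    y_sm, P_sm, Js = list(y_filt), list(P_filt), [None] * T
    for t in range(T - 2, -1, -1):
        y_pred_next, P_pred_next = preds[t + 1]            # \hat{y}_{t+1}^{t}, P_{t+1}^{t}
        J = P_filt[t] @ A.T @ np.linalg.inv(P_pred_next)   # smoother gain J_t
        Js[t] = J
        y_sm[t] = y_filt[t] + J @ (y_sm[t + 1] - y_pred_next)
        P_sm[t] = P_filt[t] + J @ (P_sm[t + 1] - P_pred_next) @ J.T
    return y_sm, P_sm, Js

# Usage, continuing the earlier sketches:
# y_sm, P_sm, Js = kalman_smoother(y_filt, P_filt, preds, A)
```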

Parameter Estimation for the Kalman Model

We will now proceed to estimate the parameters $\{\mu, \Sigma, B, R, A, Q\}$ of the Kalman filter model using EM. We consider the states $y_t$ as hidden variables, while the $x_t$ are the observations. We assume we have observed $N$ sequences of length $\tau$. The joint probability of the complete data is given by

$$p(y, x) = p(y_1)\,\prod_{t=2}^{\tau} p(y_t \,|\, y_{t-1})\,\prod_{t=1}^{\tau} p(x_t \,|\, y_t)$$

For EM we are interested in the expectation of the log of the complete-data joint pdf over the posterior density:

$$\mathcal{Q} = \sum_{n=1}^{N}\int dy\; p(y \,|\, x_n)\,\log p(y, x_n)$$
$$= -\sum_{n=1}^{N}\int dy\; p(y \,|\, x_n)\,\bigg\{\tfrac{1}{2}\tau(d + k)\log 2\pi + \tfrac{1}{2}\log\det\Sigma + \tfrac{\tau-1}{2}\log\det Q + \tfrac{\tau}{2}\log\det R$$
$$\qquad +\; \tfrac{1}{2}(y_1 - \mu)^T\Sigma^{-1}(y_1 - \mu) + \tfrac{1}{2}\sum_{t=2}^{\tau}(y_t - Ay_{t-1})^TQ^{-1}(y_t - Ay_{t-1}) + \tfrac{1}{2}\sum_{t=1}^{\tau}(x_{tn} - By_t)^TR^{-1}(x_{tn} - By_t)\bigg\}$$

Inspection of this objective function reveals that the only sufficient statistics that need to be calculated in the E-step are

$$E[y_t \,|\, x_n] = \hat{y}_{tn}^\tau$$
$$E[y_t y_t^T \,|\, x_n] = P_t^\tau + \hat{y}_{tn}^\tau(\hat{y}_{tn}^\tau)^T \equiv M_t^n$$
$$E[y_t y_{t-1}^T \,|\, x_n] = P_{t,t-1}^\tau + \hat{y}_{tn}^\tau(\hat{y}_{t-1,n}^\tau)^T \equiv M_{t,t-1}^n$$

Fortunately, except for the last one, these are precisely the quantities that can be calculated through the Kalman Filter and Smoother recursions. The last quantity, which is called the lag-one covariance smoother, is computed in appendix C. It is given by the following backward recursion

$$P_{t-1,t-2}^\tau = P_{t-1}^{t-1}J_{t-2}^T + J_{t-1}\,\big(P_{t,t-1}^\tau - AP_{t-1}^{t-1}\big)\,J_{t-2}^T$$

which is initialized by

$$P_{\tau,\tau-1}^\tau = (I - K_\tau B)\,A\,P_{\tau-1}^{\tau-1}$$

We now concentrate on the M-step, which maximizes $\mathcal{Q}$ with respect to the parameters of the joint density only, i.e. the parameters present in the posterior are held fixed. Taking derivatives with respect to $\mu$ and equating to zero gives

$$\frac{\partial \mathcal{Q}}{\partial \mu} = \Sigma^{-1}\sum_{n=1}^{N}\big(\hat{y}_{1n}^\tau - \mu\big) = 0$$
$$\Rightarrow\quad \mu_{new} = \frac{1}{N}\sum_{n=1}^{N}\hat{y}_{1n}^\tau$$

For $\Sigma$ this implies

$$\frac{\partial \mathcal{Q}}{\partial \Sigma^{-1}} = \frac{N}{2}\Sigma - \frac{1}{2}\sum_{n=1}^{N}\Big(M_1^n - \hat{y}_{1n}^\tau\mu^T - \mu(\hat{y}_{1n}^\tau)^T + \mu\mu^T\Big) = 0$$
$$\Rightarrow\quad \Sigma_{new} = \frac{1}{N}\sum_{n=1}^{N}M_1^n - \mu_{new}\mu_{new}^T = P_1^\tau + \frac{1}{N}\sum_{n=1}^{N}\big(\hat{y}_{1n}^\tau - \mu_{new}\big)\big(\hat{y}_{1n}^\tau - \mu_{new}\big)^T$$

where we used that $P_1^\tau$ is independent of $x_n$.

Taking derivatives with respect to $A$ gives

$$\frac{\partial \mathcal{Q}}{\partial A} = \sum_{n=1}^{N}\sum_{t=2}^{\tau}\Big(Q^{-1}M_{t,t-1}^n - Q^{-1}A\,M_{t-1}^n\Big) = 0$$
$$\Rightarrow\quad A_{new} = \bigg(\sum_{n=1}^{N}\sum_{t=2}^{\tau}M_{t,t-1}^n\bigg)\bigg(\sum_{n=1}^{N}\sum_{t=2}^{\tau}M_{t-1}^n\bigg)^{-1}$$

For $Q$ we find

$$\frac{\partial \mathcal{Q}}{\partial Q^{-1}} = \frac{N(\tau-1)}{2}Q - \frac{1}{2}\sum_{n=1}^{N}\sum_{t=2}^{\tau}\Big(M_t^n - A M_{t-1,t}^n - M_{t,t-1}^n A^T + A M_{t-1}^n A^T\Big) = 0$$
$$\Rightarrow\quad Q_{new} = \frac{1}{N(\tau-1)}\sum_{n=1}^{N}\sum_{t=2}^{\tau}\Big(M_t^n - A_{new}M_{t-1,t}^n\Big)$$

where we used the update for $A_{new}$ and $M_{t-1,t}^n = (M_{t,t-1}^n)^T$. For $B$ we have

$$\frac{\partial \mathcal{Q}}{\partial B} = \sum_{n=1}^{N}\sum_{t=1}^{\tau}\Big(R^{-1}x_{tn}(\hat{y}_{tn}^\tau)^T - R^{-1}B\,M_t^n\Big) = 0$$
$$\Rightarrow\quad B_{new} = \bigg(\sum_{n=1}^{N}\sum_{t=1}^{\tau}x_{tn}(\hat{y}_{tn}^\tau)^T\bigg)\bigg(\sum_{n=1}^{N}\sum_{t=1}^{\tau}M_t^n\bigg)^{-1}$$

And finally we have for $R$

$$\frac{\partial \mathcal{Q}}{\partial R^{-1}} = \frac{N\tau}{2}R - \frac{1}{2}\sum_{n=1}^{N}\sum_{t=1}^{\tau}\Big(x_{tn}x_{tn}^T - B\hat{y}_{tn}^\tau x_{tn}^T - x_{tn}(\hat{y}_{tn}^\tau)^TB^T + BM_t^nB^T\Big) = 0$$
$$\Rightarrow\quad R_{new} = \frac{1}{N\tau}\sum_{n=1}^{N}\sum_{t=1}^{\tau}\Big(x_{tn}x_{tn}^T - B_{new}\hat{y}_{tn}^\tau x_{tn}^T\Big)$$

where we used the update for $B_{new}$. Alternating E-steps and M-steps will thus converge to the maximum likelihood estimates of these parameters. Notice that the recursion equations fulfil a double role: they may be used to efficiently compute the E-step in the learning problem, and, once the parameters are fixed, to estimate the optimal state and state covariance of the dynamical system (possibly online). The M-step updates are summarized in the sketch below.
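For concreteness, here is a sketch (not part of the original notes) of the M-step updates for a single sequence ($N = 1$), written in terms of the E-step sufficient statistics $\hat{y}_t^\tau$, $M_t$ and $M_{t,t-1}$ computed with the filter, smoother and lag-one recursions above. The argument names and layout are assumptions of this sketch.

```python
import numpy as np

def m_step(xs, y_sm, P_sm, P_lag):
    """One M-step for N=1. xs[t], y_sm[t]: data and smoothed means;
    P_sm[t]: smoothed covariances; P_lag[t] = P_{t,t-1} for t >= 1 (index 0 unused)."""
    T = len(xs)
    M   = [P_sm[t] + np.outer(y_sm[t], y_sm[t]) for t in range(T)]          # E[y_t y_t^T]
    M10 = [P_lag[t] + np.outer(y_sm[t], y_sm[t - 1]) for t in range(1, T)]  # E[y_t y_{t-1}^T]

    mu_new    = y_sm[0]                                   # for N=1 the mean is just y_1 hat
    Sigma_new = P_sm[0]
    A_new = sum(M10) @ np.linalg.inv(sum(M[:-1]))
    Q_new = (sum(M[1:]) - A_new @ sum(m.T for m in M10)) / (T - 1)
    B_new = sum(np.outer(xs[t], y_sm[t]) for t in range(T)) @ np.linalg.inv(sum(M))
    R_new = sum(np.outer(xs[t], xs[t]) - B_new @ np.outer(y_sm[t], xs[t])
                for t in range(T)) / T
    return mu_new, Sigma_new, A_new, Q_new, B_new, R_new
```

Each update is a direct transcription of the corresponding closed-form expression above, with the sums over $n$ dropped because only one sequence is used.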

Computation of Likelihood

To monitor the total log-likelihood of the system we may calculate

$$L = \sum_{n=1}^{N}\log p(x_n) = \sum_{n=1}^{N}\log p(x_{1n}) + \sum_{n=1}^{N}\sum_{t=2}^{\tau}\log p(x_{tn} \,|\, x_n^{t-1})$$

The mean and covariance of the Gaussians $p(x_{tn}|x_n^{t-1})$ can be computed as follows (omitting $n$ for notational convenience):

$$\hat{x}_t^{t-1} = \int dx_t\; p(x_t \,|\, x^{t-1})\; x_t$$
$$= \int\!\!\int dx_t\, dy_t\; p(x_t \,|\, y_t)\; p(y_t \,|\, x^{t-1})\; x_t$$
$$= \int\!\!\int dx_t\, dy_t\; \mathcal{G}_{y_t}[\hat{y}_t^{t-1}, P_t^{t-1}]\; \mathcal{G}_{x_t}[By_t, R]\; x_t$$
$$= \int dy_t\; \mathcal{G}_{y_t}[\hat{y}_t^{t-1}, P_t^{t-1}]\; By_t$$
$$= B\hat{y}_t^{t-1}$$

For $p(x_1)$ the above calculation gives

$$\hat{x}_1^0 = B\mu$$

Similarly, for the covariance we find

$$H_t^{t-1} = \int dx_t\; p(x_t \,|\, x^{t-1})\,(x_t - \hat{x}_t^{t-1})(x_t - \hat{x}_t^{t-1})^T$$
$$= \int\!\!\int dx_t\, dy_t\; \mathcal{G}_{y_t}[\hat{y}_t^{t-1}, P_t^{t-1}]\; \mathcal{G}_{x_t}[By_t, R]\,(x_t - \hat{x}_t^{t-1})(x_t - \hat{x}_t^{t-1})^T$$
$$= \int dy_t\; \mathcal{G}_{y_t}[\hat{y}_t^{t-1}, P_t^{t-1}]\,\Big(R + (By_t - B\hat{y}_t^{t-1})(By_t - B\hat{y}_t^{t-1})^T\Big)$$
$$= R + BP_t^{t-1}B^T$$

The covariance for $p(x_1)$ is then given by

$$H_1^0 = R + B\Sigma B^T$$

A sketch of this computation appears below.
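The following Python sketch (not part of the original notes) accumulates $\log\mathcal{N}(x_t;\, B\hat{y}_t^{t-1},\, R + BP_t^{t-1}B^T)$ over one sequence, reusing the predicted quantities stored by the filter sketch above.

```python
import numpy as np

def log_likelihood(xs, preds, B, R):
    """Sum of log p(x_t | x^{t-1}) for one sequence, using the predicted means/covariances."""
    k = B.shape[0]
    total = 0.0
    for x, (y_pred, P_pred) in zip(xs, preds):
        mean = B @ y_pred                       # \hat{x}_t^{t-1} = B \hat{y}_t^{t-1}
        H = R + B @ P_pred @ B.T                # H_t^{t-1} = R + B P_t^{t-1} B^T
        diff = x - mean
        _, logdet = np.linalg.slogdet(H)
        total += -0.5 * (k * np.log(2 * np.pi) + logdet
                         + diff @ np.linalg.solve(H, diff))
    return total

# Usage: L = log_likelihood(xs, preds, B, R)   # preds as returned by kalman_filter(...)
```

Monitoring this quantity after every EM iteration is a convenient convergence check, since it should never decrease.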

A Lemmas

Lemma (Gaussian identities): Let $\mathcal{G}_y[\mu, \Sigma]$ denote a normal density in $y$ with mean $\mu$ and covariance $\Sigma$. We have the following identities:

$$\mathcal{G}_x[Ay, \Sigma] = k_1(x)\;\mathcal{G}_y\big[(A^T\Sigma^{-1}A)^{-1}A^T\Sigma^{-1}x,\; (A^T\Sigma^{-1}A)^{-1}\big]$$
$$\mathcal{G}_y[a, A]\;\mathcal{G}_y[b, B] = \mathcal{G}_y\big[(A^{-1}+B^{-1})^{-1}(A^{-1}a + B^{-1}b),\; (A^{-1}+B^{-1})^{-1}\big]\;\mathcal{G}_a[b,\; A+B]$$

Also, if we write $z = (y, x)$, $\mu_z = (\mu_y, \mu_x)$ and

$$\Sigma_z = \begin{pmatrix}\Sigma_{yy} & \Sigma_{yx}\\ \Sigma_{xy} & \Sigma_{xx}\end{pmatrix}$$

then we have

$$\int dx\;\mathcal{G}_z[\mu_z, \Sigma_z] = \mathcal{G}_y[\mu_y, \Sigma_{yy}]$$
$$\mathcal{G}_z[\mu_z, \Sigma_z] = \mathcal{G}_{y|x}\big[\mu_y + \Sigma_{yx}\Sigma_{xx}^{-1}(x - \mu_x),\; \Sigma_{yy} - \Sigma_{yx}\Sigma_{xx}^{-1}\Sigma_{xy}\big]\;\mathcal{G}_x[\mu_x, \Sigma_{xx}]$$

As a corollary we notice that

$$E_x\big[E[y \,|\, x]\big] = \mu_y = E[y]$$

Lemma (matrix inversion): Consider a $d \times d$ matrix $P > 0$, a $k \times k$ matrix $R > 0$ and a $k \times d$ matrix $B$, where $P > 0$ means $a^TPa > 0\ \forall a \neq 0$, i.e. positive eigenvalues. The following equalities hold:

$$\big(P^{-1} + B^TR^{-1}B\big)^{-1} = P - PB^T\big(BPB^T + R\big)^{-1}BP$$
$$\big(P^{-1} + B^TR^{-1}B\big)^{-1}B^TR^{-1} = PB^T\big(BPB^T + R\big)^{-1}$$
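These two identities are easy to check numerically; the following snippet (not part of the original notes) verifies them for random positive definite $P$, $R$ and a random $B$ of arbitrary example sizes.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 4, 3                                                         # arbitrary sizes
P = (lambda M: M @ M.T + d * np.eye(d))(rng.normal(size=(d, d)))    # random SPD matrix
R = (lambda M: M @ M.T + k * np.eye(k))(rng.normal(size=(k, k)))
B = rng.normal(size=(k, d))

lhs = np.linalg.inv(np.linalg.inv(P) + B.T @ np.linalg.inv(R) @ B)
rhs1 = P - P @ B.T @ np.linalg.inv(B @ P @ B.T + R) @ B @ P
rhs2 = P @ B.T @ np.linalg.inv(B @ P @ B.T + R)

print(np.allclose(lhs, rhs1))                                # first identity
print(np.allclose(lhs @ B.T @ np.linalg.inv(R), rhs2))       # second identity
```

Both checks print True; the second identity is the one that turns the information-form posterior into the Kalman-gain form used in the filter derivation.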

B Matrix Identities

The following identities are useful in the derivations below:

$$a^TAb = \mathrm{tr}\big[Aba^T\big]$$
$$\mathrm{tr}[AB] = \mathrm{tr}[BA]$$
$$\frac{\partial}{\partial A}\,\mathrm{tr}[AB] = B^T$$
$$\frac{\partial}{\partial A}\,\mathrm{tr}[A^TB] = B$$
$$\log\det A^{-1} = -\log\det A$$
$$\frac{\partial}{\partial A}\,\log\det A = (A^{-1})^T$$

C Lag-One Covariance Smoother

In the E-step of the EM algorithm that estimates the parameters of the KF, we needed the filter and smoother equations. However, one quantity remains undetermined, which is the so-called lag-one covariance smoother $P_{t,t-1}^\tau$. This quantity is again computed with a backward recursion. First we will derive the initial value, then the backward recursion itself.

$$P_{t,t-1}^t = E\big[\tilde{y}_t^t(\tilde{y}_{t-1}^t)^T\big]$$
$$= E\Big[\big\{\tilde{y}_t^{t-1} - K_t(x_t - B\hat{y}_t^{t-1})\big\}\big\{\tilde{y}_{t-1}^{t-1} - J_{t-1}K_t(x_t - B\hat{y}_t^{t-1})\big\}^T\Big]$$
$$= E\Big[\big\{\tilde{y}_t^{t-1} - K_t(B\tilde{y}_t^{t-1} + v_t)\big\}\big\{\tilde{y}_{t-1}^{t-1} - J_{t-1}K_t(B\tilde{y}_t^{t-1} + v_t)\big\}^T\Big]$$

where we used the filter update and the smoother mean equation with $\tau = t$. Next we write this out and use that both $\tilde{y}_t^{t-1}$ and $\tilde{y}_{t-1}^{t-1}$ are independent of $v_t$, which is proved using Lemma 1 and the noise assumptions. This gives

$$P_{t,t-1}^t = P_{t,t-1}^{t-1} - P_t^{t-1}B^TK_t^TJ_{t-1}^T - K_tBP_{t,t-1}^{t-1} + K_t\big(BP_t^{t-1}B^T + R\big)K_t^TJ_{t-1}^T$$
$$= (I - K_tB)\,A\,P_{t-1}^{t-1}$$

where in the last line we used the definition of the Kalman gain (so that $K_t(BP_t^{t-1}B^T + R) = P_t^{t-1}B^T$) and $P_{t,t-1}^{t-1} = AP_{t-1}^{t-1}$, which is proved using the fact that $w_{t-1}$ is independent of $\tilde{y}_{t-1}^{t-1}$. If we set $t = \tau$ we find the initial condition

$$P_{\tau,\tau-1}^\tau = (I - K_\tau B)\,A\,P_{\tau-1}^{\tau-1}$$

The derivation of the backward recursion is somewhat elaborate. First we write, using the smoother mean equations,

$$\tilde{y}_{t-1}^\tau + J_{t-1}\hat{y}_t^\tau = \tilde{y}_{t-1}^{t-1} + J_{t-1}A\hat{y}_{t-1}^{t-1}$$
$$\tilde{y}_{t-2}^\tau + J_{t-2}\hat{y}_{t-1}^\tau = \tilde{y}_{t-2}^{t-2} + J_{t-2}A\hat{y}_{t-2}^{t-2}$$

Next we equate

$$E\Big[\big(\tilde{y}_{t-1}^\tau + J_{t-1}\hat{y}_t^\tau\big)\big(\tilde{y}_{t-2}^\tau + J_{t-2}\hat{y}_{t-1}^\tau\big)^T\Big] = E\Big[\big(\tilde{y}_{t-1}^{t-1} + J_{t-1}A\hat{y}_{t-1}^{t-1}\big)\big(\tilde{y}_{t-2}^{t-2} + J_{t-2}A\hat{y}_{t-2}^{t-2}\big)^T\Big]$$
$$\Rightarrow\quad P_{t-1,t-2}^\tau + J_{t-1}\,E[\hat{y}_t^\tau(\hat{y}_{t-1}^\tau)^T]\,J_{t-2}^T = E[\tilde{y}_{t-1}^{t-1}(\tilde{y}_{t-2}^{t-2})^T] + J_{t-1}A\,E[\hat{y}_{t-1}^{t-1}(\tilde{y}_{t-2}^{t-2})^T] + J_{t-1}A\,E[\hat{y}_{t-1}^{t-1}(\hat{y}_{t-2}^{t-2})^T]\,A^TJ_{t-2}^T$$

where Lemma 1 was used several times to get rid of cross terms. We will now rewrite some of the terms appearing above:

$$E[\hat{y}_t^\tau(\hat{y}_{t-1}^\tau)^T] = E[y_t y_{t-1}^T] - P_{t,t-1}^\tau$$
$$= E\big[(Ay_{t-1} + w_{t-1})(Ay_{t-2} + w_{t-2})^T\big] - P_{t,t-1}^\tau$$
$$= A\,E[y_{t-1}y_{t-2}^T]\,A^T + A\,E[y_{t-1}w_{t-2}^T] - P_{t,t-1}^\tau$$
$$= A\,E[y_{t-1}y_{t-2}^T]\,A^T + A\,E\big[(Ay_{t-2} + w_{t-2})w_{t-2}^T\big] - P_{t,t-1}^\tau$$
$$= A\,E[y_{t-1}y_{t-2}^T]\,A^T + AQ - P_{t,t-1}^\tau$$

Next,

$$E[\tilde{y}_{t-1}^{t-1}(\tilde{y}_{t-2}^{t-2})^T] = E\Big[\big\{\tilde{y}_{t-1}^{t-2} - K_{t-1}(x_{t-1} - B\hat{y}_{t-1}^{t-2})\big\}(\tilde{y}_{t-2}^{t-2})^T\Big]$$
$$= P_{t-1,t-2}^{t-2} - K_{t-1}\,E\big[(B\tilde{y}_{t-1}^{t-2} + v_{t-1})(\tilde{y}_{t-2}^{t-2})^T\big]$$
$$= (I - K_{t-1}B)\,P_{t-1,t-2}^{t-2}$$

where the filter update and the noise assumptions were used. Next,

$$E[\hat{y}_{t-1}^{t-1}(\tilde{y}_{t-2}^{t-2})^T] = E\Big[\big\{\hat{y}_{t-1}^{t-2} + K_{t-1}(B\tilde{y}_{t-1}^{t-2} + v_{t-1})\big\}(\tilde{y}_{t-2}^{t-2})^T\Big] = K_{t-1}BP_{t-1,t-2}^{t-2}$$

Finally we have

$$E[\hat{y}_{t-1}^{t-1}(\hat{y}_{t-2}^{t-2})^T] = E\Big[\big\{\hat{y}_{t-1}^{t-2} + K_{t-1}\tilde{x}_{t-1}^{t-2}\big\}(\hat{y}_{t-2}^{t-2})^T\Big]$$
$$= E[\hat{y}_{t-1}^{t-2}(\hat{y}_{t-2}^{t-2})^T]$$
$$= E[y_{t-1}y_{t-2}^T] - P_{t-1,t-2}^{t-2}$$

where the most important ingredients were Lemma 2 and the corollary of Lemma 1. Putting this together we have

$$P_{t-1,t-2}^\tau = (I - K_{t-1}B)\,P_{t-1,t-2}^{t-2}$$
$$\qquad +\; J_{t-1}\big(AK_{t-1}BP_{t-1,t-2}^{t-2} - AP_{t-1,t-2}^{t-2}A^TJ_{t-2}^T - AQJ_{t-2}^T\big)$$
$$\qquad +\; J_{t-1}P_{t,t-1}^\tau J_{t-2}^T$$

The first term can be rewritten as follows:

$$(I - K_{t-1}B)\,P_{t-1,t-2}^{t-2} = (I - K_{t-1}B)\,P_{t-1}^{t-2}\,(P_{t-1}^{t-2})^{-1}\,A\,P_{t-2}^{t-2} = P_{t-1}^{t-1}\,J_{t-2}^T$$

where we used $P_{t-1,t-2}^{t-2} = AP_{t-2}^{t-2}$, the filter covariance update and the definition of $J_{t-2}$. The term multiplied by $J_{t-1}$ can be rewritten as

$$AK_{t-1}BP_{t-1,t-2}^{t-2} - AP_{t-1,t-2}^{t-2}A^TJ_{t-2}^T - AQJ_{t-2}^T$$
$$= AK_{t-1}BP_{t-1,t-2}^{t-2} - A\big(P_{t-1,t-2}^{t-2}A^T + Q\big)J_{t-2}^T$$
$$= \big(AK_{t-1}BP_{t-1}^{t-2} - AP_{t-1}^{t-2}\big)J_{t-2}^T$$
$$= -A\,(I - K_{t-1}B)\,P_{t-1}^{t-2}\,J_{t-2}^T$$
$$= -A\,P_{t-1}^{t-1}\,J_{t-2}^T$$

where again $P_{t-1,t-2}^{t-2} = AP_{t-2}^{t-2} = P_{t-1}^{t-2}J_{t-2}^T$ and the filter covariance update were used, together with the fact that

$$P_{t-1}^{t-2} = P_{t-1,t-2}^{t-2}A^T + Q$$

which can be derived analogously to $P_{t,t-1}^{t-1} = AP_{t-1}^{t-1}$. Finally, putting this together, we derive the lag-one covariance smoother

$$P_{t-1,t-2}^\tau = P_{t-1}^{t-1}J_{t-2}^T + J_{t-1}\big(P_{t,t-1}^\tau - AP_{t-1}^{t-1}\big)J_{t-2}^T$$