Chapter 11

Tutorial: The Kalman Filter

Tony Lacey.

11.1 Introduction

The Kalman filter [1] has long been regarded as the optimal solution to many tracking and data prediction tasks [2]. Its use in the analysis of visual motion has been documented frequently. The standard Kalman filter derivation is given here as a tutorial exercise in the practical use of some of the statistical techniques outlined in previous sections. The filter is constructed as a mean squared error minimiser, but an alternative derivation of the filter is also provided, showing how the filter relates to maximum likelihood. Documenting this derivation furnishes the reader with further insight into the statistical constructs within the filter.

The purpose of filtering is to extract the required information from a signal, ignoring everything else. How well a filter performs this task can be measured using a cost or loss function. Indeed we may define the goal of the filter to be the minimisation of this loss function.

11.2 Mean squared error

Many signals can be described in the following way;

    y_k = a_k x_k + n_k                                                         (11.1)

where; y_k is the time dependent observed signal, a_k is a gain term, x_k is the information bearing signal and n_k is the additive noise.

The overall objective is to estimate x_k. The difference between the estimate of x_k, \hat{x}_k, and x_k itself is termed the error;

    f(e_k) = f(x_k - \hat{x}_k)                                                 (11.2)

The particular shape of f(e_k) is dependent upon the application, however it is clear that the function should be both positive and increase monotonically [3]. An error function which exhibits these characteristics is the squared error function;

    f(e_k) = (x_k - \hat{x}_k)^2                                                (11.3)

Since it is necessary to consider the ability of the filter to predict many data over a period of time, a more meaningful metric is the expected value of the error function;

    \text{loss function} = E[f(e_k)]                                            (11.4)

This results in the mean squared error (MSE) function;

    \epsilon(t) = E[e_k^2]                                                      (11.5)
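As a minimal numerical sketch (the signal values and noise level below are assumed for illustration, not taken from the text), the squared error of equation 11.3 averaged over many samples approximates the MSE of equation 11.5 for the signal model of equation 11.1:

    import numpy as np

    # Assumed example values for y_k = a_k * x_k + n_k (eq. 11.1)
    rng = np.random.default_rng(0)
    a, x_true, sigma = 0.9, 2.5, 0.4

    y = a * x_true + sigma * rng.standard_normal(10_000)   # observations y_k
    x_hat = y / a                                           # naive per-sample estimate of x_k
    mse = np.mean((x_true - x_hat) ** 2)                    # E[(x_k - x_hat_k)^2], eq. 11.5

    print(f"empirical MSE = {mse:.4f}, predicted sigma^2/a^2 = {(sigma / a) ** 2:.4f}")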

11.3 Maximum likelihood

The above derivation of mean squared error, although intuitive, is somewhat heuristic. A more rigorous derivation can be developed using maximum likelihood statistics. This is achieved by redefining the goal of the filter to finding the \hat{x} which maximises the probability or likelihood of y. That is;

    \max [P(y | \hat{x})]                                                       (11.6)

Assuming that the additive random noise is Gaussian distributed with a standard deviation of \sigma_k gives;

    P(y_k | \hat{x}_k) = K_k \exp\left( -\frac{(y_k - a_k \hat{x}_k)^2}{2\sigma_k^2} \right)        (11.7)

where K_k is a normalisation constant. The maximum likelihood function of this is;

    P(y | \hat{x}) = \prod_k K_k \exp\left( -\frac{(y_k - a_k \hat{x}_k)^2}{2\sigma_k^2} \right)     (11.8)

Which leads to;

    \log P(y | \hat{x}) = -\frac{1}{2} \sum_k \frac{(y_k - a_k \hat{x}_k)^2}{\sigma_k^2} + \text{constant}        (11.9)

The driving function of equation 11.9 is the MSE, so the likelihood is maximised by the variation of \hat{x}_k which minimises the MSE. Therefore the mean squared error function is applicable when the expected variation of y_k is best modelled as a Gaussian distribution. In such a case the MSE serves to provide the value of \hat{x}_k which maximises the likelihood of the signal y_k.
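A small numerical check (again using assumed example values, not from the text) illustrates the point: the \hat{x} that minimises the summed squared error driving equation 11.9 is the same \hat{x} that maximises the Gaussian log-likelihood:

    import numpy as np

    rng = np.random.default_rng(1)
    a, x_true, sigma = 1.3, -0.7, 0.5
    y = a * x_true + sigma * rng.standard_normal(500)

    candidates = np.linspace(-2.0, 1.0, 2001)
    sse = np.array([np.sum((y - a * c) ** 2) for c in candidates])   # driving term of eq. 11.9
    loglik = -sse / (2.0 * sigma ** 2)                               # log P(y | x_hat) up to a constant

    print("argmin MSE :", candidates[np.argmin(sse)])
    print("argmax logL:", candidates[np.argmax(loglik)])             # the same candidate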

In the following derivation the optimal filter is defined as being that filter, from the set of all possible filters, which minimises the mean squared error.

11.4 Kalman Filter Derivation

Before going on to discuss the Kalman filter the work of Norbert Wiener [4] should first be acknowledged. Wiener described an optimal finite impulse response (FIR) filter in the mean squared error sense. His solution will not be discussed here even though it has much in common with the Kalman filter. Suffice to say that his solution uses both the autocorrelation and the cross correlation of the received signal with the original data, in order to derive an impulse response for the filter.

Kalman also presented a prescription of the optimal MSE filter. However Kalman's prescription has some advantages over Wiener's; it sidesteps the need to determine the impulse response of the filter, something which is poorly suited to numerical computation. Kalman described his filter using state space techniques, which, unlike Wiener's prescription, enables the filter to be used as either a smoother, a filter or a predictor. The latter of these three, the ability of the Kalman filter to be used to predict data, has proven to be a very useful function. It has led to the Kalman filter being applied to a wide range of tracking and prediction problems. Defining the filter in terms of state space methods also simplifies the implementation of the filter in the discrete domain, another reason for its widespread appeal.

11.5 State space derivation

Assume that we want to know the value of a variable within a process of the form;

    x_{k+1} = \Phi x_k + w_k                                                    (11.10)

where; x_k is the state vector of the process at time k, (n x 1); \Phi is the state transition matrix of the process from the state at k to the state at k+1, and is assumed stationary over time, (n x n); w_k is the associated white noise process with known covariance, (n x 1).

Observations on this variable can be modelled in the form;

    z_k = H x_k + v_k                                                           (11.11)

where; z_k is the actual measurement of x at time k, (m x 1); H is the noiseless connection between the state vector and the measurement vector, and is assumed stationary over time, (m x n); v_k is the associated measurement error. This is again assumed to be a white noise process with known covariance and has zero cross-correlation with the process noise, (m x 1).

As was shown in section 11.3, for the minimisation of the MSE to yield the optimal filter it must be possible to correctly model the system errors using Gaussian distributions. The covariances of the two noise models are assumed stationary over time and are given by;

    Q = E[w_k w_k^T]                                                            (11.12)

    R = E[v_k v_k^T]                                                            (11.13)
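To make the dimensions concrete, the sketch below sets up a hypothetical constant-velocity tracking model (an assumed example, not taken from the text) with state vector [position, velocity]^T, so n = 2 and m = 1:

    import numpy as np

    dt = 1.0

    # State transition and measurement matrices for equations 11.10 and 11.11
    Phi = np.array([[1.0, dt],
                    [0.0, 1.0]])          # state transition matrix, n x n
    H = np.array([[1.0, 0.0]])            # measurement matrix, m x n (position is observed)

    # Noise covariances for equations 11.12 and 11.13 (values assumed for illustration)
    q, r = 1e-3, 0.25
    Q = q * np.array([[dt**3 / 3, dt**2 / 2],
                      [dt**2 / 2, dt      ]])   # process noise covariance, E[w w^T]
    R = np.array([[r]])                          # measurement noise covariance, E[v v^T]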

The mean squared error is given by 11.5. This is equivalent to;

    E[e_k e_k^T] = P_k                                                          (11.14)

where; P_k is the error covariance matrix at time k, (n x n).

Equation 11.14 may be expanded to give;

    P_k = E[e_k e_k^T] = E[(x_k - \hat{x}_k)(x_k - \hat{x}_k)^T]                (11.15)

Assuming the prior estimate of \hat{x}_k is called \hat{x}'_k, and was gained by knowledge of the system, it is possible to write an update equation for the new estimate, combining the old estimate with measurement data thus;

    \hat{x}_k = \hat{x}'_k + K_k (z_k - H \hat{x}'_k)                           (11.16)

where; K_k is the Kalman gain, which will be derived shortly. The term z_k - H \hat{x}'_k in eqn. 11.16 is known as the innovation or measurement residual;

    i_k = z_k - H \hat{x}'_k                                                    (11.17)

Substitution of 11.11 into 11.16 gives;

    \hat{x}_k = \hat{x}'_k + K_k (H x_k + v_k - H \hat{x}'_k)                   (11.18)

Substituting 11.18 into 11.15 gives;

    P_k = E\left[ [(I - K_k H)(x_k - \hat{x}'_k) - K_k v_k] [(I - K_k H)(x_k - \hat{x}'_k) - K_k v_k]^T \right]        (11.19)

At this point it is noted that x_k - \hat{x}'_k is the error of the prior estimate. It is clear that this is uncorrelated with the measurement noise and therefore the expectation may be re-written as;

    P_k = (I - K_k H) E[(x_k - \hat{x}'_k)(x_k - \hat{x}'_k)^T] (I - K_k H)^T + K_k E[v_k v_k^T] K_k^T        (11.20)

Substituting equations 11.13 and 11.15 into 11.19 gives;

    P_k = (I - K_k H) P'_k (I - K_k H)^T + K_k R K_k^T                          (11.21)

where P'_k is the prior estimate of P_k.
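Equation 11.21 holds for any gain K_k, not only the optimal one derived below; a minimal numpy sketch of it as a helper function (the function name is assumed, not from the text):

    import numpy as np

    # Covariance update of equation 11.21, valid for an arbitrary gain K
    def covariance_update(P_prior, K, H, R):
        I = np.eye(P_prior.shape[0])
        A = I - K @ H
        return A @ P_prior @ A.T + K @ R @ K.T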

Equation 11.21 is the error covariance update equation. The diagonal of the covariance matrix contains the mean squared errors as shown;

    P_{kk} = \begin{bmatrix}
        E[e_{k-1} e_{k-1}^T] & E[e_{k-1} e_k^T] & E[e_{k-1} e_{k+1}^T] \\
        E[e_k e_{k-1}^T]     & E[e_k e_k^T]     & E[e_k e_{k+1}^T]     \\
        E[e_{k+1} e_{k-1}^T] & E[e_{k+1} e_k^T] & E[e_{k+1} e_{k+1}^T]
    \end{bmatrix}                                                               (11.22)

The sum of the diagonal elements of a matrix is the trace of the matrix. In the case of the error covariance matrix the trace is the sum of the mean squared errors. Therefore the mean squared error may be minimised by minimising the trace of P_k, which in turn will minimise the trace of P_{kk}.

The trace of P_k is first differentiated with respect to K_k and the result set to zero in order to find the conditions of this minimum.

Expansion of 11.21 gives;

    P_k = P'_k - K_k H P'_k - P'_k H^T K_k^T + K_k (H P'_k H^T + R) K_k^T       (11.23)

Note that the trace of a matrix is equal to the trace of its transpose, therefore it may be written as;

    T[P_k] = T[P'_k] - 2 T[K_k H P'_k] + T[K_k (H P'_k H^T + R) K_k^T]          (11.24)

where; T[P_k] is the trace of the matrix P_k.

Differentiating with respect to K_k gives;

    \frac{d\, T[P_k]}{d K_k} = -2 (H P'_k)^T + 2 K_k (H P'_k H^T + R)           (11.25)

Setting to zero and re-arranging gives;

    (H P'_k)^T = K_k (H P'_k H^T + R)                                           (11.26)

Now solving for K_k gives;

    K_k = P'_k H^T (H P'_k H^T + R)^{-1}                                        (11.27)

Equation 11.27 is the Kalman gain equation. The innovation, i_k, defined in eqn. 11.17, has an associated measurement prediction covariance. This is defined as;

    S_k = H P'_k H^T + R                                                        (11.28)
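A minimal sketch of equations 11.27 and 11.28 (function and variable names are assumed), written with a linear solve rather than an explicit inverse of S_k:

    import numpy as np

    # Innovation covariance S_k (eq. 11.28) and Kalman gain K_k = P'_k H^T S_k^{-1} (eq. 11.27)
    def kalman_gain(P_prior, H, R):
        S = H @ P_prior @ H.T + R
        K = np.linalg.solve(S.T, (P_prior @ H.T).T).T   # solves K S = P' H^T without inverting S
        return K, S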

Finally, substitution of equation 11.27 into 11.23 gives;

    P_k = P'_k - P'_k H^T (H P'_k H^T + R)^{-1} H P'_k
        = P'_k - K_k H P'_k
        = (I - K_k H) P'_k                                                      (11.29)

Equation 11.29 is the update equation for the error covariance matrix with optimal gain. The three equations 11.16, 11.27, and 11.29 develop an estimate of the variable x_k. State projection is achieved using;

    \hat{x}'_{k+1} = \Phi \hat{x}_k                                             (11.30)

To complete the recursion it is necessary to find an equation which projects the error covariance matrix into the next time interval, k+1. This is achieved by first forming an expression for the prior error;

    e'_{k+1} = x_{k+1} - \hat{x}'_{k+1}
             = (\Phi x_k + w_k) - \Phi \hat{x}_k
             = \Phi e_k + w_k                                                   (11.31)

Extending equation 11.15 to time k+1;

    P'_{k+1} = E[e'_{k+1} e'^T_{k+1}] = E[(\Phi e_k + w_k)(\Phi e_k + w_k)^T]   (11.32)

Note that e_k and w_k have zero cross-correlation because the noise w_k actually accumulates between k and k+1 whereas the error e_k is the error up until time k. Therefore;

    P'_{k+1} = E[e'_{k+1} e'^T_{k+1}]
             = \Phi E[e_k e_k^T] \Phi^T + E[w_k w_k^T]
             = \Phi P_k \Phi^T + Q                                              (11.33)
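Equations 11.30 and 11.33 together form the projection (prediction) step; a minimal sketch (function and variable names are assumed):

    import numpy as np

    # Projection into k+1: equations 11.30 and 11.33
    def predict(x_post, P_post, Phi, Q):
        x_prior = Phi @ x_post                  # eq. 11.30
        P_prior = Phi @ P_post @ Phi.T + Q      # eq. 11.33
        return x_prior, P_prior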

This completes the recursive filter. The algorithmic loop is summarised in the diagram of figure 11.1.

[Figure 11.1 block diagram: measurements and initial estimates feed the Kalman gain computation; the state estimate and covariance are updated; the updated state estimates are projected into k+1 and fed back as the next prior (projected) estimates.]

    Description           Equation
    Kalman Gain           K_k = P'_k H^T (H P'_k H^T + R)^{-1}
    Update Estimate       \hat{x}_k = \hat{x}'_k + K_k (z_k - H \hat{x}'_k)
    Update Covariance     P_k = (I - K_k H) P'_k
    Project into k+1      \hat{x}'_{k+1} = \Phi \hat{x}_k
                          P'_{k+1} = \Phi P_k \Phi^T + Q

Figure 11.1: Kalman Filter Recursive Algorithm
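A minimal sketch of the complete recursive loop of figure 11.1 (function and variable names are assumed; it can be driven with the hypothetical \Phi, H, Q, R defined earlier):

    import numpy as np

    # One pass of the recursive loop per measurement z_k; x_prior/P_prior are the
    # projected (primed) quantities of the text, x_post/P_post the updated ones.
    def kalman_filter(zs, Phi, H, Q, R, x0, P0):
        x_prior, P_prior = x0, P0
        estimates = []
        for z in zs:
            S = H @ P_prior @ H.T + R                        # innovation covariance, eq. 11.28
            K = P_prior @ H.T @ np.linalg.inv(S)             # Kalman gain, eq. 11.27
            x_post = x_prior + K @ (z - H @ x_prior)         # update estimate, eq. 11.16
            P_post = (np.eye(len(x0)) - K @ H) @ P_prior     # update covariance, eq. 11.29
            x_prior = Phi @ x_post                           # project into k+1, eq. 11.30
            P_prior = Phi @ P_post @ Phi.T + Q               # eq. 11.33
            estimates.append(x_post)
        return np.array(estimates)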

11.6 The Kalman filter as a chi-square merit function

The objective of the Kalman filter is to minimise the mean squared error between the actual and estimated data. Thus it provides the best estimate of the data in the mean squared error sense. This being the case it should be possible to show that the Kalman filter has much in common with the chi-square merit function. The chi-square merit function is a maximum likelihood function, and was derived earlier, equation 11.9. It is typically used as a criterion to fit a set of model parameters to a model, a process known as least squares fitting. The Kalman filter is commonly known as a recursive least squares (RLS) fitter. Drawing similarities to the chi-square merit function will give a different perspective on what the Kalman filter is doing.

The chi-square merit function is;

    \chi^2 = \sum_{i=1}^{k} \left( \frac{z_i - h_i(a; x)}{\sigma_i} \right)^2   (11.34)

where; z_i is the measured value; h_i is the data model with parameters x, assumed linear in a; \sigma_i is the standard deviation associated with the measured value.

The optimal set of parameters can then be defined as that which minimises the above function. Expanding out the variance gives;

    \chi^2 = \sum_{i=1}^{k} \frac{1}{\sigma_i^2} [z_i - h_i(a; x)]^2            (11.35)
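A minimal sketch of equation 11.35 (the straight-line data model in the usage lines is an assumed example, not from the text):

    import numpy as np

    # Chi-square merit function of eq. 11.35: z are measurements, h the model
    # predictions h_i(a; x), sigma the per-point standard deviations.
    def chi_square(z, h, sigma):
        return np.sum(((z - h) / sigma) ** 2)

    # Usage with an assumed straight-line model fitted exactly: the merit value
    # is then roughly the number of data points.
    x = np.linspace(0.0, 9.0, 10)
    z = 2.0 * x + 1.0 + 0.3 * np.random.default_rng(4).standard_normal(10)
    print(chi_square(z, 2.0 * x + 1.0, 0.3))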

Representing the chi-square in vector form and using notation from the earlier Kalman derivation;

    \chi^2_k = [z_k - h(a; x_k)]^T R^{-1} [z_k - h(a; x_k)]                     (11.36)

where; R^{-1} is the matrix of inverse squared errors, i.e. 1/\sigma_i^2.

i i

The ab ove merit function is the merit function asso ciated with the latest, k th, measurement and provides

a measure of how accurately the mo del predicted this measurement. Given that the inverse mo del

covariance matrix is known up to time k , the merit function up to time k may b e re-written as;

T

01

2

= x x^  P x x^  11.37

k 1 k 1 k 1 k 1

k 1

k 1

To combine the new data with the previous, fitting the model parameters so as to minimise the overall chi-square function, the merit function becomes the summation of the two;

    \chi^2 = (x_{k-1} - \hat{x}_{k-1})^T P'^{-1}_{k-1} (x_{k-1} - \hat{x}_{k-1}) + [z_k - h(a; x_k)]^T R^{-1} [z_k - h(a; x_k)]        (11.38)

Where the first derivative of this is given by;

    \frac{d\chi^2}{dx} = 2 P'^{-1}_{k-1} (x_{k-1} - \hat{x}_{k-1}) - 2 \nabla_x h(a; x_k)^T R^{-1} [z_k - h(a; x_k)]        (11.39)

The model function h(a; x_k), with parameters fitted from information to date, may be considered as;

    h(a; x_k) = h(a; \hat{x}_k + \Delta x_k)                                    (11.40)

where \Delta x_k = x_k - \hat{x}_k. The expansion of the model function to first order is;

    h(\hat{x}_k + \Delta x_k) = h(\hat{x}_k) + \Delta x_k \nabla_x h(\hat{x}_k)        (11.41)

Substituting this result into the derivative equation 11.39 gives;

    \frac{d\chi^2}{dx} = 2 P'^{-1}_k (x_k - \hat{x}_k) - 2 \nabla_x h(a; \hat{x}_k)^T R^{-1} [z_k - h(a; \hat{x}_k) - (x_k - \hat{x}_k) \nabla_x h(a; \hat{x}_k)]        (11.42)

It is assumed that the estimated model parameters are a close approximation to the actual model parameters. Therefore it may be assumed that the derivatives of the actual model and the estimated model are the same. Further, for a system which is linear in a the model derivative is constant and may be written as;

    \nabla_x h(a; x_k) = \nabla_x h(a; \hat{x}_k) = H                           (11.43)

Substituting this into equation 11.39 gives;

    \frac{d\chi^2}{dx} = 2 P'^{-1}_k \Delta x_k + 2 H^T R^{-1} H \Delta x_k - 2 H^T R^{-1} [z_k - h(a; \hat{x}_k)]        (11.44)

Re-arranging gives;

    \frac{d\chi^2}{dx} = 2 \left( P'^{-1}_k + H^T R^{-1} H \right) \Delta x_k - 2 H^T R^{-1} [z_k - h(a; \hat{x}_k)]        (11.45)

For a minimum the derivative is zero; re-arranging in terms of \Delta x_k gives;

    \Delta x_k = \left( P'^{-1}_k + H^T R^{-1} H \right)^{-1} H^T R^{-1} [z_k - h(a; \hat{x}_k)]        (11.46)

    x_k = \hat{x}_k + \left( P'^{-1}_k + H^T R^{-1} H \right)^{-1} H^T R^{-1} [z_k - h(a; \hat{x}_k)]   (11.47)

Comparison of equation 11.47 to 11.16 allows the gain, K_k, to be identified as;

    K_k = \left( P'^{-1}_k + H^T R^{-1} H \right)^{-1} H^T R^{-1}               (11.48)

Giving a parameter update equation of the form;

    x_k = \hat{x}_k + K_k [z_k - h(a; \hat{x}_k)]                               (11.49)

Equation 11.49 is identical to 11.16 and describes the improvement of the parameter estimate using the error between measured and model projected values.
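A quick numerical check (with assumed test matrices, not from the text) that the gain of equation 11.48 agrees with the standard gain of equation 11.27:

    import numpy as np

    rng = np.random.default_rng(2)
    n, m = 3, 2
    A = rng.standard_normal((n, n)); P_prior = A @ A.T + n * np.eye(n)   # a valid prior covariance
    H = rng.standard_normal((m, n))
    B = rng.standard_normal((m, m)); R = B @ B.T + m * np.eye(m)         # a valid measurement covariance

    K_standard = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)                                      # eq. 11.27
    K_info = np.linalg.inv(np.linalg.inv(P_prior) + H.T @ np.linalg.inv(R) @ H) @ H.T @ np.linalg.inv(R)   # eq. 11.48

    print(np.allclose(K_standard, K_info))   # True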

11.7 Model covariance update

The model parameter covariance has been considered in its inverted form, where it is known as the information matrix¹. It is possible to formulate an alternative update equation for the covariance matrix using standard error propagation;

    P^{-1}_k = P'^{-1}_k + H^T R^{-1} H                                         (11.50)

It is possible to show that the covariance updates of equation 11.50 and equation 11.29 are equivalent. This may be achieved using the identity P^{-1}_k P_k = I. The original (eqn 11.29) and alternative (eqn 11.50) forms of the covariance update equations are;

    P_k = (I - K_k H) P'_k    and    P^{-1}_k = P'^{-1}_k + H^T R^{-1} H

Therefore;

    (I - K_k H) P'_k \left( P'^{-1}_k + H^T R^{-1} H \right) = I                (11.51)

Substituting for K_k gives;

    \left[ P'_k - P'_k H^T (H P'_k H^T + R)^{-1} H P'_k \right] \left( P'^{-1}_k + H^T R^{-1} H \right)

    = I - P'_k \left[ H^T (H P'_k H^T + R)^{-1} H - H^T R^{-1} H + H^T (H P'_k H^T + R)^{-1} H P'_k H^T R^{-1} H \right]

    = I - P'_k H^T \left[ (H P'_k H^T + R)^{-1} \left( I + H P'_k H^T R^{-1} \right) - R^{-1} \right] H

    = I - P'_k H^T \left[ (H P'_k H^T + R)^{-1} \left( R + H P'_k H^T \right) R^{-1} - R^{-1} \right] H

    = I - P'_k H^T \left[ R^{-1} - R^{-1} \right] H

    = I                                                                          (11.52)
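The equivalence just derived can also be checked numerically; a minimal sketch (with assumed test matrices, not from the text) comparing the information-form update of equation 11.50 with equation 11.29:

    import numpy as np

    rng = np.random.default_rng(3)
    n, m = 3, 2
    A = rng.standard_normal((n, n)); P_prior = A @ A.T + n * np.eye(n)
    H = rng.standard_normal((m, n))
    B = rng.standard_normal((m, m)); R = B @ B.T + m * np.eye(m)

    K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)                      # optimal gain, eq. 11.27
    P_standard = (np.eye(n) - K @ H) @ P_prior                                    # eq. 11.29
    P_info = np.linalg.inv(np.linalg.inv(P_prior) + H.T @ np.linalg.inv(R) @ H)   # eq. 11.50 inverted

    print(np.allclose(P_standard, P_info))   # True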

¹When the Kalman filter is built around the information matrix it is known as the information filter.