OPTIMUM FILTERS (5)
DISCRETE KALMAN FILTER

Kalman filter:
• Kalman filtering problem
• Kalman prediction problem
• Summary of the implementation of the discrete Kalman filter
• Steady-state properties of Kalman filters
Appendix 7B. Derivation of matrix differentiation formulas

DISCRETE KALMAN FILTER

Kalman filters are optimum filters that are different from Wiener filters and have the following distinctive features: (i) the problem is formulated in terms of a state-space model (or state-space form); (ii) the solution is computed recursively; (iii) they can be used to treat non-stationary random processes as well as stationary ones.

As we learnt in the previous lectures, an optimum Wiener filter is designed from the Wiener-Hopf equations, which use the autocorrelation r_x(k) of the noisy observation x(n) (which contains the desired signal d(n) with autocorrelation r_d(k)) and the cross-correlation r_dx(k). The use of r_d(k), r_x(k) and r_dx(k) implies that d(n) and x(n) are wide-sense stationary, as well as jointly wide-sense stationary, random processes. For a non-stationary random process, by contrast, the statistical properties are time varying. For example, a deterministic sinusoidal signal d(n) = A sin(ωn + ϕ) interfered with noise v(n) gives a non-stationary random process x(n) = d(n) + v(n), because the mean of x(n),

m_x(n) = A sin(ωn + ϕ),

depends on time n, and the autocorrelation,

r_x(n + k, n) = r_v(k) + (A²/2){cos(ωk) − cos[ω(2n + k) + 2ϕ]},

depends not only on the time lag k but also on time n. The cross-correlation

r_dx(n + k, n) = (A²/2){cos(ωk) − cos[ω(2n + k) + 2ϕ]}

also depends on time n. Therefore, in the optimum sense a Wiener filter can only be used to treat stationary random processes. As we will see below, Wiener filtering problems are just a special case, the steady-state case, of Kalman filtering problems.

Kalman filter: Kalman filtering problem

Kalman filtering addresses the general problem of finding the best estimate of the state x(n) of a process governed by the state equation (a linear stochastic difference equation)

x(n) = A(n−1)x(n−1) + w(n)    (217)

from measurements given by the observation equation

y(n) = C(n)x(n) + v(n).    (218)

Fig. 19. The diagram for constructing a Kalman filter.

Specifically, when the measurements y(n) (the measurement vector) are input to the Kalman filter h(n), the filter outputs x̂(n|n), the best linear estimate of the length-p state vector x(n) based on the observations y(l) for 0 ≤ l ≤ n (Fig. 19). The best estimate x̂(n|n) is obtained by minimizing the mean-square error

ξ(n) = tr{P(n|n)} = Σ_{k=0}^{p−1} E{|e(n−k|n−k)|²}    (219)

where

P(n|n) = E{e(n|n)e^H(n|n)}    (220)

is called the estimation error covariance matrix (a p × p matrix), and e(n|n) (a vector of length p) is the estimation error, given by

e(n|n) = x(n) − x̂(n|n) = [e(n|n), e(n−1|n−1), …, e(n−p+1|n−p+1)]^T
       = [x(n) − x̂(n|n), x(n−1) − x̂(n−1|n−1), …, x(n−p+1) − x̂(n−p+1|n−p+1)]^T    (221)

Here tr{A} denotes the trace of the matrix A, defined as the sum of the terms along its diagonal,

tr{A} = Σ_{m=1}^{p} a_mm    (222)

Thus, tr{P(n|n)} = Σ_{k=0}^{p−1} E{|e(n−k|n−k)|²}.
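To make the state-space model concrete, the following minimal sketch simulates data from a time-invariant instance of Eqs. (217) and (218). NumPy is assumed, and the particular A, C and the noise covariances Q_w, Q_v are illustrative choices, not values from these notes; the measurements y(n) generated here are reused in the sketches further below.

```python
import numpy as np

rng = np.random.default_rng(0)

p, q, N = 2, 1, 200              # state length p, measurement length q, number of samples
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])       # state transition matrix A(n-1), held constant here
C = np.array([[1.0, 0.0]])       # observation matrix C(n), held constant here
Qw = 0.01 * np.eye(p)            # covariance of the white process noise w(n)
Qv = 0.25 * np.eye(q)            # covariance of the white measurement noise v(n)

x = np.zeros((N, p))             # true states x(n)
y = np.zeros((N, q))             # measurements y(n)
for n in range(1, N):
    w = rng.multivariate_normal(np.zeros(p), Qw)   # w(n)
    v = rng.multivariate_normal(np.zeros(q), Qv)   # v(n)
    x[n] = A @ x[n - 1] + w                        # state equation (217)
    y[n] = C @ x[n] + v                            # observation equation (218)
```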
Another error covariance matrix that we need in deriving the Kalman filter is the prediction error covariance matrix, defined as

P(n|n−1) = E{e(n|n−1)e^H(n|n−1)}    (223)

where

e(n|n−1) = x(n) − x̂(n|n−1)    (224)

is the one-step prediction error (a vector of length p), in which x̂(n|n−1) is the prediction of the length-p state vector x(n) based on the observations y(l) for 0 ≤ l ≤ n−1.

To derive the Kalman filter we carry out a recursive procedure of two steps: prediction using the previous measurements, and estimation using the new measurement (refer to Fig. 20). In the first step, the state x(n) is predicted using the previous observations y(l) up to time l = n−1 so as to obtain x̂(n|n−1). In the second step, the new information ε(n) = y(n) − ŷ(n|n−1) (the innovation process of y(n)) carried by the new measurement y(n) is added to the estimate x̂(n|n−1) so as to get the best estimate of x(n), x̂(n|n).

Initialization: x̂(0|0) ≈ E{x(0)}
Step 1: predicting x(1) from x̂(0|0) => x̂(1|0) = A(0)x̂(0|0);
Step 2: estimating x(1) by adding new info ε(1) => x̂(1|1) = x̂(1|0) + K(1)ε(1)
Repeat 1: predicting x(2) from x̂(1|1) => x̂(2|1) = A(1)x̂(1|1);
Repeat 2: estimating x(2) by adding new info ε(2) => x̂(2|2) = x̂(2|1) + K(2)ε(2)
……
Repeat 1: predicting x(n) => x̂(n|n−1) = A(n−1)x̂(n−1|n−1);
Repeat 2: estimating x(n) by adding new info ε(n) => x̂(n|n) = x̂(n|n−1) + K(n)ε(n)

Fig. 20 The two-step recursive procedure for deriving the Kalman filter.

Suppose that a given random process x(n) has an initial state x(0) with a given estimate x̂(0|0) ≈ E{x(0)}. The initial estimation error covariance matrix (Eq. (220)) for this estimate is P(0|0) = E{e(0|0)e^H(0|0)}.

With the given estimate x̂(0|0) ≈ E{x(0)}, we first predict x(1) using the relation x̂(1|0) = A(0)x̂(0|0), and find ŷ(1|0) = C(1)x̂(1|0) as well. Secondly, we make optimal use of the available measurement y(1) by adding to x̂(1|0) the new information ε(1) = y(1) − ŷ(1|0) (the innovation process of y(1)), and find the best estimate of the state x(1), x̂(1|1). Optimal use of ε(1) is made through a weighting factor K(1) that minimizes the mean-square error ξ(n) at time n = 1, ξ(1) = tr{P(1|1)} = Σ_{k=0}^{p−1} E{|e(1−k|1−k)|²}.

Repeating the above two steps for the next time instant, we use x̂(1|1) to predict x(2) via x̂(2|1) = A(1)x̂(1|1) and find ŷ(2|1) = C(2)x̂(2|1). Then we update x̂(2|1) by adding the new information ε(2) = y(2) − ŷ(2|1) from the next available measurement y(2), and find the best estimate of x(2), x̂(2|2), by optimizing K(2), the weight on ε(2), to minimize ξ(2) = tr{P(2|2)} = Σ_{k=0}^{p−1} E{|e(2−k|2−k)|²}.

Carrying on in this way until time n, we can formulate a recursive procedure for time n: for each n > 0, given x̂(n−1|n−1) and P(n−1|n−1), we predict x(n) to obtain x̂(n|n−1), and then make optimal use of the new information ε(n) = y(n) − ŷ(n|n−1) from the newly available measurement y(n) to find the best estimate x̂(n|n) of the state x(n) by finding the optimal K(n) that minimizes the mean-square error ξ(n) = tr{P(n|n)} = Σ_{k=0}^{p−1} E{|e(n−k|n−k)|²}.

The Kalman filter is established from the above recursive procedure as follows. In the first step, x(n) is projected ahead (predicted) via the state transition matrix:

x̂(n|n−1) = A(n−1)x̂(n−1|n−1)    (225)

which is obtained by ignoring the contribution of w(n) in the state equation (Eq. (217)), because w(n) is a white noise process with zero mean and is uncorrelated with its previous values w(l) for l < n.
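The two-step recursion of Fig. 20 can be sketched as follows, continuing the simulation sketch above. Since the optimal weighting matrix K(n) is not derived until further below, it is passed in as a caller-supplied function gain(n) (a hypothetical placeholder); the sketch encodes only the prediction step of Eq. (225) and the update x̂(n|n) = x̂(n|n−1) + K(n)ε(n).

```python
def kalman_recursion(y, A, C, x0, gain):
    """Predict/estimate loop of Fig. 20; gain(n) must return the p x q matrix K(n)."""
    N, p = len(y), len(x0)
    x_hat = np.zeros((N, p))
    x_hat[0] = x0                          # initialization: x^(0|0) ~ E{x(0)}
    for n in range(1, N):
        x_pred = A @ x_hat[n - 1]          # Step 1: x^(n|n-1) = A(n-1) x^(n-1|n-1), Eq. (225)
        y_pred = C @ x_pred                # y^(n|n-1) = C(n) x^(n|n-1)
        eps = y[n] - y_pred                # innovation: eps(n) = y(n) - y^(n|n-1)
        x_hat[n] = x_pred + gain(n) @ eps  # Step 2: x^(n|n) = x^(n|n-1) + K(n) eps(n)
    return x_hat
```

For instance, kalman_recursion(y, A, C, np.zeros(p), lambda n: np.array([[0.5], [0.1]])) runs the loop with an arbitrary fixed gain; choosing K(n) optimally is exactly the subject of the derivation that follows.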
For the prediction of Eq. (225), the prediction error is

e(n|n−1) = x(n) − x̂(n|n−1)
         = A(n−1)x(n−1) + w(n) − A(n−1)x̂(n−1|n−1)
         = A(n−1)e(n−1|n−1) + w(n)    (226)

Since e(n−1|n−1) is uncorrelated with w(n) (w(n) is white noise and thus uncorrelated with w(l) for l ≠ n), we have E{e(n−1|n−1)w^H(n)} = 0, and the prediction error covariance matrix becomes

P(n|n−1) = E{e(n|n−1)e^H(n|n−1)} = A(n−1)P(n−1|n−1)A^H(n−1) + Q_w(n)    (227)

where Q_w(n) = E{w(n)w^H(n)} is the covariance matrix of the process noise w(n). This completes the first step of deriving the Kalman filter.

In the second step, the optimum estimation of x(n) is carried out by incorporating the new measurement y(n) into the estimate x̂(n|n−1) through the innovation process of y(n),

ε(n) = y(n) − ŷ(n|n−1)    (228)

which is the error between y(n) and its prediction ŷ(n|n−1) and represents the new information that cannot be predicted from ŷ(n|n−1).
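Putting Eqs. (225), (227) and (228) together, one full predict/estimate cycle can be sketched as below, reusing A, C, Qw and Qv from the simulation sketch. The gain K(n) and the P(n|n) update used here are the standard discrete Kalman forms, included only as a preview so that the cycle is complete; they are what the remainder of the derivation establishes, not results stated in the text above.

```python
def kalman_step(x_est, P_est, y_n):
    """One cycle: given x^(n-1|n-1) and P(n-1|n-1), return x^(n|n) and P(n|n)."""
    # Step 1: prediction
    x_pred = A @ x_est                              # x^(n|n-1), Eq. (225)
    P_pred = A @ P_est @ A.conj().T + Qw            # P(n|n-1), Eq. (227)
    # Step 2: estimation using the new measurement y(n)
    eps = y_n - C @ x_pred                          # innovation eps(n), Eq. (228)
    S = C @ P_pred @ C.conj().T + Qv                # innovation covariance (standard form)
    K = P_pred @ C.conj().T @ np.linalg.inv(S)      # Kalman gain (derived later in the notes)
    x_new = x_pred + K @ eps                        # x^(n|n) = x^(n|n-1) + K(n) eps(n)
    P_new = (np.eye(len(x_est)) - K @ C) @ P_pred   # P(n|n) (standard update form)
    return x_new, P_new
```

Iterating kalman_step over n, starting from the initialization x̂(0|0) ≈ E{x(0)} and P(0|0), reproduces the recursive procedure of Fig. 20.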