
Variance Reduced Ensemble Kalman Filtering

A. W. HEEMINK, M. VERLAAN, AND A. J. SEGERS

Faculty of Information Technology and Systems, Department of Applied Mathematical Analysis, Delft University of Technology, Delft, Netherlands

(Manuscript received 3 February 2000, in final form 14 November 2000)

ABSTRACT

A number of algorithms to solve large-scale Kalman filtering problems have been introduced recently. The ensemble Kalman filter represents the probability density of the state estimate by a finite number of randomly generated system states. Another algorithm uses a singular value decomposition to select the leading eigenvectors of the covariance matrix of the state estimate and to approximate the full covariance matrix by a reduced-rank matrix. Both algorithms, however, still require a huge amount of computer resources. In this paper the authors propose to combine the two algorithms and to use a reduced-rank approximation of the covariance matrix as a variance reductor for the ensemble Kalman filter. If the leading eigenvectors explain most of the variance, which is the case for most applications, the computational burden to solve the filtering problem can be reduced significantly (up to an order of magnitude).

1. Introduction

Kalman filtering is a powerful framework for solving data assimilation problems (see Ghil and Malanotte-Rizzoli 1991). In order to use a Kalman filter for assimilating data into a numerical model, this model is embedded into a stochastic environment by introducing a system noise process. In this way it is possible to take into account the inaccuracies of the underlying deterministic system. By using a Kalman filter, the information provided by the resulting stochastic-dynamic model and the noisy measurements are combined to obtain an optimal estimate of the state of the system. The standard Kalman filter implementation, however, would impose an unacceptable computational burden. In order to obtain a computationally efficient filter, simplifications have to be introduced.

The ensemble Kalman filter (EnKF) was introduced by Evensen (1994) and has been used successfully in many applications (see Evensen and Van Leeuwen 1996; Houtekamer and Mitchell 1998; Cañizares 1999). This Monte Carlo approach is based on a representation of the probability density of the state estimate by a finite number N of randomly generated system states. The algorithm does not require a tangent linear model and is very easy to implement. The computational effort required for the EnKF is approximately N times as much as the effort required for the underlying model. The only serious disadvantage is that the statistical error in the estimates of the mean and covariance matrix from a sample decreases very slowly for larger sample size. This is a well-known fundamental problem with all Monte Carlo methods. As a result, for most practical problems the sample size chosen has to be rather large. Here it should be noted that a properly constructed ensemble Kalman filter can still provide an improved analysis even with small-sized ensembles (see Houtekamer and Mitchell 1998).

Another approach to solving large-scale Kalman filtering problems is to approximate the full covariance matrix of the state estimate by a matrix with reduced rank. This approach was introduced by Cohn and Todling (1995, 1996) and Verlaan and Heemink (1995, 1997), where the latter used a robust square root formulation for the filter implementation. Algorithms based on similar ideas have been proposed and applied by Lermusiaux (1997) and Pham et al. (1998).

The reduced-rank approaches can also be formulated as an ensemble Kalman filter where the q ensemble members have not been chosen randomly, but in the directions of the q leading eigenvectors of the covariance matrix (see Verlaan and Heemink 1997). As a result these algorithms do not require a tangent linear model either. The computational effort required is approximately q + 1 model simulations plus the computations required for the singular value decomposition to determine the leading eigenvectors [O(q³); see Heemink et al. (1997)]. In many practical problems the full covariance can be approximated accurately by a reduced-rank matrix with a relatively small value of q.

Corresponding author address: A. W. Heemink, Delft University of Technology, Department of Applied Mathematical Analysis, Mekelweg 4, 2628 CD Delft, Netherlands. E-mail: [email protected]


However, reduced-rank approaches often suffer from filter divergence problems for small values of q. This was observed by Cohn and Todling (1995), who tried several methods of compensating for the truncation error. The main reason for the occurrence of filter divergence is the fact that truncation of the eigenvectors of the covariance matrix implies that the covariance is always underestimated. It is well known that underestimating the covariance may cause filter divergence. Filter divergence can be avoided by choosing q relatively large, but this of course reduces the computational efficiency of the method considerably.

Cañizares (1999) compared the EnKF and the reduced-rank approach for a number of practical large-scale shallow water flow problems. His conclusion is that the computational efficiency of both approaches is comparable. The rank q of the reduced-rank approximation could be chosen considerably smaller (3-5 times) than the ensemble size N of the ensemble filter. However, the singular value decomposition becomes computationally the most expensive part of the reduced-rank algorithm for larger values of q.

In this paper we propose to combine the EnKF with the reduced-rank approach to reduce the statistical error of the ensemble filter. This is known as variance reduction, referring to the variance of the statistical error of the ensemble approach (see Hammersley and Handscomb 1964). The ensemble of the new filter algorithm now consists of two parts: q ensembles in the direction of the q leading eigenvectors of the covariance matrix and N randomly chosen ensembles. In the algorithm, only the projection of the random ensemble members orthogonal to the first ensemble members is used to obtain the filter gain and the state estimate. This partially orthogonal ensemble Kalman filter (POEnKF) does not suffer from divergence problems because the reduced-rank approximation is embedded in an EnKF. The EnKF acts as a compensating mechanism for the truncation error. At the same time, the POEnKF is much more accurate than the ensemble filter with ensemble size N + q because the leading eigenvectors of the covariance matrix are computed accurately using the full (extended) Kalman filter equations without statistical errors.

In section 2 we first show a simple example of variance reduction to illustrate the basic idea of our combined filter. In section 3 of this paper we summarize the ensemble Kalman filter and the reduced-rank square root filter and introduce the partially orthogonal ensemble Kalman filter algorithm and a few variants of this algorithm. We illustrate the performance of the various algorithms with an advection diffusion model application in section 4. Here, in order to compare the results of the various suboptimal filter algorithms with the exact Kalman filter, we concentrate our attention on linear problems.

2. Variance reduction: An illustrative example

Suppose we want to determine

$$E[f(\xi)], \qquad (1)$$

where ξ is a random variable uniformly distributed on the interval [0, 1]. Using a random number generator we first generate a sequence of uniformly distributed random numbers ξ_i, i = 1, ..., N, and compute

$$\bar{f} = \frac{1}{N}\sum_{i=1}^{N} f(\xi_i). \qquad (2)$$

Here, f̄ is an estimator of E[f(ξ)]. The variance of this Monte Carlo estimator is (see Hammersley and Handscomb 1964)

$$\operatorname{var}(\bar{f}) = \frac{1}{N}\int_0^1 \{f(\xi) - E[f(\xi)]\}^2\, d\xi. \qquad (3)$$

From this expression we see that the standard deviations of the statistical errors of the Monte Carlo method converge very slowly with the sample size (≈ 1/√N).

Now let us suppose that we have an approximation φ(ξ) of f(ξ) with known E[φ(ξ)] = Φ. Here, φ(ξ) is also called a control variate. As a result we have, in this case,

$$E[f(\xi)] = \Phi + E[f(\xi) - \phi(\xi)]. \qquad (4)$$

Now we can use as estimator of E[f(ξ)]

$$\bar{f}_\phi = \Phi + \frac{1}{N}\sum_{i=1}^{N}[f(\xi_i) - \phi(\xi_i)] \qquad (5)$$

with variance

$$\operatorname{var}(\bar{f}_\phi) = \frac{1}{N}\int_0^1 [f(\xi) - \phi(\xi) - \{E[f(\xi)] - \Phi\}]^2\, d\xi. \qquad (6)$$

If φ(ξ) is a reasonable approximation of f(ξ), Φ will be a reasonable estimate of E[f(ξ)]. The Monte Carlo method, in this case, will only be used for estimating the remaining part E[f(ξ) − φ(ξ)], with a variance of the error as given by (6). This variance is in general significantly smaller than the variance (3) of the original Monte Carlo approximation, where φ(ξ) = 0. Therefore φ(ξ) is also often called a variance reductor. Variance reduction is attractive as long as we have an approximation φ(ξ) of f(ξ) that is better than φ(ξ) = 0. In most cases it will be easy to find an approximation that is better than doing nothing at all.

Let us for example take (see Fig. 1)

$$f(\xi) = 2\xi^2 - \xi^3,$$

and as approximation

$$\phi(\xi) = \xi.$$

In this case the variance (3) of the statistical error of the Monte Carlo method is 0.1304/N while the variance (6) of the estimator (5) is only 0.0304/N.

FIG. 1. Function f(ξ) (solid) and its approximation φ(ξ) (dashed) for 0 ≤ ξ ≤ 1.

The basic idea just described can also be introduced for the ensemble Kalman filter. By computing an approximation of the filter problem using a reduced-rank filter, the ensemble Kalman filter needs only be used for the remaining part of the problem. As a result, the statistical errors of the combined approach will be reduced significantly.
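To make the example concrete, the short Python sketch below (our illustration, not part of the original paper) compares the plain estimator (2) with the control variate estimator (5) for the f and φ above, using the exactly known Φ = E[ξ] = 1/2. The printed one-sample variances, divided by N, correspond to Eqs. (3) and (6).

```python
import numpy as np

# Control variate (variance reduction) demo for Eqs. (1)-(6).
# f and phi follow the illustrative example in the text;
# Phi = E[phi(xi)] = 1/2 is known exactly for xi ~ U[0, 1].

def f(xi):
    return 2.0 * xi**2 - xi**3

def phi(xi):
    return xi

rng = np.random.default_rng(seed=1)
N = 100_000
xi = rng.uniform(0.0, 1.0, size=N)

Phi = 0.5                                   # E[phi(xi)] for uniform xi
f_bar = f(xi).mean()                        # plain Monte Carlo, Eq. (2)
f_bar_phi = Phi + (f(xi) - phi(xi)).mean()  # control variate estimator, Eq. (5)

exact = 2.0 / 3.0 - 1.0 / 4.0               # E[f(xi)] = 5/12 by direct integration
print(f"plain MC estimate      : {f_bar:.5f}")
print(f"control variate estim. : {f_bar_phi:.5f}")
print(f"exact value            : {exact:.5f}")
# Empirical one-sample variances; dividing by N gives Eqs. (3) and (6).
print(f"var f(xi)              : {f(xi).var():.4f}")
print(f"var f(xi) - phi(xi)    : {(f(xi) - phi(xi)).var():.4f}")
```

Any φ with a known mean that tracks f reduces the sampled variance; setting φ(ξ) = 0 recovers the plain Monte Carlo estimator.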


3. Kalman filtering

a. State space model

Suppose that modeling techniques have provided us with a nonlinear stochastic state space model of the form

$$x^f(k+1) = M[x^f(k)] + \eta(k), \qquad (7)$$

and with observations

$$y^o(k) = H(k)x^f(k) + \epsilon(k). \qquad (8)$$

Here x^f(k) is the forecast of the system state at time k and M[x^f(k)] represents one time step of a numerical model. To model the inaccuracies of this model, a white system noise process η(k) with covariance Q(k) is introduced. Covariance Q(k) is assumed to be of low rank so that it can be factorized as Q(k) = Q(k)^{1/2}[Q(k)^{1/2}]^T, where Q(k)^{1/2} is a matrix with relatively few columns. In the relation (8) y^o(k) is the observations vector and H(k) is the observation matrix that relates the available observations to the system state. The observation noise process ε(k) with covariance R(k) = R(k)^{1/2}[R(k)^{1/2}]^T is introduced to model uncertainties in the observations.

b. Conventional Kalman filter for nonlinear systems

It is desired to combine the measurement taken from the actual system and modeled by relation (8) with the information provided by the system model (7) in order to obtain an optimal estimate of the system state. The optimal state estimate x^a(k−1) at time k−1 is forecast from observation time k−1 to observation time k by the equations

$$x^f(k) = M[x^a(k-1)] \qquad (9)$$

$$P^f(k) = M(k-1)P^a(k-1)M(k-1)^T + Q(k-1), \qquad (10)$$

where

$$M_{ij}(k) = \frac{\partial M_i[x^f(k)]}{\partial x_j^f(k)} \qquad (11)$$

represents the tangent linear model. At observation time k the observation y^o(k) becomes available. The estimate is updated by the analysis equations

$$x^a(k) = x^f(k) + K(k)[y^o(k) - H(k)x^f(k)] \qquad (12)$$

$$P^a(k) = P^f(k) - K(k)H(k)P^f(k), \qquad (13)$$

where

$$K(k) = P^f(k)H(k)^T[H(k)P^f(k)H(k)^T + R(k)]^{-1} \qquad (14)$$

is the Kalman gain. The initial condition for the recursion is given by x^a(0) = x_0 and P^a(0) = P_0.
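As a reference point for the ensemble variants below, a compact NumPy sketch of one cycle of Eqs. (9)-(14) might look as follows; `model` and `tangent_linear` are hypothetical stand-ins for M[·] and the Jacobian (11):

```python
import numpy as np

def ekf_cycle(x_a, P_a, y_o, model, tangent_linear, H, Q, R):
    """One extended Kalman filter cycle, Eqs. (9)-(14).

    model          : callable, one time step of the numerical model M[.]
    tangent_linear : callable, returns the Jacobian M(k) at a given state
    H, Q, R        : observation matrix, system and observation noise covariances
    """
    # Forecast step, Eqs. (9)-(10)
    x_f = model(x_a)
    M = tangent_linear(x_a)
    P_f = M @ P_a @ M.T + Q

    # Analysis step, Eqs. (12)-(14)
    S = H @ P_f @ H.T + R                  # innovation covariance
    K = P_f @ H.T @ np.linalg.inv(S)       # Kalman gain, Eq. (14)
    x_a_new = x_f + K @ (y_o - H @ x_f)    # Eq. (12)
    P_a_new = P_f - K @ H @ P_f            # Eq. (13)
    return x_a_new, P_a_new
```

Storing and propagating the full matrix P in this way is exactly the computational burden that the reduced-rank and ensemble algorithms below avoid.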

c. Ensemble Kalman filter

The ensemble Kalman filter algorithm is based on a representation of the probability density of the state estimate by a finite number N of randomly generated system states.

The EnKF for the model (7)-(8) can be summarized as follows (see Burgers et al. 1998).

1) INITIALIZATION

An ensemble of N states ξ_i^a(0) is generated to represent the uncertainty in x_0;

2) FORECAST STEP

$$\xi_i^f(k) = M[\xi_i^a(k-1)] + \eta_i(k-1) \qquad (15)$$

$$x^f(k) = \frac{1}{N}\sum_{i=1}^{N}\xi_i^f(k) \qquad (16)$$

$$E^f(k) = [\xi_1^f(k) - x^f(k), \ldots, \xi_N^f(k) - x^f(k)]; \qquad (17)$$

3) ANALYSIS STEP

$$P^f(k) = \frac{1}{N-1}E^f(k)E^f(k)^T \qquad (18)$$

$$K(k) = P^f(k)H(k)^T[H(k)P^f(k)H(k)^T + R(k)]^{-1} \qquad (19)$$

$$\xi_i^a(k) = \xi_i^f(k) + K(k)[y^o(k) - H(k)\xi_i^f(k) + \epsilon_i(k)], \qquad (20)$$

where the ensemble of state vectors is generated with the realizations η_i(k) and ε_i(k) of the noise processes η(k) and ε(k), respectively. Note that in the final implementation of the algorithm P^f(k) need not actually be computed (see Evensen 1994).
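A minimal sketch of one EnKF cycle, Eqs. (15)-(20), could read as follows (our illustration; `model`, `Q_half`, and `R_half` stand in for M[·], Q(k)^{1/2}, and R(k)^{1/2}). As remarked above, P^f(k) is never formed; the gain (19) is assembled directly from the anomaly matrix E^f(k).

```python
import numpy as np

def enkf_cycle(Xi_a, y_o, model, H, Q_half, R, R_half, rng):
    """One EnKF cycle, Eqs. (15)-(20). Xi_a holds one ensemble member per column."""
    n, N = Xi_a.shape

    # Forecast step, Eqs. (15)-(17): propagate members and add system noise
    Xi_f = np.column_stack([model(Xi_a[:, i]) for i in range(N)])
    Xi_f += Q_half @ rng.standard_normal((Q_half.shape[1], N))
    x_f = Xi_f.mean(axis=1)
    E_f = Xi_f - x_f[:, None]                      # anomaly matrix, Eq. (17)

    # Analysis step, Eqs. (18)-(20), without forming P^f = E E^T / (N - 1)
    HE = H @ E_f
    S = (HE @ HE.T) / (N - 1) + R                  # innovation covariance
    K = (E_f @ HE.T) / (N - 1) @ np.linalg.inv(S)  # gain, Eq. (19)

    # Perturbed observations, Eq. (20)
    eps = R_half @ rng.standard_normal((R_half.shape[1], N))
    innov = y_o[:, None] - H @ Xi_f + eps
    return Xi_f + K @ innov
```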


For most practical problems the forecast equation (15) is computationally dominant (see Dee 1991). As a result the computational effort required for the EnKF is approximately N model simulations. The standard deviations of the errors in the state estimate are of a statistical nature and converge very slowly with the sample size (≈ 1/√N). This is one of the very few drawbacks of a Monte Carlo approach (see also section 2). Here it should be noted that for many atmospheric data assimilation problems the analysis step is also a very time consuming part of the algorithm (see Houtekamer and Mitchell 1998).

d. Reduced-rank square root filter

The reduced-rank square root (RRSQRT) filter algorithm (see Verlaan and Heemink 1997; Heemink et al. 1997) is based on a factorization of the covariance matrix P of the state estimate according to P = LL^T, where L is a matrix with the q leading eigenvectors l_i (scaled by the square root of the eigenvalues), i = 1, ..., q, of P as columns. The algorithm can be summarized as follows.

1) INITIALIZATION

$$x^a(0) = x_0, \quad L^a(0) = [l_1^a(0), \ldots, l_q^a(0)]. \qquad (21)$$

2) FORECAST STEP

$$x^f(k) = M[x^a(k-1)] \qquad (22)$$

$$l_i^f(k) = \frac{1}{\epsilon}\{M[x^a(k-1) + \epsilon l_i^a(k-1)] - M[x^a(k-1)]\}. \qquad (23)$$

In Eq. (23) ε represents a perturbation, often chosen close to 1:

$$\tilde{L}^f(k) = [l_1^f(k), \ldots, l_q^f(k), Q(k-1)^{1/2}]. \qquad (24)$$

In Eq. (24) a square root L̃^f(k) of the forecast covariance is constructed by adding the columns of Q(k−1)^{1/2}. However, the dimension of L̃^f(k) has now increased. Therefore a new reduced-rank square root L^f(k) is computed instead:

$$L^f(k) = \Pi^f(k)\tilde{L}^f(k), \qquad (25)$$

where Π^f(k) is a projection onto the q leading eigenvectors of the matrix L̃^f(k)L̃^f(k)^T. Using the reduction step (25) we obtain an approximate square root L^f(k) of P^f(k) with only q columns.

3) ANALYSIS STEP

$$P^f(k) = L^f(k)L^f(k)^T \qquad (26)$$

$$K(k) = P^f(k)H(k)^T[H(k)P^f(k)H(k)^T + R(k)]^{-1} \qquad (27)$$

$$x^a(k) = x^f(k) + K(k)[y^o(k) - H(k)x^f(k)] \qquad (28)$$

$$\tilde{L}^a(k) = \{[I - K(k)H(k)]L^f(k),\ K(k)R(k)^{1/2}\} \qquad (29)$$

$$L^a(k) = \Pi^a(k)\tilde{L}^a(k), \qquad (30)$$

where Π^a(k) is a projection onto the q leading eigenvectors of the matrix L̃^a(k)L̃^a(k)^T. This reduction step is again introduced to reduce the number of columns in L̃^a(k) to q in L^a(k). Note that the full covariance P^f(k) need not be computed (see Verlaan and Heemink 1997). The analysis step (29)-(30) is based on the Joseph form of the filter equations (see Maybeck 1979). This analysis step is not the most efficient procedure. Equations (29)-(30) are, however, more general than the analysis step proposed by Verlaan and Heemink (1997) and Heemink et al. (1997): they hold for arbitrary filter gains K(k) and not only for gain matrices satisfying Eq. (27). This becomes important if the RRSQRT algorithm is used as part of the POEnKF described in the next section.

For smaller values of q, Eq. (23) is for many practical problems computationally dominant, resulting in a computational effort of q + 1 model simulations. The projections (25) and (30) require O(q³) computations (see Heemink et al. 1997). As a result, for very large q this part of the algorithm becomes time consuming too (see Cañizares 1999). Errors of the algorithm are caused by the fact that in the direction of the leading eigenvectors only two ensemble members are available and, as a result, the system dynamics is in essence linearized (see Heemink et al. 1997). Other errors are introduced by the representation of the covariance matrix by only the q leading eigenvectors. Because of the truncation, the covariance matrix is always underestimated. As a result the algorithm is sensitive to filter divergence problems. These problems can be avoided by choosing q relatively large, but this obviously reduces the computational efficiency.
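The forecast step (22)-(25) can be sketched in a few lines of NumPy (again our illustration, not the authors' code, with `model` a stand-in for M[·]). The projection (25) is realized here through a thin SVD of L̃^f(k), whose leading left singular vectors, scaled by the singular values, give a rank-q square root of the reduced covariance:

```python
import numpy as np

def rrsqrt_forecast(x_a, L_a, model, Q_half, q, eps=1.0):
    """RRSQRT forecast step, Eqs. (22)-(25). L_a holds scaled modes as columns."""
    x_f = model(x_a)                                      # Eq. (22)
    # Propagate each mode by a finite difference of the model, Eq. (23)
    L_f = np.column_stack([
        (model(x_a + eps * L_a[:, i]) - x_f) / eps
        for i in range(L_a.shape[1])
    ])
    L_tilde = np.hstack([L_f, Q_half])                    # Eq. (24)
    # Reduction, Eq. (25): keep the q leading eigendirections of
    # L_tilde L_tilde^T. A thin SVD of L_tilde gives them directly, and
    # U[:, :q] * s[:q] is again a scaled-eigenvector square root of rank q.
    U, s, _ = np.linalg.svd(L_tilde, full_matrices=False)
    return x_f, U[:, :q] * s[:q]
```

The analysis step (26)-(30) manipulates the same factor; since the Joseph-form update (29) is valid for an arbitrary gain, this recursion can be reused unchanged inside the POEnKF below.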


e. Partially orthogonal ensemble Kalman filter

The ensemble Kalman filter and the reduced-rank square root filter are both of the square root type, since both algorithms are formulated in terms of the square root of the covariance matrix [respectively E^f(k) and L^f(k)]. Also, both algorithms are of the ensemble type. The ensemble filter is based on N randomly chosen ensemble members, while in the reduced-rank filter the ensemble members are chosen deterministically in the direction of the q leading eigenvectors. Because the two algorithms have a very similar algorithmic structure, they can be integrated relatively easily.

The ensemble of the POEnKF consists of two parts: the q leading eigenvectors l_i of the covariance matrix plus N randomly chosen ensemble members ξ_i. The ensemble members l_i are updated using an RRSQRT algorithm and the ξ_i by using the EnKF algorithm. The two algorithms interact with each other at the analysis step. Here, only the information of the random ensemble members orthogonal to the q leading eigenvectors is used.

The POEnKF algorithm can be summarized as follows.

1) INITIALIZATION

$$[L^a(0)\ E^a(0)] = [l_1^a(0), \ldots, l_q^a(0), \xi_1^a(0), \ldots, \xi_N^a(0)], \qquad (31)$$

where l_i^a is the ith leading eigenvector of P_0 and ξ_i^a is generated randomly to represent the uncertainty in x_0.

2) FORECAST STEP

Apply the forecast equations (22)-(25) of the RRSQRT algorithm and the forecast equations (15)-(17) of the EnKF algorithm.

3) ANALYSIS STEP

In the analysis step we merge the information of the RRSQRT and the EnKF algorithms to obtain P^f(k). In the direction of the leading eigenvectors we use the results of the RRSQRT algorithm. In the other directions the RRSQRT algorithm does not give any information and we use the EnKF results. First we compute

$$E^\perp(k) = \Pi^\perp(k)E^f(k), \qquad (32)$$

where Π^⊥(k) is a projection of the random ensemble members orthogonal to the first q ensemble members l_i^f. Then we compute

$$P^f(k) = L^f(k)L^f(k)^T + \frac{1}{N-1}E^\perp(k)E^\perp(k)^T \qquad (33)$$

$$K(k) = P^f(k)H(k)^T[H(k)P^f(k)H(k)^T + R(k)]^{-1}, \qquad (34)$$

followed by the analysis Eq. (20) of the EnKF algorithm for the ensemble E^f(k) and the analysis Eqs. (28)-(30) of the RRSQRT algorithm.

For small values of q the time propagation equations (15) and (23) for the ensemble are computationally dominating. As a result, for most practical problems the computational effort for the POEnKF is approximately N + q times the effort required for one model simulation. By integrating the ensemble Kalman filter and the reduced-rank square root filter the best of both is combined. The reduced-rank part acts as a variance reductor for the ensemble filter, reducing the statistical errors of this Monte Carlo approach significantly. On the other hand, by embedding the reduced-rank filter in an ensemble Kalman filter the covariance is not underestimated, eliminating the filter divergence problems of the reduced-rank approach (also for very small values of q). As a result, q can be chosen on the basis of efficiency arguments and not for stabilizing the filter algorithm.

Many generalizations and modifications of the POEnKF exist. One important modification to make the POEnKF more robust for strongly nonlinear problems is to take into account the information of the random ensemble members in the direction of the q leading eigenvectors of the covariance matrix. If

$$E''(k) = \Pi''(k)E^f(k) \qquad (35)$$

is the projection of the random ensemble members onto the first q ensemble members l_i, then

$$\frac{1}{N-1}E''(k)E''(k)^T \qquad (36)$$

is the ensemble Kalman filter approximation of the reduced-rank covariance matrix with rank q. Instead of Eq. (33), where this information is completely ignored, it is also possible to use the equation

$$P^f(k) = \gamma L^f(k)L^f(k)^T + \frac{1-\gamma}{N-1}E''(k)E''(k)^T + \frac{1}{N-1}E^\perp(k)E^\perp(k)^T, \qquad (37)$$

where 0 ≤ γ ≤ 1 is a weighting coefficient. The first term on the right-hand side suffers from errors caused by the approximation of the nonlinear dynamics of the system in the forecast equation; the second term is dominated by the statistical noise of the Monte Carlo method. For most practical problems the errors of the first term are relatively small and γ = 1 is a good choice. However, for very nonlinear data assimilation problems it may be better to choose γ smaller.
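To make the merging step concrete, here is a small NumPy sketch of Eqs. (32)-(34) (ours, not the authors' code; the orthogonal projection Π^⊥(k) is implemented as I − UU^T with U an orthonormal basis of the mode subspace):

```python
import numpy as np

def poenkf_gain(L_f, E_f, H, R):
    """POEnKF analysis quantities, Eqs. (32)-(34).

    L_f : (n, q) RRSQRT modes (scaled leading eigenvectors)
    E_f : (n, N) EnKF anomaly matrix, Eq. (17)
    """
    n, N = E_f.shape
    # Orthonormal basis of the mode subspace; I - U U^T projects onto
    # its orthogonal complement (the projection Pi-perp of Eq. (32)).
    U, _ = np.linalg.qr(L_f)
    E_perp = E_f - U @ (U.T @ E_f)                 # Eq. (32)

    # Merged forecast covariance, Eq. (33): exact in the mode subspace,
    # Monte Carlo in the orthogonal complement.
    P_f = L_f @ L_f.T + (E_perp @ E_perp.T) / (N - 1)

    S = H @ P_f @ H.T + R                          # innovation covariance
    K = P_f @ H.T @ np.linalg.inv(S)               # Eq. (34)
    return K, P_f
```

For a large state dimension one would of course keep P^f(k) in the factored form of Eq. (33) rather than forming it explicitly as done here for clarity.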

f. Complementary orthogonal subspace filter for efficient ensembles

The POEnKF described in the previous section uses an RRSQRT time update for a small number of directions that correspond to the largest eigenvalues of the forecast error covariance. A separate ensemble update is used for the remaining space. However, the covariance for this orthogonal subspace is obtained by first generating an ensemble for the whole state space and subsequently removing all contributions in the directions with the largest variance. Since the contribution of the ensemble is ignored in the subspace spanned by the modes of the RRSQRT part of the POEnKF, it seems more efficient to remove those contributions from the ensemble altogether. This can be achieved with a simple modification of the POEnKF, which will be referred to hereafter as the complementary orthogonal subspace filter for efficient ensembles (COFFEE) algorithm.

The COFFEE algorithm can be summarized as follows.

1) INITIALIZATION

$$[L^a(0)\ E^a(0)] = [l_1^a(0), \ldots, l_q^a(0), \xi_1^a(0), \ldots, \xi_N^a(0)], \qquad (38)$$

where l_i^a again denotes the q leading eigenvectors of P_0. However, the random ensemble members are now generated only to represent the covariance P_0 − L^a(0)L^a(0)^T. This is the part of the covariance that is not represented by the leading eigenvectors.

2) FORECAST STEP

L^a is updated using the RRSQRT update (22)-(25) and the ensemble ξ_i^a is updated using (15)-(17). However, a random ensemble is now generated with an additional covariance equal to the covariance of the truncated part in the RRSQRT half; that is, (17) changes to

$$E^f(k) = [\xi_1^f(k) - x^f(k) + \eta_1(k), \ldots, \xi_N^f(k) - x^f(k) + \eta_N(k)], \qquad (39)$$

with E[η_i(k)η_i(k)^T] = [I − Π^f(k)]L̃^f(k)L̃^f(k)^T[I − Π^f(k)]^T. In the RRSQRT forecast step, only the q leading eigenvectors are taken into account. The other eigenvectors are neglected. In the COFFEE forecast step, the part of the covariance that is truncated in the RRSQRT forecast step is also represented by the random ensembles.

3) ANALYSIS STEP

The measurement update is almost identical to that of the POEnKF, except that no projection of the ensemble is needed; that is, (32) can be omitted.
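Sampling the complementary noise η_i(k) of Eq. (39) requires a square root of the truncated covariance [I − Π^f(k)]L̃^f(k)L̃^f(k)^T[I − Π^f(k)]^T; a minimal sketch (our illustration) is:

```python
import numpy as np

def coffee_complement_noise(L_f, L_tilde, N, rng):
    """Draw eta_i for Eq. (39): noise with covariance
    [I - Pi^f] L_tilde L_tilde^T [I - Pi^f]^T, i.e. the part of the
    forecast covariance truncated by the RRSQRT reduction step (25).

    L_f     : (n, q) retained modes (span of the projection Pi^f)
    L_tilde : (n, q+m) augmented square root of Eq. (24)
    """
    U, _ = np.linalg.qr(L_f)              # orthonormal basis of retained modes
    G = L_tilde - U @ (U.T @ L_tilde)     # (I - U U^T) L_tilde: residual factor
    # G G^T equals the truncated covariance, so G w with w ~ N(0, I) works.
    return G @ rng.standard_normal((G.shape[1], N))
```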

4. Application

a. Model and experimental setup

Before using the new algorithms for real-life atmospheric chemistry data assimilation problems, we define in this paper first a test problem. Consider the 2D advection diffusion equation

$$\frac{\partial c}{\partial t} + u\frac{\partial c}{\partial x} + v\frac{\partial c}{\partial y} = \nu\frac{\partial^2 c}{\partial x^2} + \nu\frac{\partial^2 c}{\partial y^2} + S, \qquad (40)$$

with a square domain [0, L] × [0, L] and zero initial conditions. Here, c is the concentration, [u, v] is the velocity field, ν is the dispersion coefficient, and S represents the source terms. The concentration at the boundary is zero for inflow. A backward Lagrangian scheme is used to discretize these equations on a 30 × 30 grid. Preliminary experiments indicated that the numerical diffusion was still so large that the physical diffusion terms could be omitted. For all experiments hereafter the velocity field was assumed to be known and constant in time. Figure 2 shows the velocity field, which is similar to that of the well-known Molenkamp test.

FIG. 2. Velocity field.

In order to compare the algorithms proposed to other existing algorithms some twin experiments were carried out. A reference solution was generated by inserting constant emissions at grid cells {(5, 8), (7, 8), (15, 5), (3, 18), (20, 15)}. The increase of concentration per time step for these locations was {0.2, 0.5, 0.4, 0.2, 0.25}, respectively.

Observations were generated from simulated true concentrations, which were computed by adding fluctuations to the mean emissions. For this purpose an AR(1) process was used; that is, the fluctuations to the emissions per time step were computed according to

$$\tilde{\eta}_j(k+1) = \alpha_j\tilde{\eta}_j(k) + \eta_j(k), \qquad (41)$$

with η_j independent Gaussian white noise processes with E[η_j(k)] = 0 and var[η_j(k)] = 1. The index j refers to source locations {(m, n)_1 ··· (m, n)_p} = {(4, 11), (13, 3), (26, 15), (15, 10), (23, 2), (11, 9), (15, 20), (23, 10), (5, 25)}. The decays per time step are {α_1 ··· α_p} = {0.9, 0.8, 0.8, 0.9, 0.9}. Negative emissions are truncated, which creates some nonlinearity in the data assimilation problem. Finally, white observational noise with variance 0.01 is added to the true concentrations.

b. Results of the suboptimal Kalman filter algorithms

In evaluating the filter algorithms we compute the rms error

$$\mathrm{RMS} := \sqrt{\frac{1}{M^2 K}\sum_{m,n,k}\left[c_{m,n}(k) - \hat{c}_{m,n}(k)\right]^2}, \qquad (42)$$

where c_{m,n}(k) are the exact generated concentrations and ĉ_{m,n}(k) are the estimates computed, M is the number of grid points in one direction, and K is the number of time steps.
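The emission fluctuations (41), the truncation of negative emissions, and the rms metric (42) can be sketched as follows. This is our illustration only: for brevity it pairs the five decay factors with the five reference emission rates, whereas in the experiment the fluctuating sources have their own grid locations.

```python
import numpy as np

# Sketch of the twin-experiment noise (Eq. 41) and error metric (Eq. 42).
# mean_emission and the alpha/emission pairing are illustrative placeholders.
rng = np.random.default_rng(seed=0)
K = 100                                       # number of time steps
alpha = np.array([0.9, 0.8, 0.8, 0.9, 0.9])   # AR(1) decays per time step
mean_emission = np.array([0.2, 0.5, 0.4, 0.2, 0.25])

eta_tilde = np.zeros_like(alpha)
emissions = np.empty((K, alpha.size))
for k in range(K):
    # AR(1) fluctuations, Eq. (41), driven by unit-variance white noise
    eta_tilde = alpha * eta_tilde + rng.standard_normal(alpha.size)
    # Truncate negative emissions (the source of nonlinearity in the text)
    emissions[k] = np.maximum(mean_emission + eta_tilde, 0.0)

def rms_error(c_true, c_est):
    """Eq. (42): c_true and c_est have shape (K, M, M)."""
    K_steps, M, _ = c_true.shape
    return np.sqrt(((c_true - c_est) ** 2).sum() / (M * M * K_steps))
```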


FIG. 3. Reference concentration field after k = 100 time steps. Contours are shown with increment 1 for reference concentrations.

FIG. 4. True concentration field after k = 100 time steps. Contours are shown with increment 1 for reference concentrations.


FIG. 5. Concentrations computed with the EnKF with N = 20 at k = 100. Contours are shown with increment 1 for true concentrations (solid) and filter solution (dotted).

FIG. 6. Concentrations computed with the RRSQRT filter with N = 20 at k = 100. Contours are shown with increment 1 for true concentrations (solid) and filter solution (dotted).


In Figs. 3 and 4 the concentration fields of the truth run and the reference run are shown after K = 100 time steps. The + signs indicate measurement locations and the diamond signs the locations of the emissions. It can be seen clearly that the true fields are perturbed with time-varying fluctuations, while the reference solution only contains a steady emission that is advected and spreads smoothly. The rms difference between both runs is 0.665 without any data assimilation.

In Figs. 5 and 6 we show the results for the ensemble Kalman filter and the reduced-rank square root filter at the final time k = 100 for 20 ensemble members and 20 modes, respectively. These settings correspond to approximately the computational requirements of 20 model runs. For these settings neither filter performs well. Their respective rms errors are 0.838 and 0.797, which is worse than without data assimilation. The contours clearly show that both algorithms create spurious correlations if not fully converged. This result is consistent with the findings of Houtekamer and Mitchell (1998), who found that there tend to be spurious correlations at long distances when ensemble sizes are small. They introduced a data-selection procedure to reduce these effects. Although 20 eigenvalues explain almost 98% of the error covariance, both the EnKF and the RRSQRT filter still show divergence problems in the sense that the rms error using data assimilation is larger than the error without any data assimilation (see also Fig. 10). See Fig. 7 for the variance explained by the leading eigenvectors of the analysis error covariance at the final time k = 100, as computed with the RRSQRT filter with 100 modes.

FIG. 7. Eigenvalues of the error covariance matrix P(100|100), computed with the RRSQRT filter with q = 50.

In Figs. 8 and 9 the results for the POEnKF and the COFFEE algorithm are shown at the final time. In both computations 15 modes and an ensemble with N = 5 members were used.

FIG. 8. Concentrations computed with the POEnKF with q + N = 15 + 5 at k = 100. Contours are shown with increment 1 for true concentrations (solid) and filter solution (dotted).


FIG. 9. Concentrations computed with the COFFEE KF with q + N = 15 + 5 at k = 100. Contours are shown with increment 1 for true concentrations (solid) and filter solution (dotted).

FIG. 10. Rms errors vs computation times for the suboptimal Kalman filter algorithms.

Although the computational requirements are approximately the same as in the previous experiments, these filters do not diverge and actually improve on the reference solution. The rms errors are 0.642 and 0.480, respectively. This also shows that the reduced variance in the COFFEE algorithm as compared to the POEnKF does indeed reduce the errors even further in this case. In other experiments not shown here we also found that, of the various variants that are possible for the COFFEE algorithm, it is most efficient to project the Kalman gain onto the subspace spanned by the modes of the RRSQRT part of the algorithm, that is, to effectively ignore correlations computed from the EnKF part of the algorithm but to include their variance in the innovation error covariance.

Figure 10 shows the convergence of the four suboptimal schemes used in this paper. For the POEnKF and the COFFEE algorithm the number of random ensemble members was kept fixed at N = 5. We observed during the experiments that for a small number of modes or ensemble members the rms errors are sensitive to the random numbers used in the computation. To reduce this random effect on the results the same set of random numbers was used for all the experiments, which made the results more comparable. However, the seemingly random behavior for small ensembles (or modes) is still influenced by the set of random numbers used. When comparing the algorithms, it can be seen that both the EnKF and the RRSQRT algorithms seem to diverge for small ensemble size. The full (extended) Kalman filter produces an rms error of 0.46.

Both the POEnKF and the COFFEE algorithm improve the convergence for small ensemble sizes, where the COFFEE variant is slightly more efficient. However, if the number of modes is increased the random part of these algorithms seems to hamper quick convergence of the RRSQRT part of the algorithm. In this experiment accurate results are obtained with the COFFEE algorithm at only 40% of the computation time compared to the EnKF (q = 20 and N = 50).

5. Conclusions

In this paper we propose to use a reduced-rank square root Kalman filter as a variance reductor for the ensemble Kalman filter to reduce the statistical errors of this Monte Carlo approach. The resulting partially orthogonal ensemble filter and its variants combine the best of both. It is less sensitive to divergence problems and much faster than both the reduced-rank approach and the ensemble Kalman filter. Next, the POEnKF and COFFEE algorithms will be applied to real-life, nonlinear data assimilation problems in atmospheric chemistry and shallow water flow forecasting. Furthermore, we will also study the performance of the new algorithms in nonlinear, unstable systems.

REFERENCES

Burgers, G., P. J. Van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719-1724.

Cañizares, R., 1999: On the application of data assimilation in regional coastal models. Ph.D. thesis, Delft University of Technology, Delft, Netherlands, 133 pp.

Cohn, S. E., and R. Todling, 1995: Approximate Kalman filters for unstable dynamics. Second Int. Symp. on Assimilation of Observations in Meteorology and Oceanography, Tokyo, Japan, WMO, 241-246.

——, and ——, 1996: Approximate data assimilation schemes for stable and unstable dynamics. J. Meteor. Soc. Japan, 74, 63-75.

Dee, D. P., 1991: Simplification of the Kalman filter for meteorological data assimilation. Quart. J. Roy. Meteor. Soc., 117, 365-384.

Evensen, G., 1994: Sequential data assimilation with a nonlinear QG model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10 143-10 162.

——, and P. J. Van Leeuwen, 1996: Assimilation of Geosat altimeter data for the Agulhas Current using the ensemble Kalman filter with a quasigeostrophic model. Mon. Wea. Rev., 124, 85-96.

Ghil, M., and P. Malanotte-Rizzoli, 1991: Data assimilation in meteorology and oceanography. Advances in Geophysics, Vol. 33, Academic Press, 141-266.

Hammersley, J., and D. Handscomb, 1964: Monte Carlo Methods. Wiley and Sons.

Heemink, A. W., K. Bolding, and M. Verlaan, 1997: Storm surge forecasting using Kalman filtering. J. Meteor. Soc. Japan, 75, 305-318.

Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796-811.

Lermusiaux, P. F. J., 1997: Error subspace data assimilation methods for ocean field estimation: Theory, validation and applications. Ph.D. thesis, Harvard University, 402 pp.

Maybeck, P. S., 1979: Stochastic Models, Estimation and Control. Vol. 141-1, Mathematics in Science and Engineering, Academic Press, 423 pp.

Pham, D., J. Verron, and M. Roubaud, 1998: A singular evolutive extended Kalman filter for data assimilation in oceanography. J. Mar. Syst., 16, 323-340.

Verlaan, M., and A. W. Heemink, 1995: Data assimilation schemes for non-linear shallow water flow models. Proc. Second Int. Symp. on Assimilation of Observations, Tokyo, Japan, WMO, 247-252.

——, and ——, 1997: Tidal flow forecasting using reduced-rank square root filters. Stochastic Hydrol. Hydraul., 11, 349-368.
