
Variance Reduced Ensemble Kalman Filtering

A. W. HEEMINK, M. VERLAAN, AND A. J. SEGERS

Faculty of Information Technology and Systems, Department of Applied Mathematical Analysis, Delft University of Technology, Delft, Netherlands

(Manuscript received 3 February 2000, in final form 14 November 2000)

ABSTRACT

A number of algorithms to solve large-scale Kalman filtering problems have been introduced recently. The ensemble Kalman filter represents the probability density of the state estimate by a finite number of randomly generated system states. Another algorithm uses a singular value decomposition to select the leading eigenvectors of the covariance matrix of the state estimate and to approximate the full covariance matrix by a reduced-rank matrix. Both algorithms, however, still require a huge amount of computer resources. In this paper the authors propose to combine the two algorithms and to use a reduced-rank approximation of the covariance matrix as a variance reductor for the ensemble Kalman filter. If the leading eigenvectors explain most of the variance, which is the case for most applications, the computational burden to solve the filtering problem can be reduced significantly (up to an order of magnitude).

1. Introduction

Kalman filtering is a powerful framework for solving data assimilation problems (see Ghil and Malanotte-Rizzoli 1991). In order to use a Kalman filter for assimilating data into a numerical model, this model is embedded into a stochastic environment by introducing a system noise process. In this way it is possible to take into account the inaccuracies of the underlying deterministic system. By using a Kalman filter, the information provided by the resulting stochastic-dynamic model and the noisy measurements are combined to obtain an optimal estimate of the state of the system. The standard Kalman filter implementation, however, would impose an unacceptable computational burden. In order to obtain a computationally efficient filter, simplifications have to be introduced.

The ensemble Kalman filter (EnKF) was introduced by Evensen (1994) and has been used successfully in many applications (see Evensen and Van Leeuwen 1996; Houtekamer and Mitchell 1998; Cañizares 1999). This Monte Carlo approach is based on a representation of the probability density of the state estimate by a finite number N of randomly generated system states. The algorithm does not require a tangent linear model and is very easy to implement. The computational effort required for the EnKF is approximately N times as much as the effort required for the underlying model. The only serious disadvantage is that the statistical error in the estimates of the mean and covariance matrix from a sample decreases very slowly for larger sample size. This is a well-known fundamental problem with all Monte Carlo methods. As a result, for most practical problems the sample size chosen has to be rather large. Here it should be noted that a properly constructed ensemble Kalman filter can still provide an improved analysis even with small-sized ensembles (see Houtekamer and Mitchell 1998).

Another approach to solving large-scale Kalman filtering problems is to approximate the full covariance matrix of the state estimate by a matrix with reduced rank. This approach was introduced by Cohn and Todling (1995, 1996) and Verlaan and Heemink (1995, 1997), where the latter used a robust square root formulation for the filter implementation. Algorithms based on similar ideas have been proposed and applied by Lermusiaux (1997) and Pham et al. (1998).

The reduced-rank approaches can also be formulated as an ensemble Kalman filter where the q ensemble members have not been chosen randomly, but in the directions of the q leading eigenvectors of the covariance matrix (see Verlaan and Heemink 1997). As a result these algorithms do not require a tangent linear model either. The computational effort required is approximately q + 1 model simulations plus the computations required for the singular value decomposition to determine the leading eigenvectors [O(q³); see Heemink et al. (1997)]. In many practical problems the full covariance can be approximated accurately by a reduced-rank matrix with a relatively small value of q.

Corresponding author address: A. W. Heemink, Delft University of Technology, Department of Applied Mathematical Analysis, Mekelweg 4, 2628 CD Delft, Netherlands. E-mail: [email protected]


However, reduced-rank approaches often suffer from filter divergence problems for small values of q. This was observed by Cohn and Todling (1995), who tried several methods of compensating for the truncation error. The main reason for the occurrence of filter divergence is the fact that truncation of the eigenvectors of the covariance matrix implies that the covariance is always underestimated. It is well known that underestimating the covariance may cause filter divergence. Filter divergence can be avoided by choosing q relatively large, but this of course reduces the computational efficiency of the method considerably.

Cañizares (1999) compared the EnKF and the reduced-rank approach for a number of practical large-scale shallow water flow problems. His conclusion is that the computational efficiency of both approaches is comparable. The rank q of the reduced-rank approximation could be chosen considerably smaller (3-5 times) than the ensemble size N of the ensemble filter. However, the singular value decomposition becomes computationally the most expensive part of the reduced-rank algorithm for larger values of q.

In this paper we propose to combine the EnKF with the reduced-rank approach to reduce the statistical error of the ensemble filter. This is known as variance reduction, referring to the variance of the statistical error of the ensemble approach (see Hammersley and Handscomb 1964). The ensemble of the new filter algorithm now consists of two parts: q ensembles in the direction of the q leading eigenvectors of the covariance matrix and N randomly chosen ensembles. In the algorithm, only the projection of the random ensemble members orthogonal to the first ensemble members is used to obtain the filter gain and the state estimate. This partially orthogonal ensemble Kalman filter (POEnKF) does not suffer from divergence problems because the reduced-rank approximation is embedded in an EnKF. The EnKF acts as a compensating mechanism for the truncation error. At the same time, the POEnKF is much more accurate than the ensemble filter with ensemble size N + q because the leading eigenvectors of the covariance matrix are computed accurately using the full (extended) Kalman filter equations without statistical errors.

In section 2 we first show a simple example of variance reduction to illustrate the basic idea of our combined filter. In section 3 of this paper we summarize the ensemble Kalman filter and the reduced-rank square root filter and introduce the partially orthogonal ensemble Kalman filter algorithm and a few variants of this algorithm. We illustrate the performance of the various algorithms with an advection diffusion model application in section 4. Here, in order to compare the results of the various suboptimal filter algorithms with the exact Kalman filter, we concentrate our attention on linear problems.

2. Variance reduction: An illustrative example

Suppose we want to determine

$$E[f(\xi)], \qquad (1)$$

where ξ is a random variable uniformly distributed on the interval [0, 1]. Using a random number generator we first generate a sequence of uniformly distributed random numbers ξ_i, i = 1, ..., N, and compute

$$\bar{f} = \frac{1}{N}\sum_{i=1}^{N} f(\xi_i). \qquad (2)$$

Here, f̄ is an estimator of E[f(ξ)]. The variance of this Monte Carlo estimator is (see Hammersley and Handscomb 1964)

$$\operatorname{var}(\bar{f}) = \frac{1}{N}\int_0^1 \{f(\xi) - E[f(\xi)]\}^2\, d\xi. \qquad (3)$$

From this expression we see that the standard deviations of the statistical errors of the Monte Carlo method converge very slowly with the sample size (≈ 1/√N).

Now let us suppose that we have an approximation φ(ξ) of f(ξ) with known E[φ(ξ)] = Φ. Here, φ(ξ) is also called a control variate. As a result we have, in this case,

$$E[f(\xi)] = \Phi + E[f(\xi) - \phi(\xi)]. \qquad (4)$$

Now we can use as estimator of E[f(ξ)]

$$\bar{f}_\phi = \Phi + \frac{1}{N}\sum_{i=1}^{N}[f(\xi_i) - \phi(\xi_i)] \qquad (5)$$

with variance

$$\operatorname{var}(\bar{f}_\phi) = \frac{1}{N}\int_0^1 [f(\xi) - \phi(\xi) - \{E[f(\xi)] - \Phi\}]^2\, d\xi. \qquad (6)$$

If φ(ξ) is a reasonable approximation of f(ξ), Φ will be a reasonable estimate of E[f(ξ)]. The Monte Carlo method, in this case, will only be used for estimating the remaining part E[f(ξ) − φ(ξ)], with a variance of the error as given by (6). This variance is in general significantly smaller than the variance (3) of the original Monte Carlo approximation, where φ(ξ) = 0. Therefore φ(ξ) is also often called a variance reductor. Variance reduction is attractive as long as we have an approximation φ(ξ) of f(ξ) that is better than φ(ξ) = 0. In most cases it will be easy to find an approximation that is better than doing nothing at all.

Let us for example take (see Fig. 1)

$$f(\xi) = 2\xi^2 - \xi^3,$$

and as approximation

$$\phi(\xi) = \xi.$$

In this case the variance (3) of the statistical error of the Monte Carlo method is 0.1304/N while the variance (6) of the estimator (5) is only 0.0304/N.

FIG. 1. Function f(ξ) (solid) and its approximation φ(ξ) (dashed) for 0 ≤ ξ ≤ 1.

The basic idea just described can also be introduced for the ensemble Kalman filter. By computing an approximation of the filter problem using a reduced-rank filter, the ensemble Kalman filter needs only be used for the remaining part of the problem. As a result, the statistical errors of the combined approach will be reduced significantly.
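To make the example concrete, the short Python sketch below (our illustration, not part of the original paper) compares the plain estimator (2) with the control variate estimator (5) for the f and φ above, using the exactly known Φ = E[ξ] = 1/2. The printed one-sample variances, divided by N, correspond to Eqs. (3) and (6).

```python
import numpy as np

# Control variate (variance reduction) demo for Eqs. (1)-(6).
# f and phi follow the illustrative example in the text;
# Phi = E[phi(xi)] = 1/2 is known exactly for xi ~ U[0, 1].

def f(xi):
    return 2.0 * xi**2 - xi**3

def phi(xi):
    return xi

rng = np.random.default_rng(seed=1)
N = 100_000
xi = rng.uniform(0.0, 1.0, size=N)

Phi = 0.5                                   # E[phi(xi)] for uniform xi
f_bar = f(xi).mean()                        # plain Monte Carlo, Eq. (2)
f_bar_phi = Phi + (f(xi) - phi(xi)).mean()  # control variate estimator, Eq. (5)

exact = 2.0 / 3.0 - 1.0 / 4.0               # E[f(xi)] = 5/12 by direct integration
print(f"plain MC estimate      : {f_bar:.5f}")
print(f"control variate estim. : {f_bar_phi:.5f}")
print(f"exact value            : {exact:.5f}")
# Empirical one-sample variances; dividing by N gives Eqs. (3) and (6).
print(f"var f(xi)              : {f(xi).var():.4f}")
print(f"var f(xi) - phi(xi)    : {(f(xi) - phi(xi)).var():.4f}")
```

Any φ with a known mean that tracks f reduces the sampled variance; setting φ(ξ) = 0 recovers the plain Monte Carlo estimator.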


3. Kalman filtering

a. State space model

Suppose that modeling techniques have provided us with a nonlinear stochastic state space model of the form

$$x^f(k+1) = M[x^f(k)] + \eta(k), \qquad (7)$$

and with observations

$$y^o(k) = H(k)x^f(k) + \epsilon(k). \qquad (8)$$

Here x^f(k) is the forecast of the system state at time k and M[x^f(k)] represents one time step of a numerical model. To model the inaccuracies of this model, a white system noise process η(k) with covariance Q(k) is introduced. Covariance Q(k) is assumed to be of low rank so that it can be factorized as Q(k) = Q(k)^{1/2}[Q(k)^{1/2}]^T, where Q(k)^{1/2} is a matrix with relatively few columns. In the relation (8) y^o(k) is the observations vector and H(k) is the observation matrix that relates the available observations to the system state. The observation noise process ε(k) with covariance R(k) = R(k)^{1/2}[R(k)^{1/2}]^T is introduced to model uncertainties in the observations.

b. Conventional Kalman filter for nonlinear systems

It is desired to combine the measurement taken from the actual system and modeled by relation (8) with the information provided by the system model (7) in order to obtain an optimal estimate of the system state. The optimal state estimate x^a(k−1) at time k−1 is forecast from observation time k−1 to observation time k by the equations

$$x^f(k) = M[x^a(k-1)] \qquad (9)$$

$$P^f(k) = M(k-1)P^a(k-1)M(k-1)^T + Q(k-1), \qquad (10)$$

where

$$M_{ij}(k) = \frac{\partial M_i[x^f(k)]}{\partial x_j^f(k)} \qquad (11)$$

represents the tangent linear model. At observation time k the observation y^o(k) becomes available. The estimate is updated by the analysis equations

$$x^a(k) = x^f(k) + K(k)[y^o(k) - H(k)x^f(k)] \qquad (12)$$

$$P^a(k) = P^f(k) - K(k)H(k)P^f(k), \qquad (13)$$

where

$$K(k) = P^f(k)H(k)^T[H(k)P^f(k)H(k)^T + R(k)]^{-1} \qquad (14)$$

is the Kalman gain. The initial condition for the recursion is given by x^a(0) = x_0 and P^a(0) = P_0.
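As a reference point for the ensemble variants below, a compact NumPy sketch of one cycle of Eqs. (9)-(14) might look as follows; `model` and `tangent_linear` are hypothetical stand-ins for M[·] and the Jacobian (11):

```python
import numpy as np

def ekf_cycle(x_a, P_a, y_o, model, tangent_linear, H, Q, R):
    """One extended Kalman filter cycle, Eqs. (9)-(14).

    model          : callable, one time step of the numerical model M[.]
    tangent_linear : callable, returns the Jacobian M(k) at a given state
    H, Q, R        : observation matrix, system and observation noise covariances
    """
    # Forecast step, Eqs. (9)-(10)
    x_f = model(x_a)
    M = tangent_linear(x_a)
    P_f = M @ P_a @ M.T + Q

    # Analysis step, Eqs. (12)-(14)
    S = H @ P_f @ H.T + R                  # innovation covariance
    K = P_f @ H.T @ np.linalg.inv(S)       # Kalman gain, Eq. (14)
    x_a_new = x_f + K @ (y_o - H @ x_f)    # Eq. (12)
    P_a_new = P_f - K @ H @ P_f            # Eq. (13)
    return x_a_new, P_a_new
```

Storing and propagating the full matrix P in this way is exactly the computational burden that the reduced-rank and ensemble algorithms below avoid.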

c. Ensemble Kalman filter

The ensemble Kalman filter algorithm is based on a representation of the probability density of the state estimate by a finite number N of randomly generated system states.

The EnKF for the model (7)-(8) can be summarized as follows (see Burgers et al. 1998).

1) INITIALIZATION

An ensemble of N states ξ_i^a(0) is generated to represent the uncertainty in x_0;

2) FORECAST STEP

$$\xi_i^f(k) = M[\xi_i^a(k-1)] + \eta_i(k-1) \qquad (15)$$

$$x^f(k) = \frac{1}{N}\sum_{i=1}^{N}\xi_i^f(k) \qquad (16)$$

$$E^f(k) = [\xi_1^f(k) - x^f(k), \ldots, \xi_N^f(k) - x^f(k)]; \qquad (17)$$

3) ANALYSIS STEP

$$P^f(k) = \frac{1}{N-1}E^f(k)E^f(k)^T \qquad (18)$$

$$K(k) = P^f(k)H(k)^T[H(k)P^f(k)H(k)^T + R(k)]^{-1} \qquad (19)$$

$$\xi_i^a(k) = \xi_i^f(k) + K(k)[y^o(k) - H(k)\xi_i^f(k) + \epsilon_i(k)], \qquad (20)$$

where the ensemble of state vectors is generated with the realizations η_i(k) and ε_i(k) of the noise processes η(k) and ε(k), respectively. Note that in the final implementation of the algorithm P^f(k) need not actually be computed (see Evensen 1994).
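A minimal sketch of one EnKF cycle, Eqs. (15)-(20), could read as follows (our illustration; `model`, `Q_half`, and `R_half` stand in for M[·], Q(k)^{1/2}, and R(k)^{1/2}). As remarked above, P^f(k) is never formed; the gain (19) is assembled directly from the anomaly matrix E^f(k).

```python
import numpy as np

def enkf_cycle(Xi_a, y_o, model, H, Q_half, R, R_half, rng):
    """One EnKF cycle, Eqs. (15)-(20). Xi_a holds one ensemble member per column."""
    n, N = Xi_a.shape

    # Forecast step, Eqs. (15)-(17): propagate members and add system noise
    Xi_f = np.column_stack([model(Xi_a[:, i]) for i in range(N)])
    Xi_f += Q_half @ rng.standard_normal((Q_half.shape[1], N))
    x_f = Xi_f.mean(axis=1)
    E_f = Xi_f - x_f[:, None]                      # anomaly matrix, Eq. (17)

    # Analysis step, Eqs. (18)-(20), without forming P^f = E E^T / (N - 1)
    HE = H @ E_f
    S = (HE @ HE.T) / (N - 1) + R                  # innovation covariance
    K = (E_f @ HE.T) / (N - 1) @ np.linalg.inv(S)  # gain, Eq. (19)

    # Perturbed observations, Eq. (20)
    eps = R_half @ rng.standard_normal((R_half.shape[1], N))
    innov = y_o[:, None] - H @ Xi_f + eps
    return Xi_f + K @ innov
```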


For most practical problems the forecast equation (15) is computationally dominant (see Dee 1991). As a result the computational effort required for the EnKF is approximately N model simulations. The standard deviations of the errors in the state estimate are of a statistical nature and converge very slowly with the sample size (≈ 1/√N). This is one of the very few drawbacks of a Monte Carlo approach (see also section 2). Here it should be noted that for many atmospheric data assimilation problems the analysis step is also a very time consuming part of the algorithm (see Houtekamer and Mitchell 1998).

d. Reduced-rank square root filter

The reduced-rank square root (RRSQRT) filter algorithm (see Verlaan and Heemink 1997; Heemink et al. 1997) is based on a factorization of the covariance matrix P of the state estimate according to P = LL^T, where L is a matrix with the q leading eigenvectors l_i (scaled by the square root of the eigenvalues), i = 1, ..., q, of P as columns. The algorithm can be summarized as follows.

1) INITIALIZATION

$$x^a(0) = x_0, \quad L^a(0) = [l_1^a(0), \ldots, l_q^a(0)]. \qquad (21)$$

2) FORECAST STEP

$$x^f(k) = M[x^a(k-1)] \qquad (22)$$

$$l_i^f(k) = \frac{1}{\epsilon}\{M[x^a(k-1) + \epsilon l_i^a(k-1)] - M[x^a(k-1)]\}. \qquad (23)$$

In Eq. (23) ε represents a perturbation, often chosen close to 1:

$$\tilde{L}^f(k) = [l_1^f(k), \ldots, l_q^f(k), Q(k-1)^{1/2}]. \qquad (24)$$

In Eq. (24) a square root L̃^f(k) of the forecast covariance is constructed by adding the columns of Q(k−1)^{1/2}. However, the dimension of L̃^f(k) has now increased. Therefore a new reduced-rank square root L^f(k) is computed instead:

$$L^f(k) = \Pi^f(k)\tilde{L}^f(k), \qquad (25)$$

where Π^f(k) is a projection onto the q leading eigenvectors of the matrix L̃^f(k)L̃^f(k)^T. Using the reduction step (25) we obtain an approximate square root L^f(k) of P^f(k) with only q columns.

3) ANALYSIS STEP

$$P^f(k) = L^f(k)L^f(k)^T \qquad (26)$$

$$K(k) = P^f(k)H(k)^T[H(k)P^f(k)H(k)^T + R(k)]^{-1} \qquad (27)$$

$$x^a(k) = x^f(k) + K(k)[y^o(k) - H(k)x^f(k)] \qquad (28)$$

$$\tilde{L}^a(k) = \{[I - K(k)H(k)]L^f(k),\ K(k)R(k)^{1/2}\} \qquad (29)$$

$$L^a(k) = \Pi^a(k)\tilde{L}^a(k), \qquad (30)$$

where Π^a(k) is a projection onto the q leading eigenvectors of the matrix L̃^a(k)L̃^a(k)^T. This reduction step is again introduced to reduce the number of columns in L̃^a(k) to q in L^a(k). Note that the full covariance P^f(k) need not be computed (see Verlaan and Heemink 1997). The analysis step (29)-(30) is based on the Joseph form of the filter equations (see Maybeck 1979). This analysis step is not the most efficient procedure. Equations (29)-(30) are, however, more general than the analysis step proposed by Verlaan and Heemink (1997) and Heemink et al. (1997): they hold for arbitrary filter gains K(k) and not only for gain matrices satisfying Eq. (27). This becomes important if the RRSQRT algorithm is used as part of the POEnKF described in the next section.

For smaller values of q, Eq. (23) is for many practical problems computationally dominant, resulting in a computational effort of q + 1 model simulations. The projections (25) and (30) require O(q³) computations (see Heemink et al. 1997). As a result, for very large q this part of the algorithm becomes time consuming too (see Cañizares 1999). Errors of the algorithm are caused by the fact that in the direction of the leading eigenvectors only two ensemble members are available and, as a result, the system dynamics is in essence linearized (see Heemink et al. 1997). Other errors are introduced by the representation of the covariance matrix by only the q leading eigenvectors. Because of the truncation, the covariance matrix is always underestimated. As a result the algorithm is sensitive to filter divergence problems. These problems can be avoided by choosing q relatively large, but this obviously reduces the computational efficiency.
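The forecast step (22)-(25) can be sketched in a few lines of NumPy (again our illustration, not the authors' code, with `model` a stand-in for M[·]). The projection (25) is realized here through a thin SVD of L̃^f(k), whose leading left singular vectors, scaled by the singular values, give a rank-q square root of the reduced covariance:

```python
import numpy as np

def rrsqrt_forecast(x_a, L_a, model, Q_half, q, eps=1.0):
    """RRSQRT forecast step, Eqs. (22)-(25). L_a holds scaled modes as columns."""
    x_f = model(x_a)                                      # Eq. (22)
    # Propagate each mode by a finite difference of the model, Eq. (23)
    L_f = np.column_stack([
        (model(x_a + eps * L_a[:, i]) - x_f) / eps
        for i in range(L_a.shape[1])
    ])
    L_tilde = np.hstack([L_f, Q_half])                    # Eq. (24)
    # Reduction, Eq. (25): keep the q leading eigendirections of
    # L_tilde L_tilde^T. A thin SVD of L_tilde gives them directly, and
    # U[:, :q] * s[:q] is again a scaled-eigenvector square root of rank q.
    U, s, _ = np.linalg.svd(L_tilde, full_matrices=False)
    return x_f, U[:, :q] * s[:q]
```

The analysis step (26)-(30) manipulates the same factor; since the Joseph-form update (29) is valid for an arbitrary gain, this recursion can be reused unchanged inside the POEnKF below.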


e. Partially orthogonal ensemble Kalman filter

The ensemble Kalman filter and the reduced-rank square root filter are both of the square root type, since both algorithms are formulated in terms of the square root of the covariance matrix [respectively E^f(k) and L^f(k)]. Also, both algorithms are of the ensemble type. The ensemble filter is based on N randomly chosen ensemble members, while in the reduced-rank filter the ensemble members are chosen deterministically in the direction of the q leading eigenvectors. Because the two algorithms have a very similar algorithmic structure, they can be integrated relatively easily.

The ensemble of the POEnKF consists of two parts: the q leading eigenvectors l_i of the covariance matrix plus N randomly chosen ensemble members ξ_i. The ensemble members l_i are updated using an RRSQRT algorithm and the ξ_i by using the EnKF algorithm. The two algorithms interact with each other at the analysis step. Here, only the information of the random ensemble members orthogonal to the q leading eigenvectors is used.

The POEnKF algorithm can be summarized as follows.

1) INITIALIZATION

$$[L^a(0)\ E^a(0)] = [l_1^a(0), \ldots, l_q^a(0), \xi_1^a(0), \ldots, \xi_N^a(0)], \qquad (31)$$

where l_i^a is the ith leading eigenvector of P_0 and ξ_i^a is generated randomly to represent the uncertainty in x_0.

2) FORECAST STEP

Apply the forecast equations (22)-(25) of the RRSQRT algorithm and the forecast equations (15)-(17) of the EnKF algorithm.

3) ANALYSIS STEP

In the analysis step we merge the information of the RRSQRT and the EnKF algorithms to obtain P^f(k). In the direction of the leading eigenvectors we use the results of the RRSQRT algorithm. In the other directions the RRSQRT algorithm does not give any information and we use the EnKF results. First we compute

$$E^\perp(k) = \Pi^\perp(k)E^f(k), \qquad (32)$$

where Π^⊥(k) is a projection of the random ensemble members orthogonal to the first q ensemble members l_i^f. Then we compute

$$P^f(k) = L^f(k)L^f(k)^T + \frac{1}{N-1}E^\perp(k)E^\perp(k)^T \qquad (33)$$

$$K(k) = P^f(k)H(k)^T[H(k)P^f(k)H(k)^T + R(k)]^{-1}, \qquad (34)$$

followed by the analysis Eq. (20) of the EnKF algorithm for the ensemble E^f(k) and the analysis Eqs. (28)-(30) of the RRSQRT algorithm.

For small values of q the time propagation equations (15) and (23) for the ensemble are computationally dominating. As a result, for most practical problems the computational effort for the POEnKF is approximately N + q times the effort required for one model simulation. By integrating the ensemble Kalman filter and the reduced-rank square root filter the best of both is combined. The reduced-rank part acts as a variance reductor for the ensemble filter, reducing the statistical errors of this Monte Carlo approach significantly. On the other hand, by embedding the reduced-rank filter in an ensemble Kalman filter the covariance is not underestimated, eliminating the filter divergence problems of the reduced-rank approach (also for very small values of q). As a result, q can be chosen on the basis of efficiency arguments and not for stabilizing the filter algorithm.

Many generalizations and modifications of the POEnKF exist. One important modification to make the POEnKF more robust for strongly nonlinear problems is to take into account the information of the random ensemble members in the direction of the q leading eigenvectors of the covariance matrix. If

$$E''(k) = \Pi''(k)E^f(k) \qquad (35)$$

is the projection of the random ensemble members onto the first q ensemble members l_i, then

$$\frac{1}{N-1}E''(k)E''(k)^T \qquad (36)$$

is the ensemble Kalman filter approximation of the reduced-rank covariance matrix with rank q. Instead of Eq. (33), where this information is completely ignored, it is also possible to use the equation

$$P^f(k) = \gamma L^f(k)L^f(k)^T + \frac{1-\gamma}{N-1}E''(k)E''(k)^T + \frac{1}{N-1}E^\perp(k)E^\perp(k)^T, \qquad (37)$$

where 0 ≤ γ ≤ 1 is a weighting coefficient. The first term on the right-hand side suffers from errors caused by the approximation of the nonlinear dynamics of the system in the forecast equation; the second term is dominated by the statistical noise of the Monte Carlo method. For most practical problems the errors of the first term are relatively small and γ = 1 is a good choice. However, for very nonlinear data assimilation problems it may be better to choose γ smaller.
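To make the merging step concrete, here is a small NumPy sketch of Eqs. (32)-(34) (ours, not the authors' code; the orthogonal projection Π^⊥(k) is implemented as I − UU^T with U an orthonormal basis of the mode subspace):

```python
import numpy as np

def poenkf_gain(L_f, E_f, H, R):
    """POEnKF analysis quantities, Eqs. (32)-(34).

    L_f : (n, q) RRSQRT modes (scaled leading eigenvectors)
    E_f : (n, N) EnKF anomaly matrix, Eq. (17)
    """
    n, N = E_f.shape
    # Orthonormal basis of the mode subspace; I - U U^T projects onto
    # its orthogonal complement (the projection Pi-perp of Eq. (32)).
    U, _ = np.linalg.qr(L_f)
    E_perp = E_f - U @ (U.T @ E_f)                 # Eq. (32)

    # Merged forecast covariance, Eq. (33): exact in the mode subspace,
    # Monte Carlo in the orthogonal complement.
    P_f = L_f @ L_f.T + (E_perp @ E_perp.T) / (N - 1)

    S = H @ P_f @ H.T + R                          # innovation covariance
    K = P_f @ H.T @ np.linalg.inv(S)               # Eq. (34)
    return K, P_f
```

For a large state dimension one would of course keep P^f(k) in the factored form of Eq. (33) rather than forming it explicitly as done here for clarity.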

f. Complementary orthogonal subspace filter for efficient ensembles

The POEnKF described in the previous section uses an RRSQRT time update for a small number of directions that correspond to the largest eigenvalues of the forecast error covariance. A separate ensemble update is used for the remaining space. However, the covariance for this orthogonal subspace is obtained by first generating an ensemble for the whole state space and subsequently removing all contributions in the directions with the largest variance. Since the contribution of the ensemble is ignored in the subspace spanned by the modes of the RRSQRT part of the POEnKF, it seems more efficient to remove those contributions from the ensemble altogether. This can be achieved with a simple modification of the POEnKF, which will be referred to hereafter as the complementary orthogonal subspace filter for efficient ensembles (COFFEE) algorithm.

The COFFEE algorithm can be summarized as follows.

1) INITIALIZATION

$$[L^a(0)\ E^a(0)] = [l_1^a(0), \ldots, l_q^a(0), \xi_1^a(0), \ldots, \xi_N^a(0)], \qquad (38)$$

where l_i^a again denotes the q leading eigenvectors of P_0. However, the random ensemble members are now generated only to represent the covariance P_0 − L^a(0)L^a(0)^T. This is the part of the covariance that is not represented by the leading eigenvectors.

2) FORECAST STEP

L^a is updated using the RRSQRT update (22)-(25) and the ensemble ξ_i^a is updated using (15)-(17). However, a random ensemble is now generated with an additional covariance equal to the covariance of the truncated part in the RRSQRT half; that is, (17) changes to

$$E^f(k) = [\xi_1^f(k) - x^f(k) + \eta_1(k), \ldots, \xi_N^f(k) - x^f(k) + \eta_N(k)], \qquad (39)$$

with E[η_i(k)η_i(k)^T] = [I − Π^f(k)]L̃^f(k)L̃^f(k)^T[I − Π^f(k)]^T. In the RRSQRT forecast step, only the q leading eigenvectors are taken into account. The other eigenvectors are neglected. In the COFFEE forecast step, the part of the covariance that is truncated in the RRSQRT forecast step is also represented by the random ensembles.

3) ANALYSIS STEP

The measurement update is almost identical to that of the POEnKF, except that no projection of the ensemble is needed; that is, (32) can be omitted.
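Sampling the complementary noise η_i(k) of Eq. (39) requires a square root of the truncated covariance [I − Π^f(k)]L̃^f(k)L̃^f(k)^T[I − Π^f(k)]^T; a minimal sketch (our illustration) is:

```python
import numpy as np

def coffee_complement_noise(L_f, L_tilde, N, rng):
    """Draw eta_i for Eq. (39): noise with covariance
    [I - Pi^f] L_tilde L_tilde^T [I - Pi^f]^T, i.e. the part of the
    forecast covariance truncated by the RRSQRT reduction step (25).

    L_f     : (n, q) retained modes (span of the projection Pi^f)
    L_tilde : (n, q+m) augmented square root of Eq. (24)
    """
    U, _ = np.linalg.qr(L_f)              # orthonormal basis of retained modes
    G = L_tilde - U @ (U.T @ L_tilde)     # (I - U U^T) L_tilde: residual factor
    # G G^T equals the truncated covariance, so G w with w ~ N(0, I) works.
    return G @ rng.standard_normal((G.shape[1], N))
```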

4. Application

a. Model and experimental setup

Before using the new algorithms for real-life atmospheric chemistry data assimilation problems, we define in this paper first a test problem. Consider the 2D advection diffusion equation

$$\frac{\partial c}{\partial t} + u\frac{\partial c}{\partial x} + v\frac{\partial c}{\partial y} = \nu\frac{\partial^2 c}{\partial x^2} + \nu\frac{\partial^2 c}{\partial y^2} + S, \qquad (40)$$

with a square domain [0, L] × [0, L] and zero initial conditions. Here, c is the concentration, [u, v] is the velocity field, ν is the dispersion coefficient, and S represents the source terms. The concentration at the boundary is zero for inflow. A backward Lagrangian scheme is used to discretize these equations on a 30 × 30 grid. Preliminary experiments indicated that the numerical diffusion was still so large that the physical diffusion terms could be omitted. For all experiments hereafter the velocity field was assumed to be known and constant in time. Figure 2 shows the velocity field, which is similar to that of the well-known Molenkamp test.

FIG. 2. Velocity field.

In order to compare the algorithms proposed to other existing algorithms some twin experiments were carried out. A reference solution was generated by inserting constant emissions at grid cells {(5, 8), (7, 8), (15, 5), (3, 18), (20, 15)}. The increase of concentration per time step for these locations was {0.2, 0.5, 0.4, 0.2, 0.25}, respectively.

Observations were generated from simulated true concentrations, which were computed by adding fluctuations to the mean emissions. For this purpose an AR(1) process was used; that is, the fluctuations to the emissions per time step were computed according to

$$\tilde{\eta}_j(k+1) = \alpha_j\tilde{\eta}_j(k) + \eta_j(k), \qquad (41)$$

with η_j independent Gaussian white noise processes with E[η_j(k)] = 0 and var[η_j(k)] = 1. The index j refers to source locations {(m, n)_1 ··· (m, n)_p} = {(4, 11), (13, 3), (26, 15), (15, 10), (23, 2), (11, 9), (15, 20), (23, 10), (5, 25)}. The decays per time step are {α_1 ··· α_p} = {0.9, 0.8, 0.8, 0.9, 0.9}. Negative emissions are truncated, which creates some nonlinearity in the data assimilation problem. Finally, white observational noise with variance 0.01 is added to the true concentrations.

b. Results of the suboptimal Kalman filter algorithms

In evaluating the filter algorithms we compute the rms error

$$\mathrm{RMS} := \sqrt{\frac{1}{M^2 K}\sum_{m,n,k}\left[c_{m,n}(k) - \hat{c}_{m,n}(k)\right]^2}, \qquad (42)$$

where c_{m,n}(k) are the exact generated concentrations and ĉ_{m,n}(k) are the estimates computed, M is the number of grid points in one direction, and K is the number of time steps.
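The emission fluctuations (41), the truncation of negative emissions, and the rms metric (42) can be sketched as follows. This is our illustration only: for brevity it pairs the five decay factors with the five reference emission rates, whereas in the experiment the fluctuating sources have their own grid locations.

```python
import numpy as np

# Sketch of the twin-experiment noise (Eq. 41) and error metric (Eq. 42).
# mean_emission and the alpha/emission pairing are illustrative placeholders.
rng = np.random.default_rng(seed=0)
K = 100                                       # number of time steps
alpha = np.array([0.9, 0.8, 0.8, 0.9, 0.9])   # AR(1) decays per time step
mean_emission = np.array([0.2, 0.5, 0.4, 0.2, 0.25])

eta_tilde = np.zeros_like(alpha)
emissions = np.empty((K, alpha.size))
for k in range(K):
    # AR(1) fluctuations, Eq. (41), driven by unit-variance white noise
    eta_tilde = alpha * eta_tilde + rng.standard_normal(alpha.size)
    # Truncate negative emissions (the source of nonlinearity in the text)
    emissions[k] = np.maximum(mean_emission + eta_tilde, 0.0)

def rms_error(c_true, c_est):
    """Eq. (42): c_true and c_est have shape (K, M, M)."""
    K_steps, M, _ = c_true.shape
    return np.sqrt(((c_true - c_est) ** 2).sum() / (M * M * K_steps))
```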


FIG. 3. Reference concentration field after k = 100 time steps. Contours are shown with increment 1 for reference concentrations.

FIG. 4. True concentration field after k = 100 time steps. Contours are shown with increment 1 for reference concentrations.


FIG. 5. Concentrations computed with the EnKF with N = 20 at k = 100. Contours are shown with increment 1 for true concentrations (solid) and filter solution (dotted).

FIG. 6. Concentrations computed with the RRSQRT filter with N = 20 at k = 100. Contours are shown with increment 1 for true concentrations (solid) and filter solution (dotted).


In Figs. 3 and 4 the concentration fields of the truth run and the reference run are shown after K = 100 time steps. The + signs indicate measurement locations and the diamond signs the locations of the emissions. It can be seen clearly that the true fields are perturbed with time-varying fluctuations, while the reference solution only contains a steady emission that is advected and spreads smoothly. The rms difference between both runs is 0.665 without any data assimilation.

In Figs. 5 and 6 we show the results for the ensemble Kalman filter and the reduced-rank square root filter at the final time k = 100 for 20 ensemble members and 20 modes, respectively. These settings correspond to approximately the computational requirements of 20 model runs. For these settings neither filter performs well. Their respective rms errors are 0.838 and 0.797, which is worse than without data assimilation. The contours clearly show that both algorithms create spurious correlations if not fully converged. This result is consistent with the findings of Houtekamer and Mitchell (1998), who found that there tend to be spurious correlations at long distances when ensemble sizes are small. They introduced a data-selection procedure to reduce these effects. Although 20 eigenvalues explain almost 98% of the error covariance, both the EnKF and the RRSQRT filter still show divergence problems in the sense that the rms error using data assimilation is larger than the error without any data assimilation (see also Fig. 10). See Fig. 7 for the variance explained by the leading eigenvectors of the analysis error covariance at the final time k = 100, as computed with the RRSQRT filter with 100 modes.

FIG. 7. Eigenvalues of the error covariance matrix P(100|100), computed with the RRSQRT filter with q = 50.

In Figs. 8 and 9 the results for the POEnKF and the COFFEE algorithm are shown at the final time. In both computations 15 modes and an ensemble with N = 5 members were used.

FIG. 8. Concentrations computed with the POEnKF with q + N = 15 + 5 at k = 100. Contours are shown with increment 1 for true concentrations (solid) and filter solution (dotted).


FIG. 9. Concentrations computed with the COFFEE KF with q + N = 15 + 5 at k = 100. Contours are shown with increment 1 for true concentrations (solid) and filter solution (dotted).

FIG. 10. Rms errors vs computation times for the suboptimal Kalman filter algorithms.

Although the computational requirements are approximately the same as in the previous experiments, these filters do not diverge and actually improve on the reference solution. The rms errors are 0.642 and 0.480, respectively. This also shows that the reduced variance in the COFFEE algorithm as compared to the POEnKF does indeed reduce the errors even further in this case. In other experiments not shown here we also found that, of the various variants that are possible for the COFFEE algorithm, it is most efficient to project the Kalman gain onto the subspace spanned by the modes of the RRSQRT part of the algorithm, that is, to effectively ignore correlations computed from the EnKF part of the algorithm but to include their variance in the innovation error covariance.

Figure 10 shows the convergence of the four suboptimal schemes used in this paper. For the POEnKF and the COFFEE algorithm the number of random ensemble members was kept fixed at N = 5. We observed during the experiments that for a small number of modes or ensemble members the rms errors are sensitive to the random numbers used in the computation. To reduce this random effect on the results the same set of random numbers was used for all the experiments, which made the results more comparable. However, the seemingly random behavior for small ensembles (or modes) is still influenced by the set of random numbers used. When comparing the algorithms, it can be seen that both the EnKF and the RRSQRT algorithms seem to diverge for small ensemble size. The full (extended) Kalman filter produces an rms error of 0.46.

Both the POEnKF and the COFFEE algorithm improve the convergence for small ensemble sizes, where the COFFEE variant is slightly more efficient. However, if the number of modes is increased the random part of these algorithms seems to hamper quick convergence of the RRSQRT part of the algorithm. In this experiment accurate results are obtained with the COFFEE algorithm at only 40% of the computation time compared to the EnKF (q = 20 and N = 50).

5. Conclusions

In this paper we propose to use a reduced-rank square root Kalman filter as a variance reductor for the ensemble Kalman filter to reduce the statistical errors of this Monte Carlo approach. The resulting partially orthogonal ensemble filter and its variants combine the best of both. It is less sensitive to divergence problems and much faster than both the reduced-rank approach and the ensemble Kalman filter. Next, the POEnKF and COFFEE algorithms will be applied to real-life, nonlinear data assimilation problems in atmospheric chemistry and shallow water flow forecasting. Furthermore, we will also study the performance of the new algorithms in nonlinear, unstable systems.

REFERENCES

Burgers, G., P. J. Van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719-1724.

Cañizares, R., 1999: On the application of data assimilation in regional coastal models. Ph.D. thesis, Delft University of Technology, Delft, Netherlands, 133 pp.

Cohn, S. E., and R. Todling, 1995: Approximate Kalman filters for unstable dynamics. Second Int. Symp. on Assimilation of Observations in Meteorology and Oceanography, Tokyo, Japan, WMO, 241-246.

——, and ——, 1996: Approximate data assimilation schemes for stable and unstable dynamics. J. Meteor. Soc. Japan, 74, 63-75.

Dee, D. P., 1991: Simplification of the Kalman filter for meteorological data assimilation. Quart. J. Roy. Meteor. Soc., 117, 365-384.

Evensen, G., 1994: Sequential data assimilation with a nonlinear QG model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10 143-10 162.

——, and P. J. Van Leeuwen, 1996: Assimilation of Geosat altimeter data for the Agulhas Current using the ensemble Kalman filter with a quasigeostrophic model. Mon. Wea. Rev., 124, 85-96.

Ghil, M., and P. Malanotte-Rizzoli, 1991: Data assimilation in meteorology and oceanography. Advances in Geophysics, Vol. 33, Academic Press, 141-266.

Hammersley, J., and D. Handscomb, 1964: Monte Carlo Methods. Wiley and Sons.

Heemink, A. W., K. Bolding, and M. Verlaan, 1997: Storm surge forecasting using Kalman filtering. J. Meteor. Soc. Japan, 75, 305-318.

Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796-811.

Lermusiaux, P. F. J., 1997: Error subspace data assimilation methods for ocean field estimation: Theory, validation and applications. Ph.D. thesis, Harvard University, 402 pp.

Maybeck, P. S., 1979: Stochastic Models, Estimation and Control. Vol. 141-1, Mathematics in Science and Engineering, Academic Press, 423 pp.

Pham, D., J. Verron, and M. Roubaud, 1998: A singular evolutive extended Kalman filter for data assimilation in oceanography. J. Mar. Syst., 16, 323-340.

Verlaan, M., and A. W. Heemink, 1995: Data assimilation schemes for non-linear shallow water flow models. Proc. Second Int. Symp. on Assimilation of Observations, Tokyo, Japan, WMO, 247-252.

——, and ——, 1997: Tidal flow forecasting using reduced-rank square root filters. Stochastic Hydrol. Hydraul., 11, 349-368.
