
APRIL 2017 ATENCIA AND ZAWADZKI 1381

Analogs on the Lorenz Attractor and Ensemble Spread

AITOR ATENCIA AND ISZTAR ZAWADZKI Department of Atmospheric and Oceanic Sciences, McGill University, Montreal, Quebec, Canada

(Manuscript received 31 March 2016, in final form 3 October 2016)

ABSTRACT

Intrinsic predictability is defined as the uncertainty in a forecast due to small errors in the initial conditions. In fact, not only the amplitude but also the structure of these initial errors plays a key role in the evolution of the forecast. Several methodologies have been developed to create an ensemble of forecasts from a feasible set of initial conditions, such as bred vectors or singular vectors. However, these methodologies consider only the fastest growth direction globally, which is represented by the Lyapunov vector. In this paper, the simple Lorenz 63 model is used to compare bred vectors, random perturbations, and normal modes against analogs. The concept of analogs is based on the ergodicity theory to select compatible states for a given initial condition. These analogs have a complex structure in the phase space of the Lorenz attractor that is compatible with the properties of the nonlinear chaotic system. It is shown that the initial averaged growth rate of errors of the analogs is similar to the one obtained with bred vectors or normal modes (fastest growth), but they do not share other properties or statistics, such as the spread of these growth rates. An in-depth study of different properties of the analogs and the previous existing perturbation methodologies is carried out to shed light on the consequences for forecasting of the choice of the perturbations.

Corresponding author e-mail: Aitor Atencia, aitor.[email protected]

1. Introduction

From the seminal paper of Lorenz (1963) about nonlinear chaotic systems, predictability theory has evolved from the quantification of the quality of a forecast to a novel perspective where sources of uncertainty and their consequences for the limit of predictability are studied. Among the main sources of uncertainty, infinitesimal errors in the initial conditions and their growth due to the nonlinearities of the system equations define the intrinsic predictability of the system or first-kind predictability (Chu 1999).

Two main perturbation methodologies are widely used to perturb the initial conditions: singular vectors (Lorenz 1965; Palmer 1993) and bred vectors (Toth and Kalnay 1993; Kalnay et al. 2002). Magnusson et al. (2008) compared both perturbation methodologies (and a new bred vector methodology by applying principal component analysis) to determine the different fastest perturbation growths. The optimal methodology for perturbing the initial conditions remains an open question and different methods are still used and continuously developed (Pazó et al. 2013). However, all these methodologies are focused on the fastest growth direction while compromising other properties of the initial perturbations. Nowadays, the perturbation methods applied in many forecasting centers are based on ensemble data assimilation introducing other properties, such as the background covariance matrix (Buizza et al. 2005).

To study more complex behaviors from the initial uncertainties in a chaotic system, low-order nonlinear systems were studied. Even though these models lack realism in comparison with the atmosphere, they compensate by providing a better framework to obtain statistically significant results and having an easier interpretation (Farrell 1990).

The growth of initial errors, or the forecast spread among an ensemble, depends on several factors: location in the phase space (Nese 1989), direction of the initial perturbation (Lorenz 1965), and the forecast lead time (Trevisan 1995). Despite the high variance of these errors, Lacarra and Talagrand (1988) found a transient growth that affects the short-range average of errors and Trevisan and Legnani (1995)

related this behavior to the amplitude of the initial error. However, Nicolis (1992) showed that the first two moments of the distribution (mean and variance) are not enough to characterize the predictability due to a bimodal distribution of errors in the Lorenz 63 system.

MONTHLY WEATHER REVIEW, VOLUME 145. DOI: 10.1175/MWR-D-16-0123.1. © 2017 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

To entirely understand the difficulty of the problem, the structure of the errors in the attractor of the chaotic system has to be studied (Judd et al. 2008). A way to obtain this structure is through the use of analogs in the attractor. The idea of analogs was introduced by Lorenz (1969) to compensate for the unknown real atmosphere system of equations. Analogs were used in the Lorenz 63 system (Trevisan 1993) to compare their properties to those of the random perturbations. However, as stated in Trevisan (1993, p. 1017): "The limited total time of the model integration, dictated by computational costs, did not allow, even in such a simple model, one to find analogs sufficiently close to one another so that the initial error could be considered small." Nowadays, the computer power has increased and small enough error analogs can be obtained. To ensure the validity of the analogs used in this study, the multiplicative ergodic theorem (Oseledets 1968) is verified within our states. The main assumption of this theory is the equivalence between temporal averages of a given observable and the average of identical processes at a given time. This ensures the analogs have the desired properties and do not require the use of shadowing filtering (Judd and Stemler 2010) to create perturbations compatible with the attractor of the chaotic system.

The main goal of the present study is to compare the results obtained with previous perturbation techniques with analogs obtained from a long enough dataset that guarantees the ergodicity assumption to be true. The model used in the paper is the Lorenz 63 model, and it is introduced in section 2 together with the constructed dataset of observations. The perturbation methodologies and the details of the definition of analogs are described in section 3. An in-depth comparison of several properties of the different perturbations is presented in section 4 followed by a discussion. Finally, the conclusions of the paper are summarized.

2. The model

The model used in this article to obtain the temporal data used as observations is the well-known chaotic system first introduced by Lorenz (1963). Lorenz truncated the solution of the finite-amplitude convection system derived by Saltzman (1962) and removed the variables that tended to zero, obtaining a system of three coupled differential equations:

$$\frac{dX}{dt} = \sigma (Y - X) \tag{1}$$

$$\frac{dY}{dt} = -XZ + rX - Y \tag{2}$$

$$\frac{dZ}{dt} = XY - bZ, \tag{3}$$

where the three parameters $\sigma$, $r$, and $b$ are positive and are the Prandtl number, the Rayleigh number, and a physical proportion, respectively.

In the early 1960s, Lorenz discovered the chaotic behavior of his simplified three-dimensional system by setting the parameters to $\sigma = 10$, $r = 28$, and $b = 8/3$. This chaotic behavior, which appears for a wide range of parameter values, is often adopted as a low-order test bed for atmospheric predictability studies (Palmer 1993; Evans et al. 2004; Magnusson et al. 2008; among others). A time series of the variables in the Lorenz system can be obtained by using numerical methods. In this work, the fourth-order Runge–Kutta forward method has been used with a time step of $\Delta t = 0.01$. Any random point of the three-dimensional phase space composed of the variables X, Y, and Z evolves into the system attractor. The Lorenz system attractor has a dimension of around 2.07, which according to Ruelle and Takens (1971) is called a strange attractor because its structure has a noninteger dimension.

The attractor $A$ and the realm of attraction $\rho(A)$ are two subsets in the phase space of variables $M$. The mathematical definition of the attractor can be found in Milnor (2004), and it is divided into two conditions for a closed subset $A \subset M$:

• The realm of attraction $\rho(A)$, consisting of all points $x \in M$ for which $\omega(x) \subset A$,¹ must have a strictly positive measure.
• There is no strictly smaller $A' \subset A$ so that $\rho(A')$ coincides with $\rho(A)$ up to a set of measure zero.

¹ The omega limit set [$\omega(x)$] is the collection of all accumulation points for the sequence $x, G(x), G^{2}(x), \ldots$ of successive images of $x$, with $G$ being a nonlinear mapping.

The rigorous demonstration of the existence of an attractor, such as the one in the Lorenz system, requires solving the system of equations using topological ordinary differential equation (ODE) properties as shown in Tucker (2002). From the point of view of practical applications, the use of density in the phase space is a less rigorous but practically affordable way to determine the realm of attraction and attractor. Figure 1 shows three cross


sections (on the Y–Z, X–Z, and X–Y planes) of the density of states for the Lorenz system with two different initializations. Figure 1a has an initial point every 0.5 units in X, Y, and Z in the ranges [−40, 40], [−40, 40], and [−20, 60], respectively. At each of these initial points, 50 random perturbations within a range of 0.1 units are created. The Lorenz system is solved for all these points for a length of 50 time units (5000 time steps). The density is plotted in logarithmic units. More than two orders of magnitude in the density can be observed as the difference between the realm of attraction and the attractor. To ensure our high-density space is the actual attractor, the second condition was verified. A point inside the attractor is selected and the Lorenz system is solved for 10 000 time units (1 000 000 time steps). The density of this trajectory is shown in Fig. 1b. Both densities occupy the same subset, suggesting that a smaller closed set of the attractor does not exist.

FIG. 1. Density of states for the Lorenz system in different cross sections. (a) The surface for (left to right) X is equal to 0, Y is equal to 0, and Z is equal to 40 on the plane Y–Z, X–Z, and X–Y, respectively. (b) An initial point is chosen from inside the attractor obtained in the study shown in (a). The difference between (a) and (b) is the initialization points: (a) has an initial point every 0.5 in X, Y, and Z in the range [−40, 40], [−40, 40], and [−20, 60], respectively (plus 50 random perturbations). The lower row has only an initial point chosen from inside the attractor obtained in the upper row study.

Once the attraction subset or attractor is defined, the data can be created, ensuring all the states belong to the system attractor. Ergodicity states that the expected value of a process can be estimated by its temporal average. This property can be understood as the system "forgetting" its initial state (Miller et al. 2010). This idea seems to contradict the fact that chaotic systems are sensitive to initial conditions. However, if the Lorenz system is run long enough, it will not have information about the initial position. To determine how long is long enough, an experiment is carried out. Two opposite states (states in different wings of the attractor) are chosen. The probability distribution is computed from each trajectory created from these two opposite initial states. Figure 2 shows the significance level of both distributions belonging to the same process as a function of the length of the run. It can be observed that after 250 000 time units (or $2.5 \times 10^{7}$ time steps), both runs are indistinguishable. Figure 2 also shows the Kolmogorov–Smirnov statistic (K–S distance), or the maximum difference between the two cumulative distribution functions used to compute the significance of the two different samples hypothesis (Kolmogorov 1933; Smirnov 1948). After determining the minimum length of the run to ensure that ergodicity is satisfied, a new run is carried out with


a length of 400 000 000 time steps or 4 000 000 time units. This run is used as our observation dataset.²

² In this work, model errors are neglected by using the same model in the construction of the observation dataset and the forecast system.

FIG. 2. Significance level (gray solid line; left axis) as a function of the length of the run in time steps. This significance level is computed by the K–S test; its statistic is plotted on the graph (black solid line; right axis). Significance values close to 0 mean the hypothesis of both samples belonging to the same distribution is not verified. The dashed horizontal line stands for when the hypothesis of belonging to the same distribution is satisfied at the 0.99 significance level.

3. Perturbations

Perturbations of a given state in the attractor provide information on the sensitivity of the system to the initial errors and, consequently, the intrinsic predictability of the system itself. These perturbations can have different properties, from being totally random around the initial state to the linear tangent approximation applied in the derivation of the singular vectors. In this section, the assumptions made on ensemble generation of the most common perturbation methodologies are discussed. Specifically, a perturbation based on the ergodic property of our observations, called analogs for its similitude with the term used by Lorenz (1969), is also presented, highlighting the subtle differences between both definitions. These perturbations are applied and studied in the following section.

a. Analogs

The concept of analogs was introduced by Lorenz (1969) in his attempt to estimate weather predictability using observations. At that time, analogs were defined as two states that resemble each other. Keeping in mind the original definition, Van den Dool et al. (2003) demonstrated that the statistical combination of analog members is also called an analog. Consequently, the original idea of analog situations corresponds to only the phase-space distance without imposing other properties. However, it is known that chaotic systems have interesting properties in the phase space, such as fractal dimension (Ruelle and Takens 1971), transient exponential error growth (Trevisan and Legnani 1995), and scaling properties (Geisel and Nierwetberg 1982), among others. Analogs could more generally be defined as a set of close states on an attractor of a nonlinear system. This subtle difference in the definition, requiring the states to belong to the attractor, is enough to guarantee that these analogs have the chaotic systems' properties. In addition, another property is required in the current definition of the analogs: an expression of the ergodicity of the system. A time series of observations, if long enough, will sample the states of the system's attractor. Here, the ergodicity theorem is used to theoretically define perturbed states ($X_2 = X_1 + \delta X$) as the neighboring states of $X_1$ that reside on the attractor as determined by a very long integration of the Lorenz system $\{X_2(t) = G^{t}[X_1(t = 0)]\}$. In this definition the Lorenz system (or any system of nonlinear equations) is defined as the nonlinear mapping $G$. Consequently, the infinitesimal perturbation contains all the information of the chaotic system [$\delta X = G^{t}(X_1) - X_1$] and these analogs would satisfy the previously mentioned properties, such as fractal dimension, exponential growth, and scaling properties.

b. Bred vectors

Bred vectors (BV) were first introduced by Toth and Kalnay (1993) to perturb the NCEP forecast in a way that allows for the representation of atmospherically realistic structures. BV are finite perturbations periodically rescaled at chosen times, defined as $t_r = pT$ ($p \in \mathbb{N}^{+}$), where $T$ is the rescaling interval and $p$ is a natural positive number that defines the rescaling time. The BV perturbations are obtained by using the full nonlinear model as opposed to other techniques where linear approximations are used (an example is the singular vector). A control (unperturbed) simulation $x_c = [X_c, Y_c, Z_c]$ and a perturbed one ($x_p$), in which an initial perturbation is randomly created with a magnitude of $\varepsilon$, are compared at the selected time ($t_r$). The difference between them is computed as follows:

$$\Delta x(t_r) = x_c - x_p \tag{4}$$

and then rescaled to the initial magnitude. The resultant vector is the BV:

$$b(t_r) = \varepsilon \frac{\Delta x(t_r)}{\|\Delta x(t_r)\|}, \tag{5}$$

where $\|\cdot\|$ is the Euclidean norm of the vector.
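As a rough illustration of Eqs. (4) and (5), the breeding step can be sketched in Python for the Lorenz 63 system. This is a minimal sketch and not the authors' code: the function names (`lorenz`, `rk4_step`, `integrate`, `norm`, `breed`), the amplitude `epsilon = 1e-3`, the eight-step rescaling interval, the number of cycles, and the Gaussian draw used for the initial direction are all assumptions made for illustration. Only the parameter values and the time step are taken from section 2.

```python
import math
import random

SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0  # parameter values quoted in section 2
DT = 0.01                            # time step quoted in section 2


def lorenz(state):
    """Right-hand side of the Lorenz 63 equations (1)-(3)."""
    x, y, z = state
    return (SIGMA * (y - x), -x * z + R * x - y, x * y - B * z)


def rk4_step(state, dt=DT):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, n_steps):
    """Apply n_steps RK4 steps to the state."""
    for _ in range(n_steps):
        state = rk4_step(state)
    return state


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))


def breed(control, epsilon=1e-3, steps_per_cycle=8, n_cycles=50, seed=0):
    """Return (final control state, bred vector) after n_cycles cycles.

    epsilon, steps_per_cycle, n_cycles, and seed are illustrative
    assumptions, not values taken from the paper.
    """
    rng = random.Random(seed)
    raw = [rng.gauss(0.0, 1.0) for _ in range(3)]
    scale = epsilon / norm(raw)
    bred = tuple(scale * c for c in raw)  # random start of magnitude epsilon
    for _ in range(n_cycles):
        # x_p = x_c - b, so that x_c - x_p reproduces the current vector.
        perturbed = tuple(c - b for c, b in zip(control, bred))
        control = integrate(control, steps_per_cycle)
        perturbed = integrate(perturbed, steps_per_cycle)
        diff = tuple(c - p for c, p in zip(control, perturbed))  # Eq. (4)
        scale = epsilon / norm(diff)
        bred = tuple(scale * d for d in diff)                    # Eq. (5)
    return control, bred
```

For instance, `breed(integrate((1.0, 1.0, 1.0), 2000))` first spins an arbitrary starting point up toward the attractor and then returns the control state together with a bred vector whose norm equals `epsilon` by construction.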


FIG. 3. Example of the SVD applied to a given trajectory (initial point is 7, 11, 17) of the Lorenz system. This scheme shows the application of the tangent linear model ($\mathbf{G}$) for a temporal interval of $t - t_0$ (eight time steps in this example). (a) Perturbations of size 1, whereas (b) the ellipsoid is obtained by applying the model. The three axes for defining the time $t$ ellipsoid are the final singular vector multiplied by the singular value. These final singular vectors correspond to the initial singular vectors at the previous time step $t_0$.

The BV are used to create a new perturbed simulation [$x_p'(t_r)$] at the given time. The previously described procedure is applied for the new perturbed simulation. After several cycles of this procedure, the breeding vectors converge in a statistical sense. The number of cycles needed for convergence depends on the time scales of the dominant instabilities (Toth and Kalnay 1993). An ensemble of BV can be obtained by using different small random initial perturbations following the same procedure. However, all the bred vectors tend to the leading Lyapunov vector and, consequently, they are linearly dependent (Kalnay et al. 2002).

c. Singular vectors, normal modes, and random perturbations

The straightforward way to perturb a system, when there is no information about the system or properties that these perturbations have to fulfill, is to use random uniformly distributed perturbations.³ In this paper, 2500 uniformly distributed random perturbations have been created around the selected initial point.

³ Uncorrelated noise is not a characteristic of atmospheric phenomena nor of most measurement errors and consequently random perturbations are not consistent with actual uncertainties.

The linear evolution of any small perturbation [$\Delta x(t_0) = x_c(t_0) - x_p(t_0)$] can be studied by using the tangent linear model of the system in differential form around the point $x_c$:

$$\Delta\dot{x} = \left.\frac{\partial G}{\partial x}\right|_{x = x_c(t_0)} \times \Delta x = J \times \Delta x, \tag{6}$$

where $J$ is the Jacobian of $G$. The tangent linear model is obtained by integrating this equation between $t_0$ and $t$. Lorenz (1965) introduced (using different names) the use of the tangent linear model when linearizing the system of equations and neglecting terms of the second order. He found that using this approach, the evolution of small perturbations can be obtained by

$$\Delta x(t) \cong \mathbf{G}(t_0, t) \times \Delta x(t_0), \tag{7}$$

where $\mathbf{G}(t_0, t)$ is the propagator of the tangent linear model. The eigenvalues of the propagator $\mathbf{G}(t)$ provide information about the temporal evolution of the perturbations. For positive (negative) values, the perturbation grows (decays) in the eigenvector direction.
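Equations (6) and (7) can be illustrated numerically with a short Python sketch. It is offered under stated assumptions rather than as the method used in the paper: the function names (`jacobian`, `matmul`, `rhs`, `axpy`, `propagate`) are hypothetical, and integrating the propagator jointly with the state by the same RK4 scheme is one possible choice that the text does not specify. The Jacobian follows directly from Eqs. (1)-(3).

```python
SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0  # parameter values quoted in section 2
DT = 0.01                            # time step quoted in section 2


def lorenz(s):
    """Right-hand side of the Lorenz 63 equations (1)-(3)."""
    x, y, z = s
    return [SIGMA * (y - x), -x * z + R * x - y, x * y - B * z]


def jacobian(s):
    """J in Eq. (6): derivative of the right-hand side at state s."""
    x, y, z = s
    return [[-SIGMA, SIGMA, 0.0],
            [R - z, -1.0, -x],
            [y, x, -B]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rhs(s, m):
    """Joint tendency of the state and the propagator (dM/dt = J M)."""
    return lorenz(s), matmul(jacobian(s), m)


def axpy(s, m, ds, dm, h):
    """Return (s + h*ds, m + h*dm)."""
    return ([a + h * b for a, b in zip(s, ds)],
            [[a + h * b for a, b in zip(ra, rb)] for ra, rb in zip(m, dm)])


def propagate(s, n_steps, dt=DT):
    """Integrate state and propagator together with RK4, from M = I."""
    m = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(n_steps):
        k1 = rhs(s, m)
        k2 = rhs(*axpy(s, m, *k1, dt / 2))
        k3 = rhs(*axpy(s, m, *k2, dt / 2))
        k4 = rhs(*axpy(s, m, *k3, dt))
        ds = [(a + 2 * b + 2 * c + d) / 6
              for a, b, c, d in zip(k1[0], k2[0], k3[0], k4[0])]
        dm = [[(a + 2 * b + 2 * c + d) / 6
               for a, b, c, d in zip(r1, r2, r3, r4)]
              for r1, r2, r3, r4 in zip(k1[1], k2[1], k3[1], k4[1])]
        s, m = axpy(s, m, ds, dm, dt)
    return s, m
```

Calling `propagate([7.0, 11.0, 17.0], 8)` returns the state after eight time steps and a 3x3 matrix approximating the propagator over that interval. Multiplying this matrix by a small initial difference should closely reproduce the difference between two nonlinear runs, which is the content of Eq. (7).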


These eigenvectors are the so-called singular vectors (Lorenz 1965) used to optimally perturb the initial conditions at ECMWF (Palmer 1993).

To obtain the singular vectors and the singular values (or eigenvalues), a singular value decomposition (SVD) has to be carried out. According to SVD theory (Golub and Van Loan 1996, 70–71) for a given matrix, such as the propagator $\mathbf{G}(t)$, a relation among the initial singular vectors ($u_i$), the final singular vectors ($v_i$), and the singular values is obtained as follows:

$$\mathbf{G} = U S V^{T} \;\rightarrow\; \mathbf{G} \times v_i = s_i \times u_i. \tag{8}$$

Figure 3 shows an example of the SVD theory applied to a given point of the Lorenz system. Initially, a sphere of radius 1 of perturbations around the point [7, 11, 17] is selected. The evolution ($t - t_0$) of these perturbations gives as the result the ellipsoid depicted in Fig. 3b. These results are obtained by applying uniformly distributed random perturbations to the Lorenz system. The SVD theory gives as a result the orthonormal vectors (directions) of the axes that define the obtained ellipsoid and the value (singular values) of each axis. The initial singular vector corresponds to the initial direction and the perturbations that result in the final singular vectors (axes of the ellipsoid). For this reason, random perturbations inside of the sphere can be summarized by three vectors using the SVD theory at each point. The maximum direction of growth is characterized by the first singular vector, but the distribution inside the sphere has to be studied by computing the result of all the random perturbations. A final comment about the random perturbations and the singular vectors is that the singular vector is valid only as far as the tangent linear approximation is valid; or in other words, the perturbations are small enough. For this reason, in the present work the random perturbations are used instead of the singular vectors, without forgetting that the three singular vectors are the three axes that define the ellipse that contains all the random perturbations. Consequently, the results obtained by applying the random perturbations and observing their evolution are similar to an ellipsoid defined by the three singular vectors (SV) and following its evolution.

Finally, the normal mode method is introduced. This technique identifies the fastest-growing modes by using the eigenvectors of the instantaneous Jacobian [$J$ in Eq. (6)]. To extend the analysis with another random method in 2D (as the bred vector), the two leading singular vectors for each point are computed. Random perturbations are created in the plane spanned by these two vectors. This method is equivalent to the normal modes obtained in Magnusson et al. (2008).

4. Results of the perturbed simulations

Having summarized the different perturbation methodologies, the results of applying them to the Lorenz system are now compared. To ensure statistical significance of the results, 2500 perturbations are applied at each initial state. To account for the dependence on the position in the phase space, 2000 different initial states have been studied within the attractor. Different properties are analyzed in this study. First, the growth of error is computed (i) due to its direct relation with the predictability of the system. The fractal properties of the perturbations are analyzed (ii) to see whether the perturbations obtained are compatible with the system of equations and its chaotic nature. Finally, the probability distribution of initial states and its evolution in time is also studied (iii) because of its importance in ensemble forecasting.

a. Growth of error and predictability

Following previous works such as Bowler (2006) and Pazó et al. (2013), the performance of the perturbations is evaluated by calculating the root-mean-square distance (RMSD) of the ensemble mean versus the unperturbed trajectory ($x_c$).⁴ In this experiment, the model is perfect and accordingly the "truth" (control run) is a trajectory obtained by integrating the equations from the control-initial point (assuming there is no error in this initial control point).

⁴ We avoid here the commonly used term "RMS error" since an ensemble member initialized by an analog is not intended to, nor can it, give the same forecast as the "truth."

In the RMSD equation, the distance is measured as the Euclidean distance ($\|\cdot\|$) between the perturbed run ($x_p^{i}$) and the control (unperturbed) run. This distance is exactly equivalent to the square error of the perturbed member. An early work in ensemble forecasting (Murphy 1988) showed the relation between spread and skill of an ensemble forecast. A relation between three quantities (mean-square error of the perturbed member, spread of ensemble forecast, and skill of the ensemble forecast) can be obtained. The ensemble average ($\langle\cdot\rangle$) of the square error of the perturbed member can be written as

$$\langle \|\Delta x_p^{i}(t)\| \rangle = \frac{1}{N}\sum_{i=1}^{N} \|[x_p^{i}(t) - x_c(t)]^{2}\| = \langle \|x_p^{i}(t)\|^{2} \rangle + \|x_c(t)\|^{2} - 2\|x_c(t)\| \times \|x^{EM}(t)\|, \tag{9}$$

where the index $i$ stands for each of the $N$ ensemble members. The spread (variance) of the ensemble


forecast can be formulated as the departure of the ensemble members from the ensemble mean [$x^{EM}(t)$]:

$$\mathrm{Var}[\|x_p^{i}(t)\|] = \langle [\|x_p^{i}(t)\| - \|x^{EM}(t)\|]^{2} \rangle = \langle \|x_p^{i}(t)\|^{2} \rangle - \|x^{EM}(t)\|^{2}. \tag{10}$$

Finally, the skill can be measured by using the mean-square distance between the ensemble mean and the control run:

$$\mathrm{MSd}[\|x_p^{i}(t)\|] = \|[x^{EM}(t) - x_c(t)]^{2}\| = \|x^{EM}(t)\|^{2} + \|x_c(t)\|^{2} - 2\|x_c(t)\| \times \|x^{EM}(t)\|. \tag{11}$$

Consequently, the next relation between the three magnitudes is derived:

$$\langle \|\Delta x_p^{i}(t)\| \rangle - \mathrm{Var}[\|x_p^{i}(t)\|] = \mathrm{MSd}[\|x_p^{i}(t)\|]. \tag{12}$$

For this reason in the present work the error of different ensemble members is averaged, taking into account that this quantity can be exchanged with the mean-square error and the variance of the ensemble members. All these quantities provide complementary information, and they are all related to the growth of errors in chaotic systems. For this reason in this paper not only is the averaged error of the ensemble member studied but also the distribution of these errors.

By averaging the ensemble member errors, the initial growth of error follows a more complex behavior than a regular exponential growth as demonstrated by Nicolis et al. (1995). They showed that a subexponential initial behavior is followed (in some cases) by a superexponential growth.⁵ This can be observed in Fig. 4, where this behavior in the growth of errors (defined by the RMSD averaged over the attractor) is valid for the first 100 time steps or 1 time unit. After this period, the nonlinearities play a role and the behavior changes from sub–superexponential to a power-law growth of error. Finally, the saturation is reached around 1000 time steps or 10 time units. This saturation is equivalent to the loss of predictability of the system; or in other words, when the states are not distinguishable from any randomly selected state in the attractor. These three different stages of the error growth can be observed clearly in the RMSD averaged over the whole attractor. However, the local growth of error fluctuates more (as can be observed in the gray lines of Fig. 4) due to the rotation around the center of each wing.

⁵ Examples and conditions for this behavior can be found in Nicolis et al. (1995).

FIG. 4. Averaged RMSD over the attractor as a function of time for the AP (black), RP (blue), NM (green), and BV (red). The gray lines are the RMSDs for 20 different positions in the attractor to show the dependence on the position in the phase space. A first subexponential growth can be observed until 1.0 t.u., when the growth seems to evolve toward superexponential growth (an orange dashed curve of exponential growth is plotted for comparison purposes); afterward, a power law is driving the growth of errors until saturation is reached around 10 t.u. (henceforth this time is defined as the time of total loss of predictability and defines the upper horizontal axis).

Evans et al. (2004) found the dependency on the location in the attractor of the growth rate (or predictability associated with the Lyapunov exponent). The predictability map, which is the growth rate for different starting locations, is shown in Fig. 5 for the four different perturbation methods. The result is the average of the mean-square error for 2500 different ensemble members or perturbations. It can be observed that the mean error growth rates for analog and normal mode perturbed runs are similar (but not equal). The bred vectors have similar values for most of the points, but the red points have larger values than for the analog states. The growth rate is significantly smaller for the random perturbations than for the other methodologies; however, the maximum and minimum growth rates are located in the same place of the phase space.

The previous results are averages for the 2500 perturbations at each point in the Lorenz attractor. In addition, the direction of the perturbation is important as shown in section 3c. Bred vectors are oriented in the direction of growth according to the nonlinear model (the bred vectors converge with the dominant Lyapunov vector but can be smaller than the faster growth by the first SV); consequently, as can be seen also in Fig. 5, the local


growth rate is maximized as designed. Random perturbations (RP) have the three singular vectors to describe the evolution of the sphere for a tangent linear approximation, but the vector direction of the initial perturbation is homogeneous (all directions are equivalent). Consequently, the mean average is smaller than the growth obtained with the bred vectors. The maximum growth is described by the dominant singular vector; for this reason, the normal modes (NM) that are constructed by spanning the plane formed by the first and second singular vectors have a larger mean. Finally, the analog perturbations look for possible states of the nonlinear model from a long run. The spatial structure of these perturbations will be studied in section 4b to have a better understanding of some possible causes for these differences. It was pointed out above that the analog perturbations lead to an averaged local growth rate similar to that of the bred vectors. The direction of the perturbation is not imposed but the distance to the control run is. For this reason, it is interesting to plot also the standard deviation of the local growth rates to account for the variability in the direction of the perturbations (Fig. 6).

FIG. 5. Predictability map obtained for different types of perturbations: (a) AP, (b) BVP, (c) RP, and (d) NM. The slope is computed for a period of eight time steps or 0.08 t.u., which is equivalent to the local Lyapunov vector or the instantaneous growth rate (Magnusson et al. 2008). The 2000 dots are plotted where the control run is initialized and show the dependence on the location of the predictability. The value is the average of the errors for 2500 perturbed runs.

When predictability is defined by the fastest-growing perturbation of the state, the implicit assumption is that all perturbations are present simultaneously and they all generate modes of growth. The fastest modes will rapidly dominate and determine the system's evolution. The situation with analog perturbations (AP) is different: although all analogs are possible outcomes of perturbations around a state, only one at a time will actually take place. In fact, the spread of the perturbations by analogs describes the a priori knowledge of the growth of uncertainties in the system's evolution. With bred vectors one artificially forces the state into the direction of the fastest growth only, creating an appearance of order where order does not exist. Analogs reveal the full extent of chaotic


uncertainty. Normal modes partially capture this growing error mechanism by eliminating one dimension (the third singular vector is the least growing direction).

FIG. 6. Variability of the local growth rate map obtained for different types of perturbations: (a) AP, (b) BVP, (c) RP, and (d) NM. The spread is computed as the standard deviation of all the local growth rates obtained by the 2500 perturbations. The 2000 dots are plotted where the control run is initialized and show dependence on the location in the Lorenz attractor.

b. Spatial properties and fractality

To better understand the source of differences among the growth rate of analogs and existing perturbation methods [RP, NM, and bred vector perturbations (BVP)], the distribution of perturbations in the phase space is studied in this subsection. Figure 7 shows the X–Z projection of the perturbation for the initial time and three later time steps. The shapes of the perturbations by analogs are dependent on the position in the attractor, but the conclusions drawn here are valid for all of them. Figure 7 is valid as a qualitative example.

The first column in Fig. 7 shows the difference among random perturbations, bred vectors, normal modes, and analogs for the initial point (4.553, 5.613, 20.212). The structure of the 2500 random perturbations is a 3D sphere with equal distribution of states inside which cover all the possible directions. As mentioned in section 3c, one direction in this sphere is equivalent to the leading or initial singular vector that evolves into the final singular vector in the direction of the maximum axis of the ellipse in the first row, second column. The number of states is uniformly distributed in the first column, but in the second column, an increase of the density is appreciated in the center of the ellipse. This process is the main cause for obtaining a lower growth rate than with the other two perturbation methods.
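One simple way to build a cloud of random perturbations that fills a 3D sphere with equal density and covers all directions is sketched below in Python. It is only an assumption about how such a cloud could be generated, since the text does not describe the sampling procedure: the function name `random_ball_perturbations`, the radius of 0.1, and the seed are hypothetical. The direction comes from a normalized Gaussian draw and the radius from the cube root of a uniform draw, which gives a uniform density inside the ball.

```python
import math
import random


def random_ball_perturbations(n, radius, seed=0):
    """Return n perturbation vectors uniformly distributed inside a ball.

    The sampling scheme is an illustrative assumption, not the procedure
    documented in the paper.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # An isotropic Gaussian draw gives a uniformly distributed direction.
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        length = math.sqrt(sum(c * c for c in g))
        # The cube root makes the density uniform in volume, not in radius.
        r = radius * rng.random() ** (1.0 / 3.0)
        points.append(tuple(r * c / length for c in g))
    return points
```

With this scheme, about one-eighth of the 2500 points returned by `random_ball_perturbations(2500, 0.1)` are expected to lie within half the radius, because the enclosed volume scales with the cube of the radius.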


FIG. 7. Projection in the X–Z plane of the perturbations around the point (4.55295, 5.61306, 20.2116) created by (top to bottom) RP, BV, NM, and AP perturbations. (left to right) The initial time step and then different time steps: 0.07, 0.14, and 0.2 t.u., respectively. There are 2500 perturbations in each figure, and the density range is found in the upper-left panel.
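The geometry behind Fig. 7 can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it builds a filled ball of random perturbations and a spherical shell rescaled to the same root-mean-square distance from the control run, then integrates both with the Lorenz 63 equations. The parameter values (σ = 10, ρ = 28, β = 8/3), the time step of 0.01 t.u., the Runge–Kutta scheme, the ball radius of 0.5, and the finite-time growth rate definition are all assumptions made for the sketch. The shell uses random directions only, so it mimics the shape of the rescaled bred vectors but not the breeding procedure itself.

```python
import numpy as np

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # classic Lorenz (1963) values, assumed here
DT = 0.01  # assumed step: the text equates 100 time steps with 1 t.u.

def lorenz_rhs(s):
    """Lorenz 63 tendencies for states stored along the last axis (..., 3)."""
    x, y, z = s[..., 0], s[..., 1], s[..., 2]
    return np.stack([SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z], axis=-1)

def integrate(s, n_steps, dt=DT):
    """Advance states with fourth-order Runge-Kutta (the scheme is an assumption)."""
    for _ in range(n_steps):
        k1 = lorenz_rhs(s)
        k2 = lorenz_rhs(s + 0.5 * dt * k1)
        k3 = lorenz_rhs(s + 0.5 * dt * k2)
        k4 = lorenz_rhs(s + dt * k3)
        s = s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return s

def random_in_sphere(n, radius, rng):
    """Offsets distributed uniformly inside a 3D ball of the given radius."""
    directions = rng.normal(size=(n, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = radius * rng.uniform(size=(n, 1)) ** (1.0 / 3.0)
    return directions * radii

def rescale_to_distance(offsets, distance):
    """Place every offset at the same distance from the control (a spherical surface)."""
    return offsets / np.linalg.norm(offsets, axis=1, keepdims=True) * distance

def rms_distance(states, control):
    """Root-mean-square distance of the ensemble members from the control state."""
    return np.sqrt(np.mean(np.sum((states - control) ** 2, axis=1)))

rng = np.random.default_rng(0)
control = np.array([4.55295, 5.61306, 20.2116])      # point quoted in the Fig. 7 caption
ball = random_in_sphere(2500, radius=0.5, rng=rng)   # radius 0.5 is illustrative only
rms0 = np.sqrt(np.mean(np.sum(ball ** 2, axis=1)))
shell = rescale_to_distance(rng.normal(size=(2500, 3)), rms0)

for n_steps in (7, 14, 20):  # 0.07, 0.14, and 0.2 t.u.
    control_t = integrate(control, n_steps)
    for name, offsets in (("ball", ball), ("shell", shell)):
        members = integrate(control + offsets, n_steps)
        growth = np.log(rms_distance(members, control_t) / rms0) / (n_steps * DT)
        print(f"{name:5s} t={n_steps * DT:.2f} t.u. growth rate={growth:.3f}")
```

Plotting `members[:, 0]` against `members[:, 2]` at each step would give an X–Z projection comparable in spirit to the first two rows of the figure.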

The bred vectors are rescaled to the average squared error obtained by both the analogs and the random perturbations. Consequently, all bred vectors obtained in this figure have the same distance from the control run, forming a spherical surface. As mentioned by Kalnay et al. (2002), the bred vector procedure converges with the leading Lyapunov vector. Accordingly, the reddish region in the initial time for the bred vectors corresponds to the convergent bred vector or local leading forward Lyapunov vector. A more in-depth explanation between the rigorous definition of the Lyapunov vector and its relation to BV and SV can be found in Legras and Vautard (1996). These subtle differences are not taken into account in this paper because its main goal is to study the values obtained and not the asymptotic infinite-time limit values (representative of the actual Lyapunov vector).

Normal modes form a circle in the plane spanned by the two first singular vectors. These two vectors have a certain inclination with respect to the X–Z plane (this can be observed in the maximum concentration of points for the bred vectors). For this reason, the normal modes show an elliptical structure in this plane. The distribution

of states is uniform over this circle. The third singular vector is compressing (together with the second singular vector, which has a smaller magnitude) the initial sphere formed by the random perturbations. The normal mode states avoid this direction, and the states have a smaller compression (only caused by the second singular vector). For this reason, the states are still mostly uniformly distributed around the ellipse.

The main growing direction in the last column is equivalent to the random perturbations, normal modes, and the bred vectors, and at the same time it is the direction of the leading Lyapunov vector. However, the density of states is higher in the center of the ellipsoid for the random perturbations, while the bred vectors have evolved from a spherical surface to an ellipsoid surface. For this reason, the average growth rate is larger for the bred vectors than for the random perturbations although the two ellipses are similar. The normal modes have states farther from the center than the bred vectors or random perturbations, to maintain the same initial error. These states have the same growth rate as the random states in the same direction (but smaller amplitude). For this reason, the ellipse of the normal modes is bigger, resulting in a larger average growth rate than the random perturbations.

Finally, the phase-space distribution of analogs is analyzed. The analogs' states have a particular shape that is different from the regular sphere shape obtained with the other procedures. The given shape is the consequence of a long-term run of the Lorenz system. Consequently, these perturbations represent properties of the Lorenz attractor and the short-term nonlinear (linear) properties captured with the BVP (NM and RP). Initially, the four methods for perturbing the control run have the same average squared error (an ensemble mean initially identical to the control run), but the growth rate for the analogs is similar to the one obtained with the NM. This can be explained by the larger phase-space coverage of the perturbations, as can be clearly seen in the last two columns. The states are kept (mostly) uniformly distributed for all the time steps. These two facts give as a result an overall growth rate similar to the maximized one obtained with the bred vectors.

One of the main characteristics of a chaotic system is the fractal dimension of the attractor. This property is a key part of this kind of system because the fractality allows for the existence of nonperiodic flows. In other words, noncrossing trajectories, such as the ones obtained by chaotic systems, are related at the same time to the fact that only some states are allowed. Besides, these allowed states do not fill the space; they form complex geometric shapes with fine structure at arbitrarily small scales and, usually, they have some degree of self-similarity [a more in-depth study on fractality and its characteristics can be found in the original book of Mandelbrot (1983)]. Or in other words, similar (geometrically or statistically) features are observed at different scales. These states are scaled (spatially distributed) in a way that they have a relation between scales that can be linear (fractal) or multiplicative (multifractal) for more complex systems.

FIG. 8. Estimation of the correlation dimension as the slope (dashed red line) in the scaling region between the average number of points $C_r$ and the radius of the sphere in logarithmic units. The black curve shows the real values of $C_r$, where both small- and large-scale effects can be noticed.

Grassberger and Procaccia (1983) developed a standard method for efficiently computing the fractal dimension or correlation dimension. Starting on a given point $x$ on the attractor, the total number of states [$N_x(r)$] inside of a sphere of radius $r$ is counted. The number of points inside the sphere increases with $r$ following a power law:

$$N_x(r) \propto r^{d}, \tag{13}$$

where $d$ is the pointwise dimension. When $N_x(r)$ is averaged over several points on the attractor ($C_r$), the exponent $d$ is called the correlation dimension.

The Lorenz attractor has a well-studied fractal dimension of 2.07; the correlation dimension of the Lorenz attractor ($d_{\mathrm{cor}} \le d_{\mathrm{frac}}$) can be estimated by computing the slope between $\log C_r$ and $\log r$ (Fig. 8). As can be observed, the power law holds only over an intermediate range of radius. The curve saturates for large radius due to the dimension of the attractor. On the other end, at an extremely small radius, the only point inside the sphere is $x$ itself and points of the same trajectory would be incorporated as the radius is increased. Consequently,


FIG. 9. Examples of the computed correlation dimensions for the four different perturbations: BV (gray), RP (blue), NM (green), and AP (red). (a) The perturbation applied at the initial point or initial time. (b) The correlation vs the radius obtained from the 2500 perturbed states at time 3.01 t.u. (c) Using all the states from the initial time until 3.01 t.u. Small-scale effects are present only in (c). The values of the slopes can be found in each figure for the three perturbation methodologies. The error obtained when computing the slope is inside the parentheses. The dark-colored asterisks are used to compute the slope for the second methodology (accumulation of states), whereas the light-colored ones are used when computing the slope for the first methodology (only states for a given time step).

the relation between scales would represent the dimension due to the temporal resolution used in the numerical scheme to resolve the Lorenz equations. However, this small-scale effect is not present in the study of the correlation dimension for the initial time perturbation because only one state for the same trajectory is selected as a possible analogous situation.

The correlation dimension (considered as a measure of the fractal dimension) of the perturbations is studied from two different points of view. First, it is studied at a given time step, starting from the initial time and computing the correlation dimension of the 2500 perturbations in the following time steps. As mentioned previously, this study does not suffer from the small-scale effect due to the temporal resolution because only one time step is used. However, it lacks the temporal evolution of the , which is the main cause of the fractal structure. For this reason, a second study has been carried out in which the correlation dimension of the time series obtained for the different perturbations was computed. An example of the correlation dimension obtained from both types of studies is plotted in Fig. 9.

The result of these two different approaches to compute the correlation dimension of the four different methodologies for perturbing the control run as a function of the lead time or time step is plotted in Fig. 10. As verification, the random perturbations can be observed having a correlation dimension around 3 at time 0 because the sphere is filled. The bred vectors (and normal modes) have a correlation dimension around 2 because they are over the surface of the sphere (over the surface spanned by the first two singular vectors). This behavior is similar for the first time steps in Fig. 10a because the small scales are still filling the space for the random perturbation or over the surface for the bred vectors. The behavior is different for Fig. 10b because close states of the same trajectory deform the sphere for the first time steps, giving a lower correlation dimension. Both random perturbations and analogs have a similar correlation dimension after a few time steps that is equivalent, at the same time, to the Lorenz attractor fractal dimension. Consequently, the perturbation correlation dimension is dominated by the system of equations when it is computed using the whole trajectory. This behavior is different in the first procedure for measuring the correlation dimension. Yet, it can be due to sampling problems (the states are farther from each other at every time step) even though 2500 states are still used, or due to different local fractal properties of the attractor. However, from these two results the random perturbations can be observed filling up the space, which is contradictory to the geometry of the Lorenz system, whereas the bred vectors are not representative of the equations because the deformation of the 2D surface seems to become a stretched surface, having a lower dimension than the actual Lorenz attractor. Neither of these two perturbation methodologies seems to fulfill the fractal property of the Lorenz system as the analogs naturally do from the initial time; besides, both methodologies for perturbing the control run need some time [1 time unit or 100 time steps for the random perturbations or more than 10 time units (t.u.), when all predictability is already lost, for the bred vectors] for achieving a similar fractal geometry as the Lorenz system equations. This behavior is less present with the normal modes. In Fig. 10a two dimensions can be observed to be equivalent to the surface perpendicular to the growing


FIG. 10. Result of the correlation dimension (slope from Fig. 9) as a function of the time step in the integration of the perturbation for (a) states for a given time step and (b) for all the states until the given time step. The solid lines stand for the average over 10 time steps used to obtain the slope and the shaded area is the error associated with the computation of the slope. The BV are plotted in black, the RP results in blue, the NM in green, the AP in red, and the correlation dimension of the Lorenz attractor as a reference is in orange.

direction (two first singular vectors) and, consequently, the initial fractal structure is kept until 500 time steps or 5 t.u., when the dynamic of the system dominates the states and a fractal dimension compatible with the Lorenz attractor is obtained. The accumulated states (Fig. 10b) are similar to the analogs and at the same time to the Lorenz attractor. For this reason, the normal modes can be concluded to be compatible with the dynamic system for the accumulated space and compatible with the random perturbations. The small reduction in the fractal dimension in comparison with the random perturbations can be due to the approximations when computing the singular vectors from the tangent linear model. However, these effects are not important in terms of the fractal dimension of the normal mode's perturbations and its trajectories.

c. Distribution of states in the phase space

As studied by Trevisan (1995), an important part of the predictability information is found in the time-dependent PDF of the distance between a couple of analogs. In her paper, a clear signature of the organization of the system's dynamics is associated with the existence of distinct regimes, which is a clear bimodality for the Lorenz system associated with the two butterfly wings on the attractor.

Here, the initial CDFs for different perturbation methodologies are compared. An example of the CDF obtained for three different periods (initial time, 400 time steps, and 1000 time steps) is plotted in Fig. 11. As mentioned previously, these examples are qualitative more than quantitative because the CDF changes during the whole trajectory around the attractor. Moreover, the CDFs have a different shape depending on the initial point selected in the phase space. However, the statistics are quite similar, and for this reason the following results are pooled to get the evolution of the statistics as a function of the time of derivation of the model.

The most important feature of the CDFs plotted in Fig. 11 is the difference among the probability distributions for the random perturbations and bred vectors in comparison with the analogs or normal modes in the first two rows. As observed in sections 4a and 4b, these two perturbations have different properties for the initial time. It can be observed that the probability distributions for this time step are also different. In the initial time step, the random perturbations, normal modes, and the analogs share some similarities, whereas the surface of bred vectors is completely different. However, the standard deviation of the analogs is larger than the standard deviation of the random perturbation because the density on the tails is larger. The normal modes have a similar standard deviation (slightly larger) than the random perturbations and slightly smaller than the analogs for the same reason. This small similarity (the significance test for the hypothesis of belonging to different processes between RP and analogs for the initial time is higher than 95%) disappears for the following time steps. As can be observed in the middle row of Fig. 11, the density of analog states is completely different from the bred vectors and random perturbations. However, analogs and normal modes have a


FIG. 11. Example of the cumulative distribution of states with respect to the control run for the (left) X, (middle) Y, and (right) Z variables showing AP (black), RP (red), NM (green), and BV (blue). Perturbations are created at (top to bottom) the initial time and after the model has run for 4 and 10 t.u. The perturbations are spread around the whole attractor after running the model around 10 t.u. The K–S statistic is also plotted in the center panel to show the difference between the NM CDF and the AP CDF.
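The K–S statistic mentioned in the Fig. 11 caption is the largest vertical gap between two empirical CDFs. A minimal sketch of that computation follows; the function name `ks_statistic` and the synthetic Gaussian samples are hypothetical and stand in for the perturbed states of one variable, which are not reproduced here.

```python
import numpy as np

def ks_statistic(sample_a, sample_b):
    """Two-sample K-S distance: the maximum gap between the two empirical CDFs."""
    a = np.sort(np.asarray(sample_a, dtype=float))
    b = np.sort(np.asarray(sample_b, dtype=float))
    grid = np.concatenate([a, b])  # the gap can only change at observed values
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

# Synthetic stand-ins for 2500 perturbed values of one variable (assumption).
rng = np.random.default_rng(2)
same_a = rng.normal(size=2500)
same_b = rng.normal(size=2500)
shifted = rng.normal(loc=0.5, size=2500)

print("same process:     ", ks_statistic(same_a, same_b))
print("different process:", ks_statistic(same_a, shifted))
```

Turning the distance into a significance level requires the K–S distribution; `scipy.stats.ks_2samp` returns both the statistic and a p-value, if SciPy is available.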

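The correlation-dimension estimate used in section 4b can be illustrated with a small sketch of the Grassberger and Procaccia (1983) approach: count the fraction of point pairs closer than $r$ and fit the slope of $\log C_r$ against $\log r$ over a scaling range. The function names, the number of points, and the radius range below are assumptions chosen for the example, not values from the paper. As a sanity check it uses the two idealized shapes discussed in the text: points on a spherical surface, where the slope should be close to 2, and points filling a ball, where it should approach 3 (somewhat lower in practice because of edge effects).

```python
import numpy as np

def correlation_sum(points, radii):
    """C(r): fraction of distinct point pairs closer than each radius r."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    pair_dist = dist[np.triu_indices(len(pts), k=1)]  # each pair counted once
    return np.array([np.mean(pair_dist < r) for r in radii])

def correlation_dimension(points, r_min, r_max, n_radii=12):
    """Least-squares slope of log C(r) against log r over [r_min, r_max]."""
    radii = np.logspace(np.log10(r_min), np.log10(r_max), n_radii)
    c = correlation_sum(points, radii)
    keep = c > 0  # avoid log(0) when no pair falls inside the smallest spheres
    slope, _ = np.polyfit(np.log(radii[keep]), np.log(c[keep]), 1)
    return float(slope)

rng = np.random.default_rng(1)
directions = rng.normal(size=(1200, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
surface = directions                                             # unit sphere surface
filled = directions * rng.uniform(size=(1200, 1)) ** (1.0 / 3.0)  # unit ball interior

d_surface = correlation_dimension(surface, 0.1, 0.4)
d_filled = correlation_dimension(filled, 0.1, 0.4)
print(f"surface: {d_surface:.2f}, filled ball: {d_filled:.2f}")
```

The pairwise distance matrix grows with the square of the number of points, so for long trajectories a neighbor-search structure would be a more practical choice than this direct version.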

FIG. 12. Properties of the PDFs for the different perturbation methods as a function of the run time (t.u.). (a) The mean of the PDFs [associated with the error as shown in Eq. (12)] for RP (blue), BV (gray), and AP (red). The shaded area corresponds to the standard deviation associated with the PDFs. The square inset is a zoomed-in view of the first 10 time steps. (b) The range of values in the PDFs for the X, Y, and Z variables as a function of the run time. The same value for the three variables is equivalent to a homogeneous distribution, or in other words a sphere around the control run.

more similar distribution: even though the CDFs have some differences, both the growth and the mean are similar. This behavior is different for the bred vectors (a different mean for the variables X, Y, and Z) and the random perturbation (the mean is "similar" only for the Z component). Consequently, the analog prediction, although sharing some global properties with the random perturbations, such as the transient error growth shown by Trevisan and Legnani (1995), has a forecast outcome with a different probability distribution of states. This different behavior lasts only until the asymptotic limit is reached. When this total loss of predictability has been reached, the perturbations are randomly distributed around the attractor and all the CDFs resemble each other (last row in Fig. 11).

The probability distribution for these methodologies has different shapes that are far away from regular parametric formulas. Despite this fact, some moments of the distribution, such as the mean or the standard deviation, are used to summarize the information provided. Figure 12a shows the temporal evolution of the mean and its standard deviation for several lead times. The ensemble variance has exactly the same initial magnitude for the three perturbation methodologies, but its growth is different as can be observed (and it was previously mentioned in section 4a). The growth rate (the slope is computed as the difference between the initial time and the 0.08 t.u.) is similar for the bred vectors, normal modes, and the analogs, but it is completely different (negative vs positive) for the random perturbations. As observed in the zoom area of this figure, the standard deviation is larger for the analogs than for any other perturbation methodology. After this initial period, the growth rate of the standard deviation is qualitatively similar for the four perturbation methodologies for these two moments. These results are found in previous studies (Lorenz 1963; Trevisan 1993; Magnusson et al. 2008; among others). This similar behavior is attributed to the alignment of the perturbations along the preferred direction of growth identified by the first Lyapunov vector. To study the alignment of these perturbations, the ranges of the variables X, Y, and Z are studied separately (Fig. 12b). The three variables start in the same position for the bred vectors and the random perturbations that are to be distributed within a sphere. This is not the case for the analogs or normal modes because of the initial elliptical shape. However, the alignment of the three variables can be observed among the perturbation methods (the three variables have a scaled proportional magnitude from 0.2 t.u. on). Even though the variables are aligned and the initial averaged statistics are the same for the four perturbation methods, the range in X and Z is larger for the analogs than for any of the other three methodologies. It is also important to highlight the different behaviors among the variables in the alignment process (between the initial time and the 0.2 t.u. or 20 time steps); the slope for the X variable of the analogs is positive in comparison with the reduction of X values for the BV and RP. Therefore, the extreme events would be contained within the analogs,


FIG. 13. Significance level (averaged for different initial points over the whole attractor) as a function of the length of the run (t.u.) for the comparison between AP CDF vs RP CDF (red), BV CDF (blue), and NM CDF (green). This significance level is computed by the K–S test (an example of the K–S distance can be observed in the center panel in Fig. 11). Significance values close to 1 mean the hypothesis of both samples belonging to different processes is verified. The gray dashed horizontal line stands for when the hypothesis of belonging to two different distributions is satisfied at the 0.95 significance level.

and from the lead times after 20 time steps within the normal modes.

The averaged statistical moments are similar among the perturbation methodologies but the range and the probability distributions are different. To analyze whether this behavior is kept within the attractor, the probability obtained by the Kolmogorov–Smirnov test of both distributions belonging to different processes (the process meaning the perturbation methodology used to obtain this distribution) is plotted in Fig. 13. The mean of the significance levels of the random perturbation, bred vectors, and normal modes against the analogs is plotted as red, blue, and green solid lines, respectively. As observed in Fig. 11, at the beginning the normal modes are likely to have a distribution similar to the analogs. These two distributions are significantly different (95%) for the period from 2 to almost 5 t.u. Random perturbation resembles the analogs' distribution slightly, but this small similarity disappears after 100–200 time steps (2 time units). In the bred vector case, from the beginning these two distributions are completely different. Consequently, the BV and RP methodologies would produce a completely different forecast and they should not be considered as perturbation methods with different properties but as different forecasting methods. The normal modes have a certain similarity for short ranges, but the similarity is lost for medium ranges, and consequently the forecast can also be considered different. When the predictability is close to being lost (around 7–8 t.u.), the probability distributions are similar because they correspond to 2500 points randomly distributed within the Lorenz attractor. Accordingly, the ensemble prediction with these other methods only approximately reproduces the initial distribution and the final climatology, but it misses the short-to-medium-range forecast.

5. Discussion

The simple toy model developed by Lorenz (1963) is the low-order chaotic system used in this study. It is important to highlight the qualitative validity of the drawn conclusions in low-order models as a guideline in the study of more complex models (Judd and Stemler 2010). Chaotic systems, such as the Lorenz model, have been widely studied due to their similarities with the atmospheric system. One of these similarities is the sensitivity to errors in initial conditions, which is the definition of intrinsic predictability. Ensemble forecasting is an appropriate response to this problem. By using several members, not only is information about the evolution of a state obtained, but also the uncertainty associated with the given state and its intrinsic predictability. However, the type of perturbations used to obtain the ensemble of initial conditions influences the outcome of the chaotic system. In this paper, the most commonly used perturbation methods have been compared with analog situations where the ergodicity hypothesis is taken into account.

The intrinsic predictability is usually described by the growth of the errors in the initial conditions. For this reason, an adequate perturbation methodology should reproduce, on average, the maximum growth of errors. Bred vectors and normal modes have a larger growth rate than the random perturbations. The random perturbations follow the ellipse defined by the singular vectors. Consequently, one of these random perturbations is close to the first singular vector. Yet, the random perturbations have artificial states, which do not belong to the attractor, and as demonstrated by Judd et al. (2008), they have an initial evolution toward the attractor. The initial growth of these states is small compared to the first singular vector or bred vector. Consequently, the average growth rate for the random perturbations is smaller than the bred vectors or normal modes due to giving the same weight to all these random states instead of only to the surface spanned by the first two singular vectors (normal modes). The reasoning behind this non-desired behavior is that unreal error statistics are applied to the random perturbations. The same probability is applied to all the

created states, having no information about the actual probability of states. The only way to obtain these statistics is by running the system for a long enough period and by analyzing the difference between closer states. While the ergodicity hypothesis is valid, the statistics for different time states are equivalent to statistics for states at a given point. This hypothesis has been verified in our dataset before obtaining the analog states (section 2). Consequently, analog states defined by applying the ergodicity hypothesis are an approach to obtain the natural error statistic for a given state of the dynamic system. This is the main reason why the analogs are selected as the reference perturbation in the comparison carried out.

FIG. 14. Total derivative associated with a trajectory for different points on the cross section for Z = 40. The gray area corresponds to the attractor obtained from our dataset, and the gray lines are equidistant contours to the 3D attractor. The larger the value of the derivative, the faster the states will evolve.

Ensemble forecasting uses the probability distribution of possible states to provide different statistical information about the uncertainty associated with errors in the initial conditions. The differences among the PDFs have been studied in section 4c. Significant differences for these distributions for medium-range lead times (around 400 time steps or 0.4 of the loss of predictability time) have been observed. Consequently, the outcome of the model applying the different perturbation methodologies results in statistically different future scenarios (even though the ensemble mean could still be similar).

Of the three perturbation methodologies compared with the analogs, the normal mode is the one that has the closest distribution of errors for short- and long-range lead times (before the total loss of predictability). Running the model for a long time and analyzing the covariances of the errors is similar to approximating the model evolution by the tangent linear propagator and using this information to create perturbations over the surface spanned by the first two singular vectors. The similarities should be maintained as long as the nonlinearities are not important, which is related to the amplitude of the perturbations. This is the reason behind a similar distribution of the normal mode perturbations for the short-range lead time.

The system of equations forces the perturbations to align after a short period (Fig. 12 shows the three variables are aligned after 20 time steps or 0.02 of the loss of predictability time). However, the analogs' range is larger for the three variables than for the bred vectors or the random perturbations, and for X and Z when comparing with the normal modes. This is associated with the distribution of variables in the phase space. Therefore, the PDFs obtained by the different perturbation methodologies are not able to contain the values obtained by the analogs.

We have discussed that the analogs provide the actual error statistics of the Lorenz system (or any system) by analyzing the inner variability of similar states. Some of the differences in the forecast outcome when using perturbation methods that are lacking the actual structure of the errors have also been mentioned. In the following, the possible causes for these differences are analyzed. To do this, a main property of the nonlinear chaotic system is studied: its fractality or the existence of a strange attractor.

Fractal systems are sets of states that exhibit a repeating pattern at several scales and therefore do not entirely fill the phase space. This property does not allow the existence of all states in the phase space. The Lorenz equations, however, allow all states in the phase space, but as discussed in several papers (e.g., Manneville and Pomeau 1979) an initial transition period is required to ensure the state is inside the attractor. The Lorenz attractor has a global correlation dimension of 2.07, which is a measure of the fractal dimension. As shown in Fig. 8, this correlation dimension is obtained for scales between 0.001 and 10. Consequently, there is a pattern of possible states at the scales the perturbations are created. None of the existing perturbation methods uses this information; they allow states outside the attractor. This can be observed when studying the fractal structure of the states for a time step only (Fig. 10). The random perturbation fills the 3D sphere (correlation dimension of 3). This causes a "not allowed point" to end up in the attractor (also known as a transition period). The bred vectors allow states only on the surface of a sphere; this would most importantly not reproduce the spatial structure of the perturbations. Normal modes are the only perturbations that partially follow this structure. Initially they fill the 2D circle over the surface spanned by the first two

vectors. This surface is maintained for up to 3 t.u., not following the actual structure of the Lorenz attractor. However, for the medium range the fractal structure is recovered when the nonlinearities start to influence the states due to the amplitude of the perturbations.

To understand the evolution of the not allowed states, the derivative of the trajectory is studied (Fig. 14). This figure shows the large values of the derivative (speed of the states along the trajectory) for the areas with a farther distance to the attractor (upper-left corner and lower-right corner). This structure, even though it is not totally related to the attractor due to the coupled terms, can be seen as proof of the fast evolution of points not belonging to the attractor. A similar explanation has been found in the literature (Trevisan 1993), which mentions that random errors that are not in the attractor are dominated by the negative Lyapunov vector, making them rapidly converge toward the attractor. This direction is partially avoided in the normal mode perturbation and it can help to understand its behavior.

To summarize, the analog perturbations have several properties different from the previous perturbation methodologies, such as the variance of the growth rate of error and the resulting PDF. These differences can be related to the fractal structure of the analog perturbation, which is compatible with the attractor. This paper provides insight into this property, which has never been studied in any of the perturbation methodologies.

Finally, an important point must be added here on the practical applicability of these results for NWP forecasting. To start this discussion, two concepts have to be addressed: the existence of the atmospheric/NWP model attractor and the possibility of finding analogs on it.

The demonstration of the existence of the Lorenz 63 system attractor was one of Smale's problems (Smale 1998). It was solved by Tucker (2002) by using advanced ODE topologies in the three-dimensional phase space. The atmospheric–NWP model phase space has many more dimensions (it can be of the order of millions, taking into account grid points, vertical levels, and variables). Consequently, the rigorous demonstration of the existence of the NWP attractor is still an open problem [an attempt was made by Li and Chou (1997)] and might be unsolvable. Epstein (1969) tried a simplification of the problem by treating the initial condition as a probability distribution that evolves according to the laws of fluid dynamics used in the NWP model equations. However, even this approach is impracticable for the current NWP models.

A less rigorous way to define an attractor is through the climatology (Essex et al. 1987). Once a climatology is constructed (and defined as an attractor of the system), analogs can be selected. The question is whether these analogs are good enough in the sense that the ergodicity assumption is satisfied so that temporal states can be interchanged as different realizations of a given state. The ergodic hypothesis assumes a certain stationarity of the system. Taking into account the trends found in our climatic system (such as climate change), the system cannot be defined as ergodic and these ergodic analogs could not be defined or found. In addition, Van den Dool (1994) showed that a dataset of around $10^{30}$ yr is necessary to find analogs within the current observational error over an area similar in size to the Northern Hemisphere. Consequently, it seems infeasible to find analogs with a small error within the NWP model climatology and impracticable to reproduce this experiment with the NWP system of equations.

The applicability to complex numerical models is only qualitatively indicative. It is practically impossible to construct an attractor (if it exists) of a complex numerical model of the atmosphere. Even in idealized experiments, where a control run is taken as reference, our results are not directly transposable. Nevertheless, our conclusions with the experiments on the Lorenz attractor are indicative of possible experiments with numerical models of various and suggest caution in interpreting ensembles generated with different perturbations of initial conditions. In addition, the importance of the fractal properties (states belonging to the attractor) has been demonstrated in this study. Data assimilation (DA) might partially correct the problem of states not belonging to the attractor of these perturbation methodologies. Some DA methodologies [e.g., three-dimensional variational DA (3D-Var)] can incorporate balance constraints built into the cost function, thereby mapping states onto a slow manifold.

In NWP the situation is fundamentally different: the problem is the forecast of a reality that is more complex than the model, particularly at the mesoscale, and the evaluation of the forecast error with respect to the reality is defined by observations affected by observational errors. The objective of the perturbations is to incorporate observational errors in the initial conditions or the uncertainties of the parameterizations used in the models (Kong et al. 2014). These errors have statistics determined by their source or origin (the way the system is probed). In our idealized experiments, we consider ensemble generation by perturbations around a state not necessarily related to any particular errors. Contrary to NWP we have the entire ergodic attractor of the system of Lorenz equations. This attractor is represented by the probability density of all possible states. The analogs in this attractor provide the expected value of the statistics of perturbations (not errors) around a state. In this sense these are the natural statistics, and we

use them as reference for comparison with other methods of perturbation.

That said, some qualitative similarities between the experiments on the Lorenz attractor and numerical models can be expected: limiting the ensemble forecast to the fastest-growing modes will lead to ensembles with an incorrect distribution of forecasted states. Yet, the good proximity between ensemble forecasts initialized by analogs and those by normal modes is reassuring. A first attempt at using real field information and normal modes is the random field perturbation developed by Magnusson et al. (2009), where the difference between two randomly chosen atmospheric states (i.e., analyses) is used. Our results support the use of this methodology.

6. Conclusions

In this paper an in-depth comparison between analogs, defined by the ergodicity hypothesis, and previously developed perturbation techniques is carried out. The comparison accounts not only for differences in the mean growth rate of errors but also for the fractal dimension and differences in the probability distribution function.

The local predictability has been computed and compared with the previous techniques. The analogs show a larger growth rate of error than the random perturbations, similar to that obtained with the bred vectors and normal modes. The standard deviation of the growth rate of errors is larger in the analogs than in any other perturbation technique. Together, these two facts ensure the maximum growth according to the nonlinear model without losing the variability inherent in ensemble forecasting.

The probability distribution function for 2500 members has been studied in order to analyze the differences in the ensemble forecast produced using these techniques. The results have shown significant differences in the outcome PDFs and, consequently, a different forecast even though some statistical properties, such as the mean, were similar. Studying the range for the X, Y, and Z variables showed that the analog perturbations have the largest range. Consequently, analog forecasting has a different ensemble forecast containing more extreme values in comparison with the other techniques. This difference is significant only for the medium range when comparing with the normal modes.

The fractal properties of these perturbations have been studied to understand a possible source of these differences. It has been observed that the random perturbations are not compatible with the fractal properties of the Lorenz system, creating artificial perturbations (not contained in the attractor) that are displaced toward the attractor in the first time steps. The sphere of perturbations created by the bred vectors maximizes the growth rate but has a lower dimension than the Lorenz attractor. The normal mode maintains the two fractal dimensions longer than the bred vectors maintain a sphere surface. The time needed to recover the fractal properties of the Lorenz system is about half (or more) of the time scale of the total loss of predictability. The analog perturbation has the correct fractal dimension from the beginning and oscillates around the fractal dimension of the whole attractor during its evolution.

According to these findings, it can be concluded that the analogs contain more information about the system than 1) the random perturbations, which lead to an artificially slow growth rate since they are not compatible with the attractor; 2) bred vectors, whose PDF is not representative of the system until the saturation level is reached because of their lack of variability in the initial states; and 3) normal modes, which do not reproduce the PDF for medium-range times, although they are the perturbation methodology that has the most properties in common with the analogs. These ergodic analogs will be studied in the future to see whether the extra information contained in them can be used in a forecasting mode.

Acknowledgments. Special thanks go to all our group members for their fruitful discussions during the group meetings. Three anonymous reviewers have notably improved this manuscript through their corrections.

REFERENCES

Bowler, N. E., 2006: Comparison of error breeding, singular vectors, random perturbations and ensemble Kalman filter perturbation strategies on a simple model. Tellus, 58A, 538–548, doi:10.1111/j.1600-0870.2006.00197.x.
Buizza, R., P. Houtekamer, G. Pellerin, Z. Toth, Y. Zhu, and M. Wei, 2005: A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems. Mon. Wea. Rev., 133, 1076–1097, doi:10.1175/MWR2905.1.
Chu, P. C., 1999: Two kinds of predictability in the Lorenz system. J. Atmos. Sci., 56, 1427–1432, doi:10.1175/1520-0469(1999)056<1427:TKOPIT>2.0.CO;2.
Epstein, E. S., 1969: Stochastic dynamic prediction. Tellus, 21A, 739–759, doi:10.3402/tellusa.v21i6.10143.
Essex, C., T. Lookman, and M. Nerenberg, 1987: The climate attractor over short timescales. Nature, 326, 64–66, doi:10.1038/326064a0.
Evans, E., N. Bhatti, L. Pann, J. Kinney, M. Peña, S.-C. Yang, E. Kalnay, and J. Hansen, 2004: RISE: Undergraduates find that regime changes in Lorenz's model are predictable. Bull. Amer. Meteor. Soc., 85, 520–524, doi:10.1175/BAMS-85-4-520.
Farrell, B. F., 1990: Small error dynamics and the predictability of atmospheric flows. J. Atmos. Sci., 47, 2409–2416, doi:10.1175/1520-0469(1990)047<2409:SEDATP>2.0.CO;2.


Geisel, T., and J. Nierwetberg, 1982: Onset of diffusion and universal scaling in chaotic systems. Phys. Rev. Lett., 48, 7, doi:10.1103/PhysRevLett.48.7.
Golub, G. H., and C. F. Van Loan, 1996: Matrix Computations. 3rd ed. The Johns Hopkins University Press, 728 pp.
Grassberger, P., and I. Procaccia, 1983: Characterization of strange attractors. Phys. Rev. Lett., 50, 346, doi:10.1103/PhysRevLett.50.346.
Judd, K., and T. Stemler, 2010: Forecasting: It is not about statistics, it is about dynamics. Philos. Trans. Roy. Soc. London, 368A, 263–271, doi:10.1098/rsta.2009.0195.
Judd, K., C. A. Reynolds, T. E. Rosmond, and L. A. Smith, 2008: The geometry of model error. J. Atmos. Sci., 65, 1749–1772, doi:10.1175/2007JAS2327.1.
Kalnay, E., M. Corazza, and M. Cai, 2002: Are bred vectors the same as Lyapunov vectors? Proc. XXVII General Assembly, Nice, France, European Geophysical Society, Abstract 6820. [Available online at http://adsabs.harvard.edu/abs/2002EGSGA..27.6820K.]
Kolmogorov, A. N., 1933: Sulla determinazione empirica di una legge di distribuzione. G. Ist. Ital. Attuari, 4, 83–91.
Kong, F., and Coauthors, 2014: CAPS storm-scale ensemble forecasting system: Impact of IC and LBC perturbations. 26th Conf. on Weather Analysis and Forecasting/22nd Conf. on Numerical Weather Prediction, Atlanta, GA, Amer. Meteor. Soc., 119. [Available online at https://ams.confex.com/ams/94Annual/webprogram/Paper234762.html.]
Lacarra, J.-F., and O. Talagrand, 1988: Short-range evolution of small perturbations in a barotropic model. Tellus, 40A, 81–95, doi:10.1111/j.1600-0870.1988.tb00408.x.
Legras, B., and R. Vautard, 1996: A guide to Liapunov vectors. Proceedings of a Seminar Held at ECMWF on Predictability, Vol. 1, ECMWF, 143–156.
Li, J., and J. Chou, 1997: Existence of the atmosphere attractor. Sci. China, 40D, 215–220, doi:10.1007/BF02878381.
Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130–141, doi:10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2.
Lorenz, E. N., 1965: A study of the predictability of a 28-variable atmospheric model. Tellus, 17A, 321–333, doi:10.1111/j.2153-3490.1965.tb01424.x.
Lorenz, E. N., 1969: Atmospheric predictability as revealed by naturally occurring analogues. J. Atmos. Sci., 26, 636–646, doi:10.1175/1520-0469(1969)26<636:APARBN>2.0.CO;2.
Magnusson, L., E. Källén, and J. Nycander, 2008: Initial state perturbations in ensemble forecasting. Nonlinear Processes Geophys., 15, 751–759, doi:10.5194/npg-15-751-2008.
Magnusson, L., J. Nycander, and E. Källén, 2009: Flow-dependent versus flow-independent initial perturbations for ensemble prediction. Tellus, 61A, 194–209, doi:10.1111/j.1600-0870.2008.00385.x.
Mandelbrot, B. B., 1983: The Fractal Geometry of Nature. Macmillan, 468 pp.
Manneville, P., and Y. Pomeau, 1979: Intermittency and the Lorenz model. Phys. Lett., 75A, 1–2, doi:10.1016/0375-9601(79)90255-X.
Miller, F., A. Vandome, and J. McBrewster, 2010: Ergodic Theory. Alphascript Publishing, 108 pp.
Milnor, J., 2004: On the concept of attractor. The Theory of Chaotic Attractors, Springer, 243–264.
Murphy, J., 1988: The impact of ensemble forecasts on predictability. Quart. J. Roy. Meteor. Soc., 114, 463–493, doi:10.1002/qj.49711448010.
Nese, J. M., 1989: Quantifying local predictability in phase space. Phys. D, 35, 237–250, doi:10.1016/0167-2789(89)90105-X.
Nicolis, C., 1992: Probabilistic aspects of error growth in atmospheric dynamics. Quart. J. Roy. Meteor. Soc., 118, 553–568, doi:10.1002/qj.49711850508.
Nicolis, C., S. Vannitsem, and J.-F. Royer, 1995: Short-range predictability of the atmosphere: Mechanisms for superexponential error growth. Quart. J. Roy. Meteor. Soc., 121, 705–722, doi:10.1002/qj.49712152312.
Oseledets, V. I., 1968: A multiplicative ergodic theorem: Lyapunov characteristic exponents for dynamical systems. Tr. Mosk. Mat. O-va., 19, 179–210.
Palmer, T., 1993: Extended-range atmospheric prediction and the Lorenz model. Bull. Amer. Meteor. Soc., 74, 49–65, doi:10.1175/1520-0477(1993)074<0049:ERAPAT>2.0.CO;2.
Pazó, D., J. López, and M. Rodríguez, 2013: The geometric norm improves ensemble forecasting with the breeding method. Quart. J. Roy. Meteor. Soc., 139, 2021–2032, doi:10.1002/qj.2115.
Ruelle, D., and F. Takens, 1971: On the nature of turbulence. Commun. Math. Phys., 20, 167–192, doi:10.1007/BF01646553.
Saltzman, B., 1962: Finite amplitude free convection as an initial value problem—I. J. Atmos. Sci., 19, 329–341, doi:10.1175/1520-0469(1962)019<0329:FAFCAA>2.0.CO;2.
Smale, S., 1998: Mathematical problems for the next century. Math. Intell., 20, 7–15, doi:10.1007/bf03025291.
Smirnov, N., 1948: Table for estimating the goodness of fit of empirical distributions. Ann. Math. Stat., 19, 279–281, doi:10.1214/aoms/1177730256.
Toth, Z., and E. Kalnay, 1993: Ensemble forecasting at NMC: The generation of perturbations. Bull. Amer. Meteor. Soc., 74, 2317–2330, doi:10.1175/1520-0477(1993)074<2317:EFANTG>2.0.CO;2.
Trevisan, A., 1993: Impact of transient error growth on global average predictability measures. J. Atmos. Sci., 50, 1016–1028, doi:10.1175/1520-0469(1993)050<1016:IOTEGO>2.0.CO;2.
Trevisan, A., 1995: Statistical properties of predictability from atmospheric analogs and the existence of multiple flow regimes. J. Atmos. Sci., 52, 3577–3592, doi:10.1175/1520-0469(1995)052<3577:SPOPFA>2.0.CO;2.
Trevisan, A., and R. Legnani, 1995: Transient error growth and local predictability: A study in the Lorenz system. Tellus, 47A, 103–117, doi:10.1034/j.1600-0870.1995.00006.x.
Tucker, W., 2002: A rigorous ODE solver and Smale's 14th problem. Found. Comput. Math., 2, 53–117, doi:10.1007/s002080010018.
Van den Dool, H., 1994: Searching for analogues, how long must we wait? Tellus, 46A, 314–324, doi:10.3402/tellusa.v46i3.15481.
Van den Dool, H., J. Huang, and Y. Fan, 2003: Performance and analysis of the constructed analogue method applied to U.S. soil moisture over 1981–2001. J. Geophys. Res., 108, 8617, doi:10.1029/2002JD003114.
