ARTICLE IN PRESS

Atmospheric Environment 37 (2003) 5083–5096

Direct and adjoint sensitivity analysis of chemical kinetic systems with KPP: Part I—theory and software tools

Adrian Sandu (a), Dacian N. Daescu (b), Gregory R. Carmichael (c,*)

(a) Department of Computer Science, Virginia Polytechnic Institute and State University, 660 McBryde Hall, Blacksburg, VA 24061, USA
(b) Institute for Mathematics and its Applications, University of Minnesota, 400 Lind Hall, 207 Church Street S.E., Minneapolis, MN 55455, USA
(c) Center for Global and Regional Environmental Research, University of Iowa, 204 IATL, Iowa City, IA 52242-1297, USA

Received 26 January 2003; accepted 7 August 2003

Abstract

The analysis of comprehensive chemical reaction mechanisms, parameter estimation techniques, and variational chemical data assimilation applications require the development of efficient sensitivity methods for chemical kinetics systems. The new release (KPP-1.2) of the kinetic preprocessor (KPP) contains software tools that facilitate direct and adjoint sensitivity analysis. The direct-decoupled method, built using BDF formulas, has been the method of choice for direct sensitivity studies. In this work, we extend the direct-decoupled approach to Rosenbrock stiff integration methods. The need for Jacobian derivatives has prevented Rosenbrock methods from being used extensively in direct sensitivity calculations; however, the new automatic and symbolic differentiation technologies make the computation of these derivatives feasible. The direct-decoupled method is known to be efficient for computing the sensitivities of a large number of output parameters with respect to a small number of input parameters. Adjoint modeling is presented as an efficient tool to evaluate the sensitivity of a scalar response function with respect to the initial conditions and model parameters. In addition, sensitivity with respect to time-dependent model parameters may be obtained through a single backward integration of the adjoint model. KPP software may be used to completely generate the continuous and discrete adjoint models, taking full advantage of the sparsity of the chemical mechanism. Flexible direct-decoupled and adjoint sensitivity code implementations are achieved with minimal user intervention. In a companion paper, we present an extensive set of numerical experiments that validate the KPP software tools for several direct/adjoint sensitivity applications, and demonstrate the efficiency of KPP-generated sensitivity code implementations. © 2003 Elsevier Ltd. All rights reserved.

Keywords: Chemical kinetics; Sensitivity analysis; Direct-decoupled method; Adjoint model

* Corresponding author. Tel.: +1-319-335-3333; fax: +1-319-335-3337.
E-mail addresses: [email protected] (A. Sandu), daescu@ima.umn.edu (D.N. Daescu), [email protected] (G.R. Carmichael).

1352-2310/$ - see front matter © 2003 Elsevier Ltd. All rights reserved. doi:10.1016/j.atmosenv.2003.08.019

1. Introduction

The mathematical formulation of chemical reaction mechanisms is given by a coupled system of stiff nonlinear differential equations

\[
\frac{dy}{dt} = f(t, y, p), \qquad y(t^0) = y^0, \qquad t^0 \le t \le t^F.
\tag{1}
\]

The solution $y(t) \in \mathbb{R}^n$ represents the time evolution of the concentrations of the species considered in the chemical mechanism starting from the initial configuration $y^0$. Throughout this work vectors will be represented in column format and a superscript $(\cdot)^T$ will denote the transposition operator. The rate of change in the concentrations $y$ is determined by the nonlinear
production/loss function $f = (f_1, \ldots, f_n)^T$, which depends on a vector of parameters $p \in \mathbb{R}^m$. In practice, the vector $p$ may represent reaction rate coefficients, the initial state of the model ($p = y^0$), additional source/sink terms (e.g. emission rates), etc. We assume that problem (1) has a unique solution $y = y(t, p)$ once the model parameters are specified. Comprehensive atmospheric reaction mechanisms take into consideration as many as 100 chemical species involved in hundreds of chemical reactions (see e.g. Carter, 2000), so fast and reliable numerical methods must be implemented to integrate system (1) efficiently (Byrne and Dean, 1993; Jacobson and Turco, 1994; Sandu et al., 1997a, b). In addition, the model parameters are often obtained from experimental data and their accuracy is hard to estimate. The development and validation of chemical reaction mechanisms require a systematic sensitivity analysis to evaluate the effects of parameter variations on the model solution.

The sensitivities $S_\ell(t) \in \mathbb{R}^n$ are defined as the derivatives of the solution with respect to the parameters

\[
S_\ell(t) = \frac{\partial y(t)}{\partial p_\ell}, \qquad 1 \le \ell \le m.
\tag{2}
\]

A large sensitivity value $S_\ell(t)$ shows that the parameter $p_\ell$ plays an essential role in determining the model forecast $y(t)$; therefore one problem of interest is to evaluate the sensitivities $S(t) \in \mathbb{R}^{n \times m}$ for $t^0 \le t \le t^F$. Some practical applications (e.g. data assimilation) require the sensitivity of a scalar response function

\[
g = \int_{t^0}^{t^F} \hat{g}(t, y(t, p)) \, dt
\tag{3}
\]

with respect to the model parameters

\[
\frac{\partial g}{\partial p} = \int_{t^0}^{t^F} S^T(t) \, \frac{\partial \hat{g}}{\partial y}(t, y(t, p)) \, dt.
\tag{4}
\]

Depending on the problem at hand, an appropriate method for sensitivity evaluation must be selected. The most popular and efficient techniques for sensitivity studies are the direct-decoupled and the adjoint sensitivity methods. These approaches are complementary; for example, a single direct sensitivity calculation could provide $\partial [y_1(t), \ldots, y_n(t)] / \partial y_j(t^0)$ for all $t^0 \le t \le t^F$, while a single adjoint calculation provides $\partial y_j(t^F) / \partial [y_1(t), \ldots, y_n(t)]$ for all $t^0 \le t \le t^F$. The sensitivity information provided by the direct method may be used to assess how parametric uncertainties propagate in the system; the adjoint method is more suitable for inverse modeling and may be used to identify the origin of uncertainty in a given model output. In this paper, special emphasis is given to the adjoint technique, which has not been used as extensively as the direct method in the context of chemical systems.

Given the multitude of applications, the continuous development of new reaction mechanisms and the frequent modifications of the existing ones, there is a need for software tools that facilitate the sensitivity analysis of general chemical kinetic mechanisms. The kinetic preprocessor (KPP) (Damian-Iordache et al., 1995) has been successfully used in the forward integration of the chemical kinetics systems (Sandu et al., 1997a, b; Verwer et al., 1999). The new release (KPP-1.2) presented in this paper implements a comprehensive set of software tools for direct and adjoint sensitivity analysis. Given a chemical mechanism described by a list of chemical reactions, KPP generates a flexible code for the model, its forward integration and the direct-decoupled/adjoint sensitivity analysis.

The paper is organized as follows: a review of the direct-decoupled sensitivity analysis and extensions to Runge–Kutta and Rosenbrock stiff integration methods are presented in Section 2. In Section 3, we review the continuous and discrete adjoint sensitivity methods and discuss practical issues of the adjoint code implementation for chemical kinetics systems. The KPP tools that facilitate the implementation of the sensitivity methods are presented in Section 4. Tutorial examples for building the direct-decoupled code and the adjoint code with KPP are presented in Sections 5 and 6, respectively. In Section 7, we outline the new numerical methods for sensitivity calculations available in the KPP numerical library. A summary of the results and concluding remarks are presented in Section 8.

2. Direct sensitivity analysis

For the direct analysis we consider the parameters $p$ to be constant, i.e. they do not change in time. By differentiating (1) with respect to the parameters one obtains the sensitivity equations (variational equations)

\[
\frac{dS_\ell}{dt} = J(t, y, p)\, S_\ell + f_{p_\ell}(t, y, p), \qquad
S_\ell(t^0) = \frac{\partial y^0}{\partial p_\ell}, \qquad 1 \le \ell \le m,
\tag{5}
\]

where $J$ is the Jacobian matrix of the derivative function and $f_{p_\ell}$ are its partial derivatives with respect to the parameters

\[
J(t, y, p) = \frac{\partial [f_1, \ldots, f_n]}{\partial [y_1, \ldots, y_n]}, \qquad
f_{p_\ell}(t, y, p) = \frac{\partial [f_1, \ldots, f_n]}{\partial p_\ell}, \qquad 1 \le \ell \le m.
\tag{6}
\]

If the parameter $p_\ell$ represents the initial concentration of the $\ell$th species, $p_\ell = y^0_\ell$, then $f_{p_\ell} = 0$ and $S_\ell(t^0) = (0, \ldots, 0, 1, 0, \ldots, 0)^T$, the $\ell$th vector of the canonical basis in $\mathbb{R}^n$; otherwise, $S_\ell(t^0) = 0$.

The variational equations (5) are linear. The direct method solves simultaneously the model equation (1) together with the variational equations (5) to obtain both concentrations and sensitivities. The combined system (1)–(5) has the Jacobian

\[
\frac{\partial [f,\; J S_1 + f_{p_1},\; \ldots,\; J S_m + f_{p_m}]}{\partial [y, S_1, \ldots, S_m]}
=
\begin{pmatrix}
J & 0 & \cdots & 0 \\
(J S_1)_y + J_{p_1} & J & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
(J S_m)_y + J_{p_m} & 0 & \cdots & J
\end{pmatrix},
\tag{7}
\]

where the subscripts in the component sub-matrices denote partial differentiation. The eigenvalues of the combined Jacobian (7) are the eigenvalues of $J$ (the Jacobian of the model equations), with different multiplicities; therefore, if model (1) is stiff the sensitivity equations (5) are also stiff. To maintain stability, an implicit time-stepping method is needed. Implicit methods solve linear systems with the "prediction" matrix $(I - h\gamma J)$, where $h$ is the stepsize and $\gamma$ a parameter determined by the method. In the naive approach one would have to solve linear algebraic systems of dimension $n(m+1) \times n(m+1)$ corresponding to Eq. (7). The direct-decoupled method (Dunker, 1984) exploits the special structure of the combined Jacobian (7); specifically, one only computes the LU factorization of the $n \times n$ model prediction matrix $I - h\gamma J = P^T \cdot L \cdot U$. Then the $n(m+1) \times n(m+1)$ prediction matrix for Jacobian (7) has the LU factorization

\[
\begin{pmatrix}
P^T \cdot L & 0 & \cdots & 0 \\
-h\gamma\,[(J S_1)_y + J_{p_1}]\, U^{-1} & P^T \cdot L & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
-h\gamma\,[(J S_m)_y + J_{p_m}]\, U^{-1} & 0 & \cdots & P^T \cdot L
\end{pmatrix}
\times
\begin{pmatrix}
U & 0 & \cdots & 0 \\
0 & U & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & U
\end{pmatrix}.
\tag{8}
\]

The direct-decoupled method was developed and is traditionally presented in the context of BDF time-stepping schemes (Dunker, 1984; Caracotsios and Stewart, 1985; Leis and Kramer, 1986). In what follows we review this approach, then we extend the direct-decoupled philosophy to Runge–Kutta and Rosenbrock integrators.

2.1. Direct-decoupled backward differentiation formulas

Model (1) is approximated by the $k$-step, order $k$ linear multistep formula

\[
y^{i+1} = \sum_{j=0}^{k-1} \alpha_j\, y^{i-j} + h\beta\, f(t^{i+1}, y^{i+1}, p), \qquad h = t^{i+1} - t^i.
\tag{9}
\]

The formula coefficients $\alpha_j$, $\beta$ are determined such that the method has order $k$ of consistency. Relation (9) is a nonlinear system of equations which implicitly defines $y^{i+1}$, and which must be solved by a Newton-type iterative scheme. Typically, the solution $y^{i+1}_{\{q+1\}}$ at iteration $q+1$ is computed as

\[
\left[ I - h\beta\, J(t^i, y^i, p) \right] \left( y^{i+1}_{\{q+1\}} - y^{i+1}_{\{q\}} \right)
= \sum_{j=0}^{k-1} \alpha_j\, y^{i-j} + h\beta\, f(t^{i+1}, y^{i+1}_{\{q\}}, p) - y^{i+1}_{\{q\}},
\tag{10}
\]

which requires one factorization of $I - h\beta J$ and one backsubstitution per iteration.

Discretization of the continuous sensitivity equation: In this approach (Dunker, 1984; Caracotsios and Stewart, 1985; Leis and Kramer, 1986) one discretizes the continuous sensitivity equation (5) with the same BDF method (9) used for discretizing the model

\[
S_\ell^{i+1} = \sum_{j=0}^{k-1} \alpha_j\, S_\ell^{i-j} + h\beta\, J(t^{i+1}, y^{i+1}, p)\, S_\ell^{i+1} + h\beta\, f_{p_\ell}(t^{i+1}, y^{i+1}, p), \qquad 1 \le \ell \le m.
\tag{11}
\]

Note that system (11) is linear, therefore it admits a noniterative solution

\[
\left[ I - h\beta\, J(t^{i+1}, y^{i+1}, p) \right] S_\ell^{i+1}
= \sum_{j=0}^{k-1} \alpha_j\, S_\ell^{i-j} + h\beta\, f_{p_\ell}(t^{i+1}, y^{i+1}, p), \qquad 1 \le \ell \le m.
\tag{12}
\]

The direct-decoupled method solves Eq. (10) first for the new-time solution $y^{i+1}$. The matrix $I - h\beta J(t^{i+1}, y^{i+1}, p)$ is computed and factorized, and systems (12) are solved for $\ell = 1$ through $m$. Note that all systems (12) use the same matrix factorization, and this factorization is also reused in Eq. (10) for computing $y^{i+2}$ during the next time step. Therefore, the solution together with $m$ sensitivities are obtained at the cost of a single matrix factorization per time step.

Discrete sensitivity equation: In this approach one considers directly the sensitivities of the numerical solution. A discrete equation involving these entities is obtained by taking the derivative of Eq. (9) with respect to $p_\ell$; it is easy to see that the process leads again to Eq. (11). Therefore, the numerical solutions of the variational equations $S_\ell^i$ are also the sensitivities of the numerical solution $dy^i/dp_\ell$.

2.2. Direct-decoupled Runge–Kutta methods

A general $s$-stage Runge–Kutta method is defined as (Hairer and Wanner, 1991, Section IV.3)

\[
y^{i+1} = y^i + h \sum_{j=1}^{s} b_j\, k_j^0, \qquad T^j = t^i + c_j h, \qquad
Y^j = y^i + h \sum_{q=1}^{s} a_{jq}\, k_q^0, \qquad k_j^0 = f(T^j, Y^j, p),
\tag{13}
\]

where the coefficients $a_{jq}$, $b_j$ and $c_j$ are prescribed for the desired accuracy and stability properties. The stage derivative values $k_j^0$ are defined implicitly, and require solving a (set of) nonlinear system(s). Newton-type methods solve coupled linear systems of dimension (at most) $n \times s$.

Discretization of the continuous sensitivity equation: Application of this method to the sensitivity equation (5) gives

\[
S_\ell^{i+1} = S_\ell^i + h \sum_{j=1}^{s} b_j\, k_j^\ell \quad \text{for } 1 \le \ell \le m, \qquad
k_j^\ell = J(T^j, Y^j, p) \left( S_\ell^i + h \sum_{q=1}^{s} a_{jq}\, k_q^\ell \right) + f_{p_\ell}(T^j, Y^j, p).
\tag{14}
\]

System (14) is linear and does not require an iterative procedure.

Discrete sensitivity equation: Clearly, Eq. (14) can be obtained by differentiating Eq. (13) with respect to $p_\ell$ and setting $k_j^\ell = dk_j^0/dp_\ell$; therefore the Runge–Kutta numerical solutions of the sensitivity equations are equal to the sensitivities of the Runge–Kutta numerical solution of the model equation.

Computational considerations: One first solves Eq. (13) for concentrations, which gives all stage derivative vectors $k_1^0, \ldots, k_s^0$. Next one solves Eq. (14) for sensitivities; each set $k_1^\ell, \ldots, k_s^\ell$ is the solution of an $(ns) \times (ns)$ linear system. The matrix of this system is the same for all $1 \le \ell \le m$, but is different than the matrix used in the Newton iteration that solves Eq. (13). One can reuse the matrix factorization of Eq. (13) if the linear systems (14) are solved by an iterative method. However, due to the repeated LU factorizations or the repeated Jacobian evaluations involved, the standard Runge–Kutta methods do not seem suitable for direct sensitivity calculations.

2.3. Direct-decoupled Rosenbrock methods

An $s$-stage Rosenbrock method (Hairer and Wanner, 1991, Section IV.7) computes the next-step solution by the formulas

\[
\begin{aligned}
y^{i+1} &= y^i + \sum_{j=1}^{s} b_j\, k_j^0, \qquad T^j = t^i + \alpha_j h, \qquad Y^j = y^i + \sum_{q=1}^{j-1} a_{jq}\, k_q^0, \\
\left[ \frac{1}{h\gamma_{jj}} I - J(t^i, y^i, p) \right] k_j^0
&= f(T^j, Y^j, p) + \sum_{q=1}^{j-1} \frac{c_{jq}}{h}\, k_q^0 + h\gamma_j\, \frac{\partial f}{\partial t}(t^i, y^i, p),
\end{aligned}
\tag{15}
\]

where $s$ is the number of stages. The formula coefficients ($b_j$, $a_{jq}$, $c_{jq}$, $\alpha_j$, $\gamma_{jj}$, and $\gamma_j$) give the order of consistency and the stability properties. At each stage of the method, a linear system of equations with unknowns $k_j^0$ and matrix $1/(h\gamma_{jj})\, I - J$ must be solved. Methods for which $\gamma_{11} = \cdots = \gamma_{ss}$ are of particular interest since in this case only one LU matrix decomposition is required per integration step (Sandu et al., 1997b; Verwer et al., 1999; Djouad et al., 2002). Form (15) is advantageous for implementation purposes and is equivalent with the standard formulation (Hairer and Wanner, 1991, Section IV.7).

Discretization of the continuous sensitivity equation: To apply method (15) to the combined sensitivity equations (1)–(5), we need to make explicit use of the combined Jacobian (7). One step of the method updates the solution with Eq. (15) and the sensitivities using

\[
\begin{aligned}
S_\ell^{i+1} &= S_\ell^i + \sum_{j=1}^{s} b_j\, k_j^\ell \quad \text{for } 1 \le \ell \le m, \\
\left[ \frac{1}{h\gamma_{jj}} I - J(t^i, y^i, p) \right] k_j^\ell
&= J(T^j, Y^j, p) \left( S_\ell^i + \sum_{q=1}^{j-1} a_{jq}\, k_q^\ell \right) + f_{p_\ell}(T^j, Y^j, p)
 + \sum_{q=1}^{j-1} \frac{c_{jq}}{h}\, k_q^\ell \\
&\quad + \frac{\partial J}{\partial p_\ell}(t^i, y^i, p)\, k_j^0
 + \left( \frac{\partial J}{\partial y}(t^i, y^i, p) \times S_\ell^i \right) k_j^0 \\
&\quad + h\gamma_j\, J_t(t^i, y^i, p)\, S_\ell^i + h\gamma_j\, f_{p_\ell, t}(t^i, y^i, p),
\end{aligned}
\tag{16}
\]

where $J_t = \partial J/\partial t$, $f_{p_\ell, t} = \partial f_{p_\ell}/\partial t$. If $\gamma_{11} = \cdots = \gamma_{ss}$, a single $n \times n$ matrix LU decomposition is required per step to obtain both the concentrations and the sensitivities.

Discrete sensitivity equation: We note that the derivative of method (15) with respect to $p_\ell$ leads also to Eq. (16). Consequently, the sensitivities of the numerical solution coincide with the numerical solutions of the sensitivity equations.

Computational considerations: Formula (16) requires the evaluation of the Hessian, i.e. the derivatives of the Jacobian with respect to $y$, as well as the Jacobian derivatives with respect to the parameters; these entities are 3-tensors. In addition, an extra Jacobian evaluation and one Jacobian-vector multiplication are needed at each stage. The need for Jacobian derivatives has prevented Rosenbrock methods from being used extensively in direct sensitivity calculations. However, the new automatic differentiation and symbolic differentiation technologies make the computation of these derivatives feasible.

To avoid computing the derivatives of the Jacobian one can approximate Jacobian (7) by $\mathrm{diag}(J, \ldots, J)$ and use a Rosenbrock-W integration formula (Hairer and
Wanner, 1991, Section IV.7). The resulting method gives consistent approximations of the sensitivities of the continuous solution, but these are now different than the sensitivities of the numerical solution.

Remark 1. Seefeld and Stockwell (1999) present an extension of the direct-decoupled method to systems with time-varying parameters. In their approach, a time-dependent parameter is expressed as the product of a time-varying profile and a time-independent multiplier: $p_j(t) = p_j^*(t) \times q_j$, with the nominal value of the multiplier chosen to be unity. The direct-decoupled method is then used to compute the derivatives of the concentrations with respect to the time-independent multiplier.

3. Adjoint sensitivity analysis

The direct-decoupled method is known to be very effective when the sensitivities of a large number of output variables are computed with respect to a small number of input parameters. The adjoint method provides an efficient alternative to the direct-decoupled method for evaluating the sensitivity of a scalar response function with respect to the initial conditions and model parameters. Mathematical foundations of the adjoint sensitivity for nonlinear dynamical systems and various classes of response functionals are presented by Cacuci (1981a, b). The construction of the adjoint operators associated with linear and nonlinear dynamics and applications to atmospheric modeling are described in detail by Marchuk (1995) and Marchuk et al. (1996). Menut et al. (2000) and Vautard et al. (2000) use the adjoint modeling for sensitivity studies in atmospheric chemistry. A review of the adjoint method applied to four-dimensional variational atmospheric chemistry data assimilation is presented in the work of Wang et al. (2001).

We will refer to the dynamical model (1) as a forward (direct) model. Given a scalar response function

\[
g = g(y(t^F, p))
\tag{17}
\]

we are interested to evaluate the sensitivities

\[
\frac{\partial g}{\partial p_\ell}
= \left. \sum_{j=1}^{n} \frac{\partial g}{\partial y_j} \frac{\partial y_j}{\partial p_\ell} \right|_{(t = t^F)}
= \left. \left\langle \frac{\partial g}{\partial y}, S_\ell \right\rangle_n \right|_{(t = t^F)}
\quad \text{for } 1 \le \ell \le m,
\tag{18}
\]

where $\langle \cdot, \cdot \rangle_n$ denotes the inner product in $\mathbb{R}^n$, $\langle u, v \rangle_n = v^T u$.

In the typical case where the cost functional is given by a time integral (3), the sensitivity problem (3)–(4) may be reduced to problem (17)–(18) by augmenting the state vector $y$ with a new component $y_{n+1}$ whose time evolution is governed by the equations

\[
\frac{dy_{n+1}}{dt} = \hat{g}(y(t, p)), \qquad y_{n+1}(t^0) = 0.
\tag{19}
\]

Then

\[
y_{n+1}(t^F, p) = \int_{t^0}^{t^F} \hat{g}(y(t, p)) \, dt
\tag{20}
\]

and the adjoint sensitivity is applied to the augmented system (1)–(19), with the state vector $Y = (y^T, y_{n+1})^T$ and the response functional $g = y_{n+1}(t^F, p)$.

In the adjoint sensitivity one must distinguish between the continuous and the discrete adjoint modeling (Sirkes and Tziperman, 1997). While in general the derivation of the continuous adjoint model is presented, often in practice it is a discrete adjoint model that is implemented. This distinction is of particular importance in the context of stiff chemical reaction systems that require sophisticated numerical integrators.

3.1. Continuous adjoint sensitivity

Traditionally, the adjoint model equations are derived using the Hilbert spaces theory and variational techniques (Wang et al., 2001). The continuous adjoint model is obtained from the continuous forward model using the linearization technique. To first-order approximation, a perturbation $\delta p$ in the input parameters leads to a perturbation in the response functional

\[
\delta g = g(y(t^F, p + \delta p)) - g(y(t^F, p))
= \left. \left\langle \frac{\partial g}{\partial y}, \delta y \right\rangle_n \right|_{(t = t^F)},
\tag{21}
\]

where the state perturbation $\delta y(t^F)$ is obtained by solving the tangent linear model problem

\[
\frac{d\,\delta y}{dt} = J(t, y, p)\, \delta y + f_p(t, y, p)\, \delta p, \qquad t^0 \le t \le t^F,
\tag{22}
\]

\[
\delta y(t^0) = \delta y^0.
\tag{23}
\]

In the direct sensitivity approach, for each parameter variation $\delta p_\ell$, $1 \le \ell \le m$, a new system (22)–(23) must be solved to compute $\delta y(t^F)$. The adjoint model is used to express explicitly the perturbation $\delta g$ in terms of parameter variations $\delta p$ as follows: we introduce the adjoint variable $\lambda(t) \in \mathbb{R}^n$ (to be precisely defined later), take the scalar product of Eq. (22) with $\lambda$ and integrate on $[t^0, t^F]$ to obtain:

\[
\int_{t^0}^{t^F} \left\langle \lambda, \frac{d\,\delta y}{dt} \right\rangle_n dt
= \int_{t^0}^{t^F} \left\langle \lambda,\; J(t, y, p)\, \delta y + f_p(t, y, p)\, \delta p \right\rangle_n dt.
\tag{24}
\]

Integrating by parts the left-hand side of Eq. (24), using matrix transposition on the right-hand side, and rearranging the terms give the equivalent formulation

\[
\left. \langle \lambda, \delta y \rangle_n \right|_{t^0}^{t^F}
= \int_{t^0}^{t^F} \left[ \left\langle \frac{d\lambda}{dt} + J^T(t, y, p)\, \lambda,\; \delta y \right\rangle_n
+ \left\langle f_p^T(t, y, p)\, \lambda,\; \delta p \right\rangle_m \right] dt.
\tag{25}
\]

Therefore, if $\lambda$ is defined as the solution of the adjoint problem

\[
\frac{d\lambda}{dt} = -J^T(t, y, p)\, \lambda,
\tag{26}
\]

\[
\lambda(t^F) = \frac{\partial g}{\partial y}(t^F),
\tag{27}
\]

the perturbation in the response functional can be expressed from Eqs. (21), (23), and (25) as

\[
\delta g = \langle \lambda(t^0), \delta y^0 \rangle_n
+ \int_{t^0}^{t^F} \left\langle f_p^T(t, y, p)\, \lambda,\; \delta p \right\rangle_m dt.
\tag{28}
\]

Evaluation of $\lambda(t)$, $t^0 \le t \le t^F$, requires only one forward integration from $t^0$ to $t^F$ of model (1) to compute $y(t)$, $t^0 < t \le t^F$, followed by a single backward integration from $t^F$ to $t^0$ of the $n$-dimensional adjoint model (26)–(27). $\lambda(t^0) \in \mathbb{R}^n$ represent the sensitivities of the response $g$ with respect to the initial conditions $y^0$. Having available $\lambda(t)$, sensitivities with respect to additional model parameters $p_\ell$ are

\[
\frac{\partial g}{\partial p_\ell}
= \int_{t^0}^{t^F} \left\langle \lambda(t), f_{p_\ell}(t, y, p) \right\rangle_n dt
= \int_{t^0}^{t^F} f_{p_\ell}^T(t, y, p)\, \lambda(t) \, dt
\tag{29}
\]

and may be computed using an appropriate numerical quadrature scheme.

Remark 2. Modeling chemical kinetic systems requires the specification of time-dependent model parameters. For example, photolytic reaction rates are determined by the solar radiation and thermal reaction rates depend on the temperature, therefore the reaction rates are implicitly a function of time. The adjoint modeling may be used to evaluate the sensitivity of the response functional with respect to time-dependent model parameters. The adjoint functions $\lambda(t)$ are also called influence functions and represent the sensitivity of the response functional with respect to the variations in the model state at time $t$; if the parameters are time dependent, $p = p(t)$, the impulse response functions $f_{p_\ell}^T(t, y, p(t))\, \lambda(t)$ represent the variation in $g$ due to a unit impulse ($\delta$ function) in the parameter $p_\ell$ at time $t$ (Bryson and Ho, 1975).

Remark 3. In practice, it is often of interest to evaluate the sensitivity of multiple response functionals (e.g. concentrations of various pollutants at a given time) with respect to the model parameters. One should notice that in the direct approach a new set of Eqs. (5) must be solved for each additional parameter, whereas in the adjoint approach a new set of adjoint equations (26)–(27) must be solved for each additional response. The sensitivities of a vector-valued function $G(y(t^F)) \in \mathbb{R}^k$ may be obtained by a backward integration of the adjoint model

\[
\frac{d\Lambda}{dt} = -J^T(t, y, p)\, \Lambda,
\tag{30}
\]

\[
\Lambda(t^F) = \left( \frac{\partial G}{\partial y} \right)^T (t^F),
\tag{31}
\]

where $\Lambda(t) \in \mathbb{R}^{n \times k}$ are the adjoint variables. The matrix approach to the adjoint sensitivity analysis of atmospheric models with multiple responses is discussed by Ustinov (2001).

3.2. The relationship with the Green's function method

The adjoint method presented above can be related to the Green's function. The Green's function method of sensitivity analysis in chemical kinetics is discussed in detail by Hwang et al. (1978), Dougherty et al. (1979), and Vuilleumier et al. (1997). The Green's function matrix $K(t, \tau) \in \mathbb{R}^{n \times n}$ associated to Eq. (5) satisfies

\[
\frac{d}{dt} K(t, \tau) - J(t, y, p)\, K(t, \tau) = 0, \qquad t > \tau,
\tag{32}
\]

\[
K(\tau, \tau) = I.
\tag{33}
\]

The sensitivities $S_\ell(t)$ may be expressed in terms of the Green's function (Hwang et al., 1978)

\[
S_\ell(t) = K(t, t^0)\, S_\ell(t^0) + \int_{t^0}^{t} K(t, \tau)\, f_{p_\ell}(\tau) \, d\tau, \qquad 1 \le \ell \le m.
\tag{34}
\]

To obtain $S_\ell(t^F)$, the differential equations for the adjoint Green's function

\[
\frac{d}{d\tau} K^*(\tau, t^F) + K^*(\tau, t^F)\, J(\tau, y, p) = 0, \qquad \tau < t^F,
\tag{35}
\]

\[
K^*(t^F, t^F) = I
\tag{36}
\]

are solved backwards in time from $t^F$ to $t^0$ and, using the identity

\[
K^*(\tau, t) = K(t, \tau),
\tag{37}
\]

the right-hand side integrals in Eq. (34) are computed with an appropriate numerical quadrature scheme (Hwang et al., 1978; Dougherty et al., 1979). In this way, all $S_\ell(t^F)$, $1 \le \ell \le m$, may be evaluated by solving $n + 1$ sets (Eqs. (1) and (35)–(36)) of stiff ODE's of dimension $n$.

A fundamental observation is that to evaluate sensitivities (18) explicit knowledge of $S_\ell(t^F)$, $1 \le \ell \le m$, is not needed; only the inner product
$\langle \partial g/\partial y(t^F), S_\ell(t^F) \rangle_n$ is required. From Eqs. (18) and (34), we obtain

\[
\begin{aligned}
\frac{\partial g}{\partial p_\ell}
&= \left\langle \frac{\partial g}{\partial y}(t^F),\; S_\ell(t^F) \right\rangle_n \\
&= \left\langle \frac{\partial g}{\partial y}(t^F),\; K(t^F, t^0)\, S_\ell(t^0) \right\rangle_n
+ \int_{t^0}^{t^F} \left\langle \frac{\partial g}{\partial y}(t^F),\; K(t^F, \tau)\, f_{p_\ell}(\tau, y, p) \right\rangle_n d\tau,
\end{aligned}
\tag{38}
\]

which may be written using Eq. (37) and matrix transposition in the equivalent form

\[
\frac{\partial g}{\partial p_\ell}
= \left\langle K^{*T}(t^0, t^F)\, \frac{\partial g}{\partial y}(t^F),\; S_\ell(t^0) \right\rangle_n
+ \int_{t^0}^{t^F} \left\langle K^{*T}(\tau, t^F)\, \frac{\partial g}{\partial y}(t^F),\; f_{p_\ell}(\tau, y, p) \right\rangle_n d\tau.
\tag{39}
\]

If we define the adjoint variables $\lambda(\tau) \in \mathbb{R}^n$ as

\[
\lambda(\tau) = K^{*T}(\tau, t^F)\, \frac{\partial g}{\partial y}(t^F),
\tag{40}
\]

then Eq. (39) is written in a form that reveals the similarities with Eq. (28)

\[
\frac{\partial g}{\partial p_\ell}
= \langle \lambda(t^0), S_\ell(t^0) \rangle_n
+ \int_{t^0}^{t^F} \left\langle \lambda(\tau),\; f_{p_\ell}(\tau, y, p) \right\rangle_n d\tau, \qquad 1 \le \ell \le m.
\tag{41}
\]

From Eqs. (35)–(36) and definition (40) it follows that $\lambda(t)$ is the solution of the adjoint model problem (26)–(27).

3.3. Discrete adjoint sensitivity

The discretization of system (1) with a selected numerical method provides a numerical approximation $y^N \approx y(t^F)$ through a sequence of $N$ intermediate states

\[
y^{i+1} = F^i(y^i, p), \qquad i = 0, \ldots, N-1,
\tag{42}
\]

where $F^i$ represents a one-step numerical integration formula which advances the solution from $t^i$ to $t^{i+1}$. This establishes an explicit relationship between the evaluated response functional $g(y^N)$ and the model parameters. Using interpolation techniques, we may assume that the parameters $p$ are discrete (time independent) and are determined by their values at the interpolation nodes. The discrete adjoint model equations are obtained directly from the discrete forward model equations (42) as we now explain. To present a compact, yet explicit derivation of the discrete adjoint sensitivity formulae, we consider an augmented state vector $Y(t) = (y^T(t), p^T(t))^T$, where $p(t) \equiv p$. We rewrite the discrete equations (42) as

\[
y^{i+1} = F^i(Y^i), \qquad i = 0, \ldots, N-1,
\tag{43}
\]

and attach the parameters equations

\[
p^0 = p, \qquad p^{i+1} = p^i, \qquad i = 0, \ldots, N-1.
\tag{44}
\]

The sensitivity with respect to the initial conditions $Y^0 = ((y^0)^T, (p^0)^T)^T$ is given by

\[
\begin{pmatrix}
\dfrac{\partial g(y^N)}{\partial y^0} \\[2ex]
\dfrac{\partial g(y^N)}{\partial p^0}
\end{pmatrix}
= \frac{\partial g(y^N)}{\partial Y^0}
= \left( \frac{\partial Y^N}{\partial Y^0} \right)^T
\begin{pmatrix}
\dfrac{\partial g}{\partial y}(y^N) \\[2ex]
0_m
\end{pmatrix}.
\tag{45}
\]

A successive application of the chain rule followed by transposition gives

\[
\left( \frac{\partial Y^N}{\partial Y^0} \right)^T
= \left( \frac{\partial Y^1}{\partial Y^0} \right)^T
\left( \frac{\partial Y^2}{\partial Y^1} \right)^T
\cdots
\left( \frac{\partial Y^{N-1}}{\partial Y^{N-2}} \right)^T
\left( \frac{\partial Y^N}{\partial Y^{N-1}} \right)^T,
\tag{46}
\]

and by differentiating equations (43)–(44) we obtain

\[
\frac{\partial Y^{i+1}}{\partial Y^i}
=
\begin{pmatrix}
F_y^i(y^i, p^i) & F_p^i(y^i, p^i) \\
0_{(m,n)} & I_{(m,m)}
\end{pmatrix},
\qquad i = 0, \ldots, N-1.
\tag{47}
\]

Therefore, if we define the adjoint variables at $t^N = t^F$

\[
\lambda^N = \frac{\partial g}{\partial y}(y^N), \qquad \nu^N = 0_m
\tag{48}
\]

and evaluate the adjoint variables at $t^i$, $i = N-1, \ldots, 1, 0$, using the recursive relations

\[
\begin{pmatrix} \lambda^i \\ \nu^i \end{pmatrix}
= \left( \frac{\partial Y^{i+1}}{\partial Y^i} \right)^T
\begin{pmatrix} \lambda^{i+1} \\ \nu^{i+1} \end{pmatrix}
=
\begin{pmatrix}
\left( F_y^i(y^i, p^i) \right)^T \lambda^{i+1} \\
\left( F_p^i(y^i, p^i) \right)^T \lambda^{i+1} + \nu^{i+1}
\end{pmatrix},
\tag{49}
\]

we obtain from Eqs. (45) to (49) the sensitivities

\[
\frac{\partial g(y^N)}{\partial y^0} = \lambda^0, \qquad \frac{\partial g(y^N)}{\partial p} = \nu^0.
\tag{50}
\]

3.4. Practical issues of the adjoint code implementation

The adjoint method relies on the linearization of the forward model and the nonlinearity introduced by the chemistry may have a direct impact on the time interval length and the qualitative aspects of the adjoint sensitivity analysis. Since for nonlinear problems the adjoint equations depend on the forward model state, in order to perform the adjoint computations the forward model state must be available in reverse order. Therefore, a large amount of memory must be allocated for storing the state during the forward run.

As noticed in the previous sections, in practice two strategies may be used to implement the adjoint code.

In the continuous approach, the continuous adjoint model is first derived from the linearized continuous forward model equations; then a numerical method of choice is used to integrate the adjoint model. In this approach, the complexity of the numerical method used during the forward integration does not interfere with the adjoint computations. While during the forward integration one has to solve a stiff nonlinear ODE system, during the adjoint integration a stiff linear system of ODE's must be solved. Therefore, highly stable implicit methods may be efficiently implemented as no iterations are needed for solving the adjoint.

The second method is the discrete adjoint approach, where the explicit dependence of the state vector evolution on the input parameters is obtained by the numerical integration of the forward model. The discrete adjoint model is then derived directly from the linearized discrete forward model equations. In this way, the discrete adjoint model provides an exact gradient (sensitivity) relative to the numerical computation of the response functional. This approach appears to be more suitable for variational data assimilation, where a minimization process must be performed, since the minimization routine will receive the exact gradient of the evaluated cost function. However, implementing an efficient discrete adjoint code may be a difficult task if sophisticated numerical methods are used. Additional issues related to the consistency of a numerical scheme and the consistency of its adjoint are discussed by Sei and Symes (1995).

Hand-generated adjoint codes are tedious to write and often subject to errors. Automatic differentiation tools (Giering and Kaminski, 1998; Rostaing et al., 1993) may facilitate the adjoint code generation, but they must be used with caution and the correctness of the automatically generated adjoint code must be carefully verified. The state-of-the-art stiff ODE solvers are usually written in a form which is not optimal for the adjoint code generation and often one has to rewrite them in a form that is suitable for the adjoint compilers.

There is no general rule to decide which approach (discrete vs. continuous) should be used to implement the adjoint model. The performance of the adjoint code is problem dependent and a selection can be made only after an extensive analysis and testing for the particular problem to be solved have been performed.

For atmospheric chemistry data assimilation and sensitivity studies, a discrete adjoint model was used by Fisher and Lary (1995) for a Bulirsch–Stoer integration scheme (Stoer and Bulirsch, 1980) and by Elbern et al. (1997) for a quasi-steady-state approximation (QSSA) method. A continuous adjoint chemistry model with a fourth-order Rosenbrock solver (Hairer and Wanner, 1991) for the forward/backward integration was successfully applied to 4D-Var chemical data assimilation by Errera and Fonteyn (2001) in a hybrid adjoint approach (discrete adjoint for the transport integration, continuous adjoint for the chemistry integration).

4. The KPP tools

In this section, we present the KPP software tools that are useful in derivative computations. A detailed discussion of the basic KPP capabilities can be found in our previous work (Damian-Iordache et al., 1995; 2002). Here, we focus on the new features introduced in release 1.2 that allow an efficient sensitivity analysis of chemical kinetic systems.

KPP builds simulation code for chemical systems driven by the law of mass action kinetics

\[
\frac{d}{dt} y = S \cdot \mathrm{diag}[k_1(t), \ldots, k_R(t)] \cdot p(y) = S \cdot \omega(t, y) = f(t, y),
\tag{51}
\]

where $S$ is the stoichiometric matrix, $k_i(t)$ the $i$th reaction rate coefficient, $p = [p_1, \ldots, p_R]^T$ the vector of reactant products and $\omega = [\omega_1, \ldots, \omega_R]^T$ the vector of reaction velocities.

4.1. The derivative function

KPP orders the variable species such that the sparsity pattern of the Jacobian is maintained after an LU decomposition and generates the following function to compute the vector A_VAR of component time derivatives from the concentrations of variable V, radical R and fixed F species, and the vector of rate coefficients RCT.

    SUBROUTINE FunVar ( V, R, F, RCT, A_VAR )

4.2. The Jacobian

The Jacobian of the derivative function is also constructed by KPP via the command

    #JACOBIAN [ OFF | ON | SPARSE ]

The option OFF inhibits the generation of the Jacobian subroutine; the option ON generates the Jacobian in full matrix format, while the option SPARSE generates the following routine for the Jacobian in sparse row-compressed format

    SUBROUTINE JacVar_SP ( V, R, F, RCT, JS )

Implicit numerical integrators solve systems of the form $(I - h\gamma J)\, x = b$, where $h$ is the step size, $\gamma$ a method-dependent parameter, $J$ the Jacobian, and $I - h\gamma J$ the "prediction" matrix. KPP generates the following sparse linear algebra subroutines:

    SUBROUTINE KppDecomp ( N, P, IER )

performs an in-place, nonpivoting, sparse LU decomposition of the prediction matrix P (IER returns a nonzero value if singularity is detected). Using this factorization the sparse backward and forward substitutions are performed by

    SUBROUTINE KppSolve ( P, X )

Similarly, a new subroutine required for the adjoint computations

    SUBROUTINE KppSolveTR ( P, b, X )

solves the linear system $P^T X = b$ with the transposed coefficient matrix, and uses the same LU factorization as KppSolve. The sparse subroutines KppDecomp and KppSolve are extremely efficient (Sandu et al., 1996).

Two other KPP-generated subroutines are useful for direct and adjoint sensitivity analysis

    SUBROUTINE JacVar_SP_Vec ( JVS, U, V )

computes the sparse Jacobian times vector product ($V \leftarrow \mathrm{JVS} \cdot U$), and the subroutine

    SUBROUTINE JacVarTR_SP_Vec ( JVS, U, V )

computes the sparse Jacobian transposed times vector product ($V \leftarrow \mathrm{JVS}^T \cdot U$).

4.3. The stoichiometric formulation

... but simplifies the future handling of sparse data structures.

The following subroutine computes the reactant products for each reaction, i.e. A_RPROD is the vector $p(y)$ in formulation (51).

    SUBROUTINE ReactantProd ( V, R, F, A_RPROD )

In the stoichiometric formulation (51) the Jacobian is

\[
J(t, y) = \frac{\partial f(t, y)}{\partial y}
= S \cdot \mathrm{diag}[k_1(t), \ldots, k_R(t)] \cdot \frac{\partial p(y)}{\partial y}
= S \cdot \mathrm{diag}[k_1(t), \ldots, k_R(t)] \cdot J^{RP}(y).
\tag{52}
\]

The following subroutine computes the Jacobian of the reactant products vector; i.e. JV_RPROD is the matrix $J^{RP} = \partial p(y)/\partial y$ above

    SUBROUTINE JacVarReactantProd ( V, R, F, JV_RPROD )

The matrix JV_RPROD is of course sparse and is computed and stored in row-compressed sparse format. The number of nonzeros is stored in the parameter NJVRP, the column indices in the vector ICOL_JVRP and the beginning of each row in CROW_JVRP.

4.4. The derivatives with respect to reaction coefficients

The stoichiometric formulation allows a direct computation of the derivatives with respect to rate coefficients. From Eq.
(51) one sees that the partial derivative KPP can generate the elements ofthe derivative ofthe time derivative functionwith respect to a reaction function and the Jacobian in the stoichiometric for- coefficient is given by the corresponding column in the mulation (51); this means that the product between the stoichiometric matrix times the corresponding entry in stoichiometric matrix, rate coefficients, and reactant the vector ofreactant products products is not explicitly performed. The option that y controls the code generation is f ðt; yÞ¼S Á diag½k1ðtÞ; ; kRðtފ Á pðyÞ @f1:NVAR #STOICMAT ½ OFF j ON Š: ) ¼ S1:NVAR; j Á pjðyÞ: @kj The ON value ofthe switch instructs KPP to generate The following subroutine computes the partial deriva- code for the stoichiometric matrix, the vector of reactant tive ofthe time derivative functionwith respect to a set products in each reaction, and the partial derivative of of reaction coefficients; this is more efficient than the time derivative function with respect to rate computing each derivative separately. coefficients. These elements are discussed below. The stoichiometric matrix is usually very sparse; the SUBROUTINE dFunVar dRcoeff total number ofnonzero entries in the stoichiometric ð V; R; F; NCOEFF; JCOEFF; DFDR Þ: matrix is the constant NSTOICM in KPP-generated code. A total of NCOEFF derivatives are taken with respect to KPP produces the stoichiometric matrix in sparse, reaction coefficients JCOEFF(1)–JCOEFF(NCOEFF). column-compressed format. Elements are stored in JCOEFF, therefore, is a vector of integers containing columnwise order in the one-dimensional vector the indices ofreaction coefficientswith respect to which ofvalues STOICM(1:NSTOICM); their row indices we differentiate. 
The subroutine returns the NVAR  are stored in IROW STOICM(1:NSTOICM); the NCOEFF matrix DFDR; each column ofthis matrix vector CCOL STOICM(1:NVAR+1) contains pointers to contains the derivative ofthe functionwith respect to the start ofeach column; forexample, column j starts one rate coefficient. Specifically, in the sparse vector at position CCOL STOICM(j) and ð þ ÞÀ @f1:NVAR ends at CCOL STOICM j 1 1: The last value DFDR1:NVAR; j ¼ ; 1pjpNCOEFF: CCOL STOICMðNVAR þ 1Þ¼NSTOICM þ 1 is not necessary @kJCOEFFðjÞ ARTICLE IN PRESS 5092 A. Sandu et al. / Atmospheric Environment 37 (2003) 5083–5096
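The stoichiometric evaluation of Eq. (51) and the rate-coefficient derivatives of Section 4.4 can be illustrated with a minimal Python sketch. This is not KPP-generated code: the two-reaction mechanism, the 0-based indexing, the choice of one column pointer per reaction plus an end marker, and the function names `reactant_prod`, `fun_var` and `dfun_drcoeff` are all assumptions made for illustration.

```python
# Hypothetical mechanism (0-based indices), species A, B, C:
#   reaction 0:  A     -> B    with reactant product p_0 = y_A
#   reaction 1:  B + C -> A    with reactant product p_1 = y_B * y_C
NVAR, NREACT = 3, 2

# Stoichiometric matrix S (NVAR x NREACT) in column-compressed form,
# mirroring the roles of STOICM, IROW_STOICM and CCOL_STOICM in the text.
STOICM = [-1.0, 1.0, 1.0, -1.0, -1.0]   # nonzero values, column by column
IROW_STOICM = [0, 1, 0, 1, 2]           # row index of each stored value
CCOL_STOICM = [0, 2, 5]                 # start of each column plus end marker


def reactant_prod(y):
    """Vector p(y) of reactant products, one entry per reaction."""
    return [y[0], y[1] * y[2]]


def fun_var(y, k):
    """f = S * diag(k) * p(y), accumulated column by column (Eq. (51))."""
    p = reactant_prod(y)
    f = [0.0] * NVAR
    for j in range(NREACT):
        velocity = k[j] * p[j]          # reaction velocity omega_j
        for m in range(CCOL_STOICM[j], CCOL_STOICM[j + 1]):
            f[IROW_STOICM[m]] += STOICM[m] * velocity
    return f


def dfun_drcoeff(y, jcoeff):
    """NVAR x len(jcoeff) matrix whose column c is df/dk_{jcoeff[c]} = S[:, j] * p_j."""
    p = reactant_prod(y)
    dfdr = [[0.0] * len(jcoeff) for _ in range(NVAR)]
    for c, j in enumerate(jcoeff):
        for m in range(CCOL_STOICM[j], CCOL_STOICM[j + 1]):
            dfdr[IROW_STOICM[m]][c] = STOICM[m] * p[j]
    return dfdr
```

Because f is linear in the rate coefficients, each column returned by `dfun_drcoeff` can be checked against a simple difference of `fun_var` evaluated at two values of the corresponding coefficient.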

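As a small illustration of Eq. (52), the following Python sketch assembles the Jacobian as $S \cdot \mathrm{diag}[k] \cdot J^{RP}$, with $J^{RP}$ held in a row-compressed layout similar to the one described for JV_RPROD. The three-species mechanism (A -> B, B + C -> A), the dense storage of S, the 0-based indices and the function names are assumptions of this sketch, not KPP output.

```python
# Hypothetical mechanism (0-based): reaction 0: A -> B, reaction 1: B + C -> A,
# with reactant products p_0 = y_A and p_1 = y_B * y_C.
NVAR, NREACT = 3, 2

# Stoichiometric matrix, kept dense here only to keep the sketch short.
S = [[-1.0, 1.0],
     [1.0, -1.0],
     [0.0, -1.0]]

# Sparsity pattern of J^RP = dp/dy (NREACT x NVAR) in row-compressed form,
# playing the roles of CROW_JVRP and ICOL_JVRP in the text.
CROW_JVRP = [0, 1, 3]     # start of each row plus end marker
ICOL_JVRP = [0, 1, 2]     # column index of each stored value


def jac_reactant_prod(y):
    """Nonzero values of dp/dy: dp_0/dy_A, dp_1/dy_B, dp_1/dy_C."""
    return [1.0, y[2], y[1]]


def jac_var(y, k):
    """Dense J = S * diag(k) * J^RP built from the sparse J^RP (Eq. (52))."""
    jv_rprod = jac_reactant_prod(y)
    jac = [[0.0] * NVAR for _ in range(NVAR)]
    for j in range(NREACT):                       # loop over reactions
        for m in range(CROW_JVRP[j], CROW_JVRP[j + 1]):
            col = ICOL_JVRP[m]
            for i in range(NVAR):
                jac[i][col] += S[i][j] * k[j] * jv_rprod[m]
    return jac
```

For this mechanism the first row of the result is the derivative of $-k_0 y_A + k_1 y_B y_C$ with respect to each concentration, which gives a quick hand check of the assembly.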
The partial derivative of the Jacobian with respect to the rate coefficient $k_j$ can be obtained from Eq. (52) as the external product of column $j$ of the stoichiometric matrix with row $j$ of $J^{RP}$

\[ \frac{\partial J_{1:\mathrm{NVAR},\,1:\mathrm{NVAR}}}{\partial k_j} = S_{1:\mathrm{NVAR},\,j} \cdot J^{RP}_{j,\,1:\mathrm{NVAR}}. \]

In practice, one needs the product of this Jacobian partial derivative with a user vector, i.e.

\[ \frac{\partial J_{1:\mathrm{NVAR},\,1:\mathrm{NVAR}}}{\partial k_j} \cdot U_{1:\mathrm{NVAR}} = S_{1:\mathrm{NVAR},\,j} \cdot \left( J^{RP}_{j,\,1:\mathrm{NVAR}} \cdot U_{1:\mathrm{NVAR}} \right). \]

This is computed by the KPP-generated subroutine

    SUBROUTINE dJacVar_dRcoeff(V, R, F, U, NCOEFF, JCOEFF, DJDR)

U is the user-supplied vector. A total of NCOEFF derivatives are taken with respect to reaction coefficients JCOEFF(1)–JCOEFF(NCOEFF). The subroutine returns the NVAR × NCOEFF matrix DJDR; each column of this matrix contains the derivative of the Jacobian with respect to one rate coefficient times the user vector. Specifically,

\[ \mathrm{DJDR}_{1:\mathrm{NVAR},\,j} = \frac{\partial J_{1:\mathrm{NVAR},\,1:\mathrm{NVAR}}}{\partial k_{\mathrm{JCOEFF}(j)}} \cdot U_{1:\mathrm{NVAR}}, \qquad 1 \le j \le \mathrm{NCOEFF}. \]

4.5. The Hessian

The Hessian contains second-order derivatives of the time derivative functions. More exactly, the Hessian is a 3-tensor such that

\[ H_{i,j,k}(t, y, p) = \frac{\partial^2 f_i(t, y_1, \ldots, y_n, p_1, \ldots, p_m)}{\partial y_j \, \partial y_k}, \qquad 1 \le i, j, k \le n. \tag{53} \]

For each component $i$ there is a Hessian matrix $H_{i,:,:}$; since the time derivative function is smooth these Hessian matrices are symmetric

\[ H_{i,j,k}(t, y, p) = \frac{\partial^2 f_i(t, y, p)}{\partial y_j \, \partial y_k} = \frac{\partial^2 f_i(t, y, p)}{\partial y_k \, \partial y_j} = H_{i,k,j}(t, y, p). \tag{54} \]

An alternative way to look at the Hessian is to consider the 3-tensor as the derivative of the Jacobian (6) with respect to individual species concentrations

\[ H_{i,j,k} = \frac{\partial J_{i,j}(t, y_1, \ldots, y_n, p)}{\partial y_k}, \qquad J_{i,j} = \frac{\partial f_i(t, y, p)}{\partial y_j}, \qquad 1 \le i, j, k \le n. \tag{55} \]

Clearly, the Hessian is a very sparse tensor (considerably sparser than the Jacobian). KPP computes the number of nonzero Hessian entries (and saves this in the variable NHESS). The Hessian itself is represented in coordinate sparse format; the real vector HESS holds the values, and the integer vectors IHESS_{I,J,K} the indices of nonzero entries

    HESS = [ 1, 2, ..., NHESS ]
    IHESS_{I,J,K} = [ 1, 2, ..., NHESS ]

such that the nonzero Hessian entries are stored as

\[ H_{i,j,k} = \mathrm{HESS}(m), \qquad \{i, j, k\} = \mathrm{IHESS\_\{I,J,K\}}(m) \qquad \text{for } H_{i,j,k} \ne 0 \text{ and } 1 \le m \le \mathrm{NHESS}. \]

The sparsity coordinate vectors IHESS_{I,J,K} are computed by KPP and initialized statically; these vectors are constant as the sparsity pattern of the Hessian does not change during the computation.

Note that due to the symmetry relation (54) it is enough to store only the upper part of each component's Hessian matrix; the sparse structures for the Hessian store only those entries $H_{i,j,k}$ for which $j \le k$.

The subroutine

    SUBROUTINE HessVar(V, R, F, HESS)

computes the Hessian (in coordinate sparse format) given the concentrations of the variable species V. Note that only the entries $H_{i,j,k}$ with $j \le k$ are computed and stored.

The subroutine

    SUBROUTINE HessVar_Vec(HESS, U1, U2, HU)

computes the Hessian times user vectors product, which is a vector and can be regarded as the derivative of the Jacobian-vector product times a vector

\[ HU \leftarrow (HESS \times U2) \cdot U1 = \frac{\partial (J(y) \cdot U1)}{\partial y} \cdot U2. \]

Similarly, the subroutine

    SUBROUTINE HessVarTR_Vec(HESS, U1, U2, HTU)

computes the Hessian transposed times user vectors product (which equals the derivative of the Jacobian transposed-vector product times a vector)

\[ HTU \leftarrow (HESS \times U2)^T \cdot U1 = \frac{\partial (J^T(y) \cdot U1)}{\partial y} \cdot U2. \]

5. Building direct-decoupled code with KPP

In this section, we present a tutorial example for the direct-decoupled code implementation in the KPP framework. For simplicity, we consider the problem of evaluating the sensitivities $S_c(t) \in \mathbb{R}^n$ of the model state (concentrations) at time moments $t$, $t^0 \le t < t^F$, with respect to the concentration of a species $c$ at a previous time instance $t^0$

\[ S_c(t) = \frac{\partial [y_1(t), \ldots, y_n(t)]}{\partial y_c(t^0)}, \qquad c = 1, \ldots, n. \tag{56} \]

The forward numerical integration, $y^i \to y^{i+1}$, is performed using the first-order linearly implicit Euler method (Hairer and Wanner, 1991, Section IV.9)

\[ \left( J(t^i, y^i) - \frac{1}{h} I \right) (y^i - y^{i+1}) = f(t^i, y^i), \qquad i = 0, \ldots, N-1, \tag{57} \]

with a constant step size $h$; the final time is reached for $t^N = t^0 + Nh = t^F$. The implementation of one forward integration step requires the solution of a system with the sparse Jacobian, and is readily implemented using KPP-generated routines.

To calculate sensitivities with respect to initial values we follow formulas (16) in the one-stage, autonomous form. The Hessian is denoted by $H = \partial J / \partial y$.

\[ y^{i+1} = y^i - k^0, \qquad S_c^{i+1} = S_c^i - k^c \quad \text{for } 1 \le c \le m, \]

\[ \left( J(t^i, y^i) - \frac{1}{h} I \right) k^0 = f(t^i, y^i), \]

\[ \left( J(t^i, y^i) - \frac{1}{h} I \right) k^c = J(t^i, y^i) S_c^i - \left( H(t^i, y^i) \times S_c^i \right) k^0 \quad \text{for } 1 \le c \le m. \]

The last relation follows by differentiating the equation for $k^0$ with respect to $y_c(t^0)$; with $k^0 = y^i - y^{i+1}$ the Hessian term carries the same sign as in Eq. (63).

For the implementation one needs the following KPP-generated routines: sparse Jacobian decomposition and substitution, sparse Jacobian times vector multiplication, and sparse Hessian times vectors multiplication.

Note that an implementation of the direct-decoupled code for computing sensitivities with respect to rate coefficients can be easily obtained following the same pattern and using the KPP-generated subroutines for the function and Jacobian derivatives with respect to the rate coefficients.

6. Building adjoint code with KPP

First applications of the KPP software tools to the adjoint code generation for chemical kinetics systems were presented by Daescu et al. (2000), who reported a superior performance over the adjoint code generated with the general purpose adjoint compiler TAMC (Giering and Kaminski, 1998) for a two-stage Rosenbrock method. In this section, we present a tutorial example for the continuous and discrete adjoint code implementation in the KPP framework. For simplicity, we consider the problem of evaluating the sensitivities of a response functional $g(y(t^F))$ with respect to the model state at previous instants in time, $t^0 \le t < t^F$,

\[ S(t) = \left[ \frac{\partial g(t^F)}{\partial y_1(t)}, \ldots, \frac{\partial g(t^F)}{\partial y_n(t)} \right]^T, \tag{58} \]

using the numerical scheme (57). The adjoint variables are initialized with $\lambda^N = \lambda(t^F) = \partial g(y(t^F)) / \partial y$.

6.1. Continuous adjoint implementation

One step of the backward integration, $\lambda^{i+1} \to \lambda^i$, of the continuous adjoint model (26) using the linearly implicit Euler method is written (use Eq. (57) for Eq. (26) with $h \leftarrow -h$)

\[ \left( J^T(t^{i+1}, y^{i+1}) - \frac{1}{h} I \right) (\lambda^{i+1} - \lambda^i) = J^T(t^{i+1}, y^{i+1}) \, \lambda^{i+1} \tag{59} \]

and after rearranging we obtain

\[ \left( J(t^{i+1}, y^{i+1}) - \frac{1}{h} I \right)^T \lambda^i = -\frac{1}{h} \lambda^{i+1}. \tag{60} \]

Since the adjoint equations are linear, fully implicit methods may be implemented at the same computational cost as linearly implicit methods. If the backward integration is performed with the implicit Euler method then

\[ \left( J(t^i, y^i) - \frac{1}{h} I \right)^T \lambda^i = -\frac{1}{h} \lambda^{i+1} \tag{61} \]

with the only difference from Eq. (59) being that the Jacobian matrix in Eq. (61) is now evaluated at $(t^i, y^i)$. Note that Eqs. (59) and (61) need only a sparse LU decomposition and a backsubstitution with the transposed Jacobian, both operations being implemented by KPP.

6.2. Discrete adjoint implementation

The discrete adjoint code is implemented according to formulae (43) and (49). Differentiating Eq. (57) with respect to $y^i$ we obtain

\[ H^i \times (y^i - y^{i+1}) + \left( J^i - \frac{1}{h} I \right) \left( I - \frac{\partial y^{i+1}}{\partial y^i} \right) = J^i, \tag{62} \]

where $J^i = J(t^i, y^i)$ and $H^i = \partial J^i / \partial y^i$. From Eq. (62) we obtain explicitly

\[ \frac{\partial y^{i+1}}{\partial y^i} = I - \left( J^i - \frac{1}{h} I \right)^{-1} \left( J^i + H^i \times (y^{i+1} - y^i) \right) = -\left( J^i - \frac{1}{h} I \right)^{-1} \left( \frac{1}{h} I + H^i \times (y^{i+1} - y^i) \right). \tag{63} \]
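Eqs. (57) and (63) can be checked numerically in the scalar case, where the Jacobian and the Hessian reduce to ordinary first and second derivatives. The sketch below is only an illustration under stated assumptions: the model $y' = -y^2$, the step size and the function names `euler_step` and `step_derivative` are hypothetical choices, not part of KPP.

```python
# Hypothetical scalar model y' = f(y) = -y**2, so J = df/dy = -2*y and H = dJ/dy = -2.
def f(y):
    return -y * y


def jac(y):
    return -2.0 * y


HESS = -2.0


def euler_step(y, h):
    """One linearly implicit Euler step, Eq. (57): (J - 1/h) * (y - y_new) = f(y)."""
    return y - f(y) / (jac(y) - 1.0 / h)


def step_derivative(y, h):
    """d y_new / d y from the second form of Eq. (63), scalar case."""
    y_new = euler_step(y, h)
    return -(1.0 / h + HESS * (y_new - y)) / (jac(y) - 1.0 / h)


def finite_difference(y, h, eps=1e-6):
    """Central-difference approximation of d y_new / d y, used as a reference."""
    return (euler_step(y + eps, h) - euler_step(y - eps, h)) / (2.0 * eps)
```

Comparing `step_derivative(1.5, 0.1)` with `finite_difference(1.5, 0.1)` gives agreement up to the accuracy of the difference quotient, which confirms the sign of the Hessian term in this simple setting.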

In Eq. (63), $\times$ stands for the 3-tensor times vector product. From Eqs. (49) and (63) it follows that

\[ \lambda^i = \left( \frac{\partial y^{i+1}}{\partial y^i} \right)^T \lambda^{i+1} = -\left( \frac{1}{h} I + H^i \times (y^{i+1} - y^i) \right)^T \left( J^i - \frac{1}{h} I \right)^{-T} \lambda^{i+1}. \tag{64} \]

Eq. (64) represents the discrete adjoint integration step associated with the linearly implicit Euler forward integration. Following Daescu et al. (2000), we introduce two new variables $k$ and $z$

\[ k = y^{i+1} - y^i, \qquad \left( J^i - \frac{1}{h} I \right)^T z = \lambda^{i+1}, \tag{65} \]

with which Eq. (64) becomes

\[ \lambda^i = -\frac{1}{h} z - (H^i \times k)^T z. \tag{66} \]

The linear system in Eq. (65) is solved for $z$ efficiently through calls to the KPP-generated sparse linear algebra routines KppDecomp and KppSolveTR. The right-hand side term in Eq. (66) is evaluated by HessVarTR_Vec taking full advantage of the Hessian sparsity.

Remark 4. By comparison of Eqs. (65) and (66) with (60) and (61) it can be seen that the discrete adjoint model is a more demanding computational process and its efficient implementation is not a trivial task. For this reason, the use of discrete adjoints in atmospheric chemistry applications has been limited to explicit or low-order linearly implicit numerical methods (Fisher and Lary, 1995; Elbern et al., 1997; Daescu et al., 2000).

Remark 5. For linear dynamics $J = J(t)$ the second-order derivatives in Eq. (66) are identically zero. Therefore, the discrete adjoint model (66) is equivalent to the continuous adjoint model (61).

7. The KPP numerical library

The KPP numerical library is extended with a set of numerical integrators and drivers for direct-decoupled and for adjoint sensitivity computations.

Several Rosenbrock methods are implemented for direct-decoupled sensitivity analysis, namely Ros1, Ros2, Ros3, Rodas3, and Ros4. For efficiency, the implementations distinguish between sensitivities with respect to initial values and sensitivities with respect to parameters. In addition, the BDF direct-decoupled integrator Odessa (Leis and Kramer, 1986) is available with the KPP sparse linear algebra routines. Drivers are present for the general computation of sensitivities with respect to all initial values (general_ddm_ic) and with respect to several (user-defined) rate coefficients (general_ddm_rc). Note that KPP produces the building blocks for the simulation and also for the sensitivity calculations; it also provides application programming templates. Some minimal programming may be required from the users in order to construct their own application from the KPP building blocks.

The continuous adjoint model can be easily constructed using KPP-generated routines and is integrated with any user-selected numerical method. The discrete adjoint models associated with the Ros1, Ros2, and Rodas3 integrators are provided for variable step size integration. Drivers for adjoint sensitivity and data assimilation applications are also included.

8. Conclusions and future work

The analysis of comprehensive chemical reaction mechanisms, parameter estimation techniques, and variational chemical data assimilation applications require the development of efficient sensitivity methods for chemical kinetics systems. Popular methods for chemical sensitivity analysis include the direct-decoupled method and automatic differentiation.

In this paper, we review the theory of the direct and the adjoint methods for sensitivity analysis in the context of chemical kinetic simulations. The direct method integrates the model and its sensitivity equations simultaneously and can efficiently evaluate the sensitivities of all concentrations with respect to few model parameters. An efficient numerical implementation is the direct-decoupled method, traditionally formulated using BDF formulas. We extended the direct-decoupled approach to Runge–Kutta and Rosenbrock stiff integration methods. The adjoint method integrates the adjoint of the tangent linear model backwards in time and can efficiently evaluate the sensitivities of a scalar response function with respect to a large number of model parameters. Sensitivities with respect to time-dependent model parameters may be obtained through a single backward integration of the adjoint model.

The kinetic preprocessor (KPP) developed by the authors is a symbolic engine that translates a given chemical mechanism into Fortran or C kinetic simulation code. Efficiency is obtained by carefully exploiting the sparsity structure of the Jacobian. A comprehensive suite of stiff numerical integrators is also provided.

The second part of this paper presents the comprehensive set of software tools for sensitivity analysis that were developed and implemented in the new release KPP-1.2. KPP was extended to symbolically generate code for the stoichiometric formulation of kinetic systems; for a direct sparse multiplication of Jacobian transposed times vector; for the solution of linear systems involving the transposed Jacobian; for the derivatives of the rate function and its Jacobian with respect to reaction coefficients; for the Hessian, i.e. the derivatives of the Jacobian with respect to the concentrations; and for the sparse tensor product of the Hessian with user-defined vectors. The Hessian and the derivatives with respect to rate coefficients are sparse entities; KPP analyzes their sparsity and produces simulation code together with efficient sparse data structures.

The use of these software tools to build sensitivity simulation code is outlined, following the theoretical overview of direct and adjoint sensitivity analysis. Direct-decoupled code is constructed in a straightforward way; BDF and Runge–Kutta direct-decoupled integrators, as well as specific drivers, are included in the KPP numerical library. The need for Jacobian derivatives prevented Rosenbrock methods from being used extensively in direct sensitivity calculations; however, the proposed symbolic differentiation technology makes the computation of these derivatives feasible. The continuous and discrete adjoint models are completely generated by the KPP software taking full advantage of the sparsity of the chemical mechanism. Flexible direct-decoupled and adjoint sensitivity code implementations are achieved, and various numerical integration methods can be employed with minimal user intervention.

In the companion paper (Daescu et al., 2003), we present an extensive set of numerical experiments and demonstrate the efficiency of the KPP software as a tool for direct/adjoint sensitivity applications. These applications include direct-decoupled and adjoint sensitivity calculations with respect to initial conditions, emissions, and reaction rate coefficients; time-dependent sensitivity analysis; variational data assimilation and parameter estimation.

Acknowledgements

The authors are grateful for NSF support for this work through the award ITR AP&IM-0205198. The work of A. Sandu was also supported in part by the NSF CAREER award ACI-0093139. D.N. Daescu acknowledges the support from the Supercomputing Institute for Digital Simulation and Advanced Computation of the University of Minnesota.

References

Bryson, A.E., Ho, Y.-C., 1975. Applied Optimal Control: Optimization, Estimation, and Control. Hemisphere Publishing Corporation, Washington, DC.
Byrne, G.D., Dean, A.M., 1993. The numerical solution of some chemical kinetics models with VODE and CHEMKIN II. Computers and Chemistry 17 (3), 297–302.
Cacuci, D.G., 1981a. Sensitivity theory for nonlinear systems. I. Nonlinear functional analysis approach. Journal of Mathematical Physics 22, 2794–2802.
Cacuci, D.G., 1981b. Sensitivity theory for nonlinear systems. II. Extensions to additional classes of responses. Journal of Mathematical Physics 22, 2803–2812.
Caracotsios, M., Stewart, W.E., 1985. Sensitivity analysis of initial value problems with mixed ODEs and algebraic equations. Computers and Chemical Engineering 9 (4), 359–365.
Carter, W.P.L., 2000. Documentation of the SAPRC-99 chemical mechanism for VOC reactivity assessment. Final Report to California Air Resources Board Contract No. 92-329 and 95-308, May 2000.
Daescu, D.N., Carmichael, G.R., Sandu, A., 2000. Adjoint implementation of Rosenbrock methods applied to variational data assimilation problems. Journal of Computational Physics 165 (2), 496–510.
Daescu, D.N., Sandu, A., Carmichael, G.R., 2003. Direct and adjoint sensitivity analysis of chemical kinetic systems with KPP: II. Numerical validation and applications. Atmospheric Environment, in press, doi:10.1016/j.atmosenv.2003.08.020.
Damian-Iordache, V., Sandu, A., Damian-Iordache, M., Carmichael, G.R., Potra, F.A., 1995. KPP - a symbolic preprocessor for chemistry kinetics - user's guide. Technical Report, The University of Iowa, Iowa City, IA 52246.
Damian-Iordache, V., Sandu, A., Damian-Iordache, M., Carmichael, G.R., Potra, F.A., 2002. The kinetic preprocessor KPP - a software environment for solving chemical kinetics. Computers and Chemical Engineering 26 (11), 1567–1579.
Djouad, R., Sportisse, B., Audiffren, N., 2002. Numerical simulation of aqueous-phase atmospheric models: use of a non-autonomous Rosenbrock method. Atmospheric Environment 36, 873–879.
Dougherty, E.P., Hwang, J.-T., Rabitz, H., 1979. Further developments and applications of the Green's function method of sensitivity analysis in chemical kinetics. Journal of Chemical Physics 71 (4), 1794–1808.
Dunker, A.M., 1984. The decoupled direct method for calculating sensitivity coefficients in chemical kinetics. Journal of Chemical Physics 81, 2385–2393.
Elbern, H., Schmidt, H., Ebel, A., 1997. Variational data assimilation for tropospheric chemistry modeling. Journal of Geophysical Research 102 (D13), 15967–15985.
Errera, Q., Fonteyn, D., 2001. Four-dimensional variational chemical assimilation of CRISTA stratospheric measurements. Journal of Geophysical Research 106 (D11), 12253–12265.
Fisher, M., Lary, D.J., 1995. Lagrangian four-dimensional variational data assimilation of chemical species. Quarterly Journal of the Royal Meteorological Society 121, 1681–1704.
Giering, R., Kaminski, T., 1998. Recipes for adjoint code construction. ACM Transactions on Mathematical Software 24 (4), 437–474.

Hairer, E., Wanner, G., 1991. Solving Ordinary Differential Equations II. Stiff and Differential-Algebraic Problems. Springer, Berlin.
Hwang, J.-T., Dougherty, E.P., Rabitz, S., Rabitz, H., 1978. The Green's function method of sensitivity analysis in chemical kinetics. Journal of Chemical Physics 69 (11), 5180–5191.
Jacobson, M.Z., Turco, R.P., 1994. SMVGEAR: a sparse-matrix, vectorized Gear code for atmospheric models. Atmospheric Environment 17, 273–284.
Leis, J.R., Kramer, M.A., 1986. ODESSA - an ordinary differential equation solver with explicit simultaneous sensitivity analysis. ACM Transactions on Mathematical Software 14 (1), 61–67.
Marchuk, G.I., 1995. Adjoint Equations and Analysis of Complex Systems. Kluwer Academic Publishers, Dordrecht.
Marchuk, G.I., Agoshkov, I.V., Shutyaev, P.V., 1996. Adjoint Equations and Perturbation Algorithms in Nonlinear Problems. CRC Press, Boca Raton, FL.
Menut, L., Vautard, R., Beekmann, M., Honoré, C., 2000. Sensitivity of photochemical pollution using the adjoint of a simplified chemistry-transport model. Journal of Geophysical Research - Atmospheres 105-D12 (15), 15379–15402.
Rostaing, N., Dalmas, S., Galligo, A., 1993. Automatic differentiation in Odyssée. Tellus 45, 558–568.
Sandu, A., Potra, F.A., Damian-Iordache, V., Carmichael, G.R., 1996. Efficient implementation of fully implicit methods for atmospheric chemistry. Journal of Computational Physics 129, 101–110.
Sandu, A., van Loon, M., Potra, F.A., Carmichael, G.R., Verwer, J.G., 1997a. Benchmarking stiff ODE solvers for atmospheric chemistry equations I - implicit vs. explicit. Atmospheric Environment 31, 3151–3166.
Sandu, A., Blom, J.G., Spee, E., Verwer, J.G., Potra, F.A., Carmichael, G.R., 1997b. Benchmarking stiff ODE solvers for atmospheric chemistry equations II - Rosenbrock solvers. Atmospheric Environment 31, 3459–3472.
Seefeld, S., Stockwell, W.R., 1999. First-order sensitivity analysis of models with time-dependent parameters: an application to PAN and ozone. Atmospheric Environment 33, 2941–2953.
Sei, A., Symes, W.W., 1995. A note on consistency and adjointness for numerical schemes. Technical Report 95527, Rice University, Houston, TX.
Sirkes, Z., Tziperman, E., 1997. Finite difference of adjoint or adjoint of finite difference? Monthly Weather Review 49, 5–40.
Stoer, J., Burlich, R., 1980. Introduction to Numerical Analysis. Springer, New York, NY.
Ustinov, E.A., 2001. Adjoint sensitivity analysis of atmospheric dynamics: application to the case of multiple observables. Journal of the Atmospheric Sciences 58, 3340–3348.
Vautard, R., Beekmann, M., Menut, L., 2000. Applications of adjoint modeling in atmospheric chemistry: sensitivity and inverse modeling. Environmental Modeling and Software 15, 703–709.
Verwer, J.G., Spee, E., Blom, J.G., Hunsdorfer, W., 1999. A second order Rosenbrock method applied to photochemical dispersion problems. SIAM Journal on Scientific Computing 20, 1456–1480.
Vuilleumier, L., Harley, R.A., Brown, N.J., 1997. First- and second-order sensitivity analysis of a photochemically reactive system (a Green's function approach). Environmental Science and Technology 31, 1206–1217.
Wang, K.Y., Lary, D.J., Shallcross, D.E., Hall, S.M., Pyle, J.A., 2001. A review on the use of the adjoint method in four-dimensional atmospheric-chemistry data assimilation. Quarterly Journal of the Royal Meteorological Society 127 (576, Part B), 2181–2204.