A Trust-Region Method Applied to Parameter Identification of a Simple
Applied Mathematical Modelling 29 (2005) 289–307
www.elsevier.com/locate/apm

A trust-region method applied to parameter identification of a simple prey–predator model

Jérôme M.B. Walmag *, Éric J.M. Delhez

Université de Liège, Modélisation et Méthodes Mathématiques, Sart-Tilman B37, Liège B-4000, Belgium

Accepted 6 September 2004

Abstract

In this paper, the calibration of the nonlinear Lotka–Volterra model is used to compare the robustness and efficiency (CPU time) of different optimisation algorithms. Five versions of a quasi-Newton trust-region algorithm are developed and compared with a widely used quasi-Newton method. The trust-region algorithms are more robust, and three of them are computationally cheaper than the more usual line search approach.

Computation of the first derivatives of the objective function is cheaper with the backward differentiation (or adjoint model) technique than with the forward method as soon as the number of parameters exceeds a few. In the optimisation problem, the additional information about the Jacobian matrix made available by the forward method reduces the number of iterations but does not compensate for the increased computational cost.

A quasi-Newton trust-region algorithm with backward differentiation and BFGS update after both successful and unsuccessful iterations represents a robust and efficient algorithm that can be used to calibrate very demanding dynamic models.

© 2004 Elsevier Inc. All rights reserved.

Keywords: Optimisation; Calibration; Dynamical system; Trust-region; Ecosystem model

* Corresponding author. E-mail address: [email protected] (J.M.B. Walmag).
0307-904X/$ - see front matter © 2004 Elsevier Inc. All rights reserved. doi:10.1016/j.apm.2004.09.005

1. Introduction

Originally restricted to the traditional engineering fields of solid mechanics and fluid dynamics, numerical models have developed into scientific tools for a wide variety of systems. In the field of environmental modelling, in particular, ecosystem models are now routinely used to describe and understand the links between the different trophic levels and the cycling of nutrients in the food web (e.g. [1,15,22–24,27,29]).

Such models are all expressions of the unquestionable law of conservation of mass but rely on pure modelling to express the interactions between the different model compartments. As a result, they contain a large number of parameters that must be tuned to give reliable, or at least realistic, simulations. These parameters are indeed far from being universal constants. Some of them reflect intrinsic but highly variable properties of the different taxa. Some other parameters are introduced to summarise a wide variety of biological processes and are therefore only weakly related to known quantities. Other parameters are introduced in the closure of the model and therefore have a purely mathematical origin. The appropriate calibration of such models therefore forms an important step in their development.

The first ecosystem models were calibrated using a trial and error procedure, comparing (often graphically) the results obtained with different sets of parameters. With dynamics as highly nonlinear as ecosystem dynamics, such a procedure is exhausting, time consuming and subjective.

Numerical methods developed in the field of mathematical optimisation, however, offer a promising alternative for at least two reasons. First, these methods provide systematic techniques that do not rely on the modeller's skill to improve the calibration. Second, mathematical optimisation aims at the reduction of an appropriate gauge of the discrepancy between observations and model predictions (the so-called objective function).
It thus introduces an objective assessment of the model performance that can be used to quantify model errors and to compare different modelling approaches.

Different approaches have been used to assess the optimum set of parameters. In environmental science, simulated annealing and genetic algorithms are among the most widely used methods (e.g. [19,13,14,2]). These methods are interesting because they avoid the calculation of the derivatives of the objective function with respect to the parameters and can be used to identify the global minimum. Thousands of simulations of the model are however necessary, even with a limited number of unknown parameters. This restricts the use of methods of this kind to simple dynamic models.

Other methods must be sought to handle complex models that are more demanding in computer resources. Evans [4] and Hemmings et al. [12] implement Powell's conjugate direction method [25]. Other authors (e.g. [6,31,8,28,30,7]) resort to various gradient descent methods to solve the optimisation problem. Such methods do not ensure convergence to the global minimum but require fewer function evaluations than meta-heuristic ones.

Considering the increasing complexity of ecosystem models and the ultimate goal of calibrating three-dimensional ecosystem models, taking into account the spatial variations and the influence of hydrodynamics on the ecosystem dynamics, optimisation algorithms requiring as few model simulations as possible are still strongly needed. This paper describes how trust-region optimisation algorithms [3] can be used in this framework. The robustness and efficiency of the trust-region algorithm described here are compared with those of the more traditional Gauss–Newton and quasi-Newton methods [25].

2. Problem formulation

We use the classical Lotka–Volterra model (e.g. [21]) as a test case for the different parameter identification techniques.
This model involves two state variables, $X(t)$, a prey, and $Y(t)$, its predator, depending on one independent time variable $t$. The dynamical equations are

$$\frac{dX}{dt} = a_1 X - a_2 X Y, \tag{1}$$

$$\frac{dY}{dt} = a_3 X Y - a_4 Y, \tag{2}$$

where $a_1$, $a_2$, $a_3$ and $a_4$ are, respectively, the growth rate of the prey, the predation rate, the growth rate of the predator and its mortality rate. The initial conditions are denoted

$$X(0) = X_0 \quad \text{and} \quad Y(0) = Y_0. \tag{3}$$

The equations are integrated numerically for a period of time $T = 100$ (arbitrary units) using a straightforward semi-implicit scheme with a time step $\Delta t = 0.005$.

In spite of its apparent simplicity, the Lotka–Volterra model is assumed to provide a suitable test case to compare parameter estimation techniques thanks to its nonlinear dynamics.

The four parameters of the model and the initial conditions are regarded as unknown parameters to be identified and therefore form the components of the control variables vector $x \in \mathbb{R}^6$,

$$x = [X_0 \;\; Y_0 \;\; a_1 \;\; a_2 \;\; a_3 \;\; a_4]^T. \tag{4}$$

In real applications, the model must be calibrated against experimental data. In this numerical study, however, a twin experiment is carried out: a reference solution is generated with the model itself using the parameters $x^{\mathrm{ref}}$ listed in Table 1. The solution curve is then sampled at $N_{\mathrm{obs}} = 40$ random points to provide the reference data $\hat{X}_i$ and $\hat{Y}_i$. The reference solution and the sampling points are shown in Fig. 1.

Table 1
Parameter reference values $x^{\mathrm{ref}}$ for the twin experiment

Parameter                  Reference value
$X_0^{\mathrm{ref}}$       1
$Y_0^{\mathrm{ref}}$       1
$a_1^{\mathrm{ref}}$       0.4
$a_2^{\mathrm{ref}}$       0.2
$a_3^{\mathrm{ref}}$       0.2
$a_4^{\mathrm{ref}}$       0.1

Fig. 1. Right: reference and initial systems in phase diagram. Left: reference and initial state variables responses versus time. The small circles show the random sampling used for assimilation.
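The twin experiment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not spell out its semi-implicit scheme, so the update used here (each equation solved implicitly in its own unknown, with the other variable frozen) is one plausible choice and not necessarily the authors' scheme. The function names `integrate_lotka_volterra` and `sample_reference`, and the random seed, are hypothetical.

```python
import numpy as np

# Reference control vector x = [X0, Y0, a1, a2, a3, a4] from Table 1.
x_ref = np.array([1.0, 1.0, 0.4, 0.2, 0.2, 0.1])


def integrate_lotka_volterra(x, T=100.0, dt=0.005):
    """Integrate Eqs. (1)-(2) and return the discrete responses X_i, Y_i."""
    X0, Y0, a1, a2, a3, a4 = x
    n_max = int(round(T / dt))  # number of integration steps, N_max = T / dt
    X = np.empty(n_max + 1)
    Y = np.empty(n_max + 1)
    X[0], Y[0] = X0, Y0
    for i in range(n_max):
        # ASSUMED semi-implicit update (not taken from the paper):
        #   X_new = X_old + dt * (a1 - a2 * Y_old) * X_new
        #   Y_new = Y_old + dt * (a3 * X_new - a4) * Y_new
        X[i + 1] = X[i] / (1.0 - dt * (a1 - a2 * Y[i]))
        Y[i + 1] = Y[i] / (1.0 - dt * (a3 * X[i + 1] - a4))
    return X, Y


def sample_reference(X, Y, n_obs=40, seed=0):
    """Pick n_obs distinct random time indices and return the sampling
    mask d together with the reference data (zero where no data exist)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.size, size=n_obs, replace=False)
    d = np.zeros(X.size)
    d[idx] = 1.0
    return d, d * X, d * Y


X_ref, Y_ref = integrate_lotka_volterra(x_ref)
d, X_hat, Y_hat = sample_reference(X_ref, Y_ref)
```

Storing the observations as full-length arrays together with a 0/1 mask is a convenience of this sketch: it lets a misfit be written as a single masked sum over all time steps.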
For parameter estimation and data assimilation, differences between predicted and measured values of these variables must be quantified by a single misfit number, the objective function. Many options are available for choosing the misfit function (see [5]). In this study, we choose the classical least-squares approach. For a given set of parameters $x$, we compute the discrete response $X_i$ and $Y_i$ of the model and estimate the error by

$$f(x) = \frac{1}{2} \sum_{i=0}^{N_{\max}} \delta_i \,(X_i - \hat{X}_i)^2 + \frac{1}{2} \sum_{i=0}^{N_{\max}} \delta_i \,(Y_i - \hat{Y}_i)^2, \tag{5}$$

where

$$\delta_i = \begin{cases} 1 & \text{if data are available at time step } i, \\ 0 & \text{otherwise} \end{cases} \tag{6}$$

is a sampling function and $N_{\max} = T/\Delta t$ is the number of integration steps. Of course, we have

$$\sum_{i=0}^{N_{\max}} \delta_i = N_{\mathrm{obs}}. \tag{7}$$

The parameter identification problem is then recast into the unconstrained optimisation problem (e.g. [9])

$$\text{find } x^* = \arg\min_x f(x), \tag{8}$$

i.e. we seek the set of parameters that minimises the misfit between model and data.

3. Trust-region method

Sequential quadratic methods are appealing to solve the optimisation problem at hand. The convergence of Newton-like methods is however not guaranteed. To ensure global convergence, i.e. convergence to a local minimum irrespective of the initial guess, it is necessary to resort to globalisation techniques. In this paper, a global trust-region algorithm is implemented and compared with the more traditional line search approach.

The objective function $f(x)$ appearing in problem (8) is assumed to be a real-valued, twice continuously differentiable function. Following the trust-region approach, a quadratic model

$$m^{(k)}(x) = f(x^{(k)}) + (x - x^{(k)})^T \nabla_x f(x^{(k)}) + \frac{1}{2} (x - x^{(k)})^T H^{(k)} (x - x^{(k)}) \tag{9}$$

of $f$ is built that is deemed to be accurate in a spherical trust region of radius $\Delta^{(k)}$ around the current iterate $x^{(k)}$.
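As a minimal sketch of the two formulas introduced so far, the least-squares misfit (5) and the quadratic model (9) can be written with NumPy as follows. The function names `misfit`, `quadratic_model` and `in_trust_region` are hypothetical, and the spherical region is assumed to be measured in the Euclidean norm; the gradient and the matrix $H^{(k)}$ are simply passed in, since how they are obtained is not covered here.

```python
import numpy as np


def misfit(X, Y, X_hat, Y_hat, d):
    """Least-squares misfit of Eq. (5); d is the 0/1 sampling function (6)."""
    return 0.5 * np.sum(d * (X - X_hat) ** 2) + 0.5 * np.sum(d * (Y - Y_hat) ** 2)


def quadratic_model(x, x_k, f_k, g_k, H_k):
    """Quadratic model of Eq. (9) around the iterate x_k.

    f_k is f(x_k), g_k the gradient of f at x_k, H_k the (approximate) Hessian.
    """
    s = x - x_k
    return f_k + s @ g_k + 0.5 * s @ H_k @ s


def in_trust_region(x, x_k, radius):
    """True if x lies in the sphere of the given radius around x_k
    (Euclidean norm assumed)."""
    return np.linalg.norm(x - x_k) <= radius


# Sanity check on an exactly quadratic function f(x) = 0.5 x^T A x + b^T x:
# with the true gradient and Hessian, the model reproduces f everywhere.
A = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([1.0, -1.0])
f = lambda x: 0.5 * x @ A @ x + b @ x
x_k = np.array([1.0, 1.0])
x_trial = np.array([0.0, 0.0])
m_trial = quadratic_model(x_trial, x_k, f(x_k), A @ x_k + b, A)
```

For a general nonlinear objective the model agrees with $f$ only approximately, which is why it is trusted only inside a region of limited radius.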