
Sampling Techniques for Computational Statistical Physics

Benedict Leimkuhler¹ and Gabriel Stoltz²
¹Edinburgh University School of Mathematics, Edinburgh, Scotland, UK
²Université Paris-Est, CERMICS, Projet MICMAC, Ecole des Ponts ParisTech – INRIA, Marne-la-Vallée, France

Mathematics Subject Classification

82B05; 82-08; 65C05; 37M05

Short Definition

The computation of macroscopic properties, as predicted by the laws of statistical physics, requires sampling phase-space configurations distributed according to the probability measure at hand. Typically, approximations are obtained as time averages over trajectories of discrete dynamics, which can be shown to be ergodic in some cases. Arguably, the greatest interest is in sampling the canonical (constant temperature) ensemble, although other distributions (isobaric, microcanonical, etc.) are also of interest. Focusing on the case of the canonical measure, three important types of methods can be distinguished: (1) Markov chain methods based on the Metropolis–Hastings algorithm; (2) discretizations of continuous stochastic differential equations which are appropriate modifications and/or limiting cases of the Hamiltonian dynamics; and (3) deterministic dynamics on an extended phase space.

Description

Applications of sampling methods arise most commonly in molecular dynamics and polymer modeling, but they are increasingly encountered in fluid dynamics and other areas. In this article, we focus on the treatment of systems of particles described by position and momentum vectors $q$ and $p$, respectively, and modeled by a Hamiltonian energy $H = H(q,p)$.

Macroscopic properties of materials are obtained, according to the laws of statistical physics, as the average of some observable with respect to a probability measure $\mu$ describing the state of the system (see the entry Calculation of Ensemble Averages):

$$E_\mu(A) = \int_{\mathcal{E}} A(q,p)\, \mu(dq\, dp). \qquad (1)$$

In practice, averages such as (1) are obtained by generating, by an appropriate numerical method, a sequence of microscopic configurations $(q^i, p^i)_{i \ge 0}$ such that

$$\lim_{n \to +\infty} \frac{1}{n} \sum_{i=0}^{n-1} A(q^i, p^i) = \int_{\mathcal{E}} A(q,p)\, \mu(dq\, dp). \qquad (2)$$

The Canonical Case

For simplicity, we consider the case of the canonical measure:

© Springer-Verlag Berlin Heidelberg 2015 B. Engquist (ed.), Encyclopedia of Applied and Computational Mathematics, DOI 10.1007/978-3-540-70529-1

$$\mu(dq\, dp) = Z^{-1}\, e^{-\beta H(q,p)}\, dq\, dp, \qquad Z = \int_{\mathcal{E}} e^{-\beta H(q,p)}\, dq\, dp, \qquad (3)$$

where $\beta = (k_B T)^{-1}$. Many sampling methods designed for the canonical ensemble can be extended or adapted to sample other ensembles.

If the Hamiltonian is separable (i.e., it is the sum of a quadratic kinetic energy and a potential energy), as is usually the case when Cartesian coordinates are used, the measure (3) has a tensorized form, and the components of the momenta are distributed according to independent Gaussian distributions. It is therefore straightforward to sample the kinetic part of the canonical measure. The real difficulty consists in sampling positions distributed according to the canonical measure

$$\nu(dq) = Z_\nu^{-1}\, e^{-\beta V(q)}\, dq, \qquad Z_\nu = \int_{\mathcal{D}} e^{-\beta V(q)}\, dq, \qquad (4)$$

which is typically a high-dimensional distribution with concentrated local modes. For this reason, many sampling methods focus on sampling the configurational part $\nu$ of the canonical measure.

Since most concepts needed for sampling purposes can be used either in the configuration space or in the phase space, the following notation will be used: the state of the system is denoted by $x \in \mathcal{S} \subset \mathbb{R}^d$, which can be the position space $q \in \mathcal{D}$ (and then $d = 3N$), or the full phase space $(q,p) \in \mathcal{E}$ with $d = 6N$. The measure $\pi(dx)$ is the canonical distribution to be sampled ($\nu$ in configuration space, $\mu$ in phase space).

General Classification

From a mathematical point of view, most sampling methods may be classified as (see [2]):
1. "Direct" probabilistic methods, such as the standard rejection method, which generate identically and independently distributed (i.i.d.) configurations
2. Markov chain techniques
3. Markovian stochastic dynamics
4. Purely deterministic methods on an extended phase space
Although the division described above is useful to bear in mind, there is a blurring of the lines between the different types of methods used in practice, with Markov chains being constructed from Hamiltonian dynamics, or degenerate diffusive processes being added to deterministic models to improve sampling efficiencies.

Direct probabilistic methods are typically based on a prior probability measure used to sample configurations, which are then accepted or rejected according to some criterion (as for the rejection method, for instance). Usually, a prior probability measure which is easy to sample should be used. However, due to the high dimensionality of the problem, it is extremely difficult to design a prior sufficiently close to the canonical distribution to achieve a reasonable acceptance rate. Direct probabilistic methods are therefore rarely used in practice.

Markov Chain Methods

Markov chain methods are mostly based on the Metropolis–Hastings algorithm [5, 13], which is a widely used method in molecular simulation. The prior required in direct probabilistic methods is replaced by a proposal move, which generates a new configuration from a former one. This new configuration is then accepted or rejected according to a criterion ensuring that the correct measure is sampled. Here again, designing a relevant proposal move is the cornerstone of the method, and this proposal depends crucially on the model at hand.

The Metropolis–Hastings Algorithm

The Metropolis–Hastings algorithm generates a Markov chain $(x^n)_{n \ge 0}$ of system configurations having the distribution of interest $\pi(dx)$ as a stationary distribution. The invariant distribution $\pi$ has to be known only up to a multiplicative constant to perform this algorithm (which is the case for the canonical measure and its marginal in position). It consists in a two-step procedure, starting from a given initial condition $x^0$:
1. Propose a new state $\tilde{x}^{n+1}$ from $x^n$ according to the proposition kernel $T(x^n, \cdot)$
2. Accept the proposition with probability

$$\min\left(1,\ \frac{\pi(\tilde{x}^{n+1})\, T(\tilde{x}^{n+1}, x^n)}{\pi(x^n)\, T(x^n, \tilde{x}^{n+1})}\right),$$

and set in this case $x^{n+1} = \tilde{x}^{n+1}$; otherwise, set $x^{n+1} = x^n$
It is important to count a configuration several times when a proposal is rejected.

The original Metropolis algorithm was proposed in [13] and relied on symmetric proposals in the configuration space. It was later extended in [5] to allow for nonsymmetric propositions, which can bias proposals toward higher-probability regions with respect to the target distribution $\pi$. The algorithm is simple to interpret in the case of a symmetric proposition kernel on the configuration space ($\pi(dq) \propto e^{-\beta V(q)}\, dq$ and $T(q, q') = T(q', q)$). The Metropolis–Hastings ratio is then simply

$$r(q, q') = \exp\left(-\beta\, (V(q') - V(q))\right).$$

If the proposed move has a lower energy, it is always accepted, which allows the chain to visit the states of higher probability more frequently. On the other hand, transitions to less likely states of higher energy are not forbidden (but are accepted less often), which is important to observe transitions from one metastable region to another when these regions are separated by some energy barrier.

Properties of the Algorithm

The probability transition kernel of the Metropolis–Hastings chain reads

$$P(x, dx') = \min\left(1, r(x, x')\right) T(x, dx') + (1 - \alpha(x))\, \delta_x(dx'), \qquad (5)$$

where $r(x, x') = \pi(x')\, T(x', x) / \left(\pi(x)\, T(x, x')\right)$ is the Metropolis–Hastings ratio and $\alpha(x) \in [0,1]$ is the probability to accept a move starting from $x$ (considering all possible propositions):

$$\alpha(x) = \int_{\mathcal{S}} \min\left(1, r(x, y)\right) T(x, dy).$$

The first part of the transition kernel corresponds to the accepted transitions from $x$ to $x'$, which occur with probability $\min(1, r(x, x'))$, while the term $(1 - \alpha(x))\, \delta_x(dx')$ encodes all the rejected steps.

A simple computation shows that the Metropolis–Hastings transition kernel $P$ is reversible with respect to $\pi$, namely, $P(x, dx')\, \pi(dx) = P(x', dx)\, \pi(dx')$. This implies that the measure $\pi$ is an invariant measure. To conclude to the pathwise ergodicity of the algorithm in the sense of (2) (relying on the results of [14]), it remains to check whether the chain is (aperiodically) irreducible, i.e., whether any state can be reached from any other one in a finite number of steps. This property depends on the proposal kernel $T$ and should be checked for the model under consideration.

Besides determining the theoretical convergence of the algorithm, the proposal kernel is also a key element in devising efficient algorithms. It is observed in practice that the optimal acceptance/rejection rate, in terms of the variance of the estimator (a mean of some observable over a trajectory), for example, is often around 0.5, ensuring some balance between:
• Large moves that decorrelate the iterates when they are accepted (hence reducing the correlations in the chain, which makes the convergence happen faster) but lead to high rejection rates (and thus degenerate samples, since the same position may be counted several times)
• Small moves that are rejected less often but do not decorrelate the iterates much
This trade-off between small and large proposal moves has been investigated rigorously in some simple cases in [16, 17], where optimal acceptance rates are obtained in a limiting regime.

Some Examples of Proposition Kernels

The simplest transition kernels are based on random walks. For instance, it is possible to modify the current configuration by a random perturbation applied to all particles. The problem with such symmetric proposals is that they may not be well suited to the target probability measure (creating very correlated successive configurations for small perturbations, or very unlikely moves for large ones). Efficient nonsymmetric proposal moves are often based on discretizations of continuous stochastic dynamics which use a biasing term such as $-\nabla V$ to ensure that the dynamics remains sufficiently close to the minima of the potential.

An interesting proposal relies on the Hamiltonian dynamics itself and consists in (1) sampling new momenta $p^n$ according to the kinetic part of the canonical measure; (2) performing one or several steps of the Verlet scheme starting from the previous position $q^n$, obtaining a proposed configuration $(\tilde{q}^{n+1}, \tilde{p}^{n+1})$; and (3) computing $r^n = \exp\left[-\beta\left(H(\tilde{q}^{n+1}, \tilde{p}^{n+1}) - H(q^n, p^n)\right)\right]$ and accepting the new position $q^{n+1}$ with probability $\min(1, r^n)$. This algorithm is known as the Hybrid Monte Carlo algorithm (first introduced in [3] and analyzed from a mathematical viewpoint in [2, 20]).

A final important example is parallel tempering strategies [10], where several replicas of the system are simulated in parallel at different temperatures, and exchanges between two replicas at different temperatures are sometimes attempted, the probability of such an exchange being given by a Metropolis–Hastings ratio.
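As a concrete illustration of the random-walk Metropolis scheme discussed above, the following sketch samples the configurational measure $\nu(dq) \propto e^{-\beta V(q)}\, dq$ for a one-dimensional double-well potential. The potential, inverse temperature, step size, and chain length are illustrative assumptions, not taken from the text:

```python
import numpy as np

def metropolis_rw(V, beta, q0, n_steps, sigma, rng):
    """Random-walk Metropolis chain targeting nu(dq) proportional to exp(-beta V(q)) dq."""
    q = q0
    samples = np.empty(n_steps)
    n_accept = 0
    for n in range(n_steps):
        q_prop = q + sigma * rng.standard_normal()  # symmetric proposal: T(q, q') = T(q', q)
        # Metropolis ratio r(q, q') = exp(-beta (V(q') - V(q)))
        if rng.random() < np.exp(-beta * (V(q_prop) - V(q))):
            q = q_prop
            n_accept += 1
        samples[n] = q  # a rejected proposal counts the current state once more
    return samples, n_accept / n_steps

# Illustrative double-well potential V(q) = (q^2 - 1)^2 with two metastable states
rng = np.random.default_rng(0)
samples, rate = metropolis_rw(lambda q: (q**2 - 1.0)**2, beta=1.0,
                              q0=0.0, n_steps=50_000, sigma=1.0, rng=rng)
print(f"acceptance rate: {rate:.2f}")
```

The step size `sigma` realizes the trade-off discussed above: shrinking it raises the acceptance rate but increases the correlation between successive iterates.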

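The Hybrid Monte Carlo proposal of the preceding section (resample the momenta, run a few steps of a Verlet integrator, then accept or reject according to the change in $H$) can be sketched along the same lines. The harmonic potential, unit masses, and integration parameters below are illustrative assumptions:

```python
import numpy as np

def hmc_step(q, V, gradV, beta, dt, n_verlet, rng):
    """One Hybrid Monte Carlo update targeting exp(-beta H) with H = |p|^2/2 + V(q), M = Id."""
    p = rng.standard_normal(q.shape) / np.sqrt(beta)  # (1) momenta from the kinetic part
    q_new, p_new = q.copy(), p.copy()
    H_old = 0.5 * p @ p + V(q)
    for _ in range(n_verlet):                         # (2) velocity Verlet steps
        p_new = p_new - 0.5 * dt * gradV(q_new)
        q_new = q_new + dt * p_new
        p_new = p_new - 0.5 * dt * gradV(q_new)
    H_new = 0.5 * p_new @ p_new + V(q_new)
    # (3) accept with probability min(1, exp(-beta (H_new - H_old)))
    if rng.random() < np.exp(-beta * (H_new - H_old)):
        return q_new, True
    return q, False

# Illustrative harmonic potential V(q) = |q|^2/2 in dimension 3
rng = np.random.default_rng(1)
q, n_accept, qs = np.zeros(3), 0, []
for n in range(10_000):
    q, accepted = hmc_step(q, lambda x: 0.5 * x @ x, lambda x: x,
                           beta=1.0, dt=0.1, n_verlet=10, rng=rng)
    n_accept += accepted
    qs.append(q.copy())
qs = np.array(qs)
print(f"acceptance rate: {n_accept / 10_000:.2f}")
print(f"sample variance of q_1: {qs[:, 0].var():.2f}")  # target variance is 1/beta = 1
```

Since the Verlet scheme nearly conserves the energy for small time steps, the acceptance rate here is close to 1; the Metropolis rule removes the remaining discretization bias.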
Continuous Stochastic Dynamics

A variety of stochastic dynamical methods are in use for sampling the canonical measure. For simplicity of exposition, we consider here systems with the $N$-body Hamiltonian $H = p^T M^{-1} p/2 + V(q)$.

Brownian Dynamics

Brownian dynamics is a stochastic dynamics on the position variable $q \in \mathcal{D}$ only:

$$dq_t = -\nabla V(q_t)\, dt + \sqrt{\frac{2}{\beta}}\, dW_t, \qquad (6)$$

where $W_t$ is a standard $3N$-dimensional Wiener process. It can be shown that this system is an ergodic process for the configurational invariant measure $\nu(dq) = Z_\nu^{-1} \exp(-\beta V(q))\, dq$, the ergodicity following from the elliptic nature of the generator of the process. The dynamics (6) may be solved numerically using the Euler–Maruyama scheme:

$$q^{n+1} = q^n - \Delta t\, \nabla V(q^n) + \sqrt{\frac{2\Delta t}{\beta}}\, G^n, \qquad (7)$$

where the $(G^n)_{n \ge 0}$ are independent and identically distributed (i.i.d.) centered Gaussian random vectors in $\mathbb{R}^{3N}$ with identity covariance $\mathbb{E}(G^n \otimes G^n) = \mathrm{Id}_{3N}$. Although the discretization scheme does not exactly preserve the canonical measure, it can be shown under certain boundedness assumptions (see [12, 21]) that the numerical scheme is ergodic, with an invariant probability close to the canonical measure $\nu$ in a suitable norm. The numerical bias may be eliminated using a Metropolis rule; see, e.g., [16, 18].

Langevin Dynamics

Hamiltonian dynamics preserves the energy, while a sampling of the canonical measure requires visiting all the energy levels. Langevin dynamics is a model of a Hamiltonian system coupled with a heat bath, defined by the following equations:

$$dq_t = M^{-1} p_t\, dt, \qquad dp_t = -\nabla V(q_t)\, dt - \gamma(q_t)\, M^{-1} p_t\, dt + \sigma(q_t)\, dW_t, \qquad (8)$$

where $W_t$ is a $3N$-dimensional standard Brownian motion, and $\gamma$ and $\sigma$ are (possibly position dependent) $3N \times 3N$ real matrices. The term $\sigma(q_t)\, dW_t$ is a fluctuation term bringing energy into the system, this energy being dissipated through the viscous friction term $-\gamma(q_t)\, M^{-1} p_t\, dt$. The canonical measure is preserved precisely when the "fluctuation-dissipation" relation $\sigma \sigma^T = 2\gamma/\beta$ is satisfied. Many variants and extensions of Langevin dynamics are available.

Using the Hörmander conditions, it is possible to demonstrate ergodicity of the system provided $\sigma(q)$ has full rank (i.e., a rank equal to $3N$) for all $q$ in position space. A spectral gap can also be demonstrated under appropriate assumptions on the potential energy function, relying on recent advances in hypocoercivity [22] or thanks to Lyapunov techniques.

Brownian motion may be viewed as either the noninertial limit ($m \to 0$) of the Langevin dynamics, or its overdamped limit ($\gamma \to \infty$) with a different time-scaling.

The discretization of stochastic differential equations, such as Langevin dynamics, is still a topic of research. Splitting methods, which divide the system into deterministic and stochastic components, are increasingly used for this purpose. As an illustration, one may adopt a method whereby Verlet integration is supplemented by an "exact" treatment of the Ornstein–Uhlenbeck process, replacing

$$dp_t = -\gamma(q_t)\, M^{-1} p_t\, dt + \sigma(q_t)\, dW_t$$

by a discrete process that samples the associated Gaussian distribution. In some cases, it is possible to show that such a method is ergodic.

Numerical discretization methods for Langevin dynamics may also be corrected in various ways to exactly preserve the canonical measure, using the Metropolis technique [5, 13] (see, e.g., the discussion in [9], Sect. 2.2).

Deterministic Dynamics on Extended Phase Spaces

It is possible to modify Hamiltonian dynamics by the addition of control laws in order to sample the canonical (or some other) distribution. The simplest example of such a scheme is the Nosé–Hoover method [6, 15], which replaces Newton's equations of motion by the system:

$$\dot{q} = M^{-1} p, \qquad \dot{p} = -\nabla V(q) - \xi p, \qquad \dot{\xi} = Q^{-1}\left(p^T M^{-1} p - N k_B T\right),$$

where $Q > 0$ is a parameter. It can be shown that this dynamics preserves the product distribution $e^{-\beta H(q,p)}\, e^{-\beta Q \xi^2/2}$ as a stationary macrostate. It is, in some cases (e.g., when the underlying system is linear), not ergodic, meaning that the invariant distribution is not unique [7]. Nonetheless, the method is still popular for sampling calculations. The best arguments for its continued success, which have not been founded rigorously yet, are that (a) molecular systems typically have large phase spaces and may incorporate liquid solvent, steep potentials, and other mechanisms that provide a strong internal diffusion property or (b) any inaccessible regions in phase space may not contribute much to the averages of typical quantities of interest.

The accuracy of sampling can sometimes be improved by stringing together "chains" of additional variables [11], but such methods may introduce additional and unneeded complexity (especially as there are more reliable alternatives, see below). When ergodicity is not a concern (e.g., when a detailed atomistic model of water is involved), an alternative to the Nosé–Hoover method is to use the Nosé–Poincaré method [1], which is derived from an extended Hamiltonian and which allows the use of symplectic integrators (preserving phase space volume and, approximately, energy, and typically providing better long-term stability; see the entry Molecular Dynamics).

Hybrid Methods by Stochastic Modification

When ergodicity is an issue, it is possible to enhance extended dynamics methods by the incorporation of stochastic processes, for example, as defined by the addition of Ornstein–Uhlenbeck terms. One such method has been proposed in [19]. It replaces the Nosé–Hoover system by the highly degenerate stochastic system:

$$dq_t = M^{-1} p_t\, dt, \qquad dp_t = \left(-\nabla V(q_t) - \xi_t p_t\right) dt, \qquad d\xi_t = Q^{-1}\left(p_t^T M^{-1} p_t - \frac{N}{\beta}\right) dt - \gamma \xi_t\, dt + \sqrt{\frac{2\gamma}{\beta Q}}\, dW_t,$$

which incorporates only a scalar noise process. This method has been called the Nosé–Hoover–Langevin method in [8], where ergodicity was also proved in the case of an underlying harmonic system ($V$ quadratic) under certain assumptions. A similar technique, the "Langevin Piston" [4], has been suggested to control the pressure in molecular dynamics, where the sampling is performed with respect to the NPT (isobaric–isothermal) ensemble.

References

1. Bond, S., Laird, B., Leimkuhler, B.: The Nosé–Poincaré method for constant temperature molecular dynamics. J. Comput. Phys. 151, 114–134 (1999)
2. Cancès, E., Legoll, F., Stoltz, G.: Theoretical and numerical comparison of sampling methods for molecular dynamics. Math. Model. Numer. Anal. 41(2), 351–390 (2007)
3. Duane, S., Kennedy, A.D., Pendleton, B.J., Roweth, D.: Hybrid Monte-Carlo. Phys. Lett. B 195(2), 216–222 (1987)
4. Feller, S., Zhang, Y., Pastor, R., Brooks, B.: Constant pressure molecular dynamics simulation: the Langevin piston method. J. Chem. Phys. 103(11), 4613–4621 (1995)
5. Hastings, W.K.: Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97–109 (1970)
6. Hoover, W.: Canonical dynamics: equilibrium phase space distributions. Phys. Rev. A 31, 1695–1697 (1985)
7. Legoll, F., Luskin, M., Moeckel, R.: Non-ergodicity of the Nosé–Hoover thermostatted harmonic oscillator. Arch. Ration. Mech. Anal. 184, 449–463 (2007)
8. Leimkuhler, B., Noorizadeh, N., Theil, F.: A gentle stochastic thermostat for molecular dynamics. J. Stat. Phys. 135(2), 261–277 (2009)
9. Lelièvre, T., Rousset, M., Stoltz, G.: Free Energy Computations: A Mathematical Perspective. Imperial College Press, London/Hackensack (2010)
10. Marinari, E., Parisi, G.: Simulated tempering – a new Monte-Carlo scheme. Europhys. Lett. 19(6), 451–458 (1992)
11. Martyna, G., Klein, M., Tuckerman, M.: Nosé–Hoover chains: the canonical ensemble via continuous dynamics. J. Chem. Phys. 97(4), 2635–2643 (1992)
12. Mattingly, J.C., Stuart, A.M., Higham, D.J.: Ergodicity for SDEs and approximations: locally Lipschitz vector fields and degenerate noise. Stoch. Process. Appl. 101(2), 185–232 (2002)
13. Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E.: Equations of state calculations by fast computing machines. J. Chem. Phys. 21(6), 1087–1091 (1953)
14. Meyn, S.P., Tweedie, R.L.: Markov Chains and Stochastic Stability. Communications and Control Engineering Series. Springer, London/New York (1993)
15. Nosé, S.: A molecular-dynamics method for simulations in the canonical ensemble. Mol. Phys. 52, 255–268 (1984)

16. Roberts, G.O., Rosenthal, J.S.: Optimal scaling of discrete approximations to Langevin diffusions. J. R. Stat. Soc. B 60, 255–268 (1998)
17. Roberts, G.O., Gelman, A., Gilks, W.R.: Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Probab. 7, 110–120 (1997)
18. Rossky, P.J., Doll, J.D., Friedman, H.L.: Brownian dynamics as smart Monte Carlo simulation. J. Chem. Phys. 69, 4628–4633 (1978)
19. Samoletov, A., Chaplain, M.A.J., Dettmann, C.P.: Thermostats for "slow" configurational modes. J. Stat. Phys. 128, 1321–1336 (2007)
20. Schütte, C.: Habilitation thesis. Freie Universität Berlin, Berlin (1999). http://publications.mi.fu-berlin.de/89/
21. Talay, D., Tubaro, L.: Expansion of the global error for numerical schemes solving stochastic differential equations. Stoch. Anal. Appl. 8(4), 483–509 (1990)
22. Villani, C.: Hypocoercivity. Mem. Am. Math. Soc. 202(950), 141 (2009)

Schrödinger Equation for Chemistry

Harry Yserentant
Institut für Mathematik, Technische Universität Berlin, Berlin, Germany

Mathematics Subject Classification

81-02; 81V55; 35J10

Short Definition

The Schrödinger equation forms the basis of nonrelativistic quantum mechanics and is fundamental for our understanding of atoms and molecules. The entry motivates this equation and embeds it into the general framework of quantum mechanics.

Description

Introduction

Quantum mechanics links chemistry to physics. Conceptions arising from quantum mechanics form the framework for our understanding of atomic and molecular processes. The history of quantum mechanics began around 1900 with Planck's analysis of the black-body radiation, Einstein's interpretation of the photoelectric effect, and Bohr's theory of the hydrogen atom. A unified framework allowing for a systematic study of quantum phenomena arose, however, first in the 1920s. Starting point was de Broglie's observation of the wave-like behavior of matter, finally resulting in the Schrödinger equation [8], and [3] for the multiparticle case. The purpose of this article is to motivate this equation from some basic principles and to sketch at the same time the mathematical structure of quantum mechanics. More information can be found in textbooks on quantum mechanics like Atkins and Friedman [1] or Thaller [11, 12]. The first one is particularly devoted to the understanding of the molecular processes that are important for chemistry. The second and the third one more emphasize the mathematical structure and contain a lot of impressive visualizations. The monograph [4] gives an introduction to the mathematical theory. A historically very interesting text, in which the mathematical framework of quantum mechanics was established and which was at the same time a milestone in the development of spectral theory, is von Neumann's seminal treatise [13]. The present exposition is largely taken from Yserentant [14].

The Schrödinger Equation of a Free Particle

Let us first recall the notion of a plane wave, a complex-valued function

$$\mathbb{R}^d \times \mathbb{R} \to \mathbb{C} : (x,t) \to e^{ik\cdot x - i\omega t}, \qquad (1)$$

with $k \in \mathbb{R}^d$ the wave vector and $\omega \in \mathbb{R}$ the frequency. A dispersion relation $\omega = \omega(k)$ assigns to each wave vector a characteristic frequency. Such dispersion relations fix the physics that is described by this kind of waves. Most common is the case $\omega = c|k|$, which arises, for example, in the propagation of light in vacuum. When the wave nature of matter was recognized, the problem was to guess the dispersion relation for the matter waves: to guess, as this hypothesis creates a new kind of physics that cannot be deduced from known theories. A good starting point is Einstein's interpretation of the photoelectric effect. When polished metal plates are irradiated by light of sufficiently short wave length, they may emit electrons. The magnitude of the electron current is, as expected, proportional to the intensity of the light source, but their energy is, surprisingly, determined by the wave length or the frequency of the incoming light. Einstein's explanation was that light consists of single light quanta with energy and momentum

$$E = \hbar\omega, \qquad p = \hbar k, \qquad (2)$$

depending on the frequency $\omega$ and the wave vector $k$. The quantity

$$\hbar = 1.0545716 \times 10^{-34}\ \mathrm{kg\, m^2\, s^{-1}}$$

is Planck's constant, an incredibly small quantity of the dimension energy $\times$ time called action, reflecting the size of the systems quantum mechanics deals with. To obtain from (2) a dispersion relation, Schrödinger started first from the energy-momentum relation of special relativity, but this led, for reasons not to be discussed here, to the wrong predictions. He therefore fell back to the energy-momentum relation

$$E = \frac{1}{2m}\, |p|^2$$

from classical, Newtonian mechanics. It leads to the dispersion relation

$$\omega = \frac{\hbar}{2m}\, |k|^2$$

for the plane waves (1). These plane waves can be superimposed to wave packets

$$\psi(x,t) = \left(\frac{1}{\sqrt{2\pi}}\right)^3 \int e^{-i\frac{\hbar}{2m}|k|^2 t}\, \widehat{\psi}_0(k)\, e^{ik\cdot x}\, dk. \qquad (3)$$

These wave packets are the solutions of the partial differential equation

$$i\hbar\, \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2m}\, \Delta\psi, \qquad (4)$$

the Schrödinger equation for a free particle of mass $m$ in absence of external forces.

The Schrödinger equation (4) is of first order in time. Its solutions, the wavefunctions of free particles, are uniquely determined by their initial state $\psi_0$. If $\psi_0$ is a rapidly decreasing function (in the Schwartz space), the solution possesses time derivatives of arbitrary order, and all of them are rapidly decreasing functions of the spatial variables. To avoid technicalities, we assume this for the moment. We further observe that

$$\int |\psi(x,t)|^2\, dx = \int |\widehat{\psi}(k,t)|^2\, dk$$

remains constant in time. This follows from Plancherel's theorem, a central result of Fourier analysis. We assume in the sequel that this value is normalized to 1, which is basic for the statistical interpretation of the wavefunctions $\psi$. The quantities $|\psi|^2$ and $|\widehat{\psi}|^2$ can then be interpreted as probability densities. The integrals

$$\int_{\Omega} |\psi(x,t)|^2\, dx, \qquad \int_{\widehat{\Omega}} |\widehat{\psi}(k,t)|^2\, dk$$

represent the probabilities to find the particle at time $t$ in the region $\Omega$ of the position space, respectively, the region $\widehat{\Omega}$ of the momentum space. The quantity

$$\int \frac{\hbar^2\, |k|^2}{2m}\, |\widehat{\psi}(k,t)|^2\, dk$$

is the expectation value of the kinetic energy. With help of the Hamilton operator

$$H = -\frac{\hbar^2}{2m}\, \Delta, \qquad (5)$$

this expectation value can be rewritten as

$$\int \overline{\psi}\, H\psi\, dx = (\psi, H\psi).$$

The expectation values of the components of the momentum are, in vector notation,

$$\int \hbar k\, |\widehat{\psi}(k,t)|^2\, dk.$$

Introducing the momentum operator

$$p = -i\hbar\nabla, \qquad (6)$$

their position representation is the inner product

$$\int \overline{\psi}\, p\psi\, dx = (\psi, p\psi).$$

The expectation values of the three components of the particle position are finally

$$\int x\, |\psi(x,t)|^2\, dx = (\psi, q\psi),$$

with $q$ the position operator given by $\psi \to x\psi$. This coincidence between observable physical quantities like energy, momentum, or position and operators acting upon the wavefunctions is in no way accidental. It forms the heart of quantum mechanics.

The Mathematical Framework of Quantum Mechanics

We have seen that the physical state of a free particle at a given time $t$ is completely determined by a function in the Hilbert space $L^2$ that again depends uniquely on the state at a given initial time. In the case of more general systems, the space $L^2$ is replaced by another Hilbert space, but the general concept remains:

Postulate 1. A quantum-mechanical system consists of a complex Hilbert space $\mathcal{H}$ with inner product $(\cdot\,,\cdot)$ and a one-parameter group $U(t)$, $t \in \mathbb{R}$, of unitary linear operators on $\mathcal{H}$ with

$$U(0) = I, \qquad U(s+t) = U(s)\, U(t),$$

that is strongly continuous in the sense that, for all $\psi \in \mathcal{H}$, in the Hilbert space norm

$$\lim_{t \to 0} U(t)\psi = \psi.$$

A state of the system corresponds to a normalized vector in $\mathcal{H}$. The time evolution of the system is described by the group of the propagators $U(t)$; the state

$$\psi(t) = U(t)\, \psi(0) \qquad (7)$$

of the system at time $t$ is uniquely determined by its state at time $t = 0$.

In the case of free particles considered so far, the solution of the Schrödinger equation, and with that the time evolution, is given by (3). The evolution operators $U(t)$, or propagators, read therefore in the Fourier or momentum representation

$$\widehat{\psi}(k) \to e^{-i\frac{\hbar}{2m}|k|^2 t}\, \widehat{\psi}(k).$$

Strictly speaking, they have first only been defined for rapidly decreasing functions, functions in a dense subspace of $L^2$, but it is obvious from Plancherel's theorem that they can be uniquely extended from there to $L^2$ and have the required properties.

The next step is to move from Postulate 1 to an abstract version of the Schrödinger equation. For that we have to establish a connection between such strongly continuous groups of unitary operators and abstract Hamilton operators. Let $D(H)$ be the linear subspace of the given system Hilbert space $\mathcal{H}$ that consists of those elements $\psi$ in $\mathcal{H}$ for which the limit

$$H\psi = i\hbar\, \lim_{\tau \to 0} \frac{U(\tau)\psi - \psi}{\tau}$$

exists in the sense of norm convergence. The mapping $\psi \to H\psi$ from the domain $D(H)$ into the Hilbert space $\mathcal{H}$ is then called the generator $H$ of the group. The generator of the evolution operator of the free particle is the operator

$$H = -\frac{\hbar^2}{2m}\, \Delta \qquad (8)$$

with the Sobolev space $H^2$ as domain of definition $D(H)$. In view of this observation, the following result for the general abstract case is unsurprising:

Theorem 1. For all initial values $\psi(0)$ in the domain $D(H)$ of the generator of the group of the propagators $U(t)$, the elements (7) are contained in $D(H)$, too, depend continuously differentiably on $t$, and satisfy the differential equation

$$i\hbar\, \frac{d}{dt}\psi(t) = H\psi(t). \qquad (9)$$

It should be noted once more, however, that the differential equation (9), the abstract Schrödinger equation, makes sense only for initial values in the domain of the generator $H$, but that the propagators are defined on the whole Hilbert space.

A little calculation shows that the generators of one-parameter unitary groups are necessarily symmetric. More than that, they are even selfadjoint. There is therefore a direct correspondence between unitary groups and selfadjoint operators, Stone's theorem, a cornerstone in the mathematical foundation of quantum mechanics:

Theorem 2. If $U(t)$, $t \in \mathbb{R}$, is a one-parameter unitary group as in Postulate 1, the domain $D(H)$ of its generator $H$ is a dense subset of the underlying Hilbert space and the generator itself is selfadjoint. Every selfadjoint operator $H$ is conversely the generator of such a one-parameter unitary group, which is usually denoted as

$$U(t) = e^{-\frac{i}{\hbar}Ht}.$$

Instead of the unitary group of the propagators, a quantum-mechanical system can thus be equivalently fixed by the generator $H$ of this group, the Hamilton operator, or, in the language of physics, the Hamiltonian of the system.

In our discussion of the free particle, we have seen that there is a direct correspondence between the expectation values of the energy, the momentum, and the position of the particle and the energy or Hamilton operator (5), the momentum operator (6), and the position operator $\psi \to x\psi$. Each of these operators is selfadjoint. This reflects the general structure of quantum mechanics:

Postulate 2. Observable physical quantities, or observables, are in quantum mechanics represented by selfadjoint operators $A : D(A) \to \mathcal{H}$ defined on dense subspaces $D(A)$ of the system Hilbert space $\mathcal{H}$. The quantity

$$\langle A \rangle = (\psi, A\psi) \qquad (10)$$

is the expectation value of a measurement of $A$ for the system in state $\psi \in D(A)$.

At this point, we have to recall the statistical nature of quantum mechanics. Quantum mechanics does not make predictions on the outcome of a single measurement of a quantity $A$ but only on the mean result of a large number of measurements on "identically prepared" states. The quantity (10) has thus to be interpreted as the mean result that one obtains from a large number of such measurements. This gives reason to consider the standard deviation or uncertainty

$$\Delta A = \| A\psi - \langle A \rangle \psi \|$$

for states $\psi \in D(A)$. The uncertainty is zero if and only if $A\psi = \langle A \rangle \psi$, that is, if $\psi$ is an eigenvector of $A$ for the eigenvalue $\langle A \rangle$. Only in such eigenstates can the quantity represented by the operator $A$ be sharply measured without uncertainty. The likelihood that a measurement returns a value outside the spectrum of $A$ is zero.

One of the fundamental results of quantum mechanics is that, only in exceptional cases, can different physical quantities be measured simultaneously without uncertainty, the Heisenberg uncertainty principle. Its abstract version reads as follows:

Theorem 3. Let $A$ and $B$ be two selfadjoint operators and let $\psi$ be a normalized state in the intersection of $D(A)$ and $D(B)$ such that $A\psi \in D(B)$ and $B\psi \in D(A)$. The product of the corresponding uncertainties is then bounded from below by

$$\Delta A\, \Delta B \ge \frac{1}{2}\, |((BA - AB)\psi, \psi)|. \qquad (11)$$

The proof is an exercise in linear algebra. As an example, we consider the components

$$q_k : \psi \to x_k \psi, \qquad p_k = -i\hbar\, \frac{\partial}{\partial x_k}$$

of the position and the momentum operator. Their commutators are

$$q_k p_k - p_k q_k = i\hbar\, I.$$

This results in the Heisenberg uncertainty principle

$$\Delta p_k\, \Delta q_k \ge \frac{1}{2}\, \hbar. \qquad (12)$$

Position and momentum therefore can never be determined simultaneously without uncertainty, independent of the considered state of the system. The inequality (12), and with that also (11), are sharp, as the instructive example

$$\psi(x) = \left(\frac{1}{\sqrt{\vartheta}}\right)^3 \psi_0\!\left(\frac{x}{\vartheta}\right)$$

of the rescaled three-dimensional Gauss functions

$$\psi_0(x) = \left(\frac{1}{\sqrt{\pi}}\right)^{3/2} \exp\left(-\frac{1}{2}\, |x|^2\right)$$

of arbitrary width demonstrates. For these wavefunctions, the inequality (12) actually turns into an equality. From

$$\widehat{\psi}(k) = \left(\sqrt{\vartheta}\right)^3\, \widehat{\psi}_0(\vartheta k)$$

one recognizes that a sharp localization in space, that is, a small parameter $\vartheta$ determining the width of $\psi$, is combined with a loss of localization in momentum.

States with a well defined, sharp energy $E$ play a particularly important role in quantum mechanics, that is, solutions $\psi \ne 0$ in $\mathcal{H}$ of the eigenvalue problem

$$H\psi = E\psi,$$

the stationary Schrödinger equation. The functions

$$t \to e^{-i\frac{E}{\hbar}t}\, \psi$$

represent then solutions of the original time-dependent Schrödinger equation. The main focus of quantum chemistry is on stationary Schrödinger equations.

The Quantum Mechanics of Multiparticle Systems

Let us assume that we have a finite collection of $N$ particles of different kind with the spaces $L^2(\Omega_i)$ as system Hilbert spaces. The Hilbert space describing the system that is composed of these particles is then the tensor product of these Hilbert spaces or a subspace of this space, that is, a space of square integrable wavefunctions

$$\psi : \Omega_1 \times \cdots \times \Omega_N \to \mathbb{C}$$

with the $N$-tuples $(\xi_1, \dots, \xi_N)$, $\xi_i \in \Omega_i$, as arguments. From the point of view of mathematics, this is of [...]

[...] states almost completely. Wavefunctions that describe the same physical state can differ at most by a constant phase shift $\psi \to e^{i\alpha}\psi$, with $\alpha$ a real number. Wavefunctions that differ by such a phase shift lead to the same expectation values of observable quantities. The proof is again an exercise in linear algebra. In view of this discussion, the requirements on the wavefunctions describing a system of indistinguishable particles are rather obvious and can be formulated in terms of the operations that formally exchange the single particles:

Postulate 3. The Hilbert space of a system of $N$ indistinguishable particles with system Hilbert space $L^2(\Omega)$ consists of complex-valued, square integrable functions

$$\psi : (\xi_1, \dots, \xi_N) \to \psi(\xi_1, \dots, \xi_N)$$

on the $N$-fold cartesian product of $\Omega$, that is, is a subspace of $L^2(\Omega^N)$. For every $\psi$ in this space and every permutation $P$ of the arguments $\xi_i$, the function $\xi \to \psi(P\xi)$ is also in this space, and moreover it differs from $\psi$ at most by a constant phase shift.
course another postulate that can in a strict sense not This postulate can be rather easily translated into a be derived from anything else, but is motivated by the symmetry condition on the wavefunctions that governs statistical interpretation of the wavefunctions and of the quantum mechanics of multiparticle systems: the quantity j j2 as a probability density. Quantum- mechanical particles of the same type, like electrons, Theorem 4 The Hilbert space describing a system of can, however, not be distinguished from each other indistinguishable particles either consists completely by any means or experiment. This is both a physical of antisymmetric wavefunctions, functions for which statement and a mathematical postulate that needs to be specified precisely. It has striking consequences for .P / D sign.P / . / the form of the physically admissible wavefunctions and of the Hilbert spaces that describe such systems holds for all permutations P of the components of indistinguishable particles. 1;:::; N of , that is, of the single particles, or To understand these consequences, we have to recall only of symmetric wavefunctions, wavefunctions for that an observable quantity like momentum or energy which is described in quantum mechanics by a selfadjoint op- .P / D . / erator A and that the inner product . ; A / represents holds for all permutations P of the arguments. the expectation value for the outcome of a measure- ment of this quantity in the physical state described by Which of the two choices is realized depends solely the normalized wavefunction . At least a necessary on the kind of particles and cannot be decided in condition that two normalized elements or unit vectors the present framework. Particles with antisymmetric and 0 in the system Hilbert space H describe wavefunctions are called fermions and particles with the same physical state is surely that . ; A / D symmetric wavefunctions bosons. . 
0;A 0/ for all selfadjoint operators A W D.A/ Â Quantum chemistry is mainly interested in elec- H ! H whose domain D.A/ contains both and trons. Electrons have a position in space and an internal 0, that is, that the expectation values of all possi- property called spin that in many respects behaves like ble observables coincide. This requirement fixes such an angular momentum. The spin  of an electron can Schr¨odinger Equation for Chemistry 1297 attain the two values  D˙1=2. The configuration The Molecular Schrodinger¨ Equation space of an electron is therefore not the R3 but the Neglecting spin, the system Hilbert space of an atom or cartesian product molecule consisting of N particles (electrons and nu- 3 N 3N clei) is the space L2.R / D L2.R /. The Hamilton ˝ D R3 f1=2; C1=2g: operator

XN XN The space L2.˝/ consists of the functions W ˝ ! C 1 1 Q Q H D  C i j ; (14) with square integrable components x ! .x; /,  D 2m i 2 jx  x j iD1 i i;jD1 i j ˙1=2, and is equipped with the inner product i¤j X Z . ; / D .x; / .x;/ dx: written down here in dimensionless form, is derived D˙1=2 via the correspondence principle from its counterpart in classical physics, the Hamilton function AsystemofN electrons is correspondingly described XN XN by wavefunctions 1 2 1 Qi Qj H.p;q/ D jpi j C 2mi 2 jqi  qj j R3 N N C iD1 i;jD1 W . / f1=2; 1=2g ! (13) i¤j with square integrable components x ! .x; /, or total energy of a system of point-like particles in the where x 2 R3N and  is a vector consisting of N spins potential field i D˙1=2. These wavefunctions are equipped with the inner product 1 XN Q Q V.q/ D i j : Z 2 jq  q j X i;jD1 i j . ; / D .x; / .x;/ dx; i¤j  The mi are the masses of the particles in multiples N where the sum now runs over the 2 possible spin of the electron mass and the Qi the charges of the vectors . particles in multiples of the electron charge. As has Electrons are fermions, as all particles with half- first been shown by Kato [7], the Hamilton operator integer spin. That is, the wavefunctions change their (14) can be uniquely extended from the space of the in- sign under a simultaneous exchange of the positions xi finitely differentiable functions with bounded support and xj and the spins i and j of electrons i ¤ j . to a selfadjoint operator H from its domain of defini- 3N 3N They are, in other words, antisymmetric in the sense tion D.H /  L2.R / to L2.R /. It fits therefore that into the abstract framework of quantum mechanics S .P x; P / D sign.P / .x; / sketched above. The domain D.H / of the extended op- 2 holds for arbitrary simultaneous permutations x !Px erator is the Sobolev space H consisting of the twice and  ! P of the electron positions and spins. 
This weakly differentiable functions with first and second is a general version of the Pauli principle, a principle order weak derivatives in L2, respectively a subspace that is of fundamental importance for the physics of of this Sobolev space consisting of components of the atoms and molecules. The Pauli principle has stunning full, spin-dependent wavefunctions in accordance with consequences. It entangles the electrons with each the Pauli principle if spin is taken into account. The other, without the presence of any direct interaction resulting Schrodinger¨ equation force. A wavefunction (13) describing such a system vanishes at points .x; / at which x D x and  D  @ i j i j i D H for indices i ¤ j . This means that two electrons @t with the same spin cannot meet at the same place, a purely quantum-mechanical repulsion effect that has is an extremely complicated object, because of the no counterpart in classical physics. high dimensionality of the problem but also because of 1298 Schr¨odinger Equation for Chemistry the oscillatory character of its solutions and the many Hunziker-van Winter-Zhislin theorem, a semi-infinite different time scales on which they vary and which can interval; see [4] for details. Of interest for chemistry range over many orders of magnitude. Comprehensive are configurations of electrons and nuclei for which survey articles on the properties of atomic and molec- the minimum of the total spectrum is an isolated ular Schrodinger¨ operators are Hunziker and Sigal [6] eigenvalue of finite multiplicity, the ground state en- and Simon [9]. ergy of the system. The assigned eigenfunctions, the Following Born and Oppenheimer [2], the full ground states, as well as all other eigenfunctions for problem is usually split into the electronic Schrodinger¨ eigenvalues below the essential spectrum decay then equation describing the motion of the electrons in exponentially. 
That means that the nuclei can bind all the field of given clamped nuclei, and an equation electrons. More information on the mathematical prop- for the motion of the nuclei in a potential field erties of these eigenfunctions can be found in  Exact that is determined by solutions of the electronic Wavefunctions Properties. Chemists are mainly inter- equation. The transition from the full Schrodinger¨ ested in the ground states. The position of the nuclei equation taking also into account the motion of the is then determined minimizing the ground state energy nuclei to the electronic Schrodinger¨ equation is a as function of their positions, a process that treats mathematically very subtle problem; see [10]and the nuclei as classical objects. It is called geometry the literature cited therein or the article of Hagedorn optimization. ( Born–Oppenheimer Approximation, Adiabatic The Born-Oppenheimer approximation is only a Limit, and Related Math. Issues) for more information. first step toward the computationally feasible models The intuitive idea behind this splitting is that the elec- that are actually used in quantum chemistry. The his- trons move much more rapidly than the much heavier torically first and most simple of these models is the nuclei and almost instantaneously follow their motion. Hartree-Fock model in which the true wavefunctions Most of quantum chemistry is devoted to the solution are approximated by correspondingly antisymmetrized of the stationary electronic Schrodinger¨ equation, tensor products the eigenvalue problem for the electronic Hamilton operator YN u.x/ D i .xi / 1 XN 1 XN 1 iD1 H D  C V .x/ C 2 i 0 2 jx  x j iD1 i;jD1 i j ¤ 3 i j of functions i of the electron positions xi 2 R .These orbital functions are then determined via a variational again written down in dimensionless form, where principle. This intuitively very appealing ansatz often leads to surprisingly accurate results. 
Quantum chemistry is full of improvements and extensions of XN XK Z this basic approach; see the comprehensive monograph V0.x/ D [5] for further information. Many entries in this jxi  a j iD1 D1 encyclopedia are devoted to quantum chemical models and approximation methods that are derived from is the nuclear potential. It acts on functions with ar- the Schrodinger¨ equation. We refer in particular 3 guments x1;:::;xN in R , which are associated with to the article ( Hartree–Fock Type Methods)on the positions of the considered electrons. The a are the Hartree-Fock method, to the contributions the now fixed positions of the nuclei and the values Z ( Post-Hartree-Fock Methods and Excited States the charges of the nuclei in multiples of the electron Modeling) on post-Hartree Fock methods and charge. The equation has still to be supplemented ( Coupled-Cluster Methods) on the coupled cluster by the symmetry constraints arising from the Pauli method, and to the article ( Density Functional principle. Theory) on density functional theory. Time-dependent The spectrum of the electronic Schrodinger¨ operator problems are treated in the contribution ( Quantum is bounded from below. Its essential spectrum is, by the Time-Dependent Problems). Schr¨odinger Equation: Computation 1299
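The stationary eigenvalue problems discussed above can be made concrete in a low-dimensional toy setting. The following sketch is not from the article: it discretizes a one-dimensional Schrödinger operator H = −(1/2) d²/dx² + V(x) with an assumed harmonic potential V(x) = x²/2 by central finite differences and computes the lowest eigenvalues, which approximate the known exact values 1/2, 3/2, 5/2.

```python
import numpy as np

# Illustrative sketch (not from the article): the stationary Schrodinger
# equation H psi = E psi in one dimension, H = -1/2 d^2/dx^2 + V(x), with
# the assumed harmonic potential V(x) = x^2/2 on an arbitrary finite grid.
n, L = 1001, 10.0
x = np.linspace(-L, L, n)
h = x[1] - x[0]

# Symmetric tridiagonal matrix for -1/2 d^2/dx^2 + V on the grid.
main = 1.0 / h**2 + 0.5 * x**2
off = -0.5 / h**2 * np.ones(n - 1)
H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

E = np.linalg.eigvalsh(H)[:3]
print(E)  # approaches the exact eigenvalues 0.5, 1.5, 2.5 as h -> 0
```

For the actual electronic problem, the dimension 3N of the configuration space rules out such grid-based approaches, which is what motivates the Hartree-Fock and related models mentioned above.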

References

1. Atkins, P., Friedman, R.: Molecular Quantum Mechanics. Oxford University Press, Oxford (1997)
2. Born, M., Oppenheimer, R.: Zur Quantentheorie der Molekeln. Ann. Phys. 84, 457–484 (1927)
3. Dirac, P.: Quantum mechanics of many electron systems. Proc. R. Soc. Lond. A Math. Phys. Eng. Sci. 123, 714–733 (1929)
4. Gustafson, S., Sigal, I.: Mathematical Concepts of Quantum Mechanics. Springer, Berlin/Heidelberg/New York (2003)
5. Helgaker, T., Jørgensen, P., Olsen, J.: Molecular Electronic Structure Theory. Wiley, Chichester (2000)
6. Hunziker, W., Sigal, I.: The quantum N-body problem. J. Math. Phys. 41, 3448–3510 (2000)
7. Kato, T.: Fundamental properties of Hamiltonian operators of Schrödinger type. Trans. Am. Math. Soc. 70, 195–221 (1951)
8. Schrödinger, E.: Quantisierung als Eigenwertproblem. Ann. Phys. 79, 361–376 (1926)
9. Simon, B.: Schrödinger operators in the twentieth century. J. Math. Phys. 41, 3523–3555 (2000)

Schrödinger Equation: Computation

Shi Jin
Department of Mathematics and Institute of Natural Science, Shanghai Jiao Tong University, Shanghai, China
Department of Mathematics, University of Wisconsin, Madison, WI, USA

The Schrödinger Equation

The linear Schrödinger equation is a fundamental quantum mechanics equation that describes the complex-valued wave function Φ(t, x, y) of molecules or atoms,

    iħ ∂_t Φ(t, x, y) = HΦ(t, x, y),  x ∈ ℝ^N, y ∈ ℝ^n,    (1)

where the vectors x and y denote the positions of N nuclei and n electrons, respectively, while ħ is the reduced Planck constant. The molecular Hamiltonian operator H consists of two parts, the kinetic energy operator of the nuclei and the electronic Hamiltonian H_e for fixed nucleonic configuration:

    H = − Σ_{j=1}^{N} (ħ²/2M_j) Δ_{x_j} + H_e(y; x),

with

    H_e(y; x) = − Σ_{j=1}^{n} (ħ²/2m_j) Δ_{y_j} + Σ_{j<k} 1/|y_j − y_k| + ⋯

The Born-Oppenheimer Approximation

The main computational challenge to solve the Schrödinger equation is the high dimensionality of the molecular configuration space ℝ^{N+n}. For example, the carbon dioxide molecule CO₂ consists of 3 nuclei and 22 electrons; thus one has to solve the full time-dependent Schrödinger equation in the space ℝ^{75}, which is a formidable task. The Born-Oppenheimer approximation [1] is a commonly used approach in computational chemistry or physics to reduce the degrees of freedom.

This approximation is based on the mass discrepancy between the light electrons, which move fast and thus will be treated quantum mechanically, and the heavy nuclei, which move more slowly and are treated classically. Here one first solves the following time-independent electronic eigenvalue problems:

    H_e(y; x) ψ_k(y; x) = E_k(x) ψ_k(y; x),  for all x ∈ ℝ^N,  k = 1, 2, …    (2)

Assume that the spectrum of H_e, a self-adjoint operator, is discrete with a complete set of orthonormal eigenfunctions {ψ_k(y; x)}, called the adiabatic basis, over the electronic coordinates for every fixed nucleus coordinate x, i.e.,

    ∫ ψ_j(y; x) ψ_k*(y; x) dy = δ_{jk},

where δ_{jk} is the Kronecker delta. The electronic eigenvalue E_k(x), called the potential energy surface, depends on the positions x of the nuclei.

Next the total wave function Φ(t, x, y) is expanded in terms of the eigenfunctions {ψ_k}:

    Φ(t, x, y) = Σ_k φ_k(t, x) ψ_k(y; x).    (3)

Assume m_j = m and M_j = M for all j. We take the atomic units by setting ħ = 1, Z = 1 and introduce ε = √(m/M). Typically ε ranges between 10⁻² and 10⁻³.

Insert ansatz (3) into the time-dependent Schrödinger equation (1), multiply all the terms from the left by ψ_k*(y; x), and integrate with respect to y; then one obtains a set of coupled differential equations:

    iε ∂φ_k/∂t (t, x) = [ − Σ_{j=1}^{N} (ε²/2) Δ_{x_j} + E_k(x) ] φ_k(t, x) + Σ_l C_{kl} φ_l(t, x),    (4)

where the coupling operator C_{kl} is important to describe quantum transitions between different potential energy surfaces.

As long as the potential energy surfaces {E_k(x)} are well separated, all the coupling operators C_{kl} are ignored, and one obtains a set of decoupled Schrödinger equations:

    iε ∂φ_k/∂t (t, x) = [ − Σ_{j=1}^{N} (ε²/2) Δ_{x_j} + E_k(x) ] φ_k(t, x),  (t, x) ∈ ℝ⁺ × ℝ^N.    (5)

Thus the nuclear motion proceeds without transitions between electronic states or energy surfaces. This is also referred to as the adiabatic approximation.

There are two components in dealing with a quantum calculation. First, one has to solve the eigenvalue problem (2). Variational methods are usually used [10]. However, for large n, this remains an intractable task. Various mean field theories have been developed to reduce the dimension. In particular, the Hartree-Fock theory [9] and the density functional theory [8] aim at representing the 3n-dimensional electronic wave function as a product of one-particle wave functions in 3 dimensions. These approaches usually yield nonlinear Schrödinger equations.

The Semiclassical Limit

The second component in quantum simulation is to solve the time-dependent Schrödinger equation (5). Its numerical approximation per se is similar to that of the parabolic heat equation. Finite difference, finite element or, most frequently, spectral methods can be used for the spatial discretization. For the time discretization, one often takes a time splitting of Strang or Trotter type that separates the kinetic energy from the potential operators in alternating steps. However, due to the smallness of ħ or ε, the numerical resolution of the wave function remains difficult. A classical method to deal with such an oscillatory wave problem is the WKB method, which seeks a solution of the form φ(t, x) = A(t, x) e^{iS(t,x)/ħ} (in the sequel, we consider only one energy level in (5), thus omitting the subscript k and replacing E_k by V). If one applies this ansatz in (5), by ignoring the O(ε) term, one obtains the eikonal equation for the phase S and the transport equation for the amplitude A:

    ∂_t S + (1/2)|∇S|² + V(x) = 0;    (6)

    ∂_t A + ∇S · ∇A + (A/2) ΔS = 0.    (7)

The eikonal equation (6) is a typical Hamilton-Jacobi equation, which develops singularities in S, usually referred to as caustics in the context of geometric optics. Beyond the singularity, one has to superimpose the solutions of S, each of which satisfies the eikonal equation (6), since the solution becomes multivalued [6].
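The Strang-type time splitting mentioned above can be sketched for an equation of the form (5) in one space dimension with a spectral (FFT) discretization. The periodic potential, the Gaussian initial data, and all parameter values below are illustrative assumptions, not taken from the article.

```python
import numpy as np

# Strang splitting sketch for  i*eps u_t = -(eps^2/2) u_xx + V(x) u  on a
# periodic domain: half a potential step, a full kinetic step in Fourier
# space, then another half potential step.
eps = 0.5
n, L = 256, 2 * np.pi
x = np.linspace(0, L, n, endpoint=False)
xi = 2 * np.pi * np.fft.fftfreq(n, d=L / n)    # Fourier wave numbers

V = 1 - np.cos(x)                              # smooth periodic potential (assumed)
u = np.exp(-10 * (x - np.pi) ** 2).astype(complex)
u /= np.sqrt(np.sum(np.abs(u) ** 2) * L / n)   # normalize the mass to 1

dt, nsteps = 1e-3, 500
half_pot = np.exp(-0.5j * dt * V / eps)        # exp(-i dt V / (2 eps))
kinetic = np.exp(-0.5j * dt * eps * xi ** 2)   # exp(-i dt eps xi^2 / 2)
for _ in range(nsteps):
    u = half_pot * u
    u = np.fft.ifft(kinetic * np.fft.fft(u))
    u = half_pot * u

mass = np.sum(np.abs(u) ** 2) * L / n
print(round(mass, 6))  # each substep is unitary, so the mass stays 1
```

Each multiplier has modulus one and the FFT/IFFT pair preserves the discrete L² norm, so the scheme conserves mass exactly, one reason splitting methods are popular for this equation.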

This equation can be solved by the method of characteristics, provided that V(x) is sufficiently smooth. Its characteristic flow is given by the following Hamiltonian system of ordinary differential equations, which is Newton's second law:

    dx/dt (t, y) = ξ(t, y);  dξ/dt (t, y) = −∇_x V(x(t, y)).    (8)

In (4), when different E_k intersect, or get close, one cannot ignore the quantum transitions between different energy levels. A semiclassical approach, known as the surface hopping method, was developed by Tully. It is based on the classical Hamiltonian system (8), with a Monte Carlo procedure to account for the quantum transitions [11].

Another approach to study the semiclassical limit is the Wigner transform [12]:

    w^ε[ψ^ε](x, ξ) := (2π)^{−n} ∫_{ℝⁿ} ψ^ε(x + εη/2) ψ^ε*(x − εη/2) e^{iξ·η} dη,    (9)

which is a convenient tool to study the limit of ψ^ε(t, x) to obtain the classical Liouville equation:

    ∂_t w + ξ · ∇_x w − ∇V(x) · ∇_ξ w = 0.    (10)

Its characteristic equation is given by the Hamiltonian system (8).

For a survey of semiclassical computational methods for the Schrödinger equation, see [5].

Various Potentials

The above Wigner analysis works well if the potential V is smooth. In applications, V can be discontinuous (corresponding to potential barriers), periodic (for solids with lattice structure), random (with an inhomogeneous background), or even nonlinear (where the equation is a field equation, with applications to optics and water waves, or a mean field equation as an approximation of the original multiparticle linear Schrödinger equation). Different approaches need to be taken for each of these different cases.

• V is discontinuous. Through a potential barrier, the quantum tunnelling phenomenon occurs, and one has to handle wave transmissions and reflections [3].
• V is periodic. The Bloch decomposition is used to decompose the wave field into sub-fields along each of the Bloch bands, which are the eigenfunctions associated with a perturbed Hamiltonian that includes H plus the periodic potential [13].
• V is random. Depending on the space dimension and strength of the randomness, the waves can be localized [2] or diffusive. In the latter case, the Wigner transform can be used to study the high-frequency limit [7].
• V is nonlinear. The semiclassical limit (10) fails after caustic formation. The understanding of this limit for strong nonlinearities remains a major mathematical challenge. Not much is known except in the one-dimensional defocusing nonlinearity case (V = |ψ|²) [4].

References

1. Born, M., Oppenheimer, R.: Zur Quantentheorie der Molekeln. Ann. Phys. 84, 457–484 (1927)
2. Fröhlich, J., Spencer, T.: Absence of diffusion in the Anderson tight binding model for large disorder or low energy. Commun. Math. Phys. 88, 465–471 (1983)
3. Griffiths, D.J.: Introduction to Quantum Mechanics, 2nd edn. Prentice Hall, Upper Saddle River (2004)
4. Jin, S., Levermore, C.D., McLaughlin, D.W.: The semiclassical limit of the defocusing NLS hierarchy. Commun. Pure Appl. Math. 52(5), 613–654 (1999)
5. Jin, S., Markowich, P., Sparber, C.: Mathematical and computational methods for semiclassical Schrödinger equations. Acta Numer. 20, 211–289 (2011)
6. Maslov, V.P., Fedoriuk, M.V.: Semi-Classical Approximation in Quantum Mechanics. D. Reidel, Dordrecht/Holland (1981)
7. Papanicolaou, G., Ryzhik, L.: Waves and transport. In: Caffarelli, L., Weinan, E. (eds.) Hyperbolic Equations and Frequency Interactions (Park City, UT, 1995). Amer. Math. Soc., Providence, RI, 5, 305–382 (1999)
8. Parr, R.G., Yang, W.: Density-Functional Theory of Atoms and Molecules. Oxford University Press, New York (1989)
9. Szabo, A., Ostlund, N.S.: Modern Quantum Chemistry. Dover, Mineola/New York (1996)
10. Thijssen, J.M.: Computational Physics. Cambridge University Press, Cambridge (1999)
11. Tully, J.: Molecular dynamics with electronic transitions. J. Chem. Phys. 93, 1061–1071 (1990)
12. Wigner, E.: On the quantum correction for thermodynamic equilibrium. Phys. Rev. 40, 749–759 (1932)
13. Wilcox, C.H.: Theory of Bloch waves. J. Anal. Math. 33, 146–167 (1978)

Scientific Computing

Hans Petter Langtangen1,2, Ulrich Rüde3, and Aslak Tveito1,2
1Simula Research Laboratory, Center for Biomedical Computing, Fornebu, Norway
2Department of Informatics, University of Oslo, Oslo, Norway
3Department of Computer Science, University Erlangen-Nuremberg, Erlangen, Germany

Scientific Computing is about practical methods for solving mathematical problems. One may argue that the field goes back to the invention of Mathematics, but today, the term Scientific Computing usually means the application of computers to solve mathematical problems. The solution process consists of several key steps, which form the so-called simulation pipeline:
1. Formulation of a mathematical model which describes the scientific problem and is suited for computer-based solution methods and some chosen hardware
2. Construction of algorithms which precisely describe the computational steps in the model
3. Implementation of the algorithms in software
4. Verification of the implementation
5. Preparation of input data to the model
6. Analysis and visualization of output data from the model
7. Validation of the mathematical model, with associated parameter estimation
8. Estimation of the precision (or uncertainty) of the mathematical model for predictions

In any of the steps, it might be necessary to go back and change the formulation of the model or previous steps to make further progress with other items on the list. When this process of iterative improvement has reached a satisfactory state, one can perform computer simulations of a process in nature, technological devices, or society. In essence, that means using the computer as a laboratory to mimic processes in the real world. Such a lab enables impossible, unethical, or costly real-world experiments, but often the process of developing a computer model gives increased scientific insight in itself. The disadvantage of computer simulations is that the quality of the results, or more precisely the quantitative prediction capabilities of the simulations, may not be well established.

Relations to Other Fields

A term closely related to Scientific Computing (and that Wikipedia actually treats as a synonym) is Computational Science, which we here define as solving a scientific problem with the aid of techniques from Scientific Computing. While Scientific Computing deals with solution techniques and tools, Computational Science has a stronger focus on the science, that is, a scientific question and the significance of the answer. In between these focal points, the craft of Scientific Computing is fundamental in order to produce an answer. Scientific Computing and Computational Science are developing into an independent scientific discipline; they combine elements from Mathematics and Computer Science to form the foundations for a new methodology of scientific discovery. Developing and applying Scientific Computing may turn out to be as fundamental to the future progress in science as was the development of novel Mathematics in the times of Newton and Euler.

Another closely related term is Numerical Analysis, which is about the "development of practical algorithms to obtain approximate solutions of mathematical problems and the validation of these solutions through their mathematical analysis." The development of algorithms is central to both Numerical Analysis and Scientific Computing, and so is the validation of the computed solutions, but Numerical Analysis has a particular emphasis on mathematical analysis of the accuracy of approximate algorithms. Some narrower definitions of Scientific Computing would say that it contains all of Numerical Analysis, but in addition applies more experimental computing techniques to evaluate the accuracy of the computed results. Scientific Computing is not necessarily restricted to approximate solution methods, although those are the most widely used. Other definitions of Scientific Computing may include additional points from the list above, up to our definition, which is wide and includes all the key steps in creating predictive computer simulations.

The term Numerical Analysis seems to have appeared in 1947 when the Institute for Numerical Analysis was set up at UCLA with funding from the National Bureau of Standards and the Office of Naval Research.
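Verification (step 4 of the pipeline above) is commonly carried out by checking that an implementation reproduces the convergence order predicted by Numerical Analysis. A minimal sketch, with an assumed test problem: a centered difference approximation of a derivative should exhibit second-order convergence as the step size is halved.

```python
import numpy as np

# Verification sketch: measure the observed convergence rate of the centered
# difference (f(x+h) - f(x-h)) / (2h) and compare with the theoretical order 2.
# The test function f = sin and the step sizes are illustrative choices.
def dfdx(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.0
hs = [0.1 / 2**i for i in range(5)]
errors = [abs(dfdx(np.sin, x, h) - np.cos(x)) for h in hs]
rates = [np.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]
print([round(r, 2) for r in rates])  # observed orders, all close to 2.0
```

An observed rate that deviates from the theoretical one is a cheap and effective signal of an implementation bug, which is why such convergence tests are a standard part of verification suites.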

A landmark for the term Scientific Computing dates back to 1980 when Gene Golub established the SIAM Journal on Scientific Computing. Computational Science, and Computational Science and Engineering, became widely used terms during the late 1990s. The book series Lecture Notes in Computational Science and Engineering was initiated in 1995 and published from 1997 (but the Norwegian University of Science and Technology proposed a professor in Computational Science as early as 1993). The term Computational Science was further popularized by the conferences SIAM Conference on Computational Science and Engineering (from 2000) and the International Conference on Computational Science (ICCS, from 2001). Many student programs with the same names appeared at the turn of the century.

In Numerical Analysis the guiding principle is to perform computations based on a strong theoretical foundation. This foundation includes a proof that the algorithm under consideration is computable and how accurate the computed solution would be. Suppose, for instance, that our aim is to solve a system of algebraic equations using an iterative method. If we have an algorithm providing a sequence of approximations given by {x_i}, the basic questions of Numerical Analysis are (i) (existence) to prove that x_{i+1} can be computed provided that x_i already is computed and (ii) (convergence) how accurate the i-th approximation is. In general, these two steps can be extremely challenging. Earlier, the goal of having a solid theoretical basis for algorithms was frequently realistic since the computers available did not have sufficient power to address very complex problems. That situation has changed dramatically in recent years, and we are now able to use computers to study problems that are way beyond the realm of Numerical Analysis.

Scientific Computing is the discipline that takes over where a complete theoretical analysis of the algorithms involved is impossible. Today, Scientific Computing is an indispensable tool in science, and the models, methods, and algorithms under consideration are rarely accessible by analytical tools. This lack of theoretical rigor is often addressed by using extensive, carefully conducted computer experiments to investigate the quality of computed solutions. More standardized methods for such investigations are an important part of Scientific Computing.

Scientific Computing: Mathematics or Computer Science?

Universities around the world are organized in departments covering a reasonable portion of science or engineering in a fairly disjoint manner. This organizational structure has caused much headache amongst researchers in Numerical Analysis and Scientific Computing because these fields typically would have to find their place either in a Computer Science department or in a Mathematics department, and Scientific Computing and Numerical Analysis belong in part to both these disciplines. Heated arguments have taken place around the world, and so far no universal solution has been provided. The discussion may, however, be used to illustrate the validity of Sayre's law (from the first lines of wikipedia.org/wiki/Sayres law: Sayre's law states, in a formulation quoted by Charles Philip Issawi: "In any dispute the intensity of feeling is inversely proportional to the value of the issues at stake." By way of corollary, it adds: "That is why academic politics are so bitter."), which is often attributed to Henry Kissinger.

Formulation of Mathematical Models

The demand for a mathematical model comes from the curiosity or need to answer a scientific question. When the model is finally available for computer simulations, the results very often lead to reformulation of the model or the scientific question. This iterative process is the essence of doing science with the aid of mathematical models.

Although some may claim that formulation of mathematical models is an activity that belongs to Engineering and the classical sciences and that the models are prescribed in Scientific Computing, we will argue that this is seldom the case. The classical scientific subjects (e.g., Physics, Chemistry, Biology, Mathematics, Computer Science, Economics) do formulate mathematical models, but the traditional focus targets models suitable for analytical insight. Models suitable for being run on computers require mathematical formulations adapted to the technical steps of the solution process. Therefore, experts on Scientific Computing will often go back to the model and reformulate it to improve steps in the solution process. In particular, it is important to formulate models that fit approximation, software, hardware, and parameter estimation constraints. Such aspects of formulating models have to a large extent been developed over the last decades through Scientific Computing research and practice. Successful Scientific Computing therefore demands a close interaction between understanding of the phenomenon under interest (often referred to as domain knowledge) and the techniques available in the solution process. Occasionally, intimate knowledge about the application and the scientific question enables the use of special properties that can reduce computing time, increase accuracy, or just simplify the model considerably.

To illustrate how the steps of computing impact modeling, consider flow around a body. In Physics (or traditional Fluid Dynamics to be more precise), one frequently restricts the development of a model to what can be treated by pen-and-paper Mathematics, which in the current example means an assumption of stationary laminar flow, that the body is a sphere, and that the domain is infinite. When analyzing flow around a body through computer simulations, a time-dependent model may be easier to implement and faster to run on modern parallel hardware, even if only a stationary solution is of interest. In addition, a time-dependent model allows the development of instabilities and the well-known oscillating vortex patterns behind the body that occur even if the boundary conditions are stationary. A difficulty with the discrete model is the need for a finite domain and appropriate boundary conditions that do not disturb the flow inside the domain (so-called Artificial Boundary Conditions). In a discrete model, it is also easy to allow for flexible body geometry and relatively straightforward to include models of turbulence. What kind of turbulence model to apply might be constrained by implementational difficulties, details of the computer architecture, or computing time feasibility. Another aspect that impacts the modeling is the estimation of the parameters in the model (usually done by solving Inverse Problems: Numerical Methods). Large errors in such estimates may favor a simpler and less accurate model over a more complex one where uncertainty in unknown parameters is greater. The conclusion on which model to choose depends on many factors and ultimately on how one defines and measures the accuracy of predictions.

Some common ingredients in mathematical models are Integration; Approximation of Functions (Curve and Surface Fitting); Optimization of Functions or Functionals; Matrix Systems; Eigenvalue Problems; Systems of Nonlinear Equations; graphs and networks; Numerical Analysis of Ordinary Differential Equations; Computational Partial Differential Equations; Integral Equations; Probability Theory; stochastic variables, processes, and fields; random walks; and Imaging Techniques. The entries on Numerical Analysis, Computational Partial Differential Equations, and Numerical Analysis of Ordinary Differential Equations provide a more detailed overview of these topics and associated algorithms.

Discrete Models and Algorithms

Mathematical models may be continuous or discrete. Continuous models can be addressed by symbolic computing; otherwise (and usually) they must be made discrete through discretization techniques. Many physical phenomena leave a choice between formulating the model as continuous or discrete. For example, a geological material can be viewed as a finite set of small elements in contact (discrete element model) or as a continuous medium with prescribed macroscopic material properties (continuum mechanical model). In the former case, one applies a set of rules for how elements interact at a mesoscale level and ends up with a large system of algebraic equations that must be solved, or sometimes one can derive explicit formulas for how each element moves during a small time interval. Another discrete modeling approach is based on cellular automata, where physical relations are described between a fixed grid of cells. The Lattice Boltzmann method, for example, uses 2D or 3D cellular automata to model the dynamics of a fluid on a mesoscopic level. Here the interactions between the states of neighboring cells are derived from the principles of statistical mechanics.

With a continuous medium, the model is expressed in terms of partial differential equations (with appropriate initial and boundary conditions). These equations must be discretized by techniques like Finite Difference Methods, Finite Element Methods, or Finite Volume Methods, which lead to systems of algebraic equations. For a purely elastic medium, one can formulate mathematically equivalent discrete and discretized continuous models, while for complicated material behavior the two model classes have their pros and cons. Some will prefer a discrete element model because it often has fewer parameters to estimate than a continuum mechanical constitutive law for the material.

Some models are spatially discrete but continuous in time. Examples include Molecular Dynamics and planetary systems, while others are purely discrete, like the network of Facebook users. The fundamental property of a discrete model, either originally discrete or a discretized continuous model, is its suitability for a computer. It is necessary to adjust the computational work, which is usually closely related to the accuracy of the discrete model, to fit the given hardware and the acceptable time for calculating the solution. The choice of discretization is also often dictated by software considerations.

With Symbolic Computing one can bring additional power to exact and approximate solution techniques based on traditional pen-and-paper Mathematics. One example is perturbation methods, where the solution is expressed as a power series of some dimensionless parameter, and one develops a hierarchy of models for determining the coefficients in the power series. The procedure originally involves lengthy analytical computations by hand, which can be automated using symbolic computing software such as Mathematica or SymPy. When the mathematical details of the chosen discretization are worked out, it remains to organize those details in algorithms. The algorithms are computational
For recipes to bridge the gap between the mathematical example, one may prefer to discretize a partial dif- description of discrete models and their associated ferential equation by a finite difference method rather implementation in software. Proper documentation of than a finite element method because the former is the algorithms is extremely important such that others much simpler to implement and thus may lead to more know all ingredients of the computer model on which efficient software and thus eventually more accurate scientific findings are based. Unfortunately, the details simulation results. of complex models that are routinely used for impor- The entries on  Numerical Analysis,  Compu- tant industrial or scientific applications may sometimes tational Partial Differential Equations,  Numerical be available only through the actual computer code, Analysis of Ordinary Differential Equations,and which might even be proprietary. Imaging present overviews of different discretization Many of the mathematical subproblems that techniques, and more specialized articles go deeper arise from a model can be broken into smaller into the various methods. problems for which there exists efficient algorithms The accuracy of the discretization is normally the and implementations. This technique has historically most important factor that governs the choice of tech- been tremendously effective in Mathematics and nique. Discretized continuous models are based on Physics. Also in Scientific Computing one often approximations, and quantifying the accuracy of these sees that the best way of solving a new problem approximations is a key ingredient in Scientific Com- is to create a clever glue between existing building puting, as well as techniques for assessing their com- blocks. putational cost. The field of Numerical Analysis has developed many mathematical techniques that can help S establish a priori or a posteriori bounds on the errors in numerous types of approximations. 
The former can Implementation bound the error by properties of the exact solution, while the latter applies the approximate (i.e., the com- Ideally, a well-formulated set of algorithms should puted) solution in the bound. Since the exact solution easily translate into computer programs. While this of the mathematical problem remains unknown, pre- is true for simple problems, it is not in the general dictive models must usually apply a posteriori esti- case. A large portion of many budgets for Science mates in an iterative fashion to control approximation Computing projects goes to software development. errors. With increasingly complicated models, the complex- When mathematical expressions for or bounds of ity of the computer programs appears to grow even the errors are not obtainable, one has to resort to faster, because Computer Languages were not designed experimental investigations of approximation errors. to easily express complicated mathematical concepts. Popular techniques for this purpose have arisen from This is one main reason why writing and maintaining verification methods (see below). scientific software is challenging. 1306 Scientific Computing
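The experimental investigation of approximation errors described above can be automated: vary a discretization parameter, compute observed errors against a known exact solution, and compare with the theoretical error model. A minimal sketch, where the trapezoidal rule and the integral of sin x are chosen purely for illustration and the expected error model is E(h) ~ C h^2:

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n intervals."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

# Exact solution of the mathematical problem: integral of sin(x) over [0, pi] is 2.
exact = 2.0
errors = [abs(trapezoid(math.sin, 0.0, math.pi, n) - exact) for n in (10, 20, 40, 80)]

# Observed convergence rates: the error model E(h) ~ C*h**2 predicts that
# halving h reduces the error by a factor of about 4, i.e., rates close to 2.
rates = [math.log2(e0 / e1) for e0, e1 in zip(errors, errors[1:])]
print(rates)
```

Checking that the observed rates approach the theoretical value 2 is exactly the kind of automated test that can run on every registered change of the software.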

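The automation of perturbation expansions mentioned earlier can be illustrated with a small SymPy sketch. The quadratic toy equation and the parameter name eps are invented for the example; the expansion of the positive root in the small parameter is computed symbolically:

```python
import sympy as sp

# Toy perturbation problem: the positive root of x**2 + eps*x - 1 = 0,
# expanded as a power series in the small parameter eps.
eps, x = sp.symbols("eps x")

roots = sp.solve(sp.Eq(x**2 + eps * x - 1, 0), x)
root = max(roots, key=lambda r: r.subs(eps, 0))   # select the root near x = 1
series = sp.series(root, eps, 0, 3).removeO()     # first terms: 1 - eps/2 + eps**2/8
print(series)
```

The same few lines replace a hand computation that quickly becomes unmanageable at higher orders.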
The fundamental challenge of developing correct and efficient software for Scientific Computing is notoriously underestimated. It has an inherent complexity that cannot be addressed by automated procedures alone, but must be acknowledged as an independent scientific and engineering problem. Also, testing of the software quickly consumes even more resources. Software Engineering is a central field of Computer Science which addresses techniques for developing and testing software systems in general, but it has so far had minor impact on scientific software. We can identify three reasons. First, the structure of the software is closely related to the mathematical concepts involved. Second, testing is very demanding, since for most relevant applications we do not know the answer beforehand. Actually, much scientific software is written to explore new phenomena where neither qualitative nor quantitative properties of the computed results are known. Even when analytical insight about the solution is available, the computed results will contain unknown discretization errors. The third reason is that scientific software quickly consumes the largest computational resources available and hence employs special High-Performance Computing (HPC) platforms. These platforms imply that computations must run in parallel on heterogeneous architectures, a fact that seriously complicates the software development. There has traditionally been little willingness to adopt good Software Engineering techniques if they cause any loss of computational performance (which is normally the case).

In the first decades of Scientific Computing, FORTRAN was the dominating computer language for implementing algorithms, and FORTRAN still has a strong position. Classical FORTRAN (with the dialects IV, 66, and 77) is ideal for mathematical formulas and heavy array computations, but it lacks more sophisticated features like classes, namespaces, and modules to elegantly express complicated mathematical concepts. Therefore, the much richer C++ language attracted significant attention in scientific software projects from the mid-1990s. Today, C++ is a dominating language in new projects, although the recent FORTRAN 2003/2008 standards have many of the features that made C++ popular. C, C++, and FORTRAN enable the programmer to utilize almost the maximum efficiency of an HPC architecture. On the other hand, much less computationally efficient languages such as MATLAB, Mathematica, and Python have reached considerable popularity for implementing Scientific Computing algorithms. The reason is that these languages are more high level; that is, they allow humans to write computer code closer to the mathematical concepts than what is easily achievable with C, C++, and FORTRAN.

Over the last four decades, numerous high-quality libraries have been developed, especially for frequently occurring problems from numerical linear algebra, differential equations, approximation of functions, optimization, etc. Development of new software today will usually maximize the utilization of such well-tested libraries. The result is a heterogeneous software environment that involves several languages and software packages, often glued together in easy-to-use and easy-to-program applications in MATLAB or Python.

If we want to address complicated mathematical models in Scientific Computing, the software needs to provide the right abstractions to ease the implementation of the mathematical concepts. That is, the step from the mathematical description to the computer code must be minimized under the constraint of minor performance loss. This constrained optimization problem is the great challenge in developing scientific software.

Most classical mathematical methods are serial, but utilization of modern computing platforms requires algorithms to run in parallel. The development of algorithms for Parallel Computing is one of the most significant activities in Scientific Computing today. Implementation of parallel algorithms, especially in combination with high-level abstractions for complicated mathematical concepts, is an additional research challenge. Easy-to-use parallel implementations are needed if a broad audience of scientists shall effectively utilize modern HPC architectures such as clusters with multi-core and multi-GPU PCs. Fortunately, many of the well-known libraries for, e.g., linear algebra and optimization have been updated to perform well on modern hardware.

Often scientific progress is limited by the available hardware capacity in terms of memory and computational power. Large-scale projects can require expensive resources, where not only the supercomputers per se but also their operational cost become limiting factors. Here it becomes mandatory that the accuracy provided by a Scientific Computing methodology is evaluated relative to its cost. Traditional approaches that just quantify the number of numerical operations often turn out to be misleading. Worse than that, theoretical analysis often provides only asymptotic bounds for the error with unspecified constants. This translates to cost assessments with unspecified constants that are of little use for quantifying the real cost of obtaining a simulation result. In Scientific Computing, such mathematical techniques must be combined with more realistic cost predictions to guide the development of effective simulation methods. This important research direction comes under names such as Hardware-Oriented Numerics for PDE or systematic performance engineering and is usually based on a combination of rigorous mathematical techniques with engineering-like heuristics. Additionally, technological constraints, such as the energy consumption of supercomputers, are increasingly found to become critical bottlenecks. This in turn motivates genuinely new research directions, such as evaluating the numerical efficiency of an algorithm in terms of its physical resource requirements like the energy usage.

Verification

Verification of scientific software means setting up a series of tests to bring evidence that the software solves the underlying mathematical problems correctly. A fundamental challenge is that the problems are normally solved approximately, with an error that is quantitatively unknown. Comparison with exact mathematical results will in those cases yield a discrepancy, but the purpose of verification is to ensure that there are no additional nonmathematical discrepancies caused by programming mistakes.

The ideal tests for verification use exact solutions of the discrete problems. The discrepancy between such solutions and those produced by the software should be limited by (small) roundoff errors due to Finite Precision Arithmetic in the machine. The standard verification test, however, is to use exact solutions of the mathematical problems to compute observed errors and check that these behave correctly. More precisely, one develops exact solutions for the mathematical problem to be solved, or a closely related one, and establishes a theoretical model for the errors. Error bounds from Numerical Analysis will very often suggest error models. For each problem one can then vary discretization parameters to generate a data set of errors and see if the relation between the errors and the discretization parameters is as expected from the error model. This strategy constitutes perhaps the most important verification technique and demonstrates how dependent software testing is on results from Numerical Analysis.

Analytical insight from alternative or approximate mathematical models can in many physical problems be used to test that certain aspects of the solution behave correctly. For example, one may have asymptotic results for the solution far away from the locations where the main physics is generated. Moreover, principles such as mass, momentum, and energy balance for the whole system under consideration can also be checked. These types of tests are not as effective for uncovering software bugs as the tests described above, but they add evidence that the software works.

Scientific software needs to compute correct numbers, but it must also run fast. Tests for checking that the computational speed has not changed unexpectedly are therefore an integral part of any test suite. Especially on parallel computing platforms, this type of efficiency test is as important as the correctness tests.

A fundamental requirement of verification procedures is that all tests are automated and can be repeated at any time. Version control systems for keeping track of different versions of the files in a software package can be integrated with automatic testing such that every registered change in the software triggers a run of the test suite, with effective reports to help track down new errors. It is also easy to roll back to previous versions of the software that passed all tests. Science papers that rely heavily on Scientific Computing should ideally point to web sites where the version history of the software and the tests are available, preferably also with snapshots of the whole computing environment where the simulation results were obtained. These elements are important for the Reproducibility: Methods of the scientific findings.

Preparation of Input Data

With increasingly complicated mathematical models, the preparation of input data for such models has become a very resource-consuming activity. One example is the computation of the drag (fuel consumption) of a car, which demands a mathematical description of the car's surface and a division of the air space outside the car into small hexahedra or tetrahedra.
Techniques of Geometry Processing can be used to measure and mathematically represent the car's surface, while Meshing Algorithms are used to populate the air flow domain with small hexahedra or tetrahedra. An even greater challenge is met in biomedical or geophysical computing, where Segmentation Methods must normally be accompanied by human interpretation when extracting geometries from noisy images.

A serious difficulty in preparing input data is related to lack of knowledge of certain data. This situation requires special methods for parameter estimation, as described below.

Visualization of Output Data

Computer simulations tend to generate large amounts of numbers, which are meaningless to the scientists unless the numbers are transformed into informative pictures closely related to the scientific investigation at hand. This is the goal of Visualization. The simplest and often most effective type of visualization is to draw curves relating key quantities in the investigation. Frequently, curves give incomplete insight into processes, and one needs to visualize more complex objects such as big networks or time-dependent, three-dimensional scalar or vector fields. Even if the goal of simulating a car's fuel consumption is a single number for the drag force, any physical insight into enhancing geometric features of the car requires detailed visualization of the air velocities, the pressure, and vortex structures in time and 3D space. Visualization is partly about advanced numerical algorithms and partly about visual communication. The importance of effective visualization in Scientific Computing can hardly be overestimated, as it is a key tool in software debugging, scientific investigations, and communication of the main research findings.

Validation and Parameter Estimation

While verification is about checking that the algorithms and their implementations are done right, validation is about checking that the mathematical model is relevant for predictions. When we create a mathematical model for a particular process in nature, a technological device, or society, we think of a forward model in the sense that the model requires a set of input data and can produce a set of output data. The input data must be known for the output data to be computed. Very often this is not the case, because some input data remain unknown, while some output data are known or can be measured. And since we lack some input data, we cannot run the forward model. This situation leads to a parameter estimation problem, also known as a model calibration problem, a parameter identification problem, or an inverse problem (Inverse Problems: Numerical Methods). The idea is to use some of the known output data to estimate some of the lacking input data with the aid of the model. Forward models are for the most part well posed, in the sense that small errors in input data are not amplified significantly in the output. Inverse problems, on the other hand, are normally ill posed: small errors in the measured output may have a severe impact on the precision of our estimates of input data. Much of the methodology research is about reducing the ill posedness.

Validation consists in establishing evidence that the computer model really models the real phenomena we are interested in. The idea is to have a range of test cases, each with some known output, usually measured in physical experiments, and to check that the model reproduces the known output. The tests are straightforwardly conducted if all input data are known. However, very often some input parameters in the model are unknown, and the typical validation procedure for a given test case is to tune those parameters in a way that reproduces the known output. Provided that the tuned parameters are within realistic regions, the model passes the validation test in the sense that there exists relevant input data such that the model predicts the observed output.

Understanding of the process being simulated can effectively guide manual tuning of unknown input parameters. Alternatively, many different numerical methods exist for automatically fitting input parameters. Most of them are based on constrained optimization, where one wants to minimize the squared distance between predicted and observed output with respect to the unknown input parameters, given the constraint that any predicted value must obey the model. Numerous specific solution algorithms are found in the literature on deterministic inverse problems (Inverse Problems: Numerical Methods). Usually, the solution of an inverse problem requires a large number of solutions of the corresponding forward problem.
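The least-squares fitting of input parameters described above can be made concrete with a minimal sketch. The forward model (exponential decay y(t) = y0*exp(-k*t)), the data, and the parameter value are all invented for illustration; for simplicity the squared distance is minimized on log-transformed data, where the model is linear in k and ordinary least squares applies in closed form:

```python
import math

def forward(y0, k, times):
    """Forward model: exponential decay y(t) = y0 * exp(-k*t)."""
    return [y0 * math.exp(-k * t) for t in times]

# Synthetic "observed" output generated with a known parameter k = 0.7,
# so we can check that the estimation procedure recovers it.
times = [0.5, 1.0, 1.5, 2.0, 2.5]
observed = forward(2.0, 0.7, times)

# log y = log y0 - k*t is linear in t, so the best-fit slope of the
# log-transformed data gives the estimate of -k directly.
n = len(times)
t_mean = sum(times) / n
z = [math.log(y) for y in observed]
z_mean = sum(z) / n
slope = sum((t - t_mean) * (zi - z_mean) for t, zi in zip(times, z)) \
        / sum((t - t_mean) ** 2 for t in times)
k_est = -slope
print(k_est)  # recovers 0.7 up to roundoff
```

With noise-free synthetic data the known parameter is recovered exactly; with measurement errors in the observed output, the estimate acquires the variability discussed below.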

Many scientific questions immediately lead to inverse problems. Seismic imaging is an example, where one aims to estimate the spatial properties of the Earth's crust using measurements of reflected sound waves. Mathematically, the unknown properties are spatially varying coefficients in partial differential equations, and the measurements contain information about the solution of the equations. The primary purpose is to estimate the value of the coefficients, which in a forward model for wave motion constitute input data that must be known. When the focus is on solving the inverse problem itself, one can often apply simpler forward models (in seismic imaging, e.g., ordinary differential equations for ray tracing have traditionally replaced full partial differential equations for the wave motion as forward model).

Reliable estimation of parameters requires more observed data than unknown input data. Then we can search for the best fit of parameters, but there is no unique definition of what "best" is. Furthermore, measured output data are subject to measurement errors. It is therefore fruitful to acknowledge that the solution of inverse problems has a variability. Control of this variability gives parameter estimates with corresponding statistical uncertainties. For this purpose, one may turn to solving stochastic inverse problems. These are commonly formulated in a Bayesian framework (Bayesian Statistics: Computation), where probability densities for the input parameters are suggested, based on prior knowledge, and the framework updates these probability densities by taking the forward model and the data into account. The inserted prior knowledge handles the ill posedness of deterministic inverse problems, but at a much increased computational cost.

Uncertainty Quantification

With stochastic parameter estimation we immediately face the question: How does the uncertainty in estimated parameters propagate through the model? The simplest and most general method for uncertainty quantification is Monte Carlo simulation. Large samples of input data are drawn at random from the known probability densities and fed as input to the model. The forward model is run to compute the output corresponding to each sample. From the samples of output data, one can compute the average, variance, and other statistical measures of the quantities of interest. One must often use of the order of 10^5 to 10^7 samples (and hence runs of the forward model) to compute reasonably precise statistics. Much faster but less general methods exist. During recent years, Polynomial Chaos Expansions have become popular. These assume that the mapping from stochastic input to selected output quantities is smooth, such that the mapping can be effectively described by a polynomial expansion with few terms. The expansion may converge exponentially fast and reduce the number of runs of the forward model by several orders of magnitude.

Having validated the model and estimated the uncertainty in the output, we can eventually perform predictive computer simulations and calculate the precision of the predictions. At this point we have completed the final item in our list of key steps in the simulation pipeline.

We remark that although the stochastic parameter estimation framework naturally turns an originally deterministic model into a stochastic one, modelers may early in the modeling process assume that the details of some effect are not precisely known and therefore describe the effect as a stochastic quantity. Some input to the model is then stochastic, and the question is how the statistical variability is propagated through the model. This gives basically the same computational problem as in uncertainty quantification. One example where stochastic quantities are used directly in the modeling is environmental forces from wind and waves on structures. The forces may be described as stochastic space-time fields with statistical parameters that must be estimated from measurements.
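The Monte Carlo procedure just described can be sketched in a few lines. Everything in the sketch is invented for illustration: the forward model is a cheap closed-form stand-in for an expensive simulation, and the uncertain input is drawn from an assumed normal density:

```python
import random
import statistics

def forward(k):
    """Stand-in forward model; in practice this is an expensive simulation run."""
    return 3.0 * k + 1.0

random.seed(42)  # reproducible sampling

# Uncertain input parameter with an assumed density: k ~ N(0.7, 0.05**2).
samples = [random.gauss(0.7, 0.05) for _ in range(100_000)]

# One forward-model run per sample, then statistics of the output ensemble.
outputs = [forward(k) for k in samples]
mean = statistics.fmean(outputs)
stdev = statistics.stdev(outputs)
print(mean, stdev)  # close to 3*0.7 + 1 = 3.1 and 3*0.05 = 0.15
```

Because the toy model is linear, the output statistics can be checked against the exact propagated mean and standard deviation; for a real nonlinear forward model, the sampled statistics are all one has.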
Methods from Uncertainty Quantification: Computation can be used to answer such questions, that is, to quantify the uncertainty in the predicted output. If parameters are estimated by Bayesian frameworks, we have complete probability descriptions. With simpler estimation methods we may still want to describe uncertainty in the parameters in terms of assumed probability densities.

Laboratory and Field Experiments

The workflow in Scientific Computing to establish predictive simulation models is seen to involve knowledge from several subjects, clearly Applied and Numerical Mathematics, Statistics, and Computer Science, but the mathematical techniques and software work must be closely integrated with domain-specific modeling knowledge from the field where the science problem originates, say Physics, Mechanics, Geology, Geophysics, Astronomy, Biology, Finance, or Engineering disciplines. A less emphasized integration is with laboratory and field experiments, as Scientific Computing is often applauded for eliminating the need for experiments. However, we have explained that predictive simulations require validation, parameter estimation, and control of the variability of input and output data. The computations involved in these tasks cannot be carried out without access to carefully conducted experiments in the laboratory or the field. A hope is that an extensive amount of large-scale data sets from experiments can be made openly available to all computational scientists and thereby accelerate the integration of experimental data in Scientific Computing.

Self-Consistent Field (SCF) Algorithms

Eric Cancès
Ecole des Ponts ParisTech – INRIA, Université Paris Est, CERMICS, Projet MICMAC, Marne-la-Vallée, Paris, France

Definition

Self-consistent field (SCF) algorithms usually refer to numerical methods for solving the Hartree-Fock or the Kohn-Sham equations. By extension, they also refer to constrained optimization algorithms aiming at minimizing the Hartree-Fock, or the Kohn-Sham, energy functional on the set of admissible states.

Discretization of the Hartree-Fock Model

As usual in electronic structure calculation, we adopt the system of atomic units, obtained by setting to 1 the values of the reduced Planck constant $\hbar$, of the elementary charge, of the mass of the electron, and of the constant $4\pi\varepsilon_0$, where $\varepsilon_0$ is the dielectric permittivity of the vacuum.

The Hartree-Fock model reads, for a molecular system containing $N$ electrons, as

  $E_0^{\mathrm{HF}}(N) = \inf\left\{ E^{\mathrm{HF}}(\Phi),\ \Phi = (\phi_1,\ldots,\phi_N) \in \left(H^1(\mathbb{R}^3_\pm)\right)^N,\ \langle \phi_i, \phi_j \rangle = \delta_{ij} \right\},$   (1)

where $\mathbb{R}^3_\pm = \mathbb{R}^3 \times \{\uparrow,\downarrow\}$,

  $\langle \phi_i, \phi_j \rangle := \int_{\mathbb{R}^3_\pm} \phi_i^*\, \phi_j \, dx := \sum_{\sigma \in \{\uparrow,\downarrow\}} \int_{\mathbb{R}^3} \phi_i(r,\sigma)^*\, \phi_j(r,\sigma) \, dr,$

and

  $E^{\mathrm{HF}}(\Phi) = \frac{1}{2} \sum_{i=1}^N \int_{\mathbb{R}^3_\pm} |\nabla \phi_i|^2 + \int_{\mathbb{R}^3_\pm} \rho_\Phi\, V_{\mathrm{nuc}} + \frac{1}{2} \int_{\mathbb{R}^3} \int_{\mathbb{R}^3} \frac{\rho_\Phi(r)\, \rho_\Phi(r')}{|r - r'|} \, dr \, dr' - \frac{1}{2} \int_{\mathbb{R}^3_\pm} \int_{\mathbb{R}^3_\pm} \frac{|\gamma_\Phi(x,x')|^2}{|r - r'|} \, dx \, dx'.$

The function $V_{\mathrm{nuc}}$ denotes the nuclear potential. If the molecular system contains $M$ nuclei of charges $z_1,\ldots,z_M$ located at positions $R_1,\ldots,R_M$, the following holds:

  $V_{\mathrm{nuc}}(r) = -\sum_{k=1}^M \frac{z_k}{|r - R_k|}.$

The density $\rho_\Phi$ and the density matrix $\gamma_\Phi$ associated with $\Phi$ are defined by

  $\rho_\Phi(r) = \sum_{i=1}^N \sum_{\sigma \in \{\uparrow,\downarrow\}} |\phi_i(r,\sigma)|^2 \quad\text{and}\quad \gamma_\Phi(x,x') = \sum_{i=1}^N \phi_i(x)\, \phi_i(x')^*.$

The derivation of the Hartree-Fock model from the $N$-body Schrödinger equation is detailed in the contribution by I. Catto to the present volume.

The Galerkin approximation of the minimization problem (1) consists in approaching $E_0^{\mathrm{HF}}(N)$ by

  $E_0^{\mathrm{HF}}(N, \mathcal{V}) = \min\left\{ E^{\mathrm{HF}}(\Phi),\ \Phi = (\phi_1,\ldots,\phi_N) \in \mathcal{V}^N,\ \langle \phi_i, \phi_j \rangle = \delta_{ij} \right\},$   (2)

where $\mathcal{V} \subset H^1(\mathbb{R}^3_\pm)$ is a finite-dimensional approximation space of dimension $N_b$. Obviously, $E_0^{\mathrm{HF}}(N) \le E_0^{\mathrm{HF}}(N, \mathcal{V})$ for any $\mathcal{V} \subset H^1(\mathbb{R}^3_\pm)$.

In the sequel, we denote by $\mathbb{C}^{m \times n}$ the vector space of the complex-valued matrices with $m$ lines and $n$ columns, and by $\mathbb{C}_h^{m \times m}$ the vector space of the hermitian matrices of size $m \times m$. We endow these spaces with the Frobenius scalar product defined by $(A,B)_F := \mathrm{Tr}(A^* B)$. Choosing a basis $(\chi_1,\ldots,\chi_{N_b})$ of $\mathcal{V}$, and expanding $\Phi = (\phi_1,\ldots,\phi_N) \in \mathcal{V}^N$ as

  $\phi_i(x) = \sum_{\mu=1}^{N_b} C_{\mu i}\, \chi_\mu(x),$

problem (2) also reads

  $E_0^{\mathrm{HF}}(N, \mathcal{V}) = \min\left\{ E^{\mathrm{HF}}(C C^*),\ C \in \mathbb{C}^{N_b \times N},\ C^* S C = I_N \right\},$   (3)

where $I_N$ is the identity matrix of rank $N$ and where

  $E^{\mathrm{HF}}(D) = \mathrm{Tr}(hD) + \frac{1}{2} \mathrm{Tr}(G(D)D).$

The entries of the overlap matrix $S$ and of the one-electron Hamiltonian matrix $h$ are given by

  $S_{\mu\nu} := \int_{\mathbb{R}^3_\pm} \chi_\mu^*\, \chi_\nu$   (4)

and

  $h_{\mu\nu} := \frac{1}{2} \int_{\mathbb{R}^3_\pm} \nabla\chi_\mu^* \cdot \nabla\chi_\nu - \sum_{k=1}^M z_k \int_{\mathbb{R}^3_\pm} \frac{\chi_\mu^*(x)\, \chi_\nu(x)}{|r - R_k|} \, dx.$   (5)

The linear map $G \in \mathcal{L}(\mathbb{C}_h^{N_b \times N_b})$ is defined by

  $[G(D)]_{\mu\nu} := \sum_{\kappa,\lambda=1}^{N_b} \left[ (\mu\nu|\kappa\lambda) - (\mu\lambda|\kappa\nu) \right] D_{\kappa\lambda},$

where $(\mu\nu|\kappa\lambda)$ is the standard notation for the two-electron integrals

  $(\mu\nu|\kappa\lambda) := \int_{\mathbb{R}^3_\pm} \int_{\mathbb{R}^3_\pm} \frac{\chi_\mu(x)\, \chi_\nu^*(x)\, \chi_\kappa(x')\, \chi_\lambda^*(x')}{|r - r'|} \, dx \, dx'.$   (6)

We will make use of the symmetry property

  $\mathrm{Tr}(G(D)D') = \mathrm{Tr}(G(D')D).$   (7)

The formulation (3) is referred to as the molecular orbital formulation of the Hartree-Fock model, by contrast with the density matrix formulation defined as

  $E_0^{\mathrm{HF}}(N, \mathcal{V}) = \min\left\{ E^{\mathrm{HF}}(D),\ D \in \mathbb{C}_h^{N_b \times N_b},\ DSD = D,\ \mathrm{Tr}(SD) = N \right\}.$   (8)

It is easily checked that problems (3) and (8) are equivalent, remarking that the map $\{ C \in \mathbb{C}^{N_b \times N} \mid C^* S C = I_N \} \ni C \mapsto C C^* \in \{ D \in \mathbb{C}_h^{N_b \times N_b} \mid DSD = D,\ \mathrm{Tr}(SD) = N \}$ is onto. For any $\Phi = (\phi_1,\ldots,\phi_N) \in \mathcal{V}^N$ with $\phi_i(x) = \sum_{\mu=1}^{N_b} C_{\mu i}\, \chi_\mu(x)$, the following holds:

  $\gamma_\Phi(x,x') = \sum_{\mu,\nu=1}^{N_b} D_{\mu\nu}\, \chi_\mu(x)\, \chi_\nu(x')^* \quad\text{with}\quad D = C C^*.$

We refer to the contribution by Y. Maday for the derivation of a priori and a posteriori estimates on the energy difference $E_0^{\mathrm{HF}}(N, \mathcal{V}) - E_0^{\mathrm{HF}}(N)$, and on the distance between the minimizers of (2) and those of (1), for $L^2$ and Sobolev norms.

Most Hartree-Fock calculations are performed in atomic orbital basis sets (see, e.g., [9] for details), and more specifically with Gaussian type orbitals. The latter are of the form

  $\chi_\mu(r,\sigma) = \left( \sum_{g=1}^{N_g} P_{\mu,g}(r - R_{k(\mu)})\, e^{-\alpha_{\mu,g} |r - R_{k(\mu)}|^2} \right) S_\mu(\sigma),$   (9)

where $P_{\mu,g}$ is a polynomial, $\alpha_{\mu,g} > 0$, and $S_\mu = \alpha$ or $\beta$, with $\alpha(\sigma) = \delta_{\sigma,\uparrow}$ and $\beta(\sigma) = \delta_{\sigma,\downarrow}$. The main advantage of Gaussian type orbitals is that all the integrals in (4)-(6) can be calculated analytically [5].
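Anticipating the Roothaan-type fixed-point iteration discussed below (diagonalize the Fock matrix F(D) = h + G(D), then build the next D from the lowest N eigenvectors), a numerical sketch is easy to write. Everything here is a toy stand-in: the tight-binding matrix h and the Hubbard-like on-site mean-field map G are invented for illustration and are not a real quantum-chemistry discretization (the basis is taken orthonormal, S = I):

```python
import numpy as np

Nb, N, U = 6, 2, 0.5   # basis size, number of electrons, toy coupling constant

# Toy one-electron matrix: a tight-binding chain (assumption for illustration).
h = -(np.eye(Nb, k=1) + np.eye(Nb, k=-1))

def G(D):
    """Toy mean-field map: on-site repulsion driven by the density diagonal."""
    return U * np.diag(np.diag(D))

D = np.zeros((Nb, Nb))          # initial density matrix
for _ in range(200):
    F = h + G(D)                            # Fock matrix F(D) = h + G(D)
    eps_vals, C = np.linalg.eigh(F)         # one linear eigenvalue problem
    D_new = C[:, :N] @ C[:, :N].T           # projector on lowest-N eigenvectors (Aufbau)
    converged = np.linalg.norm(D_new - D) < 1e-10
    D = D_new
    if converged:
        break

# Each iterate is an orthogonal projector with trace N, as required.
print(abs(np.trace(D) - N) < 1e-8, np.linalg.norm(D @ D - D) < 1e-8)  # True True
```

For this weakly coupled toy problem the plain fixed point converges; the discussion below explains why the same iteration can oscillate for realistic Hartree-Fock problems.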

In order to simplify the notation, we assume in the sequel that the basis $(\chi_1,\dots,\chi_{N_b})$ is orthonormal or, equivalently, that $S = I_{N_b}$. The molecular orbital and density matrix formulations of the discretized Hartree-Fock model then read

$$E_0^{HF}(N,V) = \min\left\{ E^{HF}(CC^*),\ C \in \mathcal{C} \right\}, \qquad \mathcal{C} = \left\{C \in \mathbb{C}^{N_b\times N} \mid C^*C = I_N\right\}, \qquad (10)$$

$$E_0^{HF}(N,V) = \min\left\{ E^{HF}(D),\ D \in \mathcal{P} \right\}, \qquad \mathcal{P} = \left\{D \in \mathbb{C}^{N_b\times N_b}_h \mid D^2 = D,\ \mathrm{Tr}(D) = N\right\}. \qquad (11)$$

Hartree-Fock-Roothaan-Hall Equations

The matrix

$$F(D) := h + G(D) \qquad (12)$$

is called the Fock matrix associated with $D$. It is the gradient of the Hartree-Fock energy functional at $D$, for the Frobenius scalar product. It can be proved (see again [9] for details) that if $D$ is a local minimizer of (11), then

$$D = \sum_{i=1}^N \phi_i \phi_i^*, \qquad F(D)\phi_i = \epsilon_i \phi_i, \qquad \phi_i^*\phi_j = \delta_{ij}, \qquad \epsilon_1 \le \dots \le \epsilon_N \ \text{are the lowest $N$ eigenvalues of } F(D). \qquad (13)$$

The above system is a nonlinear eigenvalue problem: the vectors $\phi_i$ are orthonormal eigenvectors of the hermitian matrix $F(D)$ associated with the lowest $N$ eigenvalues of $F(D)$, and the matrix $F(D)$ depends on the eigenvectors $\phi_i$ through the definition of $D$. The first three equations of (13) are called the Hartree-Fock-Roothaan-Hall equations. The property that $\epsilon_1,\dots,\epsilon_N$ are the lowest $N$ eigenvalues of the hermitian matrix $F(D)$ is referred to as the Aufbau principle. An interesting property of the Hartree-Fock model is that, for any local minimizer of (11), there is a positive gap between the $N$th and $(N{+}1)$th eigenvalues of $F(D)$: $\epsilon_{N+1} - \epsilon_N > 0$ [2, 9]. From a geometrical viewpoint, $D = \sum_{i=1}^N \phi_i\phi_i^*$ is the matrix of the orthogonal projector on the vector space spanned by the eigenvectors associated with the lowest $N$ eigenvalues of $F(D)$. Note that (13) can be reformulated without any reference to the molecular orbitals $\phi_i$ as follows:

$$D \in \operatorname{argmin}\left\{ \mathrm{Tr}(F(D)D'),\ D' \in \mathcal{P} \right\}, \qquad (14)$$

and that, as $\epsilon_{N+1} - \epsilon_N > 0$, the right-hand side of (14) is a singleton. This formulation is a consequence of the following property: for any hermitian matrix $F \in \mathbb{C}^{N_b\times N_b}_h$ with eigenvalues $\epsilon_1 \le \dots \le \epsilon_{N_b}$ and any orthogonal projector $D$ of the form $D = \sum_{i=1}^N \phi_i\phi_i^*$ with $\phi_i^*\phi_j = \delta_{ij}$ and $F\phi_i = \epsilon_i\phi_i$, the following holds for all $D' \in \mathcal{P}$:

$$\mathrm{Tr}(FD') \ \ge\ \mathrm{Tr}(FD) + \frac{\epsilon_{N+1} - \epsilon_N}{2}\, \|D - D'\|_F^2. \qquad (15)$$

Roothaan Fixed Point Algorithm

It is very natural to try and solve (13) by means of the following fixed point algorithm, originally introduced by Roothaan in [24]:

$$F(D_k^{\mathrm{Rth}})\,\phi_{i,k+1} = \epsilon_{i,k+1}\,\phi_{i,k+1}, \qquad \phi_{i,k+1}^*\phi_{j,k+1} = \delta_{ij}, \qquad \epsilon_{1,k+1} \le \dots \le \epsilon_{N,k+1} \ \text{are the lowest $N$ eigenvalues of } F(D_k^{\mathrm{Rth}}), \qquad D_{k+1}^{\mathrm{Rth}} = \sum_{i=1}^N \phi_{i,k+1}\,\phi_{i,k+1}^*. \qquad (16)$$
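As a toy illustration (a sketch, not from the source), the diagonalize-and-assemble iteration (16) takes a few lines of linear algebra; the matrix `h` and the two-electron tensor `eri` below are random data with the required symmetries, not a physical system:

```python
import numpy as np

def roothaan(h, eri, N, D0, maxiter=50, tol=1e-10):
    """Roothaan fixed-point iteration (16) in an orthonormal basis (S = I)."""
    def G(D):
        J = np.einsum('mnkl,lk->mn', eri, D)   # Coulomb-like contraction
        K = np.einsum('mlkn,lk->mn', eri, D)   # exchange-like contraction
        return J - K
    D = D0
    for _ in range(maxiter):
        F = h + G(D)                           # Fock matrix (12)
        w, C = np.linalg.eigh(F)               # eigenvalues in ascending order
        Cocc = C[:, :N]                        # Aufbau: lowest N eigenvectors
        D_new = Cocc @ Cocc.conj().T           # new density matrix, a projector
        if np.linalg.norm(D_new - D) < tol:
            return D_new
        D = D_new
    return D

rng = np.random.default_rng(0)
nb, N = 4, 2
h = rng.standard_normal((nb, nb)); h = (h + h.T) / 2
eri = 0.05 * rng.standard_normal((nb,) * 4)    # weak, symmetrized toy tensor
eri = (eri + eri.transpose(1, 0, 3, 2) + eri.transpose(2, 3, 0, 1)
       + eri.transpose(3, 2, 1, 0)) / 4
D0 = np.diag([1.0, 1.0, 0.0, 0.0])
D = roothaan(h, eri, N, D0)
```

Each iterate is by construction a rank-$N$ orthogonal projector, i.e., an element of $\mathcal{P}$.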

In view of (15), the fixed point iteration (16) also reads

$$D_{k+1}^{\mathrm{Rth}} \in \operatorname{argmin}\left\{ \mathrm{Tr}(F(D_k^{\mathrm{Rth}})D'),\ D' \in \mathcal{P} \right\}. \qquad (17)$$

Solving the nonlinear eigenvalue problem (13) then boils down to solving a sequence of linear eigenvalue problems.

It was, however, early realized that the above algorithm often fails to converge. More precisely, it can be proved that, under the assumption that

$$\inf_{k\in\mathbb{N}} \left( \epsilon_{N+1,k} - \epsilon_{N,k} \right) > 0, \qquad (18)$$

which seems to be always satisfied in practice, the sequence $(D_k^{\mathrm{Rth}})_{k\in\mathbb{N}}$ generated by the Roothaan algorithm either converges to a solution $D$ of the Hartree-Fock-Roothaan-Hall equations satisfying the Aufbau principle,

$$\|D_k^{\mathrm{Rth}} - D\| \xrightarrow[k\to\infty]{} 0 \quad \text{with } D \text{ satisfying (13)}, \qquad (19)$$

or asymptotically oscillates between two states $D_{\mathrm{even}}$ and $D_{\mathrm{odd}}$, none of them being solutions to the Hartree-Fock-Roothaan-Hall equations:

$$\|D_{2k}^{\mathrm{Rth}} - D_{\mathrm{even}}\| \xrightarrow[k\to\infty]{} 0 \quad \text{and} \quad \|D_{2k+1}^{\mathrm{Rth}} - D_{\mathrm{odd}}\| \xrightarrow[k\to\infty]{} 0. \qquad (20)$$

The behavior of the Roothaan algorithm can be explained mathematically by noticing that the sequence $(D_k^{\mathrm{Rth}})_{k\in\mathbb{N}}$ is obtained by minimizing by relaxation the functional

$$\mathcal{E}(D,D') = \mathrm{Tr}(hD) + \mathrm{Tr}(hD') + \mathrm{Tr}(G(D)D').$$

Indeed, we deduce from (7), (12), and (17) that

$$D_1^{\mathrm{Rth}} \in \operatorname{argmin}\left\{\mathrm{Tr}(F(D_0^{\mathrm{Rth}})D'),\ D' \in \mathcal{P}\right\} = \operatorname{argmin}\left\{\mathrm{Tr}(hD_0^{\mathrm{Rth}}) + \mathrm{Tr}(hD') + \mathrm{Tr}(G(D_0^{\mathrm{Rth}})D'),\ D' \in \mathcal{P}\right\} = \operatorname{argmin}\left\{\mathcal{E}(D_0^{\mathrm{Rth}}, D'),\ D' \in \mathcal{P}\right\},$$

$$D_2^{\mathrm{Rth}} \in \operatorname{argmin}\left\{\mathrm{Tr}(F(D_1^{\mathrm{Rth}})D),\ D \in \mathcal{P}\right\} = \operatorname{argmin}\left\{\mathrm{Tr}(hD) + \mathrm{Tr}(hD_1^{\mathrm{Rth}}) + \mathrm{Tr}(G(D)D_1^{\mathrm{Rth}}),\ D \in \mathcal{P}\right\} = \operatorname{argmin}\left\{\mathcal{E}(D, D_1^{\mathrm{Rth}}),\ D \in \mathcal{P}\right\},$$

and so on and so forth. Together with (15) and (18), this leads to numerical convergence [6]: $\|D_{k+2}^{\mathrm{Rth}} - D_k^{\mathrm{Rth}}\|_F \to 0$. The convergence/oscillation properties (19)/(20) can then be obtained by resorting to the Łojasiewicz inequality [17].

Oscillations of the Roothaan algorithm are called charge sloshing in the physics literature. Replacing $\mathcal{E}(D,D')$ with the penalized functional $\mathcal{E}(D,D') + \frac{b}{2}\|D - D'\|_F^2$ ($b > 0$) suppresses the oscillations when $b$ is large enough, but the resulting algorithm,

$$D_{k+1} \in \operatorname{argmin}\left\{ \mathrm{Tr}\big((F(D_k) - bD_k)D'\big),\ D' \in \mathcal{P} \right\},$$

often converges toward a critical point of the Hartree-Fock-Roothaan-Hall equations which does not satisfy the Aufbau principle and is, therefore, not a local minimizer. This algorithm, introduced in [25], is called the level-shifting algorithm. It has been analyzed in [6, 17].

Direct Minimization Methods

The molecular orbital formulation (10) and the density matrix formulation (11) of the discretized Hartree-Fock model are constrained optimization problems. In both cases, the minimization set is a (non-convex) smooth compact manifold. The set $\mathcal{C}$ is a manifold of dimension $NN_b - \frac12 N(N+1)$, called the Stiefel manifold; the set $\mathcal{P}$ of the rank-$N$ orthogonal projectors in $\mathbb{C}^{N_b\times N_b}$ is a manifold of dimension $N(N_b - N)$, called the Grassmann manifold. We refer to [12] for an interesting review of optimization methods on Stiefel and Grassmann manifolds.

From a historical viewpoint, the first minimization method for solving the Hartree-Fock-Roothaan-Hall equations, the so-called steepest descent method, was proposed in [20]. It basically consists in performing one unconstrained gradient step on the function $D \mapsto E^{HF}(D)$ (i.e., $\widetilde{D}_{k+1} = D_k - t\nabla E^{HF}(D_k) = D_k - tF(D_k)$), followed by a "projection" step $\widetilde{D}_{k+1} \to D_{k+1} \in \mathcal{P}$. The "projection" can be done using McWeeny's purification, an iterative method consisting in replacing at each step $D$ with $3D^2 - 2D^3$. It is easily checked that if $\widetilde{D}_{k+1}$ is close enough to $\mathcal{P}$, the purification method converges quadratically to the point of $\mathcal{P}$ closest to $\widetilde{D}_{k+1}$ for the Frobenius norm.
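A minimal sketch (not from the source) of McWeeny's purification as just described:

```python
import numpy as np

def mcweeny(D, steps=30):
    """McWeeny purification: iterate D <- 3 D^2 - 2 D^3.

    Near an orthogonal projector, each eigenvalue is driven quadratically
    toward 0 or 1, so the iteration converges to the projector closest to
    the starting matrix in Frobenius norm.
    """
    for _ in range(steps):
        D2 = D @ D
        D = 3.0 * D2 - 2.0 * (D2 @ D)
    return D

# perturb a rank-1 projector slightly and purify it back onto the manifold
P = np.diag([1.0, 0.0, 0.0])
E = 1e-3 * np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
D = mcweeny(P + E)
```

The result is again idempotent with unchanged rank, which is exactly what the "projection" step of the steepest descent method requires.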
The steepest descent Rth P D argmin E.D;D1 /; D 2 ; method has the drawback of any basic gradient method: it converges very slowly, and is therefore never used in and so on, and so forth. Together with (15)and(18), practice. Rth this leads to numerical convergence [6]: kDkC2  Newton-like algorithms for computing Hartree- Rth Dk kF ! 0. The convergence/oscillation properties Fock ground states appeared in the early 1960s with (19)/(20) can then be obtained by resorting to the Bacskay quadratic convergent (QC) method [3]. Łojasiewicz inequality [17]. Bacskay’s approach was to lift the constraints and Oscillations of the Roothaan algorithm are called use a standard Newton algorithm for unconstrained charge sloshing in the physics literature. Replacing optimization. The local maps of the manifold P used 1314 Self-Consistent Field (SCF) Algorithms n o in [3] are the following exponential maps: for any Pe D D 2 CNb Nb ;0Ä D Ä 1; Tr .D/ D N ; CNb Nb  h C 2 such that C SC D INb , where 0 Ä D Ä 1 means that 0 Ä ˚ D˚ Ä ˚ ˚ for Ä N I 0 all ˚ 2 C b , or equivalently, that all the eigenvalues P D CeAD eAC ;D D N ; 0 0 00 of D lay in the range Œ0; 1. It is easily seen that the set Ä  Pe is convex. It is in fact the convex hull of the set P. 0 A vo .Nb N/N A fundamental remark is that all the local minimizers A D ;Avo 2 C I Avo 0 of (21)areonP [9]. This is the discretized version of a result by Lieb [18]. It is, therefore, possible to the suffix vo denotes the virtual-occupied off-diagonal solve the Hartree-Fock model by relaxing the non- block of the matrix A. Starting from some reference convex constraint D2 D D into the convex constraint matrix C, Bacskay QC algorithm performs one Newton 0 Ä D Ä 1. step on the unconstrained optimization problem The orthogonal projection of a given hermitian Pe ˚ matrix D on for the Frobenius scalar product can C HF A A  min E .Avo/ WD E .Ce D0e C /; be computed by diagonalizing D [8]. 
The cost of one « iteration of the usual projected gradient algorithm [4] A C.Nb N/N ; vo 2 is therefore the same at the cost of one iteration of the Roothaan algorithm. and updates the referenceÄ matrix C by replacing C A more efficient algorithm, the Optimal Damping e eA e 0 Avo e Algorithm (ODA), is the following [7] with Ce ,whereA D e , Avo denoting the Avo 0 result of the Newton step. Newton methods being very ˚ « e 0 0 P expensive in terms of computational costs, various at- DkC1 2 argmin ˚Tr .F.Dk/D /; D 2 « e HF e e e tempts have been made to build quasi-Newton versions DkC1 D argmin E .D/; D 2 SegŒDk;DkC1 ; of Bacskay QC algorithm (see, for e.g., [10, 13]). A natural alternative to Bacskay QC is to use ˚ e e Newton-like algorithms for constrained optimization where SegŒDk;DkC1 D .1  /Dk C DkC1;2 e in order to directly tackle problems (10)or(11) Œ0; 1g denotes the line segment linking Dk and DkC1. (see, e.g., [26]). Trust-region methods for solving As EHF is a second degree polynomial in the density the constrained optimization problem (10)havealso matrix, the last step consists in minimizing a quadratic been developed by Helgaker and co-workers [27], function of  on Œ0; 1, which can be done analytically. e and independently by Mart´ınez and co-workers [14]. The procedure is initialized with D0 D D0, D0 2 P Recently, gradient flow methods for solving (10)[1] being the initial guess. The ODA thus generates two and (11)[17] have been introduced and analyzed from sequences of matrices: a mathematical viewpoint. 
• The main sequence of density matrices .Dk/k2N 2 For molecular systems of moderate size, and when PN which is proven to numerically converge to an the Hartree-Fock model is discretized in atomic orbital Aufbau solution to the Hartree-Fock-Roothaan-Hall basis sets, direct minimization methods are usually equations [9] e less efficient than the methods based on constraint • A secondary sequence .Dk/k1 of matrices belong- e relaxation or optimal mixing presented in the next two ing to P sections. The Hartree-Fock energy is a Lyapunov functional of ODA: it decays at each iteration. This follows from the fact that for all D0 2 P and all  2 Œ0; 1, Lieb Variational Principle and Constraint

Relaxation HF e 0 HF e e E ..1  /Dk C D / D E .Dk/ C Tr .F.Dk/ 2   We now consider the variational problem 0 e  0 e 0 e .D  Dk// C Tr G.D  Dk/.D  Dk/ : ˚ « 2 min EHF.D/; D 2 Pe ; (21) (22) Self-Consistent Field (SCF) Algorithms 1315
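Expansion (22) reduces ODA's line search to minimizing a scalar quadratic on $[0,1]$. A sketch of this closed-form step (hypothetical helper name, not from the source):

```python
def oda_lambda(slope, curvature):
    """Optimal mixing parameter for ODA's analytic line search on [0, 1].

    Along the segment (1-l)*Dk + l*D', expansion (22) gives the quadratic
    E(l) = E(Dk) + l*slope + (l**2/2)*curvature with
    slope = Tr(F(Dk)(D' - Dk)) and curvature = Tr(G(D' - Dk)(D' - Dk)).
    """
    if curvature > 0.0:
        l = -slope / curvature                 # unconstrained minimizer
        return min(max(l, 0.0), 1.0)           # clipped to the segment
    # concave or flat quadratic: the minimum sits at an endpoint;
    # compare E(1) - E(0) = slope + curvature/2 against zero
    return 0.0 if slope + 0.5 * curvature > 0.0 else 1.0
```

Computing the two traces once per iteration is what makes the damping "optimal" at essentially no extra cost.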

The "steepest descent" direction, that is, the density matrix $D'$ for which the slope $s_{\widetilde{D}_k \to D'} = \mathrm{Tr}\big(F(\widetilde{D}_k)(D' - \widetilde{D}_k)\big)$ is minimal, is precisely $D_{k+1}$. In some sense, ODA is a combination of diagonalization and direct minimization. The practical implementation of ODA is detailed in [7], where numerical tests are also reported. The cost of one ODA iteration is approximately the same as for the Roothaan algorithm. Numerical tests show that ODA is particularly efficient in the early steps of the iterative procedure.

Convergence Acceleration

SCF convergence can be accelerated by performing, at each step of the iterative procedure, a mixing of the previous iterates:

$$D_{k+1} \in \operatorname{argmin}\left\{ \mathrm{Tr}(\widetilde{F}_k D'),\ D' \in \mathcal{P} \right\}, \qquad \widetilde{F}_k = \sum_{j=0}^{k} c_{j,k} F(D_j), \qquad \sum_{j=0}^{k} c_{j,k} = 1, \qquad (23)$$

where the mixing coefficients $c_{j,k}$ are optimized according to some criterion. Note that in the Hartree-Fock setting, the mean-field Hamiltonian $F(D)$ is affine in $D$, so that mixing the $F(D_j)$'s amounts to mixing the $D_j$'s:

$$\widetilde{F}_k = F(\widetilde{D}_k) \quad \text{where} \quad \widetilde{D}_k = \sum_{j=0}^{k} c_{j,k} D_j.$$

This is no longer true for Kohn-Sham models.

In Pulay's DIIS algorithm [22], the mixing coefficients are obtained by solving

$$\min\left\{ \Big\| \sum_{j=1}^{k} c_j\, [F(D_j), D_j] \Big\|_F^2, \quad \sum_{j=1}^{k} c_j = 1 \right\}.$$

The commutator $[F(D), D]$ is in fact the gradient of the functional $A \mapsto E^{HF}(e^A D e^{-A})$ defined on the vector space of the $N_b \times N_b$ antihermitian matrices (note that $e^A D e^{-A} \in \mathcal{P}$ for all $D \in \mathcal{P}$ and $A$ antihermitian); it vanishes when $D$ is a critical point of $E^{HF}$ on $\mathcal{P}$.

In the EDIIS algorithm [16], the mixing coefficients are chosen to minimize the Hartree-Fock energy of $\widetilde{D}_k$:

$$\min\left\{ E^{HF}\Big( \sum_{j=1}^{k} c_j D_j \Big),\quad c_j \ge 0,\quad \sum_{j=1}^{k} c_j = 1 \right\}$$

(note that, as the $c_j$'s are chosen non-negative, $\widetilde{D}_k$ is the element of $\widetilde{\mathcal{P}}$ which minimizes the Hartree-Fock energy on the convex hull of $\{D_0, D_1, \dots, D_k\}$).

The DIIS algorithm does not always converge. On the other hand, when it converges, it is extremely fast. This nice feature of the DIIS algorithm has not yet been fully explained by rigorous mathematical arguments (see however [23] for a numerical analysis of DIIS-type algorithms in an unconstrained setting).

SCF Algorithms for Kohn-Sham Models

After discretization in a finite basis set, the Kohn-Sham energy functional reads

$$E^{KS}(D) = \mathrm{Tr}(hD) + \frac12\,\mathrm{Tr}(J(D)D) + E_{\mathrm{xc}}(D),$$

where $[J(D)]_{\mu\nu} = \sum_{\kappa\lambda} (\mu\nu|\kappa\lambda)\, D_{\lambda\kappa}$ is the Coulomb operator and $E_{\mathrm{xc}}$ is the exchange-correlation energy functional [11].
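Returning to Pulay's DIIS problem above: the constrained least-squares problem is commonly solved through a small linear system obtained from the Lagrangian. A sketch (hypothetical function name; real-valued residual matrices assumed, not the source's implementation):

```python
import numpy as np

def diis_coefficients(residuals):
    """Solve min || sum_j c_j R_j ||_F subject to sum_j c_j = 1.

    residuals: list of commutator matrices R_j = [F(D_j), D_j].
    Builds the Gram matrix of Frobenius inner products, bordered by the
    constraint row/column (Lagrange multiplier formulation).
    """
    k = len(residuals)
    B = np.empty((k + 1, k + 1))
    for i, Ri in enumerate(residuals):
        for j, Rj in enumerate(residuals):
            B[i, j] = np.sum(Ri * Rj)          # <R_i, R_j>_F
    B[k, :k] = B[:k, k] = 1.0                  # constraint sum_j c_j = 1
    B[k, k] = 0.0
    rhs = np.zeros(k + 1)
    rhs[k] = 1.0
    sol = np.linalg.solve(B, rhs)
    return sol[:k]                             # mixing coefficients c_j

# two residuals of opposite sign: the optimal mixture cancels them exactly
R1 = np.array([[0.0, 1.0], [-1.0, 0.0]])
c = diis_coefficients([R1, -R1])
```

In practice the oldest residuals are discarded to keep the bordered system small and well conditioned.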
In the standard Kohn-Sham model [15], $E^{KS}$ is minimized on $\mathcal{P}$, while in the extended Kohn-Sham model [11], $E^{KS}$ is minimized on the convex set $\widetilde{\mathcal{P}}$. The algorithms presented in the previous sections can be transposed mutatis mutandis to the Kohn-Sham setting, but as $E_{\mathrm{xc}}(D)$ is not a second-order polynomial in $D$, the mathematical analysis is more complicated. In particular, no rigorous result on the Roothaan algorithm for Kohn-Sham has been published so far. Note that the equality $F(\sum_i c_i D_i) = \sum_i c_i F(D_i)$ whenever $\sum_i c_i = 1$ is true for Hartree-Fock, with $F(D) = \nabla E^{HF}(D) = h + G(D)$, but not for Kohn-Sham, with $F(D) = \nabla E^{KS}(D) = h + J(D) + \nabla E_{\mathrm{xc}}(D)$. Consequently, in contrast with the situation encountered in the Hartree-Fock framework, mixing density matrices and mixing Kohn-Sham Hamiltonians are not equivalent procedures. This leads to a variety of acceleration schemes for Kohn-Sham that boil down to either DIIS or EDIIS in the Hartree-Fock setting. For the sake of brevity, and also because the situation is evolving fast (several new algorithms are proposed every year, and identifying the best algorithms is a matter of debate), we will not present these schemes here.

Let us finally mention that, if iterative methods based on repeated diagonalization of the mean-field Hamiltonian, combined with mixing procedures, are more efficient than direct minimization methods for moderate-size molecular systems, and when the Kohn-Sham problem is discretized in atomic orbital basis sets, the situation may be different for very large systems, or when finite element or planewave discretization methods are used (see, e.g., [19, 21] and references therein).

Cross-References

A Priori and a Posteriori Error Analysis in Chemistry
Density Functional Theory
Hartree–Fock Type Methods
Linear Scaling Methods
Numerical Analysis of Eigenproblems for Electronic Structure Calculations

References

1. Alouges, F., Audouze, C.: Preconditioned gradient flows and applications to the Hartree-Fock functional. Numer. Methods PDE 25, 380–400 (2009)
2. Bach, V., Lieb, E.H., Loss, M., Solovej, J.P.: There are no unfilled shells in unrestricted Hartree-Fock theory. Phys. Rev. Lett. 72(19), 2981–2983 (1994)
3. Bacskay, G.B.: A quadratically convergent Hartree-Fock (QC-SCF) method. Application to closed shell systems. Chem. Phys. 61, 385–404 (1981)
4. Bonnans, J.F., Gilbert, J.C., Lemaréchal, C., Sagastizábal, C.: Numerical Optimization. Theoretical and Practical Aspects. Springer, Berlin/New York (2006)
5. Boys, S.F.: Electronic wavefunctions. I. A general method of calculation for the stationary states of any molecular system. Proc. R. Soc. A 200, 542–554 (1950)
6. Cancès, E., Le Bris, C.: On the convergence of SCF algorithms for the Hartree-Fock equations. M2AN Math. Model. Numer. Anal. 34, 749–774 (2000)
7. Cancès, E., Le Bris, C.: Can we outperform the DIIS approach for electronic structure calculations? Int. J. Quantum Chem. 79, 82–90 (2000)
8. Cancès, E., Pernal, K.: Projected gradient algorithms for Hartree-Fock and density-matrix functional theory. J. Chem. Phys. 128, 134108 (2008)
9. Cancès, E., Defranceschi, M., Kutzelnigg, W., Le Bris, C., Maday, Y.: Computational quantum chemistry: a primer. In: Handbook of Numerical Analysis, vol. X, pp. 3–270. North-Holland, Amsterdam (2003)
10. Chaban, G., Schmidt, M.W., Gordon, M.S.: Approximate second order method for orbital optimization of SCF and MCSCF wavefunctions. Theor. Chem. Acc. 97, 88–95 (1997)
11. Dreizler, R.M., Gross, E.K.U.: Density Functional Theory. Springer, Berlin/New York (1990)
12. Edelman, A., Arias, T.A., Smith, S.T.: The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20, 303–353 (1998)
13. Fischer, T.H., Almlöf, J.: General methods for geometry and wave function optimization. J. Phys. Chem. 96, 9768–9774 (1992)
14. Francisco, J., Martínez, J.M., Martínez, L.: Globally convergent trust-region methods for self-consistent field electronic structure calculations. J. Chem. Phys. 121, 10863–10878 (2004)
15. Kohn, W., Sham, L.J.: Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133 (1965)
16. Kudin, K., Scuseria, G.E., Cancès, E.: A black-box self-consistent field convergence algorithm: one step closer. J. Chem. Phys. 116, 8255–8261 (2002)
17. Levitt, A.: Convergence of gradient-based algorithms for the Hartree-Fock equations. Preprint (2011)
18. Lieb, E.H.: Variational principle for many-fermion systems. Phys. Rev. Lett. 46, 457–459 (1981)
19. Marks, L.D., Luke, D.R.: Robust mixing for ab initio quantum mechanical calculations. Phys. Rev. B 78, 075114 (2008)
20. McWeeny, R.: The density matrix in self-consistent field theory. I. Iterative construction of the density matrix. Proc. R. Soc. Lond. A 235, 496–509 (1956)
21. Mostofi, A.A., Haynes, P.D., Skylaris, C.K., Payne, M.C.: Preconditioned iterative minimization for linear-scaling electronic structure calculations. J. Chem. Phys. 119, 8842–8848 (2003)
22. Pulay, P.: Improved SCF convergence acceleration. J. Comput. Chem. 3, 556–560 (1982)
23. Rohwedder, T., Schneider, R.: An analysis for the DIIS acceleration method used in quantum chemistry calculations. J. Math. Chem. 49, 1889–1914 (2011)
24. Roothaan, C.C.J.: New developments in molecular orbital theory. Rev. Mod. Phys. 23, 69–89 (1951)
25. Saunders, V.R., Hillier, I.H.: A "level-shifting" method for converging closed shell Hartree-Fock wavefunctions. Int. J. Quantum Chem. 7, 699–705 (1973)
26. Shepard, R.: Elimination of the diagonalization bottleneck in parallel direct-SCF methods. Theor. Chim. Acta 84, 343–351 (1993)
27. Thøgersen, L., Olsen, J., Yeager, D., Jørgensen, P., Sałek, P., Helgaker, T.: The trust-region self-consistent field method: towards a black-box optimization in Hartree-Fock and Kohn-Sham theories. J. Chem. Phys. 121, 16–27 (2004)

Semiconductor Device Problems

Ansgar Jüngel
Institut für Analysis und Scientific Computing, Technische Universität Wien, Wien, Austria

Mathematics Subject Classification

82D37; 35Q20; 35Q40; 35Q79; 76Y05

Definition

Highly integrated electric circuits in computer processors mainly consist of semiconductor transistors which amplify and switch electronic signals. Roughly speaking, a semiconductor is a crystalline solid whose conductivity is intermediate between an insulator and a conductor. The modeling and simulation of semiconductor transistors and other devices is of paramount importance in the microelectronics industry to reduce the development cost and time. A semiconductor device problem is defined by the process of deriving physically accurate but computationally feasible model equations and of constructing efficient numerical algorithms for the solution of these equations. Depending on the device structure, size, and operating conditions, the main transport phenomena may be very different, caused by diffusion, drift, scattering, or quantum effects. This leads to a variety of model equations designed for a particular situation or a particular device. Furthermore, often not all available physical information is necessary, and simpler models are needed, helping to reduce the computational cost in the numerical simulation. One may distinguish four model classes: microscopic/mesoscopic and macroscopic semiclassical models, and microscopic/mesoscopic and macroscopic quantum models (see Fig. 1).

Description

In the following, we detail only some models from the four model classes since the field of semiconductor device problems became extremely large in recent years. For instance, we ignore compact models, hybrid model approaches, lattice heat equations, transport in subbands and magnetic fields, spintronics, and models for carbon nanotube, graphene, and polymer thin-film materials. For technological aspects, we refer to [9].

Microscopic Semiclassical Models

We are interested in the evolution of charge carriers moving in an electric field. Their motion can be modeled by Newton's law. However, in view of the huge number of electrons involved, the solution of the Newton equations is computationally too expensive. Moreover, we are not interested in the trajectory of each single particle. Hence, a statistical approach seems to be sufficient, introducing the distribution function (or "probability density") $f(x,v,t)$ of an electron ensemble, depending on the position $x \in \mathbb{R}^3$, velocity $v = \dot{x} = dx/dt \in \mathbb{R}^3$, and time $t > 0$. By Liouville's theorem, the value of $f(x(t), v(t), t)$ does not change along trajectories in the absence of collisions, and hence,

$$0 = \frac{df}{dt} = \partial_t f + \dot{x}\cdot\nabla_x f + \dot{v}\cdot\nabla_v f \quad \text{along trajectories}, \qquad (1)$$

where $\partial_t f = \partial f/\partial t$, and $\nabla_x f$, $\nabla_v f$ are gradients with respect to $x$, $v$, respectively.

Since electrons are quantum particles (and position and velocity cannot be determined with arbitrary accuracy), we need to incorporate some quantum mechanics. As the solution of the many-particle Schrödinger equation in the whole space is out of reach, we need an approximate approach. First, by Bloch's theorem, it is sufficient to solve the Schrödinger equation in a semiconductor lattice cell. Furthermore, the many-particle interactions are described by an effective Coulomb force. Finally, the properties of the semiconductor crystal are incorporated by the semiclassical Newton equations.

More precisely, let $p = \hbar k$ denote the crystal momentum, where $\hbar$ is the reduced Planck constant and $k$ is the wave vector. For electrons with low energy, the velocity is proportional to the wave vector, $\dot{x} = \hbar k/m$, where $m$ is the electron mass at rest. In the general case, we have to take into account the energy band structure of the semiconductor crystal (see [4, 7, 8] for details). Newton's second law is formulated as $\dot{p} = q\nabla_x V$, where $q$ is the elementary charge and $V(x,t)$ is the electric potential. Then, using $\dot{v} = \dot{p}/m$ and $\nabla_k = (m/\hbar)\nabla_v$, (1) becomes the (mesoscopic) Boltzmann transport equation:

Semiconductor Device Problems, Fig. 1 Hierarchy of some semiconductor models mentioned in the text. [The diagram arranges the models in four boxes: the microscopic/mesoscopic semiclassical models (semiclassical Newton's equations and the Liouville equation, leading via a statistical approach to the Boltzmann transport equation) and the microscopic/mesoscopic quantum models (the Lindblad and Liouville-von Neumann equations, leading to the Schrödinger equation in the collisionless case and the Wigner-Boltzmann equation in the collisional case). Moment methods and Chapman-Enskog expansions yield the macroscopic semiclassical models (hydrodynamic and energy-transport equations and, at constant temperature, the drift-diffusion equations) and the macroscopic quantum models (quantum hydrodynamic and quantum energy-transport equations and, at constant temperature, the quantum drift-diffusion equations).]

$$\partial_t f + \frac{\hbar}{m}\, k\cdot\nabla_x f + \frac{q}{\hbar}\,\nabla_x V\cdot\nabla_k f = Q(f), \qquad (x,k) \in \mathbb{R}^3\times\mathbb{R}^3,\ t > 0, \qquad (2)$$

where $Q(f)$ models collisions of electrons with phonons, impurities, or other particles. The moments of $f$ are interpreted as the particle density $n(x,t)$, current density $J(x,t)$, and energy density $(ne)(x,t)$:

$$n = \int_{\mathbb{R}^3} f\,dk, \qquad J = \frac{\hbar}{m}\int_{\mathbb{R}^3} k\,f\,dk, \qquad ne = \frac{\hbar^2}{2m}\int_{\mathbb{R}^3} |k|^2 f\,dk. \qquad (3)$$

In the self-consistent setting, the electric potential $V$ is computed from the Poisson equation $\epsilon_s \Delta V = q(n - C(x))$, where $\epsilon_s$ is the semiconductor permittivity and $C(x)$ models charged background ions (doping profile). Since $n$ depends on the distribution function $f$, the Boltzmann-Poisson system is nonlinear.

The Boltzmann transport equation is defined over the six-dimensional phase space (plus time), whose high dimensionality makes its numerical solution a very challenging task. One approach is to employ the Monte Carlo method, which consists in simulating a stochastic process. Drawbacks of the method are the stochastic nature and the huge computational cost. An alternative is the use of deterministic solvers, for example, expanding the distribution function with spherical harmonics [6].

Macroscopic Semiclassical Models

When collisions become dominant in the semiconductor domain, that is, the mean free path (the length which a particle travels between two consecutive collision events) is much smaller than the device size, a fluid dynamical approach may be appropriate. Macroscopic models are derived from (2) by multiplying the equation by certain weight functions, that is, $1$, $k$, and $|k|^2/2$, and integrating over the wave-vector space. Setting all physical constants to one in the following, for notational simplicity, we obtain, using the definitions (3), the balance equations:

$$\partial_t n + \operatorname{div}_x J = \int_{\mathbb{R}^3} Q(f)\,dk, \qquad x \in \mathbb{R}^3,\ t > 0, \qquad (4)$$

$$\partial_t J + \operatorname{div}_x \int_{\mathbb{R}^3} k\otimes k\,f\,dk - n\nabla_x V = \int_{\mathbb{R}^3} k\,Q(f)\,dk, \qquad (5)$$

$$\partial_t(ne) + \operatorname{div}_x \int_{\mathbb{R}^3} \tfrac12\, k|k|^2 f\,dk - \nabla_x V\cdot J = \int_{\mathbb{R}^3} \tfrac12 |k|^2 Q(f)\,dk. \qquad (6)$$
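A sketch (not from the source) of computing the moments (3) by simple quadrature; for illustration, the 3D integrals are replaced by their one-dimensional analogues on a uniform wave-vector grid:

```python
import numpy as np

def moments(f, k, m=1.0, hbar=1.0):
    """Particle density n, current density J and energy density ne of f(k).

    f: samples of the distribution function on the uniform 1D grid k.
    Uses a plain Riemann sum as the quadrature rule.
    """
    dk = k[1] - k[0]
    n = f.sum() * dk                                   # zeroth moment
    J = ((hbar / m) * k * f).sum() * dk                # first moment
    ne = ((hbar**2 / (2.0 * m)) * k**2 * f).sum() * dk # second moment
    return n, J, ne

k = np.linspace(-10.0, 10.0, 4001)
f = np.exp(-k**2 / 2.0) / np.sqrt(2.0 * np.pi)   # Maxwellian-like equilibrium
n, J, ne = moments(f, k)
```

For the symmetric equilibrium above, the current vanishes and the energy density equals half the variance of the distribution, as expected from (3).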

The higher-order integrals cannot be expressed in terms of the moments (3), which is called the closure problem. It can be solved by approximating $f$ by the equilibrium distribution $f_0$, which can be justified by a scaling argument and asymptotic analysis. The equilibrium $f_0$ can be determined by maximizing the Boltzmann entropy under the constraints of given moments $n$, $nu$, and $ne$ [4]. Inserting $f_0$ in (4)–(6) gives explicit expressions for the higher-order moments, yielding the so-called hydrodynamic model. Formally, there is some similarity with the Euler equations of fluid dynamics, and there has been an extensive discussion in the literature whether electron shock waves in semiconductors are realistic or not [10].

Diffusion models, which do not exhibit shock solutions, can be derived by a Chapman-Enskog expansion around the equilibrium distribution $f_0$ according to $f = f_0 + \alpha f_1$, where $\alpha > 0$ is the Knudsen number (the ratio of the mean free path and the device length), which is assumed to be small compared to one. The function $f_1$ turns out to be the solution of a certain operator equation involving the collision operator $Q(f)$. Depending on the number of given moments, this leads to the drift-diffusion equations (particle density given):

$$\partial_t n + \operatorname{div}_x J = 0, \qquad J = -\nabla_x n + n\nabla_x V, \qquad x \in \mathbb{R}^3,\ t > 0, \qquad (7)$$

or the energy-transport equations (particle and energy densities given):

$$\partial_t n + \operatorname{div}_x J = 0, \qquad J = -\nabla_x n + \frac{n}{T}\nabla_x V, \qquad x \in \mathbb{R}^3,\ t > 0, \qquad (8)$$

$$\partial_t(ne) + \operatorname{div}_x S + nu\cdot\nabla_x V = 0, \qquad S = -\frac{3}{2}\big(\nabla_x(nT) - n\nabla_x V\big), \qquad (9)$$

where $ne = \frac{3}{2}nT$, $T$ being the electron temperature, and $S$ is the heat flux. For the derivation of these models, we have assumed that the equilibrium distribution is given by Maxwell-Boltzmann statistics and that the elastic scattering rate is proportional to the wave vector. More general models can be derived too, see [4, Chap. 6].

The drift-diffusion model gives a good description of the transport in semiconductor devices close to equilibrium, but it is not accurate enough for submicron devices due to, for example, temperature effects, which can be modeled by the energy-transport equations.

In the presence of high electric fields, the stationary equations corresponding to (7)–(9) are convection dominant. This can be handled by the Scharfetter–Gummel discretization technique. The key idea is to approximate the current density along each edge in a mesh by a constant, yielding an exponential approximation of the electric potential. This technique is related to mixed finite-element and finite-volume methods [2]. Another idea to eliminate the convective terms is to employ (dual) entropy variables. For instance, for the energy-transport equations, the dual entropy variables are $w = (w_1, w_2) = ((\mu - V)/T,\ -1/T)$, where $\mu$ is the chemical potential, given by $n = T^{3/2}\exp(\mu/T)$. Then (8) and (9) can be formulated as the system

$$\partial_t b(w) - \operatorname{div}\big(D(w,V)\nabla w\big) = 0,$$

where $b(w) = (n, \frac{3}{2}nT)^\top$ and $D(w,V) \in \mathbb{R}^{2\times 2}$ is a symmetric positive definite diffusion matrix [4] such that standard finite-element techniques are applicable.

Microscopic Quantum Models

The semiclassical approach is reasonable if the carriers can be treated as particles. The validity of this description is measured by the de Broglie wavelength $\lambda_B$ corresponding to a thermal average carrier. When the electric potential varies rapidly on the scale of $\lambda_B$ or when the mean free path is much larger than $\lambda_B$, quantum mechanical models are more appropriate. A general description is possible by the Liouville–von Neumann equation

$$i\varepsilon\,\partial_t \widehat{\rho} = [\widehat{H}, \widehat{\rho}] := \widehat{H}\widehat{\rho} - \widehat{\rho}\widehat{H}, \qquad t > 0,$$

for the density matrix operator $\widehat{\rho}$, where $i^2 = -1$, $\varepsilon > 0$ is the scaled Planck constant, and $\widehat{H}$ is the quantum mechanical Hamiltonian. The operator $\widehat{\rho}$ is assumed to possess a complete orthonormal set of eigenfunctions $(\psi_j)$ and eigenvalues $(\lambda_j)$. The sequence of Schrödinger equations $i\varepsilon\,\partial_t \psi_j = \widehat{H}\psi_j$ ($j \in \mathbb{N}$), together with the numbers $\lambda_j \ge 0$, is called a mixed-state Schrödinger system with the particle density $n(x,t) = \sum_{j=1}^{\infty} \lambda_j |\psi_j(x,t)|^2$. In particular, $\lambda_j$ can be interpreted as the occupation probability of the state $\psi_j$.
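The Scharfetter–Gummel idea described above (constant current and linear potential on each edge) leads to a flux written with the Bernoulli function $B(x) = x/(e^x - 1)$. A sketch for the scaled drift-diffusion model (7) on one 1D mesh edge (not from the source; thermal voltage and mobility set to one as in the text):

```python
import numpy as np

def bernoulli(x):
    """B(x) = x / (exp(x) - 1), with the removable singularity at 0 handled."""
    x = np.asarray(x, dtype=float)
    small = np.abs(x) < 1e-10
    # first-order expansion near 0; expm1 keeps accuracy for small |x|
    return np.where(small, 1.0 - x / 2.0,
                    x / np.expm1(np.where(small, 1.0, x)))

def sg_flux(n_left, n_right, v_left, v_right, h):
    """Scharfetter-Gummel current density J = -n' + n V' on an edge of length h.

    Exact for constant J and linear V on the edge; reduces to the centered
    finite difference -(n_right - n_left)/h when the potential is flat.
    """
    dv = v_right - v_left
    return (bernoulli(-dv) * n_left - bernoulli(dv) * n_right) / h
```

The scheme is exponentially fitted: on the discrete equilibrium $n \propto e^{V}$ the drift and diffusion contributions cancel and the flux vanishes exactly, which is the property that suppresses spurious oscillations in the convection-dominant regime.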

The Schrödinger equation describes the evolution of a quantum state in an active region of a semiconductor device. It is used when inelastic scattering is sufficiently weak such that phase coherence can be assumed and effects such as resonant tunneling and quantum conductance can be observed. Typically, the device is connected to an exterior medium through access zones, which allows for the injection of charge carriers. Instead of solving the Schrödinger equation in the whole domain (self-consistently coupled to the Poisson equation), one wishes to solve the problem only in the active region and to prescribe transparent boundary conditions at the interfaces between the active and access zones. Such a situation is referred to as an open quantum system. The determination of transparent boundary conditions is a delicate issue since ad hoc approaches often lead to spurious oscillations which deteriorate the numerical solution [1].

Nonreversible interactions of the charge carriers with the environment can be modeled by the Lindblad equation

$$i\varepsilon\,\partial_t \widehat{\rho} = [\widehat{H}, \widehat{\rho}] + i\sum_k \Big( L_k \widehat{\rho} L_k^* - \frac12\big(L_k^* L_k \widehat{\rho} + \widehat{\rho}\, L_k^* L_k\big) \Big),$$

where $L_k$ are the so-called Lindblad operators and $L_k^*$ is the adjoint of $L_k$. In the Fourier picture, this equation can be formulated as a quantum kinetic equation, the (mesoscopic) Wigner–Boltzmann equation:

$$\partial_t w + p\cdot\nabla_x w + \theta[V]w = Q(w), \qquad (x,p) \in \mathbb{R}^3\times\mathbb{R}^3,\ t > 0, \qquad (10)$$

where $p$ is the crystal momentum, $\theta[V]w$ is the potential operator which is a nonlocal version of the drift term $\nabla_x V\cdot\nabla_p w$ [4, Chap. 11], and $Q(w)$ is the collision operator. The Wigner function $w = W[\widehat{\rho}]$, where $W$ denotes the Wigner transform, is essentially the Fourier transform of the density matrix. A nice feature of the Wigner equation is that it is a phase-space description, similar to the semiclassical Boltzmann equation. Its drawbacks are that the Wigner function cannot be interpreted as a probability density, as the Boltzmann distribution function can, and that the Wigner equation has to be solved in the high-dimensional phase space. A remedy is to derive macroscopic models, which are discussed in the following section.

Macroscopic Quantum Models

Macroscopic models can be derived from the Wigner–Boltzmann equation (10) in a similar manner as from the Boltzmann equation (2). The main difference from the semiclassical approach is the definition of the equilibrium. Maximizing the von Neumann entropy under the constraints of given moments of a Wigner function $w$, the formal solution (if it exists) is given by the so-called quantum Maxwellian $M[w]$, which is a nonlocal version of the semiclassical equilibrium. It was first suggested by Degond and Ringhofer and is related to the (unconstrained) quantum equilibrium given by Wigner in 1932 [3, 5]. We wish to derive moment equations from the Wigner–Boltzmann equation (10) for the particle density $n$, current density $J$, and energy density $ne$, defined by:

$$n = \int_{\mathbb{R}^3} M[w]\,dp, \qquad J = \int_{\mathbb{R}^3} p\,M[w]\,dp, \qquad ne = \int_{\mathbb{R}^3} \tfrac12 |p|^2 M[w]\,dp.$$

Such a program was carried out by Degond et al. [3], using the simple relaxation-type operator $Q(w) = M[w] - w$. This leads to a hierarchy of quantum hydrodynamic and diffusion models which are, in contrast to their semiclassical counterparts, nonlocal.

When we employ only one moment (the particle density) and expand the resulting moment model in powers of $\varepsilon$ up to order $O(\varepsilon^4)$ (to obtain local equations), we arrive at the quantum drift-diffusion (or density-gradient) equations:

$$\partial_t n + \operatorname{div}_x J = 0, \qquad J = -\nabla_x n + n\nabla_x V + \frac{\varepsilon^2}{6}\, n\nabla_x\Big(\frac{\Delta_x\sqrt{n}}{\sqrt{n}}\Big), \qquad x \in \mathbb{R}^3,\ t > 0.$$

This model is employed to simulate the carrier inversion layer near the oxide of a MOSFET (metal-oxide-semiconductor field-effect transistor). The main difficulty of the numerical discretization is the treatment of the highly nonlinear fourth-order quantum correction. However, there exist efficient exponentially fitted finite-element approximations; see the references of Pinnau in [4, Chap. 12].

Formally, the moment equations for the charge carriers and energy density give the quantum energy-transport model. Since its mathematical structure is less clear, we do not discuss this model [4, Chap. 13.2].

Employing all three moments $n$, $nu$, $ne$, the moment equations, expanded up to terms of order $O(\varepsilon^4)$, become the quantum hydrodynamic equations:

$$\partial_t n + \operatorname{div} J = 0, \qquad \partial_t J + \operatorname{div}_x\Big(\frac{J\otimes J}{n} + P\Big) + n\nabla_x V = \int_{\mathbb{R}^3} p\,Q(M[w])\,dp,$$

$$\partial_t(ne) - \operatorname{div}_x\big((P + ne\,I)u - q\big) + \nabla_x V\cdot J = \int_{\mathbb{R}^3} \tfrac12 |p|^2 Q(M[w])\,dp, \qquad x \in \mathbb{R}^3,\ t > 0,$$

where $I$ is the identity matrix in $\mathbb{R}^{3\times 3}$, and the quantum stress tensor $P$ and the energy density $ne$ are given by

$$P = nT\,I - \frac{\varepsilon^2}{12}\, n\,\nabla_x\otimes\nabla_x \log n, \qquad ne = \frac{3}{2}nT + \frac12 n|u|^2 - \frac{\varepsilon^2}{24}\, n\,\Delta_x \log n,$$

$u = J/n$ is the mean velocity, and $q = -(\varepsilon^2/24)\, n(\Delta_x u + 2\nabla_x \operatorname{div}_x u)$ is the quantum heat flux. When applying a Chapman-Enskog expansion around the quantum equilibrium, viscous effects are added, leading to quantum Navier-Stokes equations [5, Chap. 5]. These models are very interesting from a theoretical viewpoint since they exhibit a surprising nonlinear structure. Simulations of resonant tunneling diodes using these models give qualitatively reasonable results. However, as expected, quantum phenomena are easily destroyed by the occurring diffusive or viscous effects.

References

3. …: Quantum Transport: Modelling, Analysis and Asymptotics. Lecture Notes in Mathematics 1946, pp. 111–168. Springer, Berlin (2009)
4. Jüngel, A.: Transport Equations for Semiconductors. Springer, Berlin (2009)
5. Jüngel, A.: Dissipative quantum fluid models. Revista Mat. Univ. Parma, to appear (2011)
6. Hong, S.-M., Pham, A.-T., Jungemann, C.: Deterministic Solvers for the Boltzmann Transport Equation. Springer, New York (2011)
7. Lundstrom, M.: Fundamentals of Carrier Transport. Cambridge University Press, Cambridge (2000)
8. Markowich, P., Ringhofer, C., Schmeiser, C.: Semiconductor Equations. Springer, Vienna (1990)
9. Mishra, U., Singh, J.: Semiconductor Device Physics and Design. Springer, Dordrecht (2007)
10. Rudan, M., Gnudi, A., Quade, W.: A generalized approach to the hydrodynamic model of semiconductor equations. In: Baccarani, G. (ed.) Process and Device Modeling for Microelectronics, pp. 109–154. Elsevier, Amsterdam (1993)

Shearlets

Gitta Kutyniok
Institut für Mathematik, Technische Universität Berlin, Berlin, Germany

Mathematics Subject Classification

42C40; 42C15; 65T60

Synonyms

Shearlets; Shearlet system

References Short Description 1. Antoine, X., Arnold, A., Besse, C., Ehrhardt, M., Schadle,¨ 2 2 A.: A review of transparent and artificial boundary con- Shearlets are multiscale systems in L .R / which ditions techniques for linear and nonlinear Schrodinger¨ efficiently encode anisotropic features. They extend equations. Commun. Comput. Phys. 4, 729–796 (2008) the framework of wavelets and are constructed by 2. Brezzi, F., Marini, L., Micheletti, S., Pietra, P., Sacco, R., Wang, S.: Discretization of semiconductor device problems. parabolic scaling, shearing, and translation applied In: Schilders, W., ter Maten, W. (eds.) Handbook of Numer- to one or very few generating functions. The main ical Analysis. Numerical Methods in Electromagnetics, vol. application of shearlets is imaging science, for 13, pp. 317–441. North-Holland, Amsterdam (2005) example, denoising, edge detection, or inpainting. 3. Degond, P., Gallego, S., Mehats,´ F., Ringhofer, C.: Quan- 2 Rn tum hydrodynamic and diffusion models derived from the Extensions of shearlet systems to L . /, n  3 are entropy principle. In: Ben Abdallah, N., Frosali, G. (eds.) also available. 1322 Shearlets

Description

Multivariate Problems
Multivariate problem classes are typically governed by anisotropic features such as singularities concentrated on lower dimensional embedded manifolds. Examples are edges in images or shock fronts of transport-dominated equations. Since, due to their isotropic nature, wavelets are deficient in efficiently encoding such functions, several directional representation systems were proposed, among which are ridgelets, contourlets, and curvelets.

Shearlets were introduced in 2006 [10] and are to date the only directional representation system which provides optimally sparse approximations of anisotropic features while providing a unified treatment of the continuum and digital realm in the sense of allowing faithful implementations. One important structural property is their membership in the class of affine systems, similar to wavelets. A comprehensive presentation of the theory and applications of shearlets can be found in [16].

Continuous Shearlet Systems
Continuous shearlet systems are generated by application of parabolic scaling $A_a$, $\tilde A_a$, $a>0$, shearing $S_s$, $s\in\mathbb{R}$, and translation, where

$$ A_a = \begin{pmatrix} a & 0 \\ 0 & a^{1/2} \end{pmatrix}, \qquad \tilde A_a = \begin{pmatrix} a^{1/2} & 0 \\ 0 & a \end{pmatrix}, \qquad\text{and}\qquad S_s = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix}. $$

For $\psi\in L^2(\mathbb{R}^2)$, the associated continuous shearlet system is defined by

$$ \big\{\psi_{a,s,t} = a^{-3/4}\,\psi\big(A_a^{-1}S_s^{-1}(\cdot - t)\big) : a>0,\ s\in\mathbb{R},\ t\in\mathbb{R}^2\big\}, $$

with $a$ determining the scale, $s$ the direction, and $t$ the position of a shearlet $\psi_{a,s,t}$. The associated continuous shearlet transform of some function $f\in L^2(\mathbb{R}^2)$ is the mapping

$$ L^2(\mathbb{R}^2)\ni f \;\mapsto\; \langle f, \psi_{a,s,t}\rangle, \qquad a>0,\ s\in\mathbb{R},\ t\in\mathbb{R}^2. $$

The continuous shearlet transform is an isometry, provided that $\psi$ satisfies some weak regularity conditions. One common class of generating functions are classical shearlets, which are band-limited functions $\psi\in L^2(\mathbb{R}^2)$ defined by

$$ \hat\psi(\xi) = \hat\psi(\xi_1,\xi_2) = \hat\psi_1(\xi_1)\,\hat\psi_2\Big(\frac{\xi_2}{\xi_1}\Big), $$

where $\psi_1\in L^2(\mathbb{R})$ is a discrete wavelet, i.e., it satisfies $\sum_{j\in\mathbb{Z}}|\hat\psi_1(2^{-j}\xi)|^2=1$ for a.e. $\xi\in\mathbb{R}$, with $\hat\psi_1\in C^\infty(\mathbb{R})$ and $\operatorname{supp}\hat\psi_1\subseteq[-\tfrac12,-\tfrac1{16}]\cup[\tfrac1{16},\tfrac12]$, and $\psi_2\in L^2(\mathbb{R})$ is a "bump function" in the sense that $\sum_{k=-1}^{1}|\hat\psi_2(\xi+k)|^2=1$ for a.e. $\xi\in[-1,1]$, with $\hat\psi_2\in C^\infty(\mathbb{R})$ and $\operatorname{supp}\hat\psi_2\subseteq[-1,1]$. Figure 1 illustrates classical shearlets and the tiling of the Fourier domain they provide, which ensures their directional sensitivity. From a mathematical standpoint, continuous shearlets are generated by a unitary representation of a particular semidirect product, the shearlet group [2].

However, since those systems and their associated transforms do not provide a uniform resolution of all directions but are biased towards one axis, cone-adapted continuous shearlet systems were introduced for applications. For $\phi,\psi,\tilde\psi\in L^2(\mathbb{R}^2)$, the cone-adapted continuous shearlet system $SH_{\mathrm{cont}}(\phi,\psi,\tilde\psi)$ is defined by

$$ SH_{\mathrm{cont}}(\phi,\psi,\tilde\psi) = \Phi_{\mathrm{cont}}(\phi)\cup\Psi_{\mathrm{cont}}(\psi)\cup\tilde\Psi_{\mathrm{cont}}(\tilde\psi), $$

where

$$ \Phi_{\mathrm{cont}}(\phi) = \big\{\phi_t = \phi(\cdot - t) : t\in\mathbb{R}^2\big\}, $$
$$ \Psi_{\mathrm{cont}}(\psi) = \big\{\psi_{a,s,t} = a^{-3/4}\,\psi\big(A_a^{-1}S_s^{-1}(\cdot - t)\big) : a\in(0,1],\ |s|\le 1+a^{1/2},\ t\in\mathbb{R}^2\big\}, $$
$$ \tilde\Psi_{\mathrm{cont}}(\tilde\psi) = \big\{\tilde\psi_{a,s,t} = a^{-3/4}\,\tilde\psi\big(\tilde A_a^{-1}(S_s^T)^{-1}(\cdot - t)\big) : a\in(0,1],\ |s|\le 1+a^{1/2},\ t\in\mathbb{R}^2\big\}. $$

The associated transform is defined in a similar manner as before. The induced uniform resolution of all directions by a cone-like partition of the Fourier domain is illustrated in Fig. 2. The high directional selectivity of cone-adapted continuous shearlet systems is reflected in the result that they precisely resolve wavefront sets of distributions $f$ by the decay behavior of $|\langle f,\psi_{a,s,t}\rangle|$ and $|\langle f,\tilde\psi_{a,s,t}\rangle|$ as $a\to 0$ [15].

[Shearlets, Fig. 1 — Classical shearlets: (a) $|\psi_{a,s,t}|$ for exemplary values of $a$, $s$, and $t$. (b) Support of $\hat\psi$. (c) Approximate support of $\hat\psi_{a,s,t}$ for different values of $a$ and $s$.]

Discrete Shearlet Systems
Discretization of the parameters $a$, $s$, and $t$ by $a = 2^{-j}$, $j\in\mathbb{Z}$, $s = -k2^{-j/2}$, $k\in\mathbb{Z}$, and $t = A_{2^j}^{-1}S_k^{-1}m$, $m\in\mathbb{Z}^2$, leads to the associated discrete systems. For $\psi\in L^2(\mathbb{R}^2)$, the discrete shearlet system is defined by

$$ \big\{\psi_{j,k,m} = 2^{3j/4}\,\psi(S_k A_{2^j}\cdot{} - m) : j,k\in\mathbb{Z},\ m\in\mathbb{Z}^2\big\}, $$

with $j$ determining the scale, $k$ the direction, and $m$ the position of a shearlet $\psi_{j,k,m}$. The associated discrete shearlet transform of some $f\in L^2(\mathbb{R}^2)$ is the mapping

$$ L^2(\mathbb{R}^2)\ni f \;\mapsto\; \langle f,\psi_{j,k,m}\rangle, \qquad j,k\in\mathbb{Z},\ m\in\mathbb{Z}^2. $$

To allow more flexibility in the density of the positioning of shearlets, the discretization of the translation parameter $t$ is sometimes performed by $t = A_{2^j}^{-1}S_k^{-1}\operatorname{diag}(c_1,c_2)\,m$, $m\in\mathbb{Z}^2$, $c_1,c_2>0$. A very general discretization approach is by coorbit theory, which is however only applicable to the non-cone-adapted setting [4].

Similarly, for $\phi,\psi,\tilde\psi\in L^2(\mathbb{R}^2)$, the cone-adapted discrete shearlet system $SH_{\mathrm{disc}}(\phi,\psi,\tilde\psi)$ is defined by

$$ SH_{\mathrm{disc}}(\phi,\psi,\tilde\psi) = \Phi_{\mathrm{disc}}(\phi)\cup\Psi_{\mathrm{disc}}(\psi)\cup\tilde\Psi_{\mathrm{disc}}(\tilde\psi), $$

where

$$ \Phi_{\mathrm{disc}}(\phi) = \big\{\phi_m = \phi(\cdot - m) : m\in\mathbb{Z}^2\big\}, $$
$$ \Psi_{\mathrm{disc}}(\psi) = \big\{\psi_{j,k,m} = 2^{3j/4}\,\psi(S_k A_{2^j}\cdot{} - m) : j\ge 0,\ |k|\le\lceil 2^{j/2}\rceil,\ m\in\mathbb{Z}^2\big\}, $$
$$ \tilde\Psi_{\mathrm{disc}}(\tilde\psi) = \big\{\tilde\psi_{j,k,m} = 2^{3j/4}\,\tilde\psi(S_k^T \tilde A_{2^j}\cdot{} - m) : j\ge 0,\ |k|\le\lceil 2^{j/2}\rceil,\ m\in\mathbb{Z}^2\big\}. $$

For classical (band-limited) shearlets as generating functions, both the discrete shearlet system and the cone-adapted discrete shearlet system – the latter with a minor adaption at the intersections of the cones and a suitable $\phi$ – form tight frames for $L^2(\mathbb{R}^2)$. A theory for compactly supported (cone-adapted) discrete shearlet systems is also available [14]. For a special class of separable generating functions, compactly supported cone-adapted discrete shearlet systems form a frame with the ratio of frame bounds being approximately 4.

Discrete shearlet systems provide optimally sparse approximations of anisotropic features. A customarily employed model are cartoon-like functions, i.e., compactly supported functions in $L^2(\mathbb{R}^2)$ which are $C^2$ apart from a closed piecewise $C^2$ discontinuity curve. Up to a log-factor, discrete shearlet systems based on classical shearlets, or on compactly supported shearlets satisfying some weak regularity conditions, provide the optimal decay rate of the best $N$-term approximation of cartoon-like functions $f$ [7, 17], i.e.,

$$ \|f - f_N\|_2^2 \le C\,N^{-2}(\log N)^3 \quad\text{as } N\to\infty, $$

where $f_N$ denotes the $N$-term shearlet approximation using the $N$ largest coefficients.
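The parabolic scaling and shear matrices above are easy to realize numerically. The following sketch samples a discrete shearlet $\psi_{j,k,m}(x) = 2^{3j/4}\psi(S_kA_{2^j}x - m)$ on a grid and computes one shearlet coefficient $\langle f,\psi_{j,k,m}\rangle$ by a Riemann sum. It uses a toy separable generator, not a true band-limited classical shearlet, and all function and variable names are illustrative:

```python
import numpy as np

def psi(x1, x2):
    # Toy separable generator (NOT a true classical shearlet):
    # a Mexican-hat wavelet in x1 times a Gaussian bump in x2.
    return (1 - x1**2) * np.exp(-x1**2 / 2) * np.exp(-x2**2 / 2)

def shearlet(j, k, m, grid):
    # psi_{j,k,m}(x) = 2^{3j/4} psi(S_k A_{2^j} x - m)
    A = np.array([[2.0**j, 0.0], [0.0, 2.0**(j / 2)]])   # parabolic scaling
    S = np.array([[1.0, k], [0.0, 1.0]])                  # shearing
    y = (S @ A @ grid.reshape(2, -1)) - np.asarray(m, float).reshape(2, 1)
    return 2.0**(3 * j / 4) * psi(y[0], y[1]).reshape(grid.shape[1:])

# Sample one shearlet and take a coefficient <f, psi_{j,k,m}> of a test image.
t = np.linspace(-8, 8, 257)
dx = t[1] - t[0]
grid = np.stack(np.meshgrid(t, t, indexing="ij"))
f = np.exp(-(grid[0]**2 + grid[1]**2))           # test image
w = shearlet(j=1, k=1, m=(0, 0), grid=grid)
coeff = np.sum(f * w) * dx**2                    # Riemann-sum inner product
print(coeff)
```

The normalization $2^{3j/4} = |\det(S_kA_{2^j})|^{1/2}$ makes all $\psi_{j,k,m}$ share the $L^2$-norm of $\psi$, which can be checked numerically across $(j,k)$. In practice one would of course use a Fourier-based implementation such as those provided at www.ShearLab.org rather than this direct sampling.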

[Shearlets, Fig. 2 — Cone-adapted shearlet system: (a) Partitioning into cones. (b) Approximate support of $\hat\psi_{a,s,t}$ and $\hat{\tilde\psi}_{a,s,t}$ for different values of $a$ and $s$. (c) $|\hat\psi_{a,s,t}|$ for some shearlet $\psi$ and exemplary values of $a$, $s$, and $t$.]

Fast Algorithms
Implementations of the shearlet transform can be grouped into two categories, namely Fourier-based implementations and implementations in the spatial domain. Fourier-based implementations aim to produce the same frequency tiling as in Fig. 2b, typically by employing variants of the pseudo-polar transform [5, 21]. Spatial-domain approaches utilize filters associated with the transform, which are implemented by convolution in the spatial domain. A fast implementation with separable shearlets was introduced in [22], subdivision schemes are the basis of the algorithmic approach in [19], and a general filter approach was studied in [11]. Several of the associated algorithms are provided at www.ShearLab.org.

Extensions to Higher Dimensions
Continuous shearlet systems in higher dimensions have been introduced in [3]. In many situations, these systems inherit the property to resolve wavefront sets. The theory of discrete shearlet systems and their sparse approximation properties was introduced and studied in [20] in dimension 3, with the possibility to extend the results to higher dimensions; similar sparse approximation properties were derived.

Applications
Shearlets are nowadays used for a variety of applications which require the representation and processing of multivariate data, such as imaging science. Prominent examples are deconvolution [23], denoising [6], edge detection [9], inpainting [13], segmentation [12], and separation [18]. Other application areas are sparse decompositions of operators such as the Radon operator [1] or Fourier integral operators [8].

References
1. Colonna, F., Easley, G., Guo, K., Labate, D.: Radon transform inversion using the shearlet representation. Appl. Comput. Harmon. Anal. 29, 232–250 (2010)
2. Dahlke, S., Kutyniok, G., Maass, P., Sagiv, C., Stark, H.-G., Teschke, G.: The uncertainty principle associated with the continuous shearlet transform. Int. J. Wavelets Multiresolut. Inf. Process. 6, 157–181 (2008)
3. Dahlke, S., Steidl, G., Teschke, G.: The continuous shearlet transform in arbitrary space dimensions. J. Fourier Anal. Appl. 16, 340–364 (2010)
4. Dahlke, S., Steidl, G., Teschke, G.: Shearlet coorbit spaces: compactly supported analyzing shearlets, traces and embeddings. J. Fourier Anal. Appl. 17, 1232–1255 (2011)
5. Easley, G., Labate, D., Lim, W-Q.: Sparse directional image representations using the discrete shearlet transform. Appl. Comput. Harmon. Anal. 25, 25–46 (2008)
6. Easley, G., Labate, D., Colonna, F.: Shearlet-based total variation for denoising. IEEE Trans. Image Process. 18, 260–268 (2009)
7. Guo, K., Labate, D.: Optimally sparse multidimensional representation using shearlets. SIAM J. Math. Anal. 39, 298–318 (2007)
8. Guo, K., Labate, D.: Representation of Fourier integral operators using shearlets. J. Fourier Anal. Appl. 14, 327–371 (2008)
9. Guo, K., Labate, D.: Characterization and analysis of edges using the continuous shearlet transform. SIAM J. Imaging Sci. 2, 959–986 (2009)
10. Guo, K., Kutyniok, G., Labate, D.: Sparse multidimensional representations using anisotropic dilation and shear operators. In: Wavelets and Splines (Athens, 2005), pp. 189–201. Nashboro Press, Nashville (2006)
11. Han, B., Kutyniok, G., Shen, Z.: Adaptive multiresolution analysis structures and shearlet systems. SIAM J. Numer. Anal. 49, 1921–1946 (2011)
12. Häuser, S., Steidl, G.: Convex multiclass segmentation with shearlet regularization. Int. J. Comput. Math. 90, 62–81 (2013)
13. King, E.J., Kutyniok, G., Zhuang, X.: Analysis of inpainting via clustered sparsity and microlocal analysis. J. Math. Imaging Vis. (to appear)
14. Kittipoom, P., Kutyniok, G., Lim, W-Q.: Construction of compactly supported shearlet frames. Constr. Approx. 35, 21–72 (2012)
15. Kutyniok, G., Labate, D.: Resolution of the wavefront set using continuous shearlets. Trans. Am. Math. Soc. 361, 2719–2754 (2009)
16. Kutyniok, G., Labate, D. (eds.): Shearlets: Multiscale Analysis for Multivariate Data. Birkhäuser, Boston (2012)
17. Kutyniok, G., Lim, W-Q.: Compactly supported shearlets are optimally sparse. J. Approx. Theory 163, 1564–1589 (2011)
18. Kutyniok, G., Lim, W-Q.: Image separation using wavelets and shearlets. In: Curves and Surfaces (Avignon, 2010). Lecture Notes in Computer Science, vol. 6920. Springer, Berlin/Heidelberg (2012)
19. Kutyniok, G., Sauer, T.: Adaptive directional subdivision schemes and shearlet multiresolution analysis. SIAM J. Math. Anal. 41, 1436–1471 (2009)
20. Kutyniok, G., Lemvig, J., Lim, W-Q.: Optimally sparse approximations of 3D functions by compactly supported shearlet frames. SIAM J. Math. Anal. 44, 2962–3017 (2012)
21. Kutyniok, G., Shahram, M., Zhuang, X.: ShearLab: a rational design of a digital parabolic scaling algorithm. SIAM J. Imaging Sci. 5, 1291–1332 (2012)
22. Lim, W-Q.: The discrete shearlet transform: a new directional transform and compactly supported shearlet frames. IEEE Trans. Image Process. 19, 1166–1180 (2010)
23. Patel, V.M., Easley, G., Healy, D.M.: Shearlet-based deconvolution. IEEE Trans. Image Process. 18, 2673–2685 (2009)

Shift-Invariant Approximation

Robert Schaback
Institut für Numerische und Angewandte Mathematik (NAM), Georg-August-Universität Göttingen, Göttingen, Germany

Mathematics Subject Classification

41A25; 42C15; 42C25; 42C40

Synonyms

Approximation by integer translates

Short Definition

Shift-invariant approximation deals with functions $f$ on the whole real line, e.g., time series and signals.

It approximates $f$ by shifted copies of a single generator $\varphi$, i.e.,

$$ f(x) \approx S_{f,h,\varphi}(x) := \sum_{k\in\mathbb{Z}} c_{k,h}(f)\,\varphi\Big(\frac{x}{h}-k\Big), \qquad x\in\mathbb{R}. \qquad (1) $$

The functions $\varphi(\tfrac{\cdot}{h}-k)$ for $k\in\mathbb{Z}$ span a space that is shift-invariant with respect to integer multiples of $h$. Extensions [1, 2] allow multiple generators and multivariate functions. Shift-invariant approximation uses only a single scale $h$, while wavelets use multiple scales and refinable generators.

Description

Nyquist–Shannon–Whittaker–Kotelnikov sampling provides the formula

$$ f(x) = \sum_{k\in\mathbb{Z}} f(kh)\,\mathrm{sinc}\Big(\frac{x}{h}-k\Big) $$

for band-limited functions with frequencies in $[-\pi/h, +\pi/h]$. It is basic in electrical engineering for AD/DA conversion of signals after low-pass filtering. Another simple example arises from the hat function or order-2 B-spline $B_2(x) := 1-|x|$ for $-1\le x\le 1$ and $0$ elsewhere. Then the corresponding "connect-the-dots" formula with coefficients $c_{k,h}(f) = f(kh)$ produces the piecewise linear interpolant to the data $(kh, f(kh))$.

Alternatively, the coefficients in (1) can be obtained as $c_{k,h}(f) = (f,\varphi(\tfrac{\cdot}{h}-k))_2$ for any $f\in L_2(\mathbb{R})$ to turn the approximation into an optimal $L_2$ projection. Surprisingly, these two approaches coincide for the sinc case.

Analysis of shift-invariant approximation focuses on the error in (1) for various generators $\varphi$ and for different ways of calculating useful coefficients $c_{k,h}(f)$. Under special technical conditions, e.g., if the generator $\varphi$ is compactly supported, the Strang–Fix conditions [4]

$$ \hat\varphi^{(j)}(2k\pi) = \delta_{0k}, \qquad k\in\mathbb{Z},\ 0\le j<m, $$

imply that the error of (1) is $O(h^m)$ for $h\to 0$ in the Sobolev space $W_2^m(\mathbb{R})$ if the coefficients are given via $L_2$ projection. This holds for B-spline generators of order $m$.

The basic tool for the analysis of shift-invariant approximation is the bracket product

$$ [\varphi,\psi](\omega) := \sum_{k\in\mathbb{Z}} \hat\varphi(\omega+2k\pi)\,\overline{\hat\psi(\omega+2k\pi)}, \qquad \omega\in\mathbb{R}, $$

which is a $2\pi$-periodic function. It should exist pointwise, be in $L_2[-\pi,\pi]$, and satisfy a stability property.
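As a small illustration of (1) with the order-2 B-spline generator, the following sketch (all names are illustrative) builds the "connect-the-dots" approximation with interpolatory coefficients $c_{k,h}(f) = f(kh)$ and checks the expected $O(h^m)$ error behavior, here with $m = 2$:

```python
import numpy as np

def hat(x):
    # Order-2 B-spline ("hat") generator: B2(x) = 1 - |x| on [-1, 1].
    return np.maximum(0.0, 1.0 - np.abs(x))

def shift_invariant_approx(f, h, x):
    # S_{f,h,phi}(x) = sum_k c_k phi(x/h - k) with interpolatory
    # coefficients c_k = f(kh) ("connect-the-dots").
    ks = np.arange(np.floor(x.min() / h) - 1, np.ceil(x.max() / h) + 2)
    return sum(f(k * h) * hat(x / h - k) for k in ks)

f = np.sin
x = np.linspace(0.0, 2 * np.pi, 1001)
errs = [np.max(np.abs(f(x) - shift_invariant_approx(f, h, x)))
        for h in (0.2, 0.1, 0.05)]
print(errs)   # halving h divides the error by about 4, i.e., O(h^2)
```

For the sinc generator the same driver would reproduce band-limited functions exactly; the hat function trades that exactness for compact support and the second-order rate predicted by the Strang–Fix theory.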

3. Jetter, K., Plonka, G.: A survey on $L_2$-approximation orders from shift-invariant spaces. In: Multivariate Approximation and Applications, pp. 73–111. Cambridge University Press, Cambridge (2001)
4. Strang, G., Fix, G.: A Fourier analysis of the finite element variational method. In: Geymonat, G. (ed.) Constructive Aspects of Functional Analysis. C.I.M.E. II Ciclo 1971, pp. 793–840 (1973)

Simulation of Stochastic Differential Equations

Denis Talay
INRIA Sophia Antipolis, Valbonne, France

Synonyms

Numerical mathematics; Stochastic analysis; Stochastic numerics

Definition

The development and the mathematical analysis of stochastic numerical methods to obtain approximate solutions of deterministic linear and nonlinear partial differential equations and to simulate stochastic models.

Overview

Owing to powerful computers, one now desires to model and simulate more and more complex physical, chemical, biological, and economic phenomena at various scales. In this context, stochastic models are intensively used because calibration errors cannot be avoided, physical laws are imperfectly known (as in turbulent fluid mechanics), or no physical law exists (as in finance). One then needs to compute moments or more complex statistics of the probability distributions of the stochastic processes involved in the models. A stochastic process is a collection $(X_t)$ of random variables indexed by the time variable $t$.

This is not the only motivation to develop stochastic simulations. As solutions of a wide family of complex deterministic partial differential equations (PDEs) can be represented as expectations of functionals of stochastic processes, stochastic numerical methods are derived from these representations.

We can distinguish several classes of stochastic numerical methods: Monte Carlo methods consist in simulating large numbers of independent paths of a given stochastic process; stochastic particle methods consist in simulating paths of interacting particles whose empirical distribution converges in law to a deterministic measure; and ergodic methods consist in simulating one single path of a given stochastic process up to a large time horizon. Monte Carlo methods allow one to approximate statistics of probability distributions of stochastic models or solutions to linear partial differential equations. Stochastic particle methods approximate solutions to deterministic nonlinear McKean–Vlasov PDEs. Ergodic methods aim to compute statistics of equilibrium measures of stochastic models or to solve elliptic PDEs. See, e.g., [2].

In all cases, one needs to develop numerical approximation methods for paths of stochastic processes. Most of the stochastic processes used as models or involved in stochastic representations of PDEs are obtained as solutions to stochastic differential equations

$$ X_t(x) = x + \int_0^t b(X_s(x))\,ds + \int_0^t \sigma(X_s(x))\,dW_s, \qquad (1) $$

where $(W_t)$ is a standard Brownian motion or, more generally, a Lévy process. Existence and uniqueness of solutions, in strong and weak senses, are exhaustively studied, e.g., in [10].

Monte Carlo Methods for Linear PDEs

Set $a(x) := \sigma(x)\sigma(x)^t$, and consider the parabolic PDE

$$ \frac{\partial u}{\partial t}(t,x) = \sum_{i=1}^d b_i(x)\,\partial_i u(t,x) + \frac{1}{2}\sum_{i,j=1}^d a_{ij}(x)\,\partial^2_{ij} u(t,x) \qquad (2) $$

with initial condition $u(0,x) = f(x)$. Under various hypotheses on the coefficients $b$ and $\sigma$, it holds that $u(t,x) = \mathbb{E} f(X_t(x))$, where $X_t(x)$ is the solution to (1).

Let $h>0$ be a time discretization step. Let $(G_p)$ be independent centered Gaussian vectors with unit covariance matrix. Define the Euler scheme by $\bar X_0^h(x) = x$ and the recursive relation

$$ \bar X^h_{(p+1)h}(x) = \bar X^h_{ph}(x) + b\big(\bar X^h_{ph}(x)\big)\,h + \sigma\big(\bar X^h_{ph}(x)\big)\,\sqrt h\,G_{p+1}. $$

The simulation of this random sequence only requires the simulation of independent Gaussian random variables. Given a time horizon $Mh$, independent copies of the sequence $(G_p,\ 1\le p\le M)$ provide independent paths $(\bar X^{h,k}_{ph}(x),\ 1\le p\le M)$. The global error of the Monte Carlo method with $N$ simulations which approximates $u(Mh,x)$ is

$$ \mathbb{E} f(X_{Mh}) - \frac1N\sum_{k=1}^N f\big(\bar X^{h,k}_{Mh}\big) = \underbrace{\mathbb{E} f(X_{Mh}) - \mathbb{E} f\big(\bar X^h_{Mh}\big)}_{=:\,\epsilon_d(h)} + \underbrace{\mathbb{E} f\big(\bar X^h_{Mh}\big) - \frac1N\sum_{k=1}^N f\big(\bar X^{h,k}_{Mh}\big)}_{=:\,\epsilon_s(h,N)}. $$

Nonasymptotic variants of the central limit theorem imply that the statistical error $\epsilon_s(h,N)$ satisfies

$$ \forall M\ge 1,\ \exists C(M)>0: \qquad \mathbb{E}\,|\epsilon_s(h,N)| \le \frac{C(M)}{\sqrt N} \quad\text{for all } N. $$

Such representations and simulations are used, for instance, to compute European option prices, moments of solutions to mechanical systems with random excitations, etc.

When the PDE (2) is posed in a domain $D$ with Dirichlet boundary conditions $u(t,x) = g(x)$ on $\partial D$, then $u(t,x) = \mathbb{E}\big[f(X_t(x))\,\mathbb{1}_{t<\tau}\big] + \mathbb{E}\big[g(X_\tau(x))\,\mathbb{1}_{\tau\le t}\big]$, where $\tau$ is the first hitting time of $\partial D$ by $(X_t(x))$. An approximation method is obtained by substituting $\bar X^h_{ph\wedge\bar\tau^h}(x)$ for $X_t(x)$, where $\bar\tau^h$ is the first hitting time of $\partial D$ by the interpolated Euler scheme. For a convergence rate analysis, see, e.g., [7].

Let $n(x)$ denote the unit inward normal vector at point $x$ on $\partial D$. When one adds Neumann boundary conditions $\nabla u(t,x)\cdot n(x) = 0$ on $\partial D$ to (2), then $u(t,x) = \mathbb{E} f(X_t^\sharp(x))$, where $X^\sharp$ is the solution to an SDE with reflection

$$ X_t^\sharp(x) = x + \int_0^t b(X_s^\sharp(x))\,ds + \int_0^t \sigma(X_s^\sharp(x))\,dW_s + \int_0^t n(X_s^\sharp)\,dL_s(X^\sharp), $$

where $(L_t(X^\sharp))$ is the local time of $X^\sharp$ at the boundary. One then constructs the reflected Euler scheme in such a way that the simulation of the local time, which would be complex and numerically unstable, is avoided. This construction and the corresponding error analysis have been developed in [4].
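A minimal sketch of the Euler scheme and its Monte Carlo average follows. The Ornstein–Uhlenbeck test case and all names are illustrative assumptions; for it, $\mathbb{E}[X_T^2]$ is known in closed form, which makes the combined discretization-plus-statistical error visible:

```python
import numpy as np

def euler_mc(b, sigma, f, x0, T, h, N, rng):
    # Euler scheme X_{(p+1)h} = X_{ph} + b(X_{ph}) h + sigma(X_{ph}) sqrt(h) G_{p+1},
    # run over N independent paths; returns the Monte Carlo estimate of E f(X_T).
    M = int(round(T / h))
    X = np.full(N, x0, dtype=float)
    for _ in range(M):
        G = rng.standard_normal(N)          # independent N(0,1) increments
        X += b(X) * h + sigma(X) * np.sqrt(h) * G
    return f(X).mean()

# Ornstein-Uhlenbeck test case dX = -X dt + dW, for which
# E[X_T^2] = e^{-2T} x0^2 + (1 - e^{-2T}) / 2 is known in closed form.
rng = np.random.default_rng(0)
T, x0 = 1.0, 1.0
est = euler_mc(b=lambda x: -x, sigma=lambda x: np.ones_like(x),
               f=lambda x: x**2, x0=x0, T=T, h=0.01, N=200_000, rng=rng)
exact = np.exp(-2 * T) * x0**2 + (1 - np.exp(-2 * T)) / 2
print(est, exact)   # agree up to the O(h) bias plus the O(1/sqrt(N)) noise
```

Refining $h$ shrinks the discretization error $\epsilon_d(h)$, while increasing $N$ shrinks the statistical error $\epsilon_s(h,N)$, exactly as in the decomposition above.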

$$ \begin{cases} dX_t = B\big(t, X_t, \int b(X_t,y)\,\mu_t(dy)\big)\,dt + S\big(t, X_t, \int \sigma(X_t,y)\,\mu_t(dy)\big)\,dW_t, \\[2pt] \mu_t(dy) := \text{probability distribution of } X_t. \end{cases} \qquad (4) $$

In addition, the flow of the probability distributions $\mu_t$ solves the nonlinear McKean–Vlasov–Fokker–Planck equation

$$ \frac{d}{dt}\mu_t = L^*_{t,\mu_t}\mu_t, \qquad (5) $$

where, $A$ denoting the matrix $S\,S^t$, $L^*_{t,\mu}$ is the formal adjoint of the differential operator

$$ L_{t,\mu} := \sum_k B_k\Big(t,x,\int b(x,y)\,\mu(dy)\Big)\,\partial_k + \frac12\sum_{j,k} A_{jk}\Big(t,x,\int \sigma(x,y)\,\mu(dy)\Big)\,\partial^2_{jk}. \qquad (6) $$

From an analytical point of view, the SDEs (4) provide probabilistic interpretations for macroscopic equations, which include, e.g., smoothened versions of the Navier–Stokes and Boltzmann equations: see, e.g., the survey [16] and [13]. From a numerical point of view, whereas the time discretization of $(X_t)$ does not lead to an algorithm since $\mu_t$ is unknown, the Euler scheme for the particle system $\{(X_t^{(i)}),\ i=1,\dots,N\}$ can be simulated: the solution $\mu_t$ to (5) is approximated by the empirical distribution of the simulated particles at time $t$, the number $N$ of particles being chosen large enough. Compared to the numerical resolution of the McKean–Vlasov–Fokker–Planck equation by deterministic methods, this stochastic numerical approach is numerically relevant in the case of small viscosities. It is also intensively used, for example, in Lagrangian stochastic simulations of complex flows and in molecular dynamics: see, e.g., [9, 15]. When the functions $B$, $S$, $b$, $\sigma$ are smooth, optimal convergence rates have been obtained for finite time horizons, e.g., in [1, 3].

Other stochastic representations have been developed for backward SDEs related to quasi-linear PDEs and variational inequalities. See the survey [6].

References
1. Antonelli, F., Kohatsu-Higa, A.: Rate of convergence of a particle method to the solution of the McKean–Vlasov equation. Ann. Appl. Probab. 12, 423–476 (2002)
2. Asmussen, S., Glynn, P.W.: Stochastic Simulation: Algorithms and Analysis. Stochastic Modelling and Applied Probability Series, vol. 57. Springer, New York (2007)
3. Bossy, M.: Optimal rate of convergence of a stochastic particle method to solutions of 1D viscous scalar conservation laws. Math. Comput. 73(246), 777–812 (2004)
4. Bossy, M., Gobet, E., Talay, D.: A symmetrized Euler scheme for an efficient approximation of reflected diffusions. J. Appl. Probab. 41(3), 877–889 (2004)
5. Bossy, M., Champagnat, N., Maire, S., Talay, D.: Probabilistic interpretation and random walk on spheres algorithms for the Poisson–Boltzmann equation in molecular dynamics. ESAIM: M2AN 44(5), 997–1048 (2010)
6. Bouchard, B., Elie, R., Touzi, N.: Discrete-time approximation of BSDEs and probabilistic schemes for fully nonlinear PDEs. In: Advanced Financial Modelling. Radon Series on Computational and Applied Mathematics, vol. 8, pp. 91–124. Walter de Gruyter, Berlin (2008)
7. Gobet, E., Menozzi, S.: Stopped diffusion processes: boundary corrections and overshoot. Stochastic Process. Appl. 120(2), 130–162 (2010)
8. Graham, C., Talay, D.: Mathematical Foundations of Stochastic Simulations. Vol. I: Stochastic Simulations and Monte Carlo Methods. Stochastic Modelling and Applied Probability Series, vol. 68. Springer, Berlin/New York (2013, in press)
9. Jourdain, B., Lelièvre, T., Roux, R.: Existence, uniqueness and convergence of a particle approximation for the Adaptive Biasing Force process. M2AN 44, 831–865 (2010)
10. Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus. Graduate Texts in Mathematics, vol. 113. Springer, New York (1991)
11. Lamberton, D., Pagès, G.: Recursive computation of the invariant distribution of a diffusion. Bernoulli 8(23), 367–405 (2002)
12. Mattingly, J., Stuart, A., Tretyakov, M.V.: Convergence of numerical time-averaging and stationary measures via Poisson equations. SIAM J. Numer. Anal. 48(2), 552–577 (2010)
13. Méléard, S.: A trajectorial proof of the vortex method for the two-dimensional Navier–Stokes equation. Ann. Appl. Probab. 10(4), 1197–1211 (2000)
14. Milstein, G.N., Tretyakov, M.V.: Stochastic Numerics for Mathematical Physics. Springer, Berlin/New York (2004)
15. Pope, S.B.: Turbulent Flows. Cambridge University Press, Cambridge/New York (2003)
16. Sznitman, A.-S.: Topics in propagation of chaos. In: École d'Été de Probabilités de Saint-Flour XIX–1989. Lecture Notes in Mathematics, vol. 1464. Springer, Berlin/New York (1991)
17. Talay, D.: Probabilistic numerical methods for partial differential equations: elements of analysis. In: Talay, D., Tubaro, L. (eds.) Probabilistic Models for Nonlinear Partial Differential Equations. Lecture Notes in Mathematics, vol. 1627, pp. 48–196. Springer, Berlin/New York (1996)

Singular Perturbation Problems

Robert O'Malley
Department of Applied Mathematics, University of Washington, Seattle, WA, USA

Regular perturbation methods often succeed in providing approximate solutions to problems involving a small parameter $\varepsilon$ by simply seeking solutions as a formal power series (or even a polynomial) in $\varepsilon$. When the regular perturbation approach fails to provide a uniformly valid approximation, one encounters a singular perturbation problem (using the nomenclature of Friedrichs and Wasow [4], now universal). The prototype singular perturbation problem occurred as Prandtl's boundary layer theory of 1904, concerning the flow of a fluid of small viscosity past an object (cf. [15, 20]). Applications have continued to motivate the subject, which holds independent mathematical interest involving differential equations. Prandtl's Göttingen lectures from 1931 to 1932 considered the model

$$ \varepsilon y'' + y' + y = 0 $$

on $0\le x\le 1$ with prescribed endvalues $y(0)$ and $y(1)$ for a small positive parameter $\varepsilon$ (corresponding physically to a large Reynolds number). Linearly independent solutions of the differential equation are given by

$$ e^{-\lambda(\varepsilon)x} \qquad\text{and}\qquad e^{-\kappa(\varepsilon)x/\varepsilon}, $$

where $\lambda(\varepsilon) \equiv \dfrac{1-\sqrt{1-4\varepsilon}}{2\varepsilon} = 1 + O(\varepsilon)$ and $\kappa(\varepsilon) \equiv \dfrac{1+\sqrt{1-4\varepsilon}}{2} = 1 - \varepsilon + O(\varepsilon^2)$ as $\varepsilon\to 0$. Setting

$$ y(x,\varepsilon) = \alpha\,e^{-\lambda(\varepsilon)x} + \beta\,e^{-\kappa(\varepsilon)x/\varepsilon}, $$

we will need $y(0) = \alpha + \beta$ and $y(1) = \alpha e^{-\lambda(\varepsilon)} + \beta e^{-\kappa(\varepsilon)/\varepsilon}$. The large decay constant $\kappa/\varepsilon$ implies that $\alpha \sim y(1)\,e^{\lambda(\varepsilon)}$, so

$$ y(x,\varepsilon) \sim e^{\lambda(\varepsilon)(1-x)}\,y(1) + e^{-\kappa(\varepsilon)x/\varepsilon}\big(y(0) - e^{\lambda(\varepsilon)}y(1)\big) $$

and

y.x;/ D e.1x/y.1/Cex=ex.y.0/ey.1//CO./: Singular Perturbation Problems The second term decays rapidly from y.0/  ey.1/ to Robert O’Malley zero in an O./-thick initial layer near x D 0,sothe Department of Applied Mathematics, limiting solution University of Washington, Seattle, WA, USA e1x y.1/ for x>0satisfies the reduced problem Regular perturbation methods often succeed in pro- 0 viding approximate solutions to problems involving Y0 C Y0 D 0 with Y0.1/ D y.1/: a small parameter  by simply seeking solutions as a formal power series (or even a polynomial) in . Convergence of y.x;/ at x D 0 is nonuniform unless When the regular perturbation approach fails to pro- y.0/ D ey.1/. Indeed, to all orders j ,theasymptotic vide a uniformly valid approximation, one encounters solution for x>0is given by the outer expansion Singular Perturbation Problems 1331 X ( ./.1x/ j Y.x;/ D e y.1/ Yj .x/ x.t;/ D X0.t/ C O./ and j 0 y.t;/ D Y0.t/ C 0.t=/ C O./

(see Olver [12] for the definition of an asymptotic (atleastfort finite) where 0. / is the asymptotically expansion). It is supplemented by an initial (boundary) stable solution of the stretched problem layer correction

Á d0 x D g.x.0/;0 CY0.0/; 0; 0/; 0.0/ D y.0/Y0.0/: ; Á e./x=.y.0/  e./y.1// d  X Á x The theory supports practical numerical methods for j j  integrating stiff differential equations (cf. [5]). A more j 0 inclusive geometric theory, using normally hyperbolic invariant manifolds, has more recently been exten- where terms j all decay exponentially to zero as the sively used (cf. [3, 9]). stretched inner variable x= tends to infinity. For many linear problems, classical analysis (cf. Traditionally, one learns to complement regular [2, 6, 23]) suffices. Multiscale methods (cf. [8, 10]), outer expansions by local inner expansions in regions however, apply more generally. Consider, for example, of nonuniform convergence. Asymptotic matching the two-point problem methods, generalizing Prandtl’s fluid dynamical insights involving inner and outer approximations, 00 0 then provide higher-order asymptotic solutions (cf. y C a.x/y C b.x;y/ D 0 Van Dyke [20], Lagerstrom [11], and I’lin [7], noting that O’Malley [13] and Vasil’eva et al. [21] provide on 0 Ä x Ä 1 when a.x/ > 0 and a and b are smooth. C more efficient direct techniques involving boundary We will seek the solution as  ! 0 when y.0/ and layer corrections). The Soviet A.N. Tikhonov and y.1/ are given in the form American Norman Levinson independently provided X j methods, in about 1950, to solve initial value problems y.x;;/ yj .x; / for the slow-fast vector system j 0

( using the fast variable xP D f.x;y;t;/ Z yP D g.x; y; t; / 1 x  D a.s/ds  0 on t  0 subject to initial values x.0/Ãand y.0/.Aswe X .t/ to provide boundary layer behavior near x D 0. might expect, the outer limit 0 should satisfy Y .t/ Because S 0 a.x/ the reduced (differential-algebraic) system y0 D y C y x   ( and XP0 D f.X0;Y0;t;0/; X0.0/ D x.0/ 0 0 D g.X0;Y0;t;0/ 2 a .x/ a2.x/ y00 D y C a.x/y C y C y ; xx  x   2  for an attracting root the given equation is converted to the partial differen- tial equation Y0 D .X0;t/  à  @2y @y @2y @y of g D 0, along which g remains a stable matrix. a2.x/ C C  2a.x/ C a0.x/ y @2 @ @x@ @ We must expect nonuniform convergence of the fast à variable y near t D 0, unless y.0/ D Y0.0/. Indeed, @y 2 C a.x/ C b.x;y/ C  yxx D 0: Tikhonov and Levinson showed that @x 1332 Singular Perturbation Problems

We naturally ask the leading term y0 to satisfy Related two-time scale methods have long been used in celestial mechanics (cf. [17,22]) to solve initial @2y @y value problems for nearly linear oscillators 0 C 0 D 0; @2 @ yR C y D f .y; y/P so y0 has the form on t  0. Regular perturbation methods suffice on y .x; / A .x/ B .x/e: 0 D 0 C 0 bounded t intervals, but for t D O.1=/ one must seek solutions X The boundary conditions moreover require that j y.t; ;/ yj .t; / j 0 A .0/ C B .0/ D y.0/ and A .1/ y.1/ 0 0 0 using the slow time

From the $\varepsilon$ coefficient in the partial differential equation for the two-point problem, we find that $y_1$ must satisfy

$$a^2(x)\left(\frac{\partial^2 y_1}{\partial \Xi^2} + \frac{\partial y_1}{\partial \Xi}\right) = -2a(x)\frac{\partial^2 y_0}{\partial x\,\partial \Xi} - a'(x)\frac{\partial y_0}{\partial \Xi} - a(x)\frac{\partial y_0}{\partial x} - b(x, y_0)$$
$$= -a(x)A_0' + \big(a(x)B_0' + a'(x)B_0\big)\,e^{-\Xi} - b\big(x,\, A_0 + B_0 e^{-\Xi}\big).$$

For the oscillators, we must expect a boundary layer (i.e., nonuniform convergence) at $t = \infty$ to account for the cumulative effect of the small perturbation $\varepsilon f$. Instead of using two-timing or averaging (cf. [1] or [19]), let us directly seek an asymptotic solution of the initial value problem for the Rayleigh equation

$$\ddot{y} + y = \varepsilon\left(\dot{y} - \tfrac{1}{3}\dot{y}^3\right).$$
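The slow-time analysis that follows predicts, at leading order, $dA/d\tau = \tfrac{1}{2}A(1 - |A|^2)$ for the complex amplitude in $y \approx A e^{it} + \text{c.c.}$, so the oscillation amplitude $2|A|$ tends to $2$. A sketch checking this against direct integration of the Rayleigh equation (step sizes, horizon, and initial data are our own illustrative choices):

```python
import numpy as np

eps = 0.1

def rhs(u):                       # Rayleigh equation as a first-order system
    y, v = u                      # v = dy/dt
    return np.array([v, -y + eps*(v - v**3/3.0)])

u = np.array([0.5, 0.0])          # small initial oscillation, amplitude 0.5
dt, T = 0.01, 200.0               # final slow time tau = eps*T = 20
amp, t = 0.0, 0.0
for _ in range(int(T/dt)):        # classical RK4
    k1 = rhs(u); k2 = rhs(u + 0.5*dt*k1)
    k3 = rhs(u + 0.5*dt*k2); k4 = rhs(u + dt*k3)
    u = u + dt*(k1 + 2*k2 + 2*k3 + k4)/6.0
    t += dt
    if t > T - 2*np.pi:           # record max |y| over the final period
        amp = max(amp, abs(u[0]))

# amplitude equation r' = r(1 - r^2)/2 has the logistic solution for r^2;
# starting from r0 = |A(0)| = 0.25, r(20) is essentially 1, so 2|A| -> 2
r0sq, tau = 0.25**2, eps*T
r_pred = np.sqrt(r0sq*np.exp(tau) / (1 + r0sq*(np.exp(tau) - 1)))
```

The measured limit-cycle amplitude agrees with the prediction $2|A| \to 2$ up to corrections in $\varepsilon$.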

We seek a solution in the form

$$y(t, \tau, \varepsilon) = A(\tau, \varepsilon)\,e^{it} + \varepsilon B(\tau, \varepsilon)\,e^{3it} + \varepsilon^2 C(\tau, \varepsilon)\,e^{5it} + \cdots + \text{complex conjugate}$$

for undetermined slowly varying complex-valued coefficients $A, B, C, \ldots$ depending on $\tau$ (cf. [16]). Differentiating twice and separating the coefficients of the odd harmonics $e^{it}, e^{3it}, e^{5it}, \ldots$ in the differential equation, we obtain

$$2i\,\frac{dA}{d\tau} - iA\big(1 - |A|^2\big) + \varepsilon\left(\frac{d^2A}{d\tau^2} - \frac{dA}{d\tau}\big(1 - 2|A|^2\big) - 3i\bar{A}^2 B\right) + \cdots = 0,$$
$$-8B - \frac{i}{3}A^3 + \cdots = 0, \quad \text{and} \quad -24C - 3iA^2 B + \cdots = 0.$$

The resulting initial value problem

$$\frac{dA}{d\tau} = \frac{A}{2}\big(1 - |A|^2\big) + \frac{i\varepsilon A}{8}\big(|A|^4 - 2\big) + \cdots$$

for the amplitude $A(\tau, \varepsilon)$ can be readily solved for finite $\tau$ by using polar coordinates and regular perturbation methods on all equations. Thus, we obtain the asymptotic solution

$$y(t, \tau, \varepsilon) = A(\tau, \varepsilon)\,e^{it} - \frac{i\varepsilon}{24}A^3(\tau, \varepsilon)\,e^{3it} + \varepsilon^2\left(\frac{A^3(\tau, \varepsilon)\big(3|A(\tau, \varepsilon)|^2 - 2\big)}{64}\,e^{3it} - \frac{A^5(\tau, \varepsilon)}{192}\,e^{5it}\right) + \cdots + \text{complex conjugate}$$

there. We note that the oscillations for related coupled van der Pol equations are of special current interest in neuroscience.

Returning to the two-point problem, consider the right-hand side of the equation for $y_1$ as a power series in $e^{-\Xi}$. Undetermined coefficient arguments show that its first two terms (multiplying $1$ and $e^{-\Xi}$) will resonate with the solutions of the homogeneous equation to produce unbounded or secular solutions (multiples of $\Xi$ and $\Xi e^{-\Xi}$) as $\Xi \to \infty$, unless we require that:
1. $A_0$ satisfies the reduced problem
$$a(x)A_0' + b(x, A_0) = 0, \qquad A_0(1) = y(1)$$
(and continues to exist throughout $0 \le x \le 1$).
2. $B_0$ satisfies the linear problem
$$a(x)B_0' + \big(b_y(x, A_0) + a'(x)\big)B_0 = 0, \qquad B_0(0) = y(0) - A_0(0).$$

Thus, we have completely obtained the limiting solution $y_0(x, \Xi)$. We note that the numerical solution of restricted two-point problems is reported in Roos et al. [18]. Special complications, possibly shock layers, must be expected at turning points where $a(x)$ vanishes (cf. [14, 24]).

The reader who consults the literature cited will find that singular perturbations continue to provide asymptotic solutions to a broad variety of differential equations from applied mathematics. The underlying mathematics is also extensive and increasingly sophisticated.

References

1. Bogoliubov, N.N., Mitropolski, Y.A.: Asymptotic Methods in the Theory of Nonlinear Oscillations. Gordon and Breach, New York (1961)
2. Fedoryuk, M.V.: Asymptotic Analysis. Springer, New York (1994)
3. Fenichel, N.: Geometric singular perturbation theory for ordinary differential equations. J. Differ. Equ. 31, 53-98 (1979)
4. Friedrichs, K.-O., Wasow, W.: Singular perturbations of nonlinear oscillations. Duke Math. J. 13, 367-381 (1946)
5. Hairer, E., Wanner, G.: Solving Ordinary Differential Equations II. Stiff and Differential-Algebraic Problems, 2nd revised edn. Springer, Berlin (1996)
6. Hsieh, P.-F., Sibuya, Y.: Basic Theory of Ordinary Differential Equations. Springer, New York (1999)
7. Il'in, A.M.: Matching of Asymptotic Expansions of Solutions of Boundary Value Problems. American Mathematical Society, Providence (1992)
8. Johnson, R.S.: Singular Perturbation Theory, Mathematical and Analytical Techniques with Applications to Engineering. Springer, New York (2005)
9. Kaper, T.J.: An introduction to geometric methods and dynamical systems theory for singular perturbation problems. In: Cronin, J., O'Malley, R.E. (eds.) Analyzing Multiscale Phenomena Using Singular Perturbation Methods, pp. 85-131. American Mathematical Society, Providence (1999)
10. Kevorkian, J., Cole, J.D.: Multiple Scale and Singular Perturbation Methods. Springer, New York (1996)
11. Lagerstrom, P.A.: Matched Asymptotic Expansions. Springer, New York (1988)
12. Olver, F.W.J.: Asymptotics and Special Functions. Academic, New York (1974)
13. O'Malley, R.E., Jr.: Singular Perturbation Methods for Ordinary Differential Equations. Springer, New York (1991)
14. O'Malley, R.E., Jr.: Singularly perturbed linear two-point boundary value problems. SIAM Rev. 50, 459-482 (2008)
15. O'Malley, R.E., Jr.: Singular perturbation theory: a viscous flow out of Göttingen. Annu. Rev. Fluid Mech. 42, 1-17 (2010)
16. O'Malley, R.E., Jr., Kirkinis, E.: A combined renormalization group-multiple scale method for singularly perturbed problems. Stud. Appl. Math. 124, 383-410 (2010)
17. Poincaré, H.: Les Méthodes Nouvelles de la Mécanique Céleste II. Gauthier-Villars, Paris (1893)
18. Roos, H.-G., Stynes, M., Tobiska, L.: Robust Numerical Methods for Singularly Perturbed Differential Equations, 2nd edn. Springer, Berlin (2008)
19. Sanders, J.A., Verhulst, F., Murdock, J.: Averaging Methods in Nonlinear Dynamical Systems, 2nd edn. Springer, New York (2007)
20. Van Dyke, M.: Perturbation Methods in Fluid Dynamics. Academic, New York (1964)
21. Vasil'eva, A.B., Butuzov, V.F., Kalachev, L.V.: The Boundary Function Method for Singular Perturbation Problems. SIAM, Philadelphia (1995)
22. Verhulst, F.: Nonlinear Differential Equations and Dynamical Systems, 2nd edn. Springer, New York (2000)
23. Wasow, W.: Asymptotic Expansions for Ordinary Differential Equations. Wiley, New York (1965)
24. Wasow, W.: Linear Turning Point Theory. Springer, New York (1985)

Solid State Physics, Berry Phases and Related Issues

Gianluca Panati
Dipartimento di Matematica, Università di Roma "La Sapienza", Rome, Italy

Synonyms

Dynamics of Bloch electrons; Theory of Bloch bands; Semiclassical model of solid-state physics

Definition/Abstract

Crystalline solids are solids in which the ionic cores of the atoms are arranged periodically. The dynamics of a test electron in a crystalline solid can be conveniently analyzed by using the Bloch-Floquet transform, while the localization properties of electrons are better described by using Wannier functions. The latter can also be obtained by minimizing a suitable localization functional, yielding a convenient numerical algorithm. Macroscopic transport properties of electrons in crystalline solids are derived, by using adiabatic theory, from the analysis of a perturbed Hamiltonian, which includes the effect of external macroscopic or slowly varying electromagnetic potentials. The geometric Berry phase and its curvature play a prominent role in the corresponding effective dynamics.

The Periodic Hamiltonian

In a crystalline solid, the ionic cores are arranged periodically, according to a periodicity lattice $\Gamma = \{\gamma \in \mathbb{R}^d : \gamma = \sum_{j=1}^d n_j \gamma_j \text{ for some } n_j \in \mathbb{Z}\}$, where $\{\gamma_1, \ldots, \gamma_d\}$ are fixed linearly independent vectors in $\mathbb{R}^d$. The dynamics of a test electron in the potential generated by the ionic cores of the solid and, in a mean-field approximation, by the remaining electrons is described by the Schrödinger equation $i\partial_t \psi = H_{per}\psi$, where the Hamiltonian operator reads (in Rydberg units)

$$H_{per} = -\Delta + V_\Gamma(x) \quad \text{acting in } L^2(\mathbb{R}^d). \quad (1)$$

Here, $\Delta = \nabla^2$ is the Laplace operator and the function $V_\Gamma : \mathbb{R}^d \to \mathbb{R}$ is periodic with respect to $\Gamma$, i.e., $V_\Gamma(x + \gamma) = V_\Gamma(x)$ for all $\gamma \in \Gamma$, $x \in \mathbb{R}^d$. A mathematical justification of such a model in the reduced Hartree-Fock approximation was obtained in Catto et al. [3] and Cancès et al. [4]; see Mathematical Theory for Quantum Crystals and references therein. To assure that $H_{per}$ is self-adjoint in $L^2(\mathbb{R}^d)$ on the Sobolev space $W^{2,2}(\mathbb{R}^d)$, we make a usual Kato-type assumption on the $\Gamma$-periodic potential:

$$V_\Gamma \in L^2_{loc}(\mathbb{R}^d) \text{ for } d \le 3; \qquad V_\Gamma \in L^p_{loc}(\mathbb{R}^d) \text{ with } p > d/2 \text{ for } d \ge 4. \quad (2)$$

Clearly, the case of a potential with Coulomb-like singularities is included.

The Bloch-Floquet Transform (Bloch Representation)

Since $H_{per}$ commutes with the lattice translations, it can be decomposed as a direct integral of simpler operators by the (modified) Bloch-Floquet transform. Preliminarily, we define the dual lattice as $\Gamma^* := \{k \in \mathbb{R}^d : k \cdot \gamma \in 2\pi\mathbb{Z} \text{ for all } \gamma \in \Gamma\}$. We denote by $Y$ (resp. $Y^*$) the centered fundamental domain of $\Gamma$ (resp. $\Gamma^*$), namely,

$$Y^* = \left\{k \in \mathbb{R}^d : k = \sum_{j=1}^d k'_j\,\gamma^*_j \text{ for } k'_j \in \left[-\tfrac{1}{2}, \tfrac{1}{2}\right]\right\},$$

where $\{\gamma^*_j\}$ is the dual basis to $\{\gamma_j\}$, i.e., $\gamma^*_j \cdot \gamma_i = 2\pi\delta_{j,i}$. When the opposite faces of $Y^*$ are identified, one obtains the torus $\mathbb{T}^d_* := \mathbb{R}^d / \Gamma^*$.

One defines, initially for $\psi \in C_0(\mathbb{R}^d)$, the modified Bloch-Floquet transform as

$$(\tilde{U}_{BF}\,\psi)(k, y) := \frac{1}{|Y^*|^{1/2}} \sum_{\gamma \in \Gamma} e^{-ik\cdot(y+\gamma)}\,\psi(y + \gamma), \qquad y \in \mathbb{R}^d,\; k \in \mathbb{R}^d. \quad (3)$$

For any fixed $k \in \mathbb{R}^d$, $(\tilde{U}_{BF}\,\psi)(k, \cdot)$ is a $\Gamma$-periodic function and can thus be regarded as an element of $\mathcal{H}_f := L^2(\mathbb{T}_Y)$, $\mathbb{T}_Y$ being the flat torus $\mathbb{R}^d / \Gamma$. The map defined by (3) extends to a unitary operator $\tilde{U}_{BF} : L^2(\mathbb{R}^d) \to \int^{\oplus}_{Y^*} \mathcal{H}_f\,dk$, with inverse given by

$$\big(\tilde{U}_{BF}^{-1}\,\varphi\big)(x) = \frac{1}{|Y^*|^{1/2}} \int_{Y^*} dk\; e^{ik\cdot x}\,\varphi(k, [x]),$$

where $[\,\cdot\,]$ refers to the decomposition $x = \gamma_x + [x]$, with $\gamma_x \in \Gamma$ and $[x] \in Y$.

The advantage of this construction is that the transformed Hamiltonian is a fibered operator over $Y^*$. Indeed, one checks that

$$\tilde{U}_{BF}\, H_{per}\, \tilde{U}_{BF}^{-1} = \int^{\oplus}_{Y^*} dk\; H_{per}(k)$$

with fiber operator

$$H_{per}(k) = \big(-i\nabla_y + k\big)^2 + V_\Gamma(y), \qquad k \in \mathbb{R}^d, \quad (4)$$

acting on the $k$-independent domain $W^{2,2}(\mathbb{T}_Y) \subset L^2(\mathbb{T}_Y)$. The latter fact explains why it is mathematically convenient to use the modified BF transform.
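The fiber operators $H_{per}(k)$ are easy to diagonalize numerically in a plane-wave (Fourier) basis. A minimal 1D sketch, assuming the illustrative potential $V_\Gamma(y) = 2\cos y$ with $\Gamma = 2\pi\mathbb{Z}$ (our choice, so $\Gamma^* = \mathbb{Z}$ and $Y^* = [-\tfrac12, \tfrac12]$): in the basis $e^{iGy}$, $G \in \mathbb{Z}$, the matrix of $H_{per}(k)$ has diagonal $(k+G)^2$ and entries $1$ coupling $G$ to $G \pm 1$.

```python
import numpy as np

N = 8                                     # plane waves G = -N, ..., N
G = np.arange(-N, N + 1)

def bands(k, nbands=3):
    """Lowest eigenvalues E_n(k) of (-i d/dy + k)^2 + 2*cos(y)."""
    H = np.diag((k + G)**2).astype(float)
    idx = np.arange(2 * N)
    H[idx, idx + 1] = 1.0                 # Fourier coefficients of 2*cos(y)
    H[idx + 1, idx] = 1.0
    return np.linalg.eigvalsh(H)[:nbands]

# Bloch bands are Gamma*-periodic: E_n(k + 1) = E_n(k)
assert np.allclose(bands(0.3), bands(1.3), atol=1e-6)

# at the Brillouin-zone edge k = 1/2 the degenerate free levels (both 1/4)
# are split by roughly 2|V_1| = 2, opening the first spectral gap
E = bands(0.5)
gap = E[1] - E[0]
```

The periodicity check and the size of the gap at $k = \tfrac12$ illustrate, respectively, the $\Gamma^*$-periodicity of the bands and the nearly-free-electron picture of gap opening.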

Each fiber operator $H_{per}(k)$ is self-adjoint, has compact resolvent, and thus pure point spectrum accumulating at infinity. We label the eigenvalues increasingly, i.e., $E_0(k) \le E_1(k) \le E_2(k) \le \ldots$ With this choice, they are $\Gamma^*$-periodic, i.e., $E_n(k + \lambda) = E_n(k)$ for all $\lambda \in \Gamma^*$. The function $k \mapsto E_n(k)$ is called the $n$th Bloch band.

For fixed $k \in Y^*$, one considers the eigenvalue problem

$$H_{per}(k)\,u_n(k, y) = E_n(k)\,u_n(k, y), \qquad \|u_n(k, \cdot)\|_{L^2(\mathbb{T}_Y)} = 1. \quad (5)$$

A solution to the previous eigenvalue equation (e.g., by numerical simulations) provides a complete solution to the dynamical equation induced by (1). Indeed, if the initial datum $\psi_0$ satisfies

$$(\tilde{U}_{BF}\,\psi_0)(k, y) = \varphi(k)\,u_n(k, y) \quad \text{for some } \varphi \in L^2(Y^*)$$

(one says in jargon that "$\psi_0$ is concentrated on the $n$th band"), then the solution $\psi(t)$ to the Schrödinger equation with initial datum $\psi_0$ is characterized by

$$(\tilde{U}_{BF}\,\psi(t))(k, y) = e^{-iE_n(k)t}\,\varphi(k)\,u_n(k, y).$$

In particular, the solution is exactly concentrated on the $n$th band at any time. By linearity, one recovers the solution for any initial datum. Below, we will discuss to which extent this dynamical description survives when macroscopic perturbations of the operator (1) are considered.

Wannier Functions and Charge Localization

While the Bloch representation is a useful tool to deal with dynamical and energetic problems, it is not convenient to study the localization of electrons in solids. A related crucial problem is the construction of a basis of generalized eigenfunctions of the operator $H_{per}$ which are exponentially localized in space. Indeed, such a basis allows one to develop computational methods which scale linearly with the system size [6], makes possible the description of the dynamics by tight-binding effective Hamiltonians, and plays a prominent role in the modern theories of macroscopic polarization [9, 18] and of orbital magnetization [21].

A convenient system of localized generalized eigenfunctions has been proposed by Wannier [22]. By definition, a Bloch function corresponding to the $n$th Bloch band is any $u_n$ satisfying (5). Clearly, if $u_n$ is a Bloch function, then $\tilde{u}_n$, defined by $\tilde{u}_n(k, y) = e^{i\vartheta(k)}\,u_n(k, y)$ for any $\Gamma^*$-periodic function $\vartheta$, is also a Bloch function. The latter invariance is often called Bloch gauge invariance.

Definition 1. The Wannier function $w_n \in L^2(\mathbb{R}^d)$ corresponding to a Bloch function $u_n$ for the Bloch band $E_n$ is the preimage of $u_n$ with respect to the Bloch-Floquet transform, namely

$$w_n(x) := \big(\tilde{U}_{BF}^{-1}\,u_n\big)(x) = \frac{1}{|Y^*|^{1/2}} \int_{Y^*} dk\; e^{ik\cdot x}\,u_n(k, [x]).$$

The translated Wannier functions are

$$w_{n,\gamma}(x) := w_n(x - \gamma) = \frac{1}{|Y^*|^{1/2}} \int_{Y^*} dk\; e^{-ik\cdot\gamma}\,e^{ik\cdot x}\,u_n(k, [x]), \qquad \gamma \in \Gamma.$$

Thus, in view of the orthogonality of the trigonometric polynomials and the fact that $\tilde{U}_{BF}$ is an isometry, the functions $\{w_{n,\gamma}\}_{\gamma\in\Gamma}$ are mutually orthogonal in $L^2(\mathbb{R}^d)$. Moreover, the family $\{w_{n,\gamma}\}_{\gamma\in\Gamma}$ is a complete orthonormal basis of $\tilde{U}_{BF}^{-1}\,\mathrm{Ran}\,P_n$, where $P_n(k)$ is the spectral projection of $H_{per}(k)$ corresponding to the eigenvalue $E_n(k)$ and $P_n = \int^{\oplus}_{Y^*} P_n(k)\,dk$.

In view of the properties of the Bloch-Floquet transform, the existence of an exponentially localized Wannier function for the Bloch band $E_n$ is equivalent to the existence of an analytic and $\Gamma^*$-pseudoperiodic Bloch function (recall that (3) implies that the Bloch function must satisfy $u(k + \lambda, y) = e^{-i\lambda\cdot y}\,u(k, y)$ for all $\lambda \in \Gamma^*$). A local argument assures that there is always a choice of the Bloch gauge such that the Bloch function is analytic around a given point. However, as several authors noticed [5, 13], there might be topological obstruction to obtain a global analytic Bloch function, in view of the competition between the analyticity and the pseudoperiodicity.

Hereafter, we denote by $\sigma_*(k)$ the set $\{E_i(k) : n \le i \le n + m - 1\}$, corresponding to a physically relevant family of $m$ Bloch bands, and we assume the following gap condition:

$$\inf_{k \in \mathbb{T}^d_*} \mathrm{dist}\big(\sigma_*(k),\, \sigma(H_{per}(k)) \setminus \sigma_*(k)\big) > 0. \quad (6)$$

If a Bloch band $E_n$ satisfies (6) for $m = 1$, we say that it is a single isolated Bloch band. For $m > 1$, we refer to a composite family of Bloch bands.

Single Isolated Bloch Band

In the case of a single isolated Bloch band, the problem of proving the existence of exponentially localized Wannier functions was raised in 1959 by W. Kohn [10], who solved it in dimension $d = 1$. In higher dimension, the problem has been solved, always in the case of a single isolated Bloch band, by J. des Cloizeaux [5] (under the nongeneric hypothesis that $V_\Gamma$ has a center of inversion) and finally by G. Nenciu under general hypotheses [12]; see also [8] for an alternative proof. Notice, however, that in real solids it might happen that the interesting Bloch band (e.g., the conduction band in graphene) is not isolated from the rest of the spectrum and that $k \mapsto P_n(k)$ is not smooth at the degeneracy point. In such a case, the corresponding Wannier function decreases only polynomially.

Composite Family of Bloch Bands

It is well known that, in dimension $d > 1$, the Bloch bands of crystalline solids are not, in general, isolated. Thus, the interesting problem, in view of real applications, concerns the case of composite families of bands, i.e., $m > 1$ in (6), and in this context, the more general notion of composite Wannier functions is relevant [1, 5]. Physically, condition (6) is always satisfied in semiconductors and insulators by considering the family of all the Bloch bands up to the Fermi energy.

Given a composite family of Bloch bands, we consider the orthogonal projector (in Dirac's notation)

$$P_*(k) := \sum_{i=n}^{n+m-1} |u_i(k)\rangle\langle u_i(k)|,$$

which is independent of the Bloch gauge, and we pose $P_* = \int^{\oplus}_{Y^*} P_*(k)\,dk$. A function $\chi$ is called a quasi-Bloch function if

$$P_*(k)\,\chi(k, \cdot) = \chi(k, \cdot) \quad \text{and} \quad \chi(k, \cdot) \ne 0 \qquad \forall k \in Y^*. \quad (7)$$

Although the terminology is not standard, we call Bloch frame a set $\{\chi_a\}_{a=1,\ldots,m}$ of quasi-Bloch functions such that $\{\chi_1(k), \ldots, \chi_m(k)\}$ is an orthonormal basis of $\mathrm{Ran}\,P_*(k)$ at (almost) every $k \in Y^*$. As in the previous case, there is a gauge ambiguity: a Bloch frame is fixed only up to a $k$-dependent unitary matrix $U(k) \in U(m)$, i.e., if $\{\chi_a\}_{a=1,\ldots,m}$ is a Bloch frame, then the functions $\tilde{\chi}_a(k) = \sum_{b=1}^m \chi_b(k)\,U_{b,a}(k)$ also define a Bloch frame.

Definition 2. The composite Wannier functions corresponding to a Bloch frame $\{\chi_a\}_{a=1,\ldots,m}$ are the functions

$$w_a(x) := \big(\tilde{U}_{BF}^{-1}\,\chi_a\big)(x), \qquad a \in \{1, \ldots, m\}.$$

As in the case of a single Bloch band, the exponential localization of the composite Wannier functions is equivalent to the analyticity of the corresponding Bloch frame (which, in addition, must be $\Gamma^*$-pseudoperiodic). As before, there might be topological obstruction to the existence of such a Bloch frame. As far as the operator (1) is concerned, the existence of exponentially localized composite Wannier functions has been proved in Nenciu [12] in dimension $d = 1$; as for $d > 1$, the problem remained unsolved for more than two decades, until recently [2, 16]. Notice that for magnetic periodic Schrödinger operators the existence of exponentially localized Wannier functions is generically false.

The Marzari-Vanderbilt Localization Functional

To circumvent the long-standing controversy about the existence of exponentially localized composite Wannier functions, and in view of the application to numerical simulations, the solid-state physics community preferred to introduce the alternative notion of maximally localized Wannier functions [11]. The latter are defined as the minimizers of a suitable localization functional, known as the Marzari-Vanderbilt (MV) functional. For a single-band normalized Wannier function $w \in L^2(\mathbb{R}^d)$, the localization functional is

$$F_{MV}(w) = \int_{\mathbb{R}^d} |x|^2\,|w(x)|^2\,dx - \sum_{j=1}^d \left(\int_{\mathbb{R}^d} x_j\,|w(x)|^2\,dx\right)^2, \quad (8)$$

which is well defined at least whenever $\int_{\mathbb{R}^d} |x|^2\,|w(x)|^2\,dx < +\infty$. More generally, for a system of $L^2$-normalized composite Wannier functions $w = \{w_1, \ldots, w_m\} \subset L^2(\mathbb{R}^d)$, the Marzari-Vanderbilt localization functional is

$$F_{MV}(w) = \sum_{a=1}^m F_{MV}(w_a) = \sum_{a=1}^m \int_{\mathbb{R}^d} |x|^2\,|w_a(x)|^2\,dx - \sum_{a=1}^m \sum_{j=1}^d \left(\int_{\mathbb{R}^d} x_j\,|w_a(x)|^2\,dx\right)^2. \quad (9)$$

We emphasize that the above definition includes the crucial constraint that the corresponding Bloch functions $\varphi_a(k, \cdot) = (\tilde{U}_{BF}\,w_a)(k, \cdot)$, for $a \in \{1, \ldots, m\}$, are a Bloch frame. While such an approach provided excellent results from the numerical viewpoint, the existence and exponential localization of the minimizers have been investigated only recently [17].

Dynamics in Macroscopic Electromagnetic Potentials

To model the transport properties of electrons in solids, one modifies the operator (1) to include the effect of the external electromagnetic potentials. Since the latter vary at the laboratory scale, it is natural to assume that the ratio $\varepsilon$ between the lattice constant $a = |Y|^{1/d}$ and the length scale of variation of the external potentials is small, i.e., $\varepsilon \ll 1$. The original problem is replaced by

$$i\varepsilon\,\partial_\tau\,\psi(\tau, x) = \left(\frac{1}{2}\big({-i\nabla_x} - A(\varepsilon x)\big)^2 + V_\Gamma(x) + V(\varepsilon x)\right)\psi(\tau, x) \equiv H_\varepsilon\,\psi(\tau, x), \quad (10)$$

where $\tau = \varepsilon t$ is the macroscopic time, and $V \in C^\infty_b(\mathbb{R}^d, \mathbb{R})$ and $A_j \in C^\infty_b(\mathbb{R}^d, \mathbb{R})$, $j \in \{1, \ldots, d\}$, are respectively the external electrostatic and magnetic potential. Hereafter, for the sake of a simpler notation, we consider only $d = 3$.

While the dynamical equation (10) is quantum mechanical, physicists argued [1] that for suitable wavepackets, which are localized on the $n$th Bloch band and spread over many lattice spacings, the main effect of the periodic potential $V_\Gamma$ is the modification of the relation between the momentum and the kinetic energy of the electron, from the free relation $E_{free}(k) = \frac{1}{2}k^2$ to the function $k \mapsto E_n(k)$ given by the $n$th Bloch band. Therefore, the semiclassical equations of motion are

$$\dot{r} = \nabla E_n(\kappa), \qquad \dot{\kappa} = -\nabla V(r) + \dot{r} \times B(r), \quad (11)$$

where $r \in \mathbb{R}^3$ is the macroscopic position of the electron, $\kappa = k - A(r)$ is the kinetic momentum with $k \in \mathbb{T}^3_*$ the Bloch momentum, $-\nabla V$ the external electric field, and $B = \nabla \times A$ the external magnetic field.

In fact, one can derive also the first-order correction to (11). At this higher accuracy, the electron acquires an effective $k$-dependent electric moment $\mathcal{A}_n(k)$ and magnetic moment $\mathcal{M}_n(k)$. If the $n$th Bloch band is non-degenerate (hence isolated), the former is given by the Berry connection

$$\mathcal{A}_n(k) = i\,\langle u_n(k),\, \nabla_k u_n(k)\rangle_{\mathcal{H}_f} = i \int_Y \overline{u_n(k, y)}\,\nabla_k u_n(k, y)\,dy,$$

and the latter reads $\mathcal{M}_n(k) = \frac{i}{2}\,\langle \nabla_k u_n(k),\, \times\,(H_{per}(k) - E_n(k))\,\nabla_k u_n(k)\rangle_{\mathcal{H}_f}$, i.e., explicitly,

$$\big(\mathcal{M}_n(k)\big)_i = \frac{i}{2} \sum_{1 \le j,l \le 3} \epsilon_{ijl}\,\big\langle \partial_{k_j} u_n(k),\, \big(H_{per}(k) - E_n(k)\big)\,\partial_{k_l} u_n(k)\big\rangle_{\mathcal{H}_f},$$

where $\epsilon_{ijl}$ is the totally antisymmetric symbol. The refined semiclassical equations read

$$\dot{r} = \nabla_\kappa\big(E_n(\kappa) - \varepsilon\,B(r)\cdot\mathcal{M}_n(\kappa)\big) - \varepsilon\,\dot{\kappa} \times \Omega_n(\kappa),$$
$$\dot{\kappa} = -\nabla_r\big(V(r) - \varepsilon\,B(r)\cdot\mathcal{M}_n(\kappa)\big) + \dot{r} \times B(r), \quad (12)$$

where $\Omega_n(k) = \nabla \times \mathcal{A}_n(k)$ corresponds to the curvature of the Berry connection. The previous equations have a hidden Hamiltonian structure [14]. Indeed, by introducing the semiclassical Hamiltonian function $H_{sc}(r, \kappa) = E_n(\kappa) + V(r) - \varepsilon\,B(r)\cdot\mathcal{M}_n(\kappa)$, (12) become

$$\begin{pmatrix} B(r) & -I \\ I & \varepsilon\,\Omega_n(\kappa) \end{pmatrix} \begin{pmatrix} \dot{r} \\ \dot{\kappa} \end{pmatrix} = \begin{pmatrix} -\nabla_r H_{sc}(r, \kappa) \\ -\nabla_\kappa H_{sc}(r, \kappa) \end{pmatrix}, \quad (13)$$

where $I$ is the identity matrix and $B$ (resp. $\Omega_n$) is the $3 \times 3$ matrix corresponding to the vector field $B$ (resp. $\Omega_n$), i.e., $B_{l,m}(r) = \sum_{1 \le j \le 3} \epsilon_{lmj}\,B_j(r) = (\partial_l A_m - \partial_m A_l)(r)$. Since the matrix appearing on the l.h.s. corresponds to a symplectic form $\sigma_{B,\varepsilon}$ (i.e., a non-degenerate closed 2-form) on $\mathbb{R}^6$, (13) has Hamiltonian form with respect to $\sigma_{B,\varepsilon}$.

The mathematical derivation of the semiclassical model (12) from (10) as $\varepsilon \to 0$ has been accomplished in Panati et al. [14]. The first-order correction to the semiclassical equations (11) was previously investigated in Sundaram and Niu [19], but the heuristic derivation in the latter paper does not yield the term of order $\varepsilon$ in the second equation. Without such a term, it is not clear if the equations have a Hamiltonian structure. As for mathematically related problems, both the semiclassical asymptotics of the spectrum of $H_\varepsilon$ and the corresponding scattering problem have been studied in detail (see [7] and references therein). The effective quantum Hamiltonians corresponding to (10) for $\varepsilon \to 0$ have also been deeply investigated [13].

The connection between (10) and (12) can be expressed either by an Egorov-type theorem involving quantum observables, or by using Wigner functions. Here we focus on the second approach. First we define the Wigner function. We consider the space $\mathcal{C} = C^\infty_b(\mathbb{R}^{2d})$ equipped with the standard distance $d_{\mathcal{C}}$, and the subspace of $\Gamma^*$-periodic observables

$$\mathcal{C}_{per} = \{a \in \mathcal{C} : a(r, k + \lambda) = a(r, k) \;\; \forall \lambda \in \Gamma^*\}.$$

Recall that, according to the Calderón-Vaillancourt theorem, there is a constant $C$ such that for $a \in \mathcal{C}$ its Weyl quantization $\hat{a} \in \mathcal{B}(L^2(\mathbb{R}^3))$ satisfies

$$\big|\langle \psi,\, \hat{a}\,\psi\rangle_{L^2(\mathbb{R}^3)}\big| \le C\,d_{\mathcal{C}}(a, 0)\,\|\psi\|^2.$$

Hence, the map $\mathcal{C} \ni a \mapsto \langle \psi,\, \hat{a}\,\psi\rangle \in \mathbb{C}$ is linear continuous and thus defines an element $W^\varepsilon_\psi$ of the dual space $\mathcal{C}'$, the Wigner function of $\psi$. Writing

$$\langle \psi,\, \hat{a}\,\psi\rangle =: \langle W^\varepsilon_\psi,\, a\rangle_{\mathcal{C}',\mathcal{C}} =: \int_{\mathbb{R}^{2d}} a(q, p)\,W^\varepsilon_\psi(q, p)\,dq\,dp,$$

and inserting the definition of the Weyl quantization for $\hat{a}$, one arrives at the formula

$$W^\varepsilon_\psi(q, p) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} d\xi\; e^{i\xi\cdot p}\;\overline{\psi(q + \varepsilon\xi/2)}\,\psi(q - \varepsilon\xi/2), \quad (14)$$

which yields $W^\varepsilon_\psi \in L^2(\mathbb{R}^{2d})$. Although $W^\varepsilon_\psi$ is real-valued, it attains also negative values in general, so it does not define a probability distribution on phase space.

After this preparation, we can vaguely state the link between (10) and (12); see [20] for the precise formulation. Let $E_n$ be an isolated, nondegenerate Bloch band. Denote by $\Phi^\tau_\varepsilon(r, k)$ the flow of the dynamical system (12) in canonical coordinates $(r, k) = (r, \kappa + A(r))$ (recall that the Weyl quantization, and hence the definition of the Wigner function, is not invariant under non-linear changes of canonical coordinates). Then for each finite time-interval $I \subset \mathbb{R}$ there is a constant $C$ such that for $\tau \in I$, $a \in \mathcal{C}_{per}$, and for $\psi_0$ "well-concentrated on the $n$th Bloch band" one has

$$\left|\int_{\mathbb{R}^{2d}} a(q, p)\left(W^\varepsilon_{\psi(\tau)}(q, p) - \big(W^\varepsilon_{\psi_0} \circ \Phi^{-\tau}_\varepsilon\big)(q, p)\right)dq\,dp\right| \le \varepsilon^2\,C\,d_{\mathcal{C}}(a, 0)\,\|\psi_0\|^2,$$

where $\psi(t)$ is the solution to the Schrödinger equation (10) with initial datum $\psi_0$.

Slowly Varying Deformations and Piezoelectricity

To investigate the contribution of the electrons to the macroscopic polarization and to the piezoelectric effect, it is crucial to know how the electrons move in a crystal which is strained at the macroscopic scale. Assuming the usual fixed-lattice approximation, the problem can be reduced to study the solutions to

$$i\,\partial_t\,\psi(t, x) = \left(-\frac{1}{2}\Delta + V_\Gamma(x, \varepsilon t)\right)\psi(t, x) \quad (15)$$

for $\varepsilon \ll 1$, where $V_\Gamma(\cdot, t)$ is $\Gamma$-periodic for every $t \in \mathbb{R}$, i.e., the periodicity lattice does not depend on time. While a model with a fixed lattice might seem unrealistic at first glance, we refer to Resta [18] and King-Smith and Vanderbilt [9] for its physical justification. The analysis of the Hamiltonian $H(t) = -\frac{1}{2}\Delta + V_\Gamma(x, t)$ yields a family of time-dependent Bloch functions $\{u_n(k, t)\}_{n\in\mathbb{N}}$ and Bloch bands $\{E_n(k, t)\}_{n\in\mathbb{N}}$. Assuming that the relevant Bloch band is isolated from the rest of the spectrum, so that (6) holds true at every time, and that the initial datum is well-concentrated on the $n$th Bloch band, one obtains a semiclassical description of the dynamics analogous to (12). In this case, the semiclassical equations read

$$\dot{r} = \nabla_k E_n(k, t) - \varepsilon\,\Theta_n(k, t), \qquad \dot{k} = 0, \quad (16)$$

where

$$\Theta_n(k, t) = -\partial_t \mathcal{A}_n(k, t) - \nabla_k\,\phi_n(k, t)$$

with

$$\mathcal{A}_n(k, t) = i\,\langle u_n(k, t),\, \nabla_k u_n(k, t)\rangle_{\mathcal{H}_f}, \qquad \phi_n(k, t) = -i\,\langle u_n(k, t),\, \partial_t u_n(k, t)\rangle_{\mathcal{H}_f}.$$

The notation emphasizes the analogy with electromagnetism: if $\mathcal{A}_n(k, t)$ and $\phi_n(k, t)$ are interpreted as the geometric analogues of the vector potential and of the electrostatic scalar potential, then $\Theta_n(k, t)$ and $\Omega_n(k, t)$ correspond, respectively, to the electric and to the magnetic field.

One can rigorously connect (15) and the semiclassical model (16), in the spirit of the result stated at the end of the previous section; see [15]. From (16) one obtains the King-Smith and Vanderbilt formula [9], which approximately predicts the contribution $\Delta P$ of the electrons to the macroscopic polarization of a crystalline insulator strained in the time interval $[0, T]$, namely,

$$\Delta P = \frac{1}{(2\pi)^d} \sum_{n \in \mathcal{N}_{occ}} \int_{Y^*} \big(\mathcal{A}_n(k, T) - \mathcal{A}_n(k, 0)\big)\,dk, \quad (17)$$

where the sum runs over all the occupied Bloch bands, i.e., $\mathcal{N}_{occ} = \{n \in \mathbb{N} : E_n(k, t) \le E_F\}$ with $E_F$ the Fermi energy. Notice that (17) requires the computation of the Bloch functions only at the initial and at the final time; in view of that, the previous formula is the starting point of any state-of-the-art numerical simulation of macroscopic polarization in insulators.

Cross-References

Born-Oppenheimer Approximation, Adiabatic Limit, and Related Math. Issues
Mathematical Theory for Quantum Crystals

References

1. Blount, E.I.: Formalism of band theory. In: Seitz, F., Turnbull, D. (eds.) Solid State Physics, vol. 13, pp. 305-373. Academic, New York (1962)
2. Brouder, Ch., Panati, G., Calandra, M., Mourougane, Ch., Marzari, N.: Exponential localization of Wannier functions in insulators. Phys. Rev. Lett. 98, 046402 (2007)
3. Catto, I., Le Bris, C., Lions, P.-L.: On the thermodynamic limit for Hartree-Fock type problems. Ann. Henri Poincaré 18, 687-760 (2001)
4. Cancès, E., Deleurence, A., Lewin, M.: A new approach to the modeling of local defects in crystals: the reduced Hartree-Fock case. Commun. Math. Phys. 281, 129-177 (2008)
5. des Cloizeaux, J.: Analytical properties of n-dimensional energy bands and Wannier functions. Phys. Rev. 135, A698-A707 (1964)
6. Goedecker, S.: Linear scaling electronic structure methods. Rev. Mod. Phys. 71, 1085-1111 (1999)
7. Gérard, C., Martinez, A., Sjöstrand, J.: A mathematical approach to the effective Hamiltonian in perturbed periodic problems. Commun. Math. Phys. 142, 217-244 (1991)
8. Helffer, B., Sjöstrand, J.: Équation de Schrödinger avec champ magnétique et équation de Harper. In: Holden, H., Jensen, A. (eds.) Schrödinger Operators. Lecture Notes in Physics, vol. 345, pp. 118-197. Springer, Berlin (1989)
9. King-Smith, R.D., Vanderbilt, D.: Theory of polarization of crystalline solids. Phys. Rev. B 47, 1651-1654 (1993)
10. Kohn, W.: Analytic properties of Bloch waves and Wannier functions. Phys. Rev. 115, 809-821 (1959)
11. Marzari, N., Vanderbilt, D.: Maximally localized generalized Wannier functions for composite energy bands. Phys. Rev. B 56, 12847-12865 (1997)
12. Nenciu, G.: Existence of the exponentially localised Wannier functions. Commun. Math. Phys. 91, 81-85 (1983)
13. Nenciu, G.: Dynamics of band electrons in electric and magnetic fields: rigorous justification of the effective Hamiltonians. Rev. Mod. Phys. 63, 91-127 (1991)
14. Panati, G., Spohn, H., Teufel, S.: Effective dynamics for Bloch electrons: Peierls substitution and beyond. Commun. Math. Phys. 242, 547-578 (2003)
15. Panati, G., Sparber, Ch., Teufel, S.: Geometric currents in piezoelectricity. Arch. Ration. Mech. Anal. 191, 387-422 (2009)
16. Panati, G.: Triviality of Bloch and Bloch-Dirac bundles. Ann. Henri Poincaré 8, 995-1011 (2007)

17. Panati, G., Pisante, A.: Bloch bundles, Marzari-Vanderbilt f0 D Au0. Due to linearity, A.u C u0/ D f C f0. functional and maximally localized Wannier functions. Obviously, u and u C u0 have the same Cauchy data preprint arXiv.org (2011) on  ,sof and f C f produce the same data (2), 18. Resta, R.: Theory of the electric polarization in crystals. 0 0 Ferroelectrics 136, 51–75 (1992) but they are different in ˝. It is clear that there is a 19. Sundaram, G., Niu, Q.: Wave-packet dynamics in slowly very large (infinite dimensional) manifold of solutions perturbed crystals: gradient corrections and Berry-phase to the inverse source problem (1)and(2). To regain 59 effects. Phys. Rev. B , 14915–14925 (1999) uniqueness, one has to restrict unknown distributions to 20. Teufel, S., Panati, G.: Propagation of Wigner functions for the Schrodinger¨ equation with a perturbed periodic poten- a smaller but physically meaningful uniqueness class. tial. In: Blanchard, Ph., Dell’Antonio, G. (eds.) Multiscale Methods in Quantum Mechanics. Birkhauser,¨ Boston (2004) 21. Thonhauser, T., Ceresoli, D., Vanderbilt, D., Resta, R.: Orbital magnetization in periodic insulators. Phys. Rev. Lett. Inverse Problems of Potential Theory 95, 137205 (2005) 22. Wannier, G.H.: The structure of electronic excitation levels We start with an inverse source problem which has a in insulating crystals. Phys. Rev. 52, 191–197 (1937) long and rich history. Let ˚ be a fundamental solution of a linear second-order elliptic partial differential operator A in Rn. The potential of a (Radon) measure  supported in ˝ is Source Location Z Victor Isakov u.xI / D ˚.x; y/d.y/: (3) ˝ Department of Mathematics and Statistics, Wichita State University, Wichita, KS, USA The general inverse problem of potential theory is to find ; supp  ˝; from the boundary data (2). Since Au.I / D  (in generalized sense), the General Problem inverse problem of potential theory is a particular case of the inverse source problem. 
In the inverse problem Let A be a partial differential operator of second order of gravimetry, one considers A D,

Au D f in ˝: (1) 1 ˚.x;y/ D ; 4jx  yj In the inverse source problem, one is looking for the source term f from the boundary data and the gravity field is generated by volume mass distribution f 2 L1.˝/. We will identify f with a u D g0;@u D g1 on 0  @˝; (2) measure .Sincef with the data (2) is not unique, one can look for f with the smallest (L2.˝/-) norm. 2 where g0;g1 are given functions. In this short expos- The subspace of harmonic functions fh is L -closed, itory note, we will try to avoid technicalities, so we so for any f , there is a unique fh such that f D 2 assume that (in general nonlinear) A is defined by a fh C f0 where f0 is (L )-orthogonal to fh. Since the known C 2-function and f is a function of x 2 ˝ fundamental solution is a harmonic function of y when Rn 2 where ˝ is a given bounded domain in with C x is outside ˝,thetermf0 produces zero potential boundary.  denotes the exterior unit normal to the outside ˝. Hence, the harmonic orthogonal component boundary of a domain. H k.˝/ is the Sobolev space of f has the same exterior data and minimal L2- with the norm kk.k/.˝/. norm. Applying the Laplacian to the both sides of the A first crucial question is whether there is enough equation u.I fh/ D fh,wearriveatthebiharmonic 2 data to (uniquely) find f .IfA is a linear operator, then equation  u.I fh/ D 0 in ˝.When0 D @˝,we solution f of this problem is not unique. Indeed, let have a well-posed first boundary value problem for 2 u0 be a function in the Sobolev space H .˝/ with the biharmonic equation for u.I fh/. Solving this prob- zero Cauchy data u0 D @u0 D 0 on 0,andlet lem, we find fh from the previous Poisson equation. Source Location 1341

However, it is hard to interpret $f_h$ (geo)physically, and knowing $f_h$ does not help much with finding $f$. (Geo)physical intuition suggests looking for a perturbing inclusion $D$ of constant density, i.e., for $f = \chi_D$ (the characteristic function of an open set $D$). Since (in the distributional sense) $\Delta u(\cdot; \mu) = \mu$ in $\Omega$, by using Green's formula (or the definition of a weak solution), we obtain

$$\int_\Omega u^*\, d\mu = \int_{\partial\Omega} \big( (\partial_\nu u^*) u - (\partial_\nu u) u^* \big) \qquad (4)$$

for any function $u^* \in H^1(\Omega)$ which is harmonic in $\Omega$. If $\Gamma_0 = \partial\Omega$, then the right side of (4) is known: we are given all harmonic moments of $\mu$. In particular, letting $u^* = 1$, we obtain the total mass of $\mu$, and by letting $u^*$ be coordinate (linear) functions, we obtain the first-order moments of $\mu$ and hence the center of gravity of $\mu$.

Even when one assumes that $f = \chi_D$, there is nonuniqueness due to possible disconnectedness of the complement of $D$. Indeed, it is well known that if $D$ is the ball $B(a, R)$ with center $a$ and radius $R$, then its Newtonian potential is $u(x; \chi_D) = M \frac{1}{4\pi |x - a|}$, where $M$ is the total mass of $D$. So the exterior potentials of all annuli $B(a, R_2) \setminus B(a, R_1)$ are the same when $R_2^3 - R_1^3 = C$, where $C$ is a positive constant. Moreover, by using this simple example and some reflections in $\mathbb{R}^n$, one can find two different domains with connected boundaries and equal exterior Newtonian potentials. Augmenting this construction by the condensation of singularities argument from the theory of functions of complex variables, one can construct a continuum of different domains with connected boundaries and the same exterior potential. So there is a need for geometrical conditions on $D$.

A domain $D$ is called star shaped with respect to a point $a$ if any ray originating at $a$ intersects $D$ over an interval. An open set $D$ is $x_1$-convex if any straight line parallel to the $x_1$-axis intersects $D$ over an interval. In what follows, $\Gamma_0$ is a non-void open subset of $\partial\Omega$.

Suppose now that two such domains $D_1$ and $D_2$ produce the same exterior data; then, by uniqueness in the Cauchy problem for the Laplace equation, $u_1 = u_2$ near $\partial\Omega$. Then from (4) (with $d\mu = (\chi_{D_1} - \chi_{D_2})\, dm$, where $dm$ is the Lebesgue measure),

$$\int_{D_1} u^* = \int_{D_2} u^*$$

for any function $u^*$ which is harmonic in $\Omega$. Novikov's method of orthogonality is to assume that $D_1$ and $D_2$ are different and then to select $u^*$ in such a way that the left integral is less than the right one. To achieve this goal, $u^*$ is replaced by its derivatives, and one integrates by parts to move integrals to boundaries and makes use of the maximum principles to bound interior integrals.

The inverse problem of potential theory is a severely (exponentially) ill-conditioned problem of mathematical physics. The character of stability, conditional stability estimates, and regularization methods for the numerical solution of such problems have been studied starting from the pioneering work of Fritz John and Tikhonov in the 1950s–1960s.

To understand the degree of ill conditioning, one can consider harmonic continuation from the circle $\Gamma = \{x : |x| = R\}$ onto the inner circle $\gamma = \{x : |x| = \rho\}$. By using polar coordinates $(r, \theta)$, any harmonic function decaying at infinity can be (in a stable way) approximated by

$$u(r, \theta; M) = \sum_{m=1}^{M} u_m r^{-m} e^{im\theta}.$$

Let us define the linear operator of the continuation as $A(\partial_r u(\cdot, R)) = \partial_r u(\cdot, \rho)$. Using the formula for $u(\cdot; M)$, it is easy to see that the condition number of the corresponding matrix is $(R/\rho)^M$, which grows exponentially with respect to $M$. If $R/\rho = 10$, then the use of computers is only possible when $M < 16$, and typical practical measurement errors of $0.01$ allow one to obtain meaningful computational results only when $M < 3$.

The following logarithmic stability estimate holds and can be shown to be best possible. We denote by $|\cdot|_2(S^2)$ the standard norm in the space $C^2(S^2)$.

Theorem 2 Let $D_1, D_2$ be two domains given in polar coordinates $(r, \sigma)$ by the equations $\partial D_j = \{r = d_j(\sigma)\}$, where $|d_j|_2(S^2) \le M_2$;
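The exponential growth of the condition number is easy to reproduce. Restricted to the first $M$ modes, the continuation operator from $r = R$ to $r = \rho$ is diagonal, multiplying mode $m$ by $(R/\rho)^m$, so its condition number is $(R/\rho)^{M-1} \approx (R/\rho)^M$. A small numpy sketch (ours, not from the article):

```python
import numpy as np

R, rho, M = 10.0, 1.0, 15          # R/rho = 10, as in the text

# Continuation of the first M modes: mode m is amplified by (R/rho)^m.
amplification = (R / rho) ** np.arange(1, M + 1)
kappa = np.linalg.cond(np.diag(amplification))
print(f"condition number for M={M}: {kappa:.3e}")

# A data error of size 0.01 in mode m grows to 0.01*(R/rho)^m after
# continuation, so only the first few modes carry meaningful information.
print(0.01 * amplification)
```

With $R/\rho = 10$ the amplified error in mode $m$ is $0.01 \cdot 10^m$, which exceeds any $O(1)$ signal already for $m \ge 2$, matching the bound $M < 3$ quoted in the text.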

Hyperbolic Equations

The inverse source problem in wave motion is to find $(u, f) \in H^2(\Omega) \times L^2(\Omega)$ from the partial differential equation

$$\partial_t^2 u - \Delta u = f, \quad \partial_t f = 0 \quad \text{in } \Omega = G \times (0, T),$$

with natural lateral boundary and initial conditions

$$\partial_\nu u = 0 \ \text{on } \partial G \times (0, T), \qquad u = \partial_t u = 0 \ \text{on } G \times \{0\}, \qquad (5)$$

and the additional data

$$u = g \quad \text{on } \Gamma_0 = S_0 \times (0, T),$$

where $S_0$ is a part of $\partial G$. Assuming $\partial_t^2 u \in H^2(\Omega)$, letting

$$v = \partial_t^2 u \qquad (6)$$

and differentiating twice with respect to $t$ transforms this problem into finding the initial data in the following hyperbolic mixed boundary value problem:

$$\partial_t^2 v - \Delta v = 0 \quad \text{in } \Omega, \qquad (7)$$

with the lateral boundary condition

$$\partial_\nu v = 0 \quad \text{on } \partial G \times (0, T), \qquad (8)$$

and the additional data

$$v = \partial_t^2 g \quad \text{on } \Gamma_0. \qquad (9)$$

Indeed, one can find $u$ from (6) and the initial conditions (5).

Theorem 3 Let

$$2\,\mathrm{dist}(x, S_0; G) < T, \quad x \in \partial G.$$

Then the data (9) on $\Gamma_0$ for a solution $v$ of (7) and (8) uniquely determine $v$ on $\Omega$. If, in addition, $S_0 = \partial G$, then

$$\|v(\cdot, 0)\|_{(1)}(G) + \|\partial_t v(\cdot, 0)\|_{(0)}(G) \le C \|\partial_t g\|_{(1)}(\Gamma_0). \qquad (10)$$

Here, $\mathrm{dist}(x, S_0; G)$ is the (minimal) distance from $x$ to $S_0$ inside $G$.

The statement about uniqueness for arbitrary $S_0$ follows from the sharp uniqueness of continuation results for second-order hyperbolic equations and some geometric ideas [5], Section 3.4. For hyperbolic equations with analytic coefficients, these sharp results are due to Fritz John and are based on the Holmgren theorem. For $C^\infty$ space coefficients, the Holmgren theorem was extended by Tataru. Stability of the continuation (and hence in the inverse source problem) is (as for harmonic continuation) at best of logarithmic type; i.e., we have a severely ill-conditioned inverse problem.

When $S_0 = \partial G$, one has a very strong (best possible) Lipschitz stability estimate (10). This estimate was obtained by Lop-Fat Ho (1986) by using the technique of multipliers; for more general hyperbolic equations by Klibanov, Lasiecka, Tataru, and Triggiani (1990s) by using Carleman-type estimates; and by Bardos, Lebeau, and Rauch (1992) by propagation of singularities arguments. Similar results are available for general linear hyperbolic equations of second order with time-independent coefficients. However, for Lipschitz stability, one has to assume the existence of a suitable pseudo-convex function or the absence of trapped bicharacteristics.

Looking for the source of wave motion (in the more complicated elasticity system) can, in particular, be interpreted as finding the location and intensity of earthquakes. The recent medical diagnostic technique called thermoacoustic tomography can be reduced to looking for the initial displacement $u_0$. One of the versions of this problem is the classical one of finding a function from its spherical means. In the limiting case when the radii of the spheres become large, one arrives at one of the most useful problems of tomography, whose mathematical theory was initiated by Radon (1917) and Fritz John (1940s). For a recent advance in tomography in the case of general attenuation, we refer to [2]. Detailed references are in [4, 5].

In addition to direct applications, inverse source problems represent linearizations of (nonlinear) problems of finding coefficients of partial differential equations and can be used in the study of uniqueness and stability of the identification of coefficients. For example, subtracting the two equations $\partial_t^2 u_2 - a_2 \Delta u_2 = 0$ and $\partial_t^2 u_1 - a_1 \Delta u_1 = 0$ yields $\partial_t^2 u - a_2 \Delta u = \alpha f$ for $u = u_2 - u_1$, with $\alpha = \Delta u_1$ (a known weight function) and unknown $f = a_2 - a_1$. A general technique to show uniqueness and stability of such inverse source problems by utilizing Carleman estimates was introduced in [3].
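The linearization step above can be checked on explicit solutions. In one space dimension, the plane waves $u_j(x,t) = \sin(x - \sqrt{a_j}\, t)$ solve $\partial_t^2 u_j = a_j \Delta u_j$, and the difference $u = u_2 - u_1$ should satisfy $\partial_t^2 u - a_2 \Delta u = \alpha f$ with $\alpha = \Delta u_1$ and $f = a_2 - a_1$. A small numerical check (our illustration; the article gives no code), using exact analytic derivatives on a grid:

```python
import numpy as np

a1, a2 = 1.0, 2.25                       # two wave speeds squared
x, t = np.meshgrid(np.linspace(0, 1, 101), np.linspace(0, 1, 101))

# Plane-wave solutions of u_tt = a_j u_xx and their exact derivatives.
th1, th2 = x - np.sqrt(a1) * t, x - np.sqrt(a2) * t
u1_xx = -np.sin(th1)                          # Laplacian of u1 (1-D)
u_tt = -a2 * np.sin(th2) + a1 * np.sin(th1)   # d_t^2 (u2 - u1)
u_xx = -np.sin(th2) + np.sin(th1)             # Laplacian of u2 - u1

lhs = u_tt - a2 * u_xx
rhs = (a2 - a1) * u1_xx                  # alpha * f with alpha = u1_xx
print(float(np.max(np.abs(lhs - rhs))))  # agreement to rounding error
```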

Acknowledgements This research was in part supported by the NSF grant DMS 10-07734 and by the Emylou Keith and Betty Dutcher Distinguished Professorship at WSU.

References

1. Ammari, H., Kang, H.: An Introduction to Mathematics of Emerging Biomedical Imaging. Springer, Berlin (2008)
2. Arbuzov, E.V., Bukhgeim, A.L., Kazantsev, S.G.: Two-dimensional tomography problems and the theory of A-analytic functions. Siber. Adv. Math. 8, 1–20 (1998)
3. Bukhgeim, A.L., Klibanov, M.V.: Global uniqueness of a class of multidimensional inverse problems. Sov. Math. Dokl. 24, 244–247 (1981)
4. Isakov, V.: Inverse Source Problems. American Mathematical Society, Providence (1990)
5. Isakov, V.: Inverse Problems for PDE. Springer, New York (2006)
6. Novikov, P.: Sur le problème inverse du potentiel. Dokl. Akad. Nauk SSSR 18, 165–168 (1938)
7. Prilepko, A.I., Orlovskii, D.G., Vasin, I.A.: Methods for Solving Inverse Problems in Mathematical Physics. Marcel Dekker, New York (2000)

Sparse Approximation

Holger Rauhut
Lehrstuhl C für Mathematik (Analysis), RWTH Aachen University, Aachen, Germany

Definition

The aim of sparse approximation is to represent an object – usually a vector, matrix, function, image, or operator – by a linear combination of only few elements from a basis, or more generally, from a redundant system such as a frame. The tasks at hand are to design efficient computational methods for finding a sparse representation in an appropriate basis/frame. Concrete applications include compression, denoising, signal separation, and signal reconstruction (compressed sensing).

On the one hand, the theory of sparse approximation is concerned with identifying the type of vectors, functions, etc. which can be well approximated by a sparse expansion in a given basis or frame and with quantifying the approximation error. For instance, when given a wavelet basis, these questions relate to the area of function spaces, in particular, Besov spaces. On the other hand, algorithms are required to actually find a sparse approximation to a given vector or function. In particular, if the frame at hand is redundant or if only incomplete information is available – as is the case in compressed sensing – this is a nontrivial task. Several approaches are available, including convex relaxation ($\ell_1$-minimization), greedy algorithms, and certain iterative procedures.

Sparsity

Let $x$ be a vector in $\mathbb{R}^N$ or $\mathbb{C}^N$ or $\ell_2(\Gamma)$ for some possibly infinite set $\Gamma$. We say that $x$ is $s$-sparse if

$$\|x\|_0 := \#\{\ell : x_\ell \neq 0\} \le s.$$

For a general vector $x$, the error of best $s$-term approximation quantifies the distance to sparse vectors,

$$\sigma_s(x)_p := \inf_{z : \|z\|_0 \le s} \|x - z\|_p.$$

Here, $\|x\|_p = \big( \sum_j |x_j|^p \big)^{1/p}$ is the usual $\ell_p$-norm for $0 < p < \infty$. For $q < p$, one has the bound

$$\sigma_s(x)_p \le s^{1/p - 1/q} \|x\|_q. \qquad (1)$$

This inequality enlightens the importance of $\ell_q$-spaces with $q < 1$ in this context.

The situation above describes sparsity with respect to the canonical basis. For a more general setup, consider a (finite- or infinite-dimensional) Hilbert space $\mathcal{H}$ (often a space of functions) endowed with an orthonormal basis $\{\psi_j, j \in J\}$. Given an element $f \in \mathcal{H}$, our aim is to approximate it by a finite linear combination of the $\psi_j$, that is, by

$$\sum_{j \in S} x_j \psi_j,$$

where $S \subset J$ is of cardinality at most $s$, say. In contrast to linear approximation, the index set $S$ is not fixed a priori but is allowed to depend on $f$. Analogously as above, the error of best $s$-term approximation is then defined as

$$\sigma_s(f)_{\mathcal{H}} := \inf_{x : \|x\|_0 \le s} \Big\| f - \sum_{j \in J} x_j \psi_j \Big\|_{\mathcal{H}},$$

and the element $\sum_j x_j \psi_j$ with $\|x\|_0 \le s$ realizing the infimum is called a best $s$-term approximation to $f$.

This definition includes orthonormal bases but also allows redundancy, that is, the coefficient vector $x$ in the expansion $f = \sum_{j \in J} x_j \psi_j$ is no longer unique. Redundancy has several advantages. For instance, since there are more possibilities for a sparse approximation of $f$, the error of $s$-sparse approximation may potentially be smaller. On the other hand, it may get harder to actually find a sparse approximation (see also below). In another direction, one may relax the assumption that $\mathcal{H}$ is a Hilbert space and only require it to be a Banach space. Clearly, then the notion of an orthonormal basis also does not make sense anymore, so that $\{\psi_j, j \in J\}$ is then just some system of elements spanning the space – possibly a basis. Important types of systems $\{\psi_j, j \in J\}$ considered in this context include the trigonometric system $\{e^{2\pi i k \cdot}, k \in \mathbb{Z}\} \subset L^2[0,1]$, wavelet systems [9, 36], or Gabor frames [23].

Quality of a Sparse Approximation

One important task in the field of sparse approximation is to quantify how well an element $f \in \mathcal{H}$ or a whole class $\mathcal{B} \subset \mathcal{H}$ of elements can be approximated by sparse expansions. An abstract way [12, 20] of describing good approximation classes is to introduce

$$\mathcal{B}_p := \Big\{ f \in \mathcal{H} : f = \sum_{j \in J} x_j \psi_j,\ \|x\|_p < \infty \Big\}$$

with norm $\|f\|_{\mathcal{B}_p} = \inf\{ \|x\|_p : f = \sum_j x_j \psi_j \}$. If $\{\psi_j : j \in J\}$ is an orthonormal basis, then it follows directly from (1) that, for $0 < p < 2$, $\sigma_s(f)_{\mathcal{H}} \le s^{1/2 - 1/p} \|f\|_{\mathcal{B}_p}$.

Algorithms for Sparse Approximation

For practical purposes, it is important to have algorithms for computing optimal or at least near-optimal sparse approximations. When $\mathcal{H} = \mathbb{C}^N$ is finite dimensional and $\{\psi_j, j = 1, \ldots, N\} \subset \mathbb{C}^N$ is an orthonormal basis, then this is easy. In fact, the coefficients in the expansion $f = \sum_{j=1}^{N} x_j \psi_j$ are given by $x_j = \langle f, \psi_j \rangle$, so that a best $s$-term approximation to $f$ in $\mathcal{H}$ is given by

$$\sum_{j \in S} \langle f, \psi_j \rangle \psi_j,$$

where $S$ is an index set of $s$ largest absolute entries of the vector $(\langle f, \psi_j \rangle)_{j=1}^{N}$.

When $\{\psi_j, j = 1, \ldots, M\} \subset \mathbb{C}^N$ is redundant, that is, $M > N$, then it becomes a nontrivial problem to find the sparsest approximation to a given $f \in \mathbb{C}^N$. Denoting by $\Psi$ the $N \times M$ matrix whose columns are the vectors $\psi_j$, this problem can be expressed as finding the minimizer of

$$\min \|x\|_0 \quad \text{subject to } \|\Psi x - f\|_2 \le \varepsilon, \qquad (3)$$

for a given threshold $\varepsilon > 0$. In fact, this problem is known to be NP-hard in general [11, 25]. Several tractable alternatives have been proposed. We discuss the greedy methods matching pursuit and orthogonal matching pursuit as well as the convex relaxation method basis pursuit ($\ell_1$-minimization) next. Other sparse approximation algorithms include iterative schemes, such as iterative hard thresholding [1] and iteratively reweighted least squares [10].

Matching Pursuits

Given a possibly redundant system $\{\psi_j, j \in J\} \subset \mathcal{H}$ – often called a dictionary – the greedy algorithm matching pursuit [24, 27, 31, 33] iteratively builds up the support set and the sparse approximation. Starting with $r_0 = f$, $S_0 = \emptyset$, and $k = 0$, it performs the following steps:

1. $j_k := \operatorname{argmax}_{j \in J} \dfrac{|\langle r_k, \psi_j \rangle|}{\|\psi_j\|}$.
2. $S_{k+1} := S_k \cup \{j_k\}$.
3. $r_{k+1} := r_k - \dfrac{\langle r_k, \psi_{j_k} \rangle}{\|\psi_{j_k}\|^2}\, \psi_{j_k}$.
4. $k \mapsto k + 1$.
5. Repeat from step (1) until a stopping criterion is met.
6. Output $\tilde f = \tilde f_k = \sum_{\ell=1}^{k} \dfrac{\langle r_\ell, \psi_{j_\ell} \rangle}{\|\psi_{j_\ell}\|^2}\, \psi_{j_\ell}$.

Clearly, if $s$ steps of matching pursuit are performed, then the output $\tilde f$ has an $s$-sparse representation with respect to $\{\psi_j\}$. It is known that the sequence $\tilde f_k$ converges to $f$ when $k$ tends to infinity [24]. A possible stopping criterion for step (5) is a maximal number of iterations, or that the residual norm $\|r_k\| \le \eta$ for some prescribed tolerance $\eta > 0$.

Matching pursuit has the slight disadvantage that an index may be selected more than once. A variation on this greedy algorithm which avoids this drawback is the orthogonal matching pursuit algorithm [31, 33], outlined next. Again starting with $r_0 = f$, $S_0 = \emptyset$, and $k = 0$, the following steps are conducted:

1. $j_k := \operatorname{argmax}_{j \in J} \dfrac{|\langle r_k, \psi_j \rangle|}{\|\psi_j\|}$.
2. $S_{k+1} := S_k \cup \{j_k\}$.
3. $x^{(k+1)} := \operatorname{argmin}_{z : \operatorname{supp}(z) \subset S_{k+1}} \Big\| f - \sum_{j \in S_{k+1}} z_j \psi_j \Big\|_2$.
4. $r_{k+1} := f - \sum_{j \in S_{k+1}} x_j^{(k+1)} \psi_j$.
5. Repeat from step (1) with $k \mapsto k + 1$ until a stopping criterion is met.
6. Output $\tilde f = \tilde f_k = \sum_{j \in S_k} x_j^{(k)} \psi_j$.

The essential difference to matching pursuit is the orthogonal projection in step (3). Orthogonal matching pursuit may require a smaller number of iterations than matching pursuit. However, the orthogonal projection makes an iteration computationally more demanding than an iteration of matching pursuit.

Convex Relaxation

A second tractable approach to sparse approximation is to relax the $\ell_0$-minimization problem to the convex optimization problem of finding the minimizer of

$$\min \|x\|_1 \quad \text{subject to } \|\Psi x - f\|_2 \le \varepsilon. \qquad (4)$$

This program is also known as basis pursuit [6] and can be solved using various methods from convex optimization [2]. At least in the real-valued case, the minimizer $x^\star$ of the above problem will always have at most $N$ nonzero entries, and the support of $x^\star$ defines a linearly independent set $\{\psi_j : x_j^\star \neq 0\}$, which is a basis of $\mathbb{C}^N$ if $x^\star$ has exactly $N$ nonzero entries – thus the name basis pursuit.

Finding the Sparsest Representation

When the dictionary $\{\psi_j\}$ is redundant, it is of great interest to provide conditions which ensure that a specific algorithm is able to identify the sparsest possible representation. For this purpose, it is helpful to define the coherence $\mu$ of the system $\{\psi_j\}$, or equivalently of the matrix $\Psi$ having the vectors $\psi_j$ as its columns. Assuming the normalization $\|\psi_j\|_2 = 1$, it is defined as the maximal inner product between different dictionary elements,

$$\mu = \max_{j \neq k} |\langle \psi_j, \psi_k \rangle|.$$
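The orthogonal matching pursuit iteration described above is short to implement. The sketch below (our code; numpy's least-squares solver plays the role of the orthogonal projection in step 3) runs OMP on an orthonormal dictionary, where $s$ steps recover an $s$-sparse $f$ exactly:

```python
import numpy as np

def omp(Psi, f, s):
    """Orthogonal matching pursuit: s greedy steps on the dictionary Psi."""
    residual, support = f.copy(), []
    for _ in range(s):
        # Steps 1-2: atom most correlated with the residual
        # (columns assumed normalized).
        j = int(np.argmax(np.abs(Psi.T @ residual)))
        if j not in support:
            support.append(j)
        # Steps 3-4: orthogonal projection onto the selected atoms.
        coef, *_ = np.linalg.lstsq(Psi[:, support], f, rcond=None)
        residual = f - Psi[:, support] @ coef
    x = np.zeros(Psi.shape[1])
    x[support] = coef
    return x, residual

rng = np.random.default_rng(1)
Psi, _ = np.linalg.qr(rng.standard_normal((30, 30)))  # orthonormal basis
x0 = np.zeros(30)
x0[[2, 7, 11, 20]] = [1.5, -2.0, 0.5, 1.0]
f = Psi @ x0

x_hat, r = omp(Psi, f, s=4)
print(np.linalg.norm(f - Psi @ x_hat))   # essentially zero
```

For redundant dictionaries the same code applies, but exact recovery is then only guaranteed under coherence conditions such as the bound discussed below.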

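For real-valued data, basis pursuit (4) with $\varepsilon = 0$ is a linear program: introducing $t_j \ge |x_j|$, one minimizes $\sum_j t_j$ subject to $\Psi x = f$ and $-t \le x \le t$. A sketch using scipy's LP solver (our formulation; the dictionary and sizes are invented for illustration):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
m, N = 20, 40
Psi = rng.standard_normal((m, N)) / np.sqrt(m)   # redundant dictionary
x0 = np.zeros(N)
x0[[3, 17, 30]] = [1.0, -2.0, 1.5]
f = Psi @ x0                                     # 3-sparse representation

# Variables [x; t]: minimize sum(t) subject to Psi x = f, -t <= x <= t.
c = np.concatenate([np.zeros(N), np.ones(N)])
A_eq = np.hstack([Psi, np.zeros((m, N))])
A_ub = np.vstack([np.hstack([np.eye(N), -np.eye(N)]),
                  np.hstack([-np.eye(N), -np.eye(N)])])
res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * N), A_eq=A_eq, b_eq=f,
              bounds=[(None, None)] * N + [(0, None)] * N, method="highs")
x_hat = res.x[:N]

# The minimizer is feasible, and its l1-norm cannot exceed that of x0.
print(res.status, np.abs(x_hat).sum(), np.abs(x0).sum())
```

Since $x_0$ itself is feasible, the optimal $\ell_1$-norm is at most $\|x_0\|_1$; under the coherence condition below the minimizer actually coincides with the sparsest representation.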
Suppose that $f$ has a representation with $s$ terms, that is, $f = \sum_j x_j \psi_j$ with $\|x\|_0 \le s$. Then $s$ iterations of orthogonal matching pursuit [3, 33] as well as basis pursuit (4) with $\varepsilon = 0$ [3, 14, 34] find the sparsest representation of $f$ with respect to $\{\psi_j\}$ provided that

$$(2s - 1)\,\mu < 1. \qquad (5)$$

Moreover, for a general $f$, both orthogonal matching pursuit and basis pursuit generate an $s$-sparse approximation whose approximation error is bounded by the error of best $s$-term approximation up to constants; see [3, 18, 31] for details.

For typical "good" dictionaries $\{\psi_j\}_{j=1}^{M} \subset \mathbb{C}^N$, the coherence scales as $\mu \sim N^{-1/2}$ [29], so that the bound (5) implies that $s$-sparse representations in such dictionaries with small enough sparsity, that is, $s \le c\sqrt{N}$, can be found efficiently via the described algorithms.

Applications of Sparse Approximation

Sparse approximation finds a variety of applications. Below we shortly describe compression, denoising, and signal separation. Sparse representations also play a major role in adaptive numerical methods for solving operator equations such as PDEs. When the solution has a sparse representation with respect to a suitable basis, say finite elements or wavelets, then a significant acceleration with respect to standard linear methods can be achieved. The algorithms used in this context are of a different nature than the ones described above. We refer to [8] for details.

Compression

An obvious application of sparse approximation is image and signal compression. Once a sparse approximation is found, one only needs to store the nonzero coefficients of the representation. If the representation is sparse enough, then this requires significantly less memory than storing the original signal or image. This principle is exploited, for instance, in the JPEG, MPEG, and MP3 data compression standards.

Denoising

Often acquired signals and images are corrupted by noise; that is, the observed signal can be written as $\tilde f = f + \eta$, where $f$ is the original signal and $\eta$ is a vector representing the noise. The additional knowledge that the signal at hand can be approximated well by a sparse representation can be exploited to clean the signal by essentially removing the noise. The essential idea is to find a sparse approximation $\sum_j x_j \psi_j$ of $\tilde f$ with respect to a suitable dictionary $\{\psi_j\}$ and to use it as an approximation to the original $f$. One algorithmic approach is to solve the $\ell_1$-minimization problem (4), where $\varepsilon$ is now a suitable estimate of the $\ell_2$-norm of the noise $\eta$. If $\Psi$ is a wavelet basis, this principle is often called wavelet thresholding or wavelet shrinkage [16] due to its connection with the soft-thresholding function [13].

Signal Separation

Suppose one observes the superposition $f = f_1 + f_2$ of two signals $f_1$ and $f_2$ of different nature, for instance, the "harmonic" and the "spiky" component of an acoustic signal, or stars and filaments in an astronomical image. The task is to separate the two components $f_1$ and $f_2$ from the knowledge of $f$. Knowing that both $f_1$ and $f_2$ have sparse representations in dictionaries $\{\psi_j^1\}$ and $\{\psi_j^2\}$ of "different nature," one can indeed recover both $f_1$ and $f_2$ by similar algorithms as outlined above, for instance, by solving the $\ell_1$-minimization problem

$$\min_{z^1, z^2} \|z^1\|_1 + \|z^2\|_1 \quad \text{subject to } f = \sum_j z_j^1 \psi_j^1 + \sum_j z_j^2 \psi_j^2.$$

The solution $(x^1, x^2)$ defines the reconstructions $\tilde f_1 = \sum_j x_j^1 \psi_j^1$ and $\tilde f_2 = \sum_j x_j^2 \psi_j^2$. If, for instance, $\{\psi_j^1\}_{j=1}^{N}$ and $\{\psi_j^2\}_{j=1}^{N}$ are mutually incoherent bases in $\mathbb{C}^N$, that is, $\|\psi_j^1\|_2 = \|\psi_j^2\|_2 = 1$ for all $j$ and the maximal inner product $\mu = \max_{j,k} |\langle \psi_j^1, \psi_k^2 \rangle|$ is small, then the above optimization problem recovers both $f_1$ and $f_2$ provided they have representations with altogether $s$ terms, where $s < 1/(2\mu)$ [15]. An example of two mutually incoherent bases are the Fourier basis and the canonical basis, where $\mu = 1/\sqrt{N}$ [17]. Under a probabilistic model, better estimates are possible [5, 35].

Compressed Sensing

The theory of compressed sensing [4, 21, 22, 26] builds on sparse representations. Assuming that a vector $x \in \mathbb{C}^N$ is $s$-sparse (or approximated well by a sparse vector), one would like to reconstruct it from only limited information, that is, from

$$y = Ax, \quad \text{with } A \in \mathbb{C}^{m \times N},$$

where $m$ is much smaller than $N$. Again the algorithms outlined above apply, for instance, basis pursuit (4) with $A$ replacing $\Psi$. In this context, one would like to design matrices $A$ with the minimal number $m$ of rows (i.e., the minimal number of linear measurements) which are required to reconstruct $x$ from $y$. The recovery criterion based on the coherence $\mu$ of $A$ described above applies but is highly suboptimal. In fact, it can be shown that for certain random matrices, $m \ge c\, s \log(eN/s)$ measurements suffice to (stably) reconstruct an $s$-sparse vector using $\ell_1$-minimization with high probability, where $c$ is a (small) universal constant. This bound is substantially better than the ones that can be deduced from coherence-based bounds as described above. A particular case of interest arises when $A$ consists of randomly selected rows of the discrete Fourier transform matrix. This setup corresponds to randomly sampling entries of the Fourier transform of a sparse vector. When $m \ge c\, s \log^4 N$, then $\ell_1$-minimization succeeds in (stably) recovering $s$-sparse vectors from the $m$ samples [26, 28].

This setup generalizes to the situation where one takes limited measurements of a vector $f \in \mathbb{C}^N$ which is sparse with respect to a basis or frame $\{\psi_j\}_{j=1}^{M}$. In fact, then $f = \Psi x$ for a sparse $x \in \mathbb{C}^M$, and with a measurement matrix $A \in \mathbb{C}^{m \times N}$, we have

$$y = Af = A\Psi x,$$

so that we reduce to the initial situation with $A' = A\Psi$ replacing $A$. Once $x$ is recovered, one forms $f = \Psi x$. Applications of compressed sensing can be found in various signal processing tasks, for instance, in medical imaging, analog-to-digital conversion, and radar.

References

1. Blumensath, T., Davies, M.: Iterative thresholding for sparse approximations. J. Fourier Anal. Appl. 14, 629–654 (2008)
2. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
3. Bruckstein, A., Donoho, D.L., Elad, M.: From sparse solutions of systems of equations to sparse modeling of signals and images. SIAM Rev. 51(1), 34–81 (2009)
4. Candès, E.J.: Compressive sampling. In: Proceedings of the International Congress of Mathematicians, Madrid (2006)
5. Candès, E.J., Romberg, J.K.: Quantitative robust uncertainty principles and optimally sparse decompositions. Found. Comput. Math. 6(2), 227–254 (2006)
6. Chen, S.S., Donoho, D.L., Saunders, M.A.: Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20(1), 33–61 (1998)
7. Christensen, O.: An Introduction to Frames and Riesz Bases. Applied and Numerical Harmonic Analysis. Birkhäuser, Boston (2003)
8. Cohen, A.: Numerical Analysis of Wavelet Methods. Studies in Mathematics and its Applications, vol. 32. North-Holland, Amsterdam (2003)
9. Daubechies, I.: Ten Lectures on Wavelets. CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 61. SIAM, Philadelphia (1992)
10. Daubechies, I., DeVore, R.A., Fornasier, M., Güntürk, C.: Iteratively re-weighted least squares minimization for sparse recovery. Commun. Pure Appl. Math. 63(1), 1–38 (2010)
11. Davis, G., Mallat, S., Avellaneda, M.: Adaptive greedy approximations. Constr. Approx. 13(1), 57–98 (1997)
12. DeVore, R.A.: Nonlinear approximation. Acta Numer. 7, 51–150 (1998)
13. Donoho, D.: De-noising by soft-thresholding. IEEE Trans. Inf. Theory 41(3), 613–627 (1995)
14. Donoho, D.L., Elad, M.: Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization. Proc. Natl. Acad. Sci. U.S.A. 100(5), 2197–2202 (2003)
15. Donoho, D.L., Huo, X.: Uncertainty principles and ideal atomic decompositions. IEEE Trans. Inf. Theory 47(7), 2845–2862 (2001)
16. Donoho, D.L., Johnstone, I.M.: Minimax estimation via wavelet shrinkage. Ann. Stat. 26(3), 879–921 (1998)
17. Donoho, D.L., Stark, P.B.: Uncertainty principles and signal recovery. SIAM J. Appl. Math. 48(3), 906–931 (1989)
18. Donoho, D.L., Elad, M., Temlyakov, V.N.: Stable recovery of sparse overcomplete representations in the presence of noise. IEEE Trans. Inf. Theory 52(1), 6–18 (2006)
19. Duffin, R.J., Schaeffer, A.C.: A class of nonharmonic Fourier series. Trans. Am. Math. Soc. 72, 341–366 (1952)
20. Fornasier, M., Gröchenig, K.: Intrinsic localization of frames. Constr. Approx. 22(3), 395–415 (2005)
21. Fornasier, M., Rauhut, H.: Compressive sensing. In: Scherzer, O. (ed.) Handbook of Mathematical Methods in Imaging, pp. 187–228. Springer, New York (2011)
22. Foucart, S., Rauhut, H.: A Mathematical Introduction to Compressive Sensing. Applied and Numerical Harmonic Analysis. Birkhäuser, Basel
23. Gröchenig, K.: Foundations of Time-Frequency Analysis. Applied and Numerical Harmonic Analysis. Birkhäuser, Boston (2001)
24. Mallat, S.G., Zhang, Z.: Matching pursuits with time-frequency dictionaries. IEEE Trans. Signal Process. 41(12), 3397–3415 (1993)
25. Natarajan, B.K.: Sparse approximate solutions to linear systems. SIAM J. Comput. 24, 227–234 (1995)
26. Rauhut, H.: Compressive sensing and structured random matrices. In: Fornasier, M. (ed.) Theoretical Foundations and Numerical Methods for Sparse Recovery. Radon Series on Computational and Applied Mathematics, vol. 9, pp. 1–92. De Gruyter, Berlin/New York (2010)
27. Roberts, D.H.: Time series analysis with clean I. Derivation of a spectrum. Astronom. J. 93, 968–989 (1987)
28. Rudelson, M., Vershynin, R.: On sparse reconstruction from Fourier and Gaussian measurements. Commun. Pure Appl. Math. 61, 1025–1045 (2008)
29. Strohmer, T., Heath, R.W.: Grassmannian frames with applications to coding and communication. Appl. Comput. Harmon. Anal. 14(3), 257–275 (2003)
30. Temlyakov, V.N.: Nonlinear methods of approximation. Found. Comput. Math. 3(1), 33–107 (2003)
31. Temlyakov, V.: Greedy Approximation. Cambridge Monographs on Applied and Computational Mathematics, No. 20. Cambridge University Press, Cambridge (2011)
32. Triebel, H.: Theory of Function Spaces. Monographs in Mathematics, vol. 78. Birkhäuser, Boston (1983)
33. Tropp, J.A.: Greed is good: Algorithmic results for sparse approximation. IEEE Trans. Inf. Theory 50(10), 2231–2242 (2004)
34. Tropp, J.A.: Just relax: Convex programming methods for identifying sparse signals in noise. IEEE Trans. Inf. Theory 51(3), 1030–1051 (2006)
35. Tropp, J.A.: On the conditioning of random subdictionaries. Appl. Comput. Harmon. Anal. 25(1), 1–24 (2008)
36. Wojtaszczyk, P.: A Mathematical Introduction to Wavelets. Cambridge University Press, Cambridge (1997)

Special Functions: Computation

Amparo Gil¹, Javier Segura², and Nico M. Temme³
¹Departamento de Matemática Aplicada y Ciencias de la Computación, Universidad de Cantabria, E.T.S. Caminos, Canales y Puertos, Santander, Spain
²Departamento de Matemáticas, Estadística y Computación, Universidad de Cantabria, Santander, Spain
³Centrum voor Wiskunde en Informatica (CWI), Amsterdam, The Netherlands

Mathematics Subject Classification

65D20; 41A60; 33C05; 33C10; 33C15

Overview

The special functions of mathematical physics [4] are those functions that play a key role in many problems in science and engineering. For example, Bessel, Legendre, or parabolic cylinder functions are well known to everyone involved in physics. This is not surprising, because Bessel functions appear in the solution of partial differential equations in cylindrically symmetric domains (such as optical fibers) or in the Fourier transform of radially symmetric functions, to mention just a couple of applications. On the other hand, Legendre functions appear in the solution of electromagnetic problems involving spherical or spheroidal geometries. Finally, parabolic cylinder functions are involved, for example, in the analysis of wave scattering by a parabolic cylinder, in the study of gravitational fields, or in quantum mechanical problems such as quantum tunneling or particle production.

But there are many more functions under the term "special functions" which, differently from the examples mentioned above, are not of hypergeometric type, such as some cumulative distribution functions [3, Chap. 10]. These functions also need to be evaluated in many problems in statistics, probability theory, communication theory, or econometrics.

Basic Methods

The methods used for the computation of special functions are varied, depending on the function under consideration as well as on the efficiency and accuracy demanded. Usual tools for evaluating special functions are the evaluation of convergent and divergent series, the computation of continued fractions, the use of Chebyshev approximations, the computation of the function via integral representations (numerical quadrature), and the numerical integration of ODEs. Usually, several of these methods are needed in order to build an algorithm able to compute a given function for a large range of values of parameters and argument.

Also, an important bonus in this kind of algorithm is the possibility of evaluating scaled functions: if, for example, a function $f(z)$ increases exponentially for large $|z|$, the factorization of the exponential term and the computation of a scaled function (without the exponential term) can be used to avoid degradation in the accuracy of the functions and overflow problems as $z$ increases. Therefore, the appropriate scaling of a special function can be useful for increasing both the range of computation and the accuracy of the computed expression.
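The effect of scaling is easy to see for the modified Bessel function $I_0(z) \sim e^z/\sqrt{2\pi z}$, which overflows in double precision once $z$ exceeds roughly 700. Libraries therefore expose exponentially scaled variants; for instance, scipy provides `ive(v, z)` $= e^{-|z|} I_v(z)$ alongside `iv`:

```python
import math
from scipy.special import iv, ive

z = 800.0
print(iv(0, z))        # inf: e^800 overflows double precision
print(ive(0, z))       # exp(-z) * I_0(z), a perfectly representable number

# The scaled value matches the leading term of the asymptotic expansion
# I_0(z) ~ e^z / sqrt(2 pi z).
print(ive(0, z) - 1.0 / math.sqrt(2.0 * math.pi * z))
```

Working with the scaled function, overall exponential factors can be tracked separately (e.g., in logarithmic form), extending the usable range of $z$ far beyond the overflow threshold.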

Next, we briefly describe three important techniques which appear ubiquitously in algorithms for special function evaluation: convergent and divergent series, recurrence relations, and numerical quadrature.

Convergent and Divergent Series

Convergent series for special functions usually arise in the form of hypergeometric series:

$$ {}_pF_q\!\left({a_1,\ldots,a_p \atop b_1,\ldots,b_q};\, z\right)=\sum_{n=0}^{\infty}\frac{(a_1)_n\cdots(a_p)_n}{(b_1)_n\cdots(b_q)_n}\,\frac{z^n}{n!}, \qquad(1) $$

where $p\le q+1$ and $(a)_n$ is the Pochhammer symbol, also called the shifted factorial, defined by

$$ (a)_0=1,\qquad (a)_n=a(a+1)\cdots(a+n-1)\ \ (n\ge 1),\qquad (a)_n=\frac{\Gamma(a+n)}{\Gamma(a)}. \qquad(2) $$

The series is easy to evaluate because of the recursion $(a)_{n+1}=(a+n)(a)_n$, $n\ge 0$, of the Pochhammer symbols. For example, for the modified Bessel function

$$ I_\nu(z)=\left(\tfrac12 z\right)^{\nu}\sum_{n=0}^{\infty}\frac{\left(\tfrac14 z^2\right)^n}{\Gamma(\nu+n+1)\,n!}=\frac{\left(\tfrac12 z\right)^{\nu}}{\Gamma(\nu+1)}\;{}_0F_1\!\left({-\atop \nu+1};\,\tfrac14 z^2\right), \qquad(3) $$

this is a stable representation when $z>0$ and $\nu\ge 0$, and it is an efficient representation when $z$ is not large compared with $\nu$.

By divergent expansions we mean asymptotic expansions of the form

$$ F(z)\sim\sum_{n=0}^{\infty}\frac{c_n}{z^n},\qquad z\to\infty. \qquad(4) $$

The series usually diverges, but it has the property

$$ F(z)=\sum_{n=0}^{N-1}\frac{c_n}{z^n}+R_N(z),\qquad R_N(z)=\mathcal{O}\!\left(z^{-N}\right),\qquad z\to\infty, \qquad(5) $$

for $N=0,1,2,\ldots$, and the order estimate holds for fixed $N$. This is the Poincaré-type expansion, and for special functions like the gamma and Bessel functions such expansions are crucial for evaluating these functions. Other variants of the expansion are also important, in particular expansions that hold for a certain range of additional parameters (this leads to the uniform asymptotic expansions in terms of other special functions like Airy functions, which are useful in turning point problems).

Recurrence Relations

In many important cases, there exist recurrence relations relating different values of the function for different values of its variables; in particular, one can usually find three-term recurrence relations [3, Chap. 4]. In these cases, the efficient computation of special functions uses at some stage the recurrence relations satisfied by such families of functions. In fact, it is difficult to find a computational task which does not rely on recursive techniques: the great advantage of having recursive relations is that they can be implemented with ease. However, the application of recurrence relations can be risky: each step of a recursive process generates not only its own rounding errors but also accumulates the errors of the previous steps. An important aspect is then the study of the numerical condition of the recurrence relations, depending on the initial values for starting the recursion.

If we write the three-term recurrence satisfied by the function $y_n$ as

$$ y_{n+1}+b_n y_n+a_n y_{n-1}=0, \qquad(6) $$

then, if a solution $y_n^{(m)}$ of (6) exists that satisfies $\lim_{n\to+\infty} y_n^{(m)}/y_n^{(D)}=0$ for all solutions $y_n^{(D)}$ that are linearly independent of $y_n^{(m)}$, we will call $y_n^{(m)}$ the minimal solution. The solutions $y_n^{(D)}$ are said to be dominant solutions of the three-term recurrence relation. From a computational point of view, the crucial point is the identification of the character of the function to be evaluated (either minimal or dominant), because the stable direction of application of the recurrence relation is different for evaluating the minimal or a dominant solution of (6): forward for dominant solutions and backward for minimal solutions.
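As a standard illustration of the backward direction for a minimal solution, consider the Bessel functions $J_n(z)$, which satisfy $y_{n+1}-(2n/z)y_n+y_{n-1}=0$ and form the minimal solution of that recurrence. A Miller-type backward recursion, normalized with the well-known identity $J_0(z)+2\sum_{k\ge 1}J_{2k}(z)=1$, can be sketched as follows (the starting index and the tiny seed value are arbitrary choices, not tuned production code):

```python
def bessel_j(nmax, z, start=40):
    # Backward (Miller-type) recursion: recurse y_{n-1} = (2n/z) y_n - y_{n+1}
    # downward from an arbitrary seed; any dominant contamination decays.
    y = [0.0] * (start + 2)
    y[start + 1], y[start] = 0.0, 1e-30          # arbitrary tiny seed
    for n in range(start, 0, -1):
        y[n - 1] = (2.0 * n / z) * y[n] - y[n + 1]
    # Normalize using J_0(z) + 2*sum_{k>=1} J_{2k}(z) = 1
    s = y[0] + 2.0 * sum(y[2 * k] for k in range(1, start // 2 + 1))
    return [y[n] / s for n in range(nmax + 1)]

j = bessel_j(2, 1.0)
# j[0] ~ J_0(1) = 0.7651976865579666, j[1] ~ J_1(1) = 0.4400505857449335
```

Running the same recurrence forward from accurate values of $J_0$, $J_1$ would instead amplify rounding errors by the growth of the dominant solution $Y_n$.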

For analyzing whether a special function is minimal or not, analytical information is needed regarding its behavior as $n\to+\infty$. Assume that for large values of $n$ the coefficients $a_n$, $b_n$ behave as follows:

$$ a_n\sim a\,n^{\alpha},\qquad b_n\sim b\,n^{\beta},\qquad ab\neq 0, \qquad(7) $$

with $\alpha$ and $\beta$ real; assume that $t_1$, $t_2$ are the zeros of the characteristic polynomial $\Phi(t)=t^2+bt+a$, with $|t_1|\ge|t_2|$. Then it follows from Perron's theorem [3, p. 93] that we have the following results:

1. If $\beta>\tfrac12\alpha$, then the difference equation (6) has two linearly independent solutions $f_n$ and $g_n$, with the property

$$ \frac{f_n}{f_{n-1}}\sim -\frac{a}{b}\,n^{\alpha-\beta},\qquad \frac{g_n}{g_{n-1}}\sim -b\,n^{\beta},\qquad n\to\infty. \qquad(8) $$

In this case, the solution $f_n$ is minimal.

2. If $\beta=\tfrac12\alpha$ and $|t_1|>|t_2|$, then the difference equation (6) has two linearly independent solutions $f_n$ and $g_n$, with the property

$$ \frac{f_n}{f_{n-1}}\sim t_2\,n^{\beta},\qquad \frac{g_n}{g_{n-1}}\sim t_1\,n^{\beta},\qquad n\to\infty. \qquad(9) $$

In this case, the solution $f_n$ is minimal.

3. If $\beta=\tfrac12\alpha$ and $|t_1|=|t_2|$, or if $\beta<\tfrac12\alpha$, then some information is still available, but the theorem is inconclusive with respect to the existence of minimal and dominant solutions.

Let us consider the three-term recurrence relations satisfied by Bessel functions as examples. Ordinary Bessel functions satisfy the recurrence relation

$$ y_{n+1}-\frac{2n}{z}\,y_n+y_{n-1}=0,\qquad z\neq 0, \qquad(10) $$

with solutions $J_n(z)$ (the Bessel function of the first kind) and $Y_n(z)$ (the Bessel function of the second kind). This three-term recurrence relation corresponds to (7), with the values $a=1$, $\alpha=0$, $b=-2/z$, $\beta=1$. Then, there exist two independent solutions $f_n$ and $g_n$ satisfying

$$ \frac{f_n}{f_{n-1}}\sim \frac{z}{2n},\qquad \frac{g_n}{g_{n-1}}\sim \frac{2n}{z}. \qquad(11) $$

As the known asymptotic behavior of the Bessel functions reads

$$ J_n(z)\sim \frac{1}{n!}\left(\frac{z}{2}\right)^{n},\qquad Y_n(z)\sim -\frac{(n-1)!}{\pi}\left(\frac{2}{z}\right)^{n},\qquad n\to\infty, \qquad(12) $$

it is easy to identify $J_n(z)$ and $Y_n(z)$ as the minimal ($f_n$) and a dominant ($g_n$) solution, respectively, of the three-term recurrence relation (10).

Similar results hold for the modified Bessel functions, with recurrence relation

$$ y_{n+1}+\frac{2n}{z}\,y_n-y_{n-1}=0,\qquad z\neq 0, \qquad(13) $$

with solutions $I_n(z)$ (minimal) and $(-1)^n K_n(z)$ (dominant).

Numerical Quadrature

Another example where the study of numerical stability is of concern is the computation of special functions via integral representations. It is tempting, but usually wrong, to believe that once an integral representation is given, the computational problem is solved. One has to choose a stable quadrature rule, and this choice depends on the integral under consideration. Particularly problematic is the integration of strongly oscillating integrals (Bessel and Airy functions, for instance); in these cases an alternative approach consists in finding non-oscillatory representations by properly deforming the integration path in the complex plane. Particularly useful is the saddle point method for obtaining integral representations which are suitable for applying the trapezoidal rule, which is optimal for computing certain integrals on $\mathbb{R}$. Let us explain the saddle point method taking the Airy function $\mathrm{Ai}(z)$ as an example. Quadrature methods for evaluating complex Airy functions can be found in, for example, [1, 2].

We start from the following integral representation in the complex plane:

$$ \mathrm{Ai}(z)=\frac{1}{2\pi i}\int_{\mathcal{C}} e^{\frac13 w^3-zw}\,dw, \qquad(14) $$

where $z\in\mathbb{C}$ and $\mathcal{C}$ is a contour starting at $\infty\,e^{-i\pi/3}$ and terminating at $\infty\,e^{+i\pi/3}$ (in the valleys of the integrand). In the example we take $\operatorname{ph} z\in[0,\tfrac23\pi]$.

Let $\phi(w)=\tfrac13 w^3-zw$. The saddle points are $w_0=\sqrt{z}$ and $-w_0$, and follow from solving $\phi'(w)=w^2-z=0$. The saddle point contour (the path of steepest descent) that runs through the saddle point $w_0$ is defined by $\Im[\phi(w)]=\Im[\phi(w_0)]$.

We write

$$ z=x+iy=re^{i\theta},\qquad w=u+iv,\qquad w_0=u_0+iv_0. \qquad(15) $$

Then

$$ u_0=\sqrt{r}\,\cos\tfrac12\theta,\qquad v_0=\sqrt{r}\,\sin\tfrac12\theta,\qquad x=u_0^2-v_0^2,\qquad y=2u_0v_0. \qquad(16) $$

The path of steepest descent through $w_0$ is given by the equation

$$ u=u_0+\frac{(v-v_0)(v+2v_0)}{u_0+\sqrt{\tfrac13\left(v^2+2v_0v+3u_0^2\right)}},\qquad -\infty<v<\infty. \qquad(17) $$

Examples of the resulting contours are shown in Fig. 1; the saddle points $\pm w_0$ lie on the circle with radius $\sqrt{r}$ and are indicated by small dots. The saddle point on the positive real axis corresponds with the case $\theta=0$ and the two saddles on the imaginary axis with the case $\theta=\pi$. It is interesting to see that the contour may split up and run through both saddle points $\pm w_0$. When $\theta=\tfrac23\pi$, both saddle points are on one path, and the half-line in the $z$-plane corresponding with this $\theta$ is called a Stokes line.

Special Functions: Computation, Fig. 1 Saddle point contours for $\theta=0,\ \tfrac13\pi,\ \tfrac23\pi,\ \pi$ and $r=5$

Integrating with respect to $\tau=v-v_0$ (and writing $\sigma=u-u_0$), we obtain

$$ \mathrm{Ai}(z)=\frac{e^{-\zeta}}{2\pi i}\int_{-\infty}^{\infty} e^{\psi_r(\sigma,\tau)}\left(\frac{d\sigma}{d\tau}+i\right)d\tau, \qquad(18) $$

where $\zeta=\tfrac23 z^{3/2}$ and

$$ \sigma=\frac{\tau(\tau+3v_0)}{u_0+\sqrt{\tfrac13\left(\tau^2+4v_0\tau+3r\right)}},\qquad -\infty<\tau<\infty, \qquad(19) $$

$$ \psi_r(\sigma,\tau)=\Re[\phi(w)-\phi(w_0)]=u_0(\sigma^2-\tau^2)-2v_0\sigma\tau+\tfrac13\sigma^3-\sigma\tau^2. \qquad(20) $$

The integral representation for the Airy function in (18) is now suitable for applying the trapezoidal rule. The resulting algorithm will be flexible and efficient.

References

1. Gautschi, W.: Computation of Bessel and Airy functions and of related Gaussian quadrature formulae. BIT 42(1), 110–118 (2002)
2. Gil, A., Segura, J., Temme, N.M.: Algorithm 819: AIZ, BIZ: two Fortran 77 routines for the computation of complex Airy functions. ACM Trans. Math. Softw. 28(3), 325–336 (2002)
3. Gil, A., Segura, J., Temme, N.M.: Numerical Methods for Special Functions. SIAM, Philadelphia (2007)

Splitting Methods

Sergio Blanes¹, Fernando Casas², and Ander Murua³
¹Instituto de Matemática Multidisciplinar, Universitat Politècnica de València, València, Spain
²Departament de Matemàtiques and IMAC, Universitat Jaume I, Castellón, Spain
³Konputazio Zientziak eta A.A. Saila, Informatika Fakultatea, UPV/EHU, Donostia/San Sebastián, Spain

Synonyms

Fractional step methods; Operator-splitting methods

Introduction

Splitting methods constitute a general class of numerical integration schemes for differential equations whose vector field can be decomposed in such a way that each subproblem is simpler to integrate than the original system. For ordinary differential equations (ODEs), this idea can be formulated as follows. Given the initial value problem

$$ x'=f(x),\qquad x_0=x(0)\in\mathbb{R}^D, \qquad(1) $$

with $f:\mathbb{R}^D\to\mathbb{R}^D$ and solution $\varphi_t(x_0)$, assume that $f$ can be expressed as $f=\sum_{i=1}^{m} f^{[i]}$ for certain functions $f^{[i]}$, such that the equations

$$ x'=f^{[i]}(x),\qquad x_0=x(0)\in\mathbb{R}^D,\qquad i=1,\ldots,m, \qquad(2) $$

can be integrated exactly, with solutions $x(h)=\varphi_h^{[i]}(x_0)$ at $t=h$, the time step. The different parts of $f$ may correspond to physically different contributions. Then, by combining these solutions as

$$ \chi_h=\varphi_h^{[m]}\circ\cdots\circ\varphi_h^{[2]}\circ\varphi_h^{[1]} \qquad(3) $$

and expanding into series in powers of $h$, one finds that $\chi_h(x_0)=\varphi_h(x_0)+\mathcal{O}(h^2)$, so that $\chi_h$ provides a first-order approximation to the exact solution. Higher-order approximations can be achieved by introducing more flows with additional coefficients, $\varphi_{a_{ij}h}^{[i]}$, in composition (3).

Splitting methods thus involve three steps: (i) choosing the set of functions $f^{[i]}$ such that $f=\sum_i f^{[i]}$, (ii) solving either exactly or approximately each equation $x'=f^{[i]}(x)$, and (iii) combining these solutions to construct an approximation for (1) up to the desired order.

The splitting idea can also be applied to partial differential equations (PDEs) involving time and one or more space dimensions. Thus, if the spatial differential operator contains parts of a different character (such as advection and diffusion), then different discretization techniques may be applied to each part, as well as for the time integration.

Splitting methods have a long history and have been applied (sometimes with different names) in many different fields, ranging from parabolic and reaction-diffusion PDEs to quantum statistical mechanics, chemical physics, and Hamiltonian dynamical systems [7].

Some of the advantages of splitting methods are the following: they are simple to implement, they are explicit if each subproblem is solved with an explicit method, and they often preserve qualitative properties that the differential equation may possess.

Splitting Methods for ODEs

Increasing the Order
Very often in applications, the function $f$ in the ODE (1) can be split in just two parts, $f(x)=f^{[a]}(x)+f^{[b]}(x)$. Then both $\chi_h=\varphi_h^{[b]}\circ\varphi_h^{[a]}$ and its adjoint, $\chi_h^{*}\equiv\chi_{-h}^{-1}=\varphi_h^{[a]}\circ\varphi_h^{[b]}$, are first-order integration schemes. These formulae are often called the Lie–Trotter splitting. On the other hand, the symmetric version

$$ \mathcal{S}_h^{[2]}\equiv\varphi_{h/2}^{[a]}\circ\varphi_h^{[b]}\circ\varphi_{h/2}^{[a]} \qquad(4) $$

provides a second-order integrator, known as the Strang–Marchuk splitting, the leapfrog, or the Störmer–Verlet method, depending on the context where it is used [2]. Notice that $\mathcal{S}_h^{[2]}=\chi_{h/2}^{*}\circ\chi_{h/2}$.

More generally, one may consider a composition of the form

$$ \psi_h=\varphi_{a_{s+1}h}^{[a]}\circ\varphi_{b_s h}^{[b]}\circ\varphi_{a_s h}^{[a]}\circ\cdots\circ\varphi_{a_2 h}^{[a]}\circ\varphi_{b_1 h}^{[b]}\circ\varphi_{a_1 h}^{[a]} \qquad(5) $$

and try to increase the order of approximation by suitably determining the parameters $a_i$, $b_i$.
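For a concrete sketch of the Lie–Trotter and Strang maps (using the harmonic oscillator $q'=p$, $p'=-q$ with the kinetic/potential splitting; the step sizes and final time are arbitrary choices), one can verify the first- and second-order behavior numerically:

```python
import math

def phi_a(q, p, h):    # exact flow of f^[a]: q' = p
    return q + h * p, p

def phi_b(q, p, h):    # exact flow of f^[b]: p' = -q
    return q, p - h * q

def lie_trotter(q, p, h):   # chi_h = phi^[b]_h o phi^[a]_h, first order
    q, p = phi_a(q, p, h)
    return phi_b(q, p, h)

def strang(q, p, h):        # S^[2]_h = phi^[a]_{h/2} o phi^[b]_h o phi^[a]_{h/2}
    q, p = phi_a(q, p, h / 2)
    q, p = phi_b(q, p, h)
    return phi_a(q, p, h / 2)

def error(step, h, T=1.0):
    q, p = 1.0, 0.0         # exact solution: q = cos t, p = -sin t
    for _ in range(round(T / h)):
        q, p = step(q, p, h)
    return math.hypot(q - math.cos(T), p + math.sin(T))

ratio_lt = error(lie_trotter, 0.01) / error(lie_trotter, 0.005)  # ~2 (order 1)
ratio_strang = error(strang, 0.01) / error(strang, 0.005)        # ~4 (order 2)
```

Halving the step halves the Lie–Trotter error but quarters the Strang error, consistent with orders 1 and 2.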

Œb Œa F Œaba D ŒF Œab;FŒa; F Œabbb DŒF Œabb;FŒb; s of 'h (or 'h ) evaluations in (5) is usually referred to as the number of stages of the integrator. This is called Œabba Œabb Œa Œabaa Œaba Œa  F D ŒF ;F ; F D ŒF ;F ; time-symmetric if h D h , in which case one has a left-right palindromic composition. Equivalently, in the symbol Œ;  stands for the Lie bracket, and (5), one has va;vb ;vab;vabb;vaba;vabbb;::: are polynomials in the parameters ai ;bi of theP splitting schemeP (5). In a1 D asC1;b1 D bs;a2 D as ;b2 D bs1;::: sC1 s particular, oneP gets vaPD iD1 ai , vb D iD1 bi , (6) 1 s i vab D 2  iD1 bi j D1 aj . The order conditions The order conditions the parameters ai , bi have to then read va D vb D 1 and vab D vabb D satisfy can be obtained by relating the previous in- vaba D  D 0 up to the order considered. To tegrator h with a formal series h of differential achieve order r D 1; 2; 3; : : : ;P 10, the number r operators [1]: it is known that the h-flow 'h of the of conditions to be fulfilled is j D1 nj ,where 0 Œa Œb original system x D f .x/ C f .x/ satisfies, nj D 2; 1; 2; 3; 6; 9; 18; 30; 56; 99. This number is 1 D for each g 2 C .R ; R/, the identity g.'h.x// D smaller for r>3when dealing with second-order Œa Œb eh.F CF /Œg.x/,whereF Œa and F Œb are the Lie ODEs of the form y00 D g.y/ when they are rewritten derivatives corresponding to f Œa and f Œb, respec- as (1)[1]. tively, acting as For time-symmetric methods, the order conditions at even orders are automatically satisfied, which leads XD to n1 C n3 CCn2k1 order conditions to achieve Œa Œa @g F Œg.x/ D fj .x/ .x/; order r D 2k. For instance, n1 C n3 D 4 conditions @xj j D1 need to be fulfilled for a symmetric method (5–6)tobe D of order 4. 
X @g F ŒbŒg.x/ D f Œb.x/ .x/: (7) j @x j D1 j Splitting and Composition Methods When the original system (1) is split in m>2parts, Similarly, the approximation h.x/ 'h.x/ given higher-order schemes can be obtained by considering by the splitting method (5) satisfies the identity a composition of the basic first-order splitting method g. .x// D .h/Œg.x/,where  Œ1 Œm1 Œm h (3) and its adjoint h D 'h ıı'h ı 'h .More specifically, compositions of the general form Œa Œb Œa Œb Œa .h/ D ea1hF eb1hF  eas hF ebs hF easC1hF : D  ı  ıı ı  ; (10) (8) h ˛2sh ˛2s1h ˛2h ˛1h

Hence, the coefficients ai ;bi must be chosen in such a can be considered with appropriately chosen coeffi- way that the operator .h/ is a good approximation of 2s Œa Œb cients .˛ ;:::;˛ / 2 R so as to achieve a prescribed eh.F CF /, or equivalently, h1 log./ F Œa CF Œb. 1 2s order of approximation. Applying repeatedly the Baker–Campbell– In the particular case when system (1) is split in Hausdorff (BCH) formula [2], one arrives at Œb Œa m D 2 parts so that h D 'h ı 'h , method (10) 1 reduces to (5) with a1 D ˛1 and log..h// D .v F Œa C v F Œb/ C hv F Œab h a b ab b D ˛  C ˛ ;aC D ˛ C ˛ C ; 2 Œabb Œaba j 2j 1 2j j 1 2j 2j 1 Ch .vabbF C vabaF / for j D 1;:::;s; (11) 3 Œabbb Œabba Ch .vabbbF C vabbaF Œabaa 4 CvabaaF / C O.h /; (9) where ˛2sC1 D 0. In that case, the coefficients ai and bi are such that where XsC1 Xs ai D bi : (12) Œab Œa Œb Œabb Œab Œb F DŒF ;F ; F D ŒF ;F ; iD1 iD1 Splitting Methods 1355
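A minimal sketch of such a higher-order composition (not one of the optimized methods discussed in this entry) is the classical "triple jump", which composes the symmetric second-order map (4) with coefficients $\gamma_1=\gamma_3=1/(2-2^{1/3})$ and $\gamma_2=-2^{1/3}/(2-2^{1/3})$ and yields a fourth-order method; note the negative inner coefficient:

```python
import math

def strang(q, p, h):      # Stormer-Verlet map for q' = p, p' = -q
    q += 0.5 * h * p
    p -= h * q
    q += 0.5 * h * p
    return q, p

g1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))   # triple-jump coefficients
g2 = 1.0 - 2.0 * g1                      # = -2^(1/3)/(2 - 2^(1/3)) < 0

def order4(q, p, h):
    for g in (g1, g2, g1):               # S_{g1 h} o S_{g2 h} o S_{g1 h}
        q, p = strang(q, p, g * h)
    return q, p

def error(h, T=1.0):
    q, p = 1.0, 0.0                      # exact: q = cos t, p = -sin t
    for _ in range(round(T / h)):
        q, p = order4(q, p, h)
    return math.hypot(q - math.cos(T), p + math.sin(T))

ratio = error(0.02) / error(0.01)        # ~ 2**4 = 16 for a fourth-order method
```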

Conversely, any splitting method (5) satisfying (12) can be written in the form (10) with $\chi_h=\varphi_h^{[b]}\circ\varphi_h^{[a]}$. Moreover, compositions of the form (10) make sense for an arbitrary basic first-order integrator $\chi_h$ (and its adjoint $\chi_h^{*}$) of the original system (1). Obviously, if the coefficients $\alpha_j$ of a composition method (10) are such that $\psi_h$ is of order $r$ for arbitrary basic integrators $\chi_h$ of (1), then the splitting method (5) with (11) is also of order $r$. Actually, as shown in [6], the integrator (5) is of order $r$ for ODEs of the form (1) with $f=f^{[a]}+f^{[b]}$ if and only if the integrator (10) (with coefficients $\alpha_j$ obtained from (11)) is of order $r$ for arbitrary first-order integrators $\chi_h$.

This close relationship allows one to establish in an elementary way a defining feature of splitting methods (5) of order $r\ge 3$: at least one $a_i$ and one $b_i$ are necessarily negative [1]. In other words, splitting schemes of order $r\ge 3$ always involve backward fractional time steps.

Preserving Properties

Assume that the individual flows $\varphi_h^{[i]}$ share with the exact flow $\varphi_h$ some defining property which is preserved by composition. Then it is clear that any composition of the form (5) and (10) with $\chi_h$ given by (3) also possesses this property. Examples of such features are symplecticity, unitarity, volume preservation, conservation of first integrals, etc. [7]. In this sense, splitting methods form an important class of geometric numerical integrators, whose favorable long-time behavior can be supported by rigorous statements with techniques of backward error analysis [2].

Further Extensions

Several extensions can be considered to reduce the number of stages necessary to achieve a given order and to get more efficient methods. One of them is the use of a processor or corrector. The idea consists in enhancing an integrator $\psi_h$ (the kernel) with a map $\pi_h$ (the processor) as $\hat\psi_h=\pi_h\circ\psi_h\circ\pi_h^{-1}$. Then, after $n$ steps, one has $\hat\psi_h^{\,n}=\pi_h\circ\psi_h^{\,n}\circ\pi_h^{-1}$, and so only the cost of $\psi_h$ is relevant. The simplest example of a processed integrator is provided by the Störmer–Verlet method (4). In that case, $\psi_h=\chi_h=\varphi_h^{[b]}\circ\varphi_h^{[a]}$ and $\pi_h=\varphi_{h/2}^{[a]}$. The use of processing allows one to get methods with fewer stages in the kernel and smaller error terms than standard compositions [1].

The second extension uses the flows corresponding to other vector fields in addition to $F^{[a]}$ and $F^{[b]}$. For instance, one could consider methods (5) such that, in addition to $\varphi_h^{[a]}$ and $\varphi_h^{[b]}$, use the $h$-flow $\varphi_h^{[abb]}$ of the vector field $F^{[abb]}$ when its computation is straightforward. This happens, for instance, for second-order ODEs $y''=g(y)$ [1, 7].

Splitting is particularly appropriate when $\|f^{[a]}\|\ll\|f^{[b]}\|$ in (1). Introducing a small parameter $\varepsilon$, we can write $x'=\varepsilon f^{[a]}(x)+f^{[b]}(x)$, so that the error of scheme (5) is $\mathcal{O}(\varepsilon)$. Moreover, since in many practical applications $\varepsilon\ll 1$, it pays to measure the error in powers of $\varepsilon$ as well as of $h$ and to design methods tailored to this near-integrable setting.

Splitting Methods, Table 1 Coefficients of the fourth-order splitting methods used in the numerical example below (the remaining coefficients are fixed by the time-symmetry (6) and consistency):

S$_6^4$:  $a_1=0.07920369643119565$, $b_1=0.209515106613362$; $a_2=0.353172906049774$, $b_2=-0.143851773179818$; $a_3=-0.04206508035771952$
SN$_6^4$: $a_1=0.08298440641740515$, $b_1=0.245298957184271$; $a_2=0.396309801498368$, $b_2=0.604872665711080$; $a_3=-0.3905630492234859$
SNI$_5^4$: $a_1=0.81186273854451628884$, $b_1=-0.0075869131187744738$; $a_2=-0.67748039953216912289$, $b_2=0.31721827797316981388$

Numerical Example: A Perturbed Kepler Problem

To illustrate the performance of the previous splitting methods, we apply them to the time integration of the perturbed Kepler problem described by the Hamiltonian

$$ H=\frac12\left(p_1^2+p_2^2\right)-\frac{1}{r}-\frac{\varepsilon}{2r^5}\left(q_2^2-2q_1^2\right), \qquad(14) $$

where $r=\sqrt{q_1^2+q_2^2}$. We take $\varepsilon=0.001$ and integrate the equations of motion $q_i'=p_i$, $p_i'=-\partial H/\partial q_i$, $i=1,2$, with initial conditions $q_1=4/5$, $q_2=p_1=0$, $p_2=\sqrt{3/2}$. Splitting methods are used with the partition into kinetic and potential energy. We measure the two-norm error in the position at $t_f=2{,}000$, $(q_1,q_2)=(0.318965403761932,\ 1.15731646810481)$, for different time steps and plot the corresponding error as a function of the number of evaluations for each method in Fig. 1. Notice that although the generic method S$_6^4$ has three more stages than the minimum given by S$_3^4$, this extra cost is greatly compensated by a higher accuracy. On the other hand, since this system corresponds to the second-order ODE $q''=g(q)$, method SN$_6^4$ leads to a higher accuracy with the same computational cost. Finally, SNI$_5^4$ takes profit of the near-integrable character of the Hamiltonian (14).

Splitting Methods for PDEs

In the numerical treatment of evolutionary PDEs of parabolic or mixed hyperbolic-parabolic type, splitting time-integration methods are also widely used. In this setting, the overall evolution operator is formally written as a sum of evolution operators, typically representing different aspects of a given model. Consider an evolutionary PDE formulated as an abstract Cauchy problem in a certain function space $\mathcal{U}\subset\{u:\mathbb{R}^D\times\mathbb{R}\to\mathbb{R}\}$,

$$ u_t=\mathcal{L}(u),\qquad u(t_0)=u_0, \qquad(15) $$

where $\mathcal{L}$ is a spatial partial differential operator. For instance,

$$ \frac{\partial}{\partial t}u(x,t)=\sum_{j=1}^{d}\sum_{i=1}^{d}\frac{\partial}{\partial x_j}\!\left(c_{ij}(x)\,\frac{\partial}{\partial x_i}u(x,t)\right)+f(x,u(x,t)),\qquad u(x,t_0)=u_0(x), $$

or in short, $\mathcal{L}(x,u)=\nabla\cdot(c\nabla u)+f(u)$, corresponds to a diffusion-reaction problem. In that case, it makes sense to split the problem into two subequations, corresponding to the different physical contributions,

$$ u_t=\mathcal{L}_a(u)\equiv\nabla\cdot(c\nabla u),\qquad u_t=\mathcal{L}_b(u)\equiv f(u), \qquad(16) $$

solve numerically each equation in (16), thus giving $u^{[a]}(h)=\varphi_h^{[a]}(u_0)$ and $u^{[b]}(h)=\varphi_h^{[b]}(u_0)$, respectively, for a time step $h$, and then compose the operators $\varphi_h^{[a]}$, $\varphi_h^{[b]}$ to construct an approximation to the solution of (15). Thus, $u(h)\approx\varphi_h^{[b]}(\varphi_h^{[a]}(u_0))$ provides a first-order approximation, whereas the Strang splitting $u(h)\approx\varphi_{h/2}^{[a]}(\varphi_h^{[b]}(\varphi_{h/2}^{[a]}(u_0)))$ is formally second-order accurate for sufficiently smooth solutions. In this way, especially adapted numerical methods can be used to integrate each subproblem, even in parallel [3, 4].
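A minimal sketch of the Strang approach for a diffusion-reaction split of the form (16), under hypothetical choices (a 1D periodic domain, constant diffusion coefficient $d$, and the logistic reaction $f(u)=u(1-u)$, both substeps solvable exactly, the diffusion one in Fourier space):

```python
import numpy as np

n, d, dt = 64, 0.01, 0.05
x = np.linspace(0.0, 1.0, n, endpoint=False)
k = 2.0 * np.pi * np.fft.fftfreq(n, 1.0 / n)    # wavenumbers on [0,1)
u = 0.1 + 0.05 * np.cos(2.0 * np.pi * x)        # initial data in (0,1)

def diffusion(u, t):   # exact flow of u_t = d*u_xx (periodic, via FFT)
    return np.fft.ifft(np.exp(-d * k**2 * t) * np.fft.fft(u)).real

def reaction(u, t):    # exact flow of u_t = u(1-u) (logistic solution)
    return u / (u + (1.0 - u) * np.exp(-t))

for _ in range(200):   # Strang splitting up to t = 10
    u = reaction(u, 0.5 * dt)
    u = diffusion(u, dt)
    u = reaction(u, 0.5 * dt)
```

Both substeps preserve the invariant interval $(0,1)$ of the reaction term, so the composition does too; the solution relaxes toward the stable state $u\equiv 1$.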

Splitting Methods, Fig. 1 Error in the solution $(q_1(t_f),q_2(t_f))$ vs. the number of evaluations, in log10-log10 scale, for different fourth-order splitting methods (S$_2$, RK4, S$_3^4$, S$_6^4$, SN$_6^4$, SNI$_5^4$); the extra cost in the method SNI$_5^4$, designed for perturbed problems, is not taken into account

Systems of hyperbolic conservation laws, such as

$$ u_t+f(u)_x+g(u)_x=0,\qquad u(x,t_0)=u_0(x), $$

can also be treated with splitting methods, in this case by fixing a step size $h$ and applying a specially tailored numerical scheme to each scalar conservation law $u_t+f(u)_x=0$ and $u_t+g(u)_x=0$. This is a particular example of dimensional splitting, where the original problem is approximated by solving one space direction at a time. Early examples of dimensional splitting are the so-called locally one-dimensional (LOD) methods (such as the LOD-backward Euler and LOD Crank–Nicolson schemes) and alternating direction implicit (ADI) methods (e.g., the Peaceman–Rachford algorithm) [4].

Although the formal analysis of splitting methods in this setting can also be carried out by power series expansions, several fundamental difficulties arise, however. First, nonlinear PDEs in general possess solutions that exhibit complex behavior in small regions of space and time, such as sharp transitions and discontinuities. Second, even if the exact solution of the original problem is smooth, it might happen that the composition defining the splitting method provides nonsmooth approximations. Therefore, it is necessary to develop sophisticated tools to analyze whether the numerical solution constructed with a splitting method leads to the correct solution of the original problem or not [3].

On the other hand, even if the solution is sufficiently smooth, applying splitting methods of order higher than two is not possible for certain problems. This happens, in particular, when there is a diffusion term in the equation, since then the presence of negative coefficients in the method leads to an ill-posed problem. When $c=1$ in (16), this order barrier has been circumvented, however, with the use of complex-valued coefficients with positive real parts: the operator $\varphi_{zh}^{[a]}$ corresponding to the Laplacian $\mathcal{L}_a$ is still well defined in a reasonable distribution set for $z\in\mathbb{C}$, provided that $\Re(z)\ge 0$.

There exist also relevant problems where high-order splitting methods can be safely used, as in the integration of the time-dependent Schrödinger equation $iu_t=-\frac{1}{2m}\Delta u+V(x)u$, split into kinetic $T=-(2m)^{-1}\Delta$ and potential $V$ energy operators, with periodic boundary conditions. In this case, the combination of the Strang splitting in time and Fourier collocation in space is quite popular in chemical physics (under the name of split-step Fourier method). These schemes have appealing structure-preserving properties, such as unitarity, symplecticity, and time-symmetry [5]. Moreover, it has been shown that for a method (5) of order $r$ with the splitting into kinetic and potential energy and under relatively mild assumptions on $T$ and $V$, one has an $r$th-order error bound $\|(\psi_h)^n u_0-u(nh)\|\le C\,n\,h^{r+1}\max_{0\le s\le nh}\|u(s)\|_r$ in terms of the $r$th-order Sobolev norm [5].
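A minimal sketch of the split-step Fourier scheme just described (Strang splitting with the kinetic substep applied in Fourier space; the grid, potential, and initial state below are arbitrary illustrative choices): since every substep is unitary, the discrete $L^2$ norm of the wave function is preserved to rounding accuracy.

```python
import numpy as np

n, m, dt, L = 256, 1.0, 0.01, 20.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
k = 2.0 * np.pi * np.fft.fftfreq(n, L / n)       # Fourier wavenumbers
V = 0.5 * x**2                                   # harmonic potential (example)
u = (2.0 / np.pi) ** 0.25 * np.exp(-x**2)        # normalized Gaussian state

kinetic_half = np.exp(-1j * k**2 / (2.0 * m) * dt / 2.0)  # exp(-i T dt/2)
potential = np.exp(-1j * V * dt)                          # exp(-i V dt)

for _ in range(100):                             # Strang: T/2, V, T/2
    u = np.fft.ifft(kinetic_half * np.fft.fft(u))
    u = potential * u
    u = np.fft.ifft(kinetic_half * np.fft.fft(u))

norm = np.sqrt(np.sum(np.abs(u) ** 2) * (L / n))  # stays ~1 (unitarity)
```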

Cross-References

Composition Methods
One-Step Methods, Order, Convergence
Symmetric Methods
Symplectic Methods

References

1. Blanes, S., Casas, F., Murua, A.: Splitting and composition methods in the numerical integration of differential equations. Bol. Soc. Esp. Mat. Apl. 45, 89–145 (2008)
2. Hairer, E., Lubich, C., Wanner, G.: Geometric Numerical Integration. Structure-Preserving Algorithms for Ordinary Differential Equations, 2nd edn. Springer, Berlin (2006)
3. Holden, H., Karlsen, K.H., Lie, K.A., Risebro, N.H.: Splitting Methods for Partial Differential Equations with Rough Solutions. European Mathematical Society, Zürich (2010)
4. Hundsdorfer, W., Verwer, J.G.: Numerical Solution of Time-Dependent Advection-Diffusion-Reaction Equations. Springer, Berlin (2003)
5. Lubich, C.: From Quantum to Classical Molecular Dynamics: Reduced Models and Numerical Analysis. European Mathematical Society, Zürich (2008)
6. McLachlan, R.I.: On the numerical integration of ordinary differential equations by symmetric composition methods. SIAM J. Numer. Anal. 16, 151–168 (1995)
7. McLachlan, R.I., Quispel, R.: Splitting methods. Acta Numer. 11, 341–434 (2002)

Stability, Consistency, and Convergence of Numerical Discretizations

The error of a discretization is the difference between the solution of the original problem and the solution of the discrete problem, which must be defined so that the difference makes sense and can be quantified. Consistency of a discretization refers to a quantitative measure of the extent to which the exact solution satisfies the discrete problem. Stability of a discretization refers to a quantitative measure of the well-posedness of the discrete problem. A fundamental result in numerical analysis is that the error of a discretization may be bounded in terms of its consistency and stability.

A Framework for Assessing Discretizations

Many different approaches are used to discretize differential equations: finite differences, finite elements, spectral methods, integral equation approaches, etc. Despite the diversity of methods, fundamental concepts such as error, consistency, and stability are relevant to all of them. Here, we describe a framework general enough to encompass all these methods, although we do restrict to linear problems to avoid many complications. To understand the definitions, it is good to keep some concrete examples in mind, and so we start with two of these.

A Finite Difference Method

As a first example, consider the solution of the Poisson equation, $-\Delta u=f$, on a domain $\Omega\subset\mathbb{R}^2$, subject to the Dirichlet boundary condition $u=0$ on $\partial\Omega$. One possible discretization is a finite difference method, which we describe in the case where $\Omega=(0,1)\times(0,1)$ is the unit square. Making reference to Fig. 1, let $h=1/n$, $n>1$ integer, be the grid size, and define the grid domain $\Omega_h=\{(lh,mh)\mid 0<l,m<n\}$, the set of grid points inside $\Omega$, together with the set $\partial\Omega_h$ of grid points on the boundary $\partial\Omega$. At an interior grid point $p\in\Omega_h$, the five-point Laplacian $\Delta_h$ is defined by

$$ \Delta_h u(p)=\frac{u(p+he_1)+u(p-he_1)+u(p+he_2)+u(p-he_2)-4u(p)}{h^2}, $$

where $e_1$, $e_2$ denote the coordinate unit vectors.

Stability, Consistency, and Convergence of Numerical Discretizations, Fig. 1 The grid domain $\bar\Omega_h$ consists of the points in $\Omega_h$, marked with solid dots, and in $\partial\Omega_h$, marked with hollow dots. On the right is the stencil of the five-point Laplacian, which consists of a grid point $p$ and its four nearest neighbors

The discrete problem then reads

$$ -\Delta_h u_h(p)=f(p),\quad p\in\Omega_h;\qquad u_h(p)=0,\quad p\in\partial\Omega_h. $$

If we regard as unknowns the $N=(n-1)^2$ values $u_h(p)$ for $p\in\Omega_h$, this gives us a system of $N$ linear equations in $N$ unknowns which may be solved very efficiently.

A Finite Element Method

A second example of a discretization is provided by a finite element solution of the same problem. In this case we assume that $\Omega$ is a polygon furnished with a triangulation $\mathcal{T}_h$, such as pictured in Fig. 2. The finite element method seeks a function $u_h:\Omega\to\mathbb{R}$ which is continuous and piecewise linear with respect to the mesh and vanishing on $\partial\Omega$, and which satisfies

$$ \int_\Omega \nabla u_h\cdot\nabla v\,dx=\int_\Omega f v\,dx $$

for all test functions $v$ which are themselves continuous and piecewise linear with respect to the mesh and vanish on $\partial\Omega$. If we choose a basis for this space of test functions, then the computation of $u_h$ may be reduced to an efficiently solvable system of $N$ linear equations in $N$ unknowns, where, in this case, $N$ is the number of interior vertices in the triangulation.

Discretization

We may treat both these examples, and many other discretizations, in a common framework. We regard the discrete operator as a linear map $L_h$ from a vector space $V_h$, called the discrete solution space, to a second vector space $W_h$, called the discrete data space. In the case of the finite difference method, the discrete solution space is the space of mesh functions on $\bar\Omega_h$ which vanish on $\partial\Omega_h$, the discrete data space is the space of mesh functions on $\Omega_h$, and the discrete operator is $L_h=-\Delta_h$, the five-point Laplacian. In the case of the finite element method, $V_h$ is the space of continuous piecewise linear functions with respect to the given triangulation that vanish on $\partial\Omega$, and $W_h=V_h^{*}$, the dual space of $V_h$. The operator $L_h$ is given by

$$ (L_h w)(v)=\int_\Omega \nabla w\cdot\nabla v\,dx,\qquad w,v\in V_h. $$

For the finite difference method, we define the discrete data $f_h\in W_h$ by $f_h=f|_{\Omega_h}$, while for the finite element method $f_h\in W_h$ is given by $f_h(v)=\int_\Omega f v\,dx$. In both cases, the discrete solution $u_h\in V_h$ is found by solving the discrete equation

$$ L_h u_h=f_h. \qquad(1) $$

Of course, a minimal requirement on the discretization is that the finite dimensional linear system (1) has a unique solution, i.e., that the associated matrix is invertible (so $V_h$ and $W_h$ must have the same dimension). Then the discrete solution $u_h$ is well-defined. The primary goal of numerical analysis is to ensure that the discrete solution is a good approximation of the true solution $u$ in an appropriate sense.
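The finite difference discretization above is easy to assemble and test directly; the following sketch solves $-\Delta_h u_h=f$ on the unit square for the manufactured solution $u=\sin(\pi x)\sin(\pi y)$, $f=2\pi^2\sin(\pi x)\sin(\pi y)$ (grid size and the dense solver are arbitrary illustrative choices; the observed error is $\mathcal{O}(h^2)$):

```python
import numpy as np

n = 20
h = 1.0 / n
N = (n - 1) ** 2
idx = lambda l, m: (l - 1) * (n - 1) + (m - 1)   # number interior points

A = np.zeros((N, N))                              # matrix of -Delta_h
b = np.zeros(N)
for l in range(1, n):
    for m in range(1, n):
        i = idx(l, m)
        A[i, i] = 4.0 / h**2
        for (ll, mm) in ((l - 1, m), (l + 1, m), (l, m - 1), (l, m + 1)):
            if 1 <= ll <= n - 1 and 1 <= mm <= n - 1:
                A[i, idx(ll, mm)] = -1.0 / h**2   # boundary values are zero
        b[i] = 2.0 * np.pi**2 * np.sin(np.pi * l * h) * np.sin(np.pi * m * h)

uh = np.linalg.solve(A, b)
exact = np.array([np.sin(np.pi * l * h) * np.sin(np.pi * m * h)
                  for l in range(1, n) for m in range(1, n)])
max_err = np.abs(uh - exact).max()                # ~ pi^2 h^2 / 12
```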

Stability, Consistency, and Convergence of Numerical Discretizations, Fig. 2 A finite element mesh of the domain $\Omega$. The solution is sought as a piecewise linear function with respect to the mesh

Representative and Error

Since we are interested in the difference between $u$ and $u_h$, we must bring these into a common vector space, where the difference makes sense. To this end, we suppose that a representative $U_h\in V_h$ of $u$ is given. The representative is taken to be an element of $V_h$ which, though not practically computable, is a good approximation of $u$. For the finite difference method, a natural choice of representative is the grid function $U_h=u|_{\bar\Omega_h}$. If we show that the difference $U_h-u_h$ is small, we know that the grid values $u_h(p)$ which determine the discrete solution are close to the exact values $u(p)$. For the finite element method, a good possibility for $U_h$ is the piecewise linear interpolant of $u$, that is, $U_h$ is the piecewise linear function that coincides with $u$ at each vertex of the triangulation. Another popular possibility is to take $U_h$ to be the best approximation of $u$ in $V_h$ in an appropriate norm. In any case, the quantity $U_h-u_h$, which is the difference between the representative of the true solution and the discrete solution, defines the error of the discretization.

At this point we have made our goal more concrete: we wish to ensure that the error, $U_h-u_h\in V_h$, is small. To render this quantitative, we need to select a norm on the finite dimensional vector space $V_h$ with which to measure the error. The choice of norm is an important aspect of the problem presentation, and an appropriate choice must reflect the goal of the computation. For example, in some applications, a large error at a single point of the domain could be catastrophic, while in others only the average error over the domain is significant. In yet other cases, derivatives of $u$ are the true quantities of interest. These cases would lead to different choices of norms. We shall denote the chosen norm of $v\in V_h$ by $\|v\|_h$. Thus, we now have a quantitative goal for our computation: that the error $\|U_h-u_h\|_h$ be sufficiently small.

Consistency and Stability

Consistency Error
Having used the representative $U_h$ of the solution to define the error, we also use it to define a second sort of error, the consistency error, also sometimes called the truncation error. The consistency error is defined to be $L_h U_h-f_h$, which is an element of $W_h$. Now $U_h$ represents the true solution $u$, so the consistency error should be understood as a quantity measuring the extent to which the true solution satisfies the discrete equation (1). Since $Lu=f$, the consistency error should be small if $L_h$ is a good representative of $L$ and $f_h$ a good representative of $f$. In order to relate the norm of the error to the consistency error, we need a norm on the discrete data space $W_h$ as well. We denote this norm by $\|w\|_h'$ for $w\in W_h$, and so our measure of the consistency error is $\|L_h U_h-f_h\|_h'$.

Stability
If a problem in differential equations is well-posed, then, by definition, the solution $u$ depends continuously on the data $f$. On the discrete level, this continuous dependence is called stability. Thus, stability refers to the continuity of the mapping $L_h^{-1}:W_h\to V_h$, which takes the discrete data $f_h$ to the discrete solution $u_h$. Stability is a matter of degree, and an unstable discretization is one for which the modulus of continuity of $L_h^{-1}$ is very large.

To illustrate the notion of instability, and to motivate the quantitative measure of stability we shall introduce below, we consider a simpler numerical problem than the discretization of a differential equation. Suppose we wish to compute the definite integral

$$ \pi_{n+1}=\int_0^1 x^{n}e^{x-1}\,dx \qquad(2) $$

for $n=15$. Using integration by parts, we obtain a simple recipe to compute the integral in a short sequence of arithmetic operations:

$$ \pi_{n+1}=1-n\pi_n,\quad n=1,\ldots,15,\qquad \pi_1=1-e^{-1}=0.632121\ldots \qquad(3) $$

Now suppose we carry out this computation, beginning with $\pi_1=0.632121$ (so truncated after six decimal places). We then find that $\pi_{16}=-576{,}909$, which is a truly massive error, since the correct value is $\pi_{16}=0.0590175\ldots$ If we think of (3) as a discrete solution operator (analogous to $L_h^{-1}$ above) taking the data $\pi_1$ to the solution $\pi_{16}$, then it is a highly unstable scheme: a perturbation of the data of less than $10^{-6}$ leads to a change in the solution of nearly $6\times 10^{5}$. In fact, it is easy to see that for (3), a perturbation $\epsilon$ in the data leads to an error of $15!\,\epsilon$ in the solution, a huge instability.

Relating Consistency, Stability, and Error

The Fundamental Error Bound
Let us summarize the ingredients we have introduced in our framework to assess a discretization:
• The discrete solution space, $V_h$, a finite dimensional vector space, normed by $\|\cdot\|_h$
• The discrete data space, $W_h$, a finite dimensional vector space, normed by $\|\cdot\|_h'$
• The discrete operator, $L_h:V_h\to W_h$, an invertible linear operator
• The discrete data $f_h\in W_h$
• The discrete solution $u_h$ determined by the equation $L_h u_h=f_h$
• The solution representative $U_h\in V_h$
• The error $U_h-u_h\in V_h$
• The consistency error $L_h U_h-f_h\in W_h$
• The stability constant $C_h^{\mathrm{stab}}=\|L_h^{-1}\|_{\mathcal{L}(W_h,V_h)}$
With this framework in place, we may prove a rigorous error bound, stating that the error is bounded by the product of the stability constant and the consistency error:

$$ \|U_h-u_h\|_h\le C_h^{\mathrm{stab}}\,\|L_h U_h-f_h\|_h'. \qquad(4) $$

It is important to note that the numerical computation The proof is straightforward. Since Lh is invertible, of the integral (2) is not a difficult numerical problem. It could be easily computed with Simpson’s rule, for 1 1 Uh  uh D Lh ŒLh.Uh  uh/ D Lh .LhUh  Lhuh/ example. The crime here is solving the problem with 1 the unstable algorithm (3). D Lh .LhUh  fh/: Returning to the case of the discretization (1), imag- ine that we perturb the discrete data fh to some fQh D Taking norms, gives fh C h, resulting in a perturbation of the discrete 1 0 1 Q kUh  uhkh ÄkL kL.W ;V /kLhUh  fhk ; solution to uQh D Lh fh. Using the norms in Wh and h h h h Vh to measure the perturbations and then computing S the ratio, we obtain as claimed.

solution perturbation kQu  u k kL1 k The Fundamental Theorem D h h h D h h h : Q 0 0 A discretization of a differential equation always en- data perturbation kfh  fhk khkh h tails a certain amount of error. If the error is not small enough for the needs of the application, one generally We define the stability constant C stab, which is our h refines the discretization, for example, using a finer quantitative measure of stability, as the maximum value grid size in a finite difference method or a triangulation this ratio achieves for any perturbation  of the data. h with smaller elements in a finite element method. In other words, the stability constant is the norm of the Thus, we may consider a whole sequence or family of operator L1: h discretizations, corresponding to finer and finer grids or triangulations or whatever. It is conventional to 1 stab kLh hkh 1 L parametrize these by a positive real number h called Ch D sup 0 DkLh k .Wh;Vh/: 0¤h2Wh khkh the discretization parameter. For example, in the finite 1362 Stability, Consistency, and Convergence of Numerical Discretizations difference method, we may use the same h as before, the initial data, one sees that this method, though the grid size, and in the finite element method, we consistent, cannot be convergent if >1. can take h to be the maximal triangle diameter or Twenty years later, the property of stability of dis- something related to it. We shall call such a family cretizations began to emerge in the work of von Neu- of discretizations a discretization scheme. The scheme mann and his collaborators. First, in von Neumann’s is called convergent if the error norm kUh  uhkh work with Goldstine on solving systems of linear tends to 0 as h tends to 0. 
Clearly convergence is a equations [5], they studied the magnification of round- highly desirable property: it means that we can achieve off error by the repeated algebraic operations involved, whatever level of accuracy we need, as long as we do a somewhat like the simple example (3) of an unsta- fine enough computation. Two more definitions apply ble recursion considered above. A few years later, to a discretization scheme. The scheme is consistent if in a 1950 article with Charney and Fjørtoft [1]on 0 the consistency error norm kLhUh  fhkh tends to 0 numerical solution of a convection diffusion equation with h. The scheme is stable if the stability constant arising in atmospheric modeling, the authors clearly stab stab stab Ch is bounded uniformly in h: Ch Ä C for highlighted the importance of what they called com- some number C stab and all h. From the fundamental putational stability of the finite difference equations, error bound, we immediately obtain what may be and they used Fourier analysis techniques to assess called the fundamental theorem of numerical analysis: the stability of their method. This approach developed a discretization scheme which is consistent and stable into von Neumann stability analysis, still one of the is convergent. most widely used techniques for determining stability of finite difference methods for evolution equations. During the 1950s, there was a great deal of Historical Perspective study of the nature of stability of finite difference Consistency essentially requires that the discrete equa- equations for initial value problems, achieving its tions defining the approximate solution are at least capstone in the 1956 survey paper [3]ofLaxand approximately satisfied by the true solution. This is Richtmeyer. 
In that context, they formulated the an evident requirement and has implicitly guided the definition of stability given above and proved that, for a construction of virtually all discretization methods, consistent difference approximation, stability ensured from the earliest examples. Bounds on the consistency convergence. error are often not difficult to obtain. For finite differ- ence methods, for example, they may be derived from Taylor’s theorem, and, for finite element methods, from Techniques for Ensuring Stability simple approximation theory. Stability is another mat- ter. Its central role was not understood until the mid- Finite Difference Methods twentieth century, and there are still many differential We first consider an initial value problem, for example, equations for which it is difficult to devise or to assess the heat equation or wave equation, discretized by a stable methods. finite difference method using grid size h and time step That consistency alone is insufficient for the conver- k. The finite difference method advances the solution gence of a finite difference method was pointed out in from some initial time t0 to a terminal time T by a seminal paper of Courant, Friedrichs, and Lewy [2] a sequence of steps, with the lth step advancing the in 1928. They considered the one-dimensional wave discrete solution from time .l  1/k to time lk.At equation and used a finite difference method, analo- each time level lk, the discrete solution is a spatial l gous to the five-point Laplacian, with a space-time grid grid function uh, and so the finite difference method l1 l of points .jh; lk/ with 0 Ä j Ä n, 0 Ä l Ä m integers defines an operator G.h; k/ mapping uh to uh, called and h; k > 0 giving the spatial and temporal grid size, the amplification matrix. Since the amplification matrix respectively. 
It is easy to bound the consistency error by is applied many times in the course of the calculation 2 2 O.h Ck /, so setting k D h for some constant >0 (m D .T  t0/=k times to be precise, a number and letting h tend to 0, one obtains a consistent scheme. which tends to infinity as k tends to 0), the solution m However, by comparing the domains of dependence at the final step uh involves a high power of the of the true solution and of the discrete solution on amplification matrix, namely G.h; k/m, applied to the Stability, Consistency, and Convergence of Numerical Discretizations 1363
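The catastrophic error amplification of the recurrence (3) is easy to reproduce in a few lines of code. The following sketch (plain Python; the function and variable names are ours) runs the recurrence from the exact starting value $\sigma_1 = 1 - e^{-1}$ and from its six-decimal truncation, and compares the results at $n = 16$.

```python
import math

def forward_recurrence(sigma1, n_max=16):
    """Run the recurrence (3), sigma_{n+1} = 1 - n*sigma_n, up to sigma_{n_max}."""
    sigma = {1: sigma1}
    for n in range(1, n_max):
        sigma[n + 1] = 1.0 - n * sigma[n]
    return sigma

exact = forward_recurrence(1.0 - math.exp(-1.0))   # sigma_1 = 1 - 1/e
truncated = forward_recurrence(0.632121)           # sigma_1 kept to six decimals

delta1 = abs(truncated[1] - exact[1])     # initial perturbation, below 1e-6
delta16 = abs(truncated[16] - exact[16])  # amplified by a factor of 15! ~ 1.3e12
```

Starting from the six-decimal value, $\sigma_{16}$ comes out near $-5.8 \times 10^5$ instead of $0.0590175\dots$, matching the amplification factor $15!$ noted above.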

Stability, Consistency, and Convergence of Numerical Discretizations, Fig. 3  Finite difference solution of the heat equation using (5). Left: initial data. Middle: discrete solution at $t = 0.03$ computed with $h = 1/20$, $k = 1/2{,}000$ (stable). Right: same computation with $k = 1/1{,}000$ (unstable)

Therefore, the stability constant will depend on a bound for $\|G(h,k)^m\|$. Usually this can only be obtained by showing that $\|G(h,k)\| \le 1$ or, at most, $\|G(h,k)\| \le 1 + O(k)$. As a simple example, we may consider an initial value problem for the heat equation with homogeneous boundary conditions on the unit square:

  $\partial u/\partial t = \Delta u$, $\quad x \in \Omega$, $0 < t \le T$,
  $u(x,t) = 0$, $\quad x \in \partial\Omega$, $0 < t \le T$,
  $u(x,0) = u_0(x)$, $\quad x \in \Omega$,

which we discretize with the five-point Laplacian and forward differences in time:

  $\dfrac{u_h^l(p) - u_h^{l-1}(p)}{k} = \Delta_h u_h^{l-1}(p)$, $\quad p \in \Omega_h$.  (5)

Several methods are used to bound the norm of the amplification matrix. If an $L^\infty$ norm is chosen, one can often use a discrete maximum principle based on the structure of the matrix. If an $L^2$ norm is chosen, then Fourier analysis may be used if the problem has constant coefficients and simple enough boundary conditions. In other circumstances, more sophisticated matrix or eigenvalue analysis is used.

For time-independent PDEs, such as the Poisson equation, the requirement is to show that the inverse of the discretization operator is bounded uniformly in the grid size $h$. Similar techniques as for the time-dependent problems are applied.

Galerkin Methods

Galerkin methods, of which finite element methods are an important case, treat a problem which can be put into the form: find $u \in V$ such that $B(u,v) = F(v)$ for all $v \in V$. Here, $V$ is a Hilbert space, $B : V \times V \to \mathbb{R}$ is a bounded bilinear form, and $F$ is an element of the dual space of $V$.
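The stability restriction for the forward-difference scheme (5) can be observed directly in computation. The sketch below (NumPy; the grid size and the two time steps are those of Fig. 3, while the spike initial data is our choice) marches the scheme with a stable and an unstable step; for this scheme the amplification factors all have modulus at most 1 exactly when $k \le h^2/4$, and here $h^2/4 = 1/1{,}600$.

```python
import numpy as np

def heat_forward_euler(u0, h, k, n_steps):
    """March the five-point Laplacian / forward-difference scheme (5):
    (u^l - u^{l-1})/k = Delta_h u^{l-1}, zero Dirichlet boundary data."""
    u = u0.copy()
    for _ in range(n_steps):
        lap = (u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
               - 4.0 * u[1:-1, 1:-1]) / h**2
        u[1:-1, 1:-1] = u[1:-1, 1:-1] + k * lap
    return u

h = 1.0 / 20                      # 21 x 21 grid on the unit square
u0 = np.zeros((21, 21))
u0[10, 10] = 1.0                  # rough initial data: a single spike

# Integrate to t = 0.03 with the two time steps of Fig. 3.
stable   = heat_forward_euler(u0, h, 1.0 / 2000, 60)  # k <= h^2/4 = 1/1600
unstable = heat_forward_euler(u0, h, 1.0 / 1000, 30)  # k >  h^2/4

max_stable = float(np.abs(stable).max())
max_unstable = float(np.abs(unstable).max())
```

With the stable step the spike simply diffuses (the discrete maximum principle keeps all values in $[0,1]$); with $k = 1/1{,}000$ the computed solution blows up by many orders of magnitude, the behavior shown in the right panel of Fig. 3.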

Stability, Consistency, and Convergence of Numerical Discretizations, Fig. 4  Approximation of the problem (7), with $u = \cos x$ shown on left and $\sigma = u'$ on the right. The exact solution is shown in blue, and the stable finite element method, using piecewise linears for $\sigma$ and piecewise constants for $u$, is shown in green (in the right plot, the blue curve essentially coincides with the green curve, and so is not visible). An unstable finite element method, using piecewise quadratics for $\sigma$, is shown in red

The Galerkin method defines an approximate solution by restricting this weak formulation to a finite dimensional subspace: find $u_h \in V_h$ such that $B(u_h, v) = F(v)$ for all $v \in V_h$. The standard finite element method for the Poisson equation described above took $V_h$ to be the subspace of continuous piecewise linears. If the bilinear form $B$ is coercive in the sense that there exists a constant $\gamma > 0$ for which

  $B(v,v) \ge \gamma \|v\|_V^2$, $\quad v \in V$,

then stability of the Galerkin method with respect to the $V$ norm is automatic. No matter how the subspace $V_h$ is chosen, the stability constant is bounded by $1/\gamma$.

If the bilinear form is not coercive (or if we consider a norm other than the norm in which the bilinear form is coercive), then finding stable subspaces for Galerkin's method may be quite difficult. As a very simple example, consider a problem on the unit interval $I = (0,1)$, to find $(\sigma, u) \in H^1(I) \times L^2(I)$ such that

  $\int_0^1 \sigma\tau\,dx + \int_0^1 \tau' u\,dx + \int_0^1 \sigma' v\,dx = \int_0^1 f v\,dx$, $\quad (\tau, v) \in H^1(I) \times L^2(I)$.  (7)

This is a weak formulation of the system $\sigma = u'$, $\sigma' = f$, with Dirichlet boundary conditions (which arise from this weak formulation as natural boundary conditions), so this is another form of the Dirichlet problem for Poisson's equation $u'' = f$ on $I$, $u(0) = u(1) = 0$. In higher dimensions, there are circumstances where such a first-order formulation is preferable to a standard second-order form. This problem can be discretized by a Galerkin method, based on subspaces $S_h \subset H^1(I)$ and $W_h \subset L^2(I)$. However, the choice of subspaces is delicate, even in this one-dimensional context. If we partition $I$ into subintervals and choose $S_h$ and $W_h$ both to be the space of continuous piecewise linears, then the resulting matrix problem is singular, so the method is unusable. If we choose $S_h$ to be the continuous piecewise linears, and $W_h$ to be the piecewise constants, we obtain a stable method. But if we choose $S_h$ to contain all continuous piecewise quadratic functions and retain the space of piecewise constants for $W_h$, we obtain an unstable scheme. The stable and unstable methods can be compared in Fig. 4. For the same problem of the Poisson equation in first-order form, but in more than one dimension, the first stable elements were discovered in 1975 [4].

References

1. Charney, J.G., Fjørtoft, R., von Neumann, J.: Numerical integration of the barotropic vorticity equation. Tellus 2, 237–254 (1950)
2. Courant, R., Friedrichs, K., Lewy, H.: Über die partiellen Differenzengleichungen der mathematischen Physik. Math. Ann. 100(1), 32–74 (1928)
3. Lax, P.D., Richtmyer, R.D.: Survey of the stability of linear finite difference equations. Commun. Pure Appl. Math. 9(2), 267–293 (1956)
4. Raviart, P.A., Thomas, J.M.: A mixed finite element method for 2nd order elliptic problems. In: Mathematical Aspects of Finite Element Methods. Proceedings of the Conference, Consiglio Naz. delle Ricerche (C.N.R.), Rome, 1975. Volume 606 of Lecture Notes in Mathematics, pp. 292–315. Springer, Berlin (1977)
5. von Neumann, J., Goldstine, H.H.: Numerical inverting of matrices of high order. Bull. Am. Math. Soc. 53, 1021–1099 (1947)

Statistical Methods for Uncertainty Quantification for Linear Inverse Problems

Luis Tenorio
Mathematical and Computer Sciences, Colorado School of Mines, Golden, CO, USA

To solve an inverse problem means to recover an unknown object from indirect noisy observations. As an illustration, consider an idealized example of the blurring of a one-dimensional signal, $f(x)$, by a measuring instrument. Assume the function is parametrized so that $x \in [0,1]$ and that the actual data can be modeled as noisy observations of a blurred version of $f$. We may model the blurring as a convolution with a kernel, $K(x)$, determined by the instrument. The forward operator maps $f$ to the blurred function $\phi$ given by $\phi(x) = \int_0^1 K(x-t)\,f(t)\,dt$ (i.e., a Fredholm integral equation of the first kind). The statistical model for the data is then $y(x) = \phi(x) + \varepsilon(x)$, where $\varepsilon(x)$ is measurement noise. The inverse problem consists of recovering $f$ from finitely many measurements $y(x_1), \dots, y(x_n)$. However, inverse problems are usually ill-posed (e.g., the estimates may be very sensitive to small perturbations of the data), and deblurring is one such example. A regularization method is required to solve the problem. An introduction to regularization of inverse problems can be found in [11] and more general references are [10, 27].

Since the observations $y(x_i)$ are subject to systematic errors (e.g., discretization) as well as measurement errors that will be modeled as random variables, the solution of an inverse problem should include a summary of the statistical characteristics of the inversion estimate such as means, standard deviations, bias, mean squared errors, and confidence sets. However, the selection of proper statistical methods to assess estimators depends, of course, on the class of estimators, which in turn is determined by the type of inverse problem and the chosen regularization method. Here we use a general framework that encompasses several different and widely used approaches.

We consider the problem of assessing the statistics of solutions of linear inverse problems whose data are modeled as $y_i = K_i[f] + \varepsilon_i$ $(i = 1,\dots,n)$, where the functions $f$ belong to a linear space $\mathcal{H}$, each $K_i$ is a continuous linear operator defined on $\mathcal{H}$, and the errors $\varepsilon_i$ are random variables. Since the errors are random, an estimate $\hat f$ of $f$ is a random variable taking values in $\mathcal{H}$. Given the finite amount of data, we can only hope to recover components of $f$ that admit a finite-dimensional parametrization. Such parametrizations also help us avoid defining probability measures in function spaces. For example, we can discretize the operators and the function so that the estimate $\hat f$ is a vector in $\mathbb{R}^m$. Alternatively, one may be able to use a finite-dimensional parametrization such as $f = \sum_{k=1}^m a_k \phi_k$, where the $\phi_k$ are fixed functions. This time the random variable is the estimate $\hat a$ of the vector of coefficients $a = (a_k)$. In either case the problem of finding an estimate of a function reduces to a finite-dimensional linear algebra problem.

Example 1. Consider the inverse problem for a Fredholm integral equation:

  $y_i = y(x_i) = \int_0^1 K(x_i - t)\,f(t)\,dt + \varepsilon_i$, $\quad (i = 1,\dots,n)$.

To discretize the integral, we can use $m$ equally spaced points $t_i$ in $[0,1]$ and define $t'_j = (t_j + t_{j-1})/2$. Then,

  $\phi(x_i) = \int_0^1 K(x_i - y)\,f(y)\,dy \approx \frac{1}{m} \sum_{j=1}^{m-1} K(x_i - t'_j)\,f(t'_j)$.

Hence we have the approximation $\phi = (\phi(x_1), \dots, \phi(x_n))^t \approx Kf$ with $K_{ij} = m^{-1}K(x_i - t'_j)$ and $f_j = f(t'_j)$. Writing the discretization error as $\delta = \phi - Kf$, we arrive at the following model for the data vector $y$:

  $y = Kf + \delta + \varepsilon$.  (1)

As $m \to \infty$ the approximation of the integral improves but the matrix $K$ becomes more ill-conditioned. To regularize the problem, we define an estimate $\hat f$ of $f$ using penalized least squares:

  $\hat f = \arg\min_{g \in \mathbb{R}^m} \|y - Kg\|^2 + \lambda^2 \|Dg\|^2 = (K^tK + \lambda^2 D^tD)^{-1} K^t y \equiv Ly$,

where $\lambda > 0$ is a fixed regularization parameter and $D$ is a chosen matrix (e.g., a matrix that computes discrete derivatives). This regularization addresses the ill-conditioning of the matrix $K$; it is a way of adding the prior information that we expect $\|Df\|$ to be small. The case $D = I$ is known as (discrete) Tikhonov-Phillips regularization. Note that we may write $\hat f(x_i)$ as a linear function of $y$: $\hat f(x_i) = e_i^t L y$, where $\{e_i\}$ is the standard orthonormal basis in $\mathbb{R}^m$.

In the next two examples, we assume that $\mathcal{H}$ is a Hilbert space, and each $K_i : \mathcal{H} \to \mathbb{R}$ is a bounded linear operator (and thus continuous). We write $K[f] = (K_1[f], \dots, K_n[f])^t$.

Example 2. Since $K(\mathcal{H}) \subset \mathbb{R}^n$ is finite dimensional, it follows that $K$ is compact, as is its adjoint $K^* : \mathbb{R}^n \to \mathcal{H}$, and $K^*K$ is a self-adjoint compact operator on $\mathcal{H}$. In addition, there is a collection of orthonormal functions $\{\psi_k\}$ in $\mathcal{H}$, orthonormal vectors $\{v_k\}$ in $\mathbb{R}^n$, and a positive, nonincreasing sequence $(\lambda_k)$ such that [22]: (a) $\{\psi_k\}$ is an orthonormal basis for $\mathrm{Null}(K)^\perp$; (b) $\{v_k\}$ is an orthonormal basis for the closure of $\mathrm{Range}(K)$ in $\mathbb{R}^n$; and (c) $K[\psi_k] = \lambda_k v_k$ and $K^*[v_k] = \lambda_k \psi_k$. Write $f = f_0 + f_1$, with $f_0 \in \mathrm{Null}(K)$ and $f_1 \in \mathrm{Null}(K)^\perp$. Then, there are constants $a_k$ such that $f_1 = \sum_{k=1}^n a_k \psi_k$. The data do not provide any information about $f_0$, so without any other information we have no way of estimating this component of $f$. This introduces a systematic bias. The problem of estimating $f$ is thus reduced to estimating $f_1$, that is, the coefficients $a_k$. In fact, we may transform the data to $\langle y, v_k \rangle = \lambda_k a_k + \langle \varepsilon, v_k \rangle$ and use them to estimate the vector of coefficients $a = (a_k)$; the transformed data based on this sequence define a sequence space model [7, 18]. We may also rewrite the data as $y = Va + \varepsilon$, where $V = (\lambda_1 v_1 \cdots \lambda_n v_n)$. An estimate of $f$ is obtained using a penalized least-squares estimate of $a$:

  $\hat a = \arg\min_{b \in \mathbb{R}^n} \|y - Vb\|^2 + \lambda^2 \|b\|^2$.

This leads again to an estimate that is linear in $y$; write it as $\hat a = Ly$ for some matrix $L$. The estimate of $f(x)$ is then similar to that in Example 1:

  $\hat f(x) = \psi(x)^t \hat a = \psi(x)^t L y$,  (2)

with $\psi(x) = (\psi_1(x), \dots, \psi_n(x))^t$.

If the goal is to estimate pointwise values of $f \in \mathcal{H}$, then the Hilbert space $\mathcal{H}$ needs to be defined appropriately. For example, if $\mathcal{H} = L^2([0,1])$, then pointwise values of $f$ are not well defined. The following example introduces spaces where evaluation at a point (i.e., $f \to f(x)$) is a continuous linear functional.

Example 3. Let $I = [0,1]$. Let $W_m(I)$ be the linear space of real-valued functions on $I$ such that $f$ has $m-1$ continuous derivatives on $I$, $f^{(m-1)}$ is absolutely continuous on $I$ (so $f^{(m)}$ exists almost everywhere on $I$), and $f^{(m)} \in L^2(I)$. The space $W_m(I)$ is a Hilbert space with inner product

  $\langle f, g \rangle = \sum_{k=0}^{m-1} f^{(k)}(0)\,g^{(k)}(0) + \int_I f^{(m)}(x)\,g^{(m)}(x)\,dx$

and has the following properties [2, 29]: (a) For every $x \in I$, there is a function $\rho_x \in W_m(I)$ such that the linear functional $f \to f(x)$ is continuous on $W_m(I)$ and given by $f \to \langle \rho_x, f \rangle$. The function $R : I \times I \to \mathbb{R}$, $R(x,y) = \langle \rho_x, \rho_y \rangle$, is called a reproducing kernel of the Hilbert space. (b) $W_m(I) = N_{m-1} \oplus H_m$, where $N_{m-1}$ is the space of polynomials of degree at most $m-1$ and $H_m = \{ f \in W_m(I) : f^{(k)}(0) = 0 \ \text{for} \ k = 0,\dots,m-1 \}$. Since the space $W_m(I)$ satisfies (a), it is called a reproducing kernel Hilbert space (RKHS).

To control the smoothness of the Tikhonov estimate, we put a penalty on the derivative of $f_1$, which is the projection $f_1 = P_H f$ onto $H_m$. To write the penalized sum of squares, we use the fact that each functional $K_i : W_m(I) \to \mathbb{R}$ is continuous and thus $K_i f = \langle \eta_i, f \rangle$ for some function $\eta_i \in W_m(I)$. We can then write

  $\|y - Kf\|^2 + \lambda^2 \int_I (f_1^{(m)}(x))^2\,dx = \sum_{i=1}^n (y_i - \langle \eta_i, f \rangle)^2 + \lambda^2 \|P_H f\|^2$.  (3)

Define $\phi_k(x) = x^{k-1}$ for $k = 1,\dots,m$ and $\phi_k = P_H \eta_{k-m}$ for $k = m+1,\dots,m+n$. Then $f = \sum_k a_k \phi_k + \delta$, where $\delta$ belongs to the orthogonal complement of the span of $\{\phi_k\}$. It can be shown that the minimizer $\hat f$ of (3) is again of the form (2) [29]. To estimate $a$ we rewrite (3) as a function of $a$ and use the following estimate:

  $\hat a = \arg\min_b \|y - Xb\|^2 + \lambda^2\, b_H^t P\, b_H$,

where $X$ is the matrix of inner products $\langle \eta_i, \phi_j \rangle$, $P_{ij} = \langle P_H \eta_i, P_H \eta_j \rangle$, and $a_H = (a_{m+1}, \dots, a_{m+n})^t$.
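As a concrete illustration of Example 1, the sketch below (NumPy; the Gaussian kernel, test signal, noise level, and value of $\lambda$ are our choices, not taken from the article) forms the discretized convolution matrix and computes the penalized least-squares estimate $\hat f = (K^tK + \lambda^2 D^tD)^{-1}K^ty$ with a first-order difference matrix $D$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Midpoint discretization of the blurring operator of Example 1:
# K_ij = (1/m) k(x_i - t'_j), here with a Gaussian kernel (our choice).
m = n = 80
t = (np.arange(m) + 0.5) / m          # quadrature midpoints t'_j
x = t                                  # observation points x_i

def kern(s):
    return np.exp(-s**2 / (2 * 0.05**2))

K = kern(x[:, None] - t[None, :]) / m

f_true = np.sin(np.pi * t) ** 2        # smooth test signal (our choice)
y = K @ f_true + 0.001 * rng.standard_normal(n)

# Penalized least squares: f_hat = (K^t K + lam^2 D^t D)^{-1} K^t y = L y,
# with D the first-order finite-difference matrix.
D = np.diff(np.eye(m), axis=0)
lam = 1e-2
L = np.linalg.solve(K.T @ K + lam**2 * (D.T @ D), K.T)
f_hat = L @ y

cond_K = np.linalg.cond(K)             # ill-conditioning of the discretized operator
err = float(np.max(np.abs(f_hat - f_true)))
```

Even though $K$ is numerically singular, the penalized estimate recovers the smooth signal to within a few percent; solving the normal equations without the penalty term would amplify the noise enormously.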

These examples describe three different frameworks where the functional estimation is reduced to a finite-dimensional penalized least-squares problem. They serve to motivate the framework we will use in the statistical analysis. We will focus on frequentist statistical methods. Bayesian methods for inverse problems are discussed in [17]; [1, 23] provide a tutorial comparison of frequentist and Bayesian procedures for inverse problems.

Consider first the simpler case of general linear regression: the $n \times 1$ data vector is modeled as $y = Ka + \varepsilon$, where $K$ is an $n \times m$ matrix, $n > m$, $K^tK$ is non-singular, and $\varepsilon$ is a random vector with mean zero and covariance matrix $\sigma^2 I$. The least-squares estimate of $a$ is

  $\hat a = \arg\min_b \|y - Kb\|^2 = (K^tK)^{-1}K^t y$,  (4)

and it has the following properties: Its expected value is $E(\hat a) = a$ regardless of the true value $a$; that is, $\hat a$ is an unbiased estimator of $a$. The covariance matrix of $\hat a$ is $\mathrm{Var}(\hat a) = \sigma^2 (K^tK)^{-1}$. An unbiased estimator of $\sigma^2$ is $\hat\sigma^2 = \|y - K\hat a\|^2/(n - m)$. Note that the denominator is the difference between the number of observations and the number of estimated parameters; it is a kind of "effective number of observations." It can be shown that $m = \mathrm{tr}(H)$, where $H = K(K^tK)^{-1}K^t$ is the hat matrix; it is the matrix defined by $Hy \equiv \hat y = K\hat a$. The degrees of freedom (dof) of $\hat y$ is defined as the sum of the covariances of $(K\hat a)_i$ with $y_i$ divided by $\sigma^2$ [24]. For least squares we have $\mathrm{dof}(K\hat a) = m = \mathrm{tr}(H)$. Hence we may write $\hat\sigma^2$ as the residual sum of squares normalized by the effective number of observations:

  $\hat\sigma^2 = \dfrac{\|y - \hat y\|^2}{n - \mathrm{dof}(\hat y)} = \dfrac{\|y - \hat y\|^2}{\mathrm{tr}(I - H)}$.  (5)

We now return to ill-posed inverse problems and define a general framework motivated by Examples 1–3 that is similar to general linear regression. We assume that the data vector has a representation of the form $y = K[f] + \varepsilon = Ka + \delta + \varepsilon$, where $K[\cdot]$ is a linear operator $\mathcal{H} \to \mathbb{R}^n$, $K$ is an $n \times m$ matrix, $\delta$ is a fixed unknown vector (e.g., discretization error), and $\varepsilon$ is a random vector of mean zero and covariance matrix $\sigma^2 I$. We also assume that there is an $n \times 1$ vector $a$ and a vector function $\psi$ such that $f(x) = \psi(x)^t a$ for all $x$. The vector $a$ is estimated using penalized least squares:

  $\hat a = \arg\min_b \|y - Kb\|^2 + \lambda^2\, b^t S b = (K^tK + \lambda^2 S)^{-1} K^t y$,  (6)

where $S$ is a symmetric non-negative matrix and $\lambda > 0$ is a fixed regularization parameter. The estimate of $f$ is defined as $\hat f(x) = \psi(x)^t \hat a$.

Bias, Variance, and MSE

For a fixed regularization parameter $\lambda$, the estimator $\hat f(x)$ is linear in $y$, and therefore its mean, bias, variance, and mean squared error can be determined using only knowledge of the first two moments of the distribution of the noise vector $\varepsilon$. Using (6) we find the mean, bias, and variance of $\hat f(x)$:

  $E(\hat f(x)) = \psi(x)^t G_\lambda K^t K[f]$,
  $\mathrm{Bias}(\hat f(x)) = \psi(x)^t G_\lambda K^t K[f] - f(x)$,
  $\mathrm{Var}(\hat f(x)) = \sigma^2 \|K G_\lambda \psi(x)\|^2$,

where $G_\lambda = (K^tK + \lambda^2 S)^{-1}$. Hence, unlike the least-squares estimate (4), the penalized least-squares estimate (6) is biased even when $\delta = 0$. This introduces a bias in the estimates of $f$. In terms of $a$ and $\delta$, this bias is

  $\mathrm{Bias}(\hat f(x)) = E(\hat f(x)) - f(x) = \psi(x)^t B_\lambda a + \psi(x)^t G_\lambda K^t \delta$,  (7)

where $B_\lambda = -\lambda^2 G_\lambda S$. Prior information about $a$ should be used to choose the matrix $S$ so that $\|B_\lambda a\|$ is small. Note that similar formulas can be derived for correlated noise provided the covariance matrix is known. Also, analogous closed formulas can be derived for estimates of linear functionals of $f$.

The mean squared error (MSE) can be used to include the bias and variance in the uncertainty evaluation of $\hat f(x)$; it is defined as the expected value of $(\hat f(x) - f(x))^2$, which is equivalent to the squared bias plus the variance:

  $\mathrm{MSE}(\hat f(x)) = \mathrm{Bias}(\hat f(x))^2 + \mathrm{Var}(\hat f(x))$.

The integrated mean squared error of $\hat f$ is

  $\mathrm{IMSE}(\hat f) = E \int |\hat f(x) - f(x)|^2\,dx = \mathrm{Bias}(\hat a)^t F\, \mathrm{Bias}(\hat a) + \sigma^2\, \mathrm{tr}(F G_\lambda K^t K G_\lambda)$,

where $F = \int \psi(x)\psi(x)^t\,dx$. The bias component of the MSE is the most difficult to assess, as it depends on the unknown $f$ (or $a$ and $\delta$), but, depending on the available prior information, some inequalities can be derived [20, 26].

Example 4. If $\mathcal{H}$ is a Hilbert space and the functionals $K_i$ are bounded, then there are functions $\eta_i \in \mathcal{H}$ such that $K_i[f] = \langle \eta_i, f \rangle$, and we may write

  $E[\hat f(x)] = \Big\langle \sum_i a_i(x)\,\eta_i,\ f \Big\rangle = \langle A_x, f \rangle$,

where the function $A_x(y) = \sum_i a_i(x)\,\eta_i(y)$ is called the Backus-Gilbert averaging kernel for $\hat f$ at $x$ [3]. In particular, since we would like to have $f(x) = \langle A_x, f \rangle$, we would like $A_x$ to be as concentrated as possible around $x$; a plot of the function $A_x$ may provide useful information about the mean of the estimate $\hat f(x)$. One may also summarize characteristics of $|A_x|$ such as its center and spread about the center (e.g., [20]). Heuristically, $|A_x|$ should be like a $\delta$-function centered at $x$. This can be formalized in an RKHS $\mathcal{H}$. In this case, there is a function $\rho_x \in \mathcal{H}$ such that $f(x) = \langle \rho_x, f \rangle$, and the bias of $\hat f(x)$ can be written as $\mathrm{Bias}(\hat f(x)) = \langle A_x - \rho_x, f \rangle$, and therefore

  $|\mathrm{Bias}(\hat f(x))| \le \|A_x - \rho_x\|\,\|f\|$.

We can guarantee a small bias when $A_x$ is close to $\rho_x$ in $\mathcal{H}$. In actual computations, averaging kernels can be approximated using splines. A discussion of this topic as well as information about available software for splines and reproducing kernels can be found in [14, 20].

Another bound for the bias follows from (7) via the Cauchy-Schwarz and triangle inequalities:

  $|\mathrm{Bias}(\hat f(x))| \le \|G_\lambda \psi(x)\|\,(\lambda^2 \|Sa\| + \|K^t\delta\|)$.

Plots of $\|G_\lambda \psi(x)\|$ (or $\|A_x - \rho_x\|$) as a function of $x$ may provide geometric (usually conservative) information about the bias. Other measures such as the worst or an average bias can be obtained depending on the available prior information we have on $f$ or its parametric representation. For example, if $a$ and $\delta$ are known to lie in convex sets $S_1$ and $S_2$, respectively, then we may determine the maximum of $|\mathrm{Bias}(\hat f)|$ subject to $a \in S_1$ and $\delta \in S_2$. Or, if the prior information leads to the modeling of $a$ and $\delta$ as random variables with means and covariance matrices $\mu_a = Ea$, $\Sigma_a = \mathrm{Var}(a)$, $\mu_\delta = E\delta$ and $\Sigma_\delta = \mathrm{Var}(\delta)$, then the average bias is

  $E[\mathrm{Bias}(\hat f(x))] = \psi(x)^t B_\lambda \mu_a + \psi(x)^t G_\lambda K^t \mu_\delta$.

Similarly, we can easily derive a bound for the mean squared bias that can be used to put a bound on the average MSE.

Since the bias may be a significant factor in the inference (in some geophysical applications the bias is the dominant component of the MSE), it is important to study the residuals of the fit to determine if a significant bias is present. The mean and covariance matrix of the residual vector $r = y - K\hat a$ are

  $Er = -K\,\mathrm{Bias}(\hat a) + \delta = -K B_\lambda a + (I - H_\lambda)\delta$,  (8)
  $\mathrm{Var}(r) = \sigma^2 (I - H_\lambda)^2$,  (9)

where $H_\lambda = K G_\lambda K^t$ is the hat matrix. Equation (8) shows that if there is a significant bias, then we may see a trend in the residuals. From (9) we see that the residuals are correlated and heteroscedastic (i.e., $\mathrm{Var}(r_i)$ depends on $i$) even if the bias is zero, which complicates the interpretation of the plots. To stabilize the variance, it is better to plot residuals that have been corrected for heteroscedasticity, for example, $r'_i = r_i/(1 - (H_\lambda)_{ii})$.

Confidence Intervals

In addition to the mean and variance of $\hat f(x)$, we may construct confidence intervals that are expected to contain $E(\hat f(x))$ with some prescribed probability. We now assume that the noise is Gaussian, $\varepsilon \sim N(0, \sigma^2 I)$. We use $\Phi$ to denote the cumulative distribution function of the standard Gaussian $N(0,1)$ and write $z_\alpha = \Phi^{-1}(1-\alpha)$. Since $\hat f(x)$ is a biased estimate of $f(x)$, we can only construct confidence intervals for $E\hat f(x)$. We should therefore interpret the intervals with caution, as they may be incorrectly centered if the bias is significant. Under the Gaussianity assumption, a confidence interval $I_\alpha(x, \sigma)$ for $E\hat f(x)$ of coverage $1 - \alpha$ is

  $I_\alpha(x, \sigma) = \hat f(x) \pm z_{\alpha/2}\,\sigma\,\|K G_\lambda \psi(x)\|$.
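The closed-form mean, variance, and pointwise interval can be checked by direct simulation. The following sketch (NumPy; the design, the penalty $S = I$, and all numerical values are our choices, with $\delta = 0$) evaluates $E\hat f(x)$, $\mathrm{Var}\,\hat f(x)$, and the coverage of $I_\alpha(x, \sigma)$ for a small synthetic problem and compares them with Monte Carlo draws.

```python
import numpy as np

rng = np.random.default_rng(1)

# Small synthetic instance of the framework y = K a + eps (we take delta = 0);
# f(x) = psi(x)^t a with a monomial basis psi -- all concrete choices are ours.
n, m = 40, 4
x_obs = np.linspace(0.0, 1.0, n)
K = np.vander(x_obs, m, increasing=True)            # K_ij = x_i^(j-1)
a_true = np.array([1.0, -2.0, 0.5, 1.5])
sigma, lam = 0.1, 0.5
S = np.eye(m)                                        # Tikhonov-Phillips penalty

G = np.linalg.inv(K.T @ K + lam**2 * S)              # G_lambda
x0 = 0.3
psi = x0 ** np.arange(m)                             # psi(x0)

f_x0 = float(psi @ a_true)
mean_closed = float(psi @ (G @ (K.T @ (K @ a_true))))    # E f_hat(x0)
var_closed = sigma**2 * float(np.sum((K @ (G @ psi))**2))  # sigma^2 ||K G psi||^2
bias_closed = mean_closed - f_x0                     # = -lam^2 psi^t G S a here

# Monte Carlo check, including the pointwise interval
# I_alpha(x, sigma) = f_hat(x) +/- z_{alpha/2} sigma ||K G_lambda psi(x)||,
# which covers E f_hat(x) (not f(x)!) with probability 1 - alpha.
Y = (K @ a_true)[:, None] + sigma * rng.standard_normal((n, 20000))
draws = psi @ (G @ (K.T @ Y))
halfwidth = 1.96 * np.sqrt(var_closed)
coverage = float(np.mean(np.abs(draws - mean_closed) <= halfwidth))
```

The simulated mean and variance reproduce the closed formulas, the bias identity $\mathrm{Bias} = -\lambda^2\psi^tG_\lambda S a$ holds to machine precision, and the interval covers $E\hat f(x_0)$ about 95% of the time.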

That is, for each x the probability that I˛.x; / con- not available. Still, the formulas derived above with tains Ef.x/O is 1  ˛. If we have prior information to “reasonable” estimates O and O in place of  and  find an upper bound for the bias, jBias.f.x//O jÄB.x/, are approximately valid. This depends of course on the then a confidence interval for f.x/ with coverage at class of possible functions f , the noise distribution, least 1  ˛ is f.x/O ˙ .z˛=2 kKG .x/kCB.x//: the signal-to-noise ratio, and the ill-posedness of the If the goal is to detect structure by studying the problem. We recommend conducting realistic simula- confidence intervals I˛.x; / for a range of values of tion studies to understand the actual performance of the x, then it is advisable to correct for the total number of estimates for a particular problem. intervals considered so as to control the rate of incor- Generalized cross-validation (GCV) methods to se- rect detections. One way to do this is by constructing lect  have proved useful in applications and theoreti- 1  ˛ confidence intervals of the form I˛.x; ˇ/ with cal studies. A discussion of these methods can be found simultaneous coverage for all x in some closed set S. in [13, 14, 29, 30]. We now summarize a few methods This requires finding a constant ˇ>0such that for obtaining an estimate of . The estimate of  2 given by (5) could be readily PŒ Ef.x/O 2 I˛.x; ˇ/; 8x 2 S 1  ˛; used for, once again, dof.KaO/ D tr.H /, with the corresponding hat matrix H  (provided ı D 0). which is equivalent to However, because of the bias of KaO and the fixed error Ä ı, it is sometimes better to estimate  by considering the data as noisy observations of  D Ey D KŒf  –in P sup jZ t V .x/jˇ Ä ˛; x2S which case we may assume ı D 0;thatis,yi D i C"i . This approach is natural as  2 is the variance of the where V .x/ D KG.x/=kKG.x/k and Z1; errors, "i , in the observations of i . :::;Zn are independent N.0; 1/. 
We can use results To estimate the variance of "i , we need to remove regarding the tail behavior of of Gaussian the trend i . This trend may be seen as the values of processes to find an appropriate value of ˇ.For a function .x/: i D .xi /. A variety of nonpara- example, for the case when x 2 Œa; b [19](seealso metric regression methods can be used to estimate the [25]) shows that for ˇ large function  so it can be removed, and an estimate of Ä the noise variance can be obtained (e.g., [15, 16]). If v 2 P sup jZ t V .x/jˇ eˇ =2 C 2.1  ˚.ˇ//; the function can be assumed to be reasonably smooth, x2S  then we can use the framework described in 3 with K R i Œ D .xi / and a penaltyP in the second derivative b 0 t where v D a kV .x/k dx. It is not difficult to find of . In this case .x/O D ai i .x/ D a .x/ is a root of this nonlinear equation; the only potential called a spline smoothing estimate because it is a finite problem may be computing v, but even an upper bound linear combination of spline functions i [15, 29]. The for it leads to intervals with simultaneous coverage at estimate of  2 defined by (5) with the corresponding S least 1  ˛. Similar results can be derived for the case hat matrix H  was proposed by [28]. Using (8)and(9) when S is a subset of R2 or R3 [25]. we find that the expected value of the residual sum of An alternative approach is to use methods based squares is on controlling the false discovery rate to correct for the interval coverage after the pointwise confidence 2 2 2 t t intervals have been selected [4]. Eky Oyk D  trŒ.I  H /  C a BK KBa; (10) Estimating  and  and thus O 2 is not an unbiased estimator of  2 even The formulas for the bias, variance, and confidence if the bias is zero (i.e., Ba D 0, which happens intervals described so far require knowledge of  and when  is linear), but it has been shown to have a selection of  that is independent of the data y. 
good asymptotic properties when λ is selected using generalized cross-validation [13, 28]. From (10) we see that a slight modification of σ̂² leads to an estimate that is unbiased when Ba = 0 [5].

If y is also used to estimate σ or choose λ, then f̂(x) is no longer linear in y and closed formulas for the moments, bias, or confidence intervals are

1370 Statistical Methods for Uncertainty Quantification for Linear Inverse Problems

an estimate f̂*_j of f following the same procedure used to obtain f̂. The statistics of the sample of f̂*_j are used as estimates of those of f̂. One problem with this approach is that the bias of f̂ may lead to a poor estimate of K[f] and thus to unrealistic simulated data.

An introduction to bootstrap methods can be found in [9, 12]. For an example of bootstrap methods to construct confidence intervals for estimates of a function based on smoothing splines, see [31].

$\hat\sigma_B^2 = \frac{\|y - \hat y\|^2}{\mathrm{tr}[(I - H_\lambda)^2]}.$

Simulation studies seem to indicate that this estimate has a smaller bias for a wider set of values of λ [6]. This property is desirable as λ is usually chosen adaptively. In some cases the effect of the trend can also be reduced using first- or second-order finite differences without having to choose a regularization parameter λ. For example, a first-order finite-difference estimate proposed by [21] is

$\hat\sigma_R^2 = \frac{1}{2(n-1)} \sum_{i=1}^{n-1} (y_{i+1} - y_i)^2.$

References

The bias of σ̂_R² is small if the local changes of μ(x) are small. In particular, the bias is zero for a linear trend. Other estimators of σ as well as performance comparisons can be found in [5, 6].

Methods

We have assumed a Gaussian noise distribution for the construction of confidence intervals. In addition, we have only considered linear operators and linear estimators. Nonlinear estimators arise even when the operator K is linear. For example, if σ and λ are estimated using the same data or if the penalized least-squares estimate of a includes interval constraints (e.g., positivity), then the estimate â is no longer linear in y. In some cases the use of bootstrap (resampling) methods allows us to assess statistical properties while relaxing the distributional and linearity assumptions. The idea is to simulate data y* as follows: the function estimate f̂ is used as a proxy for the unknown

1. Aguilar, O., Allmaras, M., Bangerth, W., Tenorio, L.: Statistics of parameter estimates: a concrete example. SIAM Rev. 57, 131 (2015)
2. Aronszajn, N.: Theory of reproducing kernels. Trans. Am. Math. Soc. 89, 337 (1950)
3. Backus, G., Gilbert, F.: Uniqueness in the inversion of inaccurate gross earth data. Philos. Trans. R. Soc. Lond. A 266, 123 (1970)
4. Benjamini, Y., Yekutieli, D.: False discovery rate–adjusted multiple confidence intervals for selected parameters. J. Am. Stat. Assoc. 100, 71 (2005)
5. Buckley, M.J., Eagleson, G.K., Silverman, B.W.: The estimation of residual variance in nonparametric regression. Biometrika 75, 189 (1988)
6. Carter, C.K., Eagleson, G.K.: A comparison of variance estimators in nonparametric regression. J. R. Stat. Soc. B 54, 773 (1992)
7. Cavalier, L.: Nonparametric statistical inverse problems. Inverse Probl. 24, 034004 (2008)
8. Craven, P., Wahba, G.: Smoothing noisy data with splines. Numer. Math. 31, 377 (1979)
9. Davison, A.C., Hinkley, D.V.: Bootstrap Methods and Their Application. Cambridge University Press, Cambridge (1997)
10.
Engl, H., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. Kluwer, Dordrecht (1996)
11. Engl H his chapter
12. Efron, B., Tibshirani, R.J.: An Introduction to the Bootstrap. Chapman & Hall, New York (1993)
13. Gu, C.: Smoothing Spline ANOVA Models. Springer, Berlin/Heidelberg/New York (2002)
14. Gu, C.: Smoothing noisy data via regularization: statistical perspectives. Inverse Probl. 24, 034002 (2008)
15. Green, P.J., Silverman, B.W.: Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach. Chapman & Hall, London (1993)
16. Härdle, W.K., Müller, M., Sperlich, S., Werwatz, A.: Nonparametric and Semiparametric Models. Springer, Berlin/Heidelberg/New York (2004)
17. Kaipio, J., Somersalo, E.: Statistical and Computational Inverse Problems. Springer, Berlin/Heidelberg/New York (2004)

function f. Noise ε* is simulated using a parametric or nonparametric method. In the parametric bootstrap, ε* is sampled from the assumed distribution whose parameters are estimated from the data. For example, if the ε_i are independent N(0, σ²), then ε*_i is sampled from N(0, σ̂²). In the nonparametric bootstrap, ε*_i is sampled with replacement from the vector of residuals of the fit. However, as Eqs. (8) and (9) show, even in the linear case, the residuals have to be corrected to behave approximately like the true errors. Of course, due to the bias and correlation of the residuals, these corrections are often difficult to derive and implement. Using ε*_i and f̂, one generates simulated data vectors y*_j = K[f̂] + ε*_j. For each such y*_j one computes

algorithms continually adjust the step size in accordance with the local variation of the solution, attempting to compute a numerical solution to within a given error tolerance at minimal cost. As a typical integration may run over thousands of steps, the task is ideally suited to proven methods from automatic control.

Assume that the problem to be solved is a dynamical system,

$\frac{dy}{dt} = f(y); \qquad y(0) = y_0, \qquad (1)$

with y(t) ∈ R^m. Without loss of generality, we may assume that the problem is solved numerically using a one-step integration procedure, explicit or implicit, written formally as

$y_{n+1} = \Phi_h(y_n); \qquad y_0 = y(0), \qquad (2)$

where the map Φ_h advances the solution one step of

18. Mair, B., Ruymgaart, F.H.: Statistical estimation in Hilbert scales. SIAM J. Appl. Math. 56, 1424 (1996)
19. Naiman, D.Q.: Conservative confidence bands in curvilinear regression. Ann. Stat. 14, 896 (1986)
20. O'Sullivan, F.: A statistical perspective on ill-posed inverse problems. Stat. Sci. 1, 502 (1986)
21. Rice, J.: Bandwidth choice for nonparametric regression. Ann. Stat. 12, 1215 (1984)
22. Rudin, W.: Functional Analysis. McGraw-Hill, New York (1973)
23. Stark, P.B., Tenorio, L.: A primer of frequentist and Bayesian inference in inverse problems. In: Biegler, L., et al. (eds.) Computational Methods for Large-Scale Inverse Problems and Quantification of Uncertainty, pp. 9–32. Wiley, Chichester (2011)
24. Stein, C.: Estimation of the mean of a multivariate normal distribution. Ann. Stat. 9, 1135 (1981)
25. Sun, J., Loader, C.R.: Simultaneous confidence bands for linear regression and smoothing. Ann. Stat. 22, 1328 (1994)
26. Tenorio, L., Andersson, F., de Hoop, M., Ma, P.: Data analysis tools for uncertainty quantification of inverse problems. Inverse Probl. 29, 045001 (2011)
27. Vogel, C.R.: Computational Methods for Inverse Problems. SIAM, Philadelphia (2002)
28. Wahba, G.: Bayesian "confidence intervals" for the cross-validated smoothing spline. J. R. Stat. Soc.
B 45, 133 (1983)
29. Wahba, G.: Spline Models for Observational Data. CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 50. SIAM, Philadelphia (1990)
30. Wahba G her chapter
31. Wang, Y., Wahba, G.: Bootstrap confidence intervals for smoothing splines and their comparison to Bayesian confidence intervals. J. Stat. Comput. Simul. 51, 263 (1995)

size h, from time t_n to t_{n+1} = t_n + h. Here the sequence y_n is the numerical approximation to the exact solution, y(t_n). The difference e_n = y_n − y(t_n) is the global error of the numerical solution. If the method is of convergence order p, and the vector field f in (1) is sufficiently differentiable, then ‖e_n‖ = O(h^p) as h → 0. The accuracy of the numerical solution can also be evaluated locally. The local error l_n is defined by

$y(t_{n+1}) + l_n = \Phi_h(y(t_n)). \qquad (3)$

Thus, if the method would take a step of size h, starting on the exact solution y(t_n), it will deviate from the exact solution at t_{n+1} by a small amount, l_n. If the method is of order p, the local error will satisfy

Step Size Control

Gustaf Söderlind
Centre for Mathematical Sciences, Numerical Analysis, Lund University, Lund, Sweden

$\|l_n\| = \varphi_n h^{p+1} + O(h^{p+2}), \qquad h \to 0. \qquad (4)$

Introduction

Step size control is used to make a numerical method that proceeds in a step-by-step fashion adaptive. This includes time stepping methods for solving initial value problems, nonlinear optimization methods, and continuation methods for solving nonlinear equations. The objective is to increase efficiency, but also includes managing the stability of the computation. This entry focuses exclusively on time stepping adaptivity in initial value problems. Special control

Here the principal error function φ_n varies along the solution, and depends on the problem (in terms of derivatives of f) as well as on the method. Using differential inequalities, it can be shown that the global and local errors are related by the a priori error bound

$\|e_n\| \propto \max_{m \le n} \frac{\|l_m\|}{h} \cdot \frac{e^{M[f]\,t_n} - 1}{M[f]}, \qquad (5)$

where M[f] is the logarithmic Lipschitz constant of f. Thus, the global error is bounded in terms of the local error per unit step, l_n/h. For this reason, one can manage the global error by choosing the step size h so as to keep ‖l_n‖/h = TOL during the integration, where TOL is a user-prescribed local error tolerance. The global error is then proportional to TOL, by a factor that reflects the intrinsic growth or decay of solutions to (1). Good initial value problem solvers usually produce numerical results that reflect this tolerance proportionality. By reducing TOL, one reduces the local error as well as the global error, while computational cost increases, as h ∝ TOL^{1/p}.

computing two results, y_{n+1} and ŷ_{n+1} from y_n, the solver estimates the local error by r_n = ‖y_{n+1} − ŷ_{n+1}‖. To control the error, the relation between step size and error is modeled by

$r_n = \hat\varphi_n h_n^k. \qquad (8)$

Here k is the order of the local error estimator. Depending on the estimator's construction, k may or may not equal p + 1, where p is the order of the method used to advance the solution. For control purposes, however,
it is sufficient that k is known, and that the method operates in the asymptotic regime, meaning that (8) is an accurate model of the error for the step sizes in use.

The common approach to varying the step size is multiplicative control

$h_{n+1} = \theta_n\, h_n, \qquad (9)$

where the factor θ_n needs to be determined so that the error estimate r_n is kept near the target value TOL for all n.

A simple control heuristic is derived by requiring that the next step size h_{n+1} solves the equation TOL = φ̂_n h_{n+1}^k; this assumes that φ̂ varies slowly. Thus, dividing this equation by (8), one obtains

$h_{n+1} = \left(\frac{\mathrm{TOL}}{r_n}\right)^{1/k} h_n. \qquad (10)$

This multiplicative control is found in many solvers. It is usually complemented by a range of safety measures, such as limiting the maximum step size increase, preventing "too small" step size changes, and special schemes for recomputing a step, should the estimated error be much larger than TOL.

Although it is possible to compute a posteriori global error estimates, such estimates are often costly. All widely used solvers therefore control the local error, relying on the relation (5), and the possibility of comparing several different numerical solutions computed for different values of TOL. There is no claim that the step size sequences are "optimal," but in all problems where the principal error function varies by several orders of magnitude, as is the case in stiff differential equations, local error control is an inexpensive tool that offers vastly increased performance. It is a necessity for efficient computations.

A time stepping method is made adaptive by providing a separate procedure for updating the step size as a function of the numerical solution. Thus, an adaptive method can be written formally as the interactive recursion

$y_{n+1} = \Phi_{h_n}(y_n), \qquad (6)$
$h_{n+1} = \Psi_{y_{n+1}}(h_n), \qquad (7)$

where the first equation represents the numerical method and the second the step size control. If Ψ_y ≡ I (the identity map), the scheme reduces to a constant step size method.
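As a deliberately minimal sketch of the elementary controller (10) in action, the following code drives an explicit Euler/Heun pair, using r_n = ‖y_{n+1} − ŷ_{n+1}‖ as the local error estimate with k = 2. The method pair, the accept/reject margin, and all names are illustrative choices, not part of this entry:

```python
import numpy as np

def integrate_adaptive(f, y0, t0, t1, tol=1e-6, h0=1e-2):
    """Explicit Euler / Heun pair with the elementary controller (10), k = 2."""
    t, y, h = t0, np.asarray(y0, dtype=float), h0
    while t < t1:
        h = min(h, t1 - t)
        k1 = f(t, y)
        k2 = f(t + h, y + h * k1)
        y_euler = y + h * k1                          # order 1
        y_heun = y + 0.5 * h * (k1 + k2)              # order 2
        r = np.linalg.norm(y_heun - y_euler) + 1e-16  # local error estimate r_n
        if r <= 2.0 * tol:                            # accept with a small margin
            t, y = t + h, y_heun
        h *= (tol / r) ** 0.5                         # h_{n+1} = (TOL/r_n)^{1/k} h_n
    return y

# usage: y' = -y on [0, 1]; the result should be close to exp(-1)
y_end = integrate_adaptive(lambda t, y: -y, [1.0], 0.0, 1.0)
```

Note that the step size update is applied after rejected steps as well, so an overly large h is automatically cut back and the step retried, in the spirit of the safety measures mentioned above.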
Otherwise, the interaction between the two dynamical systems implies that step size control interferes with the stability of the numerical method. For this reason, it is important that step size control algorithms are designed to increase efficiency without compromising stability.

Although it often works well, the control law (10) and its safety measures have several disadvantages that call for more advanced feedback control schemes. Control theory and digital signal processing, both based on linear difference equations, offer a wide range of proven tools that are suitable for controlling the step size. Taking logarithms, (10) can be written as the linear difference equation

Basic Multiplicative Control

Modern time stepping methods provide a local error estimate. By using two methods of different orders,

$\log h_{n+1} = \log h_n - \frac{1}{k} \log \hat r_n, \qquad (11)$

where r̂_n = r_n/TOL. This recursion continually changes the step size, unless log r̂_n is zero (i.e., r_n = TOL). If r̂_n > 1 the step size decreases, and if r̂_n < 1 it increases. Thus, the error r_n is kept near the set point TOL. As (11) is a summation process, the controller is referred to as an integrating controller, or I control. This integral action is necessary in order to eliminate a persistent error, and to find the step size that makes r_n = TOL.

The difference equation (11) may be viewed as using the explicit Euler method for integrating a differential equation that represents a continuous control. Just as there are many different methods for solving differential equations, however, there are many different discrete-time controllers that can potentially be optimized for different numerical methods or problem types, and offering different stability properties.

advanced controllers in existing codes, keeping in mind that a multistep controller is started either by using (10), or by merely putting all factors representing nonexistent starting data equal to one. Examples of how to choose the parameters in (13) are found in Table 1.

A causal digital filter is a linear difference equation of the form (12), converting the input signal log r̂ to an output log h. This implies that digital control and filtering are intimately related. There are several important filter structures that fit the purpose of step size control, all covered by the general controller (13). Among them are finite impulse response (FIR) filters; proportional–integral (PI) controllers; autoregressive (AR) filters; and moving average (MA) filters. These filter classes are not mutually exclusive but can be combined. The elementary controller (10) is a FIR filter, also known as a deadbeat controller.
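Since (11) is just a scalar recursion, its set-point behavior is easy to check numerically. In the sketch below, the controller acts on the error model (8) alone; the slowly varying principal error function is an arbitrary stand-in, and all names are illustrative:

```python
import math

# Simulate the I controller (11), h *= (TOL/r_n)^(1/k), acting on the error
# model (8): r_n = phihat_n * h_n^k. The error r_n should track the set point
# TOL as long as the (assumed) principal error function varies slowly.
TOL, k, h = 1e-4, 2, 1e-2
errors = []
for n in range(60):
    phihat = 1.0 + 0.5 * math.sin(0.1 * n)   # assumed slow variation
    r = phihat * h ** k                      # error model (8)
    errors.append(r)
    h *= (TOL / r) ** (1.0 / k)              # controller (10)/(11)
```

One step after any change in φ̂, the error returns to TOL exactly in this idealized setting, which is precisely the deadbeat behavior discussed below.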
Such controllers have the quickest dynamic response to variations in log φ̂, but also tend to produce nonsmooth step size sequences, and sometimes display ringing or stability problems. These problems can be eliminated by using PI controllers and MA filters that improve stability and suppress step size oscillations. Filter design is a matter of determining the filter coefficients with respect to order conditions and stability criteria, and is reminiscent of the construction of linear multistep methods [11].

General Multiplicative Control

In place of (11), a general controller takes the error sequence log r̂ = {log r̂_n} as input, and produces a step size sequence log h = {log h_n} via a linear difference equation,

$(E - 1)Q(E) \log h = -P(E) \log \hat r. \qquad (12)$

Here E is the forward shift operator, and P and Q are two polynomials of equal degree, making the recursion explicit. The special case (11) has Q(E) ≡ 1 and P(E) ≡ 1/k, and is a one-step controller, while (12) in general is a multistep controller. Finally, the factor E − 1 in (12) is akin to the consistency condition in linear multistep methods. Thus, if log r̂ ≡ 0, a solution to (12) is log h ≡ const.

If, for example, P(z) = β₁z + β₀ and Q(z) = z + α₀, then the recursion (12) is equivalent to the two-step multiplicative control,

$h_{n+1} = \left(\frac{\mathrm{TOL}}{r_n}\right)^{\beta_1} \left(\frac{\mathrm{TOL}}{r_{n-1}}\right)^{\beta_0} \left(\frac{h_n}{h_{n-1}}\right)^{-\alpha_0} h_n. \qquad (13)$

By taking logarithms, it is easily seen to correspond to (12). One could include more factors following the same pattern, but in general, it rarely pays off to use a longer step size – error history than two to three steps.

Stability and Frequency Response

Controllers are analyzed and designed by investigating the closed loop transfer function. In terms of the z transform of (12), the control action is

$\log h = -C(z) \log \hat r, \qquad (14)$

where the control transfer function is given by

$C(z) = \frac{P(z)}{(z - 1)Q(z)}. \qquad (15)$

Similarly, the error model (8) can be written
Because of the simple structure of (13), it is relatively straightforward to include more

$\log \hat r = k \log h + \log \hat\varphi - \log \mathrm{TOL}. \qquad (16)$
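In code, the two-step control law (13) is only a few lines. This sketch uses the PI.4.2 coefficients from Table 1 (kβ₁ = 3/5, kβ₀ = −1/5, no α₀ term); the function names and the closure-based state handling are illustrative:

```python
def make_pi_controller(tol, k, kbeta1=3/5, kbeta0=-1/5):
    """Two-step multiplicative control (13) with PI.4.2 coefficients (Table 1)."""
    state = {"r_prev": tol}        # pretend the previous error was on target
    def next_step(h, r):
        factor = ((tol / r) ** (kbeta1 / k)
                  * (tol / state["r_prev"]) ** (kbeta0 / k))
        state["r_prev"] = r
        return factor * h
    return next_step

ctrl = make_pi_controller(tol=1e-6, k=2)
h1 = ctrl(1e-2, 1e-6)   # on-target error: step size left unchanged
h2 = ctrl(h1, 1e-4)     # error 100x too large: step size is cut back
```

Compared with the deadbeat law (10), the negative kβ₀ term makes the controller react more cautiously to a single bad error estimate, which is what suppresses ringing.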

Step Size Control, Fig. 1 Time step adaptivity viewed as a feedback control system. The computational process takes a step size log h as input and produces an error estimate log r = k log h + log φ̂. Representing the ODE, the principal error function log φ̂ enters as an additive disturbance, to be compensated by the controller. The error estimate log r is fed back and compared to log TOL. The controller constructs the next step size through log h = C(z) · (log TOL − log r) (From [11])
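Controllers of the family in Table 1 fit the same two-step template (13). The sketch below implements the H211b filter (kβ₁ = kβ₀ = 1/b, α₀ = 1/b); the names and the closure-based state handling are illustrative:

```python
def make_h211b(tol, k, b=4.0):
    """H211b digital filter (Table 1) in the two-step form (13)."""
    state = {"r_prev": tol, "h_prev": None}   # start-up: nonexistent data ~ on target
    def next_step(h, r):
        h_prev = state["h_prev"] if state["h_prev"] is not None else h
        factor = ((tol / r) ** (1.0 / (b * k))
                  * (tol / state["r_prev"]) ** (1.0 / (b * k))
                  * (h / h_prev) ** (-1.0 / b))
        state["r_prev"], state["h_prev"] = r, h
        return factor * h
    return next_step

ctrl = make_h211b(tol=1e-6, k=2, b=4.0)
h_new = ctrl(1e-2, 1e-4)   # error 100x too large: h shrinks, but gently
```

For this first call with k = 2 and b = 4, h is cut only to about 0.56 h, where the deadbeat law (10) would cut it to 0.1 h; this damped response is what produces the smooth step size sequences the filter is designed for.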

These relations and their interaction are usually illus- Step Size Control, Table 1 Some recommended two-step con- trated in a block diagram, see Fig. 1. trollers. The H 211 controllers produce smooth step size se- Overall stability depends on the interaction between quences, using a moving average low-pass filter. In H 211b,the filter can be adjusted. Starting at b D 2 it is a deadbeat (FIR) the controller C.z/ and the computational process. filter; as the parameter b increases, dynamic response slows and Inserting (16)into(14) and solving for log h yields high frequency suppression (smoothing) increases. Note that the ˇj coefficients are given in terms of the product kˇj for use with error estimators of different orders k.The˛ coefficient is C.z/ C.z/ log h D log 'O C log TOL: however independent of k (From [11]) 1 C kC.z/ 1 C kC.z/ (17) kˇ1 kˇ0 ˛0 Type Name Usage 3=5 1=5 – PI PI.4.2 Nonstiff solvers Here the closed loop transfer function H.z/ W log 'O 7! 1=b 1=b 1=b MA H 211b Stiff solvers; log h is defined by b 2 Œ2; 6 1=6 1=6 –MA+PIH 211 PI Stiff problems, smooth solutions C.z/ P.z/ H.z/ D D : (18) 1 C kC.z/ .z  1/Q.z/ C kP .z/ sequence is of importance, for example, to avoid higher It determines the performance of the combined system order BDF methods to suffer stability problems. of step size controller and computational process, and, in particular, how successful the controller will be in adjusting log h to log 'O so that log r log TOL. For a controller or a filter to be useful, the closed Implementation and Modes of Operation loop must be stable. This is determined by the poles of H.z/, which are the roots of the characteristic equation Carefully implemented adaptivity algorithms are cen- .z  1/Q.z/ C kP .z/ D 0. These must be located well tral for the code to operate efficiently and reliably for inside the unit circle, and preferably have positive real broad classes of problems. 
Apart from the accuracy parts, so that homogeneous solutions decay quickly requirements, which may be formulated in many differ- without oscillations. ent ways, there are several other factors of importance To asses frequency response, one takes log 'O D in connection with step size control. fei!ng with ! 2 Œ0;  to investigate the output log h D H.ei! /fei!ng. The amplitude jH.ei!/j measures the EPS Versus EPUS attenuation of the frequency !. By choosing P such For a code that emphasizes asymptotically correct  that P.ei! / D 0 for some !, it follows that H.ei! / D error estimates, controlling the local error per unit 0. Thus, zeros of P.z/ block signal transmission.The step klnk=hn is necessary in order to accumulate a natural choice is ! D  so that P.1/ D 0,as targeted global error, over a fixed integration range, this will annihilate .1/n oscillations, and produce a regardless of the number of steps needed to complete smooth step size sequence. This is achieved by the the integration. Abbreviated EPUS, this approach is two H211 controllers in Table 1. A smooth step size viable for nonstiff problems, but tends to be costly for Step Size Control 1375 stiff problems, where strong dissipation usually means di lOi that the global error is dominated by the most recent D : (21) TOL TOL  i C TOL jyi j local errors. There, controlling the local error per step kl k, referred to as EPS, is often a far more efficient n Most codes employ two different tolerance parameters, option, if less well aligned with theory. In modern ATOL and RTOL,definedbyATOL D TOL   and codes, the trend is generally to put less emphasis i i RTOL D TOL, respectively, replacing the denominator on asymptotically correct estimates, and control klnk in (21)byATOL C RTOL jy j. Thus, the user controls directly. This has few practical drawbacks, but it makes i i the accuracy by the vector ATOL and the scalar RTOL. 
it less straightforward to compare the performance of For scaling purposes, it is also important to note that two different codes. TOL and RTOL are dimensionless, whereas ATOL is not. The actual computational setup will make the step Computational Stability size control operate differently as the user-selected Just as a well-conditioned problem depends continu- tolerance parameters affect both the set point and the ously on the data, the computational procedure should control objective. depend continuously on the various parameters that control the computation. In particular, for tolerance proportionality, there should be constants c and C such Interfering with the Controller that the global error e can be bounded above and below, In most codes, a step is rejected if it exceeds TOL by a small amount, say 20 %, calling for the step to c  TOL ÄkekÄC  TOL ; (19) be recomputed. As a correctly implemented controller is expectation-value correct, a too large error is al- where the method is tolerance proportional if  D 1. most invariably compensated by other errors being too The smaller the ratio C=c, the better is the compu- small. It is, therefore, in general harmless to accept tational stability,butC=c can only be made small steps that exceed TOL by as much as a factor of 2, and with carefully implemented tools for adaptivity. Thus, indeed often preferable to minimize interference with with the elementary controller (10), prevented from the controller’s dynamics. making small step size changes, C=c is typically large, Other types of interference may come from con- whereas if the controller is based on a digital filter (here ditions that prevent “small” step size changes, as this H211b with b D 4,cf.Table1) allowing a continual might call for a refactorization of the Jacobian. How- change of the step size, the global error becomes a ever, such concerns are not warranted with smooth smooth function of TOL,seeFig.2. 
This also shows controllers, which usually make small enough changes that the behavior of an implementation is significantly not to disturb the Newton process beyond what can be affected by the control algorithms, and how they are managed. On the contrary, a smoothly changing step implemented. size is beneficial for avoiding instability in multistep methods such as the BDF methods. S Absolute and Relative Errors It is however necessary to interfere with the con- All modern codes provide options for controlling both troller’s action when there is a change of method order, absolute and relative errors. If at any given time, the or when a too large step size change is suggested. This estimated error is lO and the computed solution is y,then is equivalent to encountering an error that is much a weighted error vector d with components larger or smaller than TOL. In the first case, the step needs to be rejected, and in the second, the step size increase must be held back by a limiter. lOi di D (20) i Cjyi j is constructed, where i is a scaling factor, determining Special Problems a gradual switchover from relative to absolute error as yi ! 0. The error (and the step) is accepted if kdkÄ Conventional multiplicative control is not useful in TOL, and the expression TOL=kdk corresponds to the connection with geometric integration, where it fails factors TOL=r in (10)and(13). By (20), to preserve structure. The interaction (6, 7)showsthat 1376 Step Size Control

Step Size Control, Fig. 2 Global error vs. TOL for a linear multistep code applied to a stiff nonlinear test problem (both panels plot achieved accuracy against TOL, each on the range 10⁻¹⁰–10⁻⁴). Left panel shows results when the controller is based on (10). In the right panel, it has been replaced by the digital filter H211b. Although computational effort remains unchanged, stability is much enhanced. The graphs also reveal that the code is not tolerance proportional

adaptive step size selection adds dynamics, interfering with structure preserving integrators.

A one-step method Φ_h: y_n ↦ y_{n+1} is called symmetric if Φ_h^{-1} = Φ_{-h}. This is a minimal requirement for the numerical integration of, for example, reversible Hamiltonian systems, in order to nearly preserve action variables in integrable problems. To make such a method adaptive, symmetric step size control is also needed. An invertible step size map Ψ_y: R → R is called symmetric if Ψ_y is an involution, see Fig. 3. A symmetric Ψ_y then maps h_{n-1} to h_n and h_n to h_{n-1}, and only depends on y_n; with these conditions satisfied, the adaptive integration can be run in reverse time and retrace the numerical trajectory that was generated in forward time, [8]. However, this cannot be achieved by multiplicative controllers (9), and a special, nonlinear controller is therefore necessary.

An explicit control recursion satisfying the requirements is either additive or inverse-additive, with the latter being preferable. Thus, a controller of the form

$\frac{1}{h_n} - \frac{1}{h_{n-1}} = G(y_n) \qquad (22)$

can be used, where the function G needs to be chosen with respect to the symmetry and geometric properties of the differential equation to be solved. This approach corresponds to constructing a Hamiltonian continuous control system, which is converted to the discrete controller (22) by geometric integration of the control system. This leaves the long-term behavior of the geometric integrator intact, even in the presence of step size variation. It is also worth noting that (22)

Step Size Control, Fig. 3 Symmetric adaptive integration in forward time (upper part), and reverse time (lower part) illustrate the interaction (6, 7). The symmetric step size map Ψ_y governs both h and −h (From [8])

3. Gustafsson, K.: Control theoretic techniques for stepsize selection in explicit Runge–Kutta methods.
ACM TOMS 17, 533–554 (1991)
4. Gustafsson, K.: Control theoretic techniques for stepsize selection in implicit Runge–Kutta methods. ACM TOMS 20, 496–517 (1994)
5. Gustafsson, K., Söderlind, G.: Control strategies for the iterative solution of nonlinear equations in ODE solvers. SIAM J. Sci. Comp. 18, 23–40 (1997)
6. Hairer, E., Nørsett, S.P., Wanner, G.: Solving Ordinary Differential Equations I: Nonstiff Problems, 2nd edn. Springer, Berlin (1993)
7. Hairer, E., Wanner, G.: Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, 2nd edn. Springer, Berlin (1996)
8. Hairer, E., Söderlind, G.: Explicit, time reversible, adaptive step size control. SIAM J. Sci. Comp. 26, 1838–1851 (2005)
9. Hairer, E., Lubich, C., Wanner, G.: Geometric Numerical Integration. Structure-Preserving Algorithms for Ordinary Differential Equations, 2nd edn. Springer, Berlin (2006)
10. Shampine, L., Gordon, M.: Computer Solution of Ordinary Differential Equations: The Initial Value Problem. Freeman, San Francisco (1975)
11. Söderlind, G.: Digital filters in adaptive time-stepping. ACM Trans. Math. Softw. 29, 1–26 (2003)

generates a smooth step size sequence, as h_n − h_{n-1} = O(h_n h_{n-1}). This type of control does not work with an error estimate, but rather tracks a prescribed target function; it corresponds to keeping h Q(y) = const., where Q is a given functional reflecting the geometric structure of (1). One can then take G(y) = grad Q(y) · f(y)/Q(y). For example, in celestial mechanics, Q(y) could be selected as total centripetal acceleration; then the step size is small when centripetal acceleration is large and vice versa, concentrating the computational effort to those intervals where the solution of the problem changes rapidly and is more sensitive to perturbations.

Literature

Step size control has a long history, starting with the first initial value problem solvers around 1960,

12.
Söderlind, G., Wang, L.: Adaptive time-stepping and computational stability. J. Comp. Methods Sci. Eng. 185, 225–243 (2006)
13. Stoffer, D.: Variable steps for reversible integration methods. Computing 55, 1–22 (1995)

often using a simple step doubling/halving strategy. The controller (10) was soon introduced, and further developments quickly followed. Although the schemes were largely heuristic, performance tests and practical experience developed working standards. Monographs such as [1, 2, 6, 7, 10] all offer detailed descriptions. The first full control theoretic analysis is found in [3, 4], explaining and overcoming some previously noted difficulties, developing proportional-integral (PI) and autoregressive (AR) controllers. Synchronization with Newton iteration is discussed in [5]. A complete framework for using digital filters and signal processing is developed in [11], focusing on moving average (MA) controllers. Further developments on how to obtain improved computational stability are discussed in [12].

The special needs of geometric integration are discussed in [8], although the symmetric controllers are not based on error control. Error control in implicit, symmetric methods is analyzed in [13].

Stochastic and Statistical Methods in Climate, Atmosphere, and Ocean Science

Daan Crommelin¹,² and Boualem Khouider³
¹Scientific Computing Group, Centrum Wiskunde and Informatica (CWI), Amsterdam, The Netherlands
²Korteweg-de Vries Institute for Mathematics, University of Amsterdam, Amsterdam, The Netherlands
³Department of Mathematics and Statistics, University of Victoria, Victoria, BC, Canada

Introduction

References

1. Butcher, J.C.: Numerical Methods for Ordinary Differential Equations. Wiley, Chichester (2008)
2. Gear, C.W.: Numerical Initial Value Problems in Ordinary Differential Equations. Prentice Hall, Englewood Cliffs (1971)

The behavior of the atmosphere, oceans, and climate is intrinsically uncertain. The basic physical principles that govern atmospheric and oceanic flows are well known, for example, the Navier-Stokes equations for fluid flow, thermodynamic properties of moist air, and the effects of density stratification and Coriolis force.

Notwithstanding, there are major sources of randomness and uncertainty that prevent perfect prediction and complete understanding of these flows. The climate system involves a wide spectrum of space and time scales, due to processes occurring on the order of microns and milliseconds, such as the formation of cloud and rain droplets, up to global phenomena involving annual and decadal oscillations, such as the El Niño-Southern Oscillation (ENSO) and the Pacific Decadal Oscillation (PDO) [5]. Moreover, climate records display a spectral variability ranging from 1 cycle per month to 1 cycle per 100,000 years [23]. The complexity of the climate system stems in large part from the inherent nonlinearities of fluid mechanics and the phase changes of water substances. The atmosphere and oceans are turbulent, nonlinear systems that display chaotic behavior (e.g., [39]). The time evolutions of the same chaotic system starting from two slightly different initial states diverge exponentially fast, so that chaotic systems are marked by limited predictability. Beyond the so-called predictability horizon (on the

making explicit the margins of uncertainty inherent to any prediction.

Statistical Methods

For assessment and validation of models, a comparison of individual model trajectories is typically not suitable, because of the uncertainties described earlier. Rather, the statistical properties of models are used to summarize model behavior and to compare against other models and against observations. Examples are the mean and variance of spatial patterns of rainfall or sea surface temperature, the time evolution of global mean temperature, and the statistics of extreme events (e.g., hurricanes or heat waves). Part of the statistical methods used in this context is fairly general, not specifically tied to climate-atmosphere-ocean science (CAOS). However, other methods are rather specific for CAOS applications, and we will highlight some of these here.
General references on statistical methods in order of 10 days for the atmosphere), initial state CAOS are [61, 62]. uncertainties (e.g., due to imperfect observations) have grown to the point that straightforward forecasts are no EOFs longer useful. A technique that is used widely in CAOS is Principal Another major source of uncertainty stems from Component Analysis (PCA), also known as Empirical the fact that numerical models for atmospheric and Orthogonal Function (EOF) analysis in CAOS. Con- oceanic flows cannot describe all relevant physical sider a multivariate dataset ˆ 2 RM N .InCAOSthis processes at once. These models are in essence will typically be a time series .t1/; .t2/; : : : ; .tN / M discretized partial differential equations (PDEs), and where each .tn/ 2 R is a spatial field (of, e.g., tem- the derivation of suitable PDEs (e.g., the so-called perature or pressure). For simplicity we assume that primitive equations) from more general ones that Pthe time mean has been subtracted from the dataset, so are less convenient for computation (e.g., the full N nD1 ˆmn D 0 8m.LetC be the M  M (sample) Navier-Stokes equations) involves approximations and covariance matrix for this dataset: simplifications that introduce errors in the equations. Furthermore, as a result of spatial discretization of 1 C D ˆˆT : the PDEs, numerical models have finite resolution N  1 so that small-scale processes with length scales m below the model grid scale are not resolved. These We denote by .m;v /, m D 1;:::;M the ordered limitations are unavoidable, leading to model error and eigenpairs of C : uncertainty. m m The uncertainties due to chaotic behavior and Cv D m v ;m  mC1 8m: unresolved processes motivate the use of stochastic and statistical methods for modeling and understanding The ordering of the (positive) eigenvalues implies that climate, atmosphere, and oceans. 
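In code, the eigendecomposition above can be sketched as follows. This is a minimal illustration on a synthetic dataset of our own; the variable names and the random data are not from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic dataset: M = 10 spatial points, N = 2000 time samples,
# scaled so that most variance lies in a few directions.
M, N = 10, 2000
phi = rng.standard_normal((M, N)) * np.linspace(3.0, 0.5, M)[:, None]
phi -= phi.mean(axis=1, keepdims=True)      # remove the time mean

C = phi @ phi.T / (N - 1)                   # sample covariance, M x M
lam, v = np.linalg.eigh(C)                  # eigenpairs of the symmetric C
lam, v = lam[::-1], v[:, ::-1]              # sort so that lam_m >= lam_{m+1}

alpha = v.T @ phi                           # PC time series alpha_m(t_n)
explained = lam / lam.sum()                 # variance fraction per EOF
```

A reduction then keeps only the leading M' columns of v, i.e., v[:, :Mp] @ alpha[:Mp] approximates Φ while retaining the fraction explained[:Mp].sum() of the total variance.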
The ordering of the (positive) eigenvalues implies that the projection of the dataset onto the leading eigenvector v^1 gives the maximum variance among all projections. The next eigenvector v^2 gives the maximum variance among all projections orthogonal to v^1, v^3 gives the maximum variance among all projections orthogonal to v^1 and v^2, etc. The fraction λ_m / Σ_l λ_l equals the fraction of the total variance of the data captured by projection onto the m-th eigenvector v^m. The eigenvectors v^m are called the Empirical Orthogonal Functions (EOFs) or Principal Components (PCs). Projecting the original dataset Φ onto the leading EOFs, i.e., the projection/reduction

    ψ'(t_n) = Σ_{m=1}^{M'} α_m(t_n) v^m,   M' ≪ M,

can result in a substantial data reduction while retaining most of the variance of the original data.

PCA is discussed in great detail in [27] and [59]. Over the years, various generalizations and alternatives for PCA have been formulated, for example, Principal Interaction and Oscillation Patterns [24], Nonlinear Principal Component Analysis (NLPCA) [49], and Nonlinear Laplacian Spectral Analysis (NLSA) [22]. These more advanced methods are designed to overcome limitations of PCA relating to the nonlinear or dynamical structure of datasets.

In CAOS, the EOFs v^m often correspond to spatial patterns. The shape of the patterns of leading EOFs can give insight into the physical-dynamical processes underlying the dataset Φ. However, this must be done with caution, as the EOFs are statistical constructions and cannot always be interpreted as having physical or dynamical meaning in themselves (see [50] for a discussion).

The temporal properties of the (time-dependent) coefficients α_m(t) can be analyzed by calculating, e.g., autocorrelation functions. Also, models for these coefficients can be formulated (in terms of ordinary differential equations (ODEs), stochastic differential equations (SDEs), etc.) that aim to capture the main dynamical properties of the original dataset or model variables ψ(t). For such reduced models, the emphasis is usually on the dynamics on large spatial scales and long time scales. These are embodied by the leading EOFs v^m, m = 1, ..., M', and their corresponding coefficients α_m(t), so that a reduced model (M' ≪ M) can be well capable of capturing the main large-scale dynamical properties of the original dataset.

Inverse Modeling
One way of arriving at reduced models is inverse modeling, i.e., the dynamical model is obtained through statistical inference from time series data. The data can be the result of, e.g., projecting the dataset onto the EOFs (in which case the data are time series of α(t)). These models are often cast as SDEs whose parameters must be estimated from the available time series. If the SDEs are restricted to have linear drift and additive noise (i.e., restricted to be those of a multivariate Ornstein-Uhlenbeck (OU) process), the estimation can be carried out for high-dimensional SDEs rather easily. That is, assume the SDEs have the form

    dα(t) = B α(t) dt + σ dW(t),   (1)

in which B and σ are both constant M' × M' matrices and W(t) is an M'-dimensional vector of independent Wiener processes (for simplicity we assume that α has zero mean). The parameters of this model are the matrix elements of B and σ. They can be estimated from two (lagged) covariance matrices of the time series. If we define

    R^0_ij = E[α_i(t) α_j(t)],   R^τ_ij = E[α_i(t) α_j(t + τ)],

with E denoting expectation, then for the OU process (1) we have the relations

    R^τ = exp(Bτ) R^0

and

    B R^0 + R^0 B^T + σ σ^T = 0.

The latter of these is the fluctuation-dissipation relation for the OU process. By estimating R^0 and R^τ (with some τ > 0) from time series of α, estimates for B and A := σ σ^T can be easily computed using these relations. This procedure is sometimes referred to as linear inverse modeling (LIM) in CAOS [55]. The matrix σ cannot be uniquely determined from A; however, any σ for which A = σ σ^T (e.g., obtained by Cholesky decomposition of A) will result in an OU process with the desired covariances R^0 and R^τ.

As mentioned, LIM can be carried out rather easily for multivariate processes. This is a major advantage of LIM. A drawback is that the OU process (1) cannot capture non-Gaussian properties, so that LIM can only be used for data with Gaussian distributions. Also, the estimated B and A are sensitive to the choice of τ, unless the available time series is an exact sampling of (1).
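The LIM relations can be exercised on synthetic data. The sketch below is our own illustration (the diagonal drift, step sizes, and lag are arbitrary choices): it generates an exactly sampled two-dimensional OU process, estimates R^0 and R^τ, and recovers B and A = σσ^T:

```python
import numpy as np

rng = np.random.default_rng(1)

# True OU parameters: dα = B α dt + σ dW, with diagonal drift for simplicity.
b = np.array([-0.5, -0.25])          # diagonal of B
# Exact discrete sampling with step h: α_{n+1} = e^{Bh} α_n + η_n, where
# (for diagonal B and σ = I) Var(η_i) = (1 - e^{2 b_i h}) / (-2 b_i).
h, n_steps = 0.1, 200_000
decay = np.exp(b * h)
eta_std = np.sqrt((1.0 - np.exp(2.0 * b * h)) / (-2.0 * b))
noise = rng.standard_normal((n_steps, 2)) * eta_std
traj = np.empty((n_steps, 2))
a = np.zeros(2)
for n in range(n_steps):
    a = decay * a + noise[n]
    traj[n] = a

# LIM estimation from the series: lagged covariances with lag τ = 5h.
lag = 5
tau = lag * h
R0 = traj.T @ traj / n_steps                          # R^0_ij = E α_i(t) α_j(t)
Rtau = traj[:-lag].T @ traj[lag:] / (n_steps - lag)   # R^τ_ij = E α_i(t) α_j(t+τ)

# exp(Bτ) = (R^τ)^T (R^0)^{-1} for this index convention; invert it with a
# matrix logarithm computed via an eigendecomposition.
Mhat = Rtau.T @ np.linalg.inv(R0)
w, V = np.linalg.eig(Mhat)
B_est = (V @ np.diag(np.log(w)) @ np.linalg.inv(V)).real / tau

# Fluctuation-dissipation relation: A = σσ^T = -(B R^0 + R^0 B^T).
A_est = -(B_est @ R0 + R0 @ B_est.T)
```

With σ = I here, A_est should be close to the identity, and B_est close to diag(-0.5, -0.25), up to sampling error in the covariance estimates.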

Estimating diffusion processes with non-Gaussian Let MN be the maximum of this sequence, MN D properties is much more complicated. There are var- maxfr1;:::;rN g. If the probability distribution for MN ious estimation procedures available for SDEs with can be rescaled so that it converges in the limit of in- nonlinear drift and/or multiplicative noise; see, e.g., creasingly long sequences (i.e., N !1), it converges [30, 58] for an overview. However, the practical use to a generalized extreme value (GEV) distribution. of these procedures is often limited to SDEs with More precisely, if there are sequences aN .> 0/ and very low dimensions, due to curse of dimension or to bN such that Prob..MN  bN /=aN Ä z/ ! G.z/ as computational feasibility. For an example application N !1,then in CAOS, see, e.g., [4]. Â Ä Á à The dynamics of given time series can also be z   1= G.z/ D exp  1 C : captured by reduced models that have discrete state  spaces, rather than continuous ones as in the case of SDEs. There are a number of studies in CAOS that G.z/ is a GEV distribution, with parameters  (lo- employ finite-state Markov chains for this purpose cation), >0(scale), and (shape). It combines (e.g., [8,48,53]). It usually requires discretization of the the Frechet´ ( >0), Weibull ( <0), and Gumbel state space; this can be achieved with, e.g., clustering ( ! 0) families of extreme value distributions. Note methods. A more advanced methodology, building on that this result is independent of the precise distribution the concept of Markov chains yet resulting in contin- of the random variables rn. The parameters ; ; can uous state spaces, is that of hidden Markov models. be inferred by dividing the observations r1;r2;::: in These have been used, e.g., to model rainfall data (e.g., blocks of equal length and considering the maxima on [3, 63]) and to study regime behavior in large-scale these blocks (the so-called block maxima approach). atmospheric dynamics [41]. 
Yet a more sophisticated An alternative method for characterizing extremes, methodology that combines the clustering and Markov making more efficient use of available data than the chain concepts, specifically designed for nonstationary block maxima approach, is known as the peaks-over- processes, can be found in [25]. threshold (POT) approach. The idea is to set a thresh- old, say r, and study the distribution of all observa- Extreme Events tions rn that exceed this threshold. Thus, the object The occurrence of extreme meteorological events, such of interest is the conditional probability distribution   as hurricanes, extreme rainfall, and heat waves, is Prob.rn  r > z j rn >r/, with z >0. Under of great importance because of their societal impact. fairly general conditions, this distribution converges to Statistical methods to study extreme events are there- 1  H.z/ for high thresholds r,whereH.z/ is the fore used extensively in CAOS. The key question for generalized Pareto distribution (GPD): studying extremes with statistical methods is to be able  à to assess the probability of certain events, having only a z 1= H.z/ D 1  1 C : dataset available that is too short to contain more than Q a few of these events (and occasionally, too short to contain even a single event of interest). For example, The parameters of the GPD family of distributions are how can one assess the probability of sea water level directly related to those of the GEV distribution: the at some coastal location being more than 5 m above shape parameter is the same in both, whereas the average if only 100 years of observational data for that threshold-dependentscale parameter is Q D C .r  location is available, with a maximum of 4 m above / with  and  as in the GEV distribution. average? Such questions can be made accessible using By inferring the parameters of the GPD or GEV extreme value theory. 
General introductions to extreme distributions from a given dataset, one can calculate value theory are, e.g., [7]and[11]. For recent research probabilities of extremes that are not present them- on extremes in the context of climate science, see, e.g., selves in that dataset (but have the same underlying [29] and the collection [1]. distribution as the available data). In principle, this The classical theory deals with sequences or obser- makes it possible to assess risks of events that have not vations of N independent and identically distributed been observed, provided the conditions on convergence (iid) random variables, denoted here by r1;:::;rN . to GPD or GEV distributions are met. Stochastic and Statistical Methods in Climate, Atmosphere, and Ocean Science 1381
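As a minimal illustration of the block maxima approach (our own sketch, not from the article), the following fits the Gumbel member of the GEV family (the ξ → 0 limit) by the method of moments. For iid standard exponential data, the block maxima are known to follow, to good approximation, a Gumbel distribution with μ ≈ ln n and σ ≈ 1, where n is the block length:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic "observations": iid standard exponential variates.
block_len, n_blocks = 500, 2000
r = rng.exponential(scale=1.0, size=block_len * n_blocks)

# Block maxima approach: the maximum of each block of equal length.
maxima = r.reshape(n_blocks, block_len).max(axis=1)

# Method-of-moments fit of a Gumbel distribution (GEV with ξ → 0):
# mean = μ + γσ (γ ≈ 0.5772, Euler-Mascheroni), variance = π²σ²/6.
sigma_hat = np.sqrt(6.0 * maxima.var()) / np.pi
mu_hat = maxima.mean() - 0.5772 * sigma_hat

def exceed_prob(z):
    """Probability of a block maximum exceeding z under the fitted Gumbel."""
    return 1.0 - np.exp(-np.exp(-(z - mu_hat) / sigma_hat))
```

The fitted distribution assigns a small but nonzero probability to levels larger than anything in the dataset, which is exactly the kind of risk assessment for unobserved events described above.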

As mentioned, classical results on extreme value theory apply to iid random variables. These results have been generalized to time-correlated random variables, both stationary and nonstationary [7]. This is important for weather and climate applications, where the datasets considered in the context of extremes are often time series. Another relevant topic is the development of multivariate extreme value theory [11].

Stochastic Methods

Given the sheer complexity of climate-atmosphere-ocean (CAO) dynamics, when studying the global climate system or some of its global oscillation patterns such as ENSO or PDO, it is natural to try to separate the global dynamics occurring on longer time scales from the local processes which occur on much shorter scales. Moreover, as mentioned before, climate and weather prediction models are based on a numerical discretization of the equations of motion, and due to limitations in computing resources, it is simply impossible to represent the wide range of space and time scales involved in CAO. Instead, general circulation models (GCMs) rely on parameterization schemes to represent the effect of the small/unresolved scales on the large/resolved scales. Below, we briefly illustrate how stochastic models are used in CAO both to build theoretical models that separate small-scale (noise) and large-scale dynamics and to "parameterize" the effect of small scales on large scales. A good snapshot of the state of the art in stochastic climate modeling research during the last two decades or so can be found in [26, 52].

Model Reduction for Noise-Driven Large-Scale Dynamics
In an attempt to explain the observed low-frequency variability of CAO, Hasselmann [23] splits the system into slow climate components (e.g., oceans, biosphere, cryosphere), denoted by the vector x, and fast components representing the weather, i.e., atmospheric variability, denoted by a vector y. The full climate system takes the form

    dx/dt = u(x, y)   (2)
    dy/dt = v(x, y),

where t is time and u(x, y) and v(x, y) contain the external forcing and internal dynamics that couple the slow and fast variables.

Hasselmann assumes a large separation between the slow and fast time scales:

    τ_y^(−1) = O( | y_j^(−1) dy_j/dt | ) ≫ τ_x^(−1) = O( | x_i^(−1) dx_i/dt | )

for all components i and j. This time scale separation was used earlier to justify statistical dynamical models (SDMs), used then to track the dynamics of the climate system alone under the influence of external forcing. Without the variability due to the internal interactions of CAO, however, the SDMs failed badly to explain the observed "red" spectrum which characterizes the low-frequency variability of CAO.

Hasselmann made the analogy with Brownian motion (BM), which models the erratic movements of a few large particles immersed in a fluid and subject to bombardment by the rapidly moving fluid molecules, as a "natural" extension of the SDM models. Moreover, Hasselmann [23] assumes that the variability of x can be divided into a mean tendency ⟨dx/dt⟩ = ⟨u(x, y)⟩ (here ⟨·⟩ denotes the average with respect to the joint distribution of the fast variables) and a fluctuating tendency dx′/dt = u(x, y) − ⟨u(x, y)⟩ = u′(x, y) which, by analogy with the Brownian motion problem, is assumed to be a pure diffusion process or white noise. However, unlike in BM, Hasselmann argued that for the weather and climate system the statistics of y are not in equilibrium but depend on the slowly evolving large-scale dynamics, and thus can only be obtained empirically. To avoid linear growth of the covariance matrix ⟨x′ ⊗ x′⟩, Hasselmann assumes a damping term proportional to the divergence of the background frequency spectrum F(0) of ⟨x′ ⊗ x′⟩, where δ(ω − ω′) F_ij(ω) = ⟨V_i(ω) V_j(ω′)⟩ with V(ω) = (1/2π) ∫_{−∞}^{+∞} u′(t) e^(−iωt) dt.
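The essence of Hasselmann's mechanism, a slow variable integrating fast white-noise forcing into a red spectrum, can be illustrated with a scalar linear model (our own sketch; the damping rate and step sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)

# Scalar "climate" variable with weak damping, driven by white noise:
# dx = -lam * x dt + dW, discretized with the Euler-Maruyama scheme.
lam, h, n = 0.1, 1.0, 2 ** 16
noise = rng.standard_normal(n)
x = np.empty(n)
x[0] = 0.0
for k in range(1, n):
    x[k] = x[k - 1] - lam * x[k - 1] * h + np.sqrt(h) * noise[k]

# Periodogram: power at low frequencies dominates (a "red" spectrum),
# approaching the Lorentzian ~ 1 / (lam^2 + omega^2) of the OU process.
spec = np.abs(np.fft.rfft(x)) ** 2 / n
freq = np.fft.rfftfreq(n, d=h)

low = spec[(freq > 0) & (freq < 0.005)].mean()
high = spec[freq > 0.2].mean()
```

The band-averaged ratio low/high is large (of order hundreds for this damping rate), reproducing the qualitative reddening that the SDMs without internal variability could not explain.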
This leads to the Fokker-Planck equation [23]

    ∂p(x, t)/∂t + ∇_x·( û(x) p(x, t) ) = ∇_x·( D ∇_x p(x, t) )   (3)

for the distribution p(x, t) of x(t) as a stochastic process given that x(0) = x_0, where D is the normalized covariance matrix D = ⟨x′ ⊗ x′⟩/2t and û = ⟨u⟩ − ∇_x·F(0). Given the knowledge of the mean statistical forcing ⟨u⟩, the evolution equation for p can be determined from the time series of x obtained either from a climate model simulation or from observations. Notice also that for a large number of slow variables x_i, the PDE in (3) is impractical; instead, one can always resort to Monte Carlo simulations using the associated Langevin equation:

    dx = û(x) dt + Σ(x) dW_t   (4)

where Σ(x) Σ(x)^T = D(x). However, the functional dependence of û and D remains ambiguous, and relying on rather empirical methods to define such terms is unsatisfactory. Nonetheless, Hasselmann introduced a "linear feedback" version of his model where the drift or propagation term is a negative definite linear operator, û(x) = −U x, and D is constant, independent of x, as an approximation for short-time excursions of the climate variables. In this case, p(x, t) is simply a Gaussian distribution whose time-dependent mean and variance are determined by the matrices D and U, as noted in the inverse modeling section above.

Due to its simplicity, the linear feedback model is widely used to study the low-frequency variability of various climate processes. It is, for instance, used in [17] to reproduce the observed red spectrum of the sea surface temperature in midlatitudes using simulation data from a simplified coupled ocean-atmosphere model. However, this linear model has severe limitations; for example, it cannot represent the deviations from a Gaussian distribution displayed by some climate phenomena [13, 14, 17, 36, 51, 54]. It is thus natural to try to reincorporate a nonlinearity of some kind into the model. The most popular idea consists in making the matrix D, or equivalently Σ, dependent on x (quadratically for D or linearly for Σ, as a next-order Taylor correction), to which is tied the notion of multiplicative versus additive (when D is constant) noise [37, 60]. Beside being a crude approximation, the apparent advantage of this approach is that it maintains the stabilizing linear operator −U in place, although this is not universally justified.

A mathematical justification for Hasselmann's framework is provided by Arnold and his collaborators (see [2] and references therein). It is based on the well-known technique of averaging (the law of large numbers) and the central limit theorem. However, as in Hasselmann's original work, it assumes the existence and knowledge of the invariant measure of the fast variables. Nonetheless, a rigorous mathematical derivation of such Langevin-type models for the slow climate dynamics, using the equations of motion in discrete form, is possible, as illustrated by the MTV theory presented next.

The Systematic Mode Reduction MTV Methodology
A systematic mathematical methodology to derive Langevin-type equations (4) à la Hasselmann for the slow climate dynamics from the coupled atmosphere-ocean-land equations of motion, which yields the propagation (or drift) and diffusion terms û(x) and D(x) in closed form, is presented in [44, 45] by Majda, Timofeyev, and Vanden-Eijnden (MTV). Starting from the generalized form of the discretized equations of motion

    dz/dt = L z + B(z, z) + f(t),

where L and B are a linear and a bilinear operator, respectively, while f(t) represents external forcing, MTV operate the same dichotomy as Hasselmann did of splitting the vector z into slow and fast variables x and y, respectively. However, they introduce a nondimensional parameter ε = τ_y/τ_x which measures the degree of time scale separation between the two sets of variables. This leads to the slow-fast coupled system

    dx = (L_11 x + L_12 y) dt + B^1_11(x, x) dt
         + ε^(−1) [ B^1_12(x, y) + B^1_22(y, y) ] dt
         + D x dt + F_1(t) dt + ε^(−1) f_1(εt)   (5)

    dy = ε^(−1) [ L_21 x + L_22 y + B^2_12(x, y) ] dt
         − ε^(−2) Γ y dt + ε^(−1) σ dW_t + ε^(−1) f_2(εt)

under a few key assumptions, including: (1) the nonlinear self-interaction term of the fast variables is "parameterized" by an Ornstein-Uhlenbeck process, ε^(−1) B^2_22(y, y) dt := −ε^(−2) Γ y dt + ε^(−1) σ dW_t, as it appears in the equation for dy; (2) a small dissipation term D x dt is added to the slow dynamics; and (3) the forcing of the slow variables splits into slow and fast contributions, F_1(t) and ε^(−1) f_1(εt), as they appear in (5). Moreover, the system in (5) is written in terms of the slow time t → εt.

MTV use the theory of asymptotic expansions, applied to the backward Fokker-Planck equation associated with the stochastic differential system (5), to obtain an effective reduced Langevin equation (4) for the slow variables x in the limit of large separation of time scales, ε → 0 [44, 45]. The main advantage of the MTV theory is that, unlike in Hasselmann's ad hoc formulation, the functional forms of the drift and diffusion coefficients in terms of the slow variables are obtained in closed form, and new physical phenomena can emerge from the large-scale feedback besides the assumed stabilization effect. It turns out that the drift term is not always stabilizing: there are dynamical regimes where growing modes can be excited and, depending on the dynamical configuration, the Langevin equation (4) can support either additive or multiplicative noise.

Even though MTV assumes strict separation of scales, ε ≪ 1, the method is successfully used for a wide range of examples, including cases where ε = O(1) [46]. Also, in [47] MTV is successfully extended to fully deterministic systems, where the requirement that the fast-fast interaction term B^2_22(y, y) in (5) be parameterized by an Ornstein-Uhlenbeck process is relaxed. Furthermore, MTV has been applied to a wide range of climate problems. It is used, for instance, in [19] for a realistic barotropic model and extended in [18] to a three-layer quasi-geostrophic model. The example of midlatitude teleconnection patterns, where multiplicative noise plays a crucial role, is studied in [42]. MTV is also applied to the triad and dyad normal mode (EOF) interactions for arbitrary time series [40].

Stochastic Parametrization
In a typical GCM, the parametrization of unresolved processes is based on theoretical and/or empirical deterministic equations. Perhaps the area where deterministic parameterizations have failed most is moist convection. GCMs fail very badly in simulating the planetary and intra-seasonal variability of winds and rainfall in the tropics, due to the inadequate representation of the unresolved variability of convection and the associated cross-scale interactions behind the multiscale organization of tropical convection [35]. To overcome this problem, some climate scientists have introduced random variables to mimic the variability of such unresolved processes. Unfortunately, as illustrated below, many of the existing stochastic parametrizations are based on assumptions of statistical equilibrium and/or of a stationary distribution for the unresolved variability, which are only valid to some extent when there is scale separation.

The first use of random variables in GCMs appeared in Buizza et al. [6], as a means for improving the skill of the ECMWF ensemble prediction system (EPS). Buizza et al. [6] used uniformly distributed random scalars to rescale the parameterized tendencies in the governing equations. Similarly, Lin and Neelin [38] introduced a random perturbation in the tendency of convective available potential energy (CAPE). In [38], the random noise is assumed to be a Markov process of the form ξ_(t+Δt) = γ ξ_t + z_t, where z_t is a white noise with a fixed standard deviation and γ is a parameter. Plant and Craig [57] used extensive cloud-permitting numerical simulations to empirically derive the parameters for the PDF of the cloud base mass flux itself, whose Poisson shape is determined according to arguments drawn from equilibrium statistical mechanics. Careful simulations conducted by Davoudi et al. [10] revealed that while the Poisson PDF is more or less accurate for isolated deep convective clouds, it fails to extend to cloud clusters where a variety of cloud types interact with each other: a crucial feature of organized tropical convection.

Majda and Khouider [43] borrowed from material science [28] the idea of using the Ising model of ferromagnetization to represent convective inhibition (CIN). An order parameter σ, defined on a rectangular lattice embedded within each horizontal grid box of the climate model, takes the value 1 or 0 at a given site according to whether there is CIN or there is potential for deep convection (PAC). The lattice model makes transitions at a given site according to intuitive probability rules depending both on the large-scale climate model variables and on local interactions between lattice sites, based on a Hamiltonian energy principle. The Hamiltonian is given by

    H(σ, U) = (1/2) Σ_{x,y} J(|x − y|) σ(x) σ(y) + h(U) Σ_x σ(x),

where J(r) is the local interaction potential and h(U) is the external potential, which depends on the climate variables U, and where the summations are taken over all lattice sites x, y.
A transition (a spin-flip, by analogy to the Ising model of magnetization) occurs at a site y if, for a small time τ, we have σ_(t+τ)(y) = 1 − σ_t(y) and σ_(t+τ)(x) = σ_t(x) for x ≠ y. Transitions occur at a rate C(y, σ, U) set by Arrhenius dynamics: C(x, σ, U) = τ_I^(−1) exp(−Δ_x H(σ, U)) if σ(x) = 0 and C(x, σ, U) = τ_I^(−1) if σ(x) = 1, so that the resulting Markov process satisfies detailed balance with respect to the Gibbs distribution π(σ, U) ∝ exp(−H(σ, U)).
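A toy version of these Arrhenius dynamics can be simulated directly. The sketch below is our own illustration under simplifying assumptions: local interactions are ignored (J ≡ 0), so each site flips independently with rates proportional to exp(−h) for 0 → 1 and to 1 for 1 → 0, and the empirical CIN fraction can be checked against the Gibbs equilibrium value e^(−h)/(1 + e^(−h)):

```python
import numpy as np

rng = np.random.default_rng(4)

# Lattice of CIN order parameters sigma in {0, 1}; no local interactions
# (J = 0), so Delta_x H reduces to the external potential h(U).
n_sites, h_ext, tau_I = 4000, 1.0, 1.0
sigma = np.zeros(n_sites, dtype=int)

# Discrete-time approximation of the continuous-time Arrhenius flip rates:
# flip 0 -> 1 with prob dt * exp(-h) / tau_I, flip 1 -> 0 with prob dt / tau_I.
dt, n_steps = 0.05, 4000
p01 = dt * np.exp(-h_ext) / tau_I
p10 = dt / tau_I
for _ in range(n_steps):
    u = rng.random(n_sites)
    flip_up = (sigma == 0) & (u < p01)
    flip_dn = (sigma == 1) & (u < p10)
    sigma[flip_up] = 1
    sigma[flip_dn] = 0

cin_fraction = sigma.mean()
gibbs_fraction = np.exp(-h_ext) / (1.0 + np.exp(-h_ext))
```

The agreement of cin_fraction with gibbs_fraction is a direct check of detailed balance for the two-state rates used here; reinstating J ≠ 0 would couple the sites and require the full Δ_x H in the rates.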

Here Δ_x H(σ, U) = H(σ + [1 − 2σ(x)] e_x, U) − H(σ, U) = [1 − 2σ(x)] ( Σ_z J(|x − z|) σ(z) + h(U) ), with e_x(y) = 1 if y = x and 0 otherwise.

For computational efficiency, a coarse-graining of the stochastic CIN model is used in [34] to derive a stochastic birth-death process for the mesoscopic area coverage σ_X = Σ_{x∈X} σ(x), where X represents a generic site of a mesoscopic lattice, which in practice can be considered to be the GCM grid. The stochastic CIN model is coupled to a toy GCM, where it is successfully demonstrated how the addition of such a stochastic model can improve the climatology and wave dynamics in a deficient GCM [34, 42].

This Ising-type modeling framework is extended in [33] to represent the variability of organized tropical convection (OTC). A multi-type order parameter is introduced to mimic the multimodal nature of OTC. Based on observations, tropical convective systems (TCS) are characterized by three cloud types: cumulus congestus clouds, whose height does not exceed the freezing level, develop when the atmosphere is dry and there is convective instability (positive CAPE); in return, congestus clouds moisten the environment for deep convective towers; and stratiform clouds, which develop in the upper troposphere, lag deep convection as a natural freezing phase in the upper troposphere. Accordingly, the new order parameter σ takes the values 0, 1, 2, 3 on a given lattice site, according to whether the given site is, respectively, clear sky or occupied by a congestus, deep, or stratiform cloud.

Similar Arrhenius-type dynamics are used to build the transition rates, resulting in an ergodic Markov process with a well-defined equilibrium measure. Unphysical transitions, of congestus to stratiform, stratiform to deep, stratiform to congestus, clear to stratiform, and deep to congestus, are eliminated by setting the associated rates to zero. When local interactions are ignored, the equilibrium measure and the transition rates depend only on the large-scale climate variables U, where CAPE and midlevel moisture are used as triggers, and the coarse-graining process is carried out with exact statistics. It leads to a multidimensional birth-death process with immigration for the area fractions of the associated three cloud types. The resulting stochastic multicloud model (SMCM) is used very successfully in [20, 21] to capture the unresolved variability of organized convection in a toy GCM: the simulation of convectively coupled gravity waves and the mean climatology are improved drastically when compared to their deterministic counterparts. The realistic statistical behavior of the SMCM is successfully assessed against observations in [56]. Local interaction effects are reintroduced in [32], where a coarse-graining approximation based on conditional expectation is used to recover the multidimensional birth-death process dynamics with local interactions. A Bayesian methodology for inferring key parameters of the SMCM is developed and validated in [12]. A review of the basic methodology of the CIN and SMCM models, suitable for undergraduates, is found in [31].

A systematic data-based methodology for inferring a suitable stochastic process for unresolved processes, conditional on resolved model variables, was proposed in [9]. The local feedback from unresolved processes on resolved ones is represented by a small Markov chain whose transition probability matrix is made dependent on the resolved-scale state. The matrix is estimated from time series data obtained from highly resolved numerical simulations or observations. This approach was developed and successfully tested on the Lorenz '96 system [39] in [9]. In [16] it was applied to parameterize shallow cumulus convection, using data from large eddy simulation (LES) of moist atmospheric convection. A two-dimensional lattice, with a Markov chain at each lattice node, was used to mimic (or emulate) the convection as simulated by the high-resolution LES model, at a fraction of the computational cost.

Subsequently, [15] combined the conditional Markov chain methodology with elements from the SMCM [33]. They applied it to deep convection, but without making use of the Arrhenius functional forms of the transition rates in terms of the large-scale variables (as was done in [33]). Similar to [16], LES data was used for the estimation of the Markov chain transition probabilities.
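The estimation step in such a conditional Markov chain approach amounts to counting transitions, with the counts binned by the resolved-scale state. A self-contained sketch (our own synthetic two-state setup, not the Lorenz '96 or LES configuration of the cited works):

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic "truth": a two-state unresolved process whose transition
# probabilities depend on which of two bins the resolved variable U occupies.
P_true = {0: np.array([[0.9, 0.1],      # U in bin 0: quiescent, state persists
                       [0.3, 0.7]]),
          1: np.array([[0.5, 0.5],      # U in bin 1: more active
                       [0.1, 0.9]])}

n = 50_000
U_bin = rng.integers(0, 2, size=n)      # bin index of the resolved state
s = np.empty(n, dtype=int)
s[0] = 0
for t in range(1, n):
    p_up = P_true[U_bin[t - 1]][s[t - 1], 1]   # prob. of state 1 at next step
    s[t] = 1 if rng.random() < p_up else 0

# Estimation: count transitions s_t -> s_{t+1} conditional on the bin of U_t,
# then normalize each row to obtain one transition matrix per bin.
P_est = {}
for b in (0, 1):
    mask = U_bin[:-1] == b
    counts = np.zeros((2, 2))
    for s0 in (0, 1):
        for s1 in (0, 1):
            counts[s0, s1] = np.sum(mask & (s[:-1] == s0) & (s[1:] == s1))
    P_est[b] = counts / counts.sum(axis=1, keepdims=True)
```

With enough data per (bin, state) pair, the row-normalized counts recover the conditional transition matrices; in the cited applications, the role of the synthetic series is played by highly resolved simulation or observational data.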
The inferred stochastic model in [15] was well capable of generating cloud fractions very similar to those observed in the LES data. While the main cloud types of the original SMCM were preserved, an important improvement in [15] resides in the addition of a fifth state for shallow cumulus clouds. As an experiment, direct spatial coupling of the Markov chains on the lattice was also considered in [15]. Such coupling amounts to the structure of a stochastic cellular automaton (SCA). Without this direct coupling, the Markov chains are still coupled, but indirectly, through their interaction with the large-scale variables (see, e.g., [9]).

References

1. AghaKouchak, A., Easterling, D., Hsu, K., Schubert, S., Sorooshian, S. (eds.): Extremes in a Changing Climate, p. 423. Springer, Dordrecht (2013)
2. Arnold, L.: Hasselmann's program revisited: the analysis of stochasticity in deterministic climate models. In: Imkeller, P., von Storch, J.-S. (eds.) Stochastic Climate Models. Birkhäuser, Basel (2001)
3. Bellone, E., Hughes, J.P., Guttorp, P.: A hidden Markov model for downscaling synoptic atmospheric patterns to precipitation amounts. Clim. Res. 15, 1–12 (2000)
4. Berner, J.: Linking nonlinearity and non-Gaussianity of planetary wave behavior by the Fokker-Planck equation. J. Atmos. Sci. 62, 2098–2117 (2005)
5. Bond, N.A., Harrison, D.E.: The Pacific decadal oscillation, air-sea interaction and central north Pacific winter atmospheric regimes. Geophys. Res. Lett. 27, 731–734 (2000)
6. Buizza, R., Milleer, M., Palmer, T.N.: Stochastic representation of model uncertainties in the ECMWF ensemble prediction system. Quart. J. R. Meteorol. Soc. 125, 2887–2908 (1999)
7. Coles, S.: An Introduction to Statistical Modeling of Extreme Values, p. 208. Springer, London (2001)
8. Crommelin, D.T.: Observed non-diffusive dynamics in large-scale atmospheric flow. J. Atmos. Sci. 61, 2384–2396 (2004)
9. Crommelin, D.T., Vanden-Eijnden, E.: Subgrid-scale parameterization with conditional Markov chains. J. Atmos. Sci. 65, 2661–2675 (2008)
10. Davoudi, J., McFarlane, N.A., Birner, T.: Fluctuation of mass flux in a cloud-resolving simulation with interactive radiation. J. Atmos. Sci. 67, 400–418 (2010)
11. de Haan, L., Ferreira, A.: Extreme Value Theory: An Introduction, p. 417. Springer, New York (2006)
12. de la Chevrotière, M., Khouider, B., Majda, A.: Calibration of the stochastic multicloud model using Bayesian inference. SIAM J. Sci. Comput. 36(3), B538–B560 (2014)
13. DelSole, T.: Stochastic models of quasi-geostrophic turbulence. Surv. Geophys. 25, 107–149 (2004)
14. DelSole, T., Farrel, B.F.: Quasi-linear equilibration of a thermally maintained, stochastically excited jet in a quasi-geostrophic model. J. Atmos. Sci. 53, 1781–1797 (1996)
15. Dorrestijn, J., Crommelin, D.T., Biello, J.A., Böing, S.J.: A data-driven multicloud model for stochastic parameterization of deep convection. Phil. Trans. R. Soc. A 371, 20120374 (2013)
16. Dorrestijn, J., Crommelin, D.T., Siebesma, A.P., Jonker, H.J.J.: Stochastic parameterization of shallow cumulus convection estimated from high-resolution model data. Theor. Comput. Fluid Dyn. 27, 133–148 (2013)
17. Frankignoul, C., Hasselmann, K.: Stochastic climate models. Part II: application to sea-surface temperature anomalies and thermocline variability. Tellus 29, 289–305 (1977)
18. Franzke, C., Majda, A.: Low-order stochastic mode reduction for a prototype atmospheric GCM. J. Atmos. Sci. 63, 457–479 (2006)
19. Franzke, C., Majda, A., Vanden-Eijnden, E.: Low-order stochastic mode reduction for a realistic barotropic model climate. J. Atmos. Sci. 62, 1722–1745 (2005)
20. Frenkel, Y., Majda, A.J., Khouider, B.: Using the stochastic multicloud model to improve tropical convective parameterization: a paradigm example. J. Atmos. Sci. 69, 1080–1105 (2012)
21. Frenkel, Y., Majda, A.J., Khouider, B.: Stochastic and deterministic multicloud parameterizations for tropical convection. Clim. Dyn. 41, 1527–1551 (2013)
22. Giannakis, D., Majda, A.J.: Nonlinear Laplacian spectral analysis for time series with intermittency and low-frequency variability. Proc. Natl. Acad. Sci. USA 109, 2222–2227 (2012)
23. Hasselmann, K.: Stochastic climate models. Part I: theory. Tellus 28, 473–485 (1976)
24. Hasselmann, K.: PIPs and POPs: the reduction of complex dynamical systems using principal interaction and oscillation patterns. J. Geophys. Res. 93(D9), 11015–11021 (1988)
25. Horenko, I.: Nonstationarity in multifactor models of discrete jump processes, memory, and application to cloud modeling. J. Atmos. Sci. 68, 1493–1506 (2011)
26. Imkeller, P., von Storch, J.-S. (eds.): Stochastic Climate Models. Birkhäuser, Basel (2001)
27. Jolliffe, I.T.: Principal Component Analysis, 2nd edn. Springer, New York (2002)
28. Katsoulakis, M., Majda, A.J., Vlachos, D.: Coarse-grained stochastic processes and Monte-Carlo simulations in lattice systems. J. Comp. Phys. 186, 250–278 (2003)
29. Katz, R.W., Naveau, P.: Editorial: special issue on statistics of extremes in weather and climate. Extremes 13, 107–108 (2010)
30. Kessler, M., Lindner, A., Sørensen, M. (eds.): Statistical Methods for Stochastic Differential Equations. CRC, Boca Raton (2012)
31. Khouider, B.: Markov-jump stochastic models for organized tropical convection. In: Yang, X.-S. (ed.) Mathematical Modeling with Multidisciplinary Applications. Wiley (2013). ISBN: 978-1-1182-9441-3
32. Khouider, B.: A stochastic coarse grained multi-type particle interacting model for tropical convection. Commun. Math. Sci. 12, 1379–1407 (2014)
33. Khouider, B., Biello, J., Majda, A.J.: A stochastic multicloud model for tropical convection. Commun. Math. Sci. 8, 187–216 (2010)
34. Khouider, B., Majda, A.J., Katsoulakis, M.: Coarse grained stochastic models for tropical convection. Proc. Natl. Acad. Sci. USA 100, 11941–11946 (2003)
35. Khouider, B., Majda, A.J., Stechmann, S.: Climate science in the tropics: waves, vortices, and PDEs. Nonlinearity 26, R1–R68 (2013)
36. Kleeman, R., Moore, A.: A theory for the limitation of ENSO predictability due to stochastic atmospheric transients. J. Atmos. Sci. 54, 753–767 (1997)
37. Kondrasov, D., Krastov, S., Gill, M.: Empirical mode reduction in a model of extra-tropical low-frequency variability. J. Atmos. Sci. 63, 1859–1877 (2006)

38. Lin, J.B.W., Neelin, D.: Toward stochastic deep convective 58. Prakasa Rao, B.L.S.: Statistical Inference for Diffusion Type parameterization in general circulation models. Geophy. Processes. Arnold Publishers, London (1999) Res. Lett. 27, 3691–3694 (2000) 59. Preisendorfer, R.W.: Principal Component Analysis in Me- 39. Lorenz, E.N.: Predictability: a problem partly solved. In: teorology and Oceanography. Elsevier, Amsterdam (1988) Proceedings, Seminar on Predictability ECMWF, Reading, 60. Sura, P., Newman, M., Penland, C., Sardeshmukh, P.: Mul- vol. 1, pp. 1–18 (1996) tiplicative noise and non-Gaussianity: a paradigm for atmo- 40. Majda, A., Franzke, C., Crommelin, D.: Normal forms for spheric regimes. J. Atmos. Sci. 62, 1391–1409 (2005) reduced stochastic climate models. Proc. Natl. Acad. Sci. 61. Von Storch, H., Zwiers, F.W.: Statistical Analysis in Climate USA 16, 3649–3653 (2009) Research. Cambridge University Press, Cambridge (1999) 41. Majda, A.J., Franzke, C.L., Fischer, A., Crommelin, D.T.: 62. Wilks, D.S.: Statistical Methods in the Atmospheric Sci- Distinct metastable atmospheric regimes despite nearly ences, 3rd edn., p. 704. Academic, Oxford (2011) Gaussian statistics: a paradigm model. Proc. Natl. Acad. Sci. 63. Zucchini, W., Guttorp, P.: A Hidden Markov model for USA 103, 8309–8314 (2006) space-time precipitation. Water Resour. Res. 27, 1917–1923 42. Majda, A.J., Franzke, C.L., Khouider, B.: An applied mathe- (1991) matics perspective on stochastic modelling for climate. Phil. Trans. R. Soc. 366, 2427–2453 (2008) 43. Majda, A.J., Khouider, B.: Stochastic and mesoscopic mod- els for tropical convection. Proc. Natl. Acad. Sci. 99, 1123–1128 (2002) Stochastic Eulerian-Lagrangian Methods 44. Majda, A.J., Timofeyev, I., Vanden-Eijnden, E.: Models for stochastic climate prediction. Proc. Natl. Acad. Sci. 96, 14687–14691 (1999) Paul J. Atzberger 45. 
Majda, A.J., Timofeyev, I., Vanden-Eijnden, E.: A mathe- Department of Mathematics, University of California matical framework for stochastic climate models. Commun. Santa Barbara (UCSB), Santa Barbara, CA, USA Pure Appl. Math. LIV, 891–974 (2001) 46. Majda, A.J., Timofeyev, I., Vanden-Eijnden, E.: A priori tests of a stochastic mode reduction strategy. Physica D 170, 206–252 (2002) Synonyms 47. Majda A., Timofeyev, I., Vanden-Eijnden, E.: Stochastic models for selected slow variables in large deterministic systems. Nonlinearity 19, 769–794 (2006) Fluid-structure interaction; Fluid dynamics; Fluctu- 48. Mo, K.C., Ghil, M.: Statistics and dynamics of persistent ating hydrodynamics; Immersed Boundary Method; anomalies. J. Atmos. Sci. 44, 877–901 (1987) SELM; Statistical mechanics; Stochastic Eulerian La- 49. Monahan, A.H.: Nonlinear principal component analysis grangian method; Thermal fluctuations by neural networks: theory and application to the . J. Clim. 13, 821–835 (2000) 50. Monahan, A.H., Fyfe, J.C., Ambaum, M.H.P., Stephenson, D.B., North, G.R.: Empirical orthogonal functions: the medium is the message. J. Clim. 22, 6501–6514 (2009) Abstract 51. Newman, M., Sardeshmukh, P., Penland, C.: Stochastic forcing of the wintertime extratropical flow. J. Atmos. Sci. We present approaches for the study of fluid-structure 54, 435–455 (1997) interactions subject to thermal fluctuations. A me- 52. Palmer, T., Williams, P. (eds.): Stochastic Physics and Cli- mate Modelling. Cambridge University Press, Cambridge chanical description is utilized combining Eulerian (2009) and Lagrangian reference frames. We establish general 53. Pasmanter, R.A., Timmermann, A.: Cyclic Markov chains conditions for the derivation of operators coupling with an application to an intermediate ENSO model. Non- these descriptions and for the derivation of stochastic linear Process. Geophys. 10, 197–210 (2003) 54. 
Penland, C., Ghil, M.: Forecasting northern hemisphere driving fields consistent with statistical mechanics. We 700-mb geopotential height anomalies using empirical nor- present stochastic numerical methods for the fluid- mal modes. Mon. Weather Rev. 121, 2355–2372 (1993) structure dynamics and methods to generate efficiently 55. Penland, C., Sardeshmukh, P.D.: The optimal growth of the required stochastic driving fields. To help establish tropical sea surface temperature anomalies. J. Clim. 8, 1999–2024 (1995) the validity of the proposed approach, we perform 56. Peters, K., Jakob, C., Davies, L., Khouider, B., Majda, A.: analysis of the invariant probability distribution of the Stochastic behaviour of tropical convection in observations stochastic dynamics and relate our results to statisti- and a multicloud model. J. Atmos. Sci. 70, 3556–3575 cal mechanics. Overall, the presented approaches are (2013) 57. Plant, R.S., Craig, G.C.: A stochastic parameterization for expected to be applicable to a wide variety of systems deep convection based on equilibrium statistics. J. Atmos. involving fluid-structure interactions subject to thermal Sci. 65, 87–105 (2008) fluctuations. Stochastic Eulerian-Lagrangian Methods 1387

Introduction

Recent scientific and technological advances motivate the study of fluid-structure interactions in physical regimes often involving very small length and time scales [26, 30, 35, 36]. This includes the study of microstructure in soft materials and complex fluids, the study of biological systems such as cell motility and microorganism swimming, and the study of processes within microfluidic and nanofluidic devices. At such scales thermal fluctuations play an important role and pose significant challenges in the study of such fluid-structure systems. Significant past work has been done on the formulation of descriptions for fluid-structure interactions subject to thermal fluctuations. To obtain descriptions tractable for analysis and numerical simulation, these approaches typically place an emphasis on approximations which retain only the structure degrees of freedom (eliminating the fluid dynamics). This often results in simplifications in the descriptions having substantial analytic and computational advantages. In particular, this eliminates the many degrees of freedom associated with the fluid and avoids having to resolve the potentially intricate and stiff stochastic dynamics of the fluid. These approaches have worked especially well for the study of bulk phenomena in free solution and the study of many types of complex fluids and soft materials [3, 9, 13, 17, 23].

Recent applications arising in the sciences and in technological fields present situations in which resolving the dynamics of the fluid may be important and even advantageous both for modeling and computation. This includes modeling the spectroscopic responses of biological materials [19, 25, 37], studying transport in microfluidic and nanofluidic devices [16, 30], and investigating dynamics in biological systems [2, 11]. There are also other motivations for representing the fluid explicitly and resolving its stochastic dynamics. This includes the development of hybrid fluid-particle models in which thermal fluctuations mediate important effects when coupling continuum and particle descriptions [12, 14], the study of hydrodynamic coupling and diffusion in the vicinity of surfaces having complicated geometries [30], and the study of systems in which there are many interacting mechanical structures [8, 27, 28]. To facilitate the development of methods for studying such phenomena in fluid-structure systems, we present a rather general formalism which captures essential features of the coupled stochastic dynamics of the fluid and structures.

To model the fluid-structure system, a mechanical description is utilized involving both Eulerian and Lagrangian reference frames. Such mixed descriptions arise rather naturally, since it is often convenient to describe the structure configurations in a Lagrangian reference frame while it is convenient to describe the fluid in an Eulerian reference frame. In practice, this presents a number of challenges for analysis and numerical studies. A central issue concerns how to couple the descriptions to represent accurately the fluid-structure interactions, while obtaining a coupled description which can be treated efficiently by numerical methods. Another important issue concerns how to account properly for thermal fluctuations in such approximate descriptions. This must be done carefully to be consistent with statistical mechanics. A third issue concerns the development of efficient computational methods. This requires discretizations of the stochastic differential equations and the development of efficient methods for numerical integration and stochastic field generation.

We present a set of approaches to address these issues. The formalism and general conditions for the operators which couple the Eulerian and Lagrangian descriptions are presented in section "Stochastic Eulerian Lagrangian Method." We discuss a convenient description of the fluid-structure system useful for working with the formalism in practice in section "Derivations for the Stochastic Eulerian Lagrangian Method." A derivation of the stochastic driving fields used to represent the thermal fluctuations is also presented in section "Derivations for the Stochastic Eulerian Lagrangian Method." Stochastic numerical methods are discussed for the approximation of the stochastic dynamics and generation of stochastic fields in section "Computational Methodology." To validate the methodology, we perform analysis of the invariant probability distribution of the stochastic dynamics of the fluid-structure formalism. We compare this analysis with results from statistical mechanics in section "Equilibrium Statistical Mechanics of SELM Dynamics." A more detailed and comprehensive discussion of the approaches presented here can be found in our paper [6].

Stochastic Eulerian Lagrangian Method

To study the dynamics of fluid-structure interactions subject to thermal fluctuations, we utilize a mechanical description involving Eulerian and Lagrangian reference frames. Such mixed descriptions arise rather naturally, since it is often convenient to describe the structure configurations in a Lagrangian reference frame while it is convenient to describe the fluid in an Eulerian reference frame. In principle more general descriptions using other reference frames could also be considered. Descriptions for fluid-structure systems having these features can be described rather generally by the following dynamic equations:

  ρ du/dt = Lu + Λ[Υ(v − Γu)] + λ + f_thm   (1)
  m dv/dt = −Υ(v − Γu) − ∇_X Φ[X] + ζ + F_thm   (2)
  dX/dt = v.   (3)

The u denotes the velocity of the fluid, and ρ the uniform fluid density. The X denotes the configuration of the structure, and v the velocity of the structure. The mass of the structure is denoted by m. To simplify the presentation, we treat here only the case when ρ and m are constant, but with some modifications these could also be treated as variable. The λ, ζ are Lagrange multipliers for imposed constraints, such as incompressibility of the fluid or a rigid body constraint of a structure. The operator L is used to account for dissipation in the fluid, such as associated with Newtonian fluid stresses [1]. To account for how the fluid and structures are coupled, a few general operators are introduced: Γ, Υ, Λ.

The linear operators Γ, Λ, Υ are used to model the fluid-structure coupling. The Γ operator describes how a structure depends on the fluid flow, while Υ is a negative definite dissipative operator describing the viscous interactions coupling the structure to the fluid. We assume throughout that this dissipative operator is symmetric, Υ = Υ^T. The linear operator Λ is used to attribute a spatial location for the viscous interactions between the structure and fluid. The linear operators Γ, Λ are assumed to have dependence only on the configuration degrees of freedom: Γ = Γ[X], Λ = Λ[X]. We assume further that Υ does not have any dependence on X. For an illustration of the role these coupling operators play, see Fig. 1.

Stochastic Eulerian-Lagrangian Methods, Fig. 1  The description of the fluid-structure system utilizes both Eulerian and Lagrangian reference frames. The structure mechanics are often most naturally described using a Lagrangian reference frame. The fluid mechanics are often most naturally described using an Eulerian reference frame. The mapping X(q) relates the Lagrangian reference frame to the Eulerian reference frame. The operator Γ prescribes how structures are to be coupled to the fluid. The operator Λ prescribes how the fluid is to be coupled to the structures. A variety of fluid-structure interactions can be represented in this way. This includes rigid and deformable bodies, membrane structures, polymeric structures, or point particles.

To account for the mechanics of structures, Φ[X] denotes the potential energy of the configuration X. The total energy associated with this fluid-structure system is given by

  E[u, v, X] = ∫_Ω (ρ/2)|u(y)|² dy + (m/2)|v|² + Φ[X].   (4)

The first two terms give the kinetic energy of the fluid and structures. The last term gives the potential energy of the structures.

As we shall discuss, it is natural to consider coupling operators Λ and Γ which are adjoint in the sense

  ∫_S (Γu)(q) · v(q) dq = ∫_Ω u(x) · (Λv)(x) dx   (5)

for any u and v. The S and Ω denote the spaces used to parameterize respectively the structures and the fluid. We denote such an adjoint by Λ = Γ† or Γ = Λ†. This adjoint condition can be shown to have the important consequence that the fluid-structure coupling conserves energy when Υ → ∞ in the inviscid and zero temperature limit.

To account for thermal fluctuations, a random force density f_thm is introduced in the fluid equations and F_thm in the structure equations. These account for spontaneous changes in the system momentum which occur as a result of the influence of unresolved microscopic degrees of freedom and unresolved events occurring in the fluid and in the fluid-structure interactions.

The thermal fluctuations consistent with the form of the total energy and relaxation dynamics of the system are taken into account by the introduction of stochastic driving fields in the momentum equations of the fluid and structures. The stochastic driving fields are taken to be Gaussian processes with mean zero and with δ-correlation in time [29]. By the fluctuation-dissipation principle [29], these have covariances given by

  ⟨f_thm(s) f_thm^T(t)⟩ = (2k_B T)(−L + ΛΥΓ) δ(t − s)   (6)
  ⟨F_thm(s) F_thm^T(t)⟩ = (2k_B T) Υ δ(t − s)   (7)
  ⟨f_thm(s) F_thm^T(t)⟩ = −(2k_B T) ΛΥ δ(t − s).   (8)

We have used that Γ = Λ† and Υ = Υ^T. We remark that the notation g h^T which is used for the covariance operators should be interpreted as the tensor product. This notation is meant to suggest the analogue to the outer-product operation which holds in the discrete setting [5]. A more detailed discussion and derivation of the thermal fluctuations is given in section "Derivations for the Stochastic Eulerian Lagrangian Method."

It is important to mention that some care must be taken when using the above formalism in practice and when choosing operators. An important issue concerns the treatment of the material derivative of the fluid, du/dt = ∂u/∂t + u · ∇u. For stochastic systems the field u is often highly irregular and not defined in a point-wise sense, but rather only in the sense of a generalized function (distribution) [10, 24]. To avoid these issues, we shall treat du/dt = ∂u/∂t in this initial presentation of the approach [6]. The SELM provides a rather general framework for the study of fluid-structure interactions subject to thermal fluctuations. To use the approach for specific applications requires the formulation of appropriate coupling operators Γ and Λ to model the fluid-structure interaction. We provide some concrete examples of such operators in the paper [6].

Formulation in Terms of Total Momentum Field

When working with the formalism in practice, it turns out to be convenient to reformulate the description in terms of a field describing the total momentum of the fluid-structure system at a given spatial location. As we shall discuss, this description results in simplifications in the stochastic driving fields. For this purpose, we define

  p(x, t) = ρ u(x, t) + Λ[m v(t)](x).   (9)

The operator Λ is used to give the distribution in space of the momentum associated with the structures for given configuration X(t). Using this approach, the fluid-structure dynamics are described by

  dp/dt = Lu + Λ[−∇_X Φ(X)] + (∇_X Λ[mv]) · v + λ + g_thm   (10)
  m dv/dt = −Υ(v − Γu) − ∇_X Φ(X) + ζ + F_thm   (11)
  dX/dt = v,   (12)

where u = ρ⁻¹(p − Λ[mv]) and g_thm = f_thm + Λ[F_thm]. The third term in the first equation arises from the dependence of Λ on the configuration of the structures, Λ[mv] = (Λ[X])[mv]. The Lagrange multipliers for imposed constraints are denoted by λ, ζ. For the constraints, we use this notation rather liberally, with the Lagrange multipliers denoted here not necessarily assumed to be equal to the previous definition. The stochastic driving fields are again Gaussian with mean zero and δ-correlation in time [29]. The stochastic driving fields have the covariance structure given by

  ⟨g_thm(s) g_thm^T(t)⟩ = −(2k_B T) L δ(t − s)   (13)
  ⟨F_thm(s) F_thm^T(t)⟩ = (2k_B T) Υ δ(t − s)   (14)
  ⟨g_thm(s) F_thm^T(t)⟩ = 0.   (15)

This formulation has the convenient feature that the stochastic driving fields become independent. This is a consequence of using the field for the total momentum, for which the dissipative exchange of momentum between the fluid and structure no longer arises. In the equations for the total momentum, the only source of dissipation remaining occurs from the stresses of the fluid. This approach simplifies the effort required to generate numerically the stochastic driving fields and will be used throughout.

Derivations for the Stochastic Eulerian Lagrangian Method

We now discuss formal derivations to motivate the stochastic differential equations used in each of the physical regimes. For this purpose, we do not present the most general derivation of the equations. For brevity, we make simplifying assumptions when convenient.

In the initial formulation of SELM, the fluid-structure system is described by

  ρ du/dt = Lu + Λ[Υ(v − Γu)] + λ + f_thm   (16)
  m dv/dt = −Υ(v − Γu) − ∇_X Φ(X) + ζ + F_thm   (17)
  dX/dt = v.   (18)

The notation and operators appearing in these equations have been discussed in detail in section "Stochastic Eulerian Lagrangian Method." For these equations, we focus primarily on the motivation for the stochastic driving fields used for the fluid-structure system.

For the thermal fluctuations of the system, we assume Gaussian random fields with mean zero and δ-correlated in time. For such stochastic fields, the central challenge is to determine an appropriate covariance structure. For this purpose, we use the fluctuation-dissipation principle of statistical mechanics [22, 29]. For linear stochastic differential equations of the form

  dZ_t = 𝓛 Z_t dt + Q dB_t,   (19)

the fluctuation-dissipation principle can be expressed as

  G = Q Q^T = −(𝓛C) − (𝓛C)^T.   (20)

This relates the equilibrium covariance structure C of the system to the covariance structure G of the stochastic driving field. The operator 𝓛 accounts for the dissipative dynamics of the system. For the Eqs. 16–18, the dissipative operators only appear in the momentum equations. This can be shown to have the consequence that there is no thermal forcing in the equation for X(t); this will also be confirmed in section "Formulation in Terms of Total Momentum Field." To simplify the presentation, we do not represent explicitly the stochastic dynamics of the structure configuration X.

For the fluid-structure system, it is convenient to work with the stochastic driving fields by defining

  q = [ρ⁻¹ f_thm, m⁻¹ F_thm]^T.   (21)

The field q formally is given by q = Q dB_t/dt and determined by the covariance structure G = Q Q^T. This covariance structure is determined by the fluctuation-dissipation principle expressed in Eq. 20 with

  𝓛 = [ ρ⁻¹(L − ΛΥΓ)   ρ⁻¹ΛΥ
        m⁻¹ΥΓ          −m⁻¹Υ ]   (22)

  C = [ ρ⁻¹ k_B T I   0
        0             m⁻¹ k_B T I ].   (23)

The I denotes the identity operator. The covariance C was obtained by considering the fluctuations at equilibrium.
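As a concrete illustration of the fluctuation-dissipation relation in Eq. 20, the following sketch (our own addition, not from the article) uses NumPy to verify that choosing the noise covariance G = QQ^T = −(𝓛C) − (𝓛C)^T makes a prescribed Gaussian equilibrium covariance C satisfy the stationary Lyapunov equation of the linear SDE. The drift is an arbitrary symmetric negative definite matrix and C is a scalar multiple of the identity, mirroring the block-scalar structure of Eq. 23; all variable names are ours.

```python
import numpy as np

# For dZ_t = L Z_t dt + Q dB_t, the fluctuation-dissipation relation
# G = Q Q^T = -(L C) - (L C)^T guarantees that the Gaussian equilibrium
# covariance C solves the stationary Lyapunov equation L C + C L^T + Q Q^T = 0.
rng = np.random.default_rng(0)
d, kBT = 5, 1.7

A = rng.standard_normal((d, d))
L = -(A @ A.T) - 0.1 * np.eye(d)   # symmetric, negative definite drift (assumed)
C = kBT * np.eye(d)                # target Gibbs-Boltzmann covariance

G = -(L @ C) - (L @ C).T           # fluctuation-dissipation relation, Eq. 20
Q = np.linalg.cholesky(G)          # noise factor with Q Q^T = G

# stationarity check: the Lyapunov residual should vanish
residual = L @ C + C @ L.T + Q @ Q.T
assert np.allclose(residual, 0.0, atol=1e-10)
```

With a symmetric drift and scalar C the relation reduces to G = −2 k_B T L, which is why the Cholesky factorization is guaranteed to succeed here; for non-symmetric drifts only the dissipative part contributes to G.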
The covariance C is easily found since the Gibbs-Boltzmann distribution is a Gaussian with formal density Ψ(u, v) = Z₀⁻¹ exp(−E/k_B T). The Z₀ is the normalization constant for Ψ. The energy is given by Eq. 4. For this purpose, we need only consider the energy E in the case when Φ = 0. This gives the covariance structure

  G = (2k_B T) [ ρ⁻²(−L + ΛΥΓ)   −ρ⁻¹m⁻¹ΛΥ
                 −m⁻¹ρ⁻¹ΥΓ       m⁻²Υ ].   (24)

To obtain this result, we use that Γ = Λ† and Υ = Υ†. From the definition of q, it is found that the covariance of the stochastic driving fields of SELM is given by Eqs. 6–8. This provides a description of the thermal fluctuations in the fluid-structure system.

Formulation in Terms of Total Momentum Field

It is convenient to reformulate the description of the fluid-structure system in terms of a field for the total momentum of the system associated with spatial location x. For this purpose we define

  p(x, t) = ρ u(x, t) + Λ[m v(t)](x).   (25)

The operator Λ is used to give the distribution in space of the momentum associated with the structures. Using this approach, the fluid-structure dynamics are described by

  dp/dt = Lu + Λ[−∇_X Φ(X)] + (∇_X Λ[mv]) · v + λ + g_thm   (26)
  m dv/dt = −Υ(v − Γu) − ∇_X Φ(X) + ζ + F_thm   (27)
  dX/dt = v,   (28)

where u = ρ⁻¹(p − Λ[mv]) and g_thm = f_thm + Λ[F_thm]. The third term in the first equation arises from the dependence of Λ on the configuration of the structures, Λ[mv(t)] = (Λ[X])[mv(t)].

The thermal fluctuations are taken into account by two stochastic fields g_thm and F_thm. The covariance of g_thm is obtained from

  ⟨g_thm g_thm^T⟩ = ⟨f_thm f_thm^T⟩ + ⟨f_thm F_thm^T⟩Λ^T + Λ⟨F_thm f_thm^T⟩ + Λ⟨F_thm F_thm^T⟩Λ^T
                  = (2k_B T)(−L + ΛΥΓ − ΛΥΓ − ΛΥΓ + ΛΥΓ)
                  = −(2k_B T) L.   (29)

This makes use of the adjoint property of the coupling operators, Γ = Λ†. One particularly convenient feature of this reformulation is that the stochastic driving fields F_thm and g_thm become independent. This can be seen as follows:

  ⟨g_thm F_thm^T⟩ = ⟨f_thm F_thm^T⟩ + Λ⟨F_thm F_thm^T⟩   (30)
                  = (2k_B T)(−ΛΥ + ΛΥ) = 0.

This decoupling of the stochastic driving fields greatly reduces the computational effort to generate the fields with the required covariance structure. This shows that the covariance structure of the stochastic driving fields of SELM is given by Eqs. 13–15.

Computational Methodology

We now discuss briefly numerical methods for the SELM formalism. For concreteness we consider the specific case in which the fluid is Newtonian and incompressible. For now, the other operators of the SELM formalism will be treated rather generally. This case corresponds to the dissipative operator for the fluid

  Lu = μ Δu.   (31)

The Δ denotes the Laplacian Δu = ∂_xx u + ∂_yy u + ∂_zz u. The incompressibility of the fluid corresponds to the constraint

  ∇ · u = 0.   (32)

This is imposed by the Lagrange multiplier λ. By the Hodge decomposition, λ is given by the gradient of a function p with λ = −∇p. The p can be interpreted as the local pressure of the fluid.

A variety of methods could be used in practice to discretize the SELM formalism, such as finite difference methods, spectral methods, and finite element methods [20, 32, 33]. We present here discretizations based on finite difference methods.

Numerical Semi-discretizations for Incompressible Newtonian Fluid

The Laplacian will be approximated by central differences on a uniform periodic lattice by

  [Lu]_m = μ Σ_{j=1}^{3} (u_{m+e_j} − 2u_m + u_{m−e_j}) / Δx².   (33)

The m = (m₁, m₂, m₃) denotes the index of the lattice site. The e_j denotes the standard basis vector in three dimensions. The incompressibility of the fluid will be approximated by imposing the constraint

  [D · u]_m = Σ_{j=1}^{3} (u^j_{m+e_j} − u^j_{m−e_j}) / (2Δx).   (34)

The superscripts denote the vector component. In practice, this will be imposed by computing the projection of a vector u to the subspace {u ∈ ℝ^{3N} | D · u = 0}, where N is the total number of lattice sites. We denote this projection operation by

  u = ℘u.   (35)

The semi-discretized equations for SELM to be used in practice are

  dp/dt = Lu + Λ[−∇_X Φ] + (∇_X Λ[mv]) · v + λ + g_thm   (36)
  dv/dt = −Υ[v − Γu] + F_thm   (37)
  dX/dt = v.   (38)

The component u_m = ρ⁻¹(p_m − Λ[mv]_m). Each of the operators now appearing is understood to be discretized. We discuss specific discretizations for Γ, Λ, and L in the paper [6]. To obtain the Lagrange multiplier λ which imposes incompressibility, we use the projection operator and

  λ = −(I − ℘)(Lu + Λ[Υ(v − Γu)] + f_thm).   (39)

In this expression, we let f_thm = g_thm − Λ[F_thm] for the particular realized values of the fields g_thm and F_thm. We remark that in fact the semi-discretized equations of the SELM formalism in this regime can also be given in terms of u directly, which may provide a simpler approach in practice. The identity f_thm = g_thm − Λ[F_thm] could be used to efficiently generate the required stochastic driving fields in the equations for u. We present the reformulation here, since it more directly suggests the semi-discretized equations to be used for the reduced stochastic equations.

For this semi-discretization, we consider a total energy for the system given by

  E[u, v, X] = Σ_m (ρ/2)|u(x_m)|² Δx_m³ + (m/2)|v|² + Φ[X].   (40)

This is useful in formulating an adjoint condition (Eq. 5) for the semi-discretized system. This can be derived by considering the requirements on the coupling operators Γ and Λ which ensure the energy is conserved when Υ → ∞ in the inviscid and zero temperature limit.

To obtain appropriate behaviors for the thermal fluctuations, it is important to develop stochastic driving fields which are tailored to the specific semi-discretizations used in the numerical methods. Once the stochastic driving fields are determined, which is the subject of the next section, the equations can be integrated in time using traditional methods for SDEs, such as the Euler-Maruyama method or a stochastic Runge-Kutta method [21]. More sophisticated integrators in time can also be developed to cope with sources of stiffness but are beyond the scope of this entry [7]. For each of the reduced equations, similar semi-discretizations can be developed as the one presented above.

Stochastic Driving Fields for Semi-discretizations

To obtain behaviors consistent with statistical mechanics, it is important stochastic driving fields be used which are tailored to the specific numerical discretization employed [5–7, 15]. To ensure consistency with statistical mechanics, we will again use the fluctuation-dissipation principle but now apply it to the semi-discretized equations. For each regime, we then discuss the important issues arising in practice concerning the efficient generation of these stochastic driving fields.

To obtain the covariance structure for this regime, we apply the fluctuation-dissipation principle as expressed in Eq. 20 to the semi-discretized Eqs. 36–38. This gives the covariance

    G = −2LC = (2 k_B T) diag( −Δx^{−3} L ,  m^{−1} Υ ,  0 ).   (41)

The factor of Δx^{−3} arises from the form of the energy for the discretized system, which gives covariance ρ Δx^{−3} k_B T for the equilibrium fluctuations of the total momentum; see Eq. 40. In practice, achieving the covariance associated with the dissipative operator L of the fluid is typically the most challenging to generate efficiently. This arises from the large number N of lattice sites in the discretization. One approach is to determine a factor Q such that the block G_{p,p} = Q Q^T; subscripts indicate the block entry of the matrix. The required random field with covariance G_{p,p} is then given by g = Qξ, where ξ is the uncorrelated Gaussian field with the covariance structure I. For the discretization used on the uniform periodic mesh, the matrices L and C are cyclic [31]. This has the important consequence that they are both diagonalizable in the discrete Fourier basis of the lattice. As a result, the field f_thm can be generated using the fast Fourier transform (FFT) with at most O(N log N) computational steps. In fact, in this special case of the discretization, "random fluxes" at the cell faces can be used to generate the field in O(N) computational steps [5]. Other approaches can be used to generate the random fields on nonperiodic meshes and on multilevel meshes; see [4, 5].

Equilibrium Statistical Mechanics of SELM Dynamics

We now discuss how the SELM formalism and the presented numerical methods capture the equilibrium statistical mechanics of the fluid-structure system. This is done through an analysis of the invariant probability distribution of the stochastic dynamics. For the fluid-structure systems considered, the appropriate probability distribution is given by the Gibbs-Boltzmann distribution

    ρ_GB(z) = (1/Z) exp(−E(z)/k_B T),   (42)

where z is the state of the system, E is the energy, k_B is Boltzmann's constant, T is the system temperature, and Z is a normalization constant for the distribution [29]. We show that this Gibbs-Boltzmann distribution is the equilibrium distribution of both the full stochastic dynamics and the reduced stochastic dynamics in each physical regime.

We present here both a verification of the invariance of the Gibbs-Boltzmann distribution for the general formalism and for numerical discretizations of the formalism. The verification is rather formal for the undiscretized formalism, given technical issues which would need to be addressed for such an infinite dimensional dynamical system. However, the verification is rigorous for the semi-discretization of the formalism, which yields a finite dimensional dynamical system. The latter is likely the most relevant case in practice. Given the nearly identical calculations involved in the verification for the general formalism and its semi-discretizations, we use a notation in which the key differences between the two cases primarily arise in the definition of the energy. In particular, the energy is understood to be given by Eq. 4 when considering the general SELM formalism and Eq. 40 when considering semi-discretizations.

Formulation in Terms of Total Momentum Field

The stochastic dynamics given by Eqs. 10–12 is a change of variable of the full stochastic dynamics of the SELM formalism given by Eqs. 1–3. Thus verifying the invariance using the reformulated description is also applicable to Eqs. 1–3 and vice versa. To verify the invariance in the other regimes, it is convenient to work with the reformulated description. The energy associated with the reformulated description is given by

    E[p, v, X] = (1/2) ∫_Ω ρ^{−1} |p(y) − Λ[mv](y)|² dy + (m/2)|v|² + Φ[X].   (43)

The energy associated with the semi-discretization is

    E[p, v, X] = Σ_m (1/2) ρ^{−1} |p(x_m) − Λ[mv]_m|² Δx_m³ + (m/2)|v|² + Φ[X].   (44)
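The FFT route just described — diagonalize a cyclic covariance in the discrete Fourier basis and scale white noise by the square root of the eigenvalues — can be sketched in a few lines. The following one-dimensional demo is an illustration only: the periodic covariance kernel, grid size, and sample count are assumptions made for the example, not values from the text.

```python
import numpy as np

def sample_periodic_gaussian_field(c, n_samples, rng):
    """Draw zero-mean Gaussian fields with circulant covariance
    C[i, j] = c[(i - j) mod N] by factoring C = Q Q^T in Fourier space."""
    N = len(c)
    lam = np.fft.fft(c).real          # eigenvalues of C: DFT of its first column
    lam = np.clip(lam, 0.0, None)     # guard against tiny negative round-off
    xi = rng.standard_normal((n_samples, N))   # uncorrelated Gaussian field
    # Q xi = F^{-1} diag(sqrt(lam)) F xi, applied row by row via the FFT.
    return np.fft.ifft(np.fft.fft(xi, axis=1) * np.sqrt(lam), axis=1).real

rng = np.random.default_rng(0)
N = 64
d = np.minimum(np.arange(N), N - np.arange(N))
c = np.exp(-(d / 4.0) ** 2)           # illustrative smooth periodic kernel
g = sample_periodic_gaussian_field(c, 20000, rng)

C = np.array([[c[(i - j) % N] for j in range(N)] for i in range(N)])
C_hat = g.T @ g / g.shape[0]          # empirical covariance of the samples
print(np.max(np.abs(C_hat - C)))      # small: sampling error only
```

Because the covariance matrix is circulant, its eigenvalues are exactly the DFT of its first column, so the factor Q is applied in O(N log N) operations per field, as in the discussion above.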

The probability density ρ(p, v, X, t) for the current state of the system under the SELM dynamics is governed by the Fokker-Planck equation

    ∂ρ/∂t = −∇·J   (45)

with probability flux

    J = [ Lu + Λ[−∇_X Φ] + (∇_X Λ[mv]) v + λ ;  −Υ(v − Γu) − ∇_X Φ + ζ ;  v ] ρ − (1/2)(∇·G) ρ − (1/2) G ∇ρ.   (46)

The covariance operator G is associated with the Gaussian field g = [g_thm, F_thm, 0]^T by ⟨g(s) g^T(t)⟩ = G δ(t − s). In this regime, G is given by Eq. 13 or 41. In the notation, [∇·G]_i = ∂_{z_j} G_{ij}(z) with the summation convention for repeated indices. To simplify the notation, we have suppressed denoting the specific functions on which each of the operators acts; see Eqs. 10–12 for these details.

The requirement that the Gibbs-Boltzmann distribution ρ_GB given by Eq. 42 be invariant under the stochastic dynamics is equivalent to the distribution yielding ∇·J = 0. We find it convenient to group terms and express this condition as

    ∇·J = A₁ + A₂ + ∇·A₃ + ∇·A₄ = 0,   (47)

where

    A₁ = −(k_B T)^{−1} [ (Λ[−∇_X Φ] + (∇_X Λ[mv]) v + λ₁)·∇_p E + (−∇_X Φ + ζ₁)·∇_v E + v·∇_X E ] ρ_GB

    A₂ = [ ∇_p·(Λ[−∇_X Φ] + (∇_X Λ[mv]) v + λ₁) + ∇_v·(−∇_X Φ + ζ₁) + ∇_X·v ] ρ_GB

    A₃ = −(1/2)(∇·G) ρ_GB   (48)

    A₄ = [ Lu + λ₂ + (2 k_B T)^{−1}(G_{pp} ∇_p E + G_{pv} ∇_v E + G_{pX} ∇_X E) ;
           −Υ(v − Γu) + ζ₂ + (2 k_B T)^{−1}(G_{vp} ∇_p E + G_{vv} ∇_v E + G_{vX} ∇_X E) ;
           (2 k_B T)^{−1}(G_{Xp} ∇_p E + G_{Xv} ∇_v E + G_{XX} ∇_X E) ] ρ_GB.
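The zero-divergence condition (47) can be made concrete on the simplest nontrivial example: a linear Langevin (Ornstein-Uhlenbeck) system dz = A z dt + B dW_t with quadratic energy. There, invariance of the Gibbs-Boltzmann distribution with covariance Σ reduces to the Lyapunov relation AΣ + ΣAᵀ + BBᵀ = 0, which the following sketch checks for a damped harmonic oscillator. This toy model and its parameter values are assumptions for illustration, not part of the SELM formalism itself.

```python
import numpy as np

# Damped harmonic oscillator:
#   dq = (p/m) dt,
#   dp = (-k q - (gamma/m) p) dt + sqrt(2 gamma kB T) dW.
m, k, gamma, kBT = 2.0, 3.0, 0.7, 1.5

A = np.array([[0.0, 1.0 / m],
              [-k, -gamma / m]])
B = np.array([[0.0],
              [np.sqrt(2.0 * gamma * kBT)]])

# Gibbs-Boltzmann equilibrium covariance for E = p^2/(2m) + k q^2/2:
Sigma = np.diag([kBT / k, m * kBT])

# Stationarity of the Fokker-Planck equation (zero net probability flux)
# is equivalent to the Lyapunov equation A Sigma + Sigma A^T + B B^T = 0.
residual = A @ Sigma + Sigma @ A.T + B @ B.T
print(np.max(np.abs(residual)))   # ~0 up to round-off
```

The vanishing residual is the finite-dimensional analogue of A₁ + A₂ + ∇·A₃ + ∇·A₄ = 0: the nondissipative terms conserve the energy, and the dissipative drift balances the noise through the fluctuation-dissipation relation.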

We assume here that the Lagrange multipliers can be split λ = λ₁ + λ₂ and ζ = ζ₁ + ζ₂ to impose the constraints by considering in isolation different terms contributing to the dynamics; see Eq. 48. This is always possible for linear constraints. The block entries of the covariance operator G are denoted by G_{i,j} with i, j ∈ {p, v, X}. For the energy of the discretized system given by Eq. 44, we have

    ∇_{p_n} E = u(x_n) Δx_n³   (49)

    ∇_{v_q} E = −Σ_m u(x_m)·∇_{v_q}(Λ[mv]_m) Δx_m³ + m v_q   (50)

    ∇_{X_q} E = −Σ_m u(x_m)·∇_{X_q}(Λ[mv]_m) Δx_m³ + ∇_{X_q} Φ,   (51)

where u = ρ^{−1}(p − Λ[mv]). Similar expressions for the energy of the undiscretized formalism can be obtained by using the calculus of variations [18].

We now consider ∇·J and each term A₁, A₂, A₃, A₄. The term A₁ can be shown to be the time derivative of the energy, A₁ = dE/dt, when considering only a subset of the contributions to the dynamics. Thus, conservation of the energy under this restricted dynamics would result in A₁ being zero. For the SELM formalism, we find by direct substitution of the gradients of E given by Eqs. 49–51 into Eq. 48 that A₁ = 0. When there are constraints, it is important to consider only admissible states (p, v, X). This shows that in the inviscid and zero temperature limit of SELM, the resulting dynamics are nondissipative. This property imposes constraints on the coupling operators and can be viewed as a further motivation for the adjoint conditions imposed in Eq. 5.

The term A₂ gives the compressibility of the phase-space flow generated by the nondissipative dynamics of the SELM formalism. The flow is generated by the vector field (Λ[−∇_X Φ] + (∇_X Λ[mv]) v + λ₁, −∇_X Φ + ζ₁, v) on the phase space (p, v, X). When this term is nonzero, there are important implications for the Liouville theorem and statistical mechanics of the system [34]. For the current regime, we have A₂ = 0, since in the divergence each component of the vector field is seen to be independent of the variable with respect to which the derivative is computed. This shows that in the inviscid and zero temperature limit of SELM, the phase-space flow is incompressible. For the reduced SELM descriptions, we shall see this is not always the case.

The term A₃ corresponds to fluxes arising from multiplicative features of the stochastic driving fields. When the covariance G has a dependence on the current state of the system, this can result in possible changes in the amplitude and correlations in the fluctuations. These changes can yield asymmetries in the stochastic dynamics which manifest as a net probability flux. In the SELM formalism, it is found that in the divergence of G, each contributing entry is independent of the variable with respect to which the derivative is being computed. This shows that for the SELM dynamics there are no such probability fluxes, A₃ = 0.

The last term A₄ accounts for the fluxes arising from the primarily dissipative dynamics and the stochastic driving fields. This term is calculated by substituting the gradients of the energy given by Eqs. 49–51 and using the choice of covariance structure given by Eq. 13 or 41. By direct substitution this term is found to be zero, A₄ = 0.

This shows the invariance of the Gibbs-Boltzmann distribution under the SELM dynamics. This provides a rather strong validation of the stochastic driving fields introduced for the SELM formalism. This shows the SELM stochastic dynamics are consistent with equilibrium statistical mechanics [29].

Conclusions

An approach for fluid-structure interactions subject to thermal fluctuations was presented based on a mechanical description utilizing both Eulerian and Lagrangian reference frames. General conditions were established for operators coupling these descriptions. A reformulated description was presented for the stochastic dynamics of the fluid-structure system having convenient features for analysis and for computational methods. Analysis was presented establishing for the SELM stochastic dynamics that the Gibbs-Boltzmann distribution is invariant. The SELM formalism provides a general framework for the development of computational methods for applications requiring a consistent treatment of structure elastic mechanics, hydrodynamic coupling, and thermal fluctuations. A more detailed and comprehensive discussion of SELM can be found in our paper [6].

Acknowledgements The author P. J. A. acknowledges support from research grant NSF CAREER DMS-0956210.

References

1. Acheson, D.J.: Elementary Fluid Dynamics. Oxford Applied Mathematics and Computing Science Series. Oxford University Press/Clarendon Press, Oxford/New York (1990)
2. Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., Walker, P.: Molecular Biology of the Cell. Garland Publishing, New York (2002)
3. Armstrong, R.C., Byron Bird, R., Hassager, O.: Dynamics of Polymeric Liquids, Vols. I, II. Wiley, New York (1987)
4. Atzberger, P.: Spatially adaptive stochastic multigrid methods for fluid-structure systems with thermal fluctuations. Technical report (2010). arXiv:1003.2680
5. Atzberger, P.J.: Spatially adaptive stochastic numerical methods for intrinsic fluctuations in reaction-diffusion systems. J. Comput. Phys. 229, 3474–3501 (2010)
6. Atzberger, P.J.: Stochastic Eulerian Lagrangian methods for fluid-structure interactions with thermal fluctuations. J. Comput. Phys. 230, 2821–2837 (2011)
7. Atzberger, P.J., Kramer, P.R., Peskin, C.S.: A stochastic immersed boundary method for fluid-structure dynamics at microscopic length scales. J. Comput. Phys. 224, 1255–1292 (2007)
8. Banchio, A.J., Brady, J.F.: Accelerated Stokesian dynamics: Brownian motion. J. Chem. Phys. 118, 10323–10332 (2003)
9. Brady, J.F., Bossis, G.: Stokesian dynamics. Annu. Rev. Fluid Mech. 20, 111–157 (1988)
10. Da Prato, G., Zabczyk, J.: Stochastic Equations in Infinite Dimensions. Cambridge University Press, Cambridge/New York (1992)
11. Danuser, G., Waterman-Storer, C.M.: Quantitative fluorescent speckle microscopy of cytoskeleton dynamics. Annu. Rev. Biophys. Biomol. Struct. 35, 361–387 (2006)
12. De Fabritiis, G., Serrano, M., Delgado-Buscalioni, R., Coveney, P.V.: Fluctuating hydrodynamic modeling of fluids at the nanoscale. Phys. Rev. E 75, 026307 (2007)
13. Doi, M., Edwards, S.F.: The Theory of Polymer Dynamics. Oxford University Press, New York (1986)
14. Donev, A., Bell, J.B., Garcia, A.L., Alder, B.J.: A hybrid particle-continuum method for hydrodynamics of complex fluids. SIAM J. Multiscale Model. Simul. 8, 871–911 (2010)
15. Donev, A., Vanden-Eijnden, E., Garcia, A.L., Bell, J.B.: On the accuracy of finite-volume schemes for fluctuating hydrodynamics. Commun. Appl. Math. Comput. Sci. 5(2), 149–197 (2010)
16. Eijkel, J.C.T., Napoli, M., Pennathur, S.: Nanofluidic technology for biomolecule applications: a critical review. Lab Chip 10, 957–985 (2010)
17. Ermak, D.L., McCammon, J.A.: Brownian dynamics with hydrodynamic interactions. J. Chem. Phys. 69, 1352–1360 (1978)
18. Gelfand, I.M., Fomin, S.V.: Calculus of Variations. Dover, Mineola (2000)
19. Gotter, R., Kroy, K., Frey, E., Barmann, M., Sackmann, E.: Dynamic light scattering from semidilute actin solutions: a study of hydrodynamic screening, filament bending stiffness, and the effect of tropomyosin/troponin-binding. Macromolecules 29, 30–36 (1996)
20. Gottlieb, D., Orszag, S.A.: Numerical Analysis of Spectral Methods: Theory and Applications. SIAM, Philadelphia (1993)
21. Kloeden, P.E., Platen, E.: Numerical Solution of Stochastic Differential Equations. Springer, Berlin/New York (1992)
22. Landau, L.D., Lifshitz, E.M.: Course of Theoretical Physics. Statistical Physics, vol. 9. Pergamon Press, Oxford (1980)
23. Larson, R.G.: The Structure and Rheology of Complex Fluids. Oxford University Press, New York (1999)
24. Lieb, E.H., Loss, M.: Analysis. American Mathematical Society, Providence (2001)
25. Mezei, F., Pappas, C., Gutberlet, T.: Neutron Spin Echo Spectroscopy: Basics, Trends, and Applications. Springer, Berlin/New York (2003)
26. Moffitt, J.R., Chemla, Y.R., Smith, S.B., Bustamante, C.: Recent advances in optical tweezers. Annu. Rev. Biochem. 77, 205–228 (2008)
27. Peskin, C.S.: Numerical analysis of blood flow in the heart. J. Comput. Phys. 25, 220–252 (1977)
28. Peskin, C.S.: The immersed boundary method. Acta Numerica 11, 479–517 (2002)
29. Reichl, L.E.: A Modern Course in Statistical Physics. Wiley, New York (1998)
30. Squires, T.M., Quake, S.R.: Microfluidics: fluid physics at the nanoliter scale. Rev. Mod. Phys. 77, 977–1026 (2005)
31. Strang, G.: Linear Algebra and Its Applications. Harcourt Brace Jovanovich College Publishers, San Diego (1988)
32. Strang, G., Fix, G.: An Analysis of the Finite Element Method. Wellesley-Cambridge Press, Wellesley (2008)
33. Strikwerda, J.C.: Finite Difference Schemes and Partial Differential Equations. SIAM, Philadelphia (2004)
34. Tuckerman, M.E., Mundy, C.J., Martyna, G.J.: On the classical statistical mechanics of non-Hamiltonian systems. EPL (Europhys. Lett.) 45, 149–155 (1999)
35. Valentine, M.T., Weeks, E.R., Gisler, T., Kaplan, P.D., Yodh, A.G., Crocker, J.C., Weitz, D.A.: Two-point microrheology of inhomogeneous soft materials. Phys. Rev. Lett. 85, 888–891 (2000)
36. Watari, N., Doi, M., Larson, R.G.: Fluidic trapping of deformable polymers in microflows. Phys. Rev. E 78, 011801 (2008)
37. Watson, M.C., Brown, F.L.H.: Interpreting membrane scattering experiments at the mesoscale: the contribution of dissipation within the bilayer. Biophys. J. 98, L9–L11 (2010)

Stochastic Filtering

M.V. Tretyakov
School of Mathematical Sciences, University of Nottingham, Nottingham, UK

Mathematics Subject Classification

60G35; 93E11; 93E10; 62M20; 60H15; 65C30; 65C35; 60H35

Synonyms

Filtering problem for hidden Markov models; Nonlinear filtering

Short Definition

Let T₁ and T₂ be either a time interval [0, T] or a discrete set. The stochastic filtering problem consists in estimating an unobservable signal X_t, t ∈ T₁, based on an observation {y_s, s ≤ t, s ∈ T₂}, where the process y_t is related to X_t via a stochastic model.

Description

We restrict ourselves to the case when an unobservable signal and observation are continuous time processes with T₁ = T₂ = [0, T] (see discrete filtering in, e.g., [1, 3, 7]). Let (Ω, F, P) be a complete probability space, F_t, 0 ≤ t ≤ T, be a filtration satisfying the usual hypotheses, and (w_t, F_t) and (v_t, F_t) be d₁-dimensional and r-dimensional independent standard Wiener processes, respectively. We consider the classical filtering scheme in which the unobservable signal process ("hidden" state) X_t ∈ R^d and the observation process y_t ∈ R^r satisfy the system of Itô stochastic differential equations (SDE):

    dX = α(X)ds + σ(X)dw_s + γ(X)dv_s,  X₀ = x,   (1)

    dy = β(X)ds + dv_s,  y₀ = 0,   (2)

where α(x) and β(x) are d-dimensional and r-dimensional vector functions, respectively, and σ(x) and γ(x) are d × d₁-dimensional and d × r-dimensional matrix functions, respectively. The vector X₀ = x in the initial condition for (1) is usually random (i.e., uncertain); it is assumed to be independent of both w and v, and its density φ(·) is assumed to be known. Let f(x) be a function on R^d. We assume that the coefficients in (1), (2) and the function f are bounded and have bounded derivatives up to some order. The stochastic filtering problem consists in constructing the estimate f̂(X_t) based on the observation {y_s, 0 ≤ s ≤ t}, which is the best in the mean-square sense, i.e., the problem amounts to computing the conditional expectation:

    π_t[f] = f̂(X_t) = E(f(X_t) | y_s, 0 ≤ s ≤ t) =: E^y f(X_t).   (3)

Applications of nonlinear filtering include tracking, navigation systems, cryptography, image processing, weather forecasting, financial engineering, speech recognition, and many others (see, e.g., [2, 11] and references therein). For a historical account, see, e.g., [1].

Optimal Filter Equations

In this section we give a number of expressions for the optimal filter which involve solving some stochastic evolution equations. Proofs of the results presented in this section and their detailed exposition and extensions are available, e.g., in [1, 4, 6–11].

The solution π_t[f] to the filtering problem (3), (1)–(2) satisfies the nonlinear stochastic evolution equation

    dπ_t[f] = π_t[Lf]dt + (π_t[M^⊤f] − π_t[f]π_t[β^⊤])(dy − π_t[β]dt),   (4)

where L is the generator for the diffusion process X_t:

    Lf := (1/2) Σ_{i,j=1}^d a_{ij} ∂²f/∂x_i∂x_j + Σ_{i=1}^d α_i ∂f/∂x_i,

with a = {a_{ij}} being a d × d-dimensional matrix defined by a(x) = σ(x)σ^⊤(x) + γ(x)γ^⊤(x), and M = (M₁, ..., M_r)^⊤ is a vector of the operators M_j f := Σ_{i=1}^d γ_{ij} ∂f/∂x_i + β_j f. The equation of optimal nonlinear filtering (4) is usually called the Kushner-Stratonovich equation or the Fujisaki-Kallianpur-Kunita equation. If the conditional measure E(I(X(t) ∈ dx) | y_s, 0 ≤ s ≤ t) has a smooth density π(t, x) with respect to the Lebesgue measure, then it solves the following nonlinear stochastic equation:

    dπ(t, x) = L*π(t, x)dt + (M − π_t[β])^⊤ π(t, x)(dy − π_t[β]dt),  π(0, x) = φ(x),   (5)

where L* is an adjoint operator to L: L*f := (1/2) Σ_{i,j=1}^d ∂²(a_{ij}f)/∂x_i∂x_j − Σ_{i=1}^d ∂(α_i f)/∂x_i, and M* is an adjoint operator to M: M*π = (M*₁π, ..., M*_rπ)^⊤ with M*_j f = −Σ_{i=1}^d ∂(γ_{ij}f)/∂x_i + β_j f. We note that π_t[f] = ∫_{R^d} f(x)π(t, x)dx. We also remark that the process v̄_t := y_t − ∫₀ᵗ π_s[β]ds is called the innovation process.

Now we will give another expression for the optimal filter. Let

    ρ_t := exp( ∫₀ᵗ β^⊤(X_s)dv_s + (1/2)∫₀ᵗ |β(X_s)|²ds ) = exp( ∫₀ᵗ β^⊤(X_s)dy_s − (1/2)∫₀ᵗ |β(X_s)|²ds ).

According to our assumptions, we have E ρ_t^{−1} = 1, 0 ≤ t ≤ T. We introduce the new probability measure P̃ on (Ω, F): P̃(Γ) = ∫_Γ ρ_T^{−1}(ω) dP(ω). The measures P and P̃ are mutually absolutely continuous. Due to the Girsanov theorem, y_s is a Wiener process on (Ω, F, F_t, P̃); the processes X_s and y_s are independent on (Ω, F, F_s, P̃); and the process X_s satisfies the Itô SDE

    dX = (α(X) − γ(X)β(X))ds + σ(X)dw_s + γ(X)dy_s,  X₀ = x.   (6)

Due to the Kallianpur-Striebel formula (a particular case of the general Bayes formula [7, 9]) for the conditional mean (3), we have

    π_t[f] = Ẽ(f(X_t)ρ_t | y_s, 0 ≤ s ≤ t) / Ẽ(ρ_t | y_s, 0 ≤ s ≤ t) = Ẽ^y(f(X_t)ρ_t) / Ẽ^y ρ_t,   (7)

where X_t is from (6), Ẽ means expectation according to the measure P̃, and Ẽ^y(·) := Ẽ(· | y_s, 0 ≤ s ≤ t). Let

    φ_t[g] := Ẽ^y(g(X_t)ρ_t),

where g is a scalar function on R^d. The process φ_t is often called the unnormalized optimal filter. It satisfies the linear evolution equation

    dφ_t[g] = φ_t[Lg]dt + φ_t[M^⊤g]dy_t,  φ₀[g] = π₀[g],   (8)

which is known as the Zakai equation. Assuming that there is the corresponding smooth unnormalized filtering density ρ(t, x), it solves the linear stochastic partial differential equation (SPDE) of parabolic type:

    dρ(t, x) = L*ρ(t, x)dt + (M*ρ(t, x))^⊤ dy_t,  ρ(0, x) = φ(x).   (9)

We note that φ_t[g] = ∫_{R^d} g(x)ρ(t, x)dx.

The unnormalized optimal filter can also be found as a solution of a backward SPDE. To this end, let us fix a time moment t and introduce the function

    u_g(s, x; t) = Ẽ^y( g(X_t^{s,x}) ρ_t^{s,x,1} ),   (10)

where x ∈ R^d is deterministic and X_{s'}^{s,x}, ρ_{s'}^{s,x,1}, s' ≥ s, is the solution of the Itô SDE

    dX = (α(X) − γ(X)β(X))ds' + σ(X)dw_{s'} + γ(X)dy_{s'},  X_s = x,

    dρ = ρ β^⊤(X) dy_{s'},  ρ_s = 1.

The function u_g(s, x; t), s ≤ t, is the solution of the Cauchy problem for the backward linear SPDE:

    −du = Lu ds + (Mu)^⊤ ⋆ dy_s,  u(t, x) = g(x).   (11)

The notation "⋆ dy" means backward Itô integral [5, 9]. If X₀ = x = ξ is a random variable with the density φ(·), we can write

    π_t[f] = u_{f,φ}(0, t) / u_{1,φ}(0, t),   (12)

where u_{g,φ}(0, t) := ∫_{R^d} u_g(0, x; t)φ(x)dx = Ẽ^y( g(X_t^{0,ξ}) ρ_t^{0,ξ,1} ) = φ_t[g].

Generally, numerical methods are required to solve optimal filtering equations. For an overview of various numerical approximations for the nonlinear filtering problem, see [1, Chap. 8] together with references therein, and for a number of recent developments see [2].

Linear Filtering and Kalman-Bucy Filter

There are very few cases when explicit formulas for optimal filters are available [1, 7]. The most notable case is when the filtering problem is linear. Consider the system of linear SDE:

    dX = (a_s + A_s X)ds + Q_s dw_s + G_s dv_s,  X₀ = x,   (13)

    dy = (b_s + B_s X)ds + dv_s,  y₀ = 0,   (14)

where A_s, B_s, Q_s, and G_s are deterministic matrix functions of time s having the appropriate dimensions; a_s and b_s are deterministic vector functions of time s having the appropriate dimensions; the initial condition X₀ = x is a Gaussian random vector with mean M₀ ∈ R^d and covariance matrix C₀ ∈ R^{d×d}, and it is independent of both w and v; the other notation is as in (1) and (2).

We note that the solution X_t, y_t of the SDE (13) and (14) is a Gaussian process. The conditional distribution of X_t given {y_s, 0 ≤ s ≤ t} is Gaussian with mean X̂_t and covariance P_t, which satisfy the following system of differential equations [1, 7, 10, 11]:

    dX̂ = (a_s + A_s X̂)ds + (G_s + P B_s^⊤)(dy_s − (b_s + B_s X̂)ds),  X̂₀ = M₀,   (15)

    dP/dt = P A_s^⊤ + A_s P − (G_s + P B_s^⊤)(G_s + P B_s^⊤)^⊤ + Q_s Q_s^⊤ + G_s G_s^⊤,  P₀ = C₀.   (16)

The solution X̂_t, P_t is called the Kalman-Bucy filter (or linear quadratic estimation). We remark that (15) for the conditional mean X̂_t = E(X_t | y_s, 0 ≤ s ≤ t) is a linear SDE, while the solution P_t of the matrix Riccati equation (16) is deterministic and it can be precomputed off-line. Online updating of X̂_t with arrival of new data y_t from observations is computationally very cheap, and the Kalman-Bucy filter and its various modifications are widely used in practical applications.
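A minimal numerical sketch of the Kalman-Bucy filter (15)–(16) for a scalar time-invariant system, using Euler time stepping. The coefficient values are illustrative assumptions, and the signal and observation noises are taken independent (G = 0, a = b = 0):

```python
import numpy as np

rng = np.random.default_rng(1)
A, B, Q = -1.0, 2.0, 0.5     # illustrative scalar coefficients; G = 0, a = b = 0
T, N = 5.0, 5000
dt = T / N

X = 1.0                      # true (unobservable) signal, Eq. (13)
Xhat, P = 0.0, 1.0           # filter mean and variance, Eqs. (15)-(16)
err2 = []
for _ in range(N):
    dw, dv = rng.normal(0.0, np.sqrt(dt), 2)
    dy = B * X * dt + dv                                   # observation, Eq. (14)
    X += A * X * dt + Q * dw                               # signal step
    Xhat += A * Xhat * dt + P * B * (dy - B * Xhat * dt)   # mean update, Eq. (15)
    P += (2 * A * P - (P * B) ** 2 + Q ** 2) * dt          # Riccati, Eq. (16)
    err2.append((X - Xhat) ** 2)

print(P)                         # deterministic: near the Riccati fixed point
print(np.mean(err2[N // 2 :]))   # empirical squared error, comparable to P
```

Consistent with the remarks above, P follows a deterministic Riccati equation (here approaching its fixed point), while the mean update is a cheap linear recursion driven by the innovations dy − B X̂ dt.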

Stochastic ODEs

Peter Kloeden
FB Mathematik, J.W. Goethe-Universität, Frankfurt am Main, Germany

A scalar stochastic ordinary differential equation (SODE)

    dX_t = f(t, X_t) dt + g(t, X_t) dW_t   (1)

involves a Wiener process W_t, t ≥ 0, which is one of the most fundamental stochastic processes and is often called a Brownian motion. A Wiener process is a Gaussian process with W₀ = 0 with probability 1 and normally distributed increments W_t − W_s for 0 ≤ s < t, with E(W_t − W_s) = 0 and E[(W_t − W_s)²] = t − s, where the increments W_{t₂} − W_{t₁} and W_{t₄} − W_{t₃} on nonoverlapping intervals (i.e., with 0 ≤ t₁ < t₂ ≤ t₃ < t₄) are independent random variables.

The Itô stochastic integral is defined as the mean-square limit

    ∫_{t₀}^{T} f(t) dW_t := m.s.-lim_{N→∞} Σ_{n=0}^{N−1} f(t_n) ( W_{t_{n+1}} − W_{t_n} ),

where t_{n+1} − t_n = (T − t₀)/N for n = 0, 1, ..., N − 1. The integrand function f may be random or even depend on the path of the Wiener process, but f(t) should be independent of future increments of the Wiener process, i.e., of W_{t+h} − W_t for h > 0.

The Itô stochastic integral has the important properties (the second is called the Itô isometry) that

    E[ ∫_{t₀}^{T} f(t) dW_t ] = 0,   E[ ( ∫_{t₀}^{T} f(t) dW_t )² ] = ∫_{t₀}^{T} E[f(t)²] dt.

The Itô formula, the stochastic chain rule, involves the operators

    L⁰U = ∂U/∂t + f ∂U/∂x + (1/2) g² ∂²U/∂x²,   L¹U = g ∂U/∂x.

An immediate consequence is that the integration rules and tricks from deterministic calculus do not hold and different expressions result, e.g.,

    ∫₀^T W_s dW_s = (1/2) W_T² − (1/2) T.

The situation for vector-valued SODEs and vector-valued Wiener processes is similar. Details can be found in Refs. [3, 4, 6].

Stratonovich SODEs

There is another stochastic integral called the Stratonovich integral, for which the integrand function is evaluated at the midpoint of each partition subinterval rather than at the left end point. It is written with ∘dW_t to distinguish it from the Itô integral. A Stratonovich SODE is thus written

    dX_t = f(t, X_t) dt + g(t, X_t) ∘ dW_t.

Stratonovich stochastic calculus has the same chain rule as deterministic calculus, which means that Stratonovich SODEs can be solved with the same integration tricks as for ordinary differential equations. However, Stratonovich stochastic integrals do not satisfy the nice properties above for Itô stochastic integrals, which are very convenient for estimates in proofs. Nor does the Stratonovich SODE have the same direct connection with diffusion process theory as the Itô SODE, e.g., the coefficients of the Fokker-Planck equation correspond to those of the Itô SODE (1), i.e.,

    ∂p/∂t + ∂(f p)/∂x − (1/2) ∂²(g² p)/∂x² = 0.

Numerical Solution of SODEs

The simplest numerical method for the above SODE (1) is the Euler-Maruyama scheme, given by

    Y_{n+1} = Y_n + f(t_n, Y_n) Δ_n + g(t_n, Y_n) ΔW_n,

where Δ_n = t_{n+1} − t_n and ΔW_n = W_{t_{n+1}} − W_{t_n}. This is intuitively consistent with the definition of the Itô integral. Here Y_n is a random variable, which is supposed to be an approximation of X_{t_n}. The stochastic increments ΔW_n, which are N(0, Δ_n) distributed, can be generated using, for example, the Box-Muller method. In practice, however, only individual realizations can be computed.

Depending on whether the realizations of the solutions or only their probability distributions are required to be close, one distinguishes between strong and weak convergence of numerical schemes, respectively, on a given interval [t₀, T]. Let Δ = max_n Δ_n be the maximum step size. Then a numerical scheme is said to converge with strong order γ if, for sufficiently small Δ,

    E( |X_T − Y_{N_T}^{(Δ)}| ) ≤ K_T Δ^γ,

and with weak order β if

    | E(p(X_T)) − E(p(Y_{N_T}^{(Δ)})) | ≤ K_{p,T} Δ^β

for each polynomial p. These are global discretization errors, and the largest possible values of γ and β give the corresponding strong and weak orders, respectively, of the scheme for a whole class of stochastic differential equations, e.g., with sufficiently often continuously differentiable coefficient functions. For example, the Euler-Maruyama scheme has strong order γ = 1/2 and weak order β = 1, while the Milstein scheme described next attains strong order γ = 1.
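The strong-order statement for the Euler-Maruyama scheme can be illustrated numerically on geometric Brownian motion dX_t = λX_t dt + μX_t dW_t, chosen because its exact solution X_T = X₀ exp((λ − μ²/2)T + μW_T) is available for comparison; the parameter values below are illustrative assumptions:

```python
import numpy as np

def em_strong_error(n_steps, n_paths=20000, lam=1.0, mu=0.5, X0=1.0, T=1.0, seed=42):
    """Mean absolute error at time T of Euler-Maruyama vs the exact GBM
    solution, both driven by the same Brownian increments."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps))
    Y = np.full(n_paths, X0)
    for n in range(n_steps):
        Y = Y + lam * Y * dt + mu * Y * dW[:, n]   # Euler-Maruyama step
    W_T = dW.sum(axis=1)
    X_T = X0 * np.exp((lam - 0.5 * mu ** 2) * T + mu * W_T)   # exact solution
    return np.mean(np.abs(X_T - Y))

e_coarse = em_strong_error(50)
e_fine = em_strong_error(200)    # step size 4x smaller
print(e_coarse, e_fine)
```

Refining the step size by a factor of 4 should roughly halve the mean pathwise error at T, consistent with strong order γ = 1/2 for this multiplicative-noise SODE.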

The Milstein scheme

    Y_{n+1} = Y_n + f(t_n, Y_n) Δ_n + g(t_n, Y_n) ΔW_n + (1/2) g(t_n, Y_n) (∂g/∂x)(t_n, Y_n) { (ΔW_n)² − Δ_n }

has strong order γ = 1 and weak order β = 1; see [2, 3, 5].

Note that these convergence orders may be better for specific SODEs within the given class, e.g., the Euler-Maruyama scheme has strong order γ = 1 for SODEs with additive noise, i.e., for which g does not depend on x, since it then coincides with the Milstein scheme. The Milstein scheme is derived by expanding the integrand of the stochastic integral with the Itô formula, the stochastic chain rule. The additional term involves the double stochastic integral ∫_{t_n}^{t_{n+1}} ∫_{t_n}^{s} dW_u dW_s, which provides more information about the nonsmooth Wiener process inside the discretization subinterval and is equal to (1/2){ (ΔW_n)² − Δ_n }. Numerical schemes of even higher order can be obtained in a similar way.

In general, different schemes are used for strong and weak convergence. The strong stochastic Taylor schemes have strong order γ = 1/2, 1, 3/2, 2, ..., whereas weak stochastic Taylor schemes have weak order β = 1, 2, 3, .... See [3] for more details. In particular, one should not use heuristic adaptations of numerical schemes for ordinary differential equations such as Runge-Kutta schemes, since these may not converge to the right solution or even converge at all.

The proofs of convergence rates in the literature assume that the coefficient functions in the above stochastic Taylor schemes are uniformly bounded, i.e., the partial derivatives of appropriately high order of the SODE coefficient functions f and g exist and are uniformly bounded. This assumption, however, is not satisfied in many basic and important applications, for example, with polynomial coefficients such as

    dX_t = −(1 + X_t)(1 − X_t²) dt + (1 − X_t²) dW_t

or with square-root coefficients such as in the Cox-Ingersoll-Ross volatility model

    dV_t = (ϑ − V_t) dt + σ √V_t dW_t,

which requires V_t ≥ 0. The second is more difficult because there is a small probability that numerical iterations may become negative, and various ad hoc methods have been suggested to prevent this. The paper [1] provides a systematic method to handle both of these problems by using pathwise convergence, i.e.,

    sup_{n=0,...,N_T} | X_{t_n}(ω) − Y_n^{(Δ)}(ω) | → 0 as Δ → 0,  ω ∈ Ω.

It is quite natural to consider pathwise convergence since numerical calculations are actually carried out path by path. Moreover, the solutions of some SODEs do not have bounded moments, so pathwise convergence may be the only option.

Iterated Stochastic Integrals

Vector-valued SODEs with vector-valued Wiener processes can be handled similarly. The main new difficulty is how to simulate the multiple stochastic integrals, since these cannot be written as simple formulas of the basic increments as in the double integral above when they involve different Wiener processes. In general, such multiple stochastic integrals cannot be avoided, so they must be approximated somehow. One possibility is to use random Fourier series for Brownian bridge processes based on the given Wiener processes; see [3, 5]. Another way is to simulate the integrals themselves by a simpler numerical scheme. For example, the double integral

    I_{(2,1),n} = ∫_{t_n}^{t_{n+1}} ∫_{t_n}^{t} dW_s² dW_t¹

for two independent Wiener processes W_t¹ and W_t² can be approximated by applying the (vector-valued) Euler-Maruyama scheme to the 2-dimensional Itô SODE (with superscripts labeling components)

    dX_t¹ = X_t² dW_t¹,  dX_t² = dW_t²,   (2)

over the discretization subinterval [t_n, t_{n+1}] with a suitable step size δ = (t_{n+1} − t_n)/K. The solution of the SODE (2) with the initial condition X_{t_n}¹ = 0, X_{t_n}² = W_{t_n}² at time t = t_{n+1} is given by

    X_{t_{n+1}}¹ = I_{(2,1),n},  X_{t_{n+1}}² = W_{t_{n+1}}².
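This inner-discretization idea can be sketched directly in code. The helper below is a toy implementation (names and sizes are illustrative): it applies the Euler recursion to (2) over one outer subinterval, and, for simplicity, initializes the second component to 0, so that it accumulates the increment of the second Wiener process since t_n.

```python
import numpy as np

def I21_inner_euler(dW1, dW2):
    """Euler scheme applied to (2) over one outer subinterval:
        Y1_{k+1} = Y1_k + Y2_k * dW1_k,   Y2_{k+1} = Y2_k + dW2_k,
    with Y1_0 = Y2_0 = 0 (Y2 tracks the increment of W^2 since t_n).
    Returns the approximation of I_{(2,1),n}."""
    Y1, Y2 = 0.0, 0.0
    for dw1, dw2 in zip(dW1, dW2):
        Y1 += Y2 * dw1
        Y2 += dw2
    return Y1

rng = np.random.default_rng(3)
K = 256
delta = 1.0 / K          # inner step size over an outer step of length 1
dW1 = rng.normal(0.0, np.sqrt(delta), K)
dW2 = rng.normal(0.0, np.sqrt(delta), K)

I21 = I21_inner_euler(dW1, dW2)
I12 = I21_inner_euler(dW2, dW1)    # roles of the two processes swapped
# Discrete analogue of the identity I12 + I21 ~ (Delta W^1)(Delta W^2):
print(I21 + I12, dW1.sum() * dW2.sum())
```

Swapping the roles of the two processes gives I_{(1,2),n}, and the pair approximately satisfies the identity I_{(1,2),n} + I_{(2,1),n} ≈ ΔW¹_n ΔW²_n that underlies the commutative-noise simplification discussed below.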

j j j The important thing is to decide first what kind of Writing k D tn C kı and ıWn;k D W kC1  W k ,the stochastic Euler scheme for the SDE (2) reads approximation one wants, strong or weak, as this will determine the type of scheme that should be used, and 1 1 2 1 2 2 2 then to exploit the structural properties of the SODE YkC1 D Yk C Yk ıWn;k;YkC1 D Yk C ıWn;k; under consideration to simplify the scheme that has for 0 Ä k Ä K  1; (3) been chosen to be implemented.

Y 1 0 Y 2 W 2 with the initial value 0 D , 0 D tn . The strong order of convergence of  D 1 of the Euler-Maruyama 2 References scheme ensures that ˇ ˇ p 1. Jentzen, A., Kloeden, P.E., Neuenkirch, A.: Convergence E ˇ 1 ˇ YK  I.2;1/;n Ä C ı; of numerical approximations of stochastic differential equa- tions on domains: higher order convergence rates without global Lipschitz coefficients. Numer. Math. 112, 41–64 so I.2;1/;n can be approximated in the Milstein scheme (2009) 1 2 1 by YK with ı n,i.e.,K n , without affecting 2. Kloeden, P.E.: The systematic deviation of higher order nu- the overall order of convergence. merical methods for stochastic differential equations. Milan J. Math. 70, 187–207 (2002) 3. Kloeden, P.E., Platen, E.: The Numerical Solution of Stochastic Differential Equations, 3rd rev. printing. Commutative Noise Springer, Berlin, (1999) 4. Mao, X.: Stochastic Differential Equations and Applica- Identities such as tions, 2nd edn. Horwood, Chichester (2008) 5. Milstein, G.N.: Numerical Integration of Stochastic Differ- Z Z Z Z tnC1 t tnC1 t ential Equations. Kluwer, Dordrecht (1995) j1 j2 j2 j1 6. Øksendal, B.: Stochastic Differential Equations. An Intro- dWs dWt C dWs dWt tn tn tn tn duction with Applications, 6th edn. 2003, Corr. 4th printing. Springer, Berlin (2007) j1 j2 D Wn Wn allow one to avoid calculating the multiple integrals if the corresponding coefficients in the numerical scheme 1 2 Stochastic Simulation are identical, in this case if L g2.t; x/ Á L g1.t; x/ (where L2 is defined analogously to L1) for an SODE Dieter W. Heermann of the form Institute for Theoretical Physics, Heidelberg 1 2 University, Heidelberg, Germany dXt D f.t;Xt /dtC g1.t; Xt /dWt C g2.t; Xt /dWt : (4) Then the SODE (4)issaidtohavecommutative noise. Synonyms

Concluding Remarks

The need to approximate multiple stochastic integrals places a practical restriction on the order of strong schemes that can be implemented for a general SODE. Wherever possible, special structural properties of the SODE under investigation, such as commutative noise, should be exploited to simplify strong schemes as much as possible. For weak schemes the situation is easier, as the multiple integrals do not need to be approximated so accurately. Moreover, extrapolation of weak schemes is possible.

Stochastic Simulation

Brownian Dynamics Simulation; Langevin Simulation; Monte Carlo Simulations

Modelling a system or data, one is often faced with the following:
• Exact data is unavailable or expensive to obtain
• Data is uncertain and/or specified by a probability distribution

or decisions have to be made with respect to the degrees of freedom that are taken explicitly into account. This can be seen by looking at a system with two components. One of the components could be water molecules and the other component large molecules. The decision is whether to take the water molecules explicitly into account or to treat them implicitly. Since the water molecules move much faster than the large molecules, we can eliminate the water by subsuming its action on the larger molecules into a stochastic force, i.e., a random variable with a specific distribution. Thus we have eliminated some details in favor of a probabilistic description, where some elements of the model description are given by deterministic rules and others contribute stochastically. Overall, a model derived along the outlined path can be viewed as one in which an individual state has a probability that may depend on model parameters.

In most of the interesting cases, the number of available states of the model will be so large that they simply cannot be enumerated. A sampling of the states is necessary, such that the most relevant states are sampled with the correct probability. Assume that the model has some deterministic part. In the above example, the motion of the larger molecules is governed by Newton's equations of motion. These are augmented by stochastic forces mimicking the water molecules. Depending on how exactly this is implemented, this results in the Langevin equation

\[ m\ddot{x} = -\nabla U(x) - \gamma m \dot{x} + \eta(t)\sqrt{2 k_B T \gamma m}, \quad (1) \]

where $x$ denotes the state (here the position), $U$ the potential, $m$ the mass of the large molecule, $k_B$ the Boltzmann constant, $T$ the temperature, $\gamma$ the friction coefficient, and $\eta$ the stochastic force with the properties

\[ \langle \eta(t) \rangle = 0, \quad (2) \]
\[ \langle \eta(t)\,\eta(t') \rangle = \delta(t - t'). \quad (3) \]

Neglecting the acceleration in the Langevin equation yields the Brownian dynamics equation of motion

\[ \dot{x}(t) = -\nabla U(x)/\tilde{\gamma} + \eta(t)\sqrt{2D}, \quad (4) \]

with $\tilde{\gamma} = \gamma m$ and $D = k_B T/\tilde{\gamma}$. Hence the sampling is obtained by using the equations of motion to transition from one state to the next. If enough of the available states (here $x$) are sampled, quantities of interest that depend on the states can be calculated as averages over the generated states:

\[ A_N = \sum_x A(x)\,P(x;\alpha), \quad (5) \]

where $P(x;\alpha)$ is the probability of the state $x$ and $\alpha$ a set of parameters (e.g., the temperature $T$, mass $m$, etc.).

A point of view that can be taken is that what Eqs. (1) and (4) accomplish is the generation of a stochastic trajectory through the available states. This can equally well be established by other means. As long as we satisfy the condition that the right probability distribution is generated, we can generate the trajectory by a Monte Carlo method [1]. In a Monte Carlo formulation, a transition probability from a state $x$ to another state $x'$ is specified:

\[ W(x' \mid x). \quad (6) \]

Together with the proposition probability for the state $x'$, a decision is made to accept or reject the state (Metropolis-Hastings Monte Carlo Method). An advantage of this formulation is that it allows freedom in the choice of the proposition of states and permits efficient sampling of the states (importance sampling). In more general terms, what one does is set up a biased random walk that explores the target distribution (Markov Chain Monte Carlo).

A special case of the sampling that yields a Markov chain is the Gibbs sampler. Assume $x = (x^1, x^2)$ with target $P(x;\alpha)$.

Algorithm 1 Gibbs Sampler Algorithm:
1: initialize $x_0 = (x_0^1, x_0^2)$
2: while $i \le$ max number of samples do
3:   sample $x_i^1 \sim P(x^1 \mid x_{i-1}^2, \alpha)$
4:   sample $x_i^2 \sim P(x^2 \mid x_i^1, \alpha)$
5: end while

Then $\{x^1, x^2\}$ is a Markov chain. Thus we obtain a sequence of states such that we can again apply (5) to compute the quantities of interest.

Common to all of the above stochastic simulation methods is the use of random numbers. They are used either to implement the random contribution to the force or to decide whether a state is accepted or rejected. Thus the quality of the result depends on the quality of the random number generator.
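Algorithm 1 can be made concrete for a target whose conditionals are known in closed form. The sketch below is an illustrative assumption, not from the article: it uses a bivariate Gaussian with correlation $\rho = 0.8$, for which both conditionals $P(x^1 \mid x^2)$ and $P(x^2 \mid x^1)$ are one-dimensional Gaussians $N(\rho x, 1-\rho^2)$.

```python
import numpy as np

# A runnable version of Algorithm 1 for an illustrative target (an
# assumption for demonstration): a bivariate Gaussian with correlation
# rho, whose conditionals P(x1|x2) and P(x2|x1) are N(rho*x, 1-rho^2).
def gibbs(n_samples, rho=0.8, seed=0):
    rng = np.random.default_rng(seed)
    x1, x2 = 0.0, 0.0                 # step 1: initialize x_0
    out = np.empty((n_samples, 2))
    s = np.sqrt(1.0 - rho**2)
    for i in range(n_samples):        # step 2: loop over samples
        x1 = rho * x2 + s * rng.standard_normal()  # step 3: x1_i ~ P(x1 | x2_{i-1})
        x2 = rho * x1 + s * rng.standard_normal()  # step 4: x2_i ~ P(x2 | x1_i)
        out[i] = x1, x2
    return out

chain = gibbs(100_000)
# The chain is Markov; time averages over it realize (5), e.g., the
# sampled correlation approaches rho:
print(np.corrcoef(chain.T)[0, 1])
```

Any estimate computed as an average over the generated chain, as in (5), inherits its accuracy from the mixing of the chain, which here is controlled by the correlation $\rho$.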

References

1. Binder, K., Heermann, D.W.: Monte Carlo Simulation in Statistical Physics: An Introduction. Graduate Texts in Physics, 5th edn. Springer, Heidelberg (2010)

Stochastic Systems

Guang Lin^{1,2,3} and George Em Karniadakis^4
1 Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, USA
2 School of Mechanical Engineering, Purdue University, West Lafayette, IN, USA
3 Department of Mathematics, Purdue University, West Lafayette, IN, USA
4 Division of Applied Mathematics, Brown University, Providence, RI, USA

Mathematics Subject Classification

93E03

Synonyms

Noisy Systems; Random Systems; Stochastic Systems

Short Definition

A stochastic system may contain one or more elements of random, i.e., nondeterministic, behavior. Compared to a deterministic system, a stochastic system does not always generate the same output for a given input. The elements of a system that can be stochastic in nature may include noisy initial conditions, random boundary conditions, random forcing, etc.

Description

Stochastic systems (SS) are encountered in many application domains in science, engineering, and business. They include logistics, transportation, communication networks, financial markets, supply chains, social systems, robust engineering design, statistical physics, systems biology, etc. Stochasticity or randomness is perhaps associated with a bad outcome, but harnessing stochasticity has been pursued in the arts to create beauty, e.g., in the paintings of Jackson Pollock or in the music of Iannis Xenakis. Similarly, it can be exploited in science and engineering to design new devices (e.g., stochastic resonances in the Bang and Olufsen speakers) or to design robust and cost-effective products under the framework of uncertainty-based design and real options [17].

Stochasticity is often associated with uncertainty, either intrinsic or extrinsic, and specifically with the lack of knowledge of the properties of the system; hence quantifying uncertain outcomes and system responses is of great interest in applied probability and scientific computing. This uncertainty can be further classified as aleatory, i.e., statistical, and epistemic, which can be reduced by further measurements or computations of higher resolution. Mathematically, stochasticity can be described by either deterministic or stochastic differential equations. For example, the molecular dynamics of a simple fluid is described by the classical deterministic Newton's law of motion or by the deterministic Navier-Stokes equations, whose outputs, in both cases, may however be stochastic. On the other hand, stochastic elliptic equations can be used to predict the random diffusion of water in porous media, and similarly a stochastic differential equation may be used to model neuronal activity in the brain [9, 10].

Here we will consider systems that are governed by stochastic ordinary and partial differential equations (SODEs and SPDEs), and we will present some effective methods for obtaining stochastic solutions in the next section. In classical stochastic analysis, these terms refer to differential equations subject to white noise, either additive or multiplicative, but in more recent years the same terminology has been adopted for differential equations with color noise, i.e., processes that are correlated in space or time. The color of noise, which can also be pink or violet, may dictate the numerical method used to predict efficiently the response of a stochastic system, and hence it is important to consider this carefully at the modeling stage. Specifically, the correlation length or time scale is the most important parameter of a stochastic process, as it determines the effective dimension of the process; the smaller the correlation scale, the larger the dimensionality of the stochastic system.
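The link between correlation scale and effective dimension can be illustrated numerically with a discrete Karhunen-Loeve decomposition of a process with exponential covariance $C(s,t) = e^{-|s-t|/A}$. The grid size, the correlation length $A$, and the 95% energy criterion below are illustrative assumptions, not from the article.

```python
import numpy as np

# Illustrative sketch: discrete Karhunen-Loeve decomposition of a
# zero-mean process on [0, 1] with exponential covariance
# C(s, t) = exp(-|s - t|/A). Discretizing the integral eigenproblem on a
# uniform grid (quadrature weight 1/n) gives a symmetric matrix eigenproblem.
n, A = 200, 0.2                         # grid size and correlation length (assumed)
t = np.linspace(0.0, 1.0, n)
C = np.exp(-np.abs(t[:, None] - t[None, :]) / A)
lam, e = np.linalg.eigh(C / n)          # eigenvalues in ascending order
lam, e = lam[::-1], e[:, ::-1]          # rearrange in descending order

# Effective dimension N_d: number of leading modes capturing 95% of the
# total variance; a smaller correlation length A requires more modes.
energy = np.cumsum(lam) / np.sum(lam)
N_d = int(np.searchsorted(energy, 0.95)) + 1

# One realization of the truncated expansion f(t) = sum_k sqrt(lam_k) e_k(t) xi_k
rng = np.random.default_rng(0)
xi = rng.standard_normal(N_d)
f_real = (e[:, :N_d] * np.sqrt(lam[:N_d])) @ xi
print(N_d, f_real.shape)
```

Rerunning the sketch with a smaller $A$ increases $N_d$, in line with the statement that the smaller the correlation scale, the larger the dimensionality of the stochastic system.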

Example: To be more specific, in the following we present a tumor cell growth model that involves stochastic inputs, which need to be represented according to the correlation structure of available empirical data [27]. The evolution equation is

\[ \dot{x}(t;\omega) = G(x) + g(x) f_1(t;\omega) + f_2(t;\omega), \qquad x(0;\omega) = x_0(\omega), \quad (1) \]

where $x(t;\omega)$ denotes the concentration of tumor cells at time $t$,

\[ G(x) = x(1-x) - \beta \frac{x}{x+1}, \qquad g(x) = -\frac{x}{x+1}, \]

and $\beta$ is the immune rate, which is related to the rate of growth of cytotoxic cells. The random process $f_1(t;\omega)$ represents the strength of the treatment (i.e., the dosage of the medicine in chemotherapy or the intensity of the ray in radiotherapy), while the process $f_2(t;\omega)$ is related to other factors, such as drugs and radiotherapy, that restrain the number of tumor cells. The parameter $\beta$ and the covariance structure of the random processes $f_1$ and $f_2$ are usually estimated based on empirical data.

If the processes $f_1(t;\omega)$ and $f_2(t;\omega)$ are independent, they can be represented using the Karhunen-Loeve (K-L) expansion of a zero-mean, second-order random process $f(t;\omega)$ defined on a probability space $(\Omega, \mathcal{F}, P)$ and indexed over $t \in [a,b]$. Let us denote the continuous covariance function of $f(t;\omega)$ by $C(s,t)$. Then the process $f(t;\omega)$ can be represented as

\[ f(t;\omega) = \sum_{k=1}^{N_d} \sqrt{\lambda_k}\, e_k(t)\, \xi_k(\omega), \]

where $\xi_k(\omega)$ are uncorrelated random variables with zero mean and unit variance, while $\lambda_k$ and $e_k(t)$ are, respectively, the eigenvalues and (normalized) eigenfunctions of the integral operator with kernel $C(s,t)$, i.e.,

\[ \int_a^b C(s,t)\, e_k(s)\, ds = \lambda_k e_k(t). \]

The dimension $N_d$ depends strongly on the correlation scale of the kernel $C(s,t)$. If we rearrange the eigenvalues $\lambda_k$ in descending order, then any truncation of the expansion $f(t;\omega)$ is optimal in the sense that it minimizes the mean square error [3, 18, 20]. The K-L expansion has been employed to represent random input processes in many stochastic simulations (see, e.g., [7, 20]).

Stochastic Modeling and Computational Methods

We present two examples of two fundamentally different descriptions of SS in order to show some of the complexity but also the rich stochastic response that can be obtained: the first is based on a discrete particle model and is governed by SODEs, and the second is a continuum system and is governed by SPDEs.

A Stochastic Particle System: We first describe a stochastic model for a mesoscopic system governed by a modified version of Newton's law of motion, the so-called dissipative particle dynamics (DPD) equations [19]. It consists of particles which correspond to coarse-grained entities, thus representing molecular clusters rather than individual atoms. The particles move off-lattice, interacting with each other through a set of prescribed (conservative and stochastic) velocity-dependent forces. Specifically, there are three types of forces acting on each dissipative particle: (a) a purely repulsive conservative force, (b) a dissipative force that reduces velocity differences between the particles, and (c) a stochastic force directed along the line connecting the centers of the particles. The last two forces effectively implement a thermostat so that thermal equilibrium is achieved. Correspondingly, the amplitude of these forces is dictated by the fluctuation-dissipation theorem, which ensures that in thermodynamic equilibrium the system will have a canonical distribution. All three forces are modulated by a weight function which specifies the range of interaction, or cutoff radius $r_c$, between the particles and renders the interaction local.

The DPD equations describe a system consisting of $N$ particles having equal mass $m$ (for simplicity in the presentation), positions $r_i$, and velocities $u_i$, which are stochastic in nature. The aforementioned three types of forces exerted on particle $i$ by particle $j$ are given by

\[ F^c_{ij} = F^{(c)}(r_{ij})\, e_{ij}, \qquad F^d_{ij} = -\gamma\, \omega^d(r_{ij}) (v_{ij} \cdot e_{ij})\, e_{ij}, \qquad F^r_{ij} = \sigma\, \omega^r(r_{ij})\, \xi_{ij}\, e_{ij}, \]


Stochastic Systems, Fig. 1 Left: Lennard-Jones potential and its averaged soft repulsive-only potential. Right: Polymer chains flowing in a sea of solvent in DPD. For more details see [19]

where $r_{ij} = r_i - r_j$, $v_{ij} = v_i - v_j$, $r_{ij} = |r_{ij}|$, and the unit vector $e_{ij} = r_{ij}/r_{ij}$. The variables $\gamma$ and $\sigma$ determine the strength of the dissipative and random forces, respectively, $\xi_{ij}$ are symmetric Gaussian random variables with zero mean and unit variance, and $\omega^d$ and $\omega^r$ are weight functions.

The time evolution of the DPD particles is described by Newton's law

\[ dr_i = v_i\, \delta t; \qquad dv_i = \frac{F^c_i\, \delta t + F^d_i\, \delta t + F^r_i\, \sqrt{\delta t}}{m_i}, \]

where $F^c_i = \sum_{j \ne i} F^c_{ij}$ is the total conservative force acting on particle $i$; $F^d_i$ and $F^r_i$ are defined similarly. The velocity increment due to the random force has a factor $\sqrt{\delta t}$ since it represents Brownian motion, which is described by a standard Wiener process with a covariance kernel given by $C_{FF}(t_i, t_j) = e^{-|t_i - t_j|/A}$, where $A$ is the correlation time of this stochastic process. The conservative force $F^c$ is typically given in terms of a soft potential, in contrast to the Lennard-Jones potential used in molecular dynamics studies (see Fig. 1 (left)). The dissipative and random forces are characterized by strengths $\omega^d(r_{ij})$ and $\omega^r(r_{ij})$, coupled due to the fluctuation-dissipation theorem.

Several complex fluid systems in industrial and biological applications (DNA chains, polymer gels, lubrication) involve multiscale processes and can be modeled using modifications of the above stochastic DPD equations [19]. Dilute polymer solutions are a typical example, since individual polymer chains form a group of large molecules by atomic standards but are still governed by forces similar to intermolecular ones. Therefore, they form large repeated units exhibiting slow dynamics with possible nonlinear interactions (see Fig. 1 (right)).

A Stochastic Continuum System: Here we present an example from classical aerodynamics on shock dynamics by reformulating the one-dimensional piston problem within the stochastic framework, i.e., we allow for random piston motions which may be changing in time [11]. In particular, we superimpose small random velocity fluctuations on the piston velocity and aim to obtain both analytical and numerical solutions of the stochastic flow response. We consider a piston having a constant velocity, $U_p$, moving into a straight tube filled with a homogeneous gas at rest. A shock wave will be generated ahead of the piston. A sketch of the piston-driven shock tube with random piston motion superimposed is shown in Fig. 2 (left). As shown in Fig. 2 (left), $U_p$ and $S$ are the deterministic speed of the piston and the deterministic speed of the shock, respectively, and $\rho_0$, $P_0$, $C_0$, $\rho_1$, $P_1$, and $C_1$ are the deterministic density, pressure, and local sound


speed ahead of and behind the shock, respectively.

Stochastic Systems, Fig. 2 Left: Sketch of piston-driven shock tube with random piston motion. Right: Normalized variance of perturbed shock paths. Solid line: perturbation analysis results. Dashed line: early-time asymptotic results, $\langle \xi^2(\tau) \rangle \sim \tau^2$. Dash-dotted line: late-time asymptotic results, $\langle \xi^2(\tau) \rangle \sim \tau$.

5. …certainty: error analysis and applications. J. Comput. Phys. 227, 9572–9595 (2008)
6. Foo, J.Y., Karniadakis, G.E.: Multi-element probabilistic collocation in high dimensions. J. Comput. Phys. 229, 1536–1557 (2009)
7. Ghanem, R.G., Spanos, P.: Stochastic Finite Elements: A Spectral Approach. Springer, New York (1991)
8. Griebel, M.: Sparse grids and related approximation schemes for higher-dimensional problems. In: Proceedings of the Conference on Foundations of Computational Mathematics, Santander (2005)
9. van Kampen, N.: Stochastic Processes in Physics and Chemistry, 3rd edn. Elsevier, Amsterdam (2008)
10. Laing, C., Lord, G.J.: Stochastic Methods in Neuroscience. Oxford University Press, Oxford (2009)
11. Lin, G., Su, C.H., Karniadakis, G.E.: The stochastic piston problem. Proc. Natl. Acad. Sci. U.S.A. 101(45), 15,840–15,845 (2004)
12. Lin, G., Tartakovsky, A.M., Tartakovsky, D.M.: Uncertainty quantification via random domain decomposition and probabilistic collocation on sparse grids. J. Comput. Phys. 229, 6995–7012 (2010)
13. Ma, X., Zabaras, N.: An adaptive hierarchical sparse grid collocation method for the solution of stochastic differential equations. J. Comput. Phys. 228, 3084–3113 (2009)
14. Maitre, O.P.L., Njam, H.N., Ghanem, R.G., Knio, O.M.: Multi-resolution analysis of Wiener-type uncertainty propagation schemes. J. Comput. Phys. 197, 502–531 (2004)
15. Maitre, O.P.L., Njam, H.N., Ghanem, R.G., Knio, O.M.: Uncertainty propagation using Wiener-Haar expansions. J. Comput. Phys. 197, 28–57 (2004)
16. Mumford, D.: The dawning of the age of stochasticity. In: Mathematics Towards the Third Millennium, Rome (1999)
17. de Neufville, R.: Uncertainty Management for Engineering Systems Planning and Design. MIT, Cambridge (2004)
18. Papoulis, A.: Probability, Random Variables and Stochastic Processes, 3rd edn. McGraw-Hill, Europe (1991)
19. Symeonidis, V., Karniadakis, G., Caswell, B.: A seamless approach to multiscale complex fluid simulation. Comput. Sci. Eng. 7, 39–46 (2005)
20. Venturi, D.: On proper orthogonal decomposition of randomly perturbed fields with applications to flow past a cylinder and natural convection over a horizontal plate. J. Fluid Mech. 559, 215–254 (2006)
21. Wan, X., Karniadakis, G.E.: An adaptive multi-element generalized polynomial chaos method for stochastic differential equations. J. Comput. Phys. 209, 617–642 (2005)
22. Wiener, N.: The homogeneous chaos. Am. J. Math. 60, 897–936 (1938)
23. Winter, C., Guadagnini, A., Nychka, D., Tartakovsky, D.: Multivariate sensitivity analysis of saturated flow through simulated highly heterogeneous groundwater aquifers. J. Comput. Phys. 217, 166–175 (2009)
24. Xiu, D., Hesthaven, J.: High order collocation methods for differential equations with random inputs. SIAM J. Sci. Comput. 27(3), 1118–1139 (2005)
25. Xiu, D., Karniadakis, G.E.: The Wiener-Askey polynomial chaos for stochastic differential equations. SIAM J. Sci. Comput. 24(2), 619–644 (2002)
26. Yang, X., Choi, M., Lin, G., Karniadakis, G.E.: Adaptive ANOVA decomposition of stochastic incompressible and compressible flows. J. Comput. Phys. 231, 1587–1614 (2012)
27. Zeng, C., Wang, H.: Colored noise enhanced stability in a tumor cell growth system under immune response. J. Stat. Phys. 141, 889–908 (2010)

Stokes or Navier-Stokes Flows

Vivette Girault and Frédéric Hecht
Laboratoire Jacques-Louis Lions, UPMC University of Paris 06 and CNRS, Paris, France

Mathematics Subject Classification

76D05; 76D07; 35Q30; 65N30; 65F10

Definition Terms/Glossary

Boundary layer It refers to the layer of fluid in the immediate vicinity of a bounding surface where the effects of viscosity are significant.
GMRES Abbreviation for the generalized minimal residual algorithm. It refers to an iterative method for the numerical solution of a nonsymmetric system of linear equations.
Precondition It consists in multiplying both sides of a system of linear equations by a suitable matrix, called the preconditioner, so as to reduce the condition number of the system.

The Incompressible Navier-Stokes Model

The incompressible Navier-Stokes system of equations is a widely accepted model for viscous Newtonian incompressible flows. It is extensively used in meteorology, oceanography, canal flows, pipeline flows, the automotive industry, high-speed trains, wind turbines, etc. Computing its solutions accurately is a difficult and important challenge.

A Newtonian fluid is a model whose Cauchy stress tensor depends linearly on the strain tensor, in contrast to non-Newtonian fluids, for which this relation is nonlinear and possibly implicit. For a Navier-Stokes fluid model, the constitutive equation defining the Cauchy stress tensor $T$ is

\[ T = -\pi I + 2\mu D(u), \quad (1) \]

where $\mu > 0$ is the constant viscosity coefficient, representing friction between molecules, $\pi$ is the pressure, $u$ is the velocity, $D(u) = \frac{1}{2}\left(\nabla u + \nabla u^T\right)$ is the strain tensor, and $(\nabla u)_{ij} = \partial u_i/\partial x_j$ is the gradient tensor. When substituted into the balance of linear momentum

\[ \rho \frac{du}{dt} = \operatorname{div} T + f, \quad (2) \]

where $\rho > 0$ is the fluid's density, $f$ is an external body force (e.g., gravity), and $du/dt$ is the material time derivative

\[ \frac{du}{dt} = \frac{\partial u}{\partial t} + u \cdot \nabla u, \quad \text{where} \quad u \cdot \nabla u = [\nabla u]\, u = \sum_i u_i \frac{\partial u}{\partial x_i}, \quad (3) \]

(1) gives, after division by $\rho$,

\[ \frac{\partial u}{\partial t} + u \cdot \nabla u = -\frac{1}{\rho} \nabla \pi + \frac{2\mu}{\rho} \operatorname{div} D(u) + \frac{1}{\rho} f. \]

But the density $\rho$ is constant, since the fluid is incompressible. Therefore, renaming the quantities $p = \pi/\rho$ and the kinematic viscosity $\nu = \mu/\rho$, the momentum equation reads

\[ \frac{\partial u}{\partial t} + u \cdot \nabla u = -\nabla p + 2\nu \operatorname{div} D(u) + f. \quad (4) \]

As the fluid is incompressible, the conservation of mass

\[ \frac{\partial \rho}{\partial t} + \operatorname{div}(\rho u) = 0 \]

reduces to the incompressibility condition

\[ \operatorname{div} u = 0. \quad (5) \]

From now on, we assume that $\Omega$ is a bounded, connected, open set in $\mathbb{R}^3$, with a suitably piecewise smooth boundary $\partial\Omega$ (essentially, without cusps or multiple points). The relations (4) and (5) are the incompressible Navier-Stokes equations in $\Omega$. They are complemented with boundary conditions, such as the no-slip condition

\[ u = 0 \quad \text{on } \partial\Omega, \quad (6) \]

and an initial condition

\[ u(\cdot, 0) = u_0(\cdot) \ \text{in } \Omega, \quad \text{satisfying } \operatorname{div} u_0 = 0 \ \text{and } u_0 = 0 \ \text{on } \partial\Omega. \quad (7) \]

In practical situations, other boundary conditions may be prescribed. One of the most important occurs in flows past a moving obstacle, in which case (6) is replaced by

\[ u = g \ \text{on } \partial\Omega, \quad \text{where } \int_{\partial\Omega} g \cdot n = 0, \quad (8) \]

where $g$ is the velocity of the moving body and $n$ denotes the unit exterior normal vector to $\partial\Omega$. To simplify, we shall only discuss (6), but we shall present numerical experiments where (8) is prescribed.

If the boundary conditions do not involve the boundary traction vector $Tn$, (4) can be substantially simplified by using the fluid's incompressibility. Indeed, (5) implies $2 \operatorname{div} D(u) = \Delta u$, and (4) becomes

\[ \frac{\partial u}{\partial t} + u \cdot \nabla u - \nu \Delta u + \nabla p = f. \quad (9) \]

When written in dimensionless variables (denoted by the same symbols), (9) reads

\[ \frac{\partial u}{\partial t} + u \cdot \nabla u - \frac{1}{Re} \Delta u + \nabla p = f, \quad (10) \]

where $Re$ is the Reynolds number, $Re = LU/\nu$, $L$ is a characteristic length, and $U$ a characteristic velocity. When $1 \le Re \le 10^5$, the flow is said to be laminar. Finally, when the Reynolds number is small and the force $f$ does not depend on time, the material time derivative can be neglected. Then, reverting to the original variables, this yields the Stokes system

\[ -\nu \Delta u + \nabla p = f, \quad (11) \]

complemented with (5) and (6).

Some Theoretical Results

Let us start with the Stokes problem (11), (5), and (6). In view of both theory and numerics, it is useful to write it in variational form. Let

\[ H^1(\Omega) = \{ v \in L^2(\Omega)\,;\ \nabla v \in L^2(\Omega)^3 \}, \]
\[ H^1_0(\Omega) = \{ v \in H^1(\Omega)\,;\ v = 0 \ \text{on } \partial\Omega \}, \]
\[ L^2_\circ(\Omega) = \{ v \in L^2(\Omega)\,;\ (v, 1) = 0 \}, \]

where $(\cdot,\cdot)$ denotes the scalar product of $L^2(\Omega)$:

\[ (f, g) = \int_\Omega f g. \]

Let $H^{-1}(\Omega)$ denote the dual space of $H^1_0(\Omega)$ and $\langle \cdot, \cdot \rangle$ the duality pairing between them. The space $H^1_0$ takes into account the no-slip boundary condition on the velocity, and the space $L^2_\circ$ is introduced to lift the undetermined constant in the pressure; note that the pressure is only defined by its gradient, and hence up to an additive constant in a connected region. Assume that $f$ belongs to $H^{-1}(\Omega)^3$. For our purpose, a suitable variational form is: Find a pair $(u, p) \in H^1_0(\Omega)^3 \times L^2_\circ(\Omega)$ solution of

\[ \forall (v, q) \in H^1_0(\Omega)^3 \times L^2_\circ(\Omega), \quad \nu(\nabla u, \nabla v) - (p, \operatorname{div} v) - (q, \operatorname{div} u) = \langle f, v \rangle. \quad (12) \]

Albeit linear, this problem is difficult from both theoretical and numerical standpoints. The pressure can be eliminated from (12) by working with the space $V$ of divergence-free velocities,

\[ V = \{ v \in H^1_0(\Omega)^3\,;\ \operatorname{div} v = 0 \}, \]

but the difficulty lies in recovering the pressure. Existence and continuity of the pressure stem from the following deep result: the divergence operator is an isomorphism from $V^\perp$ onto $L^2_\circ(\Omega)$, where $V^\perp$ is the orthogonal complement of $V$ in $H^1_0(\Omega)^3$. In other words, for every $q \in L^2_\circ(\Omega)$, there exists one and only one $v \in V^\perp$ solution of $\operatorname{div} v = q$. Moreover, $v$ depends continuously on $q$:

\[ \|\nabla v\|_{L^2(\Omega)} \le \frac{1}{\beta} \|q\|_{L^2(\Omega)}, \quad (13) \]

where $\beta > 0$ only depends on $\Omega$. This inequality is equivalent to the following "inf-sup condition":

\[ \inf_{q \in L^2_\circ(\Omega)}\ \sup_{v \in H^1_0(\Omega)^3} \frac{(\operatorname{div} v, q)}{\|\nabla v\|_{L^2(\Omega)} \|q\|_{L^2(\Omega)}} \ge \beta. \quad (14) \]

Interestingly, (14) is not true when $\partial\Omega$ has an outward cusp, a situation that occurs, for instance, in a flow exterior to two colliding spheres. There is no simple proof of (14). Its difficulty lies in the no-slip boundary condition prescribed on $v$: the proof is much simpler when it is replaced by the weaker condition $v \cdot n = 0$. The above isomorphism easily leads to the following result:

Theorem 1 For any $f$ in $H^{-1}(\Omega)^3$ and any $\nu > 0$, Problem (12) has exactly one solution, and this solution depends continuously on the data:

\[ \|\nabla u\|_{L^2(\Omega)} \le \frac{1}{\nu} \|f\|_{H^{-1}(\Omega)}, \qquad \|p\|_{L^2(\Omega)} \le \frac{2}{\beta} \|f\|_{H^{-1}(\Omega)}. \quad (15) \]

Next we consider the steady Navier-Stokes system. The natural extension of (12) is: Find a pair $(u, p) \in H^1_0(\Omega)^3 \times L^2_\circ(\Omega)$ solution of

\[ \forall (v, q) \in H^1_0(\Omega)^3 \times L^2_\circ(\Omega), \quad \nu(\nabla u, \nabla v) + (u \cdot \nabla u, v) - (p, \operatorname{div} v) - (q, \operatorname{div} u) = \langle f, v \rangle. \quad (16) \]

Its analysis is fairly simple because, on the one hand, the nonlinear convection term $u \cdot \nabla u$ has the following antisymmetry:

\[ \forall u \in V,\ \forall v \in H^1_0(\Omega)^3, \quad (u \cdot \nabla u, v) = -(u \cdot \nabla v, u), \quad (17) \]

and, on the other hand, it belongs to $L^{3/2}(\Omega)^3$, which, roughly speaking, is significantly smoother than the data $f$ in $H^{-1}(\Omega)^3$. This makes it possible to prove existence of solutions, but uniqueness is only guaranteed for a small force or a large viscosity. More precisely, let

\[ N = \sup_{w, u, v \in V;\ w, u, v \ne 0} \frac{(w \cdot \nabla u, v)}{\|\nabla w\|_{L^2(\Omega)} \|\nabla u\|_{L^2(\Omega)} \|\nabla v\|_{L^2(\Omega)}}. \]

Then we have the next result.

Theorem 2 For any $f$ in $H^{-1}(\Omega)^3$ and any $\nu > 0$, Problem (16) has at least one solution. A sufficient condition for uniqueness is

\[ \frac{N}{\nu^2} \|f\|_{H^{-1}(\Omega)} < 1. \quad (18) \]

Now we turn to the time-dependent Navier-Stokes system. Its analysis is much more complex, because in $\mathbb{R}^3$ the dependence of the pressure on time holds in a weaker space. To simplify, we do not treat the most general situation. For a given time interval $[0, T]$, Banach space $X$, and number $r \ge 1$, the relevant spaces are of the form $L^r(0, T; X)$, the space of functions defined and measurable in $]0, T[$ such that

\[ \int_0^T \|v\|_X^r\, dt < \infty, \]

and

\[ W^{1,r}(0, T; X) = \{ v \in L^r(0, T; X)\,;\ \tfrac{dv}{dt} \in L^r(0, T; X) \}, \]
\[ W^{1,r}_0(0, T; X) = \{ v \in W^{1,r}(0, T; X)\,;\ v(0) = v(T) = 0 \}, \]

with dual space $W^{-1,r'}(0, T; X)$, $\tfrac{1}{r} + \tfrac{1}{r'} = 1$. There are several weak formulations expressing (4)-(7). For numerical purposes, we shall use the following one: Find $u \in L^2(0, T; V) \cap L^\infty(0, T; L^2(\Omega)^3)$, with $\tfrac{du}{dt}$ in $L^{3/2}(0, T; V')$, and $p \in W^{-1,\infty}(0, T; L^2_\circ(\Omega))$, satisfying a.e. in $]0, T[$

\[ \forall (v, q) \in H^1_0(\Omega)^3 \times L^2_\circ(\Omega), \quad \frac{d}{dt}(u(t), v) + \nu(\nabla u(t), \nabla v) + (u(t) \cdot \nabla u(t), v) - (p(t), \operatorname{div} v) - (q, \operatorname{div} u(t)) = \langle f(t), v \rangle, \quad (19) \]

with the initial condition (7). This problem always has at least one solution.

Theorem 3 For any $f$ in $L^2(0, T; H^{-1}(\Omega)^3)$, any $\nu > 0$, and any initial data $u_0 \in V$, Problem (19), (7) has at least one solution.

Unfortunately, unconditional uniqueness (which is true in $\mathbb{R}^2$) is to this date an open problem in $\mathbb{R}^3$. In fact, it is one of the Millennium Prize Problems.

Discretization

Solving a steady Stokes system numerically is costly, because the theoretical difficulty brought by the pressure is inherited both by its discretization, whatever the scheme, and by the computer implementation of the scheme. This computational difficulty is aggravated by the need for fine meshes to capture the complex flows produced by the Navier-Stokes system. In comparison, when the flow is laminar, at reasonable Reynolds numbers, the additional cost of the nonlinear convection term is minor. There are some satisfactory schemes and algorithms, but so far no "miracle" method.

Three important methods are used for discretizing flow problems: finite-element, finite-difference, or finite-volume methods. For the sake of simplicity, we shall mainly consider discretization by finite-element methods. Usually, they consist in using polynomial functions on cells: triangles or quadrilaterals in $\mathbb{R}^2$, or tetrahedra or hexahedra in $\mathbb{R}^3$. Most finite-difference schemes can be derived from finite-element methods on rectangles in $\mathbb{R}^2$ or rectangular boxes in $\mathbb{R}^3$, coupled with quadrature formulas, in which case the mesh may not fit the boundary and a particular treatment may be required near the boundary. Finite volumes are closely related to finite differences but are more complex, because they can be defined on very general cells and do not involve functions. All three methods require meshing of the domain, and the success of these methods depends not only on their accuracy but also on how well the mesh is adapted to the problem under consideration. For example, boundary layers may appear at large Reynolds numbers and require locally refined meshes. Constructing a "good" mesh can be difficult and costly, but these important meshing issues are outside the scope of this work. Last, but not least, in many practical applications where the Stokes system is coupled with other equations, it is important that the scheme be locally mass conservative, i.e., that the integral mean value of the velocity's divergence be zero in each cell.

Discretization of the Stokes Problem

Let $\mathcal{T}_h$ be a triangulation of $\Omega$ made of tetrahedra (also called elements) in $\mathbb{R}^3$, the discretization parameter $h$ being the maximum diameter of the elements. For approximation purposes, $\mathcal{T}_h$ is not completely arbitrary: it is assumed to be shape regular, in the sense that its dihedral angles are uniformly bounded away from $0$ and $\pi$, and it has no hanging node, in the sense that the intersection of two cells is either empty, or a vertex, or a complete edge, or a complete face. For a given integer $k \ge 0$, let $P_k$ denote the space of polynomials in three variables of total degree at most $k$, and $Q_k$ that of degree at most $k$ in each variable. The accuracy of a finite-element space depends on the degree of the polynomials used in each cell; however, we shall concentrate on low-degree elements, as these are the most frequently used.

Let us start with locally mass conservative methods, and consider first conforming finite-element methods, i.e., where the finite-element space of discrete velocities, say $X_h$, is contained in $H^1_0(\Omega)^3$. Strictly speaking, the space of discrete pressures should be contained in $L^2_\circ(\Omega)$. However, the zero mean-value constraint destroys the band structure of the matrix, and therefore this constraint is prescribed weakly by means of a small, consistent perturbation. Thus the space of discrete pressures, say $Q_h$, is simply a discrete subspace of $L^2(\Omega)$, and problem (12) is discretized by: Find $(u_h, p_h) \in X_h \times Q_h$ solution of

\[
\forall (v_h, q_h) \in X_h \times Q_h, \quad
(\nabla u_h, \nabla v_h) - (p_h, \operatorname{div} v_h) - (q_h, \operatorname{div} u_h) - \varepsilon (p_h, q_h) = \langle f, v_h \rangle, \tag{20}
\]

where $\varepsilon > 0$ is a small parameter. Let $M_h = Q_h \cap L^2_0(\Omega)$. Regardless of their individual accuracy, $X_h$ and $M_h$ cannot be chosen independently of each other because they must satisfy a uniform discrete analogue of (14), namely,

\[
\inf_{q_h \in M_h} \sup_{v_h \in X_h} \frac{(\operatorname{div} v_h, q_h)}{\|\nabla v_h\|_{L^2(\Omega)} \, \|q_h\|_{L^2(\Omega)}} \ge \beta, \tag{21}
\]

for some real number $\beta > 0$, independent of $h$. Elements that satisfy (21) are called inf-sup stable. For such elements, the accuracy of (20) depends directly on the individual approximation properties of $X_h$ and $Q_h$. Roughly speaking, (21) holds when a discrete velocity space is sufficiently rich compared to a given discrete pressure space. Observe also that the discrete velocity's degree in each cell must be at least one.

For nonconforming elements, the gradient and divergence terms are written cell by cell:

\[
\forall (v_h, q_h) \in X_h \times Q_h, \quad
\sum_{T \in \mathcal{T}_h} (\nabla u_h, \nabla v_h)_T - \sum_{T \in \mathcal{T}_h} (p_h, \operatorname{div} v_h)_T - \sum_{T \in \mathcal{T}_h} (q_h, \operatorname{div} u_h)_T - \varepsilon (p_h, q_h) = \langle f, v_h \rangle. \tag{22}
\]

The inf-sup condition (21) is replaced by

\[
\inf_{q_h \in M_h} \sup_{v_h \in X_h} \frac{\sum_{T \in \mathcal{T}_h} (\operatorname{div} v_h, q_h)_T}{\|\nabla v_h\|_h \, \|q_h\|_{L^2(\Omega)}} \ge \beta,
\quad \text{where } \|\cdot\|_h = \Big( \sum_{T \in \mathcal{T}_h} \|\cdot\|^2_{L^2(T)} \Big)^{1/2}. \tag{23}
\]

The simplest example, introduced by Crouzeix and Raviart, is that of a constant pressure per tetrahedron and each velocity component $P_1$ per tetrahedron, each velocity component having one degree of freedom at the center of each face. The functions of $X_h$ must be continuous at the center of each interior face and vanish at the center of each boundary face. Then it is not hard to prove that (23) is satisfied. Thus this element has order one; it is fairly economical and mass conservative, and its implementation is fairly straightforward.

The above methods easily extend to hexahedral triangulations with Cartesian structure (i.e., eight hexahedra meeting at any interior vertex) provided the polynomial space $P_k$ is replaced by the inverse image of $Q_k$ on the reference cube. Furthermore, such hexahedral triangulations offer more possibilities. For instance, a conforming, inf-sup stable, locally mass conservative scheme of order two can be obtained by taking, in each cell, a $P_1$ pressure and each component of the velocity in $Q_2$.

Now we turn to conforming methods that use continuous discrete pressures; thus the pressure must be at least $P_1$ in each cell and continuous at the interfaces. Therefore the resulting schemes are not locally mass conservative. It can be checked that velocities with $P_1$ components are not sufficiently rich. The simplest alternative, called the "mini-element" or "$P_1$-bubble," enriches each velocity component in each cell with a polynomial of $P_4$ that vanishes on the cell's boundary, whence the name bubble. This element is inf-sup stable and has order one. Its extension to order two, introduced in $\mathbb{R}^2$ by Hood and Taylor, associates, with the same pressure, velocities with components in $P_2$. It is inf-sup stable and has order two.

Discretization of the Navier-Stokes System

Here we present straightforward discretizations of (19). The simplest one consists in using a linearized backward Euler finite-difference scheme in time. Let $N > 1$ be an integer, $\delta t = T/N$ the corresponding time step, and $t_n = n\,\delta t$ the discrete times. Starting from a finite-element approximation or interpolation, say $u_h^0$, of $u_0$ satisfying the discrete divergence constraint of (20), we construct a sequence $(u_h^n, p_h^n) \in X_h \times Q_h$ such that for $1 \le n \le N$:

\[
\forall (v_h, q_h) \in X_h \times Q_h, \quad
\frac{1}{\delta t}(u_h^n - u_h^{n-1}, v_h) + (\nabla u_h^n, \nabla v_h) + c(u_h^{n-1}; u_h^n, v_h) - (p_h^n, \operatorname{div} v_h) - (q_h, \operatorname{div} u_h^n) - \varepsilon (p_h^n, q_h) = \langle f^n, v_h \rangle, \tag{24}
\]

where $f^n$ is an approximation of $f(t_n, \cdot)$ and $c(w_h; u_h, v_h)$ is a suitable approximation of the convection term $(w \cdot \nabla u, v)$. As (17) does not necessarily extend to the discrete spaces, the preferred choice, from the standpoint of theory, is

\[
c(w_h; u_h, v_h) = \frac{1}{2} \big[ (w_h \cdot \nabla u_h, v_h) - (w_h \cdot \nabla v_h, u_h) \big], \tag{25}
\]

because it is both consistent and antisymmetric, which makes the analysis easier. But from the standpoint of numerics, the choice

\[
c(w_h; u_h, v_h) = (w_h \cdot \nabla u_h, v_h) \tag{26}
\]

is simpler and seems to maintain the same accuracy.

Observe that at each step $n$, (24) is a discrete Stokes system with two or three additional linear terms, according to the choice of the form $c$. In both cases, the matrix of the system is not symmetric, which is a strong disadvantage. This can be remedied by completely time lagging the form $c$, i.e., replacing it by $(u_h^{n-1} \cdot \nabla u_h^{n-1}, v_h)$.

There are cases when none of the above linearizations is satisfactory, and the convection term is approximated by $c(u_h^n; u_h^n, v_h)$ with $c$ defined by (25) or (26). The resulting scheme is nonlinear and must be linearized, for instance, by an inner loop of Newton's iterations. Recall Newton's method for solving the equation $f(x) = 0$ in $\mathbb{R}$: starting from an initial guess $x_0$, compute the sequence $(x_k)$ for $k \ge 0$ by

\[
x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}.
\]

Its generalization is straightforward, and at step $n$, starting from $u_{h,0} = u_h^{n-1}$, the inner loop reads: Find $(u_{h,k+1}, p_{h,k+1}) \in X_h \times Q_h$, solution of

\[
\forall (v_h, q_h) \in X_h \times Q_h, \quad
\frac{1}{\delta t}(u_{h,k+1}, v_h) + (\nabla u_{h,k+1}, \nabla v_h) + c(u_{h,k+1}; u_{h,k}, v_h) + c(u_{h,k}; u_{h,k+1}, v_h) - (p_{h,k+1}, \operatorname{div} v_h) - (q_h, \operatorname{div} u_{h,k+1}) - \varepsilon (p_{h,k+1}, q_h) = c(u_{h,k}; u_{h,k}, v_h) + \langle f^n, v_h \rangle + \frac{1}{\delta t}(u_h^{n-1}, v_h). \tag{27}
\]

Experience shows that only a few iterations are sufficient to match the discretization error. Once this inner loop converges, we set $u_h^n := u_{h,k+1}$, $p_h^n := p_{h,k+1}$.
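The scalar Newton recursion recalled above takes only a few lines to implement. Here is a minimal, self-contained sketch; the test function, starting point, and tolerance are illustrative choices, not taken from the text:

```python
# Newton's method for f(x) = 0 in R: x_{k+1} = x_k - f(x_k)/f'(x_k).
# The function, derivative, initial guess, and tolerance below are
# arbitrary illustrative choices.

def newton(f, fp, x0, tol=1e-12, max_iter=50):
    """Return an approximate root of f and the number of iterations used."""
    x = x0
    for k in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x, k
        x = x - fx / fp(x)
    return x, max_iter

# Example: solve x^2 - 2 = 0 starting from x0 = 1.
root, iters = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)
print(root, iters)  # converges to sqrt(2) in a handful of iterations
```

The rapid (quadratic) convergence of the iteration is consistent with the remark above that only a few inner iterations are needed to match the discretization error.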

An interesting alternative to the above schemes is the characteristics method, which uses a discretization of the material time derivative (see (3)):

\[
\frac{d}{dt} u(t_n, x) \simeq \frac{1}{\delta t} \big( u_h^n(x) - u_h^{n-1}(\chi^{n-1}(x)) \big),
\]

where $\chi^{n-1}(x)$ gives the position at time $t_{n-1}$ of a particle located at $x$ at time $t_n$. Its first-order approximation is

\[
\chi^{n-1}(x) = x - (\delta t)\, u_h^{n-1}(x).
\]

Thus (24) is replaced by

\[
\forall (v_h, q_h) \in X_h \times Q_h, \quad
\frac{1}{\delta t} \big( u_h^n - u_h^{n-1} \circ \chi^{n-1}, v_h \big) + (\nabla u_h^n, \nabla v_h) - (p_h^n, \operatorname{div} v_h) - (q_h, \operatorname{div} u_h^n) - \varepsilon (p_h^n, q_h) = \langle f^n, v_h \rangle, \tag{28}
\]

whose matrix is symmetric and constant in time and requires no linearization. On the other hand, computing the right-hand side is more complex.

Algorithms

In this section we assume that the discrete spaces are inf-sup stable, and to simplify, we restrict the discussion to conforming discretizations. Any discretization of the time-dependent Navier-Stokes system requires the solution of at least one Stokes problem per time step, whence the importance of an efficient Stokes solver. But since the matrix of the discrete Stokes system is large and indefinite, in $\mathbb{R}^3$ the system is rarely solved simultaneously for $u_h$ and $p_h$. Instead the computation of $p_h$ is decoupled from that of $u_h$.

Decoupling the Pressure and Velocity

Let $U$ be the vector of velocity unknowns, $P$ that of pressure unknowns, and $F$ the vector of data represented by $(f, v_h)$. Let $A$ be the (symmetric positive definite) matrix of the discrete Laplace operator represented by $(\nabla u_h, \nabla v_h)$, $B$ the matrix of the discrete divergence operator represented by $(q_h, \operatorname{div} u_h)$, and $C$ the matrix of the operator represented by $\varepsilon (p_h, q_h)$. Owing to (21), the matrix $B$ has maximal rank. With this notation, (20) has the form

\[
A U - B^T P = F, \qquad B U + C P = 0, \tag{29}
\]

whose matrix is symmetric but indefinite. Since $A$ is nonsingular, a partial solution of (29) is

\[
\big( B A^{-1} B^T + C \big) P = -B A^{-1} F, \qquad U = A^{-1} \big( F + B^T P \big). \tag{30}
\]

As $B$ has maximal rank, the Schur complement $B A^{-1} B^T + C$ is symmetric positive definite, and an iterative gradient algorithm is a good candidate for solving (30). Indeed, (30) is equivalent to minimizing with respect to $Q$ the quadratic functional

\[
K(Q) = \frac{1}{2} \big( A v_Q, v_Q \big) + \frac{1}{2} \big( C Q, Q \big), \quad \text{with } A v_Q = F + B^T Q. \tag{31}
\]

A variety of gradient algorithms for approximating the minimum are obtained by choosing a sequence of direction vectors $W^k$ and an initial vector $P_0$ and computing a sequence of vectors $P^k$ defined for each $k \ge 1$ by

\[
P^k = P^{k-1} - \rho_{k-1} W^{k-1}, \quad \text{where } K\big(P^{k-1} - \rho_{k-1} W^{k-1}\big) = \inf_{\rho \in \mathbb{R}} K\big(P^{k-1} - \rho W^{k-1}\big). \tag{32}
\]

Usually the direction vectors $W^k$ are related to the gradient of $K$, whence the name gradient algorithms. It can be shown that each step of these gradient algorithms requires the solution of a linear system with matrix $A$, which is equivalent to solving a Laplace equation per step. This explains why solving the Stokes system is expensive.

The above strategy can be applied to (28) but not to (24), because the matrix $A$ of (24) is no longer symmetric. In this case, a GMRES algorithm can be used, but this algorithm is expensive. For this reason, linearization by fully time lagging $c$ may be preferable, because the matrix $A$ then becomes symmetric. Of course, when Newton's iterations are performed, as in (27), this option is not available because $A$ is not symmetric. In this case, a splitting strategy may be useful.
Splitting Algorithms

There is a wide variety of algorithms for splitting the nonlinearity from the divergence constraint. Here is an example where the divergence condition is enforced once every other step. At step $n$:

1. Knowing $(u_h^{n-1}, p_h^{n-1}) \in X_h \times Q_h$, compute an intermediate velocity $(w_h^n, p_h^n) \in X_h \times Q_h$, solution of

\[
\forall (v_h, q_h) \in X_h \times Q_h, \quad
\frac{1}{\delta t}(w_h^n - u_h^{n-1}, v_h) - (p_h^n, \operatorname{div} v_h) - (q_h, \operatorname{div} w_h^n) - \varepsilon (p_h^n, q_h) = \langle f^n, v_h \rangle - (\nabla u_h^{n-1}, \nabla v_h) - c(u_h^{n-1}; u_h^{n-1}, v_h). \tag{33}
\]

2. Compute $u_h^n \in X_h$, solution of

\[
\forall v_h \in X_h, \quad
\frac{1}{\delta t}(u_h^n - u_h^{n-1}, v_h) + (\nabla u_h^n, \nabla v_h) + c(w_h^n; u_h^n, v_h) = \langle f^n, v_h \rangle + (p_h^n, \operatorname{div} v_h). \tag{34}
\]

The first step is fairly easy because it reduces to a "Laplace" operator with unknown boundary conditions and therefore can be preconditioned by a Laplace operator, while the second step is an implicit linearized system without constraint.

Numerical Experiments

We present here two numerical experiments on benchmarks programmed with the software FreeFem++. More details, including scripts and plots, can be found online at http://www.ljll.math.upmc.fr/ hecht/ftp/ECM-2013.

The Driven Cavity in a Square

We use the Taylor-Hood $P_2 \times P_1$ scheme to solve the steady Navier-Stokes equations in the square cavity $\Omega = \,]0,1[ \times ]0,1[$ with upper boundary $\Gamma_1 = \,]0,1[ \times \{1\}$:

\[
-\frac{1}{Re} \Delta u + u \cdot \nabla u + \nabla p = 0, \qquad \operatorname{div} u = 0,
\]
\[
u|_{\Gamma_1} = (1, 0), \qquad u|_{\partial\Omega \setminus \Gamma_1} = (0, 0),
\]

Stokes or Navier-Stokes Flows, Fig. 1 From left to right: pressure at $Re = 9{,}000$, adapted mesh at $Re = 8{,}000$, stream function at $Re = 9{,}000$. Observe the cascade of corner eddies

with different values of $Re$ ranging from 1 to 9,000. The discontinuity of the boundary values at the two upper corners of the cavity produces a singularity of the pressure. The nonlinearity is solved by Newton's method, the initial guess being obtained by continuation on the Reynolds number, i.e., from the solution computed with the previous Reynolds number. The address of the script is cavityNewton.edp (Fig. 1).

Flow Past an Obstacle: Von Kármán's Vortex Street in a Rectangle

The Taylor-Hood $P_2 \times P_1$ scheme in space and the characteristics method in time are used to solve the time-dependent Navier-Stokes equations in a rectangle $2.2 \times 0.41$ m with a circular hole of diameter $0.1$ m located near the inlet. The density $\rho = 1.0\ \mathrm{kg/m^3}$ and the kinematic viscosity $\nu = 10^{-3}\ \mathrm{m^2/s}$. All the relevant data are taken from the benchmark case 2D-2, which can be found online at http://www.mathematik.tu-dortmund.de/lsiii/cms/papers/SchaeferTurek1996.pdf. The scripts, at http://www.ljll.math.upmc.fr/ hecht/ftp/ECM-2013, are NSCaraCyl-100-mpi.edp or NSCaraCyl-100-seq.edp, together with func-max.idp (Fig. 2).

Stokes or Navier-Stokes Flows, Fig. 2 Von Kármán's vortex street

Bibliographical Notes

The bibliography on Stokes and Navier-Stokes equations, theory and approximation, is very extensive, and we have only selected a few references.

A mechanical derivation of the Navier-Stokes equations can be found in the book by L.D. Landau and E.M. Lifshitz: Fluid Mechanics, Second Edition, Vol. 6 (Course of Theoretical Physics), Pergamon Press, 1959.

The reader can also refer to the book by C. Truesdell and K.R. Rajagopal: An Introduction to the Mechanics of Fluids, Modeling and Simulation in Science, Engineering and Technology, Birkhäuser, Basel, 2000.

A thorough theory and description of finite element methods can be found in the book by P.G. Ciarlet: Basic Error Estimates for Elliptic Problems – Finite Element Methods, Part 1, in Handbook of Numerical Analysis, II, P.G. Ciarlet and J.L. Lions, eds., North-Holland, Amsterdam, 1991.

The reader can also refer to the book by T. Oden and J.N. Reddy: An Introduction to the Mathematical Theory of Finite Elements, Wiley, New York, 1976.

More computational aspects can be found in the book by A. Ern and J.L. Guermond: Theory and Practice of Finite Elements, AMS 159, Springer-Verlag, Berlin, 2004.

The reader will find an introduction to the theory and approximation of the Stokes and steady Navier-Stokes equations, including a thorough discussion of the inf-sup condition, in the book by V. Girault and P.A. Raviart: Finite Element Methods for Navier-Stokes Equations. Theory and Algorithms, SCM 5, Springer-Verlag, Berlin, 1986.

An introduction to the theory and approximation of the time-dependent Navier-Stokes problem is given in the Lecture Notes by V. Girault and P.A. Raviart: Finite Element Approximation of the Navier-Stokes Equations, Lect. Notes in Math. 749, Springer-Verlag, Berlin, 1979.

Non-conforming finite elements can be found in the reference by M. Crouzeix and P.A. Raviart: Conforming and non-conforming finite element methods for solving the stationary Stokes problem, RAIRO Anal. Numér. 8 (1973), pp. 33–76.

We also refer to the book by R. Temam: Navier-Stokes Equations, Theory and Numerical Analysis, North-Holland, Amsterdam, 1979.

The famous Millennium Prize Problem is described at the URL: http://www.claymath.org/millennium/Navier-Stokes Equations.

The reader will find a wide range of numerical methods for fluids in the book by O. Pironneau: Finite Element Methods for Fluids, Wiley, 1989. See also http://www.ljll.math.upmc.fr/ pironneau.

We also refer to the course by R. Rannacher, available online at the URL: http://numerik.iwr.uni-heidelberg.de/Oberwolfach-Seminar/CFD-Course.pdf.

The book by R. Glowinski proposes a very extensive collection of numerical methods, algorithms, and experiments for Stokes and Navier-Stokes equations: Finite Element Methods for Incompressible Viscous Flow, in Handbook of Numerical Analysis, IX, P.G. Ciarlet and J.L. Lions, eds., North-Holland, Amsterdam, 2003.

Stratosphere and Its Coupling to the Troposphere and Beyond

Edwin P. Gerber
Center for Atmosphere Ocean Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA

Synonyms

Middle atmosphere

Glossary

Mesosphere – an atmospheric layer between approximately 50 and 100 km height
Middle atmosphere – a region of the atmosphere including the stratosphere and mesosphere
Stratosphere – an atmospheric layer between approximately 12 and 50 km
Stratospheric polar vortex – a strong, circumpolar jet that forms in the extratropical stratosphere during the winter season in each respective hemisphere
Sudden stratospheric warming – a rapid breakdown of the stratospheric polar vortex, accompanied by a sharp warming of the polar stratosphere
Tropopause – boundary between the troposphere and stratosphere, generally between 10 and 18 km
Troposphere – lowermost layer of the atmosphere, extending from the surface to between 10 and 18 km
Climate engineering – the deliberate modification of the Earth's climate system, primarily aimed at reducing the impact of global warming caused by anthropogenic greenhouse gas emissions
Geoengineering – see climate engineering
Quasi-biennial oscillation – an oscillating pattern of easterly and westerly jets which propagates downward in the tropical stratosphere with a slightly varying period around 28 months
Solar radiation management – a form of climate engineering where the net incoming solar radiation to the surface is reduced to offset warming caused by greenhouse gases

Definition

As illustrated in Fig. 1, the Earth's atmosphere can be separated into distinct regions, or "spheres," based on its vertical temperature structure. In the lowermost part of the atmosphere, the troposphere, the temperature declines steeply with height at an average rate of approximately 7 °C per kilometer. At a distinct level, generally between 10 and 12 km in the extratropics and 16 and 18 km in the tropics (the separation between these regimes is rather abrupt and can be used as a dynamical indicator delineating the tropics and extratropics), this steep descent of temperature abruptly shallows, transitioning to a layer of the atmosphere where temperature is initially constant with height and then begins to rise. This abrupt change in the vertical temperature gradient, denoted the tropopause, marks the lower boundary of the stratosphere, which extends to approximately 50 km in height, at which point the temperature begins to fall with height again. The region above is denoted the mesosphere, extending to a second temperature minimum between 85 and 100 km. Together, the stratosphere and mesosphere constitute the middle atmosphere.

The stratosphere was discovered at the dawn of the twentieth century. The vertical temperature gradient, or lapse rate, of the troposphere was established in the eighteenth century from temperature and pressure measurements taken on alpine treks, leading to speculation that the air temperature would approach absolute zero somewhere between 30 and 40 km: presumably the top of the atmosphere. Daring hot air balloon ascents in the late nineteenth century provided hints at a shallowing of the lapse rate – early evidence of the tropopause – but also led to the deaths of aspiring upper-atmosphere meteorologists. Teisserenc de Bort [15] and Assmann [2], working outside of Paris and Berlin, respectively, pioneered the first systematic, unmanned balloon observations of the upper atmosphere, establishing the distinct changes in the temperature structure that mark the stratosphere.

Overview

The lapse rate of the atmosphere reflects the stability of the atmosphere to vertical motion. In the troposphere, the steep decline in temperature reflects near-neutral stability to moist convection. This is the turbulent

weather layer of the atmosphere, where air is in close contact with the surface, with a turnover time scale on the order of days. In the stratosphere, the near-zero or positive lapse rates strongly stratify the flow. Here the air is comparatively isolated from the surface of the Earth, with a typical turnover time scale on the order of a year or more. This distinction in stratification and the resulting impact on the circulation are reflected in the nomenclature: the "troposphere" and "stratosphere" were coined by Teisserenc de Bort, the former the "sphere of change" from the Greek tropos, to turn or whirl, while the latter is the "sphere of layers" from the Latin stratus, to spread out.

In this sense, the troposphere can be thought of as a boundary layer in the atmosphere that is well connected to the surface. This said, it is important to note that the mass of the atmosphere is proportional to the pressure: the tropospheric "boundary" layer constitutes roughly 85 % of the mass of the atmosphere and contains all of our weather. The stratosphere contains the vast majority of the remaining atmospheric mass, and the mesosphere and layers above just 0.1 %.

Stratosphere and Its Coupling to the Troposphere and Beyond, Fig. 1 The vertical temperature structure of the atmosphere (temperature in °C against height in km and pressure in hPa; the troposphere, stratosphere, mesosphere, and thermosphere are separated by the tropopause, stratopause, and mesopause). This sample profile shows the January zonal mean temperature at 40° N from the Committee on Space Research (COSPAR) International Reference Atmosphere 1986 model (CIRA-86). The changes in temperature gradients, and hence stratification, of the atmosphere reflect a difference in the dynamical and radiative processes active in each layer. The heights of the separation points (tropopause, stratopause, and mesopause) vary with latitude and season – and even on daily time scales due to dynamical variability – but are generally sharply defined in any given temperature profile.

Why Is There a Stratosphere?

The existence of the stratosphere depends on the radiative forcing of the atmosphere by the Sun. As the atmosphere is largely transparent to incoming solar radiation, the bulk of the energy is absorbed at the surface. The presence of greenhouse gases, which absorb infrared light, allows the atmosphere to interact with radiation emitted by the surface. If the atmosphere were "fixed," and so unable to convect (described as a radiative equilibrium), this would lead to an unstable situation, where the air near the surface is much warmer – and so more buoyant – than that above it. At height, however, the temperature profile eventually becomes isothermal, given a fairly uniform distribution of the infrared absorber throughout the atmosphere. (The simplest model for this is the so-called gray radiation scheme, where one assumes that all solar radiation is absorbed at the surface and a single infrared band from the Earth interacts with a uniformly distributed greenhouse gas.)

If we allow the atmosphere to turn over in the vertical, or convect in the nomenclature of atmospheric science, the circulation will produce a well-mixed layer at the bottom with near-neutral stability: the troposphere. The energy available to the air at the surface is finite, however, only allowing it to penetrate so high into the atmosphere. Above the convection will sit the stratified isothermal layer that is closer to the radiative equilibrium: the stratosphere. This simplified view of a radiative-convective equilibrium obscures the role of dynamics in setting the stratification in both the troposphere and stratosphere but conveys the essential distinction between the layers. In this respect, "stratospheres" are found on other planets as well, marking the region where the atmosphere becomes more isolated from the surface.

The increase in temperature seen in the Earth's stratosphere (as seen in Fig. 1) is due to the fact that the atmosphere does interact with the incoming solar radiation through ozone.
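The gray-radiation picture can be made quantitative. In a standard two-stream gray model in radiative equilibrium, $\sigma T^4$ increases linearly with the infrared optical depth $\tau$, and for a well-mixed absorber $\tau$ decays roughly exponentially with height. The sketch below (all parameter values are illustrative, roughly Earth-like choices, not taken from the text) reproduces the steep temperature decline near the surface and the quasi-isothermal layer aloft; note that a gray model cannot produce the temperature increase of the real stratosphere, which requires ozone's ultraviolet absorption, discussed next:

```python
import math

# Gray radiative equilibrium: sigma * T^4(z) = (S/2) * (1 + tau(z)),
# with tau(z) = tau0 * exp(-z/H) for a well-mixed infrared absorber.
# All parameter values are illustrative choices, not data.
SIGMA = 5.67e-8   # Stefan-Boltzmann constant (W m^-2 K^-4)
S = 240.0         # absorbed solar flux (W m^-2)
TAU0 = 4.0        # total infrared optical depth of the column
H = 7.5           # scale height of the absorber (km)

def T(z_km):
    """Radiative-equilibrium temperature (K) at height z (km)."""
    tau = TAU0 * math.exp(-z_km / H)
    return ((S / (2.0 * SIGMA)) * (1.0 + tau)) ** 0.25

# Steep temperature decline near the surface, near-isothermal aloft:
drop_low = T(0.0) - T(7.5)     # over the first scale height
drop_high = T(30.0) - T(60.0)  # over a 30 km layer higher up
print(round(drop_low, 1), round(drop_high, 1))
```

With these values the temperature falls by tens of degrees over the first scale height but only a few degrees between 30 and 60 km, mirroring the unstable near-surface gradient and quasi-isothermal upper layer described above.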
Ozone is produced by the interaction between molecular oxygen and ultraviolet radiation in the stratosphere [4] and takes over as the dominant absorber of radiation in this band. The decrease in density with height leads to an optimal level for net ultraviolet warming and hence the temperature maximum near the stratopause, which provides the demarcation for the mesosphere above.

Absorption of ultraviolet radiation by stratospheric ozone protects the surface from high-energy radiation. The destruction of ozone over Antarctica by halogenated compounds has had significant health impacts, in addition to damaging all life in the biosphere. As described below, it has also had significant impacts on the tropospheric circulation in the Southern Hemisphere.

Compositional Differences

The separation in the turnover time scale between the troposphere and stratosphere leads to distinct chemical or compositional properties of air in these two regions. Indeed, given a sample of air randomly taken from some point in the atmosphere, one can easily tell whether it came from the troposphere or the stratosphere. The troposphere is rich in water vapor and reactive organic molecules, such as carbon monoxide, which are generated by the biosphere and anthropogenic activity. Stratospheric air is extremely dry, with an average water vapor concentration of approximately 3–5 parts per million, and comparatively rich in ozone. Ozone is a highly reactive molecule (causing lung damage when it is formed in smog at the surface) and does not exist for long in the troposphere.

Scope and Limitations of this Entry

Stratospheric research, albeit only a small part of Earth system science, is a fairly mature field covering a wide range of topics. The remaining goal of this brief entry is to highlight the dynamical interaction between the stratosphere and the troposphere, with particular emphasis on the impact of the stratosphere on surface climate. In the interest of brevity, references have been kept to a minimum, focusing primarily on seminal historical papers and reviews. More detailed references can be found in the review articles listed in further readings.

The stratosphere also interacts with the troposphere through the exchange of mass and trace chemical species, such as ozone. This exchange is critical for understanding the atmospheric chemistry in both the troposphere and stratosphere and has significant implications for tropospheric air quality, but it will not be discussed here. For further information, please see the two review articles [8] and [11]. The primary entry point for air into the stratosphere is through the tropics, where the boundary between the troposphere and stratosphere is less well defined. This region is known as the tropical tropopause layer, and the review by [6] will provide the reader an introduction to research on this topic.

Dynamical Coupling Between the Stratosphere and Troposphere

The term "coupling" suggests interactions between independent components and so begs the question as to whether the convenient separation of the atmosphere into layers is merited in the first place. The key dynamical distinction between the troposphere and stratosphere lies in the differences in their stratification and the fact that moist processes (i.e., moist convection and latent heat transport) are restricted to the troposphere. The separation between the layers is partly historical, however, evolving in response to the development of weather forecasting and the availability of computational resources.

Midlatitude weather systems are associated with large-scale Rossby waves, which owe their existence to gradients in the effective rotation, or vertical component of vorticity, of the atmosphere due to variations in the angle between the surface plane and the axis of rotation with latitude. Pioneering work by [5] and [10] showed that the dominant energy-containing waves in the troposphere, wavenumber roughly 4–8, the so-called synoptic scales, cannot effectively propagate into the stratosphere due to the presence of easterly winds in the summer hemisphere and strong westerly winds in the winter hemisphere. For the purposes of
For the purposes of the stratosphere and the troposphere, with particular weather prediction, then, the stratosphere could largely emphasis on the impact of the stratosphere on surface be viewed as an upper-boundary condition. Models climate. In the interest of brevity, references have been thus resolved the stratosphere as parsimoniously as kept to a minimum, focusing primarily on seminal possible in order to focus numerical resources on the historical papers and reviews. More detailed references troposphere. The strong winds in the winter strato- can be found in the review articles listed in further sphere also impose a stricter Courant-Friedrichs-Lewy readings. condition on the time step of the model, although The stratosphere also interacts with the troposphere more advanced numerical techniques have alleviated through the exchange of mass and trace chemical this problem. Stratosphere and Its Coupling to the Troposphere and Beyond 1421

Despite the dynamical separation for weather system-scale waves, larger-scale Rossby waves (wavenumber 1–3, referred to as planetary scales) can penetrate into the winter stratosphere, allowing for momentum exchange between the layers. In addition, smaller-scale (on the order of 10–1,000 km) gravity waves (gravity waves are generated in stratified fluids, where the restoring force is the gravitational acceleration of fluid parcels, or buoyancy; they are completely distinct from relativistic gravity waves) also transport momentum between the layers. As computational power increased, leading to a more accurate representation of tropospheric dynamics, it became increasingly clear that a better representation of the stratosphere was necessary to fully understand and simulate surface weather and climate.

Coupling on Daily to Intraseasonal Time Scales

Weather prediction centers have found that the increased representation of the stratosphere improves tropospheric forecasts. On short time scales, however, much of the gain comes from improvements to the tropospheric initial condition. This stems from better assimilation of satellite temperature measurements, which project onto both the troposphere and stratosphere. The stratosphere itself has a more prominent impact on intraseasonal time scales, due to the intrinsically longer time scales of variability in this region of the atmosphere. The gain in predictability, however, is conditional, depending on the state of the stratosphere. Under normal conditions, the winter stratosphere is very cold in the polar regions, associated with a strong westerly jet, or stratospheric polar vortex. As first observed in the 1950s [12], this strong vortex is sometimes disturbed by the planetary wave activity propagating below, leading to massive changes in temperature (up to 70 °C in a matter of days) and a reversal of the westerly jet, a phenomenon known as a sudden stratospheric warming, or SSW. While the predictability of SSWs is limited by the chaotic nature of tropospheric dynamics, after an SSW the stratosphere remains in an altered state for up to 2–3 months as the polar vortex slowly recovers from the top down.

Baldwin and Dunkerton [3] demonstrated the impact of these changes on the troposphere, showing that an abrupt warming of the stratosphere is followed by an equatorward shift in the tropospheric jet stream and associated storm track. An abnormally cold stratosphere is conversely associated with a poleward shift in the jet stream, although the onset of cold vortex events is not as abrupt. More significantly, the changes in the troposphere extend for up to 2–3 months on the slow time scale of the stratospheric recovery, while under normal conditions the chaotic nature of tropospheric flow restricts the time scale of jet variations to approximately 10 days. The associated changes in the stratospheric jet stream and tropospheric jet shift are conveniently described by the Northern Annular Mode (NAM) pattern of variability. (The NAM is also known as the Arctic Oscillation, although the annular mode nomenclature has become more prominent.)

The mechanism behind this interaction is still an active area of research. It has become clear, however, that the key lies in the fact that the lower stratosphere influences the formation and dissipation of synoptic-scale Rossby waves, despite the fact that these waves do not penetrate far into the stratosphere.

A shift in the jet stream is associated with a large-scale rearrangement of tropospheric weather patterns. In the Northern Hemisphere, where the stratosphere is more variable due to the stronger planetary wave activity (in short, because there are more continents), an equatorward shift in the jet stream following an SSW leads to colder, stormier weather over much of northern Europe and eastern North America. Forecast skill of temperature, precipitation, and wind anomalies at the surface increases in seasonal forecasts following an SSW. SSWs can be further differentiated into "vortex displacements" and "vortex splits," depending on the dominant wavenumber (1 or 2, respectively) involved in the breakdown of the jet, and recent work has suggested this has an effect on the tropospheric impact of the warming.

SSWs occur approximately every other year in the Northern Hemisphere, although there is strong intermittency: few events were observed in the 1990s, while they have been occurring in most years in the first decades of the twenty-first century. In the Southern Hemisphere, the winter westerlies are stronger and less variable – only one SSW has ever been observed, in 2002 – but predictability may be gained around the time of the "final warming," when the stratosphere transitions to its summer state with easterly winds. Some years this transition is accelerated by planetary wave dynamics, as in an SSW, while in other years it is gradual, associated with a slow radiative relaxation to the summer state.

Coupling on Interannual Time Scales

On longer time scales, the impact of the stratosphere is often felt through a modulation of the intraseasonal coupling between the stratospheric and tropospheric jet streams. Stratospheric dynamics play an important role in internal modes of variability of the atmosphere-ocean system, such as El Niño and the Southern Oscillation (ENSO), and in the response of the climate system to "natural" forcing by the solar cycle and volcanic eruptions.

The quasi-biennial oscillation (QBO) is a nearly periodic oscillation of downward-propagating easterly and westerly jets in the tropical stratosphere, with a period of approximately 28 months. It is perhaps the most long-lived mode of variability intrinsic to the atmosphere alone. The QBO influences the surface by modulating the wave coupling between the troposphere and stratosphere in the Northern Hemisphere winter, altering the frequency and intensity of SSWs depending on the phase of the oscillation.

Isolating the impact of the QBO has been complicated by the possible overlap with ENSO, a coupled mode of atmosphere-ocean variability with a time scale of approximately 3–7 years. The relatively short observational record makes it difficult to untangle the signals from measurements alone, and models have only recently been able to simulate these phenomena with reasonable accuracy. ENSO is driven by interaction between the tropical Pacific Ocean and the zonal circulation of the tropical atmosphere (the Walker circulation). Its impact on the extratropical circulation in the Northern Hemisphere, however, is in part effected through its influence on the stratospheric polar vortex. A warm phase of ENSO is associated with stronger planetary wave propagation into the stratosphere, hence a weaker polar vortex and an equatorward shift in the tropospheric jet stream.

Further complicating the statistical separation between the impacts of ENSO and the QBO is the influence of the 11-year solar cycle, associated with changes in the number of sunspots. While the overall intensity of solar radiation varies by less than 0.1 % of its mean value over the cycle, the variation is stronger in the ultraviolet part of the spectrum. Ultraviolet radiation is primarily absorbed by ozone in the stratosphere, and it has been suggested that the associated changes in temperature structure alter the planetary wave propagation, along the lines of the influence of ENSO and the QBO.

The role of the stratosphere in the climate response to volcanic eruptions is comparatively better understood. While volcanic aerosols are washed out of the troposphere on fairly short time scales by the hydrological cycle, sulfate particles in the stratosphere can last for 1–2 years. These particles reflect the incoming solar radiation, leading to a global cooling of the surface; following Pinatubo, the global surface cooled by 0.1–0.2 K. The overturning circulation of the stratosphere lifts mass up into the tropical stratosphere, transporting it poleward where it descends in the extratropics. Thus, only tropical eruptions have a persistent, global impact. Sulfate aerosols warm the stratosphere, thereby modifying the planetary wave coupling. There is some evidence that the net result is a strengthening of the polar vortex, which in turn drives a poleward shift in the tropospheric jets. Hence, eastern North America and northern Europe may experience warmer winters following eruptions, despite the overall cooling impact of the volcano.

The Stratosphere and Climate Change

Anthropogenic forcing has changed the stratosphere, with resulting impacts on the surface. While greenhouse gases warm the troposphere, they increase the radiative efficiency of the stratosphere, leading to a net cooling in this part of the atmosphere. The combination of a warming troposphere and cooling stratosphere leads to a rise in the tropopause and may be one of the most identifiable signatures of global warming on the atmospheric circulation.

While greenhouse gases will have a dominant long-term impact on the climate system, anthropogenic emissions of halogenated compounds, such as chlorofluorocarbons (CFCs), have had the strongest impact on the stratosphere in recent decades. Halogens have caused some destruction of ozone throughout the stratosphere, but the extremely cold temperatures of the Antarctic stratosphere in winter permit the formation of polar stratospheric clouds, which greatly accelerate the production of the Cl and Br atoms that catalyze ozone destruction (e.g., [14]). This led to the ozone hole, the effective destruction of all ozone throughout the middle and lower stratosphere over Antarctica. The effect of ozone loss on ultraviolet radiation was quickly appreciated, and the use of halogenated compounds was regulated and phased out under the Montreal Protocol (which came into force in 1989) and subsequent agreements. Chemistry climate models suggest that the ozone hole should recover by the end of this century, assuming the ban on halogenated compounds is observed.

It was not appreciated until the first decade of the twenty-first century, however, that the ozone hole has also impacted the circulation of the Southern Hemisphere. The loss of ozone leads to a cooling of the austral polar vortex in springtime and a subsequent poleward shift in the tropospheric jet stream. Note that this poleward shift of the tropospheric jet in response to a stronger stratospheric vortex mirrors the coupling associated with natural variability in the Northern Hemisphere. As reviewed by [16], this shift in the jet stream has had significant impacts on precipitation across much of the Southern Hemisphere.

Stratospheric trends in water vapor also have the potential to affect the surface climate. Despite the minuscule concentration of water vapor in the stratosphere (just 3–5 parts per million), the radiative impact of a greenhouse gas scales logarithmically, so relatively large changes in small concentrations can have a strong impact. Decadal variations in stratospheric water vapor can have an influence on surface climate comparable to decadal changes in greenhouse gas forcing, and there is evidence of a positive feedback of stratospheric water vapor on greenhouse gas forcing.

The stratosphere has also featured prominently in the discussion of climate engineering (or geoengineering), the deliberate alteration of the Earth system to offset global warming; the impacts of schemes such as solar radiation management (see the glossary) on the stratosphere and its planetary wave coupling with the troposphere are not well understood.

Further Reading

There are a number of review papers on stratosphere-troposphere coupling in the literature. In particular, [13] provides a comprehensive discussion of stratosphere-troposphere coupling, while [7] highlights developments in the last decade. Andrews et al. [1] provide a classic text on the dynamics of the stratosphere, and [9] provides a wider perspective on the stratosphere, including the history of the field.

References

1. Andrews, D.G., Holton, J.R., Leovy, C.B.: Middle Atmosphere Dynamics. Academic Press, Waltham, MA (1987)
2. Assmann, R.A.: Über die Existenz eines wärmeren Luftstromes in der Höhe von 10 bis 15 km. Sitzungsber. K. Preuss. Akad. Wiss. 24, 495–504 (1902)
3. Baldwin, M.P., Dunkerton, T.J.: Stratospheric harbingers of anomalous weather regimes. Science 294, 581–584 (2001)
4. Chapman, S.: A theory of upper-atmosphere ozone. Mem. R. Meteor. Soc. 3, 103–125 (1930)
5. Charney, J.G., Drazin, P.G.: Propagation of planetary-scale disturbances from the lower into the upper atmosphere. J.
tem to offset the consequences of greenhouse-induced Geophys. Res. 66, 83–109 (1961) 6. Fuegistaler, S., Dessler, A.E., Dunkerton, T.J., Folkins, I., warming. Inspired by the natural cooling impact of Fu, Q., Mote, P.W.: Tropical tropopause layer. Rev. Geo- volcanic aerosols, the idea is to inject hydrogen sulfide phys. 47, RG1004 (2009) or sulfur dioxide into the stratosphere, where it will 7. Gerber, E.P., Butler, A., Calvo, N., Charlton-Perez, A., form sulfate aerosols. To date, this strategy of the Giorgetta, M., Manzini, E., Perlwitz, J., Polvani, L.M., S Sassi, F., Scaife, A.A., Shaw, T.A., Son, S.W., Watanabe, so-called solar radiation management appears to be S.: Assessing and understanding the impact of stratospheric among the most feasible and cost-effective means of dynamics and variability on the Earth system. Bull. Am. cooling the Earth’s surface, but it comes with many Meteor. Soc. 93, 845–859 (2012) dangers. In particular, it does not alleviate ocean acid- 8. Holton, J.R., Haynes, P.H., McIntyre, M.E., Douglass, A.R., Rood, R.B., Pfister, L.: Stratosphere-troposphere exchange. ification, and the effect is short-lived – a maximum of Rev. Geophys. 33, 403–439 (1995) two years – and so would require continual action ad 9. Labitzke, K.G., Loon, H.V.: The Stratosphere: Phenom- infinitum or until greenhouse gas concentrations were ena, History, and Relevance. Springer, Berlin/New York returned to safer levels. (In saying this, it is important (1999) 10. Matsuno, T.: Vertical propagation of stationary planetary to note that the natural time scale for carbon dioxide waves in the winter Northern Hemisphere. J. Atmos. Sci. removal is 100,000s of years, and there are no known 27, 871–883 (1970) strategies for accelerating CO2 removal that appear 11. Plumb, R.A.: Stratospheric transport. J. Meteor. Soc. Jpn. feasible, given current technology.) In addition, the 80, 793–809 (2002) 12. 
Scherhag, R.: Die explosionsartige stratospharenerwarmung¨ impact of sulfate aerosols on stratospheric ozone and des spatwinters¨ 1951/52. Ber Dtsch Wetterd. US Zone 6, the potential regional effects due to changes in the 51–63 (1952) 1424 Structural Dynamics

13. Shepherd, T.G.: Issues in stratosphere-troposphere coupling. J. Meteor. Soc. Jpn. 80, 769–792 (2002)
14. Solomon, S.: Stratospheric ozone depletion: a review of concepts and history. Rev. Geophys. 37(3), 275–316 (1999). doi:10.1029/1999RG900008
15. Teisserenc de Bort, L.: Variations de la température de l'air libre dans la zone comprise entre 8 km et 13 km d'altitude. C. R. Acad. Sci. Paris 138, 42–45 (1902)
16. Thompson, D.W.J., Solomon, S., Kushner, P.J., England, M.H., Grise, K.M., Karoly, D.J.: Signatures of the Antarctic ozone hole in Southern Hemisphere surface climate change. Nat. Geosci. 4, 741–749 (2011). doi:10.1038/NGEO1296

Structural Dynamics

Roger Ohayon1 and Christian Soize2
1Structural Mechanics and Coupled Systems Laboratory, LMSSC, Conservatoire National des Arts et Métiers (CNAM), Paris, France
2Laboratoire Modélisation et Simulation Multi-Échelle, MSME UMR 8208 CNRS, Université Paris-Est, Marne-la-Vallée, France

Description

Computational structural dynamics is devoted to the computation of the dynamical responses, in the time or frequency domains, of complex structures submitted to prescribed excitations. The complex structure consists of a deformable medium made of metallic materials, heterogeneous composite materials, and, more generally, of metamaterials.

This chapter presents linear dynamic analysis for complex structures, which is the most frequent case encountered in practice. For this situation, one of the most efficient modeling strategies is based on a formulation in the frequency domain (structural vibrations). There are many advantages to using a frequency domain formulation instead of a time domain formulation, because the modeling can be adapted to the nature of the physical responses which are observed. This is the reason why the low-, the medium-, and the high-frequency ranges are introduced. The different types of vibration responses of a linear dissipative complex structure lead us to define the frequency ranges of analysis. Let $u_j(x,\omega)$ be the frequency response function (FRF) of a component $j$ of the displacement $u(x,\omega)$, at a fixed point $x$ of the structure and at a fixed circular frequency $\omega$ (in rad/s). Figure 1 represents the modulus $|u_j(x,\omega)|$ in log scale and the unwrapped phase $\varphi_j(x,\omega)$ of the FRF, such that $u_j(x,\omega) = |u_j(x,\omega)|\,\exp\{-i\varphi_j(x,\omega)\}$. The unwrapped phase is defined as a continuous function of $\omega$ obtained by adding multiples of $\pm 2\pi$ at the jumps of the phase angle. The three frequency ranges can then be characterized as follows:
1. The low-frequency range (LF) is defined as the modal domain for which the modulus of the FRF exhibits isolated resonances due to a low modal density of elastic structural modes. The amplitudes of the resonances are driven by the damping, and the phase rotates by $\pi$ at the crossing of each isolated resonance (see Fig. 1). For the LF range, the strategy used consists in computing the elastic structural modes of the associated conservative dynamical system and then constructing a reduced-order model by Ritz-Galerkin projection. The resulting matrix equation is solved in the time domain or in the frequency domain. It should be noted that substructuring techniques can also be introduced for complex structural systems. Those techniques consist in decomposing the structure into substructures and then constructing a reduced-order model for each substructure, for which the physical degrees of freedom on the coupling interfaces are kept.
2. The high-frequency range (HF) is defined as the range for which there is a high modal density which is constant over the considered frequency range. In this HF range, the modulus of the FRF varies slowly as a function of the frequency and the phase is approximately linear (see Fig. 1). Presently, this frequency range is addressed by various approaches such as Statistical Energy Analysis (SEA), the diffusion-of-energy equation, and transport equations. However, due to the constant increase of computer power and advances in the modeling of complex mechanical systems, this frequency domain becomes more and more accessible to computational methods.
3. For complex structures (complex geometry, heterogeneous materials, complex junctions, complex boundary conditions, several attached pieces of equipment or mechanical subsystems, etc.), an intermediate frequency range, called the medium-frequency range (MF), appears. This MF range does not exist for a simple structure (e.g., a simply supported homogeneous straight beam). This MF range is defined as the intermediate frequency range

Structural Dynamics, Fig. 1 Modulus (left) and unwrapped phase (right) of the FRF as a function of the frequency. Definition of the LF, MF, and HF ranges

for which the modal density exhibits large variations over the frequency band. Due to the presence of damping, which yields an overlapping of elastic structural modes, the frequency response functions do not exhibit isolated resonances, and the phase varies slowly as a function of the frequency (see Fig. 1). In this MF range, the responses are sensitive to the damping model (for a weakly dissipative structure), which is frequency dependent, and sensitive to uncertainties. For this MF range, the computational model is constructed as follows: the reduced-order computational model of the LF range can be used by (i) adapting the finite element discretization to the MF range, (ii) introducing appropriate damping models (due to dissipation in the structure and to the transfer of mechanical energy from the structure to mechanical subsystems which are not taken into account in the computational model), and (iii) introducing uncertainty quantification for both the system-parameter uncertainties and the model uncertainties induced by modeling errors.

For the sake of brevity, the case of nonlinear dynamical responses of structures (involving nonlinear constitutive equations, nonlinear geometrical effects, plays, etc.) is not considered in this chapter (see the Bibliographical comments).

Formulation in the Low-Frequency Range for Complex Structures

We consider the linear dynamics of a structure around a position of static equilibrium taken as the reference configuration, $\Omega$, which is a three-dimensional bounded connected domain of $\mathbb{R}^3$, with a smooth boundary $\partial\Omega$ whose external unit normal is denoted $n$. The generic point of $\Omega$ is $x = (x_1, x_2, x_3)$. Let $u(x,t) = (u_1(x,t), u_2(x,t), u_3(x,t))$ be the displacement of a particle located at point $x$ in $\Omega$ at a time $t$. The structure is assumed to be free ($\Gamma_0 = \emptyset$), a given surface force field $G(x,t) = (G_1(x,t), G_2(x,t), G_3(x,t))$ is applied to the total boundary $\Gamma = \partial\Omega$, and a given body force field $g(x,t) = (g_1(x,t), g_2(x,t), g_3(x,t))$ is applied in $\Omega$. It is assumed that these external forces are in equilibrium. Below, if $w$ is any quantity depending on $x$, then $w_{,j}$ denotes the partial derivative of $w$ with respect to $x_j$. The classical convention of summation over repeated Latin indices is also used.

The elastodynamic boundary value problem is written, in terms of $u$ and at time $t$, as

$$\rho(x)\,\partial_t^2 u_i(x,t) - \sigma_{ij,j}(x,t) = g_i(x,t) \quad \text{in } \Omega, \qquad (1)$$
$$\sigma_{ij}(x,t)\, n_j(x) = G_i(x,t) \quad \text{on } \Gamma, \qquad (2)$$
$$\sigma_{ij}(x,t) = a_{ijkh}(x)\,\varepsilon_{kh}(u) + b_{ijkh}(x)\,\varepsilon_{kh}(\partial_t u), \quad \varepsilon_{kh}(u) = (u_{k,h} + u_{h,k})/2. \qquad (3)$$

In (1), $\rho(x)$ is the mass density field and $\sigma_{ij}$ is the Cauchy stress tensor. The constitutive equation is defined by (3), exhibiting an elastic part defined by the tensor $a_{ijkh}(x)$ and a dissipative part defined by the tensor $b_{ijkh}(x)$, independent of $t$ because the model is developed for the low-frequency range; $\varepsilon(\partial_t u)$ is the linearized strain tensor of the velocity field.

Let $\mathcal{C} = (H^1(\Omega))^3$ be the real Hilbert space of the admissible displacement fields, $x \mapsto v(x)$, on $\Omega$. Considering $t$ as a parameter, the variational formulation of the boundary value problem defined by

(1)–(3) consists, for fixed $t$, in finding $u(\cdot,t)$ in $\mathcal{C}$ such that

$$m(\partial_t^2 u, v) + d(\partial_t u, v) + k(u,v) = f(t;v), \quad \forall v \in \mathcal{C}, \qquad (4)$$

in which the bilinear form $m$ is symmetric and positive definite, the bilinear forms $d$ and $k$ are symmetric, positive semi-definite, and they are such that

$$m(u,v) = \int_\Omega \rho\, u_j\, v_j\, dx, \quad k(u,v) = \int_\Omega a_{ijkh}\, \varepsilon_{kh}(u)\, \varepsilon_{ij}(v)\, dx, \quad d(u,v) = \int_\Omega b_{ijkh}\, \varepsilon_{kh}(u)\, \varepsilon_{ij}(v)\, dx, \qquad (5)$$
$$f(t;v) = \int_\Omega g_j(t)\, v_j\, dx + \int_\Gamma G_j(t)\, v_j\, ds(x). \qquad (6)$$

The kernel of the bilinear forms $k$ and $d$ is the set of rigid body displacements, $\mathcal{C}_{\rm rig} \subset \mathcal{C}$, of dimension 6. Any displacement field $u_{\rm rig}$ in $\mathcal{C}_{\rm rig}$ is such that, for all $x$ in $\Omega$, $u_{\rm rig}(x) = t + \theta \times x$, in which $t$ and $\theta$ are two arbitrary constant vectors in $\mathbb{R}^3$.

For the evolution problem with given Cauchy initial conditions $u(\cdot,0) = u_0$ and $\partial_t u(\cdot,0) = v_0$, the analysis of the existence and uniqueness of a solution requires the introduction of the following hypotheses: $\rho$ is a positive bounded function on $\Omega$; for all $x$ in $\Omega$, the fourth-order tensor $a_{ijkh}(x)$ (resp. $b_{ijkh}(x)$) is symmetric, $a_{ijkh}(x) = a_{jikh}(x) = a_{ijhk}(x) = a_{khij}(x)$, and such that, for every second-order real symmetric tensor $\xi_{ij}$, there is a positive constant $c$ independent of $x$ such that $a_{ijkh}(x)\,\xi_{kh}\,\xi_{ij} \geq c\,\xi_{ij}\,\xi_{ij}$; the functions $a_{ijkh}$ and $b_{ijkh}$ are bounded on $\Omega$; finally, $g$ and $G$ are such that the linear form $v \mapsto f(t;v)$ is continuous on $\mathcal{C}$. It is assumed that for all $v$ in $\mathcal{C}$, $t \mapsto f(t;v)$ is a square integrable function on $\mathbb{R}$. Let $\mathcal{C}^c$ be the complexified vector space of $\mathcal{C}$ and let $\bar v$ be the complex conjugate of $v$. Then, introducing the Fourier transforms $u(x,\omega) = \int_{\mathbb{R}} e^{-i\omega t}\, u(x,t)\, dt$ and $f(\omega;v) = \int_{\mathbb{R}} e^{-i\omega t}\, f(t;v)\, dt$, the variational formulation defined by (4) can be rewritten as follows: for all fixed real $\omega \neq 0$, find $u(\cdot,\omega)$ with values in $\mathcal{C}^c$ such that

$$-\omega^2\, m(u, \bar v) + i\omega\, d(u, \bar v) + k(u, \bar v) = f(\omega; \bar v), \quad \forall v \in \mathcal{C}^c. \qquad (7)$$

The finite element discretization with $n$ degrees of freedom of (4) yields the following second-order differential equation on $\mathbb{R}^n$:

$$[M]\, \ddot{U}(t) + [D]\, \dot{U}(t) + [K]\, U(t) = F(t), \qquad (8)$$

and its Fourier transform, which corresponds to the finite element discretization of (7), yields the complex matrix equation

$$(-\omega^2\, [M] + i\omega\, [D] + [K])\, U(\omega) = F(\omega), \qquad (9)$$

in which $[M]$ is the mass matrix, which is a symmetric positive definite $(n \times n)$ real matrix, and where $[D]$ and $[K]$ are the damping and stiffness matrices, which are symmetric positive semi-definite $(n \times n)$ real matrices.

Case of a fixed structure. If the structure is fixed on a part $\Gamma_0$ of the boundary $\partial\Omega$ (Dirichlet condition $u = 0$ on $\Gamma_0$), then the given surface force field $G(x,t)$ is applied to the part $\Gamma = \partial\Omega \setminus \Gamma_0$. The space $\mathcal{C}$ of the admissible displacement fields must be replaced by

$$\mathcal{C}_0 = \{ v \in \mathcal{C}\; ;\; v = 0 \text{ on } \Gamma_0 \}. \qquad (10)$$

The complex vector space $\mathcal{C}^c$ must be replaced by the complex vector space $\mathcal{C}_0^c$, which is the complexified vector space of $\mathcal{C}_0$. The real matrices $[D]$ and $[K]$ are then positive definite.

Associated Spectral Problem and Structural Modes

Setting $\lambda = \omega^2$, the spectral problem associated with the variational formulation defined by (4) or (7) is stated as the following generalized eigenvalue problem: find real $\lambda \geq 0$ and $u \neq 0$ in $\mathcal{C}$ such that

$$k(u,v) = \lambda\, m(u,v), \quad \forall v \in \mathcal{C}. \qquad (11)$$

Rigid body modes (solutions for $\lambda = 0$). Since the dimension of $\mathcal{C}_{\rm rig}$ is 6, $\lambda = 0$ can be considered as a "zero eigenvalue" of multiplicity 6, denoted $\lambda_{-5} = \ldots = \lambda_0 = 0$. Let $u_{-5}, \ldots, u_0$ be the corresponding eigenfunctions, which are constructed such that the following orthogonality conditions are satisfied: for $\alpha$ and $\beta$ in $\{-5,\ldots,0\}$, $m(u_\alpha, u_\beta) = \mu_\alpha\, \delta_{\alpha\beta}$ and $k(u_\alpha, u_\beta) = 0$. These eigenfunctions, called the rigid body modes, form a basis of $\mathcal{C}_{\rm rig} \subset \mathcal{C}$, and any rigid body displacement $u_{\rm rig}$ in $\mathcal{C}_{\rm rig}$ can then be expanded as $u_{\rm rig} = \sum_{\alpha=-5}^{0} q_\alpha\, u_\alpha$.

Elastic structural modes (solutions for $\lambda \neq 0$). We introduce the subset $\mathcal{C}_{\rm elas} = \mathcal{C} \setminus \mathcal{C}_{\rm rig}$. It can be shown that $\mathcal{C} = \mathcal{C}_{\rm rig} \oplus \mathcal{C}_{\rm elas}$, which means that any displacement field $u$ in $\mathcal{C}$ has the unique decomposition $u = u_{\rm rig} + u_{\rm elas}$, with $u_{\rm rig}$ in $\mathcal{C}_{\rm rig}$ and $u_{\rm elas}$ in $\mathcal{C}_{\rm elas}$. Consequently, $k(u_{\rm elas}, v_{\rm elas})$ defined on $\mathcal{C}_{\rm elas} \times \mathcal{C}_{\rm elas}$ is positive definite, and we then have $k(u_{\rm elas}, u_{\rm elas}) > 0$ for all $u_{\rm elas} \neq 0$ in $\mathcal{C}_{\rm elas}$.

Eigenvalue problem restricted to $\mathcal{C}_{\rm elas}$. The eigenvalue problem restricted to $\mathcal{C}_{\rm elas}$ is written as follows: find $\lambda \neq 0$ and $u_{\rm elas} \neq 0$ in $\mathcal{C}_{\rm elas}$ such that

$$k(u_{\rm elas}, v_{\rm elas}) = \lambda\, m(u_{\rm elas}, v_{\rm elas}), \quad \forall v_{\rm elas} \in \mathcal{C}_{\rm elas}. \qquad (12)$$

Countable number of positive eigenvalues. It can be proven that the eigenvalue problem defined by (12) admits an increasing sequence of positive eigenvalues $0 < \lambda_1 \leq \lambda_2 \leq \ldots \leq \lambda_\alpha \leq \ldots$. In addition, any multiple positive eigenvalue has a finite multiplicity (which means that a multiple positive eigenvalue is repeated a finite number of times).

Orthogonality of the eigenfunctions corresponding to the positive eigenvalues. The sequence of eigenfunctions $\{u_\alpha\}_\alpha$ in $\mathcal{C}_{\rm elas}$ corresponding to the positive eigenvalues satisfies the following orthogonality conditions:

$$m(u_\alpha, u_\beta) = \mu_\alpha\, \delta_{\alpha\beta}, \quad k(u_\alpha, u_\beta) = \mu_\alpha\, \omega_\alpha^2\, \delta_{\alpha\beta}, \qquad (13)$$

in which $\omega_\alpha = \sqrt{\lambda_\alpha}$ and where $\mu_\alpha$ is a positive real number depending on the normalization of the eigenfunction $u_\alpha$.

Completeness of the eigenfunctions corresponding to the positive eigenvalues. Let $u_\alpha$ be the eigenfunction associated with the eigenvalue $\lambda_\alpha > 0$. It can be shown that the eigenfunctions $\{u_\alpha\}_{\alpha \geq 1}$ form a complete family in $\mathcal{C}_{\rm elas}$, and consequently, an arbitrary function $u_{\rm elas}$ belonging to $\mathcal{C}_{\rm elas}$ can be expanded as $u_{\rm elas} = \sum_{\alpha=1}^{+\infty} q_\alpha\, u_\alpha$, in which $\{q_\alpha\}_\alpha$ is a sequence of real numbers. These eigenfunctions are called the elastic structural modes.

Orthogonality between the elastic structural modes and the rigid body modes. We have $k(u_\alpha, u_{\rm rig}) = 0$ and $m(u_\alpha, u_{\rm rig}) = 0$. Substituting $u_{\rm rig}(x) = t + \theta \times x$ into the latter yields

$$\int_\Omega u_\alpha(x)\, \rho(x)\, dx = 0, \quad \int_\Omega x \times u_\alpha(x)\, \rho(x)\, dx = 0, \qquad (14)$$

which shows that the inertial center of the structure deformed under the elastic structural mode $u_\alpha$ coincides with the inertial center of the undeformed structure.

Expansion of the displacement field using the rigid body modes and the elastic structural modes. Any displacement field $u$ in $\mathcal{C}$ can then be written as $u = \sum_{\alpha=-5}^{0} q_\alpha\, u_\alpha + \sum_{\alpha=1}^{+\infty} q_\alpha\, u_\alpha$.

Terminology. In structural vibrations, $\omega_\alpha > 0$ is called the eigenfrequency of the elastic structural mode $u_\alpha$ (or the eigenmode or mode shape of vibration), whose normalization is defined by the generalized mass $\mu_\alpha$. An elastic structural mode $\alpha$ is defined by the three quantities $\{\omega_\alpha, u_\alpha, \mu_\alpha\}$.

Finite element discretization. The matrix equation of the generalized symmetric eigenvalue problem corresponding to the finite element discretization of (11) is written as

$$[K]\, U = \lambda\, [M]\, U. \qquad (15)$$

For large computational models, this generalized eigenvalue problem is solved using iteration algorithms, such as Krylov-sequence methods, the Lanczos method, and the subspace iteration method, which allow a prescribed number of eigenvalues and associated eigenvectors to be computed.

Case of a fixed structure. If the structure is fixed on $\Gamma_0$, $\mathcal{C}_{\rm rig}$ is reduced to the empty set, and the admissible space $\mathcal{C}_{\rm elas}$ must be replaced by $\mathcal{C}_0$. In this case the eigenvalues are strictly positive. In addition, the property given for the free structure concerning the inertial center of the structure does not hold.

Reduced-Order Computational Model in the Frequency Domain

In the frequency range, the reduced-order computational model is constructed using the Ritz-Galerkin projection. Let $\mathcal{C}_S$ be the admissible function space such that $\mathcal{C}_S = \mathcal{C}_{\rm elas}$ for a free structure and $\mathcal{C}_S = \mathcal{C}_0$ for a structure fixed on $\Gamma_0$. Let $\mathcal{C}_{S,N}$ be the subspace of $\mathcal{C}_S$, of dimension $N \geq 1$, spanned by the finite family $\{u_1, \ldots, u_N\}$ of elastic structural modes $u_\alpha$.

For all fixed $\omega$, the projection $u^N(\omega)$ of $u(\omega)$ on the complexified vector space of $\mathcal{C}_{S,N}$ can be written as

$$u^N(x,\omega) = \sum_{\alpha=1}^{N} q_\alpha(\omega)\, u_\alpha(x), \qquad (16)$$

in which $q(\omega) = (q_1(\omega), \ldots, q_N(\omega))$ is the complex-valued vector of the generalized coordinates, which verifies the matrix equation on $\mathbb{C}^N$

$$(-\omega^2\, [\mathcal{M}] + i\omega\, [\mathcal{D}] + [\mathcal{K}])\, q(\omega) = \mathcal{F}(\omega), \qquad (17)$$

in which $[\mathcal{M}]$, $[\mathcal{D}]$, and $[\mathcal{K}]$ are $(N \times N)$ real symmetric positive definite matrices (for a free or a fixed structure). Matrices $[\mathcal{M}]$ and $[\mathcal{K}]$ are diagonal and such that

$$[\mathcal{M}]_{\alpha\beta} = m(u_\beta, u_\alpha) = \mu_\alpha\, \delta_{\alpha\beta}, \quad [\mathcal{K}]_{\alpha\beta} = k(u_\beta, u_\alpha) = \mu_\alpha\, \omega_\alpha^2\, \delta_{\alpha\beta}. \qquad (18)$$

The damping matrix $[\mathcal{D}]$ is not sparse (it is fully populated), and the components $\mathcal{F}_\alpha$ of the complex-valued vector of the generalized forces $\mathcal{F} = (\mathcal{F}_1, \ldots, \mathcal{F}_N)$ are such that

$$[\mathcal{D}]_{\alpha\beta} = d(u_\beta, u_\alpha), \quad \mathcal{F}_\alpha(\omega) = f(\omega; u_\alpha). \qquad (19)$$

The reduced-order model is defined by (16)–(19).

Convergence of the solution constructed with the reduced-order model. For all real $\omega$, (17) has a unique solution $u^N(\omega)$, which converges in $\mathcal{C}_S$ as $N$ goes to infinity. Quasi-static correction terms can be introduced to accelerate the convergence with respect to $N$.

Remarks concerning the diagonalization of the damping operator. When the damping operator is diagonalized by the elastic structural modes, the matrix $[\mathcal{D}]$ defined by (19) is an $(N \times N)$ diagonal matrix, which can be written as $[\mathcal{D}]_{\alpha\beta} = d(u_\beta, u_\alpha) = 2\, \mu_\alpha\, \omega_\alpha\, \xi_\alpha\, \delta_{\alpha\beta}$, in which $\mu_\alpha$ and $\omega_\alpha$ are defined by (18). The critical damping rate $\xi_\alpha$ of the elastic structural mode $u_\alpha$ is a positive real number. A weakly damped structure is a structure such that $0 < \xi_\alpha \ll 1$ for all $\alpha$ in $\{1,\ldots,N\}$. Several algebraic expressions exist for diagonalizing the damping bilinear form with the elastic structural modes.

Bibliographical Comments

The mathematical aspects related to the variational formulation, existence and uniqueness, and finite element discretization of boundary value problems for elastodynamics can be found in Dautray and Lions [6], Oden and Reddy [11], and Hughes [9]. More details concerning the finite element method can be found in Zienkiewicz and Taylor [14]. Concerning time integration algorithms in nonlinear computational dynamics, the reader is referred to Belytschko et al. [3] and Har and Tamma [8]. General mechanical formulations in computational structural dynamics, vibration, eigenvalue algorithms, and substructuring techniques can be found in Argyris and Mlejnek [1], Geradin and Rixen [7], Bathe and Wilson [2], and Craig and Bampton [5]. For computational structural dynamics in the low- and medium-frequency ranges and extensions to structural acoustics, we refer the reader to Ohayon and Soize [12]. Various formulations for the high-frequency range can be found in Lyon and Dejong [10] for Statistical Energy Analysis and in Chap. 4 of Bensoussan et al. [4] for diffusion-of-energy and transport equations. Concerning uncertainty quantification (UQ) in computational structural dynamics, we refer the reader to Soize [13].

References

1. Argyris, J., Mlejnek, H.P.: Dynamics of Structures. North-Holland, Amsterdam (1991)
2. Bathe, K.J., Wilson, E.L.: Numerical Methods in Finite Element Analysis. Prentice-Hall, New York (1976)
3. Belytschko, T., Liu, W.K., Moran, B.: Nonlinear Finite Elements for Continua and Structures. Wiley, Chichester (2000)
4. Bensoussan, A., Lions, J.L., Papanicolaou, G.: Asymptotic Analysis for Periodic Structures. AMS Chelsea Publishing, Providence (2010)
5. Craig, R.R., Bampton, M.C.C.: Coupling of substructures for dynamic analysis. AIAA J. 6, 1313–1319 (1968)
6. Dautray, R., Lions, J.L.: Mathematical Analysis and Numerical Methods for Science and Technology. Springer, Berlin (2000)
7. Geradin, M., Rixen, D.: Mechanical Vibrations: Theory and Applications to Structural Dynamics, 2nd edn. Wiley, Chichester (1997)
8. Har, J., Tamma, K.: Advances in Computational Dynamics of Particles, Materials and Structures. Wiley, Chichester (2012)

9. Hughes, T.J.R.: The Finite Element Method: Linear Static and Dynamic Finite Element Analysis. Dover, New York (2000)
10. Lyon, R.H., Dejong, R.G.: Theory and Application of Statistical Energy Analysis, 2nd edn. Butterworth-Heinemann, Boston (1995)
11. Oden, J.T., Reddy, J.N.: An Introduction to the Mathematical Theory of Finite Elements. Dover, New York (2011)
12. Ohayon, R., Soize, C.: Structural Acoustics and Vibration. Academic, London (1998)
13. Soize, C.: Stochastic Models of Uncertainties in Computational Mechanics, vol. 2. Engineering Mechanics Institute (EMI) of the American Society of Civil Engineers (ASCE), Reston (2012)
14. Zienkiewicz, O.C., Taylor, R.L.: The Finite Element Method: The Basis, vol. 1, 5th edn. Butterworth-Heinemann, Oxford (2000)

Subdivision Schemes

Nira Dyn
School of Mathematical Sciences, Tel-Aviv University, Tel-Aviv, Israel

Mathematics Subject Classification

Primary: 65D07; 65D10; 65D17. Secondary: 41A15; 41A25

Definition

A subdivision scheme is a method for generating a continuous function from discrete data, by repeated applications of refinement rules. A refinement rule operates on a set of data points and generates a denser set using local mappings. The function generated by a convergent subdivision scheme is the limit of the sequence of sets of points generated by the repeated refinements.

Description

Subdivision schemes are efficient computational methods for the design, representation, and approximation of curves and surfaces in 3D and for the generation of refinable functions, which are instrumental in the construction of wavelets.

The "classical" subdivision schemes are stationary and linear, applying the same linear refinement rule at each refinement level. The theory of these schemes is well developed and well understood; see [2, 7, 9, 15] and references therein.

Nonlinear schemes were designed for the approximation of piecewise smooth functions (see, e.g., [1, 4]), for taking into account the geometry of the initial points in the design of curves/surfaces (see, e.g., [5, 8, 11]), and for manifold-valued data (see, e.g., [12, 14, 16]). These schemes were studied at a later stage, and in many cases their analysis is based on their proximity to linear schemes.

Linear Schemes

A linear refinement rule operates on a set of points in $\mathbb{R}^d$ with topological relations among them, expressed by relating the points to the vertices of a regular grid in $\mathbb{R}^s$. In the design of 3D surfaces, $d = 3$ and $s = 2$. The refinement consists of a rule for refining the grid and a rule for defining the new points corresponding to the vertices of the refined grid. The most common refinement is binary, and the most common grid is $\mathbb{Z}^s$.

For a set of points $P = \{P_i \in \mathbb{R}^d : i \in \mathbb{Z}^s\}$, related to $2^{-k}\mathbb{Z}^s$, the binary refinement rule $R$ generates points related to $2^{-k-1}\mathbb{Z}^s$, of the form

$$(RP)_i = \sum_{j \in \mathbb{Z}^s} a_{i-2j}\, P_j, \quad i \in \mathbb{Z}^s, \qquad (1)$$

with the point $(RP)_i$ related to the vertex $i\,2^{-k-1}$. The set of coefficients $\{a_i \in \mathbb{R} : i \in \mathbb{Z}^s\}$ is called the mask of the refinement rule, and only a finite number of the coefficients are nonzero, reflecting the locality of the refinement. When the same refinement rule is applied at all refinement levels, the scheme is called stationary, while if different linear refinement rules are applied at different refinement levels, the scheme is called nonstationary (see, e.g., [9]).
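As a concrete illustration of the refinement rule (1) in the univariate case $s = 1$ (this sketch is not part of the original entry; the helper name `refine` and the boundary handling by truncation of the sum are choices made for the example), one binary refinement step with a finite mask can be coded as follows, using the corner-cutting mask $a = (1/4, 3/4, 3/4, 1/4)$ of the Chaikin scheme presented among the examples:

```python
import numpy as np

def refine(P, mask):
    """One binary refinement step (RP)_i = sum_j a_{i-2j} P_j  (rule (1), s = 1).

    P    : array of points, shape (n,) or (n, d)
    mask : finite sequence a_0, ..., a_m of mask coefficients
    Indices outside the finite point set are handled by truncating the sum.
    """
    P = np.asarray(P, dtype=float)
    n, m = len(P), len(mask)
    out = np.zeros((2 * n,) + P.shape[1:])
    for j in range(n):                 # each old point P_j distributes its
        for r in range(m):             # contribution to the nearby new points
            i = 2 * j + r
            if i < 2 * n:
                out[i] += mask[r] * P[j]
    return out

# Corner-cutting (Chaikin) mask a = (1/4, 3/4, 3/4, 1/4):
chaikin = [0.25, 0.75, 0.75, 0.25]
P = np.array([0.0, 1.0, 0.0, 2.0])
for _ in range(4):                     # repeated refinement; the density doubles each step
    P = refine(P, chaikin)
```

Away from the boundary, the even/odd entries of the output reproduce the familiar corner-cutting averages $\tfrac14 P_j + \tfrac34 P_{j+1}$ and $\tfrac34 P_j + \tfrac14 P_{j+1}$ (up to an index shift, which is immaterial in the mask description).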
with mask coefficients as elements in specific positions s Although the refinement (1)isdefinedonallZ ,the (see, e.g., [2]) or an equivalent computationP in terms i finite support of the mask guarantees that the limit of of the Laurent polynomial a.z/ D i2Zs ai z (see, the subdivision scheme at a point is affected only by a e.g., [7]). finite number of initial points. When dealing with the design of surfaces, this When the initial points are samples of a smooth analysis applies only in all parts of the grids away function, the limit function of a convergent linear sub- from the extraordinary points. The analysis at these division scheme approximates the sampled function. points is local [13], but rather involved. It also dictates Thus, a convergent linear subdivision scheme is a linear changes in the refinement rules that have to be made approximation operator. near extraordinary points [13, 17]. As examples, we give two prototypes of stationary schemes for s D 1. Each is the simplest of its kind, converging to C 1 univariate functions. The first is the Chaikin scheme [3], called also “Corner cutting,” with References limit functions which “preserve the shape” of the initial sets of points. The refinement rule is 1. Amat, S., Dadourian, K., Liandart, J.: On a nonlin- ear subdivision scheme avoiding Gibbs oscillations and converging towards C. Math. Comput. 80, 959–971 3 1 (2011) .RP/2i D Pi C PiC1; 4 4 (3) 2. Cavaretta, A.S., Dahmen, W., Micchelli, C.A.: Stationary 1 3 Subdivision. Memoirs of MAS, vol. 93, p. 186. American .RP/ D P C P ;i2 Z: 2iC1 4 i 4 iC1 Mathematical Society, Providence (1991) 3. Chaikin, G.M.: An algorithm for high speed curve gen- eration. Comput. Graph. Image Process. 3, 346–349 The second is the 4-point scheme [6, 10], which in- (1974) terpolates the initial set of points (and all the sets of 4. Cohem, A., Dyn, N., Matei, B.: Quasilinear subdivision points generated by the scheme). 
It is defined by the schemes with applications to ENO interpolation. Appl. refinement rule Comput. Harmonic Anal. 15, 89–116 (2003) 5. Deng, C., Wang, G.: Incenter subdivision scheme for 9 curve interpolation. Comput. Aided Geom. Des. 27, 48–59 .RP/2i D Pi ;.RP/2iC1 D .Pi C PiC1/ (2010) 16 (4) 6. Dubuc, S.: Interpolation through an iterative scheme. 1 J. Math. Anal. Appl. 114, 185–204 (1986)  .P C P /; i 2 Z: 16 i1 iC2 7. Dyn, N.: Subdivision schemes in computer-aided geometric design. In: Light, W. (ed.) Advances in Numerical Analysis While the limit functions of the Chaikin scheme can – Volume II, Wavelets, Subdivision Algorithms and Radial Basis Functions, p. 36. Clarendon, Oxford (1992) be written in terms of B-splines of degree 2, the limit 8. Dyn, N., Hormann, K.: Geometric conditions for tangent functions of the 4-point scheme for general initial sets continuity of interpolatory planar subdivision curves. Com- of points have a fractal nature and are defined only put. Aided Geom. Des. 29, 332–347 (2012) procedurally by (4). 9. Dyn, N., Levin, D.: Subdivision schemes in geometric modelling. Acta-Numer. 11, 73–144 (2002) For the design of surfaces in 3D, s D 2 and the com- 10. Dyn, N., Gregory, J., Levin, D.: A 4-point interpolatory mon grids are Z2 and regular triangulations. The latter subdivision scheme for curve design. Comput. Aided Geom. are refined by dividing each triangle into four equal Des. 4, 257–268 (1987) ones. Yet regular grids (with each vertex belonging to 11. Dyn, N., Floater, M.S., Hormann, K.: Four-point curve subdivision based on iterated chordal and centripetal Z2 four squares in the case of and to six triangles in parametrizations. Comput. Aided Geom. Des. 26, 279–286 the case of a regular triangulation) are not sufficient for (2009) Symbolic Computing 1431

12. Grohs, P.: A general proximity analysis of nonlinear subdivision schemes. SIAM J. Math. Anal. 42, 729–750 (2010)
13. Peters, J., Reif, U.: Subdivision Surfaces, p. 204. Springer, Berlin/Heidelberg (2008)
14. Wallner, J., Navayazdani, E., Weinmann, A.: Convergence and smoothness analysis of subdivision rules in Riemannian and symmetric spaces. Adv. Comput. Math. 34, 201–218 (2011)
15. Warren, J., Weimer, H.: Subdivision Methods for Geometric Design, p. 299. Morgan Kaufmann, San Francisco (2002)
16. Xie, G., Yu, T.: Smoothness equivalence properties of general manifold-valued data subdivision schemes. Multiscale Model. Simul. 7, 1073–1100 (2010)
17. Zorin, D.: Smoothness of stationary subdivision on irregular meshes. Constr. Approx. 16, 359–397 (2000)

Symbolic Computing

Ondřej Čertík¹, Mateusz Paprocki², Aaron Meurer³, Brian Granger⁴, and Thilina Rathnayake⁵
¹Los Alamos National Laboratory, Los Alamos, NM, USA
²refptr.pl, Wroclaw, Poland
³Department of Mathematics, New Mexico State University, Las Cruces, NM, USA
⁴Department of Physics, California Polytechnic State University, San Luis Obispo, CA, USA
⁵Department of Computer Science, University of Moratuwa, Moratuwa, Sri Lanka

Synonyms

Computer algebra system; Symbolic computing; Symbolic manipulation

Keywords

Symbolic computing; Computer algebra system

Glossary/Definition Terms

Numerical computing: Computing that is based on finite precision arithmetic.
Symbolic computing: Computing that uses symbols to manipulate and solve mathematical formulas and equations in order to obtain mathematically exact results.

Definition

Scientific computing can be generally divided into two subfields: numerical computation, which is based on finite precision arithmetic (usually single or double precision), and symbolic computing, which uses symbols to manipulate and solve mathematical equations and formulas in order to obtain mathematically exact results. Symbolic computing, also called symbolic manipulation or computer algebra system (CAS), typically includes systems that can focus on well-defined computing areas such as polynomials, matrices, abstract algebra, number theory, or statistics, as well as on calculus-like manipulation such as limits, differential equations, or integrals. A full-featured CAS should have all or most of the listed features. There are systems that focus only on one specific area like polynomials; those are often called CASes too.

Some authors claim that symbolic computing and computer algebra are two views of computing with mathematical objects [1]. According to them, symbolic computation deals with expression trees and addresses problems of determination of expression equivalence, simplification, and computation of canonical forms, while computer algebra is more centered around the computation of mathematical quantities in well-defined algebraic domains. The distinction between symbolic computing and computer algebra is often not made; the terms are used interchangeably. We will do so in this entry as well.

History

Algorithms for Computer Algebra [2] provides a concise description of the history of symbolic computation. The invention of LISP in the early 1960s had a great impact on the development of symbolic computation. FORTRAN and ALGOL, which existed at the time, were primarily designed for numerical computation. In 1961, James Slagle at MIT (Massachusetts Institute of Technology) wrote a heuristics-based LISP program for Symbolic Automatic INTegration (SAINT) [3]. In 1963, Martinus J. G. Veltman developed Schoonschip [4, 5] for particle physics calculations. In 1966, Joel Moses (also from MIT) wrote a program called SIN [6] for the same purpose as SAINT, but he used a more efficient algorithmic approach. In 1968, REDUCE [7] was developed by Tony Hearn at Stanford University for physics calculations. Also in 1968, a specialized CAS called CAMAL [8] for handling Poisson series in celestial mechanics was developed by John Fitch and David Barton from the University of Cambridge. In 1970, a general purpose system called REDUCE 2 was introduced.

In 1971 Macsyma [9] was developed with capabilities for algebraic manipulation, limit calculations, symbolic integration, and the solution of equations. In the late 1970s muMATH [10] was developed by the University of Hawaii and it came with its own programming language. It was the first CAS to run on widely available IBM PC computers. With the development of computing in the 1980s, more modern CASes began to emerge. Maple [11] was introduced by the University of Waterloo with a small compiled kernel and a large mathematical library, thus allowing it to be used powerfully on smaller platforms. In 1988, Mathematica [12] was developed by Stephen Wolfram with better graphical capabilities and integration with graphical user interfaces. In the 1980s more and more CASes were developed like Macaulay [13], PARI [14], GAP [15], and CAYLEY [16] (which later became Magma [17]). With the popularization of open-source software in the past decade, many open-source CASes were developed like Sage [18], SymPy [19], etc. Also, many of the existing CASes were later open sourced; Macsyma, for example, became Maxima [20]; Scratchpad [21] became Axiom [22].

Overview

A common functionality of all computer algebra systems typically includes at least the features mentioned in the following subsections. We use SymPy 0.7.5 and Mathematica 9 as examples of doing the same operation in two different systems, but any other full-featured CAS can be used as well (e.g., from Table 1) and it should produce functionally the same results.

To run the SymPy examples in a Python session, execute the following first:

from sympy import *
x, y, z, n, m = symbols('x, y, z, n, m')
f = Function('f')

To run the Mathematica examples, just execute them in a Mathematica Notebook.

Arbitrary Formula Representation

One can represent arbitrary expressions (not just polynomials). SymPy:

In [1]: (1+1/x)**x
Out[1]: (1 + 1/x)**x
In [2]: sqrt(sin(x))/z
Out[2]: sqrt(sin(x))/z

Mathematica:

In[1]:= (1+1/x)^x
Out[1]= (1+1/x)^x
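The same two points made in this subsection, that expressions are stored as symbolic trees and that arithmetic is exact rather than finite precision, can be checked programmatically. The following is a short illustrative sketch using only standard SymPy functions (symbols, srepr, Rational):

```python
from sympy import symbols, srepr, Rational

x = symbols('x')

# Arbitrary expressions are kept as trees, not evaluated numerically:
expr = (1 + 1/x)**x
print(expr)         # (1 + 1/x)**x
print(srepr(expr))  # the internal tree, built from Pow, Add, Symbol, ...

# Exact rational arithmetic instead of finite-precision floats:
print(Rational(1, 3) + Rational(1, 6))  # 1/2
```

Printing srepr(expr) exposes the expression tree that the "symbolic computation" view of the Definition section refers to.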

Symbolic Computing, Table 1  Implementation details of various computer algebra systems

Program                        License     Internal implementation language   CAS language
Mathematica [12]               Commercial  C/C++                              Custom
Maple [11]                     Commercial  C/C++                              Custom
Symbolic MATLAB toolbox [23]   Commercial  C/C++                              Custom
Axiom^a [22]                   BSD         Lisp                               Custom
SymPy [19]                     BSD         Python                             Python
Maxima [20]                    GPL         Lisp                               Custom
Sage [18]                      GPL         C++/Cython/Lisp                    Python^b
Giac/Xcas [24]                 GPL         C++                                Custom

^a The same applies to its two forks FriCAS [25] and OpenAxiom [26]
^b The default environment in Sage actually extends the Python language using a preparser that converts things like 2^3 into Integer(2)**Integer(3), but the preparser can be turned off and one can use Sage from a regular Python session as well

In[2]:= Sqrt[Sin[x]]/z
Out[2]= Sqrt[Sin[x]]/z

Limits

SymPy:

In [1]: limit(sin(x)/x, x, 0)
Out[1]: 1
In [2]: limit((2-sqrt(x))/(4-x), x, 4)
Out[2]: 1/4

Mathematica:

In[1]:= Limit[Sin[x]/x,x->0]
Out[1]= 1
In[2]:= Limit[(2-Sqrt[x])/(4-x),x->4]
Out[2]= 1/4

Differentiation

SymPy:

In [1]: diff(sin(2*x), x)
Out[1]: 2*cos(2*x)
In [2]: diff(sin(2*x), x, 10)
Out[2]: -1024*sin(2*x)

Mathematica:

In[1]:= D[Sin[2 x],x]
Out[1]= 2 Cos[2 x]
In[2]:= D[Sin[2 x],{x,10}]
Out[2]= -1024 Sin[2 x]

Integration

SymPy:

In [1]: integrate(1/(x**2+1), x)
Out[1]: atan(x)
In [2]: integrate(1/(x**2+3), x)
Out[2]: sqrt(3)*atan(sqrt(3)*x/3)/3

Mathematica:

In[1]:= Integrate[1/(x^2+1),x]
Out[1]= ArcTan[x]
In[2]:= Integrate[1/(x^2+3),x]
Out[2]= ArcTan[x/Sqrt[3]]/Sqrt[3]

Polynomial Factorization

SymPy:

In [1]: factor(x**2*y + x**2*z + x*y**2 + 2*x*y*z + x*z**2 + y**2*z + y*z**2)
Out[1]: (x + y)*(x + z)*(y + z)

Mathematica:

In[1]:= Factor[x^2 y+x^2 z+x y^2+2 x y z+x z^2+y^2 z+y z^2]
Out[1]= (x+y) (x+z) (y+z)

Algebraic and Differential Equation Solvers

Algebraic equations, SymPy:

In [1]: solve(x**4 + x**2 + 1, x)
Out[1]: [-1/2 - sqrt(3)*I/2, -1/2 + sqrt(3)*I/2, 1/2 - sqrt(3)*I/2, 1/2 + sqrt(3)*I/2]

Mathematica:

In[1]:= Reduce[1+x^2+x^4==0,x]
Out[1]= x==-(-1)^(1/3)||x==(-1)^(1/3)||x==-(-1)^(2/3)||x==(-1)^(2/3)

and differential equations, SymPy:

In [1]: dsolve(f(x).diff(x, 2) + f(x), f(x))
Out[1]: f(x) == C1*sin(x) + C2*cos(x)
In [2]: dsolve(f(x).diff(x, 2) + 9*f(x), f(x))
Out[2]: f(x) == C1*sin(3*x) + C2*cos(3*x)

Mathematica:

In[1]:= DSolve[f''[x]+f[x]==0,f[x],x]
Out[1]= {{f[x]->C[1] Cos[x]+C[2] Sin[x]}}
In[2]:= DSolve[f''[x]+9 f[x]==0,f[x],x]
Out[2]= {{f[x]->C[1] Cos[3 x]+C[2] Sin[3 x]}}

Formula Simplification

Simplification is not a well-defined operation (i.e., there are many ways to define the complexity of an expression), but typically the CAS is able to simplify, for example, the following expressions in an expected way. SymPy:

In [1]: simplify(-1/(2*(x**2 + 1)) - 1/(4*(x + 1)) + 1/(4*(x - 1)) - 1/(x**4 - 1))

Out[1]: 0

Mathematica:

In[1]:= Simplify[-1/(2(x^2+1))-1/(4(x+1))+1/(4(x-1))-1/(x^4-1)]
Out[1]= 0

or, SymPy:

In [1]: simplify((x - 1)/(x**2 - 1))
Out[1]: 1/(x + 1)

Mathematica:

In[1]:= Simplify[(x-1)/(x^2-1)]
Out[1]= 1/(1+x)

Numerical Evaluation

Exact expressions like √2, constants, sums, integrals, and symbolic expressions can be evaluated to a desired accuracy using a CAS. For example, in SymPy:

In [1]: N(sqrt(2), 30)
Out[1]: 1.41421356237309504880168872421
In [2]: N(Sum(1/n**n, (n, 1, oo)), 30)
Out[2]: 1.29128599706266354040728259060

Mathematica:

In[1]:= N[Sqrt[2],30]
Out[1]= 1.41421356237309504880168872421
In[2]:= N[Sum[1/n^n,{n,1,Infinity}],30]
Out[2]= 1.29128599706266354040728259060

Symbolic Summation

There are circumstances where it is mathematically impossible to get an explicit formula for a given sum. When an explicit formula exists, getting the exact result is usually desirable. SymPy:

In [1]: Sum(n, (n, 1, m)).doit()
Out[1]: m**2/2 + m/2
In [2]: Sum(1/n**6, (n, 1, oo)).doit()
Out[2]: pi**6/945
In [3]: Sum(1/n**5, (n, 1, oo)).doit()
Out[3]: zeta(5)

Mathematica:

In[1]:= Sum[n,{n,1,m}]
Out[1]= 1/2 m (1+m)
In[2]:= Sum[1/n^6,{n,1,Infinity}]
Out[2]= Pi^6/945
In[3]:= Sum[1/n^5,{n,1,Infinity}]
Out[3]= Zeta[5]

Software

A computer algebra system (CAS) is typically composed of a high-level (usually interpreted) language that the user interacts with in order to perform calculations. Many times the implementation of such a CAS is a mix of the high-level language together with some low-level language (like C or C++) for efficiency reasons. Some of them can easily be used as a library in user's programs; others can only be used from the custom CAS language.

A comprehensive list of computer algebra software is at [27]. Table 1 lists features of several established computer algebra systems. We have only included systems that can handle at least the problems mentioned in the Overview section.

Besides general full-featured CASes, there exist specialized packages, for example, Singular [28] for very fast polynomial manipulation or GiNaC [29] that can handle basic symbolic manipulation but does not have integration, advanced polynomial algorithms, or limits. Pari [14] is designed for number theory computations and Cadabra [30] for field theory calculations with tensors. Magma [17] specializes in algebra, number theory, algebraic geometry, and algebraic combinatorics. Finally, a CAS also usually contains a notebook-like interface, which can be used to enter commands or programs, plot graphs, and show nicely formatted equations. For Python-based CASes, one can use the IPython Notebook [31] or the Sage Notebook [18], both of which are interactive web applications that can be used from a web browser. C++ CASes can be wrapped in Python; for example, GiNaC has several Python wrappers: Swiginac [32], Pynac [33], etc. These can then be used from Python-based notebooks. Mathematica and Maple also contain a notebook interface, which accepts the given CAS high-level language.
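As a concrete illustration of the library-style use mentioned above, a CAS such as SymPy can be imported into an ordinary Python program and combined with numerical code. This is a minimal sketch using only documented SymPy functions (symbols, integrate, lambdify); lambdify is the standard bridge from a symbolic result to a fast numerical function:

```python
from sympy import symbols, integrate, lambdify

x = symbols('x')

# Derive a result symbolically...
antiderivative = integrate(1/(x**2 + 1), x)
print(antiderivative)  # atan(x)

# ...then turn it into a plain numerical Python function:
f = lambdify(x, antiderivative)
print(f(1.0))  # approximately pi/4 = 0.785398...
```

This pattern, symbolic derivation followed by numerical evaluation inside a larger program, is exactly what distinguishes library-style CASes from those usable only through their own language.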

Applications of Symbolic Computing

Symbolic computing has traditionally had numerous applications. By the 1970s, many CASes were used for celestial mechanics, general relativity, quantum electrodynamics, and other applications [34]. In this section we present a few such applications in more detail, but necessarily our list is incomplete and is only meant as a starting point for the reader. Many of the following applications and the scientific advances related to them would not be possible without symbolic computing.

Code Generation

One of the frequent uses of a CAS is to derive some symbolic expression and then generate C or Fortran code that numerically evaluates it in a production high-performance code. For example, to obtain the best rational function approximation (of orders 8, 8) to a modified Bessel function of the first kind of half-integer argument, I_{9/2}(x), on the interval [4, 10], one can use (in Mathematica):

In[1]:= Needs["FunctionApproximations`"]
In[2]:= FortranForm[HornerForm[MiniMaxApproximation[
          BesselI[9/2, x]*Sqrt[Pi*x/2]/Exp[x],
          {x,{4,10},8,8},WorkingPrecision->30][[2,1]]]]
Out[2]//FortranForm=
  (0.000395502959013236968661582656143 +
   x*(-0.001434648369704841686633794071 +
   x*(0.00248783474583503473135143644434 +
   x*(-0.00274477921388295929464613063609 +
   x*(0.00216275018107657273725589740499 +
   x*(-0.000236779926184242197820134964535 +
   x*(0.0000882030507076791807159699814428 +
   (-4.62078105288798755556136693122e-6 +
   8.23671374777791529292655504214e-7*x)*x))))))) /
  (1. + x*(0.504839286873735708062045336271 +
   x*(0.176683950009401712892997268723 +
   x*(0.0438594911840609324095487447279 +
   x*(0.00829753062428409331123592322788 +
   x*(0.00111693697900468156881720995034 +
   x*(0.000174719963536517752971223459247 +
   (7.22885338737473776714257581233e-6 +
   1.64737453771748367647332279826e-6*x)*x)))))))
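A comparable expression-to-source workflow exists in other systems as well. For instance, SymPy's documented code printers (ccode, fcode) together with horner turn a symbolic polynomial into Horner-form C or Fortran source. The snippet below is a small illustrative sketch with a toy polynomial, not the Bessel approximant above:

```python
from sympy import symbols, horner, ccode, fcode

x = symbols('x')
p = 4*x**3 + 3*x**2 + 2*x + 1

nested = horner(p)     # x*(x*(4*x + 3) + 2) + 1, cheaper to evaluate
print(ccode(nested))   # C source fragment
print(fcode(nested))   # Fortran source fragment
```

The printed fragments can be pasted into a C or Fortran routine, mirroring the Mathematica pipeline shown above.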

The result can be readily used in a Fortran code (we reformatted the white space in the output Out[2] to better fit into the page).

Particle Physics

The application of symbolic computing in particle physics typically involves generation and then calculation of Feynman diagrams (among other things, that involves doing fast traces of Dirac gamma matrices and other tensor operations). The first CAS that was designed for this task was Schoonschip [4, 5], and in 1984 FORM [35] was created as a successor. FORM has built-in features for manipulating formulas in particle physics, but it can also be used as a general purpose system (it keeps all expressions in expanded form, so it cannot do factorization; it also does not have more advanced features like series expansion, differential equations, or integration).

Many of the scientific results in particle physics would not be possible without a powerful CAS; Schoonschip was used for calculating properties of the

W boson in the 1960s, and FORM is still maintained and used to this day.

Another project is FeynCalc [36], originally written for Macsyma (Maxima) and later Mathematica. It is a package for algebraic calculations in elementary particle physics; among other things, it can do tensor and Dirac algebra manipulation, Lorentz index contraction, generation of Feynman rules from a Lagrangian, and Fortran code generation. There are hundreds of publications that used FeynCalc to perform calculations.

A similar project is FeynArts [37], which is also a Mathematica package that can generate and visualize Feynman diagrams and amplitudes. Those can then be calculated with a related project FormCalc [38], built on top of FORM.

PyDy

PyDy, short for Python Dynamics, is a workflow that utilizes an array of scientific tools written in the Python programming language to study multi-body dynamics [39]. The SymPy mechanics package is used to generate symbolic equations of motion in complex multi-body systems, and several other scientific Python packages are used for numerical evaluation (NumPy [40]), visualization (Matplotlib [41]), etc. First, an idealized version of the system is represented (geometry, configuration, constraints, external forces). Then the symbolic equations of motion (often very long) are generated using the mechanics package and solved (integrated after setting numerical values for the parameters) using differential equation solvers in SciPy [42]. These solutions can then be used for simulations and visualizations. Symbolic equation generation guarantees no mistakes in the calculations and makes it easy to deal with complex systems with a large number of components.

General Relativity

In general relativity the CASes have traditionally been used to symbolically represent the metric tensor g_μν and then use symbolic derivation to derive various tensors (the Riemann and Ricci tensors, curvature, ...) that are present in the Einstein equations [34]:

  R_μν − (1/2) g_μν R + Λ g_μν = (8πG/c⁴) T_μν.    (1)

Those can then be solved for simple systems.

Celestial Mechanics

The equations in celestial mechanics are solved using perturbation theory, which requires very efficient manipulation of a Poisson series [8, 34, 43–50]:

  Σ P(a, b, c, ..., h) sin/cos(u + v + ... + z),    (2)

where P(a, b, c, ..., h) is a polynomial and each term contains either sin or cos. Using trigonometric relations, it can be shown that this form is closed under addition, subtraction, multiplication, differentiation, and restricted integration. One of the earliest specialized CASes for handling Poisson series is CAMAL [8]. Many others were since developed, for example, TRIP [43].

Quantum Mechanics

Quantum mechanics is well known for many tedious calculations, and the use of a CAS can aid in doing them. There have been several books published with many worked-out problems in quantum mechanics done using Mathematica and other CASes [51, 52].

There are specialized packages for doing computations in quantum mechanics. For example, SymPy has extensive capabilities for symbolic quantum mechanics in the sympy.physics.quantum subpackage. At the base level, this subpackage has Python objects to represent the different mathematical objects relevant in quantum theory [53]: states (bras and kets), operators (unitary, Hermitian, etc.), and basis sets, as well as operations on these objects such as tensor products, inner products, outer products, commutators, anticommutators, etc. The base objects are designed in the most general way possible to enable any particular quantum system to be implemented by subclassing the base operators to provide system-specific logic. There is a general purpose qapply function that is capable of applying operators to states symbolically as well as simplifying a wide range of symbolic expressions involving different types of products and commutators/anticommutators. The state and operator objects also have a rich API for declaring their representation in a particular basis. This includes the ability to specify a basis for a multidimensional system using a complete set of commuting Hermitian operators.

On top of this base set of objects, a number of specific quantum systems have been implemented. First, there is traditional algebra for quantum angular momentum [54]. This allows the different spin operators (S_x, S_y, S_z) and their eigenstates to be represented in any basis and for any spin quantum number. Facilities for Clebsch-Gordan coefficients, Wigner coefficients, rotations, and angular momentum coupling are also present in their symbolic and numerical forms. Other examples of particular quantum systems that are implemented include second quantization, the simple harmonic oscillator (position/momentum and raising/lowering forms), and continuous position/momentum-based systems.

Second, there is a full set of states and operators for symbolic quantum computing [55]. Multidimensional qubit states can be represented symbolically and as vectors. A full set of one-qubit (X, Y, Z, H, etc.) and two-qubit (CNOT, SWAP, CPHASE, etc.) gates (unitary operators) are provided. These can be represented as matrices (sparse or dense) or made to act on qubits symbolically without representation. With these gates, it is possible to implement a number of basic quantum circuits including the quantum Fourier transform, quantum error correction, quantum teleportation, Grover's algorithm, dense coding, etc. There are other packages that specialize in quantum computing, for example, [56].

Number Theory

Number theory provides an important base for modern computing, especially in cryptography and coding theory [57]. For example, the LLL [58] algorithm is used in integer programming; primality testing and factoring algorithms are used in cryptography [59]. CASes are heavily used in these calculations. The Riemann hypothesis [60, 61], which implies results about the distribution of prime numbers, has important applications in computational mathematics since it can be used to estimate how long certain algorithms take to run [61]. The Riemann hypothesis states that all nontrivial zeros of the Riemann zeta function, defined for a complex variable s in the half-plane ℜ(s) > 1 by the absolutely convergent series ζ(s) = Σ_{n=1}^∞ n^(−s), have real part equal to 1/2. In 1986, this was proven for the first 1,500,000,001 nontrivial zeros using computational methods [62]. Sebastian Wedeniwski, using ZettaGrid (a distributed computing project to find roots of the zeta function), verified the result for the first 400 billion zeros in 2005 [63].

Teaching Calculus and Other Classes

Computer algebra systems are extremely useful for teaching calculus [64] as well as other classes where tedious symbolic algebra is needed, such as many physics classes (general relativity, quantum mechanics and field theory, symbolic solutions to partial differential equations, e.g., in electromagnetism, fluids, plasmas, electrical circuits, etc.) [65].

Experimental Mathematics

One field that would not be possible at all without computer algebra systems is called "experimental mathematics" [66], where CASes and related tools are used to "experimentally" verify or suggest mathematical relations. For example, the famous Bailey–Borwein–Plouffe (BBP) formula

  π = Σ_{k=0}^∞ (1/16^k) [ 4/(8k+1) − 2/(8k+4) − 1/(8k+5) − 1/(8k+6) ]    (3)

was first discovered experimentally (using arbitrary-precision arithmetic and extensive searching using an integer relation algorithm) and only then proved rigorously [67]. Another example is in [68], where the authors first numerically discovered and then proved that for rational x, y, the 2D Poisson potential function satisfies

  φ₂(x, y) = (1/π²) Σ_{a,b odd} cos(aπx) cos(bπy) / (a² + b²) = (1/π) log α,    (4)

where α is algebraic (a root of an integer polynomial).

Acknowledgements  This work was performed under the auspices of the US Department of Energy by Los Alamos National Laboratory.

References

1. Watt, S.M.: Making computer algebra more symbolic. In: Dumas, J.-G. (ed.) Proceedings of Transgressive Computing 2006: A Conference in Honor of Jean Della Dora, Facultad de Ciencias, Universidad de Granada, pp. 43–49, April 2006
2. Geddes, K.O., Czapor, S.R., Labahn, G.: Algorithms for computer algebra. In: Introduction to Computer Algebra. Kluwer Academic, Boston (1992)

3. Slagle, J.R.: A heuristic program that solves symbolic integration problems in freshman calculus. J. ACM 10(4), 507–520 (1963)
4. Veltman, M.J.G.: From weak interactions to gravitation. Int. J. Mod. Phys. A 15(29), 4557–4573 (2000)
5. Strubbe, H.: Manual for SCHOONSCHIP a CDC 6000/7000 program for symbolic evaluation of algebraic expressions. Comput. Phys. Commun. 8(1), 1–30 (1974)
6. Moses, J.: Symbolic integration. PhD thesis, MIT (1967)
7. REDUCE: A portable general-purpose computer algebra system. http://reduce-algebra.sourceforge.net/ (2013)
8. Fitch, J.: CAMAL 40 years on – is small still beautiful? Intell. Comput. Math., Lect. Notes Comput. Sci. 5625, 32–44 (2009)
9. Moses, J.: Macsyma: a personal history. J. Symb. Comput. 47(2), 123–130 (2012)
10. Shochat, D.D.: Experience with the musimp/muMATH-80 symbolic mathematics system. ACM SIGSAM Bull. 16(3), 16–23 (1982)
11. Bernardin, L., Chin, P., DeMarco, P., Geddes, K.O., Hare, D.E.G., Heal, K.M., Labahn, G., May, J.P., McCarron, J., Monagan, M.B., Ohashi, D., Vorkoetter, S.M.: Maple Programming Guide. Maplesoft, Waterloo (2011)
12. Wolfram, S.: The Mathematica Book. Wolfram Research Inc., Champaign (2000)
13. Grayson, D.R., Stillman, M.E.: Macaulay 2, a software system for research in algebraic geometry. Available at http://www.math.uiuc.edu/Macaulay2/
14. The PARI Group, Bordeaux: PARI/GP version 2.7.0. Available from http://pari.math.u-bordeaux.fr/ (2014)
15. The GAP Group: GAP – groups, algorithms, and programming, version 4.4. http://www.gap-system.org (2003)
16. Cannon, J.J.: An introduction to the group theory language Cayley. Comput. Group Theory 145, 183 (1984)
17. Bosma, W., Cannon, J., Playoust, C.: The Magma algebra system. I. The user language. J. Symb. Comput. 24(3–4), 235–265 (1997). Computational Algebra and Number Theory (London, 1993)
18. Stein, W.A., et al.: Sage Mathematics Software (Version 6.4.1). The Sage Development Team (2014). http://www.sagemath.org
19. SymPy Development Team: SymPy: Python library for symbolic mathematics. http://www.sympy.org (2013)
20. Maxima: Maxima, a computer algebra system. Version 5.34.1. http://maxima.sourceforge.net/ (2014)
21. Jenks, R.D., Sutor, R.S., Watt, S.M.: Scratchpad II: an abstract datatype system for mathematical computation. In: Janßen, R. (ed.) Trends in Computer Algebra. Lecture Notes in Computer Science, vol. 296, pp. 12–37. Springer, Berlin/Heidelberg (1988)
22. Jenks, R.D., Sutor, R.S.: Axiom – The Scientific Computation System. Springer, New York/Berlin (1992)
23. MATLAB: Symbolic math toolbox. http://www.mathworks.com/products/symbolic/ (2014)
24. Parisse, B.: Giac/xcas, a free computer algebra system. Technical report, University of Grenoble (2008)
25. FriCAS, an advanced computer algebra system: http://fricas.sourceforge.net/ (2015)
26. OpenAxiom, The open scientific computation platform: http://www.open-axiom.org/ (2015)
27. Wikipedia: List of computer algebra systems. http://en.wikipedia.org/wiki/List_of_computer_algebra_systems (2013)
28. Decker, W., Greuel, G.-M., Pfister, G., Schönemann, H.: SINGULAR 3-1-6 – a computer algebra system for polynomial computations. http://www.singular.uni-kl.de (2012)
29. Bauer, C., Frink, A., Kreckel, R.: Introduction to the GiNaC framework for symbolic computation within the C++ programming language. J. Symb. Comput. 33(1), 1–12 (2002)
30. Peeters, K.: Introducing cadabra: a symbolic computer algebra system for field theory problems. arXiv:hep-th/0701238; A field-theory motivated approach to symbolic computer algebra. Comput. Phys. Commun. 176, 550 (2007). arXiv:cs/0608005
31. Pérez, F., Granger, B.E.: IPython: a system for interactive scientific computing. Comput. Sci. Eng. 9(3), 21–29 (2007)
32. Swiginac, Python interface to GiNaC: http://sourceforge.net/projects/swiginac.berlios/ (2015)
33. Pynac, derivative of GiNaC with Python wrappers: http://pynac.org/ (2015)
34. Barton, D., Fitch, J.P.: Applications of algebraic manipulation programs in physics. Rep. Prog. Phys. 35(1), 235–314 (1972)
35. Vermaseren, J.A.M.: New features of FORM. Math. Phys. e-prints (2000). arXiv:math-ph/0010025
36. Mertig, R., Böhm, M., Denner, A.: FeynCalc – computer-algebraic calculation of Feynman amplitudes. Comput. Phys. Commun. 64(3), 345–359 (1991)
37. Hahn, T.: Generating Feynman diagrams and amplitudes with FeynArts 3. Comput. Phys. Commun. 140(3), 418–431 (2001)
38. Hahn, T., Pérez-Victoria, M.: Automated one-loop calculations in four and d dimensions. Comput. Phys. Commun. 118(2–3), 153–165 (1999)
39. Gede, G., Peterson, D.L., Nanjangud, A.S., Moore, J.K., Hubbard, M.: Constrained multibody dynamics with python: from symbolic equation generation to publication. In: ASME 2013 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, pp. V07BT10A051–V07BT10A051. American Society of Mechanical Engineers, San Diego (2013)
40. NumPy: The fundamental package for scientific computing in Python. http://www.numpy.org/ (2015)
41. Hunter, J.D.: Matplotlib: a 2d graphics environment. Comput. Sci. Eng. 9(3), 90–95 (2007)
42. Jones, E., Oliphant, T., Peterson, P., et al.: SciPy: open source scientific tools for Python (2001–). http://www.scipy.org/
43. Gastineau, M., Laskar, J.: Development of TRIP: fast sparse multivariate polynomial multiplication using burst tries. In: Computational Science – ICCS 2006, University of Reading, pp. 446–453 (2006)
44. Biscani, F.: Parallel sparse polynomial multiplication on modern hardware architectures. In: Proceedings of the 37th International Symposium on Symbolic and Algebraic Computation – ISSAC '12, Grenoble, pp. 83–90 (2012)
45. Shelus, P.J., Jefferys III, W.H.: A note on an attempt at more efficient Poisson series evaluation. Celest. Mech. 11(1), 75–78 (1975)
46. Jefferys, W.H.: A FORTRAN-based list processor for Poisson series. Celest. Mech. 2(4), 474–480 (1970)

47. Broucke, R., Garthwaite, K.: A programming system for analytical series expansions on a computer. Celest. Mech. 1(2), 271–284 (1969)
48. Fateman, R.J.: On the multiplication of Poisson series. Celest. Mech. 10(2), 243–247 (1974)
49. Danby, J.M.A., Deprit, A., Rom, A.R.M.: The symbolic manipulation of Poisson series. In: SYMSAC '66 Proceedings of the First ACM Symposium on Symbolic and Algebraic Manipulation, New York, pp. 0901–0934 (1965)
50. Bourne, S.R.: Literal expressions for the co-ordinates of the moon. Celest. Mech. 6(2), 167–186 (1972)
51. Feagin, J.M.: Quantum Methods with Mathematica. Springer, New York (2002)
52. Steeb, W.-H., Hardy, Y.: Quantum Mechanics Using Computer Algebra: Includes Sample Programs in C++, SymbolicC++, Maxima, Maple, and Mathematica. World Scientific, Singapore (2010)
53. Sakurai, J.J., Napolitano, J.J.: Modern Quantum Mechanics. Addison-Wesley, Boston (2010)
54. Zare, R.N.: Angular Momentum: Understanding Spatial Aspects in Chemistry and Physics. Wiley, New York (1991)
55. Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information. Cambridge University Press, Cambridge (2011)
56. Gerdt, V.P., Kragler, R., Prokopenya, A.N.: A Mathematica Package for Simulation of Quantum Computation. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 5743 LNCS, pp. 106–117. Springer, Berlin/Heidelberg (2009)
57. Shoup, V.: A Computational Introduction to Number Theory and Algebra. Cambridge University Press, Cambridge (2009)
58. Lenstra, A.K., Lenstra, H.W., Lovász, L.: Factoring polynomials with rational coefficients. Mathematische Annalen 261(4), 515–534 (1982)
59. Cohen, H.: A Course in Computational Algebraic Number Theory, vol. 138. Springer, Berlin/New York (1993)
60. Titchmarsh, E.C.: The Theory of the Riemann Zeta-Function, vol. 196. Oxford University Press, Oxford (1951)
61. Mazur, B., Stein, W.: Prime Numbers and the Riemann Hypothesis. Cambridge University Press, Cambridge (2015)
62. van de Lune, J., Te Riele, H.J.J., Winter, D.T.: On the zeros of the Riemann zeta function in the critical strip. IV. Math. Comput. 46(174), 667–681 (1986)
63. Wells, D.: Prime Numbers: The Most Mysterious Figures in Math. Wiley, Hoboken (2011)
64. Palmiter, J.R.: Effects of computer algebra systems on concept and skill acquisition in calculus. J. Res. Math. Educ. 22, 151–156 (1991)
65. Bing, T.J., Redish, E.F.: Symbolic manipulators affect mathematical mindsets. Am. J. Phys. 76(4), 418–424 (2008)
66. Bailey, D.H., Borwein, J.M.: Experimental mathematics: examples, methods and implications. Not. AMS 52, 502–514 (2005)
67. Bailey, D.H., Borwein, P., Plouffe, S.: On the rapid computation of various polylogarithmic constants. Math. Comput. 66(218), 903–913 (1996)
68. Bailey, D.H., Borwein, J.M.: Lattice sums arising from the Poisson equation. J. Phys. A: Math. Theor. 3, 1–18 (2013)

Symmetric Methods

Philippe Chartier
INRIA-ENS Cachan, Rennes, France

Synonyms

Time reversible

Definition

This entry is concerned with symmetric methods for solving ordinary differential equations (ODEs) of the form

  ẏ = f(y) ∈ R^n,   y(0) = y_0.    (1)

Throughout this article, we denote by φ_{t,f}(y_0) the flow of equation (1) with vector field f, i.e., the exact solution at time t with initial condition y(0) = y_0, and we assume that the conditions for its well definiteness and smoothness for (y_0, |t|) in an appropriate subset Ω of R^n × R_+ are satisfied. Numerical methods for (1) implement numerical flows Φ_{h,f} which, for small enough stepsizes h, approximate φ_{h,f}. Of central importance in the context of symmetric methods is the concept of adjoint method.

Definition 1  The adjoint method Φ*_{h,f} is the inverse of Φ_{h,f} with reversed time step −h:

  Φ*_{h,f} := Φ_{−h,f}^{−1}.    (2)

A numerical method Φ_h is then said to be symmetric if Φ*_{h,f} = Φ_{h,f}.

Overview

Symmetry is an essential property of numerical methods with regard to the order of accuracy and geometric properties of the solution. We briefly discuss the implications of these two aspects and refer to the corresponding sections for a more involved presentation:
• A method Φ_{h,f} is said to be of order p if

  Φ_{h,f}(y) = φ_{h,f}(y) + O(h^{p+1}),

and, if the local error has the following first-term expansion

  Φ_{h,f}(y) = φ_{h,f}(y) + h^{p+1} C(y) + O(h^{p+2}),

then straightforward application of the implicit function theorem leads to

  Φ*_{h,f}(y) = φ_{h,f}(y) − (−h)^{p+1} C(y) + O(h^{p+2}).

This implies that a symmetric method is necessarily of even order p = 2q, since Φ*_{h,f}(y) = Φ_{h,f}(y) means that (1 + (−1)^{p+1}) C(y) = 0. This property plays a key role in the construction of composition methods by triple jump techniques (see section on "Symmetric Methods Obtained by Composition"), and it is certainly no coincidence that Runge-Kutta methods of optimal order (Gauss methods) are symmetric (see section on "Symmetric Methods of Runge-Kutta Type"). It also explains why symmetric methods are used in conjunction with (Richardson) extrapolation techniques.
• The exact flow φ_{t,f} is itself symmetric owing to the group property φ_{s+t,f} = φ_{s,f} ∘ φ_{t,f}. Consider now an isomorphism ρ of the vector space R^n (the phase space of (1)) and assume that the vector field f satisfies the relation ρ ∘ f = −f ∘ ρ (see Fig. 1). Then φ_{t,f} is said to be ρ-reversible, that is to say the following equality holds:

  ρ ∘ φ_{t,f} = φ_{t,f}^{−1} ∘ ρ.   (3)

Symmetric Methods, Fig. 1 ρ-reversibility of f, φ_{t,f} and Φ_{h,f} (commutative diagrams on R^n expressing ρ ∘ f = −f ∘ ρ and ρ ∘ φ_{t,f} = φ_{−t,f} ∘ ρ)

Example 1 Hamiltonian systems

  ẏ = ∂H/∂z (y, z),  ż = −∂H/∂y (y, z),

with a Hamiltonian function H(y, z) satisfying H(y, −z) = H(y, z) are ρ-reversible for ρ(y, z) = (y, −z).

Definition 2 A method Φ_h, applied to a ρ-reversible ordinary differential equation, is said to be ρ-reversible if

  ρ ∘ Φ_{h,f} = Φ_{h,f}^{−1} ∘ ρ.

Note that if Φ_{h,f} is symmetric, it is ρ-reversible if and only if the following condition holds:

  ρ ∘ Φ_{h,f} = Φ_{−h,f} ∘ ρ.   (4)

Besides, if (4) holds for an invertible ρ, then Φ_{h,f} is ρ-reversible if and only if it is symmetric.

Example 2 The trapezoidal rule, whose flow is defined by the implicit equation

  Φ_{h,f}(y) = y + h f( (1/2) y + (1/2) Φ_{h,f}(y) ),   (5)

is symmetric and is ρ-reversible when applied to a ρ-reversible f.

Since most numerical methods satisfy relation (4), symmetry is the required property for numerical methods to share with the exact flow not only time reversibility but also ρ-reversibility. This illustrates that a symmetric method mimics geometric properties of the exact flow. Modified differential equations sustain this assertion further (see next section) and allow for the derivation of deeper results for integrable reversible systems, such as the preservation of invariants and the linear growth of errors by symmetric methods (see section on "Reversible Kolmogorov-Arnold-Moser Theory").

Modified Equations for Symmetric Methods

Constant stepsize backward error analysis. Considering a numerical method Φ_h (not necessarily symmetric) and the sequence of approximations obtained by application of the formula y_{n+1} = Φ_{h,f}(y_n), n = 0, 1, 2, ..., from the initial value y_0, the idea of backward error analysis consists in searching for a modified vector field f_h^N such that

  φ_{h,f_h^N}(y_0) = Φ_{h,f}(y_0) + O(h^{N+2}),   (6)

where the modified vector field, uniquely defined by a Taylor expansion of (6), is of the form

  f_h^N(y) = f(y) + h f_1(y) + h² f_2(y) + ... + h^N f_N(y).   (7)

Theorem 1 The modified vector field of a symmetric method Φ_{h,f} has an expansion in even powers of h, i.e., f_{2j+1} ≡ 0 for j = 0, 1, ... Moreover, if f and Φ_{h,f} are ρ-reversible, then f_h^N is ρ-reversible as well for any N ≥ 0.

Proof. Reversing the time step h in (6) and taking the inverse of both sides, we obtain

  (φ_{−h,f_{−h}^N})^{−1}(y_0) = (Φ_{−h,f})^{−1}(y_0) + O(h^{N+2}).

Now, the group property of exact flows implies that (φ_{−h,f_{−h}^N})^{−1}(y_0) = φ_{h,f_{−h}^N}(y_0), so that

  φ_{h,f_{−h}^N}(y_0) = Φ*_{h,f}(y_0) + O(h^{N+2}),

and by uniqueness, f_{−h}^N = f_h^N. This proves the first statement. Assume now that f is ρ-reversible, so that (4) holds. It follows from f_{−h}^N = f_h^N that

  ρ ∘ φ_{h,f_h^N} = ρ ∘ φ_{h,f_{−h}^N} = ρ ∘ Φ_{h,f} = Φ_{−h,f} ∘ ρ = φ_{−h,f_h^N} ∘ ρ,

where the second and last equalities are valid up to O(h^{N+2}) error terms. The group property then implies that ρ ∘ φ_{nh,f_h^N} = φ_{−nh,f_h^N} ∘ ρ + n O(h^{N+2}), where the constant in the O-term depends on n, and an interpolation argument shows that for fixed N and small |t|

  ρ ∘ φ_{t,f_h^N} = φ_{−t,f_h^N} ∘ ρ + O(h^{N+1}),

where the O-term depends smoothly on t and on N. Finally, differentiating with respect to t, we obtain

  ρ ∘ f_h^N = (d/dt)|_{t=0} ( ρ ∘ φ_{t,f_h^N} ) = (d/dt)|_{t=0} ( φ_{−t,f_h^N} ∘ ρ ) + O(h^{N+1}) = −f_h^N ∘ ρ + O(h^{N+1}),

and consequently ρ ∘ f_h^N = −f_h^N ∘ ρ. □

Remark 1 The expansion (7) of the modified vector field f_h^N can be computed explicitly at any order N with the substitution product of B-series [2].

Example 3 Consider the Lotka-Volterra equations in Poisson form

  ( u̇ )   (  0   uv ) ( ∇_u H(u,v) )
  ( v̇ ) = ( −uv   0 ) ( ∇_v H(u,v) ),   H(u,v) = log(u) + log(v) − u − v,

i.e., y′ = f(y) with f(y) = (u(1 − v), v(u − 1))^T. Note that ρ ∘ f = −f ∘ ρ with ρ(u, v) = (v, u). The modified vector fields f_{h,iE}² for the implicit Euler method and f_{h,mr}² for the implicit midpoint rule read (with N = 2)

  f_{h,iE}² = f + (h/2) f′f + (h²/12) f″(f, f) + (h²/3) f′f′f

and

  f_{h,mr}² = f − (h²/24) f″(f, f) + (h²/12) f′f′f.

The exact solutions of the modified ODEs are plotted in Fig. 2 together with the corresponding numerical solutions. Though the modified vector fields are truncated only at second order, the agreement is excellent. The difference in the behavior of the two solutions is also striking: only the symmetric method captures the periodic nature of the solution. (The good behavior of the midpoint rule cannot be attributed to its symplecticity, since the system is a noncanonical Poisson system.) This will be further explored in the next section.

Variable stepsize backward error analysis. In practice, it is often fruitful to resort to variable stepsize implementations of the numerical flow Φ_{h,f}. In accordance with [17], we consider stepsizes that are proportional to a function s(y, ε) depending only on the current state y and on a parameter ε prescribed by the user and aimed at controlling the error. The approximate solution is then given by

  y_{n+1} = Φ_{ε s(y_n, ε), f}(y_n),  n = 0, ...

A remarkable feature of this algorithm is that it preserves the symmetry of the exact solution as soon as Φ_{h,f} is symmetric and s satisfies the relation

  s( Φ_{ε s(y,ε), f}(y), ε ) = s(y, ε)

Symmetric Methods, Fig. 2 Exact solutions of modified equations (red lines) versus numerical solutions by implicit Euler and midpoint rule (blue points), in the (u, v) plane

and preserves the ρ-reversibility as soon as Φ_{h,f} is ρ-reversible and s satisfies the relation

  s( ρ ∘ Φ_{ε s(y,ε), f}(y), ε ) = s(y, ε).

A result similar to Theorem 1 then holds with h replaced by ε.

Remark 2 A recipe to construct such a function s, suggested by Stoffer in [17], consists in requiring that the local error estimate is kept constantly equal to a tolerance parameter. For the details of the implementation, we refer to the original paper or to Chap. VIII.3 of [10].

Reversible Kolmogorov-Arnold-Moser Theory

The theory of integrable Hamiltonian systems has its counterpart for reversible integrable ones. A reversible system

  ẏ = f(y, z),  ż = g(y, z),  where ρ ∘ (f, g) = −(f, g) ∘ ρ with ρ(y, z) = (y, −z),   (8)

is reversible integrable if it can be brought, through a reversible transformation (a, θ) = (I(y, z), Θ(y, z)), to the canonical equations

  ȧ = 0,  θ̇ = ω(a).

An interesting instance is the case of completely integrable Hamiltonian systems:

  ẏ = ∂H/∂z (y, z),  ż = −∂H/∂y (y, z),

with first integrals I_j's in involution (that is to say, such that (∇_y I_i)·(∇_z I_j) = (∇_z I_i)·(∇_y I_j) for all i, j) and such that I_j ∘ ρ = I_j. In the conditions where the Arnold-Liouville theorem (see Chap. X.1.3 of [10]) can be applied, then, under the additional assumption that

  ∃ (y*, 0) ∈ { (y, z); ∀ j, I_j(y, z) = I_j(y_0, z_0) },   (9)

such a system is reversible integrable. In this situation, ρ-reversible methods constitute a very interesting alternative to symplectic methods, as the following result shows:

Theorem 2 Let Φ_{h,(f,g)} be a ρ-reversible numerical method of order p applied to an integrable reversible system (8) with real-analytic f and g. Consider a* = (I_1(y*, z*), ..., I_d(y*, z*)). If the condition

  ∀ k ∈ Z^d \ {0},  |k · ω(a*)| ≥ γ ( Σ_{i=1}^d |k_i| )^{−ν}

is satisfied for some positive constants γ and ν, then there exist positive C, c, and h_0 such that the following assertion holds:

  ∀ h ≤ h_0, ∀ (y_0, z_0) such that max_{j=1,...,d} |I_j(y_0, z_0) − a*_j| ≤ c |log h|^{−ν−1}:   (10)

  ‖ Φ_{h,(f,g)}^n(y_0, z_0) − (y(t), z(t)) ‖ ≤ C t h^p  for all t = nh ≤ h^{−p},
  | I_j( Φ_{h,(f,g)}^n(y_0, z_0) ) − I_j(y_0, z_0) | ≤ C h^p  for all j.
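The long-time behavior described by this reversible KAM theory is easy to observe experimentally. The sketch below is illustrative only (plain Python; the function names and the pendulum test problem H(q, p) = p²/2 − cos q are choices made here, not taken from the text): it integrates this reversible system with the symmetric implicit midpoint rule, solved by fixed-point iteration, and with the nonsymmetric explicit Euler method, and compares the drift of the invariant H.

```python
import math

def pendulum_f(y):
    # Reversible system: q' = p, p' = -sin(q), with invariant H(q, p) = p**2/2 - cos(q)
    q, p = y
    return (p, -math.sin(q))

def explicit_euler(y, h):
    q, p = y
    fq, fp = pendulum_f(y)
    return (q + h * fq, p + h * fp)

def implicit_midpoint(y, h, iters=50):
    # Solve y1 = y0 + h f((y0 + y1)/2) by fixed-point iteration (symmetric method)
    q0, p0 = y
    q1, p1 = q0, p0
    for _ in range(iters):
        fq, fp = pendulum_f(((q0 + q1) / 2, (p0 + p1) / 2))
        q1, p1 = q0 + h * fq, p0 + h * fp
    return (q1, p1)

def energy(y):
    q, p = y
    return p * p / 2 - math.cos(q)

h, n = 0.1, 5000
y_euler = y_mid = (1.0, 0.0)
H0 = energy(y_euler)
for _ in range(n):
    y_euler = explicit_euler(y_euler, h)
    y_mid = implicit_midpoint(y_mid, h)

drift_euler = abs(energy(y_euler) - H0)  # grows without control
drift_mid = abs(energy(y_mid) - H0)      # remains at the size of the local error
print(drift_euler, drift_mid)
```

On this example the energy error of the symmetric midpoint rule remains bounded over thousands of steps, while the explicit Euler energy drifts away, consistent with the preservation of invariants by symmetric methods.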

Analogously to symplectic methods, ρ-reversible methods thus preserve invariant tori I_j = cst over long intervals of time, and the error growth is linear in t. Remarkably, and in contrast with symplectic methods, this result remains valid for reversible variable stepsize implementations (see Chap. X.I.3 of [10]). However, it is important to note that for a Hamiltonian reversible system, the Hamiltonian ceases to be preserved when condition (9) is not fulfilled. This situation is illustrated in Fig. 3 for the Hamiltonian system with H(q, p) = (1/2) p² + cos(q) + (1/5) sin(2q), an example borrowed from [4].

Symmetric Methods of Runge-Kutta Type

Runge-Kutta methods form a popular class of numerical integrators for (1). Owing to their importance in applications, we consider general systems (1) and subsequently partitioned systems.

Methods for general systems. We start with the following:

Definition 3 Consider a matrix A = (a_{i,j}) ∈ R^{s×s} and a vector b = (b_j) ∈ R^s. The Runge-Kutta method denoted (A, b) is defined by

  Y_i = y + h Σ_{j=1}^s a_{i,j} f(Y_j),  i = 1, ..., s,   (11)

  ỹ = y + h Σ_{j=1}^s b_j f(Y_j).   (12)

Note that strictly speaking, the method is properly defined only for small |h|. In this case, the corresponding numerical flow Φ_{h,f} maps y to ỹ. Vector Y_i approximates the solution at the intermediate point t_0 + c_i h, where c_i = Σ_j a_{i,j}, and it is customary since [1] to represent a method by its tableau:

  c_1 | a_{1,1} ... a_{1,s}
   :  |    :           :
  c_s | a_{s,1} ... a_{s,s}
      | b_1     ... b_s          (13)

Runge-Kutta methods automatically satisfy the ρ-compatibility condition (4): changing h into −h in (11) and (12), we have indeed, by linearity of ρ and by using ρ ∘ f = −f ∘ ρ,

  ρ(Y_i) = ρ(y) − h Σ_{j=1}^s a_{i,j} f(ρ(Y_j)),  i = 1, ..., s,

  ρ(ỹ) = ρ(y) − h Σ_{j=1}^s b_j f(ρ(Y_j)).

By construction, this is ρ(Φ_{h,f}(y)) and by the previous definition Φ_{−h,f}(ρ(y)). As a consequence, ρ-reversible Runge-Kutta methods coincide with symmetric methods. Nevertheless, symmetry requires an additional algebraic condition stated in the next theorem:

Theorem 3 A Runge-Kutta method (A, b) is symmetric if

  P A + A P = e b^T  and  b = P b,   (14)

where e = (1, ..., 1)^T ∈ R^s and P is the permutation matrix defined by p_{i,j} = δ_{i,s+1−j}.

Proof. Denoting Y = (Y_1^T, ..., Y_s^T)^T and F(Y) = (f(Y_1)^T, ..., f(Y_s)^T)^T, a more compact form for (11) and (12) is

  Y = e ⊗ y + h (A ⊗ I) F(Y),   (15)

  ỹ = y + h (b^T ⊗ I) F(Y).   (16)

Symmetric Methods, Fig. 3 Level sets of H (left) and evolution of H w.r.t. time (right) for the two initial values q_0 = 0, p_0 = 2.033 and q_0 = 0, p_0 = 2.034

On the one hand, premultiplying (15) by P ⊗ I and noticing that

  (P ⊗ I) F(Y) = F( (P ⊗ I) Y ),   (17)

it is straightforward to see that Φ_{h,f} can also be defined by the coefficients P A P^T and P b. On the other hand, exchanging h and −h, y and ỹ, it appears that Φ*_{h,f} is defined by the coefficients A* = e b^T − A and b* = b. The flow Φ_{h,f} is thus symmetric as soon as e b^T − A = P A P^T and b = P b, which is nothing but condition (14). □

Remark 3 For methods without redundant stages, condition (14) is also necessary.

Example 4 The implicit midpoint rule, defined by A = 1/2 and b = 1, is a symmetric method of order 2. More generally, the s-stage Gauss collocation method based on the roots of the s-th shifted Legendre polynomial is a symmetric method of order 2s. For instance, the 2-stage and 3-stage Gauss methods of orders 4 and 6 have the following coefficients:

  1/2 − √3/6 | 1/4           1/4 − √3/6
  1/2 + √3/6 | 1/4 + √3/6    1/4
             | 1/2           1/2

  1/2 − √15/10 | 5/36            2/9 − √15/15   5/36 − √15/30
  1/2          | 5/36 + √15/24   2/9            5/36 − √15/24
  1/2 + √15/10 | 5/36 + √15/30   2/9 + √15/15   5/36
               | 5/18            4/9            5/18

Methods for partitioned systems. For systems of the form

  ẏ = f(z),  ż = g(y),   (18)

it is natural to apply two different Runge-Kutta methods to the variables y and z. Written in compact form, a partitioned Runge-Kutta method reads:

  Y = e ⊗ y + h (A ⊗ I) F(Z),
  Z = e ⊗ z + h (Â ⊗ I) G(Y),
  ỹ = y + h (b^T ⊗ I) F(Z),
  z̃ = z + h (b̂^T ⊗ I) G(Y),

and the method is symmetric if both (A, b) and (Â, b̂) are. An important feature of partitioned Runge-Kutta methods is that they can be symmetric and explicit for systems of the form (18).

Example 5 The Verlet method is defined by the following two Runge-Kutta tableaux:

  0 | 0    0          1/2 | 1/2  0
  1 | 1/2  1/2   and  1/2 | 1/2  0
    | 1/2  1/2            | 1/2  1/2      (19)

The method becomes explicit owing to the special structure of the partitioned system:

  Y_1 = y_0,  Z_1 = z_0 + (h/2) g(Y_1),
  Y_2 = y_0 + h f(Z_1),  Z_2 = Z_1,
  y_1 = Y_2,  z_1 = z_0 + (h/2) ( g(Y_1) + g(Y_2) ).

The Verlet method is the most elementary method of the class of partitioned Runge-Kutta methods known as Lobatto IIIA-IIIB. Unfortunately, methods of higher orders within this class are no longer explicit in general, even for equations of the form (18). It is nevertheless possible to construct symmetric explicit Runge-Kutta methods, which turn out to be equivalent to compositions of Verlet's method and whose introduction is for this reason postponed to the next section.

Note that a particular instance of partitioned systems are second-order differential equations of the form

  ẏ = z,  ż = g(y),   (20)

which covers many situations of practical interest (for instance, mechanical systems governed by Newton's law in the absence of friction).

Symmetric Methods Obtained by Composition

Another class of symmetric methods is constituted of symmetric compositions of low-order methods. The idea consists in applying a basic method Φ_{h,f} with a sequence of prescribed stepsizes: Given s real numbers γ_1, ..., γ_s, its composition with stepsizes γ_1 h, ..., γ_s h gives rise to a new method:

  Ψ_{h,f} = Φ_{γ_s h,f} ∘ ... ∘ Φ_{γ_1 h,f}.   (21)

Noticing that the local error of Ψ_{h,f}, defined by Ψ_{h,f}(y) − φ_{h,f}(y), is of the form

  ( γ_1^{p+1} + ... + γ_s^{p+1} ) h^{p+1} C(y) + O(h^{p+2})

as soon as γ_1 + ... + γ_s = 1, Ψ_{h,f} is of order at least p + 1 if

  γ_1^{p+1} + ... + γ_s^{p+1} = 0.

This observation is the key to triple jump compositions, as proposed by a series of authors [3, 5, 18, 21]: Starting from a symmetric method Φ_{h,f} of (even) order 2q, the new method obtained for

  γ_1 = γ_3 = 1 / (2 − 2^{1/(2q+1)})  and  γ_2 = −2^{1/(2q+1)} / (2 − 2^{1/(2q+1)})

is symmetric,

  Ψ*_{h,f} = Φ*_{γ_1 h,f} ∘ Φ*_{γ_2 h,f} ∘ Φ*_{γ_3 h,f} = Φ_{γ_3 h,f} ∘ Φ_{γ_2 h,f} ∘ Φ_{γ_1 h,f} = Ψ_{h,f},

and of order at least 2q + 1. Since the order of a symmetric method is even, Ψ_{h,f} is in fact of order 2q + 2. The procedure can then be repeated recursively to construct arbitrarily high-order symmetric methods of orders 2q+2, 2q+4, 2q+6, ..., with respectively 3, 9, 27, ..., compositions of the original basic method Φ_{h,f}. However, the construction is far from being the most efficient, for the combined coefficients become large, some of which being negative. A partial remedy is to envisage compositions with s = 5. We hereby give the coefficients obtained by Suzuki [18]:

  γ_1 = γ_2 = γ_4 = γ_5 = 1 / (4 − 4^{1/(2q+1)})  and  γ_3 = −4^{1/(2q+1)} / (4 − 4^{1/(2q+1)}),

which give rise to very efficient methods for q = 1 and q = 2. The most efficient high-order composition methods are nevertheless obtained by solving the full system of order conditions, i.e., by raising the order directly from 2 to 8, for instance, without going through the intermediate steps described above. This requires much more effort though, first to derive the order conditions and then to solve the resulting polynomial system. It is out of the scope of this article to describe the two steps involved, and we rather refer to the paper [15] on the use of B-series for order conditions and to Chap. V.3.2 of [10] for various examples and numerical comparisons. An excellent method of order 6 with 9 stages has been obtained by Kahan and Li [12], and we reproduce its coefficients below:
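The triple jump recipe above is straightforward to implement on top of any symmetric basic method. The following sketch is illustrative only (plain Python; the test problem q̈ = −sin(q) and all function names are choices made here): it composes Verlet steps with stepsizes γ_1 h, γ_2 h, γ_1 h for q = 1 and checks that halving h divides the error by roughly 2⁴, as expected for the resulting method of order 4.

```python
import math

def verlet_step(q, p, h):
    # One Verlet step for q'' = -sin(q) (symmetric, order 2)
    p_half = p - 0.5 * h * math.sin(q)
    q_new = q + h * p_half
    p_new = p_half - 0.5 * h * math.sin(q_new)
    return q_new, p_new

def triple_jump_step(q, p, h, qq=1):
    # Composition with stepsizes g1*h, g2*h, g1*h raises order 2q to 2q + 2
    g1 = 1.0 / (2.0 - 2.0 ** (1.0 / (2 * qq + 1)))
    g2 = -(2.0 ** (1.0 / (2 * qq + 1))) / (2.0 - 2.0 ** (1.0 / (2 * qq + 1)))
    for g in (g1, g2, g1):
        q, p = verlet_step(q, p, g * h)
    return q, p

def integrate(step, h, n):
    q, p = 0.5, 0.0
    for _ in range(n):
        q, p = step(q, p, h)
    return q, p

# Reference solution at t = 1 with a tiny Verlet step
q_ref, p_ref = integrate(verlet_step, 1e-5, 100000)

def error(h, n):
    q, p = integrate(triple_jump_step, h, n)
    return math.hypot(q - q_ref, p - p_ref)

ratio = error(0.1, 10) / error(0.05, 20)
print(ratio)  # roughly 2**4 = 16 for an order-4 method
```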
D  D  D  D and 1 2 4 5 4  41=.2qC1/ 41=.2qC1/ 3 D Symmetric Methods Obtained by 4  41=.2qC1/ Composition which give rise to very efficient methods for q D 1 S and q D 2. The most efficient high-order composition Another class of symmetric methods is constituted of methods are nevertheless obtained by solving the full symmetric compositions of low-order methods. The system of order conditions, i.e., by raising the order di- idea consists in applying a basic method ˚h;f with a rectly from 2 to 8, for instance, without going through sequence of prescribed stepsizes: Given s real numbers the intermediate steps described above. This requires 1;:::;s, its composition with stepsizes 1h;:::;sh much more effort though, first to derive the order gives rise to a new method: conditions and then to solve the resulting polynomial system. It is out of the scope of this article to describe ‰h;f D ˚sh;f ı :::ı ˚1h;f : (21) the two steps involved, and we rather refer to the paper [15] on the use of 1B-series for order conditions Noticing that the local error of ‰h;f ,definedby and to Chap. V.3.2. of [10] for various examples and ‰h;f .y/  'h;f .y/, is of the form numerical comparisons. An excellent method of order 6 with 9 stages has been obtained by Kahan and Li [12] pC1 pC1 pC1 O pC2 .1 C :::C s /h C.y/ C .h /; and we reproduce here its coefficients: 1446 Symmetric Methods

[Figure: error at time t = 100 (logarithmic scale) versus number of Verlet steps, comparing the triple jump method of order 6 with the method of Kahan and Li.]

  γ_1 = γ_9 = 0.3921614440073141,
  γ_2 = γ_8 = 0.3325991367893594,
  γ_3 = γ_7 = −0.7062461725576393,
  γ_4 = γ_6 = 0.0822135962935508,
  γ_5 = 0.7985439909348299.

For the sake of illustration, we have computed the solution of Kepler's equations with this method and with the method of order 6 obtained by the triple jump technique. In both cases, the basic method is Verlet's scheme. The gain offered by the method of Kahan and Li is impressive (it amounts to two digits of accuracy on this example). Other methods can be found, for instance, in [10, 14].

Remark 4 It is also possible to consider symmetric compositions of nonsymmetric methods. In this situation, raising the order necessitates to compose the basic method and its adjoint.

Symmetric Methods for Highly Oscillatory Problems

In this section, we present methods aimed at solving problems of the form

  q̈ = −∇V_fast(q) − ∇V_slow(q),   (22)

where V_fast and V_slow are two potentials acting on different time scales, typically such that ∇²V_fast is positive semi-definite and ‖∇²V_fast‖ >> ‖∇²V_slow‖. Explicit standard methods suffer from severe stability restrictions due to the presence of high oscillations at the slow time scale and necessitate small steps and many evaluations of the forces. Since slow forces −∇V_slow are in many applications much more expensive to evaluate than fast ones, efficient methods in this context are thus devised to require significantly fewer evaluations per step of the slow force.

Example 6 In applications to molecular dynamics, for instance, fast forces deriving from V_fast (short-range interactions) are much cheaper to evaluate than slow forces deriving from V_slow (long-range interactions). Other examples of applications are presented in [11].

Methods for general problems with nonlinear fast potentials. Introducing the variable p = q̇ in (22), the equation reads

  ( q̇ )   ( p )   (        0        )   (        0        )
  ( ṗ ) = ( 0 ) + ( −∇_q V_fast(q) ) + ( −∇_q V_slow(q) ),
           f_K(y)       f_F(y)              f_S(y)

The usual Verlet method [20] would consist in composing the flows φ_{h,(f_F+f_S)} and φ_{h,f_K} as follows:

  φ_{h/2,(f_F+f_S)} ∘ φ_{h,f_K} ∘ φ_{h/2,(f_F+f_S)},

and would typically be restricted to very small stepsizes. The impulse method [6, 8, 19] combines the three pieces of the vector field differently:

  φ_{h/2,f_S} ∘ φ_{h,(f_K+f_F)} ∘ φ_{h/2,f_S}.

Note that φ_{h,f_S} is explicit,

  φ_{h,f_S}(q, p) = ( q, p − h ∇_q V_slow(q) ),

while φ_{h,(f_K+f_F)} may require to be approximated by a numerical method Φ_{h,(f_K+f_F)} which uses stepsizes that are fractions of h. If Φ_{h,(f_K+f_F)} is symmetric (and/or symplectic), the overall method is symmetric (and/or symplectic) as well and allows for larger stepsizes. However, it still suffers from resonances, and a better option is given by the mollified impulse method, which considers the mollified potential V̄_slow(q) = V_slow(a(q)) in loco of V_slow(q), where a(q) and a′(q) are averaged values given by

  a(q) = (1/h) ∫_0^h x(s) ds,  a′(q) = (1/h) ∫_0^h X(s) ds,

or, if necessary, numerical approximations thereof, and where

  ẍ = −∇V_fast(x),  x(0) = q,  ẋ(0) = p,
  Ẍ = −∇²V_fast(x) X,  X(0) = I,  Ẋ(0) = 0.   (23)

The resulting method uses the mollified force a′(q)^T (∇_q V_slow)(a(q)) and is still symmetric (and/or symplectic) provided (23) is solved with a symmetric (and/or symplectic) method.

Methods for problems with quadratic fast potentials. In many applications of practical importance, the potential V_fast is quadratic of the form V_fast(q) = (1/2) q^T Ω² q. In this case, the mollified impulse method falls into the class of trigonometric symmetric methods of the form

  ( p_1 )            ( p_0 )          ( ψ_0(hΩ) ∇V_slow(φ(hΩ) q_0) + ψ_1(hΩ) ∇V_slow(φ(hΩ) q_1) )
  ( q_1 ) = R(hΩ) ( q_0 ) − (h/2) ( h ψ(hΩ) ∇V_slow(φ(hΩ) q_0)                                 ),

where R(hΩ) is the block matrix given by

  R(hΩ) = (      cos(hΩ)      −Ω sin(hΩ) )
          ( Ω^{−1} sin(hΩ)      cos(hΩ)  )

and the functions ψ, φ, ψ_0, and ψ_1 are even functions such that

  ψ(z) = (sin(z)/z) ψ_1(z),  ψ_0(z) = cos(z) ψ_1(z),  and  ψ(0) = φ(0) = 1.

Various choices of the functions ψ and φ are possible and documented in the literature. Two particularly interesting ones are ψ(z) = sin²(z)/z², φ(z) = 1 (see [9]) or ψ(z) = sin³(z)/z³, φ(z) = sin(z)/z (see [7]).

Conclusion

This entry should be regarded as an introduction to the subject of symmetric methods. Several topics have not been exposed here, such as symmetric projection for ODEs on manifolds, DAEs of index 1 or 2, symmetric multistep methods, symmetric splitting methods, and symmetric Lie-group methods, and we refer the interested reader to [10, 13, 16] for a comprehensive presentation of these topics.

References

1. Butcher, J.C.: Implicit Runge-Kutta processes. Math. Comput. 18, 50–64 (1964)
2. Chartier, P., Hairer, E., Vilmart, G.: Algebraic structures of B-series. Found. Comput. Math. 10(4), 407–427 (2010)
3. Creutz, M., Gocksch, A.: Higher-order hybrid Monte Carlo algorithms. Phys. Rev. Lett. 63, 9–12 (1989)
4. Faou, E., Hairer, E., Pham, T.L.: Energy conservation with non-symplectic methods: examples and counter-examples. BIT 44(4), 699–709 (2004)
5. Forest, E.: Canonical integrators as tracking codes. AIP Conf. Proc. 184, 1106–1136 (1989)
6. García-Archilla, B., Sanz-Serna, J.M., Skeel, R.D.: Long-time-step methods for oscillatory differential equations. SIAM J. Sci. Comput. 20, 930–963 (1999)
7. Grimm, V., Hochbruck, M.: Error analysis of exponential integrators for oscillatory second-order differential equations. J. Phys. A 39, 5495–5507 (2006)
8. Grubmüller, H., Heller, H., Windemuth, A., Tavan, P.: Generalized Verlet algorithm for efficient molecular dynamics simulations with long-range interactions. Mol. Sim. 6, 121–142 (1991)
9. Hairer, E., Lubich, C.: Long-time energy conservation of numerical methods for oscillatory differential equations. SIAM J. Numer. Anal. 38, 414–441 (2000)
10. Hairer, E., Lubich, C., Wanner, G.: Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations, 2nd edn. Springer Series in Computational Mathematics, vol. 31. Springer, Berlin (2006)
11. Jia, Z., Leimkuhler, B.: Geometric integrators for multiple time-scale simulation. J. Phys. A 39, 5379–5403 (2006)
12. Kahan, W., Li, R.C.: Composition constants for raising the orders of unconventional schemes for ordinary differential equations. Math. Comput. 66, 1089–1099 (1997)
13. Leimkuhler, B., Reich, S.: Simulating Hamiltonian Dynamics. Cambridge Monographs on Applied and Computational Mathematics, vol. 14. Cambridge University Press, Cambridge (2004)
14. McLachlan, R.I., Quispel, G.R.W.: Splitting methods. Acta Numer. 11, 341–434 (2002)
15. Murua, A., Sanz-Serna, J.M.: Order conditions for numerical integrators obtained by composing simpler integrators. Philos. Trans. R. Soc. Lond. A 357, 1079–1100 (1999)
16. Sanz-Serna, J.M., Calvo, M.P.: Numerical Hamiltonian Problems. Chapman & Hall, London (1994)
17. Stoffer, D.: On reversible and canonical integration methods. Technical Report SAM-Report No. 88-05, ETH-Zürich (1988)
18. Suzuki, M.: Fractal decomposition of exponential operators with applications to many-body theories and Monte Carlo simulations. Phys. Lett. A 146, 319–323 (1990)
19. Tuckerman, M., Berne, B.J., Martyna, G.J.: Reversible multiple time scale molecular dynamics. J. Chem. Phys. 97, 1990–2001 (1992)
20. Verlet, L.: Computer "experiments" on classical fluids. I. Thermodynamical properties of Lennard-Jones molecules. Phys. Rev. 159, 98–103 (1967)
21. Yoshida, H.: Construction of higher order symplectic integrators. Phys. Lett. A 150, 262–268 (1990)

Symmetries and FFT

Hans Z. Munthe-Kaas
Department of Mathematics, University of Bergen, Bergen, Norway

Synonyms

Fourier transform; Group theory; Representation theory; Symmetries

Synopsis

The fast Fourier transform (FFT), group theory, and symmetry of linear operators are mathematical topics which all connect through group representation theory. This entry provides a brief introduction, with emphasis on computational applications.

Symmetric FFTs

The finite Fourier transform maps functions on a periodic lattice to a dual Fourier domain. Formally, let Z_n = Z_{n_1} × Z_{n_2} × ... × Z_{n_ℓ} be a finite abelian (commutative) group, where Z_{n_j} is the cyclic group Z_{n_j} = {0, 1, ..., n_j − 1} with group operation + (mod n_j) and n = (n_1, ..., n_ℓ) is a multi-index with |n| = n_1 n_2 ··· n_ℓ. Let CZ_n denote the linear space of complex-valued functions on Z_n. The primal domain Z_n is an ℓ-dimensional lattice periodic in all directions, and the Fourier domain is Ẑ_n = Z_n (this is the Pontryagin dual group [12]). For infinite abelian groups, the primal and Fourier domains in general differ, such as Fourier series on the circle, where R̂/Z = Z. The discrete Fourier transform (DFT) is an (up to scaling) unitary map F: CZ_n → CẐ_n. Letting F(f) ≡ f̂, we have

Á P X k j k j 2 b 2i 1 1 CC ` ` such that R d DjGj. For example, the cyclic f.k/ D f.j/e n1 n` ; 2  group Zn Df0;1;:::;r  1g with group operation j2Zn C.modn/ has exactly n 1-dimensional irreps given as b for k D .k1;:::;k`/ 2 Zn,(1)k.j / D exp.2ikj=n/. A matrix A is equivariant with respect to a representation R of a group G if Á X k j k j AR.g/ D R.g/A for all g 2 G. Any representation 1 b 2i 1 1 CC ` ` f.j/ D f.k/e n1 n` ; jnj R can be block diagonalized, with irreducible repre- b k2Zn sentations on the diagonal. This provides a change of F FAF 1 for j D .j1;:::;j`/ 2 Zn.(2)basis matrix such that is block diagonal for any R-equivariant A. This result underlies most of computational Fourier analysis and will be exemplified A symmetry for a function f 2 CZ is an R-linear n by convolutional operators. map SW CZn ! CZn such that Sf D f .Asexamples, consider real symmetry S f.j/ D f.j/,evensym- R Convolutions in the Group Algebra metry S f.j/ D f.j/ and odd symmetry S f.j/ D e o The group algebra CG is the complex jGj-dimensional f.j/.Iff has a symmetry S,thenfb has an adjoint b F F 1 b b vector space where the elements of G are the basis vec- symmetry S D S ,exampleSe D Se, So D So C cb b tors; equivalently G consists of all complex-valued and SRf.k/ D f.k/. The set of all symmetries functions on G. The product in G extends linearly to of f forms a group, i.e., the set of symmetries is the convolution productP W CG  CG ! CG,given closed under composition and inversion. Equivalently, 1 as .a b/.g/ D h2G a.h/b.h g/ for all g 2 G. the symmetries can be specified by defining an abstract The right regular representation of G on CG is, for CZ group G and a map RW G ! LinR. n/,whichfor every h 2 G, a linear map R.h/W CG ! CG given R CZ each g 2 G defines an -linear map R.g/ on n, as right translation R.h/a.g/ D a.gh/. A linear map 0 0 0 such that R.gg / D R.g/R.g / for all g; g 2 G. R is AW CG ! 
CG is convolutional (i.e., there exists an an example of a real representation of G. a 2 CG such that Ab D a b for all b 2 CG)ifand Z The DFT on n is computed by the fast Fourier only if A is equivariant with respect to the right regular O transform (FFT), costing .jnj log.jnj// floating point representation. operations. It is possible to exploit may symmetries in the computation of the FFT, and for large classes of The Generalized Fourier Transform (GFT) symmetry groups, savings a factor jGj can be obtained The generalized Fourier transform [6, 10]andthe compared to the nonsymmetric FFT. inverse are given as X ba./ D a.g/.g/ 2 Cdd ; for all  2 R (3) Equivariant Linear Operators and the GFT g2G S Representations 1 X   a.g/ D d trace .g1/ba./ ; for all g 2 G. Let G be a finite group with jGj elements. A dR- jGj  dimensional unitary representation of G is a map 2R RW G ! U.dR/ such that R.gh/ D R.g/R.h/ for (4) all g; h 2 G,whereU.dR/ is the set of dR  dR unitary matrices. More generally, a representation is From the convolution formula a1 b./ D ba./bb./, a linear action of a group on a vector space. Two we conclude: The GFT block-diagonalizes convolu- e representations R and R are equivalent if there exists tional operators on CG. The blocks are of size d,the amatrixX such that R.g/e D XR.g/X 1 for all dimensions of the irreps. g 2 G. A representation R is reducible if it is equiv- alent to a block diagonal representation; otherwise it Equivariant Linear Operators is irreducible. For any finite group G, there exists a More generally, consider a linear operator AW V ! V complete list of nonequivalent irreducible representa- where V is a finite-dimensional vector space and A is tions R Df1;2;:::;ng, henceforth called irreps, equivariant with respect to a linear right action of G 1450 Symmetries and FFT on V. If the action is free and transitive, then A is imate symmetries) of the problem and employ the convolutional on CG. 
If the action is not transitive, then V splits in s separate orbits under the action of G, and A is a block-convolutional operator. In this case, the GFT block diagonalizes A with blocks of size approximately sd. The theory generalizes to infinite compact Lie groups via the Peter–Weyl theorem and to certain important non-compact groups (unimodular groups), such as the group of Euclidean transformations acting on R^n; see [13].

Applications

Symmetric FFTs appear in many situations, such as real sine and cosine transforms. The 1-dim real cosine transform has four symmetries generated by S_R and S_e, and it can be computed four times faster than the full complex FFT. This transform is central in Chebyshev approximations. More generally, multivariate Chebyshev polynomials possess symmetries of kaleidoscopic reflection groups acting upon Z^n [9, 11, 14].

Diagonalization of equivariant linear operators is essential in signal and image processing, statistics, and differential equations. For a cyclic group Z_n, an equivariant A is a Toeplitz circulant, and the GFT is given by the discrete Fourier transform, which can be computed fast by the FFT algorithm. More generally, a finite abelian group Z_n has |n| one-dimensional irreps, given by the exponential functions. Z_n-equivariant matrices are block Toeplitz circulant matrices. These are diagonalized by multidimensional FFTs. Linear differential operators with constant coefficients typically lead to discrete linear operators which commute with translations acting on the domain. In the case of periodic boundary conditions, this yields block circulant matrices. For more complicated boundaries, block circulants may provide useful approximations to the differential operators.

More generally, many computational problems possess symmetries given by a (discrete or continuous) group acting on the domain. For example, the Laplacian operator commutes with any isometry of the domain. This can be discretized as an equivariant discrete operator. If the group action on the discretized domain is free and transitive, the discrete operator is a convolution in the group algebra. More generally, it is a block-convolutional operator. For computational efficiency, it is important to identify the symmetries (or approximate symmetries), and to use irreducible characters of the symmetry group and the GFT to (approximately) block diagonalize the operators. Such techniques are called domain reduction techniques [7]. Block diagonalization of linear operators has applications in solving linear systems, eigenvalue problems, and computation of matrix exponentials. The GFT also has applications in image processing, image registration, and computational statistics [1–5, 8].

References

1. Åhlander, K., Munthe-Kaas, H.: Applications of the generalized Fourier transform in numerical linear algebra. BIT Numer. Math. 45(4), 819–850 (2005)
2. Allgower, E., Böhmer, K., Georg, K., Miranda, R.: Exploiting symmetry in boundary element methods. SIAM J. Numer. Anal. 29, 534–552 (1992)
3. Bossavit, A.: Symmetry, groups, and boundary value problems. A progressive introduction to noncommutative harmonic analysis of partial differential equations in domains with geometrical symmetry. Comput. Methods Appl. Mech. Eng. 56(2), 167–215 (1986)
4. Chirikjian, G., Kyatkin, A.: Engineering Applications of Noncommutative Harmonic Analysis. CRC, Boca Raton (2000)
5. Diaconis, P.: Group Representations in Probability and Statistics. Institute of Mathematical Statistics, Hayward (1988)
6. Diaconis, P., Rockmore, D.: Efficient computation of the Fourier transform on finite groups. J. Am. Math. Soc. 3(2), 297–332 (1990)
7. Douglas, C., Mandel, J.: An abstract theory for the domain reduction method. Computing 48(1), 73–96 (1992)
8. Fässler, A., Stiefel, E.: Group Theoretical Methods and Their Applications. Birkhäuser, Boston (1992)
9. Hoffman, M., Withers, W.: Generalized Chebyshev polynomials associated with affine Weyl groups. Trans. Am. Math. Soc. 308(1), 91–104 (1988)
10. Maslen, D., Rockmore, D.: Generalized FFTs: a survey of some recent results. In: Groups and Computation II, Am. Math. Soc., vol. 28, pp. 183–287 (1997)
11. Munthe-Kaas, H.Z., Nome, M., Ryland, B.N.: Through the kaleidoscope: symmetries, groups and Chebyshev approximations from a computational point of view. In: Foundations of Computational Mathematics. Cambridge University Press, Cambridge (2012)
12. Rudin, W.: Fourier Analysis on Groups. Wiley, New York (1962)
13. Serre, J.: Linear Representations of Finite Groups, vol. 42. Springer, New York (1977)
14. Ten Eyck, L.: Crystallographic fast Fourier transforms. Acta Crystallogr. Sect. A 29(2), 183–191 (1973)
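The cyclic case discussed above is easy to check concretely: a Z_n-equivariant matrix is circulant, and conjugating it with the DFT matrix diagonalizes it. The sketch below (plain Python with a hand-rolled unitary DFT; a minimal illustration of ours, not code from the references) verifies this for the 1-D periodic discrete Laplacian.

```python
import cmath

def dft_matrix(n):
    """Unitary DFT matrix F with F[j][k] = exp(-2*pi*i*j*k/n) / sqrt(n)."""
    w = cmath.exp(-2j * cmath.pi / n)
    s = n ** -0.5
    return [[s * w ** (j * k) for k in range(n)] for j in range(n)]

def circulant(c):
    """Circulant (Z_n-equivariant) matrix with first column c."""
    n = len(c)
    return [[c[(j - k) % n] for k in range(n)] for j in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def conj_T(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

n = 8
A = circulant([2.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0])  # periodic 1-D Laplacian
F = dft_matrix(n)
D = matmul(matmul(F, A), conj_T(F))  # the GFT of A: should be diagonal

off_diag = max(abs(D[j][k]) for j in range(n) for k in range(n) if j != k)
eigs = [D[j][j].real for j in range(n)]  # eigenvalues 2 - 2*cos(2*pi*m/n)
print("max off-diagonal:", off_diag)     # at round-off level: A is diagonalized
print("eigenvalues:", [round(e, 6) for e in eigs])
```

The eigenvalues come out as 2 − 2cos(2πm/n), the familiar symbol of the periodic Laplacian; for a block-convolutional operator the same conjugation would produce small blocks rather than scalars.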

Symplectic Methods

J.M. Sanz-Serna
Departamento de Matemática Aplicada, Universidad de Valladolid, Valladolid, Spain

Definition

This entry, concerned with the practical task of integrating numerically Hamiltonian systems, follows up the entry ▶Hamiltonian Systems and keeps the notation and terminology used there.

Each one-step numerical integrator is specified by a smooth map ψ^H_{t_{n+1},t_n} that advances the numerical solution from a time level t_n to the next t_{n+1},

(p^{n+1}, q^{n+1}) = ψ^H_{t_{n+1},t_n}(p^n, q^n);   (1)

the superscript H refers to the Hamiltonian function H(p, q; t) of the system being integrated. For instance, for the explicit Euler rule

(p^{n+1}, q^{n+1}) = (p^n, q^n) + (t_{n+1} − t_n) ( f(p^n, q^n; t_n), g(p^n, q^n; t_n) );

here and later f and g denote the d-dimensional real vectors with entries −∂H/∂q_i and ∂H/∂p_i, respectively (d is the number of degrees of freedom), so that (f, g) is the canonical vector field associated with H (in simpler words: the right-hand side of Hamilton's equations). For the integrator to make sense, ψ^H_{t_{n+1},t_n} has to approximate the solution operator Φ^H_{t_{n+1},t_n} that advances the true solution from its value at t_n to its value at t_{n+1}:

(p(t_{n+1}), q(t_{n+1})) = Φ^H_{t_{n+1},t_n}(p(t_n), q(t_n)).

For a method of (consistency) order ν, ψ^H_{t_{n+1},t_n} differs from Φ^H_{t_{n+1},t_n} in terms of magnitude O((t_{n+1} − t_n)^{ν+1}).

The solution map Φ^H_{t_{n+1},t_n} is a canonical (symplectic) transformation in phase space, an important fact that substantially constrains the dynamics of the true solution (p(t), q(t)). If we wish the approximation ψ^H to retain the "Hamiltonian" features of Φ^H, we should insist on ψ^H also being a symplectic transformation. However, most standard numerical integrators (including explicit Runge–Kutta methods, regardless of their order ν) replace Φ^H by a nonsymplectic mapping ψ^H. This is illustrated in Fig. 1, which corresponds to the Euler rule as applied to the harmonic oscillator ṗ = −q, q̇ = p. The (constant) step size is t_{n+1} − t_n = 2π/12. We have taken as a family of initial conditions the points of a circle centered at p = 1, q = 0 and observed the evolution after 1, 2, ..., 12 steps. Clearly the circle, which should move clockwise without changing area, gains area as the integration proceeds: the numerical ψ^H is not symplectic. As a result, the origin, a center in the true dynamics, is turned by the discretization procedure into an unstable spiral point, i.e., into something that cannot arise in Hamiltonian dynamics. For the implicit Euler rule, the corresponding integration loses area and gives rise to a family of smaller and smaller circles that spiral toward the origin. Again, such a stable focus is incompatible with Hamiltonian dynamics.

Symplectic Methods, Fig. 1 The harmonic oscillator integrated by the explicit Euler method

This failure of well-known methods in mimicking Hamiltonian dynamics motivated the consideration of integrators that generate a symplectic mapping ψ^H when applied to a Hamiltonian problem. Such methods are called symplectic or canonical. Since symplectic transformations also preserve volume, symplectic integrators applied to Hamiltonian problems are automatically volume preserving. On the other hand, while many important symplectic integrators are time-reversible (symmetric), reversibility is neither sufficient nor necessary for a method to be symplectic ([8], Remark 6.5).

Even though early examples of symplectic integration may be traced back to the 1950s, the systematic exploration of the subject started with the work of Feng Kang (1920–1993) in the 1980s. An early short monograph is [8]; later books are the comprehensive [5] and the more applied [6]. Symplectic integration was the first step in the larger endeavor of developing structure-preserving integrators, i.e., of what is now often called, following [7], geometric integration.

Limitations of space restrict this entry to one-step methods and canonical Hamiltonian problems. For noncanonical Hamiltonian systems and multistep integrators, the reader is referred to [5], Chaps. VII and XV.

Integrators Based on Generating Functions

The earliest systematic approaches by Feng Kang and others to the construction of symplectic integrators (see [5], Sect. VI.5.4 and [8], Sect. 11.2) exploited the following well-known result of the canonical formalism: the canonical transformation Φ^H_{t_{n+1},t_n} possesses a generating function S_2 that solves an initial value problem for the associated Hamilton–Jacobi equation. It is then possible, by Taylor expanding that equation, to obtain an approximation S̃_2 to S_2. The transformation ψ^H_{t_{n+1},t_n} generated by S̃_2 will automatically be canonical and therefore will define a symplectic integrator. If S̃_2 differs from S_2 by terms O((t_{n+1} − t_n)^{ν+1}), the integrator will be of order ν. Generally speaking, the high-order methods obtained by following this procedure are more difficult to implement than those derived by the techniques discussed in the next two sections.

Runge–Kutta and Related Integrators

In 1988, Lasagni, Sanz-Serna, and Suris (see [8], Chap. 6) discovered independently that some well-known families of numerical methods contain symplectic integrators.

Runge–Kutta Methods

Symplecticness Conditions
When the Runge–Kutta (RK) method with s stages specified by the tableau

a_11 ... a_1s
 ⋮        ⋮
a_s1 ... a_ss      (2)
b_1  ...  b_s

is applied to the integration of the Hamiltonian system with Hamiltonian function H, the relation (1) takes the form

p^{n+1} = p^n + h_{n+1} Σ_{i=1}^s b_i f(P_i, Q_i; t_n + c_i h_{n+1}),
q^{n+1} = q^n + h_{n+1} Σ_{i=1}^s b_i g(P_i, Q_i; t_n + c_i h_{n+1}),

where c_i = Σ_j a_ij are the abscissae, h_{n+1} = t_{n+1} − t_n is the step size, and P_i, Q_i, i = 1, ..., s, are the internal stage vectors defined through the system

P_i = p^n + h_{n+1} Σ_{j=1}^s a_ij f(P_j, Q_j; t_n + c_j h_{n+1}),   (3)

Q_i = q^n + h_{n+1} Σ_{j=1}^s a_ij g(P_j, Q_j; t_n + c_j h_{n+1}).   (4)

Lasagni, Sanz-Serna, and Suris proved that if the coefficients of the method in (2) satisfy

b_i a_ij + b_j a_ji − b_i b_j = 0,   i, j = 1, ..., s,   (5)

then the method is symplectic. Conversely ([8], Sect. 6.5), the relations (5) are essentially necessary for the method to be symplectic. Furthermore, for symplectic RK methods the transformation (1) is in fact exact symplectic ([8], Remark 11.1).

Order Conditions
Due to symmetry considerations, the relations (5) impose s(s + 1)/2 independent equations on the s² + s elements of the RK tableau (2), so that there is no shortage of symplectic RK methods. The available free parameters may be used to increase the accuracy of the method. It is well known that the requirement that an RK formula have a target order leads to a set of nonlinear relations (order conditions) between the elements of the corresponding tableau (2). For order ≥ ν there is an order condition associated with each rooted tree with ≤ ν vertices and, if the a_ij and b_i are free parameters, the order conditions are mutually independent. For symplectic methods, however, the tableau coefficients are constrained by (5), and Sanz-Serna and Abia proved in 1991 that then there are redundancies between the order conditions ([8], Sect. 7.2). In fact, to ensure order ≥ ν when (5) holds, it is necessary and sufficient to impose an order condition for each so-called nonsuperfluous (nonrooted) tree with ≤ ν vertices.

Examples of Symplectic Runge–Kutta Methods
Setting j = i in (5) shows that explicit RK methods (with a_ij = 0 for i ≤ j) cannot be symplectic. Sanz-Serna noted in 1988 ([8], Sect. 8.1) that the Gauss method with s stages, s = 1, 2, ... (i.e., the unique method with s stages that attains the maximal order 2s), is symplectic. When s = 1 the method is the familiar implicit midpoint rule. Since for all Gauss methods the matrix (a_ij) is full, the computation of the stage vectors P_i and Q_i requires, at each step, the solution of the system (3)–(4), which comprises s × 2d scalar equations. In non-stiff situations this system is readily solved by functional iteration; see [8], Sects. 5.4 and 5.5, and [5], Sect. VIII.6. The Gauss methods thus combine the advantages of symplecticness, easy implementation, and high order with that of being applicable to all canonical Hamiltonian systems.

If the system being solved is stiff (e.g., it arises through discretization of the spatial variables of a Hamiltonian partial differential equation), Newton iteration has to be used to solve the stage equations (3) and (4), and for high-order Gauss methods the cost of the linear algebra may be prohibitive. It is then of interest to consider the possibility of diagonally implicit symplectic RK methods, i.e., methods where a_ij = 0 for i < j; these are obtained by concatenating implicit midpoint sub-steps of lengths b_1 h_{n+1}, ..., b_s h_{n+1}. The determination of the free parameters b_i is a task best accomplished by means of the techniques used to analyze composition methods.

The B-series Approach
In 1994, Calvo and Sanz-Serna ([5], Sect. VI.7.2) provided an indirect technique for the derivation of the symplecticness conditions (5). The first step is to identify conditions for the symplecticness of the associated B-series (i.e., the series that expands the transformation (1) in powers of the step size). Then the conditions (on the B-series) obtained in this way are shown to be equivalent to (5). This kind of approach has proved to be very powerful in the theory of geometric integration, where extensive use is made of formal power series.

Partitioned Runge–Kutta Methods
Partitioned Runge–Kutta (PRK) methods differ from standard RK integrators in that they use two tableaux of coefficients of the form (2): one to advance p and the other to advance q. Most developments of the theory of symplectic RK methods are easily adapted to cover the partitioned situation; see, e.g., [8], Sects. 6.3, 7.3, and 8.4.

The main reason ([8], Sect. 8.4) to consider the class of PRK methods is that it contains integrators that are both explicit and symplectic when applied to separable Hamiltonian systems with H(p, q; t) = T(p) + V(q; t), a format that often appears in the applications. It turns out ([8], Remark 8.1, [5], Sect. VI.4.1, Theorem 4.7) that such explicit, symplectic PRK methods may always be viewed as splitting methods (see below). Moreover, it is advantageous to perform their analysis by interpreting them as splitting algorithms.

Runge–Kutta–Nyström Methods
In the special but important case where the (separable) Hamiltonian is of the form H = (1/2) pᵀ M⁻¹ p + V(q; t) (M a positive-definite symmetric matrix), the canonical equations are dq/dt = M⁻¹ p, dp/dt = −∇_q V(q; t).
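Condition (5) is straightforward to test numerically for any given tableau. The sketch below (plain Python; a minimal illustration of ours, not code from the entry) evaluates the left-hand side of (5) for the implicit midpoint rule (the 1-stage Gauss method), the 2-stage Gauss method, and the explicit Euler tableau; the first two satisfy the symplecticness condition, the last does not.

```python
def symplecticness_defect(A, b):
    """Max |b_i a_ij + b_j a_ji - b_i b_j| over all i, j; zero for a symplectic RK method."""
    s = len(b)
    return max(abs(b[i] * A[i][j] + b[j] * A[j][i] - b[i] * b[j])
               for i in range(s) for j in range(s))

# 1-stage Gauss (implicit midpoint rule), order 2
midpoint = ([[0.5]], [1.0])

# 2-stage Gauss, order 4
r = 3 ** 0.5 / 6
gauss2 = ([[0.25, 0.25 - r],
           [0.25 + r, 0.25]], [0.5, 0.5])

# Explicit Euler: a_11 = 0, b_1 = 1
euler = ([[0.0]], [1.0])

for name, (A, b) in [("midpoint", midpoint), ("gauss2", gauss2), ("euler", euler)]:
    print(name, symplecticness_defect(A, b))
```

For the explicit Euler tableau the defect is exactly 1, which is the j = i obstruction mentioned above: explicit RK methods cannot be symplectic.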

Integrators Based on Splitting and Composition

The related ideas of splitting and composition are extremely fruitful in deriving practical symplectic integrators in many fields of application. The corresponding methods are typically ad hoc for the problem at hand and do not enjoy the universal off-the-shelf applicability of, say, Gaussian RK methods; however, when applicable, they may be highly efficient. In order to simplify the exposition, we assume hereafter that the Hamiltonian H is time-independent, H = H(p, q); we write Φ^H_{h_{n+1}} and ψ^H_{h_{n+1}} rather than Φ^H_{t_{n+1},t_n} and ψ^H_{t_{n+1},t_n}. Furthermore, we shall denote the time step by h, omitting the possible dependence on the step number n.

Splitting

Simplest Splitting
The easiest possibility of splitting occurs when the Hamiltonian H may be written as H_1 + H_2 and the Hamiltonian systems associated with H_1 and H_2 may be explicitly integrated. If the corresponding flows are denoted by φ^{H_1}_t and φ^{H_2}_t, the recipe (Lie–Trotter splitting, [8], Sect. 12.4.2, [5], Sect. II.5)

ψ^H_h = φ^{H_2}_h ∘ φ^{H_1}_h   (7)

defines the map (1) of a first-order integrator that is symplectic (the mappings being composed in the right-hand side are Hamiltonian flows and therefore symplectic). Splittings of H in more than two pieces are feasible but will not be examined here. In the separable case H = T(p) + V(q), the choice H_1 = T, H_2 = V turns (7) into the scheme

q^{n+1} = q^n + h ∇T(p^n),   p^{n+1} = p^n − h ∇V(q^{n+1}),   (8)

and it is sometimes called the symplectic Euler rule (it is obviously possible to interchange the roles of p and q). Alternatively, (8) may be considered as a one-stage, explicit, symplectic PRK integrator as in [8], Sect. 8.4.3.

As a second example of splitting, one may consider (nonseparable) formats H = H_1(p, q) + V*(q), where the Hamiltonian system associated with H_1 can be integrated in closed form. For instance, H_1 may correspond to a set of uncoupled harmonic oscillators and V*(q) represent the potential energy of the interactions between oscillators. Or H_1 may correspond to the Keplerian motion of a point mass attracted to a fixed gravitational center and V* be a potential describing some sort of perturbation.

Strang Splitting
With the notation in (7), the symmetric Strang formula ([8], Sect. 12.4.3, [5], Sect. II.5)

ψ̄^H_h = φ^{H_2}_{h/2} ∘ φ^{H_1}_h ∘ φ^{H_2}_{h/2}   (9)

defines a time-reversible, second-order symplectic integrator ψ̄^H_h that improves on the first-order (7). In the separable Hamiltonian case H = T(p) + V(q), (9) leads to

p^{n+1/2} = p^n − (h/2) ∇V(q^n),
q^{n+1} = q^n + h ∇T(p^{n+1/2}),
p^{n+1} = p^{n+1/2} − (h/2) ∇V(q^{n+1}).

This is the Störmer–Leapfrog–Verlet method that plays a key role in molecular dynamics [6]. It is also possible to regard this integrator as an explicit, symplectic PRK with two stages ([8], Sect. 8.4.3).

More Sophisticated Formulae
A further generalization of (7) is

φ^{H_2}_{β_s h} ∘ φ^{H_1}_{α_s h} ∘ φ^{H_2}_{β_{s−1} h} ∘ ... ∘ φ^{H_2}_{β_1 h} ∘ φ^{H_1}_{α_1 h},   (10)

where the coefficients α_i and β_i, Σ_i α_i = 1, Σ_i β_i = 1, are chosen so as to boost the order ν of the method. A systematic treatment based on trees of the required order conditions was given by Murua and Sanz-Serna in 1999 ([5], Sect. III.3). There has been much recent activity in the development of accurate splitting coefficients α_i, β_i, and the reader is referred to the entry ▶Splitting Methods in this encyclopedia.

In the particular case where the splitting is given by H = T(p) + V(q), the family (10) provides the most general explicit, symplectic PRK integrator.

Splitting Combined with Approximations
In (7), (9), or (10) use is made of the exact solution flows φ^{H_1}_t and φ^{H_2}_t. Even if one or both of these flows are not available, it is still possible to employ the idea of splitting to construct symplectic integrators. A simple example will be presented next, but many others will come easily to mind. Assume that we wish to use a Strang-like method but φ^{H_1}_t is not available. We may then advance the numerical solution via

φ^{H_2}_{h/2} ∘ ψ̂^{H_1}_h ∘ φ^{H_2}_{h/2},   (11)

where ψ̂^{H_1}_h denotes a consistent method for the integration of the Hamiltonian problem associated with H_1. If ψ̂^{H_1}_h is time-reversible, the composition (11) is also time-reversible and hence of order ν = 2 (at least). And if ψ̂^{H_1}_h is symplectic, (11) will define a symplectic method.

Composition
A step of a composition method ([5], Sect. II.4) consists of a concatenation of a number of sub-steps performed with one or several simpler methods. Often the aim is to create a high-order method out of low-order integrators; the composite method automatically inherits the conservation properties shared by the methods being composed. The idea is of particular appeal within the field of geometric integration, where it is frequently not difficult to write down first- or second-order integrators with good conservation properties.

A useful example, due to Suzuki, Yoshida, and others (see [8], Sect. 13.1), is as follows. Let ψ^H_h be a time-reversible integrator that we shall call the basic method, and define the composition method ψ̂^H_h by

ψ̂^H_h = ψ^H_{αh} ∘ ψ^H_{(1−2α)h} ∘ ψ^H_{αh};

if the basic method is symplectic, then ψ̂^H_h will obviously be a symplectic method. It may be proved that, if α = (1/3)(2 + 2^{1/3} + 2^{−1/3}), then ψ̂^H_h will have order ν = 4. By using this idea one may perform symplectic, fourth-order accurate integrations while really implementing a simpler second-order integrator. The approach is particularly attractive when the direct application of a fourth-order method (such as the two-stage Gauss method) has been ruled out on implementation grounds, but a suitable basic method (for instance the implicit midpoint rule or a scheme derived by using Strang splitting) is available.

If the (time-reversible) basic method is of order 2ν and α = (2 − 2^{1/(2ν+1)})^{−1}, then ψ̂^H_h will have order 2ν + 2; the recursive application of this idea shows that it is possible to reach arbitrarily high orders starting from a method of order 2.

For further possibilities, see the entry ▶Composition Methods and [8], Sect. 13.1, [5], Sects. II.4 and III.3.

The Modified Hamiltonian

The properties of symplectic integrators outlined in the next section depend on the crucial fact that, when a symplectic integrator is used, a numerical solution of the Hamiltonian system with Hamiltonian H may be viewed as an (almost) exact solution of a Hamiltonian system whose Hamiltonian function H̃ (the so-called modified Hamiltonian) is a perturbation of H.

An example. Consider the application of the symplectic Euler rule (8) to a one-degree-of-freedom system with separable Hamiltonian H = T(p) + V(q). In order to describe the behavior of the points (p^n, q^n) computed by the algorithm, we could just say that they approximately behave like the solutions (p(t_n), q(t_n)) of the Hamiltonian system (S) being integrated. This, however, would not be a very precise description.
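The contrast between symplectic and nonsymplectic behavior is easy to reproduce numerically. The sketch below (plain Python; a minimal illustration of ours, not code from the entry) integrates the pendulum H = p²/2 − cos q with the explicit Euler rule and with the variant of the symplectic Euler rule that advances q first, recording the worst energy error over a long run.

```python
import math

def hamiltonian(p, q):
    """Pendulum energy T(p) + V(q) = p^2/2 - cos q."""
    return 0.5 * p * p - math.cos(q)

def explicit_euler(p, q, h):
    # both variables advanced with old values: not symplectic
    return p - h * math.sin(q), q + h * p

def symplectic_euler(p, q, h):
    # advance q with old p, then p with new q: symplectic
    q_new = q + h * p                    # q^{n+1} = q^n + h T'(p^n)
    p_new = p - h * math.sin(q_new)      # p^{n+1} = p^n - h V'(q^{n+1})
    return p_new, q_new

h, steps = 0.05, 20000   # a long integration, many pendulum periods
drift = {}
for name, step in [("explicit", explicit_euler), ("symplectic", symplectic_euler)]:
    p, q = 0.0, 2.0
    e0 = hamiltonian(p, q)
    worst = 0.0
    for _ in range(steps):
        p, q = step(p, q, h)
        worst = max(worst, abs(hamiltonian(p, q) - e0))
    drift[name] = worst

print(drift)  # symplectic: small, bounded energy error; explicit: steady growth
```

The unbounded energy gain of explicit Euler is the spiral behavior of Fig. 1; the bounded oscillation of the symplectic rule is what the modified-Hamiltonian analysis of this section explains.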

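The modified-Hamiltonian picture can itself be tested quantitatively: under the symplectic Euler rule, the truncated modified energy H + (h/2)T′(p)V′(q) (the correction term derived in this section, for the q-first variant of the rule) drifts far less than H. Below is a minimal sketch of ours in plain Python.

```python
import math

def sym_euler_step(p, q, h):
    # symplectic Euler: q advanced with old p, then p with new q
    q = q + h * p
    p = p - h * math.sin(q)
    return p, q

def H(p, q):
    """Pendulum energy T(p) + V(q)."""
    return 0.5 * p * p - math.cos(q)

def H_mod(p, q, h):
    """First modified Hamiltonian: H + (h/2) T'(p) V'(q) with T' = p, V' = sin q."""
    return H(p, q) + 0.5 * h * p * math.sin(q)

h = 0.1
p, q = 0.0, 2.0
e0, em0 = H(p, q), H_mod(p, q, h)
drift_H = drift_mod = 0.0
for _ in range(5000):
    p, q = sym_euler_step(p, q, h)
    drift_H = max(drift_H, abs(H(p, q) - e0))
    drift_mod = max(drift_mod, abs(H_mod(p, q, h) - em0))

print(drift_H, drift_mod)  # the modified energy is conserved far more accurately
```

The O(h) error in H is dominated by the difference H̃ − H, while the residual drift of H̃ is O(h²), consistent with the computed points lying near a level curve of the modified Hamiltonian rather than of H.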
Indeed, the true flow Φ^H_h and its numerical approximation ψ^H_h differ in O(h²) terms. Can we find another differential system S_2 (called a modified system) so that (8) is consistent of the second order with S_2? The points (p^n, q^n) would then be closer to the solutions of S_2 than to the solutions of the system S we want to integrate. Straightforward Taylor expansions ([8], Sect. 10.1) lead to the following expression for S_2 (recall that f = −∂H/∂q, g = ∂H/∂p):

dp/dt = f(q) + (h/2) g(p) f′(q),   dq/dt = g(p) − (h/2) g′(p) f(q),   (12)

where we recognize the Hamiltonian system with (h-dependent!) Hamiltonian

H̃^h_2 = T(p) + V(q) + (h/2) T′(p) V′(q) = H + O(h).   (13)

Symplectic Methods, Fig. 2 Computed points, true trajectory (dotted line) and modified trajectory (dash–dot line)

Figure 2 corresponds to the pendulum equations g(p) = p, f(q) = −sin q with initial condition p(0) = 0, q(0) = 2. The stars plot the numerical solution with h = 0.5. The dotted line H = constant provides the true pendulum solution. The dash–dot line H̃^h_2 = constant gives the solution of the modified system (12). The agreement of the computed points with the modified trajectory is very good.

The origin is a center of the modified system (recall that a small Hamiltonian perturbation of a Hamiltonian center is still a center); this matches the fact that, in the plot, the computed solution does not spiral in or out. On the other hand, the analogous modified system for the (nonsymplectic) integration in Fig. 1 is found not to be a Hamiltonian system, but rather a system with negative dissipation: this agrees with the spiral behavior observed there.

By adding extra O(h²) terms to the right-hand sides of (12), it is possible to construct a (more accurate) modified system S_3 so that (8) is consistent of the third order with S_3; thus, S_3 would provide an even better description of the numerical solution. The procedure may be iterated to get modified systems S_4, S_5, ..., and all of them turn out to be Hamiltonian.

General case. Given an arbitrary Hamiltonian system with a smooth Hamiltonian H, a consistent symplectic integrator ψ^H_h, and an arbitrary integer μ > 0, it is possible ([8], Sect. 10.1) to construct a modified Hamiltonian system S_μ with Hamiltonian function H̃^h_μ such that the flow of S_μ differs from ψ^H_h in O(h^{μ+1}) terms. In fact, H̃^h_μ may be chosen as a polynomial of degree < μ in h; the term independent of h coincides with H (cf. (13)), and for a method of order ν the terms in h, ..., h^{ν−1} vanish.

The polynomials H̃^h_μ, μ = 2, 3, ..., are the partial sums of a series in powers of h. Unfortunately this series does not in general converge for fixed h, so that, in particular, the flows of the modified systems cannot converge to ψ^H_h as μ → ∞. Therefore, in general, it is impossible to find a Hamiltonian H̃^h whose flow coincides exactly with the integrator ψ^H_h. Neishtadt ([8], Sect. 10.1) proved that by retaining for each h > 0 a suitable number N = N(h) of terms of the series it is possible to obtain a Hamiltonian H̃^h whose h-flow differs from ψ^H_h in an exponentially small quantity.

Here is the conclusion for the practitioner: for a symplectic integrator applied to an autonomous

Hamiltonian system, modified autonomous Hamil- This kind of research relies very much on concepts and tonian problems exist so that the computed points lie techniques from the theory of Lie algebras. “very approximately” on the exact trajectories of the modified problems. This makes possible a backward error interpretation of the numerical results: The computed solutions are solving “very approximately” Properties of Symplectic Integrators a nearby Hamiltonian problem. In a modeling situation where the exact form of the Hamiltonian H may be in We conclude by presenting an incomplete list of favor- doubt, or some coefficients in H may be the result of able properties of symplectic integrators. Note that the experimental measurements, the fact that integrating advantage of symplecticness become more prominent the model numerically introduces perturbations to H as the integration interval becomes longer. comparable to the uncertainty in H inherent in the Conservation of energy. For autonomous Hamilto- model is the most one can hope for. nians, the value of H is of course a conserved quantity On the other hand, when a nonsymplectic formula and the invariance of H usually expresses conser- is used the modified systems are not Hamiltonian: The vation of physical energy. Ge and Marsden proved process of numerical integration perturbs the model in in 1988 ([8], Sect. 10.3.2) that the requirements of such a way as to take it out of the Hamiltonian class. symplecticness and exact conservation of H cannot Variable steps. An important point to be noted is be met simultaneously by a bona fide numerical inte- as follows: The backward error interpretation only grator. Nevertheless, symplectic integrators have very holds if the numerical solution after n steps is com- good energy behavior ([5], Sect. IX.8): Under very puted by iterating n times one and the same symplec- general hypotheses, for a symplectic integrator of or- tic map. 
If, alternatively, one composes n symplectic der : H.pn;qn/ D H.p0;q0/ C O.h /,wherethe maps (one from t0 to t1, a different one from t1 constant implied in the O notation is independent to t2, etc.) the backward error interpretation is lost, of n over exponentially long time intervals nh Ä because the modified system changes at each step ([8], exp h0=.2h/ . Sect. 10.1.3). Linear error growth in integrable systems. For a As a consequence, most favorable properties of Hamiltonian problem that is integrable in the sense symplectic integrators (and of other geometric inte- of the Liouville–Arnold theorem, it may be proved grators) are lost when they are naively implemented ([5], Sect. X.3) that, in (long) time intervals of length with variable step sizes. For a complete discussion of proportional to h, the errors in the action variables this difficulty and of ways to circumvent it, see [5], are of magnitude O.h/ and remain bounded, while Sects. VIII 1–4. the errors in angle variables are O.h / and exhibit a Finding explicitly the modified Hamiltonians. The growth that is only linear in t. By implication the error existence of a modified Hamiltonian system is a general growth in the components of p and q will be O.h/ and result that derives directly from the symplecticness of grow, at most, linearly. Conventional integrators, in- S H the transformation h ([8], Sect. 10.1) and does not cluding explicit Runge–Kutta methods, typically show require any hypothesis on the particular nature of such quadratic error growth in this kind of situation and a transformation. However, much valuable information therefore cannot be competitive in a sufficiently long may be derived from the explicit construction of the integration. modified Hamiltonians. For RK and related methods, KAM theory. When the system is closed to inte- e h a way to compute systematically the H ’s was first grable, the KAM theory ([5], Chap. 
X) ensures, among described by Hairer in 1994 and then by Calvo, Mu- other things, the existence of a number of invariant tori rua, and Sanz-Serna ([5], Sect. IX.9). For splitting and that contribute to the stability of the dynamics (see [8], e h composition integrators, the H ’s may be obtained by Sect. 10.4 for an example). On each invariant torus the use of the Baker–Campbell–Hausdorff formula ([8], motion is quasiperiodic. Symplectic integrators ([5], Sect. 12.3, [5], Sect. III.4) that provides a means to Chap. X, Theorem 6.2) possess invariant tori O.h/ express as a single flow the composition of two flows. close to those of the system being integrated and 1458 Systems Biology, Minimalist vs Exhaustive Strategies furthermore the dynamics on each invariant torus is Systems Biology, Minimalist vs conjugate to its exact counterpart. Exhaustive Strategies Linear error growth in other settings. Integrable systems are not the only instance where symplectic Jeremy Gunawardena integrators lead to linear error growth. Other cases Department of Systems Biology, Harvard Medical include, under suitable hypotheses, periodic orbits, School, Boston, MA, USA solitons, relative equilibria, etc., see, among others, [1–4].

Introduction

Systems biology may be defined as the study of how Cross-References physiology emerges from molecular interactions [11]. Physiology tells us about function, whether at the  B-Series organismal, tissue, organ or cellular level; molecular  Composition Methods interactions tell us about mechanism. How do we relate  Euler Methods, Explicit, Implicit, Symplectic mechanism to function? This has always been one of  Gauss Methods the central problems of biology and medicine but it  Hamiltonian Systems attains a particular significance in systems biology be-  Molecular Dynamics cause the molecular realm is the base of the biological  Nystrom¨ Methods hierarchy. Once the molecules have been identified,  One-Step Methods, Order, Convergence there is nowhere left to go but up.  Runge–Kutta Methods, Explicit, Implicit This is an enormous undertaking, encompassing,  Symmetric Methods among other things, the development of multicellular organisms from their unicellular precursors, the hier- archical scales from molecules to cells, tissues, and organs, and the nature of malfunction, disease, and re- pair. Underlying all of this is evolution, without which References biology can hardly be interpreted. Organisms are not designed to perform their functions, they have evolved 1. Cano, B., Sanz-Serna, J.M.: Error growth in the numerical to do so—variation, transfer, drift, and selection have integration of periodic orbits, with application to Hamil- tinkered with them over 3:5  109 years—and this has tonian and reversible systems. SIAM J. Numer. Anal. 34, 1391–1417 (1997) had profound implications for how their functions have 2. de Frutos, J., Sanz-Serna, J.M.: Accuracy and conserva- been implemented at the molecular level [12]. tion properties in numerical integration: the case of the The mechanistic viewpoint in biology has nearly Korteweg-deVries equation. Numer. Math. 75, 421–445 (1997) always required a strongly quantitative perspective and 3. 
therefore also a reliance on quantitative models. If this trend seems unfamiliar to those who have been reared on molecular biology, it is only because our historical horizons have shrunk. The quantitative approach would have seemed obvious to physiologists, geneticists, and biochemists of an earlier generation. Moreover, quantitative methods wax and wane within an individual discipline as new experimental techniques emerge and the focus shifts between the descriptive and the functional. The great Santiago Ramón y Cajal, to whom we owe the conception of the central nervous system as a network of neurons, classified "theorists" with "contemplatives, bibliophiles and polyglots, megalomaniacs, instrument addicts, misfits" [3]. Yet, when Cajal died in 1934, Alan Hodgkin was already starting down the road that would lead to the Hodgkin–Huxley equations.

In a similar way, the qualitative molecular biology of the previous era is shifting, painfully and with much grinding of gears, to a quantitative systems biology. What kind of mathematics will be needed to support this? Here, we focus on the level of abstraction for modelling cellular physiology at the molecular level. This glosses over many other relevant and hard problems but allows us to bring out some of the distinctive challenges of molecularity. We may caricature the current situation in two extreme views. One approach, mindful of the enormous complexity at the molecular level, strives to encompass that complexity, to dive into it, and to be exhaustive; the other, equally mindful of the complexity but with a different psychology, strives to abstract from it, to rise above it, and to be minimalist. Here, we examine the requirements and the implications of both strategies.

Models as Dynamical Systems

Many different kinds of models are available for describing molecular systems. It is convenient to think of each as a dynamical system, consisting of a description of the state of the system along with a description of how that state changes in time. The system state typically amalgamates the states of various molecular components, which may be described at various levels of abstraction. For instance, Boolean descriptions are often used by experimentalists when discussing gene expression: this gene is ON, while that other is OFF. Discrete dynamic models can represent time evolution as updates determined by Boolean functions. At the other end of the abstraction scale, gene expression may be seen as a complex stochastic process that takes place at an individual promoter site on DNA: states may be described by the numbers of mRNA molecules and the time evolution may be described by a stochastic master equation. In a different physiological context, Ca2+ ions are a "second messenger" in many key signalling networks and show complex spatial and temporal behaviour within an individual cell. The state may need to be described as a concentration that varies in space and time and the time evolution by a partial differential equation. In the most widely-used form of dynamical system, the state is represented by the (scalar) concentrations of the various molecular components in specific cellular compartments and the time evolution by a system of coupled ordinary differential equations (ODEs).

What is the right kind of model to use? That depends entirely on the biological context, the kind of biological question that is being asked, and on the experimental capabilities that can be brought to bear on the problem. Models in biology are not objective descriptions of reality; they are descriptions of our assumptions about reality [2].

For our purposes, it will be simplest to discuss ODE models. Much of what we say applies to other classes of models. Assume, therefore, that the system is described by the concentrations within specific cellular compartments of n components, x1, …, xn, and the time evolution is given, in vector form, by dx/dt = f(x; a). Here, a ∈ R^m is a vector of parameters. These may be quantities like proportionality constants in rate laws. They have to take numerical values before the dynamics on the state space can be fully defined, and thereby arises the "parameter problem" [6]. In any serious model, most of the parameter values are not known, nor can they be readily determined experimentally. (Even if some of them can, there is always a question of whether an in-vitro measurement reflects the in-vivo context.)

The dynamical behaviour of a system may depend crucially on the specific parameter values. As these values change through a bifurcation, the qualitative "shape" of the dynamics may alter drastically; for instance, steady states may alter their stability or appear or disappear [22]. In between bifurcations, the shape of the dynamics only alters in a quantitative way while the qualitative portrait remains the same. The geography of parameter space therefore breaks up into regions; within each region the qualitative portrait of the dynamics remains unaltered, although its quantitative details may change, while bifurcations take place between regions, resulting in qualitative changes in the dynamics (Fig. 1).

Parameterology

We see from this that parameter values matter. They are typically determined by fitting the model to experimental data, such as time series for the concentrations of some of the components.
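The parameter-region picture of Fig. 1 can be checked directly. The following sketch (ours, not from the article) takes the one-variable example used in Fig. 1, dx1/dt = f1(x1) = a1 x1 (1 − x1) − a2 x1, finds the steady states on the nonnegative real line, and classifies their stability from the sign of f1′(x1):

```python
# Steady states and stability for dx1/dt = f1(x1) = a1*x1*(1 - x1) - a2*x1,
# the one-variable example of Fig. 1. A steady state x* is (linearly) stable
# when f1'(x*) < 0 and unstable when f1'(x*) > 0; the borderline case
# a1 = a2 (the transcritical bifurcation itself) is not handled here.

def f1_prime(x, a1, a2):
    # d/dx [a1*x*(1 - x) - a2*x] = a1 - 2*a1*x - a2
    return a1 - 2 * a1 * x - a2

def steady_states(a1, a2):
    """Return (x*, 'stable'|'unstable') pairs on the nonnegative real line."""
    states = [0.0]                      # x1 = 0 is always a steady state
    if a1 > a2:                         # a second state enters for a1 > a2
        states.append(1.0 - a2 / a1)
    return [(x, "stable" if f1_prime(x, a1, a2) < 0 else "unstable")
            for x in states]

print(steady_states(1.0, 2.0))  # region a1 <= a2: only x1 = 0, stable
print(steady_states(2.0, 1.0))  # region a1 > a2: x1 = 0 unstable, x1 = 1 - a2/a1 stable
```

Moving a1 across a2 exchanges the stability of the two states, which is exactly the transcritical bifurcation that separates the two parameter regions.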
[Figure 1 appears here: graphs and phase lines for the example dx1/dt = f1(x1) = a1 x1 (1 − x1) − a2 x1, with panels (a)–(c) described in the caption.]
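Fitting by minimizing a discrepancy between calculated and observed data can be illustrated with a deliberately tiny, hypothetical example (none of the model, data, or names below come from the article): a one-parameter decay model dx/dt = −a x, synthetic "observed" time-series data, and a grid search over the sum-of-squares discrepancy.

```python
import math

# Least-squares fitting of one rate constant in a toy decay model
# dx/dt = -a*x with solution x(t) = x0*exp(-a*t). We generate synthetic
# "observed" data with a_true = 0.7, then recover a by minimizing the
# discrepancy (sum of squared residuals) over a parameter grid.

def trajectory(a, times, x0=1.0):
    return [x0 * math.exp(-a * t) for t in times]

def discrepancy(a, times, observed):
    return sum((m - o) ** 2 for m, o in zip(trajectory(a, times), observed))

times = [0.0, 0.5, 1.0, 1.5, 2.0]
observed = trajectory(0.7, times)          # synthetic data, a_true = 0.7

grid = [i / 100 for i in range(1, 201)]    # candidate values 0.01 .. 2.00
best = min(grid, key=lambda a: discrepancy(a, times, observed))
print(best)
```

In a realistic multi-parameter model the landscape of this discrepancy is exactly where the "stiff" and "sloppy" directions discussed below appear: some parameter combinations change the discrepancy sharply, others hardly at all.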
The fitting may be undertaken by minimizing a suitable measure of discrepancy between the calculated and the observed data, for which several nonlinear optimization algorithms are available [10].

Systems Biology, Minimalist vs Exhaustive Strategies, Fig. 1 The geography of parameter space. (a) A dynamical system with one state variable, x1, and two parameters, a1, a2. (b) Graphs of f1(x1) (dashed curve) and of the terms in it (solid curves), showing the steady states, where dx1/dt = f1(x1) = 0. (c) Parameter space breaks up into two regions: (1) a1 ≤ a2, in which the state space has a single stable steady state at x1 = 0, to which any positive initial condition tends (arrow); and (2) a1 > a2, in which there are two steady states, a positive stable state at x1 = 1 − a2/a1, to which any positive initial condition tends (arrows), and an unstable state at x1 = 0. Here, the state space is taken to be the nonnegative real line. A magenta dot indicates a stable state and a cyan dot indicates an unstable state. The dynamics in the state space undergoes a transcritical bifurcation at a1 = a2 [22]

Empirical studies on models with many (>10) parameters have revealed what could be described as an "80/20" rule [9, 19]. Roughly speaking, 20 % of the parameters are well constrained by the data, or "stiff": they cannot be individually altered by much without significant discrepancy between calculated and observed data. On the other hand, 80 % of the parameters are poorly constrained, or "sloppy": they can be individually altered by an order of magnitude or more without a major impact on the discrepancy. The minimization landscape, therefore, does not have a single deep hole but a flat valley with rather few dimensions orthogonal to the valley. The fitting should have localized the valley within one of the parameter regions. At present, no theory accounts for the emergence of these valleys.

Two approaches can be taken to this finding. On the one hand, one might seek to constrain the parameters further by acquiring more data. This raises an interesting problem of how best to design experiments to efficiently constrain the data. Is it better to get more of the same data or to get different kinds of data? On the other hand, one might seek to live with the sloppiness, to acknowledge that the fitted parameter values may not reflect the actual ones but nevertheless seek to draw testable conclusions from them. For instance, the stiff parameters may suggest experimental interventions whose effects are easily observable. (Whether they are also biologically interesting is another matter.) There may also be properties of the system that are themselves insensitive, or "robust," to the parameter differences. One can, in any case, simply draw conclusions based on the fits and seek to test these experimentally.

A successful test may provide some encouragement that the model has captured aspects of the mechanism that are relevant to the question under study. However, models are working hypotheses, not explanations. The conclusion that is drawn may be correct, but that may be an accident of the sloppy parameter values or the particular assumptions made. It may be correct for the wrong reasons. Molecular complexity is such that there may well be other models, based on different assumptions, that lead to the same conclusions. Modellers often think of models as finished entities. Experimentalists know better. A model is useful only as a basis for making a better model. It is through repeated tests and revised assumptions that a firmly grounded, mechanistic understanding of molecular behaviour slowly crystallizes.

Sometimes, one learns more when a test is not successful, because that immediately reveals a problem with the assumptions and stimulates a search to correct them. However, it may be all too easy, because of the sloppiness in the fitting, to refit the model using the data that invalidated the conclusions and then claim that the newly fitted model "accounts for the new data." From this, one learns nothing. It is better to follow Popperian principles and to specify in advance how a model is to be rejected. If a model cannot be rejected, it cannot tell you anything. In other areas of experimental science, it is customary to set aside some of the data to fit the model and to use another part of the data, or newly acquired data, to assess the quality of the model. In this way, a rejection criterion can be quantified and one can make objective comparisons between different models.

The kinds of approaches sketched above only work well when modelling and experiment are intimately integrated [18, 21]. As yet, few research groups are able to accomplish this, as both aspects require substantial, but orthogonal, expertise as well as appropriate integrated infrastructure for manipulating and connecting data and models.

Model Simplification

As pointed out above, complexity is a relative matter. Even the most complex model has simplifying assumptions: components have been left out; posttranslational modification states collapsed; complex interactions aggregated; spatial dimensions ignored; physiological context not made explicit. And these are just some of the things we know about, the "known unknowns." There are also the "unknown unknowns." We hope that what has been left out is not relevant to the question being asked. As always, that is a hypothesis, which may or may not turn out to be correct.

The distinction between exhaustive and minimal models is therefore more a matter of scale than of substance. However, having decided upon a point on the scale, and created a model of some complexity, there are some systematic approaches to simplifying it. Here, we discuss just two.

One of the most widely used methods is separation of time scales. A part of the system is assumed to be working significantly faster than the rest. If the faster part is capable of reaching a (quasi) steady state, then the slower part is assumed to see only that steady state and not the transient states that led to it. In some cases, this allows variables within the faster part to be eliminated from the dynamics.

Separation of time scales appears in Michaelis and Menten's pioneering study of enzyme-catalysed reactions. Their famous formula for the rate of an enzyme arises by assuming that the intermediate enzyme-substrate complex is in quasi-steady state [8]. Although the enzyme-substrate complex plays an essential role, it has been eliminated from the formula. The King–Altman procedure formalizes this process of elimination for complex enzyme mechanisms with multiple intermediates. This is an instance of a general method of linear elimination underlying several well-known formulae in enzyme kinetics, in protein allostery and in gene transcription, as well as more modern simplifications arising in chemical reaction networks and in multienzyme posttranslational modification networks [7].

An implicit assumption is often made that, after elimination, the behaviour of the simplified dynamical system approximates that of the original system. The mathematical basis for confirming this is through a singular perturbation argument and Tikhonov's Theorem [8], which can reveal the conditions on parameter values and initial conditions under which the approximation is valid. It must be said that, aside from the classical Michaelis–Menten example, few singular perturbation analyses have been undertaken. Biological intuition can be a poor guide to the right conditions. In the Michaelis–Menten case, for example, the intuitive basis for the quasi-steady state assumption is that under in vitro conditions, substrate, S, is in excess over enzyme, E: S_tot ≫ E_tot. However, singular perturbation reveals a broader region, S_tot + K_M ≫ E_tot, where K_M is the Michaelis–Menten constant, in which the quasi-steady state approximation remains valid [20]. Time-scale separation has been widely used but cannot be expected to provide dramatic reductions in complexity; typically, the number of components is reduced twofold, not tenfold.

The other method of simplification is also based on an old trick: linearization in the neighbourhood of a steady state. The Hartman–Grobman Theorem for a dynamical system states that, in the local vicinity of a hyperbolic steady state—that is, one in which none of the eigenvalues of the Jacobian have zero real part—the nonlinear dynamics is qualitatively similar to the dynamics of the linearized system, dy/dt = (Jf)_ss y, where y = x − x_ss is the offset relative to the steady state, x_ss, and (Jf)_ss is the Jacobian matrix of the nonlinear system, dx/dt = f(x), evaluated at the steady state. Linearization simplifies the dynamics but does not reduce the number of components.

Straightforward linearization has not been particularly useful for analysing molecular networks, because it loses touch with the underlying network structure. However, control engineering provides a systematic way to interrogate the linearized system and, potentially, to infer a simplified network. Such methods were widely used in physiology in the cybernetic era [5], and are being slowly rediscovered by molecular systems biologists. They are likely to be most useful when the steady state is homeostatically maintained, that is, when the underlying molecular network acts like a thermostat to maintain some internal variable within a narrow range, despite external fluctuations. Cells try to maintain nutrient levels, energy levels, pH, ionic balances, etc., fairly constant, as do organisms in respect of Claude Bernard's "milieu intérieur"; chemotaxing E. coli return to a constant tumbling rate after perturbation by attractants or repellents [23]; S. cerevisiae cells maintain a constant osmotic pressure in response to external osmotic shocks [16].

The internal structure of a linear control system can be inferred from its frequency response. If a stable linear system is subjected to a sinusoidal input, its steady-state output is a sinusoid of the same frequency but possibly with a different amplitude and phase. The amplitude gain and the phase shift, plotted as functions of frequency—the so-called Bode plots, after Hendrik Bode, who developed frequency analysis at Bell Labs—reveal a great deal about the structure of the system [1]. More generally, the art of systems engineering lies in designing a linear system whose frequency response matches specified Bode plots.

The technology is now available to experimentally measure approximate cellular frequency responses in the vicinity of a steady state, at least under simple conditions. Provided the amplitude of the forcing is not too high, so that a linear approximation is reasonable, and the steady state is homeostatically maintained, reverse engineering of the Bode plots can yield a simplified linear control system that may be a useful abstraction of the complex nonlinear molecular network responsible for the homeostatic regulation [15]. Unlike time-scale separation, the reduction in complexity can be dramatic. As always, this comes at the price of a more abstract representation of the underlying biology but, crucially, one in which some of the control structure is retained. However, at present, we have little idea how to extend such frequency analysis to large perturbations, where the nonlinearities become significant, or to systems that are not homeostatic.

Frequency analysis, unlike separation of time scales, relies on data, reinforcing the point made previously that integrating modelling with experiments and data can lead to powerful synergies.

Looking Behind the Data

Experimentalists have learned the hard way to develop their conceptual understanding from experimental data. As the great Otto Warburg advised, "Solutions usually have to be found by carrying out innumerable experiments without much critical hesitation" [13]. However, sometimes the data you need is not the data you get, in which case conceptual interpretation can become risky. For instance, signalling in mammalian cells has traditionally been studied by grinding up 10^6 cells and running Western blots with antibodies against specific molecular states. Such data has told us a great deal, qualitatively. However, a molecular network operates in a single cell. Quantitative data aggregated over a cell population is only meaningful if the distribution of responses in the population is well represented by its average. Unfortunately, that is not always the case, most notoriously when responses are oscillatory. The averaged response may look like a damped oscillation, while individual cells actually have regular oscillations but at different frequencies and phases [14, 17]. Even when the response is not oscillatory, single-cell analysis may reveal a bimodal response, with two apparently distinct sub-populations of cells [4]. In both cases, the very concept of an "average" response is a statistical fiction that may be unrelated to the behaviour of any cell in the population.

One moral of this story is that one should always check whether averaged data is representative of the individual, whether individual molecules, cells, or organisms. It is surprising how rarely this is done. The other moral is that data interpretation should always be mechanistically grounded. No matter how intricate the process through which it is acquired, the data always arises from molecular interactions taking place in individual cells. Understanding the molecular mechanism helps us to reason correctly, to know what data are needed and how to interpret the data we get. Mathematics is an essential tool in this, just as it was for Michaelis and Menten. Perhaps one of the reasons that biochemists of the Warburg generation were so successful, without "critical hesitation," was because Michaelis and others had already provided a sound mechanistic understanding of how individual enzymes worked. These days, systems biologists confront extraordinarily complex multienzyme networks and want to know how they give rise to cellular physiology. We need all the mathematical help we can get.

References

1. Åström, K.J., Murray, R.M.: Feedback Systems: An Introduction for Scientists and Engineers. Princeton University Press, Princeton (2008)
2. Black, J.: Drugs from emasculated hormones: the principles of syntopic antagonism. In: Frängsmyr, T. (ed.) Nobel Lectures, Physiology or Medicine 1981–1990. World Scientific, Singapore (1993)
3. Cajal, S.R.: Advice for a Young Investigator. MIT Press, Cambridge (2004)
4. Ferrell, J.E., Machleder, E.M.: The biochemical basis of an all-or-none cell fate switch in Xenopus oocytes. Science 280, 895–898 (1998)
5. Grodins, F.S.: Control Theory and Biological Systems. Columbia University Press, New York (1963)
6. Gunawardena, J.: Models in systems biology: the parameter problem and the meanings of robustness. In: Lodhi, H., Muggleton, S. (eds.) Elements of Computational Systems Biology. Wiley Book Series on Bioinformatics. Wiley, Hoboken (2010)
7. Gunawardena, J.: A linear elimination framework. http://arxiv.org/abs/1109.6231 (2011)
8. Gunawardena, J.: Modelling of interaction networks in the cell: theory and mathematical methods. In: Egelmann, E. (ed.) Comprehensive Biophysics, vol. 9. Elsevier, Amsterdam (2012)
9. Gutenkunst, R.N., Waterfall, J.J., Casey, F.P., Brown, K.S., Myers, C.R., Sethna, J.P.: Universally sloppy parameter sensitivities in systems biology models. PLoS Comput. Biol. 3, 1871–1878 (2007)
10. Kim, K.A., Spencer, S.L., Albeck, J.G., Burke, J.M., Sorger, P.K., Gaudet, S., Kim, D.H.: Systematic calibration of a cell signaling network model. BMC Bioinformatics 11, 202 (2010)
11. Kirschner, M.: The meaning of systems biology. Cell 121, 503–504 (2005)
12. Kirschner, M.W., Gerhart, J.C.: The Plausibility of Life. Yale University Press, New Haven (2005)
13. Krebs, H.: Otto Warburg: Cell Physiologist, Biochemist and Eccentric. Clarendon, Oxford (1981)
14. Lahav, G., Rosenfeld, N., Sigal, A., Geva-Zatorsky, N., Levine, A.J., Elowitz, M.B., Alon, U.: Dynamics of the p53-Mdm2 feedback loop in individual cells. Nat. Genet. 36, 147–150 (2004)
15. Mettetal, J.T., Muzzey, D., Gómez-Uribe, C., van Oudenaarden, A.: The frequency dependence of osmo-adaptation in Saccharomyces cerevisiae. Science 319, 482–484 (2008)
16. Muzzey, D., Gómez-Uribe, C.A., Mettetal, J.T., van Oudenaarden, A.: A systems-level analysis of perfect adaptation in yeast osmoregulation. Cell 138, 160–171 (2009)
17. Nelson, D.E., Ihekwaba, A.E.C., Elliott, M., Johnson, J.R., Gibney, C.A., Foreman, B.E., Nelson, G., See, V., Horton, C.A., Spiller, D.G., Edwards, S.W., McDowell, H.P., Unitt, J.F., Sullivan, E., Grimley, R., Benson, N., Broomhead, D., Kell, D.B., White, M.R.H.: Oscillations in NF-κB control the dynamics of gene expression. Science 306, 704–708 (2004)
18. Neumann, L., Pforr, C., Beaudoin, J., Pappa, A., Fricker, N., Krammer, P.H., Lavrik, I.N., Eils, R.: Dynamics within the CD95 death-inducing signaling complex decide life and death of cells. Mol. Syst. Biol. 6, 352 (2010)
19. Rand, D.A.: Mapping the global sensitivity of cellular network dynamics: sensitivity heat maps and a global summation law. J. R. Soc. Interface 5(Suppl 1), S59–S69 (2008)
20. Segel, L.: On the validity of the steady-state assumption of enzyme kinetics. Bull. Math. Biol. 50, 579–593 (1988)
21. Spencer, S.L., Gaudet, S., Albeck, J.G., Burke, J.M., Sorger, P.K.: Non-genetic origins of cell-to-cell variability in TRAIL-induced apoptosis. Nature 459, 428–433 (2009)
22. Strogatz, S.H.: Nonlinear Dynamics and Chaos: with Applications to Physics, Biology, Chemistry and Engineering. Perseus Books, Cambridge (2001)
23. Yi, T.M., Huang, Y., Simon, M.I., Doyle, J.: Robust perfect adaptation in bacterial chemotaxis through integral feedback control. Proc. Natl. Acad. Sci. U.S.A. 97, 4649–4653 (2000)