Choosing a Fast Initial Propagator for Rapid Convergence of the Parareal Algorithm in the Context of Simple Model Problems

Thomas Roy
University of Oxford
Supervisors: Andy Wathen, Debbie Samaddar

A technical report for InFoMM CDT Mini-Project 2, in partnership with Culham Centre for Fusion Energy
Trinity 2016

Contents

1 Introduction
2 The Parareal Algorithm
  2.1 Time-stepping methods
  2.2 Convergence Results from the Literature
  2.3 Properties and Options for Parareal
  2.4 Choice of the Coarse Solver G
3 Models
  3.1 Lorenz System
  3.2 Wave Equation
4 Numerical Results
  4.1 Scalar Linear Problem
  4.2 Lorenz System
  4.3 Wave Equation
5 Discussion
6 Conclusion and Further Work
References
A Numerical Methods

1 Introduction

In recent decades, advancements in hardware have made possible the numerical solution of increasingly complex models. However, these advancements are limited by the more recent stagnation in CPU clock speed. These limits have justified the focus on efficient parallel hardware and algorithms. In general, numerical solvers are parallelized through the spatial variables, i.e. by separating the spatial domain into independent subdomains assigned to different CPUs. There have been multiple successful efforts to extend this to temporal parallelization in the case of time-dependent ordinary differential equations (ODEs), or time-dependent partial differential equations (PDEs) where spatial parallelization is saturated. These parallel-in-time methods are intrinsically more challenging due to causality: the later solution depends on the earlier solution. Over the last 50 years, a variety of time-parallel time integration methods have been introduced (see [5] for a survey of current methods). Different strategies include multiple shooting methods, domain decomposition and waveform relaxation, space-time multigrid, and direct time-parallel methods.
This research focuses on the Parareal algorithm, introduced by Lions, Maday, and Turinici in [11]. This algorithm is a multiple shooting method where a fast initial propagator gives a coarse approximation of the solution on the whole time domain, while a fine solver is used to obtain more accurate solutions on independent smaller subdomains. The choice of these components determines whether the overall Parareal iteration converges and, if so, its rate of convergence (contraction).

The Culham Centre for Fusion Energy (CCFE) is interested in Parareal for complicated physics simulations associated with plasmas, particularly in the behaviour of plasmas at the edge of the system where neutral transport becomes important. Previous attempts have been made by CCFE to characterise the behaviour of the algorithm in these contexts [13, 17]. CCFE seeks a better understanding of how the fast initial propagator affects the outcome and performance of the Parareal algorithm. The goal of this project is to go back to simpler problems in order to determine which factors are favourable for the convergence and stability of the algorithm.

In Section 2, we detail the Parareal algorithm in a very general formulation. Then, we introduce the needed concepts of numerical analysis before detailing theoretical results from the literature. We discuss the different choices of parameters for the algorithm, including the order of convergence of the coarse solver. In Section 3, we detail the different models for which the theoretical results will be tested. In Section 4, we compare theoretical results with numerical results for our test models. In Section 5, we discuss the importance of our results and include observations on the use of multi-step methods.

2 The Parareal Algorithm

In this section, we describe the Parareal algorithm as done in [6, 7].
We consider a system of ordinary differential equations (ODEs) of the form

$$ u'(t) = f(t, u(t)), \quad t \in [0, T], \quad u(0) = u^0, \tag{1} $$

where $f : [0, T] \times \mathbb{R}^M \to \mathbb{R}^M$ and $u : \mathbb{R} \to \mathbb{R}^M$. For the Parareal algorithm, we decompose the time domain $\Omega = [0, T]$ into $N$ time subdomains (time chunks, time slices) $\Omega_n = [T_n, T_{n+1}]$, $n = 0, 1, \dots, N-1$, with $0 = T_0 < T_1 < \dots < T_{N-1} < T_N = T$, and $\Delta T_n = T_{n+1} - T_n$. On each time subdomain $\Omega_n$, $n = 0, \dots, N-1$, we consider the problem

$$ u_n'(t) = f(t, u_n(t)), \quad t \in [T_n, T_{n+1}], \quad u_n(T_n) = U_n, \tag{2} $$

where the initial values $U_n$ are given by the matching condition

$$ U_0 = u^0, \quad U_n = u_{n-1}(T_n, U_{n-1}), \quad n = 1, \dots, N-1, \tag{3} $$

where $u_{n-1}(T_n, U_{n-1})$ denotes the solution of (2) on $\Omega_{n-1}$ with the initial condition $u_{n-1}(T_{n-1}) = U_{n-1}$ after time $\Delta T_{n-1}$. Letting $U = (U_0^\top, \dots, U_{N-1}^\top)^\top$, we rewrite the system (3) in the form

$$ F(U) = \begin{pmatrix} U_0 - u^0 \\ U_1 - u_0(T_1, U_0) \\ \vdots \\ U_{N-1} - u_{N-2}(T_{N-1}, U_{N-2}) \end{pmatrix} = 0, \tag{4} $$

where $F : \mathbb{R}^{M \times N} \to \mathbb{R}^{M \times N}$. Solving this with Newton's method leads to the process

$$ U^{k+1} = U^k - J_F^{-1}(U^k) F(U^k), \tag{5} $$

where $J_F$ denotes the Jacobian of $F$. We can expand this into the following recurrence:

$$ \begin{cases} U_0^{k+1} = u^0, \\ U_{n+1}^{k+1} = u_n(T_{n+1}, U_n^k) + \dfrac{\partial u_n}{\partial U_n}(T_{n+1}, U_n^k)(U_n^{k+1} - U_n^k), \end{cases} \tag{6} $$

where $n = 0, \dots, N-2$. In general, the Jacobian terms in (6) are too expensive to compute exactly. Instead, the Parareal algorithm uses two approximations with different accuracy: let $F(T_n, T_{n+1}, U_n)$ be an accurate approximation of the solution $u_n(T_{n+1}, U_n)$ on the time subdomain $\Omega_n$, and let $G(T_n, T_{n+1}, U_n)$ be a less accurate approximation, obtained for example on a coarser grid, with a lower order method, or with a simpler model than (1).
Then, we approximate the solution on the time subdomains in (2) by $u_n(T_{n+1}, U_n^k) \approx F(T_n, T_{n+1}, U_n^k)$, and the Jacobian terms in (6) by

$$ \frac{\partial u_n}{\partial U_n}(T_{n+1}, U_n^k)(U_n^{k+1} - U_n^k) \approx G(T_n, T_{n+1}, U_n^{k+1}) - G(T_n, T_{n+1}, U_n^k). \tag{7} $$

This gives us an approximation to (6) given by

$$ \begin{cases} U_0^{k+1} = u^0, \\ U_{n+1}^{k+1} = F(T_n, T_{n+1}, U_n^k) + G(T_n, T_{n+1}, U_n^{k+1}) - G(T_n, T_{n+1}, U_n^k), \end{cases} \tag{8} $$

which is the Parareal algorithm introduced in [11]. A natural initial guess for (8) is the coarse solution, i.e. $U_{n+1}^0 = G(T_n, T_{n+1}, U_n^0)$. Let $H(T_n, T_{n+1}, U_n^k) = F(T_n, T_{n+1}, U_n^k) - G(T_n, T_{n+1}, U_n^k)$, so that (8) can be written as $U_{n+1}^{k+1} = G(T_n, T_{n+1}, U_n^{k+1}) + H(T_n, T_{n+1}, U_n^k)$. We illustrate the recurrence relation (8) in Figure 1. The coarse solutions given by $G$ are computed serially, while the fine solutions given by $F$ can be computed in parallel, with each subproblem (2) assigned to a different CPU.

Figure 1: The recurrence relation (8).

2.1 Time-stepping methods

In this section, we introduce key notions for time-stepping methods [8, 9] and the methods considered in this report. For more details on the time-stepping methods, see Appendix A. We first consider the following initial value problem

$$ y'(t) = \lambda y, \quad y(0) = 1, \tag{9} $$

the famous Dahlquist test equation. For one-step methods, we can always find a function $R(z)$ such that the method applied to (9) may be written as

$$ y_{n+1} = R(z) y_n, \tag{10} $$

where $z = \Delta t \, \lambda$.

Definition 2.1. The function $R(z)$ is called the stability function of the method. It can be interpreted as the numerical solution after one step for the Dahlquist test equation. The set

$$ S = \{ z \in \mathbb{C} ; \ |R(z)| \le 1 \} \tag{11} $$

is called the stability domain or stability region or region of absolute stability of the method.

Definition 2.2. A method whose stability domain satisfies

$$ S \supset \mathbb{C}^- = \{ z ; \ \mathrm{Re}(z) \le 0 \} $$

is called A-stable.

This concept of absolute stability can be extended beyond the scalar case. Consider a linear system

$$ y'(t) = A y(t), \tag{12} $$

where $A$ is a constant $m \times m$ matrix.
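As a small illustration tying the recurrence (8) to the test equation (9), note that for one-step propagators applied to (9) both F and G reduce to multiplication by a power of a stability function. The sketch below is not the setup used in this report: it assumes equal time slices, takes G to be one backward Euler step per slice and F to be `fine_steps` backward Euler substeps per slice, and also stores the value at $T_N$ for convenience. The function names and parameters are hypothetical.

```python
def backward_euler_R(z):
    """Stability function of backward Euler, R(z) = 1 / (1 - z)."""
    return 1.0 / (1.0 - z)


def parareal_dahlquist(lam, T, N, fine_steps, iterations, u0=1.0):
    """Parareal recurrence (8) for y' = lam * y on N equal slices of [0, T].

    Assumption for illustration: G is one backward Euler step per slice,
    F is `fine_steps` backward Euler substeps per slice.
    Returns the list [U_0, ..., U_N] after `iterations` Parareal iterations.
    """
    dT = T / N
    g = backward_euler_R(lam * dT)                              # coarse propagator
    f = backward_euler_R(lam * dT / fine_steps) ** fine_steps   # fine propagator

    # Initial guess: serial coarse solution, U_{n+1}^0 = G(U_n^0).
    U = [u0]
    for n in range(N):
        U.append(g * U[n])

    for _ in range(iterations):
        # Propagation of the previous iterate; each entry is independent of
        # the others, so the fine solves could run in parallel.
        fine_old = [f * U[n] for n in range(N)]
        coarse_old = [g * U[n] for n in range(N)]
        # Serial coarse sweep with the correction F(U_n^k) - G(U_n^k).
        U_new = [u0]
        for n in range(N):
            U_new.append(fine_old[n] + g * U_new[n] - coarse_old[n])
        U = U_new
    return U


def fine_serial(lam, T, N, fine_steps, u0=1.0):
    """Reference: the fine propagator applied serially over all slices."""
    f = backward_euler_R(lam * (T / N) / fine_steps) ** fine_steps
    return [u0 * f ** n for n in range(N + 1)]


if __name__ == "__main__":
    lam, T, N, m = -1.0, 2.0, 10, 50
    reference = fine_serial(lam, T, N, m)
    for k in range(4):
        U = parareal_dahlquist(lam, T, N, m, k)
        error = max(abs(a - b) for a, b in zip(U, reference))
        print(f"iteration {k}: max error vs. serial fine solution = {error:.3e}")
```

In this linear setting the iterate agrees with the serial fine solution on the first k slices after k iterations, so the interesting quantity is how quickly the error decays for k much smaller than N.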
For simplicity, we suppose that $A$ is diagonalizable, which means it has a set of $m$ linearly independent eigenvectors $v_p$ such that $A v_p = \lambda_p v_p$ for $p = 1, \dots, m$, where $\lambda_p$ are the corresponding eigenvalues. Let $P = [v_1, \dots, v_m]$ be the matrix of eigenvectors and $D = \mathrm{diag}(\lambda_1, \dots, \lambda_m)$ be the diagonal matrix of eigenvalues, then

$$ A = P D P^{-1} \quad \text{and} \quad D = P^{-1} A P. \tag{13} $$

Let $u(t) = P^{-1} y(t)$. We can rewrite (12) as

$$ u' = D u. \tag{14} $$

This is a diagonal system of equations that we decouple into $m$ independent scalar equations of the form

$$ u_p' = \lambda_p u_p, \quad \text{for } p = 1, \dots, m, \tag{15} $$

where $u_p$ are the components of $u$. For the overall method to be stable, each of the scalar problems must be stable, and this requires $\Delta t \, \lambda_p$ to be in the stability region of the method for $p = 1, \dots, m$. This can be rewritten as a condition on the spectral radius of the matrix $A$, $\rho(A)$.

The concept of absolute stability does not directly apply to nonlinear systems. As in [9], we will consider a linearized approximation of the nonlinear system

$$ y' = f(t, y). \tag{16} $$

Let $\varphi(t)$ be a smooth solution of (16).
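The eigenvalue criterion for the linear system (12) can be checked numerically. The following is a minimal sketch, assuming NumPy is available. The helper `is_absolutely_stable` and the example matrix are hypothetical and chosen only for illustration. Forward and backward Euler are used because their stability functions, $R(z) = 1 + z$ and $R(z) = 1/(1 - z)$, are standard.

```python
import numpy as np


def forward_euler_R(z):
    """Stability function of forward Euler, R(z) = 1 + z."""
    return 1.0 + z


def backward_euler_R(z):
    """Stability function of backward Euler, R(z) = 1 / (1 - z)."""
    return 1.0 / (1.0 - z)


def is_absolutely_stable(A, dt, R):
    """Check whether dt * lambda_p lies in the stability region of the
    method with stability function R, for every eigenvalue lambda_p of A."""
    eigenvalues = np.linalg.eigvals(A)
    return bool(np.all(np.abs(R(dt * eigenvalues)) <= 1.0))


if __name__ == "__main__":
    # Hypothetical example: eigenvalues -1 and -100 (distinct, so A is
    # diagonalizable).
    A = np.array([[-1.0, 1.0],
                  [0.0, -100.0]])
    for dt in (0.01, 0.05):
        print(dt,
              is_absolutely_stable(A, dt, forward_euler_R),
              is_absolutely_stable(A, dt, backward_euler_R))
```

For forward Euler and real negative eigenvalues, as in this example, the criterion reduces to $\Delta t \, \rho(A) \le 2$, which is one instance of the spectral radius condition mentioned above.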
