On the use of reduced basis methods to accelerate and stabilize the parareal method

Feng Chen, Jan S. Hesthaven, Xueyu Zhu

Abstract We propose a modified parallel-in-time (parareal) multi-level time integration method which, in contrast to previous methods, uses a coarse solver based on information from the fine solver at each iteration through the construction of a reduced model. This approach is demonstrated to offer two substantial advantages: it accelerates convergence of the original parareal method, and we furthermore demonstrate that the reduced basis stabilizes the parareal method for purely advective problems where instabilities are known to arise. When combined with empirical interpolation techniques (EIM), we discuss the use of this approach to solve both linear and nonlinear problems and highlight the minimal changes required to utilize this algorithm to accelerate existing implementations. We illustrate the advantages through the algorithmic design, through analysis of stability, convergence, and computational complexity, and through several numerical examples.

1 Introduction

With the number of computational cores increasing, the demands on the scalability of computational methods likewise increase, due partly to an increasing imbalance between the cost of memory access, communication, and arithmetic operations. Among other things, traditional domain decomposition methods tend to stagnate in scaling as the number of cores increases, since the computational cost is overwhelmed by other tasks. This suggests the need to consider the development of computational techniques that better balance these constraints and allow for the acceleration of large scale computational challenges.

Feng Chen, Brown University, 182 George Street, Providence, RI 02912, e-mail: [email protected]
Jan S. Hesthaven, Brown University, 182 George Street, Providence, RI 02912, e-mail: [email protected]
Xueyu Zhu, Brown University, 182 George Street, Providence, RI 02912, e-mail: [email protected]

A recent development in this direction is the parareal method, introduced in [15], which provides a strategy for 'parallel-in-time' computations in a way that offers the potential for an increased level of parallelism. Relying on a computationally inexpensive but inaccurate solver and an accurate but expensive solver, the parareal method utilizes an iterative, predictor-corrector procedure that allows the expensive solver to run across many processors in parallel. Under suitable conditions, the parareal iteration is expected to converge after a small number of iterations to the serial solution [11]. During the last decade, the parareal method has been applied successfully to a number of applications (cf. [16, 17]), demonstrating its potential, accuracy, and robustness.

As a central and serial component, the coarse solver can impact the efficiency and stability of the parareal algorithm. For example, if an explicit scheme is used in both the coarse and the fine stage of the algorithm, the efficiency of the parareal algorithm is limited by the upper bound on the time step size [17]. One can naturally consider a different temporal integration approach, such as an implicit scheme, although the cost of this can be considerable and often requires the development of a new solver. An attractive alternative is to use a simplified physics model as the coarse solver [2, 16], thereby ignoring small scale phenomena but potentially impacting the accuracy. However, the success of such an approach is problem specific.

While the choice of the coarse solver impacts accuracy and overall efficiency, the stability of the parareal method is considerably more subtle. For parabolic and diffusion dominated problems, stability is well understood and observed in many applications [12]. However, for hyperbolic and convection dominated problems, the question of stability is considerably more complex and remains open [20, 8, 3]. In [8], the authors propose to project the solution after each iteration onto an energy manifold approximated by the fine solution. The performance of this projection method was demonstrated for the linear wave equation and the nonlinear Burgers' equation. Another attempt, the Krylov subspace parareal method, builds a new coarse solver by reusing all information from the corresponding fine solver at previous iterations. The stability of this approach was demonstrated for linear structural dynamics [10] and a linear 2-D acoustic-advection system [18]. However, the Krylov subspace parareal method appears to be limited to linear problems.

Inspired by the Krylov subspace parareal method, we propose a modified parareal method, referred to as the reduced basis parareal method. In this approach, the Krylov subspace is replaced by a subspace spanned by a reduced order basis, constructed from the fine solver. This method inherits most advantages of the Krylov subspace parareal method and remains stable and convergent for linear wave problems. We demonstrate that this approach accelerates the convergence in situations where the original parareal method already converges.
However, this new approach also allows us to overcome some of the issues discussed above: (i) it deals with nonlinear problems by incorporating methodologies from reduced basis methods; and (ii) the traditional coarse propagator is needed only at the very beginning of the algorithm to generate an initial reduced basis, allowing time step restrictions to be relaxed when compared to the coarse solver of the original parareal method.

The remainder of this paper is organized as follows. We first review the original parareal method in Section 2.1 and the Krylov subspace parareal method in Section 2.2. This sets the stage for Section 2.3, where we introduce the reduced basis parareal method and discuss different strategies to develop reduced models for problems with nonlinear terms. Section 3 offers some analysis of the stability, convergence, and complexity of the reduced basis parareal method, and Section 4

demonstrates the feasibility and performance of the reduced basis parareal method through various linear and nonlinear numerical examples. We conclude the paper in Section 5.

2 Parareal algorithms

To set the stage for the discussion of the reduced basis parareal method, let us first discuss the original and the Krylov subspace parareal methods in Section 2.1 and Section 2.2, respectively. We shall highlight issues related to stability and computational complexity to motivate the reduced basis parareal method, introduced in Section 2.3.

2.1 The original parareal method

Consider the following initial value problem:

$u_t = L(u) := Au(t) + N(u(t)), \quad t \in (0,T], \qquad u(0) = u_0,$  (1)

where $u \in \mathbb{R}^N$ is the unknown solution, $L$ is an operator, possibly arising from the spatial discretization of a PDE, $A$ is the linear part of $L$, and $N$ is the nonlinear part.

In the following, we denote by $F_{\delta t}$ the accurate but expensive fine time integrator, using a constant time step size $\delta t$. Furthermore, $G_{\Delta t}$ is the inaccurate but fast coarse time integrator using a larger time step size $\Delta t$. Generally it is assumed that $\Delta t \gg \delta t$.

The original parareal method is designed to solve (1) in a parallel-in-time fashion to accelerate the computation by exploiting parallel efficiencies. First, $[0,T]$ is decomposed into $N_c$ coarse time intervals or elements:

$0 = t_0 < \cdots < t_i < \cdots < t_{N_c} = T, \quad t_i = i\Delta T, \quad \Delta T = \frac{T}{N_c}.$  (2)

Assume that

$\Delta T = N_f\,\delta t, \quad N_f \in \mathbb{N},$  (3)

which implies that $T = N_c N_f \delta t$. Denote by $F_{\delta t}(u, t_{i+1}, t_i)$ the accurate numerical solution integrated from $t_i$ to $t_{i+1}$ using $F_{\delta t}$ with initial condition $u$ and constant time step size $\delta t$; similarly for $G_{\Delta t}(u, t_{i+1}, t_i)$. Denote also by $u_n = F_{\delta t}(u_0, T, 0)$ the numerical solution generated using only the fine integrator.

Now assume that the $k$-th iterated approximation $u_n^k$ is known. The parareal approach proceeds to the $(k+1)$-th iteration through the following scheme

$u_{n+1}^{k+1} = G_{\Delta t}(u_n^{k+1}, t_{n+1}, t_n) + F_{\delta t}(u_n^k, t_{n+1}, t_n) - G_{\Delta t}(u_n^k, t_{n+1}, t_n), \quad 0 \le k \le N_c - 1.$  (4)

It is easy to see that $F_{\delta t}(u_n^k, t_{n+1}, t_n)$ can be computed in parallel over all temporal elements. If we take the limit $k \to \infty$ and assume that the limit of $\{u_n^k\}$ exists, we obtain [15]:

$u_{n+1}^{k+1} \to u_{n+1} = F_{\delta t}(u_n, t_{n+1}, t_n).$  (5)

In order to achieve a reasonable efficiency, the number of iterations, $N_{it}$, should be much smaller than $N_c$. With the notation introduced above, the original parareal method is outlined in Algorithm 1.

Initialization:
  $u_0^0 = u_0$
  for $i \leftarrow 0$ to $N_c$ do
    $u_{i+1}^0 = G_{\Delta t}(u_i^0, t_{i+1}, t_i)$
  end
Iterations:
  for $k \leftarrow 0$ to $N_{it}$ do
    Parallel predictor step:
    for $i \leftarrow 0$ to $N_c$ do
      $uf_{i+1}^k = F_{\delta t}(u_i^k, t_{i+1}, t_i)$
    end
    Sequential correction step:
    for $i \leftarrow 0$ to $N_c$ do
      $u_{i+1}^{k+1} = G_{\Delta t}(u_i^{k+1}, t_{i+1}, t_i) + uf_{i+1}^k - G_{\Delta t}(u_i^k, t_{i+1}, t_i)$
    end
  end

Algorithm 1: The original parareal method
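To make the structure of Algorithm 1 concrete, the following is a minimal Python sketch of the iteration for the model problem (1). It is a sketch under stated assumptions, not part of the original formulation: the propagators are plain forward-Euler loops standing in for $G_{\Delta t}$ and $F_{\delta t}$, all names are illustrative, and the fine predictor loop is written serially although it is precisely the part that would be distributed across processors.

    import numpy as np

    def propagate(u, t0, t1, dt, rhs):
        # Forward-Euler propagator from t0 to t1 with step dt; a stand-in
        # for G (large dt) or F (small dt) applied to u' = rhs(u).
        for _ in range(int(round((t1 - t0) / dt))):
            u = u + dt * rhs(u)
        return u

    def parareal(u0, T, Nc, Nit, Dt, dt, rhs):
        # Original parareal iteration (Algorithm 1).
        t = np.linspace(0.0, T, Nc + 1)
        u = [np.asarray(u0, dtype=float)] * (Nc + 1)
        for i in range(Nc):                      # serial coarse initialization
            u[i + 1] = propagate(u[i], t[i], t[i + 1], Dt, rhs)
        for k in range(Nit):
            # Parallel predictor step: independent fine solves per interval.
            uf = [propagate(u[i], t[i], t[i + 1], dt, rhs) for i in range(Nc)]
            # Sequential correction step.
            unew = [u[0]]
            for i in range(Nc):
                g_new = propagate(unew[i], t[i], t[i + 1], Dt, rhs)
                g_old = propagate(u[i], t[i], t[i + 1], Dt, rhs)
                unew.append(g_new + uf[i] - g_old)
            u = unew
        return u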

Let us demonstrate the performance of the original parareal method with a few numerical examples. Consider first the viscous Burgers’ equation:

$u_t + \left(\frac{u^2}{2}\right)_x = \nu u_{xx}, \quad (x,t) \in (0,2\pi) \times (0,T], \qquad u(x,0) = \sin(x),$  (6)

where $T = 2$ and $\nu = 10^{-1}$, and a $2\pi$-periodic boundary condition is used. The spatial discretization is a P1 discontinuous Galerkin (DG) method with 100 elements [14]. The time integrator is a first-order forward Euler method, and we use the following parameters in the parareal integration

$N_c = 100, \quad N_{it} = 5, \quad \Delta t = 10^{-3}, \quad \delta t = 10^{-4}.$  (7)

In Figure 1, we show the $L_\infty$-error of the parareal solution at $T = 2$ against the number of iterations. Notice that for this nonlinear problem the algorithm converges after only four iterations, resulting in an expected reasonable acceleration in a parallel environment.


Fig. 1: The $L_\infty$-error at $T$ against the number of iterations for the 1-D Burgers' equation using the original parareal method

As a second example we consider the Kuramoto-Sivashinsky equation [21]:

$\frac{\partial u}{\partial t} = -\left(\frac{u^2}{2}\right)_x - u_{xx} - u_{xxxx}, \quad (x,t) \in (-8,8) \times (0,T], \qquad u(x,0) = \exp(-x^2),$  (8)

with $T = 40$ and periodic boundary conditions. As spatial discretization we use a Fourier spectral method with 128 points [13], and the time integrator is an IMEX scheme [1], with the linear terms treated implicitly and the nonlinear term explicitly. The parameters in the parareal method are taken as

$N_c = 100, \quad N_{it} = 5, \quad \Delta t = 10^{-2}, \quad \delta t = 10^{-4}.$  (9)

Figure 2 (left) shows the time evolution of the chaotic solution to the Kuramoto-Sivashinsky equation with the Gaussian initial condition. In Figure 2 (right), we show the $L_\infty$-error at $T = 40$ against the number of iterations. In this case we use the solution obtained with the fine solver as the exact solution. It is clear that the parareal solution also converges, albeit slower. However, it should also be noted that $\Delta t/\delta t = 100$, again indicating the potential for a substantial acceleration. As a last and less encouraging example, we consider the 1-D advection equation

$u_t + a u_x = 0, \quad (x,t) \in (0,2\pi) \times (0,T], \qquad u(x,0) = \exp(\sin(x)),$  (10)


Fig. 2: A snapshot of the solution (left) and the $L_\infty$-error at $T$ against the number of iterations (right) for the 1-D Kuramoto-Sivashinsky equation using the original parareal method

Here $T = 10$, $a = 2\pi$, and $2\pi$-periodic boundary conditions are assumed, so that the exact solution is $u(x,t) = \exp(\sin(x - at))$. We use a DG method of order 32 with 2 elements in space [14], a fully implicit fourth-order Runge-Kutta scheme [19] in time, and the following parareal parameters:

$N_c = 100, \quad N_{it} = 27, \quad \Delta t = 5 \times 10^{-2}, \quad \delta t = 10^{-4}.$  (11)

In Figure 3, we show the $L_\infty$-error at $T = 10$ against the number of iterations. The instability of the original parareal method is apparent, as has also been observed by others [3, 20, 8].


Fig. 3: The $L_\infty$-error at $T$ against the number of iterations for the 1-D linear advection equation using the original parareal method

2.2 The Krylov subspace parareal method

We notice that in Algorithm 1 only $\{uf_{i+1}^k\}_{i=0}^{N_c-1}$ is used in the advancement of the solution to iteration $k+1$. To fix the stability issue, [10] proposed to improve the coarse solver by reusing information computed at all previous iterations, and applied this idea to linear hyperbolic problems in structural dynamics. Recently, a similar idea was successfully applied to linear hyperbolic problems [18].

The basic idea of the Krylov subspace parareal method is to project $u_i^{k+1}$ onto a subspace spanned by all numerical solutions integrated by the fine solver at previous iterations. Denote this subspace by

$S^k := \mathrm{span}\{uf_i^j,\ 0 \le i \le N_c,\ 0 \le j \le k\}.$  (12)

The corresponding orthogonal basis set $\{s_1, \ldots, s_r\}$ is constructed by a full QR decomposition. Denote by $P^k$ the $L_2$-orthogonal projection onto $S^k$. The previous coarse solver $G_{\Delta t}$ is replaced by $K_{\Delta t}$ as follows:

$K_{\Delta t} = G_{\Delta t}((I - P^k)u, t_{i+1}, t_i) + F_{\delta t}(P^k u, t_{i+1}, t_i).$  (13)

For a linear problem, $F_{\delta t}(P^k u, t_{i+1}, t_i)$ can be computed efficiently as

$F_{\delta t}(P^k u, t_{i+1}, t_i) = F_{\delta t}\Big(\sum_{j=1}^{N_{ck}} C_j s_j, t_{i+1}, t_i\Big) = \sum_{j=1}^{N_{ck}} C_j F_{\delta t}(s_j, t_{i+1}, t_i),$  (14)

where the $F_{\delta t}(s_j, t_{i+1}, t_i)$ are computed and stored once the $s_j$'s are available. Because this approach essentially produces an approximation to the fine solver, the new coarse solver is expected to be more accurate than the old coarse solver. It was shown in [11] that as the dimension of $S^k$ increases, $P^k \to I$ and $K_{\Delta t} \to F_{\delta t}$, achieving convergence. The algorithm is outlined in Algorithm 2.

To demonstrate the performance of the Krylov subspace parareal method, we use it to solve the linear advection equation (10). In Figure 4 (left) we show the $L_\infty$-error at $T = 10$ against the number of iterations. It is clear that the Krylov subspace parareal method stabilizes the parareal solver for this problem.

Two observations are worth making. First, the Krylov subspace parareal method needs to store all the values of $S^k$ and $F(S^k)$. As $k$ increases, this memory requirement of size $O(kN_cN)$ becomes demanding, as illustrated in Figure 4 (right). Secondly, the efficiency of the coarse solver critically depends on the assumption of linearity of the operator, and it is not clear how to extend this framework to nonlinear problems. These constraints appear to limit the practicality of the method.


Fig. 4: The $L_\infty$-error at $T$ against the number of iterations (left), and the number of basis functions (right), for the 1-D linear advection equation solved using the Krylov subspace parareal method

Initialization:
  $u_0^0 = u_0$
  for $i \leftarrow 0$ to $N_c$ do
    $u_{i+1}^0 = G_{\Delta t}(u_i^0, t_{i+1}, t_i)$
  end
Iterations:
  for $k \leftarrow 0$ to $N_{it}$ do
    Parallel predictor step:
    for $i \leftarrow 0$ to $N_c$ do
      $uf_{i+1}^k = F_{\delta t}(u_i^k, t_{i+1}, t_i)$
    end
    Constructing the basis:
      update $S^{k-1}$ to $S^k$ based on $u_{i-1}^k$, $uf_i^k$
    Marching the basis:
    for $i \leftarrow 1$ to $N_r$ do
      $Sf_i = F_{\delta t}(s_i, \Delta t, 0)$
    end
    Sequential correction step:
    for $i \leftarrow 0$ to $N_c$ do
      $u_{i+1}^{k+1} = K_{\Delta t}(u_i^{k+1}, t_{i+1}, t_i) + uf_{i+1}^k - K_{\Delta t}(u_i^k, t_{i+1}, t_i)$
    end
  end

Algorithm 2: The Krylov subspace parareal method
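The key data structures of Algorithm 2 are compact in code. The sketch below is a simplified reading for the linear case where (14) applies, not a verbatim implementation: the snapshot matrix is orthogonalized by QR, each basis vector is propagated once with the fine solver, and the improved coarse solver (13) is returned as a closure. Here fine and coarse are hypothetical one-interval propagators.

    import numpy as np

    def build_krylov_coarse(snapshots, fine, coarse):
        # snapshots: N x (stored fine solutions), columns spanning S^k.
        Q, _ = np.linalg.qr(snapshots)           # orthonormal basis s_1,...,s_r
        # Propagate each basis vector once with the fine solver and store it.
        FQ = np.column_stack([fine(Q[:, j]) for j in range(Q.shape[1])])

        def K(u):
            c = Q.T @ u                          # coefficients of P^k u
            return coarse(u - Q @ c) + FQ @ c    # G on (I - P^k)u, F on P^k u
        return K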

2.3 The reduced basis parareal method

Let us first recall a few properties of reduced basis methods which will subsequently serve as a key building block of the proposed reduced basis parareal method.

2.3.1 Reduced basis methods

We are generally interested in the nonlinear ODE (1). As a system, the dimensionality of the problem can be very large, e.g., if the problem originates from a method-of-lines discretization of a nonlinear PDE. To achieve a high accuracy, the number of degrees of freedom, $N$, can be extremely large, and it is tempting to identify approximate lower dimensional models to improve the efficiency. The general representation of a reduced model in matrix form can be expressed as

$u(t) \approx V_r \tilde{u}(t),$  (15)

where the $r$ columns of the matrix $V_r$ span a linear space (the reduced basis) and $\tilde{u}(t) \in \mathbb{R}^r$ are the coefficients in the reduced basis. Projecting the ODE system (1) onto $V_r$, we recover the reduced system:

$V_r^T V_r \frac{d\tilde{u}(t)}{dt} = V_r^T A V_r \tilde{u}(t) + V_r^T N(V_r \tilde{u}(t)).$  (16)

Assuming that $V_r$ is orthonormal, we recover

$\frac{d\tilde{u}(t)}{dt} = V_r^T A V_r \tilde{u}(t) + V_r^T N(V_r \tilde{u}(t)).$  (17)
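As a concrete illustration of (17), the sketch below assembles and integrates the Galerkin-reduced system for a given orthonormal basis. Note that the nonlinear term is still evaluated at the full dimension $N$, which is exactly the cost that the interpolation techniques discussed next remove. The forward Euler integrator and all names are illustrative assumptions, not prescribed by the method.

    import numpy as np

    def integrate_reduced(Vr, A, N_func, u0, dt, nsteps):
        # Vr: N x r orthonormal basis; A: N x N linear part; N_func: nonlinearity.
        Ar = Vr.T @ A @ Vr            # r x r reduced operator, precomputed once
        ut = Vr.T @ u0                # reduced coefficients of the initial state
        for _ in range(nsteps):       # forward Euler on the reduced system (17)
            ut = ut + dt * (Ar @ ut + Vr.T @ N_func(Vr @ ut))
        return Vr @ ut                # lift the result back to R^N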

Following this generic introduction, one is left with specifying how to choose a good subspace $V_r$ to adequately represent the dynamic behavior of the solution, and how to recover the coefficients of the reduced model in an efficient manner. There are several ways to address this question, mostly based on the construction of $V_r$ through the use of snapshots of the solution.

Proper orthogonal decomposition. The proper orthogonal decomposition (POD) [6, 5] is perhaps the most widely used approach to generate a reduced basis from a collection of snapshots. In this case, we assume we have a collection of $N_s$ snapshots

$U = [u_1, \ldots, u_{N_s}],$  (18)

where each $u_i$ is a vector of length $N$; this $N$ can be large as it reflects the number of degrees of freedom in the system. The POD basis, denoted by $\{\phi_i\}_{i=1}^r \subset \mathbb{R}^N$, is chosen as the orthonormal vectors that solve the following minimization problem:

$\min_{\phi_i \in \mathbb{R}^N} \sum_{j=1}^{N_s} \Big\|u_j - \sum_{i=1}^{r}(u_j^T\phi_i)\phi_i\Big\|_2^2, \quad \text{subject to } \phi_i^T\phi_j = \delta_{ij} = \begin{cases}1, & i = j,\\ 0, & \text{otherwise}.\end{cases}$  (19)

The solution to this minimization problem is equivalent to the singular value decomposition (SVD) of $U$:

$U = V\Sigma W^T,$  (20)

where $V \in \mathbb{R}^{N \times r}$ and $W \in \mathbb{R}^{N_s \times r}$ contain the left and right singular vectors, respectively, and $V$ is the sought after basis. It is also noteworthy that the entries of the diagonal matrix $\Sigma$ provide a measure of the relative energy of each of the orthogonal vectors in the basis.

Once the basis is available, we can increase the computational efficiency of solving (17) by precomputing $V_r^T A V_r$, which is of size $r \times r$. However, the computational complexity of the nonlinear term remains dependent on $N$ and is, hence, potentially costly.

Discrete empirical interpolation. To address this, [7] proposed an approach, originating in previous work on empirical interpolation methods [4] but based on an existing and discrete basis set, in which $N(V_r\tilde{u}(t))$ is represented by $\tilde{N}(\tau) \in \mathbb{R}^N$, which is subsequently approximated as

$N(V_r\tilde{u}(t)) \approx \tilde{N}(\tau) \approx V_p c(\tau).$  (21)

Here $V_p = [v_1, \ldots, v_m]$ is an orthogonal POD basis set based on snapshots of $N(\tau)$. To recover $c(\tau)$, we must seek a solution to an overdetermined system. However, instead of solving an expensive least squares problem, we extract $m$ equations from the original set of snapshots. Denote

$P = [e_{p_1}, \ldots, e_{p_m}] \in \mathbb{R}^{N \times m},$  (22)

where $e_{p_1} = [0,\ldots,0,1,0,\ldots,0]^T \in \mathbb{R}^N$ (the 1 appears only in the $p_1$-th position of the vector). If $P^T V_p$ is nonsingular, $c(\tau)$ can be uniquely determined by

$P^T \tilde{N}(\tau) = P^T V_p c(\tau),$

resulting in a final approximation of $\tilde{N}(\tau)$ as

$\tilde{N}(\tau) \approx V_p (P^T V_p)^{-1} P^T \tilde{N}(\tau).$

The interpolation index $p_k$ is selected iteratively as the location of the largest magnitude of the residual $r = v_k - V_{p,k}c$. The procedure, sometimes referred to as discrete empirical interpolation, is outlined in Algorithm 3. With the above approximation, we can now express the reduced system as

$\frac{d\tilde{u}(t)}{dt} = V_r^T A V_r \tilde{u}(t) + V_r^T V_p (P^T V_p)^{-1} N(P^T V_r \tilde{u}(t)).$  (23)

input: $\{v_k\}_{k=1}^m \subset \mathbb{R}^N$, linearly independent POD bases of the nonlinear term
output: the interpolation operator $P_m = [p_1, \ldots, p_m]$
begin
  $[\rho, p_1] \leftarrow \max|v_1|$
  $P \leftarrow \{p_1\}$; $V_{p,1} \leftarrow \{v_1\}$
  for $k \leftarrow 2$ to $m$ do
    solve $P^T v_k = P^T V_{p,k} c(\tau)$ to obtain $c(\tau)$
    compute the residual $r = v_k - V_{p,k} c$
    $[\rho, p_k] \leftarrow \max|r|$
    $V_{p,k} \leftarrow V_{p,k-1} \cup \{v_k\}$
    $P_k \leftarrow P_{k-1} \cup \{p_k\}$
  end
end

Algorithm 3: Empirical interpolation with a given basis set
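The greedy point selection of Algorithm 3 translates almost line by line into code. A minimal sketch, assuming the linearly independent basis vectors $v_k$ of the nonlinear term are supplied as the columns of V:

    import numpy as np

    def deim_points(V):
        # V: N x m matrix whose columns are the POD bases v_1,...,v_m.
        N, m = V.shape
        p = [int(np.argmax(np.abs(V[:, 0])))]
        for k in range(1, m):
            # Solve P^T v_k = P^T V_{p,k} c for the current coefficients c.
            c = np.linalg.solve(V[np.ix_(p, range(k))], V[p, k])
            r = V[:, k] - V[:, :k] @ c            # interpolation residual
            p.append(int(np.argmax(np.abs(r))))   # next index: largest |r|
        return p

With the indices p in hand, $P^T V_p$ in (23) is simply the row selection V[p, :], so the nonlinear term only ever needs to be evaluated at the $m$ selected entries.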

Full empirical interpolation. Pursuing the above approach further, one is left wondering whether one could use bases other than the computationally expensive POD basis, and whether the interpolation positions could be chosen based on other guidelines. Addressing these questions leads us to propose a full empirical interpolation method. It is well known that the original empirical interpolation method is commonly used to separate the dependence on parameters and spatial variables [4], and that the method chooses interpolation points that are 'optimal' in a certain sense. In [23], the authors propose to treat time as a parameter, and to use empirical interpolation to construct the reduced basis $V_{E,k}$ of $u$ and the reduced basis $V_{pE,k}$ of the nonlinear term, i.e.,

$u(t) \approx V_E c(\tau), \qquad \tilde{N}(\tau) \approx V_{pE} c(\tau).$  (24)

The resulting reduced model can be written as

$\frac{d\tilde{u}(t)}{dt} = V_{E,k}^T A V_{E,k} \tilde{u}(t) + V_{E,k}^T V_{pE,k} (P^T V_{pE,k})^{-1} N(P^T V_{E,k} \tilde{u}(t)).$  (25)

The essential difference between the models based on discrete empirical interpolation and the full empirical interpolation approach is found in the way in which one constructs the reduced basis set. In the former case, the importance of the basis elements is guided by the SVD and the relative size of the singular values, resulting in a potentially substantial cost. The latter approach is based on the interpolation error, and the basis is constructed in a fully greedy fashion. For more details and a comparison of the performance of the two approaches, we refer to the comparative study for nonlinear model reduction in [23].
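For comparison with Algorithm 3, the following is a minimal sketch of the greedy construction in the full empirical interpolation setting, treating the snapshot index (time) as the parameter: the next basis vector is the snapshot currently worst approximated by the interpolant, and the construction stops at a tolerance eps. This is a simplified reading of the construction in [23], not a verbatim implementation, and it assumes eps is smaller than the largest snapshot entry so that at least one vector is selected.

    import numpy as np

    def eim_basis(U, eps):
        # U: N x Ns snapshot matrix of the quantity to be interpolated.
        basis, points = [], []
        residuals = U.astype(float).copy()
        while True:
            j = int(np.argmax(np.max(np.abs(residuals), axis=0)))
            r = residuals[:, j]                   # worst-approximated snapshot
            if np.max(np.abs(r)) < eps:
                break
            points.append(int(np.argmax(np.abs(r))))
            basis.append(r / r[points[-1]])       # normalize at the new point
            B = np.column_stack(basis)
            # Re-interpolate all snapshots at the selected points.
            C = np.linalg.solve(B[points, :], U[points, :])
            residuals = U - B @ C
        return np.column_stack(basis), points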

2.3.2 The reduced basis parareal method

Let us now introduce the new reduced basis parareal method. Our first observation is that the first term in (13) can be dropped under the assumption that the projection error vanishes asymptotically. Hence, for linear problems, we can replace $K_{\Delta t}$ by $\hat{K}_{\Delta t}$ as

$\hat{K}_{\Delta t} = F_{\delta t}(P^k u, t_{i+1}, t_i) = \sum_{j=1}^{N_{ck}} C_j F_{\delta t}(s_j, t_{i+1}, t_i),$  (26)

which is essentially an approximation to the fine time integrator with an admissible truncation error. Keeping in mind that $F_{\delta t}$ is an expensive operation, we need to reduce the dimension of $S^k$ to achieve a better efficiency. If the solution to the ODE is regular enough, it is possible to find a similar $r$-dimensional subspace, $S_r^k$ (the reduced basis space), of the original space $S^k$. Now define $P^k$ to be the orthogonal projection from $u$ onto $S_r^k$; then (26) becomes (27), which is essentially an approximation to the fine time integrator using the reduced model constructed from the corresponding reduced basis, i.e.,

$\hat{K}_{\Delta t} = F_{\delta t}(P^k u, t_{i+1}, t_i) = \sum_{j=1}^{N_r} C_j F_{\delta t}(s_j, t_{i+1}, t_i).$  (27)

We term the above the reduced basis parareal method. Depending on the construction of the reduced basis, we refer to it as the POD parareal method or the EIM parareal method. Algorithm 4 describes the basic steps of the reduced basis parareal method for linear problems. It follows a procedure similar to Algorithm 2, but requires less memory for storing the basis. For nonlinear problems, the relationship

$F_{\delta t}(P^k u, t_{i+1}, t_i) = \sum_{j=1}^{N_r} C_j F_{\delta t}(s_j, t_{i+1}, t_i)$  (28)

in general does not hold, even if $P^k u \to u$. Therefore, the Krylov subspace approach is not applicable. Fortunately, the development of reduced models using empirical interpolation, as discussed in Section 2.3.1, offers insight into dealing with nonlinear problems. We construct the coarse time integrator as follows:

$\hat{K}_{\Delta t} = F^r_{\delta t}(P^k u, t_{i+1}, t_i),$  (29)

where $F^r_{\delta t}$ is the reduced model. Therefore, as long as there exists a reduced space for the problem, we can evaluate $\hat{K}_{\Delta t}$ efficiently while maintaining an accuracy commensurate with the fine solver. The reduced basis parareal method for nonlinear problems is outlined in Algorithm 5.

Initialization:
  $u_0^0 = u_0$
  for $i \leftarrow 0$ to $N_c$ do
    $u_{i+1}^0 = G_{\Delta t}(u_i^0, t_{i+1}, t_i)$
  end
Iterations:
  for $k \leftarrow 0$ to $N_{it}$ do
    Parallel predictor step:
    for $i \leftarrow 0$ to $N_c$ do
      $uf_{i+1}^k = F_{\delta t}(u_i^k, t_{i+1}, t_i)$
    end
    Constructing the reduced basis:
      $U^k = \{uf_{i+1}^j,\ i = 0,\ldots,N_c,\ j = 0,\ldots,k\}$
      $S = \mathrm{POD}(U^k)$ or $S = \mathrm{EIM}(U^k)$, where $S = \{s_i,\ i = 1,\ldots,N_r\}$
    Marching the basis:
    for $i \leftarrow 1$ to $N_r$ do
      $Sf_i = F_{\delta t}(s_i, \Delta t, 0)$
    end
    Sequential correction step:
    for $i \leftarrow 0$ to $N_c$ do
      compute $C_j$ from $P^k u = \sum_{j=1}^{N_r} C_j s_j$, so that $\hat{K}_{\Delta t}(u, t_{i+1}, t_i) = \sum_{j=1}^{N_r} C_j Sf_j$ (for $u = u_i^{k+1}$ and $u = u_i^k$)
      $u_{i+1}^{k+1} = \hat{K}_{\Delta t}(u_i^{k+1}, t_{i+1}, t_i) + uf_{i+1}^k - \hat{K}_{\Delta t}(u_i^k, t_{i+1}, t_i)$
    end
  end

Algorithm 4: The reduced basis parareal method for a linear problem
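Note that in Algorithm 4 the coarse propagator never calls $G_{\Delta t}$ after the initialization sweep: the fine propagation of each reduced basis vector is precomputed once per iteration ('marching the basis'), and (27) reduces to a small matrix product. A minimal sketch, assuming an orthonormal reduced basis S (columns from POD or EIM) and a hypothetical one-interval fine propagator:

    import numpy as np

    def build_reduced_coarse(S, fine):
        # S: N x Nr orthonormal reduced basis; fine(v) propagates v over one
        # coarse interval with the fine integrator.  March the basis once:
        FS = np.column_stack([fine(S[:, j]) for j in range(S.shape[1])])

        def K_hat(u):
            return FS @ (S.T @ u)    # sum_j C_j F(s_j) with C = S^T u, cf. (27)
        return K_hat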

3 Analysis of the reduced basis parareal method

Let us in the following provide some analysis of the reduced basis parareal method to better understand its stability, convergence and overall computational complexity. Throughout the proofs we assume that there exists a reduced space for the continuous problem.

3.1 Stability analysis

We consider first the linear case.

Initialization:
  $u_0^0 = u_0$
  for $i \leftarrow 0$ to $N_c$ do
    $u_{i+1}^0 = G_{\Delta t}(u_i^0, t_{i+1}, t_i)$
  end
Iterations:
  for $k \leftarrow 0$ to $N_{it}$ do
    Parallel predictor step:
    for $i \leftarrow 0$ to $N_c$ do
      $uf_{i+1}^k = F_{\delta t}(u_i^k, t_{i+1}, t_i)$
    end
    Constructing the reduced basis:
      $U^k = \{uf_{i+1}^j,\ i = 0,\ldots,N_c,\ j = 0,\ldots,k\}$
      $S = \mathrm{POD\text{-}DEIM}(U^k)$ or $S = \mathrm{EIM}(U^k)$, where $S = \{s_i,\ i = 1,\ldots,N_r\}$
    Sequential correction step:
    for $i \leftarrow 0$ to $N_c$ do
      $\hat{K}_{\Delta t}(u_i^k, t_{i+1}, t_i) = F^r_{\delta t}(P^k u_i^k, t_{i+1}, t_i)$
      $\hat{K}_{\Delta t}(u_i^{k+1}, t_{i+1}, t_i) = F^r_{\delta t}(P^k u_i^{k+1}, t_{i+1}, t_i)$
      $u_{i+1}^{k+1} = \hat{K}_{\Delta t}(u_i^{k+1}, t_{i+1}, t_i) + uf_{i+1}^k - \hat{K}_{\Delta t}(u_i^k, t_{i+1}, t_i)$
    end
  end

Algorithm 5: The reduced basis parareal method for a nonlinear problem
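In the nonlinear case of Algorithm 5, the coarse propagator is the reduced model itself, advanced with the fine step size as in (29). A minimal sketch, where reduced_step is assumed to advance the $r$ reduced coefficients by one fine time step (e.g., a forward Euler step of the POD-DEIM right hand side in (23)); all names are illustrative:

    import numpy as np

    def K_hat_nonlinear(u, Vr, reduced_step, nsteps):
        # Evaluate (29): propagate the projection of u over one coarse
        # interval with the reduced model F^r, run at the fine step size.
        ut = Vr.T @ u                 # coefficients of P^k u
        for _ in range(nsteps):       # nsteps = Nf fine steps per interval
            ut = reduced_step(ut)
        return Vr @ ut                # lift back to the full space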

Define

$C_g^{i,k} = \max_{0 \le j \le i+1} g_j^k, \qquad g_j^k = \|(I - P^k)u_j^k\|_{L_2(0,T)}.$  (30)

It is easy to see that $C_g^{i,k}$ goes to zero once the reduced basis is rich enough. In addition, it is reasonable to assume that the fine propagator is $L_2$ stable, i.e., there exists a nonnegative constant $C_F$, independent of the discretization parameters, such that

$\|F_{\delta t}(v, t_{i+1}, t_i)\|_{L_2(0,T)} \le (1 + C_F\Delta T)\|v\|_{L_2(0,T)}, \quad \forall v \in L_2(0,T).$  (31)

Theorem 1 (Stability for the linear case). Under assumption (31), the reduced basis parareal method is stable for (1) with $N \equiv 0$, i.e., for each $i$ and $k$,

$\|u_{i+1}^{k+1}\|_{L_2(0,T)} \le C_L e^{C_F(i+1)\Delta T},$  (32)

where $C_L$ is a constant depending only on $C_g$, $C_F$, and $u_0$.

Proof. Using the triangle inequality, the linearity of the operator, and assumption (31), we obtain

$\|u_{i+1}^{k+1}\|_{L_2(0,T)} \le \|F_{\delta t}(P^k u_i^{k+1}, t_{i+1}, t_i)\|_{L_2(0,T)} + \|F_{\delta t}(u_i^k, t_{i+1}, t_i) - F_{\delta t}(P^k u_i^k, t_{i+1}, t_i)\|_{L_2(0,T)}$  (33)
$\le (1 + C_F\Delta T)\|u_i^{k+1}\|_{L_2(0,T)} + (1 + C_F\Delta T)\|(I - P^k)u_i^k\|_{L_2(0,T)}.$  (34)

Then, by the discrete Gronwall's lemma [9], we recover

$\|u_{i+1}^{k+1}\|_{L_2(0,T)} \le (1 + C_F\Delta T)^{i+1}\Big(\|u_0^{k+1}\|_{L_2(0,T)} + \Delta T \sum_{j=0}^{i}(1 + C_F\Delta T)^{-j} g_j^k\Big)$  (35)

$= (1 + C_F\Delta T)^{i+1}\|u_0\|_{L_2(0,T)} + \frac{1}{C_F}\big((1 + C_F\Delta T)^{i+1} - 1\big)\max_{0 \le j \le i+1} g_j^k$  (36)
$\le e^{C_F(i+1)\Delta T}\|u_0\|_{L_2(0,T)} + \frac{1}{C_F}\big(e^{C_F(i+1)\Delta T} - 1\big)C_g^{i,k}.$  (37)

This completes the proof.

For the nonlinear case, we further assume that there exists a nonnegative constant Cr independent of the discretization parameters, such that,

$\|F_{\delta t}(v, t_{i+1}, t_i) - F^r_{\delta t}(P^k v, t_{i+1}, t_i)\|_{L_2(0,T)} \le (1 + C_r\Delta T)q_i^k, \quad \forall v \in L_2(0,T),$  (38)

where $q_i^k$ is the $L_2$-difference between the fine operator and the reduced solver using the same initial condition $v$ at $t_i$. We denote $\max_{0 \le j \le i+1} q_j^k$ by $C_q^{i,k}$, which goes to zero as $k \to \infty$.

Theorem 2 (Stability for the nonlinear case). Under assumptions (31) and (38), the reduced basis parareal method is stable for (1) in the sense that for each $i$ and $k$,

$\|u_{i+1}^{k+1}\|_{L_2(0,T)} \le C_N e^{C_\star(i+1)\Delta T},$  (39)

where $C_\star = \max\{C_F, C_r\}$ and $C_N$ is a constant depending only on $C_q$, $C_F$, $C_r$, and $u_0$.

Proof. Using the triangle inequality and assumptions (31) and (38), we have

$\|u_{i+1}^{k+1}\|_{L_2(0,T)} \le \|F^r_{\delta t}(P^k u_i^{k+1}, t_{i+1}, t_i)\|_{L_2(0,T)} + \|F_{\delta t}(u_i^k, t_{i+1}, t_i) - F^r_{\delta t}(P^k u_i^k, t_{i+1}, t_i)\|_{L_2(0,T)}$  (40)
$\le (1 + C_F\Delta T)\|u_i^{k+1}\|_{L_2(0,T)} + (1 + C_r\Delta T)q_i^k.$  (41)

Next, by the discrete Gronwall's lemma, we derive

$\|u_{i+1}^{k+1}\|_{L_2(0,T)} \le (1 + C_F\Delta T)^{i+1}\Big(\|u_0^{k+1}\|_{L_2(0,T)} + \Delta T \sum_{j=0}^{i}(1 + C_r\Delta T)^{-j} q_j^k\Big)$  (42)

$= (1 + C_F\Delta T)^{i+1}\|u_0\|_{L_2(0,T)} + \frac{1}{C_r}\big((1 + C_r\Delta T)^{i+1} - 1\big)\max_{0 \le j \le i+1} q_j^k$  (43)
$\le e^{C_F(i+1)\Delta T}\|u_0\|_{L_2(0,T)} + \frac{1}{C_r}\big(e^{C_r(i+1)\Delta T} - 1\big)C_q^{i,k}.$  (44)

This completes the proof.

3.2 Convergence analysis

To show the convergence for the linear case, we first assume that there exists a nonnegative constant CF, such that,

$\|F_{\delta t}(x, t_{i+1}, t_i) - F_{\delta t}(y, t_{i+1}, t_i)\|_{L_2(0,T)} \le (1 + C_F\Delta T)\|x - y\|_{L_2(0,T)}, \quad \forall t_i > 0,$  (45)

and define

$C_w^{i,k} = \max_{0 \le j \le i+1} w_j^k, \qquad w_j^k = \|(I - P^k)u_j\|_{L_2(0,T)}.$  (46)

As for $C_g^{i,k}$ in (30), $C_w^{i,k}$ is assumed to approach zero as $k \to \infty$.

Theorem 3 (Convergence for the linear case). Under assumption (45) and with $N \equiv 0$ in (1), the reduced basis parareal solution converges to $u_{i+1}$ for each $i$.

Proof. Using the reduced basis parareal formula and the linearity of the operator, we obtain

$u_{i+1}^{k+1} - u_{i+1} = F_{\delta t}(P^k u_i^{k+1}, t_{i+1}, t_i) + F_{\delta t}(u_i^k, t_{i+1}, t_i) - F_{\delta t}(P^k u_i^k, t_{i+1}, t_i) - F_{\delta t}(u_i, t_{i+1}, t_i)$  (47)
$= F_{\delta t}(P^k u_i^{k+1}, t_{i+1}, t_i) - F_{\delta t}(P^k u_i, t_{i+1}, t_i)$  (48)
$+ F_{\delta t}(u_i^k, t_{i+1}, t_i) - F_{\delta t}(P^k u_i^k, t_{i+1}, t_i)$  (49)
$+ F_{\delta t}(P^k u_i, t_{i+1}, t_i) - F_{\delta t}(u_i, t_{i+1}, t_i).$  (50)

By the triangle inequality and assumption (45), we recover

$\|u_{i+1}^{k+1} - u_{i+1}\|_{L_2(0,T)} \le (1 + C_F\Delta T)\|u_i^{k+1} - u_i\|_{L_2(0,T)}$  (51)
$+ (1 + C_F\Delta T)\|(I - P^k)u_i^k\|_{L_2(0,T)}$  (52)
$+ (1 + C_F\Delta T)\|(I - P^k)u_i\|_{L_2(0,T)}.$  (53)

Finally, by the discrete Gronwall's lemma, we obtain

$\|u_{i+1}^{k+1} - u_{i+1}\|_{L_2(0,T)} \le (1 + C_F\Delta T)^{i+1}\Big(\|u_0^{k+1} - u_0\|_{L_2(0,T)}$  (54)
$+ \Delta T\sum_{j=0}^{i}(1 + C_F\Delta T)^{-j}g_j^k + \Delta T\sum_{j=0}^{i}(1 + C_F\Delta T)^{-j}w_j^k\Big)$  (55)
$\le \Delta T\sum_{j=0}^{i}(1 + C_F\Delta T)^{-j}(C_g^{i,k} + C_w^{i,k})$  (56)

$\le \frac{1}{C_F}\big((1 + C_F\Delta T)^{i+1} - 1\big)(C_g^{i,k} + C_w^{i,k})$  (57)
$\le \frac{1}{C_F}\big(e^{C_F(i+1)\Delta T} - 1\big)(C_g^{i,k} + C_w^{i,k}),$  (58)

which approaches zero as $k$ increases. This completes the proof.

For the nonlinear case, we must also assume that there exists a nonnegative constant $C_r$, such that

$\|F_{\delta t}(u_i^k, t_{i+1}, t_i) - F^r_{\delta t}(P^k u_i^k, t_{i+1}, t_i)\|_{L_2(0,T)} \le (1 + C_r\Delta T)q_i^k,$
$\|F_{\delta t}(u_i, t_{i+1}, t_i) - F^r_{\delta t}(P^k u_i, t_{i+1}, t_i)\|_{L_2(0,T)} \le (1 + C_r\Delta T)p_i^k,$  (59)

where $q_i^k$ is the $L_2$-difference between the fine operator and the reduced solver using the same initial condition $u_i^k$. We denote $\max_{0 \le j \le i+1} q_j^k$ by $C_q^{i,k}$ and assume that it goes to zero as $k \to \infty$; $p_i^k$ and $C_p^{i,k}$ are defined similarly.

Theorem 4 (Convergence for the nonlinear case). Under assumptions (38) and (59), the reduced basis parareal solution of (1) converges to $u_{i+1}$ for each $i$.

Proof. Using the reduced basis parareal formula, we obtain

$u_{i+1}^{k+1} - u_{i+1} = F^r_{\delta t}(P^k u_i^{k+1}, t_{i+1}, t_i) + F_{\delta t}(u_i^k, t_{i+1}, t_i) - F^r_{\delta t}(P^k u_i^k, t_{i+1}, t_i) - F_{\delta t}(u_i, t_{i+1}, t_i)$  (60)
$= F^r_{\delta t}(P^k u_i^{k+1}, t_{i+1}, t_i) - F^r_{\delta t}(P^k u_i, t_{i+1}, t_i)$  (61)
$+ F_{\delta t}(u_i^k, t_{i+1}, t_i) - F^r_{\delta t}(P^k u_i^k, t_{i+1}, t_i)$  (62)
$+ F^r_{\delta t}(P^k u_i, t_{i+1}, t_i) - F_{\delta t}(u_i, t_{i+1}, t_i).$  (63)

By the triangle inequality and assumptions (59) and (38), we have

$\|u_{i+1}^{k+1} - u_{i+1}\|_{L_2(0,T)} \le (1 + C_F\Delta T)\|u_i^{k+1} - u_i\|_{L_2(0,T)} + (1 + C_r\Delta T)q_i^k + (1 + C_r\Delta T)p_i^k.$  (64)

Then, by the discrete Gronwall's lemma, we end up with

$\|u_{i+1}^{k+1} - u_{i+1}\|_{L_2(0,T)} \le \frac{1}{C_r}\big((1 + C_r\Delta T)^{i+1} - 1\big)(C_q^{i,k} + C_p^{i,k})$  (65)
$\le \frac{1}{C_r}\big(e^{C_r(i+1)\Delta T} - 1\big)(C_q^{i,k} + C_p^{i,k}),$  (66)

which approaches zero as $k$ increases under assumption (59).

3.3 Complexity analysis

Let us finally discuss the computational complexity of the reduced basis parareal method. Recall that the dimension of the reduced space is $r$ and the spatial dimension of the fine solution is $N$, assumed to be the same for the coarse and fine solvers, although this may not be a requirement in general. The compression ratio is $R = r/N$. Following the notation of [18], $\tau_{QR}(k)$ and $\tau_{RB}(k)$ (representing $\tau_{SVD}(k)$, $\tau_{EIM}(k)$, and $\tau_{DEIM}(k)$ in different scenarios) denote the computing times required by the corresponding operations at the $k$-th iteration, and $\tau_c$ and $\tau_f$ are the times required by the coarse and fine solvers, respectively. $N_t = N_c N_f$ is the total number of time steps in one iteration, with $N_c$ the number of coarse time intervals and $N_f$ the number of fine time steps on each coarse interval. $N_p$ is the number of processors.

In [18], the speedup is estimated as

$S(N_p) \approx \frac{N_t\tau_f}{N_c\tau_c + N_{it}\big(N_c\tau_c + \frac{N_t}{N_p}\tau_f\big) + N_{it}\tau_{QR}(N_{it})}$  (67)
$= \frac{1}{(1 + N_{it})\frac{N_c}{N_t}\frac{\tau_c}{\tau_f} + \frac{N_{it}\tau_{QR}(N_{it})}{N_t\tau_f} + \frac{N_{it}}{N_p}}.$  (68)

In the reduced basis parareal method, $\tau_c = R^2\tau_f$, since the complexity of computing the right hand side of the reduced system is $O(r^2)$. In addition, $\tau_{QR}$ becomes $\tau_{SVD}$ or $\tau_{EIM}$. With this in mind, the speedup can be estimated as

$S(N_p) = \frac{1}{(1 + N_{it})\frac{N_c}{N_t}R^2 + \frac{N_{it}\tau_{RB}(N_{it})}{N_t\tau_f} + \frac{N_{it}}{N_p}}.$  (69)

Next, we examine the first two terms in the denominators of (68) and (69).

• In the first term, $\tau_c/\tau_f$ takes the role of $R^2$. Hence, we can achieve a comparable performance if $R \approx \sqrt{\tau_c/\tau_f}$, i.e., if the underlying PDE solution can be represented by a reduced basis set of size $O(\sqrt{\tau_c/\tau_f}\,N)$. Suppose that $\sqrt{\tau_c/\tau_f} = \sqrt{1/20} \approx 0.23$. Then it is required that $R < 1/4$, which is a reasonable compression ratio for many problems. In addition, it is possible to use a reduced basis approximation to achieve a better performance in cases where the CFL condition leads to problems in the coarse solver.

• For the second term, $\tau_{SVD} \approx \tau_{QR} \approx O(NN_{it}^2N_c^2)$, while $\tau_{EIM} \approx O(r^{3/2}N_{it}N_c + rNN_{it}N_c)$. Therefore, $\tau_{SVD}/\tau_{EIM} \approx O(2N_{it}N_c/(Rr^2))$, so that as $N_c$ increases, $\tau_{EIM}$ becomes relatively smaller. In addition, EIM has a very good parallel efficiency and requires less memory during the computation.

Also note that Nit would typically be different for the reduced basis parareal method and the original parareal method. If a reduced space exists, the modified algorithm usually converges within a few iterations, hence accelerating the overall convergence significantly.
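The estimate (69) is easy to tabulate. The helper below evaluates it for given parameters (all names illustrative); for instance, with $N_c = 100$, $N_f = 100$, $N_{it} = 4$, $R = 0.1$, $N_p = 100$, and a negligible basis construction time, the formula predicts a speedup of roughly 25.

    def parareal_speedup(Nc, Nf, Nit, R, Np, tau_RB_over_tau_f=0.0):
        # Estimated speedup (69); tau_RB_over_tau_f = tau_RB(Nit) / tau_f.
        Nt = Nc * Nf
        denom = ((1 + Nit) * (Nc / Nt) * R**2
                 + Nit * tau_RB_over_tau_f / Nt
                 + Nit / Np)
        return 1.0 / denom

    print(parareal_speedup(Nc=100, Nf=100, Nit=4, R=0.1, Np=100))  # ~24.7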

4 Numerical results

Let us in the following demonstrate the feasibility and efficiency of the reduced basis parareal method for both linear and nonlinear problems.

4.1 The linear advection equation

We begin by considering the performance of the reduced basis parareal method and illustrate that it is stable for the 1-D linear advection equation (10). The same spatial and temporal discretizations as in Section 2 and the same parameters as in (11) are used.

In Figure 5 (left), we show the $L_\infty$-error at $T = 10$ against the number of iterations for the original parareal method, the POD parareal method, and the EIM parareal method. The accuracy of the fine time integrator at $T$ is $4 \times 10^{-13}$. The original parareal method is unstable, while the other two remain stable. In Figure 5 (right), we show the number of basis functions used to satisfy the tolerance $\varepsilon$ in the POD parareal method and the EIM parareal method, where $\varepsilon$ in the POD context is defined as the relative energy and in the EIM context as the interpolation error. In both cases, the tolerance in the basis selection using POD or EIM is set to $10^{-13}$. We notice that the EIM parareal method achieves higher accuracy but requires more memory to store the basis. This suggests that one can explore a tradeoff between accuracy and efficiency in particular applications.

It should be noted that if only snapshots from the previous iteration are used in the EIM construction, the scheme remains unstable. However, when including all snapshots collected up to the previous iteration level, the scheme recovers stability.


Fig. 5: Results obtained using the original parareal method, the POD parareal method, and the EIM parareal method for the 1-D advection equation. On the left we show the $L_\infty$-error at $T = 10$ against the number of iterations, while the right shows the number of basis functions used to satisfy the tolerance ($10^{-13}$) in the POD and EIM parareal methods across the iterations.

Figure 6 shows the convergence behavior of the EIM parareal algorithm for different tolerances $\varepsilon$. The dependence of the final accuracy on $\varepsilon$ is clear. These results also help confirm Theorem 3, which states that the parareal solution converges exactly to the serial solution integrated by the fine solver as long as the subspace $S$ saturates. In practice, we can choose $\varepsilon$ such that the accuracies of the parareal solution and the serial fine solution are at the same level.

19 1010 55

50

105 45

ε decreases 40

ε decreases 100 35

30 max error 10−5 25 number of basis

20

10−10 15

10

10−15 5 2 4 6 8 10 12 14 16 18 20 0 2 4 6 8 10 12 14 16 18 20 iterations iterations

Fig. 6: The performance of the EIM parareal method for the 1-D advection equation against the tolerance used in the design of the reduced basis. On the left we show the $L_\infty$-error at $T = 10$ against the number of iterations as the tolerance $\varepsilon$ decreases, and on the right the number of basis functions used to satisfy the tolerance as $\varepsilon$ decreases.

4.2 Nonlinear equations

Let us also apply the reduced basis parareal method to several nonlinear PDEs. We recall that the Krylov based approach is not applicable in this case.

4.2.1 Viscous Burgers’ equation

We consider the viscous Burgers' equation (6), with the same spatial and temporal discretizations and the same parameters as in (7). To build the reduced basis, the tolerance is set to $10^{-15}$ for POD and $10^{-12}$ for EIM.

In Figure 7 (left), we show the $L_\infty$-error at $T = 2$ against the number of iterations for the original parareal method, the POD parareal method, and the EIM parareal method. As exact solution we use the solution obtained by the fine time integrator. Although the reduced basis parareal method converges more slowly than the original parareal method, it requires less work, as discussed in Section 2.3.1.

4.2.2 Kuramoto-Sivashinsky equation

Next, consider the Kuramoto-Sivashinsky equation (8). The same spatial and temporal discretizations and the same parameters as in (9) are used. To build the reduced basis, the tolerances for POD and EIM are set to $10^{-13}$ and $10^{-8}$, respectively.

20 0 10 70 parareal−org

−2 parareal−pod 10 parareal−eim 60 −4 10

−6 10 50

−8 10 40 −10 10 max error parareal−pod

number of basis parareal−eim −12 10 30

−14 10 20 −16 10

−18 10 10 1 1.5 2 2.5 3 3.5 4 4.5 5 1 1.5 2 2.5 3 3.5 4 4.5 5 iterations iterations

Fig. 7: We compare the performance of the original parareal method, the POD parareal method, and the EIM parareal method for the 1-D Burgers' equation. On the left we show the $L_\infty$-error at $T = 2$ against the number of iterations, while the right illustrates the number of basis functions.

In Figure 8 we show the $L_\infty$-error at $T = 40$ against the number of iterations for the original parareal method, the POD parareal method, and the EIM parareal method. As exact solution we use the solution obtained by the fine time integrator. It is clear that the reduced basis parareal method converges faster than the original parareal method. Moreover, to achieve the same accuracy, the number of degrees of freedom in the reduced basis parareal methods is roughly one-third that of the original parareal method.

4.2.3 Allen-Cahn equation: nonlinear source

As a third nonlinear example we consider the 1-D Allen-Cahn equation:

$\frac{\partial u}{\partial t} = \nu u_{xx} + u - u^3, \quad (x,t) \in (0,2\pi) \times (0,T], \qquad u(x,0) = \sin(x),$  (70)

where $T = 2$ and $\nu = 2, 1, 10^{-1}, 10^{-2}$. A periodic boundary condition is assumed. The spatial discretization is a P1 DG method with 100 elements [14], and the time integrator is the forward Euler scheme. The following parameters are used in the parareal algorithm:

$N_c = 200, \quad N_{it} = 5, \quad \Delta t = 2 \times 10^{-4}, \quad \delta t = 10^{-5}.$  (71)

The tolerance for POD is set to $10^{-12}$, and for EIM to $10^{-8}$. In Figure 9 (left), we show the $L_\infty$-error at $T = 2$ against the number of iterations for the POD parareal method with different values of $\nu$. As exact solution we use the solution computed with the fine solver.

21 0 10 60 parareal−pod parareal−pod parareal−eim 55 parareal−eim −2 parareal−org 10 50

−4 45 10 40

−6 10 35

30 −8 max error 10 25 number of basis 20 −10 10 15

−12 10 10

5 −14 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 iterations iterations

Fig. 8: We compare the performance of the original parareal method, the POD parareal method, and the EIM parareal method for the 1-D Kuramoto-Sivashinsky equation. On the left we show the $L_\infty$-error at $T = 40$ against the number of iterations, while the right shows the number of basis functions used against the number of iterations.

It is clear that for larger values of $\nu$, the solution converges faster and fewer elements in the reduced basis are needed. This is expected, since a larger $\nu$ indicates a smoother and more localized solution, which usually allows an efficient representation in a lower dimensional space. Similar results are obtained with an EIM based parareal approach and are not reproduced here.


Fig. 9: The POD parareal method for the 1-D Allen-Cahn equation. On the left we show the $L_\infty$-error at $T = 2$ against the number of iterations for different values of $\nu$, and on the right we show the number of basis functions.

4.2.4 KdV equation: nonlinear flux

As a last example we consider the KdV equation (taken from [22]):

$\frac{\partial u}{\partial t} = -\left(\frac{u^2}{2}\right)_x - \nu u_{xxx}, \quad (x,t) \in (-1,1) \times (0,T], \qquad u(x,0) = 1.5 + 0.5\sin(2\pi x),$  (72)

where $T = 2$ and $\nu = 10^{-3}$, and we assume periodic boundary conditions. The equation conserves energy, much like the linear wave equation, but the nonlinearity induces a more complex behavior with the generation of propagating waves. In the parareal algorithm we use

$N_c = 100, \quad N_{it} = 5, \quad \Delta t = 10^{-4}, \quad \delta t = 10^{-5}.$  (73)

We use a first-order local discontinuous Galerkin (LDG) method with 100 elements in space [22, 14]. In the construction of the reduced models, the tolerances for POD and EIM are set to $10^{-10}$ and $10^{-8}$, respectively.

In Figure 10 (left) we show the $L_\infty$-error at $T = 2$ against the number of iterations for the original parareal method, the POD parareal method, and the EIM parareal method. The solution obtained with the fine solver is taken to be the exact solution. While the POD parareal method does not work well in this case, the EIM parareal method shows a remarkable performance, i.e., it converges much faster than the original parareal method.


Fig. 10: We compare the performance of the original parareal method, the POD parareal method, and the EIM parareal method for the 1-D KdV equation. On the left we show the $L_\infty$-error at $T = 2$ against the number of iterations, while the right shows the number of basis functions used against the number of iterations.

5 Conclusion

In this paper, we propose the use of a reduced basis method to construct the coarse solver in the parareal algorithm. We demonstrate that, compared with the original parareal method, this new reduced basis parareal method has improved stability characteristics and efficiency, as long as the solution can be represented well by a reduced basis. Unlike the Krylov subspace parareal method, this approach extends from linear problems to nonlinear problems, while requiring less storage and computing resources. The robustness and versatility of the method have been demonstrated through a number of different problems, setting the stage for its evaluation on more complex problems.

6 Acknowledgement

The authors acknowledge partial support by OSD/AFOSR FA9550-09-1-0613 and AFOSR FA9550-12-1-0463.

References

1. U.M. Ascher, S.J. Ruuth, and B.T.R. Wetton. Implicit-explicit methods for time-dependent partial differential equations. SIAM Journal on Numerical Analysis, 32(3):797–823, 1995.
2. L. Baffico, S. Bernard, Y. Maday, G. Turinici, and G. Zérah. Parallel-in-time molecular-dynamics simulations. Physical Review E, 66(5):057701, 2002.
3. G. Bal. On the convergence and the stability of the parareal algorithm to solve partial differential equations. In Domain Decomposition Methods in Science and Engineering, pages 425–432, 2005.
4. M. Barrault, Y. Maday, N.C. Nguyen, and A.T. Patera. An empirical interpolation method: application to efficient reduced-basis discretization of partial differential equations. Comptes Rendus Mathematique, 339(9):667–672, 2004.
5. D. Cavar and K.E. Meyer. LES of turbulent jet in cross flow: Part 2 – POD analysis and identification of coherent structures. International Journal of Heat and Fluid Flow, 2012.
6. A. Chatterjee. An introduction to the proper orthogonal decomposition. Current Science, 78(7):808–817, 2000.
7. S. Chaturantabut and D.C. Sorensen. Nonlinear model reduction via discrete empirical interpolation. SIAM Journal on Scientific Computing, 32(5):2737–2764, 2010.
8. X. Dai and Y. Maday. Stable parareal in time method for first and second order hyperbolic system. arXiv preprint arXiv:1201.1064, 2012.
9. E. Emmrich. Discrete versions of Gronwall's lemma and their application to the numerical analysis of parabolic problems. TU Berlin, Fachbereich 3, 1999.
10. C. Farhat, J. Cortial, C. Dastillung, and H. Bavestrello. Time-parallel implicit integrators for the near-real-time prediction of linear structural dynamic responses. International Journal for Numerical Methods in Engineering, 67(5):697–724, 2006.
11. M. Gander and M. Petcu. Analysis of a Krylov subspace enhanced parareal algorithm for linear problems. In ESAIM: Proceedings, volume 25, pages 114–129, 2008.
12. M.J. Gander and S. Vandewalle. Analysis of the parareal time-parallel time-integration method. SIAM Journal on Scientific Computing, 29(2):556–578, 2007.
13. J.S. Hesthaven, S. Gottlieb, and D. Gottlieb. Spectral Methods for Time-Dependent Problems. Cambridge University Press, 2007.
14. J.S. Hesthaven and T. Warburton. Nodal Discontinuous Galerkin Methods: Algorithms, Analysis, and Applications, volume 54. Springer, 2007.
15. J.L. Lions, Y. Maday, and G. Turinici. A "parareal" in time discretization of PDEs. Comptes Rendus de l'Académie des Sciences, Series I, Mathematics, 332(7):661–668, 2001.
16. Y. Maday and G. Turinici. Parallel in time algorithms for quantum control: Parareal time discretization scheme. International Journal of Quantum Chemistry, 93(3):223–228, 2003.
17. A.S. Nielsen. Feasibility study of the parareal algorithm. 2012.
18. D. Ruprecht and R. Krause. Explicit parallel-in-time integration of a linear acoustic-advection system. Computers & Fluids, 59:72–83, 2012.
19. L.M. Skvortsov. Diagonally implicit Runge-Kutta methods for stiff problems. Computational Mathematics and Mathematical Physics, 46(12):2110–2123, 2006.
20. G. Staff and E. Rønquist. Stability of the parareal algorithm. In Domain Decomposition Methods in Science and Engineering, pages 449–456, 2005.
21. Y. Xu and C.W. Shu. Local discontinuous Galerkin methods for the Kuramoto-Sivashinsky equations and the Ito-type coupled KdV equations. Computer Methods in Applied Mechanics and Engineering, 195(25):3430–3447, 2006.
22. J. Yan and C.W. Shu. A local discontinuous Galerkin method for KdV type equations. SIAM Journal on Numerical Analysis, 40(2):769–791, 2002.
23. X. Zhu, R. Munipalli, and J.S. Hesthaven. Nonlinear model reduction via full empirical interpolation. In preparation, 2012.
