
LECTURE 4 STOCHASTIC DIFFERENTIAL EQUATIONS AND SOLUTIONS

Let us consider the following simple stochastic ordinary differential equation:
\[
dX(t) = -\lambda X(t)\, dt + dW(t), \quad \lambda > 0. \tag{0.1}
\]
It can be readily verified by Ito's formula that the process
\[
X(t) = e^{-\lambda t} x_0 + \int_0^t e^{-\lambda (t-s)}\, dW(s) \tag{0.2}
\]
satisfies Equation (0.1). By the Kolmogorov continuity theorem, the solution is Hölder continuous of order less than 1/2 in time, since
\[
\mathbb{E}[|X(t) - X(s)|^2] \le (t-s)^2 \Big( \frac{2}{\lambda} + x_0^2 \Big) + |t-s|. \tag{0.3}
\]
This simple model shows that the solution to a stochastic differential equation is Hölder continuous of order less than 1/2 and thus does not have derivatives in time. This low regularity of solutions leads to concerns for SODEs (and their numerical methods) that differ from those for ODEs.
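As a side illustration (not part of the original notes; parameter values are arbitrary), the representation (0.2) can be sampled exactly on a time grid: conditional on $X(t)$, the value $X(t+h)$ is Gaussian with mean $e^{-\lambda h} X(t)$ and variance $(1 - e^{-2\lambda h})/(2\lambda)$. A minimal Python sketch:

```python
import numpy as np

# Exact sampling of dX = -lam*X dt + dW via the representation (0.2):
# X(t+h) | X(t) ~ N(exp(-lam*h) X(t), (1 - exp(-2*lam*h)) / (2*lam)).
def ou_exact_path(x0, lam, T, n_steps, rng):
    h = T / n_steps
    decay = np.exp(-lam * h)
    std = np.sqrt((1.0 - np.exp(-2.0 * lam * h)) / (2.0 * lam))
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = decay * x[k] + std * rng.standard_normal()
    return x

rng = np.random.default_rng(0)
path = ou_exact_path(x0=1.0, lam=2.0, T=1.0, n_steps=1000, rng=rng)
print(path[:5])
```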

1. Existence and uniqueness of strong solutions

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $(W(t), \mathcal{F}_t^W) = ((W_1(t), \ldots, W_m(t))^\top, \mathcal{F}_t^W)$ be an $m$-dimensional standard Wiener process, where $\mathcal{F}_t^W$, $0 \le t \le T$, is an increasing family of $\sigma$-subalgebras of $\mathcal{F}$ induced by $W(t)$. Consider the system of Ito SODEs
\[
dX = a(t, X)\, dt + \sum_{r=1}^m \sigma_r(t, X)\, dW_r(t), \quad t \in (t_0, T], \quad X(t_0) = x_0, \tag{1.1}
\]
where $X$, $a$, $\sigma_r$ are $m$-dimensional column vectors and $x_0$ is independent of $W$. We assume that $a(t, x)$ and $\sigma(t, x)$ are sufficiently smooth and globally Lipschitz.

Remark 1.1. The SODEs (1.1) can be rewritten in the Stratonovich sense under mild conditions. The equation (1.1) can be written as
\[
dX = [a(t, X) - c(t, X)]\, dt + \sum_{r=1}^m \sigma_r(t, X) \circ dW_r(t), \quad t \in (t_0, T], \quad X(t_0) = x_0, \tag{1.2}
\]
where
\[
c(t, X) = \frac{1}{2} \sum_{r=1}^m \frac{\partial \sigma_r(t, X)}{\partial x}\, \sigma_r(t, X),
\]
and $\frac{\partial \sigma_r}{\partial x}$ is the Jacobi matrix of the column vector $\sigma_r$:
\[
\frac{\partial \sigma_r}{\partial x} = \begin{pmatrix} \dfrac{\partial \sigma_{1,r}}{\partial x_1} & \cdots & \dfrac{\partial \sigma_{1,r}}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial \sigma_{m,r}}{\partial x_1} & \cdots & \dfrac{\partial \sigma_{m,r}}{\partial x_m} \end{pmatrix}.
\]
We write $f \in L_{\mathrm{ad}}(\Omega; L^2([a, b]))$ if $f(t)$ is adapted to $\mathcal{F}_t$ and $f(t, \omega) \in L^2([a, b])$, i.e.,
\[
L_{\mathrm{ad}}(\Omega; L^2([a, b])) = \Big\{ f(t, \omega) \;\Big|\; f(t, \omega) \text{ is } \mathcal{F}_t\text{-measurable and } P\Big( \int_a^b f^2(s)\, ds < \infty \Big) = 1 \Big\}.
\]

Here $\{\mathcal{F}_t;\ a \le t \le b\}$ is a filtration such that

Date: November 3, 2019.

• for each $t$, $f(t)$ and $W(t)$ are $\mathcal{F}_t$-measurable, i.e., $f(t)$ and $W(t)$ are adapted to the filtration $\mathcal{F}_t$;
• for any $s \le t$, $W(t) - W(s)$ is independent of the $\sigma$-field $\mathcal{F}_s$.

Definition 1.2 (A strong solution to an SODE). We say that $X(t)$ is a (strong) solution to the SDE (1.1) if
• $a(t, X(t)) \in L_{\mathrm{ad}}(\Omega, L^1([c, d]))$,
• $\sigma(t, X(t)) \in L_{\mathrm{ad}}(\Omega, L^2([c, d]))$,
• and $X(t)$ satisfies the following integral equation a.s.:
\[
X(t) = x + \int_0^t a(s, X(s))\, ds + \int_0^t \sigma(s, X(s))\, dW(s). \tag{1.3}
\]
In general, it is difficult to give a necessary and sufficient condition for the existence and uniqueness of strong solutions. Usually, we can only give sufficient conditions.

Theorem 1.3 (Existence and uniqueness). Suppose that $X_0$ is $\mathcal{F}_0$-measurable with $\mathbb{E}[X_0^2] < \infty$ and that the coefficients $a$, $\sigma$ satisfy the following conditions:
• (Lipschitz condition) $a$ and $\sigma_r$ are Lipschitz continuous, i.e., there is a constant $K > 0$ such that
\[
|a(x) - a(y)| + \sum_{r=1}^m |\sigma_r(x) - \sigma_r(y)| \le K |x - y|.
\]
• (Linear growth) $a$ and $\sigma$ grow at most linearly, i.e., there is a $C > 0$ such that
\[
|a(x)| + |\sigma(x)| \le C(1 + |x|).
\]
Then the SDE above has a unique strong solution, and the solution has the following properties:

• $X(t)$ is adapted to the filtration generated by $X_0$ and $W(s)$ ($s \le t$);
• $\mathbb{E}\big[ \int_0^t X^2(s)\, ds \big] < \infty$.
See [Øksendal, 2003, Chapter 5] for a proof. Here are some examples where the conditions of the theorem are satisfied.
• (Geometric Brownian motion) For $\mu, \sigma \in \mathbb{R}$,

\[
dX(t) = \mu X(t)\, dt + \sigma X(t)\, dW(t), \quad X_0 = x.
\]
• (Sine process) For $\sigma \in \mathbb{R}$,

\[
dX(t) = \sin(X(t))\, dt + \sigma\, dW(t), \quad X_0 = x.
\]

• (Modified Cox-Ingersoll-Ross process) For $\theta_1, \theta_2 \in \mathbb{R}$ with $\theta_1 + \frac{\theta_2^2}{2} > 0$,
\[
dX(t) = -\theta_1 X(t)\, dt + \theta_2 \sqrt{1 + X(t)^2}\, dW(t), \quad X_0 = x.
\]
Remark 1.4. The condition in the theorem is also known as the global Lipschitz condition. A straightforward generalization is the one-sided Lipschitz condition (global monotone condition)
\[
(x - y)^\top (a(x) - a(y)) + p_0 \sum_{r=1}^m |\sigma_r(x) - \sigma_r(y)|^2 \le K |x - y|^2, \quad p_0 > 0,
\]
and the growth condition can also be generalized as
\[
x^\top a(x) + p_1 \sum_{r=1}^m |\sigma_r(x)|^2 \le C(1 + |x|^2).
\]
Theorem 1.5 (Regularity of the solution). Under the conditions of Theorem 1.3, the solution is continuous and there exists a constant $C > 0$, depending only on $t$, such that
\[
\mathbb{E}[|X(t) - X(s)|^{2p}] \le C |t - s|^p, \quad p \ge 1.
\]

The proof of this theorem relies on the Burkholder-Davis-Gundy inequality. Then, by the Kolmogorov continuity theorem, we can conclude that the solution is only Hölder continuous with exponent less than 1/2, which is the same as for Brownian motion.

2. Solution methods

The process (0.2) is a special case of the Ornstein-Uhlenbeck process, which satisfies the equation
\[
dX(t) = \kappa(\theta - X(t))\, dt + \sigma\, dW(t), \tag{2.1}
\]
where $\kappa, \sigma > 0$ and $\theta \in \mathbb{R}$. The solution to (2.1) can be obtained by the method of change of variables: let $Y(t) = \theta - X(t)$. Then by Ito's formula we have $dY(t) = -\kappa Y(t)\, dt + \sigma\, d(-W(t))$. Similar to (0.2), the solution is
\[
Y(t) = e^{-\kappa t} Y_0 + \sigma \int_0^t e^{-\kappa (t-s)}\, d(-W(s)). \tag{2.2}
\]
Then, by $Y(t) = \theta - X(t)$, we have
\[
X(t) = X_0 e^{-\kappa t} + \theta (1 - e^{-\kappa t}) + \sigma \int_0^t e^{-\kappa (t-s)}\, dW(s).
\]
In more general cases, we can use similar ideas to find explicit solutions to SODEs.
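As a quick sanity check (illustrative only, not part of the original notes; the Euler time stepping used here is introduced in Section 3), the closed form above gives $\mathbb{E}[X(t)] = X_0 e^{-\kappa t} + \theta(1 - e^{-\kappa t})$ and $\mathrm{Var}[X(t)] = \sigma^2 (1 - e^{-2\kappa t})/(2\kappa)$, which can be compared against a Monte Carlo simulation; all parameter values below are assumptions:

```python
import numpy as np

# Monte Carlo check of the closed-form OU mean/variance (illustrative parameters).
kappa, theta, sigma, x0, T = 2.0, 1.0, 0.5, 0.0, 1.0
n_steps, n_paths = 2000, 100_000
h = T / n_steps

rng = np.random.default_rng(1)
x = np.full(n_paths, x0)
for _ in range(n_steps):
    dw = np.sqrt(h) * rng.standard_normal(n_paths)
    x += kappa * (theta - x) * h + sigma * dw   # Euler time step

mean_exact = x0 * np.exp(-kappa * T) + theta * (1 - np.exp(-kappa * T))
var_exact = sigma**2 * (1 - np.exp(-2 * kappa * T)) / (2 * kappa)
print(x.mean(), mean_exact)   # should agree up to O(h) + Monte Carlo error
print(x.var(), var_exact)
```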

2.1. The integrating factor method. We apply the integrating factor method to solve nonlinear SDEs of the form

\[
dX(t) = f(t, X(t))\, dt + \sigma(t) X(t)\, dW(t), \quad X_0 = x, \tag{2.3}
\]
where $f$ is a continuous deterministic function from $\mathbb{R}^+ \times \mathbb{R}$ to $\mathbb{R}$.
• Step 1. Solve the equation $dG(t) = \sigma(t) G(t)\, dW(t)$. Then we have
\[
G(t) = \exp\Big( \int_0^t \sigma(s)\, dW(s) - \frac{1}{2} \int_0^t \sigma^2(s)\, ds \Big).
\]
The integrating factor is defined by $F(t) = G^{-1}(t)$. It can be readily verified that $F(t)$ satisfies
\[
dF(t) = -\sigma(t) F(t)\, dW(t) + \sigma^2(t) F(t)\, dt.
\]
• Step 2. Let $X(t) = G(t) C(t)$, so that $C(t) = F(t) X(t)$. Then, by Ito's product rule, (2.3) can be written as
\[
d(F(t) X(t)) = F(t) f(t, X(t))\, dt.
\]

Then $C(t)$ satisfies the following "deterministic" ODE:
\[
dC(t) = F(t) f(t, G(t) C(t))\, dt. \tag{2.4}
\]
• Step 3. Once we obtain $C(t)$, we get $X(t)$ from $X(t) = G(t) C(t)$.

Remark 2.1. When (2.4) cannot be solved explicitly, we may use numerical methods to obtain $C(t)$.

Example 2.2. Use the integrating factor method to solve the SDE
\[
dX(t) = (X(t))^{-1}\, dt + \alpha X(t)\, dW(t), \quad X_0 = x > 0,
\]
where $\alpha$ is a constant.

Solution. Here $f(t, x) = x^{-1}$ and $F(t) = \exp(-\alpha W(t) + \frac{\alpha^2}{2} t)$. We only need to solve
\[
dC(t) = F(t) [G(t) C(t)]^{-1}\, dt = \frac{F^2(t)}{C(t)}\, dt.
\]
This gives $d(C(t))^2 = 2 F^2(t)\, dt$ and thus
\[
(C(t))^2 = 2 \int_0^t \exp(-2\alpha W(s) + \alpha^2 s)\, ds + x^2.
\]
Since the initial condition is $x > 0$, we take $C(t) > 0$ so that
\[
X(t) = G(t) C(t) = \exp\Big( \alpha W(t) - \frac{\alpha^2}{2} t \Big) \sqrt{ 2 \int_0^t \exp(-2\alpha W(s) + \alpha^2 s)\, ds + x^2 } > 0.
\]

2.2. Moment equations of solutions. For more complicated SODEs, we may not be able to obtain a solution written explicitly in terms of $W(t)$. For example, the modified Cox-Ingersoll-Ross model (2.5) does not have an explicit solution:
\[
dX(t) = \kappa(\theta - X(t))\, dt + \sigma \sqrt{X(t)}\, dW(t), \quad X_0 = x. \tag{2.5}
\]
However, we can say a bit more about the moments of the process $X(t)$. Write (2.5) in its integral form:
\[
X(t) = x + \kappa \int_0^t (\theta - X(s))\, ds + \sigma \int_0^t \sqrt{X(s)}\, dW(s), \tag{2.6}
\]
and using Ito's formula gives
\[
X^2(t) = x^2 + (2\kappa\theta + \sigma^2) \int_0^t X(s)\, ds - 2\kappa \int_0^t X(s)^2\, ds + 2\sigma \int_0^t (X(s))^{3/2}\, dW(s). \tag{2.7}
\]
From this equation and the properties of the Ito integral, we can obtain the moments of the solution. The first moment can be obtained by taking expectations on both sides of (2.6):
\[
m_t := \mathbb{E}[X(t)] = x + \kappa \Big( \theta t - \int_0^t \mathbb{E}[X(s)]\, ds \Big),
\]
because the expectation of the stochastic integral is zero.¹ We can then solve the following ODE:

\[
dm_t = \kappa(\theta - m_t)\, dt.
\]
The solution is given by
\[
m_t = \theta + (x - \theta) e^{-\kappa t}.
\]
For the second moment, we get from (2.7) that
\[
\mathbb{E}[X^2(t)] = x^2 + (2\kappa\theta + \sigma^2) \int_0^t \mathbb{E}[X(s)]\, ds - 2\kappa \int_0^t \mathbb{E}[X^2(s)]\, ds.
\]

This is again an ODE, similar to the one for $m_t$:
\[
\mathbb{E}[X^2(t)] = x^2 + (2\kappa\theta + \sigma^2) \Big( \theta t + (x - \theta) \frac{1 - e^{-\kappa t}}{\kappa} \Big) - 2\kappa \int_0^t \mathbb{E}[X^2(s)]\, ds.
\]
Here we also assume that $\int_0^t \mathbb{E}[|X(s)|^3]\, ds < \infty$, so that $\int_0^t (X(s))^{3/2}\, dW(s)$ is an Ito integral with a square-integrable integrand and thus $\mathbb{E}[\int_0^t (X(s))^{3/2}\, dW(s)] = 0$.

Remark 2.3. It can be shown using Feller's test [Karatzas and Shreve, 1991, Theorem 5.29] that the solution to (2.5) exists and is unique when $2\kappa\theta > \sigma^2$ and $X_0 \ge 0$. Moreover, the solution is strictly positive when $X_0 > 0$. If $\mathbb{E}[|X_0|^3] < \infty$, then $\mathbb{E}[|X(t)|^p] < \infty$, $1 \le p \le 3$.

¹ Here we need to verify that $\int_0^t \sqrt{X(s)}\, dW(s)$ is indeed an Ito integral with a square-integrable integrand, by showing that $\int_0^t \mathbb{E}[|X(s)|]\, ds < \infty$. See Remark 2.3.
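The first-moment formula $m_t = \theta + (x - \theta) e^{-\kappa t}$ can be checked numerically. Below is an illustrative sketch (not from the original notes) using Euler time stepping for (2.5) with reflection at zero, a common ad hoc fix to keep the square root well-defined; the parameter values are assumptions chosen so that $2\kappa\theta > \sigma^2$:

```python
import numpy as np

# Monte Carlo check of m_t = theta + (x - theta) exp(-kappa t) for the CIR model (2.5).
kappa, theta, sigma, x0, T = 1.5, 0.8, 0.4, 0.2, 1.0   # 2*kappa*theta > sigma^2
n_steps, n_paths = 1000, 50_000
h = T / n_steps

rng = np.random.default_rng(2)
x = np.full(n_paths, x0)
for _ in range(n_steps):
    dw = np.sqrt(h) * rng.standard_normal(n_paths)
    x = x + kappa * (theta - x) * h + sigma * np.sqrt(np.maximum(x, 0.0)) * dw
    x = np.abs(x)   # reflect at zero (ad hoc fix to keep sqrt well-defined)

m_exact = theta + (x0 - theta) * np.exp(-kappa * T)
print(x.mean(), m_exact)   # should agree up to discretization + MC error
```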

Unfortunately, even the first few moments are difficult to obtain in general. For example, we cannot get a closure for the second-order moment of the following SDE

\[
dX(t) = \kappa(\theta - X(t))\, dt + X(t)^\alpha\, dW(t), \quad \frac{1}{2} < \alpha < 1.
\]
We cannot even obtain the first-order moment of the following SDE:

\[
dX(t) = \sin(X(t))\, dt + dW(t), \quad X_0 = x.
\]

3. Numerical methods for SODEs

As explicit solutions to SODEs are usually hard to find, we seek numerical approximations of solutions.

3.1. Derivation of numerical methods based on numerical integration. A starting point for numerical SODEs is numerical integration. Consider the SODE (1.1) over $[t, t+h]$:
\[
X(t+h) = X(t) + \int_t^{t+h} a(s, X(s))\, ds + \sum_{r=1}^m \int_t^{t+h} \sigma_r(s, X(s))\, dW_r(s).
\]
The simplest scheme for (1.1) is the forward Euler scheme, in which we replace (approximate)
\[
\int_t^{t+h} a(s, X(s))\, ds \quad \text{with} \quad \int_t^{t+h} a(t, X(t))\, ds = a(t, X(t))\, h
\]
and
\[
\int_t^{t+h} \sigma_r(s, X(s))\, dW_r \quad \text{with} \quad \int_t^{t+h} \sigma_r(t, X(t))\, dW_r = \sigma_r(t, X(t))\, (W_r(t+h) - W_r(t)).
\]
We then obtain the forward Euler scheme (also known as the Euler-Maruyama scheme):
\[
X_{k+1} = X_k + a(t_k, X_k) h + \sum_{r=1}^m \sigma_r(t_k, X_k)\, \Delta_k W_r, \tag{3.1}
\]
where $h = (T - t_0)/N$, $t_k = t_0 + kh$, $k = 0, \ldots, N$, $X_0 = x_0$, and $\Delta_k W_r = W_r(t_{k+1}) - W_r(t_k)$. The Euler scheme can be implemented by replacing the increments $\Delta_k W_r$ with Gaussian random variables:
\[
X_{k+1} = X_k + a(t_k, X_k) h + \sum_{r=1}^m \sigma_r(t_k, X_k) \sqrt{h}\, \xi_{r,k+1}, \tag{3.2}
\]
where the $\xi_{r,k+1}$ are i.i.d. $\mathcal{N}(0, 1)$ random variables.

Replacing (approximating) the drift term with its value at $t + h$, we have
\[
\int_t^{t+h} a(s, X(s))\, ds \approx \int_t^{t+h} a(t+h, X(t+h))\, ds = a(t+h, X(t+h))\, h.
\]
The resulting scheme is called the backward Euler scheme (also known as the drift-implicit Euler scheme):
\[
X_{k+1} = X_k + a(t_{k+1}, X_{k+1}) h + \sum_{r=1}^m \sigma_r(t_k, X_k)\, \Delta_k W_r, \quad k = 0, 1, \ldots, N-1. \tag{3.3}
\]
The following schemes can be considered as extensions of the forward and backward Euler schemes:
\[
X_{k+1} = X_k + [(1 - \lambda) a(t_k, X_k) + \lambda a(t_{k+1}, X_{k+1})] h + \sum_{r=1}^m \sigma_r(t_k, X_k) \sqrt{h}\, \xi_{r,k+1}, \tag{3.4}
\]
where $\lambda \in [0, 1]$, or similarly
\[
X_{k+1} = X_k + a((1 - \lambda) t_k + \lambda t_{k+1}, (1 - \lambda) X_k + \lambda X_{k+1}) h + \sum_{r=1}^m \sigma_r(t_k, X_k) \sqrt{h}\, \xi_{r,k+1}. \tag{3.5}
\]
We can also derive numerical methods for (1.1) with higher-order convergence. For example, in (1.1) we can approximate the diffusion terms $\sigma_r$ using their half-order Ito-Taylor expansion, which leads to the Milstein scheme [Milstein, 1995]. Let us illustrate the derivation of the Milstein scheme for an autonomous SODE ($a$ and $\sigma$ do not explicitly depend on $t$) with $m = 1$, i.e., for a scalar equation with a single noise. With the following approximations,
\[
\int_t^{t+h} a(X(s))\, ds \approx \int_t^{t+h} a(X(t))\, ds = a(X(t))\, h,
\]
\[
\int_t^{t+h} \sigma(X(s))\, dW(s) \approx \int_t^{t+h} \Big[ \sigma(X(t)) + \int_t^s \sigma'(X(t)) \sigma(X(t))\, dW(\theta) \Big]\, dW(s)
= \sigma(X(t)) (W(t+h) - W(t)) + \sigma'(X(t)) \sigma(X(t)) \int_t^{t+h} \int_t^s dW(\theta)\, dW(s),
\]
we can obtain the Milstein scheme
\[
X_{k+1} = X_k + a(X_k) h + \sigma(X_k) (W(t_{k+1}) - W(t_k)) + \frac{1}{2} \sigma(X_k) \sigma'(X_k) \big[ (W(t_{k+1}) - W(t_k))^2 - h \big].
\]
One can also derive a drift-implicit Milstein scheme as follows:
\[
X_{k+1} = X_k + a(X_{k+1}) h + \sigma(X_k) (W(t_{k+1}) - W(t_k)) + \frac{1}{2} \sigma(X_k) \sigma'(X_k) \big[ (W(t_{k+1}) - W(t_k))^2 - h \big].
\]
For (1.1), the Milstein scheme is as follows (see e.g. [Kloeden and Platen, 1992, Milstein and Tretyakov, 2004]):
\[
X_{k+1} = X_k + a(t_k, X_k) h + \sum_{r=1}^m \sigma_r(t_k, X_k)\, \xi_{rk} \sqrt{h} + \sum_{i,l=1}^m \Lambda_i \sigma_l(t_k, X_k)\, I_{i,l,t_k}, \tag{3.6}
\]

where $I_{i,l,t_k} = \int_{t_k}^{t_{k+1}} \int_{t_k}^{s} dW_i(\theta)\, dW_l(s)$. For efficient evaluation of this double integral, see Chapter 4 of [Zhang and Karniadakis, 2017]. The scheme (3.6) is of first-order mean-square convergence. For commutative noises, i.e.,
\[
\Lambda_i \sigma_l = \Lambda_l \sigma_i, \quad \Lambda_l = \sigma_l^\top \frac{\partial}{\partial x}, \tag{3.7}
\]
we can use only the increments of the Brownian motions instead of the double Ito integrals in (3.6), since

\[
I_{i,l,t_k} + I_{l,i,t_k} = (\xi_{ik} \xi_{lk} - \delta_{il})\, h,
\]
where $\delta_{il}$ is the Kronecker delta. In this case, we have a simplified version of (3.6):
\[
X_{k+1} = X_k + a(t_k, X_k) h + \sum_{r=1}^m \sigma_r(t_k, X_k)\, \xi_{rk} \sqrt{h} + \frac{1}{2} \sum_{i,l=1}^m \Lambda_i \sigma_l(t_k, X_k) (\xi_{ik} \xi_{lk} - \delta_{il})\, h. \tag{3.8}
\]
There is an extensive literature on numerical methods for SODEs. We refer to [Higham, 2001, Schurz, 2002] for an introduction to numerical methods for SODEs and to [Kloeden and Platen, 1992, Milstein, 1995] for a systematic construction of numerical methods for SODEs.
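To make the schemes concrete, here is a minimal Python sketch (not part of the original notes; all names and parameter values are illustrative) of a forward Euler step (3.2) and a scalar Milstein step applied to geometric Brownian motion, for which $\sigma(x) = sx$ and $\sigma'(x)\sigma(x) = s^2 x$:

```python
import numpy as np

# Forward Euler (Euler-Maruyama, (3.2)) and Milstein steps for the scalar
# autonomous SDE dX = a(X) dt + sigma(X) dW; here GBM: a(x) = mu*x, sigma(x) = s*x.
mu, s = 0.05, 0.2

def a(x):    return mu * x
def sig(x):  return s * x
def dsig(x): return s            # sigma'(x) for GBM

def euler_step(x, h, dw):
    return x + a(x) * h + sig(x) * dw

def milstein_step(x, h, dw):
    return x + a(x) * h + sig(x) * dw + 0.5 * sig(x) * dsig(x) * (dw**2 - h)

rng = np.random.default_rng(3)
T, n = 1.0, 1000
h = T / n
x_e = x_m = 1.0
for _ in range(n):
    dw = np.sqrt(h) * rng.standard_normal()
    x_e, x_m = euler_step(x_e, h, dw), milstein_step(x_m, h, dw)
print(x_e, x_m)
```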

For numerical methods for SODEs and SPDEs, the key issues are whether a numerical method converges, in what sense it converges, how fast it converges, and whether it is stable in some sense.

3.2. Strong convergence.

Definition 3.1 (Strong convergence in $L^p$). A method (scheme) is said to have strong convergence order $\gamma$ in $L^p$ if there exists a constant $K > 0$ independent of $h$ such that
\[
\mathbb{E}[|X_k - X(t_k)|^p] \le K h^{p\gamma}
\]
for any $k = 0, 1, \ldots, N$ with $Nh = T$ and sufficiently small $h$.
In many applications and in this book, strong convergence refers to convergence in the mean-square sense, i.e., $p = 2$.
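The strong (mean-square) order can be estimated empirically by comparing a scheme with a known exact solution along the same Brownian paths. A sketch for the forward Euler scheme on geometric Brownian motion (illustrative; all parameters are assumptions), where the error should scale like $h^{1/2}$:

```python
import numpy as np

# Empirical mean-square (strong) order of forward Euler on GBM, where the
# exact solution X(T) = x0 * exp((mu - s^2/2) T + s W(T)) is available.
mu, s, x0, T = 0.05, 0.5, 1.0, 1.0
n_paths, n_fine = 10_000, 256

rng = np.random.default_rng(4)
dw_fine = np.sqrt(T / n_fine) * rng.standard_normal((n_paths, n_fine))
x_exact = x0 * np.exp((mu - 0.5 * s**2) * T + s * dw_fine.sum(axis=1))

for n in [8, 16, 32, 64]:
    h = T / n
    # coarse increments: sum consecutive fine increments (same Brownian paths)
    dw = dw_fine.reshape(n_paths, n, n_fine // n).sum(axis=2)
    x = np.full(n_paths, x0)
    for k in range(n):
        x = x + mu * x * h + s * x * dw[:, k]
    rms = np.sqrt(np.mean((x - x_exact)**2))
    print(f"h = {h:.4f}  RMS error = {rms:.4f}")
# halving h should reduce the RMS error by about sqrt(2) (order 1/2)
```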

If the coefficients of (1.1) satisfy the conditions in Theorem 1.3, the forward Euler scheme (3.1) and the backward Euler scheme (3.3) converge with half order ($\gamma = 1/2$) in the mean-square sense (strong convergence order one half), i.e.,
\[
\max_{1 \le k \le N} \mathbb{E}[|X(t_k) - X_k|^2] \le K h,
\]
where $K$ is a positive constant independent of $h$. When the noise is additive, i.e., the coefficients of the noises are functions of time instead of functions of the solution, these schemes are of first-order mean-square convergence. Under the conditions in Theorem 1.3, the Milstein scheme (3.6) can be shown to have strong convergence order one, i.e., $\gamma = 1$.
Note that all these schemes are one-step schemes. One can use the following fundamental theorem of Milstein to derive their mean-square convergence order. Introduce the one-step approximation $\bar{X}_{t,x}(t+h)$, $t_0 \le t < t+h \le T$, for the solution $X_{t,x}(t+h)$ of (1.1), which depends on the initial point $(t, x)$, a time step $h$, and $\{W_1(\theta) - W_1(t), \ldots, W_m(\theta) - W_m(t),\ t \le \theta \le t+h\}$, and which is defined as follows:

\[
\bar{X}_{t,x}(t+h) = x + A(t, x, h;\ W_i(\theta) - W_i(t),\ i = 1, \ldots, m,\ t \le \theta \le t+h). \tag{3.9}
\]

Using the one-step approximation (3.9), we recurrently construct the approximation $(X_k, \mathcal{F}_{t_k})$, $k = 0, \ldots, N$, $t_{k+1} - t_k = h_{k+1}$, $t_N = T$:
\[
X_0 = X(t_0), \quad X_{k+1} = \bar{X}_{t_k, X_k}(t_{k+1}) \tag{3.10}
\]

\[
= X_k + A(t_k, X_k, h_{k+1};\ W_i(\theta) - W_i(t_k),\ i = 1, \ldots, m,\ t_k \le \theta \le t_{k+1}).
\]

For simplicity, we will consider a uniform time step size, i.e., $h_k = h$ for all $k$. The proof of the following theorem can be found in [Milstein, 1995] and [Milstein and Tretyakov, 2004, Chapter 1].

Theorem 3.2 (Fundamental convergence theorem for one-step numerical methods). Suppose that
(i) the coefficients of (1.1) are Lipschitz continuous;
(ii) the one-step approximation $\bar{X}_{t,x}(t+h)$ from (3.9) has the following orders of accuracy: for some $p \ge 1$ there are $\alpha \ge 1$, $h_0 > 0$, and $K > 0$ such that for arbitrary $t_0 \le t \le T - h$, $x \in \mathbb{R}^d$, and all $0 < h \le h_0$:

\[
|\mathbb{E}[\bar{X}_{t,x}(t+h) - X_{t,x}(t+h)]| \le K (1 + |x|^2)^{1/2} h^{q_1},
\]

\[
\big( \mathbb{E}|\bar{X}_{t,x}(t+h) - X_{t,x}(t+h)|^{2p} \big)^{1/(2p)} \le K (1 + |x|^{2p})^{1/(2p)} h^{q_2} \tag{3.11}
\]
with
\[
q_2 \ge \frac{1}{2}, \quad q_1 \ge q_2 + \frac{1}{2}.
\]
Then for any $N$ and $k = 0, 1, \ldots, N$ the following inequality holds:
\[
\big( \mathbb{E}|X_{t_0,X_0}(t_k) - \bar{X}_{t_0,X_0}(t_k)|^{2p} \big)^{1/(2p)} \le K (1 + \mathbb{E}|X_0|^{2p})^{1/(2p)} h^{q_2 - 1/2}, \tag{3.12}
\]
where $K > 0$ does not depend on $h$ and $k$; i.e., the order of accuracy of the method (3.10) is $q = q_2 - 1/2$.
Many other schemes with strong convergence based on the Ito-Taylor expansion have been developed, such as Runge-Kutta methods, predictor-corrector methods, and splitting (split-step) methods; see e.g. [Kloeden and Platen, 1992, Milstein and Tretyakov, 2004].

3.3. Weak convergence. Weak integration of SDEs means computing the expectation

\[
\mathbb{E}[f(X(T))], \tag{3.13}
\]
where $f(x)$ is a sufficiently smooth function whose growth at infinity is not faster than polynomial:
\[
|f(x)| \le K(1 + |x|^\varkappa) \tag{3.14}
\]
for some $K > 0$ and $\varkappa \ge 1$.

Definition 3.3 (Weak convergence). A method (scheme) is said to have weak convergence order $\gamma$ if there exists a constant $K > 0$ independent of $h$ such that

\[
|\mathbb{E}[f(X_k)] - \mathbb{E}[f(X(t_k))]| \le K h^\gamma
\]
for any $k = 0, 1, \ldots, N$ with $Nh = T$ and sufficiently small $h$.
Under the conditions of Theorem 1.3, the following error estimate holds for the forward Euler scheme (3.2) (see e.g. [Milstein and Tretyakov, 2004, Chapter 2]):

\[
|\mathbb{E} f(X_k) - \mathbb{E} f(X(t_k))| \le K h, \tag{3.15}
\]
where $K > 0$ is a constant independent of $h$. The backward Euler scheme (3.3) and the Milstein scheme (3.6) are also of weak convergence order one. First-order weak convergence of the forward Euler scheme can also be achieved by replacing $\xi_{r,k+1}$ with discrete random variables [Milstein and Tretyakov, 2004]; e.g., the weak Euler scheme has the form
\[
\tilde{X}_{k+1} = \tilde{X}_k + h\, a(t_k, \tilde{X}_k) + \sqrt{h} \sum_{r=1}^m \sigma_r(t_k, \tilde{X}_k)\, \zeta_{r,k+1}, \quad k = 0, \ldots, N-1, \tag{3.16}
\]
where $\tilde{X}_0 = x_0$ and the $\zeta_{r,k+1}$ are i.i.d. random variables with the law
\[
P(\zeta = \pm 1) = 1/2. \tag{3.17}
\]
The following error estimate holds for (3.16)-(3.17) (see e.g. [Milstein and Tretyakov, 2004, Chapter 2]):
\[
|\mathbb{E} f(\tilde{X}_N) - \mathbb{E} f(X(T))| \le K h, \tag{3.18}
\]
where $K > 0$ can be a different constant from that in (3.15). Introducing the function $\varphi(y)$, $y \in \mathbb{R}^{mN}$, such that

\[
\varphi(\xi_{1,1}, \ldots, \xi_{m,1}, \ldots, \xi_{1,N}, \ldots, \xi_{m,N}) = f(X_N), \tag{3.19}
\]

\[
\mathbb{E}[f(X(T))] \approx \mathbb{E} f(X_N) = \mathbb{E}\, \varphi(\xi_{1,1}, \ldots, \xi_{m,1}, \ldots, \xi_{1,N}, \ldots, \xi_{m,N}) \tag{3.20}
\]
\[
= \frac{1}{(2\pi)^{mN/2}} \int_{\mathbb{R}^{mN}} \varphi(y_{1,1}, \ldots, y_{m,1}, \ldots, y_{1,N}, \ldots, y_{m,N}) \exp\Big( -\frac{1}{2} \sum_{i=1}^{mN} y_i^2 \Big)\, dy.
\]
Further, it is not difficult to see from (3.16)-(3.17) that
\[
\mathbb{E}[f(X(T))] \approx \mathbb{E} f(\tilde{X}_N) = \mathbb{E}\, \varphi(\zeta_{1,1}, \ldots, \zeta_{m,1}, \ldots, \zeta_{1,N}, \ldots, \zeta_{m,N}) \tag{3.21}
\]
\[
= Q_2^{\otimes mN} \varphi(y_{1,1}, \ldots, y_{m,1}, \ldots, y_{1,N}, \ldots, y_{m,N}),
\]
where $Q_2$ is the Gauss-Hermite quadrature rule with nodes $\pm 1$ and equal weights $1/2$. We note that $\mathbb{E}[f(\tilde{X}_N)]$ can be viewed as an approximation of $\mathbb{E}[f(X_N)]$ and that (cf. (3.15) and (3.18))

\[
\mathbb{E}[f(\tilde{X}_N)] - \mathbb{E}[f(X_N)] = O(h).
\]
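A small illustration of the weak Euler scheme (3.16)-(3.17) (not from the original notes; the test problem and parameters are assumptions): for geometric Brownian motion with $f(x) = x^2$, the exact value $\mathbb{E}[f(X(T))] = x_0^2 e^{(2\mu + \sigma^2)T}$ is available for comparison:

```python
import numpy as np

# Weak Euler scheme (3.16)-(3.17): Gaussian increments replaced by +/-1
# (Rademacher) variables. Test problem: GBM with f(x) = x^2, for which
# E[f(X(T))] = x0^2 * exp((2*mu + s^2) * T) in closed form.
mu, s, x0, T = 0.05, 0.3, 1.0, 1.0
n_steps, n_paths = 100, 1_000_000
h = T / n_steps

rng = np.random.default_rng(5)
x = np.full(n_paths, x0)
for _ in range(n_steps):
    zeta = rng.choice([-1.0, 1.0], size=n_paths)   # P(zeta = +/-1) = 1/2
    x = x + mu * x * h + s * x * np.sqrt(h) * zeta

exact = x0**2 * np.exp((2 * mu + s**2) * T)
print(np.mean(x**2), exact)   # agreement up to O(h) + Monte Carlo error
```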

Remark 3.4. Let $\zeta_{r,k+1}$ in (3.16) be i.i.d. random variables with the law

\[
P(\zeta = y_{n,j}) = w_{n,j}, \quad j = 1, \ldots, n, \tag{3.22}
\]
where the $y_{n,j}$ are the nodes of the Gauss-Hermite quadrature $Q_n$ and the $w_{n,j}$ are the corresponding quadrature weights. Then
\[
\mathbb{E} f(\tilde{X}_N) = \mathbb{E}\, \varphi(\zeta_{1,1}, \ldots, \zeta_{m,N}) = Q_n^{\otimes mN} \varphi(y_{1,1}, \ldots, y_{m,N}),
\]
which can be a more accurate approximation of $\mathbb{E}[f(X_N)]$ than (3.21), but the weak-sense error of the SDE approximation, $\mathbb{E} f(\tilde{X}_N) - \mathbb{E} f(X(T))$, remains of order $O(h)$.

We can also use the following second-order weak scheme (3.23) for (1.1) (see, e.g., [Milstein and Tretyakov, 2004, Chapter 2]):
\[
X_{k+1} = X_k + h\, a(t_k, X_k) + \sqrt{h} \sum_{i=1}^m \sigma_i(t_k, X_k)\, \xi_{i,k+1} + \frac{h^2}{2} L a(t_k, X_k) \tag{3.23}
\]
\[
+\ h \sum_{i,j=1}^m \Lambda_i \sigma_j(t_k, X_k)\, \eta_{i,j,k+1} + \frac{h^{3/2}}{2} \sum_{i=1}^m \big( \Lambda_i a(t_k, X_k) + L \sigma_i(t_k, X_k) \big)\, \xi_{i,k+1},
\]
$k = 0, \ldots, N-1$, where $X_0 = x_0$; $\eta_{i,j} = \frac{1}{2} \xi_i \xi_j - \gamma_{i,j} \zeta_i \zeta_j / 2$ with $\gamma_{i,j} = -1$ if $i < j$ and $\gamma_{i,j} = 1$ otherwise;
\[
\Lambda_l = \sum_{i=1}^m \sigma_l^i \frac{\partial}{\partial x_i}, \qquad L = \frac{\partial}{\partial t} + \sum_{i=1}^m a^i \frac{\partial}{\partial x_i} + \frac{1}{2} \sum_{l=1}^m \sum_{i,j=1}^m \sigma_l^i \sigma_l^j \frac{\partial^2}{\partial x_i \partial x_j};
\]
and $\xi_{i,k+1}$ and $\zeta_{i,k+1}$ are mutually independent random variables, either with the Gaussian distribution or with the laws $P(\xi = 0) = 2/3$, $P(\xi = \pm\sqrt{3}) = 1/6$ and $P(\zeta = \pm 1) = 1/2$. The following error estimate holds for (3.23) (see e.g. [Milstein and Tretyakov, 2004, Chapter 2]):
\[
|\mathbb{E} f(X(T)) - \mathbb{E} f(X_N)| \le K h^2.
\]
We again refer to [Kloeden and Platen, 1992, Milstein and Tretyakov, 2004] for more weakly convergent numerical schemes.

3.4. Linear stability. To understand the stability of time integrators for SODEs, we consider the following linear model:
\[
dX = \lambda X\, dt + \sigma\, dW(t), \quad X_0 = x, \quad \lambda < 0. \tag{3.24}
\]
Consider one-step methods of the following form:
\[
X_{n+1} = A(z) X_n + B(z) \sqrt{h}\, \sigma \xi_n, \quad z = \lambda h. \tag{3.25}
\]
Here $h$ is the time step size, $A(z)$ and $B(z)$ are analytic functions, and $\delta W_n = \sqrt{h}\, \xi_n$ are i.i.d. Gaussian random variables. For (3.24), $X(t)$ is a Gaussian random variable and
\[
\lim_{t \to \infty} \mathbb{E}[X(t)] = 0, \qquad \lim_{t \to \infty} \mathbb{E}[X^2(t)] = -\frac{\sigma^2}{2\lambda}.
\]
It can be readily shown that $X_n$ is also a Gaussian random variable with $\mathbb{E}[X_n] = A^n(z)\, x$ and
\[
\lim_{n \to \infty} \mathbb{E}[|X_n|^2] = -\frac{\sigma^2}{2\lambda} R(z), \qquad R(z) = -\frac{2 z B^2(z)}{1 - A^2(z)}.
\]
Here are some examples:
• Euler scheme: $A(z) = 1 + z$, $B(z) = 1$, and $R(z) = \frac{2}{2+z}$.
• Backward Euler scheme: $A(z) = B(z) = \frac{1}{1-z}$ and $R(z) = \frac{2}{2-z}$.
• Midpoint scheme: $A(z) = \frac{1 + z/2}{1 - z/2}$, $B(z) = \frac{1}{1 - z/2}$, and $R(z) = 1$.
For the Euler scheme, $|A(z)| < 1$ requires $-2 < z = \lambda h < 0$. For the backward Euler scheme and the midpoint scheme, any $h > 0$ leads to $|A(z)| < 1$. When $R(z) = 1$ and $|A(z)| < 1$, we call the one-step scheme mean-square stable. If $R(z) = 1$ and $|A(z)| < 1$ hold for all $h > 0$, we call the scheme A-stable in the mean-square sense.
For long-time integration, L-stability is also helpful when a stiff problem (e.g., when $|\lambda|$ is large) is investigated. L-stability requires A-stability and $A(-\infty) = 0$, so that $\lim_{z \to -\infty} \mathbb{E}[X_n] = \lim_{z \to -\infty} A^n(z)\, x = 0$ for any fixed $n$. When $|\lambda h|$ is large (e.g., $|\lambda|$ is too large for any practical $h$ to make $\lambda h$ small), L-stable schemes can still capture the decay of the solution mean $\mathbb{E}[X(t)] \to 0$ even with moderately small time step sizes, while A-stable schemes usually require a very small $h$. For example, the trapezoidal rule is A-stable but not L-stable, since $|A(-\infty)| = 1$. In practice, this means that for an extremely large $|\lambda|$ the trapezoidal rule does not damp the mean, since $|\mathbb{E}[X_{n+1}]| = \lim_{z \to -\infty} |A(z)|\, |\mathbb{E}[X_n]| = |\mathbb{E}[X_n]|$, while $\mathbb{E}[X(t_{n+1})] = e^{\lambda h}\, \mathbb{E}[X(t_n)] \to 0$ as $\lambda h \to -\infty$, where $t_{n+1} - t_n = h$.

However, when $A(z)$ and $B(z)$ are rational functions of $z$, it is impossible to have both A-stability and L-stability, since when $R(z) = 1$ it holds that $|A(-\infty)| = 1$. The claim can be proved by contradiction.

Remark 3.5. It is still possible to have a scheme that is both A-stable and L-stable. Define
\[
\tilde{X}_n = C(z) X_n + D(z)\, \sigma \sqrt{h}\, \xi_n, \quad z = \lambda h,
\]
where $X_n$ is from (3.25). For example, for the backward Euler scheme, $A(z) = B(z) = \frac{1}{1-z}$ and
\[
\lim_{n \to \infty} \mathbb{E}[\tilde{X}_n^2] = -\frac{\sigma^2}{2\lambda} \big( C^2(z) R(z) - 2 z D^2(z) \big).
\]
The limit is exactly the same as the variance of $X(\infty)$ when $C(z) = 1$ and $D(z) = (1 - z/2)^{-1}$. Such a scheme, with $\tilde{X}$ approximating $X$, is called a post-processing scheme or a predictor-corrector scheme.

The linear model (3.24) is designed for additive noise. For multiplicative noise, we can consider the following geometric Brownian motion:
\[
dX = \lambda X\, dt + \sigma X\, dW(t), \quad X(0) = 1. \tag{3.26}
\]

Here we assume that $\lambda, \sigma \in \mathbb{R}$. The solution to (3.26) is
\[
X(t) = \exp\Big( \big(\lambda - \frac{\sigma^2}{2}\big) t + \sigma W(t) \Big).
\]
The solution is mean-square stable if $\lambda + \frac{\sigma^2}{2} < 0$, i.e., $\lim_{t \to \infty} \mathbb{E}[X^2(t)] = 0$. It is asymptotically stable ($\lim_{t \to \infty} |X(t)| = 0$) if $\lambda - \frac{\sigma^2}{2} < 0$. Mean-square stability implies asymptotic stability. Here we consider only mean-square stability.
Applying the forward Euler scheme (3.2) to the linear model (3.26), we have
\[
X_{k+1} = (1 + \lambda h + \sigma \sqrt{h}\, \xi_k) X_k.
\]
The second moment is $\mathbb{E}[X_{k+1}^2] = \mathbb{E}[X_k^2]\, \mathbb{E}[(1 + \lambda h + \sigma \sqrt{h}\, \xi_k)^2] = \mathbb{E}[X_k^2] \big( (1 + \lambda h)^2 + h \sigma^2 \big)$. For $\lim_{k \to \infty} \mathbb{E}[X_{k+1}^2] = 0$, we need $(1 + \lambda h)^2 + h \sigma^2 < 1$. Similarly, for the backward Euler scheme (3.3), we need $1 + \sigma^2 h < (1 - \lambda h)^2$.
We call the region of $(\lambda h, \sigma^2 h)$ where a scheme is mean-square stable the mean-square stability region of the scheme. To allow relatively large $h$ for stiff problems, e.g., when $|\lambda|$ is large, we need a large stability region. Usually, explicit schemes have smaller stability regions than implicit (including semi-implicit and drift-implicit) schemes. Both schemes (3.2) and (3.6) are explicit and hence have small stability regions. To improve the stability region, we can use semi-implicit (drift-implicit) schemes, e.g. (3.3) and the drift-implicit Milstein scheme.
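The two mean-square stability conditions above are easy to tabulate. A minimal sketch (illustrative parameters) showing that, for a stiff drift, the forward Euler scheme requires a small $h$ while the backward Euler scheme is stable for every $h > 0$ whenever $\lambda + \sigma^2/2 < 0$:

```python
import numpy as np

# Mean-square stability check for forward vs. backward Euler on GBM (3.26).
# Forward Euler is stable iff (1 + lam*h)^2 + sig^2*h < 1;
# backward Euler is stable iff 1 + sig^2*h < (1 - lam*h)^2.
def fe_stable(lam, sig, h):
    return (1 + lam * h)**2 + sig**2 * h < 1

def be_stable(lam, sig, h):
    return 1 + sig**2 * h < (1 - lam * h)**2

lam, sig = -50.0, 1.0   # stiff drift: lam + sig^2/2 < 0, so the SDE is ms-stable
for h in [0.1, 0.04, 0.01]:
    print(f"h = {h}: forward Euler stable = {fe_stable(lam, sig, h)}, "
          f"backward Euler stable = {be_stable(lam, sig, h)}")
# forward Euler requires roughly h < -2*(lam + sig^2/2)/lam^2 here,
# while backward Euler is stable for every h > 0 when lam + sig^2/2 < 0
```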

3.5. Nonlinear and stiff SODEs. Let $X_{t_0,X_0}(t) = X(t)$, $t_0 \le t \le T$, be a solution of the system (1.1). We will make the following assumptions.

Assumption 3.6. (i) The initial condition is such that

\[
\mathbb{E}|X_0|^{2p} \le K < \infty \quad \text{for all } p \ge 1. \tag{3.27}
\]

(ii) For a sufficiently large $p_0 \ge 1$ there is a constant $c_1 \ge 0$ such that for $t \in [t_0, T]$,
\[
(x - y,\ a(t, x) - a(t, y)) + \frac{2 p_0 - 1}{2} \sum_{r=1}^m |\sigma_r(t, x) - \sigma_r(t, y)|^2 \le c_1 |x - y|^2, \quad x, y \in \mathbb{R}^d. \tag{3.28}
\]

(iii) There exist $c_2 \ge 0$ and $\varkappa \ge 1$ such that for $t \in [t_0, T]$,
\[
|a(t, x) - a(t, y)|^2 \le c_2 (1 + |x|^{2\varkappa - 2} + |y|^{2\varkappa - 2}) |x - y|^2, \quad x, y \in \mathbb{R}^d. \tag{3.29}
\]

The condition (3.28) implies that
\[
(x,\ a(t, x)) + \frac{2 p_0 - 1 - \varepsilon}{2} \sum_{r=1}^m |\sigma_r(t, x)|^2 \le c_0 + c_1' |x|^2, \quad t \in [t_0, T],\ x \in \mathbb{R}^d, \tag{3.30}
\]

where $c_0 = |a(t, 0)|^2 / 2 + \frac{(2 p_0 - 1 - \varepsilon)(2 p_0 - 1)}{2\varepsilon} \sum_{r=1}^m |\sigma_r(t, 0)|^2$ and $c_1' = c_1 + 1/2$. The inequality (3.30), together with (3.27), is sufficient to ensure finiteness of moments: there is $K > 0$ such that
\[
\mathbb{E}|X_{t_0,X_0}(t)|^{2p} < K (1 + \mathbb{E}|X_0|^{2p}), \quad 1 \le p \le p_0 - 1,\ t \in [t_0, T]. \tag{3.31}
\]
Also, (3.29) implies that

\[
|a(t, x)|^2 \le c_3 + c_2' |x|^{2\varkappa}, \quad t \in [t_0, T],\ x \in \mathbb{R}^d, \tag{3.32}
\]
where $c_3 = 2 |a(t, 0)|^2 + 2 c_2 (\varkappa - 1)/\varkappa$ and $c_2' = 2 c_2 (1 + \varkappa)/\varkappa$.

Example 3.7. Here is an example for Assumption 3.6 (ii):
\[
dX = -\mu X |X|^{r_1 - 1}\, dt + \lambda X^{r_2}\, dW,
\]
where $\mu, \lambda > 0$, $r_1 \ge 1$, and $r_2 \ge 1$. If $r_1 + 1 > 2 r_2$ or $r_1 = r_2 = 1$, then (3.28) is valid for any $p_0 \ge 1$. If $r_1 + 1 = 2 r_2$ and $r_1 > 1$, then (3.28) is valid for $1 \le p_0 \le \mu/\lambda^2 + 1/2$.

We introduce the one-step approximation $\bar{X}_{t,x}(t+h)$, $t_0 \le t < t+h \le T$, for the solution $X_{t,x}(t+h)$ of (1.1), as in (3.9) and (3.10). The following theorem is a generalization of Milstein's fundamental theorem (see [Milstein, 1995] and [Milstein and Tretyakov, 2004, Chapter 1]) from the globally Lipschitz case to the non-globally Lipschitz case. For simplicity, we will consider a uniform time step size, i.e., $h_k = h$ for all $k$.

Theorem 3.8 ([Zhang and Karniadakis, 2017]). Suppose
(i) Assumption 3.6 holds;
(ii) the one-step approximation $\bar{X}_{t,x}(t+h)$ from (3.9) has the following orders of accuracy: for some $p \ge 1$ there are $\alpha \ge 1$, $h_0 > 0$, and $K > 0$ such that for arbitrary $t_0 \le t \le T - h$, $x \in \mathbb{R}^d$, and all $0 < h \le h_0$:
\[
|\mathbb{E}[\bar{X}_{t,x}(t+h) - X_{t,x}(t+h)]| \le K (1 + |x|^{2\alpha})^{1/2} h^{q_1}, \tag{3.33}
\]
\[
\big( \mathbb{E}|\bar{X}_{t,x}(t+h) - X_{t,x}(t+h)|^{2p} \big)^{1/(2p)} \le K (1 + |x|^{2\alpha p})^{1/(2p)} h^{q_2} \tag{3.34}
\]
with
\[
q_2 \ge \frac{1}{2}, \quad q_1 \ge q_2 + \frac{1}{2}; \tag{3.35}
\]
(iii) the approximation $X_k$ from (3.10) has finite moments, i.e., for some $p \ge 1$ there are $\beta \ge 1$, $h_0 > 0$, and $K > 0$ such that for all $0 < h \le h_0$ and all $k = 0, \ldots, N$:
\[
\mathbb{E}|X_k|^{2p} < K (1 + \mathbb{E}|X_0|^{2p\beta}). \tag{3.36}
\]
Then for any $N$ and $k = 0, 1, \ldots, N$ the following inequality holds:
\[
\big( \mathbb{E}|X_{t_0,X_0}(t_k) - \bar{X}_{t_0,X_0}(t_k)|^{2p} \big)^{1/(2p)} \le K (1 + \mathbb{E}|X_0|^{2\gamma p})^{1/(2p)} h^{q_2 - 1/2}, \tag{3.37}
\]
where $K > 0$ and $\gamma \ge 1$ do not depend on $h$ and $k$; i.e., the order of accuracy of the method (3.10) is $q = q_2 - 1/2$.

Corollary 3.9. In the setting of Theorem 3.8, for $p \ge 1/(2q)$ in (3.37), there is $0 < \varepsilon < q$ and an a.s. finite random variable $C(\omega) > 0$ such that
\[
|X_{t_0,X_0}(t_k) - X_k| \le C(\omega)\, h^{q - \varepsilon},
\]
i.e., the method (3.10) for (1.1) converges with order $q - \varepsilon$ a.s. The corollary is proved using the Borel-Cantelli lemma.

Besides the implicit schemes, some explicit schemes can also be used for nonlinear SODEs, e.g., balanced schemes and truncated/projection schemes.

Balanced schemes.

Specifically, the balanced Euler schemes can be written as

\[
X_{k+1} = X_k + a(t_k, X_k) h + \sum_{r=1}^m \sigma_r(t_k, X_k) (W_r(t_{k+1}) - W_r(t_k)) + P(t_k, t_{k+1}, X_k, X_{k+1}, h, W_r(t_{k+1}) - W_r(t_k)),
\]
where the term $P$ is not zero unless $h = 0$. It can be considered as a penalty method or a Lagrange-multiplier method for stiff SDEs. A special form is the so-called tamed Euler scheme:

\[
X_{k+1} = X_k + T_0(a(t_k, X_k) h) + \sum_{r=1}^m T_r\big( \sigma_r(t_k, X_k) (W_r(t_{k+1}) - W_r(t_k)) \big),
\]
where $T(y) = \frac{y}{1 + |y|}$ or $T(y) = \tanh(y)$. Tamed Milstein schemes can also be derived.

Remark 3.10. Apply the taming function only when the drift or diffusion coefficients grow superlinearly. No taming function is needed if the coefficient grows linearly.
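An illustrative sketch (not from the original notes; all parameter values are assumptions) of the tamed Euler scheme with $T(y) = y/(1+|y|)$, applied to the superlinear example of Example 3.7 with $r_1 = 3$, $r_2 = 1$; following Remark 3.10, only the drift increment is tamed since the diffusion grows linearly:

```python
import numpy as np

# Tamed Euler for dX = -mu*X^3 dt + lam*X dW (superlinear drift, Example 3.7
# with r1 = 3, r2 = 1); T(y) = y / (1 + |y|) tames the drift increment.
mu, lam, x0, T = 1.0, 0.5, 2.0, 1.0
n_steps, n_paths = 1000, 10_000
h = T / n_steps

def tame(y):
    return y / (1.0 + np.abs(y))

rng = np.random.default_rng(6)
x = np.full(n_paths, x0)
for _ in range(n_steps):
    dw = np.sqrt(h) * rng.standard_normal(n_paths)
    # drift increment tamed; diffusion left untamed (cf. Remark 3.10)
    x = x + tame(-mu * x**3 * h) + lam * x * dw
print(np.mean(np.abs(x)))   # stays bounded; untamed Euler can explode for large x0
```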

Truncated schemes / projection schemes.
Suppose that $|a(x)| + \sum_{r=1}^m |\sigma_r(x)| \le \mu(R)$ for $|x| \le R$, where $\mu$ is a strictly increasing function with positive values. Take $T(h) \le h^{-1/4}$; then the following scheme converges:

\[
X_{k+1} = X_k + h\, a(\tilde{X}_k) + \sum_{r=1}^m \sigma_r(t_k, \tilde{X}_k)\, \xi_{rk} \sqrt{h}, \tag{3.38}
\]
\[
\tilde{X}_k = X_k \mathbf{1}_{|X_k| \le \mu^{-1}(T(h))} + \mathrm{sgn}(X_k)\, \mu^{-1}(T(h))\, \mathbf{1}_{|X_k| > \mu^{-1}(T(h))}. \tag{3.39}
\]
The above scheme converges even for locally Lipschitz continuous coefficients.
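A sketch of the truncated scheme (3.38)-(3.39) under stated assumptions (not from the original notes): for $a(x) = -x^3$ and $\sigma(x) = x$ one may take $\mu(R) = R^3 + R$, and $\mu^{-1}$ can be evaluated by bisection since $\mu$ is strictly increasing; the choice $T(h) = h^{-1/4}$ follows the text:

```python
import numpy as np

# Truncated (projection) Euler (3.38)-(3.39) for dX = -X^3 dt + X dW.
# With mu(R) = R^3 + R, X_k is projected onto the ball of radius
# r_h = mu^{-1}(T(h)), T(h) = h^{-1/4}, before evaluating the coefficients.
def mu(r):
    return r**3 + r

def mu_inv(y, lo=0.0, hi=1e6, iters=200):
    for _ in range(iters):           # bisection on the increasing function mu
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if mu(mid) < y else (lo, mid)
    return 0.5 * (lo + hi)

def truncated_euler(x0, h, n, rng):
    r_h = mu_inv(h**-0.25)
    x = x0
    for _ in range(n):
        xt = np.clip(x, -r_h, r_h)   # projection step (3.39)
        x = x + h * (-xt**3) + xt * np.sqrt(h) * rng.standard_normal()  # (3.38)
    return x

rng = np.random.default_rng(8)
print(truncated_euler(x0=2.0, h=1e-3, n=1000, rng=rng))
```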

Theorem 3.11. Assume that (local Lipschitz condition)

\[
|a(x) - a(y)| + \sum_{r=1}^m |\sigma_r(x) - \sigma_r(y)| \le K_R |x - y|, \quad |x|, |y| \le R, \tag{3.40}
\]
and (linear growth condition)

\[
x^\top a(x) + \frac{p - 1}{2} \sum_{r=1}^m |\sigma_r(x)|^2 \le K (1 + |x|^2), \quad p > 2,\ K > 0. \tag{3.41}
\]
The truncated scheme converges for $q \in [2, p]$:

\[
\lim_{h \to 0} \mathbb{E}[|X_N - X(t_N)|^q] = 0.
\]
3.6. Wong-Zakai approximation. Suppose that the trajectories of the Brownian motion are approximated by piecewise continuously differentiable functions

\[
\lim_{n \to \infty} \sup_{0 \le t \le T} |W_n(t) - W(t)| = 0. \tag{3.42}
\]

Appendix A. Burkholder-Davis-Gundy inequality

For any $1 \le p < \infty$, there exist constants $c_p, C_p > 0$ such that for all (local) martingales $X$ with $X_0 = 0$ and all stopping times $\tau$, the following inequality holds:
\[
c_p\, \mathbb{E}\big[ [X]_\tau^{p/2} \big] \le \mathbb{E}[(X_\tau^*)^p] \le C_p\, \mathbb{E}\big[ [X]_\tau^{p/2} \big].
\]
Here $X_t^* = \sup_{s \le t} |X_s|$ is the maximum process of $X_t$, and $[X]$ is the quadratic variation of $X$. Furthermore, for continuous (local) martingales, this statement holds for all $0 < p < \infty$. The proof is based on Doob's maximal inequality and Ito's formula.

Appendix B. Fully implicit schemes

Fully implicit schemes are also used in practice because of their symplecticity-preserving property and their effectiveness in long-time integration. The following fully implicit scheme is from [Milstein and Tretyakov, 2004]:

\[
X_{k+1} = X_k + a(t_{k+\lambda}, (1 - \lambda) X_k + \lambda X_{k+1})\, h
- \lambda \sum_{r=1}^m \sum_{j=1}^d \frac{\partial \sigma_r}{\partial x^j}(t_{k+\lambda}, (1 - \lambda) X_k + \lambda X_{k+1})\, \sigma_r^j(t_{k+\lambda}, (1 - \lambda) X_k + \lambda X_{k+1})\, h
\]
\[
+ \sum_{r=1}^m \sigma_r(t_{k+\lambda}, (1 - \lambda) X_k + \lambda X_{k+1})\, (\zeta_{rh})_k \sqrt{h}, \tag{B.1}
\]
where $0 < \lambda \le 1$, $t_{k+\lambda} = t_k + \lambda h$, and the $(\zeta_{rh})_k$ are i.i.d. random variables with
\[
\zeta_h = \begin{cases} \xi, & |\xi| \le A_h, \\ A_h, & \xi > A_h, \\ -A_h, & \xi < -A_h, \end{cases} \tag{B.2}
\]
where $\xi \sim \mathcal{N}(0, 1)$ and $A_h = \sqrt{2 l |\ln h|}$ with $l \ge 1$. We recall [Milstein and Tretyakov, 2004, Lemma 1.3.4] that
\[
\mathbb{E}[(\xi - \zeta_h)^2] \le \big( 1 + 2 \sqrt{2 l |\ln h|} \big)\, h^l. \tag{B.3}
\]
All these fully implicit schemes are of half-order convergence in the mean-square sense; see e.g. [Milstein and Tretyakov, 2004, Chapter 1]. When the noise is additive, i.e., the coefficients of the noises are functions of time instead of functions of the solution, these schemes are of first-order convergence.
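Sampling the truncated Gaussian variable $\zeta_h$ in (B.2) amounts to clipping a standard normal at $\pm A_h$. A minimal sketch (parameter values assumed):

```python
import numpy as np

# Sampling the truncated Gaussian zeta_h of (B.2): a standard normal xi
# clipped to [-A_h, A_h] with A_h = sqrt(2*l*|ln h|), l >= 1.
def sample_zeta_h(h, l, size, rng):
    a_h = np.sqrt(2.0 * l * np.abs(np.log(h)))
    xi = rng.standard_normal(size)
    return np.clip(xi, -a_h, a_h)

rng = np.random.default_rng(7)
h, l = 1e-3, 2
z = sample_zeta_h(h, l, size=1_000_000, rng=rng)
print(z.min(), z.max())   # bounded by A_h, which keeps the implicit step well-posed
print(np.mean(z**2))      # second moment stays close to 1
```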

References

[Higham, 2001] Higham, D. J. (2001). An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM Rev., 43(3):525–546 (electronic).

[Karatzas and Shreve, 1991] Karatzas, I. and Shreve, S. E. (1991). Brownian motion and stochastic calculus, volume 113 of Graduate Texts in Mathematics. Springer-Verlag, New York, second edition.

[Kloeden and Platen, 1992] Kloeden, P. E. and Platen, E. (1992). Numerical solution of stochastic differential equations. Springer-Verlag, Berlin.

[Milstein, 1995] Milstein, G. N. (1995). Numerical integration of stochastic differential equations. Kluwer Academic Publishers Group, Dordrecht.

[Milstein and Tretyakov, 2004] Milstein, G. N. and Tretyakov, M. V. (2004). Stochastic numerics for mathematical physics. Springer-Verlag, Berlin.

[Øksendal, 2003] Øksendal, B. (2003). Stochastic differential equations. Universitext. Springer-Verlag, Berlin, sixth edition. An introduction with applications.

[Schurz, 2002] Schurz, H. (2002). Numerical analysis of stochastic differential equations without tears. In Handbook of stochastic analysis and applications, volume 163 of Statist. Textbooks Monogr., pages 237–359. Dekker, New York.

[Zhang and Karniadakis, 2017] Zhang, Z. and Karniadakis, G. E. (2017). Numerical methods for stochastic partial differential equations with white noise, volume 196 of Applied Mathematical Sciences. Springer, Cham.