CHAPTER 7
Wiener Process
Chapters 7, 8 and 9 offer a brief introduction to stochastic differential equations (SDEs). A standard reference for the material presented hereafter is the book by R. Durrett, “Stochastic Calculus: A Practical Introduction” (CRC 1998). For a discussion of the Wiener measure and its link with path integrals see e.g. the book by M. Kac, “Probability and Related Topics in Physical Sciences” (AMS, 1991).
1. The Wiener process as a scaled random walk
Consider a simple random walk $\{X_n\}_{n\in\mathbb{N}}$ on the lattice of integers $\mathbb{Z}$:

(1)   $X_n = \sum_{k=1}^n \xi_k$,

where $\{\xi_k\}_{k\in\mathbb{N}}$ is a collection of independent, identically distributed (i.i.d.) random variables with $P(\xi_k = \pm 1) = \tfrac12$. The Central Limit Theorem (see the Addendum at the end of this chapter) asserts that

   $\frac{X_N}{\sqrt{N}} \to N(0,1) \quad (\text{the Gaussian variable with mean 0 and variance 1})$

in distribution as $N \to \infty$. This suggests defining the piecewise constant random function $W^N$ on $t \in [0,\infty)$ by letting

(2)   $W^N_t = \frac{X_{\lfloor Nt\rfloor}}{\sqrt{N}}$,

where $\lfloor Nt\rfloor$ denotes the largest integer less than or equal to $Nt$ and, in accordance with standard notations for stochastic processes, we have written $t$ as a subscript, i.e. $W^N_t = W^N(t)$. It can be shown that as $N \to \infty$, $W^N_t$ converges in distribution to a stochastic process $W_t$, termed the Wiener process or Brownian motion¹, with the following properties:
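The scaled-walk construction in (1)–(2) is easy to simulate. Below is a minimal sketch (assuming NumPy is available; the function name `scaled_random_walk` is ours, not from the text) that samples a path of $W^N$ and checks that the endpoint $W^N_1$ has approximately the $N(0,1)$ statistics predicted by the Central Limit Theorem:

```python
import numpy as np

def scaled_random_walk(N, rng):
    """Sample the piecewise constant path W^N of Eq. (2) on [0, 1]:
    partial sums of i.i.d. +-1 steps, scaled by 1/sqrt(N)."""
    xi = rng.choice([-1.0, 1.0], size=N)                  # xi_k = +-1, prob 1/2 each
    W = np.concatenate(([0.0], np.cumsum(xi))) / np.sqrt(N)
    t = np.arange(N + 1) / N                              # grid t_n = n/N
    return t, W

rng = np.random.default_rng(0)
t, W = scaled_random_walk(10_000, rng)                    # one sample path

# At fixed t = 1 the marginal should be approximately N(0, 1):
samples = np.array([scaled_random_walk(1_000, rng)[1][-1] for _ in range(2_000)])
print(samples.mean(), samples.var())                      # near 0 and 1 respectively
```

Plotting several such paths for increasing $N$ reproduces Figure 1.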
(a) Independence. $W_t - W_s$ is independent of $\{W_\tau\}_{\tau\le s}$ for any $0 \le s \le t$.
(b) Stationarity. The statistical distribution of $W_{t+s} - W_s$ is independent of $s$ (and so identical to the distribution of $W_t$).
(c) Gaussianity. $W_t$ is a Gaussian process with mean and covariance

   $EW_t = 0, \qquad EW_tW_s = \min(t,s)$.
¹The Brownian motion is named after the botanist Robert Brown, who observed in 1827 the irregular motion of pollen particles floating in water. It should be noted, however, that a similar observation had been made earlier, in 1765, by the physiologist Jan Ingenhousz, about carbon dust in alcohol. Somehow Brown's name became associated with the phenomenon, probably because Ingenhouszian motion does not sound that good. Some of us with complicated names are sensitive to this story.
Figure 1. Realizations of $W^N_t$ for $N = 100$ (blue), $N = 400$ (red), and $N = 10000$ (green).
(d) Continuity. With probability 1, $W_t$ viewed as a function of $t$ is continuous.

To show independence and stationarity, notice that for $1 \le m \le n$,

   $X_n - X_m = \sum_{k=m+1}^{n} \xi_k$

is independent of $X_m$ and is identically distributed to $X_{n-m}$. It follows that for any $0 \le s \le t$, $W_t - W_s$ is independent of $W_s$ and satisfies

(3)   $W_t - W_s \stackrel{d}{=} W_{t-s}$,

where $\stackrel{d}{=}$ means that the random processes on both sides of the equality have the same distribution. To show Gaussianity, observe that at fixed time $t \ge 0$, $W^N_t$ converges as $N \to \infty$ to a Gaussian variable with mean zero and variance $t$, since

   $W^N_t = \frac{X_{\lfloor Nt\rfloor}}{\sqrt{N}} = \frac{X_{\lfloor Nt\rfloor}}{\sqrt{\lfloor Nt\rfloor}}\,\frac{\sqrt{\lfloor Nt\rfloor}}{\sqrt{N}} \;\stackrel{d}{\to}\; N(0,1)\,\sqrt{t} = N(0,t)$.

In other words,

(4)   $P(W_t \in [x_1, x_2]) = \int_{x_1}^{x_2} \rho(x,t)\,dx$,

where

(5)   $\rho(x,t) = \frac{e^{-x^2/2t}}{\sqrt{2\pi t}}$.

In fact, given any partition $0 \le t_1 \le t_2 \le \cdots \le t_n$, the vector $(W^N_{t_1},\ldots,W^N_{t_n})$ converges in distribution to an $n$-dimensional Gaussian random variable. Indeed, using (3) recursively together with (4) and (5), it is easy to see that the probability density that $(W_{t_1},\ldots,W_{t_n}) = (x_1,\ldots,x_n)$ is simply given by

(6)   $\rho(x_n - x_{n-1}, t_n - t_{n-1}) \cdots \rho(x_2 - x_1, t_2 - t_1)\,\rho(x_1,t_1)$.

A simple calculation using

   $EW_t = \int_{\mathbb{R}} x\,\rho(x,t)\,dx, \qquad EW_tW_s = \int_{\mathbb{R}^2} yx\,\rho(y-x,t-s)\,\rho(x,s)\,dx\,dy$

for $t \ge s$ (and similarly for $t \le s$) then yields the mean and covariance stated in (c). The covariance $EW_tW_s = \min(t,s)$ already suggests that $W_t$ is not a smooth function of $t$. In fact, it can be shown that even though $W_t$ is continuous with probability 1 (in fact Hölder continuous with any exponent $\gamma < 1/2$), with probability 1 it is nowhere differentiable. This is consistent with the following property of self-similarity: for $\lambda > 0$,

   $W_t \stackrel{d}{=} \lambda^{-1/2} W_{\lambda t}$,

which is easily established upon verifying that both $W_t$ and $\lambda^{-1/2}W_{\lambda t}$ are Gaussian processes with the same (zero) mean and covariance.

More about the lack of regularity of the Wiener process can be understood from first passage times. For given $a > 0$ define the first passage time $T_a \equiv \inf\{t : W_t = a\}$. Now, observe that

(7)   $P(W_t > a) = P(T_a < t)\,P(W_t > a \mid T_a < t) = \tfrac12\,P(T_a < t)$,

since, by symmetry (this is the reflection principle), a path that has reached the level $a$ before time $t$ is equally likely to lie above or below $a$ at time $t$. Hence

(8)   $P(T_a < t) = 2\,P(W_t > a) = 2\int_a^\infty \rho(x,t)\,dx$,

and letting $a \to 0^+$ shows that $P(T_a < t) \to 2\,P(W_t > 0) = 1$ for every $t > 0$.
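The reflection-principle identity $P(T_a < t) = 2P(W_t > a)$ can be checked by Monte Carlo. A minimal sketch (assuming NumPy is available) approximates the event $\{T_a < t\}$ by the event that the maximum of a finely discretized path exceeds $a$; the discrete maximum slightly underestimates the continuous one, so agreement is up to a small grid bias:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(1)
a, t, N, paths = 1.0, 1.0, 2_500, 4_000
dW = np.sqrt(t / N) * rng.standard_normal((paths, N))   # Brownian increments
W = np.cumsum(dW, axis=1)                                # discretized paths

# First passage above level a before time t <=> running maximum exceeds a
p_passage = (W.max(axis=1) >= a).mean()
# Reflection principle: P(T_a < t) = 2 P(W_t > a) = 1 - erf(a / sqrt(2 t))
p_exact = 1.0 - erf(a / sqrt(2.0 * t))
print(p_passage, p_exact)                                # both near 0.317
```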
Defining the running maximum and minimum $M_\varepsilon = \max_{0\le t\le\varepsilon} W_t$ and $m_\varepsilon = \min_{0\le t\le\varepsilon} W_t$, it follows that for every $\varepsilon > 0$

(10)   $P(M_\varepsilon > 0) = 1 \quad\text{and}\quad P(m_\varepsilon < 0) = 1$.

In particular, $t = 0$ is an accumulation point of zeros of $W_t$: with probability 1 the first return time to 0 (and thus, in fact, to any point, once attained) is arbitrarily small.
2. Two alternative constructions of the Wiener process
Since $W_t$ is a Gaussian process, it is completely specified by its mean and covariance,

(11)   $EW_t = 0, \qquad EW_tW_s = \min(t,s)$,

in the sense that any continuous Gaussian process with the same statistics is also a Wiener process. This observation can be used to give other constructions of the Wiener process. In this section, we recall two of them.
The first construction is useful in simulations. Define a set of independent Gaussian random variables $\{\eta_k\}_{k\in\mathbb{N}}$, each with mean zero and variance unity, and let $\{\phi_k(t)\}_{k\in\mathbb{N}}$ be any orthonormal basis for $L^2[0,1]$ (that is, the space of square integrable functions on the unit interval). Thus any function $f(t)$ in this space can be decomposed as $f(t) = \sum_{k\in\mathbb{N}} \alpha_k\phi_k(t)$ for an appropriate choice of the coefficients $\alpha_k$. Then the stochastic process defined by

(12)   $W_t = \sum_{k\in\mathbb{N}} \eta_k\int_0^t \phi_k(t')\,dt'$

is a Wiener process on the interval $[0,1]$. To show this, it suffices to check that it has the correct pairwise covariance, since $W_t$, being a linear combination of zero-mean Gaussian random variables, must itself be a Gaussian random variable with zero mean. Now,

(13)   $EW_tW_s = \sum_{k,l\in\mathbb{N}} E\eta_k\eta_l \int_0^t \phi_k(t')\,dt' \int_0^s \phi_l(s')\,ds' = \sum_{k\in\mathbb{N}} \int_0^t \phi_k(t')\,dt' \int_0^s \phi_k(s')\,ds'$,

where we have invoked the independence of the random variables $\{\eta_k\}$. Now, to interpret the summands, start by defining the indicator function of the interval $[0,\tau]$:

   $\chi_\tau(t) = \begin{cases} 1 & \text{if } t \in [0,\tau], \\ 0 & \text{otherwise.} \end{cases}$

If $\tau \in [0,1]$, this function admits the series expansion

(14)   $\chi_\tau(t) = \sum_k \phi_k(t) \int_0^\tau \phi_k(t')\,dt'$.

Using the orthonormality of the $\{\phi_k(t)\}$, equation (13) may be recast as

(15)   $EW_tW_s = \int_0^1 \Big(\sum_k \int_0^t \phi_k(t')\,dt'\,\phi_k(u)\Big)\Big(\sum_l \int_0^s \phi_l(s')\,ds'\,\phi_l(u)\Big)\,du = \int_0^1 \chi_t(u)\,\chi_s(u)\,du = \int_0^1 \chi_{\min(t,s)}(u)\,du = \min(t,s)$,

as required. One standard choice for the set of functions $\{\phi_k(t)\}$ is the Haar basis of step functions, whose first element is the constant function equal to 1 on $[0,1]$, and whose second element is equal to 1 on the half interval $[0,\tfrac12)$ and to $-1$ on $[\tfrac12,1)$.
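Formula (12) is directly usable once the series is truncated. The sketch below (assuming NumPy is available) uses the orthonormal cosine basis $\phi_k(t) = \sqrt{2}\cos((k-\tfrac12)\pi t)$ instead of the Haar basis, since its integrals are available in closed form, and checks that the empirical covariance of the truncated series reproduces $\min(t,s)$:

```python
import numpy as np

def wiener_series(t, etas):
    """Truncation of Eq. (12) with the orthonormal cosine basis
    phi_k(t) = sqrt(2) cos((k - 1/2) pi t) on [0, 1], whose integrals
    int_0^t phi_k = sqrt(2) sin((k - 1/2) pi t) / ((k - 1/2) pi)
    are known in closed form."""
    freqs = (np.arange(1, len(etas) + 1) - 0.5) * np.pi
    basis = np.sqrt(2.0) * np.sin(np.outer(t, freqs)) / freqs
    return basis @ etas

rng = np.random.default_rng(2)
K, M = 200, 4_000                        # number of modes, number of sample paths
t = np.array([0.25, 0.5, 0.75])
paths = np.array([wiener_series(t, rng.standard_normal(K)) for _ in range(M)])

cov = paths.T @ paths / M                # empirical covariance matrix
print(cov)                               # entries near min(t_i, t_j)
```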
Consider

(20)   $u(x,t) = Ef(x + W_t)$.

This is the Feynman–Kac formula for the solution of the diffusion equation

(21)   $\frac{\partial u}{\partial t} = \frac12\frac{\partial^2 u}{\partial x^2}, \qquad u(x,0) = f(x)$.

To show this, note first that

   $u(x,t+s) = Ef(x + W_{t+s}) = Ef\big(x + (W_{t+s} - W_t) + W_t\big) = Eu(x + W_{t+s} - W_t, t) \equiv Eu(x + W_s, t)$,

where we have used the independence of $W_{t+s} - W_t$ and $W_t$. Now, observe that

   $\frac{\partial u}{\partial t}(x,t) = \lim_{s\to 0^+}\frac{1}{s}\big(u(x,t+s) - u(x,t)\big) = \lim_{s\to 0^+}\frac{1}{s}\,E\big(u(x+W_s,t) - u(x,t)\big)$
   $= \lim_{s\to 0^+}\frac{1}{s}\,E\left(\frac{\partial u}{\partial x}(x,t)\,W_s + \frac12\frac{\partial^2 u}{\partial x^2}(x,t)\,W_s^2 + o(s)\right)$,

where we have Taylor-series expanded to obtain the final equality. The result follows by noting that $EW_s = 0$ and $EW_s^2 = s$.

The formula admits many generalizations. For instance: if

(22)   $v(x,t) = Ef(x + W_t) + \int_0^t Eg(x + W_s)\,ds$,

then the function $v(x,t)$ satisfies the diffusion equation with source term the arbitrary function $g(x)$:

(23)   $\frac{\partial v}{\partial t} = \frac12\frac{\partial^2 v}{\partial x^2} + g(x), \qquad v(x,0) = f(x)$.

Or: if

(24)   $w(x,t) = E\left(f(x + W_t)\exp\int_0^t c(x + W_s)\,ds\right)$,

then $w(x,t)$ satisfies the diffusion equation with an exponential growth term:

(25)   $\frac{\partial w}{\partial t} = \frac12\frac{\partial^2 w}{\partial x^2} + c(x)\,w, \qquad w(x,0) = f(x)$.

Addendum: The law of large numbers and the central limit theorem
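Formula (20) gives a simple Monte Carlo method for the heat equation: sample $W_t \sim N(0,t)$ and average $f(x + W_t)$. A minimal sketch (assuming NumPy is available; the helper name `feynman_kac` is ours), tested on $f(x) = x^2$, for which (21) has the exact solution $u(x,t) = x^2 + t$:

```python
import numpy as np

def feynman_kac(f, x, t, n_samples, rng):
    """Monte Carlo evaluation of u(x,t) = E f(x + W_t) from Eq. (20):
    since W_t ~ N(0, t), average f over Gaussian samples."""
    w = np.sqrt(t) * rng.standard_normal(n_samples)
    return f(x + w).mean()

rng = np.random.default_rng(3)
# f(x) = x^2: the heat equation (21) has the exact solution u(x,t) = x^2 + t.
u = feynman_kac(lambda z: z**2, x=1.0, t=0.5, n_samples=200_000, rng=rng)
print(u)        # near 1.0**2 + 0.5 = 1.5
```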
Let $\{X_j\}_{j\in\mathbb{N}}$ be a sequence of i.i.d. (independent, identically distributed) random variables, let $\eta = EX_1$, $\sigma^2 = \mathrm{var}(X_1) = E(X_1 - \eta)^2$, and define

   $S_n = \sum_{j=1}^n X_j$.

The (weak) law of large numbers states that if $E|X_j| < \infty$, then

   $\frac{S_n}{n} \to \eta$ in probability.

The central limit theorem states that if $EX_j^2 < \infty$, then

   $\frac{S_n - n\eta}{\sqrt{n\sigma^2}} \to N(0,1)$ in distribution.

We first give a proof of the law of large numbers under the stronger assumption that $E|X_j|^2 < \infty$. Without loss of generality we can assume that $\eta = 0$. The proof is based on the Chebyshev inequality: suppose $X$ is a random variable with distribution function $F(x) = P(X < x)$. Then, for any $\lambda > 0$,

(26)   $P(|X| \ge \lambda) \le \frac{1}{\lambda^p}\,E|X|^p$.

Indeed:
   $\lambda^p\,P(|X| \ge \lambda) = \lambda^p\int_{|x|\ge\lambda} dF(x) \le \int_{|x|\ge\lambda} |x|^p\,dF(x) \le \int_{\mathbb{R}} |x|^p\,dF(x) = E|X|^p$.

Using Chebyshev's inequality with $p = 2$, we have

   $P\left(\left|\frac{S_n}{n}\right| > \varepsilon\right) \le \frac{1}{\varepsilon^2}\,E\left|\frac{S_n}{n}\right|^2$

for any $\varepsilon > 0$. Using the i.i.d. property (and $\eta = 0$), this gives

   $E|S_n|^2 = E|X_1 + X_2 + \cdots + X_n|^2 = n\,E|X_1|^2$.

Hence

   $P\left(\left|\frac{S_n}{n}\right| > \varepsilon\right) \le \frac{1}{n\varepsilon^2}\,E|X_1|^2 \to 0$

as $n \to \infty$, and this proves the law of large numbers.
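Both limit theorems are easy to observe numerically. A minimal sketch (assuming NumPy is available), with $X_j$ uniform on $[0,1]$, for which $\eta = 1/2$ and $\sigma^2 = 1/12$:

```python
import numpy as np

rng = np.random.default_rng(4)
n, trials = 2_000, 5_000
X = rng.random((trials, n))              # X_j uniform on [0,1]: eta = 1/2, sigma^2 = 1/12
S = X.sum(axis=1)

lln = S / n                              # law of large numbers: concentrates at 1/2
clt = (S - n * 0.5) / np.sqrt(n / 12.0)  # central limit theorem: approx N(0, 1)
print(lln.mean(), clt.mean(), clt.var())
```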
Next we prove the central limit theorem. Let $f$ be the characteristic function of $X_1$, i.e.

(27)   $f(k) \equiv Ee^{ikX_1}, \qquad k \in \mathbb{R}$,

and similarly let $g_n$ be the characteristic function of $S_n/\sqrt{n\sigma^2}$. Then

   $g_n(\xi) = Ee^{i\xi S_n/\sqrt{n\sigma^2}} = \prod_{j=1}^n Ee^{i\xi X_j/\sqrt{n\sigma^2}} = \left(Ee^{i\xi X_1/\sqrt{n\sigma^2}}\right)^n$
   $= \left(1 + \frac{i\xi}{\sqrt{n\sigma^2}}\,EX_1 - \frac{\xi^2}{2n\sigma^2}\,EX_1^2 + o(n^{-1})\right)^n = \left(1 - \frac{\xi^2}{2n} + o(n^{-1})\right)^n \to e^{-\xi^2/2}$

as $n \to \infty$. This shows that the characteristic function of $S_n/\sqrt{n\sigma^2}$ converges to the characteristic function of $N(0,1)$ as $n \to \infty$, which completes the proof.

It is instructive to note that the only property of $X_1$ that we have required in the central limit theorem is that $EX_1^2 < \infty$. In particular, the theorem holds even if the higher moments of $X_1$ are infinite! For one illustration of this, consider a random variable having probability density function

(28)   $\rho(x) = \frac{2}{\pi(1+x^2)^2}$,

for which no moment of order higher than the variance is finite. Nevertheless, we have

   $f(k) \equiv \int_{\mathbb{R}} e^{ikx}\rho(x)\,dx = (1 + |k|)\,e^{-|k|} = 1 - \tfrac12 k^2 + o(k^2)$,

although the characteristic function is not three times differentiable at $k = 0$, so the Taylor series cannot be extended beyond the second-order term.

Notes by Marcus Roper and Ravi Srinivasan.
CHAPTER 8
Stochastic integrals and stochastic differential equations
From (1) and (2), $W^N_{t_n}$, where $t_n = n/N$, satisfies the recurrence relation

(29)   $W^N_{t_{n+1}} = W^N_{t_n} + \xi_{n+1}\sqrt{\Delta t}$,

where $\Delta t = 1/N$ and $\{\xi_n\}_{n\in\mathbb{N}}$ are i.i.d. random variables taking values $\pm 1$ with probability $\tfrac12$ as before. A natural generalization of this relation (29) is

(30)   $X^N_{t_{n+1}} = X^N_{t_n} + b(X^N_{t_n},t_n)\,\Delta t + \sigma(X^N_{t_n},t_n)\,\xi_{n+1}\sqrt{\Delta t}, \qquad X^N_0 = x$.

If the last term were absent, this would be the forward Euler scheme for the ODE $\dot X_t = b(X_t,t)$. If $b(x,t)$ and $\sigma(x,t)$ meet appropriate regularity requirements, it can be shown that $X^N_t$ converges to a stochastic process $X_t$ as $N \to \infty$ (i.e. as $\Delta t \to 0$ with $n\Delta t \to t$). The limiting equation for $X_t$ is denoted as the stochastic differential equation (SDE)
(31)   $dX_t = b(X_t,t)\,dt + \sigma(X_t,t)\,dW_t, \qquad X_0 = x$,

as a reminder that the last term in (30), divided by $\Delta t$, does not have a standard function as its limit. The notation $dW_t$ comes from (29), since that equation can be written as $W^N_{t_{n+1}} - W^N_{t_n} = \xi_{n+1}\sqrt{\Delta t}$. We note that the convergence of $X^N_t$ to $X_t$ holds provided only that the $\xi_n$'s are i.i.d. random variables with mean zero, $E\xi_n = 0$, and variance one, $E\xi_n^2 = 1$. The standard choice in numerical schemes is to take $\xi_n = N(0,1)$, in which case

   $\xi_{n+1}\sqrt{\Delta t} \stackrel{d}{=} W_{t_{n+1}} - W_{t_n}$.

In the discussion below, however, we will stick to the choice where $\{\xi_n\}_{n\in\mathbb{N}}$ are i.i.d. random variables taking values $\pm 1$ with probability $\tfrac12$, since it facilitates the calculations.

Next, we study the properties of the solution $X_t$ of (31) and introduce some nonstandard calculus, due to Itô, to manipulate this solution.
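The recurrence (30) with the Gaussian choice $\xi_n = N(0,1)$ is the Euler–Maruyama scheme used in practice. A minimal sketch (assuming NumPy is available; the function name `euler_maruyama` is ours):

```python
import numpy as np

def euler_maruyama(b, sigma, x0, T, N, rng):
    """Scheme (30) with Gaussian increments xi ~ N(0, 1):
    X_{n+1} = X_n + b(X_n, t_n) dt + sigma(X_n, t_n) xi sqrt(dt)."""
    dt = T / N
    X = np.empty(N + 1)
    X[0] = x0
    for n in range(N):
        xi = rng.standard_normal()
        X[n + 1] = X[n] + b(X[n], n * dt) * dt + sigma(X[n], n * dt) * xi * np.sqrt(dt)
    return X

rng = np.random.default_rng(5)
# With sigma = 0 the scheme reduces to forward Euler for dX/dt = -X:
X = euler_maruyama(lambda x, t: -x, lambda x, t: 0.0, x0=1.0, T=1.0, N=1_000, rng=rng)
print(X[-1])    # near exp(-1) ~ 0.3679
```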
1. Itô isometry and Itô formula

Consider the recurrence relation

   $X^N_{t_{n+1}} = X^N_{t_n} + f(W^N_{t_n},t_n)\,\xi_{n+1}\sqrt{\Delta t}, \qquad X^N_0 = 0$.

Let us investigate the properties of the limit of $X^N_{n\Delta t}$ as $N \to \infty$, assuming that this limit exists. The limiting form of the recurrence relation above is traditionally denoted as

   $dX_t = f(W_t,t)\,dW_t, \qquad X_0 = 0$,

which can also be expressed as the stochastic integral

   $X_t = \int_0^t f(W_s,s)\,dW_s$.

Stochastic integrals have special properties, called the Itô isometries:
   $E\int_0^t f(W_s,s)\,dW_s = 0$,

   $E\left(\int_0^t f(W_s,s)\,dW_s\right)^2 = \int_0^t Ef^2(W_s,s)\,ds$.

The first Itô isometry is often written and used in differential form,

   $Ef(W_s,s)\,dW_s = 0$.

The Itô isometries are easy to demonstrate. The first is implied by
   $EX^N_{t_n} = E\sum_{m=0}^{n-1} f(W^N_{t_m},t_m)\,\xi_{m+1}\sqrt{\Delta t} = \sum_{m=0}^{n-1} Ef(W^N_{t_m},t_m)\;E\xi_{m+1}\,\sqrt{\Delta t} = 0$,

where we used the independence of the $\xi_m$'s and $E\xi_m = 0$. The second is implied by

   $E(X^N_{t_n})^2 = \sum_{m,p=0}^{n-1} E\,f(W^N_{t_m},t_m)\,f(W^N_{t_p},t_p)\,\xi_{m+1}\xi_{p+1}\,\Delta t = \sum_{m=0}^{n-1} Ef^2(W^N_{t_m},t_m)\,\Delta t$,

where we use the fact that $\xi_{m+1}$ and $\xi_{p+1}$ are independent unless $m = p$, and $\xi_m^2 = 1$ by definition.

Going back to (31), a very important formula to manipulate the solution of this equation is the Itô formula, which states the following. Assume that $X_t$ is the solution of (31) and let $f$ be a smooth function. Then $f(X_t)$ satisfies the SDE

   $df(X_t) = f'(X_t)\,dX_t + \tfrac12 f''(X_t)\,\sigma^2(X_t,t)\,dt$
   $\phantom{df(X_t)} = \left(f'(X_t)\,b(X_t,t) + \tfrac12 f''(X_t)\,\sigma^2(X_t,t)\right)dt + f'(X_t)\,\sigma(X_t,t)\,dW_t$.

If $f$ depends explicitly on $t$, then the additional term $(\partial f/\partial t)\,dt$ is present on the right-hand side. The Itô formula is the analog of the chain rule in ordinary differential calculus. However, the ordinary chain rule would give

   $df(X_t) = f'(X_t)\,dX_t$.

Here, because of the non-differentiability of $X_t$, we have the additional term that depends on $f''(x)$.
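The extra term is visible numerically. For $f(x) = x^2$ the Itô formula gives $d(W_t^2) = 2W_t\,dW_t + dt$, so the left-endpoint (Itô) sums $\sum_n W_{t_n}(W_{t_{n+1}} - W_{t_n})$ should approach $\tfrac12 W_T^2 - \tfrac12 T$ rather than the value $\tfrac12 W_T^2$ predicted by ordinary calculus. A minimal sketch (assuming NumPy is available):

```python
import numpy as np

rng = np.random.default_rng(6)
N, paths, T = 2_000, 2_000, 1.0
dt = T / N
dW = np.sqrt(dt) * rng.standard_normal((paths, N))       # increments W_{t_{n+1}} - W_{t_n}
W = np.cumsum(dW, axis=1)
W_left = np.hstack([np.zeros((paths, 1)), W[:, :-1]])    # W_{t_n}: left endpoints

ito = (W_left * dW).sum(axis=1)                          # Ito sums of W dW
rhs = 0.5 * W[:, -1] ** 2 - 0.5 * T                      # (1/2) W_T^2 - (1/2) T
print(np.abs(ito - rhs).max())                           # small: pathwise agreement
print(ito.mean(), (ito**2).mean())                       # near 0 and T^2/2 = 0.5
```

The last line also illustrates the two Itô isometries for $f(W_s,s) = W_s$: the mean of the integral is 0 and its second moment is $\int_0^T s\,ds = T^2/2$.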
The proof of the Itô formula can be outlined as follows. We Taylor expand $f(X^N_{t_{n+1}}) - f(X^N_{t_n})$ using the recurrence relation (30) for $X^N_{t_n}$ and keep terms up to $O(\Delta t)$:

   $f(X^N_{t_{n+1}}) - f(X^N_{t_n}) = f'(X^N_{t_n})\big(X^N_{t_{n+1}} - X^N_{t_n}\big) + \tfrac12 f''(X^N_{t_n})\big(X^N_{t_{n+1}} - X^N_{t_n}\big)^2 + \cdots$
   $= f'(X^N_{t_n})\big(X^N_{t_{n+1}} - X^N_{t_n}\big) + \tfrac12 f''(X^N_{t_n})\big(b(X^N_{t_n},t_n)\Delta t + \sigma(X^N_{t_n},t_n)\,\xi_{n+1}\sqrt{\Delta t}\big)^2 + O(\Delta t^{3/2})$
   $= f'(X^N_{t_n})\big(X^N_{t_{n+1}} - X^N_{t_n}\big) + \tfrac12 f''(X^N_{t_n})\,\sigma^2(X^N_{t_n},t_n)\,\xi_{n+1}^2\,\Delta t + O(\Delta t^{3/2})$.

The Itô formula follows in the limit as $\Delta t \to 0$ because

   $\tfrac12 f''(X^N_{t_n})\,\sigma^2(X^N_{t_n},t_n)\,\xi_{n+1}^2\,\Delta t \to \tfrac12 f''(X_t)\,\sigma^2(X_t,t)\,dt$,

since $\xi_{n+1}^2 = 1$ by definition, and the higher-order terms in the expansion give no contribution in the limit as $\Delta t \to 0$.

2. Examples

The Itô isometries and the Itô formula are the backbone of the Itô calculus, which we now use to compute some stochastic integrals and solve some SDEs. As an example of a stochastic integral, consider

   $\int_0^t W_s\,dW_s$.

Taking $f(x) = x^2$ in the Itô formula gives

   $d(W_t^2) = 2W_t\,dW_t + dt$.

Therefore

   $\int_0^t W_s\,dW_s = \tfrac12 W_t^2 - \tfrac12 t$.

Notice that the second term on the right-hand side would be absent by the rules of standard calculus. Yet this term must be present for consistency, since the expectation of the left-hand side is

   $E\int_0^t W_s\,dW_s = 0$,

using the first Itô isometry, and the expectation of the right-hand side is zero only with the term $-\tfrac12 t$ included, since $\tfrac12 EW_t^2 = \tfrac12 t$.

As a first example of an SDE, consider

   $dX_t = -\gamma X_t\,dt + \sigma\,dW_t, \qquad X_0 = x$.

This is the Ornstein–Uhlenbeck process. Using the Itô formula with $f(x,t) = e^{\gamma t}x$, we get (this is Duhamel's principle)

   $d\big(e^{\gamma t}X_t\big) = \gamma e^{\gamma t}X_t\,dt + e^{\gamma t}\,dX_t = \sigma e^{\gamma t}\,dW_t$.

Integrating gives

   $X_t = e^{-\gamma t}x + \sigma\int_0^t e^{-\gamma(t-s)}\,dW_s$.
Figure 1. Three realizations of the Ornstein-Uhlenbeck process with X0 = 0 and γ = σ = 1.
This process is Gaussian, being a linear combination of the Gaussian process $W_t$. Its mean and variance are (using the Itô isometries)

   $EX_t = e^{-\gamma t}x$,
   $E(X_t - EX_t)^2 = \sigma^2\int_0^t \big(e^{-\gamma(t-s)}\big)^2\,ds = \frac{\sigma^2}{2\gamma}\big(1 - e^{-2\gamma t}\big)$.

Thus when $\gamma > 0$,

   $X_t \stackrel{d}{\to} N\!\left(0, \frac{\sigma^2}{2\gamma}\right)$

as $t \to \infty$.

As a second example of an SDE, consider
   $dY_t = Y_t\,dt + \alpha Y_t\,dW_t, \qquad Y_0 = y$.

The Itô formula with $f(x) = \log x$ gives

   $d\log Y_t = \frac{1}{Y_t}\big(Y_t\,dt + \alpha Y_t\,dW_t\big) - \frac{1}{2Y_t^2}\,\alpha^2 Y_t^2\,dt = \big(1 - \tfrac12\alpha^2\big)\,dt + \alpha\,dW_t$.

Integrating we get

   $Y_t = y\,e^{(1 - \frac12\alpha^2)t + \alpha W_t}$.

Note that by the rules of standard calculus, we would have obtained the wrong answer

   $Y_t = y\,e^{t + \alpha W_t}$.

Indeed, the term $-\tfrac12\alpha^2 t$ in the exponential is important for consistency, since taking the expectation of the SDE for $Y_t$ and using the first Itô isometry gives

   $dEY_t = EY_t\,dt$,

and hence

   $EY_t = y\,e^t$.

The solution above is consistent with this since (can you show this?)

   $Ee^{\alpha W_t} = e^{\frac12\alpha^2 t}$.
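This consistency check is easy to carry out numerically. The sketch below (assuming NumPy is available) samples the exact solution $Y_t = y\,e^{(1-\alpha^2/2)t + \alpha W_t}$ and confirms both $EY_t = y\,e^t$ and $Ee^{\alpha W_t} = e^{\alpha^2 t/2}$:

```python
import numpy as np

rng = np.random.default_rng(7)
y, alpha, t = 1.0, 0.8, 1.0
W = np.sqrt(t) * rng.standard_normal(500_000)            # W_t ~ N(0, t)

Y = y * np.exp((1.0 - 0.5 * alpha**2) * t + alpha * W)   # exact solution of the SDE
print(Y.mean(), np.exp(t))                               # E Y_t = y e^t
print(np.exp(alpha * W).mean(), np.exp(0.5 * alpha**2 * t))  # E e^{alpha W_t}
```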
3. Generalization to multiple dimensions

The definitions of Itô integrals and SDEs can be extended to multiple dimensions in a straightforward fashion. The SDE

   $dX_t^j = b_j(X_t,t)\,dt + \sum_{k=1}^K \sigma_{jk}(X_t,t)\,dW_t^k, \qquad j = 1,\ldots,J$,

where $\{W_t^k\}_{k=1}^K$ are independent Wiener processes, defines a vector-valued stochastic process $X_t = (X_t^1,\ldots,X_t^J)$. The only point worth noting is the Itô formula, which in multiple dimensions reads

   $df(X_t) = \sum_{j=1}^J \frac{\partial f}{\partial x_j}(X_t)\,dX_t^j + \frac12\sum_{j,j'=1}^J \frac{\partial^2 f}{\partial x_j\,\partial x_{j'}}(X_t)\left(\sum_{k=1}^K \sigma_{jk}(X_t,t)\,\sigma_{j'k}(X_t,t)\right)dt$.

4. Forward and backward Kolmogorov equations

Consider the SDE
   $dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t, \qquad X_0 = y$.

Define the transition probability density $\rho(x,t|y)$ via

   $\int_B \rho(x,t|y)\,dx = P\{X_{t+s} \in B \mid X_s = y\}$.

($\rho(x,t|y)$ does not depend on $s$ because $b(x)$ and $\sigma(x)$ are time-independent, in which case the process $X_t$ is time-homogeneous.) We will derive an equation for $\rho$. Let $f$ be an arbitrary smooth function. Using the Itô formula, we have

   $f(X_t) - f(y) = \int_0^t f'(X_s)\,dX_s + \frac12\int_0^t f''(X_s)\,a(X_s)\,ds$,

where $a(x) = \sigma^2(x)$. Taking expectations on both sides, we get

   $Ef(X_t) - f(y) = E\int_0^t f'(X_s)\,b(X_s)\,ds + \frac12\,E\int_0^t f''(X_s)\,a(X_s)\,ds$,

or equivalently, using $\rho$,
   $\int_{\mathbb{R}} f(x)\,\rho(x,t|y)\,dx - f(y) = \int_0^t\!\!\int_{\mathbb{R}} f'(x)\,b(x)\,\rho(x,s|y)\,dx\,ds + \frac12\int_0^t\!\!\int_{\mathbb{R}} f''(x)\,a(x)\,\rho(x,s|y)\,dx\,ds$.

Since this holds for all smooth $f$, we obtain (after differentiating in $t$ and integrating by parts in $x$)

(32)   $\frac{\partial\rho}{\partial t} = -\frac{\partial}{\partial x}\big(b(x)\rho\big) + \frac12\frac{\partial^2}{\partial x^2}\big(a(x)\rho\big)$,

with the initial condition $\lim_{t\to 0}\rho(x,t|y) = \delta(x-y)$. This is the forward Kolmogorov equation for $\rho$ in terms of the variables $(x,t)$. It is also called the Fokker–Planck equation.
Equivalently, an equation for ρ in terms of the variables (y,t) can be derived. The Markov property implies that
   $\rho(x,t+s|y) = \int_{\mathbb{R}} \rho(x,t|z)\,\rho(z,s|y)\,dz$.

Hence
   $\rho(x,t+\Delta t|y) - \rho(x,t|y) = \int_{\mathbb{R}} \rho(x,t|z)\,\rho(z,\Delta t|y)\,dz - \rho(x,t|y) = \int_{\mathbb{R}} \rho(x,t|z)\big(\rho(z,\Delta t|y) - \delta(z-y)\big)\,dz$.

Dividing both sides by $\Delta t$ and taking the limit as $\Delta t \to 0$ using the forward Kolmogorov equation, one obtains

   $\frac{\partial\rho}{\partial t} = \int_{\mathbb{R}} \rho(x,t|z)\left(-\frac{\partial}{\partial z}\big(b(z)\delta(z-y)\big) + \frac12\frac{\partial^2}{\partial z^2}\big(a(z)\delta(z-y)\big)\right)dz$,

which by integration by parts gives

(33)   $\frac{\partial\rho}{\partial t} = b(y)\frac{\partial\rho}{\partial y} + \frac12\,a(y)\frac{\partial^2\rho}{\partial y^2}$.

This is the backward Kolmogorov equation for $\rho$ in terms of the variables $(y,t)$. The operator

   $L = b(y)\frac{\partial}{\partial y} + \frac12\,a(y)\frac{\partial^2}{\partial y^2}$

is called the infinitesimal generator of the process. The coefficients $b$ and $a$ can be expressed as (can you show this from the SDE?)
   $b(y) = \lim_{t\to 0}\frac{1}{t}\,E(X_t - y), \qquad a(y) = \lim_{t\to 0}\frac{1}{t}\,E(X_t - y)^2$.

Both the forward and the backward equations can be considered with different initial conditions. In particular, given a smooth function $f$, if we define
   $u(y,t) = Ef(X_t)$,

then $u(y,t) = \int_{\mathbb{R}} f(x)\,\rho(x,t|y)\,dx$ and hence it satisfies

   $\frac{\partial u}{\partial t} = b(y)\frac{\partial u}{\partial y} + \frac12\,a(y)\frac{\partial^2 u}{\partial y^2}$,

with the initial condition $u(y,0) = f(y)$. Thus, the SDE for $X_t$ is the characteristic equation that is associated with this parabolic PDE, much in the same way as the ODE $\dot X_t = b(X_t)$ is the characteristic equation associated with the first-order PDE $\partial u/\partial t = b(y)\,\partial u/\partial y$. This can be generalized in many ways. For instance, the solution of

   $\frac{\partial v}{\partial t} = c(y)\,v + b(y)\frac{\partial v}{\partial y} + \frac12\,a(y)\frac{\partial^2 v}{\partial y^2}$,

with the initial condition $v(y,0) = f(y)$, can be expressed as
   $v(y,t) = E\,f(X_t)\,e^{\int_0^t c(X_s)\,ds}$.

This is the celebrated Feynman–Kac formula in the context of SDEs.
Figure 2. Snapshots of the density of the Ornstein-Uhlenbeck process at time t = 0.01 (blue), t = 0.1 (red), t = 1 (green), and t = 10 (magenta). Here X0 = y = 1 and γ = σ = 1. The last snapshot at t = 10 is very close to the equilibrium density.
Let us consider an example. The forward equation associated with the Ornstein–Uhlenbeck process introduced in the last section is

   $\frac{\partial\rho}{\partial t} = \gamma\frac{\partial}{\partial x}(x\rho) + \frac{\sigma^2}{2}\frac{\partial^2\rho}{\partial x^2}$.

The solution of this equation is

   $\rho(x,t|y) = \frac{1}{\sqrt{\pi\sigma^2(1 - e^{-2\gamma t})/\gamma}}\,\exp\!\left(-\frac{\gamma\,(x - y\,e^{-\gamma t})^2}{\sigma^2(1 - e^{-2\gamma t})}\right)$.

This shows that the Ornstein–Uhlenbeck process is a Gaussian process with mean $y\,e^{-\gamma t}$ and variance $\sigma^2(1 - e^{-2\gamma t})/2\gamma$. It also confirms that this process tends to $N(0,\sigma^2/2\gamma)$ as $t \to \infty$, since

   $\rho(x) = \lim_{t\to\infty}\rho(x,t|y) = \frac{e^{-\gamma x^2/\sigma^2}}{\sqrt{\pi\sigma^2/\gamma}}$.

Generally, the limit of $\rho(x,t|y)$ as $t \to \infty$, when it exists, gives the equilibrium density $\rho$ of the process. It satisfies

   $0 = -\frac{\partial}{\partial x}\big(b(x)\rho\big) + \frac12\frac{\partial^2}{\partial x^2}\big(a(x)\rho\big)$.

Forward and backward Kolmogorov equations can also be derived for multi-dimensional processes. They read, respectively,
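The transition density above can be compared against paths generated by the scheme (30). A minimal sketch (assuming NumPy is available) integrates the Ornstein–Uhlenbeck SDE by Euler–Maruyama over many paths and checks the mean $y\,e^{-\gamma t}$ and variance $\sigma^2(1 - e^{-2\gamma t})/2\gamma$:

```python
import numpy as np

rng = np.random.default_rng(8)
gamma, sigma, y, t = 1.0, 1.0, 1.0, 1.0
N, paths = 400, 20_000
dt = t / N

# Euler-Maruyama paths of dX = -gamma X dt + sigma dW, X_0 = y
X = np.full(paths, y)
for _ in range(N):
    X = X - gamma * X * dt + sigma * np.sqrt(dt) * rng.standard_normal(paths)

mean_exact = y * np.exp(-gamma * t)
var_exact = sigma**2 * (1.0 - np.exp(-2.0 * gamma * t)) / (2.0 * gamma)
print(X.mean(), mean_exact)    # near 0.368
print(X.var(), var_exact)      # near 0.432
```

A histogram of `X` at successive times reproduces the density snapshots of Figure 2.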
   $\frac{\partial\rho}{\partial t} = -\sum_{j=1}^J \frac{\partial}{\partial x_j}\big(b_j(x)\rho\big) + \frac12\sum_{j,j'=1}^J \frac{\partial^2}{\partial x_j\,\partial x_{j'}}\big(a_{jj'}(x)\rho\big)$

and

   $\frac{\partial\rho}{\partial t} = \sum_{j=1}^J b_j(x)\frac{\partial\rho}{\partial x_j} + \frac12\sum_{j,j'=1}^J a_{jj'}(x)\frac{\partial^2\rho}{\partial x_j\,\partial x_{j'}}$,

where $a_{jj'}(x) = \sum_{k=1}^K \sigma_{jk}(x)\,\sigma_{j'k}(x)$.

Notes by Walter Pauls and Arghir Dani Zarnescu.

CHAPTER 9
Asymptotic techniques for SDEs
Here we discuss techniques by which one can study SDEs evolving on very different time-scales and derive closed equations for the slow variables.
1. The case of stiff ordinary differential equations

We start with an ODE example. Consider

(34)   $\dot X_t = -Y_t^3 + \sin(\pi t) + \cos(\sqrt{2}\,\pi t), \qquad X_0 = x$,
       $\dot Y_t = -\frac{1}{\varepsilon}(Y_t - X_t), \qquad Y_0 = y$.

If $\varepsilon$ is very small, $Y_t$ is very fast and one expects that it will adjust rapidly to the current value of $X_t$, i.e. $Y_t = X_t + O(\varepsilon)$ at all times. Then the equation for $X_t$ reduces to

(35)   $\dot X_t = -X_t^3 + \sin(\pi t) + \cos(\sqrt{2}\,\pi t)$.

The solutions of (34) and (35) are compared in Figure 1.

Here is a formal derivation of the limiting equation (35) which uses the backward Kolmogorov equation. For simplicity we drop the term $\sin(\pi t) + \cos(\sqrt{2}\,\pi t)$; generalizing the derivation below with this term included is easy but requires a slightly different backward equation because (35) is then non-autonomous. Let $f$ be a smooth function and consider
   $u(x,y,t) = f(X_t)$.
(This function depends on both $x$ and $y$ since $X_t$ depends on both of these variables, because $X_t$ and $Y_t$ are coupled in (34); and there is no expectation since (34) is deterministic.) The backward equation is

   $\frac{\partial u}{\partial t} = L_x u + \frac{1}{\varepsilon}L_y u$,

where

   $L_x = -y^3\frac{\partial}{\partial x}, \qquad L_y = -(y - x)\frac{\partial}{\partial y}$.

Look for a solution of the form $u = u_0 + \varepsilon u_1 + O(\varepsilon^2)$, so that $u \to u_0$ as $\varepsilon \to 0$. Inserting this expansion into the backward equation, and grouping terms of the same order in $\varepsilon$, one obtains
(36)   $L_y u_0 = 0$,
       $L_y u_1 = \frac{\partial u_0}{\partial t} - L_x u_0$,

and so on. The first equation tells us that $u_0$ belongs to the null space of $L_y$, i.e. $u_0 = u_0(x,t)$. The second equation requires, as a solvability condition, that its right-hand side belong to the range of $L_y$. To see what this condition actually is,
Figure 1. The solution of (34) when $\varepsilon = 0.05$, with $X_0 = 2$ and $Y_0 = -1$. $X_t$ is shown in blue, and $Y_t$ in green. Also shown in red is the solution of the limiting equation (35).

multiply the second equation in (36) by a test function $\rho(y)$, and integrate both sides over $\mathbb{R}$. After integration by parts on the left-hand side, this gives
   $\int_{\mathbb{R}} \big(L_y^\star\rho(y)\big)\,u_1\,dy = \int_{\mathbb{R}} \rho(y)\left(\frac{\partial u_0}{\partial t} - L_x u_0\right)dy$,

where $L_y^\star$ is the adjoint of $L_y$ viewed as an operator in $y$ at fixed $x$, i.e.

   $L_y^\star\rho(y) = \frac{\partial}{\partial y}\big((y - x)\rho(y)\big)$.

Choosing $\rho(y)$ such that

(37)   $0 = L_y^\star\rho(y)$,

one concludes that the solvability of (36) requires that
(38)   $0 = \int_{\mathbb{R}} \rho(y)\left(\frac{\partial u_0}{\partial t} - L_x u_0\right)dy$.

It can be shown that this equation is also sufficient for the solvability of (36); the calculation above actually tells us that the range of $L_y$ is the space perpendicular to the null space of the adjoint of $L_y$. Now, (37) is simply the stationary forward Kolmogorov equation for the equilibrium density of the process $Y_t$ at fixed $X_t = x$. Here the equilibrium density is a generalized function,

   $\rho(y|x) = \delta(y - x)$.

Using this $\rho(y|x)$, the solvability condition (38) becomes

   $0 = \frac{\partial u_0}{\partial t} + x^3\frac{\partial u_0}{\partial x}$,

which is the backward equation for $\dot X_t = -X_t^3$, $X_0 = x$. A similar argument with the term $\sin(\pi t) + \cos(\sqrt{2}\,\pi t)$ included gives the backward equation for (35).
2. Generalization to stochastic differential equations

The derivation that led to (35) can be generalized to SDEs. Consider
(39)   $dX_t = f(X_t,Y_t)\,dt, \qquad X_0 = x$,
       $dY_t = \frac{1}{\varepsilon}\,b(X_t,Y_t)\,dt + \frac{1}{\sqrt{\varepsilon}}\,\sigma(X_t,Y_t)\,dW_t, \qquad Y_0 = y$,

and assume that the equation for $Y_t$ at fixed $X_t = x$ has an equilibrium density $\rho(y|x)$ for every $x$. Then, going through a derivation as above with

   $u(x,y,t) = Ef(X_t)$,

one concludes that the backward equation associated with this SDE also reduces to (38) as $\varepsilon \to 0$, i.e.

   $\frac{\partial u_0}{\partial t} = F(x)\frac{\partial u_0}{\partial x}$,

where

   $F(x) = \int_{\mathbb{R}} f(x,y)\,\rho(y|x)\,dy$.

Thus the limiting equation for $X_t$ is
X˙ t = F (Xt), X0 =0. The main difference with the deterministic example treated before is that the fast process Yt does not rapidly settle to an equilibrium point depending on the current value of Xt – only its density does. Here is an example generalizing (34). Consider
(40)   $dX_t = \big(-Y_t^3 + \sin(\pi t) + \cos(\sqrt{2}\,\pi t)\big)\,dt, \qquad X_0 = x$,
       $dY_t = -\frac{1}{\varepsilon}(Y_t - X_t)\,dt + \frac{\alpha}{\sqrt{\varepsilon}}\,dW_t, \qquad Y_0 = y$.

The equation for $Y_t$ at fixed $X_t = x$ defines an Ornstein–Uhlenbeck process whose equilibrium density is

   $\rho(y|x) = \frac{e^{-(y-x)^2/\alpha^2}}{\sqrt{\pi\alpha^2}}$.

Therefore

   $F(x) = -\int_{\mathbb{R}} y^3\,\frac{e^{-(y-x)^2/\alpha^2}}{\sqrt{\pi\alpha^2}}\,dy = -x^3 - \tfrac32\alpha^2 x$,

and the limiting equation is

(41)   $\dot X_t = -X_t^3 - \tfrac32\alpha^2 X_t + \sin(\pi t) + \cos(\sqrt{2}\,\pi t), \qquad X_0 = x$.

Note the new term $-\tfrac32\alpha^2 X_t$, due to the noise in (40). The solutions of (40) and (41) are shown in Figure 2.
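The averaged drift in (41) amounts to the Gaussian moment identity $\int y^3\,\rho(y|x)\,dy = x^3 + \tfrac32\alpha^2 x$ for $\rho(y|x) = N(x,\alpha^2/2)$, which can be checked by Monte Carlo. A minimal sketch (assuming NumPy is available):

```python
import numpy as np

rng = np.random.default_rng(9)
alpha, x = 1.0, 0.5

# Equilibrium density of the fast variable: rho(y|x) = N(x, alpha^2 / 2)
y = x + (alpha / np.sqrt(2.0)) * rng.standard_normal(1_000_000)

F_mc = -(y**3).mean()                      # averaged drift -E[Y^3]
F_exact = -x**3 - 1.5 * alpha**2 * x       # drift appearing in Eq. (41)
print(F_mc, F_exact)                       # both near -0.875
```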
Figure 2. The solution of (40) with $X_0 = 2$, $Y_0 = -1$ when $\varepsilon = 10^{-3}$ and $\alpha = 1$. $X_t$ is shown in blue, and $Y_t$ in green. Also shown in red is the solution of the limiting equation (41). Notice how noisy $Y_t$ is.
3. Strong convergence and the property of self-averaging

The derivation in Section 2 only gives weak convergence, i.e. convergence in distribution. But stronger results can be obtained. Consider a system of the form

(42)   $\dot X_t^\varepsilon = f(X_t^\varepsilon, Y_{t/\varepsilon})$,

where $Y_t$ is a given stochastic process. Assume that $Y_t$ is ergodic, in the sense that for any fixed $x$,

(43)   $\lim_{T\to\infty}\frac{1}{T}\int_0^T f(x,Y_s)\,ds = \bar f(x)$.

Then we can show that, as $\varepsilon \to 0$, $X_t^\varepsilon$ converges strongly to the solution of

(44)   $\dot{\bar X}_t = \bar f(\bar X_t)$.

To see this, consider the integral form of (42):
(45)   $X_{t+\Delta t}^\varepsilon - X_t^\varepsilon = \int_t^{t+\Delta t} f(X_s^\varepsilon, Y_{s/\varepsilon})\,ds$.

We rewrite this equation in a way that allows us to exploit the self-averaging property (43):

(46)   $X_{t+\Delta t}^\varepsilon - X_t^\varepsilon = \int_t^{t+\Delta t} f(X_t^\varepsilon, Y_{s/\varepsilon})\,ds + \int_t^{t+\Delta t} \big(f(X_s^\varepsilon, Y_{s/\varepsilon}) - f(X_t^\varepsilon, Y_{s/\varepsilon})\big)\,ds$.

We will consider the behavior of these two integrals as $\varepsilon \to 0$ separately.
Using (43), the first integral satisfies

(47)   $\int_t^{t+\Delta t} f(X_t, Y_{s/\varepsilon})\,ds = \varepsilon\int_{t/\varepsilon}^{(t+\Delta t)/\varepsilon} f(X_t, Y_s)\,ds \to \Delta t\,\bar f(X_t)$

as $\varepsilon \to 0$. To investigate the contribution of the second integral, let

(48)   $A(t,\Delta t,\varepsilon) = \int_t^{t+\Delta t} \big(f(X_s, Y_{s/\varepsilon}) - f(X_t, Y_{s/\varepsilon})\big)\,ds$.

We then have

(49)   $|A(t,\Delta t,\varepsilon)| \le \int_t^{t+\Delta t} \big|f(X_s, Y_{s/\varepsilon}) - f(X_t, Y_{s/\varepsilon})\big|\,ds$.
Assuming $f$ is uniformly Lipschitz in its first argument with constant $K$, we then write

   $|A(t,\Delta t,\varepsilon)| \le K\int_t^{t+\Delta t} |X_s - X_t|\,ds$
   $\le K\int_t^{t+\Delta t} \Big|X_s - X_t - \int_t^s f(X_t, Y_{s'/\varepsilon})\,ds'\Big|\,ds + K\int_t^{t+\Delta t} \Big|\int_t^s f(X_t, Y_{s'/\varepsilon})\,ds'\Big|\,ds$.
It is straightforward to show using (47) that, for sufficiently small $\varepsilon$,

(50)   $K\int_t^{t+\Delta t} \Big|\int_t^s f(X_t, Y_{s'/\varepsilon})\,ds'\Big|\,ds < C\Delta t^2$

for some constant $C < \infty$. Gronwall's lemma then implies that

(51)   $\Big|X_{t+\Delta t} - X_t - \int_t^{t+\Delta t} f(X_t, Y_{s/\varepsilon})\,ds\Big| = |A(t,\Delta t,\varepsilon)| \le C\Delta t^2\,e^{K\Delta t} = o(\Delta t)$.

This shows that

(52)   $\lim_{\varepsilon\to 0}\big(X_{t+\Delta t}^\varepsilon - X_t^\varepsilon\big) = \Delta t\,\bar f(X_t^\varepsilon) + o(\Delta t)$,

which is sufficient to demonstrate that $X_t^\varepsilon$ converges strongly to $\bar X_t$.
4. Diffusive time-scale An interesting generalization of the situation presented in section 2 arises when
An interesting generalization of the situation presented in Section 2 arises when

(53)   $\int_{\mathbb{R}} f(x,y)\,\rho(y|x)\,dy = 0$.

In this case the limiting equation reduces to the trivial ODE $\dot X_t = 0$, i.e. no evolution at all. In fact, the interesting evolution then occurs on a longer time-scale, of order $\varepsilon^{-1}$, and the right scaling to study (39) is

(54)   $dX_t = \frac{1}{\varepsilon}\,f(X_t,Y_t)\,dt, \qquad X_0 = x$,
       $dY_t = \frac{1}{\varepsilon^2}\,b(X_t,Y_t)\,dt + \frac{1}{\varepsilon}\,\sigma(X_t,Y_t)\,dW_t, \qquad Y_0 = y$.
To obtain the limiting equation for $X_t$ as $\varepsilon \to 0$, we proceed as above and consider the backward equation for $u(x,y,t) = Ef(X_t)$, which is now rescaled as

   $\frac{\partial u}{\partial t} = \frac{1}{\varepsilon}L_x u + \frac{1}{\varepsilon^2}L_y u$.

Inserting the expansion $u = u_0 + \varepsilon u_1 + \varepsilon^2 u_2 + O(\varepsilon^3)$ (we will have to go one order in $\varepsilon$ higher than before) into this equation now gives
(55)   $L_y u_0 = 0$,
       $L_y u_1 = -L_x u_0$,
       $L_y u_2 = \frac{\partial u_0}{\partial t} - L_x u_1$,

and so on. The first equation tells us that $u_0(x,y,t) = u_0(x,t)$. The solvability condition for the second equation is satisfied by assumption because of (53), and therefore this equation can be formally solved as
   $u_1 = -L_y^{-1}L_x u_0$.

Inserting this expression in the third equation in (55) and considering the solvability condition for this equation, we obtain the limiting equation for $u_0$:

   $\frac{\partial u_0}{\partial t} = \bar L_x u_0$,

where

   $\bar L_x = -\int_{\mathbb{R}} dy\,\rho(y|x)\,L_x L_y^{-1} L_x$.

To see what this equation is explicitly, notice that $-L_y^{-1}g(y)$ is the steady-state solution of

   $\frac{\partial v}{\partial t} = L_y v + g(y)$.

The solution of this equation with the initial condition $v(y,0) = 0$ can be represented by the Feynman–Kac formula as

   $v(y,t) = E\int_0^t g(Y_s^x)\,ds$,

where $Y_t^x$ denotes the solution of the second SDE in (54) at fixed $X_t = x$ and $\varepsilon = 1$, i.e.

   $dY_t^x = b(x,Y_t^x)\,dt + \sigma(x,Y_t^x)\,dW_t, \qquad Y_0^x = y$.

Therefore

   $-L_y^{-1}g(y) = E\int_0^\infty g(Y_t^x)\,dt$,

and the limiting backward equation above can be written as
   $\frac{\partial u_0}{\partial t} = \int_0^\infty dt\int_{\mathbb{R}} dy\,\rho(y|x)\,f(x,y)\,\frac{\partial}{\partial x}\left(E\,f(x,Y_t^x)\,\frac{\partial u_0}{\partial x}\right)$.

This is the backward equation of the SDE
   $dX_t = \bar b(X_t)\,dt + \bar\sigma(X_t)\,dW_t, \qquad X_0 = x$,

where

   $\bar b(x) = \int_0^\infty\!\!\int_{\mathbb{R}} \rho(y|x)\,f(x,y)\,E\,\frac{\partial}{\partial x}f(x,Y_t^x)\,dy\,dt$,
   $\bar\sigma^2(x) = 2\int_0^\infty\!\!\int_{\mathbb{R}} \rho(y|x)\,f(x,y)\,E\,f(x,Y_t^x)\,dy\,dt$.

The interesting new phenomenon is that the limiting equation for $X_t$ has become an SDE. This means that fluctuations are important on the long time-scale and give rise to stochastic effects in the evolution of $X_t$ that were absent on the shorter time-scale.

The calculation above is easy to generalize if there is a slow term in the original equation for $X_t$, i.e. if instead of (54) one considers

   $dX_t = g(X_t,Y_t)\,dt + \frac{1}{\varepsilon}\,f(X_t,Y_t)\,dt, \qquad X_0 = x$,
   $dY_t = \frac{1}{\varepsilon^2}\,b(X_t,Y_t)\,dt + \frac{1}{\varepsilon}\,\sigma(X_t,Y_t)\,dW_t, \qquad Y_0 = y$.

The limiting equation for $X_t$ is then

   $dX_t = G(X_t)\,dt + \bar b(X_t)\,dt + \bar\sigma(X_t)\,dW_t, \qquad X_0 = x$,

with $\bar b(x)$ and $\bar\sigma(x)$ as above, and
   $G(x) = \int_{\mathbb{R}} \rho(y|x)\,g(x,y)\,dy$.

It is also straightforward to generalize to higher dimensions. Here is an example:

   $dX_t = \frac{2\alpha}{\varepsilon}\,Y_t Z_t\,dt - (X_t + X_t^3)\,dt$,
   $dY_t = \frac{3\alpha}{\varepsilon}\,Z_t X_t\,dt - \frac{1}{\varepsilon^2}\,Y_t\,dt + \frac{1}{\varepsilon}\,dW_t^y$,
   $dZ_t = -\frac{\alpha}{\varepsilon}\,Y_t X_t\,dt - \frac{1}{\varepsilon^2}\,Z_t\,dt + \frac{1}{\varepsilon}\,dW_t^z$,

where $W_t^y$, $W_t^z$ are independent Wiener processes and $\alpha$ is a parameter. There are two fast variables, $Y_t$ and $Z_t$, in this example. There is also a slow term, $-(X_t + X_t^3)\,dt$, in the equation for $X_t$ which, in the absence of coupling with $Y_t$ and $Z_t$, would drive $X_t$ to the position $x = 0$. We ask to what extent this equilibrium of the uncoupled dynamics remains relevant in the presence of the coupling with $Y_t$ and $Z_t$. The limiting equation for $X_t$ is

   $dX_t = \big((\alpha^2 - 1)X_t - X_t^3\big)\,dt + \alpha\,dW_t$.

The equilibrium density for this equation is

   $\rho(x) = Z^{-1}\exp\!\left(\frac{(\alpha^2 - 1)x^2 - \frac12 x^4}{\alpha^2}\right)$.

This density is shown in Figure 3. For $|\alpha| \le 1$, $\rho(x)$ is unimodal and centered around $x = 0$, the stable equilibrium of the uncoupled dynamics. However, for $|\alpha| > 1$, $\rho(x)$ becomes bimodal, with two maxima at $x = \pm\sqrt{\alpha^2 - 1}$ and a minimum at $x = 0$. Thus coupling with the fast modes may destroy the structures apparent in the uncoupled dynamics and induce bifurcations.

Notes by Inga Koszalka and Alex Hasha.
Figure 3. The equilibrium density $\rho(x) = Z^{-1}\exp\big(((\alpha^2 - 1)x^2 - \tfrac12 x^4)/\alpha^2\big)$ for $\alpha = \tfrac12$ (blue) and $\alpha = 2$ (red).