
TCM310 FALL 2019: METHODS — lecture 27: stochastic calculus with the Wiener process

Course handouts are designed as a study aid and are not meant to replace the recommended textbooks. Handouts may contain typos and/or errors. Students are encouraged to verify the content contained within and to report any issues to the lecturer.

contents

1 Introduction
2 Wiener process
  2.1 First consequences of the definition
  2.2 Time scaling
3 Continuity and differentiability of the Wiener process
  3.1 Kolmogorov-Chentsov theorem
  3.2 Nowhere differentiability
4 Quadratic variation of the Wiener process
5 Stochastic integrals
  5.1 Example: Wiener process as integrand
6 Itô integral in the space of mean square integrable functions
7 Itô integral in the space of almost surely square integrable processes
A Stochastic analysis terminology
References

1 introduction

The main references for these notes are chapter 3 of [2] (physics-style presentation) and chapter 2 of [3] (applied style).

For more visit https://bit.ly/2lVSwTs.


There is a huge literature about Brownian motion and its mathematical description, often referred to as the Wiener process. The lecture notes [4] are an excellent example of such literature, and moreover freely available on the internet.

2 wiener process

The above considerations allow us to characterize the Wiener process as follows:

definition (Wiener process, aka Brownian motion). A real-valued stochastic process

$$w \colon \mathbb{R}_+ \times \Omega \to \mathbb{R}$$

is called a Wiener process or Brownian motion if

W-i $w_0 = 0$;

W-ii any increment $w_t - w_s$ has Gaussian density

$$dP_{w_t - w_s}(x) = \frac{e^{-\frac{x^2}{2 (t - s)}}}{\sqrt{2 \pi (t - s)}}\, dx \equiv p_{w_t - w_s}(x)\, dx \qquad (1)$$

for all $t > s \geq 0$. Note that for $t \downarrow s$ the probability density tends to a Dirac $\delta$ centered at zero;

W-iii for all times

$$t_1 < t_2 < \dots \leq t_n$$

the random variables

$$w_{t_1},\; w_{t_2} - w_{t_1},\; \dots,\; w_{t_n} - w_{t_{n-1}}$$

are independent (the process has independent increments).

2.1 First consequences of the definition

Some observations are in order.

• It is not restrictive to consider the one-dimensional case. A $d$-dimensional Wiener process is a vector-valued stochastic process whose components are independent one-dimensional Wiener processes. More explicitly, the probability density of Brownian motion on $\mathbb{R}^d$ is given by

$$p_{w_t}(x) = \prod_{i=1}^{d} p_{(w_t)_i}(x_i)$$

• By W-i and W-ii we have that

$$p_{w_s}(x) = \frac{e^{-\frac{x^2}{2 s}}}{(2 \pi s)^{1/2}} \quad \&\quad p_{w_t - w_s}(x) = \frac{e^{-\frac{x^2}{2 (t - s)}}}{(2 \pi (t - s))^{1/2}}, \qquad \forall\, t \geq s \qquad (2)$$

• By W-iii the joint probability density of $w_s$ and $w_t - w_s$, for $t \geq s$, is

$$p_{w_s,\, w_t - w_s}(y, x) = \frac{e^{-\frac{y^2}{2 s}}}{(2 \pi s)^{1/2}}\, \frac{e^{-\frac{x^2}{2 (t - s)}}}{(2 \pi (t - s))^{1/2}}$$

The main consequence of these observations is the following.

proposition. The transition density of the Wiener process is

$$p_{w_t | w_s}(x | y) = \frac{e^{-\frac{(x - y)^2}{2 (t - s)}}}{\sqrt{2 \pi (t - s)}} \qquad \forall\, t \geq s \in \mathbb{R}_+ \ \&\ \forall\, x, y \in \mathbb{R}$$

Proof. By definition of conditional probability density

$$p_{w_t | w_s}(x | y) = \frac{p_{w_t, w_s}(x, y)}{p_{w_s}(y)}$$

We observe that we can write the joint probability as

$$p_{w_t, w_s}(x, y) = p_{w_t - w_s,\, w_s}(x - y,\, y) = p_{w_t - w_s}(x - y)\, p_{w_s}(y)$$

where the last step follows from the independence of the increments of the Wiener process.
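The covariance structure implied by the transition density is easy to probe numerically. The sketch below is my own illustration, not part of the original notes; it assumes NumPy, and all variable names are mine. It samples $(w_s, w_t)$ through an independent increment and checks that $\mathrm{cov}(w_s, w_t) = \min(s, t)$ and $\mathrm{Var}(w_t) = t$:

```python
import numpy as np

# Sample (w_s, w_t) via W-iii: w_t = w_s + independent Gaussian increment.
rng = np.random.default_rng(7)
s_, t_ = 0.5, 2.0
n = 200_000

w_s = rng.normal(0.0, np.sqrt(s_), size=n)             # w_s ~ N(0, s)
w_t = w_s + rng.normal(0.0, np.sqrt(t_ - s_), size=n)  # add N(0, t - s) increment

print(np.cov(w_s, w_t)[0, 1])  # ≈ min(s, t) = 0.5
print(np.var(w_t))             # ≈ t = 2.0
```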

2.2 Time scaling

proposition. In distribution we have, for any strictly positive $c$,

$$w_t \overset{d}{=} \frac{1}{\sqrt{c}}\, w_{c t} \qquad (3)$$

Proof. We recall that

$$\mathrm{E}\, e^{\imath z w_t} = \int_{\mathbb{R}} \frac{dx}{\sqrt{2 \pi t}}\, e^{\imath z x}\, e^{-\frac{x^2}{2 t}} = e^{-\frac{z^2 t}{2}}$$

It readily follows that

$$\mathrm{E}\, e^{\imath z w_{c t}} = \int_{\mathbb{R}} \frac{dx}{\sqrt{2 \pi c t}}\, e^{\imath z x}\, e^{-\frac{x^2}{2 c t}} = e^{-\frac{z^2 c t}{2}} = \mathrm{E}\, e^{\imath \sqrt{c}\, z\, w_t}$$
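The scaling law (3) can be checked by simulation. The following sketch (my own, not from the notes; it assumes NumPy and builds paths by summing independent Gaussian increments) compares the sample variance of $w_{ct}/\sqrt{c}$ with that of $w_t$:

```python
import numpy as np

rng = np.random.default_rng(0)

def wiener_endpoint(t, n_paths, n_steps=200):
    """Sample w_t for many independent paths by summing Gaussian increments."""
    dt = t / n_steps
    increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    return increments.sum(axis=1)

t, c = 1.0, 4.0
w_t = wiener_endpoint(t, 50_000)
w_ct = wiener_endpoint(c * t, 50_000)

print(np.var(w_t))                # ≈ t = 1.0
print(np.var(w_ct / np.sqrt(c)))  # ≈ t = 1.0 as well, as predicted by (3)
```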

3 continuity and differentiability of wiener process

The Gaussian law (1) of the increments immediately yields continuity in mean square and in probability, since $\mathrm{E}(w_t - w_s)^2 = |t - s| \to 0$ as $s \to t$. Continuity of the paths with probability one is a stronger statement, addressed by the Kolmogorov-Chentsov theorem below.


3.1 Kolmogorov-Chentsov

theorem (Kolmogorov-Chentsov). Let $\{\xi_t,\ t \geq 0\}$ be a stochastic process. If for some $\beta > 0$, $\gamma > 0$ and any $t, s$ there is a positive constant $C$ such that

$$\mathrm{E}\, |\xi_t - \xi_s|^{\beta} \leq C\, |t - s|^{1 + \gamma}$$

then the paths of $\xi_t$ are continuous functions with probability one.

The theorem has immediate consequences for the Wiener process. Namely, the identity

$$\mathrm{E} (w_t - w_s)^{2n} = (2n - 1)!!\, (t - s)^{n} \qquad \forall\, t \geq s$$

allows us to choose, for any integer $n \geq 2$,

$$\beta_n = 2 n \quad \&\quad \gamma_n = n - 1$$

3.2 Nowhere differentiability

proposition. Paths of the Wiener process are nowhere differentiable with probability one.

Proof. For $s > 0$ set

$$\eta_t^{(s)} = \frac{w_{t+s} - w_t}{s}$$

For any fixed $t$, the above expression defines the elements of a one-parameter family of random variables, Gaussian with zero mean and variance

$$\mathrm{E}\, \big(\eta_t^{(s)}\big)^2 = \frac{1}{s}$$

Were the $\eta_t^{(s)}$ to converge in some sense to a limit as $s \downarrow 0$, then the corresponding sequence of characteristic functions

$$f(z, s) = \mathrm{E}\, e^{\imath z \eta_t^{(s)}}$$

would converge to a limit continuous in the argument $z$. In the case considered, however,

$$\mathrm{E}\, e^{\imath z \eta_t^{(s)}} = e^{-\frac{z^2}{2 s}} \;\xrightarrow{s \downarrow 0}\; \begin{cases} 1 & z = 0 \\ 0 & z \neq 0 \end{cases}$$
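The divergence of the variance of the difference quotient is visible numerically. This is my own sketch (not part of the notes, NumPy assumed): since $w_{t+s} - w_t \sim N(0, s)$ for every $t$, we can sample $\eta_t^{(s)}$ directly and watch its sample variance track $1/s$:

```python
import numpy as np

rng = np.random.default_rng(1)

# eta^(s) = (w_{t+s} - w_t)/s with w_{t+s} - w_t ~ N(0, s), independent of t.
results = {}
for s in (0.1, 0.01, 0.001):
    eta = rng.normal(0.0, np.sqrt(s), size=200_000) / s
    results[s] = np.var(eta)
    print(s, results[s], 1.0 / s)  # sample variance ≈ 1/s, blowing up as s -> 0
```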

4 quadratic variation of the wiener process

proposition (Quadratic variation of the B.M.). The quadratic variation of the Brownian motion in $[0, t]$, for any $t \in \mathbb{R}_+$, is

$$\langle w \rangle_t = t$$

in the sense of $L^2(\Omega)$.

Proof. By direct calculation we know that

$$\mathrm{E}\, w_t^2 = t$$

Let $p$ be a finite partition paving $[0, t]$ with $n$ sub-intervals, and set

$$Q_p := \sum_{t_k \in p} (w_{t_k} - w_{t_{k-1}})^2$$

We have then

$$\mathrm{E}(Q_p - t)^2 = \sum_{t_k, t_l \in p} \mathrm{E}\left\{\left[(w_{t_k} - w_{t_{k-1}})^2 - (t_k - t_{k-1})\right]\left[(w_{t_l} - w_{t_{l-1}})^2 - (t_l - t_{l-1})\right]\right\}$$

For non-overlapping intervals, the averaged quantities are independent random variables with zero average. The only contributions to the sum come from overlapping intervals:

$$\mathrm{E}(Q_p - t)^2 = \sum_{t_k \in p} \mathrm{E}\left[(w_{t_k} - w_{t_{k-1}})^2 - (t_k - t_{k-1})\right]^2 = 2 \sum_{t_k \in p} (t_k - t_{k-1})^2$$

where the last equality is a consequence of

$$\mathrm{E}\, w_t^4 = 3\, t^2$$

It follows that

$$\mathrm{E}(Q_p - t)^2 \leq 2\, t \max_{t_k \in p}(t_k - t_{k-1}) \;\longrightarrow\; 0 \qquad \text{as} \quad \max_{t_k \in p}(t_k - t_{k-1}) \downarrow 0$$

The finite value of the quadratic variation motivates the estimate

$$dw_t \sim O(\sqrt{dt})$$

for the increments of the Wiener process.

Table 1: Differential algebra for infinitesimal increments of the Wiener process.

    ·      |  dt    dw_t
    -------+-------------
    dt     |  0     0
    dw_t   |  0     dt
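The convergence of the quadratic variation to $t$ can be watched directly on refining partitions. The sketch below is my own illustration (not from the notes, NumPy assumed): for a uniform partition with mesh $t/n$, the sum of squared increments concentrates around $t$ as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(2)
t = 2.0

# Quadratic variation Q_p = sum of squared increments on uniform partitions.
for n_steps in (10, 100, 1000, 10_000):
    dt = t / n_steps
    dw = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    Q = np.sum(dw**2)
    print(n_steps, Q)  # Q -> t = 2.0 as the mesh is refined
```

The proof above predicts $\mathrm{E}(Q_p - t)^2 = 2\, t^2/n$ for a uniform partition, consistent with the shrinking scatter seen as `n_steps` grows.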

5 stochastic integrals

Let $f$ be an analytic function

$$f \colon \mathbb{R} \to \mathbb{R}$$

We would like to make sense of the following functional of the Brownian motion:

$$I = \int_0^t f(w_s)\, dw_s \qquad \text{(mathematics notation)} \qquad (4)$$

often also written as

$$I = \int_0^t f(w_s)\, \eta_s\, ds \qquad \text{(physics notation)}$$

Note that the physics notation implies that $\eta_s$ is the “derivative” of the Wiener process, which we showed to be nowhere differentiable. It is, however, possible to interpret $\{\eta_t \equiv \dot{w}_t,\ t \geq 0\}$ as a distribution-valued stochastic process, whose precise meaning is determined by the construction of the integral (4).

5.1 Example: Wiener process as integrand

Namely, take

$$f(w_s) = w_s$$

and suppose we define the integral as

$$\int_0^t w_s\, dw_s^{(\theta)} = \lim_{|p| \downarrow 0} \sum_{t_k \in p} w_{\theta_k} (w_{t_{k+1}} - w_{t_k}) \qquad (5)$$

As $n$ increases, the $\{t_k\}_{k=0}^{n}$ describe a sequence of refining partitions of the interval $[0, t]$. The point $\theta_k$ is chosen arbitrarily in $[t_k, t_{k+1}]$:

$$\theta_k = s\, t_{k+1} + (1 - s)\, t_k \qquad \forall\, s \in [0, 1] \qquad (6)$$

For ordinary Lebesgue-Stieltjes integrals the right-hand side of (5) is independent of the way $\theta_k$ is chosen. In the present case, instead:

proposition. In mean square sense in $L^2(\Omega, \mathcal{F}, P)$, where $\mathcal{F}_t = \sigma(w_s;\ s \leq t)$ is the filtration of the Wiener process up to time $t$ and $P$ the probability measure of the Wiener process, we have

$$\int_0^t w_s\, dw_s^{(\theta)} = \frac{w_t^2}{2} - \frac{(1 - 2 s)\, t}{2}$$

Proof. For any partition $p$ of the interval $[0, t]$ we can write

$$\sum_{t_k \in p} w_{\theta_k}(w_{t_{k+1}} - w_{t_k}) = \sum_{t_k \in p} \left(\frac{w_{\theta_k} - w_{t_{k+1}}}{2} + \frac{w_{t_{k+1}}}{2} + \frac{w_{\theta_k} - w_{t_k}}{2} + \frac{w_{t_k}}{2}\right)(w_{t_{k+1}} - w_{t_k})$$

$$= \sum_{t_k \in p} \frac{w_{t_{k+1}}^2 - w_{t_k}^2}{2} + \sum_{t_k \in p} \frac{(w_{\theta_k} - w_{t_{k+1}}) + (w_{\theta_k} - w_{t_k})}{2}\,(w_{t_{k+1}} - w_{t_k})$$

We notice now that

$$\sum_{t_k \in p} \frac{w_{t_{k+1}}^2 - w_{t_k}^2}{2} = \frac{w_t^2}{2}$$

since the left-hand side is a telescoping sum and

$$\min_{t_k \in p} t_k = 0 \quad \&\quad \max_{t_k \in p} t_k = t$$

Furthermore, writing $w_{t_{k+1}} - w_{t_k} = (w_{t_{k+1}} - w_{\theta_k}) + (w_{\theta_k} - w_{t_k})$ and expanding, the cross terms cancel and

$$\sum_{t_k \in p} \left[(w_{\theta_k} - w_{t_{k+1}}) + (w_{\theta_k} - w_{t_k})\right](w_{t_{k+1}} - w_{t_k}) = -\sum_{t_k \in p} \left[(w_{t_{k+1}} - w_{\theta_k})^2 - (w_{\theta_k} - w_{t_k})^2\right]$$

In order to compute the integral we need to prove that the right-hand side in the last expression has a well defined limit in mean square sense. We do this in two steps.

1. First we compute the means

$$\mathrm{E} \sum_{t_k \in p} (w_{t_{k+1}} - w_{\theta_k})^2 = \sum_{t_k \in p} (t_{k+1} - \theta_k) = \sum_{t_k \in p} \left[t_{k+1} - s\, t_{k+1} - (1 - s)\, t_k\right] = (1 - s) \sum_{t_k \in p} (t_{k+1} - t_k) = (1 - s)\, t$$

and

$$\mathrm{E} \sum_{t_k \in p} (w_{\theta_k} - w_{t_k})^2 = \sum_{t_k \in p} (\theta_k - t_k) = \sum_{t_k \in p} \left[s\, t_{k+1} + (1 - s)\, t_k - t_k\right] = s\, t \qquad (7)$$

2. Then we prove convergence in mean square:

$$\lim_{|p| \downarrow 0} \mathrm{E}\left[\sum_{t_k \in p} (w_{t_{k+1}} - w_{\theta_k})^2 - (1 - s)\, t\right]^2 = \lim_{|p| \downarrow 0} \mathrm{E}\left[\sum_{t_k \in p} \left((w_{t_{k+1}} - w_{\theta_k})^2 - (t_{k+1} - \theta_k)\right)\right]^2$$

Using the independence of the increments of the Wiener process,

$$\lim_{|p| \downarrow 0} \mathrm{E}\left[\sum_{t_k \in p} (w_{t_{k+1}} - w_{\theta_k})^2 - (1 - s)\, t\right]^2 = \lim_{|p| \downarrow 0} \sum_{t_k \in p} \mathrm{E}\left[(w_{t_{k+1}} - w_{\theta_k})^2 - (t_{k+1} - \theta_k)\right]^2 = 2 \lim_{|p| \downarrow 0} \sum_{t_k \in p} (t_{k+1} - \theta_k)^2 = 0$$

and similarly for the other term.

Gleaning all the contributions we arrive at

$$\lim_{|p| \downarrow 0} \sum_{t_k \in p} w_{\theta_k}(w_{t_{k+1}} - w_{t_k}) = \frac{w_t^2}{2} - \frac{(1 - 2 s)\, t}{2}$$

where the limit holds in mean square sense.

The dependence of the integral upon the discretization rule affects also the expectation value of the integral.

proposition. Let us denote

$$w_u\, dw_u^{(\theta)} \equiv w_{\theta_u}(w_{u + du} - w_u), \qquad \theta_u = s\,(u + du) + (1 - s)\, u$$

Then

$$\mathrm{E} \int_0^t w_u\, dw_u^{(\theta)} = s\, t$$

Proof. The definition of the integral requires us to compute

$$\mathrm{E} \int_0^t w_s\, dw_s^{(\theta)} = \lim_{|p| \downarrow 0} \sum_{t_k \in p} \mathrm{E}\left\{w_{\theta_k} w_{t_k} - w_{\theta_k} w_{t_{k-1}}\right\}$$

Upon recalling that

$$\mathrm{E}(w_{t_2}\, w_{t_1}) = t_2 \wedge t_1$$

we get

$$\sum_{t_k \in p} \mathrm{E}\left\{w_{\theta_k} w_{t_k} - w_{\theta_k} w_{t_{k-1}}\right\} = \sum_{t_k \in p} (\theta_k - t_{k-1}) = \sum_{t_k \in p} s\,(t_k - t_{k-1}) = s\, t$$

thus proving the claim.
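Both propositions can be probed by Monte-Carlo. The sketch below is my own (not part of the notes; NumPy assumed, and `theta_integral` is a name I introduce): it builds each step increment in two pieces, $[t_k, \theta_k]$ and $[\theta_k, t_{k+1}]$, evaluates the Riemann sums of (5) for several discretization parameters $s$, and compares against the predicted value $w_t^2/2 - (1-2s)t/2$ and expectation $s\,t$:

```python
import numpy as np

rng = np.random.default_rng(3)

def theta_integral(s, t=1.0, n_steps=500, n_paths=5_000):
    """Riemann sums sum_k w_{theta_k} (w_{t_{k+1}} - w_{t_k}) on a uniform
    partition, with theta_k = t_k + s*dt, for many independent paths."""
    dt = t / n_steps
    # Brownian increments over [t_k, theta_k] and [theta_k, t_{k+1}]
    d1 = rng.normal(0.0, np.sqrt(s * dt), size=(n_paths, n_steps))
    d2 = rng.normal(0.0, np.sqrt((1.0 - s) * dt), size=(n_paths, n_steps))
    dw = d1 + d2                         # w_{t_{k+1}} - w_{t_k}
    w_left = np.cumsum(dw, axis=1) - dw  # w_{t_k}, starting from w_0 = 0
    w_theta = w_left + d1                # w_{theta_k}
    integral = np.sum(w_theta * dw, axis=1)
    w_end = w_left[:, -1] + dw[:, -1]    # w_t
    return integral, w_end

t = 1.0
for s in (0.0, 0.5, 1.0):  # pre-point (Ito), mid-point (Stratonovich), post-point
    I, w_end = theta_integral(s, t=t)
    predicted = w_end**2 / 2 - (1 - 2 * s) * t / 2
    print(s, I.mean())                    # ≈ s*t, as in the proposition above
    print(s, np.mean((I - predicted)**2)) # small: mean square convergence to predicted
```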

6 itoˆ integral in the space of mean square integrable functions

Let us suppose that $\{\xi_t\}_{t \geq 0}$ is a stochastic process adapted to the filtration of the Wiener process $\{w_t\}_{t \geq 0}$ and satisfying the following two properties.

1. Mean square integrability:

$$\mathrm{E} \int_0^t ds\, \xi_s^2 < \infty$$

2. Non-anticipation: $\xi_t$ may depend only on $w_s$ with $s \leq t$. As a consequence $\xi_t$ and $dw_t$ are independent variables:

$$\mathrm{E}\, \xi_t\, dw_t = \mathrm{E}\,\xi_t\; \mathrm{E}\, dw_t = 0$$

definition. For any stochastic process $\xi_t$ satisfying the above two properties we can define the Itô integral

$$\int_0^t dw_s \overset{0}{\cdot} \xi_s := \lim_{|p| \downarrow 0} \sum_{t_k \in p} \xi_{t_{k-1}} (w_{t_k} - w_{t_{k-1}}) \qquad (8)$$

where the over-script $\overset{0}{\cdot}$ emphasizes the “pre-point” discretization rule of the Itô integral. In order to give a precise meaning to the limit over a sequence of partitions in (8) we may resort to a “dyadic partitioning” of the interval $[0, t]$. In such a case we define

$$I_n = \sum_{t_k \in \mathcal{T}_n} \xi_{t_{k-1}} (w_{t_k} - w_{t_{k-1}}) \qquad \text{where} \qquad \mathcal{T}_n = \left\{\frac{j\, t}{2^n}\right\}_{j=0}^{2^n} \qquad (9)$$

so that the sum on the right-hand side consists of exactly $2^n$ addends. Furthermore, increments of the Wiener process are always evaluated over nearest neighbors of the dyadic level $\mathcal{T}_n$, so that their variance is $t/2^n$. Finally, under our hypotheses we can think of $\xi_{t_{k-1}}$ as a function

$$\xi_{t_{k-1}} = f(w_0, w_{t_1}, \dots, w_{t_{k-1}}) \qquad \text{for } t_1 \leq \dots \leq t_{k-1} \in \mathcal{T}_n \qquad (10)$$

We require the limit on the right-hand side of (8) to exist in mean square sense. Under our hypotheses this means that we require the sequence of approximations $\{I_n\}_{n \geq 0}$ to converge in the sense of Cauchy:

$$\lim_{n \uparrow \infty} \mathrm{E}(I_n - I_{n+m})^2 = 0 \qquad \forall\, m \geq 0$$

Note that we use here convergence in the sense of Cauchy because in general the explicit form of the primitive function may not be available, as often happens for ordinary integrals. The definition (8) entails the following.

proposition. The Itô integral enjoys

i. the martingale property: let $\mathcal{F}_t = \sigma(w_s;\ s \leq t)$ be the filtration of the Wiener process; then $\forall\, t_2 \geq t_1$

$$\mathrm{E}_{w_{t_1}} \int_0^{t_2} dw_s \overset{0}{\cdot} \xi_s \equiv \mathrm{E}\left[\int_0^{t_2} dw_s \overset{0}{\cdot} \xi_s \,\Big|\, \mathcal{F}_{t_1}\right] = \int_0^{t_1} dw_s \overset{0}{\cdot} \xi_s \qquad (11)$$

ii. the mean square integrability property (Itô isometry):

$$\mathrm{E}\left(\int_0^t dw_s \overset{0}{\cdot} \xi_s\right)^2 = \int_0^t ds\, \mathrm{E}\,\xi_s^2 = \mathrm{E} \int_0^t ds\, \xi_s^2 \qquad (12)$$

Proof.


• The martingale property stems from the non-anticipating requirement imposed on $\{\xi_t\}_{t \geq 0}$ and the independence of the increments of the Wiener process. We can always decompose any partition $p$ of the interval $[0, t_2]$ into a component $p_1$ partitioning $[0, t_1]$ and a component $p_2$ partitioning $(t_1, t_2]$. Then

$$\mathrm{E}_{w_{t_1}} \int_0^{t_2} dw_s \overset{0}{\cdot} \xi_s = \mathrm{E}_{w_{t_1}} \int_0^{t_1} dw_s \overset{0}{\cdot} \xi_s + \mathrm{E}_{w_{t_1}} \int_{t_1}^{t_2} dw_s \overset{0}{\cdot} \xi_s$$

$$= \mathrm{E}_{w_{t_1}} \lim_{|p| \downarrow 0} \sum_{s_k \in p_1} \xi_{s_{k-1}} (w_{s_k} - w_{s_{k-1}}) + \mathrm{E}_{w_{t_1}} \lim_{|p| \downarrow 0} \sum_{s_k \in p_2} \xi_{s_{k-1}} (w_{s_k} - w_{s_{k-1}})$$

Since by hypothesis we can commute the limit with the expectation, it is sufficient to check that

$$\mathrm{E}_{w_{t_1}} \sum_{s_k \in p_1} \xi_{s_{k-1}} (w_{s_k} - w_{s_{k-1}}) \equiv \mathrm{E}\left[\sum_{s_k \in p_1} \xi_{s_{k-1}} (w_{s_k} - w_{s_{k-1}}) \,\Big|\, \mathcal{F}_{t_1}\right] = \sum_{s_k \in p_1} \xi_{s_{k-1}} (w_{s_k} - w_{s_{k-1}})$$

and

$$\mathrm{E}_{w_{t_1}} \sum_{s_k \in p_2} \xi_{s_{k-1}} (w_{s_k} - w_{s_{k-1}}) \equiv \mathrm{E}\left[\sum_{s_k \in p_2} \xi_{s_{k-1}} (w_{s_k} - w_{s_{k-1}}) \,\Big|\, \mathcal{F}_{t_1}\right] = \sum_{s_k \in p_2} \mathrm{E}\,\xi_{s_{k-1}}\, \mathrm{E}(w_{s_k} - w_{s_{k-1}}) = 0$$

• Mean square integrability is proved along the same lines:

$$\mathrm{E}\left(\int_0^t dw_s \overset{0}{\cdot} \xi_s\right)^2 = \mathrm{E} \lim_{|p| \downarrow 0} \sum_{s_k, s_l \in p} (w_{s_{k+1}} - w_{s_k})(w_{s_{l+1}} - w_{s_l})\, \xi_{s_k} \xi_{s_l}$$

As by hypothesis $\xi_t$ is non-anticipating,

$$\mathrm{E}\left\{(w_{s_{k+1}} - w_{s_k})(w_{s_{l+1}} - w_{s_l})\, \xi_{s_k} \xi_{s_l}\right\} = \delta_{k, l}\, (s_{k+1} - s_k)\, \mathrm{E}\,\xi_{s_k}^2$$

Hence, upon inverting the limit and expectation value operations,

$$\mathrm{E}\left(\int_0^t dw_s \overset{0}{\cdot} \xi_s\right)^2 = \lim_{|p| \downarrow 0} \sum_{s_k, s_l \in p} \delta_{k, l}\, (s_{k+1} - s_k)\, \mathrm{E}\,\xi_{s_k}^2 = \int_0^t ds\, \mathrm{E}\,\xi_s^2$$

which yields the claim.

remark. From the proof of the mean square integrability property we can infer the rule to follow when manipulating Wiener differentials directly in the continuum. Specifically, we can think of them as Gaussian variables with correlation

$$\mathrm{E}\,(dw_{t_2}\, dw_{t_1}) = dt\, \delta(t_1 - t_2) \qquad (13)$$

It is worth clarifying the meaning of non-anticipating stochastic process by considering an explicit example.

example (Non-anticipating vs anticipating). Let $w_t$ be a Wiener process for all $t \geq 0$. The function

$$f(t) = \begin{cases} 0 & \text{if } \max_{0 \leq s \leq t} w_s \leq 1 \\ 1 & \text{if } \max_{0 \leq s \leq t} w_s > 1 \end{cases}$$

is non-anticipating, as it depends on the Wiener process only up to the time $t$ at which the function is evaluated. On the other hand, for any $T > t$ the function

$$g(t) = \begin{cases} 0 & \text{if } \max_{0 \leq s \leq T} w_s \leq 1 \\ 1 & \text{if } \max_{0 \leq s \leq T} w_s > 1 \end{cases}$$

is anticipating, as it depends on realizations of the Wiener process at times $s$ posterior to the sampling time $t$.

The simplest example of Itô integral is

$$\int_0^t dw_s \overset{0}{\cdot} w_s = \lim_{|p| \downarrow 0} \sum_{t_k \in p} w_{t_{k-1}} (w_{t_k} - w_{t_{k-1}})$$

Since

$$\sum_{t_k \in p} w_{t_{k-1}} (w_{t_k} - w_{t_{k-1}}) = \sum_{t_k \in p} \frac{w_{t_k}^2 - w_{t_{k-1}}^2}{2} - \sum_{t_k \in p} \frac{(w_{t_k} - w_{t_{k-1}})^2}{2}$$

in mean square sense we can conclude

$$\int_0^t dw_s \overset{0}{\cdot} w_s = \frac{w_t^2}{2} - \frac{t}{2}$$

at variance with what is expected from the ordinary rules of differential calculus. The origin of the discrepancy from ordinary calculus stems from

$$dw_t \sim O(\sqrt{dt})$$
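The pathwise identity above can be verified on a single fine discretization. This is my own sketch (not part of the notes, NumPy assumed): the pre-point sums agree with $w_t^2/2 - t/2$ up to a fluctuation of order $1/\sqrt{n}$ coming from the quadratic-variation term:

```python
import numpy as np

rng = np.random.default_rng(5)
t, n_steps = 1.0, 100_000
dw = rng.normal(0.0, np.sqrt(t / n_steps), size=n_steps)
w = np.concatenate(([0.0], np.cumsum(dw)))  # the discretized path, w[0] = 0

ito_sum = np.sum(w[:-1] * dw)               # pre-point (Ito) sums
print(ito_sum, w[-1]**2 / 2 - t / 2)        # the two agree up to O(1/sqrt(n_steps))
```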

7 itoˆ integral in the space of almost surely square integrable processes

The construction outlined above has one limitation: the mean square integrability condition does not allow us to treat even smooth functions such as

$$f(x) = e^{\frac{x^4}{2}}$$


Namely, we have

$$\mathrm{E}\left(e^{\frac{w_t^4}{2}}\right) = \int_{\mathbb{R}} dx\, \frac{e^{-\frac{x^2}{2 t}}}{\sqrt{2 \pi t}}\, e^{\frac{x^4}{2}} = \infty$$

It turns out that it is possible to relax the integrability constraint by resorting to a method called localization. Our aim here is to illustrate qualitatively the idea behind it. To this goal we need the following definition.

definition. An $\mathbb{R}^d$-valued stochastic process $\{\xi_t\}_{t \in [0, T]}$ is almost surely square integrable in $[0, T]$ if

$$\mathrm{P}\left(\int_0^T dt\, \|\xi_t\|^2 < \infty\right) = 1$$

We notice that mean square integrable processes are also almost surely square integrable: by the Markov inequality,

$$\mathrm{P}\left(\int_0^T dt\, \|\xi_t\|^2 \geq L\right) \leq \frac{1}{L}\, \mathrm{E} \int_0^T dt\, \|\xi_t\|^2$$

If the expectation on the right-hand side is finite, then

$$\lim_{L \uparrow \infty} \mathrm{P}\left(\int_0^T dt\, \|\xi_t\|^2 \geq L\right) = 0 \quad\Rightarrow\quad \mathrm{P}\left(\int_0^T dt\, \|\xi_t\|^2 < \infty\right) = 1$$

The price to pay for the extension of the stochastic integral is that the result no longer enjoys by construction the martingale property (11). Stochastic integrals existing only with probability one are therefore referred to as local martingales.

appendix

A stochastic analysis terminology

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $\{\xi_t\}_{t \in \mathbb{R}_+}$ a stochastic process on it. We denote by $\sigma(\{\xi_t\}_{t \in \mathbb{R}_+})$ the smallest $\sigma$-algebra such that the stochastic process is a measurable function with respect to $\sigma(\{\xi_t\}_{t \in \mathbb{R}_+})$. “Measurable function” is another way of saying that the stochastic process is a random variable with respect to $\sigma(\{\xi_t\}_{t \in \mathbb{R}_+})$: all sets described by collections of values of $\xi_t$ correspond to events in the $\sigma$-algebra.

definition. We call $\sigma(\{\xi_t\}_{t \in \mathbb{R}_+})$ the $\sigma$-algebra generated by $\{\xi_t\}_{t \in \mathbb{R}_+}$.

As the stochastic process is meant to describe a time evolution, it is useful to introduce a measure-theoretic concept representing changes in the set of events to which we wish to assign a probability:

definition. A filtration is an increasing family $\mathsf{F} = \{\mathcal{F}_t\}_{t \in \mathbb{R}_+}$ of $\sigma$-algebras $\mathcal{F}_{t_0} \subseteq \mathcal{F}_{t_1} \subseteq \dots \subseteq \mathcal{F}_{t_n} \subseteq \mathcal{F}$ for $0 \leq t_1 < \dots < t_n \leq t$. The quadruple $(\Omega, \mathcal{F}, \mathsf{F}, P)$ is called a filtered probability space.

In other words, if a collection $\mathsf{F} = \{\mathcal{F}_t\}_{t \in \mathbb{R}_+}$ of sub-$\sigma$-algebras of $\mathcal{F}$ has the property that $t_1 < t_2$ implies $\mathcal{F}_{t_1} \subseteq \mathcal{F}_{t_2}$, then the collection is called a filtration.

definition. If the stochastic process $\{\xi_t\}_{t \in \mathbb{R}_+}$ is such that at each time $t$ the random variable $\xi_t$ is $\mathcal{F}_t$-measurable, then we say that it is adapted to the filtration.

The filtration generated by the stochastic process $\xi_t$,

$$\mathcal{F}_{\leq t}^{\xi} = \sigma(\{\xi_s\}_{s \leq t})$$

is then the $\sigma$-algebra consisting of events $F$ of the form

$$F = \{\omega \in \Omega \mid \xi_{t_1} \in B_1,\ \xi_{t_2} \in B_2,\ \dots,\ \xi_{t_n} \in B_n\}$$

for Borel sets $B_1, \dots, B_n$ and all possible $n$-tuples $(t_1, \dots, t_n)$ such that

$$0 \leq t_1 < \dots < t_n \leq t$$

It is also useful (in the case of an unbounded index set $T$) to define $\mathcal{F}_\infty$ as the $\sigma$-algebra generated by the infinite union of the $\mathcal{F}_t$'s, which is contained in $\mathcal{F}$:

$$\mathcal{F}_\infty = \sigma\left(\cup_{t \in T}\, \mathcal{F}_t\right) \subseteq \mathcal{F}$$

The intuitive picture [1] to keep in mind is that a filtration encodes the information available up to and including each time $t$: the collection of all the past events up to and including time $t$ having non-vanishing probability. As $t$ increases, the amount of information becomes more and more precise (the set of measurable events stays the same or increases) as more data from the evolution become available.

***

references

[1] J. L. Doob. What is a Martingale? The American Mathematical Monthly, 78(5):451, May 1971.

[2] K. Jacobs. Stochastic Processes for Physicists: Understanding Noisy Systems. Cambridge University Press, September 2010.

[3] G. A. Pavliotis. Stochastic Processes and Applications: Diffusion Processes, the Fokker-Planck and Langevin Equations. Springer, New York, 2014.

[4] R. van Handel. Stochastic Calculus, Filtering, and Stochastic Control. Lecture notes, California Institute of Technology, 2007.
