Imperial College of Science, Technology and Medicine Department of Computing
Parallel Numerical Methods for SDEs and Applications
Nada Atallah
Submitted in part fulfilment of the requirements for the degree of Doctor of Philosophy in Computing and the Diploma of Imperial College London, July 2016
Abstract
Stochastic Differential Equations (SDEs) constitute an important mathematical tool with applications in many areas of research, such as finance, physics and computer science. The analytical study of these equations is problematic, especially in the multi-dimensional case, so numerical techniques prove necessary to solve them. In this project, parallel numerical techniques for SDEs are studied. Two kinds of parallelism are explored: in space and in time. These techniques are implemented (using C++ and MPI) for several systems of SDEs, and performance measures such as speedup and efficiency are investigated on medium-scale computer clusters.

In the second part of the thesis, a major application area in the field of computer and communication networks is studied: that of second-order stochastic fluid networks. Recently, interest has been growing in networks with large numbers of components, with applications in diverse fields such as internet performance evaluation, the spread of computer viruses and biochemistry. Such models have so large a state space that discrete-state models are numerically infeasible, due to the explosion in the size of the state space. Fluid approximations are therefore preferable and tend to be more accurate in large state spaces. In fluid models, an integer counter is replaced by a real number representing a volume, and the solution method becomes based on differential equations rather than on difference equations. Analytical solutions are possible in special cases, but in general numerical methods are required. In this project, parallel numerical studies of second-order fluid networks are conducted, and results are analysed in the context of the performance of computer and communication networks, facilitating design improvement of their architecture.
Acknowledgements
I would like to express my sincere gratitude to:
• my supervisor Peter Harrison,
• my second supervisor Tony Field,
• AESOP group,
• my brother, sister and all my friends who made this experience enjoyable and fruitful.
Contents
Abstract
Acknowledgements
1 Introduction
2 Mathematical background
2.1 Measure and probability
2.1.1 Probability Spaces
2.1.2 Random variables and stochastic processes
2.1.3 Expected value and variance
2.1.4 Distribution Functions and independence
2.1.5 Limit Theorems
2.1.6 Conditional expectation
2.1.7 Martingales
2.2 Brownian Motion
2.2.1 Properties
2.2.2 Brownian Martingales
2.2.3 Reflection principle
2.2.4 Path properties of Brownian motion
2.3 Poisson Process
2.3.1 Properties
2.3.2 Poisson Martingales
2.3.3 Path properties of Poisson processes
2.4 Lévy processes
2.5 Stochastic Integrals and Itô's Formula
2.5.1 Stochastic integral for simple step integrand
2.5.2 Stochastic integral for square integrable adapted integrands
2.5.3 Further extension of the stochastic integral
2.5.4 Itô formula (with respect to Brownian motion)
2.5.5 Itô's formula (with respect to Itô processes)
2.6 Stochastic differential equations
3 Numerical Methods for SDEs
3.1 Examples of explicitly solvable SDEs
3.2 Existence and uniqueness of strong solutions
3.2.1 Solutions for ODEs
3.2.2 Solutions for SDEs
3.3 Stochastic discrete time approximations
3.3.1 Types of approximation
3.3.2 Wagner-Platen expansion
3.3.3 Convergence
3.4 Strong approximations
3.4.1 Euler Scheme
3.4.2 Milstein Scheme
3.4.3 Higher order schemes
3.4.4 Geometric Brownian motion simulation
3.5 Weak approximations
3.5.1 Weak error criterion
3.5.2 Euler scheme
3.5.3 Higher order schemes
4 Parallel stochastic simulation
4.1 Phase space Parallelism
4.1.1 Method
4.1.2 Numerical simulation
4.2 Parallelism in time
4.2.1 The Parareal algorithm
4.2.2 Ordinary differential equations
4.2.3 Stochastic differential equations
5 Application: Second order stochastic fluid models
5.1 Queueing theory and fluid models
5.2 Second order single fluid queue model
5.2.1 Pathwise construction of the dynamics of a single server fluid queue
5.2.2 Diffusion approximation and second order model
5.2.3 Analytical study of second order fluid queues in a random environment
5.2.4 Example of a single fluid queue and numerical results
5.3 Second order stochastic queueing network model
5.3.1 Traffic equations for single class Generalized Jackson network
5.3.2 Pathwise construction of the dynamics of a single class open GJN
5.3.3 Diffusion approximation and second order model
5.3.4 Second order fluid network model analytically
5.3.5 Second order fluid network model numerically
6 Conclusion
List of Tables

4.1 Timing for different numbers of processors
List of Figures

2.1 Histograms for different values of n
2.2 Sample paths of Brownian Motion
2.3 Sample path of Brownian Motion and its reflected path
2.4 Sample path of a Poisson process
2.5 Sample path of a Compound Poisson process of intensity 10 and marks uniformly distributed over [0, 1]
3.1 Sample path of the solution for N = 64
3.2 Sample path of the solution for N = 1024
3.3 Sample path of the solution for N = 16
3.4 Sample path of the solution for N = 64
3.5 GBM sample paths for different values of N
3.6 Log-log plot of the absolute error versus the time increment for the Euler scheme
3.7 Log-log plot of the absolute error versus the time increment for the Milstein scheme
3.8 Q-Q plot comparing the simulation result of the naive Euler algorithm with the exact solution of the GBM
4.1 Speedup versus the number of processors
4.2 System efficiency versus number of processors
4.3 Parareal algorithm [43]
4.4 Parareal simulation of reflected SDE
4.5 First Parareal simulation of GBM
5.1 Single fluid queue
5.2 Markov modulated process
5.3 On-Off process
5.4 Q-Q plot in the fluid flow case σ1 = σ2 = 0
5.5 Q-Q plot when σ1 = 1, σ2 = 0
5.6 Analytical and numerical solution of F(x) versus x when σ1 = 0, σ2 = 0
5.7 Analytical and numerical solution of F(x) versus x when σ1 = 1, σ2 = 0
5.8 Analytical and numerical solution of F(x) versus x when σ1 = 0, σ2 = 1
5.9 Analytical and numerical solution of F(x) versus x when σ1 = σ2 = 1
5.10 Stationary distribution of the buffer content
5.11 Expectation of the buffer content as a function of the variance
5.12 A generalized Jackson network (GJN)
5.13 Tandem of fluid queues
5.14 Mean queue length as a function of time with Q1(0) equal to 3
5.15 Mean queue length as a function of time with Q1(0) equal to 0
5.16 Variance of the queue length as a function of time with Q1(0) equal to 3
5.17 Variance of the queue length as a function of time with Q1(0) equal to 0
5.18 Stationary distribution with Q1(0) equal to 3
5.19 Stationary distribution with Q1(0) equal to 0
Chapter 1
Introduction
Motivation, Objectives and Contribution
Stochastic differential equations arise in many fields of study, such as physics, computing and biology. They extend the concept of differential equations by taking the noise factor into consideration. Few SDEs can be solved analytically; for this reason, the numerical approach constitutes a very important method for solving SDEs, especially high-dimensional ones. In many cases, however, these methods prove to be highly demanding in processing time or memory utilisation. Parallelizing them therefore seems to be a necessity for many of the problems faced in this field of study. In physics, for example, many problems in nonequilibrium statistical mechanics and cosmology involve SDEs, such as the problem of the escape rate from a potential well [35]. These invoke the solution of infinite-dimensional SDEs, which requires numerical simulation, together with parallelization techniques for better performance and to avoid issues related to memory or long processing times.
In this thesis, numerical techniques to solve SDEs will be presented and applied to problems in computer and communication networks. Fluid modelling is an important tool for estimating the performance of queueing systems such as computer and communication networks, especially in the heavy-traffic limit or when we are faced with the state-space explosion problem. Second order fluid queue models are fluid models that take the noise factor into account; consequently, the mathematical description of these models tends to involve stochastic differential equations. The solution of these equations turns out to be analytically intractable in most cases, especially when considering large-scale models of computer and communication systems.
Throughout this thesis, an extensive study of parallel numerical techniques for stochastic differential equations is undertaken. It starts with the embarrassingly parallel case of phase space parallelism. These techniques, involving Monte Carlo methods, are useful when the interest lies in the weak approximation of stochastic differential equations. When the focus is on the strong approximation, on the other hand, the only kind of parallelism available to explore is parallelism in time, which was originally explored for stochastic differential equations by G. Bal [43] in his extension of the parareal algorithm to cover the SDE case. In his study, he considered the Euler scheme as a coarse solver and showed, through a convergence study, that k iterations of the parareal algorithm improve the convergence order by a factor of k over the Euler technique, thus obtaining a convergence rate of k/2 with an Euler coarse solver of order 1/2. We have extended this study by taking the Milstein scheme as the coarse solver and proved that after k iterations the order of convergence is k, with a Milstein coarse solver of order 1. Several stochastic differential equations have been simulated to visualise the convergence to the exact solution.
In the second part of the thesis, second order fluid models of computer and communication systems are analysed. A detailed study of the single fluid queue case is undertaken, both analytically and numerically. The weak approximation techniques explored in previous chapters are used to study this kind of fluid queue numerically, and the analytical solution is used to show the convergence of these techniques to the exact solution when simulating fluid queues with the noise factor. Generalized Jackson networks and their reflected Brownian motion (RBM) approximation are then introduced. Numerical techniques available for multidimensional RBM are presented, and the example of a tandem of fluid queues is explored numerically.
Chapters outline
This thesis starts with a chapter briefly describing the mathematical tools and terminology necessary for a proper formalisation of physical problems in which the noise factor is taken into consideration. More specifically, stochastic differential equations and Itô calculus are presented in a nutshell, briefly visiting concepts from measure and probability theory, Brownian motion, Lévy processes and Itô integration.
In the following chapter, numerical techniques used to solve stochastic differential equations are explored. It begins with a short survey of the analytical background behind the existence and uniqueness of the solution of stochastic differential equations and the assumptions necessary for them, which set the conditions without which solving an equation numerically might lead to erroneous results. Afterwards, numerical schemes for the simulation of strong and weak approximations of stochastic differential equations are studied thoroughly, together with their orders of convergence, and numerical examples are provided for the sake of clarity.
The fourth chapter deals with the parallelization of the numerical schemes for stochastic differential equations. In this chapter, phase space parallelism and time parallelism are investigated. Their domains of application differ, depending on whether one seeks a strong or a weak approximation of these differential equations. Both kinds of parallelism are studied thoroughly, though the main emphasis, as stated earlier, is on the parareal algorithm, where previous studies are extended by considering the Milstein scheme as the coarse solver and proving that after k iterations, a convergence rate of k is achieved.
In the last chapter, second order fluid queues in the field of computer and communication networks are investigated. An analytical and numerical study of the single fluid queue case is undertaken with the purpose of validating the numerical results against the exact ones. This study is extended to the case of Generalized Jackson networks, which are studied first analytically and then numerically. The example of a tandem of fluid queues is analysed numerically.
The purpose of the sixth chapter is to provide a suitable conclusion for this PhD thesis.
Statement of Originality
To the best of my knowledge, this thesis contains no material previously published or written by another person except where due acknowledgment is made in the thesis itself.

Chapter 2
Mathematical background
In this chapter, we will discuss briefly the relevant mathematical concepts behind the theory of stochastic differential equations.
2.1 Measure and probability
2.1.1 Probability Spaces
Suppose that Ω is a non-empty set.
Definition 2.1.1. The set of all possible outcomes of an experiment is called the sample space and is denoted by Ω. Events are subsets of the sample space.
Definition 2.1.2. A field on Ω is a collection F of subsets of Ω with the following properties:
(i) ∅ ∈ F.
(ii) If E ∈ F, then E^C ∈ F, where E^C = Ω − E.
(iii) If E1,E2 ∈ F, then E1 ∪ E2 ∈ F.
Definition 2.1.3. A σ-algebra is a collection F of subsets of Ω with the following properties:
(i) ∅ ∈ F.
(ii) If E ∈ F, then E^C ∈ F, where E^C = Ω − E.
(iii) If E1, E2, ... ∈ F, then ∪_{i=1}^∞ Ei ∈ F.
A σ-algebra is closed under the operation of taking countable intersections.
Definition 2.1.4. A measure µ is a function defined on a σ-algebra F over a set Ω and taking values in the extended interval [0,∞] such that the following properties are satisfied:
(i) The empty set has measure zero: µ(∅) = 0.
(ii) Countable additivity or σ-additivity: if E1, E2, ... is a countable sequence of pairwise disjoint sets in F, then µ(∪_{i=1}^∞ Ei) = Σ_{i=1}^∞ µ(Ei).
The triple (Ω,F,µ) is then called a measure space, and the members of F are called measurable sets [1, 2, 3].
Definition 2.1.5. A Borel σ-field B is the σ-field generated by all intervals. The elements of B are called Borel sets. The class B of Borel sets in Euclidean R^n is the smallest collection of sets that includes the open and closed sets:

B = ∩ {F : F is a σ-field containing all intervals}
Definition 2.1.6. A probability space is a triple (Ω,F,P) where Ω is an arbitrary set, F is a σ-algebra of subsets of Ω and P is a measure on F such that: P(Ω) = 1 called probability measure on F or, briefly, probability.
2.1.2 Random variables and stochastic processes
Probability spaces may not be "directly observable"; therefore it is useful to introduce mappings X from Ω to R^n, the values of which can be observed [4, 5].
Definition 2.1.7. Let (Ω,F,P) be a probability space. A mapping X : Ω → R^n is called an n-dimensional random variable if for each B ∈ B, we have:

X⁻¹(B) ∈ F
Lemma 2.1.8. Let X : Ω → R^n be a random variable. Then U(X) := {X⁻¹(B) | B ∈ B} is a σ-algebra, called the σ-algebra generated by X. It is the smallest sub-σ-algebra of F with respect to which X is measurable.
Definition 2.1.9. Given a probability space (Ω,F,P) and a measurable space (S, Σ), a stochas- tic process is a collection of S-valued random variables {X(t), t ∈ T } on the set Ω, indexed by the parameter t taking values in the set T. The random variable takes values in the set S, called the state-space of the stochastic process. For each point ω ∈ Ω, the mapping t → X(t, ω) is the corresponding sample path.
The parameter t is usually used to represent time. When T is a countable set, the process is said to be a discrete-time process. If T is an interval of the real line, the process is said to be a continuous-time process.
Continuity of stochastic processes
In this paragraph, we introduce the notion of continuity for stochastic processes, for more details the reader is referred to references [50, 51, 52].
Definition 2.1.10. (Continuity in Sample Paths) A stochastic process {X(t, .):Ω → R, t ∈ T } is continuous at time s, s ∈ T, if for almost all ω ∈ Ω, t → s implies X(t, ω) → X(s, ω). A process is continuous if, for almost all ω ∈ Ω,X(., ω) is a continuous function.
Definition 2.1.11. (Cadlag) A stochastic process {X(t, .) : Ω → R, t ∈ T} is cadlag if for almost all ω ∈ Ω, we have that for every s ∈ T: t ↓ s implies X(t) → X(s), and for t ↑ s, lim_{t↑s} X(t) exists, but need not be X(s). In other words, a stochastic process X(t) is cadlag if almost all its sample paths are continuous from the right and limited from the left at every point.
The term cadlag stands for the French expression 'continue à droite, limite à gauche'. These processes are also referred to as rcll (right continuous with left limits). Although this property is weaker than pathwise continuity, it is still a very important one, widely used when real-life applications are considered.
Definition 2.1.12. (Continuity in L²) A stochastic process {X(t, .) : Ω → R, t ∈ T} is continuous in mean square if E[X(t)²] < ∞ for all t and lim_{s→t} E[(X(s) − X(t))²] = 0 for all t ∈ T.
The notion of expectation E[X] is introduced in section (2.1.3).
Definition 2.1.13. (Continuity in L^p) A stochastic process {X(t, .) : Ω → R, t ∈ T} is continuous in L^p if E[X(t)^p] < ∞ for all t and lim_{s→t} E[(X(s) − X(t))^p] = 0 for all t ∈ T.
Definition 2.1.14. (Continuity in Probability) A stochastic process {X(t, .) : Ω → R, t ∈ T} is continuous in probability at time s ∈ T if:

lim_{t→s} P[ω ∈ Ω : |X(t) − X(s)| ≥ ε] = 0, for all ε > 0.

{X(t), t ∈ T} is continuous in probability or stochastically continuous if the previous condition holds for all t ≥ 0.
Evidently, continuity in sample paths implies continuity in L^p and continuity in probability; for more details please refer to [48].
Indistinguishable processes
Stochastic processes are mathematical tools used to represent natural phenomena. The following definitions are useful to determine when two processes Xt and X′t represent the same phenomenon [48].

Definition 2.1.15. Let {Xt, t ∈ T} and {X′t, t ∈ T} be two stochastic processes sharing the same state space (G, G), defined respectively on the probability spaces (Ω,F,P) and (Ω′,F′,P′). The two processes are equivalent if for every finite set of instants of time {t1, t2, ..., tn} in T and elements A1, A2, ..., An of G:

P(Xt1 ∈ A1, Xt2 ∈ A2, ..., Xtn ∈ An) = P′(X′t1 ∈ A1, X′t2 ∈ A2, ..., X′tn ∈ An).

{Xt} and {X′t} are also said to have the same law.

Definition 2.1.16. Let {Xt, t ∈ T} and {X′t, t ∈ T} be two stochastic processes defined on the same probability space (Ω,F,P) sharing the same state space (G, G). The process {Xt} is a modification of {X′t} if:

Xt = X′t a.s. for each t ∈ T.

Definition 2.1.17. Let {Xt, t ∈ T} and {X′t, t ∈ T} be two stochastic processes defined on the same probability space (Ω,F,P) sharing the same state space (G, G). The processes {Xt} and {X′t} are P-indistinguishable if for almost all ω ∈ Ω:

Xt(ω) = X′t(ω) ∀ t ∈ T.

Lemma 2.1.18. Let {Xt, t ∈ T} and {X′t, t ∈ T} be two right-continuous stochastic processes defined on the same probability space (Ω,F,P) sharing the same state space (G, G). If the process {Xt} is a modification of {X′t} then they are indistinguishable.
For more details, please refer to [48].
Examples of stochastic processes in real life
Stochastic processes have many applications in lots of fields like biology, physics, finance, telecommunication networks and many others [6].
Example 2.1.19. (Queues) Let X(t) be the number of customers waiting for service in a service facility. {X(t), t ≥ 0} is a continuous-time stochastic process with state-space S = {0, 1, 2, ...}.
Example 2.1.20. (Manufacturing) Let N be the number of machines in a machine shop. Each machine can be in one of two states: working or under repair. Let Xi(t) = 1 if the i-th machine is working at time t, and 0 otherwise. Then {X(t) = (X1(t), X2(t), ..., XN(t)), t ≥ 0} is a continuous-time stochastic process with state-space S = {0, 1}^N.

Example 2.1.21. (DNA Analysis) A DNA (deoxyribonucleic acid) molecule is a long molecule that consists of a string of four basic molecules called bases, represented by A, T, G, C. Let Xn be the n-th base in the DNA. Then {Xn, n ≥ 0} is a discrete-time stochastic process with state-space S = {A, T, G, C}.
2.1.3 Expected value and variance
Let X :Ω → R be a random variable.
Definition 2.1.22. The expected value or mean value of X is defined by:
E(X) := ∫_Ω X dP
Definition 2.1.23. The variance of X is defined by:
V(X) := ∫_Ω |X − E(X)|² dP

where |·| denotes the Euclidean norm.
Lemma 2.1.24. (Markov's inequality). If X is a nonnegative random variable, then, for any value k > 0:

P(X ≥ k) ≤ E[X]/k
Lemma 2.1.25. (Chebyshev's inequality). If X is a random variable with mean µ and variance σ², then, for any value k > 0:

P(|X − µ| ≥ k) ≤ σ²/k²
The importance of Markov's and Chebyshev's inequalities lies in the fact that they provide us with a means to derive bounds on probabilities when we only know the value of the mean, or the values of both the mean and the variance, while ignoring the probability distribution. Certainly, when the distribution is known, the sought probabilities can be determined exactly without resorting to these bounds [7].
2.1.4 Distribution Functions and independence
Let X :Ω → Rn be a random variable defined on the probability space (Ω,F,P) and let A and B be two events with P (B) > 0.
Definition 2.1.26. The distribution function of X is the function FX : R^n → [0, 1] defined by:

FX(x) := P(X ≤ x) for all x ∈ R^n
Definition 2.1.27. If there exists a nonnegative, integrable function f : R^n → R such that:

F(x) = ∫_{−∞}^{x} f(y) dy

then f is called the density function for X.
Proposition 2.1.28. Suppose g : R^n → R, and Y = g(X) is integrable. Then

E(Y) = ∫_{R^n} g(x) f(x) dx.
Definition 2.1.29. Two events A and B are called independent if:
P (A ∩ B) = P (A).P (B).
Definition 2.1.30. Let Xi : Ω → R^n be random variables (i = 1, ...). We say the random variables X1, ... are independent if for all k ∈ N such that k ≥ 2 and all choices of Borel sets B1, ..., Bk ⊆ R^n:

P(X1 ∈ B1, X2 ∈ B2, ..., Xk ∈ Bk) = P(X1 ∈ B1) P(X2 ∈ B2) ... P(Xk ∈ Bk).
2.1.5 Limit Theorems
A very famous result in probability theory is the strong law of large numbers: the average of a sequence of independent, identically distributed random variables converges almost surely to the expected value of their common distribution [1].
Theorem 2.1.31. (Strong law of large numbers). Let X1, ..., Xn, ... be a sequence of indepen- dent, identically distributed random variables defined on the same probability space, each having a finite mean µ = E[Xi]. Then,
P( lim_{n→∞} (X1 + ... + Xn)/n = µ ) = 1.
Another very well-known theorem in probability theory is the central limit theorem. In addition to its theoretical value, this theorem provides a simple method to compute approximate probabilities for sums of independent random variables. It also provides an explanation for the prevalence of the normal distribution curve shape among the empirical frequencies of many real populations.
Theorem 2.1.32. (Central Limit Theorem). Let X1, ..., Xn, ... be a sequence of independent, identically distributed, real-valued random variables each having mean µ and variance σ². Then for all a, −∞ < a < +∞:

lim_{n→∞} P( (X1 + ... + Xn − nµ)/(σ√n) ≤ a ) = (1/√(2π)) ∫_{−∞}^{a} e^{−x²/2} dx.
Example 2.1.33. The experiment below illustrates this important mathematical concept. We generate N sequences of exponentially distributed random variables {Xi}_{i=1}^n. Their distribution function is given by:

FX(x) = 0 for x < 0,
FX(x) = 1 − exp(−λx) for x ≥ 0,

where λ = 0.5. The mean of this exponential distribution is µ = 2 and the variance is σ² = 4. For each of the N sequences of generated random variables {Xi}_{i=1}^n, we determine the value of the normalized random variable Zn given by:

Zn = (Σ_{i=1}^n Xi − nµ)/(σ√n)

These N values of Zn are used to plot the histogram approximating the simulated density function of Zn. The experiment is repeated for different values of n. From figure 2.1, it is obvious that as n increases, the density function of Zn, approximated by the histogram, tends more and more to the density function of a standard Gaussian distribution.
2.1.6 Conditional expectation
Definition 2.1.34. Let X be an integrable random variable on a probability space (Ω,F,P) and U a σ-field contained in F. Then E(X | U) is defined to be a random variable such that:
(1) E(X | U) is U-measurable;
(2) For any A ∈ U,

∫_A X dP = ∫_A E(X | U) dP.
Theorem 2.1.35. (Properties of conditional expectation).
(1) E(E(X | U)) = E(X).
(2) If a, b are constants, E(aX + bY | U) = aE(X | U) + bE(Y | U) a.s.
Figure 2.1: Histograms for different values of n (panels (a)-(f): n = 3, 5, 10, 15, 100, 500)
(3) If X is U-measurable and XY is integrable, then
E(XY | U) = XE(Y | U) a.s.
(4) If X is positive, then E(X | U) ≥ 0 a.s. (5) If W ⊆ U, we have
E(E(X | U) | W) = E(X | W) a.s.
2.1.7 Martingales
Definition 2.1.36. Let (Ω,F,P) be a probability space. A (discrete) filtration is an increasing sequence of sub-σ-fields (Fn)_{n≥0} of F; i.e.

F0 ⊂ F1 ⊂ F2 ⊂ ... ⊂ Fn ⊂ ... ⊂ F.

We write F = (Fn)_{n≥0} [1, 2, 8].
Definition 2.1.37. The sequence (Xn)_{n≥0} of random variables is adapted to the filtration F if Xn is Fn-measurable for every n ≥ 0. The tuple (Ω, F, (Fn)_{n≥0}, P) is called a filtered probability space.
Definition 2.1.38. A sequence of real-valued random variables X1,X2, ... is called a martingale with respect to a filtration F 1, F 2, ... if:
(1) Xn is integrable for each n = 1, 2, ... ;
(2) X1,X2, ... is adapted to F 1, F 2, ... ;
(3) E(Xk+1 | F k) = Xk a.s. for each k = 1, 2, ... .
Definition 2.1.39. A sequence of real-valued random variables X1, X2, ... is called a supermartingale with respect to a filtration F1, F2, ... if:
(1) Xn is integrable for each n = 1, 2, ... ;
(2) X1, X2, ... is adapted to F1, F2, ... ;
(3) E(Xk+1 | F k) ≤ Xk a.s. for each k = 1, 2, ... .
Definition 2.1.40. A sequence of real-valued random variables X1, X2, ... is called a submartingale with respect to a filtration F1, F2, ... if:
(1) Xn is integrable for each n = 1, 2, ... ;
(2) X1,X2, ... is adapted to F 1, F 2, ... ;
(3) E(Xk+1 | F k) ≥ Xk a.s. for each k = 1, 2, ... .
Example 2.1.41. (Symmetric random walk) Let X1, X2, ... be a sequence of independent, identically distributed random variables such that:
P {Xn = 1} = P {Xn = −1} = 0.5
and Yn the symmetric random walk given by:
Yn = X1 + X2 + ... + Xn,
then Yn and Yn² − n are martingales with respect to the filtration Fn = σ(X1, X2, ..., Xn).
Definition 2.1.42. A random variable τ taking values in the set {1, 2, ...} ∪ {∞} is called a stopping time with respect to a filtration F n if:
{τ = n} ∈ F n, for each n = 1, 2, ...
Theorem 2.1.43. (Doob's Martingale inequalities)
(1) Doob's maximal inequality: If Xn is a non-negative submartingale with respect to a filtration Fn, then for all λ > 0:

P( max_{0≤k≤n} Xk ≥ λ ) ≤ (1/λ) E( Xn · 1_{max_{0≤k≤n} Xk ≥ λ} )

(2) Doob's maximal L² inequality: If Xn is a non-negative square integrable submartingale with respect to a filtration Fn, then:

E( (max_{0≤k≤n} Xk)² ) ≤ 4 E( Xn² ).
2.2 Brownian Motion
A Brownian motion (Wiener process) is the most important example of a continuous-time martingale. It plays a central role in probability theory, the theory of stochastic processes as well as in many other fields of study [8, 9, 10].
Definition 2.2.1. A Brownian motion {Wt, (F t)t∈T } on (Ω,F,P) is a real-valued stochastic process that has continuous sample paths and stationary Gaussian independent increments, such that:
1. W0 = 0.
2. t → Wt(ω) is continuous.
3. For t ≥ s, Wt − Ws is a Gaussian random variable with mean 0 and variance t − s, and
is independent of the history F s.
Figure 2.2: Sample paths of Brownian Motion
2.2.1 Properties
Let {Wt, (F t)t∈T } be a standard one-dimensional Brownian motion on (Ω,F,P) . Then it follows that:
1. Gaussian process: for any finite sub-family (t1, ..., tn) of T, the finite dimensional distributions P(Wt1 ≤ x1, ..., Wtn ≤ xn) are multivariate normal, with E(Wt) = 0 and E(Wt Ws) = min{s, t} for t ≥ 0 and s ≥ 0.
2. Time homogeneity: the process {Wt+s − Ws}, t ≥ 0 is a Brownian motion, for any s > 0.
3. Symmetry: the process {−Wt}, t ≥ 0 is a Brownian motion.
4. Scaling: the process {α·W_{t/α²}}, t ≥ 0 is a Brownian motion, for any α > 0.
5. Time inversion: the process {t·W_{1/t}}, t > 0, with W0 = 0, is a Brownian motion.
6. Markov property: {Wt}, t ∈ T, is a Markov process,
P [Wt ≤ x | F s] = P [Wt ≤ x | Ws], 0 ≤ s < t
P [Wt ≤ x | Ws] = P [Wt−s ≤ x], 0 ≤ s < t.
7. Strong Law of Large Numbers: lim_{t→∞} Wt/t = 0, a.s.
For a more detailed study and proofs of the statements above please refer to [45, 46].
2.2.2 Brownian Martingales
Proposition 2.2.2. A Brownian motion {Wt}, t ≥ 0, is a martingale on the filtered probability space (Ω,F,P).
Proof. Briefly, (Wt − Ws) is independent of F s for s ≤ t, consequently:
E[Wt − Ws | F s] = E[Wt − Ws]
= 0 which implies that:
E[Wt | F s] = Ws
Proposition 2.2.3. The process {Wt² − t} is a martingale on the filtered probability space (Ω,F,P).
Proof. Briefly,

E[Wt² − Ws² | Fs] = E[(Wt − Ws)² + 2Ws(Wt − Ws) | Fs]
= E[(Wt − Ws)² | Fs]
= E[(Wt − Ws)²]
= t − s

which implies that:

E(Wt² − t | Fs) = Ws² − s
Proposition 2.2.4. The process exp(δ·Wt − (δ²/2)·t) is a martingale on the filtered probability space (Ω,F,P), for δ ∈ R.

Proof. Briefly, for t ≥ s, (Wt − Ws) is a Gaussian random variable, with mean 0 and variance (t − s), independent of the history Fs. Consequently:

E[e^{δ(Wt−Ws)} | Fs] = e^{δ²(t−s)/2}

E[e^{δ(Wt−Ws) − δ²(t−s)/2} | Fs] = 1

implying that:

E[e^{δWt − (δ²/2)t} | Fs] = e^{δWs − (δ²/2)s}
2.2.3 Reflection principle
Let {Wt, t ≥ 0} be a Brownian motion on the filtered probability space (Ω,F,P).
Definition 2.2.5. The running maximum to date of W_t is defined as
M_t := sup_{s∈[0,t]} W_s = max_{s∈[0,t]} W_s.
Definition 2.2.6. The first passage time T_x of W_t to a level x ∈ R is defined as:
T_x := inf{t ≥ 0 : W_t = x}.
Let m be a given positive level and t a given positive time. Some paths of the Brownian motion exceed level m before or at time t, others only just reach it, while others stay below it. Among the paths that reach or exceed this level, two types can be distinguished: those that reach level m before time t and are found at some level below m at time t, and those that reach level m before t but are found above m at time t. This can be expressed mathematically by the equation below:
P[T_m < t] = P[T_m < t, W_t > m] + P[T_m < t, W_t < m].  (2.1)
The reflection principle provides a heuristic argument: for every path that crosses level m before time t and is found at some point w below level m at time t, there exists a symmetric path (obtained by reflection with respect to level m) which is found at the point (2m − w) at time t, as in Figure 2.3. Both paths have the same probability, by the symmetry, with respect to level m, of a Brownian motion started at that level. Therefore, to every path of the Brownian motion crossing level m before time t and ending at some point below m there corresponds an equally probable path crossing level m before time t but ending above m. This can be expressed by the equation:
P[T_m < t, W_t < m] = P[T_m < t, W_t > m] = P[W_t > m].  (2.2)
Figure 2.3: Sample path of Brownian Motion and its reflected path
This leads to the statement of the theorem below.
Theorem 2.2.7. The reflection equality states that, for m ≥ 0 and w ≤ m,
P[M_t ≥ m, W_t ≤ w] = P[W_t ≥ 2m − w] = 1 − Φ((2m − w)/√t),  (2.3)
where Φ(·) is the CDF of the standard normal distribution.
Though based on the reflection principle, a rigorous mathematical proof can be provided using the strong Markov property, as for example in references [45, 46].
Proposition 2.2.8. The density of the first passage time T_m of W_t to a level m ∈ R is given by:
f_{T_m}(t) = (|m| / √(2πt³)) e^{−m²/(2t)}.
Proof. Briefly,
P[T_m < t] = P[T_m < t, W_t > m] + P[T_m < t, W_t < m];
using equation (2.2) we obtain that:
P[T_m < t] = 2 P[W_t > m] = 2(1 − Φ(m/√t)).
By differentiating with respect to t, we obtain the density f_{T_m}(t).
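The first-passage law can also be checked numerically: the proportion of discretised Brownian paths that reach level m before time t should approach 2(1 − Φ(m/√t)). A Python sketch (not from the thesis; the level, horizon, grid and sample size are arbitrary choices, and the discrete grid slightly underestimates the true crossing probability because crossings between grid points are missed):

```python
import random
import math

def phi(x):
    """CDF of the standard normal distribution, via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def first_passage_prob(m, t, steps=400, n=10_000, seed=3):
    """Fraction of discretised Brownian paths hitting level m before t."""
    rng = random.Random(seed)
    sd = math.sqrt(t / steps)
    hits = 0
    for _ in range(n):
        w = 0.0
        for _ in range(steps):
            w += rng.gauss(0.0, sd)
            if w >= m:          # path has reached the level
                hits += 1
                break
    return hits / n
```

For m = 1 and t = 1 the closed form gives 2(1 − Φ(1)) ≈ 0.317, and the simulated fraction should come out slightly below that.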
2.2.4 Path properties of Brownian motion
Let {Wt, t ≥ 0} be a Brownian motion on the filtered probability space (Ω,F,P). We fix a time horizon 0 < T < ∞.
Definition 2.2.9. A partition of [0, T] is a finite set Π = {t_0, t_1, t_2, ..., t_m} ⊂ [0, T] (m ∈ N) with 0 = t_0 < t_1 < ... < t_m = T.
The maximal step size of Π is ‖Π‖ := max_i |t_{i+1} − t_i|.
Theorem 2.2.10. (Quadratic variation of Brownian motion) If (Π_n) is a refining sequence of partitions with ‖Π_n‖ → 0 then, for all t ∈ [0, T],
lim_n Σ_{t_i∈Π_n, t_i≤t} |W_{t_{i+1}∧t} − W_{t_i}|² = t almost surely.
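This theorem is easy to observe numerically: summing the squared increments of a simulated Brownian path over a fine partition of [0, t] gives a value close to t. A sketch (the horizon, step count and seed are arbitrary illustrative choices, not from the thesis):

```python
import random
import math

def quadratic_variation(t=2.0, steps=100_000, seed=4):
    """Sum of squared increments of one simulated Brownian path on [0, t]."""
    rng = random.Random(seed)
    sd = math.sqrt(t / steps)
    qv = 0.0
    for _ in range(steps):
        dw = rng.gauss(0.0, sd)  # increment over one cell of the partition
        qv += dw * dw
    return qv
```

With t = 2 and 100 000 steps the sum should be very close to 2.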
Definition 2.2.11. For a function α : [0, T] → R, the total variation of α up to t ∈ [0, T] is defined as
TV_t(α) := sup_Π Σ_{t_i∈Π, t_i≤t} |α(t_{i+1} ∧ t) − α(t_i)|,
where the supremum is taken over all partitions. For a continuous function α(·) and a refining sequence (Π_n) of partitions with ‖Π_n‖ → 0,
TV_t(α) = lim_n Σ_{t_i∈Π_n, t_i≤t} |α(t_{i+1} ∧ t) − α(t_i)|.
Theorem 2.2.12. (Total variation of Brownian motion) For a refining sequence (Π_n) of partitions with ‖Π_n‖ → 0,
lim_n Σ_{t_i∈Π_n, t_i≤t} |W_{t_{i+1}∧t} − W_{t_i}| = +∞ almost surely.
Theorem 2.2.13. (Law of the Iterated Logarithm)
lim sup_{t→0} W_t / √(2t log log(1/t)) = 1, a.s.
lim inf_{t→0} W_t / √(2t log log(1/t)) = −1, a.s.
lim sup_{t→∞} W_t / √(2t log log t) = 1, a.s.
lim inf_{t→∞} W_t / √(2t log log t) = −1, a.s.
For a mathematical proof of the theorems above and for further details, please refer to [45, 46].
2.3 Poisson Process
Poisson processes play a central role in stochastic modelling especially when jumps are taken into consideration.
Let (Ω, F, F_t, P) be a filtered probability space.
Definition 2.3.1. An F_t-Poisson process {N_t}, t ≥ 0, of intensity (parameter) λ is a right-continuous, adapted and integer-valued process with stationary independent increments, such that:

1. (Independent Poisson Increments) For t ≥ s, N_t − N_s is a Poisson-distributed random variable with parameter λ(t − s) and is independent of the history F_s:
P(N_t − N_s = k) = e^{−λ(t−s)} (λ(t − s))^k / k!.
2. (Step Function Paths) For t > 0, t ↦ N_t(ω) is an increasing step function of t with jumps of size one and initial value N_0 = 0.
Figure 2.4: Sample path of a Poisson process
2.3.1 Properties
Let {N_t}, t ≥ 0, be an F_t-Poisson process of intensity λ on the filtered space (Ω, F, F_t, P). Then it follows that:
1. Memoryless property: P(N_{t+s} > x + y | N_s > y) = P(N_t > x), ∀x, y ≥ 0;
2. E[N_t] = λt, Var[N_t] = λt.
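Property 2 can be verified by simulating the Poisson process from its exponential inter-arrival times and estimating the mean and variance of N_t; both should be close to λt. A Monte Carlo sketch (the intensity, horizon, sample size and seed are arbitrary choices, not from the thesis):

```python
import random

def poisson_count(lam, t, rng):
    """Number of arrivals in [0, t] of a Poisson process of intensity lam,
    generated from exponential inter-arrival times."""
    n, clock = 0, rng.expovariate(lam)
    while clock <= t:
        n += 1
        clock += rng.expovariate(lam)
    return n

def mean_and_var(lam=3.0, t=2.0, n=100_000, seed=5):
    """Sample mean and variance of N_t; theory predicts both equal lam*t."""
    rng = random.Random(seed)
    xs = [poisson_count(lam, t, rng) for _ in range(n)]
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / n
    return m, v
```

With λ = 3 and t = 2 both estimates should be close to 6.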
2.3.2 Poisson Martingales
Let {N_t}, t ≥ 0, be an F_t-Poisson process of intensity λ on the filtered space (Ω, F, F_t, P).
Proposition 2.3.2. The compensated Poisson process {N_t − λt}, t ≥ 0, is a martingale on the filtered probability space (Ω, F, F_t, P).
Proof. Briefly, (N_t − N_s) is independent of F_s for s ≤ t; consequently:
E[N_t − N_s | F_s] = E[N_t − N_s] = λ(t − s),
which implies that:
E[N_t | F_s] − N_s = λ(t − s),
therefore,
E[N_t − λt | F_s] = N_s − λs.
Proposition 2.3.3. The process {(N_t − λt)² − λt} is a martingale on the filtered probability space (Ω, F, P).

Proof. Briefly,
E[(N_t − λt)² | F_s] = E[(N_t − N_s + N_s − λt)² | F_s]
= E[(N_t − N_s)² + (N_s − λt)² + 2(N_t − N_s)(N_s − λt) | F_s]
= E[(N_t − N_s)²] + (N_s − λt)² + 2(N_s − λt) E[N_t − N_s]
= λ(t − s) + λ²(t − s)² + (N_s − λt)² + 2λ(t − s)(N_s − λt)
= λ(t − s) + (N_s − λs)²,
which implies that:
E((N_t − λt)² − λt | F_s) = (N_s − λs)² − λs.
Proposition 2.3.4. The process exp(N_t ln(b) + (1 − b)λt) is a martingale on the filtered probability space (Ω, F, P), for b > 0.

Proof. Briefly, for t ≥ s, (N_t − N_s) is a Poisson random variable, with parameter λ(t − s), independent of the history F_s. Consequently:
E[e^{N_t ln(b) + (1−b)λt} | F_s] = e^{N_s ln(b) + (1−b)λt} E[e^{(N_t − N_s) ln(b)} | F_s]
= e^{N_s ln(b) + (1−b)λt} E[e^{(N_t − N_s) ln(b)}]
= e^{N_s ln(b) + (1−b)λt} e^{λ(t−s)(e^{ln(b)} − 1)}.
Therefore,
E[e^{N_t ln(b) + (1−b)λt} | F_s] = e^{N_s ln(b) + (1−b)λs}.
2.3.3 Path properties of Poisson processes
Let {N_t}, t ≥ 0, be an F_t-Poisson process of intensity λ on the filtered space (Ω, F, F_t, P). We fix a time horizon 0 < T < ∞.

Theorem 2.3.5. (Quadratic variation of Poisson processes) If (Π_n) is a refining sequence of partitions with ‖Π_n‖ → 0 then, for all t ∈ [0, T],
lim_n Σ_{t_i∈Π_n, t_i≤t} |N_{t_{i+1}∧t} − N_{t_i}|² = N_t.

Theorem 2.3.6. (Total variation of Poisson processes) For a refining sequence (Π_n) of partitions with ‖Π_n‖ → 0,
lim_n Σ_{t_i∈Π_n, t_i≤t} |N_{t_{i+1}∧t} − N_{t_i}| = N_t.
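These two theorems reflect the fact that, on a partition fine enough that each cell contains at most one jump, every unit jump contributes 1² = 1 (respectively 1) to the sum while flat cells contribute 0, so both sums equal N_t exactly. A sketch (the intensity, horizon, grid size and seed are arbitrary choices, not from the thesis):

```python
import random

def poisson_quadratic_variation(lam=5.0, t=1.0, steps=1_000_000, seed=6):
    """Return (sum of squared increments on a fine grid, N_t) for one
    simulated Poisson path; the two should coincide."""
    rng = random.Random(seed)
    arrivals = []
    clock = rng.expovariate(lam)
    while clock <= t:
        arrivals.append(clock)
        clock += rng.expovariate(lam)
    dt = t / steps
    buckets = {}
    for a in arrivals:
        j = min(steps - 1, int(a / dt))      # grid cell containing the jump
        buckets[j] = buckets.get(j, 0) + 1
    qv = sum(k * k for k in buckets.values())  # squared increment per cell
    return qv, len(arrivals)
```

With one million grid cells and a handful of jumps, no cell holds two jumps and the quadratic variation equals N_t.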
Figure 2.5: Sample path of a compound Poisson process of intensity 10 and marks uniformly distributed over [0, 1]
2.4 Lévy processes

Lévy processes are a class of stochastic processes that includes Poisson processes and Wiener processes. Recently, many applications have come to rely on models based on these processes, for example in the fields of finance and communication networks [58].
Let (Ω, F, F t, P) be a filtered probability space.
Definition 2.4.1. An F_t-adapted process {X_t, t ≥ 0}, with X_0 = 0 a.s., is a Lévy process if:

1. (Independent Increments) For 0 ≤ s < t < ∞, X_t − X_s is independent of the history F_s.
2. (Stationary Increments) For 0 ≤ s ≤ t, X_t − X_s has the same distribution as X_{t−s}.
3. (Continuous in Probability) lim_{t→s} P[ω ∈ Ω : |X_t − X_s| ≥ ε] = 0, for all ε > 0.
2.5 Stochastic Integrals and Itô's Formula
Let (Ω, F, P) be a probability space with a finite time horizon T < ∞, a Brownian motion (W_t)_{t∈[0,T]}, and let (F_t)_{t∈[0,T]} be the filtration generated by the Brownian motion and completed by the null sets, that is, F_t = (F_t^W)^P. Without loss of generality, we let F := F_T [9, 11, 12, 36].
2.5.1 Stochastic integral for simple step integrand
Definition 2.5.1. The class of functions f : [0, T] × Ω → R with the properties:
(1) f : (t, ω) ↦ f(t, ω) is B([0, T]) ⊗ F-measurable,
(2) f_t : ω ↦ f(t, ω) is F_t-measurable, for each t,
(3) E[∫_0^T |f(s, ω)|² ds] < ∞,
defines the class M_T²; in other words, any product-measurable, adapted and bounded f : (t, ω) ↦ f(t, ω) is in M_T².
Definition 2.5.2. Let M²_step ⊂ M_T² denote the class of simple step functions f of the form
f(t, ω) = Σ_{i=0}^{m−1} η_i 1_{(t_i, t_{i+1}]}(t)
for some partition Π = {t_0, t_1, ..., t_m} of [0, T], where η_i, the value that f_t takes for t ∈ (t_i, t_{i+1}], belongs to L²(Ω, F_{t_i}, P), ∀ i ∈ {0, 1, ..., m − 1}.
Definition 2.5.3. (Stochastic integral for simple step integrands) For f ∈ M²_step, the integral up to time t ∈ [0, T] is defined as:
I_t(f) ≡ ∫_0^t f_s dW_s := Σ_{i=0}^{m−1} η_i (W_{t_{i+1}∧t} − W_{t_i∧t}).
Theorem 2.5.4. For simple step integrands f, g ∈ M²_step, the stochastic integral verifies the following properties:
(a) Linearity: for a, b ∈ R, af + bg ∈ M²_step and
I_t(af + bg) = a I_t(f) + b I_t(g).
(b) Martingale property: for t ∈ [0, T], I_t(f) is a martingale.
(c) Itô isometry:
E[(I_t(f))²] = E[∫_0^t |f_s|² ds].
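The Itô isometry can be checked by Monte Carlo for a concrete simple step integrand. Taking f = 2 on (0, 1/2] and f = −1 on (1/2, 1] (an arbitrary illustrative choice, not from the thesis), I_1(f) = 2(W_{1/2} − W_0) − (W_1 − W_{1/2}), and the isometry predicts E[(I_1(f))²] = 4·(1/2) + 1·(1/2) = 5/2:

```python
import random
import math

def isometry_estimate(n=200_000, seed=7):
    """Monte Carlo estimate of E[I_1(f)^2] for the step integrand
    f = 2 on (0, 1/2] and f = -1 on (1/2, 1]; theory predicts 2.5."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n):
        dw1 = rng.gauss(0.0, math.sqrt(0.5))  # W_{1/2} - W_0
        dw2 = rng.gauss(0.0, math.sqrt(0.5))  # W_1 - W_{1/2}
        i1 = 2.0 * dw1 - 1.0 * dw2            # I_1(f) for this step function
        acc += i1 * i1
    return acc / n
```

The estimate should be close to 2.5.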
Theorem 2.5.5. For any f ∈ M_T², there is a sequence (f^n)_{n∈N} in M²_step such that f^n converges to f in L²([0, T] × Ω, B([0, T]) ⊗ F, m ⊗ P); that is,
‖f − f^n‖²_{L²([0,T]×Ω)} = E[∫_0^T |f_s − f_s^n|² ds] → 0 for n → ∞.
2.5.2 Stochastic integral for square integrable adapted integrands
Definition 2.5.6. For f ∈ M_T², there is a sequence of f^n ∈ M²_step such that ‖f − f^n‖_{L²([0,T]×Ω)} → 0. Hence (f^n) is a Cauchy sequence in L²([0, T] × Ω, m ⊗ P); therefore (I_t(f^n)) is a Cauchy sequence in L²(Ω, P) by the isometry, and therefore has an L²-limit. We define
I_t(f) ≡ ∫_0^t f_s dW_s := L²-lim_{n→∞} I_t(f^n).
Theorem 2.5.7. For f, g ∈ M_T², the stochastic integral I_t(f) ≡ ∫_0^t f_s dW_s verifies the following properties:
(a) Linearity: for a, b ∈ R, af + bg ∈ M_T² and
I_t(af + bg) = a I_t(f) + b I_t(g).
(b) Martingale property: for t ∈ [0, T], I_t(f) is a martingale.
(c) Itô isometry:
E[(I_t(f))²] = E[∫_0^t |f_s|² ds].
Theorem 2.5.8. (Continuity of the stochastic integral) For f ∈ M_T², one can choose a (unique) version of the stochastic integral I_t(f) such that t ↦ I_t(f), t ∈ [0, T], has continuous paths.
2.5.3 Further extension of the stochastic integral
Definition 2.5.9. The class of functions f : [0, T] × Ω → R with the properties:
(1) f : (t, ω) ↦ f(t, ω) is B([0, T]) ⊗ F-measurable,
(2) f_t : ω ↦ f(t, ω) is F_t-measurable, for each t,
(3) P[∫_0^T |f(s, ω)|² ds < ∞] = 1,
is the class M²_loc.
Definition 2.5.10. (Stochastic integral for integrands from M²_loc) For f ∈ M²_loc, the integral up to time t ∈ [0, T] is defined as:
I_t(f) ≡ ∫_0^t f_s dW_s := a.s.-lim_n ∫_0^t 1_{[0,τ_n(ω)]}(s) f_s dW_s,
where (τ_n) is a localizing sequence of stopping times, for instance τ_n(ω) := inf{t ≥ 0 : ∫_0^t |f(s, ω)|² ds ≥ n}.
2.5.4 Itô formula (with respect to Brownian motion)
Theorem 2.5.11. (Simplest version of Itô's formula) Let F : R → R be twice continuously differentiable, i.e. F ∈ C²(R); then we have a.s.
F(W_t) − F(W_0) = ∫_0^t F'(W_s) dW_s + (1/2) ∫_0^t F''(W_s) ds, t ∈ [0, T],
which can be expressed in differential form by:
dF(W_t) = F'(W_t) dW_t + (1/2) F''(W_t) dt, t ∈ [0, T].
Theorem 2.5.12. (Simplest version of the time-dependent Itô formula) Let F : [0, T] × R → R, (t, x) ↦ F(t, x), be continuously differentiable in t and twice continuously differentiable in x, i.e. F ∈ C^{1,2}([0, T] × R); then we have a.s., for t ∈ [0, T]:
F(t, W_t) = F(0, W_0) + ∫_0^t (∂F/∂s)(s, W_s) ds + ∫_0^t (∂F/∂x)(s, W_s) dW_s + (1/2) ∫_0^t (∂²F/∂x²)(s, W_s) ds,
which can be expressed in differential form by:
dF(t, W_t) = (∂F/∂t)(t, W_t) dt + (∂F/∂x)(t, W_t) dW_t + (1/2) (∂²F/∂x²)(t, W_t) dt.
Example 2.5.13. (Geometric Brownian motion and its SDE) By applying the time-dependent Itô formula we obtain that the process
S_t := s · exp(σW_t + (µ − σ²/2)t),
with s ∈ (0, ∞) and with parameters σ > 0 and µ ∈ R, is a solution to the SDE:
dS_t = µS_t dt + σS_t dW_t
with initial value S_0 = s.
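As a sanity check, sampling this explicit solution gives E[S_t] = s·e^{µt}, since E[exp(σW_t)] = exp(σ²t/2). A Monte Carlo sketch (parameter values, sample size and seed are arbitrary illustrative choices, not from the thesis):

```python
import random
import math

def gbm_mean(s=1.0, mu=0.1, sigma=0.3, t=1.0, n=200_000, seed=8):
    """Sample the explicit GBM solution S_t and estimate its mean;
    theory predicts s * exp(mu * t)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n):
        w = rng.gauss(0.0, math.sqrt(t))  # W_t ~ N(0, t)
        acc += s * math.exp(sigma * w + (mu - 0.5 * sigma * sigma) * t)
    return acc / n
```

With the values above, the estimate should be close to e^{0.1} ≈ 1.105.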
Definition 2.5.14. (Itô process) A process (X_t) is called an Itô process if it can be written as
X_t = x_0 + ∫_0^t α_s ds + ∫_0^t σ_s dW_s, t ∈ [0, T],
for x_0 ∈ R and product-measurable, adapted functions α_s = α(s, ω), σ_s = σ(s, ω) satisfying
P[∫_0^T |α_s| ds < ∞] = 1 and P[∫_0^T |σ_s|² ds < ∞] = 1.
Examples 2.5.15. (a) A Brownian motion W is an Itô process.
(b) If F ∈ C^{1,2} then X_t := F(t, W_t) is an Itô process, by Itô's formula.
(c) An absolutely continuous process of the form X_t = X_0 + ∫_0^t α(s, ω) ds is an Itô process.
(d) Every adapted process whose paths are continuously differentiable in t is an Itô process, since X_t(ω) = X_0(ω) + ∫_0^t (∂/∂s)X(s, ω) ds. For instance, X_t = t is an Itô process.
Definition 2.5.16. (Stochastic integral with respect to an Itô process) We define the stochastic integral with respect to an Itô process X as
∫_0^t f_s dX_s := ∫_0^t f_s α_s ds + ∫_0^t f_s σ_s dW_s
for suitable integrand functions f_s = f(s, ω) which are such that the integrals on the right-hand side are well defined; that means fα ∈ L¹([0, T], ds) a.s. and fσ ∈ M²_loc.
2.5.5 Itô's formula (with respect to Itô processes)
Theorem 2.5.17. (Time-dependent Itô formula with respect to Itô processes, one-dimensional) For a function F ∈ C^{1,2}([0, T] × R) and an Itô process X it holds that
dF(t, X_t) = (∂F/∂t)(t, X_t) dt + (∂F/∂x)(t, X_t) dX_t + (1/2) (∂²F/∂x²)(t, X_t) σ_t² dt, t ∈ [0, T].
Theorem 2.5.18. (Multi-dimensional Itô formula with respect to two Itô processes) Let X and Y be two Itô processes, such that:
dX_t = α_t dt + σ_t dW_t,
dY_t = α̃_t dt + σ̃_t dW_t.
For a function F(x, y) ∈ C²(R × R) we have, for t ∈ [0, T],
dF(X_t, Y_t) = (∂F/∂x)(X_t, Y_t) dX_t + (∂F/∂y)(X_t, Y_t) dY_t + (1/2)(∂²F/∂x²)(X_t, Y_t)(dX_t)² + (1/2)(∂²F/∂y²)(X_t, Y_t)(dY_t)² + (∂²F/∂x∂y)(X_t, Y_t)(dX_t dY_t).
Example 2.5.19. (Product rule)
d(X_t Y_t) = Y_t dX_t + X_t dY_t + dX_t dY_t = (Y_t α_t + X_t α̃_t + σ_t σ̃_t) dt + (Y_t σ_t + X_t σ̃_t) dW_t.
2.6 Stochastic differential equations
The inclusion of random effects in differential equations leads to stochastic differential equations, whose solutions have non-differentiable sample paths when the differential equation is forced by an irregular stochastic process such as Gaussian white noise [12, 13, 14].
Example 2.6.1. Consider the molecular bombardment of a speck of dust on a water surface, which results in Brownian motion. Taking X_t as one of the components of the velocity of the particle, Langevin wrote the equation
dX_t/dt = −aX_t + bB_t
for the acceleration of the particle. This equation represents the sum of a retarding frictional force depending on the velocity and the molecular forces represented by a white noise process B_t, with intensity b, which is independent of the velocity. The Langevin equation is written symbolically as the stochastic differential equation
dX_t = −aX_t dt + b dW_t,
where dW_t = B_t dt.
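The Langevin SDE can be integrated numerically with the Euler-Maruyama scheme X_{k+1} = X_k − aX_k Δt + b ΔW_k; for large t the solution settles into its stationary N(0, b²/(2a)) distribution. A Python sketch (not part of the thesis; the parameters, step size and sample size are arbitrary choices):

```python
import random
import math

def euler_langevin(a=1.0, b=0.5, x0=0.0, t=5.0, steps=250, rng=None):
    """One Euler-Maruyama path of dX = -a*X dt + b dW, returning X_t."""
    rng = rng or random.Random()
    dt = t / steps
    sd = math.sqrt(dt)
    x = x0
    for _ in range(steps):
        x += -a * x * dt + b * rng.gauss(0.0, sd)  # drift + diffusion step
    return x

def stationary_variance(n=5_000, seed=9):
    """Sample variance of X_t over many paths; for a=1, b=0.5 the
    stationary variance b^2/(2a) is 0.125."""
    rng = random.Random(seed)
    xs = [euler_langevin(rng=rng) for _ in range(n)]
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / n
```

The estimated variance should be close to b²/(2a) = 0.125, up to Monte Carlo noise and a small discretisation bias.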
It must be emphasized that a stochastic differential equation does not have a well-defined mathematical meaning on its own; it should always be interpreted as an integral equation with Itô or Stratonovich stochastic integrals. For example, the previous SDE corresponding to the Langevin equation is represented by the following stochastic integral equation:
X_t = X_0 − a ∫_0^t X_s ds + b ∫_0^t dW_s,
where the second integral is an Itô stochastic integral.
In general, the solution of an SDE inherits the non-differentiability of its sample paths from the Wiener processes in the stochastic integrals.
If we apply a functional F to a given Itô process, say a Brownian motion (W_t), we can use the Itô formula as a chain rule to obtain an SDE that is satisfied by F(t, W_t). If, on the other hand, we start from a given SDE, we want to find the process that solves it.
Example 2.6.2. The geometric Brownian motion solves the SDE:
dY_t = Y_t(µ dt + σ dW_t),
and we know the solution Y explicitly from Example 2.5.13.
A more systematic method than guessing the explicit solution in advance (and verifying it by the Itô formula) is to start by making an ansatz for the parametric form of the solution and, after an application of the Itô formula, to match the coefficients of dt and dW_t. For the above SDE, for instance, the ansatz X_t = F(t, W_t) leads to the solution.
Example 2.6.3. (The Ornstein-Uhlenbeck process) Consider the SDE:
dY_t = −αY_t dt + σ dW_t
with Y_0 = y_0, α, σ ∈ (0, ∞) and y_0 ∈ R. For a and b differentiable functions of time only, with a > 0 and a(0) = 1, let the process X be given by
X_t = a(t)(x_0 + ∫_0^t b(s) dW_s).
If we want to choose the parametric form of X such that X is a solution to the SDE for Y, it is immediate to set x_0 := y_0. By the Itô formula we can compute dX_t. Matching the coefficients with those of the SDE for Y, we arrive at the equations a(t)b(t) = σ and a'(t)/a(t) = −α. Solving for the functions a and b yields
a(t) = exp(−αt) and b(t) = σ exp(αt).
Substituting these into the general definition of X, the Itô formula shows that X with these functions a and b indeed solves the SDE for Y. It is called the Ornstein-Uhlenbeck process.

Chapter 3
Numerical Methods for SDEs
The analytical study of SDEs turns out to be very complicated in most situations, and few stochastic differential equations are known to have an explicit solution. For this reason, numerical techniques play a very important role in finding approximate solutions to these equations in many fields of application, for example finance, performance modelling and biological networks.
3.1 Examples of explicitly solvable SDEs
Below are listed some SDEs and their explicit solutions; these will later be used for the validation of the numerical methods.
Additive noise
Constant coefficients: homogeneous case
dX_t = (aX_t + b) dt + c dW_t
X_t = e^{at} (X_0 + (b/a)(1 − e^{−at}) + c ∫_0^t e^{−as} dW_s)
Variable coefficients:
dX_t = (a(t)X_t + b(t)) dt + c(t) dW_t
X_t = Φ_t (X_0 + ∫_0^t Φ_s^{−1} b(s) ds + ∫_0^t Φ_s^{−1} c(s) dW_s)
with fundamental solution
Φ_t = exp(∫_0^t a(s) ds).
Multiplicative noise

Constant coefficients: homogeneous case
dX_t = aX_t dt + bX_t dW_t
X_t = X_0 exp((a − b²/2)t + bW_t)
Constant coefficients: inhomogeneous case
dX_t = (aX_t + c) dt + (bX_t + d) dW_t
X_t = Φ_t (X_0 + (c − bd) ∫_0^t Φ_s^{−1} ds + d ∫_0^t Φ_s^{−1} dW_s)
with fundamental solution
Φ_t = exp((a − b²/2)t + bW_t)
Variable coefficients: homogeneous case
dX_t = a(t)X_t dt + b(t)X_t dW_t
X_t = X_0 exp(∫_0^t (a(s) − b²(s)/2) ds + ∫_0^t b(s) dW_s)
Variable coefficients: inhomogeneous case
dX_t = (a(t)X_t + c(t)) dt + (b(t)X_t + d(t)) dW_t
X_t = Φ_{t,0} (X_0 + ∫_0^t Φ_{s,0}^{−1} (c(s) − b(s)d(s)) ds + ∫_0^t Φ_{s,0}^{−1} d(s) dW_s)
with fundamental solution
Φ_{t,0} = exp(∫_0^t (a(s) − b²(s)/2) ds + ∫_0^t b(s) dW_s)
(For more details, please refer to [14].)
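These explicit solutions support exactly the kind of validation just described: integrate the SDE with the Euler-Maruyama scheme along a simulated Brownian path and compare the endpoint with the explicit solution driven by the same path. The sketch below does this for the homogeneous multiplicative-noise case (all parameters, step counts and the sample size are arbitrary choices, not from the thesis); the mean absolute endpoint error should decrease as the step count grows:

```python
import random
import math

def strong_error(a=0.05, b=0.2, x0=1.0, t=1.0, steps=64, n=2_000, seed=10):
    """Mean absolute difference at time t between the Euler-Maruyama
    approximation of dX = a*X dt + b*X dW and the explicit solution
    X_t = x0 * exp((a - b^2/2)*t + b*W_t), on the same Brownian path."""
    rng = random.Random(seed)
    dt = t / steps
    sd = math.sqrt(dt)
    err = 0.0
    for _ in range(n):
        x, w = x0, 0.0
        for _ in range(steps):
            dw = rng.gauss(0.0, sd)
            x += a * x * dt + b * x * dw   # Euler-Maruyama step
            w += dw                        # accumulate the Brownian path
        exact = x0 * math.exp((a - 0.5 * b * b) * t + b * w)
        err += abs(x - exact)
    return err / n
```

Running with a coarse and a fine grid, e.g. `strong_error(steps=16)` versus `strong_error(steps=256)`, the error for the finer grid is markedly smaller, consistent with the strong order 1/2 of the Euler-Maruyama scheme.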
3.2 Existence and uniqueness of strong solutions
When an explicit solution of the SDE is not available, it is important to know whether a solution exists and whether it is unique for a given initial value X_0, before proceeding with the numerical simulation. In this section, we discuss briefly the existence and uniqueness of solutions of stochastic differential equations [53, 54, 46]. We consider the case of one-dimensional stochastic differential equations (the same study can be generalised to the multi-dimensional case):
dX_t = a(t, X_t) dt + b(t, X_t) dW_t  (3.1)
where a(t, x), b(t, x) are two Borel-measurable functions defined from [0, ∞) × R into R, W = {W_t; t ∈ [0, ∞)} is a one-dimensional Brownian motion on the filtered probability space (Ω, F, F_t, P), t ≥ 0, and X = {X_t; t ∈ [0, ∞)} is a continuous stochastic process that satisfies the SDE above with an initial value X_0 which is F_0-measurable.
3.2.1 Solutions for ODEs
When the function b(t, x) is equal to zero for all values of x and t, the SDE reduces to an ordinary differential equation:
dX_t = a(t, X_t) dt,
which can be written as:
X_t = X_0 + ∫_0^t a(s, X_s) ds.  (3.2)
The existence and uniqueness theorem ensures that we have a unique solution X(t, X_0) by requiring, as a sufficient condition, that the function a(t, x) verifies the Lipschitz condition:
|a(t, x) − a(t, y)| ≤ K|x − y|
for all x, y ∈ R, where K is a positive constant. The proof goes by considering the Picard-Lindelöf iterations:
X_t^{(0)} = X_0,
X_t^{(n+1)} = X_0 + ∫_0^t a(s, X_s^{(n)}) ds, n ≥ 0,
and showing that they converge to the continuous solution of equation (3.2) and that it is unique.
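The Picard-Lindelöf iterations can be carried out numerically on a grid. The sketch below (not from the thesis; the ODE dx/dt = x with x(0) = 1, the grid and the iteration count are arbitrary illustrative choices) approximates each integral with the trapezoidal rule; the iterates converge towards the solution x(t) = e^t:

```python
import math

def picard_iterates(t_max=1.0, steps=1_000, iters=20):
    """Picard-Lindelof iterations for dx/dt = x, x(0) = 1, on a grid.
    Each pass replaces the iterate by 1 + integral of the previous one
    (trapezoidal rule); returns the approximation of x(t_max)."""
    dt = t_max / steps
    x = [1.0] * (steps + 1)                       # X^{(0)} = X_0 = 1
    for _ in range(iters):
        new = [1.0] * (steps + 1)
        acc = 0.0
        for k in range(1, steps + 1):
            acc += 0.5 * (x[k - 1] + x[k]) * dt   # trapezoidal quadrature
            new[k] = 1.0 + acc                    # X^{(n+1)}_t = 1 + int a
        x = new
    return x[-1]                                  # approximation of x(1) = e
```

After 20 iterations on [0, 1] the result agrees with e up to the quadrature error of the grid.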
To ensure the global existence of the solution (for all t > 0), the growth bound condition on a(t, x) must be satisfied:
|a(t, x)|² ≤ L(1 + |x|²)
for all x ∈ R, where L is a positive constant. Without this additional condition, the solution might become unbounded after a small time t, as for example the solution x(t) = x_0/(1 − 3t·x_0³)^{1/3} of the ODE dx/dt = x⁴, which explodes at t = 1/(3x_0³).
Although these conditions are sufficient for the existence and uniqueness of a solution, they are not necessary. In their absence, however, we may be led into differential equations that turn out not to be solvable, or that turn out to have a continuum of solutions [59].
A similar theory was developed for SDEs, where existence and uniqueness theorems based on Lipschitz-type conditions proved as successful as in the ODE case.
3.2.2 Solutions for SDEs
Theorem 3.2.1. Consider the stochastic differential equation (3.1). If the functions a(t, x) and b(t, x) are locally Lipschitz-continuous in the space variable x, then strong uniqueness holds for this SDE.
Before working out the proof of Theorem 3.2.1, it is useful to state the definition of locally Lipschitz-continuous functions and Gronwall's inequality, because of the important role they play in this proof.
Definition 3.2.2. (Local Lipschitz condition) A function f : [0, ∞) × R → R is locally Lipschitz-continuous with respect to its second argument if for every integer n ≥ 1 there exists a real constant K_n > 0 such that, for every t ≥ 0, |x| ≤ n and |y| ≤ n:
|f(t, x) − f(t, y)| ≤ K_n |x − y|.
Lemma 3.2.3. (Gronwall's inequality) Assume that the continuous functions u : [0, T] → [0, ∞) and v : [0, T] → R satisfy:
u(t) ≤ v(t) + K ∫_0^t u(s) ds, ∀t ∈ [0, T],
where K ≥ 0. Then the Gronwall inequality is:
u(t) ≤ v(t) + K ∫_0^t v(s) e^{K(t−s)} ds, ∀t ∈ [0, T].
Proof of Theorem 3.2.1. We suppose that on the probability space (Ω, F, P) there exist two strong solutions X and X' of the SDE (3.1) with respect to the same Brownian motion W, verifying the same initial condition X_0 and with almost surely continuous sample paths.

Let τ_N be the stopping time defined by τ_N(ω) = inf{t ≥ 0 : |X_t(ω)| ≥ N}, for N ≥ 1, and τ'_N another stopping time defined the same way but relative to X'. We define S_N := τ_N ∧ τ'_N. For t ∈ [0, T] we have:
X_{t∧S_N} − X'_{t∧S_N} = ∫_0^{t∧S_N} (a(s, X_s) − a(s, X'_s)) ds + ∫_0^{t∧S_N} (b(s, X_s) − b(s, X'_s)) dW_s,
E[|X_{t∧S_N} − X'_{t∧S_N}|²] = E[| ∫_0^{t∧S_N} (a(s, X_s) − a(s, X'_s)) ds + ∫_0^{t∧S_N} (b(s, X_s) − b(s, X'_s)) dW_s |²].
Using the inequality (v + u)² ≤ 2(v² + u²) we obtain:
E[|X_{t∧S_N} − X'_{t∧S_N}|²] ≤ 2E| ∫_0^{t∧S_N} (a(s, X_s) − a(s, X'_s)) ds |² + 2E| ∫_0^{t∧S_N} (b(s, X_s) − b(s, X'_s)) dW_s |².
Applying the Cauchy-Schwarz inequality to the first term and the Itô isometry to the second term we get:
E[|X_{t∧S_N} − X'_{t∧S_N}|²] ≤ 2tE ∫_0^{t∧S_N} |a(s, X_s) − a(s, X'_s)|² ds + 2E ∫_0^{t∧S_N} |b(s, X_s) − b(s, X'_s)|² ds.
The local Lipschitz condition implies that:
E[|X_{t∧S_N} − X'_{t∧S_N}|²] ≤ 2(T + 1)K_N² ∫_0^t E|X_{s∧S_N} − X'_{s∧S_N}|² ds.
Finally, by applying Gronwall's inequality with u(t) = E[|X_{t∧S_N} − X'_{t∧S_N}|²] and v(t) = 0, we obtain that E[|X_{t∧S_N} − X'_{t∧S_N}|²] = 0. We then deduce that {X_{t∧S_N}; 0 ≤ t < ∞} and {X'_{t∧S_N}; 0 ≤ t < ∞} are modifications of one another and thus are indistinguishable. By taking the limit as N → ∞, we deduce that this conclusion is also valid for {X_t; 0 ≤ t < ∞} and {X'_t; 0 ≤ t < ∞}.
Even though Theorem 3.2.1 ensures the uniqueness of a strong solution, it does not guarantee its global existence. For this reason, a condition stronger than local Lipschitz continuity will be required to establish existence.
Theorem 3.2.4. Consider the stochastic differential equation (3.1). If the assumptions below are satisfied:
i. The functions a(t, x) and b(t, x) are jointly (L²-)measurable in (t, x) ∈ [0, T] × R.
ii. The functions a(t, x) and b(t, x) are globally Lipschitz-continuous in the space variable x: ∃K > 0 such that
|a(t, x) − a(t, y)| ≤ K|x − y|,
|b(t, x) − b(t, y)| ≤ K|x − y|,
for all t ∈ [0, T] and x, y ∈ R.
iii. Linear growth bound conditions on the functions a(t, x) and b(t, x): ∃K > 0 such that
|a(t, x)|² ≤ K²(1 + |x|²),
|b(t, x)|² ≤ K²(1 + |x|²),
for all t ∈ [0, T] and x ∈ R.
iv. X_0 is F_0-measurable with E(|X_0|²) < ∞,
then the SDE has a pathwise unique strong solution X_t on [0, T] with