
MATH 425 PART V: INFORMAL INTRODUCTION TO STOCHASTIC CALCULUS

G. BERKOLAIKO

1. Motivation: how to model price paths

Differential equations have long been an efficient tool for modeling the evolution of various mechanical systems. At the start of the 20th century, however, science started to tackle the modeling of systems driven by probabilistic effects, such as the stock market (the work of Bachelier) or the motion of small particles suspended in liquid (the work of Einstein and Smoluchowski). For us, the motivating questions are: Is there an equation describing the evolution of a stock price? And, perhaps more applicably, how can we numerically simulate a seemingly random path taken by the stock price?

Example 1.1. We have seen in previous sections that we can model the distribution of stock prices at time $t$ as
\[ S_t = S_0 \exp\left( (\mu - \sigma^2/2)\, t + \sigma \sqrt{t}\, Y \right), \tag{1.1} \]
where $Y$ is a standard normal random variable. So perhaps, for every time $t$ we can generate a standard normal variable, substitute it into equation (1.1) and plot the resulting $S_t$ as a function of $t$? The result is shown in Figure 1 and it definitely does not look like a price path of a stock. The underlying reason is that we used equation (1.1) to simulate prices at different time points independently. However, if $t_1$ is close to $t_2$, the prices cannot be completely independent. It is the change in price that we assumed (in the “Efficient Market Hypothesis”) to be independent of the previous price changes.

2. Wiener process

Definition 2.1. Wiener process (or Brownian motion) is a random function of $t$, denoted $W_t = W_t(\omega)$ (where $\omega$ is the underlying random event), such that

(a) $W_t$ is a.s. continuous in $t$,

(b) for every $t$, $W_t \sim N(0, t)$ (in particular $W_0 = 0$),

(c) increments $W_{t_2} - W_{t_1}$ corresponding to disjoint intervals are independent and have normal distribution,
\[ W_{t+\Delta t} - W_t \sim N(0, \Delta t). \]

Wiener process can now be used to simulate the prices in the Asset Price Model via the formula
\[ S_t = S_0 e^{(\mu - \sigma^2/2)t + \sigma W_t}. \tag{2.1} \]
Note that there is no $\sqrt{t}$ multiplying $W_t$ (in contrast to equation (1.1)) as the process $W_t$ already includes the correct scaling: $W_t \sim N(0, t) \sim \sqrt{t}\, N(0, 1)$.

How to simulate a Wiener process? One way is to choose $\Delta t$ and generate an i.i.d. sequence $\{z_j\}_{j=1}^{N}$ of $N(0, \Delta t)$ variables. We then let
\[ t_0 = 0, \quad t_1 = \Delta t, \quad t_2 = 2\Delta t, \quad \ldots, \]


Figure 1. A wrong simulation of the price path from Example 1.1.

and the corresponding Wt are

w0 = 0, w1 = z1, w2 = z1 + z2, w3 = z1 + z2 + z3,... (2.2)
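The construction (2.2) and the price formula (2.1) can be sketched in a few lines of Python. This is only an illustration: the function names, the seed, the mesh size and the starting price $S_0 = 100$ are assumptions made for the example, while $\mu = 0.08$ and $\sigma = 0.1$ are the values quoted in the caption of Figure 2.

```python
import math
import random

def wiener_path(n_steps, dt, rng):
    """Return [w_0, ..., w_N] built from i.i.d. N(0, dt) increments, as in (2.2)."""
    w = [0.0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, math.sqrt(dt))  # increment z_j ~ N(0, dt)
        w.append(w[-1] + z)
    return w

def price_path(w, dt, s0, mu, sigma):
    """Map a Wiener path on the mesh t_j = j*dt to asset prices via (2.1)."""
    return [s0 * math.exp((mu - 0.5 * sigma**2) * (j * dt) + sigma * wj)
            for j, wj in enumerate(w)]

rng = random.Random(425)      # fixed seed (an assumption), for reproducibility only
dt, n_steps = 0.01, 1000      # hypothetical uniform mesh t_j = j*dt
w = wiener_path(n_steps, dt, rng)
s = price_path(w, dt, s0=100.0, mu=0.08, sigma=0.1)
print(w[-1], s[-1])
```

Note that `random.gauss` takes the standard deviation, not the variance, so an $N(0, \Delta t)$ increment is requested with `math.sqrt(dt)`.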

An example path is shown in Fig. 2. If needed, the mesh can be chosen non-uniform with a corresponding adjustment to the distribution of the increments $z_j$. A slight drawback is that increasing the mesh density on an already generated path $W_t$ (for example, adding 10 more points between the pair $w_1$ and $w_2$) requires a slightly different object, the Brownian bridge, because the filled-in values need to connect the already generated endpoints (such as $w_1$ to $w_2$) correctly.

Another way is to use the Paley–Wiener representation of $W_t$ as a Fourier series with random coefficients,
\[ W_t(\omega) = \frac{z_0(\omega)}{\sqrt{2\pi}}\, t + \frac{2}{\sqrt{\pi}} \sum_{n=1}^{\infty} z_n(\omega)\, \frac{\sin(nt/2)}{n}, \tag{2.3} \]
where $\{z_j\}_{j=0}^{\infty}$ are i.i.d. standard normal variables. On the practical level, one generates a large number of coefficients $z_j$ and truncates the sum. To “zoom in” one can add more terms to the sum. An example path (plotted using different numbers of terms) is shown in Figure 3.

Properties of the Wiener process include “scale invariance”: stretching it by a factor of $F$ in the $x$-direction and by a factor of $\sqrt{F}$ in the $y$-direction results in a similar looking function, namely another realization of the same Wiener process. To compare, doing the same stretching to a differentiable function $y = f(x)$ will result in a flat line if $F$ is taken large. From this observation we conclude that $W_t$ is nowhere differentiable (almost surely).
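The truncation of the Paley–Wiener series (2.3) can be sketched as follows. The function name, the seed and the evaluation mesh are assumptions made for illustration, and $t$ is kept in $[0, 2\pi]$, the interval on which this series has the variance $t$ required of $W_t$. Reusing the same coefficients with more terms corresponds to the “zooming in” described above.

```python
import math
import random

def paley_wiener(t, z):
    """Truncated series (2.3); z = [z_0, z_1, ..., z_M], for 0 <= t <= 2*pi."""
    linear = z[0] * t / math.sqrt(2.0 * math.pi)
    series = sum(z[n] * math.sin(n * t / 2.0) / n for n in range(1, len(z)))
    return linear + 2.0 / math.sqrt(math.pi) * series

rng = random.Random(7)                            # fixed seed, an assumption
z = [rng.gauss(0.0, 1.0) for _ in range(1001)]    # z_0 plus 1000 coefficients
ts = [2.0 * math.pi * k / 200 for k in range(201)]
coarse = [paley_wiener(t, z[:11]) for t in ts]    # z_0 plus 10 coefficients
fine = [paley_wiener(t, z) for t in ts]           # same path, more detail
print(coarse[100], fine[100])
```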


Figure 2. Top: a sample path of the Wiener process simulated via increments, equation (2.2). Bottom: the corresponding path of the asset price, equation (2.1), with µ = 0.08 and σ = 0.1 over 5 year period.

Let us now fix an interval $[0, t]$ and divide it into a large number $N$ of parts of equal length $\delta t = t/N$. Let us calculate the limit, as $N \to \infty$, of the sum of squared increments of $W_t$,
\[ \lim_{\delta t \to 0} \sum_{j=1}^{N} (\delta W_{t_j})^2 = \lim_{\delta t \to 0} \sum_{j=1}^{N} \left( \sqrt{\delta t}\, Z_j \right)^2, \tag{2.4} \]
where $\delta W_t := W_{t+\delta t} - W_t$ and $Z_j \sim N(0, 1)$ are i.i.d. standard normal. The left-hand side is reminiscent of the Riemann sum in the definition of the integral, so we declare it to be the stochastic integral
\[ \int_0^t (dW_s)^2 := \lim_{\delta t \to 0} \sum_{j=1}^{N} (\delta W_{t_j})^2 = \lim_{N \to \infty} \frac{t}{N} \sum_{j=1}^{N} Z_j^2 = t\, \mathbb{E} Z^2 = t, \tag{2.5} \]
where we used the Law of Large Numbers to evaluate the limit. Thus we have
\[ \int_0^t (dW_s)^2 = t \qquad \text{or} \qquad (dW_s)^2 = dt. \tag{2.6} \]
Note that the derivation of the last identity has been done extremely informally, and assigning a self-consistent mathematical sense to the objects described takes a lot more work. We will take the identity $(dW_t)^2 = dt$ as a given and will apply it to things we know, such as Taylor expansion and differential equations. A numerical demonstration of multiple Wiener process paths and their quadratic variation is shown in Figure 4.
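The limit in (2.5) is easy to probe numerically. The sketch below, in which the function name, the seed and the choice $t = 1$ are assumptions, evaluates the sum of squared increments for increasingly fine partitions; the printed values should settle near $t$ as $N$ grows.

```python
import math
import random

def sum_squared_increments(t, n, rng):
    """Sum of (delta W)^2 over n equal steps of [0, t], with delta W = sqrt(dt) * Z_j."""
    dt = t / n
    return sum((math.sqrt(dt) * rng.gauss(0.0, 1.0)) ** 2 for _ in range(n))

rng = random.Random(2024)     # fixed seed, an assumption
t = 1.0
for n in (10, 1000, 100000):
    print(n, sum_squared_increments(t, n, rng))
```

For finite $N$ the sum is still random; its standard deviation is $t\sqrt{2/N}$, which is why the lines in the right panel of Figure 4 are only almost straight.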

Figure 3. A sample path of the Wiener process simulated via Paley–Wiener representation, equation (2.3), with different number of coefficients used in the series (10, 50 and 1000 coefficients).

3. Ito formula and examples of SDEs

In what follows we will apply the following rules and notation of “stochastic calculus”: as $\delta t$ becomes “infinitesimal” $dt$,
\[ W_{t+\delta t} - W_t = \delta W_t \to dW_t, \qquad (W_{t+\delta t} - W_t)^2 = (\delta W_t)^2 \to dt. \]
The terms of higher order in $dt$ will be neglected, so we will set
\[ dt\, dW_t = 0, \qquad (dt)^2 = 0. \]
Consider a smooth function $f(t, W)$ of two variables. We substitute the Wiener process for the variable $W$, obtaining $f(t, W_t)$, which is a function of $t$ only (no longer smooth). We want to calculate its “stochastic differential” $df(t, W_t)$ with respect to $t$. It is defined as the limit of the increment
\[ \delta f := f(t + \delta t, W_{t+\delta t}) - f(t, W_t), \]
as $\delta t \to dt$. Using Taylor expansion in two variables,
\[ f(t + \delta t, W_{t+\delta t}) - f(t, W_t) = \frac{\partial f}{\partial t}\,\delta t + \frac{\partial f}{\partial W}\,\delta W + \frac{1}{2}\left( \frac{\partial^2 f}{\partial t^2}(\delta t)^2 + 2\frac{\partial^2 f}{\partial t\,\partial W}\,\delta W\,\delta t + \frac{\partial^2 f}{\partial W^2}(\delta W)^2 \right) + \text{higher order terms}, \]
and remembering that we set $dt\, dW_t = 0$, $(dt)^2 = 0$ and $(dW_t)^2 = dt$, we take the limit $\delta t \to dt$ and arrive at
\[ df(t, W_t) = \left( \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial W^2} \right) dt + \frac{\partial f}{\partial W}\, dW_t. \tag{3.1} \]
This equation is known as Ito formula.

Figure 4. Simulation of 50 Wiener process paths over the interval $[0, 1]$ with step size $\delta t = 1/N = 10^{-4}$. Left: the sample paths; note the characteristic square root growth of the width of the bunch of paths. Middle: sample mean as a function of $t$ and sample standard deviation as a function of $t$ (against the theoretical values $0$ and $\sqrt{t}$, respectively). Right: the sum $\sum_{j=1}^{tN} (\delta W_{t_j})^2$ as a function of $t$; the (almost) straight lines correspond to our result (2.6).

Remark 3.1. In order to give a proper mathematical sense to the above differentials, they should be placed inside an integral. Otherwise the “limit” $\delta t \to dt$ is not really defined. Rigorous proofs are important to ensure one does not do meaningless things, but such proofs would be too lengthy for this course. Instead we will continue manipulating these differentials formally.¹ The situation here is somewhat analogous to the separation of variables method used in ODEs: there the derivative $\frac{dy}{dx}$ is split into $dy$ and $dx$, which are moved around independently (and only acquire meaning when integrated).

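Example 3.1. We will calculate the differential $d(W_t^2)$. Here the function $f(t, W)$ and its derivatives are
\[ f(t, W) = W^2, \qquad \frac{\partial f}{\partial t} = 0, \qquad \frac{\partial f}{\partial W} = 2W, \qquad \frac{\partial^2 f}{\partial W^2} = 2. \]
Therefore, by Ito formula (3.1),
\[ d(W_t^2) = dt + 2W_t\, dW_t. \]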
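The result of Example 3.1 can be checked numerically in its summed form, $W_T^2 \approx T + 2\sum_j W_{t_j}\,\delta W_{t_j}$, where the integrand is evaluated at the left endpoint of each step. The following is a rough sketch only; the seed, the horizon $T = 1$ and the number of steps are assumptions.

```python
import math
import random

rng = random.Random(31)       # fixed seed, an assumption
T, n = 1.0, 100000
dt = T / n
w = 0.0
ito_sum = 0.0                 # accumulates the sum of W_{t_j} * delta W_{t_j}
for _ in range(n):
    dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
    ito_sum += w * dw         # integrand taken at the left endpoint of the step
    w += dw
lhs = w * w                   # W_T^2
rhs = T + 2.0 * ito_sum       # summed form of dt + 2 W_t dW_t
print(lhs, rhs)
```

The two printed numbers differ exactly by the sum of $(\delta W_{t_j})^2$ minus $T$, which is small by (2.6). Dropping the $dt$ term, as ordinary calculus would suggest, leaves a discrepancy of about $T$.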

¹ In mathematics, the word “formal” is used in two somewhat contradictory meanings. A “formal proof” is a proof executed in full mathematical rigor, whereas a “formal manipulation” refers to manipulating mathematical objects without checking if their underlying meaning remains valid throughout the manipulation.

Example 3.2. Our continuous-time asset price model is

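\[ S_t = S_0 e^{(\mu - \sigma^2/2)t + \sigma W_t}. \]

We would like to calculate the differential of $S_t$. We represent it as $S_t = S(t, W_t)$, where $S(t, W) = S_0 e^{(\mu - \sigma^2/2)t + \sigma W}$. To use Ito formula we need the partial derivatives
\[ \frac{\partial S}{\partial t} = \left( \mu - \frac{\sigma^2}{2} \right) S, \qquad \frac{\partial S}{\partial W} = \sigma S, \qquad \frac{\partial^2 S}{\partial W^2} = \sigma^2 S. \]
From (3.1) we get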

\[ dS_t = dS(t, W_t) = \mu S_t\, dt + \sigma S_t\, dW_t. \tag{3.2} \]
This stochastic differential equation is known as the geometric Brownian motion.

Remark 3.2. How to understand a stochastic differential equation (SDE) such as (3.2) or, more generally,
\[ dX_t = a(t, X_t)\, dt + b(t, X_t)\, dW_t\, ? \tag{3.3} \]
What is the use for it? One use is to model sample paths on a computer (so-called Monte-Carlo simulation), and the simplest method for that is to convert the SDE into finite differences. Namely, we discretize time $0 = t_0 < t_1 < t_2 < \ldots$ and replace (3.3) with
\[ X_{t_{n+1}} - X_{t_n} \approx a(t_n, X_{t_n})\, \delta t_n + b(t_n, X_{t_n})\, \delta W_{t_n}, \tag{3.4} \]
and
\[ \delta t_n = t_{n+1} - t_n, \qquad \delta W_{t_n} = W_{t_{n+1}} - W_{t_n} = \sqrt{t_{n+1} - t_n}\, Z_n, \tag{3.5} \]
where $Z_n \sim N(0, 1)$ are independent standard normal in accordance with Definition 2.1. We can now express $X_{t_{n+1}}$ as a function of $X_{t_n}$ and other known quantities and thus use (3.4) as an iterative scheme starting from a known starting point $X_{t_0} = X_0$. Every run of the iterative scheme would generate a different random solution known as a sample path.

To summarize, SDEs may be manipulated easily using Ito formula and its variants we are about to derive. On rare occasions a closed form solution can be found, but usually one simulates the obtained equation on a computer to get some information such as the distribution of the sample path values $X_t$ at some given future time $t$.

Suppose now a smooth function $f = f(t, g)$ depends on $t$ and $g$, and the latter depends in turn on $t$ and the underlying Wiener process $W_t$, i.e. $g = g(t, W)$. We would like to establish a chain rule to calculate the stochastic differential of $f(t, g(t, W_t))$ as a function of $t$. By Taylor expansion of $f$,
\[ df(t, g_t) = \frac{\partial f}{\partial t}\, dt + \frac{\partial f}{\partial g}\, dg_t + \frac{1}{2}\frac{\partial^2 f}{\partial g^2}\, (dg_t)^2. \tag{3.6} \]

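For $dg_t$ we have the Ito formula
\[ dg_t = \left( \frac{\partial g}{\partial t} + \frac{1}{2}\frac{\partial^2 g}{\partial W^2} \right) dt + \frac{\partial g}{\partial W}\, dW_t. \]

Squaring $dg_t$ and remembering our “stochastic calculus rules”
\[ (dW_t)^2 = dt, \qquad dW_t\, dt = 0, \qquad dt^2 = 0, \]
we get
\[ (dg_t)^2 = \left( \frac{\partial g}{\partial W} \right)^2 dt. \]

We finally obtain Ito Chain Rule
\begin{align*}
df(t, g(t, W_t)) &= \left( \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial g^2}\left( \frac{\partial g}{\partial W} \right)^2 \right) dt + \frac{\partial f}{\partial g}\, dg_t \tag{3.7} \\
&= \left( \frac{\partial f}{\partial t} + \frac{\partial f}{\partial g}\frac{\partial g}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial g^2}\left( \frac{\partial g}{\partial W} \right)^2 + \frac{1}{2}\frac{\partial f}{\partial g}\frac{\partial^2 g}{\partial W^2} \right) dt + \frac{\partial f}{\partial g}\frac{\partial g}{\partial W}\, dW_t. \tag{3.8}
\end{align*}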
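As a quick consistency check, the chain rule (3.7) can be used to recover the geometric Brownian motion equation (3.2) from the price model of Example 3.2. The particular split into $f$ and $g$ below is a choice made for this illustration.

```latex
% Illustrative split: f(t, g) = S_0 e^{g},  g(t, W) = (\mu - \sigma^2/2) t + \sigma W.
\begin{align*}
\frac{\partial f}{\partial t} &= 0, \qquad
\frac{\partial f}{\partial g} = f, \qquad
\frac{\partial^2 f}{\partial g^2} = f, \qquad
\frac{\partial g}{\partial W} = \sigma, \qquad
dg_t = \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma\, dW_t, \\
df(t, g_t) &= \left( 0 + \frac{1}{2} f \sigma^2 \right) dt
  + f \left[ \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma\, dW_t \right]
  = \mu f\, dt + \sigma f\, dW_t .
\end{align*}
```

With $f(t, g_t) = S_t$ this is exactly (3.2).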

Note that in practice you are often given $dg_t$, and thus equation (3.7) or even (3.6) provides a better starting point.

Example 3.3. A financial derivative is a financial instrument that derives its value from the value of another asset, called the underlying (for example a stock). Therefore the value of a general derivative is a function $V(t, S)$ of time and of the value $S$ of the underlying. Assuming the equation of the underlying is

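\[ dS_t = \mu S_t\, dt + \sigma S_t\, dW_t, \]
we will derive an equation for $dV(t, S_t)$. We use Ito Chain Rule in the form of (3.6) and
\[ (dS_t)^2 = \sigma^2 S_t^2 (dW_t)^2 = \sigma^2 S_t^2\, dt, \]
to get
\begin{align*}
dV(t, S_t) &= \frac{\partial V}{\partial t}\, dt + \frac{\partial V}{\partial S}\, dS_t + \frac{1}{2}\frac{\partial^2 V}{\partial S^2}\, (dS_t)^2 \\
&= \left( \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S_t^2 \frac{\partial^2 V}{\partial S^2} \right) dt + \frac{\partial V}{\partial S}\, dS_t. \tag{3.9}
\end{align*}
While not terribly explicit in its current form, equation (3.9) will be our starting point for the derivation of the Nobel-prize winning Black–Scholes–Merton equation. Note the appearance of the term $\frac{\partial V}{\partial S}$: this is the increment in $V$ divided by the increment in the underlying $S$, the all-important $\Delta$ of hedging!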

Example 3.4. Suppose Xt is modeling the USD to EUR exchange rate using equation

dXt = µXtdt + σXtdWt.

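Then the EUR to USD exchange rate is given by $Y_t = 1/X_t$. What is the equation satisfied by $Y_t$? We will use Ito Chain Rule with
\[ f(t, g) = \frac{1}{g}, \qquad g = X_t. \]
The relevant partial derivatives are
\[ \frac{\partial f}{\partial t} = 0, \qquad \frac{\partial f}{\partial g} = -\frac{1}{g^2}, \qquad \frac{\partial^2 f}{\partial g^2} = \frac{2}{g^3}. \]
Equation (3.6) becomes
\[ dY_t = df(t, g_t) = 0 + \frac{1}{2}\,\frac{2}{g^3}\, (dg_t)^2 + \left( -\frac{1}{g^2} \right) dg_t. \]
We substitute
\[ g = X_t, \qquad dX_t = \mu X_t\, dt + \sigma X_t\, dW_t, \qquad (dX_t)^2 = \sigma^2 X_t^2\, dt, \]
and obtain
\[ dY_t = \frac{1}{X_t^3}\,\sigma^2 X_t^2\, dt - \frac{1}{X_t^2}\left( \mu X_t\, dt + \sigma X_t\, dW_t \right) = \sigma^2 \frac{1}{X_t}\, dt - \mu \frac{1}{X_t}\, dt - \sigma \frac{1}{X_t}\, dW_t. \]

Since $1/X_t = Y_t$, we finally obtain
\[ dY_t = (\sigma^2 - \mu) Y_t\, dt - \sigma Y_t\, dW_t. \]

Also of interest in the financial setting is the Ito product rule for two functions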

\[ X_t = X(t, W_t), \qquad Y_t = Y(t, W_t), \]
which are here assumed to be driven by the same $W_t$. In this case the differential of their product is given by
\[ d(X_t Y_t) = dX_t \cdot Y_t + X_t \cdot dY_t + dX_t \cdot dY_t. \tag{3.10} \]
Note that the last term is too small to appear in usual calculus but may contribute in stochastic calculus since $(dW_t)^2 = dt$.
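The product rule can be tested on the exchange rates of Example 3.4. Substituting the two SDEs into (3.10), with $dX_t \cdot dY_t = -\sigma^2 X_t Y_t\, dt$, gives $d(X_t Y_t) = [\mu + (\sigma^2 - \mu) - \sigma^2]\, X_t Y_t\, dt + (\sigma - \sigma)\, X_t Y_t\, dW_t = 0$, as it must since $X_t Y_t = 1$. The sketch below simulates both equations with the finite-difference scheme (3.4)–(3.5) of Remark 3.2, driven by the same increments, and prints the product. The helper name, the seed, the parameter values and the initial rate are all hypothetical.

```python
import math
import random

def euler_step(x, a, b, dt, dw):
    """One step of scheme (3.4): x + a*dt + b*dw, with a and b already evaluated."""
    return x + a * dt + b * dw

rng = random.Random(99)       # fixed seed, an assumption
mu, sigma = 0.02, 0.1         # hypothetical drift and volatility
T, n = 1.0, 10000
dt = T / n
x = 0.9                       # hypothetical initial rate X_0
y = 1.0 / x                   # Y_0 = 1 / X_0
for _ in range(n):
    dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)    # (3.5): sqrt(dt) * Z_n
    x_new = euler_step(x, mu * x, sigma * x, dt, dw)
    y_new = euler_step(y, (sigma**2 - mu) * y, -sigma * y, dt, dw)
    x, y = x_new, y_new       # both rates are driven by the same dw
print(x, y, x * y)
```

The product stays close to 1 only because the drift of $Y_t$ contains the extra $\sigma^2$ term; using the drift $-\mu Y_t$ that ordinary calculus would suggest makes the product decay roughly like $e^{-\sigma^2 t}$.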