DEGREE PROJECT IN MATHEMATICS, SECOND CYCLE, 30 CREDITS STOCKHOLM, SWEDEN 2016

Modelling the Stochastic Correlation

PENG CHEN

KTH ROYAL INSTITUTE OF TECHNOLOGY SCHOOL OF ENGINEERING SCIENCES

Master’s Thesis in Financial Mathematics (30 ECTS credits) Master Programme in Applied and Computational Mathematics (120 credits) Royal Institute of Technology year 2016 Supervisor at KTH: Fredrik Armerin Examiner: Boualem Djehiche

TRITA-MAT-E 2016:27 ISRN-KTH/MAT/E--16/27--SE

Royal Institute of Technology School of Engineering Sciences

KTH SCI SE-100 44 Stockholm, Sweden

URL: www.kth.se/sci

Abstract In this thesis, we mainly study the correlation between stocks. The correlation between stocks has been receiving increasing attention. Usually the correlation is considered to be constant, although it is observed to vary over time. In this thesis, we study the properties of correlations between Wiener processes and introduce a stochastic correlation model. Following the calibration methods of Zetocha, we implement the calibration for a new set of market data.

Modellering av stokastisk korrelationen (Modelling the Stochastic Correlation)

Sammanfattning (Summary) In this thesis, we focus primarily on studying the correlation between stocks. The correlation between stocks has received increasing attention. The correlation is usually assumed to be constant, even though empirical studies suggest that it varies over time. In this thesis, we study properties of the correlation between Wiener processes and introduce a stochastic correlation model. Based on the calibration methods of Zetocha, we implement a calibration for a new set of market data.

Acknowledgements I would like to express my gratitude to my supervisor at KTH, Fredrik Armerin, for his guidance and comments on this thesis. Thank you for your kindness and encouragement throughout the thesis work. I would also like to thank my family for supporting my studies in Sweden. It has been a valuable experience.

Peng Chen Stockholm, June 2016

Contents

1 Introduction

2 Theoretical background
  2.1 Itô integral
  2.2 Variation and covariation of continuous martingales
    2.2.1 Martingale
    2.2.2 Variation and covariance
    2.2.3 Quadratic variation of Wiener process
  2.3 Itô's formula
    2.3.1 Constructing correlated Wiener processes
  2.4 Stochastic differential equations
  2.5 The Girsanov theorem

3 The Black-Scholes model
  3.1 The one dimensional Black-Scholes model
    3.1.1 Historic volatility
    3.1.2 Implied volatility
  3.2 Correlation
    3.2.1 Correlation between two random variables
    3.2.2 Basket correlation and implied correlation
  3.3 Multidimensional models with correlations
    3.3.1 Constant volatility model
  3.4 Multi-asset derivatives
    3.4.1 Worst-of options
    3.4.2 Dispersion trading
    3.4.3 Variance and volatility swaps
    3.4.4 The correlation swap

4 The stochastic correlation model
  4.1 A general model
  4.2 A simplified multi-asset model
  4.3 Properties of the model
    4.3.1 Existence and uniqueness of solutions
    4.3.2 Moment evaluation
    4.3.3 The boundary condition
    4.3.4 Simulation
    4.3.5 Numerical experiments for the parameters

5 Calibration
  5.1 Methods for calibration
  5.2 Data description and the calibration
    5.2.1 Model analysis
    5.2.2 Summary

1 Introduction

A large number of stocks are traded in financial markets. The prices of these stocks change every day in light of the news and the expected future performance of the companies. These changes are not independent: there are co-movements between stocks, which are called correlations. Firms that run common business lines will be correlated, and in general these correlations do not remain the same over time. For instance, suppose A and B are two firms that have some common business lines. If firm A drops a certain business line, then the correlation between A and B will change. According to [8], the price of a stock reflects the expectation of the future performance of the firm. Every piece of news affects this expectation for all stocks, but to a different extent. The effect may be large for some equities but negligible for others, so the correlation changes with the news.

For simplicity, the correlation is usually modeled as a constant. In some cases, a constant correlation cannot explain the market well. In this thesis, we will study a stochastic correlation model.

It is observed that the correlation between stocks rises in a bear market and falls in a bull market. To check this claim, we examine the correlation and the return of the OMXS30 index. Figure 1 is a scatter plot of the correlation against the return for the OMXS30 index. The red line is the linear regression

$$\hat{y} = 0.194477 - 0.365424\,x,$$

where $\hat{y}$ denotes the correlation and $x$ is the log return. The negative slope indicates that the correlation of the OMXS30 index is negatively correlated with its log return.

There are many multi-asset equity derivatives traded in the market. The correlation market mainly includes the following types of contracts:

• Worst-of options

• Dispersion trading

• Variance and volatility swaps

• Correlation swaps

We will study these multi-asset equity derivatives after introducing some theoretical background.

Figure 1: Scatter of three-month realized correlation (vertical axis) vs three-month log return (horizontal axis)

2 Theoretical background

2.1 Itô integral

In this section, we introduce some theoretical background from stochastic calculus. The section is mainly based on [9].

Definition 2.1. Let $(\Omega, \mathcal{F}, P)$ be a probability space. A filtration $(\mathcal{F}_t)_{t \ge 0}$ on the probability space is an increasing family of $\sigma$-fields that satisfies

$$\mathcal{F}_s \subset \mathcal{F}_t \subset \mathcal{F}, \quad \forall s \le t.$$

A probability space endowed with a filtration is called a filtered probability space. A stochastic process $X$ on $(\Omega, \mathcal{F}, P)$ is adapted to the filtration $(\mathcal{F}_t)$ if for each $t \ge 0$, $X_t$ is $\mathcal{F}_t$-measurable.

The natural filtration of the stochastic process $X$ is the family $(\mathcal{F}_t^X)_{t \ge 0}$, where $\mathcal{F}_t^X$ is the smallest $\sigma$-algebra with respect to which all the variables $(X_s, s \le t)$ are measurable. A stochastic process is always adapted to its natural filtration.

Definition 2.2. (One-dimensional Wiener process) A stochastic process $W$ is called a Wiener process or a Brownian motion if

1. $W(0) = 0$.

2. The process has independent increments, i.e. if $s < t$ then $W(t) - W(s)$ is independent of $\mathcal{F}_s^W$.

3. For $s < t$ the random variable $W(t) - W(s)$ has the normal distribution $N[0, t - s]$, where $t - s$ is the variance.

4. $W$ has continuous trajectories.

We will use the notations $W(t)$ and $W_t$ interchangeably in this thesis.

Definition 2.3. (Multidimensional Wiener process) An $\mathbb{R}^d$-valued process $W_t = (W_t^1, W_t^2, \cdots, W_t^d)$ is called a $d$-dimensional Wiener process if its components $W_t^1, W_t^2, \cdots, W_t^d$ are independent Wiener processes.

Note that an $n$-dimensional Wiener process has $n$ independent components. In this thesis, we will use the term correlated Wiener process if the components are correlated.

Definition 2.4. We say that a process $g$ belongs to the class $L^2[a, b]$ if

• $\int_a^b E[g^2(s)]\,ds < \infty$,

• the process $g$ is adapted to the $(\mathcal{F}_t^W)$-filtration.

If $g$ belongs to $L^2[0, t]$ for all $t > 0$, then we say that $g$ belongs to $L^2$ and write $g \in L^2$.

Before we define the Itô integral for general processes, we first define it for simple processes. The general Itô integral can then be defined as a limit.

Definition 2.5. A process $g$ is called a simple process if there exists a partition $\Pi : a = t_0 < t_1 < \cdots < t_n = b$ of $[a, b]$ such that

$$g(t) = \sum_{k=0}^{n-1} g_{t_k} \mathbf{1}_{[t_k, t_{k+1})}(t). \tag{2.1}$$

Then the Itô integral of a simple process is defined by

$$\int_a^b g(s)\,dW_s = \sum_{k=0}^{n-1} g_{t_k} \left[ W(t_{k+1}) - W(t_k) \right], \tag{2.2}$$

where $g_{t_k}$ is a random variable that is $\mathcal{F}_{t_k}^W$-measurable.

Theorem 2.1. For any process $g \in L^2[a, b]$, we can find a sequence of simple processes $g_n$ such that

$$\int_a^b E\left[ (g(s) - g_n(s))^2 \right] ds \to 0. \tag{2.3}$$

Then the Itô integral for a general process $g \in L^2[a, b]$ can be defined by

$$\int_a^b g(s)\,dW(s) = \lim_{n \to \infty} \int_a^b g_n(s)\,dW(s). \tag{2.4}$$

From the definition above, we have the following properties of the stochastic integral with respect to a Wiener process.

Theorem 2.2. Let $g, f \in L^2[a, b]$. Then each Itô integral is defined and has the following properties:

1. Linearity. If $\alpha$ and $\beta$ are constants, then

$$\int_a^b (\alpha f(s) + \beta g(s))\,dW(s) = \alpha \int_a^b f(s)\,dW(s) + \beta \int_a^b g(s)\,dW(s).$$

2. Zero mean property.

$$E\left[ \int_a^b g(s)\,dW(s) \right] = 0.$$

3. Itô isometry.

$$E\left[ \left( \int_a^b g(s)\,dW(s) \right)^2 \right] = \int_a^b E[g^2(s)]\,ds.$$

4. Measurability.

$$\int_a^b g(s)\,dW(s) \text{ is } \mathcal{F}_b^W\text{-measurable.}$$
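The zero mean property and the Itô isometry can be checked numerically. The following Python sketch is an illustration only and is not part of the original text: it approximates the Itô integral by the left-point sums of Equation (2.2) and assumes the integrand $g(s) = W(s)$ on $[0, 1]$, for which $\int_0^1 E[W^2(s)]\,ds = 1/2$. The function name, step count and sample size are hypothetical choices.

```python
import numpy as np

def ito_integral_samples(g, t_end, n_steps, n_paths, seed=0):
    """Sample the simple-process sum  sum_k g(t_k, W_{t_k}) [W_{t_{k+1}} - W_{t_k}]  on [0, t_end]."""
    rng = np.random.default_rng(seed)
    dt = t_end / n_steps
    t_left = dt * np.arange(n_steps)                          # left end points t_k
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    W = np.cumsum(dW, axis=1)
    W_left = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]])   # W(t_k), with W(0) = 0
    return np.sum(g(t_left, W_left) * dW, axis=1)

# Assumed integrand for the illustration: g(s) = W(s).
samples = ito_integral_samples(lambda t, w: w, t_end=1.0, n_steps=100, n_paths=40_000)
sample_mean = samples.mean()             # zero mean property: should be close to 0
second_moment = np.mean(samples ** 2)    # Ito isometry: should be close to 1/2
print(sample_mean, second_moment)
```

Evaluating the integrand at the left end point of each subinterval is what keeps the approximating process adapted, in line with Definition 2.5.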

2.2 Variation and covariation of continuous martingales

2.2.1 Martingale

Definition 2.6. A stochastic process $X$ is called an $(\mathcal{F}_t)$-martingale if the following conditions hold:

• $X$ is adapted to the filtration $(\mathcal{F}_t)_{t \ge 0}$.

• For all $t$,
$$E[|X_t|] < \infty. \tag{2.5}$$

• For all $s < t$, the following relation holds:
$$E[X_t \mid \mathcal{F}_s] = X_s. \tag{2.6}$$

It follows immediately from the definition that a Wiener process $W$ is an $(\mathcal{F}_t^W)$-martingale.

Definition 2.7. A martingale $(M_t)_{t \ge 0}$ is called square integrable on $[0, T]$ if for all $t \in [0, T]$,

$$E[M_t^2] < \infty.$$

The following theorem states that Itô integrals are also martingales.

Theorem 2.3. Let $g \in L^2$ be an adapted process. Then $I(t) = \int_0^t g(s)\,dW(s)$ is a continuous, zero mean, square integrable martingale.

2.2.2 Variation and covariance

Let $X$ and $Y$ be two stochastic processes and consider a partition $\Pi$ of the interval $[0, t]$,

$$\Pi : 0 = t_0 < t_1 < \cdots < t_{n-1} < t_n = t.$$

The norm or mesh of $\Pi$ is defined as

$$\|\Pi\| = \max_{0 \le i \le n-1} \{ t_{i+1} - t_i \}.$$

The quadratic variation of $X$ is defined as

$$\langle X \rangle_t = \lim_{\|\Pi\| \to 0} \sum_{i=0}^{n-1} (X_{t_{i+1}} - X_{t_i})^2.$$

The quadratic variation is a pathwise measurement of the variance of $X$. The covariation process between $X$ and $Y$ is defined as

$$\langle X, Y \rangle_t = \lim_{\|\Pi\| \to 0} \sum_{i=0}^{n-1} (X_{t_{i+1}} - X_{t_i})(Y_{t_{i+1}} - Y_{t_i}),$$

which is a pathwise measurement of the covariance between $X$ and $Y$. The limits above, if they exist, are defined using convergence in probability. An alternative definition of the covariation is

$$\langle X, Y \rangle_t = \frac{1}{4} \left( \langle X + Y \rangle_t - \langle X - Y \rangle_t \right).$$

In fact, if $X$ and $Y$ are both square integrable martingales or Itô processes, then the limit exists. It is worth mentioning that

$$\langle X, X \rangle_t = \langle X \rangle_t,$$

$$\langle X, Y \rangle_t = \langle Y, X \rangle_t.$$

If $\alpha$ and $\beta$ are constants, we have

$$\langle \alpha X + \beta Y, Z \rangle_t = \alpha \langle X, Z \rangle_t + \beta \langle Y, Z \rangle_t.$$

Theorem 2.4. If $X$ and $Y$ are two square integrable continuous $(\mathcal{F}_t)$-martingales, then

• $\langle X \rangle_t$ is the unique continuous increasing adapted process such that

$$X_t^2 - \langle X \rangle_t$$

is an $(\mathcal{F}_t)$-martingale.

• $\langle X, Y \rangle_t$ is the unique continuous adapted process of bounded variation such that

$$X_t Y_t - \langle X, Y \rangle_t$$

is an $(\mathcal{F}_t)$-martingale.

Theorem 2.5. If $M$ is a continuous martingale in $L^2$ with $M_0 = 0$ and $u(s)$ is an adapted process such that $\int_0^t |u(s)|\,ds < \infty$ a.s., then

$$\left\langle \int_0^{\cdot} u(s)\,ds,\; M \right\rangle_t = 0. \tag{2.7}$$

2.2.3 Quadratic variation of Wiener process

If $W_t$ is a Wiener process, then it can be proven that

$$\langle W \rangle_t = t. \tag{2.8}$$

In other words,

$$\langle W \rangle_t = \lim_{\|\Pi\| \to 0} \sum_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})^2 = \int_0^t (dW_\tau)^2 = t. \tag{2.9}$$

The differential form of Equation (2.9) can be written as

$$d\langle W \rangle_t = (dW_t)^2 = dt. \tag{2.10}$$

For a proof of this, see [12]. If the processes $f$ and $g$ belong to $L^2[a, b]$, then

$$\left\langle \int_a^{\cdot} f(s)\,dW(s) \right\rangle_t = \int_a^t f^2(s)\,d\langle W \rangle_s = \int_a^t f^2(s)\,ds,$$

$$\left\langle \int_a^{\cdot} f(s)\,dW(s),\; \int_a^{\cdot} g(s)\,dW(s) \right\rangle_t = \int_a^t f(s) g(s)\,d\langle W \rangle_s = \int_a^t f(s) g(s)\,ds.$$

Theorem 2.6. (Lévy's characterization of Wiener process) Let $W$ be a stochastic process with the natural filtration $(\mathcal{F}_t^W)_{t \ge 0}$. Then $W$ is a Wiener process if and only if the following conditions hold:

1. $W_0 = 0$,

2. $W$ is a martingale with respect to its natural filtration, and

3. $\langle W \rangle_t = t$ for all $t \ge 0$.

Lévy's characterization allows one to verify whether a process is a Wiener process by computing its quadratic variation, which is useful in the construction of correlated Wiener processes.
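The identity $\langle W \rangle_t = t$ can be illustrated on simulated paths. The sketch below is not from the original text: it computes the sum of squared increments in Equation (2.9) for equidistant partitions of $[0, T]$ with decreasing mesh. The horizon $T = 2$ and the partition sizes are assumptions made only for the illustration.

```python
import numpy as np

def realized_quadratic_variation(path):
    """Sum of squared increments of a sampled path, as in Equation (2.9)."""
    return float(np.sum(np.diff(path) ** 2))

rng = np.random.default_rng(1)
T = 2.0
qv_by_mesh = {}
for n in (10, 1_000, 100_000):
    # Wiener increments over n equidistant steps of length T / n
    dW = rng.normal(0.0, np.sqrt(T / n), size=n)
    W = np.concatenate(([0.0], np.cumsum(dW)))
    qv_by_mesh[n] = realized_quadratic_variation(W)

# The values should approach T as the mesh T / n shrinks.
print(qv_by_mesh)
```

For a single path the sum is random for every finite partition; only in the limit of vanishing mesh does it settle at $T$, which is why the finest partition gives the closest value.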

2.3 Itô's formula

Definition 2.8. (Itô process) Given a Wiener process $W$ on $(\Omega, \mathcal{F}, P)$, an Itô process is a stochastic process $X$ on $(\Omega, \mathcal{F}, P)$ of the form

$$X_t = X_0 + \int_0^t \mu_s\,ds + \int_0^t \sigma_s\,dW_s, \tag{2.11}$$

where the adapted process $\sigma$, called the diffusion coefficient or volatility, satisfies

$$P\left( \int_0^t \sigma_s^2\,ds < \infty, \text{ for all } t \ge 0 \right) = 1$$

and the adapted process $\mu$, called the drift coefficient, satisfies

$$P\left( \int_0^t |\mu_s|\,ds < \infty, \text{ for all } t \ge 0 \right) = 1.$$

In general, we will write (2.11) in the differential form

$$dX_t = \mu_t\,dt + \sigma_t\,dW_t. \tag{2.12}$$

Note that the differential form has no independent meaning. It is simply a shorthand version of (2.11). Using Theorem 2.5, we can compute the quadratic variation of $X$,

$$\langle X \rangle_t = \left\langle \int_0^{\cdot} \sigma_s\,dW_s,\; \int_0^{\cdot} \sigma_s\,dW_s \right\rangle_t = \int_0^t \sigma_s^2\,d\langle W, W \rangle_s = \int_0^t \sigma_s^2\,ds. \tag{2.13}$$

If $Y_t = Y_0 + \int_0^t \beta_s\,ds + \int_0^t \alpha_s\,dB_s$ is another Itô process driven by another Wiener process $B$, then

$$\langle X, Y \rangle_t = \left\langle \int_0^{\cdot} \sigma_s\,dW_s,\; \int_0^{\cdot} \alpha_s\,dB_s \right\rangle_t = \int_0^t \sigma_s \alpha_s\,d\langle W, B \rangle_s. \tag{2.14}$$

If $W$ and $B$ are correlated and $\langle W, B \rangle_t = c(t)$ for some differentiable deterministic function $c(t)$, we have

$$\langle X, Y \rangle_t = \int_0^t \sigma_s \alpha_s c'(s)\,ds. \tag{2.15}$$

The differential form of (2.14) can be written as

$$d\langle X, Y \rangle_t = \sigma_t \alpha_t\,d\langle W, B \rangle_t.$$

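$$\begin{array}{c|ccc}
 & dW_t & dB_t & dt \\ \hline
dW_t & dt & d\langle W, B \rangle_t & 0 \\
dB_t & d\langle W, B \rangle_t & dt & 0 \\
dt & 0 & 0 & 0
\end{array}$$

Table 1: Multiplication table

For convenience, we will sometimes write

$$dX_t\,dY_t = d\langle X, Y \rangle_t = \sigma_t \alpha_t\,d\langle W, B \rangle_t. \tag{2.16}$$

Equation (2.16) makes it easier to compute the differential form of the covariation process for Itô processes. It makes the covariation operation work like a regular multiplication. The multiplication rule is shown in Table 1.

Theorem 2.7. (Itô's formula) Let $X$ be an Itô process of the form

$$dX_t = \mu_t\,dt + \sigma_t\,dW_t \tag{2.17}$$

and let $g(t, x) \in C^{1,2}(\mathbb{R}^+ \times \mathbb{R})$. Then $Y_t = g(t, X_t)$ is also an Itô process with the differential form

$$\begin{aligned}
dY_t &= \frac{\partial g}{\partial x}(t, X_t)\,dX_t + \frac{\partial g}{\partial t}(t, X_t)\,dt + \frac{1}{2} \frac{\partial^2 g}{\partial x^2}(t, X_t)\,d\langle X, X \rangle_t \\
&= \sigma_t \frac{\partial g}{\partial x}(t, X_t)\,dW_t + \left( \frac{\partial g}{\partial t}(t, X_t) + \mu_t \frac{\partial g}{\partial x}(t, X_t) + \frac{1}{2} \sigma_t^2 \frac{\partial^2 g}{\partial x^2}(t, X_t) \right) dt.
\end{aligned}$$

Thus, the drift coefficient of $Y_t$ is

$$\bar{\mu}_t = \frac{\partial g}{\partial t}(t, X_t) + \mu_t \frac{\partial g}{\partial x}(t, X_t) + \frac{1}{2} \sigma_t^2 \frac{\partial^2 g}{\partial x^2}(t, X_t)$$

and the diffusion coefficient is

$$\bar{\sigma}_t = \sigma_t \frac{\partial g}{\partial x}(t, X_t).$$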

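2.3.1 Constructing correlated Wiener processes

In [16], a method for constructing correlated Wiener processes is described. Let $\rho_t$ be an Itô process

$$d\rho_t = a(t, \rho_t)\,dt + b(t, \rho_t)\,dK_t, \quad \rho_0 \in [-1, 1].$$

Let $W^1$ and $V$ be two independent Wiener processes, and define

$$W_t^2 = \int_0^t \rho_s\,dW_s^1 + \int_0^t \sqrt{1 - \rho_s^2}\,dV_s. \tag{2.18}$$

It can be shown by Lévy's characterization that $W^2$ is a Wiener process. First, it is clear that $W_0^2 = 0$. Since $|\rho_t| \le 1$ for all $t$, we have $E[|W_t^2|] < \infty$ for all $t$. In addition,

$$E[W_t^2 \mid \mathcal{F}_s^{W^2}] = W_s^2. \tag{2.19}$$

Therefore, $W^2$ is a martingale with respect to its natural filtration. The quadratic variation of $W^2$ is

$$\begin{aligned}
\langle W^2 \rangle_t &= \left\langle \int_0^{\cdot} \rho_s\,dW_s^1 + \int_0^{\cdot} \sqrt{1 - \rho_s^2}\,dV_s \right\rangle_t \\
&= \left\langle \int_0^{\cdot} \rho_s\,dW_s^1 \right\rangle_t + \left\langle \int_0^{\cdot} \sqrt{1 - \rho_s^2}\,dV_s \right\rangle_t \\
&= \int_0^t \rho_s^2\,ds + \int_0^t (1 - \rho_s^2)\,ds \\
&= t,
\end{aligned}$$

where the cross term vanishes because $W^1$ and $V$ are independent. Hence, we have proved that $W^2$ is also a Wiener process, and

$$\begin{aligned}
\langle W^1, W^2 \rangle_t &= \left\langle W^1,\; \int_0^{\cdot} \rho_s\,dW_s^1 + \int_0^{\cdot} \sqrt{1 - \rho_s^2}\,dV_s \right\rangle_t \\
&= \left\langle W^1,\; \int_0^{\cdot} \rho_s\,dW_s^1 \right\rangle_t \\
&= \int_0^t \rho_s\,ds.
\end{aligned}$$

Then,

$$d\langle W^1, W^2 \rangle_t = \rho_t\,dt. \tag{2.20}$$

This example shows how to construct two correlated Wiener processes. Given a correlation matrix, there is a standard way to construct an $n$-dimensional correlated Wiener process from an $n$-dimensional Wiener process.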
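The construction in Equation (2.18) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the method of [16]: the correlation process is taken to be $\rho_t = \tanh(K_t)$ with $K$ a Wiener process independent of $W^1$ and $V$, a hypothetical choice that is an Itô process and stays inside $(-1, 1)$. The step count and seed are also arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n = 1.0, 100_000
dt = T / n

# Increments of three independent Wiener processes W^1, V and K.
dW1, dV, dK = rng.normal(0.0, np.sqrt(dt), size=(3, n))

# Assumed correlation process rho_t = tanh(K_t), evaluated at the left end
# point of each step so that the integrand is adapted.
K_left = np.concatenate(([0.0], np.cumsum(dK)[:-1]))
rho = np.tanh(K_left)

# Discretized version of Equation (2.18).
dW2 = rho * dW1 + np.sqrt(1.0 - rho ** 2) * dV

qv_W2 = float(np.sum(dW2 ** 2))            # should be close to T
cov_W1_W2 = float(np.sum(dW1 * dW2))       # should be close to the integral of rho
integrated_rho = float(np.sum(rho) * dt)   # Riemann sum for int_0^T rho_s ds
print(qv_W2, cov_W1_W2, integrated_rho)
```

The two printed covariation values agree up to discretization and sampling error, which mirrors Equation (2.20) in integrated form.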

Correlated n-dimensional Wiener processes

Let $V = (V^1, V^2, \cdots, V^n)$ be an $n$-dimensional Wiener process, so that

$$d\langle V^i, V^j \rangle_t = \begin{cases} dt, & \text{if } i = j, \\ 0, & \text{otherwise.} \end{cases}$$

We want to construct a correlated $n$-dimensional Wiener process $W$ based on $V$ with a stochastic correlation matrix $\rho(t) = (\rho_{ij}(t))_{n \times n}$ and

$$d\langle W^i, W^j \rangle_t = \rho_{ij}(t)\,dt.$$

If the correlation matrix is positive definite, then it has a Cholesky decomposition of the form

$$\rho(t) = C_t C_t^{\star},$$

where $C_t$ is a lower triangular matrix with real and positive diagonal entries, and $C_t^{\star}$ is the transpose of $C_t$. Then

$$dW_t = C_t\,dV_t \tag{2.21}$$

is what we need. In fact, suppose $C_t = (c_{ij}(t))_{n \times n}$. Then Equation (2.21) can be written as

$$dW_t^i = \sum_{k=1}^{n} c_{ik}(t)\,dV_t^k. \tag{2.22}$$

Hence, by the multiplication table, we have

$$d\langle W^i, W^j \rangle_t = \sum_{k=1}^{n} c_{ik}(t) c_{jk}(t)\,dt = \rho_{ij}(t)\,dt. \tag{2.23}$$

To make this clear in the two-dimensional case, we assume the correlation matrix

$$\rho(t) = \begin{pmatrix} 1 & \rho_1(t) \\ \rho_1(t) & 1 \end{pmatrix}. \tag{2.24}$$

Performing the Cholesky decomposition, we have $\rho(t) = C_t C_t^{\star}$ with

$$C_t = \begin{pmatrix} 1 & 0 \\ \rho_1(t) & \sqrt{1 - \rho_1^2(t)} \end{pmatrix}. \tag{2.25}$$

Hence

$$\begin{pmatrix} dW_t^1 \\ dW_t^2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ \rho_1(t) & \sqrt{1 - \rho_1^2(t)} \end{pmatrix} \begin{pmatrix} dV_t^1 \\ dV_t^2 \end{pmatrix}. \tag{2.26}$$
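As a rough illustration of Equations (2.21) to (2.26), the sketch below builds correlated increments from independent ones with NumPy's Cholesky factorization. For simplicity it assumes a constant correlation matrix, whereas the text allows $\rho(t)$ to be stochastic; the matrix entries, step size and function name are hypothetical.

```python
import numpy as np

def correlated_increments(rho, n_steps, dt, rng):
    """Return an (n_steps, n) array whose rows are dW = C dV, where rho = C C^T."""
    C = np.linalg.cholesky(rho)                                       # lower triangular factor
    dV = rng.normal(0.0, np.sqrt(dt), size=(n_steps, rho.shape[0]))   # independent increments
    return dV @ C.T

# Two-dimensional case: the factor should agree with Equation (2.25).
rho1 = 0.6
C2 = np.linalg.cholesky(np.array([[1.0, rho1], [rho1, 1.0]]))
print(C2)   # rows: (1, 0) and (rho1, sqrt(1 - rho1^2))

# Three-dimensional example with an assumed positive definite matrix.
rho = np.array([[1.0, 0.6, 0.3],
                [0.6, 1.0, 0.5],
                [0.3, 0.5, 1.0]])
rng = np.random.default_rng(3)
n_steps, dt = 100_000, 1e-3
dW = correlated_increments(rho, n_steps, dt, rng)

# Realized covariation per unit time; should be close to rho by Equation (2.23).
realized = dW.T @ dW / (n_steps * dt)
print(np.round(realized, 3))
```

With a time-dependent or stochastic $\rho(t)$, the same idea would apply step by step, factorizing the matrix at the left end point of each time step.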

2.4 Stochastic differential equations

Stochastic differential equations are important when modelling financial assets. Stochastic differential equations, like ordinary differential equations, describe the dynamics of some stochastic process. This section is based on [9], and we will discuss the existence and uniqueness of solutions of stochastic differential equations.

Let $W$ be a Wiener process. An equation of the form

$$dX(t) = \mu(X(t), t)\,dt + \sigma(X(t), t)\,dW(t), \tag{2.27}$$

where the functions $\mu(x, t)$ and $\sigma(x, t)$ are given and $X(t)$ is the unknown process, is called a stochastic differential equation (SDE) driven by a Wiener process (Brownian motion). The functions $\mu(x, t)$ and $\sigma(x, t)$ are called coefficients.

Definition 2.9. A process $X(t)$ is called a strong solution of the SDE (2.27) if for all $t > 0$ the integrals $\int_0^t \mu(X(s), s)\,ds$ and $\int_0^t \sigma(X(s), s)\,dW(s)$ exist, with the second being an Itô integral, and

$$X(t) = X(0) + \int_0^t \mu(X(s), s)\,ds + \int_0^t \sigma(X(s), s)\,dW(s). \tag{2.28}$$

Remark. A strong solution is some function $F(t, (W(s))_{s \le t})$ of the given Wiener process $W$. We emphasize that the Wiener process $W$ is given. There is also the notion of a weak solution, which is a pair $(X_t, B_t)$, with $B$ in the role of the driving Wiener process, that satisfies (2.28). It is usually hard to find an explicit expression for the function $F$. Only some classes of SDEs have closed form solutions. Therefore, it is important to study the existence and uniqueness of SDE solutions.

Theorem 2.8. Assume the following conditions:

1. The coefficients are locally Lipschitz in $x$, uniformly in $t$; that is, for every $T$ and $N$, there is a constant $K$ depending only on $T$ and $N$ such that for all $|x|, |y| \le N$ and all $0 \le t \le T$,

$$|\mu(x, t) - \mu(y, t)| + |\sigma(x, t) - \sigma(y, t)| < K|x - y|. \tag{2.29}$$

2. The coefficients satisfy the linear growth condition

$$|\mu(x, t)| + |\sigma(x, t)| \le K(1 + |x|). \tag{2.30}$$

3. $X(0)$ is independent of $(W(t), 0 \le t \le T)$ and

$$E[X^2(0)] < \infty. \tag{2.31}$$

Then there exists a unique strong solution $X(t)$ of the SDE (2.27). $X(t)$ has continuous paths; moreover,

$$E\left[ \sup_{0 \le t \le T} X^2(t) \right] < C(1 + E[X^2(0)]), \tag{2.32}$$

where the constant $C$ depends on $K$ and $T$.

There is another result that is specific to one-dimensional SDEs. It relaxes the Lipschitz condition on the coefficient $\sigma$ to a Hölder condition.

Theorem 2.9. Suppose that $\mu(x)$ satisfies the Lipschitz condition and $\sigma(x)$ satisfies a Hölder condition of order $\alpha$, $\alpha \ge \frac{1}{2}$, that is, there exists a constant $K$ such that

$$|\sigma(x) - \sigma(y)| < K|x - y|^{\alpha}. \tag{2.33}$$

Then the strong solution exists and is unique.

2.5 The Girsanov theorem

Girsanov's theorem describes how the dynamics of a stochastic process change when the measure changes.

Theorem 2.10. (The Girsanov theorem) Let $W^P$ be a $d$-dimensional Wiener process on the filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, P)$ and let $\varphi$ be any adapted column vector process. Choose a fixed $T$ and define the process $L$ on $[0, T]$ by

$$dL_t = \varphi_t^{\star} L_t\,dW_t^P,$$

$$L_0 = 1,$$

where $\varphi_t^{\star}$ is the transpose of $\varphi_t$. Applying Itô's formula, $L_t$ can be solved as

$$L_t = \exp\left( \int_0^t \varphi_s^{\star}\,dW_s^P - \frac{1}{2} \int_0^t \|\varphi_s\|^2\,ds \right). \tag{2.34}$$

Assume that

$$E^P[L_T] = 1,$$

and define a new probability measure $Q$ on $\mathcal{F}_T$ by

$$L_T = \frac{dQ}{dP}.$$

Then

$$dW_t^P = \varphi_t\,dt + dW_t^Q \tag{2.35}$$

and $W^Q$ is a multidimensional Wiener process under $Q$.

The process $\varphi$ is called the Girsanov kernel. In the Girsanov theorem we assume that $E^P[L_T] = 1$, which implies that $L$ is a $P$-martingale. Note that it may happen that $E^P[L_T] < 1$. The most general result that guarantees $E^P[L_T] = 1$ is the Novikov condition.

Theorem 2.11. (The Novikov condition) If the Girsanov kernel $\varphi$ satisfies

$$E^P\left[ e^{\frac{1}{2} \int_0^T \|\varphi_t\|^2\,dt} \right] < \infty,$$

then $L$ is a martingale and in particular $E^P[L_T] = 1$. In particular, if $\varphi$ is uniformly bounded (that is, there exists a constant $K$ such that $|\varphi_t| < K$ for $t \in [0, T]$), then it satisfies the Novikov condition.

3 The Black-Scholes model

3.1 The one dimensional Black-Scholes model

The notation and the mathematics here follow [2]. Consider a financial market with only two assets: a risk free asset and a risky asset. Let $B$ be the price process of the risk free asset with the dynamics

$$dB(t) = r(t) B(t)\,dt, \tag{3.1}$$

where $r(t)$ is the interest rate. Let $S$ denote the price process of the risky asset, usually a stock. The stock price $S$ is given by

$$dS(t) = \alpha S(t)\,dt + \sigma S(t)\,d\bar{W}(t), \tag{3.2}$$

where $\bar{W}$ is a Wiener process under the objective measure $P$ and $\alpha$ and $\sigma$ are deterministic functions. We call $\sigma$ the volatility of $S$. Equations (3.1) and (3.2) together form the Black-Scholes model. If $\alpha$ and $\sigma$ are both constant, then it is called the standard Black-Scholes model.

The Black-Scholes equation

A contingent claim is a derivative whose payout depends on the realization of some uncertain future event. Let $\mathcal{X}$ be a contingent claim with maturity $T$ of the form

$$\mathcal{X} = \Phi(S(T)).$$

This means that if one buys one unit of the contract $\mathcal{X}$, then at the maturity $T$ one will receive $\Phi(S(T))$. The price process of $\mathcal{X}$ at time $t\,(< T)$ is denoted by $\Pi(t; \mathcal{X})$. We assume that this claim can be traded as an asset in the market and that its price process has the form $\Pi(t) = F(t, S(t))$ for some smooth function $F$. If the market is free of arbitrage, $F$ is the solution of the following boundary value problem in the domain $[0, T] \times \mathbb{R}^+$:

$$F_t(t, s) + r s F_s(t, s) + \frac{1}{2} s^2 \sigma^2(t, s) F_{ss}(t, s) - r F(t, s) = 0,$$

$$F(T, s) = \Phi(s).$$

The equation above is called the Black-Scholes equation. The proof can be found in [2].

The Black-Scholes formula

Suppose $\mathcal{X} = \Phi(S(T))$ is an arbitrary contingent claim. Then the arbitrage free price of $\mathcal{X}$ is given by $\Pi(t; \Phi) = F(t, S(t))$, where

$$F(t, s) = e^{-r(T - t)} E_{t,s}^{Q}[\Phi(S(T))]. \tag{3.3}$$

Here $Q$ is a probability measure under which the dynamics of $S$ have the form

$$dS(t) = r S(t)\,dt + \sigma S(t)\,dW(t), \tag{3.4}$$

$$S(t) = s. \tag{3.5}$$

The volatility $\sigma$ is constant here and is the same under both measures. The SDE for $S$ can be solved explicitly as

$$S(T) = s\, e^{\left( r - \frac{1}{2} \sigma^2 \right)(T - t) + \sigma [W(T) - W(t)]}. \tag{3.6}$$

Thus we have the pricing formula

$$F(t, s) = e^{-r(T - t)} \int_{-\infty}^{\infty} \Phi(s e^{z}) f(z)\,dz, \tag{3.7}$$

where $f$ is the density function of a random variable $Z$ with the distribution

$$N\left[ \left( r - \frac{1}{2} \sigma^2 \right)(T - t),\; \sigma^2 (T - t) \right].$$

With a slight abuse of notation, in the following sections $N$ will also denote the cumulative distribution function of the standard normal distribution. $\mathcal{X}$ is called a European call option if

$$\Phi(x) = \max\{x - K, 0\}, \tag{3.8}$$

where $K$ is a constant called the strike price. The price of $\mathcal{X}$ then has the form

$$F(t, s) = s N[d_1(t, s)] - e^{-r(T - t)} K N[d_2(t, s)], \tag{3.9}$$

where

$$d_1(t, s) = \frac{1}{\sigma \sqrt{T - t}} \left\{ \ln\left( \frac{s}{K} \right) + \left( r + \frac{1}{2} \sigma^2 \right)(T - t) \right\},$$

$$d_2(t, s) = d_1(t, s) - \sigma \sqrt{T - t}.$$

Equation (3.9) is known as the Black-Scholes formula.
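Equation (3.9) translates directly into code. The following Python sketch is an illustration only: the function names are hypothetical, `tau` stands for the time to maturity $T - t$, and the standard normal distribution function is built from `math.erf`.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Cumulative distribution function N of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(s, K, r, sigma, tau):
    """European call price from Equation (3.9), with tau = T - t > 0."""
    d1 = (log(s / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return s * norm_cdf(d1) - exp(-r * tau) * K * norm_cdf(d2)

# Example inputs (assumed for illustration): at-the-money call, one year to maturity.
price = bs_call(s=100.0, K=100.0, r=0.05, sigma=0.2, tau=1.0)
print(round(price, 4))   # approximately 10.4506
```

For these inputs $d_1 = 0.35$ and $d_2 = 0.15$, so the price can also be verified by hand from a table of the normal distribution function.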

3.1.1 Historic volatility

If one wants to value a European call option using the Black-Scholes formula, one needs to estimate the volatility $\sigma$. The historic volatility can be used, although it may not perform well. Assume that the underlying asset has the dynamics of Equation (3.2). We sample $n + 1$ prices at equidistant time points, i.e.

$$S(t_0), S(t_1), \cdots, S(t_n).$$

Let $\Delta t = t_{i+1} - t_i$, $i = 0, 1, \cdots, n - 1$. We define the log return of $S(t)$ by

$$\xi_i = \ln\left( \frac{S(t_i)}{S(t_{i-1})} \right) = \left( \alpha - \frac{1}{2} \sigma^2 \right) \Delta t + \sigma [W(t_i) - W(t_{i-1})]. \tag{3.10}$$

We can see from the above that $\xi_1, \xi_2, \cdots, \xi_n$ are i.i.d. with

$$E[\xi_i] = \left( \alpha - \frac{1}{2} \sigma^2 \right) \Delta t,$$

$$\mathrm{Var}[\xi_i] = \sigma^2 \Delta t.$$

Letting $\sigma_\xi$ denote the standard deviation of $\xi_i$, we have

$$\sigma = \frac{\sigma_\xi}{\sqrt{\Delta t}}.$$

We can estimate $\sigma_\xi$ from the sample by

$$\sigma_\xi^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (\xi_i - \bar{\xi})^2,$$

$$\bar{\xi} = \frac{1}{n} \sum_{i=1}^{n} \xi_i.$$
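The estimator above can be sketched as follows. The example is illustrative and uses simulated prices rather than market data: the drift, volatility, daily step $\Delta t = 1/252$ and sample size are assumptions chosen only so that the true value of $\sigma$ is known.

```python
import numpy as np

def historic_volatility(prices, dt):
    """Estimate sigma from equidistant prices via the log returns of Equation (3.10)."""
    xi = np.diff(np.log(prices))        # log returns xi_1, ..., xi_n
    sigma_xi = np.std(xi, ddof=1)       # sample standard deviation with divisor n - 1
    return float(sigma_xi / np.sqrt(dt))

# Simulated geometric Brownian motion with known volatility (assumed parameters).
rng = np.random.default_rng(4)
alpha, sigma, dt, n = 0.05, 0.3, 1.0 / 252.0, 50_000
xi = (alpha - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * rng.normal(size=n)
prices = 100.0 * np.exp(np.concatenate(([0.0], np.cumsum(xi))))

sigma_hat = historic_volatility(prices, dt)
print(sigma_hat)   # should be close to the true value 0.3
```

The argument `ddof=1` selects the $1/(n - 1)$ normalization used in the formula for $\sigma_\xi^2$.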

3.1.2 Implied volatility

When valuing a European call option using the historic volatility, there is the problem that the volatility is not constant over time. Instead of the historic volatility, one can use the implied volatility. The implied volatility of a European option is the value of $\sigma$ which, when inserted into the Black-Scholes formula, returns a price equal to the market price. Since the option price contains some information about the future, the implied volatility is said to be forward-looking.

Denoting the market call option price by $p$, the strike price by $K$, the current underlying price by $s$, and the time to maturity by $T$, the Black-Scholes formula can be written as

$$p = c(s, t, T, r, \sigma, K). \tag{3.11}$$

Obtaining the implied volatility means solving the equation above for $\sigma$. Since the option price is an increasing function of $\sigma$, there is a unique solution. However, there is no closed form for the implied volatility. Equation (3.11) can be solved by numerical methods, such as Newton's method. Before performing the Newton iteration, we need an initial guess $\sigma_0$ for the volatility. It is mentioned in [3] that a good initial guess is given by the Brenner-Subrahmanyam formula:

$$\sigma_0 = \frac{p \sqrt{2\pi}}{s \sqrt{T}}. \tag{3.12}$$

Then we use $\sigma_0$ as the initial value and iterate the formula

$$\sigma_{n+1} = \sigma_n - \frac{c(\sigma_n) - p}{c'(\sigma_n)} \tag{3.13}$$

until we reach a solution of sufficient accuracy. Here,

$$c'(\sigma_n) = \frac{\partial c}{\partial \sigma}(\sigma_n),$$

which is called the vega of the option.
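The Newton iteration (3.13) with the starting value (3.12) can be sketched as below. This is a minimal illustration, not a production solver: the function names, tolerance and iteration cap are hypothetical, `tau` denotes the time to maturity, and the vega is taken from the standard closed form $s \sqrt{\tau}\, N'(d_1)$, which is not derived in the text.

```python
from math import erf, exp, log, pi, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def bs_call(s, K, r, sigma, tau):
    """European call price, Equation (3.9), with tau the time to maturity."""
    d1 = (log(s / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    return s * norm_cdf(d1) - exp(-r * tau) * K * norm_cdf(d1 - sigma * sqrt(tau))

def bs_vega(s, K, r, sigma, tau):
    """Derivative of the call price with respect to sigma (standard closed form)."""
    d1 = (log(s / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    return s * sqrt(tau) * norm_pdf(d1)

def implied_volatility(p, s, K, r, tau, tol=1e-10, max_iter=50):
    """Solve Equation (3.11) for sigma by Newton's method, Equation (3.13)."""
    sigma = p * sqrt(2.0 * pi) / (s * sqrt(tau))   # Brenner-Subrahmanyam guess (3.12)
    for _ in range(max_iter):
        diff = bs_call(s, K, r, sigma, tau) - p
        if abs(diff) < tol:
            return sigma
        sigma -= diff / bs_vega(s, K, r, sigma, tau)
    raise RuntimeError("Newton iteration did not converge")

# Round trip on assumed inputs: price an option at sigma = 0.25, then recover sigma.
true_sigma = 0.25
p = bs_call(100.0, 100.0, 0.01, true_sigma, 0.5)
recovered = implied_volatility(p, 100.0, 100.0, 0.01, 0.5)
print(recovered)
```

The Brenner-Subrahmanyam guess is most accurate for options close to at-the-money; for strikes far from the current price a safeguarded method such as bisection may be a more robust choice.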

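Dupire's formula

The implied volatility in Equation (3.11) is based on the model with constant volatility. In the real market, the volatility depends strongly on the strike price and the maturity. Bruno Dupire [6] derived a formula based on the local volatility model to generate a volatility surface. Suppose a stock follows the $Q$ dynamics

$$dS(t) = r S(t)\,dt + \sigma(t, S(t)) S(t)\,dW(t), \tag{3.14}$$

where $r$ is the constant interest rate. The transition probability density $\varphi(t, s)$ of $S$ satisfies the Fokker-Planck equation

$$\frac{\partial \varphi(T, s)}{\partial T} = -\frac{\partial [r s \varphi(T, s)]}{\partial s} + \frac{1}{2} \frac{\partial^2 [\sigma^2 s^2 \varphi(T, s)]}{\partial s^2}. \tag{3.15}$$

The price of a European call option can be written as

$$\begin{aligned}
C &= e^{-rT} E^{Q}[\max(S_T - K, 0)] \\
&= e^{-rT} \int_K^{\infty} (s - K) \varphi(T, s)\,ds \\
&= e^{-rT} \int_K^{\infty} s \varphi(T, s)\,ds - e^{-rT} K \int_K^{\infty} \varphi(T, s)\,ds.
\end{aligned} \tag{3.16}$$

Differentiating Equation (3.16) with respect to $K$ yields

$$\frac{\partial C}{\partial K} = -e^{-rT} \int_K^{\infty} \varphi(T, s)\,ds, \tag{3.17}$$

and

$$\frac{\partial^2 C}{\partial K^2} = e^{-rT} \varphi(T, K). \tag{3.18}$$

Differentiating Equation (3.16) with respect to $T$ gives

$$\frac{\partial C}{\partial T} = -rC + e^{-rT} \int_K^{\infty} (s - K) \frac{\partial \varphi(T, s)}{\partial T}\,ds, \tag{3.19}$$

and inserting the Fokker-Planck equation into (3.19) yields

$$\frac{\partial C}{\partial T} = -rC + e^{-rT} \int_K^{\infty} (s - K) \left\{ -\frac{\partial [r s \varphi(T, s)]}{\partial s} + \frac{1}{2} \frac{\partial^2 [\sigma^2 s^2 \varphi(T, s)]}{\partial s^2} \right\} ds. \tag{3.20}$$

Integrating Equation (3.20) by parts and rearranging terms, using (3.16), (3.17) and (3.18), we obtain

$$\begin{aligned}
\frac{\partial C}{\partial T} &= -rC + r e^{-rT} \int_K^{\infty} s \varphi(T, s)\,ds + \frac{1}{2} e^{-rT} \sigma^2 K^2 \varphi(T, K) \\
&= -rC + r \left( C - K \frac{\partial C}{\partial K} \right) + \frac{1}{2} \sigma^2 K^2 \frac{\partial^2 C}{\partial K^2}.
\end{aligned}$$

Then we get Dupire's formula

$$\sigma^2(T, K) = 2\, \frac{\frac{\partial C}{\partial T} + r K \frac{\partial C}{\partial K}}{K^2 \frac{\partial^2 C}{\partial K^2}}. \tag{3.21}$$

3.2 Correlation

3.2.1 Correlation between two random variables

The correlation between two random variables $X$ and $Y$ is defined as

$$\rho_{X,Y} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}, \tag{3.22}$$

where $\sigma_X$ and $\sigma_Y$ are the standard deviations of the variables and $\mathrm{cov}(X, Y)$ is the covariance. The covariance between $X$ and $Y$ is defined by

$$\mathrm{cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y]. \tag{3.23}$$

For a sample, the correlation is estimated by the Pearson product-moment correlation coefficient

$$\hat{\rho}_{X,Y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \cdot \sum_{i=1}^{n} (y_i - \bar{y})^2}},$$

where $x_i$ and $y_i$ are the observations of $X$ and $Y$, while $\bar{x}$ and $\bar{y}$ are the means of each set of observations.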

3.2.2 Basket correlation and implied correlation

Let $I$ be the value of a basket or index of $n$ assets,

$$I_t = \sum_{i=1}^{n} \omega_i S_t^i,$$

where $\sum_{i=1}^{n} \omega_i = 1$. The differential form is

$$dI_t = \sum_{i=1}^{n} \omega_i\,dS_t^i.$$

We assume that each asset $S^i$ is modeled as a geometric Brownian motion (GBM),

$$dS_t^i = \mu_i S_t^i\,dt + \sigma_i S_t^i\,dW_t^i. \tag{3.24}$$

We know that the index $I$ is not a GBM in general. To make it easier to work with, we approximate the index by

$$\frac{dI_t}{I_t} = \sum_{i=1}^{n} \omega_i \frac{dS_t^i}{S_t^i}.$$

Then $I$ is a GBM and the volatility of $I$ can be expressed by

$$\sigma_I^2 = \sum_{i=1}^{n} \omega_i^2 \sigma_i^2 + \sum_{i \ne j} \omega_i \omega_j \sigma_i \sigma_j \rho_{ij}, \tag{3.25}$$

where $\rho_{ij}$ is the correlation between $S^i$ and $S^j$,

$$d\langle W^i, W^j \rangle_t = \rho_{ij}\,dt. \tag{3.26}$$

This approximation is widely used (see [1], [13] and [15]). If we assume that all these correlations are the same, then the correlation is called the equicorrelation or basket correlation and

$$\rho = \frac{\sigma_I^2 - \sum_{i=1}^{n} \omega_i^2 \sigma_i^2}{\sum_{i \ne j} \omega_i \omega_j \sigma_i \sigma_j}. \tag{3.27}$$

Then the correlation matrix between the variables is

$$R = \begin{pmatrix}
1 & \rho & \rho & \cdots & \rho \\
\rho & 1 & \rho & \cdots & \rho \\
\rho & \rho & 1 & \cdots & \rho \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho & \rho & \rho & \cdots & 1
\end{pmatrix}_{n \times n}.$$

If $-\frac{1}{n-1} < \rho < 1$, then $R$ is positive semidefinite (see [14]). If $n$ is sufficiently large, then $-\frac{1}{n-1} \approx 0$. For this reason, we consider $0 \le \rho < 1$. If $\rho = 0$, we have

$$\sigma_{I,\min}^2 = \sum_{i=1}^{n} \omega_i^2 \sigma_i^2. \tag{3.28}$$

If $\rho = 1$, then

$$\sigma_{I,\max}^2 = \sum_{i=1}^{n} \omega_i^2 \sigma_i^2 + \sum_{i \ne j} \omega_i \omega_j \sigma_i \sigma_j. \tag{3.29}$$

Hence, from Equations (3.28) and (3.29), it can be derived that

$$\rho = \frac{\sigma_I^2 - \sigma_{I,\min}^2}{\sigma_{I,\max}^2 - \sigma_{I,\min}^2}. \tag{3.30}$$

We can see from Equation (3.30) that the basket correlation measures how far the realized basket variance is from the minimal basket variance.

Implied correlation

The implied basket correlation, like the implied volatility, can be inferred from option prices. We can obtain the implied volatilities of the index and of its constituents from European options. The implied correlation is then calculated by inserting these implied volatilities into Equation (3.27).
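Equation (3.27) is simple to evaluate once the volatilities are known. The sketch below is illustrative: the weights and volatilities are made-up numbers, the function names are hypothetical, and the index volatility is generated from Equation (3.25) with a known equicorrelation so that the round trip can be checked.

```python
import numpy as np

def index_variance(weights, sigmas, rho):
    """sigma_I^2 from Equation (3.25) when all pairwise correlations equal rho."""
    x = np.asarray(weights) * np.asarray(sigmas)
    diag = np.sum(x ** 2)                  # sum_i w_i^2 sigma_i^2
    cross = np.sum(x) ** 2 - diag          # sum_{i != j} w_i w_j sigma_i sigma_j
    return float(diag + rho * cross)

def basket_correlation(sigma_index, weights, sigmas):
    """Basket (equi)correlation from Equation (3.27)."""
    x = np.asarray(weights) * np.asarray(sigmas)
    diag = np.sum(x ** 2)
    cross = np.sum(x) ** 2 - diag
    return float((sigma_index ** 2 - diag) / cross)

# Assumed inputs for illustration only.
weights = np.array([0.5, 0.3, 0.2])
sigmas = np.array([0.20, 0.30, 0.25])
sigma_index = np.sqrt(index_variance(weights, sigmas, rho=0.4))

rho_implied = basket_correlation(sigma_index, weights, sigmas)
print(rho_implied)   # recovers the correlation used to build sigma_index
```

The cross sum over $i \ne j$ is obtained from the identity $\sum_{i \ne j} x_i x_j = (\sum_i x_i)^2 - \sum_i x_i^2$, which avoids a double loop.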

3.3 Multidimensional models with correlations

In this section, we will introduce a multidimensional model. Suppose there are $n$ stocks in the market with dynamics under the objective measure $P$,

$$\frac{dS_t^i}{S_t^i} = \mu_t^i\,dt + \sigma_t^i\,dW_t^i, \quad i = 1, 2, \cdots, n, \tag{3.31}$$

where $\mu^i$ and $\sigma^i$ are adapted processes satisfying

$$P\left( \int_0^t \left( |\mu_s^i| + |\sigma_s^i|^2 \right) ds < \infty, \text{ for all } t \ge 0 \right) = 1.$$

Assume that the $n$ Wiener processes are correlated and

$$d\langle W^i, W^j \rangle_t = dW_t^i\,dW_t^j = \begin{cases} \rho_{ij}(t)\,dt, & \text{if } i \ne j, \\ dt, & \text{if } i = j. \end{cases} \tag{3.32}$$

Let $\rho$ denote the correlation matrix. We know from Section 2.3.1 that $W = (W^1, W^2, \cdots, W^n)$ can be constructed from an $n$-dimensional Wiener process $V$ whose components are independent,

$$dW_t^i = \sum_{k=1}^{n} c_{ik}(t)\,dV_t^k,$$

where $c_{ik}$ is a component of the matrix $C$ and $\rho = C C^{\star}$. According to Girsanov's theorem, under the risk neutral measure $Q$,

$$dV_t^k = \varphi_k(t)\,dt + dV_t^{k,Q}.$$

Then

$$\begin{aligned}
dW_t^i &= \sum_{k=1}^{n} c_{ik}(t) \left( \varphi_k(t)\,dt + dV_t^{k,Q} \right) \\
&= \sum_{k=1}^{n} c_{ik}(t) \varphi_k(t)\,dt + \sum_{k=1}^{n} c_{ik}(t)\,dV_t^{k,Q}.
\end{aligned}$$

Writing $dW_t^{i,Q} = \sum_{k=1}^{n} c_{ik}(t)\,dV_t^{k,Q}$ and noting that the $dt$ terms do not contribute to the covariation, the covariation process of $W$ under $Q$ can be calculated as

$$d\langle W^{i,Q}, W^{j,Q} \rangle_t = \sum_{k=1}^{n} c_{ik}(t) c_{jk}(t)\,dt = \rho_{ij}(t)\,dt. \tag{3.33}$$

Hence, we see that the correlation matrix is invariant under measure changes.

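We will now explore how the correlation $\rho$ is related to the dynamics of $S^i$. Applying Itô's formula to the dynamics of each asset $S^i$, we have

$$d \ln S_t^i = \left[ \mu_t^i - \frac{1}{2} (\sigma_t^i)^2 \right] dt + \sigma_t^i\,dW_t^i. \tag{3.34}$$

Hence,

$$d\langle \ln S^i \rangle_t = (\sigma_t^i)^2\,d\langle W^i \rangle_t = (\sigma_t^i)^2\,dt, \tag{3.35}$$

$$d\langle \ln S^i, \ln S^j \rangle_t = \sigma_t^i \sigma_t^j\,d\langle W^i, W^j \rangle_t = \sigma_t^i \sigma_t^j \rho_{ij}(t)\,dt. \tag{3.36}$$

The integral forms of the equations above are

$$\langle \ln S^i \rangle_t = \int_0^t (\sigma_s^i)^2\,ds, \tag{3.37}$$

$$\langle \ln S^i, \ln S^j \rangle_t = \int_0^t \sigma_s^i \sigma_s^j \rho_{ij}(s)\,ds. \tag{3.38}$$

In the next section, we will clarify how to estimate $\int_0^t (\sigma_s^i)^2\,ds$ and $\int_0^t \rho_{ij}(s)\,ds$ using asset price data.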

3.3.1 Constant volatility model

In this section we assume that $\sigma_t^i$ and $\mu_t^i$ are constant in the model (3.31), and let $\sigma_t^i = \sigma_i$ and $\mu_t^i = \mu_i$. For estimating the correlation from market price data, there is an interesting result mentioned in [15].

Definition 3.1. Consider a partition $\Pi$ of the interval $[0, T]$:

$$\Pi : 0 = t_0 < t_1 < \cdots < t_{n-1} < t_n = T.$$

Then the realized correlation between two assets $S^1, S^2$ is defined as

$$\hat{\rho}_T(S^1, S^2) = \frac{\sum_{i=1}^{n} \left( \ln \frac{S_{t_i}^1}{S_{t_{i-1}}^1} - \bar{S}^1 \right) \left( \ln \frac{S_{t_i}^2}{S_{t_{i-1}}^2} - \bar{S}^2 \right)}{\sqrt{\sum_{i=1}^{n} \left( \ln \frac{S_{t_i}^1}{S_{t_{i-1}}^1} - \bar{S}^1 \right)^2 \cdot \sum_{i=1}^{n} \left( \ln \frac{S_{t_i}^2}{S_{t_{i-1}}^2} - \bar{S}^2 \right)^2}}, \tag{3.39}$$

where

$$\bar{S}^k = \frac{1}{n} \sum_{i=1}^{n} \ln \frac{S_{t_i}^k}{S_{t_{i-1}}^k}, \quad k = 1, 2. \tag{3.40}$$

This definition is widely used in financial markets to compute the correlation between two assets. The correlation is estimated from the log returns of the assets instead of the prices. We will see that this definition is consistent with the model.
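Definition 3.1 can be implemented in a few lines. The sketch below is an illustration on simulated data, not on the market data used in the thesis: two geometric Brownian motions with assumed constant volatilities and a constant correlation of 0.6 are generated, and the realized correlation of Equation (3.39) is computed from their prices.

```python
import numpy as np

def realized_correlation(prices_1, prices_2):
    """Realized correlation of two price series, Equations (3.39) and (3.40)."""
    r1 = np.diff(np.log(prices_1))   # log returns ln(S_{t_i} / S_{t_{i-1}})
    r2 = np.diff(np.log(prices_2))
    d1 = r1 - r1.mean()              # subtract the mean log return, Equation (3.40)
    d2 = r2 - r2.mean()
    return float(np.sum(d1 * d2) / np.sqrt(np.sum(d1 ** 2) * np.sum(d2 ** 2)))

# Simulated example with assumed parameters.
rng = np.random.default_rng(5)
n, dt, rho = 50_000, 1.0 / 252.0, 0.6
mu = np.array([0.05, 0.08])
sig = np.array([0.20, 0.30])

z1, z2 = rng.normal(size=(2, n))
dW1 = np.sqrt(dt) * z1
dW2 = np.sqrt(dt) * (rho * z1 + np.sqrt(1.0 - rho ** 2) * z2)   # correlated increments

log_ret_1 = (mu[0] - 0.5 * sig[0] ** 2) * dt + sig[0] * dW1
log_ret_2 = (mu[1] - 0.5 * sig[1] ** 2) * dt + sig[1] * dW2
S1 = 100.0 * np.exp(np.concatenate(([0.0], np.cumsum(log_ret_1))))
S2 = 50.0 * np.exp(np.concatenate(([0.0], np.cumsum(log_ret_2))))

rho_hat = realized_correlation(S1, S2)
print(rho_hat)   # should be close to 0.6
```

The estimator does not depend on the drifts or on the initial price levels, since only centered log returns enter the formula.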

23 Theorem 3.1. Assume

k dSt k k = µkdt + σkdWt , k = 1, 2 St Then 1 2 1 2 hln S , ln S i ρˆT (S ,S ) → in probability as ||Π|| → 0 σ1σ2T The proof can be found in [15].

Similar to Equation (3.36), we have

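$$d\langle \ln S^1, \ln S^2\rangle_t = \sigma_1\sigma_2\,d\langle W^1, W^2\rangle_t = \sigma_1\sigma_2\rho_{12}(t)\,dt. \tag{3.41}$$

Then

$$\Upsilon_T = \frac{\langle \ln S^1, \ln S^2\rangle_T}{\sigma_1\sigma_2 T} = \frac{1}{T}\int_0^T \rho_t\,dt. \tag{3.42}$$

Then we can estimate $\Upsilon_T$ by $\hat\rho_T$. Although Equation (3.42) is derived under the constant volatility model, we will still use it to estimate the market realized correlation.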
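As an illustration of Definition 3.1, the realized correlation in Equation (3.39) can be computed from two price series along the lines of the following minimal Python sketch. The function name `realized_correlation` and the simulated paths (constant volatilities 0.2 and 0.3, constant correlation 0.6, daily steps) are assumptions made only for this example and are not taken from the thesis data.

```python
import numpy as np

def realized_correlation(prices_1, prices_2):
    """Realized correlation (3.39) of two price series on the same time grid."""
    # Log returns ln(S_{t_i} / S_{t_{i-1}}) of each asset.
    r1 = np.diff(np.log(np.asarray(prices_1, dtype=float)))
    r2 = np.diff(np.log(np.asarray(prices_2, dtype=float)))
    # Deviations from the mean log return, as in (3.40).
    d1 = r1 - r1.mean()
    d2 = r2 - r2.mean()
    return float(np.sum(d1 * d2) / np.sqrt(np.sum(d1**2) * np.sum(d2**2)))

# Hypothetical check: two geometric Brownian motions with correlation 0.6.
rng = np.random.default_rng(0)
n, dt, rho = 20_000, 1.0 / 252, 0.6
z1 = rng.standard_normal(n)
z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
s1 = 100.0 * np.exp(np.cumsum((0.05 - 0.5 * 0.2**2) * dt + 0.2 * np.sqrt(dt) * z1))
s2 = 50.0 * np.exp(np.cumsum((0.03 - 0.5 * 0.3**2) * dt + 0.3 * np.sqrt(dt) * z2))
print(realized_correlation(s1, s2))
```

With a constant correlation, the estimate should be close to the value used in the simulation, in line with Theorem 3.1 and Equation (3.42).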

Remark. Take $S^1 = S^2$ in Equation (3.39) and consider

$$V_T(S^1) = \sum_{i=1}^{n}\left(\ln\frac{S^1_{t_i}}{S^1_{t_{i-1}}} - \bar S^1\right)^2. \tag{3.43}$$

Then

$$V_T(S^1) \to \langle \ln S^1\rangle_T \quad \text{in probability as } \|\Pi\| \to 0.$$

We know that

$$\langle \ln S^1\rangle_T = \int_0^T (\sigma_1)^2\,dt. \tag{3.44}$$

In fact, $V_T(S)$ is used in practice for variance swaps.

3.4 Multi-asset derivatives

After the introduction of the theoretical background, we will discuss the correlation market mentioned in Section 1.

3.4.1 Worst-of options

Before discussing the worst-of option, we first introduce the concept of dispersion. High (low) dispersion means that the asset returns move quite differently from (similarly to) each other. Worst-of options are written on the worst stock return. The payoffs of worst-of options are

$$\text{worst-of call payoff} = \max\left(0,\ \min_{1\le i\le n}\frac{S_T^i}{S_0^i} - K\right),$$

$$\text{worst-of put payoff} = \max\left(0,\ K - \min_{1\le i\le n}\frac{S_T^i}{S_0^i}\right).$$

It is shown in [5] that the prices of worst-of options are highly sensitive to the correlation between assets. Higher dispersion leads to a lower payoff for the worst-of call option. Since lower correlation leads to higher dispersion, lower correlations lead to lower payoffs for worst-of call options. If two stocks are perfectly negatively correlated, which means that the correlation is $-1$, then the payoff of a worst-of call option will be quite low. Therefore a buyer of a worst-of call option is long correlation. For worst-of put options, lower correlation leads to highly dispersed returns of the underlying assets, so lower correlations lead to higher payoffs. A buyer of a worst-of put option is therefore short correlation.
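The payoff definitions and the correlation sensitivity described above can be illustrated with a short Python sketch. Everything specific in it is an assumption for illustration only: the helper names, the two-asset setting, a zero interest rate, a common volatility of 0.2, a one-year maturity and a one-step log-normal simulation.

```python
import numpy as np

def worst_of_payoffs(s0, sT, strike):
    """Return (call payoff, put payoff) written on the worst performance S_T^i / S_0^i."""
    worst = float(np.min(np.asarray(sT, dtype=float) / np.asarray(s0, dtype=float)))
    return max(0.0, worst - strike), max(0.0, strike - worst)

def mc_worst_of_call(rho, n_paths=100_000, sigma=0.2, T=1.0, strike=1.0, seed=0):
    """Average undiscounted worst-of call payoff for two assets with correlation rho."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n_paths)
    z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_paths)
    z = np.column_stack([z1, z2])
    # Performances S_T / S_0 of two driftless log-normal assets.
    perf = np.exp(-0.5 * sigma**2 * T + sigma * np.sqrt(T) * z)
    return float(np.maximum(perf.min(axis=1) - strike, 0.0).mean())

print(worst_of_payoffs([100.0, 50.0], [110.0, 45.0], strike=1.0))
print(mc_worst_of_call(0.9), mc_worst_of_call(-0.9))
```

Under these assumptions the average call payoff is larger for the high correlation than for the strongly negative one, which matches the statement that the buyer of a worst-of call is long correlation.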

3.4.2 Dispersion trading

An index $I_t$ is a weighted average of $n$ stock prices:

$$I_t = \frac{1}{D}\sum_{i=1}^{n}\omega_i S_t^i, \tag{3.45}$$

where $D$ is the divisor and $\omega_i$ is the weight of asset $i$:

$$\sum_{i=1}^{n}\omega_i = 1. \tag{3.46}$$

The divisor is a number chosen at inception. Usually, it is chosen to make the initial value of the index equal to a certain number like 100 or 1000. The divisor may be adjusted over time to keep the value of the index invariant to unrelated changes of its components, such as a stock split. A dispersion trade, also called dispersion trading, means that we sell an index option and buy the options on its components. Assuming $D = 1$, the payoff of the dispersion trade is

$$\sum_i \omega_i \times \text{single option payoff} - \text{basket option payoff}.$$

Let $P_T$ denote the payoff of dispersion trading using put options and $C_T$ the payoff using call options, where $T$ is the time to maturity. Then,

$$C_T = \sum_i \omega_i \max\{S_T^i - K, 0\} - \max\{I_T - K, 0\}, \quad\text{and}$$

$$P_T = \sum_i \omega_i \max\{K - S_T^i, 0\} - \max\{K - I_T, 0\}.$$

Using the identity

$$\max\{a, b\} = \frac{a+b}{2} + \frac{|a-b|}{2},$$

together with $I_T = \sum_i \omega_i S_T^i$ and $\sum_i \omega_i = 1$, we have

$$C_T = P_T = \sum_{i=1}^{n}\omega_i\frac{|S_T^i - K|}{2} - \frac{|I_T - K|}{2}. \tag{3.47}$$

Equation (3.47) shows that the payoff of dispersion trading is the same for call and put options. In fact, $|S_T^i - K|$ is the payoff of a straddle position on stock $i$, and $|I_T - K|$ is the payoff of a straddle position on the index. The intuition behind dispersion trades is that the basket option provides exposure to volatility and correlation. Assume the asset $S^i$ follows a geometric Brownian motion (GBM), and thus has a log-normal distribution. Although the dynamics of the index cannot be a GBM, since there is no explicit distribution function for the sum of log-normal variables, we can approximate the volatility of the index by

$$\sigma_{\text{index}}^2 = \sum_i \omega_i^2\sigma_i^2 + \sum_{i\neq j}\omega_i\omega_j\sigma_i\sigma_j\rho_{ij}, \tag{3.48}$$

where $\sigma_i$ is the volatility of $S^i$. Here, $\rho_{ij}$ is the correlation between the assets $S^i$ and $S^j$, and we will talk about it more in the following sections. Since the price of the option is not linearly related to the volatility, we can only approximately isolate the correlations in Equation (3.48). In addition, we can also see from Equation (3.48) that dispersion trading is a short position in correlation. Trades are usually profitable when the individual stocks are not strongly correlated and lose money when correlation rises.
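The identity (3.47) and the approximation (3.48) can be checked numerically with a small Python sketch. The helper names are hypothetical, the divisor is taken as $D = 1$ as in the text, and a single common correlation `rho` for all pairs is an assumption made to keep the example short.

```python
import numpy as np

def dispersion_payoffs(weights, sT, strike):
    """Return (C_T, P_T): weighted single-option payoffs minus the index option payoff."""
    w = np.asarray(weights, dtype=float)
    s = np.asarray(sT, dtype=float)
    index = float(w @ s)  # index level with divisor D = 1
    call = float(w @ np.maximum(s - strike, 0.0)) - max(index - strike, 0.0)
    put = float(w @ np.maximum(strike - s, 0.0)) - max(strike - index, 0.0)
    return call, put

def index_volatility(weights, vols, rho):
    """Index volatility from (3.48), assuming rho_ij = rho for all i != j."""
    wv = np.asarray(weights, dtype=float) * np.asarray(vols, dtype=float)
    corr = np.full((wv.size, wv.size), float(rho))
    np.fill_diagonal(corr, 1.0)
    return float(np.sqrt(wv @ corr @ wv))

print(dispersion_payoffs([0.5, 0.5], [120.0, 90.0], strike=100.0))
print(index_volatility([0.5, 0.5], [0.2, 0.3], rho=0.2),
      index_volatility([0.5, 0.5], [0.2, 0.3], rho=0.8))
```

The call and put versions of the trade give the same payoff, and the approximate index volatility increases with the correlation, which is why selling the index option is a short correlation position.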

3.4.3 Variance and volatility swaps

The variance swap

Variance and volatility swaps are forward contracts whose payoff is related to the variation of asset returns. The payoff of a variance swap is

$$N_{\text{var}}\cdot(\Sigma^2 - K_{\text{var}}), \tag{3.49}$$

where

• $N_{\text{var}}$ is the notional amount,

• $\Sigma^2$ is the realized variance during the swap life, and

• $K_{\text{var}}$ is the strike, which is determined at inception such that the fair value of the swap is zero.

Here

$$\Sigma^2 = \frac{1}{T}\int_0^T \sigma^2(t)\,dt, \tag{3.50}$$

and $\sigma(t)$ is the volatility of the underlying asset. In practice, the realized variance is computed by

$$\Sigma^2 = \sum_{t=0}^{T}\left(\ln\frac{S_t}{S_{t-1}} - \bar S\right)^2,$$

where

$$\bar S = \frac{1}{T}\sum_{t=0}^{T}\ln\frac{S_t}{S_{t-1}}.$$

Note that the log return instead of the price is used here to compute the variance, which is consistent with our model.

The volatility swap

The payoff of a volatility swap is

$$N_{\text{vol}}\cdot(\Sigma - K_{\text{vol}}),$$

where

• $N_{\text{vol}}$ is the notional amount,

• $\Sigma$ is the realized volatility during the swap life, i.e. the square root of the realized variance in Equation (3.50), and

• $K_{\text{vol}}$ is the strike, which is chosen such that the fair value of the swap is zero.

Unlike vanilla options, variance and volatility swaps are purely exposed to the volatility. If the variance swap is written on an index, then it provides exposure to both volatility and correlations.
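A minimal Python sketch of the discrete realized variance and the resulting variance swap payoff (3.49) is given below. The function names, the optional `horizon_years` argument (which divides by the swap life so that the result is on the scale of Equation (3.50)) and the sample prices are assumptions for illustration.

```python
import numpy as np

def realized_variance(prices, horizon_years=None):
    """Sum of squared deviations of log returns from their mean.

    If horizon_years is given, the sum is divided by it (an assumption made
    here to put the number on the per-year scale of Equation (3.50)).
    """
    r = np.diff(np.log(np.asarray(prices, dtype=float)))
    total = float(np.sum((r - r.mean()) ** 2))
    return total if horizon_years is None else total / horizon_years

def variance_swap_payoff(prices, strike_var, notional=1.0, horizon_years=None):
    """Payoff N_var * (Sigma^2 - K_var) for the observed price path."""
    return notional * (realized_variance(prices, horizon_years) - strike_var)

prices = [100.0, 110.0, 99.0, 108.9]  # hypothetical observations
print(realized_variance(prices))
print(variance_swap_payoff(prices, strike_var=0.01, notional=1000.0))
```

A volatility swap payoff would use the square root of the same realized variance in place of $\Sigma^2$.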

3.4.4 The correlation swap

There is a direct way to trade correlations called the correlation swap. The payoff of a correlation swap is

$$N\cdot(\rho_{\text{realized}} - \rho_{\text{strike}}),$$

where $N$ is the notional amount and $\rho_{\text{strike}}$ is the strike, which is determined at the initial time such that the fair value of the swap is zero. The realized correlation $\rho_{\text{realized}}$ is computed using

$$\rho_{\text{realized}} = \frac{\sum_{i\neq j}\omega_i\omega_j\rho_{ij}}{\sum_{i\neq j}\omega_i\omega_j}, \tag{3.51}$$

where $\omega_i$ is the weight of asset $i$ and $\rho_{ij}$ is the correlation between asset $i$ and asset $j$. The correlation $\rho_{ij}$ is usually calculated from the daily log returns of the assets:

$$\rho_{ij} = \frac{\sum_{k=1}^{n}\left(\ln\frac{S^i_{t_k}}{S^i_{t_{k-1}}} - \bar S^i\right)\left(\ln\frac{S^j_{t_k}}{S^j_{t_{k-1}}} - \bar S^j\right)}{\sqrt{\sum_{k=1}^{n}\left(\ln\frac{S^i_{t_k}}{S^i_{t_{k-1}}} - \bar S^i\right)^2 \cdot \sum_{k=1}^{n}\left(\ln\frac{S^j_{t_k}}{S^j_{t_{k-1}}} - \bar S^j\right)^2}}.$$

If we use equal weights, then

$$\rho_{\text{realized}} = \frac{1}{n(n-1)}\sum_{i\neq j}\rho_{ij}.$$
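The weighted average in Equation (3.51) can be sketched in Python as follows. The function name and the layout of the `prices` array (one row per observation date, one column per asset) are assumptions; `numpy.corrcoef` is used for the pairwise correlations because the Pearson coefficient of the log returns coincides with the formula for $\rho_{ij}$ above.

```python
import numpy as np

def realized_basket_correlation(prices, weights=None):
    """Weighted average of the pairwise log-return correlations, as in (3.51)."""
    returns = np.diff(np.log(np.asarray(prices, dtype=float)), axis=0)
    corr = np.corrcoef(returns, rowvar=False)  # matrix of rho_ij
    n = corr.shape[0]
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, dtype=float)
    ww = np.outer(w, w)
    off_diag = ~np.eye(n, dtype=bool)          # keep only the pairs i != j
    return float(np.sum(ww[off_diag] * corr[off_diag]) / np.sum(ww[off_diag]))

# Hypothetical price history: 250 observations of 4 assets.
rng = np.random.default_rng(1)
prices = 100.0 * np.exp(np.cumsum(0.01 * rng.standard_normal((250, 4)), axis=0))
print(realized_basket_correlation(prices))
```

With `weights=None` the function reduces to the equal-weight average over the $n(n-1)$ ordered pairs.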

4 The stochastic correlation model

4.1 A general model

First, we introduce a general market model. We consider a market containing $N$ assets. Each asset has the dynamics

$$\frac{dS_t^i}{S_t^i} = r\,dt + \sigma_t^i\,dW_t^i, \quad i = 1, 2, \cdots, N. \tag{4.1}$$

Here, $r$ is the risk-free interest rate, which replaces the local mean rate $\mu^i$ of $S^i$ under the risk-neutral measure, and $\sigma^i$ is the volatility, an adapted process with sufficient integrability. The process $W^i$ is a standard Wiener process under the risk-neutral measure $Q$. The $N$ Wiener processes are correlated with a correlation matrix $R = (\rho_{ij})_{N\times N}$:

$$d\langle W^i, W^j\rangle_t = \rho_{ij}(t)\,dt. \tag{4.2}$$

Each component of the correlation matrix ρij will be a stochastic process. Although this setup is ideal, it is difficult to realize because of the huge amount of computation.

4.2 A simplified multi-asset model

A simplified model has been suggested in [7] and [17], which makes the correlation matrix depend on one stochastic process. In this thesis, we will focus on this simplified model. The properties of the model will be studied in detail. We will also follow the guideline in [17] to calibrate this model with some market data. The model setup is similar to Equation (4.1), but with $\sigma^i$ being the local volatility. The local volatility $\sigma^i(S^i, t)$ is a function of the asset $S^i$ and the time $t$. For the correlation between the Wiener processes, the model assumes that all the correlation processes are derived from one source:

$$d\langle W^i, W^j\rangle_t = \rho_{ij}(t)\,dt = \rho(t)\,dt. \tag{4.3}$$

Since the correlation is invariant under measure transformation, we do not distinguish the measure in Equation (4.3). We will write $\rho_t$ for $\rho(t)$ for convenience. We assume the correlation process $\rho_t$ has the following dynamics under $Q$:

$$d\rho_t = \alpha(\bar\rho - \rho_t)\,dt + \beta\sqrt{(1-\rho_t)(\rho_t - \rho_{\min})}\,d\hat W_t. \tag{4.4}$$

Here $\alpha, \beta > 0$ and $\rho_{\min} < \bar\rho < 1$. In fact, $\rho_{\min}$ is the lower bound of $\rho_t$ and $\beta$ is called the volatility of correlation. In this model, we only need to deal with one stochastic process instead of the whole matrix. The Wiener process $\hat W$ in (4.4) is correlated with the other Wiener processes:

$$d\langle \hat W, W^i\rangle_t = \rho_{CS}\,dt, \quad i = 1, 2, \cdots, N, \tag{4.5}$$

where $\rho_{CS}$ is a constant standing for the correlation between stocks and their correlations. $\rho_{CS}$ is called the correlation skew in this thesis. Then we have an $(N+1)$-dimensional Wiener process with the correlation matrix

$$R = \begin{pmatrix}
1 & \rho_t & \rho_t & \cdots & \rho_t & \rho_{CS} \\
\rho_t & 1 & \rho_t & \cdots & \rho_t & \rho_{CS} \\
\rho_t & \rho_t & 1 & \cdots & \rho_t & \rho_{CS} \\
\vdots & & & \ddots & & \vdots \\
\rho_t & \rho_t & \rho_t & \cdots & 1 & \rho_{CS} \\
\rho_{CS} & \rho_{CS} & \rho_{CS} & \cdots & \rho_{CS} & 1
\end{pmatrix}_{(N+1)\times(N+1)}.$$

Here we choose the minimal correlation $\rho_{\min}$ such that the positive semi-definiteness of $R$ is preserved. Hence, by the properties of correlation matrices, we have the relation

$$\rho_{\min} = \frac{N\cdot\rho_{CS}^2 - 1}{N - 1}. \tag{4.6}$$

The price process of a multi-asset derivative $\Phi(S^1, S^2, \cdots, S^N, \rho)$ may be derived from the dynamics (4.1) and (4.4). This kind of model is set up in [11] to price multi-asset options. In this thesis, we will calibrate the dynamics of $\rho_t$ under the objective measure $P$. By Girsanov's theorem, we have

$$d\hat W_t^P = \varphi_t\,dt + d\hat W_t. \tag{4.7}$$

Here, we make the assumption that $\varphi_t$ is a constant, $\varphi_t = \varphi$. Let $f(\rho) = \sqrt{(1-\rho)(\rho - \rho_{\min})}$. Then under $P$

$$d\rho_t = \alpha\left[\bar\rho - \rho_t - \kappa f(\rho_t)\right]dt + \beta f(\rho_t)\,d\hat W_t^P, \tag{4.8}$$

where

$$\kappa = \frac{\varphi\beta}{\alpha}.$$

To be able to use the data, we make the assumption that $\varphi = 0$ and thus $\kappa = 0$, which means that $P = Q$. In this case,

$$d\rho_t = \alpha(\bar\rho - \rho_t)\,dt + \beta\sqrt{(1-\rho_t)(\rho_t - \rho_{\min})}\,d\hat W_t^P.$$
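The role of the lower bound in Equation (4.6) can be checked numerically: at $\rho_t = \rho_{\min}$ the smallest eigenvalue of the $(N+1)\times(N+1)$ matrix $R$ is zero, and it becomes negative just below that value. The sketch below is an illustration only; the function names are hypothetical, and the values $N = 30$ and $\rho_{CS} = -0.296$ are borrowed from the calibration reported later in the thesis.

```python
import numpy as np

def rho_min(n_assets, rho_cs):
    """Lower bound of the common correlation from Equation (4.6)."""
    return (n_assets * rho_cs**2 - 1.0) / (n_assets - 1.0)

def correlation_matrix(n_assets, rho, rho_cs):
    """(N+1)x(N+1) matrix R: common correlation rho, last row and column rho_cs."""
    R = np.full((n_assets + 1, n_assets + 1), float(rho))
    R[-1, :] = rho_cs
    R[:, -1] = rho_cs
    np.fill_diagonal(R, 1.0)
    return R

def smallest_eigenvalue(n_assets, rho, rho_cs):
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return float(np.linalg.eigvalsh(correlation_matrix(n_assets, rho, rho_cs))[0])

N, rho_cs = 30, -0.296
lo = rho_min(N, rho_cs)
print(lo)
print(smallest_eigenvalue(N, lo, rho_cs))          # numerically zero
print(smallest_eigenvalue(N, lo - 0.01, rho_cs))   # negative: R is no longer PSD
```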

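4.3 Properties of the model

4.3.1 Existence and uniqueness of solutions

Let

$$\mu(x) = \alpha(\bar\rho - x), \quad\text{and}\quad \sigma(x) = \beta\sqrt{(1-x)(x-\rho_{\min})},$$

where $x \in [\rho_{\min}, 1]$. Then for any $x, y \in [\rho_{\min}, 1]$,

$$|\mu(x) - \mu(y)| = \alpha|x - y|,$$

which implies that $\mu(x)$ is Lipschitz continuous. Since $\sigma(x)$ is non-negative on $[\rho_{\min}, 1]$, we have

$$|\sigma(x) - \sigma(y)| \le |\sigma(x) + \sigma(y)|.$$

Furthermore, since both $1 - x$ and $y - \rho_{\min}$ lie in $[0, 1-\rho_{\min}]$,

$$\begin{aligned}
|\sigma(x) - \sigma(y)|^2 &\le |\sigma(x)^2 - \sigma(y)^2| \\
&= \beta^2\,|(1-x) - (y-\rho_{\min})|\,|x-y| \\
&\le \beta^2(1-\rho_{\min})\,|x-y|.
\end{aligned}$$

Then $\sigma(x)$ is Hölder continuous with exponent $\tfrac{1}{2}$. Hence, Equation (4.4) has a unique strong solution by Theorem 2.9.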

4.3.2 Moment evaluation

In the correlation market, we are interested in the quantity

$$\Upsilon_T = \frac{1}{T}\int_0^T \rho_t\,dt, \tag{4.9}$$

which is actually a continuous-time correlation swap. We will use the first and second moments of $\Upsilon_T$ for calibration. $\Upsilon_T$ can be estimated by Equation (3.39). $\Upsilon_T$ will be called the correlation mean in this thesis. The moments of $\rho_t$ and $\Upsilon_T$ have been stated in [17]. Let $\rho_0$ be the initial value of $\rho_t$.

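1. For $\rho_T$:

$$E[\rho_T] = \bar\rho + (\rho_0 - \bar\rho)e^{-\alpha T},$$

$$\mathrm{Var}[\rho_T] = (\bar\rho - \rho_{\min})(1-\bar\rho)\frac{\beta^2}{2\alpha+\beta^2} + \bar\rho^2 + \frac{\rho_0 - \bar\rho}{\alpha+\beta^2}\left[\beta^2(\rho_{\min}+1) + 2\alpha\bar\rho\right]e^{-\alpha T} + \frac{U}{2\alpha+\beta^2}e^{-(2\alpha+\beta^2)T} - E[\rho_T]^2.$$

Here,

$$U = \frac{2\alpha(\rho_0-\bar\rho)}{\alpha+\beta^2}\left[\alpha(\rho_0-\bar\rho) - \beta^2\left(\tfrac{1}{2}(\rho_{\min}+1) - \rho_0\right)\right] - \beta^2(\rho_0-\rho_{\min})(1-\rho_0).$$

2. For $\Upsilon_T$:

$$E[\Upsilon_T] = \bar\rho + (\rho_0 - \bar\rho)\frac{1 - e^{-\alpha T}}{\alpha T}, \tag{4.10}$$

$$\mathrm{Var}[\Upsilon_T] = \frac{2}{T^2}\left(A_1 + A_2 + A_3 + \frac{B}{2}\right) - E[\Upsilon_T]^2. \tag{4.11}$$

Here,

$$A_1 = \frac{\alpha T + e^{-\alpha T} - 1}{\alpha^2}\left[\bar\rho^2 + \frac{\beta^2(\bar\rho-\rho_{\min})(1-\bar\rho)}{2\alpha+\beta^2}\right],$$

$$A_2 = -\frac{(\rho_0-\bar\rho)\left[\beta^2(\rho_{\min}+1) + 2\alpha\bar\rho\right]\left(\alpha T e^{-\alpha T} + e^{-\alpha T} - 1\right)}{\alpha^2(\alpha+\beta^2)},$$

$$A_3 = \frac{U}{(2\alpha+\beta^2)(\alpha+\beta^2)}\left[\frac{1 - e^{-\alpha T}}{\alpha} - \frac{1 - e^{-(2\alpha+\beta^2)T}}{2\alpha+\beta^2}\right],$$

$$B = \frac{2\bar\rho}{\alpha^2}\left[\frac{\bar\rho\alpha^2T^2}{2} + \left(\alpha T + e^{-\alpha T} - 1\right)(\rho_0 - 2\bar\rho) + (\rho_0-\bar\rho)\left(\alpha T e^{-\alpha T} + e^{-\alpha T} - 1\right)\right].$$

The terms $A_1$, $A_2$ and $A_3$ arise from integrating the constant, the $e^{-\alpha s}$ and the $e^{-(2\alpha+\beta^2)s}$ components of $E[\rho_s^2]$ against $e^{-\alpha(t-s)}$ over $0 \le s \le t \le T$, and $B/2$ from integrating $\bar\rho\,E[\rho_s]\left(1 - e^{-\alpha(t-s)}\right)$ over the same region.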
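For numerical work, the first two moments of $\Upsilon_T$ can be evaluated with a short Python function such as the sketch below. It is written directly from the decomposition of $E[\Upsilon_T^2]$ into the integrals described above; the function name `upsilon_moments` and the parameter values in the example call are assumptions for illustration, not values from the calibration.

```python
import numpy as np

def upsilon_moments(T, rho0, rho_bar, rho_min, alpha, beta):
    """Return (E[Upsilon_T], Var[Upsilon_T]) for the correlation mean."""
    a, b2 = alpha, beta**2
    g = 2.0 * a + b2
    e1 = np.exp(-a * T)
    mean = rho_bar + (rho0 - rho_bar) * (1.0 - e1) / (a * T)
    # E[rho_s^2] = m2_inf + c * exp(-a s) + (U / g) * exp(-g s)
    m2_inf = rho_bar**2 + b2 * (rho_bar - rho_min) * (1.0 - rho_bar) / g
    c = (rho0 - rho_bar) * (b2 * (rho_min + 1.0) + 2.0 * a * rho_bar) / (a + b2)
    U = (2.0 * a * (rho0 - rho_bar) / (a + b2)
         * (a * (rho0 - rho_bar) - b2 * (0.5 * (rho_min + 1.0) - rho0))
         - b2 * (rho0 - rho_min) * (1.0 - rho0))
    h1 = a * T + e1 - 1.0          # alpha*T + exp(-alpha*T) - 1
    h2 = a * T * e1 + e1 - 1.0     # alpha*T*exp(-alpha*T) + exp(-alpha*T) - 1
    A1 = m2_inf * h1 / a**2
    A2 = -c * h2 / a**2
    A3 = U / (g * (a + b2)) * ((1.0 - e1) / a - (1.0 - np.exp(-g * T)) / g)
    B = 2.0 * rho_bar / a**2 * (rho_bar * a**2 * T**2 / 2.0
                                + h1 * (rho0 - 2.0 * rho_bar)
                                + (rho0 - rho_bar) * h2)
    var = 2.0 / T**2 * (A1 + A2 + A3 + B / 2.0) - mean**2
    return float(mean), float(var)

# Hypothetical parameters, chosen to satisfy 0 < rho_min < rho_bar < 1.
print(upsilon_moments(T=1.0, rho0=0.8, rho_bar=0.5, rho_min=0.1, alpha=3.0, beta=1.0))
```

Two sanity checks are useful when using such a function: the variance should vanish as $T \to 0$, and $T\cdot\mathrm{Var}[\Upsilon_T]$ should approach $2/\alpha$ times the stationary variance of $\rho_t$ as $T$ grows.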

4.3.3 The boundary condition

In the correlation dynamics (4.4), we need to consider whether the value of $\rho_t$ will hit the boundary $\rho_{\min}$ or 1. Consider the transformation

$$z_t = \frac{\rho_t - \rho_{\min}}{1 - \rho_{\min}}. \tag{4.12}$$

The dynamics of $z_t$ can be derived from Equation (4.4),

$$dz_t = \alpha(\gamma - z_t)\,dt + \beta\sqrt{z_t(1-z_t)}\,d\hat W_t, \tag{4.13}$$

where

$$\gamma = \frac{\bar\rho - \rho_{\min}}{1 - \rho_{\min}}.$$

We have the result from [4] that if

$$\frac{\beta^2}{2\alpha} \le \gamma \le 1 - \frac{\beta^2}{2\alpha}, \tag{4.14}$$

then $z_t$ will not hit the boundary 0 or 1. This means that $\rho_t$ will not hit the boundary $\rho_{\min}$ or 1, so that for all $t > 0$

$$\rho_{\min} < \rho_t < 1.$$

Equation (4.14) is equivalent to

$$\beta^2 \le \min\{2\alpha\gamma,\ 2\alpha(1-\gamma)\}. \tag{4.15}$$
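As a worked illustration of condition (4.15), take the parameter values quoted for Figure 5, namely $\bar\rho = 0.5$, $\rho_{\min} = 0.1$ and $\alpha = 5$. The arithmetic below is only a check under those values.

```latex
\gamma = \frac{\bar\rho - \rho_{\min}}{1 - \rho_{\min}} = \frac{0.5 - 0.1}{1 - 0.1} = \frac{4}{9} \approx 0.444,
\qquad
\beta^2 \le \min\{2\alpha\gamma,\ 2\alpha(1-\gamma)\}
        = \min\left\{\tfrac{40}{9},\ \tfrac{50}{9}\right\} = \tfrac{40}{9} \approx 4.44,
\qquad
\beta \le \sqrt{40/9} \approx 2.11 .
```

Under these values, of the three volatilities $\beta = 2, 4, 6$ used in Figure 5, only $\beta = 2$ satisfies the boundary condition.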

4.3.4 Simulation

Simulating a stochastic process is an intuitive way to gain some knowledge about it. The simplest way is the Euler approximation, also called the Euler-Maruyama approximation [10]. Consider an Itô process that satisfies the stochastic differential equation

$$dX_t = a(X_t, t)\,dt + b(X_t, t)\,dW_t \tag{4.16}$$

on $0 \le t \le T$ with $X_0 = x$. For a given partition $\Pi$ of the interval $[0, T]$,

$$\Pi: 0 = t_0 < t_1 < \cdots < t_N = T,$$

an Euler approximation is a continuous-time stochastic process $Y$ satisfying

$$Y_{n+1} = Y_n + a(Y_n, t_n)(t_{n+1} - t_n) + b(Y_n, t_n)(W_{t_{n+1}} - W_{t_n}), \tag{4.17}$$

for $n = 0, 1, \cdots, N-1$ and $Y_0 = X_0$.

On each interval $[t_n, t_{n+1}]$, the process $Y$ can be defined by linear interpolation:

$$Y_t = Y_n + \frac{t - t_n}{t_{n+1} - t_n}\,(Y_{n+1} - Y_n), \quad\text{if } t \in [t_n, t_{n+1}).$$

Let

$$\sqrt{t_{n+1} - t_n}\cdot Z_n = W_{t_{n+1}} - W_{t_n}.$$

Then the $Z_n$ are independent and normally distributed for $n = 0, 1, \cdots, N-1$, with

$$Z_n \sim N(0, 1).$$

Equation (4.17) can be written as

$$Y_{n+1} = Y_n + a(Y_n, t_n)(t_{n+1} - t_n) + b(Y_n, t_n)\sqrt{t_{n+1} - t_n}\cdot Z_n. \tag{4.18}$$
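Applied to the correlation dynamics (4.4), the scheme (4.18) can be sketched in Python as follows. The function name, the equidistant grid and the example parameters are assumptions. Clipping the discretized path to $[\rho_{\min}, 1]$ is also an assumption added here to keep the square root well defined when a discrete step overshoots; it is not part of the scheme (4.18) itself.

```python
import numpy as np

def simulate_correlation(rho0, rho_bar, rho_min, alpha, beta, T, n_steps, seed=None):
    """Euler-Maruyama path of d rho = alpha(rho_bar - rho)dt + beta*sqrt((1-rho)(rho-rho_min))dW."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps                      # equidistant partition of [0, T]
    rho = np.empty(n_steps + 1)
    rho[0] = rho0
    for n in range(n_steps):
        z = rng.standard_normal()         # Z_n ~ N(0, 1)
        drift = alpha * (rho_bar - rho[n])
        diffusion = beta * np.sqrt(max((1.0 - rho[n]) * (rho[n] - rho_min), 0.0))
        step = rho[n] + drift * dt + diffusion * np.sqrt(dt) * z
        rho[n + 1] = min(max(step, rho_min), 1.0)   # assumption: clip to [rho_min, 1]
    return rho

path = simulate_correlation(0.8, 0.5, 0.1, alpha=5.0, beta=2.0, T=2.0, n_steps=500, seed=1)
print(path[:5], path.min(), path.max())
```

Setting `beta=0.0` removes the noise and reproduces the deterministic mean reversion towards $\bar\rho$, which is a convenient check of the drift term.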

4.3.5 Numerical experiments for the parameters

In order to gain some intuition about the parameters in the model, we plot the mean and variance curves of the correlation mean with varying parameters.

• Figure 2 shows the expectation of the correlation mean with $\alpha$ varying. We can see that the expectation is a monotone function of maturity and is not sensitive to the value of $\alpha$.

• Figure 3 shows the variance of the correlation mean with varying $\alpha$. The variance of the correlation mean reaches its peak sooner for a larger value of $\alpha$. A larger value of $\alpha$ also gives a smaller variance.

• Figure 4 shows the variance of the correlation mean with varying $\beta$. $\beta$ measures the volatility of correlation. The larger $\beta$, the larger the variance.

• Figure 5 shows simulated correlation paths with $\beta$ varying. It is intuitive that a larger value of $\beta$ gives a larger volatility. The correlation process reverts around $\bar\rho$ as time goes on.

Figure 2: Correlation mean with $\alpha$ varying and $\rho_0 = 0.8$, $\bar\rho = 0.5$. (Plot of the mean of correlation against maturity, 0 to 5, for $\alpha = 3, 4, 5$.)

Figure 3: Standard deviation with $\alpha$ varying, $\rho_0 = 0.8$, $\bar\rho = 0.5$, $\rho_{\min} = 0.1$. (Plot of the variance of the correlation mean against maturity, 0 to 5, for $\alpha = 3, 4, 5$.)

Figure 4: Standard deviation with $\beta$ varying, $\rho_0 = 0.8$, $\bar\rho = 0.5$, $\rho_{\min} = 0.1$, $\beta = 1.0$. (Plot of the variance of the correlation mean against maturity, 0 to 5, for $\beta = 3, 4, 5$.)

Figure 5: Simulation of correlation paths with $\beta$ varying and $\rho_0 = 0.8$, $\bar\rho = 0.5$, $\rho_{\min} = 0.1$, $\alpha = 5.0$. The black horizontal line is the value of $\bar\rho$. (Plot of correlation against years, 0 to 2, for $\beta = 2, 4, 6$.)

5 Calibration

5.1 Methods for calibration

In the stochastic correlation model, we have 6 parameters to consider when calibrating, i.e. $\rho_0, \bar\rho, \alpha, \beta, \rho_{CS}, \rho_{\min}$. Since $\rho_{\min}$ can be calculated from Equation (4.6) after calibrating $\rho_{CS}$, only 5 parameters will be calibrated. $\alpha$ and $\rho_{CS}$ will be inferred from historical data. All the other parameters will be calibrated from market quotations. The calibration procedure mainly follows [17].

Level of correlation

$\rho_0$ is the initial value of $\rho_t$. We also have

$$\lim_{T\to 0} E[\rho_T] = \lim_{T\to 0}\left(\bar\rho + (\rho_0 - \bar\rho)e^{-\alpha T}\right) = \rho_0. \tag{5.1}$$

$\rho_0$ will be estimated from the current market using the implied correlation given by Equation (3.27). The $\sigma$ in Equation (3.27) should be replaced by the market at-the-money (ATM) implied volatility. The ATM implied volatility can be obtained from market vanilla options. In the market, different maturities give different implied volatilities. Since $\rho_0$ is the initial value, a short maturity should be used.

$\bar\rho$ is the long-run level of $\rho_t$. This means that as time passes, $\rho_t$ will revert around $\bar\rho$ and $E[\rho_T]$ will tend to $\bar\rho$. Similar to Equation (5.1), if we take the limit to infinity, then

$$\lim_{T\to\infty} E[\rho_T] = \lim_{T\to\infty}\left(\bar\rho + (\rho_0 - \bar\rho)e^{-\alpha T}\right) = \bar\rho. \tag{5.2}$$

The method for calibrating $\bar\rho$ is similar to that for $\rho_0$, using Equation (3.27). Since $\bar\rho$ is the level that $E[\rho_T]$ tends to, a long-maturity volatility should be used.

Correlation skew ρCS

$\rho_{CS}$ is the correlation between the stocks and their correlations. A negative value of $\rho_{CS}$ means that the correlation between stocks rises in a bear market and falls in a bull market, which is consistent with the real market. $\rho_{CS}$ is a constant in the model and is calibrated using historical data. It is estimated by the Pearson product-moment correlation coefficient between the time series of the historical index price and of the correlations. Then $\rho_{\min}$ can be calculated immediately from Equation (4.6).

Volatility of correlation

$\beta$ is the volatility of the correlation. It should meet the boundary condition (4.15). Ideally, $\beta$ would be estimated from current market information. However, estimated in this way, $\beta$ is likely to violate the boundary condition. We will see this in the next section. In this thesis, we take a conservative approach: $\beta$ is set to the maximum value allowed by the boundary condition (4.15), i.e.,

$$\beta = \sqrt{2\alpha\min\left\{\frac{\bar\rho - \rho_{\min}}{1 - \rho_{\min}},\ \frac{1 - \bar\rho}{1 - \rho_{\min}}\right\}}. \tag{5.3}$$

In this way, we make sure that the correlation will not hit the boundary $\rho_{\min}$ or 1.
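The rule (5.3), combined with Equation (4.6), can be written as two small Python helpers. The function names are hypothetical; the inputs in the example ($N = 30$, $\rho_{CS} = -0.296$, $\bar\rho = 0.478$, $\alpha = 1.30$) are the calibrated values quoted in the caption of Figure 7, used here only as a plausibility check.

```python
import math

def min_correlation(n_assets, rho_cs):
    """rho_min from Equation (4.6)."""
    return (n_assets * rho_cs**2 - 1.0) / (n_assets - 1.0)

def max_beta(alpha, rho_bar, rho_min):
    """Largest beta allowed by the boundary condition (4.15), as in (5.3)."""
    gamma = (rho_bar - rho_min) / (1.0 - rho_min)
    return math.sqrt(2.0 * alpha * min(gamma, 1.0 - gamma))

rho_min = min_correlation(30, -0.296)
beta = max_beta(1.30, 0.478, rho_min)
print(round(rho_min, 4), round(beta, 2))
```

The resulting $\beta$ agrees to two decimals with the value $\beta = 1.08$ reported for the historical calibration.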

Speed of mean reversion

$\alpha$ is the speed of mean reversion. The larger $\alpha$, the faster the correlation is pulled towards $\bar\rho$. $\alpha$ will be calibrated from historical data. There are two ways to estimate $\alpha$: one is to use the first moment of $\Upsilon_T$, the other is to use the second moment.

First moment calibration

It is much easier to calibrate $\alpha$ by using $E[\Upsilon_T]$. The idea is to fit the historical realized mean of $\Upsilon_T$ with an optimizer $\hat\alpha$,

$$\hat\alpha = \arg\min_{\alpha}\sum_{T_k}\left(E_{\text{model}}(\alpha, T_k) - E_{\text{historical}}(T_k)\right)^2, \tag{5.4}$$

where

$$E_{\text{model}}(\alpha, T_k) = \bar\rho + (\rho_0 - \bar\rho)\frac{1 - e^{-\alpha T_k}}{\alpha T_k}.$$

However, in practice, $\rho_0$ and $\bar\rho$ are close and the expectation curve is flat, so it is not sensitive to the value of $\alpha$. In other words, different values of $\alpha$ give similar least squares errors.
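The least squares problem (5.4) can be sketched as a simple grid search in Python. The function names, the grid of candidate values and the synthetic "historical" means (generated from the model itself with $\alpha = 2$) are assumptions for illustration; a grid search is used here instead of a numerical optimizer only to keep the example self-contained.

```python
import numpy as np

def model_mean(alpha, T, rho0, rho_bar):
    """E_model(alpha, T) from Equation (4.10)."""
    T = np.asarray(T, dtype=float)
    return rho_bar + (rho0 - rho_bar) * (1.0 - np.exp(-alpha * T)) / (alpha * T)

def calibrate_alpha_first_moment(maturities, historical_means, rho0, rho_bar, alpha_grid):
    """Return the grid value of alpha minimising the sum of squared errors in (5.4)."""
    target = np.asarray(historical_means, dtype=float)
    errors = [np.sum((model_mean(a, maturities, rho0, rho_bar) - target) ** 2)
              for a in alpha_grid]
    return float(alpha_grid[int(np.argmin(errors))])

maturities = np.array([0.25, 0.5, 1.0, 2.0, 4.0])           # in years
synthetic_means = model_mean(2.0, maturities, 0.8, 0.5)      # hypothetical targets
grid = np.linspace(0.1, 10.0, 100)
print(calibrate_alpha_first_moment(maturities, synthetic_means, 0.8, 0.5, grid))
```

When $\rho_0$ and $\bar\rho$ are nearly equal, the model curve barely depends on $\alpha$, so the error values over the grid become almost identical and the minimiser is poorly determined.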

Second moment calibration

The idea is to find a value of $\alpha$ such that the model-computed standard deviation of $\Upsilon_T$ is close to the historical one for different maturities. Here, a least squares method will be used for calibration. Hence,

$$\hat\alpha = \arg\min_{\alpha}\sum_{T_k}\left(SD_{\text{model}}(\alpha, \beta(\alpha), T_k) - SD_{\text{historical}}(T_k)\right)^2, \tag{5.5}$$

where

$$SD_{\text{model}}(\alpha, \beta(\alpha), T_k) = \sqrt{\mathrm{Var}[\Upsilon_{T_k}]}. \tag{5.6}$$

$\beta(\alpha)$ is calculated by Equation (5.3). In order to compute $\mathrm{Var}[\Upsilon_{T_k}]$, all other parameters should be calibrated by the above methods before estimating $\alpha$. $SD_{\text{historical}}(T_k)$ is the historical realized value corresponding to the standard deviation of $\Upsilon_{T_k}$ in the model. Let $I$ be an index that contains $N$ stocks; then we compute the pairwise correlation between $S^i$ and $S^j$ as

$$\hat\rho_T(S^i, S^j) = \frac{\sum_{k=1}^{n}\left(\ln\frac{S^i_{t_k}}{S^i_{t_{k-1}}} - \bar S^i\right)\left(\ln\frac{S^j_{t_k}}{S^j_{t_{k-1}}} - \bar S^j\right)}{\sqrt{\sum_{k=1}^{n}\left(\ln\frac{S^i_{t_k}}{S^i_{t_{k-1}}} - \bar S^i\right)^2 \cdot \sum_{k=1}^{n}\left(\ln\frac{S^j_{t_k}}{S^j_{t_{k-1}}} - \bar S^j\right)^2}}, \tag{5.7}$$

which has been stated in Section 3.3.1. For each maturity $T_k$, we will have $\binom{N}{2}$ pairwise correlations. We consider these correlations as a sample of $\Upsilon_{T_k}$, and $SD_{\text{historical}}$ can be estimated by the sample standard deviation. To calibrate $\alpha$, one may do a search over $\alpha > 0$ and find the $\alpha$ which satisfies Equation (5.5).
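The historical side of Equation (5.5) can be sketched in Python as follows: for one observation window, compute all pairwise correlations (5.7) and take their sample standard deviation. The function names, the array layout (rows are dates, columns are stocks) and the random test prices are assumptions for illustration.

```python
import numpy as np

def pairwise_correlations(prices):
    """All N(N-1)/2 pairwise log-return correlations for one observation window."""
    returns = np.diff(np.log(np.asarray(prices, dtype=float)), axis=0)
    corr = np.corrcoef(returns, rowvar=False)
    upper = np.triu_indices(corr.shape[0], k=1)   # each pair i < j counted once
    return corr[upper]

def historical_sd(prices):
    """Sample standard deviation of the pairwise correlations, used as SD_historical."""
    return float(np.std(pairwise_correlations(prices), ddof=1))

# Hypothetical window: 60 observations of 4 stocks.
rng = np.random.default_rng(2)
prices = 100.0 * np.exp(np.cumsum(0.01 * rng.standard_normal((60, 4)), axis=0))
print(pairwise_correlations(prices).size, historical_sd(prices))
```

Repeating this for windows of length $T_k = 3, 6, \ldots$ months would give the historical curve against which the model standard deviation is fitted.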

The outline of calibration

1. Split the data into two sets. One is called the training set and is used for calibrating $\rho_{CS}$ and $\alpha$. The other is called the test set and is used for testing the model.

2. Within the training data, follow the above methods successively to calibrate $\rho_0$, $\bar\rho$, $\rho_{CS}$ and $\rho_{\min}$.

3. Do a search over $\alpha$, and for each $\alpha$ set $\beta$ by Equation (5.3). For every pair of $\alpha$ and $\beta$, evaluate the least squares error to find an optimizer $\hat\alpha$.

4. Before testing the model, calibrate a new set of parameters $\rho_0$, $\bar\rho$, $\beta$ and $\rho_{\min}$ within the test data.

5. Use the parameters obtained from step 4, along with $\rho_{CS}$ and $\alpha$ calibrated from the training data, to test the model.

5.2 Data description and the calibration

The index used for calibration is the Dow Jones Industrial Average (DJIA), which contains 30 stocks. The historical prices and interpolated ATM implied volatilities will be used for calibration. The implied volatility of the DJIA is estimated by the volatility of DIA, which is an ETF tracking the Dow. We used historical data from January 2011 to April 2016. The data was split into a training set and a test set. The first 4 years of data are treated as training data and the remaining year as test data. Following the outline of calibration above, the results of the calibrations are shown in Figures 6 and 7. The only difference between the two figures is the method used to calibrate $\alpha$. Here, we use maturities of 3, 6, 9, $\cdots$, 48 months.

Figure 6 shows the result of the first moment estimation of $\alpha$. The fitted curve is quite flat. In fact, the least squares fit indicates that $\alpha = 0$ gives the best result. This contradicts our model, which has $\alpha > 0$. We conclude that it is not reliable to calibrate $\alpha$ by the first moment of $\Upsilon$.

In Figure 7, the red line is the historical realized standard deviation of the correlation mean, which corresponds to $SD_{\text{historical}}(T_k)$. The solid blue line is the optimized $SD_{\text{model}}(\alpha, \beta(\alpha), T_k)$. There is a gap between the historical curve and the model curve. In fact, if we lift the value of $\beta$ from 1.08 to 3.0, we get the dashed blue line shown in Figure 7. However, we have to accept the gap in order not to let the correlation hit the boundary. This is a limitation of the model. The dashed line is closer to the historical curve and has a smaller error, which means that the historical realized volatility of correlation is higher than the maximum value of $\beta$ allowed by the boundary condition.

Figure 6: The calibration of the first moment of $\Upsilon_T$. (Plot of the mean of Upsilon against years, 0 to 4; curves: historical, calibration.)

Figure 7: The calibrated curve vs the historical curve for the standard deviation of $\Upsilon_T$. In the calibrated model, $\rho_0 = 0.511$, $\bar\rho = 0.478$, $\rho_{CS} = -0.296$, $\alpha = 1.30$, $\beta = 1.08$. (Plot of the standard deviation of Upsilon against years, 0 to 4; curves: historical, calibration.)

5.2.1 Model analysis

After the calibration, we take the calibrated $\rho_{CS}$ and $\alpha$ to test the model using the test data set. Figure 8 shows some simulations of the correlation for the test set. We see that the simulated correlation starts from $\rho_0$ and reverts around $\bar\rho$. Since we strictly control $\beta$, the simulated paths do not hit the boundary. Figure 9 shows the standard deviation of the correlation mean for the market and the model. The result is quite similar to the historical calibration. Since we limit the value of $\beta$, the model underestimates the market realization.

Figure 10 shows curves of the market realized correlation (green line) and the expected correlation (blue line). The market realized correlation is calculated as the mean of the correlations between components. We also draw a line for the constant model, which treats the correlation as a constant and estimates the constant correlation from historical data. The area between the two red curves is the band plotted one standard deviation away from the expectation of correlation. This one standard deviation band (1-SDB) describes the market correlation well: it covers the main range of market correlation. The constant correlation line does not describe the market correlation well, since the correlation is really unstable. This indicates that it may not be proper to treat the correlation as a constant. In Figure 10, if we plot the two standard deviation band (2-SDB), then the correlation will be fully covered by the band, but the 2-SDB is quite wide and covers almost the whole interval $[0, 1]$.

We explore this further in Figures 11 to 14. We split the data into 4 equal consecutive parts, each containing about one year of data. Figures 11 and 12 are generated within part 2 by using part 1 as the training set. Likewise, Figures 13 and 14 are generated within part 4 by using part 3 as the training set. Figures 11 and 13 are the plots of the 1-SDB, while Figures 12 and 14 are the plots of the 2-SDB. It can be seen from Figures 11 and 12 that the 1-SDB does not cover the realized correlation well while the 2-SDB does. However, we can see from Figures 13 and 14 that for another set of data, the 2-SDB may be too wide.

Figure 8: Simulation of the correlation process with parameters $\rho_0 = 0.509$, $\bar\rho = 0.536$, $\rho_{CS} = -0.296$, $\alpha = 1.30$, $\beta = 1.13$. (Plot of correlation against years, 0 to 2.)

Figure 9: The market standard deviation of the correlation mean vs the model-computed value. In the calibrated model, $\rho_0 = 0.509$, $\bar\rho = 0.536$, $\rho_{CS} = -0.296$, $\alpha = 1.30$, $\beta = 1.13$. (Plot of standard deviation against years; curves: market, model.)

Figure 10: Graph for the model test within the test data. In the model, $\rho_0 = 0.509$, $\bar\rho = 0.536$, $\rho_{CS} = -0.296$, $\alpha = 1.30$, $\beta = 1.13$. (Plot of correlations against years, 0 to 1; curves: market, constant, stochastic, std dev.)

Figure 11: 1-SDB with parameters $\rho_0 = 0.378$, $\bar\rho = 0.525$, $\rho_{CS} = -0.596$, $\alpha = 3.30$, $\beta = 1.37$. (Plot of correlations against years, 0 to 1; curves: market, constant, stochastic, std dev.)

Figure 12: 2-SDB with parameters $\rho_0 = 0.378$, $\bar\rho = 0.525$, $\rho_{CS} = -0.596$, $\alpha = 3.30$, $\beta = 1.37$. (Plot of correlations against years, 0 to 1; curves: market, constant, stochastic, std dev.)

Figure 13: 1-SDB with parameters $\rho_0 = 0.361$, $\bar\rho = 0.524$, $\rho_{CS} = -0.503$, $\alpha = 4.10$, $\beta = 1.78$. (Plot of correlations against years, 0 to 1; curves: market, constant, stochastic, std dev.)

Figure 14: 2-SDB with parameters $\rho_0 = 0.361$, $\bar\rho = 0.524$, $\rho_{CS} = -0.503$, $\alpha = 4.10$, $\beta = 1.78$. (Plot of correlations against years, 0 to 1; curves: market, constant, stochastic, std dev.)

5.2.2 Summary

In this thesis, we have studied the correlations between the $n$ components of an index. The correlations between the components are reflected by the correlations between Wiener processes. In general, the stochastic correlations constitute the correlation matrix

$$R_t = \begin{pmatrix}
1 & \rho_{12}(t) & \rho_{13}(t) & \cdots & \rho_{1,n}(t) \\
\rho_{21}(t) & 1 & \rho_{23}(t) & \cdots & \rho_{2,n}(t) \\
\rho_{31}(t) & \rho_{32}(t) & 1 & \cdots & \rho_{3,n}(t) \\
\vdots & \vdots & & \ddots & \vdots \\
\rho_{n,1}(t) & \rho_{n,2}(t) & \rho_{n,3}(t) & \cdots & 1
\end{pmatrix}_{n\times n}.$$

It is pointed out that the correlation matrix is invariant under measure changes. If $n$ is large, calibrating the whole correlation matrix requires a huge amount of computation. In order to make it easier to work with, we model the correlation matrix of the stocks in an index by one stochastic process $\rho_t$, which means

$$\rho_{ij}(t) = \rho_t, \quad i, j = 1, 2, \cdots, n, \quad i \neq j.$$

To preserve the positive definiteness of the correlation matrix, $\rho_t$ has to satisfy

$$0 \approx \frac{-1}{N-1} \le \rho_t < 1.$$

In fact, in the model above we set a positive value $\rho_{\min}$ to be the minimum of $\rho_t$. However, in reality, the correlations between different stocks differ, and in some time periods stocks can be negatively correlated. Since all the pairwise correlations are assumed to be derived from one positive stochastic process, the variance will be quite high, as we see in Figure 9. We restrict the volatility parameter $\beta$ to make sure that the correlation will not go outside its domain, which also limits the magnitude of $\mathrm{Var}[\Upsilon_T]$. The model then underestimates the variance of the market correlation, but it describes the range of the daily realized correlation well, as we have seen in Figures 10 to 14.

Future work

There could be some future work to improve the model.

1. Model the index correlation as derived from more than one source. A possible way is to split the index into several sectors and model the correlations for each sector with different dynamics.

2. Use correlation swaps, if available, to calibrate the parameters. The correlation swap provides pure exposure to the correlation. In this way, one could even model the volatility as a stochastic process. The calibration can be done by pricing the correlation swap using the Monte Carlo method.

3. Introduce a jump in the correlation dynamics. As time goes on, the correlation may switch to another regime, and it may be good to add a jump to the correlation dynamics.

References

[1] Avellaneda M, Boyer-Olson D. Reconstruction of volatility: pricing index options by the steepest descent approximation. Courant Institute-NYU Working Paper, 2002.

[2] Björk T. Arbitrage theory in continuous time. Oxford University Press, 2004.

[3] Chambers D R, Nawalkha S K. An improved approach to computing implied volatility. Financial Review, 2001, 36(3): 89-100.

[4] Delbaen F, Shirakawa H. An interest rate model with upper and lower bounds. Asia-Pacific Financial Markets, 2002, 9(3-4): 191-209.

[5] De Weert F. Exotic options trading. Vol. 564. John Wiley & Sons, 2011.

[6] Dupire B. Pricing with a smile. Risk, 1994, 7(1): 18-20.

[7] Driessen J, Maenhout P, Vilkov G. Option-implied correlations and the price of correlation risk. Journal of Banking and Finance, forthcoming. 2007.

[8] Engle R. Anticipating correlations: a new paradigm for risk management. Princeton University Press, 2009.

[9] Klebaner F C. Introduction to stochastic calculus with applications. London: Imperial College Press, 2005.

[10] Kloeden P E, Platen E. Numerical Solution of Stochastic Differential Equations. Springer, Berlin, 1992.

[11] Ma J. A stochastic correlation model with mean reversion for pricing multi-asset options. Asia-Pacific Financial Markets, 2009, 16(2): 97-109.

[12] Protter P E. Stochastic Integration and Differential Equations. Stochastic Modelling and Applied Probability, 2004.

[13] Skintzi V D, Refenes A P N. Implied correlation index: A new measure of diversification. Journal of Futures Markets, 2005, 25(2): 171-197.

[14] Silyakova E. Implied Basket Correlation Dynamics. Sonderforschungsbereich 649, Humboldt University, Berlin, Germany, 2012.

[15] Sepp S. Modeling of stock return correlation. Master thesis, Universiteit van Amsterdam, 2011.

[16] van Emmerich C. Modelling correlation as a stochastic process. Preprint.

[17] Zetocha V. Skewing Up Correlation: Understanding Correlation Skew in Equity Derivatives, 2014. Available at SSRN: http://ssrn.com/abstract=2441724
