Parameter Estimation in Stochastic Volatility Models via Approximate Bayesian Computing

A Thesis
Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of The Ohio State University

By
Achal Awasthi, B.S.
Graduate Program in the Department of Statistics
The Ohio State University
2018

Master's Examination Committee:
Radu Herbei, Ph.D., Advisor
Laura S. Kubatko, Ph.D.

© Copyright by Achal Awasthi, 2018

Abstract
In this thesis, we propose a generalized Heston model as a tool for estimating volatility, and we use Approximate Bayesian Computing to estimate the parameters of the generalized Heston model. The model is applied to the daily closing prices of the Shanghai Stock Exchange and NIKKEI 225 indices. We found that the model fits well over shorter time periods around financial crises; over longer time periods, it fails to capture the volatility in detail.
This is dedicated to my grandmothers, Radhika and Prabha, who have had a significant impact on my life.
Acknowledgments

I would like to thank my thesis supervisor, Dr. Radu Herbei, for his help and his availability throughout the development of this project. I am also grateful to Dr. Laura Kubatko for agreeing to serve on the defense committee. My gratitude goes to my parents; without their support and education I would not have had the chance to study abroad. I would also like to express my gratitude to my uncles, Kuldeep and Tapan, and to Mr. Richard Rose for helping me transition smoothly to life in a different country. In addition, my deepest appreciation goes to my friends in the Department of Statistics, who have been there for me since my first day of class at The Ohio State University. Finally, I am extremely thankful to my housemates for bearing with me during the past year.
Vita

2016 ...... B.S. Physics
2016-present ...... Graduate Teaching Associate, The Ohio State University

Publications

Fields of Study

Major Field: Statistics
Contents

Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures

1. Introduction
   1.1 Motivation
   1.2 Emerging Markets during Financial Crisis
   1.3 Structure of Thesis
2. Background
   2.1 Introduction
   2.2 Brownian Motion
   2.3 Geometric Brownian Motion (GBM)
       2.3.1 Parameter Estimation for the GBM process using Maximum Likelihood Estimation
   2.4 The Ornstein-Uhlenbeck Process
       2.4.1 Simulation of the OU Process
       2.4.2 Parameter Estimation for OU Process using Maximum Likelihood
       2.4.3 Parameter Estimation for OU Process using Ordinary Least Squares
   2.5 Cox-Ingersoll-Ross Process
       2.5.1 Simulation of CIR process
       2.5.2 Parameter Estimation for CIR Process using Maximum Likelihood
   2.6 Generalized Cox-Ingersoll-Ross model
       2.6.1 Parameter Estimation for generalized CIR Process using Maximum Likelihood
       2.6.2 Distribution of ∫_{t1}^{t2} W(s) ds
3. Approximate Bayesian Computing for Stochastic Volatility Models
   3.1 Heston Model
       3.1.1 Simulation of sample paths of the Heston Model
       3.1.2 Euler-Maruyama (EM) Approximation
       3.1.3 Euler-Maruyama scheme with Lord et al.'s modification
       3.1.4 Milstein scheme
       3.1.5 Broadie and Kaya's Exact Algorithm
   3.2 A generalized Heston Model
       3.2.1 Simulation of sample paths of the generalized Heston model
   3.3 Approximate Bayesian Computing (ABC)
       3.3.1 ABC for Heston Model
       3.3.2 ABC for generalized Heston Model
4. Application: Modeling Volatility in Financial Markets
   4.1 Introduction
       4.1.1 Stock Index
   4.2 Exploratory Data Analysis
   4.3 Parameter estimation of the Generalized Heston model using ABC
       4.3.1 Parameter estimation using ABC for SSE
       4.3.2 Parameter estimation using ABC for NIKKEI 225
5. Contributions and Future Work
   5.1 Results Overview
   5.2 Future Work
       5.2.1 Moments of generalized Heston model

Bibliography
List of Tables

3.1 Number of simulations vs. number of accepted parameters for ε = 100
3.2 Number of simulations vs. number of accepted parameters for ε = 200
3.3 Number of simulations vs. number of accepted parameters for ε = 500
3.4 Number of simulations vs. number of accepted parameters for ε = 800
3.5 Number of simulations vs. number of accepted parameters for ε = 1,000
3.6 Number of simulations vs. number of accepted parameters for ε = 1,500
3.7 Number of simulations vs. number of accepted parameters for ε = 100
3.8 Number of simulations vs. number of accepted parameters for ε = 200
3.9 Number of simulations vs. number of accepted parameters for ε = 500
3.10 Number of simulations vs. number of accepted parameters for ε = 800
3.11 Number of simulations vs. number of accepted parameters for ε = 1,000
3.12 Number of simulations vs. number of accepted parameters for ε = 1,500
4.1 Number of simulations vs. number of accepted parameters for ε = 10,000
4.2 Number of simulations vs. number of accepted parameters for different ε levels
4.3 Estimated parameters for different ε levels (100 simulations)
4.4 Number of simulations vs. number of accepted parameters for different ε levels
4.5 Estimated parameters for different ε levels (100 simulations)
4.6 Number of simulations vs. number of accepted parameters for different ε levels
4.7 Estimated parameters for different ε levels (100 simulations)
4.8 Number of simulations vs. number of accepted parameters for different ε levels
4.9 Estimated parameters for different ε levels
4.10 Number of simulations vs. number of accepted parameters for different ε levels
4.11 Estimated parameters for different ε levels
xi List of Figures
Figure Page
2.1 Simulated paths of the GBM process with parameters as described in
algorithm1...... 13
2.2 Histogram of log of GBM at the 50th time-step. The orange curve repre-
sents the superimposed normal density curve with parameters obtained
from simulated data at the 50th time-step...... 14
2.3 Histogram of estimated values of µ of the GBM as simulated above.
The dashed red line represents the true value of the parameter..... 16
2.4 Histogram of estimated values of σ of the GBM as simulated above.
The dashed red line represents the true value of the parameter..... 16
2.5 Simulated paths of the OU process with parameters as described above 19
xii 2.6 Histogram of estimated values of β of the OU process as simulated
above. The dashed red line represents the true value of the parameter. 21
2.7 Histogram of estimated values of θ of the OU process as simulated
above. The dashed red line represents the true value of the parameter. 22
2.8 Histogram of estimated values of σ of the OU process as simulated
above. The dashed red line represents the true value of the parameter. 22
2.9 Histogram of estimated values of β of the OU process using least
squares approximation. The dashed red line represents the true value
of the parameter...... 24
2.10 Histogram of estimated values of θ of the OU process using least squares
approximation. The dashed red line represents the true value of the
parameter...... 25
2.11 Histogram of estimated values of σ of the OU process using least
squares approximation. The dashed red line represents the true value
of the parameter...... 25
2.12 Simulated paths of the CIR process with parameters as described above. 32
xiii 2.13 Histogram of estimated values of α of the CIR process. The dashed
red line represents the true value of the parameter...... 34
2.14 Histogram of estimated values of β of the CIR process. The dashed
red line represents the true value of the parameter...... 34
2.15 Histogram of estimated values of σ of the CIR process. The dashed
red line represents the true value of the parameter...... 35
2.16 Simulated paths of the generalized CIR process with parameters as
described above...... 37
2.17 Histogram of estimated values of α of the generalized CIR process using
normal approximation. The dashed red line represents the true value
of the parameter...... 38
2.18 Histogram of estimated values of β of the generalized CIR process using
normal approximation. The dashed red line represents the true value
of the parameter...... 39
2.19 Histogram of estimated values of σ of the generalized CIR process using
normal approximation. The dashed red line represents the true value
of the parameter...... 39
xiv 2.20 Histogram of estimated values of γ of the generalized CIR process using
normal approximation. The dashed red line represents the true value
of the parameter...... 40
3.1 Simulation of a path of CIR process with N = 252, α = 0.09, β = 0.145
and σ = 0.055...... 50
3.2 Simulation of a path of Heston process with N = 252, α = 0.09, β =
0.145, µ = 0.009 and σ = 0.055...... 51
3.3 s=4 intermediate points between ti and ti+1...... 53
3.4 Simulated path of the CIR process with parameters α2 = 0.221, β2 =
th 0.601, σ2 = 0.055. Every (s + 1) value has been chosen for the plot,
where s has been defined in step-I...... 54
3.5 Simulated path of the OU process with parameters α1 = 0.14, β1 =
th 0.861, σ1 = 0.009. Every (s + 1) value has been chosen for the plot,
where s has been defined in step-I...... 55
3.6 Simulated path of the estimates of R t2 ν(s) ds at different time points. 56 t1
3.7 Simulated path of the estimate of R t2 µ(s) ds...... 57 t1
xv 3.8 Simulated path of the estimate of R t2 pν(s) dW ν(s)...... 58 t1
3.9 Simulated path of the estimate of R t2 pν(s) dW Z ...... 59 t1
3.10 Simulated sample path of the generalized Heston model...... 60
3.11 Histograms of accepted values of the parameters of the Heston Model
for = 100 and 1000 simulations. The dashed red lines represent the
true values of the parameters...... 65
3.12 Histograms of accepted values of the parameters of the Heston Model
for = 100 and 10, 000 simulations. The dashed red lines represent
the true values of the parameters...... 66
3.13 Histograms of accepted values of the parameters of the Heston Model
for = 100 and 100, 000 simulations. The dashed red lines represent
the true values of the parameters...... 67
3.14 Histograms of accepted values of the parameters of the Heston Model
for = 200 and 1, 000 simulations. The dashed red lines represent the
true values of the parameters...... 68
xvi 3.15 Histograms of accepted values of the parameters of the Heston Model
for = 200 and 10, 000 simulations. The dashed red lines represent
the true values of the parameters...... 69
3.16 Histograms of accepted values of the parameters of the Heston Model
for = 200 and 100, 000 simulations. The dashed red lines represent
the true values of the parameters...... 70
3.17 Histograms of accepted values of the parameters of the Heston Model
for = 500 and 1, 000 simulations. The dashed red lines represent the
true values of the parameters...... 71
3.18 Histograms of accepted values of the parameters of the Heston Model
for = 500 and 10, 000 simulations. The dashed red lines represent
the true values of the parameters...... 72
3.19 Histograms of accepted values of the parameters of the Heston Model
for = 500 and 100, 000 simulations. The dashed red lines represent
the true values of the parameters...... 73
3.20 Histograms of accepted values of the parameters of the Heston Model
for = 800 and 1, 000 simulations. The dashed red lines represent the
true values of the parameters...... 74
xvii 3.21 Histograms of accepted values of the parameters of the Heston Model
for = 800 and 10, 000 simulations. The dashed red lines represent
the true values of the parameters...... 75
3.22 Histograms of accepted values of the parameters of the Heston Model
for = 800 and 100, 000 simulations. The dashed red lines represent
the true values of the parameters...... 76
3.23 Histograms of accepted values of the parameters of the Heston Model
for = 1000 and 1, 000 simulations. The dashed red lines represent
the true values of the parameters...... 77
3.24 Histograms of accepted values of the parameters of the Heston Model
for = 1000 and 10, 000 simulations. The dashed red lines represent
the true values of the parameters...... 78
3.25 Histograms of accepted values of the parameters of the Heston Model
for = 1000 and 100, 000 simulations. The dashed red lines represent
the true values of the parameters...... 79
3.26 Histograms of accepted values of the parameters of the Heston Model
for = 1500 and 1, 000 simulations. The dashed red lines represent
the true values of the parameters...... 80
xviii 3.27 Histograms of accepted values of the parameters of the Heston Model
for = 1500 and 10, 000 simulations. The dashed red lines represent
the true values of the parameters...... 81
3.28 Histograms of accepted values of the parameters of the Heston Model
for = 1500 and 100, 000 simulations. The dashed red lines represent
the true values of the parameters...... 82
3.29 Histograms of estimated values of the parameters of the generalized
Heston Model for = 100 and 1000 simulations. The dashed red lines
represent the true values of the parameters...... 87
3.30 Histograms of estimated values of the parameters of the generalized
Heston Model for = 100 and 10000 simulations. The dashed red lines
represent the true values of the parameters...... 88
3.31 Histograms of estimated values of the parameters of the generalized
Heston Model for = 200 and 1000 simulations. The dashed red lines
represent the true values of the parameters...... 89
3.32 Histograms of estimated values of the parameters of the generalized
Heston Model for = 200 and 10, 000 simulations. The dashed red
lines represent the true values of the parameters...... 90
xix 3.33 Histograms of estimated values of the parameters of the generalized
Heston Model for = 500 and 1, 000 simulations. The dashed red lines
represent the true values of the parameters...... 91
3.34 Histograms of estimated values of the parameters of the generalized
Heston Model for = 500 and 10, 000 simulations. The dashed red
lines represent the true values of the parameters...... 92
3.35 Histograms of estimated values of the parameters of the generalized
Heston Model for = 800 and 1, 000 simulations. The dashed red lines
represent the true values of the parameters...... 93
3.36 Histograms of estimated values of the parameters of the generalized
Heston Model for = 800 and 10, 000 simulations. The dashed red
lines represent the true values of the parameters...... 94
3.37 Histograms of estimated values of the parameters of the generalized
Heston Model for = 1, 000 and 1, 000 simulations. The dashed red
lines represent the true values of the parameters...... 95
3.38 Histograms of estimated values of the parameters of the generalized
Heston Model for = 1, 000 and 10, 000 simulations. The dashed red
lines represent the true values of the parameters...... 96
xx 3.39 Histograms of estimated values of the parameters of the generalized
Heston Model for = 1, 500 and 1, 000 simulations. The dashed red
lines represent the true values of the parameters...... 97
3.40 Histograms of estimated values of the parameters of the generalized
Heston Model for = 1, 500 and 10, 000 simulations. The dashed red
lines represent the true values of the parameters...... 98
4.1 Daily Adjusted Closing Price of SSE from 01/01/96 to 04/08/16... 105
4.2 Daily Log Adjusted Closing Price of SSE from 01/01/96 to 04/08/16. 106
4.3 Daily Adjusted Closing Price of NIKKEI 225 from 01/05/15 to 07/24/18.106
4.4 Daily Log Returns of NIKKEI 225 from 01/05/15 to 07/24/18..... 107
4.5 Histograms of accepted values of the parameters for = 10, 000 and
100 simulations...... 109
4.6 Comparison between simulated dataset and testing dataset...... 110
4.7 Histograms of accepted values of the parameters for = 10, 000 and
100 simulations...... 112
xxi 4.8 Histograms of accepted values of the parameters for = 5, 000 and 100
simulations...... 113
4.9 Comparison between simulated dataset and testing dataset for =
5, 000 for the first period...... 115
4.10 Comparison between simulated dataset and testing dataset for =
10, 000 for the first period...... 115
4.11 Histograms of accepted values of the parameters for = 1, 000 and 100
simulations...... 117
4.12 Histograms of accepted values of the parameters for = 5, 000 and 100
simulations...... 118
4.13 Histograms of accepted values of the parameters for = 10, 000 and
100 simulations...... 119
4.14 Comparison between simulated dataset and testing dataset for =
1, 000 for the second period...... 121
4.15 Comparison between simulated dataset and testing dataset for =
5, 000 for the second period...... 121
xxii 4.16 Comparison between simulated dataset and testing dataset for =
10, 000 for the second period...... 122
4.17 Histograms of accepted values of the parameters for = 10, 000 and
100 simulations...... 123
4.18 Histograms of accepted values of the parameters for = 5, 000 and 100
simulations...... 124
4.19 Histograms of accepted values of the parameters for = 1, 000 and 100
simulations...... 125
4.20 Comparison between simulated dataset and testing dataset for =
1, 000 for the third period...... 127
4.21 Comparison between simulated dataset and testing dataset for =
5, 000 for the third period...... 127
4.22 Comparison between simulated dataset and testing dataset for =
10, 000 for the third period...... 128
4.23 Histograms of accepted values of the parameters for = 10, 000 and
100 simulations...... 129
xxiii 4.24 Histograms of accepted values of the parameters for = 5, 000 and 100
simulations...... 130
4.25 Histograms of estimated values of the parameters for = 1, 000 and
100 simulations...... 131
4.26 Comparison between simulated dataset and testing dataset for =
1, 000 for the fourth period...... 133
4.27 Comparison between simulated dataset and testing dataset for =
5, 000 for the fourth period...... 133
4.28 Comparison between simulated dataset and testing dataset for =
10, 000 for the fourth period...... 134
4.29 Histograms of accepted values of the parameters for = 10, 000 and
100 simulations...... 136
4.30 Histograms of accepted values of the parameters for = 5, 000 and 100
simulations...... 137
4.31 Histograms of accepted values of the parameters for = 1, 000 and 100
simulations...... 138
xxiv 4.32 Comparison between simulated dataset and testing dataset for = 1, 000.140
4.33 Comparison between simulated dataset and testing dataset for = 5, 000.140
4.34 Comparison between simulated dataset and testing dataset for =
10, 000...... 141
Chapter 1: Introduction

1.1 Motivation

Physicists, statisticians, and mathematicians have long been interested in theories related to finance. The tools developed in statistical physics, statistics, and theoretical mathematics can be used to model complex financial systems. Many changes have taken place in the world of finance in the latter half of the last century. For example, in 1973 currencies began to be traded in financial markets, their values determined by foreign exchange markets that are active 24 hours a day all over the world. Among these changes are new models for estimating volatility, which is a central ingredient in pricing European options. The Black-Scholes model (BSM) was among the first successful models for pricing options. However, this model rests on several assumptions that are not representative of the real world. In particular, the BSM assumes that volatility is deterministic and remains constant through the option's life, which clearly contradicts the behavior observed in financial markets. While the BSM framework can be adapted to obtain reasonable prices for plain vanilla options, the constant-volatility assumption may lead to significant mispricing when used to value options with non-conventional or exotic features.
Over the last few decades, several alternatives have been proposed to improve volatility modeling in the context of derivatives pricing. One such approach is to model volatility as a stochastic quantity. By introducing uncertainty into the behavior of volatility, the evolution of financial assets can be modeled more realistically. In addition, with appropriate parameters, stochastic volatility models can be calibrated to reproduce the market prices of liquid options and other derivative contracts. One of the most widely used stochastic volatility models was proposed by Heston in 1993. The Heston model introduces dynamics for the underlying asset that can account for the asymmetry and excess kurtosis typically observed in financial asset returns. It also provides a closed-form valuation formula that can be used to price plain vanilla options efficiently. This is particularly useful in the calibration process, where many option-price evaluations are usually required to find the optimal parameters that reproduce market prices.
1.2 Emerging Markets during Financial Crisis
Previous research shows that the sound functioning of stock markets has a considerable effect on the growth of an economy, especially a developing one. Over the past few decades, many researchers around the globe have studied stock market efficiency, and the conflicting results have made it difficult to comment on the status of the stock market of any particular country. We therefore focus our attention on stock market behavior in developing countries, which are not considered as stable as developed ones. They are unlikely to be fully information-efficient, partly due to institutional barriers restricting information flows to the market and partly due to the limited experience of market participants in rapidly impounding new information into security prices. It is therefore interesting to investigate the last 20 years, studying both the global financial crisis and the Chinese crisis and their effects on the fast-emerging economies of India and China. The recession crippled economies worldwide, but these two were relatively unaffected and hence are of particular interest.
The current fastest-growing economies, the BRICS (Brazil, Russia, India, China, South Africa), were affected primarily through four channels: trade, finance, commodities, and confidence. The slump in export demand and tighter trade credit caused a slowdown in aggregate demand. The global financial crisis inflicted significant output losses in all of these countries. However, real GDP growth in India and China remained impressive, even though both witnessed some moderation due to weakening global demand. The crisis also exposed the structural weaknesses of the global financial and real sectors. The BRICS were able to recover quickly with the support of domestic demand. The reversal of capital flows led to equity market losses and currency depreciations, resulting in lower external credit flows. The banking sectors of the BRICS economies performed relatively well [20].
Since our analysis revolves around the two recent financial crises, we need to understand their effects as well. A financial crisis is a disruption to financial markets in which adverse selection and moral hazard problems become much worse, so that financial markets are unable to efficiently channel funds to those who have the most productive investment opportunities. As a result, a financial crisis can drive the economy away from an equilibrium with high output, in which financial markets perform well, to one in which output declines sharply [19]. The end of 2007 and the beginning of 2008 saw the onset of the global financial crisis, which brought disorder to financial markets around the world; it is the first crisis considered in our study. The instability in global stock markets began with a shortfall of liquid assets in the US banking system and the continual fall in stock prices on news that Lehman Brothers, Merrill Lynch, and many other investment banks and companies were collapsing. Stock markets around the globe suffered huge losses, and the Indian stock market was no exception. The SENSEX, which had reached historically high levels at the beginning of 2008, fell back to the levels of about three years earlier, and the S&P CNX NIFTY followed a similar trend. Indian economic growth decelerated to 6.7 percent in 2008-09, a decline of 2.1 percentage points from the average growth rate of 8.8 percent over the previous five years. China was not one of the countries hardest hit by the crisis, but neither was it as insulated as many had assumed: it continued to have one of the highest rates of economic growth across the globe, recording 9.6% in 2008 and 9.2% in 2009.
While most countries would be delighted to have such growth rates, these rates nevertheless reflected a substantial drop from the 14.2% growth of 2007. In terms of the short-term impact on China, the most visible damage was inflicted on the export-oriented light industry of southern China. Thousands of companies went bust, tens of thousands of workers were laid off, and official statistics revealed that 10 million migrant workers had returned to their home provinces. In the financial sector, the stock market crash that started in late 2007 wiped out more than two thirds of market value, although this dramatic collapse was not without home-made causes [16]. The Chinese banks, for all their profitability, witnessed the sudden pull-out of many of their Western partners (Bank of America, UBS, RBS), which sold their minority stakes in order to retrieve capital. Another massive blow was to China's fledgling sovereign wealth fund, the China Investment Corporation.
The second crisis considered in our study is the Chinese stock market crash, which began with the popping of the stock market bubble on 12 June 2015. A third of the value of A-shares on the Shanghai Stock Exchange was lost within one month of the event. By 8-9 July 2015, the Shanghai stock market had fallen 30 percent over three weeks, and 1,400 companies, more than half of those listed, filed for a trading halt in an attempt to prevent further losses. This crisis was arguably inevitable: over the major part of 2014-15, investors kept pouring money into Chinese stocks, led by retail investors and encouraged by falling borrowing costs as the central bank loosened monetary policy, even though economic growth and company profits were weak.
1.3 Structure of Thesis
This thesis is organized as follows. In Chapter 2, we present the most commonly encountered stochastic models in finance, together with their simulation and parameter estimation. Chapter 3 is devoted to a complete analysis of parameter estimation for the Heston model using Approximate Bayesian Computing; in Chapter 3 we also present a new model, the generalized Heston model, for estimating volatility. In Chapter 4, we fit the generalized Heston model to data from the Shanghai Stock Exchange and the NIKKEI 225. In Chapter 5, we discuss some of the results and future work.
Chapter 2: Background
2.1 Introduction
In this chapter, we introduce the basic concepts from probability theory and its applications in the field of finance. We also introduce several important and widely used stochastic processes. In addition to their definitions, we describe a statistical approach to estimating the parameters defining these processes.
Definition 1. Let Ω be a non-empty set, and let F be a collection of subsets of Ω. F is a σ-algebra if it satisfies:

1. ∅ ∈ F;
2. if a set A ∈ F, then A^c ∈ F;
3. if a sequence of sets A1, A2, · · · ∈ F, then ∪_{n=1}^{∞} A_n ∈ F.
Definition 2. Let Ω be a non-empty set, and let F be a σ-algebra over Ω. A probability measure P is a function that assigns to every set A ∈ F a number in [0, 1]. This number is called the probability of A and is written P(A).

The measure P must satisfy the following properties:

1. P(Ω) = 1, and
2. if A1, A2, . . . is a sequence of disjoint sets such that A_n ∈ F for all n ≥ 1, then

   P(∪_{n=1}^{∞} A_n) = Σ_{n=1}^{∞} P(A_n).   (2.1)

The triple (Ω, F, P) is called a probability space.
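As a concrete illustration (added here; not part of the original text), consider a single toss of a fair coin. The countable-additivity property (2.1) can be checked directly:

```latex
\Omega = \{H, T\}, \qquad
\mathcal{F} = \{\emptyset, \{H\}, \{T\}, \Omega\}, \qquad
\mathbb{P}(\{H\}) = \mathbb{P}(\{T\}) = \tfrac{1}{2}.
% Additivity (2.1) for the disjoint sets \{H\} and \{T\}:
\mathbb{P}(\{H\} \cup \{T\}) = \mathbb{P}(\Omega) = 1 = \tfrac{1}{2} + \tfrac{1}{2}.
```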
Definition 3. Let Ω be the space of outcomes of an experiment and let F be a σ-algebra over Ω. A function X : (Ω, F) → R is a random variable if for every r ∈ R, the set F_r = {ω : X(ω) ≤ r} satisfies F_r ∈ F.

A random variable X is called discrete if its range {X(ω) : ω ∈ Ω} is countable, and continuous if its range is a continuous subset of R. A continuous random variable has an absolutely continuous cumulative distribution function (CDF), whereas the CDF of a discrete random variable is a step function with discontinuities at the values taken by the random variable.
Definition 4. Let T ⊆ [0, ∞). A family of random variables {X_t}_{t∈T} is called a stochastic process. If T is countable (e.g., T ⊆ N), the process is a discrete-time process; if T is an interval such as [0, ∞), it is a continuous-time process.
For example, let {X(t) : t = 0, 1, 2, . . .} be a stochastic process that evolves according to the following rule: X(0) = 0 and, for t ≥ 0,

X(t + 1) = X(t) + 1 with probability p,
X(t + 1) = X(t) − 1 with probability 1 − p.

The stochastic process {X(t) : t ≥ 0} is called a random walk. If p = 1/2, i.e., we are equally likely to move forward or backward, the random walk is called a symmetric random walk. If p ≠ 1/2, i.e., we have a preferred direction, the random walk is called a biased random walk. The random walk process has the following properties:

• If p = 1/2, all states of the random walk are recurrent; if p ≠ 1/2, all states are transient.
• If the process is assumed to live in {1, 2, . . . , k} for some positive integer k, each state has period 2 except for the first and last states.
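The random walk just described is straightforward to simulate. The following sketch (not from the thesis; function and parameter names are illustrative) generates a symmetric and a biased walk:

```python
import random

def random_walk(n_steps, p=0.5, seed=None):
    """Simple random walk: X(0) = 0; each step is +1 with probability p, else -1."""
    rng = random.Random(seed)
    x = 0
    path = [x]
    for _ in range(n_steps):
        x += 1 if rng.random() < p else -1
        path.append(x)
    return path

# A symmetric walk (p = 1/2) and a biased walk (p = 0.7) driven by the same uniforms:
sym = random_walk(1000, p=0.5, seed=1)
biased = random_walk(1000, p=0.7, seed=1)
```

Note that X(t) and t always have the same parity, which is exactly the period-2 behavior mentioned above.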
2.2 Brownian Motion
Brownian Motion (BM) was first observed by biologist Robert Brown [9] in 1827 while studying pollen particles. He observed that when seen under a microscope, the pollen particles floating in water exhibited a zig-zag jittery motion. He repeated the experiment with particles of dust and concluded that the motion was due to the pollen being alive. But, he could not explain the source of this random motion. The theory of
BM was first given by French mathematician Louis Bachelier in his PhD thesis titled
”Theory of Speculation” [7]. It was in 1905 when renowned physicist Albert Einstein using probabilistic arguments was able to explain the theory of BM. He observed that under the right kinetic energy, molecules of water would move randomly. This is how
Robert Brown described the movement of pollens.
The theory of BM has been applied to a variety of fields ranging from biology, physics, economics, mathematics to finance. Stock market researchers were battling with a problem similar to what Robert Brown had encountered in 1827. They were able to
figure out the path of market price but they did not know the reason behind it. They could not determine who was buying, who was selling and how demand and supply were affecting price movements.
Definition 5. Let (Ω, F, P) be a probability space. A stochastic process {W(t) : t ≥ 0} is said to be a standard Brownian motion process if:

• W(0) = 0 almost surely;
• the increments over non-overlapping time intervals are independent;
• W(t) − W(s) ∼ N(0, t − s) for s < t;
• cov(W(s), W(t)) = min(s, t).
Next, we briefly introduce the concept of a stochastic differential equation (SDE).
Let {X(t) : t ≥ 0} be a stochastic process and assume that the process satisfies the following equation,

X(t) = X(0) + ∫_0^t a(X(s), s) ds + ∫_0^t b(X(s), s) dW(s),   (2.2)

where a(·, ·) and b(·, ·) are known functions and {W(t) : t ≥ 0} is a standard Brownian motion. In the equation above, the integral ∫_0^t a(X(s), s) ds is a Riemann integral, whereas the integral ∫_0^t b(X(s), s) dW(s) is an Itô integral. Throughout this thesis, we will assume that the functions a(·, ·) and b(·, ·) satisfy sufficient conditions for such integrals to exist and to be finite almost surely. Such conditions can be found in [15]. If a process X(t) satisfies equation (2.2), we say that X(t) is a diffusion process. Equation (2.2) can be written briefly as,

dX(t) = a(X(t), t) dt + b(X(t), t) dW(t).   (2.3)

The term a(·, ·) is called the drift term, while the function b(·, ·) is called the diffusion coefficient. In this thesis we only briefly review some of the necessary tools and processes from this area. Equation (2.3) is referred to as a stochastic differential equation (SDE).
Proposition 1. (Itô's Lemma) Let X(t) be a stochastic process which satisfies the following stochastic differential equation,

dX(t) = a(X(t), t) dt + b(X(t), t) dW(t),

and let f(x, t) be any twice differentiable scalar function of two real variables x and t. Then Itô's lemma states that,

df(X(t), t) = [ ∂f(X, t)/∂t + a(X, t) ∂f(X, t)/∂x + (b²(X, t)/2) ∂²f(X, t)/∂x² ] dt + b(X, t) (∂f(X, t)/∂x) dW(t).

A proof of this lemma can be found in [15].
2.3 Geometric Brownian Motion (GBM)
Definition 6. Let {W(t) : t ≥ 0} be a standard Brownian motion. Let S(0) > 0, µ ∈ R and σ ∈ R+ be constants. If S(t) satisfies the following stochastic differential equation,

dS(t) = µS(t) dt + σS(t) dW(t),   (2.4)

then it is said to be a Geometric Brownian Motion (GBM).

The solution of (2.4) is,

S(t) = S(0) · exp{ (µ − 0.5σ²)t + σW(t) }.
For a small increase in time from t to t + ∆t, the ratio S(t + ∆t)/S(t) is

S(t + ∆t)/S(t) = exp{ (µ − 0.5σ²)∆t + σ(W(t + ∆t) − W(t)) },

where W(t + ∆t) − W(t) ∼ N(0, ∆t). From this representation, it follows that S(t) cannot be zero at any point in time. If σ (the volatility) equals zero, then equation (2.4) reduces to

S(t) = S(0) exp(µt).

This implies that, given S(0) > 0 and µ > 0, S(t) is an increasing function of time t. As noted, for any particular time interval ∆t,

S(t + ∆t) = S(t) · exp{ (µ − 0.5σ²)∆t + σ(W(t + ∆t) − W(t)) }.   (2.5)

If we take logarithms on both sides, we obtain the following equation,

log(S(t + ∆t)) − log(S(t)) = (µ − 0.5σ²)∆t + σ[W(t + ∆t) − W(t)],

where W(t + ∆t) − W(t) ∼ N(0, ∆t), so σ[W(t + ∆t) − W(t)] ∼ N(0, σ²∆t). It follows that (µ − 0.5σ²)∆t + σ[W(t + ∆t) − W(t)] ∼ N((µ − 0.5σ²)∆t, σ²∆t). Consequently, conditionally on log(S(t)),

log(S(t + ∆t)) ∼ N( log(S(t)) + (µ − 0.5σ²)∆t, σ²∆t ).
The expectation of this process is,

E(S(t) | S(0)) = E[ S(0) · exp{ σW(t) + (µ − 0.5σ²)t } | S(0) ]
= S(0) · exp{ (µ − 0.5σ²)t } · E[exp(σW(t))]
= S(0) · exp{ (µ − 0.5σ²)t } · exp{ 0.5σ²t }
= S(0) · exp(µt).

Here, we have used the fact that E[exp{cW(t)}] = exp(c²t/2), where c ∈ R. Similarly, the variance of S(t) is,

Var(S(t) | S(0)) = S(0)² · exp(2µt) · ( exp(σ²t) − 1 ).
This stochastic process has been used to model quantities that must remain positive. In Figure 2.1, we show 500 simulated paths of a GBM process, obtained according to Algorithm 1.

Algorithm 1. (Simulation of the GBM process)

• Set the process parameters, i.e., total time period T = 10, number of steps N = 1000, number of simulations n = 500, µ = 0.15, σ = 0.1.
• Let ∆t = T/N and initialize the process by setting S(0).
• Recursively simulate S(t + ∆t) using (2.5), where W(t + ∆t) − W(t) ∼ N(0, ∆t) is independent of everything else.
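A minimal Python/NumPy sketch of Algorithm 1 follows. The initial price S0 = 100 is an illustrative assumption, since the algorithm leaves S(0) unspecified:

```python
import numpy as np

# Exact simulation of GBM paths via the log-normal transition (2.5).
def simulate_gbm(mu=0.15, sigma=0.1, S0=100.0, T=10.0, N=1000, n_paths=500, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / N
    # Brownian increments W(t + dt) - W(t) ~ N(0, dt), one row per path.
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, N))
    log_increments = (mu - 0.5 * sigma**2) * dt + sigma * dW
    log_paths = np.log(S0) + np.cumsum(log_increments, axis=1)
    paths = np.empty((n_paths, N + 1))
    paths[:, 0] = S0
    paths[:, 1:] = np.exp(log_paths)
    return paths

paths = simulate_gbm()
# E[S(T)] = S0 * exp(mu * T); the sample mean should be close to this value.
print(paths[:, -1].mean(), 100.0 * np.exp(0.15 * 10.0))
```

Because the transition (2.5) is exact, the simulated marginal distributions are exact for any step size; only the time grid is discrete.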
Figure 2.1: Simulated paths of the GBM process with parameters as described in Algorithm 1.
Figure 2.2: Histogram of log of GBM at the 50th time-step. The orange curve represents the superimposed normal density curve with parameters obtained from simulated data at the 50th time-step.
2.3.1 Parameter Estimation for the GBM Process using Maximum Likelihood Estimation
Let {X(t) : t ≥ 0} be a stochastic process that satisfies the Markov property. Assume that we observe this process at a discrete collection of time points {t_0, t_1, . . . , t_n} where t_0 = 0 and t_i = iT/n for i = 1, 2, . . . , n. Let X = {X(t_0), X(t_1), . . . , X(t_n)} be the available data. For simplicity, we write X_i = X(t_i). Let θ be the parameters defining the process {X(t) : t ≥ 0}. The likelihood function is defined as,

L(θ | X_1, X_2, . . . , X_n) = ∏_{i=1}^{n} f_θ(X_i | X_{i−1}),

where f_θ(X_i | X_{i−1}) is called the transition density, and X_0 is assumed to be fixed. We make this assumption throughout this document. For the GBM process the transition density is,

f(X_i | X_{i−1}) = ( 1 / (σX_i √(2πτ)) ) exp( −(log(X_i/X_{i−1}) − ντ)² / (2σ²τ) ),

where ν = µ − σ²/2 and τ = T/n. Thus, the likelihood function is,

L(µ, σ | X) = ∏_{i=1}^{n} ( 1 / (σX_i √(2πτ)) ) exp( −(log(X_i/X_{i−1}) − ντ)² / (2σ²τ) ).

Instead of maximizing the likelihood function, we maximize the log-likelihood function ℓ(µ, σ | X).
For a simulation study, we generated a data set according to Algorithm 1 using the same parameter values as above. Based on such data, we used the built-in minimization function in Python to estimate the parameter values by minimizing the negative log-likelihood. This process was repeated 500 times, and the histogram of all estimates of the parameter µ is presented in Figure 2.3. The dashed red line represents the true value of the parameter µ. Similarly, Figure 2.4 displays the histogram of all estimates of the parameter σ, and the dashed red line represents the true value of the parameter σ.
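One possible implementation of a single replicate of this simulation study is sketched below (Python with NumPy/SciPy; the optimizer, starting values, and S(0) = 100 are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import minimize

# Fit (mu, sigma) of a GBM by minimizing the negative log-likelihood of the
# log-increments: log(X_i/X_{i-1}) ~ N((mu - sigma^2/2) tau, sigma^2 tau).
def fit_gbm(prices, tau):
    increments = np.diff(np.log(prices))

    def neg_log_lik(params):
        mu, sigma = params
        if sigma <= 0:
            return np.inf
        nu = mu - 0.5 * sigma**2
        return np.sum(0.5 * np.log(2 * np.pi * sigma**2 * tau)
                      + (increments - nu * tau)**2 / (2 * sigma**2 * tau))

    return minimize(neg_log_lik, x0=[0.1, 0.2], method="Nelder-Mead").x

# Simulate one path with mu = 0.15, sigma = 0.1 (as in Algorithm 1) and fit it.
rng = np.random.default_rng(0)
T, N, mu, sigma = 10.0, 1000, 0.15, 0.1
tau = T / N
logret = rng.normal((mu - 0.5 * sigma**2) * tau, sigma * np.sqrt(tau), N)
prices = np.concatenate([[100.0], 100.0 * np.exp(np.cumsum(logret))])
mu_hat, sigma_hat = fit_gbm(prices, tau)
print(mu_hat, sigma_hat)
```

Repeating the simulate-and-fit step 500 times and histogramming the estimates reproduces the shape of Figures 2.3 and 2.4: σ is estimated tightly, while µ has much larger sampling variability.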
Figure 2.3: Histogram of estimated values of µ of the GBM as simulated above. The dashed red line represents the true value of the parameter.
Figure 2.4: Histogram of estimated values of σ of the GBM as simulated above. The dashed red line represents the true value of the parameter.
2.4 The Ornstein-Uhlenbeck Process
The Ornstein-Uhlenbeck (OU) process is a stochastic process that was introduced
to model the velocity of a particle that is undergoing a Brownian Motion [22]. The
OU process was an attempt to model the velocity of a particle directly. This was
particularly important because if the position of a particle is given by Brownian
Motion, then its time derivative would not exist. This difficulty was overcome by
using the OU process to model the velocity of a particle.
In addition, the OU process was one of the first models used for no-arbitrage interest rates, as it has favorable properties such as mean reversion. Later, better models were developed, because the OU process can assume negative values with positive probability, whereas the quantities it was used to model, such as interest rates, can never be negative. In the financial literature, it is also known as the Vasicek model.
Definition 7. Let {X(t) : t ≥ 0} be a stochastic process and θ ∈ R and β, σ ∈ R+ be constants. If {X(t) : t ≥ 0} satisfies the following stochastic differential equation,

dX(t) = −β(X(t) − θ) dt + σ dW(t),   β, σ ∈ R+, θ ∈ R,   (2.6)

then X(t) is said to be an OU process.
In (2.6) above, the term dX(t) is called the infinitesimal change in X(t), β > 0 is
called the rate of mean reversion and θ is the long term mean of the OU process. The
parameter σ > 0 is called the volatility and dW (t) is Gaussian Noise. In (2.6) the
−β(X(t) − θ)dt term is known as the drift term and the term σdW (t) is known as the diffusion term.
The OU process is a mean-reverting process, i.e., even though the process is stochastic, it has a tendency to revert to an equilibrium value. The OU process is very helpful in modeling interest rates or volatility, as these quantities are assumed to fluctuate around an equilibrium value. As can be seen from (2.6), if σ = 0 we get an ordinary differential equation. With X(0) = 0 and σ = 0, (2.6) reduces to,

dX(t) = −β(X(t) − θ) dt,

which can be solved to get

X(t) = θ − θ exp(−βt).
As t → ∞, the general solution converges to θ. So, with the addition of the term
σdW (t), we are merely adding random fluctuations about the equilibrium position θ.
If X(t) is far from the equilibrium position θ, the mean-reversion term −β(X(t) − θ)dt becomes larger in magnitude and pushes X(t) back towards the equilibrium position θ.
2.4.1 Simulation of the OU Process
Euler-Maruyama Approximation for the OU Process - Let h > 0 be the step size. The Euler-Maruyama (EM) approximation for the OU process is,

X(t + h) − X(t) ≈ −β(X(t) − θ)h + σ(W(t + h) − W(t)),   β, σ ∈ R+, θ ∈ R.   (2.7)

This approximation leads to the following transition distribution,

[X(t + h) | X(t)] ∼ N( X(t) − β(X(t) − θ)h, σ²h ).

It can be shown that the exact transition density for an OU process is,

[X(t + h) | X(t)] ∼ N( θ + (X(t) − θ) exp(−βh), σ²(1 − exp(−2βh)) / (2β) ).   (2.8)

For a fixed t and large h > 0, [X(t + h) | X(t)] is approximately N(θ, σ²/(2β)), the stationary distribution.
In Figure 2.5, we show 50 simulated paths according to Algorithm 2.

Algorithm 2. (Simulation of the OU process)

• Set the process parameters, i.e., total time period T = 10, number of steps N = 100, number of simulations n = 1000, β = 3.5, θ = 0.7, σ = 0.1.
• Let ∆t = T/N and initialize the process by setting X(0) = 0.7.
• Recursively simulate X(t + ∆t) using the distribution given in (2.8).
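Algorithm 2 can be sketched in Python/NumPy as follows; because the exact transition (2.8) is used, no discretization error is introduced:

```python
import numpy as np

# Exact simulation of the OU process via transition (2.8); parameter values
# mirror Algorithm 2 (50 paths are drawn, as in Figure 2.5).
def simulate_ou(beta=3.5, theta=0.7, sigma=0.1, x0=0.7, T=10.0, N=100, n_paths=50, seed=0):
    rng = np.random.default_rng(seed)
    h = T / N
    a = np.exp(-beta * h)                              # per-step decay factor
    sd = np.sqrt(sigma**2 * (1 - a**2) / (2 * beta))   # exact conditional std
    X = np.empty((n_paths, N + 1))
    X[:, 0] = x0
    for i in range(N):
        X[:, i + 1] = rng.normal(theta + (X[:, i] - theta) * a, sd)
    return X

X = simulate_ou()
# Started at theta, the paths fluctuate around the long-term mean 0.7.
print(X[:, -1].mean())
```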
Figure 2.5: Simulated paths of the OU process with parameters as described above
2.4.2 Parameter Estimation for OU Process using Maximum Likelihood
Let {X(t) : t ≥ 0} be an OU process as defined in (2.6). Assume that we observe this process at a discrete collection of time points {t_0, t_1, . . . , t_n} where t_0 = 0 and t_i = iT/n for i = 1, 2, . . . , n, and let h = T/n denote the step size. Let X = {X(t_0), X(t_1), . . . , X(t_n)} be the data. For simplicity, we write X_i = X(t_i). Given that this process satisfies the Markov property, the likelihood function is defined as,

L(β, θ, σ | X) = ∏_{i=1}^{n} f(X_i | X_{i−1}),

where f(X_i | X_{i−1}) is the transition density. For the OU process the transition density is,

f(X_i | X_{i−1}) = ( 1 / (η√(2π)) ) exp( −(X_i − α_{i−1})² / (2η²) ),

where α_{i−1} = θ + (X_{i−1} − θ) exp(−βh) and η² = (σ²/(2β)) (1 − exp(−2βh)). Thus, the likelihood function can be written as,

L(β, θ, σ | X) = ∏_{i=1}^{n} ( 1 / (η√(2π)) ) exp( −(X_i − α_{i−1})² / (2η²) ),   (2.9)

and the log-likelihood function is,

ℓ(β, θ, σ | X) = −(n/2) log(2πη²) − ∑_{i=1}^{n} (X_i − α_{i−1})² / (2η²).   (2.10)

For a simulation study, we generated a data set according to Algorithm 2 using the same parameter values as above. Based on such data, we used the built-in minimization function in Python to estimate the parameter values by minimizing the negative log-likelihood. This process was repeated 500 times, and the histogram of all estimates of the parameter β is presented in Figure 2.6. The dashed red line represents the true value of the parameter β. Similarly, Figures 2.7 and 2.8 display the histograms of all estimates of the parameters θ and σ, respectively. The dashed red lines represent the true values of the parameters θ and σ.
Figure 2.6: Histogram of estimated values of β of the OU process as simulated above. The dashed red line represents the true value of the parameter.
Figure 2.7: Histogram of estimated values of θ of the OU process as simulated above. The dashed red line represents the true value of the parameter.
Figure 2.8: Histogram of estimated values of σ of the OU process as simulated above. The dashed red line represents the true value of the parameter.
2.4.3 Parameter Estimation for OU Process using Ordinary Least Squares
We consider an OU process as represented by (2.6). Using the EM discretization, we can approximate the OU process as in (2.7). This can be further simplified as,

X_{t+dt} = X_t (1 − β dt) + βθ dt + σ√dt Z,   (2.11)

where Z ∼ N(0, 1) is a standard normal random variable. When represented this way, equation (2.11) can be thought of as a normal linear model with independent errors, of the form Y = Xβ + ε, where Y is the N×1 vector of X_{t+dt} values. Thus, we can estimate the regression coefficients and then use them to estimate the parameters of the OU process. If we compare (2.11) to an AR(1) model of the form X_{i+1} = β_0 + β_1 X_i + ε, then we get β_0 = βθ dt and β_1 = 1 − β dt. It so happens that in this case we obtain the same estimates as we would from the maximum likelihood procedure: for a normal linear model, the ordinary least squares estimator equals the maximum likelihood estimator. However, we lose some information, as the least squares estimates only use information from the second observation onwards, whereas the maximum likelihood estimates use information from the first observation itself.
Let ε̂_i = X_{i+1} − (β_0 + β_1 X_i) be the i-th residual. The sum of squared residuals (SSE) is defined as,

SSE = ∑_{i=1}^{N} ε̂_i² = ∑_{i=1}^{N} X_{i+1}² + ∑_{i=1}^{N} (β_0 + β_1 X_i)² − 2 ∑_{i=1}^{N} X_{i+1}(β_0 + β_1 X_i).   (2.12)

Now, we minimize equation (2.12) with respect to the parameters β_0 and β_1. To do this, we differentiate the SSE with respect to the parameters and set the derivatives equal to zero. Doing so, we obtain,

β̂_0 = ( ∑_{i=1}^{N} X_{i+1} − β̂_1 ∑_{i=1}^{N} X_i ) / N,   (2.13)

β̂_1 = ( N ∑_{i=1}^{N} X_{i+1}X_i − ∑_{i=1}^{N} X_i ∑_{i=1}^{N} X_{i+1} ) / ( N ∑_{i=1}^{N} X_i² − (∑_{i=1}^{N} X_i)² ).   (2.14)

The data generation process and the true parameter values were identical to those in the previous section. From the least squares estimates, the estimates of the OU parameters were obtained as follows: β̂ = (1 − β̂_1)/dt, θ̂ = β̂_0/(1 − β̂_1), and σ̂ = se(ε̂)/√dt, where se(ε̂) is the residual standard error.
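The least-squares route can be sketched directly from (2.13)-(2.14) (Python with NumPy; the path is generated with the EM scheme (2.11), and the back-transformation of σ̂ assumes the error standard deviation is σ√dt):

```python
import numpy as np

# Regress X_{i+1} on X_i, then invert b1 = 1 - beta*dt, b0 = beta*theta*dt,
# sd(eps) = sigma*sqrt(dt) to recover the OU parameters.
def fit_ou_ols(x, dt):
    X, Y = x[:-1], x[1:]
    n = len(X)
    b1 = (n * np.sum(X * Y) - np.sum(X) * np.sum(Y)) / (n * np.sum(X**2) - np.sum(X)**2)
    b0 = (np.sum(Y) - b1 * np.sum(X)) / n
    resid = Y - (b0 + b1 * X)
    se = np.sqrt(np.sum(resid**2) / (n - 2))   # residual standard error
    return (1 - b1) / dt, b0 / (1 - b1), se / np.sqrt(dt)

# Generate a path via the EM scheme (2.11) and recover the parameters.
rng = np.random.default_rng(0)
beta, theta, sigma, dt, N = 3.5, 0.7, 0.1, 0.01, 1000
x = np.empty(N + 1)
x[0] = 0.7
for i in range(N):
    x[i + 1] = x[i] * (1 - beta * dt) + beta * theta * dt + sigma * np.sqrt(dt) * rng.normal()
print(fit_ou_ols(x, dt))
```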
Figure 2.9: Histogram of estimated values of β of the OU process using least squares approximation. The dashed red line represents the true value of the parameter.
Figure 2.10: Histogram of estimated values of θ of the OU process using least squares approximation. The dashed red line represents the true value of the parameter.
Figure 2.11: Histogram of estimated values of σ of the OU process using least squares approximation. The dashed red line represents the true value of the parameter.
2.5 Cox-Ingersoll-Ross Process
The Cox-Ingersoll-Ross (CIR) model [11] was introduced in 1985 by John C. Cox,
Jonathan E. Ingersoll and Stephen A. Ross in order to improve the existing Vasicek
model which allowed for negative interest rates. Earlier, the OU model was used to
model interest rates r_t, but the fundamental problem with that approach was that the change in r_t assumed a constant volatility σ regardless of what happened in the economy. There is empirical evidence suggesting that ∆r_t is more volatile when r_t is high and less volatile when r_t is low. Also, interest rates can never be negative, but if modeled using an OU process they can assume negative values with positive probability. For these reasons, the CIR model became the preferred model for interest rates, as it violated fewer of these empirical features than the OU model.
Definition 8. Let X(t) be a stochastic process and α, β, σ ∈ R+ be constants. If X(t) satisfies the following stochastic differential equation,

dX(t) = α(β − X(t)) dt + σ√X(t) dW(t),   α, β, σ ∈ R+,   (2.15)

then X(t) is said to be a CIR process.
In equation (2.15), dX(t) is the infinitesimal change in X(t), α is the rate of mean reversion, β is the long-term mean of the process, also known as the asymptotic mean, σ > 0 is the volatility, and dW(t) is Gaussian noise. The drift function is linear and has a mean-reverting tendency, because of which the CIR process is also a mean-reverting process. The diffusion function is proportional to √X(t), which helps ensure that the process never becomes negative. If all the process parameters, i.e., σ, α and β, are positive and 2αβ ≥ σ² (Feller's condition), then the CIR process is well-defined.
The transition density of X(t) given X(s) is,

f(X(t) | X(s)) = c exp(−u − v) (v/u)^{q/2} I_q(2√(uv)),   s < t,   (2.16)

where

c = 2α / ( σ² [1 − exp(−α(t − s))] ),
u = c X(s) exp(−α(t − s)),
v = c X(t),
q = 2αβ/σ² − 1,

and I_q(2√(uv)) is the modified Bessel function of the first kind of order q. We use the transformation S(t) = 2cX(t). Thus, the transition density of S(t) given S(s) is,

f(S(t) | S(s)) = (1/(2c)) f(X(t) | X(s)),   s < t.

Here, f(S(t) | S(s)) is the density of a non-central χ² distribution with non-centrality parameter 2u and 2q + 2 degrees of freedom.
2.5.1 Simulation of CIR process
Proposition 2. Let Z_1, Z_2, . . . , Z_k ∼ N(0, 1) be independent random variables. Then U = Z_1² + Z_2² + · · · + Z_k² ∼ χ²_k(0), where χ²_k(0) is a (central) chi-squared distribution with k degrees of freedom.

Let U ∼ χ²_k(0). Then the probability density function of the random variable U is,

f_U(u) = u^{k/2−1} exp(−u/2) / ( 2^{k/2} Γ(k/2) ),   u > 0,

where Γ(x) = ∫_0^∞ t^{x−1} exp(−t) dt is the gamma function. It is known that Γ(n) = (n − 1)! for an integer n > 0. The moment generating function of U is,

M_U(t) = E(exp(tU)) = (1 − 2t)^{−k/2},   t < 1/2.
Proposition 3. Let Z_j ∼ N(µ_j, 1) for j = 1, 2, . . . , k be independent random variables. Then V = Z_1² + Z_2² + · · · + Z_k² ∼ χ²_k(λ), where χ²_k(λ) is a non-central chi-squared distribution with k degrees of freedom and non-centrality parameter λ = ∑_{j=1}^{k} µ_j².
Let V ∼ χ²_k(λ). Then the probability density function of the random variable V is,

f_V(v) = ∑_{j=0}^{∞} [ exp(−λ/2) (λ/2)^j / j! ] · [ v^{j+k/2−1} exp(−v/2) / ( 2^{j+k/2} Γ(j + k/2) ) ],   v > 0.   (2.17)

The moment generating function of V is,

M_V(t) = E(exp(tV)) = exp( λt / (1 − 2t) ) (1 − 2t)^{−k/2},   t < 1/2.

We note that equation (2.17) is a Poisson-weighted mixture of central chi-squared (gamma) densities. The non-centrality parameter λ is equal to 0 if and only if µ_j = 0 for all j = 1, 2, . . . , k.
Note that a random variable V ∼ χ²_k(λ) can be simulated using the following hierarchy:

V | Y ∼ χ²_{k+2Y}(0),
Y ∼ Poisson(λ/2).
We can use the law of iterated expectations to calculate E(V) and Var(V):

E(V) = E[E(V | Y)] = E(k + 2Y) = k + 2E(Y) = k + λ.

Similarly, the variance of V is,

Var(V) = Var(E(V | Y)) + E(Var(V | Y))
= Var(k + 2Y) + E(2(k + 2Y))
= 4(λ/2) + 2k + 4(λ/2) = 2(k + 2λ).
The characteristic function of V is,

φ(t) = E(exp(itV)) = exp( λit / (1 − 2it) ) / (1 − 2it)^{k/2}.   (2.18)

It can be shown using equation (2.18) that if we have two independent random variables V_1 ∼ χ²_{k_1}(λ_1) and V_2 ∼ χ²_{k_2}(λ_2), then

V_1 + V_2 =_d χ²_{k_1+k_2}(λ_1 + λ_2).   (2.19)
The above also holds for any finite number of independent non-central chi-squared random variables. Equation (2.19) implies that the sum of independent random variables following non-central chi-squared distributions is equal in distribution to another random variable which follows a non-central chi-squared distribution. In particular, if we have a random variable V ∼ χ²_k(λ) with k > 1, then

V =_d χ²_1(λ) + χ²_{k−1}(0).   (2.20)

It is important to understand that,

χ²_1(λ) =_d [N(√λ, 1)]² =_d (N(0, 1) + √λ)².
Equation (2.20) implies that a random variable following a non-central chi-squared distribution with k > 1 degrees of freedom is equal in distribution to the sum of two independent random variables: the square of a normal random variable with mean √λ, and a central chi-squared random variable with k − 1 degrees of freedom.
Proposition 4. Assume that k > 1 and let Z ∼ N(0, 1). Then, it is true that,

χ²_k(λ) =_d (Z + √λ)² + χ²_{k−1}(0).
Therefore, when the degrees of freedom satisfy k > 1, sampling from a non-central chi-squared distribution is equivalent to sampling from a central chi-squared distribution and an independent normal distribution. This sampling method is not computationally intensive and is generally efficient. When 0 < k < 1, we cannot use the above method; instead, a non-central chi-squared distribution can be sampled using a central chi-squared distribution with random degrees of freedom.
Let Y ∼ Poisson(λ/2). The probability mass function (pmf) of Y is,

P{Y = y} = exp(−λ/2) (λ/2)^y / y!,   y = 0, 1, 2, . . . .

Conditional on the value of Y = y, let U follow a central chi-squared distribution with k + 2y degrees of freedom, whose CDF is,

P{U ≤ u | Y = y} = ( 1 / (2^{(k/2)+y} Γ((k/2) + y)) ) ∫_0^u exp(−z/2) z^{(k/2)+y−1} dz.   (2.21)

The unconditional cumulative distribution function of U is,

∑_{y=0}^{∞} P{Y = y} P{U ≤ u | Y = y} = ∑_{y=0}^{∞} exp(−λ/2) ( (λ/2)^y / y! ) P{U ≤ u | Y = y}.   (2.22)
Equation (2.22) is the CDF of a non-central chi-squared distribution with k degrees of freedom and non-centrality parameter λ.
Proposition 5. Assume that 0 < k < 1 and Y ∼ Poisson(λ/2). Then, it is true that,

χ²_k(λ) =_d χ²_{k+2Y}(0).
Therefore, when the degrees of freedom are less than 1, we can sample from a non- central chi-squared distribution by first generating a Poisson random variable Y with parameter λ/2 and then sampling from a central chi-squared distribution with k +2Y degrees of freedom. Even though this hierarchical model to sample from a non-central chi-squared distribution produces unbiased results, it is usually computationally in- tensive.
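Both sampling routes can be sketched with NumPy primitives (the function name and the parameter choices in the usage line are illustrative; the convention is λ = ∑ µ_j²):

```python
import numpy as np

def sample_noncentral_chi2(k, lam, size, rng):
    if k > 1:
        # Proposition 4: (Z + sqrt(lam))^2 plus an independent central chi2_{k-1}.
        z = rng.normal(np.sqrt(lam), 1.0, size)
        return z**2 + rng.chisquare(k - 1, size)
    # Proposition 5: central chi-square with Poisson(lam/2)-randomized df.
    y = rng.poisson(lam / 2.0, size)
    return rng.chisquare(k + 2 * y)

rng = np.random.default_rng(0)
v = sample_noncentral_chi2(0.5, 3.0, 200_000, rng)
# Mean should be near k + lambda = 3.5, variance near 2(k + 2 lambda) = 13.
print(v.mean(), v.var())
```

The k > 1 branch draws one normal and one central chi-squared variate per sample, while the 0 < k < 1 branch pays the extra cost of a Poisson draw, matching the efficiency remarks above.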
Algorithm 3. (Simulation of the CIR process)

• Set the process parameters, i.e., total time period T = 10, number of steps N = 1000, number of simulations n = 500, α = 0.9, β = 4.0, σ = 1.5.
• Let ∆t = T/N and initialize the process by setting X(0) = 4.0.
• Recursively simulate X(t + ∆t) using the non-central chi-squared transition density given in (2.16).
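A sketch of Algorithm 3 using the exact non-central chi-squared transition follows (Python with NumPy; relying on NumPy's built-in noncentral_chisquare sampler is an implementation choice):

```python
import numpy as np

# Exact simulation of the CIR process: given X(t), the scaled value
# 2c*X(t+h) follows a noncentral chi-squared law with df = 2q + 2 = 4*alpha*beta/sigma^2
# and noncentrality 2u (see (2.16)). Parameters mirror Algorithm 3.
def simulate_cir(alpha=0.9, beta=4.0, sigma=1.5, x0=4.0, T=10.0, N=1000, n_paths=500, seed=0):
    rng = np.random.default_rng(seed)
    h = T / N
    c = 2 * alpha / (sigma**2 * (1 - np.exp(-alpha * h)))
    df = 4 * alpha * beta / sigma**2
    X = np.empty((n_paths, N + 1))
    X[:, 0] = x0
    for i in range(N):
        nonc = 2 * c * X[:, i] * np.exp(-alpha * h)   # noncentrality 2u
        X[:, i + 1] = rng.noncentral_chisquare(df, nonc) / (2 * c)
    return X

X = simulate_cir()
# Started at the long-run mean beta = 4, the paths fluctuate around it.
print(X[:, -1].mean())
```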
Figure 2.12: Simulated paths of the CIR process with parameters as described above.
2.5.2 Parameter Estimation for CIR Process using Maximum Likelihood
Let {X(t) : t ≥ 0} be a CIR process as defined in (2.15). Assume that we observe this process at a discrete collection of time points {t_0, t_1, . . . , t_n} where t_0 = 0 and t_i = iT/n for i = 1, 2, . . . , n. Let X = {X(t_0), X(t_1), . . . , X(t_n)} be the data. For simplicity, we write X_i = X(t_i). Given that this process is Markovian, the likelihood function is,

L(α, β, σ | X_1, X_2, . . . , X_n) = ∏_{i=1}^{n} f(X_i | X_{i−1}),

where f(X_i | X_{i−1}) is the transition density. The transition density for a CIR process is,

f(X_i | X_{i−1}) = c exp(−u_{i−1} − v_i) (v_i/u_{i−1})^{q/2} I_q(2√(u_{i−1}v_i)),   (2.23)

where

c = 2α / ( σ² [1 − exp(−α dt)] ),
u_{i−1} = c X_{i−1} exp(−α dt),
v_i = c X_i,
q = 2αβ/σ² − 1,

and I_q is the modified Bessel function of the first kind of order q. The log-likelihood function is,

ℓ(α, β, σ | X) = ∑_{i=1}^{n} log f(X_i | X_{i−1})
= n log(c) + ∑_{i=1}^{n} [ −u_{i−1} − v_i + (q/2) log(v_i/u_{i−1}) + log(I_q(2√(u_{i−1}v_i))) ],   (2.24)

where c, u_{i−1}, v_i, q and I_q have the usual meaning. For a simulation study, we generated a data set according to Algorithm 3 using (α, β, σ) = (0.9, 4.0, 1.5). Based on such data, we used the built-in minimization function in Python to estimate the parameter values by minimizing the negative log-likelihood. This process was repeated 500 times, and the histogram of all estimates of the parameter α is presented in Figure 2.13. The dashed red line represents the true value of the parameter α. Similarly, Figures 2.14 and 2.15 display the histograms of all estimates of the parameters β and σ, respectively. The dashed red lines represent the true values of the parameters β and σ.
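A possible implementation of this likelihood minimization is sketched below (Python with SciPy). Using the exponentially scaled Bessel function ive, so that log I_q(z) = log(ive(q, z)) + z stays finite for the large Bessel arguments that occur at small dt, is an implementation choice; the optimizer and starting values are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import ive

# Minimize the negative of the CIR log-likelihood (2.24).
def fit_cir(x, dt):
    def neg_log_lik(params):
        alpha, beta, sigma = params
        if min(alpha, beta, sigma) <= 0:
            return np.inf
        c = 2 * alpha / (sigma**2 * (1 - np.exp(-alpha * dt)))
        q = 2 * alpha * beta / sigma**2 - 1
        u = c * x[:-1] * np.exp(-alpha * dt)
        v = c * x[1:]
        z = 2 * np.sqrt(u * v)
        ll = (len(u) * np.log(c) - np.sum(u + v)
              + np.sum(0.5 * q * np.log(v / u) + np.log(ive(q, z)) + z))
        return -ll

    return minimize(neg_log_lik, x0=[0.5, np.mean(x), 1.0],
                    method="Nelder-Mead", options={"maxiter": 4000}).x

# Simulate one path exactly via the non-central chi-squared transition, then fit.
rng = np.random.default_rng(0)
alpha, beta, sigma, dt, N = 0.9, 4.0, 1.5, 0.01, 1000
c = 2 * alpha / (sigma**2 * (1 - np.exp(-alpha * dt)))
df = 4 * alpha * beta / sigma**2
x = np.empty(N + 1)
x[0] = 4.0
for i in range(N):
    x[i + 1] = rng.noncentral_chisquare(df, 2 * c * x[i] * np.exp(-alpha * dt)) / (2 * c)
print(fit_cir(x, dt))
```

Consistent with Figures 2.13-2.15, σ is recovered tightly from a single path, while α and β exhibit much larger sampling variability.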
Figure 2.13: Histogram of estimated values of α of the CIR process. The dashed red line represents the true value of the parameter.
Figure 2.14: Histogram of estimated values of β of the CIR process. The dashed red line represents the true value of the parameter.
Figure 2.15: Histogram of estimated values of σ of the CIR process. The dashed red line represents the true value of the parameter.
2.6 Generalized Cox-Ingersoll-Ross model
Definition 9. Let X_t be a stochastic process and α, β, σ ∈ R+ and γ ∈ (0, 1) be constants. If X_t satisfies the following stochastic differential equation,

dX_t = α(β − X_t) dt + σX_t^γ dW_t,   α, β, σ ∈ R+, γ ∈ (0, 1),   (2.25)

then X_t is said to be a generalized CIR [10] process.
Let {X(t) : t ≥ 0} be a stochastic process. Assume that we observe this process at a discrete collection of time points {t_0, t_1, . . . , t_n} where t_0 = 0 and t_i = iT/n for i = 1, 2, . . . , n. Let X = {X(t_0), X(t_1), . . . , X(t_n)} be the data. For simplicity, we write X_i = X(t_i). Let θ = (α, β, γ, σ). The likelihood function is,

L(θ | X) = ∏_{i=1}^{n} f(X_i | X_{i−1}),

where f(X_i | X_{i−1}) is the transition density.

Even though the exact likelihood function is not available in closed form, we use a Gaussian approximation, which works relatively well for small time increments ∆t. In order to get accurate results, we would like the time increment ∆t to be as small as possible.
So, using the Gaussian approximation we have,

X_{i+1} | X_i ∼ N( X_i + α(β − X_i) dt, σ²X_i^{2γ} dt ).   (2.26)

This follows because W(t + dt) − W(t) ∼ N(0, dt).
Algorithm 4. (Simulation of the generalized CIR process)

• Set the process parameters, i.e., total time period T = 10, number of steps N = 1000, number of simulations n = 1000, α = 0.5, β = 3, σ = 0.1, γ = 0.2.
• Let ∆t = T/N and initialize the process by setting X(0) = 0.3.
• Recursively simulate X(t + ∆t) using the distribution given in (2.26).

In Figure 2.16, we show 50 simulated paths according to Algorithm 4.
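Algorithm 4 can be sketched as follows (Python with NumPy; clamping the state at zero before taking the power is a guard against discretization artifacts, not part of the algorithm as stated):

```python
import numpy as np

# Euler-type simulation of the generalized CIR process via the Gaussian
# transition (2.26); parameter values mirror Algorithm 4.
def simulate_gcir(alpha=0.5, beta=3.0, sigma=0.1, gamma=0.2, x0=0.3,
                  T=10.0, N=1000, n_paths=50, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / N
    X = np.empty((n_paths, N + 1))
    X[:, 0] = x0
    for i in range(N):
        drift = X[:, i] + alpha * (beta - X[:, i]) * dt
        sd = sigma * np.maximum(X[:, i], 0.0)**gamma * np.sqrt(dt)
        X[:, i + 1] = rng.normal(drift, sd)
    return X

X = simulate_gcir()
# With these values the paths drift from 0.3 toward the long-run mean beta = 3.
print(X[:, -1].mean())
```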
Figure 2.16: Simulated paths of the generalized CIR process with parameters as described above
2.6.1 Parameter Estimation for generalized CIR Process using Maximum Likelihood
We calculate the likelihood function using this Gaussian approximation. Thus, the likelihood function can be written as,

L(α, β, σ, γ | X) = ∏_{i=1}^{N} ( 1 / √(πηX_i^{2γ}) ) exp( −(X_{i+1} − X_i − α(β − X_i)dt)² / (ηX_i^{2γ}) ),   (2.27)

where η = 2σ²dt. But instead of maximizing the likelihood function, we maximize the log-likelihood function. The log-likelihood function is,

ℓ(α, β, σ, γ | X) = −(N/2) log(πη) − ∑_{i=1}^{N} [ (X_{i+1} − X_i − α(β − X_i)dt)² / (ηX_i^{2γ}) + γ log(X_i) ].   (2.28)

We maximize ℓ(α, β, γ, σ | X) in equation (2.28) to get estimates for the parameters.
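The maximization of (2.28) can be sketched as follows (Python with SciPy; the optimizer, starting values, the start at the long-run mean, and the longer horizon T = 50, used here so that α and β are better identified, are all illustrative assumptions):

```python
import numpy as np
from scipy.optimize import minimize

# Minimize the negative log-likelihood implied by the Gaussian approximation (2.26).
def fit_gcir(x, dt):
    def neg_log_lik(params):
        alpha, beta, sigma, gamma = params
        if sigma <= 0 or not 0 < gamma < 1:
            return np.inf
        mean = x[:-1] + alpha * (beta - x[:-1]) * dt
        var = sigma**2 * x[:-1]**(2 * gamma) * dt
        return 0.5 * np.sum(np.log(2 * np.pi * var) + (x[1:] - mean)**2 / var)

    return minimize(neg_log_lik, x0=[1.0, np.mean(x), 0.2, 0.5],
                    method="Nelder-Mead", options={"maxiter": 8000}).x

# Generate a long path from (2.26) with the true values of Algorithm 4 and fit it.
rng = np.random.default_rng(0)
alpha, beta, sigma, gamma, dt, N = 0.5, 3.0, 0.1, 0.2, 0.01, 5000
x = np.empty(N + 1)
x[0] = 3.0
for i in range(N):
    x[i + 1] = (x[i] + alpha * (beta - x[i]) * dt
                + sigma * x[i]**gamma * np.sqrt(dt) * rng.normal())
print(fit_gcir(x, dt))
```

Note that σ and γ are only weakly separately identified when the path stays near one level; what the data pin down well is the local diffusion level σX^γ.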
Figure 2.17: Histogram of estimated values of α of the generalized CIR process using normal approximation. The dashed red line represents the true value of the parameter.
Figure 2.18: Histogram of estimated values of β of the generalized CIR process using normal approximation. The dashed red line represents the true value of the parameter.
Figure 2.19: Histogram of estimated values of σ of the generalized CIR process using normal approximation. The dashed red line represents the true value of the parameter.
Figure 2.20: Histogram of estimated values of γ of the generalized CIR process using normal approximation. The dashed red line represents the true value of the parameter.
2.6.2 Distribution of ∫_{t1}^{t2} W(s) ds

In this subsection, we derive the distribution of the quantity ∫_{t1}^{t2} W(s) ds, where {W(t)} is a standard BM process. This distribution will play a significant role later in this thesis. The distribution of ∫_0^t W(s) ds is a special case of the distribution of ∫_{t1}^{t2} W(s) ds. Let us start by finding the mean and variance of ∫_0^t W(s) ds. Let f(x) = x³; applying Itô's Lemma (Proposition 1) we get,

W³(t) = 3 ∫_0^t W²(s) dW(s) + 3 ∫_0^t W(s) ds,

so that

∫_0^t W(s) ds = (1/3) W³(t) − ∫_0^t W²(s) dW(s).

Since E[W³(t)] = 0 and the Itô integral has mean zero, the mean of ∫_0^t W(s) ds is,

E[ ∫_0^t W(s) ds ] = 0.

The variance of ∫_0^t W(s) ds is,

Var[ ∫_0^t W(s) ds ] = E[ ( ∫_0^t W(s) ds )² ] = E[ ∫_0^t ∫_0^t W(s)W(u) du ds ]
= ∫_0^t ∫_0^t E[W(s)W(u)] du ds = ∫_0^t ∫_0^t min(s, u) du ds
= ∫_0^t ( ∫_0^s u du + ∫_s^t s du ) ds = t³/3.

Thus, ∫_0^t W(s) ds is a random variable with mean 0 and variance t³/3. Being a limit of Gaussian Riemann sums, it is normally distributed, so

∫_0^t W(s) ds ∼ N(0, t³/3).

Having found the mean and variance of ∫_0^t W(s) ds, we move on to the more general case of ∫_{t1}^{t2} W(s) ds. Its mean is,

E[ ∫_{t1}^{t2} W(s) ds ] = ∫_{t1}^{t2} E[W(s)] ds = 0.

Its variance is,

Var[ ∫_{t1}^{t2} W(s) ds ] = E[ ( ∫_{t1}^{t2} W(s) ds )² ]
= ∫_{t1}^{t2} ∫_{t1}^{t2} E[W(s)W(u)] du ds = ∫_{t1}^{t2} ∫_{t1}^{t2} min(s, u) du ds
= ∫_{t1}^{t2} ( ∫_{t1}^{s} u du + ∫_{s}^{t2} s du ) ds
= ∫_{t1}^{t2} ( (s² − t1²)/2 + s(t2 − s) ) ds
= 2t1³/3 + t2³/3 − t1²t2.

Thus, the variance of ∫_{t1}^{t2} W(s) ds is

2t1³/3 + t2³/3 − t1²t2.   (2.29)

We note that:

• If we let t1 = 0 and t2 = t, then equation (2.29) reduces to the variance of the special case described earlier, i.e., t³/3.
• If we let t1 = t2 = t, then equation (2.29) reduces to 0.

Thus,

∫_{t1}^{t2} W(s) ds ∼ N( 0, 2t1³/3 + t2³/3 − t1²t2 ).
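The variance formula (2.29) can be checked by Monte Carlo (Python with NumPy; the values t1 = 1, t2 = 2 and the discretization are illustrative choices):

```python
import numpy as np

# Monte Carlo check of Var(int_{t1}^{t2} W(s) ds) = 2 t1^3/3 + t2^3/3 - t1^2 t2.
rng = np.random.default_rng(0)
t1, t2, N, n_paths = 1.0, 2.0, 2000, 20000
dt = (t2 - t1) / N

# Start each path at W(t1) ~ N(0, t1), then accumulate increments on [t1, t2].
W = rng.normal(0.0, np.sqrt(t1), n_paths)
integral = np.zeros(n_paths)
for _ in range(N):
    integral += W * dt                          # left-point Riemann sum of W(s) ds
    W += rng.normal(0.0, np.sqrt(dt), n_paths)

var_theory = 2 * t1**3 / 3 + t2**3 / 3 - t1**2 * t2   # = 4/3 for t1=1, t2=2
print(integral.mean(), integral.var(), var_theory)
```

The sample mean should be near 0 and the sample variance near 4/3, up to Monte Carlo and discretization error.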
Chapter 3: Approximate Bayesian Computing for Stochastic Volatility Models
3.1 Heston Model
In his 1993 paper "A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options" [13], Heston proposed a new stochastic volatility model, which now carries his name. The Heston model is used extensively in estimating the volatility of financial assets and derivatives. This model is an extension of the Black-Scholes model: the underlying asset price still evolves according to the Black-Scholes dynamics, but a stochastic behavior is introduced for the volatility component. That is, the model assumes that the volatility component in the Black-Scholes model is not fixed but rather is governed by another stochastic differential equation. In particular, the Heston model uses a mean-reverting CIR model to describe the evolution of the volatility.
Definition 10. Let {S(t) : t ≥ 0} be the price of the asset and {ν(t) : t ≥ 0} be the variance process. The equations governing the Heston model are,

dS(t) = µS(t) dt + √ν(t) S(t) dW^S(t),   (3.1)
dν(t) = α(β − ν(t)) dt + σ√ν(t) dW^ν(t),   (3.2)

where W^S(t) and W^ν(t) are correlated standard BM processes with correlation ρ ∈ [−1, 1], µ is the risk-free rate, dS(t) is the infinitesimal change in S(t) (the price of the underlying asset), α is the rate of mean reversion, β is the long-term mean of the CIR process (also known as the asymptotic mean), and σ > 0 is the volatility of the CIR process.
The Heston model has certain desirable properties which make it a useful model.
Under the Heston model, volatility is modeled as a mean reverting process. This
assumption of the Heston model is also corroborated by the behavior of volatility in financial markets. If the volatility of an asset were not mean-reverting, there would be many assets whose volatility stayed close to zero or became very high. In practice, however, such episodes are rare and short-lived.
The Heston model also links asset prices with volatility by introducing correlated shocks between the two. This assumption is particularly useful as it helps us model the statistical dependence between an asset and its volatility. Empirical evidence [21], [14] shows that in an equity market the price and the volatility of an asset move in opposite directions, i.e., large drops in asset prices result in increased volatility.
However, the flexibility that the Heston framework provides comes at the expense of
increased model complexity. It is generally difficult to implement the Heston model
as compared to the Black-Scholes model and there is always a tradeoff between the
two models in terms of complexity and accuracy. The Heston model is generally more
complex but also more accurate.
Proposition 6. Let dW^\nu(t) \sim N(0, dt) and dW^S(t) = \rho\,dW^\nu(t) + \sqrt{1-\rho^2}\,dZ(t), where dZ(t) \sim N(0, dt) is independent of dW^\nu(t). Then,

Var[dW^\nu(t)] = dt

Cov[dW^S(t), dW^\nu(t)] = \rho\,Cov[dW^\nu(t), dW^\nu(t)] + \sqrt{1-\rho^2}\,Cov[dW^\nu(t), dZ(t)]
= \rho\,Var[dW^\nu(t)]
= \rho\,dt

The correlation between dW^S(t) and dW^\nu(t) is therefore equal to ρ. Let X(t) = log(S(t)). Using Itô's Lemma (Proposition 1) we can rewrite equation (3.1) as,

dX(t) = \left[\mu S(t)\cdot\frac{1}{S(t)} + \frac{1}{2}\nu(t)S^2(t)\cdot\left(\frac{-1}{S^2(t)}\right)\right]dt + \sqrt{\nu(t)}\cdot S(t)\cdot\frac{1}{S(t)}\,dW^X(t)

dX(t) = \left(\mu - \frac{\nu(t)}{2}\right)dt + \sqrt{\nu(t)}\,dW^X(t)

Thus, after using Itô's lemma we get the following set of equations,

dX(t) = \left(\mu - \frac{\nu(t)}{2}\right)dt + \sqrt{\nu(t)}\,dW^X(t)   (3.3)

d\nu(t) = \alpha(\beta - \nu(t))\,dt + \sigma\sqrt{\nu(t)}\,dW^\nu(t)   (3.4)

where dW^X(t) = dW^S(t) and all the other parameters have their usual meanings.
Feller’s Condition - It can be seen from equations (3.3) and (3.4) that ν(t) appears under a square root sign; thus, we require ν(t) to be non-negative. Feller proposed a condition which guarantees this: if 2αβ ≥ σ², then ν(t) takes non-negative values.
3.1.1 Simulation of sample paths of the Heston Model
There have been extensive studies on how to simulate sample paths of the Heston model. The basic idea is to partition a time interval into equally spaced subintervals and then simulate asset price paths on that partition. Apart from the generic Euler-Maruyama discretization and the Milstein scheme, Broadie and Kaya's [8] algorithm is also popular. There have been several modifications of Broadie and Kaya's algorithm, such as Smith's approximation [23], Broadie and Kaya's drift interpolation [25], Andersen's quadratic-exponential scheme [6], and Tse and Wan's inverse Gaussian scheme [24]. In this project, we use the exact scheme of Broadie and Kaya [8], but we estimate the integrals using Riemann sums. This is slightly different from the work of A. van Haastrecht and A. Pelsser [25], who use the trapezoidal rule to estimate the integrals.
3.1.2 Euler-Maruyama (EM) Approximation
The Euler-Maruyama (EM) algorithm is an easily implementable approximation which can be used to approximate any SDE. The original process X(t) is approximated by another process X̃(t) which is defined in the following way,

\tilde{X}(t+\Delta t) = \tilde{X}(t) + \left[\mu - \frac{1}{2}\tilde{\nu}(t)\right]\Delta t + \sqrt{\tilde{\nu}(t)\,\Delta t}\,Z_X

\tilde{\nu}(t+\Delta t) = \tilde{\nu}(t) + \alpha\left[\beta - \tilde{\nu}(t)\right]\Delta t + \sigma\sqrt{\tilde{\nu}(t)\,\Delta t}\,Z_\nu

where ν̃(t) is another process approximating the process ν(t). Between any two time points t and t + ∆t, the processes X̃(·) and ν̃(·) are defined via linear interpolation of the values produced by the above equations. Here, Z_X and Z_ν are standard normal random variables with correlation ρ, i.e., Corr(Z_X, Z_ν) = ρ.
In practice, this algorithm is not robust. When Feller's condition is violated, the approximating variance process has a positive probability of becoming negative. In addition, the Gaussian approximation above is valid only when ∆t is very small. To circumvent this problem, Lord, Koekkoek and van Dijk [17] propose a modification to the EM algorithm.
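The EM recursion above can be sketched in a few lines of Python. This is a minimal illustration of ours with NumPy (parameter values in the usage below are illustrative, not estimates from the thesis); it is the naive version, so if the variance path dips below zero the square root produces NaNs, which is exactly the robustness issue just noted.

```python
import numpy as np

# Minimal sketch (ours) of the EM scheme for the Heston model above.
# Naive version: a negative variance value makes sqrt() return NaN.
def heston_em(x0, v0, mu, alpha, beta, sigma, rho, T=1.0, n=252, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n
    x, v = np.empty(n + 1), np.empty(n + 1)
    x[0], v[0] = x0, v0
    for i in range(n):
        # Correlated standard normals with Corr(Z_X, Z_v) = rho.
        z_v = rng.standard_normal()
        z_x = rho * z_v + np.sqrt(1 - rho**2) * rng.standard_normal()
        x[i + 1] = x[i] + (mu - v[i] / 2) * dt + np.sqrt(v[i] * dt) * z_x
        v[i + 1] = v[i] + alpha * (beta - v[i]) * dt + sigma * np.sqrt(v[i] * dt) * z_v
    return x, v
```

With, e.g., `heston_em(np.log(100.0), 0.09, 0.05, 2.0, 0.09, 0.1, -0.6)` the Feller condition 2αβ ≥ σ² holds comfortably, so negative variance values are very unlikely at this step size.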
3.1.3 Euler-Maruyama scheme with Lord et al.'s modification
The equations of the modified EM algorithm are,

\tilde{X}(t+\Delta t) = \tilde{X}(t) + \left[\mu - \frac{1}{2}f(\tilde{\nu}(t))\right]\Delta t + \sqrt{f(\tilde{\nu}(t))\,\Delta t}\,Z_X

\tilde{\nu}(t+\Delta t) = \tilde{\nu}(t) + \alpha\left[\beta - f(\tilde{\nu}(t))\right]\Delta t + \sigma\sqrt{f(\tilde{\nu}(t))\,\Delta t}\,Z_\nu

where f(z) = max(0, z). If the variance process ν̃ becomes negative, it corrects itself with a deterministic upward drift of αβ.
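A single step of the modified scheme might be sketched as follows; this is our own illustration, assuming, as in the truncation f(z) = max(0, z) above, that the truncated variance is used wherever ν̃ enters, so the square roots stay well defined.

```python
import numpy as np

# Sketch (ours) of one truncated EM step, with f(z) = max(0, z) applied
# wherever the variance enters the update.
def truncated_em_step(x, v, mu, alpha, beta, sigma, z_x, z_v, dt):
    vp = max(v, 0.0)  # f(v): a negative variance contributes no diffusion
    x_new = x + (mu - vp / 2) * dt + np.sqrt(vp * dt) * z_x
    # When v < 0, the variance update reduces to v + alpha*beta*dt:
    # the deterministic upward drift described in the text.
    v_new = v + alpha * (beta - vp) * dt + sigma * np.sqrt(vp * dt) * z_v
    return x_new, v_new
```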
3.1.4 Milstein scheme
The Milstein scheme is very similar to the EM algorithm; however, it uses a second-order approximation to the SDE, whereas the EM algorithm uses a first-order (linear) approximation.

The algorithm under the Milstein scheme is,

\tilde{X}(t+\Delta t) = \tilde{X}(t) + \left[\mu - \frac{1}{2}\tilde{\nu}(t)\right]\Delta t + \sqrt{\tilde{\nu}(t)\,\Delta t}\,Z_X,

\tilde{\nu}(t+\Delta t) = \tilde{\nu}(t) + \alpha\left[\beta - f(\tilde{\nu}(t))\right]\Delta t + \sigma\sqrt{\tilde{\nu}(t)\,\Delta t}\,Z_\nu + \frac{\sigma^2}{4}\,\Delta t\left(Z_\nu^2 - 1\right),

where f(z) = max(0, z). It is important to note that ν̃(t + ∆t) > 0 whenever ν̃(t) > 0 and 4αβ ≥ σ². This fact was stated by Gartner in [12]. When this inequality is not satisfied, it can still be shown that negative realizations of ν̃ occur far less often than under the EM algorithm.
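The variance update of the Milstein scheme can be sketched as below (our own illustration; the truncation f in the drift follows the equation above, and applying max(0, ·) under the square root as well is our own safeguard to keep the step well defined for negative inputs).

```python
import numpy as np

# Sketch (ours) of one Milstein update for the variance process, with the
# second-order correction term (sigma^2/4)*dt*(Z^2 - 1).
def milstein_var_step(v, alpha, beta, sigma, z_v, dt):
    vp = max(v, 0.0)  # f(z) = max(0, z) in the mean-reversion drift
    return (v + alpha * (beta - vp) * dt
            + sigma * np.sqrt(vp * dt) * z_v
            + (sigma**2 / 4) * dt * (z_v**2 - 1))
```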
3.1.5 Broadie and Kaya's Exact Algorithm
An exact simulation algorithm for the Heston model was proposed by Broadie and Kaya [8]. However, this algorithm is rarely used in practice as it is computationally intensive. The solution to (3.1) can be written as,

S(t+\Delta t) = S(t)\exp\left(\mu\Delta t - \frac{1}{2}\int_t^{t+\Delta t}\nu(u)\,du + \int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_S(u)\right)

Using this and the transformation X = log(S), we get the following explicit solution for X(t),

X(t+\Delta t) = X(t) + \mu\Delta t - \frac{1}{2}\int_t^{t+\Delta t}\nu(u)\,du + \rho\int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_\nu(u) + \sqrt{1-\rho^2}\int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_X(u)   (3.5)
where W_\nu(u) and W_X(u) are values from two independent Brownian motions at time u. If we integrate (3.4), we get,

\nu(t+\Delta t) = \nu(t) + \int_t^{t+\Delta t}\alpha(\beta - \nu(u))\,du + \sigma\int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_\nu(u)   (3.6)

Equation (3.6) can be re-written as,

\int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_\nu(u) = \sigma^{-1}\left[\nu(t+\Delta t) - \nu(t) - \alpha\beta\Delta t + \alpha\int_t^{t+\Delta t}\nu(u)\,du\right]

and then, if we substitute this expression for \int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_\nu(u) into equation (3.5), we get,

X(t+\Delta t) = X(t) + \mu\Delta t - \frac{1}{2}\int_t^{t+\Delta t}\nu(u)\,du + \frac{\rho}{\sigma}\left[\nu(t+\Delta t) - \nu(t) - \alpha\beta\Delta t\right] + \frac{\alpha\rho}{\sigma}\int_t^{t+\Delta t}\nu(u)\,du + \sqrt{1-\rho^2}\int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_X(u)
Thus, we have to sample the following quantities in the required order,

1. ν(t + ∆t) given ν(t)

2. \int_t^{t+\Delta t}\nu(u)\,du given ν(t + ∆t) and ν(t)

3. \int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_\nu(u) given \int_t^{t+\Delta t}\nu(u)\,du

We know that a transformation of ν(t + dt) follows a scaled non-central χ² distribution. Specifically,

\frac{n(dt)\,\nu(t + dt)}{\exp\{-\alpha\,dt\}}

has a non-central χ² distribution with non-centrality parameter λ and

d = \frac{4\alpha\beta}{\sigma^2}

degrees of freedom, where

\lambda = n(dt)\,\nu(t), \qquad n(dt) = \frac{4\alpha\exp\{-\alpha\,dt\}}{\sigma^2(1 - \exp\{-\alpha\,dt\})}.

To get a value for the future time step t + dt, we sample from a non-central χ² distribution with non-centrality parameter λ and d degrees of freedom. We use a built-in random number generator from the numpy module to achieve this.
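Using numpy's built-in `noncentral_chisquare` generator (a method of `numpy.random.Generator`), the exact CIR transition described above can be sketched as follows; the function name is our own.

```python
import numpy as np

# Sketch (ours) of the exact CIR transition step, using the scaled
# noncentral chi-square representation described in the text.
def sample_cir_exact(v_t, alpha, beta, sigma, dt, rng):
    e = np.exp(-alpha * dt)
    n = 4 * alpha * e / (sigma**2 * (1 - e))  # n(dt)
    d = 4 * alpha * beta / sigma**2           # degrees of freedom
    lam = n * v_t                             # non-centrality parameter
    # n(dt) * v(t+dt) / e ~ noncentral chi-square(d, lam); solve for v(t+dt).
    return rng.noncentral_chisquare(d, lam) * e / n
```

As a check on the scaling, the sample mean of repeated draws should match the known CIR conditional mean β(1 − e^{−α dt}) + ν(t) e^{−α dt}.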
Algorithm 5. The sample paths of the Heston model can be simulated using the following algorithm,

1. Sample ν̂(t + ∆t) given ν̂(t) from a non-central χ² distribution.

2. Given ν̂(t + ∆t) and ν̂(t), estimate \int_t^{t+\Delta t}\nu(u)\,du. For this we use the trapezoidal rule and estimate the integrated variance as,

\widehat{IV}(t, t+\Delta t) \approx \Delta t\,\frac{\hat{\nu}(t+\Delta t) + \hat{\nu}(t)}{2}.

3. Generate a random observation Z_x from an independent standard Gaussian random variable.

4. Use the following exact scheme to get the successive values of a sample path,

\hat{X}(t+\Delta t) = \hat{X}(t) + \mu\Delta t + \frac{\alpha\rho}{\sigma}\,\widehat{IV}(t, t+\Delta t) - \frac{\widehat{IV}(t, t+\Delta t)}{2} + \frac{\rho}{\sigma}\left[\hat{\nu}(t+\Delta t) - \hat{\nu}(t) - \alpha\beta\Delta t\right] + \sqrt{1-\rho^2}\,Z_x\sqrt{\widehat{IV}(t, t+\Delta t)}   (3.7)
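Putting the four steps together, a sketch of Algorithm 5 might look as follows (our own code; the function name and the parameter values in the test are illustrative, not estimates from the thesis).

```python
import numpy as np

# Sketch (ours) of Algorithm 5: exact noncentral chi-square variance
# transitions plus the trapezoidal integrated-variance estimate.
def heston_exact_path(x0, v0, mu, alpha, beta, sigma, rho, T=1.0, n=252, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n
    e = np.exp(-alpha * dt)
    nfac = 4 * alpha * e / (sigma**2 * (1 - e))   # n(dt)
    d = 4 * alpha * beta / sigma**2               # degrees of freedom
    x, v = np.empty(n + 1), np.empty(n + 1)
    x[0], v[0] = x0, v0
    for i in range(n):
        # Step 1: exact CIR transition via a noncentral chi-square draw.
        v[i + 1] = rng.noncentral_chisquare(d, nfac * v[i]) * e / nfac
        # Step 2: trapezoidal estimate of the integrated variance.
        iv = dt * (v[i] + v[i + 1]) / 2
        # Steps 3-4: Gaussian shock and the exact update (3.7).
        z = rng.standard_normal()
        x[i + 1] = (x[i] + mu * dt + (alpha * rho / sigma) * iv - iv / 2
                    + (rho / sigma) * (v[i + 1] - v[i] - alpha * beta * dt)
                    + np.sqrt(1 - rho**2) * z * np.sqrt(iv))
    return x, v
```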
Figure 3.1: Simulation of a path of a CIR process with N = 252, α = 0.09, β = 0.145 and σ = 0.055.
Figure 3.2: Simulation of a path of a Heston process with N = 252, α = 0.09, β = 0.145, µ = 0.009 and σ = 0.055.
3.2 A generalized Heston Model
In this section, we propose a generalization of the Heston model. We extend the Heston model (Definition 10) by allowing the drift µ to be governed by another stochastic process. The rationale behind this idea is that there are local variations in the drift component which, we believe, may be captured by the generalized Heston model.

As far as we know, all the models that have been proposed in the literature assume the interest rate to be a strictly positive quantity. However, there have been instances when interest rates have been negative [5]. We believe the generalized Heston model would be more appropriate for estimating volatility in such markets.
Definition 11. Let S(t); t ≥ 0 be the price of the asset and ν(t); t ≥ 0 be the variance process. The equations governing the generalized Heston model are as follows:

dS(t) = \mu(t)S(t)\,dt + \sqrt{\nu(t)}\,S(t)\,dW^S(t)   (3.8)

d\mu(t) = \alpha_1(\beta_1 - \mu(t))\,dt + \sigma_1\,dW^\mu(t)   (3.9)

d\nu(t) = \alpha_2(\beta_2 - \nu(t))\,dt + \sigma_2\sqrt{\nu(t)}\,dW^\nu(t)   (3.10)

where dW^µ(t) is uncorrelated with both dW^S(t) and dW^ν(t) by construction. The rest of the parameters have their usual meanings as defined earlier. From Proposition (2.2) we know that equation (3.9) can be written as,

\mu(t) = \mu(0) + \int_0^t \alpha_1(\beta_1 - \mu(s))\,ds + \sigma_1\int_0^t dW^\mu(s).   (3.11)

Using the transformation f(X, t) = X(t) = log(S(t)) and Itô's lemma (Proposition 1), equation (3.8) can be written as,

dX(t) = \left[\mu(t)S(t)\cdot\frac{1}{S(t)} + \frac{1}{2}\nu(t)S^2(t)\cdot\left(\frac{-1}{S^2(t)}\right)\right]dt + \sqrt{\nu(t)}\cdot S(t)\cdot\frac{1}{S(t)}\,dW^X(t)

dX(t) = \left(\mu(t) - \frac{\nu(t)}{2}\right)dt + \sqrt{\nu(t)}\,dW^X(t),

where µ(t) follows a mean-reverting OU process given by equation (3.9) and ν(t) follows a CIR process given by equation (3.10).
Using equation (2.2), X(t) can be written as,

X(t) = X(0) + \int_0^t\left(\mu(s) - \frac{\nu(s)}{2}\right)ds + \int_0^t\sqrt{\nu(s)}\,dW^X(s).   (3.12)

For any two times t1, t2 such that t2 > t1, equation (3.12) translates to,

X(t_2) = X(t_1) + \int_{t_1}^{t_2}\left(\mu(s) - \frac{\nu(s)}{2}\right)ds + \int_{t_1}^{t_2}\sqrt{\nu(s)}\,dW^X(s).

This can be further simplified as,

X(t_2) = X(t_1) + \int_{t_1}^{t_2}\mu(s)\,ds - \int_{t_1}^{t_2}\frac{\nu(s)}{2}\,ds + \int_{t_1}^{t_2}\sqrt{\nu(s)}\,dW^X(s).

Using Proposition 6 we get,

X(t_2) = X(t_1) + \int_{t_1}^{t_2}\mu(s)\,ds - \int_{t_1}^{t_2}\frac{\nu(s)}{2}\,ds + \rho\int_{t_1}^{t_2}\sqrt{\nu(s)}\,dW^\nu(s) + \sqrt{1-\rho^2}\int_{t_1}^{t_2}\sqrt{\nu(s)}\,dW^Z(s),   (3.13)

where dW^\nu(s) and dW^Z(s) are independent of each other.
3.2.1 Simulation of sample paths of the generalized Heston model
The sample paths of the generalized Heston model can be simulated using the following multistep procedure.
Step-I
Set the process parameters, i.e., total time period T = 1.0, number of steps N = 100, and ρ = −0.6. Let s be the number of intermediate points between t_i and t_{i+1}. Figure 3.3 illustrates this with s = 4. For our simulations, we choose s = 100.
Figure 3.3: s = 4 intermediate points between t_i and t_{i+1}.
We need to simulate both the CIR process and the OU process in order to simulate a path of the generalized Heston model. We simulate the OU process using Algorithm 2; similarly, the CIR process is simulated using the algorithm described in Chapter 2.
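Algorithm 2 appears in Chapter 2; as a self-contained sketch, the OU drift process (3.9) can also be simulated exactly through its Gaussian transition density (a standard property of the OU process, used here as an assumption in place of restating Algorithm 2; the parameter values are the illustrative ones from Figure 3.5).

```python
import numpy as np

# Sketch (ours) of exact OU simulation for the drift process (3.9),
# using the Gaussian transition density of the OU process.
def simulate_ou(mu0, alpha1, beta1, sigma1, T=1.0, n=100, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n
    e = np.exp(-alpha1 * dt)
    sd = sigma1 * np.sqrt((1 - e**2) / (2 * alpha1))  # exact conditional std
    mu = np.empty(n + 1)
    mu[0] = mu0
    for i in range(n):
        mu[i + 1] = beta1 + (mu[i] - beta1) * e + sd * rng.standard_normal()
    return mu
```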
Figure 3.4: Simulated path of the CIR process with parameters α2 = 0.221, β2 = 0.601, σ2 = 0.055. Every (s + 1)-th value has been chosen for the plot, where s has been defined in Step-I.
Figure 3.5: Simulated path of the OU process with parameters α1 = 0.14, β1 = 0.861, σ1 = 0.009. Every (s + 1)-th value has been chosen for the plot, where s has been defined in Step-I.
Step-II
We estimate the integral \int_{t_1}^{t_2}\nu(s)\,ds using the Riemann sum,

\int_{t_1}^{t_2}\nu(s)\,ds \approx \widehat{IV} = \left(\nu(t_1) + \sum_{i=1}^{s}\nu(s_i)\right)\Delta,

where s = 100 is the number of divisions between t_1 and t_2 and

\Delta = \frac{t_2 - t_1}{s}.

ν(t_1) and the ν(s_i) have already been simulated in Step-I as part of simulating the CIR process. In this step, we simply sum all the simulated values of the CIR process between the time points t_1 and t_2 and multiply by ∆.
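The Riemann-sum estimate of Step-II can be sketched as follows, assuming `v_fine` holds the CIR path simulated on the fine grid in Step-I (the helper name and indexing convention are our own).

```python
import numpy as np

# Sketch (ours) of the Step-II Riemann-sum estimate of the integrated
# variance between two consecutive coarse time points. `v_fine` is the
# CIR path on the fine grid; `t1_idx` indexes v(t1) on that grid.
def integrated_variance(v_fine, s, t1_idx, delta):
    # Sum v(t1) and the s intermediate values, then multiply by delta.
    return v_fine[t1_idx : t1_idx + s + 1].sum() * delta
```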
55 0.00060
0.00055
0.00050
Value 0.00045
0.00040
0.00035
0 20 40 60 80 100 N
Figure 3.6: Simulated path of the estimates of R t2 ν(s) ds at different time points. t1
The X axis here represents the number of divisions between 0 and the total time period T . If N = 100, then there would be 99 estimated values of the integral at
R t2 R t3 R t100 different times, i.e., ν(s) ds, ν(s) ds, . . . , ν(s) ds and first value is just νt1. t1 t2 t99
Step-III
We estimate the integral \int_{t_1}^{t_2}\mu(s)\,ds using the Riemann sum,

\int_{t_1}^{t_2}\mu(s)\,ds \approx \widehat{I\mu} = \left(\mu(t_1) + \sum_{i=1}^{s}\mu(s_i)\right)\Delta,

where s and ∆ have their usual meanings.
56 0.000375
0.000350
0.000325
0.000300
Value 0.000275
0.000250
0.000225
0 20 40 60 80 100 N
Figure 3.7: Simulated path of the estimate of R t2 µ(s) ds. t1
The X axis here represents the number of divisions between 0 and the total time period T . If N = 100, then there would be 99 estimated values of the integral at
R t2 R t3 R t100 different times, i.e., µ(s) ds, µ(s) ds, . . . , µ(s) ds, and first value is just µt1. t1 t2 t99
Step-IV
The solution to the CIR process simulated in Step-I is given as,

\nu(t_2) = \nu(t_1) + \int_{t_1}^{t_2}\alpha_2(\beta_2 - \nu(u))\,du + \sigma_2\int_{t_1}^{t_2}\sqrt{\nu(u)}\,dW^\nu(u).

We estimate the integral \int_{t_1}^{t_2}\sqrt{\nu(u)}\,dW^\nu(u) by rearranging this equation,

\int_{t_1}^{t_2}\sqrt{\nu(u)}\,dW^\nu(u) = \sigma_2^{-1}\left[\nu(t_2) - \nu(t_1) - \int_{t_1}^{t_2}\alpha_2(\beta_2 - \nu(u))\,du\right].

We have already simulated or estimated all the terms on the right-hand side, and thus we know the value of \int_{t_1}^{t_2}\sqrt{\nu(u)}\,dW^\nu(u).
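The rearrangement in Step-IV is a one-line computation once the other quantities are available; the sketch below uses our own illustrative names for the inputs.

```python
# Sketch (ours) of the Step-IV rearrangement: recover the stochastic
# integral int sqrt(v) dW from the simulated CIR increment and the
# Riemann-sum estimate of int alpha2*(beta2 - v(u)) du.
def stochastic_integral_estimate(v_t1, v_t2, drift_integral, sigma2):
    # sigma2 * int sqrt(v) dW = v(t2) - v(t1) - int alpha2*(beta2 - v) du
    return (v_t2 - v_t1 - drift_integral) / sigma2
```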