Florida State University Libraries

Electronic Theses, Treatises and Dissertations The Graduate School

2008

Variance Gamma Pricing of American Futures Options

Eunjoo Yoo

FLORIDA STATE UNIVERSITY

COLLEGE OF ARTS AND SCIENCES

VARIANCE GAMMA PRICING OF AMERICAN FUTURES OPTIONS

By

EUNJOO YOO

A Dissertation submitted to the Department of Mathematics in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Degree Awarded: Summer Semester, 2008

The members of the Committee approve the Dissertation of Eunjoo Yoo defended on July 10, 2008.

Craig A. Nolder Professor Directing Dissertation

Fred Huffer Outside Committee Member

Bettye Anne Case Committee Member

Alec N. Kercheval Committee Member

Jack Quine Committee Member

Approved:

Chair, Department of Mathematics

Dean, College of Arts and Sciences

The Office of Graduate Studies has verified and approved the above-named committee members.

ACKNOWLEDGEMENTS

I would first like to thank my dissertation advisor, Dr. Craig Nolder, for his patience and guidance. I would also like to thank my committee members, Dr. Bettye Anne Case, Dr. Alec Kercheval, Dr. Fred Huffer, and Dr. Jack Quine, as well as Dr. James Doran and Dr. Mack Galloway. Last, I would like to thank my family for their support and love all these years.

TABLE OF CONTENTS

List of Tables
List of Figures
Abstract

1. Introduction
2. Review of Lévy Processes
   2.1 Lévy Processes and Basic Examples
   2.2 The Lévy-Khintchine Representation
3. EMM
   3.1 Esscher Transform
   3.2 European Option Valuation
4. American Futures Options
   4.1 The Basics: Futures Options
   4.2 Implementation: Quadratic Approximation
   4.3 Calibration
   4.4 Numerical Result
   4.5 Out-of-Sample Performance
5. Generating Sample Paths
   5.1 Variance Gamma Process
   5.2 Simulation of the Variance Gamma Process
6. Random Tree Method
   6.1 The High Estimator Θ
   6.2 The Low Estimator θ
7. Implementation
   7.1 Depth-First Procedure
8. Calibration
   8.1 Futures Price Process
   8.2 Data Description
   8.3 Calibration Method
   8.4 Results
9. Conclusion and Future Research
10. Plots

APPENDICES
A. Basic Convergence Concepts
B. Subordinators
C. Crude Oil Futures: the Basics from NYMEX

REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

4.1 Early Exercise Premium: call options on crude oil futures for December 2003-November 2004
4.2 Parameters calibrated: Time-averaged from twelve observations based on the Black-Scholes and Variance Gamma models (Dec. 2003 ~ Nov. 2004)
4.3 In-Sample: Average absolute errors (|C − C(F, T; K)|)
4.4 Pricing Errors for In-Sample: Time-averaged from twelve observations based on quadratic approximation (Dec. 2003 ~ Nov. 2004)
4.5 In-Sample: Time-averaged ratio of APE and RMSE (Dec. 2003 ~ Nov. 2004)
4.6 Average Absolute Error: One-day-ahead for December 2003-November 2004
4.7 Average Absolute Error: Five-days-ahead for December 2003-November 2004
4.8 Pricing Errors for Out-of-Sample: Time-averaged from twelve observations based on the Black-Scholes and Variance Gamma models (Dec. 2003 ~ Nov. 2004)
4.9 Out-of-Sample: Time-averaged ratio of APE and RMSE (Dec. 2003 ~ Nov. 2004)
5.1 Variance Gamma Process
8.1 Parameters calibrated: Time-averaged from twelve observations based on the Black-Scholes and Variance Gamma models (Dec. 2003 ~ Nov. 2004)
8.2 Percentage pricing error: (model price − actual price)/actual price (Dec. 2003 ~ Nov. 2004)
8.3 Pricing errors for In-Sample: Time-averaged from twelve observations based on the Black-Scholes and Variance Gamma models (Dec. 2003 ~ Nov. 2004)
8.4 In-Sample: Time-averaged ratio of APE and RMSE (Dec. 2003 ~ Nov. 2004)
8.5 Average Absolute Error: One-day-ahead sample from December 2003 to November 2004
8.6 Average Absolute Error: Five-days-ahead sample from December 2003 to November 2004
8.7 Percentage Pricing Error: One-day-ahead sample from December 2003 to November 2004
8.8 Percentage Pricing Error: Five-days-ahead sample from December 2003 to November 2004
8.9 APE for Out-of-Sample: Time-averaged from twelve observations based on the B-S and VG models (Dec. 2003 ~ Nov. 2004)
8.10 RMSE for Out-of-Sample: Time-averaged from twelve observations based on the B-S and VG models (Dec. 2003 ~ Nov. 2004)
8.11 Out-of-Sample: Time-averaged ratio of APE and RMSE (Dec. 2003 ~ Nov. 2004)
10.1 Risk-Neutral Mean, Variance, Skewness, and Kurtosis for December 2003-November 2004
10.2 Random tree method: Time-averaged calibrated parameters for December 2003-November 2004

LIST OF FIGURES

5.1 Lévy measure of the VG process with respect to (θ, σ, ν)
5.2 Figure 5.1-Continued
5.3 A sample path of a Gamma process: ∆t = 0.01, β = 20
5.4 A sample path of a VG process: ∆t = 0.001, θ = 0.1, σ = 0.2, and ν = 0.05
6.1 Random Tree with b = 4 and m = 2
7.1 Depth-first processing of the random tree when b = 4 and m = 3. Solid lines represent nodes that are currently being worked on; dashed lines indicate nodes from previous steps that do not need to be in memory
10.1 VG calibration of Futures Options on Crude Oil. ◦ indicates market price and + stands for model price
10.2 Figure 10.1-Continued
10.3 Figure 10.1-Continued
10.4 Figure 10.1-Continued
10.5 Figure 10.1-Continued
10.6 Figure 10.1-Continued
10.7 Figure 10.1-Continued
10.8 B-S calibration of Futures Options on Crude Oil. ◦ indicates market price and + stands for model price
10.9 Figure 10.8-Continued
10.10 Figure 10.8-Continued
10.11 Figure 10.8-Continued
10.12 Figure 10.8-Continued
10.13 Figure 10.8-Continued

ABSTRACT

In financial markets under uncertainty, the classical Black-Scholes model cannot explain empirical facts such as the fat tails observed in the probability density of returns. To overcome this drawback, Lévy processes and stochastic volatility models were introduced to financial modeling during the last decade. Today crude oil futures markets are highly volatile, and it is the purpose of this dissertation to develop a mathematical framework in which American options on crude oil futures contracts are priced more effectively than by current methods. In this work, we use the Variance Gamma process to model the futures price process. To generate the underlying process, we use a random tree method and evaluate the option prices at each tree node. Through fifty replications of a random tree, the averaged value is taken as the true option price. The pricing performance of this method is assessed using American options on crude oil commodity contracts from December 2003 to November 2004. For comparison with the Variance Gamma model, we price using the Black-Scholes model as well. Over the entire sample period, a positive skewness and high kurtosis, especially in the short-term options, are observed. In terms of pricing errors, the Variance Gamma process performs better than the Black-Scholes model for American options on crude oil commodities.

CHAPTER 1

Introduction

Today crude oil futures markets are highly volatile. It is more important than ever to price options in these markets accurately and robustly. It is well known that the Black-Scholes model seriously misprices options, especially in environments with high volatility. It is the purpose of this dissertation to develop a mathematical framework in which American options on crude oil futures contracts are priced more effectively than by current methods. To this purpose we use a Variance Gamma Lévy process in place of Brownian motion. In this way, higher probabilities are assigned to extreme events. This model is used in conjunction with a random tree method, which gives results superior to a simple binomial model. The Variance Gamma process was first proposed by D. B. Madan and E. Seneta (1990) [29] and D. B. Madan and F. Milne (1991) [28]. They introduced it as a new stochastic process in finance with three parameters, infinite-activity jumps, and a closed-form characteristic function, and they explained how the additional parameters accommodate the volatility and kurtosis of the log return distribution. Later, D. B. Madan, P. P. Carr and E. C. Chang [27] performed empirical tests to investigate the pricing performance of the Variance Gamma process. They observed that option pricing errors of the Variance Gamma process are relatively uncorrelated with an option's moneyness and maturity, while the Black-Scholes model shows significant biases in both. In their empirical tests, D. B. Madan, P. P. Carr and E. C. Chang [27] computed the European pricing formula for the return density using data on the S&P 500 Index. Another approach to the option pricing formula was proposed by P. P. Carr and D. B. Madan (1999) [11]. This numerical approach uses the FFT (Fast Fourier Transform) to value options whose underlier is given by the Variance Gamma process. The main idea is to find representations of option values in terms of the closed-form characteristic function.

However, unlike the European option pricing formula, no closed-form formulas are known for American-style options. Hence, American option prices must be computed by other means. Early work on this problem was done with the Black-Scholes partial differential equation. The problem is a free boundary problem, since the optimal exercise boundary is determined as part of the PDE solution. Geske (1979) [17], Roll (1977) [33], Whaley (1981) [37], and Johnson (1983) [24] focused on Black's model with dividend-paying stocks and derived analytical solutions. Many authors have proposed numerical schemes to value American options. The standard binomial method was introduced by Cox, Ross, and Rubinstein (1979) [13]; it discretizes both the time and state spaces to approximate option prices. Later, Breen (1991) [5] suggested the accelerated binomial method to reduce the computational steps. Another approach, finite difference methods, was introduced by Brennan and Schwartz (1977, 1978) [6]. In 1985, Geske and Shastri compared all the early methods in their paper [18]. In the early 1990s, the convergence of these methods was proved by Jaillet, Lamberton, Lapeyre, Amin, and Khanna [23]. Another approach appeared in 1987: Barone-Adesi and Whaley (1987) [2] and Whaley (1986) [38] applied the quadratic approximation of MacMillan [26]. Their method obtains an approximate solution of the Black-Scholes PDE by adjusting the early exercise premium. See also Carr, Jarrow, and Myneni (1992) [10], Jacka (1991) [22], and Broadie and Detemple (1996) [7]. Recently, several authors have used Lévy processes to price American-style options. Hirsa and Madan (2003) derived a partial integro-differential equation (PIDE) for the underlying process given by the Variance Gamma process [21] and employed a finite difference discretization to solve the PIDE. Another approach was proposed by Këllezi and Webber in 2004 [25].
They investigated a lattice method based on an approximation to the transition density function of the Lévy process. A lattice approach to American option prices was also proposed by Maller, Solomon, and Szimayer [30]; their method was based on the binomial method but extended to a multinomial model. Recently, Monte Carlo methods have been used as another numerical approach, by Glasserman (2004) [19] and Fu, Laprise, Madan, Su, and Wu (2001) [16]. Since the holder of an American option can exercise it at any time up to the option's expiration date, no analytic solution to the American option pricing problem is known. Thus, work on pricing American options has focused on the numerical approximation methods

of the Black-Scholes model, such as the finite-difference, binomial, and quadratic methods mentioned above. However, it is well known that the Black-Scholes model cannot capture empirical facts such as the fat tails observed in financial markets. For this reason, several authors have recently turned to Lévy processes as the underlying process for pricing American options. In this dissertation, we examine how the Variance Gamma process can be applied to the American futures options market. First, we study the Variance Gamma process, defined as a Brownian motion with constant drift and volatility evaluated at a random time change given by a gamma process. We also explore the characteristics of the Variance Gamma model, especially the skewness and kurtosis of its Lévy measure. The Monte Carlo method is used to generate the underlying dynamics of futures prices. Secondly, a random tree method is implemented to price American options [7]. This method exploits the fact that the simulated paths of the underlying Variance Gamma process are Markov. We compute high- and low-biased estimators at each tree node. Both estimators converge to the true value, and the American option price is taken as the sample average of the two estimators over 50 replications of the random tree. Lastly, we use options on crude oil futures to calibrate the underlying process. These American-style option prices are computed for in-sample performance, average percentage pricing error by moneyness and maturity, four different pricing error measures (see 8.3), and out-of-sample performance. For the out-of-sample tests, we use one-day-ahead and five-days-ahead calibrated parameters to price options. Next, the Black-Scholes model is evaluated in the same way, to compare against the pricing performance of the Variance Gamma model.
Based on a sample of 692 closing call option prices from December 2003 to November 2004, we find that the Variance Gamma model has lower in-sample and out-of-sample pricing errors than the corresponding Black-Scholes model over all sample periods. Remarkably, for out-of-the-money options, it has much lower average absolute and average percentage pricing errors than the Black-Scholes model.

The structure of this thesis is as follows. In Chapter 2, the basic definitions and theorems about Lévy processes are reviewed, including the Lévy-Khintchine representation. Chapter 3 describes the risk-neutral pricing process and European option valuation; the numerical method for European-style options is also derived there. Introductory definitions about futures options are given in Chapter 4, where, for the Black-Scholes model, the European option pricing formula is studied, together with the numerical scheme that converts European option prices into American option prices and its implementation. Chapter 5 focuses on the Variance Gamma process and its sampling method. The next two chapters, 6 and 7, describe the random tree method and its numerical implementation. The data are described and numerical results from the calibration are listed in Chapter 8. In conclusion, future research is suggested in Chapter 9.

CHAPTER 2

Review of Lévy Processes

2.1 Lévy Processes and Basic Examples

In this chapter, the mathematical definitions and notation we use come from Sato (1999) [34], as do the proofs of, or references for, the theorems.

Definition 2.1.1. A stochastic process \( X = \{X_t\}_{t \ge 0} \) on \( (\Omega, \mathcal{F}, P) \), taking values in R, is said to be a Lévy process if:

1. For any choice of \( n \ge 1 \) and \( 0 \le t_0 < t_1 < \cdots < t_n \), the random variables \( X_{t_0}, X_{t_1} - X_{t_0}, X_{t_2} - X_{t_1}, \dots, X_{t_n} - X_{t_{n-1}} \) are independent (independent increments property).

2. \( X_0 = 0 \) a.s.

3. The distribution of \( X_{s+t} - X_s \) does not depend on s (stationary increments property).

4. \( \{X_t\} \) is stochastically continuous: for every \( t \ge 0 \) and \( \varepsilon > 0 \),
\[ \lim_{s \to t} P(|X_t - X_s| > \varepsilon) = 0. \]

5. There is \( \Omega_0 \in \mathcal{F} \) with \( P[\Omega_0] = 1 \) such that, for every \( \omega \in \Omega_0 \), \( X_t(\omega) \) is right-continuous in \( t \ge 0 \) and has left limits in \( t > 0 \).

We call any process satisfying (1)-(4) a Lévy process in law [34]. It has been shown that any Lévy process in law has a modification which is a Lévy process.

Remark 2.1.2. Consider two stochastic processes, X and Y, defined on the same probability space. Y is a modification of X if, for every \( t > 0 \), we have \( P[X_t = Y_t] = 1 \).
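Properties (1)-(4) can be checked empirically on simulated data. The following sketch (an illustration added here, not part of the original text) simulates standard Brownian motion, the prototypical Lévy process, and verifies that its increments are stationary (variance proportional to the step length, independent of where in time they are taken) and uncorrelated:

```python
import numpy as np

rng = np.random.default_rng(0)

def brownian_increments(n_steps, dt):
    """Increments of a standard Brownian motion: i.i.d. N(0, dt)."""
    return rng.normal(0.0, np.sqrt(dt), size=n_steps)

dt, n = 0.01, 200_000
dX = brownian_increments(n, dt)
X = np.concatenate([[0.0], np.cumsum(dX)])   # X_0 = 0 a.s. (property 2)

# Stationarity (property 3): Var(X_{t+s} - X_t) = s, independent of t,
# so the sample variance of early and late increments should both be dt.
var_early = dX[: n // 2].var()
var_late = dX[n // 2 :].var()

# Independence (property 1): successive increments should be uncorrelated.
corr = np.corrcoef(dX[:-1], dX[1:])[0, 1]
```

The same check applied to a non-Lévy process (e.g. one whose volatility changes over time) would show `var_early` and `var_late` drifting apart.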

Definition 2.1.3. The convolution μ of two distributions μ₁ and μ₂ on R, denoted by \( \mu = \mu_1 * \mu_2 \), is the distribution defined by

\[ \mu(B) = \iint_{R \times R} 1_B(x+y)\, \mu_1(dx)\, \mu_2(dy), \quad B \in \mathcal{B}(R). \]

Definition 2.1.4. A probability measure μ on R is infinitely divisible if, for any positive integer n, there is a probability measure μₙ on R such that μ is the n-fold convolution of μₙ:

\[ \mu = \mu_n * \cdots * \mu_n \quad (\text{denoted by } \mu_n^n \text{ or } \mu_n^{n*}). \]

Definition 2.1.5. The characteristic function \( \hat\mu(z) \) of a probability measure μ on R of the random variable X is

\[ \hat\mu(z) = E(e^{izX}) = \int_R e^{izx}\, \mu(dx), \quad z \in R. \]

Theorem 2.1.6. If X₁ and X₂ are independent random variables with distributions μ₁ and μ₂, then X₁ + X₂ has distribution \( \mu_1 * \mu_2 \).

Thus, a random variable X is infinitely divisible if, for all \( k \ge 1 \), we can find a finite set of independent and identically distributed random variables \( (Y_1^{(k)}, Y_2^{(k)}, \dots, Y_k^{(k)}) \) such that

\[ X \stackrel{d}{=} \sum_{i=1}^{k} Y_i^{(k)}. \]

Theorem 2.1.7. If μ₁ and μ₂ are probability measures on R with characteristic functions \( \hat\mu_1 \) and \( \hat\mu_2 \), respectively, then \( \mu_1 * \mu_2 \) has characteristic function \( \hat\mu_1 \hat\mu_2 \).

Example 2.1.8. Characteristic functions

- The Gaussian distribution on R with mean γ and variance a is defined by
  \[ \mu(B) = \int_B (2\pi a)^{-1/2} e^{-(x-\gamma)^2/(2a)}\, dx, \]
  where \( a > 0 \) and \( \gamma \in R \). We have
  \[ \hat\mu(z) = \exp\Big( -\frac{1}{2} a z^2 + i\gamma z \Big), \quad z \in R. \]

- The exponential distribution on R with parameter \( \alpha > 0 \) is given by \( \mu(B) = \alpha \int_{B \cap [0,\infty)} e^{-\alpha x}\, dx \), \( B \in \mathcal{B}(R) \). Here the characteristic function is
  \[ \hat\mu(z) = \frac{\alpha}{\alpha - iz}. \]

- If X has a Poisson distribution on R with parameter λ, then its characteristic function is
  \[ E[\exp(izX)] = \exp\{ \lambda (e^{iz} - 1) \}, \quad \forall z \in R. \]
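The convolution property of Theorem 2.1.7 can be illustrated with the exponential and Gaussian examples above: the empirical characteristic function of a sum of independent exponential and Gaussian random variables should match the product of the two analytic characteristic functions. A minimal Monte Carlo check (illustrative parameters, added here):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

# X1 ~ Exponential(alpha), X2 ~ N(gamma, a), independent.
alpha, gamma, a = 2.0, 0.5, 1.5
x1 = rng.exponential(1.0 / alpha, size=n)   # numpy parameterizes by scale = 1/alpha
x2 = rng.normal(gamma, np.sqrt(a), size=n)

z = np.array([-2.0, -0.5, 0.7, 1.9])

# Empirical characteristic function of the sum X1 + X2.
emp = np.exp(1j * np.outer(z, x1 + x2)).mean(axis=1)

# Product of the analytic characteristic functions (Theorem 2.1.7).
cf_exp = alpha / (alpha - 1j * z)
cf_gauss = np.exp(-0.5 * a * z**2 + 1j * gamma * z)
prod = cf_exp * cf_gauss

err = np.max(np.abs(emp - prod))   # Monte Carlo error, O(1/sqrt(n))
```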

Example 2.1.9. Infinitely divisible distributions

- The most common examples of infinitely divisible laws are the Gaussian distribution, the gamma distribution, the α-stable distributions, and the Poisson distribution; a random variable having any of these distributions can be decomposed into a sum of n i.i.d. parts having the same type of distribution but with modified parameters. For example, if \( X \sim N(\mu, \sigma^2) \), then X is infinitely divisible and we can write \( X = \sum_{i=1}^{n} Y_i \), where the \( Y_i \) are i.i.d. with law \( N(\mu/n, \sigma^2/n) \).

- If \( \{X_t\}_{t \ge 0} \) is a Lévy process, then for every t the distribution of \( X_t \) is infinitely divisible. Indeed, define \( t_k = \frac{k}{n} t \) for \( t > 0 \), n a positive integer, and \( k = 0, 1, \dots, n \). Then
  \[ X_t = (X_{t_1} - X_{t_0}) + (X_{t_2} - X_{t_1}) + \cdots + (X_{t_n} - X_{t_{n-1}}) = \sum_{k=1}^{n} (X_{t_k} - X_{t_{k-1}}), \]
  that is, \( X_t \) can be written as the sum of n independent, identically distributed random variables. If μ is the probability measure of \( X_t \) and μₙ is the probability measure of \( X_{t_i} - X_{t_{i-1}} \), then \( \hat\mu(z) = [\hat\mu_n(z)]^n \). Thus, \( X_t \) is infinitely divisible.
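The Gaussian decomposition above can be verified directly at the level of characteristic functions: the n-th power of the \( N(\mu/n, \sigma^2/n) \) characteristic function must reproduce the \( N(\mu, \sigma^2) \) one, i.e. \( \hat\mu(z) = [\hat\mu_n(z)]^n \). A minimal numerical check (illustrative, added here):

```python
import numpy as np

def normal_cf(z, mu, var):
    """Characteristic function of N(mu, var): exp(i*mu*z - var*z^2/2)."""
    return np.exp(1j * mu * z - 0.5 * var * z**2)

mu, sigma2, n = 0.3, 1.7, 12
z = np.linspace(-10.0, 10.0, 401)

# mu-hat(z) = [mu-hat_n(z)]^n: n-fold convolution of N(mu/n, sigma2/n).
lhs = normal_cf(z, mu, sigma2)
rhs = normal_cf(z, mu / n, sigma2 / n) ** n

err = np.max(np.abs(lhs - rhs))   # equality is exact up to floating point
```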

Theorem 2.1.10. (Infinite divisibility and Lévy processes.) If \( \{X_t\}_{t \ge 0} \) is a Lévy process on R, then for every \( t > 0 \), \( X_t \) has an infinitely divisible distribution. Conversely, if μ is an infinitely divisible distribution, then there exists a Lévy process \( \{X_t\} \) such that the distribution of \( X_t \) is given by the convolution power \( \mu^t \).

Theorem 2.1.11. Let \( \{X_t\}_{t \ge 0} \) be a Lévy process on R. There exists a continuous function \( \psi : R \to \mathbb{C} \), called the characteristic exponent of X, such that

\[ E[e^{izX_t}] = e^{-t\psi(z)}, \quad z \in R. \]

2.2 The Lévy-Khintchine Representation

The fundamental result on infinitely divisible distributions and Lévy processes is the Lévy-Khintchine formula, which gives a representation of the characteristic functions of infinitely divisible distributions. It was obtained on R around 1930 by de Finetti and Kolmogorov in special cases, then by Lévy in the general case, and on R^d in 1948.

Theorem 2.2.1. Let \( D = \{x : |x| \le 1\} \), the closed unit ball. If μ is an infinitely divisible distribution on R^d, then

\[ \hat\mu(z) = \exp\Big[ -\frac{1}{2} \langle z, Az \rangle + i \langle \gamma, z \rangle + \int_{R^d} \big( e^{i\langle z, x \rangle} - 1 - i\langle z, x \rangle 1_D(x) \big)\, \nu(dx) \Big], \quad z \in R^d, \tag{2.2} \]

where A is a symmetric nonnegative-definite d × d matrix, \( \gamma \in R^d \), and ν is a measure on R^d satisfying

\[ \nu(\{0\}) = 0 \quad \text{and} \quad \int_{R^d} (|x|^2 \wedge 1)\, \nu(dx) < \infty. \tag{2.3} \]

- The representation of \( \hat\mu(z) \) in (2.2) by A, ν, γ is unique.

- Conversely, if A is a symmetric nonnegative-definite d × d matrix, ν is a measure satisfying (2.3), and \( \gamma \in R^d \), then there exists an infinitely divisible distribution μ whose characteristic function is given by (2.2).

- The characteristic exponent can be written as
  \[ \psi(z) = \frac{1}{2} \langle z, Az \rangle - i \langle \gamma, z \rangle - \int_{R^d} \big( e^{i\langle z, x \rangle} - 1 - i\langle z, x \rangle 1_{|x| \le 1} \big)\, \nu(dx). \tag{2.4} \]

Definition 2.2.2. We call (A, ν, γ) in the Lévy-Khintchine representation the generating triplet of μ. A and ν are called, respectively, the Gaussian covariance matrix and the Lévy measure of μ. When A = 0, μ is called purely non-Gaussian.

Corollary 2.2.3. If μ, the distribution of \( X_{s+1} - X_s \), has the generating triplet (A, ν, γ), then \( \mu^t \), the distribution of \( X_{s+t} - X_s \), has the generating triplet (tA, tν, tγ).

- The integrand of the integral on the right-hand side of (2.2) is integrable with respect to ν, since it is bounded outside of any neighborhood of 0 and \( e^{i\langle z, x \rangle} - 1 - i\langle z, x \rangle 1_{|x| \le 1} = O(|x|^2) \) as \( x \to 0 \) for fixed z.

- The truncation function \( 1_{|x| \le 1} \) in the Lévy-Khintchine formula may be (and often is) replaced with another function. Let c(x) be a bounded measurable function from R^d to R satisfying \( c(x) = 1 + o(|x|) \) as \( x \to 0 \). Then (2.2) can be rewritten as
  \[ \hat\mu(z) = \exp\Big[ -\frac{1}{2} \langle z, Az \rangle + i \langle \gamma_c, z \rangle + \int_{R^d} \big( e^{i\langle z, x \rangle} - 1 - i\langle z, x \rangle c(x) \big)\, \nu(dx) \Big], \]
  with \( \gamma_c \in R^d \) defined by
  \[ \gamma_c = \gamma + \int_{R^d} x \big( c(x) - 1_D(x) \big)\, \nu(dx). \tag{2.5} \]
  The only effect of this replacement is that the value of γ changes.

- In equation (2.5), if c = 0 and \( \gamma_0 = \gamma - \int_{|x| \le 1} x\, \nu(dx) \) is well-defined and finite, we can rewrite the Lévy-Khintchine formula with the new triplet \( (A, \nu, \gamma_0)_0 \) (\( \gamma_0 \) is the drift of μ) as
  \[ \hat\mu(z) = \exp\Big[ -\frac{1}{2} \langle z, Az \rangle + i \langle \gamma_0, z \rangle + \int_{R^d} \big( e^{i\langle z, x \rangle} - 1 \big)\, \nu(dx) \Big]. \tag{2.6} \]

- If \( \gamma_1 = \gamma + \int_{|x| > 1} x\, \nu(dx) \) is well-defined and finite, we can rewrite the Lévy-Khintchine formula with the new triplet \( (A, \nu, \gamma_1)_1 \) (\( \gamma_1 \) is the center of μ) as
  \[ \hat\mu(z) = \exp\Big[ -\frac{1}{2} \langle z, Az \rangle + i \langle \gamma_1, z \rangle + \int_{R^d} \big( e^{i\langle z, x \rangle} - 1 - i\langle z, x \rangle \big)\, \nu(dx) \Big]. \]

Definition 2.2.4. A Lévy process with generating triplet (A, ν, γ) is of

- type A if A = 0 and \( \nu(R^d) < \infty \);
- type B if A = 0, \( \nu(R^d) = \infty \), and \( \int_{|x| \le 1} |x|\, \nu(dx) < \infty \);
- type C if \( A \ne 0 \) or \( \int_{|x| \le 1} |x|\, \nu(dx) = \infty \).

Example 2.2.5. Consider infinitely divisible distributions. The Lévy measure ν is zero if and only if the distribution μ is Gaussian. A compound Poisson process is a Lévy process with generating triplet \( (0, \lambda\sigma, 0)_0 \), where \( \lambda > 0 \) is a constant and σ is a probability measure on R^d with \( \sigma(\{0\}) = 0 \). Thus, a Lévy process is a compound Poisson process if and only if it has a generating triplet \( (0, \nu, 0)_0 \) with \( \nu(R^d) \in (0, \infty) \).

CHAPTER 3

EMM

As we have already seen in the Black-Scholes model, we can find an equivalent martingale measure (EMM) by changing the drift. But in models with jumps it may not be easy to obtain an EMM by changing the drift alone. In this chapter, we introduce a method, called the Esscher transform, for finding an EMM when the driving process is a Lévy process. Let \( X_t \) be a Lévy process on \( (\Omega, \mathcal{F}, P) \). We assume that our market consists of one riskless asset (the bond), with price process \( B_t = \exp(rt) \), and one risky asset (the stock or index), whose price process is given by \( S_t \equiv S_0 \exp(X_t) \). The log returns \( \log(S_{t+s}/S_t) \) of such a model follow the distribution of increments of length s of the Lévy process X.

3.1 Esscher Transform

We want to find an EMM (equivalent martingale measure) Q such that

\[ \frac{dQ}{dP}\Big|_{\mathcal{F}_t} = \exp[\theta X(t) - d(\theta, t)], \]

where \( d(\theta, t) \) is a deterministic normalizing constant. Then the discounted stock price process \( S^* \) should be a Q-martingale. Hence, at time t = 0,

\[
S(0) = E^{Q}[S^*(t)] = E^{Q}[e^{-rt} S(t)]
= \int_{\Omega} e^{-rt} S(t)\, \frac{dQ}{dP}\, dP
= \int_{\Omega} e^{-rt} \exp(\theta X(t) - d(\theta, t))\, S_0 \exp(X(t))\, dP
\]
\[
= S_0\, E^{P}\big[e^{(\theta+1)X(t) - d(\theta,t) - rt}\big]
= S_0\, e^{-d(\theta,t) - rt}\, E^{P}\big[e^{i(-i(\theta+1))X(t)}\big]
= S_0\, e^{-d(\theta,t) - rt}\, e^{-t\psi^{P}(-i(\theta+1))},
\]

where \( E^{Q} \) (resp. \( E^{P} \)) is the expectation under the measure Q (resp. P), and the last equality uses the definition of the characteristic exponent. Thus, for any \( t > 0 \), the martingale condition gives

\[ e^{-d(\theta,t) - rt - t\psi^{P}(-i(\theta+1))} = 1 \quad\Rightarrow\quad -t\psi^{P}(-i(\theta+1)) - d(\theta,t) - rt = 0. \tag{3.1} \]

If we apply the same argument to the riskless bond process \( B(t) = B_0 e^{rt} \), then for any \( t > 0 \),

\[ -t\psi^{P}(-i\theta) - d(\theta,t) = 0 \tag{3.2} \]
\[ \Rightarrow\quad d(\theta,t) = -t\psi^{P}(-i\theta). \tag{3.3} \]

Substituting equation (3.3) into (3.1), we get

\[ -\psi^{P}(-i(\theta+1)) + \psi^{P}(-i\theta) - r = 0. \tag{3.4} \]

Let us assume that there exists a solution θ of equation (3.4). Then the Esscher transform exists and

\[ \frac{dQ}{dP}\Big|_{\mathcal{F}_t} = \exp\big[\theta X(t) + t\psi^{P}(-i\theta)\big]. \]

On the other hand,

\[
E^{Q}[e^{izX_t}] = \int e^{izX_t}\, dQ = \int e^{izX_t}\, \frac{dQ}{dP}\, dP
= \int e^{izX_t} \exp\big[\theta X_t + t\psi^{P}(-i\theta)\big]\, dP
= e^{t\psi^{P}(-i\theta)} \int e^{(iz+\theta)X_t}\, dP
= e^{t\psi^{P}(-i\theta)}\, e^{-t\psi^{P}(z - i\theta)}.
\]

Thus, we can conclude that

\[ \psi^{Q}(z) = \psi^{P}(z - i\theta) - \psi^{P}(-i\theta) \quad \text{for any } z \in R. \tag{3.5} \]

Since \( E^{Q}[e^{i(-i)X_t}] = e^{rt} \), equation (3.5) also implies

\[ e^{r} = \frac{\phi(-i(\theta+1))}{\phi(-i\theta)}, \tag{3.6} \]

where \( \phi(z) = E^{P}[\exp(izX_1)] \).

Example 3.1.1. Consider \( X_t = \mu t + \sigma W_t \), where \( W_t \) is a standard Brownian motion. The characteristic function of \( X_t \) is

\[ E^{P}[e^{izX_t}] = \exp\Big( -\frac{1}{2}\sigma^2 z^2 t + i\mu t z \Big) \;\Rightarrow\; e^{-t\psi^{P}(z)} = e^{-t(\frac{1}{2}\sigma^2 z^2 - i\mu z)} \;\Rightarrow\; \psi^{P}(z) = \frac{1}{2}\sigma^2 z^2 - i\mu z. \]

Thus, \( d(\theta, t) = -t\psi^{P}(-i\theta) = t\big( \frac{1}{2}\sigma^2\theta^2 + \mu\theta \big) \). From equation (3.4), we get

\[ \theta = \frac{1}{\sigma^2}\Big[ r - \mu - \frac{1}{2}\sigma^2 \Big]. \]

Also, from equation (3.5), we can easily compute

\[ \psi^{Q}(z) = \frac{1}{2}\sigma^2 (z^2 - 2iz\theta) - i\mu z. \tag{3.7} \]

Plugging θ into equation (3.7),

\[ \psi^{Q}(z) = \frac{\sigma^2}{2} z^2 - iz\Big( r - \frac{\sigma^2}{2} \Big). \tag{3.8} \]

We can easily check that \( E^{Q}[\exp(X_t)] = e^{rt} \) by replacing z with −i. Next we find a risk-neutral measure as done by Cont and Tankov (2004) [12].
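The computations in Example 3.1.1 can be verified numerically with complex arithmetic (an illustrative check, added here; the symbols follow the example):

```python
mu, sigma, r = 0.08, 0.3, 0.05

def psi_p(z):
    """Characteristic exponent under P: psi^P(z) = sigma^2 z^2 / 2 - i mu z."""
    return 0.5 * sigma**2 * z * z - 1j * mu * z

# Esscher parameter theta = (r - mu - sigma^2/2) / sigma^2 from Example 3.1.1.
theta = (r - mu - 0.5 * sigma**2) / sigma**2

# Equation (3.4): -psi^P(-i(theta+1)) + psi^P(-i theta) - r = 0.
cond = -psi_p(-1j * (theta + 1.0)) + psi_p(-1j * theta) - r

def psi_q(z):
    """Risk-neutral exponent via the Esscher transform, equation (3.5)."""
    return psi_p(z - 1j * theta) - psi_p(-1j * theta)

# Martingale check: psi^Q(-i) = -r, i.e. E^Q[exp(X_t)] = exp(r t).
martingale_check = psi_q(-1j) + r
```

Both `cond` and `martingale_check` evaluate to (numerically) zero, confirming the pricing condition and that the discounted stock is a Q-martingale.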

Proposition 3.1.2. [12] (Absence of arbitrage in exponential-Lévy models.) Let X be a Lévy process. If the trajectories of X are neither a.s. increasing nor a.s. decreasing, then the exponential-Lévy model given by \( S_t = e^{rt + X_t} \) is arbitrage-free: there exists a probability measure Q equivalent to P such that \( (e^{-rt} S_t)_{t \in [0,T]} \) is a Q-martingale.

Furthermore, \( e^{izX_t} / E[e^{izX_t}] \) is a martingale for \( t \ge 0 \). According to the above proposition, for a stock price process given by \( S_t = S_0 \exp[rt + X_t] \), the risk-neutral process for \( \log(S_t) \) becomes

\[ \log(S_t) = \log S_0 + rt - \log E[\exp(X_t)] + X_t. \]

For any \( z \in R \), the characteristic function of \( \log(S_t) \) is

\[ E[\exp(iz \log S_t)] = \exp\big[ iz\big( \log S_0 + rt - \log E[\exp(X_t)] \big) \big]\, E[\exp(izX_t)]. \tag{3.9} \]

For instance, let \( X_t \) be the Variance Gamma process. This is one of the Lévy classes whose characteristic function is known analytically. Thus \( E[\exp(X_t)] \) can be calculated from \( \phi(-i) \) and has a closed form:

\[ \phi_{VG}(z) = \Big( 1 - i\theta\nu z + \frac{\sigma^2 \nu}{2} z^2 \Big)^{-t/\nu} \quad\Rightarrow\quad \phi(-i) = \Big( 1 - \theta\nu - \frac{\sigma^2 \nu}{2} \Big)^{-t/\nu}. \]

This gives the risk-neutral exponential VG process as

\[ \log(S_t) = \log S_0 + rt - \log\left[ \Big( 1 - \theta\nu - \frac{\sigma^2 \nu}{2} \Big)^{-t/\nu} \right] + X_t. \tag{3.10} \]

3.2 European Option Valuation

In this section, we introduce an analytic representation of the European option price, using the Fourier transform based on P. Carr and D. Madan’s paper [11].

Analytical Expression of Option Pricing with the FFT: Consider a European call option with time to maturity T and strike K. The following notation is used in this section:

- \( s_T = \ln S_T \)
- \( k = \ln K \)
- \( C_T(k) \): the European call option value at time T with strike \( \exp(k) \)
- \( q_T(s) \): the risk-neutral density of \( s_T \)

In this notation, the characteristic function of \( s_T \) is

\[ \phi_T(z) = E[\exp(iz s_T)] = \int_{-\infty}^{\infty} e^{izs}\, q_T(s)\, ds. \]

The value of the call option with maturity T and log strike k is

\[ C_T(k) = e^{-rT} \int_k^{\infty} (e^s - e^k)\, q_T(s)\, ds. \tag{3.11} \]

In the above expression, we notice that

\[ C_T(k) \to S_0 \quad \text{as } k \to -\infty, \]

so \( C_T \) is not square-integrable in k. To make the call value square-integrable, we define the modified call price function

\[ c_T(k) \equiv \exp(\alpha k)\, C_T(k) \tag{3.12} \]

for some \( \alpha > 0 \) such that \( c_T \in L_2(R) \). Before considering an appropriate choice of α, we define the Fourier transform of \( c_T(k) \), presuming that \( c_T(k) \) is also well defined in \( L_1(R) \):

\[ \mathcal{F}(c_T(k)) = \int_{-\infty}^{\infty} e^{ivk}\, c_T(k)\, dk. \tag{3.13} \]

Denote \( \mathcal{F}(c_T(k)) \) by \( \psi(v) : R \to \mathbb{C} \). Then we have

\[ \psi(v) = \int_{-\infty}^{\infty} e^{ivk} \exp(\alpha k)\, C_T(k)\, dk = \int_{-\infty}^{\infty} e^{(iv+\alpha)k}\, C_T(k)\, dk. \]

Taking the inverse Fourier transform on both sides,

\[ \int_{-\infty}^{\infty} e^{-ivk}\, \psi(v)\, dv = 2\pi e^{\alpha k}\, C_T(k) \quad\therefore\quad C_T(k) = \frac{e^{-\alpha k}}{2\pi} \int_{-\infty}^{\infty} e^{-ivk}\, \psi(v)\, dv. \]

By the facts that \( C_T(k) \) is real and \( c_T(k) \in R \),

\[ C_T(k) = \frac{e^{-\alpha k}}{\pi} \int_0^{\infty} e^{-ivk}\, \psi_T(v)\, dv. \]

Now we compute \( \psi_T(v) \) in terms of the characteristic function of the risk-neutral density of \( s_T = \ln(S_T) \):

\[
\psi_T(v) = \int_{-\infty}^{\infty} e^{ivk}\, c_T(k)\, dk
= \int_{-\infty}^{\infty} e^{ivk} e^{\alpha k} e^{-rT} \int_k^{\infty} (e^s - e^k)\, q_T(s)\, ds\, dk
\]
\[
= e^{-rT} \int_{-\infty}^{\infty} q_T(s) \int_{-\infty}^{s} \big( e^{s + \alpha k} - e^{(1+\alpha)k} \big)\, e^{ivk}\, dk\, ds \qquad \text{(by Fubini, since } |\psi(v)| < \infty\text{)}
\]
\[
= e^{-rT} \int_{-\infty}^{\infty} q_T(s) \left[ \frac{e^{(\alpha+1+iv)s}}{\alpha + iv} - \frac{e^{(\alpha+1+iv)s}}{\alpha + 1 + iv} \right] ds
= \frac{e^{-rT}\, \phi_T^{Q}(v - (\alpha+1)i)}{\alpha^2 + \alpha - v^2 + i(2\alpha+1)v}.
\]

Consequently, we have the European call option value

\[ C_T(k) = \frac{\exp(-\alpha k)}{\pi} \int_0^{\infty} e^{-ivk}\, \psi_T(v)\, dv, \tag{3.14} \]
\[ \psi_T(v) = \frac{e^{-rT}\, \phi_T^{Q}(v - (\alpha+1)i)}{\alpha^2 + \alpha - v^2 + i(2\alpha+1)v}, \quad v \in R. \tag{3.15} \]

The integral in (3.14) can be evaluated numerically by the FFT. We describe the numerical scheme below.

Remark 3.2.1. If α = 0, then \( \psi_T(v) = e^{-rT} \phi_T(v - i) / (iv - v^2) \), which has singularities at v = 0 and v = i. But the FFT evaluates the integrand of (3.14) at v = 0. Thus, the coefficient α must be chosen so as to remove this singularity.

Lastly, we return to the choice of the coefficient α. From \( c_T(k) = \exp(\alpha k)\, C_T(k) \), a sufficient condition for \( c_T(k) \) to be integrable is \( \psi_T(0) < \infty \). By (3.15), the condition \( \psi_T(0) < \infty \) is equivalent to \( \phi_T(-(\alpha+1)i) < \infty \). By the definition of the characteristic function, this holds if and only if

\[ E^{Q}[S_T^{\alpha+1}] < \infty. \tag{3.16} \]

In the practical implementation, we use an upper bound \( \alpha_{ub} \) such that every \( 0 < \alpha < \alpha_{ub} \) satisfies condition (3.16).

Numerical Implementation: To compute the expression (3.14), we truncate and discretize the integral and implement the discrete FFT together with Simpson's rule. To build intuition, we first recall the discrete Fourier transform:

\[ F(k) = \sum_{j=1}^{N} f_j\, e^{-i\frac{2\pi}{N}(j-1)(k-1)}, \quad k = 1, \dots, N. \tag{3.17} \]

- The FFT computes F(1), F(2), ..., F(N).
- N is usually taken to be a power of 2.
- The FFT reduces the computational complexity to \( O(N \log_2 N) \).

Suppose we want to approximate the inverse Fourier transform of \( C_T(k) \) with the discrete FFT. The integrand must be truncated and discretized:

\[
C_T(k_u) = \frac{\exp(-\alpha k_u)}{\pi} \int_0^{\infty} e^{-ivk_u}\, \psi_T(v)\, dv
\approx \frac{\exp(-\alpha k_u)}{\pi}\, \mathrm{Real}\Big( \sum_{j=1}^{N} e^{-iv_j k_u}\, \psi(v_j)\, \eta \Big)
\approx \frac{\exp(-\alpha k_u)}{\pi}\, \mathrm{Real}\Big( \sum_{j=1}^{N} e^{-iv_j k_u}\, \psi(v_j)\, \frac{\eta}{3}\big( 3 + (-1)^j - \delta_{j-1} \big) \Big),
\]

where \( v_j = \eta(j-1) \), the last line uses Simpson's rule, and \( \delta_{j-1} \) is the Kronecker delta at j = 1. To apply equation (3.17), we set

\[ \lambda = \frac{2\pi}{\eta N}, \qquad k_u = -\frac{N\lambda}{2} + \lambda(u - 1). \]

Then, for \( u = 1, \dots, N \) with \( N = 2^m \) for some \( m \in \mathbb{N} \),

\[
C_T(k_u) \approx \frac{\exp(-\alpha k_u)}{\pi}\, \mathrm{Real}\Big( \sum_{j=1}^{N} e^{-i\lambda\eta(j-1)(u-1)}\, e^{iv_j N\lambda/2}\, \psi(v_j)\, \frac{\eta}{3}\big( 3 + (-1)^j - \delta_{j-1} \big) \Big)
\]
\[
= \frac{\exp(-\alpha k_u)}{\pi}\, \mathrm{Real}\Big( \sum_{j=1}^{N} e^{-i\frac{2\pi}{N}(j-1)(u-1)}\, e^{iv_j N\lambda/2}\, \psi(v_j)\, \frac{\eta}{3}\big( 3 + (-1)^j - \delta_{j-1} \big) \Big).
\]

For our implementation in Chapter 4, we set \( \eta = 0.25 \) and \( N = 2^{12} \).

CHAPTER 4

American Futures Options

4.1 The Basics: Futures Options

Futures options, or options on futures, are options whose underlying securities are futures contracts. Like an option on a common stock, the holder of a call (put) option has the right to buy (sell, respectively) the underlying security at the strike price, say K. But, unlike stock options, when the holder of a call (put) option exercises it, the holder acquires from the option writer a long (short) position in a futures contract together with a cash amount of $H(t) − K (respectively, K − $H(t)), where H(t) is the futures price at time t. Thus, the call (put) option writer must provide $H(t) − K (K − $H(t)) in cash and open a short (long) position in a futures contract at its futures price H(t).

Assumptions and Notation We start by introducing Black’s (1976) work on futures option valuation [3]. The assumptions here are as follows:

ˆ A1: No transaction costs in the option, futures, and bond markets.

ˆ A2: No arbitrage opportunities in markets.

ˆ A3: The short-term riskless rate of interest is constant through time.

ˆ A4: The instantaneous futures price dynamics is

dF/F = µdt + σdz

where µ is the expected instantaneous relative price change of the futures contract, σ is its standard deviation, and z is a standard Brownian motion.

Notice that we do not assume any relationship between the futures price and the price of the underlying spot commodity. Thus, the valuation based on Black's work applies to any futures option contract, independent of the underlying spot commodity. We follow the same notation used in his work:

ˆ F : current futures price.

ˆ FT : futures price at expiration date.

ˆ C(F,T ; X) [c(F,T ; X)]: American [European] call option price.

ˆ εC (F,T ; X): early exercise premium for American call option.

ˆ r: risk-free rate of interest.

ˆ T : time to expiration of futures option.

ˆ X: exercise price of futures option.

From the above assumptions and notations, we derive the partial differential equation of the futures option price. Let’s consider a portfolio with one derivative (futures option) and a number ∆ of futures contracts. Then the portfolio value Π is

    Π = V − ∆F,

where V is the futures option price. We assume that the instantaneous change in Π is given by dΠ = dV − ∆ dF. This comes as the limit of the discrete case where ∆ is held fixed over a time step. Applying Ito's Lemma to V gives

    dV = ( (∂V/∂F) F µ + (1/2) σ^2 F^2 (∂^2V/∂F^2) + ∂V/∂t ) dt + σF (∂V/∂F) dZ.    (4.1)

To eliminate the random term in time dt, we take ∆ = ∂V/∂F. Then equation (4.1) gives

    dΠ = ( (1/2) σ^2 F^2 (∂^2V/∂F^2) + ∂V/∂t ) dt.

But entering into a futures contract costs nothing, so the cost of setting up the portfolio is just V. And, by our no-arbitrage assumption, we should earn the risk-free rate on

the money we invested in the portfolio in time dt. Thus,

    dΠ = rV dt,
    rV dt = ( (1/2) σ^2 F^2 (∂^2V/∂F^2) + ∂V/∂t ) dt.

Consequently, we have the partial differential equation for the futures option:

    (1/2) σ^2 F^2 (∂^2V/∂F^2) − rV + ∂V/∂t = 0.    (4.2)

Here, it is necessary to consider the boundary condition for the futures option. For a European call option, the terminal condition is max(0, F_T − X). If we apply this boundary condition to equation (4.2), the value of the European call option on a futures contract is

    c(F,T ; X) = e^{−rT} [F N(d_1) − X N(d_2)],    (4.3)

where d_1 = [ln(F/X) + 0.5 σ^2 T]/(σ√T), d_2 = d_1 − σ√T, and N(·) is the cumulative standard normal distribution function.
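Formula (4.3) is straightforward to implement. A short sketch in Python (the thesis's own implementation is not shown; the helper names are ours):

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Cumulative standard normal distribution N(.)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black76_call(F, X, sigma, T, r):
    """European call on a futures contract, equation (4.3)."""
    d1 = (log(F / X) + 0.5 * sigma**2 * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return exp(-r * T) * (F * norm_cdf(d1) - X * norm_cdf(d2))
```

For F much larger than X the price approaches e^{−rT}(F − X), the behavior used below to motivate the early exercise premium.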

American Futures Options Now we consider an American call option. In the European call option formula (4.3), we see that when the futures price F becomes large relative to the exercise price X, both d_1 → ∞ and d_2 → ∞, so N(d_1) → 1 and N(d_2) → 1. Thus,

    c(F,T ; X) = e^{−rT} (F − X).    (4.4)

But a holder of an American call option may exercise it immediately for the amount F − X, which is higher than the European option value e^{−rT}(F − X). Thus, for large enough F, the early exercise premium is

    ε_C(F,T ; X) = C(F,T ; X) − c(F,T ; X)
                 = (F − X) − c(F,T ; X),   F ≥ F*.

Notice that for F large enough that F ≥ F*,

    ε_C ≈ (F − X) − (F − X) e^{−rT} = (F − X)(1 − e^{−rT}).

For equation (4.2), we know that there is no analytic solution for the American option on a futures contract with the early exercise condition C(F, t ; X) ≥ max(F − X, 0) for all 0 ≤ t ≤ T. Hence, we apply the quadratic approximation to value the American call

option. This method was studied by Barone-Adesi and Whaley [38], following the idea described above. The analytic approximation is

    C(F,T ; X) = c(F,T ; X) + A_2 (F/F*)^{q_2},   F < F*,
    C(F,T ; X) = F − X,                            F ≥ F*,

where

    A_2 = (F*/q_2) {1 − e^{−rT} N(d_1(F*))},
    d_1(F*) = [ln(F*/X) + 0.5 σ^2 T]/(σ√T),

    q_2 = (1 + √(1 + 4k))/2,
    k = 2r/(σ^2 (1 − e^{−rT})),

and F* is the critical futures price above which the American futures option should be exercised immediately; it solves the following equation, which is obtained iteratively.

    F* − X = c(F*,T ; X) + [1 − e^{−rT} N(d_1(F*))] F*/q_2.    (4.5)

In this section, we summarized the valuation of an American call option on a futures contract. The same approach applies to an American put option if we replace the payoff function with max(X − F, 0).

4.2 Implementation: Quadratic Approximation

Algorithm: Following the quadratic approximation method, our goal is to find the critical futures price F*. In order to do so, we must solve equation (4.5). Since we cannot solve it directly, we find the solution of (4.5) recursively, starting by evaluating equation (4.5) at some seed value F_1.

ˆ i = 1:

    F_i − X = c(F_i,T ; X) + [1 − e^{−rT} N(d_1(F_i))] F_i/q_2,    (4.6)
    d_1(F_i) = [ln(F_i/X) + 0.5 σ^2 T] / (σ√T).

Given an initial guess F_1, we solve equation (4.7) using Newton's method for finding roots recursively. Thus, let

    f(F_i) = F_i − X − c(F_i,T ; X) − [1 − e^{−rT} N(d_1(F_i))] F_i/q_2 = 0.    (4.7)

Then the next step is

ˆ i ≥ 2:

    F_{i+1} = F_i − f(F_i)/f′(F_i),

    f′(F_i) = (1 − 1/q_2)[1 − e^{−rT} N(d_1(F_i))] + (1/q_2) e^{−rT} n(d_1(F_i)) / (σ√T),

repeated until |f| < 0.000001, where n(·) is the standard normal density function.
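A sketch of the whole iteration in Python (the helper names and the seed F_1 = 1.5X are our own choices; the thesis does not specify its seed):

```python
from math import log, sqrt, exp, erf, pi

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def black76_call(F, X, sigma, T, r):
    d1 = (log(F / X) + 0.5 * sigma**2 * T) / (sigma * sqrt(T))
    return exp(-r * T) * (F * norm_cdf(d1) - X * norm_cdf(d1 - sigma * sqrt(T)))

def critical_futures_price(X, sigma, T, r, F1=None, tol=1e-6):
    """Solve equation (4.5) for F* by Newton's method, equations (4.6)-(4.7)."""
    k = 2.0 * r / (sigma**2 * (1.0 - exp(-r * T)))
    q2 = (1.0 + sqrt(1.0 + 4.0 * k)) / 2.0
    F = F1 if F1 is not None else 1.5 * X        # seed value (arbitrary choice)
    for _ in range(200):
        d1 = (log(F / X) + 0.5 * sigma**2 * T) / (sigma * sqrt(T))
        f = F - X - black76_call(F, X, sigma, T, r) \
            - (1.0 - exp(-r * T) * norm_cdf(d1)) * F / q2
        if abs(f) < tol:
            break
        fp = (1.0 - 1.0 / q2) * (1.0 - exp(-r * T) * norm_cdf(d1)) \
             + exp(-r * T) * norm_pdf(d1) / (q2 * sigma * sqrt(T))
        F -= f / fp
    return F, q2

def american_futures_call(F, X, sigma, T, r):
    """Quadratic approximation for the American futures call."""
    Fs, q2 = critical_futures_price(X, sigma, T, r)
    if F >= Fs:
        return F - X
    d1s = (log(Fs / X) + 0.5 * sigma**2 * T) / (sigma * sqrt(T))
    A2 = (Fs / q2) * (1.0 - exp(-r * T) * norm_cdf(d1s))
    return black76_call(F, X, sigma, T, r) + A2 * (F / Fs) ** q2
```

Since A_2 > 0, the approximation always returns at least the European value, as it should.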

Early Exercise Premium: In order to examine the early exercise premium of options on futures, we compute the European call option prices, using options on crude oil futures contracts from December 2003 to November 2004 and the numerical FFT scheme. The early exercise premium is then computed by subtracting the European call option price from the American price. The result is reported in Table 4.1. As shown in Table 4.1, the early exercise premium is often small enough to be negligible. For instance, when the moneyness is 1.05-1.2 or greater than 1.2, these call options have early exercise premiums of only 6.30552E-05 and 1.48828E-05, respectively, for short-term maturity. When call options lie deep in the money with medium-term maturity, the early exercise premium exceeds 0.06 and is significant enough to affect a portion of the actual call option price. The further out of the money a call option is, the smaller its early exercise premium over the option's life.

Table 4.1: Early Exercise Premium: call options on crude oil futures for December 2003- November 2004

K/F          Short-Term (< 60)   Med-Term (60-120)
< 0.8        0.015948            0.064714
0.8 - 0.95   0.001471            0.005507
0.95 - 1.05  0.000156            0.001244
1.05 - 1.2   6.30552E-05         0.000452
> 1.2        1.48828E-05         0.000127

4.3 Calibration

Data Description: The data used to calibrate the American options are futures options on crude oil. The selected futures options are daily closing prices from December 2003 to

November 2004. Over the sample period, we calibrate calls once per month, each maturity separately. The quote date we pick is the second Tuesday of each month. Since volume decreases substantially for expiration dates more than 120 days in the future, we keep futures options whose expiration dates are less than 120 days away. Additionally, we apply two filters to our data. First, we exclude very near-term futures options (fewer than 4 days to expiration); thus, we may have early exercise opportunities at most four times for these options. Second, we discard deep out-of-the-money calls (K/F_t > 1.2) and deep in-the-money calls (K/F_t < 0.8). After all filters, 692 contracts are left to calibrate over the period. As the risk-free rate, the constant three-month T-bill rate is used [32].

Price Process and Calibration: According to the quadratic approximation, American-style option prices are adjusted and converted into European-style option prices, so our valuation and calibration problems reduce to European ones. Thus, the FFT method is applied to value European call options. Consider any risk-neutral price process {S_t}_{t∈[0,T]} on (Ω, F_t, P) with continuously compounded interest rate r. Let ξ be the parameter vector we want to estimate. Here, a risk-neutral process for log(S_t) is

    log(S_t) = log S_0 + rt − log E[exp(X_t)] + X_t.    (4.8)

We compute the option values using the method we described previously:

    c^model(ξ; F_t, K_i, T) = e^{−r(T−t)} E^Q[ (F_T − K_i)^+ | F_t ],    (4.9)

where c^model is the European option price over different strike prices K_i and maturity T on our observation date. Here, X_t is the Variance Gamma process defined by the characteristic function

    φ_{VG(θ,σ,ν)}(z) = ( 1 − iθνz + (σ^2 ν/2) z^2 )^{−t/ν}.

Thus, ξ is (θ, σ, ν) in the case where {X_t}_{t∈[0,T]} is the Variance Gamma process. Now our calibration problem is to find a probability Q such that the price process S_t is a martingale and c^model(ξ; F_t, K_i, T) is very close to the market prices c^market in a mathematical sense. But it is not easy to find an exact solution, as Cont and Tankov pointed out [12]. Thus, in practice this is replaced by a nonlinear least squares problem. Given call option prices c^model, find a parameter vector ξ such that

    inf_ξ || c^market − c^model ||^2.    (4.10)
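A minimal sketch of the two ingredients of this problem: the VG characteristic function used inside the FFT pricer, and the squared-error objective. Here model_price_fn is a placeholder for any European pricer (e.g. the FFT routine of Chapter 3); it is not a function defined in the thesis:

```python
import numpy as np

def phi_vg(z, theta, sigma, nu, t):
    """VG characteristic function: (1 - i*theta*nu*z + 0.5*sigma^2*nu*z^2)^(-t/nu)."""
    return (1.0 - 1j * theta * nu * z + 0.5 * sigma**2 * nu * z**2) ** (-t / nu)

def calibration_objective(params, market_prices, strikes, tau, model_price_fn):
    """Sum of squared pricing errors, as in the nonlinear least-squares problem."""
    model = np.array([model_price_fn(params, K, tau) for K in strikes])
    return float(np.sum((np.asarray(market_prices) - model) ** 2))
```

Any generic nonlinear minimizer can then be run on calibration_objective over ξ = (θ, σ, ν).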

This least squares problem has a few difficulties, both numerical and theoretical. First, the market data we use are finite, so it is hard to identify a unique optimal model: many models reproduce the market option prices within any allowed pricing error [36]. Second, the function in equation (4.10) is typically non-convex, which means it has many local minima [36]. Finally, observed option prices in the market include a bid-ask spread, and daily closing prices are used for calibrating the parameters; therefore, the data may contain numerical errors [36]. Cont and Tankov reformulated the least squares problem (4.10) to overcome these technical difficulties. They use a regularization method to approximate the solution and remove pricing errors; the smallest relative entropy with respect to a given prior measure is used to address the lack-of-identification problem [36]. In our paper, however, we check the relative errors in the infinity norm against an allowed tolerance of 1.E-6, and the gradient norm as well. Second, in order to reduce market pricing noise, we use the common method of eliminating data points that are deep out of the money and deep in the money. Now our calibration problem is set up as

    min_ξ Σ_{i=1}^{N} ( c^market(τ, K_i) − c^model(ξ; τ, K_i) )^2    (4.11)

for some finite number N of call options in the market.

4.4 Numerical Result

4.4.1 Parameters Calibrated

According to our numerical scheme, we calibrate the parameters underlying the Black-Scholes and Variance Gamma models. In Table 4.2, we list the time-averaged parameters. Since each parameter is calibrated for each maturity separately, we report the twelve time-averaged values in the Appendix. As we see in Table 4.2, the volatility parameter σ is around 0.28 for both the Black-Scholes and VG models. For the Black-Scholes model, it ranges from a low of 0.2335 (April 2004 contract) to 0.3354 (November 2004 contract). The σ of the November 2004 contract is also the highest for the VG model, recorded as 0.3332; its minimum for the VG is 0.2435, for the April 2004 contract. The parameter θ calibrated for the VG model averages 0.2306 over the entire period. For each observation date, the averaged values of θ are listed in Table 4.2 and in the Appendix. Like the two parameters above, ν is reported as the

Table 4.2: Parameters calibrated: time-averaged from twelve observations based on the Black-Scholes and Variance Gamma models (Dec. 2003 ∼ Nov. 2004)

Parameter Calibrated

Model   σ        ν        θ
B-S     0.2810
VG      0.2859   0.0455   0.2306

time-averaged value, 0.0455. The maximum is 0.2628 (June 2004, medium-term contract), while the recorded minimum is 0.00001.

4.4.2 In-Sample Pricing Performance

A. Absolute Error In this section, we investigate the absolute error between call prices implied by the models and actual prices. We compare the VG and B-S models by moneyness and maturity in Table 4.3. As we can see, the call option prices implied by the models appear biased in both moneyness and maturity. In general, for the VG process, the absolute error is smaller near the money and smaller still further out of the money; the smallest mispricing occurs with out-of-the-money calls. However, this pattern does not seem to occur with the B-S model. In the VG model, the absolute error is greater for medium-term maturity calls than for short-term maturity calls, over all moneyness. But in the B-S model, for call options with moneyness between 0.95 and 1.1, the opposite is reported: the absolute error is smaller for medium-term maturity than for short-term maturity. Over all observations, the smallest error for the VG is 0.0062, for short-term options with moneyness of 1.1-1.2. For the B-S model, the smallest error reported is 0.0385 (for short-term options with moneyness of 0.8-0.95).

Table 4.3: In-Sample: Average absolute errors (|C − C(F,T ; K)|)

Average Absolute Error

                Short-Term (4-60)     Med-Term (60-120)
Moneyness       B-S       VG          B-S       VG
0.8-0.90        0.0385    0.0167      0.0896    0.0385
0.9-0.95        0.0678    0.0154      0.0912    0.0327
0.95-1          0.092     0.0099      0.0817    0.0274
1-1.05          0.092     0.0144      0.0865    0.0166
1.05-1.1        0.0763    0.0125      0.0715    0.0202
1.1-1.2         0.0999    0.0062      0.0655    0.011

B. Pricing Performance In order to assess the pricing performance of the B-S and VG models, we use four error measures: APE, AAE, ARPE, and RMSE.

• APE = ( Σ_{i=1}^{N} |C_i^market − C_i^model| ) / ( Σ_{i=1}^{N} C_i^market ): average pricing error

• AAE = (1/N) Σ_{i=1}^{N} |C_i^market − C_i^model|: average absolute error

• ARPE = (1/N) Σ_{i=1}^{N} |C_i^market − C_i^model| / C_i^market: average relative percentage error

• RMSE = sqrt( (1/N) Σ_{i=1}^{N} (C_i^market − C_i^model)^2 ): root mean square error

Time-averaged APE, AAE, ARPE, and RMSE are listed in Table 4.4. To compare the pricing performance of the VG model with that of the B-S model, we also list the ratio of the VG to the B-S model for each error in Table 4.5. For the B-S model, the APE is about 4.25% on average; for the VG model, 0.85% is recorded. Generally, the VG model's errors were only about 15-20% of those of the B-S model.
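These four measures are straightforward to compute; a small sketch (the function name is ours):

```python
import numpy as np

def pricing_errors(market, model):
    """APE, AAE, ARPE, and RMSE between market and model call prices."""
    market = np.asarray(market, float)
    model = np.asarray(model, float)
    abs_err = np.abs(market - model)
    return {
        "APE": abs_err.sum() / market.sum(),       # average pricing error
        "AAE": abs_err.mean(),                     # average absolute error
        "ARPE": (abs_err / market).mean(),         # average relative percentage error
        "RMSE": np.sqrt(((market - model) ** 2).mean()),
    }
```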

Table 4.4: Pricing errors in-sample: time-averaged from twelve observations based on the quadratic approximation (Dec. 2003 ∼ Nov. 2004)

Model   APE (%)   AAE      ARPE     RMSE
B-S     4.2484    0.0751   0.0963   0.0913
VG      0.8484    0.0145   0.0147   0.0186

Table 4.5: In-Sample: time-averaged error ratios of the VG to the B-S model (Dec. 2003 ∼ Nov. 2004)

APE_VG/APE_BS   AAE_VG/AAE_BS   ARPE_VG/ARPE_BS   RMSE_VG/RMSE_BS
0.2             0.1933          0.1527            0.2032

4.5 Out-of-Sample Performance

In this section, the same error analysis is done for one-day-ahead and five-days-ahead out-of-sample observations. Given the calibrated parameters, we compute call option prices, applying the same filters to the out-of-sample observations. First, we list the averaged absolute errors by moneyness over all maturities. Then, we determine the different pricing errors for one-day-ahead and five-days-ahead observations.

A. Average Absolute Error Table 4.6 and Table 4.7 show the average absolute errors between the models and actual prices. Overall, the VG process outperforms the B-S model. For example, for the one-day-ahead out-of-sample observations with moneyness of 0.95-1, 0.1044 is recorded for the B-S model in the short term; this is nearly three times the error value recorded for the VG, 0.0368. For five-days-ahead out-of-sample observations, the B-S model's absolute errors were, on average, 15-20% greater than those of the VG process.

26 Table 4.6: Average Absolute Error: One-day-ahead for December 2003-November 2004

Average Absolute Error

                Short-Term (4-60)     Med-Term (60-120)
Moneyness       B-S       VG          B-S       VG
0.8-0.90        0.0330    0.0165      0.0839    0.0705
0.9-0.95        0.0769    0.0278      0.0877    0.0617
0.95-1          0.1044    0.0368      0.1104    0.0501
1-1.05          0.1014    0.0397      0.0942    0.0358
1.05-1.1        0.0778    0.0346      0.0836    0.0405
1.1-1.2         0.0925    0.0291      0.0690    0.0322

Table 4.7: Average Absolute Error: Five-days-ahead for December 2003-November 2004

Average Absolute Error

                Short-Term (4-60)     Med-Term (60-120)
Moneyness       B-S       VG          B-S       VG
0.8-0.90        0.0661    0.0509      0.0694    0.1309
0.9-0.95        0.0879    0.0676      0.1853    0.1667
0.95-1          0.1159    0.0823      0.2070    0.1391
1-1.05          0.1147    0.0837      0.1859    0.1136
1.05-1.1        0.1033    0.0705      0.1749    0.1154
1.1-1.2         0.1065    0.0656      0.1449    0.0978

B. Pricing Errors In Table 4.8, we list pricing errors over the entire period for the one-day-ahead and five-days-ahead observations. In terms of APE and RMSE, the VG process outperforms the B-S model over all maturities. As shown in the first and second rows of Table 4.9, the time-averaged ratios of out-of-sample APE and RMSE for the VG model to those of the B-S model range from 0.46 to 0.78.

Table 4.8: Pricing errors out-of-sample: time-averaged from twelve observations based on the Black-Scholes and Variance Gamma models (Dec. 2003 ∼ Nov. 2004)

Pricing Errors for Out-of-Sample

Pricing Errors  Model   one-day ahead   five-days ahead
APE (%)         B-S     4.3208          6.2074
                VG      2.2435          4.8305
AAE             B-S     0.0804          0.1143
                VG      0.0392          0.0861
ARPE            B-S     0.0867          0.1139
                VG      0.0140          0.0780
RMSE            B-S     0.0992          0.1373
                VG      0.0458          0.0924

Table 4.9: Out-of-Sample: time-averaged ratios of APE and RMSE (Dec. 2003 ∼ Nov. 2004)

Out-of-sample     APE_VG/APE_BS   RMSE_VG/RMSE_BS
One-day ahead     0.5192          0.4617
Five-days ahead   0.7782          0.6730

CHAPTER 5

Generating Sample Paths

In this chapter, we will describe a method for simulating paths of underlying futures prices.

5.1 Variance Gamma Process

A pure jump Lévy process, the Variance Gamma (VG) process is defined as a Brownian motion with constant drift and volatility evaluated at a random time change given by a gamma process [34].

Let W = {W_t = θt + σB_t : t ∈ [0, ∞)}, where B_t is a standard Brownian motion. Let G = {G_t : t ≥ 0} be a Gamma process with parameters α, β > 0 (denoted Gamma(α, β)), so that its density is

    f_{G(t)}(x) = x^{αt−1} e^{−x/β} / (Γ(αt) β^{αt}),   for x ≥ 0.    (5.1)

Then, by the definition of the characteristic function, φ_{G_t}(z) = exp[−tα log(1 − izβ)] for all t ≥ 0 and z ∈ ℝ. Suppose that W_t and G_t are independent. Then, according to the subordination theorem for Lévy processes (Theorem 30.1 [34], p. 197), we can define the stochastic process X_t(ω) = W_{G_t(ω)}(ω) for each t ≥ 0 and each ω ∈ Ω. Thus, X_t^{(VG)} = θG_t + σB_{G_t}, for any θ ∈ ℝ and σ > 0. The characteristic function of the VG process with parameter set (θ, σ, ν) is given by

    E[exp(izX_t)] = ( 1 − izθν + (1/2) σ^2 ν z^2 )^{−t/ν}.

Also, this parameter set (θ, σ, ν) determines the characteristics of the return distribution. From the first four central moments over a time interval t, we can obtain the skewness and excess kurtosis as follows:

29 Table 5.1: Variance Gamma Process: Moment

Moment     VG(θ, σ, ν)                                                       VG(0, σ, ν)

Mean       θt                                                                0

Variance   (νθ^2 + σ^2) t                                                    σ^2 t

Skewness   t(2ν^2 θ^3 + 3σ^2 θν) ((νθ^2 + σ^2) t)^{−3/2}                     0

Kurtosis   3νt^{−1} ( 1 + (θ^2 ν + σ^2)^{−2} (2σ^2 θ^2 ν + θ^4 ν^2) ) + 3    3(1 + ν/t)

The parameter θ ∈ ℝ controls location and skewness, σ ∈ ℝ^+ is a shape parameter, and ν ∈ ℝ^+ controls the tail behavior of the Lévy density of the VG process. The Lévy measure of the VG process is

    ρ_VG(x; θ, σ, ν) = (1/(ν|x|)) exp( (θ/σ^2) x − (|x|/σ^2) √(2σ^2/ν + θ^2) ).    (5.2)

Remark 5.1.1.

    ρ_VG(ℝ^d) = ∞    (5.3)

    ∫_{|x|≤1} |x| ρ(dx) < ∞   (assuming that A = 0).    (5.4)

ˆ ∫_{|x|≤1} |x| ρ(dx) < ∞:

    ∫_{|x|≤1} |x| ρ(dx) = ∫_{|x|≤1} (1/ν) exp( (θ/σ^2) x − (|x|/σ^2) √(2σ^2/ν + θ^2) ) dx
    = (1/ν) ∫_{0<x≤1} exp( (θ/σ^2) x − (x/σ^2) √(2σ^2/ν + θ^2) ) dx
      + (1/ν) ∫_{−1≤x<0} exp( (θ/σ^2) x + (x/σ^2) √(2σ^2/ν + θ^2) ) dx.    (5.5)

If θ > 0, then by Taylor expansion the first integrand in (5.5) satisfies

    ∫_0^1 (1 + tx + O(x^2)) / (1 + sx + O(x^2)) dx ≈ ∫_0^1 (1 + O(x)) dx < ∞,

where s = (1/σ^2) √(2σ^2/ν + θ^2) and t = θ/σ^2. When θ < 0, then

    ∫_0^1 exp(−Cx) dx < ∞,    (5.6)

where C = −θ/σ^2 + (1/σ^2) √(2σ^2/ν + θ^2). Similarly, ∫_{−1≤x<0} exp( (θ/σ^2) x + (x/σ^2) √(2σ^2/ν + θ^2) ) dx is also finite by Taylor expansion. Consequently, ∫_{|x|≤1} |x| ρ(dx) is finite.

ˆ ρ_VG(ℝ) = ∞: this is clear from the form of the Lévy measure (5.2) if we take d = 1, since the factor 1/|x| is non-integrable at the origin.

Thus, the VG process is an infinite activity process, but one of finite variation, since A = 0 and ∫_{|x|≤1} |x| ρ(dx) < ∞. Next we observe that when θ = 0, the skewness is zero; this implies that the Lévy measure of the VG process is symmetric with respect to zero. On the other hand, a positive (negative) θ yields positive (negative) skewness, respectively. Figure 5.1(a) shows the skewness corresponding to the sign of θ. Also, as we can see in Figure 5.1(b), the parameter σ controls the shape of the Lévy measure of the VG process: the larger the value of σ, the more mass lies far from the origin. This is illustrated with fixed θ and ν and various values of σ in Figure 5.1(b). Moreover, the excess kurtosis is influenced by ν. For larger values of ν, we have fatter tails, since the exponential tails of the Lévy measure decay more slowly: i.e., the variance rate of the gamma subordinator affects the tail behavior, which governs the arrival rate of large jumps. In Figure 5.1(c), we plot the Lévy measure for various ν with fixed θ and σ, and we take a closer look at both tails in Figures 5.1(d) and 5.1(e). With the same θ = 0 and σ = 0.2, we clearly see that the largest value, ν = 0.2, has the heaviest tail.
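The qualitative statements above are easy to check numerically by evaluating the Lévy density (5.2) directly; the helper below is our own sketch:

```python
import numpy as np

def vg_levy_density(x, theta, sigma, nu):
    """VG Levy density, equation (5.2):
    exp(theta*x/sigma^2 - |x|/sigma^2 * sqrt(2*sigma^2/nu + theta^2)) / (nu*|x|)."""
    x = np.asarray(x, float)
    decay = np.sqrt(2.0 * sigma**2 / nu + theta**2)
    return np.exp(theta * x / sigma**2 - np.abs(x) / sigma**2 * decay) / (nu * np.abs(x))
```

The symmetry for θ = 0, the heavier tails for larger ν, and the asymmetry induced by θ ≠ 0 can all be read off from evaluations of this function.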

(a) Lévy measure with various θ (θ = −1.0, 0, 1.0) with fixed σ = 0.2 and ν = 0.1

(b) Lévy measure with various σ (σ = 0.1, 0.2, 0.3) with fixed θ = 0 and ν = 0.1

Figure 5.1: Lévy measure of the VG process with respect to (θ, σ, ν).

(a) Lévy measure with various ν (ν = 0.001, 0.01, 0.2) with fixed θ = 0 and σ = 0.2

(b) tail on the positive side of Figure (a)    (c) tail on the negative side of Figure (a)

Figure 5.2: Figure 5.1, continued.

5.2 Simulation of the Variance Gamma Process

In this section, we introduce a simulation method for sampling the VG process. Madan, Carr, and Chang (1998) consider the VG process in two different ways [27]. First, the VG process

is obtained by evaluating a Brownian motion (with constant drift and volatility) at a random time change given by a Gamma process. The other way they view it is as the difference of two independent increasing Gamma processes, X(t; σ, ν, θ) = γ_p(t; µ_p, ν_p) − γ_n(t; µ_n, ν_n). The parameters of the two Gamma processes must satisfy:

    µ_p = (1/2) √(θ^2 + 2σ^2/ν) + θ/2    (5.7)

    µ_n = (1/2) √(θ^2 + 2σ^2/ν) − θ/2    (5.8)

    ν_p = ( (1/2) √(θ^2 + 2σ^2/ν) + θ/2 )^2 ν    (5.9)

    ν_n = ( (1/2) √(θ^2 + 2σ^2/ν) − θ/2 )^2 ν    (5.10)

But in this paper, we define the VG process X^{(VG)} = {X_t^{VG} : t ≥ 0} with parameters θ, σ > 0, and ν > 0 as a standard Brownian motion with drift evaluated at a gamma subordinator. Therefore, we simulate the VG process by sampling a standard Brownian motion and a Gamma process, rather than simulating two independent Gamma processes.
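Equations (5.7)-(5.10) imply µ_p − µ_n = θ and µ_p µ_n = σ^2/(2ν), which gives a quick sanity check on any implementation; a small sketch (the function name is ours):

```python
from math import sqrt

def vg_gamma_difference_params(theta, sigma, nu):
    """Parameters (5.7)-(5.10) of the two gamma processes whose difference is VG."""
    root = 0.5 * sqrt(theta**2 + 2.0 * sigma**2 / nu)
    mu_p, mu_n = root + theta / 2.0, root - theta / 2.0
    nu_p, nu_n = mu_p**2 * nu, mu_n**2 * nu
    return mu_p, mu_n, nu_p, nu_n
```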

5.2.1 Sampling the VG Process

First, we note that the Gamma distribution on ℝ is infinitely divisible: assume that X ∼ Gamma(α, β). Then we can write X = Σ_{i=0}^{n−1} Y_i, where the Y_i are i.i.d. with law Gamma(α/n, β). Hence, a Gamma random variable can be decomposed into a sum of n i.i.d. parts having the same distribution but with a modified shape parameter. By the correspondence theorem between infinitely divisible distributions and Lévy processes in law (Theorem 7.10 [34], p. 35), there is a Lévy process (Gamma process) in law {X_t : t ≥ 0} such that X_1 has distribution Gamma(α, β). But X_t = Σ_{i=1}^{n} (X_{t_i} − X_{t_{i−1}}) with t_i = (i/n)t, so by the stationary increment property, for discrete times t_1 < ⋯ < t_n we can sample the increments

    ∆G = X(t_i) − X(t_{i−1}) ∼ Gamma(α(t_i − t_{i−1}), β)    (5.11)

with X_{t_0} = 0, and the increments are independent for all i = 1, 2, ..., n. Next, in order to generate sample paths of the VG process, we use the representation X_t = W(G_t) for each

ω ∈ Ω, where W is a standard Brownian motion with a constant drift and G is a Gamma process. The increment ∆W = W(G(t_i)) − W(G(t_{i−1})), conditioned on the Gamma time change, has a normal distribution with

    E[∆W] = θ [G(t_i) − G(t_{i−1})]    (5.12)

    Var[∆W] = σ^2 [G(t_i) − G(t_{i−1})].    (5.13)

Remark 5.2.1. If Z ∼ N(0, 1), then θ + σZ ∼ N(θ, σ^2). Because of this, ∆W can be sampled as θ∆G + σ√(∆G) Z.

Therefore, knowing the increment G(t_i) − G(t_{i−1}), we can simulate ∆W, which is the increment of the VG process. In addition, we use the same restriction on the shape parameter as Madan, Carr, and Chang [27]: we set the shape parameter α of G(1) to 1/β, so that E[G(t)] (= E[G(t) − G(0)]) = (1/β) · β · t = t. The following steps are taken to generate samples of the VG process:

ˆ Y ∼ Gamma((t_i − t_{i−1})/β, β)

ˆ Z ∼ N(0, 1)

ˆ ∆W = X(t_i) − X(t_{i−1}) = θY + σ√Y Z.

Sampling a Gamma Process: According to our sampling scheme, the shape parameter is α = ∆t/β. In our case, we therefore generate gamma random numbers with α ≤ 1. There are a couple of ways to generate Gamma random numbers when α ≤ 1 [14] [35]; in our paper, we use Berman's Gamma generator [14]. Furthermore, from equation (5.1), we can easily see that for any β > 0, βX ∼ Gamma(α, β) when X ∼ Gamma(α, 1). This implies that it is enough to have a good generator for Gamma(α, 1) in order to generate Gamma(α, β).

Berman's Gamma Generator:

Repeat:

1. Generate i.i.d. uniform random numbers U and V.

2. Set X ← U^{1/α} and Y ← V^{1/(1−α)}.

Until X + Y ≤ 1.

3. Generate two i.i.d. uniform random numbers U* and V*.

4. Return the number −X log(U* V*) as the Gamma(α, 1) random number.

Figure 5.3 shows a sample path with β = 20 for a fixed time step ∆t = 0.01.
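A direct transcription of these steps (the seed is our choice). The acceptance step produces a Beta(α, 2 − α) variate, and −log(U*V*) is Gamma(2, 1), so their product is Gamma(α, 1) by the beta-gamma algebra:

```python
import random
from math import log

rng = random.Random(1)

def berman_gamma(alpha):
    """Berman's generator for Gamma(alpha, 1), valid for 0 < alpha < 1."""
    while True:
        U, V = rng.random(), rng.random()
        X = U ** (1.0 / alpha)
        Y = V ** (1.0 / (1.0 - alpha))
        if X + Y <= 1.0:              # accept: X ~ Beta(alpha, 2 - alpha)
            break
    Us, Vs = rng.random(), rng.random()
    return -X * log(Us * Vs)          # times an independent Gamma(2, 1) variate
```

A Gamma(α, β) draw is then β times a Gamma(α, 1) draw.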


Figure 5.3: A sample path of a Gamma process: ∆t = 0.01, β = 20.

Following the main steps, we next simulate n i.i.d. N(0, 1) standard normal random variables. This can be done in many ways; for our application, we use the generator randn in MATLAB. Finally, we set the discretized trajectory W(t_i) = Σ_{k=1}^{i} ∆W_k. Figure 5.4 shows a VG process generated with θ = 0.1, σ = 0.2, and ν = 0.05.


Figure 5.4: A sample path of a VG process: ∆t = 0.001, θ = 0.1, σ = 0.2, and ν = 0.05.

CHAPTER 6

Random Tree Method

In this chapter, we describe the method to generate random trees proposed by Mark Broadie and Paul Glasserman [8].

Problem Statement Unlike European options, American-style options can be exercised at any time over the option's life. This means that the value of an American-style option is determined by an optimal exercise time. Thus, the first goal is to find the optimal stopping rule, and then to compute the expectation of the discounted payoff function under that rule. Consider the American call option with strike K, maturity T, and risk-free rate r as usual. Then, we can write this problem as

    C = max_{τ ∈ (0,T]} E[e^{−rτ} (S_τ − K)^+],    (6.1)

where τ is a stopping time (with respect to S) up to the maturity. In this paper, we assume that there are a finite number of early exercise opportunities; denote such a finite set by {0 = t_0 < ⋯ < t_m = T}. The next step is to generate sample paths of the underlying prices, say S_0, S_1, ..., S_T, at the corresponding times {0 = t_0 < ⋯ < t_m = T}, and then to evaluate the option value discounted at the risk-free rate along each sample path. Finally, we take the average of the estimators over many simulated paths. We will explain these steps later.

Remark 6.0.2. ˆ Optimal stopping policy: If the optimal stopping rule were known,

then our estimate would be e^{−rτ}(S_τ − K)^+. Unfortunately, however, it is unknown, and we have to decide it via simulation. A natural way is to compute the optimal stopping

38 time for the simulated path. This gives the path estimate

    max_{i=0,...,m} e^{−r t_i} (S_{t_i} − K)^+.    (6.2)

ˆ Notice that the above path estimate corresponds to the perfect-foresight solution and therefore tends to overestimate the call option value: max_{i=0,...,m} e^{−r t_i}(S_{t_i} − K)^+ ≥ e^{−rτ}(S_τ − K)^+. The expected value over simulated paths is thus too high and is not a solution to the American option pricing problem (6.1). To fix this bias, we evaluate two estimates, one biased high and one biased low. Both estimators become asymptotically unbiased as the number of simulations and the number of branches increase. This also gives an error bound for the true option value. We discuss this in the next part.

ˆ We assume that the underlying prices S_{t_i} can be simulated exactly, so as to discard any time discretization error.

Random Tree Method The simulated trees are characterized by b (≥ 2), the number of branches per node, and exercise decisions are made at the finite times 0 = t_0 < ⋯ < t_m. Given the initial state, say S_{t_0}, we simulate state variables at the finite number of possible exercise times {t_i}_{i=1}^{m}. Throughout this section, we use the same notation as Glasserman and Broadie, as follows [8]:

ˆ Time index: 0 = t_0 < ⋯ < t_m = T, where t_i is the time of the i-th exercise opportunity. For convenience, we denote {t_i}_{i=0}^{m} by t = 0, 1, 2, ..., T.

ˆ Risk-neutralized state variables: {S_t : t = 0, 1, ..., T} is the underlying process.

ˆ Discount factor: e^{−r_t} is the discount factor from t − 1 to t, and r is the constant risk-free interest rate.

ˆ Payoff function: h_t(s) is the payoff from exercising at time t in state s.

ˆ Continuation value at time t and state s: g_t(s) = E[e^{−r_{t+1}} f_{t+1}(S_{t+1}) | S_t = s].

ˆ Option value: f_t(s) = max{h_t(s), g_t(s)}.

ˆ Option value at expiration: f_T(s) = h_T(s).

Given the initial stock price S_0, we simulate b independent state variables S_1^1, S_1^2, ..., S_1^b, denoted by S_1^{i_1}, i_1 = 1, ..., b. Then from each S_1^{i_1} we generate b independent states S_2^{i_1 1}, S_2^{i_1 2}, ..., S_2^{i_1 b}, for i_1 = 1, 2, ..., b. Similarly, we simulate b successors S_3^{i_1 i_2 1}, S_3^{i_1 i_2 2}, ..., S_3^{i_1 i_2 b} from any S_2^{i_1 i_2}, i_1, i_2 = 1, 2, ..., b. In this way, we can generalize this notation with b branches per node:

    { S_t^{i_1 i_2 ⋯ i_t} : t = 0, 1, ..., T; i_j = 1, ..., b; j = 1, ..., t }.    (6.3)

In our random tree, we put some specifications on the joint distribution of { S_t^{i_1 i_2 ⋯ i_t} }.

1. The initial state S0 is fixed.

2. (a) Given S_t^{i_1 i_2 ⋯ i_t}, the successors S_{t+1}^{i_1 i_2 ⋯ i_t j}, j = 1, 2, ..., b, are conditionally independent of each other.

   (b) If k < t or i′_t ≠ i_t, then given S_t^{i_1 i_2 ⋯ i_t}, the S_{t+1}^{i_1 i_2 ⋯ i_t j}, j = 1, 2, ..., b, are also independent of all S_k^{i′_1 i′_2 ⋯ i′_k}.

3. Given S_t^{i_1 i_2 ⋯ i_t}, each S_{t+1}^{i_1 i_2 ⋯ i_t j} has the conditional distribution of S_{t+1} given S_t = S_t^{i_1 i_2 ⋯ i_t}.

To illustrate this method clearly, we give an example of the random tree when b = 4 and m = 2. Figure 6.1 shows the movement and notation of state variables. As shown in Figure 6.1, the nodes are labled in the order in which we generate state 1 2 variables. Thus, it’s not necessary that the value S1 is higher than that of S1 . Also, each point between nodes shows their dependence structure. For instant, S11,S12,S13 depends { T T T } 1 2 3 on S1 , but not S1 and S1 . The most important point is that this is a nonrecombining tree. Unlike the binomial or multinomial trees, we do not recombine the tree at each successor simulated. Thus, at the final time step, we have bm nodes.


Figure 6.1: Random Tree with b = 4 and m = 2

6.1 The High Estimator Θ

Definition 6.1.1. [8] The High Estimator Θ:

The high estimator Θ is defined recursively by

$$t = 0, \ldots, T-1: \quad \Theta_t^{i_1 \cdots i_t} = \max\left\{\, h_t(S_t^{i_1 \cdots i_t}),\ \frac{1}{b} \sum_{j=1}^{b} e^{-r_{t+1}} \Theta_{t+1}^{i_1 \cdots i_t j} \,\right\}$$
$$t = T: \quad \Theta_T^{i_1 \cdots i_T} = f_T(S_T^{i_1 \cdots i_T}).$$

At every node, we must decide whether to exercise immediately or to continue. Thus we evaluate both the payoff and the value of continuing over the next time period, the latter estimated from all $b$ successor nodes. Backward induction is applied until the initial time, $t = 0$.
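Definition 6.1.1 translates directly into a recursion over the tree. A sketch follows, using a `(state, children)` nested-list tree layout as a hypothetical representation; `h(t, s)` is the payoff function and `r` the constant per-step rate:

```python
import math

def high_estimator(node, t, T, r, h):
    """Recursive high estimator Theta of Definition 6.1.1.
    node = (state, children); at t = T the children list is empty
    and Theta_T = f_T = h_T."""
    s, children = node
    if t == T:
        return h(t, s)
    # average of discounted estimators over the b successors
    cont = math.exp(-r) * sum(
        high_estimator(c, t + 1, T, r, h) for c in children) / len(children)
    # max{payoff, estimated continuation value}
    return max(h(t, s), cont)
```

On the tiny deterministic tree `(100, [(110, []), (90, [])])` with call payoff $\max(s - 95, 0)$ and $r = 0$, the continuation estimate is $(15 + 0)/2 = 7.5 > 5$, so the estimator returns 7.5.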

Remark 6.1.2. When $t = 0$, $\Theta_t^{i_1 i_2 \cdots i_t} \triangleq \Theta_0$.

Before examining the main theorems, we introduce some notation. Let $X$ be a random variable on $(\Omega, \mathcal{F}, P)$.

1. $\|X\|_p = (E|X|^p)^{1/p}$

2. $\|X\|_{S_t} = (E[\,|X|^p \mid S_t\,])^{1/p}$

Remark 6.1.3. If $t = 0$, then $S_0$ is deterministic, and so $\|X\|_{S_0} = (E|X|^p)^{1/p} = \|X\|_p$.

3. $\Theta_t = \Theta_t^{i_1 i_2 \cdots i_t}$ and $\Theta_{t+1}^{j} = \Theta_{t+1}^{i_1 i_2 \cdots i_t j}$, when the path $i_1 \cdots i_t$ is understood.

• $\Theta_{t+1}^{j}$, $j = 1, 2, \ldots, b$, are conditionally independent given $S_t$. This means $\Theta_{t+1}^{i_1 i_2 \cdots i_t i}$, $i = 1, 2, \ldots, b$, are conditionally independent given $S_t^{i_1 i_2 \cdots i_t}$, for all $i_1, i_2, \ldots, i_t$.

Theorem 6.1.4. [8] High estimator consistency:

Let $\Theta_0$ be the sample mean of $n$ independent replications of $\Theta_0$. Suppose that $E[\,|h_t(S_t)|^{p'}\,] < \infty$ for all $t$, for some $p' > 1$. Then for any $0 < p < p'$, $\|\Theta_0(b) - f_0(S_0)\|_p \to 0$ as $b \to \infty$, with $n$ arbitrary. In particular, $\Theta_0(b)$ converges to $f_0(S_0)$ in probability and is thus a consistent estimator of the option value.

Proof.

Lemma 6.1.5. If $\|h_t(S_t)\|_p < \infty$ for all $t$, for some $p > 1$, then the following are also finite for all $0 \le t_1 \le t_2 \le T$:

(i) $\|f_{t_2}(S_{t_2})\|_{S_{t_1}}$ \qquad (6.4)

(ii) $\sup_b \|\Theta_{t_2}(b)\|_{S_{t_1}}$ \qquad (6.5)

(iii) $\sup_b \|\theta_{t_2}(b)\|_{S_{t_1}}$ \qquad (6.6)

Proof. (i): By the hypothesis, there exists $p > 1$ such that $\|h_t(S_t)\|_p < \infty$ for any $t$; in particular $\|h_{t_2}(S_{t_2})\|_p < \infty$. When $t_1 = 0$, $\|h_{t_2}(S_{t_2})\|_{S_0} = \|h_{t_2}(S_{t_2})\|_p < \infty$. Otherwise, i.e., when $0 < t_1 < t_2$, $\|h_{t_2}(S_{t_2})\|_{S_{t_1}}^p = E[\,|h_{t_2}(S_{t_2})|^p \mid S_{t_1}\,] < \infty$, since at time $t_1$ the state $S_{t_1}$ is known. Next, we consider $\|g_{t_2}(S_{t_2})\|_{S_{t_1}}$. By the definition of $g$,

$$\|g_{t_2}(S_{t_2})\|_{S_{t_1}}^p = E\Big[\,\big|E[\,e^{-r_{t_2+1}} f_{t_2+1}(S_{t_2+1}) \mid S_{t_2}\,]\big|^p \,\Big|\, S_{t_1}\Big]$$
$$= E\Big[\,\big|e^{-r_T} E[\,f_T(S_T) \mid S_{t_2}\,]\big|^p \,\Big|\, S_{t_1}\Big] \quad \text{(taking } t_2 + 1 = T\text{)}$$
$$= E\Big[\,\big|e^{-r_T} h_T(S_T)\big|^p \,\Big|\, S_{t_1}\Big] < \infty,$$
since $\|h_t(S_t)\|_{S_{t_1}} < \infty$ for all $t$ when $t_1 < t_2$. Thus, by definition, $\|f_{t_2}(S_{t_2})\|_{S_{t_1}}$ is bounded by the larger of $\|h_{t_2}(S_{t_2})\|_{S_{t_1}}$ and $\|g_{t_2}(S_{t_2})\|_{S_{t_1}}$; hence it is finite.

(ii): Fix $t_1 \ge 0$; we apply backward induction on $t_2$. By the definition of $\Theta_t$,

$$\Theta_{t_2}(b) = \max\left\{\, h_{t_2}(S_{t_2}^{i_1 i_2 \cdots i_{t_2}}),\ \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t_2+1}} \Theta_{t_2+1}^{i_1 i_2 \cdots i_{t_2} j} \,\right\}.$$

$t_2 = T$: $\Theta_T(b) = f_T(S_T^{i_1 i_2 \cdots i_T})$, so $\|\Theta_T(b)\|_{S_{t_1}} = \|f_T(S_T)\|_{S_{t_1}} < \infty$ by part (i). Thus $\sup_b \|\Theta_T(b)\|_{S_{t_1}} < \infty$.

$t_2 < T$:
$$\|\Theta_{t_2}(b)\|_{S_{t_1}} = \left\| \max\left\{ h_{t_2}(S_{t_2}),\ \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t_2+1}} \Theta_{t_2+1}^{j} \right\} \right\|_{S_{t_1}} \le \|h_{t_2}(S_{t_2})\|_{S_{t_1}} + \frac{1}{b}\sum_{j=1}^{b} \big\| e^{-r_{t_2+1}} \Theta_{t_2+1}^{j}(b) \big\|_{S_{t_1}}.$$
Taking a supremum on both sides,
$$\sup_b \|\Theta_{t_2}(b)\|_{S_{t_1}} \le \|h_{t_2}(S_{t_2})\|_{S_{t_1}} + M_1 \sup_b \|\Theta_{t_2+1}(b)\|_{S_{t_1}}$$
(for some $M_1 > 0$ such that $e^{-r_{t_2+1}} < M_1$ and $b \ge 2$), which is finite when $t_2 + 1 = T$, since $\|h_t(S_t)\| < \infty$ and part (i) applies. Applying backward induction on $t_2$, the same bound propagates step by step until we reach $t_1 \le t_2$. Thus $\sup_b \|\Theta_{t_2}(b)\|_{S_{t_1}} < \infty$ for all $0 \le t_1 \le t_2 < T$.

(iii): When $t_2 = T$, $\|\theta_{t_2}(b)\|_{S_{t_1}} = \|f_T(S_T)\|_{S_{t_1}} < \infty$ by part (i).

For $t_1 \le t_2 < T$ and fixed $t_1$, applying backward induction on $t_2$,
$$\|\theta_{t_2}(b)\|_{S_{t_1}} = \left\| \frac{1}{b}\sum_{j=1}^{b} \eta_{t_2}^{j} \right\|_{S_{t_1}} \le \frac{1}{b}\sum_{j=1}^{b} \|\eta_{t_2}^{j}\|_{S_{t_1}} \le \|h_{t_2}(S_{t_2})\|_{S_{t_1}} + \frac{M_1}{b} \sum_{j=1}^{b} \|\theta_{t_2+1}^{j}(b)\|_{S_{t_1}}$$
(for some $M_1 > 0$ such that $e^{-r_{t_2+1}} < M_1$ and $b \ge 2$). This is finite for the same reason as in the proof of (ii). Taking the supremum over all $b$ and applying backward induction on $t_2$, $\sup_b \|\theta_{t_2}(b)\|_{S_{t_1}}$ is finite.

Proof of Theorem 6.1.4: First, we prove $\|\Theta_t(b) - f_t(S_t)\|_p \to 0$ for each $t$ by backward induction on $t$. Assume that $\|\Theta_{t+1}(b) - f_{t+1}(S_{t+1})\|_{S_{t+1}} \to 0$ holds; we claim that $\|\Theta_t(b) - f_t(S_t)\|_{S_t} \to 0$.

Case: $n = 1$.

Claim: If $\|\Theta_{t+1}(b) - f_{t+1}(S_{t+1})\|_{S_{t+1}} \to 0$, then $\|\Theta_t(b) - f_t(S_t)\|_{S_t} \to 0$.

Proof.

$$\|\Theta_t(b) - f_t(S_t)\|_{S_t} = \left\| \max\left\{ h_t(S_t),\ \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} \Theta_{t+1}^{j}(b) \right\} - \max\{h_t(S_t), g_t(S_t)\} \right\|_{S_t}$$
$$\le \left\| \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} \Theta_{t+1}^{j}(b) - g_t(S_t) \right\|_{S_t}$$
$$\le \left\| \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} \big(\Theta_{t+1}^{j}(b) - f_{t+1}(S_{t+1}^{j})\big) \right\|_{S_t} + \left\| \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} \big(f_{t+1}(S_{t+1}^{j}) - g_t(S_t)\big) \right\|_{S_t}$$
$$= M_1 + M_2.$$

• $M_2$: Given $S_t$, the terms $e^{-r_{t+1}} f_{t+1}(S_{t+1}^{j})$, $j = 1, 2, \ldots, b$, are i.i.d. with mean $g_t(S_t)$ and finite norm by Lemma 6.1.5. Thus $M_2$ is well defined, and
$$M_2^p = E\left[ \left| \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} f_{t+1}(S_{t+1}^{j}) - g_t(S_t) \right|^p \,\middle|\, S_t \right] \to 0$$
by Theorem 4.1 of [20] [Appendix A].

• $M_1$:
$$M_1 \le \left\| \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} \Theta_{t+1}^{j}(b) - \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} f_{t+1}(S_{t+1}^{j}) \right\|_{S_t} = \|\Theta_{t+1}(b) - f_{t+1}(S_{t+1})\|_{S_t}.$$

However, $\|\Theta_{t+1}(b) - f_{t+1}(S_{t+1})\|_{S_{t+1}} \to 0$ by our induction hypothesis.

Theorem 6.1.6. [20] Suppose that $\sup_b E[\,|\Theta_{t+1}(b) - f_{t+1}(S_{t+1})|^r \mid S_t\,] < \infty$. Then $\|\Theta_{t+1}(b) - f_{t+1}(S_{t+1})\|_{S_t}$ is uniformly integrable for all $p$, $0 < p < r$.

• Check: uniform integrability condition.
$$E\big[\,|\Theta_{t+1}(b) - f_{t+1}(S_{t+1})|^{p+\epsilon} \,\big|\, S_t\big] = \|\Theta_{t+1}(b) - f_{t+1}(S_{t+1})\|_{S_t}^{p+\epsilon}$$
$$\le \sup_b \|\Theta_{t+1}(b)\|_{S_t}^{p+\epsilon} + \|f_{t+1}(S_{t+1})\|_{S_t}^{p+\epsilon} \le \sup_b \|\Theta_{t+1}(b)\|_{S_t}^{p'+\epsilon} + \|f_{t+1}(S_{t+1})\|_{S_t}^{p'+\epsilon} < \infty$$
for $0 < p < p'$ and $p' + \epsilon > 1$, by Lemma 6.1.5. Hence $M_1 = \|\Theta_{t+1}(b) - f_{t+1}(S_{t+1})\|_{S_t} \to 0$. Since both $M_1$ and $M_2$ converge to zero, the claim is proved.

Finally, returning to the induction on $t$, we have $\|\Theta_0(b) - f_0(S_0)\| \to 0$ as $b \to \infty$ for $n = 1$.

Case: $n \ge 2$. After sampling $n$ independent replications, we have the set $\{\Theta_{0k}(b)\}_{k=1}^{n}$ at time zero. Then
$$\|\Theta_0(b) - f_0(S_0)\|_p = \left\| \frac{1}{n}\sum_{k=1}^{n} \Theta_{0k}(b) - f_0(S_0) \right\|_p \qquad (6.7)$$
$$\le \frac{1}{n}\sum_{k=1}^{n} \|\Theta_{0k}(b) - f_0(S_0)\|_p + \frac{n-1}{n}\, \|f_0(S_0)\|_p < \infty \qquad (6.8)$$
by the fact that a finite sum of finite terms is finite, and so is $\|f_0(S_0)\|_p$. Notice that if we divide the sample sum by $n - 1$ instead of $n$, then $(n-1)/n$ becomes $1$ and Equation (6.8) remains finite. Consequently, the convergence holds for all $n$. Moreover, $L^p$-convergence implies convergence in probability, i.e., $\Theta_0 \xrightarrow{P} f_0(S_0)$.

Remark 6.1.7. From Theorem 6.1.4, we know that $E[\Theta_0(b)] \to f_0(S_0)$ as $b \to \infty$. Thus the estimator is asymptotically unbiased.

Theorem 6.1.8. [8] High-estimator bias:

The high estimator is indeed biased high, i.e.,

$$E[\Theta_0(b)] \ge f_0(S_0) \quad \text{for all } b. \qquad (6.9)$$

Proof. We prove Theorem 6.1.8 for each $t$, not only $t = 0$. For $t = T$, $\Theta_T \equiv f_T(S_T)$ by definition. Then

$$E[f_T(S_T)] = f_T(S_T) \le E[\Theta_T(b)]. \qquad (6.10)$$

For $t < T$, we use backward induction on $t$. Assume that $E[\Theta_{t+1} \mid S_{t+1}] \ge f_{t+1}(S_{t+1})$ holds.

Claim: $E[\Theta_t \mid S_t] \ge f_t(S_t)$.

Since $E[\,|\Theta_t|\,] < \infty$ by Lemma 6.1.5, Jensen's inequality (applied to the convex function $\max$) gives
$$E[\Theta_t \mid S_t] \ge \max\left\{ E[h_t(S_t) \mid S_t],\ E\left[ \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} \Theta_{t+1}^{j} \,\middle|\, S_t \right] \right\}$$
$$= \max\big\{ h_t(S_t),\ E[\,e^{-r_{t+1}} \Theta_{t+1} \mid S_t\,] \big\}$$
$$= \max\big\{ h_t(S_t),\ E\big[\,e^{-r_{t+1}} E[\Theta_{t+1} \mid S_{t+1}] \,\big|\, S_t\,\big] \big\}$$
$$\ge \max\big\{ h_t(S_t),\ E[\,e^{-r_{t+1}} f_{t+1}(S_{t+1}) \mid S_t\,] \big\} = \max\{h_t(S_t), g_t(S_t)\} = f_t(S_t).$$

In particular, for $t = 0$, $E[\Theta_0(b)] \ge f_0(S_0)$ for any $b$.

6.2 The Low Estimator θ

We define the low estimator recursively.

Definition 6.2.1. [8] The Low Estimator θ:

$$t = 0, \ldots, T-1: \quad \theta_t^{i_1 \cdots i_t} = \frac{1}{b} \sum_{j=1}^{b} \eta_t^{i_1 \cdots i_t j} \qquad (6.11)$$
$$t = T: \quad \theta_T^{i_1 \cdots i_T} = f_T(S_T^{i_1 \cdots i_T}), \qquad (6.12)$$

where $\eta_t^{i_1 \cdots i_t j}$ is defined by
$$\eta_t^{i_1 \cdots i_t j} = \begin{cases} h_t(S_t^{i_1 \cdots i_t}), & \text{if } h_t(S_t^{i_1 \cdots i_t}) \ge \dfrac{1}{b-1} \displaystyle\sum_{i=1,\, i \ne j}^{b} e^{-r_{t+1}} \theta_{t+1}^{i_1 \cdots i_t i}, \\[2ex] e^{-r_{t+1}} \theta_{t+1}^{i_1 \cdots i_t j}, & \text{if } h_t(S_t^{i_1 \cdots i_t}) < \dfrac{1}{b-1} \displaystyle\sum_{i=1,\, i \ne j}^{b} e^{-r_{t+1}} \theta_{t+1}^{i_1 \cdots i_t i}. \end{cases}$$

Theorem 6.2.2. [8] Low estimator consistency:

Suppose that $P(h_t(S_t) \ne g_t(S_t)) = 1$ for all $t$. Then the conclusion of Theorem 6.1.4 holds for the low estimator as well.

Proof. Here we apply an argument similar to the one used in the proof of Theorem 6.1.4.

First, when $t = T$, it is obvious that $\|\theta_T - f_T(S_T)\|_p \to 0$ as $b \to \infty$.

Aim: If $\|\theta_{t+1}(b) - f_{t+1}(S_{t+1})\|_{S_{t+1}} \to 0$, then $\|\theta_t(b) - f_t(S_t)\|_{S_t} \to 0$ as $b \to \infty$.

We define a new random variable,
$$Y_t^{j}(b) = \frac{1}{b-1} \sum_{i=1,\, i \ne j}^{b} e^{-r_{t+1}} \theta_{t+1}^{i}.$$

Lemma 6.2.3.
(i) $\displaystyle \left\| \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} \theta_{t+1}^{j}(b) - g_t(S_t) \right\|_{S_t} \to 0.$

(ii) $\|Y_t^{j}(b) - g_t(S_t)\|_{S_t} \to 0.$

(iii) $\big\| 1_{\{h_t(S_t) > Y_t^{j}(b)\}} - 1_{\{h_t(S_t) > g_t(S_t)\}} \big\|_{S_t} \to 0.$

Proof. (i):

$$\left\| \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} \theta_{t+1}^{j} - g_t(S_t) \right\|_{S_t} \le \left\| \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} \big(\theta_{t+1}^{j}(b) - f_{t+1}(S_{t+1}^{j})\big) \right\|_{S_t} + \left\| \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} f_{t+1}(S_{t+1}^{j}) - g_t(S_t) \right\|_{S_t} = M_1 + M_2 \to 0,$$

applying the same argument used for the high estimator in Theorem 6.1.4.

(ii): By the definition of $Y_t^{j}$,
$$\left\| \frac{1}{b-1} \sum_{i=1,\, i \ne j}^{b} e^{-r_{t+1}} \theta_{t+1}^{i} - g_t(S_t) \right\|_{S_t} \to 0. \qquad (6.13)$$
The only difference between (ii) and (i) is the omission of one term in $Y_t^{j}$. Thus (ii) follows from (i).

(iii): Suppose first that $h_t(S_t) < g_t(S_t)$. Then

$$\big\| 1_{\{h_t(S_t) > Y_t^{j}(b)\}} - 1_{\{h_t(S_t) > g_t(S_t)\}} \big\|_{S_t} = \big\| 1_{\{h_t(S_t) > Y_t^{j}(b)\}} \big\|_{S_t} = P\big(Y_t^{j} \le h_t(S_t) \,\big|\, S_t\big)^{1/p}.$$

However, (ii) implies that $Y_t^{j}(b) \xrightarrow{P} g_t(S_t)$. Hence,

$$P\big[\,Y_t^{j} - g_t(S_t) \le h_t(S_t) - g_t(S_t) \,\big|\, S_t\,\big]^{1/p} \to P\big[\,h_t(S_t) - g_t(S_t) \ge 0 \,\big|\, S_t\,\big]^{1/p} = 0,$$
since $h_t(S_t) - g_t(S_t) < 0$ under our supposition. Now consider the case $h_t(S_t) > g_t(S_t)$. Recall the hypothesis $P(h_t(S_t) \ne g_t(S_t)) = 1$ for all $t$, so we may discard equality between the two functions.

$$\text{(iii)} = \big\| 1_{\{h_t(S_t) > Y_t^{j}(b)\}} - 1 \big\|_{S_t} = P\big[\,Y_t^{j} \ge h_t(S_t) \,\big|\, S_t\,\big]^{1/p} \to 0,$$
again because $Y_t^{j}(b) \xrightarrow{P} g_t(S_t) < h_t(S_t)$.

Claim:
$$\left\| \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} \theta_{t+1}^{j}(b)\, 1_{\{h_t(S_t) < Y_t^{j}(b)\}} - g_t(S_t)\, 1_{\{h_t(S_t) < g_t(S_t)\}} \right\|_{S_t} \to 0. \qquad (6.14)$$

Proof. For simplicity of notation, denote $u_j(b) = e^{-r_{t+1}} \theta_{t+1}^{j}(b)$, $v_j(b) = 1_{\{h_t(S_t) < Y_t^{j}(b)\}}$, $u = g_t(S_t)$, and $v = 1_{\{h_t(S_t) < g_t(S_t)\}}$. Then
$$\left\| \frac{1}{b}\sum_{j=1}^{b} u_j(b) v_j(b) - uv \right\|_{S_t} \le \left\| \frac{1}{b}\sum_{j=1}^{b} \big(u_j(b)v_j(b) - u_j(b)v\big) \right\|_{S_t} + \left\| \frac{1}{b}\sum_{j=1}^{b} \big(u_j(b)v - uv\big) \right\|_{S_t}$$
$$\le \|u_1(b)\|_{S_t}\, \|v_1(b) - v\|_{S_t} + \|v\|_{S_t} \left\| \frac{1}{b}\sum_{j=1}^{b} \big(u_j(b) - u\big) \right\|_{S_t} \to 0$$

by Lemma 6.2.3 (i) and (iii).

Since $\|h_t(S_t)\|_{S_t} < \infty$, the next expression is well defined, and by Lemma 6.2.3 (iii),

$$\big\| h_t(S_t)\, 1_{\{h_t(S_t) > Y_t^{j}(b)\}} - h_t(S_t)\, 1_{\{h_t(S_t) > g_t(S_t)\}} \big\|_{S_t} \to 0. \qquad (6.15)$$

Now we get back to our aim,

$$\|\theta_t(b) - f_t(S_t)\|_{S_t} = \left\| \frac{1}{b}\sum_{j=1}^{b} \eta_t^{j}(b) - f_t(S_t) \right\|_{S_t}$$
$$= \left\| \frac{1}{b}\sum_{j=1}^{b} \Big( h_t(S_t)\, 1_{\{h_t(S_t) > Y_t^{j}(b)\}} + e^{-r_{t+1}} \theta_{t+1}^{j}(b)\, 1_{\{h_t(S_t) < Y_t^{j}(b)\}} \Big) - f_t(S_t) \right\|_{S_t} \le M_1 + M_2,$$
where, using $f_t(S_t) = h_t(S_t)\, 1_{\{h_t(S_t) > g_t(S_t)\}} + g_t(S_t)\, 1_{\{h_t(S_t) < g_t(S_t)\}}$ almost surely, $M_1$ collects the continuation terms against $g_t(S_t)\, 1_{\{h_t(S_t) < g_t(S_t)\}}$ and $M_2$ the exercise terms against $h_t(S_t)\, 1_{\{h_t(S_t) > g_t(S_t)\}}$.

According to our Claim (6.14), $M_1 \to 0$. On the other hand,
$$M_2 = \left\| \frac{1}{b}\sum_{j=1}^{b} h_t(S_t)\, 1_{\{h_t(S_t) > Y_t^{j}(b)\}} - h_t(S_t)\, 1_{\{h_t(S_t) > g_t(S_t)\}} \right\|_{S_t} \to 0$$
by (6.15) and the fact that $\|h_t(S_t)\|_{S_t} < \infty$.

Theorem 6.2.4. [8] Low estimator bias:

The bias of the low estimator is negative: i.e.,

$$E[\theta_0(b)] \le f_0(S_0) \quad \text{for all } b.$$
Proof. Consider $t = T$ first; it is clear that $E[\theta_T \mid S_T] \le f_T(S_T)$. Suppose that $E[\theta_{t+1} \mid S_{t+1}] \le f_{t+1}(S_{t+1})$ holds. We want to show
$$E[\theta_t \mid S_t] \le f_t(S_t) \quad \text{for all } 0 \le t \le T. \qquad (6.16)$$

By the definition of θt,

$$E[\theta_t \mid S_t] = E[\eta_t^{j} \mid S_t] \quad \text{for any } j = 1, \ldots, b$$
$$= E\big[\,h_t(S_t)\, 1_{\{h_t(S_t) > Y_t^{j}(b)\}} \,\big|\, S_t\,\big] + E\big[\,e^{-r_{t+1}} \theta_{t+1}^{j}(b)\, 1_{\{h_t(S_t) < Y_t^{j}(b)\}} \,\big|\, S_t\,\big].$$

But $\theta_{t+1}^{j}$ is conditionally independent of $Y_t^{j}$ given $S_t$, so, writing $p = P[\,h_t(S_t) > Y_t^{j}(b) \mid S_t\,]$,
$$= h_t(S_t)\, p + E\big[\,e^{-r_{t+1}} \theta_{t+1}^{j}(b) \,\big|\, S_t\,\big] (1 - p) \le h_t(S_t)\, p + g_t(S_t)(1 - p)$$
(by the induction hypothesis and the definition of the continuation value)
$$\le \max\{h_t(S_t),\, g_t(S_t)\} = f_t(S_t),$$
since a convex combination is dominated by the maximum.

Applying backward induction, $E[\theta_t(b)] \le f_t(S_t)$ for all $0 \le t \le T$ and all $b$.

Remark 6.2.5. As with the high estimator in Remark 6.1.7, the low estimator $\theta_0(b)$ is also asymptotically unbiased.

The final theorem explains the relationship between the high and low estimators at each node we generate.

Theorem 6.2.6. [8] Comparison of the estimators:

On every realization of the array $\{\, S_t^{i_1 i_2 \cdots i_t} : t = 0, 1, \ldots, T;\ i_j = 1, \ldots, b;\ j = 1, \ldots, t \,\}$, the low estimator is less than or equal to the high estimator. In short,

$$\theta_t^{i_1 \cdots i_t} \le \Theta_t^{i_1 \cdots i_t}$$
with probability 1, for all $i_1 \cdots i_t$ and $t = 0, 1, \ldots, T$.

Proof. $t = T$:
$$\theta_T = \Theta_T = f_T(S_T). \qquad (6.17)$$
Hence $\theta_T \le \Theta_T$.

$t < T$: Assume that $\theta_{t+1}^{j} \le \Theta_{t+1}^{j}$ for $j = 1, \ldots, b$. We want to show
$$\theta_t^{j} \le \Theta_t^{j} \quad \text{for } j = 1, \ldots, b. \qquad (6.18)$$
By the definition of $\eta_t^{j}$, if $Y_t^{j}$ is less than $h_t(S_t)$ for all $j = 1, \ldots, b$, then $\theta_t = h_t(S_t) \le \Theta_t$, since $\Theta_t = \max\{h_t(S_t),\ \frac{1}{b}\sum_{j=1}^{b} e^{-r_{t+1}} \Theta_{t+1}^{j}\} \ge h_t(S_t)$.

Now suppose that there exists some $Y_t^{j}$ greater than $h_t(S_t)$. Then
$$\theta_t = \frac{1}{b}\sum_{j=1}^{b} \eta_t^{j} = \frac{1}{b}\sum_{j=1}^{b} \Big( h_t(S_t)\, 1_{\{h_t(S_t) > Y_t^{j}(b)\}} + e^{-r_{t+1}} \theta_{t+1}^{j}\, 1_{\{h_t(S_t) < Y_t^{j}(b)\}} \Big).$$

Without loss of generality, relabel the branches so that for some $k$ with $1 \le k < b$, $h_t(S_t) \le Y_t^{l}$ for $l \le k$ and $Y_t^{m} < h_t(S_t)$ for $k < m \le b$, and let $p$ be the fraction of branches at which the payoff is taken. Then
$$\theta_t = \frac{1}{b}\sum_{j=1}^{b} \eta_t^{j} \le p\, h_t(S_t) + (1 - p)\, \frac{1}{b}\sum_{i=1}^{b} e^{-r_{t+1}} \theta_{t+1}^{i} \le p\, h_t(S_t) + (1 - p)\, \frac{1}{b}\sum_{i=1}^{b} e^{-r_{t+1}} \Theta_{t+1}^{i} \quad \text{(by induction)}$$
$$\le \max\left\{ h_t(S_t),\ \frac{1}{b}\sum_{i=1}^{b} e^{-r_{t+1}} \Theta_{t+1}^{i} \right\} = \Theta_t.$$

From the order of our estimators above, the value of the option at time zero satisfies

$$E[\theta_0(b)] \le f_0(S_0) \le E[\Theta_0(b)] \quad \text{for all } b.$$
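For concreteness, Definition 6.2.1 can also be sketched as code; the nested-list `(state, children)` tree layout and function names are illustrative only, and the leave-one-out construction is exactly what decorrelates the exercise decision from the value used for branch $j$:

```python
import math

def low_estimator(node, t, T, r, h):
    """Recursive low estimator theta of Definition 6.2.1: branch j is
    excluded from its own exercise decision, removing the upward
    selection bias of the high estimator."""
    s, children = node
    if t == T:
        return h(t, s)
    vals = [math.exp(-r) * low_estimator(c, t + 1, T, r, h) for c in children]
    b = len(vals)
    etas = []
    for j in range(b):
        loo = (sum(vals) - vals[j]) / (b - 1)   # leave-one-out continuation
        if h(t, s) >= loo:
            etas.append(h(t, s))                # decision says exercise
        else:
            etas.append(vals[j])                # keep the held-out branch value
    return sum(etas) / b
```

On the tree `(100, [(110, []), (90, [])])` with payoff $\max(s - 95, 0)$ and $r = 0$, the high estimator of Definition 6.1.1 gives 7.5, while this low estimator gives $(5 + 0)/2 = 2.5$, consistent with the ordering $\theta_t \le \Theta_t$ proved above.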

CHAPTER 7

Implementation

In this chapter, we give details of the implementation of the random tree method. When we generate a random tree in a naive way, we need to generate $b^m$ nodes, with $m$ time steps and a fixed number of branches $b$, and then evaluate the high and low estimators at each node as described in the previous chapter; the required storage is therefore $2\,b^m$ values. But, as Figure 7.1 suggests, the high and low estimators are evaluated recursively, and at each node they depend only on its $b$ successors. Thus it is enough to store about $mb + 1$ values at any time. We accomplish this using a so-called depth-first procedure.

7.1 Depth-First Procedure

The main idea of the depth-first procedure is to create only a single branch at a time. To make this clear, we illustrate it for $b = 4$ and $m = 3$. When we

generate the sequence $S_t^{i_1 i_2 \cdots i_t}$, all indices appearing in the superscript $i_1 i_2 \cdots i_t$ run through the set $\{1, 2, \ldots, b\}$. The algorithm follows. We are given the initial state $S_0$ and use three time steps.

1. Generate $S_1^{1}, S_2^{11}, S_3^{111}$. Here we use a shortened notation for our purpose:
$$\{S_1^{1}, S_2^{11}, S_3^{111}\} = \{1, 11, 111\}.$$

2. Since we have reached the final time step, we do not go deeper; instead we move on to generate the remaining successors of node 11:

Generate 112, 113, 114.

3. Now we can evaluate the high and low estimators at node 11.

4. We need to keep the two estimators at node 11, but we no longer need the values at its successors 111, 112, 113, 114.

Discard all the successors 111, . . . , 114 rooted at node 11.

5. Next we generate 12 and its successors in the same way.

Generate 12, 121, 122, 123, 124

6. Calculate the high and low estimators at node 12 and discard its successors.

7. Repeat steps 2–6 at nodes 13 and 14.

8. Evaluate the high and low estimators at node 1.

9. Generate a new path $\{2, 21, 211\}$ and repeat the same steps until the first-level index reaches $b = 4$.

10. Finally, we obtain the high and low estimators at time $0$ through all these steps.

Figure 7.1 shows these steps with b = 4 and m = 3.

Remark 7.1.1. We can generalize to any $b \ge 2$. To determine the high and low estimators at a node, we need only the node itself and its $b$ successors. After computing the two estimators, we discard the successor values and repeat the same procedure across all time steps. Thus the storage requirement is $b$ values per level over $m$ levels, plus the root itself: a total of $b \times m + 1$.

[Figure 7.1, panels (a)–(e): successive snapshots of the depth-first decomposition of the tree, showing which branch is in memory after steps 1–3 (evaluate node 11), 4–6 (node 12), 7 (nodes 13 and 14), and 8–9 (node 1 and the new path starting at node 2).]

Figure 7.1: Depth-first processing of the random tree when b=4 and m=3. Solid lines represent nodes that are being worked on currently; dashed lines indicate nodes from the previous steps that don’t need to be in memory.

CHAPTER 8

Calibration

8.1 Futures Price Process

Given $F_{t_{i-1}}$, the risk-neutral futures price of the underlying non-dividend-paying contract is generated by

$$F_{t_i} = F_{t_{i-1}}\, e^{(r - \omega)(t_i - t_{i-1}) + X(\Delta t)}, \qquad (8.1)$$
where $\omega$ is the compensator chosen so that the discounted futures price $\{e^{-rt} F_t\}$ becomes a martingale. Thus, for the VG process, $\omega = -\frac{1}{\nu} \ln\!\big(1 - \theta\nu - \tfrac{1}{2}\sigma^2\nu\big)$. Here, $X(\Delta t)$ is the VG increment, sampled by the method introduced in Chapter 5.
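A path of Eq. (8.1) can be simulated by drawing the gamma time change first and the conditional normal second. The sketch below assumes the martingale correction in the form reconstructed above ($E[e^{X(\Delta t)}] = e^{\omega \Delta t}$); function names are illustrative:

```python
import math
import random

def vg_increment(dt, sigma, nu, theta, rng):
    """Sample X(dt): Brownian motion with drift theta and volatility sigma,
    time-changed by a gamma subordinator G with mean dt and variance nu*dt."""
    g = rng.gammavariate(dt / nu, nu)        # Gamma(shape=dt/nu, scale=nu)
    return theta * g + sigma * math.sqrt(g) * rng.gauss(0.0, 1.0)

def vg_futures_path(F0, r, sigma, nu, theta, tau, n, rng):
    """Simulate F at n equally spaced times on [0, tau] via Eq. (8.1)."""
    # omega makes exp(-r t) F_t a martingale
    omega = -math.log(1.0 - theta * nu - 0.5 * sigma**2 * nu) / nu
    dt = tau / n
    path, F = [F0], F0
    for _ in range(n):
        F *= math.exp((r - omega) * dt + vg_increment(dt, sigma, nu, theta, rng))
        path.append(F)
    return path
```

With $\theta > 0$ the log argument is below one, so $\omega > 0$ and the drift correction is downward, offsetting the positive mean of $X$.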

8.2 Data Description

The data used to calibrate the American option models are futures options on crude oil. The futures options selected are daily closing prices from December 2003 to November 2004. Over the sample period, we calibrate calls once per month; the quote date we pick is the second Tuesday of each month. We also classify the options by time to maturity: options with fewer than 60 days to expiration are allocated to the short-term group, and those with between 60 and 120 days to the medium-term group. Since volume decreases substantially in options with expiration dates more than 120 days away, we keep only futures options with fewer than 120 days remaining. Additionally, we apply two filters to our data. First, we exclude very near-term futures options (fewer than four days to expiration); thus early exercise opportunities can occur at most four times, consistent with our numerical setting of four time steps. Secondly, we discard deep out-of-the-money calls ($S_t/K < 0.8$) and deep in-the-money calls ($S_t/K > 1.2$). After all filters, 692 contracts remain for calibration over the period.

8.3 Calibration Method

We use a least-squares method to calibrate our model. On each observation date $t$, we observe market settlement prices of calls $\{C^{market}(\tau, K_i)\}_{i \in I}$ on the underlying futures price $F_t$, for $N$ futures options in total. Denote the corresponding model prices by $\{C^{model}(\xi; \tau, K_i)\}_{i \in I}$, with strike $K_i$ and time to maturity $\tau$. Then our calibration problem is
$$\min_{\xi} \sum_{i=1}^{N} \big( C^{market}(\tau, K_i) - C^{model}(\xi; \tau, K_i) \big)^2, \qquad (8.2)$$
where $\xi$ is the parameter vector to be calibrated, namely $\{\theta, \sigma, \nu\}$. The most precise approach would be to calibrate at each single maturity; here, however, we calibrate the maturities classified above together. To compute futures option prices from our simulation method, we use $dt = \tau/n$ with $n = 4$ time steps, 50 independent replications of the tree, and a branching parameter $b = 5$, held constant over the whole tree.
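The structure of problem (8.2) can be sketched independently of the pricing engine. Below, `model_price` is a hypothetical stand-in for the simulated VG option price, and the naive grid search is only a placeholder for whatever numerical optimizer is actually used:

```python
def calibration_objective(params, strikes, market_prices, model_price):
    """Least-squares objective (8.2): sum of squared differences between
    market and model call prices; params stands for (sigma, nu, theta)."""
    return sum((c_mkt - model_price(params, k)) ** 2
               for k, c_mkt in zip(strikes, market_prices))

def grid_calibrate(strikes, market_prices, model_price, grids):
    """Naive grid search over the three parameters (illustrative only)."""
    best, best_err = None, float("inf")
    for s in grids[0]:
        for n in grids[1]:
            for t in grids[2]:
                err = calibration_objective((s, n, t), strikes,
                                            market_prices, model_price)
                if err < best_err:
                    best, best_err = (s, n, t), err
    return best, best_err
```

In practice one would replace the grid with a proper least-squares routine, but the objective itself is unchanged.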

8.4 Results

8.4.1 In-Sample Pricing Performance

We measure the pricing performance of each model in two ways. First, we compute four different errors for each observation day (in-sample). Second, the same four errors are calculated for out-of-sample using the parameter vector calibrated for in-sample. We list numerical results for in-sample in this section and then for out-of-sample in the following section.

A. Parameters Calibrated. According to our parameter setting, we calibrate the parameters underlying the Black-Scholes and Variance Gamma models. In Table 8.1, for each model, we list the time-averaged parameters over the twelve observation dates, by short-term, medium-term, and overall. As shown in the first three rows, for the Black-Scholes model the three time-averaged values of $\sigma$ are around 0.2302,¹ ranging from a low of 0.1881 (April 2004 contract) to a high of 0.2805 (November 2004 contract). As Doran and Ronn mention in their paper, the volatility of energy data is usually higher than that of equity (the S&P 500) [15]. For the VG process, the three time-averaged values of $\sigma$ are also very similar: about 0.2938, with a maximum of 0.3950 (short-term, November 2004) and a minimum of 0.206 (short-term, April 2004). On the other hand, as we can see in Table 8.1, $\nu$ varies across maturities. When we generated the Gamma process in an earlier chapter, the parameter $\nu$ depended on the days remaining until expiration; thus a difference in $\nu$ across maturities makes sense. Without conditioning on days to maturity, a low of 0.059 (short-term, April 2004) and a high of 0.6015 (medium-term, June 2004) are reported for $\nu$. As $\nu$ accounts for the tail behavior of the Lévy measure, we present excess kurtosis later; over the entire period, the time-averaged value of $\nu$ is 0.2485. The parameter $\theta$ calibrated for VG is 0.2382 over all twelve observation days, ranging from a low of 0.0621 (medium-term, May 2004) to a high of 0.5278 (short-term, July 2004). We notice that the calibrated values of $\theta$ are all positive. This shows that the return distribution underlying the VG process has positive skewness, as we explained earlier. Table 8.1 presents the summary of the parameters calibrated for the Black-Scholes and VG models.

¹Numbers are computed to six digits of precision, rather than the four decimal places shown in the tables.

Table 8.1: Parameters calibrated: time-averaged over twelve observations, based on the Black-Scholes and Variance Gamma models (Dec. 2003 – Nov. 2004)

                        Parameters Calibrated
  Model   Days to expiration     σ         ν         θ
  B-S     ST                     0.233     –         –
          MT                     0.228     –         –
          Overall                0.2302    –         –
  VG      ST                     0.2916    0.1298    0.3195
          MT                     0.2959    0.3670    0.1569
          Overall                0.2938    0.2485    0.2382

B. Percentage Pricing Error. Before assessing the pricing performance implied by the calibrated parameters as a whole, we examine the percentage pricing error: the sample average of the difference between the model price and the actual price, divided by the actual price. We separate our data by moneyness and maturity. As shown in Table 8.2, the VG process has the better performance overall in terms of the average percentage pricing error. Compared to the B-S model, it is quite remarkable how much the VG model outperforms, especially in the out-of-the-money categories. The other thing to notice is that the VG and B-S errors can take different signs within the same moneyness class. For the B-S model, market prices at moneyness from 0.95 to 1.2 are overpriced relative to the model price for the short-term maturity. However, this does not occur for the VG model: at moneyness 1.01–1.2, the market price is underpriced relative to the model price for both the short-term and medium-term maturities, while it is overpriced relative to the model price in the at-the-money case for both.

Table 8.2: Percentage pricing error: (model price − actual price)/actual price (Dec. 2003 – Nov. 2004)

                 Average Percentage Error
                   Days to Expiration
               Short-Term             Med-Term
  Moneyness    VG        B-S          VG        B-S
  0.8-0.95     0.0396    0.0714       0.0238    0.0945
  0.95-1.05   -0.1114   -0.0816      -0.0198    0.0216
  1.01-1.2     0.0724   -0.5070       0.0322   -0.2552

C. Pricing Performance. Before presenting numerical results, we define four error measures from Schoutens [35] to assess pricing performance under the B-S and VG models:

• $APE = \dfrac{\sum_{i=1}^{N} |C_i^{market} - C_i^{model}|}{\sum_{i=1}^{N} C_i^{market}}$ : average pricing error

• $AAE = \dfrac{1}{N} \sum_{i=1}^{N} |C_i^{market} - C_i^{model}|$ : average absolute error

• $ARPE = \dfrac{1}{N} \sum_{i=1}^{N} \dfrac{|C_i^{market} - C_i^{model}|}{C_i^{market}}$ : average relative percentage error

• $RMSE = \sqrt{\dfrac{1}{N} \sum_{i=1}^{N} \big(C_i^{market} - C_i^{model}\big)^2}$ : root mean square error

To compare the pricing performance of the B-S and VG models, Table 8.3 reports the APE, AAE, ARPE, and RMSE for the two models, time-averaged over the twelve observations. Compared to the Black-Scholes model, both APE and RMSE are roughly halved under the VG model. For example, in terms of APE across maturities, about 5.35% (10.43%) is reported for the VG (B-S) model, respectively; the ratio $APE_{VG}/APE_{BS}$ is therefore 0.5124, and in the same way $RMSE_{VG}/RMSE_{BS}$ is 0.5781 overall. This result is consistent over all periods. Table 8.4 presents a summary of each ratio comparing the pricing performance of the VG and B-S models.
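As a quick implementation check, the four measures can be computed directly; `mkt` and `mdl` are hypothetical arrays of paired market and model prices:

```python
import math

def pricing_errors(mkt, mdl):
    """APE, AAE, ARPE and RMSE for paired market/model prices."""
    n = len(mkt)
    abs_err = [abs(a - b) for a, b in zip(mkt, mdl)]
    ape = sum(abs_err) / sum(mkt)            # mean error relative to mean price
    aae = sum(abs_err) / n
    arpe = sum(e / a for e, a in zip(abs_err, mkt)) / n
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(mkt, mdl)) / n)
    return ape, aae, arpe, rmse
```

Note that APE normalizes by the average market price, while ARPE normalizes each error by its own market price, so the two can rank models differently when cheap options dominate the sample.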

Table 8.3: Pricing errors for in-sample: time-averaged over twelve observations, based on the Black-Scholes and Variance Gamma models (Dec. 2003 – Nov. 2004)

                 Pricing Errors for In-Sample
  Model   Days to Expiration   APE(%)    AAE      ARPE     RMSE
  B-S     Overall              10.4324   0.1899   0.2069   0.2382
  VG      ST                    6.2434   0.1072   0.1232   0.1479
          MT                    4.4484   0.0913   0.0904   0.1380
          Overall               5.3459   0.0992   0.0904   0.1377

Table 8.4: In-sample: time-averaged ratios of APE and RMSE (Dec. 2003 – Nov. 2004)

  Days to Expiration   APE_VG / APE_BS   RMSE_VG / RMSE_BS
  Short-Term           0.59              0.72
  Med-Term             0.47              0.55
  Overall              0.51              0.58

D. Analysis of Moments. In Chapter 5 (see Table 5.1), we noted that skewness and kurtosis depend on the parameter vector of the VG process as well as on time to maturity. Thus it is meaningful to examine the first four central moments under the calibrated parameters, computed using the formulas in Table 10.1. Since we calibrate the short- and medium-term options separately, we hold each calibrated parameter set fixed once the calibration procedure is done, and present the first four central moments for the short- and medium-term groups over our observation period. Table 10.1 lists the mean, variance, skewness, and kurtosis of the VG model; the first four (next four) columns correspond to the short-term (medium-term, respectively) group over the twelve observation dates. As discussed earlier, $\theta$ accounts for the location and skewness of the VG process. According to our calibration results, $\theta$ is positive over the entire period; this implies that the risk-neutral distribution underlying crude oil futures is asymmetric and skewed to the right. Compared to the equity market, this positive skewness is the most distinguishing feature. Furthermore, since the first moment is proportional to $\theta$ and to time, the mean increases with maturity for fixed positive $\theta$. Table 10.1 illustrates these characteristics along the maturity dimension: as time to maturity increases, the variance moves in the same direction, while skewness, proportional to $1/\sqrt{t}$, and kurtosis, proportional to $1/t$ (for fixed parameters), both decrease with maturity.
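The maturity behavior just described follows from the per-unit-time cumulants of the VG process, which can be derived from its characteristic function; the sketch below uses those cumulant formulas and is assumed equivalent to the exact expressions of Table 10.1:

```python
def vg_central_moments(sigma, nu, theta, t):
    """Mean, variance, skewness and excess kurtosis of a VG process at
    time t, from its per-unit-time cumulants c1..c4."""
    c1 = theta
    c2 = sigma**2 + theta**2 * nu
    c3 = 3 * sigma**2 * theta * nu + 2 * theta**3 * nu**2
    c4 = 3 * sigma**4 * nu + 12 * sigma**2 * theta**2 * nu**2 + 6 * theta**4 * nu**3
    mean, var = c1 * t, c2 * t
    skew = c3 * t / (c2 * t) ** 1.5          # proportional to 1/sqrt(t)
    ex_kurt = c4 * t / (c2 * t) ** 2         # proportional to 1/t
    return mean, var, skew, ex_kurt
```

For $\theta = 0$ this reduces to zero skewness and excess kurtosis $3\nu/t$; for $\theta > 0$, as calibrated here, skewness is positive and both shape moments shrink as maturity grows.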

8.4.2 Out-of-Sample Pricing Performance

It is well known that in-sample fit alone does not guarantee predictive ability on empirical data, nor does it determine which model best fits specific data. To check reliability, we conduct two tests: one-day-ahead and five-days-ahead out-of-sample. Given the calibrated parameter sets from the twelve observation days, we use them to value crude oil futures options on the very next day and five business days later. In both tests, we apply the same filters to the out-of-sample observations and keep all parameters of the numerical scheme. We then investigate the pricing performance as we did in-sample.

A. Absolute Error. The results in Table 8.5 and Table 8.6 report the average absolute errors by moneyness and maturity. For the B-S model on one-day-ahead samples, the greatest mispricing occurs for short-term and medium-term in-the-money (0.8–0.9) options, with the five-days-ahead options showing a similar pattern; the average absolute pricing error there is 0.5107 (short-term) and 0.8358 (medium-term). Generally speaking, for the one-day-ahead samples, as the moneyness of the options moves toward the at-the-money category, the average absolute pricing error declines for both the VG and B-S models, and this pattern holds for both out-of-sample tests. Meanwhile, for the VG model in the one-day-ahead sample, the lowest mispricing recorded is 0.0609, for short-term (1.1–1.2) calls, compared with 0.2356 for the B-S model: a significant difference, about 26 percent of the B-S error. The VG model also has the smallest mispricing error in the five-days-ahead samples: 0.1324 for medium-term (1.1–1.2) calls, an improvement over the B-S model of about 50%. However, not all of the patterns exhibited in the one-day-ahead samples carry over to the five-days-ahead samples.

Table 8.5: Average absolute error: one-day-ahead sample from December 2003 to November 2004

          Average Absolute Error: One-day-ahead
                    Days to Expiration
               Short-Term (4-60)      Med-Term (60-120)
  Moneyness    VG        B-S          VG        B-S
  0.8-0.9      0.3790    0.5107       0.3827    0.8358
  0.9-0.95     0.1414    0.2642       0.0978    0.3978
  0.95-1       0.1409    0.1113       0.1463    0.1790
  1-1.05       0.1264    0.1406       0.0763    0.0979
  1.05-1.1     0.0725    0.1765       0.0927    0.1405
  1.1-1.2      0.0609    0.2356       0.0687    0.2396

Table 8.6: Average absolute error: five-days-ahead sample from December 2003 to November 2004

          Average Absolute Error: Five-days-ahead
                    Days to Expiration
               Short-Term (4-60)      Med-Term (60-120)
  Moneyness    VG        B-S          VG        B-S
  0.8-0.9      0.5026    0.6098       0.2473    0.5627
  0.9-0.95     0.2103    0.3008       0.1891    0.3339
  0.95-1       0.2370    0.1371       0.2408    0.2291
  1-1.05       0.1950    0.1531       0.1906    0.2132
  1.05-1.1     0.1755    0.2147       0.1669    0.1679
  1.1-1.2      0.1768    0.2579       0.1324    0.2761

B. Percentage Pricing Error. We next investigate the percentage pricing errors on the out-of-samples. Table 8.7 and Table 8.8 show the average percentage pricing errors for the one-day-ahead and five-days-ahead out-of-samples. It is interesting that the out-of-the-money options exhibit the same behavior as the in-sample pricing errors, except for the at-the-money medium-term options under the VG process. For instance, in the B-S model, market prices at moneyness between 0.95 and 1.2 are overpriced relative to the model price for the short-term maturity. However, for the VG model at moneyness 1.05–1.2, the market price is underpriced relative to the model price for both the short-term and medium-term maturities, while it is overpriced relative to the model price in the at-the-money case for both.

Table 8.7: Percentage pricing error: one-day-ahead sample from December 2003 to November 2004

                 Average Percentage Errors
                    Days to Expiration
               Short-Term             Med-Term
  Moneyness    VG        B-S          VG        B-S
  0.8-0.95     0.0479    0.0787       0.0232    0.0995
  0.95-1.05   -0.1015   -0.0774      -0.0166    0.0376
  1.05-1.2     0.0922   -0.4895       0.0439   -0.2370

Table 8.8: Percentage pricing error: five-days-ahead sample from December 2003 to November 2004

                 Average Percentage Errors
                    Days to Expiration
               Short-Term             Med-Term
  Moneyness    VG        B-S          VG        B-S
  0.8-0.95     0.0556    0.0882       0.0135    0.0830
  0.95-1.05   -0.0226   -0.0506       0.0056    0.0463
  1.05-1.2     0.1056   -0.4532       0.0664   -0.2053

C. Pricing Performance. In general, Table 8.9 shows that the APE for the one-day-ahead out-of-sample is 7.2472% (5.3064%) for the short-term (medium-term) options under the VG model; for the VG model, the APE decreases from short-term to medium-term for both the one-day-ahead and five-days-ahead tests. The overall APE for the VG (B-S) model is 6.2768% (10.4489%) for the one-day-ahead sample, and 8.8210% (12.2857%) for the five-days-ahead sample, respectively. According to Table 8.10, for the B-S model the overall RMSE is 0.2494 (0.2723) for the one-day-ahead (five-days-ahead) samples, while for the VG model the overall RMSE is 0.1583 (0.2051), respectively. Overall, in both APE and RMSE, the VG model outperforms out-of-sample. Considering the ratios of APE and RMSE between the two models, the overall APE ratio is 0.6007 (0.7180) for the one-day-ahead (five-days-ahead) out-of-sample. Although these ratios exceed the in-sample value of about 0.5, they remain well below one, so the VG model still fits our data better than the B-S benchmark for prediction. In terms of RMSE, we likewise observe ratios of 0.6347 (0.7534) for the one-day-ahead (five-days-ahead) samples, respectively.

Table 8.9: APE for Out-of-Sample: Time-averaged from twelve observations based on the B-S and VG models (Dec. 2003 ∼ Nov. 2004)

APE for Out-of-Sample, APE (%)

Model   Days to Expiration    One-day-ahead    Five-days-ahead
B-S     Overall               10.4489          12.2857
VG      ST                     7.2472           9.7914
        MT                     5.3064           7.8506
        Overall                6.2768           8.821

Table 8.10: RMSE for Out-of-Sample: Time-averaged from twelve observations based on the B-S and VG models (Dec. 2003 ∼ Nov. 2004)

RMSE for Out-of-Sample

Model   Days to Expiration    One-day-ahead    Five-days-ahead
B-S     Overall               0.2494           0.2723
VG      ST                    0.1810           0.2314
        MT                    0.1355           0.1788
        Overall               0.1583           0.2051

Table 8.11: Out-of-Sample: Time-averaged ratio of APE and RMSE (Dec. 2003 ∼ Nov. 2004)

                      APE_VG / APE_BS                     RMSE_VG / RMSE_BS
Days to Expiration    One-day-ahead   Five-days-ahead     One-day-ahead   Five-days-ahead
Overall               0.6007          0.7180              0.6347          0.7534

CHAPTER 9

Conclusion and Future Research

Conclusion The Variance Gamma process has been considered as a way to explain behavior in the American futures options market. To sample the underlying futures prices, the Variance Gamma process was defined as a Brownian motion with constant drift and volatility, randomly time-changed by a Gamma process. This representation enables us to generate the Variance Gamma process by sampling from the Gamma and normal distributions. Based on the random tree method, the high and low estimators were calculated along random tree paths, and the American option value was taken as the sample average of the high and low estimators over fifty replications of the random tree. Options on crude oil commodity contracts from December 2003 to November 2004 were used to calibrate the risk-neutral parameter set for the underlying Variance Gamma process. Most interestingly, θ was positive over the entire period, indicating asymmetry and positive skewness. It was also observed that the very short-term options had much higher kurtosis than the medium-term options, while both the Black-Scholes and Variance Gamma models showed high volatility. Furthermore, in terms of pricing errors, we found that the Variance Gamma model had lower in-sample and one-day- and five-days-ahead out-of-sample APE and RMSE than the corresponding Black-Scholes model over the entire period. Remarkably, out-of-the-money options for both the short and medium terms had much lower average absolute and average percentage pricing errors in the Variance Gamma model than in the Black-Scholes model.
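The sampling scheme described above can be sketched as follows. This is a minimal illustration of the gamma time change only, with hypothetical parameter values rather than the calibrated ones; the risk-neutral drift correction and the random-tree layering used in the thesis are omitted.

```python
import random
from math import sqrt

def vg_increments(theta, sigma, nu, dt, n, seed=0):
    """Sample n increments of a Variance Gamma process on a grid of step
    dt: a Brownian motion with drift theta and volatility sigma, run on a
    gamma clock with unit mean rate and variance rate nu."""
    rng = random.Random(seed)
    incs = []
    for _ in range(n):
        # Gamma time increment with mean dt and variance nu * dt
        g = rng.gammavariate(dt / nu, nu)
        # Conditional on the time increment, the VG increment is normal
        incs.append(rng.normalvariate(theta * g, sigma * sqrt(g)))
    return incs

# Hypothetical parameters (theta > 0 gives the positive skew noted above)
incs = vg_increments(theta=0.3, sigma=0.25, nu=0.2, dt=1 / 252, n=252)
log_price_path = [sum(incs[:k]) for k in range(len(incs) + 1)]
```

Conditioning on the gamma clock makes each increment normal, which is exactly why sampling from the Gamma and normal distributions suffices to generate the process.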

Future Research We conclude that the Variance Gamma process performs better than the Black-Scholes model in terms of the pricing errors we define. However, calibrating the parameters once a month for one year is not enough to capture the characteristics of the crude oil futures options market. Calibrating once a week and performing the same analysis would therefore allow us to generalize the results adequately for crude oil futures options. Another possible path is to use the NIG process, a Brownian motion with constant drift time-changed by an Inverse Gaussian subordinator [4], as the risk-neutral underlying process. It would be very interesting to observe how pricing performance differs with the choice of underlying process when studying the same crude oil futures options. A further research direction is to apply this simulation method to high-dimensional American options with a large number of exercise opportunities; one such numerical method was developed by Broadie and Glasserman [7], [9].

CHAPTER 10

Plots

• VG model: Market and Model Prices vs. Strike

[Plots of market and model option prices vs. strike omitted.]

(a) Dec. 9, 2003: short-term   (b) Dec. 9, 2003: medium-term

Figure 10.1: VG calibration of Futures Options on Crude Oil. ◦ indicates market price and + stands for model price.

[Plots omitted.]

(a) Jan. 13, 2004: short-term   (b) Jan. 13, 2004: medium-term

Figure 10.2: Figure 10.1, continued.

[Plots omitted.]

(a) Feb. 10, 2004: short-term   (b) Feb. 10, 2004: medium-term   (c) Mar. 9, 2004: short-term   (d) Mar. 9, 2004: medium-term

Figure 10.3: Figure 10.1, continued.

[Plots omitted.]

(a) Apr. 13, 2004: short-term   (b) Apr. 13, 2004: medium-term   (c) May 13, 2004: short-term   (d) May 13, 2004: medium-term

Figure 10.4: Figure 10.1, continued.

[Plots omitted.]

(a) June 8, 2004: short-term   (b) June 8, 2004: medium-term   (c) July 13, 2004: short-term   (d) July 13, 2004: medium-term

Figure 10.5: Figure 10.1, continued.

[Plots omitted.]

(a) Aug. 10, 2004: short-term   (b) Aug. 10, 2004: medium-term   (c) Sep. 14, 2004: short-term   (d) Sep. 14, 2004: medium-term

Figure 10.6: Figure 10.1, continued.

[Plots omitted.]

(a) Oct. 12, 2004: short-term   (b) Oct. 12, 2004: medium-term   (c) Nov. 9, 2004: short-term   (d) Nov. 9, 2004: medium-term

Figure 10.7: Figure 10.1, continued.

• B-S model: Market and Model Prices vs. Strike

[Plots of market and model option prices vs. strike omitted; panels cover the Dec. 9, 2003 and Jan. 13, 2004 calibration dates across short- and medium-term contracts.]

Figure 10.8: B-S calibration of Futures Options on Crude Oil. ◦ indicates market price and + stands for model price.

[Plots omitted; panels cover Feb. 10, 2004 and Mar. 9, 2004.]

Figure 10.9: Figure 10.8, continued.

[Plots omitted; panels cover Apr. 13, 2004 and May 11, 2004.]

Figure 10.10: Figure 10.8, continued.

[Plots omitted; panels cover June 8, 2004 and July 13, 2004.]

Figure 10.11: Figure 10.8, continued.

[Plots omitted; panels cover Aug. 10, 2004 and Sep. 14, 2004.]

Figure 10.12: Figure 10.8, continued.

[Plots omitted; panels cover Oct. 12, 2004 and Nov. 9, 2004.]

Figure 10.13: Figure 10.8, continued.

Table 10.1: Risk-Neutral Mean, Variance, Skewness, Kurtosis for December 2003-November 2004

Moment: VG process for the Random Tree Method. Each group of four values is (MEAN, VARIANCE, SKEWNESS, KURTOSIS); following the original two-column header, the left quadruple on each row is a short-term contract and the right quadruple a medium-term contract.

12092003  (0.002959333, 0.002560242, 1.054584579, 18.90062654) | (0.016487714, 0.014264204, 0.446784536, 3.392420148)
          (0.045738333, 0.02025879, 0.691325689, 2.52989186)
1132004   (0.003387397, 0.000971172, 1.635786646, 17.16054844) | (0.036849738, 0.022597519, 1.160449421, 5.806252345)
          (0.029639722, 0.008497754, 0.552996818, 1.961205536) | (0.05304129, 0.032526731, 0.96724475, 4.033817419)
2102004   (0.007250472, 0.002461824, 2.36954004, 20.71041594) | (0.053138444, 0.013218245, 0.66746293, 1.799617801)
          (0.039359706, 0.013364186, 1.017000751, 3.815076621) | (0.076931778, 0.019136863, 0.554726208, 1.243034976)
3092004   (0.013231071, 0.002141206, 1.234107439, 5.53607747) | (0.053004595, 0.01300028, 0.587370538, 1.537952884)
          (0.051601179, 0.008350704, 0.624914593, 1.419507044) | (0.078354619, 0.019217805, 0.483099534, 1.040379892)
4132004   (0.00642981, 0.000828511, 2.356308672, 15.06642449) | (0.038744611, 0.013436744, 0.441503581, 1.47433257)
          (0.054653381, 0.007042342, 0.808207207, 1.772520529) | (0.054936389, 0.0190521, 0.370774475, 1.039792444)
5112004   (0.009029357, 0.002840331, 3.501453559, 30.7068781) | (0.016499016, 0.032867832, 0.597853523, 6.848560567)
          (0.058690821, 0.018462149, 1.373383079, 4.724135092) | (0.024132889, 0.048075336, 0.494332491, 4.682179163)
6082004   (0.02181025, 0.003686141, 2.495887111, 12.20830054) | (0.044938056, 0.034745449, 1.500353349, 8.02803187)
          (0.07732725, 0.013069046, 1.325528218, 3.443366818) | (0.065481167, 0.050629083, 1.242918025, 5.509433636)
7132004   (0.008377286, 0.001221871, 2.91132046, 19.15972793) | (0.018105155, 0.040953059, 0.580268864, 6.74887548)
          (0.07330125, 0.010691371, 0.98420595, 2.189683192) | (0.025941714, 0.05867901, 0.484764773, 4.710152679)
8102004   (0.007004028, 0.002686054, 2.106951197, 19.12565077) | (0.047618079, 0.026529608, 1.41649293, 6.461215493)
          (0.03902244, 0.014965159, 0.892629412, 3.432809112) | (0.06792579, 0.037843705, 1.185995388, 4.529511892)
9142004   (0.004448222, 0.00258425, 2.121769591, 27.80302991) | (0.049751556, 0.033976272, 1.497217813, 7.350984703)
          (0.036697833, 0.021320062, 0.738705354, 3.370064231) | (0.074627333, 0.050964407, 1.222473225, 4.900656469)
10122004  (0.006105397, 0.003440383, 3.033392721, 36.48884967) | (0.043443472, 0.036950234, 1.438991794, 8.021828708)
          (0.041516698, 0.023394604, 1.163253007, 5.366007304) | (0.064831028, 0.055141118, 1.177956163, 5.375452227)
11092004  (0.006383976, 0.00404501, 2.374321206, 28.12549053) | (0.054540393, 0.028929476, 0.457246706, 1.590501371)
          (0.039367853, 0.024944226, 0.956124032, 4.560890357) | (0.079044048, 0.041926777, 0.37981764, 1.097445946)

Table 10.2: Random tree method: Time-averaged calibrated parameters for December 2003-November 2004

               Short-Term                          Med-Term
DataDate       theta       sigma       nu          theta       sigma       nu
12092003       0.106536    0.300434    0.168117    0.164658    0.259597    0.204374
1132004        0.213406    0.239756    0.081263    0.140699    0.278969    0.42724
2102004        0.261017    0.27923     0.156411    0.199864    0.21075     0.132701
3092004        0.333423    0.21758     0.059524    0.193582    0.207408    0.119048
4132004        0.405078    0.205983    0.059524    0.145726    0.219112    0.119049
5112004        0.379233    0.306436    0.176549    0.062056    0.348377    0.585767
6082004        0.499653    0.236681    0.113871    0.161777    0.330669    0.601473
7132004        0.527769    0.239621    0.070222    0.068097    0.389039    0.578163
8102004        0.252145    0.295313    0.149239    0.176467    0.289922    0.45795
9142004        0.280238    0.390532    0.131059    0.202216    0.344347    0.477418
10122004       0.307712    0.393009    0.20002     0.168427    0.356536    0.568792
11092004       0.268127    0.394989    0.192985    0.199191    0.316867    0.132338
time-averaged  0.31952808  0.291630333 0.1298987   0.156897    0.2959661   0.367026

APPENDIX A

Basic Convergence Concepts

In this Appendix, we collect various convergence concepts from probability theory. Let $(\Omega, \mathcal{F}, P)$ be an arbitrary probability space and $\{X_n,\ n = 1, 2, \dots\}$ random variables.

• Almost sure convergence: $\{X_n\}$ converges almost surely (a.s.) to a random variable $X$ if
$$P\left[\lim_{n \to \infty} X_n = X\right] = 1.$$
This is also called convergence with probability 1. In measure theory it is the same concept as convergence almost everywhere, i.e., for almost all $\omega \in \Omega$.

• Convergence in probability: $X_n$ converges in probability to $X$, denoted $X_n \xrightarrow{P} X$, if
$$\lim_{n \to \infty} P[\,|X_n - X| > \epsilon\,] = 0 \quad \text{for each positive } \epsilon.$$
(i) If $X_n \to X$ with probability 1, then $X_n \xrightarrow{P} X$.
(ii) A necessary and sufficient condition for $X_n \xrightarrow{P} X$ is that every subsequence $\{X_{n_k}\}$ has a further subsequence $\{X_{n_{k_i}}\}$ such that $X_{n_{k_i}} \to X$ with probability 1 as $i \to \infty$.

• Convergence in $p$-norm: if all $X_n$ and $X$ have finite $p$th moments, then for $0 < p < \infty$, convergence in $p$-norm means
$$\|X_n - X\|_p^p = E[\,|X_n - X|^p\,] \to 0.$$
By Chebyshev's inequality, convergence in $p$-norm implies convergence in probability.

• Convergence in distribution: let $X_n$ and $X$ be random variables with distribution functions $F_n$ and $F$, respectively. $X_n$ is said to converge in distribution (or in law) to $X$, written $X_n \Rightarrow X$, if
$$\lim_{n \to \infty} F_n(x) = F(x) \quad \text{for every continuity point } x \text{ of } F.$$
(i) This is also called weak convergence.
(ii) $X_n \Rightarrow X$ iff for every $x$ such that $P[X = x] = 0$, $\lim_{n} P[X_n \le x] = P[X \le x]$.
(iii) It is equivalent to the convergence $E[f(X_n)] \to E[f(X)]$ for all bounded continuous functions $f: \mathbb{R} \to \mathbb{R}$.

• Convergence of moments [20]:
– Uniform integrability: a sequence of random variables $\{X_n\}$ is said to be uniformly integrable if
$$\lim_{\alpha \to \infty} \sup_n E[\,|X_n|\, I\{|X_n| > \alpha\}\,] = 0.$$
– A sequence $\{Y_n\}$ is uniformly integrable iff
(i) $\sup_n E|Y_n| < \infty$, and
(ii) for every $\epsilon > 0$ there exists $\delta > 0$ such that $E[\,|Y_n|\, I\{A\}\,] < \epsilon$ for all $n$ and all events $A$ with $P(A) < \delta$.
– Let $0 < r < \infty$, and suppose that $E|X_n|^r < \infty$ for all $n$ and that $X_n \xrightarrow{P} X$ as $n \to \infty$. Then the following are equivalent:
(i) $X_n \to X$ in $L^r$ as $n \to \infty$;
(ii) $E|X_n|^r \to E|X|^r\,(<\infty)$ as $n \to \infty$;
(iii) $\{|X_n|^r,\ n \ge 1\}$ is uniformly integrable.
Furthermore, if $X_n \xrightarrow{P} X$ and one of (i)-(iii) holds, then
(iv) $E|X_n|^p \to E|X|^p$ as $n \to \infty$ for all $p$ with $0 < p \le r$.

• Moment convergence in the strong law [20, Theorem 4.1]: let $\{X_n,\ n \ge 1\}$ be i.i.d. random variables such that $E|X_1|^r < \infty$ for some $r \ge 1$, and set $S_n = \sum_{j=1}^{n} X_j$ $(n \ge 1)$. Then
$$\frac{S_n}{n} \to E X_1 \quad \text{a.s. and in } L^r \text{ as } n \to \infty.$$

APPENDIX B

Subordinators

Definition B.0.1. [1] A subordinator is a one-dimensional Lévy process that is non-decreasing (a.s.).

Theorem B.0.2. If $T$ is a subordinator, then for any $z \in \mathbb{R}$ the characteristic exponent takes the form
$$\psi(z) = ibz + \int_0^{\infty} (e^{izy} - 1)\,\nu(dy), \tag{B.1}$$
where $b \ge 0$ and the Lévy measure satisfies
$$\nu(-\infty, 0) = 0 \quad \text{and} \quad \int_0^{\infty} (y \wedge 1)\,\nu(dy) < \infty. \tag{B.2}$$
Conversely, any mapping from $\mathbb{R}^d$ to $\mathbb{C}$ of the form (B.1) is the characteristic exponent of a subordinator.

• Example: Gamma subordinators. Let $(T(t),\ t \ge 0)$ be a gamma process with parameters $a, b > 0$. Then each $T(t)$ has density
$$f_{T(t)}(x) = \frac{b^{at}}{\Gamma(at)}\, x^{at-1} e^{-bx}, \qquad x \ge 0.$$
Consider the gamma distribution ($d = 1$). Its characteristic function takes the form
$$E[e^{izx}] = \exp\left( \int_0^{\infty} (e^{izx} - 1)\,\underbrace{a x^{-1} e^{-bx}\,dx}_{\nu(dx)} \right).$$
By the Lévy-Khintchine representation, $A = 0$, $\nu(dx) = a\,\mathbf{1}_{(0,\infty)}(x)\,x^{-1} e^{-bx}\,dx$, and $\gamma = 0$. We also notice that
$$\psi(z) = \int_0^{\infty} (e^{izx} - 1)\,\nu(dx).$$
From the above facts, $(T(t),\ t \ge 0)$ is a subordinator with drift $b = 0$ in (B.1) and Lévy measure $\nu(dx) = a x^{-1} e^{-bx}\,dx$ satisfying (B.2).

As we’ve seen our application of subordinators, the most important is of time-changing. Next we introduce the subordination theorem. Let X be an arbitrary L´evyprocess and let T be a subordinator defined on the same probability space such that X and T are independent.

Theorem B.0.3. [1] On $(\Omega, \mathcal{F}, P)$, we define a new stochastic process $Z = (Z(t),\ t \ge 0)$ by
$$Z(t) = X(T(t)) \quad \text{for each } t \ge 0, \tag{B.3}$$
so that for each $\omega \in \Omega$, $Z(t, \omega) = X(T(t, \omega), \omega) = X_{T_t(\omega)}(\omega)$. Then $Z$ is a Lévy process.

Proof. See [1].

• Example: VG process. In our case, $Z_t = B(T(t))$ for each $t \ge 0$, where $B$ is a standard Brownian motion and $T$ is a gamma subordinator independent of the Brownian motion.
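The construction $Z(t) = X(T(t))$ can be simulated directly: draw a non-decreasing gamma subordinator path, then run an independent Brownian motion on that random clock. The parameters $a$, $b$ and the grid below are illustrative only, not values from the thesis.

```python
import random
from math import sqrt

rng = random.Random(1)
a, b = 2.0, 2.0          # gamma subordinator parameters
dt, n = 0.01, 500        # calendar-time grid

# Gamma subordinator: increments over dt are Gamma(shape=a*dt, scale=1/b),
# so T is non-decreasing and T(t) has the density given above.
T = [0.0]
for _ in range(n):
    T.append(T[-1] + rng.gammavariate(a * dt, 1.0 / b))

# Brownian motion on the random clock: conditional on T, the increments
# of Z(t) = B(T(t)) are independent N(0, T_k - T_{k-1}).
Z = [0.0]
for k in range(1, n + 1):
    Z.append(Z[-1] + rng.normalvariate(0.0, sqrt(T[k] - T[k - 1])))
```

Adding a drift term to the Brownian part turns this sketch into exactly the VG construction used in the thesis.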

APPENDIX C

Crude Oil Futures: the Basics from NYMEX

In our paper, the underlying process is for futures on crude oil, so we are better off knowing the general rules of the contracts. We summarize the basics of Crude Oil futures and options from the NYMEX [31]. Crude oil began futures trading on the NYMEX in 1983; Crude Oil futures are now the world's most actively traded and liquid commodity contract.

• Trading Units: Crude Oil Futures trade in units of 1,000 U.S. barrels.

• Trading Months:

– Crude Oil Futures: 30 consecutive months (2½ years), plus long-dated futures initially listed 36, 48, 60, 72, and 84 months prior to delivery.

– Options: 12 consecutive months, plus three long-dated options at 18, 24, and 36 months out on a June/December cycle.

• Price Quotation: Crude Oil Futures are quoted in dollars and cents per barrel.

• Minimum Price Fluctuation: $0.01 (1¢) per barrel ($10 per contract).

• Trading Hours: Crude Oil Futures and Options open-outcry trading runs from 10:00 A.M. to 2:30 P.M. After hours, trading is performed via the NYMEX ACCESS® Internet-based trading platform; it begins at 3:15 P.M. Monday through Thursday and ends at 9:30 A.M. the following day. On Sundays, the session starts at 7:00 P.M. All times are New York time.

• Last Trading Day:

– Crude Oil Futures: Trading terminates at the close of business on the third business day prior to the 25th calendar day of the month preceding the delivery month. If the 25th calendar day of the month is a non-business day, trading ceases on the third business day prior to the last business day preceding the 25th calendar day.

– Options: Trading ends three business days before the underlying futures contract.

• Delivery Period: All deliveries are ratable over the course of the month and must be initiated on or after the first calendar day and completed by the last calendar day of the delivery month.

• Alternative Delivery Procedure (ADP): An ADP is available to buyers and sellers who have been matched by the Exchange subsequent to the termination of trading in the spot-month contract.

• Position Limits: Any one month/all months: 20,000 net futures, but not to exceed 1,000 in the last three days of trading in the spot month.

• Margin Requirements: Margins are required for open futures and short options positions; the margin for a purchased option never exceeds its premium.

• Trading Symbols:

– Futures: CL

– Options: LO

REFERENCES

[1] D. Applebaum, Lévy Processes and Stochastic Calculus, first ed., Cambridge University Press, 2004.

[2] G. Barone-Adesi and R. E. Whaley, Efficient analytic approximation of American option values, Journal of Finance 42 (1987), 301-320.

[3] F. Black, The pricing of commodity contracts, Journal of Financial Economics 3 (1976), 167-179.

[4] O. E. Barndorff-Nielsen, Normal Inverse Gaussian distributions and the modeling of stock returns, Research Report no. 300 (1995).

[5] R. Breen, The accelerated binomial option pricing model, Journal of Financial and Quantitative Analysis 26 (1991), 153-164.

[6] M. Brennan and E. Schwartz, The valuation of American options, Journal of Finance 32 (1977), 449-462.

[7] M. Broadie and J. Detemple, American option valuation: New bounds, approximations, and a comparison of existing methods, The Review of Financial Studies 9 (1996), 1211-1250.

[8] M. Broadie and P. Glasserman, Pricing American-style securities using simulation, Journal of Economic Dynamics and Control 21 (1997), no. 8-9, 1323-1352.

[9] M. Broadie, P. Glasserman, and Z. Ha, Pricing American options by simulation using a stochastic mesh with optimized weights, (2000), 32-50.

[10] P. P. Carr, R. Jarrow, and R. Myneni, Alternative characterization of American put options, Mathematical Finance 2 (1992), 87-106.

[11] P. P. Carr and D. B. Madan, Option valuation using the fast Fourier transform, Journal of Computational Finance 2 (1998), 61-73.

[12] R. Cont and P. Tankov, Financial Modelling with Jump Processes, Chapman & Hall/CRC Press, 2004.

[13] J. C. Cox, S. A. Ross, and M. Rubinstein, Option pricing: A simplified approach, Journal of Financial Economics 7 (1979), 229-263.

[14] L. Devroye, Non-Uniform Random Variate Generation, Springer-Verlag New York Inc., 1986.

[15] J. S. Doran, Estimation of the risk premiums in energy markets, Social Science Research Network (2005).

[16] M. C. Fu, S. B. Laprise, D. B. Madan, Y. Su, and R. Wu, Pricing American options: A comparison of Monte Carlo simulation approaches, Journal of Computational Finance 2 (2001), 62-73.

[17] R. Geske, A note on an analytical valuation formula for unprotected American options on stocks with known dividends, Journal of Financial Economics 7 (1979), 375-380.

[18] R. Geske and K. Shastri, Valuation by approximation: A comparison of alternative option valuation techniques, Journal of Financial and Quantitative Analysis 20 (1985), 45-71.

[19] P. Glasserman, Monte Carlo Methods in Financial Engineering, Springer Science+Business Media, 2004.

[20] A. Gut, Stopped Random Walks: Limit Theorems and Applications, Applied Probability, vol. 5, Springer-Verlag, 1988.

[21] A. Hirsa and D. B. Madan, Pricing American options under Variance Gamma, Journal of Computational Finance 7 (2003).

[22] S. D. Jacka, Optimal stopping and the American put, Mathematical Finance 1 (1991), 1-14.

[23] P. Jaillet, D. Lamberton, and B. Lapeyre, Variational inequalities and the pricing of American options, (1990), 263-289.

[24] H. Johnson, An analytic approximation for the American put price, Journal of Financial and Quantitative Analysis 18 (1983), 141-148.

[25] E. Këllezi and N. Webber, Valuing Bermudan options when asset returns are Lévy processes, Quantitative Finance 4 (2006), 87-100.

[26] L. W. MacMillan, An analytic approximation for the American put price, Advances in Futures and Options Research 1 (1986), 119-139.

[27] D. B. Madan, P. P. Carr, and E. C. Chang, The Variance Gamma process and option pricing, European Finance Review 2 (1998), 79-105.

[28] D. B. Madan and F. Milne, Option pricing with V.G. martingale components, Mathematical Finance 1 (1991), no. 4, 39-59.

[29] D. B. Madan and E. Seneta, The Variance Gamma model for share market returns, The Journal of Business 63 (1990), no. 4, 511-524.

[30] R. A. Maller, D. H. Solomon, and A. Szimayer, A multinomial approximation for American option prices in Lévy process models, Mathematical Finance 16 (2006), no. 4.

[31] NYMEX, Education series: http://www.nymex.com.

[32] Board of Governors of the Federal Reserve System, http://www.federalreserve.gov.

[33] R. Roll, An analytic valuation formula for unprotected American options on stocks with known dividends, Journal of Financial Economics 5 (1977), 251-258.

[34] K. Sato, Lévy Processes and Infinitely Divisible Distributions, Cambridge University Press, Cambridge, UK, 1999.

[35] W. Schoutens, Lévy Processes in Finance: Pricing Financial Derivatives, John Wiley & Sons, 2003.

[36] P. Tankov, Lévy processes in finance: Inverse problems and dependence modelling, Ph.D. thesis, École Polytechnique, 2004.

[37] R. E. Whaley, On the valuation of American call options on stocks with known dividends, Journal of Financial Economics 9 (1981), 207-211.

[38] R. E. Whaley, Valuation of American futures options: Theory and empirical tests, Journal of Finance 41 (1986), 127-150.

BIOGRAPHICAL SKETCH

EunJoo Yoo

EunJoo Yoo was born on June 15, 1972 in Mokpo, Korea. In the spring of 1995, she completed a Bachelor of Science in Mathematics at Kookmin University. She graduated from Yonsei University in 1998 with a Master of Science in Mathematics. In April 2004, she obtained a Master of Science degree in Financial Mathematics from the Department of Mathematics at Florida State University. She continued her studies by pursuing a doctorate in Financial Mathematics under the guidance of Dr. Craig A. Nolder. EunJoo received her Ph.D. in the summer of 2008 for her work on Variance Gamma pricing of American futures options.
