Pricing American Interest Rate Derivatives by Simulation

Rami V. Tabri

A Thesis in The Department of Mathematics and Statistics

Presented in partial fulfillment of the requirements for the degree of Master of Science (Mathematics) at Concordia University Montreal, Quebec, Canada

August 2008

©Rami V. Tabri 2008

ISBN: 978-0-494-45530-2

ABSTRACT

Pricing American Interest Rate Derivatives by Simulation

Rami Victor Tabri

We examine the performance of single and multifactor models of the short rate in pricing American put options on the thirty year T-bond futures contract and the two, five and ten year T-note futures contracts by simulation. The models for the short rate we utilize are the one-factor, two-factor, and three-factor Vasicek and CIR models used in Babbs and Nowman [2], and Chen and Scott [9] respectively, and the three factor models of Balduzzi, Das, Foresi, and Sundaram [3], Dai and Singleton [12], Chen [6], and the Maximal models in the $A_{1r}(3)$ and $A_{2r}(3)$ subfamilies introduced in Dai and Singleton [12]. We utilize the least squares Monte Carlo algorithm developed in Longstaff and Schwartz [23] to estimate the price of the American put option on T-bond and T-note futures contracts, and construct a high biased estimator in order to obtain a 95% confidence interval for the true price of the American put option. Also, we approximate the optimal exercise boundaries for the American put option and the early exercise premia.

Since the state variables in the one-factor, two-factor, and three-factor Vasicek and CIR models have known transition distributions, our investigation focuses on the performance of these models when simulated via the Euler scheme in comparison to simulation via the transition distribution approach. For the remaining models, we compare them across confidence intervals, estimated optimal exercise boundaries, and early exercise premia. With the Vasicek and CIR factor models, we found that as the number of factors increased, the early exercise premia became more non-linear under both approaches to simulation. We also found that simulating these models via the Euler scheme, with our choice of high biased estimator and finite subset of basis functions, significantly increased the size of the 95% confidence intervals when compared to the confidence intervals obtained by simulation via the transition density approach. Finally, the estimated early exercise boundaries for the CIR models when simulating via the Euler scheme were significantly different from those obtained when simulation was conducted via the transition distribution. For the rest of the models, the Maximal model and the Dai and Singleton model in the Anr(3) subfamily had the best performance in terms of having positive early exercise premia for most of the American put options considered.

Acknowledgements

I would like to acknowledge the financial support I received from the Institut de finance mathématique de Montréal (IFM2) in the form of a master's research fellowship, as well as the support of my supervisor, Dr. C. Hyndman, and of the Department of Mathematics, Concordia University.

Contents

List of Figures

List of Tables

Introduction

1 Mathematical Finance
1.1 Introduction and Motivation
1.2 Modeling Uncertainty
1.2.1 Introduction and Motivation
1.2.2 Mathematical Formulation of Uncertainty
1.3 Basic Market Model
1.3.1 Terminology
1.3.2 Results
1.4 Fundamental Pricing Formulae
1.4.1 Zero-Coupon Bond Price
1.4.2 Zero-Coupon Bond Futures Price
1.4.3 The Pricing of a European Option on a Zero-Coupon Bond
1.4.4 The Pricing of European Option on a Zero-Coupon Bond Futures
1.4.5 The Pricing of Coupon-Bearing Bonds
1.4.6 The Coupon Bond Futures Price
1.4.7 The Pricing of European Options on Coupon-Bearing Bonds
1.4.8 The Pricing of European Options on Coupon-Bearing Bond Futures
1.5 Computation of Pricing Formulae
1.5.1 Computing the Price of a Contingent Claim
1.5.2 Computing Futures Prices
1.6 Concluding points

2 Affine Term Structure Models
2.1 Introduction
2.2 Concepts and Terminology
2.3 One-Factor Short Rate Models
2.3.1 The Pricing of Zero-Coupon Bonds
2.3.2 Zero-Coupon Bond Futures Price
2.3.3 Pricing European Options on Zero-Coupon Bonds
2.3.4 Pricing European Options on Zero-Coupon Bond Futures
2.4 Two-Factor Short Rate Models
2.4.1 Some General Remarks
2.4.2 The Pricing of Zero-Coupon Bonds
2.4.3 Zero-Coupon Bond Futures Price
2.4.4 The Pricing of European Options on Zero-Coupon Bonds
2.4.5 European Bond Futures Price
2.5 Three-Factor Models
2.5.1 The Pricing of Zero-Coupon Bonds
2.5.2 The Zero-Coupon Bond Futures Price
2.5.3 The Pricing of a European Option on a Zero-Coupon Bond
2.5.4 The Price of a European Option on Zero-Coupon Bond Futures

3 American Interest Rate Derivatives
3.1 Finite Expiring American Put Option
3.2 Practical Dynamic Programming
3.3 Properties of the LSM Estimator
3.3.1 Interleaving Property
3.3.2 Convergence Results

4 Case Studies
4.1 Introduction
4.2 The T-Bond and T-Note Futures Contracts
4.3 High Biased Estimator
4.4 The Environment
4.4.1 Underlying Assets
4.4.2 Basis Functions
4.4.3 Approximating the Optimal Exercise Boundary
4.4.4 General Remarks on Simulation
4.4.5 The Parameters of the Models
4.5 Results
4.5.1 The CIR Models
4.5.2 Models in the $A_{1r}(3)$ and $A_{2r}(3)$ Families
4.5.3 The Single, Two, and Three Factor Vasicek Models

5 Conclusions
5.1 Summary
5.2 Conclusions
5.3 Future Research

Bibliography

A Probability and Calculus
A.1 Concepts and Terminology
A.1.1 Rudimentary Probability
A.1.2

B Monte Carlo Methods
B.1 Introduction and Motivation
B.1.1 Monte Carlo Simulation
B.2 Variance Reduction Techniques
B.2.1 Antithetic Sampling
B.2.2 Importance Sampling
B.3 Simulating SDEs

C Limit Theorems

D Parameter Estimates
D.1 Estimates for the Single, Two and Three Factor Vasicek Models
D.2 Estimates for the Single, Two and Three Factor CIR Models
D.3 Estimates for the models in the $A_{1r}(3)$ and $A_{2r}(3)$ Families

List of Figures

1.1 Sample path of short term interest rate

2.1 The functions $A$, $B_1$, $B_2$, $B_3$

2.2 The function £^(75) in the BDFS model obtained using Maple 8

4.1 Plots of 95% confidence intervals using the high and low biased estimators, and LSM estimators for the prices of all the American put options, using the two factor CIR model where simulation is conducted via both the Euler scheme and transition distribution approaches

4.2 The estimated early exercise premia for the American options for all strikes and bond maturities considered, when considering single, two and three factor models of the short rate simulated via the Euler scheme and transition distribution approach

4.3 Approximate optimal exercise boundaries for the two factor CIR model using the Euler scheme

4.4 Approximate optimal exercise boundaries for the two factor CIR model using the transition distribution

4.5 Plots of 95% confidence intervals using the high and low biased estimators, and LSM estimators for the prices of all the American put options, using the maximal model in the $A_{2r}(3)$ family

4.6 The estimated early exercise premia for the American options for all strikes and bond maturities considered for the models in the $A_{1r}(3)$ and $A_{2r}(3)$ families

4.7 Approximate optimal exercise boundaries for the $A_{1r}(3)$Max model. The initial price is given by the value zero on the strike prices axis

4.8 Estimated early exercise boundaries for the single, two and three factor Vasicek models under the Euler scheme and transition distribution

4.9 95% confidence intervals for the American put prices obtained using high and low biased estimators for the two factor Vasicek model simulated via the transition distribution

4.10 Approximate optimal exercise boundaries for the American put options, under the two-factor Vasicek model simulated via the transition density

List of Tables

D.1 Parameter estimates of the generalized Vasicek model from Babbs and Nowman [2]

D.2 Parameter estimates of the CIR model from Chen and Scott [9]

D.3 Estimates of the parameters from the models in the $A_1(3)$ class taken from Dai and Singleton [12]. Parameters indicated by "fixed" are restricted to zero

D.4 Estimates of the parameters from the models in the $A_2(3)$ class taken from Dai and Singleton [12]. Parameters indicated by "fixed" are restricted to zero, except that r is constrained to be equal to 9 whenever it is "fixed"

Introduction

For at least a decade, financial derivatives have become increasingly significant in the world of finance and investments, and the field of financial mathematics has played an important role in the development of analytical tools for the better understanding of such markets. Among the most important of these markets are those for derivative securities such as options and futures contracts. One of the most important problems in option pricing theory is the valuation and optimal exercise of American options. American options may be exercised at any time before a fixed date. Such derivatives are found in most major financial markets including equity, commodity, bond, foreign exchange, insurance, and energy markets. Traditional methods used to approximate the value of the American option price, such as Binomial Trees and Finite Difference techniques, have proven successful. However, when multiple factors affect the value of the underlying asset, those techniques become impractical. Valuing options by simulation overcomes the problem of dimensionality faced by the alternative techniques and allows for a wider variety of assets on which options can be written.

Chapter one presents the mathematical finance framework in which we shall work. That is, we introduce all the concepts from financial mathematics that we will need in order to derive pricing formulae in the framework of a basic market model. Such concepts include the No-Arbitrage condition and the risk-neutral measure. Finally, we discuss the two approaches to computing pricing formulas, which are the Black-Scholes approach and the forward measure approach.

Chapter two presents the different interest rate models we shall consider in this thesis. We derive, when possible, all the pricing formulae associated with each model for the financial assets and derivatives introduced in the first chapter. Also, we occasionally discuss other approaches to computing the pricing formulae for the various models.

Chapter three introduces the basic theory underlying the pricing of American put options and motivates the use of the dynamic programming principle in pricing American options. Also, we present pseudo code for the implementation of the Longstaff and Schwartz least squares algorithm and discuss the properties of this algorithm.

Chapter four introduces the case studies for our numerical experiment. We introduce the underlying asset and our choice of high biased estimator. Also, we present some discussion on our choice of basis functions to be used in the algorithm. Finally, we present our results.

Chapter 1

Mathematical Finance

1.1 Introduction and Motivation

One can approach the pricing of contingent claims and derivative securities from the perspective of Economic Theory using the General Equilibrium approach. This entails setting up a stochastic continuous time economy. That is, describing the number of basic assets and agents in the economy, as well as the different agents' asset demand profiles in accordance with their preferences, expectations, attitudes towards risk, information and initial endowments, which we call the primitives of the model. Computing a general equilibrium means deriving the price processes for the different assets and contingent claims as a function of the primitives of the model, where the overall demand for each asset is matched by the overall supply for each asset.

Solving for a general equilibrium is a formidable task, and the strength of financial economics lies in working with a condition weaker than those required by a general equilibrium, called No Arbitrage. An arbitrage is a trading strategy that begins with no money, has zero probability of losing money, and a positive probability of making money. Hence, No Arbitrage rules out such opportunities. In order to price contingent claims by No Arbitrage, we need to set up a market model again, but this time the stochastic evolution of each basic asset is specified exogenously by some stochastic differential equation. Implicitly, we are making the assumption that the exogenously given asset prices clear their respective markets. For this reason, we do not have to model any agents in the economy and hence need not take the market clearing conditions into account. The No Arbitrage approach is based on replicating the payoff of the contingent claim in each state of the economy through trading the basic assets only. The No Arbitrage price of the claim is the minimum initial wealth required for replication such that no arbitrage opportunities arise. In this sense the No Arbitrage condition is weaker than the General Equilibrium requirements. That is, for price processes and agent demand profiles to be in a General Equilibrium, it is necessary that these price processes be arbitrage free.

In the following, we shall first define the mathematical tools used in the No Arbitrage pricing approach. Next, we shall introduce the main results of this approach which, in effect, demonstrate how the tools are used to characterize arbitrage free prices for contingent claims. We shall derive the fundamental pricing formulas for the following financial assets and derivatives: coupon and non-coupon bearing bonds, bond futures for both coupon and non-coupon bearing bonds, and European options on the previous two. Finally, we shall discuss two approaches to computing the prices of the financial assets and derivatives mentioned above: the partial differential equations approach, also known as the Black-Scholes method, and the change of numeraire approach.

1.2 Modeling Uncertainty

1.2.1 Introduction and Motivation

There are different types of interest rates, for example interbank rates and government rates. By "interbank rates" we denote rates at which deposits are exchanged between banks, and by "government rates" we mean interest rates at which governments borrow funds. The most important interbank rate that is usually considered as a reference for contracts is LIBOR (the London InterBank Offered Rate), which is fixed daily in London. Whenever we utilize the term "interest rate", we mean an interbank rate.

The presentation in this section follows Sondermann [28]. If the evolution of the interest rate were a "smooth" function of time, then one could model it with an ordinary differential equation. Let $X : \mathbb{R}_+ \to \mathbb{R}$ be a differentiable function, and let $g$ be a twice continuously differentiable function, i.e. $g \in C^2(\mathbb{R})$. Then, Taylor's theorem states that

$$\Delta g(X(t)) = g(X(t + \Delta t)) - g(X(t)) = g'(X(t))\,\Delta X(t) + \frac{1}{2} g''(X(m))\,(\Delta X(t))^2,$$

where $\Delta X(t) = X(t + \Delta t) - X(t)$ and $m \in [t, t + \Delta t]$. Hence, letting $\Delta t \to 0$ yields

$$dg(X(t)) = g'(X(t))\,dX(t)$$

or equivalently

$$g(X(t)) = g(X(0)) + \int_0^t g'(X(s))\,dX(s),$$

since $\Delta X(t) \to dX(t) = X'(t)\,dt$, and the terms of higher order $(dt)^2$ vanish. If we observe the path of daily interest rate fluctuations given in Figure 1.1, we see that this function is not "smooth". In fact, it is a continuous function which is not of bounded variation on any given interval of time, where bounded variation is defined as follows.

Definition 1.2.1 (Function of Bounded Variation). Let $F : \mathbb{R} \to \mathbb{C}$ and $x \in \mathbb{R}$. We define

$$T_F(x) = \sup\left\{ \sum_{j=1}^{n} |F(x_j) - F(x_{j-1})| : n \in \mathbb{N},\ -\infty < x_0 < \cdots < x_n = x \right\}.$$

$T_F$ is called the total variation function of $F$. If $T_F(\infty) = \lim_{x \to \infty} T_F(x)$ is finite, then we say that $F$ is of bounded variation on $\mathbb{R}$.
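To illustrate Definition 1.2.1 numerically, the following Python sketch evaluates the sum inside the supremum on successively finer nested partitions of $[0, 1]$ for two functions: the smooth function $\sin(2\pi x)$ and a simulated rough path. The helper names (`variation_sum`, `random_walk_path`, `compare_variation`), the Gaussian random walk used as a stand-in for a rough path, the fixed seed, and the restriction to the interval $[0, 1]$ are all assumptions made for illustration; they are not taken from the thesis.

```python
import math
import random


def variation_sum(values):
    """Sum of |F(x_j) - F(x_{j-1})| over consecutive partition points."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


def random_walk_path(n_steps, horizon, rng):
    """Gaussian random walk on a uniform grid (an assumed stand-in for a rough path)."""
    dt = horizon / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def compare_variation(levels=(6, 10, 14), seed=0):
    """Return (n, smooth_sum, rough_sum) for nested partitions with n = 2**k intervals."""
    rng = random.Random(seed)            # fixed seed, illustration only
    finest = 2 ** max(levels)
    rough = random_walk_path(finest, 1.0, rng)
    rows = []
    for k in levels:
        n = 2 ** k
        smooth = [math.sin(2.0 * math.pi * j / n) for j in range(n + 1)]
        coarse = rough[:: finest // n]   # the same path seen on a coarser partition
        rows.append((n, variation_sum(smooth), variation_sum(coarse)))
    return rows


if __name__ == "__main__":
    for n, smooth_sum, rough_sum in compare_variation():
        print(n, round(smooth_sum, 4), round(rough_sum, 4))
```

Under these assumptions, the sums for the smooth function settle at its total variation on $[0, 1]$, whereas the sums for the simulated path keep growing as the partition is refined, which is the behaviour one would expect of a path of unbounded variation.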

An important implication of Definition 1.2.1 is that a function which is of unbounded variation on every interval cannot be continuously differentiable on any interval. Hence, this property precludes our use of the classical calculus to model the phenomenon in Figure 1.1.

Figure 1.1: Sample path of short term interest rate (vertical axis: short term interest rate; horizontal axis: time in days)

The extension of the classical calculus to real valued functions of unbounded variation is where stochastic calculus takes center stage. Using the above notation, except now letting $X(t)$ be of unbounded variation, when forming the differential $dg(X(t))$, the second order term $(\Delta X(t))^2$, also known as the quadratic variation of $X(t)$, does not vanish as $\Delta t \to 0$, yielding

$$dg(X(t)) = g'(X(t))\,dX(t) + \frac{1}{2} g''(X(t))\,(dX(t))^2$$

or, in explicit form,

$$g(X(t)) = g(X(0)) + \int_0^t g'(X(s))\,dX(s) + \int_0^t \frac{1}{2} g''(X(s))\,(dX(s))^2. \qquad (1.1)$$

The challenge was to give meaning to the first integral in equation (1.1), where the integrand and integrator depend on a function of unbounded variation. This task was solved by Itô and Doeblin independently around the same time, and hence equation (1.1) is called the Itô-Doeblin formula for the function $g$. Although one can develop stochastic calculus without probability, the exposition in this chapter, and the results listed in the appendix to this chapter, take the probabilistic path.
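Equation (1.1) can be checked numerically in a simple special case. The sketch below assumes that $X$ is a standard Brownian motion $W$ (simulated by Gaussian increments) and takes $g(x) = x^2$, so that $g'(x) = 2x$ and $\frac{1}{2} g''(x) = 1$; both choices, as well as the function name `ito_check` and the seed, are ours and serve only as an illustration. The first integral is approximated by left-point sums and the second by the sum of squared increments.

```python
import math
import random


def ito_check(n_steps, horizon, seed):
    """Compare g(W_T) with the two sums approximating the integrals in (1.1) for g(x) = x**2."""
    rng = random.Random(seed)           # fixed seed, illustration only
    dt = horizon / n_steps
    w = 0.0
    first_integral = 0.0                # left-point sum of g'(W) dW = 2 W dW
    second_integral = 0.0               # sum of (1/2) g''(W) (dW)^2 = (dW)^2
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        first_integral += 2.0 * w * dw
        second_integral += dw * dw
        w += dw
    return w * w, first_integral, second_integral


if __name__ == "__main__":
    g_of_w, first, second = ito_check(100_000, 1.0, seed=1)
    print("g(W_T)                 :", g_of_w)
    print("sum of both integrals  :", first + second)
    print("second integral alone  :", second)
```

In this discretization the identity $g(W_T) = g(0) + \text{first} + \text{second}$ holds up to rounding, and the second sum stays close to the horizon $T$ rather than vanishing, which illustrates why the second order term cannot be dropped for paths of unbounded variation.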

1.2.2 Mathematical Formulation of Uncertainty

The modeling of uncertainty begins with the notion of a measurable space, which is a pair $(\Omega, \mathcal{F})$. $\Omega$ is a nonempty set called the sample space, and $\mathcal{F}$ is a sigma-algebra of subsets of $\Omega$, whose members are called events. To provide some intuition, it helps to keep in mind that there is a random experiment we are trying to describe. In this case, $\Omega$ represents the set of all possible outcomes of this experiment, while $\mathcal{F}$ represents the sets to which the outcome may or may not belong. The reason we consider a sigma-algebra as the collection of events of interest, rather than another kind of collection, is that we can associate with each member of $\mathcal{F}$ a likelihood of occurrence, called a probability. This is expressed formally by utilizing a probability measure denoted by $P$. The triple $(\Omega, \mathcal{F}, P)$ is called a probability space.

1.3 Basic Market Model

In this section, we shall introduce the No Arbitrage pricing paradigm in continuous time. In order to accomplish this, we shall introduce the necessary terminology in the context of a continuous time economy which shall be a special case of the one presented in Brigo and Mercurio [5].

1.3.1 Terminology

There is a finite time horizon $T > 0$, a probability space $(\Omega, \mathcal{F}, P)$, and a filtration $\{\mathcal{F}_t : 0 \le t \le T\}$. The economy consists of two non-dividend paying securities, and their prices are modeled by a two dimensional adapted process $S = \{S_t : 0 \le t \le T\}$, whose components $S^0$ and $S^1$ are positive. Let the asset indexed by zero be a bank account; thus, its price evolves according to

$$dS^0_t = r_t S^0_t\,dt$$

with $S^0_0 = 1$, where $r_t$ is the instantaneous spot rate$^1$ of interest at time $t$. It is the rate at which the bank account accrues. Hence,

$$S^0_t = \exp\left( \int_0^t r_s\,ds \right), \qquad (1.2)$$

which tells us that investing a dollar at time zero yields at time $t$ the value given by equation (1.2). The asset indexed by one is a basic asset in the economy.

Definition 1.3.1 (Stochastic Discount Factor). The discount factor between the time instants $t$ and $T$, $D(t, T)$, is the amount at time $t$ that is "equivalent" to one unit of currency payable at time $T$, and is given by

$$D(t, T) = e^{-\int_t^T r_s\,ds}.$$
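As a small computational illustration of the bank account and of Definition 1.3.1, the sketch below approximates $\int r_s\,ds$ along a short-rate path sampled on a uniform time grid by the trapezoidal rule. The function names, the grid, the quadrature rule, and the made-up rate path in the usage example are assumptions for illustration only.

```python
import math


def integrated_rate(rates, dt, i, j):
    """Trapezoidal approximation of the integral of r_s ds between grid times i*dt and j*dt."""
    return sum(0.5 * (rates[k] + rates[k + 1]) * dt for k in range(i, j))


def bank_account(rates, dt, j):
    """Value at time j*dt of one unit of currency deposited in the bank account at time 0."""
    return math.exp(integrated_rate(rates, dt, 0, j))


def discount_factor(rates, dt, i, j):
    """D(t, T) of Definition 1.3.1 with t = i*dt and T = j*dt."""
    return math.exp(-integrated_rate(rates, dt, i, j))


if __name__ == "__main__":
    dt = 1.0 / 250.0                                                # hypothetical daily grid
    rates = [0.04 + 0.005 * math.sin(k * dt) for k in range(501)]   # made-up rate path
    print(bank_account(rates, dt, 250))
    print(discount_factor(rates, dt, 250, 500))
```

With this construction $D(0, t)$ is the reciprocal of the bank account value at time $t$, and discount factors over adjacent periods multiply, i.e. $D(t, T) D(T, U) = D(t, U)$.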

When dealing with interest rate derivative securities, the main source of variability is the fluctuation in the short rate. For this reason, the probabilistic nature of the short rate is of importance, since it affects the behavior of the value of the bank account in our basic market model. In general, we shall model the stochastic evolution of the short rate either with an SDE or with a system of SDEs, where the former is called a single factor model and the latter is called a multi-factor model. In the multi-factor setting, the number of SDEs and the number of Brownian motions entering those SDEs can both be larger than one, and do not need to be the same.

$^1$ It is also known as the short rate.

Definition 1.3.2 (Trading Strategy). A trading strategy is a two dimensional process $\phi = \{\phi_t : 0 \le t \le T\}$ whose components are locally bounded.

The components of $\phi_t$ are to be interpreted as the number of units of the bank account and of the security held by an investor at time $t$. Next, we define the value process associated with a trading strategy.

Definition 1.3.3 (Value Process). The value process associated with a trading strategy $\phi$ is given by

$$V_t(\phi) = \phi_t S_t = \phi^0_t S^0_t + \phi^1_t S^1_t.$$

Definition 1.3.4 (Risk-Neutral Measure). A risk-neutral measure, $Q$, is a probability measure on the measurable space $(\Omega, \mathcal{F})$ such that

1. $P$ and $Q$ are equivalent measures.

2. The Radon-Nikodym derivative process $\frac{dQ}{dP}$ belongs to $L^2(\Omega, \mathcal{F}, P)$.

3. The "discounted asset price" process, $D(0, \cdot)S$, is a $(\{\mathcal{F}_t : 0 \le t \le T\}, Q)$ martingale.

The first item of Definition 1.3.4 means that $P(A) = 0$ if and only if $Q(A) = 0$, $\forall A \in \mathcal{F}$. That is, the risk neutral measure and the original probability measure must agree on which events have probability zero. The second item means that the process $\frac{dQ}{dP}$ $^2$ is square integrable with respect to the measure $P$. The last item means that

$$E^Q\left[ D(0, t) S^k_t \mid \mathcal{F}_u \right] = D(0, u) S^k_u, \quad \forall\, 0 \le u \le t \le T,\ k = 0, 1.$$

$^2$ This process was introduced in Section A.1.1 and its definition was given by equation (A.3).

As mentioned in Section 1.1, an arbitrage is a trading strategy that begins with no money, has zero probability of losing money, and has a positive probability of making money at some time in the future. Formally, this is given by Definition 1.3.5.

Definition 1.3.5 (Arbitrage). An arbitrage is a trading strategy $\phi$ satisfying the two conditions:

1. $V_0(\phi) = 0$.

2. For some time $u > 0$, $P(V_u(\phi) \ge 0) = 1$ and $P(V_u(\phi) > 0) > 0$.

Since we are interested in pricing assets which depend on our basic asset $S^1$ in the market model described above, the assets to be priced will have state dependent payoffs. In order to utilize the mathematical formulation of uncertainty described in Section A.1.1, we require that the contingent claim be square integrable, as given by Definition 1.3.6.

Definition 1.3.6 (Contingent Claim). A contingent claim is a square integrable and positive random variable on $(\Omega, \mathcal{F}, P)$.

Finally, we formalize the notion of a replicating trading strategy in the assets $S$ for a contingent claim in Definition 1.3.7.

Definition 1.3.7 (Attainability). A contingent claim $H$ is attainable if there exists a trading strategy $\phi$ such that $V_T(\phi) = H$.

1.3.2 Results

The contribution of Harrison and Pliska [18] was to associate the absence of arbitrage with the existence of a risk-neutral measure. They provided the mathematical characterization of the unique no arbitrage price associated with any attainable contingent claim H.

Theorem 1.3.1 (Harrison and Pliska). Assume there exists a risk-neutral measure $Q$ and let $H$ be an attainable contingent claim. Then, for each time $t$, $0 \le t \le T$, there exists a unique price $\pi_t$ associated with $H$ given by

$$\pi_t = E^Q\left[ D(t, T) H \mid \mathcal{F}_t \right]. \qquad (1.3)$$

The implication of Theorem 1.3.1 for pricing purposes is that, in an arbitrage free market, the price of any attainable claim is uniquely given, either by the value of the associated replicating strategy, or by the risk neutral expectation of the discounted claim payoff. In practice, prior to applying equation (1.3) we should verify that the contingent claim we are pricing is indeed attainable and that a risk-neutral measure does exist in the model. By Definition 1.3.7, the former entails constructing a trading strategy that replicates the payoff of the contingent claim. If one can prove the existence of such a trading strategy without computing it, then, assuming the existence of a risk-neutral measure, the assumptions of Theorem 1.3.1 are verified and we may apply equation (1.3). This approach is possible in practice and, in the context of the model developed in Section 1.3.1, makes use of the martingale representation theorem for a standard one dimensional Brownian motion given by Theorem A.1.7.

The next result, also proved by Harrison and Pliska [19] and stated as Proposition 1.3.2, provides necessary and sufficient conditions for a financial market to be complete and arbitrage free.

Definition 1.3.8 (Market Completeness). A financial market is complete iff every contingent claim is attainable.

Proposition 1.3.2. A financial market is complete and arbitrage free iff there exists a unique risk neutral measure.

The result given by Proposition 1.3.2 characterizes the existence of a unique risk-neutral measure in terms of verifying that every contingent claim in our model is attainable. Utilizing this approach, one can verify whether each contingent claim in our model is attainable by using the martingale representation theorem to prove the existence of a trading strategy that replicates its payoff structure.

1.4 Fundamental Pricing Formulae

Given the basic market model described in Section 1.3, and assuming the existence of a unique risk-neutral measure, Q, we can proceed to price financial securities and derivatives.

By Theorem 1.3.1, the price of any contingent claim at time $t$ with payoff $H_T$ at time $T \ge t$ is given by

$$\pi_t = E^Q\left[ e^{-\int_t^T r_s\,ds}\, H_T \,\Big|\, \mathcal{F}_t \right]. \qquad (1.4)$$

First, we shall derive the pricing formulas for zero coupon bonds, bond futures, and European options on the previous two. Next, we shall repeat the previous derivations but with coupon bearing bonds.

1.4.1 Zero-Coupon Bond Price

Definition 1.4.1 (Zero-coupon bond). A $T_B$-maturing zero-coupon bond is a contract that guarantees its holder the payment of one unit of currency at time $T_B$, with no intermediate payments. The contract value at time $t \le T_B$ is denoted by $P(t, T_B)$.

In terms of $H_T$, the payoff of the bond at maturity is given by $H_{T_B} = 1$. Substituting this into equation (1.4) yields the time $t$ price of the bond, given by

$$P(t, T_B) = E^Q\left[ e^{-\int_t^{T_B} r_s\,ds} \,\Big|\, \mathcal{F}_t \right], \qquad (1.5)$$

from which $P(T_B, T_B) = 1$, $\forall\, T_B$.
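Equation (1.5) lends itself to a simple Monte Carlo illustration. The sketch below assumes, purely for the sake of example, that the short rate follows single-factor Vasicek dynamics $dr_t = \kappa(\theta - r_t)\,dt + \sigma\,dW_t$ under $Q$, simulated with the Euler scheme; the parameter values, the function names, and the trapezoidal approximation of the integrated rate are hypothetical choices and are not the estimates or the implementation used in this thesis. The estimate is compared with the standard closed-form Vasicek bond price.

```python
import math
import random


def vasicek_bond_price(r, tau, kappa, theta, sigma):
    """Closed-form Vasicek zero-coupon bond price for short rate r and time to maturity tau."""
    b = (1.0 - math.exp(-kappa * tau)) / kappa
    log_a = (theta - sigma ** 2 / (2.0 * kappa ** 2)) * (b - tau) - sigma ** 2 * b ** 2 / (4.0 * kappa)
    return math.exp(log_a - b * r)


def mc_bond_price(r0, maturity, kappa, theta, sigma, n_paths=2000, n_steps=50, seed=0):
    """Monte Carlo estimate of E^Q[exp(-int_0^T r_s ds)]; returns (mean, standard error)."""
    rng = random.Random(seed)                         # fixed seed, illustration only
    dt = maturity / n_steps
    total, total_sq = 0.0, 0.0
    for _ in range(n_paths):
        r, integral = r0, 0.0
        for _ in range(n_steps):
            # Euler step for the assumed Vasicek dynamics
            r_next = r + kappa * (theta - r) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            integral += 0.5 * (r + r_next) * dt       # trapezoidal rule for the integrated rate
            r = r_next
        discount = math.exp(-integral)
        total += discount
        total_sq += discount * discount
    mean = total / n_paths
    variance = max(total_sq / n_paths - mean * mean, 0.0)
    return mean, math.sqrt(variance / n_paths)


if __name__ == "__main__":
    params = dict(kappa=0.5, theta=0.05, sigma=0.01)  # hypothetical parameter values
    estimate, std_error = mc_bond_price(0.04, 1.0, **params)
    print("Monte Carlo :", estimate, "+/-", 1.96 * std_error)
    print("closed form :", vasicek_bond_price(0.04, 1.0, **params))
```

The reported half-width $1.96 \times$ standard error gives an approximate 95% confidence interval for the simulated expectation; it does not account for the discretization bias of the Euler scheme.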

1.4.2 Zero-Coupon Bond Futures Price

First, we shall define what we mean by a futures contract and futures price for a general asset, after which we apply the definition to the special case of the zero-coupon bond. A futures contract on an asset gives both parties involved in the contract a guaranteed price at which to buy and sell the underlying asset at the time of contract settlement, denoted by $T_F$. Note that the futures price at time $t$ is not the price of the futures contract at time $t$; rather, it is the price that the parties involved in the contract agree upon at time $t$ for the exchange at the time of contract settlement. Positions in futures contracts are governed by a specific daily settlement procedure referred to as marking to market. An investor's initial deposit, known as the initial margin, is adjusted daily to reflect gains or losses that are due to the futures price movements.

Definition 1.4.2 (Futures price). The futures price of an asset whose value at time $T_F$ is $S(T_F)$ is given by the formula

$$\mathrm{Fut}(t, T_F) = E^Q\left[ S(T_F) \mid \mathcal{F}_t \right], \quad 0 \le t \le T_F \le T_B. \qquad (1.6)$$

A long position in the futures contract is an agreement to receive as a cash flow the changes in the futures price (which may be negative as well as positive) during the time the position is held. A short position in the futures contract receives the opposite cash flow.

Then the futures price, $\mathrm{Fut}(t, T_F, T_B)$, of the zero coupon bond is an adapted process with two properties:

1. The futures price agrees with the zero coupon bond price on the delivery date. That is, $\mathrm{Fut}(T_F, T_F, T_B) = P(T_F, T_B)$.

2. The expected value of holding the futures contract over a period of time and receiving the cash flows associated with this position is zero; in the limit as the length of a time period tends to zero, this condition is given by

$$\frac{1}{D(0, t)} E^Q\left[ \int_t^{T_F} D(0, u)\,d\mathrm{Fut}(u, T_F, T_B) \,\Big|\, \mathcal{F}_t \right] = 0, \quad 0 \le t \le T_F. \qquad (1.7)$$

The unique process having the above properties is

$$\mathrm{Fut}(t, T_F, T_B) = E^Q\left[ P(T_F, T_B) \mid \mathcal{F}_t \right], \quad 0 \le t \le T_F \le T_B, \qquad (1.8)$$

where $P(T_F, T_B)$ is the zero coupon bond price in equation (1.5) evaluated at time $T_F$. It is evident that the bond futures price given by equation (1.8) is a martingale, which follows by using the usual iterated conditioning argument.
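The martingale property of the futures price can be illustrated numerically. The sketch below again assumes single-factor Vasicek dynamics $dr_t = \kappa(\theta - r_t)\,dt + \sigma\,dW_t$ under $Q$ with hypothetical parameters, for which $r_s$ given $r_0$ is Gaussian and the bond price is exponential-affine in the short rate, so that the conditional expectation of $P(T_F, T_B)$ is a lognormal mean. All function names are ours. We sample $r_s$ from its transition distribution, evaluate the futures price at time $s$, and compare the sample average with the futures price at time zero.

```python
import math
import random


def vasicek_bond_price(r, tau, kappa, theta, sigma):
    """Closed-form Vasicek zero-coupon bond price for short rate r and time to maturity tau."""
    b = (1.0 - math.exp(-kappa * tau)) / kappa
    log_a = (theta - sigma ** 2 / (2.0 * kappa ** 2)) * (b - tau) - sigma ** 2 * b ** 2 / (4.0 * kappa)
    return math.exp(log_a - b * r)


def vasicek_moments(r, t, kappa, theta, sigma):
    """Mean and variance of the short rate t units of time ahead, given its current value r."""
    mean = theta + (r - theta) * math.exp(-kappa * t)
    var = sigma ** 2 * (1.0 - math.exp(-2.0 * kappa * t)) / (2.0 * kappa)
    return mean, var


def futures_price(r, time_to_delivery, bond_tenor, kappa, theta, sigma):
    """E^Q[P(T_F, T_B) | r_t = r] with T_F - t = time_to_delivery and T_B - T_F = bond_tenor."""
    mean, var = vasicek_moments(r, time_to_delivery, kappa, theta, sigma)
    b = (1.0 - math.exp(-kappa * bond_tenor)) / kappa
    a = vasicek_bond_price(0.0, bond_tenor, kappa, theta, sigma)   # bond price at r = 0
    return a * math.exp(-b * mean + 0.5 * b * b * var)             # lognormal mean


def mean_futures_price_at(s, r0, t_f, t_b, kappa, theta, sigma, n_paths=20000, seed=0):
    """Sample average of Fut(s, T_F, T_B), with r_s drawn from its transition distribution."""
    rng = random.Random(seed)                                      # fixed seed, illustration only
    mean, var = vasicek_moments(r0, s, kappa, theta, sigma)
    total = 0.0
    for _ in range(n_paths):
        r_s = rng.gauss(mean, math.sqrt(var))
        total += futures_price(r_s, t_f - s, t_b - t_f, kappa, theta, sigma)
    return total / n_paths


if __name__ == "__main__":
    params = dict(kappa=0.5, theta=0.05, sigma=0.01)               # hypothetical parameter values
    print("Fut(0)              :", futures_price(0.04, 1.0, 2.0, **params))
    print("average of Fut(0.5) :", mean_futures_price_at(0.5, 0.04, 1.0, 3.0, **params))
```

With a zero time to delivery the function `futures_price` reduces to the bond price, in line with the first property above, and the sample average of the time $s$ futures price should agree with the time zero futures price up to Monte Carlo error.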

1.4.3 The Pricing of a European Option on a Zero-Coupon Bond

We consider two types of European options, namely the put and the call. A European put option on a zero coupon bond is a contract that gives the holder the right to sell a unit of the bond at the time of option expiry, $T_O$, at a price called the strike price, denoted by $K$.

Definition 1.4.3. Let $\mathrm{EPZBond}(t, T_O, T_B, K)$ be the price at time $t$ of the European put option which expires at time $T_O$ on a zero-coupon bond which matures at time $T_B$, with strike price $K$, for $0 \le t \le T_O \le T_B$.

Then, in terms of $H_T$ in equation (1.4), the payoff has the form

$$H_{T_O} = \left( K - P(T_O, T_B) \right)^+,$$

where $X^+ = \max[0, X]$. Hence, the price at time $t$ is

$$\mathrm{EPZBond}(t, T_O, T_B, K) = E^Q\left[ e^{-\int_t^{T_O} r_s\,ds} \left( K - P(T_O, T_B) \right)^+ \,\Big|\, \mathcal{F}_t \right]. \qquad (1.9)$$

A European call option on the zero coupon bond has the same specification as the put, except that the contract gives the holder the right to buy a unit of the bond at the strike price.

Definition 1.4.4. Let $\mathrm{ECZBond}(t, T_O, T_B, K)$ be the price at time $t$ of the European call option which expires at time $T_O$ on a zero-coupon bond which matures at time $T_B$, with strike price $K$, for $0 \le t \le T_O \le T_B$.

Then, in terms of $H_T$, $H_{T_O} = \left( P(T_O, T_B) - K \right)^+$. Hence,

$$\mathrm{ECZBond}(t, T_O, T_B, K) = E^Q\left[ e^{-\int_t^{T_O} r_s\,ds} \left( P(T_O, T_B) - K \right)^+ \,\Big|\, \mathcal{F}_t \right]. \qquad (1.10)$$

The European put and call options for the same zero coupon bond, option expiry date, and strike price are related. This relationship is called put-call parity and is given by

EPZBond(t, To, TB, K) = ECZBond(t7 To, TB, K) - P(t, TB)

+ KP{t,To).yte[0,To}. (1.11)

If this were not the case, a trader at some time t could make a profit and have no liability upon expiration of the contracts, which constitutes an arbitrage. For example, if there exists a time t where

\[ EPZBond(t,T_O,T_B,K) > ECZBond(t,T_O,T_B,K) - P(t,T_B) + K P(t,T_O), \tag{1.12} \]

a trader could sell a portfolio at time t that is long in the put, short in the call, long in a zero-coupon bond which matures at time TB with delivery date To, and short in a zero-coupon bond which matures at time To with a principal amount of K with delivery date To. This portfolio would yield him an instant profit of

\[ EPZBond(t,T_O,T_B,K) - ECZBond(t,T_O,T_B,K) + P(t,T_B) - K P(t,T_O) > 0, \]

thus constituting an arbitrage. Similarly, if the inequality in equation (1.12) were reversed, a trader could sell a portfolio at time t that is short in the put, long in the call, short in a zero-coupon bond which matures at time TB with delivery date To, and long in a zero-coupon bond which matures at time To with a principal amount of K with delivery date To, yielding him an instant profit at time t of

\[ ECZBond(t,T_O,T_B,K) - P(t,T_B) + K P(t,T_O) - EPZBond(t,T_O,T_B,K) > 0. \]

1.4.4 The Pricing of European Options on Zero-Coupon Bond Futures

European put and call options on zero-coupon bond futures differ from the corresponding European options on zero-coupon bonds presented in Section 1.4.3, in that the holder of the option establishes a position in the futures contract instead of selling or purchasing the underlying bond. The decision to exercise at the option expiry date is based on the comparison of the futures price and the strike price at that time. In the case of a European call option on zero-coupon bond futures, upon exercise, the option holder establishes a long position in the futures contract. In the case of a European put option on zero-coupon bond futures, upon exercise, the option holder establishes a short position in the futures contract.

Definition 1.4.5. Let the zero-coupon bond futures price at time t, Fut(t, TF, TB), be given by equation (1.8), and the strike price by K. Then the time t price of a European put option on zero-coupon bond futures which expires at time To is given by

\[ EPZBondFut(t,T_O,T_B,T_F,K) = E_Q\left[ e^{-\int_t^{T_O} r_s\,ds} \left( K - Fut(T_O,T_F,T_B) \right)^+ \Big|\, \mathcal{F}_t \right], \tag{1.13} \]

where $0 < t < T_O < T_F < T_B$.

Similarly, we have a definition for the time t price of the corresponding call option.

Definition 1.4.6. Let the zero-coupon bond futures price at time t, Fut(t, TF, TB), be given by equation (1.8), and the strike price by K. Then the time t price of a European call option on zero-coupon bond futures which expires at time To is given by

\[ ECZBondFut(t,T_O,T_B,T_F,K) = E_Q\left[ e^{-\int_t^{T_O} r_s\,ds} \left( Fut(T_O,T_F,T_B) - K \right)^+ \Big|\, \mathcal{F}_t \right], \tag{1.14} \]

where $0 < t < T_O < T_F < T_B$.

Equations (1.13) and (1.14) are also related via a put-call parity relationship given by

\[ EPZBondFut(t,T_O,T_B,T_F,K) = ECZBondFut(t,T_O,T_B,T_F,K) - Fut(t,T_F,T_B) + K P(t,T_O), \tag{1.15} \]

for 0 < t < To < TF < TB. If this were not the case, a trader at some time t could make a profit and have no liability upon expiration of the contracts, which constitutes an arbitrage. For example, if there exists a time t where

\[ EPZBondFut(t,T_O,T_B,T_F,K) > ECZBondFut(t,T_O,T_B,T_F,K) - Fut(t,T_F,T_B) + K P(t,T_O), \]

a trader could sell a portfolio at time t which takes a long position in the put option and in a zero-coupon bond futures contract, and a short position in the corresponding call option and zero-coupon bond with principal amount K. This portfolio yields an instant profit of

\[ 0 < EPZBondFut(t,T_O,T_B,T_F,K) - ECZBondFut(t,T_O,T_B,T_F,K) + Fut(t,T_F,T_B) - K P(t,T_O) \]

for the trader at time t without any liability at time To.

1.4.5 The Pricing of Coupon-Bearing Bonds

A coupon-bearing bond is a contract that ensures the payment at times $\mathcal{T} = \{T_1,\dots,T_n\}$ of the deterministic cash flows $c = \{c_1,\dots,c_n\}$. The last cash flow includes the reimbursement of the nominal value of the bond. The discounted cash flows of the coupon bond are given by

\[ \sum_{i=1}^{n} D(t,T_i)\, c_i. \]

The time t price of this bond is the conditional expectation of the discounted cash flows, conditional on $\mathcal{F}_t$, which is given by

\[ CB(t,\mathcal{T},c) = E_Q\left[ \sum_{i=1}^{n} D(t,T_i)\, c_i \,\Big|\, \mathcal{F}_t \right] = \sum_{i=1}^{n} c_i\, E_Q\left[ D(t,T_i) \mid \mathcal{F}_t \right] = \sum_{i=1}^{n} c_i\, P(t,T_i) \tag{1.16} \]

for $t < T_1$. The second equality follows by the linearity of conditional expectation, and the third follows by equation (1.5) with TB replaced by $T_i$ for each i.

A similar argument holds for $t \in (T_k, T_{k+1})$, except that the sum in equation (1.16) begins from i = k + 1.
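As a small illustration of equation (1.16), the sketch below prices a coupon-bearing bond as a cash-flow-weighted sum of zero-coupon bond prices. The flat discount curve, the function names, and the cash flows are assumptions chosen only for illustration; any zero-coupon price function P(t, T) could be passed in instead.

```python
import math

def coupon_bond_price(t, payment_times, cash_flows, zero_price):
    """Equation (1.16): sum of c_i * P(t, T_i) over the payments with T_i > t.

    Filtering on T_i > t covers the case t in (T_k, T_{k+1}), where the sum
    starts at i = k + 1.
    """
    return sum(c * zero_price(t, T)
               for T, c in zip(payment_times, cash_flows) if T > t)

def flat_curve(t, T, rate=0.05):
    """Hypothetical flat continuously compounded curve (illustration only)."""
    return math.exp(-rate * (T - t))

times = [1.0, 2.0, 3.0]
flows = [4.0, 4.0, 104.0]   # the last cash flow includes the nominal value

price_now = coupon_bond_price(0.0, times, flows, flat_curve)
price_after_first_coupon = coupon_bond_price(1.5, times, flows, flat_curve)
```

After the first coupon date only the remaining two cash flows contribute, which is the shifted sum described above.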

1.4.6 The Coupon Bond Futures Price

By applying equation (1.6) from the definition of the futures price given by Definition 1.4.2, we observe that the time t futures price on a coupon-bearing bond is given by the conditional expectation of the coupon-bond price at the settlement date of the futures contract, conditional on $\mathcal{F}_t$:

\[ CBF(t,\mathcal{T},c,T_F) = E_Q\left[ CB(T_F,\mathcal{T},c) \mid \mathcal{F}_t \right] = \sum_{i=1}^{n} c_i\, Fut(t,T_F,T_i), \tag{1.17} \]

for $t < T_F < T_1$. Similarly, if $t < T_k < T_F < T_{k+1}$, then the sum in equation (1.17) begins from i = k + 1.

1.4.7 The Pricing of European Options on Coupon-Bearing Bonds

To price a European put option on a coupon-bearing bond with strike price K and expiry To, consider the payoff at the time of option expiry given by

\[ \left[ K - CB(T_O,\mathcal{T},c) \right]^+ = \left[ K - \sum_{i=1}^{n} c_i\, P(T_O,T_i) \right]^+. \]

Jamshidian [20] devised a method to rewrite the positive part of the sum as a sum of positive parts. The idea is based on finding $r^*$ such that the following equation holds,

\[ \sum_{i=1}^{n} c_i\, P(T_O,T_i;r^*) = K, \]

and rewriting the payoff function as

\[ \left[ \sum_{i=1}^{n} c_i \left( P(T_O,T_i;r^*) - P(T_O,T_i;r(T_O)) \right) \right]^+. \]

Jamshidian [20] derived a sufficient condition to arrive at the desired decomposition, which is given by

\[ \frac{\partial P(t,s;r)}{\partial r} < 0, \quad \forall t \in (0,s). \]

Hence, under this condition, the payoff can be expressed as

\[ \sum_{i=1}^{n} c_i \left[ P(T_O,T_i;r^*) - P(T_O,T_i;r(T_O)) \right]^+, \]

so that pricing the put option becomes equivalent to valuing a portfolio of put options on zero-coupon bonds. Now, taking the risk-neutral expectation of the discounted payoff yields

\[ EPCBond(t,T_O,\mathcal{T},c,K) = \sum_{i=1}^{n} c_i\, EPZBond(t,T_O,T_i,P(T_O,T_i;r^*)), \tag{1.18} \]

where $EPZBond(t,T_O,T_i,P(T_O,T_i;r^*))$ is the time t price of the European put option on a zero-coupon bond with option expiry time $T_O$, bond maturity time $T_i$, and strike price $P(T_O,T_i;r^*)$. A similar equation holds for the call:

\[ ECCBond(t,T_O,\mathcal{T},c,K) = \sum_{i=1}^{n} c_i\, ECZBond(t,T_O,T_i,P(T_O,T_i;r^*)). \tag{1.19} \]
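The decomposition can be checked numerically. The following is a minimal sketch, assuming a hypothetical exponential-affine bond price $P(T_O,T_i;r) = e^{a_i - b_i r}$ with $b_i > 0$ (so the monotonicity condition holds); the coefficients, cash flows, strike, and the bisection bracket are illustrative values, not taken from the thesis. It solves for $r^*$ and compares the put payoff on the coupon bond with the payoff of the portfolio of zero-coupon bond puts.

```python
import math

def zero_price(a, b, r):
    """Hypothetical bond price exp(a - b r), strictly decreasing in r when b > 0."""
    return math.exp(a - b * r)

def find_r_star(cash_flows, a, b, strike, lo=-1.0, hi=1.0, tol=1e-14):
    """Bisection for r* solving sum_i c_i P(T_O, T_i; r*) = K."""
    def excess(r):
        return sum(c * zero_price(ai, bi, r)
                   for c, ai, bi in zip(cash_flows, a, b)) - strike
    if not (excess(lo) > 0.0 > excess(hi)):
        raise ValueError("bracket [lo, hi] does not contain r*")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid      # coupon bond still worth more than K: raise r
        else:
            hi = mid
    return 0.5 * (lo + hi)

cash_flows = [5.0, 5.0, 105.0]
a = [-0.001, -0.004, -0.009]   # illustrative, uncalibrated coefficients
b = [0.95, 1.80, 2.55]
strike = 100.0

r_star = find_r_star(cash_flows, a, b, strike)
zero_strikes = [zero_price(ai, bi, r_star) for ai, bi in zip(a, b)]

def coupon_bond_put_payoff(r):
    bond = sum(c * zero_price(ai, bi, r) for c, ai, bi in zip(cash_flows, a, b))
    return max(strike - bond, 0.0)

def zero_bond_put_portfolio_payoff(r):
    return sum(c * max(k_i - zero_price(ai, bi, r), 0.0)
               for c, k_i, ai, bi in zip(cash_flows, zero_strikes, a, b))
```

Because every bond price moves in the same direction as r changes, all the individual puts are in or out of the money together, which is why the two payoffs coincide for every value of r.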

1.4.8 The Pricing of European Options on Coupon-Bearing Bond Futures

The derivation of the pricing formula for European options on coupon-bearing bond futures is similar to the derivation for European options on coupon-bearing bonds presented in the previous section. Consider the payoff at the time of expiry of the put option given by

\[ \left[ K - CBF(T_O,\mathcal{T},c,T_F) \right]^+ = \left[ K - \sum_{i=1}^{n} c_i\, Fut(T_O,T_F,T_i) \right]^+, \]

where $CBF(t,\mathcal{T},c,T_F)$ is given by equation (1.17), and is the time t coupon bond futures price with cash flows c paid at times $\mathcal{T}$. If we can find $r^*$ such that

\[ \sum_{i=1}^{n} c_i\, Fut(T_O,T_F,T_i;r^*) = K, \]

then pricing the put option becomes equivalent to valuing a portfolio of put options on zero-coupon bond futures. Following Jamshidian [20], if

\[ \frac{\partial Fut(t,T_F,T_B;r)}{\partial r} < 0, \quad \forall t \in (0,T_F), \]

then we can achieve the decomposition of the put price given by

\[ EPBondFut(t,T_O,\mathcal{T},c,T_F,K) = \sum_{i=1}^{n} c_i\, EPZBondFut(t,T_O,T_i,T_F,Fut(T_O,T_F,T_i;r^*)). \]

A similar equation holds for the call:

\[ ECBondFut(t,T_O,\mathcal{T},c,T_F,K) = \sum_{i=1}^{n} c_i\, ECZBondFut(t,T_O,T_i,T_F,Fut(T_O,T_F,T_i;r^*)). \tag{1.22} \]

1.5 Computation of Pricing Formulae

As mentioned in Section 1.3, the stochastic evolution of the short rate shall be modeled either by a single factor or multi-factor model. Hence, the pricing formulae for the interest rate assets and derivatives presented in Section 1.4 are functions of the factors at time t that drive the short rate. This is due to the property that the solutions of the SDEs we shall consider in this thesis are Markov processes. In this section, we shall present the Black-Scholes method3, B-S method hereafter, and the change of measure approach to obtaining analytic expressions for the pricing formulae presented in Section 1.4. The latter approach is a special case of the general approach called the change of numeraire developed by Geman et al. [15], and is feasible whenever the factor model has a known transition distribution. It makes use of the Ito-Doeblin formula for an Ito process and Girsanov's theorem, given by Theorems A.1.5 and A.1.6 respectively. To illustrate the techniques, we shall consider the case of a single factor model driving the short rate.

3 The Black-Scholes method is also known as the partial differential equations method.

1.5.1 Computing the Price of a Contingent Claim

First we shall consider the general setting of an interest rate contingent claim described in the beginning of Section 1.4, whose pricing formula is given by equation (1.4). This level of generality encompasses the zero-coupon bond price and European options cases. We shall consider a futures contract on interest rate derivatives in the next section. Let the SDE for the short rate be given by

\[ dr_t = \beta(t,r_t)\,dt + \gamma(t,r_t)\,dW_t, \tag{1.23} \]

where W is a Brownian motion under a risk-neutral probability measure Q. Since the solution to the SDE in equation (1.23) is a Markov process, equation (1.4) can be expressed as

\[ V(t,r_t) = E_Q\left[ D(t,T) H_T \mid \mathcal{F}_t \right], \quad 0 \le t \le T. \]

By Definition 1.3.4, the discounted price process given by

\[ D(0,t)\,V(t,r_t), \quad 0 \le t \le T, \tag{1.24} \]

is a martingale under Q. If the function V(t, r) is sufficiently differentiable, then we can determine the partial differential equation, PDE hereafter, that describes this function. We derive the SDE for the process in equation (1.24) using the Ito-Doeblin formula for an Ito process and the Ito product rule, which are stated in the appendix to this chapter. Next, we set the dt term equal to zero. The differential of the process in equation (1.24) is

\begin{align}
d\left( D(0,t)V(t,r_t) \right) &= V(t,r_t)\,dD(0,t) + D(0,t)\,dV(t,r_t) \tag{1.25} \\
&= D(0,t)\left[ -r_t V(t,r_t)\,dt + V_t(t,r_t)\,dt + V_r(t,r_t)\,dr_t + \tfrac{1}{2} V_{rr}(t,r_t)\,d[r,r](t) \right] \tag{1.26} \\
&= D(0,t)\left[ -r_t V(t,r_t) + V_t(t,r_t) + \beta(t,r_t) V_r(t,r_t) + \tfrac{1}{2}\gamma^2(t,r_t) V_{rr}(t,r_t) \right] dt \notag \\
&\quad + D(0,t)\,\gamma(t,r_t)\, V_r(t,r_t)\,dW_t, \tag{1.27}
\end{align}

where $V_x$, $V_{xx}$ denote the first and second partial derivatives of the function V with respect to the variable x. Note that in equation (1.25) we used the fact that the cross variation of the discount factor and the price process is zero, which holds since the discount process is a process of bounded variation. Now, since the discounted price process of the derivative security is a martingale under the risk-neutral measure, the dt term in equation (1.27) must equal zero, which yields the following PDE

\begin{align}
V_t(t,r) + \beta(t,r)\,V_r(t,r) + \tfrac{1}{2}\gamma^2(t,r)\,V_{rr}(t,r) &= r\,V(t,r) \tag{1.28} \\
V(T,r) &= H_T, \quad \forall r. \tag{1.29}
\end{align}

The PDE (1.28) applies to the interest rate derivatives discussed in Section 1.4 with their respective terminal payoffs, except for the futures price, since the futures price itself is a martingale under the risk-neutral measure. An alternative approach to obtaining the price function is to compute the conditional expectation given by equation (1.4) directly. In some cases, the risk-neutral measure Q as in Definition 1.3.4 is not necessarily the most convenient measure for pricing a contingent claim, as the calculation of the expectation can be considerably complicated. Hence, a change in probability measure may be helpful, and this is accomplished by changing the numeraire.

Definition 1.5.1 (Numeraire). A numeraire is any positive non-dividend paying asset.

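In order to compute the expectation in equation (1.4) by utilizing a change of measure, we shall make use of the Ito-Doeblin formula for an Ito process, and Girsanov's theorem. Let S be the price process of the numeraire, and let V(t, r) be the analytical price function of a contract that yields a cash flow of $S_T$ at time T. Writing β and σ for the drift and diffusion coefficients of the short rate and B for the Brownian motion under Q, and applying the Ito-Doeblin formula to the natural logarithm of V, we obtain

\begin{align}
\ln V(T,r_T) &= \ln V(t,r_t) + \int_t^T \left( \frac{V_t(u,r_u)}{V(u,r_u)} + \frac{V_r(u,r_u)}{V(u,r_u)}\beta(u,r_u) + \frac{1}{2}\frac{V_{rr}(u,r_u)}{V(u,r_u)}\sigma^2(u,r_u) \right) du \notag \\
&\quad - \frac{1}{2}\int_t^T \left( \frac{V_r(u,r_u)}{V(u,r_u)} \right)^2 \sigma^2(u,r_u)\,du + \int_t^T \frac{V_r(u,r_u)}{V(u,r_u)}\sigma(u,r_u)\,dB_u. \tag{1.30}
\end{align}

Subtracting $\int_t^T r_u\,du$ and $\ln V(t,r_t)$ from both sides of equation (1.30) and exponentiating yields

\begin{align}
\frac{e^{-\int_t^T r_u\,du}\, V(T,r_T)}{V(t,r_t)} &= \exp\Bigg\{ \int_t^T \left( \frac{V_t(u,r_u)}{V(u,r_u)} + \frac{V_r(u,r_u)}{V(u,r_u)}\beta(u,r_u) - r_u + \frac{1}{2}\frac{V_{rr}(u,r_u)}{V(u,r_u)}\sigma^2(u,r_u) \right) du \notag \\
&\quad - \frac{1}{2}\int_t^T \left( \frac{V_r(u,r_u)}{V(u,r_u)} \right)^2 \sigma^2(u,r_u)\,du + \int_t^T \frac{V_r(u,r_u)}{V(u,r_u)}\sigma(u,r_u)\,dB_u \Bigg\}. \tag{1.31}
\end{align}

Observing that the integrand

\[ \frac{1}{V(u,r)} \left( V_t(u,r) + V_r(u,r)\beta(u,r) - rV(u,r) + \frac{1}{2} V_{rr}(u,r)\sigma^2(u,r) \right) \tag{1.32} \]

is equal to zero, since the expression in the brackets in equation (1.32) vanishes by equation (1.28), we obtain

\[ \frac{e^{-\int_t^T r_u\,du}\, V(T,r_T)}{V(t,r_t)} = \exp\left\{ -\frac{1}{2}\int_t^T \left( \frac{V_r(u,r_u)}{V(u,r_u)} \right)^2 \sigma^2(u,r_u)\,du + \int_t^T \frac{V_r(u,r_u)}{V(u,r_u)}\sigma(u,r_u)\,dB_u \right\}. \tag{1.33} \]

Finally, since $V(T,r_T) = S_T$, we use Girsanov's theorem to establish from equation (1.33) that

\[ \frac{dQ^S}{dQ} = \frac{e^{-\int_t^T r_u\,du}\, S_T}{V(t,r_t)}, \quad 0 \le t \le T, \tag{1.34} \]

defines a probability measure $Q^S$ under which

\[ dB_t^S = -\frac{V_r(t,r_t)}{V(t,r_t)}\sigma(t,r_t)\,dt + dB_t \tag{1.35} \]

is a Brownian motion; substituting (1.35) into the SDE for the short rate gives its new drift under $Q^S$. Hence, we obtain the following chain of equalities

\[ E_Q\left[ D(t,T) H_T \mid \mathcal{F}_t \right] = E_Q\left[ \frac{D(t,T) S_T}{V(t,r_t)} \cdot \frac{V(t,r_t) H_T}{S_T} \,\Big|\, \mathcal{F}_t \right] = V(t,r_t)\, E_{Q^S}\left[ \frac{H_T}{S_T} \,\Big|\, \mathcal{F}_t \right]. \tag{1.36} \]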

The forward measure is obtained when the zero-coupon bond with unit face value and maturity date T is taken as numeraire. Thus, specializing equation (1.34) we obtain

\[ \frac{dQ^T}{dQ} = \frac{e^{-\int_t^T r_u\,du}}{P(t,T)}, \quad 0 \le t \le T, \tag{1.37} \]

where $Q^T$ is called the T-forward measure, and by equation (1.35)

\[ dB_t^T = -\frac{P_r(t,r_t)}{P(t,r_t)}\sigma(t,r_t)\,dt + dB_t \tag{1.38} \]

is a $Q^T$ Brownian motion. Hence the price of the claim

\[ \pi_t = E_Q\left[ e^{-\int_t^T r_s\,ds} H_T \,\Big|\, \mathcal{F}_t \right], \tag{1.39} \]

after multiplying and dividing by the analytical bond price function at time t with maturity T, and since this function is $\mathcal{F}_t$-measurable, yields

\[ \pi_t = P(t,T)\, E_Q\left[ \frac{e^{-\int_t^T r_s\,ds}}{P(t,T)} H_T \,\Big|\, \mathcal{F}_t \right]. \tag{1.40} \]

Now, by Girsanov's theorem we obtain

\[ \pi_t = P(t,T)\, E_{Q^T}\left[ H_T \mid \mathcal{F}_t \right], \tag{1.41} \]

which may simplify the computation of the conditional expectation.

1.5.2 Computing Futures Prices

When considering zero-coupon bond futures, we must realize that it is the futures price itself which is a Q martingale. Hence, in the context of equation (1.8), let V(t, r) be given as in Section 1.5.1,

\[ V(t,r_t) = E_Q\left[ P(T_F,T_B;r_{T_F}) \mid \mathcal{F}_t \right], \]

where $P(T_F,T_B;r_{T_F})$ denotes the analytical zero-coupon bond price formula evaluated at time $T_F$. The differential of the futures price is obtained via an application of the Ito-Doeblin formula to the function V(t, r), from which we set the dt term equal to zero to obtain the PDE characterizing the function, given by

\begin{align}
V_t(t,r) + \beta(t,r)\,V_r(t,r) + \tfrac{1}{2}\gamma^2(t,r)\,V_{rr}(t,r) &= 0 \tag{1.42} \\
V(T_F,r) &= P(T_F,T_B;r), \quad \forall r. \tag{1.43}
\end{align}

1.6 Concluding points

This chapter presented a mathematical structure of the pricing of interest rate derivatives. Based on this structure, we defined the pricing formulas for the interest rate derivatives that we shall consider in this thesis. Also we defined the tools we shall utilize for the computation of the pricing formulas. The next chapter shall present the interest rate models that we shall utilize in this thesis, and their respective pricing formulae for the interest rate derivatives discussed in Section 1.4. Keeping in mind that our goal is to price American interest rate derivatives when using multi-factor models of the term structure of interest rates, the formulae and tools presented in this chapter provide the bridge to help us simulate sample paths of the underlying asset price.

Chapter 2

Affine Term Structure Models

2.1 Introduction

Our goal is to price American interest rate derivatives when using multi-factor models of the term structure of interest rates. We shall present the different models of the term structure that we shall work with in the present chapter. First, we shall carefully define the different concepts arising in interest rate theory in order to define the term structure of interest rates, which is also known as the yield curve, and make the connection with affine term structure models. For each model we shall present the pricing formulas of the interest rate assets and derivatives.

2.2 Concepts and Terminology

In order to define the concept of the term structure of interest rates, we need to introduce an interest rate that exists in financial markets. Zero-coupon bond prices are the most basic quantities in interest rate theory since all interest rates can be defined in terms of these prices, and in turn, zero-coupon bond prices can be defined in terms of any given family of interest rates. The following definitions are from Brigo and Mercurio [5].

Definition 2.2.1. The continuously-compounded spot interest rate prevailing at time t for the maturity T is denoted by R(t, T) and is the constant rate at which an investment of P(t, T) units of currency at time t accrues continuously to yield a unit amount of currency at maturity T, that is,

\[ R(t,T) = -\frac{\ln P(t,T)}{T - t}, \tag{2.1} \]

where P(t, T) is the price at time t of a zero-coupon bond with maturity T, which was introduced in Section 1.4.

From Definition 2.2.1, the continuously compounded interest rate is a constant rate that is related to zero-coupon bond prices, and we can express this price in terms of the given rate by re-arranging equation (2.1):

\[ P(t,T) = e^{-R(t,T)(T - t)}. \tag{2.2} \]

The term structure of interest rates is a curve that is obtained from market data of spot interest rates. It is also known as the zero-coupon curve and the yield curve. This curve is the graph of the function mapping maturities of zero-coupon bonds into rates at time t.

Definition 2.2.2 (term structure of interest rates). The term structure of interest rates at time t is the graph of the function

\[ T \mapsto R(t,T). \]
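To make Definitions 2.2.1 and 2.2.2 concrete, the following sketch converts a handful of zero-coupon bond prices into continuously compounded spot rates and back. The prices are hypothetical values chosen for illustration, not market data.

```python
import math

def spot_rate(price, t, T):
    """Equation (2.1): R(t, T) = -ln P(t, T) / (T - t)."""
    return -math.log(price) / (T - t)

def zero_price(rate, t, T):
    """Equation (2.2): P(t, T) = exp(-R(t, T) (T - t))."""
    return math.exp(-rate * (T - t))

# Hypothetical zero-coupon bond prices observed at t = 0, keyed by maturity T.
prices = {0.5: 0.985, 1.0: 0.965, 2.0: 0.920, 5.0: 0.790}

# The term structure at t = 0 is the map T -> R(0, T).
term_structure = {T: spot_rate(p, 0.0, T) for T, p in prices.items()}
```

Plotting the entries of `term_structure` against maturity would give the yield curve at time 0 for these assumed prices.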

Affine term structure models are interest rate models where the continuously compounded spot rate R(t, T) is an affine function of the factors driving the short rate. For example, in the case of a single factor model where the factor is the short rate, we would specify that

\[ R(t,T) = a(t,T) + b(t,T)\, r_t \tag{2.3} \]

holds for all t and maturities T, where the functions a and b are deterministic functions of time and bond maturities. In the multi-factor case, say N factors, we would specify that

\[ R(t,T) = \delta(t,T) + \epsilon(t,T) \cdot X_t, \tag{2.4} \]

where X is an N-dimensional vector valued process, $\epsilon(t,T) = [\epsilon_1(t,T),\dots,\epsilon_N(t,T)]$, and $\delta(t,T)$ is a deterministic function of time.

Dai and Singleton [12] show how to classify an N-factor affine term structure model, ATSM hereafter, into N + 1 subfamilies based on two criteria. An N-factor ATSM is said to be admissible if it allows a well defined bond price that is an exponential affine function of the N factors. An N-factor ATSM is said to be maximally flexible if it nests econometrically all other models in the subfamily. The notation $A_M(N)$ that Dai and Singleton [12] introduced denotes the class of N-factor ATSMs, with the number M representing the number of state variables driving the conditional variances of the system of SDEs. The models we shall consider are members of the following classes: $A_0(1)$, $A_0(2)$, $A_0(3)$, $A_1(1)$, $A_2(2)$, $A_1(3)$, $A_2(3)$, $A_3(3)$.

Given $A_M(N)$, for some N and M, we have different ways to represent a model in this class. The different representations we shall consider are the maximal Ac form, the Ay form, and the maximal Ar form. The Ac form is also known as the canonical form and was introduced in Dai and Singleton [12]. It has the conditional variances of the factors controlled by the first M factors, which are conditionally uncorrelated, and has the short rate as an affine function of the factors. The maximal Ar form, also known as the affine in r form, has the same specification as the Ac form except that one of the factors is the short rate itself. In the Ay form, which is also known as the affine in y form, the factors can be conditionally correlated but have no unconditional correlation, and the short rate is affine in the N factors. For the models in the classes $A_1(3)$ and $A_2(3)$ we shall utilize the maximal Ar form, while we utilize the maximal Ac form to describe the models in the $A_1(1)$, $A_2(2)$, and $A_3(3)$ classes. Finally, we have an Ay form to describe models in $A_0(1)$, $A_0(2)$, and $A_0(3)$.

2.3 One-Factor Short Rate Models

In this section we shall present the analytical price functions and their derivations for the interest rate derivatives and assets presented in Section 1.4 when the short rate is driven by a single factor model. We consider two one-factor models, which are the Vasicek model and the Cox, Ingersoll and Ross model. The presentation of the former is based on Babbs and Nowman [2], and the latter is based on Chen and Scott [9]. In both models presented in this section we consider the special case of a single factor, hence a model in $A_0(1)$ and $A_1(1)$ respectively. We assume that there is a complete probability space $(\Omega, \mathcal{F}, P)$, an augmented filtration $\{\mathcal{F}_t : t \ge 0\}$ generated by a standard Brownian motion W in $\mathbb{R}$, and a risk-neutral measure Q.

The model introduced in Vasicek [29] uses a single factor, modeled by an Ornstein-Uhlenbeck process, to describe the short rate. Following Babbs and Nowman [2], the factor process is given by

\[ dX_t = -\xi X_t\,dt + \sigma\,dW_t, \tag{2.5} \]

where $W_t$, $t \ge 0$, is a standard Brownian motion, and ξ and σ are positive constants. The short rate is given by

\[ r_t = \mu - X_t, \tag{2.6} \]

where μ can be an adapted process, but is a constant parameter in this case. In order to price our interest rate derivatives, we must change measure from the objective measure P to the risk-neutral measure Q by using Girsanov's theorem to determine the new drift, where

\[ dB_t = \theta\,dt + dW_t \]

is a Brownian motion under the Q measure, and θ is the market price of risk parameter. The dynamics of the factor process under the Q measure are given by

\[ dX_t = \xi\left( -\frac{\theta\sigma}{\xi} - X_t \right) dt + \sigma\,dB_t, \]

from which we can derive the SDE for the short rate given by

\[ dr_t = \xi\left( \frac{\theta\sigma + \mu\xi}{\xi} - r_t \right) dt + \sigma\,dB_t, \tag{2.7} \]

where $\sigma, \xi, \theta\sigma + \mu\xi > 0$. The model in equation (2.7) can be solved explicitly by an application of the Ito-Doeblin formula to the process r with the function f(t, x) given by

\[ f(t,x) = e^{-\xi(t-u)} r_u + \left( \frac{\theta\sigma + \mu\xi}{\xi} \right)\left( 1 - e^{-\xi(t-u)} \right) + \sigma e^{-\xi t} \int_u^t e^{\xi s}\,dB_s, \tag{2.8} \]

yielding

\[ r_t = e^{-\xi(t-u)} r_u + (\theta\sigma + \mu\xi) \int_u^t e^{-\xi(t-s)}\,ds + \sigma \int_u^t e^{-\xi(t-s)}\,dB_s, \quad t > u. \tag{2.9} \]

Hence, $r_t$ conditional on $r_u$, $t > u$, under the risk-neutral measure is normally distributed:

\[ r_t \mid r_u \sim N\left( r_u e^{-\xi(t-u)} + \left( \frac{\theta\sigma + \mu\xi}{\xi} \right)\left( 1 - e^{-\xi(t-u)} \right),\ \frac{\sigma^2}{2\xi}\left( 1 - e^{-2\xi(t-u)} \right) \right). \tag{2.10} \]
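Because the transition law (2.10) is Gaussian, the short rate can be sampled exactly over any time step, with no discretization of the SDE. The sketch below, using only the Python standard library, computes the conditional mean and variance, draws exact samples, and evaluates the probability of a negative rate. The parameter values are assumptions for illustration, not estimates from the thesis.

```python
import math
import random

def vasicek_moments(r_u, dt, xi, sigma, theta, mu):
    """Mean and variance of r_t | r_u from equation (2.10), with dt = t - u."""
    level = (theta * sigma + mu * xi) / xi
    decay = math.exp(-xi * dt)
    mean = r_u * decay + level * (1.0 - decay)
    var = sigma ** 2 / (2.0 * xi) * (1.0 - decay ** 2)
    return mean, var

def prob_negative(mean, var):
    """Q(r_t < 0 | r_u) for a normal law, using the error function."""
    return 0.5 * (1.0 + math.erf(-mean / math.sqrt(2.0 * var)))

# Illustrative parameter values (assumptions).
xi, sigma, theta, mu = 0.5, 0.02, 0.1, 0.04
mean, var = vasicek_moments(r_u=0.03, dt=1.0, xi=xi, sigma=sigma, theta=theta, mu=mu)

rng = random.Random(0)                       # fixed seed for reproducibility
samples = [rng.gauss(mean, math.sqrt(var)) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
p_neg = prob_negative(mean, var)
```

The strictly positive value of `p_neg` is the numerical counterpart of the observation that the Vasicek short rate takes negative values with positive probability.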

The Gaussian law in equation (2.10) implies that the short rate can be negative with positive probability, which is an undesirable property because interest rates are generally expected to be non-negative. A remedy for this defect was proposed in Cox, Ingersoll, and Ross [11], CIR hereafter. Briefly, CIR [11] developed a general equilibrium framework in which, if the change in production opportunities follows a mean-reverting square root diffusion, then the short rate does too. The SDE for the short rate is given by

\[ dr_t = k(\theta - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t, \]

where $W_t$, $t \ge 0$, is a Brownian motion and $k, \theta, \sigma$ are positive constants. To see why the interest rate cannot be negative in this model, consider what happens when $r_t$ is zero: the diffusion term vanishes and the drift, being positive due to the parameter restrictions, pushes the short rate back into positive territory. Since we assume that a risk-neutral measure exists, following Chen and Scott [9] the risk premium process is assumed to be

\[ \lambda r_t, \quad t \ge 0, \]

where λ is a constant, from which

\[ dB_t = \frac{\lambda}{\sigma}\sqrt{r_t}\,dt + dW_t \]

is a Brownian motion under the risk-neutral measure. Thus, by virtue of Girsanov's theorem, the SDE for the short rate under the risk-neutral measure becomes

\[ dr_t = (k + \lambda)\left( \frac{k\theta}{k + \lambda} - r_t \right) dt + \sigma\sqrt{r_t}\,dB_t. \tag{2.11} \]

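The transition density of the square root diffusion was first solved by Feller [13], and in the context of equation (2.11) is given by

\[ r_t \mid r_u \sim \frac{\sigma^2\left( 1 - e^{-(k+\lambda)(t-u)} \right)}{4(k+\lambda)}\ \chi^2\left( \frac{4(k+\lambda)\, e^{-(k+\lambda)(t-u)}}{\sigma^2\left( 1 - e^{-(k+\lambda)(t-u)} \right)}\, r_u,\ \frac{4k\theta}{\sigma^2} \right), \quad t > u, \tag{2.12} \]

where $\chi^2(m,n)$ denotes the non-central chi-square distribution with non-centrality parameter m and n degrees of freedom. Let $p_{\chi^2(m,n)}$ be the density of the non-central chi-square random variable; it is given by

\[ p_{\chi^2(m,n)}(z) = \sum_{i=0}^{\infty} \frac{e^{-m/2}\,(m/2)^i}{i!}\ p_{\Gamma(i + n/2,\, 1/2)}(z), \]

where $p_{\Gamma(l,q)}(z)$ is the density of the gamma distribution with parameters l and q.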
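The series representation of the non-central chi-square density can be evaluated directly. The sketch below, a standard-library illustration under assumed parameter values, truncates the Poisson-weighted sum of gamma densities (taking the gamma parameters as shape and rate, which is an assumption about the convention), integrates it numerically, and checks that the implied mean of $r_t$ agrees with the mean of the square root diffusion, $r_u e^{-(k+\lambda)(t-u)} + \frac{k\theta}{k+\lambda}(1 - e^{-(k+\lambda)(t-u)})$.

```python
import math

def gamma_pdf(z, shape, rate):
    """Gamma density with shape and rate parameters, computed in logs."""
    if z <= 0.0:
        return 0.0
    log_pdf = (shape * math.log(rate) - math.lgamma(shape)
               + (shape - 1.0) * math.log(z) - rate * z)
    return math.exp(log_pdf)

def ncx2_pdf(z, nonc, dof, terms=120):
    """Truncated Poisson mixture of gamma densities (non-central chi-square)."""
    half = 0.5 * nonc
    total = 0.0
    for i in range(terms):
        log_weight = -half + i * math.log(half) - math.lgamma(i + 1)
        total += math.exp(log_weight) * gamma_pdf(z, i + 0.5 * dof, 0.5)
    return total

def cir_transition(r_u, dt, k, lam, theta, sigma):
    """Scale, non-centrality and degrees of freedom in equation (2.12)."""
    kappa = k + lam
    decay = math.exp(-kappa * dt)
    scale = sigma ** 2 * (1.0 - decay) / (4.0 * kappa)
    nonc = r_u * decay / scale
    dof = 4.0 * k * theta / sigma ** 2
    return scale, nonc, dof

# Illustrative parameters (assumptions).
k, lam, theta, sigma, r_u, dt = 0.5, 0.1, 0.05, 0.1, 0.03, 1.0
scale, nonc, dof = cir_transition(r_u, dt, k, lam, theta, sigma)

# Midpoint-rule integration of the density and of z * density.
steps, z_max = 3000, 150.0
h = z_max / steps
grid = [(j + 0.5) * h for j in range(steps)]
density = [ncx2_pdf(z, nonc, dof) for z in grid]
mass = h * sum(density)
mean_z = h * sum(z * p for z, p in zip(grid, density))
mean_rate = scale * mean_z
```

The truncation at 120 terms and the integration range are generous for these parameter values; other parameter choices may need wider settings.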

2.3.1 The Pricing of Zero-Coupon Bonds

We shall utilize the B-S method presented in Section 1.5, and the method of separation of variables from the theory of partial differential equations, to compute the analytical zero-coupon bond price function. Assume that the time t price of the bond with maturity TB is given by

\[ P(t,T_B;r_t) = e^{-B(t,T_B)\, r_t + A(t,T_B)}, \tag{2.13} \]

where the functions A and B are deterministic. We shall begin from a general specification of the SDE for the short rate and derive the PDE characterizing the price function. After that, we shall specialize the results obtained to the Vasicek and CIR models. If the evolution of the short rate under the risk-neutral measure is given by

\[ dr_t = \beta(t,r_t)\,dt + \gamma(t,r_t)\,dB_t, \]

then by the B-S method, we obtain the PDE

\begin{align}
P_t(t,T_B;r) + \beta(t,r)\,P_r(t,T_B;r) + \tfrac{1}{2}\gamma^2(t,r)\,P_{rr}(t,T_B;r) &= r\,P(t,T_B;r) \tag{2.14} \\
P(T_B,T_B;r) &= 1, \quad \forall r. \tag{2.15}
\end{align}

Now substituting the partial derivatives of the function $P(t,T_B;r)$,

\begin{align}
P_t(t,T_B;r) &= \left( -B'(t,T_B)\, r + A'(t,T_B) \right) P(t,T_B;r) \tag{2.16} \\
P_r(t,T_B;r) &= -B(t,T_B)\, P(t,T_B;r) \tag{2.17} \\
P_{rr}(t,T_B;r) &= B(t,T_B)^2\, P(t,T_B;r) \tag{2.18}
\end{align}

into equation (2.14), and dividing by $P(t,T_B;r)$, yields

\[ r = -B'(t,T_B)\, r + A'(t,T_B) - \beta(t,r)\,B(t,T_B) + \tfrac{1}{2}\gamma^2(t,r)\,B(t,T_B)^2. \tag{2.19} \]

In the case of the Vasicek model, equation (2.19) can be reduced to a system of ordinary differential equations by substituting in the drift and diffusion functions and matching the coefficients of r, yielding

\begin{align}
B'(t,T_B) &= -1 + \xi B(t,T_B) \tag{2.20} \\
B(T_B,T_B) &= 0 \tag{2.21} \\
A'(t,T_B) &= (\sigma\theta + \mu\xi)\,B(t,T_B) - \tfrac{1}{2}\sigma^2 B^2(t,T_B) \tag{2.22} \\
A(T_B,T_B) &= 0. \tag{2.23}
\end{align}

The solution to the above system of ODEs is

\begin{align}
B(t,T_B) &= \frac{1 - e^{-\xi\tau}}{\xi} \tag{2.24} \\
A(t,T_B) &= \left( \frac{\theta\sigma + \mu\xi}{\xi} - \frac{\sigma^2}{2\xi^2} \right)\left[ B(t,T_B) - T_B + t \right] - \frac{\sigma^2}{4\xi} B^2(t,T_B), \tag{2.25}
\end{align}

where $\tau = T_B - t$ is the time to maturity.

Since the transition density given in equation (2.10) is Gaussian, another approach to determine the function $P(t,T_B;r)$ is based on utilizing the probability distribution of the integral of the short rate to directly compute the conditional expectation given in equation (1.5) when the SDE is given by equation (2.7). Glasserman [16] and Chen [7] use this approach in their computations, and show that the integral of a Gaussian stochastic process is indeed Gaussian.
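The closed-form solution (2.24)-(2.25) can be checked against the PDE (2.14) numerically. The sketch below implements the Vasicek bond price and evaluates the PDE residual with central finite differences; the parameter values and step size are assumptions for illustration.

```python
import math

def vasicek_B(t, T, xi):
    """Equation (2.24)."""
    return (1.0 - math.exp(-xi * (T - t))) / xi

def vasicek_A(t, T, xi, sigma, theta, mu):
    """Equation (2.25)."""
    level = (theta * sigma + mu * xi) / xi
    B = vasicek_B(t, T, xi)
    return ((level - sigma ** 2 / (2.0 * xi ** 2)) * (B - (T - t))
            - sigma ** 2 / (4.0 * xi) * B ** 2)

def vasicek_bond(t, T, r, xi, sigma, theta, mu):
    """Equation (2.13): P = exp(-B r + A)."""
    return math.exp(-vasicek_B(t, T, xi) * r + vasicek_A(t, T, xi, sigma, theta, mu))

def pde_residual(t, T, r, xi, sigma, theta, mu, h=1e-4):
    """Central-difference residual of PDE (2.14) with the drift of (2.7)."""
    def P(tt, rr):
        return vasicek_bond(tt, T, rr, xi, sigma, theta, mu)
    P_t = (P(t + h, r) - P(t - h, r)) / (2.0 * h)
    P_r = (P(t, r + h) - P(t, r - h)) / (2.0 * h)
    P_rr = (P(t, r + h) - 2.0 * P(t, r) + P(t, r - h)) / h ** 2
    drift = theta * sigma + mu * xi - xi * r
    return P_t + drift * P_r + 0.5 * sigma ** 2 * P_rr - r * P(t, r)

# Illustrative parameters (assumptions).
params = dict(xi=0.5, sigma=0.02, theta=0.1, mu=0.04)
price = vasicek_bond(0.0, 3.0, 0.03, **params)
residual = pde_residual(0.5, 3.0, 0.03, **params)
```

A residual close to zero, up to finite-difference error, indicates that the pair (A, B) solves the ODE system (2.20)-(2.23).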

In the case of the CIR model, equation (2.19) gives rise to the following system of ODEs

\begin{align}
B'(t,T_B) &= (k + \lambda)\,B(t,T_B) + \tfrac{1}{2}\sigma^2 B^2(t,T_B) - 1 \tag{2.26} \\
A'(t,T_B) &= k\theta\,B(t,T_B) \tag{2.27} \\
B(T_B,T_B) &= 0 \tag{2.28} \\
A(T_B,T_B) &= 0, \tag{2.29}
\end{align}

and we can solve for the functions $B(t,T_B)$ and $A(t,T_B)$ explicitly, which are given by

\begin{align}
B(t,T_B) &= \frac{2\left( 1 - e^{-\gamma(T_B - t)} \right)}{2\gamma e^{-\gamma(T_B - t)} + (k + \lambda + \gamma)\left( 1 - e^{-\gamma(T_B - t)} \right)} \tag{2.30} \\
A(t,T_B) &= \frac{2k\theta}{\sigma^2} \log\left( \frac{2\gamma e^{(k + \lambda - \gamma)(T_B - t)/2}}{2\gamma e^{-\gamma(T_B - t)} + (k + \lambda + \gamma)\left( 1 - e^{-\gamma(T_B - t)} \right)} \right), \tag{2.31}
\end{align}

where

\[ \gamma = \sqrt{(k + \lambda)^2 + 2\sigma^2}. \tag{2.32} \]

Another approach to determine the function $P(t,T_B;r)$ in the single factor CIR case is based on the joint distribution of the short rate and its integral. Lamberton and Lapeyre [22] give the bivariate Laplace transform of the short rate and its integral, which characterizes their joint distribution. Notice that the system of ODEs for the zero-coupon bond price function under the Vasicek model is linear, whereas in the CIR case the system is non-linear due to the presence of the Riccati differential equation (2.26). This occurs since the single factor CIR model is a stochastic volatility model driven by a square root diffusion.
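As a sanity check on the CIR solution, the sketch below implements B and A in the form written in equations (2.30)-(2.32) and verifies numerically that they satisfy the Riccati equation (2.26), the linear equation (2.27), and the terminal conditions. The parameter values and finite-difference step are illustrative assumptions.

```python
import math

def cir_gamma(k, lam, sigma):
    """Equation (2.32)."""
    return math.sqrt((k + lam) ** 2 + 2.0 * sigma ** 2)

def cir_B(t, T, k, lam, sigma):
    """Equation (2.30)."""
    g = cir_gamma(k, lam, sigma)
    e = math.exp(-g * (T - t))
    return 2.0 * (1.0 - e) / (2.0 * g * e + (k + lam + g) * (1.0 - e))

def cir_A(t, T, k, lam, theta, sigma):
    """Equation (2.31)."""
    g = cir_gamma(k, lam, sigma)
    tau = T - t
    e = math.exp(-g * tau)
    numerator = 2.0 * g * math.exp(0.5 * (k + lam - g) * tau)
    denominator = 2.0 * g * e + (k + lam + g) * (1.0 - e)
    return 2.0 * k * theta / sigma ** 2 * math.log(numerator / denominator)

def cir_bond(t, T, r, k, lam, theta, sigma):
    """Equation (2.13) with the CIR coefficient functions."""
    return math.exp(-cir_B(t, T, k, lam, sigma) * r + cir_A(t, T, k, lam, theta, sigma))

def ode_residuals(t, T, k, lam, theta, sigma, h=1e-5):
    """Central-difference residuals of the ODEs (2.26) and (2.27)."""
    B = cir_B(t, T, k, lam, sigma)
    dB = (cir_B(t + h, T, k, lam, sigma) - cir_B(t - h, T, k, lam, sigma)) / (2.0 * h)
    dA = (cir_A(t + h, T, k, lam, theta, sigma)
          - cir_A(t - h, T, k, lam, theta, sigma)) / (2.0 * h)
    riccati = dB - ((k + lam) * B + 0.5 * sigma ** 2 * B ** 2 - 1.0)
    linear = dA - k * theta * B
    return riccati, linear

# Illustrative parameters (assumptions).
k, lam, theta, sigma = 0.5, 0.1, 0.05, 0.1
riccati_res, linear_res = ode_residuals(0.5, 3.0, k, lam, theta, sigma)
price = cir_bond(0.0, 3.0, 0.03, k, lam, theta, sigma)
```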

2.3.2 Zero-Coupon Bond Futures Price

Using the analytic zero-coupon bond price function, we could apply the B-S method again to compute the zero-coupon bond futures price for both models. In the Vasicek model, we shall compute the conditional expectation directly, and in the CIR model, we shall apply the B-S method.

Recall from equation (1.8) that $Fut(t,T_F,T_B)$ is the time t futures price, $T_F$ is the settlement date of this contract, and $T_B$ is the date of bond maturity. Then, given that the bond price is exponential affine,

\[ Fut(t,T_F,T_B) = E_Q\left[ P(T_F,T_B;r_{T_F}) \mid \mathcal{F}_t \right] = E_Q\left[ e^{-B(T_F,T_B)\, r_{T_F} + A(T_F,T_B)} \,\Big|\, \mathcal{F}_t \right], \tag{2.33} \]

where the functions A and B are defined by equations (2.25) and (2.24). Since $r_{T_F} \mid r_t$ is normally distributed with mean and variance given by equation (2.10), the bond price given by equation (2.13) is log-normally distributed, and we can compute equation (2.33) as

\[ Fut(t,T_F,T_B) = \exp\left\{ A(T_F,T_B) - B(T_F,T_B)\, E_Q\left[ r_{T_F} \mid \mathcal{F}_t \right] + \tfrac{1}{2} B^2(T_F,T_B)\, Var_Q\left[ r_{T_F} \mid \mathcal{F}_t \right] \right\}. \tag{2.34} \]

We observe that equation (2.34) is also an exponential affine function of the short rate at time t, given by

\[ Fut(t,T_F,T_B) = e^{-B_F(t,T_F,T_B)\, r_t + A_F(t,T_F,T_B)}. \tag{2.35} \]

Substituting in the mean and variance of $r_{T_F} \mid r_t$ in (2.34) yields:

\begin{align}
B_F(t,T_F,T_B) &= B(T_F,T_B)\, e^{-\xi(T_F - t)} \tag{2.36} \\
A_F(t,T_F,T_B) &= A(T_F,T_B) - B(T_F,T_B)\left( \frac{\theta\sigma + \mu\xi}{\xi} \right)\left( 1 - e^{-\xi(T_F - t)} \right) \notag \\
&\quad + \tfrac{1}{2} B^2(T_F,T_B)\, \frac{\sigma^2}{2\xi}\left( 1 - e^{-2\xi(T_F - t)} \right). \tag{2.37}
\end{align}
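The closed form (2.35)-(2.37) can be compared with a direct numerical evaluation of the expectation (2.33) against the Gaussian law (2.10). The following standard-library sketch does this with a midpoint rule; the parameter values and quadrature settings are assumptions for illustration.

```python
import math

def vasicek_B(t, T, xi):
    """Equation (2.24)."""
    return (1.0 - math.exp(-xi * (T - t))) / xi

def vasicek_A(t, T, xi, sigma, level):
    """Equation (2.25), with level = (theta sigma + mu xi) / xi."""
    B = vasicek_B(t, T, xi)
    return ((level - sigma ** 2 / (2.0 * xi ** 2)) * (B - (T - t))
            - sigma ** 2 / (4.0 * xi) * B ** 2)

def futures_closed_form(t, TF, TB, r, xi, sigma, level):
    """Equations (2.35)-(2.37)."""
    decay = math.exp(-xi * (TF - t))
    B = vasicek_B(TF, TB, xi)
    A = vasicek_A(TF, TB, xi, sigma, level)
    BF = B * decay
    AF = (A - B * level * (1.0 - decay)
          + 0.5 * B ** 2 * sigma ** 2 / (2.0 * xi) * (1.0 - decay ** 2))
    return math.exp(-BF * r + AF)

def futures_by_quadrature(t, TF, TB, r, xi, sigma, level, steps=4000, width=10.0):
    """Midpoint-rule evaluation of E[exp(A - B r_TF)] under the law (2.10)."""
    decay = math.exp(-xi * (TF - t))
    mean = r * decay + level * (1.0 - decay)
    sd = math.sqrt(sigma ** 2 / (2.0 * xi) * (1.0 - decay ** 2))
    B = vasicek_B(TF, TB, xi)
    A = vasicek_A(TF, TB, xi, sigma, level)
    h = 2.0 * width * sd / steps
    total = 0.0
    for j in range(steps):
        x = mean - width * sd + (j + 0.5) * h
        pdf = math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += math.exp(A - B * x) * pdf * h
    return total

# Illustrative parameters (assumptions): level = (theta sigma + mu xi) / xi.
xi, sigma, level = 0.5, 0.02, 0.044
closed = futures_closed_form(0.0, 1.0, 3.0, 0.03, xi, sigma, level)
numeric = futures_by_quadrature(0.0, 1.0, 3.0, 0.03, xi, sigma, level)
```

At the settlement date the decay factor equals one, so the closed form collapses to the bond price, in line with the terminal condition of the futures contract.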

In the CIR case, we shall utilize the B-S method to obtain the bond futures price. The PDE for the futures price is given by equation (1.42) with the function V(t, r) replaced with $Fut(t,T_F,T_B)$. That is,

\begin{align}
0 &= Fut_t(t,T_F,T_B;r) + (k + \lambda)\left( \frac{k\theta}{k + \lambda} - r \right) Fut_r(t,T_F,T_B;r) + \tfrac{1}{2}\sigma^2 r\, Fut_{rr}(t,T_F,T_B;r) \tag{2.38} \\
Fut(T_F,T_F,T_B;r) &= P(T_F,T_B;r), \quad \forall r. \tag{2.39}
\end{align}

Now, assume that the time t bond futures price is an exponential affine function of the short rate at time t, given by equation (2.35). Then substituting the partial derivatives

\begin{align}
Fut_t(t,T_F,T_B;r) &= \left( -B_F'(t,T_F,T_B)\, r + A_F'(t,T_F,T_B) \right) Fut(t,T_F,T_B;r) \tag{2.40} \\
Fut_r(t,T_F,T_B;r) &= -B_F(t,T_F,T_B)\, Fut(t,T_F,T_B;r) \tag{2.41} \\
Fut_{rr}(t,T_F,T_B;r) &= B_F^2(t,T_F,T_B)\, Fut(t,T_F,T_B;r) \tag{2.42}
\end{align}

in equation (2.38), we are able to reduce the PDE for the bond futures price to a system of ODEs characterizing the functions $A_F$ and $B_F$:

\begin{align}
B_F'(t,T_F,T_B) &= (k + \lambda)\,B_F(t,T_F,T_B) + \tfrac{1}{2}\sigma^2 B_F^2(t,T_F,T_B) \tag{2.43} \\
B_F(T_F,T_F,T_B) &= B(T_F,T_B) \tag{2.44} \\
A_F(t,T_F,T_B) &= A(T_F,T_B) - \int_t^{T_F} k\theta\, B_F(u,T_F,T_B)\,du \tag{2.45} \\
A_F(T_F,T_F,T_B) &= A(T_F,T_B), \tag{2.46}
\end{align}

where the functions A and B in this case are defined in equations (2.31) and (2.30). Thus, the solutions are given by

\begin{align}
B_F(t,T_F,T_B) &= -\frac{2(k + \lambda)}{\sigma^2}\, \frac{e^{(k+\lambda)t}}{e^{(k+\lambda)t} + \left( \frac{k + \lambda - J}{J} \right) e^{(k+\lambda)T_F}} \tag{2.47} \\
A_F(t,T_F,T_B) &= A(T_F,T_B) + \frac{2k\theta}{\sigma^2} \log\left( \frac{\left( \frac{k+\lambda}{J} \right) e^{(k+\lambda)T_F}}{e^{(k+\lambda)t} + \left( \frac{k + \lambda - J}{J} \right) e^{(k+\lambda)T_F}} \right), \tag{2.48}
\end{align}

where

\[ J = -\frac{\sigma^2}{2} B(T_F,T_B). \]
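The futures coefficient functions (2.47)-(2.48) can be verified in the same way as the bond price coefficients. The sketch below, under illustrative parameter assumptions, checks the terminal conditions (2.44) and (2.46), the Bernoulli-type equation (2.43), and the derivative form of (2.45), namely $A_F' = k\theta B_F$, using central finite differences.

```python
import math

def cir_gamma(k, lam, sigma):
    """Equation (2.32)."""
    return math.sqrt((k + lam) ** 2 + 2.0 * sigma ** 2)

def cir_B(t, T, k, lam, sigma):
    """Equation (2.30)."""
    g = cir_gamma(k, lam, sigma)
    e = math.exp(-g * (T - t))
    return 2.0 * (1.0 - e) / (2.0 * g * e + (k + lam + g) * (1.0 - e))

def cir_A(t, T, k, lam, theta, sigma):
    """Equation (2.31)."""
    g = cir_gamma(k, lam, sigma)
    tau = T - t
    e = math.exp(-g * tau)
    numerator = 2.0 * g * math.exp(0.5 * (k + lam - g) * tau)
    denominator = 2.0 * g * e + (k + lam + g) * (1.0 - e)
    return 2.0 * k * theta / sigma ** 2 * math.log(numerator / denominator)

def cir_futures_coeffs(t, TF, TB, k, lam, theta, sigma):
    """Equations (2.47) and (2.48); returns the pair (A_F, B_F)."""
    kappa = k + lam
    J = -0.5 * sigma ** 2 * cir_B(TF, TB, k, lam, sigma)
    denominator = math.exp(kappa * t) + (kappa - J) / J * math.exp(kappa * TF)
    BF = -(2.0 * kappa / sigma ** 2) * math.exp(kappa * t) / denominator
    AF = (cir_A(TF, TB, k, lam, theta, sigma)
          + 2.0 * k * theta / sigma ** 2
          * math.log((kappa / J) * math.exp(kappa * TF) / denominator))
    return AF, BF

def cir_futures(t, TF, TB, r, k, lam, theta, sigma):
    """Equation (2.35) with the CIR coefficient functions."""
    AF, BF = cir_futures_coeffs(t, TF, TB, k, lam, theta, sigma)
    return math.exp(-BF * r + AF)

# Illustrative parameters (assumptions).
k, lam, theta, sigma = 0.5, 0.1, 0.05, 0.1
TF, TB, t0, h = 1.0, 3.0, 0.4, 1e-5

AF0, BF0 = cir_futures_coeffs(t0, TF, TB, k, lam, theta, sigma)
AF_up, BF_up = cir_futures_coeffs(t0 + h, TF, TB, k, lam, theta, sigma)
AF_dn, BF_dn = cir_futures_coeffs(t0 - h, TF, TB, k, lam, theta, sigma)
residual_B = (BF_up - BF_dn) / (2.0 * h) - ((k + lam) * BF0 + 0.5 * sigma ** 2 * BF0 ** 2)
residual_A = (AF_up - AF_dn) / (2.0 * h) - k * theta * BF0
```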

2.3.3 Pricing European Options on Zero-Coupon Bonds

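In this section we shall compute the price of a European option on a zero-coupon bond for the Vasicek and CIR models. Using the characterization of these prices as conditional expectations in Section 1.4.3 and the forward measure approach, the fundamental pricing formula for the call option on the zero-coupon bond with option expiry time To, bond maturity time TB, and strike price K is given by

\[ ECZBond(t,T_O,T_B,K) = E_Q\left[ e^{-\int_t^{T_O} r_s\,ds} \left( P(T_O,T_B) - K \right)^+ \Big|\, \mathcal{F}_t \right]. \tag{2.49} \]

Then

\begin{align}
E_Q\left[ e^{-\int_t^{T_O} r_s\,ds} \left( P(T_O,T_B) - K \right)^+ \Big|\, \mathcal{F}_t \right] &= E_Q\left[ e^{-\int_t^{T_O} r_s\,ds}\, P(T_O,T_B)\, 1_{\{P(T_O,T_B) > K\}} \Big|\, \mathcal{F}_t \right] \notag \\
&\quad - K\, E_Q\left[ e^{-\int_t^{T_O} r_s\,ds}\, 1_{\{P(T_O,T_B) > K\}} \Big|\, \mathcal{F}_t \right]. \tag{2.50}
\end{align}

We can use the forward measure approach to rewrite the second term on the right hand side of equation (2.50) as

\[ K\, E_Q\left[ e^{-\int_t^{T_O} r_s\,ds}\, 1_{\{P(T_O,T_B) > K\}} \Big|\, \mathcal{F}_t \right] = K\, P(t,T_O)\, E_{Q^{T_O}}\left[ 1_{\{P(T_O,T_B) > K\}} \mid \mathcal{F}_t \right], \tag{2.51} \]

since the Radon-Nikodym derivative is given by

\[ \frac{dQ^{T_O}}{dQ} = \frac{e^{-\int_t^{T_O} r_u\,du}}{P(t,T_O)}, \quad 0 \le t \le T_O. \]

Using the forward measure, we may also write

\[ E_Q\left[ e^{-\int_t^{T_O} r_s\,ds}\, P(T_O,T_B)\, 1_{\{P(T_O,T_B) > K\}} \Big|\, \mathcal{F}_t \right] = P(t,T_O)\, E_{Q^{T_O}}\left[ P(T_O,T_B)\, 1_{\{P(T_O,T_B) > K\}} \mid \mathcal{F}_t \right]. \tag{2.52} \]

Substituting equations (2.51) and (2.52) into equation (2.50), we obtain the form of the analytical pricing function

\begin{align}
ECZBond(t,T_O,T_B,K) &= P(t,T_O)\, E_{Q^{T_O}}\left[ P(T_O,T_B)\, 1_{\{P(T_O,T_B) > K\}} \mid \mathcal{F}_t \right] \notag \\
&\quad - K\, P(t,T_O)\, E_{Q^{T_O}}\left[ 1_{\{P(T_O,T_B) > K\}} \mid \mathcal{F}_t \right]. \tag{2.53}
\end{align}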

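In the case of the Vasicek model, the short rate under the $Q^{T_B}$ probability measure is given by

\[ dr_t = \left( \theta\sigma + \mu\xi - B(t,T_B)\sigma^2 - \xi r_t \right) dt + \sigma\,dB_t^{T_B}, \tag{2.54} \]

where the function B is given by equation (2.24). Using the SDE (2.54), we can solve for an explicit expression of the short rate process at time t conditional on the information at time u for 0 < u < t < TB, that is,

\begin{align}
r_t &= r_u e^{-\xi(t-u)} + M(u,t) + \sigma \int_u^t e^{-\xi(t-s)}\,dB_s^{T_B} \tag{2.55} \\
M(u,t) &= \left( \frac{\theta\sigma + \mu\xi}{\xi} - \frac{\sigma^2}{\xi^2} \right)\left( 1 - e^{-\xi(t-u)} \right) + \frac{\sigma^2}{2\xi^2}\left( e^{-\xi(T_B - t)} - e^{-\xi(T_B + t - 2u)} \right). \tag{2.56}
\end{align}

Therefore the transition distribution of $r_t$ conditional on $\mathcal{F}_u$ is

\[ N\left( r_u e^{-\xi(t-u)} + M(u,t),\ \frac{\sigma^2}{2\xi}\left( 1 - e^{-2\xi(t-u)} \right) \right). \tag{2.57} \]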

Using equation (2.57), we may prove the following

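Lemma 2.3.1. Let the function P(t, TB) be the time t zero-coupon bond price with maturity TB arising in the Vasicek model, and To the time of expiry of the option. Then

\[ E_{Q^{T_O}}\left[ P(T_O,T_B)\, 1_{\{P(T_O,T_B) > K\}} \mid \mathcal{F}_t \right] = H(t,r_t)\,\Phi(h), \]

where

\begin{align}
H(t,r_t) &= \exp\left\{ A(T_O,T_B) - E_{Q^{T_O}}\left[ r_{T_O} \mid \mathcal{F}_t \right] B(T_O,T_B) + \tfrac{1}{2} B^2(T_O,T_B)\, Var_{Q^{T_O}}\left[ r_{T_O} \mid \mathcal{F}_t \right] \right\} \notag \\
r^* &= \frac{A(T_O,T_B) - \ln K}{B(T_O,T_B)} \tag{2.58} \\
x &= \frac{r^* - E_{Q^{T_O}}\left[ r_{T_O} \mid \mathcal{F}_t \right]}{\sqrt{Var_{Q^{T_O}}\left[ r_{T_O} \mid \mathcal{F}_t \right]}} \tag{2.59} \\
h &= x + B(T_O,T_B)\sqrt{Var_{Q^{T_O}}\left[ r_{T_O} \mid \mathcal{F}_t \right]}, \tag{2.60}
\end{align}

where the functions A and B are given by equations (2.25) and (2.24) respectively.

Lemma 2.3.2. Let the function P(t, TB) be the time t zero-coupon bond price with maturity TB arising in the Vasicek model, and To the time of expiry of the option. Then

\[ E_{Q^{T_O}}\left[ 1_{\{P(T_O,T_B) > K\}} \mid \mathcal{F}_t \right] = \Phi(x), \]

where x is given by equation (2.59), and $\Phi(\cdot)$ denotes the standard normal cumulative distribution function.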

Now, using Lemmas 2.3.1 and 2.3.2, we obtain the price at time t of a European call option on a zero coupon bond given by the following theorem.

Theorem 2.3.1. Let the function $P(t,T_B)$ be the time t zero-coupon bond price with maturity $T_B$ arising in the Vasicek model. Also let $T_0$ be the time of expiry of the option, and K the strike price. Then the price at time t of a European call option on a zero-coupon bond is given by

$$ECZBond(t,T_0,T_B,K) = P(t,T_0)\,H(t,r_t)\,\Phi(h) - K\,P(t,T_0)\,\Phi(x), \tag{2.61}$$

where $H(t,r_t)$, h, and x are defined in Lemma 2.3.1, and $\Phi(\cdot)$ denotes the standard normal cumulative distribution function.

Using the put-call parity relationship given by equation (1.12), we can obtain the analytic price for the corresponding put option.

In the case of the CIR model, the dynamics of the short rate under the $T_B$ forward measure are given by

$$dr_t = \left(k\theta - \left((k+\lambda) + B(t,T_B)\sigma^2\right) r_t\right)dt + \sigma\sqrt{r_t}\,dB_t^{T_B}, \tag{2.62}$$

where the function B is given by equation (2.30). The transition distribution under this forward measure of $r_t$ conditional on $\mathcal{F}_u$ is given by

Vr%u (X) = P>*(M(«,«)) (*)> (Z63> where,

q(u,t) = 2[cf>(u,t) + i[; + B(t,TB)}

6{t,u) = j-—r (2.64)

4>(u,t) - 2l a2 (eT(t-u) _ 1) 7 + (k + A) 1p • 2

and $\gamma$ is given by equation (2.32). That is, $r_t \mid r_u$ has the same distribution as a scaled non-central chi-square random variable, with $\nu$ degrees of freedom and non-centrality parameter $\delta(t,u)$. Equation (2.63) can also be expressed as follows

and hence, by applying the transformation

X(TB) = 2(^ + {t,TB))rTB, (2.66) conditional on r„, X(TB) is a non-central chi square variate with i/ degrees of freedom and S(t, TB) non-centrality parameter. This can be shown via the distribution function method and equation (2.65)

T S r <2 67 « ' <*™ W = 4* (^ ^ 2(V, + ^.TB))' ") ' ' >

and since q(t, TB) = 2(ib +

Lemma 2.3.3. Let $P(t,T_B)$ denote the price at time t of a zero-coupon bond with maturity $T_B$ as in Definition 1.4.1, and the short rate $r_t$ having the dynamics given by equation (2.11) under the risk-neutral measure Q. Then

$$E_{Q^{T_0}}\!\left[P(T_0,T_B)\,1_{\{P(T_0,T_B)>K\}}\,\middle|\,\mathcal{F}_t\right] = H\,\chi^2\!\left(2r^*[\phi+\psi+B(T_0,T_B)];\; \frac{4k\theta}{\sigma^2},\; \frac{2\phi^2 r_t e^{\gamma(T_0-t)}}{\phi+\psi+B(T_0,T_B)}\right), \tag{2.68}$$

where

\ q{To,t) J 2-,

a2 (e7(7b-0 _ I)" k + A + 7 w = (T2

g(To,0 = 2 [0+

: g(r0,i) + 2B(7o;rB)

r" = * B(T0;TB)' where 7 is givew 6y equation (2.32), and the functions A and B are given by Equa­ tions (2.31) and (2.30) respectively.

Lemma 2.3.4. Let $P(t,T_B)$ denote the price at time t of a zero-coupon bond with maturity $T_B$ as in Definition 1.4.1, and the short rate $r_t$ having the dynamics given by equation (2.11) under the risk-neutral measure Q. Then

$$E_{Q^{T_0}}\!\left[1_{\{P(T_0,T_B)>K\}}\,\middle|\,\mathcal{F}_t\right] = \chi^2\!\left(2r^*[\phi+\psi];\; \frac{4k\theta}{\sigma^2},\; \frac{2\phi^2 r_t e^{\gamma(T_0-t)}}{\phi+\psi}\right).$$

Using Lemmas 2.3.3 and 2.3.4, we may prove the following

Theorem 2.3.2. Let $P(t,T_B)$ denote the price at time t of a zero-coupon bond with maturity $T_B$ as in Definition 1.4.1, and the short rate $r_t$ having the dynamics given by equation (2.11) under the risk-neutral measure Q. Then by Lemmas 2.3.3 and 2.3.4, the price of a European call option expiring at time $T_0$, with strike price K, on a zero-coupon bond is

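$$ECZBond(t,T_0,T_B,K) = P(t,T_0)\,H(t,r_t)\,\chi^2\!\left(2r^*[\phi+\psi+B(T_0,T_B)];\; \frac{4k\theta}{\sigma^2},\; \frac{2\phi^2 r_t e^{\gamma(T_0-t)}}{\phi+\psi+B(T_0,T_B)}\right) - K\,P(t,T_0)\,\chi^2\!\left(2r^*[\phi+\psi];\; \frac{4k\theta}{\sigma^2},\; \frac{2\phi^2 r_t e^{\gamma(T_0-t)}}{\phi+\psi}\right), \tag{2.69}$$

where $\chi^2(\cdot\,; d, \delta)$ is the non-central chi-square cumulative distribution function with d degrees of freedom and non-centrality parameter $\delta$, and $H(t,r_t)$ is defined in Lemma 2.3.3.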

Finally, the corresponding formula for the put option can be obtained from the put-call parity relation given by equation (1.12).

2.3.4 Pricing European Options on Zero-Coupon Bond Futures

We shall utilize the change of measure approach to obtain the analytic price of a call option on a bond futures contract. Using the put-call parity relation, we shall obtain the analytic price function for the put option. Recall that

$ECZBondFut(t,T_0,T_B,T_F,K)$ denotes the price of a European call option expiring at time $T_0$, which is written on a bond futures contract with bond maturity date $T_B$, contract settlement date $T_F$, strike price K, and $T_B > T_F > T_0$. In the Vasicek case, we may prove the following

Lemma 2.3.5. Suppose the short rate $r_t$ has the transition distribution given by equation (2.57) under the time $T_0$ forward measure, and let $Fut(t,T_F,T_B)$ be the time t bond futures price given by equation (2.34). Then

$$E_{Q^{T_0}}\!\left[Fut(T_0,T_F,T_B)\,1_{\{Fut(T_0,T_F,T_B)>K\}}\,\middle|\,\mathcal{F}_t\right] = H(t,r_t)\,\Phi(h),$$

where

IAF(T0,TF,TB)-EQTO [rTo\Tt}BF{To7F,TB)+\BF(To,TF.TB)(VarQTo [rr0|^t]) 1

A {T ,T ,T )-\nK r* F 0 F B BF{TO,TF,TB)

r* - EQT0 [rTo\^t} x VarQr0 [rTo\Ft}

x - BF(T0, TF, TB)VarQT0 [rTo\Ft],

and the functions AF and BF are given by Equations (2.37) and (2.36) respectively.

Lemma 2.3.6. Suppose the short rate $r_t$ has the transition distribution given by equation (2.57) under the time $T_0$ forward measure, and let $Fut(t,T_F,T_B)$ be the time t bond futures price given by equation (2.34). Then

$$E_{Q^{T_0}}\!\left[1_{\{Fut(T_0,T_F,T_B)>K\}}\,\middle|\,\mathcal{F}_t\right] = \Phi(x),$$

where x is defined in Lemma 2.3.5, and $\Phi(\cdot)$ denotes the standard normal c.d.f.

Now, ECZBondFut(t, To, TB, TF, K) is characterized by the following theorem which makes use of Lemmas 2.3.5 and 2.3.6.

Theorem 2.3.3. Using Lemmas 2.3.5 and 2.3.6,

$$ECZBondFut(t,T_0,T_B,T_F,K) = P(t,T_0)\,H(t,r_t)\,\Phi(h) - K\,P(t,T_0)\,\Phi(x), \tag{2.70}$$

where h, x, and $H(t,r_t)$ are defined in Lemma 2.3.5, and $\Phi(\cdot)$ denotes the standard normal c.d.f.

We can obtain the price of the corresponding put option by applying the put-call parity relation given by equation (1.15).

In the case of the CIR model, we may prove the following

Lemma 2.3.7. When the transition distribution for the short rate is given by equation (2.67), and the bond futures price is characterized by equations (2.35), (2.30) and (2.31), then

EQT0 [Fut{To, TF. TB)l{Fut(T0,TF,TB)>K} \^t\

2 = HX ' (.2r*[

where

V Q(To,t) J

q> = fc + A + 7 V ~o~2 '

q(T0,t) = 2[0 + V], 2 r t 40 rte^ o- ) /i q(To,t) + 2BF{To,TF,TBy 1 eAF(TO.TF,rB)'

where $\gamma$ is given by equation (2.32), and the functions $A_F$ and $B_F$ are given by Equations (2.45) and (2.43) respectively.

Lemma 2.3.8. When the transition distribution for the short rate is given by Equation (2.67), and the bond futures price is characterized by Equations (2.35), (2.30) and (2.31), then

$$E_{Q^{T_0}}\!\left[1_{\{Fut(T_0,T_F,T_B)>K\}}\,\middle|\,\mathcal{F}_t\right] = \chi^2\!\left(2r^*[\phi+\psi];\; \frac{4k\theta}{\sigma^2},\; \frac{2\phi^2 r_t e^{\gamma(T_0-t)}}{\phi+\psi}\right),$$

where $r^*$, $\phi$, and $\psi$ are defined in Lemma 2.3.7, and $B_F$ is given by Equation (2.43).

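Theorem 2.3.4. Let $ECZBondFut(t,T_0,T_B,T_F,K)$ denote the time t price of a European call option expiring at time $T_0$ with strike price K, on a zero-coupon bond futures contract with contract settlement date $T_F$ and bond maturity date $T_B$. Then by Lemmas 2.3.7 and 2.3.8 we obtain

$$ECZBondFut(t,T_0,T_F,T_B,K) = P(t,T_0)\,H\,\chi^2\!\left(2r^*[\phi+\psi+B_F(T_0,T_F,T_B)];\; \frac{4k\theta}{\sigma^2},\; \frac{2\phi^2 r_t e^{\gamma(T_0-t)}}{\phi+\psi+B_F(T_0,T_F,T_B)}\right) - K\,P(t,T_0)\,\chi^2\!\left(2r^*[\phi+\psi];\; \frac{4k\theta}{\sigma^2},\; \frac{2\phi^2 r_t e^{\gamma(T_0-t)}}{\phi+\psi}\right). \tag{2.71}$$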

Finally, using the put-call parity relation given by equation (1.15), we can obtain the price function for the corresponding put option.

2.4 Two-Factor Short Rate Models

2.4.1 Some General Remarks

The choice of the number of factors to include in an interest rate model is an issue of primary importance. Leaving analytical tractability aside, it is desirable for a model to be able to explain the actual behavior of the term structure, the market prices of financial derivatives, or simply the observed short-term interest rate. For hedging purposes, it is important that a model be able to generate a sufficiently rich family of yield curves (for example, increasing, decreasing, or humped), and that the variations of the yield curve predicted by the model be consistent with observed fluctuations of this curve. In general, one-factor models cannot match the observed term structures and cannot generate a rich family of yield curves. To illustrate this claim, consider the following example from Brigo and Mercurio [5].

Example 2.4.1. Consider the single factor situation as in the previous section, and suppose there exists a payoff depending on the joint distribution of two yields at time t, namely $R(t,T_1)$ and $R(t,T_2)$. Hence, the payoff depends on the joint distribution of the $(T_1 - t)$-year and $(T_2 - t)$-year continuously compounded spot interest rates at time t. Since the joint distribution is involved, the correlation between the rates typically plays an important role. Observe that, if the continuously compounded spot rate is given by equation (2.3), then the correlation between the two rates is given by

$$Corr\big(R(t,T_1),R(t,T_2)\big) = \frac{Cov\big(a(t,T_1)+b(t,T_1)r_t,\; a(t,T_2)+b(t,T_2)r_t\big)}{\sqrt{Var\big(a(t,T_1)+b(t,T_1)r_t\big)}\,\sqrt{Var\big(a(t,T_2)+b(t,T_2)r_t\big)}} = 1.$$

This means that the different rates at time t are perfectly correlated. Therefore a shock to the interest rate curve at time t is transmitted equally through all maturities. This feature of single-factor models is undesirable when pricing products that depend on two different rates, since interest rates are known to exhibit imperfect correlation.
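The perfect-correlation claim in Example 2.4.1 can be checked numerically. The sketch below draws hypothetical short-rate values, forms two yields as affine functions of the same rate, and computes their sample correlation. The coefficients `a1, b1, a2, b2` and the distribution of the short rate are illustrative assumptions, not values from the thesis.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)
# Hypothetical draws of the single factor r_t.
short_rates = [rng.gauss(0.04, 0.01) for _ in range(5000)]

# Hypothetical affine coefficients a(t, T_i), b(t, T_i) with b > 0.
a1, b1 = 0.002, 0.9
a2, b2 = 0.010, 0.6
yield_1 = [a1 + b1 * r for r in short_rates]
yield_2 = [a2 + b2 * r for r in short_rates]

corr = pearson(yield_1, yield_2)
```

Whatever positive slopes are chosen, `corr` equals one up to floating-point rounding, because both yields are driven by the same single source of randomness.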

In the remaining part of this section, we will consider the two-factor Vasicek model presented in Babbs and Nowman [2], and the two-factor CIR model presented in Chen and Scott [9]. The former model belongs to $A_0(2)$, and the latter belongs to $A_2(2)$, which is in canonical form.

Babbs and Nowman [2] initially specified their two-factor model under the objective measure P using correlated Brownian motions, with $\rho$ being their correlation coefficient, after which they re-specified the model utilizing two independent standard Brownian motions. We suppose that under P,

$$dX_{1t} = -\xi_1 X_{1t}\,dt + \sigma_1\,dW_{1t} \tag{2.72}$$

$$dX_{2t} = -\xi_2 X_{2t}\,dt + \sigma_2\rho\,dW_{1t} + \sigma_2\sqrt{1-\rho^2}\,dW_{2t} \tag{2.73}$$

$$r_t = \mu - X_{1t} - X_{2t}, \tag{2.74}$$

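where $W_1$ and $W_2$ are independent one-dimensional standard Brownian motions. Also, assume that the market price of risk attached to each Brownian motion is constant. Hence, by Girsanov's theorem for two dimensions,

$$dB_{1t} = \theta_1\,dt + dW_{1t} \tag{2.75}$$

$$dB_{2t} = \theta_2\,dt + dW_{2t} \tag{2.76}$$

are two independent Brownian motions under the risk-neutral measure, and the SDEs for the two factors become

$$dX_{1t} = \xi_1\left(-\frac{\sigma_1\theta_1}{\xi_1} - X_{1t}\right)dt + \sigma_1\,dB_{1t} \tag{2.77}$$

$$dX_{2t} = \xi_2\left(-\frac{\sigma_2\left(\rho\theta_1 + \sqrt{1-\rho^2}\,\theta_2\right)}{\xi_2} - X_{2t}\right)dt + \sigma_2\rho\,dB_{1t} + \sigma_2\sqrt{1-\rho^2}\,dB_{2t}. \tag{2.78}$$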

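Solving equations (2.77) and (2.78) conditional on $X_{1u}$ and $X_{2u}$ yields

$$X_{1t} = e^{-\xi_1(t-u)}X_{1u} - \frac{\sigma_1\theta_1}{\xi_1}\left(1-e^{-\xi_1(t-u)}\right) + \sigma_1\int_u^t e^{-\xi_1(t-s)}\,dB_{1s} \tag{2.79}$$

$$X_{2t} = e^{-\xi_2(t-u)}X_{2u} - \frac{\sigma_2\left(\rho\theta_1+\sqrt{1-\rho^2}\,\theta_2\right)}{\xi_2}\left(1-e^{-\xi_2(t-u)}\right) + \sigma_2\rho\int_u^t e^{-\xi_2(t-s)}\,dB_{1s} + \sigma_2\sqrt{1-\rho^2}\int_u^t e^{-\xi_2(t-s)}\,dB_{2s}. \tag{2.80}$$

Now, by linearity, the transition distribution of the short rate conditional on $\mathcal{F}_u$ is normal with conditional mean and variance given by

$$E_Q[r_t\mid\mathcal{F}_u] = \mu - e^{-\xi_1(t-u)}X_{1u} - e^{-\xi_2(t-u)}X_{2u} + \frac{\sigma_1\theta_1}{\xi_1}\left(1-e^{-\xi_1(t-u)}\right) + \frac{\sigma_2\left(\rho\theta_1+\sqrt{1-\rho^2}\,\theta_2\right)}{\xi_2}\left(1-e^{-\xi_2(t-u)}\right) \tag{2.81}$$

$$Var_Q[r_t\mid\mathcal{F}_u] = \frac{\sigma_1^2}{2\xi_1}\left(1-e^{-2\xi_1(t-u)}\right) + \frac{\sigma_2^2}{2\xi_2}\left(1-e^{-2\xi_2(t-u)}\right) + \frac{2\sigma_1\sigma_2\rho}{\xi_1+\xi_2}\left(1-e^{-(\xi_1+\xi_2)(t-u)}\right). \tag{2.82}$$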
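The conditional moments of the short rate in the two-factor Vasicek model follow directly from the factor solutions (2.79) and (2.80). The sketch below evaluates them, assuming the factor dynamics (2.77) and (2.78) and $r_t = \mu - X_{1t} - X_{2t}$. The function name and argument names are illustrative only.

```python
import math


def two_factor_vasicek_moments(x1_u, x2_u, dt, mu, xi1, xi2,
                               sigma1, sigma2, rho, theta1, theta2):
    """Conditional mean and variance of r_t given the factors at time u.

    dt is t - u. theta1 and theta2 are the constant market prices of risk.
    """
    decay1 = math.exp(-xi1 * dt)
    decay2 = math.exp(-xi2 * dt)
    # Risk-adjusted long-run shifts of the two factors under Q.
    shift1 = sigma1 * theta1 / xi1
    shift2 = sigma2 * (rho * theta1 + math.sqrt(1.0 - rho ** 2) * theta2) / xi2
    mean_x1 = decay1 * x1_u - shift1 * (1.0 - decay1)
    mean_x2 = decay2 * x2_u - shift2 * (1.0 - decay2)
    mean_r = mu - mean_x1 - mean_x2
    var_r = (sigma1 ** 2 / (2.0 * xi1) * (1.0 - decay1 ** 2)
             + sigma2 ** 2 / (2.0 * xi2) * (1.0 - decay2 ** 2)
             + 2.0 * rho * sigma1 * sigma2 / (xi1 + xi2)
             * (1.0 - decay1 * decay2))
    return mean_r, var_r
```

With `dt = 0` the function returns the current short rate and zero variance, and with `rho = 0` the variance reduces to the sum of the two single-factor variances, which gives two quick sanity checks.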

In the case of the two-factor CIR model presented in Chen and Scott [9], the dynamics of the factors under the objective probability measure are

$$dX_{it} = k_i(\theta_i - X_{it})\,dt + \sigma_i\sqrt{X_{it}}\,dW_{it}, \quad i = 1,2 \tag{2.83}$$

$$r_t = X_{1t} + X_{2t}, \tag{2.84}$$

where $W_{it}$, $i = 1,2$, are two independent Brownian motions. They assume that the risk premium processes are given by

$$\Lambda_{it} = \lambda_i X_{it}, \quad i = 1,2.$$

By Girsanov's theorem in two dimensions, the system (2.83), (2.84) under the risk-neutral measure becomes

$$dX_{it} = (k_i+\lambda_i)\left(\frac{k_i\theta_i}{k_i+\lambda_i} - X_{it}\right)dt + \sigma_i\sqrt{X_{it}}\,dB_{it}, \quad i = 1,2 \tag{2.85}$$

$$r_t = X_{1t} + X_{2t}, \tag{2.86}$$

where

$$dB_{it} = \frac{\lambda_i}{\sigma_i}\sqrt{X_{it}}\,dt + dW_{it}, \quad i = 1,2 \tag{2.87}$$

are independent standard Brownian motions under the risk-neutral measure. The transition distribution of $r_t$ is generally not known in this case, since the distribution of a finite sum of independent scaled non-central chi-square random variables is not known in closed form.

2.4.2 The Pricing of Zero-Coupon Bonds

We shall utilize the B-S method for determining the analytic price function for the two-factor Vasicek and CIR models. Assume that the analytical bond price function is given by

$$P(t,T_B) = e^{A(t,T_B) + B_1(t,T_B)X_{1t} + B_2(t,T_B)X_{2t}}. \tag{2.88}$$

We shall compute the functions $A$, $B_1$, $B_2$ via an extension of the B-S method to multiple dimensions. This entails utilizing a two-dimensional version of the Ito-Doeblin formula from stochastic calculus, and the separation of variables theorem from the theory of PDEs. Consider the two SDEs

$$dX_1(t) = \beta_1(t,X_1(t),X_2(t))\,dt + \gamma_{11}(t,X_1(t),X_2(t))\,dB_1(t) + \gamma_{12}(t,X_1(t),X_2(t))\,dB_2(t)$$

$$dX_2(t) = \beta_2(t,X_1(t),X_2(t))\,dt + \gamma_{21}(t,X_1(t),X_2(t))\,dB_1(t) + \gamma_{22}(t,X_1(t),X_2(t))\,dB_2(t)$$

where $B(t) = (B_1(t), B_2(t))$ is a two-dimensional Brownian motion under the risk-neutral measure. Applying the Ito product rule in two dimensions to the process $D(t,T_B)P(t,T_B)$, and setting the dt term equal to zero, we obtain the PDE characterizing the function $P(t,T_B)$, given by

2 r Pt(LTB) + J2 0i(L Xl, x2)Px,{t, TB) + -I2i(t, xi, x2)PXiXi{L TB) I=I

+ 2 (7ii72i + 712722) PxlX2(t, TB) = rP(t, TB).

(2.89)

In the two-factor Vasicek model, if we specialize the function $P(t,T_B)$ to equation (2.88) and utilize the drift and diffusion functions from the system of SDEs (2.77) and (2.78), then substituting into the PDE given by equation (2.89) yields

B[{t,TB) = I + B^TB)^ (2.90)

Bi{TB,TB) = 0 (2.91)

B'2(tTB) = l + B2{t,TB)h (2.92)

B2{TB,TB) = 0 (2.93)

2 A'{t,TB) = -fi + e^B^t, TB) + o2(0lP + e2Jl - p )B2(t: TB)

+B1(tTB)B2(t,TB)(a1pa2)

2 2 l +^B 1(t,TB)(a1p) + -Bl(t,TB)ol (2.94)

A(TB,TB) = 0 (2.95)

for $t \in [0,T_B)$. The solution to the above system is given by

Bi(t,TB) = -(- - ) (2-96)

B2(t,TB) = -I J (2.97)

\ 6 6

-B^TB)^ , ^ 6 ~~ 6 6 ~£c2?

/ <7lg2p \ (1 _ (€l+C2)(TB-t)\ V66(6+6)yl J +(§)(1~ e"2'1(Te_t))+ii)(1 - e^2(TB_t)) (2-98) forte [0,TB]. In the two-factor CIR model, specializing the PDE (2.89) by utilizing the drift and diffusion 51 functions from the system of SDEs (2.85), we obtain the following system of ODEs

2 2 B[{t.TB) = (fc1 + A1)£i(tTB) + ^£1 (LTB)-l (2.99)

By{TB,TB) = 0 (2.100)

2 B'2(t,TB) = (k2 + \2)B2(LTB) + ^B 2(LTB)-l (2.101)

B2(TB,TB) = 0 (2.102)

A'(t,TB) = -(91k1B1(t,TB) + 92hB2(t,TB)) (2.103)

A{TB,TB) = 0 (2.104)

for all $t \in [0,T_B)$. The above system has the following solution:

2(1 - e-^B-')) 2 105 Bi(t,TB) = ~2lie^l(TB-t) + {k. + x. + T.)(1 _ e-7i(TB-0) < - ) r fc +A 0 5 M.T, = 2Mi / 27le< »-*)( * '-'»> - ^HMBJ a2 log ^27ie_7l(TB_t) + (fci + Ai + 7i)(1 _ e_7l(TB_t)) T t fc2 A2 2 0 5 2fc202l / 272e< s- )( + ^ >* - 2 lo6g a2 V272e-^(^-') + (fc2 + A2 + 72)(1 - e-72(TB-t)) (2.106)

$$\gamma_i = \sqrt{(k_i+\lambda_i)^2 + 2\sigma_i^2} \tag{2.107}$$

for $i = 1,2$. Observe that the zero-coupon bond price arising in the two-factor CIR model is an extension of the one arising in the single-factor CIR model. This occurs since we have two independent factors, so that the system of SDEs (2.85) is not interdependent. This is reflected in the system of ODEs (2.99)-(2.103), where the Riccati equations given by equations (2.99) and (2.101) are also not interdependent.
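Since the two Riccati equations decouple, the two-factor CIR bond price can be assembled from two copies of the familiar single-factor CIR functions. The sketch below does this under an explicit assumption about signs: it uses the textbook convention $P = \exp(A - B_1 X_1 - B_2 X_2)$ with $B_i \ge 0$, which may differ by a sign from the convention in equation (2.88). All function and parameter names are illustrative.

```python
import math


def cir_gamma(k, lam, sigma):
    """gamma_i = sqrt((k_i + lambda_i)^2 + 2 sigma_i^2), as in (2.107)."""
    return math.sqrt((k + lam) ** 2 + 2.0 * sigma ** 2)


def cir_b(tau, k, lam, sigma):
    """Single-factor CIR loading B(tau) >= 0, with tau = T_B - t."""
    g = cir_gamma(k, lam, sigma)
    e = math.exp(-g * tau)
    return 2.0 * (1.0 - e) / (2.0 * g * e + (k + lam + g) * (1.0 - e))


def cir_a(tau, k, lam, sigma, theta):
    """Single-factor CIR intercept A(tau) <= 0 in log-price form."""
    g = cir_gamma(k, lam, sigma)
    e = math.exp(-g * tau)
    numerator = 2.0 * g * math.exp(0.5 * tau * (k + lam - g))
    denominator = 2.0 * g * e + (k + lam + g) * (1.0 - e)
    return 2.0 * k * theta / sigma ** 2 * math.log(numerator / denominator)


def two_factor_cir_bond_price(x1, x2, tau, params1, params2):
    """Bond price with independent factors; params = (k, lam, sigma, theta).

    Sign convention (assumption): P = exp(A - B1*x1 - B2*x2).
    """
    k1, lam1, s1, th1 = params1
    k2, lam2, s2, th2 = params2
    a_total = cir_a(tau, k1, lam1, s1, th1) + cir_a(tau, k2, lam2, s2, th2)
    b1 = cir_b(tau, k1, lam1, s1)
    b2 = cir_b(tau, k2, lam2, s2)
    return math.exp(a_total - b1 * x1 - b2 * x2)


# Illustrative (hypothetical) inputs.
price = two_factor_cir_bond_price(0.02, 0.03, 5.0,
                                  (0.5, 0.0, 0.1, 0.03),
                                  (0.2, -0.05, 0.05, 0.02))
```

A useful check on the implementation is that each loading satisfies the Riccati equation in time to maturity, $dB/d\tau = 1 - (k+\lambda)B - \tfrac{1}{2}\sigma^2 B^2$, and that the price equals one at $\tau = 0$.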

2.4.3 Zero-Coupon Bond Futures Price

We shall compute the analytical price function for the Gaussian case directly by computing the conditional expectation given by equation (1.8). In the CIR case, we shall apply the B-S method again since the distribution of a linear combination of two non-central chi-square variates is unknown. We assume that the analytic bond futures price function is also an exponential affine function of the factors given by

$$Fut(t,T_F,T_B) = e^{A_F(t,T_F,T_B) + B_{1F}(t,T_F,T_B)X_{1t} + B_{2F}(t,T_F,T_B)X_{2t}}.$$

In the two-factor Vasicek model, consider the transition distribution conditional on $\mathcal{F}_t$ of the following linear combination

$$Z = B_1(T_F,T_B)X_{1T_F} + B_2(T_F,T_B)X_{2T_F}, \tag{2.108}$$

which is Gaussian since the factors are normally distributed. The conditional mean and variance are respectively given by

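$$E_Q[Z\mid\mathcal{F}_t] = B_1(T_F,T_B)\left(e^{-\xi_1(T_F-t)}X_{1t} - \frac{\sigma_1\theta_1}{\xi_1}\left(1-e^{-\xi_1(T_F-t)}\right)\right) + B_2(T_F,T_B)\left(e^{-\xi_2(T_F-t)}X_{2t} - \frac{\sigma_2\left(\rho\theta_1+\sqrt{1-\rho^2}\,\theta_2\right)}{\xi_2}\left(1-e^{-\xi_2(T_F-t)}\right)\right)$$

$$Var_Q[Z\mid\mathcal{F}_t] = B_1^2(T_F,T_B)\,\frac{\sigma_1^2}{2\xi_1}\left(1-e^{-2\xi_1(T_F-t)}\right) + B_2^2(T_F,T_B)\,\frac{\sigma_2^2}{2\xi_2}\left(1-e^{-2\xi_2(T_F-t)}\right) + 2B_1(T_F,T_B)B_2(T_F,T_B)\,\frac{\rho\sigma_1\sigma_2}{\xi_1+\xi_2}\left(1-e^{-(\xi_1+\xi_2)(T_F-t)}\right).$$

Now, if Z is given by equation (2.108) and is Gaussian, then $e^{A(T_F,T_B)+Z}$ is log-normally distributed. Hence, the time t futures price is given by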

Fut{t, TF, TB) = exp IA{TF, TB) - EQ\Z\Ft) + \varQ[Z\Ft}\ , (2.109) which implies that

TF t) BlF(t,TF,TB) = Bi(TF,TB)e-^ - , i = l,2 (2.110) i(T AF(t,TF,TB) = ^(rF)rB) + i?i(rF!rB)^(i-e-« ^)

Q

+52(7>)^)^^±|vIzZ)(1 _ e-fa^-D)

l + -VarQ[Z\Tt}. (2.111)

In the two-factor CIR case, utilizing the B-S method in two dimensions, we apply the Ito-Doeblin rule in two dimensions to the process $Fut(t,T_F,T_B)$ and set the dt term equal to zero. Then we obtain the PDE characterizing the function $Fut(t,T_F,T_B)$, given by

2 r 1 x Futt(t,TF,TB) + J2 Pi(t, xi,x2)FutXi(t, TF, TB) + xTafo i-, x2)Futx,Xi(t, TF, TB) i=l

+ g (7ii72i + 712722) Futx,X2(t,TF,TB) = 0. (2.112)

Substituting in for the drift and diffusion functions, we obtain the following system of ODEs

a; B'1F{t,TF.JB) (ki + M)B1F(t, TF, TB) + -j-B\F{t, TFTB)

B,F{TF.TF,TB) B^T^TB)

(Jo 2 B2F{t, TFTB) {k2 + X1)B2F{t, 7>, TB) + -fB 2F(t TF, TB)

B2F(TF,TF,TB) B2{TF,TB)

A'F{t,TF,TB) — (k\9\BiF(t,TF,TB) + k292B2F{t,TF.TB))

AF(TF,TF,TB) = A(TF,TB)

for all $t \in [0,T_F)$. The solutions are given by

2(h + Xi) ,{ki+\i)t Bip{t,Tp,TB) J (2.113) of (e(ki+\i)t + (k+>«- i)e(fci+Aj)rF)

AF{t,TF,TB) A(TF,TB)

K lo Ji k (2.114) + 2_. ~^rO; s e(ki+Xi)t _|_ ( i+*i-Ji\e(kj+>>i)TF t=i O- Ji = •Bi(TF,TB)-f, where i = 1, 2.

2.4.4 The Pricing of European Options on Zero-Coupon Bonds

We shall utilize the forward measure approach to compute the analytical price function for the two-factor Vasicek model, since the conditional distribution of

$$Z = B_1(T_0,T_B)X_{1T_0} + B_2(T_0,T_B)X_{2T_0} \tag{2.115}$$

on $\mathcal{F}_t$ for $t < T_0$ under the probability measure $Q^{T_0}$ is Gaussian. For the two-factor CIR model, the analytical price function cannot be computed explicitly, and hence must be computed numerically, since the conditional distribution of Z is unknown whenever the factor processes are square root diffusions.

In order to apply the forward measure approach, we apply the Ito-Doeblin formula in two dimensions to $\ln P(t,T_0)$, where the function P is the analytical price of the zero-coupon bond given by equation (2.88), and the functions $A$, $B_1$, $B_2$ are given by the system of equations (2.94), (2.90), and (2.92) respectively. We obtain

~WM = Hi I P )dB" + l p dB* F w XOP<-{-u:{ti rff^y +(^)v- ;, i

dBl° = - [Bi {t, T0)ai + o2pB2{t, T0)\ dt + dBu (2.116)

dB%> = - B2(t,T0)

2 c (B1(t,To)a1a2p+B2(t,To)aj - a^pO, + e2y/\ - p ) v \ ,, a^2t = & \ z6 A2tJ at

$$\qquad + \sigma_2\rho\,dB_{1t}^{T_0} + \sigma_2\sqrt{1-\rho^2}\,dB_{2t}^{T_0}, \tag{2.119}$$

where $B_1^{T_0}$ and $B_2^{T_0}$ are two independent standard Brownian motions under $Q^{T_0}$. The transition distribution of the two-factor process, conditional on $\mathcal{F}_s$ with $s < t$, is bivariate normal with

il{t 8) {t EQT0\XU\T8) = Xue- - + f e-^ ^ {B^u,T0)

- f e-^-^e^du (2.120)

{t s) EQr0[X2t\Ts] = X2se-^ - + f e-^(*-«) [Bl(u,T0)oxa2p + B2{u,T0)al) du

{t u) 2 _ / e-^ - a2(pd, + 92yJ\ - p )du (2.121) •Iss 2 VarQr0{Xu\Ts] = -j-(1 - e-^f*-)) (2.122)

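$$Var_{Q^{T_0}}[X_{2t}\mid\mathcal{F}_s] = \frac{\sigma_2^2}{2\xi_2}\left(1-e^{-2\xi_2(t-s)}\right) \tag{2.123}$$

$$Cov_{Q^{T_0}}[X_{1t},X_{2t}\mid\mathcal{F}_s] = \frac{\rho\sigma_1\sigma_2}{\xi_1+\xi_2}\left(1-e^{-(\xi_1+\xi_2)(t-s)}\right). \tag{2.124}$$

Z given by equation (2.115) is normally distributed under $Q^{T_0}$ conditional on $\mathcal{F}_t$ with

$$E_{Q^{T_0}}[Z\mid\mathcal{F}_t] = B_1(T_0,T_B)\,E_{Q^{T_0}}[X_{1T_0}\mid\mathcal{F}_t] + B_2(T_0,T_B)\,E_{Q^{T_0}}[X_{2T_0}\mid\mathcal{F}_t] \tag{2.125}$$

$$Var_{Q^{T_0}}[Z\mid\mathcal{F}_t] = B_1^2(T_0,T_B)\,Var_{Q^{T_0}}[X_{1T_0}\mid\mathcal{F}_t] + B_2^2(T_0,T_B)\,Var_{Q^{T_0}}[X_{2T_0}\mid\mathcal{F}_t] + 2B_1(T_0,T_B)B_2(T_0,T_B)\,Cov_{Q^{T_0}}[X_{1T_0},X_{2T_0}\mid\mathcal{F}_t]. \tag{2.126}$$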

We may prove the following Lemmas.

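Lemma 2.4.1. Let $P(T_0,T_B)$ be the analytic bond price function under the two-factor Vasicek model evaluated at the time of option expiry $T_0$. Using equations (2.125) and (2.126),

$$E_{Q^{T_0}}\!\left[P(T_0,T_B)\,1_{\{P(T_0,T_B)>K\}}\,\middle|\,\mathcal{F}_t\right] = H\left[1 - \Phi\!\left(\frac{Z^* - E_{Q^{T_0}}[Z\mid\mathcal{F}_t] - Var_{Q^{T_0}}[Z\mid\mathcal{F}_t]}{\sqrt{Var_{Q^{T_0}}[Z\mid\mathcal{F}_t]}}\right)\right],$$

where

$$H = \exp\left\{A(T_0,T_B) + \tfrac{1}{2}Var_{Q^{T_0}}[Z\mid\mathcal{F}_t] + E_{Q^{T_0}}[Z\mid\mathcal{F}_t]\right\}$$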

Z* = A(T0,TB)-\nK.

The function A is given by equation (2.98).

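Lemma 2.4.2. Using equations (2.125) and (2.126),

$$E_{Q^{T_0}}\!\left[1_{\{P(T_0,T_B)>K\}}\,\middle|\,\mathcal{F}_t\right] = 1 - \Phi\!\left(\frac{Z^* - E_{Q^{T_0}}[Z\mid\mathcal{F}_t]}{\sqrt{Var_{Q^{T_0}}[Z\mid\mathcal{F}_t]}}\right),$$

where $Z^*$ is defined in Lemma 2.4.1.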

The time t price of a European call option on a zero-coupon bond under the two-factor Vasicek model is characterized by the following theorem

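Theorem 2.4.1. Using equations (2.125) and (2.126), and Lemmas 2.4.1 and 2.4.2, the time t price of a European call option on a zero-coupon bond with maturity $T_B$, expiring at time $T_0$ with strike price K, is given by

$$ECZBond(t,T_0,T_B,K) = P(t,T_0)\,H\left[1 - \Phi\!\left(\frac{Z^* - E_{Q^{T_0}}[Z\mid\mathcal{F}_t] - Var_{Q^{T_0}}[Z\mid\mathcal{F}_t]}{\sqrt{Var_{Q^{T_0}}[Z\mid\mathcal{F}_t]}}\right)\right] - K\,P(t,T_0)\left[1 - \Phi\!\left(\frac{Z^* - E_{Q^{T_0}}[Z\mid\mathcal{F}_t]}{\sqrt{Var_{Q^{T_0}}[Z\mid\mathcal{F}_t]}}\right)\right], \tag{2.127}$$

where $Z^*$ is defined in Lemma 2.4.1.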
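The structure of the two-factor Vasicek call price can be illustrated with a short numerical sketch. It takes the conditional mean and variance of Z, the value $A(T_0,T_B)$, and the discount factor $P(t,T_0)$ as given inputs, and assumes $\ln P(T_0,T_B) = A(T_0,T_B) + Z$ as in equation (2.88). The exercise threshold is taken here as $\ln K - A(T_0,T_B)$, matching an in-the-money region of the form $Z > \ln K - A(T_0,T_B)$; that sign convention, like all names below, is an assumption of this sketch. A midpoint-rule quadrature is included only as an independent check.

```python
import math
from statistics import NormalDist


def two_factor_vasicek_call(p_t_t0, a_t0_tb, mean_z, var_z, strike):
    """Call on a zero-coupon bond when ln P(T0,TB) = A + Z, Z ~ N(mean_z, var_z)."""
    cdf = NormalDist().cdf
    sd = math.sqrt(var_z)
    # Exercise when Z > z_star (assumed sign convention).
    z_star = math.log(strike) - a_t0_tb
    big_h = math.exp(a_t0_tb + mean_z + 0.5 * var_z)
    bond_leg = big_h * (1.0 - cdf((z_star - mean_z - var_z) / sd))
    strike_leg = strike * (1.0 - cdf((z_star - mean_z) / sd))
    return p_t_t0 * (bond_leg - strike_leg)


def call_by_quadrature(p_t_t0, a_t0_tb, mean_z, var_z, strike,
                       n_points=20000, width=10.0):
    """Midpoint-rule integral of the discounted payoff against the normal density."""
    sd = math.sqrt(var_z)
    lower = mean_z - width * sd
    step = 2.0 * width * sd / n_points
    total = 0.0
    for i in range(n_points):
        z = lower + (i + 0.5) * step
        payoff = max(math.exp(a_t0_tb + z) - strike, 0.0)
        density = math.exp(-0.5 * ((z - mean_z) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += payoff * density * step
    return p_t_t0 * total


# Illustrative (hypothetical) inputs.
closed_form = two_factor_vasicek_call(0.95, -0.10, -0.02, 0.0025, 0.85)
numerical = call_by_quadrature(0.95, -0.10, -0.02, 0.0025, 0.85)
```

The two values should agree closely, which confirms that the two normal-distribution terms are combined with consistent signs under the stated convention.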

Chen and Scott [8] reach a general formula for pricing a European call option on a zero-coupon bond for the two-factor CIR model when the dynamics of the factors are given by equation (2.85) under Q. Since this formula involves the bivariate non-central chi-square distribution, one has to use numerical integration in order to compute the price. The payoff function at expiration date $T_0$ is given by

$$\max\left[0,\; P(T_0,T_B) - K\right],$$

where $P(T_0,T_B)$ is the time $T_0$ price of the zero-coupon bond maturing at time $T_B$ and K is the strike price. The time t pricing function for this option is given by

$$ECZBond(t,T_0,T_B,K) = P(t,T_0)\int_0^\infty\!\!\int_0^\infty \left[P(z_1,z_2,t,T_0,T_B) - K\right]^+ f(z_1;\nu_1,\lambda_1^*)\,f(z_2;\nu_2,\lambda_2^*)\,dz_1\,dz_2, \tag{2.128}$$

where $f(z_i;\nu_i,\lambda_i^*)$ is the probability density function of factor i under the $T_0$ forward measure. The region of integration is the range of values for $X_{1T_0}$ and $X_{2T_0}$ where the option is in the money. Note that this region is determined by the following linear inequality

$$B_1(T_0,T_B)X_{1T_0} + B_2(T_0,T_B)X_{2T_0} > C^* = \ln K - A(T_0,T_B).$$

2.4.5 European Bond Futures Price

Since the zero-coupon bond futures price under either the two-factor Vasicek or CIR models is an exponential affine function of the factors, the approach in this section will be similar to the one in the previous section. That is, we shall compute the time t price of the European option on a zero-coupon bond futures contract via the forward measure approach. In the two-factor CIR case, a similar situation occurs as in Section 2.4.4: we do not know the distribution of a linear combination of two independent random variables which are conditionally non-central chi-square distributed. For this reason one has to determine the price function of this interest rate derivative numerically. For the two-factor Vasicek model, we can determine the conditional distribution under $Q^{T_0}$ of

$$Z = B_{1F}(T_0,T_F,T_B)X_{1T_0} + B_{2F}(T_0,T_F,T_B)X_{2T_0}, \tag{2.129}$$

which is Gaussian with

$$E_{Q^{T_0}}[Z\mid\mathcal{F}_t] = B_{1F}(T_0,T_F,T_B)\,E_{Q^{T_0}}[X_{1T_0}\mid\mathcal{F}_t] + B_{2F}(T_0,T_F,T_B)\,E_{Q^{T_0}}[X_{2T_0}\mid\mathcal{F}_t] \tag{2.130}$$

$$Var_{Q^{T_0}}[Z\mid\mathcal{F}_t] = B_{1F}^2(T_0,T_F,T_B)\,Var_{Q^{T_0}}[X_{1T_0}\mid\mathcal{F}_t] + B_{2F}^2(T_0,T_F,T_B)\,Var_{Q^{T_0}}[X_{2T_0}\mid\mathcal{F}_t] + 2B_{1F}(T_0,T_F,T_B)B_{2F}(T_0,T_F,T_B)\,Cov_{Q^{T_0}}[X_{1T_0},X_{2T_0}\mid\mathcal{F}_t], \tag{2.131}$$

where the functions $B_{1F}$ and $B_{2F}$ are given by equation (2.110). Using this result we may prove the following

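Lemma 2.4.3. Using equations (2.130) and (2.131),

$$E_{Q^{T_0}}\!\left[Fut(T_0,T_F,T_B)\,1_{\{Fut(T_0,T_F,T_B)>K\}}\,\middle|\,\mathcal{F}_t\right] = H\left[1 - \Phi\!\left(\frac{Z^* - E_{Q^{T_0}}[Z\mid\mathcal{F}_t] - Var_{Q^{T_0}}[Z\mid\mathcal{F}_t]}{\sqrt{Var_{Q^{T_0}}[Z\mid\mathcal{F}_t]}}\right)\right],$$

where

$$H = \exp\left\{A_F(T_0,T_F,T_B) + \tfrac{1}{2}Var_{Q^{T_0}}[Z\mid\mathcal{F}_t] + E_{Q^{T_0}}[Z\mid\mathcal{F}_t]\right\}$$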

Z* = AF(T0,TF,TB)-lnK.

and

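Lemma 2.4.4.

$$E_{Q^{T_0}}\!\left[1_{\{Fut(T_0,T_F,T_B)>K\}}\,\middle|\,\mathcal{F}_t\right] = 1 - \Phi\!\left(\frac{Z^* - E_{Q^{T_0}}[Z\mid\mathcal{F}_t]}{\sqrt{Var_{Q^{T_0}}[Z\mid\mathcal{F}_t]}}\right),$$

where $Z^*$ is defined in Lemma 2.4.3.

Theorem 2.4.2. If equations (2.130) and (2.131) hold, then by Lemmas 2.4.3 and 2.4.4, the time t price of a European call option expiring at time $T_0$ with strike price K, on a zero-coupon bond futures contract with contract settlement date $T_F$ and bond maturity date $T_B$, is given by

$$ECZBondFut(t,T_0,T_F,T_B,K) = P(t,T_0)\,H\left[1 - \Phi\!\left(\frac{Z^* - E_{Q^{T_0}}[Z\mid\mathcal{F}_t] - Var_{Q^{T_0}}[Z\mid\mathcal{F}_t]}{\sqrt{Var_{Q^{T_0}}[Z\mid\mathcal{F}_t]}}\right)\right] - K\,P(t,T_0)\left[1 - \Phi\!\left(\frac{Z^* - E_{Q^{T_0}}[Z\mid\mathcal{F}_t]}{\sqrt{Var_{Q^{T_0}}[Z\mid\mathcal{F}_t]}}\right)\right], \tag{2.132}$$

where $Z^*$ is defined in Lemma 2.4.3.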

2.5 Three-Factor Models

In this section we shall consider models from the following classes: $A_0(3)$, $A_1(3)$, $A_2(3)$, and $A_3(3)$. For the $A_0(3)$ class, we shall consider the three-factor model based on Babbs and Nowman [2], and for the $A_3(3)$ class, the three-factor model based on Chen and Scott [9]; both are extensions of the material presented earlier on the single- and two-factor models. Regarding the $A_1(3)$ and $A_2(3)$ classes, we shall consider three models from each class. In the former, we shall consider the Balduzzi, Das, Foresi, and Sundaram (BDFS) model, denoted by $A_{1r}(3)_{BDFS}$, the Dai and Singleton (DS) model, denoted by $A_{1r}(3)_{DS}$, and finally the maximal model in that class, denoted by $A_{1r}(3)_{MAX}$. In the latter class, we shall consider the Chen model, denoted by $A_{2r}(3)_{CHEN}$, the Dai and Singleton model, denoted by $A_{2r}(3)_{DS}$, and finally the maximal model in that class, denoted by $A_{2r}(3)_{MAX}$.

Dai and Singleton [12] explain how these different classes are related in terms of a trade-off between admissibility and the specification of the conditional variance of each factor on the vector of the factors. That is, on one hand, the Gaussian models are entirely flexible in terms of specifying the signs and magnitudes of the conditional and unconditional correlations among the factors, but at the expense of having constant conditional variances. The three-factor CIR model lies at the other extreme of the spectrum of volatility specifications, since all three factors drive the conditional volatilities. The other models mentioned above lie in between these two extremes. In the first section, we shall present all the models. Next, we dedicate a section to each interest rate asset and derivative where, in each of these sections, we present the results of each model.

Babbs and Nowman [2] initially specified their three-factor model using correlated Brownian motions with correlation matrix given by

$$\begin{pmatrix} 1 & \rho_{12} & \rho_{13} \\ \rho_{21} & 1 & \rho_{23} \\ \rho_{31} & \rho_{32} & 1 \end{pmatrix},$$

after which they re-specified the model utilizing three independent standard one-dimensional Brownian motions. Under the objective measure, it is given by

$$dX_{1t} = -\xi_1 X_{1t}\,dt + \sigma_1\,dW_{1t} \tag{2.133}$$

$$dX_{2t} = -\xi_2 X_{2t}\,dt + \sigma_2\rho_{12}\,dW_{1t} + \sigma_2\sqrt{1-\rho_{12}^2}\,dW_{2t} \tag{2.134}$$

$$dX_{3t} = -\xi_3 X_{3t}\,dt + \sigma_3\rho_{13}\,dW_{1t} + \sigma_3\frac{\rho_{23}-\rho_{12}\rho_{13}}{\sqrt{1-\rho_{12}^2}}\,dW_{2t} + \sigma_3\sqrt{1-\rho_{13}^2-\frac{(\rho_{23}-\rho_{12}\rho_{13})^2}{1-\rho_{12}^2}}\,dW_{3t} \tag{2.135}$$

$$r_t = \mu - X_{1t} - X_{2t} - X_{3t}, \tag{2.136}$$

where $W_1$, $W_2$ and $W_3$ are independent standard one-dimensional Brownian motions under the objective measure, and where we made use of the Cholesky decomposition of the variance-covariance matrix of the Brownian motions. Also, they assume that the market price of risk attached to each $W_q$, denoted by $\theta_q$, is constant. Hence, by Girsanov's theorem,

$$dB_{1t} = \theta_1\,dt + dW_{1t} \tag{2.137}$$

$$dB_{2t} = \theta_2\,dt + dW_{2t} \tag{2.138}$$

$$dB_{3t} = \theta_3\,dt + dW_{3t} \tag{2.139}$$

are three independent Brownian motions under the risk-neutral measure, and the SDEs for the three factors become

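$$dX_{1t} = \xi_1\left(-\frac{\sigma_1\theta_1}{\xi_1} - X_{1t}\right)dt + \sigma_1\,dB_{1t} \tag{2.140}$$

$$dX_{2t} = \xi_2\left(-\frac{\sigma_2\left(\rho_{12}\theta_1 + \sqrt{1-\rho_{12}^2}\,\theta_2\right)}{\xi_2} - X_{2t}\right)dt + \sigma_2\rho_{12}\,dB_{1t} + \sigma_2\sqrt{1-\rho_{12}^2}\,dB_{2t} \tag{2.141}$$

$$dX_{3t} = \xi_3\left(-\frac{\sigma_3}{\xi_3}\left[\rho_{13}\theta_1 + \frac{\rho_{23}-\rho_{12}\rho_{13}}{\sqrt{1-\rho_{12}^2}}\,\theta_2 + \sqrt{1-\rho_{13}^2-\frac{(\rho_{23}-\rho_{12}\rho_{13})^2}{1-\rho_{12}^2}}\,\theta_3\right] - X_{3t}\right)dt + \sigma_3\rho_{13}\,dB_{1t} + \sigma_3\frac{\rho_{23}-\rho_{12}\rho_{13}}{\sqrt{1-\rho_{12}^2}}\,dB_{2t} + \sigma_3\sqrt{1-\rho_{13}^2-\frac{(\rho_{23}-\rho_{12}\rho_{13})^2}{1-\rho_{12}^2}}\,dB_{3t}. \tag{2.142}$$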
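The loadings on the independent Brownian motions in the three-factor Gaussian model come from the Cholesky factor of the 3x3 correlation matrix. The sketch below writes that lower-triangular factor in closed form and rebuilds the correlation matrix from it as a check. The function names and the sample correlations are illustrative assumptions.

```python
import math


def cholesky_3x3_closed_form(rho12, rho13, rho23):
    """Lower-triangular L with L L^T equal to the 3x3 correlation matrix."""
    c22 = math.sqrt(1.0 - rho12 ** 2)
    c32 = (rho23 - rho12 * rho13) / c22
    c33 = math.sqrt(1.0 - rho13 ** 2 - c32 ** 2)
    return [[1.0, 0.0, 0.0],
            [rho12, c22, 0.0],
            [rho13, c32, c33]]


def rebuild_correlation(lower):
    """Compute L L^T for a 3x3 lower-triangular matrix given as nested lists."""
    return [[sum(lower[i][k] * lower[j][k] for k in range(3))
             for j in range(3)]
            for i in range(3)]


# Hypothetical correlations; they must form a positive definite matrix.
lower = cholesky_3x3_closed_form(0.5, 0.3, 0.2)
rebuilt = rebuild_correlation(lower)
```

Row q of `lower`, scaled by the factor's volatility, gives the coefficients multiplying the three independent Brownian increments in the SDE of factor q.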

Solving equations (2.140)-(2.142) conditional on $X_{1u}$, $X_{2u}$, and $X_{3u}$ with an application of the Ito-Doeblin formula to the function $e^{\xi_i t}X_{it}$ yields

(2.143) 6 Ju

X t 2 6 2(< u) +?2 / e-« - dB2s (2.144) Ju

X 3t / 2 \ H #2 (P23 ~ P12P13) (P23 - Pl2Pl3) -^3 Pl3#l + + 031/^3 -P13 e-fo(*-«)d.M V yr P?2

xi <73 (P23 - P13P12) /"' e-C3(t-«)d5. +a3Pnf e- *-")dBlu + , / 2M

(P23 - P12P13) '* +a3i/a3-p13 (2.145) J s

Hence, the distribution of $(X_{1t}, X_{2t}, X_{3t})$ is conditionally multivariate normal with

EQ\XU\FS) = e-M-lXu-^il-e-t*-*), (2.146)

a 2(t-s) V _ 2Jd\P\2 + ^2 Vl ~ P12) n _ c-?2(t- ?2 (2.147)

3(t s £Q[X3i|.Fs] = e-« - >*3.

^2 (P23 ~ P12P13) j ^ _ e-fo(t-*h V1 - P?2 /

ff 3 L / 2 (#23 - P12P13) \ ,-. -f3(t-s)\

(2.148)

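$$Var_Q[X_{1t}\mid\mathcal{F}_s] = \frac{\sigma_1^2}{2\xi_1}\left(1-e^{-2\xi_1(t-s)}\right), \tag{2.149}$$

$$Var_Q[X_{2t}\mid\mathcal{F}_s] = \frac{\sigma_2^2}{2\xi_2}\left(1-e^{-2\xi_2(t-s)}\right), \tag{2.150}$$

$$Var_Q[X_{3t}\mid\mathcal{F}_s] = \frac{\sigma_3^2}{2\xi_3}\left(1-e^{-2\xi_3(t-s)}\right), \tag{2.151}$$

$$Cov_Q[X_{1t},X_{2t}\mid\mathcal{F}_s] = \frac{\rho_{12}\sigma_1\sigma_2}{\xi_1+\xi_2}\left(1-e^{-(\xi_1+\xi_2)(t-s)}\right), \tag{2.152}$$

$$Cov_Q[X_{1t},X_{3t}\mid\mathcal{F}_s] = \frac{\rho_{13}\sigma_1\sigma_3}{\xi_1+\xi_3}\left(1-e^{-(\xi_1+\xi_3)(t-s)}\right), \text{ and} \tag{2.153}$$

$$Cov_Q[X_{3t},X_{2t}\mid\mathcal{F}_s] = \frac{\rho_{23}\sigma_2\sigma_3}{\xi_3+\xi_2}\left(1-e^{-(\xi_3+\xi_2)(t-s)}\right). \tag{2.154}$$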

The three-factor CIR model presented in Chen and Scott [9] specifies the dynamics of the factors under the objective probability measure as

$$dX_{it} = k_i(\theta_i - X_{it})\,dt + \sigma_i\sqrt{X_{it}}\,dW_{it}, \quad i = 1,2,3, \tag{2.155}$$

$$r_t = X_{1t} + X_{2t} + X_{3t}, \tag{2.156}$$

where $W_{it}$, $i = 1,2,3$, are three independent Brownian motions. They assume that the risk premium processes are given by

$$\Lambda_{1t} = \lambda_1 X_{1t}, \qquad \Lambda_{2t} = \lambda_2 X_{2t}, \qquad \Lambda_{3t} = \lambda_3 X_{3t}.$$

By Girsanov's theorem in three dimensions, the system (2.155), (2.156) under the risk-neutral measure becomes

$$dX_{it} = (k_i+\lambda_i)\left(\frac{k_i\theta_i}{k_i+\lambda_i} - X_{it}\right)dt + \sigma_i\sqrt{X_{it}}\,dB_{it}, \quad i = 1,2,3 \tag{2.157}$$

$$r_t = X_{1t} + X_{2t} + X_{3t}, \tag{2.158}$$

where

$$dB_{it} = \frac{\lambda_i}{\sigma_i}\sqrt{X_{it}}\,dt + dW_{it}, \quad i = 1,2,3 \tag{2.159}$$

are independent standard Brownian motions under the risk-neutral measure.

We shall present the $A_{1r}(3)$ models considered in this thesis by considering the $A_{1r}(3)_{MAX}$ model first, and obtaining the other models in the family as special cases of it. The $A_{1r}(3)_{MAX}$ model is directly specified under the risk-neutral probability measure as

$$dv_t = \mu(\bar v - v_t)\,dt + \eta\sqrt{v_t}\,dB_{1t} \tag{2.160}$$

$$d\theta_t = \nu(\bar\theta - \theta_t)\,dt + \sqrt{\zeta^2 + \beta_\theta v_t}\,dB_{2t} + \sigma_{\theta r}\sqrt{\alpha_r + v_t}\,dB_{3t} + \sigma_{\theta v}\,\eta\sqrt{v_t}\,dB_{1t} \tag{2.161}$$

$$dr_t = k_{rv}(\bar v - v_t)\,dt + k(\theta_t - r_t)\,dt + \sqrt{\alpha_r + v_t}\,dB_{3t} + \sigma_{rv}\,\eta\sqrt{v_t}\,dB_{1t} + \sigma_{r\theta}\sqrt{\zeta^2 + \beta_\theta v_t}\,dB_{2t}, \tag{2.162}$$

where $B_1$, $B_2$, and $B_3$ are independent Brownian motions under the risk-neutral measure.

The model is "maximal" in the $A_{1r}(3)$ family in the sense that the minimal known sufficient conditions for admissibility have been imposed, after which the minimal normalizations for econometric identification have been imposed. As opposed to the three-factor CIR models, the $A_{1r}(3)_{MAX}$ model has the short-term rate as one of the factors, and has a single variable, $v_t$, determining the conditional volatility structure of all three state variables. Hence, the model allows the three factors to be conditionally correlated. Also, the model allows feedback between the drift of the short rate and the other two factors.

The $A_{1r}(3)_{DS}$ model is the special case of the system (2.160)-(2.162) where the following parameter restrictions are applied

$$\beta_\theta = \alpha_r = k_{rv} = \sigma_{\theta v} = 0, \tag{2.163}$$

yielding

$$dv_t = \mu(\bar v - v_t)\,dt + \eta\sqrt{v_t}\,dB_{1t} \tag{2.164}$$

$$d\theta_t = \nu(\bar\theta - \theta_t)\,dt + \zeta\,dB_{2t} + \sigma_{\theta r}\sqrt{v_t}\,dB_{3t} \tag{2.165}$$

$$dr_t = k(\theta_t - r_t)\,dt + \sqrt{v_t}\,dB_{3t} + \sigma_{rv}\,\eta\sqrt{v_t}\,dB_{1t} + \sigma_{r\theta}\,\zeta\,dB_{2t}. \tag{2.166}$$

That is, the volatility variable $v_t$ does not affect the drift of the short rate variable, and is not conditionally correlated with the second variable, $\theta_t$. Additionally, it precludes the volatility of $v_t$ from affecting the volatility of $\theta_t$.

In addition to the parameter restriction given by equation (2.163), the $A_{1r}(3)_{BDFS}$ model is obtained by imposing the extra restrictions on the system (2.160)-(2.162) given by

$$\sigma_{\theta r} = \sigma_{r\theta} = 0. \tag{2.167}$$

Hence, the $A_{1r}(3)_{BDFS}$ model is directly specified under the risk-neutral probability measure as

$$dv_t = \mu(\bar v - v_t)\,dt + \eta\sqrt{v_t}\,dB_{1t} \tag{2.168}$$

$$d\theta_t = \nu(\bar\theta - \theta_t)\,dt + \zeta\,dB_{2t} \tag{2.169}$$

$$dr_t = k(\theta_t - r_t)\,dt + \sqrt{v_t}\,dB_{3t} + \sigma_{rv}\,\eta\sqrt{v_t}\,dB_{1t}. \tag{2.170}$$

Hence, in addition to the interpretation of the parameter restrictions imposed to obtain the $A_{1r}(3)_{DS}$ model, the $A_{1r}(3)_{BDFS}$ model constrains the conditional correlation between $r_t$ and $\theta_t$ to be zero, so that the variable $\theta_t$ affects $r_t$ only through the drift of $r_t$.

The $A_{2r}(3)_{MAX}$ model is directly specified under the risk-neutral probability measure as

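$$dv_t = \mu(\bar v - v_t)\,dt + k_{v\theta}(\bar\theta - \theta_t)\,dt + \eta\sqrt{v_t}\,dB_{1t} \tag{2.171}$$

$$d\theta_t = \nu(\bar\theta - \theta_t)\,dt + k_{\theta v}(\bar v - v_t)\,dt + \zeta\sqrt{\theta_t}\,dB_{2t} \tag{2.172}$$

$$dr_t = k_{rv}(\bar v - v_t)\,dt - k_{r\theta}(\bar\theta - \theta_t)\,dt + k(\bar r - r_t)\,dt + \sigma_{rv}\,\eta\sqrt{v_t}\,dB_{1t} + \sqrt{\alpha_r + \beta_\theta\theta_t + v_t}\,dB_{3t} + \sigma_{r\theta}\,\zeta\sqrt{\theta_t}\,dB_{2t}, \tag{2.173}$$

where $B_1$, $B_2$, and $B_3$ are independent Brownian motions under the risk-neutral measure. Observe that the volatilities of all three state variables are affine functions of two of the three variables, namely $v_t$ and $\theta_t$. A primary difference between the $A_{1r}(3)_{MAX}$ and the $A_{2r}(3)_{MAX}$ models is that the latter allows for feedback in the drifts of the $v_t$ and $\theta_t$ variables.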

The $A_{2r}(3)_{DS}$ model is the special case of the system (2.171)-(2.173) where the following parameter restrictions are applied

$$\beta_\theta = \alpha_r = k_{v\theta} = \sigma_{r\theta} = 0, \tag{2.174}$$

and

$$\bar r = \bar\theta \tag{2.175}$$

$$k = k_{r\theta}. \tag{2.176}$$

The $A_{2r}(3)_{DS}$ model constrains the conditional correlation between $r_t$ and $\theta_t$ to be zero, and does not allow feedback from $\theta_t$ into the drift of $v_t$. Also, it constrains the long-run means of $r_t$ and $\theta_t$ to be equal. Hence, it is given by

$$dv_t = \mu(\bar v - v_t)\,dt + \eta\sqrt{v_t}\,dB_{1t} \tag{2.177}$$

$$d\theta_t = \nu(\bar\theta - \theta_t)\,dt + k_{\theta v}(\bar v - v_t)\,dt + \zeta\sqrt{\theta_t}\,dB_{2t} \tag{2.178}$$

$$dr_t = k_{rv}(\bar v - v_t)\,dt + k(\theta_t - r_t)\,dt + \sqrt{v_t}\,dB_{3t} + \sigma_{rv}\,\eta\sqrt{v_t}\,dB_{1t}. \tag{2.179}$$

In addition to the parameter restrictions given by equations (2.174)- (2.176), the Air {3)chen model imposes further restrictions on the system (2.171)- (2.173), given by

kev ~ krv = arv = 0. (2.180)

The A2r(3)Chen is given by

dvt = y(v — vt) dt + r}y/v~tdB\t (2.181)

ddt = v(9-9t)dt + (y/9~tdB2t (2.182)

drt = k {9t - rt) dt + y/FtdB3t (2.183)

Notice that there is no feedback in the drifts of the variables $v_t$ and $\theta_t$, and that the former affects the dynamics of the short rate only through the volatility, and the latter only through the drift. Finally, a primary difference between the Chen and BDFS models is that the variable $\theta_t$ follows a square root diffusion in the former, whereas it is Gaussian in the latter.

2.5.1 The Pricing of Zero-Coupon Bonds

We assume that the solution to the analytical bond price function is an exponential affine function of the three factors given by

$$P(t, T_B) = \exp\{A(\tau_B) - X_{1t}B_1(\tau_B) - X_{2t}B_2(\tau_B) - X_{3t}B_3(\tau_B)\} \tag{2.184}$$

$$\tau_B = T_B - t.$$

We shall utilize the B-S method in three dimensions, which entails applying the Ito-Doeblin formula in three dimensions and the separation of variables to derive the system of ODEs describing the functions on the right side of equation (2.184). The three factor Vasicek and CIR models give rise to closed form solutions to these functions, whereas in all the other cases some of these functions must be determined numerically. For such models we obtain a system of ODEs that can be solved recursively. That is, we can obtain the function $B_3$ analytically, without resorting to any numerical techniques and independently of the other functions in the system, and we can obtain the function $B_2$ similarly, except in the $A_{2r}(3)_{MAX}$ model, where we need to resort to numerical methods to determine this function.

An interesting feature of the $A_{1r}(3)_{BDFS}$, $A_{1r}(3)_{DS}$, and $A_{1r}(3)_{MAX}$ models is that the function $B_1$ in equation (2.184) is the solution of a Riccati differential equation and can be expressed using the Whittaker functions, which arise from the Whittaker differential equation. This was accomplished with the Maple 8 package that solves the Riccati differential equation via a symmetry scheme. Although we obtained the function $B_1$ in these models via Maple 8, we could not obtain the function $A$ by this method, since Maple 8 was unable to compute the integral of the function $B_1$ when it was expressed in terms of the Whittaker functions. For this reason, in all these cases we utilized MATLAB to solve the ODE systems numerically via the ODE45 algorithm, which is based on an explicit Runge-Kutta (4,5) formula. In some instances, we shall present the graphs of the $B_1$ function obtained via Maple 8 to illustrate interesting differences in the approximations. For more information on the Whittaker functions, the reader may consult Abramowitz and Stegun [1] and Luke [24], which are the references mentioned in Maple 8.

The ODE systems arising from the $A_{2r}(3)_{Chen}$, $A_{2r}(3)_{DS}$, and $A_{2r}(3)_{MAX}$ models can still be solved recursively. That is, we first solve for the function $B_3$ analytically, then substitute this function into the ODE or ODE subsystem for the $B_1$ and $B_2$ functions, and then solve for them. Note that in this case we always have a two-dimensional system of Riccati equations in the $B_1$ and $B_2$ functions, where the degree of their interdependence increases as we head towards the most complex model, the $A_{2r}(3)_{MAX}$ model. In the $A_{2r}(3)_{Chen}$ model, the Riccati system is composed of two independent Riccati ODEs, while the $A_{2r}(3)_{DS}$ and $A_{2r}(3)_{MAX}$ models give rise to an interdependent Riccati system, where the dependence is only through the $B_2$ function appearing linearly in the Riccati equation for the $B_1$ function. Note that the $A_{2r}(3)_{MAX}$ model is more complex since both Riccati equations now include quadratic and cross terms of the $B_3$ function.

Synthesizing the above for all the models presented in this section, in terms of the number of Riccati equations and their degree of dependence, we observe that at one extreme the three factor Vasicek model gives rise to neither a Riccati system of ODEs nor a single Riccati ODE. The three factor CIR model is at the other extreme, giving rise to a system of ODEs with three Riccati equations which are not interdependent. The models presented in this section in the $A_1(3)$ class give rise to a single Riccati equation, while the models in the $A_2(3)$ class give rise to a two-dimensional dependent Riccati system.

In the three-factor Vasicek model, by applying the Ito-Doeblin formula in three dimensions to the function $D(t, T_B)P(t, T_B)$, where $P(t, T_B)$ is given by equation (2.184), and setting the $dt$ term equal to zero, we obtain a PDE characterizing the bond's price. By the separation of variables theorem from the theory of PDEs, this PDE gives rise to the following system of ODEs

B'l(TB) = -(l+tiBi(rB)), i = 1,2,3 (2.185)

Bi{0) = 0, i = l,2,3 (2.186)

A'{TB) = -// + 9lalBl{rB) + a2{9ipn + 62yJI - p\2)B2{rB)

+ B1{TB)B2(TB)(oip12a2)

l 2 + -B\{rB){a,Pl2) + \BI{TB)OI

23 12 3 + B3(TB)

2 2 . D / \/j J / 2 (^ 3 ~~ P12P13) \ + B3{TB)03 I Was - pf3 j——5 I 03

B T a + 2 I( s) 3 + Ti^3pi3Si(TB)JB3(rs) + a2a3p23B2(TB)B3(rB) (2.187)

.4(0) = 0. (2.188)

Observe that the equations for the $B_i$'s are linear ODEs and have solutions given by

$$B_i(\tau_B) = \frac{1 - e^{-\xi_i\tau_B}}{\xi_i}, \quad i = 1,2,3. \tag{2.189}$$

The expression for $A$ may be integrated to obtain

^(TB) = -(TB) ( A* + #I ( T + ^— + -7—

(P23~P]2P13) ' / 3 O „ I „ n2 (p23~P32pl3) '

(p23 — P12P13) \ \ / 0-3 tfl CT2P12 ^3Pl3 \ ,f^2 v W?2" •p?2 +f 6+ir+~6~J+ +

TB / ^3 (P23 - P12P13)' +T3r_ft8 i-A

^I(TB) 6 U2 + 66 + 66 /

1 D/ s , 01/01202 + 0202 A/ - pf2 (P\2Cr\23 ~ P12P13) ^2(TB) I : * ( —— h —2 + 6 V 66 ' £2 66

„ / ^ a T /^r /i2 (P23-P12P13) y$C73 aZ P D , , 1 ^3P1301 , 0302 (p23 - P12P13) , V ~~ ™ 1^£ 6 6 V1 - Pis 6

a a 13

For the three factor CIR model we obtain the following system of ODEs

$$-B_i'(\tau_B) = (k_i + \lambda_i)B_i(\tau_B) + \frac{\sigma_i^2}{2}B_i^2(\tau_B) - 1, \quad i = 1,2,3 \tag{2.191}$$

$$B_i(0) = 0, \quad i = 1,2,3 \tag{2.192}$$

$$A'(\tau_B) = -\sum_{i=1}^{3}\theta_i k_i B_i(\tau_B) \tag{2.193}$$

$$A(0) = 0 \tag{2.194}$$

$\forall t \in [0, T_B)$. Since the three $B_i$ functions are independent, the equations (2.191) and (2.193) are a simple extension of the corresponding single and two-factor CIR ODE systems. The system has the following solution:

$$B_i(\tau_B) = \frac{2(1 - e^{-\gamma_i\tau_B})}{2\gamma_i e^{-\gamma_i\tau_B} + (k_i + \lambda_i + \gamma_i)(1 - e^{-\gamma_i\tau_B})} \tag{2.195}$$

$$A(\tau_B) = \sum_{i=1}^{3}\frac{2k_i\theta_i}{\sigma_i^2}\log\left(\frac{2\gamma_i e^{0.5\,\tau_B(k_i + \lambda_i - \gamma_i)}}{2\gamma_i e^{-\gamma_i\tau_B} + (k_i + \lambda_i + \gamma_i)(1 - e^{-\gamma_i\tau_B})}\right) \tag{2.196}$$

$$\gamma_i = \sqrt{(k_i + \lambda_i)^2 + 2\sigma_i^2} \tag{2.197}$$

for $i = 1,2,3$.
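As an illustration only, the closed forms (2.195)-(2.197) can be evaluated directly and checked against the Riccati equation (2.191) by a finite difference. The sketch below assumes the reading $P = \exp\{A - \sum_i X_i B_i\}$ of equation (2.184); the function names and the parameter values in `PARAMS` are hypothetical and are not taken from the thesis.

```python
import math


def cir_gamma(k, lam, sigma):
    """gamma_i as in equation (2.197)."""
    return math.sqrt((k + lam) ** 2 + 2.0 * sigma ** 2)


def cir_B(tau, k, lam, sigma):
    """B_i(tau) as in equation (2.195)."""
    g = cir_gamma(k, lam, sigma)
    decay = math.exp(-g * tau)
    return 2.0 * (1.0 - decay) / (2.0 * g * decay + (k + lam + g) * (1.0 - decay))


def cir_A_term(tau, k, lam, sigma, theta):
    """Contribution of one factor to A(tau), equation (2.196)."""
    g = cir_gamma(k, lam, sigma)
    decay = math.exp(-g * tau)
    num = 2.0 * g * math.exp(0.5 * tau * (k + lam - g))
    den = 2.0 * g * decay + (k + lam + g) * (1.0 - decay)
    return (2.0 * k * theta / sigma ** 2) * math.log(num / den)


def cir_bond_price(x, tau, params):
    """Exponential affine price exp{A - sum_i x_i B_i} for factor values x."""
    a = sum(cir_A_term(tau, *p) for p in params)
    xb = sum(xi * cir_B(tau, p[0], p[1], p[2]) for xi, p in zip(x, params))
    return math.exp(a - xb)


def riccati_residual(tau, k, lam, sigma, h=1e-5):
    """B'(tau) minus the right side 1 - (k+lam)B - 0.5 sigma^2 B^2 implied by (2.191)."""
    db = (cir_B(tau + h, k, lam, sigma) - cir_B(tau - h, k, lam, sigma)) / (2.0 * h)
    b = cir_B(tau, k, lam, sigma)
    return db - (1.0 - (k + lam) * b - 0.5 * sigma ** 2 * b ** 2)


# Hypothetical (k_i, lambda_i, sigma_i, theta_i) triples, for illustration only.
PARAMS = [(0.5, 0.0, 0.1, 0.02), (0.3, -0.05, 0.08, 0.015), (1.2, 0.1, 0.15, 0.01)]
```

Calling `riccati_residual` at a few maturities is a cheap way to catch a sign or exponent slip when transcribing the closed form; the residual should be at the level of the finite-difference error.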

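For the $A_{1r}(3)_{MAX}$ model, the system of ODEs characterizing $P(t, T_B)$ given by equation (2.184) is:

$$\begin{aligned} A'(\tau_B) ={}& -B_1(\tau_B)\mu\bar{v} - B_2(\tau_B)\bar{\theta}\nu - B_3(\tau_B)k_{rv}\bar{v} + \tfrac{1}{2}B_2^2(\tau_B)\left(\sigma_{\theta r}^2\alpha_r + \zeta^2\right) \\ &+ \tfrac{1}{2}B_3^2(\tau_B)\left(\sigma_{r\theta}^2\zeta^2 + \alpha_r\right) + B_2(\tau_B)B_3(\tau_B)\left(\sigma_{r\theta}\zeta^2 + \sigma_{\theta r}\alpha_r\right) \end{aligned} \tag{2.198}$$

$$\begin{aligned} B_1'(\tau_B) ={}& -B_1(\tau_B)\mu - \tfrac{1}{2}B_1^2(\tau_B)\eta^2 - \tfrac{1}{2}B_3^2(\tau_B)\left(\sigma_{rv}^2\eta^2 + \sigma_{r\theta}^2\beta_\theta + 1\right) \\ &- \tfrac{1}{2}B_2^2(\tau_B)\left(\beta_\theta + \sigma_{\theta v}^2\eta^2 + \sigma_{\theta r}^2\right) - B_1(\tau_B)B_3(\tau_B)\sigma_{rv}\eta^2 - B_3(\tau_B)k_{rv} \\ &- B_1(\tau_B)B_2(\tau_B)\sigma_{\theta v}\eta^2 - B_2(\tau_B)B_3(\tau_B)\left(\sigma_{\theta v}\sigma_{rv}\eta^2 + \sigma_{r\theta}\beta_\theta + \sigma_{\theta r}\right) \end{aligned} \tag{2.199}$$

$$B_2'(\tau_B) = -B_2(\tau_B)\nu + B_3(\tau_B)k \tag{2.200}$$

$$B_3'(\tau_B) = 1 - B_3(\tau_B)k \tag{2.201}$$

$$B_i(0) = 0, \quad i = 1,2,3 \tag{2.202}$$

$$A(0) = 0 \tag{2.203}$$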

The solutions to equations (2.200) and (2.201) are given by

$$B_2(\tau_B) = \frac{1}{\nu} - \frac{e^{-k\tau_B}}{\nu - k} + \frac{k e^{-\nu\tau_B}}{\nu^2 - k\nu} \tag{2.204}$$

$$B_3(\tau_B) = \frac{1 - e^{-k\tau_B}}{k}. \tag{2.205}$$

The functions $B_1$ and $A$ have to be determined numerically. Specializing the system (2.199)-(2.201) with the parameter restrictions given by equation (2.163), we obtain the system of ODEs for the $A_{1r}(3)_{DS}$ model. Also, if we specialize the system (2.199)-(2.201) with the parameter restrictions given by equation (2.167) in addition to those given by equation (2.163), we obtain the system of ODEs for the $A_{1r}(3)_{BDFS}$ model. In each special case, the functions $B_2$ and $B_3$ are given by equations (2.204) and (2.205), and the functions $A$ and $B_1$ must be computed numerically.
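The recursive structure described above can be checked numerically. The following sketch integrates the linear subsystem (2.200)-(2.201) with a fixed-step classical Runge-Kutta scheme (a stand-in chosen for self-containment, not the adaptive ODE45 routine used in the thesis) and compares the result with the closed forms $B_2(\tau) = 1/\nu - e^{-k\tau}/(\nu - k) + k e^{-\nu\tau}/(\nu(\nu - k))$ and $B_3(\tau) = (1 - e^{-k\tau})/k$. The names `kappa`, `nu` and all numerical values are illustrative assumptions.

```python
import math


def b3_closed(tau, kappa):
    """Closed form for B_3, the solution of B_3' = 1 - kappa * B_3, B_3(0) = 0."""
    return (1.0 - math.exp(-kappa * tau)) / kappa


def b2_closed(tau, kappa, nu):
    """Closed form for B_2 (requires nu != kappa), solving B_2' = -nu B_2 + kappa B_3."""
    return (1.0 / nu
            - math.exp(-kappa * tau) / (nu - kappa)
            + kappa * math.exp(-nu * tau) / (nu * (nu - kappa)))


def rhs(state, kappa, nu):
    """Right side of the (B_2, B_3) subsystem."""
    b2, b3 = state
    return (-nu * b2 + kappa * b3, 1.0 - kappa * b3)


def rk4(tau_end, kappa, nu, steps=400):
    """Fixed-step classical Runge-Kutta integration from tau = 0 to tau_end."""
    h = tau_end / steps
    y = (0.0, 0.0)  # initial conditions B_2(0) = B_3(0) = 0
    for _ in range(steps):
        s1 = rhs(y, kappa, nu)
        s2 = rhs(tuple(v + 0.5 * h * d for v, d in zip(y, s1)), kappa, nu)
        s3 = rhs(tuple(v + 0.5 * h * d for v, d in zip(y, s2)), kappa, nu)
        s4 = rhs(tuple(v + h * d for v, d in zip(y, s3)), kappa, nu)
        y = tuple(v + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                  for v, a, b, c, d in zip(y, s1, s2, s3, s4))
    return y
```

Because $B_3$ does not depend on the other functions and $B_2$ depends only on $B_3$, the two closed forms can be obtained one after the other, which is what the comparison exercises.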

A similar situation occurs with the models in the $A_{2r}(3)$ family. Beginning with the $A_{2r}(3)_{MAX}$ model, the system of ODEs characterizing the price of a zero coupon bond given by equation (2.184) is

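$$\begin{aligned} A'(\tau_B) ={}& -B_1(\tau_B)\left(\mu\bar{v} + k_{v\theta}\bar{\theta}\right) - B_2(\tau_B)\left(\bar{\theta}\nu + k_{\theta v}\bar{v}\right) + \tfrac{1}{2}B_3^2(\tau_B)\alpha_r \\ &- B_3(\tau_B)\left(k_{rv}\bar{v} - k_{r\theta}\bar{\theta} + k\bar{r}\right) \end{aligned} \tag{2.206}$$

$$\begin{aligned} B_1'(\tau_B) ={}& -B_1(\tau_B)\mu - \tfrac{1}{2}B_1^2(\tau_B)\eta^2 - \tfrac{1}{2}B_3^2(\tau_B)\left(\sigma_{rv}^2\eta^2 + 1\right) - B_2(\tau_B)k_{\theta v} \\ &- B_3(\tau_B)k_{rv} - B_1(\tau_B)B_3(\tau_B)\sigma_{rv}\eta^2 \end{aligned} \tag{2.207}$$

$$\begin{aligned} B_2'(\tau_B) ={}& -B_1(\tau_B)k_{v\theta} - B_2(\tau_B)\nu + B_3(\tau_B)k_{r\theta} - \tfrac{1}{2}B_2^2(\tau_B)\zeta^2 \\ &- \tfrac{1}{2}B_3^2(\tau_B)\left(\sigma_{r\theta}^2\zeta^2 + \beta_\theta\right) - B_2(\tau_B)B_3(\tau_B)\sigma_{r\theta}\zeta^2 \end{aligned} \tag{2.208}$$

$$B_3'(\tau_B) = 1 - B_3(\tau_B)k \tag{2.209}$$

$$B_i(0) = 0, \quad i = 1,2,3 \tag{2.210}$$

$$A(0) = 0. \tag{2.211}$$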

The solution to equation (2.209) is given by equation (2.205). The bond price solutions under the nested models $A_{2r}(3)_{Chen}$ and $A_{2r}(3)_{DS}$ are obtained by setting the appropriate parameter restrictions. For the $A_{2r}(3)_{DS}$ model, this amounts to applying the restrictions given by equations (2.174), (2.175), and (2.176). For the $A_{2r}(3)_{Chen}$ model, in addition to the restrictions giving rise to the $A_{2r}(3)_{DS}$ model, we require the parameter restriction given by equation (2.180).

Figure 2.1 presents plots of the functions arising in the bond price function given by equation (2.184) for the models considered in the $A_1(3)$ and $A_2(3)$ families. The parameters are taken from Dai and Singleton [12], and are replicated in Tables D.3 and D.4 respectively. The functions were determined using the algorithm ODE45 in MATLAB.

[Figure 2.1: The functions $A$, $B_1$, $B_2$, $B_3$. Four panels, each plotting one function arising in the system of ODEs for the zero coupon bond prices against time in years (0 to 4); legend: BDFS, Chen, DS13, DS23, MAX13, MAX23.]

Interestingly, most of these functions can be computed in Maple 8, which utilizes a symmetry scheme to solve the Riccati equations. For example, using Maple 8, the solution to the Riccati equation arising in the $A_{1r}(3)_{BDFS}$ model can be expressed in terms of the Whittaker functions, and its graph is given in Figure 2.2.

[Figure 2.2: The function $B_1(\tau_B)$ in the BDFS model obtained using Maple 8.]

Notice how in Figure 2.2 the function $B_1(\tau_B)$ oscillates rapidly within a "band" as $\tau_B$ gets closer to the origin. This feature is not captured when using MATLAB to numerically determine this function with the algorithm ODE45 with a maximal step size of ^, relative error not exceeding 2.22045e-14, and absolute error not exceeding 1e-14. The absolute error tolerance for the $i$th component is a threshold below which the value of the $i$th solution component is unimportant, and it determines the accuracy when the solution approaches zero.
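To make the numerical step concrete, here is a minimal sketch of how the ODE system for the $A_{2r}(3)_{Chen}$ model could be integrated. It assumes that applying the restrictions (2.174)-(2.176) and (2.180) to the system (2.206)-(2.211) leaves $A' = -B_1\mu\bar{v} - B_2\nu\bar{\theta}$, $B_1' = -\mu B_1 - \tfrac{1}{2}\eta^2B_1^2 - \tfrac{1}{2}B_3^2$, $B_2' = -\nu B_2 + kB_3 - \tfrac{1}{2}\zeta^2B_2^2$, and $B_3' = 1 - kB_3$. A fixed-step classical Runge-Kutta scheme is used in place of MATLAB's adaptive ODE45, and accuracy is gauged by halving the step. The parameter values are hypothetical and are not those of Dai and Singleton [12].

```python
import math


def chen_rhs(y, p):
    """Right side of the assumed Chen-restricted system for (A, B1, B2, B3)."""
    _, b1, b2, b3 = y
    db3 = 1.0 - p["k"] * b3
    db2 = -p["nu"] * b2 + p["k"] * b3 - 0.5 * p["zeta"] ** 2 * b2 ** 2
    db1 = -p["mu"] * b1 - 0.5 * p["eta"] ** 2 * b1 ** 2 - 0.5 * b3 ** 2
    da = -b1 * p["mu"] * p["vbar"] - b2 * p["nu"] * p["thetabar"]
    return (da, db1, db2, db3)


def rk4_solve(rhs, y0, tau_end, steps, p):
    """Fixed-step classical Runge-Kutta integration from 0 to tau_end."""
    h = tau_end / steps
    y = tuple(y0)
    for _ in range(steps):
        s1 = rhs(y, p)
        s2 = rhs(tuple(v + 0.5 * h * d for v, d in zip(y, s1)), p)
        s3 = rhs(tuple(v + 0.5 * h * d for v, d in zip(y, s2)), p)
        s4 = rhs(tuple(v + h * d for v, d in zip(y, s3)), p)
        y = tuple(v + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                  for v, a, b, c, d in zip(y, s1, s2, s3, s4))
    return y


# Hypothetical parameter values, for illustration only.
PARAMS = {"k": 0.5, "nu": 0.2, "mu": 0.4, "eta": 0.1,
          "zeta": 0.05, "vbar": 0.0002, "thetabar": 0.05}

Y0 = (0.0, 0.0, 0.0, 0.0)  # A(0) = B_i(0) = 0
coarse = rk4_solve(chen_rhs, Y0, 4.0, 400, PARAMS)
fine = rk4_solve(chen_rhs, Y0, 4.0, 800, PARAMS)
# Largest change in any component when the step is halved.
step_halving_gap = max(abs(c - f) for c, f in zip(coarse, fine))
```

Step halving is only a rough error indicator. As the discussion of Figure 2.2 suggests, a smooth-looking numerical solution does not by itself rule out fine structure that a given step size fails to resolve.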

2.5.2 The Zero-Coupon Bond Futures Price

For the three factor Vasicek model, we shall adopt the same approach as in the corresponding single and two factor models. That is, we shall compute the conditional distribution of

$$Z = \sum_{i=1}^{3} X_{iT_F}B_i(T_B - T_F) \tag{2.212}$$

conditional on $\mathcal{F}_t$, and utilize it to compute the conditional expectation given by equation (1.8), which defines the futures price at time $t$. For all the three factor models except the three factor Vasicek model, we shall compute the bond futures price via the B-S method. That is, for these models, we proceed as in Section 2.5.1. Assume that the zero-coupon bond futures price is an exponential affine function of the three state variables

$$Fut(t, T_F, T_B) = \exp\{A_F(\tau_F) - X_{1t}B_{1F}(\tau_F) - X_{2t}B_{2F}(\tau_F) - X_{3t}B_{3F}(\tau_F)\} \tag{2.213}$$

$$\tau_F = T_F - t.$$

Then, we shall utilize the Ito-Doeblin rule in three dimensions applied to the function in equation (2.213) to determine the PDE characterizing it.

Beginning with the three factor Vasicek model, using the system (2.146)-(2.154), we determine that $Z$ given by equation (2.212) is Gaussian with

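$$E_Q[Z \mid \mathcal{F}_t] = \sum_{i=1}^{3} B_i(T_B - T_F)\,E_Q[X_{iT_F} \mid \mathcal{F}_t] \tag{2.214}$$

$$\begin{aligned} Var_Q[Z \mid \mathcal{F}_t] ={}& \sum_{i=1}^{3} B_i^2(T_B - T_F)\,Var_Q[X_{iT_F} \mid \mathcal{F}_t] \\ &+ 2\sum_{i<j} B_i(T_B - T_F)B_j(T_B - T_F)\,Cov_Q[X_{iT_F}, X_{jT_F} \mid \mathcal{F}_t] \end{aligned} \tag{2.215}$$

The analytical futures price function is given by

$$\begin{aligned} Fut(t, T_F, T_B) &= E_Q[\exp\{A(T_B - T_F) - Z\} \mid \mathcal{F}_t] \\ &= \exp\left\{A(T_B - T_F) - E_Q[Z \mid \mathcal{F}_t] + \tfrac{1}{2}Var_Q[Z \mid \mathcal{F}_t]\right\} \end{aligned} \tag{2.216}$$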
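The second equality in (2.216) is the moment generating function of a Gaussian variable: if $Z$ is normal with mean $m$ and variance $s^2$, then $E[e^{a - Z}] = e^{a - m + s^2/2}$. A small sketch, with hypothetical numbers, comparing that closed form with a direct trapezoidal quadrature of $e^{a - z}$ against the normal density:

```python
import math


def futures_price_gaussian(a, mean_z, var_z):
    """Closed form exp{a - E[Z] + Var[Z]/2} for Gaussian Z."""
    return math.exp(a - mean_z + 0.5 * var_z)


def futures_price_quadrature(a, mean_z, var_z, n=20001, width=10.0):
    """Trapezoidal integration of exp(a - z) times the normal density of Z."""
    sd = math.sqrt(var_z)
    lo = mean_z - width * sd
    hi = mean_z + width * sd
    h = (hi - lo) / (n - 1)
    total = 0.0
    for j in range(n):
        z = lo + j * h
        density = math.exp(-0.5 * ((z - mean_z) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if j in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(a - z) * density
    return total * h
```

The inputs `a`, `mean_z` and `var_z` stand for $A(T_B - T_F)$ and the conditional moments in (2.214)-(2.215); in practice they would come from the model, which this sketch does not attempt to reproduce.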

which is an exponential affine function of the three factors with

$$B_{iF}(\tau_F) = B_i(T_B - T_F)\,e^{-\xi_i\tau_F} \tag{2.217}$$

AF(rF) = A(rB - TF) + B1(TB - rF)^p-(l - e~^)

+ 2L ^2 Bi\TB - TF)

13P%A + £ \ * Ufa ~ TF)B3{TB - rF)

6 V"vt Ji P?2

{ Pl Z) + B3(rB - TF)eJo,-p\z- ^; T \l - T^) (2-218) where i = 1. 2, 3. For the three factor CIR model, we compute the futures price by utilizing the B-S method, which gives rise to the following system of ODEs

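$$-B_{iF}'(\tau_F) = (k_i + \lambda_i)B_{iF}(\tau_F) + \frac{\sigma_i^2}{2}B_{iF}^2(\tau_F) \tag{2.219}$$

$$B_{iF}(0) = B_i(T_B - T_F) \tag{2.220}$$

$$A_F'(\tau_F) = -\sum_{i=1}^{3} k_i\theta_i B_{iF}(\tau_F) \tag{2.221}$$

$$A_F(0) = A(T_B - T_F) \tag{2.222}$$

$\forall \tau_F \in [0, T_F)$. The solutions are given by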

2{kt + Xi) BiF{rF) = (2.223) g2 £j + ^fcj+Ai-J^e(fci+Aj)rF)

2fc0,- •A AF(TF) = A{TB.,TF) + J^ log (2.224) | _j_ ^i+Ai-Ji\c(fcj+A,-)TF i=l a;

J, = B^TB-Tp)^-, where i — 1.2, 3. Notice how this system is simply an extension of its single and two factor counterparts. This pattern occurs since the equations for the Bip's given by equa­ tion (2.223) are independent of each other, which was the case in the single and two factor CIR models.

For the models in the $A_{1r}(3)$ and $A_{2r}(3)$ families, since the $A_{1r}(3)_{BDFS}$ and $A_{1r}(3)_{DS}$ models are nested in the $A_{1r}(3)_{MAX}$ model, and the $A_{2r}(3)_{Chen}$ and $A_{2r}(3)_{DS}$ models are nested in the $A_{2r}(3)_{MAX}$ model, we can obtain their ODE systems by considering the parameter restrictions they imply on their maximal model counterparts. The system of ODEs for the $A_{1r}(3)_{MAX}$ model is given by:

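$$\begin{aligned} A_F'(\tau_F) ={}& -B_{1F}(\tau_F)\mu\bar{v} - B_{2F}(\tau_F)\bar{\theta}\nu - B_{3F}(\tau_F)k_{rv}\bar{v} + \tfrac{1}{2}B_{2F}^2(\tau_F)\left(\sigma_{\theta r}^2\alpha_r + \zeta^2\right) \\ &+ \tfrac{1}{2}B_{3F}^2(\tau_F)\left(\sigma_{r\theta}^2\zeta^2 + \alpha_r\right) + B_{2F}(\tau_F)B_{3F}(\tau_F)\left(\sigma_{r\theta}\zeta^2 + \sigma_{\theta r}\alpha_r\right) \end{aligned} \tag{2.225}$$

$$\begin{aligned} B_{1F}'(\tau_F) ={}& -B_{1F}(\tau_F)\mu - \tfrac{1}{2}B_{1F}^2(\tau_F)\eta^2 - \tfrac{1}{2}B_{3F}^2(\tau_F)\left(\sigma_{rv}^2\eta^2 + \sigma_{r\theta}^2\beta_\theta + 1\right) \\ &- \tfrac{1}{2}B_{2F}^2(\tau_F)\left(\beta_\theta + \sigma_{\theta v}^2\eta^2 + \sigma_{\theta r}^2\right) - B_{1F}(\tau_F)B_{3F}(\tau_F)\sigma_{rv}\eta^2 - B_{3F}(\tau_F)k_{rv} \\ &- B_{1F}(\tau_F)B_{2F}(\tau_F)\sigma_{\theta v}\eta^2 \\ &- B_{2F}(\tau_F)B_{3F}(\tau_F)\left(\sigma_{\theta v}\sigma_{rv}\eta^2 + \sigma_{r\theta}\beta_\theta + \sigma_{\theta r}\right) \end{aligned} \tag{2.226}$$

$$B_{2F}'(\tau_F) = -B_{2F}(\tau_F)\nu + B_{3F}(\tau_F)k \tag{2.227}$$

$$B_{3F}'(\tau_F) = -B_{3F}(\tau_F)k \tag{2.228}$$

$$B_{iF}(0) = B_i(T_B - T_F) \tag{2.229}$$

$$A_F(0) = A(T_B - T_F) \tag{2.230}$$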

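The solutions to equations (2.227) and (2.228) are given by

$$B_{2F}(\tau_F) = e^{-\nu\tau_F}\left[\frac{1}{\nu} - \frac{e^{-k(T_B - T_F)}}{\nu - k} + \frac{k e^{-\nu(T_B - T_F)}}{\nu^2 - k\nu} + \frac{\left(e^{(\nu - k)\tau_F} - 1\right)\left(1 - e^{-k(T_B - T_F)}\right)}{\nu - k}\right] \tag{2.231}$$

$$B_{3F}(\tau_F) = e^{-k\tau_F}\,\frac{1 - e^{-k(T_B - T_F)}}{k} \tag{2.232}$$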

The functions $B_{1F}$ and $A_F$ have to be determined numerically. Specializing the system (2.225)-(2.228) with the parameter restrictions given by equation (2.163), we obtain the system of ODEs for the $A_{1r}(3)_{DS}$ model. Also, if we specialize the system (2.225)-(2.228) with the parameter restrictions given by equation (2.167) in addition to those given by equation (2.163), we obtain the system of ODEs for the $A_{1r}(3)_{BDFS}$ model. In each special case, the functions $B_{2F}$ and $B_{3F}$ are given by equations (2.231) and (2.232), and the functions $A_F$ and $B_{1F}$ must be computed numerically.

A similar situation occurs with the models we consider in the $A_{2r}(3)$ family. The system of ODEs for the $A_{2r}(3)_{MAX}$ model characterizing the bond futures price is given by:

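$$\begin{aligned} A_F'(\tau_F) ={}& -B_{1F}(\tau_F)\left(\mu\bar{v} + k_{v\theta}\bar{\theta}\right) - B_{2F}(\tau_F)\left(\bar{\theta}\nu + k_{\theta v}\bar{v}\right) + \tfrac{1}{2}B_{3F}^2(\tau_F)\alpha_r \\ &- B_{3F}(\tau_F)\left(k_{rv}\bar{v} - k_{r\theta}\bar{\theta} + k\bar{r}\right) \end{aligned} \tag{2.233}$$

$$\begin{aligned} B_{1F}'(\tau_F) ={}& -B_{1F}(\tau_F)\mu - \tfrac{1}{2}B_{1F}^2(\tau_F)\eta^2 - \tfrac{1}{2}B_{3F}^2(\tau_F)\left(\sigma_{rv}^2\eta^2 + 1\right) - B_{2F}(\tau_F)k_{\theta v} \\ &- B_{3F}(\tau_F)k_{rv} - B_{1F}(\tau_F)B_{3F}(\tau_F)\sigma_{rv}\eta^2 \end{aligned} \tag{2.234}$$

$$\begin{aligned} B_{2F}'(\tau_F) ={}& -B_{1F}(\tau_F)k_{v\theta} - B_{2F}(\tau_F)\nu + B_{3F}(\tau_F)k_{r\theta} - \tfrac{1}{2}B_{2F}^2(\tau_F)\zeta^2 \\ &- \tfrac{1}{2}B_{3F}^2(\tau_F)\left(\sigma_{r\theta}^2\zeta^2 + \beta_\theta\right) - B_{2F}(\tau_F)B_{3F}(\tau_F)\sigma_{r\theta}\zeta^2 \end{aligned} \tag{2.235}$$

$$B_{3F}'(\tau_F) = -B_{3F}(\tau_F)k \tag{2.236}$$

with the initial conditions given by equations (2.229) and (2.230). All the functions must be determined numerically except for the function $B_{3F}$, which is given by equation (2.232), and this holds true for the $A_{2r}(3)_{DS}$ and $A_{2r}(3)_{MAX}$ models.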

2.5.3 The Pricing of a European Option on a Zero-Coupon Bond

The three factor Vasicek model is the only model among the three factor models discussed in this thesis which admits an analytical price function for the European call and put options on a zero-coupon bond without resorting to numerical methods. We shall use the forward measure approach to determine the price of this financial derivative under the three factor Vasicek interest rate model. We can determine the transition distribution under the forward measure $Q^{T_0}$, and the Radon-Nikodym derivative is derived by an application of the Ito-Doeblin formula to the function $\ln P(t, T_0)$ and Girsanov's theorem in three dimensions, which is given by

P(t.T0) '

where

dBl° = - (B^roh! + B2{r0)o2pi2 + B3{r0)a3pl3)dt + dBJt (2.237)

Pl iz) dB% = B2(r0)a2Jl - p\2 + B3(TQ)^? f ) dt + d,B2t (2.238)

?To _ _ f R_^„W_ . L. „2 (P23 - P12P1313 ) di#> = - J B3(To)a3\lo3 - p\3 - ^ "f ' J + d,B3t (2.239) 1-P?2

TO = To - t are three independent Brownian motions under the probability measure QT°. The transition

T distribution [Xu, X2t, X3t] under Q ° is Gaussian with

EQr0[XU\TS\ = Xue~^~*) + aiJ e-t*-* {B,{T0)O, + B2(r0)a2p12)du

u + ^J e-fr<'- > (B3(r0)a3p13 - 9,) du, (2.240)

S {t u) EQT0[X2t\Fs} = X2se~^- ) + a2 f e^ ~ {B^rofapu) du J s

{t u) + o2 f e^ ' (B2{T0)O2 - 9lPl2 - 62^l-p\2\ du, (2.241)

{t EQT0{X3t\Fs] = X3se-^ ^+a3J e"^'^ (WSB^TO)) du

{t s) + o3 f e-^ ~ (a2p23B2(T0) +

VarQTo [X2t\Ts] = ^-(1 - e-2fe<*-)) (2.244)

2 Kar0ro[X3t|^s] = |J-(1 - e- ^-)) (2.245)

83 CovQTo[Xlt,X2m = °fzT^ ~ e-«>^'->) (2.246)

Cot-QTo[XM, A'M|:FJ = £^(1 - e-«^)(«-)) (2.247)

Cot^0[X3f,X2t\Fs] = ^» (1 - c-te+fcH*-)) (2.248) ?3 + <2

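Hence, $Z$ given by

$$Z = \sum_{i=1}^{3} B_i(\tau_0)X_{iT_0} \tag{2.249}$$

conditional on $\mathcal{F}_t$ is normally distributed with

$$E_{Q^{T_0}}[Z \mid \mathcal{F}_t] = \sum_{i=1}^{3} B_i(\tau_0)\,E_{Q^{T_0}}[X_{iT_0} \mid \mathcal{F}_t] \tag{2.250}$$

$$\begin{aligned} Var_{Q^{T_0}}[Z \mid \mathcal{F}_t] ={}& \sum_{i=1}^{3} B_i^2(\tau_0)\,Var_{Q^{T_0}}[X_{iT_0} \mid \mathcal{F}_t] \\ &+ 2\sum_{i<j} B_i(\tau_0)B_j(\tau_0)\,Cov_{Q^{T_0}}[X_{iT_0}, X_{jT_0} \mid \mathcal{F}_t] \end{aligned} \tag{2.251}$$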

Using equations (2.240)- (2.248), we may prove the following lemmas

Lemma 2.5.1. Let $Z$ be given by equation (2.249), and $P(t, T_B)$ be the price of a zero coupon bond with maturity $T_B$ given by equations (2.184), (2.189) and (2.190). The conditional distribution of $Z$ is Gaussian under $Q^{T_0}$, characterized by equations (2.250)-(2.251). Then

(Z* - EQT0 \Z\Tt\ - VarQr0 \Z\Tt\ EQTO [P(To,TB)l{P{To,TB)>K}\Tt} = # 1 - $ V yJVarQT0[Z\Ft} where

H = explA(To,TB) + ±VarQT0[Z\Ft] + EQT0[Z\Ft]\

Z* = A(TB - TO)- In K,

where the function $A$ is given by equation (2.190).

Lemma 2.5.2. Let $Z$ be given by equation (2.249), and $P(t, T_B)$ be the price of a zero coupon bond with maturity $T_B$ given by equations (2.184), (2.189) and (2.190). The conditional distribution of $Z$ is Gaussian under $Q^{T_0}$, characterized by equations (2.250)-(2.251). Then

EQr0 lW0,rB)>K}N = 1 - * ^ VarQTo[Z\Tt] ) •

where Z* is defined in Lemma 2.5.1.

Now, using Lemmas 2.5.1 and 2.5.2, we may prove the following

Theorem 2.5.1. The time $t$ price of a European call option with expiry date $T_0$ and strike price $K$ on a zero-coupon bond with maturity date $T_B$ is

^ , , Z*-E0To[Z\Tt]-Var0T0\Z\Tt) ECZB

The price of the put option can be obtained using the put-call parity relation given by equation (1.11).

2.5.4 The price of a European Option on Zero-Coupon Bond Futures

The three factor Vasicek model is the only model among the three factor models discussed in this thesis which admits an analytical price function for the European call and put options on zero-coupon bond futures without resorting to numerical methods. We shall use the forward measure approach, as in the previous section, to determine the price of this financial derivative under the three factor Vasicek interest rate model. Let $Z$ be given by

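$$Z = \sum_{i=1}^{3} B_{iF}(T_0, T_F, T_B)X_{iT_0}, \tag{2.253}$$

where the functions $B_{iF}$ are given by equation (2.217). Then, $Z$ is Gaussian with

$$E_{Q^{T_0}}[Z \mid \mathcal{F}_t] = \sum_{i=1}^{3} B_{iF}(T_0, T_F, T_B)\,E_{Q^{T_0}}[X_{iT_0} \mid \mathcal{F}_t] \tag{2.254}$$

$$\begin{aligned} Var_{Q^{T_0}}[Z \mid \mathcal{F}_t] ={}& \sum_{i=1}^{3} B_{iF}^2(T_0, T_F, T_B)\,Var_{Q^{T_0}}[X_{iT_0} \mid \mathcal{F}_t] \\ &+ 2\sum_{i<j} B_{iF}(T_0, T_F, T_B)B_{jF}(T_0, T_F, T_B)\,Cov_{Q^{T_0}}[X_{iT_0}, X_{jT_0} \mid \mathcal{F}_t] \end{aligned} \tag{2.255}$$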

Lemma 2.5.3. Using equations (2.254) and (2.255),

EQTQ [Fvt(To, Tp, TB)l{FuHT0rTF!TB)>K} l-^t]

Z* - EQT0[Z\Ft]-VarQT0[Z\Ft] = H 1-$ V'orQr0[Z |^i] where

H = f^j.AF(To,TF,TB) + ±VarQT0[Z\Ft] + EQT0[Z\rt]\

Z* = AF(To,TF,TB)-lnK.

and

Lemma 2.5.4.

(Z* - EQT0[Z\Ft]\ EQr0 [l{Fut(To,TF,TB)>K}\rt\ - 1 - * ^-^—-^_-J , where Z* is defined in Lemma 2.5.3.

Theorem 2.5.2. If equations (2.254) and (2.255) hold, then by Lemmas 2.5.3 and 2.5.4, the time $t$ price of the European call option expiring at time $T_0$ with strike price $K$, on a zero coupon bond futures contract with contract settlement date $T_F$ and bond maturity date $T_B$, is given by

ECZBondFut(L T0: TF, TBl K) =

'Z* - EQT0[Z\Ft]-VarQT0[Z\Ft] P(LT0)H 1-$ y/VarQT0[Z\Ft] -^•^U-^l^yi)). <2.256, where Z* is defined in Lemma 2.5.3.

Chapter 3

American Interest Rate Derivatives

3.1 Finite Expiring American Put Option

Unlike European options, where the owner of the contract may exercise the option only at the expiration date, an American option gives its holder the right but not the obligation to exercise at any time up to and including the expiration date. Because of this early exercise feature, American options are at least as valuable as their European counterparts. An intermediate option between the American and European ones is the Bermudan option, which allows early exercise but only on a finite set of dates specified a priori in the contract. In this section we shall consider the finite expiring American put option. Let $(\Omega, \mathcal{F}, P)$ be a complete probability space, and assume the existence of a risk-neutral probability measure denoted by $Q$. The material presented in this section is based on Shreve [27]. Also, we assume that $X$ is a single factor process.

Definition 3.1.1. Let $0 \le t \le T$ and $x \ge 0$ be given, and let $K$ be the strike price. Assume that $X_t = x$. Let $\mathcal{F}_u^{(t)}$, $t \le u \le T$, denote the $\sigma$-algebra generated by the process $X_v$ as $v$ ranges over $[t, u]$, and let $\mathcal{T}_{t,T}$ denote the set of stopping times for the filtration $\mathcal{F}_u^{(t)}$, $t \le u \le T$, taking values in $[t, T]$ or taking the value $\infty$. In other words, $\{\tau \le u\} \in \mathcal{F}_u^{(t)}$ for every $u \in [t, T]$; a stopping time in $\mathcal{T}_{t,T}$ makes the decision to stop at a time $u \in [t, T]$ based only on the path of the price of the underlying asset between times $t$ and $u$. The price at time $t$ of the American put expiring at time $T$ is defined to be

$$V(t, x) = \max_{\tau \in \mathcal{T}_{t,T}} E_Q\left[e^{-\int_t^\tau r_s\,ds}(K - X_\tau)^+ \,\Big|\, \mathcal{F}_t\right]. \tag{3.1}$$

In the event that $\tau = \infty$, we interpret $e^{-\int_t^\tau r_s\,ds}(K - X_\tau)^+$ to be zero. This is when the put expires unexercised.

The analytical characterization of the price of the finite expiration American put$^1$, given that the process $X$ follows the SDE

$$dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dB_t \tag{3.2}$$

where $\{B_t\}_{t \in [0,T]}$ is a $Q$ Brownian motion, is given by

$$V(t, x) \ge (K - x)^+, \quad \forall t \in [0, T),\ x, r \ge 0 \tag{3.3}$$

$$rV(t, x) - V_t(t, x) - \mu(t, x)V_x(t, x) - \tfrac{1}{2}\sigma^2(t, x)V_{xx}(t, x) \ge 0, \quad \forall t \in [0, T),\ x, r \ge 0 \tag{3.4}$$

where for each $t \in [0, T)$, $x \ge 0$, equality holds in either equation (3.3) or equation (3.4). The set $\{(t, x);\ 0 \le t \le T,\ x \ge 0\}$ can be divided into two regions, the stopping set and the continuation set, given by the following definitions.

Definition 3.1.2 (Stopping and Continuation Sets). The stopping and continuation sets are denoted by $S$ and $C$ respectively, and they are defined as

$$S = \{(t, x);\ V(t, x) = (K - x)^+\}, \tag{3.5}$$

$$C = \{(t, x);\ V(t, x) > (K - x)^+\}. \tag{3.6}$$

$^1$ Also known as the linear complementarity conditions, and as the Black-Scholes equation with a free boundary.

The boundary between the sets

$$rV(t, x) - V_t(t, x) - \mu(t, x)V_x(t, x) - \tfrac{1}{2}\sigma^2(t, x)V_{xx}(t, x) = rK - rx + \mu(t, x) \tag{3.7}$$

is obtained$^2$ except along the optimal exercise boundary. A notable property is the smooth pasting condition, which is given by

$$V_x(t, x+) = V_x(t, x-) = V_x(t, x) = -1 \tag{3.8}$$

for $x = L(T - t)$, 0

$^2$ Equation (3.7) is obtained since, on $S$, $V(t, x) = K - x$, and we substitute the partial derivatives of this function into the left side of equation (3.7).

$V_{xx}(t, x)$ is not continuous along the optimal exercise boundary, and this property does not allow an application of the Ito-Doeblin formula without considering the occupation time of the process $X_t$ along the curve $x = L(T - t)$.

Lemma 3.1.1. Given $t$, the function $V_{xx}(t, y)$ is not continuous at $x = L(T - t)$.

Proof. To show that the function $V_{xx}(t, y)$ exhibits a jump at $x = L(T - t)$, we shall consider the function $V_{xx}(t, y)$ along the optimal exercise boundary, and show that we cannot approximate it from below. Since equation (3.4) holds with equality along the curve $L(T - t)$, we can use the smooth pasting condition given by equation (3.8), so that $V_x(t, x) = -1$, and the fact that $L(T - t)$ belongs to the set $S$ implies that $V_t(t, x) = 0$, since $V(t, x) = (K - x)^+$ on $S$. Substituting these findings into equation (3.4), setting it equal to zero, and rearranging, we obtain

$$V_{xx}(t, x) = \frac{2\left(r(K - x)^+ + \mu(t, x)\right)}{\sigma^2(t, x)}. \tag{3.9}$$

Now, for $(t, y) \in S$ but not along the optimal exercise boundary, $V_{xx}(t, y) = 0$ everywhere except at $K$, where it is undefined, and since equation (3.9) holds with strict inequality, we obtain

$$0 = V_{xx}(t, y) < \frac{2\left(r(K - y)^+ - V_x(t, y)\mu(t, y)\right)}{\sigma^2(t, y)}$$

for all $y$ except possibly at $K$. Hence, in the limit as we approach $x = L(T - t)$, $V_{xx}(t, x-) = 0$, and

$$0 < \lim_{y \to x-}\frac{2\left(r(K - y)^+ - V_x(t, y)\mu(t, y)\right)}{\sigma^2(t, y)} = \frac{2\left(r(K - x)^+ + \mu(t, x)\right)}{\sigma^2(t, x)}$$

by using the smooth pasting condition and the continuity of the functions $\mu(\cdot, \cdot)$, $\sigma(\cdot, \cdot)$, which implies that the right side of equation (3.9) is strictly positive. Hence, $V_{xx}(t, x-) \ne V_{xx}(t, x)$. $\square$

Shreve [27] mentions that a similar situation may occur with the function $V_t(t, x)$ for $x = L(T - t)$, which along with Lemma 3.1.1 shows that we do not meet the sufficient conditions in Theorem A.1.5 to apply the Ito-Doeblin formula to the function $V(t, y)$. If we know that the process $X_t$ spends zero time on the optimal exercise boundary, then we can apply the Ito-Doeblin formula without making an error, and this happens to be the case. Now that we have established that we can apply the Ito-Doeblin formula to the function $V(t, y)$ without worrying about the discontinuities of $V_t(t, y)$ and $V_{xx}(t, y)$ occurring at the optimal exercise boundary, we have the following result.

Theorem 3.1.1. Let Xu, t < u

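$$\tau^* = \min\{u \in [t, T];\ (u, X_u) \in S\} \tag{3.11}$$

where we interpret $\tau^*$ to be $\infty$ if $(u, X_u)$ does not enter $S$ for any $u \in [t, T]$. Then

$$e^{-\int_t^u r_s\,ds}V(u, X_u), \quad t \le u \le T,$$

is a supermartingale under $Q$.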

Proof. The proof relies on applying the Ito product rule, followed by the Ito-Doeblin formula. After computing the $du$ term, we observe that it is equal to the negative of the right hand side of equation (3.7) whenever $X_u < L(T - u)$, and zero otherwise. $\square$

We shall end this section with a motivation to utilize the dynamic programming principle in continuous time to compute the price of the American put given by equation (3.1), which yields the system of equations (3.3) and (3.4). Although the presentation shall be heuristic, it relies on precise mathematical results.

The dynamic programming principle relies on two principles, the principle of embedding and the principle of recursion. Instead of solving equation (3.1) directly, the dynamic programming approach entails solving many problems that are embedded in the original one, which are related recursively. That is, we should solve the problems

$$V(t, x);\quad t \in [0, T],\ x \in [0, \infty) \tag{3.12}$$

indexed by $t$ recursively, starting with the 'smallest' one, $V(T, x)$, and proceeding backwards in time, solving the successively 'larger' problems as they depend on the 'smaller' solved ones. Let $\Delta$ be a small positive number, and suppose the owner of the option behaves optimally from time $t + \Delta$ onwards. This gives rise to the value

$$V(t + \Delta, X_{t+\Delta}) = \max_{\tau \in \mathcal{T}_{t+\Delta,T}} E_Q\left[e^{-\int_{t+\Delta}^{\tau} r_s\,ds}(k - X_\tau)^+ \,\Big|\, \mathcal{F}_{t+\Delta}\right]. \tag{3.13}$$

Also suppose the owner of the option wants to exercise the option in the time period $[t, t + \Delta)$; if he or she behaves optimally, then this gives rise to a value

$$V^{\Delta} = \max_{\tau \in \mathcal{T}_{t,t+\Delta}} E_Q\left[e^{-\int_t^{\tau} r_s\,ds}(k - X_\tau)^+ \,\Big|\, \mathcal{F}_t\right]. \tag{3.14}$$

Now, from the perspective of time $t$, what should the owner of the option do? That is, should he or she exercise the option during the period $[t, t + \Delta)$ and obtain $V^{\Delta}$, or wait to exercise in the time period $[t + \Delta, T]$ and obtain the expected present value of $V(t + \Delta, X_{t+\Delta})$? The decision depends on the following

$$\max\left\{V^{\Delta},\ E_Q\left[e^{-\int_t^{t+\Delta} r_s\,ds}V(t + \Delta, X_{t+\Delta}) \,\Big|\, \mathcal{F}_t\right]\right\}. \tag{3.15}$$

We shall now show how the equations (3.15) and (3.1) are related using the following lemma.

Lemma 3.1.2.

$$V(t, x) = \max\left\{V^{\Delta},\ E_Q\left[e^{-\int_t^{t+\Delta} r_s\,ds}V(t + \Delta, X_{t+\Delta}) \,\Big|\, \mathcal{F}_t\right]\right\}. \tag{3.16}$$

Proof. The inequality

$$\max\left\{V^{\Delta},\ E_Q\left[e^{-\int_t^{t+\Delta} r_s\,ds}V(t + \Delta, X_{t+\Delta}) \,\Big|\, \mathcal{F}_t\right]\right\} \ge V(t, x)$$

follows from the fact that $e^{-\int_t^u r_s\,ds}V(u, X_u)$ is a supermartingale under the $Q$ measure, which was proved in Theorem 3.1.1. The reverse inequality follows from the fact that the decision rule embodied in equation (3.15) is in fact a stopping time in $\mathcal{T}_{t,T}$. That is, let $\tau^{\Delta}$ be the optimal stopping time arising in equation (3.14), and let $\tau^*$ be the optimal stopping time arising in equation (3.13). Then, the optimal stopping time that gives rise to a value given by equation (3.15) is

$$\min\{\tau^{\Delta}, \tau^*\} \in \mathcal{T}_{t,T}.$$

Now, since $V(t, x)$ involves taking the maximum over all stopping times in $\mathcal{T}_{t,T}$, we have that

$$V(t, x) \ge \max\left\{V^{\Delta},\ E_Q\left[e^{-\int_t^{t+\Delta} r_s\,ds}V(t + \Delta, X_{t+\Delta}) \,\Big|\, \mathcal{F}_t\right]\right\}.$$

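Suppose that the process $\{X_u\}_{u \in [0,T]}$ is described by the SDE given by equation (3.2) under the $Q$ measure. Then equation (3.16) is equivalent to

$$0 = \max\left\{\frac{V^{\Delta} - V(t, x)}{\Delta},\ E_Q\left[\frac{e^{-\int_t^{t+\Delta} r_s\,ds}V(t + \Delta, X_{t+\Delta}) - V(t, x)}{\Delta} \,\Big|\, \mathcal{F}_t\right]\right\} \tag{3.17}$$

where we subtracted $V(t, x)$ on both sides and divided by $\Delta$, and since $V(t, x)$ is $\mathcal{F}_t$ measurable, we can bring it under the conditional expectation sign. Now, taking the limit as $\Delta \downarrow 0$ we obtain

$$0 = \max\left\{\lim_{\Delta \downarrow 0}\frac{V^{\Delta} - V(t, x)}{\Delta},\ \lim_{\Delta \downarrow 0} E_Q\left[\frac{e^{-\int_t^{t+\Delta} r_s\,ds}V(t + \Delta, X_{t+\Delta}) - V(t, x)}{\Delta} \,\Big|\, \mathcal{F}_t\right]\right\} \tag{3.18}$$

where

$$\lim_{\Delta \downarrow 0}\frac{V^{\Delta} - V(t, x)}{\Delta} = \begin{cases} 0 & \text{if } \lim_{\Delta \downarrow 0} V^{\Delta} = V(t, x) \\ -\infty & \text{otherwise} \end{cases}$$

and if we assume that the drift and diffusion functions in equation (3.2) are bounded, then

$$\begin{aligned} &\lim_{\Delta \downarrow 0} E_Q\left[\frac{e^{-\int_t^{t+\Delta} r_s\,ds}V(t + \Delta, X_{t+\Delta}) - V(t, x)}{\Delta} \,\Big|\, \mathcal{F}_t\right] \\ &\quad = -rV(t, x) + V_t(t, x) + \mu(t, x)V_x(t, x) + \tfrac{1}{2}\sigma^2(t, x)V_{xx}(t, x), \end{aligned} \tag{3.19}$$

which follows from applying the dominated convergence theorem.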

3.2 Practical Dynamic Programming

To make the dynamic programming approach practical, we need to discretize the time domain. This entails approximating the American put option by a Bermudan put option, since the latter can be exercised only at a finite set of times during the life of the option, including the expiration time. Hence, we shall incur a discretization error, but this error can be made small by choosing a sufficiently small time step. We next present some notation in order to define the environment of the dynamic programming framework in discrete time to characterize the option's value.

First, we discretize the time domain in which the finite expiring American put problem is stated in Definition 3.1.1 by considering an option which can be exercised at $M + 1$ fixed times $t_0 = 0, t_1, \ldots, t_M = T$ between $t = 0$ and option expiry. There is a complete probability space $(\Omega, \mathcal{F}, P)$ and a risk neutral measure $Q$, with a discrete filtration $F = \{\mathcal{F}_{t_i}\}$, $i = 1, \ldots, M$. Let $\{X_{t_i}\}$, $i = 1, \ldots, M$, be an $\mathbb{R}^D$ valued, $F$ adapted Markov process, which models the different factors in the term structure model presented in the previous chapter. Let $\{S_{t_i}\}$, $i = 1, \ldots, M$, denote the process which models the price of the underlying asset. There is a time homogeneous payoff function given by

$$I(s) = (k - s)^+$$

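where $k$ is the strike price, and the resulting variables $Z_{t_i} = I(S_{t_i})$, $i = 1, \ldots, M$, are square integrable. Let $\mathcal{T}_{0,T}$ denote the set of all $F$ stopping times with values in $\{t_0, \ldots, t_M\}$. We define the time zero price of the (Bermudan) option as

$$U(0, S_0) = \max_{\tau \in \mathcal{T}_{0,T}} E_Q\left[D(0, \tau)I(S_\tau) \mid \mathcal{F}_0\right]. \tag{3.20}$$

We can compute $U(0, S_0)$ by solving the following dynamic programming problem

$$U(t_M, S_{t_M}) = I(S_{t_M}) \tag{3.21}$$

$$U(t_i, S_{t_i}) = \max\left\{I(S_{t_i}),\ E_Q\left[D(t_i, t_{i+1})U(t_{i+1}, S_{t_{i+1}}) \mid \mathcal{F}_{t_i}\right]\right\}, \tag{3.22}$$

$$0 \le i \le M - 1.$$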
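As a sketch of the recursion (3.21)-(3.22), consider a setting where the conditional expectation can be computed exactly: a recombining binomial lattice with a constant interest rate. This is an assumption made purely for illustration (the thesis works with multi-factor interest rate models, where the expectation has to be approximated), and the function name and all parameter values are hypothetical.

```python
import math


def lattice_put(s0, strike, r, sigma, maturity, steps, early_exercise=True):
    """Backward induction U(t_i) = max{I(S), disc * E[U(t_{i+1})]} on a binomial lattice.

    With early_exercise=False the max is dropped, giving the European value
    on the same lattice for comparison.
    """
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    disc = math.exp(-r * dt)                      # D(t_i, t_{i+1}) for a constant rate
    q = (math.exp(r * dt) - down) / (up - down)   # risk neutral up probability

    # Terminal condition (3.21): U(t_M, S) = I(S) = (k - S)^+
    values = [max(strike - s0 * up ** j * down ** (steps - j), 0.0)
              for j in range(steps + 1)]

    # Recursion (3.22) for i = M-1, ..., 0
    for i in range(steps - 1, -1, -1):
        new_values = []
        for j in range(i + 1):
            continuation = disc * (q * values[j + 1] + (1.0 - q) * values[j])
            if early_exercise:
                intrinsic = max(strike - s0 * up ** j * down ** (i - j), 0.0)
                new_values.append(max(intrinsic, continuation))
            else:
                new_values.append(continuation)
        values = new_values
    return values[0]
```

For example, `lattice_put(36.0, 40.0, 0.06, 0.2, 1.0, 50)` values a put exercisable on 50 dates. Comparing it with the `early_exercise=False` value on the same lattice isolates the early exercise premium.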

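By the Markov property we obtain

$$E_Q\left[D(t_i, t_{i+1})U(t_{i+1}, S_{t_{i+1}}) \mid \mathcal{F}_{t_i}\right] = \phi_i(X_{t_i}), \tag{3.23}$$

where $\phi_i(x)$ is formally defined as

$$\phi_i(x) = E_Q\left[D(t_i, t_{i+1})U(t_{i+1}, S_{t_{i+1}}) \mid X_{t_i} = x\right]. \tag{3.24}$$

Thus, by the definition of conditional expectation, $\phi_i(X_{t_i})$ is the projection of $D(t_i, t_{i+1})U(t_{i+1}, S_{t_{i+1}})$ onto the space

$$L_i^2 = \left\{\Psi : \mathbb{R}^D \to \mathbb{R};\ E_Q\left[\Psi^2(X_{t_i})\right] < \infty\right\}.$$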

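The first approximation proposed by Longstaff and Schwartz [23] consists in substituting the infinite dimensional spaces $L^2_{t_i}$ with finite dimensional ones. Their approach is to consider a set of $G$ linearly independent functions $e_1(\cdot), \ldots, e_G(\cdot)$ such that $E^Q\left[ e_g^2(X_{t_i}) \right] < \infty$, $\forall g = 1, \ldots, G$ and $\forall i = 1, \ldots, M$, with the following approximation

$$\phi_i(X_{t_i}) \approx C^G_{t_i}(X_{t_i}) = \sum_{g=1}^{G} \beta_{t_i,g}\, e_g(X_{t_i}) = \beta'_{t_i}\, e(X_{t_i}), \tag{3.25}$$

where $\beta'_{t_i} = (\beta_{t_i,1}, \ldots, \beta_{t_i,G})$ and $e(X_{t_i}) = (e_1(X_{t_i}), \ldots, e_G(X_{t_i}))'$, and

$$\beta_{t_i} = E^Q\left[ e(X_{t_i})\, e'(X_{t_i}) \mid \mathcal{F}_{t_0} \right]^{-1} E^Q\left[ e(X_{t_i})\, D(t_i, t_{i+1})\, U(t_{i+1}, S_{t_{i+1}}) \mid \mathcal{F}_{t_0} \right]. \tag{3.26}$$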

To illustrate the second approximation proposed by Longstaff and Schwartz [23], we introduce the sequence of stopping times $\tau^G_i$:

$$\tau^G_M = t_M,$$

$$\tau^G_i = t_i\, \mathbf{1}_{\{I(S_{t_i}) \ge \beta_{t_i} \cdot e(X_{t_i})\}} + \tau^G_{i+1}\, \mathbf{1}_{\{I(S_{t_i}) < \beta_{t_i} \cdot e(X_{t_i})\}}, \quad 0 \le i \le M - 1.$$

From these stopping times, we obtain an approximation of the value function at time zero given by

$$U^G_0 = \max\left\{ I(S_0),\; E^Q\left[ I(S_{\tau^G_1}) \right] \right\}. \tag{3.27}$$

Now, the second approximation proposed by Longstaff and Schwartz [23] in their algorithm is to evaluate $E^Q\left[ I(S_{\tau^G_1}) \right]$ numerically by a Monte Carlo procedure. To show how the LSM algorithm can be implemented on a computer, we present pseudo code for it. The LSM algorithm follows these basic steps

1. Simulate $N$ independent paths from the multi-factor interest rate model, and construct the corresponding paths of the underlying interest rate derivative.

2. For each path,

(a) At option expiry time set the cash flow value equal to its intrinsic value and store this in a vector $U(n) = I(S^n_T)$, $\forall n = 1, \ldots, N$.

(b) Store the exercise time along each path in a vector $A$ which is $N$ by one, where the component $A(n)$ corresponds to the exercise time for the $n$th path. Note that the entries in $A$ have values in $\{1, 2, \ldots, M\}$, and zero if never exercised.

3. Apply backwards induction for $i = M - 1, \ldots, 1$

(a) Find the paths of $S^n_{t_i}$ which are in the money. That is, find $n = 1, 2, \ldots, N$ such that $I(S^n_{t_i}) > 0$, and call this set $\mathcal{A}$.

(b) Find the paths of the factors that correspond to those in $\mathcal{A}$, and run the regression

$$F\left( D(t_i, t_{A(n)}) \right) U(n) = \sum_{g=1}^{G} \beta_{t_i,g}\, e_g(X^n_{t_i}) + \epsilon_n, \quad n \in \mathcal{A},$$

where

$$F\left( D(t_i, t_{A(n)}) \right) = \begin{cases} 0 & \text{if } A(n) = 0, \\ D(t_i, t_{A(n)}) & \text{otherwise.} \end{cases}$$

(c) Update: $U(n) = I(S^n_{t_i})$ and $A(n) = i$ (the index of the exercise time $t_i$), whenever

$$I(S^n_{t_i}) \ge \hat{C}^G_{t_i}(X^n_{t_i}), \quad n \in \mathcal{A}.$$

4. Set

$$U_0^{G,M,N} = \max\left\{ k - S_{t_0},\; \frac{1}{N} \sum_{n=1}^{N} F\left( D(t_0, t_{A(n)}) \right) I\left( S^n_{t_{A(n)}} \right) \right\}. \tag{3.28}$$

The stopping rule generated by the algorithm is given by

$$\tau^{LSM} = \min\left\{ t_i > 0 :\; I(S_{t_i}) > 0 \text{ and } I(S_{t_i}) \ge \hat{C}^G_{t_i}(X_{t_i}) \right\}, \tag{3.29}$$

which posits that the holder exercises the option the first time the payoff of immediate exercise is positive and at least as large as the continuation value specified by equation (3.25), where the coefficients take the values arising in the LSM algorithm.
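The steps of the LSM algorithm can be sketched in Python. This is a minimal illustration under stated assumptions, not the implementation used for the results in this thesis: the single state variable is the asset price itself, simulated as a geometric Brownian motion with a constant short rate `r` (an assumed stand-in for the multi-factor interest rate models), the basis is $\{1, X, X^2\}$, and the function name `lsm_put` and all parameter names are hypothetical.

```python
import numpy as np

def lsm_put(s0, k, r, sigma, T, M, N, seed=0):
    """Sketch of steps 1-4 for a Bermudan put with M exercise dates."""
    rng = np.random.default_rng(seed)
    dt = T / M

    def payoff(s):
        return np.maximum(k - s, 0.0)          # I(s) = (k - s)^+

    # Step 1: simulate N paths (ASSUMPTION: geometric Brownian motion, constant r).
    z = rng.standard_normal((N, M))
    log_inc = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    S = s0 * np.exp(np.cumsum(log_inc, axis=1))
    S = np.hstack([np.full((N, 1), s0), S])    # column i holds S at time t_i

    # Step 2: cash flow at expiry and exercise index A(n); 0 means never exercised.
    U = payoff(S[:, M])
    A = np.where(U > 0, M, 0)

    # Step 3: backward induction over i = M-1, ..., 1.
    for i in range(M - 1, 0, -1):
        itm = payoff(S[:, i]) > 0              # (a) in-the-money paths
        if itm.sum() < 3:                      # too few points to fit three coefficients
            continue
        x = S[itm, i]
        # (b) regress discounted realized cash flows on the basis functions
        disc = np.where(A[itm] > 0, np.exp(-r * dt * (A[itm] - i)), 0.0)
        y = disc * U[itm]
        E = np.column_stack([np.ones_like(x), x, x**2])
        beta, *_ = np.linalg.lstsq(E, y, rcond=None)
        continuation = E @ beta
        # (c) exercise where the payoff is at least the fitted continuation value
        exercise = payoff(x) >= continuation
        idx = np.flatnonzero(itm)[exercise]
        U[idx] = payoff(S[idx, i])
        A[idx] = i

    # Step 4: compare immediate exercise with the discounted sample average.
    disc0 = np.where(A > 0, np.exp(-r * dt * A), 0.0)
    return max(float(payoff(s0)), float(np.mean(disc0 * U)))
```

A call such as `lsm_put(36.0, 40.0, 0.06, 0.2, 1.0, 50, 20000)` prices a one year put with 50 exercise dates; the value returned depends on the seed and on the number of paths.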

3.3 Properties of the LSM Estimator

The criteria we must keep in mind while comparing different estimators are the relative computational effort, bias, consistency, and finally, reliability. Relative to the price of the American option given by equation (3.1), the LSM estimator $U_0^{G,M,N}$ is biased low, since time is discretized and the stopping time obtained from the LSM algorithm, given by equation (3.29), is expected to be suboptimal. Glasserman [16] mentions that, relative to the true price of the Bermudan option given by equation (3.22), the LSM estimator $U_0^{G,M,N}$ is an interleaving estimator, that is, it mixes elements of low and high bias. We shall demonstrate this property in the case where the specification of the conditional expectations in the LSM algorithm is correct. Also, we shall present four convergence results, where the former two were proved by Longstaff and Schwartz [23], and the latter two were proved by Clement, Lamberton, and Protter [10].

3.3.1 Interleaving Property

First, whenever the conditional expectation functions given by equation (3.25) are correctly specified, the LSM estimator mixes elements of high and low bias in an alternating fashion. That is, at time $t_i$ along path $n$, the value at that position is biased high due to the backwards induction nature of the algorithm, and is biased low since a suboptimal stopping time is applied to determine the value of this position. Next we shall discuss the usefulness of this property. Fix a time $t_i$ in the LSM algorithm at which we want to determine the decision along path $n$. Step 3(c) of the algorithm presents the basis of the decision rule at time $t_i$, which gives rise to the following value of the option at that time

$$\hat{U}(t_i, S^n_{t_i}) = \max\left\{ I(S^n_{t_i}),\; \hat{C}^G_{t_i}(X^n_{t_i}) + \epsilon_n \right\}, \tag{3.30}$$

where $\epsilon_n$ is the estimated error term for path $n$ in the regression at step 3. Applying conditional expectation, conditional on $S^n_{t_i}$, to both sides of equation (3.30), and using Jensen's inequality, since the max function is convex, yields

$$E^Q\left[ \hat{U}(t_i, S^n_{t_i}) \mid S^n_{t_i} \right] \ge \max\left\{ E^Q\left[ I(S^n_{t_i}) \mid S^n_{t_i} \right],\; E^Q\left[ \hat{C}^G_{t_i}(X^n_{t_i}) + \epsilon_n \mid S^n_{t_i} \right] \right\}. \tag{3.31}$$

Since the conditional expectation is assumed to be correctly specified, we obtain $E^Q\left[ \epsilon_n \mid S^n_{t_i} \right] = 0$, from which equation (3.31) becomes

$$E^Q\left[ \hat{U}(t_i, S^n_{t_i}) \mid S^n_{t_i} \right] \ge \max\left\{ I(S^n_{t_i}),\; \hat{C}^G_{t_i}(X^n_{t_i}) \right\} \tag{3.32}$$

$$= U(t_i, S^n_{t_i}), \tag{3.33}$$

which is the desired result.

To show that $\hat{U}(t_i, S^n_{t_i})$ is also biased low, consider the stopping rule at time $t_i$ along path $n$ given by

$$\tau = \min\left\{ t_j \ge t_i :\; I(S^n_{t_j}) \ge \hat{C}^G_{t_j}(X^n_{t_j}) \right\}, \tag{3.34}$$

where we obtained the functions $\hat{C}^G_{t_j}$ from running the LSM algorithm from the option expiry time to time $t_i$. We can combine the two cases that assign a value to $\hat{U}(t_i, S^n_{t_i})$ given by equation (3.30) as

$$\hat{U}(t_i, S^n_{t_i}) = D(t_i, \tau)\, I(S^n_{\tau}). \tag{3.35}$$

Hence,

$$E^Q\left[ \hat{U}(t_i, S^n_{t_i}) \mid \mathcal{F}_{t_i} \right] \le \max_{\tau \in \mathcal{T}_{t_i,T}} E^Q\left[ D(t_i, \tau)\, I(S_\tau) \mid \mathcal{F}_{t_i} \right] = U(t_i, S^n_{t_i}). \tag{3.36}$$

In this sense, the LSM estimator mixes elements of low and high bias at every stage of the algorithm, as opposed to separating the two sources of bias. The aim is to avoid compounding the two errors at each stage of the algorithm and instead to have them offset each other. Glasserman [16] mentions that these two effects on the estimator may offset each other, but a thorough study remains open for investigation.

3.3.2 Convergence Results

In this section, we shall be concerned with the asymptotic properties of the LSM estimator as the number of sample paths, denoted by $N$, and the number of basis functions used in the regressions, denoted by $G$, tend to infinity. The first result, due to Longstaff and Schwartz [23], proves that the LSM estimator is asymptotically biased low. In their paper, they rule out the possibility of exercise at time zero; in this case, the LSM estimate of the time zero price is simply given by the time zero continuation value. In the setup described in Section 3.2, we allow for time zero exercise, which does not change the asymptotic result, since the convergence of $U_0^{G,M,N}$ as $N \to \infty$ is still governed by the convergence of the continuation value. Formally,

Proposition 3.3.1. Given $G$ and $M$,

$$U(0, S_0) \ge \lim_{N \to \infty} U_0^{G,M,N}, \quad \text{a.s. } Q.$$

For the details of the proof, the reader may consult the appendix of Longstaff and Schwartz [23]. The essential idea behind the proof is to apply the strong law of large numbers to the sequence $\left\{ F\left( D(t_0, t_{A(n)}) \right) I\left( S^n_{t_{A(n)}} \right),\ n = 1, \ldots, N \right\}$, and to use the possible suboptimality of the stopping rule generated by the LSM algorithm. For the former, we need to make sure that it is a sequence of independent and identically distributed random variables with finite mean. This is true since we have independence across the different paths of factors, which implies that the continuation values associated with each path at each exercise time are independent, since the functional form, given by $\hat{C}^G_{t_i}(X_{t_i})$, is the same across all paths; then the cash flows generated by following the stopping rule $\tau^{LSM}$ given by equation (3.29) are independent across the different paths. Hence, by the strong law of large numbers

$$\lim_{N \to \infty} U_0^{G,M,N} = E^Q\left[ D(0, \tau^{LSM})\, I(S_{\tau^{LSM}}) \mid \mathcal{F}_{t_0} \right], \quad \text{a.s. } Q. \tag{3.37}$$

Combining equation (3.37) with the possible suboptimality of $\tau^{LSM}$, expressed by

$$E^Q\left[ D(0, \tau^{LSM})\, I(S_{\tau^{LSM}}) \mid \mathcal{F}_{t_0} \right] \le \max_{\tau \in \mathcal{T}_{0,T}} E^Q\left[ D(t_0, \tau)\, I(S_\tau) \mid \mathcal{F}_{t_0} \right] = U(t_0, S_{t_0}),$$

we obtain the desired result. Proposition 3.3.1 has an important implication for the choice of $G$, the number of basis functions used in the regressions. As explained in Longstaff and Schwartz [23], it provides guidance in determining the number $G$ needed to obtain an accurate approximation: in practice, simply increase $G$ until the LSM estimate no longer increases. Longstaff and Schwartz [23] also consider the effects of propagating the estimated stopping rule backwards through time in the case where $M = 2$ and the option can only be exercised at the dates $t_1$ and $t_2$, where the convergence of the algorithm is more easily demonstrated. The second result proved in Longstaff and Schwartz [23] is given by

Proposition 3.3.2. Let $M = 2$, and suppose the value of the American put option depends on a single state variable $X$ with support in $(0, \infty)$ which is a Markov process. Assume further that the option can only be exercised at times $t_1 < t_2$, and that the conditional expectation function $\phi_{t_1}(x)$ given by equation (3.24) is absolutely continuous with Radon-Nikodym derivative (with respect to Lebesgue measure) $C_{t_1}(x)$ and

$$\int_0^\infty e^{-x}\, \phi^2_{t_1}(x)\, dx < \infty, \qquad \int_0^\infty e^{-x}\, C^2_{t_1}(x)\, dx < \infty.$$

Then, $\forall \epsilon > 0$, $\exists G$ such that

$$\lim_{N \to \infty} Q\left[ \left| U(t_1, X_{t_1}) - \frac{1}{N} \sum_{n=1}^{N} F\left( D(t_1, t_{A(n)}) \right) I\left( S^n_{t_{A(n)}} \right) \right| > \epsilon \right] = 0. \tag{3.38}$$

Intuitively, this means that by choosing $G$ large enough and letting $N \to \infty$, the LSM algorithm results in a value for the Bermudan option within $\epsilon$ of the true value. Also, this result implies that $G$ need not be infinite. The key that makes Proposition 3.3.2 possible is the uniform convergence of $C^G_{t_1}(x)$ to $\phi_{t_1}(x)$ on $(0, \infty)$ as $G$ increases, whenever the indicated integrability conditions are met. This bounds the maximum error in estimating the conditional expectation, which in turn bounds the maximal pricing error. Furthermore, Longstaff and Schwartz [23] conjecture that a similar result can be obtained for higher dimensional problems by finding conditions under which uniform convergence occurs. The third convergence result is due to Clement, Lamberton and Protter [10], who show that, under some general hypotheses on the choice of basis functions and payoff functions, the following hold:

$$\lim_{N \to \infty} U_0^{G,M,N} = U^G(0, S_0), \quad \text{a.s. } Q, \tag{3.39}$$

and

$$\lim_{G \to \infty} U^G(0, S_0) = U(0, S_0), \tag{3.40}$$

where the limit in equation (3.40) is with respect to the $L^2$ norm. The result given by equation (3.39) states that the LSM estimator converges almost surely to the time zero price of the Bermudan option with conditional expectations in equation (3.24) specified by the functions

$$C^G_{t_i}(x), \quad i = 1, \ldots, M,$$

where these functions are linear combinations of the first $G$ basis functions, such as Laguerre polynomials, Hermite polynomials, and regular polynomials. Equation (3.40) states that as the number of basis functions increases, the limit in equation (3.39) converges to the true time zero Bermudan option price in the $L^2$ norm. The implication of the results presented in equations (3.39) and (3.40) is a robustness in the choice of basis functions. This is also demonstrated with numerical tests in Longstaff and Schwartz [23].

Chapter 4

Case Studies

4.1 Introduction

Having described the LSM algorithm in Chapter 3 and the different interest rate models in Chapter 2, we now apply the LSM algorithm with these models to price finite expiring American put options when the underlying asset is a coupon-bearing bond futures contract. Such contracts are traded on the Chicago Board of Trade (CBOT hereafter) and the Chicago Mercantile Exchange (CME hereafter), which motivates our investigation of the use of the LSM algorithm to price these contracts. In Section 4.2 we present the pricing framework for the following underlying assets: T-bond futures, and the two, five and ten year T-note futures contracts. Since the underlying asset does not pay coupons, the discretization of the time domain yielding the price given by equation (3.20) is that of a Bermudan option, and hence gives the Bermudan approximation to the price of the American put option. In Section 4.3 we discuss how to construct a confidence interval for the true price given by equation (3.20), which motivates the use of a high biased estimator to bound the price from above, and we present our choice for this estimator. Next, we present the specific contracts we shall consider and the coding environment that was used to implement the LSM algorithm and produce the results.

4.2 The T-Bond and T-Note Futures Contracts

We shall consider four underlying assets: the 30 year Treasury bond futures contract and three Treasury note futures contracts, namely the 2, 5, and 10 year Treasury note futures. What these contracts have in common is that they are not based on a single bond. The underlying asset in a 30 year Treasury bond futures contract is any $100,000 face value Treasury bond that has more than 15 years to maturity on the first day of the delivery month and that is non-callable for 15 years from this day. The underlying asset in a 5 year T-note futures contract is any $100,000 face value T-note maturing between four and a quarter and five and a quarter years from the first calendar day of the delivery month. The underlying asset in a 10 year T-note futures contract is any $100,000 face value T-note that matures between six and a half and ten years from the first calendar day of the delivery month. Finally, the 2 year T-note futures contract is based on a $200,000 face value Treasury note with original maturity of not more than five and a quarter years and a remaining maturity of not less than one and three quarter years from the last day of the delivery month. When the futures contract settlement date arrives, because of the multiple deliverable bonds, the futures price must account for the cheapest to deliver option held by the seller of the futures contract. For this reason, the CBOT standardized the comparability of the different futures prices as they depend on the different eligible bonds. This was accomplished by introducing an adjustment factor, called a conversion factor, in order to quote all the eligible bond futures prices in a contract in terms of a specific bond futures price. The CBOT defines the conversion factor as

Definition 4.2.1 (Conversion Factor). A factor used to equate the price of T-bond and T-note futures contracts with the various cash T-bonds and T-notes eligible for delivery. This factor is based on the relationship of the cash-instrument coupon to the required 6 percent deliverable grade of a futures contract as well as taking into account the cash instrument's maturity or call.

The bond maturity is rounded down to the nearest zero, three, six, or nine months. Following Nawalkha, Beliaeva and Soto [26], if the maturity of the bond or T-note is rounded down to zero months, the conversion factor is

$$CF_0 = \frac{c}{2}\left[ \frac{1}{0.03} - \frac{1}{0.03\,(1.03)^{2n}} \right] + \frac{1}{(1.03)^{2n}},$$

where $c$ is the coupon rate and $n$ is the number of years to maturity. If the maturity of the bond is rounded down to three months, then the conversion factor is

$$CF_3 = \frac{CF_0 + \frac{c}{2}}{(1.03)^{0.5}} - \frac{c}{4}.$$

If the maturity of the bond is rounded down to six months, then the conversion factor is given by

$$CF_6 = \frac{c}{2}\left[ \frac{1}{0.03} - \frac{1}{0.03\,(1.03)^{2n+1}} \right] + \frac{1}{(1.03)^{2n+1}}.$$

Finally, if the maturity of the bond is rounded down to nine months, then the conversion factor is

$$CF_9 = \frac{CF_6 + \frac{c}{2}}{(1.03)^{0.5}} - \frac{c}{4}.$$

The CBOT defines the accrued interest as

Definition 4.2.2 (Accrued Interest). The interest earned between the most recent interest payment and the present date but not yet paid to the lender.

To elucidate Definition 4.2.2, consider the following example taken from Nawalkha, Beliaeva, and Soto [26].

Example 4.2.1. Suppose on November 12, 2005, the quoted price of the 10 percent coupon bond maturing on August 5, 2019, is $97,250 on a $100,000 face value. Since government bonds pay coupons semiannually, a coupon of $5,000 would be paid on February 5 and August 5 of each year. The number of days between August 5, 2005, and November 12, 2005 (not including August 5, 2005, and including November 12, 2005), is 99, whereas the number of days between August 5, 2005, and February 5, 2006 (not including August 5, 2005, and including February 5, 2006), is 181 days. Therefore, with the actual/actual day count convention used for Treasury bonds, the accrued interest from August 5, 2005, to November 12, 2005, is:

$$AI = \$5{,}000 \times \frac{99}{181} = \$2{,}734.81.$$
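The conversion factor formulas and the accrued interest computation in Example 4.2.1 can be checked numerically. The following Python sketch is illustrative only: the function names `conversion_factor` and `accrued_interest` and their argument conventions are assumptions, and the day counts are passed in exactly as stated in the example rather than derived from calendar dates.

```python
def conversion_factor(c, n, months):
    """Conversion factor for coupon rate c (decimal), n whole years to maturity,
    and a remaining maturity rounded down to `months` in {0, 3, 6, 9}."""
    if months not in (0, 3, 6, 9):
        raise ValueError("months must be 0, 3, 6 or 9")

    def whole_periods(periods):
        # Value per unit face of semiannual coupons c/2 discounted at 3% per period.
        q = 1.03 ** periods
        return (c / 2) * (1 / 0.03 - 1 / (0.03 * q)) + 1 / q

    # Zero and three months build on 2n periods; six and nine months on 2n + 1.
    base = whole_periods(2 * n if months in (0, 3) else 2 * n + 1)
    if months in (0, 6):
        return base
    # Three or nine months: add the next coupon, discount a quarter, remove accrued interest.
    return (base + c / 2) / 1.03 ** 0.5 - c / 4


def accrued_interest(coupon, days_since_last_coupon, days_in_coupon_period):
    """Actual/actual accrued interest on one semiannual coupon payment."""
    return coupon * days_since_last_coupon / days_in_coupon_period


print(round(accrued_interest(5000, 99, 181), 2))   # 2734.81, as in Example 4.2.1
```

A quick sanity check on the first formula is that a 6 percent coupon with a whole number of years to maturity gives a conversion factor of one, since the bond then matches the deliverable grade exactly.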

The delivery cash price of a futures contract on a single coupon-bearing bond is given by the sum of the invoice price and the accrued interest, where the former is the bond's conversion factor times the futures price at the futures expiry date. Also, the delivery cash price is given by

$$\sum_{i=1}^{n} C_i \times P(T_F, T_i), \tag{4.1}$$

which is the coupon bond price given by equation (1.16) evaluated at $T_F$, where $C_i$ is the cash flow from the coupon bond at time $T_i$. This implies that

$$\sum_{i=1}^{n} C_i \times P(T_F, T_i) = Fut(T_F, T_F, T_B) \times CF + AI. \tag{4.2}$$

Hence, rearranging equation (4.2), solving for $Fut(T_F, T_F, T_B)$, and taking the time $t$ risk neutral expectation yields the time $t$ futures price given by

$$Fut(t, T_F, T_B) = \frac{\sum_{i=1}^{n} C_i \times P_F(t, T_F, T_i) - AI}{CF}. \tag{4.3}$$

In the presence of multiple deliverable bonds, the futures price must account for the cheapest to deliver option held by the seller of the futures contract. By indexing all of the deliverable bonds, $j = 1, \ldots, m$, we obtain the time $t$ futures price

$$Fut(t, T_F, T_B) = E^Q\left[ \min_{1 \le j \le m} \frac{\sum_{i} C^j_i \times P(T_F, T^j_i) - AI_j}{CF_j} \,\Big|\, \mathcal{F}_t \right]. \tag{4.4}$$

4.3 High Biased Estimator

Proposition 3.3.1 shows that the LSM estimator given by equation (3.28) is asymptotically biased low almost surely as the number of simulated paths increases to infinity. Hence, for a large enough number of simulations, although the LSM estimator is an interleaving estimator, it possibly underestimates the true price of the Bermudan put option given by equation (3.22). For this reason, we should construct a confidence interval for the true price. Following Glasserman [16], this is accomplished by utilizing two biased estimators to construct a 90% confidence interval. Let $\hat{V}_n$ and $\hat{v}_n$ be sample means of $n$ independent replications that, as estimators of $V_0$, are biased high and low respectively in the sense that

$$E[\hat{V}_n] \ge V_0 \ge E[\hat{v}_n]. \tag{4.5}$$

Suppose that, for some half width $H_n$,

$$\hat{V}_n \pm H_n$$

is a valid 95% confidence interval for $E[\hat{V}_n]$ in the sense that the interval contains this point with 95% probability, and suppose that, for some half width $L_n$,

$$\hat{v}_n \pm L_n$$

is similarly a valid 95% confidence interval for $E[\hat{v}_n]$. By taking the lower confidence limit of the low biased estimator, and the upper confidence limit of the high biased estimator, we get an interval

$$\left( \hat{v}_n - L_n,\; \hat{V}_n + H_n \right) \tag{4.6}$$

containing the unknown value $V_0$ with probability at least 90%, and at least 95% if $\hat{V}_n$ and $\hat{v}_n$ are symmetric about their means. We shall consider the high biased estimator which is based on early exercise clairvoyance. The idea comes from the fact that we can simulate the entire path of the underlying price process of the American put option, and can therefore construct a high biased estimator which utilizes the trajectory of the underlying price process from time zero to the option expiry time. This estimator is constructed via the following steps

1. Simulate $N$ paths of the underlying price process of the American put option given by $S^n_{t_i}$, $n = 1, \ldots, N$ and $i = 0, \ldots, M$, where the initial price is non-random.

2. Along each path $n$, find $i$ that attains

$$\max_{0 \le i \le M} I\left( S^n_{t_i} \right), \tag{4.7}$$

and call the argument that maximizes the expression in equation (4.7) $i_n$.

3. Discount the cash flows realized by following the rule $i_n$ for each path $n$ back to time zero, and compute the sample average

$$HB_N = \frac{1}{N} \sum_{n=1}^{N} D(0, t_{i_n})\, I\left( S^n_{t_{i_n}} \right). \tag{4.8}$$

Notice that the high biased estimator given by equation (4.8) does not involve any approximation using basis functions and hence is computationally simpler. Also, since the simulated paths are independent, $D(0, t_{i_n})\, I(S^n_{t_{i_n}})$, $n = 1, \ldots, N$, are independent but not identically distributed.
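The three steps of the clairvoyant estimator can be sketched in Python as follows. This is an illustrative sketch, assuming the paths and the path-wise discount factors $D(0, t_i)$ have already been simulated and stored as arrays; the function name `high_biased_estimate` and the array layout are hypothetical.

```python
import numpy as np

def high_biased_estimate(S, disc, k):
    """S:    (N, M+1) array of simulated prices, column i holding S at t_i.
    disc: (N, M+1) array of discount factors D(0, t_i) along each path.
    k:    strike price of the put."""
    payoff = np.maximum(k - S, 0.0)               # I(S) for every path and date
    i_n = np.argmax(payoff, axis=1)               # step 2: maximizing date (first one on ties)
    rows = np.arange(S.shape[0])
    cash = disc[rows, i_n] * payoff[rows, i_n]    # step 3: discounted cash flow per path
    return cash.mean(), cash                      # HB_N and the per-path values
```

Returning the per-path values alongside the sample mean makes it straightforward to compute a standard error for the upper confidence limit.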

Since the LSM estimator given by equation (3.28) is of the form

$$\max\left\{ a, \bar{Y}_N \right\}, \tag{4.9}$$

where $a \in \mathbb{R}$ and $\bar{Y}_N$ is the sample mean of $Y_n$, $n = 1, \ldots, N$, which are i.i.d., it is not a sample mean, and so it is not of the form from which we can construct the 90% confidence interval for the true price mentioned above. Notice that

$$\bar{Y}_N \le \max\left\{ a, \bar{Y}_N \right\} \tag{4.10}$$

holds for all $N$. If $E[Y_1] < \infty$, then taking the limit as $N \to \infty$ on both sides of the inequality in equation (4.10), we obtain by the strong law of large numbers

$$E[Y_1] \le \lim_{N \to \infty} \max\left\{ a, \bar{Y}_N \right\}, \quad \text{a.s. } Q. \tag{4.11}$$

In the case of the LSM estimator, where

$$Y_n = F\left( D(t_0, t_{A(n)}) \right) I\left( S^n_{t_{A(n)}} \right),$$

by Proposition 3.3.1, the right hand side of equation (4.11) is less than or equal to the true price of the Bermudan option almost surely $Q$, yielding

$$E[Y_1] \le U(0, S_0). \tag{4.12}$$

Hence, $\bar{Y}_N$ is biased low. To summarize,

$$LB_N = \frac{1}{N} \sum_{n=1}^{N} F\left( D(t_0, t_{A(n)}) \right) I\left( S^n_{t_{A(n)}} \right) \tag{4.13}$$

is our low biased estimator, and $HB_N$ given by equation (4.8) is our high biased estimator.
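The interval in equation (4.6) can be assembled from the per-path values behind the low and high biased estimators. The Python sketch below is one possible way to do this, assuming the half widths come from a normal approximation with the two-sided 95% quantile 1.96; the function name `combined_interval` is hypothetical.

```python
import numpy as np

def combined_interval(low_samples, high_samples, z=1.96):
    """Return (v - L, V + H): the lower 95% limit of the low biased sample mean
    and the upper 95% limit of the high biased sample mean."""
    low = np.asarray(low_samples, dtype=float)
    high = np.asarray(high_samples, dtype=float)
    v_hat, V_hat = low.mean(), high.mean()
    # ASSUMPTION: half widths from the central limit theorem, z = 1.96.
    L = z * low.std(ddof=1) / np.sqrt(low.size)
    H = z * high.std(ddof=1) / np.sqrt(high.size)
    return v_hat - L, V_hat + H
```

Here `low_samples` would hold the discounted LSM cash flows $Y_n$ and `high_samples` the discounted clairvoyant cash flows, one entry per simulated path.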

4.4 The Environment

4.4.1 Underlying Assets

We shall focus on the T-bond and T-note futures contracts which are based on a single deliverable coupon-bearing bond with unit face value and a 6 percent coupon rate which is paid semiannually. For the 30 year T-bond futures, we consider the bond which has 30 years to maturity at the time of the futures contract settlement. Similarly, in the 2, 5, and 10 year cases, we consider the bond with 2, 5, and 10 years to maturity, respectively, as of the futures contract settlement date. Also, all the options on these contracts have one and a quarter years to option expiry.

We utilize MATLAB to implement the LSM algorithm presented in Section 3.2 for the interest rate models presented in Chapter 2. The time domain is discretized with a weekly time step, that is, dt = 1/52, so that each point along a simulated path represents a movement in calendar time of one week. In order to simulate the asset price paths following equation (4.3), we wrote code to simulate the sample paths of the T-bond and T-note futures prices. We also consider a range of strike prices for each option, constructed by adding and subtracting increments from the time zero price of the underlying asset. This way, we can investigate the performance of the LSM algorithm for strikes which are at the money, in the money, deeply in the money, out of the money, and deeply out of the money. The T-bonds and T-notes we consider have unit face value, and for this reason we consider increments which are 0.005 units apart, starting from zero and going to ±$0.05.
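As a small illustration of this setup, the time grid and the strike grid can be built as follows. The sketch is in Python rather than MATLAB and rests on two assumptions: a weekly step corresponds to 52 steps per year, and the time zero futures price `f0 = 1.0` is a placeholder value, not a price produced by any of the models.

```python
import numpy as np

dt = 1.0 / 52.0                          # ASSUMPTION: 52 weekly steps per year
option_expiry = 1.25                     # one and a quarter years
M = int(round(option_expiry / dt))       # number of weekly exercise dates

def strike_grid(f0, step=0.005, width=0.05):
    """Strikes f0 - width, ..., f0, ..., f0 + width in increments of `step`."""
    n = int(round(width / step))
    return f0 + step * np.arange(-n, n + 1)

strikes = strike_grid(1.0)               # f0 = 1.0 is a hypothetical futures price
```

With these inputs the grid contains 21 strikes centred on the time zero futures price, and the option has 65 weekly exercise dates.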

4.4.2 Basis Functions

We shall present the different sets of basis functions that we utilize in the LSM algorithm for the single, two and three factor models, and discuss their properties. Longstaff and Schwartz [23] find that the results from the LSM algorithm are robust to the choice of basis functions, and they point out that different choices have different numerical and statistical implications. The former concerns numerical errors resulting from scaling. The latter concerns the statistical significance of individual basis functions in the regressions: some choices of basis functions may be highly correlated with each other and may result in estimation difficulties for individual regression coefficients, similar to the problem of multicollinearity in econometrics. They also mention that the LSM algorithm is unaffected by such an estimation difficulty, and hence by the degree of correlation among the regressors, since the emphasis is not on the individual coefficients but on the fitted value of the regression, which proxies the continuation values at the different possible exercise times where a regression is undertaken. For the single factor models, we shall utilize a finite subset of the set of basis functions

$$\Psi = \{X^n,\ n = 0, 1, 2, \ldots\}, \tag{4.14}$$

where $X$ is a real variable defined on the compact set $[0, b]$ for $b > 0$ large enough. The domain of $X$ is non-negative because $X$ proxies for the price of the underlying asset, which is non-negative. To see that we can approximate elements of $L^2([0, b])$ with members of $\mathrm{Span}\,\Psi$ (see footnote 1), recall first that $\Psi$ is a linearly independent set of functions, and secondly that the Hermite polynomials are obtained by applying the Gram-Schmidt orthogonalization process to members of $\Psi$. Let us denote the set of Hermite polynomials on $[0, b]$ by

$$H = \{H_n(X),\ n = 0, 1, 2, \ldots\}. \tag{4.15}$$

Then, by the Gram-Schmidt orthogonalization process we obtain

$$\mathrm{Span}\{H_n(X),\ n = 0, 1, \ldots, k\} = \mathrm{Span}\{1, X, \ldots, X^k\}, \tag{4.16}$$

which holds for every $k$. The next step is to show the following set equality

$$\mathrm{Span}\,H = \mathrm{Span}\,\Psi, \tag{4.17}$$

from which the main result follows. Proving $\mathrm{Span}\,H \subseteq \mathrm{Span}\,\Psi$ is straightforward, since every finite linear combination of Hermite polynomials is indeed a finite linear combination of the regular polynomials in $\Psi$. To prove the second direction, note that any polynomial $f(x)$ of degree $n$ can be expressed in terms of Hermite polynomials as

$$f(x) = \sum_{i=0}^{n} \frac{\langle f, H_i \rangle}{\langle H_i, H_i \rangle}\, H_i(x), \tag{4.18}$$

where $\langle \cdot, \cdot \rangle$ denotes the inner product on the Hilbert space $L^2[0, b]$. Hence, the set equality given by equation (4.17) holds. Thus,

$$\overline{\mathrm{Span}\,\Psi} = \overline{\mathrm{Span}\,H} = L^2[0, b]. \tag{4.19}$$

The finite subset of $\Psi$ we shall utilize as basis functions to approximate the conditional expectations is given by

$$\Psi_2 = \{1, X, X^2\}. \tag{4.20}$$

Footnote 1: $\mathrm{Span}\,M$ denotes the set of all finite linear combinations of members of the set $M$.

In higher dimensions, the set of multivariate basis functions is obtained using the tensor product of the univariate basis functions.

Definition 4.4.1 (Two-fold tensor product). Let $A$ and $B$ be sets of functions over $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$; their tensor product is given by

$$A \otimes B = \{\phi(x)\,\psi(y) \mid \phi \in A,\ \psi \in B\}. \tag{4.21}$$

A two-fold tensor product basis is a tensor product of two univariate sets of basis functions such as the Hermite polynomials. Definition 4.4.1 can be extended easily to consider a set of multivariate basis functions in three dimensions. If we utilize the first $k$ functions from a set of univariate basis functions in order to construct a set of multivariate basis functions for two and three dimensions, then the number of basis functions grows exponentially with the dimension, since the two and three-fold tensor products yield $k^2$ and $k^3$ multivariate basis functions respectively. Judd [21] mentions that, from the perspective of the rate of convergence, many of the elements of a tensor product basis are excessive. For this reason, if the set of univariate basis functions is given by equation (4.20), then in higher dimensions we can work with a smaller set of basis functions, called the set of complete polynomials, and obtain as good a level of accuracy as working with the two and three-fold tensor products of $\{1, X, X^2\}$. This approach is desirable since it has the potential to increase the computational speed of the LSM algorithm without reducing accuracy. In the two factor case, we shall utilize the complete set of polynomials of total degree 2 in two variables given by

$$\{1, X, Y, XY, X^2, Y^2\} \tag{4.22}$$

as the finite set of basis functions. Finally, in the three factor setting, we shall utilize the complete set of polynomials of total degree 2 in three variables given by

$$\{1, X, Y, Z, XY, XZ, YZ, X^2, Y^2, Z^2\} \tag{4.23}$$

and the term $XYZ$ as the finite set of basis functions.
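The difference in size between the tensor product basis and the complete polynomial basis can be made concrete with a short Python sketch. The helper names `complete_exponents` and `design_matrix` are hypothetical, and each basis function is represented by its tuple of exponents, so that `(1, 0, 1)` stands for $XZ$.

```python
from itertools import product
import numpy as np

def complete_exponents(dim, degree):
    """Exponent tuples of all monomials in `dim` variables with total degree <= degree."""
    return [e for e in product(range(degree + 1), repeat=dim) if sum(e) <= degree]

def design_matrix(X, exponents):
    """Evaluate each monomial on the rows of X, an (n_paths, dim) array of factor values."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    return np.column_stack([np.prod(X ** np.array(e), axis=1) for e in exponents])

two_factor = complete_exponents(2, 2)                   # the 6 functions in (4.22)
three_factor = complete_exponents(3, 2) + [(1, 1, 1)]   # the 10 in (4.23) plus XYZ
tensor_three = list(product(range(3), repeat=3))        # full 3-fold tensor product: 27
```

The regression matrix for the three factor case therefore has 11 columns instead of the 27 that the full tensor product of $\{1, X, X^2\}$ would require.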

4.4.3 Approximating the Optimal Exercise Boundary

The optimal exercise boundary for the American put option on a T-note or T-bond futures contract is a continuous non-increasing function of time to expiry of the option that determines the futures price level at or below which it is optimal to exercise the option. Hence, determining this boundary is of importance. At expiry, the optimal exercise boundary is given by the strike price, since exercising at any futures price above it would yield a negative payoff, and any futures price below it yields a positive payoff. We can approximate the optimal exercise boundary by using the stopping rule generated by the LSM algorithm given by equation (3.29). The approximation is carried out in three steps:

1. At time zero, the approximation is the difference between the strike price and the continuation value at that time.

2. At each time greater than zero, isolate the paths at which optimal exercise occurs, and from this subset of sample paths, find the largest futures price.

3. Since the futures prices determined in the first two steps are not necessarily non-decreasing with decreasing time to expiry, we apply a simple algorithm which deals with this contingency. At each time, we compare the futures price at the prior time with the futures price at the current time. If the value at the current time is less than the value at the prior time, we use the futures price from the prior time to proxy for the value of the boundary at the current time. This way, our approximation is non-decreasing with decreasing time to expiry.
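The adjustment in the third step amounts to a running maximum over calendar time. A minimal Python sketch follows, assuming the raw boundary values from the first two steps are stored in calendar-time order and that each comparison uses the already adjusted value at the prior time; the function name `monotone_boundary` is hypothetical.

```python
import numpy as np

def monotone_boundary(raw_boundary):
    """Replace each value that falls below its predecessor by that predecessor,
    so the result is non-decreasing in calendar time."""
    return np.maximum.accumulate(np.asarray(raw_boundary, dtype=float))

print(monotone_boundary([0.90, 0.95, 0.93, 0.97, 0.96]))
# [0.9  0.95 0.95 0.97 0.97]
```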

4.4.4 General Remarks on Simulation

Since the parameter values available to us for the single and multifactor Vasicek models, the single and multifactor CIR models, and the three factor models considered in the $A_{1r}(3)$ and $A_{2r}(3)$ families are estimated from different data sets, we do not have a quantitative basis of comparison across this classification of all the interest rate models. So, for the single and multifactor Vasicek and CIR models, we shall investigate how the discretization error from utilizing the Euler scheme to simulate the paths of the underlying asset affects the optimal exercise boundaries, confidence intervals, and early exercise premia as the number of factors increases. For the models in the $A_{1r}(3)$ and $A_{2r}(3)$ families, we shall investigate the effect of the added complexity across the models on the optimal exercise boundaries, confidence intervals, and early exercise premia. To have a basis of comparability within the classification stated above, we need to use the same initial conditions and random numbers in each case. We chose state zero as the initial state of the uniform random number generator and of the normal random number generator in MATLAB.

4.4.5 The Parameters of the Models

The data used in Babbs and Nowman [2] consist of constructed zero-coupon yields obtained from interbank interest rates. The raw data include money market rates with maturities including the overnight rate, one-, three-, and six-month rates, Eurodollar futures, and swap rates with two to five, seven, and 10 years to maturity obtained from Datastream. The interest rates are sampled daily from April 1987 to December 1996, and they use weekly data on a Wednesday to avoid missing observations and week-day effects. This gives a total of 507 weekly observation dates, and at each date they have eight interest rates given by the maturities: three and six months, one, two, three, five, seven, and 10 years. Babbs and Nowman [2] apply Kalman filtering to a state space formulation of their models, allowing for measurement error in the data, to estimate the parameters of their models, which are reproduced in Table D.1.

One of the data sets that Chen and Scott [9] use for the estimation of the single and multifactor CIR models consists of yields for zero-coupon bonds. They use rates from zero-coupon yield curves for 3 months, 6 months, 5 years, and the longest maturity available (10-25 years). The rates are annualized and stated on a continuously compounded basis and represent rates for zero-coupon bonds. Also, the rates have been computed from month end prices in the Treasury market, and they use data for the period 1960-1987. The parameters are estimated by applying an approximate maximum likelihood estimator in a state-space model using the above data, where a non-linear Kalman filter is used to estimate the unobservable factors. The estimated parameters are reproduced in Table D.2. Finally, Dai and Singleton [12] use weekly observations of six-month, two-year, and ten-year swap contract yields from 1987-1996. The parameters were estimated by using the simulated method of moments of Gallant and Tauchen [14] and are reproduced in Tables D.3 and D.4.

4.5 Results

4.5.1 The CIR Models

What is interesting about this group of models is that the transition densities of the state variables in the single and multifactor models are known. For this reason, we can examine the impact of the discretization error arising from the usage of the Euler scheme on the LSM algorithm. This is important since not all the models have known transition distributions. The state variable in the single factor CIR model is conditionally distributed as a scaled non-central chi-squared variate under the risk-neutral measure, with transition density given by equation (2.12). Also, for each of the factor models, we found that the 95% confidence intervals became larger as we considered coupon bonds with longer maturities. On the other hand, the Euler scheme utilizes the discrete approximation of the differential of a Brownian motion in equation (2.11), which is conditionally Gaussian. In other words, the Euler scheme uses a Gaussian approximation to the original transition distribution, which is clearly not Gaussian. We shall consider the impact of the discretization error on the LSM algorithm by considering its impact on the size of the 95% confidence intervals for the price of the Bermudan put option, on the early exercise premia, and on the optimal exercise boundaries.
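The two simulation modes can be contrasted in a short Python sketch for a single CIR factor. It assumes the standard parameterization $dr = \kappa(\theta - r)\,dt + \sigma\sqrt{r}\,dW$, which may differ in notation from equations (2.11) and (2.12); truncating the rate at zero inside the square root of the Euler step is also an assumption, and the function names are hypothetical.

```python
import numpy as np

def cir_exact_step(r, kappa, theta, sigma, dt, rng):
    """One step from the exact transition law: a scaled non-central chi-squared draw."""
    c = sigma**2 * (1.0 - np.exp(-kappa * dt)) / (4.0 * kappa)   # scale factor
    df = 4.0 * kappa * theta / sigma**2                          # degrees of freedom
    nonc = r * np.exp(-kappa * dt) / c                           # non-centrality parameter
    return c * rng.noncentral_chisquare(df, nonc)

def cir_euler_step(r, kappa, theta, sigma, dt, rng):
    """One Euler step: a conditionally Gaussian approximation to the transition law."""
    z = rng.standard_normal(np.shape(r))
    # ASSUMPTION: the rate is truncated at zero inside the square root.
    return r + kappa * (theta - r) * dt + sigma * np.sqrt(np.maximum(r, 0.0) * dt) * z
```

Both steps have nearly the same conditional mean, $\theta + (r - \theta)e^{-\kappa\,dt}$ for the exact law, but the exact draw is never negative and is skewed, whereas the Euler draw is symmetric and can fall below zero.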

The 95% confidence intervals for the price of the American option under the Euler scheme are significantly larger than their counterparts obtained via the transition distribution approach for all the interest rate models in this group. For example, consider Figure 4.1, which contains the 95% confidence intervals and the LSM estimates for the two factor CIR model. The confidence intervals generated under the transition distribution are significantly tighter than the intervals generated under the Euler scheme. For instance, consider the results for the 30 year T-note futures put option given by the bottom two panels in Figure 4.1, and the strike price which is a deviation of the futures price by 5 cents in the money. Although it is hard to check without zooming in, the sizes of the confidence intervals are approximately 0.001 and 0.1 when simulation is conducted via the transition distribution and the Euler scheme, respectively. Recall that on the CBOT the T-note has a face value of $100,000. Hence, if we use the LSM estimate to price the option, we can incur a loss no larger than $1000 if we simulate via the transition distribution, and a loss no larger than $10,000 under the Euler scheme. This raises the question: what is the impact of the discretization error on the high biased and LSM estimators? Numerically, we found that the discretization error tends to increase the high biased estimator and lower the LSM estimator, thus widening the confidence interval. That is, for the single and two factor models, and for all strike prices and bond maturities considered, we find that the LSM estimator under the transition distribution dominates its counterpart under the Euler scheme. For the three factor CIR model, this is only true for options that are not deeply out of the money. Also, we found that the discrepancy between the LSM estimators shrinks as the number of factors increases.
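The confidence intervals discussed above are built from a low biased estimator (the LSM estimator) and a high biased estimator. The following Python sketch shows one common construction of this kind, in the spirit of Glasserman [16]: the lower end is the low biased sample mean minus a normal quantile times its standard error, and the upper end is the high biased sample mean plus the corresponding term. The function name, the input arrays and the default quantile 1.96 are illustrative assumptions and not the code used in this thesis.

```python
import numpy as np


def two_sided_interval(low_samples, high_samples, z=1.96):
    """Conservative interval [L - z*s_L/sqrt(n), H + z*s_H/sqrt(m)]."""
    low = np.asarray(low_samples, dtype=float)    # replications of the low biased estimator
    high = np.asarray(high_samples, dtype=float)  # replications of the high biased estimator
    lower = low.mean() - z * low.std(ddof=1) / np.sqrt(low.size)
    upper = high.mean() + z * high.std(ddof=1) / np.sqrt(high.size)
    return lower, upper
```

Under this construction, anything that raises the high biased estimate or lowers the low biased estimate, as the discretization error was found to do, widens the interval directly.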

As a first test of the performance of the single, two and three factor CIR models across all the option contracts, we consider the early exercise premia implied by these models. In general, the models all have positive early exercise premia. In Figure 4.2, for the single factor model simulated from the transition distribution, we find that the estimated early exercise value is linear as a function of the strike for all the bond maturities considered. Adding more factors while using the same mode of simulation gives rise to early exercise premia that vary non-linearly with the strike price, and this is true for all the bond maturities considered. This is expected, since the additional factors driving the short rate allow more flexible behavior of the short rate, which in turn allows more flexible behavior of the futures prices.

On the other hand, consider the put options on the two year T-note futures contracts in Figure 4.2. Under simulation via the Euler scheme, the early exercise premium function for the single factor CIR model varies non-linearly with the strike price, in contrast to the case where simulation is conducted via the transition distribution. Also, the non-linearity dies out as we consider longer bond maturities.

When comparing approximate optimal exercise boundaries across simulation methods, we should keep in mind the coupon payment dates of the T-bond or T-note in the futures contract. Although the underlying asset in our American put option contract does not make any interest payments, we find that the accrued interest given by Definition 4.2.2 plays a significant role in the behavior of the optimal exercise boundary when simulation is conducted via the Euler scheme. All the approximate optimal exercise boundaries we derived exhibit an increase at the time step just before the second coupon payment date. At that date, the accrued interest is largest and, given the form of the futures price in equation (4.3), the futures price will most likely decrease, giving rise to a favorable exercise opportunity. This holds for all the CIR models. For example, consider Figures 4.3 and 4.4, where the axis labeled strike prices represents the deviations from the initial price in percentages of 10 cents: 0.5 on that axis represents the strike price which is 5 cents in the money, and 0 represents a strike price that is at the money. The time step where the accrued interest is largest is given by 0.78 on the time in percent axis. We see that the increase at time 0.78 is much larger under the Euler scheme, and this result holds for the single and three factor CIR models too.

4.5.2 Models in the A1r(3) and A2r(3) Families

We used the Euler scheme to simulate sample paths of the underlying futures price; hence, we should keep in mind that our estimates contain a discretization error. Qualitatively, we find that the confidence intervals for all of the models considered in the A1r(3) and A2r(3) families are large, like the confidence intervals for the two-factor CIR model in Figure 4.1 when simulated via the Euler scheme. For example, consider Figure 4.5, where we plot the LSM estimator and the 95% confidence interval for the American put price for the A2r(3)_MAX model. The large confidence intervals suggest that the LSM estimator is not a good one. For example, consider the top right panel of Figure 4.5, which details the results when the T-note has ten years to maturity from the date of contract settlement. For a deviation of 5 cents in the money from the initial futures price, the American option price can be approximately as large as 17 cents and as low as 5 cents with 95% probability. On the CBOT, the ten year T-note has a face value of $100,000, so if we use this model to price on the CBOT, then we can only say that the price of the American put option on the ten year T-note futures contract lies anywhere between $5,000 and $17,000 with 95% probability.

Comparing the early exercise premia of the models shows that most of them, given the same number of basis functions, initial conditions and random numbers, should not be used for pricing the American put option. Figure 4.6 plots the early exercise premia for each of the models as a function of the strike price, holding constant the number of basis functions, the initial conditions and the random numbers. In terms of having a non-negative early exercise premium, the A1r(3)_MAX and A1r(3)_DS models outperform the rest of the models considered in the A1r(3) and A2r(3) families. The early exercise premia for the A1r(3)_MAX and A1r(3)_DS models are positive except for deeply out of the money put option contracts with the five year T-note futures contract as underlying, which can be seen in the top right panel of Figure 4.6. The other models considered have significantly negative early exercise premia. Note that a negative early exercise premium means that the European option with the same parameters is worth strictly more, which is theoretically impossible since the American option is worth at least as much as its European counterpart. Evidently, the issue is one of approximation.

When utilizing the LSM algorithm to price deeply in the money put options, we found that almost all of the sample paths of the underlying asset were in the money at each time after time zero. This suggests that the fitted values of all the regressions represent their population counterparts well, since the error is approximately no greater than 10^{-5}. This leaves us to consider the other source of error in the LSM algorithm, which is due to the approximation of the conditional expectation by a finite set of basis functions. We decided to check whether we were using too few basis functions. So we repeated the simulation for all the models, holding constant the initial conditions and random numbers, this time utilizing the finite subset of the tensor basis of the regular polynomials of degree 2. In comparison to the simulations under the set of complete polynomials of degree 2, we added 16 more regressors to the regression at each stage of the LSM algorithm, and found that the performance of all the models remained essentially unchanged. This numerical result suggests that the 16 extra regressors did not contribute much to the approximation of the conditional expectations.
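The count of 16 extra regressors can be checked by enumerating monomial exponents. The sketch below assumes three state variables, takes the complete polynomials of degree two together with the extra product of all three state variables recalled in this section (11 terms including the constant), and compares them with the three-fold tensor product basis of degree two (27 terms). The helper names are hypothetical.

```python
from itertools import product


def complete_exponents(n_vars, degree):
    """Exponent tuples of all monomials with total degree <= degree."""
    return [e for e in product(range(degree + 1), repeat=n_vars) if sum(e) <= degree]


def tensor_exponents(n_vars, degree):
    """Exponent tuples with each variable's degree <= degree."""
    return list(product(range(degree + 1), repeat=n_vars))


complete = complete_exponents(3, 2)
baseline = complete + [(1, 1, 1)]          # add the product of all three state variables
tensor = tensor_exponents(3, 2)
extra = [e for e in tensor if e not in baseline]
print(len(complete), len(baseline), len(tensor), len(extra))  # 10 11 27 16
```

The 16 extra terms are exactly the monomials of total degree three or more, other than the triple product, in which no single variable exceeds degree two.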

It was initially conjectured that the source of the error in the approximation might be our choice of basis functions. Recall that we originally chose the set of complete polynomials of degree two, together with an extra term given by the product of all three state variables. Since this set of polynomials is not orthogonal, we conjectured that the poor performance of the models in terms of the estimated early exercise premia might be due to this choice. So we repeated our simulation for the A2r(3)_DS model, this time using the first three Laguerre polynomials in each state variable, whose three-fold tensor basis consists of 27 terms, and found no significant improvement. That is, the estimated early exercise premia did not change significantly, and hence the performance of the A2r(3)_DS model did not improve.

Given the poor performance of most of the models, holding constant the number and type of basis functions, initial conditions and random numbers, one should not use their approximate optimal exercise boundaries as a decision rule to dictate optimal exercise. One interesting feature is that the optimal exercise boundaries for all the models considered in this section behave similarly for the put options on the two, five, and ten year T-note futures, and differently for the 30 year T-bond futures option. Figure 4.7 presents the graphs of the boundaries for the A1r(3)_MAX model. Note that the axis labeled strike prices represents the deviations from the initial price in percentages of 10 cents. For example, 0.5 on that axis represents the strike price which is 5 cents in the money. The boundaries for all the models in this section for the two, five and ten year T-note futures options are very similar, which is due to the effect of the accrued interest. At time 0.78 in Figure 4.7, we find that the optimal exercise boundaries exhibit a sharp increase. Recall that time 0.78 is the time where the largest accrued interest occurs, and this has a negative effect on the futures price given by equation (4.3), causing it to decrease. For the 30 year T-bond futures option, since the coupon bond has a longer time to maturity, the sum in equation (4.3) contains more terms and offsets the negative effect of the accrued interest.
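The Laguerre experiment described earlier in this subsection regresses on a three-fold tensor basis of 27 terms. A minimal NumPy sketch of such a regression step is given below. It assumes unweighted Laguerre polynomials of degrees 0 to 2 in each of three state variables and an ordinary least squares fit; the function names and the input arrays are hypothetical and do not reproduce the thesis code.

```python
import numpy as np
from numpy.polynomial import laguerre


def laguerre_tensor_design(states, degree=2):
    """Design matrix with columns L_i(x) * L_j(y) * L_k(z), 0 <= i, j, k <= degree."""
    x, y, z = np.asarray(states, dtype=float).T   # states has shape (n_paths, 3)
    return laguerre.lagvander3d(x, y, z, [degree, degree, degree])


def regress_continuation(states, discounted_cashflows, degree=2):
    """Least squares fit of discounted cash flows on the tensor basis."""
    design = laguerre_tensor_design(states, degree)
    coef, *_ = np.linalg.lstsq(design, discounted_cashflows, rcond=None)
    return design @ coef                          # fitted continuation values
```

Since the Laguerre tensor basis of degree two spans the same function space as the monomial tensor basis of degree two, the fitted values of an exact least squares regression coincide for the two bases, which is consistent with the absence of any significant improvement reported above.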

4.5.3 The Single, Two, and Three Factor Vasicek Models

For the three factor Vasicek model, we are suspicious of the 95% confidence intervals and the approximate optimal exercise boundaries it gave rise to. The values of the optimal exercise boundaries in the three factor Vasicek model, whether simulated via the Euler scheme or the transition distribution, are very low. For example, the value of the optimal exercise boundary for an at the money put option on a thirty year T-bond futures contract is less than five cents. The confidence intervals also suggest abnormally low prices for the American put option. For the single and two factor Vasicek models the results seem more reasonable, and we suspect an error in the code that generates the sample paths of the futures prices for the three factor Vasicek model.

Since the single factor Vasicek models are Gaussian, we expect no large discrepancy in the results across the two simulation approaches, the Euler scheme and the transition distribution. We found the early exercise premia for the single factor Vasicek model under the Euler scheme and the transition distribution to be almost identical, as shown in Figure 4.8. As for the 95% confidence intervals for the prices of the American put options, the intervals under the transition distribution were tighter than their counterparts generated via the Euler scheme. For the two factor Vasicek model, we find that the 95% confidence intervals for the price of the American option under the transition density became larger as the time to bond maturity increased, increasing their similarity with their counterparts simulated via the Euler scheme. Figure 4.9 depicts this trend for the two factor Vasicek model simulated via the transition distribution. For the single and two factor Vasicek models, regardless of the simulation approach, all the optimal exercise boundaries exhibited a large increase at time 0.78, which is the time where the accrued interest appearing in the futures price, equation (4.3), is largest. This is depicted in Figure 4.10.
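The expectation that the two simulation approaches agree closely in the Gaussian case can be made concrete. The sketch below assumes the single factor dynamics dr = kappa (theta - r) dt + sigma dW, for which both the exact transition and the Euler step are Gaussian and differ only in their conditional mean and variance. The function names and the numerical values in the usage note are illustrative assumptions.

```python
import numpy as np


def vasicek_exact_step(r, kappa, theta, sigma, dt, z):
    """One step from the exact Gaussian transition distribution."""
    mean = theta + (r - theta) * np.exp(-kappa * dt)
    std = sigma * np.sqrt((1.0 - np.exp(-2.0 * kappa * dt)) / (2.0 * kappa))
    return mean + std * z


def vasicek_euler_step(r, kappa, theta, sigma, dt, z):
    """One Euler step driven by the same standard normal draw z."""
    return r + kappa * (theta - r) * dt + sigma * np.sqrt(dt) * z
```

For example, with the hypothetical values r = 0.04, kappa = 0.5, theta = 0.06, sigma = 0.02, dt = 1/250 and z = 1, the two steps differ by roughly 1e-6, since the Euler mean and variance are first order expansions in dt of the exact ones.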

[Figure 4.1 panels: "Two Factor CIR (Transition Dist.)" in the left column and "Two Factor CIR (Euler Scheme)" in the right column, with one row for each of the 2, 5, 10 and 30 year contracts. Each panel plots confidence intervals against the deviation of the initial futures price in percentages of 10 cents.]

Figure 4.1: Plots of 95% confidence intervals using the high and low biased estimators, and LSM estimators for the prices of all the American put options, using the two factor CIR model where simulation is conducted via both the Euler scheme and transition distribution approaches.

[Figure 4.2 panels: "Estimated early exercise value", one panel per contract (two, five and ten year T-Note futures and thirty year T-Bond futures American put options). Legend: Single-FCIR-TD, Two-FCIR-TD, Three-FCIR-TD, Single-FCIR-ES, Two-FCIR-ES, Three-FCIR-ES. Horizontal axis: deviation from initial futures price by percentage of 10 cents.]

Figure 4.2: The estimated early exercise premia for the American options for all strikes and bond maturities considered, for the single, two and three factor models of the short rate simulated via the Euler scheme and transition distribution approaches.

[Figure 4.3 panels: "Two Factor CIR (Euler Scheme)" optimal exercise boundaries for the 2, 5 and 10 year T-note and 30 year T-Bond contracts. Axes: strike prices and time in percent.]

Figure 4.3: Approximate optimal exercise boundaries for the two factor CIR model using the Euler scheme.

[Figure 4.4 panels: "Two Factor CIR (Dist.)" optimal exercise boundaries for the 2, 5 and 10 year T-note and 30 year T-Bond contracts. Axes: strike prices and time in percent.]

Figure 4.4: Approximate optimal exercise boundaries for the two factor CIR model using the transition distribution.

[Figure 4.5 panels: "MAX23" confidence intervals for the 2, 5 and 10 year T-note and 30 year T-Bond contracts. Horizontal axis: deviation of initial futures price in percentages of 10 cents.]

Figure 4.5: Plots of 95% confidence intervals using the high and low biased estimators, and LSM estimators for the prices of all the American put options, using the maximal model in the A2r(3) family.

[Figure 4.6 panels: "Estimated early exercise value" for the two, five and ten year T-Note futures and the thirty year T-Bond futures American put options. Legend entries recoverable from the source include BDFS and DS13. Horizontal axis: deviation from initial futures price by percentage of 10 cents.]

Figure 4.6: The estimated early exercise premia for the American options for all strikes and bond maturities considered, for the models in the A1r(3) and A2r(3) families.

[Figure 4.7 panels: "MAX13" optimal exercise boundaries; the recoverable panel titles are the 2, 5 and 10 year T-note contracts. Axes: strike prices and time in percent.]

Figure 4.7: Approximate optimal exercise boundaries for the A1r(3)_MAX model. The initial price is given by the value zero on the strike prices axis.

[Figure 4.8 panels: "Estimated early exercise value" for the two, five and ten year T-Note futures and the thirty year T-Bond futures American put options. Legend: Single, Two and Three factor Vasicek-TD; Single, Two and Three factor Vasicek-ES. Horizontal axis: deviation from initial futures price by percentage of 10 cents.]

Figure 4.8: Estimated Early Exercise Boundaries for the single, two and three factor Vasicek models under the Euler scheme and transition distribution.

[Figure 4.9 panels: "Two Factor Vasicek (Transition Dist.)" confidence intervals for the 2, 5 and 10 year T-note and 30 year T-Bond contracts. Horizontal axis: deviation from initial futures price by percentage of 10 cents.]

Figure 4.9: 95% confidence intervals for the American put prices obtained using high and low biased estimators for the two factor Vasicek model simulated via the transition distribution.

[Figure 4.10 panels: "Single Factor Vasicek (Transition Dist.)" optimal exercise boundaries for the 2 and 5 year T-note, a third T-note panel whose maturity is not recoverable, and the 30 year T-Bond. Axes: strike prices and time in percent.]

Figure 4.10: Approximate optimal exercise boundaries for the American put options under the two-factor Vasicek model simulated via the transition density

Chapter 5

Conclusions

5.1 Summary

Chapter 1 introduced continuous-time mathematical finance in the framework of a basic market model, with the aim of computing the pricing formulae of various interest rate assets and derivatives. The key concepts from probability theory and stochastic calculus used in that chapter are stated in Appendix A. First, we motivated the use of stochastic calculus to model the behavior of interest rates over time by utilizing functions which are not of bounded variation. After that, we presented the standard mathematical formulation of uncertainty required to introduce the market model, from which we derived the pricing formulae for zero-coupon bonds and bond futures, and for European put and call options on these two assets. We also derived the corresponding pricing formulae for coupon bearing bonds, bond futures, and European options on these assets. Finally, we presented two approaches to computing the pricing formulae mentioned above: the B-S method and the forward measure approach.

Chapter 2 presented the different affine term structure models (ATSMs) that we utilized, and the derivation of the pricing formulae of the interest rate assets and derivatives presented in Chapter 1. The ATSMs are the one-factor, two-factor, and three-factor Vasicek and CIR models used in Babbs and Nowman [2] and Chen and Scott [9] respectively, the three factor models of Balduzzi, Das, Foresi, and Sundaram [3], Dai and Singleton [12], and Chen [6], and the maximal models in the A1r(3) and A2r(3) subfamilies introduced in Dai and Singleton [12]. For each of these models, we utilized the B-S method or the forward measure approach to compute the pricing formulae, and discussed alternative approaches to their computation. For example, we presented a solution to the zero-coupon bond pricing formula in the three-factor model of Balduzzi, Das, Foresi, and Sundaram [3] using Maple 8, which utilizes symmetry schemes to solve the interdependent Riccati differential equations.

Chapter 3 presented the basic pricing theory of the finite expiring American put option under stochastic interest rates. We presented the optimal stopping formulation and the analytical characterization of the American put pricing problem, where the latter is used to motivate the use of the dynamic programming principle in continuous time to compute the put price. After that, we introduced the discrete time approximation to the price by using a discrete form of the dynamic programming principle, and introduced the LSM algorithm developed in Longstaff and Schwartz [23]. We discussed the two types of approximation made by the LSM algorithm, presented pseudo code for its implementation on a computer, and described the optimal stopping rule arising from the LSM algorithm. Also, we showed that the LSM estimator is an interleaving estimator, in the sense that it mixes elements of high and low bias, and presented four convergence results for the LSM algorithm. The first two are from Longstaff and Schwartz [23]. The first proves that the LSM estimator is asymptotically low biased, and the second motivates the search for conditions under which the approximation to the conditional expectation in the LSM algorithm is uniform. The last two convergence results are from Clement, Lamberton and Protter [10]. Their first result proves that, under some general hypotheses on the choice of basis functions and payoff functions, the LSM estimator converges almost surely to the price of the finite expiring American put option obtained when the continuation values are specified by linear combinations of the chosen basis functions, as the number of simulated paths of the underlying asset tends to infinity. Their second result proves that, as the number of basis functions utilized in the approximation of the continuation values increases, the Bermudan put price converges in the L2 norm to the true price of the American put option.
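To make the backward induction summarized above concrete, the following is a minimal Python sketch of the LSM algorithm. It is deliberately simpler than the setting of this thesis: it assumes a put on a stock following geometric Brownian motion with a constant interest rate, as in the stock example of Longstaff and Schwartz [23], and regresses on the polynomials 1, S, S^2 using in-the-money paths only. The function name and the parameter values in the usage note are illustrative assumptions.

```python
import numpy as np


def lsm_american_put(s0, strike, rate, vol, maturity, n_steps, n_paths, seed=0):
    """Least squares Monte Carlo price of a Bermudan put under GBM."""
    rng = np.random.default_rng(seed)
    dt = maturity / n_steps
    disc = np.exp(-rate * dt)
    # Simulate GBM exactly at the exercise dates t_1, ..., t_n.
    z = rng.standard_normal((n_paths, n_steps))
    log_increments = (rate - 0.5 * vol**2) * dt + vol * np.sqrt(dt) * z
    paths = s0 * np.exp(np.cumsum(log_increments, axis=1))
    # Cash flow of each path under the current stopping rule.
    cashflow = np.maximum(strike - paths[:, -1], 0.0)
    for t in range(n_steps - 2, -1, -1):
        cashflow *= disc                       # discount one step back to date t
        s = paths[:, t]
        exercise = np.maximum(strike - s, 0.0)
        itm = exercise > 0.0                   # regress on in-the-money paths only
        if itm.sum() > 3:
            design = np.vander(s[itm], 3)      # columns S^2, S, 1
            coef, *_ = np.linalg.lstsq(design, cashflow[itm], rcond=None)
            continuation = design @ coef       # estimated continuation values
            stop = exercise[itm] > continuation
            idx = np.flatnonzero(itm)[stop]
            cashflow[idx] = exercise[idx]      # exercise now on these paths
    price = disc * cashflow.mean()             # discount from t_1 to time zero
    return max(price, strike - s0)             # allow immediate exercise
```

For example, `lsm_american_put(36.0, 40.0, 0.06, 0.2, 1.0, 50, 20000)` uses the parameters of one of the cases reported by Longstaff and Schwartz [23]. Both approximations discussed in Chapter 3 are visible here: the finite set of exercise dates and the projection of the continuation value on a finite set of basis functions.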

Chapter 4 introduced the specific contracts that we priced in the thesis, the high biased estimator we utilized, the finite set of basis functions used, and the type of approximation used for the optimal exercise boundaries. The contracts we considered are the American put options on the thirty year T-bond futures contract and the two, five and ten year T-note futures contracts, which are actively traded on the Chicago Board of Trade. We presented the pricing formula for the underlying futures contract in Section 4.2, and made the simplifying assumption that there is only a single deliverable coupon-bearing bond in the underlying futures contract. The high biased estimator we utilized is called the clairvoyance estimator, and we showed how to construct a 95% confidence interval for the true price of the American put option from two biased estimators, following Glasserman [16]. We described the set of complete polynomials of degree two which we used as basis functions for each model, and the construction of the estimated optimal exercise boundary based on the stopping rule arising from the LSM algorithm. We also described the different data sets utilized and the methods by which the parameters of the ATSMs were estimated by Babbs and Nowman [2], Chen and Scott [9], and Dai and Singleton [12]. Finally, we presented our results.

5.2 Conclusions

Our results suggest that we should utilize the transition distribution of the state variables whenever possible to simulate sample paths of the state variables, since this avoids the discretization error which, as demonstrated numerically in Section 4.5, adversely affects the size of our confidence intervals, the value of early exercise, and the estimated optimal exercise boundaries. When the transition distribution of the state variables is unknown, one should consider utilizing a smaller step size when simulating the sample paths via the Euler scheme, or use higher order discretization methods such as Milstein schemes. This way one can reduce the discretization error and hence improve the accuracy of the estimates of the true American put price and of the estimated optimal exercise boundaries.

We found numerically that, for the multifactor models of the term structure, the extra terms included in a tensor product basis of degree two were excessive from the perspective of the rate of convergence. That is, the extra terms in the two-fold and three-fold tensor product bases of degree two over the set of complete polynomials did not improve the performance of the models at all. This finding motivated us to consider different basis functions. We utilized the three-fold tensor product basis of Laguerre polynomials of degree two in estimating the price of the American put option for the A2r(3)_DS model, and found no significant improvement over the results produced under the set of complete polynomials of degree two. This was expected since the LSM algorithm is robust to the choice of basis functions. For this reason, we also considered utilizing more basis functions in the LSM algorithm. We used the complete polynomials of degrees three and four in the A2r(3)_DS model, and still did not obtain any significant improvement in the performance of that model, as the early exercise premia remained negative. Finally, given the choice of basis functions, the high biased estimator, and the number of paths, the A1r(3)_MAX and A1r(3)_DS models perform best among the models in the A1r(3) and A2r(3) subfamilies in terms of the positivity of the early exercise premia for different strike prices, for each of the contracts.
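As an illustration of the higher order discretization suggested above, the sketch below compares an Euler step with a Milstein step for a single square root factor. It assumes the dynamics dr = kappa (theta - r) dt + sigma sqrt(r) dW, for which the Milstein correction term is (sigma^2 / 4)(dW^2 - dt), and it truncates negative rates at zero inside the coefficients. The function name and this truncation are illustrative assumptions, not a scheme used in this thesis.

```python
import numpy as np


def cir_step(r, kappa, theta, sigma, dt, dw, scheme="euler"):
    """One Euler or Milstein step for a square root diffusion."""
    r_pos = np.maximum(r, 0.0)  # assumption: truncate negative rates at zero
    r_next = r + kappa * (theta - r_pos) * dt + sigma * np.sqrt(r_pos) * dw
    if scheme == "milstein":
        # Correction 0.5 * b(r) * b'(r) * (dW^2 - dt) with b(r) = sigma * sqrt(r).
        r_next = r_next + 0.25 * sigma**2 * (dw**2 - dt)
    return r_next
```

The two steps coincide when dW^2 = dt and differ otherwise by a term of order dt, which is what raises the strong order of convergence from 1/2 to 1 in the standard theory for smooth coefficients.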

5.3 Future Research

We would have liked to utilize more sample paths of the underlying asset and a smaller step size, but limitations on computing power and memory made this impossible. Future research must take computing power into account since, as we increase the number of sample paths, we can reduce the variance of the LSM estimator¹. Also, experimenting with more basis functions and different high biased estimators may improve the overall performance of all the models, but this comes at the price of possibly slowing down the LSM algorithm. One can also reduce the variance by using variance reduction techniques such as antithetic sampling and importance sampling, both of which are presented in Section B.2. Longstaff and Schwartz [23] utilize antithetic sampling in one of their examples on pricing American put options on a share of stock, where the risk-neutral stock price process follows a geometric Brownian motion. Moreni [25] developed a variance reduction technique based on the and importance sampling for the computation of American option prices via the LSM algorithm. One criticism of Moreni [25] is that he does not reduce the joint variance of the LSM estimator and a high biased estimator. This is important since Proposition 3.3.1 proves that the LSM estimator is a low biased estimator and, since in practice we do not know the functional form of the conditional expectations representing the continuation values in the LSM algorithm, we should construct a confidence interval for the true price of the American option. Finally, to better evaluate the performance of the LSM algorithm, one should compare its performance with other methods, such as the known PDE methods which solve the linear complementarity conditions given by equations (3.3) and (3.4), using the same single and multifactor models of the short rate.

¹ The convergence results for the Monte Carlo method are presented in Section B.1.
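The effect of antithetic sampling mentioned in this section can be illustrated on a simpler problem than the American option. The sketch below prices a European put under geometric Brownian motion, once with independent draws and once by averaging each draw with its mirror image, using the same total number of payoff evaluations. The function names and the parameter values in the usage note are illustrative assumptions.

```python
import numpy as np


def discounted_put_payoff(z, s0, strike, rate, vol, maturity):
    """Discounted European put payoff as a function of a standard normal draw."""
    s_t = s0 * np.exp((rate - 0.5 * vol**2) * maturity + vol * np.sqrt(maturity) * z)
    return np.exp(-rate * maturity) * np.maximum(strike - s_t, 0.0)


def plain_and_antithetic(n_pairs, seed=0, **params):
    """Return (plain mean, plain std. error, antithetic mean, antithetic std. error)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(2 * n_pairs)
    plain = discounted_put_payoff(z, **params)
    z_half = z[:n_pairs]
    # Average each payoff with the payoff of the mirrored draw -z.
    pair_avg = 0.5 * (discounted_put_payoff(z_half, **params)
                      + discounted_put_payoff(-z_half, **params))
    plain_se = plain.std(ddof=1) / np.sqrt(plain.size)
    anti_se = pair_avg.std(ddof=1) / np.sqrt(pair_avg.size)
    return plain.mean(), plain_se, pair_avg.mean(), anti_se
```

For example, `plain_and_antithetic(50000, s0=36.0, strike=40.0, rate=0.06, vol=0.2, maturity=1.0)` should show a smaller standard error for the antithetic estimate, because the put payoff is monotone in the normal draw and the two payoffs in each pair are therefore negatively correlated.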

Bibliography

[1] M. Abramowitz and I. Stegun. Handbook of Mathematical Functions.

[2] S. H. Babbs and K. B. Nowman. Kalman filtering of generalized Vasicek term structure models. The Journal of Financial and Quantitative Analysis, 34(1):115-130, March 1999.

[3] P. Balduzzi, S. R. Das, S. Foresi, and R. K. Sundaram. A simple approach to three factor affine term structure models. Journal of , 6:43-53, 1996.

[4] P. Brandimarte. Numerical Methods in Finance and Economics: A MATLAB-Based Introduction. Wiley, 2nd edition, 2006.

[5] D. Brigo and F. Mercurio. Interest Rate Models: Theory and Practice. Springer Finance. Springer-Verlag, 2001.

[6] L. Chen. Stochastic Mean and Stochastic Volatility-A three factor model of the term structure of interest rates and its application to the pricing of interest rate derivatives. Blackwell, Oxford, U.K., 1996.

[7] R.-R. Chen. Understanding and Managing Interest Rate Risks. Series in Mathematical Finance. World Scientific, 1996.

[8] R.-R. Chen and L. Scott. Pricing interest rate options in a two-factor Cox-Ingersoll-Ross model of the term structure. The Review of Financial Studies, 5(4):613-636, 1992.

[9] R.-R. Chen and L. Scott. Multi-factor Cox-Ingersoll-Ross models of the term structure: Estimates and tests from a Kalman filter model. Journal of Real Estate Finance and Economics, 27(2):143-172, 2003.

[10] E. Clement, D. Lamberton, and P. Protter. An analysis of a least squares regression algorithm for American option pricing. Finance and , 6:449-471, 2002.

[11] J. C. Cox, J. E. Ingersoll, Jr., and S. A. Ross. A theory of the term structure of interest rates. Econometrica, 53(2):385-407, March 1985.

[12] Q. Dai and K. Singleton. Specification analysis of affine term structure models. The Journal of Finance, 55(5):1943-1978, October 2000.

[13] W. Feller. Two singular diffusion problems. The Annals of Mathematics, 54(1):173-182, 1951.

[14] R. Gallant and G. Tauchen. Which moments to match? Econometric Theory, 12:657-681, 1996.

[15] H. Geman, N. El Karoui, and J. Rochet. Changes of numeraire, changes of probability measures, and pricing of options. Journal of Applied Probability, 32:443-458, 1995.

[16] P. Glasserman. Monte Carlo Methods in Financial Engineering. Applications of Mathematics. Springer, 2003.

[17] D. Greenspan and V. Casulli. Numerical Analysis for Applied Mathematics, Science, and Engineering. Addison Wesley, 1988.

[18] J. Harrison and S. Pliska. Martingales and stochastic integrals in the theory of continuous trading. Stochastic Processes and their Applications, 11:215-260, 1981.

[19] J. Harrison and S. Pliska. A stochastic calculus model of continuous trading: Complete markets. Stochastic Processes and their Applications, 15:313-316, 1983.

141 [20] F. Jamshidian. An exact pricing formula. Journal of Finance, 44:205- 209,1989.

[21] K. L. Judd. Numerical Methods in Economics. MIT Press, Cambridge, MA, 1998.

[22] D. Lamberton and B. Lapeyre. Introduction to Stochastic Calculus Applied to Fi­ nance. Chapman and Hall, 1996.

[23] F. A. Longstaff and E. S. Schwartz. Valuing american options by simlation: A simple least-squares approach. The Review of Financial Studies, 14(1): 113-147, 2001.

[24] Y. L. Luke. The special functions and their approximations, Vol. I. Mathematics in Science and Engineering, Vol. 53. Academic Press, New York, 1969.

[25] N. Moreni. pricing american options: a variance reduction technique for the Longstaff

and Schwartz algorithm. PhysicaA, (338):292-295, 2004.

[26] S. Nawalkha, N. Beliaeva, and G. Soto. Dynamic Term Structure Modeling: the fixed income valuation course. John Wiley and Sons Inc., 2007.

[27] S. E. Shreve. Stochastic Calculus for Finance II: Continuous-Time Models. Springer, 2004.

[28] D. Sondermann. Introduction to Stochastic Calculus for Finance. Lecture Notes In Economics and Mathematical Systems. Springer, 2006.

[29] O. Vasicek. An equilibrium characterization of the term structure. Journal of Finan­ cial Economics, 5:177-188,1977.

Appendix A

Probability and Stochastic Calculus

A.1 Concepts and Terminology

A.1.1 Rudimentary Probability

Let $(\Omega, \mathcal{F})$ denote a measurable space. In general, this measurable space is abstract and hence usually not easy to work with. The concept of a random variable, denoted by $X$, circumvents this difficulty by enabling us to do analysis on a simpler measurable space, $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$, with minimal mathematical machinery, where $\mathcal{B}(\mathbb{R})$ denotes the Borel sigma-algebra on the real line. Note that we can construct the image probability measure

\[ P_X(A) = P \circ X^{-1}(A), \quad \forall A \in \mathcal{B}(\mathbb{R}), \]

where $X^{-1}$ denotes the pre-image mapping with domain $\mathcal{B}(\mathbb{R})$ and co-domain $\mathcal{F}$. For a random variable $X$, we denote its expected value under the probability measure $P$ by $E_P[X]$, and its conditional expectation under the probability measure $P$ conditional on the information in $\mathcal{F}$ by $E_P[X \mid \mathcal{F}]$. Now, suppose we have another probability measure, $Q$, associated with the same measurable space $(\Omega, \mathcal{F})$. If the two probability measures agree on which events have probability zero, then we say that the two measures are equivalent. This concept is expressed formally below:

Definition A.1.1 (Equivalence). Let $(\Omega, \mathcal{F})$ be a measurable space. Two probability measures $P$ and $Q$ on $(\Omega, \mathcal{F})$ are said to be equivalent if they agree on which sets in $\mathcal{F}$ have probability zero.

Given $(\Omega, \mathcal{F})$, $P$, and $Q$ as in Definition A.1.1, the Radon-Nikodym theorem, stated as Theorem A.1.1, allows us to express one measure in terms of the other.

Theorem A.1.1 (Radon-Nikodym Theorem). Let $P$ and $Q$ be equivalent probability measures defined on the measurable space $(\Omega, \mathcal{F})$. Then there exists an almost surely positive random variable $Z$ such that $E_P[Z] = 1$ and

\[ Q(A) = \int_A Z(\omega)\, dP(\omega), \quad \forall A \in \mathcal{F}. \tag{A.1} \]

The function $Z$ in equation (A.1) can be written more concisely as

\[ Z = \frac{dQ}{dP}, \]

where the notation represents $Z$ as the vehicle used to change probability measures. Hence, any random variable $X$ now has two expectations, one under the original measure $P$, denoted by $E_P[X]$, and one under the measure $Q$, denoted by $E_Q[X]$, which are related via equation (A.1) as

\[ E_Q[X] = E_P\left[ X \frac{dQ}{dP} \right]. \]

The implication of this result is that we have more than a single way to compute the expected value of $X$. This is particularly useful when the computation of the expected value of $X$ is not straightforward under the original measure: if possible, one can construct a Radon-Nikodym derivative such that the expected value of $X$ under the new measure is computationally simpler.

In general, continuous time mathematical finance models describe the evolution of the basic asset prices over time as stochastic processes adapted to a given filtration. A stochastic process is an indexed collection of random variables defined on the same probability space, denoted by $X = \{X_t\}_{t \in T}$, where $T$ is the index set which represents an interval of time. The purpose of the filtration, denoted by $\{\mathcal{F}_t\}_{t \in T}$, is to keep track of what information we know at any given point in time. In the mathematical finance literature, the two classes of stochastic processes which have had the greatest impact on the development of the pricing of contingent claims are martingales and Markov processes. These concepts are given by Definitions A.1.2 and A.1.3 respectively.

Definition A.1.2 (Martingale). Let $(\Omega, \mathcal{F}, P)$ be a probability space, let $T$ be a fixed positive number, and let $\mathcal{F}_t$, $0 \le t \le T$, be a filtration of sub-$\sigma$-algebras of $\mathcal{F}$. Consider an adapted stochastic process $M_t$, $0 \le t \le T$. If

\[ E_P[M_t \mid \mathcal{F}_s] = M_s, \quad \forall\, 0 \le s \le t \le T, \]

then we say that $M$ is a martingale.

Definition A.1.3 (Markov Process). Let $(\Omega, \mathcal{F}, P)$ be a probability space, let $T$ be a fixed positive number, and let $\mathcal{F}_t$, $0 \le t \le T$, be a filtration of sub-$\sigma$-algebras of $\mathcal{F}$. Consider an adapted stochastic process $X_t$, $0 \le t \le T$. Assume that for all $0 \le s \le t \le T$ and for every nonnegative, Borel-measurable function $f$, there is another Borel-measurable function $g$ such that

\[ E_P[f(t, X_t) \mid \mathcal{F}_s] = g(t, X_s). \]

Then we say that $X$ is a Markov process.

Now, suppose we are interested in changing probability measures. Then we need to construct a collection of Radon-Nikodym derivatives, one for each $t \in T$, since we are assuming that the stochastic process is adapted to a given filtration. That is, for each $t \in T$, we can construct a Radon-Nikodym derivative by using equation (A.1) with $\mathcal{F}_t$ instead of $\mathcal{F}$. More concisely, we will need to construct a Radon-Nikodym derivative process defined as

\[ Z(t) = \left\{ \frac{dQ}{dP}\Big|_{\mathcal{F}_t} : t \in T \right\}. \tag{A.3} \]

The process in equation (A.3) is a martingale, and this result follows from the definition of conditional expectation of a random variable. An example of a stochastic process that is a martingale is Brownian motion, which is arguably the most important stochastic process in continuous time mathematical finance, and is given by Definition A.1.4.

Definition A.1.4 (Standard one dimensional Brownian motion). Let $(\Omega, \mathcal{F}, P)$ be a probability space. For each $\omega \in \Omega$, suppose there is a continuous function $W_t$ of $t \ge 0$ that satisfies $W_0 = 0$ and that depends on $\omega$. Then $W_t$, $t \ge 0$, is a Brownian motion if for all $0 = t_0 < t_1 < \cdots < t_m$ the increments

\[ W_{t_1} = W_{t_1} - W_{t_0},\; W_{t_2} - W_{t_1},\; \ldots,\; W_{t_m} - W_{t_{m-1}} \]

are independent and each of these increments is normally distributed with

\[ E_P[W_{t_{i+1}} - W_{t_i}] = 0, \]

\[ \mathrm{Var}_P[W_{t_{i+1}} - W_{t_i}] = t_{i+1} - t_i. \]
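The definition translates directly into a sampling recipe: a path on a time grid is a running sum of independent normal increments with variance equal to the grid spacing. The following is a minimal sketch using only the Python standard library; the function name `brownian_path`, the grid, the seed, and the number of paths are illustrative assumptions and do not come from the thesis.

```python
import math
import random

def brownian_path(t_grid, rng):
    """Simulate W at the times in t_grid (hypothetical helper; t_grid[0] must be 0)."""
    w = [0.0]  # W_0 = 0
    for t_prev, t_next in zip(t_grid, t_grid[1:]):
        # Independent increment, normal with mean 0 and variance t_next - t_prev.
        w.append(w[-1] + math.sqrt(t_next - t_prev) * rng.gauss(0.0, 1.0))
    return w

rng = random.Random(0)
t_grid = [0.0, 0.25, 0.5, 1.0]
paths = [brownian_path(t_grid, rng) for _ in range(20000)]

# Empirical moments of the increment over [0.5, 1.0]; the definition says
# the mean is 0 and the variance is 1.0 - 0.5 = 0.5.
incs = [p[3] - p[2] for p in paths]
mean_inc = sum(incs) / len(incs)
var_inc = sum((x - mean_inc) ** 2 for x in incs) / (len(incs) - 1)
print(mean_inc, var_inc)
```

The grid does not have to be uniform, which is why the increment variance is computed from consecutive grid points rather than from a fixed step.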

The mathematical model we shall utilize to describe phenomena such as the one in Figure 1.1 is a stochastic differential equation. Given a probability space $(\Omega, \mathcal{F}, P)$ and a filtration $\{\mathcal{F}_t\}_{t \in [0,\infty)}$, a stochastic differential equation, SDE hereafter, is an equation of the form

\[ dX_u = \beta(u, X_u)\,du + \sigma(u, X_u)\,dW_u, \tag{A.4} \]

where $\beta(u, x)$ and $\sigma(u, x)$ are given functions called the drift and diffusion respectively, and $W = \{W_t : 0 \le t < \infty\}$ is a standard one dimensional Brownian motion. Given the initial condition $X_0 = x_0$, where $x_0 \in \mathbb{R}$, the SDE in equation (A.4) models the evolution of a stochastic process $X_t$, $t \ge 0$, such that

\[ X_0 = x_0, \]

\[ X_t = X_0 + \int_0^t \beta(u, X_u)\,du + \int_0^t \sigma(u, X_u)\,dW_u, \tag{A.5} \]

which is adapted to the given filtration. We shall assume that $E_P \int_0^t \sigma^2(u, X_u)\,du$ and $\int_0^t |\beta(u, X_u)|\,du$ are finite for every $t > 0$, so that the integrals on the right-hand side of equation (A.5) are defined and the Itô integral is a martingale.

A.1.2 Stochastic Calculus

Theorem A.1.2 (Properties of Stochastic Integral). Let $T$ be a positive constant and let $\Delta(t)$, $0 \le t \le T$, be an adapted stochastic process that satisfies $E \int_0^T \Delta^2(t)\,dt < \infty$. Then $I(t) = \int_0^t \Delta(u)\,dW(u)$ has the following properties.

1. (Continuity) As a function of the upper limit of integration $t$, the paths of $I(t)$ are continuous.

2. (Adaptivity) For each $t$, $I(t)$ is $\mathcal{F}_t$-measurable.

3. (Linearity) If $I(t) = \int_0^t \Delta(u)\,dW(u)$ and $J(t) = \int_0^t \Gamma(u)\,dW(u)$, then $I(t) \pm J(t) = \int_0^t (\Delta(u) \pm \Gamma(u))\,dW(u)$; furthermore, for every constant $c$, $cI(t) = \int_0^t c\Delta(u)\,dW(u)$.

4. (Martingale) $I(t)$ is a martingale.

5. (Itô isometry) $E\,I^2(t) = E \int_0^t \Delta^2(u)\,du$.

6. (Quadratic Variation) $[I, I](t) = \int_0^t \Delta^2(u)\,du$.
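Properties 4 and 5 can be checked numerically for the particular integrand $\Delta(u) = W(u)$, for which the isometry gives $E\,I^2(T) = \int_0^T u\,du = T^2/2$. The sketch below approximates the stochastic integral by left-endpoint sums; the function name `ito_integral_of_w`, the step count, and the sample size are illustrative assumptions, and the discrete sum carries a small bias of order $1/n$ in the second moment.

```python
import math
import random

def ito_integral_of_w(T, n_steps, rng):
    """Left-endpoint approximation of I(T) = int_0^T W(u) dW(u) on one path."""
    h = T / n_steps
    w = 0.0
    integral = 0.0
    for _ in range(n_steps):
        dw = math.sqrt(h) * rng.gauss(0.0, 1.0)
        integral += w * dw  # integrand evaluated before the increment (adaptedness)
        w += dw
    return integral

rng = random.Random(1)
T, n_steps, n_paths = 1.0, 50, 10000
samples = [ito_integral_of_w(T, n_steps, rng) for _ in range(n_paths)]

mean_I = sum(samples) / n_paths                       # martingale: E I(T) = I(0) = 0
second_moment = sum(x * x for x in samples) / n_paths  # isometry: close to T^2 / 2
print(mean_I, second_moment)
```

Evaluating the integrand at the left endpoint matters: using the right endpoint would produce a different limit and the sample mean would no longer be close to zero.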

Theorem A.1.3 (Integral with respect to an Itô process). Let $X_t$, $t \ge 0$, be an Itô process as in equation (A.5), and let $\Gamma_t$, $t \ge 0$, be an adapted process. Furthermore, assume that $E \int_0^t \Gamma_u^2 \sigma^2(u, X_u)\,du$ and $\int_0^t |\Gamma_u \beta(u, X_u)|\,du$ are finite for every $t > 0$. We define the integral with respect to an Itô process as

\[ \int_0^t \Gamma_u\,dX_u = \int_0^t \Gamma_u \sigma(u, X_u)\,dW_u + \int_0^t \Gamma_u \beta(u, X_u)\,du. \]

Theorem A.1.4 (Quadratic Variation). The quadratic variation of the Itô process (A.5) is

\[ [X, X](t) = \int_0^t \sigma^2(u, X_u)\,du. \]

Theorem A.1.5 (Itô-Doeblin formula for an Itô process). Let $X_t$, $t \ge 0$, be an Itô process as described in equation (A.5), and let $f(t, x)$ be a function for which the partial derivatives $f_t(t, x)$, $f_x(t, x)$, and $f_{xx}(t, x)$ are defined and continuous. Then, for every $T > 0$,

\[ f(T, X_T) = f(0, X_0) + \int_0^T f_t(t, X_t)\,dt + \int_0^T f_x(t, X_t)\,dX_t + \frac{1}{2} \int_0^T f_{xx}(t, X_t)\,d[X, X](t). \]
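As a short worked instance of the formula (an illustration added here, not an example taken from the thesis), take $X_t = W_t$, that is $\beta \equiv 0$ and $\sigma \equiv 1$ in equation (A.5), and $f(t, x) = x^2$. Then $f_t = 0$, $f_x = 2x$, $f_{xx} = 2$, and $d[W, W](t) = dt$ by Theorem A.1.4:

```latex
\begin{align*}
W_T^2 &= W_0^2 + \int_0^T 0 \, dt + \int_0^T 2 W_t \, dW_t
         + \frac{1}{2} \int_0^T 2 \, d[W, W](t) \\
      &= 2 \int_0^T W_t \, dW_t + T .
\end{align*}
```

Since the stochastic integral is a martingale starting at zero, taking expectations gives $E[W_T^2] = T$, in agreement with Definition A.1.4. The extra term $T$ is what distinguishes the Itô-Doeblin formula from the ordinary chain rule.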

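Theorem A.1.6 (Girsanov's Theorem). Consider a stochastic differential equation with Lipschitz coefficients, given by equation (A.4), and initial condition $x_0$ under $P$. Let us be given a new drift $\mu(u, x)$, and assume $\frac{\mu(u, x) - \beta(u, x)}{\sigma(u, x)}$ to be bounded. Define the measure $Q$ by

\[ \frac{dQ}{dP}\Big|_{\mathcal{F}_t} = \exp\left\{ -\frac{1}{2} \int_0^t \left( \frac{\mu(u, X_u) - \beta(u, X_u)}{\sigma(u, X_u)} \right)^2 du + \int_0^t \frac{\mu(u, X_u) - \beta(u, X_u)}{\sigma(u, X_u)}\,dW_u \right\}. \]

Then $Q$ is equivalent to $P$. Moreover, the process $\tilde{W}$ defined by

\[ d\tilde{W}(u) = -\frac{\mu(u, X_u) - \beta(u, X_u)}{\sigma(u, X_u)}\,du + dW(u) \]

is a Brownian motion under $Q$, and

\[ dX_u = \mu(u, X_u)\,du + \sigma(u, X_u)\,d\tilde{W}(u), \quad X_0 = x_0. \]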

Theorem A.1.7 (Martingale Representation Theorem, one dimension). Let $W = \{W_t : 0 \le t \le T\}$ be a Brownian motion on a probability space $(\Omega, \mathcal{F}, P)$, and let $\{\mathcal{F}_t\}_{t \in [0,T]}$ be a filtration generated by this Brownian motion. Let $M = \{M_t : 0 \le t \le T\}$ be a martingale with respect to this filtration. Then there is an adapted process $\Gamma = \{\Gamma_t : 0 \le t \le T\}$ such that

\[ M_t = M_0 + \int_0^t \Gamma_u\,dW_u, \quad 0 \le t \le T. \]

Appendix B

Monte Carlo Methods

B.1 Introduction and Motivation

Computing prices of the interest rate assets and derivatives presented in Section 1.4 begins by specifying an interest rate model and estimating the parameters of that model using market data. If analytical pricing formulae are available, then we only need to substitute for the parameters and factor values¹. In general, this type of situation is a rare one², and one must resort to numerical methods in order to approximate the prices in question. In the case of the interest rate derivatives discussed in Section 1.4, the fundamental pricing formulas are conditional expectations under a risk-neutral measure, and hence the computational problem is one of computing integrals. The algorithm that we shall utilize for calculating these integrals is the Monte Carlo integration algorithm. We choose it because we will be working primarily with multi-factor models of the short term interest rate, and the order of convergence of this algorithm is dimension invariant: if the domain of integration is high dimensional, this does not impact the speed of convergence of the algorithm. When the domain of integration is a subset of the real line, the Monte Carlo algorithm hardly competes with other algorithms for numerical integration such as the trapezoidal rule, which is described in Greenspan and Casulli [17]. Pricing American interest rate derivatives by Monte Carlo simulation, which is our goal, is taken up exclusively in Chapter ??. Note that the material presented in this chapter is integral to the chapter on pricing American interest rate derivatives. In this chapter, we shall present the Monte Carlo integration algorithm and its convergence properties.

¹ Since the formulae are conditional expectations, we know that the analytical price functions depend on the factor or factors, and the parameters.

² Some of the factor models presented in this paper do admit analytical pricing formulas.

B.1.1 Monte Carlo Simulation

Monte Carlo simulation is an algorithm which can be used for estimating the expected value of a random variable. The method is based on two important theorems in probability theory, the Strong Law of Large Numbers and the Central Limit Theorem, both stated in Appendix C. This technique also provides an approach to computing deterministic integrals. Following Glasserman [16] and Brandimarte [4], we illustrate this technique with an example on computing the integral of some function on the unit interval.

Example B.1.1. Let $([0,1], \mathcal{B}([0,1]), \lambda)$ be a measure space where $[0,1]$ is the unit interval, $\mathcal{B}([0,1])$ is the Borel sigma algebra on $[0,1]$, and $\lambda$ is Lebesgue measure. Also, let $f$ be a Borel measurable function with domain given by the unit interval and range in the real number system. We are interested in computing the quantity

\[ \alpha = \int_0^1 f\,d\lambda. \tag{B.1} \]

Equation (B.1) can be thought of as the expected value of a function of a random variable $X$ which has the uniform distribution on the unit interval, so that

\[ E[f(X)] = \int_0^1 f\,d\lambda. \]

By the Strong Law of Large Numbers, we can approximate the above expectation by a sample mean of size $n$ given by

\[ \hat{\alpha}_n = \frac{1}{n} \sum_{i=1}^{n} f(X_i), \]

where the $X_i$'s are i.i.d. uniform $(0,1)$ random variables. Furthermore, if

\[ f \in L^2([0,1], \mathcal{B}([0,1]), \lambda), \]

that is, $f$ is square integrable, then we can estimate the reliability of our approximation by considering the variance of $f(X)$, given by

\[ \sigma_f^2 = \int_0^1 (f - \alpha)^2\,d\lambda. \tag{B.2} \]

By the Central Limit Theorem, for large $n$ the error in our approximation, $(\hat{\alpha}_n - \alpha)$, is approximately normally distributed with mean zero and standard deviation given by

\[ \frac{\sigma_f}{\sqrt{n}}. \tag{B.3} \]

To reduce the size of expression (B.3), we may simply increase the sample size $n$: increasing the sample size by a factor of 100 reduces the standard deviation by a factor of 10. The above argument assumes that we know the value of $\sigma_f$, which in general we do not. However, in practice we can approximate $\sigma_f^2$ by the sample variance

\[ s_f^2(n) = \frac{1}{n-1} \sum_{i=1}^{n} (f(X_i) - \hat{\alpha}_n)^2. \tag{B.4} \]

This approach to increasing the accuracy of the estimate $\hat{\alpha}_n$ may require an excessive computational effort. The Monte Carlo estimate described above is just an estimate of the true value, and provided that $n$ is sufficiently large we have

1. $\hat{\alpha}_n$ will be approximately normal, which is due to the Central Limit Theorem;

2. the quantile $t_{n-1,1-p/2}$ from the t-distribution with $n - 1$ degrees of freedom tends to $z_{1-p/2}$.

Then an approximate confidence interval for $\alpha$ at the level $(1 - p)$ is given by

\[ \hat{\alpha}_n \pm z_{1-p/2} \sqrt{\frac{s_f^2(n)}{n}}, \tag{B.5} \]

where $z_{1-p/2}$ is the quantile of the standard normal distribution corresponding to probability $1 - p/2$. One possible criterion for choosing the size of $n$³ is based on controlling the absolute error in such a way that, with probability $(1 - p)$,

\[ |\hat{\alpha}_n - \alpha| < \epsilon, \]

where $\epsilon$ is the largest acceptable tolerance. Notice that the confidence interval is constructed in a similar manner, since with probability approximately $(1 - p)$

\[ |\hat{\alpha}_n - \alpha| < z_{1-p/2} \sqrt{\frac{s_f^2(n)}{n}}. \]

Now, connecting $\epsilon$ to $z_{1-p/2} \sqrt{s_f^2(n)/n}$ requires $n$ to satisfy

\[ z_{1-p/2} \sqrt{\frac{s_f^2(n)}{n}} \le \epsilon. \tag{B.6} \]

Since we cannot estimate $s_f^2(n)$ until we choose $n$, we must first run a suitable number of pilot replications, for example $n = 100$, in order to arrive at an estimate of the sample variance.

³ Another criterion of interest is controlling the relative error, so that $|\hat{\alpha}_n - \alpha| / |\alpha| < \epsilon$ holds with probability $(1 - p)$.
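The estimator, the confidence interval (B.5), and the pilot-run rule (B.6) can be sketched together as follows. This is a minimal illustration using only the Python standard library, not the implementation used in the thesis; the integrand $f(x) = e^x$ (whose integral over the unit interval is $e - 1$), the tolerance, the seed, and the helper names `mc_integrate` and `required_sample_size` are all assumptions made for the example.

```python
import math
import random
from statistics import NormalDist

def mc_integrate(f, n, rng):
    """Return the sample mean and sample variance (B.4) of f(X_i), X_i ~ U(0,1)."""
    values = [f(rng.random()) for _ in range(n)]
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var

def required_sample_size(f, eps, p, rng, n_pilot=100):
    """Solve (B.6) for n, using a pilot run to estimate the sample variance."""
    _, pilot_var = mc_integrate(f, n_pilot, rng)
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return max(n_pilot, math.ceil(z * z * pilot_var / eps ** 2))

rng = random.Random(42)
p, eps = 0.05, 0.01
n = required_sample_size(math.exp, eps, p, rng)
alpha_hat, s2 = mc_integrate(math.exp, n, rng)

# Half-width of the approximate (1 - p) confidence interval (B.5).
z = NormalDist().inv_cdf(1.0 - p / 2.0)
half_width = z * math.sqrt(s2 / n)
print(n, alpha_hat, half_width)
```

Because the pilot variance is itself random, the final half-width will only be close to `eps`, not exactly equal to it; a more careful procedure would re-check (B.6) after the main run.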

Monte Carlo simulation hardly competes with known numerical methods such as the Trapezoidal Rule for estimating one-dimensional integrals. For example, the error in the Trapezoidal Rule is $O(n^{-2})$, at least for twice continuously differentiable $f$, whereas the error in the Monte Carlo approach is $O(n^{-1/2})$. The Monte Carlo approach has a considerable advantage over the known numerical methods when the dimension of the domain is larger, and hence a more complicated function for an integrand must be considered. In theory, the price of a derivative security is often represented as a conditional expectation. Estimating the price according to this representation can be a high dimensional problem since it involves sampling from a space of paths of stochastic processes. Since each sample path is represented as a vector, the dimension of the space is at least as large as the number of time steps. The Monte Carlo procedure for estimating derivative prices usually involves the following steps:

1. Simulate sample paths of the underlying state variables in the derivative model (for example, asset prices and interest rates) over the life of the contract, according to the risk neutral probability distributions.

2. For each simulated sample path associated with the contract, evaluate the discounted cash flows.

3. Take the sample average of the discounted cash flows over all sample paths.
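The three steps above can be sketched for a deliberately simple contract. The example below prices a European put on a stock following a risk-neutral geometric Brownian motion with a constant interest rate; this setting is an assumption chosen only because the Black-Scholes formula gives a benchmark, and it is not one of the interest rate models studied in the thesis. All function names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist

def simulate_terminal_price(s0, r, sigma, T, n_steps, rng):
    """Step 1: simulate one risk-neutral GBM path and return its terminal value."""
    h = T / n_steps
    s = s0
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        # Exact lognormal transition, so no discretization error is introduced.
        s *= math.exp((r - 0.5 * sigma ** 2) * h + sigma * math.sqrt(h) * z)
    return s

def mc_european_put(s0, k, r, sigma, T, n_paths, n_steps, rng):
    cash_flows = []
    for _ in range(n_paths):
        s_T = simulate_terminal_price(s0, r, sigma, T, n_steps, rng)
        cash_flows.append(math.exp(-r * T) * max(k - s_T, 0.0))  # Step 2
    price = sum(cash_flows) / n_paths                             # Step 3
    var = sum((c - price) ** 2 for c in cash_flows) / (n_paths - 1)
    return price, math.sqrt(var / n_paths)

def black_scholes_put(s0, k, r, sigma, T):
    """Closed-form benchmark used only to sanity-check the simulation."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    cdf = NormalDist().cdf
    return k * math.exp(-r * T) * cdf(-d2) - s0 * cdf(-d1)

rng = random.Random(7)
price, std_err = mc_european_put(100.0, 100.0, 0.05, 0.2, 1.0, 20000, 4, rng)
benchmark = black_scholes_put(100.0, 100.0, 0.05, 0.2, 1.0)
print(price, std_err, benchmark)
```

For a payoff that depends only on the terminal price a single time step would suffice; several steps are kept here so that the path-generation loop mirrors step 1.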

Reducing the variance of our estimate in step three above is of importance. The next section presents two methods for reducing the variance of the Monte Carlo estimator, namely antithetic sampling and importance sampling.

B.2 Variance Reduction Techniques

Although we can reduce the variance of the Monte Carlo estimator by increasing the number of replications, the standard deviation (B.3) decays only like $n^{-1/2}$, so the rate of improvement in the reliability of our estimate diminishes as the number of replications increases. An alternative way to reduce the variance (B.3) is to work on reducing the size of the numerator by reducing the variance of the samples. The variance reduction techniques that have been proposed include antithetic sampling and importance sampling.

B.2.1 Antithetic Sampling

Antithetic sampling reduces the variance of the Monte Carlo estimate by introducing correlation between sample paths in a specific way when generating the sample. Following Brandimarte [4], suppose we generate pairs of replications

\[ (X_1^{(i)}, X_2^{(i)}), \quad i = 1, 2, \ldots, n, \]

which are "horizontally" independent. That is, $X_j^{(i_1)}$ and $X_k^{(i_2)}$ are independent for all $j$ and $k$ whenever $i_1 \neq i_2$. Given this kind of independence, the pair averages

\[ X^{(i)} = \frac{X_1^{(i)} + X_2^{(i)}}{2} \]

are independent, which implies that the sample mean of these pair averages has variance

\[ \mathrm{Var}(\bar{X}(n)) = \frac{\mathrm{Var}(X_1^{(i)}) + \mathrm{Var}(X_2^{(i)}) + 2\,\mathrm{Cov}(X_1^{(i)}, X_2^{(i)})}{4n} = \frac{\mathrm{Var}(X_1^{(i)})}{2n} \left( 1 + \rho(X_1^{(i)}, X_2^{(i)}) \right), \]

where $\bar{X}(n)$ denotes the sample mean. Choosing $\rho(X_1^{(i)}, X_2^{(i)}) < 0$ reduces the variance of the sample mean, which means choosing negatively correlated replications within each pair. Returning to the situation where we wish to calculate

\[ \alpha = \int_0^1 f\,d\lambda, \]

we may write

\[ \alpha = E[f(U)], \]

where $U$ is a random variable which is uniformly distributed on $[0,1]$. Consider

\[ I = \frac{f(U_1) + f(U_2)}{2} \quad \text{and} \quad A = \frac{f(U) + f(1 - U)}{2}, \]

where $U_1$ and $U_2$ are independent uniformly distributed random variables. Then $I$ is the usual sample based on independent sampling, and $A$ is the paired averaged sample built by antithetic sampling. Then

\[ \mathrm{Var}(I) = \frac{\mathrm{Var}(f(U))}{2} \quad \text{and} \quad \mathrm{Var}(A) = \frac{\mathrm{Var}(f(U))}{2} + \frac{\mathrm{Cov}(f(U), f(1 - U))}{2}, \]

so that if $f(U)$ and $f(1 - U)$ are negatively correlated, then $\mathrm{Var}(A) < \mathrm{Var}(I)$. If the function $f$ is monotonic then this condition is satisfied. Otherwise, if the monotonicity condition is violated, applying antithetic sampling to the Monte Carlo approach to estimating the above integral may result in an increase in the variance of the estimator $\hat{\alpha}_n$. Brandimarte [4] demonstrates this possibility by considering the triangle function, which is given by

\[ f(x) = \begin{cases} 2x & \text{if } x \in [0, \tfrac{1}{2}], \\ 2 - 2x & \text{if } x \in [\tfrac{1}{2}, 1], \\ 0 & \text{otherwise,} \end{cases} \]

where it follows that $f(U) = f(1 - U)$, and this implies that

\[ \mathrm{Cov}(f(U), f(1 - U)) = \mathrm{Cov}(f(U), f(U)) = \mathrm{Var}(f(U)), \]

and hence $\mathrm{Var}(A) > \mathrm{Var}(I)$. Antithetic sampling is thus a method for reducing the variance of a sample provided the monotonicity requirement is met, and it does not exploit any specific information pertaining to the estimation problem.
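Both cases discussed above can be reproduced numerically. The sketch below compares the sample variance of the independent pair average $I$ with that of the antithetic pair average $A$, once for the monotone function $f(x) = e^x$ (an assumption chosen for illustration) and once for the triangle function. The helper name `pair_variances`, the seed, and the sample size are hypothetical.

```python
import math
import random
import statistics

def pair_variances(f, n_pairs, rng):
    """Return (sample variance of I, sample variance of A) over n_pairs pairs."""
    indep, anti = [], []
    for _ in range(n_pairs):
        u1, u2 = rng.random(), rng.random()
        indep.append(0.5 * (f(u1) + f(u2)))        # I: two independent draws
        anti.append(0.5 * (f(u1) + f(1.0 - u1)))   # A: a draw and its antithetic twin
    return statistics.variance(indep), statistics.variance(anti)

def triangle(x):
    """Triangle function on [0, 1]; symmetric about 1/2, hence not monotone."""
    return 2.0 * x if x <= 0.5 else 2.0 - 2.0 * x

rng = random.Random(3)
var_i_exp, var_a_exp = pair_variances(math.exp, 10000, rng)
var_i_tri, var_a_tri = pair_variances(triangle, 10000, rng)
print(var_i_exp, var_a_exp)  # antithetic variance is smaller for the monotone f
print(var_i_tri, var_a_tri)  # antithetic variance is larger for the triangle f
```

For the triangle function the antithetic pair carries no more information than a single draw, which is why its variance is roughly twice that of the independent pair.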

B.2.2 Importance Sampling

Importance sampling is based on changing the underlying probability measure in order to sample events of interest with respect to this new measure. We illustrate with an example. Consider the problem of computing the integral

\[ \theta = E[g(X)] = \int g(x) f(x)\,dx, \]

where $X$ is a random vector with joint density $f(x)$. If we know another density $k$ such that $f$ is absolutely continuous with respect to it, denoted by $f \ll k$, then we may express $\theta$ as

\[ \theta = \int \frac{g(x) f(x)}{k(x)} k(x)\,dx = E_k\left[ \frac{g(X) f(X)}{k(X)} \right], \]

where we integrate over the support of $k$. The ratio

\[ \frac{f(x)}{k(x)} \]

is the Radon-Nikodym derivative⁴, and is the mechanism which allows us to change measure from $f(x)$ to $k(x)$. To see why this is useful in reducing variance, we compare the second moments:

\[ E_f\left[ g^2(X) \right] = \int g^2(x) f(x)\,dx \quad \text{and} \quad E_k\left[ \left( \frac{g(X) f(X)}{k(X)} \right)^2 \right] = \int \left( \frac{g(x) f(x)}{k(x)} \right)^2 k(x)\,dx = \int g^2(x) \frac{f(x)}{k(x)} f(x)\,dx. \tag{B.7} \]

Thus, a judicious choice of density $k(x)$ can lead to a reduction in variance. The ideal importance sampling density, in the special case where $g(x) > 0$, is given by

\[ k(x) = \frac{g(x) f(x)}{\theta}, \tag{B.8} \]

since substitution into expression (B.7) yields

\[ E_k\left[ \left( \frac{g(X) f(X)}{k(X)} \right)^2 \right] = E_k\left[ \theta^2 \right] = \theta^2 \]

and

\[ \mathrm{Var}_k\left[ \frac{g(X) f(X)}{k(X)} \right] = E_k\left[ \left( \frac{g(X) f(X)}{k(X)} \right)^2 \right] - \left( E_k\left[ \frac{g(X) f(X)}{k(X)} \right] \right)^2 = \theta^2 - \theta^2 = 0. \]

The ideal density (B.8) depends on the unknown $\theta$ and is therefore not available, so in practice one must exploit the structure of the problem in order to utilize this variance reduction technique.
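A standard textbook illustration of exploiting the structure of the problem (not an example from the thesis) is the estimation of a small tail probability. Take $f$ to be the standard normal density, $g(x) = 1_{\{x > c\}}$, and choose as sampling density $k$ the normal density with mean $c$ and unit variance, which places about half of the draws in the region of interest. The likelihood ratio is then $f(x)/k(x) = \exp(-cx + c^2/2)$. The function name, the choice $c = 4$, and the sample size below are assumptions made for this sketch.

```python
import math
import random

def tail_prob_importance_sampling(c, n, rng):
    """Estimate P(Z > c) for Z ~ N(0,1) by sampling from k = N(c, 1)."""
    total, total_sq = 0.0, 0.0
    for _ in range(n):
        x = rng.gauss(c, 1.0)                     # draw from the new density k
        if x > c:                                 # g(x) = indicator of {x > c}
            w = math.exp(-c * x + 0.5 * c * c)    # likelihood ratio f(x) / k(x)
            total += w
            total_sq += w * w
    estimate = total / n
    variance = (total_sq / n - estimate ** 2) * n / (n - 1)
    return estimate, math.sqrt(variance / n)

rng = random.Random(11)
c = 4.0
is_estimate, is_std_err = tail_prob_importance_sampling(c, 20000, rng)
exact = 0.5 * math.erfc(c / math.sqrt(2.0))       # closed form, for comparison only
print(is_estimate, is_std_err, exact)
```

With the same number of draws from the original density, the event $\{Z > 4\}$ would typically not be observed at all, so the plain Monte Carlo estimate would usually be zero.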

B.3 Simulating SDEs

The starting point for the application of Monte Carlo simulation in the pricing of derivative securities is the generation of sample paths of the price of the underlying asset. There are two potential sources of error which can arise in generating sample paths: sampling error and discretization error. Sampling error is due to the random nature of the Monte Carlo methods, and it can be mitigated by utilizing the variance reduction techniques introduced in Section B.2.

⁴ Also known as the likelihood ratio.

To understand discretization error, we illustrate how to discretize a continuous time single factor model given by the SDE

\[ dr_t = \beta(t, r_t)\,dt + \sigma(t, r_t)\,dW_t, \quad 0 \le t \le T, \tag{B.9} \]

and initial condition $r_0 = r$. The simplest discretization scheme is the Euler scheme. It entails choosing a time grid of size $n$ given by $0 = t_0 < t_1 < \cdots < t_n = T$ and using the recursion

\[ r_{t_{i+1}} = r_{t_i} + \beta(t_i, r_{t_i})(t_{i+1} - t_i) + \sigma(t_i, r_{t_i}) \sqrt{t_{i+1} - t_i}\, Z_{i+1}, \tag{B.10} \]

where $Z_1, \ldots, Z_n$ are independent draws from the standard normal distribution. For the time grid mentioned above, $dW_t$ is replaced by the increment of a Brownian motion $W_{t_{i+1}} - W_{t_i}$, which was introduced in Definition A.1.4 and is normally distributed with zero mean and variance $(t_{i+1} - t_i)$. We shall consider a time grid with a constant step size $h > 0$, meaning that $t_i = ih$ for all $i$. The recursion in equation (B.10) then becomes

\[ r_{t_{i+1}} = r_{t_i} + \beta(t_i, r_{t_i})\,h + \sigma(t_i, r_{t_i}) \sqrt{h}\, Z_{i+1}. \tag{B.11} \]
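The recursion (B.11) is short enough to write out in full. The sketch below takes the drift and diffusion as arbitrary Python callables; the function name `euler_path`, the mean-reverting drift $a(b - r)$, the constant volatility, and the parameter values are illustrative assumptions and are not the estimated models of the thesis.

```python
import math
import random

def euler_path(beta, sigma, r0, T, n, rng):
    """Euler scheme (B.11) on a uniform grid with n steps; returns r at t_0, ..., t_n."""
    h = T / n
    r = [r0]
    for i in range(n):
        t_i = i * h
        z = rng.gauss(0.0, 1.0)  # Z_{i+1}, an independent standard normal draw
        r.append(r[-1] + beta(t_i, r[-1]) * h
                 + sigma(t_i, r[-1]) * math.sqrt(h) * z)
    return r

# Hypothetical mean-reverting short rate: drift a * (b - r), constant volatility.
a, b, vol, r0 = 0.5, 0.05, 0.02, 0.03
rng = random.Random(5)
path = euler_path(lambda t, r: a * (b - r), lambda t, r: vol, r0, 1.0, 250, rng)

# With the diffusion switched off the scheme reduces to Euler's method for an ODE,
# whose exact solution is b + (r0 - b) * exp(-a * t).
ode_path = euler_path(lambda t, r: a * (b - r), lambda t, r: 0.0, r0, 1.0, 1000, rng)
print(path[-1], ode_path[-1])
```

Passing the coefficients as functions of $(t, r)$ keeps the routine usable for any single factor model of the form (B.9).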

Quantifying the error of approximation is of importance since it allows us to determine the speed of convergence of the discretization. Glasserman [16] mentions that there are two broad categories of error criteria used in measuring the quality of discretization methods: criteria based on the pathwise proximity of a discretized process to a continuous process, and criteria based on the proximity of the corresponding distributions. These are generally termed strong and weak criteria respectively. We shall not delve into the subtle differences between the definitions of strong and weak criteria beyond what is mentioned here, and only state the definition of strong order of convergence in Appendix C. The interested reader is directed to Glasserman [16] for the details. The Euler scheme has a strong order of convergence of $\frac{1}{2}$, which means that if we decrease the step size by a factor of 100, then the bound on the error in approximating the continuous model by the Euler scheme decreases by a factor of 10. We can avoid having a discretization error if we know the transition density of the SDE in equation (B.9). As above, we choose a time grid with a constant time step, but this time we sample according to the probability distribution corresponding to the transition density at each time in the grid. Some of the interest rate models presented in Chapter 2 have known transition densities, and hence can be simulated without discretization error.
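As an illustration of sampling from a known transition density, consider a one-factor Vasicek-type short rate written as $dr_t = a(b - r_t)\,dt + \sigma\,dW_t$. This parametrization and the symbols $a$, $b$ are assumptions made for the sketch and may differ from the notation used in Chapter 2. Its transition law over a step $h$ is Gaussian with mean $b + (r_t - b)e^{-ah}$ and variance $\sigma^2 (1 - e^{-2ah}) / (2a)$, so each step can be drawn exactly.

```python
import math
import random
import statistics

def vasicek_exact_path(a, b, sigma, r0, T, n, rng):
    """Sample r on a uniform grid from the exact Gaussian transition density."""
    h = T / n
    decay = math.exp(-a * h)
    step_std = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * a))
    r = [r0]
    for _ in range(n):
        # Conditional mean plus an exact Gaussian innovation: no discretization error.
        r.append(b + (r[-1] - b) * decay + step_std * rng.gauss(0.0, 1.0))
    return r

a, b, sigma, r0, T = 0.5, 0.05, 0.02, 0.03, 1.0
rng = random.Random(9)
terminal = [vasicek_exact_path(a, b, sigma, r0, T, 4, rng)[-1] for _ in range(20000)]

# Compare the empirical terminal moments with the closed-form moments at time T.
mean_T = statistics.fmean(terminal)
var_T = statistics.variance(terminal)
exact_mean = b + (r0 - b) * math.exp(-a * T)
exact_var = sigma ** 2 * (1.0 - math.exp(-2.0 * a * T)) / (2.0 * a)
print(mean_T, exact_mean, var_T, exact_var)
```

Because the transition is exact, the distribution of the terminal value does not depend on the number of grid points; only four steps are used here, and any remaining discrepancy is pure sampling error.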

Appendix C

Limit Theorems

Theorem C.0.1 (Strong Law of Large Numbers). If $X_1, X_2, \ldots$ are independent and identically distributed and have finite mean, then $\frac{X_1 + \cdots + X_n}{n} \to E[X_1]$ with probability 1.

Theorem C.0.2 (Central Limit Theorem, Lindeberg-Lévy). Let $N$ denote a random variable with the standard normal distribution. Suppose that $\{X_k\}$ is an independent sequence of random variables having the same distribution with mean $c$ and finite variance $\sigma^2$. If $S_n = X_1 + \cdots + X_n$, then

\[ \frac{S_n - nc}{\sigma \sqrt{n}} \xrightarrow{D} N. \]

Definition C.0.1 (Strong Order of Convergence). Let $\hat{X} = \{\hat{X}_0, \hat{X}_h, \hat{X}_{2h}, \ldots\}$ be any discrete time approximation to a continuous-time process $X$. Fix a time $T$ and let $n = \lfloor T/h \rfloor$. We say that a discretization $\hat{X}$ has strong order of convergence $\gamma > 0$ if

\[ E\,|\hat{X}_{nh} - X_T| \le c\,h^{\gamma} \]

for some constant $c$ and all sufficiently small $h$.
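The strong order in Definition C.0.1 can be estimated empirically when the exact solution is available on the same Brownian path. The sketch below does this for geometric Brownian motion, $dX_t = \mu X_t\,dt + \sigma X_t\,dW_t$, whose solution is $X_T = X_0 \exp((\mu - \sigma^2/2)T + \sigma W_T)$; the model, the parameter values, the step sizes, and the function name `strong_error` are assumptions chosen for illustration, and `round` replaces the floor in the definition only to avoid floating point surprises.

```python
import math
import random

def strong_error(h, n_paths, rng, x0=1.0, mu=0.05, sigma=0.4, T=1.0):
    """Monte Carlo estimate of E|X_hat_{nh} - X_T| for the Euler scheme applied to GBM."""
    n = round(T / h)
    total = 0.0
    for _ in range(n_paths):
        x, w = x0, 0.0
        for _ in range(n):
            dw = math.sqrt(h) * rng.gauss(0.0, 1.0)
            x += mu * x * h + sigma * x * dw   # Euler step
            w += dw                            # same Brownian path for the exact solution
        exact = x0 * math.exp((mu - 0.5 * sigma ** 2) * T + sigma * w)
        total += abs(x - exact)
    return total / n_paths

rng = random.Random(13)
steps = [1.0 / 4.0, 1.0 / 16.0, 1.0 / 64.0]
errors = [strong_error(h, 2000, rng) for h in steps]

# Slope of log(error) against log(h) between the coarsest and finest grids;
# a value near 1/2 is consistent with the strong order of the Euler scheme.
slope = math.log(errors[0] / errors[-1]) / math.log(steps[0] / steps[-1])
print(errors, slope)
```

The estimated slope is itself subject to sampling error and to pre-asymptotic effects on the coarsest grid, so it should be read as a rough check rather than a proof of the order.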

Appendix D

Parameter Estimates

D.1 Estimates for the Single, Two and Three Factor Vasicek Models

Parameter      One-Factor   Two-Factor   Three-Factor
$\xi_1$        0.1908       0.5529       0.6553
$\xi_2$                     0.0652       0.0705
$\xi_3$                                  0.0525
$\sigma_1$     0.0132       0.0195       0.0214
$\sigma_2$                  0.0186       0.0189
$\sigma_3$                               0.0163
$\rho_{12}$                 -0.8360      -0.9394
$\rho_{13}$                              0.8753
$\rho_{23}$                              -0.9200
$\mu$          0.0594       0.0728       0.0701
$\theta_1$     0.6483       -0.0849      0.1582
$\theta_2$                  0.0963       0.0961
$\theta_3$                               0.0173

Table D.1: Parameter estimates of the generalized Vasicek model from Babbs and Nowman [2]

D.2 Estimates for the Single, Two and Three Factor CIR Models

Parameter      One-Factor   Two-Factor   Three-Factor
$k_1$          0.07223      0.6402       1.3683
$k_2$                       0.01700      0.08433
$k_3$                                    0.008428
$\sigma_1$     0.07540      0.1281       0.1231
$\sigma_2$                  0.05547      0.1355
$\sigma_3$                               0.0883
$\theta_1$     0.03739      0.03080      0.02979
$\theta_2$                  0.00003265   0.0006553
$\theta_3$                               0.0007228
$\lambda_1$    -0.07892     -0.1744      -0.3229
$\lambda_2$                 -0.04076     -0.04425
$\lambda_3$                              -0.05830

Table D.2: Parameter estimates of the CIR model from Chen and Scott [9]

D.3 Estimates for the Models in the $A_{1r}(3)$ and $A_{2r}(3)$ Families

Parameter            $A_{1r}(3)_{BDFS}$   $A_{1r}(3)_{DS}$   $A_{1r}(3)_{MAX}$
$\mu$                0.602                0.365              0.366
$\nu$                0.0523               0.226              0.228
$\kappa_{rv}$        0 (fixed)            0 (fixed)          0.0348
$\kappa$             2.05                 17.4               18
$\bar{v}$            0.000156             0.015              0.0158
$\bar{\theta}$       0.14                 0.0827             0.0827
$\sigma_{\theta v}$  0 (fixed)            0 (fixed)          0.0212
$\sigma_{rv}$        491                  4.27               4.2
$\sigma_{r\theta}$   0 (fixed)            -3.42              -3.77
$\sigma_{\theta r}$  0 (fixed)            -0.0943            -0.0886
$\zeta^2$            0.000113             0.0002             0.000298
$\alpha_r$           0 (fixed)            0 (fixed)          3.26e-14
$\eta^2$             5.18e-05             0.00782            0.00839
$\beta_\theta$       0 (fixed)            0 (fixed)          7.9e-10

Table D.3: Estimates of the parameters from the models in the $A_1(3)$ class taken from Dai and Singleton [12]. Parameters indicated by "fixed" are restricted to zero.

Parameter            $A_{2r}(3)_{Chen}$   $A_{2r}(3)_{DS}$   $A_{2r}(3)_{MAX}$
$\mu$                1.24                 0.636              0.291
$\kappa_{\theta v}$  0 (fixed)            -33.9              -12.4
$\kappa_{rv}$        0 (fixed)            -35.3              -274
$\kappa_{v\theta}$   0 (fixed)            0 (fixed)          -0.0021
$\nu$                0.0757               0.103              0.0871
$\kappa$             2.19                 2.7                3.54
$\bar{v}$            0.000206             0.000239           0.000315
$\bar{\theta}$       0.416                0.0259             0.0136
$\bar{r}$            0.416                0.0259 (fixed)     0.053
$\sigma_{rv}$        0 (fixed)            -182               -133
$\sigma_{r\theta}$   0 (fixed)            0 (fixed)          -0.0953
$\alpha_r$           0 (fixed)            0 (fixed)          1.12e-09
$\eta^2$             0.000393             0.000119           7.04e-05
$\zeta^2$            0.00253              0.00312            0.00237
$\beta_\theta$       0 (fixed)            0 (fixed)          1.92e-05

Table D.4: Estimates of the parameters from the models in the $A_2(3)$ class taken from Dai and Singleton [12]. Parameters indicated by "fixed" are restricted to zero, except that $\bar{r}$ is constrained to be equal to $\bar{\theta}$ whenever it is "fixed".