Stochastic Calculus for Finance II: Continuous-Time Models Solution of Exercise Problems

Yan Version 1.0.8, last revised on 2015-03-13.

Abstract This is a solution manual for Shreve [14]. If you find any typos/errors or have any comments, please email me at [email protected]. This version skips Exercise 7.1, 7.2, 7.5–7.9.

Contents

1 General Probability Theory 2

2 Information and Conditioning 10

3 Brownian Motion 16

4 Stochastic Calculus 26

5 Risk-Neutral Pricing 44

6 Connections with Partial Differential Equations 54

7 Exotic Options 65

8 American Derivative Securities 67

9 Change of Numéraire 72

10 Term-Structure Models 78

11 Introduction to Jump Processes 94

1 1 General Probability Theory

⋆ Comments: Example 1.1.4 of the textbook illustrates the paradigm of extending probability measure from a σ-algebra consisting of finitely many elements to a σ-algebra consisting of infinitely many elements. This procedure is typical of constructing probability measures and its validity is justified by Carathéodary’s Extension Theorem (see, for example, Durrett[4, page 402]).

I Exercise 1.1. using the properties of Definition 1.1.2 for a probability measure P, show the following. (i) If A ∈ F, B ∈ F, and A ⊂ B, then P(A) ≤ P(B). Proof. P(B) = P((B − A) ∪ A) = P(B − A) + P (A) ≥ P(A). ∈ F { }∞ F P ⊂ (ii) If A and An n=1 is a sequence of sets in with limn→∞ (An) = 0 and A An for every n, then P(A) = 0. (This property was used implicitly in Example 1.1.4 when we argued that the sequence of all heads, and indeed any particular sequence, must have probability zero.)

Proof. According to (i), P(A) ≤ P(An), which implies P(A) ≤ limn→∞ P(An) = 0. So 0 ≤ P(A) ≤ 0. This means P(A) = 0.

I Exercise 1.2. The infinite coin-toss space Ω∞ of Example 1.1.4 is uncountably infinite. In other words, we cannot list all its elements in a sequence. To see that this is impossible, suppose there were such a sequential list of all elements of Ω∞:

(1) (1) (1) (1) (1) ··· ω = ω1 ω2 ω3 ω4 , (2) (2) (2) (2) (2) ··· ω = ω1 ω2 ω3 ω4 , (3) (3) (3) (3) (3) ··· ω = ω1 ω2 ω3 ω4 , . .

(1) An element that does not appear in this list is the sequence whose first component is H is ω1 is T and is (1) (2) (2) T if ω1 is H, whose second component is H if ω2 is T and is T if ω2 is H, whose third component is H (3) (3) if ω3 is T and is T if ω3 is H, etc. Thus, the list does not include every element of Ω∞. Now consider the set of sequences of coin tosses in which the outcome on each even-numbered toss matches the outcome of the toss preceding it, i.e.,

A = {ω = ω1ω2ω3ω4ω5 ··· ; ω1 = ω2, ω3 = ω4, ···}.

(i) Show that A is uncountably infinite.

Proof. We define a mapping ϕ from A to Ω∞ as follows: ϕ(ω1ω2 ··· ) = ω1ω3ω5 ··· . Then ϕ is one-to-one and onto. So the cardinality of A is the same as that of Ω∞, which means in particular that A is uncountably infinite.

(ii) Show that, when 0 < p < 1, we have P(A) = 0.

Proof. Let An = {ω = ω1ω2 ··· : ω1 = ω2, ··· , ω2n−1 = ω2n}. Then An ↓ A as n → ∞. So

2 2 n P(A) = lim P(An) = lim [P(ω1 = ω2) ··· P(ω2n−1 = ω2n)] = lim [p + (1 − p) ] . n→∞ n→∞ n→∞

2 2 2 2 n Since p + (1 − p) ≤ max{p, 1 − p}[p + (1 − p)] < 1 for 0 < p < 1, we have limn→∞(p + (1 − p) ) = 0. This implies P(A) = 0.

2 I Exercise 1.3. Consider the set function P defined for every subset of [0, 1] by the formula that P(A) = 0 if A is a finite set and P(A) = ∞ if A is an infinite set. Show that P satisfies (1.1.3)-(1.1.5), but P does not have the countable additivity property (1.1.2). We see then that the finite additivity property (1.1.5) does not imply the countable additivity property (1.1.2). Proof. Clearly P(∅) = 0. For any A and B, if both of them are finite, then A ∪ B is also finite. So P ∪ P P ∪ P ∪ (A B) = 0 = (A) + (B). If at least one of them is infinite,∑ then A B is also infinite. So (A B) = ∞ P P P ∪N N P = (A) + (B). Similarly, we can prove ( n=1An) = n=1 (An), even if An’s are not disjoint. P { 1 } ∪∞ To see countable additivity property doesn’t hold for , let An = n ∑. Then A = n=1An is an infinite P ∞ P P ̸ ∞ P set and therefore (A) = . However, (An) = 0 for each n. So (A) = n=1 (An).

I Exercise 1.4.

(i) Construct a standard normal random variable Z on the probability space (Ω∞, F∞, P) of Example 1.1.4 1 under the assumption that the probability for head is p = 2 . (Hint: Consider Examples 1.2.5 and 1.2.6) Solution. By Example 1.2.5, we can construct a random variable X on the coin-toss space, which is uniformly ∫ 2 x 1 − ξ distributed on [0, 1]. For the strictly increasing and continuous function N(x) = √ e 2 dξ, we let −∞ 2π Z = N −1(X). Then P(Z ≤ a) = P(X ≤ N(a)) = N(a) for any real number a, i.e. Z is a standard normal random variable on the coin-toss space (Ω∞, F∞, P). { }∞ (ii) Define a sequence of random variables Zn n=1 on Ω∞ such that

lim Zn(ω) = Z(ω) for every ω ∈ Ω∞ n→∞

and, for each n, Zn depends only on the first n coin tosses. (This gives us a procedure for approximating a standard normal random variable by random variables generated by a finite number of coin tosses, a useful algorithm for Monte Carlo simulation).

Solution. Define ∑n 1 X = 1{ }. n 2i ωi=H i=1 −1 Then Xn(ω) → X(ω) for every ω ∈ Ω∞ where X is defined as in Example 1.2.5. So Zn = N (Xn) → −1 Z = N (X) for every ω. Clearly Zn depends only on the first n coin tosses and {Zn}n≥1 is the desired sequence.

I Exercise 1.5. When dealing with double Lebesgue integrals, just as with double Riemann integrals, the order of integration can be reversed. The only assumption required is that the function being integrated be either nonnegative or integrable. Here is an application of this fact. Let X be a nonnegative random variable with cumulative distribution function F (x) = P{X ≤ x}. Show

that ∫ ∞ EX = (1 − F (x))dx 0 by showing that ∫ ∫ ∞

1[0,X(ω))(x)dxdP(ω) ∫ Ω 0 E ∞ − is equal to both X and 0 (1 F (x))dx. Proof. First, by the information given by the problem, we have

∫ ∫ ∞ ∫ ∞ ∫

1[0,X(ω))(x)dxdP(ω) = 1[0,X(ω))(x)dP(ω)dx. Ω 0 0 Ω

3 The left side of this equation equals to ∫ ∫ ∫ X(ω) dxdP (ω) = X(ω)dP (ω) = P[X]. Ω 0 Ω The right side of the equation equals to

∫ ∞ ∫ ∫ ∞ ∫ ∞

1{x

I Exercise 1.6. Let u be a fixed number in R, and define the convex function φ(x) = eux for all x ∈ R. 2 1 Let X be a normal random variable with mean µ = EX and standard deviation σ = [E(X − µ) ] 2 , i.e., with density 2 1 − (x−µ) f(x) = √ e 2σ2 . σ 2π (i) Verify that uX uµ+ 1 u2σ2 Ee = e 2 . Proof.

∫ ∞ [ ] − 2 uX ux 1 − (x µ) E e = e √ e 2σ2 dx ∫−∞ σ 2π ∞ 2 2 1 − (x−µ) −2σ ux = √ e 2σ2 dx ∫−∞ σ 2π ∞ 2 2 2 4 2 1 − [x−(µ+σ u)] −(2σ uµ+σ u ) = √ e 2σ2 dx −∞ σ 2π ∫ ∞ 2 2 − 2 2 uµ+ σ u 1 − [x (µ+σ u)] = e 2 √ e 2σ2 dx −∞ σ 2π 2 2 uµ+ σ u = e 2

(ii) Verify that Jensen’s inequality holds (as it must):

Eφ(X) ≥ φ(EX).

[ ] 2 2 uX uµ+ u σ uµ Proof. E[φ(X)] = E e = e 2 ≥ e = φ(E[X]).

I Exercise 1.7. For each positive integer n, define fn to be the normal density with mean zero and variance n, i.e., 1 − x2 fn(x) = √ e 2n . 2nπ

(i) What is the function f(x) = limn→∞ fn(x)?

1 Solution. Since |f (x)| ≤ √ , f(x) = lim →∞ f (x) = 0. n 2nπ n n ∫ ∞ (ii) What is limn→∞ −∞ fn(x)dx?

4 ∫ ∫ 2 ∞ ∞ 1 − x Solution. By the change of variable formula, f (x)dx = √ e 2 dx = 1. So we must have −∞ n −∞ 2π ∫ ∞ lim fn(x)dx = 1. n→∞ −∞

(iii) Note that ∫ ∞ ∫ ∞ lim fn(x)dx ≠ f(x)dx. n→∞ −∞ −∞ Explain why this does not violate the Monotone Convergence Theorem, Theorem 1.4.5.

Solution. This is not contradictory with the Monotone Convergence Theorem because {fn}n≥1 doesn’t in- crease to 0.

I Exercise 1.8. (Moment-generating function). Let X be a nonnegative random variable, and assume that φ(t) = EetX [ ] tX is finite for every t ∈ R.[ Assume] further that E Xe < ∞ for every t ∈ R. The purpose of this exercise is to show that φ′(t) = E XetX and, in particular, φ′(0) = EX. We recall the definition of derivative: [ ] φ(t) − φ(s) EetX − EesX etX − esX φ′(t) = lim = lim = lim E . s→t t − s s→t t − s s→t t − s { }∞ The limit above is taken over a continuous variable s, but we can choose a sequence of numbers sn n=1 converging to t and compute [ ] etX − esnX lim E , → sn t t − sn where now we are taking a limit of the expectation of the sequence of random varibles

etX − esnX Yn = . t − sn { }∞ If this limit turns out to be the same, regardless[ of how] we choose the sequence sn n=1 that converges to E etX −esX ′ t, then this limit is also the same as lims→t t−s and is φ (t). The Mean Value Theorem from calculus states that if f(t) is a differentiable function, then for any two numbers s and t, there is a number θ between s and t such that

f(t) − f(s) = f ′(θ)(t − s).

If we fix ω ∈ Ω and define f(t) = etX(ω), then this becomes

etX(ω) − esX(ω) = (t − s)X(ω)eθ(ω)X(ω), (1.9.1)

where θ(ω) is a number depending on ω (i.e., a random variable lying between t and s). (i) Use the Dominated Convergence Theorem (Theorem 1.4.9) and equation (1.9.1) to show that [ ] [ ] tX lim EYn = E lim Yn = E Xe . (1.9.2) n→∞ n→∞ [ ] This establishes the desired formula φ′(t) = E XetX .

tX snX e −e θnX θnX max{2|t|,1}X Proof. By (1.9.1), |Yn| = = |Xe | = Xe ≤ Xe . The last inequality is by X ≥ 0 t−sn and the fact that θn is between t and sn, and hence smaller than max{2|t|, 1} for n sufficiently large. So by ′ tX the Dominated Convergence Theorem, φ (t) = limn→∞ E[Yn] = E[limn→∞ Yn] = E[Xe ].

5 (ii)[ Suppose] the random variable X can take both positive and negative[ values] and EetX < ∞ and E |X|etX < ∞ for every t ∈ R. Show that once again φ′(t) = E XetX . (Hint: Use the notation (1.3.1) to write X = X+ − X−.)

tX+ −tX− tX t|X| tX+ Proof. Since E[e 1{X≥0}] + E[e 1{X<0}] = E[e ] < ∞ for every t ∈ R, E[e ] = E[e 1{X≥0}] + −(−t)X− t|X| E[e 1{X<0}] < ∞ for every t ∈ R. Similarly, we have E[|X|e ] < ∞ for every t ∈ R. So, similar to θnX max{2t,1}|X| (i), we have |Yn| = |Xe | ≤ |X|e for n sufficiently large, and by the Dominated Convergence ′ tX Theorem, φ (t) = limn→∞ E[Yn] = E[limn→∞ Yn] = E[Xe ].

I Exercise 1.9. Suppose X is a random variable on some probability space (Ω, F, P), A is a set in F, and for every Borel subset B of R, we have ∫

1B(X(ω))dP(ω) = P(A) · P{X ∈ B}. (1.9.3) A Then we say that X is independent of the event A. Show that if X is independent of an event A, then ∫ g(X(ω))dP(ω) = P(A) · Eg(X) A for every nonnegative, Borel-measurable function g.

Proof. If g(x) is of the form 1B(x), where B is a Borel subset of R, then the desired equality is just (1.9.3). By the linearity∑ of Lebesgue integral, the desired equality also holds for simple functions, i.e. g of the n R form g(x) = i=1 1Bi (x), where each Bi is a Borel subset of . Since any nonnegative, Borel-measurable function g is the limit of an increasing sequence of simple functions, the desired equality can be proved by the Monotone Convergence Theorem.

I Exercise 1.10. Let P be the uniform (Lebesgue) measure on Ω = [0, 1]. Define { 0 if 0 ≤ ω < 1 , Z(ω) = 2 1 ≤ ≤ 2 if 2 ω 1.

For A ∈ B[0, 1], define ∫ Pe(A) = Z(ω)dP(ω). A

(i) Show that Pe is a probability measure. { }∞ Proof. If Ai i=1 is a sequence of disjoint Borel subsets of [0, 1], then by the Monotone Convergence Theorem, Pe ∪∞ ( i=1Ai) equals to ∫ ∫ ∫ ∑n ∫ ∑∞ e 1∪∞ A ZdP = lim 1∪n A ZdP = lim 1∪n A ZdP = lim ZdP = P(Ai). i=1 i n→∞ i=1 i n→∞ i=1 i n→∞ i=1 Ai i=1

Pe P 1 Pe Meanwhile, (Ω) = 2 ([ 2 , 1]) = 1. So is a probability measure. (ii) Show that if P(A) = 0, then Pe(A) = 0. We say that Pe is absolutely continuous with respect to P. ∫ ∫ P Pe P P P ∩ 1 Proof. If (A) = 0, then (A) = Zd = 2 ∩ 1 d = 2 (A [ , 1]) = 0. A A [ 2 ,1] 2 (iii) Show that there is a set A for which Pe(A) = 0 but P(A) > 0. In other words, Pe and P are not equivalent.

1 Proof. Let A = [0, 2 ).

6 I Exercise 1.11. In Example 1.6.6, we began with a standard normal random variable X under a measure P. According to Exercise 1.6, this random variable has the moment-generating function

uX 1 u2 Ee = e 2 for all u ∈ P.

The moment-generating function of a random variable determines its distribution. In particular, any random 1 u2 variable that has moment-generating function e 2 must be standard normal. −θX− 1 θ2 In Example 1.6.6, we also defined Y = X + θ, where θ is a constant, we set Z = e 2 , and we defined Pe by the formula (1.6.9): ∫ Pe(A) = Z(ω)dP(ω) for all A ∈ F. A We showed by considering its cumulative distribution function that Y is a standard normal random variable under Pe. Give another proof that Y is standard normal under Pe by verifying the moment-generating function formula e uY 1 u2 Ee = e 2 for all u ∈ R. Proof. [ ] [ ] [ ] [ ] 2 2 2 − 2 2 e uY uY uX+uθ −θX− θ uθ− θ (u−θ)X uθ− θ (u θ) u E e = E e Z = E e e 2 = e 2 E e = e 2 e 2 = e 2 .

I Exercise 1.12. In Example 1.6.6, we began with a standard normal random variable X on a probability space (Ω, F, P) and defined the random variable Y = X + θ, where θ is a constant. We also defined −θX− 1 θ2 e Z = e 2 and used Z as the Radon-Nikodým derivative to construct the probability measure P by the formula (1.6.9): ∫ Pe(A) = Z(ω)dP(ω), for all A ∈ F. A Under Pe, the random variable Y was shown to be standard normal. We now have a standard normal random variable Y on the probability space (Ω, F, Pe), and X is related to Y by X = Y − θ. By what we have just stated, with X replaced by Y and θ replaced by −θ, we could b θY − 1 θ2 b b define Z = e 2 and then use Z as a Radon-Nikodým derivative to construct a probability measure P by the formula ∫ Pb(A) = Zb(ω)dPe(ω) for all A ∈ F, A Pb b 1 Pb P so that, under , the random variable X is standard normal. Show that Z = Z and = .

2 2 2 ∫ b θY − θ θ(X+θ)− θ θ +θX −1 b b e Proof. First, Z = e 2 = e 2 = e 2 = Z . Second, for any A ∈ F, P(A) = ZdP = ∫ ∫ A b b b (1AZ)ZdP = 1AdP = P(A). So P = P. In particular, X is standard normal under P, since it’s standard normal under P.

I Exercise 1.13 (Change of measure for a normal random variable). A nonrigorous but informative derivation of the formula for the Radon-Nikodým derivative Z(ω) in Example 1.6.6 is provided by this exercise. As in that example, let X be a standard normal random variable on some probability space (Ω, F, P), and let Y = X + θ. Our goal is to define a strictly positive random variable Z(ω) so that when we set ∫ Pe(A) = Z(ω)dPe(ω) for all A ∈ F, (1.9.4) A the random variable Y under Pe is standard normal. If we fix ω ∈ Ω and choose a set A that contains ω and is “small,” then (1.9.4) gives Pe(A) ≈ Z(ω)P(A),

7 where the symbol ≈ means “is approximately equal to.” Dividing by P(A), we see that

Pe(A) ≈ Z(ω) P(A) for “small” sets A containing ω. We use this observation to identify[Z(ω). ] − ϵ ϵ With ω fixed, let x = X(ω). For ϵ > 0, we define B(x, ϵ)[ = x 2 , x +] 2 to be the closed interval − ϵ ϵ centered at x and having length ϵ. Let y = x + θ and B(y, ϵ) = y 2 , y + 2 . (i) Show that { } 1 1 X2(ω) P{X ∈ B(x, ϵ)} ≈ √ exp − . ϵ 2π 2 Proof. ∫ ϵ x+ 2 1 1 2 1 − u2 1 1 − x2 1 − X (¯ω) P(X ∈ B(x, ϵ)) = √ e 2 ≈ √ e 2 · ϵ = √ e 2 . ϵ ϵ − ϵ 2π ϵ 2π 2π x 2

(ii) In order for Y to be a standard normal random variable under Pe, show that we must have { } 1 1 Y 2(ω) Pe{Y ∈ B(y, ϵ)} ≈ √ exp − . ϵ 2π 2 Proof. Similar to (i). (iii) Show that {X ∈ B(x, ϵ)} and {Y ∈ B(y, ϵ)} are the same set, which we call A(ω, ϵ). This set contains ω and is “small” when ϵ > 0 is small. Proof. {X ∈ B(x, ϵ)} = {X ∈ B(y − θ, ϵ)} = {X + θ ∈ B(y, ϵ)} = {Y ∈ B(y, ϵ)}.

(iv) Show that { } Pe(A) 1 ≈ exp −θX(ω) − θ2 . P(A) 2 The right-hand side is the value we obtained for Z(ω) in Example 1.6.6.

eP(A) Proof. By (i)-(iii), P(A) is approximately

Y 2(¯ω) √ϵ − e 2 2 − 2 2− 2 2 2π − Y (¯ω) X (¯ω) − (X(¯ω)+θ) X (¯ω) − − θ 2 2 θX(¯ω) 2 2 = e = e = e . ϵ − X (¯ω) √ e 2 2π

I Exercise 1.14 (Change of measure for an exponential random variable). Let X be a nonnegative random variable defined on a probability space (Ω, F, P) with the exponential distribution, which is

P{X ≤ a} = 1 − e−λa, a ≥ 0, where λ is a positive constant. Let λ˜ be another positive constant, and define

λ˜ ˜ Z = e−(λ−λ)X λ e Define P by ∫ Pe(A) = ZdP for all A ∈ F. A (i) Show that Pe(Ω) = 1.

8 Proof. ∫ e e ∫ ∞ ∫ ∞ λ e λ e e Pe(Ω) = e−(λ−λ)X dP = e−(λ−λ)xλe−λxdx = λee −λxdx = 1. λ λ 0 0

(ii) Compute the cumulative distribution function

Pe{X ≤ a} for a ≥ 0

for the random variable X under the probability measure Pe.

Solution. ∫ ∫ ∫ e a e a λ e λ e e e Pe(X ≤ a) = e−(λ−λ)X dP = e−(λ−λ)xλe−λxdx = λee −λxdx = 1 − e−λa. {X≤a} λ 0 λ 0

I Exercise 1.15 (Provided by Alexander Ng). Let X be a random variable on a probability space (Ω, F, P), and assume X has a density function f(x) that is positive for every x ∈ R. Let g be a strictly increasing, differentiable function satisfying

lim g(y) = −∞, lim g(y) = ∞, y→−∞ y→∞

and define the random variable Y = g(X). ∫ ∞ Let h(y) be an arbitrary nonnegative function satisfying −∞ h(y)dy = 1. We want to change the probability measure so that h(y) is the density function for the random variable Y . To do this, we define

h(g(X))g′(X) Z = . f(X)

(i) Show that Z is nonnegative and EZ = 1. Now define Pe by ∫ Pe(A) = ZdP for all A ∈ F. A Proof. Clearly Z ≥ 0. Furthermore, we have { } ∫ ∫ ∫ h(g(X))g′(X) ∞ h(g(x))g′(x) ∞ ∞ E{Z} = E = f(x)dx = h(g(x))dg(x) = h(u)du = 1. f(X) −∞ f(x) −∞ −∞

(ii) Show that Y has density h under Pe.

Proof.

∫ ∫ −1 ∫ −1 h(g(X))g′(X) g (a) h(g(x))g′(x) g (a) Pe(Y ≤ a) = dP = f(x)dx = h(g(x))dg(x). {g(X)≤a} f(X) −∞ f(x) −∞ ∫ a By the change of variable formula, the last equation above equals to −∞ h(u)du. So Y has density h under Pe.

9 2 Information and Conditioning

I Exercise 2.1. Let (Ω, F, P) be a general probability space, and suppose a random variable X on this space is measurable with respect to the trivial σ-algebra F0 = {∅, Ω}. Show that X is not random (i.e., there is a constant c such that X(ω) = c for all ω ∈ Ω). Such a random variable is called degenerate.

Proof. For any real number a, we have {X ≤ a} ∈ F0 = {∅, Ω}. So P(X ≤ a) is either 0 or 1. Since lima→∞ P(X ≤ a) = 1 and lima→−∞ P(X ≤ a) = 0, we can find a number x0 such that P(X ≤ x0) = 1 and P(X ≤ x) = 0 for any x < x0. So ( ) ( ( )) 1 1 P(X = x0) = lim P x0 − < X ≤ x0 = lim P(X ≤ x0) − P X ≤ x0 − = 1. n→∞ n n→∞ n

Since {X = x0} ∈ {∅, Ω}, we conclude {X = x0} = Ω.

I Exercise 2.2. Independence of random variables can be affected by changes of measure. To illustrate this point, consider the space of two coin tosses Ω2 = {HH,HT,TH,TT }, and let stock prices be given by

S0 = 4,S1(H) = 8,S1(T ) = 2,

S2(HH) = 16,S2(HT ) = S2(TH) = 4,S2(TT ) = 1.

Consider the two probability measures given by 1 1 1 1 Pe(HH) = , Pe(HT ) = , Pe(TH) = , Pe(TT ) = , 4 4 4 4 4 2 2 1 P(HH) = , P(HT ) = , P(TH) = , P(TT ) = . 9 9 9 9 Define the random variable { 1 if S = 4, X = 2 0 if S2 ≠ 4.

(i) List all the sets in σ(X).

Solution. σ(X) = {∅, Ω, {HT,TH}, {TT,HH}}.

(ii) List all the sets in σ(S1).

Solution. σ(S1) = {∅, Ω, {HH,HT }, {TH,TT }}. e (iii) Show that σ(X) and σ(S1) are independent under the probability measure P. Pe { } ∩ { } Pe { } 1 Pe { } Pe { } Pe { } 1 1 1 Proof. ( HT,TH HH,HT ) = ( HT ) = 4 , ( HT,TH ) = ( HT ) + ( TH ) = 4 + 4 = 2 , Pe { } Pe { } Pe { } 1 1 1 and ( HH,HT ) = ( HH ) + ( HT ) = 4 + 4 = 2 . So we have

Pe({HT,TH} ∩ {HH,HT }) = Pe({HT,TH})Pe({HH,HT }). e e e Similarly, we can work on other elements of σ(X) and σ(S1) and show that P(A ∩ B) = P(A)P(B) for any e A ∈ σ(X) and B ∈ σ(S1). So σ(X) and σ(S1) are independent under P.

(iv) Show that σ(X) and σ(S1) are not independent under the probability measure P. P { } ∩ { } P { } 2 P { } 2 2 4 P { } Proof. ( HT,TH HH,HT ) = ( HT ) = 9 , ( HT,TH ) = 9 + 9 = 9 and ( HH,HT ) = 4 2 6 9 + 9 = 9 . So P({HT,TH} ∩ {HH,HT }) ≠ P({HT,TH})P({HH,HT }).

Hence σ(X) and σ(S1) are not independent under P.

10 P P{ } 2 P{ } 1 (v) Under , we have S1 = 8 = 3 and S1 = 2 = 3 . Explain intuitively why, if you are told that X = 1, you would want to revise your estimate of the distribution of S1.

Solution. Because S1 and X are not independent under the probability measure P, knowing the value of X will affect our opinion on the distribution of S1.

I Exercise 2.3 (Rotating the axes). Let X and Y be independent standard normal random variables. Let θ be a constant, and define random variables

V = X cos θ + Y sin θ and W = −X sin θ + Y cos θ.

Show that V and W are independent standard normal random variables. Proof. We note (V,W ) are jointly Gaussian. In order to prove their independence, it suffices to show they are uncorrelated. Indeed,

E[VW ] = E[−X2 sin θ cos θ + XY cos2 θ − XY sin2 θ + Y 2 sin θ cos θ] = − sin θ cos θ + 0 + 0 + sin θ cos θ = 0.

I Exercise 2.4. In Example 2.2.8, X is a standard normal random variable and Z is an independent random variable satisfying 1 P{Z = 1} = P{Z = −1} = . 2 We defined Y = XZ and showed that Y is standard normal. We established that although X and Y are uncorrelated, they are not independent. In this exercise, we use moment-generating functions to show that Y is standard normal and X and Y are not independent. (i) Establish the joint moment-generating function formula

uv −uv uX+vY 1 (u2+v2) e + e Ee = e 2 · . 2 Solution. [ ] [ ] E euX+vY = E euX+vXZ [ ] [ ] = E euX+vXZ |Z = 1 P (Z = 1) + E euX+vXZ |Z = −1 P (Z = −1) 1 [ ] 1 [ ] = E euX+vX + E euX−vX 2 [ 2 ] 1 (u+v)2 (u−v)2 = e 2 + e 2 2 uv −uv u2+v2 e + e = e 2 · . 2

vY 1 v2 (ii) Use the formula above to show that Ee = e 2 . This is the moment-generating function for a standard normal random variable, and thus Y must be a standard normal random variable. Proof. Let u = 0, we are done. (iii) Use the formula in (i) and Theorem 2.2.7(iv) to show that X and Y are not independent.

[ ] 2 [ ] 2 [ ] [ ] [ ] uX u vY v uX+vY uX vY Proof. E e = e 2 and E e = e 2 . So E e ≠ E e E e . Therefore X and Y cannot be independent.

11 I Exercise 2.5. Let (X,Y ) be a pair of random variables with joint density function { { } | | | | 2 2√x +y exp − (2 x +y) if y ≥ −|x|, 2π 2 fX,Y (x, y) = 0 if y < −|x|.

Show that X and Y are standard normal random variables and that they are uncorrelated but not indepen- dent.

Proof. The density fX (x) of X can be obtained by ∫ ∫ ∫ 2 2 2|x| + y − (2|x|+y) ξ − ξ 1 − x2 fX (x) = fX,Y (x, y)dy = √ e 2 dy = √ e 2 dξ = √ e 2 . {y≥−|x|} 2π {ξ≥|x|} 2π 2π

The density fY (y) of Y can be obtained by ∫

fY (y) = fXY (x, y)dx ∫ 2 2|x| + y − (2|x|+y) = 1{|x|≥−y} √ e 2 dx 2π ∫ ∫ ∞ 2 0∧y 2 2x + y − (2x+y) −2x + y − (−2x+y) = √ e 2 dx + √ e 2 dx 0∨(−y) 2π −∞ 2π ∫ ∫ ∞ 2 0∨(−y) 2 2x + y − (2x+y) 2x + y − (2x+y) = √ e 2 dx + √ e 2 d(−x) 0∨(−y) 2π ∞ 2π ∫ ∞ ξ − ξ2 ξ = 2 √ e 2 d( ) |y| 2π 2 1 − y2 = √ e 2 . 2π

So both X and Y are standard normal random variables. Since fX,Y (x, y) ≠ fX (x)fY (y), X and Y are not ∫ 2 ∞ u2 − u independent. However, if we set F (t) = √ e 2 du, we have t 2π ∫ ∞ ∫ ∞ E [XY ] = xyfX,Y (x, y)dxdy ∫−∞ ∫−∞ ∞ ∞ 2 2|x| + y − (2|x|+y) = xy1{y≥−|x|} √ e 2 dxdy ∫−∞ −∞∫ 2π ∞ ∞ 2 2|x| + y − (2|x|+y) = xdx y √ e 2 dy −∞ −|x| 2π ∫ ∞ ∫ ∞ ξ − ξ2 = xdx (ξ − 2|x|)√ e 2 dξ −∞ | | 2π (x ) ∫ ∫ 2 ∞ ∞ 2 − x ξ − ξ2 e 2 = xdx √ e 2 dξ − 2|x| √ −∞ |x| 2π 2π ∫ ∫ ∫ ∫ ∞ ∞ 2 0 ∞ 2 ξ − ξ2 ξ − ξ2 = x √ e 2 dξdx + x √ e 2 dξdx 2π −∞ − 2π ∫0 x ∫ x ∞ 0 = xF (x)dx + xF (−x)dx 0 −∞ = 0.

12 I Exercise 2.6. Consider a probability space Ω with four elements, which we call a, b, c, and d (i.e., Ω = {a, b, c, d}). The σ-algebra F is the collection of all subsets of Ω; i.e., the sets in F are

Ω, {a, b, c}, {a, b, d}, {a, c, d}, {b, c, d}, {a, b}, {a, c}, {a, d}, {b, c}, {b, d}, {c, d}, {a}, {b}, {c}, {d}, ∅.

We define a probability measure P by specifying that 1 1 1 1 P{a} = , P{b} = , P{c} = , P{d} = , 6 3 4 4 and, as usual, the probability of every other set in F is the sum of the probabilities of the elements in the P{ } P{ } P{ } P{ } 3 set, e.g., a, b, c = a + b + c = 4 . We next define two random variables, X and Y , by the formulas

X(a) = 1,X(b) = 1,X(c) = −1,X(d) = −1, Y (a) = 1,Y (b) = −1,Y (c) = 1,Y (d) = −1.

We then define Z = X + Y . (i) List the sets in σ(X). Solution. σ(X) = {∅, Ω, {a, b}, {c, d}}. (ii) Determine E[Y |X] (i.e., specify the values of this random variable for a, b, c, and d). Verify that the partial-averaging property is satisfied. Solution. ∑ E[Y |X] = E[Y |X = α]1{X=α} α∈{1,−1} ∑ E[Y 1{X=α}] = 1{ } P(X = α) X=α α∈{1,−1} 1 · P(Y = 1,X = 1) − 1 · P(Y = −1,X = 1) = 1{ } P(X = 1) X=1 1 · P(Y = 1,X = −1) − 1 · P(Y = −1,X = −1) + 1{ − } P(X = −1) X= 1 1 · P({a}) − 1 · P({b}) 1 · P({c}) − 1 · P({d}) = 1{ } + 1{ − } P({a, b}) X=1 P({c, d}) X= 1 1 = − 1{ }. 3 X=1 To verify the partial-averaging property, we note [ ] 1 [ ] 1 1 E E[Y |X]1{ } = − E 1{ } = − P({a, b}) = − X=1 3 X=1 3 6 and [ ] 1 E Y 1{ } = P(X = Y = 1) − P(Y = −1,X = 1) = P({a}) − P({b}) = − . X=1 6 Similarly, [ ] E E[Y |X]1{X=−1} = 0 and [ ] E Y 1{X=−1} = P(Y = 1,X = −1) − P(Y = −1,X = −1) = P({c}) − P({d}) = 0. Together, we can conclude the partial-averaging property hodls.

13 (iii) Determine E[Z|X]. Again, verify the partial-averaging property.

Solution. 2 E[Z|X] = X + E[Y |X] = 1{ } − 1{ − }. 3 X=1 X= 1 Verification of the partial-averaging property is skipped.

(iv) Compute E[Z|X] − E[Y |X]. Citing the appropriate properties of conditional expectation from Theorem 2.3.2, explain why you get X. Solution. E[Z|X] − E[Y |X] = E[Z − Y |X] = E[X|X] = X, where the first equality is due to linearity (Theorem 2.3.2(i)) and the last equality is due to “taking out what is known” (Theorem 2.3.2(ii)).

I Exercise 2.7. Let Y be an integrable random variable on a probability space (Ω, F, P) and let G be a sub-σ-algebra of F. Based on the information in G, we can form the estimate E[Y |G] of Y and define the error of the estimation Err = Y −E[Y |G]. This is a random variable with expectation zero and some variance Var(Err). Let X be some other G-measurable random variable, which we can regard as another estimate of Y . Show that Var(Err) ≤ Var(Y − X). In other words, the estimate E[Y |G] minimizes the variance of the error among all estimates based on the information in G. (Hint: Let µ = E(Y − X). Compute the variance of Y − X as [ ] E[(Y − X − µ)2] = E ((Y − E[Y |G]) + (E[Y |G] − X − µ))2 .

Multiply out the right-hand side and use iterated conditioning to show the cross-term is zero.) Proof. Let µ = E[Y − X] and ξ = E[Y − X − µ|G]. Note ξ is G-measurable, we have

− E − − 2 Var(Y X) = [({Y X µ) ] } = E [(Y − E[Y |G]) + (E[Y |G] − X − µ)]2 = Var(Err) + 2E [(Y − E[Y |G])ξ] + E[ξ2] = Var(Err) + 2E [Y ξ − E[Y ξ|G]] + E[ξ2] = Var(Err) + E[ξ2] ≥ Var(Err).

I Exercise 2.8. Let X and Y be integrable random variables on a probability space (Ω, F, P). Then Y = Y1 + Y2, where Y1 = E[Y |X] is σ(X)-measurable and Y2 = Y − E[Y |X]. Show that Y2 and X are uncorrelated. More generally, show that Y2 is uncorrelated with every σ(X)-measurable random variable.

Proof. It suffices to prove the more general case. For any σ(X)-measurable random variable ξ, E[Y2ξ] = E[(Y − E{Y |X})ξ] = E[Y ξ − E{Y |X}ξ] = E[Y ξ] − E[Y ξ] = 0.

N Exercise 2.9. Let X be a random variable. (i) Give an example of a probability space (Ω, F, P), a random variable X defined on this probability space, and a function f so that the σ-algebra generated by f(X) is not the trivial σ-algebra {∅, Ω} but is strictly smaller than the σ-algebra generated by X.

14 Solution. Consider the dice-toss space similar to the coin-toss space. Then a typical element ω in this space is an infinite sequence ω1ω2ω3 ··· , with ωi ∈ {1, 2, ··· , 6} (i ∈ N). We define X(ω) = ω1 and f(x) = 1{odd integers}(x). Then it’s easy to see

σ(X) = {∅, Ω, {ω : ω1 = 1}, ··· , {ω : ω1 = 6}}

and σ(f(X)) equals to

{∅, Ω, {ω : ω1 = 1} ∪ {ω : ω1 = 3} ∪ {ω : ω1 = 5}, {ω : ω1 = 2} ∪ {ω : ω1 = 4} ∪ {ω : ω1 = 6}}.

So {∅, Ω} ( σ(f(X)) ( σ(X), and each of these containment is strict. (ii) Can the σ-algebra generated by f(X) ever be strictly larger than the σ-algebra generated by X? Solution. No. σ(f(X)) ⊂ σ(X) is always true.

I Exercise 2.10. Let X and Y be random variable (on some unspecified probability space (Ω, F, P)), assume they have a joint density fX,Y (x, y), and assume E|Y | < ∞. In particular, for every Borel subset C of R2, we have ∫

P{(X,Y ) ∈ C} = fX,Y (x, y)dxdy. C In elementary probability, one learns to compute E[Y |X = x], which is a nonrandom function of the dummy variable x, by the formula ∫ ∞

E[Y |X = x] = yfY |X (y|x)dy, (2.6.1) −∞

where fY |X (y|x) is the conditional density defined by

fX,Y (x, y) fY |X (y|x) = . fX (x) ∫ ∞ The denominator in this expression, fX (x) = −∞ fX,Y (x, η)dη, is the marginal density of X, and we must assume it is strictly positive for every x. We introduce the symbol g(x) for the function E[Y |X = x] defined by (2.6.1); i.e., ∫ ∞ ∫ ∞ yfX,Y (x, y) g(x) = yfY |X (y|x)dy = dy −∞ −∞ fX (x) In measure-theoretic probability, conditional expectation is a random variable E[Y |X]. This exercise is to show that when there is a joint density for (X,Y ), this random variable can be obtained by substituting the random variable X in place of the dummy variable x in the function g(x). In other words, this exercise is to show that E[Y |X] = g(X). (We introduced the symbol g(x) in order to avoid the mathematically confusing expression E[Y |X = X].) Since g(X) is obviously σ(X)-measurable, to verify that E[Y |X] = g(X), we need only check that the partial-averaging property is satisfied. For every Borel-measurable function h mapping R to R and satisfying E|h(X)| < ∞, we have ∫ ∞ Eh(X) = h(x)fX (x)dx. (2.6.2) −∞ This is Theorem 1.5.2 in Chapter 1. Similarly, if h is a function of both x and y, then ∫ ∞ ∫ ∞

Eh(X,Y ) = h(x, y)fX,Y (x, y)dxdy (2.6.3) −∞ −∞

whenever (X,Y ) has a joint density fX,Y (x, y). You may use both (2.6.2) and (2.6.3) in your solution to this problem.

15 Let A be a set in σ(X). By the definition of σ(X), there is a Borel subset B of R such that A = {ω ∈ Ω; X(ω) ∈ B} or, more simply, A = {X ∈ B}. Show the partial-averaging property ∫ ∫ g(X)dP = Y dP. A A Proof. ∫ ∫ ∫ ∫ ∞ yf (x, y) g(X)dP = E[g(X)1 (X)] = g(x)1 (x)f (x)dx = X,Y dy1 (x)f (x)dx B B X f (x) B X A ∫ ∫ −∞ X∫

= y1B(x)fX,Y (x, y)dxdy = E[Y 1B(X)] = E[YIA] = Y dP. A

I Exercise 2.11. (i) Let X be a random variable on a probability space (Ω, F, P), and let W be a nonnegative σ(X)-measurable random variable. Show there exists a function g such that W = g(X). (Hint: Recall that every set in σ(X) is of the form {X ∈ B} for some Borel set B ⊂ R. Suppose first that W is the indicator of such a set, and then use the standard machine.) { } ↑ Proof. We can find a sequence∑ Wn n≥1 of σ(X)-measurable simple functions such that Wn W . Each Wn Kn n n n can be written as the form a 1An , where A ’s belong to σ(X) and are disjoint. So each A can be i=1 i i i ∑ ∑ i n n Kn n Kn n written as {X ∈ B } for some Borel subset B of R, i.e. Wn = a 1{X∈Bn} = a 1Bn (X) = gn(X), ∑ i i i=1 i i i=1 i i Kn n where g (x) = a 1 n (x). Define g = lim sup g , then g is a Borel function. By taking upper limits on n i=1 i Bi n both sides of Wn = gn(X), we get W = g(X). (ii) Let X be a random variable on a probability space (Ω, F, P), and let Y be a nonnegative random variable on this space. We do not assume that X and Y have a joint density. Nonetheless, show there is a function g such that E[Y |X] = g(X).

Proof. Note E[Y |X] is σ(X)-measurable. By (i), we can find a Borel function g such that E[Y |X] = g(X).

3 Brownian Motion

I Exercise 3.1. According to Definition 3.3.3(iii), for 0 ≤ t < u, the Brownian motion increment W (u) − W (t) is independent of the σ-algebra F(t). Use this property and property (i) of that definition to show that, for 0 ≤ t < u1 < u2, the increment W (u2) − W (u1) is also indepdent of Ft.

Proof. We have F(t) ⊂ F(u1) and W (u2)−W (u1) is independent of F(u1). So in particular, W (u2)−W (u1) is independent of F(t).

I Exercise 3.2. Let W (t), t ≥ 0, be a Brownian motion, and let F(t), t ≥ 0, be a filtration for this Brownian motion. Show that W 2(t) − t is a martingale. (Hint: For 0 ≤ s ≤ t, write W 2(t) as (W (t) − W (s))2 + 2W (t)W (s) − W 2(s).)

Proof. E[W 2(t) − W 2(s)|F(s)] = E[(W (t) − W (s))2 + 2W (t)W (s) − 2W 2(s)|F(s)] = t − s + 2W (s)E[W (t) − W (s)|F(s)] = t − s. Simple algebra gives E[W 2(t) − t|F(s)] = W 2(s) − s. So W 2(t) − t is a martingale.

I Exercise 3.3 (Normal kurtosis). The kurtosis of a random variable is defined to be the ratio of its fourth central moment to the square of its variance. For a normal random variable, the kurtosis is 3. This fact was used to obtain (3.4.7). This exercise verifies this fact. Let X be a normal random variable with mean µ, so that X − µ has mean zero. Let the variance of X, which is also the variance of X − µ, be σ2. In (3.2.13), we computed the moment-generating function of

16 u(X−µ) 1 u2σ2 X − µ to be φ(u) = Ee = e 2 , where u is a real variable. Differentiating this function with respect to u, we obtain [ ] ′ u(X−µ) 2 1 σ2u2 φ (u) = E (X − µ)e = σ ue 2 and, in particular, φ′(0) = E(X − µ) = 0. Differentiating again, we obtain [ ] ′′ 2 u(X−µ) 2 4 2 1 σ2u2 φ (u) = E (X − µ) e = (σ + σ u )e 2 and, in particular, φ′′(0) = E[(X − µ)2] = σ2. Differentiate two more times and obtain the normal kurtosis formula E[(X − µ)4] = 3σ4. Solution. (3) 4 1 σ2u2 2 4 2 2 1 σ2u2 1 σ2u2 4 6 3 φ (u) = 2σ ue 2 + (σ + σ u )σ ue 2 = e 2 (3σ u + σ u ), and (4) 2 1 σ2u2 4 6 3 1 σ2u2 4 6 2 1 σ2u2 6 2 8 4 4 φ (u) = σ ue 2 (3σ u + σ u ) + e 2 (3σ + 3σ u ) = e 2 (6σ u + σ u + 3σ ). So E[(X − µ)4] = φ(4)(0) = 3σ4.

I Exercise 3.4 (Other variations of Brownian motion). Theorem 3.4.3 asserts that if T is a positive number and we choose a partition Π with points 0 = t0 < t1 < t2 < ··· < tn = T , then as the number n of partition points approaches infinity and the length of the longest subinterval ||Π|| approaches zero, the sample quadratic variation n∑−1 2 (W (tj+1) − W (tj)) j=0 ∑approaches T for almost every path of the∑ Brownian motion W . In Remark 3.4.5, we further showed that n−1 − − n−1 − 2 j=0 (W (tj+1) W (tj))(tj+1 tj) and j=0 (tj+1 tj) have limit zero. We summarize these facts by the multiplication rules dW (t)dW (t) = dt, dW (t)dt = 0, dtdt = 0. (3.10.1) (i) Show that as the number m of partition points approaches infinity and the length of the longest subinterval approaches zero, the sample first variation n∑−1 |W (tj+1) − W (tj)| j=0 approaches ∞ for almost every path of the Brownian motion W . (Hint: n∑−1 n∑−1 2 (W (tj+1) − W (tj)) ≤ max |W (tk+1) − W (tk)| · |W (tj+1) − W (tj)|.) 0≤k≤n−1 j=0 j=0 Proof. Assume there exists A ∈ F, such that P(A) > 0 and for every ω ∈ A, n∑−1 | − | ∞ lim sup Wtj+1 Wtj (ω) < . n j=0 Then for every ω ∈ A, n∑−1 n∑−1 − 2 ≤ | − | · | − | (Wtj+1 Wtj ) (ω) max Wt Wt (ω) Wtj+1 Wtj (ω) 0≤k≤n−1 k+1 k j=0 j=0 n∑−1 ≤ | − | · | − | max Wtk+1 Wtk (ω) lim sup Wtj+1 Wtj (ω) 0≤k≤n−1 n j=0 → 0,

since by uniform continuity of continuous functions over a closed interval, lim →∞ max ≤ ≤ − |W − ∑ n 0 k n 1 tk+1 | n−1 − 2 Wtk (ω) = 0. This is a contradiction with limn→∞ j=0 (Wtj+1 Wtj ) = T a.s..

17 (ii) Show that as the number n of partition points approaches infinity and the length of the longest subinterval approaches zero, the sample cubic variation

n∑−1 3 |W (tj+1) − W (tj)| j=0 approaches zero for almost every path of the Brownian motion W . Proof. We note by an argument similar to (i),

n∑−1 n∑−1 3 2 |W (tj+1) − W (tj)| ≤ max |W (tk+1) − W (tk)| · (W (tj+1) − W (tj)) → 0 0≤k≤n−1 j=0 j=0

as n → ∞.

I Exercise 3.5 (Black-Scholes-Merton formula). Let the interest rate r and the volatility σ > 0 be constant. Let r− 1 σ2 t+σW (t) S(t) = S(0)e( 2 ) be a geometric Brownian motion with mean rate of return r, where the initial stock price S(0) is positive. Let K be a positive constant. Show that, for T > 0, [ ] −rT + −rT E e (S(T ) − K) = S(0)N(d+(T,S(0))) − Ke N(d−(T,S(0))), where [ ( ) ] 1 S(0) σ2 d(T,S(0)) = √ log + r  T , σ T K 2 and N is the cumulative standard normal distribution function ∫ ∫ y ∞ 1 − 1 z2 1 − 1 z2 N(y) = √ e 2 dz = √ e 2 dz. 2π −∞ 2π −y Proof. [ ] −rT + E e (ST − K) ∫ ∞ ( ) − x2 −rT (r− 1 σ2)T +σx e 2T = e [ ] S0e 2 − K √ dx 1 ln K −(r− 1 σ2)T 2πT σ S0 2 ∫ 2 ∞ ( √ ) − y −rT (r− 1 σ2)T +σ T y e 2 = e [ ] S0e 2 − K √ dy √1 ln K −(r− 1 σ2)T 2π σ T S0 2 ∫ ∞ ∫ ∞ 2 √ 2 − 1 σ2T 1 − y +σ T y −rT 1 − y = S0e 2 [ ] √ e 2 dy − Ke [ ] √ e 2 dy √1 ln K −(r− 1 σ2)T 2π √1 ln K −(r− 1 σ2)T 2π σ T S0 2 σ T S0 2 ∫ ∞ ( ( )) 2 1 − ξ −rT 1 S0 1 2 = S0 [ ] √ √ e 2 dξ − Ke N √ ln + (r − σ )T √1 ln K −(r− 1 σ2)T −σ T 2π σ T K 2 σ T S0 2 −rT −rT = Ke N(d+(T,S0)) − Ke N(d−(T,S0)).

I Exercise 3.6. Let W (t) be a Brownian motion and let F(t), t ≥ 0, be an associated filtration. (i) For µ ∈ R, consider the Brownian motion with drift µ:

X(t) = µt + W (t).

18 Show that for any Borel-measurable function f(y), and for any 0 ≤ s < t, the function ∫ { } 1 ∞ (y − x − µ(t − s))2 g(x) = √ f(y) exp − dy 2π(t − s) −∞ 2(t − s) satisfies∫ E[f(X(t))|F(s)] = g(X(s)), and hence X has the Markov property. We may rewrite g(x) as ∞ − g(x) = −∞ f(y)p(τ, x, y)dy, where τ = t s and { } 1 (y − x − µτ)2 p(τ, x, y) = √ exp − 2πτ 2τ is the transition density for Brownian motion with drift µ. Proof. E |F E − |F | [f(Xt) t] = [f(Wt Ws + a) s] a=Ws+µt E | = [f(Wt−s + a)] a=Ws+µt ∫ ∞ − x2 e 2(t−s) = f(x + Ws + µt)√ dx −∞ 2π(t − s) 2 ∫ ∞ − (y−Ws−µs−µ(t−s)) e 2(t−s) = f(y) √ dy −∞ 2π(t − s)

= g(Xs).

∫ − − 2 ∞ 1 − (y x µτ) So E[f(X )|F ] = f(y)p(t − s, X , y)dy with p(τ, x, y) = √ e 2τ . t s −∞ s 2πτ (ii) For ν ∈ R and σ > 0, consider the geometric Brownian motion S(t) = S(0)eσW (t)+νt. Set τ = t − s and { } ( )2 1 log y − ντ p(τ, x, y) = √ exp − x . σy 2πτ 2σ2τ Show that for any Borel-measurable function f(y) and for any 0 ≤ s < t the function ∫ ∞ g(x) = f(y)p(τ, x, y)dy1 0 satisfies E[f(S(t))|F(s)] = g(S(s)) and hence S has the Markov property and p(τ, x, y) is its transition density.

σXt v Proof. E[f(St)|Fs] = E[f(S0e )|Fs] with µ = . So by (i), ∫ σ ∞ 2 − [y−Xs−µ(t−s)] σy 1 − E[f(St)|Fs] = f(S0e )√ e 2(t s) dy −∞ 2π(t − s)

∫ S 2 ∞ 1 ln z − 1 ln s −µ(t−s) σy [ σ S σ S ] S0e =z 1 − 0 0 dz = f(z)√ e 2 0 2π(t − s) σz 2 [ln z −v(t−s)] ∫ ∞ − Ss e 2σ2(t−s) = f(z) √ dz 0 σz 2π(t − s) ∫ ∞

= f(z)p(t − s, Ss, z)dz 0 = g(Ss).

∫ 1 ∞ The textbook wrote 0 h(y)p(τ, x, y)dy by mistake.

19 I Exercise 3.7. Theorem 3.6.2 provides the Laplace transform of the density of the first passage time for Brownian motion. This problem derives the analogous formula for Brownian motions with drift. Let W be a Brownian motion. Fix m > 0 and µ ∈ R. For 0 ≤ t < ∞, define

X(t) = µt + W (t),

τm = min{t ≥ 0; X(t) = m}.

As usual, we set τm = ∞ if X(t) never reaches the level m. Let σ be a positive number and set { ( ) } 1 Z(t) = exp σX(t) − σµ + σ2 t . 2

(i) Show that Z(t), t ≥ 0, is a martingale. Proof. [ ]

Zt E Fs Z [ s { ( ) }] σ2 = E exp σ(W − W ) + σµ(t − s) − σµ + (t − s) t s 2 { ( ) } ∫ 2 2 ∞ √ − x σ e 2 = exp σµ(t − s) − σµ + (t − s) · eσ t−s·x √ dx 2 −∞ 2π √ { } ∫ − − 2 { } 2 ∞ − (x σ t s) 2 σ e 2 σ = exp − (t − s) · √ dx · exp (t − s) 2 −∞ 2π 2 = 1.

(ii) Use (i) to conclude that [ { ( ) }] 1 E exp σX(t ∧ τ ) − σµ + σ2 (t ∧ τ ) = 1, t ≥ 0. m 2 m E E Proof. By optional stopping theorem, [Zt∧τm ] = [Z0] = 1, that is, [ { ( ) }] σ2 E exp σX ∧ − σµ + t ∧ τ = 1. t τm 2 m

(iii) Now suppose µ ≥ 0. Show that, for σ > 0, [ { ( ) } ] 1 2 E exp σm − σµ + σ τ 1{ ∞} = 1. 2 m τm< ≥ ≤ σm Proof. If µ 0 and σ > 0, Zt∧τm e . By bounded convergence theorem,

E[1{τ <∞}Zτ ] = E[ lim Zt∧τ ] = lim E[Zt∧τ ] = 1, m m t→∞ m t→∞ m [ ] − 1 2 − σ2 { ∞} ≤ σm 2 σ t → → ∞ E σm (σµ+ 2 )τm since on the event τm = , Zt∧τm e 0 as t . Therefore, e 1{τm<∞} = 1. ↓ P ∞ σ2 Let σ 0, by bounded convergence theorem, we have (τm < ) = 1. Let σµ + 2 = α, we get √ − − − 2 E[e ατm ] = e σm = emµ m 2α+µ .

20 (iv) Show that if µ > 0, then Eτm < ∞. Obtain a formula for Eτm. (Hint: Differentiate the formula in (iii) with respect to α.)

−ατm −αx Proof. We note for α > 0, E[τme ] < ∞ since xe is bounded on [0, ∞). So by an argument similar − to Exercise 1.8, E[e ατm ] is differentiable and √ ∂ 2 −m −ατm −ατm mµ−m 2α+µ E[e ] = −E[τme ] = e √ . ∂α 2α + µ2

↓ E m ∞ Let α 0, by monotone increasing theorem, [τm] = µ < for µ > 0. (v) Now suppose µ < 0. Show that, for σ > −2µ, [ { ( ) } ] 1 2 E exp σm − σµ + σ τ 1{ ∞} = 1. 2 m τm<

−2m|µ| 2 Use this fact to show that P{τm < ∞} = e , which is strictly less than one, and to obtain the Laplace transform √ − − 2 Ee ατm = emµ m 2α+µ for all α > 0.

σ2 σm − ∧ ≤ { ∞} ∧ ≤ Proof. By σ > 2µ > 0, we get σµ + 2 > 0. Then Zt τm e and on the event τm = , Zt τm 2 σm−( σ +σµ)t e 2 → 0 as t → ∞. Therefore, [ ] σ2 σm−(σµ+ )τm E e 2 1{τ <∞} = E[ lim Zt∧τ ] = lim E[Zt∧τ ] = 1. m t→∞ m t→∞ m

↓ − P ∞ 2µm −2|µ|m σ2 Let σ 2µ, we then get (τm < ) = e = e < 1. Set α = σµ + 2 > 0 and we have √ − − − − 2 E ατm E ατm σm mµ m 2α+µ [e ] = [e 1{τm<∞}] = e = e .

I Exercise 3.8. This problem presents the convergence of the distribution of stock prices in a sequence of binomial models to the distribution of geometric Brownian motion. In contrast to the analysis of Subsection 3.2.7, here we allow the interest rate to be different from zero. Let σ > 0 and r ≥ 0 be given. For each positive integer n, we consider a binomial model√ taking n steps r σ/ n per unit time. In this√ model, the interest rate per period is n , the up factor is un = e , and the down −σ/ n factor is dn = e . The risk-neutral probabilities are then √ √ r − −σ/ n σ/ n − r − n + 1 e e n 1 pen = √ √ , qen = √ √ . eσ/ n − e−σ/ n eσ/ n − e−σ/ n Let t be an arbitrary positive rational number, and for each positive integer n for which nt is an integer, define ∑nt Mnt,n = Xk,n, k=1 3 where X1,n, ··· , Xnt,n are independent, identically distributed random variables with

e e 4 P{Xk,n = 1} = pen, P{Xk,n = −1} = qen, k = 1, ··· , nt .

2The textbook wrote e−2x|µ| by mistake. 3 The textbook wrote Xn,n by mistake. 4The textbook wrote n by mistake.

21 The stock price at time t in this binomial model, which is the result of nt steps from the initial time, is given by (see (3.2.15) for a similar equation)

1 1 − 2 (nt+Mnt,n) 2 (nt Mnt,n) Sn(t) = S(0)un { dn } { } σ σ = S(0) exp √ (nt + M ) 5 exp − √ (nt − M ) 2 n nt,n 2 n nt,n { } σ = S(0) exp √ M . n nt,n

This problem shows that as n → ∞, the distribution of the sequence of random variables √σ M appearing ( ) n nt,n − 1 2 2 in the exponent above converges to the normal distribution with mean r 2 σ t and variance σ t. There- fore, the limiting distribution of Sn(t) is the same as the distribution of the geometric Brownian motion { − 1 } S(0) exp σW (t) + (r 2 σ)t at time t. √1 (i) Show that the moment-generating function φn(u) of n Mnt,n is given by

[ ( √ ) ( √ )]nt r − −σ/ n r − σ/ n √u n + 1 e − √u n + 1 e φn(u) = e n √ √ − e n √ √ . eσ/ n − e−σ/ n eσ/ n − e−σ/ n

Proof. [ ] ( [ ]) √1 √u nt √u √u e u Mnt,n e X1,n − nt φn(u) = E e n = E e n = (e n pen + e n qen) [ ( ) ( )] − √σ √σ nt r n r n √u + 1 − e − √u − − 1 + e n n n n = e √σ − √σ + e √σ − √σ . e n − e n e n − e n

(ii) We want to compute lim φn(u) = lim φ 1 (u), n→∞ x↓0 x2

√1 where we have made the change of variable x = . To do this, we will compute log φ 1 (u) and then take n x2 the limit as x ↓ 0. Show that [ ] t (rx2 + 1) sinh ux + sinh(σ − u)x 1 log φ (u) = 2 log x2 x sinh σx

ez −e−z ez +e−z (the definitions are sinh z = 2 , cosh z = 2 ), and use the formula sinh(A − B) = sinh A cosh B − cosh A sinh B

to rewrite this as [ ] t (rx2 + 1 − cosh σx) sinh ux 1 log φ (u) = 2 log cosh ux + . x2 x sinh σx Proof. [ ( ) ( )] t − 2 − σx 2 − σx x2 ux rx + 1 e −ux rx + 1 e 1 − φ (u) = e σx −σx e σx −σx . x2 e − e e − e

22 So, [ ] t (rx2 + 1)(eux − e−ux) + e(σ−u)x − e−(σ−u)x 1 log φ (u) = 2 log σx −σx x2 x e − e [ ] t (rx2 + 1) sinh ux + sinh(σ − u)x = log x2 sinh σx [ ] t (rx2 + 1) sinh ux + sinh σx cosh ux − cosh σx sinh ux = log x2 sinh σx [ ] t (rx2 + 1 − cosh σx) sinh ux = log cosh ux + . x2 sinh σx

(iii) Use the Taylor series expansions 1 cosh z = 1 + z2 + O(z4), sinh z = z + O(z3), 2 to show that (rx2 + 1 − cosh σx) sinh ux 1 rux2 1 cosh ux + = 1 + u2x2 + − ux2σ + O(x4). (3.10.2) sinh σx 2 σ 2 The notation O(xj) is used to represent terms of the order xj. Proof.

(rx2 + 1 − cosh σx) sinh ux cosh ux + sinh σx 2 2 u2x2 (rx2 + 1 − 1 − σ x + O(x4))(ux + O(x3)) = 1 + + O(x4) + 2 2 σx + O(x3) 2 u2x2 (r − σ )ux3 + O(x5) = 1 + + 2 + O(x4) 2 σx + O(x3) 2 u2x2 (r − σ )ux3(1 + O(x2)) = 1 + + 2 + O(x4) 2 σx(1 + O(x2)) u2x2 rux2 1 = 1 + + − σux2 + O(x4). 2 σ 2

2 (iv) Use the Taylor series expansion log(1 + x) = x + O(x ) to compute limx↓0 log φ 1 (u). Now explain how ( x)2 √σ − 1 2 2 you know that the limiting distribution for n Mnt,n is normal with mean r 2 σ t and variance σ t. Solution. ( ) ( ) 2 2 2 2 2 2 t u x ru 2 σux 4 t u x ru 2 σux 4 1 − − log φ = 2 log 1 + + x + O(x ) = 2 + x + O(x ) . x2 x 2 σ 2 x 2 σ 2 [ ] 2 √1 u ru σu e u Mnt,n 1 2 r σ So limx↓0 log φ 1 (u) = t( + − ), and E e n = φn(u) → tu + t( − )u. By the one-to-one x2 2 σ 2 2 σ 2 6 √1 correspondence between distribution and moment generating function , ( n Mnt,n)n converges to a Gaussian r − σ √σ random variable with mean t( σ 2 ) and variance t. Hence ( n Mnt,n)n converges to a Gaussian random − σ2 2 variable with mean t(r 2 ) and variance σ t. 6This correspondence holds only under certain conditions, which normal distribution certainly satisfies. For details, see Shiryaev [12, page 293-296].

23 I Exercise 3.9 (Laplace transform of first passage density). The solution to this problem is and technical. It is included for the sake of completeness, but the reader may safely skip it. Let m > 0 be given, and define { } m m2 f(t, m) = √ exp − . t 2πt 2t According to (3.7.3) in Theorem 3.7.1, f(t, m) is the density in the variable t of the first passage time τm = min{t ≥ 0; W (t) = m}, where W is a Brownian motion without drift. Let ∫ ∞ g(α, m) = e−αtf(t, m)dt, α > 0, 0 √ be the Laplace transform of the density f(t, m). This problem verifies that g(α, m) = e−m 2α, which is the formula derived in Theorem 3.6.2. (i) For k ≥ 1, define ∫ { } ∞ 2 1 −k/2 m αk(m) = √ t exp −αt − dt, 2π 0 2t so g(α, m) = ma3(m). Show that 2 gm(α, m) = a3(m) − m a5(m), 3 gmm(α, m) = −3ma5(m) + m a7(m). Proof. We note ∫ { } ∞ 2 ∂ 1 −k/2 ∂ m ak(m) = √ t exp −αt − dt ∂m 2π 0 ∂m 2t ∫ ∞ { } ( ) 1 m2 m = √ t−k/2 exp −αt − − dt 2π 0 2t t = −mak+2(m). So ∂ g (α, m) = [ (m)] = a (m) − m2a (m), m ∂m 3 3 5 and 3 3 gmm(α, m) = −ma5(m) − 2ma5(m) + m a7(m) = −3ma5(m) + m a7(m).

(ii) Use integration by parts to show that 2α m2 a (m) = − a (m) + a (m). 5 3 3 3 7 Proof. For k > 2 and α > 0, we have ∫ { } 1 ∞ m2 a (m) = √ t−k/2 exp −αt − dt k 2t 2π 0 ∫ { } 2 1 ∞ m2 = − · √ exp −αt − dt−(k−2)/2 k − 2 2t 2π [0 { } 2 ∞ 2 1 m −(k−2)/2 = − · √ exp −αt − t k − 2 2π 2t ∫ { }( 0 ) ] ∞ m2 m2 − t−(k−2)/2 exp −αt − −α + dt 2t 2t2 0 [ ∫ { } ∫ { } ] ∞ 2 2 ∞ 2 − 2 · √1 −(k−2)/2 − − m − m −(k+2)/2 − − m = − α t exp αt dt t exp αt dt k 2 2π 0 2t 2 0 2t 2α m2 = − a − (m) + a (m). k − 2 k 2 k − 2 k+2

24 − 2α m2 Plug in k = 5, we obtain a5(m) = 3 a3(m) + 3 a7(m). (iii) Use (i) and (ii) to show that g satisfies the second-order ordinary differential equation

gmm(α, m) = 2αg(α, m).

Proof. We note

3 3 3 gmm(α, m) = −3ma5(m) + m a7(m) = 2αma3(m) − m a7(m) + m a7(m) = 2αma3(m) = 2αg(α, m).

(iv) The general solution to a second-order ordinary differential equation of the form

ay′′(m) + by′(m) + cy(m) = 0

is λ1m λ2m y(m) = A1e + A2e ,

where λ1 and λ2 are roots of the characteristic equation

aλ2 + bλ + c = 0.

Here we are assuming that these roots are distinct. Find the general solution of the equation in (iii) when α > 0. This solution has two undetermined parameters A1 and A2, and these may depend on α.

Solution. The characteristic equation of gmm(α, m) = 2αg(α, m) is

λ2 − 2α = 0. √ √ So λ1 = 2α and λ2 = − 2α, and the general solution of the equation in (iii) is √ √ 2αm − 2αm A1e + A2e .

(v) Derive the bound ∫ √ { } ∫ m m m m2 1 ∞ g(α, m) ≤ √ t−3/2 exp − dt + √ e−αtdt 2π 0 t 2t 2πm m and use it to show that, for every α > 0,

lim g(α, m) = 0. m→∞ Use this fact to determine one of the parameters in the general solution to the equation in (iii). Solution. We have

g(α, m) = ma (m) 3 ∫ { } m ∞ m2 = √ t−3/2 exp −αt − dt 2t 2π (0∫ { } ∫ { } ) m m m2 ∞ m2 = √ t−3/2 exp −αt − dt + t−3/2 exp −αt − dt 2t 2t 2π (∫0 { } ∫m ) m m m2 ∞ ≤ √ t−3/2 exp −αt − dt + m−3/2 exp {−αt} dt 2t 2π ∫ 0 { } m ∫ m m m2 1 ∞ = √ e−αtt−3/2 exp − dt + √ e−αtdt. 2π 0 2t 2πm m

25 √ −αt −αm m Over the interval (0, m], e is monotone√ decreasing with a range of [e , 1) and t is monotone ∞ −αt m decreasing with a range of [1, ). so e < t over the interval (0, m]. This gives the desired bound ∫ √ { } ∫ m m m m2 1 ∞ g(α, m) ≤ √ t−3/2 exp − dt + √ e−αtdt. 2π 0 t 2t 2πm m ∫ 1 ∞ −αt Now, let m → ∞. Obviously lim →∞ √ e dt = 0, while the first term through the change of m 2πm m − 1 variable t = u satisfies ∫ √ { } ∫ { } m m m m2 m3/2 m m2 √ t−3/2 exp − dt = √ t−2 exp − dt 2π 0 t 2t 2π 0 2t ∫ − 1 3/2 m 2 m m u = √ e 2 du 2π −∞ m3/2 e−m/2 = √ · → 0 2π m2/2 √ → ∞ 2αm as m √ . Combined, we can conclude that for every α > 0, limm→∞ g(α, m) = limm→∞√ (A1e + − 2αm − 2αm A2e ) = 0, which requires A1 = 0 as a necessary condition. Therefore, g(α, m) = A2e .

√ (vi) Using first the change of variable s = t/m2 and then the change of variable y = 1/ s, show that

lim g(α, m) = 1. m↓0

Using this fact to determine the other parameter in the general solution to the equation in (iii). Solution. Following the hint of the problem, we have

∫ ∞ 2 m −3/2 −αt− m g(α, m) = √ t e 2t dt 2π 0 ∫ ∞ 2 s=t/m m 2 −3/2 −αm2s− 1 2 = √ (sm ) e 2s m ds 2π 0 ∫ ∞ 1 −3/2 −αm2s− 1 = √ s e 2s ds 2π 0 √ ∫ ∞ 2 y2 y=1/ s 1 3 −αm 1 − 2 = √ y e y2 2 dy y3 2π ∫0 ∞ 2 2 −αm2 1 − y = √ e y2 2 dy. 2π 0 So by dominated convergence theorem, ∫ ∫ ∞ 2 ∞ 2 −αm2 1 − y 2 − y2 lim g(α, m) = √ lim e y2 2 dy = √ e 2 dy = 1. ↓ ↓ m 0 2π 0 m 0 2π 0 √ − 2αm We√ already proved in (v) that g(α, m) = A2e . So we must have A2 = 1 and hence g(α, m) = e− 2αm.

4 Stochastic Calculus

⋆ Comments:

26 1) To see how we obtain the solution to the Vasicek interest rate model (equation (4.4.33)), we recall the method of integrating factors, a technique often used in solving first-order linear ordinary differential equations (see, for example, Logan [8, page 74]). Starting from (4.4.32) dR(t) = (α − βR(t))dt + σdW (t), we move the term containing R(t) to the left side of the equation and multiple both sides by the integrating factor eβt: eβt[R(t)d(βt) + dR(t)] = eβt[αdt + σdW (t)]. Applying Itô’s formula lets us know the left side of the equation is d(eβtR(t)). Integrating from 0 to t on both sides, we obtain ∫ α t eβtR(t) − R(0) = (eβt − 1) + σ eβsdW (s). β 0 This gives us (4.4.33) ∫ α t R(t) = e−βtR(0) + (1 − e−βt) + σe−βt eβsdW (s). β 0 2) The distribution of CIR interest rate process R(t) for each positive t can be found in reference [41] of the textbook (Cox, J. C., INGERSOLL, J. E., AND ROSS, S. (1985) A theory of the term structure of interest rates, Econometrica 53, 373-384). 3) Regarding to the comment at the beginning of Section 4.5.2, “Black, Scholes, and Merton argued that the value of this call at any time should depend on the time (more precisely, on the time to expiration) and on the value of the stock price at that time”, we note it is justified by risk-neutral pricing and Markov property: e −r(T −t) + e −r(T −t) + C(t) = E[e (ST − K) |F(t)] = E[e (ST − K) |St] = c(t, St). For details, see Chapter 5. 4) For the PDE approach to solving the Black-Scholes-Merton equation, see Wilmott [15] for details.

I Exercise 4.1. Suppose M(t), 0 ≤ t ≤ T , is a martingale with respect to some filtration F(t), 0 ≤ t ≤ T . Let ∆(t), 0 ≤ t ≤ T , be a simple process adapted to F(t) (i.e., there is a partition Π = {t0, t1, ··· , tn} of [0,T ] such that, for every j, ∆(tj) is F(tj)-measurable and ∆(t) is constant in t on each subinterval [tj, tj+1)). For t ∈ [tk, tk+1), define the stochastic integral k∑−1 I(t) = ∆(tj)[M(tj+1) − M(tj)] + ∆(tk)[M(t) − M(tk)]. j=0

We think of M(t) as the price of an asset at time t and ∆(tj) as the number of shares of the asset held by an investor between times tj and tj+1. Then I(t) is the capital gains that accrue to the investor between times 0 and t. Show that I(t), 0 ≤ t ≤ T , is a martingale.

Proof. Fix t and for any s < t, we assume s ∈ [tm, tm+1) for some m. − − − − − E − |F Case 1. m = k. Then I(t) I(s) = ∆tk (Mt Mtk ) ∆tk (Ms Mtk ) = ∆tk (Mt Ms). So [I(t) I(s) t] = E − |F ∆tk [Mt Ms s] = 0. Case 2. m < k. Then tm ≤ s < tm+1 ≤ tk ≤ t < tk+1. So k∑−1 − − − − − I(t) I(s) = ∆tj (Mtj+1 Mtj ) + ∆tk (Ms Mtk ) ∆tm (Ms Mtm ) j=m k∑−1 − − − = ∆tj (Mtj+1 Mtj ) + ∆tk (Mt Mtk ) + ∆tm (Mtm+1 Ms). j=m+1

27 Hence

E[I(t) − I(s)|Fs] k∑−1 E E − |F |F E E − |F |F E − |F = [∆tj [Mtj+1 Mtj tj ] s] + [∆tk [Mt Mtk tk ] s] + ∆tm [Mtm+1 Ms s] j=m+1 = 0. Combined, we can conclude I(t) is a martingale.

I Exercise 4.2. Let W (t), 0 ≤ t ≤ T , be a Brownian motion, and let F(t), 0 ≤ t ≤ T , be an associated filtration. Let ∆(t), 0 ≤ t ≤ T , be a nonrandom simple process (i.e., there is a partition Π = {t0, t1, ··· , tn} of [0,T ] such that for every j, ∆(tj) is a nonrandom quantity and ∆(t) = ∆(tj) is constant in t on the subinterval [tj, tj+1)). For t ∈ [tk, tk+1), define the stochastic integral

k∑−1 I(t) = ∆(tj)[W (tj+1) − W (tj)] + ∆(tk)[W (t) − W (tk)]. j=0

(i) Show that whenever 0 ≤ s < t ≤ T , the increment I(t) − I(s) is independent of F(s). (Simplification: If s is between two partition points, we can always insert s as an extra partition point. Then we can relabel the partition points so that they are still called t0, t1, ··· , tn, but with a larger value of n and now with s = tk for some value of k. Of course, we must set ∆(s) = ∆(tk−1) so that ∆ takes the same value on the interval [s, tk+1) as on the interval [tk−1, s). Similarly, we can insert t as an extra partition point if it is not already one. Consequently, to show that I(t) − I(s) is independent of F(s) for all 0 ≤ s < t ≤ T , it suffices to show that I(tk) − I(tl) is independent of Fl whenever tk and tl are two partition points with tl < tk. This is all you need to do.) − − ∑Proof. We follow the simplification in the hint and consider I(tk) I(tl) with tl < tk. Then I(tk) I(tl) = k−1 − − ⊥ F ⊃ F j=l ∆(tj)[W (tj+1) W (tj)]. Since ∆(t) is a non-random process and W (tj+1) W (tj) (tj) (tl) for j ≥ l, we must have I(tk) − I(tl) ⊥ F(tl). ≤ ≤ − (ii) Show that whenever 0 s∫ < t T , the increment I(t) I(s) is a normally distributed random variable t 2 with mean zero and variance s ∆ (u)du.

Proof. We use the notation in (i) and it is clear that I(tk) − I(tl) is normal since it is a linear combination ∑ − of independent normal random variables. Furthermore, E[I(t ) − I(t )] = k 1 ∆ E[W (t ) − W (t )] = 0 ∑ k ∑ l j=l tj ∫j+1 j k−1 2 k−1 2 tk 2 and Var(I(tk) − I(tl)) = ∆ (tj)Var[W (tj+1) − W (tj)] = ∆ (tj)(tj+1 − tj) = ∆ du. j=l j=l tl u (iii) Use (i) and (ii) to show that I(t), 0 ≤ t ≤ T , is a martingale. Proof. By (i), E[I(t) − I(s)|F(s)] = E[I(t) − I(s)] = 0, for s < t, where the last equality is due to (ii). ∫ 2 − t 2 ≤ ≤ (iv) Show that I (t) 0 ∆ (u)du, 0 t T , is a martingale. Proof. For s < t, [ ∫ ( ∫ ) ] t s E 2 − 2 − 2 − 2 F I (t) ∆udu I (s) ∆udu s 0 0 ∫ [ ] t E − 2 − 2 |F − 2 = (I(t) I(s)) + 2I(t)I(s) 2I (s) s ∆udu ∫ s t E − 2 E − |F − 2 = [(I(t) I(s)) ] + 2I(s) [I(t) I(s) s] ∆udu ∫ ∫ s t t 2 − 2 = ∆udu + 0 ∆udu s s = 0.

28 I Exercise 4.3. We now consider a case in which ∆(t) in Exercise 4.2 is simple but random. In particular, let t0 = 0, t1 = s, and t2 = t, and let ∆(0) be nonrandom and ∆(s) = W (s). Which of the following assertions is true? (i) I(t) − I(s) is independent of F(s). Solution. We first note

I(t) − I(s) = ∆(0)[W (t1) − W (0)] + ∆(t1)[W (t2) − W (t1)] − ∆(0)[W (t1) − W (0)]

= ∆(t1)[W (t2) − W (t1)] = W (s)[W (t) − W (s)].

So I(t) − I(s) is not independent of F(s), since W (s) ∈ F(s). (ii) I(t) − I(s) is normally distributed. (Hint: Check if the fourth moment is three times the square of the variance; see Exercise 3.3 of Chapter 3.) Solution. Following the hint, recall from (i), we know I(t) − I(s) = W (s)[W (t) − W (s)]. So E − 4 E 4 E − 4 2 · − 2 2 − 2 [(I(t) I(s)) ] = [Ws ] [(Wt Ws) ] = 3s 3(t s) = 9s (t s)

and ( ) ( ) E − 2 2 E 2 E − 2 2 2 − 2 3 [(I(t) I(s)) ] = 3 [Ws ] [(Wt Ws) ] = 3s (t s) . ( )2 Since E[(I(t) − I(s))4] ≠ 3 E[(I(t) − I(s))2] , I(t) − I(s) is not normally distributed. (iii) E[I(t)|F(s)] = I(s). Solution. E[I(t) − I(s)|F(s)] = W (s)E[W (t) − W (s)|F(s)] = 0. So E[I(t)|F(s)] = I(s) is true. ∫ ∫ E 2 − t 2 |F 2 − s 2 (iv) [I (t) 0 ∆ (u)du (s)] = I (s) 0 ∆ (u)du. Solution. [ ∫ ( ∫ ) ] t s 2 2 2 2 E I (t) − ∆ (u)du − I (s) − ∆ (u)du F(s) [ 0 0 ∫ ] t 2 2 2 = E (I(t) − I(s)) + 2I(t)I(s) − 2I (s) − ∆ (u)du F(s) [ ∫ s ] t 2 2 = E (I(t) − I(s)) + 2I(s)(I(t) − I(s)) − W (s)1(s,t](u)du F(s) s = E[W 2(s)(W (t) − W (s))2 + 2∆(0)W 2(s)(W (t) − W (s)) − W 2(s)(t − s)|F(s)] = W 2(s)E[(W (t) − W (s))2] + 2∆(0)W 2(s)E[W (t) − W (s)|F(s)] − W 2(s)(t − s) = W 2(s)(t − s) − W 2(s)(t − s) = 0. ∫ ∫ E 2 − t 2 |F 2 − s 2 So [I (t) 0 ∆ (u)du (s)] = I (s) 0 ∆ (u)du is true.

I Exercise 4.4 (Stratonovich integral). Let W (t), t ≥ 0, be a Brownian motion. Let T be a fixed positive number and let Π = {t0, t1, ··· , tn} be a partition of [0,T ] (i.e., 0 = t0 < t1 < ··· < tn = T ). For ∗ tj +tj+1 each j, define tj = 2 to be the midpoint of the interval [tj, tj+1]. (i) Define the half-sample quadratic variation corresponding to Π to be

n∑−1 ∗ − 2 QΠ/2 = (W (tj ) W (tj)) . j=0

29 1 || || → E 1 Show that QΠ/2 has limit 2 T as Π 0. (Hint: It suffices to show QΠ/2 = 2 T and lim||Π||→0 Var(QΠ/2) = 0. Proof. Following the hint, we first note that

− − − n∑1 n∑1 n∑1 t − t T E[Q ] = E[(W (t∗) − W (t ))2] = (t∗ − t ) = j+1 j = . Π/2 j j j j 2 2 j=0 j=0 j=0 ( ) − ∗ − ∗ − tj+1 tj E 2 − 2 Then, by noting W (tj ) W (tj) is equal to W (tj tj) = W 2 in distribution and [(W (t) t) ] = E[W 4(t) − 2tW 2(t) + t2] = 3E[W 2(t)]2 − 2t2 + t2 = 2t2, we have

Var(Q )  Π/2   2 n∑−1  ( )2 T  = E  W (t∗) − W (t ) −   j j 2 j=0    2 n∑−1 n∑−1  ( )2 t − t  = E  W (t∗) − W (t ) − j+1 j   j j 2 j=0 j=0 [( )( )] n∑−1 ( )2 t − t t − t = E W (t∗) − W (t ) − j+1 j (W (t∗) − W (t ))2 − k+1 k j j 2 k k 2 j, k=0 [( )( )] n∑−1 ( ) − − E ∗ − 2 − tj+1 tj ∗ − 2 − tk+1 tk = W (tj ) W (tj) (W (tk) W (tk)) ̸ 2 2 j, k=0,j=k[ ] − ( ) n∑1 t − t 2 + E (W (t∗) − W (t ))2 − j+1 j j j 2 j=0 [ ] − ( ) n∑1 t − t 2 = 0 + E W 2(t∗ − t ) − j+1 j j j 2 j=0 − ( ) n∑1 t − t 2 = 2 · j+1 j 2 j=0 T ≤ max |tj+1 − tj| → 0. 2 1≤j≤n

→ T L2 P Combined, we can conclude lim||Π||→0 QΠ/2 2 in ( ). (ii) Define the Stratonovich integral of W (t) with respect to W (t) to be ∫ T n∑−1 ◦ ∗ − W (t) dW (t) = lim W (tj )(W (tj+1) W (tj)). (4.10.1) ||Π||→0 0 j=0 ∫ T 1 2 − 1 In contrast to the Itô integral W (t)dW (t) = 2 W (T ) 2 T of (4.3.4), which evaluates the integrand at 0 ∗ the left endpoint of each subinterval [tj, tj+1], here we evaluate the integrand at the midpoint tj . Show that ∫ T 1 W (t) ◦ dW (t) = W 2(T ). 0 2

∫(Hint: Write the approximating sum in (4.10.1) as the sum of an approximating sum for the Itô integral T 0 W (t)dW (t) and QΠ/2. The approximating sum for the Itô integral is the one corresponding to the ∗ ∗ ··· ∗ partition 0 = t0 < t0 < t1 < t1 < < tn−1 < tn = T , not the partition Π.)

30 Proof. Following the hint and using the result in (i), we have

n∑−1 ∗ − W (tj )(W (tj+1) W (tj)) j=0 n∑−1 { } ∗ − ∗ ∗ − = W (tj ) [W (tj+1) W (tj )] + [W (tj ) W (tj)] j=0 n∑−1 { } n∑−1 ∗ − ∗ ∗ − ∗ − 2 = W (tj )[W (tj+1) W (tj )] + W (tj)[W (tj ) W (tj)] + [W (tj ) W (tj)] . j=0 j=0

By the construction of Itô integral, ∫ n∑−1 { } T ∗ − ∗ ∗ − lim W (tj )[W (tj+1) W (tj )] + W (tj)[W (tj ) W (tj)] = W (t)dW (t), ||Π∗||→0 j=0 0 ∗ ∗ ∗ ··· ∗ where Π is the partition 0 = t0 < t0 < t1 < t1 < < tn−1 < tn = T . (i) already shows

n∑−1 ∗ − 2 T T lim [W (tj ) W (tj)] = = . ||Π|| 2 2 j=0

Combined, we conclude ∫ ∫ T n∑−1 T ◦ ∗ − T 1 2 W (t) dW (t) = lim W (tj )(W (tj+1) W (tj)) = W (t)dW (t) + = W (T ). ||Π||→0 2 2 0 j=0 0

I Exercise 4.5 (Solving the generalized geometric Brownian motion equation). Let S(t) be a positive stochastic process that satisfies the generalized geometric Brownian motion differential equation (see Example 4.4.8) dS(t) = α(t)S(t)dt + σ(t)S(t)dW (t), (4.10.2) where α(t) and σ(t) are processes adapted to the filtration F(t), t ≥ 0, associated with the Brownian motion W (t), t ≥ 0. In this exercise, we show that S(t) must be given by formula (4.4.26) (i.e., that formula provides the only solution to the stochastic differential equation (4.10.2)). In the process, we provide a method for solving this equation. (i) Using (4.10.2) and the Itô-Doeblin formula, compute d log S(t). Simplify so that you have a formula for d log S(t) that does not involve S(t). Solution. ( ) ⟨ ⟩ − ⟨ ⟩ − 2 2 dSt − 1 d S t 2StdSt d S t 2St(αtStdt + σtStdWt) σt St dt − 1 2 d log St = 2 = 2 = 2 = σtdWt + αt σt dt. St 2 St 2St 2St 2

(ii) Integrate the formula you obtained in (i), and then exponentiate the answer to obtain (4.4.26).

Solution. ∫ ∫ ( ) t t − 1 2 log St = log S0 + σsdWs + αs σs ds. 0 0 2 {∫ ∫ ( ) } t t − 1 2 So St = S0 exp 0 σsdWs + 0 αs 2 σs ds .

31 { ( ) } I − 1 2 Exercise 4.6. Let S(t) = S(0) exp σW (t) + α 2 σ t be a geometric Brownian motion. Let p be a positive constant. Compute d(Sp(t)), the differential of S(t) raised to the power p. Solution. Without loss of generality, we assume p ≠ 1. Since (xp)′ = pxp−1, (xp)′′ = p(p − 1)xp−2, we have

− 1 − d(Sp) = pSp 1dS + p(p − 1)Sp 2d⟨S⟩ t t t 2 t t p−1 1 − p−2 2 2 = pSt (αStdt + σStdWt) + p(p 1)St σ St dt [ 2 ] 1 = Sp pαdt + pσdW + p(p − 1)σ2dt t t 2 [ ( ) ] p − 1 = pSp σdW + α + σ2 dt . t t 2

I Exercise 4.7. (i) Compute dW 4(t) and then write W 4(T ) as the sum of an ordinary (Lebesgue) integral with respect to time and an Itô integral. ∫ ∫ 4 3 1 · · 2 ⟨ ⟩ 3 2 4 T 3 T 2 Solution. dWt = 4Wt dWt + 2 4 3Wt d W t = 4Wt dWt + 6Wt dt. So WT = 4 0 Wt dWt + 6 0 Wt dt. (ii) Take expectations on both sides of the formula you obtained in (i), use the fact that EW 2(t) = t, and derive the formula EW 4(T ) = 3T 2. [∫ ] ∫ ∫ E 4 E T 3 T E 2 T 2 Solution. [WT ] = 4 0 Wt dWt + 6 0 [Wt ]dt = 6 0 tdt = 3T .

(iii) Use the method of (i) and (ii) to derive a formula for EW 6(T ). ∫ ∫ ∫ 6 5 1 · · 4 6 T 5 T 4 E 6 T 2 Solution. dWt = 6Wt dWt+ 2 6 5Wt dt. So WT = 6 0 Wt dWt+15 0 Wt dt. Hence [WT ] = 15 0 3t dt = 15T 3.

I Exercise 4.8 (Solving the Vasicek equation). The Vasicek interest rate stochastic differential equation (4.4.32) is dR(t) = (α − βR(t))dt + σdW (t), where α, β, and σ are positive constants. The solution to this equation is given in Example 4.4.10. This exercise shows how to derive this solution. (i) Use (4.4.32) and the Itô-Doeblin formula to compute d(eβtR(t)). Simplify it so that you have a formula for d(eβtR(t)) that does not involve R(t).

βt βt βt βt β Solution. d(e Rt) = βe Rtdt + e dRt = e [βRtdt + (α − βR(t))dt + σdW (t)] = e (αdt + σdW (t)). (ii) Integrate the equation you obtain in (i) and solve for R(t) to obtain (4.4.33).

Solution. ∫ ∫ t t βt βs α βt βs e Rt = R0 + e (αds + σdWs) = R0 + (e − 1) + σ e dWs. 0 β 0 ∫ −βt α − −βt t −β(t−s) Therefore Rt = R0e + β (1 e ) + σ 0 e dWs.

32 I Exercise 4.9. For a European call expiring at time T with strike price K, the Black-Scholes-Merton price at time t, if the time-t stock price is x, is

−r(T −t) c(t, x) = xN(d+(T − t, x)) − Ke N(d−(T − t, x)),

where [ ( ) ] 1 x 1 d (τ, x) = √ log + r + σ2 τ + σ τ K 2 √ d−(τ, x) = d+(τ, x) − σ τ,

and N(y) is the cumulative standard normal distribution ∫ ∫ y ∞ 1 − z2 1 − z2 N(y) = √ e 2 dz = √ e 2 dz. 2π −∞ 2π −y The purpose of this exercise is to show that the function c satisfies the Black-Scholes-Merton partial differ- ential equation 1 c (t, x) + rxc (t, x) + σ2x2c (t, x) = rc(t, x), 0 ≤ t < T, x > 0, (4.10.3) t x 2 xx the terminal condition lim c(t, x) = (x − K)+, x > 0, x ≠ K, (4.10.4) t↑T and the boundary conditions

lim c(t, x) = 0, lim [c(t, x) − (x − e−r(T −t)K)] = 0, 0 ≤ t < T. (4.10.5) x↓0 x→∞

Equation (4.10.4) and the first part of (4.10.5) are usually written more simply but less precisely as

c(T, x) = (x − K)+, x ≥ 0 and c(t, 0) = 0, 0 ≤ t ≤ T.

For this exercise, we abbreviate c(t, x) as simply c and d(T − t, x) as simply d. (i) Verify first the equation −r(T −t) ′ ′ Ke N (d−) = xN (d+). (4.10.6) Proof.

2 − d− −r(T −t) ′ −r(T −t) e 2 Ke N (d−) = Ke √ 2π √ − − 2 − (d+ σ T t) e 2 = Ke−r(T −t) √ 2π √ σ2(T −t) −r(T −t) σ T −td+ − ′ = Ke e e 2 N (d+) 2 2 − −r(T −t) x (r+ σ )(T −t) − σ (T t) ′ = Ke e 2 e 2 N (d ) K + ′ = xN (d+).

(ii) Show that cx = N(d+). This is the delta of the option. (Be careful! Remember that d+ is a function of x.)

33 Proof. By equation (4.10.6) from (i),

′ ∂ −r(T −t) ′ ∂ c = N(d ) + xN (d ) d (T − t, x) − Ke N (d−) d−(T − t, x) x + + ∂x + ∂x ∂ ∂ = N(d ) + xN ′(d ) d′ (T − t, x) − xN ′(d ) d (T − t, x) + + ∂x + + ∂x + = N(d+).

(iii) Show that −r(T −t) σx ′ ct = −rKe N(d−) − √ N (d+). 2 T − t This is the theta of the option.

Proof.

′ ∂ −r(T −t) −r(T −t) ′ ∂ ct = xN (d+) d+(T − t, x) − rKe N(d−) − Ke N (d−) d−(T − t, x) ∂t [ ∂t ] ′ ∂ −r(T −t) ′ ∂ σ = xN (d+) d+(T − t, x) − rKe N(d−) − xN (d+) d+(T − t, x) + √ ∂t ∂t 2 T − t −r(T −t) σx ′ = −rKe N(d−) − √ N (d+). 2 T − t

(iv) Use the formulas above to show that c satisfies (4.10.3).

Proof. 1 c + rxc + σ2x2c t x 2 xx −r(T −t) σx ′ 1 2 2 ′ ∂ = −rKe N(d−) − √ N (d+) + rxN(d+) + σ x N (d+) d+(T − t, x) 2 T − t 2 ∂x

σx ′ 1 2 2 ′ 1 = rc − √ N (d+) + σ x N (d+) √ 2 T − t 2 σ T − tx = rc.

(v) Show that for x > K, limt↑T d = ∞, but for 0 < x < K, limt↑T d = −∞. Use these equalities to derive the terminal condition (4.10.4).

Proof. For x > K, d (T − t, x) > 0 and lim ↑ d (T − t, x) = lim ↓ d (τ, x) = ∞. lim ↑ d−(T − t, x) = + ( t T +√ √ ) τ 0 + t T 1 x 1 1 2 ↓ − ↓ √ − ∞ ↑  −∞ ∈ limτ 0 d (τ, x) = limτ 0 σ τ ln K + σ (r + 2 σ ) τ σ τ = . Similarly, limt T d = for x (0,K). Also it’s clear that limt↑T d = 0 for x = K. So { − x K, if x > K + lim c(t, x) = xN(lim d+) − KN(lim d−) = = (x − K) . t↑T t↑T t↑T 0, if x ≤ K

(vi) Show that for 0 ≤ t < T , limx↓0 d = −∞. Use this fact to verify the first part of boundary condition (4.10.5) as x ↓ 0.

34 Proof. We note [ ( ) ] 1 x 1 2 lim d+ = √ lim log + r + σ τ = −∞ x↓0 σ τ x↓0 K 2 and √ lim d− = lim d+ − σ τ = −∞. x↓0 x↓0 So for t ∈ [0,T ),

−r(T −t) lim c(t, x) = lim xN(lim d+(T − t, x)) − Ke N(lim d−(T − t, x)) = 0. x↓0 x↓0 x↓0 x↓0

(vii) Show that for 0 ≤ t < T , limx→∞ d = ∞. Use this fact to verify the second part of boundary condition (4.10.5) as x → ∞. In this verification, you will need to show that

N(d ) − 1 lim + = 0. x→∞ x−1

0 This is an indeterminate form 0 , and L’Hôspital’s rule implies that this limit is

d [N(d ) − 1] lim dx + . x→∞ d −1 dx x Work out this expression and use the fact that { ( )} √ 1 x = K exp σ T − td − (T − t) r + σ2 + 2

to write this expression solely in terms of d+ (i.e., without the appearance of any x except the x in the argument of d+(T − t, x)). Then argue that the limit is zero as d+ → ∞.

Proof. For t ∈ [0,T ), it is clear limx→∞ d = ∞. Following the hint, we note

′ ′ √1 ∂ N (d+) N (d+) ∂x d+ σ T −t lim x(N(d+) − 1) = lim = lim . x→∞ x→∞ −x−2 x→∞ −x−1 { √ } − − − 1 2 By the expression of d+, we have x = K exp σ T td+ (T t)(r + 2 σ ) . So

d2 √ − + σ T −td −(T −t)(r+ 1 σ2) ′ −x e 2 −Ke + 2 lim x(N(d+) − 1) = lim N (d+) √ = lim √ √ = 0. →∞ →∞ →∞ x x σ T − t d+ 2π σ T − t Therefore

lim [c(t, x) − (x − e−r(T −t)K)] x→∞ −r(T −t)N(d−) −r(T −t) = lim [xN(d+) − Ke − x + Ke ] x→∞ −r(T −t) = lim [x(N(d+) − 1) + Ke (1 − N(d−))] x→∞ −r(T −t) = lim x(N(d+) − 1) + Ke (1 − N( lim d−)) x→∞ x→∞ = 0.

35 ****************************** UPDATE STOPPED HERE ************************************* I Exercise 4.10 (Self-financing trading). The fundamental idea behind no-arbitrage pricing is to reproduce the payoff of a derivative security by trading in the underlying asset (which we call a stock) and the money market account. In discrete time, we let Xk denote the value of the hedging portfolio at time k and let ∆k denote the number of shares of stock held between times k and k + 1. Then, at time k, after rebalancing (i.e., moving from a position of ∆k−1 to a position ∆k in the stock), the amount in the money market account is Xk − Sk∆k. The value of the portfolio at time k + 1 is

Xk+1 = ∆kSk+1 + (1 + r)(Xk − ∆kSk). (4.10.7)

This formula can be rearranged to become

Xk+1 − Xk = ∆k(Sk+1 − Sk) + r(Xk − ∆kSk), (4.10.8)

which says that the gain between time k and time k + 1 is the sum of the capital gain on the stock holdings, ∆k(Sk+1 −Sk), and the interest earnings on the money market account, r(Xk −∆kSk). The continuous-time analogue of (4.10.8) is dX(t) = ∆(t)dS(t) + r(X(t) − ∆(t)S(t))dt. (4.10.9) Alternatively, one could define the value of a share of the money market account at time k to be

k Mk = (1 + r)

and formulate the discrete-time model with two processes, ∆k as before and Γk denoting the number of shares of the money market account held at time k after rebalancing. Then

Xk = ∆kSk + ΓkMk, (4.10.10)

so that (4.10.7) becomes

Xk+1 = ∆kSk+1 + (1 + r)ΓkMk = ∆kSk+1 + ΓkMk+1. (4.10.11)

Subtracting (4.10.10) from (4.10.11), we obtain in place of (4.10.8) the equation

Xk+1 − Xk = ∆k(Sk+1 − Sk) + Γk(Mk+1 − Mk), (4.10.12)

which says that the gain between time k and time k + 1 is the sum of the capital gain on stock holdings, ∆k(Sk+1 − Sk), and the earnings from the money market investment, Γk(Mk+1 − Mk). But ∆k and Γk cannot be chosen arbitrarily. The agent arrives at time (i)

Proof. We show (4.10.16) + (4.10.9) ⇐⇒ (4.10.16) + (4.10.15), i.e. assuming X has the representation Xt = ∆tSt + ΓtMt, “continuous-time self-financing condition” has two equivalent formulations, (4.10.9) or (4.10.15). Indeed, dXt = ∆tdSt+ΓtdMt+(Std∆t+dStd∆t+MtdΓt+dMtdΓt). So dXt = ∆tdSt+ΓtdMt ⇐⇒ Std∆t + dStd∆t + MtdΓt + dMtdΓt = 0, i.e. (4.10.9) ⇐⇒ (4.10.15). (ii) Proof. First, we clarify the problems by stating explicitly the given conditions and the result to be proved. We assume we have a portfolio Xt = ∆tSt + ΓtMt. We let c(t, St) denote the price of call option at time t and set ∆t = cx(t, St). Finally, we assume the portfolio is self-financing. The problem is to show [ ] 1 rN dt = c (t, S ) + σ2S2c (t, S ) dt, t t t 2 t xx t

where Nt = c(t, St) − ∆tSt.

36 Indeed, by the self-financing property and ∆t = cx(t, St), we have c(t, St) = Xt (by the calculations in Subsection 4.5.1-4.5.3). This uniquely determines Γt as

Xt − ∆tSt c(t, St) − cx(t, St)St Nt Γt = = = . Mt Mt Mt Moreover, [ ] 1 dN = c (t, S )dt + c (t, S )dS + c (t, S )d⟨S ⟩ − d(∆ S ) t t t x t t 2 xx t t t t t [ ] 1 = c (t, S ) + c (t, S )σ2S2 dt + [c (t, S )dS − d(X − Γ M )] t t 2 xx t t x t t t t t [ ] 1 = c (t, S ) + c (t, S )σ2S2 dt + M dΓ + dM dΓ + [c (t, S )dS + Γ dM − dX ]. t t 2 xx t t t t t t x t t t t t

By self-financing property, cx(t, St)dt + ΓtdMt = ∆tdSt + ΓtdMt = dXt, so [ ] 1 c (t, S ) + c (t, S )σ2S2 dt = dN − M dΓ − dM dΓ = Γ dM = Γ rM dt = rN dt. t t 2 xx t t t t t t t t t t t t

4.11.

Proof. First, we note c(t, x) solves the Black-Scholes-Merton PDE with volatility σ1: ( ) ∂ ∂ 1 ∂2 + rx + x2σ2 − r c(t, x) = 0. ∂t ∂x 2 1 ∂x2

So 1 c (t, S ) + rS c (t, S ) + σ2S2c (t, S ) − rc(t, S ) = 0, t t t x t 2 1 t xx t t and

1 2 2 dc(t, St) = ct(t, St)dt + cx(t, St)(αStdt + σ2StdWt) + cxx(t, St)σ2St dt [ ]2 1 = c (t, S ) + αc (t, S )S + σ2S2c (t, S ) dt + σ S c (t, S )dW t t x t t 2 2 t xx t 2 t x t t [ ] 1 = rc(t, S ) + (α − r)c (t, S )S + S2(σ2 − σ2)c (t, S ) dt + σ S c (t, S )dW . t x t t 2 t 2 1 xx t 2 t x t t

Therefore [ 1 dX = rc(t, S ) + (α − r)c (t, S )S + S2(σ2 − σ2)σ (t, S ) + rX − rc(t, S ) + rS c (t, S ) t t x t t 2 t 2 1 xx t t t t x t ] 1 − (σ2 − σ2)S2c (t, S ) − c (t, S )αS dt + [σ S c (t, S ) − c (t, S )σ S ]dW 2 2 1 t xx t x t t 2 t x t x t 2 t t

= rXtdt.

rt This implies Xt = X0e . By X0, we conclude Xt = 0 for all t ∈ [0,T ].

4.12. (i)

37 −r(T −t) Proof. By (4.5.29), c(t, x) − p(t, x) = x − e K. So px(t, x) = cx(t, x) − 1 = N(d+(T − t, x)) − 1, p (t, x) = c (t, x) = √1 N ′(d (T − t, x)) and xx xx σx T −t +

−r(T −t) pt(t, x) = ct(t, x) + re K −r(T −t) σx ′ −r(T −t) = −rKe N(d−(T − t, x)) − √ N (d+(T − t, x)) + rKe 2 T − t −r(T −t) σx ′ = rKe N(−d−(T − t, x)) − √ N (d+(T − t, x)). 2 T − t

(ii)

Proof. For an agent hedging a short position in the put, since ∆t = px(t, x) < 0, he should short the underlying stock and put p(t, St) − px(t, St)St(> 0) cash in the money market account. (iii)

Proof. By the put-call parity, it suffices to show f(t, x) = x − Ke−r(T −t) satisfies the Black-Scholes-Merton partial differential equation. Indeed, ( ) ∂ 1 ∂2 ∂ 1 + σ2x2 + rx − r f(t, x) = −rKe−r(T −t) + σ2x2 · 0 + rx · 1 − r(x − Ke−r(T −t)) = 0. ∂t 2 ∂x2 ∂x 2

Remark: The Black-Scholes-Merton PDE has many solutions. Proper boundary conditions are the key to uniqueness. For more details, see Wilmott [15].

4.13.

Proof. We suppose (W1,W2) is a pair of local martingale defined by SDE { dW (t) = dB (t) 1 1 (1) dW2(t) = α(t)dB1(t) + β(t)dB2(t).

We want to find α(t) and β(t) such that { (dW (t))2 = [α2(t) + β2(t) + 2ρ(t)α(t)β(t)]dt = dt 2 (2) dW1(t)dW2(t) = [α(t) + β(t)ρ(t)]dt = 0.

Solve the equation for α(t) and β(t), we have β(t) = √ 1 and α(t) = −√ ρ(t) . So 1−ρ2(t) 1−ρ2(t) { W1(t) = B∫ 1(t) ∫ t √−ρ(s) t √ 1 (3) W2(t) = dB1(s) + dB2(s) 0 1−ρ2(s) 0 1−ρ2(s) is a pair of independent BM’s. Equivalently, we have { B (t) = W (t) 1 ∫ 1 ∫ √ (4) t t − 2 B2(t) = 0 ρ(s)dW1(s) + 0 1 ρ (s)dW2(s).

4.14. (i)

38 ∈ F Proof. Clearly Zj tj+1 . Moreover

E[Z |F ] = f ′′(W )E[(W − W )2 − (t − t )|F ] = f ′′(W )(E[W 2 ] − (t − t )) = 0, j tj tj tj+1 tj j+1 j tj tj tj+1−tj j+1 j − F ∼ since Wtj+1 Wtj is independent of tj and Wt N(0, t). Finally, we have

2|F ′′ 2 − 4 − − − 2 − 2|F E[Zj tj ] = [f (Wtj )] E[(Wtj+1 Wtj ) 2(tj+1 tj)(Wtj+1 Wtj ) + (tj+1 tj) tj ] = [f ′′(W )]2(E[W 4 ] − 2(t − t )E[W 2 ] + (t − t )2) tj tj+1−tj j+1 j tj+1−tj j+1 j ′′ 2 − 2 − − 2 − 2 = [f (Wtj )] [3(tj+1 tj) 2(tj+1 tj) + (tj+1 tj) ] ′′ 2 − 2 = 2[f (Wtj )] (tj+1 tj) , where we used the independence of Browian motion increment and the fact that E[X4] = 3E[X2]2 if X is Gaussian with mean 0. (ii) ∑ ∑ n−1 n−1 |F Proof. E[ j=0 Zj] = E[ j=0 E[Zj tj ]] = 0 by part (i). (iii)

Proof.

n∑−1 n∑−1 2 V ar[ Zj] = E[( Zj) ] j=0 j=0 n∑−1 ∑ 2 = E[ Zj + 2 ZiZj] j=0 0≤i

4.15. (i)

Proof. Bi is a local martingale with

 2 ∑d σ (t) ∑d σ2 (t) (dB (t))2 =  ij dW (t) = ij dt = dt. i σ (t) j σ2(t) j=1 i j=1 i

So Bi is a Brownian motion. (ii)

39 Proof.   [ ] ∑d ∑d  σij(t)  σkl(t) dBi(t)dBk(t) = dWj(t) dWl(t) σi(t) σk(t) j=1 l=1 ∑ σij(t)σkl(t) = dWj(t)dWl(t) σi(t)σk(t) 1≤j, l≤d ∑d σ (t)σ (t) = ij kj dt σ (t)σ (t) j=1 i k

= ρik(t)dt.

4.16.

Proof. To find the m independent Brownion motion W1(t), ··· , Wm(t), we need to find A(t) = (aij(t)) so that tr tr (dB1(t), ··· , dBm(t)) = A(t)(dW1(t), ··· , dWm(t)) , or equivalently

tr −1 tr (dW1(t), ··· , dWm(t)) = A(t) (dB1(t), ··· , dBm(t)) ,

and

tr (dW1(t), ··· , dWm(t)) (dW1(t), ··· , dWm(t)) −1 tr −1 tr = A(t) (dB1(t), ··· , dBm(t)) (dB1(t), ··· , dBm(t))(A(t) ) dt

= Im×mdt,

where Im×m is the m × m unit matrix. By the condition dBi(t)dBk(t) = ρik(t)dt, we get

tr (dB1(t), ··· , dBm(t)) (dB1(t), ··· , dBm(t)) = C(t).

−1 −1 tr tr So A(t) C(t)(A(t) ) = Im×m, which gives C(t) = A(t)A(t) . This motivates us to define A as the square root of C. Reverse the above analysis, we obtain a formal proof.

4.17.

Proof. We will try to solve all the sub-problems in a single, long solution. We start with the general Xi: ∫ ∫ t t Xi(t) = Xi(0) + θi(u)du + σi(u)dBi(u), i = 1, 2. 0 0 The goal is to show C(ϵ) lim √ = ρ(t0). ↓ ϵ 0 V1(ϵ)V2(ϵ) First, for i = 1, 2, we have

M (ϵ) = E[X (t + ϵ) − X (t )|F ] i [∫i 0 i 0 ∫ t0 ] t0+ϵ t0+ϵ |F = E Θi(u)du + σi(u)dBi(u) t0 t0 [∫ t0 ] t0+ϵ − |F = Θi(t0)ϵ + E (Θi(u) Θi(t0))du t0 . t0

40 By Conditional Jensen’s Inequality, [∫ ] [∫ ] t0+ϵ t0+ϵ − |F ≤ | − | |F E (Θi(u) Θi(t0))du t0 E Θi(u) Θi(t0) du t0 t0 t0 ∫ ∫ 1 t0+ϵ 1 t0+ϵ Since |Θi(u) − Θi(t0)|du ≤ 2M and limϵ→0 |Θi(u) − Θi(t0)|du = 0 by the continuity of Θi, ϵ t0 ϵ t0 the Dominated Convergence Theorem under Conditional Expectation implies [∫ ] [ ∫ ] 1 t0+ϵ 1 t0+ϵ lim E |Θi(u) − Θi(t0)|du|Ft = E lim |Θi(u) − Θi(t0)|du|Ft = 0. ϵ→0 0 ϵ→0 0 ϵ t0 ϵ t0

So Mi(ϵ) = Θi(t0)ϵ + o(ϵ). This proves (iii). ∫ To calculate the variance and covariance, we note Y (t) = t σ (u)dB (u) is a martingale and by Itô’s ∫ i 0 i i − t formula (t)Yj(t) 0 σi(u)σj(u)du is a martingale (i = 1, 2). So E[(X (t + ϵ) − X (t ))(X (t + ϵ) − X (t ))|F ] [(i 0 i 0 ∫j 0 j )(0 t0 ∫ ) ] t0+ϵ t0+ϵ − − |F = E Yi(t0 + ϵ) Yi(t0) + Θi(u)du Yj(t0 + ϵ) Yj(t0) + Θj(u)du t0 t0 [∫ t0 ∫ ] t0+ϵ t0+ϵ − − |F |F = E [(Yi(t0 + ϵ) Yi(t0)) (Yj(t0 + ϵ) Yj(t0)) t0 ] + E Θi(u)du Θj(u)du t0 [ ∫ ] [ t0 t0∫ ] t0+ϵ t0+ϵ − |F − |F +E (Yi(t0 + ϵ) Yi(t0)) Θj(u)du t0 + E (Yj(t0 + ϵ) Yj(t0)) Θi(u)du t0 t0 t0 = I + II + III + IV. [∫ ] t0+ϵ − |F |F I = E[Yi(t0 + ϵ)Yj(t0 + ϵ) Yi(t0)Yj(t0) t0 ] = E σi(u)σj(u)ρij(t)dt t0 . t0

By an argument similar to that involved in the proof of part (iii), we conclude I = σi(t0)σj(t0)ρij(t0)ϵ + o(ϵ) and [∫ ∫ ] [∫ ] t0+ϵ t0+ϵ t0+ϵ − |F |F II = E (Θi(u) Θi(t0))du Θj(u)du t0 + Θi(t0)ϵE Θj(u)du t0 t0 t0 t0 = o(ϵ) + (Mi(ϵ) − o(ϵ))Mj(ϵ)

= Mi(ϵ)Mj(ϵ) + o(ϵ). By Cauchy’s inequality under conditional expectation (note E[XY |F] defines an inner product on L2(Ω)), [ ∫ ] t0+ϵ ≤ | − | | | |F III E Yi(t0 + ϵ) Yi(t0) Θj(u) du t0 √ t0 ≤ Mϵ E[(Y (t + ϵ) − Y (t ))2|F ] √ i 0 i 0 t0 ≤ 2 − 2|F Mϵ E[Yi(t0 + ϵ) Yi(t0) t0 ] √ ∫ t0+ϵ ≤ 2 |F Mϵ E[ Θi(u) du t0 ] √t0 ≤ Mϵ · M ϵ = o(ϵ) Similarly, IV = o(ϵ). In summary, we have − − |F E[(Xi(t0 + ϵ) Xi(t0))(Xj(t0 + ϵ) Xj(t0)) t0 ] = Mi(ϵ)Mj(ϵ) + σi(t0)σj(t0)ρij(t0)ϵ + o(ϵ) + o(ϵ). This proves part (iv) and (v). Finally,

C(ϵ) ρ(t0)σ1(t0)σ2(t0)ϵ + o(ϵ) lim √ = lim √ = ρ(t0). ϵ↓0 ϵ↓0 2 2 V1(ϵ)V2(ϵ) (σ1(t0)ϵ + o(ϵ))(σ2(t0)ϵ + o(ϵ))

41 This proves part (vi). Part (i) and (ii) are consequences of general cases.

4.18. (i)

Proof. 1 2 1 2 rt −θWt− θ t −θWt− θ t rt d(e ζt) = (de 2 ) = −e 2 θdWt = −θ(e ζt)dWt,

1 2 −θWt− θ t rt where for the second “=”, we used the fact that e 2 solves dXt = −θXtdWt. Since d(e ζt) = rt rt re ζtdt + e dζt, we get dζt = −θζtdWt − rζtdt. (ii)

Proof.

d(ζtXt) = ζtdXt + Xtdζt + dXtdζt

= ζt(rXtdt + ∆t(α − r)Stdt + ∆tσStdWt) + Xt(−θζtdWt − rζtdt)

+(rXtdt + ∆t(α − r)Stdt + ∆tσStdWt)(−θζtdWt − rζtdt)

= ζt(∆t(α − r)Stdt + ∆tσStdWt) − θXtζtdWt − θ∆tσStζtdt

= ζt∆tσStdWt − θXtζtdWt.

So ζtXt is a martingale. (iii)

Proof. By part (ii), X0 = ζ0X0 = E[ζT Xt] = E[ζT VT ]. (This can be seen as a version of risk-neutral pricing, only that the pricing is carried out under the actual probability measure.)

4.19. (i) ∫ t 2 Proof. Bt is a local martingale with [B]t = 0 sign(Ws) ds = t. So by Lévy’s theorem, Bt is a Brownian motion.

(ii)

Proof. d(BtWt) = BtdWt + sign(Wt)WtdWt + sign(Wt)dt. Integrate both sides of the resulting equation and the expectation, we get ∫ ∫ t t − 1 − 1 E[BtWt] = E[sign(Ws)]ds = E[1{Ws≥0} 1{Ws<0}]ds = t t = 0. 0 0 2 2

(iii)

2 Proof. By Itô’s formula, dWt = 2WtdWt + dt. (iv)

Proof. By Itô’s formula,

2 2 2 2 d(BtWt ) = BtdWt + Wt dBt + dBtdWt 2 = Bt(2WtdWt + dt) + Wt sign(Wt)dWt + sign(Wt)dWt(2WtdWt + dt) 2 = 2BtWtdWt + Btdt + sign(Wt)Wt dWt + 2sign(Wt)Wtdt.

42 So ∫ ∫ t t 2 E[BtWt ] = E[ Bsds] + 2E[ sign(Ws)Wsds] ∫ 0 ∫ 0 t t = E[Bs]ds + 2 E[sign(Ws)Ws]ds 0∫ 0 t − = 2 (E[Ws1{Ws≥0}] E[Ws1{Ws<0}])ds 0 ∫ ∫ 2 t ∞ − x e 2s = 4 x√ dxds 2πs ∫0 √0 t s = 4 ds 0 2π ̸ · 2 = 0 = E[Bt] E[Wt ]. 2 ̸ · 2 Since E[BtWt ] = E[Bt] E[Wt ], Bt and Wt are not independent.

4.20. (i)  { { 1, if x > K x − K, if x ≥ K 0, if x ≠ K Proof. f(x) = So f ′(x) = undefined, if x = K and f ′′(x) = 0, if x < K.  undefined, if x = K. 0, if x < K

(ii)

2 √ ∫ − x 2 ∞ e 2T T − K K Proof. E[f(W )] = (x − K) √ dx = e 2T − KΦ(− √ ) where Φ is the distribution function of T K 2πT 2π ∫ T T ′′ standard normal random variable. If we suppose 0 f (Wt)dt = 0, the expectation of RHS of (4.10.42) is equal to 0. So (4.10.42) cannot hold. (iii) Proof. This is trivial to check. (iv) 1 ≥ 1 Proof. If x = K, limn→∞ fn(x) = 8n = 0; if x > K, for n large enough, x K + 2n , so limn→∞ fn(x) = − − ≤ − 1 limn→∞(x K) = x K; if x < K, for n large enough, x K 2n , so limn→∞fn(x) = limn→∞ 0 = 0. In + summary, limn→∞fn(x) = (x − K) . Similarly, we can show  0, if x < K ′ 1 lim fn(x) = , if x = K (5) n→∞  2 1, if x > K.

(v)

Proof. Fix ω, so that Wt(ω) < K for any t ∈ [0,T ]. Since Wt(ω) can obtain its maximum on [0,T ], there ≥ − 1 exists n0, so that for any n n0, max0≤t≤T Wt(ω) < K 2n . So ∫ T LK (T )(ω) = lim n 1 − 1 1 (Wt(ω))dt = 0. →∞ (K 2n ,K+ 2n ) n 0

43 (vi) Proof. Take expectation on both sides of the formula (4.10.45), we have

+ E[LK (T )] = E[(WT − K) ] > 0.

So we cannot have LK (T ) = 0 a.s.. Remark 1. Cf. Kallenberg Theorem 22.5, or the paper by Alsmeyer and Jaeger (2005).

4.21. (i) Proof. There are two problems. First, the transaction cost could be big due to active trading; second, the purchases and sales cannot be made at exactly the same price K. For more details, see Hull [6]. (ii)

+ Proof. No. The RHS of (4.10.26) is a martingale, so its expectation is 0. But E[(ST − K) ] > 0. So + XT ≠ (ST − K) .

5 Risk-Neutral Pricing

⋆ Comments: 1) Heuristics to memorize the conditional Bayes formula (Lemma 5.2.2 on page 212 of the textbook) 1 Ee[Y |F(s)] = E[YZ(t)|F(s)]. Z(s)

We recall an example in Durrett [4, page 223] (Example 5.1.3): suppose Ω1, Ω2, ··· is a finite or infinite partition of Ω into disjoint sets, each of which has positive probability, and let F = σ(Ω1, Ω2, ··· ) be the σ-field generated by these sets. Then

∑ E[X;Ω ] E[X|F(s)] = 1 i . Ωi P(Ω ) i i

Therefore, if F(s) = σ(Ω1, Ω2, ··· ), we have

∑ Ee[Y ;Ω ] ∑ E[YZ(t); Ω ] Ee[Y |F(s)] = 1 i = 1 i . Ωi Pe Ωi E[Z(s); Ω ] i (Ωi) i i

Consequently, on any Ωi, E[YZ(t); Ω ]/P(Ω ) E[YZ(t)|F(s)] 1 Ee[Y |F(s)] = i i = = E[YZ(t)|F(s)]. E[Z(s); Ωi]/P(Ωi) E[Z(s)|F(s)] Z(s)

2) Heuristics to memorize the one-dimensional Girsanov’s Theorem (Theorem 5.2.3 on page 212∫ of the f t textbook): suppose under a change of measure with density process (Z(t))t≥0, W (t) = W (t) + 0 Θ(u)du becomes a martingale, then we must have { ∫ ∫ } t 1 t Z(t) = exp − Θ(u)dW (u) − Θ2(u)du . 0 2 0 Recall Z is a positive martingale under the original probability measure P, so by martingale representation theorem under the Brownian filtration, Z must satisfy the SDE

dZ(t) = Z(t)X(t)dW (t)

44 for some adapted process X(t).7 Since an adapted process M is a Pe-martingale if and only if MZ is a P-martingale,8 we conclude Wf is a Pe-martingale if and only if WZf is a P-martingale. Since

d[Wf(t)Z(t)] = Z(t)[dW (t) + Θ(t)dt] + Wf(t)Z(t)X(t)dW (t) + [dW (t) + Θ(t)dt]Z(t)X(t)dW (t) = (··· )dW (t) + Z(t)[Θ(t) + X(t)]dt, { ∫ ∫ } − − t − 1 t 2 we must require X(t) = Θ(t). So Z(t) = exp 0 Θ(u)dW (u) 2 0 Θ (u)du . 3) Main idea of Example 5.4.4. Combining formula (5.4.15) and (5.4.22), we have

∑m ∑m d[D(t)X(t)] = ∆i(t)d[D(t)Si(t)] = ∆i(t)D(t)Si(t)[(αi(t) − R(t))dt + σi(t)dBi(t)]. i=1 i=1

To create arbitrage, we want D(t)X(t) to have a deterministic and positive return rate, that is, {∑ m ∆ (t)S (t)σ (t) = 0 ∑i=1 i i i m − i=1 ∆i(t)Si(t)[αi(t) R(t)] > 0.

1 1 In Example 5.4.4, i = 2, ∆1(t) = , ∆2(t) = − and S1(t)σ1 S2(t)σ2 α − r α − r 1 − > 0. σ1 σ2

5.1. (i)

Proof. 1 df(X ) = f ′(X )dt + f ′′(X )d⟨X⟩ t t 2 t t 1 = f(Xt)(dXt + d⟨X⟩t) [ 2 ] 1 1 = f(X ) σ dW + (α − R − σ2)dt + σ2dt t t t t t 2 t 2 t

= f(Xt)(αt − Rt)dt + f(Xt)σtdWt.

This is formula (5.2.20).

(ii)

Proof. d(DtSt) = StdDt + DtdSt + dDtdSt = −StRtDtdt + DtαtStdt + DtσtStdWt = DtSt(αt − Rt)dt + DtStσtdWt. This is formula (5.2.20). 5.2. [ ] e DT VT ZT Proof. By Lemma 5.2.2., E[DT VT |Ft] = E |Ft . Therefore (5.2.30) is equivalent to DtVtZt = Zt E[DT VT ZT |Ft]. 5.3. (i)

7 ξ(t) The existence of X is easy: if dZ(t) = ξ(t)dW (t), then X(t) = Z(t) . 8To see this, note Ee[M(t)|F(s)] = E[M(t)Z(t)|F(s)]/Z(s).

45 Proof.

d f 1 2 e −rT σWT +(r− σ )T + cx(0, x) = E[e (xe 2 − K) ] dx[ ] e −rT d σWf +(r− 1 σ2)T = E e h(xe T 2 ) dx [ ] − f − 1 2 e rT σWT +(r 2 σ )T = E e e 1 σWf +(r− 1 σ2)T [ {xe T 2 >K] } − 1 2 f 2 σ T e σWT = e E e 1{f 1 K − − 1 2 } WT > σ (ln x (r 2 σ )T ) [ √ ] Wf − 1 σ2T e σ T √T = e 2 E e T 1 f √ √ { W√T −σ T > √1 (ln K −(r− 1 σ2)T )−σ T } ∫ T σ T x 2 ∞ √ − 1 2 1 − z2 2 σ T √ 2 σ T z √ = e e e 1{z−σ T >−d (T,x)}dz −∞ 2π + ∫ √ ∞ 2 1 − (z−σ T ) √ 2 √ = e 1{z−σ T >−d (T,x)}dz −∞ 2π +

= N(d+(T, x)).

(ii)

f 1 2 b σWT − σ T b e b b e b b Proof. If we set ZT = e 2 and Zt = E[ZT |Ft], then Z is a P -martingale, Zt > 0 and E[ZT ] = f 1 2 e σWT − σ T b b e b E[e 2 ] = 1. So if we define P by dP = ZT dP on FT , then P is a probability measure equivalent to Pe, and e b b cx(0, x) = E[ZT 1{ST >K}] = P (ST > K). ∫ c f t − f − b − Moreover, by Girsanov’s Theorem, Wt = Wt + 0 ( σ)du = Wt σt is a P -Brownian motion (set Θ = σ in Theorem 5.4.1.) (iii)

f 1 2 c 1 2 σWT +(r− σ )T σWT +(r+ σ )T Proof. ST = xe 2 = xe 2 . So ( ) c c 1 2 W b b σWT +(r+ σ )T b T P (ST > K) = P (xe 2 > K) = P √ > −d+(T, x) = N(d+(T, x)). T

5.4. First, a few typos. In the SDE for S, ‘‘σ(t)dWf(t)” → ‘‘σ(t)S(t)dWf(t)”. In the first equation for c(0,S(0)), E → Ee. In the second equation for c(0,S(0)), the variable for BSM should be  √  ∫ ∫ 1 T 1 T BSM T,S(0); K, r(t)dt, σ2(t)dt . T 0 T 0

(i) ∫ ∫ T T dSt − 1 ⟨ ⟩ f − 1 2 { − 1 2 f } Proof. d ln St = S 2S2 d S t = rtdt + σtdWt 2 σt dt. So ST = S0 exp 0 (rt 2 σt )dt + 0 σtdWt . Let ∫ t ∫t X = T (r − 1 σ2)dt + T σ dWf . The first term in the expression of X is a number and the second term 0 t 2 t 0 t t ∫ is a Gaussian random variable N(0, T σ2dt), since both r and σ ar deterministic. Therefore, S = S eX , ∫ ∫ 0 t T 0 ∼ T − 1 2 T 2 with X N( 0 (rt 2 σt )dt, 0 σt dt),. (ii)

46 Proof. For the standard BSM model with constant volatility Σ and interest rate R, under the risk-neutral Y − 1 2 f ∼ − 1 2 2 e Y − + measure, we have ST = S0e , where Y = (R 2 Σ )T +ΣWT N((R√ 2 Σ )T, Σ T ), and E[(S0e K) ] = RT 1 1 1 e BSM(T,S0; K,R, Σ). Note R = T (E[Y ] + 2 V ar(Y )) and Σ = T V ar(Y ), we can get ( ( ) √ ) e Y + E[Y ]+ 1 V ar(Y ) 1 1 1 E[(S e − K) ] = e 2 BSM T,S ; K, E[Y ] + V ar(Y ) , V ar(Y ) . 0 0 T 2 T

So for the model in this problem, ∫ T − rtdt e X + c(0,S0) = e 0 E[(S0e − K) ] ( ( ) √ ) ∫ − T r dt E[X]+ 1 V ar(X) 1 1 1 = e 0 t e 2 BSM T,S ; K, E[X] + V ar(X) , V ar(X) 0 T 2 T  √  ∫ ∫ T T  1 1 2  = BSM T,S0; K, rtdt, σt dt . T 0 T 0

5.5. (i) 1 ′ − 1 ′′ 2 − Proof. Let f(x) = x , then f (x) = x2 and f (x) = x3 . Note dZt = ZtΘtdWt, so ( ) 2 1 ′ 1 ′′ − 1 − 1 2 2 2 Θt Θt d = f (Zt)dZt + f (Zt)dZtdZt = 2 ( Zt)ΘtdWt + 3 Zt Θt dt = dWt + dt. Zt 2 Zt 2 Zt Zt Zt

(ii) [ ] f f e f ZtMt f Proof. By Lemma 5.2.2., for s, t ≥ 0 with s < t, Ms = E[Mt|Fs] = E |Fs . That is, E[ZtMt|Fs] = Zs f f ZsMs. So M = ZM is a P -martingale. (iii) Proof. ( ) 2 f 1 1 1 1 Γt MtΘt MtΘt ΓtΘt dMt = d Mt · = dMt + Mtd + dMtd = dWt + dWt + dt + dt. Zt Zt Zt Zt Zt Zt Zt Zt

(iv) Proof. In part (iii), we have

2 f Γt MtΘt MtΘt ΓtΘt Γt MtΘt dMt = dWt + dWt + dt + dt = (dWt + Θtdt) + (dWt + Θtdt). Zt Zt Zt Zt Zt Zt

e Γt+MtΘt f e f Let Γt = , then dMt = ΓtdWt. This proves Corollary 5.3.2. Zt 5.6.

47 f e f f Proof. By Theorem 4.6.5, it suffices to show Wi(t) is an Ft-martingale under P and [Wi, Wj](t) = tδij f e f (i, j = 1, 2). Indeed, for i = 1, 2, Wi(t) is an Ft-martingale under P if and only if Wi(t)Zt is an Ft-martingale under P , since [ ] f e f Wi(t)Zt E[Wi(t)|Fs] = E |Fs . Zs By Itô’s product formula, we have f f f f d(Wi(t)Zt) = Wi(t)dZt + ZtdWi(t) + dZtdWi(t) f = Wi(t)(−Zt)Θ(t) · dWt + Zt(dWi(t) + Θi(t)dt) + (−ZtΘt · dWt)(dWi(t) + Θi(t)dt) ∑d f = Wi(t)(−Zt) Θj(t)dWj(t) + Zt(dWi(t) + Θi(t)dt) − ZtΘi(t)dt j=1 ∑d f = Wi(t)(−Zt) Θj(t)dWj(t) + ZtdWi(t) j=1

f f e This shows Wi(t)Zt is an Ft-martingale under P . So Wi(t) is an Ft-martingale under P . Moreover, [ ∫ · ∫ · ] f f [Wi, Wj](t) = Wi + Θi(s)ds, Wj + Θj(s)ds (t) = [Wi,Wj](t) = tδij. 0 0 Combined, this proves the two-dimensional Girsanov’s Theorem. 5.7. (i)

−1 Proof. Let a be any strictly positive number. We define X2(t) = (a + X1(t))D(t) . Then ( ) X (0) P X (T ) ≥ 2 = P (a + X (T ) ≥ a) = P (X (T ) ≥ 0) = 1, 2 D(T ) 1 1 ( ) X2(0) and P X2(T ) > D(T ) = P (X1(T ) > 0) > 0, since a is arbitrary, we have proved the claim of this problem. Remark 2. The intuition is that we invest the positive starting fund a into the money market account, and construct portfolio X1 from zero cost. Their sum should be able to beat the return of money market account. (ii)

Proof. We define X1(t) = X2(t)D(t) − X2(0). Then X1(0) = 0, ( ) ( ) X (0) X (0) P (X (T ) ≥ 0) = P X (T ) ≥ 2 = 1,P (X (T ) > 0) = P X (T ) > 2 > 0. 1 2 D(T ) 1 2 D(T )

e 1 5.8. The basic idea is that for any positive P -martingale M, dMt = Mt · dMt. By Martingale Repre- Mt e e f e Γt f sentation Theorem, dMt = ΓtdWt for some adapted process Γt. So dMt = Mt( )dWt, i.e. any positive Mt martingale must be the exponential of an integral w.r.t. Brownian motion. Taking into account discounting factor and apply Itô’s product rule, we can show every strictly positive asset is a generalized geometric Brownian motion. (i) ∫ − T e 0 Rudu |F e |F e Proof. VtDt = E[e VT t] = E[DT VT t]. So (DtVt)t≥0 is a P -martingale.∫ By Martingale Represen- tation Theorem, there exists an adapted process Γe , 0 ≤ t ≤ T , such that D V = t Γe dWf , or equivalently, ∫ t t t ∫0 s s −1 t e f −1 t e f −1e f Vt = Dt 0 ΓsdWs. Differentiate both sides of the equation, we get dVt = RtDt 0 ΓsdWsdt + Dt ΓtdWt, e Γt i.e. dVt = RtVtdt + dWt. Dt

48 (ii)

Proof. We prove the following more general lemma. Lemma 5.1. Let X be an almost surely positive random variable (i.e. X > 0 a.s.) defined on the probability space (Ω, G,P ). Let F be a sub σ-algebra of G, then Y = E[X|F] > 0 a.s. ≥ { } Proof. By the property of conditional expectation Yt 0 a.s. Let A = Y = 0 ∑, we shall show P (A) = 0. In- ∈ F |F ∞ ≥ deed, note A , 0 = E[YIA] = E[E[X ]IA] = E[] = E[X1A∩{X≥1}] + n=1 E[X1A∩{ 1 >X≥ 1 }] ∑∞ n n+1 P (A∩{X ≥ 1})+ 1 P (A∩{ 1 > X ≥ 1 }). So P (A∩{X ≥ 1}) = 0 and P (A∩{ 1 > X ≥ 1 }) = 0, n=1 n+1 n n+1 ∑ n n+1 ∀ ≥ ∩ { } ∩ { ≥ } ∞ ∩ { 1 ≥ 1 } n 1. This in turn implies P (A) = P (A X > 0 ) = P (A X 1 ) + n=1 P (A n > X n+1 ) = 0. ∫ T e − Rudu By the above lemma, it is clear that for each t ∈ [0,T ], Vt = E[e t VT |Ft] > 0 a.s.. Moreover, by a classical result of martingale theory (Revuz and Yor [11], Chapter II, Proposition (3.4)), we have the following stronger result: for a.s. ω, Vt(ω) > 0 for any t ∈ [0,T ]. (iii) ( ) e e 1 1 Γt f Γt f Proof. By (ii), V > 0 a.s., so dVt = Vt dVt = Vt RtVtdt + dWt = VtRtdt + Vt dWt = RtVtdt + Vt Vt Dt VtDt e f Γt σtVtdWt, where σt = . This shows V follows a generalized geometric Brownian motion. VtDt 5.9.

2 −rT 1 x 1 2 1 − y Proof. c(0, T, x, K) = xN(d ) − Ke N(d−) with d = √ (ln + (r  σ )T ). Let f(y) = √ e 2 , + σ T K 2 2π then f ′(y) = −yf(y),

∂d+ −rT −rT ∂d− c (0, T, x, K) = xf(d ) − e N(d−) − Ke f(d−) K + ∂y ∂y − 1 −rT −rT 1 = xf(d+) √ − e N(d−) + e f(d−) √ , σ TK σ T and

cKK (0, T, x, K) −rT 1 x ∂d+ −rT ∂d− e d− = xf(d+) √ − √ f(d+)(−d+) − e f(d−) + √ (−d−)f(d−) σ TK2 σ TK ∂y ∂y σ T ∂y − − −rT − x xd+ 1 −rT 1 e d− 1 = √ f(d+) + √ f(d+) √ − e f(d−) √ − √ f(d−) √ σ TK2 σ TK Kσ T Kσ T σ T Kσ T −rT x d+ e f(d−) d− = f(d+) √ [1 − √ ] + √ [1 + √ ] K2σ T σ T Kσ T σ T e−rT x = f(d−)d − f(d )d−. Kσ2T + K2σ2T +

5.10. (i)

Proof. At time t0, the value of the chooser option is V (t0) = max{C(t0),P (t0)} = max{C(t0),C(t0) − −r(T −t0) + F (t0)} = C(t0) + max{0, −F (t0)} = C(t0) + (e K − S(t0)) . (ii)

49 e −rt0 e −rt0 −rT −rt0 + Proof. By the risk-neutral pricing formula, V (0) = E[e V (t0)] = E[e C(t0)+(e K −e S(t0) ] = e −rt0 −r(T −t0) + C(0) + E[e (e K − S(t0)) ]. The first term is the value of a call expiring at time T with strike −r(T −t0) price K and the second term is the value of a put expiring at time t0 with strike price e K.

5.11. Proof. We first make an analysis which leads to the hint, then we give a formal proof. (Analysis) If we want to construct a portfolio X that exactly replicates the cash flow, we must find a solution to the backward SDE { dXt = ∆tdSt + Rt(Xt − ∆tSt)dt − Ctdt

XT = 0.

− Multiply Dt on both sides of the first equation and apply Itô’s∫ product rule, we∫ get d(DtXt) = ∆td(DtSt) C D dt. Integrate from 0 to T , we have D X − D X = T ∆ d(D S ) − T C D dt. By the terminal t t ∫ T∫ T 0 0 0 t t t 0 t t −1 T − T condition, we get X0 = D0 ( 0 CtDtdt 0 ∆td(DtSt)). X0 is the theoretical, no-arbitrage price of the cash flow, provided we can find a trading strategy ∆ that solves the BSDE. Note the SDE for S αt−Rt gives d(DtSt) = (DtSt)σt(θtdt + dWt), where θt = . Take the proper change of measure so that ∫ σt f t e Wt = 0 θsds + Wt is a Brownian motion under the new measure P , we get ∫ ∫ ∫ T T T f CtDtdt = D0X0 + ∆td(DtSt) = D0X0 + ∆t(DtSt)σtdWt. 0 0 0 ∫ ∫ This says the random variable T C D dt has a stochastic integral representation D X + T ∆ D S σ dWf . 0 t t ∫ 0 0 0 t t t t t T This inspires us to consider the martingale generated by 0 CtDtdt, so that we can apply Martingale Rep- resentation Theorem and get a∫ formula for ∆ by comparison of the integrands. T e |F (Formal proof) Let MT = 0 CtDtdt, and Mt = E[MT t]. Then by Martingale Representation Theo- ∫ e e t e f Γt rem, we can find an adapted process Γt, so that Mt = M0 + ΓtdWt. If we set ∆t = , we can check ∫ ∫ 0 ∫ DtStσt −1 t − t e T Xt = Dt (D0X0 + 0 ∆ud(DuSu) 0 CuDudu), with X0 = M0 = E[ 0 CtDtdt] solves the SDE { dXt = ∆tdSt + Rt(Xt − ∆tSt)dt − Ctdt

XT = 0.

Indeed, it is easy∫ to see that X satisfies∫ the first equation.∫ To check the terminal condition, we note T f − T T e f − XT DT = D0X0 + 0 ∆tDtStσtdWt 0 CtDtdt = M0 + 0 ΓtdWt MT = 0. So XT = 0. Thus, we have found a trading strategy ∆∫, so that the corresponding portfolio X replicates the cash flow and has zero e T terminal value. So X0 = E[ 0 CtDtdt] is the no-arbitrage price of the cash flow at time zero. − Remark 3. ∫As shown in the∫ analysis, d(DtXt) = ∆td(DtSt) CtDtdt. Integrate from t to T , we get 0 − D X = T ∆ d(D S ) − T C D du. Take conditional expectation w.r.t. F on both sides, we get t t t∫ u u u t u u ∫ t − − e T |F −1 e T |F DtXt = E[ t CuDudu t]. So Xt = Dt E[ t CuDudu t]. This is the no-arbitrage price of the cash flow at time t, and we have justified formula (5.6.10) in the textbook. 5.12. (i) ∑ ∑ ∑ e d σij (t) d σij (t) d σij (t) f Proof. dBi(t) = dBi(t) + γi(t)dt = dWj(t) + Θj(t)dt = dWj(t). So Bi is a j=1 σi(t) j=1 σi(t) j=1 σi(t) ∑ 2 e e d σij (t) e martingale. Since dBi(t)dBi(t) = 2 dt = dt, by Lévy’s Theorem, Bi is a Brownian motion under j=1 σi(t) Pe. (ii)

50 Proof. e dSi(t) = R(t)Si(t)dt + σi(t)Si(t)dBi(t) + (αi(t) − R(t))Si(t)dt − σi(t)Si(t)γi(t)dt ∑d ∑d e = R(t)Si(t)dt + σi(t)Si(t)dBi(t) + σij(t)Θj(t)Si(t)dt − Si(t) σij(t)Θj(t)dt j=1 j=1 e = R(t)Si(t)dt + σi(t)Si(t)dBi(t).

(iii) e e Proof. dBi(t)dBk(t) = (dBi(t) + γi(t)dt)(dBj(t) + γj(t)dt) = dBi(t)dBj(t) = ρik(t)dt. (iv)

Proof. By Itô’s product rule and martingale property, ∫ ∫ ∫ t t t E[Bi(t)Bk(t)] = E[ Bi(s)dBk(s)] + E[ Bk(s)dBi(s)] + E[ dBi(s)dBk(s)] ∫0 ∫ 0 0 t t = E[ ρik(s)ds] = ρik(s)ds. 0 0 ∫ e e e t Similarly, by part (iii), we can show E[Bi(t)Bk(t)] = 0 ρik(s)ds. (v) Proof. By Itô’s product formula, ∫ ∫ t t E[B1(t)B2(t)] = E[ sign(W1(u))du] = [P (W1(u) ≥ 0) − P (W1(u) < 0)]du = 0. 0 0 Meanwhile, ∫ t e e e e E[B1(t)B2(t)] = E[ sign(W1(u))du ∫ 0 t e e = [P (W1(u) ≥ 0) − P (W1(u) < 0)]du ∫0 t e f e f = [P (W1(u) ≥ u) − P (W1(u) < u)]du ∫0 ( ) t 1 e f = 2 − P (W1(u) < u) du 0 2 < 0, e e e for any t > 0. So E[B1(t)B2(t)] = E[B1(t)B2(t)] for all t > 0. 5.13. (i) ∫ e e f e e f − t f ∈ Proof. E[W1(t)] = E[W1(t)] = 0 and E[W2(t)] = E[W2(t) 0 W1(u)du] = 0, for all t [0,T ]. (ii)

51 Proof. e e Cov[W1(T ),W2(T )] = E[W1(T )W2(T )] [∫ ∫ ] T T e = E W1(t)dW2(t) + W2(t)dW1(t) 0 0 [∫ ] [∫ ] T T e f f f e f = E W1(t)(dW2(t) − W1(t)dt) + E W2(t)dW1(t) 0 0 [∫ ] T e f 2 = −E W1(t) dt 0 ∫ T = − tdt 0 1 = − T 2. 2

−rt −rt −rt −rt 5.14. Equation (5.9.6) can be transformed into d(e Xt) = ∆t[d(e St) − ae dt] = ∆te [dSt − rStdt − −rt adt]. So, to make the discounted∫ portfolio value e Xt a martingale, we are motivated to change the measure − t − in such a way that St r 0 Sudu at is a martingale under the new measure. To do this,[ we note the SDE for ]S (αt−r)St−a is dSt = αtStdt+σStdWt. Hence dSt −rStdt−adt = [(αt −r)St −a]dt+σStdWt = σSt dt + dWt . ∫ σSt (αt−r)St−a f t e Set θt = and Wt = θsds + Wt, we can find an equivalent probability measure P , under which σSt 0 f f S satisfies the SDE dSt = rStdt + σStdWt + adt and Wt is a BM. This is the rational for formula (5.9.7). This is a good place to pause and think about the meaning of “martingale measure.” What is to be a martingale? The new measure Pe should be such that the discounted value process of the replicating portfolio is a martingale, not the discounted price process of the underlying. First, we want DtXt to be a martingale under Pe because we suppose that X is able to replicate the derivative payoff at terminal time, XT = VT . In order to avoid arbitrage, we must have Xt = Vt for any t ∈ [0,T ]. The difficulty is how to calculate Xt and the magic is brought by the martingale measure in the following line of reasoning: −1 e |F −1 e |F Vt = Xt = Dt E[DT XT t] = Dt E[DT VT t]. You can think of martingale measure as a calculational convenience. That is all about martingale measure! Risk neutral is just a perception, referring to the actual effect of constructing a hedging portfolio! Second, we note when the portfolio is self-financing, the discounted price process of the underlying is a martingale under Pe, as in the classical Black-Scholes-Merton model without dividends or cost of carry. This is not a coincidence. Indeed, we have in this case the e relation d(DtXt) = ∆td(DtSt). So DtXt being a martingale under P is more or less equivalent to DtSt being a martingale under Pe. However, when the underlying pays dividends, or there is cost of carry, d(DtXt) = ∆td(DtSt) no longer holds, as shown in formula (5.9.6). The portfolio is no longer self-financing, but self-financing with consumption. What we still want to retain is the martingale property of DtXt, not that of DtSt. This is how we choose martingale measure in the above paragraph. e −rT Let VT be a payoff at time T , then for the martingale Mt = E[e VT |Ft], by Martingale Representation ∫ e rt e t e f Γte Theorem, we can find an adapted process Γt, so that Mt = M0 + ΓsdWs. If we let ∆t = , then the 0 σSt −rt e f e −rT value of the corresponding portfolio X satisfies d(e Xt) = ΓtdWt. So by setting X0 = M0 = E[e VT ], −rt we must have e Xt = Mt, for all t ∈ [0,T ]. In particular, XT = VT . Thus the portfolio perfectly hedges VT . This justifies the risk-neutral pricing of European-type contingent claims in the model where cost of carry exists. Also note the risk-neutral measure is different from the one in case of no cost of carry. Another perspective for perfect replication is the following. We need to solve the backward SDE { dXt = ∆tdSt − a∆tdt + r(Xt − ∆tSt)dt

XT = VT

e −rt for two unknowns, X and ∆. To do so, we find a probability measure P , under which e Xt is∫ a martingale, −rt e −rT |F t e f then e Xt = E[e VT t] := Mt. Martingale Representation Theorem gives Mt = M0 + 0 ΓudWu for

52 some adapted process Γe. This would give us a theoretical representation of ∆ by comparison of integrands, hence a perfect replication of VT . (i)

e −rt −rt Proof. As indicated in the above analysis, if we have (5.9.7) under P , then d(e Xt) = ∆t[d(e St) − −rt −rt f −rt e ae dt] = ∆te σStdWt. So (e Xt)t≥0, where X is given by (5.9.6), is a P -martingale.

(ii)

Proof. By Itô’s formula, dY = Y [σdWf + (r − 1 σ2)dt] + 1 Y σ2dt = Y (σdWf + rdt). So d(e−rtY ) = t t t 2 2 t ∫ t t t −rt f −rt e t a σe YtdWt and e Yt is a P -martingale. Moreover, if St = S0Yt + Yt ds, then 0 Ys ∫ ( ∫ ) t t a a f f dSt = S0dYt + dsdYt + adt = S0 + ds Yt(σdWt + rdt) + adt = St(σdWt + rdt) + adt. 0 Ys 0 Ys This shows S satisfies (5.9.7).

−rt Remark 4. To obtain this formula for S, we first set Ut = e St to remove the rStdt term. The SDE for f −rt f U is dUt = σUtdWt + ae dt. Just like solving linear ODE, to remove U in the dWt term, we consider f −σWt Vt = Ute . Itô’s product formula yields ( ) ( ) − f − f 1 − f 1 dV = e σWt dU + U e σWt (−σ)dWf + σ2dt + dU · e σWt (−σ)dWf + σ2dt t t t t 2 t t 2 − f − 1 = e σWt ae rtdt − σ2V dt. 2 t

1 σ2t Note V appears only in the dt term, so multiply the integration factor e 2 on both sides of the equation, we get 1 2 f 1 2 σ t −rt−σWt+ σ t d(e 2 Vt) = ae 2 dt. ∫ f 1 2 t σWt+(r− σ )t ads Set Yt = e 2 , we have d(St/Yt) = adt/Yt. So St = Yt(S0 + ). 0 Ys (iii) Proof. [ ∫ ∫ ] t T e e e a a E[ST |Ft] = S0E[YT |Ft] + E YT ds + YT ds|Ft 0 Ys t Ys ∫ ∫ [ ] t a T Y = S Ee[Y |F ] + dsEe[Y |F ] + a Ee T |F ds 0 T t Y T t Y t ∫0 s t ∫ s t T e a e e = S Y E[Y − ] + dsY E[Y − ] + a E[Y − ]ds 0 t T t Y t T t T s ( ∫ )0 s ∫ t t a T = S + ds Y er(T −t) + a er(T −s)ds 0 Y t ( ∫0 s ) t t ads r(T −t) a r(T −t) = S0 + Yte − (1 − e ). 0 Ys r

e rT − a − rT In particular, E[ST ] = S0e r (1 e ). (iv)

53 Proof. ( ∫ ) t ads a dEe[S |F ] = aer(T −t)dt + S + (er(T −t)dY − rY er(T −t)dt) + er(T −t)(−r)dt T t 0 Y t t r ( ∫ ) 0 s t ads r(T −t) f = S0 + e σYtdWt. 0 Ys e e So E[ST |Ft] is a P -martingale. As we have argued at the beginning of the solution, risk-neutral pricing is e valid even in the presence of cost of carry. So by an argument similar to that of §5.6.2, the process E[ST |Ft] is the futures price process for the commodity. (v)

e −r(T −t) e Proof. We solve the equation E[e (ST − K)|Ft] = 0 for K, and get K = E[ST |Ft]. So F orS(t, T ) = F utS(t, T ). (vi) Proof. We follow the hint. First, we solve the SDE { dXt = dSt − adt + r(Xt − St)dt

X0 = 0.

−rt −rt −rt By our analysis in part (i), d(e Xt) = d(e St) − ae dt. Integrate from 0 to t on both sides, we get X = S − S ert + a (1 − ert) = S − S ert − a (ert − 1). In particular, X = S − S erT − a (erT − 1). t t 0 r t 0 r ( ∫ ) T T 0 r e t ads r(T −t) a r(T −t) Meanwhile, F orS(t, T ) = F uts(t, T ) = E[ST |Ft] = S0 + Yte − (1−e ). So F orS(0,T ) = 0 Ys r rT − a − rT − S0e r (1 e ) and hence XT = ST F orS(0,T ). After the agent delivers the commodity, whose value is ST , and receives the forward price F orS(0,T ), the portfolio has exactly zero value.

6 Connections with Partial Differential Equations

⋆ Comments: 1) A rigorous presentation of (strong) Markov property can be found in Revuz and Yor [11], Chapter III. 2) A rigorous presentation of the Feymann-Kac formula can be found in Øksendal [9, page 143] (also see 钱敏平等 [10, page 236]). To reconcile the version presented in this book and the one in Øksendal [9] (Theorem 8.2.1), we note for the undiscounted version, in this book g(t, X(t)) = E[h(X(T ))|F(t)] = EX(t)[h(X(T − t))] while in Øksendal [9] x v(t, x) = E [h(Xt)]. So v(t, x) = g(T − t, x). The discounted version can connected similarly. 3) Hedging equation. Recall the SDE satisfied by the value process X(t) of a self-financing portfolio is (see formula (5.4.21) and (5.4.22) on page 230) − d[D(t)X(t)] = D(t)[{dX(t) R(t)X(t)dt] [ ] } ∑m ∑m = D(t) ∆i(t)dSi(t) + R(t) X(t) − ∆i(t)Si(t) dt − R(t)X(t)dt [ i=1 i=1 ] ∑m ∑m = D(t) ∆i(t)dSi(t) − R(t) ∆i(t)Si(t)dt i=1 i=1 ∑m = ∆i(t)d[D(t)Si(t)]. i=1

54 Under the multidimensional market model (formula (5.4.6) on page 226)

∑d dSi(t) = αi(t)Si(t)dt + Si(t) σij(t)dWj(t), i = 1, ··· , m. j=1 and assuming the existence of the risk-neutral measure (that∑ is, the market price of risk equations has a − d ··· solution (formula (5.4.18) on page 228): αi(t) R(t) = j=1 σij(t)Θj(t), i = 1, , m), D(t)Si(t) is a martingale under the risk-neutral measure (formula (5.4.17) on page 228) satisfying the SDE

∑d ∑d f f d[D(t)Si(t)] = D(t)Si(t) σij(t)dWj(t), dSi(t) = R(t)Si(t)dt + Si(t) σij(t)dWj(t), i = 1, ··· , m. j=1 j=1 ∑ m Consequently, d[D(t)X(t)] = i=1 ∆i(t)d[D(t)Si(t)] becomes   ∑m ∑d  f  d[D(t)X(t)] = D(t) ∆i(t)Si(t) σij(t)dWj(t) . i=1 j=1

If we further assume X(t) has the form v(t, S(t)), then the above equation becomes   ∑m 1 ∑m D(t) v (t, S(t))dt + v (t, S(t))dS (t) + v (t, S(t))dS (t)dS (t) − R(t)v(t, X(t))dt t xi i 2 xkxl k l i=1 k,l=1   ∑m ∑d ···  f  = ( )dt + D(t) vxi (t, S(t)) Si(t) σij(t)dWj(t) i=1 j=1   ∑m ∑d  f  = D(t) ∆i(t)Si(t) σij(t)dWj(t) . i=1 j=1

f Equating the coefficient of each dWj(t), we have the hedging equation

∑m ∑m ··· vxi (t, S(t))Si(t)σij(t) = ∆i(t)Si(t)σij(t), j = 1, , d. i=1 i=1 One solution of the hedging equation is

··· ∆i(t) = vxi (t, S(t)), i = 1, , m.

I Exercise 6.1. Consider the stochastic differential equation

dX(u) =

(i)

Proof. Zt = 1 is obvious. Note the form of Z is similar to that of a geometric Brownian motion. So by Itô’s formula, it is easy to obtain dZu = buZudu + σuZudWu, u ≥ t. (ii)

55 Proof. If = YuZu (u ≥ t), then Xt = YtZt = x · 1 = x and

dXu = YudZu + ZudYu + dYuZu ( ) au − σuγu γu γu = (buZudu + σuZudWu) + Zu du + dWu + σuZu du Zu Zu Zu

= [YubuZu + (au − σuγu) + σuγu]du + (σuZuYu + γu)dWu

= (buXu + au)du + (σuXu + γu)dWu.

Remark 5. To see how to find the above solution, we manipulate the equation (6.2.4)∫ as follows. First, to u − bv dv remove the term buXudu, we multiply on both sides of (6.2.4) the integrating factor e t . Then ∫ ∫ u u − bv dv − bv dv d( t ) = e t (audu + (γu + σuXu)dWu). ∫ ∫ ∫ u u u − bv dv − bv dv − bv dv Let X¯u = e t Xu, a¯u = e t au and γ¯u = e t γu, then X¯ satisfies the SDE

dX¯u =a ¯udu + (¯γu + σuX¯u)dWu = (¯audu +γ ¯udWu) + σuX¯udWu. ∫ u − σv dWv To deal with the term σuX¯udWu, we consider Xˆu = X¯ue t . Then ( ) ∫ ∫ ∫ − u σ dW − u σ dW 1 − u σ dW 2 dXˆ = e t v v [(¯a du +γ ¯ dW ) + σ X¯ dW ] + X¯ e t v v (−σ )dW + e t v v σ du u u u u u u u u u u 2 u ∫ u − σv dWv +(¯γu + σuX¯u)(−σu)e t du 1 =a ˆ du +γ ˆ dW + σ Xˆ dW − σ Xˆ dW + Xˆ σ2du − σ (ˆγ + σ Xˆ )du u u u u u u u u u 2 u u u u u u 1 = (ˆa − σ γˆ − Xˆ σ2)du +γ ˆ dW , u u u 2 u u u u ∫ ∫ ∫ u u u 1 2 − σv dWv − σv dWv σ dv where aˆu =a ¯ue t and γˆu =γ ¯ue t . Finally, use the integrating factor e t 2 v , we have ( ∫ ) ∫ ∫ 1 u σ2dv 1 u σ2dv 1 2 1 u σ2dv d Xˆ e 2 t v = e 2 t v (dXˆ + Xˆ · σ du) = e 2 t v [(ˆa − σ γˆ )du +γ ˆ dW ]. u u u 2 u u u u u u Write everything back into the original X, a and γ, we get ( ∫ ∫ ∫ ) ∫ ∫ ∫ u u 1 u 2 1 u 2 u u − bv dv− σv dWv + σ dv σ dv− σv dWv − bv dv d Xue t t 2 t v = e 2 t v t t [(au − σuγu)du + γudWu], i.e. ( ) Xu 1 d = [(au − σuγu)du + γudWu] = dYu. Zu Zu

This inspired us to try Xu = YuZu. 6.2. (i)

Proof. The portfolio is self-financing, so for any t ≤ T1, we have

dXt = ∆1(t)df(t, Rt,T1) + ∆2(t)df(t, Rt,T2) + Rt(Xt − ∆1(t)f(t, Rt,T1) − ∆2(t)f(t, Rt,T2))dt,

56 and

d(DtXt)

= −RtDtXtdt + DtdXt − = Dt[∆1(t)df((t, Rt,T1) + ∆2(t)df(t, Rt,T2) Rt(∆1(t)f(t, Rt,T1) + ∆2(t))f(t, Rt,T2))dt] 1 = D [∆ (t) f (t, R ,T )dt + f (t, R ,T )dR + f (t, R ,T )γ2(t, R )dt t 1 t t 1 r t 1 t 2 rr t 1 t ( ) 1 +∆ (t) f (t, R ,T )dt + f (t, R ,T )dR + f (t, R ,T )γ2(t, R )dt 2 t t 2 r t 2 t 2 rr t 2 t

−Rt(∆1(t)f(t, Rt,T1) + ∆2(t)f(t, Rt,T2))dt] 1 = ∆ (t)D [−R f(t, R ,T ) + f (t, R ,T ) + α(t, R )f (t, R ,T ) + γ2(t, R )f (t, R ,T )]dt 1 t t t 1 t t 1 t r t 1 2 t rr t 1 1 +∆ (t)D [−R f(t, R ,T ) + f (t, R ,T ) + α(t, R )f (t, R ,T ) + γ2(t, R )f (t, R ,T )]dt 2 t t t 2 t t 2 t r t 2 2 t rr t 2 +Dtγ(t, Rt)[Dtγ(t, Rt)[∆1(t)fr(t, Rt,T1) + ∆2(t)fr(t, Rt,T2)]]dWt

= ∆1(t)Dt[α(t, Rt) − β(t, Rt,T1)]fr(t, Rt,T1)dt + ∆2(t)Dt[α(t, Rt) − β(t, Rt,T2)]fr(t, Rt,T2)dt

+Dtγ(t, Rt)[∆1(t)fr(t, Rt,T1) + ∆2(t)fr(t, Rt,T2)]dWt.

(ii)

Proof. Let ∆1(t) = Stfr(t, Rt,T2) and ∆2(t) = −Stfr(t, Rt,T1), then

d(DtXt) = DtSt[β(t, Rt,T2) − β(t, Rt,T1)]fr(t, Rt,T1)fr(t, Rt,T2)dt

= Dt|[β(t, Rt,T1) − β(t, Rt,T2)]fr(t, Rt,T1)fr(t, Rt,T2)|dt.

Integrate from 0 to T on both sides of the above equation, we get ∫ T DT XT − D0X0 = Dt|[β(t, Rt,T1) − β(t, Rt,T2)]fr(t, Rt,T1)fr(t, Rt,T2)|dt. 0

If β(t, Rt,T1) ≠ β(t, Rt,T2) for some t ∈ [0,T ], under the assumption that fr(t, r, T ) ≠ 0 for all values of r and 0 ≤ t ≤ T , DT XT − D0X0 > 0. To avoid arbitrage (see, for example, Exercise 5.7), we must have for a.s. ω, β(t, Rt,T1) = β(t, Rt,T2), ∀t ∈ [0,T ]. This implies β(t, r, T ) does not depend on T . (iii)

Proof. In (6.9.4), let ∆1(t) = ∆(t), T1 = T and ∆2(t) = 0, we get [ ] 1 d(D X ) = ∆(t)D −R f(t, R ,T ) + f (t, R ,T ) + α(t, R )f (t, R ,T ) + γ2(t, R )f (t, R ,T ) dt t t t t t t t t r t 2 t rr t

+Dtγ(t, Rt)∆(t)fr(t, Rt,T )dWt.

This is formula (6.9.5). [ ] − 1 2 If fr(t, r, T ) = 0,{[ then d(DtXt) = ∆(t)Dt Rtf(t, Rt,T ) + ft(t, Rt,T ) +]}2 γ (t, Rt)frr(t, Rt,T ) dt. We − 1 2 choose ∆(t) = sign Rtf(t, Rt,T ) + ft(t, Rt,T ) + 2 γ (t, Rt)frr(t, Rt,T ) . To avoid arbitrage in this 1 2 case, we must have ft(t, Rt,T ) + 2 γ (t, Rt)frr(t, Rt,T ) = Rtf(t, Rt,T ), or equivalently, for any r in the 1 2 range of Rt, ft(t, r, T ) + 2 γ (t, r)frr(t, r, T ) = rf(t, r, T ). 6.3.

57 Proof. We note

[ ∫ ] ∫ ∫ d − s b dv − s b dv − s b dv e 0 v C(s, T ) = e 0 v [C(s, T )(−b ) + b C(s, T ) − 1] = −e 0 v . ds s s So integrate on both sides of the equation from t to T, we obtain ∫ ∫ ∫ T ∫ − T b dv − t b dv − s b dv e 0 v C(T,T ) − e 0 v C(t, T ) = − e 0 v ds. t ∫ ∫ ∫ ∫ ∫ t T − s T t ′ 0 bv dv 0 bv dv s bv dv Since C(T,T ) = 0, we have C(t, T ) = e t e ds = t e ds. Finally, by A (s, T ) = − 1 2 2 a(s)C(s, T ) + 2 σ (s)C (s, T ), we get ∫ ∫ T 1 T A(T,T ) − A(t, T ) = − a(s)C(s, T )ds + σ2(s)C2(s, T )ds. t 2 t ∫ T − 1 2 2 Since A(T,T ) = 0, we have A(t, T ) = t (a(s)C(s, T ) 2 σ (s)C (s, T ))ds. 6.4. (i) Proof. By the definition of φ, we have

∫ ′ 1 σ2 T C(u,T )du 1 2 1 2 φ (t) = e 2 t σ (−1)C(t, T ) = − φ(t)σ C(t, T ). 2 2

′ − 2φ (t) ′ − 1 2 So C(t, T ) = ϕ(t)σ2 . Differentiate both sides of the equation φ (t) = 2 φ(t)σ C(t, T ), we get 1 φ′′(t) = − σ2[φ′(t)C(t, T ) + φ(t)C′(t, T )] 2 1 1 = − σ2[− φ(t)σ2C2(t, T ) + φ(t)C′(t, T )] 2 2 1 1 = σ4φ(t)C2(t, T ) − σ2φ(t)C′(t, T ). 4 2

[ ] ′′ ′ 1 4 2 − ′′ 1 2 1 2 2 − 2φ (t) So C (t, T ) = 4 σ φ(t)C (t, T ) φ (t) / 2 φ(t)σ = 2 σ C (t, T ) σ2φ(t) . (ii) Proof. Plug formulas (6.9.8) and (6.9.9) into (6.5.14), we get

2φ′′(t) 1 2φ′(t) 1 − + σ2C2(t, T ) = b(−1) + σ2C2(t, T ) − 1, σ2φ(t) 2 σ2φ(t) 2

′′ − ′ − 1 2 i.e. φ (t) bφ (t) 2 σ φ(t) = 0. (iii) ′′ − ′ − 1 2 2 − − 1 2 Proof. The√ characteristic equation of φ (t)√ bφ (t) 2 σ φ(t) = 0 is λ bλ 2 σ = 0, which gives two 1  2 2 1  1 2 2 roots 2 (b b + 2σ ) = 2 b γ with γ = 2 b + 2σ . Therefore by standard theory of ordinary differential 1 bt γt −γt equations, a general solution of φ is φ(t) = e 2 (a1e + a2e ) for some constants a1 and a2. It is then easy to see that we can choose appropriate constants c1 and c2 so that

c1 −( 1 b+γ)(T −t) c2 −( 1 b−γ)(T −t) φ(t) = e 2 − e 2 . 1 1 − 2 b + γ 2 b γ

(iv)

58 ′ −( 1 b+γ)(T −t) −( 1 b−γ)(T −t) Proof. From part (iii), it is easy to see φ (t) = c1e 2 − c2e 2 . In particular, 2φ′(T ) 2(c − c ) 0 = C(T,T ) = − = − 1 2 . σ2φ(T ) σ2φ(T )

So c1 = c2. (v) Proof. We first recall the definitions and properties of sinh and cosh: ez − e−z ez + e−z sinh z = , cosh z = , (sinh z)′ = cosh z, and (cosh z)′ = sinh z. 2 2 Therefore [ ] −γ(T −t) γ(T −t) − 1 b(T −t) e e φ(t) = c e 2 − 1 1 b + γ 1 b − γ [ 2 2 ] 1 − 1 − 1 b(T −t) 2 b γ −γ(T −t) 2 b + γ γ(T −t) = c e 2 e − e 1 1 b2 − γ2 1 b2 − γ2 [4 4 ] 2c1 − 1 b(T −t) 1 −γ(T −t) 1 γ(T −t) = e 2 −( b − γ)e + ( b + γ)e σ2 2 2

2c1 − 1 b(T −t) = e 2 [b sinh(γ(T − t)) + 2γ cosh(γ(T − t))]. σ2 and

′ 1 2c1 − 1 b(T −t) φ (t) = b · e 2 [b sinh(γ(T − t)) + 2γ cosh(γ(T − t))] 2 σ2 2c1 − 1 b(T −t) 2 + e 2 [−γb cosh(γ(T − t)) − 2γ sinh(γ(T − t))] σ2 [ 2 − 1 b(T −t) b bγ bγ = 2c e 2 sinh(γ(T − t)) + cosh(γ(T − t)) − cosh(γ(T − t)) 1 2σ2 σ2 σ2 ] 2γ2 − sinh(γ(T − t)) σ2 2 − 2 − 1 b(T −t) b 4γ = 2c e 2 sinh(γ(T − t)) 1 2σ2 − 1 b(T −t) = −2c1e 2 sinh(γ(T − t)). This implies 2φ′(t) sinh(γ(T − t)) C(t, T ) = − = . 2 − 1 − σ φ(t) γ cosh(γ(T t)) + 2 b sinh(γ(T t))

(vi)

′ 2aφ′(t) Proof. By (6.5.15) and (6.9.8), A (t, T ) = σ2φ(t) . Hence ∫ T ′ − 2aφ (s) 2a φ(T ) A(T,T ) A(t, T ) = 2 ds = 2 ln , t σ φ(s) σ φ(t) and [ ] 1 b(T −t) 2a φ(T ) 2a γe 2 A(t, T ) = − ln = − ln . 2 2 − 1 − σ φ(t) σ γ cosh(γ(T t)) + 2 b sinh(γ(T t))

59 6.5. (i)

Proof. Since g(t, X1(t),X2(t)) = E[h(X1(T ),X2(T ))|Ft] and −rt −rT e f(t, X1(t),X2(t)) = E[e h(X1(T ),X2(T ))|Ft], −rt iterated conditioning argument shows g(t, X1(t),X2(t)) and e f(t, X1(t),X2(t)) ar both martingales. (ii) and (iii) Proof. We note

dg(t, X1(t),X2(t)) 1 1 = gtdt + gx1 dX1(t) + gx2 dX2(t) + gx1x2 dX1(t)dX1(t) + gx2x2 dX2(t)dX2(t) + gx1x2 dX1(t)dX2(t) [ 2 2 1 = g + g β + g β + g (γ2 + γ2 + 2ργ γ ) + g (γ γ + ργ γ + ργ γ + γ γ ) t x1 1 x2 2 2 x1x1 11 12 11 12 x1x2 11 21 11 22 12 21 12 22 ] 1 + g (γ2 + γ2 + 2ργ γ ) dt + martingale part. 2 x2x2 21 22 21 22 So we must have 1 g + g β + g β + g (γ2 + γ2 + 2ργ γ ) + g (γ γ + ργ γ + ργ γ + γ γ ) t x1 1 x2 2 2 x1x1 11 12 11 12 x1x2 11 21 11 22 12 21 12 22 1 + g (γ2 + γ2 + 2ργ γ ) = 0. 2 x2x2 21 22 21 22 Taking ρ = 0 will give part (ii) as a special case. The PDE for f can be similarly obtained. 6.6. (i)

1 bt Proof. Multiply e 2 on both sides of (6.9.15), we get ( ) 1 bt 1 bt 1 b 1 1 bt 1 d(e 2 X (t)) = e 2 X (t) bdt + (− X (t)dt + σdW (t) = e 2 σdW (t). j j 2 2 j 2 j 2 j ∫ ( ∫ ) 1 bt 1 t 1 bu − 1 bt 1 t 1 bu So e 2 Xj(t) − Xj(0) = σ e 2 dWj(u) and Xj(t) = e 2 Xj(0) + σ e 2 dWj(u) . By Theorem 2 0 2 0 ∫ − 1 −bt t 2 − 2 bt e 2 bu σ − bt 4.4.9, Xj(t) is normally distributed with mean Xj(0)e and variance 4 σ 0 e du = 4b (1 e ). (ii) ∑ d 2 Proof. Suppose R(t) = j=1 Xj (t), then ∑d dR(t) = (2Xj(t)dXj(t) + dXj(t)dXj(t)) j=1 ( ) ∑d 1 = 2X (t)dX (t) + σ2dt j j 4 j=1 ( ) ∑d 1 = −bX2(t)dt + σX (t)dW (t) + σ2dt j j j 4 j=1 ( ) d √ ∑d X (t) = σ2 − bR(t) dt + σ R(t) √j dW (t). 4 j j=1 R(t)

∑ ∫ ∑ X2(t) d t √Xj (s) d j Let B(t) = dWj(s), then B is a local martingale with dB(t)dB(t) = dt = dt. So j=1 0 R(s) √ j=1 R(t) − d 2 by Lévy’s Theorem, B is a Brownian motion. Therefore dR(t) = (a bR(t))dt + σ R(t)dB(t)(a := 4 σ ) and R is a CIR interest rate process.

60 (iii)

− 1 bt Proof. By (6.9.16), Xj(t) is dependent on Wj only and is normally distributed with mean e 2 Xj(0) and σ2 − −bt ··· variance 4b [1 e ]. So X1(t), , Xd(t) are i.i.d. normal with the same mean µ(t) and variance v(t). (iv)

Proof.

2 [ ] ∫ ∞ − (x−µ(t)) 2v(t) uX2(t) ux2 e dx E e j = e √ −∞ 2πv(t) 2 2 ∫ ∞ − (1−2uv(t))x −2µ(t)x+µ (t) e 2v(t) = √ dx −∞ 2πv(t)

2 2 2 ∫ − µ(t) µ (t) − µ (t) ∞ (x − ) + − 1 − 1 2uv(t) 1 2uv(t) (1−2uv(t))2 = √ e 2v(t)/(1−2uv(t)) dx −∞ 2πv(t) 2 2 ∫ √ µ(t) 2 µ (t)(1−2uv(t))−µ (t) ∞ x− − 1 − 2uv(t) − ( 1−2uv(t) ) e 2v(t)(1−2uv(t)) = √ e 2v(t)/(1−2uv(t)) dx · √ −∞ 2πv(t) 1 − 2uv(t) 2 − uµ (t) e 1−2uv(t) = √ . 1 − 2uv(t)

(v) ∑ d 2 ··· Proof. By R(t) = j=1 Xj (t) and the fact X1(t), , Xd(t) are i.i.d.,

− udµ2(t) e btuR(0) uR(t) uX2(t) d − d − 2a − E[e ] = (E[e 1 ]) = (1 − 2uv(t)) 2 e 1−2uv(t) = (1 − 2uv(t)) σ2 e 1−2uv(t) .

6.7. (i)

−rt e −rT + Proof. e c(t, St,Vt) = E[e (ST − K) |Ft] is a martingale by iterated conditioning argument. Since

−rt d(e [ c(t, St,Vt)) 1 = e−rt c(t, S ,V )(−r) + c (t, S ,V ) + c (t, S ,V )rS + c (t, S ,V )(a − bV ) + c (t, S ,V )V S2+ t t t t t s t t t v t t t 2 ss t t t t ] 1 c (t, S ,V )σ2V + c (t, S ,V )σV S ρ dt + martingale part, 2 vv t t t sv t t t t

− 1 2 1 2 we conclude rc = ct + rscs + cv(a bv) + 2 cssvs + 2 cvvσ v + csvσsvρ. This is equation (6.9.26). (ii)

61 Proof. Suppose c(t, s, v) = sf(t, log s, v) − e−r(T −t)Kg(t, log s, v), then

−r(T −t) −r(T −t) ct = sft(t, log s, v) − re Kg(t, log s, v) − e Kgt(t, log s, v), 1 1 c = f(t, log s, v) + sf (t, log s, v) − e−r(T −t)Kg (t, log s, v) , s s s s s −r(T −t) cv = sfv(t, log s, v) − e Kgv(t, log s, v), 1 1 1 1 c = f (t, log s, v) + f (t, log s, v) − e−r(T −t)Kg (t, log s, v) + e−r(T −t)Kg (t, log s, v) , ss s s ss s ss s2 s s2 K c = f (t, log s, v) + f (t, log s, v) − e−r(T −t) g (t, log s, v), sv v sv s sv −r(T −t) cvv = sfvv(t, log s, v) − e Kgvv(t, log s, v).

So 1 1 c + rsc + (a − bv)c + s2vc + ρσsvc + σ2vc t s v 2 ss sv 2 vv − −r(T −t) − −r(T −t) − −r(T −t) − − −r(T −t) = sft re[ Kg e Kgt + rsf + rsfs rKe] gs(+ (a bv)(sfv e Kg) v) 1 1 1 K g K + s2v − f + f − e−r(T −t) g + e−r(T −t)K s + ρσsv f + f − e−r(T −t) g 2 s s s ss s2 ss s2 v sv s sv

1 2 −r(T −t) + σ v(sfvv − e Kgvv) [2 ] [ 1 1 1 1 = s f + (r + v)f + (a − bv + ρσv)f + vf + ρσvf + σ2vf − Ke−r(T −t) g + (r − v)g t 2 s v 2 ss sv 2 vv t 2 s ] 1 1 +(a − bv)g + vg + ρσvg + σ2vg + rsf − re−r(T −t)Kg v 2 ss sv 2 vv = rc.

That is, c satisfies the PDE (6.9.26).

(iii) |F Proof. First, by Markov property, f(t, Xt,Vt) = E[1{XT ≥log K} t]. So f(T,Xt,Vt) = 1{XT ≥log K}, which implies f(T, x, v) = 1{x≥log K} for all x ∈ R, v ≥ 0. Second, f(t, Xt,Vt) is a martingale, so by differentiating f and setting the dt term as zero, we have the PDE (6.9.32) for f. Indeed, [ 1 1 df(t, X ,V ) = f (t, X ,V ) + f (t, X ,V )(r + V ) + f (t, X ,V )(a − bv + ρσV ) + f (t, X ,V )V t t t t t x t t 2 t v t t t t 2 xx t t t ] 1 + f (t, X ,V )σ2V + f (t, X ,V )σV ρ dt + martingale part. 2 vv t t t xv t t t

1 − 1 1 2 So we must have ft + (r + 2 v)fx + (a bv + ρσv)fv + 2 fxxv + 2 fvvσ v + σvρfxv = 0. This is (6.9.32).

(iv) Proof. Similar to (iii).

(v)

−r(T −t) Proof. c(T, s, v) = sf(T, log s, v) − e Kg(T, log s, v) = s1{log s≥log K} − K1{log s≥log K} = 1{s≥K}(s − K) = (s − K)+. 6.8.

62 Proof. We follow the hint. Suppose h is smooth and compactly supported, then it is legitimate to exchange integration and differentiation: ∫ ∫ ∂ ∞ ∞ gt(t, x) = h(y)p(t, T, x, y)dy = h(y)pt(t, T, x, y)dy, ∂t 0 0 ∫ ∞ gx(t, x) = h(y)px(t, T, x, y)dy, 0 ∫ ∞ gxx(t, x) = h(y)pxx(t, T, x, y)dy. 0 ∫ [ ] ∞ 1 2 So (6.9.45) implies 0 h(y) pt(t, T, x, y) + β(t, x)px(t, T, x, y) + 2 γ (t, x)pxx(t, T, x, y) dy = 0. By the ar- bitrariness of h and assuming β, pt, px, v, pxx are all continuous, we have 1 p (t, T, x, y) + β(t, x)p (t, T, x, y) + γ2(t, x)p (t, T, x, y) = 0. t x 2 xx This is (6.9.43). 6.9. Proof. We first note

′ 1 ′′ dhb(Xu) = hb(Xu)dXu + hb (Xu)dXudXu [ 2 ] 1 = h′ (X )β(u, X ) + γ2(u, X )h′′(X ) du + h′ (X )γ(u, X )dW . b u u 2 u b u b u u u Integrate on both sides of the equation, we have ∫ [ ] T − ′ 1 2 ′′ hb(XT ) hb(Xt) = hb(Xu)β(u, Xu) + γ (u, Xu)hb (Xu) du + martingale part. t 2 Take expectation on both sides, we get ∫ ∞ t,x E [hb(XT ) − hb(Xt)] = hb(y)p(t, T, x, y)dy − h(x) −∞ ∫ T 1 = Et,x[h′ (X )β(u, X ) + γ2(u, X )h′′(X )]du b u u 2 u b u ∫t ∫ [ ] T ∞ ′ 1 2 ′′ = hb(y)β(u, y) + γ (u, y)hb (y) p(t, u, x, y)dydu. t −∞ 2

Since hb vanishes outside (0, b), the integration range can be changed from (−∞, ∞) to (0, b), which gives (6.9.48). By integration-by-parts formula, we have ∫ ∫ b b ∂ β(u, y)p(t, u, x, y)h′ (y)dy = h (y)β(u, y)p(t, u, x, y)|b − h (y) (β(u, y)p(t, u, x, y))dy b b 0 b ∂y 0 ∫ 0 b ∂ = − hb(y) (β(u, y)p(t, u, x, y))dy, 0 ∂y and ∫ ∫ b b ∂ γ2(u, y)p(t, u, x, y)h′′(y)dy = − (γ2(u, y)p(t, u, x, y))h′ (y)dy b ∂y b 0 ∫ 0 b 2 ∂ 2 = (γ (u, y)p(t, u, x, y))hb(y)dy. 0 ∂y

63 Plug these formulas into (6.9.48), we get (6.9.49). Differentiate w.r.t. T on both sides of (6.9.49), we have ∫ ∫ ∫ b b b 2 ∂ − ∂ 1 ∂ 2 hb(y) p(t, T, x, y)dy = [β(T, y)p(t, T, x, y)]hb(y)dy + 2 [γ (T, y)p(t, T, x, y)]hb(y)dy, 0 ∂T 0 ∂y 2 0 ∂y that is, ∫ [ ] b 2 ∂ ∂ − 1 ∂ 2 hb(y) p(t, T, x, y) + (β(T, y)p(t, T, x, y)) 2 (γ (T, y)p(t, T, x, y)) dy = 0. 0 ∂T ∂y 2 ∂y This is (6.9.50). By (6.9.50) and the arbitrariness of hb, we conclude for any y ∈ (0, ∞), ∂ ∂ 1 ∂2 p(t, T, x, y) + (β(T, y)p(t, T, x, y)) − (γ2(T, y)p(t, T, x, y)) = 0. ∂T ∂y 2 ∂y2

6.10.

Proof. Under the assumption that limy→∞(y − K)rype(0, T, x, y) = 0, we have ∫ ∞ ∫ ∞ − − ∂ e − − e |∞ e (y K) (ryp(0, T, x, y))dy = (y K)ryp(0, T, x, y) K + ryp(0, T, x, y)dy K ∂y K ∫ ∞ = rype(0, T, x, y)dy. K If we further assume (6.9.57) and (6.9.58), then use integration-by-parts formula twice, we have ∫ ∞ 2 1 − ∂ 2 2 e (y K) 2 (σ (T, y)y p(0, T, x, y))dy 2 K ∂y [ ∫ ∞ ] 1 − ∂ 2 2 e |∞ − ∂ 2 2 e = (y K) (σ (T, y)y p(0, T, x, y)) K (σ (T, y)y p(0, T, x, y))dy 2 ∂y K ∂y 1 = − (σ2(T, y)y2pe(0, T, x, y)|∞) 2 K 1 = σ2(T,K)K2pe(0, T, x, K). 2 Therefore, ∫ ∞ −rT cT (0, T, x, K) = −rc(0, T, x, K) + e (y − K)peT (0, T, x, y)dy K ∫ ∞ ∫ ∞ −rT −rT = −re (y − K)pe(0, T, x, y)dy + e (y − K)peT (0, T, x, y)dy ∫K ∫K ∞ ∞ ∂ = −re−rT (y − K)pe(0, T, x, y)dy − e−rT (y − K) (rype(t, T, x, y))dy ∂y ∫ K K ∞ 2 −rT − 1 ∂ 2 2 e +e (y K) 2 (σ (T, y)y p(t, T, x, y))dy K 2 ∂y ∫ ∞ ∫ ∞ = −re−rT (y − K)pe(0, T, x, y)dy + e−rT rype(0, T, x, y)dy K K 1 +e−rT σ2(T,K)K2pe(0, T, x, K) 2 ∫ ∞ 1 = re−rT K pe(0, T, x, y)dy + e−rT σ2(T,K)K2pe(0, T, x, K) K 2 1 = −rKc (0, T, x, K) + σ2(T,K)K2c (0, T, x, K). K 2 KK

64 7 Exotic Options

⋆ Comments: On the PDE approach to pricing knock-out barrier options. We give some clarification to the explanation below Theorem 7.3.1 (page 301-302), where the key is that V (t) and v(t, x) differ by an indicator function and all the hassles come from this difference. More precisely, we define the first passage time ρ by following the notation of the textbook:

ρ = inf{t > 0 : S(t) = B}.

Then risk-neutral pricing gives the time-t price of the knock-out barrier option as [ ]

V (t) = Ee e−r(T −t)V (T ) F(t)

+ + where V (T ) = (S(T ) − K) 1{ρ≥T } = (S(T ) − K) 1{ρ>T }, since the event {ρ = T } ⊂ {S(T ) = B} has zero probability. [ ] Therefore, the discounted value process e−rtV (t) = Ee e−rT V (T ) F(t) is a martingale under the risk- neutral measure Pe, but we cannot say V (t) is solely a function of t and S(t), since it depends on the path property before t as well. To see this analytically, we note

{ρ > T } = {ρ > t, ρ ◦ θt > T − t},

where θt is the shift operator used for sample paths (i.e. θt(ω·) = ωt+·. See Øksendal [9, page 119] for details). Then [ ] [ ]

Ee −r(T −t) F Ee −r(T −t) − + F V (t) = e V (T ) (t) = e (S(T ) K) 1{ρ◦θt>T −t} (t) 1{ρ>t}.

Note ρ ◦ θt is solely dependent on the behavior of sample paths between time t and T . Therefore, we can apply Markov property

−r(T −t) eS(t) + V (t) = e E [(S(T − t) − K) 1{ρ>T −t}]1{ρ>t}

−r(T −t) ex + Define v(t, x) = e E [(S(T − t) − K) 1{ρ>T −t}], then V (t) = v(t, x)1{ρ>t} and v(t, x) satisfies the conditions (7.3.4)-(7.3.7) listed in Theorem 7.3.1. Indeed, it is easy to see v(t, x) satisfies all the boundary conditions (7.3.5)-(7.3.7), by the arguments in the paragraph immediately after Theorem 7.3.1. For the Black-Scholes-Merton PDE (7.3.4), we note

−rt −rt −rt −rt d[e V (t)] = d[e v(t, S(t))1{ρ>t}] = 1{ρ>t}d[e v(t, S(t))] + e v(t, S(t))d1{ρ>t}. Following the computation in equation (7.3.13), We can see the Black-Scholes-Merton equation (7.3.4) must hold for {(t, x) : 0 ≤ t < T, 0 ≤ x < B}.

7.1. (i)

1 2 √ 1 1 2 log s − 1 r σ  √  2 2 Proof. Since δ (τ, s) = σ τ [log s + (r 2 σ )τ] = σ τ + σ τ,  1 2 ∂ log s 1 − 3 ∂τ r 2 σ 1 − 1 ∂τ δ(τ, s) = (− )τ 2 + τ 2 ∂t σ [ 2 ∂t σ 2 ∂t ] 1 log s 1 r  1 σ2 √ = − √ (−1) − 2 τ(−1) 2τ σ τ σ [ ] 1 1 1 = − · √ − log ss + (r  σ2)τ) 2τ σ τ 2 1 1 = − δ(τ, ). 2τ s

65 (ii)

Proof. ( [ ]) ∂ x ∂ 1 x 1 2 1 δ(τ, ) = √ log + (r  σ )τ = √ , ∂x c ∂x σ τ c 2 xσ τ ( [ ]) ∂ c ∂ 1 c 1 2 1 δ(τ, ) = √ log + (r  σ )τ = − √ . ∂x x ∂x σ τ x 2 xσ τ

(iii) Proof. (log s+rτ)2σ2τ(log s+rτ)+ 1 σ4τ2 ′ 1 − δ(τ,s) 1 − 4 N (δ(τ, s)) = √ e 2 = √ e 2σ2τ . 2π 2π Therefore ′ 2 −rτ N (δ (τ, s)) − 2σ τ(log s+rτ) e + 2 ′ = e 2σ τ = N (δ−(τ, s)) s −rτ ′ ′ and e N (δ−(τ, s)) = sN (δ+(τ, s)). (iv)

Proof.

2 1 2 2 1 ′ (log s+rτ) −(log +rτ) σ τ(log s−log ) 2 N (δ(τ, s)) − [ s ] s − 4rτ log s2σ τ log s − 2r  − 2r  2 2 ( 2 1) log s ( 2 1) ′ − = e 2σ τ = e 2σ τ = e σ = s σ . N (δ(τ, s 1))

′ −1 ( 2r 1) ′ So N (δ(τ, s )) = s σ2 N (δ(τ, s)).

(v) [ ] [ ] √ 1 1 2 1 1 2 1 2 − − √ − √ − √ Proof. δ+(τ, s) δ (τ, s) = σ τ log s + (r + 2 σ )τ σ τ log s + (r 2 σ )τ = σ τ σ τ = σ τ. (vi) [ ] [ ] −1 1 1 2 1 −1 1 2 2 log s  −  √  − √  √ Proof. δ (τ, s) δ (τ, s ) = σ τ log s + (r 2 σ )τ σ τ log s + (r 2 σ )τ = σ τ . (vii)

2 2 ′ 1 − y ′′ 1 − y y2 ′ ′ Proof. N (y) = √ e 2 , so N (y) = √ e 2 (− ) = −yN (y). 2π 2π 2 To be continued ... 7.3. c c c σWT σ(WT −Wt) c c f f Proof. We note ST = S0e = Ste , WT − Wt = (WT − Wt) + α(T − t) is independent of Ft, c − c F supt≤u≤T ( Wt) is independent of t, and

c σMT YT = S0e c c σ supt≤u≤T Wu σMt = S0e 1{c ≤ c } + S0e 1{c c } Mt supt≤u≤T Wt Mt>supt≤u≤T Wu c −c σ supt≤u≤T (Wu Wt) = Ste 1 Y σ sup (Wc −Wc ) + Yt1 Y σ sup (Wc −Wc ) . { t ≤e t≤u≤T u t } { t ≤e t≤u≤T u t } St St

|F ST −t YT −t So E[f(ST ,YT ) t] = E[f(x , x 1 y YT −t + y1 y YT −t )], where x = St, y = Yt. Therefore S0 S0 { ≤ } { ≤ } x S0 x S0 E[f(ST ,YT )|Ft] is a Borel function of (St,Yt).

66 7.4. Proof. By Cauchy’s inequality and the monotonicity of Y , we have ∑m ∑m | − − | ≤ | − || − | (Ytj Ytj−1 )(Stj Stj−1 ) Ytj Ytj−1 Stj Stj−1 j=1 vj=1 v u u u∑m u∑m ≤ t − 2t − 2 (Ytj Ytj−1 ) (Stj Stj−1 ) j=1 j=1 v u √ u∑m ≤ | − | − t − 2 max Yt Yt − (YT Y0) (St St − ) . 1≤j≤m j j 1 j j 1 j=1

If we increase the number of partition points to infinity and let the length of the longest subinterval √∑ √ m max ≤ ≤ |t − t − | approach zero, then (S − S )2 → [S] − [S] < ∞ and max ≤ ≤ |Y − 1 j m j j 1 j=1 t∑j tj−1 T 0 1 j m tj | → m − − → Ytj−1 0 a.s. by the continuity of Y . This implies j=1(Ytj Ytj−1 )(Stj Stj−1 ) 0.

8 American Derivative Securities

⋆ Comments: ∫ t − Justification of Definition 8.3.1. For any given stopping time τ, let Ct = 0 (K S(u))d1{τ≤u}, then ∫ ∞ −rτ −rt e (K − S(τ))1{τ<∞} = e dCt. 0 Regarding C as a cumulative cash flow, valuation of put option via (8.3.2) is justified by the risk-neutral valuation of a cash flow via (5.6.10).

8.1.

′ 2r x − 2r −1 1 2r ′ ′ − − σ2 − − − Proof. vL(L+) = (K L)( σ2 )( L ) L = σ2L (K L). So vL(L+) = vL(L ) if and only if x=L − 2r − − 2rK σ2L (K L) = 1. Solve for L, we get L = 2r+σ2 . 8.2. ≥ − + ≥ − + − ′ − Proof. By the calculation in Section 8.3.3, we can see v2(x) (K2 x) (K1 x) , rv2(x) rxv2(x) 1 2 2 ′′ ≥ ≥ ≤ 2 σ x v2 (x) 0 for all x 0, and for 0 x < L1∗ < L2∗, 1 rv (x) − rxv′ (x) − σ2x2v′′(x) = rK > rK > 0. 2 2 2 2 2 1 + + So the linear complementarity conditions for v2 imply v2(x) = (K2 − x) = K2 − x > K1 − x = (K1 − x) on [0,L1∗]. Hence v2(x) does not satisfy the third linear complementarity condition for v1: for each x ≥ 0, equality holds in either (8.8.1) or (8.8.2) or both. 8.3. (i) Proof. Suppose x takes its values in a domain bounded away from 0. By the general theory of linear differential equations, if we can find two linearly independent solutions v1(x), v2(x) of (8.8.4), then any solution of (8.8.4) can be represented in the form of C1v1 +C2v2 where C1 and C2 are constants. So it suffices to find two linearly independent special solutions of (8.8.4). Assume v(x) = xp for some constant p to be p − − 1 2 − − − 1 2 − determined, (8.8.4) yields x (r pr 2 σ p(p 1)) = 0. Solve the quadratic equation 0 = r pr 2 σ p(p 1) = 1 2r − 2r − 2 − − − σ2 ( 2 σ p r)(p 1), we get p = 1 or σ2 . So a general solution of (8.8.4) has the form C1x + C2x .

67 (ii)

Proof. Assume there is an interval [x1, x2] where 0 < x1 < x2 < ∞, such that v(x) ̸≡ 0 satisfies (8.3.19) with equality on [x1, x2] and satisfies (8.3.18) with equality for x at and immediately to the left of x1 and − 2r for x at and immediately to the right of x2, then we can find some C1 and C2, so that v(x) = C1x + C2x σ2 ′ on [x1, x2]. If for some x0 ∈ [x1, x2], v(x0) = v (x0) = 0, by the uniqueness of the solution of (8.8.4), we would conclude v ≡ 0. This is a contradiction. So such an x0 cannot exist. This implies 0 < x1 < x2 < K + ′ + 9 (if K ≤ x2, v(x2) = (K − x2) = 0 and v (x2)=the right derivative of (K − x) at x2, which is 0). Thus we have four equations for C1 and C2:  − 2r  σ2 − C1x1 + C2x1 = K x1  − 2r σ2 − C1x2 + C2x2 = K x2 − 2r −  2 1  − 2r σ − C1 σ2 C2x1 = 1  − 2r −1 − 2r σ2 − C1 σ2 C2x2 = 1.

Since x1 ≠ x2, the last two equations imply C2 = 0. Plug C2 = 0 into the first two equations, we have K−x1 K−x2 C1 = = ; plug C2 = 0 into the last two equations, we have C1 = −1. Combined, we would have x1 x2 x1 = x2. Contradiction. Therefore our initial assumption is incorrect, and the only solution v that satisfies the specified conditions in the problem is the zero solution. (iii)

Proof. If in a right neighborhood of 0, v satisfies (8.3.19) with equality, then part (i) implies v(x) = C1x + − 2r + C2x σ2 for some constants C1 and C2. Then v(0) = limx↓0 v(x) = 0 < (K − 0) , i.e. (8.3.18) will be − ′ − 1 2 2 ′′ violated. So we must have rv rxv 2 σ x v > 0 in a right neighborhood of 0. According to (8.3.20), v(x) = (K − x)+ near o. So v(0) = K. We have thus concluded simultaneously that v cannot satisfy (8.3.19) with equality near 0 and v(0) = K, starting from first principles (8.3.18)-(8.3.20). (iv) Proof. This is already shown in our solution of part (iii): near 0, v cannot satisfy (8.3.19) with equality. (v) Proof. If v satisfy (K − x)+ with equality for all x ≥ 0, then v cannot have a continuous derivative as stated in the problem. This is a contradiction. (vi)

+ − 2r Proof. By the result of part (i), we can start with v(x) = (K − x) on [0, x1] and v(x) = C1x + C2x σ2 on ′ + [x1, ∞). By the assumption of the problem, both v and v are continuous. Since (K −x) is not differentiable at K, we must have x1 ≤ K.This gives us the equations   − 2r − − + σ2 K x1 = (K x1) = C1x1 + C2x1 − 2r −  2 1 − − 2r σ 1 = C1 σ2 C2x1 .

Because v is assumed to be bounded, we must have C1 = 0 and the above equations only have two unknowns: C2 and x1. Solve them for C2 and x1, we are done. 8.4. (i) Proof. This is already shown in part (i) of Exercise 8.3.

9 Note we have interpreted the condition “v(x) satisfies (8.3.18) with equality for x at and immediately to the right of x2” + ′ + as “v(x2) = (K − x2) and v (x2) =the right derivative of (K − x) at x2.” This is weaker than “v(x) = (K − x) in a right neighborhood of x2.”

68 (ii)

Proof. We solve for A, B the equations { − 2r AL σ2 + BL = K − L 2r − 2r −1 − σ2 − σ2 AL + B = 1,

2r σ2KL σ2 2rK − and we obtain A = σ2+2r , B = L(σ2+2r) 1. (iii)

Proof. By (8.8.5), B > 0. So for x ≥ K, f(x) ≥ BK > 0 = (K − x)+. If L ≤ x < K,

2r 2 2 + σ KL σ − 2r 2rKx − − σ2 − f(x) (K x) = 2 x + 2 K σ + 2r [ L(σ + 2r) ] 2r 2 x 2r +1 2 x 2r KL σ2 σ + 2r( ) σ2 − (σ + 2r)( ) σ2 − 2r L L = x σ2 . (σ2 + 2r)L

2r +1 2r ′ 2r 2r 2 σ2 − 2 σ2 ≥ σ2 − 2 Let g(θ) = σ + 2rθ (σ + 2r)θ with θ 1. Then g(1) = 0 and g (θ) = 2r( σ2 + 1)θ (σ + 2r 2r −1 2r 2r −1 σ2 2 σ2 − ≥ ≥ ≥ ≥ − + 2r) σ2 θ = σ2 (σ + 2r)θ (θ 1) 0. So g(θ) 0 for any θ 1. This shows f(x) (K x) for L ≤ x < K. Combined, we get f(x) ≥ (K − x)+ for all x ≥ L. (iv)

x − 2r Proof. Since lim →∞ v(x) = lim →∞ f(x) = ∞ and lim →∞ v (x) = lim →∞(K − L∗)( ) σ2 = 0, v(x) x x x L∗ x L∗ ≥ − + ≥ − ′ − and vL∗ (x) are different. By part (iii), v(x) (K x) . So v satisfies (8.3.18). For x L, rv rxv 1 2 2 ′′ − − 1 2 2 ′′ ≤ ≤ − ′ − 1 2 2 ′′ − 2 σ x v = rf rxf 2 σ x f = 0. For 0 x L, rv rxv 2 σ x v = r(K x)+rx = rK. Combined, − ′ − 1 2 2 ′′ ≥ ≥ rv rxv 2 σ x v 0 for x 0. So v satisfies (8.3.19). Along the way, we also showed v satisfies (8.3.20). In summary, v satisfies the linear complementarity condition (8.3.18)-(8.3.20), but v is not the function vL∗ given by (8.3.13).

(v)

2rK 2rK − 2r − σ2 Proof. By part (ii), B = 0 if and only if L(σ2+2r) 1 = 0, i.e. L = 2r+σ2 . In this case, v(x) = Ax = σ2K x − 2r x − 2r σ2 − σ2 ∞ σ2+2r ( L ) = (K L)( L ) = vL∗ (x), on the interval [L, ). e − − 8.5. The difficulty of the dividend-paying case is that from Lemma 8.3.4, we can only obtain E[e (r a)τL ], e − not E[e rτL ]. So we have to start from Theorem 8.3.2. (i)

f − − 1 2 σWt+(r a 2 σ )t −f − 1 − − Proof. By (8.8.9), St = S0e . Assume S0 = x, then St = L if and only if Wt σ (r a 1 2 1 x 2 σ )t = σ log L . By Theorem 8.3.2, [ √ ] − 1 x 1 − − 1 2 1 − − 1 2 2 e −rτ log (r a σ )+ 2 (r a σ ) +2r E[e L ] = e σ L σ 2 σ 2 . √ − − x − 1 − − 1 2 1 1 − − 1 2 e rτL γ log L x γ If we set γ = σ2 (r a 2 σ ) + σ σ2 (r a σ2 ) + 2r, we can write E[e ] as e = ( L ) . So the risk-neutral expected discounted pay off of this strategy is { K − x, 0 ≤ x ≤ L v (x) = L − x −γ (K L)( L ) , x > L.

69 (ii)

∂ − x −γ − γ(K−L) ∂ γK Proof. ∂L vL(x) = ( L ) (1 L ). Set ∂L vL(x) = 0 and solve for L∗, we have L∗ = γ+1 . (iii)

Proof. By Itô’s formula, we have [ ] [ ] 1 d e−rtv (S ) = e−rt −rv (S ) + v′ (S )(r − a)S + v′′ (S )σ2S2 dt + e−rtv′ (S )σS dWf . L∗ t L∗ t L∗ t t 2 L∗ t t L∗ t t t

If x > L∗,

− ′ − 1 ′′ 2 2 rvL∗ (x) + vL∗ (x)(r a)x + vL∗ (x)σ x ( ) 2 −γ −γ−1 −γ−2 − − x − − − x 1 2 2 − − − − x = r(K L∗) + (r a)x(K L∗)( γ) −γ + σ x ( γ)( γ 1)(K L∗) −γ L∗ L∗ 2 L∗ ( )−γ [ ] x 1 2 = (K − L∗) −r − (r − a)γ + σ γ(γ + 1) . L∗ 2

− − 1 2 By the definition of γ, if we define u = r a 2 σ , we have 1 r + (r − a)γ − σ2γ(γ + 1) 2 1 1 = r − σ2γ2 + γ(r − a − σ2) 2 2 ( √ )2 ( √ ) 1 u 1 u2 u 1 u2 = r − σ2 + + 2r + + + 2r u 2 σ2 σ σ2 σ2 σ σ2 ( √ ( )) √ 1 u2 2u u2 1 u2 u2 u u2 = r − σ2 + + 2r + + 2r + + + 2r 2 σ4 σ3 σ2 σ2 σ2 σ2 σ σ2 √ ( ) √ u2 u u2 1 u2 u2 u u2 = r − − + 2r − + 2r + + + 2r 2σ2 σ σ2 2 σ2 σ2 σ σ2 = 0.

′ 1 ′′ 2 2 ∗ − − − − − − − If x < L , rvL∗ (x) + vL∗ (x)(r a)x + 2 vL∗ (x)σ x = r(K x) + ( 1)(r a)x = rK + ax. Combined, we get [ ] −rt − −rt − −rt ′ f d e vL∗ (St) = e 1{St

70 √ 1 − 1 1 − 1 2 σ (θ 2 σ) + σ (θ 2 σ) + 2r. We have

r(γ + 1) − aγ < 0 ⇐⇒ (r − a)γ + r < 0 [ √ ] 1 1 1 1 ⇐⇒ (r − a) (θ − σ) + (θ − σ)2 + 2r + r < 0 σ 2 σ 2 √ 1 1 ⇐⇒ θ(θ − σ) + θ (θ − σ)2 + 2r + r < 0 √ 2 2 1 1 ⇐⇒ θ (θ − σ)2 + 2r < −r − θ(θ − σ)(< 0) 2 2 1 1 1 ⇐⇒ θ2[(θ − σ)2 + 2r] > r2 + θ2(θ − σ)2 + 2θr(θ − σ2) 2 2 2 ⇐⇒ 0 > r2 − θrσ2 ⇐⇒ 0 > r − θσ2.

2 Since θσ < 0, we have obtained a contradiction. So our initial assumption is incorrect, and rK − aL∗ ≥ 0 must be true.

(iv) Proof. The proof is similar to that of Corollary 8.3.6. Note the only properties used in the proof of Corollary − − ∧ rt rt τL∗ ∧ ≥ − + 8.3.6 are that e vL∗ (St) is a supermartingale, e vL∗ (St τL∗ ) is a martingale, and vL∗ (x) (K x) . ≥ − + Part (iii) already proved the supermartingale-martingale property, so it suffices to show vL∗ (x) (K x) ≥ γK ≥ − + ≤ in our problem. Indeed, by γ 0, L∗ = γ+1 < K. For x K > L∗, vL∗ (x) > 0 = (K x) ; for 0 x < L∗, − − + ≤ ≤ vL∗ (x) = K x = (K x) ; finally, for L∗ x K,

−γ−1 −γ−1 d x L∗ γK 1 (v (x) − (K − x)) = −γ(K − L∗) + 1 ≥ −γ(K − L∗) + 1 = −γ(K − ) + 1 = 0. dx L∗ −γ −γ γ + 1 γK L∗ L∗ γ+1

− − | ≤ ≤ − − + ≥ and (vL∗ (x) (K x)) x=L∗ = 0. So for L∗ x K, vL∗ (x) (K x) 0. Combined, we have ≥ − + ≥ ≥ vL∗ (x) (K x) 0 for all x 0. 8.6.

−rt + Proof. By Lemma 8.5.1, Xt = e (St − K) is a submartingale. For any τ ∈ Γ0,T , Theorem 8.8.1 implies

e −rT + e −rτ∧T + −rτ + −rτ + E[e (ST − K) ] ≥ E[e (Sτ∧T − K) ] ≥ E[e (Sτ − K) 1{τ<∞}] = E[e (Sτ − K) ],

−rτ + e −rT where we take the convention that e (Sτ −K) = 0 when τ = ∞. Since τ is arbitrarily chosen, E[e (ST − + ≥ e −rτ − + ≤ ∈ K) ] maxτ∈Γ0,T E[e (Sτ K) ]. The other direction “ ” is trivial since T Γ0,T . 8.7.

Proof. Suppose λ ∈ [0, 1] and 0 ≤ x1 ≤ x2, we have f((1 − λ)x1 + λx2) ≤ (1 − λ)f(x1) + λf(x2) ≤ (1 − λ)h(x1) + λh(x2). Similarly, g((1 − λ)x1 + λx2) ≤ (1 − λ)h(x1) + λh(x2). So

h((1 − λ)x1 + λx2) = max{f((1 − λ)x1 + λx2), g((1 − λ)x1 + λx2)} ≤ (1 − λ)h(x1) + λh(x2).

That is, h is also convex.

71 9 Change of Numéraire

⋆ Comments: 1) To provide an intuition for change of numéraire, we give a summary of results for change of numéraire in discrete case. This summary is based on Shiryaev [13]. Consider a model of financial market (B,e B,¯ S) as in Delbaen and Schachermayer [2] Definition 2.1.1 or Shiryaev [13, page 383]. Here Be and B¯ are both one-dimensional while S could be a vector price process. Suppose Be and B¯ are both strictly positive, then both of them can be chosen as numéaire. Several results hold under this model. First, no-arbitrage and completeness properties of market are independent of the choice of numéraire (see, for example, Shiryaev [13, page 413, 481]). e Second, if the market is arbitrage-free,( then) corresponding( ) to B (resp. B¯), there is an equivalent proba- e bility measure Pe (resp. P¯), such that B¯ , S (resp. B , S ) is a martingale under Pe (resp. P¯). Be Be B¯ B¯ Third, if the market is both arbitrage-free and complete, we have the relation

B¯ 1 dP¯ = T [ ]dPe. (6) e B¯0 BT E e B0

See Shiryaev [13, page 510], formula (12). Finally, if fT is a European contingent claim with maturity N and the market is both arbitrage-free and complete, then 10 [ ] [ ]

¯ E¯ fT F e Ee fT F Bt ¯ t = Bt e t . BT BT

That is, the risk-neutral price of fT is independent of the choice of numéraire. See Shiryaev [13], Chapter VI, §1b.2.11 2) The above theoretical results can be applied to market involving foreign money market account. We consider the following market: a domestic money market account M (M0 = 1), a foreign money market f f account M (M0 = 1), a (vector) asset price process S denominated in domestic currency called stock. Suppose the domestic vs. foreign currency exchange rate is Q. Note Q is not a traded asset. Denominated by domestic currency, the traded assets are (M,M f Q, S), where M f Q can( be seen as) the price process f of one unit foreign currency. Domestic risk-neutral measure Pe is such that M Q , S is a Pe-martingale. ( ) M M Denominated by foreign currency, the traded assets are M f , M , S . Foreign risk-neutral measure Pef is ( ) Q Q M S Pef such that QM f , QM f is a -martingale. This is a change of numéraire in the market denominated by domestic currency, from M to M f Q. If we assume the market is arbitrage-free and complete, the foreign

10If the market is incomplete but the contingent claim is still replicable, this result still holds for t = 0. Indeed,[ ] in Shiryaev ≥ ∗ Ee fN [13], Chapter V §1c.2, it is easy to generalize formula (12) to the case of N 1 with x = supeP∈P P B0, x∗ = [ ] ( ) BN e fN ∗ π π π π infe E B0, C (P) = inf{x ≥ 0 : ∃π ∈ SF, x = x, x ≥ fN }, and C∗(P) = sup{x ≥ 0 : ∃π ∈ SF, x = x, x ≤ fN }. P∈P(P) BN 0 N 0 N ∗ No-arbitrage implies C∗ ≤ C . Since ( ) Xπ Xπ ∑N S N = 0 + γ ∆ k , B B k B N 0 k=1 k ∗ ∗ ∗ we must have C∗ ≤ x∗ ≤ x ≤ C . The replicability of fN implies C ≤ C∗. So, if the market is incomplete but the contingent ∗ ∗ claim is still replicable, we still have C∗ = x∗ = x = C , i.e. the risk-neutral pricing formula still holds for t = 0. 11 The invariance of risk-neutral price gives us a mnemonics to memorize formula (6): take fT = 1AB¯T with A ∈ FT and set t = 0, we have [ ] B¯ ¯ P¯ e Ee T B0 (A) = B0 e ; A . BT ¯ ¯ ¯ e BT /B0 So the Radon-Nikodým derivative is dP/dP = e e . This leads to formula (9.2.6) in Shreve [14, page 378]. The textbook’s BT /B0 chapter summary also provides a good way to memorize: the Radon-Nikodým derivative “is the numéraire itself, discounted in order to be a martingale and normalized by its initial condition in order to have expected value 1”.

72 risk-neutral measure is given by

Q M f Q D M f Pef T[ T ] Pe T T T Pe d = f d = d Q0M0 Q0 MT E M0 F on T . For a European contingent claim fT[denominated ] in domestic currency, its payoff in foreign currency Df f Eef T T F is fT /QT . Therefore its foreign price is f t . Convert this price into domestic currency, we have [ ] Dt QT Df f Eef T T F Pef Pe F Qt f t . Use the relation between and on T and the Bayes formula, we get Dt QT [ ] [ ] f ef DT fT e DT fT QtE Ft = E Ft . f D Dt QT t

The RHS is exactly the price of fT in domestic market if we apply risk-neutral pricing. 3) Alternative proof of Theorem 9.4.2 (Black-Scholes-Merton option pricing with random interest rate). By (9.4.7), V (t) = B(t, T )EeT [V (T )|F(t)], where V (T ) = (S(T ) − K)+ with { } 1 S(T ) = For (T,T ) = For (t, T ) exp σ[WfT (T ) − WfT (t)] − σ2(T − t) . S S 2

Then the valuation of EeT [V (T )|F(t)] is like the Black-Scholes-Merton model for stock options, where interest rate is 0 and the underlying has price ForS(t, T ) at time t. So its value is

eT E [V (T )|F(t)] = ForS(t, T )N(d+(t)) − KN(d−(t)) [ ] 1 ForS (t,T ) 1 2 where d(t) = √ log  σ (T − t) . Therefore σ T −t K 2

V (t) = B(t, T )[ForS(t, T )N(d+) − KN(d−)] = S(t)N(d+(t)) − KB(t, T )N(d−(t)).

9.1. (i) Proof. For any 0 ≤ t ≤ T , by Lemma 5.5.2, [ ] [ ] M (T ) M (T ) M (T ) E[M (T )|F ] M (t) (M2) 1 2 1 1 t 1 E Ft = E Ft = = . M2(T ) M2(t) M2(T ) M2(t) M2(t)

M (t) So 1 is a martingale under P M2 . M2(t) (ii)

e(N) (M2) Proof. Let M1(t) = DtSt and M2(t) = DtNt/N0. Then P as defined in (9.2.6) is P as defined in M1(t) St e(N) (N) St Remark 9.2.5. Hence = N0 is a martingale under P , which implies S = is a martingale M2(t) Nt t Nt under Pe(N). 9.2. (i)

− − f 1 2 1 1 −νWt−(r− ν )t Proof. Since Nt = N0 e 2 , we have

−1 −1 −νWf −(r− 1 ν2)t f 1 2 1 2 −1 c d(N ) = N e t 2 [−νdW − (r − ν )dt + ν dt] = N (−νdW − rdt). t 0 t 2 2 t t

(ii)

73 Proof. ( ) ( ) c 1 1 1 c c c c c dMt = Mtd + dMt + d dMt = Mt(−νdWt − rdt) + rMtdt = −νMtdWt. Nt Nt Nt Remark: This can also be obtained directly from Theorem 9.2.2. (iii) Proof. ( ) ( ) ( ) X 1 1 1 dXb = d t = X d + dX + d dX t N t N N t N t t ( t ) t t ( ) 1 1 1 = (∆ S + Γ M )d + (∆ dS + Γ dM ) + d (∆ dS + Γ dM ) t t t t N N t t t t N t t t t [ ( ) t t( ) ] [ ( t ) ( ) ] 1 1 1 1 1 1 = ∆t Std + dSt + d dSt + Γt Mtd + dMt + d dMt Nt Nt Nt Nt Nt Nt b c = ∆tdSt + ΓtdMt.

9.3. To avoid singular cases, we need to assume −1 < ρ < 1. (i)

f 1 2 νW3(t)+(r− ν )t Proof. Nt = N0e 2 . So

−1 −1 −νWf (t)−(r− 1 ν2)t dN = d(N e 3 2 ) t 0 [ ] −1 −νWf (t)−(r− 1 ν2)t f 1 2 1 2 = N e 3 2 −νdW (t) − (r − ν )dt + ν dt 0 3 2 2 −1 − f − − 2 = Nt [ νdW3(t) (r ν )dt], and

(N) −1 −1 −1 dSt = Nt dSt + StdNt + dStdNt −1 f −1 − f − − 2 = Nt (rStdt + σStdW1(t)) + StNt [ νdW3(t) (r ν )dt] (N) f (N) − f − − 2 − (N) = St (rdt + σdW1(t)) + St [ νdW3(t) (r ν )dt] σSt ρdt (N) 2 − (N) f − f = St (ν σρ)dt + St (σdW1(t) νdW3(t)). √ 2 − 2 f σ f − ν f f Define γ = σ 2ρσν + ν and W4(t) = γ W1(t) γ W3(t), then W4 is a martingale with quadratic variation σ2 σν ν2 [Wf ] = t − 2 ρt + t = t. 4 t γ2 γ2 r2 √ f (N) 2 − 2 By Lévy’s Theorem, W4 is a BM and therefore, St has volatility γ = σ 2ρσν + ν . (ii)

f √−ρ f √ 1 f f Proof. This problem is the same as Exercise 4.13, we define W2(t) = W1(t) + W3(t), then W2 1−ρ2 1−ρ2 is a martingale, with

( )2 ( ) 2 2 f 2 ρ f 1 f ρ 1 2ρ (dW2(t)) = −√ dW1(t) + √ dW3(t) = + − dt = dt, 1 − ρ2 1 − ρ2 1 − ρ2 1 − ρ2 1 − ρ2

f f √ ρ √ ρ f f and dW2(t)dW1(t) = − dt + dt = 0. So W2 is a BM independent of W1, and dNt = rNtdt + 1−ρ2 1√−ρ2 f f 2 f νNtdW3(t) = rNtdt + νNt[ρdW1(t) + 1 − ρ dW2(t)].

74 (iii) e f f Proof. Under P , (W1, W2) is a two-dimensional BM, and  ( )  f  f · dW1(t) dSt = rStdt + σStdW1(t) = rStdt + St(σ, 0) f dW2(t) ( )  √ dWf (t)  f − 2 · 1 dNt = rNtdt + νNtdW3(t) = rNtdt + Nt(νρ, ν 1 ρ ) f . dW2(t) √ e So under P , the volatility vector for S is (σ, 0), and the volatility vector for N is (νρ, ν√1 − ρ2). By Theorem e(N) (N) 2 9.2.2, under the measure P , the volatility vector for S is (v1, v2) = (σ − νρ, −ν 1 − ρ . In particular, the volatility of S(N) is √ √ √ √ 2 2 − 2 − − 2 2 2 − 2 v1 + v2 = (σ νρ) + ( ν 1 ρ ) = σ 2νρσ + ν , consistent with the result of part (i). 9.4. ∫ ∫ f f t f t − 1 2 σ2(s)dW3(s)+ (Rs σ2 (s))ds Proof. From (9.3.15), we have Mt Qt = M0 Q0e 0 0 2 . So f ∫ ∫ D f − − t f − t − 1 2 t 1 0 σ2(s)dW3(s) 0 (Rs 2 σ2 (s))ds = D0 Q0 e Qt and ( ) f f f Dt Dt − f − − 1 2 1 2 Dt − f − − 2 d = [ σ2(t)dW3(t) (Rt σ2(t))dt + σ2(t)dt] = [ σ2(t)dW3(t) (Rt σ2(t))dt]. Qt Qt 2 2 Qt To get (9.3.22), we note ( ) ( ) ( ) f f f f MtDt Dt Dt Dt d = Mtd + dMt + dMtd Qt Qt Qt Qt f f MtDt − f − − 2 RtMtDt = [ σ2(t)dW3(t) (Rt σ2(t))dt] + dt Qt Qt f −MtDt f − 2 = (σ2(t)dW3(t) σ2(t)dt) Qt f −MtDt ff = σ2(t)dW3 (t). Qt To get (9.3.23), we note ( ) ( ) ( ) f f f f Dt St Dt Dt Dt d = dSt + Std + dStd Qt Qt Qt Qt f f Dt f StDt − f − − 2 = St(Rtdt + σ1(t)dW1(t)) + [ σ2(t)dW3(t) (Rt σ2(t))dt] Qt Qt f f Dt f +Stσ1(t)dW1(t) (−σ2(t))dW3(t) Qt f Dt St f − f 2 − = [σ1(t)dW1(t) σ2(t)dW3(t) + σ2(t)dt σ1(t)σ2(t)ρtdt] Qt f Dt St ff − ff = [σ1(t)dW1 (t) σ2dW3 (t)]. Qt

75 9.5.

Proof. We combine the solutions of all the sub-problems into a single solution as follows. The payoff of a quanto call is ( ST − K)+ units of domestic currency at time T . By risk-neutral pricing formula, its price at QT e −r(T −t) ST + St e time t is E[e ( − K) |Ft]. So we need to find the SDE for under risk-neutral measure P . By QT Qt f 1 2 σ1W1(t)+(r− σ )t formula (9.3.14) and (9.3.16), we have St = S0e 2 1 and √ f f 1 2 f 2 f f 1 2 σ2W3(t)+(r−r − σ )t σ2ρW1(t)+σ2 1−ρ W2(t)+(r−r − σ )t Qt = Q0e 2 2 = Q0e 2 2 . √ S S (σ −σ ρ)Wf (t)−σ 1−ρ2Wf (t)+(rf + 1 σ2− 1 σ2)t So t = 0 e 1 2 1 2 2 2 2 2 1 . Define Qt Q0 √ √ √ σ − σ ρ σ 1 − ρ2 − 2 2 − 2 2 − 2 f 1 2 f − 2 f σ4 = (σ1 σ2ρ) + σ2(1 ρ ) = σ1 2ρσ1σ2 + σ2 and W4(t) = W1(t) W2(t). σ4 σ4

2 2 f f (σ1−σ2ρ) σ2(1−ρ ) f e Then W4 is a martingale with [W4]t = 2 t + 2 t + t. So W4 is a Brownian motion under P . So σ4 σ4 − f − 2 if we set a = r r + ρσ1σ2 σ2, we have ( ) S S f 1 2 S S t 0 σ4W4(t)+(r−a− σ )t t t f = e 2 4 and d = [σ4dW4(t) + (r − a)dt]. Qt Q0 Qt Qt

Therefore, under Pe, St behaves like dividend-paying stock and the price of the quanto call option is like the Qt price of a call option on a dividend-paying stock. Thus formula (5.5.12) gives us the desired price formula for quanto call option.

9.6. (i) √ √ 1 2 Proof. d (t) − d−(t) = √ σ (T − t) = σ T − t. So d−(t) = d (t) − σ T − t. + σ T −t + (ii)

2 ForS (t,T ) Proof. d (t) + d−(t) = √ log . So + σ T −t K

2 2 ForS(t, T ) d (t) − d (t) = (d (t) + d−(t))(d (t) − d−(t)) = 2 log . + − + + K

(iii) Proof.

2 2 2 2 2 −d (t)/2 −d−(t) −d (t)/2 d (t)/2−d−(t)/2 ForS(t, T )e + − Ke = e + [ForS(t, T ) − Ke + ]

−d2 (t)/2 log ForS (t,T ) = e + [ForS(t, T ) − Ke K ] = 0.

(iv)

76 Proof.

dd (t) + [ ] 1√ √ For (t, T ) 1 1 dFor (t, T ) (dFor (t, T ))2 1 − 3 S 2 − √ S − S − = 1σ (T t) [log + σ (T t)]dt + 2 σdt 2 K 2 σ T − t ForS(t, T ) 2ForS(t, T ) 2 1 For (t, T ) σ 1 1 1 = √ log S dt + √ dt + √ (σdWfT (t) − σ2dt − σ2dt) 2σ (T − t)3 K 4 T − t σ T − t 2 2 1 For (t, T ) 3σ dWfT (t) = log S dt − √ dt + √ . 2σ(T − t)3/2 K 4 T − t T − t

(v) √ σdt Proof. dd−(t) = dd (t) − d(σ T − t) = dd (t) + √ . + + 2 T −t (vi)

2 2 dt Proof. By (iv) and (v), (dd−(t)) = (dd+(t)) = T −t . (vii) Proof. 1 dN(d (t)) = N ′(d (t))dd (t) + N ′′(d (t))(dd (t))2 + + + 2 + + 2 2 1 − d+(t) 1 1 − d+(t) dt = √ e 2 dd+(t) + √ e 2 (−d+(t)) . 2π 2 2π T − t

(viii)

Proof.

′ 1 ′′ 2 dN(d−(t)) = N (d−(t))dd−(t) + N (d−(t))(dd−(t)) 2 2 ( ) d−(t) 2 − 1 − d−(t) σdt 1 e 2 dt = √ e 2 dd+(t) + √ + √ (−d−(t)) 2π 2 T − t 2 2π T − t √ 2 − − 2 d−(t)(σ T t d+(t)) −d−(t)/2 − 1 2 σe e 2 −d−(t)/2 = √ e dd+(t) + √ dt + √ dt 2π 2 2π(T − t) 2(T − t) 2π 2 2 d−(t) −d−(t)/2 − 1 2 σe d (t)e 2 −d−(t)/2 + = √ e dd+(t) + √ dt − √ dt. 2π 2π(T − t) 2(T − t) 2π

(ix)

Proof.

− 2 − 2 d+(t)/2 d+(t)/2 fT e 1 cT σForS(t, T )e dForS(t, T )dN(d+(t)) = σForS(t, T )dW (t) √ √ dW (t) = √ dt. 2π T − t 2π(T − t)

77 (x)

Proof.

ForS(t, T )dN(d+(t)) + dForS(t, T )dN(d+(t)) − KdN(d−(t)) [ ] − 2 d+(t)/2 1 −d2 (t)/2 d+(t) −d2 (t)/2 σForS(t, T )e = ForS(t, T ) √ e + dd+(t) − √ e + dt + √ dt 2π 2(T − t) 2π 2π(T − t) [ ] −d2 (t)/2 e − σ 2 d (t) 2 −d−(t)/2 + −d−(t)/2 −K √ dd+(t) + √ e dt − √ e dt 2π 2π(T − t) 2(T − t) 2π [ ] − 2 − 2 d+(t)/2 d−(t)/2 ForS(t, T )d+(t) −d2 (t)/2 σForS(t, T )e Kσe Kd+(t) −d2 (t)/2 = √ e + + √ − √ − √ e − dt 2(T − t) 2π 2π(T − t) 2π(T − t) 2(T − t) 2π ( ) 1 2 2 −d (t)/2 −d−(t)/2 +√ ForS(t, T )e + − Ke dd+(t) 2π = 0.

− 2 (t,T ) − 2 d−(t)/2 ForS d+(t)/2 The last “=” comes from (iii), which implies e = K e .

10 Term-Structure Models

⋆ Comments: 1) Computation of eΛt (Lemma 10.2.3). For a systematic treatment of the computation of matrix exponential function eΛt, see 丁同仁等 [3, page 175] or Arnold et al. [1, page 221]. For the sake of Lemma 10.2.3, a direct computation to find (10.2.35) goes[ as follows.] [ ] λ1 0 0 0 n i) The case of λ1 = λ2. In this case, set Γ1 = and Γ2 = . Then Λ = Γ1 +Γ2, Γ2 = 02×2 [ ] 0 λ2 λ21 0 n ≥ n λ1 0 for n 2, and Γ1 = n . Note Γ1 and Γ2 commute, we therefore have 0 λ2 [ ]( [ ]) [ ] [ ] λ1t λ1t λ1t Λt Γ t Γ t e 0 0 0 e 0 e 0 1 · 2 × e = e e = λ2t I2 2 + = λ2t λ2t = λ1t λ2t . 0 e λ21t 0 λ21te e λ21te e

ii) The case of λ1 ≠ λ2. In this case, Λ has two distinct eigenvalues λ1 and λ2. So it can be diagonalized, that is, we can find a matrix P such that [ ] λ 0 P −1ΛP = J = 1 . 0 λ2

Solving the matrix equation P Λ = PJ, we have [ ] 1 0 P = , λ21 1 λ1−λ2 and consequently [ ] 1 0 P −1 = . − λ21 1 λ1−λ2 Therefore ( ) [ ] [ ] ∞ − ∞ ∑ n 1 ∑ n λ1t λ1t Λt PJ P n · J n · −1 e 0 −1 (e ) 0 e = t = P t P = P λ2t P = λ21 λ1t λ2t λ2t . n! n! 0 e − e − e e n=0 n=0 λ1 λ2

78 2) Intuition of Theorem 10.4.1 (Price of backset LIBOR). The intuition here is the same as that of a discrete-time model: the payoff δL(T,T ) = (1 + δL(T,T )) − 1 at time T + δ is equivalent to an investment of 1 at time T (which gives the payoff 1 + δL(T,T ) at time T + δ), subtracting a payoff of 1 at time T + δ. The first part has time t price B(t, T ) while the second part has time t price B(t, T + δ). Pricing under (T + δ)-forward measure: Using the (T + δ)-forward measure, the time-t no-arbitrage price of a payoff δL(T,T ) at time (T + δ) is

B(t, T + δ)EeT +δ[δL(T,T )] = B(t, T + δ) · δL(t, T ), ( ) 1 B(t,T ) − ≤ ≤ where we have used the observation that the forward rate process L(t, T ) = δ B(t,T +δ) 1 (0 t T ) is a martingale under the (T + δ)-forward measure.

I Exercise 10.1 (Statistics in the two-factor Vasicek model). According to Example 4.7.3, Y1(t) and Y2(t) in (10.2.43)-(10.2.46) are Gaussian processes. (i) Show that e −λ1t EY1(t) = e Y1(0), (10.7.1)

that when λ1 ≠ λ2, then λ ( ) e 21 −λ1t −λ2t −λ2t EY2(t) = e − e Y1(0) + e Y2(0), (10.7.2) λ1 − λ2

and when λ1 = λ2, then e −λ1t −λ1t EY2(t) = −λ21te Y1(0) + e Y2(0). (10.7.3) We can write e −λ1t Y1(t) − EY1(t) = e I1(t),

when λ1 ≠ λ2, λ ( ) e 21 −λ1t −λ2t −λ2t Y2(t) − EY2(t) = e I1(t) − e I2(t) − e I3(t), λ1 − λ2

and when λ1 = λ2,

e −λ1t −λ1t −λ1t Y2(t) − EY2(t) = −λ21te I1(t) + λ21e I4(t) + e I3(t),

where the Itô integrals ∫ ∫ t t λ1u f λ2u f I1(t) = e dW1(u),I2(t) = e dW1(u), ∫0 ∫0 t t λ2u f λ1u f I3(t) = e dW2(u),I4(t) = ue dW1(u), 0 0

all have expectation zero under the risk-neutral measure Pe. Consequently, we can determine the variances of Y1(t) and Y2(t) and the covariance of Y1(t) and Y2(t) under the risk-neutral measure from the variances and covariances of Ij(t) and Ik(t). For example, if λ1 = λ2, then

− 2λ1tEe 2 Var(Y1(t)) = e I1 (t), − − − 2 2 2λ1tEe 2 2 2λ1tEe 2 2λ1tEe 2 Var(Y2(t)) = λ21t e I1 (t) + λ21e I4 (t) + e I3 (t) − − − 2 2λ1tEe − 2λ1tEe 2λ21te [I1(t)I4(t)] 2λ21te [I1(t)I3(t)]

−2λ1t e +2λ21e E[I4(t)I3(t)], − − − − 2λ1tEe 2 2λ1tEe 2λ1tEe Cov(Y1(t),Y2(t)) = λ21te I1 (t) + λ21e [I1(t)I4(t)] + e [I1(t)I3(t)],

where the variances and covariance above are under the risk-neutral measure Pe.

79 Proof. Using the notation I1(t), I2(t), I3(t) and I4(t) introduced in the problem, we can write Y1(t) and Y2(t) as −λ1t −λ1t Y1(t) = e Y1(0) + e I1(t) and Y (t) {2 [ ] λ21 −λ1t −λ2t −λ2t λ21 −λ1t −λ2t −λ2t − (e − e )Y1(0) + e Y2(0) + − e I1(t) − e I2(t) − e I3(t), λ1 ≠ λ2; = λ1 λ2 [ λ1 λ2 ] −λ1t −λ1t −λ1t −λ1t −λ1t −λ21te Y1(0) + e Y2(0) − λ21 te I1(t) − e I4(t) + e I3(t), λ1 = λ2.

Since all the Ik(t)’s (k = 1, ··· , 4) are normally distributed with zero mean, we can conclude

e −λ1t E[Y1(t)] = e Y1(0) and { − − − λ21 λ1t − λ2t λ2t ̸ e λ −λ (e e )Y1(0) + e Y2(0), if λ1 = λ2; E[Y2(t)] = 1 2 −λ1t −λ1t −λ21te Y1(0) + e Y2(0), if λ1 = λ2.

(ii) Compute the five terms Ee 2 Ee Ee Ee Ee 2 I1 (t), [I1(t)I2(t)], [I1(t)I3(t)], [I1(t)I4(t)], [I4 (t)]. The five other terms, which you are not being asked to compute, are 1 ( ) E 2 2λ2t − I2 (t) = e 1 , 2λ2 E[I (t)I (t)] = 0, 2 3 ( ) t 1 E (λ1+λ2)t − (λ1+λ2)t [I2(t)I4(t)] = e + 2 1 e , λ1 + λ2 (λ1 + λ2) 1 ( ) E 2 2λ2t − I3 (t) = e 1 , λ2 E[I3(t)I4(t)] = 0.

Solution. The calculation relies on the following fact: if Xt and Yt are both martingales, then XtYt − [X,Y ]t e e is also a martingale. In particular, E[XtYt] = E{[X,Y ]t}. Thus ∫ ∫ t e2λ1t − 1 t e(λ1+λ2)t − 1 Ee 2 2λ1u Ee (λ1+λ2)u [I1 (t)] = e du = , [I1(t)I2(t)] = e du = , 0 2λ1 0 λ1 + λ2 ∫ [ ] t 1 e2λ1t − 1 e e 2λ1u 2λ1t E[I1(t)I3(t)] = 0, E[I1(t)I4(t)] = ue du = te − 0 2λ1 2λ1 and ∫ t t2e2λ1t te2λ1t e2λ1t − 1 Ee 2 2 2λ1u − [I4 (t)] = u e du = 2 + 3 . 0 2λ1 2λ1 4λ1

(iii) Some derivative securities involve time spread (i.e., they depend on the interest rate at two different times). In such cases, we are interested in the joint statistics of the factor processes at different times. These are still jointly normal and depend on the statistics of the Itô integral Ij at different times. Compute e E[I1(s)I2(t)], where 0 ≤ s < t. (Hint: Fix s ≥ 0 and define ∫ t λ1u f J1(t) = e 1{u≤s}dW1(u), 0

where 1{u≤s} is the function of u that is 1 if u ≤ s and 0 if u > s. Note that J1(t) = I1(s) when t ≥ s.)

80 Solution. Following the hint, we have ∫ t e(λ1+λ2)s − 1 e e (λ1+λ2)u E[I1(s)I2(t)] = E[J1(t)I2(t)] = e 1{u≤s}du = . 0 λ1 + λ2

I Exercise 10.2 (Ordinary differential equations for the mixed affine-yield model). In the mixed model of Subsection 10.2.3, as in the two-factor Cox-Ingersoll-Ross model, zero-coupon bond prices have the affine-yield form −y1C1(T −t)−y2C2(T −t)−A(T −t) f(t, y1, y2) = e , where C1(0) = C2(0) = A(0) = 0.

(i) Find the partial differential equation satisfied by f(t, y1, y2). [ ∫ ] T e − Rsds Solution. Assume B(t, T ) = E e t Ft = f(t, Y1(t),Y2(t)). Then

d(D(t)B(t, T )) = D(t)[−R(t)f(t, Y1(t),Y2(t))dt + df(t, Y1(t),Y2(t))].

By Itô’s formula,

− − df(t, Y1(t),Y2(t)) = [ft(t, Y1(t),Y2(t)) + fy1 (t, Y1(t),Y2(t))(µ λ1Y1(t)) + fy2 (t, Y1(t),Y2(t))( λ2)Y2(t) 1 +fy1y2 (t, Y1(t),Y2(t))σ21Y1(t) + fy1y1 (t, Y1(t),Y2(t))Y1(t) 2 ] 1 + f (t, Y (t),Y (t))(σ2 Y (t) + α + βY (t)) dt + martingale part. 2 y2y2 1 2 21 1 1

Since D(t)B(t, T ) is a martingale, we must have [ ] ∂ ∂ ∂ −(δ + δ y + δ y ) + + (µ − λ y ) − λ y f 0 1 1 2 2 ∂t 1 1 ∂y 2 2 ∂y [ ( 1 2 )] 2 2 2 1 ∂ ∂ 2 ∂ + 2σ21y1 + y1 2 + (σ21y1 + α + βy1) 2 f 2 ∂y1∂y2 ∂y1 ∂y2 = 0.

12 (ii) Show that C1, C2, and A satisfy the system of ordinary differential equations 1 1 C′ = −λ C − C2 − σ C C − (σ2 + β)C2 + δ , (10.7.4) 1 1 1 2 1 21 1 2 2 21 2 1 ′ − C2 = λ2C2 + δ2, (10.7.5) 1 A′ = µC − αC2 + δ . (10.7.6) 1 2 2 0 12 2 − 1 2 − The textbook has a typo in formula (10.7.4): the coefficient of C2 should be 2 (σ21 + β) instead of (1 + β). See http://www.math.cmu.edu/∼shreve/ for details.

81 −y1C1(T −t)−y2C2(T −t)−A(T −t) Proof. If we suppose f(t, y1, y2) = e , then ∂f = [y C′ (T − t) + y C′ (T − t) + A′(T − t)]f ∂t 1 1 2 2 ∂f = −C1(T − t)f ∂y1 ∂f = −C2(T − t)f ∂y2 ∂2f = C1(T − t)C2(T − t)f ∂y1∂y2 2 ∂ f 2 − 2 = C1 (T t)f ∂y1 2 ∂ f 2 − 2 = C2 (T t)f. ∂y2 So the PDE in part (i) becomes − ′ ′ ′ − − (δ0 + δ1y1 + δ2y2) + y1C1 + y2C2 + A (µ λ1y1)C1 + λ2y2C2 1 [ ] + 2σ y C C + y C2 + (σ2 y + α + βy )C2 2 21 1 1 2 1 1 21 1 1 2 = 0.

Sorting out the LHS according to the independent variables y1 and y2, we get  − ′ 1 2 1 2 2  δ1 + C1 + λ1C1 + σ21C1C2 + 2 C1 + 2 (σ21 + β)C2 = 0 −δ + C′ + λ C = 0  2 2 2 2 − ′ − 1 2 δ0 + A µC1 + 2 αC2 = 0.

In other words, we can obtain the ODEs for C1,C2 and A as follows   ′ − − − 1 2 − 1 2 2 C1 = λ1C1 σ21C1C2 2 C1 2 (σ21 + β)C2 + δ1 C′ = −λ C + δ  2 2 2 2 ′ − 1 2 A = µC1 2 αC2 + δ0.

I Exericse 10.3 (Calibration of the two-factor Vasicek model). Consider the canonical two-factor Vasicek model (10.2.4), (10.2.5), but replace the interest rate equation (10.2.6) by

R(t) = δ0(t) + δ1Y1(t) + δ2Y2(t), (10.7.7)

where δ1 and δ2 are constant but δ0(t) is a nonrandom function of time. Assume that for each T there is a zero-coupon bond maturing at time T . The price of this bond at time t ∈ [0,T ] is [ ∫ ] e − T R(u)du B(t, T ) = E e t F(t) .

Because the pair of processes (Y1(t),Y2(t)) is Markov, there must exist some function f(t, T, y1, y2) such that B(t, T ) = f(t, T, Y1(t),Y2(t)). (We indicate the dependence of f on the maturity T because, unlike in Subsection 10.2.1, here we shall consider more than one value of T .)

(i) The function f(t, T, y1, y2) is of the affine-yield form

−y1C1(t,T )−y2C2(t,T )−A(t,T ) f(t, T, y1, y2) = e . (10.7.8)

d d d Holding T fixed, derive a system of ordinary differential equations for dt C1(t, T ), dt C2(t, T ), and dt A(t, T ).

82 Solution. We have d(DtB(t, T )) = Dt[−Rtf(t, T, Y1(t),Y2(t))dt + df(t, T, Y1(t),Y2(t))] and

df(t, T, Y1(t),Y2(t)) − − − = [ft(t, T, Y1(t),Y2(t)) + fy1 (t, T, Y1(t),Y2(t))( λ1Y1(t)) + fy2 (t, T, Y1(t),Y2(t))( λ21Y1(t) λ2Y2(t)) 1 1 + f (t, T, Y (t),Y (t)) + f (t, T, Y (t),Y (t))]dt + martingale part. 2 y1y1 1 2 2 y2y2 1 2

Since DtB(t, T ) is a martingale under risk-neutral measure, we have the following PDE: [ ] 2 − ∂ − ∂ − ∂ 1 ∂ 1 ∂ (δ0(t) + δ1y1 + δ2y2) + λ1y1 (λ21y1 + λ2y2) + 2 + 2 f(t, T, y1, y2) = 0. ∂t ∂y1 ∂y2 2 ∂y1 2 ∂y2

−y1C1(t,T )−y2C2(t,T )−A(t,T ) Suppose f(t, T, y1, y2) = e , then  [ ] d d d ft(t, T, y1, y2) = −y1 C1(t, T ) − y2 C2(t, T ) − A(t, T ) f(t, T, y1, y2),  dt dt dt f (t, T, y , y ) = −C (t, T )f(t, T, y , y ),  y1 1 2 1 1 2  − fy2 (t, T, y1, y2) = C2(t, T )f(t, T, y1, y2),  fy1y2 (t, T, y1, y2) = C1(t, T )C2(t, T )f(t, T, y1, y2),  f (t, T, y , y ) = C2(t, T )f(t, T, y , y ),  y1y1 1 2 1 1 2 2 fy2y2 (t, T, y1, y2) = C2 (t, T )f(t, T, y1, y2). So the PDE becomes ( ) d d d −(δ (t) + δ y + δ y ) + −y C (t, T ) − y C (t, T ) − A(t, T ) + λ y C (t, T ) 0 1 1 2 2 1 dt 1 2 dt 2 dt 1 1 1 1 1 +(λ y + λ y )C (t, T ) + C2(t, T ) + C2(t, T ) = 0. 21 1 2 2 2 2 1 2 2

Sorting out the terms according to independent variables y1 and y2, we get  − − d 1 2 1 2  δ0(t) dt A(t, T ) + 2 C1 (t, T ) + 2 C2 (t, T ) = 0 −δ − d C (t, T ) + λ C (t, T ) + λ C (t, T ) = 0  1 dt 1 1 1 21 2 − − d δ2 dt C2(t, T ) + λ2C2(t, T ) = 0. That is   d −  dt C1(t, T ) = λ1C1(t, T ) + λ21C2(t, T ) δ1 d C (t, T ) = λ C (t, T ) − δ  dt 2 2 2 2 d 1 2 1 2 − dt A(t, T ) = 2 C1 (t, T ) + 2 C2 (t, T ) δ0(t).

(ii) Using the terminal conditions C1(T,T ) = C2(T,T ) = 0, solve the equations in (i) for C1(t, T ) and C2(t, T ). (As in Subsection 10.2.1, the functions C1 and C2 depend on t and T only through the difference τ = T − t; however, the function A discussed in part (iii) below depends on t and T separately.) − − Solution. For C , we note d [e λ2tC (t, T )] = −e λ2tδ from the ODE in (i). Integrate from t to T , we have 2 ∫ dt 2 2 ( ) T −λ2t −λ2s δ2 −λ2T −λ2t δ2 −λ2(T −t) 0 − e C2(t, T ) = −δ2 e ds = (e − e ). So C2(t, T ) = 1 − e . For C1, we note t λ2 λ2 d λ δ −λ1t −λ1t 21 2 −λ1t −λ2T +(λ2−λ1)t −λ1t (e C1(t, T )) = (λ21C2(t, T ) − δ1)e = (e − e ) − δ1e . dt λ2 Integrate from t to T , we get − − λ1t {e C1(t, T ) (λ2−λ1)T (λ2−λ1)t λ21δ2 −λ1T −λ1t λ21δ2 −λ2T e −e δ1 −λ1T −λ1T − (e − e ) − e − + (e − e ) if λ1 ≠ λ2 = λ2λ1 λ2 λ2 λ1 λ1 λ21δ2 −λ1T −λ1t λ21δ2 −λ2T δ1 −λ1T −λ1T − (e − e ) − e (T − t) + (e − e ) if λ1 = λ2. λ2λ1 λ2 λ1

83 So { − − −λ1(T −t)− −λ2(T −t) − − λ21δ2 λ1(T t) − λ21δ2 e e − δ1 λ1(T t) − ̸ λ λ (e 1) + λ λ −λ λ (e 1) if λ1 = λ2 C1(t, T ) = 2 1 2 2 1 1 . λ21δ2 −λ1(T −t) λ21δ2 −λ2T +λ1t δ1 −λ1(T −t) (e − 1) + e (T − t) − (e − 1) if λ1 = λ2. λ2λ1 λ2 λ1

(iii) Using the terminal condition A(T,T ) = 0, write a formula for A(t, T ) as an integral involving C1(u, T ), C2(u, T ), and δ0(u). You do not need to evaluate this integral. d 1 2 2 − Solution. From the ODE dt A(t, T ) = 2 (C1 (t, T ) + C2 (t, T )) δ0(t), we get ∫ [ ] T − 1 2 2 A(t, T ) = δ0(s) (C1 (s, T ) + C2 (s, T )) ds. t 2

(iv) Assume that the model parameters λ1 > 0 λ2 > 0, λ21, δ1, and δ2 and the initial conditions Y1(0) and Y2(0) are given. We wish to choose a function δ0 so that the zero-coupon bond prices given by the model match the bond prices given by the market at the initial time zero. In other words, we want to choose a function δ(T ), T ≥ 0, so that f(0,T,Y1(0),Y2(0)) = B(0,T ),T ≥ 0. ∂ In this part of the exercise, we regard both t and T as variables and uses the notation ∂t to indicate the ∂ derivative with respect to t when T is held fixed and the notation ∂T to indicate the derivative with respect to ∂ T when t is held fixed. Give a formula for δ0(T ) in terms of ∂T log B(0,T ) and the model parameters. (Hint: ∂ Compute ∂T A(0,T ) in two ways, using (10.7.8) and also using the formula obtained in (iii). Because Ci(t, T ) depends only on t and T through τ = T −t, there are functions Ci(τ) such that Ci(τ) = Ci(T −t) = Ci(t, T ), i = 1, 2. Then ∂ ′ ∂ ′ C (t, T ) = −C (τ), C (t, T ) = C (τ), ∂t i i ∂T i i where ′ denotes differentiation with respect to τ. This shows that ∂ ∂ C (t, T ) = − C (t, T ), i = 1, 2, (10.7.9) ∂T i ∂t i a fact that you will need.)

−Y1(0)C1(0,T )−Y2(0)C2(0,T )−A(0,T ) Solution. We want to find δ0 so that f(0,T,Y1(0),Y2(0)) = e = B(0,T ) for all T > 0. Take logarithm on both sides and plug in the expression of A(t, T ), we get ∫ [ ] T − − 1 2 2 − log B(0,T ) = Y1(0)C1(0,T ) Y2(0)C2(0,T ) + (C1 (s, T ) + C2 (s, T )) δ0(s) ds. 0 2 Taking derivative w.r.t. T , we have ∂ ∂ ∂ 1 1 log B(0,T ) = −Y (0) C (0,T ) − Y (0) C (0,T ) + C2(T,T ) + C2(T,T ) − δ (T ). ∂T 1 ∂T 1 2 ∂T 2 2 1 2 2 0 Therefore ∂ ∂ ∂ δ0(T ) = −Y1(0) C1(0,T ) − Y2(0) C2(0,T ) − log B(0,T ) { ∂T[ ∂T ] ∂T −λ1T λ21δ2 −λ2T −λ2T ∂ −Y1(0) δ1e − e − Y2(0)δ2e − log B(0,T ) if λ1 ≠ λ2 = [ λ2 ] ∂T − − − − λ1T − λ2T − λ2T − ∂ Y1(0) δ1e λ21δ2e T Y2(0)δ2e ∂T log B(0,T ) if λ1 = λ2.

84 I Exercise 10.4. Hull and White [89] propose the two-factor model e dU(t) = −λ1U(t)dt + σ1dB2(t), (10.7.10) e dR(t) = [θ(t) + U(t) − λ2R(t)]dt + σ2dB1(t), (10.7.11) e e where λ1, λ2, σ1, and σ2 are positive constants, θ(t) is a nonrandom function, and B1(t) and B2(t) are e e correlated Brownian motions with dB1(t)dB2(t) = ρdt for some ρ ∈ (−1, 1). In this exercise, we discuss how to reduce this to the two-factor Vasicek model of Subsection 10.2.1, except that, instead of (10.2.6), the interest rate is given by (10.7.7), in which δ0(t) is a nonrandom function of time. (i) Define [ ] [ ] [ ] U(t) λ 0 σ 0 X(t) = ,K = 1 , Σ = 1 R(t) −1 λ2 0 σ2 [ ] [ ] e 0 e B1(t) Θ(t) = , B(t) = e , θ(t) B2(t) so that (10.7.10) and (10.7.11) can be written in vector notation as

dX(t) = Θ(t)dt − KX(t)dt + ΣdBe(t). (10.7.12)

Now set ∫ t Xb(t) = X(t) − e−Kt eKuΘ(u)du. 0 Show that dXb(t) = −KXb(t)dt + ΣdBe(t). (10.7.13)

Proof. ∫ t dXb(t) = dX(t) + Ke−Kt eKuΘ(u)dudt − Θ(t)dt 0 ∫ t = −KX(t)dt + ΣdBe(t) + Ke−Kt eKuΘ(u)dudt 0 = −KXb(t)dt + ΣdBe(t).

b Remark 6. The definition of X is motivated by the( observation) that Y has homogeneous dt term in (10.2.4)- (10.2.5) and eKt is the integrating factor for X: d eKtX(t) = eKt (dX(t) − KX(t)dt).

(ii) With [ ] 1 0 σ1 C = − √ρ √1 , 2 2 σ1 1−ρ σ2 1−ρ b f e f f define Y (t) = CX(t), W (t) = CΣB(t). Show that the components of W1(t) and W2(t) are independent Brownian motions and dY (t) = −ΛY (t)dt + dWf(t), (10.7.14) where [ ] − λ1 0 1 − − Λ = CKC = ρσ2(λ√2 λ1) σ1 λ . 2 2 σ2 1−ρ Equation (10.7.14) is the vector form of the canonical two-factor Vasicek equations (10.2.4) and (10.2.5).

85 Proof. ( ) ( ) ( ) 1 0 1 0 f e σ1 σ1 0 e e W (t) = CΣB(t) = − √ρ √1 B(t) = −√ ρ √ 1 B(t). 2 2 0 σ2 2 2 σ1 1−ρ σ2 1−ρ 1−ρ 1−ρ

So Wf is a martingale with ⟨Wf1⟩(t) = ⟨Be1⟩(t) = t, ⟨ ⟩ ρ 1 ρ2t t ρ ρ2 + 1 − 2ρ2 ⟨Wf2⟩(t) = −√ Be1 + √ Be2 (t) = + − 2 ρt = t = t, 1 − ρ2 1 − ρ2 1 − ρ2 1 − ρ2 1 − ρ2 1 − ρ2 and ⟨ ⟩ ρ 1 ρt ρt ⟨Wf1, Wf2⟩(t) = Be1, −√ Be1 + √ Be2 (t) = −√ + √ = 0. 1 − ρ2 1 − ρ2 1 − ρ2 1 − ρ2

f b b e −1 Therefore W is a two-dimensional BM. Moreover, dYt = CdXt = −CKXtdt + CΣdBt = −CKC Ytdt + f f dWt = −ΛYtdt + dWt, where   ( ) ( ) 1 √1 0 0 − 2 −1 σ1 λ1 0 1  σ2 1 ρ  Λ = CKC = √ρ √1 · − −1 λ | | √ρ 1 σ 1−ρ2 σ 1−ρ2 2 C 2 σ 1 2 σ1 1−ρ 1 ( ) ( ) λ1 0 σ1 σ1 √0 = ρλ λ − √ 1 − √1 √ 2 − 2 − 2 − 2 − 2 ρσ2 σ2 1 ρ ( σ1 1 ρ σ2 1)ρ σ2 1 ρ λ1 0 − − = ρσ2(λ√2 λ1) σ1 λ . 2 2 σ2 1−ρ

(iii) Obtain a formula for R(t) of the form (10.7.7). What are δ0(t), δ1, and δ2? Solution. ∫ ∫ t t b −Kt Ku −1 −Kt Ku X(t) = X(t) + e e Θ(u)du = C Yt + e e Θ(u)du ( 0 )( ) ∫ 0 t σ1 √0 Y1(t) −Kt Ku = 2 + e e Θ(u)du ρσ2 σ2 1 − ρ Y2(t) ( ) 0∫ σ Y (t) t = 1 1√ + e−Kt eKuΘ(u)du. − 2 ρσ2Y1(t) + σ2 1 ρ Y2(t) 0

So √ 2 R(t) = X2(t) = ρσ2Y1(t) + σ2 1 − ρ Y2(t) + δ0(t), ∫ −Kt t Ku where δ0(t) is the second coordinate of e 0 e Θ(u)du and can be derived explicitly by Lemma 10.2.3. Note ∫ ∫ [ ][ ] t t λ1(u−t) −Kt Ku e 0 0 e e Θ(u)du = − du ∗ eλ2(u t) θ(u) 0 ∫0 [ ] t 0 = du λ2(u−t) 0 θ(u)e ∫ √ − t λ2t λ2u − 2 Then δ0(t) = e 0 e θ(u)du, δ1 = ρσ2, and δ2 = σ2 1 ρ .

86 I Exercise 10.5 (Correlation between long rate and short rate in the one-factor Vasicek model). The one-factor Vasicek model is the one-factor Hull-White model of Example 6.5.1 with constant parameters,

dR(t) = (a − bR(t))dt + σdWf(t), (10.7.15)

where a, b, and σ are positive constants and Wf(t) is a one-dimensional Brownian motion. In this model, the price at time t ∈ [0,T ] of the zero-coupon bond maturing at time T is

B(t, T ) = e−C(t,T )R(t)−A(t,T ),

where C(t, T ) and A(t, T ) are given by (6.5.10) and (6.5.11): ∫ T ∫ ( ) − s bdv 1 −b(T −t) C(t, T ) = e t ds = 1 − e , b ∫t ( ) T 1 A(t, T ) = aC(s, t) − σ2C2(s, T ) ds t 2 ( ) ( ) 2ab − σ2 σ2 − ab σ2 = (T − t) + 1 − e−b(T −t) − 1 − e−2b(T −t) . 2b2 b3 4b3 In the spirit of the discussion of the short rate and the long rate in Subsection 10.2.1, we fix a positive relative maturity τ¯ and define the long rate L(t) at time t by (10.2.30): 1 L(t) = − log B(t, t +τ ¯). τ¯

Show that changes in L(t) and R(t) are perfectly correlated (i.e., for any 0 ≤ t1 < t2, the correlation coefficient between L(t2) − L(t1) and R(t2) − R(t1) is one). This characteristic of one-factor models caused the development of models with more than one factor. Proof. We note C(t, T ) and A(t, T ) are dependent only on T − t. So C(t, t +τ ¯) and A(t, t +τ ¯) are constants when τ¯ is fixed. So d B(t, t +τ ¯)[−C(t, t +τ ¯)R′(t) − A(t, t +τ ¯)] L(t) = − dt τB¯ (t, t +τ ¯) 1 = [C(t, t +τ ¯)R′(t) + A(t, t +τ ¯)] τ¯ 1 = [C(0, τ¯)R′(t) + A(0, τ¯)]. τ¯ − 1 − 1 − Integrating from t1 to t2 on both sides, we have L(t2) L(t1) = τ¯ C(0, τ¯)[R(t2) R(t1)] + τ¯ A(0, τ¯)(t2 t1). Since L(t2) − L(t1) is a linear transformation of R(t2) − R(t1), their correlation is 1.

I Exercise 10.6 (Degenerate two-factor Vasicek model). In the discussion of short rates and long rates in the two-factor Vasicek model of Subsection 10.2.1, we made the assumptions that δ2 ≠ 0 and (λ1 − λ2)δ1 + λ21δ2 ≠ 0 (see Lemma 10.2.2). In this exercise, we show that if either of these conditions is violated, the two-factor Vasicek model reduces to a one-factor model, for which long rates and short rates are perfectly correlated (see Exercise 10.5).

(i) Show that if δ2 = 0 (and δ0 > 0, δ1 > 0), then the short rate R(t) given by the system of equations (10.2.4)-(10.2.6) satisfies the one-dimensional stochastic differential equation f dR(t) = (a − bR(t))dt + dW1(t). (10.7.16)

Define a and b in terms of the parameters in (10.2.4)-(10.2.6).

87 Proof. If δ2 = 0, then ( ) f dR(t) = δ1 −λ1Y1(t)dt + dW1(t) [( ) ] δ0 R(t) f = δ1 − λ1dt + dW1(t) δ1 δ1 f = (δ0λ1 − λ1R(t))dt + δ1dW1(t).

So a = δ0λ1 and b = λ1. − 2 2 ̸ (ii) Show that if (λ1 λ2)δ1 + λ21δ2 = 0 (and δ0 > 0, δ1 + δ2 = 0), then the short rate R(t) given by the system of equations (10.2.4)-(10.2.6) satisfies the one-dimensional stochastic differential equation

dR(t) = (a − bR(t))dt + σdBe(t). (10.7.17)

Define a and b in terms of the parameters in (10.2.4)-(10.2.6) and define the Brownian motion Be(t) in terms f f of the independent Brownian motions W1(t) and W2(t) in (10.2.4) and (10.2.5). Proof.

dR(t) = δ1dY1(t) + δ2dY2(t) f f = −δ1λ1Y1(t)dt + λ1dW1(t) − δ2λ21Y1(t)dt − δ2λ2Y2(t)dt + δ2dW2(t) f f = −Y1(t)(δ1λ1 + δ2λ21)dt − δ2λ2Y2(t)dt + δ1dW1(t) + δ2dW2(t) f f = −Y1(t)λ2δ1dt − δ2λ2Y2(t)dt + δ1dW1(t) + δ2dW2(t) f f = −λ2(Y1(t)δ1 + Y2(t)δ2)dt + δ1dW1(t) + δ2dW2(t) √ [ ] δ δ = −λ (R − δ )dt + δ2 + δ2 √ 1 dWf (t) + √ 2 dWf (t) . 2 t 0 1 2 2 2 1 2 2 2 δ1 + δ2 δ1 + δ2 √ 2 2 e √ δ1 f √ δ2 f So a = λ2δ0, b = λ2, σ = δ1 + δ2 and B(t) = 2 2 W1(t) + 2 2 W2(t). δ1 +δ2 δ1 +δ2

I Exercise 10.7 (Forward measure in the two-factor Vasicek model). Fix a maturity T > 0. In the two-factor Vasicek model of Subsection 10.2.1, consider the T -forward measure PeT of Definition 9.4.1: ∫ 1 PeT (A) = D(T )dPe for all A ∈ F. B(0,T ) A

PeT fT fT (i) Show that the two-dimensional -Brownian motion W1 (t), W2 (t) of (9.2.5) are ∫ t fT − f Wj (t) = Cj(T u)du + Wj(t), j = 1, 2, (10.7.18) 0

where C1(τ) and C2(τ) are given by (10.2.26)-(10.2.28). Proof. We use the canonical form of the model as in formulas (10.2.4)-(10.2.6). By (10.2.20),

dB(t, T ) = df(t, Y1(t),Y2(t)) − − − − − − = de Y1(t)C1(T t) Y2(t)C2(T t) A(T t) ··· − − f − − f = ( )dt + B(t, T )[ C1(T t)dW1(t) C2(T( t)dW)2(t)] f ··· − − − − dW1(t) = ( )dt + B(t, T )( C1(T t), C2(T t)) f . dW2(t) ∫ Pe − − − − fT t − So the volatility vector of B(t, T ) under is ( C1(T t), C2(T t)). By (9.2.5), Wj (t) = 0 Cj(T f eT u)du + Wj(t)(j = 1, 2) form a two-dimensional P −BM.

88 (ii) Consider a call option on a bond maturing at time T¯ > T . The call expires at time T and has strike price K. Show that at time zero the risk-neutral price of this option is [ ] ( )+ − ¯− − ¯− − ¯− B(0,T )EeT e C1(T T )Y1(T ) C2(T T )Y2(T ) A(T T ) − K . (10.7.19)

Proof. Under the T -forward measure PeT , the numeraire is B(t, T ). By risk-neutral pricing, at time zero the risk-neutral price V (0) of the option satisfies [ ] ( )+ V (0) 1 − ¯− − ¯− − ¯− = EeT e C1(T T )Y1(T ) C2(T T )Y2(T ) A(T T ) − K . B(0,T ) B(T,T )

Note B(T,T ) = 1, we get (10.7.19).

(iii) Show that, under the T -forward measure PeT , the term

X = −C1(T¯ − T )Y1(T ) − C2(T¯ − T )Y2(T ) − A(T¯ − T ) appearing in the exponent in (10.7.19) is normally distributed. Proof. Using (10.7.18), we can rewrite (10.2.4) and (10.2.5) as { − fT − − dY1(t) = λ1Y1(t)dt + dW1 (t) C1(T t)dt − − fT − − dY2(t) = λ21Y1(t)dt λ2Y2(t)dt + dW2 (t) C2(T t)dt.

Then { ∫ ∫ − t − f t − Y (t) = Y (0)e λ1t + eλ1(s t)dW T (s) − C (T − s)eλ1(s t)ds 1 1 ∫0 1 ∫0 1 ∫ − t − t − t − λ2t − λ2(s t) λ2(s t) f − − λ2(s t) Y2(t) = Y0e λ21 0 Y1(s)e ds + 0 e dW2(s) 0 C2(T s)e ds.

Since C∫1 is deterministic, Y1 has Gaussian distribution. As the consequence, the second∫ term in the expression t t λ2(s−t) λ2(s−t) f fT of Y2, 0 Y1(s)e ds also has Gaussian distribution and is uncorrelated to 0 e dW2(s), since W1 fT and W2 are uncorrelated. Therefore, (Y1,Y2) is jointly Gaussian and X as a linear combination of them is also Gaussian. (iv) It is a straightforward but lengthy computation, like the computations in Exercise 10.1, to determine 2 − 1 2 the mean and variance of the term X. Let us call its variance σ and its mean µ 2 σ , so that we can write X as 1 X = µ − σ2 − σZ, 2 where Z is a standard normal random variable under PeT . Show that the call option price in (10.7.19) is

µ B(0,T )(e N(d+) − KN(d−)) ,

where ( ) 1 1 2 d = µ − log K  σ . σ 2 Proof. The call option price in (10.7.19) is [ ] ( )+ B(0,T )EeT eX − K

EeT − 1 2 2 with [X] = µ 2 σ and Var(X) = σ . Comparing with the Black-Scholes formula for call options: if f dSt = rStdt + σStdWt, then [ ( ) ] 1 2 + e −rT σWT +(r− σ )T −rT E e S0e 2 − K = S0N(d+) − Ke N(d−)

89 1 S 1 2 with d = √ (log 0 + (r  σ )T ), we can set in the Black-Scholes formula r = µ, T = 1 and S = 1, then σ T K 2 0 [( ) ] [ ( ) ] eT X + µ eT −µ X + µ B(0,T )E e − K = B(0,T )e E e e − K = B(0,T )(e N(d+) − KN(d−))

where $d_{\pm} = \frac{1}{\sigma}\left(-\log K + \left(\mu \pm \frac12\sigma^2\right)\right)$.
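The price above is just the Black-Scholes functional evaluated at $r = \mu$, $T = 1$, $S_0 = 1$. As an illustrative sketch (using only the Python standard library, with hypothetical parameter values), the helper below evaluates $B(0,T)(e^{\mu}N(d_+) - KN(d_-))$ and checks it against a Monte Carlo average of $B(0,T)\,E[(e^X - K)^+]$ with $X = \mu - \frac12\sigma^2 - \sigma Z$:
\begin{verbatim}
from math import exp, log
from statistics import NormalDist
import random

def bond_call_price(B0T, mu, sigma, K):
    """B(0,T) * (exp(mu) N(d+) - K N(d-)), d_pm = (mu - log K +/- sigma^2/2)/sigma."""
    N = NormalDist().cdf
    d_plus = (mu - log(K) + 0.5 * sigma ** 2) / sigma
    d_minus = (mu - log(K) - 0.5 * sigma ** 2) / sigma
    return B0T * (exp(mu) * N(d_plus) - K * N(d_minus))

# Monte Carlo check: X = mu - sigma^2/2 - sigma Z, price = B(0,T) E[(e^X - K)^+]
random.seed(0)
B0T, mu, sigma, K, n = 0.95, -0.05, 0.1, 0.9, 200_000
mc = B0T * sum(max(exp(mu - 0.5 * sigma ** 2 - sigma * random.gauss(0.0, 1.0)) - K, 0.0)
               for _ in range(n)) / n
print(bond_call_price(B0T, mu, sigma, K), mc)  # agree to Monte Carlo accuracy
\end{verbatim}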

I Exercise 10.8 (Reversal of order of integration in forward rates). The forward rate formula (10.3.5) with $v$ replacing $T$ states that
$$f(t,v) = f(0,v) + \int_0^t \alpha(u,v)\,du + \int_0^t \sigma(u,v)\,dW(u).$$
Therefore
$$-\int_t^T f(t,v)\,dv = -\int_t^T\left[f(0,v) + \int_0^t \alpha(u,v)\,du + \int_0^t \sigma(u,v)\,dW(u)\right]dv. \tag{10.7.20}$$

(i) Define
$$\hat\alpha(u,t,T) = \int_t^T \alpha(u,v)\,dv, \qquad \hat\sigma(u,t,T) = \int_t^T \sigma(u,v)\,dv.$$
Show that if we reverse the order of integration in (10.7.20), we obtain the equation
$$-\int_t^T f(t,v)\,dv = -\int_t^T f(0,v)\,dv - \int_0^t \hat\alpha(u,t,T)\,du - \int_0^t \hat\sigma(u,t,T)\,dW(u). \tag{10.7.21}$$
(In one case, this is a reversal of the order of two Riemann integrals, a step that uses only the theory of ordinary calculus. In the other case, the order of a Riemann and an Itô integral are being reversed. This step is justified in the appendix of [83]. You may assume without proof that this step is legitimate.)

Proof. Starting from (10.7.20), we have
\begin{align*}
-\int_t^T f(t,v)\,dv &= -\int_t^T f(0,v)\,dv - \iint_{\{t\le v\le T,\,0\le u\le t\}} \alpha(u,v)\,du\,dv - \iint_{\{t\le v\le T,\,0\le u\le t\}} \sigma(u,v)\,dW(u)\,dv\\
&= -\int_t^T f(0,v)\,dv - \int_0^t du\int_t^T \alpha(u,v)\,dv - \int_0^t dW(u)\int_t^T \sigma(u,v)\,dv\\
&= -\int_t^T f(0,v)\,dv - \int_0^t \hat\alpha(u,t,T)\,du - \int_0^t \hat\sigma(u,t,T)\,dW(u).
\end{align*}
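The reversal is easy to visualize in a discretization, where both integrals become finite double sums and the step is literally an exchange of summation order (the continuum analogue for the $dW$-integral is the stochastic Fubini theorem cited above). A small Python sketch with arbitrary illustrative values:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
m = 400                                # time grid
sigma = rng.normal(size=(m, m))        # sigma[u, v] on the grid (arbitrary values)
dW = rng.normal(size=m)                # Brownian increments dW(u)
t, T = 150, 350                        # grid indices with 0 < t < T <= m

# v-integral first, then u (the order on the right side of (10.7.20))
lhs = sum(sigma[u, v] * dW[u] for v in range(t, T) for u in range(t))
# u-integral first, then v (the order of (10.7.21), via sigma_hat)
sigma_hat = sigma[:, t:T].sum(axis=1)  # \hat{sigma}(u, t, T)
rhs = sum(sigma_hat[u] * dW[u] for u in range(t))
print(lhs, rhs)                        # equal up to floating-point rounding
\end{verbatim}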

(ii) Take the differential with respect to $t$ in (10.7.21), remembering to get two terms from each of the integrals $\int_0^t \hat\alpha(u,t,T)\,du$ and $\int_0^t \hat\sigma(u,t,T)\,dW(u)$ because one must differentiate with respect to each of the two $t$'s appearing in these integrals.

Solution. Differentiating both sides of (10.7.21), we have
\begin{align*}
d\left(-\int_t^T f(t,v)\,dv\right) &= f(0,t)\,dt - \hat\alpha(t,t,T)\,dt - \int_0^t \frac{\partial}{\partial t}\hat\alpha(u,t,T)\,du\,dt - \hat\sigma(t,t,T)\,dW(t) - \int_0^t \frac{\partial}{\partial t}\hat\sigma(u,t,T)\,dW(u)\,dt\\
&= f(0,t)\,dt - \int_t^T \alpha(t,v)\,dv\,dt + \int_0^t \alpha(u,t)\,du\,dt - \int_t^T \sigma(t,v)\,dv\,dW(t) + \int_0^t \sigma(u,t)\,dW(u)\,dt.
\end{align*}

(iii) Check that your formula in (ii) agrees with (10.3.10).

Proof. We note $R(t) = f(t,t) = f(0,t) + \int_0^t \alpha(u,t)\,du + \int_0^t \sigma(u,t)\,dW(u)$. From (ii), we therefore have
$$d\left(-\int_t^T f(t,v)\,dv\right) = R(t)\,dt - \alpha^*(t,T)\,dt - \sigma^*(t,T)\,dW(t),$$
where $\alpha^*(t,T) = \int_t^T \alpha(t,v)\,dv$ and $\sigma^*(t,T) = \int_t^T \sigma(t,v)\,dv$,

which is (10.3.10).

I Exercise 10.9 (Multifactor HJM model). Suppose the Heath-Jarrow-Morton model is driven by a $d$-dimensional Brownian motion, so that $\sigma(t,T)$ is also a $d$-dimensional vector and the forward rate dynamics are given by
$$df(t,T) = \alpha(t,T)\,dt + \sum_{j=1}^d \sigma_j(t,T)\,dW_j(t).$$
(i) Show that (10.3.16) becomes

$$\alpha(t,T) = \sum_{j=1}^d \sigma_j(t,T)\left[\sigma_j^*(t,T) + \Theta_j(t)\right].$$

Proof. We first derive the SDE for the discounted bond price $D(t)B(t,T)$. Note
\begin{align*}
d\left(-\int_t^T f(t,v)\,dv\right) &= f(t,t)\,dt - \int_t^T df(t,v)\,dv = R(t)\,dt - \int_t^T\left(\alpha(t,v)\,dt + \sum_{j=1}^d \sigma_j(t,v)\,dW_j(t)\right)dv\\
&= R(t)\,dt - \alpha^*(t,T)\,dt - \sum_{j=1}^d \sigma_j^*(t,T)\,dW_j(t).
\end{align*}

Then, applying Itô's formula, we have
\begin{align*}
dB(t,T) &= d\, e^{-\int_t^T f(t,v)\,dv} = B(t,T)\left[d\left(-\int_t^T f(t,v)\,dv\right) + \frac12\sum_{j=1}^d (\sigma_j^*(t,T))^2\,dt\right]\\
&= B(t,T)\left[R(t) - \alpha^*(t,T) + \frac12\sum_{j=1}^d (\sigma_j^*(t,T))^2\right]dt - B(t,T)\sum_{j=1}^d \sigma_j^*(t,T)\,dW_j(t).
\end{align*}

Using the integration-by-parts formula, we have
\begin{align*}
d(D(t)B(t,T)) &= D(t)\left[-B(t,T)R(t)\,dt + dB(t,T)\right]\\
&= D(t)B(t,T)\left[-\alpha^*(t,T) + \frac12\sum_{j=1}^d (\sigma_j^*(t,T))^2\right]dt - D(t)B(t,T)\sum_{j=1}^d \sigma_j^*(t,T)\,dW_j(t).
\end{align*}

If the no-arbitrage condition holds, we can find a risk-neutral measure $\widetilde P$ under which
$$\widetilde W(t) = \int_0^t \Theta(u)\,du + W(t)$$
is a $d$-dimensional Brownian motion for some $d$-dimensional adapted process $\Theta(t)$. This implies

$$-\alpha^*(t,T) + \frac12\sum_{j=1}^d (\sigma_j^*(t,T))^2 + \sum_{j=1}^d \sigma_j^*(t,T)\Theta_j(t) = 0.$$

Differentiating both sides with respect to $T$, we obtain

$$-\alpha(t,T) + \sum_{j=1}^d \sigma_j^*(t,T)\sigma_j(t,T) + \sum_{j=1}^d \sigma_j(t,T)\Theta_j(t) = 0,$$

or equivalently,
$$\alpha(t,T) = \sum_{j=1}^d \sigma_j(t,T)\left[\sigma_j^*(t,T) + \Theta_j(t)\right].$$

(ii) Suppose there is an adapted, d-dimensional process

$$\Theta(t) = (\Theta_1(t), \cdots, \Theta_d(t))$$

satisfying this equation for all $0 \le t \le T \le \bar T$. Show that if there are maturities $T_1, \cdots, T_d$ such that the $d\times d$ matrix $(\sigma_j(t,T_i))_{i,j}$ is nonsingular, then $\Theta(t)$ is unique.

Proof. Let $\sigma(t,T) = [\sigma_1(t,T), \cdots, \sigma_d(t,T)]^{\mathrm{tr}}$ and $\sigma^*(t,T) = [\sigma_1^*(t,T), \cdots, \sigma_d^*(t,T)]^{\mathrm{tr}}$. Then the no-arbitrage condition becomes
$$\alpha(t,T) - (\sigma^*(t,T))^{\mathrm{tr}}\sigma(t,T) = (\sigma(t,T))^{\mathrm{tr}}\Theta(t).$$

Letting $T$ run through $T_1, \cdots, T_d$, we obtain the matrix equation
$$\begin{pmatrix}\alpha(t,T_1) - (\sigma^*(t,T_1))^{\mathrm{tr}}\sigma(t,T_1)\\ \vdots\\ \alpha(t,T_d) - (\sigma^*(t,T_d))^{\mathrm{tr}}\sigma(t,T_d)\end{pmatrix} = \begin{pmatrix}\sigma_1(t,T_1) & \cdots & \sigma_d(t,T_1)\\ \vdots & \ddots & \vdots\\ \sigma_1(t,T_d) & \cdots & \sigma_d(t,T_d)\end{pmatrix}\begin{pmatrix}\Theta_1(t)\\ \vdots\\ \Theta_d(t)\end{pmatrix}.$$

Therefore, if $(\sigma_j(t,T_i))_{i,j}$ is nonsingular, $\Theta(t)$ can be uniquely solved from the above matrix equation.
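As an illustrative sketch of the linear algebra (with made-up numbers, assuming numpy is available), one can generate $\alpha(t,T_i)$ from a known $\Theta(t)$ via the no-arbitrage condition and then recover $\Theta(t)$ with a linear solve:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
d = 3
S = rng.uniform(0.1, 0.5, size=(d, d))   # S[i, j] = sigma_j(t, T_i)
S_star = 0.5 * S                          # stand-in values for sigma*_j(t, T_i)
theta_true = np.array([0.2, -0.1, 0.3])

# alpha(t, T_i) implied by alpha = sum_j sigma_j (sigma*_j + Theta_j)
alpha = np.einsum('ij,ij->i', S, S_star) + S @ theta_true

# recover Theta(t): solve the d x d system from the matrix equation above
theta = np.linalg.solve(S, alpha - np.einsum('ij,ij->i', S, S_star))
print(theta)                              # equals theta_true (up to rounding)
\end{verbatim}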

I Exercise 10.10. (i) Use the ordinary differential equations (6.5.8) and (6.5.9) satisfied by the functions $A(t,T)$ and $C(t,T)$ in the one-factor Hull-White model to show that this model satisfies the HJM no-arbitrage condition (10.3.27).

Proof. Recall (6.5.8) and (6.5.9) are
$$\begin{cases} C'(t,T) = b(t)C(t,T) - 1\\ A'(t,T) = -a(t)C(t,T) + \frac12\sigma^2(t)C^2(t,T).\end{cases}$$

Then by β(t, r) = a(t) − b(t)r and γ(t, r) = σ(t) (p. 430), we have

\begin{align*}
&\frac{\partial}{\partial T}C(t,T)\,\beta(t,R(t)) + R(t)\frac{\partial}{\partial T}C'(t,T) + \frac{\partial}{\partial T}A'(t,T)\\
&= \frac{\partial}{\partial T}C(t,T)\,\beta(t,R(t)) + R(t)b(t)\frac{\partial}{\partial T}C(t,T) + \left[-a(t)\frac{\partial}{\partial T}C(t,T) + \sigma^2(t)C(t,T)\frac{\partial}{\partial T}C(t,T)\right]\\
&= \frac{\partial}{\partial T}C(t,T)\left[\beta(t,R(t)) + R(t)b(t) - a(t) + \sigma^2(t)C(t,T)\right]\\
&= C(t,T)\left(\frac{\partial}{\partial T}C(t,T)\right)\gamma^2(t,R(t)).
\end{align*}

(ii) Use the ordinary differential equations (6.5.14) and (6.5.15) satisfied by the functions A(t, T ) and C(t, T ) in the one-factor Cox-Ingersoll-Ross model to show that this model satisfies the HJM no-arbitrage condition (10.3.27).

Proof. Recall (6.5.14) and (6.5.15) are
$$\begin{cases} C'(t,T) = bC(t,T) + \frac12\sigma^2C^2(t,T) - 1\\ A'(t,T) = -aC(t,T).\end{cases}$$
Then by $\beta(t,r) = a - br$ and $\gamma(t,r) = \sigma\sqrt r$ (p. 430), we have

\begin{align*}
&\frac{\partial}{\partial T}C(t,T)\,\beta(t,R(t)) + R(t)\frac{\partial}{\partial T}C'(t,T) + \frac{\partial}{\partial T}A'(t,T)\\
&= \frac{\partial}{\partial T}C(t,T)\,\beta(t,R(t)) + R(t)\left[b\frac{\partial}{\partial T}C(t,T) + \sigma^2 C(t,T)\frac{\partial}{\partial T}C(t,T)\right] - a\frac{\partial}{\partial T}C(t,T)\\
&= \frac{\partial}{\partial T}C(t,T)\left[\beta(t,R(t)) + R(t)b + R(t)\sigma^2 C(t,T) - a\right]\\
&= C(t,T)\left(\frac{\partial}{\partial T}C(t,T)\right)\gamma^2(t,R(t)).
\end{align*}
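Both verifications are short enough to check symbolically. In the Python sketch below (using sympy), $C$ and $\partial C/\partial T$ are treated as free symbols and the ODEs are used to substitute for $\frac{\partial}{\partial T}C'$ and $\frac{\partial}{\partial T}A'$, exactly as in the two computations above; each difference simplifies to zero.
\begin{verbatim}
import sympy as sp

C, Ct, R, a, b, sigma = sp.symbols('C Ct R a b sigma')  # Ct stands for dC/dT
beta = a - b * R

# Hull-White: C' = b C - 1, A' = -a C + sigma^2 C^2 / 2, gamma^2 = sigma^2
lhs_hw = Ct * beta + R * (b * Ct) + (-a + sigma**2 * C) * Ct
print(sp.simplify(lhs_hw - C * Ct * sigma**2))          # 0

# CIR: C' = b C + sigma^2 C^2 / 2 - 1, A' = -a C, gamma^2 = sigma^2 R
lhs_cir = Ct * beta + R * (b + sigma**2 * C) * Ct - a * Ct
print(sp.simplify(lhs_cir - C * Ct * sigma**2 * R))     # 0
\end{verbatim}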

I Exercise 10.11. Let δ > 0 be given. Consider an interest rate swap paying a fixed interest rate K and receiving backset LIBOR L(Tj−1,Tj−1) on a principal of 1 at each of the payment dates Tj = δj, j = 1, 2, ··· , n + 1. Show that the value of the swap is

$$\delta K\sum_{j=1}^{n+1} B(0,T_j) - \delta\sum_{j=1}^{n+1} B(0,T_j)L(0,T_{j-1}). \tag{10.7.22}$$

Remark 10.7.1. The swap rate is defined to be the value of $K$ that makes the initial value of the swap equal to zero. Thus, the swap rate is
$$K = \frac{\sum_{j=1}^{n+1} B(0,T_j)L(0,T_{j-1})}{\sum_{j=1}^{n+1} B(0,T_j)}. \tag{10.7.23}$$

Proof. On each payment date $T_j$, the payoff of this swap contract is $\delta(K - L(T_{j-1},T_{j-1}))$. Its no-arbitrage price at time 0 is $\delta(KB(0,T_j) - B(0,T_j)L(0,T_{j-1}))$ by Theorem 10.4.1. So the value of the swap is

$$\sum_{j=1}^{n+1}\delta\left[KB(0,T_j) - B(0,T_j)L(0,T_{j-1})\right] = \delta K\sum_{j=1}^{n+1} B(0,T_j) - \delta\sum_{j=1}^{n+1} B(0,T_j)L(0,T_{j-1}).$$
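A small numerical sketch of (10.7.22)-(10.7.23) in Python, assuming a flat discount curve chosen only for illustration: the swap value vanishes when $K$ equals the swap rate.
\begin{verbatim}
import numpy as np

delta, n = 0.25, 7                            # payment dates T_j = delta*j, j = 1..n+1
B = np.exp(-0.03 * delta * np.arange(n + 2))  # B(0,T_0),...,B(0,T_{n+1}); flat 3% curve
L = (B[:-1] / B[1:] - 1.0) / delta            # forward LIBOR L(0, T_{j-1}), j = 1..n+1

def swap_value(K):
    """Value (10.7.22) of the swap for fixed rate K."""
    return delta * K * B[1:].sum() - delta * (B[1:] * L).sum()

swap_rate = (B[1:] * L).sum() / B[1:].sum()   # (10.7.23)
print(swap_rate, swap_value(swap_rate))       # value is zero at the swap rate
\end{verbatim}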

I Exercise 10.12. In the proof of Theorem 10.4.1, we showed by an arbitrage argument that the value at time 0 of a payment of backset LIBOR L(T,T ) at time T + δ is B(0,T + δ)L(0,T ). The risk-neutral price of this payment, computed at time zero, is

$$\widetilde E[D(T+\delta)L(T,T)].$$

Use the definitions
$$L(T,T) = \frac{1 - B(T,T+\delta)}{\delta B(T,T+\delta)}, \qquad B(0,T+\delta) = \widetilde E[D(T+\delta)],$$

and the properties of conditional expectations to show that

$$\widetilde E[D(T+\delta)L(T,T)] = B(0,T+\delta)L(0,T).$$

Proof. Since $L(T,T) = \frac{1 - B(T,T+\delta)}{\delta B(T,T+\delta)} \in \mathcal F_T$, we have

\begin{align*}
\widetilde E[D(T+\delta)L(T,T)] &= \widetilde E\left[\widetilde E[D(T+\delta)L(T,T)\,|\,\mathcal F_T]\right]\\
&= \widetilde E\left[\frac{1 - B(T,T+\delta)}{\delta B(T,T+\delta)}\,\widetilde E[D(T+\delta)\,|\,\mathcal F_T]\right]\\
&= \widetilde E\left[\frac{1 - B(T,T+\delta)}{\delta B(T,T+\delta)}\,D(T)B(T,T+\delta)\right]\\
&= \widetilde E\left[\frac{D(T) - D(T)B(T,T+\delta)}{\delta}\right]\\
&= \frac{B(0,T) - B(0,T+\delta)}{\delta} = B(0,T+\delta)L(0,T).
\end{align*}

Remark 7. An alternative proof is to use the $(T+\delta)$-forward measure $\widetilde P^{T+\delta}$: the time-$t$ no-arbitrage price of a payoff $\delta L(T,T)$ paid at time $T+\delta$ is

$$B(t,T+\delta)\,\widetilde E^{T+\delta}[\delta L(T,T)\,|\,\mathcal F(t)] = B(t,T+\delta)\cdot\delta L(t,T),$$
where we have used the observation that the forward rate process $L(t,T) = \frac{1}{\delta}\left(\frac{B(t,T)}{B(t,T+\delta)} - 1\right)$ $(0\le t\le T)$ is a martingale under the $(T+\delta)$-forward measure.
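As a Monte Carlo sanity check of the identity (not part of the solution), one can work in a constant-coefficient Hull-White model, where $B(t,T) = e^{-C(\tau)r(t) - A(\tau)}$, $\tau = T-t$, is available in closed form for $C$ while $A$ is computed below by quadrature. The model choice and all parameters are assumptions made purely for illustration; the Python sketch assumes numpy and scipy.
\begin{verbatim}
import numpy as np
from scipy.integrate import quad

a, b, sigma, r0 = 0.05, 0.5, 0.01, 0.03   # dr = (a - b r) dt + sigma dW~
T, delta = 1.0, 0.25

C = lambda tau: (1.0 - np.exp(-b * tau)) / b
A = lambda tau: quad(lambda u: a * C(u) - 0.5 * sigma**2 * C(u)**2, 0.0, tau)[0]
bond = lambda r, tau: np.exp(-C(tau) * r - A(tau))   # B(t, t + tau)

rng = np.random.default_rng(3)
paths, steps = 100_000, 250
dt = (T + delta) / steps
r = np.full(paths, r0)
integral = np.zeros(paths)
for k in range(steps):
    if k == int(round(T / dt)):
        r_at_T = r.copy()                 # r(T), captured before stepping past T
    integral += r * dt                    # left-point rule for int_0^{T+delta} r(s) ds
    # exact Ornstein-Uhlenbeck transition over one step
    m = r * np.exp(-b * dt) + (a / b) * (1.0 - np.exp(-b * dt))
    s = sigma * np.sqrt((1.0 - np.exp(-2.0 * b * dt)) / (2.0 * b))
    r = m + s * rng.standard_normal(paths)

D = np.exp(-integral)                                                 # D(T + delta)
L_T = (1.0 - bond(r_at_T, delta)) / (delta * bond(r_at_T, delta))     # L(T, T)
lhs = (D * L_T).mean()
rhs = (bond(r0, T) - bond(r0, T + delta)) / delta  # = B(0,T+delta) L(0,T)
print(lhs, rhs)        # close, up to Monte Carlo and discretization error
\end{verbatim}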

11 Introduction to Jump Processes

⋆ Comments: 1) A mathematically rigorous presentation of semimartingale theory and stochastic calculus can be found in He et al. [5] and Kallenberg [7]. 2) Girsanov's theorem. The most general version of Girsanov's theorem for local martingales can be found in He et al. [5, page 340] (Theorem 12.13) or Kallenberg [7, page 523] (Theorem 26.9). It has the following form.

Theorem 1 (Girsanov's theorem for local martingales). Let $Q = Z_t\cdot P$ on $\mathcal F_t$ for all $t\ge 0$, and consider a local $P$-martingale $M$ such that the process $[M,Z]$ has locally integrable variation and $P$-compensator $\langle M,Z\rangle$.$^{13}$ Then $\bar M = M - Z_-^{-1}\cdot\langle M,Z\rangle$ is a local $Q$-martingale.

Applying Girsanov's theorem to Lemma 11.6.1: in order to change the intensity of a Poisson process via a change of measure, we need to find a $P$-martingale $L$ such that the Radon-Nikodým derivative $Z_\cdot = d\widetilde P/dP$ satisfies the SDE $dZ(t) = Z(t-)\,dL(t)$ and
$$N(t) - \lambda t - \langle N(\cdot) - \lambda\,\cdot,\; L\rangle(t) = N(t) - \widetilde\lambda t$$
is a martingale under $\widetilde P$. This implies we should find $L$ such that

$$\langle N(\cdot) - \lambda\,\cdot,\; L\rangle(t) = (\widetilde\lambda - \lambda)t.$$

$^{13}$For the difference between the predictable quadratic variation $\langle\cdot\rangle$ (or angle bracket process) and the quadratic variation (or square bracket process), see He et al. [5, pages 185-187]. Basically, we have $[M,N]_t = M_0N_0 + \langle M^c, N^c\rangle_t + \sum_{s\le t}\Delta M_s\Delta N_s$, and $\langle M,N\rangle$ is the dual predictable projection of $[M,N]$ (i.e. $[M,N]_t - \langle M,N\rangle_t$ is a martingale).

Assuming the martingale representation theorem, we suppose $L(t) = L(0) + \int_0^t H(s)\,d(N(s) - \lambda s)$; then
$$[N(\cdot) - \lambda\,\cdot,\; L](t) = \int_0^t H(s)\,d[N(\cdot) - \lambda\,\cdot,\; N(\cdot) - \lambda\,\cdot](s) = \sum_{0<s\le t} H(s)(\Delta N(s))^2 = \int_0^t H(s)\,dN(s).
$$

The compensator of $\int_0^t H(s)\,dN(s)$ is $\lambda\int_0^t H(s)\,ds$, so $\langle N(\cdot) - \lambda\,\cdot,\; L\rangle(t) = (\widetilde\lambda - \lambda)t$ forces $H(s) \equiv \frac{\widetilde\lambda - \lambda}{\lambda}$. Combined, we conclude $Z$ should be determined by the equation (11.6.2):
$$dZ(t) = \frac{\widetilde\lambda - \lambda}{\lambda}Z(t-)\,dM(t).$$
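Equation (11.6.2) has the explicit solution $Z(t) = e^{(\lambda - \widetilde\lambda)t}(\widetilde\lambda/\lambda)^{N(t)}$ (cf. Lemma 11.6.1), and the intensity change can be checked by simulation: $E[Z(t)] = 1$ and $E[Z(t)N(t)] = \widetilde\lambda t$, i.e. $N$ has intensity $\widetilde\lambda$ under $\widetilde P$. A short Python sketch with made-up parameters:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(4)
lam, lam_tilde, t = 2.0, 3.0, 1.5
N = rng.poisson(lam * t, size=1_000_000)                    # N(t) under P
Z = np.exp((lam - lam_tilde) * t) * (lam_tilde / lam) ** N  # Z(t) solving (11.6.2)
print(Z.mean())                        # ~1: Z is a P-martingale started at 1
print((Z * N).mean(), lam_tilde * t)   # ~lam_tilde*t: new intensity is lam_tilde
\end{verbatim}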

I Exercise 11.1. Let $M(t)$ be the compensated Poisson process of Theorem 11.2.4.

(i) Show that $M^2(t)$ is a submartingale.

Proof. First, $M^2(t) = N^2(t) - 2\lambda t N(t) + \lambda^2 t^2$, so $E[M^2(t)] < \infty$. Since $\varphi(x) = x^2$ is a convex function, the conditional Jensen's inequality (Shreve [14, page 70], Theorem 2.3.2(v)) gives
$$E[\varphi(M(t))\,|\,\mathcal F(s)] \ge \varphi\big(E[M(t)\,|\,\mathcal F(s)]\big) = \varphi(M(s)), \quad \forall\, s\le t.$$
So $M^2(t)$ is a submartingale.

(ii) Show that $M^2(t) - \lambda t$ is a martingale.

Proof. We note $M(t) = N(t) - \lambda t$ has independent and stationary increments. So for all $s\le t$,
\begin{align*}
E[M^2(t) - M^2(s)\,|\,\mathcal F(s)] &= E[(M(t) - M(s))^2\,|\,\mathcal F(s)] + E[2M(s)(M(t) - M(s))\,|\,\mathcal F(s)]\\
&= E[M^2(t-s)] + 2M(s)E[M(t-s)] = \mathrm{Var}(N(t-s)) + 0 = \lambda(t-s).
\end{align*}
That is, $E[M^2(t) - \lambda t\,|\,\mathcal F(s)] = M^2(s) - \lambda s$.
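A quick Monte Carlo sketch of the conclusion of part (ii) in Python (illustrative parameters): the sample mean of $M^2(t)$ should match $\lambda t$ for each $t$.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(5)
lam = 2.0
for t in (0.5, 1.0, 2.0):
    N = rng.poisson(lam * t, size=1_000_000)
    print(t, ((N - lam * t) ** 2).mean(), lam * t)  # E[M^2(t)] vs. lambda*t
\end{verbatim}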

I Exercise 11.2. Suppose we have observed a Poisson process up to time $s$, have seen that $N(s) = k$, and are interested in the values of $N(s+t)$ for small positive $t$. Show that
$$P\{N(s+t) = k\,|\,N(s) = k\} = 1 - \lambda t + O(t^2),$$
$$P\{N(s+t) = k+1\,|\,N(s) = k\} = \lambda t + O(t^2),$$
$$P\{N(s+t) \ge k+2\,|\,N(s) = k\} = O(t^2),$$
where $O(t^2)$ is used to denote terms involving $t^2$ and higher powers of $t$.

Proof. We note
$$P(N(s+t) = k\,|\,N(s) = k) = P(N(s+t) - N(s) = 0\,|\,N(s) = k) = P(N(t) = 0) = e^{-\lambda t} = 1 - \lambda t + O(t^2).$$
Similarly, we have
$$P(N(s+t) = k+1\,|\,N(s) = k) = P(N(t) = 1) = \frac{(\lambda t)^1}{1!}e^{-\lambda t} = \lambda t(1 - \lambda t + O(t^2)) = \lambda t + O(t^2),$$

and
$$P(N(s+t)\ge k+2\,|\,N(s) = k) = P(N(t)\ge 2) = \sum_{m=2}^\infty \frac{(\lambda t)^m}{m!}e^{-\lambda t} = O(t^2).$$
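The three $O(t^2)$ claims can also be seen numerically: dividing each error by $t^2$ gives a bounded quantity as $t\to 0$ (the limits are $\lambda^2/2$, $-\lambda^2$, and $\lambda^2/2$, respectively). A tiny Python sketch:
\begin{verbatim}
from math import exp

lam = 1.5
for t in (0.1, 0.01, 0.001):
    p0 = exp(-lam * t)            # P(N(s+t) = k    | N(s) = k)
    p1 = lam * t * exp(-lam * t)  # P(N(s+t) = k+1  | N(s) = k)
    p2 = 1.0 - p0 - p1            # P(N(s+t) >= k+2 | N(s) = k)
    print(t, (p0 - (1 - lam * t)) / t**2, (p1 - lam * t) / t**2, p2 / t**2)
\end{verbatim}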

I Exercise 11.3 (Geometric Poisson process). Let $N(t)$ be a Poisson process with intensity $\lambda > 0$, and let $S(0) > 0$ and $\sigma > -1$ be given. Using Theorem 11.2.3 rather than the Itô-Doeblin formula for jump processes, show that
$$S(t) = \exp\{N(t)\log(\sigma+1) - \lambda\sigma t\} = (\sigma+1)^{N(t)}e^{-\lambda\sigma t}$$
is a martingale.

Proof. For any $t\le u$, we have
\begin{align*}
E\left[\frac{S(u)}{S(t)}\,\Big|\,\mathcal F(t)\right] &= E\left[(\sigma+1)^{N(u)-N(t)}e^{-\lambda\sigma(u-t)}\,\Big|\,\mathcal F(t)\right]\\
&= e^{-\lambda\sigma(u-t)}E\left[(\sigma+1)^{N(u)-N(t)}\right]\\
&= e^{-\lambda\sigma(u-t)}E\left[e^{(N(u)-N(t))\log(\sigma+1)}\right]\\
&= e^{-\lambda\sigma(u-t)}\exp\left\{\lambda(u-t)\left(e^{\log(\sigma+1)} - 1\right)\right\} \quad \text{(by (11.3.4))}\\
&= e^{-\lambda\sigma(u-t)}e^{\lambda\sigma(u-t)} = 1.
\end{align*}

So S(t) = E[S(u)|F(t)] and S is a martingale.
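A one-line Monte Carlo sketch of the martingale property at a fixed date (Python, illustrative parameters): $E[S(t)]$ should equal $S(0) = 1$.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(6)
lam, sigma, t = 2.0, 0.5, 1.0
N = rng.poisson(lam * t, size=1_000_000)
S = (sigma + 1.0) ** N * np.exp(-lam * sigma * t)  # S(t) with S(0) = 1
print(S.mean())                                    # ~1.0
\end{verbatim}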

I Exercise 11.4. Suppose $N_1(t)$ and $N_2(t)$ are Poisson processes with intensities $\lambda_1$ and $\lambda_2$, respectively, both defined on the same probability space $(\Omega, \mathcal F, P)$ and relative to the same filtration $\mathcal F(t)$, $t\ge 0$. Show that almost surely $N_1(t)$ and $N_2(t)$ can have no simultaneous jump. (Hint: Define the compensated Poisson processes $M_1(t) = N_1(t) - \lambda_1 t$ and $M_2(t) = N_2(t) - \lambda_2 t$, which like $N_1$ and $N_2$ are independent. Use the Itô product rule for jump processes to compute $M_1(t)M_2(t)$ and take expectations.)

Proof. The problem is ambiguous in that the relation between $N_1$ and $N_2$ is not clearly stated. According to page 524, paragraph 2, we guess the intended condition is that $N_1$ and $N_2$ are independent.

Suppose $N_1$ and $N_2$ are independent. Define $M_1(t) = N_1(t) - \lambda_1 t$ and $M_2(t) = N_2(t) - \lambda_2 t$. Then by independence, $E[M_1(t)M_2(t)] = E[M_1(t)]E[M_2(t)] = 0$. Meanwhile, by Itô's product formula (Corollary 11.5.5),
$$M_1(t)M_2(t) = \int_0^t M_1(s-)\,dM_2(s) + \int_0^t M_2(s-)\,dM_1(s) + [M_1, M_2](t).$$
Both $\int_0^t M_1(s-)\,dM_2(s)$ and $\int_0^t M_2(s-)\,dM_1(s)$ are martingales. So taking expectations on both sides, we get
$$0 = 0 + E\{[M_1,M_2](t)\} = E\left[\sum_{0<s\le t}\Delta N_1(s)\Delta N_2(s)\right].$$
Since $\sum_{0<s\le t}\Delta N_1(s)\Delta N_2(s)\ge 0$, it must equal zero almost surely for every $t$. Therefore $N_1$ and $N_2$ can have no simultaneous jump.
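A Python simulation sketch of the two ingredients (made-up parameters): the sample mean of $M_1(t)M_2(t)$ is near zero for independent processes, and the jump times of two independent Poisson processes, having continuous distributions, do not coincide in a finite sample.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(7)
lam1, lam2, t, n = 1.0, 2.0, 1.0, 500_000
N1 = rng.poisson(lam1 * t, n)
N2 = rng.poisson(lam2 * t, n)
M1, M2 = N1 - lam1 * t, N2 - lam2 * t
print((M1 * M2).mean())             # ~0: E[M1(t)] E[M2(t)] = 0 by independence

# jump times of two independent Poisson processes (first 20 jumps each)
T1 = np.cumsum(rng.exponential(1.0 / lam1, size=20))
T2 = np.cumsum(rng.exponential(1.0 / lam2, size=20))
print(np.intersect1d(T1, T2).size)  # 0: no simultaneous jumps in this sample
\end{verbatim}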

I Exercise 11.5. Suppose $N_1(t)$ and $N_2(t)$ are Poisson processes defined on the same probability space $(\Omega,\mathcal F, P)$ relative to the same filtration $\mathcal F(t)$, $t\ge 0$. Assume that almost surely $N_1(t)$ and $N_2(t)$ have no simultaneous jump. Show that, for each fixed $t$, the random variables $N_1(t)$ and $N_2(t)$ are independent. (Hint: Adapt the proof of Corollary 11.5.3.) (In fact, the whole path of $N_1$ is independent of the whole path of $N_2$, although you are not being asked to prove this stronger statement.)

Proof. We shall prove the whole path of N1 is independent of the whole path of N2, following the scheme suggested by page 489, paragraph 1.

Fix $s\ge 0$ and consider, for $t > s$,
$$X_t = u_1(N_1(t) - N_1(s)) + u_2(N_2(t) - N_2(s)) - \lambda_1(e^{u_1}-1)(t-s) - \lambda_2(e^{u_2}-1)(t-s).$$
Then by Itô's formula for jump processes, we have
$$e^{X_t} - e^{X_s} = \int_s^t e^{X_u}\,dX_u^c + \frac12\int_s^t e^{X_u}\,dX_u^c\,dX_u^c + \sum_{s<u\le t}\left(e^{X_u} - e^{X_{u-}}\right)$$
(the middle term vanishes, since the continuous part $X^c$ is of finite variation).

Since ∆Xt = u1∆N1(t) + u2∆N2(t) and N1, N2 have no simultaneous jump,

$$e^{X_u} - e^{X_{u-}} = e^{X_{u-}}\left(e^{\Delta X_u} - 1\right) = e^{X_{u-}}\left[(e^{u_1}-1)\Delta N_1(u) + (e^{u_2}-1)\Delta N_2(u)\right].$$
Noting $\int_s^t e^{X_u}\,du = \int_s^t e^{X_{u-}}\,du$,$^{14}$ we have

\begin{align*}
e^{X_t} - e^{X_s} &= \int_s^t e^{X_{u-}}\left[-\lambda_1(e^{u_1}-1) - \lambda_2(e^{u_2}-1)\right]du + \sum_{s<u\le t} e^{X_{u-}}\left[(e^{u_1}-1)\Delta N_1(u) + (e^{u_2}-1)\Delta N_2(u)\right]\\
&= \int_s^t e^{X_{u-}}(e^{u_1}-1)\,d(N_1(u) - \lambda_1 u) + \int_s^t e^{X_{u-}}(e^{u_2}-1)\,d(N_2(u) - \lambda_2 u),
\end{align*}
which is a martingale in $t$.

Therefore, $E[e^{X_t}] = E[e^{X_s}] = 1$, which implies
$$E\left[e^{u_1(N_1(t)-N_1(s)) + u_2(N_2(t)-N_2(s))}\right] = e^{\lambda_1(e^{u_1}-1)(t-s)}e^{\lambda_2(e^{u_2}-1)(t-s)} = E\left[e^{u_1(N_1(t)-N_1(s))}\right]E\left[e^{u_2(N_2(t)-N_2(s))}\right].$$

This shows $N_1(t) - N_1(s)$ is independent of $N_2(t) - N_2(s)$.

Now suppose $0\le t_1 < t_2 < \cdots < t_n$. The vector $(N_1(t_1), \cdots, N_1(t_n))$ is independent of $(N_2(t_1), \cdots, N_2(t_n))$ if and only if $(N_1(t_1), N_1(t_2)-N_1(t_1), \cdots, N_1(t_n)-N_1(t_{n-1}))$ is independent of $(N_2(t_1), N_2(t_2)-N_2(t_1), \cdots, N_2(t_n)-N_2(t_{n-1}))$. Let $t_0 = 0$; then
\begin{align*}
&E\left[e^{\sum_{i=1}^n u_i(N_1(t_i)-N_1(t_{i-1})) + \sum_{j=1}^n v_j(N_2(t_j)-N_2(t_{j-1}))}\right]\\
&= E\left[e^{\sum_{i=1}^{n-1} u_i(N_1(t_i)-N_1(t_{i-1})) + \sum_{j=1}^{n-1} v_j(N_2(t_j)-N_2(t_{j-1}))}\cdot E\left[e^{u_n(N_1(t_n)-N_1(t_{n-1})) + v_n(N_2(t_n)-N_2(t_{n-1}))}\,\Big|\,\mathcal F(t_{n-1})\right]\right]\\
&= E\left[e^{\sum_{i=1}^{n-1} u_i(N_1(t_i)-N_1(t_{i-1})) + \sum_{j=1}^{n-1} v_j(N_2(t_j)-N_2(t_{j-1}))}\right]E\left[e^{u_n(N_1(t_n)-N_1(t_{n-1})) + v_n(N_2(t_n)-N_2(t_{n-1}))}\right]\\
&= E\left[e^{\sum_{i=1}^{n-1} u_i(N_1(t_i)-N_1(t_{i-1})) + \sum_{j=1}^{n-1} v_j(N_2(t_j)-N_2(t_{j-1}))}\right]E\left[e^{u_n(N_1(t_n)-N_1(t_{n-1}))}\right]E\left[e^{v_n(N_2(t_n)-N_2(t_{n-1}))}\right],
\end{align*}

where the second equality comes from the independence of $N_i(t_n) - N_i(t_{n-1})$ $(i = 1,2)$ relative to $\mathcal F(t_{n-1})$ and the third equality comes from the result obtained just above. Working by induction, we have
\begin{align*}
E\left[e^{\sum_{i=1}^n u_i(N_1(t_i)-N_1(t_{i-1})) + \sum_{j=1}^n v_j(N_2(t_j)-N_2(t_{j-1}))}\right] &= \prod_{i=1}^n E\left[e^{u_i(N_1(t_i)-N_1(t_{i-1}))}\right]\cdot\prod_{j=1}^n E\left[e^{v_j(N_2(t_j)-N_2(t_{j-1}))}\right]\\
&= E\left[e^{\sum_{i=1}^n u_i(N_1(t_i)-N_1(t_{i-1}))}\right]\cdot E\left[e^{\sum_{j=1}^n v_j(N_2(t_j)-N_2(t_{j-1}))}\right].
\end{align*}

This shows the whole path of N1 is independent of the whole path of N2.

$^{14}$On the interval $[s,t]$, the sample path $X_\cdot(\omega)$ has only finitely many jumps for each $\omega$, so the Riemann integrals of $X_\cdot$ and $X_{\cdot-}$ agree with each other.

I Exercise 11.6. Let $W(t)$ be a Brownian motion and let $Q(t)$ be a compound Poisson process, both defined on the same probability space $(\Omega,\mathcal F,P)$ and relative to the same filtration $\mathcal F(t)$, $t\ge 0$. Show that, for each $t$, the random variables $W(t)$ and $Q(t)$ are independent. (In fact, the whole path of $W$ is independent of the whole path of $Q$, although you are not being asked to prove this stronger statement.)

Proof. Let $X_t = u_1W(t) - \frac12 u_1^2 t + u_2Q(t) - \lambda t(\varphi(u_2)-1)$, where $\varphi$ is the moment generating function of the jump size $Y$. Itô's formula for jump processes yields
$$e^{X_t} - 1 = \int_0^t e^{X_s}\left(u_1\,dW(s) - \frac12 u_1^2\,ds - \lambda(\varphi(u_2)-1)\,ds\right) + \frac12\int_0^t e^{X_s}u_1^2\,ds + \sum_{0<s\le t}\left(e^{X_s} - e^{X_{s-}}\right).$$

Note $\Delta X_t = u_2\Delta Q(t) = u_2 Y_{N(t)}\Delta N(t)$, where $N(t)$ is the Poisson process associated with $Q(t)$. So

$$e^{X_t} - e^{X_{t-}} = e^{X_{t-}}\left(e^{\Delta X_t} - 1\right) = e^{X_{t-}}\left(e^{u_2Y_{N(t)}} - 1\right)\Delta N(t).$$
Consider the compound Poisson process $H_t = \sum_{i=1}^{N(t)}\left(e^{u_2Y_i} - 1\right)$; then
$$H_t - \lambda E\left[e^{u_2Y_1} - 1\right]t = H_t - \lambda(\varphi(u_2)-1)t$$

is a martingale, $e^{X_t} - e^{X_{t-}} = e^{X_{t-}}\Delta H_t$, and
\begin{align*}
e^{X_t} - 1 &= \int_0^t e^{X_s}\left(u_1\,dW(s) - \frac12 u_1^2\,ds - \lambda(\varphi(u_2)-1)\,ds\right) + \frac12\int_0^t e^{X_s}u_1^2\,ds + \int_0^t e^{X_{s-}}\,dH_s\\
&= \int_0^t e^{X_s}u_1\,dW(s) + \int_0^t e^{X_{s-}}\,d\left(H_s - \lambda(\varphi(u_2)-1)s\right).
\end{align*}
This shows $e^{X_t}$ is a martingale and $E[e^{X_t}] \equiv 1$. So
$$E\left[e^{u_1W(t)+u_2Q(t)}\right] = e^{\frac12 u_1^2 t}\,e^{\lambda t(\varphi(u_2)-1)} = E\left[e^{u_1W(t)}\right]E\left[e^{u_2Q(t)}\right].$$

This shows $W(t)$ and $Q(t)$ are independent.

Remark 8. It is easy to see that, if we follow the steps of the solution to Exercise 11.5 (first proving the independence of $W(t)-W(s)$ and $Q(t)-Q(s)$, then proving the independence of $(W(t_1), W(t_2)-W(t_1), \cdots, W(t_n)-W(t_{n-1}))$ and $(Q(t_1), Q(t_2)-Q(t_1), \cdots, Q(t_n)-Q(t_{n-1}))$ for $0\le t_1 < t_2 < \cdots < t_n$), we can show the whole path of $W$ is independent of the whole path of $Q$.
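A Monte Carlo sketch of the factorization just proved, in Python, with standard normal jump sizes $Y_i$ chosen purely for illustration (so $\varphi(u_2) = e^{u_2^2/2}$, and given $N(t)$, $Q(t)$ is normal with variance $N(t)$):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(8)
lam, t, u1, u2, n = 1.0, 1.0, 0.3, 0.2, 1_000_000
W = rng.normal(0.0, np.sqrt(t), n)        # W(t)
N = rng.poisson(lam * t, n)               # N(t)
Q = rng.normal(0.0, 1.0, n) * np.sqrt(N)  # Q(t) | N ~ Normal(0, N) for N(0,1) jumps
joint = np.exp(u1 * W + u2 * Q).mean()
closed = np.exp(0.5 * u1**2 * t + lam * t * (np.exp(0.5 * u2**2) - 1.0))
prod = np.exp(u1 * W).mean() * np.exp(u2 * Q).mean()
print(joint, closed, prod)                # all approximately equal
\end{verbatim}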

I Exercise 11.7. Use Theorem 11.3.2 to prove that a compound Poisson process is Markov. In other words, show that, whenever we are given two times $0\le t\le T$ and a function $h(x)$, there is another function $g(t,x)$ such that $E[h(Q(T))\,|\,\mathcal F(t)] = g(t,Q(t))$.

Proof. By the Independence Lemma, we have

$$E[h(Q(T))\,|\,\mathcal F(t)] = E[h(Q(T) - Q(t) + Q(t))\,|\,\mathcal F(t)] = E[h(Q(T-t) + x)]\big|_{x=Q(t)} = g(t, Q(t)),$$

where g(t, x) = E[h(Q(T − t) + x)].

References

[1] Vladimir I. Arnold and Roger Cooke. Ordinary differential equations, 3rd edition. Springer, 1992. 78

[2] Freddy Delbaen and Walter Schachermayer. The mathematics of arbitrage. Springer, 2006. 72

[3] Tongren Ding and Chengzhi Li. A course in ordinary differential equations (in Chinese). Higher Education Press, Beijing, 1991. 78

[4] Richard Durrett. Probability: Theory and examples, 4th edition. Cambridge University Press, New York, 2010. 2, 44

[5] Sheng-wu He, Jia-gang Wang, and Jia-an Yan. Semimartingale theory and stochastic calculus. Science Press & CRC Press Inc., 1992. 94

[6] John Hull. Options, futures, and other derivatives, 4th edition. Prentice-Hall International Inc., New Jersey, 2000. 44

[7] Olav Kallenberg. Foundations of modern probability, 2nd edition. Springer, 2002. 94

[8] J. David Logan. A first course in differential equations, 2nd edition. Springer, New York, 2010. 27

[9] B. Øksendal. Stochastic differential equations: An introduction with applications, 6th edition. Springer-Verlag, Berlin, 2003. 54, 65

[10] Minping Qian and Guanglu Gong. Theory of stochastic processes, 2nd edition (in Chinese). Peking University Press, Beijing, 1997. 54

[11] Daniel Revuz and Marc Yor. Continuous martingales and Brownian motion, 3rd edition. Springer-Verlag, Berlin, 1998. 49, 54

[12] A. N. Shiryaev. Probability, 2nd edition. Springer, 1995. 23

[13] A. N. Shiryaev. Essentials of stochastic finance: Facts, models, theory. World Scientific, Singapore, 1999. 72

[14] Steven Shreve. Stochastic calculus for finance II: Continuous-time models. Springer-Verlag, New York, 2004. 1, 72, 95

[15] Paul Wilmott. The mathematics of financial derivatives: A student introduction. Cambridge University Press, 1995. 27, 38
