
The Lebesgue Integral

Here is the picture I drew to replace the one in Shreve to illustrate the lower Lebesgue sum.

Jensen's Inequality

Jensen's inequality says that, if $f$ is a convex function and $X$ an integrable random variable, then
\[ E\bigl(f(X)\bigr) \ge f\bigl(E(X)\bigr). \]
The figure below illustrates this for the simple case in which $X$ takes on only two values, $x_1$ and $x_2$, with $P(X = x_1) = \theta$. Then $E(X) = x_\theta = \theta x_1 + (1 - \theta)x_2$. We see that
\[ f\bigl(E(X)\bigr) = f(x_\theta) < E\bigl(f(X)\bigr) = \theta f(x_1) + (1 - \theta)f(x_2). \]
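A quick numerical sanity check of the two-point case is easy to write; the sketch below is purely illustrative (the convex function $e^x$ and the values of $x_1$, $x_2$, $\theta$ are arbitrary choices, not taken from Shreve).

    import math

    # Two-point distribution: P(X = x1) = theta, P(X = x2) = 1 - theta (illustrative values).
    x1, x2, theta = -1.0, 2.0, 0.3
    f = math.exp                                   # a convex function

    mean_x = theta * x1 + (1 - theta) * x2         # E(X)
    mean_fx = theta * f(x1) + (1 - theta) * f(x2)  # E(f(X))

    print(mean_fx, f(mean_x))                      # E(f(X)) is the larger of the two
    assert mean_fx >= f(mean_x)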

Moment-Generating and Cumulant-Generating Functions

For a random variable $X$ with a given distribution $\mu_X$, the moment-generating function (m.g.f.) is defined as
\[ M_X(u) \equiv E\bigl(\exp(uX)\bigr). \]
Although this notation is standard, it pays no attention to the fact that an expectation is not a property of a random variable, but rather of its distribution, which can be represented by the CDF $F$, or the induced measure $\mu_X$. If I retain the notation $M_X(u)$, rather than $M_F(u)$, it is so as to conform to usual notation, although I deplore it. Recall that the expansion of the exponential function about the origin is

\[ \exp z = \sum_{n=0}^{\infty} \frac{z^n}{n!}. \]

If we formally interchange the expectation in the definition of the m.g.f. and the infinite sum, then we get
\[ M_X(u) = \sum_{n=0}^{\infty} \frac{u^n}{n!}\, E(X^n) = \sum_{n=0}^{\infty} \frac{u^n}{n!}\, m_n, \qquad (1) \]

where $m_n$ is the uncentred moment of order $n$. We know that this relation cannot always be true, because we know that there are many distributions for which moments beyond a certain order do not exist. Suppose that the moment of order $i$ is the first that does not exist. In that case, what happens is that the m.g.f. $M_X(u)$ has no derivative at $u = 0$ of order $i$. This implies that the Taylor expansion of $M_X$ at $u = 0$ cannot go beyond the term proportional to $u^{i-1}$.

Now consider the standard normal distribution, which has all its moments. The density of the N(0,1) distribution is
\[ \phi(x) = \frac{1}{\sqrt{2\pi}} \exp\bigl(-\tfrac{1}{2} x^2\bigr), \]
and so the m.g.f. is
\[
M_{N(0,1)}(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp(ux) \exp\bigl(-\tfrac{1}{2} x^2\bigr)\, dx
= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\bigl(-\tfrac{1}{2}(x^2 - 2ux)\bigr)\, dx
\]
\[
= \frac{\exp(u^2/2)}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\bigl(-\tfrac{1}{2}(x^2 - 2ux + u^2)\bigr)\, dx
= \frac{\exp(u^2/2)}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\bigl(-\tfrac{1}{2}(x - u)^2\bigr)\, dx
= \exp\bigl(\tfrac{1}{2} u^2\bigr),
\]
where the last equality follows by changing the integration variable to $y = x - u$, and noting that the integral of the standard normal density is 1.
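The closed form can be checked numerically. The following sketch is only an illustration (it assumes scipy is available and uses arbitrary values of $u$): it integrates $e^{ux}\phi(x)$ over the real line and compares the result with $\exp(u^2/2)$.

    import numpy as np
    from scipy.integrate import quad

    def mgf_numeric(u):
        # Integrate exp(u*x) * phi(x) over the real line.
        integrand = lambda x: np.exp(u * x) * np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)
        value, _ = quad(integrand, -np.inf, np.inf)
        return value

    for u in (0.0, 0.5, 1.0, 2.0):
        print(u, mgf_numeric(u), np.exp(0.5 * u**2))  # the two columns should agree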

Using the standard normal m.g.f. we can compute all the moments of the standard normal distribution. From (1) we see that $m_n$, the moment of order $n$, is the coefficient of $u^n/n!$ in the Taylor expansion of $M_{N(0,1)}(u)$ about $u = 0$. This Taylor expansion is
\[ M_{N(0,1)}(u) = \sum_{n=0}^{\infty} \frac{u^{2n}}{2^n n!}. \qquad (2) \]
Notice first that there are no terms with $u$ raised to an odd power, which gives us the well-known property that all the odd-order moments of the standard normal distribution are zero, since the distribution is symmetric about the origin. For the even-order moments, we equate the coefficient of $u^{2n}$ in (2) and that in (1). This gives
\[ \frac{m_{2n}}{(2n)!} = \frac{1}{2^n n!}, \]
so that
\[ m_{2n} = \frac{(2n)!}{2^n n!} = \frac{(2n)(2n-2)\cdots 2 \,\cdot\, (2n-1)(2n-3)\cdots 1}{2^n n!} = (2n-1)(2n-3)\cdots 1. \]
We see that
\[ m_2 = 1, \quad m_4 = 3 \cdot 1 = 3, \quad m_6 = 5 \cdot 3 \cdot 1 = 15, \quad m_8 = 7 \cdot 5 \cdot 3 \cdot 1 = 105, \]
and so on. There are other ways of obtaining these results, but this is one of the most straightforward.
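These values are easy to confirm with computer algebra. The sketch below is an illustrative check (it assumes sympy): it expands $\exp(u^2/2)$ about $u = 0$ and reads the moments off the coefficients of $u^n/n!$, as in (1).

    import sympy as sp

    u = sp.symbols('u')
    mgf = sp.exp(u**2 / 2)                      # standard normal m.g.f.

    # m_n is n! times the coefficient of u**n in the Taylor expansion about u = 0.
    expansion = sp.series(mgf, u, 0, 10).removeO()
    for n in range(9):
        m_n = sp.factorial(n) * expansion.coeff(u, n)
        print(n, m_n)                           # expect 1, 0, 1, 0, 3, 0, 15, 0, 105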

The cumulant-generating function (c.g.f.) $K_X(u)$ is by definition the logarithm of the m.g.f.:
\[ K_X(u) = \log E\bigl(\exp(uX)\bigr). \]
The expansion of the c.g.f. about the origin serves to define the cumulants $\kappa_n$ of the distribution:
\[ K_X(u) = \sum_{n=1}^{\infty} \kappa_n \frac{u^n}{n!}. \]

The sum starts at $n = 1$ because it follows from the definition that $M_X(0) = 1$ for any distribution of $X$, and of course $\log 1 = 0$. We can use the Taylor expansion of the log function about 1 in order to get the relation between the moments and the cumulants. I find that this is better done by using computer algebra rather than by pencil and paper. However, there are some easy results. First, $\kappa_1$ is the expectation, and $\kappa_2$ is the variance. Then, for the standard normal, we find that
\[ K_{N(0,1)}(u) = \tfrac{1}{2} u^2, \]
since the exponential and logarithmic functions are inverses. It follows that there is only one non-zero standard-normal cumulant, namely the second, which is equal to 1, and that is indeed the variance.
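Since the bookkeeping is indeed easiest with computer algebra, here is an illustrative sympy sketch (not part of the notes) that truncates (1) at order 4 with symbolic moments, expands $\log M_X(u)$, and reads off the first few cumulants.

    import sympy as sp

    u = sp.symbols('u')
    m1, m2, m3, m4 = sp.symbols('m1 m2 m3 m4')  # symbolic uncentred moments

    # M_X(u) truncated at order 4, as in equation (1); m_0 = 1.
    mgf = 1 + m1*u + m2*u**2/2 + m3*u**3/6 + m4*u**4/24

    # K_X(u) = log M_X(u); kappa_n is n! times the coefficient of u**n.
    cgf = sp.series(sp.log(mgf), u, 0, 5).removeO()
    for n in range(1, 5):
        print(n, sp.expand(sp.factorial(n) * cgf.coeff(u, n)))
    # kappa_1 = m1 and kappa_2 = m2 - m1**2, the expectation and the variance.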

Limiting Distribution of Scaled Random Walk

Begin by working out the m.g.f. and c.g.f. for an arbitrary scalar normal distribution $N(\mu, \sigma^2)$. Let $X = \mu + \sigma W$, where $W \sim N(0,1)$. Then
\[ M_X(u) = E\bigl(\exp(uX)\bigr) = E\bigl(\exp(u\mu + u\sigma W)\bigr) = e^{u\mu}\, E\bigl(\exp((u\sigma)W)\bigr) = \exp(u\mu + u^2\sigma^2/2), \]
since $E\bigl(\exp(vW)\bigr) = \exp(v^2/2)$ for any real $v$. By taking the log, we see that

\[ K_X(u) = u\mu + u^2\sigma^2/2. \]

We now want to obtain the c.g.f. of a scaled random walk $W^{(n)}(t)$, so as to be able to take the limit as $n \to \infty$. Recall that, by definition,

\[ W^{(n)}(t) = \frac{1}{\sqrt{n}}\, M_{nt} = \frac{1}{\sqrt{n}} \sum_{j=1}^{nt} X_j, \]

where it is assumed that $nt$ is an integer and that the $X_j$ are IID Rademacher variables, that is, random signs. Now,

\[
M_n(u) \equiv E\bigl(\exp(uW^{(n)}(t))\bigr) = E\,\exp\Bigl(\frac{u}{\sqrt{n}} \sum_{j=1}^{nt} X_j\Bigr)
= E\prod_{j=1}^{nt} \exp\Bigl(\frac{uX_j}{\sqrt{n}}\Bigr)
= \prod_{j=1}^{nt} E\,\exp\Bigl(\frac{uX_j}{\sqrt{n}}\Bigr)
= \Bigl[E\,\exp\Bigl(\frac{uX_j}{\sqrt{n}}\Bigr)\Bigr]^{nt},
\]

so that, since each $X_j$ equals $\pm 1$ with probability $\tfrac{1}{2}$,
\[ K_n(u) = nt \log\Bigl(\tfrac{1}{2}\bigl(e^{u/\sqrt{n}} + e^{-u/\sqrt{n}}\bigr)\Bigr) = nt \log\cosh(u/\sqrt{n}). \]

The next step is to compute the limit of $K_n(u)$. For that, we need two series expansions:
\[ \cosh x = 1 + \frac{x^2}{2} + O(x^4), \qquad \log(1 + z) = z + O(z^2), \]
so that
\[ \log\cosh x = \frac{x^2}{2} + O(x^4). \]
This gives
\[ K_n(u) = nt \log\cosh(u/\sqrt{n}) = nt\Bigl(\frac{u^2}{2n} + O(n^{-2})\Bigr) = \tfrac{1}{2} u^2 t + O(n^{-1}), \]
and from this we get the desired result:

\[ \lim_{n\to\infty} K_n(u) = \tfrac{1}{2} u^2 t, \]
and this is the c.g.f. of the distribution N(0, t).
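A minimal numerical illustration of the limit (the values of $u$ and $t$ below are arbitrary) evaluates $K_n(u) = nt\log\cosh(u/\sqrt{n})$ for increasing $n$ and compares it with $\tfrac{1}{2}u^2 t$.

    import numpy as np

    u, t = 1.3, 2.0                  # arbitrary illustrative values

    def K_n(n):
        # c.g.f. of the scaled random walk W^(n)(t), evaluated at u
        return n * t * np.log(np.cosh(u / np.sqrt(n)))

    for n in (10, 100, 1000, 10000):
        print(n, K_n(n), 0.5 * u**2 * t)   # K_n(u) approaches u^2 t / 2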

The SDE for the Vasicek Model

The Vasicek interest rate model is given by the stochastic differential equation

\[ dR = (\alpha - \beta R)\,dt + \sigma\,dW, \]

where $\alpha$, $\beta$, and $\sigma$ are positive constants. In order to solve this SDE, first forget about the stochastic term $\sigma\,dW$, and solve the ordinary differential equation that results:
\[ dR = (\alpha - \beta R)\,dt, \quad\text{or}\quad \frac{dR}{\alpha - \beta R} = dt. \]
The dependent variable $R$ appears only on the l.h.s., and the independent variable $t$ only on the r.h.s. The indefinite integral of the l.h.s. is $-\log(\alpha - \beta R)/\beta$. Thus we find, on integrating from 0 to $t$, that
\[ \log\frac{\alpha - \beta R(t)}{\alpha - \beta R(0)} = -\beta t. \]
This gives
\[ e^{\beta t}\bigl(\alpha - \beta R(t)\bigr) = \alpha - \beta R(0). \]
We interpret the r.h.s. above as the constant of integration, and define
\[ S(t) = e^{\beta t}\bigl(\alpha - \beta R(t)\bigr). \qquad (3) \]

Without the stochastic term, $dS(t) = 0$. With it, we have
\[ dS(t) = \beta e^{\beta t}\bigl[(\alpha - \beta R(t))\,dt - dR\bigr] = -\beta e^{\beta t}\sigma\, dW(t). \]

Note that since $S$ is linear in $R$, there is no $dR\,dR$ term. Integrate from 0 to $t$:
\[ S(t) = S(0) - \sigma\beta \int_0^t e^{\beta s}\, dW(s). \]
Now from (3), we see that
\[ R(t) = \frac{1}{\beta}\bigl(\alpha - e^{-\beta t} S(t)\bigr). \]
Since $S(0) = \alpha - \beta R(0)$, this yields
\[
R(t) = \frac{1}{\beta}\bigl(\alpha - e^{-\beta t} S(0)\bigr) + \sigma \int_0^t e^{-\beta(t-s)}\, dW(s)
= \frac{\alpha}{\beta} - e^{-\beta t}\Bigl(\frac{\alpha}{\beta} - R(0)\Bigr) + \sigma \int_0^t e^{-\beta(t-s)}\, dW(s)
\]
\[
= \frac{\alpha}{\beta}\bigl(1 - e^{-\beta t}\bigr) + R(0)e^{-\beta t} + \sigma \int_0^t e^{-\beta(t-s)}\, dW(s).
\]
This is the solution that Shreve pulls out of a hat.
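A small simulation makes the solution concrete. The sketch below is illustrative only (arbitrary parameter values and a plain Euler scheme, not anything taken from Shreve): it simulates the SDE and compares the sample mean of $R(T)$ with the deterministic part of the solution, $\frac{\alpha}{\beta}(1 - e^{-\beta T}) + R(0)e^{-\beta T}$, which is the expectation because the Itô integral has mean zero.

    import numpy as np

    alpha, beta, sigma = 0.05, 0.8, 0.02        # illustrative positive constants
    R0, T, n_steps, n_paths = 0.01, 5.0, 1000, 20000
    dt = T / n_steps

    rng = np.random.default_rng(0)
    R = np.full(n_paths, R0)
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        # Euler step for dR = (alpha - beta*R) dt + sigma dW
        R = R + (alpha - beta * R) * dt + sigma * dW

    exact_mean = alpha / beta * (1 - np.exp(-beta * T)) + R0 * np.exp(-beta * T)
    print(R.mean(), exact_mean)                 # the two should be close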

Notes on the Exponential and Poisson Distributions

Consider the simplest exponential distribution, where the parameter $\lambda$ is set equal to 1. The CDF of this distribution is

\[ F(x) = 1 - e^{-x}, \qquad x \ge 0. \]

Let $X$ be a random variable that has this distribution. Consider another r.v. $Y = X/\lambda$. Then
\[ \Pr(Y \le y) = \Pr(X \le \lambda y) = 1 - e^{-\lambda y}. \]
The density of $Y$ is then $\lambda e^{-\lambda y}$, and this is Shreve's definition of the exponential density. But it is clear that $1/\lambda$ is just a scale parameter.

A useful result is that $E(X^n) = n!$ for positive integer $n$. This follows by induction: $E(X^0) = 1 = 0!$ and, by the induction hypothesis,
\[
E(X^{n+1}) = \int_0^{\infty} x^{n+1} e^{-x}\, dx = -\int_0^{\infty} x^{n+1}\, d(e^{-x}) = \int_0^{\infty} e^{-x}\, d(x^{n+1})
= (n+1)\int_0^{\infty} e^{-x} x^n\, dx = (n+1)E(X^n) = (n+1)!.
\]
The final step in the first line above follows on integration by parts. Let
\[ S_n = \sum_{k=1}^{n} \tau_k, \]

where the $\tau_k$ are IID exponential random variables with expectation 1. If we set $\lambda = 1$ in Shreve's Lemma 11.2.1, it says that the distribution of $S_n$ is the gamma distribution, with density
\[ g_n(s) = \frac{s^{n-1} e^{-s}}{(n-1)!}. \qquad (4) \]
Shreve proves this by an induction argument. This relies on the following result:

If $X$ and $Y$ are independent random variables, with CDFs $F_X$ and $F_Y$, and densities $f_X$ and $f_Y$, then the density of $Z = X + Y$ is
\[ \int_{-\infty}^{\infty} f_X(z - y) f_Y(y)\, dy \quad\text{or}\quad \int_{-\infty}^{\infty} f_Y(z - x) f_X(x)\, dx. \]
It is easy to see that these two expressions are equal. Either of them is called the convolution of the two densities. To prove the result, notice that
\[ F_Z(z) = \Pr(Z \le z) = \Pr(X \le z - Y) = E\bigl(I(X \le z - Y)\bigr) = E\Bigl(E\bigl(I(X \le z - Y) \mid Y\bigr)\Bigr). \]

The last expression above is
\[ E\bigl(F_X(z - Y)\bigr) = \int_{-\infty}^{\infty} F_X(z - y)\, dF_Y(y). \]

The density of $Z$ is the derivative of this w.r.t. $z$, and this is

\[ f_Z(z) = \int_{-\infty}^{\infty} f_X(z - y) f_Y(y)\, dy. \]

This completes the proof.

Another way of obtaining the result relies on the moment-generating function of the exponential distribution. If $\tau$ has this distribution with expectation 1, we have

\[ E(e^{-u\tau}) = \int_0^{\infty} e^{-ux} e^{-x}\, dx = (1 + u)^{-1}. \]

Then, for $S_n$,

\[ E(e^{-uS_n}) = E\,\exp\Bigl(-u \sum_{k=1}^{n} \tau_k\Bigr) = E\prod_{k=1}^{n} \exp(-u\tau_k) = \prod_{k=1}^{n} E\bigl(\exp(-u\tau_k)\bigr) = (1 + u)^{-n}. \]

We can now show that this is also what we get from the gamma density (4). With that density, $E(e^{-uS_n})$ is
\[ \int_0^{\infty} \frac{s^{n-1} e^{-s}}{(n-1)!}\, e^{-us}\, ds = \int_0^{\infty} \frac{s^{n-1} e^{-s(u+1)}}{(n-1)!}\, ds. \]
Change variable by $t = s(u+1)$ to get
\[ \int_0^{\infty} \frac{e^{-t}\, t^{n-1}}{(1+u)^n (n-1)!}\, dt = \frac{1}{(1+u)^n (n-1)!}\, E(X^{n-1}) = (1 + u)^{-n}, \]

as required.
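An illustrative simulation check of Lemma 11.2.1 (assuming numpy and scipy; the choices of $n$ and $u$ are arbitrary): it builds $S_n$ as a sum of unit exponentials, compares the sample mean of $e^{-uS_n}$ with $(1+u)^{-n}$, and compares a histogram of $S_n$ with the gamma density (4).

    import numpy as np
    from scipy.stats import gamma

    n, u = 5, 0.7                                    # illustrative choices
    rng = np.random.default_rng(1)

    # S_n as the sum of n IID unit-mean exponential variables.
    S = rng.exponential(scale=1.0, size=(200000, n)).sum(axis=1)

    # m.g.f. check: E(exp(-u S_n)) should equal (1 + u)**(-n).
    print(np.mean(np.exp(-u * S)), (1 + u) ** (-n))

    # Density check: compare a histogram of S_n with s**(n-1) * exp(-s) / (n-1)!.
    hist, edges = np.histogram(S, bins=50, density=True)
    mids = 0.5 * (edges[:-1] + edges[1:])
    print(np.max(np.abs(hist - gamma.pdf(mids, a=n))))  # should be small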
