
Econ 508B: Lecture 5 Expectation, MGF and CGF

Hongyi Liu

Washington University in St. Louis

July 31, 2017

Outline

1 Expected Values

2 Moment Generating Functions

3 Cumulant Generating Functions


Motivation: Probability vs. Expectation

To start with, people probably have a better understanding of an expectation than of a probability. Many problems, such as optimization and approximation problems, are phrased in terms of expectations. Expectations are indeed special cases of integrals and are treated with uniformity and economy.

Definition 1.1
Let X be a random variable on $(\Omega, \mathcal{F}, P)$. The expected value of X, denoted EX, is defined as
$$EX = \int_\Omega X \, dP,$$
provided the integral is well-defined, i.e., at least one of the two quantities $\int_\Omega X^+ \, dP$ and $\int_\Omega X^- \, dP$ is finite.

Proposition 1.1 (Change of variable formula)
Let X be a random variable on $(\Omega, \mathcal{F}, P)$ and let $g : \mathbb{R} \to \mathbb{R}$ be Borel measurable, so that $Y = g(X)$ is also a random variable on $(\Omega, \mathcal{F}, P)$. Then
$$\int_\Omega |Y| \, dP = \int_{\mathbb{R}} |g(x)| \, P_X(dx) = \int_{\mathbb{R}} |y| \, P_Y(dy).$$
If $\int_\Omega |Y| \, dP < \infty$, then
$$\int_\Omega Y \, dP = \int_{\mathbb{R}} g(x) \, P_X(dx) = \int_{\mathbb{R}} y \, P_Y(dy).$$
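A minimal numerical sketch of the change-of-variable formula (not from the original slides; it assumes numpy, and the choice $g(x) = x^2$ with X standard normal, so that $Y = g(X)$ is chi-squared with one degree of freedom, is mine for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Left-hand side: E[g(X)] computed on the original space,
# averaging g over draws of X ~ N(0, 1).
x = rng.standard_normal(n)
lhs = np.mean(x**2)                # ~ integral of g(x) P_X(dx)

# Right-hand side: the same expectation computed from the law of
# Y = g(X) = X^2, which is chi-squared with 1 degree of freedom.
y = rng.chisquare(df=1, size=n)
rhs = np.mean(y)                   # ~ integral of y P_Y(dy)

print(lhs, rhs)                    # both approximately E[X^2] = 1
```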

Moment

Definition 1.2
For any positive integer n, the n-th moment $\mu_n$ and the n-th central moment $\mu_n'$ of a random variable X are defined by
$$\mu_n \equiv E X^n, \qquad \mu_n' \equiv E(X - EX)^n,$$
provided the expectations are well-defined.

In particular, the variance of a random variable X is the second central moment, namely $\operatorname{Var}(X) = E(X - EX)^2$, provided $EX^2 < \infty$.
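A quick numerical illustration of raw versus central moments (assuming numpy; the exponential example is my choice): the identity $\operatorname{Var}(X) = EX^2 - (EX)^2$ should hold up to simulation error.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=1_000_000)   # X exponential with EX = 2

mu1 = np.mean(x)                   # first moment  EX
mu2 = np.mean(x**2)                # second moment EX^2
var = np.mean((x - mu1)**2)        # second central moment

print(mu2 - mu1**2, var)           # both approximately Var(X) = 4
```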



MGF

Definition 2.1
The moment generating function (MGF) of a random variable X is
$$M_X(t) \equiv E(e^{tX}), \quad \text{for all } t \in \mathbb{R}.$$

Since $e^{tX}$ is always non-negative, $E(e^{tX})$ is well-defined, but it could be infinite (why?). The payoff of the MGF is that it gives a direct connection between $M_X$ and the moments of the random variable X, as follows.

Non-negative case

Proposition 2.1
Let X be a non-negative random variable and let $t > 0$. Then
$$M_X(t) \equiv E(e^{tX}) = \sum_{n=0}^{\infty} \frac{t^n \mu_n}{n!}.$$
Proof: By the Taylor expansion $e^{tX} = \sum_{n=0}^{\infty} \frac{t^n X^n}{n!}$ and the non-negativity of X, this follows from the Monotone Convergence Theorem (M.C.T.).
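A small sanity check of Proposition 2.1 (not from the slides; it uses only the standard library, and the Bernoulli example is mine, chosen because $\mu_n = p$ for every $n \ge 1$, so the series has a closed form):

```python
import math

p, t = 0.3, 1.5

# For X ~ Bernoulli(p): mu_0 = 1 and mu_n = E[X^n] = p for all n >= 1,
# so the moment series can be summed term by term.
series = 1.0 + sum(t**n * p / math.factorial(n) for n in range(1, 40))

direct = (1 - p) + p * math.exp(t)   # E[e^{tX}] computed directly

print(series, direct)                # agree up to floating-point error
```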

Bounded case

Proposition 2.2
Let X be a random variable and let $M_X(t)$ be finite for all $|t| < \varepsilon$, for some $\varepsilon > 0$. Then:
(1) $E|X|^n < \infty$ for all $n \ge 1$;
(2) $M_X(t) = \sum_{n=0}^{\infty} t^n \frac{\mu_n}{n!}$ for all $|t| < \varepsilon$;
(3) $M_X(\cdot)$ is infinitely differentiable on $(-\varepsilon, \varepsilon)$, and for $r \in \mathbb{N}$ the r-th derivative of $M_X(\cdot)$ is
$$M_X^{(r)}(t) = \sum_{n=0}^{\infty} \frac{t^n}{n!} \mu_{n+r} = E(e^{tX} X^r) \quad \text{for } |t| < \varepsilon.$$
In particular,
$$M_X^{(r)}(0) = \mu_r = E X^r.$$


Proof

(1): Since $M_X(t)$ is finite for $|t| < \varepsilon$ and $\frac{|t|^n |X|^n}{n!} \le e^{|tX|}$ for all $n \in \mathbb{N}$, we have
$$E(e^{|tX|}) \le E(e^{tX}) + E(e^{-tX}) < \infty \quad \text{for } |t| < \varepsilon.$$
Therefore, choosing any nonzero $t \in (-\varepsilon, \varepsilon)$ gives $E|X|^n \le n!\, E(e^{|tX|}) / |t|^n < \infty$, which is (1).

(2): Notice that $\bigl|\sum_{j=0}^{n} \frac{(tx)^j}{j!}\bigr| \le e^{|tx|}$ for all $x \in \mathbb{R}$ and $n \in \mathbb{N}$; the Dominated Convergence Theorem (D.C.T.) then implies that (2) holds.

(3): The derivatives of $M_X(\cdot)$ can be found by term-by-term differentiation of the power series. Hence,
$$M_X^{(r)}(t) = \frac{d^r}{dt^r}\Bigl(\sum_{n=0}^{\infty} t^n \frac{\mu_n}{n!}\Bigr) = \sum_{n=0}^{\infty} \frac{d^r(t^n)}{dt^r} \frac{\mu_n}{n!} = \sum_{n=r}^{\infty} \mu_n \frac{t^{n-r}}{(n-r)!} = \sum_{n=0}^{\infty} \mu_{n+r} \frac{t^n}{n!}.$$
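A numerical sketch of the identity $M_X^{(r)}(0) = \mu_r$ (the finite-difference stencil and the N(0, 1) example are my illustrative choices, not part of the lecture):

```python
import math

def M(t):
    return math.exp(t**2 / 2)        # MGF of X ~ N(0, 1)

# Second-order central finite-difference stencil for the 4th derivative.
h = 0.05
d4 = (M(2*h) - 4*M(h) + 6*M(0) - 4*M(-h) + M(-2*h)) / h**4

print(d4)                            # approximately mu_4 = E[X^4] = 3
```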


Example 2.1
Let $X \sim N(0, 1)$. Then for all $t \in \mathbb{R}$,
$$M_X(t) = \int_{-\infty}^{+\infty} e^{tx} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx = e^{t^2/2} = \sum_{k=0}^{\infty} \frac{(t^2)^k}{k!} \frac{1}{2^k}.$$
Thus
$$\mu_n = \begin{cases} 0 & \text{if } n \text{ is odd}, \\[4pt] \dfrac{(2k)!}{k!\, 2^k} & \text{if } n = 2k, \; k = 1, 2, \ldots \end{cases}$$
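A Monte Carlo check of the even-moment formula above (a sketch assuming numpy; the sample size is arbitrary):

```python
import math
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(2_000_000)   # draws of X ~ N(0, 1)

for k in (1, 2, 3):
    exact = math.factorial(2*k) / (math.factorial(k) * 2**k)
    mc = np.mean(x**(2*k))           # Monte Carlo estimate of mu_{2k}
    print(2*k, exact, round(mc, 2))  # 1, 3, 15 up to simulation noise
```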

Remark 2.1

If $M_X(t)$ is finite in a neighborhood of the origin, then all the moments $\{\mu_n\}_{n \ge 1}$ of X are determined, and so is its probability distribution. However, in general, probability distributions are not completely determined by their moments.

Intuitively speaking, if the sequence of moments does not grow too quickly, then the distribution is determined by its moments.

Example 2.2
A standard example of two distinct distributions with the same moments is based on the lognormal density (Billingsley, Probability and Measure, Chapter 30):
$$f(x) = \frac{1}{\sqrt{2\pi}} \, \frac{1}{x} \exp\!\bigl(-(\log x)^2/2\bigr), \quad x > 0,$$
and its perturbed density
$$f_a(x) = f(x)\bigl(1 + a \sin(2\pi \log x)\bigr), \quad -1 \le a \le 1.$$
They have the same moments, and the n-th moment of each of them is $\exp(n^2/2)$. Proof: Homework!
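A numerical sketch of Example 2.2 (not a proof; it assumes numpy and scipy, and uses the substitution $u = \log x$, which turns each moment into a Gaussian integral):

```python
import numpy as np
from scipy.integrate import quad

def phi(u):                          # standard normal density
    return np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)

a = 0.5
for n in range(1, 5):
    # After substituting u = log x, the n-th moment of f is
    # the integral of e^{nu} phi(u) du.
    m = quad(lambda u: np.exp(n*u) * phi(u), -15, 15, limit=200)[0]
    # The perturbation adds e^{nu} phi(u) a sin(2*pi*u), which
    # integrates to zero, so f_a has the same n-th moment.
    ma = quad(lambda u: np.exp(n*u) * phi(u) * (1 + a*np.sin(2*np.pi*u)),
              -15, 15, limit=200)[0]
    print(n, m, ma, np.exp(n**2 / 2))   # all three values agree
```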

Joint moment generating function

Definition 2.2
The joint moment generating function of a random vector $X = (X_1, \ldots, X_k)$ is defined by
$$M_{X_1, \ldots, X_k}(t_1, \ldots, t_k) \equiv E(e^{t_1 X_1 + \cdots + t_k X_k}),$$
for all $t_1, \ldots, t_k \in \mathbb{R}$. The convention applied here for $M_{X_1, \ldots, X_k}(\cdot)$ is similar to that for $M_X(t)$: the MGF of X "exists" if $M_{X_1, \ldots, X_k}(\cdot)$ is finite in a neighborhood of the origin of $\mathbb{R}^k$, i.e., for $\|t\| < t_0$ with some $t_0 > 0$.

If the MGF exists, it admits the expansion
$$M_X(t) = 1 + \sum_{i=1}^{k} \kappa^{i} t_i + \frac{1}{2} \sum_{i,j=1}^{k} \kappa^{ij} t_i t_j + \cdots,$$
where $\kappa^{i_1 \cdots i_r} = E(X_{i_1} \cdots X_{i_r})$ for $i_1, \ldots, i_r = 1, \ldots, k$ is referred to as the moment about the origin of order r of X. The moments of order r form an array that is symmetric with respect to permutations of the indices.

Moreover,
$$\kappa^{i_1 \cdots i_r} = \left. \frac{\partial^r M_X(t)}{\partial t_{i_1} \cdots \partial t_{i_r}} \right|_{t=0}.$$
The relationship
$$M_X(t) = M_{X_1}(t_1) \times \cdots \times M_{X_k}(t_k)$$
holds for all t in a neighborhood of the origin if and only if the components of X are independent.
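A Monte Carlo sketch of the factorization under independence (assuming numpy; the bivariate standard normal example and the values of $t_1, t_2$ are my choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
x1 = rng.standard_normal(n)          # X1 and X2 independent N(0, 1)
x2 = rng.standard_normal(n)

t1, t2 = 0.4, -0.7
joint = np.mean(np.exp(t1*x1 + t2*x2))
prod = np.mean(np.exp(t1*x1)) * np.mean(np.exp(t2*x2))

print(joint, prod)   # both approximately exp((t1**2 + t2**2)/2) ~ 1.384
```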


Example

Suppose $M_X(t) = \frac{1}{8} e^{-5t} + \frac{1}{4} e^{t} + \frac{5}{8} e^{7t}$. What is $E(X^n)$?
Answer:
$$M_X^{(n)}(t) = \frac{1}{8}(-5)^n e^{-5t} + \frac{1}{4} e^{t} + \frac{5}{8}\, 7^n e^{7t},$$
$$E[X^n] = M_X^{(n)}(0) = \frac{1}{8}(-5)^n + \frac{1}{4} + \frac{5}{8}\, 7^n.$$
Alternatively, by the definitions of expectation and of the MGF, X takes the value $-5$ with probability $1/8$, the value $1$ with probability $1/4$, and the value $7$ with probability $5/8$. Thus
$$E[X^n] = M_X^{(n)}(0) = \frac{1}{8}(-5)^n + \frac{1}{4} + \frac{5}{8}\, 7^n$$
follows directly.
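A symbolic check of this example (a sketch assuming sympy; the moment order n = 3 is arbitrary):

```python
import sympy as sp

t = sp.symbols('t')
M = (sp.Rational(1, 8)*sp.exp(-5*t)
     + sp.Rational(1, 4)*sp.exp(t)
     + sp.Rational(5, 8)*sp.exp(7*t))

n = 3                                          # any moment order works
from_mgf = sp.diff(M, t, n).subs(t, 0)         # n-th derivative at t = 0
direct = (sp.Rational(1, 8)*(-5)**n + sp.Rational(1, 4)
          + sp.Rational(5, 8)*7**n)            # E[X^n] from the pmf

print(from_mgf, direct)                        # both print 199
```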


Cumulant Generating Function

Definition 3.1
Let $M_X(t)$ be finite for $|t| < t_0$. The cumulant generating function (CGF) of X is defined as
$$K_X(t) = \log M_X(t).$$
The CGF also completely determines the distribution of X, and it can be expanded in a power series with the same radius of convergence $R \ge t_0$:
$$K_X(t) = \kappa_1 t + \kappa_2 \frac{t^2}{2!} + \kappa_3 \frac{t^3}{3!} + \cdots.$$
The coefficient $\kappa_r$ of $t^r/r!$ is referred to as the cumulant of order r of X:
$$\kappa_r = \kappa_r(X) = \left. \frac{d^r}{dt^r} K_X(t) \right|_{t=0}.$$
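A symbolic sketch of cumulants as derivatives of the CGF (assuming sympy; the Poisson example is my choice, picked because every cumulant equals $\lambda$):

```python
import sympy as sp

t = sp.symbols('t', real=True)
lam = sp.symbols('lambda', positive=True)

M = sp.exp(lam * (sp.exp(t) - 1))    # MGF of X ~ Poisson(lambda)
K = sp.log(M)                        # CGF: lambda * (e^t - 1)

for r in (1, 2, 3, 4):
    kappa_r = sp.simplify(sp.diff(K, t, r).subs(t, 0))
    print(r, kappa_r)                # every Poisson cumulant equals lambda
```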

Multivariate Cumulant Generating Function

When $X = (X_1, \ldots, X_k)$ is a random vector, the CGF is defined as
$$K_X(t) = \log M_X(t).$$
If $M_X(t)$ exists, then the CGF admits a multivariate power-series expansion in a neighborhood of the origin, with coefficients corresponding to the joint cumulants of X.

Definition 3.2
The joint cumulant of order r is
$$\kappa^{i_1, i_2, \cdots, i_r} = \left. \frac{\partial^r K_X(t)}{\partial t_{i_1} \cdots \partial t_{i_r}} \right|_{t=0}.$$

Sums of I.I.D. random variables

Let $S_n = \sum_{i=1}^{n} X_i$, where the $X_i$ are i.i.d. and $M_{X_i}$ exists. Then
$$M_{S_n}(t) = (M_X(t))^n, \qquad K_{S_n}(t) = n K_X(t),$$
and hence $\kappa_r(S_n) = n \kappa_r(X) = n \kappa_r$. In a word, when working with sums of i.i.d. random variables, the cumulants of the sum are simply n times the cumulants of each summand.
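A numerical sketch of this scaling (not from the slides; it assumes numpy and scipy, where scipy.stats.kstat returns unbiased sample cumulants up to order 4, and the Exponential(1) example, whose cumulants are $\kappa_r = (r-1)!$, is my choice):

```python
import math
import numpy as np
from scipy.stats import kstat

rng = np.random.default_rng(4)
n, reps = 10, 200_000

# X_i ~ Exponential(1), whose r-th cumulant is (r - 1)!.
s = rng.exponential(size=(reps, n)).sum(axis=1)    # draws of S_n

for r in (1, 2, 3, 4):
    # Sample cumulant of S_n versus the theoretical value n * (r - 1)!.
    print(r, kstat(s, r), n * math.factorial(r - 1))
```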

Example 3.1
Let $X \sim N(\mu, \sigma^2)$. Then
$$M_X(t) = e^{\mu t + \sigma^2 \frac{t^2}{2}}, \qquad K_X(t) = \mu t + \sigma^2 \frac{t^2}{2}.$$
Therefore $\kappa_1 = \mu$, $\kappa_2 = \sigma^2$, and $\kappa_r = 0$ for $r = 3, 4, \ldots$.

Cumulants of order larger than 2 are all zero if and only if X has a normal distribution.
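A symbolic check of Example 3.1 (a sketch assuming sympy; it differentiates the CGF from the slide directly):

```python
import sympy as sp

t, mu = sp.symbols('t mu', real=True)
sigma = sp.symbols('sigma', positive=True)

K = mu*t + sigma**2 * t**2 / 2       # CGF of N(mu, sigma^2)

for r in (1, 2, 3, 4):
    print(r, sp.diff(K, t, r).subs(t, 0))   # mu, sigma**2, 0, 0
```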

Location Shifts

Shifting from X to $X + a$ induces the corresponding transformations of $M_X(\cdot)$ and $K_X(\cdot)$:
$$M_{X+a}(t) = E(e^{t(X+a)}) = e^{at} M_X(t), \qquad K_{X+a}(t) = at + K_X(t).$$
Only the first cumulant is affected: $\kappa_1(X + a) = a + \kappa_1$, while $\kappa_r(X + a) = \kappa_r$ for $r \ge 2$.

Scale Changes

Rescaling X by $b > 0$ yields $X/b$. It follows that
$$M_{X/b}(t) = E(e^{tX/b}) = M_X(t/b), \qquad K_{X/b}(t) = K_X(t/b),$$
$$\kappa_r(X/b) = \kappa_r(X)/b^r = \kappa_r / b^r.$$
All cumulants are affected by a scale change unless $b = 1$.
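A symbolic sketch combining the shift and scale rules (assuming sympy; the Poisson CGF is my illustrative choice):

```python
import sympy as sp

t, a = sp.symbols('t a', real=True)
b = sp.symbols('b', positive=True)
lam = sp.symbols('lambda', positive=True)

K = lam * (sp.exp(t) - 1)            # CGF of X ~ Poisson(lambda)
K_shift = a*t + K                    # CGF of X + a
K_scale = K.subs(t, t / b)           # CGF of X / b

for r in (1, 2, 3):
    row = [sp.diff(f, t, r).subs(t, 0) for f in (K, K_shift, K_scale)]
    print(r, row)   # shifting changes only r = 1; scaling divides by b**r
```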
