Econ 508B: Lecture 5 Expectation, MGF and CGF
Hongyi Liu
Washington University in St. Louis
July 31, 2017
Outline
1 Expected Values
2 Moment Generating Functions
3 Cumulant Generating Functions
Motivation: Probability vs. Expectation
To start with, people often have better intuition for an expected value than for a probability. Many problems, such as optimization and approximation problems, are naturally phrased in terms of expectations. Probabilities can in turn be seen as special cases of expectations (of indicator functions), so both are treated with uniformity and economy.
Definition 1.1
Let X be a random variable on (Ω, F, P). The expected value of X, EX, is defined as
$$EX = \int_\Omega X \, dP,$$
provided the integral is well-defined, i.e., at least one of the two quantities $\int X^+ \, dP$ and $\int X^- \, dP$ is finite.
Proposition 1.1 (Change of variable formula)
Let X be a random variable on (Ω, F, P), let $g : \mathbb{R} \to \mathbb{R}$ be Borel measurable, and set Y = g(X), which is again a random variable on (Ω, F, P). Then
$$\int_\Omega |Y| \, dP = \int_{\mathbb{R}} |g(x)| \, P_X(dx) = \int_{\mathbb{R}} |y| \, P_Y(dy).$$
If $\int_\Omega |Y| \, dP < \infty$, then
$$\int_\Omega Y \, dP = \int_{\mathbb{R}} g(x) \, P_X(dx) = \int_{\mathbb{R}} y \, P_Y(dy).$$
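For intuition, the formula can be checked on a small sketch, assuming a discrete X uniform on {−1, 0, 1} and g(x) = x²; both sides of the identity are then finite sums:

```python
from fractions import Fraction

# Assumed example: X uniform on {-1, 0, 1}, g(x) = x^2, so Y = g(X).
pX = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}
g = lambda x: x * x

# Left side: integrate g(x) against the law P_X of X.
lhs = sum(g(x) * p for x, p in pX.items())

# Right side: build the pushforward law P_Y, then integrate y against it.
pY = {}
for x, p in pX.items():
    pY[g(x)] = pY.get(g(x), Fraction(0)) + p
rhs = sum(y * p for y, p in pY.items())
# Both sides equal E[g(X)] = 2/3.
```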
Moment
Definition 1.2
For any positive integer n, the $n$th moment $\mu_n$ and the $n$th central moment $\mu'_n$ of a random variable X are defined by
$$\mu_n \equiv E X^n, \qquad \mu'_n \equiv E(X - EX)^n,$$
provided the expectations are well-defined.
In particular, the variance of a random variable X is the 2nd central moment, namely $Var(X) = E(X - EX)^2$, provided $EX^2 < \infty$.
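As a minimal sketch, assuming X is a fair six-sided die, the moments come straight from the pmf, and expanding the square gives the familiar identity $Var(X) = EX^2 - (EX)^2$:

```python
# Assumed example: a fair six-sided die; moments computed from the pmf.
values = [1, 2, 3, 4, 5, 6]
pmf = [1 / 6] * 6

mu1 = sum(x * p for x, p in zip(values, pmf))               # first moment, EX
mu2 = sum(x ** 2 * p for x, p in zip(values, pmf))          # second moment, EX^2
var = sum((x - mu1) ** 2 * p for x, p in zip(values, pmf))  # E(X - EX)^2
# Var(X) = EX^2 - (EX)^2 follows by expanding the square inside the expectation.
```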
MGF
Definition 2.1
The moment generating function (MGF) of a random variable X is
$$M_X(t) \equiv E(e^{tX}), \quad \text{for all } t \in \mathbb{R}.$$
$e^{tX}$ is always non-negative; therefore $E(e^{tX})$ is well-defined but could be infinite (why?).
The payoff of the MGF is the direct connection it gives to the moments of a random variable X, as follows.
Non-negative case
Proposition 2.1
Let X be a non-negative random variable and t > 0. Then
$$M_X(t) \equiv E(e^{tX}) = \sum_{n=0}^\infty \frac{t^n \mu_n}{n!}.$$
Proof: By the Taylor expansion, $e^{tX} = \sum_{n=0}^\infty \frac{t^n X^n}{n!}$, and since X is non-negative, the result follows from the monotone convergence theorem (M.C.T.).
Bounded case
Proposition 2.2
Let X be a random variable and let $M_X(t)$ be finite for all $|t| < \epsilon$, for some $\epsilon > 0$. Then
(1) $E|X|^n < \infty$ for all $n \ge 1$,
(2) $M_X(t) = \sum_{n=0}^\infty t^n \frac{\mu_n}{n!}$ for all $|t| < \epsilon$,
(3) $M_X(\cdot)$ is infinitely differentiable on $(-\epsilon, +\epsilon)$ and, for $r \in \mathbb{N}$, the $r$th derivative of $M_X(\cdot)$ is
$$M_X^{(r)}(t) = \sum_{n=0}^\infty \mu_{n+r} \frac{t^n}{n!} = E(e^{tX} X^r) \quad \text{for } |t| < \epsilon.$$
In particular,
$$M_X^{(r)}(0) = \mu_r = E X^r.$$
Proof
(1): Since $M_X(t)$ is finite for $|t| < \epsilon$ and $\frac{|t|^n |X|^n}{n!} \le e^{|tX|}$ for all $n \in \mathbb{N}$, we have
$$E(e^{|tX|}) \le E(e^{tX}) + E(e^{-tX}) < \infty \quad \text{for } |t| < \epsilon.$$
Therefore, choosing a $t \in (-\epsilon, +\epsilon)$, $t \ne 0$, yields (1).
(2): Notice that $\big|\sum_{j=0}^n \frac{(tx)^j}{j!}\big| \le e^{|tx|}$ for all $x \in \mathbb{R}$ and $n \in \mathbb{N}$; then the dominated convergence theorem (D.C.T.) implies (2).
(3): The derivative of $M_X(\cdot)$ can be found by term-by-term differentiation of the power series. Hence,
$$M_X^{(r)}(t) = \frac{d^r}{dt^r}\Big(\sum_{n=0}^\infty t^n \frac{\mu_n}{n!}\Big) = \sum_{n=0}^\infty \frac{d^r(t^n)}{dt^r} \frac{\mu_n}{n!} = \sum_{n=r}^\infty \mu_n \frac{t^{n-r}}{(n-r)!} = \sum_{n=0}^\infty \mu_{n+r} \frac{t^n}{n!}.$$
Remark 2.1
If $M_X(t)$ is finite in a neighborhood of the origin, then all the moments $\{\mu_n\}_{n \ge 1}$ of X are determined, and so is its probability distribution. However, in general, probability distributions are not completely determined by their moments.
Example 2.1
Let $X \sim N(0, 1)$. Then for all $t \in \mathbb{R}$,
$$M_X(t) = \int_{-\infty}^{+\infty} e^{tx} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx = e^{t^2/2} = \sum_{k=0}^\infty \frac{(t^2)^k}{k!} \frac{1}{2^k}.$$
Thus
$$\mu_n = \begin{cases} 0 & \text{if } n \text{ is odd}, \\ \frac{(2k)!}{k!\,2^k} & \text{if } n = 2k,\ k = 1, 2, \dots \end{cases}$$
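The even-moment formula can be sanity-checked numerically; the sketch below approximates $E[X^n]$ for $X \sim N(0,1)$ with a midpoint Riemann sum (the truncation limits and step count are ad hoc choices):

```python
import math

def normal_moment(n, lo=-12.0, hi=12.0, steps=200_000):
    """Approximate E[X^n] for X ~ N(0, 1) by a midpoint Riemann sum."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += x ** n * math.exp(-x * x / 2.0)
    return total * h / math.sqrt(2.0 * math.pi)

# mu_{2k} = (2k)! / (k! 2^k) gives 1 and 3 for n = 2 and 4; odd moments vanish.
mu2_exact = math.factorial(2) / (math.factorial(1) * 2)
mu4_exact = math.factorial(4) / (math.factorial(2) * 4)
```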
Intuitively speaking, if the sequence of moments does not grow too quickly, then the distribution is determined by its moments.
Example 2.2
A standard example of two distinct distributions with the same moments is based on the density of the lognormal distribution (Billingsley, Probability and Measure, Chapter 30):
$$f(x) = \frac{1}{\sqrt{2\pi}} \, \frac{1}{x} \exp(-(\log x)^2/2), \quad x > 0,$$
and its perturbed density:
$$f_a(x) = f(x)\,(1 + a \sin(2\pi \log x)).$$
They have the same moments, and the $n$th moment of each is $\exp(n^2/2)$. Proof: homework!
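The claim can be probed numerically. After substituting $y = \log x$, the difference between the $n$th moments of $f_a$ and $f$ reduces to $a \int e^{ny} \varphi(y) \sin(2\pi y)\,dy$, which vanishes for every integer n; the sketch below (with an arbitrary a and ad hoc integration limits) approximates that integral:

```python
import math

def perturbation_term(n, a=0.5, lo=-12.0, hi=16.0, steps=400_000):
    """Midpoint approximation of a * int e^{n y} phi(y) sin(2 pi y) dy,
    i.e. the gap between the n-th moments of f_a and f after y = log x."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * h
        total += math.exp(n * y - y * y / 2.0) * a * math.sin(2.0 * math.pi * y)
    return total * h / math.sqrt(2.0 * math.pi)
```

Shifting $y \mapsto y + n$ turns the integrand into an odd function times $\varphi$, which is why the gap is exactly zero.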
Joint moment generating function
Definition 2.2
The joint moment generating function of a random vector $X = (X_1, \dots, X_k)$ is defined by
$$M_{X_1,\dots,X_k}(t_1, \dots, t_k) \equiv E(e^{t_1 X_1 + \cdots + t_k X_k}),$$
for all $t_1, \dots, t_k \in \mathbb{R}$. The convention here for $M_{X_1,\dots,X_k}(\cdot)$ is similar to that for $M_X(t)$: the MGF of X 'exists' if $M_{X_1,\dots,X_k}(\cdot)$ is finite in a neighborhood of the origin of $\mathbb{R}^k$, i.e., for $\|t\| < t_0$ with some $t_0 > 0$.
$$M_X(t) = 1 + \sum_{i=1}^k \kappa^i t_i + \frac{1}{2} \sum_{i,j=1}^k \kappa^{ij} t_i t_j + \cdots,$$
where $\kappa^{i_1 \cdots i_r} = E(X_{i_1} \cdots X_{i_r})$ for $i_1, \dots, i_r = 1, \dots, k$ is referred to as the moment about the origin of order r of X. The moments of order r form an array that is symmetric w.r.t. permutations of the indices.
Moreover,
$$\kappa^{i_1 \cdots i_r} = \frac{\partial^r M_X(t)}{\partial t_{i_1} \cdots \partial t_{i_r}} \Big|_{t=0}.$$
The relationship
$$M_X(t) = M_{X_1}(t_1) \times \cdots \times M_{X_k}(t_k)$$
holds if and only if the components of X are independent.
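A minimal check of this product rule, assuming a hypothetical pair of independent discrete coordinates, so the joint pmf is the product of the marginals:

```python
import math

# Assumed: X1 uniform on {0, 1} and X2 uniform on {0, 1, 2}, independent.
xs, ys = [0, 1], [0, 1, 2]
px, py = 1 / 2, 1 / 3

def joint_mgf(t1, t2):
    """E[e^{t1 X1 + t2 X2}] under the product joint pmf."""
    return sum(px * py * math.exp(t1 * x + t2 * y) for x in xs for y in ys)

def mgf_x1(t):
    return sum(px * math.exp(t * x) for x in xs)

def mgf_x2(t):
    return sum(py * math.exp(t * y) for y in ys)
# With independent components, joint_mgf(t1, t2) = mgf_x1(t1) * mgf_x2(t2).
```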
Hongyi Liu (Washington University in St. Louis)Math Camp 2017 Stats July 31, 2017 15 / 23 Alternatively, by the definition of expectation and MGF, random variable X, occurs −5 with probability 1/8, occurs 1 with probability 1/4, and occurs 7 with probability 5/8. Thus its E(Xn) is trivially n (n) 1 n 1 5 n E[X ] = MX (0) = 8 (−5) + 4 + 8 7 .
Example
Suppose $M_X(t) = \frac18 e^{-5t} + \frac14 e^{t} + \frac58 e^{7t}$. What is $E(X^n)$?
Answer:
$$M_X^{(n)}(t) = \frac18 (-5)^n e^{-5t} + \frac14 e^{t} + \frac58\, 7^n e^{7t},$$
$$E[X^n] = M_X^{(n)}(0) = \frac18 (-5)^n + \frac14 + \frac58\, 7^n.$$
Alternatively, by the definition of expectation and the MGF, the random variable X takes the value −5 with probability 1/8, the value 1 with probability 1/4, and the value 7 with probability 5/8. Thus $E(X^n)$ is directly
$$E[X^n] = M_X^{(n)}(0) = \frac18 (-5)^n + \frac14 + \frac58\, 7^n.$$
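This can be confirmed numerically: the pmf read off from the MGF reproduces $M_X$, and a central finite difference at 0 recovers the first moment $EX = 4$:

```python
import math

values = [-5, 1, 7]
probs = [1 / 8, 1 / 4, 5 / 8]

def M(t):
    """M_X(t) = E[e^{tX}] for the pmf read off from the MGF."""
    return sum(p * math.exp(t * x) for x, p in zip(values, probs))

direct_mean = sum(p * x for x, p in zip(values, probs))  # E[X] = 4.0
eps = 1e-6
mgf_mean = (M(eps) - M(-eps)) / (2 * eps)  # central difference for M'(0)
```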
Cumulant Generating Function
Definition 3.1
Let $M_X(t)$ be finite for $|t| < t_0$. The cumulant generating function of X is defined as
$$K_X(t) = \log M_X(t).$$
The CGF also completely determines the distribution of X, and it can be expanded in a power series with radius of convergence $R \ge t_0$ as follows:
$$K_X(t) = \kappa_1 t + \kappa_2 \frac{t^2}{2!} + \kappa_3 \frac{t^3}{3!} + \cdots.$$
The coefficient $\kappa_r$ of $t^r/r!$ is referred to as the cumulant of order r of X:
$$\kappa_r = \kappa_r(X) = \frac{d^r}{dt^r} K_X(t) \Big|_{t=0}.$$
Multivariate cumulant generating function
When X = (X1, ..., Xk) is a vector, the CGF is defined as
$$K_X(t) = \log M_X(t).$$
If $M_X(t)$ exists, then the CGF admits a multivariate Taylor series expansion in a neighborhood of the origin, with coefficients corresponding to the cumulants of X.
Definition 3.2
The joint cumulant of order r is
$$\kappa^{i_1, i_2, \cdots, i_r} = \frac{\partial^r K_X(t)}{\partial t_{i_1} \cdots \partial t_{i_r}} \Big|_{t=0}.$$
Sums of I.I.D. random variables
Let $S_n = \sum_{i=1}^n X_i$ with $X_1, \dots, X_n$ i.i.d., and suppose $M_{X_i}$ exists. Then
$$M_{S_n}(t) = (M_X(t))^n, \qquad K_{S_n}(t) = n K_X(t).$$
Also, $\kappa_r(S_n) = n \kappa_r(X) = n \kappa_r$. In a word, when working with sums of i.i.d. random variables, the cumulants of the sum are simply n times the cumulants of each summand.
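The sum rule can be checked for, say, i.i.d. Bernoulli(p) summands, whose sum is Binomial(n, p); the choices n = 5, p = 0.3 below are arbitrary:

```python
import math

n, p = 5, 0.3  # arbitrary choices for the check

def bern_mgf(t):
    """MGF of a single Bernoulli(p) summand: (1 - p) + p e^t."""
    return (1 - p) + p * math.exp(t)

def binom_mgf(t):
    """MGF of S_n ~ Binomial(n, p), computed directly from its pmf."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) * math.exp(t * k)
               for k in range(n + 1))
# M_{S_n}(t) = (M_X(t))^n, equivalently K_{S_n}(t) = n K_X(t).
```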
Example 3.1
Let $X \sim N(\mu, \sigma^2)$. Then
$$M_X(t) = e^{\mu t + \sigma^2 t^2/2}, \qquad K_X(t) = \mu t + \sigma^2 \frac{t^2}{2}.$$
Therefore, $\kappa_1 = \mu$, $\kappa_2 = \sigma^2$, and $\kappa_r = 0$ for $r = 3, 4, \dots$.
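The cumulants can be recovered from $K_X$ by finite differences at 0; a sketch with assumed values $\mu = 1.5$, $\sigma = 2$:

```python
mu, sigma = 1.5, 2.0  # assumed values for the sketch

def K(t):
    """CGF of N(mu, sigma^2): K_X(t) = mu t + sigma^2 t^2 / 2."""
    return mu * t + sigma ** 2 * t ** 2 / 2.0

e = 1e-3
kappa1 = (K(e) - K(-e)) / (2 * e)                      # K'(0)  = mu
kappa2 = (K(e) - 2 * K(0.0) + K(-e)) / e ** 2          # K''(0) = sigma^2
kappa3 = (K(2 * e) - 2 * K(e) + 2 * K(-e) - K(-2 * e)) / (2 * e ** 3)  # ~ 0
```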
Cumulants of order larger than 2 are all zero if and only if X has a normal distribution.
Location Shifts
Shifting from X to X + a induces the corresponding transformations of $M_X(\cdot)$ and $K_X(\cdot)$, respectively:
$$M_{X+a}(t) = E(e^{t(X+a)}) = e^{at} M_X(t), \qquad K_{X+a}(t) = at + K_X(t).$$
Only the first cumulant is affected, i.e., κ1(X + a) = a + κ1.
Scale Changes
Rescaling X by b, with b > 0, gives X/b. It follows that
$$M_{X/b}(t) = E(e^{tX/b}) = M_X(t/b), \qquad K_{X/b}(t) = K_X(t/b), \qquad \kappa_r(X/b) = \kappa_r(X)/b^r = \kappa_r/b^r.$$
All cumulants are affected by a scale change unless b = 1.
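Both transformation rules can be verified on a small sketch, assuming X is a fair die with arbitrary shift a = 2 and scale b = 3:

```python
import math

values = [1, 2, 3, 4, 5, 6]  # a fair die
p = 1 / 6
a, b = 2.0, 3.0  # arbitrary shift and scale for the check

def mgf(vals, t):
    """MGF of a uniform discrete variable on vals."""
    return sum(p * math.exp(t * v) for v in vals)

shifted = [v + a for v in values]  # X + a
scaled = [v / b for v in values]   # X / b
# M_{X+a}(t) = e^{a t} M_X(t)  and  M_{X/b}(t) = M_X(t / b).
```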