Moment Generating Function
Statistics 110, Summer 2006
Copyright © 2006 by Mark E. Irwin

Moments Revisited

So far I've really only talked about the first two moments. Let's define what is meant by moments more precisely.

Definition. The $r$th moment of a random variable $X$ is $E[X^r]$, assuming that the expectation exists.

So the mean of a distribution is its first moment.

Definition. The $r$th central moment of a random variable $X$ is $E[(X - E[X])^r]$, assuming that the expectation exists.

Thus the variance is the 2nd central moment of a distribution. The 1st central moment usually isn't discussed, as it is always 0. The 3rd central moment is known as the skewness of a distribution and is used as a measure of asymmetry.

If a distribution is symmetric about its mean ($f(\mu - x) = f(\mu + x)$), the skewness will be 0. Equivalently, if the skewness is non-zero, the distribution is asymmetric. However, it is possible to have an asymmetric distribution with skewness 0.

Examples of symmetric distributions are the normals, $\text{Beta}(a, a)$, and $\text{Bin}(n, p = 0.5)$. Examples of asymmetric distributions are

    Distribution         Skewness
    Bin(n, p)            $np(1-p)(1-2p)$
    Pois($\lambda$)      $\lambda$
    Exp($\lambda$)       $2/\lambda^3$
    Beta(a, b)           an ugly formula

The 4th central moment is known as the kurtosis. It can be used as a measure of how heavy the tails of a distribution are. The kurtosis of a normal is $3\sigma^4$.

Note that these measures are often standardized, as in their raw form they depend on the standard deviation.

Theorem. If the $r$th moment of a RV exists, then the $s$th moment exists for all $s < r$. Also, the $s$th central moment exists for all $s \le r$.

So you can't have a distribution with a finite mean, an infinite variance, and a finite skewness.

Proof. Postponed till later. □

Why are moments useful? They can be involved in calculating means and variances of transformed RVs or other summaries of RVs.

Example: What are the mean and variance of $A = \pi R^2$?

$$E[A] = \pi E[R^2], \qquad \text{Var}(A) = \pi^2 \text{Var}(R^2) = \pi^2 \left( E[R^4] - (E[R^2])^2 \right)$$

So we need $E[R^4]$ in addition to $E[R]$ and $E[R^2]$.

Example: What is the skewness of $X$?

$$E[(X - \mu)^3] = E[X^3 - 3\mu X^2 + 3\mu^2 X - \mu^3] = E[X^3] - 3\mu E[X^2] + 2\mu^3$$

so $E[X]$, $E[X^2]$, and $E[X^3]$ are needed to calculate the skewness.

Moment Generating Function

Definition. The Moment Generating Function (MGF) of a random variable $X$ is $M_X(t) = E[e^{tX}]$, if the expectation is defined.

$$M_X(t) = \sum_x e^{tx} p_X(x) \quad \text{(discrete)} \qquad\qquad M_X(t) = \int e^{tx} f_X(x)\,dx \quad \text{(continuous)}$$

Whether the MGF is defined depends on the distribution and the choice of $t$. For example, $M_X(t)$ is defined for all $t$ if $X$ is normal, for no $t \ne 0$ if $X$ is Cauchy, and for $t < \lambda$ if $X \sim \text{Exp}(\lambda)$.

For those that have done some analysis: in the continuous case, the moment generating function is related to the Laplace transform of the density function. Many of the results about it come from that theory.

Why should we care about the MGF?

• To calculate moments. It may be easier to work with the MGF than to directly calculate $E[X^r]$ (a numerical illustration follows this list).

• To determine distributions of functions of random variables.

• Related to this, to approximate distributions. For example, it can be used to show that as $n$ increases, the $\text{Bin}(n, p)$ distribution "approaches" a normal distribution.
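Here is a minimal numerical sketch of the first use (my own illustration, not part of the original notes). It estimates the MGF of an $\text{Exp}(\lambda)$ RV by Monte Carlo and differentiates it numerically at 0; the first theorem below says these derivatives recover $E[X]$ and $E[X^2]$. The rate $\lambda = 2$, the sample size, and the step size $h$ are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(0)
    lam = 2.0                                   # hypothetical rate: X ~ Exp(2)
    x = rng.exponential(scale=1/lam, size=1_000_000)

    # Empirical MGF: M_X(t) = E[e^{tX}], estimated by a sample average.
    def mgf(t):
        return np.exp(t * x).mean()

    # Central differences at t = 0 approximate M'(0) = E[X] = 1/lam
    # and M''(0) = E[X^2] = 2/lam^2.
    h = 1e-3
    print((mgf(h) - mgf(-h)) / (2 * h), 1 / lam)                  # both ~ 0.5
    print((mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2, 2 / lam**2)   # both ~ 0.5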
The following theorems justify these uses of the MGF.

Theorem. If $M_X(t)$ of a RV $X$ is finite in an open interval containing 0, then it has derivatives of all orders and

$$M_X^{(r)}(t) = E[X^r e^{tX}], \qquad M_X^{(r)}(0) = E[X^r]$$

Proof.

$$M_X^{(1)}(t) = \frac{d}{dt} \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx = \int_{-\infty}^{\infty} \left( \frac{d}{dt} e^{tx} \right) f_X(x)\,dx = \int_{-\infty}^{\infty} x e^{tx} f_X(x)\,dx = E[X e^{tX}]$$

$$M_X^{(2)}(t) = \frac{d}{dt} M_X^{(1)}(t) = \int_{-\infty}^{\infty} x \left( \frac{d}{dt} e^{tx} \right) f_X(x)\,dx = \int_{-\infty}^{\infty} x^2 e^{tx} f_X(x)\,dx = E[X^2 e^{tX}]$$

The rest can be shown by induction. The second part of the theorem follows from $e^0 = 1$. □

Another way to see this result is via the Taylor series expansion

$$e^y = 1 + y + \frac{y^2}{2!} + \frac{y^3}{3!} + \cdots$$

which gives

$$M_X(t) = E\left[ 1 + Xt + \frac{X^2 t^2}{2!} + \frac{X^3 t^3}{3!} + \cdots \right] = 1 + E[X]\,t + E[X^2]\,\frac{t^2}{2!} + E[X^3]\,\frac{t^3}{3!} + \cdots$$

Example MGFs:

• $X \sim U(a, b)$:
$$M_X(t) = \int_a^b \frac{e^{tx}}{b - a}\,dx = \frac{e^{bt} - e^{at}}{(b - a)t}$$

• $X \sim \text{Exp}(\lambda)$:
$$M_X(t) = \int_0^{\infty} e^{tx} \lambda e^{-\lambda x}\,dx = \int_0^{\infty} \lambda e^{-(\lambda - t)x}\,dx = \frac{\lambda}{\lambda - t}$$
Note that this integral is only defined when $t < \lambda$.

• $X \sim \text{Geo}(p)$, $(q = 1 - p)$:
$$M_X(t) = \sum_{x=1}^{\infty} e^{tx} p q^{x-1} = p e^t \sum_{x=1}^{\infty} (e^t)^{x-1} q^{x-1} = \frac{p e^t}{1 - q e^t}$$
(which requires $q e^t < 1$).

• $X \sim \text{Pois}(\lambda)$:
$$M_X(t) = \sum_{x=0}^{\infty} e^{tx} \frac{e^{-\lambda} \lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{(e^t \lambda)^x}{x!} = e^{\lambda(e^t - 1)}$$

Examples of using the MGF to calculate moments:

• $X \sim \text{Exp}(\lambda)$:
$$M^{(1)}(t) = \frac{\lambda}{(\lambda - t)^2}, \qquad E[X] = \frac{1}{\lambda}$$
$$M^{(2)}(t) = \frac{2\lambda}{(\lambda - t)^3}, \qquad E[X^2] = \frac{2}{\lambda^2}$$
$$M^{(r)}(t) = \frac{\Gamma(r + 1)\,\lambda}{(\lambda - t)^{r+1}}, \qquad E[X^r] = \frac{\Gamma(r + 1)}{\lambda^r}$$

• $X \sim \text{Geo}(p)$, $(q = 1 - p)$:
$$M^{(1)}(t) = \frac{p e^t}{1 - q e^t} + \frac{p q e^{2t}}{(1 - q e^t)^2}, \qquad E[X] = \frac{1}{p}$$
$$M^{(2)}(t) = \frac{p e^t}{1 - q e^t} + \frac{3 p q e^{2t}}{(1 - q e^t)^2} + \frac{2 p q^2 e^{3t}}{(1 - q e^t)^3}, \qquad E[X^2] = \frac{2 - p}{p^2}$$

Theorem. If $Y = a + bX$, then $M_Y(t) = e^{at} M_X(bt)$.

Proof.
$$M_Y(t) = E[e^{tY}] = E[e^{at + btX}] = e^{at} E[e^{(bt)X}] = e^{at} M_X(bt)$$ □

For example, this result can be used to verify that $E[a + bX] = a + bE[X]$, as

$$M_Y^{(1)}(t) = a e^{at} M_X(bt) + b e^{at} M_X^{(1)}(bt)$$
$$M_Y^{(1)}(0) = a M_X(0) + b M_X^{(1)}(0) = a + bE[X]$$

Theorem. If $X$ and $Y$ are independent RVs with MGFs $M_X$ and $M_Y$, and $Z = X + Y$, then $M_Z(t) = M_X(t) M_Y(t)$ on the common interval where both MGFs exist.

Proof.
$$M_Z(t) = E[e^{tZ}] = E[e^{t(X + Y)}] = E[e^{tX} e^{tY}] = E[e^{tX}]\,E[e^{tY}] = M_X(t) M_Y(t)$$ □

By induction, this result can be extended to sums of many independent RVs.

One particular use of this result is that it gives an easy approach to finding the distribution of a sum of RVs without having to calculate the convolution of the densities. But first we need one more result.

Theorem. [Uniqueness theorem] If the MGF of $X$ exists for $t$ in an open interval containing 0, then it uniquely determines the CDF, i.e., no two different distributions can have the same values of the MGF on an interval containing 0.

Proof. Postponed. □

Example: Let $X_1, X_2, \ldots, X_n$ be iid $\text{Exp}(\lambda)$. What is the distribution of $S = \sum X_i$?

$$M_S(t) = \prod_{i=1}^{n} \frac{\lambda}{\lambda - t} = \left( \frac{\lambda}{\lambda - t} \right)^n$$

Note that this isn't the form of the MGF for an exponential, so the sum isn't exponential. As shown in Example B on page 145, the MGF of a $\text{Gamma}(\alpha, \lambda)$ is

$$M(t) = \left( \frac{\lambda}{\lambda - t} \right)^{\alpha}$$

so $S \sim \text{Gamma}(n, \lambda)$.
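As a quick check of this conclusion (again my own sketch, not from the notes), the following simulation compares the empirical MGF of $S = \sum_{i=1}^{n} X_i$ for iid $\text{Exp}(\lambda)$ draws against $(\lambda/(\lambda - t))^n$, the $\text{Gamma}(n, \lambda)$ MGF. The values $\lambda = 2$, $n = 5$, the sample size, and the $t$ grid are arbitrary; each $t$ is kept well below $\lambda$ so the MGF exists.

    import numpy as np

    rng = np.random.default_rng(0)
    lam, n = 2.0, 5                 # hypothetical choices: X_i ~ Exp(2), n = 5
    # 500,000 simulated sums S = X_1 + ... + X_n of iid exponentials.
    S = rng.exponential(scale=1/lam, size=(500_000, n)).sum(axis=1)

    # Independence gives M_S(t) = (lam/(lam - t))^n, the Gamma(n, lam) MGF;
    # compare it with the empirical estimate of E[e^{tS}].
    for t in (0.2, 0.5, 0.8):
        print(f"t={t}: empirical {np.exp(t * S).mean():.3f}, "
              f"theory {(lam / (lam - t)) ** n:.3f}")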
This approach also leads to an easy proof that the sum of independent normals is also normal. The moment generating function of a $N(\mu, \sigma^2)$ RV is $M(t) = e^{\mu t + \sigma^2 t^2 / 2}$. So if the $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, \ldots, n$, are independent, then

$$M_{\sum X_i}(t) = \prod_{i=1}^{n} e^{\mu_i t + \sigma_i^2 t^2 / 2} = \exp\left( t \sum \mu_i + \frac{t^2}{2} \sum \sigma_i^2 \right)$$

which is the moment generating function of a $N(\sum \mu_i, \sum \sigma_i^2)$ RV.

There is one important thing with this approach: we must be able to identify which MGF goes with each density or PMF. For example, let $X \sim \text{Gamma}(\alpha, \lambda)$ and $Y \sim \text{Gamma}(\beta, \mu)$ be independent. Then the MGF of $Z = X + Y$ is

$$M_Z(t) = \left( \frac{\lambda}{\lambda - t} \right)^{\alpha} \left( \frac{\mu}{\mu - t} \right)^{\beta}$$

This is not the MGF of a gamma distribution unless $\lambda = \mu$. In fact I'm not quite sure what the density looks like beyond

$$f_Z(z) = \int_0^z \frac{\lambda^{\alpha} x^{\alpha - 1} e^{-\lambda x}}{\Gamma(\alpha)} \cdot \frac{\mu^{\beta} (z - x)^{\beta - 1} e^{-\mu(z - x)}}{\Gamma(\beta)}\,dx$$

You can sometimes use tables of Laplace transforms, or do some complicated complex-variable integration, to invert the MGF and determine the density or PMF. While we can't get the density easily in this case, we can still use the MGF to get the moments of this distribution.

It is also possible to work with more complicated situations described by hierarchical models. Suppose that the MGFs for $X$ ($M_X(t)$) and $Y \mid X = x$ ($M_{Y|X}(t)$) are known. Then the marginal MGF of $Y$ is

$$M_Y(t) = E[e^{tY}] = E[E[e^{tY} \mid X]] = E[M_{Y|X}(t)]$$

For example, this could be used to get the MGF of the Beta-Binomial model. Another situation where this is useful is with a random sums model, where

$$S = \sum_{i=1}^{N} X_i$$

and $N$ is random.
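The random sums case follows from the conditioning identity above: given $N$, $S$ is a sum of $N$ iid terms (assumed independent of $N$), so $E[e^{tS} \mid N] = M_X(t)^N$ and hence $M_S(t) = E[M_X(t)^N]$. As an illustration with hypothetical parameter choices of my own (not from the notes), take $N \sim \text{Pois}(\lambda)$ and $X_i$ iid $\text{Exp}(\mu)$; using the Poisson fact $E[u^N] = e^{\lambda(u - 1)}$, this gives $M_S(t) = \exp(\lambda(M_X(t) - 1))$, which the sketch below checks by simulation.

    import numpy as np

    rng = np.random.default_rng(1)
    lam, mu = 3.0, 2.0        # hypothetical choices: N ~ Pois(3), X_i ~ Exp(2)
    n_sims = 50_000

    # Simulate the random sum S = X_1 + ... + X_N (S = 0 when N = 0).
    N = rng.poisson(lam, size=n_sims)
    S = np.array([rng.exponential(scale=1/mu, size=k).sum() for k in N])

    # Conditioning on N gives M_S(t) = E[M_X(t)^N] = exp(lam*(M_X(t) - 1)),
    # with M_X(t) = mu/(mu - t) for t < mu.
    for t in (0.25, 0.5, 0.75):
        print(f"t={t}: empirical {np.exp(t * S).mean():.3f}, "
              f"theory {np.exp(lam * (mu / (mu - t) - 1.0)):.3f}")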