Lecture 11: Using the LLN and CLT, Moments of Distributions
Statistics 104 - Colin Rundel
February 22, 2012
Chapters 3.1, 3.3, 3.4

LLN and CLT

Markov's and Chebyshev's Inequalities

Markov's inequality: for any random variable X ≥ 0 and constant a > 0,

    P(X ≥ a) ≤ E(X) / a

Chebyshev's inequality: for any random variable X with finite variance and constant a > 0,

    P(|X − E(X)| ≥ a) ≤ Var(X) / a²

Using Markov's and Chebyshev's Inequalities

Suppose that it is known that the number of items produced in a factory during a week is a random variable X with mean 50. (Note that we don't know anything else about the distribution of X.)

(a) What can be said about the probability that this week's production will exceed 75 units?

    By Markov's inequality, P(X ≥ 75) ≤ E(X)/75 = 50/75 = 2/3.

(b) If the variance of a week's production is known to equal 25, then what can be said about the probability that this week's production will be between 40 and 60 units?

    By Chebyshev's inequality, P(|X − 50| ≥ 10) ≤ 25/10² = 1/4, so P(40 < X < 60) ≥ 3/4.

Accuracy of Chebyshev's Inequality

How tight the inequality is depends on the distribution, but we can look at the Normal distribution since it is easy to evaluate. Let X ∼ N(µ, σ²); then

    P(|X − µ| ≥ kσ) ≤ 1/k²

    P(|X − µ|/σ ≥ k) ≤ 1/k²

    P((X − µ)/σ ≤ −k or (X − µ)/σ ≥ k) ≤ 1/k²

[Figure: standard normal density with the two tails beyond −k and k shaded.]

Accuracy of Chebyshev's Inequality, cont.

If X ∼ N(µ, σ²), let U = (X − µ)/σ ∼ N(0, 1).

    Empirical Rule:                           Chebyshev's inequality:
    1 − P(−1 ≤ U ≤ 1) ≈ 1 − 0.66 = 0.34       1 − P(−1 ≤ U ≤ 1) ≤ 1/1² = 1
    1 − P(−2 ≤ U ≤ 2) ≈ 1 − 0.95 = 0.05       1 − P(−2 ≤ U ≤ 2) ≤ 1/2² = 0.25
    1 − P(−3 ≤ U ≤ 3) ≈ 1 − 0.997 = 0.003     1 − P(−3 ≤ U ≤ 3) ≤ 1/3² ≈ 0.11

Why do we care if it is so inaccurate?

LLN and CLT

Let X₁, X₂, …, Xₙ be iid random variables with mean µ and variance σ², and let Sₙ = X₁ + ⋯ + Xₙ and X̄ₙ = Sₙ/n.

Law of large numbers:

    (Sₙ − nµ)/n = X̄ₙ − µ → 0 as n → ∞

Central limit theorem:

    (Sₙ − nµ)/√n = √n (X̄ₙ − µ) →d N(0, σ²) as n → ∞

so that

    P(a ≤ (Sₙ − nµ)/(σ√n) ≤ b) = P(a ≤ √n (X̄ₙ − µ)/σ ≤ b) ≈ Φ(b) − Φ(a)

Equivalence of de Moivre-Laplace with CLT

We have already seen that a Binomial random variable is equivalent to the sum of n iid Bernoulli random variables. Let X ∼ Binom(n, p) and Y₁, …, Yₙ iid ∼ Bern(p), where X = Y₁ + ⋯ + Yₙ, E(Yᵢ) = p, and Var(Yᵢ) = p(1 − p).

The de Moivre-Laplace theorem tells us:

    P(a ≤ X ≤ b) = P( (a − np)/√(np(1 − p)) ≤ (X − np)/√(np(1 − p)) ≤ (b − np)/√(np(1 − p)) )
                 ≈ Φ( (b − np)/√(np(1 − p)) ) − Φ( (a − np)/√(np(1 − p)) )

Equivalence of de Moivre-Laplace with CLT, cont.

The central limit theorem tells us:

    P(c ≤ (Sₙ − nµ)/(σ√n) ≤ d) ≈ Φ(d) − Φ(c)

Taking Sₙ = X with µ = p and σ² = p(1 − p):

    P(a ≤ X ≤ b) = P( (a − nµ)/(σ√n) ≤ (Sₙ − nµ)/(σ√n) ≤ (b − nµ)/(σ√n) )
                 = P( (a − np)/√(np(1 − p)) ≤ (Sₙ − np)/√(np(1 − p)) ≤ (b − np)/√(np(1 − p)) )
                 ≈ Φ( (b − np)/√(np(1 − p)) ) − Φ( (a − np)/√(np(1 − p)) )

which is exactly the de Moivre-Laplace result, so the two statements agree.
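The accuracy comparison above can be checked numerically. The following is a minimal sketch, assuming SciPy is available; it computes the exact two-sided normal tail probability and compares it with Chebyshev's 1/k² bound.

    # Exact two-sided normal tails versus Chebyshev's 1/k^2 bound,
    # reproducing the comparison above.  Assumes SciPy is installed.
    from scipy.stats import norm

    for k in (1, 2, 3):
        exact = 2 * norm.sf(k)   # 1 - P(-k <= U <= k), by symmetry of N(0, 1)
        bound = 1 / k**2         # Chebyshev's upper bound
        print(f"k={k}: exact tail = {exact:.4f}, Chebyshev bound = {bound:.4f}")

    # Output:
    # k=1: exact tail = 0.3173, Chebyshev bound = 1.0000
    # k=2: exact tail = 0.0455, Chebyshev bound = 0.2500
    # k=3: exact tail = 0.0027, Chebyshev bound = 0.1111

The exact k = 1 tail is 0.3173, so the slide's 0.34 (via the rounded 0.66) is itself an approximation; either way, Chebyshev's bound is far looser at every k.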
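The de Moivre-Laplace approximation above can also be checked directly against an exact binomial probability. This sketch again assumes SciPy; the values n = 100, p = 0.5, a = 45, b = 55 are illustrative choices, not from the slides.

    # Exact P(a <= X <= b) for X ~ Binom(n, p) versus the normal
    # approximation Phi((b-np)/sd) - Phi((a-np)/sd).
    # n, p, a, b below are illustrative, not from the lecture.
    from math import sqrt
    from scipy.stats import binom, norm

    n, p = 100, 0.5
    a, b = 45, 55
    mu, sd = n * p, sqrt(n * p * (1 - p))

    exact = binom.cdf(b, n, p) - binom.cdf(a - 1, n, p)   # P(a <= X <= b)
    approx = norm.cdf((b - mu) / sd) - norm.cdf((a - mu) / sd)

    # A continuity correction (using a - 0.5 and b + 0.5) would tighten
    # the approximation further.
    print(f"exact = {exact:.4f}, normal approx = {approx:.4f}")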
Why do we care? (Statistics)

In general we are interested in making statements about how the world works, and we usually do this based on empirical evidence. This tends to involve observing or measuring something a bunch of times.

The LLN tells us that if we average our results we'll get closer and closer to the true value as we take more measurements.

The CLT tells us that the distribution of our average is approximately normal if we take enough measurements, which quantifies our uncertainty about the "true" value.

Example - Polling: I can't interview everyone about their opinions, but if I interview enough people my estimate should be close, and I can use the CLT to approximate my margin of error (see the numerical sketch after the astronomy example below).

Astronomy Example

An astronomer is interested in measuring, in light years, the distance from his observatory to a distant star. Although the astronomer has a measuring technique, he knows that, because of changing atmospheric conditions and normal error, each time a measurement is made it will not yield the exact distance but merely an estimate. As a result the astronomer plans to make a series of measurements and then use the average value of these measurements as his estimated value of the actual distance. If the astronomer believes that the values of the measurements are independent and identically distributed random variables having a common mean d (the actual distance) and a common variance of 4 (light years²), how many measurements need he make to be reasonably sure that his estimated distance is accurate to within ±0.5 light years?

Astronomy Example, cont.

We need to decide on a level to account for being "reasonably sure"; this is usually taken to be 0.95, but the choice is arbitrary. By the CLT, X̄ₙ is approximately N(d, 4/n), so we need n large enough that P(|X̄ₙ − d| ≤ 0.5) ≥ 0.95.

Astronomy Example, cont.

Our previous estimate depends on how long it takes for the distribution of the average of the measurements to converge to the normal distribution. The CLT only guarantees convergence as n → ∞, but most distributions converge much more quickly (think of the np ≥ 10 and n(1 − p) ≥ 10 requirement for the normal approximation to the binomial).

We can also solve this problem using Chebyshev's inequality, which makes no assumption about the shape of the distribution; both calculations appear in the sketch below.
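A sketch of the sample-size calculation, assuming "reasonably sure" means probability 0.95 as above, and using SciPy only for the normal quantile. Note how much larger the Chebyshev answer is, precisely because the bound is so loose.

    # Sample size for the astronomy example: one measurement has
    # sd sigma = 2, and we want P(|Xbar - d| <= 0.5) >= 0.95.
    from math import ceil
    from scipy.stats import norm

    sigma, eps = 2.0, 0.5          # sd of one measurement; target accuracy
    alpha = 0.05                   # 1 - 0.95, the "reasonably sure" level

    # CLT: P(|Xbar - d| <= eps) ~ 2*Phi(eps*sqrt(n)/sigma) - 1 >= 0.95
    z = norm.ppf(1 - alpha / 2)               # ~ 1.96
    n_clt = ceil((z * sigma / eps) ** 2)      # -> 62 measurements

    # Chebyshev: P(|Xbar - d| >= eps) <= (sigma^2/n) / eps^2 <= alpha
    n_cheb = ceil(sigma**2 / (eps**2 * alpha))  # -> 320 measurements

    print(n_clt, n_cheb)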
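And a brief illustration of the polling remark above; n = 1000 respondents and p̂ = 0.5 are illustrative assumptions, not values from the lecture.

    # CLT margin of error for a sample proportion: the estimate p-hat
    # is approximately N(p, p(1-p)/n), so a 95% margin of error is
    # 1.96 * sqrt(p-hat * (1 - p-hat) / n).  Values are hypothetical.
    from math import sqrt

    n, p_hat = 1000, 0.5
    moe = 1.96 * sqrt(p_hat * (1 - p_hat) / n)
    print(f"95% margin of error ~ +/- {moe:.3f}")   # +/- 0.031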
Moments of Distributions

Moments

Some definitions:

    Raw moment:                        µ′ₙ = E(Xⁿ)

    Central moment:                    µₙ = E[(X − µ)ⁿ]

    Normalized / standardized moment:  µₙ / σⁿ

Note that some moments do not exist, which is the case when E(Xⁿ) does not converge.

Properties of Moments

    Zeroth moment:  µ′₀ = µ₀ = 1

    First moment:   µ′₁ = E(X) = µ,  µ₁ = E(X − µ) = 0

    Second moment:  µ₂ = E[(X − µ)²] = Var(X),  µ′₂ − (µ′₁)² = Var(X)

    Third moment:   Skewness(X) = µ₃ / σ³

    Fourth moment:  Kurtosis(X) = µ₄ / σ⁴,  Ex.Kurtosis(X) = µ₄ / σ⁴ − 3

Third and Fourth Moments

    µ₃ = E[(X − µ)³] = E(X³ − 3X²µ + 3Xµ² − µ³)
       = E(X³) − 3µE(X²) + 3µ³ − µ³
       = E(X³) − 3µσ² − µ³
       = µ′₃ − 3µσ² − µ³

    µ₄ = E[(X − µ)⁴] = E(X⁴ − 4X³µ + 6X²µ² − 4Xµ³ + µ⁴)
       = E(X⁴) − 4µE(X³) + 6µ²E(X²) − 4µ⁴ + µ⁴
       = E(X⁴) − 4µE(X³) + 6µ²(σ² + µ²) − 3µ⁴
       = µ′₄ − 4µµ′₃ + 6µ²σ² + 3µ⁴

Moment Generating Function

The moment generating function of the discrete random variable X is defined for all real values of t by

    M_X(t) = E[e^(tX)] = Σ_x e^(tx) P(X = x)

This is called the moment generating function because we can obtain the moments of X by successively differentiating M_X(t) and evaluating at t = 0:

    M_X(0)  = E[e⁰] = 1 = µ′₀

    M′_X(t) = (d/dt) E[e^(tX)] = E[(d/dt) e^(tX)] = E[X e^(tX)]
    M′_X(0) = E[X e⁰] = E[X] = µ′₁

    M″_X(t) = (d/dt) M′_X(t) = (d/dt) E[X e^(tX)] = E[(d/dt) X e^(tX)] = E[X² e^(tX)]
    M″_X(0) = E[X² e⁰] = E[X²] = µ′₂

Moment Generating Function - Poisson

Let X ∼ Pois(λ); then

    M_X(t) = E[e^(tX)] = Σ_{k=0}^∞ e^(tk) e^(−λ) λᵏ / k! = e^(−λ) Σ_{k=0}^∞ (λeᵗ)ᵏ / k!
           = exp(λeᵗ − λ)

    M′_X(t) = λeᵗ exp(λeᵗ − λ)
    M′_X(0) = µ′₁ = λ = E(X)

    M″_X(t) = (λeᵗ)² exp(λeᵗ − λ) + λeᵗ exp(λeᵗ − λ)
    M″_X(0) = µ′₂ = λ² + λ = E(X²)

Moment Generating Function - Poisson Skewness

    µ′₃ = M‴_X(0) = λ³ + 3λ² + λ

    Skewness(X) = µ₃ / σ³ = (µ′₃ − 3µσ² − µ³) / σ³
                = (λ³ + 3λ² + λ − 3λ² − λ³) / (√λ)³
                = λ / λ^(3/2)
                = λ^(−1/2)

Moment Generating Function - Poisson Kurtosis

    µ′₃ = M‴_X(0) = λ³ + 3λ² + λ
    µ′₄ = M⁗_X(0) = λ⁴ + 6λ³ + 7λ² + λ

    Kurtosis(X) = µ₄ / σ⁴ = (µ′₄ − 4µµ′₃ + 6µ²σ² + 3µ⁴) / σ⁴
                = (µ′₄ − 4λ(λ³ + 3λ² + λ) + 6λ³ + 3λ⁴) / λ²
                = (µ′₄ − λ⁴ − 6λ³ − 4λ²) / λ²
                = ((λ⁴ + 6λ³ + 7λ² + λ) − λ⁴ − 6λ³ − 4λ²) / λ²
                = (3λ² + λ) / λ²
                = 3 + λ^(−1)

so Ex.Kurtosis(X) = λ^(−1).

Moment Generating Function - Binomial

Let X ∼ Binom(n, p); then

    M_X(t) = E[e^(tX)] = Σ_{k=0}^n e^(tk) C(n,k) pᵏ (1 − p)^(n−k) = Σ_{k=0}^n C(n,k) (peᵗ)ᵏ (1 − p)^(n−k)
           = [peᵗ + (1 − p)]ⁿ

    M′_X(t) = npeᵗ (peᵗ + 1 − p)^(n−1)
    M′_X(0) = µ′₁ = np = E(X)

    M″_X(t) = n(n − 1)(peᵗ)² (peᵗ + 1 − p)^(n−2) + npeᵗ (peᵗ + 1 − p)^(n−1)
    M″_X(0) = µ′₂ = n(n − 1)p² + np = E(X²)

    Var(X) = µ′₂ − (µ′₁)² = n(n − 1)p² + np − n²p² = np(1 − p)
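The Poisson calculations above can be reproduced symbolically. This is a minimal sketch assuming SymPy is available; it differentiates the MGF at t = 0 to recover the raw moments, then applies the central-moment identities from the "Third and Fourth Moments" slide.

    # Symbolic check of the Poisson MGF results: raw moments from
    # derivatives of M_X(t) = exp(lam*(e^t - 1)) at t = 0, then the
    # skewness and excess kurtosis formulas.  Assumes SymPy.
    import sympy as sp

    t = sp.symbols('t')
    lam = sp.symbols('lambda', positive=True)
    M = sp.exp(lam * (sp.exp(t) - 1))              # Poisson MGF

    # Raw moments: k-th derivative of M at t = 0
    m1, m2, m3, m4 = [sp.expand(sp.diff(M, t, k).subs(t, 0)) for k in range(1, 5)]
    print(m1)   # lambda
    print(m2)   # lambda**2 + lambda
    print(m3)   # lambda**3 + 3*lambda**2 + lambda
    print(m4)   # lambda**4 + 6*lambda**3 + 7*lambda**2 + lambda

    var = sp.simplify(m2 - m1**2)                  # lambda
    skew = sp.simplify((m3 - 3*m1*var - m1**3) / var**sp.Rational(3, 2))
    ex_kurt = sp.simplify((m4 - 4*m1*m3 + 6*m1**2*var + 3*m1**4) / var**2 - 3)
    print(skew)      # 1/sqrt(lambda)
    print(ex_kurt)   # 1/lambda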
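SciPy's distribution objects expose the same quantities numerically, which gives a quick cross-check of the Poisson skewness and excess-kurtosis formulas and of the binomial mean and variance. The parameter values (λ = 4; n = 20, p = 0.3) are illustrative choices.

    # Numeric cross-check: 'mvsk' asks for mean, variance, skewness,
    # and *excess* kurtosis, so the Poisson values should be
    # lambda, lambda, lambda**(-1/2), and 1/lambda.
    from scipy.stats import binom, poisson

    lam = 4
    mean, var, skew, ex_kurt = poisson.stats(lam, moments='mvsk')
    print(mean, var, skew, ex_kurt)    # 4.0 4.0 0.5 0.25

    n, p = 20, 0.3
    mean, var = binom.stats(n, p, moments='mv')
    print(mean, var)                   # 6.0 4.2, i.e. np and np(1-p)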