Stats 512 513 Review ♥

Eileen Burns, FSA, MAAA

June 16, 2009

Contents

1 Basic Probability
  1.1 Experiment, Sample Space, RV, and Probability
  1.2 Density, Value Number Urn Model, and Dice
  1.3 Expected Value Descriptive Parameters
  1.4 More About Normal Distributions
  1.5 Independent Trials and a Pictorial CLT
  1.6 The Population, the Sample, and Data
  1.7 Elementary Probability, Stressing Independence
  1.8 Expectation, Variance, and the CLT
  1.9 Applications of the CLT

2 Introduction to Statistics
  2.1 Presentation of Data
  2.2 Estimation of µ and σ²
  2.3 Elementary Classical Statistics
  2.4 Elementary Statistical Applications

3 Probability Models
  3.1 Math Facts
  3.2 Combinatorics and Hypergeometric RVs
  3.3 Independent Bernoulli Trials
  3.4 The Poisson Distribution
  3.5 The Poisson Process N
  3.6 The Failure Rate Function λ(·)
  3.7 Min, Max, Median, and Order Statistics
  3.8 Multinomial Distributions
  3.9 Sampling from a Finite Population, with a CLT

4 Dependent Random Variables
  4.1 Two-Dimensional Discrete RVs
  4.2 Two-Dimensional Continuous RVs
  4.3 Conditional Expectation
  4.4 Prediction
  4.5 Covariance
  4.6 Bivariate Normal Distributions

5 Distribution Theory
  5.1 Transformations of RVs
  5.2 Linear Transformations in Higher Dimensions
  5.3 General Transformations in Higher Dimensions
  5.4 Asymptotics
  5.5 Moment Generating Functions


6 Classical Statistics
  6.1 Estimation
  6.2 The One-Sample Normal Model
  6.3 The Two-Sample Normal Model
  6.4 Other Models

7 Estimation, Principles and Approaches
  7.1 Sufficiency, and UMVUE Estimators
  7.2 Completeness, and Ancillary Statistics
  7.3 Exponential Families, and Completeness
  7.4 The Cramér-Rao Bound
  7.5 Maximum Likelihood Estimators
  7.6 Procedures for Location and Scale Models
  7.7 Bootstrap Methods
  7.8 Bayesian Methods

8 Hypothesis Tests
  8.1 The Neyman-Pearson Lemma and UMP Tests
  8.2 The Likelihood Ratio Test

9 Regression, Anova, and Categorical Data
  9.1 Simple Linear Regression
  9.2 One Way Analysis of Variance
  9.3 Categorical Data

A Distributions
  A.1 Important Transformations

Chapter 1

Basic Probability

1.1 Experiment, Sample Space, RV, and Probability

Bell Curve (6)
Probability an observation falls within x standard deviations of the mean of a normal distribution:
x = 1: P(−1 ≤ Z ≤ 1) = .682
x = 2: P(−2 ≤ Z ≤ 2) = .954
x = 3: P(−3 ≤ Z ≤ 3) = .9974
Note: this is true for any X ∼ N(µ, σ²) with Z = (X − µ)/σ.
From a quantile perspective,

p      .500   .600   .700   .750   .800   .850   .900   .950   .975   .990   .999
z_p     0     .254   .524   .672   .840   1.04   1.28   1.645  1.96   2.33   3.09
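These coverage probabilities and quantiles are easy to reproduce numerically; a minimal sketch using scipy.stats (the package choice is an assumption of this note, not part of the original):

```python
from scipy.stats import norm

# P(-x <= Z <= x) for x = 1, 2, 3
for x in (1, 2, 3):
    print(x, norm.cdf(x) - norm.cdf(-x))      # .6827, .9545, .9973

# standard normal quantiles z_p
for p in (.5, .6, .7, .75, .8, .85, .9, .95, .975, .99, .999):
    print(p, round(norm.ppf(p), 3))
```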

Basic probability facts (8)
P(A^c) = 1 − P(A)
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Union-Intersection Principle (§1.7) (73)

P(∪_{i=1}^n A_i) = Σ_i P(A_i) − Σ_{i<j} P(A_iA_j) + Σ_{i<j<k} P(A_iA_jA_k) − Σ_{i<j<k<l} P(A_iA_jA_kA_l) + ···

Partition of an Event (§1.7) (67)
For a given partition of the event A, P(∪_{i=1}^∞ A_i) = Σ_{i=1}^∞ P(A_i) when all pairs of events A_i, A_j are disjoint.

Density Functions and Distribution Functions (10)
For density function f and distribution function F,
P(a < X ≤ b) = ∫_a^b f(v) dv = ∫_{−∞}^b f(v) dv − ∫_{−∞}^a f(v) dv = F(b) − F(a)

1.2 Density, Value Number Urn Model, and Dice

1.3 Expected Value Descriptive Parameters

Expected Value (26) (31)
  discrete: µ_X = E[X] = Σ_{all v} v·p_X(v)            continuous: µ_X = ∫_{−∞}^{∞} v f(v) dv
  discrete: E[g(X)] = Σ_{all v} g(v)·p_X(v)            continuous: E[g(X)] = ∫_{−∞}^{∞} g(v) f(v) dv

Mean deviation (27) (31)
  discrete: τ_X = E[|X − µ_X|] = Σ_{all v} |v − µ_X|·p_X(v)            continuous: τ_X = ∫_{−∞}^{∞} |v − µ_X| f(v) dv

Variance
  discrete: σ²_X = E[X²] − (E[X])² = Σ_{all v} (v − µ_X)²·p_X(v)            continuous: σ²_X = ∫_{−∞}^{∞} (v − µ_X)² f(v) dv

1.4 More About Normal Distributions

1.5 Independent Trials and a Pictorial CLT (46)
Let X_i for 1 ≤ i ≤ n denote iid rvs. Let T_n = X_1 + X_2 + ··· + X_n. Let Z denote a N(0,1) rv.
Then for all −∞ ≤ a < b ≤ ∞,  P(a ≤ (T_n − nµ_X)/(√n σ_X) ≤ b) → P(a ≤ Z ≤ b) as n → ∞.

1.6 The Population, the Sample, and Data

1.7 Elementary Probability, Stressing Independence

Conditional Probability (58)

P(A | B) = P(A ∩ B)/P(B)        P(A ∩ B) = P(A | B)·P(B)

Independent Events (62)
P(A | B) = P(A)        P(A ∩ B) = P(A)·P(B)

General Independence of Random Variables (64)

1. X1 and X2 are independent rvs

2. P(X_1 ∈ A and X_2 ∈ B) = P(X_1 ∈ A)·P(X_2 ∈ B) for all (one-dimensional) events A and B.

3. Discrete: P(X_1 = v_i and X_2 = w_j) = P(X_1 = v_i)·P(X_2 = w_j) for all v_i, w_j.

4. Continuous: f_{X_1,X_2}(v, w) = f_{X_1}(v)·f_{X_2}(w) for all v and w in the sample space.

5. Discrete: Repetitions X_1, X_2 of a value number X-experiment are independent provided sampling is done with replacement.

6. These statements generalize to n dimensions for both discrete and continuous distributions.

Law of Total Probability (67)

Bayes Theorem (67)
P(B_i | A) = P(A ∩ B_i)/P(A) = P(A | B_i)·P(B_i)/P(A) = P(A | B_i)P(B_i) / [P(A | B_1)P(B_1) + P(A | B_2)P(B_2) + ··· + P(A | B_n)P(B_n)]

Marginals (67)
p_X(v) = P(X = v) = Σ_{all w} P(X = v and Y = w) = Σ_{all w} p_{X,Y}(v, w).

Example: Maximums and minimums of independent rvs (74)

1.8 Expectation, Variance, and the CLT

Sums of Arbitrary RVs (83)
µ_{aX+b} = aµ_X + b        µ_{aX+bY} = aµ_X + bµ_Y        σ²_{aX+b} = a²σ²_X

Sums of Independent RVs (83)
E(XY) = E(X)·E(Y) = µ_X·µ_Y        Var(X + Y) = Var(X) + Var(Y) = σ²_X + σ²_Y

Mean and Variance of a Sum (83)
For independent repetitions of a basic X-experiment X_1, ..., X_n, let T_n = X_1 + ··· + X_n. Then
E(T_n) = nµ_X        Var(T_n) = nσ²_X

Z_n = (T_n − nµ_X)/(√n σ_X) = (X̄_n − µ_X)/(σ_X/√n) has mean 0 and variance 1.
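The pictorial CLT of §1.5 can be seen directly by simulating Z_n for a decidedly non-normal population; a sketch in which the Exponential population and the sample sizes are illustrative assumptions:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
mu, sigma = 1.0, 1.0                       # mean and sd of the Exponential(1) population

for n in (2, 10, 50):
    X = rng.exponential(scale=1.0, size=(100_000, n))
    Zn = (X.sum(axis=1) - n * mu) / (np.sqrt(n) * sigma)
    # compare a tail probability of Z_n with the N(0,1) value
    print(n, (Zn <= 1).mean(), norm.cdf(1))
```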

Covariance (86)
Cov(X,Y) = E[(X − µ_X)(Y − µ_Y)] = E(XY) − µ_X µ_Y

Correlation (90)

Corr[X,Y] = Cov[X,Y]/(σ_X σ_Y)

Sample Moments (87)
µ̂ = X̄ = (1/n) Σ_{i=1}^n X_i

σ̂² = (1/n) Σ_{i=1}^n (X_i − X̄)²
s² = 1/(n−1) Σ_{i=1}^n (X_i − X̄_n)²
σ̂_{X,Y} = (1/n) Σ_{i=1}^n (X_i − X̄)(Y_i − Ȳ)

Handy formulas (89)
σ² = E[X(X − 1)] + µ − µ²
σ² = E[X(X + 1)] − µ − µ²

1.9 Applications of the CLT

Continuity correction (93)
Pretty graphs (99)

Chapter 2

Introduction to Statistics

2.1 Presentation of Data

See Appendix B for examples of plot types.

2.2 Estimation of µ and σ2

Estimator                                                  Expected Value         Variance
Sample mean  µ̂ = X̄ = (1/n) Σ_{i=1}^n X_i                   E(X̄_n) = µ             Var[X̄_n] = σ²/n   (114)
Sample variance  S² = 1/(n−1) Σ_{i=1}^n (X_i − X̄_n)²        E[S²_n] = σ²           Var[S²_n] = (1/n)(µ_4 − ((n−3)/(n−1)) σ⁴)   (115)

2.3 Elementary Classical Statistics

Actual accuracy ratio of X¯n(Tn) (121)

Z_n = (X̄_n − µ)/(σ/√n) is approximately N(0,1)

CLT of Probability (121)
No matter what the shape of the distribution of the population of possible outcomes of an original X-experiment that has mean µ and standard deviation σ, the following result will hold true: as n gets larger, the distribution of the standardized Z_n form of the X̄_n-experiment gets closer to the distribution given by the standardized bell-shaped Normal curve.

Estimated accuracy ratio of X¯n(Tn) (122)

T_n = (X̄_n − µ)/(S_n/√n) ≈ T_{n−1}

CLT of Statistics (122)

The distribution of the ratio T_n is given (not by the standardized bell curve, but rather) by the Student T_{n−1} density. This is an approximation that gets better as the sample size n gets larger.

Random confidence intervals vs. hypothesis testing vs. p-values (123)
Random confidence intervals: a 95% CI for µ consists of exactly those values of µ that are .95-plausible.
Hypothesis testing: a .05-level test of H_o: µ = µ_o rejects H_o at the .05 level if µ_o is not .95-plausible.
p-values: the 2-sided p-value associated with µ_o is just exactly that level at which µ_o first becomes plausible.
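The equivalence of the interval, the test, and the p-value can be checked directly; a minimal sketch using scipy.stats, with made-up data for illustration:

```python
import numpy as np
from scipy.stats import t

x = np.array([4.2, 5.1, 3.8, 4.9, 5.6, 4.4, 4.7, 5.0])   # hypothetical sample
n, xbar, s = len(x), x.mean(), x.std(ddof=1)

tstar = t.ppf(0.975, df=n - 1)
ci = (xbar - tstar * s / np.sqrt(n), xbar + tstar * s / np.sqrt(n))

mu0 = 4.0
T = (xbar - mu0) / (s / np.sqrt(n))
pval = 2 * t.sf(abs(T), df=n - 1)

# mu0 lies outside the 95% CI exactly when |T| > t*, exactly when pval < .05
print(ci, T, pval)
```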


Theorem 2.3.1 – Student T (124)
If X_1, X_2, ..., X_n are independent rvs that all have exactly a N(µ, σ²) distribution, then Student's estimated accuracy ratio has exactly the T_{n−1} distribution.

Metatheorem 2.3.1 (126)
If X_1, X_2, ..., X_n are independent rvs that all have approximately a N(µ, σ²) distribution, then Student's estimated accuracy ratio has approximately the T_{n−1} distribution.

Metatheorem 2.3.2 (126)

If X_1, X_2, ..., X_n are nearly independent rvs that have distributions that are not too skewed and do not produce severe outliers, then the following estimated accuracy ratio is approximately Normal, or some Student T: T = (W̄ − µ)/S.

Student T_n density (128)
f_n(t) = [Γ((n+1)/2)/(√(πn) Γ(n/2))] · 1/(1 + t²/n)^{(n+1)/2}  →  (1/√(2π)) e^{−t²/2}.

2.4 Elementary Statistical Applications

Type I error (130)

Rejecting the null hypothesis Ho when true

Type II error (130)

Not rejecting Ho when it is false

Test statistic Tn (130)

T_n(µ_o) = (X̄_n − µ_o)/(s_n/√n) ≈ T_{n−1}

Chapter 3

Probability Models

3.1 Math Facts

Geometric Series (135)

S_n = Σ_{k=0}^n ar^k = (a/(1 − r))(1 − r^{n+1})        S_∞ = Σ_{k=0}^∞ ar^k = a/(1 − r)

Infinite Series (135)

f(x) = Σ_{k=0}^∞ a_k x^k        f′(x) = Σ_{k=0}^∞ k a_k x^{k−1}

Exponential and Logarithmic Series (135)

e^x = Σ_{k=0}^∞ x^k/k!        log(1 + x) = Σ_{k=1}^∞ (−1)^{k+1} x^k/k   for |x| < 1

Fundamental Exponential Limit (136)

(1 + a_n/n)^n → e^a as n → ∞, whenever a_n → a as n → ∞.

Mean Value Theorem (136)
f(b) − f(a) = (b − a)f′(c) for some a < c < b

L’Hospital’s Rule (136)

Suppose f(a) = g(a) = 0, and that f′(a) and g′(a) ≠ 0 exist. Then
lim_{x→a} f(x)/g(x) = f′(a)/g′(a) = lim_{x→a} f′(x)/g′(x).

Taylor Series Expansion (136)

f(x) = f(a) + ((x − a)/1!) f′(a) + ··· + ((x − a)^k/k!) f^{(k)}(a) + R_k(x)

Integration by Parts (137)
∫ u dv = uv − ∫ v du

Gamma Function & Properties (138)

Γ(r) = ∫_0^∞ y^{r−1} e^{−y} dy        Γ(r) = (r − 1)Γ(r − 1)        Γ(r) = (r − 1)! for integer r        Γ(1/2) = √π

Standard Normal Density Function & Moments (139)

f(x) = (1/√(2π)) e^{−x²/2}        µ_X = 0        σ²_X = 1

Gamma Density & Moments (140)
f(w) = (1/(Γ(r) θ^r)) w^{r−1} e^{−w/θ}        µ_W = rθ        σ²_W = rθ²

3.2 Combinatorics and Hypergeometric RVs

Permutations & Combinations (148)

P_n^k = n!/(n − k)!        C_n^k = (n choose k) = n!/((n − k)! k!)

Sampling with replacement (Binomial) (150)
Sampling without replacement (Hypergeometric(R, W, n)) (151)

P(X = k) = (R choose k)(W choose n−k) / (N choose n)        µ = nR/N        σ² = n·(R/N)·((N − R)/N)·((N − n)/(N − 1))

Mathematics via revisualization (157)

(N choose k) = (N choose N−k)        (N choose k) = (N−1 choose k) + (N−1 choose k−1)        (m choose n) = (n−1 choose n−1) + (n choose n−1) + (n+1 choose n−1) + ··· + (m−1 choose n−1)

Bernoulli(p) – Xi = independent trials (171)

P(X = 1) = p        P(X = 0) = 1 − p = q        µ_{X_i} = p        σ²_{X_i} = pq

Binomial(n, p) – Tn = sum of independent trials (173)

P(T_n = k) = (n choose k) p^k q^{n−k}        µ_{T_n} = np        σ²_{T_n} = npq

Geometric Turns – GeoT(p) – Y_i = interarrival times (174)

P(Y = k) = q^{k−1} p        µ_{Y_i} = 1/p        σ²_{Y_i} = q/p²        P(Y > k) = q^k

GeoT(p) lack of memory property (174)
P(Y > i + k | Y > k) = P(Y > i) = q^i

Negative Binomial Turns – NegBiT(r, p) – Wr = waiting times (176)

P(W_r = k) = (k−1 choose r−1) p^r q^{k−r}        µ_{W_r} = r/p        σ²_{W_r} = rq/p²

Event Identity (176)

[W_r > n] = [T_n < r]        [W_r = k] = [T_{k−1} = r − 1] ∩ [X_k = 1]

Sums of NegBiT rvs (177) NegBiT(r, p)+NegBiT(s,p) =NegBiT(r + s,p)

Geometric Failures – GeoF(p) – Y ′ = failures before first success (177)

P(Y′ = k) = q^k p        Y′ + 1 = Y        µ_{Y′} = q/p        σ²_{Y′} = q/p²

Negative Binomial Failures – NegBiF(r, p) – W′_r = failures before the rth success (177)

P(W′_r = k) = (k+r−1 choose r−1) p^r q^k        W′_r + r = W_r        µ_{W′_r} = rq/p        σ²_{W′_r} = rq/p²

Examples

Comparing two success probabilities p_1 and p_2 (185)
Gambler's Ruin (194)

3.4 The

Poisson(λ) – N_t = count by time t (208, 219)
P(T = k) = e^{−λ} λ^k/k!        µ_T = λ        σ²_T = λ

Poisson Limit Theorem (208)

T_n ∼ Binomial(n, p_n) with λ_n = np_n → λ as n → ∞   ⟹   P(T_n = k) → P(T = k) = e^{−λ} λ^k/k!
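A numerical illustration of the Poisson limit; a sketch in which the particular n, p, and λ values are chosen only for illustration:

```python
from scipy.stats import binom, poisson

lam = 3.0
for n in (10, 100, 1000):
    p = lam / n
    # compare P(T_n = k) with the Poisson(lam) pmf at a few k
    print(n, [round(binom.pmf(k, n, p), 4) for k in range(5)])
print("poisson", [round(poisson.pmf(k, lam), 4) for k in range(5)])
```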

Sums of Poissons are Poisson (210)

3.5 The Poisson Process N

Poisson(νt) – N_t = count by time t (208, 219)
P(N_t = k) = e^{−νt} (νt)^k/k!        µ = νt = λ        σ² = νt = λ

Exponential (θ) – Yi = interarrival times (220)

f_Y(t) = νe^{−νt} = (1/θ) e^{−t/θ}        µ_{Y_i} = 1/ν = θ        σ²_{Y_i} = (1/ν)² = θ²

Event Identity (220)

[Y >t] = [Nt = 0] = [Nt < 1]

Exponential (θ) lack of memory (220)

P(Y > s + t | Y > s) = P(Y > t) = e^{−νt} = e^{−t/θ}

Gamma(r, "rate" = ν) – W_r = waiting times (221)
f_{W_r}(t) = (1/Γ(r)) ν^r t^{r−1} e^{−νt} = (1/(Γ(r) θ^r)) t^{r−1} e^{−t/θ}        µ_{W_r} = r/ν = rθ        σ²_{W_r} = r/ν² = rθ²

Event Identity (221)

[Wr >t] = [N(t) < r]

Sums of Gamma RVs (222) Gamma(r, ν) + Gamma(s,ν) = Gamma(r + s,ν)

Random location of the times of event occurrences is Binomial (222, 214)

P(T_1 = k | T_1 + T_2 = n) = (n choose k) p^k (1 − p)^{n−k}  where p = λ_1/(λ_1 + λ_2)

Conditioned Poisson sums have Binomial Distributions (222, 214)
(N_s | N_t = m) ∼ Binomial(m, p = s/t)

Example Comparison of bulb lifetimes (230)

3.6 The Failure Rate Function λ(·)

Failure Rate Function (241)
With a constant failure rate, λ(t) = f(t)/(1 − F(t)) = F′(t)/[1 − F(t)] = λ for all t ≥ 0.
For a nonconstant failure rate, this generalizes to −(d/dt) log[1 − F(t)] = λ(t).

3.7 Min, Max, Median, and Order Statistics

Uniform(0,1) order statistics (252)

Let U_1, ..., U_n be independent Uniform(0,1) rvs. Let 0 < t < 1 and let B_n(t) = #{i: U_i ≤ t} ∼ Binomial(n, t).

[Un:k >t] = [Bn(t) < k]

Distribution of Uniform Order Statistics (252)

U_{n:k} has the Beta(k, n − k + 1) density:
f_{U_{n:k}}(t) = (n choose k−1, 1, n−k) t^{k−1} (1 − t)^{n−k}   for 0 ≤ t ≤ 1.

Asymptotic Distribution of the minimum of Uniform(0,1) rvs (258)

P(nU_{n:1} > t) = [P(U > t/n)]^n = (1 − t/n)^n → e^{−t}   for all 0 ≤ t < n.
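Both order-statistic facts above are easy to check by simulation; a sketch in which the sample sizes and checked points are arbitrary choices:

```python
import numpy as np
from scipy.stats import beta, expon

rng = np.random.default_rng(1)
n, k, reps = 50, 3, 100_000
U = np.sort(rng.uniform(size=(reps, n)), axis=1)

# U_{n:k} should follow Beta(k, n-k+1)
print((U[:, k - 1] <= 0.05).mean(), beta.cdf(0.05, k, n - k + 1))

# n * U_{n:1} should be approximately Exponential(1)
print((n * U[:, 0] > 1.0).mean(), expon.sf(1.0))
```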

Joint Distribution of Uniform Order Statistics (253)
f_{U_{n:i}, U_{n:k}}(s, t) = (n choose i−1, 1, k−i−1, 1, n−k) s^{i−1} (t − s)^{k−i−1} (1 − t)^{n−k}   for 0 ≤ s < t ≤ 1.

3.8 Multinomial Distributions

Multinomial(n; p) Distribution of N (263)
P(N = m) = (n choose m_1, m_2, ..., m_I) p_1^{m_1} p_2^{m_2} ··· p_I^{m_I}   for each possible outcome m ≥ 0 having Σ_1^I m_i = n.

Marginal Distributions (264)

N_i ∼ Binomial(n, p_i).
Cov[X_{ki}, X_{kj}] = p_i(1 − p_i) for i = j,  = −p_i p_j for i ≠ j   (indicators from the same trial k)
Cov[N_i, N_j] = np_i(1 − p_i) for i = j,  = −np_i p_j for i ≠ j
Cov[p̂_i, p̂_j] = (1/n) p_i(1 − p_i) for i = j,  = −(1/n) p_i p_j for i ≠ j

Conditional Distributions (265)
Given that T = N_{I_1+1} + ··· + N_I = t,
((N_1, ..., N_{I_1}) | T = t) ∼ Multinomial(n − t; p′_1, ..., p′_{I_1})

3.9 Sampling from a Finite Population, with a CLT

3.A Distribution Relationships – Casella Berger

3.B Distribution Relationships – Galen Shorack

Chapter 4

Dependent Random Variables

4.1 Two-Dimensional Discrete RVs

Joint distribution (288)

PX,Y (x,y) = P (X = x,Y = y)

Marginal distributions of X and Y (288)

P_X(x) = Σ_{∀y} P_{X,Y}(x,y)        P_Y(y) = Σ_{∀x} P_{X,Y}(x,y)

Independence (288)
P(X ∈ A, Y ∈ B) = P(X ∈ A)·P(Y ∈ B)        P_{X,Y}(v, w) = P_X(v)·P_Y(w)  ∀ v, w

Convolution formula (for Z = X + Y) (290)

P_Z(k) = P(Z = k) = P(X + Y = k) = Σ_{i=−∞}^{∞} P(X = i, Y = k − i)
       = Σ_{i=−∞}^{∞} P_X(i)·P_Y(k − i)
       = Σ_{i=0}^{k} P_X(i)·P_Y(k − i)   (for nonnegative integer-valued X and Y)

Conditional distribution of Y given X = x (292)

p_{Y|X=x}(y) = P(Y = y | X = x) = P(Y = y, X = x)/P(X = x) = p_{X,Y}(x,y)/p_X(x)
p_{X,Y}(x,y) = p_X(x)·p_{Y|X=x}(y) = p_Y(y)·p_{X|Y=y}(x)

Expected value of two-dimensional discrete RVs (294)

E[g(X,Y)] = Σ_{∀v} Σ_{∀w} g(v, w)·P((X,Y) = (v, w))
E[g(X)] = Σ_{∀v} Σ_{∀w} g(v) P_{X,Y}(v, w) = Σ_{∀v} g(v) P_X(v)

Theorem of the Unconscious Statistician (295)

If V = g(X,Y), then E(V) = Σ_{∀v} v p_V(v) = E[g(X,Y)] = Σ_{∀x} Σ_{∀y} g(x,y) p(x,y).

Independence of Partitions (297)

P(A_i, B_j) = p_{X,Y}(a_i, b_j) = p_X(a_i)·p_{Y|X=a_i}(b_j) = P(A_i)·P(B_j | A_i)   ∀ i, j
            = p_X(a_i)·p_Y(b_j) = P(A_i)·P(B_j)   (under independence)

4.2 Two-Dimensional Continuous RVs

Joint density function (304)
P((X,Y) ∈ A) = ∫∫_A f(x,y) dx dy

Joint distribution function (304)
F(x,y) = F_{X,Y}(x,y) = P(X ≤ x, Y ≤ y)

Marginal densities (304)

f_X(x) = ∫_{−∞}^{∞} f(x,y) dy        f_Y(y) = ∫_{−∞}^{∞} f(x,y) dx

Conditional densities (305)
f_{Y|X=x}(y) = f(x,y)/f_X(x)   ⟹   f(x,y) = f_X(x)·f_{Y|X=x}(y)
Independence   ⟹   f(x,y) = f_X(x)·f_Y(y)

Expected value of two-dimensional continuous RVs (305)

E[g(X,Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x,y) f(x,y) dx dy

Convolution formula for the sum of continuous RVs (Z = X + Y) (306)

F_Z(z) = P(Z ≤ z) = P(X + Y ≤ z) = ∫∫_{[u+v ≤ z]} f(u,v) du dv
       = ∫_{−∞}^{∞} ∫_{−∞}^{z−u} f(u,v) dv du          for any joint density f
       = ∫_{−∞}^{∞} ∫_{−∞}^{z} f(u, w − u) dw du        letting w = v + u
       = ∫_{−∞}^{z} ∫_{−∞}^{∞} f(u, w − u) du dw        always OK when f ≥ 0
       = ∫_{−∞}^{z} [ ∫_{−∞}^{∞} f_X(u) f_Y(w − u) du ] dw   when X and Y are independent.

Theorem of the Unconscious Statistician (307)

If Z = g(X,Y), then E(Z) = ∫_{−∞}^{∞} z f_Z(z) dz = E[g(X,Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x,y) f(x,y) dx dy.

Sums of Independent rvs (308)
Sums of independent Normals are Normal
Sums of independent Cauchy rvs are Cauchy
Sums of independent Gammas are Gamma

Sums of two independent Uniform rvs have a Triangular density (see Appendix C)

Factoring (74,309) Random variables are independent if and only if their distribution factors

Example: Density of a quotient (309)

The quotient R = U/V has density f_R(r) = ∫_{−∞}^{∞} |v| f(rv, v) dv for all r.

4.3 Conditional Expectation

Conditional probability distribution of Y given X = x (312)
p_{Y|X=x}(y) = P(Y = y | X = x) = p(x,y)/p_X(x)

Conditional mean (regression function) (312, 328)
m(x) = E(Y | X = x) = Σ_{∀y} y p_{Y|X=x}(y)
µ_Y = E(Y) = E[E(Y | X)] = E[m(X)]

Conditional variance (312)

v²(x) = Var(Y | X = x) = Σ_{∀y} (y − m(x))² p_{Y|X=x}(y)
σ²_Y = Var(Y) = E[Var(Y | X)] + Var[E(Y | X)] = E[v²(X)] + Var[m(X)]

Conditional expectation of g(X,Y) given X = x (312)

E[g(X,Y) | X = x] = Σ_{∀y} g(x,y) p_{Y|X=x}(y)
E[g(X,Y)] = E[ E[g(X,Y) | X = x] ]

A Factoring Example (313)
For 0 ≤ y ≤ x ≤ 1,
f(x,y) = 1/x = (1/x)·1_{[0 ≤ y ≤ x ≤ 1]} = 1_{[0,1]}(x) · [ (1/x)·1_{[0,x]}(y) ]
       = f_X(x) · f_{Y|X=x}(y)

Summing a random number of RVs (315)
Let X_1, X_2, ... be iid with mean µ_X and variance σ²_X, independent of the nonnegative integer-valued rv N.
Then for T = Σ_{i=1}^N X_i,        E(T) = µ_X µ_N        Var(T) = µ_N σ²_X + µ²_X σ²_N

Example: Conditioning via regeneration (317)
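A quick simulation check of the random-sum moment formulas above; a sketch in which the Poisson count and Exponential summands are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
mu_X, var_X = 2.0, 4.0        # Exponential(scale=2): mean 2, variance 4
mu_N, var_N = 3.0, 3.0        # Poisson(3): mean 3, variance 3

N = rng.poisson(mu_N, size=50_000)
T = np.array([rng.exponential(2.0, n).sum() for n in N])

print(T.mean(), mu_X * mu_N)                        # both near 6
print(T.var(), mu_N * var_X + mu_X**2 * var_N)      # both near 24
```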

4.4 Prediction

Mean squared error (326)
E[(Y − a)²]

Least squares predictor of Y (= µ_F) (326)
ā_F such that E[(Y − ā_F)²] = min_a E[(Y − a)²]

Mean absolute deviation (326)
E[|Y − a|]

Least absolute deviation predictor of Y (ä_F) (326)
E[|Y − ä_F|] = min_a E[|Y − a|]

Bias = systematic error (326)
b_F = a_F − µ_F = difference between predictor and mean
E[(Y − a_F)²] = σ²_F + b²_F :  mean squared error is the variance plus bias squared

Median (m̈_F) (327)

∫_{−∞}^{m̈_F} f_Y(y) dy = 0.5

Analogs in Samples (327)

LS predictor of a future observation Y: Ȳ
LAD predictor of a future observation Y: Ÿ = median(Y_1, ..., Y_n)

Other moments (329)
E[X^k]           kth moment
E[(X − µ)^k]     kth central moment
E[|X|^r]         rth absolute moment
E[|X − µ̈|]       mean deviation about the median (µ̈ is a median of X)

Least squares predictor (330) 4.5. COVARIANCE 21

Best linear predictor (330)

4.5 Covariance

Covariance (333)
σ_{X,Y} = Cov(X,Y) = E[(X − µ_X)(Y − µ_Y)] = Cov(X − µ_X, Y − µ_Y) = E(XY) − E(X)E(Y)

Correlation (333)
ρ_{X,Y} = Corr(X,Y) = Cov(X,Y)/(σ_X σ_Y) = Cov( (X − µ_X)/σ_X , (Y − µ_Y)/σ_Y )

Correlation inequality (335)
−1 ≤ ρ ≤ 1
ρ = ±1 iff (Y − µ_Y) = ±(σ_Y/σ_X)(X − µ_X)

Sample variance, population variance (336)

s²_X = s² = 1/(n−1) Σ_{i=1}^n (X_i − X̄_n)² = SS_XX/(n−1)        σ̂²_X = σ̂² = (1/n) Σ_{i=1}^n (X_i − X̄_n)² = SS_XX/n
E[S²] = σ²
Var[S²] = (1/n)(µ_4 − ((n−3)/(n−1)) σ⁴) = (µ_4 − σ⁴)/n + 2σ⁴/(n(n−1)) ≈ (µ_4 − σ⁴)/n

Sample covariance (336)
s_{X,Y} = SS_XY/(n−1)        σ̂_{X,Y} = SS_XY/n = (1/n) Σ_{i=1}^n X_iY_i − X̄ Ȳ        where SS_XY = Σ (X_i − X̄_n)(Y_i − Ȳ_n)

Sample correlation coefficient, population correlation coefficient (336)

ρ̂ = ρ̂_{X,Y} = s_{X,Y}/(s_X s_Y) = σ̂_{X,Y}/(σ̂_X σ̂_Y) = SS_XY/√(SS_XX SS_YY)

Empirical distribution (distribution of the sample) (337)

E(U) = X̄        E(V) = Ȳ        Cov(U, V) = σ̂_{X,Y}
Var(U) = σ̂²_X    Var(V) = σ̂²_Y    Corr(U, V) = ρ̂_{X,Y}

Covariance of linear combinations (337) Cov[X + Y,Y + Z] = Cov[X,Y ] + Cov[X,Z] + Cov[Y,Y ] + Cov[Y,Z] = Cov[X,Y ] + Cov[X,Z] + V ar(Y ) + Cov[Y,Z]

Covariance of general linear combinations (338)
Union/intersection formula (339)
Poisson Limit Theorem (redux) (339)

One-sample Covariance structure (341)
Cov[X_i, X̄] = σ²/n
Var[X_i − X̄] = ((n−1)/n) σ²
Cov[X_i − X̄, X_j − X̄] = −σ²/n
Corr[X_i − X̄, X_j − X̄] = −1/(n−1)

4.6 Bivariate Normal Distributions

Bivariate (348) X and Y have a Bivariate Normal Distribution:

(X, Y) ∼ N( (µ_X, µ_Y),  [ σ²_X  σ_{X,Y} ; σ_{X,Y}  σ²_Y ] ),   where σ_{X,Y} = ρσ_Xσ_Y,
with
f(x,y) = 1/(2πσ_Xσ_Y √(1−ρ²)) · exp( −Q(x,y)/(2(1−ρ²)) )
where Q(x,y) = ((x−µ_X)/σ_X)² − 2ρ((x−µ_X)/σ_X)((y−µ_Y)/σ_Y) + ((y−µ_Y)/σ_Y)²

Marginal distribution, Conditional distribution (348)
f(x,y) = f_X(x) f_{Y|X=x}(y), where
f_X(x) is the N(µ_X, σ²_X) density
f_{Y|X=x}(y) = 1/(√(2π) σ_Y√(1−ρ²)) · exp( −[y − (µ_Y + ρ(σ_Y/σ_X)(x − µ_X))]² / (2σ²_Y(1−ρ²)) ),  i.e.  (Y | X = x) ∼ N(m(x), v²(x))

Regression function & parameters (349)

m(x) = E[Y | X = x] = µ_Y + ρ(σ_Y/σ_X)(x − µ_X)
     = α + β(x − µ_X)   where α = µ_Y, β = ρσ_Y/σ_X
     = γ + βx            where γ = µ_Y − βµ_X
v²(x) = Var(Y | X = x) = σ²_Y(1 − ρ²) = σ²_ε   where σ_ε = σ_Y√(1 − ρ²)

Independence (349)
Bivariate Normal rvs (X,Y) are independent if and only if ρ = 0.
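The regression-function formulas above can be checked by simulating from a bivariate normal and conditioning on a narrow slice of x values; a sketch in which all numerical choices are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
mu_X, mu_Y, s_X, s_Y, rho = 1.0, 2.0, 1.5, 0.5, 0.6
cov = [[s_X**2, rho * s_X * s_Y], [rho * s_X * s_Y, s_Y**2]]
X, Y = rng.multivariate_normal([mu_X, mu_Y], cov, size=500_000).T

x0 = 2.0
slab = np.abs(X - x0) < 0.02                 # crude conditioning on X близ x0 (X ≈ x0)
m_theory = mu_Y + rho * (s_Y / s_X) * (x0 - mu_X)
v_theory = s_Y**2 * (1 - rho**2)
print(Y[slab].mean(), m_theory)              # conditional mean vs m(x0)
print(Y[slab].var(), v_theory)               # conditional variance vs v^2(x0)
```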

Factorization (352)

Any factorization of the type g(x)h_x(y) (in which all h_x(y) are densities) is a unique factorization, in general.

Regression line (m(x)) (355)
y = ρx for standard normals
y = µ_Y + ρ(σ_Y/σ_X)(x − µ_X) for general normals

Standard deviation line / line of symmetry for contour ellipses (355)
y = x for standard normals
y = µ_Y + (σ_Y/σ_X)(x − µ_X) for general normals

Extra Facts (356)
Linear functions of Bivariate Normals are Bivariate Normal (see section 5.2)
Elliptical contours (357)
Contour lines (orthogonal transformations) (357)

Chapter 5

Distribution Theory

5.1 Transformations of RVs

Linear transformations (365)
Y = aX + b

Distribution function method (365)
Note that F_Y(y) = P(Y ≤ y) = P(aX + b ≤ y) = P(X ≤ (y − b)/a) = F_X((y − b)/a) (for a > 0).
Differentiating both sides yields f_Y(y) = (d/dy) F_Y(y) = (d/dy) F_X((y − b)/a) = (1/a) f_X((y − b)/a).

Linear transformations of Normal RVs (365)
Z ∼ N(0,1)
X = µ + σZ ∼ N(µ, σ²)
Y = aX + b = (aµ + b) + (aσ)Z ∼ N(aµ + b, a²σ²)
Z² = (N(0,1))² ∼ Gamma(1/2, "mean" = 2) ∼ χ²_1
Σ_{i=1}^n Z²_i ∼ sum of Gamma(1/2, "mean" = 2) rvs ∼ χ²_n

Scale transformations of Gamma RVs (365)
Y ∼ Gamma(r, θ)
Y/θ ∼ Gamma(r, "mean" = 1)
2Y/θ ∼ Gamma(r, "mean" = 2) ∼ χ²_{2r}

Distribution function method for monotone transformations (365)
For monotone transformations (strictly increasing or decreasing functions Y = g(X)),
F_Y(y) = P(Y ≤ y) = P(g(X) ≤ y) = P(X ≤ g^{−1}(y)) = F_X(g^{−1}(y))   (for increasing g).
Therefore f_Y(y) = (d/dy) F_Y(y) = f_X(g^{−1}(y)) · |(d/dy) g^{−1}(y)| = f_X(g^{−1}(y)) · 1/|g′(g^{−1}(y))|.

Change of variable formula (365)
f_Y(y) = f_X(x)·|dx/dy| = f_X(x)/|dy/dx|     OR     f_New(y) = f_Old(x)·|dOld/dNew|

Distribution function method for multiple valued transformations (366)
For Y = X²,
F_Y(y) = P(Y ≤ y) = P(−√y ≤ X ≤ √y) = F_X(√y) − F_X(−√y − 0)
f_Y(y) = [f_X(√y) + f_X(−√y)] / (2√y)

Probability integral transformation (366)
U = F(X) ∼ Uniform(0,1):
F_U(u) = P(U ≤ u) = P(F(X) ≤ u) = P(X ≤ F^{−1}(u)) = F(F^{−1}(u)) = u

Inverse transformation (367)

U ∼ Uniform(0,1);  X = F^{−1}(U)
F_X(x) = P(X ≤ x) = P(F^{−1}(U) ≤ x) = P(U ≤ F(x)) = F(x)

Transformation by integration (367)

F_W(w) = P(W ≤ w) = ∫···∫_{[x: H(x) ≤ w]} f_{X_1,...,X_n}(x_1, ..., x_n) dx_1 ··· dx_n

Student-t, F, Beta (369)
For Z ∼ N(0,1), U ∼ χ²_m, V ∼ χ²_n (independent),
T = Z/√(V/n) ∼ T_n
W = (U/m)/(V/n) ∼ F_{m,n}
B = U/(U + V) ∼ Beta(m/2, n/2)
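These three constructions are easy to verify by simulation; a minimal sketch in which the degrees of freedom are arbitrary choices:

```python
import numpy as np
from scipy.stats import t, f, beta

rng = np.random.default_rng(4)
m, n, reps = 4, 7, 200_000
Z = rng.standard_normal(reps)
U = rng.chisquare(m, reps)
V = rng.chisquare(n, reps)

print((Z / np.sqrt(V / n) <= 1.0).mean(), t.cdf(1.0, n))          # Student-t_n
print(((U / m) / (V / n) <= 2.0).mean(), f.cdf(2.0, m, n))        # F_{m,n}
print((U / (U + V) <= 0.3).mean(), beta.cdf(0.3, m / 2, n / 2))   # Beta(m/2, n/2)
```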

Non-central T-distribution, χ², F (370)
T_δ = (Z + δ)/√(V/n) ∼ T_n(δ)
Σ_{i=1}^m (Z_i + δ_i)² = (Z_1 + δ)² + Σ_{i=2}^m Z²_i ∼ χ²_m(δ)
[χ²_m(δ)/m] / [χ²_n/n] ∼ F_{m,n}(δ)

5.2 Linear Transformations in Higher Dimensions

Definitions
orthogonal (382) – A 2×2 matrix A is orthogonal if its rows are two perpendicular vectors of length 1.
mean vector (384) – The mean vector of the random vector X = (X_1, ..., X_n)′ is the n × 1 vector µ = µ_X = (E(X_1), ..., E(X_n))′.
covariance matrix (384) – The covariance matrix for the same vector is the n × n matrix Σ = Σ_X = ||σ_ij||.
non-negative definite (384) – Every covariance matrix Σ is non-negative definite, i.e. Σ ≥ 0.
positive definite (384) – Σ > 0.
orthogonal unit vectors (386) – γ_1, ..., γ_n (with γ_i = (γ_i1, ..., γ_in)′) such that γ_i′γ_i = 1 for all 1 ≤ i ≤ n, but γ_i′γ_j = 0 for all i ≠ j.

Matrix Identities (382)
A = [ a  b ; c  d ]

A^{−1} = (1/(ad − bc)) [ d  −b ; −c  a ]        |A| = ad − bc        |A||A^{−1}| = |AA^{−1}| = |I| = 1

Extended Identities (382)
Y = (Y_1, Y_2)′ = (a_11 X_1 + a_12 X_2,  a_21 X_1 + a_22 X_2)′ = [ a_11  a_12 ; a_21  a_22 ] (X_1, X_2)′ = AX

Jacobian = |det(∂x_i/∂y_j)| = |det A^{−1}| = 1/|det A| = 1/|det(∂y_i/∂x_j)|

Density of a linearly transformed variable (382)

If X = A^{−1}Y,   f_{Y_1,Y_2}(y_1, y_2) = f_{X_1,X_2}(x_1, x_2)·|det A^{−1}| = f_{X_1,X_2}(x_1, x_2)/|det A|

Mean vectors and covariance matrices of linear transformations (384)

Y = AX + c has mean vector µY = AµX + c and covariance matrix ΣY = AΣXA′

Multivariate Normal distribution (385)

For Z1,...,Zn independent N(0, 1) rvs,

Y = AZ + µ,  i.e. Y_i = a_i′Z + µ_i for the rows a_i′ of A,  so Y ∼ N(µ, Σ).
Y has the Multivariate Normal(µ, Σ) density, determined solely by µ_Y = µ and Σ_Y = AA′:

f_Y(y) = 1/( |Σ|^{1/2} (2π)^{n/2} ) · exp( −(1/2)(y − µ)′ Σ^{−1} (y − µ) )

Orthogonal transformations (386)
z = Gy where G = (γ_1, ..., γ_n)′ = ||γ_ij|| is an orthogonal matrix as defined above (note G′G = I = GG′).

Partitioning a multivariate Normal (389)
Let Y = AZ + µ as above, and partition Y, µ, and Σ so that

Σ = AA′ = [ A_1 ; A_2 ] [ A_1′  A_2′ ] = [ A_1A_1′  A_1A_2′ ; A_2A_1′  A_2A_2′ ] = [ Σ_11  Σ_12 ; Σ_21  Σ_22 ]
Then Y_1 ∼ N(µ_1, Σ_11) and Y_2 ∼ N(µ_2, Σ_22).

Y_1 and Y_2 are uncorrelated if and only if Σ_12 = 0.

5.3 General Transformations in Higher Dimensions

5.4 Asymptotics

Definitions

convergence in probability (408) – U_n →_p U when P(|U_n − U| ≥ ε) → 0 as n → ∞, for every ε > 0.
convergence in distribution (408) – W_n →_d W when the df F_n of W_n converges to the df F of W at each point x where F is continuous. We also say that W_n is asymptotically distributed as W.

Classical CLT of Probability (408)

Z_n = √n(X̄_n − µ)/σ →_d Z ∼ N(0,1)

Finite Sampling CLT (409)

Z_n = √n(X̄_n − µ) / ( σ √(1 − (n−1)/(N−1)) ) →_d Z ∼ N(0,1)

PLT: T_n ∼ Binomial(n, p_n) →_d T ∼ Poisson(λ) when λ_n = np_n → λ.
Minimum of Uniforms: nU_{n:1} →_d Exponential(1).
Weakest Link Limit Theorem: r_n Y_{n:1} →_d Weibull(0, α, β).
Discrete Uniform: For X_n with Discrete Uniform(0, n) distribution, X_n/n →_d Uniform(0,1).

Markov Inequality (409)
For any rv X, P(|X| ≥ t) ≤ E[|X|]/t for all t > 0

Chebyshev Inequality (409)
For X with finite mean µ and variance σ²,
P(|X − µ| ≥ t) ≤ σ²/t² for all t > 0.

Weak Law of Large Numbers (WLLN) (410)

Let X_1, X_2, ..., X_n be iid with finite mean µ. Then X̄_n →_p µ.
If, additionally, the X_i have common finite variance σ², the WLLN gives
S²_n →_p σ²,   σ̂²_n →_p σ²,   and σ̂²_{0n} →_p σ².

Slutsky’s Theorem (411)

Suppose W_n →_d W, U_n →_p a, V_n →_p b, and h is continuous at a. Then
U_nW_n + V_n →_d aW + b   and   h(U_n) →_p h(a).
The CLT of Statistics thereby follows from the CLT of Probability: √n(X̄_n − µ)/S_n →_d N(0,1), since √n(X̄_n − µ)/σ →_d N(0,1).

Mann-Wald Theorem (Continuous Mapping Theorem) (412)
For continuous functions h, W_n →_d W implies h(W_n) →_d h(W).
e.g. Z²_n = [√n(X̄_n − µ)/σ]² →_d N²(0,1) ∼ χ²_1 follows from Mann-Wald with h(t) = t².

Delta Method (413)
If Z_n = √n(T_n − ν) →_d Z = N(0, τ²) and g is differentiable at ν, then
√n[g(T_n) − g(ν)] =_a g′(ν) × Z_n →_d g′(ν) × Z = N(0, [g′(ν)]² τ²)
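A small simulation of the delta method for g(t) = log t applied to a sample mean; a sketch with illustrative choices of population and sample size:

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 200, 20_000
nu, tau = 2.0, 2.0                       # Exponential(scale=2): mean 2, sd 2

Tn = rng.exponential(2.0, size=(reps, n)).mean(axis=1)
lhs = np.sqrt(n) * (np.log(Tn) - np.log(nu))

# the delta method predicts sd = |g'(nu)| * tau = tau / nu
print(lhs.std(), tau / nu)
```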

Bivariate CLT of Probability (413)
(Z̄^X_n, Z̄^Y_n) = ( √n(X̄_n − µ_X), √n(Ȳ_n − µ_Y) ) →_d (Z_X, Z_Y) ∼ N( (0,0), [ σ²_X  σ_XY ; σ_XY  σ²_Y ] )

Bivariate Mann-Wald (414)
For continuous functions h(·,·), and Z̄^X_n and Z̄^Y_n as defined above, h(Z̄^X_n, Z̄^Y_n) →_d h(Z_X, Z_Y) as n → ∞.
e.g. For Y_i = (X_i − µ)², having mean σ² and variance µ_4 − σ⁴,
√n( X̄_n − µ_X, S²_n − σ² ) =_a √n( X̄_n − µ_X, Ȳ_n − σ² ) →_d N( (0,0), [ σ²_X  µ_3 ; µ_3  µ_4 − σ⁴ ] )

Two-Dimensional Delta Method (415)

√n[ g(X̄_n, Ȳ_n) − g(µ_X, µ_Y) ] =_a ∇g(µ)′ (Z_X, Z_Y)′ = c_1Z_X + c_2Z_Y
    = c′(Z_X, Z_Y)′ ∼ N(0, c′Σc) = N(0, c²_1σ²_X + 2c_1c_2σ_XY + c²_2σ²_Y).

More General CLTs (416)
Weighted average CLT
Liapunov CLT
Lindeberg-Feller CLT
Multivariate CLT

5.5 Moment Generating Functions

Moment Generating Function (mgf) (421)

tX M(t) = MX (t) = E[e ]

Characteristic Function (chf) (421)

itX φ(t) = φX (t) = E[e ]

Cumulant Generating Function (cgf) (421)

K(t) = K_X(t) = log φ(t)

Math Facts
e^{itX} = cos(tX) + i sin(tX)
|e^{itX}| ≤ 1 for all possible values of X and t
e^{tX} = 1 + (tX)/1! + ··· + (tX)^k/k! + ···

Fundamental Properties of Transforms (422)
Linear changes in X
M_{aX+b}(t) = E[e^{t(aX+b)}] = e^{tb} E[e^{(at)X}] = e^{tb} M_X(at)
φ_{aX+b}(t) = E[e^{it(aX+b)}] = e^{itb} E[e^{i(at)X}] = e^{itb} φ_X(at)

MGF of a sum of independent rvs
M_{X+Y}(t) = M_X(t)M_Y(t)        φ_{X+Y}(t) = φ_X(t)φ_Y(t)        K_{X+Y}(t) = log φ_{X+Y}(t) = log(φ_X(t)φ_Y(t)) = K_X(t) + K_Y(t)

Uniqueness Theorem
M_X(·) completely determines P_X, and vice versa, with qualifications.
φ_X(·) completely determines P_X, and vice versa.

Cramér-Levy Continuity Theorem

If M_{Z_n}(t) → M_Z(t), then Z_n →_d Z, with qualifications.
φ_{Z_n}(t) → φ_Z(t) for all t if and only if Z_n →_d Z.

Getting moments from the MGF (423)
M^{(k)}(0) = E[X^k]

Getting moments from the CGF (424)
K^{(j)}(0) = i^j κ_j
For any rv X with µ_4 < ∞:  κ_1 = µ,  κ_2 = σ²,  κ_3 = µ_3,  κ_4 = µ_4 − 3µ_2² = µ_4 − 3σ⁴
For standardized rvs, κ_3 = γ_1 (skewness) and κ_4 = γ_2 (kurtosis).

Multivariate Moment Generating Functions (429)

M_X(t) = E[e^{t′X}] = E[e^{t_1X_1 + ··· + t_nX_n}]
φ_X(t) = E[e^{it′X}] = E[e^{i(t_1X_1 + ··· + t_nX_n)}]
Analogs of the four fundamental properties listed above exist for rvs in n dimensions.

Multivariate Normal MGF (429)

Chapter 6

Classical Statistics

6.1 Estimation

Definitions
parameter space (435) – The set Θ of all possible values of the parameter θ of the model
statistic (435) – any function of the data
unbiased (435) – If E_θ[T(X)] = τ(θ) for each θ, then T is an unbiased estimator for τ(θ)
consistent (435) – If T_n(X) →_p τ(θ) for each θ, then T is a consistent estimator for τ(θ)
bias (436) – E_θ[T] − τ(θ)
mean squared error (MSE) (436) – Mse(θ) = E_θ[(T − τ(θ))²]
method of moments estimators (437) – Parameter estimators resulting from equating the population moments with the sample moments, and solving for the parameters

Main Elementary Facts (436)

Mse(θ) = Var_θ(T) + Bias²(θ)

T_n is consistent whenever Mse_{T_n}(θ) → 0 as n → ∞, or whenever Var_θ(T_n) → 0 and Bias_{T_n}(θ) → 0.
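The Mse = variance + bias² decomposition can be seen directly by simulating a biased estimator; a sketch in which the Normal population and the estimator (1/n)Σ(X_i − X̄)² of σ² are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(6)
sigma2, n, reps = 4.0, 10, 200_000
X = rng.normal(0.0, 2.0, size=(reps, n))

T = X.var(axis=1, ddof=0)                # biased estimator of sigma^2
mse = ((T - sigma2) ** 2).mean()
var = T.var()
bias2 = (T.mean() - sigma2) ** 2
print(mse, var + bias2)                  # the two agree up to simulation error
```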

6.2 The One-Sample Normal Model

Definitions

confidence intervals (441) – X̄_n ± t*S_n/√n provides a (1 − α) confidence interval for µ.
random theoretical confidence interval (441) – with probability 1 − α, will contain the fixed unknown value µ. This probability is exact for a Normally distributed experiment X, and approximately correct for any experiment X having finite mean and variance.
numerical confidence interval (441) – We have 1 − α degree of belief that the non-random numerical value x̄_n did estimate µ to within an estimated margin for error of t*s_n/√n.
level-α test (442) – Reject H_o: µ = µ_o in favor of H_a: µ ≠ µ_o whenever |T_n(µ_o)| > t_{n−1}^{(α/2)}.
power (442) – Power(µ, σ) = P_{µ,σ}(H_o is rejected) = P(|T_{n−1}(δ)| > t_{n−1}^{(α/2)}).
p-value (444) – chance of the statistic taking on a value at least as extreme as its observed value; numerical evaluation of the strength of evidence against the null hypothesis.
model equation (445) – X_i = µ + ε_i, for 1 ≤ i ≤ n, with independent N(0, σ²) rvs ε_i.
location parameter (445) – µ in the model equation.
scale parameter (445) – σ in the model equation.
errors, residuals (445) – ε̂_i = X_i − µ̂ = Observation_i − Fitted Value_i.
sum of squares (445) – SS(µ̂) = Σ_{i=1}^n (X_i − X̄_n)² = SS_XX
fitted value, least squares estimator (445) – µ̂ (= X̄_n), for which the sum of squares is minimized.
pth quantile (446) – z_p such that P(Z ≤ z_p) = p


Theorem 6.2.1 – One-Sample Normal Theory Building Blocks (441) 2 For an independent random sample X1,...,Xn from the N(µ, σ ) distribution,

Z_n = √n(X̄_n − µ)/σ ∼ N(0,1)

S²_n/σ² = (1/σ²)·SS_XX/(n − 1) ∼ χ²_{n−1}/(n − 1)

X̄_n and S²_n are independent rvs.

T_n = √n(X̄_n − µ)/S_n = √n(X̄_n − µ)/√(SS_XX/(n−1)) ∼ T_{n−1}.

Kantorovich Inequality (449)
If P(a ≤ X ≤ b) = 1 for some a > 0, then 1 ≤ E[X]·E[1/X] ≤ (a + b)²/(4ab).

Jensen Inequality (449)
g(µ) ≤ E[g(X)], for convex g on the interval I, where X has mean µ in the interior of the interval I.
e.g. For g(X) = X², we obtain E[X²] − µ² = σ² ≥ 0.

6.3 The Two-Sample Normal Model

Definitions
variance ratio (457) – ∆² = σ²_2/σ²_1.
paired differences (458) – For an independent sample (X_1,Y_1), ..., (X_n,Y_n) from a basic (X,Y)-experiment with joint Bivariate Normal distribution, D_i = X_i − Y_i is the paired difference.
efficiency of estimators (461) – E(µ̄_n, µ̂_n) = Var[µ̂_n]/Var[µ̄_n] → E(µ̄, µ̂).

Equal variances (456)
For independent samples X_1, ..., X_m and Y_1, ..., Y_n from N(µ, σ²) and N(ν, σ²), and θ = µ − ν,
θ̂ = X̄_m − Ȳ_n ∼ N(θ, ((m+n)/(mn)) σ²).
S² = ((m−1)/(m+n−2)) S²_X + ((n−1)/(m+n−2)) S²_Y = (SS_XX + SS_YY)/(m+n−2) ∼ (σ²/(m+n−2)) χ²_{m+n−2}.
It follows from the independence of θ̂ and S² that
T = √(mn/(m+n)) · [(X̄_m − Ȳ_n) − (µ − ν)]/S_{m,n} = √(mn/(m+n)) · (θ̂ − θ)/S_{m,n} ∼ Student T_{m+n−2}.

Unequal variances (456)
For independent samples X_1, ..., X_m and Y_1, ..., Y_n from N(µ, σ²_1) and N(ν, σ²_2), and θ = µ − ν,
θ̂ = X̄_m − Ȳ_n ∼ N(θ, ((m+n)/(mn)) σ̄²), where σ̄² = (nσ²_1 + mσ²_2)/(m+n).
S̄² = (nS²_X + mS²_Y)/(m+n).

By the CLT (provided only that the basic X and Y have finite variances),
T̄ = √(mn/(m+n)) · [(X̄_m − Ȳ_n) − (µ − ν)]/S̄_{m,n} = [(X̄_m − Ȳ_n) − (µ − ν)] / √( S²_X/m + S²_Y/n ) →_d N(0,1).

Known Variance Ratio (∆2) (457)

The pooled estimator above now satisfies S² ∼ (σ²_1/(m+n−2)) (χ²_{m−1} + ∆² χ²_{n−1}), but a more helpful statistic is
S²(∆) = ((m−1)/(m+n−2)) S²_X + ((n−1)/(m+n−2)) (1/∆²) S²_Y ∼ (σ²_1/(m+n−2)) χ²_{m+n−2}.
The corresponding T-statistic is
T(∆_o) = √( mn/(n + m∆²_o) ) · [(X̄_m − Ȳ_n) − (µ − ν)]/S_{m,n}(∆_o) ∼ T_{m+n−2}.

Matched Pairs T-test (458)
As defined above, D = X − Y ∼ N(µ_D, σ²_D) = N(θ, σ²_D), where the variance is given by σ²_D = σ²_1 + σ²_2 − 2σ_12.
From Theorem 6.1.2, we have
Z_D = √n(D̄_n − θ)/σ_D ∼ N(0,1),
S²_D = 1/(n−1) Σ_{i=1}^n (D_i − D̄)² ∼ (σ²_D/(n−1)) χ²_{n−1}.
It follows from the independence of D̄ and S²_D that
T_D = √n(D̄_n − θ)/S_{D,n} ∼ T_{n−1}.

Normal Theory Building Blocks (461)
The optimal choice for weighting two samples is that which minimizes the variance of µ̂. The natural estimator of the variance employs the same weighting used for µ̂.

6.4 Other Models

Examples
Bernoulli (466)
Poisson (467)
NegBiT (469)

Chapter 7

Estimation, Principles and Approaches

7.1 Sufficiency, and UMVUE Estimators

Definitions

sufficient statistic (473) – T = T(X) is a sufficient statistic for θ (or for the family P = {P_θ: θ ∈ Θ}) when the conditional distribution P_{X|T=t}(·) does not depend on θ.
ancillary statistic (474, 486) – A statistic V = V(X) is called ancillary if the distribution of V does not depend on θ.
zero function (476) – h(t) = 0 for all t.
complete family of distributions (476, 484) – An arbitrary family P_T of the possible distributions (of some rv T) is called complete if the condition E_P[h(T)] = 0 for every possible distribution P of T in the family P_T implies that h can only be the zero function.
uniform minimum variance unbiased estimator (UMVUE) (476) – Suppose T is an unbiased estimator of τ(θ) for which Var_θ[T] ≤ Var_θ[U] for all θ, for any other unbiased estimator U of τ(θ). Then T is a UMVUE of τ(θ).
minimally sufficient statistic (478) – In general, the statistic T(X) is said to be minimally sufficient in some model if any other sufficient statistic T̃ = T̃(X) for this model necessarily satisfies a relationship of the form T = h̃(T̃) for some function h̃.
common support (480) – The support of P_θ is {x: p_θ(x) > 0}. A family of densities for which the support of each density is the same is said to have common support.

Sufficiency Principle (474)

If T = T (X) is a sufficient statistic for θ in the model, then to estimate any parameter of the form τ(θ) one should only use estimators that are functions of T .

Ancillarity Principle (474)

Basing an analysis on the conditional distribution given the ancillaries might well access useful information about the underlying distribution. (Often useful, it is less than compelling; ancillaries are not unique.)

Fisher-Neyman Factorization Theorem (476)

T(X) is sufficient for θ if and only if the model distributions can be factored as
p_θ(x) = g_θ(t) × h_t(x) = g_θ(t) × h(x)
f_θ(x) = g_θ(t) × h_t(x) = g_θ(t) × h(x)

Rao-Blackwell (476)
Let U = U(X) be unbiased for τ(θ), and suppose T(X) is sufficient for θ. Then V = V(T) = E(U | T)
a) is a function of the sufficient statistic,
b) is unbiased for τ(θ), since E_θ(V) = E_θ(m(T)) = E_θ(E(U | T)) = E_θ(U) = τ(θ), and
c) is at least as good as U, since
   Var_θ(V) = Var_θ(m(T)) ≤ Var_θ(E(U | T)) + E_θ(Var(U | T)) = Var_θ(U).

Lehmann-Scheff´e(unique UMVUEs) (476) Let T be sufficient for θ and complete, and V = V (T ) be any unbiased estimator of τ(θ) with finite variance. Then

V is the unique UMVUE of τ(θ). (Often a computation of the form V = E(U | T) will be useful.)

Likelihood Ratios and Sufficiency (481) Within the common support model, T (X) is sufficient for θ if and only if each likelihood ratio is a function of this T .

Examples Sufficiency geometrically (normal spheres)(477) Condition on Ancillary Statistics (481)

7.2 Completeness, and Ancillary Statistics

Math Fact 1 (infinite series) (484)

h(x) = Σ_{k=0}^∞ a_k x^k = 0 for all x in some interval iff a_k = 0 for all k.

∫_{(−∞, x]} h(t) dt = 0 for all x iff h(x) = 0 for all x.

Basu (487)
If T = T(X) is sufficient and complete for θ, then every ancillary statistic V(X) is independent of T.

Sufficiency and Completeness Relative to Inclusion (487)

If T = T(X) is sufficient for P, then T is sufficient for the smaller P_o.
If the family P_o^T is complete, then the bigger family P^T is complete.

Examples Uniform(0,θ), Uniform (θ,θ) (485) – complete and sufficient statistics Uniform(0,θ) (486) – ancillary statistics Exponential(θ) (486) – ancillary statistics Poisson UMVUEs (487) 7.3. EXPONENTIAL FAMILIES, AND COMPLETENESS 35

7.3 Exponential Families, and Completeness

Definitions
exponential family (492) – A family P_o = {P_θ: θ ∈ Θ_o} of distributions that can be written in the format c(θ) e^{Σ_{i=1}^κ a_i(θ)T_i(x)} h(x) 1_D(x), where c(θ) depends only on θ and h(x) > 0 iff x ∈ D. The family has common support D.
Order of P_o (492) – the minimum κ for which such an expression of an exponential family is possible.
affinely independent over D (492) – means that Σ_1^κ c_i T_i(x) = c_o for a.e. x ∈ D implies that all c_i = 0
minimal over D (492) – T that satisfies affine independence
affinely independent over Θ_o (492) – means that Σ_1^κ c_i a_i(θ) = c_o for θ ∈ Θ_o implies that all c_i = 0
minimal over Θ_o (492) – a that satisfies affine independence
natural parameter space (η) (492) – For P_o with order κ and T and a both minimal, replace a(θ) with η = (η_1, ..., η_κ)′. Rewrite the density as e^{−d(η) + Σ_{i=1}^κ η_i T_i(x)} h(x) 1_D(x), where d(η) = log( ∫_D e^{Σ_{i=1}^κ η_i T_i(x)} h(x) dx ). N = {η: d(η) < ∞} is the natural parameter space.
full exponential family (492) – P_N = {P_η: η ∈ N} for N as just defined.
curved exponential family (495) – situations where the parameters in the natural parameter space are subject to specified relationships (e.g. Normal(θ, θ) or Normal(θ, θ²)).

Natural Parameter Space Properties (492)
1) N is a convex set and d(·) is convex on N.
If E_η[|g(X)|] < ∞ for all η ∈ N, then
2) all derivatives of φ(·) exist and
3) φ(η) = E_η[g(X)] can be differentiated "under the integral sign" on the interior of N.

Theorem 7.3.1: Sufficient and complete statistic for exponential families (493)

Let X1, ..., Xn be iid, with each distribution in a family of order κ given by

c(θ) e^{Σ_{i=1}^κ a_i(θ)T_i(x)} h(x) 1_D(x)

(A) For 1 ≤ i ≤ κ, define T_i(X) = Σ_{j=1}^n T_i(X_j), and then define the statistic T = T(X). This T is sufficient for P.

(B) If N_o = {(a_1(θ), ..., a_κ(θ)): θ ∈ Θ_o} contains a κ-dimensional subset H, then T is both minimally sufficient and complete for P_N ⊇ P.

(C) The sufficient statistic T is distributed as an exponential family with the same a_i(θ)'s as X.

(D) Suppose κ is fixed. The conditional distribution of (T_1 | T_2 = t) is again an exponential family.

(E) The conditional distribution P_{X|T=t} takes the form
[ c^n(θ) e^{a(θ)t} × Π_{j=1}^n h(x_j) ] / [ c^n(θ) e^{a(θ)t} h_*(t) ] = Π_{j=1}^n h(x_j) / h_*(t)

Theorem 7.3.2: Sufficiency and completeness of one-to-one transformations (494)
The outcome of some X-experiment is modeled as being distributed according to one of the distributions P_θ in some family P = {P_θ: θ ∈ Θ}. Suppose T = T(X) is sufficient and complete for θ. Let U = U(T) denote any one-to-one function of T. Then U is sufficient and complete for θ. (Also, T, θ, and U may all denote vectors.)

7.4 The Cram´er-Rao Bound

Definitions

likelihood of θ (503) – L(θ; x) = p_θ(x) OR f_θ(x). In the case of independent rvs X_k, L(θ; x) = Π_{k=1}^n L(θ; x_k).
maximum likelihood estimator (503) – the value of θ in the parameter set Θ that maximizes the likelihood L(θ; x).
log-likelihood (503) – ℓ(θ; x) = log L(θ; x), which for iid data equals Σ_{k=1}^n ℓ(θ; x_k)
score function (503) – ℓ̇(θ; x) = (∂/∂θ) log p_θ(x) OR (∂/∂θ) log f_θ(x)
likelihood equation (503) – ℓ̇(θ; x) = 0; the MLE is commonly found by solving this equation.
Fisher information (504) – I_n(θ) = E_θ{[ℓ̇_n(θ; X)]²}
alternative Fisher information (504) – Ī_n(θ) = E_θ{−ℓ̈_n(θ; X)}; this is equal to the Fisher information for "regular" distributions
information per observation (504) – the Fisher information as calculated from one observation x_k of the data
theoretical score function (504) – ℓ̇_n(θ; X)
numerical score function (504) – ℓ̇_n(θ; x)
relative efficiency (506) – E_θ(T, T̃) = Mse_θ[T]/Mse_θ[T̃]
absolute efficiency (506) – E_θ(T̃) = CRB_n(θ)/Mse_θ[T̃]
asymptotic relative efficiency (506) – the limit of absolute efficiency as n → ∞

Elementary Fact (505)
Cov(U, V) = E[(U − 0)(V − µ_V)] when E[U] = 0.

Cram´er-Rao, Information Inequality, Equality Corollary (504)

(A) Let the model distributions = Pθ : θ Θ have common support. If the equation 1 = Eθ(1) can be differentiated under the integralP sign,{ then∈ the} score function has mean 0; that is Eθ[ℓ˙(θ, X)] = 0.

(B) If the equation 1 = E_θ(1) can be differentiated under the integral sign twice, then the score function has variance Ī_n(θ). That is (using (A)),
I_n(θ) = E_θ[(ℓ̇_n(θ, X))²] = Var_θ[ℓ̇_n(θ, X)] = Ī_n(θ) = E_θ[−ℓ̈_n(θ, X)]

(C) (Information Inequality) Mainly, let T = T(X) denote any statistic of interest for which E_θ[|T(X)|] < ∞, and define τ(θ) = E_θ[T(X)]. Suppose τ(θ) has a derivative τ′(θ) on the interior of Θ, and both τ(θ) and 1 = E_θ(1) can be differentiated once under the integral.

Then a bound on the variance of any such unbiased estimator T(X) of τ(θ) is
Var_θ[T(X)] ≥ [τ′(θ)]²/I_n(θ)
Thus, we have a standard which no unbiased estimator can surpass.

(D) (Equality Corollary) Equality is achieved in (C) for a particular estimator T(X) of τ_o(θ) only if
ℓ̇(θ, X) = A_n(θ)[T(X) − τ_o(θ)] for some constant A_n(θ),
and in this case the CR-bound of (C) becomes
Var_θ[T(X)] = τ′_o(θ)/A_n(θ).

Moreover, when this relationship holds, I_n(θ) = |τ′_o(θ) A_n(θ)|.
This format can hold for at most one function τ_o(·) (and linear variations on it).

Theorem 7.4.2: UMVU Estimators of τ(θ) (506)
X is distributed according to one of the distributions P_θ in P = {P_θ: θ ∈ Θ}.
(a) Let V be a UMVUE of τ(θ) that has finite variance for all θ. Then V is unique.

(b) Let V be unbiased for τ(θ) with finite variance within the model. Let U denote any statistic having finite variance within the model. Then V is UMVUE of τ(θ) iff Cov_θ[U, V] = 0 for all U with E_θ[U] = 0 for all θ.

Example

Poisson(ν·a_i) regression (508)
Truncation family (509)
Power family (509)
Non-regularity of Uniform(0, θ) (510)

7.5 Maximum Likelihood Estimators Definitions

maximum likelihood estimator (513) – θ̂_n(x) is the value of θ in the parameter set Θ that maximizes the likelihood L(θ; x).
theoretical maximum likelihood estimator (513) – θ̂_n(X), the MLE when evaluated at the random unperformed experimental outcome X.
sufficiency (513) – If T(X) is sufficient for θ, then any MLE θ̂_n is a function of T.
invariance (513) – If θ̂_n is the MLE of θ, then the MLE of τ(θ) is τ(θ̂_n).
general asymptotic normality and a_n-consistency (529) – a_n(θ̃_n − b_n(θ)) →_d N(0, v²_θ)
asymptotic normality and √n-consistency (513) – If an iid model has nice smoothness properties, then √n(θ̂_n − θ) →_d N(0, I^{−1}(θ)) for I(θ) the Fisher information in one X.
Fisher information for a vector (516) –
Ī_n(θ) = [ −E_θ[(∂²/∂α²) ℓ_n(θ; X)]        −E_θ[(∂²/∂β∂α) ℓ_n(θ; X)] ;
           −E_θ[(∂²/∂α∂β) ℓ_n(θ; X)]       −E_θ[(∂²/∂β²) ℓ_n(θ; X)] ]
(alternative form) (516) –
I_n(θ) = [ E_θ[((∂/∂α) ℓ_n(θ; X))²]                          E_θ[((∂/∂α) ℓ_n(θ; X))((∂/∂β) ℓ_n(θ; X))] ;
           E_θ[((∂/∂β) ℓ_n(θ; X))((∂/∂α) ℓ_n(θ; X))]         E_θ[((∂/∂β) ℓ_n(θ; X))²] ]
regularity conditions (517) – The previous alternative forms of the Fisher information matrix are equivalent under certain regularity conditions (see Theorem 7.5.1). Under these conditions, the theoretical score function has mean 0 and variance I(θ).
expected information (517) – I(θ)
information random variable (517) – I_n(θ) = −(1/n) ℓ̈_n(θ; X) = V̄ →_p µ_V = Ī
observed information (517) – Î_n = I_n(θ̂_n) = −(1/n) Σ_{k=1}^n ℓ̈(θ̂_n; X_k)
estimated information (518) – I(θ̂_n)
Newton-Raphson iteration method (522)

Theorem 7.5.1: Asymptotic properties of MLEs (518)
Regularity Conditions (519)
Assumptions that suffice (even for an r-dimensional θ) are:
(i) Z_n(θ_o) →_d N(0, I(θ_o)) and I_n(θ_o) →_p Ī(θ_o) = I(θ_o) > 0 is finite.
(ii) θ̂_n satisfies the likelihood equation.
(iii) θ̂_n →_p θ_o.
(iv) The mean value theorem is applicable.
(v) |D_n| = |I_N(θ̂*_n) − I_n(θ_o)| ≤ sup[ |I_N(θ′) − I_n(θ_o)| : |θ′ − θ_o| ≤ |θ̂_n − θ_o| ] →_p 0.
(vi) The distributions P_θ have common support.

Conclusions (518)
Consider an iid model that satisfies these regularity conditions. Then θ̂_n and Î_n = I_n(θ̂_n) satisfy:
(i) √n(θ̂_n − θ) =_a I^{−1}(θ_o) Z_n(θ_o) = I^{−1}(θ_o) [ (1/√n) Σ_{k=1}^n ℓ̇(θ_o; X_k) ]
(ii) I^{−1}(θ_o) Z(θ_o) ∼ I^{−1}(θ_o) N(0, I(θ_o)) ∼ N(0, I^{−1}(θ_o))
(iii) Î_n = I_n(θ̂_n) →_p I(θ_o) > 0 for the observed information Î_n = I_n(θ̂_n).
(iv) √n(θ̂_n − θ) · Î_n^{1/2} →_d N(0, I_m) (where I_m is the m × m identity matrix).
(v) θ̂_n ± z^{(α/2)} (1/√n) Î_n^{−1/2} (for an m = 1 dimensional parameter θ).
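As a concrete illustration of conclusion (v), here is a sketch that builds the interval θ̂_n ± z^{(α/2)} Î_n^{−1/2}/√n for the rate θ of an Exponential model with per-observation log-likelihood ℓ(θ; x) = log θ − θx; the data are simulated and the model choice is only for illustration:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
theta_true, n = 0.5, 400
x = rng.exponential(scale=1 / theta_true, size=n)

theta_hat = 1 / x.mean()                      # MLE, solving the likelihood equation
I_hat = 1 / theta_hat**2                      # observed information per observation: -l''(theta) = 1/theta^2
z = norm.ppf(0.975)
half = z / np.sqrt(n * I_hat)
print(theta_hat, (theta_hat - half, theta_hat + half))
```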

Theorem 7.5.2: Asymptotic properties of MLE vectors (519) Regularity Conditions (519) Assuming the following regularity conditions: (N0)

(i) Observations X_1, ..., X_n are iid as one of the distinct distributions P_{θ_o} in P = {P_θ: θ ∈ Θ}, defined by densities f_θ(·) or mass functions p_θ(·).
(ii) The parameter vector θ is m-dimensional with Θ a subset of R^m.
(iii) The support ({x: f_θ(x) > 0} or {x: p_θ(x) > 0}) does not depend on θ.

(N1)
(i) All partials have mean 0 under θ_o; thus, E_{θ_o}[ℓ̇_i(θ_o; X)] = 0 for all i.
(ii) All partials have finite variances under θ_o; thus E_{θ_o}{[ℓ̇_i(θ_o; X)]²} < ∞ for all i.
(iii) Ī(θ_o) = I(θ_o) > 0.

Also: Θ contains an open neighborhood H of the true parameter value θ_o in which:
(N2) For all x, all second order partial derivatives of ℓ(θ; x) are continuous in θ and the magnitude of each is bounded by a function M(x) for which E_{θ_o}[M(X)] < ∞.
(N3) For all x, all third order partial derivatives of ℓ(θ; x) are continuous in θ and the magnitude of each is bounded by a function M(x) for which E_{θ_o}[M(X)] < ∞.

Conclusions (519)
Let θ̂_n be a solution of the likelihood equation for which θ̂_n →_p θ_o, as required. Suppose that either all of the regularity conditions hold, or that the assumptions for the vector version hold. Then the conclusions of Theorem 7.5.1 still hold.

Newton-Raphson Iteration Method (522)
Newton-Raphson provides iterated solutions to the likelihood equations. The method works extremely well if H′(θ) > 0 (or H′(θ) < 0) over the entire region in question. The solution is found through iteration of a one-term Taylor series approximation.
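A sketch of the iteration applied to the score function of a Cauchy location model (the model, data, and starting value are illustrative assumptions): each step replaces θ by θ − ℓ̇(θ)/ℓ̈(θ).

```python
import numpy as np

rng = np.random.default_rng(8)
x = rng.standard_cauchy(200) + 3.0         # Cauchy location sample, true theta = 3

def score(theta):                          # l'(theta) for the Cauchy location model
    d = x - theta
    return np.sum(2 * d / (1 + d**2))

def score_deriv(theta):                    # l''(theta)
    d = x - theta
    return np.sum(2 * (d**2 - 1) / (1 + d**2) ** 2)

theta = np.median(x)                       # consistent starting value
for _ in range(20):
    theta = theta - score(theta) / score_deriv(theta)
print(theta)
```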

7.6 Procedures for Location and Scale Models

Definitions
location and scale model equation (545) – Y_k = θ + τ·W_k for 1 ≤ k ≤ n, where W_1, ..., W_n are independent with a fixed and known df F_o(·).
location invariance (545) – C(Y) = C(θ + τW) = θ + τC(W)
scale invariance (545) – V(Y) = V(θ + τW) = τV(W)
pivot (545) – T = T(Y) = √n (C(Y) − θ)/V(Y) = ··· = √n C(W)/V(W) = T(W) is a pivot for θ on which confidence intervals can be based. Likewise, U = U(Y) = V/τ = V(W) is a pivot for τ.

Natural Estimators of Location (545)
"The examples of location estimators listed below are seen to behave in a very pleasing fashion in the context of a location-scale model. These estimators of location (or center) include"

Ȳ_n = (1/n) Σ_{k=1}^n Y_k = (1/n) Σ_{k=1}^n (θ + τW_k) = θ + τ·W̄_n

Ÿ_n = median_{1≤k≤n} Y_k = median_{1≤k≤n} (θ + τW_k) = θ + τ·Ẅ_n

(1/2)(Y_{n:1} + Y_{n:n}) = θ + τ·(1/2)(W_{n:1} + W_{n:n})

Natural Estimators of Scale (545) “The examples of scale estimators listed below are seen to behave in a very pleasing fashion in the context of a location-scale model. These estimators of scale (or dispersion, or variation) include”

D̄_n = (1/n) Σ_{k=1}^n |Y_k − Ȳ_n| = (1/n) Σ_{k=1}^n |(θ + τW_k) − (θ + τW̄_n)| = τ·(1/n) Σ_{k=1}^n |W_k − W̄_n|

S_n = [ 1/(n−1) Σ_{k=1}^n |Y_k − Ȳ_n|² ]^{1/2} = [ 1/(n−1) Σ_{k=1}^n |(θ + τW_k) − (θ + τW̄_n)|² ]^{1/2} = τ·[ 1/(n−1) Σ_{k=1}^n |W_k − W̄_n|² ]^{1/2}

Y_{n:n} − Y_{n:1} = (θ + τW_{n:n}) − (θ + τW_{n:1}) = τ·(W_{n:n} − W_{n:1})

7.7 Bootstrap Methods

7.8 Bayesian Methods

Distributions
prior distribution (571) – π(θ) – distribution of the model parameter according to current knowledge.
posterior distribution (571) – π(θ | x) – distribution upon which all Bayesian statistical procedures will be based:
f_{θ|X=x}(θ) = f(x | θ)π(θ)/f_X(x) = f(x | θ)π(θ) / ∫_Θ f(x | θ′)π(θ′) dθ′
predictive/marginal distribution (571) – f_X(x) = ∫_Θ f(x | θ′)π(θ′) dθ′

Moments
posterior mean (571) – E(θ | x) = m(x) = mean of the posterior distribution π(θ | x); often can be expressed as a weighted average of the prior mean and the data mean.
posterior mean of τ(θ) (581) – τ̂_B = τ̂_π(x) = E_π(τ(θ) | x); minimizes E_π(L(τ̂_B, τ) | x) = posterior risk with respect to π(·) (generally the loss function used is the squared error loss, making this equivalent to minimizing the posterior variance).
posterior variance (571) – Var(θ | x) = v²(x) = variance of the posterior distribution π(θ | x).

Risk

posterior risk for π (572) – Var_π(τ(θ) | x), evaluated at τ̂(x) = E_π[τ(θ) | x], which minimizes E_π[(τ(θ) − τ̂(x))² | x] among all τ̂(x).
risk function (581) – R(τ̂, θ) = E_θ[L(τ̂, τ(θ))]
π-averaged risk of τ̂ (581) – R_π(τ̂) = E_π[R(τ̂, θ)] = ∫ R(τ̂, θ)π(θ) dθ

Extra Definitions

1 − α credible set (574) – B_x s.t. the probability that τ(θ) is in B_x is 1 − α under the posterior distribution. When B_x is an interval, call this a 1 − α credible interval for τ(θ).
1 − α credible interval for θ (574) – I_x = τ^{−1}(B_x)
highest probability density intervals (HPD intervals) (574) – shortest intervals I_x; these are where the density takes its highest values.
Bayes estimator of τ(θ) (571) – see posterior mean of τ(θ).
Bayesian interval for θ (571) – see 1 − α credible interval.

Bayesian Hypothesis Testing (572)
The level α Bayesian test of Θ_o vs. Θ_o^c is based on the following hypothesis testing decision rule:
Reject Θ_o if P_π(Θ_o | X = x) ≤ α
Under this rule, we can define a Bayesian p-value as the posterior probability of the null hypothesis: P_π(Θ_o | X = x).
Likewise, we can define Bayesian power via the posterior probability of Θ_o computed under an alternative prior π̃: P_π̃(Θ_o | X = x).

Bayesian Sufficiency (572)
If π(θ | x) depends on x only through t = T(x), then T is Bayesian sufficient for θ (and we can replace X with T in π(θ | x) above).
If T is sufficient for P, then T is Bayesian sufficient for θ.

Conjugate priors
Binomial-Beta (573) – Y ∼ Binomial(n, θ) and π ∼ Beta(α, β) have posterior distribution Beta(α*, β*), where α* = y + α, β* = n + β − y.
Gamma-Gamma (576) – Y_i ∼ Gamma(r, ν) and π ∼ Gamma(r_o, c) have posterior Gamma(r_o*, c*), where r_o* = nr + r_o, c* = t + c = Σ y_i + c.
Poisson-Gamma (586) – Y_i ∼ Poisson(λ) and π ∼ Gamma(r, ν) have posterior distribution Gamma(r*, ν*), where r* = t + r = Σ y_i + r, ν* = n + ν.
Normal-Normal (575) – Y_i ∼ Normal(θ, σ_o²) and π ∼ Normal(µ_o, τ_o²) have posterior Normal(µ_o*, σ_o*²), where µ_o* = (µ_oσ_o² + nτ_o²Ȳ_n)/(σ_o² + nτ_o²), σ_o*² = σ_o²τ_o²/(σ_o² + nτ_o²).
Normal-Gamma conjugate prior for N(θ, σ²) with both unknown (578)
Exponential family conjugate priors (590)
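A minimal sketch of the Binomial-Beta update above, using scipy's Beta distribution for the posterior summaries; the prior parameters and data are illustrative assumptions:

```python
from scipy.stats import beta

a, b = 2.0, 3.0               # Beta(2, 3) prior for theta
n, y = 20, 14                 # observed 14 successes in 20 Bernoulli(theta) trials

a_star, b_star = y + a, n + b - y
posterior = beta(a_star, b_star)

print(posterior.mean())                       # posterior mean of theta
print(posterior.ppf([0.025, 0.975]))          # a 95% (equal-tailed) credible interval
```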

Loss Functions L(τ̂, τ)
squared error loss function (581) – [τ̂ − τ(θ)]²
absolute error loss function (581) – |τ̂ − τ|
relative squared error loss function (581) – [(τ̂ − τ)/τ]²
Stein loss function (581) – (τ̂ − τ)/τ − log(τ̂/τ) = (τ̂ − τ)/τ − log(1 + (τ̂ − τ)/τ)

Chapter 8

Hypothesis Tests

8.1 The Neyman-Pearson Lemma and UMP Tests Review

null hypothesis (598) – H_o
alternative hypothesis (598) – H_a
decision rule (598) – φ – a rule defining when to reject H_o (e.g. Reject H_o if X ∈ C)
test/critical function (598) – φ: S → [0, 1], where φ(x) is the conditional probability that H_o is rejected given that X has the observed value x.
rejection region, critical region C (598) – typically consists of values X in the sample space for which some statistic T takes on values implausible under H_o.

Bayesian Approach:
risk (604) – R_π(φ) is the expected loss associated with the decision rule φ
Bayes rule (604) – for the prior π(·), any rule φ_π that minimizes the risk R_π(φ).
Bayes risk (604) – the risk as defined by such a rule φ_π.
conditional/posterior risk (604) – R_π(φ | X = x), the expected loss associated with the decision rule, given observations x

Definitions

simple hypothesis (598) – H_o: θ = θ_o (one value)
composite hypothesis (598) – more than one value in H_o (Θ_o)
level (598) – aimed-for size, can be less than or equal to size; may not be achieved in discrete distributions

size (598) – α = sup_{Θ_o} β(θ) = actual level of the test
randomized test (598) – determined by the test function, and used for discrete distributions to get level equal to α
power (599) – β(θ) = E_θ[P_θ(H_o is rejected | X)] = E_θ[φ(X)]
no data test (599) – φ(x) = α for all x, always rejecting at random
non-randomized test (599) – φ(X) = 1_C(X) for some rejection set C.
Neyman-Pearson likelihood ratio statistic (LR-statistic) (599) – Λ(x) = f_a(x)/f_o(x)

Theorem 8.1.1 – Neyman-Pearson Lemma (599)

Fix 0 ≤ α ≤ 1. Suppose P_o and P_a have distributions f_o and f_a.
(i) There exists a constant k and a function γ(x) for which the test
(a) φ(x) = 1 if Λ(x) = f_a(x)/f_o(x) > k,  = γ(x) if Λ(x) = k,  = 0 if Λ(x) < k
has (b) E_o[φ(X)] = α.
(ii) Any test of this type is a most powerful test of P_o vs. P_a of level α.


Corollary 1
If Λ(x) = f_a(x)/f_o(x) = g(t) for some increasing function g(·) of t = T(x), then the test in (i) above may be replaced (for appropriate constants c and γ that do exist) by
(a) φ(x) = 1 if t > c,  = γ if t = c,  = 0 if t < c,   where (b) P_o(T > c) + γP_o(T = c) = α.

Theorem 8.1.2 – Uniformly Most Powerful (UMP) tests (601)
If φ_o(·) is a fixed test that gives a most powerful test for the simple hypothesis θ_o versus the simple alternative θ_a, and this same φ_o test is most powerful against each fixed alternative θ_a ∈ Θ_a, then φ_o is a UMP test of H_o: θ_o vs H_a: θ ∈ Θ_a.

If this same test is a level α test for the composite hypothesis θ ∈ Θ_o (that is, sup_{θ ∈ Θ_o} β_{φ_o}(θ) ≤ α), then φ_o is a UMP test of H_o: θ ∈ Θ_o vs H_a: θ ∈ Θ_a.

Role of sufficient statistics (601)
The Neyman-Pearson test may be based directly on any sufficient statistic rather than the full likelihood, by application of the factorization theorem (that f_θ(x) = g_θ(t)h(x)):
Λ^{NP}(x) = f_{θ_a}(x)/f_{θ_o}(x) = g_{θ_a}(t)/g_{θ_o}(t) = Λ^T(t).

Theorem 8.1.3 – Karlin-Rubin (601)

Define monotone likelihood ratio (MLR) for a family of distributions {f_θ: θ ∈ Θ}: for θ < θ′, f_{θ′}(x)/f_θ(x) is an increasing (or decreasing) function of t = T(x).

For a family with MLR in T, the test φ defined in the Neyman-Pearson Corollary is UMP for testing H_o : θ ≤ θ_o vs H_a : θ > θ_o.

Exponential families with MLR (602)
Model the X-experiment as having one of the distributions in the exponential family c(θ)e^{η(θ)T(x)}h(x), with natural parameter η(θ) that is increasing in θ. Then the family of distributions of T has MLR in T. (Likewise if η(θ) is decreasing in θ.)
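A sketch (Python) of the randomized UMP one-sided test built from the Corollary, using the Binomial(n, θ) family, which has MLR in T = number of successes; the values of n, θ_o, and α are made up for illustration.

    from scipy.stats import binom

    # Binomial(n, theta) has MLR in T = X, so the one-sided UMP test rejects for large T
    n, theta0, alpha = 10, 0.5, 0.05

    # Smallest c with P_{theta0}(T > c) <= alpha, then randomize on {T = c}
    c = 0
    while binom.sf(c, n, theta0) > alpha:
        c += 1
    gamma = (alpha - binom.sf(c, n, theta0)) / binom.pmf(c, n, theta0)

    # phi(t): reject w.p. 1 if t > c, gamma if t == c, 0 if t < c
    size = binom.sf(c, n, theta0) + gamma * binom.pmf(c, n, theta0)
    print(c, gamma, size)   # size equals alpha exactly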

8.2 The Likelihood Ratio Test

Definitions

Likelihood Ratio Test (610) – Λ^{LR} = sup_Θ L(x, θ) / sup_{Θ_o} L(x, θ) = L(θ̂_n; x) / L(θ̂_n^o; x)
Kullback-Leibler information number (626) – H_{θ,θ_o} = K(P_a, P_o) = E_{P_a}[log(p_a(X)/p_o(X))] ≥ 0; compares P_a and P_o, and is a well defined number for every possible pair P_a and P_o of probability distributions.
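A small numeric check (Python) of the Kullback-Leibler number for two discrete distributions; the probability vectors are arbitrary illustrative choices.

    import numpy as np

    # Arbitrary illustrative pmfs p_a and p_o on the same finite support
    p_a = np.array([0.5, 0.3, 0.2])
    p_o = np.array([0.25, 0.25, 0.5])

    # K(P_a, P_o) = E_{P_a}[log(p_a(X)/p_o(X))]
    K = np.sum(p_a * np.log(p_a / p_o))
    print(K)   # nonnegative; zero only when the two pmfs coincide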

Theorem 8.2.1 – Asymptotic null distribution of the LR-statistic Λ_n (610)

Let r denote the dimension of the vector parameter θ = (θ_1, ..., θ_r)′, and express H_o : θ_{q+1} = ··· = θ_r = 0. Then

2 log Λ_n(X) →_d χ²_{r−q} when H_o is true.

Asymptotically, the level α LR-test rejects H_o in case 2 log Λ_n exceeds χ²_{r−q}^{(α)}.
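A sketch (Python) of the asymptotic LR-test for a Poisson mean, H_o : λ = λ_o against the unrestricted alternative, so that r − q = 1; the Poisson model, sample size, and seed are illustrative choices, not the document's example.

    import numpy as np
    from scipy.stats import poisson, chi2

    # Illustrative setup: X_1..X_n iid Poisson(lambda), testing H_o: lambda = lam0
    rng = np.random.default_rng(1)
    n, lam0 = 50, 2.0
    x = rng.poisson(lam=2.5, size=n)   # generated away from H_o here

    lam_hat = x.mean()                 # unrestricted MLE

    # 2 log Lambda_n = 2[ sup_Theta log L - sup_Theta_o log L ] ~ chi^2_1 under H_o
    stat = 2 * (poisson.logpmf(x, lam_hat).sum() - poisson.logpmf(x, lam0).sum())
    reject = stat > chi2.ppf(0.95, df=1)
    print(stat, reject)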

Improvements on Asymptotic LR-test for Specific Cases (610)
(i) If Λ_n is an increasing function of some simpler statistic T = T(X) whose distribution is exactly known under H_o, the LR-test rejects H_o if T is "too big".

(ii) If Λ_n is a U-shaped function of such a statistic T, the LR-test rejects H_o for both extremely large and extremely small values of T.

Theorem 8.2.2 – Asymptotic behavior of Λ_n under alternatives (610)

Let Θo = θo. Then

(1/n) 2 log Λ_n(X) →_p 2 × (some number H_{θ,θ_o}) > 0 when θ is true.
Thus, the power of the LR-test converges to 1 for every θ ∈ H_a.

Math Fact (expanding log(1 + z))
log(1 + z) = z − z²/2 + z²g(z), where g(z) → 0 as z → 0.

Examples

General result for Normal Theory (614)
Theorem 8.2.3 – Asymptotic null distribution of various statistics (618)
(a) Wald (b) Rao (c) LR-Test (d) Consistency of Observed Information (e) m-Dimensions
Maxwell(θ) practice (619)

Chapter 9

Regression, Anova, and Categorical Data

9.1 Simple Linear Regression

Definitions

simple linear regression model (634) – Y_i = α + β(x_i − x̄) + ε_i, where ε_i ∼ N(0, σ²). An alternative parameterization: Y_i = γ + βx_i + ε_i
α (634) – value of the fitted regression line at x = x̄
γ (634) – intercept of the fitted regression line at x = 0
β (634) – slope; mean increase in the output per unit increase in the input x.
least squares estimators (LSEs) (635) – estimators α̂, β̂ which minimize the sum of squares Σ_{i=1}^n [Observation_i − Expectation_i]² = Σ_{i=1}^n [Y_i − (a + b(x_i − x̄))]². They solve the normal equations.
normal equations (634) – 0 = n[Ȳ − a]; 0 = Σ_{i=1}^n (x_i − x̄)Y_i − b Σ_{i=1}^n (x_i − x̄)²
fitted values (635) – Ŷ_i = α̂ + β̂(x_i − x̄)
residuals (635) – ε̂_i = Y_i − Ŷ_i
studentized residuals (635) – Ẑ_i = ε̂_i/S_n; these form ancillary statistics
coefficient of determination (639) – R² = SS_xY² / (SS_xx SS_YY) = (SS_YY − SSE)/SS_YY

Estimators (635)

The least squares estimators as defined above give:
α̂ = Ȳ
β̂ = (1/SS_xx) Σ_{i=1}^n (x_i − x̄)Y_i = (1/SS_xx) Σ_{i=1}^n (x_i − x̄)(Y_i − Ȳ) = SS_xY/SS_xx
γ̂ = Ȳ − β̂x̄

An estimate of σ² is:
S_n² = SSE/(n − 2) = (1/(n − 2)) Σ_{i=1}^n ε̂_i² = (1/(n − 2)) Σ_{i=1}^n (Y_i − α̂ − β̂(x_i − x̄))²

α̂, β̂, and S_n² are independent, sufficient, complete, and independent of the studentized residuals (which are therefore ancillary).
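A direct numerical sketch (Python) of the centered-model least squares formulas above; the x and Y values and all variable names are made up for illustration.

    import numpy as np

    # Made-up data for the model Y_i = alpha + beta (x_i - xbar) + eps_i
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    Y = np.array([2.1, 2.9, 3.8, 5.2, 5.9, 7.1])
    n = len(x)

    xbar, Ybar = x.mean(), Y.mean()
    SSxx = np.sum((x - xbar) ** 2)
    SSxY = np.sum((x - xbar) * (Y - Ybar))
    SSYY = np.sum((Y - Ybar) ** 2)

    alpha_hat = Ybar
    beta_hat = SSxY / SSxx
    gamma_hat = Ybar - beta_hat * xbar

    resid = Y - (alpha_hat + beta_hat * (x - xbar))
    S2 = np.sum(resid ** 2) / (n - 2)        # estimate of sigma^2
    R2 = SSxY ** 2 / (SSxx * SSYY)           # coefficient of determination
    print(alpha_hat, beta_hat, gamma_hat, S2, R2)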

Behavior of S_n² (637)

S_n² = SSE/(n − 2),  E[S_n²] = σ²,  S_n² = (1/(n − 2)) Σ_{i=1}^n ε̂_i² ∼ (σ²/(n − 2)) χ²_{n−2}
This knowledge is useful in characterizing the behavior of α̂ and β̂.


Behavior of α̂ (635,637)

α̂ ∼ N(α, σ²/n),  √n(α̂ − α) = √n ε̄ ∼ N(0, σ²),  (α̂ − α)/(S_n/√n) ∼ T_{n−2}
α̂ ± t_{n−2}^{(α/2)} S_n/√n yields a 1 − α confidence interval for α.
Reject H_o : α = α_o vs H_a : α > α_o when T_n(α_o) > t_{n−2}^{(α)}.

Power(α, σ) = P_{α,σ}[ (Z_α + δ_α)/√(χ²_{n−2}/(n − 2)) > t_{n−2}^{(α)} ] = P( T_{n−2}(δ_α) > t_{n−2}^{(α)} )

Behavior of β̂ (637)

β̂ ∼ N(β, σ²/SS_xx),  √SS_xx (β̂ − β) = Σ_{i=1}^n d_{ni} ε_i ∼ N(0, σ²),  (β̂ − β)/(S_n/√SS_xx) ∼ T_{n−2}
β̂ ± t_{n−2}^{(α/2)} S_n/√SS_xx yields a 1 − α confidence interval for β.
Reject H_o : β = β_o vs H_a : β > β_o when T_n(β_o) > t_{n−2}^{(α)}.
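A self-contained Python sketch of the t-based confidence interval and one-sided test for β above, reusing the same made-up data as the least squares sketch; the cutoff values and names are illustrative.

    import numpy as np
    from scipy.stats import t

    # Same made-up data as in the least squares sketch above
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    Y = np.array([2.1, 2.9, 3.8, 5.2, 5.9, 7.1])
    n, xbar = len(x), x.mean()
    SSxx = np.sum((x - xbar) ** 2)
    beta_hat = np.sum((x - xbar) * (Y - Y.mean())) / SSxx
    resid = Y - (Y.mean() + beta_hat * (x - xbar))
    Sn = np.sqrt(np.sum(resid ** 2) / (n - 2))

    alpha_level, beta_o = 0.05, 0.0
    se = Sn / np.sqrt(SSxx)

    # 1 - alpha CI: beta_hat +/- t_{n-2}^{(alpha/2)} * S_n / sqrt(SSxx)
    ci = (beta_hat - t.ppf(1 - alpha_level / 2, n - 2) * se,
          beta_hat + t.ppf(1 - alpha_level / 2, n - 2) * se)

    # One-sided test of H_o: beta = beta_o vs H_a: beta > beta_o
    T_n = (beta_hat - beta_o) / se
    reject = T_n > t.ppf(1 - alpha_level, n - 2)
    print(ci, T_n, reject)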

Power(β, σ) = P_{β,σ}[ (Z_β + δ_β)/√(χ²_{n−2}/(n − 2)) > t_{n−2}^{(α)} ] = P( T_{n−2}(δ_β) > t_{n−2}^{(α)} )

Extensions

Behavior of l(x_o) = α + (x_o − x̄)β at another x_o (638)
Prediction Intervals (638)
Simultaneous Confidence Intervals for α, β, and the line l(·) (638)
Exact Confidence Intervals without the Normal RVs Assumption (639)
Maximum Likelihood Estimates (MLEs), Likelihood Ratio Tests (640)

9.2 One Way Analysis of Variance

9.3 Categorical Data

Definitions

observed cell counts (654) – (N_1, ..., N_κ)
expected cell counts (654) – E_1 = n p_{o1}, ..., E_κ = n p_{oκ}
normed differences (ndiffs) (654) – D_i = NDiff_i = (N_i − E_i)/√E_i = (N_i − n p_{oi})/√(n p_{oi}) = √n(p̂_i − p_{oi})/√p_{oi}

Theorem 9.3.1 – Chisquare tests of fit for Multinomial observations (655)

For (N_1, ..., N_κ) ∼ Multinomial(n; p_1, ..., p_κ), define the null hypothesis H_o : p_1 = p_{o1}, ..., p_κ = p_{oκ}.

χ̂²_n = Σ_{i=1}^κ [√n(p̂_i − p_{oi})/√p_{oi}]² →_d χ²_{κ−1} when H_o is true.
(1/n) χ̂²_n →_p d_p = Σ_{i=1}^κ (p_i − p_{oi})²/p_{oi}, so χ̂²_n →_p ∞ whenever H_o is false.

These results lead to a level-α test: Reject H_o if χ̂²_n > χ²_{κ−1}^{(α)}, with asymptotic power function Power(p) = P_p(χ̂²_n > χ²_{κ−1}^{(α)}) → 1 as n → ∞.
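A numerical sketch (Python) of the multinomial goodness-of-fit statistic above; the observed counts and null cell probabilities are made up for illustration.

    import numpy as np
    from scipy.stats import chi2

    # Illustrative observed counts and null cell probabilities p_o
    N = np.array([18, 55, 27])            # observed cell counts, n = 100
    p_o = np.array([0.25, 0.50, 0.25])    # H_o: p = p_o
    n, kappa = N.sum(), len(N)

    E = n * p_o                           # expected cell counts under H_o
    chi_hat = np.sum((N - E) ** 2 / E)    # sum of squared normed differences D_i
    reject = chi_hat > chi2.ppf(0.95, df=kappa - 1)
    print(chi_hat, reject)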

Theorem 9.3.2 – Chisquare test for independence (659)

For the IJ rvs N_{ij} ∼ Multinomial(n; P), define the null hypothesis H_o : P_o.

χ̂²_n = Σ_{i=1}^I Σ_{j=1}^J (N_{ij} − N_{i·}N_{·j}/n)² / (N_{i·}N_{·j}/n) →_d χ²_{(I−1)(J−1)} when H_o : P_o is true.
(1/n) χ̂²_n →_p d_P = Σ_{i=1}^I Σ_{j=1}^J (p_{ij} − p_{i·}p_{·j})² / (p_{i·}p_{·j}), so χ̂²_n →_p ∞ whenever H_o is false.

These results lead to a level-α test: Reject H_o if χ̂²_n > χ²_{(I−1)(J−1)}^{(α)}, with asymptotic power function Power(P) = P_P(χ̂²_n > χ²_{(I−1)(J−1)}^{(α)}) → 1 as n → ∞.

Theorem 9.3.3 – Chisquare test for independence via the conditionals (663)

For (N_{i1}, ..., N_{iJ}) ∼ Multinomial(n_{i·}; p_{1|i}, ..., p_{J|i}), define the null hypothesis H_o : P_o.

χ̂²_n = Σ_{i=1}^I Σ_{j=1}^J (N_{ij} − n_{i·}N_{·j}/n)² / (n_{i·}N_{·j}/n) →_d χ²_{(I−1)(J−1)} when H_o : P_o is true.
(1/n) χ̂²_n →_p d_P = Σ_{i=1}^I Σ_{j=1}^J n_{i·}(p_{j|i} − p̄_j)² / (n p̄_j), so χ̂²_n →_p ∞ whenever H_o is false.

These results lead to a level-α test: Reject H_o if χ̂²_n > χ²_{(I−1)(J−1)}^{(α)}, with asymptotic power function Power(P) = P_P(χ̂²_n > χ²_{(I−1)(J−1)}^{(α)}) → 1 as n → ∞.

Theorem 9.3.4 – Chisquare test for independence with marginals fixed (665)

χ̂²_n = Σ_{i=1}^I Σ_{j=1}^J (N_{ij} − n_{i·}n_{·j}/n)² / (n_{i·}n_{·j}/n) →_d χ²_{(I−1)(J−1)} when H_o is true.

A level-α test is: Reject H_o if χ̂²_n > χ²_{(I−1)(J−1)}^{(α)}.

Degrees of Freedom Rule (666)

When H_o is true, df = [number of cells] − [number of parameters estimated] − [number of cell count restrictions].

Chisquare test of fit: df = [κ] − [0] − [1] = κ − 1
Testing for independence: df = [IJ] − [(I − 1) + (J − 1)] − [1] = (I − 1)(J − 1)
Testing for independence via the conditional distributions: df = [IJ] − [(J − 1)] − [I] = (I − 1)(J − 1)
Testing for independence with all marginals fixed: df = [IJ] − [0] − [I + J − 1] = (I − 1)(J − 1)
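A sketch (Python) of the independence test for an I × J table using the estimated expected counts N_{i·}N_{·j}/n and the degrees of freedom rule above; the table is made up. For reference, scipy.stats.chi2_contingency computes the same Pearson statistic (with a continuity correction by default for 2 × 2 tables).

    import numpy as np
    from scipy.stats import chi2

    # Made-up 2 x 3 contingency table of counts N_ij
    N = np.array([[20, 30, 10],
                  [25, 15, 20]])
    n = N.sum()
    row = N.sum(axis=1, keepdims=True)    # N_i.
    col = N.sum(axis=0, keepdims=True)    # N_.j

    E = row @ col / n                     # estimated expected counts N_i. N_.j / n
    chi_hat = np.sum((N - E) ** 2 / E)

    I, J = N.shape
    df = (I - 1) * (J - 1)                # df rule: IJ - [(I-1)+(J-1)] - 1
    reject = chi_hat > chi2.ppf(0.95, df=df)
    print(chi_hat, df, reject)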

Proofs that p̂_i is the MLE and unique UMVUE of p_i (656)
Lagrange Multipliers (656)
Jensen's Inequality (671)
Guess and verify (671)
Maximizing a function of κ − 1 variables (671)

Appendix A

Distributions

A.1 Important Transformations

Cauchy(0,1) (142)
f(x) = 1/[π(1 + x²)] on −∞ < x < ∞

Cauchy(α, β) (373)
f_{α,β}(x) = (1/(πβ)) · 1/[1 + ((x − α)/β)²] on −∞ < x < ∞

Inverted Gamma(r, θ) (142)
Let M = 1/W, where W has the Gamma(r, θ) distribution. Then M has the Inverted Gamma(r, θ) distribution:
f_M(m) = [1/(Γ(r) θ^r)] (1/m^{r+1}) e^{−1/(mθ)}

Rayleigh(θ) distribution (417)

f_θ(x) = (x/θ²) e^{−x²/(2θ²)} for x ≥ 0

Maxwell(θ) distribution (417)

f_θ(x) = √(2/π) (x²/θ³) e^{−x²/(2θ²)} for x ≥ 0

Exponential(α, ν) (373)

f_{α,ν}(y) = ν e^{−ν(y−α)} on α ≤ y < ∞

Logistic(0,1) distribution (426)

f(x) = e^{−x}/(1 + e^{−x})² on −∞ < x < ∞

Logistic(α, β) (373)
f_{α,β}(x) = exp(−(x − α)/β) / (β[1 + exp(−(x − α)/β)]²) on −∞ < x < ∞

Double Exponential(µ, θ) (33)
f_X(x) = (1/(2θ)) exp(−|x − µ|/θ) on −∞ < x < ∞

Pareto2(1, ν) (531)
f_ν(x) = ν/(1 + x)^{1+ν} for x > 0

Truncation families (490,509,521)
f_θ(x) = c(θ) 1_{[0,θ]}(x) h(x) for x ≥ 0

Power family (509,531)
f_{β,γ}(x) = (γ/β)(x/β)^{γ−1} 1_{[0,β]}(x)
f_γ(x) = (γ − 1)/x^γ for x > 1

Chisquare (140)

f_k(y) = [1/(Γ(k/2) 2^{k/2})] y^{k/2 − 1} e^{−y/2} for all y ≥ 0.

Logarithmic Series (144,530)
p_θ(k) = (1 − θ)^k / (k log(1/θ)) for k = 1, 2, ...

Additional Distributions
Uniform(a, b) (33)
Discrete Uniform(1, N) (33)
Discrete Uniform(0, N) (33)
Triangular(c, a) (33)
Student Correlation Coefficient (145)
Empirical (146)
Truncated Poisson (301)
Generalized Binomial (319)
Studentized distribution (371)
LogNormal(µ, σ²) (373)
LogGamma(κ, 0, 1) distribution (374,427)
LogGamma(1,0,1) = ExtValueMin(0,1) = Gompertz distribution (374,427,429)
Multivariate Normal(µ, Σ) distribution (385)
Slash distribution (403)
Inverted Gamma(a, b) (577)
Dirichlet distribution (587)