2: Confidence Intervals for the Mean; Unknown Variance

2: CONFIDENCE INTERVALS FOR THE MEAN; UNKNOWN VARIANCE Now, we suppose that X1 ,...,Xn are iid with unknown mean µ and unknown variance σ2. Clearly, we will now have to estimate σ2 from the available data. The most commonly-used estimator of σ2 is the sample variance, n 2 = hhhhh1 − d 2 Sx − Σ (Xi X ) . n 1 i =1 The reason for using the n −1 in the denominator is 22σ that this makes Sx an unbiased estimator of .In 22=σ other words, E [Sx ] . We will prove this later. A proof in the normal case follows from Section 4.8 of Hogg & Craig. -2- Note: Hogg and Craig use a denominator of n in their S 2. We, most textbooks, and most practition- ers, however, use n −1. To minimize confusion, we will try for now to avoid using the symbol S 2. g Question: What would happen if we used Sx in place of σ in the formula for the CI? g Answer: It depends on whether the sample size is "large" or not. -3- Large-Sample Confidence Interval; Population Not Necessarily Normal hhhSx Theorem: The interval Xd ± z α is an asymp- /2 √ndd totic level 1 −α CI for µ. In other words, when the sample size is large, we σ can use Sx in place of the unknown , and the CI will still work. 2 Proof: It can be shown that Sx converges in probability to σ2. In other words, e 22−σ e >ε → ε> lim Pr ( Sx ) 0 for any 0. n →∞ As a result, the distribution of -4- hhhhhhXd −µ √dd Sx / n converges to the standard normal distribution. Simi- larly to the proof from the previous handout, we get hhhhhhXd−µ Pr (CI Contains µ) = Pr (−z < < z ) → 1 −α . α/2 √dd α/2 Sx / n Small-Sample Confidence Interval; Normal Population g If the sample size is small (the usual guideline is n ≤ 30), and σ is unknown, then to assure the vali- dity of the CI we will present here, we must assume that the population distribution is normal. This assumption is hard to check in small samples! -5- hhhSx g The CI is Xd ± t α .(t α is defined below.) /2 √ndd /2 The Basics of t Distributions hhhhhhXd −µ When n is small, the quantity t = does not √dd Sx / n have a normal distribution, even when the population is normal. Instead, t has a "Student’s t distribution with n −1 degrees of freedom". There is a different t distribution for each value of the degrees of freedom, ν. − The quantity t α/2 denotes the t value such that the -6- area to its right under the Student’s t distribution (with ν=n −1) is α/2. Note that we use ν=n −1, even though the sample size is n . Values of t α are listed in Table 2, Page 599 of Jobson. g Note that the last row of Table 2 is denoted by "∞". For practical purposes, any value of ν beyond 29 is usually considered "infinite". (Most tables stop at ν=29. Jobson’s table is somewhat better, since he also has entries for ν= 30, 40, 50, 60, and 120.) In this case, the corresponding t distribution is essentially identical to the standard normal distribution. Here, it doesn’t matter whether we use the CI hShhx hShhx Xd ± t α or Xd ± z α since they will be /2 √ndd /2 √ndd -7- almost the same. Since t is asymptotically standard normal, the t α values given in the "∞" row of Table 2 are identical to the z α values defined earlier. g On the other hand, if ν≤29 the t distribution has "longer tails" (i.e., contains more outliers) than the normal distribution, and it is important to use the t −values of Table 2, assuming that σ is unknown. Here, the CI based on t α/2 will be wider than the (incorrect) one based on z α/2. (Why does this happen, and why does it make sense?) Eg1:A random sample of 8 "Quarter Pounders" yields a mean weight of xd = .2 pounds, with a -8- = sample standard deviation of sx .07 pounds. Con- struct a 95% CI for the unknown population mean weight for all "Quarter Pounders". Background: Definitions of χ2 and t distributions As in Section 1.3.3 of Jobson, we define the χ2 distribution with ν degrees of freedom to be the distri- ν χ=2 2 bution of the random variable ν Σ Zi , where i =1 ... Z 1, , Z ν are iid standard normal. The distribution is positive valued and is skewed to the right. 2 2 The mean and variance are E [χν ] =ν, var [χν ] = 2ν. µ σ2 If X1 ,...,Xn are iid N ( , ), then it can be − 22σ χ 2 shown that (n 1)Sx / has a n −1 distribution. -9- 22∼σ χ2 − Therefore, Sx n −1/(n 1), and we find that 22=σ 22σ E [Sx ] , so that Sx is unbiased for . g The random variable hhhhhhZ 2 √χddν / dν is said to have a t distribution with ν degrees of 2 freedom if Z is standard normal and χν is indepen- 2 dent of Z and has a χν distribution. Establishing the Small-Sample CI µ σ2 Theorem: If X1 ,...,Xn are iid N ( , ), then hShhx the interval Xd ± t α is a level 1 −α CI for µ. /2 √ndd -10- d 2 Proof: It can be shown that X and Sx are indepen- dent. (We will prove this later). Define Z = √ndd (Xd −µ)/σ, which is standard normal. χ=2 − 22σ χ 2 Define n −1 (n 1)Sx / , which has a n −1 distribution. Define hhhhhhhhhhhhZ hhhhhhhhhh√ndd (Xd −µ) hhhhhhXd −µ t = = = . χddddddd2 − S S /√ndd √ n −1/(n 1) x x By its definition, t has a t distribution with n −1 degrees of freedom. Therefore, similarly to the earlier proofs, hhhhhhXd−µ Pr (CI Contains µ) = Pr (−t < < t ) = 1 −α . α/2 √dd α/2 Sx / n.

2: Confidence Intervals for the Mean; Unknown Variance

Confidence Intervals for the Population Mean Alternatives to the Student-T Confidence Interval

STAT 22000 Lecture Slides Overview of Confidence Intervals

Statistics for Dummies Cheat Sheet from Statistics for Dummies, 2Nd Edition by Deborah Rumsey

Understanding Statistical Hypothesis Testing: the Logic of Statistical Inference

Confidence Intervals

P Values and Confidence Intervals Friends Or Foe

Student's T Distribution

Multiple Random Variables

History of Biostatistics

Statistical Tests, P-Values, Confidence Intervals, and Power

Skewness-Kurtosis Adjusted Confidence Estimators and Significance Tests Wolf-Dieter Richter

Confidence Intervals for One Standard Deviation with Tolerance Probability