2: CONFIDENCE INTERVALS FOR THE ;

UNKNOWN

Now, we suppose that X1 ,...,Xn are iid with unknown mean µ and unknown variance σ2.

Clearly, we will now have to estimate σ2 from the available . The most commonly-used of σ2 is the variance,

n 2 = hhhhh1 − d 2 Sx − Σ (Xi X ) . n 1 i =1

The reason for using the n −1 in the denominator is

22σ that this makes Sx an unbiased estimator of .In

22=σ other words, E [Sx ] . We will prove this later. A proof in the normal case follows from Section 4.8 of Hogg & Craig. -2- Note: Hogg and Craig use a denominator of n in their S 2. We, most textbooks, and most practition- ers, however, use n −1. To minimize confusion, we will try for now to avoid using the symbol S 2. g Question: What would happen if we used Sx in place of σ in the formula for the CI? g Answer: It depends on whether the sample size is

"large" or not. -3- Large-Sample Confidence Interval;

Population Not Necessarily Normal

hhhSx Theorem: The interval Xd ± z α is an asymp- /2 √ndd totic level 1 −α CI for µ.

In other words, when the sample size is large, we

σ can use Sx in place of the unknown , and the CI will still work.

2 Proof: It can be shown that Sx converges in proba- bility to σ2. In other words,

e 22−σ e >ε → ε> lim Pr ( Sx ) 0 for any 0. n →∞

As a result, the distribution of -4- hhhhhhXd −µ √dd Sx / n

converges to the standard . Simi-

larly to the proof from the previous handout, we get

hhhhhhXd−µ Pr (CI Contains µ) = Pr (−z < < z ) → 1 −α . α/2 √dd α/2 Sx / n

Small-Sample Confidence Interval;

Normal Population

g If the sample size is small (the usual guideline is

n ≤ 30), and σ is unknown, then to assure the vali-

dity of the CI we will present here, we must assume

that the population distribution is normal. This

assumption is hard to check in small samples! -5-

hhhSx g The CI is Xd ± t α .(t α is defined below.) /2 √dnd /2

The Basics of t Distributions

hhhhhhXd −µ When n is small, the quantity t = does not √dd Sx / n have a normal distribution, even when the popula- tion is normal.

Instead, t has a "Student’s t distribution with n −1 degrees of freedom".

There is a different t distribution for each value of the degrees of freedom, ν.

− The quantity t α/2 denotes the t value such that the -6- area to its right under the Student’s t distribution

(with ν=n −1) is α/2. Note that we use ν=n −1, even though the sample size is n . Values of t α are listed in Table 2, Page 599 of Jobson. g Note that the last row of Table 2 is denoted by

"∞". For practical purposes, any value of ν beyond

29 is usually considered "infinite". (Most tables stop at ν=29. Jobson’s table is somewhat better, since he also has entries for ν= 30, 40, 50, 60, and 120.)

In this case, the corresponding t distribution is essentially identical to the standard normal distribu- tion. Here, it doesn’t matter whether we use the CI

hShhx hShhx Xd ± t α or Xd ± z α since they will be /2 √ndd /2 √ndd -7- almost the same. Since t is asymptotically standard normal, the t α values given in the "∞" row of Table

2 are identical to the z α values defined earlier. g On the other hand, if ν≤29 the t distribution has

"longer tails" (i.e., contains more outliers) than the normal distribution, and it is important to use the t −values of Table 2, assuming that σ is unknown.

Here, the CI based on t α/2 will be wider than the

(incorrect) one based on z α/2.

(Why does this happen, and why does it make sense?)

Eg1:A random sample of 8 "Quarter Pounders" yields a mean weight of xd = .2 pounds, with a -8- = sample of sx .07 pounds. Con- struct a 95% CI for the unknown population mean weight for all "Quarter Pounders".

Background: Definitions of χ2 and t distributions

As in Section 1.3.3 of Jobson, we define the χ2 dis- tribution with ν degrees of freedom to be the distri-

ν χ=2 2 bution of the ν Σ Zi , where i =1

... Z 1, , Z ν are iid standard normal. The distribu- tion is positive valued and is skewed to the right.

2 2 The mean and variance are E [χν ] =ν, var [χν ] = 2ν.

µ σ2 If X1 ,...,Xn are iid N ( , ), then it can be

− 22σ χ 2 shown that (n 1)Sx / has a n −1 distribution. -9- 22∼σ χ2 − Therefore, Sx n −1/(n 1), and we find that

22=σ 22σ E [Sx ] , so that Sx is unbiased for . g The random variable

hhhhhhZ 2 √χddν / dν is said to have a t distribution with ν degrees of

2 freedom if Z is standard normal and χν is indepen-

2 dent of Z and has a χν distribution.

Establishing the Small-Sample CI

µ σ2 Theorem: If X1 ,...,Xn are iid N ( , ), then

hShhx the interval Xd ± t α is a level 1 −α CI for µ. /2 √ndd -10- d 2 Proof: It can be shown that X and Sx are indepen-

dent. (We will prove this later).

Define Z = √ndd (Xd −µ)/σ, which is standard normal.

χ=2 − 22σ χ 2 Define n −1 (n 1)Sx / , which has a n −1 distri-

bution. Define

hhhhhhhhhhhhZ hhhhhhhhhh√ndd (Xd −µ) hhhhhhXd −µ t = = = . χddddddd2 − S S /√ndd √ n −1/(n 1) x x

By its definition, t has a t distribution with n −1

degrees of freedom. Therefore, similarly to the ear-

lier proofs,

hhhhhhXd−µ Pr (CI Contains µ) = Pr (−t < < t ) = 1 −α . α/2 √dd α/2 Sx / n