ST 380 Probability and Statistics for the Physical Sciences

Interval Estimates

A point estimate by itself provides no information about the precision and reliability of estimation.

Consider, e.g., $\bar{X}$ as an estimator for µ. From the observed value $\bar{x}$ alone, we have no idea how close it is to µ.

An alternative to reporting a single number is to report an entire interval of plausible values, that is an interval estimate.


Confidence Interval

For a 95% confidence interval, at the 95% confidence level, any value of parameter θ in the interval is plausible.

A confidence level of 95% implies that 95% of all samples would give an interval that includes θ, and only 5% of all samples would yield an erroneous interval.

The most frequently used confidence levels are 90%, 95%, and 99%.

The higher the confidence level, the more strongly we believe that the value of the parameter lies within the interval.

Basic Properties of Confidence Intervals

The basic properties of confidence intervals (CIs) are most easily introduced by first focusing on a simple, albeit somewhat unrealistic, problem.

Suppose that the parameter of interest is the population mean µ, the population is normal, and the value of the standard deviation σ is known.


The Assumptions

Normality of the population distribution is often a reasonable assumption, or at least an approximation.

However, if the value of µ is unknown, it is typically implausible that the value of σ is known.

Methods based on less restrictive assumptions will be shown later.


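Recall that $\bar{X} \sim N(\mu, \sigma^2/n)$, and that

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$

So

$$
\begin{aligned}
0.95 &= P(-1.96 < Z < 1.96) \\
&= P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) \\
&= P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right).
\end{aligned}
$$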

This is a random interval for the fixed value µ.
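The starting point of the derivation, that a standard normal variable falls between −1.96 and 1.96 with probability 0.95, can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the course material.

```python
from statistics import NormalDist

# Standard normal distribution N(0, 1)
std_normal = NormalDist(mu=0.0, sigma=1.0)

# P(-1.96 < Z < 1.96) = Phi(1.96) - Phi(-1.96)
central_prob = std_normal.cdf(1.96) - std_normal.cdf(-1.96)

print(f"P(-1.96 < Z < 1.96) = {central_prob:.4f}")  # prints 0.9500
```

The value 1.96 is a rounded quantile, so the probability agrees with 0.95 only to about four decimal places.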


The interpretation is: “the probability is .95 that the random interval includes the true value of µ.”

If $\bar{x} = 80$, n = 31 and σ = 2, the 95% confidence interval would be

$$80.0 \pm 1.96 \times \frac{2.0}{\sqrt{31}} = (79.3, 80.7).$$
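The arithmetic in this example can be reproduced directly. This is a small sketch with illustrative variable names, plugging the stated values into the half-width 1.96 × σ/√n.

```python
import math

x_bar = 80.0   # observed sample mean
n = 31         # sample size
sigma = 2.0    # known population standard deviation
z = 1.96       # multiplier for 95% confidence

# Half-width of the interval: z * sigma / sqrt(n)
half_width = z * sigma / math.sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width

print(f"half-width = {half_width:.3f}")
print(f"95% CI = ({lower:.1f}, {upper:.1f})")  # (79.3, 80.7)
```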

It is tempting to conclude that µ is within this (now fixed) interval with probability .95 ...


But µ is a constant, if unknown, and once we evaluate the interval, the end-points are also fixed.

It is therefore incorrect to write the statement

P [µ ∈ (79.3, 80.7)] = .95.

The correct interpretation is that if we repeatedly formed confidence intervals using this procedure, in the long run, 95% of them would contain the parameter µ.

We might write that we are “95% confident” that µ lies within the interval.
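The long-run interpretation can be illustrated by simulation. The sketch below assumes a normal population with µ = 80 and σ = 2 (values chosen here only for illustration, where in practice µ would be unknown), draws many samples of size 31, and counts how often the 95% interval contains µ. The function name `coverage_fraction` and the seed are hypothetical choices.

```python
import math
import random

def coverage_fraction(mu, sigma, n, num_samples, z=1.96, seed=0):
    """Fraction of simulated known-sigma intervals that contain mu."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(num_samples):
        # Draw one sample of size n and compute its mean
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Does this sample's interval include the true mean?
        if x_bar - half_width < mu < x_bar + half_width:
            hits += 1
    return hits / num_samples

fraction = coverage_fraction(mu=80.0, sigma=2.0, n=31, num_samples=10_000)
print(f"fraction of intervals containing mu: {fraction:.3f}")
```

Each individual interval either contains µ or it does not; it is the fraction over many repetitions that should be close to 0.95.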


For a 99% confidence interval, we would need to replace 1.96 by 2.58.

In general, a confidence level of 1 − α is achieved by using $z_{\alpha/2}$ in place of 1.96. Recall that $P(Z > z_{\alpha/2}) = \alpha/2$.
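Because $P(Z > z_{\alpha/2}) = \alpha/2$, the multiplier is the 1 − α/2 quantile of the standard normal distribution. A minimal sketch of computing it with the standard-library `NormalDist.inv_cdf` follows; the helper name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist

def z_critical(alpha):
    """Upper alpha/2 critical value of N(0, 1), i.e. its 1 - alpha/2 quantile."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    alpha = 1 - level
    print(f"{level:.0%} confidence: z = {z_critical(alpha):.3f}")
```

For the three common confidence levels this gives approximately 1.645, 1.960 and 2.576, the last of which is the 2.58 quoted above after rounding.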


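Definition

A 100(1 − α)% confidence interval for the mean µ of a normal population, when the value of σ is known, is given by

$$\left(\bar{x} - z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}\right).$$

We often write it more compactly as

$$\bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}.$$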


Confidence Level, Precision, and Sample Size

Why settle for a 95% confidence interval when a 99% interval is available?

One issue is that the 99% interval is wider (it uses 2.58 instead of 1.96), and thus has less precision.

If we want both high confidence and precision, we could fix both and then solve for the necessary sample size.
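One way to carry this out, sketched under the known-σ setting: the interval has total width $w = 2 z_{\alpha/2}\sigma/\sqrt{n}$, so solving for n gives $n = (2 z_{\alpha/2}\sigma/w)^2$, rounded up to an integer. The function name `required_sample_size` and the target width of 1.0 are illustrative assumptions, not values from the course.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, width, level=0.95):
    """Smallest n for which the known-sigma interval has total width <= width."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Solve width = 2 * z * sigma / sqrt(n) for n, then round up
    return math.ceil((2 * z * sigma / width) ** 2)

n_95 = required_sample_size(sigma=2.0, width=1.0, level=0.95)
n_99 = required_sample_size(sigma=2.0, width=1.0, level=0.99)
print(f"95%: n = {n_95}")  # 62
print(f"99%: n = {n_99}")  # 107
```

Holding the width fixed, moving from 95% to 99% confidence requires a noticeably larger sample, which is the trade-off described above.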

Large Sample Confidence Intervals

Suppose as before that the parameter of interest is µ, but the population is not known to be normal; we still assume for now that the value of the standard deviation σ is known.

The Central Limit Theorem assures us that, for large n, $\bar{X}$ is approximately normally distributed as $N(\mu, \sigma^2/n)$, and hence

$$\bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}$$

is a confidence interval for µ with a confidence level of approximately 100(1 − α)%.


Suppose, more realistically, that σ is also unknown.

Replacing σ by s, the sample standard deviation, in the calculation of the confidence interval is an additional approximation, but it is still true that

$$\bar{x} \pm z_{\alpha/2} \times \frac{s}{\sqrt{n}}$$

is a confidence interval for µ with a confidence level of approximately 100(1 − α)%.
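As an illustration of this large-sample interval, the sketch below computes $\bar{x} \pm z_{\alpha/2} \times s/\sqrt{n}$ from data. The sample is simulated from a skewed (exponential) population purely as a stand-in for non-normal data; the function name `large_sample_ci`, the seed, and the sample size of 200 are assumptions made for the example.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def large_sample_ci(data, level=0.95):
    """Approximate CI for the mean: x_bar +/- z * s / sqrt(n), for large n."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical data: 200 draws from an exponential population with mean 5
rng = random.Random(1)
sample = [rng.expovariate(1 / 5.0) for _ in range(200)]

lower, upper = large_sample_ci(sample)
print(f"approximate 95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```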


General Large Sample Case

In other situations, we may want to use an estimator $\hat{\theta}$ of some parameter θ, and we may know that $\hat{\theta}$ is approximately normally distributed with mean θ, and we may have an estimated standard error $\hat{\sigma}_{\hat{\theta}}$ of $\hat{\theta}$.

Then

$$\hat{\theta} \pm z_{\alpha/2} \times \hat{\sigma}_{\hat{\theta}}$$

is a confidence interval for θ with a confidence level of approximately 100(1 − α)%.
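The general recipe needs only an estimate and its estimated standard error. The sketch below applies it to one common case chosen here as an illustration, not taken from the slides: a sample proportion $\hat{p}$, whose usual estimated standard error is $\sqrt{\hat{p}(1 - \hat{p})/n}$. The function name `large_sample_interval` and the counts are hypothetical.

```python
import math
from statistics import NormalDist

def large_sample_interval(estimate, std_error, level=0.95):
    """estimate +/- z_{alpha/2} * std_error, for an approximately normal estimator."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return estimate - z * std_error, estimate + z * std_error

# Hypothetical data: 130 successes in 400 trials
successes, n = 130, 400
p_hat = successes / n                          # estimator of the proportion
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)    # its estimated standard error

lower, upper = large_sample_interval(p_hat, se_hat)
print(f"approximate 95% CI for p: ({lower:.3f}, {upper:.3f})")  # (0.279, 0.371)
```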


Small Samples from a Normal Distribution

Recall the confidence interval for the mean µ of a normal distribution, when σ is known:

$$\bar{x} \pm z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}.$$

If σ is not known, we replace it by its estimate,

s = sample standard deviation.

To maintain the coverage probability of 100(1 − α)%, we must adjust the multiplier $z_{\alpha/2}$.


The necessary probability result is that

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a known distribution, the Student's t-distribution with ν = n − 1 degrees of freedom.

It follows that

$$\bar{x} \pm t_{\alpha/2,\nu} \times \frac{s}{\sqrt{n}}$$

is a 100(1 − α)% confidence interval for µ, where $t_{\alpha/2,\nu}$ is the 1 − α/2 quantile of that distribution, that is, $P(T > t_{\alpha/2,\nu}) = \alpha/2$.
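To see how the adjusted multiplier behaves, the sketch below computes $t_{\alpha/2,\nu}$ and the resulting interval using only the Python standard library, which has no built-in t quantile. The approach used here (Simpson's rule for the CDF plus bisection) is a simple numerical stand-in; statistical packages provide more accurate routines. All function names and the five data values are hypothetical.

```python
import math
from statistics import mean, stdev

def t_pdf(t, nu):
    """Density of Student's t with nu degrees of freedom."""
    log_coef = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
    coef = math.exp(log_coef) / math.sqrt(nu * math.pi)
    return coef * (1 + t * t / nu) ** (-(nu + 1) / 2)

def t_cdf(x, nu, panels=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x] (panels is even)."""
    h = x / panels
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3

def t_critical(alpha, nu):
    """The 1 - alpha/2 quantile of the t-distribution, found by bisection."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < target:   # expand until the quantile is bracketed
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(data, level=0.95):
    """x_bar +/- t_{alpha/2, n-1} * s / sqrt(n), assuming a normal population."""
    n = len(data)
    t_mult = t_critical(1 - level, n - 1)
    half_width = t_mult * stdev(data) / math.sqrt(n)
    x_bar = mean(data)
    return x_bar - half_width, x_bar + half_width

# Hypothetical small sample (n = 5, so nu = 4)
sample = [79.1, 80.4, 81.2, 78.8, 80.9]
lower, upper = t_interval(sample)
print(f"t multiplier (nu = 4): {t_critical(0.05, 4):.3f}")  # about 2.776
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

With ν = 4 the multiplier is about 2.776 rather than 1.96, so the interval is wider than the one obtained by treating s as if it were σ; as ν grows, the multiplier approaches the normal value.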
