Inference for Single Proportions and Means

T. Scofield

Confidence Intervals for Single Proportions and Means

A CI gives upper and lower bounds between which we hope to capture the (fixed) population parameter we are studying (perhaps a population proportion $p$, a population mean $\mu$, a difference of proportions or means $p_1 - p_2$ or $\mu_1 - \mu_2$, a population correlation $\rho$, etc.). Our sample gives us a point estimate ($\hat{p}$, $\bar{x}$, $\hat{p}_1 - \hat{p}_2$, $\bar{x}_1 - \bar{x}_2$, $r$, etc.). For all instances besides the percentile confidence interval construction described in Section 3.4, our confidence interval is (point estimate) ± (margin of error). Our text makes several passes at how to calculate the margin of error, each time refining what was done earlier.

Section 3.3

In this first pass at confidence interval construction, we only discussed the construction of 95% CIs. Without having a direct way to simulate the sampling distribution of our point estimate (i.e., sample statistic), we construct bootstrap distributions which, though centered at the sample statistic instead of the population parameter, have roughly the same shape and spread as the sampling distribution. We ask that the bootstrap distribution appear symmetric and bell-shaped. If it does, we proceed, noting that the standard error SE (i.e., the standard deviation of the sampling distribution) is approximately the same as the standard deviation of the bootstrap distribution. We compute the latter, and then take

(margin of error) = (2.0) × (standard deviation of the bootstrap distribution).
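For instance, this margin of error for a mean can be computed as in the minimal base-R sketch below; the data frame mydata and its numeric variable x are hypothetical placeholders, not objects from the text.

x <- mydata$x                        # placeholder data
boot <- replicate(5000, mean(sample(x, replace = TRUE)))  # bootstrap means
SE <- sd(boot)                       # std. deviation of bootstrap distribution
mean(x) + c(-2, 2) * SE              # 95% CI: (point estimate) +/- 2 * SE

One would first inspect a histogram of boot (e.g., via hist(boot)) to confirm the required symmetric, bell shape before trusting this interval.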

Section 5.2

Following Section 5.1, we are ready for a second pass at confidence interval construction. The Central Limit Theorem tells us that we can expect the sampling distributions for $\hat{p}$ and $\bar{x}$ to be symmetric and bell-shaped (normal, in fact) so long as our sample size is large enough. This normality should be reflected in bootstrap distributions, and we acquired new tools to help us decide if the sample size is large enough: rules of thumb ($np \ge 10$ and $n(1-p) \ge 10$ for proportions, or $n \ge 30$ for means) and the quantile-quantile plot (produced via the qqmath() command). We also learn that one need only go 1.96 standard deviations away from the mean (not a full two of them, as in Section 3.3) to enclose the middle 95% of observations in a normal distribution, a fact revealed by the command

qnorm(0.975)

## [1] 1.959964

Thus, we now construct 95% CIs using

(margin of error) = (1.96) × (standard deviation of the bootstrap distribution).

The number 1.96 is called the critical z-value for 95% CIs, and is often labeled $z^*$. Similar sorts of qnorm() commands may be used to find the critical value appropriate for other levels of confidence. For example, we use

qnorm(0.95)

## [1] 1.644854

to obtain $z^*$ for a 90% CI. Having determined the appropriate choice of $z^*$ for the particular level of confidence desired, our confidence interval is

(point estimate) ± $z^*$ × (standard deviation of the bootstrap distribution).

Note that we still need to produce a bootstrap distribution in this process.
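In general, for a confidence level $C$ (expressed as a decimal), the critical value is the $1 - (1 - C)/2$ quantile of the standard normal distribution. A quick illustration, with $C = 0.98$ chosen arbitrarily for the example:

C <- 0.98                    # desired confidence level
qnorm(1 - (1 - C) / 2)       # critical z-value, approx. 2.326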

Section 6.2 (modifications to CIs for p only)

This section focuses exclusively on a refinement to the procedure for constructing confidence intervals for a population proportion $p$. We learned, in Section 6.1, a formula for standard error:

$$SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}.$$

We are reminded, of course, that our CIs are premised on the sampling distribution for $\hat{p}$ being normal, which it is reasonable to assume to be the case if the rules of thumb

• $np \ge 10$,
• $n(1-p) \ge 10$,

are met. There is the problem that we do not actually know $p$, so instead we check that

• $n\hat{p} \ge 10$,
• $n(1-\hat{p}) \ge 10$.

If these check out, we take the margin of error to be

$$\text{(margin of error)} = z^* \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}},$$

choosing $z^*$ as we learned to do in Section 5.2. Note that we do not need to produce a bootstrap distribution in this process, as we obtain the standard error from a formula instead.
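As a quick illustration, here is a minimal sketch of a 95% CI using hypothetical counts (83 successes in n = 200 trials, numbers invented for the example):

n <- 200; phat <- 83 / n
c(n * phat, n * (1 - phat))          # rule-of-thumb check: both at least 10
zstar <- qnorm(0.975)                # critical value for 95% confidence
SE <- sqrt(phat * (1 - phat) / n)
phat + c(-1, 1) * zstar * SE         # (point estimate) +/- z* x SE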

Section 6.5 (modifications to CIs for µ only)

This section focuses exclusively on a refinement to the procedure for constructing confidence intervals for a population mean $\mu$. In Section 6.4, we learn this formula, used to approximate the standard error:

$$SE_{\bar{x}} = \frac{s}{\sqrt{n}},$$

where $s$ is the sample standard deviation. The actual standard error, $\sigma/\sqrt{n}$, would have allowed us to use a critical z-value as before. Since we must replace the stability of a fixed population standard deviation $\sigma$ with the variability of the sample standard deviation $s$, we turn to a new type of critical value, called $t^*$, obtained from an appropriate t-distribution. Once again, we want assurances that the sampling distribution for $\bar{x}$, our point estimate, is normal, which, according to the text, is reasonable to assume true if the sample size is $n \ge 30$. For a sample of size $n = 40$ that yields mean and standard deviation $\bar{x}$ and $s$ respectively, we obtain $t^*$ using the qt() command. For a 95% CI, we would take $t^*$ to be the value

qt(0.975, df=39)

## [1] 2.022691

so that the 95% CI is

$$\bar{x} \pm (2.0227) \frac{s}{\sqrt{40}}.$$

If, instead, the sample size is $n = 34$ and we want a 99% CI, we take $t^*$ to be

qt(0.995, df=33)

## [1] 2.733277

Whatever the specific details leading to $t^*$, our CI is built via the formula

$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}.$$

As in Section 6.2 (for proportions), we do not need to produce a bootstrap distribution, as we obtain the standard error from a formula instead.
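Putting the pieces together for the n = 34, 99% CI scenario, with invented summary statistics (xbar = 17.4 and s = 5.2 are hypothetical, not values from the text):

n <- 34; xbar <- 17.4; s <- 5.2        # hypothetical summary statistics
tstar <- qt(0.995, df = n - 1)         # critical value for a 99% CI
xbar + c(-1, 1) * tstar * s / sqrt(n)  # the confidence interval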

Hypothesis Testing for Single Proportions and Means

From the start we have outlined the steps of hypothesis testing as these:

1. Identify the research question, along with relevant variables.
2. Formulate hypotheses (null and alternative) appropriate to the research question.
3. Obtain a random sample, and a corresponding sample statistic (called the test statistic).
4. Determine the null distribution, i.e., the sampling distribution of the test statistic assuming the null hypothesis is true.
5. Determine the P-value, i.e., the likelihood that your test statistic, or something even more extreme, occurs under the null hypothesis.
6. Draw a conclusion (i.e., reject $H_0$ or not).

Of these, only Steps 4 and 5 undergo modifications as we make successive passes to refine the process. Here is a breakdown.

Section 4.4

In this first pass, we generate a randomization distribution, which simulates the null distribution. At this time we advised a check for normality, but had not yet learned the rules of thumb for proportions ($np \ge 10$ and $n(1-p) \ge 10$) or for means ($n \ge 30$) which are often considered sufficient to ensure normality; in hindsight, a quantile-quantile plot using qqmath() may be advisable as well, particularly when we are doing hypothesis testing on a mean. Along with normality, we should make sure the randomization distribution is centered at the value proposed in the null hypothesis, which, in the case of hypothesis testing on a population mean, generally involves shifting a bootstrap distribution to the correct location. To get the P-value, we locate the test statistic in the randomization distribution, then calculate the proportion of randomization statistics lying in one or both tails, as the alternative hypothesis dictates. This should always be a number between 0 and 1.
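One way to realize this shift-then-resample idea for a mean is sketched below in base R; the data frame mydata, its variable x, and the null value mu0 = 50 are all hypothetical placeholders, and the text's own construction of the randomization distribution may differ in detail:

mu0 <- 50                            # hypothetical null value
x <- mydata$x                        # placeholder data
xbar <- mean(x)
shifted <- x - xbar + mu0            # shift the sample so H0 is true
rand <- replicate(5000, mean(sample(shifted, replace = TRUE)))
mean(rand <= xbar)                   # left-tail P-value (adjust per Ha)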

Section 5.2

Here, our modification is to Step 5 only. Having generated a randomization distribution (our stand-in for the null distribution of Step 4) and observed it to be normal, we calculate its standard deviation, which approximates the standard error ($SE_{\hat{p}}$ or $SE_{\bar{x}}$). Then, in the case our alternative hypothesis is one-sided and left-tailed (perhaps $H_a: \mu < \mu_0$, if we are testing a mean), our approximate P-value comes from a command of the form

pnorm(xbar, mean = mu0, sd = SE)

(with the numeric values of $\bar{x}$, $\mu_0$, and the estimated standard error filled in for xbar, mu0, and SE).

If our alternative hypothesis is one-sided and right-tailed ($H_a: p > p_0$, for a proportion), then

1 - pnorm(phat, mean = p0, sd = SE)

generates the approximate P-value. These commands can be used, with appropriate alterations, to yield the approximate P-value when the alternative hypothesis is two-sided ($H_a: \mu \ne \mu_0$, for instance). Again, the resulting P-value should be a number between 0 and 1.
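One such alteration for the two-sided case is to double the smaller tail. A sketch with invented numbers (xbar = 51.3, mu0 = 50, SE = 0.8 are hypothetical):

xbar <- 51.3; mu0 <- 50; SE <- 0.8         # hypothetical values
left <- pnorm(xbar, mean = mu0, sd = SE)   # area in the left tail
2 * min(left, 1 - left)                    # two-sided P-value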

Section 6.3

Let us assume the null hypothesis is $H_0: p = p_0$. Using the hypothesized value $p_0$ for the population proportion, if the rules of thumb $np_0 \ge 10$ and $n(1 - p_0) \ge 10$ are met, then we take the null distribution in Step 4 to be $\text{Norm}\left(p_0, \sqrt{p_0(1-p_0)/n}\right)$. As in Section 6.2, this means we do not need to generate a randomization distribution to find the standard error; it comes from the formula. We can find (directly) the appropriate P-value using pnorm() commands, as in Section 5.2. For instance, if the alternative hypothesis were one-sided and left-tailed (i.e., $H_a: p < p_0$), then the P-value would come from the command

pnorm(phat, p0, sqrt(p0 * (1 - p0) / n))

where, of course, it is necessary to fill in the correct values. Of course, since $\hat{p}$ is from a normal distribution, one can convert it to a z-score,

$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}},$$

and then use the simpler command pnorm(z) with this computed z.
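Both routes are shown in this sketch, using a hypothetical sample of 217 successes in n = 500 trials and H0: p = 0.5 (numbers invented for illustration):

n <- 500; p0 <- 0.5; phat <- 217 / n
SE <- sqrt(p0 * (1 - p0) / n)
pnorm(phat, mean = p0, sd = SE)      # left-tailed P-value, directly
z <- (phat - p0) / SE
pnorm(z)                             # same P-value, via the z-score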

Section 6.6

As we now have a formula to estimate $SE_{\bar{x}}$, we are freed from needing to produce a randomization distribution, at least if we are reasonably sure that $\bar{x}$ has a normal sampling distribution ($n \ge 30$?). We assume, as before, that the null hypothesis is $H_0: \mu = \mu_0$. We must standardize the test statistic, and use a t-distribution calculator to find the P-value. The standardized value we call $t$ (not $z$):

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}.$$

If the alternative hypothesis is one-sided and left-tailed, then the P-value is

pt(t, df = n - 1)

where t is the standardized value indicated above, and n is the sample size.
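A complete sketch with invented summary statistics (n = 40, xbar = 48.6, s = 4.1, mu0 = 50; none of these come from the text):

n <- 40; xbar <- 48.6; s <- 4.1; mu0 <- 50   # hypothetical values
t <- (xbar - mu0) / (s / sqrt(n))    # standardized test statistic
pt(t, df = n - 1)                    # left-tailed P-value

For a right-tailed test, one would use 1 - pt(t, df = n - 1) instead; for a two-sided one, 2 * pt(-abs(t), df = n - 1).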
