9.6 The Power of a Test

M09_LEVI5199_06_OM_C09.QXD 2/4/10 10:57 AM Page 1

Section 9.1 defined Type I and Type II errors and their associated risks. Recall that $\alpha$ represents the probability that you reject the null hypothesis when it is true and should not be rejected, and $\beta$ represents the probability that you do not reject the null hypothesis when it is false and should be rejected. The power of the test, $1 - \beta$, is the probability that you correctly reject a false null hypothesis. This probability depends on how different the actual population parameter is from the value being hypothesized (under $H_0$), the value of $\alpha$ used, and the sample size. If there is a large difference between the population parameter and the hypothesized value, the power of the test will be much greater than if the difference is small. Selecting a larger value of $\alpha$ makes it easier to reject $H_0$ and therefore increases the power of a test. Increasing the sample size increases the precision of the estimates, which increases the ability to detect differences in the parameters and therefore increases the power of a test.

The power of a statistical test can be illustrated by using the Oxford Cereal Company scenario. The filling process is subject to periodic inspection by a representative of the consumer affairs office. The representative's job is to detect the possible "short weighting" of boxes, which means that cereal boxes containing less than the specified 368 grams are sold. Thus, the representative is interested in determining whether there is evidence that the cereal boxes have a mean weight that is less than 368 grams. The null and alternative hypotheses are as follows:

$$H_0\colon \mu \ge 368 \quad \text{(filling process is working properly)}$$
$$H_1\colon \mu < 368 \quad \text{(filling process is not working properly)}$$

The representative is willing to accept the company's claim that the standard deviation, $\sigma$, equals 15 grams.
Therefore, you can use the Z test. Using Equation (9.1) on page 302, with $\bar{X}_L$ (the lower critical $\bar{X}$ value) substituted for $\bar{X}$, you can find the value of $\bar{X}$ that enables you to reject the null hypothesis:

$$Z = \frac{\bar{X}_L - \mu}{\sigma/\sqrt{n}}$$
$$\bar{X}_L - \mu = Z_{\alpha}\,\frac{\sigma}{\sqrt{n}}$$
$$\bar{X}_L = \mu + Z_{\alpha}\,\frac{\sigma}{\sqrt{n}}$$

Because you have a one-tail test with a level of significance of 0.05, the critical value $Z_{\alpha}$, which cuts off a lower-tail area of $\alpha$, is equal to $-1.645$ (see Figure 9.16). The sample size is $n = 25$. Therefore,

$$\bar{X}_L = 368 + (-1.645)\frac{15}{\sqrt{25}} = 368 - 4.935 = 363.065$$

The decision rule for this one-tail test is

Reject $H_0$ if $\bar{X} < 363.065$; otherwise, do not reject $H_0$.

FIGURE 9.16 Determining the lower critical value for a one-tail Z test for a population mean at the 0.05 level of significance
[Figure: normal curve centered at $\mu = 368$; the region of rejection (area .05) lies below $\bar{X}_L$ and the region of nonrejection (area .95) lies above it; on the Z scale, $Z_L = -1.645$.]

CHAPTER 9 Fundamentals of Hypothesis Testing

The decision rule states that if, in a random sample of 25 boxes, the sample mean is less than 363.065 grams, you reject the null hypothesis, and the representative concludes that the process is not working properly. The power of the test measures the probability of concluding that the process is not working properly for differing values of the true population mean.

What is the power of the test if the actual population mean is 360 grams? To determine the chance of rejecting the null hypothesis when the population mean is 360 grams, you need to determine the area under the normal curve below $\bar{X}_L = 363.065$ grams. Using Equation (9.1), with the population mean $\mu = 360$,

$$Z_{STAT} = \frac{\bar{X}_L - \mu}{\sigma/\sqrt{n}} = \frac{363.065 - 360}{15/\sqrt{25}} = 1.02$$

From Table E.2, there is an 84.61% chance that the Z value is less than $+1.02$. This is the power of the test when the actual population mean $\mu$ is 360 grams (see Figure 9.17). The probability ($\beta$) that you will not reject the null hypothesis ($\mu = 368$) is $1 - 0.8461 = 0.1539$. Thus, the probability of committing a Type II error is 15.39%.
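The critical-value and power calculation above can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and using the standard library's `statistics.NormalDist` in place of Table E.2; the variable names are illustrative and do not come from the text.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 368.0    # hypothesized mean under H0 (grams)
sigma = 15.0    # population standard deviation accepted by the representative
n = 25          # sample size
alpha = 0.05    # level of significance for the lower-tail test

std_error = sigma / sqrt(n)              # sigma / sqrt(n) = 15 / 5 = 3
z_alpha = NormalDist().inv_cdf(alpha)    # lower-tail critical Z, about -1.645
x_lower = mu_0 + z_alpha * std_error     # lower critical value of the sample mean

mu_actual = 360.0                        # assumed true population mean
z_stat = (x_lower - mu_actual) / std_error
power = NormalDist().cdf(z_stat)         # P(sample mean < x_lower | mu = 360)
beta = 1.0 - power                       # probability of a Type II error

print(f"critical value = {x_lower:.3f}")
print(f"Z_STAT = {z_stat:.2f}, power = {power:.4f}, beta = {beta:.4f}")
```

Because this sketch does not round Z to two decimals before the normal-curve lookup, the power comes out at about 0.847 rather than exactly the table-based 0.8461.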
FIGURE 9.17 Determining the power of the test and the probability of a Type II error when $\mu = 360$ grams
[Figure: normal curve centered at $\mu = 360$; the area below $\bar{X}_L = 363.065$ ($Z = +1.02$) is the power, .8461, and the area above is $\beta = .1539$.]

Now that you have determined the power of the test if the population mean were equal to 360, you can calculate the power for any other value of $\mu$. For example, what is the power of the test if the population mean is 352 grams? Assuming the same standard deviation, sample size, and level of significance, the decision rule is

Reject $H_0$ if $\bar{X} < 363.065$; otherwise, do not reject $H_0$.

Once again, because you are testing a hypothesis for a mean, from Equation (9.1),

$$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

If the population mean shifts down to 352 grams (see Figure 9.18), then

$$Z_{STAT} = \frac{363.065 - 352}{15/\sqrt{25}} = 3.69$$

FIGURE 9.18 Determining the power of the test and the probability of a Type II error when $\mu = 352$ grams
[Figure: normal curve centered at $\mu = 352$; the area below $\bar{X}_L = 363.065$ ($Z = +3.69$) is the power, .99989, and the area above is $\beta = .00011$.]

From Table E.2, there is a 99.989% chance that the Z value is less than $+3.69$. This is the power of the test when the population mean is 352. The probability ($\beta$) that you will not reject the null hypothesis ($\mu = 368$) is $1 - 0.99989 = 0.00011$. Thus, the probability of committing a Type II error is only 0.011%.

In the preceding two examples, the power of the test is high, and the chance of committing a Type II error is low. In the next example, you compute the power of the test when the population mean is equal to 367 grams, a value that is very close to the hypothesized mean of 368 grams. Once again, from Equation (9.1),

$$Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

If the population mean is equal to 367 grams (see Figure 9.19), then

$$Z_{STAT} = \frac{363.065 - 367}{15/\sqrt{25}} = -1.31$$

FIGURE 9.19 Determining the power of the test and the probability of a Type II error when $\mu = 367$ grams
[Figure: normal curve centered at $\mu = 367$; the area below $\bar{X}_L = 363.065$ ($Z = -1.31$) is the power, .0951, and the area above is $\beta = .9049$.]

From Table E.2, the probability of a Z value less than $-1.31$ is 0.0951 (or 9.51%).
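Since the same steps are repeated for each assumed population mean, they can be wrapped in one function. This is a minimal sketch, assuming Python 3.8 or later with the standard library's `statistics.NormalDist` standing in for Table E.2; the function name `lower_tail_power` and its parameter names are hypothetical and chosen only for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_power(mu_actual, mu_0=368.0, sigma=15.0, n=25, alpha=0.05):
    """Power of a lower-tail Z test of H0: mu >= mu_0 when the true mean is mu_actual."""
    std_error = sigma / sqrt(n)
    # Lower critical value of the sample mean (about 363.065 with the defaults)
    x_lower = mu_0 + NormalDist().inv_cdf(alpha) * std_error
    # Standardize the critical value relative to the assumed true mean
    z_stat = (x_lower - mu_actual) / std_error
    # Area below the critical value is the probability of rejecting H0
    return NormalDist().cdf(z_stat)

# The three population means worked through in the text
results = {mu: lower_tail_power(mu) for mu in (352, 360, 367)}
for mu, power in results.items():
    print(f"mu = {mu}: power = {power:.5f}, beta = {1 - power:.5f}")
```

The results agree with the table-based values (.99989, .8461, and .0951) to within about 0.001; the small differences arise because the text rounds Z to two decimals before the lookup.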
Because the rejection region is in the lower tail of the distribution, the power of the test when $\mu = 367$ grams is 9.51%, and the chance of making a Type II error is 90.49%.

Figure 9.20 illustrates the power of the test for various possible values of $\mu$ (including the three values examined). This graph is called a power curve.

FIGURE 9.20 Power curve of the cereal-box-filling process for $H_1$: $\mu < 368$ grams
[Figure: power (vertical axis, 0.00 to 1.00) plotted against possible values for $\mu$ in grams (horizontal axis, 352 to 368). The plotted values are listed below.]

Possible value of μ (grams)    Power
352                            .99989
353                            .99961
354                            .99874
355                            .9964
356                            .9909
357                            .9783
358                            .9545
359                            .9131
360                            .8461
361                            .7549
362                            .6406
363                            .5080
364                            .3783
365                            .2578
366                            .1635
367                            .0951
368                            .0500

From Figure 9.20, you can see that the power of this one-tail test increases sharply (and approaches 100%) as the population mean takes on values farther below the hypothesized mean of 368 grams. Clearly, for this one-tail test, the smaller the actual mean $\mu$, the greater the power to detect this difference.¹ For values of $\mu$ close to 368 grams, the power is small because the test cannot effectively detect small differences between the actual population mean and the hypothesized value of 368 grams. When the population mean approaches 368 grams, the power of the test approaches $\alpha$, the level of significance (which is 0.05 in this example).

¹ For situations involving one-tail tests in which the actual mean, $\mu_1$, exceeds the hypothesized mean, the converse would be true. The larger the actual mean, $\mu_1$, compared with the hypothesized mean, the greater is the power. For two-tail tests, the greater the distance between the actual mean, $\mu_1$, and the hypothesized mean, the greater the power of the test.

Figure 9.21 summarizes the computations for the three cases. You can see the drastic changes in the power of the test for different values of the actual population mean by reviewing the different panels of Figure 9.21. From Panels A and B you can see that when the population mean does not greatly differ from 368 grams, the chance of rejecting the null hypothesis, based on the decision rule involved, is not large. However, when the population mean shifts substantially below the hypothesized 368 grams, the power of the test greatly increases, approaching its maximum value of 1 (or 100%).

In the above discussion, a one-tail test with $\alpha = 0.05$ and $n = 25$ was used. The type of statistical test (one-tail vs.
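The power curve, and the way it responds to the choices of $\alpha = 0.05$ and $n = 25$, can be explored numerically. The sketch below assumes Python 3.8 or later and the standard library's `statistics.NormalDist`; the function `lower_tail_power`, the comparison mean of 365 grams, and the alternative settings ($\alpha = 0.10$, $n = 100$) are illustrative choices, not values taken from the text.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_power(mu_actual, mu_0=368.0, sigma=15.0, n=25, alpha=0.05):
    """Power of a lower-tail Z test of H0: mu >= mu_0 when the true mean is mu_actual."""
    std_error = sigma / sqrt(n)
    x_lower = mu_0 + NormalDist().inv_cdf(alpha) * std_error
    return NormalDist().cdf((x_lower - mu_actual) / std_error)

# Power curve: one point per whole gram from 352 to 368
curve = {mu: lower_tail_power(mu) for mu in range(352, 369)}
for mu, power in curve.items():
    print(f"{mu}  {power:.4f}")

# Effect of alpha and sample size at an assumed true mean of 365 grams
power_base = lower_tail_power(365)                      # alpha = 0.05, n = 25
power_larger_alpha = lower_tail_power(365, alpha=0.10)  # easier to reject H0
power_larger_n = lower_tail_power(365, n=100)           # smaller standard error
print(f"base = {power_base:.4f}, "
      f"alpha = 0.10: {power_larger_alpha:.4f}, "
      f"n = 100: {power_larger_n:.4f}")
```

In this sketch the power at $\mu = 368$ equals $\alpha$, the curve rises as $\mu$ falls, and both the larger $\alpha$ and the larger sample size give more power at 365 grams than the base case, in line with the factors listed at the start of the section.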
