Problem Set Section 8.1 Inferences About Two Independent Means: Large Samples

1. First Grade Absences: For boys, the average number of absences in the first grade is 15 with a standard deviation of 7; for girls, the average number of absences is 10 with a standard deviation of 6. In a nationwide survey, suppose 100 boys and 50 girls are sampled. Test the hypothesis that boys are absent more than girls.

2. ESP Sensitivity: Imagine a researcher wants to determine whether or not a given drug has any effect on the scores of human subjects performing a task of ESP sensitivity. He randomly assigns his subjects to one of two groups. Nine hundred subjects in group 1 (the experimental group) receive an oral administration of the drug prior to testing. In contrast, 1000 subjects in group 2 (the control group) receive a placebo. Does the drug have an effect? Perform an appropriate hypothesis test and construct a confidence interval. Use a .05 level of significance.

          Drug Group    No-Drug Group
n         900           1000
mean      9.78          15.10
SD        4.05          4.28

3. Inferences from Samples of Listing Home Price and Selling Home Price: Use the HOMES dataset. Assume σ = 80 for listing home price and σ = 70 for selling home price.
a. Test the claim that the selling price of homes is less than the listing price. Use a .01 significance level.
b. Construct a 99% confidence interval for the mean difference of listing price and selling price.

4. Drug Effectiveness: When testing for a difference between means for a treatment group and a placebo group, the following Excel display is obtained.

z-Test: Two Sample for Means

                                Variable 1    Variable 2
Mean                            152.0739      154.9669
Known Variance                  438.5388      239.1461
Observations                    50            50
Hypothesized Mean Difference    0
z                               -0.7858
P(Z<=z) one-tail                0.22
z Critical one-tail             1.644853
P(Z<=z) two-tail                0.4
z Critical two-tail             1.959961

a. Using a 0.05 significance level, is there sufficient evidence to support the claim that the treatment group (Variable 1) comes from a population with a mean that is less than the mean for the placebo population? Explain. Set up the hypotheses. You may use either the p-value or traditional method.
b. Construct a 95% confidence interval.

5. Drug Effectiveness: When testing for a difference between means for a treatment group and a placebo group, the following Excel display is obtained.

z-Test: Two Sample for Means

                                Variable 1    Variable 2
Mean                            75.34798      73.81763
Known Variance                  12.5538       11.65292
Observations                    50            50
Hypothesized Mean Difference    0
z                               2.242534
P(Z<=z) one-tail                0.012463
z Critical one-tail             2.05375
P(Z<=z) two-tail                0.024927
z Critical two-tail             2.32635

a. Using a 0.02 significance level, is there sufficient evidence to support the claim that the treatment group (Variable 1) comes from a population with a mean that is different from the mean for the placebo population? Explain. Set up the hypotheses. You may use either the p-value or traditional method.
b. Construct a 98% confidence interval.
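The z statistic reported in an Excel display such as the one in problem 4 can be reproduced from the means, known variances, and sample sizes. The following is a minimal sketch using only the Python standard library; the function name two_sample_z and the variable names are illustrative assumptions, not part of the problem set.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Return (z, standard error) for a two-sample z test with known variances."""
    se = sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2 - hypothesized_diff) / se
    return z, se


# Values copied from the Excel display in problem 4.
mean1, var1, n1 = 152.0739, 438.5388, 50
mean2, var2, n2 = 154.9669, 239.1461, 50

z, se = two_sample_z(mean1, var1, n1, mean2, var2, n2)

std_normal = NormalDist()
p_left = std_normal.cdf(z)            # left-tailed p-value (claim: mu1 < mu2)
z_crit = std_normal.inv_cdf(0.975)    # two-tailed critical value for 95% confidence

diff = mean1 - mean2
ci = (diff - z_crit * se, diff + z_crit * se)

print(f"z = {z:.4f}, left-tail p = {p_left:.4f}")
print(f"95% CI for mu1 - mu2: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The same function can be reused for problems 1 and 2 by squaring the given standard deviations to obtain the variances.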

Problem Set Section 8.2 Inferences About Two Dependent Means

1. Data sets A and B are dependent.

a. Find d̄ and s_d.

b. Test the hypothesis that μd = 0. Use α = .05.

c. Construct a 95% confidence interval for μd

Subject    1     2     3     4     5
A          30    28    43    43    5.8
B          28    24    25    35    8.3
d          -2    -4    -18   -8    2.5

2. Cholesterol Drug: A pharmaceutical company wishes to test a new drug with the expectation of lowering cholesterol levels. Ten subjects are randomly selected and pretested. The subjects were then placed on the drug for a period of 6 months, after which their cholesterol levels were tested again. The results are listed below.
a. Does the drug lower cholesterol level? Use α = .01. Assume that the distribution of differences is normally distributed.
b. Construct a 99% confidence interval estimate of the mean difference between pretest and posttest results. Interpret the results, and comment on the implications of whether the confidence interval contains zero.

Subject    1     2     3     4     5     6     7     8     9     10
Before     195   225   202   195   175   250   235   269   190   240
After      180   220   210   175   170   250   205   250   190   225

3. SAT Preparation Course: Nine students took the SAT. Their scores are listed below. Later on, they took a test preparation course and retook the SAT.
a. Did the test preparation course make a difference? Use α = .02. Assume that the distribution of differences is normally distributed.
b. Construct a 98% confidence interval estimate of the mean difference between pre-course and post-course results. Interpret the results, and comment on the implications of whether the confidence interval contains zero.

Student                 1     2     3     4     5     6     7     8      9
Scores before course    720   860   850   880   860   710   850   1200   950
Scores after course     740   860   840   920   890   720   840   1240   970
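The paired-sample calculations in this section (d̄, s_d, the t statistic, and a confidence interval for μd) can be sketched as follows, using the cholesterol data from problem 2. This is only an illustration under stated assumptions: differences are taken as before minus after (reversing the order flips the signs), and the critical value 3.2498 is the t-table entry assumed for df = 9 at 99% confidence.

```python
from math import sqrt
from statistics import mean, stdev

# Cholesterol data from problem 2.
before = [195, 225, 202, 195, 175, 250, 235, 269, 190, 240]
after = [180, 220, 210, 175, 170, 250, 205, 250, 190, 225]

# Differences taken as before - after, so a positive mean suggests a drop.
d = [b - a for b, a in zip(before, after)]
n = len(d)

d_bar = mean(d)        # mean of the differences
s_d = stdev(d)         # sample standard deviation (n - 1 denominator)
se = s_d / sqrt(n)     # standard error of the mean difference
t_stat = d_bar / se    # test statistic for H0: mu_d = 0, with df = n - 1

# Assumption: t-table critical value for df = 9 and 99% confidence (two-tailed).
T_CRIT_99_DF9 = 3.2498
ci = (d_bar - T_CRIT_99_DF9 * se, d_bar + T_CRIT_99_DF9 * se)

print(f"d-bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t_stat:.3f}")
print(f"99% CI for mu_d: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Because the interval is computed for before minus after, its endpoints have the opposite signs of an interval computed for after minus before.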

Problem Set Section 8.3 Comparing Variation in Two Samples

1. Test the claim that the placebo population has a different standard deviation than the treatment population. Use α = .05. Assume normal distributions.
Treatment Group: n = 20, x̄ = 65.4, s = 0.83
Placebo Group: n = 15, x̄ = 62.4, s = 0.42

2. Comparing Day and Evening Statistics Students: A statistics teacher believes that the standard deviation of test scores for students in her evening statistics class is different from the standard deviation of test scores for students in her day class. The results of an exam given to the day and evening students are shown below. Can the teacher conclude that there is a difference? Use α = .01. Assume the data are normally distributed.
Day Students: n = 36, x̄ = 75, s = 9.8
Evening Students: n = 41, x̄ = 77, s = 5.3

3. Comparing Authors: Using the BOOKS dataset, test the claim that there is no difference between the standard deviation of words/sentence for Clancy books and Tolstoy books. Use α = .05. Assume normal distributions.
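The F statistic for comparing two standard deviations is the ratio of the sample variances. A minimal sketch using the summary statistics from problem 2 follows; placing the larger variance in the numerator is an assumed convention, and the critical value 2.34 is taken from the Selected Answers rather than computed.

```python
def f_statistic(s1, s2):
    """Ratio of sample variances, with the larger variance in the numerator."""
    v1, v2 = s1 ** 2, s2 ** 2
    return max(v1, v2) / min(v1, v2)


# Sample standard deviations from problem 2 (day vs. evening students).
f_stat = f_statistic(9.8, 5.3)

# Assumption: right-tail critical value as listed in the Selected Answers.
F_CRIT_RIGHT = 2.34
reject_h0 = f_stat > F_CRIT_RIGHT

print(f"F = {f_stat:.2f}, reject H0: {reject_h0}")
```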

Problem Set Section 8.4 Inferences About Two Independent Means: Small Samples

Where appropriate, test for equality of variances to determine whether sample variances should be pooled.

1. Sample data for two populations are displayed below.

                             Population I    Population II
Sample Size                  25              18
Sample Mean                  50.2            42.0
Sample Standard Deviation    6.4             9.9

a. Perform an appropriate one-tail test for the difference of means. Use α = .1.
b. Construct a 95% confidence interval for the difference of means.

2. Hormone Replacement Therapy: In observational studies the number of deaths due to coronary heart disease has been reduced in women who take hormone replacement therapy. Low levels of serum HDL (good cholesterol) are considered predictive of death from coronary heart disease. Researchers claim that serum HDL increases when patients participate in hormone replacement therapy. A study was done. Results are listed below.

a. Do the data support the researchers' claim? Use α = .05.

b. Construct a 95% confidence interval about µ1-µ2.

                             Experimental Group    Control Group
Sample Size                  30                    29
Mean Increase in HDL         8.1                   2.4
Sample Standard Deviation    10.5                  4.3

3. Experimental Pain Reliever: Suppose we want to test an experimental pain reliever. We induce pain in each of the test subjects. (This happened to me!) We then give the experimental pain reliever to one group of people and ibuprofen to the other (the control group). After one hour, we ask each person to report their pain level (on a 1-10 scale). The Excel output is given below. Use α = .05.

Test:       4 3 4 5 3 3 4 4 2 4
Control:    4 4 6 5 7 4 2 5 6 4

a. Is the experimental pain reliever better (or worse), or must we attribute the difference to chance?
b. Construct a 95% confidence interval. Is there a difference between the mean pain level for the experimental pain reliever and ibuprofen?

F-Test Two-Sample for Variances

                       Variable 1    Variable 2
Mean                   3.6           4.7
Variance               0.711111      2.011111
Observations           10            10
df                     9             9
F                      0.353591
P(F<=f) one-tail       0.068706
F Critical one-tail    0.314575

t-Test: Two-Sample Assuming Equal Variances

                                Variable 1    Variable 2
Mean                            3.6           4.7
Variance                        0.711111      2.011111
Observations                    10            10
Pooled Variance                 1.361111
Hypothesized Mean Difference    0
df                              18
t Stat                          -2.10829
P(T<=t) one-tail                0.024641
t Critical one-tail             1.734064
P(T<=t) two-tail                0.049283
t Critical two-tail             2.100922

t-Test: Two-Sample Assuming Unequal Variances

                                Variable 1    Variable 2
Mean                            3.6           4.7
Variance                        0.711111      2.011111
Observations                    10            10
Hypothesized Mean Difference    0
df                              15
t Stat                          -2.10829
P(T<=t) one-tail                0.026117
t Critical one-tail             1.75305
P(T<=t) two-tail                0.052233
t Critical two-tail             2.13145

4. SAT Scores Between Boys and Girls: Suppose you give the SAT math test to 40 students.

The boys' scores are as follows:
275 309 355 378 396 461 464 468 472 474 475 480 490 523 555 582 632 683 741 766

The girls' scores are as follows:
298 319 409 423 436 447 447 457 454 459 488 502 519 532 533 547 551 557 569 588

Is there any statistically significant difference between boys and girls? In other words, do these data give any evidence that one sex is better than the other, or can we attribute the differences to chance?

5. Impulse Spending: Big Foods Grocery has two grocery stores located in Johnston City. One store is located on First Street and the other on Main Street, and each is run by a different manager. Each manager claims that her store's layout maximizes the amounts customers will purchase on impulse. Both managers surveyed a sample of their customers and asked them how much more they spent than they had planned to; in other words, how much they spent on impulse. The following table shows the sample data collected from the two stores.

Main Street    First Street
15.19          15.78
18.22          17.73
15.38          10.61
15.96          15.79
21.92          14.22
12.87          13.82
12.47          13.45
13.96          12.86
13.79          10.82
13.74          12.85
18.4           18.57
17.79          10.83

Upper-level management at Big Foods Grocery wants to know if there is a difference in the mean amounts purchased on impulse at the two stores and has hired you to perform the statistical analysis.
a. Choose a significance level and perform an appropriate hypothesis test.
b. Construct a confidence interval. Interpret the results.

6. Cognitive Therapy: A hypothetical experiment is conducted to see whether cognitive therapy is more effective for relieving depression than is psychodynamic psychotherapy. Out of the population of depressed people in a certain area, 10 are sampled for the cognitive therapy group and 10 are sampled for the psychodynamic therapy group. After 6 weeks of therapy, the improvement in each patient is assessed. The improvement scores are shown below:

Cognitive    Psychodynamic
9            3
7            2
7            4
8            0
3            5
8            2
7            4
5            3
6            2
8            5

Estimate the mean difference in effectiveness and compute a 95% confidence interval on this estimate. What can be concluded about the relative effectiveness of the two treatments in the population?
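The Excel output for problem 3 (Experimental Pain Reliever) can be reproduced from the raw Test and Control data. The sketch below computes the pooled variance, the equal-variance t statistic, and the unequal-variance (Welch) t statistic with its approximate degrees of freedom. Variable names are illustrative, and the Welch-Satterthwaite formula is assumed to be what Excel rounds to obtain df = 15.

```python
from math import sqrt
from statistics import mean, variance

# Pain-level data from problem 3.
test = [4, 3, 4, 5, 3, 3, 4, 4, 2, 4]
control = [4, 4, 6, 5, 7, 4, 2, 5, 6, 4]

n1, n2 = len(test), len(control)
m1, m2 = mean(test), mean(control)
v1, v2 = variance(test), variance(control)   # sample variances (n - 1 denominator)

# Equal-variance (pooled) t test, df = n1 + n2 - 2.
pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_pooled = (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# Unequal-variance (Welch) t test with Welch-Satterthwaite degrees of freedom.
se_welch = sqrt(v1 / n1 + v2 / n2)
t_welch = (m1 - m2) / se_welch
df_welch = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)

print(f"pooled variance = {pooled_var:.6f}, t (pooled) = {t_pooled:.5f}")
print(f"t (Welch) = {t_welch:.5f}, df (Welch) = {df_welch:.3f}")
```

With equal sample sizes the two t statistics coincide; only the degrees of freedom, and therefore the critical values and p-values, differ.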

Problem Set Section 8.5 Inferences About Two Proportions

1. Abstract Reasoning: An experiment compared the ability of 7- and 9-year-olds to solve a problem requiring a certain type of abstract reasoning. Five of 21 seven-year-olds and 14 of 16 nine-year-olds solved the problem. Construct the 95% confidence interval on the difference in proportions.

2. Sexual Harassment Case: Suppose we wish to estimate the proportion of persons who would vote for a guilty verdict in a particular sexual harassment case. Of 80 female jurors, 58 (72.5%) voted guilty; of 80 male jurors, 47 (58.8%) voted guilty.
a. Is there a difference between the proportion of male jurors who voted guilty and the proportion of female jurors who voted guilty?
b. Construct a confidence interval for the difference between the true proportion of male jurors voting guilty and the true proportion of female jurors voting guilty. Is there a significant difference?

3. Urn Tampering: Suppose we draw 10 balls from an urn of red and white balls and we get 8 red and 2 white. We leave the room and come back some time later. We again draw 10 balls and we get 5 red and 5 white. Has someone tampered with the urn and changed the ratio of red to white balls? Or must we attribute this to chance?

4. Breakfast Before School: A survey was conducted of students from the Princeton public school system to determine if the incidence of hungry children was consistent in two schools located in lower-income areas. A random sample of 80 elementary students from school A found that 18 did not have breakfast before coming to school. A random sample of 180 elementary students from school B found that 27 did not have breakfast before coming to school.
a. Does there appear to be a difference between the two schools? Use a 5% significance level.
b. Construct a 95% confidence interval for the difference between the true proportion of school A students that did not have breakfast and the true proportion of school B students that did not have breakfast.

5. Skating Injuries: 161 people who visited one hospital's emergency room in a 6-month study period with injuries from in-line skating were interviewed. The interviewer found that 53 people were wearing wrist guards and 6 of them had wrist injuries. Of the 108 who did not wear wrist guards, 45 had wrist injuries. We are interested in the difference between the proportions of wrist injuries in the population wearing wrist guards and the population without. Does it appear that wrist guards prevent wrist injuries? Use a 10% significance level.
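The two-proportion z test used throughout this section can be sketched as below, applied to the juror counts from problem 2. The function name two_proportion_z is an illustrative assumption, and the pooled standard error shown is the usual choice under the null hypothesis of equal proportions.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Problem 2: 58 of 80 female jurors and 47 of 80 male jurors voted guilty.
z = two_proportion_z(58, 80, 47, 80)

# Two-tailed p-value from the standard normal distribution.
p_two_tail = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, two-tail p = {p_two_tail:.4f}")
```

Swapping the order of the two groups changes only the sign of z. For a confidence interval, the unpooled standard error sqrt(p1(1-p1)/n1 + p2(1-p2)/n2) is typically used instead of the pooled one.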

Selected Answers:

Section 8.1
1. z = 4.55, reject
2. CI (4.95, 5.70), z = 27.83, reject

Section 8.2
2a) t = 2.76, fail to reject
2b) CI (-21.99, 1.791)

Section 8.3
2) cv(right) = 2.34, F = 3.42, reject

Section 8.4
For df = 28:
2a) t = 2.745, p-val = 0.0053, reject
2b) 1.446 < µ1-µ2 < 9.954

For df = 38.728:
2a) t = 2.745, p-val = 0.00457, reject
2b) 1.500 < µ1-µ2 < 9.901

Section 8.5
2a) z = 1.831
3) z = -1.406, p-val = 0.1596