252solnD2 10/18/06 (Open this document in 'Page Layout' view!). Re-edited to replace  or  with D .

D. COMPARISON OF TWO SAMPLES

1. Two Means, Two Independent Samples, Large Samples. Text 10.1-10.3, 10.7 [10.1-10.3, 10.5] (10.1-10.3, 10.5)
2. Two Means, Two Independent Samples, Populations Normally Distributed, Population Variances Assumed Equal. Text 10.4, 10.13a, 10.20a, b, e [10.4, 10.15a, 10.13a,b,e]. For the last problem: $\bar{x}_1 = 17.5571$, $s_1 = 1.9333$, $\bar{x}_2 = 19.8905$, $s_2 = 4.5767$ (10.4, 10.14, 10.12a,b,e)
3. Two Means, Two Independent Samples, Populations Normally Distributed, Population Variances not Assumed Equal. Optional Text 10.20 [10.13c,d] (10.12c,d) See data above. D3, D4
4. Two Means, Paired Samples (If samples are small, populations should be normally distributed). Text 10.26, 10.29 [10.36, 10.37], D1, D2 (10.32* (in 252hwkadd.), [10.34] (different numbers), 10.25 [10.35], D1, D2)
5. Rank Tests.
   a. The Wilcoxon-Mann-Whitney Test for Two Independent Samples. Text 12.65 [10.48] (10.46)
   b. Wilcoxon Signed Rank Test for Paired Samples. Text 12.74-12.76 [10.57-59] (10.80-82 on CD), Downing & Clark 18-15, 18-9 (in chapter 17 in D&C 3rd edition), D5
6. Proportions. Text 10.32, 10.38, 10.39, 12.32** [12.2, 12.7*, 12.8*] (12.2)
7. Variances. Text 10.40, 10.43-10.48 [10.16, 10.19-10.24, 10.25] (10.15, 10.18-10.23, 10.24) D6a (below), D6, D7 (A summary problem), D8 (A summary problem)

Graded assignment 3 will be posted.

Solutions to problems in outline points 4 and 5 are in this document. ------

Problems with 2 means and paired samples

Exercise 10.32 (only in 8th edition): The problem is to compare prices at a website, HomeGrocer.com, with prices at local Seattle supermarkets. The data are below. a. At the 0.05 level, is there evidence of a difference in the average price for products purchased from the two vendors? b. Compute the p-value in (a) and interpret its meaning. c. Set up a 95% confidence interval estimate of the difference in the average price for products purchased from the two vendors. d. Compare the results in (a) and (c).

Solution: $\alpha = .05$. The following results were obtained from Minitab. This is paired data because each line represents a single product.

----- 10/3/2003 1:38:10 PM -----

Welcome to Minitab, press F1 for help.
MTB > Retrieve "C:\Berenson\Data Files-8th\Minitab\ONLINE2.mtw".
Retrieving worksheet from file: C:\Berenson\Data Files-8th\Minitab\ONLINE2.mtw
# Worksheet was saved on Mon Apr 09 2001

Results for: ONLINE2.mtw
MTB > Paired c2 c3.
Paired T-Test and CI: HomeGrocer, Supermarkets
Paired T for HomeGrocer - Supermarkets

                N      Mean     StDev   SE Mean
HomeGrocer      8     4.878     2.723     0.963
Supermarkets    8     4.840     2.798     0.989
Difference      8    0.0375    0.2200    0.0778

95% CI for mean difference: (-0.1465, 0.2215)


T-Test of mean difference = 0 (vs. not = 0): T-Value = 0.48  P-Value = 0.644
#Note that since the p-value is above the significance level, we cannot reject the null hypothesis.

MTB > let c4=c2-c3
MTB > print c1-c4
Data Display #This is the original data.

Row  Products                            HomeGrocer  Supermarkets  Difference
1    Tide High Efficiency, 64 oz.              6.99          6.99         0.0
2    Oreo Cookies, 20 oz.                      3.29          3.49        -0.2
3    Formula 409 Cleaner, 22 oz.               2.59          2.69        -0.1
4    Pampers Newborn Diapers, 40 count        10.79         10.99        -0.2
5    Coke Classic, dozen 12 oz. Cans           3.99          3.59         0.4
6    Colgate Total Toothpaste, 7.8 oz.         3.49          3.49         0.0
7    Tropicana Orange Juice, 64 oz.            3.59          3.49         0.1
8    Cheerrios Whole Grain Cereal, 20 oz.      4.29          3.99         0.3

MTB > ssq c4
Sum of Squares of Difference
Sum of squares (uncorrected) of Difference = 0.35000

MTB > sum c4
Sum of Difference
Sum of Difference = 0.30000

To do this problem we do not need statistics on $x_1$ (HomeGrocer prices) or $x_2$ (Supermarket prices), but only on the difference, $d = x_1 - x_2$, which is displayed above. You should be able to compute $n = 8$, $\sum d = 0.30$ and $\sum d^2 = 0.35$. So we have $\bar{d} = \frac{\sum d}{n} = \frac{0.30}{8} = 0.0375$ and $s_d^2 = \frac{\sum d^2 - n\bar{d}^2}{n-1} = \frac{0.35 - 8(0.0375)^2}{7} = 0.04839$, which gives $s_d = \sqrt{0.04839} = 0.2200$. We need $s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \sqrt{\frac{0.04839}{8}} = \sqrt{0.00604875} = 0.07777$.

If the paired data problem were on the formula table, it would appear as below.

Interval for: Difference between Two Means (paired data)
  Confidence Interval: $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}}$, where $d = x_1 - x_2$ and $s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$
  Hypotheses: $H_0: D = D_0$ *, $H_1: D \ne D_0$, where $D = \mu_1 - \mu_2$
  Test Ratio: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}}$
  Critical Value: $\bar{d}_{cv} = D_0 \pm t_{\alpha/2}\, s_{\bar{d}}$

* Same as $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$ if $D_0 = 0$.

a) From the formula table, we get $H_0: D = D_0$, $H_1: D \ne D_0$, where $D = \mu_1 - \mu_2$, but here $D_0 = 0$, so we can write

$H_0: D = 0$, $H_1: D \ne 0$ or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$ or $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$. $t_{\alpha/2}^{(n-1)} = t_{.025}^{(7)} = 2.365$.


Test Ratio Method: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}} = \frac{0.0375}{0.07777} = 0.4822$. Our ‘reject zone’ is the area above 2.365 and the area below -2.365. Since the computed t-ratio is between these values, we do not reject $H_0$.

Critical value method: $\bar{d}_{cv} = D_0 \pm t_{\alpha/2}\, s_{\bar{d}} = 0 \pm 2.365(0.07777) = \pm 0.1839$. Since $\bar{d} = 0.0375$ is between -0.1839 and +0.1839, we cannot reject $H_0$.

The Instructor’s Solution Manual says: $H_0: \mu_D = 0$ vs. $H_1: \mu_D \ne 0$. Test statistic: $t = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = 0.4821$. Decision: Since the absolute value of t = 0.4821 is less than the upper critical value of 2.365, we accept $H_0$ and conclude that there is not enough evidence to suggest that there is significant difference in the average price for products purchased from HomeGrocer.com and Seattle Supermarkets.

b) If we compare $t_{calc} = .4822$ with the 7 degrees of freedom line of the t-table, we find that it is larger than $t_{.35}^{(7)} = 0.402$ and smaller than $t_{.30}^{(7)} = 0.549$. So $.30 < P(t > .4822) < .35$. Because this is a 2-sided test, double this to get $.60 < p\text{-value} < .70$. Minitab says that the p-value is .644. The Instructor’s Solution Manual says, “Using the t-table, p value > 0.5. From Excel the p value is 0.645. The probability of obtaining a mean difference in average price that gives rise to a test statistic that deviates from 0 by 0.4821 or more in either direction is 0.645.”

c) The confidence interval is $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}} = 0.0375 \pm 2.365(0.07777) = 0.0375 \pm 0.18392$, or -0.1464 to 0.2214.

d) The Instructor’s Solution Manual says “The results in (a) and (d) are the same. The hypothesized value of 0 for the difference in the average price for items purchased from HomeGrocer.com and Seattle Supermarkets is inside the 95% confidence interval.”
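The hand calculation for Exercise 10.32 can be checked with a short script. This is only a sketch using Python's standard library: the variable names are my own, and the critical value 2.365 is copied from the t table rather than computed.

```python
import math
import statistics

# Prices from the Minitab data display (HomeGrocer, Supermarkets).
home_grocer = [6.99, 3.29, 2.59, 10.79, 3.99, 3.49, 3.59, 4.29]
supermarkets = [6.99, 3.49, 2.69, 10.99, 3.59, 3.49, 3.49, 3.99]

# Paired differences d = x1 - x2, one per product.
d = [x1 - x2 for x1, x2 in zip(home_grocer, supermarkets)]
n = len(d)

d_bar = statistics.mean(d)      # sample mean of the differences
s_d = statistics.stdev(d)       # sample standard deviation (n - 1 divisor)
se = s_d / math.sqrt(n)         # standard error of d_bar

t_crit = 2.365                  # t(7) at .025, copied from the t table
t_ratio = (d_bar - 0) / se      # test ratio with D0 = 0

ci = (d_bar - t_crit * se, d_bar + t_crit * se)
reject = abs(t_ratio) > t_crit  # two-sided decision

print(f"d_bar={d_bar:.4f}  s_d={s_d:.4f}  se={se:.5f}  t={t_ratio:.4f}")
print(f"95% CI: ({ci[0]:.4f}, {ci[1]:.4f})  reject H0: {reject}")
```

The interval and the test ratio come from the same standard error, which is why the two always give the same decision in a two-sided test.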

Exercise 10.26 [10.36 in 9th] (10.34 in 8th edition): The data below give a sample of prices of randomly selected books at a college book store and at Amazon. Only the first two columns are given. a) Is there a difference between mean prices from the two stores at the 1% significance level? b) What assumptions are needed? c) Construct and interpret a 99% confidence interval for the mean price difference. d) Compare results of a) and c).

9th and 10th edition data.

Row  Textbook                                           Book Store $x_1$  Amazon $x_2$      $d$     $d^2$
1    Access 2000 Guidebook                                         52.22         57.34    -5.12     26.21
2    HTML 4.0 CD with Java Script                                  52.74         44.47     8.27     68.39
3    Designing the Physical Education Curriculum                   39.04         41.48    -2.44      5.95
4    Service Management: Operations, Strategy and IT              101.28         73.72    27.56    759.55
5    Fundamentals of Real Estate Appraisal                         37.45         42.04    -4.59     21.07
6    Investments                                                  113.41         95.38    18.03    325.08
7    Intermediate Financial Management                            109.72        119.80   -10.08    101.61
8    Real Estate Principles                                       101.28         62.48    38.80   1505.44
9    The Automobile Age                                            29.49         32.43    -2.94      8.64
10   Geographic Information Systems in Ecology                     70.07         74.43    -4.36     19.01
11   Geosystems: An Introduction to Physical Geography             83.87         83.81     0.06      0.00
12   Understanding Contemporary Africa                             23.21         26.48    -3.27     10.69
13   Early Childhood Education Today                               72.80         73.48    -0.68      0.46
14   System of Transcendental Idealism (1800)                      17.41         20.98    -3.57     12.74
15   Principles and Labs for Fitness and Wellness                  37.72         40.43    -2.71      7.34
     Sum                                                                                  52.96   2872.21


Solution: To do this problem we do not need statistics on $x_1$ or $x_2$, but only on the difference, $d = x_1 - x_2$, which is displayed above. You should be able to compute $n = 15$, $\sum d = 52.96$ and $\sum d^2 = 2872.21$. So we have $\bar{x}_1 - \bar{x}_2 = \bar{d} = \frac{\sum d}{n} = \frac{52.96}{15} = 3.5307$ and $s_d^2 = \frac{\sum d^2 - n\bar{d}^2}{n-1} = \frac{2872.21 - 15(3.5307)^2}{14} = 191.8016$, which gives $s_d = \sqrt{191.8016} = 13.8492$. We need $s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \sqrt{\frac{191.8016}{15}} = \sqrt{12.7868} = 3.5759$.

a) From the formula table, we get $H_0: D = D_0$, $H_1: D \ne D_0$, where $D = \mu_1 - \mu_2$, but here $D_0 = 0$, so we can write

$H_0: D = 0$, $H_1: D \ne 0$ or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$ or $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$.

The Instructor’s Solution Manual says
$H_0: \mu_D = 0$ There is no difference in the average price of textbooks between the local bookstore and Amazon.com.
$H_1: \mu_D \ne 0$ There is a difference in the average price of textbooks between the local bookstore and Amazon.com.

$\alpha = .01$ and $t_{\alpha/2}^{(n-1)} = t_{.005}^{(14)} = 2.977$.

Test Ratio Method: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}} = \frac{3.5307}{3.5759} = 0.987$. Our ‘reject zone’ is the area above 2.977 and the area below -2.977. Since the computed t-ratio is between these values, we do not reject $H_0$. The Instructor’s Solution Manual says “There is not enough evidence to conclude that there is a difference in the average price of textbooks between the local bookstore and Amazon.com.”

Critical value method: $\bar{d}_{cv} = D_0 \pm t_{\alpha/2}\, s_{\bar{d}} = 0 \pm 2.977(3.5759) = \pm 10.6455$. Since $\bar{d} = 3.5307$ is between -10.6455 and 10.6455, we cannot reject $H_0$.

b) The Instructor’s Solution Manual says “One must assume that the distribution of the differences between the average price of business textbooks between the local bookstore and Amazon.com is approximately normally distributed.”

c) If we look at the 14 df line of the t-table, our $t_{calc} = 0.987$ falls between $t_{.20} = 0.868$ and $t_{.15} = 1.076$, so we can say that $.15 < P(t > 0.987) < .20$. Since this is a 2-tailed problem, say $.30 < p\text{-value} < .40$. The Instructor’s Solution Manual says that the p-value is .3402.

d) The confidence interval is $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}} = 3.5307 \pm 2.977(3.5759) = 3.5307 \pm 10.6455$, or -7.115 to 14.1762.

e) The Instructor’s Solution Manual says “The results in (a) and (d) are the same. The hypothesized value of 0 for the difference in the average price for textbooks between the local bookstore and Amazon.com is inside the 99% confidence interval.”
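When only $n$, $\sum d$ and $\sum d^2$ are available, the same quantities follow from the computational formula for the variance. The sketch below is illustrative: `paired_summary` is a hypothetical helper name, and the critical value is again taken from the t table. It also shows that the test ratio, critical value and confidence interval methods are three views of one comparison.

```python
import math

def paired_summary(n, sum_d, sum_d2):
    """Return (d_bar, s_d, se) from n, the sum of d and the sum of d squared."""
    d_bar = sum_d / n
    var_d = (sum_d2 - n * d_bar ** 2) / (n - 1)   # computational formula
    s_d = math.sqrt(var_d)
    return d_bar, s_d, s_d / math.sqrt(n)

# Sums from the 9th/10th edition textbook-price table.
d_bar, s_d, se = paired_summary(15, 52.96, 2872.21)

t_crit = 2.977            # t(14) at .005, copied from the t table
t_ratio = d_bar / se      # D0 = 0

# Three equivalent two-sided decision rules.
reject_by_ratio = abs(t_ratio) > t_crit
reject_by_cv = abs(d_bar) > t_crit * se
lo, hi = d_bar - t_crit * se, d_bar + t_crit * se
reject_by_ci = not (lo <= 0 <= hi)

print(f"d_bar={d_bar:.4f}  s_d={s_d:.4f}  se={se:.4f}  t={t_ratio:.3f}")
print(f"99% CI: ({lo:.3f}, {hi:.3f})")
print(reject_by_ratio, reject_by_cv, reject_by_ci)
```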


Same problem with 8th edition data.

Book  PriceOn $x_1$  PriceOff $x_2$     $d$     $d^2$
1             55.00           50.95    4.05   16.4025
2             47.50           45.75    1.75    3.0625
3             50.50           50.95   -0.45    0.2025
4             38.95           38.50    0.45    0.2025
5             58.70           56.25    2.45    6.0025
6             49.90           45.95    3.95   15.6025
7             39.95           40.25   -0.30    0.0900
8             41.50           39.95    1.55    2.4025
9             42.25           43.00   -0.75    0.5625
10            44.95           42.25    2.70    7.2900
11            45.95           44.00    1.95    3.8025
12            56.95           55.60    1.35    1.8225
Sum                                   18.70   57.4450

$n = 12$, $\sum d = 18.70$ and $\sum d^2 = 57.4450$. So we have $\bar{x}_1 - \bar{x}_2 = \bar{d} = \frac{\sum d}{n} = \frac{18.70}{12} = 1.5583$ and $s_d^2 = \frac{\sum d^2 - n\bar{d}^2}{n-1} = \frac{57.4450 - 12(1.5583)^2}{11} = 2.5732$, which gives $s_d = \sqrt{2.5732} = 1.6041$. We need $s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \sqrt{\frac{2.5732}{12}} = \sqrt{0.2144} = 0.4631$.

a) From the formula table, we get $H_0: D = D_0$, $H_1: D \ne D_0$, where $D = \mu_1 - \mu_2$, but here $D_0 = 0$, so we can write

$H_0: D = 0$, $H_1: D \ne 0$ or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$ or $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$.

The Instructor’s Solution Manual says
$H_0: \mu_D = 0$ There is no difference in the average price of business textbooks between on-campus and off-campus stores.
$H_1: \mu_D \ne 0$ There is a difference in the average price of business textbooks between on-campus and off-campus stores.

$\alpha = .01$ and $t_{\alpha/2}^{(n-1)} = t_{.005}^{(11)} = 3.106$. Solutions below differ somewhat from the Instructor’s Solution Manual.

Test Ratio Method: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}} = \frac{1.5583}{0.4631} = 3.365$. Our ‘reject zone’ is the area above 3.106 and the area below -3.106. Since the computed t-ratio is not between these values, reject $H_0$. The Instructor’s Solution Manual says “There is enough evidence to conclude that there is a difference in the average price of business textbooks between on-campus and off-campus stores.”

Critical value method: $\bar{d}_{cv} = D_0 \pm t_{\alpha/2}\, s_{\bar{d}} = 0 \pm 3.106(0.4631) = \pm 1.4384$. Since $\bar{d} = 1.5583$ is not between -1.4384 and 1.4384, we reject $H_0$.

b) The Instructor’s Solution Manual says “One must assume that the distribution of the differences between the average price of business textbooks between on-campus and off-campus stores is approximately normally distributed.”

c) If we look at the 11 df line of the t-table, our $t_{calc} = 3.365$ falls between $t_{.005} = 3.106$ and $t_{.001} = 4.025$, so we can say that $.001 < P(t > 3.365) < .005$. Since this is a 2-tailed problem, say $.002 < p\text{-value} < .010$. The Instructor’s Solution Manual says that the p-value is .0067.


d) The confidence interval is $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}} = 1.5583 \pm 3.106(0.4631) = 1.558 \pm 1.438$, or 0.120 to 2.996.

e) The Instructor’s Solution Manual says “The results in (a) and (d) are the same. The hypothesized value of 0 for the difference in the average price for textbooks between two stores is outside the 99% confidence interval.”
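Bracketing a p-value from one line of the t table, as in part c), is a simple lookup. The sketch below is an illustration only: `bracket_upper_tail` is a hypothetical helper, and the row holds the 11 df critical values as I read them from a standard t table.

```python
# Upper-tail areas and critical values on the 11 df line of a t table,
# listed from the largest tail area to the smallest.
T_ROW_11_DF = [(0.10, 1.363), (0.05, 1.796), (0.025, 2.201),
               (0.01, 2.718), (0.005, 3.106), (0.001, 4.025)]

def bracket_upper_tail(t_calc, row):
    """Return (low, high) bounds on P(t > |t_calc|) using one table row."""
    t_abs = abs(t_calc)
    low, high = 0.0, 0.5          # an upper tail beyond 0 is at most .5
    for area, crit in row:
        if t_abs >= crit:
            high = area           # t_calc is at least this far out
        else:
            low = area            # first critical value not reached
            break
    return low, high

low, high = bracket_upper_tail(3.365, T_ROW_11_DF)
two_sided = (2 * low, 2 * high)   # double the tail for a 2-sided test

print(f"{low} < P(t > 3.365) < {high}")
print(f"{two_sided[0]} < p-value < {two_sided[1]}")
```

The Instructor's Solution Manual value of .0067 falls inside the two-sided bracket, as it should.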

Exercise 10.29[10.37 in 9th] (10.35 in 8th edition): The book gives a piece of Excel output based on the data set perform. Look at it! Interpret it! Excel output has been copied from the text. t-Test: Paired Two Sample for Means

                                 Before      After
Mean                           74.54286       79.8
Variance                       80.90252   37.16471
Observations                         35         35
Pearson Correlation             -0.1342
Hypothesized Mean Difference          0
df                                   34
t Stat                         -2.69904
P(T<=t) one-tail               0.005376
t Critical one-tail            1.690923
P(T<=t) two-tail               0.010752
t Critical two-tail            2.032243

Solution: From the Instructor’s Solution Manual, 10.37: From the descriptive statistics provided in the Microsoft Excel output there does not seem to be any violation of the assumption of normality. The mean and median are similar and the skewness value is near 0. Without observing other graphical devices such as a stem-and-leaf display, box-and-whisker plot, or normal probability plot, the fact that the sample size (n = 35) is not very small enables us to assume that the paired t test is appropriate here. The Microsoft Excel output for the paired t test indicates that a significant improvement in average performance ratings has occurred. The calculated t statistic of -2.699 falls far below the one-tailed critical value of -1.6909 using a .05 level of significance. The p value is 0.005376.

Comment: Did they see the same output I saw? If they had actually presented the median and skewness, and the skewness was small, and the mean and mode close, I might agree that the method is appropriate.

Problem D1: A trucking company wishes to compare mileage per gallon on its current air filter with a new product. Results are as below. See if the new filter actually gives better mileage. Assume that the underlying distribution is normal. Use a 5% significance level.

Current Filter $x_1$: 8.6 6.1 11.4 7.9 6.6 8.9 6.4 6.7 6.5 6.3

New Filter $x_2$: 8.3 8.2 7.8 11.6 9.8 9.7 6.7 9.7 9.9 8.1

a. Assume each pair of numbers represents experience on a single truck.

b. Assume that these represent two independent random samples, but $\sigma_1 = \sigma_2$.

c. (Optional) Again assume two random samples, but that the variances are not equal.

d. Test that the mean is 8.2 for each filter.


Solution:

       $x_1$   $x_2$     $d$    $d^2$
        8.6     8.3     0.3     0.09
        6.1     8.2    -2.1     4.41
       11.4     7.8     3.6    12.96
        7.9    11.6    -3.7    13.69
        6.6     9.8    -3.2    10.24
        8.9     9.7    -0.8     0.64
        6.4     6.7    -0.3     0.09
        6.7     9.7    -3.0     9.00
        6.5     9.9    -3.4    11.56
        6.3     8.1    -1.8     3.24
Sum    75.4    89.8   -14.4    65.92

$n_1 = n_2 = n = 10$, $\bar{x}_1 = 7.54$, $s_1 = 1.68602$, $\bar{x}_2 = 8.98$, $s_2 = 1.40855$, $\bar{d} = \frac{\sum d}{n} = \frac{-14.4}{10} = -1.44$, $s_d^2 = \frac{\sum d^2 - n\bar{d}^2}{n-1} = \frac{65.92 - 10(-1.44)^2}{9} = 5.02044$, $s_d = 2.24063$.

$H_0: \mu_1 \ge \mu_2$, $H_1: \mu_1 < \mu_2$ or $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$ or $H_0: D \ge 0$, $H_1: D < 0$, where $D = \mu_1 - \mu_2$ and $\bar{d} = \bar{x}_1 - \bar{x}_2 = 7.54 - 8.98 = -1.44$.

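It’s time to remind you of the four methods we have for comparing two means. We use 3 of them in this problem. In parts b), c) and d), only the test ratio method will be used. Can you supply the answers for the other two approaches?

In every row below the hypotheses are $H_0: D = D_0$ *, $H_1: D \ne D_0$, with $D = \mu_1 - \mu_2$.

Interval for: Difference between Two Means (paired data)
  Confidence Interval: $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}}$, where $d = x_1 - x_2$ and $s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$
  Test Ratio: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}}$
  Critical Value: $\bar{d}_{cv} = D_0 \pm t_{\alpha/2}\, s_{\bar{d}}$

Interval for: Difference between Two Means ($\sigma$ known)
  Confidence Interval: $D = \bar{d} \pm z_{\alpha/2}\, \sigma_{\bar{d}}$, where $\bar{d} = \bar{x}_1 - \bar{x}_2$ and $\sigma_{\bar{d}} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
  Test Ratio: $z = \frac{\bar{d} - D_0}{\sigma_{\bar{d}}}$
  Critical Value: $\bar{d}_{cv} = D_0 \pm z_{\alpha/2}\, \sigma_{\bar{d}}$

Interval for: Difference between Two Means ($\sigma$ unknown, variances assumed equal)
  Confidence Interval: $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}}$, where $s_{\bar{d}} = \hat{s}_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, $\hat{s}_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ and $DF = n_1 + n_2 - 2$
  Test Ratio: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}}$
  Critical Value: $\bar{d}_{cv} = D_0 \pm t_{\alpha/2}\, s_{\bar{d}}$

Interval for: Difference between Two Means ($\sigma$ unknown, variances assumed unequal)
  Confidence Interval: $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}}$, where $s_{\bar{d}} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ and $DF = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$
  Test Ratio: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}}$
  Critical Value: $\bar{d}_{cv} = D_0 \pm t_{\alpha/2}\, s_{\bar{d}}$

* Same as $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$ if $D_0 = 0$.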

a) This is paired data. 8.6 represents experience on truck 1 with filter 1 and 8.3 represents experience on the same truck with filter 2. $df = n - 1 = 10 - 1 = 9$, $s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{2.24063}{\sqrt{10}} = 0.70855$. Do this by one of the following three methods.

(i) Test Ratio: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}} = \frac{-1.44 - 0}{0.70855} = -2.0323$. Make a diagram with zero in the middle showing a 'reject' region below $-t_{.05}^{(9)} = -1.833$. Since -2.0323 falls in the 'reject' region, reject $H_0$.

(ii) Critical Value: $\bar{d}_{cv} = D_0 - t_{\alpha}\, s_{\bar{d}} = 0 - 1.833(0.70855) = -1.299$. Make a diagram with zero in the middle showing a 'reject' region below -1.299. Since $\bar{d} = -1.44$ falls in the 'reject' region, reject $H_0$.

(iii) Confidence interval: $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}}$ becomes $D \le \bar{d} + t_{\alpha}\, s_{\bar{d}} = -1.44 + 1.833(0.70855) = -0.141$. $D \le -0.141$ contradicts the null hypothesis $D \ge 0$, so reject $H_0$.

b) We are assuming that these are two independent samples, but $\sigma_1 = \sigma_2$. So $\hat{s}_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{9(1.68602)^2 + 9(1.40855)^2}{18} = 2.41334$ and $s_{\bar{d}} = \hat{s}_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{2.41334\left(\frac{1}{10} + \frac{1}{10}\right)} = \sqrt{2.41334(0.2)} = 0.69474$. $df = (n_1 - 1) + (n_2 - 1) = (10 - 1) + (10 - 1) = 18$. $t_{.05}^{(18)} = 1.734$ and $t = \frac{\bar{d} - D_0}{s_{\bar{d}}} = \frac{-1.44 - 0}{0.69474} = -2.073$. Make a diagram with zero in the middle showing a 'reject' region below -1.734. Since $t = -2.073$ falls in the 'reject' region, reject $H_0$.

c) $\frac{s_1^2}{n_1} = \frac{1.68602^2}{10} = 0.28427$, $\frac{s_2^2}{n_2} = \frac{1.40855^2}{10} = 0.19840$, so $\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} = 0.28427 + 0.19840 = 0.48267$, $\bar{d} = \bar{x}_1 - \bar{x}_2 = -1.44$ and $s_{\bar{d}} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{0.48267} = 0.69474$. $DF = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{0.48267^2}{\frac{0.28427^2}{9} + \frac{0.19840^2}{9}} = 17.4$. We round this down to 17 degrees of freedom, and use $t_{.05}^{(17)} = 1.740$. $t = \frac{\bar{d} - D_0}{s_{\bar{d}}} = \frac{-1.44 - 0}{0.69474} = -2.073$. Make a diagram with zero in the middle showing a 'reject' region below -1.740. Since $t = -2.073$ falls in the 'reject' region, reject $H_0$.

d) We have two samples of ten, so that for each two-tailed test, $t_{.025}^{(9)} = 2.262$. $H_0: \mu = 8.2$ and $H_1: \mu \ne 8.2$. To use a test ratio, make a diagram with zero in the middle and ‘reject’ zones above 2.262 and below -2.262.

(i) For the first sample, $t = \frac{\bar{x}_1 - \mu_0}{s_{\bar{x}_1}} = \frac{7.54 - 8.2}{1.68602 / \sqrt{10}} = -1.238$. Since this is between -2.262 and 2.262, do not reject the null hypothesis.


(ii) For the second sample, $t = \frac{\bar{x}_2 - \mu_0}{s_{\bar{x}_2}} = \frac{8.98 - 8.2}{1.40855 / \sqrt{10}} = 1.751$. Since this is between -2.262 and 2.262, do not reject the null hypothesis.
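The two independent-sample standard errors in parts b) and c) can be compared side by side. This is a sketch from the summary statistics quoted above, with variable names of my own choosing. With equal sample sizes the pooled and separate-variance standard errors coincide, so the two methods differ only in their degrees of freedom.

```python
import math

# Summary statistics for the two filters (from the solution table).
n1, xbar1, s1 = 10, 7.54, 1.68602   # current filter
n2, xbar2, s2 = 10, 8.98, 1.40855   # new filter
d_bar = xbar1 - xbar2

# b) Variances assumed equal: pooled variance, df = n1 + n2 - 2.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
se_pooled = math.sqrt(sp2 * (1 / n1 + 1 / n2))
df_pooled = n1 + n2 - 2

# c) Variances not assumed equal: separate-variance standard error
#    and the approximate degrees of freedom formula.
v1, v2 = s1**2 / n1, s2**2 / n2
se_unequal = math.sqrt(v1 + v2)
df_unequal = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

t_pooled = d_bar / se_pooled        # D0 = 0
t_unequal = d_bar / se_unequal

print(f"pooled:  sp2={sp2:.5f}  se={se_pooled:.5f}  df={df_pooled}  t={t_pooled:.4f}")
print(f"unequal: se={se_unequal:.5f}  df={df_unequal:.2f}  t={t_unequal:.4f}")
```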

Problem D2: Do 2-sided and (if appropriate) 1-sided confidence intervals in D.1

Solution: $\alpha = .05$. In parts a)-c), use $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}}$ for the 2-sided interval and $D \le \bar{d} + t_{\alpha}\, s_{\bar{d}}$ for the 1-sided interval.

a) From problem D1, $df = 9$, $t_{.025}^{(9)} = 2.262$ and $t_{.05}^{(9)} = 1.833$. $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}} = -1.44 \pm 2.262(0.70855) = -1.44 \pm 1.60$. $D \le \bar{d} + t_{\alpha}\, s_{\bar{d}} = -1.44 + 1.833(0.70855) = -1.44 + 1.30 = -0.14$.

b) From problem D1, $df = 18$, $t_{.025}^{(18)} = 2.101$ and $t_{.05}^{(18)} = 1.734$. $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}} = -1.44 \pm 2.101(0.69474) = -1.44 \pm 1.46$. $D \le \bar{d} + t_{\alpha}\, s_{\bar{d}} = -1.44 + 1.734(0.69474) = -1.44 + 1.20 = -0.24$.

c) From problem D1, $df = 17$, $t_{.025}^{(17)} = 2.110$ and $t_{.05}^{(17)} = 1.740$. $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}} = -1.44 \pm 2.110(0.69474) = -1.44 \pm 1.47$. $D \le \bar{d} + t_{\alpha}\, s_{\bar{d}} = -1.44 + 1.740(0.69474) = -1.44 + 1.21 = -0.23$.

d) From problem D1, $df = 9$ and $t_{.025}^{(9)} = 2.262$.

$\mu_1 = \bar{x}_1 \pm t_{\alpha/2}\, s_{\bar{x}_1} = 7.54 \pm 2.262\left(\frac{1.68602}{\sqrt{10}}\right) = 7.54 \pm 1.21$.

$\mu_2 = \bar{x}_2 \pm t_{\alpha/2}\, s_{\bar{x}_2} = 8.98 \pm 2.262\left(\frac{1.40855}{\sqrt{10}}\right) = 8.98 \pm 1.01$.

Exercise 10.25 [10.35 in 9th but I swear that I have never seen this before]: The problem uses data from the file on your CD called MEASUREMENT to answer the following. a) At the 5% level of significance, is there evidence of a difference of the mean measurements (from in-line and analytical lab)? b) What assumption is necessary to perform this test? c) Use a graphical method to evaluate the assumption. d) Construct and interpret a 95% confidence interval estimate of the difference between the mean measurements. Solution: The original data follows in columns c1-c3. I have added labels to the Minitab printout to clarify it.

————— 10/20/2005 7:33:28 PM ————————————————————

Welcome to Minitab, press F1 for help.

MTB > WOpen "C:\BBS\10th ed student minitab files\MEASUREMENT.MTW".
Retrieving worksheet from file: 'C:\BBS\10th ed student minitab files\MEASUREMENT.MTW'
Worksheet was saved on Tue Oct 22 2002
Results for: MEASUREMENT.MTW

MTB > let c4=c2-c3 #I am computing the difference.
MTB > let c5 = c4*c4 #I am squaring the difference.


MTB > print c1-c5
Data Display

Row  Sample  In-Line  Analytical lab        d       dsq
             $x_1$    $x_2$           $d = x_1 - x_2$  $d^2$
1    1          8.01            8.01     0.00    0.0000
2    2          7.56            7.29     0.27    0.0729
3    3          7.47            7.54    -0.07    0.0049
4    4          7.40            7.42    -0.02    0.0004
5    5          7.83            7.80     0.03    0.0009
6    6          7.50            7.65    -0.15    0.0225
7    7          6.86            6.93    -0.07    0.0049
8    8          7.31            7.46    -0.15    0.0225
9    9          7.45            7.60    -0.15    0.0225
10   10         7.23            7.40    -0.17    0.0289
11   11         7.37            7.50    -0.13    0.0169
12   12         7.49            7.41     0.08    0.0064
13   13         6.21            6.25    -0.04    0.0016
14   14         6.68            6.54     0.14    0.0196
15   15         5.12            5.20    -0.08    0.0064
16   16         4.84            4.70     0.14    0.0196
17   17         4.84            4.82     0.02    0.0004
18   18         5.21            5.33    -0.12    0.0144
19   19         5.35            5.30     0.05    0.0025
20   20         5.60            5.40     0.20    0.0400
21   21         5.32            5.39    -0.07    0.0049
22   22         5.16            5.17    -0.01    0.0001
23   23         5.66            5.50     0.16    0.0256
24   24         6.31            6.24     0.07    0.0049

MTB > sum c4
Sum of d
Sum of d = -0.07     ($\sum d = -0.07$)

MTB > sum c5
Sum of dsq
Sum of dsq = 0.3437     ($\sum d^2 = 0.3437$)

MTB > describe c4
Descriptive Statistics: d

Variable   N  N*     Mean  SE Mean   StDev  Minimum       Q1   Median      Q3  Maximum
d         24   0  -0.0029   0.0249  0.1222  -0.1700  -0.1100  -0.0150  0.0775   0.2700

MTB > Paired c2 c3.
Paired T-Test and CI: In-Line, Analytical lab
Paired T for In-Line - Analytical lab

                  N       Mean     StDev   SE Mean
In-Line          24    6.49083   1.08624   0.22173
Analytical lab   24    6.49375   1.11711   0.22803
Difference       24  -0.002917  0.122207  0.024945

95% CI for mean difference: (-0.054520, 0.048687)
T-Test of mean difference = 0 (vs not = 0): T-Value = -0.12  P-Value = 0.908


You can verify that $\bar{d} = \frac{\sum d}{n} = \frac{-0.07}{24} = -0.002917$, $s_d^2 = \frac{\sum d^2 - n\bar{d}^2}{n-1} = \frac{0.3437 - 24(-0.002917)^2}{23} = \frac{0.343496}{23} = 0.0149346$, $s_d = \sqrt{0.0149346} = 0.122207$ and $s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \sqrt{\frac{s_d^2}{n}} = \sqrt{\frac{0.0149346}{24}} = \sqrt{0.00062227} = .024945$.

Our hypotheses are $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \ne \mu_2$ or $H_0: \mu_1 - \mu_2 = 0$, $H_1: \mu_1 - \mu_2 \ne 0$ or $H_0: D = 0$, $H_1: D \ne 0$, with $\alpha = .05$. The t test gives a p-value of .908, which is certainly above the significance level, and the confidence interval includes zero, so we cannot reject the null hypothesis. The remainder of the solution is copied from the Instructor’s Solutions Manual (slightly edited).

(a) $H_0: \mu_D = 0$ vs. $H_1: \mu_D \ne 0$

Excel Output: t-Test: Paired Two Sample for Means

                                 In-Line  Analytical lab
Mean                            6.490833         6.49375
Variance                        1.179912     1.247928804
Observations                          24              24
Pearson Correlation             0.994239
Hypothesized Mean Difference           0
df                                    23
t Stat                          -0.11692
P(T<=t) one-tail                0.453968
t Critical one-tail              1.71387
P(T<=t) two-tail                0.907937
t Critical two-tail             2.068655

Test statistic: $t = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = -0.1169$. This is the same as $t = \frac{\bar{d} - D_0}{s_{\bar{d}}}$.

Decision: Since t = -0.1169 falls between the lower and upper critical values $\pm 2.0687$, do not reject $H_0$. There is not enough evidence to conclude that there is a difference in the mean measurements in-line and from an analytical lab.

(b) What assumption is necessary to perform this test? You must assume that the distribution of the differences between the mean measurements is approximately normal.

(c) Use a graphical method to evaluate the assumption.

[Figures: Box-and-whisker Plot of Difference; Normal Probability Plot of Difference against Z Value]

The distributions appear to be right skewed.

(d) Construct and interpret a 95% confidence interval estimate of the difference between the mean measurements. $\bar{D} \pm t \frac{S_D}{\sqrt{n}} = -0.0029 \pm 2.0687 \frac{0.1222}{\sqrt{24}}$, so $-0.0545 \le \mu_D \le 0.0487$.

This is a problem from a while ago that I left in to give you some more practice.

McClave et al. Exercise 9.33: a) $x_1$ is the 1999 inflation rate as forecast earlier in 1999 by nine economists. $x_2$ is the 2000 inflation rate as forecast earlier in 2000 by the same economists. The question is ‘were they more optimistic in 1999?’ What are our hypotheses?

$H_0: \mu_1 \ge \mu_2$, $H_1: \mu_1 < \mu_2$ or $H_0: \mu_1 - \mu_2 \ge 0$, $H_1: \mu_1 - \mu_2 < 0$ or $H_0: D \ge 0$, $H_1: D < 0$, where $D = \mu_1 - \mu_2$.

b) Test the above hypotheses.

Obsn   $x_1$   $x_2$     $d$    $d^2$
1       1.8     2.2    -0.4     0.16
2       2.3     2.3     0       0
3       2.3     2.3     0       0
4       2.5     3.0    -0.5     0.25
5       2.3     2.4    -0.1     0.01
6       2.5     3.0    -0.5     0.25
7       2.5     2.5     0       0
8       2.3     2.6    -0.3     0.09
9       2.5     2.6    -0.1     0.01
Sum    21.0    22.9    -1.9     0.77

$n_1 = n_2 = n = 9$, $\bar{x}_1 = 2.333$, $\bar{x}_2 = 2.544$, $\bar{d} = \frac{\sum d}{n} = \frac{-1.9}{9} = -0.211$, which agrees with $\bar{x}_1 - \bar{x}_2 = 2.333 - 2.544 = -0.211$. $s_d^2 = \frac{\sum d^2 - n\bar{d}^2}{n-1} = \frac{0.77 - 9(-0.211)^2}{8} = 0.0461$, $s_d = 0.2147$.

This is paired data. Each line represents the opinion of one economist. $df = n - 1 = 9 - 1 = 8$. $s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \sqrt{\frac{0.0461}{9}} = \frac{0.2147}{\sqrt{9}} = 0.07157$. Do this by one of the following three methods.

(i) Test Ratio: $t = \frac{\bar{d} - D_0}{s_{\bar{d}}} = \frac{-0.211 - 0}{0.07157} = -2.948$. Make a diagram with zero in the middle showing a 'reject' region below $-t_{.05}^{(8)} = -1.860$. Since -2.948 falls in the 'reject' region, reject $H_0$.

(ii) Critical Value: $\bar{d}_{cv} = D_0 - t_{\alpha}\, s_{\bar{d}} = 0 - 1.860(0.07157) = -0.133$. Make a diagram with zero in the middle showing a 'reject' region below -0.133. Since $\bar{d} = -0.211$ falls in the 'reject' region, reject $H_0$.

(iii) Confidence interval: $D = \bar{d} \pm t_{\alpha/2}\, s_{\bar{d}}$ becomes $D \le \bar{d} + t_{\alpha}\, s_{\bar{d}} = -0.211 + 1.860(0.07157) = -0.078$. $D \le -0.078$ contradicts the null hypothesis $D \ge 0$, so reject $H_0$.

Because the difference between the 1999 number and the 2000 number is negative, the 2000 number is above the 1999 number and the economists were more optimistic in 1999.


Problems with 2 nonnormal samples (medians) .

Exercise 12.65 [10.48 in 9th] (10.46 in 8th edition): Da boss ranks 10 individuals each assigned to traditional training (T) and an experimental method (E) on performance. The 20 employees have been rated 1-20, where 1 is worst. The data is as follows:

T   1  2  3  5  9  10  12  13  14  15
E   4  6  7  8  11  16  17  18  19  20

Using a 5% significance level can you say that there is a difference in the median performance between the two methods?

Solution: These data are not cross-classified so we use the Mann-Whitney-Wilcoxon method. They also are ordered for you.

$x_1$        $r_1$   $x_2$        $r_2$
Not given       1    Not given       4
                2                    6
                3                    7
                5                    8
                9                   11
               10                   16
               12                   17
               13                   18
               14                   19
               15                   20
Sum            84                  126

$H_0: \eta_1 = \eta_2$, $H_1: \eta_1 \ne \eta_2$, or the null hypothesis is simply 'similar distributions.' From the Instructor’s Solution Manual: If population 1 is ‘traditional’ and population 2 is ‘experimental,’ the null hypothesis says that there is no difference in performance between the traditional and the experimental training methods, and the alternative hypothesis says that there is a difference in performance between the traditional and the experimental training methods.

Check this by noting that these two rank sums must add to the sum of the first $n = n_1 + n_2 = 10 + 10 = 20$ numbers, and that this is $\frac{n(n+1)}{2} = \frac{20(21)}{2} = 210$, and that $SR_1 + SR_2 = 84 + 126 = 210$. Our test statistic is the sum of the ranks of the smaller sample, but in this case, since the samples are of equal size, set $SR_1 = W$. Go to table 6a or E8 in the text and find the bounds for $n_1 = 10$ and $n_2 = 10$. The decision rule is: if $SR_1 = W \le 78$ or $W \ge 132$, reject $H_0$. Since the test statistic, $W = 84$, is between these bounds, do not reject $H_0$. There is not enough evidence to conclude that there is a difference in performance between the traditional and the experimental training methods.
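Because the ratings are already ranks, the rank sums and the check against $n(n+1)/2$ take only a few lines. In this sketch the variable names are mine and the bounds 78 and 132 are the table values quoted above.

```python
# Ranks (1 = worst) for the two training methods.
traditional = [1, 2, 3, 5, 9, 10, 12, 13, 14, 15]
experimental = [4, 6, 7, 8, 11, 16, 17, 18, 19, 20]

sr1 = sum(traditional)       # rank sum for sample 1
sr2 = sum(experimental)      # rank sum for sample 2

# The two rank sums must add to the sum of the first n integers.
n = len(traditional) + len(experimental)
check_ok = (sr1 + sr2) == n * (n + 1) // 2

# Table bounds for n1 = n2 = 10, two-sided test at the 5% level.
w_lower, w_upper = 78, 132
w = sr1
reject = w <= w_lower or w >= w_upper

print(f"SR1={sr1}  SR2={sr2}  check={check_ok}  reject H0: {reject}")
```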

Exercise 12.74 [10.57 in 9th] (10.80 in 8th edition): You are given the following 12 differences between two related samples. Test for a median difference of zero. What is the value of the test statistic? Data: 3.2, 1.7, 4.5, 0.0, 11.1, -0.8, 2.3, -2.0, 0.0, 14.8, 5.6, 1.7.

Solution: $H_0: \eta = 0$. If we are testing for a median of zero, the original data appear in the $x$ column. If we are testing for the difference between two medians, replace $x$ by two columns of data and in the third column compute the difference, $d = x_1 - x_2$. This method cannot handle zero differences, so remove them. Rank the 10 remaining differences by absolute value, bottom-to-top, in the column marked $r$, putting an asterisk by ties. The last column, $r^*$, resolves the ties by doing things like replacing 2 and 3 by their average, 2.5. The last column also notes the sign of the differences. We compute $T^+$ and $T^-$, the rank sums of the numbers signed with + and -.

$x$     $d = x - 0$   $|d|$   Rank $r$   Corrected Rank $r^*$
 3.2        3.2         3.2      6          6+
 1.7        1.7         1.7      2*         2.5+
 4.5        4.5         4.5      7          7+
 0.0        0.0         0.0      Discard
11.1       11.1        11.1      9          9+
-0.8       -0.8         0.8      1          1-
 2.3        2.3         2.3      5          5+
-2.0       -2.0         2.0      4          4-
 0.0        0.0         0.0      Discard
14.8       14.8        14.8     10         10+
 5.6        5.6         5.6      8          8+
 1.7        1.7         1.7      3*         2.5+

We find that $T^+ = 6 + 2.5 + 7 + 9 + 5 + 10 + 8 + 2.5 = 50$ and $T^- = 1 + 4 = 5$. We should check that for $n = 10$, $T^+ + T^- = \frac{n(n+1)}{2} = \frac{10(11)}{2} = 55$. We find that $T^+ + T^- = 50 + 5 = 55$. The smaller of our totals is 5. If we check Table 7 for $n = 10$ and a 2-sided $\alpha = .05$ test, we find 8 in the .025 column. This means that we reject the null hypothesis if our rank sum is less than or equal to 8. Since 5 is below 8, reject the null hypothesis.

Exercise 12.75 [10.58 in 9th] (10.81 in 8th edition): In 12.74 find the lower and upper critical values of W from Table E.9. Use a 5% significance level to test for a zero median. Solution: These two exercises give the procedure for using the text table E9, which differs somewhat from Table 7. $n' = 10$, $\alpha = 0.05$, $W_L = 8$, $W_U = 47$.

Exercise 12.76 [10.59 in 9th] (10.82 in 8th edition): In 12.74 , what is your statistical decision?

Solution: Since W = 50 > WU = 47, reject H0 .
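The ranking steps for Exercises 12.74-12.76 (drop zeros, rank by absolute value, average tied ranks, sum by sign) can be sketched as below. `signed_rank_sums` is a hypothetical helper, not a library routine, and the bounds 8 and 47 are the table values quoted in the solutions.

```python
def signed_rank_sums(diffs):
    """Return (t_plus, t_minus, n) after discarding zero differences."""
    nonzero = [d for d in diffs if d != 0]
    ordered = sorted(nonzero, key=abs)

    # Assign each distinct magnitude the average of the ranks it occupies.
    avg_rank = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # Positions i..j-1 hold ranks i+1..j, whose average is (i+1+j)/2.
        avg_rank[abs(ordered[i])] = (i + 1 + j) / 2
        i = j

    t_plus = sum(avg_rank[abs(d)] for d in nonzero if d > 0)
    t_minus = sum(avg_rank[abs(d)] for d in nonzero if d < 0)
    return t_plus, t_minus, len(nonzero)

data = [3.2, 1.7, 4.5, 0.0, 11.1, -0.8, 2.3, -2.0, 0.0, 14.8, 5.6, 1.7]
t_plus, t_minus, n = signed_rank_sums(data)

# Table 7 rule: reject if the smaller rank sum is at most 8.
reject_table7 = min(t_plus, t_minus) <= 8
# Table E.9 rule: W = T+, reject if W <= 8 or W >= 47.
reject_e9 = t_plus <= 8 or t_plus >= 47

print(f"T+={t_plus}  T-={t_minus}  n={n}")
print(reject_table7, reject_e9)
```

Both tables lead to the same decision because $T^+ + T^- = n(n+1)/2$, so a small $T^-$ forces a large $T^+$.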


Downing & Clark (Computational Problem) 17-15: Twenty people rank two political candidates (A, B) on a scale of 1-10. Test the null hypothesis that people have no preference between the candidates. Data are shown in columns A and B below. $\alpha = .10$

Solution: Because this is preference data, we cannot assume that it has the normal distribution. Because it is paired, use the Wilcoxon Signed Rank Test. The hypotheses are $H_0: \eta_1 = \eta_2$ and $H_1: \eta_1 \ne \eta_2$, or the null hypothesis is simply 'similar distributions.'

Person A x1  B x2  d  x1  x2 d  x1  x2 Rank r Corrected Rank r * 1 2 8 -6 6 20 20- 2 5 6 -1 1 1 5- 3 3 5 -2 2 10 11.5- 4 7 8 -1 1 2 5- 5 4 5 -1 1 3 5- 6 8 4 4 4 19 19+ 7 9 8 1 1 4 5+ 8 8 9 -1 1 5 5- 9 7 8 -1 1 6 5- 10 5 6 -1 1 7 5- 11 6 5 1 1 8 5+ 12 7 4 3 3 14 16+ 13 8 5 3 3 15 16+ 14 8 6 2 2 11 11.5+ 15 9 6 3 3 16 16+ 16 9 7 2 2 12 11.5+ 17 8 9 -1 1 9 5- 18 4 6 -2 2 13 11.5- 19 10 7 3 3 17 16+ 20 8 5 3 3 18 16+

To explain the calculation of corrected ranks we need the table below. Because several differences have equal magnitude, each number in 'Corrected rank $r^*$' is the average of the corresponding numbers in 'Rank $r$.'

$|d|$   Rank $r$                  Corrected rank $r^*$
1       1 2 3 4 5 6 7 8 9         5
2       10 11 12 13               11.5
3       14 15 16 17 18            16
4       19                        19
6       20                        20

If we sum the corrected ranks we get $T^+ = 132$ for those with a + sign and $T^- = 78$ for those with a - sign. The smaller of these is designated $T_L = 78$. (Check: $T^+ + T^- = 132 + 78 = 210$. This should be equal to the sum of the first 20 numbers, which is $\frac{20(21)}{2} = 210$.) If we use Table 7 "Critical Values of $T_L$ in the Wilcoxon Signed Rank Sum Test …..," we use the .05 column for a 2-sided test with $\alpha = .10$. For $n = 20$ the critical value is 60. Since 78 is above 60, do not reject $H_0$.

For values of $n$ above 15, $T_L$, the smaller of $T^+$ and $T^-$, has approximately the normal distribution, which may be used here with mean $\mu_T = \frac{1}{4}n(n+1) = \frac{1}{4}(20)(21) = 105$ and variance $\sigma_T^2 = \frac{1}{6}(2n+1)\mu_T = \frac{1}{6}(41)(105) = 717.5$. If the significance level is 10% and the test is two-sided, we reject our null hypothesis if $z = \frac{T_L - \mu_T}{\sigma_T}$ does not lie between $\pm z_{\alpha/2} = \pm z_{.05} = \pm 1.645$.

Here $z = \frac{T_L - \mu_T}{\sigma_T} = \frac{78 - 105}{\sqrt{717.5}} = -1.008$. Since this $z$ is between $\pm 1.645$, we do not reject $H_0$.
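The normal approximation for the signed rank statistic is easy to reproduce numerically. The sketch below uses only the mean and variance formulas stated in this solution; the function name `signed_rank_z` is an illustrative assumption, and 1.645 is the two-sided 10% critical value used in the text.

```python
import math


def signed_rank_z(t_small, n):
    """z-score of the smaller signed rank sum under the normal approximation."""
    mu = n * (n + 1) / 4          # mean of T under the null hypothesis
    var = (2 * n + 1) * mu / 6    # variance of T under the null hypothesis
    return (t_small - mu) / math.sqrt(var)


z = signed_rank_z(78, 20)
reject = abs(z) > 1.645           # two-sided test at the 10% level
print(round(z, 3), reject)
```

With $T_L = 78$ and $n = 20$ this gives a $z$ of about -1.008, so the decision agrees with the table lookup.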

Downing & Clark (Computational Problem) 17-9: Defense budgets are given for two countries over two decades. Check to see that they have the same distribution. Data are given in columns $x_1$ and $x_2$ below. $\alpha = .10$

Note: It is not totally clear to me that the Wilcoxon-Mann-Whitney test is appropriate in this case, because the data in each row may come from a single year, so that a paired data test may be more appropriate, and it is not clear in general why a nonparametric procedure is appropriate. For this reason, Computational problem 11 may be a better example of when to use this method.

Solution: Because the data are assumed to be independent random samples and thus not paired, use the Wilcoxon-Mann-Whitney Rank Sum Test. The hypotheses are $H_0: \eta_1 = \eta_2$ and $H_1: \eta_1 \ne \eta_2$, or the null hypothesis is simply 'similar distributions.' $n_1 = 10$ and $n_2 = 10$. In the table below, $r_1$ and $r_2$ represent bottom-to-top ranking.

$x_1$   $r_1$   $x_2$   $r_2$
10      3       9       2
12      4       8       1
15      7       17      9
16      8       14      6
18      10      13      5
22      12      19      11
26      15      24      13
28      17      25      14
30      19      27      16
29      18      31      20
Sum     113             97

So the sums of the ranks are $SR_1 = 113$ and $SR_2 = 97$. Check: these two rank sums must add to the sum of the first $n = n_1 + n_2 = 10 + 10 = 20$ numbers, which is $\frac{n(n+1)}{2} = \frac{20(21)}{2} = 210$, and indeed $SR_1 + SR_2 = 113 + 97 = 210$.

The smaller of $SR_1$ and $SR_2$ is called $W$. This can be compared against the critical values for $T_L$ and $T_U$ in Table 6b. For $n_1 = 10$, $n_2 = 10$ and a 2-tailed test with $\alpha = .10$, $T_L = 82$ and $T_U = 128$. Since $W = 97$ is between these values, we cannot reject the null hypothesis.
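As a check on the joint ranking, the rank sums can be recomputed in a few lines. This is a minimal sketch; `rank_sums` is a hypothetical helper, the critical values 82 and 128 are simply the Table 6b values quoted above, and the treatment of values exactly equal to a critical value is an assumption.

```python
x1 = [10, 12, 15, 16, 18, 22, 26, 28, 30, 29]
x2 = [9, 8, 17, 14, 13, 19, 24, 25, 27, 31]


def rank_sums(a, b):
    """Rank sums of two samples ranked together, with average ranks for ties."""
    pooled = sorted(a + b)

    def average_rank(v):
        first = pooled.index(v) + 1          # lowest rank held by value v
        last = first + pooled.count(v) - 1   # highest rank held by value v
        return (first + last) / 2

    return sum(average_rank(v) for v in a), sum(average_rank(v) for v in b)


sr1, sr2 = rank_sums(x1, x2)
w = min(sr1, sr2)
t_lower, t_upper = 82, 128                   # Table 6b values quoted in the text
reject = not (t_lower < w < t_upper)         # assumption: boundary values count as rejection
print(sr1, sr2, w, reject)
```

Since there are no ties in these data, the average-rank step has no effect here, but it keeps the helper usable for samples that do have ties.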

For values of $n_1$ and $n_2$ that are too large for the tables, $W$ has approximately the normal distribution with mean $\mu_W = \frac{1}{2}n_1(n_1 + n_2 + 1) = \frac{1}{2}(10)(10 + 10 + 1) = 105$ and variance $\sigma_W^2 = \frac{1}{6}n_2\mu_W = \frac{1}{6}(10)(105) = 175$.

Then $z = \frac{W - \mu_W}{\sigma_W} = \frac{97 - 105}{\sqrt{175}} = -0.605$. If we wish to take a p-value approach, $p\text{-value} = 2P(z \le -0.605) = 2(.5 - .2274) = 2(.2726) = .5452$. Since our p-value is larger than the significance level, we cannot reject the null hypothesis. Though the text uses this method in this problem, strictly speaking, it should be limited to cases where $n_2$ is roughly 20 or larger, so that we are better off using the table.
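The z-score and two-sided p-value for the rank sum can be reproduced without a normal table. The sketch below assumes the mean and variance formulas given in this solution and uses the standard identity $P(|Z| \ge |z|) = \mathrm{erfc}(|z|/\sqrt{2})$; the function names are illustrative.

```python
import math


def rank_sum_z(w, n1, n2):
    """z-score of the rank sum W of the first sample under the normal approximation."""
    mu = n1 * (n1 + n2 + 1) / 2   # mean of W under the null hypothesis
    var = n2 * mu / 6             # variance of W under the null hypothesis
    return (w - mu) / math.sqrt(var)


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


z = rank_sum_z(97, 10, 10)
p = two_sided_p(z)
print(round(z, 3), round(p, 3))
```

The p-value comes out near .545, far above $\alpha = .10$. Small differences from a table-based figure come from rounding $z$ before the lookup.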


252solnD2 10/06/03

Some problems from McClave et al. have been left in here for practice.

McClave et al. Exercise 15-12†: Let $x_1$ be sample B (the smaller sample) and $x_2$ be sample A. Test that the median for sample B is above the median for sample A. The data given are, for B, $x_1 = 65, 35, 47, 52$ and, for A, $x_2 = 37, 40, 33, 29, 42, 33, 35, 26, 34$.

Solution: Our hypotheses become $H_0: \eta_1 \le \eta_2$ and $H_1: \eta_1 > \eta_2$, or $H_0: x_1 \preceq x_2$ and $H_1: x_1 \succ x_2$. $\alpha = .05$. We have 13 numbers here that we will rank from 1 to 13. Because the numbers in $x_1$ are generally higher than the numbers in $x_2$, we rank from top to bottom of the ordered list. If we put these data in order we get the table below; the ranks and their sums are in the two columns on the right.

$x_1$   $x_2$   $r_1$   $r_2$
        28              1
        29              2
        33              3.5
        33              3.5
        34              5
35      35      6.5     6.5
        37              8
        40              9
        42              10
47              11
52              12
65              13
Sum             42.5    48.5

To check our ranking, note that $n_1 = 4$, $n_2 = 9$, $n = n_1 + n_2 = 4 + 9 = 13$, $SR_1 = 42.5$, $SR_2 = 48.5$, and $SR_1 + SR_2 = 42.5 + 48.5 = 91$. Remember that we should have $SR_1 + SR_2 = \frac{n(n+1)}{2} = \frac{13(14)}{2} = 91$. We do, so our ranking is likely to be correct. We designate the rank sum of the smaller group as $W = 42.5$. The problem says $\alpha = .05$. The sample sizes are too large for Table 5, but we can use Table 6a for a one-sided test with $n_1 = 4$ and $n_2 = 9$. The two critical values it gives are 14 and 42, so we reject the null hypothesis since $W$ is above the upper bound.
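Because these samples contain tied values (two 33s and two 35s), it may help to see the average ranks computed mechanically. This sketch uses the data as listed in the exercise statement; `average_ranks` is a made-up helper name, and the upper critical value 42 is the Table 6a value quoted above.

```python
from collections import defaultdict

x1 = [65, 35, 47, 52]                           # sample B, the smaller sample
x2 = [37, 40, 33, 29, 42, 33, 35, 26, 34]       # sample A


def average_ranks(values):
    """Map each distinct value to the average of the ranks it occupies."""
    positions = defaultdict(list)
    for rank, v in enumerate(sorted(values), start=1):
        positions[v].append(rank)
    return {v: sum(ranks) / len(ranks) for v, ranks in positions.items()}


rank_of = average_ranks(x1 + x2)
sr1 = sum(rank_of[v] for v in x1)
sr2 = sum(rank_of[v] for v in x2)
n = len(x1) + len(x2)
w = sr1                                         # rank sum of the smaller sample
reject = w > 42                                 # upper critical value quoted from Table 6a
print(sr1, sr2, n * (n + 1) / 2, reject)
```

The value 35 appears in both samples and receives rank 6.5 in each, which is why both rank sums end in .5.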


McClave et al. Exercise 15-28†: Check the data below to see if they come from similar distributions. Each line represents a couple.

Solution: $H_0: \eta_1 = \eta_2$ and $H_1: \eta_1 \ne \eta_2$, or $H_0: x_1 \sim x_2$ and $H_1: x_1 \not\sim x_2$. $\alpha = .05$. Because the data are cross-classified, we use the Wilcoxon signed rank test. The original data appear in the first two columns. In the third column we compute the difference, $d = x_1 - x_2$. This method cannot handle ties, so we remove the last pair. We rank these 11 differences bottom-to-top by absolute value in the column marked $r$, putting an asterisk by ties. The last column, $r^*$, resolves the ties by doing things like replacing 5, 6 and 7 by their average, 6. The last column also notes the sign of the differences. We compute $T^+$ and $T^-$, the rank sums of the numbers signed with + and -.

$x_1$   $x_2$   $d$    $r$    $r^*$
10      2       8      11     11+
2       1       1      1*     1.5+
7       3       4      8      8+
1       6       -5     9      9-
2       5       -3     5*     6-
6       4       2      3*     3.5+
5       8       -3     6*     6-
7       10      -3     7*     6-
9       7       2      4*     3.5+
2       8       -6     10     10-
10      11      -1     2*     1.5-
12      12      0

We find that $T^+ = 11 + 1.5 + 8 + 3.5 + 3.5 = 27.5$ and $T^- = 9 + 6 + 6 + 6 + 10 + 1.5 = 38.5$. We should check that for $n = 11$, $T^+ + T^- = \frac{n(n+1)}{2} = \frac{11(12)}{2} = 66$. We find that $T^+ + T^- = 27.5 + 38.5 = 66$. The smaller of our totals is 27.5. If we check Table 7 for $n = 11$ and a 2-sided $\alpha = .05$ test, we find 11 in the .025 column. This means that we reject the null hypothesis if our rank sum is less than or equal to 11. Since 27.5 is above 11, do not reject the null hypothesis.
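For paired data the same bookkeeping starts from the two columns rather than from a single column of differences. The following is a sketch only, with illustrative variable names, that recomputes the signed rank sums for the couples above.

```python
x1 = [10, 2, 7, 1, 2, 6, 5, 7, 9, 2, 10, 12]
x2 = [2, 1, 3, 6, 5, 4, 8, 10, 7, 8, 11, 12]

# Differences d = x1 - x2, dropping any pair whose difference is zero.
d = [a - b for a, b in zip(x1, x2) if a != b]
magnitudes = sorted(abs(v) for v in d)


def average_rank(m):
    """Average of the ranks occupied by magnitude m among the sorted magnitudes."""
    first = magnitudes.index(m) + 1
    last = first + magnitudes.count(m) - 1
    return (first + last) / 2


t_plus = sum(average_rank(abs(v)) for v in d if v > 0)
t_minus = sum(average_rank(abs(v)) for v in d if v < 0)
n = len(d)
print(t_plus, t_minus, n * (n + 1) / 2)
```

Comparing `t_plus + t_minus` with $n(n+1)/2$ is a quick way to catch an arithmetic slip in either rank sum.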

Problem D5: a) If our sample consists of the numbers 9, 14, 16, 16, 18, 19, 22, 23, 25, 26, test the hypotheses $H_0: \eta = 15$ and $H_1: \eta \ne 15$ by computing $x - \eta_0$ for each value of $x$ and using the magnitude and sign of the results to rank them and perform a Wilcoxon signed rank test. $\alpha = .05$

b) For the following paired samples, test the hypotheses $H_0: \eta_1 = \eta_2$ and $H_1: \eta_1 \ne \eta_2$ using a Wilcoxon signed rank test. $\alpha = .05$

$x_1$   09   14   16   16   18   19   22   23   25   78
$x_2$   14   10   08   14   13   16   12   40   13   24

Solution: a)

$x$    $d = x - \eta_0$   $|d|$   Rank $r$   Corrected rank $r^*$
9      -6                 6       6          6-
14     -1                 1       1          2-
16     1                  1       2          2+
16     1                  1       3          2+
18     3                  3       4          4+
19     4                  4       5          5+
22     7                  7       7          7+
23     8                  8       8          8+
25     10                 10      9          9+
26     11                 11      10         10+


$H_0: \eta = 15$ and $H_1: \eta \ne 15$, so $d = x - \eta_0 = x - 15$. If we sum the corrected ranks we get $T^+ = 47$ for those with a + sign and $T^- = 8$ for those with a - sign. The smaller of these is designated $T_L = 8$. (Check: $T^+ + T^- = 47 + 8 = 55$. This should be equal to the sum of the first 10 numbers, which is $\frac{10(11)}{2} = 55$.)

If we use Table 7 "Critical Values of $T_L$ in the Wilcoxon Signed Rank Sum Test …..," we use the .025 column for a 2-sided test with $\alpha = .05$. For $n = 10$ the critical value is 8. Since this is equal to our value of $T_L$, reject $H_0$.
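The critical value 8 for $n = 10$ can be understood by counting. Under the null hypothesis each of the ranks 1 through 10 is equally likely to carry a + or a - sign, so the lower-tail probability of a rank sum is the fraction of sign patterns with that sum or less. The sketch below assumes no ties (the data in part a do contain ties, so it is only an approximation there), and `signed_rank_cdf` is a hypothetical name.

```python
def signed_rank_cdf(n, t):
    """P(T <= t) under the null hypothesis for n untied, nonzero differences."""
    max_sum = n * (n + 1) // 2
    # counts[s] = number of subsets of the ranks {1, ..., n} that sum to s.
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Iterate downward so that each rank is used at most once per subset.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return sum(counts[: t + 1]) / 2 ** n


p8 = signed_rank_cdf(10, 8)
p9 = signed_rank_cdf(10, 9)
print(p8, p9)
```

Here `p8` is below .025 and `p9` is above it, which is consistent with 8 being the largest rank sum that leads to rejection in the .025 column.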

b)

Observation   $x_1$   $x_2$   $d = x_1 - x_2$   $|d|$   Rank $r$   Corrected rank $r^*$
1             9       14      -5                5       4          4.5-
2             14      10      4                 4       3          3+
3             16      8       8                 8       6          6+
4             16      14      2                 2       1          1+
5             18      13      5                 5       5          4.5+
6             19      16      3                 3       2          2+
7             22      12      10                10      7          7+
8             23      40      -17               17      9          9-
9             25      13      12                12      8          8+
10            78      24      54                54      10         10+

$H_0: \eta_1 = \eta_2$ and $H_1: \eta_1 \ne \eta_2$. If we sum the corrected ranks we get $T^+ = 41.5$ for those with a + sign and $T^- = 13.5$ for those with a - sign. The smaller of these is designated $T_L = 13.5$. (Check: $T^+ + T^- = 41.5 + 13.5 = 55$. This should be equal to the sum of the first 10 numbers, which is $\frac{10(11)}{2} = 55$.) If we use Table 7 "Critical Values of $T_L$ in the Wilcoxon Signed Rank Sum Test …..," we use the .025 column for a 2-sided test with $\alpha = .05$. For $n = 10$ the critical value is 8. Since our value of $T_L$ is above the critical value, do not reject $H_0$.

Parts not copied ©2003 Roger Even Bove
