Appendix Tables

Table A.1  Cumulative Binomial Probabilities
Table A.2  Cumulative Poisson Probabilities
Table A.3  Standard Normal Curve Areas
Table A.4  The Incomplete Gamma Function
Table A.5  Critical Values for t Distributions
Table A.6  Critical Values for Chi-Squared Distributions
Table A.7  t Curve Tail Areas
Table A.8  Critical Values for F Distributions
Table A.9  Critical Values for Studentized Range Distributions
Table A.10 Chi-Squared Curve Tail Areas
Table A.11 Critical Values for the Ryan–Joiner Test of Normality
Table A.12 Critical Values for the Wilcoxon Signed-Rank Test
Table A.13 Critical Values for the Wilcoxon Rank-Sum Test
Table A.14 Critical Values for the Wilcoxon Signed-Rank Interval
Table A.15 Critical Values for the Wilcoxon Rank-Sum Interval
Table A.16 β Curves for t Tests

Answers to Odd-Numbered Exercises

Chapter 1

1. a. Houston Chronicle, Des Moines Register, Chicago Tribune, Washington Post b. Capital One, Campbell Soup, Merrill Lynch, Prudential c. Bill Jasper, Kay Reinke, Helen Ford, David Menendez d. 1.78, 2.44, 3.50, 3.04

3. a. In a sample of 100 DVD players, what are the chances that more than 20 need service while under warranty? What are the chances that none need service while still under warranty? b. What proportion of all DVD players of this brand and model will need service within the warranty period?

5. a. No, the relevant conceptual population is all scores of all students who participate in the SI in conjunction with this particular course. b. The advantage of randomly allocating students to the two groups is that the two groups should then be fairly comparable before the study. If the two groups perform differently in the class, we might attribute this to the treatments (SI and control). If it were left to students to choose, stronger or more dedicated students might gravitate toward SI, biasing the results. c. If all students were put in the treatment group there would be no results with which to compare the treatments.

7. One could generate a simple random sample of all single-family homes in the city, or a stratified random sample by taking a simple random sample from each of the ten district neighborhoods. From each of the homes in the sample the necessary data would be collected. This would be an enumerative study because there exists a finite, identifiable population of objects from which to sample.

9. a. There could be several explanations for the variability of the measurements. Among them could be measuring error (due to mechanical or technical changes across measurements), recording error, differences in weather conditions at the time of the measurements, etc. b. This study involves a conceptual population. There is no sampling frame.

11.
    6l | 034
    6h | 667899
    7l | 00122244       Stem = tens
    7h |                Leaf = ones
    8l | 001111122344
    8h | 5557899
    9l | 03
    9h | 58
   This display brings out the gap in the data: there are no scores in the high 70s.

13. a.
     2 | 23                          Stem units: 1.0
     3 | 2344567789                  Leaf units: .10
     4 | 01356889
     5 | 00001114455666789
     6 | 0000122223344456667789999
     7 | 00012233455555668
     8 | 02233448
     9 | 012233335666788
    10 | 2344455688
    11 | 2335999
    12 | 37
    13 | 8
    14 | 36
    15 | 0035
    16 |
    17 |
    18 | 9
   b. A representative value could be the median, 7.0. c. The data appear to be highly concentrated, except for a few values on the positive side. d. No, there is skewness to the right, or positive skewness. e. The value 18.9 appears to be an outlier, being more than two stem units from the previous value.

15. a.
    Number nonconforming   Frequency   Relative frequency (Freq/60)
    0                      7           0.117
    1                      12          0.200
    2                      13          0.217
    3                      14          0.233
    4                      6           0.100
    5                      3           0.050
    6                      3           0.050
    7                      1           0.017
    8                      1           0.017
                                       1.001
   Doesn't add exactly to 1 because relative frequencies have been rounded.


   b. .917, .867, 1 − .867 = .133 c. The center of the distribution is somewhere around 2 or 3, and it shows that there is some positive skewness in the data.

17. a. .375 b. .218 c. .242 d. The histogram is very positively skewed.

19. a. The number of subdivisions having no cul-de-sacs is 17/47 = .362, or 36.2%. The proportion having at least one cul-de-sac is 30/47 = .638, or 63.8%.
    y   Count   Percent
    0   17      36.17
    1   22      46.81
    2   6       12.77
    3   1       2.13
    5   1       2.13
        N = 47
   .362, .638
   b.
    z   Count   Percent
    0   13      27.66
    1   11      23.40
    2   3       6.38
    3   7       14.89
    4   5       10.64
    5   3       6.38
    6   3       6.38
    8   2       4.26
        N = 47
   .894, .830

21. a.
    Class      Freq   Rel freq
    0–<100     21     0.21
    100–<200   32     0.32
    200–<300   26     0.26
    300–<400   12     0.12
    400–<500   4      0.04
    500–<600   3      0.03
    600–<700   1      0.01
    700–<800   0      0.00
    800–<900   1      0.01
               100    1.00
   The histogram is skewed right, with a majority of observations between 0 and 300 cycles. The class holding the most observations is between 100 and 200 cycles.
   b.
    Class      Freq   Rel freq   Density
    0–<50      8      0.08       .0016
    50–<100    13     0.13       .0026
    100–<150   11     0.11       .0022
    150–<200   21     0.21       .0042
    200–<300   26     0.26       .0026
    300–<400   12     0.12       .0012
    400–<500   4      0.04       .0004
    500–<600   3      0.03       .0003
    600–<900   2      0.02       .00007
               100    1.00
   c. .79

23.
    Class     Freq        Class       Freq
    10–<20    8           1.1–<1.2    2
    20–<30    14          1.2–<1.3    6
    30–<40    8           1.3–<1.4    7
    40–<50    4           1.4–<1.5    9
    50–<60    3           1.5–<1.6    6
    60–<70    2           1.6–<1.7    4
    70–<80    1           1.7–<1.8    5
              40          1.8–<1.9    1
                                      40
   The original distribution is positively skewed. The transformation creates a much more symmetric, mound-shaped histogram.

25. a.
    Class interval   Freq   Rel. Freq.
    0–<50            9      0.18
    50–<100          19     0.38
    100–<150         11     0.22
    150–<200         4      0.08
    200–<250         2      0.04
    250–<300         2      0.04
    300–<350         1      0.02
    350–<400         1      0.02
    ≥400             1      0.02
                     50     1.00
   The distribution is skewed to the right, or positively skewed. There is a gap in the histogram, and what appears to be an outlier in the '500–550' interval.

27. b.
    Class interval   Freq.   Rel. Freq.
    2.25–<2.75       2       0.04
    2.75–<3.25       2       0.04
    3.25–<3.75       3       0.06
    3.75–<4.25       8       0.16
    4.25–<4.75       18      0.36
    4.75–<5.25       10      0.20
    5.25–<5.75       4       0.08
    5.75–<6.25       3       0.06
   The distribution of the natural logs of the original data is much more symmetric than the original. c. .56, .14

29. d. The relative frequency distribution is:
    Class      Rel. freq.       Class        Rel. freq.
    0–<150     .193             900–<1050    .019
    150–<300   .183             1050–<1200   .029
    300–<450   .251             1200–<1350   .005
    450–<600   .148             1350–<1500   .004
    600–<750   .097             1500–<1650   .001
    750–<900   .066             1650–<1800   .002
                                1800–<1950   .002
   The relative frequency distribution is almost unimodal and exhibits a large positive skew. The typical middle value is somewhere between 400 and 450, although the skewness makes it difficult to pinpoint more exactly than this. e. .775, .014 f. .211

31. a. 5.24 b. The median, 2, is much lower because of positive skewness. c. Trimming the largest and smallest observations yields the 5.9% trimmed mean, 4.4, which is between the mean and median.

33. a. A stem-and-leaf display:
    32 | 55        Stem: hundreds and tens
    33 | 49        Leaf: ones
    34 |
    35 | 6699
    36 | 34469
    37 | 03345
    38 | 9
    39 | 2347
    40 | 23
    41 |
    42 | 4
   The display is reasonably symmetric, so the mean and median will be close. b. 370.7, 369.50 c. The largest value (currently 424) could be increased by any amount without changing the median. It can be decreased to any value at least 370 without changing the median. d. 6.18 min; 6.16 min

35. a. 125 b. If 127.6 is reported as 130, then the median is 130, a substantial change. When there is rounding or grouping, the median can be highly sensitive to a small change.

37. x̃ = 92; x̄tr(25) = 95.07; x̄tr(10) = 102.23; x̄ = 119.3. Positive skewness causes the mean to be larger than the median. Trimming moves the mean closer to the median.

39. a. ȳ = x̄ + c, ỹ = x̃ + c b. ȳ = cx̄, ỹ = cx̃

41. a. 25.8 b. 49.31 c. 7.02 d. 49.31

43. a. 2887.6, 2888 b. 7060.3

45. 24.36

47. $1,961,160

49. −3.5; 1.3, 1.9, 2.0, 2.3, −2.5

51. a. 1, 6, 5 b. The box plot shows positive skewness. The two longest runs are extreme outliers. c. outlier: greater than 13.5 or less than −6.5; extreme outlier: greater than 21 or less than −14 d. The largest observation could be decreased to 6 without affecting f_s

53. a. The mean is 27.82, the median is 26, and the 5% trimmed mean is 27.38. The mean exceeds the median, in accord with positive skewness. The trimmed mean is between the mean and median, as you would expect. b. There are two outliers at the high end and one at the low end, but there are no extreme outliers. Because the median is in the lower half of the box, the upper whisker is longer than the lower whisker, and there are two high outliers compared to just one low outlier, the plot suggests positive skewness.

55. The two distributions are centered in about the same place, but one machine is much more variable than the other. The more precise machine produced one outlier, but this part would not be an outlier if judged by the distribution of the other machine.

57. All of the Indian salaries are below the first quartile of Yankee salaries. There is much more variability in the Yankee salaries. Neither team has any outliers.

61. The three flow rates yield similar uniformities, but the values for the 160 flow rate are a little higher.

63. a. 9.59, 59.41. The standard deviations are large, so it is certainly not true that repeated measurements are identical. b. .396, .323. In terms of the coefficient of variation, the HC emissions are more variable.

65. 10.65

67. a. ȳ = ax̄ + b; s²_y = a²s²_x b. 100.78, .572
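Many of the answers above (33–53) report means, medians, and trimmed means. As an illustrative check (not part of the original answers; the data vector below is hypothetical), these summaries can be computed in Python:

    # Illustrative sketch: mean, median, and trimmed mean for a hypothetical sample.
    import numpy as np
    from scipy import stats

    x = np.array([2.3, 2.8, 3.1, 3.4, 3.9, 4.2, 4.8, 5.6, 6.1, 18.9])
    print(np.mean(x), np.median(x))
    print(stats.trim_mean(x, 0.10))   # 10% trimmed mean (10% cut from each tail)

Note how the trimmed mean falls between the mean and the median when the data are positively skewed, as several answers above observe.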

69. The mean is .93 and the standard deviation is .081. The distribution is fairly symmetric with a central peak, as shown by the stem-and-leaf display:
    Leaf unit = 0.010
     7 | 7
     8 | 11
     8 | 556
     9 | 22333344
     9 | 55
    10 | 04
    10 | 55

71. a. The mode is .93. It occurs four times in the data set. b. The modal category is the one in which the most observations occur.

73. The measures that are sensitive to outliers are the mean and the midrange. The mean is sensitive because all values are used in computing it. The midrange is the most sensitive because it uses only the most extreme values in its computation. The median, the trimmed mean, and the midfourth are less sensitive to outliers. The median is the most resistant to outliers because it uses only the middle value (or values) in its computation. The midfourth is also quite resistant because it uses the fourths. The resistance of the trimmed mean increases with the trimming percentage.

75. a. s²_y = s²_x and s_y = s_x b. s²_z = 1 and s_z = 1

77. b. .552, .102 c. 30 d. 19

79. a. There may be a tendency to a repeating pattern. The value .1 gives a much smoother series. b. c. The smoothed value depends on all previous values of the series, but the coefficient decreases with k. d. As t gets large, the coefficient (1 − a)^(t−1) decreases to zero, so there is decreasing sensitivity to the initial value.

Chapter 2

1. a. A ∩ B′ b. A ∪ B c. (A ∩ B′) ∪ (B ∩ A′)

3. a. S = {1324, 1342, 1423, 1432, 2314, 2341, 2413, 2431, 3124, 3142, 4123, 4132, 3214, 3241, 4213, 4231} b. A = {1324, 1342, 1423, 1432} c. B = {2314, 2341, 2413, 2431, 3214, 3241, 4213, 4231} d. A ∪ B = {1324, 1342, 1423, 1432, 2314, 2341, 2413, 2431, 3214, 3241, 4213, 4231}; A ∩ B = ∅; A′ = {2314, 2341, 2413, 2431, 3124, 3142, 4123, 4132, 3214, 3241, 4213, 4231}

5. a. A = {SSF, SFS, FSS} b. B = {SSS, SSF, SFS, FSS} c. C = {SSS, SSF, SFS} d. C′ = {SFF, FSS, FSF, FFS, FFF}; A ∪ C = {SSS, SSF, SFS, FSS}; A ∩ C = {SSF, SFS}; B ∪ C = {SSS, SSF, SFS, FSS}; B ∩ C = {SSS, SSF, SFS}

7. a. {111, 112, 113, 121, 122, 123, 131, 132, 133, 211, 212, 213, 221, 222, 223, 231, 232, 233, 311, 312, 313, 321, 322, 323, 331, 332, 333} b. {111, 222, 333} c. {123, 132, 213, 231, 312, 321} d. {111, 113, 131, 133, 311, 313, 331, 333}

9. a. S = {BBBAAAA, BBABAAA, BBAABAA, BBAAABA, BBAAAAB, BABBAAA, BABABAA, BABAABA, BABAAAB, BAABBAA, BAABABA, BAABAAB, BAAABBA, BAAABAB, BAAAABB, ABBBAAA, ABBABAA, ABBAABA, ABBAAAB, ABABBAA, ABABABA, ABABAAB, ABAABBA, ABAABAB, ABAAABB, AABBBAA, AABBABA, AABBAAB, AABABBA, AABABAB, AABAABB, AAABBBA, AAABBAB, AAABABB, AAAABBB} b. {AAAABBB, AAABABB, AAABBAB, AABAABB, AABABAB}

13. a. .07 b. .30 c. .57

15. a. They are awarded at least one of the first two projects, .36. b. They are awarded neither of the first two projects, .64. c. They are awarded at least one of the projects, .53. d. They are awarded none of the projects, .47. e. They are awarded only the third project, .17. f. Either they fail to get the first two or they are awarded the third, .75.

17. a. .572 b. .879

19. a. SAS and SPSS are not the only packages. b. .7 c. .8 d. .2

21. a. .8841 b. .0435

23. a. .10 b. .18, .19 c. .41 d. .59 e. .31 f. .69

25. a. 1/15 b. 6/15 c. 14/15 d. 8/15

27. a. .98 b. .02 c. .03 d. .24

29. a. 1/9 b. 8/9 c. 2/9

31. a. 20 b. 60 c. 10

33. a. 243 b. 3645, 10

35. .0679

37. .2

39. .0456

41. a. .0839 b. .24975 c. .1998

43. a. 1/15 b. 1/3 c. 2/3

45. a. .447, .5, .2 b. P(A|C) = .4, the fraction of ethnic group C that has blood type A. P(C|A) = .447, the fraction of those with blood group A that are of ethnic group C. c. .211

47. a. Of those with a Visa card, .5 is the proportion who also have a Master Card. b. Of those with a Visa card, .5 is the proportion who do not have a Master Card.

   c. Of those with a Master Card, .625 is the proportion who also have a Visa card. d. Of those with a Master Card, .375 is the proportion who do not have a Visa card. e. Of those with at least one of the two cards, .769 is the proportion who have a Visa card.

49. .217, .178

51. .436, .582

53. .0833

59. a. .067 b. .509

61. .287

63. a. 76.5% b. .235

65. .466, .288, .247

67. a. Because of independence, the conditional probability is the same as the unconditional probability, .3. b. .82 c. .146

71. .349, .651, (1 − p)^n, 1 − (1 − p)^n

73. .99999969, .226

75. .9981

77. Yes, no

79. a. 2p − p² b. 1 − (1 − p)^n c. (1 − p)³ d. .9 + .1(1 − p)³ e. .0137

81. .8588, .9896

83. 2p(1 − p)

85. a. 1/3, .444 b. .15 c. .291

87. .45, .32

89. a. 1/120 b. 1/5 c. 1/5

91. .9046

93. a. .904 b. .766

95. .008

97. .362, .348, .290

99. a. P(G | R1 < R2 < R3) = 2/3, so classify as granite if R1 < R2 < R3. b. P(G | R1 < R3 < R2) = .294, so classify as basalt if R1 < R3 < R2. P(G | R3 < R1 < R2) = 1/15, so classify as basalt if R3 < R1 < R2. c. .175 d. p > 14/17

101. a. 1/24 b. 3/8

103. s = 1

107. a. P(B0|survive) = b0/[1 − (b1 + b2)cd]; P(B1|survive) = b1(1 − cd)/[1 − (b1 + b2)cd]; P(B2|survive) = b2(1 − cd)/[1 − (b1 + b2)cd] b. .712, .058, .231

Chapter 3

1.
    S: FFF SFF FSF FFS FSS SFS SSF SSS
    X:  0   1   1   1   2   2   2   3

3. M = the absolute value of the difference between the outcomes, with possible values 0, 1, 2, 3, 4, 5, or 6; W = 1 if the sum of the two resulting numbers is even and W = 0 otherwise, a Bernoulli random variable.

5. No, X can be a Bernoulli random variable where a success is an outcome in B, with B a particular subset of the sample space.

7. a. Possible values are 0, 1, 2, ..., 12; discrete b. With N = # on the list, values are 0, 1, 2, ..., N; discrete c. Possible values are 1, 2, 3, 4, ...; discrete d. {x: 0 < x < ∞} if we assume that a rattlesnake can be arbitrarily short or long; not discrete e. With c = amount earned per book sold, possible values are 0, c, 2c, 3c, ..., 10,000c; discrete f. {y: 0 ≤ y ≤ 14} since 0 is the smallest possible pH and 14 is the largest possible pH; not discrete g. With m and M denoting the minimum and maximum possible tension, respectively, possible values are {x: m ≤ x ≤ M}; not discrete h. Possible values are 3, 6, 9, 12, 15, ... — i.e., 3(1), 3(2), 3(3), 3(4), ..., giving a first element, etc.; discrete

9. a. X is a discrete random variable with possible values {2, 4, 6, 8, ...} b. X is a discrete random variable with possible values {2, 3, 4, 5, ...}

11. a. p(4) = .10 c. .45, .25

13. a. .70 b. .45 c. .55 d. .71 e. .65 f. .45

15. a. (1,2) (1,3) (1,4) (1,5) (2,3) (2,4) (2,5) (3,4) (3,5) (4,5) b. p(0) = .3, p(1) = .6, p(2) = .1, p(x) = 0 otherwise c. F(0) = .30, F(1) = .90, F(2) = 1. The c.d.f. is F(x) = 0 for x < 0; .30 for 0 ≤ x < 1; .90 for 1 ≤ x < 2; 1 for 2 ≤ x

17. a. .81, .162 b. The fifth battery must be an A, and one of the first four must also be an A, so p(5) = P(AUUUA or UAUUA or UUAUA or UUUAA) = .00324 d. P(Y = y) = (y − 1)(.1)^(y−2)(.9)², y = 2, 3, 4, 5, ...

19. c. F(x) = 0, x < 1; F(x) = log10([x] + 1), 1 ≤ x ≤ 9; F(x) = 1, x > 9 d. .602, .301

21. F(x) = 0, x < 0; .10, 0 ≤ x < 1; .25, 1 ≤ x < 2; .45, 2 ≤ x < 3; .70, 3 ≤ x < 4; .90, 4 ≤ x < 5; .96, 5 ≤ x < 6; 1.00, 6 ≤ x

23. a. p(1) = .30, p(3) = .10, p(4) = .05, p(6) = .15, p(12) = .40 b. .30, .60

25. a. p(x) = (1/3)(2/3)^(x−1), x = 1, 2, 3, ... b. p(y) = (1/3)(2/3)^(y−2), y = 2, 3, 4, ... c. p(0) = 1/6, p(z) = (25/54)(4/9)^(z−1), z = 1, 2, 3, 4, ...

29. a. .60 b. $110

31. a. 16.38, 272.298, 3.9936 b. 401 c. 2496 d. 13.66

33. Yes, because Σ(1/x²) is finite.

35. $700

37. E[h(X)] = .408 > 1/3.5 = .286, so you expect to win more if you gamble.

39. V(−X) = V(X)

41. a. 32.5 b. 7.5 c. V(X) = E[X(X − 1)] + E(X) − [E(X)]²

43. a. 1/4, 1/9, 1/16, 1/25, 1/100 b. μ = 2.64, σ = 1.54, P(|X − μ| ≥ 2σ) = .04 < .25, P(|X − μ| ≥ 3σ) = 0 < 1/9. The actual probability can be far below the Chebyshev bound, so the bound is conservative. c. 1/9, equal to the Chebyshev bound d. P(−1) = .02, P(0) = .96, P(1) = .02

45. M_X(t) = .5e^t/(1 − .5e^t), E(X) = 2, V(X) = 2

47. p_Y(y) = .75(.25)^(y−1), y = 1, 2, 3, ...

49. E(X) = 5, V(X) = 4

51. M_Y(t) = e^(t²/2), E(X) = 0, V(X) = 1

53. E(X) = 0, V(X) = 2

59. a. .850 b. .200 c. .200 d. .701 e. .851 f. .000 g. .570

61. a. .354 b. .114 c. .919

63. a. .403 b. .787 c. .773

65. .1478

67. .4068, assuming independence

69. a. .0173 b. .8106, .4246 c. .0056, .9022, .5858

71. For p = .9 the probability is higher for B (.9963 versus .99 for A). For p = .5 the probability is higher for A (.75 versus .6875 for B).

73. The tabulation for p > .5 is not needed.

75. a. 20, 16 (binomial, n = 100, p = .2) b. 70, 21

77. When p = .5, the true probability for k = 2 is .0414, compared to the bound of .25. When p = .5, the true probability for k = 3 is .0026, compared to the bound of .1111. When p = .75, the true probability for k = 2 is .0652, compared to the bound of .25. When p = .75, the true probability for k = 3 is .0039, compared to the bound of .1111.

79. M_{n−X}(t) = [p + (1 − p)e^t]^n, E(n − X) = n(1 − p), V(n − X) = np(1 − p). Intuitively, the means of X and n − X should add to n and their variances should be the same.

81. a. .114 b. .879 c. .121 d. Use the binomial distribution with n = 15 and p = .1

83. a. h(x; 15, 10, 20) b. .0325 c. .6966

85. a. h(x; 10, 10, 20) b. .0325 c. h(x; n, n, 2n), E(X) = n/2, V(X) = n²/[4(2n − 1)]

87. a. nb(x; 2, .5) = (x + 1).5^(x+2), x = 0, 1, 2, 3, ... b. 3/16 c. 11/16 d. 2, 4

89. nb(x; 6, .5), E(X) = 6 = 3(2)

93. a. .932 b. .065 c. .068 d. .491 e. .251

95. a. .011 b. .441 c. .554, .459 d. .944

97. a. .491 b. .133

99. a. .122, .808, .283 b. 12, 3.464 c. .530, .011

101. a. .099 b. .135 c. 2

103. a. 4 b. .215 c. 1.15 years

105. a. .221 b. 6,800,000 c. p(x; 1608.5)

111. b. 3.114, .405, .636

113. a. b(x; 15, .75) b. .6865 c. .313 d. 45/4, 45/16 e. .309

115. .9914

117. a. p(x; 2.5) b. .067 c. .109

119. 1.813, 3.05

121. p(2) = p², p(3) = (1 − p)p², p(4) = (1 − p)p², p(x) = [1 − p(2) − ... − p(x − 3)](1 − p)p², x = 5, 6, 7, .... Alternatively, p(x) = (1 − p)p(x − 1) + p(1 − p)p(x − 2), x = 5, 6, 7, ...; .99950841

123. a. .0029 b. .0767, .9702

125. a. .135 b. .00144 c. Σ_{x=0}^∞ [p(x; 2)]⁵

127. 3.590

129. a. No b. .0273

131. b. .6p(x; λ) + .4p(x; μ) c. (λ + μ)/2 d. (λ + μ)/2 + (λ − μ)²/4

133. .5

137. X ~ b(x; 25, p), E(h(X)) = 500p + 750, σ_h(X) = 100√(p(1 − p)). Independence and constant probability might not be valid because of the effect that customers can have on each other. Also, store employees might affect customer decisions.

139.
    x     0       1       2       3       4
    p(x)  .07776  .10368  .19008  .20736  .17280

    x     5       6       7       8
    p(x)  .13824  .06912  .03072  .01024
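The b(x; n, p), h(x; n, M, N), nb(x; r, p), and p(x; λ) values quoted in this chapter come from the appendix tables; as an illustrative check (not part of the original answers; the arguments below are examples, not answers to particular exercises), scipy.stats can reproduce such probabilities:

    # Illustrative sketch: checking tabled discrete probabilities with scipy.
    from scipy import stats

    print(stats.binom.pmf(3, n=15, p=0.1))       # b(3; 15, .1)
    print(stats.binom.cdf(3, n=15, p=0.1))       # B(3; 15, .1), cumulative
    print(stats.hypergeom.pmf(5, 20, 10, 15))    # h(5; 15, 10, 20): N = 20, M = 10, n = 15
    print(stats.nbinom.pmf(2, 6, 0.5))           # nb(2; 6, .5): failures before 6th success
    print(stats.poisson.cdf(4, mu=2.5))          # cumulative Poisson, λ = 2.5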

Chapter 4

1. a. 25 b. .5 c. 7/16

3. b. .5 c. 11/16 d. .6328

5. a. 3/8 b. 1/8 c. .2969 d. .5781

7. a. f(x) = 1/10 for 25 ≤ x ≤ 35 and 0 otherwise b. .2 c. .4 d. .2

9. a. .5618 b. .4382, .4382 c. .0709

11. a. 1/4 b. 3/16 c. 15/16 d. √2 e. f(x) = x/2 for 0 ≤ x < 2, and f(x) = 0 otherwise

13. a. 3 b. 0 for x ≤ 1, 1 − 1/x³ for x > 1 c. 1/8, .088

15. a. F(x) = 0 for x ≤ 0, F(x) = x³/8 for 0 < x < 2, F(x) = 1 for x ≥ 2 b. 1/64 c. .0137, .0137 d. 1.817

17. b. 90th percentile of Y = 1.8(90th percentile of X) + 32 c. 100pth percentile of Y = a(100pth percentile of X) + b

19. a. 1.5, .866 b. .9245

21. a. .8182, .1113 b. .044

23. a. A + (B − A)p b. (A + B)/2, (B − A)²/12, (B − A)/√12 c. (B^(n+1) − A^(n+1))/[(n + 1)(B − A)]

25. 314.79

27. 248, 3.6

29. 1/(1 − t/4), 1/4, 1/16

31. 100π, 30π

33. f(x) = 1/10 for −5 ≤ x ≤ 5 and 0 otherwise; .5

35. a. M(t) = .15e^t/(.15 − t), t < .15; E(X) = 7.167, V(X) = 44.44 b. E(X) = 7.167, V(X) = 44.44

37. M(t) = .15/(.15 − t), E(X) = 6.667, V(X) = 44.44. This distribution is shifted left by .5, so the mean differs by .5 but the variance is the same.

39. a. .4850 b. .3413 c. .4938 d. .9876 e. .9147 f. .9599 g. .9104 h. .0791 i. .0668 j. .9876

41. a. 1.34 b. −1.34 c. .674 d. −.674 e. −1.555

43. a. .9772 b. .5 c. .9104 d. .8413 e. .2417 f. .6826

45. a. .7977 b. .0004 c. The top 5% are the values above .3987.

47. The second machine

49. a. .2525 b. 39.96

51. .0510

53. a. .8664 b. .0124 c. .2718

55. a. .794 b. 5.88 c. 7.94 d. .265

57. No, because of symmetry.

59. a. approximate, .0391; binomial, .0437 b. approximate, .99993; binomial, .99976

61. a. .7287 b. .8643, .8159

63. a. approximate, .9933; binomial, .9905 b. approximate, .9874; binomial, .9837 c. approximate, .8051; binomial, .8066

67. a. .15866 b. .0013499 c. .999936658 Actual: .15866, .0013499, .999936658 d. .00000028665

69. a. 120 b. 1.329 c. .371 d. .735 e. 0

71. a. 5, 4 b. .715 c. .411

73. a. 1 b. 1 c. .982 d. .129

75. a. .449, .699, .148 b. .050, .018

77. a. ∩A_i b. Exponential with λ = .05 c. Exponential with nλ

83. a. .8257, .8257, .0636 b. .6637 c. 172.73

87. a. .9296 b. .2975 c. 98.18

89. a. 68.03, 122.09 b. .3196 c. .7257, skewness

91. a. 149.157, 223.595 b. .957 c. .0416 d. 148.41 e. 9.57 f. 125.90

93. α = β

95. b. Γ(α + β)Γ(m + β)/[Γ(α + β + m)Γ(β)], β/(α + β)

97. Yes, since the pattern in the plot is quite linear.

99. Yes

101. Yes

103. Form a new variable, a transformation of the rainfall values, and then construct a normal plot for the new variable. Because of the linearity of this plot, normality is plausible.

105. The normal plot has a nonlinear pattern showing positive skewness.

107. The plot deviates from linearity, especially at the low end, where the smallest three observations are too small relative to the others. The plot works for any λ because λ is a scale parameter.

109. f_Y(y) = 2/y³, y > 1

111. f_Y(y) = y e^(−y²/2), y > 0

113. f_Y(y) = 1/16, 0 < y < 16

115. f_Y(y) = 1/[π(1 + y²)]

117. Y = X²/16

119. f_Y(y) = 1/[2√y], 0 < y < 1

121. f_Y(y) = 1/[4√y], 0 < y < 1; f_Y(y) = 1/[8√y], 1 < y < 9

125. p_Y(y) = (1 − p)^(y−1) p, y = 1, 2, 3, ...

127. a. .4 b. .6 c. F(x) = x/25, 0 ≤ x ≤ 25; F(x) = 0, x < 0; F(x) = 1, x > 25 d. 10Kx² + .05, 20 ≤ x ≤ 30

129. b. F(x) = 1 − 16/(x + 4)², x ≥ 0; F(x) = 0, x < 0 c. .247 d. 4 e. 16.67

131. a. .6563 b. 41.55 c. .3179

133. a. .00025, normal approximation; .000859, binomial b. .0888, normal approximation; .0963, binomial

135. a. F(x) = 1.5(1 − 1/x), 1 ≤ x ≤ 3; F(x) = 0, x < 1; F(x) = 1, x > 3 b. .9, .4 c. 1.6479 d. .5333 e. .2662

137. a. 1.075, 1.075 b. .0614, .3331 c. 2.476

139. b. 95,693, 1/3

141. b. F(x) = .5e^(.2x), x ≤ 0; F(x) = 1 − .5e^(−.2x), x > 0 c. .5, .6648, .2555, .6703

143. a. k = (α − 1)5^(α−1) b. F(x) = 0, x ≤ 5; F(x) = 1 − (5/x)^(α−1), x > 5 c. 5(α − 1)/(α − 2)

145. b. .4602, .3636 c. .5950 d. 140.178

147. a. Weibull b. .5422

149. a. λ b. αx^(α−1)/β^α c. F(x) = 1 − e^(−α(x − x²/(2β))), 0 ≤ x ≤ β; F(x) = 0, x < 0; F(x) = 1 − e^(−αβ/2), x > β. f(x) = α(1 − x/β)e^(−α(x − x²/(2β))), 0 ≤ x ≤ β; f(x) = 0, x < 0; f(x) = 0, x > β. This gives total probability less than 1, so some probability is located at infinity (for items that last forever).

151. μ_R ≈ v/20, σ_R ≈ v/800

155. F(q*) = .818

Chapter 5

1. a. .20 b. .42 c. The probability of at least one hose being in use at each pump is .70. d.
    x       0     1     2          y       0     1     2
    p_X(x)  .16   .34   .50        p_Y(y)  .24   .38   .38
   P(X ≤ 1) = .50 e. dependent, .30 = P(X = 2 and Y = 2) ≠ P(X = 2) · P(Y = 2) = (.50)(.38)

3. a. .15 b. .40 c. .22 = P(A) = P(|X₁ − X₂| ≥ 2) d. .17, .46 e.
    x₁       0     1     2     3     4
    p₁(x₁)   .19   .30   .25   .14   .12
   E(X₁) = 1.7 f.
    x₂       0     1     2     3
    p₂(x₂)   .19   .30   .28   .23
   g. 0 = p(4, 0) ≠ p₁(4) · p₂(0) = (.12)(.19), so the two variables are not independent.

5. a. .54 b. .00018

7. a. .030 b. .120 c. .10, .30 d. .38 e. yes, p(x, y) = p_X(x) · p_Y(y)

9. a. 3/380,000 b. .3024 c. .3593 d. 12.5, 7.22 e. no

11. a. p(x, y) = (e^(−λ)λ^x/x!)(e^(−μ)μ^y/y!) for x = 0, 1, 2, ...; y = 0, 1, 2, ... b. e^(−λ−μ)(1 + λ + μ) c. e^(−λ−μ)(λ + μ)^m/m!, Poisson with parameter λ + μ

13. a. e^(−x−y), x ≥ 0, y ≥ 0 b. .3996 c. .5940 d. .3298

15. a. F(y) = 1 − 2e^(−2λy) + e^(−3λy) for y ≥ 0, F(y) = 0 for y < 0; f(y) = 4λe^(−2λy) − 3λe^(−3λy) for y ≥ 0, f(y) = 0 for y < 0 b. 2/(3λ)

17. a. .25 b. 1/π c. 2/π d. f_X(x) = 2√(R² − x²)/(πR²) for −R ≤ x ≤ R, f_Y(y) = 2√(R² − y²)/(πR²) for −R ≤ y ≤ R, no

19. .15

21. L²

23. 1/4 h

25. −2/3

27. a. −.1058 b. −.0128

37. a. f_X(x) = 2x, 0 < x < 1, f_X(x) = 0 elsewhere b. f_{Y|X}(y|x) = 1/x, 0 < y < x < 1 c. .6 d. no, the domain is not a rectangle e. E(Y|X = x) = x/2, a linear function of x f. V(Y|X = x) = x²/12

39. a. f_X(x) = 2e^(−2x), 0 < x < ∞, f_X(x) = 0, x ≤ 0 b. f_{Y|X}(y|x) = e^(−y+x), 0 < x < y < ∞ c. P(Y > 2|x = 1) = 1/e d. no, the domain is not rectangular e. E(Y|X = x) = x + 1, a linear function of x f. V(Y|X = x) = 1

41. a. E(Y|X = x) = x/2, a linear function of x; V(Y|X = x) = x²/12 b. f(x, y) = 1/x, 0 < y < x < 1 c. f_Y(y) = −ln(y), 0 < y < 1 d. E(Y) = 1/4, V(Y) = 7/144 e. E(Y) = 1/4, V(Y) = 7/144

43. a. p_{Y|X}(0|1) = 4/17, p_{Y|X}(1|1) = 10/17, p_{Y|X}(2|1) = 3/17 b. p_{Y|X}(0|2) = .12, p_{Y|X}(1|2) = .28, p_{Y|X}(2|2) = .60 c. .40 d. p_{X|Y}(0|2) = 1/19, p_{X|Y}(1|2) = 3/19, p_{X|Y}(2|2) = 15/19

45. a. E(Y|X = x) = x²/2 b. V(Y|X = x) = x⁴/12 c. f_Y(y) = y^(−.5) − 1, 0 < y < 1

47. a. p(1,1) = p(2,2) = p(3,3) = 1/9, p(2,1) = p(3,1) = p(3,2) = 2/9 b. p_X(1) = 1/9, p_X(2) = 3/9, p_X(3) = 5/9 c. p_{Y|X}(1|1) = 1, p_{Y|X}(1|2) = 2/3, p_{Y|X}(2|2) = 1/3, p_{Y|X}(1|3) = .4, p_{Y|X}(2|3) = .4, p_{Y|X}(3|3) = .2 d. E(Y|X = 1) = 1, E(Y|X = 2) = 4/3, E(Y|X = 3) = 1.8, no e. V(Y|X = 1) = 0, V(Y|X = 2) = 2/9, V(Y|X = 3) = .56

49. a. p_{X|Y}(1|1) = .2, p_{X|Y}(2|1) = .4, p_{X|Y}(3|1) = .4, p_{X|Y}(2|2) = 1/3, p_{X|Y}(3|2) = 2/3, p_{X|Y}(3|3) = 1 b. E(X|Y = 1) = 2.2, E(X|Y = 2) = 8/3, E(X|Y = 3) = 3, no c. V(X|Y = 1) = .56, V(X|Y = 2) = 2/9, V(X|Y = 3) = 0

51. a. 2x − 10 b. 9 c. 3 d. .0228

53. a. p_X(x) = .1, x = 0, 1, 2, ..., 9; p_{Y|X}(y|x) = 1/9, y = 0, 1, 2, ..., 9, y ≠ x; p_{X,Y}(x, y) = 1/90, x, y = 0, 1, 2, ..., 9, y ≠ x b. E(Y|X = x) = 5 − x/9, x = 0, 1, 2, ..., 9, a linear function of x

55. a. .6x, .24x b. 60 c. 60

57. a. .1410 b. .1165 With positive correlation, the deviations from their means of X and Y are likely to have the same sign.

59. a. If U = X₁ + X₂, f_U(u) = u², 0 < u < 1; f_U(u) = 2u − u², 1 < u < 2; f_U(u) = 0 elsewhere b. If V = X₂ − X₁, f_V(v) = 2 − 2v, 0 < v < 1; f_V(v) = 0 elsewhere

61. 4y₃[ln(y₃)]², 0 < y₃ < 1

65. a. g₅(y) = 5y⁴/10⁵, 25/3 b. 20/3 c. 5 d. 1.409

67. g_{Y₅|Y₁}(y₅|4) = (4/6)[(y₅ − 4)/6]³, 4 < y₅ < 10; 8.8

69. 1/(n + 1), 2/(n + 1), 3/(n + 1), ..., n/(n + 1)

71. Γ(n + 1)Γ(i + 1/θ)/[Γ(i)Γ(n + 1 + 1/θ)], Γ(n + 1)Γ(i + 2/θ)/[Γ(i)Γ(n + 1 + 2/θ)] − {Γ(n + 1)Γ(i + 1/θ)/[Γ(i)Γ(n + 1 + 1/θ)]}²

73. a. .0238 b. $2025

75. g_{i,j}(y_i, y_j) = n!/[(i − 1)!(j − i − 1)!(n − j)!] · [F(y_i)]^(i−1)[F(y_j) − F(y_i)]^(j−i−1)[1 − F(y_j)]^(n−j) f(y_i)f(y_j), −∞ < y_i < y_j < ∞

77. a. f_{W₂}(w₂) = n(n − 1)∫_{−∞}^{∞} [F(w₁ + w₂) − F(w₁)]^(n−2) f(w₁)f(w₁ + w₂) dw₁ b. f_{W₂}(w₂) = n(n − 1)w₂^(n−2)(1 − w₂), 0 < w₂ < 1

79. f(x) = e^(−x/2) − e^(−x), x ≥ 0; f(x) = 0, x < 0.

81. a. 3/81,250 b. f_X(x) = ∫_{20−x}^{30−x} kxy dy = k(250x − 10x²), 0 ≤ x ≤ 20; f_X(x) = ∫_0^{30−x} kxy dy = k(450x − 30x² + x³/2), 20 ≤ x ≤ 30; f_Y(y) = f_X(y); dependent c. .3548 d. 25.969 e. −32.19, −.894 f. 7.651

83. 7/6

87. c. If p(0) = .3, p(1) = .5, p(2) = .2, then 1 is the smaller of the two roots, so extinction is certain in this case with μ < 1. If p(0) = .2, p(1) = .5, p(2) = .3, then 2/3 is the smaller of the two roots, so extinction is not certain with μ > 1.

89. a. P((X,Y) ∈ A) = F(b, d) − F(b, c) − F(a, d) + F(a, c) b. P((X,Y) ∈ A) = F(10, 6) − F(10, 1) − F(4, 6) + F(4, 1); P((X,Y) ∈ A) = F(b, d) − F(b, c − 1) − F(a − 1, d) + F(a − 1, c − 1) c. At each (x*, y*), F(x*, y*) is the sum of the probabilities at points (x, y) such that x ≤ x* and y ≤ y*:
    F(x, y)       x = 100   x = 250
    y = 200       .50       1
    y = 100       .30       .50
    y = 0         .20       .25

91. a. 2x, x b. 40 c. .100

93. M_W(t) = 2/[(1 − 1000t)(2 − 1000t)], 1500

Chapter 6

1. a.
    x̄     25    32.5   40    45    52.5   65
    p(x̄)  .04   .20    .25   .12   .30    .09
   E(X̄) = 44.5 = μ
   b.
    s²      0     112.5   312.5   800
    p(s²)   .38   .20     .30     .12
   E(S²) = 212.25 = σ²

3.
    x/n     0       .1      .2      .3      .4      .5
    p(x/n)  0.0000  0.0000  0.0001  0.0008  0.0055  0.0264

    x/n     .6      .7      .8      .9      1.0
    p(x/n)  0.0881  0.2013  0.3020  0.2684  0.1074

5. a.
    x̄     1     1.5   2     2.5   3     3.5   4
    p(x̄)  .16   .24   .25   .20   .10   .04   .01
   b. P(X̄ ≤ 2.5) = .85 c.
    r      0     1     2     3
    p(r)   .30   .40   .22   .08
   d. .24

7.
    x̄    p(x̄)       x̄    p(x̄)       x̄    p(x̄)
    0.0  0.000045    1.4  0.090079    2.8  0.052077
    0.2  0.000454    1.6  0.112599    3.0  0.034718
    0.4  0.002270    1.8  0.125110    3.2  0.021699
    0.6  0.007567    2.0  0.125110    3.4  0.012764
    0.8  0.018917    2.2  0.113736    3.6  0.007091
    1.0  0.037833    2.4  0.094780    3.8  0.003732
    1.2  0.063055    2.6  0.072908    4.0  0.001866
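The exact sampling distributions tabulated above can also be approximated by simulation. Here is an illustrative Python sketch (not part of the original answers; the population pmf is hypothetical) of the sampling distribution of the sample mean:

    # Illustrative sketch: simulating the sampling distribution of X-bar.
    import numpy as np

    rng = np.random.default_rng(0)
    values = np.array([1, 2, 3, 4])
    probs = np.array([0.4, 0.3, 0.2, 0.1])

    samples = rng.choice(values, size=(10000, 2), p=probs)  # 10,000 samples of n = 2
    xbars = samples.mean(axis=1)
    for v, c in zip(*np.unique(xbars, return_counts=True)):
        print(v, c / 10000)      # estimated p(x-bar); compare with the exact tables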

11. a. 12, .01 b. 12, .005 c. With less variability, the second sample is more closely concentrated near 12.

13. a. No, the distribution is clearly not symmetric. A positively skewed distribution — perhaps Weibull, lognormal, or gamma. b. .0746 c. .00000092. No, 82 is not a reasonable value for μ.

15. a. .8366 b. no

17. 43.29

19. a. .9802, .4802 b. 32

21. a. .9839 b. .8932

27. a. 87,850, 19,100,116 b. In case of dependence, the mean calculation is still valid, but not the variance calculation. c. .9973

29. a. .2871 b. .3695

31. .0317; Because each piece is played by the same musicians, there could easily be some dependence. If they perform the first piece slowly, then they might perform the second piece slowly, too.

33. a. 45 b. 68.33 c. −1, 13.67 d. −5, 68.33

35. a. 50, 10.308 b. .0076 c. 50 d. 111.56 e. 131.25

37. a. .9615 b. .0617

39. a. .5, n(n + 1)/4 b. .25, n(n + 1)(2n + 1)/24

41. 10:52.74

43. .48

45. b. M_Y(t) = 1/[1 − t²/(2n)]^n

47. Because χ²_n is the sum of independent random variables, each distributed as χ²₁, the Central Limit Theorem applies.

53. a. 3.2 b. 10.04, the square of the answer to (a)

57. a. ν₂/(ν₂ − 2), ν₂ > 2 b. 2ν₂²(ν₁ + ν₂ − 2)/[ν₁(ν₂ − 2)²(ν₂ − 4)], ν₂ > 4

61. a. 4.32

65. a. The approximate value, .0228, is smaller because of skewness in the chi-squared distribution b. This approximation gives the answer .03237, agreeing with the software answer to this number of decimals.

67. No, the sum of the percentiles is not the same as the percentile of the sum, except that they are the same for the 50th percentile. For all other percentiles, the percentile of the sum is closer to the 50th percentile than is the sum of the percentiles.

69. a. 2360, 73.70 b. .9713

71. .9685

73. .9093. Independence is questionable because consumption one day might be related to consumption the next day.

75. .8340

77. a. ρ = σ²_W/(σ²_W + σ²_E) b. ρ = .9999

79. 26, 1.64

81. If Z₁ and Z₂ are independent standard normal observations, then let X = 5Z₁ + 100, Y = 2(.5Z₁ + (√3/2)Z₂) + 50

Chapter 7

1. a. 113.73, X̄ b. 113, X̃ c. 12.74, S, an estimator for the population standard deviation d. The sample proportion of students exceeding 100 in IQ is 30/33 = .91 e. .112, S/X̄

3. a. 1.3481, X̄ b. 1.3481, X̄ c. 1.78, X̄ + 1.282S d. .67 e. .0846

5. a. 1,703,000 b. 1,599,730 c. 1,601,438

7. a. 120.6 b. 1,206,000, 10,000X̄ c. .8 d. 120, X̃

9. a. X̄, 2.113 b. √(λ/n), .119

11. b. √(p₁(1 − p₁)/n₁ + p₂(1 − p₂)/n₂) c. In part (b) replace p₁ with X₁/n₁ and replace p₂ with X₂/n₂ d. −.245 e. .0411

13. a. .9876 b. .6915

15. a. θ̂ = ΣX_i²/(2n) b. 74.505

17. b. 4/9

19. a. p̂ = 2λ̂ − .30 = .20 c. p̂ = (100λ̂ − 9)/70

21. a. .15 b. yes c. .4437

23. a. θ̂ = (2x̄ − 1)/(1 − x̄) = 3 b. θ̂ = [−n/Σ ln(x_i)] − 1 = 3.12

25. p̂ = r/(r + x) = .15. This is the number of successes over the number of trials, the same as the result in Exercise 21. It is not the same as the estimate of Exercise 17.

27. a. σ̂² = (1/n)ΣX_i² b. σ̂² = (1/n)ΣX_i²

29. a. θ̂ = ΣX_i²/(2n) = 74.505, the same as in Exercise 15 b. √(2θ̂ ln 2) = 10.16

31. λ̂ = −ln(p̂)/24 = .0120

33. No, A does not have more information.

35. Π_{i=1}^n x_i, Σ_{i=1}^n x_i

37. I(.5 max(x₁, x₂, ..., xₙ) ≤ θ ≤ min(x₁, x₂, ..., xₙ))

39. a. 2X(n − X)/[n(n − 1)]

41. a. X̄ b. Φ((c − X̄)/√(1 − 1/n))

43. a. V(θ̃) = θ²/[n(n + 2)] b. θ²/n c. The variance in (a) is below the bound of (b), but the theorem does not apply because the domain is a function of the parameter.

45. a. x̄ b. N(μ, σ²/n) c. Yes, the variance is equal to the Cramér–Rao bound d. The answer in (b) shows that the asymptotic distribution of the theorem is actually exact here.

47. a. 2/σ² b. The answer in (a) is different from the answer, 1/(2σ⁴), to 46(a), so the information does depend on the parameterization.

49. λ̂ = 6/(6t₆ − t₁ − ... − t₅) = 6/(x₁ + 2x₂ + ... + 6x₆) = .0436, where x₁ = t₁, x₂ = t₂ − t₁, ..., x₆ = t₆ − t₅

53. 1.275, s = 1.462

55. b. no, E(σ̂²) = σ²/2, so 2σ̂² is unbiased

59. .416, .448

61. d(X) = (−1)^X, d(200) = 1, d(199) = −1

63. b. β̂ = Σx_iy_i/Σx_i² = 30.040, the estimated minutes per item; σ̂² = (1/n)Σ(y_i − β̂x_i)² = 16.912; 25β̂ = 751

Chapter 8

1. a. 99.5% b. 85% c. 2.97 d. 1.15

3. a. A narrower interval has a lower probability b. No, μ is not random c. No, the interval refers to μ, not individual observations d. No, a probability of .95 does not guarantee 95 successes in 100 trials

5. a. (4.52, 5.18) b. (4.12, 5.00) c. 55 d. 94

7. Increase n by a factor of 4. Decrease the width by a factor of 5.

9. a. (x̄ − 1.645σ/√n, ∞); (4.57, ∞) b. (x̄ − z_α · σ/√n, ∞) c. (−∞, x̄ + z_α · σ/√n); (−∞, 59.7)

11. 950; .8724 (normal approximation), .8731 (binomial)

13. a. (.99, 1.07) b. 158

15. a. 80% b. 98% c. 75%

17. .06, which is positive, suggesting that the population mean change is positive

19. (.513, .615)

21. .218

23. (.439, .814)

25. a. 381 b. 339

29. a. 1.341 b. 1.753 c. 1.708 d. 1.684 e. 2.704

31. a. 2.228 b. 2.131 c. 2.947 d. 4.604 e. 2.492 f. 2.715

33. a. (38.081, 38.439) b. (100.55, 101.19), yes

35. a. Assuming normality, a 95% lower confidence bound is 8.11. When the bound is calculated from repeated independent samples, roughly 95% of such bounds should be below the population mean. b. A 95% lower prediction bound is 7.03. When the bound is calculated from repeated independent samples, roughly 95% of such bounds should be below the value of an independent observation.

37. a. 378.85 b. 413.09 c. (333.88, 407.50)

39. 95%: (.0498, .0772)

41. a. (169.36, 179.37) b. (134.30, 214.43), which includes 152 c. The second interval is much wider, because it allows for the variability of a single observation. d. The normal probability plot gives no reason to doubt normality. This is especially important for part (b), but the large sample size implies that normality is not so critical for (a).

45. a. 18.307 b. 3.940 c. .95 d. .10

47. b. (2.34, 5.60)

49. a. (7.91, 12.00) b. Because of an outlier, normality is questionable for this data set. c. In MINITAB, put the data in C1 and execute the following macro 999 times:
    let k3 = N(c1)
    sample k3 c1 c3;
    replace.
    let k1 = mean(c3)
    stack k1 c5 c5
    end

51. a. (26.61, 32.94) b. Because of outliers, the weight gains do not seem normally distributed. c. In MINITAB, see Exercise 49(c).

53. a. (38.46, 38.84) b. Although the normal probability plot is not perfectly straight, there is not enough deviation to reject normality. c. In MINITAB, see Exercise 49(c).

55. a. (169.13, 205.43) b. Because of an outlier, normality is questionable for this data set. c. In MINITAB, see Exercise 49(c).

57. a. In MINITAB, put the data in C1 and execute the following macro 999 times:
    let k3 = N(c1)
    sample k3 c1 c3;
    replace.
    let k1 = stdev(c3)
    stack k1 c5 c5
    end
   b. Assuming normality, a 95% confidence interval for σ is (3.541, 6.578), but the interval is inappropriate because the normality assumption is clearly not satisfied.
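The MINITAB resampling macros in 49(c) and 57(a) implement a percentile bootstrap. A Python equivalent is sketched below for illustration (not part of the original answers; the data vector is hypothetical, and 999 resamples mirror the macros' 999 runs; replace .mean() with .std(ddof=1) for the σ version of 57(a)):

    # Illustrative sketch: percentile bootstrap CI for the population mean.
    import numpy as np

    rng = np.random.default_rng(1)
    data = np.array([8.0, 7.6, 9.1, 8.5, 7.9, 8.8, 9.4, 8.2])

    boot = [rng.choice(data, len(data), replace=True).mean() for _ in range(999)]
    print(np.percentile(boot, [2.5, 97.5]))   # approximate 95% CI for the mean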

59. a. (.198, .230) b. .048 c. A 90% prediction interval is (.149, .279)

61. 246

63. a. A 95% confidence interval for the mean is (.163, .174). Yes, this interval is below the interval for 59(a). b. (.089, .326)

65. (0.1263, 0.3018)

67. a. yes b. (196.88, 222.62)

69. c. V(β̂) = σ²/Σx_i², σ_β̂ = σ/√(Σx_i²) d. Put the x_i's far from 0 to minimize σ_β̂ e. β̂ ± t_{α/2,n−1} s/√(Σx_i²), (29.93, 30.15)

73. a. .00985 b. .0578

75. a. (x̄ − (s/√n)t_{.025,n−1;δ}, x̄ − (s/√n)t_{.975,n−1;δ}) b. (3.01, 4.46)

77. a. 1/2ⁿ b. n/2ⁿ c. (n + 1)/2ⁿ, 1 − (n + 1)/2^(n−1); (29.9, 39.3) with confidence level .9785

79. a. P(A₁ ∩ A₂) = .95 b. P(A₁ ∩ A₂) ≥ .90 c. P(A₁ ∩ A₂) ≥ 1 − α₁ − α₂; P(A₁ ∩ A₂ ∩ ... ∩ A_k) ≥ 1 − α₁ − α₂ − ... − α_k

Chapter 9

1. a. yes b. no c. no d. yes e. no f. yes

5. H₀: σ = .05 vs. Ha: σ < .05. Type I error: Conclude that the standard deviation is less than .05 mm when it is really equal to .05 mm. Type II error: Conclude that the standard deviation is .05 mm when it is really less than .05.

7. A type I error here involves saying that the plant is not in compliance when in fact it is. A type II error occurs when we conclude that the plant is in compliance when in fact it isn't. A government regulator might regard the type II error as being more serious.

9. a. R₁ b. A type I error involves saying that the two companies are not equally favored when they are. A type II error involves saying that the two companies are equally favored when they are not. c. binomial, n = 25, p = .5; .0433 d. .3, .4881; .4, .8452; .6, .8452; .7, .4881 e. If only 6 favor the first company, then reject the null hypothesis and conclude that the first company is not preferred.

11. a. H₀: μ = 10 vs. Ha: μ ≠ 10 b. .0099 c. .5319, .0076 d. c = 2.58 e. c = 1.96 f. x̄ = 10.02, so do not reject H₀ g. Recalibrate if z ≤ −2.58 or z ≥ 2.58

13. b. .00043, .0000075, less than .01

15. a. .0301 b. .0030 c. .0040

17. a. Because z = 2.56 > 2.33, reject H₀ b. .84 c. 142 d. .0052

19. a. Because z = −2.27 > −2.58, do not reject H₀ b. .22 c. 22

21. Test H₀: μ = .5 vs. Ha: μ ≠ .5. a. Do not reject H₀ because t_{.025,12} = 2.179 > |1.6| b. Do not reject H₀ because t_{.025,12} = 2.179 > |−1.6| c. Do not reject H₀ because t_{.005,24} = 2.797 > |−2.6| d. Reject H₀ because t_{.005,24} = 2.797 < |−3.9|

23. Because t = 2.24 ≥ 1.708 = t_{.05,25}, reject H₀: μ = 360. Yes, this suggests contradiction of prior belief.

25. Because |z| = 3.37 ≥ 1.96, reject the null hypothesis. It appears that this population exceeds the national average in IQ.

27. a. no, t = −.02 b. 58 c. n = 20 total observations

29. a. Because t = .50 < 1.895 = t_{.05,7}, do not reject H₀. b. .73

31. Because t = −1.24 > −1.397 = t_{.10,8}, we do not have evidence to question the prior belief.

35. a. The distribution is fairly symmetric, without outliers. b. Because t = 4.25 ≥ 3.499 = t_{.005,7}, there is strong evidence to say that the amount poured differs from the industry standard, and indeed bartenders tend to exceed the standard. c. Yes, the test in (b) depends on normality, and a normal probability plot gives no reason to doubt the assumption. d. .643, .185, .016

37. a. Do not reject H₀: p = .10 in favor of Ha: p > .10 because z = 1.33 < 1.645. Because the null hypothesis is not rejected, there could be a type II error. b. .49, .27 c. 362

39. a. Do not reject H₀: p = .02 in favor of Ha: p < .02 because z = −1.1 > −1.645. There is no strong evidence suggesting that the inventory be postponed. b. .195 c. .0000001

41. a. Reject H₀ because z = 3.08 ≥ 2.58. b. .03

43. Using n = 25, the probability of 5 or more leaky faucets is .0980 if p = .10, and the probability of 4 or fewer leaky faucets is .0905 if p = .3. Thus, the rejection region is 5 or more, α = .0980, and β = .0905.

45. a. reject b. reject c. do not reject d. reject e. do not reject

47. a. .0778 b. .1841 c. .0250 d. .0066 e. .5438

49. a. P = .0403 b. P = .0176 c. P = .1304 d. P = .6532 e. P = .0021 f. P = .000022

51. Based on the given data, there is no reason to believe that pregnant women differ from others in terms of true average serum receptor concentration.

53. a. Because the P-value is .17, no modification is indicated. b. 997

55. Because t = −1.759 and the P-value = .089, which is less than .10, reject H₀: μ = 3.0 against a two-tailed alternative at the 10% level. However, the P-value exceeds .05, so do not reject H₀ at the 5% level. There is just a weak indication that the percentage is not equal to 3% (lower than 3%).

57. a. Test H₀: μ = 10 vs. Ha: μ < 10 b. Because the P-value is .017 < .05, reject H₀, suggesting that the pens do not meet specifications. c. Because the P-value is .045 > .01, do not reject H₀, suggesting there is no reason to say the lifetime is inadequate. d. Because the P-value is .0011, reject H₀. There is good evidence showing that the pens do not meet specifications.

61. a. .98, .85, .43, .004, .0000002 b. .40, .11, .0062, .0000003 c. Because the null hypothesis will be rejected with high probability, even with only slight departure from the null hypothesis, it is not very useful to do a .01 level test.

63. b. 36.61 c. yes

65. a. Σx_i ≥ c b. yes
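Several answers in this chapter report β values and minimum sample sizes for one-sample z tests. As an illustrative check (not part of the original answers; all numbers below are hypothetical), the standard one-tailed formulas β(μ′) = Φ(z_α + (μ₀ − μ′)√n/σ) and n = [σ(z_α + z_β)/(μ₀ − μ′)]² can be evaluated in Python:

    # Illustrative sketch: type II error probability and sample size for an
    # upper-tailed one-sample z test, using the standard formulas.
    from math import sqrt, ceil
    from scipy.stats import norm

    mu0, mu1, sigma, n, alpha = 10.0, 10.5, 1.0, 25, 0.05
    z_alpha = norm.ppf(1 - alpha)
    print(norm.cdf(z_alpha + (mu0 - mu1) * sqrt(n) / sigma))  # beta at mu = mu1

    beta_target = 0.10
    z_beta = norm.ppf(1 - beta_target)
    print(ceil((sigma * (z_alpha + z_beta) / (mu0 - mu1)) ** 2))  # required n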

67. Yes, the test is UMP for the alternative Ha: θ > .5 because the tests for H₀: θ = .5 vs. Ha: θ = p₀ all have the same form for any p₀ > .5.

69. b. .05 c. .04345, .05826; Because .04345 < .05, the test is not unbiased. d. .05114; not most powerful

71. b. The value of the test statistic is 3.041, so the P-value is .081, compared to .089 for Exercise 55.

73. A sample size of 32 should suffice.

75. a. Test H₀: μ = 2150 vs. Ha: μ > 2150 b. t = (x̄ − 2150)/(s/√n) c. 1.33 d. .101 e. Do not reject H₀ at the .05 level.

77. Because t = .77 and the P-value is .23, there is no evidence suggesting that coal increases the mean heat flux.

79. Conclude that activation time is too slow at the .05 level, but not at the .01 level.

81. A normal probability plot gives no reason to doubt the normality assumption. Because the sample mean is 9.815, giving t = 4.75 and an (upper tail) P-value of .00007, reject the null hypothesis at any reasonable level. The true average flame time is too high.

83. Assuming normality, calculate t = 1.70, which gives a two-tailed P-value of .102. Do not reject the null hypothesis H₀: μ = 1.75.

85. The P-value for a lower tail test is .0014 (normal approximation, .0005), so it is reasonable to reject the idea that p = .75 and conclude that fewer than 75% of mechanics can identify the problem.

87. Because t = 6.43, giving an upper tail P-value of .0000002, conclude that the population mean time exceeds 15 minutes. This could result in a type I error.

89. Because the P-value is .013 > .01, do not reject the null hypothesis at the .01 level.

91. a. For the test of H₀: μ = μ₀ vs. Ha: μ > μ₀ at level α, reject H₀ if 2Σx_i/μ₀ ≥ χ²_{α,2n}. For the test of H₀: μ = μ₀ vs. Ha: μ < μ₀ at level α, reject H₀ if 2Σx_i/μ₀ ≤ χ²_{1−α,2n}. For the test of H₀: μ = μ₀ vs. Ha: μ ≠ μ₀ at level α, reject H₀ if 2Σx_i/μ₀ ≥ χ²_{α/2,2n} or if 2Σx_i/μ₀ ≤ χ²_{1−α/2,2n}. b. Because Σx_i = 737, the test statistic is 2Σx_i/μ₀ = 19.65, which gives a P-value of .52. There is no reason to reject the null hypothesis.

93. a. yes

Chapter 10

1. a. −.4; it doesn't b. .0724, .269 c. Although the CLT implies that the distribution will be approximately normal when the sample sizes are each 100, the distribution will not necessarily be normal when the sample sizes are each 10.

3. Do not reject H₀ because z = 1.76 < 2.33

5. a. Ha says that the average calorie output for sufferers is more than 1 cal/cm²/min below that for non-sufferers. Reject H₀ in favor of Ha because z = −2.90 ≤ −2.33 b. .0019 c. .819 d. .66

7. Yes, because z = 1.83 ≥ 1.645.

9. a. x̄ − ȳ = 6.2 b. z = 1.14, two-tailed P-value = .25, so do not reject the null hypothesis that the population means are equal. c. No, the values are positive and the standard deviation exceeds the mean. d. 95% CI: (10.0, 29.8)

11. a. A 95% CI for the true difference, fast food mean − not fast food mean, is (219.6, 538.4) b. The one-tailed P-value is .014, so reject the null hypothesis of a 200-calorie difference at the .05 level, and conclude that yes, there is strong evidence.

13. 22. No.

15. b. It increases.

17. Because z = 1.36, there is no reason to reject the hypothesis of equal population means (P = .17).

19. Because z = .59, there is no reason to conclude that the population mean is higher for the no-involvement group (P = .28).

21. Because t = −3.35 ≤ −3.30 = t_{.001,42}, yes, there is evidence that experts do hit harder.

23. b. No c. Because |t| = |−.38| < 2.228 = t_{.025,10}, no, there is no evidence of a difference.

25. Because the one-tailed P-value is .005 ≤ .01, conclude at the .01 level that the difference is as stated.

27. Yes, because t = 2.08 with P-value = .046.

29. b. (127.6, 202.0) c. 131.8

31. Because t = 1.82 with P-value .046 ≤ .05, conclude at the .05 level that the difference exceeds 1.

33. a. (x̄ − ȳ) ± t_{α/2,m+n−2} · s_p · √(1/m + 1/n) b. (−.24, 3.64) c. (−.34, 3.74), which is wider because of the loss of a degree of freedom

35. a. The slender distribution appears to have a lower mean and lower variance. b. With t = 1.88 and a P-value of .097, there is no significant difference at the .05 level.

37. With t = 2.19 and a two-tailed P-value of .031, there is a significant difference at the .05 level but not the .01 level.

39. With t = 3.89 and one-tailed P-value = .006, conclude at the 1% level that true average movement is less for the TightRope treatment. Normality is important, but the normal probability plot does not indicate a problem.

41. a. The 95% confidence interval for the difference of means is (.000046, .000446), which has only positive values. This omits 0 as a possibility, and says that the conventional mean is higher. b. With t = 2.68 and P-value = .010, reject at the .05 level the hypothesis of equal means in favor of the conventional mean being higher.

43. With t = 1.87 and a P-value of .049, the difference is (barely) significantly greater than 5 at the .05 level.

45. a. No b. −49.1 c. 49.1

47.
    Pair   1    2    3    4
    x      10   20   30   40
    y      11   21   31   41

49. a. Because |z| = |−4.84| ≥ 1.96, conclude that there is a difference. Rural residents are more favorable to the increase. b. .9967

51. (.016, .171)

53. Because z = 4.27 with P-value .000010, conclude that the radiation is beneficial.

55. a. H₀: p₃ = p₂, Ha: p₃ > p₂ b. (X₃ − X₂)/n c. (X₃ − X₂)/√(X₂ + X₃) d. With z = 2.67, P = .004, reject H₀ at the .01 level.

57. 769

59. Because z = 3.14 with P = .002, reject H₀ at the .01 level. Conclude that lefties are more accident-prone.

61. a. .0175 b. .1642 c. .0200 d. .0448 e. .0035

63. No, because f = 1.814 < 6.72 = F_{.01,9,7}.

65. Because f = 1.2219 with P = .505, there is no reason to question the equality of population variances.

67. 8.10

69. a. (.158, .735) b. Here is a macro that can be executed 999 times in MINITAB:
    # start with X in C1, Y in C2
    let k3 = N(c1)
    let k4 = N(c2)
    sample k3 c1 c3;
    replace.
    sample k4 c2 c4;
    replace.
    let k1 = mean(c3)-mean(c4)
    stack k1 c5 c5
    end

71. a. Here is a macro that can be executed 999 times in MINITAB:
    # start with X in C1, Y in C2
    let k3 = N(c1)
    let k4 = N(c2)
    sample k3 c1 c3;
    replace.
    sample k4 c2 c4;
    replace.
    let k2 = medi(c3)-medi(c4)
    stack k2 c6 c6
    end

73. a. (.593, 1.246) b. Here is a macro that can be executed 999 times in MINITAB:
    # start with X in C1, Y in C2
    let k3 = N(c1)
    let k4 = N(c2)
    sample k3 c1 c3;
    replace.
    sample k4 c2 c4;
    replace.
    let k5 = stdev(c3)/stdev(c4)
    stack k5 c12 c12
    end

75. a. Because t = −2.62 with a P-value of .018, conclude that the population means differ. At the 5% level, blueberries are significantly better. b. Here is a macro that can be executed repeatedly in MINITAB:
    # start with data in C1, group var in C2
    let k3 = N(c1)
    sample k3 c1 c3.
    unstack c3 c4 c5;
    subs c2.
    let k9 = mean(c4)-mean(c5)
    stack k9 c6 c6
    end

77. a. Because f = 4.46 with a two-tailed P-value of .122, there is no evidence of unequal population variances. b. Here is a macro that can be executed repeatedly in MINITAB:
    let k1 = n(C1)
    sample k1 c1 c3.
    unstack c3 c4 c5;
    subs c2.
    let k6 = stdev(c4)/stdev(c5)
    stack k6 c6 c6
    end

79. a. A MINITAB macro is given in 75(b).

81. a. (−11.85, −6.40) b. See Exercise 57(a) in Chapter 8.
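The two-sample resampling macros in 69–77 can likewise be mirrored in Python. Here is an illustrative sketch (not part of the original answers; x and y are hypothetical samples, and 999 resamples mirror the macros):

    # Illustrative sketch: bootstrap CI for a difference of two sample means.
    import numpy as np

    rng = np.random.default_rng(2)
    x = np.array([10.2, 11.5, 9.8, 10.9, 11.1])
    y = np.array([9.1, 9.7, 10.0, 8.8, 9.4])

    diffs = [rng.choice(x, len(x), replace=True).mean()
             - rng.choice(y, len(y), replace=True).mean()
             for _ in range(999)]
    print(np.percentile(diffs, [2.5, 97.5]))   # approximate 95% bootstrap CI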

85. The difference is significant at the .05, .01, and .001 levels.

89. b. No, given that the 95% CI includes 0, the test at the .05 level does not reject equality of means.

91. (−299.2, 1517.8)

93. (1020.2, 1339.9). Because 0 is not in the CI, we would reject equality of means at the .01 level.

95. Because t = 2.61 and the one-tailed P-value is .007, the difference is significant at the .05 level using either a one-tailed or a two-tailed test.

97. a. Because t = 3.04 and the two-tailed P-value is .008, the difference is significant at the .05 level. b. No, the mean of the concentration distribution depends on both the mean and standard deviation of the log concentration distribution.

99. Because t = 7.50 and the one-tailed P-value is .0000001, the difference is highly significant, assuming normality.

101. The two-sample t is inappropriate for paired data. The paired t gives a mean difference .3, t = 2.67, and the two-tailed P-value is .045, so the means are significantly different at the .05 level. We are concluding tentatively that the label understates the alcohol percentage.

103. Because paired t = 3.88 and the two-tailed P-value is .008, the difference is significant at the .05 and .01 levels, but not at the .001 level.

105. Because z = 2.63 and the two-tailed P-value is .009, there is a significant difference at the .01 level, suggesting better survival at the higher temperature.

107. .902, .826, .029, .00000003

109. Because z = 4.25 and the one-tailed P-value is .00001, the difference is highly significant and companies appear to discriminate.

111. With Z = (X̄ − Ȳ)/√(X̄/n + Ȳ/m), the result is z = −5.33, two-tailed P-value = .0000001, so one should conclude that there is a significant difference.

113. (i) not bioequivalent (ii) not bioequivalent (iii) bioequivalent

Chapter 11

1. a. Reject H₀: μ₁ = μ₂ = μ₃ = μ₄ = μ₅ in favor of Ha: μ₁, μ₂, μ₃, μ₄, μ₅ not all the same, because f = 5.57 ≥ 2.69 = F_{.05,4,30}. b. Using Table A.9, .001 < P-value < .01. (The P-value is .0018.)

3. Because f = 6.43 ≥ 2.95 = F_{.05,3,28}, there are significant differences among the means.

5. Because f = 10.85 ≥ 4.38 = F_{.01,3,36}, there are significant differences among the means.
    Source      DF   SS       MS      F       P
    Formation   3    509.1    169.7   10.85   0.000
    Error       36   563.1    15.6
    Total       39   1072.3

7. a. The Levene test gives f = 1.47, P-value .236, so there is no reason to doubt equal variances. b. Because f = 10.48 ≥ 4.02 = F_{.01,4,30}, there are significant differences among the means.
    Source        DF   SS      MS      F       P
    Plate length  4    43993   10998   10.48   0.000
    Error         30   31475   1049
    Total         34   75468

11. w = 36.09
    3       1       4       2       5
   Splitting the paints into two groups, {3, 1, 4} and {2, 5}, there are no significant differences within groups, but the paints in the first group differ significantly (they are lower) from those in the second group.

13.
    3       1       4       2       5
    427.5   462.0   469.3   502.8   532.1

15. w = 5.92; At the 1% level the only significant differences are between formation 4 and the first two formations.
    2       1       3       4
    24.69   26.08   29.95   33.84

17. (−.029, .379)

19. 426

21. a. Because f = 22.60 ≥ 3.26 = F_{.01,5,78}, there are significant differences among the means. b. (−99.1, −35.7), (29.4, 99.1)

23. The nonsignificant differences are indicated by the underscores.
    10     6       3       1
    45.5   50.85   55.40   58.28

25. a. Assume normality and equal variances. b. Because f = 1.71 < 2.20 = F_{.10,3,48}, P-value = .18, there are no significant differences among the means.

27. a. Because f = 3.75, P-value = .028, there are significant differences among the means. b. Because the normal plot looks fairly straight and the P-value for the Levene test is .68, there is no reason to doubt the assumptions of normality and constant variance. c. The only significant pairwise difference is between brands 1 and 4:
    4      3      2      1
    5.82   6.35   7.50   8.27

31. .63

33. arcsin(√(x/n))

35. a. Because f = 1.55 < 3.26 = F_{.05,4,12}, there are no significant differences among the means. b. Because f = 2.98 < 3.49 = F_{.05,3,12}, there are no significant differences among the means.

37. With f = 5.49 ≥ 4.56 = F_{.01,5,15}, there are significant differences among the stimulus means. Although not all differences are significant in the multiple comparisons analysis, the means for combined stimuli were higher.

   Differences among the subject means are not very important here. The normal plot of residuals shows no reason to doubt normality. However, the plot of residuals against the fitted values shows some dependence of the variance on the mean. If logged response is used in place of response, the plots look good and the F test result is similar but stronger. Furthermore, the logged response gives more significant differences in the multiple comparisons analysis. Means:
    L1       L2       T      L1 + L2   L1 + T   L2 + T
    24.825   27.875   29.1   40.35     41.22    45.05

39. With f = 2.56 < 2.61 = F_{.10,3,12}, there are no significant differences among the angle means.

41. a. With f = 1.04 < 3.28 = F_{.05,2,34}, there are no significant differences among the treatment means.
    Source      DF   SS        MS       F
    Treatment   2    28.78     14.39    1.04
    Block       17   2977.67   175.16   12.68
    Error       34   469.56    13.81
    Total       53   3476.00
   b. The very significant f for blocks, which shows that blocks differ strongly, implies that blocking was successful.

43. With f = 8.69 ≥ 6.01 = F_{.01,2,18}, there are significant differences among the three treatment means. The normal plot of residuals shows no reason to doubt normality, and the plot of residuals against the fitted values shows no reason to doubt constant variance. There is no significant difference between treatments B and C, but treatment A differs (it is lower) significantly from the others at the .01 level. Means: A 29.49, B 31.31, C 31.40

45. Because f = 8.87 ≥ 7.01 = F_{.01,4,8}, reject the hypothesis that the variance for B is 0.

49. a.
    Source        df   SS         MS        F
    A             2    30763.0    15381.5   3.79
    B             3    34185.6    11395.2   2.81
    Interaction   6    43581.2    7263.5    1.79
    Error         24   97436.8    4059.9
    Total         35   205966.6
   b. Because 1.79 < 2.04 = F_{.10,6,24}, there is no significant interaction. c. Because 3.79 ≥ 3.40 = F_{.05,2,24}, there is a significant difference among the A means at the .05 level. d. Because 2.81 < 3.01 = F_{.05,3,24}, there is no significant difference among the B means at the .05 level. e. Using w = 64.93:
    3        1         2
    3960.2   4010.88   4029.10

51. a. With f = 1.55 < 2.81 = F_{.10,2,12}, there is no significant interaction at the .10 level. b. With f = 376.27 ≥ 18.64 = F_{.001,1,12}, there is a significant difference between the formulation means at the .001 level. With f = 19.27 ≥ 12.97 = F_{.001,2,12}, there is a significant difference among the speed means at the .001 level. c. Main effects — Formulation: (1) 11.19, (2) −11.19; Speed: (60) 1.99, (70) −5.03, (80) 3.04

53. Here is the ANOVA table:
    Source        DF   SS        MS        F      P
    Pen           3    1387.5    462.50    0.68   0.583
    Surface       2    2888.1    1444.04   2.11   0.164
    Interaction   6    8100.3    1350.04   1.97   0.149
    Error         12   8216.0    684.67
    Total         23   20591.8
   With f = 1.97 < 2.33 = F_{.10,6,12}, there is no significant interaction at the .10 level. With f = .68 < 2.61 = F_{.10,3,12}, there is no significant difference among the pen means at the .10 level. With f = 2.11 < 2.81 = F_{.10,2,12}, there is no significant difference among the surface means at the .10 level.

57. a. F = MSAB/MSE b. A: F = MSA/MSAB; B: F = MSB/MSAB

59. a. Because f = 3.43 ≥ 2.61 = F_{.05,4,40}, there is a significant difference among the exam means at the .05 level. b. Because f = 1.65 < 2.61 = F_{.05,4,40}, there is no significant difference among the retention means at the .05 level.

61. a.
    Source   DF   SS      MS     F
    Diet     4    .929    .232   2.15
    Error    25   2.690   .108
    Total    29   3.619
   Because f = 2.15 < 2.76 = F_{.05,4,25}, there is no significant difference among the diet means at the .05 level. b. (−.59, .92). Yes, the interval includes 0. c. .53

63. a. Test H₀: μ₁ = μ₂ = μ₃ versus Ha: the three means are not all the same. With f = 4.80 and F_{.05,2,16} = 3.63 < 4.80 < 6.23 = F_{.01,2,16}, it follows that .01 < P-value < .05 (more precisely, P = .023). Reject H₀ in favor of Ha at the 5% level but not at the 1% level. b. Only the first and third means differ significantly at the 5% level.
    1       2       3
    25.59   26.92   28.17

65. Because f = 1123 ≥ 4.07 = F_{.05,3,8}, there are significant differences among the means at the .05 level. For Tukey multiple comparisons, w = 7.12:

65. Because f = 1123 ≥ 4.07 = F.05,3,8, there are significant differences among the means at the .05 level. For Tukey multiple comparisons, w = 7.12:
   PCM 29.92   OCM 33.96   RM 125.84   PIM 129.30
The means split into two groups of two. The means within each group do not differ significantly, but the means in the top group differ strongly from the means in the bottom group.

67. The normal plot is reasonably straight, so there is no reason to doubt the normality assumption.

69.
   Source   DF      SS        MS       F
   A         1   322.667   322.667   980.5
   B         3    35.623    11.874    36.1
   AB        3     8.557     2.852     8.7
   Error    16     5.266      .329
   Total    23   372.113

With f = 8.7 ≥ 3.24 = F.05,3,16, there is significant interaction at the .05 level. In the presence of significant interaction, main effects are not very useful.

Chapter 12

1. a. Temperature:
   17 | 0
   17 | 23
   17 | 445
   17 | 67
   17 |
   18 | 0000011          Stem: hundreds and tens
   18 | 2222             Leaf: ones
   18 | 445
   18 | 6
   18 | 8
The distribution is fairly symmetric and bell-shaped with a center around 180.

Ratio:
   0 | 889
   1 | 0000
   1 | 23
   1 | 4444
   1 | 66               Stem: ones
   1 | 8889             Leaf: tenths
   2 | 11
   2 |
   2 | 5
   2 | 6
   2 |
   3 | 00
The distribution is concentrated between 1 and 2, with some positive skewness.
b. No, x does not determine y: for a given x there may be more than one y.
c. No, there is a wide range of y values for a given x; for example, when temperature is 18.2 the ratio ranges from .9 to 2.68.

3. Yes. Yes.

5. b. Yes
c. The relationship of y to x is roughly quadratic.

7. a. 5050 psi b. 1.3 psi c. 130 psi d. −130 psi

9. a. .095 m³/min b. .475 m³/min c. .83 m³/min, 1.305 m³/min d. .4207, .3446 e. .0036

11. a. −.01 h, −.10 h b. 3.0 h, 2.5 h c. .3653 d. .4624

13. a. y = .63 + .652x b. 23.46, −2.46 c. 392, 5.72 d. .956 e. y = 2.29 + .564x, r² = .688

15. a. y = −15.2 + .0942x b. 1.906 c. −1.006, −0.096, 0.034, 0.774 d. .451

17. a. Yes b. slope, .827; intercept, −1.13 c. 40.22 d. 5.24 e. .975

19. a. y = 75.2 − .209x
b. The coefficient of determination is .791, meaning that the predictor accounts for 79.1% of the variation in y.
c. The value of s is 2.56, so typical deviations from the regression line will be of this size.

21. b. y = −2.18 + .660x c. 7.72 d. 7.72

25. β̂0* = 1.8β̂0 + 32; β̂1* = 1.8β̂1

29. a. Subtracting x̄ from each xi shifts the plot x̄ units to the left. The slope is left unchanged, but the new y intercept is ȳ, the height of the old line at x = x̄.
b. β̂0* = ȳ = β̂0 + β̂1x̄ and β̂1* = β̂1

31. a. .00189 b. .7101
c. No, because here Σ(xi − x̄)² is 24,750, smaller than the value 70,000 in part (a), so V(β̂1) = σ²/Σ(xi − x̄)² is higher here.

33. a. (.51, 1.40)
b. To test H0: β1 = 1 vs. Ha: β1 < 1, we compute t = −.2258 > −1.383 = −t.10,9, so there is no reason to reject the null hypothesis, even at the 10% level. There is no conflict between the data and the assertion that the slope is at least 1.

35. a. β̂1 = 1.536, and a 95% CI is (.632, 2.440)
b. Yes, for the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = 3.62, with P-value .0025. At the .01 level conclude that there is a useful linear relationship.
c. Because 5 is beyond the range of the data, predicting at a dose of 5 might involve too much extrapolation.
d. β̂1 = 1.683, and a 95% CI is (.531, 2.835). Eliminating the point causes only moderate change, so the point is not extremely influential.
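A hedged sketch of the slope inference used in 31 through 35, with made-up data (not the exercises' data) and scipy's linregress: the point estimate, its two-tailed P-value, and a 95% CI built from t.025,n−2.

    import numpy as np
    from scipy import stats

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # hypothetical predictor values
    y = np.array([1.1, 2.3, 2.8, 4.1, 4.9, 6.2])   # hypothetical responses

    res = stats.linregress(x, y)                   # slope, stderr, P-value, etc.
    t_crit = stats.t.ppf(0.975, df=len(x) - 2)     # t_.025,n-2
    ci = (res.slope - t_crit * res.stderr, res.slope + t_crit * res.stderr)
    print(res.slope, res.pvalue, ci)               # estimate, two-tailed P, 95% CI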

37. a. Yes, for the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = −6.73, with P-value .00002. At the .01 level conclude that there is a useful linear relationship.
b. (−2.77, −1.42)

43. No, z = .73 and the P-value is .46, so there is no evidence for a significant impact of age on kyphosis.

45. a. sŶ increases as the distance of x from x̄ increases.
b. (2.26, 3.19)
c. (1.34, 4.11)
d. At least 90%

47. a. The regression equation is y = −1.58 + 2.59x and R² = .838.
b. A 95% confidence interval for the slope is (2.16, 3.01). In repetitions of the whole process of data collection and calculation of the interval, roughly 95% of the intervals will contain the true slope.
c. When tannin = .6 the estimated mean astringency is −0.0335 and the 95% confidence interval is (−0.125, 0.058).
d. When tannin = .6 the predicted astringency is −0.0335 and the 95% prediction interval is (−0.5582, 0.4912).
e. Our null hypothesis is that true average astringency is 0 when tannin is .7, and the alternative is that the true average is positive. The t for this test is 4.61, with P-value = .000035, so yes, there is compelling evidence.

49. (431.2, 628.6)

51. a. Yes, for the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = 10.62, with P-value .000014. At the .001 level conclude that there is a useful linear relationship.
b. (8.24, 12.96) With 95% confidence, when the flow rate is increased by 1 SCCM, the associated expected change in etch rate is in the interval.
c. (36.10, 40.41) This is fairly precise.
d. (31.86, 44.65) This is much less precise than the interval in (c).
e. Because 2.5 is closer to the mean, the intervals will be narrower.
f. Because 6 is outside the range of the data, it is unknown whether the regression will apply there.
g. Use a 99% CI at each value: (23.88, 31.43), (29.93, 35.98), (35.07, 41.45)

53. a. Yes
b. Yes, for the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = −4.39, with P-value < .001. At the .001 level conclude that there is a useful linear relationship.
c. (403.6, 468.2)

57. a. r = .923, so x and y are strongly correlated.
b. unaffected
c. unaffected
d. The normal plots seem consistent with normality, but the scatter plot shows a slight curvature.
e. For the test of H0: ρ = 0 vs. Ha: ρ ≠ 0, we find t = 7.59, with P-value .00002. At the .001 level conclude that there is a useful linear relationship.
59. a. For the test of H0: ρ = 0 vs. Ha: ρ > 0, we find r = .760, t = 4.05, with P-value < .001. At the .001 level conclude that there is a positive correlation.
b. Because r² = .578, we say that the regression accounts for 57.8% of the variation in endurance. This also applies to prediction of lactate level from endurance.

61. For the test of H0: ρ = 0 vs. Ha: ρ ≠ 0, we find r = .773, t = 2.44, with P-value .072. At the .05 level conclude that there is not a significant correlation. With such a small sample size, a high r is needed for significance.

63. a. Reject the null hypothesis in favor of the alternative.
b. No, with a large sample size a small r can be significant.
c. Because t = 2.200 ≥ 1.96 = t.025,9998, the correlation is statistically (but not necessarily practically) significant at the .05 level.

67. a. .184, −.238, −.426
b. The mean that is subtracted is not the mean x̄1,n−1 of x1, x2, ..., xn−1, or the mean x̄2,n of x2, x3, ..., xn. Also, the denominator of r1 is not √[Σ1..n−1 (xi − x̄1,n−1)²] · √[Σ2..n (xi − x̄2,n)²]. However, if n is large then r1 is approximately the same as the correlation. A similar relationship applies to r2.
c. No
d. After performing one test at the .05 level, doing more tests raises the probability of at least one type I error to more than .05.

69. The plot shows no reasons for concern about using the simple model.

71. a. The model may not be a perfect fit because the plot shows some curvature.
b. The plot of standardized residuals is very similar to the residual plot. The normal probability plot gives no reason to doubt normality.

73. a. For the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find t = 10.97, with P-value .0004. At the .001 level conclude that there is a useful linear relationship.
b. The residual plot shows curvature, so the linear relationship of part (a) is questionable.
c. There are no extreme standardized residuals, and the plot of standardized residuals is similar to the plot of ordinary residuals.

75. The first data set seems appropriate for a straight-line model. The second data set shows a quadratic relationship, so the straight-line relationship is inappropriate. The third data set is linear except for an outlier, and removal of the outlier will allow a line to be fit. The fourth data set has only two values of x, so there is no way to tell if the relationship is linear.

77. a. To test for lack of fit, we find f = 3.30, with 3 numerator df and 10 denominator df, so the P-value is .079. At the .05 level we cannot conclude that the relationship is poor.
b. The plot shows that the relationship is not linear, in spite of (a). In this case, the plot is more sensitive than the test.

79. a. 77.3
b. 40.4
c. The coefficient β3 is the difference in sales caused by the window, all other things being equal.
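The interval pairs in 47c-d and 51c-d illustrate that a prediction interval must also absorb the variability of a new observation. A minimal sketch with made-up data of both formulas (the PI carries an extra "1 +" under the square root):

    import numpy as np
    from scipy import stats

    x = np.array([2.0, 4.0, 5.0, 7.0, 8.0, 10.0])    # hypothetical data
    y = np.array([5.1, 8.9, 10.2, 14.8, 16.1, 20.3])
    n = len(x)

    b1, b0 = np.polyfit(x, y, 1)
    s = np.sqrt(np.sum((y - (b0 + b1 * x))**2) / (n - 2))   # residual std. dev.
    sxx = np.sum((x - x.mean())**2)

    x_star = 6.0
    y_hat = b0 + b1 * x_star
    t = stats.t.ppf(0.975, n - 2)
    se_ci = s * np.sqrt(1/n + (x_star - x.mean())**2 / sxx)       # mean response
    se_pi = s * np.sqrt(1 + 1/n + (x_star - x.mean())**2 / sxx)   # new observation
    print((y_hat - t * se_ci, y_hat + t * se_ci))    # 95% CI
    print((y_hat - t * se_pi, y_hat + t * se_pi))    # wider 95% PI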

81. a. .686, no
b. We find f = 28.6 ≥ 2.62 = F.001,16,186, so there is a significant relationship at the .001 level.
c. With all other predictors held constant, the estimated difference in y between class A and not is .364. In terms of $/ft², the effect is multiplicative: class A buildings are estimated to be worth 44% more dollars per square foot, with all other predictors held constant.
d. The difference in (c) is highly significant because the two-tailed P-value is .00000013.
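A quick check of the multiplicative reading in 81c (Python, illustrative): with log price as the response, the class-A coefficient multiplies price per square foot by e^.364.

    import math
    print(math.exp(0.364) - 1.0)   # ~.44, the 44% premium quoted in 81c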

83. a. 48.31, 3.69
b. No, because the interaction term will change.
c. Yes, f = 18.92, P-value < .0001.
d. Yes, t = 3.496, P-value = .003 ≤ .01
e. (21.6, 41.6)
f. There appear to be no problems with normality or curvature, but the variance may depend on x1.

85. a. No
b. With f = 5.03 ≥ 3.69 = F.05,5,8, there is a significant relationship at the .05 level.
c. Yes, the individual hypotheses deal with the issue of whether an individual predictor can be deleted, not the effectiveness of the whole model.
d. 6.2, 3.3, (16.7, 31.9)
e. With f = 3.44 < 4.07 = F.05,3,8, there is no reason to reject the null hypothesis, so the quadratic terms can be deleted.

87. a. The quadratic terms are important in providing a good fit to the data.
b. A 95% PI is (.560, .771).

89. a. rRI = .843 (.000), rRA = .621 (.001), rIA = .843 (.000). Here the P-values are given in parentheses to three decimals.
b. Rating = 2.24 + 0.0419 IBU − 0.166 ABV. Because the two predictors are highly correlated, one is redundant.
c. Linearity is an issue.
e. The regression is quite effective, with R² = .872. The ABV coefficient is not significant, so ABV is not needed. The highly significant positive coefficient for IBU and negative coefficient for its square show that Rating increases with IBU, but the rate of increase is lower at higher IBU.

91. a.
   X = | 1  −1  −1 |      y = | 1 |
       | 1  −1   1 |          | 1 |
       | 1   1  −1 |          | 0 |
       | 1   1   1 |          | 4 |

b.
   X′X = | 4  0  0 |    X′y = | 6 |    β̂ = | 1.5 |
         | 0  4  0 |          | 2 |        |  .5 |
         | 0  0  4 |          | 4 |        |  1  |

c. ŷ = (0, 2, 1, 3)′, y − ŷ = (1, −1, −1, 1)′, SSE = 4, MSE = 4
d. (−12.2, 13.2)
e. For the test of H0: β1 = 0 vs. Ha: β1 ≠ 0, we find |t| = .5 < t.025,1 = 12.7, so do not reject H0 at the .05 level. The x1 term does not play a significant role.
f.
   Source       DF   SS   MS    F
   Regression    2    5   2.5   0.625
   Error         1    4   4.0
   Total         3    9

With f = .625 < 199.5 = F.05,2,1, there is no significant relationship at the .05 level.
93. β̂0 = ȳ; s = √[Σ(y − ȳ)²/(n − 1)]; c00 = 1/n; ȳ ± t.025,n−1 · s/√n

95. a. β̂0 = (1/(m + n)) Σ1..m+n yi = ȳ;
    β̂1 = (1/m) Σ1..m yi − (1/n) Σm+1..m+n yi = ȳ1 − ȳ2
b. ŷi = ȳ1, i = 1, ..., m; ŷi = ȳ2, i = m + 1, ..., m + n;
   SSE = Σ1..m (yi − ȳ1)² + Σm+1..m+n (yi − ȳ2)²;
   s = √[SSE/(m + n − 2)]; c11 = 4/(m + n)
d. β̂0 = 128.17, β̂1 = 14.33; ŷi = 121, i = 1, ..., 3; ŷi = 135.33, i = 4, ..., 6; SSE = 116.67, s = 5.4006, c11 = 2/3; 95% CI for β1: (2.09, 26.58)

97. Residual = Dep Var − Predicted Value
Std Error Residual = [MSE − (Std Error Predict)²]^.5
Student Residual = Residual/Std Error Residual

101. a. Hij = 1/n + (xi − x̄)(xj − x̄)/Σ(xk − x̄)²; V(Ŷi) = σ²[1/n + (xi − x̄)²/Σ(xk − x̄)²]
b. V(Yi − Ŷi) = σ²[1 − 1/n − (xi − x̄)²/Σ(xk − x̄)²]
c. The variance of a predicted value is greater for an x that is farther from x̄.
d. The variance of a residual is lower for an x that is farther from x̄.
e. It is intuitive that the variance of prediction should be higher with increasing distance. However, points that are farther away tend to draw the line toward them, so the residual naturally has lower variance.

103. a. With f = 12.04 ≥ 9.55 = F.01,2,7, there is a significant relationship at the .01 level.
To test H0: β1 = 0 vs. Ha: β1 ≠ 0, |t| = 2.96 ≥ t.025,7 = 2.36, so reject H0 at the .05 level. The foot term is needed.
To test H0: β2 = 0 vs. Ha: β2 ≠ 0, |t| = 0.02 < t.025,7 = 2.36, so do not reject H0 at the .05 level. The height term is not needed.
b. The highest leverage is .88 for the fifth point. The height for this student is given as 54 inches, too low to be correct for this group of students. Also, this value differs by 8" from the wingspan, an extreme difference.
c. Point 1 has leverage .55, and this student has height 75" and foot length 13", both quite high. Point 2 has leverage .31, and this student has height 66" and foot length 8.5", at the low end. Point 7 has leverage .31, and this student has both height and foot length at the high end.
d. Point 2 has the most extreme residual. This student has a height of 66" and a wingspan of 56", differing by 10", so the extremely low wingspan is probably wrong.
e. For this data set it would make sense to eliminate points 2 and 5 because they seem to be wrong. However, outliers are not always mistakes and one needs to be careful about eliminating them.
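The matrix computations in 91, and the leverages h_ii discussed in 101 and 103, can be verified directly; a numpy sketch (illustrative, not part of the text) using the design matrix and response from 91:

    import numpy as np

    X = np.array([[1., -1., -1.],
                  [1., -1.,  1.],
                  [1.,  1., -1.],
                  [1.,  1.,  1.]])
    y = np.array([1., 1., 0., 4.])

    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (1.5, .5, 1.0), as in 91b
    y_hat = X @ beta_hat                           # (0, 2, 1, 3), as in 91c
    sse = np.sum((y - y_hat)**2)                   # 4.0
    H = X @ np.linalg.solve(X.T @ X, X.T)          # hat matrix of 101a
    print(beta_hat, sse, np.diag(H))               # leverages h_ii (all .75 here)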

105. a. .507% b. .7122
c. To test H0: β1 = 0 vs. Ha: β1 ≠ 0, we have t = 3.93, with P-value .0013. At the .01 level conclude that there is a useful linear relationship.
d. (1.056, 1.275)
e. ŷ = 1.014, y − ŷ = −.214

107. −36.18, (−64.43, −7.94)

109. No, if the relationship of y to x² is linear, then the relationship of y to x is quadratic.

111. a. Yes
b. ŷ = 98.293, y − ŷ = .117
c. s = .155
d. .794
e. 95% CI for β1: (.0613, .0901)
f. The new observation is an outlier, and has a major impact: the equation of the line changes from y = 97.50 + .0757x to y = 97.28 + .1603x; s changes from .155 to .291; r² changes from .794 to .616.

113. a. The paired t procedure gives t = 3.54 with a two-tailed P-value of .002, so at the .01 level we reject the hypothesis of equal means.
b. The regression line is ŷ = 4.79 + .743x, and the test of H0: β1 = 0 vs. Ha: β1 ≠ 0 gives t = 7.41 with a P-value of < .000001, so there is a significant relationship. However, prediction is not perfect, with r² = .753, so one variable accounts for only 75% of the variability in the other.

117. a. linear
b. After fitting a line to the data, the residuals show a lot of curvature.
c. Yes. The residuals from the logged model show some departure from linearity, but the fit is good in terms of R² = .988. We find â = 411.98, b̂ = −.03333.
d. (58.15, 104.18)

119. a. The plot suggests a quadratic model.
b. With f = 25.08 and a P-value of < .0001, there is a significant relationship at the .0001 level.
c. CI: (3282.3, 3581.3), PI: (2966.6, 3897.0). Of course, the PI is wider, as in simple linear regression, because it needs to include the variability of a new observation in addition to the variability of the mean.
d. CI: (3257.6, 3565.6), PI: (2945.0, 3878.2). These are slightly wider than the intervals in (c), which is appropriate, given that 25 is slightly farther from the mean and the vertex.
e. With t = −6.73 and a two-tailed P-value of < .0001, the quadratic term is significant at the .0001 level, so this term is definitely needed.

121. a. With f = 2.4 < 5.86 = F.05,15,4, there is no significant relationship at the .05 level.
b. No, especially when k is large compared to n.
c. .9565
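For a quadratic fit like the one in 119, numpy's polyfit returns the coefficients directly; the data below are invented, and the vertex mentioned in 119d is where the fitted parabola turns over.

    import numpy as np

    x = np.array([10., 15., 20., 25., 30., 35., 40.])               # invented
    y = np.array([2100., 3150., 3650., 3600., 3000., 1950., 400.])  # invented

    b2, b1, b0 = np.polyfit(x, y, 2)   # highest power first: y = b0 + b1 x + b2 x^2
    vertex = -b1 / (2 * b2)            # x at which the fitted parabola peaks
    print(b0, b1, b2, vertex)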
Chapter 13

1. a. reject H0 b. do not reject H0 c. do not reject H0 d. do not reject H0

3. Do not reject H0 because χ² = 1.57 < 7.815 = χ².05,3.

5. Because χ² = 6.61 with P-value .68, do not reject H0.

7. Because χ² = 4.03 with P-value > .10, do not reject H0.

9. a. [0, .223), [.223, .510), [.510, .916), [.916, 1.609), [1.609, ∞)
b. Because χ² = 1.25 with P-value > .10, do not reject H0.

11. a. (−∞, −.967), [−.967, −.431), [−.431, 0), [0, .431), [.431, .967), [.967, ∞)
b. (−∞, .49806), [.49806, .49914), [.49914, .50), [.50, .50086), [.50086, .50194), [.50194, ∞)
c. Because χ² = 5.53 with P-value > .10, do not reject H0.

13. Using p̂ = .0843, χ² = 280.3 with P-value < .001, so reject the independence model.

15. The likelihood is proportional to θ^233(1 − θ)^367, from which θ̂ = .3883. This gives estimated probabilities .1400, .3555, .3385, .1433, .0227 and expected counts 21.00, 53.32, 50.78, 21.49, 3.41. Because 3.41 < 5, combine the last two categories, giving χ² = 1.62 with P-value > .10. Do not reject the binomial model.

17. λ̂ = 3.167, which gives χ² = 103.9 with P-value < .001, so reject the assumption of a Poisson model.

19. θ̂1 = .4275, θ̂2 = .2750, which gives χ² = 29.3 with P-value < .001, so reject the model.

21. Yes, the test gives no reason to reject the null hypothesis of a normal distribution.

23. The P-values are both .243.

25. Let pi1 = the probability that a fruit given treatment i matures and pi2 = the probability that a fruit given treatment i aborts, so H0: pi1 = pi2 for i = 1, 2, 3, 4, 5. We find χ² = 24.82 with P-value < .001, so reject the null hypothesis and conclude that maturation is affected by leaf removal.

27. If pij denotes the probability of a type j response when treatment i is applied, then H0: p1j = p2j = p3j = p4j for j = 1, 2, 3, 4. With χ² = 27.66 ≥ 23.587 = χ².005,9, reject H0 at the .005 level. The treatment does affect the response.

29. With χ² = 64.65 ≥ 13.277 = χ².01,4, reject H0 at the .001 level. Political views are related to marijuana usage. In particular, liberals are more likely to be users.

31. Compute the expected counts by êijk = n·p̂ijk = n·p̂i··p̂·j·p̂··k = n·(ni··/n)(n·j·/n)(n··k/n). For the χ² statistic, df = 20.

33. a. With χ² = .681 < 4.605 = χ².10,2, do not reject independence at the .10 level.
b. With χ² = 6.81 ≥ 4.605 = χ².10,2, reject independence at the .10 level.
c. 677

35. a. With χ² = 6.45 and P-value .040, reject independence at the .05 level.
b. With z = −2.29 and P-value .022, reject independence at the .05 level.
c. Because the z test takes into account the order in the professorial ranks, it should be more sensitive, so it should give a lower P-value.
d. There are few female professors but many assistant professors, and the assistant professors will be the professors of the future.
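The goodness-of-fit computations in 3 through 19 can be checked with scipy.stats.chisquare; the counts below are invented. When a parameter is first estimated from the data, as in 15 and 17, one additional degree of freedom is subtracted via ddof.

    from scipy import stats

    # Fully specified null, as in 3: four equally likely categories
    stat, p = stats.chisquare([24, 35, 27, 34])        # expected 30 each
    print(stat, p, stats.chi2.ppf(0.95, df=3))         # compare with 7.815

    # Null with one estimated parameter: drop one df via ddof
    stat2, p2 = stats.chisquare([21, 50, 54, 25],
                                f_exp=[21.0, 53.3, 50.8, 24.9], ddof=1)
    print(stat2, p2)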

37. With χ² = 13.005 ≥ 9.210 = χ².01,2, reject the null hypothesis of no effect at the .01 level. Oil does make a difference (more parasites).

39. a. H0: the population proportion of Late Game Leader Wins is the same for all four sports; Ha: the proportion of Late Game Leader Wins is not the same for all four sports. With χ² = 10.518 ≥ 7.815 = χ².05,3, reject the null hypothesis at level .05. Sports differ in terms of coming from behind late in the game.
b. Yes (baseball)

41. With χ² = 197.6 ≥ 16.812 = χ².01,6, reject the null hypothesis at the .01 level. The aged are more likely to die in a chronic-care facility.

43. With χ² = 7.63 < 7.779 = χ².10,4, do not reject the hypothesis of independence at the .10 level. There is no evidence that age influences the need for item pricing.

45. a. No, 9.02 ≥ 7.815 = χ².05,3.
b. With χ² = 1.57 < 6.251 = χ².10,3, there is no reason to say the model does not fit.

47. a. H0: p0 = p1 = ... = p9 = .10 vs. Ha: at least one pi ≠ .10, with df = 9.
b. H0: pij = .01 for i and j = 0, 1, 2, ..., 9 vs. Ha: at least one pij ≠ .01, with df = 99.
c. No, there must be more observations than cells to do a valid chi-square test.
d. The results give no reason to reject the null hypotheses.
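The two-way table tests in 25 through 45 follow one recipe: expected counts from row and column totals, then a chi-squared statistic with (r − 1)(c − 1) df. A minimal scipy sketch with an invented 2 × 3 table:

    import numpy as np
    from scipy import stats

    table = np.array([[20, 30, 50],              # invented counts
                      [30, 30, 40]])
    chi2, p, df, expected = stats.chi2_contingency(table)
    print(chi2, df, p)                           # here df = (2-1)(3-1) = 2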

Chapter 14

1. For a two-tailed test of H0: μ = 100 at level .05, we find that s+ = 27 and because 14 < s+ < 64, we do not reject H0.

3. For a two-tailed test of H0: μ = 7.39 at level .05, we find that s+ = 18 and because s+ does not satisfy 21 < s+ < 84, we reject H0.

5. We form the differences and perform a two-tailed test of H0: μ = 0 at level .05. This gives s+ = 72 and because it does not satisfy 14 < s+ < 64, we reject H0 at the .05 level.

7. Because s+ = 162.5 with P-value .044, reject H0: μ = 75 in favor of Ha: μ > 75 at the .05 level.

9. With w = 38, reject H0 at the .05 level because the rejection region is {w ≥ 36}.

11. Test H0: μ1 − μ2 = 1 vs. Ha: μ1 − μ2 > 1. After subtracting 1 from the original process measurements, we get w = 65. Do not reject H0 because w < 84.

13. b. Test H0: μ1 − μ2 = 0 vs. Ha: μ1 − μ2 < 0. With a P-value of .002, we reject H0 at the .01 level.

15. With w = 135, z = 2.223, and the approximate P-value is .026, so we would not reject the null hypothesis at the .01 level.

17. (11.15, 23.80)

19. (−.585, .025)

21. (16, 87)
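As a computational aside (not part of the original answers), the signed-rank and rank-sum tests in 1 through 15 are available in scipy; the data below are made up. Note that for a two-sided test scipy's wilcoxon reports the smaller of the two rank sums, which need not equal s+.

    from scipy import stats

    x = [98.2, 101.5, 103.1, 96.8, 104.0, 99.9, 102.2, 105.3]
    diffs = [v - 100 for v in x]                 # test H0: mu = 100
    stat, p = stats.wilcoxon(diffs)              # smaller signed-rank sum, P-value
    print(stat, p)

    a = [12.1, 13.4, 11.8, 14.2, 12.9]           # hypothetical sample 1
    b = [10.9, 11.2, 12.3, 10.4, 11.7]           # hypothetical sample 2
    u, p2 = stats.mannwhitneyu(a, b, alternative='two-sided')
    print(u, p2)                                 # rank-sum (Mann-Whitney) test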

29. a. (.4736, .6669)
b. (.4736, .6669)

33. For a two-tailed test at level .05, we find that s+ = 24 and because 4 < s+ < 32, we do not reject the hypothesis of equal means.

35. a. α = .0207; Bin(20, .5)
b. c = 14; because y = 12, do not reject H0

37. With K = 20.12 ≥ 13.277 = χ².01,4, reject the null hypothesis of equal means at the 1% level. Axial strength does seem to depend (as an increasing function) on plate length.

39. Because fr = 6.45 < 7.815 = χ².05,3, do not reject the null hypothesis of equal emotion means at the 5% level.

41. Because w = 26 < 27, do not reject the null hypothesis at the 5% level.
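Similarly, the Kruskal-Wallis statistic of 37 and Friedman's statistic of 39 are available in scipy; this sketch uses invented samples.

    from scipy import stats

    g1 = [5.2, 6.1, 5.8, 6.4]                    # invented group data
    g2 = [7.0, 7.4, 6.9, 7.7]
    g3 = [8.1, 8.5, 7.9, 8.8]
    k, p = stats.kruskal(g1, g2, g3)
    print(k, p, stats.chi2.ppf(0.99, df=2))      # reject at 1% if k >= chi2_.01,2

    fr, p2 = stats.friedmanchisquare([1, 2, 3, 2], [2, 3, 1, 3], [3, 1, 2, 1])
    print(fr, p2)                                # Friedman test across 4 blocks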
