Simple Linear Regression

Chapter 14 Simple Linear RegressionLearning Objectives1. Understand how regression analysis can be used to develop an equation that estimates mathematically how two variables are related.2. Understand the differences between the regression model, the regression equation, and the estimated regression equation.3. Know how to fit an estimated regression equation to a set of sample data based upon the least- squares method.4. Be able to determine how good a fit is provided by the estimated regression equation and compute the sample correlation coefficient from the regression analysis output.5. Understand the assumptions necessary for statistical inference and be able to test for a significant relationship.6. Know how to develop confidence interval estimates of y given a specific value of x in both the case of a mean value of y and an individual value of y.7. Learn how to use a residual plot to make a judgement as to the validity of the regression assumptions.8. Know the definition of the following terms: independent and dependent variable simple linear regression regression model regression equation and estimated regression equation scatter diagram coefficient of determination standard error of the estimate confidence interval prediction interval residual plot14 - 1 Chapter 14Solutions:1 a.16 14 12 10 y 8 6 4 2 0 0 1 2 3 4 5 6 x b. There appears to be a positive linear relationship between x and y. c. Many different straight lines can be drawn to provide a linear approximation of the relationship between x and y; in part (d) we will determine the equation of a straight line that “best” represents the relationship according to the least squares criterion.Sx15 S y 40 d. x=i = =3 y = i = = 8 n5 n 52 S(xi - x )( y i - y ) = 26 S ( x i - x ) = 10(xi  x )( y i  y ) 26 b1 2   2.6 (xi  x ) 10 b0  y  b1x  8  (2.6)(3)  0.2 y0.2  2.6 x e. y 0.2  2.6(4)  10.614 - 2 Simple Linear Regression2. a.605040 y 3020100 0 5 10 15 20 25 x b. There appears to be a negative linear relationship between x and y. c. Many different straight lines can be drawn to provide a linear approximation of the relationship between x and y; in part (d) we will determine the equation of a straight line that “best” represents the relationship according to the least squares criterion.Sx55 S y 175 d. x=i = =11 y = i = = 35 n5 n 52 S(xi - x )( y i - y ) = - 540 S ( x i - x ) = 180S(xi - x )( y i - y ) -540 b1 =2 = = -3 S(xi - x ) 180 b0= y - b 1 x =35 - ( - 3)(11) = 68 yˆ =68 - 3 x e. yˆ =68 - 3(10) = 3814 - 3 Chapter 143. a.302520 y 151050 0 5 10 15 20 25 xSx50 S y 83 b. x=i = =10 y = i = = 16.6 n5 n 52 S(xi - x )( y i - y ) = 171 S ( x i - x ) = 190S(xi - x )( y i - y ) 171 b1 =2 = = 0.9 S(xi - x ) 190 b0= y - b 1 x =16.6 - (0.9)(10) = 7.6 yˆ =7.6 + 0.9 x c. yˆ =7.6 + 0.9(6) = 1314 - 4 Simple Linear Regression4. a.135130125120 y 115110105100 61 62 63 64 65 66 67 68 69 x b. There appears to be a positive linear relationship between x and y. c. Many different straight lines can be drawn to provide a linear approximation of the relationship between x and y; in part (d) we will determine the equation of a straight line that “best” represents the relationship according to the least squares criterion.Sx325 S y 585 d. x=i = =65 y = i = = 117 n5 n 52 S(xi - x )( y i - y ) = 110 S ( x i - x ) = 20(xi  x )( y i  y ) 110 b1 2   5.5 (xi  x ) 20 b0  y  b1x  117  (5.5)(65)  240.5 y  240.5 5.5x e. y  240.5 5.5x  240.5 5.5(63)  106 pounds14 - 5 Chapter 145. a.25002000 ) $( 1500 e c i rP 10005000 1 1.5 2 2.5 3 3.5 4 4.5 Baggage Capacity b. Let x = baggage capacity and y = price ($).There appears to be a positive linear relationship between x and y. c. Many different straight lines can be drawn to provide a linear approximation of the relationship between x and y; in part (d) we will determine the equation of a straight line that “best” represents the relationship according to the least squares criterion.Sx29.5 S y 11,110 d. x=i = =3.277778 y = i = = 123.444444 n9 n 92 S(xi - x )( y i - y ) = 2909.888891 S ( x i - x ) = 4.555559S(xi - x )( y i - y ) 2909.888891 b1 =2 = = 638.755615 S(xi - x ) 4.555559 b0= y - b 1 x =1234.4444 - (638.7561)(3.2778) = - 859.254658 y 859.26  638.76 x e. A one point increase in the baggage capacity rating will increase the price by approximately $639. f. y 859.26  638.76 x   859.26  638.76(3)  $105714 - 6 Simple Linear Regression6. a.302520 s u n 15 o B 1050 50.00 70.00 90.00 110.00 130.00 150.00 170.00 190.00 Salary b. Let x = salary and y = bonus.There appears to be a positive linear relationship between x and y.Sx1420 S y 160 c. x=i = =142 y = i = = 16 n10 n 102 xi 1420  y i  160  ( x i  x )( y i  y )  1114  ( x i  x )  6046S(xi - x )( y i - y ) 1114 b1 =2 = = .184254 S(xi - x ) 6046 b0= y - b 1 x =16 - (.184254)(142) = - 10.164068 yˆ = -10.1641 + .1843 x d. $1000 increase in salary will increase the bonus by $180. e yˆ = -10.1641 + .1843 x = - 10.1641 + .1843(120) = 11.95 or approximately $12,00014 - 7 Chapter 147. a.44,00042,00040,000 e 38,000 c i rP 36,00034,00032,00030,000 0 1 2 3 4 5 6 Reliability b. x= S xi/ n = 47 /15 = 3.1333 y = S y i / n = 548,434 /15 = 36,562.26672 S(xi - x )( y i - y ) = - 36,086.5333 S ( x i - x ) = 27.7333S(xi - x )( y i - y ) -36,086.5333 b1 =2 = = -1301.1987 S(xi - x ) 27.7333 b0= y - b 1 x =36,562.2667 - ( - 1301.1987)(3.1333) = 40,639.3126 yˆ =40,369 - 1301.2 x c. The scatter diagram and the slope of the estimated regression equation indicate a negative linear relationship between reliability and price. Thus, it appears that higher reliable cars actually cost less. Although this result may surprise you, it may be due to the fact that higher priced cars have more options that may increase the likelihood of problems. d. A car with a good reliability rating corresponds to x = 3. yˆ =40,369 - 1301.2 x = 40,369 - 1301.2(3) = 36,465.40Thus, the estimate of the price of an upscale sedan with a good reliability rating is approximately $36,465.14 - 8 Simple Linear Regression8. a.1800 1600 1400 1200 e 1000 c i rP 800 600 400 200 0 0 1 2 3 4 5 6 Sidetrack Capability b. There appears to be a positive linear relationship between x = sidetrack capability and y = price, with higher priced models having a higher level of handling.Sx28 S y 10,621 c. x=i = =2.8 y = i = = 1062.1 n10 n 102 S(xi - x )( y i - y ) = 4003.2 S ( x i - x ) = 19.6(xi  x )( y i  y ) 4003.2 b1 2   204.2449 (xi  x ) 19.6 b0= y - b 1 x =1062.1 - (204.2449)(2.8) = 490.2143 y490.21  204.24 x d. y490.21  204.24 x  490.21  204.24(4)  130714 - 9 Chapter 149. a.150 140 130 120 110 y 100 90 80 70 60 50 0 2 4 6 8 10 12 14 xSx70 S y 1080 b. x=i = =7 y = i = = 108 n10 n 102 S(xi - x )( y i - y ) = 568 S ( x i - x ) = 142(xi  x )( y i  y ) 568 b1 2   4 (xi  x ) 142 b0  y  b1x  108  (4)(7)  80 y  80  4x c. y  80  4x  80  4(9)  11614 - 10 Simple Linear Regression10. a.450 400 350 300 e 250 c i rP 200 150 100 50 0 0 10 20 30 40 50 Rating b. The scatter diagram and the slope of the estimated regression equation indicate a negative linear relationship between rating and price. Thus, it appears that sleeping bags with a lower temperature rating cost more than sleeping bags with a higher temperature rating. In other words, it costs more to stay warmer. c. x= S xi/ n = 209 /11 = 19 y = S y i / n = 2849 /11 = 2592 S(xi - x )( y i - y ) = - 10,090 S ( x i - x ) = 1912S(xi - x )( y i - y ) -10,090 b1 =2 = = -5.2772 S(xi - x ) 1912 b0= y - b 1 x =259 - ( - 5.2772)(19) = 359.2668 yˆ =359.2668 - 5.2772 x d. yˆ =359.2668 - 5.2772 x = 359.2668 - 5.2772(20) = 253.72Thus, the estimate of the price of sleeping bag with a temperature rating of 20 is approximately $254.14 - 11 Chapter 1411. a.3530) 25 % ( s e 20 r u t r a 15 p eD 1050 10 15 20 25 30 35 Arrivals (%) b. There appears to be a positive linear relationship between the variables. c. Let x = percentage of late arrivals and y = percentage of late departures.Sx273 S y 265 x=i = =21 y = i = = 20.3846 n13 n 132 S(xi - x )( y i - y ) = 142 S ( x i - x ) = 166S(xi - x )( y i - y ) 142 b1 =2 = = .8554 S(xi - x ) 166 b0= y - b 1 x =20.3846 - (.8554)(21) = 2.4212 yˆ =2.42 + .86 x d. A one percent increase in the percentage of late arrivals will increase the percentage of late arrivals by .86 or slightly less than one percent. e. yˆ =2.42 + .86 x = 2.42 + .86(22) = 21.34%14 - 12 Simple Linear Regression12. a.120001100010000 e 9000 c i rP 8000700060005000 700 720 740 760 780 800 820 840 Weight b. The scatter diagram indicates a positive linear relationship between weight and price. Thus, it appears that PWC’s that weigh more have a higher price. c. x= S xi/ n = 7730 /10 = 773 y = S y i / n = 92,200 /10 = 92202 S(xi - x )( y i - y ) = 332,400 S ( x i - x ) = 14,810S(xi - x )( y i - y ) 332,400 b1 =2 = = 22.4443 S(xi - x ) 14,810 b0= y - b 1 x =9220 - (22.4443)(773) = - 8129.4439 yˆ = -8129.4439 + 22.4443 x d. yˆ = -8129.4439 + 22.4443 x = - 8129.4439 + 22.4443(750) = 8703.78Thus, the estimate of the price of Jet Ski with a weight of 750 pounds is approximately $8704. e. No. The relationship between weight and price is not deterministic. f. The weight of the Kawasaki SX-R 800 is so far below the lowest weight for the data used to develop the estimated regression equation that we would not recommend using the estimated regression equation to predict the price for this model.14 - 13 Chapter 1413. a.30.025.020.0 y 15.010.05.00.0 0.0 20.0 40.0 60.0 80.0 100.0 120.0 140.0 xSx399 S y 97.1 b. x=i = =57 y = i = = 13.8714 n7 n 72 S(xi - x )( y i - y ) = 1233.7 S ( x i - x ) = 7648S(xi - x )( y i - y ) 1233.7 b1 =2 = = 0.1613 S(xi - x ) 7648 b0= y - b 1 x =13.8714 - (0.1613)(57) = 4.6773 y  4.68  0.16x c. y  4.68  0.16x  4.68  0.16(52.5)  13.08 or approximately $13,080. The agent's request for an audit appears to be justified.14 - 14 Simple Linear Regression14. a.30.0 29.0 28.0 27.0 y 26.0 r a l 25.0 aS 24.0 23.0 22.0 21.0 20.0 0.00 10.00 20.00 30.00 40.00 50.00 60.00 70.00 80.00 90.00 Index b. Let x = cost of living index and y = starting salary ($1000s)Sx369.16 S y 267.7 x=i = =36.916 y = i = = 26.77 n10 n 102 S(xi - x )( y i - y ) = - 311.9592 S ( x i - x ) = 7520.4042(xi  x )( y i  y ) 311.9592 b1 2   .0415 (xi  x ) 7520.4042 b0 y  b 1 x 26.77  (  .0415)(36.916)  28.30 y28.30  .0415 x c. y28.30  .0415 x  28.30  .0415(50)  26.215. a. The estimated regression equation and the mean for the dependent variable are: yi  0.2  2.6xi y  8The sum of squares due to error and the total sum of squares are2 2 SSE  (yi  yi )  12.40 SST  (yi  y)  80Thus, SSR = SST - SSE = 80 - 12.4 = 67.6 b. r2 = SSR/SST = 67.6/80 = .845The least squares line provided a very good fit; 84.5% of the variability in y has been explained by the least squares line. c. rxy =.845 = + .919216. a. The estimated regression equation and the mean for the dependent variable are:14 - 15 Chapter 14 yî =68 - 3 x y = 35The sum of squares due to error and the total sum of squares are2 2 SSE=� (yi = yˆ i ) = 230 SST � = ( y i y ) 1850Thus, SSR = SST - SSE = 1850 - 230 = 1620 b. r2 = SSR/SST = 1620/1850 = .876The least squares line provided an excellent fit; 87.6% of the variability in y has been explained by the estimated regression equation. c. rxy =.876 = - .936Note: the sign for r is negative because the slope of the estimated regression equation is negative. (b1 = -3)17. The estimated regression equation and the mean for the dependent variable are: yî =7.6 + .9 x y = 16.6The sum of squares due to error and the total sum of squares are2 2 SSE=� (yi = yˆ i ) = 127.3 SST � = ( y i y ) 281.2Thus, SSR = SST - SSE = 281.2 – 127.3 = 153.9 r2 = SSR/SST = 153.9/281.2 = .547We see that 54.7% of the variability in y has been explained by the least squares line. rxy =.547 = + .74018. a. The estimated regression equation and the mean for the dependent variable are: yˆ 1790.5  581.1 x y  3650The sum of squares due to error and the total sum of squares are2 2 SSE  (yi  y i )  85,135.14 SST   ( y i  y )  335,000Thus, SSR = SST - SSE = 335,000 - 85,135.14 = 249,864.86 b. r2 = SSR/SST = 249,864.86/335,000 = .746We see that 74.6% of the variability in y has been explained by the least squares line. c. rxy =.746 = + .863719. The estimated regression equation and the mean for the dependent variable are:14 - 16 Simple Linear Regression yî =40,639 - 1301.2 x y = 36,562.27The sum of squares due to error and the total sum of squares are2 2 SSE=� (yi = yˆ i ) = 47,116,828 SST � = ( y i y ) 94,072,519Thus, SSR = SST - SSE = 94,072,519 – 47,116,828 = 46,955,691 r2 = SSR/SST = 46,955,691/94,072,519 = .4991We see that 49.91% of the variability in y has been explained by the least squares line. rxy =.4991 = - .7120. a.70 65 60 55 50 e r o 45 c S 40 35 30 25 20 1500.00 2000.00 2500.00 3000.00 3500.00 4000.00 4500.00 PriceThe scatter diagram indicates a positive linear relationship between price and score. x= S xi/ n = 29,600 /10 = 2960 y = S y i / n = 496 /10 = 49.62 S(xi - x )( y i - y ) = 34,840 S ( x i - x ) = 2,744,000S(xi - x )( y i - y ) 34,840 b1 =2 = = .012697 S(xi - x ) 2,744,000 b0= y - b 1 x =49.6 - (.012697)(2960) = 12.0169 yˆ =12.0169 + .0127 x b. The sum of squares due to error and the total sum of squares are14 - 17 Chapter 142 2 SSE=� (yi = yˆ i ) = 540.0446 SST � = ( y i y ) 982.4Thus, SSR = SST - SSE = 982.4 – 540.0446 = 442.3554 r2 = SSR/SST = 442.3554/982.4 = .4503The fit provided by the estimated regression equation is not that good; only 45.03% of the variability in y has been explained by the least squares line. c. yˆ =12.0169 + .0127 x = 12.0169 + .0127(3200) = 52.66The estimate of the overall score for a 42-inch plasma television is approximately 53.Sx3450 S y 33,700 21. a. x=i = =575 y = i = = 5616.67 n6 n 62 S(xi - x )( y i - y ) = 712,500 S ( x i - x ) = 93,750(xi  x )( y i  y ) 712,500 b1 2   7.6 (xi  x ) 93,750 b0 y  b 1 x 5616.67  (7.6)(575)  1246.67 y  1246.67  7.6x b. $7.60 c. The sum of squares due to error and the total sum of squares are:2 2 SSE  (yi  y i )  233,333.33 SST   ( y i  y )  5,648,333.33Thus, SSR = SST - SSE = 5,648,333.33 - 233,333.33 = 5,415,000 r2 = SSR/SST = 5,415,000/5,648,333.33 = .9587We see that 95.87% of the variability in y has been explained by the estimated regression equation. d. y  1246.67  7.6x  1246.67  7.6(500)  $5046.6722. a. Let x = speed (ppm) and y = price ($)Sx149.9 S y 10,221 x=i = =14.99 y = i = = 1022.1 n10 n 102 S(xi - x )( y i - y ) = 34,359.810 S ( x i - x ) = 291.3890S(xi - x )( y i - y ) 34,359.810 b1 =2 = = 117.917320 S(xi - x ) 291.3890 b0= y - b 1 x =1022.1 - (117.917320)(14.99) = - 745.48062714 - 18 Simple Linear Regression yˆ = -745.480627 + 117.917320 x b. The sum of squares due to error and the total sum of squares are:2 2 SSE  (yi  y i )  1,678,294 SST   ( y i  y )  5,729,911Thus, SSR = SST - SSE = 5,729,911 - 1,678,294 = 4,051,617 r2 = SSR/SST = 4,051,617/5,729,911 = 0.7071Approximately 71% of the variability in price is explained by the speed. c. rxy =.7071 = + .84It reflects a linear relationship that is between weak and strong.23. a. s2 = MSE = SSE / (n - 2) = 12.4 / 3 = 4.133 b. s  MSE  4.133  2.0332 c. (xi  x )  10 s 2.033 sb    0.643 1 2 (xi  x ) 10 b 2.6 d. t =1 = = 4.044 s .643 b1Using t table (3 degrees of freedom), area in tail is between .01 and .025 p-value is between .02 and .05Using Excel or Minitab, the p-value corresponding to t = 4.04 is .0272.Because p-value   , we reject H0: 1 = 0 e. MSR = SSR / 1 = 67.6F = MSR / MSE = 67.6 / 4.133 = 16.36Using F table (1 degree of freedom numerator and 3 denominator), p-value is between .025 and .05Using Excel or Minitab, the p-value corresponding to F = 16.36 is .0272.Because p-value   , we reject H0: 1 = 0Source Sum Degrees Mean of Variation of Squares of Freedom Square F p-value Regression 67.6 1 67.6 16.36 .0272 Error 12.4 3 4.13314 - 19 Chapter 14Total 80.0 424. a. s2 = MSE = SSE/(n - 2) = 230/3 = 76.6667 b. s =MSE = 76.6667 = 8.75602 c. S(xi - x ) = 180 s 8.7560 sb = = = 0.6526 1 2 180 S(xi - x ) b -3 d. t =1 = = -4.59 s .653 b1Using t table (3 degrees of freedom), area in tail is less than .01; p-value is less than .02Using Excel or Minitab, the p-value corresponding to t = -4.59 is .0193.Because p-value   , we reject H0: 1 = 0 e. MSR = SSR/1 = 1620F = MSR/MSE = 1620/76.6667 = 21.13Using F table (1 degree of freedom numerator and 3 denominator), p-value is less than .025Using Excel or Minitab, the p-value corresponding to F = 21.13 is .0193.Because p-value   , we reject H0: 1 = 0Source Sum Degrees Mean of Variation of Squares of Freedom Square F p-value Regression 230 1 230 21.13 .0193 Error 1620 3 76.6667 Total 1850 425. a. s2 = MSE = SSE/(n - 2) = 127.3/3 = 42.4333 s =MSE = 42.4333 = 6.51412 b. S(xi - x ) = 190 s 6.5141 sb = = = 0.4726 1 2 190 S(xi - x ) b .9 t =1 = = 1.90 s .4726 b1Using t table (3 degrees of freedom), area in tail is between .05 and .10 p-value is between .10 and .2014 - 20 Simple Linear RegressionUsing Excel or Minitab, the p-value corresponding to t = 1.90 is .1530.Because p-value > , we cannot reject H0: 1 = 0; x and y do not appear to be related. c. MSR = SSR/1 = 153.9 /1 = 153.9F = MSR/MSE = 153.9/42.4333 = 3.63Using F table (1 degree of freedom numerator and 3 denominator), p-value is greater than .10Using Excel or Minitab, the p-value corresponding to F = 3.63 is .1530.Because p-value > , we cannot reject H0: 1 = 0; x and y do not appear to be related.26. a. In solving exercise 18, we found SSE = 85,135.14 s2 = MSE = SSE/(n - 2) = 85,135.14/4 = 21,283.79 s  MSE  21,283.79  145.892 (xi  x )  0.74 s 145.89 sb    169.59 1 2 (xi  x ) 0.74 b 581.1 t =1 = = 3.43 s 169.59 b1Using t table (4 degrees of freedom), area in tail is between .01 and .025 p-value is between .02 and .05Using Excel or Minitab, the p-value corresponding to t = 3.43 is .0266.Because p-value   , we reject H0: 1 = 0 b. MSR = SSR/1 = 249,864.86/1 = 249.864.86F = MSR/MSE = 249,864.86/21,283.79 = 11.74Using F table (1 degree of freedom numerator and 4 denominator), p-value is between .025 and .05Using Excel or Minitab, the p-value corresponding to F = 11.74 is .0266.Because p-value   , we reject H0: 1 = 0 c. Source Sum Degrees Mean of Variation of Squares of Freedom Square F p-value Regression 249864.86 1 249864.86 11.74 .026614 - 21 Chapter 14Error 85135.14 4 21283.79 Total 335000 5Sx37 S y 1654 27. a. x=i = =3.7 y = i = = 165.4 n10 n 102 S(xi - x )( y i - y ) = 315.2 S ( x i - x ) = 10.1(xi  x )( y i  y ) 315.2 b1 2   31.2079 (xi  x ) 10.1 b0= y - b 1 x =165.4 - (31.2079)(3.7) = 49.9308 yˆ =49.9308 + 31.2079 x2 2 b. SSE = (yi  y i )  2487.66 SST = (yi  y ) = 12,324.4Thus, SSR = SST - SSE = 12,324.4 - 2487.66 = 9836.74MSR = SSR/1 = 9836.74MSE = SSE/(n - 2) = 2487.66/8 = 310.96F = MSR / MSE = 9836.74/310.96 = 31.63Using F table (1 degree of freedom numerator and 8 denominator), p-value is less than .01Using Excel or Minitab, the p-value corresponding to F = 31.63 is .001.Because p-value   , we reject H0: 1 = 0Upper support and price are related. c. r2 = SSR/SST = 9,836.74/12,324.4 = .80The estimated regression equation provided a good fit; we should feel comfortable using the estimated regression equation to estimate the price given the upper support rating. d. y = 49.93 + 31.21(4) = 174.7728. The sum of squares due to error and the total sum of squares are2 2 SSE=� (yi = yˆ i ) = 12,953.09 SST � = ( y i y ) 66,200Thus, SSR = SST - SSE = 66,200 – 12,953.09 = 53,246.91 s2 = MSE = SSE / (n - 2) = 12,953.09 / 9 = 1439.232214 - 22 Simple Linear Regression s =MSE = 1439.2322 = 37.9372We can use either the t test or F test to determine whether temperature rating and price are related. We will first illustrate the use of the t test.2 Note: from the solution to exercise 10 S(xi - x ) = 1912 s 37.9372 sb = = = .8676 1 2 1912 S(xi - x ) b -5.2772 t =1 = = -6.0825 s .8676 b1Using t table (9 degrees of freedom), area in tail is less than .005; p-value is less than .01Using Excel or Minitab, the p-value corresponding to t = -6.0825 is .000.Because p-value   , we reject H0: b1 = 0Because we can reject H0: b1 = 0 we conclude that temperature rating and price are related.Next we illustrate the use of the F test.MSR = SSR / 1 = 53,246.91F = MSR / MSE = 53,246.91 / 1439.2322 = 37.00Using F table (1 degree of freedom numerator and 9 denominator), p-value is less than .01Using Excel or Minitab, the p-value corresponding to F = 37.00 is .000.Because p-value   , we reject H0: b1 = 0Because we can reject H0: b1 = 0 we conclude that temperature rating and price are related.The ANOVA table is shown below.Source Sum Degrees Mean of Variation of Squares of Freedom Square F p-value Regression 53,246.91 1 53,246.91 37.00 .000 Error 12,953.09 9 1439.2322 Total 66,200 102 2 29. SSE = S(yi - yˆ i ) = 233,333.33 SST = (yi  y ) = 5,648,333.33Thus, SSR = SST – SSE= 5,648,333.33 –233,333.33 = 5,415,000MSE = SSE/(n - 2) = 233,333.33/(6 - 2) = 58,333.33MSR = SSR/1 = 5,415,00014 - 23 Chapter 14F = MSR / MSE = 5,415,000 / 58,333.25 = 92.83 Source of Sum Degrees of Mean Variation of Squares Freedom Square F p-value Regression 5,415,000.00 1 5,415,000 92.83 .0006 Error 233,333.33 4 58,333.33 Total 5,648,333.33 5Using F table (1 degree of freedom numerator and 4 denominator), p-value is less than .01Using Excel or Minitab, the p-value corresponding to F = 92.83 is .0006.Because p-value   , we reject H0: 1 = 0. Production volume and total cost are related.2 2 30. SSE = S(yi - yˆ i ) = 1,678,294 SST = (yi  y ) = 5,729,911Thus, SSR = SST – SSE = 5,729,911 – 1,678,294 = 4,051,617 s2 = MSE = SSE/(n-2) = 1,678,294/8 = 209,786.8 s 209,786.8  458.022 (xi  x ) = 291.3890 s 458.02 sb = = = 26.83 1 2 291.3890 S(xi - x ) b 117.9173 t =1 = = 4.39 s 26.83 b1Using t table (1 degree of freedom numerator and 8 denominator), area in tail is less than .005 p-value is less than .01Using Excel or Minitab, the p-value corresponding to t = 4.39 is .002.Because p-value   , we reject H0: 1 = 0There is a significant relationship between x and y.31. SSE = 540.04 and SST = 982.40Thus, SSR = SST - SSE = 982.40 – 540.04 = 442.36 s2 = MSE = SSE / (n - 2) = 540.04 / 8 = 67.5050 s =MSE = 67.5050 = 8.216MSR = SSR / 1 = 442.3614 - 24 Simple Linear RegressionF = MSR / MSE = 442.36 / 67.5050 = 6.55Using F table (1 degree of freedom numerator and 8 denominator), p-value is between .025 and .05Using Excel or Minitab, the p-value corresponding to F = 6.55 is .034.Because p-value   , we reject H0: b1 = 0Conclusion: price and overall score are related32. a. s = 2.0332 x3  ( xi  x )  101(x x )2 1 (4 3)2 s s p 2.033   1.11 yp 2 n (xi  x ) 5 10 b. y  0.2  2.6x  0.2  2.6(4)  10.6 y  t s p  /2 yp10.6  3.182 (1.11) = 10.6  3.53 or 7.07 to 14.132 2 1(xp  x ) 1 (4 3) c. sind  s 1  2  2.033 1    2.32 n (xi  x ) 5 10 d. yp  t /2 sind10.6  3.182 (2.32) = 10.6  7.38 or 3.22 to 17.9833. a. s = 8.75602 b. x=11 S ( xi - x ) = 1801(x- x )2 1 (8- 11)2 s= s +p =8.7560 + = 4.3780 yˆp 2 n S(xi - x ) 5 180 yˆ =68 - 3 x = 68 - 3(10) = 38 y  t s p  /2 yp38  3.182(4.3780) = 38  13.93 or 24.07 to 51.9314 - 25 Chapter 142 2 1(xp - x ) 1 (8- 11) c. sind = s 1 + +2 = 8.7560 1 + + = 9.7895 n S(xi - x ) 5 180 d. yp  t /2 sind38  3.182(9.7895) = 38  31.15 or 6.85 to 69.1534. s = 6.51412 x=10 S ( xi - x ) = 1901(x- x )2 1 (12- 10)2 s= s +p =6.5141 + = 3.0627 yˆp 2 n S(xi - x ) 5 190 yˆ =7.6 + .9 x = 7.6 + .9(12) = 18.40 y  t s p  /2 yp18.40  3.182(3.0627) = 18.40  9.75 or 8.65 to 28.152 2 1(xp - x ) 1 (12- 10) sind = s 1 + +2 = 6.5141 1 + + = 7.1982 n S(xi - x ) 5 190 yp  t /2 sind18.40  3.182(7.1982) = 18.40  22.90 or -4.50 to 41.3035. a. Note: some of the values shown were computed in exercises 18 and 26. s = 145.89 2 x3.2  ( xi  x )  0.741(x x )2 1 (3 3.2)2 s s p 145.89   68.54 yp 2 n (xi  x ) 6 0.74 y1790.5  581.1 x  1790.5  581.1(3)  3533.814 - 26 Simple Linear Regression y  t s p  /2 yp3533.8  2.776 (68.54) = 3533.8  190.27 or $3343.53 to $3724.072 2 1(xp  x ) 1 (3 3.2) b. sind  s 1  2  145.89 1    161.19 n (xi  x ) 6 0.74 yp  t /2 sind3533.8  2.776 (161.19) = 3533.8  447.46 or $3086.34 to $3981.2636. a. yˆ =359.2668 - 5.2772 x =359.2668 – 5.2772(30) = 200.95 or approximately $201. b. s = 37.93722 x=19 S ( xi - x ) = 19121(x- x )2 1 (30- 19)2 s= s +p =37.9372 + = 14.90 yˆp 2 n S(xi - x ) 11 1912 y  t s p  /2 yp200.95 2.262 (14.90) = 200.95 33.70 or 167.25 to 234.652 2 1(xp - x ) 1 (30- 19) c. sind = s 1 + +2 = 37.9372 1 + + = 40.76 n S(xi - x ) 11 1912 yp  t /2 sind200.95 2.262 (40.76) = 200.95 92.20 or 108.75 to 293.15 d. As expected, the prediction interval is much wider than the confidence interval. This is due to the fact that it is more difficult to predict the results for one new model with a temperature rating of 30 Fo than it is to estimate the mean for all models with a temperature rating of 30 Fo .2 37. a. x57  ( xi  x )  7648 s2 = 1.88 s = 1.3714 - 27 Chapter 141(x x )2 1 (52.5 57)2 s s p 1.37   0.52 yp 2 n (xi  x ) 7 7648 y  t s p  /2 yp yˆ = 4.68 + 0.16x = 4.68 + 0.16(52.5) = 13.0813.08  2.571 (.52) = 13.08  1.34 or 11.74 to 14.42 or $11,740 to $14,420 b. sind = 1.4713.08  2.571 (1.47) = 13.08  3.78 or 9.30 to 16.86 or $9,300 to $16,860 c. Yes, $20,400 is much larger than anticipated. d. Any deductions exceeding the $16,860 upper limit could suggest an audit.38. a. y  1246.67  7.6(500)  $5046.672 b. x575  ( xi  x )  93,750 s2 = MSE = 58,333.33 s = 241.522 2 1(xp  x ) 1 (500 575) sind  s 1  2  241.52 1    267.50 n (xi  x ) 6 93,750 yp  t /2 sind5046.67  4.604 (267.50) = 5046.67  1231.57 or $3815.10 to $6278.24 c. Based on one month, $6000 is not out of line since $3815.10 to $6278.24 is the prediction interval. However, a sequence of five to seven months with consistently high costs should cause concern. 39. a. Let x = miles of track and y = weekday ridership in thousands.Sx203 S y 309 x=i = =29 y = i = = 44.1429 n7 n 72 S(xi - x )( y i - y ) = 1471 S ( x i - x ) = 838(xi  x )( y i  y ) 1471 b1 2   1.7554 (xi  x ) 838 b0 y  b 1 x 44.1429  (1.7554)(29)   6.7614 - 28 Simple Linear Regression y 6.76  1.755 x b. SST =3620.9 SSE = 1038.7 SSR = 2582.1 r2 = SSR/SST = 2582.1/3620.9 = .713The estimated regression equation explained 71.3% of the variability in y; a good fit. c. s2 = MSE = 1038.7/5 = 207.7 s 207.7  14.411(x x )2 1 (30 29)2 s s p 14.41   5.47 yp 2 n (xi  x ) 7 838 y 6.76  1.755 x   6.76  1.755(30)  45.9 y  t s p  /2 yp45.9  2.571(5.47) = 45.9  14.1 or 31.8 to 602 2 1(xp  x ) 1 (30 29) d. sind  s 1  2  14.41 1    15.41 n (xi  x ) 7 838 yp  t /2 sind45.9  2.571(15.41) = 45.9  39.6 or 6.3 to 85.5The prediction interval is so wide that it would not be of much value in the planning process. A larger data set would be beneficial.40. a. 9 b. y = 20.0 + 7.21x c. 1.3626 d. SSE = SST - SSR = 51,984.1 - 41,587.3 = 10,396.8MSE = 10,396.8/7 = 1,485.3F = MSR / MSE = 41,587.3 /1,485.3 = 28.00Using F table (1 degree of freedom numerator and 7 denominator), p-value is less than .0114 - 29 Chapter 14Using Excel or Minitab, the p-value corresponding to F = 28.00 is .0011.Because p-value   , we reject H0: B1 = 0. e. y = 20.0 + 7.21(50) = 380.5 or $380,50041. a. y = 6.1092 + .8951x b B .8951 0 b. t 1 1   6.01 s .149 b1Using the t table (8 degrees of freedom), area in tail is less than .005 p-value is less than .01Using Excel or Minitab, the p-value corresponding to t = 6.01 is .0003.Because p-value   , we reject H0: B1 = 0 c. y = 6.1092 + .8951(25) = 28.49 or $28.49 per month42 a. y = 80.0 + 50.0x b. 30 c. F = MSR / MSE = 6828.6/82.1 = 83.17Using F table (1 degree of freedom numerator and 28 denominator), p-value is less than .01Using Excel or Minitab, the p-value corresponding to F = 83.17 is .000.Because p-value < = .05, we reject H0: B1 = 0.Branch office sales are related to the salespersons. d. y = 80 + 50 (12) = 680 or $680,00043. a. The Minitab output is shown below:The regression equation is Price = 4.98 + 2.94 WeightPredictor Coef SE Coef T P Constant 4.979 3.380 1.47 0.154 Weight 2.9370 0.2934 10.01 0.000S = 8.457 R-Sq = 80.7% R-Sq(adj) = 79.9%Analysis of VarianceSource DF SS MS F P Regression 1 7167.9 7167.9 100.22 0.000 Residual Error 24 1716.6 71.5 Total 25 8884.514 - 30 Simple Linear RegressionPredicted Values for New ObservationsNew Obs Fit SE Fit 95.0% CI 95.0% PI 1 34.35 1.66 ( 30.93, 37.77) ( 16.56, 52.14) b. The p-value = .000 <  = .05 (t or F); significant relationship c. r2 = .807. The least squares line provided a very good fit. d. The 95% confidence interval is 30.93 to 37.77. e. The 95% prediction interval is 16.56 to 52.14.44. a/b. The scatter diagram shows a linear relationship between the two variables. c. The Minitab output is shown below:The regression equation is Rental$ = 37.1 - 0.779 Vacancy%Predictor Coef SE Coef T P Constant 37.066 3.530 10.50 0.000 Vacancy% -0.7791 0.2226 -3.50 0.003S = 4.889 R-Sq = 43.4% R-Sq(adj) = 39.8%Analysis of VarianceSource DF SS MS F P Regression 1 292.89 292.89 12.26 0.003 Residual Error 16 382.37 23.90 Total 17 675.26Predicted Values for New ObservationsNew Obs Fit SE Fit 95.0% CI 95.0% PI 1 17.59 2.51 ( 12.27, 22.90) ( 5.94, 29.23) 2 28.26 1.42 ( 25.26, 31.26) ( 17.47, 39.05)Values of Predictors for New ObservationsNew Obs Vacancy% 1 25.0 2 11.3 d. Since the p-value = 0.003 is less than  = .05, the relationship is significant. e. r2 = .434. The least squares line does not provide a very good fit. f. The 95% confidence interval is 12.27 to 22.90 or $12.27 to $22.90.14 - 31 Chapter 14 g. The 95% prediction interval is 17.47 to 39.05 or $17.47 to $39.05.Sx70 S y 76 45. a. x=i = =14 y = i = = 15.2 n5 n 52 S(xi - x )( y i - y ) = 200 S ( x i - x ) = 126(xi  x )( y i  y ) 200 b1 2   1.5873 (xi  x ) 126 b0 y  b 1 x 15.2  (1.5873)(14)   7.0222 yˆ  7.02  1.59 x b. The residuals are 3.48, -2.47, -4.83, -1.6, and 5.22 c.642 s l a u d i 0 s e R -2-4-6 0 5 10 15 20 25 xWith only 5 observations it is difficult to determine if the assumptions are satisfied. However, the plot does suggest curvature in the residuals that would indicate that the error term assumptions are not satisfied. The scatter diagram for these data also indicates that the underlying relationship between x and y may be curvilinear. d. s2  23.782 2 1(xi x ) 1 ( x i  14) hi  2   n (xi  x ) 5 126The standardized residuals are 1.32, -.59, -1.11, -.40, 1.49. e. The standardized residual plot has the same shape as the original residual plot. The curvature observed indicates that the assumptions regarding the error term may not be satisfied.14 - 32 Simple Linear Regression46. a. yˆ 2.32  .64 x b.4 3 2 s l 1 a u d i 0 s eR -1 -2 -3 -4 0 2 4 6 8 10 xThe assumption that the variance is the same for all values of x is questionable. The variance appears to increase for larger values of x.47. a. Let x = advertising expenditures and y = revenue yˆ 29.4  1.55 x b. SST = 1002 SSE = 310.28 SSR = 691.72MSR = SSR / 1 = 691.72MSE = SSE / (n - 2) = 310.28/ 5 = 62.0554F = MSR / MSE = 691.72/ 62.0554= 11.15Using F table (1 degree of freedom numerator and 5 denominator), p-value is between .01 and .025Using Excel or Minitab, the p-value corresponding to F = 11.15 is .0206.Because p-value   = .05, we conclude that the two variables are related. c.14 - 33 Chapter 14105 s l 0 a u d i s eR -5-10-15 25 30 35 40 45 50 55 60 65 Predicted Values d. The residual plot leads us to question the assumption of a linear relationship between x and y. Even though the relationship is significant at the .05 level of significance, it would be extremely dangerous to extrapolate beyond the range of the data.48. a. yˆ 80  4 x8642 s l a u d i 0 s e R -2-4-6-8 0 2 4 6 8 10 12 14 x b. The assumptions concerning the error term appear reasonable.49. a. Let x = return on investment (ROE) and y = price/earnings (P/E) ratio. 14 - 34 Simple Linear Regression yˆ  32.13  3.22 x b.2 1.5 s l a u d 1 i s eR 0.5 d e z 0 i d r a d -0.5 n a tS -1 -1.5 0 10 20 30 40 50 60 x c. There is an unusual trend in the residuals. The assumptions concerning the error term appear questionable.50. a. The MINITAB output is shown below:The regression equation is Y = 66.1 + 0.402 XPredictor Coef SE Coef T p Constant 66.10 32.06 2.06 0.094 X 0.4023 0.2276 1.77 0.137S = 12.62 R-sq = 38.5% R-sq(adj) = 26.1%Analysis of VarianceSOURCE DF SS MS F p Regression 1 497.2 497.2 3.12 0.137 Residual Error 5 795.7 159.1 Total 6 1292.9Unusual Observations Obs. X Y Fit SEFit Residual St.Resid 1 135 145.00 120.42 4.87 24.58 2.11R R denotes an observation with a large standardized residual.The standardized residuals are: 2.11, -1.08, .14, -.38, -.78, -.04, -.41The first observation appears to be an outlier since it has a large standardized residual.14 - 35 Chapter 14 b.2.52.01.5 l a u d i s 1.0 e R d e z i 0.5 d r a d n a t 0.0 S-0.5-1.0110 115 120 125 130 135 140 Fitted ValueThe standardized residual plot indicates that the observation x = 135, y = 145 may be an outlier; note that this observation has a standardized residual of 2.11. c. The scatter diagram is shown below150 145 140135 130 y 125 120 115110 105 100 100 110 120 130 140 150 160 170 180 xThe scatter diagram also indicates that the observation x = 135, y = 145 may be an outlier; the implication is that for simple linear regression an outlier can be identified by looking at the scatter diagram.14 - 36 Simple Linear Regression51. a. The Minitab output is shown below:The regression equation is Y = 13.0 + 0.425 XPredictor Coef SE Coef T p Constant 13.002 2.396 5.43 0.002 X 0.4248 0.2116 2.01 0.091S = 3.181 R-sq = 40.2% R-sq(adj) = 30.2%Analysis of VarianceSOURCE DF SS MS F p Regression 1 40.78 40.78 4.03 0.091 Residual Error 6 60.72 10.12 Total 7 101.50Unusual Observations Obs. X Y Fit Stdev.Fit Residual St.Resid 7 12.0 24.00 18.10 1.20 5.90 2.00R 8 22.0 19.00 22.35 2.78 -3.35 -2.16RXR denotes an observation with a large standardized residual. X denotes an observation whose X value gives it large influence.The standardized residuals are: -1.00, -.41, .01, -.48, .25, .65, -2.00, -2.16The last two observations in the data set appear to be outliers since the standardized residuals for these observations are 2.00 and -2.16, respectively. b. Using MINITAB, we obtained the following leverage values:.28, .24, .16, .14, .13, .14, .14, .76MINITAB identifies an observation as having high leverage if hi > 6/n; for these data, 6/n = 6/8 = .75. Since the leverage for the observation x = 22, y = 19 is .76, MINITAB would identify observation 8 as a high leverage point. Thus, we conclude that observation 8 is an influential observation.14 - 37 Chapter 14 c.302520 y 151050 0 5 10 15 20 25 xThe scatter diagram indicates that the observation x = 22, y = 19 is an influential observation.52. a. The Minitab output is shown below:The regression equation is Shipment = 4.09 + 0.196 Media$Predictor Coef SE Coef T P Constant 4.089 2.168 1.89 0.096 MediaExp 0.19552 0.03635 5.38 0.001S = 5.044 R-Sq = 78.3% R-Sq(adj) = 75.6%Analysis of VarianceSource DF SS MS F P Regression 1 735.84 735.84 28.93 0.001 Residual Error 8 203.51 25.44 Total 9 939.35Unusual Observations Obs Media$ Shipment Fit SE Fit Residual St Resid 1 120 36.30 27.55 3.30 8.75 2.30R R denotes an observation with a large standardized residual b. Minitab identifies observation 1 as having a large standardized residual; thus, we would consider observation 1 to be an outlier.14 - 38 Simple Linear Regression53. a. The Minitab output is shown below:The regression equation is Price = 28.0 + 0.173 VolumePredictor Coef SE Coef T P Constant 27.958 4.521 6.18 0.000 Volume 0.17289 0.07804 2.22 0.036S = 17.53 R-Sq = 17.0% R-Sq(adj) = 13.5%Analysis of VarianceSource DF SS MS F P Regression 1 1508.4 1508.4 4.91 0.036 Residual Error 24 7376.1 307.3 Total 25 8884.5Unusual Observations Obs Volume Price Fit SE Fit Residual St Resid 22 230 40.00 67.72 15.40 -27.72 -3.31RXR denotes an observation with a large standardized residual X denotes an observation whose X value gives it large influence. b. The Minitab output identifies observation 22 as having a large standardized residual and is an observation whose x value gives it large influence. The following residual plot verifies these observations.21 l a u 0 d i s e R d e -1 z i d r a d n a t -2 S-3-4 30 40 50 60 70 Fitted Value14 - 39 Chapter 1454. a. The Minitab output is shown below:The regression equation is Salary = 707 + 0.00482 MktCapPredictor Coef SE Coef T P Constant 707.0 118.0 5.99 0.000 MktCap 0.0048154 0.0008076 5.96 0.000S = 379.8 R-Sq = 66.4% R-Sq(adj) = 64.5%Analysis of VarianceSource DF SS MS F P Regression 1 5129071 5129071 35.55 0.000 Residual Error 18 2596647 144258 Total 19 7725718Unusual Observations Obs MktCap Salary Fit SE Fit Residual St Resid 6 507217 3325.0 3149.5 338.6 175.5 1.02 X 17 120967 116.2 1289.5 86.4 -1173.3 -3.17R R denotes an observation with a large standardized residual X denotes an observation whose X value gives it large influence. b. Minitab identifies observation 6 as an observation whose x value gives it large influence and observation 17 as having a large standardized residual. A standardized residual plot against the predicted values is shown below:3 s l a 2 u d i s 1 e R0 d e z i -1 d r a d -2 n a t -3 S -4 500 1000 1500 2000 2500 3000 3500 Predicted Values55. No. Regression or correlation analysis can never prove that two variables are causally related.56. The estimate of a mean value is an estimate of the average of all y values associated with the same x. The estimate of an individual y value is an estimate of only one of the y values associated with a particular x.57. The purpose of testing whether 1  0 is to determine whether or not there is a significant relationship between x and y. However, rejecting 1  0 does not necessarily imply a good fit. For 14 - 40 Simple Linear Regression2 example, if 1  0 is rejected and r is low, there is a statistically significant relationship between x and y but the fit is not very good.58. a. The Minitab output is shown below:The regression equation is Price = 9.26 + 0.711 SharesPredictor Coef SE Coef T P Constant 9.265 1.099 8.43 0.000 Shares 0.7105 0.1474 4.82 0.001S = 1.419 R-Sq = 74.4% R-Sq(adj) = 71.2%Analysis of VarianceSource DF SS MS F P Regression 1 46.784 46.784 23.22 0.001 Residual Error 8 16.116 2.015 Total 9 62.900 b. Since the p-value corresponding to F = 23.22 = .001 <  = .05, the relationship is significant. c. r 2 = .744; a good fit. The least squares line explained 74.4% of the variability in Price. d. yˆ 9.26  .711(6)  13.5359. a. The Minitab output is shown below:The regression equation is Options = - 3.83 + 0.296 CommonPredictor Coef SE Coef T P Constant -3.834 5.903 -0.65 0.529 Common 0.29567 0.02648 11.17 0.000S = 11.04 R-Sq = 91.9% R-Sq(adj) = 91.2%Analysis of VarianceSource DF SS MS F P Regression 1 15208 15208 124.72 0.000 Residual Error 11 1341 122 Total 12 16550 b. yˆ  3.83  .296(150)  40.57 ; approximately 40.6 million shares of options grants outstanding. c. r 2 = .919; a very good fit. The least squares line explained 91.9% of the variability in Options.14 - 41 Chapter 1460. a.130012801260 00 1240 5 P& 1220 S120011801160 10000 10200 10400 10600 10800 11000 DJIA b. The Minitab output is shown below:The regression equation is S&P500 = - 182 + 0.133 DJIAPredictor Coef SE Coef T P Constant -182.11 71.83 -2.54 0.021 DJIA 0.133428 0.006739 19.80 0.000S = 6.89993 R-Sq = 95.6% R-Sq(adj) = 95.4%Analysis of VarianceSource DF SS MS F P Regression 1 18666 18666 392.06 0.000 Residual Error 18 857 48 Total 19 19523 c. Using the F test, the p-value corresponding to F = 392.06 is .000. Because the p-value a =.05, we reject H0:b 1 = 0 ; there is a significant relationship. d. With R-Sq = 95.6%, the estimated regression equation provided an excellent fit. e. yˆ = -182.11 + .133428DJIA= - 182.11 + .133428(11,000) = 1285.60 or 1286. f. The DJIA is not that far beyond the range of the data. With the excellent fit provided by the estimated regression equation, we should not be too concerned about using the estimated regression equation to predict the S&P500.14 - 42 Simple Linear Regression61. The Minitab output is shown below:The regression equation is Expense = 10.5 + 0.953 UsagePredictor Coef SE Coef T p Constant 10.528 3.745 2.81 0.023 X 0.9534 0.1382 6.90 0.000S = 4.250 R-sq = 85.6% R-sq(adj) = 83.8%Analysis of VarianceSOURCE DF SS MS F p Regression 1 860.05 860.05 47.62 0.000 Residual Error 8 144.47 18.06 Total 9 1004.53Fit Stdev.Fit 95% C.I. 95% P.I. 39.13 1.49 ( 35.69, 42.57) ( 28.74, 49.52) a. y = 10.5 + .953 Usage b. Since the p-value corresponding to F = 47.62 = .000 <  = .05, we reject H0: 1 = 0. c. The 95% prediction interval is 28.74 to 49.52 or $2874 to $4952 d. Yes, since the expected expense is $3913.62. a. The Minitab output is shown below:The regression equation is Defects = 22.2 - 0.148 SpeedPredictor Coef SE Coef T P Constant 22.174 1.653 13.42 0.000 Speed -0.14783 0.04391 -3.37 0.028S = 1.489 R-Sq = 73.9% R-Sq(adj) = 67.4%Analysis of VarianceSource DF SS MS F P Regression 1 25.130 25.130 11.33 0.028 Residual Error 4 8.870 2.217 Total 5 34.000Predicted Values for New ObservationsNew Obs Fit SE Fit 95.0% CI 95.0% PI 1 14.783 0.896 ( 12.294, 17.271) ( 9.957, 19.608) b. Since the p-value corresponding to F = 11.33 = .028 <  = .05, the relationship is significant.14 - 43 Chapter 14 c. r 2 = .739; a good fit. The least squares line explained 73.9% of the variability in the number of defects. d. Using the Minitab output in part (a), the 95% confidence interval is 12.294 to 17.271.63. a.9 8 7 6 s 5 y aD 4 3 2 1 0 0 5 10 15 20 DistanceThere appears to be a negative linear relationship between distance to work and number of days absent. b. The MINITAB output is shown below:The regression equation is Days = 8.10 - 0.344 DistancePredictor Coef SE Coef T p Constant 8.0978 0.8088 10.01 0.000 X -0.34420 0.07761 -4.43 0.002S = 1.289 R-sq = 71.1% R-sq(adj) = 67.5%Analysis of VarianceSOURCE DF SS MS F p Regression 1 32.699 32.699 19.67 0.002 Residual Error 8 13.301 1.663 Total 9 46.000Fit Stdev.Fit 95% C.I. 95% P.I. 6.377 0.512 ( 5.195, 7.559) ( 3.176, 9.577) c. Since the p-value corresponding to F = 419.67 is .002 <  = .05. We reject H0 : 1 = 0. d. r2 = .711. The estimated regression equation explained 71.1% of the variability in y; this is a reasonably good fit.14 - 44 Simple Linear Regression e. The 95% confidence interval is 5.195 to 7.559 or approximately 5.2 to 7.6 days.64. a. The Minitab output is shown below:The regression equation is Cost = 220 + 132 AgePredictor Coef SE Coef T p Constant 220.00 58.48 3.76 0.006 X 131.67 17.80 7.40 0.000S = 75.50 R-sq = 87.3% R-sq(adj) = 85.7%Analysis of VarianceSOURCE DF SS MS F p Regression 1 312050 312050 54.75 0.000 Residual Error 8 45600 5700 Total 9 357650Fit Stdev.Fit 95% C.I. 95% P.I. 746.7 29.8 ( 678.0, 815.4) ( 559.5, 933.9) b. Since the p-value corresponding to F = 54.75 is .000 <  = .05, we reject H0: 1 = 0. c. r2 = .873. The least squares line provided a very good fit. d. The 95% prediction interval is 559.5 to 933.9 or $559.50 to $933.9065. a. The Minitab output is shown below:The regression equation is Points = 5.85 + 0.830 HoursPredictor Coef SE Coef T p Constant 5.847 7.972 0.73 0.484 X 0.8295 0.1095 7.58 0.000S = 7.523 R-sq = 87.8% R-sq(adj) = 86.2%Analysis of VarianceSOURCE DF SS MS F p Regression 1 3249.7 3249.7 57.42 0.000 Residual Error 8 452.8 56.6 Total 9 3702.5Fit Stdev.Fit 95% C.I. 95% P.I. 84.65 3.67 ( 76.19, 93.11) ( 65.35, 103.96) b. Since the p-value corresponding to F = 57.42 is .000 <  = .05, we reject H0: 1 = 0. c. 84.65 points14 - 45 Chapter 14 d. The 95% prediction interval is 65.35 to 103.9666. a. The Minitab output is shown below:The regression equation is Horizon = 0.275 + 0.950 S&P 500Predictor Coef SE Coef T P Constant 0.2747 0.9004 0.31 0.768 S&P 500 0.9498 0.3569 2.66 0.029S = 2.664 R-Sq = 47.0% R-Sq(adj) = 40.3%Analysis of VarianceSource DF SS MS F P Regression 1 50.255 50.255 7.08 0.029 Residual Error 8 56.781 7.098 Total 9 107.036The market beta for Horizon is b1 = .95 b. Since the p-value = 0.029 is less than  = .05, the relationship is significant. c. r2 = .470. The least squares line does not provide a very good fit. d. Texas Instruments has higher risk with a market beta of 1.46.67. a. The Minitab output is shown below:The regression equation is Audit% = - 0.471 +0.000039 IncomePredictor Coef SE Coef T P Constant -0.4710 0.5842 -0.81 0.431 Income 0.00003868 0.00001731 2.23 0.038S = 0.2088 R-Sq = 21.7% R-Sq(adj) = 17.4%Analysis of VarianceSource DF SS MS F P Regression 1 0.21749 0.21749 4.99 0.038 Residual Error 18 0.78451 0.04358 Total 19 1.00200Predicted Values for New ObservationsNew Obs Fit SE Fit 95.0% CI 95.0% PI 1 0.8828 0.0523 ( 0.7729, 0.9927) ( 0.4306, 1.3349) b. Since the p-value = 0.038 is less than  = .05, the relationship is significant.14 - 46 Simple Linear Regression c. r2 = .217. The least squares line does not provide a very good fit. d. The 95% confidence interval is .7729 to .9927.68. a. 9080 ) % ( 70 g n i t a 60 R n o i t 50 c a f s i 40 t a S 3020 20 30 40 50 60 70 Top Five (%) b. There appears to be a positive linear relationship between the two variables.A portion of the Minitab output for this problem follows.The regression equation is Rating (%) = 9.4 + 1.29 Top Five (%)Predictor Coef SE Coef T P Constant 9.37 10.24 0.92 0.380 Top Five (%) 1.2875 0.2293 5.61 0.000S = 6.62647 R-Sq = 74.1% R-Sq(adj) = 71.8%Analysis of VarianceSource DF SS MS F P Regression 1 1383.9 1383.9 31.52 0.000 Residual Error 11 483.0 43.9 Total 12 1866.9 c. yˆ = 9.37 + 1.2875 Top Five (%)14 - 47 Chapter 14 d. Since the p-value = .000 < α = .05, the relationship is significant. e. r 2 = .741; a good fit. The least squares line explained 74.1% of the variability in the satisfaction rating.2 f. rxy = r =.741 = .8614 - 48

Simple Linear Regression

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support