Topic 11 Regression Analysis—Testing
Activity 11-5: Verbal SAT Scores (cont’d.)
Use the data in the SPSS data file SAT99.SAV.
a) Compute the regression using percentage taking the examinations to predict the average verbal SAT score. Report your regression equation.
Verbal SAT = 572.181 - 1.074 Percent
b) How many degrees of freedom are associated with the t test on $\beta_1$? 49
c) Use Table II to compute the p-value for testing $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 < 0$.
p-value = $\Pr(T_{49} < -13.488) < .0005$
d) What do you conclude? There is very strong evidence that the slope coefficient is negative.
e) Analyze the residuals to determine whether the technical requirements of the regression model appear to be satisfied. What are your conclusions? The plots are shown below. The normal probability plot supports a normal distribution. The residual plot, however, shows a definite U-shape, which calls the assumption of a linear relationship into question.
[Figure: two SPSS plots, Dependent Variable: Verbal SAT. Left: Normal P-P Plot of Regression Standardized Residual (Expected Cum Prob vs. Observed Cum Prob). Right: Scatterplot (Regression Standardized Residual vs. Regression Standardized Predicted Value).]
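Part (c) of Activity 11-5 reads the p-value from Table II. As a cross-check, the same tail probability can be approximated numerically. The sketch below is not part of the original activity: the function name `t_upper_tail` and the choice of Simpson's rule are assumptions made for illustration, and it uses only the Python standard library.

```python
import math

def t_upper_tail(t, df, steps=2000):
    """Approximate Pr(T_df > t) for t >= 0 with Simpson's rule.

    The substitution x = sqrt(df) * tan(theta) turns the tail integral of
    the t density into a constant times the integral of
    cos(theta) ** (df - 1) from atan(t / sqrt(df)) to pi / 2.
    """
    if t < 0:
        raise ValueError("t must be non-negative; use symmetry for t < 0")
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    # Log of the constant Gamma((df + 1) / 2) / (sqrt(pi) * Gamma(df / 2)).
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(math.pi))
    a = math.atan(t / math.sqrt(df))
    b = math.pi / 2
    h = (b - a) / steps
    total = math.cos(a) ** (df - 1) + math.cos(b) ** (df - 1)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * math.cos(a + i * h) ** (df - 1)
    return math.exp(log_c) * total * h / 3

# Activity 11-5(c): Pr(T_49 < -13.488) equals Pr(T_49 > 13.488) by symmetry.
p_value = t_upper_tail(13.488, 49)
print(f"one-sided p-value = {p_value:.3e}")
```

The printed value should fall far below the .0005 bound that Table II provides, which agrees with the conclusion in part (d).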
Activity 11-6: Physical Measurements (cont’d.)
a) Enter the physical measurements taken on members of your class in an SPSS worksheet. Do the two measurements appear to be linearly related? Results will vary according to the data.
b) Compute a regression line that uses forearm length to predict height. Report your regression equation. Results will vary according to the data.
c) How many degrees of freedom are associated with the t test on $\beta_1$? Results will vary according to the data.
d) Use Table II to compute the p-value for testing $H_0: \beta_1 = 0$ vs. $H_a$: $\beta_1$ [inequality illegible] $0$. Results will vary according to the data.
e) What do you conclude? Results will vary according to the data.
f) Do the residuals indicate that the regression model technical requirements are satisfied? Results will vary according to the data.
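Because the answers to Activity 11-6 depend on the class data, it may help to see the calculations that SPSS performs for parts (b) and (c). The following is a minimal sketch using only the Python standard library. The function name `fit_line` is hypothetical, and the forearm and height values are made-up placeholders, not real measurements.

```python
import math

def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x, with the t statistic for b1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope
    b0 = y_bar - b1 * x_bar        # intercept
    df = n - 2                     # two coefficients are estimated
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / df / sxx)  # standard error of the slope
    return {"b0": b0, "b1": b1, "df": df, "se_b1": se_b1, "t": b1 / se_b1}

# Made-up placeholder measurements in inches, NOT real class data.
forearm = [9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0]
height = [62.0, 65.0, 66.0, 69.0, 70.0, 74.0, 75.0]

result = fit_line(forearm, height)
print(f"Height = {result['b0']:.3f} + {result['b1']:.3f} Forearm")
print(f"df = {result['df']}, t = {result['t']:.3f}")
```

The degrees of freedom are always the number of observations minus 2, so a class of 30 students would give 28 degrees of freedom for the t test on the slope.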
Activity 11-7: Aircraft Operating Costs (cont’d.) The SPSS data file AIRCRAFT.SAV contains data on a number of commercial aircraft. The variables are described in the worksheet.
a) Compute the regression line that uses fuel consumption to predict operating cost. Report your regression equation.
Operating Cost = 451.609 + 1.837 Fuel Consumption
b) Use the Analysis of Variance table to test $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$. What are the degrees of freedom and the p-value?
ANOVA(b)
Model 1       Sum of Squares    df    Mean Square    F          Sig.
Regression    86083757           1    86083757.24    285.340    .000(a)
Residual      7843899           26    301688.440
Total         93927657          27
a. Predictors: (Constant), Fuel Consumption (gal/hr)
b. Dependent Variable: Operating Cost ($/hr)
The degrees of freedom are (1, 26), and the p-value is reported as .000, that is, less than .0005.
c) What do you conclude? There is very strong evidence that the slope coefficient is not 0.
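The entries of the ANOVA table in Activity 11-7 are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. The sketch below recomputes them from the sums of squares in the table. The variable names are illustrative, and the R squared and t values are derived quantities that the original output does not show.

```python
import math

# Sums of squares and degrees of freedom copied from the SPSS ANOVA table.
ss_regression, df_regression = 86083757.0, 1
ss_residual, df_residual = 7843899.0, 26

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# Share of the total variation in operating cost explained by the line.
r_squared = ss_regression / (ss_regression + ss_residual)

# With a single predictor, F equals the square of the slope's t statistic.
# The positive root is taken because the fitted slope (1.837) is positive.
t_slope = math.sqrt(f_stat)

print(f"Mean squares: {ms_regression:.2f}, {ms_residual:.3f}")
print(f"F({df_regression}, {df_residual}) = {f_stat:.3f}")
print(f"R squared = {r_squared:.4f}, t = {t_slope:.3f}")
```

Small differences from the printed mean squares arise because the sums of squares in the table are rounded to whole numbers.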
Activity 11-8: Measuring Cherry Trees (cont’d.) The SPSS data set CHERRY_TREES.SAV contains the diameter, height, and volume of a sample of black cherry trees.
a) Fit a regression line that uses height times the square of diameter to predict the volume. Report your regression equation.
Coefficients(a)
Model 1                Unstd. B    Std. Error    Std. Beta    t         Sig.
(Constant)             1.095       1.876                      .583      .568
Height x Diameter^2    .002        .000          .932         10.258    .000
a. Dependent Variable: Volume (cubic feet)
Volume = 1.095 + .002 (Height x Diameter^2)
b) Test $H_0: \beta_0 = 0$ vs. $H_a: \beta_0 \neq 0$. From the output shown above, the value of t is .583 with a p-value of .568.
c) What do you conclude? What does this mean? There is no evidence that the intercept differs from 0. This is consistent with a tree for which Height x Diameter^2 = 0 having a volume of 0.
d) Test $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 > 0$. From the output shown above, t = 10.258, and the one-sided p-value is half of the reported two-sided value of .000, so it is less than .0005.
e) What do you conclude? What does this mean? There is very strong evidence that the slope coefficient is positive. Height times the square of diameter is a useful predictor of volume.
f) Do the assumptions of the regression model appear to be satisfied? Explain your reasoning. The plots are shown below. The normal probability plot is reasonably consistent with a normal distribution. The residual plot seems to get somewhat wider as you move from left to right, making a constant standard deviation questionable. The relationship does appear to be linear.
[Figure: two SPSS plots, Dependent Variable: Volume (cubic feet). Left: Normal P-P Plot of Regression Standardized Residual (Expected Cum Prob vs. Observed Cum Prob). Right: Scatterplot (Regression Standardized Residual vs. Regression Standardized Predicted Value).]
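Part (d) of Activity 11-8 converts the two-sided Sig. value that SPSS prints into a one-sided p-value. A minimal sketch of that rule follows. The helper name `one_sided_p` is hypothetical, and because SPSS shows only .000, the value .0005 is used as an assumed upper bound on the two-sided p-value rather than an exact figure.

```python
def one_sided_p(two_sided_p, t_stat, alternative):
    """Convert a two-sided t-test p-value into a one-sided p-value.

    alternative is "greater" (H_a: beta > 0) or "less" (H_a: beta < 0).
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    # Halve the p-value only when the sign of t points the same way as H_a.
    agrees = (t_stat > 0) if alternative == "greater" else (t_stat < 0)
    half = two_sided_p / 2
    return half if agrees else 1 - half

# Activity 11-8(d): Sig. = .000 means the two-sided p-value is below .0005,
# so the result here is an upper bound, not an exact p-value.
bound = one_sided_p(0.0005, 10.258, "greater")
print(f"one-sided p-value < {bound}")
```

Halving is valid here because t = 10.258 is positive, matching the direction of the alternative hypothesis. Had t been negative, the one-sided p-value would have been close to 1.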
Activity 11-9: Ice Cream Consumption The SPSS data file ICECREAM.SAV contains data regarding the consumption of ice cream and other factors. We wish to see if the price of ice cream has an effect on the amount that is consumed.
a) Fit a regression line that uses the price per pint to predict the per capita consumption of ice cream. Report your regression equation.
Consumption = .923 - 2.047 Price
b) What hypotheses should you test in order to see if the price of ice cream has an effect on the amount consumed?
You should test $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$.
c) Conduct the test you chose in part (b). What do you conclude?
Coefficients(a)
Model 1           Unstd. B    Std. Error    Std. Beta    t         Sig.
(Constant)        .923        .396                       2.329     .027
Price ($/pint)    -2.047      1.439         -.260        -1.422    .166
a. Dependent Variable: Consumption (pint per capita)
The value of t is -1.422 with a p-value of .166, so there is insufficient evidence that the price of ice cream has an effect on the amount consumed.
d) Do the regression model assumptions appear to be satisfied? Explain your reasoning. The plots are shown below. The normal probability plot is consistent with a normal distribution. The residual plot is randomly scattered with no real change in width, so a linear relationship with a constant standard deviation is reasonable.
[Figure: two SPSS plots, Dependent Variable: Consumption (pint per capita). Left: Normal P-P Plot of Regression Standardized Residual (Expected Cum Prob vs. Observed Cum Prob). Right: Scatterplot (Regression Standardized Residual vs. Regression Standardized Predicted Value).]
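The t values in the Coefficients table for Activity 11-9 are each coefficient divided by its standard error. The sketch below repeats that division with the rounded values from the table. The dictionary layout and variable names are illustrative, and the R squared line is a derived quantity that is not shown in the original output.

```python
# Values copied from the SPSS Coefficients table for Activity 11-9.
rows = {
    "(Constant)": {"b": 0.923, "se": 0.396, "t_reported": 2.329},
    "Price ($/pint)": {"b": -2.047, "se": 1.439, "t_reported": -1.422},
}

# t statistic = coefficient / standard error.
t_values = {name: row["b"] / row["se"] for name, row in rows.items()}
for name, row in rows.items():
    print(f"{name}: t = {t_values[name]:.3f} (SPSS reports {row['t_reported']})")

# In a one-predictor regression the standardized Beta equals the correlation,
# so Beta squared is the share of variation in consumption explained by price.
beta_price = -0.260
r_squared = beta_price ** 2
print(f"R squared = {r_squared:.3f}")
```

The recomputed values differ from the SPSS values only in the third decimal place, because the table shows coefficients and standard errors rounded to three decimals.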