Reading Minitab output for a simple linear regression model

Regression Analysis: Height versus Mother Height

The regression equation is

Height = 24.7 + 0.640 Mother Height

This is the estimated regression equation, y = b0 + b1x, where:
  Height is the dependent variable, y
  24.7 is the intercept, b0
  0.640 is the slope, b1
  Mother Height is the independent variable, x
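As a small illustration of how the equation is used, the sketch below plugs a mother's height into it. The function name `predict_height` is hypothetical, and the coefficients are the rounded values printed in the equation, so results can differ in the second decimal place from Minitab's own Fit values (which use unrounded coefficients).

```python
# Coefficients as printed in the regression equation (rounded by Minitab).
B0 = 24.7    # intercept, b0
B1 = 0.640   # slope, b1


def predict_height(mother_height: float) -> float:
    """Return the predicted height, b0 + b1 * x, for a given mother's height."""
    return B0 + B1 * mother_height


# Predicted height for a mother's height of 63.0 (same units as the data).
print(round(predict_height(63.0), 2))
```

For x = 63.0 this gives roughly 65.02, close to the Fit of 65.039 that the output reports for observations with Mother H = 63.0.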

Predictor     Coef  SE Coef     T      P
Constant    24.690    8.978  2.75  0.009
Mother H    0.6405   0.1394  4.59  0.000

Column annotations:
  Coef: estimates
  SE Coef: sd of ests.
  T: test statistics
  P: p-values

Row annotations:
  Constant: intercept, b0; tests H0: β0 = 0, vs. two-tailed alternative
  Mother H: slope, b1; tests H0: β1 = 0, vs. two-tailed alternative
  (the latter is equivalent to testing for linear correlation between x and y)
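The T column can be reproduced from the two columns before it: each test statistic is the estimate divided by its standard error. The sketch below checks this for both rows; the helper name `t_statistic` is hypothetical, and the p-values are not recomputed here because that would need a t-distribution routine.

```python
# (Coef, SE Coef, printed T) for each predictor, copied from the output.
rows = {
    "Constant": (24.690, 8.978, 2.75),
    "Mother H": (0.6405, 0.1394, 4.59),
}


def t_statistic(coef: float, se_coef: float) -> float:
    """Test statistic for H0: coefficient = 0, i.e. estimate / standard error."""
    return coef / se_coef


for name, (coef, se_coef, printed_t) in rows.items():
    t = t_statistic(coef, se_coef)
    print(f"{name}: T = {t:.2f} (printed {printed_t})")
```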

S = 2.973   R-Sq = 35.7%   R-Sq(adj) = 34.0%

  S: standard error of estimate, se
  R-Sq: coefficient of linear determination, r²
  R-Sq(adj): adjusted r², used for multiple regression

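(The coefficient of linear correlation, r, is the square root of r², with the same sign as the slope, b1.)

(For a simple linear regression model, these tests are equivalent: the t test of H0: β1 = 0 above and the F test for overall model fit in the Analysis of Variance below give the same p-value.)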
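The rule for recovering the correlation coefficient from R-Sq can be written out in a few lines. This is a minimal sketch using the rounded values printed in the output, so the result is only accurate to about two or three decimal places.

```python
import math

r_squared = 0.357   # R-Sq = 35.7%
slope = 0.6405      # b1, the Mother H coefficient

# r has the magnitude sqrt(r^2) and takes the sign of the slope.
r = math.copysign(math.sqrt(r_squared), slope)
print(round(r, 3))
```

Because the slope is positive, r is positive here, at about 0.60.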

Analysis of Variance

Source            DF      SS      MS      F      P
Regression         1  186.54  186.54  21.11  0.000
Residual Error    38  335.76    8.84
Total             39  522.30

Column annotations:
  F: test stat.
  P: p-value

Row annotations:
  Regression: SS is the “explained variation”; the F test tests for overall model fit (used for multiple regression)
  Residual Error: SS is the “unexplained variation”
  Total: SS is the “total variation”; total df = n-1 for simple linear regression
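Most of the summary numbers in the output follow from the SS and DF columns by simple arithmetic. The sketch below recomputes MS, F, S, R-Sq and R-Sq(adj) from the printed sums of squares, and also squares the slope's t statistic, which is one reading of the note that "these tests are equivalent". Variable names are illustrative, and small differences from the printed values come from rounding in the output.

```python
import math

n = 40                   # total df + 1
ss_regression = 186.54   # "explained variation"
ss_residual = 335.76     # "unexplained variation"
ss_total = ss_regression + ss_residual   # "total variation"

df_regression = 1
df_residual = n - 2
df_total = n - 1

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_stat = ms_regression / ms_residual
s = math.sqrt(ms_residual)                # standard error of estimate, se
r_sq = ss_regression / ss_total
r_sq_adj = 1 - (ss_residual / df_residual) / (ss_total / df_total)

# Slope t statistic from the coefficient table (Coef / SE Coef).
t_slope = 0.6405 / 0.1394

print(f"F = {f_stat:.2f}, t^2 = {t_slope ** 2:.2f}")
print(f"S = {s:.3f}, R-Sq = {r_sq:.1%}, R-Sq(adj) = {r_sq_adj:.1%}")
```

With one predictor, the squared t statistic for the slope and the F statistic agree up to rounding, which is why the two tests report the same p-value.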

Unusual Observations

Obs  Mother H  Height     Fit  SE Fit  Residual  St Resid
  8      63.0  71.000  65.039   0.505     5.961     2.03R
 21      63.0  58.600  65.039   0.505    -6.439    -2.20R
 30      75.0  71.100  72.724   1.560    -1.624    -0.64 X

Column annotations:
  Mother H: observed x
  Height: observed y
  Fit: ŷ, the predicted value of y for the observed x
  SE Fit: sŷ
  Residual: (y – ŷ)
  St Resid: (y – ŷ) / se

R denotes an observation with a large standardized residual. (Standardized residuals are residuals divided by approximately se; they have a mean of about 0 and a standard deviation of about 1. This makes it easier to judge how far an observed value is from the least-squares regression line, relative to the dispersion of observations around the line. Observations with standardized residuals greater than +2 or less than –2 are considered “unusual” by Minitab.)
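The printed St Resid values are slightly larger in magnitude than the plain ratio residual / se. The sketch below compares that ratio with a leverage-adjusted version, residual / (se * sqrt(1 - h)) with h = (SE Fit / se)². The adjusted formula is an assumption on my part rather than something stated in this handout, though it reproduces the printed values to two decimal places; the function names are hypothetical.

```python
import math

S = 2.973  # standard error of estimate, se

# (Obs, Residual, SE Fit, printed St Resid) from the Unusual Observations table.
rows = [
    (8, 5.961, 0.505, 2.03),
    (21, -6.439, 0.505, -2.20),
    (30, -1.624, 1.560, -0.64),
]


def simple_ratio(residual: float) -> float:
    """Residual divided by se, the approximation described in the text."""
    return residual / S


def leverage_adjusted(residual: float, se_fit: float) -> float:
    """Residual divided by se * sqrt(1 - h), with h = (SE Fit / se)^2 (assumed)."""
    h = (se_fit / S) ** 2
    return residual / (S * math.sqrt(1 - h))


for obs, residual, se_fit, printed in rows:
    print(
        f"Obs {obs}: residual/se = {simple_ratio(residual):.2f}, "
        f"adjusted = {leverage_adjusted(residual, se_fit):.2f}, "
        f"printed = {printed}"
    )
```

The two versions nearly agree for observations 8 and 21, whose x values are close to the mean, and differ more for observation 30, whose SE Fit is much larger.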

X denotes an observation whose X value gives it large influence. (Observations with x values far from the mean of x have a greater impact on model parameters; such observations are said to have high leverage. Removing such observations (which may not be warranted!) can produce substantially different estimates of the slope and intercept.)
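Leverage can be estimated from numbers already in the output, since h = (SE Fit / se)² for each observation. The sketch below computes it for the three listed observations and compares it with a cutoff of 3p/n, where p = 2 is the number of estimated coefficients. That cutoff is an assumed rule of thumb for when the X flag appears; it is not stated in this handout.

```python
S = 2.973      # standard error of estimate, se
N = 40         # number of observations (total df + 1)
P = 2          # estimated coefficients: intercept and slope

# SE Fit for each listed observation, from the Unusual Observations table.
se_fit = {8: 0.505, 21: 0.505, 30: 1.560}


def leverage(se_fit_value: float) -> float:
    """Leverage h = (SE Fit / se)^2."""
    return (se_fit_value / S) ** 2


CUTOFF = 3 * P / N  # assumed rule of thumb for "high leverage"

for obs, value in se_fit.items():
    h = leverage(value)
    flag = "high leverage" if h > CUTOFF else "not flagged"
    print(f"Obs {obs}: h = {h:.3f} ({flag}, cutoff {CUTOFF:.2f})")
```

Under this assumption only observation 30 (Mother H = 75.0) exceeds the cutoff, which matches the X mark in the output.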