The Simple Model, Continued

Estimating σ². Recall that if the data obey the simple linear model, then each bivariate observation (xᵢ, yᵢ) satisfies

\[
y_i = \beta_0 + \beta_1 x_i + \epsilon_i
\]

where the random error εᵢ is N(0, σ²). Note that this model has three parameters: the unknown regression coefficients β₀ and β₁ and the unknown random error variance σ². We estimate β₀ and β₁ by the least squares estimates β̂₀ and β̂₁. How do we estimate σ²? Since the random errors εᵢ have mean µ = 0,

\[
\begin{aligned}
\sigma^2 &= \int_{-\infty}^{\infty} \epsilon^2 f_\epsilon(\epsilon)\,d\epsilon - \mu^2 && (1)\\
&= \int_{-\infty}^{\infty} \epsilon^2 f_\epsilon(\epsilon)\,d\epsilon. && (2)
\end{aligned}
\]

Since the ith residual eᵢ provides an estimate of the ith random error εᵢ, i.e., eᵢ ≈ εᵢ, it turns out we can estimate the integral in equation (2) by an average of the squared residuals (by the law of large numbers, the sample average of the εᵢ² approximates their expected value, which is exactly the integral in (2)):

\[
\begin{aligned}
\hat\sigma^2 &= \frac{1}{n-2}\sum_{i=1}^{n} e_i^2 && (3)\\
&= \frac{1}{n-2}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 && (4)
\end{aligned}
\]

Thus we estimate the random error variance σ² by σ̂². (We divide by n − 2 rather than n to account for the two estimated coefficients β̂₀ and β̂₁.) This is equation 7.32 on page 544. Usually we don't compute σ̂² by hand but use software (Minitab) instead. In the Minitab regression output, the estimate σ̂ is provided and labeled "S."
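If you want to check equations (3)–(4) numerically, here is a minimal Python sketch; the data values are invented for illustration and are not from the course dataset:

```python
import numpy as np

# Hypothetical data, invented for this sketch (not the course dataset).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
n = len(x)

# Least squares estimates of beta0 and beta1 via a degree-1 polynomial fit;
# np.polyfit returns the coefficients highest degree first.
b1, b0 = np.polyfit(x, y, deg=1)

# Fitted values and residuals e_i = y_i - yhat_i.
y_hat = b0 + b1 * x
e = y - y_hat

# Equations (3)/(4): sigma2-hat = (1/(n-2)) * sum of squared residuals.
sigma2_hat = np.sum(e**2) / (n - 2)
sigma_hat = np.sqrt(sigma2_hat)  # what Minitab labels "S"

print(f"beta0-hat = {b0:.4f}, beta1-hat = {b1:.4f}")
print(f"sigma-hat = {sigma_hat:.4f}")
```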

Improved Regression Models: Polynomial Models. Clearly a good regression model will satisfy all four of the regression assumptions:

1. The errors ε₁, . . . , εₙ all have mean 0, i.e., µ_εᵢ = 0 for all i.

2. The errors ε₁, . . . , εₙ all have the same variance σ², i.e., σ²_εᵢ = σ² for all i.

3. The errors ε₁, . . . , εₙ are independent random variables.

4. The errors ε₁, . . . , εₙ are normally distributed.

Recall that we assess these assumptions by checking whether the residuals satisfy them. Of the four, by far the most important is the first. If the residuals are not centered on zero for all values of x, then our model is systematically over- and/or under-predicting y. If our model is the simple linear model, this means that y is not a linear function of x. That is,

\[
y_i = f(x_i) + \epsilon_i
\]

where f(x) ≠ β₀ + β₁x for any β₀ and β₁. In some scenarios, theoretical considerations can help us determine the nature of f(x). In other situations there is no theory to guide us. In such cases, one approach is to approximate f(x) by a higher-order polynomial:

\[
y_i = f(x_i) + \epsilon_i \approx \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_p x_i^p + \epsilon_i.
\]

The above model is called a polynomial regression model. We estimate its parameters as in the simple linear model: we estimate the regression coefficients β₀, . . . , βₚ using least squares, and we estimate σ² using (3) above, with the divisor n − 2 replaced by n − p − 1 (see item 6 of Exercise 1 below). In science and engineering problems, a quadratic model (p = 2) often suffices. Warning: it can be dangerous to use cubic and higher-order (p ≥ 3) models, as you then risk overfitting the data.
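As a concrete illustration of fitting a polynomial regression model by least squares, here is a short Python sketch; the fit_polynomial helper and the data are invented for this example:

```python
import numpy as np

def fit_polynomial(x, y, p):
    """Least squares fit of y ~ beta0 + beta1*x + ... + betap*x^p.

    Returns the coefficient estimates (beta0-hat, ..., betap-hat) and
    the residual variance estimate sum(e_i^2) / (n - p - 1).
    """
    n = len(x)
    # Design matrix with columns 1, x, x^2, ..., x^p.
    X = np.vander(x, N=p + 1, increasing=True)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta_hat  # residuals
    sigma2_hat = np.sum(e**2) / (n - p - 1)
    return beta_hat, sigma2_hat

# Invented data with visible curvature.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([5.2, 3.1, 2.2, 2.0, 2.9, 4.8, 7.9, 12.1])

beta_hat, sigma2_hat = fit_polynomial(x, y, p=2)
print("coefficient estimates (beta0, beta1, beta2):", beta_hat)
print("sigma-hat:", np.sqrt(sigma2_hat))
```

Building the design matrix of powers of x reduces the polynomial fit to the same least squares machinery used for the simple linear model; only the number of columns changes.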

Exercise 1. Predicting citympg using car weight, continued:

By analyzing the residuals we found that the (simple) linear model

\[
\text{citympg}_i = \beta_0 + \beta_1\,\text{wt}_i + \epsilon_i
\]

was inadequate because the relationship between citympg and curb weight wt was sufficiently nonlinear. We need a better model. Perhaps the quadratic polynomial regression model

\[
\text{citympg}_i = \beta_0 + \beta_1\,\text{wt}_i + \beta_2\,\text{wt}_i^2 + \epsilon_i
\]

will be adequate. Analyze the data (mpgweight.MTW on the course website) using this model as follows:

1. Name column c5 "wt2" and, using Calc -> Calculator, fill it with the squares of the values in wt.

2. Fit the quadratic model following the same procedure you used previously, but now selecting both wt and wt2 as predictor variables.

3. Create a scatterplot of the residuals, RESI1, vs. wt. Based on this residual plot, do you feel the quadratic model is adequate, i.e., do you notice any trends in the residuals?

4. Is the constant-variance assumption met?

5. Identify the least squares estimates of β₀, β₁, and β₂ and compute the residual for the first observation. Your value should agree with the first value in RESI1.

6. For polynomial models the formula for estimating the variance of the random errors is

\[
\hat\sigma^2 = \frac{1}{n-p-1}\sum_{i=1}^{n} e_i^2,
\]

which is the same as (3) except that we divide by n − p − 1 instead of n − 2, to account for the fact that we are estimating p + 1 coefficients instead of 2. What is the value of σ̂ for the quadratic model?
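If you would like to double-check your Minitab answers, the sketch below redoes steps 1, 2, 5, and 6 in Python; the wt and citympg arrays are placeholders, since the real values come from mpgweight.MTW:

```python
import numpy as np

# Placeholder values standing in for the mpgweight.MTW columns;
# substitute the real wt and citympg data.
wt = np.array([2500.0, 2800.0, 3100.0, 3400.0, 3700.0, 4000.0])
citympg = np.array([28.0, 24.0, 21.0, 19.0, 17.0, 16.0])

# Step 1: the squared-weight column (Minitab's "wt2").
wt2 = wt**2

# Step 2: fit citympg on both wt and wt2 (intercept column included).
X = np.column_stack([np.ones_like(wt), wt, wt2])
beta_hat, *_ = np.linalg.lstsq(X, citympg, rcond=None)
print("beta0-hat, beta1-hat, beta2-hat:", beta_hat)

# Step 5: the residual for the first observation (compare with RESI1).
e = citympg - X @ beta_hat
print("first residual:", e[0])

# Step 6: sigma-hat with divisor n - p - 1, where p = 2.
n, p = len(wt), 2
sigma_hat = np.sqrt(np.sum(e**2) / (n - p - 1))
print("sigma-hat:", sigma_hat)
```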
