Chapter 4: Model Adequacy Checking

In this chapter we discuss some introductory aspects of model adequacy checking, including:

• Residual analysis,
• Residual plots,
• Detection and treatment of outliers,
• The PRESS statistic,
• Testing for lack of fit.

The major assumptions that we have made in regression analysis are:

1. The relationship between the response $Y$ and the regressors is linear, at least approximately.
2. The error term $\varepsilon$ has zero mean.
3. The error term $\varepsilon$ has constant variance $\sigma^2$.
4. The errors are uncorrelated.
5. The errors are normally distributed.

Assumptions 4 and 5 together imply that the errors are independent. Recall that assumption 5 is required for hypothesis testing and interval estimation.

Residual Analysis: The residuals $e_1, e_2, \ldots, e_n$ have the following important properties:

(a) The mean of the $e_i$ is 0.

(b) The estimate of the population variance computed from the $n$ residuals is
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n}(e_i - \bar{e})^2}{n - p} = \frac{\sum_{i=1}^{n} e_i^2}{n - p} = \frac{SS_{Res}}{n - p} = MS_{Res}.$$

(c) Since the $e_i$ sum to zero, they are not independent. However, if the number of residuals ($n$) is large relative to the number of parameters ($p$), the dependency effect can be ignored in an analysis of residuals.

Standardized Residual: The quantity
$$d_i = \frac{e_i}{\sqrt{MS_{Res}}}, \qquad i = 1, 2, \ldots, n,$$
is called the standardized residual. The standardized residuals have mean zero and approximately unit variance. A large standardized residual ($|d_i| > 3$) potentially indicates an outlier.

Recall that, since $(I - H)X = 0$,
$$e = (I - H)Y = (I - H)(X\beta + \varepsilon) = (I - H)\varepsilon.$$
Therefore, because $I - H$ is symmetric and idempotent,
$$\operatorname{Var}(e) = \operatorname{Var}[(I - H)\varepsilon] = (I - H)\operatorname{Var}(\varepsilon)(I - H)' = \sigma^2(I - H),$$
so that $\operatorname{Var}(e_i) = \sigma^2(1 - h_{ii})$.

Studentized Residual: The quantity
$$t_i = \frac{e_i}{\sqrt{MS_{Res}(1 - h_{ii})}} = \frac{e_i}{\sqrt{S^2(1 - h_{ii})}}, \qquad i = 1, 2, \ldots, n,$$
is called the studentized residual. The studentized residuals have approximately a Student's $t$ distribution with $n - p$ degrees of freedom.

PRESS Residuals: If we delete the $i$th observation, fit the regression model to the remaining $n - 1$ observations, and calculate the predicted value $\hat{y}_{(-i)}$ corresponding to the deleted observation $y_i$, the corresponding prediction error is
$$e_{(-i)} = y_i - \hat{y}_{(-i)}.$$
These prediction errors are usually called PRESS residuals or deleted residuals. Generally, a large difference between the ordinary residual and the PRESS residual indicates a point where the model fits the data well, but a model built without that point predicts poorly. It can be shown that
$$e_{(-i)} = \frac{e_i}{1 - h_{ii}}.$$
Therefore,
$$\operatorname{Var}(e_{(-i)}) = \operatorname{Var}\!\left(\frac{e_i}{1 - h_{ii}}\right) = \frac{1}{(1 - h_{ii})^2}\operatorname{Var}(e_i) = \frac{\sigma^2(1 - h_{ii})}{(1 - h_{ii})^2} = \frac{\sigma^2}{1 - h_{ii}}.$$
Note that a standardized PRESS residual is
$$\frac{e_{(-i)}}{\sqrt{\operatorname{Var}(e_{(-i)})}} = \frac{e_i/(1 - h_{ii})}{\sqrt{\sigma^2/(1 - h_{ii})}} = \frac{e_i}{\sqrt{\sigma^2(1 - h_{ii})}},$$
which, if we use $MS_{Res}$ to estimate $\sigma^2$, is just the studentized residual.

R-student Residual: The quantity
$$r_i = \frac{e_i}{\sqrt{S_{(-i)}^2(1 - h_{ii})}}, \qquad i = 1, 2, \ldots, n,$$
is called the R-student residual or jackknife residual, where $S_{(-i)}^2$ is the residual variance computed with the $i$th observation removed. It can be shown that
$$S_{(-i)}^2 = \frac{(n - p)\,MS_{Res} - \dfrac{e_i^2}{1 - h_{ii}}}{n - p - 1}.$$
If the usual assumptions in regression analysis are met, the jackknife residual follows exactly a $t$ distribution with $n - p - 1$ degrees of freedom.
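All of these quantities are straightforward to compute directly from the design matrix. The following is a minimal sketch in Python with NumPy; the function name residual_diagnostics and its interface are illustrative, not part of the chapter.

```python
import numpy as np

def residual_diagnostics(X, y):
    """Compute the residual diagnostics defined above for the linear
    model y = X beta + eps, where X includes the intercept column.
    A sketch only; the interface is illustrative."""
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T        # hat matrix H = X(X'X)^{-1}X'
    h = np.diag(H)                              # leverages h_ii
    e = y - H @ y                               # ordinary residuals e = (I - H)y
    ms_res = e @ e / (n - p)                    # MS_Res = SS_Res / (n - p)
    d = e / np.sqrt(ms_res)                     # standardized residuals
    t = e / np.sqrt(ms_res * (1 - h))           # studentized residuals
    e_del = e / (1 - h)                         # PRESS (deleted) residuals
    # S^2_(-i): residual variance with observation i removed
    s2_del = ((n - p) * ms_res - e**2 / (1 - h)) / (n - p - 1)
    r = e / np.sqrt(s2_del * (1 - h))           # R-student (jackknife) residuals
    return e, d, t, e_del, r
```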
Example 1: Consider the following data:

 y   x1   x2
16    7    5
11    3    4
12    3    6
14    4    1
10    5    2

$$y = \begin{bmatrix} 16 \\ 11 \\ 12 \\ 14 \\ 10 \end{bmatrix}, \qquad X = \begin{bmatrix} 1 & 7 & 5 \\ 1 & 3 & 4 \\ 1 & 3 & 6 \\ 1 & 4 & 1 \\ 1 & 5 & 2 \end{bmatrix} \;\Rightarrow\; X'X = \begin{bmatrix} 5 & 22 & 18 \\ 22 & 108 & 79 \\ 18 & 79 & 82 \end{bmatrix},$$

$$(X'X)^{-1} = \begin{bmatrix} 2.7155 & -0.3967 & -0.2139 \\ -0.3967 & 0.0893 & 0.0010 \\ -0.2139 & 0.0010 & 0.0582 \end{bmatrix}.$$

The hat matrix is
$$H = X(X'X)^{-1}X' = \begin{bmatrix} 0.9252 & -0.0935 & 0.0748 & -0.1121 & 0.2056 \\ -0.0935 & 0.3832 & 0.4268 & 0.1931 & 0.0903 \\ 0.0748 & 0.4268 & 0.7030 & -0.1101 & -0.0945 \\ -0.1121 & 0.1931 & -0.1101 & 0.6096 & 0.4195 \\ 0.2056 & 0.0903 & -0.0945 & 0.4195 & 0.3790 \end{bmatrix}$$

$$\Rightarrow\; h_{11} = 0.9252,\; h_{22} = 0.3832,\; h_{33} = 0.7030,\; h_{44} = 0.6096,\; h_{55} = 0.3790.$$

The residuals are
$$e = (I - H)y = \begin{bmatrix} 0.0748 & 0.0935 & -0.0748 & 0.1121 & -0.2056 \\ 0.0935 & 0.6168 & -0.4268 & -0.1931 & -0.0903 \\ -0.0748 & -0.4268 & 0.2970 & 0.1101 & 0.0945 \\ 0.1121 & -0.1931 & 0.1101 & 0.3904 & -0.4195 \\ -0.2056 & -0.0903 & 0.0945 & -0.4195 & 0.6210 \end{bmatrix} \begin{bmatrix} 16 \\ 11 \\ 12 \\ 14 \\ 10 \end{bmatrix} = \begin{bmatrix} 0.84 \\ -0.45 \\ 0.16 \\ 2.26 \\ -2.81 \end{bmatrix}$$

and
$$MS_{Res} = \frac{e'e}{n - p} = \frac{13.9374}{2} = 6.97.$$

Standardized residuals:
$$\begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \\ d_5 \end{bmatrix} = \frac{1}{\sqrt{MS_{Res}}}\, e = \frac{1}{\sqrt{6.97}} \begin{bmatrix} 0.84 \\ -0.45 \\ 0.16 \\ 2.26 \\ -2.81 \end{bmatrix} = \begin{bmatrix} 0.32 \\ -0.17 \\ 0.06 \\ 0.86 \\ -1.06 \end{bmatrix}.$$

Studentized residuals, $t_i = e_i/\sqrt{MS_{Res}(1 - h_{ii})}$:
$$\begin{bmatrix} t_1 \\ t_2 \\ t_3 \\ t_4 \\ t_5 \end{bmatrix} = \begin{bmatrix} 0.84/\sqrt{6.97(1 - 0.9252)} \\ -0.45/\sqrt{6.97(1 - 0.3832)} \\ 0.16/\sqrt{6.97(1 - 0.7030)} \\ 2.26/\sqrt{6.97(1 - 0.6096)} \\ -2.81/\sqrt{6.97(1 - 0.3790)} \end{bmatrix} = \begin{bmatrix} 1.16 \\ -0.22 \\ 0.11 \\ 1.37 \\ -1.35 \end{bmatrix}.$$

Deleted residual variances, $S_{(-i)}^2 = \left[(n - p)MS_{Res} - e_i^2/(1 - h_{ii})\right]/(n - p - 1)$:

$$S_{(-1)}^2 = \frac{(5 - 3)6.97 - \dfrac{0.84^2}{1 - 0.9252}}{5 - 3 - 1} = 4.5, \qquad S_{(-2)}^2 = \frac{(5 - 3)6.97 - \dfrac{(-0.45)^2}{1 - 0.3832}}{5 - 3 - 1} = 13.6,$$

$$S_{(-3)}^2 = \frac{(5 - 3)6.97 - \dfrac{0.16^2}{1 - 0.7030}}{5 - 3 - 1} = 13.9, \qquad S_{(-4)}^2 = \frac{(5 - 3)6.97 - \dfrac{2.26^2}{1 - 0.6096}}{5 - 3 - 1} = 0.86,$$

$$S_{(-5)}^2 = \frac{(5 - 3)6.97 - \dfrac{(-2.81)^2}{1 - 0.3790}}{5 - 3 - 1} = 1.22.$$

R-student residuals, $r_{(-i)} = e_i/\sqrt{S_{(-i)}^2(1 - h_{ii})}$:
$$\begin{bmatrix} r_{(-1)} \\ r_{(-2)} \\ r_{(-3)} \\ r_{(-4)} \\ r_{(-5)} \end{bmatrix} = \begin{bmatrix} 0.84/\sqrt{4.5(1 - 0.9252)} \\ -0.45/\sqrt{13.6(1 - 0.3832)} \\ 0.16/\sqrt{13.9(1 - 0.7030)} \\ 2.26/\sqrt{0.86(1 - 0.6096)} \\ -2.81/\sqrt{1.22(1 - 0.3790)} \end{bmatrix} = \begin{bmatrix} 1.45 \\ -0.15 \\ 0.08 \\ 3.90 \\ -3.23 \end{bmatrix}.$$

SAS Output: Residuals, Studentized Residuals and R-student Residuals

Obs   Residual    Student    Rstudent
 1     0.84112    1.16423     1.45010
 2    -0.44860   -0.21618    -0.15468
 3     0.15888    0.11034     0.07826
 4     2.26168    1.36988     3.89917
 5    -2.81308   -1.35107    -3.23320

[Figure: scatter plot of x2 versus x1.]
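Assuming NumPy is available and reusing the illustrative residual_diagnostics function sketched earlier, the hand calculations of Example 1 (and the SAS output) can be reproduced as follows; the expected values in the comments agree with the SAS output up to rounding.

```python
import numpy as np

X = np.array([[1, 7, 5],
              [1, 3, 4],
              [1, 3, 6],
              [1, 4, 1],
              [1, 5, 2]], dtype=float)
y = np.array([16, 11, 12, 14, 10], dtype=float)

e, d, t, e_del, r = residual_diagnostics(X, y)
print(np.round(e, 5))   # ~ [ 0.84112 -0.4486   0.15888  2.26168 -2.81308]
print(np.round(t, 5))   # ~ [ 1.16423 -0.21618  0.11034  1.36988 -1.35107]
print(np.round(r, 5))   # ~ [ 1.4501  -0.15468  0.07826  3.89917 -3.2332 ]
```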
Graphical Analysis of Residuals:

(a) Normal probability plot: If the normality assumption is not badly violated, the conclusions reached by a regression analysis in which normality is assumed will generally be reliable and accurate. A very simple method of checking the normality assumption is to construct a normal probability plot of the residuals. Let $e_{(1)}, e_{(2)}, \ldots, e_{(n)}$ be the residuals ranked in increasing order. Note that
$$E(e_{(i)}) \approx \Phi^{-1}\!\left(\frac{i - \tfrac{1}{2}}{n}\right),$$
where $\Phi$ denotes the standard normal cumulative distribution function. Normal probability plots are constructed by plotting the ranked residuals $e_{(i)}$ against the expected normal values $\Phi^{-1}\!\left((i - \tfrac{1}{2})/n\right)$. The resulting points should lie approximately on a straight line; substantial departures from a straight line indicate that the distribution is not normal. If normality is deemed unsatisfactory, the $Y$ values may be transformed (using a log, square root, etc.) to see whether the new set of observations is approximately normal.

(b) Plot of Residuals versus the Fitted Values: A plot of the residuals $e_i$ (or the scaled residuals $d_i$, $t_i$, or $r_i$) versus the corresponding fitted values $\hat{y}_i$ is useful for detecting several common types of model inadequacies. If this plot can be contained in a horizontal band, there are no obvious model defects. An outward-opening funnel pattern implies that the variance of $\varepsilon$ is an increasing function of $Y$, while an inward-opening funnel indicates that the variance of $\varepsilon$ decreases as $Y$ increases. A double-bow pattern often occurs when $Y$ is a proportion between zero and one. The usual approach for dealing with inequality of variance is to apply a suitable transformation to either the regressor or the response variable. A curved plot indicates nonlinearity; this could mean that other regressor variables are needed in the model (for example, a squared term may be necessary). Transformations on the regressor and/or the response variable may be helpful in these cases. A plot of residuals versus the predicted values may also reveal one or more unusually large residuals.
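As a sketch of how these two diagnostic plots might be drawn (assuming Matplotlib and SciPy are available, and continuing with the Example 1 arrays X and y defined above; all variable names are illustrative):

```python
import numpy as np
import scipy.stats as st
import matplotlib.pyplot as plt

# Continuing with the Example 1 arrays X and y defined above.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta                 # fitted values
e = y - y_hat                    # ordinary residuals

# (a) Normal probability plot: ranked residuals vs. expected normal quantiles.
n = len(e)
q = st.norm.ppf((np.arange(1, n + 1) - 0.5) / n)   # Phi^{-1}((i - 1/2)/n)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(q, np.sort(e))
ax1.set_xlabel("Expected normal quantile")
ax1.set_ylabel("Ranked residual e_(i)")
ax1.set_title("Normal probability plot")

# (b) Residuals versus fitted values: look for funnels, double bows, or curvature.
ax2.scatter(y_hat, e)
ax2.axhline(0.0, linestyle="--")
ax2.set_xlabel("Fitted value")
ax2.set_ylabel("Residual")
ax2.set_title("Residuals vs. fitted values")
plt.show()
```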
