Regression Review - Part II Weighted

Statistics 149

Spring 2006

Copyright °c 2006 by Mark E. Irwin Diagnostic Example

Body Fat Prediction

20 healthy females 25-34 years old were studied to come up with a predictive model for body fat based on simple measurements as the gold standard measurement is time consuming and expensive.

Response variable: Bodyfat - determined by body immersion in water

Predictor variables:

• Tricep - Triceps Skinfold Thickness

• Thigh - Thigh Circumference

• Midarm - Midarm Circumference

In the following figure, the plotting symbol is the observation number.

Diagnostic Example 1 12 7 12 7 712

8 1811 8 11 18 18 118

16 16 16 2 17 2 17 17 2 6 6 6 9 20 9 20 9 20 4 4 4 10 10 10 3 3 3 Bodyfat Bodyfat Bodyfat 14 14 14

15 2019 25 15 2019 25 15 2019 25

15 5 515 15 5 131 1 13 13 1

15 20 25 30 45 50 55 25 30 35

Tricep Thigh Midarm

18 7 3 3

1211 17 164 6 10 4 4 8 3 5 8 5 8 16 16 20 11 11 1 1 9 Thigh 2 14 14

Midarm 2 12 Midarm 2 12 20 7 20 7 19 19 19

13 17 17

25 30 35 10 18 25 30 35 10 18

45 50 55 6 6 14 13 9 13 9 1 15 5 15 15

15 20 25 30 15 20 25 30 45 50 55

Tricep Tricep Thigh

Diagnostic Example 2 > cor(bodyfat) Tricep Thigh Midarm Bodyfat Tricep 1.0000 0.92384 0.45778 0.8433 Thigh 0.9238 1.00000 0.08467 0.8781 Midarm 0.4578 0.08467 1.00000 0.1424 Bodyfat 0.8433 0.87809 0.14244 1.0000 Lets fit the the model with Tricep and Thigh used to predict Bodyfat. > bodyfat2.lm <- lm(Bodyfat ~ Tricep + Thigh, data=bodyfat) > > summary(bodyfat2.lm)

Call: lm(formula = Bodyfat ~ Tricep + Thigh, data = bodyfat)

Residuals: Min 1Q Median 3Q Max -3.9469 -1.8807 0.1678 1.3367 4.0147

Diagnostic Example 3 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -19.1742 8.3606 -2.293 0.0348 * Tricep 0.2224 0.3034 0.733 0.4737 Thigh 0.6594 0.2912 2.265 0.0369 * --- Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

Residual standard error: 2.543 on 17 degrees of freedom Multiple R-Squared: 0.7781, Adjusted R-squared: 0.7519 F-statistic: 29.8 on 2 and 17 DF, p-value: 2.774e-06

Diagnostic Example 4 8 8 14 2 14 2 9 9 12 12

20 20 16 7 16 7 15 11 15 11 5 5 6 6 17 18 17 18

Raw Residuals 1 Raw Residuals 1 10 10 19 19 4 3 3 4 13 13 −4 −2 0 2 4 −4 −2 0 2 4

15 20 25 30 45 50 55

Tricep Tricep

8 8 14 2 14 2 9 9 12 12

20 20 15 16 7 15 16 7 11 11 5 5 6 6 17 18 17 18 1 1 10 10 19 19 4 4 Externally Studentized Residuals Externally Studentized Residuals −1.5 −0.5 0.5 1.5 3 −1.5 −0.5 0.5 1.5 3 13 13

15 20 25 30 45 50 55

Tricep Tricep

Diagnostic Example 5 DFBETAS(Thigh) DFBETAS(Tricep) DFBETAS(Intercept) 0.0 0.5 1.0 −1.0 −0.5 0.0 0.5 −0.8 −0.6 −0.4 −0.2 0.0 0.2 0.4

5 10 15 20 5 10 15 20 5 10 15 20

Observation No. Observation No. Observation No. DFFITS Cook’s D Leverages −1.0 −0.5 0.0 0.5 0.0 0.1 0.2 0.3 0.4 0.5 0.05 0.10 0.15 0.20 0.25 0.30 0.35

5 10 15 20 5 10 15 20 5 10 15 20

Observation No. Observation No. Observation No.

Diagnostic Example 6 >Influence measures of lm(formula = Bodyfat ~ Tricep + Thigh, data = bodyfat) :

dfb.1_ dfb.Trcp dfb.Thgh dffit cov.r cook.d hat inf 1 -3.05e-01 -1.31e-01 2.32e-01 -3.66e-01 1.361 4.60e-02 0.2010 2 1.73e-01 1.15e-01 -1.43e-01 3.84e-01 0.844 4.55e-02 0.0589 3 -8.47e-01 -1.18e+00 1.07e+00 -1.27e+00 1.189 4.90e-01 0.3719 * 4 -1.02e-01 -2.94e-01 1.96e-01 -4.76e-01 0.977 7.22e-02 0.1109 5 -6.37e-05 -3.05e-05 5.02e-05 -7.29e-05 1.595 1.88e-09 0.2480 * 6 3.97e-02 4.01e-02 -4.43e-02 -5.67e-02 1.371 1.14e-03 0.1286 7 -7.75e-02 -1.56e-02 5.43e-02 1.28e-01 1.397 5.76e-03 0.1555 8 2.61e-01 3.91e-01 -3.32e-01 5.75e-01 0.780 9.79e-02 0.0963 9 -1.51e-01 -2.95e-01 2.47e-01 4.02e-01 1.081 5.31e-02 0.1146 10 2.38e-01 2.45e-01 -2.69e-01 -3.64e-01 1.110 4.40e-02 0.1102 11 -9.02e-03 1.71e-02 -2.48e-03 5.05e-02 1