F-Test for Nested Models

F-test in Multiple Regression
David Gerard
2018-12-07

Learning Objective

  • Test for including multiple variables at the same time.
  • Section 10.3 in the book

Case Study

  • Kentucky Derby
  • Speed vs. Year and Year^2.

    library(Sleuth3)
    data(ex0920)
    head(ex0920)
    ##   Year        Winner Starters NetToWinner  Time Speed Track Conditions
    ## 1 1896     Ben Brush        8        4850 127.8 35.23 Dusty       Fast
    ## 2 1897    Typhoon II        6        4850 132.5 33.96 Heavy       Slow
    ## 3 1898       Plaudit        4        4850 129.0 34.88  Good       Fast
    ## 4 1899        Manuel        5        4850 132.0 34.09  Fast       Fast
    ## 5 1900 Lieut. Gibson        7        4850 126.2 35.64  Fast       Fast
    ## 6 1901  His Eminence        5        4850 127.8 35.23  Fast       Fast

Year vs Speed

    library(ggplot2)  # provides qplot() and geom_smooth()
    qplot(Year, Speed, data = ex0920) + geom_smooth(se = FALSE)

[Figure: scatterplot of Speed against Year with a smoothed trend line]

Goal

  • Get a p-value for the association between Year and Speed.
  • It is clear from the plot that a quadratic model would be better than a linear model.
  • $\mu(\text{Speed} \mid \text{Year}) = \beta_0 + \beta_1 \text{Year} + \beta_2 \text{Year}^2$
  • So to see if Year is important, we need to test:
  • $H_0: \beta_1 = \beta_2 = 0$
  • $H_A$: either $\beta_1 \neq 0$ or $\beta_2 \neq 0$

Full and Reduced Models

  • Full Model: $\mu(\text{Speed} \mid \text{Year}) = \beta_0 + \beta_1 \text{Year} + \beta_2 \text{Year}^2$
  • Reduced Model: $\mu(\text{Speed} \mid \text{Year}) = \beta_0$
  • Use the F-test strategy to run this hypothesis test.
    1. Fit both full and reduced models.
    2. Calculate the sum of squared residuals under both models and the corresponding degrees of freedom.
    3. Calculate the F-statistic.
    4. Compare to the theoretical F-distribution under $H_0$.

Fit Under Full

[Figure: Speed vs. Year, fit under the full model]

Residuals under Full

[Figure: Speed vs. Year, residuals under the full model]

Fit under Reduced

[Figure: Speed vs. Year, fit under the reduced model]

Residuals under Reduced

[Figure: Speed vs. Year, residuals under the reduced model]

In R

  • First, fit both models.

    # Squared term for the quadratic (full) model
    ex0920$Year2 <- ex0920$Year ^ 2
    # Full model: intercept, Year, and Year^2
    lmfull <- lm(Speed ~ Year + Year2, data = ex0920)
    # Reduced model: intercept only
    lmreduced <- lm(Speed ~ 1, data = ex0920)

  • Then use anova() with the reduced model as the first argument.
    anova(lmreduced, lmfull)
    ## Analysis of Variance Table
    ##
    ## Model 1: Speed ~ 1
    ## Model 2: Speed ~ Year + Year2
    ##   Res.Df  RSS Df Sum of Sq   F Pr(>F)
    ## 1    115 93.0
    ## 2    113 33.1  2      59.9 102 <2e-16

What is that Table?

    ## Analysis of Variance Table
    ##
    ## Model 1: Speed ~ 1
    ## Model 2: Speed ~ Year + Year2
    ##   Res.Df  RSS Df Sum of Sq   F Pr(>F)
    ## 1    115 93.0
    ## 2    113 33.1  2      59.9 102 <2e-16

Each column of the output corresponds to the following quantity:

    Res.Df       RSS           Df         Sum of Sq   F        Pr(>F)
    df_reduced   RSS_reduced
    df_full      RSS_full      df_extra   ESS         F-stat   p-value

F-test

  • We can use the F-test for any two nested models.
  • Nested: the reduced model is a special case of the full model, obtained by placing constraints on some of the full model's parameters.

Another Example

  • $\mu(\text{Speed} \mid \text{Year}, \text{Starters}) = \beta_0 + \beta_1 \text{Year} + \beta_2 \text{Year}^2 + \beta_3 \text{Starters} + \beta_4 \text{Starters}^2$
  • $H_0: \beta_3 = \beta_4 = 0$
  • $H_A$: either $\beta_3 \neq 0$ or $\beta_4 \neq 0$
  • Full Model: $\mu(\text{Speed} \mid \text{Year}, \text{Starters}) = \beta_0 + \beta_1 \text{Year} + \beta_2 \text{Year}^2 + \beta_3 \text{Starters} + \beta_4 \text{Starters}^2$
  • Reduced Model: $\mu(\text{Speed} \mid \text{Year}, \text{Starters}) = \beta_0 + \beta_1 \text{Year} + \beta_2 \text{Year}^2$

Another Example (continued)

    # Squared term for the number of starters
    ex0920$Starters2 <- ex0920$Starters ^ 2
    # Full model adds Starters and Starters^2 to the quadratic-in-Year model
    lmfull <- lm(Speed ~ Year + Year2 + Starters + Starters2, data = ex0920)
    lmreduced <- lm(Speed ~ Year + Year2, data = ex0920)
    anova(lmreduced, lmfull)
    ## Analysis of Variance Table
    ##
    ## Model 1: Speed ~ Year + Year2
    ## Model 2: Speed ~ Year + Year2 + Starters + Starters2
    ##   Res.Df  RSS Df Sum of Sq    F Pr(>F)
    ## 1    113 33.1
    ## 2    111 30.9  2      2.18 3.92  0.023

Example of a non-nested model

  • Model 1: $\mu(\text{Speed} \mid \text{Year}, \text{Starters}) = \beta_0 + \beta_1 \text{Year} + \beta_2 \text{Year}^2$
  • Model 2: $\mu(\text{Speed} \mid \text{Year}, \text{Starters}) = \beta_0 + \beta_1 \text{Starters} + \beta_2 \text{Starters}^2$
  • Cannot use an F-test to compare these two models.
  • Why? The mathematical theory guarantees that the test statistic follows an F-distribution under $H_0$ only when the models are nested.
  • When models are not nested, use the adjusted $R^2$, $C_p$, AIC, or BIC methods from Section 12.4 (more on this later).
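
Checking the F-statistic by hand

The slides list "calculate the F-statistic" as a step without writing the formula out. As a rough check on the two ANOVA tables above, the sketch below uses the standard extra-sum-of-squares form of the statistic, taking ESS (the Sum of Sq column) to be $\mathrm{RSS}_{\text{reduced}} - \mathrm{RSS}_{\text{full}}$. The subscripted symbols are notation assumed here to match the column legend, and the inputs are the rounded values printed by anova(), so the results agree with the tables only approximately.

```latex
% Extra-sum-of-squares F-statistic for nested models
\[
F = \frac{\mathrm{ESS} / \mathrm{df}_{\text{extra}}}
         {\mathrm{RSS}_{\text{full}} / \mathrm{df}_{\text{full}}},
\qquad
\mathrm{ESS} = \mathrm{RSS}_{\text{reduced}} - \mathrm{RSS}_{\text{full}},
\qquad
\mathrm{df}_{\text{extra}} = \mathrm{df}_{\text{reduced}} - \mathrm{df}_{\text{full}}
\]

% First table: Speed ~ 1 versus Speed ~ Year + Year2
\[
F = \frac{(93.0 - 33.1) / (115 - 113)}{33.1 / 113}
  = \frac{59.9 / 2}{33.1 / 113}
  \approx \frac{29.95}{0.2929}
  \approx 102
\]

% Second table: Speed ~ Year + Year2 versus the model that adds Starters and Starters2
\[
F = \frac{2.18 / (113 - 111)}{30.9 / 111}
  = \frac{1.09}{30.9 / 111}
  \approx \frac{1.09}{0.2784}
  \approx 3.92
\]
```

Under $H_0$ the statistic is compared to an F-distribution with $\mathrm{df}_{\text{extra}}$ and $\mathrm{df}_{\text{full}}$ degrees of freedom, which is where the Pr(>F) column comes from. The second calculation uses the printed Sum of Sq (2.18) rather than the difference of the rounded RSS values (33.1 − 30.9 = 2.2), which would give a slightly different answer.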
