Submodels (Nested Models) and Analysis of Variance Of

1 2 SUBMODELS (NESTED MODELS) Examples: AND ANALYSIS OF VARIANCE a. NH: !1 = 1 OF REGRESSION MODELS AH: !1 ! 1 Data: (x1, y1), (x2, y2), … , (xn, yn) What submodel does NH correspond to? Assumptions: • Linear mean function • Constant conditional variance How many parameters does it have? • Independence of observations • Normality of conditional distributions AH corresponds to the full model (with three parameters, including ! ). Our model has 3 parameters: 1 E(Y|x) = ! + ! x (Two parameters: ! and ! ) 0 1 0 1 b. NH: ! = 0 Var(Y|x) = "2 (One parameter: ") 1 AH: ! ! 0 1 We will call this the full model. AH <-> full model. Many hypothesis tests on coefficients can be NH <-> ? reformulated as test of the full model against a submodel : a special case of the full model obtained by specifying certain of the parameters or certain Number of parameters? relationships between parameters. 3 4 c. NH: !0 = 0 As with the full model, we can "fit" a submodel using AH: !0 ! 0 least squares: AH <-> full model. Example 1: Submodel: NH <->? E(Y|x) = 2 + !1x Var(Y|x) = "2, Number of parameters? Consider lines y = 2 + h1x We can go the other way: [picture!] Examples: For the submodel given, what is the null hypothesis of the corresponding hypothesis test? Minimize 2 2 RSS(h1) = # di = # [yi - (2 + h1xi)] d. E(Y|x) = 2 + !1x 2 Var(Y|x) = " to get "ˆ1 . Note: For this example, yi - (2 + h1xI) = (yi - 2) - h1xi, so fitting this model is equivalent to fitting the model ! e. E(Y|x) = !0 + !0x E(Y|x) = !1x Var(Y|x) = "2 Var(Y|x) = "2 to the transformed data (x1, y1 - 2), (x2, y2 - 2), … , (xn, yn - 2) 5 6 Exercise: Try finding the least squares fit for the Example 2: Submodel submodel E(Y|x) = !0 E(Y|x) = !1x 2 2 Var(Y|x) = " Var(Y|x) = " ("Regression through the origin") Minimize RSS(h ) = d 2 = (y - h )2 0 # i # i 0 You should get a different formula for "ˆ1 from that obtained by setting "ˆ 0 = 0 in the formula for the least [picture!] squares fit for the full model. Details: Result: h0 = y -- the same as the univariate estimate. ˆ Note: This is also the same as setting "1 = 0 in the least squares fit for the full model. Caution: The result is not always this nice, as the exercise below shows. ! ! ! ! 7 8 Generalizing: If we fit a submodel by Least Squares, General Properties: (Stated without proof; true for we can define the residual sum of squares for the multiple regression as well as simple regression) submodel: 2 i. RSSsub is a multiple of a $ distribution, with ˆ 2 eˆ 2 RSSsub = #(yi - y i,sub ) = # i,sub ii. degrees of freedom dfsub = n - (# of coefficients estimated), and where ! ! 2 RSSsub 2 iii. "ˆ sub = is an estimate of " for the ˆ ˆ dfsub y i,sub = E sub(Y|x) submodel. is the fitted value for the submodel and This will allow us to do hypothesis tests comparing a eˆ yˆ ! i,sub = yi - i,sub submodel with the full model. Example: For the submodel ! ! E(Y|x) = !0 Var(Y|x) = "2, yˆ i,sub = y for each i, so 2 RSSsub = #(yi - y ) = SYY = (n-1) sY, ! where sY is the sample standard deviation of Y. ! ! ! ! ! 9 10 This hypothesis test uses the t-statistic Another Perspective: SXY "ˆ SXX We want a way to help decide whether the full model 1 t = s.e.("ˆ ) = "ˆ ~ t(n-2), is significantly better than the full model. 1 SXX ˆ RSSsub - RSSfull can be considered a measure of how where here "ˆ = " full is the estimate of " for the full much better the full model is than the submodel. model. (Why is this difference always " 0?). Note that 2 But RSSsub - RSSfull depends on the scale of the data, (SXY ) 2 (SXX) 2 (SXY) RSS " RSS 2 2 2 sub full t = "ˆ = "ˆ (SXX) so RSS is a better measure. SXX full Recall: Example (to help build intuition): The submodel (SXY )2 ! RSSfull = SYY - E(Y|x) = ! SXX 0 Var(Y|x) = "2 RSS = SYY (in this particular example) sub Testing this model against the full model is Thus equivalent to performing a hypothesis test with (SXY )2 RSSsub - RSSfull = . NH: ! = 0 SXX 1 so AH: ! ! 0. 1 ! ! ! ! ! ! ! ! 11 12 RSSsub " RSSfull F Distributions 2 t = #ˆ 2 Recall: A t(k) random variable has the distribution of RSSsub " RSS full a random variable of the form = RSS (n " 2) full where RSS " RSS sub full = ÷ (n-2), RSS full Thus ! 2 which is just a constant times our earlier measure t(k) ~ RSSsub " RSS full Also, ! 2 of how much better the full model is Z ~ RSS full than the submodel. Definition: An F-distribution F(!1, !2) with !1 degrees of freedom in the numerator and !2 degrees ! of freedom in the denominator is the distribution of a random variable of the form W " 1 2 U " where W ~ $ (!1) 2 2 U ~ $ (!2) and U and W are independent. ! ! 13 14 Thus: 2 2 t(k) ~ F(1, k); Still another look at the F-statistic t : i.e., the square of a t(k) random variable is an F(1,k) RSSsub " RSS full random variable – so any two-sided t-test could also F = RSS (n " 2) be formulated as an F-test. full RSS " RSS df " df In particular, the hypothesis test with hypotheses ( sub full ) ( sub full ) = RSS df , ! full full NH: !1 = 0 since dfsub – dffull = (n – 1) – (n – 2) = 1. AH: !1 ! 0 could be done using the F-statistic t2 instead of the t- i.e., F is the ratio of: statistic . Numerator: Recall that in this example, the difference between the residual sum of squares for the submodel and the RSS for the full model RSSsub " RSS full 2 t = RSS ÷ (n-2), full Denominator: the residual sum of squares for the full model which we have seen does make sense as a measure of whether the full model (corresponding to AH) is ! But: better than the submodel (corresponding to NH). with each divided by its degrees of freedom to "weight" them appropriately to get a tractable Example: Forbes data. distribution. ! 15 16 RSSsub " RSS full Rewriting the F-statistic, This is also just a constant times , RSS full RSS RSS df df which is a reasonable measure of how much better ( sub " full ) ( sub " full ) the full model is than the submodel in fitting the data. RSSfull df full ! This generalizes: Whenever we have a submodel (in # &# & RSSsub " RSS full df full multiple linear regression as well as simple linear % (% ( = % RSS (% df " df ( regression), $ full '$ fsub full ' 2 2 RSSsub " RSS full a. RSSsub (hence "ˆ sub) will be a constant times a $ is just a constant multiple of , which RSS full distribution, with degrees of freedom dfsub, which we ! is a reasonable measure of how much better the full then also refer to as the degrees of freedom of RSSsub 2 model is than the submodel in fitting the data. and of "ˆ sub. ! Thus we can use this F statistic for the hypothesis test (RSSsub " RSSfull ) (dfsub " df full ) b. 2 #ˆ full NH: Submodel AH: Full model (RSSsub " RSSfull ) (dfsub " df full ) = RSSfull df full ~ F(dfsub - dffull, dffull). ! ! ! ! ! .

Load more