Midterm Review

Formal Statement of Simple Linear Regression Model Yi = β0 + β1Xi + i th I Yi value of the response variable in the i trial I β0 and β1 are parameters I Xi is a known constant, the value of the predictor variable in the i th trial I i is a random error term with mean E(i ) = 0 and variance 2 Var(i ) = σ I i = 1;:::; n Least Squares Linear Regression I Seek to minimize n X 2 Q = (Yi − (β0 + β1Xi )) i=1 I Choose b0 and b1 as estimators for β0 and β1. b0 and b1 will minimize the criterion Q for the given sample observations (X1; Y1); (X2; Y2); ··· ; (Xn; Yn). Normal Equations I The result of this maximization step are called the normal equations. b0 and b1 are called point estimators of β0 and β1 respectively. X X Yi = nb0 + b1 Xi X X X 2 Xi Yi = b0 Xi + b1 Xi I This is a system of two equations and two unknowns. The solution is given by::: Solution to Normal Equations After a lot of algebra one arrives at P(X − X¯ )(Y − Y¯ ) b = i i 1 P 2 (Xi − X¯ ) b0 = Y¯ − b1X¯ P X X¯ = i n P Y Y¯ = i n Properties of Solution th I The i residual is defined to be ei = Yi − Yî P I i ei = 0 P ^ P I i Yi = i Yi P I i Xi ei = 0 P ^ I i Yi ei = 0 I The regression line always goes through the point X¯ ; Y¯ Alternative format of linear regression model: ∗ ¯ Yi = β0 + β1(Xi − X ) + i The least squares estimator b1 for β1 remains the same as before. ∗ ¯ The least squares estimator for β0 = β0 + β1X becomes ∗ ¯ ¯ ¯ ¯ ¯ b0 = b0 + b1X = (Y − b1X ) + b1X = Y Hence the estimated regression function is Y^ = Y¯ + b1(X − X¯ ) s2 estimator for σ2 SSE P(Y − Y^ )2 P e2 s2 = MSE = = i i = i n − 2 n − 2 n − 2 2 I MSE is an unbiased estimator of σ 2 E(MSE) = σ I The sum of squares SSE has n − 2 \degrees of freedom" associated with it. I Cochran's theorem (later in the course) tells us where degree's of freedom come from and how to calculate them. Normal Error Regression Model Yi = β0 + β1Xi + i th I Yi value of the response variable in the i trial I β0 and β1 are parameters I Xi is a known constant, the value of the predictor variable in the i th trial 2 I i ∼iid N(0; σ ) note this is different, now we know the distribution I i = 1;:::; n Inference concerning β1 Tests concerning β1 (the slope) are often of interest, particularly H0 : β1 = 0 Ha : β1 6= 0 the null hypothesis model Yi = β0 + (0)Xi + i implies that there is no relationship between Y and X. Note the means of all the Yi 's are equal at all levels of Xi . Sampling Dist. Of b1 I The point estimator for b1 is P(X − X¯ )(Y − Y¯ ) b = i i 1 P 2 (Xi − X¯ ) I For a normal error regression model the sampling distribution of b1 is normal, with mean and variance given by E(b1) = β1 σ2 Var(b ) = 1 P 2 (Xi − X¯ ) Estimated variance of b1 2 I When we don't know σ then we have to replace it with the MSE estimate I Let SSE s2 = MSE = n − 2 where X 2 SSE = ei and ei = Yi − Yî plugging in we get σ2 Var(b ) = 1 P 2 (Xi − X¯ ) s2 Var(^ b ) = 1 P 2 (Xi − X¯ ) Recap I We now have an expression for the sampling distribution of b1 when σ2 is known σ2 b ∼ N (β ; ) (1) 1 1 P 2 (Xi − X¯ ) 2 I When σ is unknown we have an unbiased point estimator of σ2 s2 Var(^ b ) = 1 P 2 (Xi − X¯ ) Sampling Distribution of (b1 − β1)=S(b1) p I b1 is normally distributed so (b1 − β1)=( Var(b1)) is a standard normal variable I We don't know Var(b1) so it must be estimated from data. We have already denoted it's estimate I If using the estimate V^ (b1) it can be shown that b − β 1 1 ∼ t(n − 2) S^(b1) q S^(b1) = V^ (b1) Confidence Intervals and Hypothesis Tests Now that we know the sampling distribution of b1 (t with n-2 degrees of freedom) we can construct confidence intervals and hypothesis tests easily. 1 − α confidence limits for β1 I The 1 − α confidence limits for β1 are b1 ± t(1 − α=2; n − 2)sfb1g I Note that this quantity can be used to calculate confidence intervals given n and α. I Fixing α can guide the choice of sample size if a particular confidence interval is desired I Given a sample size, vice versa. I Also useful for hypothesis testing Tests Concerning β1 I Example 1 I Two-sided test I H0 : β1 = 0 I Ha : β1 6= 0 I Test statistic b − 0 t∗ = 1 sfb1g Tests Concerning β1 I We have an estimate of the sampling distribution of b1 from the data. I If the null hypothesis holds then the b1 estimate coming from the data should be within the 95% confidence interval of the sampling distribution centered at 0 (in this case) b − 0 t∗ = 1 sfb1g Decision rules ∗ if jt j ≤ t(1 − α=2; n − 2); accept H0 ∗ if jt j > t(1 − α=2; n − 2); reject H0 Absolute values make the test two-sided Inferences Concerning β0 I Largely, inference procedures regarding β0 can be performed in the same way as those for β1 I Remember the point estimator b0 for β0 b0 = Y¯ − b1X¯ Sampling distribution of b0 I When error variance is known E(b0) = β0 1 X¯ 2 σ2fb g = σ2( + ) 0 P 2 n (Xi − X¯ ) I When error variance is unknown 1 X¯ 2 s2fb g = MSE( + ) 0 P 2 n (Xi − X¯ ) Confidence interval for β0 The 1 − α confidence limits for β0 are obtained in the same manner as those for β1 b0 ± t(1 − α=2; n − 2)sfb0g Sampling Distribution of Y^h I We have Y^h = b0 + b1Xh 0 I Since this quantity is itself a linear combination of the Yi s it's sampling distribution is itself normal. I The mean of the sampling distribution is EfY^hg = Efb0g + Efb1gXh = β0 + β1Xh Biased or unbiased? Sampling Distribution of Y^h I So, plugging in, we get 1 (X − X¯ )2 σ2fY^ g = σ2 + h h P 2 n (Xi − X¯ ) 2 I Since we often won't know σ we can, as usual, plug in S2 = SSE=(n − 2), our estimate for it to get our estimate of this sampling distribution variance 1 (X − X¯ )2 s2fY^ g = S2 + h h P 2 n (Xi − X¯ ) No surprise::: I The sampling distribution of our point estimator for the output is distributed as a t-distribution with two degrees of freedom Y^ − EfY g h h ∼ t(n − 2) sfY^hg I This means that we can construct confidence intervals in the same manner as before. Confidence Intervals for E(Yh) I The 1 − α confidence intervals for E(Yh) are Y^h ± t(1 − α=2; n − 2)sfY^hg I From this hypothesis tests can be constructed as usual. Prediction interval for single new observation I If the regression parameters are unknown the 1 − α prediction interval for a new observation Yh is given by the following theorem Y^h ± t(1 − α=2; n − 2)sfpredg I We have 2 2 2 2 2 2 σ fpredg = σ fYh − Y^hg = σ fYhg + σ fY^hg = σ + σ fY^hg An unbiased estimator of σ2fpredg is 2 2 s fpredg = MSE + s fY^hg, which is given by 1 (X − X¯ )2 s2fpredg = MSE 1 + + h P 2 n (Xi − X¯ ) ANOVA table for simple lin. regression Source of Variation SS df MS E(MS) P ^ ¯ 2 2 2 P ¯ 2 Regression SSR = (Yi − Y ) 1 MSR = SSR=1 σ + β1 (Xi − X ) P 2 2 Error SSE = (Yi − Yî ) n − 2 MSE = SSE=(n − 2) σ P 2 Total SSTO = (Yi − Y¯ ) n − 1 F Test of β1 = 0 vs. β1 6= 0 ANOVA provides a battery of useful tests. For example, ANOVA provides an easy test for Two-sided test Test statistic from before t∗ = b1−0 H0 : β1 = 0 sfb1g Ha : β1 6= 0 ANOVA test statistic ∗ MSR Test statistic F = MSE F Distribution 2 I The F distribution is the ratio of two independent χ random variables normalized by their corresponding degrees of freedom. ∗ I The test statistic F follows the distribution F ∗ ∼ F (1; n − 2) Hypothesis Test Decision Rule ∗ Since F is distributed as F (1; n − 2) when H0 holds, the decision rule to follow when the risk of a Type I error is to be controlled at α is: ∗ If F ≤ F (1 − α; 1; n − 2), conclude H0 ∗ If F > F (1 − α; 1; n − 2), conclude Ha General Linear Test I The test of β1 = 0 versus β1 6= 0 is but a single example of a general test for a linear statistical models. I The general linear test has three parts I Full Model I Reduced Model I Test Statistic Full Model Fit I A full linear model is first fit to the data Yi = β0 + β1Xi + i I Using this model the error sum of squares is obtained, here for example the simple linear model with non-zero slope is the \full" model X 2 X 2 SSE(F ) = [Yi − (b0 + b1Xi )] = (Yi − Yî ) = SSE Fit Reduced Model I One can test the hypothesis that a simpler model is a \better" model via a general linear test (which is really a likelihood ratio test in disguise).

Midterm Review

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support