Simple Linear Regression

Example: Body density

Aim: Measure body density (weight per unit volume of the body). Body density indicates the fat content of the human body.

Problem:
◦ Body density is difficult to measure directly.
◦ Research suggests that skinfold thickness can accurately predict body density.
◦ Skinfold thickness is measured by pinching a fold of skin between calipers.

[Figure: scatter plot of body density ($10^3$ kg/m$^3$) against skinfold thickness (mm)]

Questions:
◦ Are body density and skinfold thickness related?
◦ How accurately can we predict body density from skinfold thickness?

Regression: predict the response variable for a fixed value of the explanatory variable
◦ describe the linear relationship in the data by a regression line
◦ the fitted regression line is affected by chance variation in the observed data

Statistical inference: accounts for chance variation in the data

Simple Linear Regression, Feb 27, 2004

Population Regression Line

Simple linear regression studies the relationship between
◦ a response variable $Y$ and
◦ a single explanatory variable $X$.

We expect that different values of $X$ will produce different mean responses of $Y$.

For given $X = x$, we consider the subpopulation with $X = x$:
◦ this subpopulation has mean (conditional mean of $Y$ given $X = x$)
  \[ \mu_{Y|X=x} = E(Y \mid X = x) \]
◦ and variance (conditional variance of $Y$ given $X = x$)
  \[ \sigma^2_{Y|X=x} = \mathrm{var}(Y \mid X = x) \]

Linear regression model with constant variance:
\[ E(Y \mid X = x) = \mu_{Y|X=x} = a + b x \qquad \text{(population regression line)} \]
\[ \mathrm{var}(Y \mid X = x) = \sigma^2_{Y|X=x} = \sigma^2 \]

◦ The population regression line connects the conditional means of the response variable for fixed values of the explanatory variable.
◦ This population regression line tells how the mean response of $Y$ varies with $X$.
◦ The variance (and standard deviation) does not depend on $x$.
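The conditional mean and variance of a subpopulation can be illustrated with a small discrete joint distribution. The following is a minimal sketch in Python, not part of the original slides: the joint probabilities, the values A = 1 and B = 2, and the function names are made up for illustration. The distribution is constructed so that the conditional means fall on the line 1 + 2x and the conditional variance is the same for every x.

```python
from fractions import Fraction

# Assumed intercept and slope of a hypothetical population regression line.
A, B = 1, 2

def build_joint(xs, p_x):
    """Hypothetical joint pmf P(X = x, Y = y).

    Given X = x, Y takes the values A + B*x - 1, A + B*x, A + B*x + 1
    with probabilities 1/4, 1/2, 1/4.
    """
    offsets = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}
    joint = {}
    for x, px in zip(xs, p_x):
        for d, pd in offsets.items():
            joint[(x, A + B * x + d)] = px * pd
    return joint

def conditional_mean_var(joint, x0):
    """Mean and variance of Y in the subpopulation with X = x0."""
    # Marginal probability of the subpopulation X = x0.
    p_x0 = sum(p for (x, _), p in joint.items() if x == x0)
    # Conditional pmf: joint probabilities rescaled by the marginal.
    cond = {y: p / p_x0 for (x, y), p in joint.items() if x == x0}
    mean = sum(y * p for y, p in cond.items())
    var = sum((y - mean) ** 2 * p for y, p in cond.items())
    return mean, var

joint = build_joint([0, 1, 2], [Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)])
for x0 in (0, 1, 2):
    mean, var = conditional_mean_var(joint, x0)
    print(x0, mean, var)  # mean equals A + B*x0, variance is 1/2 for every x0
```

Exact fractions are used so that the means and the constant variance come out without rounding error.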
Conditional Mean

◦ Sample $(x_1, y_1), \ldots, (x_n, y_n)$
◦ Sampling probability $f_{XY}(x, y)$
◦ Fix $x = x_0$: this gives $f_{XY}(x_0, y)$ as a function of $y$
◦ Rescale by $f_X(x_0)$ to obtain the conditional probability
  \[ f_{Y|X}(y \mid x_0) = \frac{f_{XY}(x_0, y)}{f_X(x_0)} \]
◦ Conditional mean:
  \[ E(Y \mid X = x_0) = \int y \, f_{Y|X}(y \mid x_0) \, dy \]

[Figure: four panels showing the sample, the sampling probability $f_{XY}(x, y)$, the slice $f_{XY}(x_0, y)$ at fixed $x = x_0$, and the conditional probability $f_{Y|X}(y \mid x_0)$]

The Linear Regression Model

Simple linear regression
\[ Y_i = a + b x_i + \varepsilon_i, \qquad i = 1, \ldots, n \]
where
  $Y_i$  response (also dependent variable)
  $x_i$  predictor (also independent variable)
  $\varepsilon_i$  error

Assumptions:
◦ Predictor $x_i$ is deterministic (fixed values, not random).
◦ Errors have zero mean, $E(\varepsilon_i) = 0$.
◦ Variation about the mean does not depend on $x_i$, i.e. $\mathrm{var}(\varepsilon_i) = \sigma^2$.
◦ Errors $\varepsilon_i$ are independent.

Often we additionally assume:
◦ The errors are normally distributed,
  \[ \varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2). \]

For fixed $x$ the response $Y$ is normally distributed with $Y \sim N(a + b x, \sigma^2)$.
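The model with normal errors can be checked by simulation. The following is a rough sketch in Python under assumed values: the parameters a = 13.7, b = -11.4, sigma = 0.1, the predictor value x0 = 1.06 and the function name are chosen only for illustration. It draws many responses at one fixed predictor value and compares their mean and standard deviation with a + b x0 and sigma.

```python
import random
import statistics

def simulate_responses(a, b, sigma, xs, rng):
    """Draw Y_i = a + b*x_i + eps_i with eps_i iid N(0, sigma^2); xs are fixed."""
    return [a + b * x + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical parameter values, chosen only for illustration.
a, b, sigma = 13.7, -11.4, 0.1
rng = random.Random(0)  # fixed seed so the sketch is reproducible

x0 = 1.06  # one fixed value of the predictor
ys = simulate_responses(a, b, sigma, [x0] * 5000, rng)

print("model mean a + b*x0:", a + b * x0)
print("simulated mean     :", statistics.mean(ys))
print("model sd sigma     :", sigma)
print("simulated sd       :", statistics.stdev(ys))
```

With 5000 draws the simulated mean and standard deviation should lie close to the model values. The remaining difference is the chance variation that statistical inference has to account for.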
Least Squares Estimation

Data: $(Y_1, x_1), \ldots, (Y_n, x_n)$

Aim: Find the straight line which fits the data best:
\[ \hat{Y}_i = a + b x_i \qquad \text{(fitted values for coefficients $a$ and $b$)} \]
  $a$  intercept
  $b$  slope

Least Squares Approach: Minimize the squared distance between observed $Y_i$ and fitted $\hat{Y}_i$:
\[ L(a, b) = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} (Y_i - a - b x_i)^2 \]

Set partial derivatives to zero (normal equations):
\[ \frac{\partial L}{\partial a} = 0 \iff \sum_{i=1}^{n} (Y_i - a - b x_i) = 0 \]
\[ \frac{\partial L}{\partial b} = 0 \iff \sum_{i=1}^{n} (Y_i - a - b x_i) \cdot x_i = 0 \]

Solution: Least squares estimators
\[ \hat{a} = \bar{Y} - \frac{S_{XY}}{S_{XX}} \cdot \bar{x}, \qquad \hat{b} = \frac{S_{XY}}{S_{XX}} \]
where
\[ S_{XY} = \sum_{i=1}^{n} (Y_i - \bar{Y})(x_i - \bar{x}) \qquad \text{(sum of cross products)} \]
\[ S_{XX} = \sum_{i=1}^{n} (x_i - \bar{x})^2 \qquad \text{(sum of squares)} \]

Least Squares Estimation

Least squares predictor $\hat{Y}$:
\[ \hat{Y}_i = \hat{a} + \hat{b} x_i \]

Residuals $\hat{\varepsilon}_i$:
\[ \hat{\varepsilon}_i = Y_i - \hat{Y}_i = Y_i - \hat{a} - \hat{b} x_i \]

Residual sum of squares (SS Residual):
\[ \text{SS Residual} = \sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \]

Estimation of $\sigma^2$:
\[ \hat{\sigma}^2 = \frac{1}{n - 2} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \frac{1}{n - 2} \, \text{SS Residual} \]

Regression standard error:
\[ s_e = \hat{\sigma} = \sqrt{\text{SS Residual} / (n - 2)} \]

Variation accounting:
\[ \text{SS Total} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 \qquad \text{(total variation)} \]
\[ \text{SS Model} = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 \qquad \text{(variation explained by linear model)} \]
\[ \text{SS Residual} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \qquad \text{(remaining variation)} \]

Least Squares Estimation

Example: Body density

[Figure: scatter plot of body density ($10^3$ kg/m$^3$) against skinfold thickness (mm) with least squares regression line]

Calculation of least squares estimates:

  x̄       ȳ       SXX      SXY       SYY     SS Residual
  1.064   1.568   0.0235   -0.2679   4.244   1.187

\[ \hat{b} = \frac{S_{XY}}{S_{XX}} = \frac{-0.2679}{0.0235} = -11.40 \]
\[ \hat{a} = \bar{y} - \hat{b} \bar{x} = 1.568 + 11.40 \cdot 1.064 = 13.70 \]
\[ \hat{\sigma}^2 = \frac{\text{SS Residual}}{n - 2} = \frac{1.187}{90} = 0.0132 \]
\[ s_e = \sqrt{\hat{\sigma}^2} = \sqrt{0.0132} = 0.1149 \]

Least Squares Estimation

Example: Body density

Using STATA:

. infile ID BODYD SKINT using bodydens.txt, clear
(92 observations read)
. regress BODYD SKINT

      Source |       SS       df       MS              Number of obs =      92
-------------+------------------------------           F(  1,    90) =  231.89
       Model |  3.05747739     1  3.05747739           Prob > F      =  0.0000
    Residual |  1.18663025    90  .013184781           R-squared     =  0.7204
-------------+------------------------------           Adj R-squared =  0.7173
       Total |  4.24410764    91  .046638546           Root MSE      =  .11482

------------------------------------------------------------------------------
       BODYD |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       SKINT |  -11.41345   .7494999   -15.23   0.000    -12.90246   -9.924433
       _cons |   13.71221   .7975822    17.19   0.000     12.12768    15.29675
------------------------------------------------------------------------------

. twoway (lfitci BODYD SKINT, range(1 1.1)) (scatter BODYD SKINT), xtitle(Skin thickness) ytitle(Body density) scheme(s1color) legend(off)

[Figure: scatter plot of body density against skin thickness with fitted regression line and confidence band]
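The least squares formulas and the summary quantities in the regression output can be retraced by hand. The following is a minimal sketch in Python, not the Stata implementation. The function name `least_squares` and the toy data are hypothetical. The R-squared and F formulas used (SS Model / SS Total and MS Model / MS Residual) are the standard definitions and are assumed here, since the slides do not state them.

```python
import math

def least_squares(xs, ys):
    """Return (a_hat, b_hat, ss_residual) for the fitted line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((y - y_bar) * (x - x_bar) for x, y in zip(xs, ys))
    b_hat = s_xy / s_xx               # slope
    a_hat = y_bar - b_hat * x_bar     # intercept
    ss_res = sum((y - a_hat - b_hat * x) ** 2 for x, y in zip(xs, ys))
    return a_hat, b_hat, ss_res

# Toy data (hypothetical, not the body density data set).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 6.0]
a_hat, b_hat, ss_res = least_squares(xs, ys)
print("toy fit:", a_hat, b_hat, ss_res)

# Slope from the rounded summary statistics of the body density example.
b_summary = -0.2679 / 0.0235

# Sums of squares as reported in the regression output (n = 92).
n = 92
ss_model, ss_residual, ss_total = 3.05747739, 1.18663025, 4.24410764

sigma2_hat = ss_residual / (n - 2)     # estimate of sigma^2 (MS Residual)
root_mse = math.sqrt(sigma2_hat)       # regression standard error s_e
r_squared = ss_model / ss_total        # share of variation explained
f_stat = (ss_model / 1) / sigma2_hat   # MS Model / MS Residual

print("slope from summary statistics:", round(b_summary, 2))
print("sigma^2 estimate:", round(sigma2_hat, 6))
print("Root MSE:", round(root_mse, 5))
print("R-squared:", round(r_squared, 4))
print("F(1, 90):", round(f_stat, 2))
```

The hand calculation gives a regression standard error of 0.1149 while the output reports 0.11482. The difference comes from rounding $\hat{\sigma}^2$ to 0.0132 before taking the square root.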
