Weighted Least Squares
ST 430/514 Introduction to Regression Analysis / Statistics for Management and the Social Sciences II

Recall the linear regression equation

    E(Y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k.

We have estimated the parameters \beta_0, \beta_1, \beta_2, \ldots, \beta_k by minimizing the sum of squared residuals

    SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
        = \sum_{i=1}^{n} \left[ y_i - \left( \hat{\beta}_0 + \hat{\beta}_1 x_{i,1} + \hat{\beta}_2 x_{i,2} + \cdots + \hat{\beta}_k x_{i,k} \right) \right]^2.

Sometimes we want to give some observations more weight than others. We achieve this by minimizing a weighted sum of squares:

    WSSE = \sum_{i=1}^{n} w_i (y_i - \hat{y}_i)^2
         = \sum_{i=1}^{n} w_i \left[ y_i - \left( \hat{\beta}_0 + \hat{\beta}_1 x_{i,1} + \hat{\beta}_2 x_{i,2} + \cdots + \hat{\beta}_k x_{i,k} \right) \right]^2.

The resulting \hat{\beta}s are called weighted least squares (WLS) estimates, and the WLS residuals are \sqrt{w_i} (y_i - \hat{y}_i).

Why use weights?

Suppose that the variance is not constant: var(Y_i) = \sigma_i^2. If we use weights w_i \propto 1/\sigma_i^2, the WLS estimates have smaller standard errors than the ordinary least squares (OLS) estimates. That is, the OLS estimates are inefficient relative to the WLS estimates.

In fact, using weights proportional to 1/\sigma_i^2 is optimal: no other weights give smaller standard errors. When you specify weights, regression software calculates standard errors on the assumption that the weights are proportional to 1/\sigma_i^2.

How to choose the weights

If you have many replicates for each unique combination of the xs, use s_i^2 to estimate var(Y | x_i).
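As a quick numerical check of the definitions above: minimizing WSSE is equivalent to running OLS after scaling each row of the design matrix and response by \sqrt{w_i}, and the result agrees with solving the weighted normal equations (X'WX)\beta = X'Wy. A minimal NumPy sketch on synthetic data (not the course data; the slides themselves use R):

```python
import numpy as np

# Synthetic data for illustration: one predictor, noise sd growing with x.
rng = np.random.default_rng(0)
n = 50
x = np.linspace(1.0, 10.0, n)
y = 2.0 + 3.0 * x + rng.normal(scale=x)    # var(Y_i) proportional to x_i^2

w = 1.0 / x**2                             # weights proportional to 1/sigma_i^2
X = np.column_stack([np.ones(n), x])       # design matrix with intercept

# WLS as OLS on sqrt(w)-scaled data.
sw = np.sqrt(w)
beta_wls, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)

# Direct solve of the weighted normal equations (X' W X) beta = X' W y.
W = np.diag(w)
beta_check = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

print(beta_wls)
```

The two routes give the same coefficient vector; the scaling trick is how WLS is usually implemented in practice.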
Often you will not have enough replicates to give good variance estimates. The text suggests grouping observations that are "nearest neighbors". Alternatively, you can use the regression diagnostic plots.

Example: Florida road contracts

    dot11 <- read.table("Text/Exercises&Examples/DOT11.txt", header = TRUE)
    l1 <- lm(BIDPRICE ~ LENGTH, dot11)
    summary(l1)
    plot(l1)

The first plot uses unweighted residuals y_i - \hat{y}_i, but the others use weighted residuals. Also recall that they are "standardized residuals"

    z_i^* = \frac{y_i - \hat{y}_i}{s \sqrt{1 - h_i}},

which are called Studentized residuals in the text. With weights, the standardized residuals are

    z_i^* = \sqrt{w_i} \, \frac{y_i - \hat{y}_i}{s \sqrt{1 - h_i}}.

Note that the "Scale-Location" plot shows an increasing trend. Try weights that are proportional to powers of x = LENGTH:

    # Try power -1:
    plot(lm(BIDPRICE ~ LENGTH, dot11, weights = 1/LENGTH))
    # Still slightly increasing; try power -2:
    plot(lm(BIDPRICE ~ LENGTH, dot11, weights = 1/LENGTH^2))
    # Now slightly decreasing.

summary() shows that the fitted equations are all very similar. weights = 1/LENGTH gives the smallest standard errors.

Often the weights are determined by fitted values, not by the independent variable:

    # Try power -1:
    plot(lm(BIDPRICE ~ LENGTH, dot11, weights = 1/fitted(l1)))
    # About flat; but try power -2:
    plot(lm(BIDPRICE ~ LENGTH, dot11, weights = 1/fitted(l1)^2))
    # Now definitely decreasing.
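The weighted standardized residuals used in these diagnostic plots can also be computed by hand: the leverages h_i come from the hat matrix of the sqrt(w)-scaled fit, and s^2 is the weighted residual mean square. A NumPy sketch on synthetic data (the DOT11 data are not reproduced here):

```python
import numpy as np

# Hypothetical data with variance increasing in x.
rng = np.random.default_rng(1)
n = 30
x = np.linspace(1.0, 6.0, n)
y = 1.0 + 2.0 * x + rng.normal(scale=np.sqrt(x))

w = 1.0 / x                                # trial weights, power -1 as in the slides
X = np.column_stack([np.ones(n), x])
sw = np.sqrt(w)
Xt = X * sw[:, None]                       # scaled design sqrt(W) X

beta = np.linalg.lstsq(Xt, y * sw, rcond=None)[0]
resid = y - X @ beta                       # raw residuals y_i - yhat_i

p = X.shape[1]
s2 = np.sum(w * resid**2) / (n - p)        # weighted residual mean square, s^2

# Leverages: diagonal of H = Xt (Xt' Xt)^{-1} Xt'.
G = np.linalg.inv(Xt.T @ Xt)
h = np.einsum('ij,jk,ik->i', Xt, G, Xt)

# Weighted standardized residuals z_i^* = sqrt(w_i) (y_i - yhat_i) / (s sqrt(1 - h_i)).
z = sw * resid / np.sqrt(s2 * (1.0 - h))
```

The leverages sum to the number of coefficients p, which is a handy sanity check on the hat-matrix computation.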
summary() shows that the fitted equations are again very similar. weights = 1/fitted(l1) gives the smallest standard errors.

Note

Standard errors are computed as if the weights are known constants. In the last case, we used weights based on a preliminary OLS fit. Theory shows that in large samples the standard errors are also valid with estimated weights.

Note

When you specify weights w_i, lm() fits the model

    \sigma_i^2 = \frac{\sigma^2}{w_i},

and the "Residual standard error" s is an estimate of \sigma:

    s^2 = \frac{\sum_{i=1}^{n} w_i (y_i - \hat{y}_i)^2}{n - p}.

If you change the weights, the meaning of \sigma (and s) changes. You cannot compare the residual standard errors for different weighting schemes (cf. page 488, foot).
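The last point can be seen numerically: rescaling all the weights by a constant leaves the coefficients (and fitted values) unchanged but rescales s^2 by that constant, so s is only meaningful relative to a fixed weighting scheme. A NumPy sketch on synthetic data (mirroring what lm() reports as the "Residual standard error"):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
x = np.linspace(1.0, 8.0, n)
y = 0.5 + 1.5 * x + rng.normal(scale=x)
X = np.column_stack([np.ones(n), x])

def wls(w):
    # Fit by OLS on sqrt(w)-scaled data; return coefficients and s^2.
    sw = np.sqrt(w)
    beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    resid = y - X @ beta
    s2 = np.sum(w * resid**2) / (n - X.shape[1])
    return beta, s2

w = 1.0 / x**2
beta1, s2_1 = wls(w)
beta2, s2_2 = wls(10.0 * w)    # same weighting scheme up to a constant

# Same fit, but s^2 is multiplied by 10: the scale of s depends on the weights.
print(np.allclose(beta1, beta2), s2_2 / s2_1)
```

This is exactly why residual standard errors from fits with different weights cannot be compared directly.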