ST 430/514 Introduction to Regression Analysis / Statistics for Management and the Social Sciences II

Weighted Least Squares
Recall the linear regression equation
E(Y ) = β0 + β1x1 + β2x2 + ··· + βk xk
We have estimated the parameters β0, β1, β2, ... , βk by minimizing the sum of squared residuals
SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
    = \sum_{i=1}^{n} \left[ y_i - \left( \hat{\beta}_0 + \hat{\beta}_1 x_{i,1} + \hat{\beta}_2 x_{i,2} + \cdots + \hat{\beta}_k x_{i,k} \right) \right]^2.
Sometimes we want to give some observations more weight than others.
We achieve this by minimizing a weighted sum of squares:
WSSE = \sum_{i=1}^{n} w_i (y_i - \hat{y}_i)^2
     = \sum_{i=1}^{n} w_i \left[ y_i - \left( \hat{\beta}_0 + \hat{\beta}_1 x_{i,1} + \hat{\beta}_2 x_{i,2} + \cdots + \hat{\beta}_k x_{i,k} \right) \right]^2.
The resulting \hat{\beta}s are called weighted least squares (WLS) estimates, and the WLS residuals are \sqrt{w_i}\,(y_i - \hat{y}_i).
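As a concrete sketch (simulated data, not from the text), lm() with a weights argument produces exactly the minimizer of WSSE, which can be checked against the closed-form solution \hat{\beta} = (X'WX)^{-1}X'Wy:

```r
# Minimal sketch with simulated data: WLS via lm(weights = ) agrees with
# the closed-form solution of the weighted normal equations.
set.seed(1)
n <- 50
x <- runif(n, 0, 10)
y <- 2 + 3 * x + rnorm(n, sd = 1 + x)  # error SD grows with x
w <- 1 / (1 + x)^2                     # weights proportional to 1/variance

fit <- lm(y ~ x, weights = w)

# Closed form: beta-hat = (X'WX)^{-1} X'W y
X <- cbind(1, x)
beta <- solve(t(X) %*% (w * X), t(X) %*% (w * y))
all.equal(unname(coef(fit)), as.vector(beta))  # TRUE (up to tolerance)
```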
Why use weights? Suppose that the variance is not constant:
var(Y_i) = \sigma_i^2.
If we use weights w_i \propto 1/\sigma_i^2, the WLS estimates have smaller standard errors than the ordinary least squares (OLS) estimates.
That is, the OLS estimates are inefficient, relative to the WLS estimates.
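A small simulation (my own illustration, not from the text) makes the efficiency claim concrete: over repeated samples with var(Y_i) \propto x_i^2, the WLS slope estimate varies less than the OLS one:

```r
# Sketch: empirical sampling SDs of the OLS and WLS slope estimates
# when var(Y_i) is proportional to x_i^2 and w_i = 1/x_i^2.
set.seed(2)
slopes <- replicate(2000, {
  x <- runif(40, 1, 10)
  y <- 1 + 2 * x + rnorm(40, sd = x)
  c(ols = unname(coef(lm(y ~ x))["x"]),
    wls = unname(coef(lm(y ~ x, weights = 1 / x^2))["x"]))
})
apply(slopes, 1, sd)  # the WLS entry should be the smaller one
```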
In fact, using weights proportional to 1/\sigma_i^2 is optimal: no other weights give smaller standard errors.
When you specify weights, regression software calculates standard errors on the assumption that the weights are proportional to 1/\sigma_i^2.
How to choose the weights

If you have many replicates for each unique combination of the xs, use s_i^2 to estimate var(Y | x_i). Often you will not have enough replicates to give good variance estimates.
The text suggests grouping observations that are “nearest neighbors”.
Alternatively you can use the regression diagnostic plots.
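A sketch of the replicate-based approach, using simulated data with ten replicates at each unique x (all data and names here are illustrative):

```r
# Estimate var(Y | x) by the sample variance within each unique x,
# then refit with weights proportional to 1/s_i^2.
set.seed(3)
x <- rep(1:5, each = 10)
y <- 4 + 2 * x + rnorm(length(x), sd = x)
s2 <- tapply(y, x, var)        # replicate variance at each unique x
w <- 1 / s2[as.character(x)]   # w_i proportional to 1/s_i^2
fit <- lm(y ~ x, weights = w)
summary(fit)$coefficients
```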
Example: Florida road contracts.

dot11 <- read.table("Text/Exercises&Examples/DOT11.txt", header = TRUE)
l1 <- lm(BIDPRICE ~ LENGTH, dot11)
summary(l1)
plot(l1)
The first plot uses unweighted residuals y_i - \hat{y}_i, but the others use weighted residuals.
Recall also that the plots show "Standardized residuals"

z_i^* = \frac{y_i - \hat{y}_i}{s\sqrt{1 - h_i}},

which are called Studentized residuals in the text.

With weights, the standardized residuals are

z_i^* = \sqrt{w_i}\,\frac{y_i - \hat{y}_i}{s\sqrt{1 - h_i}}.
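In R, rstandard() computes exactly this quantity for a weighted fit; a quick check on simulated data (my own sketch, not from the text):

```r
# Verify that rstandard() returns sqrt(w_i)(y_i - yhat_i)/(s*sqrt(1 - h_i))
# for a weighted lm() fit.
set.seed(4)
x <- runif(30, 1, 10)
y <- 1 + x + rnorm(30, sd = sqrt(x))
w <- 1 / x
fit <- lm(y ~ x, weights = w)
s <- summary(fit)$sigma
h <- hatvalues(fit)
z <- sqrt(w) * residuals(fit) / (s * sqrt(1 - h))
all.equal(unname(z), unname(rstandard(fit)))  # TRUE (up to tolerance)
```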
Note that the “Scale-Location” plot shows an increasing trend.
Try weights that are proportional to powers of x = LENGTH:
# Try power -1:
plot(lm(BIDPRICE ~ LENGTH, dot11, weights = 1/LENGTH))
# Still slightly increasing; try power -2:
plot(lm(BIDPRICE ~ LENGTH, dot11, weights = 1/LENGTH^2))
# Now slightly decreasing.

summary() shows that the fitted equations are all very similar; weights = 1/LENGTH gives the smallest standard errors.
Often the weights are determined by fitted values, not by the independent variable:
# Try power -1:
plot(lm(BIDPRICE ~ LENGTH, dot11, weights = 1/fitted(l1)))
# About flat; but try power -2:
plot(lm(BIDPRICE ~ LENGTH, dot11, weights = 1/fitted(l1)^2))
# Now definitely decreasing.

summary() shows that the fitted equations are again very similar; weights = 1/fitted(l1) gives the smallest standard errors.
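The standard-error comparison can be read off summary(); since the DOT11 file is not reproduced here, this sketch uses simulated stand-in data with the same variable names:

```r
# Compare the slope's standard error across candidate weighting schemes
# (simulated stand-in for the DOT11 data).
set.seed(5)
LENGTH <- runif(60, 1, 20)
BIDPRICE <- 2 + 0.5 * LENGTH + rnorm(60, sd = 0.2 * sqrt(LENGTH))
d <- data.frame(LENGTH, BIDPRICE)
l1 <- lm(BIDPRICE ~ LENGTH, d)
fits <- list(ols  = l1,
             invx = lm(BIDPRICE ~ LENGTH, d, weights = 1 / LENGTH),
             invf = lm(BIDPRICE ~ LENGTH, d, weights = 1 / fitted(l1)))
sapply(fits, function(f) summary(f)$coefficients["LENGTH", "Std. Error"])
```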
Note

Standard errors are computed as if the weights are known constants.
In the last case, we used weights based on a preliminary OLS fit.
Theory shows that in large samples the standard errors are also valid with estimated weights.
Note
When you specify weights w_i, lm() fits the model

\sigma_i^2 = \frac{\sigma^2}{w_i},

and the "Residual standard error" s is an estimate of \sigma:

s^2 = \frac{\sum_{i=1}^{n} w_i (y_i - \hat{y}_i)^2}{n - p}.
If you change the weights, the meaning of σ (and s) changes.
You cannot compare residual standard errors across different weighting schemes (cf. the foot of page 488).
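A sketch of why the comparison fails (simulated data, my own illustration): with w_i = 1/x_i, the "Residual standard error" estimates the \sigma in var(Y_i) = \sigma^2 x_i (here true \sigma = 2), while a different weighting estimates a different \sigma:

```r
# The meaning of sigma (and s) depends on the weights.
set.seed(6)
x <- runif(200, 1, 10)
y <- 1 + x + rnorm(200, sd = 2 * sqrt(x))  # var(Y_i) = 4 x_i
fit1 <- lm(y ~ x, weights = 1 / x)
sigma(fit1)  # should be near 2: the sigma of the model var(Y_i) = sigma^2 x_i
fit2 <- lm(y ~ x, weights = 1 / x^2)
sigma(fit2)  # a different number, estimating a different sigma
```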