
LINEAR REGRESSION ANALYSIS

MODULE – VII

Lecture – 25 Generalized and Weighted Least Squares Estimation

Dr. Shalabh, Department of Mathematics and Statistics, Indian Institute of Technology Kanpur

The usual model assumes that all the random error components are identically and independently distributed with constant variance $\sigma^2$. When this assumption is violated, the estimator of the regression coefficient loses its property of minimum variance in the class of linear and unbiased estimators. The violation of this assumption can arise in any one of the following situations:
1. The variance of the random error components is not constant.
2. The random error components are not independent.
3. The random error components do not have constant variance and are also not independent.

In such cases, the covariance matrix of the random error components does not remain in the form of an identity matrix but can be any positive definite matrix. Under such an assumption, the OLSE does not remain efficient as in the case of an identity covariance matrix. The generalized or weighted least squares method is used in such situations to estimate the parameters of the model.

In this method, the deviation between the observed and expected values of $y_i$ is multiplied by a weight $\omega_i$, where $\omega_i$ is chosen to be inversely proportional to the variance of $y_i$.

For the simple linear regression model, the weighted least squares function is
$$S(\beta_0, \beta_1) = \sum_{i=1}^{n} \omega_i (y_i - \beta_0 - \beta_1 x_i)^2.$$

The least squares normal equations are obtained by differentiating $S(\beta_0, \beta_1)$ with respect to $\beta_0$ and $\beta_1$ and equating them to zero:
$$\hat{\beta}_0 \sum_{i=1}^{n} \omega_i + \hat{\beta}_1 \sum_{i=1}^{n} \omega_i x_i = \sum_{i=1}^{n} \omega_i y_i$$
$$\hat{\beta}_0 \sum_{i=1}^{n} \omega_i x_i + \hat{\beta}_1 \sum_{i=1}^{n} \omega_i x_i^2 = \sum_{i=1}^{n} \omega_i x_i y_i.$$

The solution of these two normal equations gives the weighted least squares estimates of $\beta_0$ and $\beta_1$.
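As a quick illustration, the following NumPy sketch solves these two normal equations numerically; the data values and weights below are made up purely for demonstration:

```python
import numpy as np

# Hypothetical data and weights, purely for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
w = 1.0 / np.array([0.5, 0.5, 1.0, 2.0, 2.0])  # weights inversely proportional to Var(y_i)

# The two weighted normal equations in matrix form:
#   b0 * sum(w)   + b1 * sum(w x)   = sum(w y)
#   b0 * sum(w x) + b1 * sum(w x^2) = sum(w x y)
A = np.array([[w.sum(),       (w * x).sum()],
              [(w * x).sum(), (w * x * x).sum()]])
rhs = np.array([(w * y).sum(), (w * x * y).sum()])

b0_hat, b1_hat = np.linalg.solve(A, rhs)
print(b0_hat, b1_hat)
```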

Generalized least squares estimation

Suppose in the usual multiple regression model $y = X\beta + \varepsilon$ with $E(\varepsilon) = 0$, $V(\varepsilon) = \sigma^2 I$, the assumption $V(\varepsilon) = \sigma^2 I$ is violated and becomes
$$V(\varepsilon) = \sigma^2 \Omega$$
where $\Omega$ is a known $n \times n$ nonsingular, positive definite and symmetric matrix.

This structure of $\Omega$ incorporates both the cases:
1. when $\Omega$ is diagonal but with unequal variances, and
2. when $\Omega$ is not necessarily diagonal; depending on the presence of correlated errors, the off-diagonal elements are nonzero.

The OLSE of β is

$$b = (X'X)^{-1} X'y.$$

In such cases, the OLSE gives an unbiased estimate but has more variability:
$$E(b) = (X'X)^{-1} X' E(y) = (X'X)^{-1} X'X\beta = \beta$$

$$V(b) = (X'X)^{-1} X' V(y) X (X'X)^{-1} = \sigma^2 (X'X)^{-1} X' \Omega X (X'X)^{-1}.$$
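To see this extra variability concretely, the sketch below compares the diagonal of the correct covariance $\sigma^2 (X'X)^{-1} X'\Omega X (X'X)^{-1}$ against the nominal $\sigma^2 (X'X)^{-1}$ that OLS would report; the AR(1)-type $\Omega$ here is a made-up example structure, not part of the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2, rho = 50, 1.0, 0.7
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])

# Hypothetical AR(1)-type correlation structure used as Omega.
Omega = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

XtX_inv = np.linalg.inv(X.T @ X)
V_true = sigma2 * XtX_inv @ X.T @ Omega @ X @ XtX_inv  # correct (sandwich) covariance
V_nominal = sigma2 * XtX_inv                           # what OLS assumes under sigma^2 I

print(np.diag(V_true) / np.diag(V_nominal))            # variance inflation per coefficient
```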

Now we attempt to find a better estimator as follows:

Since $\Omega$ is positive definite and symmetric, there exists a nonsingular matrix $K$ such that

$$KK' = \Omega.$$

Then premultiplying the model $y = X\beta + \varepsilon$ by $K^{-1}$ gives

$$K^{-1} y = K^{-1} X \beta + K^{-1} \varepsilon \quad \text{or} \quad z = B\beta + g$$

where $z = K^{-1} y$, $B = K^{-1} X$, $g = K^{-1} \varepsilon$. Now observe that

$$E(g) = K^{-1} E(\varepsilon) = 0$$

$$\begin{aligned}
V(g) &= E\left[\{g - E(g)\}\{g - E(g)\}'\right] \\
&= E(gg') \\
&= E(K^{-1} \varepsilon \varepsilon' K'^{-1}) \\
&= K^{-1} E(\varepsilon \varepsilon') K'^{-1} \\
&= \sigma^2 K^{-1} \Omega K'^{-1} \\
&= \sigma^2 K^{-1} K K' K'^{-1} \\
&= \sigma^2 I.
\end{aligned}$$

Thus the elements of $g$ have mean $0$, common variance $\sigma^2$, and they are uncorrelated.
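In computation, the factor $K$ can be taken as the lower-triangular Cholesky factor of $\Omega$. The sketch below, with a made-up AR(1) example for $\Omega$, verifies that $\Omega = KK'$ and that $K^{-1}\Omega K'^{-1} = I$, so the transformed errors are indeed spherical:

```python
import numpy as np

n, rho = 6, 0.7
# Hypothetical AR(1) correlation matrix used as Omega.
Omega = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

K = np.linalg.cholesky(Omega)   # lower-triangular K with Omega = K K'
K_inv = np.linalg.inv(K)

print(np.allclose(K @ K.T, Omega))                      # Omega = K K'
print(np.allclose(K_inv @ Omega @ K_inv.T, np.eye(n)))  # V(g) proportional to I
```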

So we either minimize
$$S(\beta) = g'g = \varepsilon' \Omega^{-1} \varepsilon = (y - X\beta)' \Omega^{-1} (y - X\beta)$$

and get the normal equations

$$(X' \Omega^{-1} X) \hat{\beta} = X' \Omega^{-1} y$$

or
$$\hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} y.$$

Alternatively, we can apply OLS to the transformed model and obtain the OLSE of $\beta$ as
$$\begin{aligned}
\hat{\beta} &= (B'B)^{-1} B'z \\
&= (X' K'^{-1} K^{-1} X)^{-1} X' K'^{-1} K^{-1} y \\
&= (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} y.
\end{aligned}$$

This is termed as the generalized least squares estimator (GLSE) of $\beta$.
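The following sketch, again with a made-up AR(1) $\Omega$, computes the GLSE both directly from the normal equations and as OLS on the transformed model, and confirms the two routes agree:

```python
import numpy as np

rng = np.random.default_rng(2)
n, rho = 30, 0.6
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Omega = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
K = np.linalg.cholesky(Omega)
y = X @ np.array([1.0, 2.0]) + K @ rng.normal(size=n)  # errors with V(eps) = Omega

# Route 1: solve the GLS normal equations (X' Omega^{-1} X) beta = X' Omega^{-1} y.
Omega_inv = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)

# Route 2: OLS on the transformed model z = B beta + g.
K_inv = np.linalg.inv(K)
z, B = K_inv @ y, K_inv @ X
beta_transformed = np.linalg.lstsq(B, z, rcond=None)[0]

print(np.allclose(beta_gls, beta_transformed))  # True: both give the GLSE
```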

The estimation error of the GLSE is
$$\begin{aligned}
\hat{\beta} &= (B'B)^{-1} B'(B\beta + g) \\
&= \beta + (B'B)^{-1} B'g
\end{aligned}$$

or
$$\hat{\beta} - \beta = (B'B)^{-1} B'g.$$

Then
$$E(\hat{\beta} - \beta) = (B'B)^{-1} B' E(g) = 0,$$
which shows that the GLSE is an unbiased estimator of $\beta$. The covariance matrix of the GLSE is given by
$$\begin{aligned}
V(\hat{\beta}) &= E\left[\{\hat{\beta} - E(\hat{\beta})\}\{\hat{\beta} - E(\hat{\beta})\}'\right] \\
&= E\left[(B'B)^{-1} B' g g' B (B'B)^{-1}\right] \\
&= (B'B)^{-1} B' E(gg') B (B'B)^{-1}.
\end{aligned}$$
Since
$$\begin{aligned}
E(gg') &= K^{-1} E(\varepsilon \varepsilon') K'^{-1} \\
&= \sigma^2 K^{-1} \Omega K'^{-1} \\
&= \sigma^2 K^{-1} K K' K'^{-1} \\
&= \sigma^2 I,
\end{aligned}$$
so
$$V(\hat{\beta}) = \sigma^2 (B'B)^{-1} B'B (B'B)^{-1} = \sigma^2 (B'B)^{-1} = \sigma^2 (X' K'^{-1} K^{-1} X)^{-1} = \sigma^2 (X' \Omega^{-1} X)^{-1}.$$
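A short simulation check of this covariance formula, under the same made-up assumptions as the earlier sketches ($\sigma^2 = 1$, AR(1) example $\Omega$), compares the empirical covariance of the GLSE over repeated samples with $(X'\Omega^{-1}X)^{-1}$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, rho, reps = 25, 0.6, 5000
X = np.column_stack([np.ones(n), np.linspace(0.0, 1.0, n)])
Omega = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
K = np.linalg.cholesky(Omega)
Omega_inv = np.linalg.inv(Omega)

A = np.linalg.inv(X.T @ Omega_inv @ X) @ X.T @ Omega_inv  # GLSE as a linear map of y
beta = np.array([1.0, 2.0])

est = np.empty((reps, 2))
for r in range(reps):
    y = X @ beta + K @ rng.normal(size=n)  # sigma^2 = 1, so V(eps) = Omega
    est[r] = A @ y

print(np.cov(est.T))                       # empirical covariance of the GLSE
print(np.linalg.inv(X.T @ Omega_inv @ X))  # theoretical sigma^2 (X' Omega^{-1} X)^{-1}
```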

Now we prove that the GLSE is the best linear unbiased estimator of $\beta$.

The Gauss-Markov theorem for the case $V(\varepsilon) = \Omega$

The Gauss-Markov theorem establishes that the generalized least squares (GLS) estimator of $\beta$, given by
$$\hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} y,$$
is the best linear unbiased estimator (BLUE); here $\sigma^2$ is absorbed into $\Omega$. By best, we mean that $\hat{\beta}$ minimizes the variance of any linear combination of the estimated coefficients, $\ell' \hat{\beta}$. We note that
$$E(\hat{\beta}) = E\left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1} y\right] = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} E(y) = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} X \beta = \beta.$$

Thus $\hat{\beta}$ is an unbiased estimator of $\beta$.

The covariance matrix of $\hat{\beta}$ is given by
$$\begin{aligned}
V(\hat{\beta}) &= (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} \, V(y) \left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1}\right]' \\
&= (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} \, \Omega \left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1}\right]' \\
&= (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} \Omega \Omega^{-1} X (X' \Omega^{-1} X)^{-1} \\
&= (X' \Omega^{-1} X)^{-1}.
\end{aligned}$$
Thus,
$$Var(\ell' \hat{\beta}) = \ell' \, Var(\hat{\beta}) \, \ell = \ell' (X' \Omega^{-1} X)^{-1} \ell.$$

Let $\tilde{\beta}$ be another unbiased estimator of $\beta$ that is a linear combination of the data. Our goal, then, is to show that $Var(\ell' \tilde{\beta}) \geq \ell' (X' \Omega^{-1} X)^{-1} \ell$ for all $\ell$, with at least one $\ell$ such that $Var(\ell' \tilde{\beta}) > \ell' (X' \Omega^{-1} X)^{-1} \ell$. We first note that we can write any other estimator of $\beta$ that is a linear combination of the data as
$$\tilde{\beta} = \left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1} + B\right] y + b_0,$$
where $B$ is a $p \times n$ matrix and $b_0$ is a $p \times 1$ vector of constants that appropriately adjust the GLS estimator to form the alternative estimator. Then
$$\begin{aligned}
E(\tilde{\beta}) &= E\left(\left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1} + B\right] y + b_0\right) \\
&= \left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1} + B\right] E(y) + b_0 \\
&= \left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1} + B\right] X\beta + b_0 \\
&= (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} X \beta + BX\beta + b_0 \\
&= \beta + BX\beta + b_0.
\end{aligned}$$

Consequently, $\tilde{\beta}$ is unbiased if and only if both $b_0 = 0$ and $BX = 0$. The covariance matrix of $\tilde{\beta}$ is

$$\begin{aligned}
V(\tilde{\beta}) &= Var\left(\left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1} + B\right] y\right) \\
&= \left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1} + B\right] V(y) \left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1} + B\right]' \\
&= \left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1} + B\right] \Omega \left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1} + B\right]' \\
&= \left[(X' \Omega^{-1} X)^{-1} X' \Omega^{-1} + B\right] \left[\Omega\,\Omega^{-1} X (X' \Omega^{-1} X)^{-1} + \Omega B'\right] \\
&= (X' \Omega^{-1} X)^{-1} + B \Omega B'
\end{aligned}$$
because $BX = 0$, which implies that $(BX)' = X'B' = 0$. Then

Var('ββ )= ' V ( )

='('X Ω−−11 X ) +Ω BB ' () −− ='(X ' Ω11 X )  +Ω ' BB ' 

=Var('βˆ ) +Ω ' B B '. 

We note that $\Omega$ is a positive definite matrix. Consequently, there exists some nonsingular matrix $K$ such that $\Omega = KK'$. As a result, $B \Omega B' = BKK'B'$ is at least a positive semidefinite matrix; hence $\ell' B \Omega B' \ell \geq 0$. Next note that we can define $\ell^* = K'B'\ell$. As a result,
$$\ell' B \Omega B' \ell = \ell^{*\prime} \ell^{*} = \sum_{i=1}^{n} \ell_i^{*2},$$
which must be strictly greater than $0$ for some $\ell \neq 0$ unless $B = 0$.

For $\ell = (0, 0, \ldots, 0, 1, 0, \ldots, 0)'$, where the $1$ occurs at the $i$th place, $\ell' \hat{\beta} = \hat{\beta}_i$ is the best linear unbiased estimator of $\ell' \beta = \beta_i$ for all $i = 1, 2, \ldots, p$. Thus, the GLS estimator of $\beta$ is the best linear unbiased estimator.
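As a numerical sanity check on the theorem, using the same made-up AR(1) $\Omega$ as in the earlier sketches, the variance of any $\ell'\hat{\beta}$ under GLS should not exceed that of the corresponding OLS combination:

```python
import numpy as np

rng = np.random.default_rng(4)
n, rho = 20, 0.6
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Omega = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
Omega_inv = np.linalg.inv(Omega)

V_gls = np.linalg.inv(X.T @ Omega_inv @ X)   # (X' Omega^{-1} X)^{-1}
XtX_inv = np.linalg.inv(X.T @ X)
V_ols = XtX_inv @ X.T @ Omega @ X @ XtX_inv  # covariance of OLS under V(eps) = Omega

ell = np.array([0.0, 1.0])                   # picks out the slope coefficient
print(ell @ V_gls @ ell, "<=", ell @ V_ols @ ell)
```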


Weighted least squares estimation

When the $\varepsilon$'s are uncorrelated and have unequal variances, then
$$V(\varepsilon) = \sigma^2 \Omega = \sigma^2 \begin{pmatrix} \dfrac{1}{\omega_1} & 0 & \cdots & 0 \\ 0 & \dfrac{1}{\omega_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \dfrac{1}{\omega_n} \end{pmatrix}.$$
The estimation procedure is usually called weighted least squares.

Let $W = \Omega^{-1}$; then the weighted least squares estimator of $\beta$ is obtained by solving the normal equation
$$(X' W X) \hat{\beta} = X' W y$$
which gives
$$\hat{\beta} = (X' W X)^{-1} X' W y$$
where $\omega_1, \omega_2, \ldots, \omega_n$ are called the weights.

The observations with large variances usually have smaller weights than the observations with small variances.
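A minimal sketch of weighted least squares under these assumptions (heteroscedastic, uncorrelated errors with made-up variances), also confirming the equivalence with OLS on rows scaled by the square-root weights:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
var_i = rng.uniform(0.5, 4.0, size=n)                 # unequal error variances
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n) * np.sqrt(var_i)

W = np.diag(1.0 / var_i)                              # W = Omega^{-1}: weights 1/Var(y_i)
beta_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)  # solves (X'WX) beta = X'Wy

# Equivalent route: OLS after scaling each row by sqrt(w_i).
sw = 1.0 / np.sqrt(var_i)
beta_check = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]

print(np.allclose(beta_wls, beta_check))              # True
```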