<<

of

Jakub Mućk

Meeting # 4

Jakub Mućk Econometrics of Panel Data Meeting # 4 1 / 26 Outline

1 Two-way Error Component Model Fixed effects model Random effects model

2 Hausman-Taylor

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 2 / 26 Two-way Error Component Model

Two-way Error Component Model

0 yit = α + Xitβ + uit i ∈ {1,..., N }, t ∈ {1,..., T}, (1)

the composite error component:

ui,t = µi + λt + εi,t, (2)

where:

I µi – the unobservable individual-specific effect; I λt – the unobservable time-specific effect; I ui,t – the remainder disturbance.

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 3 / 26 Fixed effects model

Two-way fixed effects model:

yit = αi + λt + β1x1it + ... + βkxkit + uit, (3)

where

I αi - the individual-specific intercept; I λi - the period-specific intercept; 2 I uit ∼ N 0, σu .

As in the case of the one-way fixed effect model, independent variables x1, ..., xk cannot be time invariant (for a given unit). The estimation: 1 the dummy variable estimator (LSDV); 2 the within estimator; 3 combination of the above methods.

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 4 / 26 The LSDV estimator can be obtained by estimating the following model:

N N X X yit = αjDjit + λτ Bτit + β1x1it + ... + βkxkit + ui,t. (6) j=1 τ=1

2 where ui,t ∼ N 0, σu . The parameters in equation (7) can be estimated with the OLS.

The LSDV estimator

In the analogous fashion to the one-way fixed effect model, we define the following dummy variables:

I for the j-th unit: ( 1 i = j D = , (4) jit 0 otherwise

I for the τ-th period: ( 1 t = τ B = . (5) τit 0 otherwise

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 5 / 26 The LSDV estimator

In the analogous fashion to the one-way fixed effect model, we define the following dummy variables:

I for the j-th unit: ( 1 i = j D = , (4) jit 0 otherwise

I for the τ-th period: ( 1 t = τ B = . (5) τit 0 otherwise The LSDV estimator can be obtained by estimating the following model:

N N X X yit = αjDjit + λτ Bτit + β1x1it + ... + βkxkit + ui,t. (6) j=1 τ=1

2 where ui,t ∼ N 0, σu . The parameters in equation (7) can be estimated with the OLS.

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 5 / 26 One might test only individual or period effects:

H0 : α1 = ... = αk , (10)

H0 : λ1 = ... = λk . (11) The construction of test for poolability is the same but we use different degrees of freedom.

The LSDV estimator: test for poolability

The LSDV estimator: N N X X yit = αj Djit + λτ Bτit + β1x1it + ... + βk xkit + ui,t , (7) j=1 τ=1

2 where ui,t ∼ N 0, σu . We can test whether the individual-specific and period-specific effects are significantly different: H0 : α1 = ... = αk ∧ λ1 = ... = λT = 0. (8) To verify the null described by (8) we compare the sum of squared errors from the pooled model with (with restrictions, SSER) with the sum of squared errors from the LSDV model (without restrictions, SSER). The test F: (SSE − SSE )/(N − T − 1) F = R U (9) SSEU /(NT − K − T)

if null is emph then F ∼ F(N−T−1,NT−K−T).

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 6 / 26 The LSDV estimator: test for poolability

The LSDV estimator: N N X X yit = αj Djit + λτ Bτit + β1x1it + ... + βk xkit + ui,t , (7) j=1 τ=1

2 where ui,t ∼ N 0, σu . We can test whether the individual-specific and period-specific effects are significantly different: H0 : α1 = ... = αk ∧ λ1 = ... = λT = 0. (8) To verify the null described by (8) we compare the sum of squared errors from the pooled model with (with restrictions, SSER) with the sum of squared errors from the LSDV model (without restrictions, SSER). The test statistics F: (SSE − SSE )/(N − T − 1) F = R U (9) SSEU /(NT − K − T)

if null is emph then F ∼ F(N−T−1,NT−K−T). One might test only individual or period effects:

H0 : α1 = ... = αk , (10)

H0 : λ1 = ... = λk . (11) The construction of test for poolability is the same but we use different degrees of freedom.

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 6 / 26 Given the above transformations we can eliminate both the individual- and period specific effects:

0 (yit − y¯i − y¯t +y ¯it) = β (xit − x¯i − x¯t +x ¯it) + (uit − u¯i − u¯t +u ¯it) . (12)

The within estimator

Consider the following transformations:

I averaging over time (for each unit i):y ¯i , I averaging over unit (for each period t):y ¯t ,

I averaging over time and unit:y ¯,

for the dependent variable yit:

T N T N 1 X 1 X 1 X X y¯ = y , y¯ = y , y¯ = y . i T it t N it NT it t i t i

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 7 / 26 The within estimator

Consider the following transformations:

I averaging over time (for each unit i):y ¯i , I averaging over unit (for each period t):y ¯t ,

I averaging over time and unit:y ¯,

for the dependent variable yit:

T N T N 1 X 1 X 1 X X y¯ = y , y¯ = y , y¯ = y . i T it t N it NT it t i t i

Given the above transformations we can eliminate both the individual- and period specific effects:

0 (yit − y¯i − y¯t +y ¯it) = β (xit − x¯i − x¯t +x ¯it) + (uit − u¯i − u¯t +u ¯it) . (12)

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 7 / 26 The within estimator

The within estimator:

y˜it = β1x˜1it + ... + βkx˜kit +u ˜it (13)

where

y˜it = yit − y¯i − y¯t +y ¯it

x˜1it = x1it − x¯1i − x¯1t +x ¯1it ......

x˜kit = xkit − x¯ki − x¯kt +x ¯kit

u˜it = uit − u¯i − u¯t +u ¯it .

The parameters β1, ..., βk can be estimated by the OLS.

The independent variables, i.e., x1it, ..., xkit, cannot be time (unit) invariant.

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 8 / 26 The independent variables can be time invariant. Individual-specific and period-specific effects are independent:

E (µi, µj) = 0 if i 6= j, E (λs, λt) = 0 if s 6= t E (µi, λt) = 0 Estimation method: GLS (generalized least squares).

Random effects model

Two-way random effects model:

yit = α + β1x1it + ... + βkxkit + uit, (14) where the error component is as follows:

uit = µi + λt + εit, (15) where 2  I µi is the individual-specific error component and µi ∼ N 0, σµ ; 2  I λt is the period-specific error component and λt ∼ N 0, σλ ; 2 I εit is the idiosyncratic error component and εit ∼ N 0, σε .

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 9 / 26 Random effects model

Two-way random effects model:

yit = α + β1x1it + ... + βkxkit + uit, (14) where the error component is as follows:

uit = µi + λt + εit, (15) where 2  I µi is the individual-specific error component and µi ∼ N 0, σµ ; 2  I λt is the period-specific error component and λt ∼ N 0, σλ ; 2 I εit is the idiosyncratic error component and εit ∼ N 0, σε . The independent variables can be time invariant. Individual-specific and period-specific effects are independent:

E (µi, µj) = 0 if i 6= j, E (λs, λt) = 0 if s 6= t E (µi, λt) = 0 Estimation method: GLS (generalized least squares).

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 9 / 26 The two-way RE model - the variance- of error termI

It is assumed that the error term in the two-way RE model can be described as follow: uit = µi + λt + εit, (16) 2  2 2 where µi ∼ N 0, σµ , λt ∼ N 0, σλ and εit ∼ N 0, σε . The variance-covariance matrix of the error term is not diagonal!. The diagonal elements of the variance covariance matrix of the error term:

2  2 E uit = E (µi + λt + εit ) 2 2 2  = E µi + E λt + E εit + 2cov (µi , λt ) + 2cov (µi , εit ) + 2cov (λt , εit ) | {z } | {z } | {z } | {z } | {z } | {z } =σ2 =σ2 =σ2 =0 =0 =0 µ λ ε 2 2 2 = σµ + σλ + σε .

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 10 / 26 The two-way RE model - the variance-covariance matrix of error termII

For a given unit, non-diagonal elements of the variance covariance matrix of the error term (t 6= s):

cov (uit , uis) = E [(µi + λt + εit )(µi + λs + εis)] , 2 = E µi + E (λt λs) + E (εit εis) + COV1 | {z } | {z } | {z } | {z } 2 =0 =0 =0 =σµ 2 = σµ,

where COV1 = cov(λt, µi)+cov(λs, µi)+cov(λt, εit)+cov(λs, εit)+2cov(µi, εit). Similarly, for a a given period, non-diagonal elements of the variance covari- ance matrix of the error term (i 6= j):

cov (uit , ujt ) = E [(µi + λt + εit )(µj + λt + εjt )] , 2 = E (µi µj ) + E λt + E (εit εjt ) + COV2 | {z } | {z } | {z } | {z } =0 =0 =σ2 =0 λ 2 = σλ.

where COV2 = cov(µi, λt)+cov(µi, λt)+cov(µi, εit)+cov(µj, εit)+2cov(λt, εit).

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 11 / 26 The two-way RE model - the variance-covariance matrix of error termIII

Finally, the variance-covariance matrix of the error term can be described:

 2 2 2 σµ + σλ + σε if i = j, t = s,  2  σµ if i = j, t 6= s, E (uit, ujs) = 2 (17)  σλ if i 6= j, t = s,  0 if i 6= j, t 6= s.

Unlike the one-way RE model, the variance-covariance matrix E (uit, ujs) is not block-diagonal because there is correlation between units. This correlation 2 2 2 2 is caused by the the period-specific effects and equals σλ/(σµ + σλ + σε ). Akin to the one-way RE model there is equicorrelation which is implied by 2 2 2 2 the unit-specific effects and equals σµ/(σµ + σλ + σε ).

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 12 / 26 The GLS estimation of the two-way RE model: example

Consider the following transformation:

∗ zit = zit − θ1¯zi − θ2¯zt + θ3¯zit, (18)

where zit ∈ {yit, x1it,..., xkit, uit} and

T N T N 1 X 1 X 1 X X ¯z = z , y¯ = z , ¯z = z , i T it t N it NT it t i t i and

σε θ1 = 1 − q 2 2 Tσµ + σε

σε θ2 = 1 − p 2 2 N σλ + σε σε θ3 = θ1 + θ2 + q − 1. 2 2 2 Tσµ + N σλ + σε .

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 13 / 26 The GLS estimation of the two-way RE model: Swamy and Aurora (1972)

Swamy and Aurora (1972) propose running three least squares regressions to 2 2 2 estimate σε , σµ and σλ which allow us to calculate θ1, θ2 and θ3. 1 The within regression allows to estimate the variance of the idiosyncratic com- 2 ponent, i.e.,σ ˆε . 2 The between (individuals) regression allows to estimate the σµ which is the estimated variance of the error term from the following regression:

y¯i − y¯ = β1 (¯x1i − x¯1) + ... + βk (¯xki − x¯k ) +u ¯i − u¯, 2 andσ ˆI is the estimated variance of the error term from the above regres- sion. Then, the estimated variance of the individual specific-error term can be described as follow σˆ2 − σˆ2 σˆ2 = I ε . µ T 3 The between (periods) regression allows to estimate the σλ which is the esti- mated variance of the error term from the following regression:

y¯t − y¯ = β1 (¯x1t − x¯1) + ... + βk (¯xkt − x¯k ) +u ¯t − u¯, 2 andσ ˆT is the estimated variance of the error term from the above regres- sion. Then, the estimated variance of the individual period-error term can be described as follow σˆ2 − σˆ2 σˆ2 = T ε . λ N Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 14 / 26 The GLS estimation of the two-way RE model: general remarks

2 2 2 Alternative of σε , σµ and σλ are proposed by Wallace and Hussain (1969), Amemiya (1971), Fuller and Battese (1974) and Nerlove (1971). 2 2 2 In general, estimates of σε , σµ and σλ allow to calculate θ1, θ2 and θ3 and estimate model for transformed variables:

∗ ∗ ∗ ∗ yit = β1x1it + ... + βkxkit + uit (19)

∗ where zit = zit − θ1¯zi − θ2¯zt + θ3¯zit and zit ∈ {yit, x1it,..., xkit, u1it}. 2 2 Computational warning: sometimes estimates of σµ or σλ could be nega- tive. This is due to the fact that we use two-step strategy rather than jointly estimation.

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 15 / 26 Two-way Error Component Model

yit = α + β1x1it + βkxkit + µi + λt + εit (20)

Random effects Fixed effects 2  Individual µi ∼ N 0, σµ µi =⇒ individual intercepts αi 2  time λt ∼ N 0, σλ λt effects drawn from the random sample αi and λt are assumed to be con- =⇒ we can estimate the param- stant over time 2 eters of distribution, i.e., σµ and 2 σλ Assumptions: (i) E (µi |εit ) = 0 ∧ E (λt |εit ) = 0 (i) E (µi |εit ) = 0 ∧ E (λt |εit ) = 0 (ii) E (µi |xit ) = 0 ∧ E (λt |xit ) = 0 individual-specific and period- specific effects are independent of the explanatory variable xit Estimation GLS OLS (within or LSDV) Efficiency higher lower Additional: impossible to use time invariant re- gressors (collinearity with αi )

Jakub Mućk Econometrics of Panel Data Two-way Error Component Model Meeting # 4 16 / 26 Outline

1 Two-way Error Component Model Fixed effects model Random effects model

2 Hausman-Taylor estimator

Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 17 / 26 Hausman-Taylor estimator

Let’s consider the following one-way RE model:

yit = x1itβ1 + x2itβ2 + z1iγ1 + z2iγ2 + µi + uit (21)

where: x1it are time-varying variables; not correlated with µi x2it are time-varying variables; correlated with µi z1i are time-invariant variables; not correlated with µi z2i are time-invariant variables; correlated with µi

The RE model estimates on γ2 are inconsistent. The estimator proposed by Hausman and Taylor (1981) takes into account the above correlation.

Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 18 / 26 The anatomy of the Hausman-Taylor estimator

First step: Within regression for the model including only time-variable regressors, both x1it and x2it. Here, the usual differences from the temporal mean are used:

(yit − y¯i) = β1(x1i, − x¯1i) + β2(x2it − x¯2i) + (uit − u¯i) (22)

Based on the expression above we can estimate variance of the idiosyncratic 2 error, i.e.,σ ˆε .

Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 19 / 26 The anatomy of the Hausman-Taylor estimator

Second step: construct the intra-temporal mean of the residuals from (22):

0 ¯e = [(¯e1, ¯e1,..., ¯e1),..., (¯eN , ¯eN ,..., ¯eN )] (23) | {z } | {z } T T

Then make TSLS for ¯ei using: variables: z1it (time invariant, not correlated with µi), z2it (time invariant, correlated with µi) instruments: z1it, x1it (time invariant, not correlated with µi) Specifically,

1 Regress z2it on z1it as well as x1it . 2 Use the predicted value from the above regression and create new matrix, i.e., Z = [z1it , ˆz21t ]. 3 Regress ¯ei on Z to get estimates of γ1 and γ2. 4 2 Calculate σTSLS,¯e the variance of the error components from the above regres- sion. Now, we can calculate the variation of the individual-specific error component: σ2 σ2 = σ2 − ε . (24) µ TSLS,¯e T

Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 20 / 26 The anatomy of the Hausman-Taylor estimator

2 2 Based on the estimates of σµ and σε calculate the conventional in the FGLS regression scale parameter θ:

s 2 σε θ = 2 −1 2 (25) σε + T σµ

Finally, do a TSLS regression of y∗ on X ∗ with instruments described by V :

∗ y = yit − θyit, (26) ∗ X = [x1it, x2it, z1i, z2i] − θ[x1it, x2it, z1i, z2i], (27)

V = [(x1it − x¯1i) , (x2it − x¯2i) , z1i, x¯1i], (28) more specifically: ∗ ∗ 1 Regress X on the instruments (V ) and obtain fitted values, i.e., Xˆ , ∗ ∗ 2 Regress y on the predicted values from the previous step, i.e, Xˆ , in order to get the estimates of [β, γ]. The estimates of the variance-covariance of the structural parameters are a little bit more complicated.

Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 21 / 26 Empirical example: PSID

Consider the following model:

lwageit = α + β1occit + β2southit + β3smsait + β4indit + (29)

+β5expit + β6exp2it + β7wksit + β8msit + (30)

+β9unionit + β10femi + β11blki + β12edi + uit (31)

Variable Description lwage log wage expit years of full-time work experience wksit weeks worked occit occupation; occit = 1 if in a blue-collar occupation indit industry; indit = 1 if working in a manufacturing industry southit residence; southit = 1 if in the South area smsait smsait = 1 if in the Standard metropolitan statistical area msit marital status femi female or male unionit if wage set be a union contract edi years of education blki black

Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 22 / 26 Empirical example: PSID

FE RE occ −0.022 −0.050∗∗∗ south −0.002 −0.017 smsa −0.043∗∗ −0.014 ind 0.019 0.004 exp 0.113∗∗∗ 0.082∗∗∗ exp2 0.000∗∗∗ −0.001∗∗∗ wks 0.001 0.001 ms −0.030 −0.075∗∗∗ union 0.033∗∗ 0.063∗∗∗ fem −0.339∗∗∗ blk −0.210∗∗∗ ed 0.100∗∗∗ cons 4.649∗∗∗ 4.264∗∗∗ 0.000 0.000 σµ 0.263 σε 0.152 ρ 0.749

Note: Note: the superscripts ∗∗∗, ∗∗ and ∗ denote the rejection of null about parameters’ insignificance at 1%, 5% and 10% significance level, respectively.

Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 23 / 26 Empirical example: PSID

Correlation between the estimated random unit-specific effect and independent variables xjit ρ(ˆµi, xjit) p-value occ −0.009 0.546 south 0.001 0.928 smsa 0.091 0.000 ind −0.035 0.023 exp −0.764 0.000 exp2 −0.737 0.000 wks 0.071 0.000 ms −0.034 0.026 union 0.039 0.011 fem 0.000 1.000 blk 0.000 1.000 ed 0.000 1.000

Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 24 / 26 Empirical example: PSID

Classification of the explanatory variables

uncorrelated with µi correlated with µi time-variant wks, south, ms, smsa occ, exp, exp2, ind, union time-invariant fem, blk ed

Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 25 / 26 Empirical example: PSID

FE RE HT occ −0.022 −0.050∗∗∗ −0.027 south −0.002 −0.017 0.007 smsa −0.043∗∗ −0.014 −0.042∗∗ ind 0.019 0.004 0.014 exp 0.113∗∗∗ 0.082∗∗∗ 0.113∗∗∗ exp2 0.000∗∗∗ −0.001∗∗∗ 0.000∗∗∗ wks 0.001 0.001 0.001 ms −0.030 −0.075∗∗∗ −0.030 union 0.033∗∗ 0.063∗∗∗ 0.033∗∗ fem −0.339∗∗∗ −0.131 blk −0.210∗∗∗ −0.286∗ ed 0.100∗∗∗ 0.138∗∗∗ cons 4.649∗∗∗ 4.264∗∗∗ 2.913∗∗∗ 0.000 0.000 σµ 0.263 0.942 σε 0.152 0.152 ρ 0.749 0.975

Note: Note: the superscripts ∗∗∗, ∗∗ and ∗ denote the rejection of null about parameters’ insignificance at 1%, 5% and 10% significance level, respectively.

Jakub Mućk Econometrics of Panel Data Hausman-Taylor estimator Meeting # 4 26 / 26