
M-estimators

Gabriel V. Montes-Rojas

Set-up

Let m(x, θ) be a parametric model for E(y|x) (we will also use it later for other conditional moments, such as Qτ(y|x) in quantile regression), where m is a known function of x and θ.

x is a K × 1 vector of explanatory variables.
θ is a P × 1 parameter vector, θ ∈ Θ ⊆ R^P.

A model is correctly specified for the conditional mean E(y|x) if, for some θ0 ∈ Θ, E(y|x) = m(x, θ0).


Let q(w, θ) be a function of the random vector w and the parameter vector θ ∈ Θ. In general we will have w = (y, x), with typical element w_i from a sample {w_i : i = 1, 2, ..., N}.

q is a known function of w and θ.
θ is a P × 1 parameter vector, θ ∈ Θ ⊆ R^P.

An M-estimator of θ0 solves the problem

min_{θ∈Θ} N^{-1} ∑_{i=1}^N q(w_i, θ),

that is, θ̂ = arg min_{θ∈Θ} N^{-1} ∑_{i=1}^N q(w_i, θ).

Define the functions θ ↦ M_N(θ), where M_N(θ) ≡ N^{-1} ∑_{i=1}^N q(w_i, θ), and θ ↦ M(θ), where M(θ) ≡ E[q(w, θ)].

Note that θ̂ is a random variable, as it depends on the sample {w_i : i = 1, 2, ..., N}. Sometimes we will use the notation θ̂_N to emphasize that it depends on the sample.
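As a minimal numerical sketch (not part of the original slides), an M-estimator can be computed by handing M_N(θ) to a generic optimizer. The data-generating process, sample size, and starting value below are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative DGP: y = x'theta0 + u (assumed for the demo only)
rng = np.random.default_rng(0)
N, P = 500, 2
x = rng.normal(size=(N, P))
theta0 = np.array([1.0, -0.5])
y = x @ theta0 + rng.normal(size=N)

def q(y_i, x_i, theta):
    # Quadratic loss: the OLS objective q(w, theta) = (y - x'theta)^2
    return (y_i - x_i @ theta) ** 2

def M_N(theta):
    # Sample objective M_N(theta) = N^{-1} sum_i q(w_i, theta)
    return np.mean(q(y, x, theta))

theta_hat = minimize(M_N, x0=np.zeros(P)).x  # theta_hat ~ theta0 for large N
```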


It is generally assumed that

θ0 = arg min_{θ∈Θ} E[q(w, θ)].

E[q(w, θ)] can be seen in statistical terms as a loss function. Examples of loss functions are:
- quadratic: (·)², which is the basis of OLS estimators;
- absolute value: |·|, which is the basis of median regression estimators.
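A quick illustration (my addition, with an assumed skewed distribution): the quadratic loss is minimized at the mean, while the absolute-value loss is minimized at the median.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative draw from a skewed distribution (assumption for the demo)
rng = np.random.default_rng(1)
y = rng.exponential(scale=2.0, size=10_000)

# Quadratic loss is minimized at the sample mean ...
m_quad = minimize_scalar(lambda c: np.mean((y - c) ** 2)).x
# ... absolute-value loss at the sample median.
m_abs = minimize_scalar(lambda c: np.mean(np.abs(y - c))).x

print(m_quad, np.mean(y))   # approximately equal
print(m_abs, np.median(y))  # approximately equal
```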

Identification

Identification requires that there is a unique solution to the minimisation problem:

E[q(w, θ0)] < E[q(w, θ)], for all θ ∈ Θ, θ ≠ θ0.

Consistency

Because θ̂ depends on the whole function θ ↦ M_N(θ), we require an appropriate “functional convergence”.

We require uniform convergence in probability:

max_{θ∈Θ} |N^{-1} ∑_{i=1}^N q(w_i, θ) − E[q(w, θ)]| →p 0.

Uniform convergence is a difficult and technical issue.

Uniform Weak Law of Large Numbers

Uniform Weak Law of Large Numbers: Let w be a random vector taking values in W ⊂ R^M, let Θ ⊂ R^P, and let q : W × Θ → R be a real-valued function. Assume that
(a) Θ is compact;
(b) for each θ ∈ Θ, q(·, θ) is Borel measurable on W;
(c) for each w ∈ W, q(w, ·) is continuous on Θ; and
(d) |q(w, θ)| ≤ b(w) for all θ ∈ Θ, where b is a nonnegative function on W such that E[b(w)] < ∞.
Then

max_{θ∈Θ} |N^{-1} ∑_{i=1}^N q(w_i, θ) − E[q(w, θ)]| →p 0.
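A small numerical illustration (my addition, under an assumed location-model loss): the maximal gap between M_N and M over a grid covering a compact Θ shrinks as N grows.

```python
import numpy as np

# Illustrative check of the UWLLN for q(w, theta) = (w - theta)^2,
# with w ~ N(0, 1), so E[q(w, theta)] = 1 + theta^2.
rng = np.random.default_rng(2)
theta_grid = np.linspace(-2.0, 2.0, 201)  # compact Theta = [-2, 2]

for N in (100, 1_000, 10_000):
    w = rng.normal(size=N)
    M_N = np.mean((w[:, None] - theta_grid[None, :]) ** 2, axis=0)
    M = 1.0 + theta_grid ** 2
    print(N, np.max(np.abs(M_N - M)))  # sup-gap decreases with N
```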

Consistency

Consistency of M-estimators: Under the assumptions required for uniform convergence in probability above, and assuming identification holds, a random vector θ̂ that is an M-estimator (i.e. θ̂ = arg min_{θ∈Θ} N^{-1} ∑_{i=1}^N q(w_i, θ)) satisfies θ̂ →p θ0.

Suppose that θ̂ →p θ0, and assume that r(w, θ) satisfies the same assumptions as q(w, θ). Then, N^{-1} ∑_{i=1}^N r(w_i, θ̂) →p E[r(w, θ0)].

Example: OLS

In OLS models m(x, β) = E(y|x) = xβ and q(y, x, β) = [y − m(x, β)]². To check identification of OLS models, write y = xβ0 + u with E(u|x) = 0 and σ² ≡ E[u²], and evaluate the population objective:

E[q(y, x, β0)] = E[y² + m(x, β0)² − 2y·m(x, β0)]
= E[(xβ0)² + u² + (xβ0)² − 2(xβ0)²] = σ²

E[q(y, x, β)]_{β≠β0} = E[y² + m(x, β)² − 2y·m(x, β)]
= E[(xβ0)² + u² + (xβ)² − 2(xβ0)(xβ)]
= σ² + E[(xβ0 − xβ)²]

Then, E[q(y, x, β0)] < E[q(y, x, β)] for all β ∈ B, β ≠ β0, provided E[(xβ0 − xβ)²] > 0 whenever β ≠ β0 (which holds if E[x'x] has full rank).
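An illustrative simulation check (my addition, under an assumed normal design): the population objective, approximated by Monte Carlo, is smallest at β0.

```python
import numpy as np

# Simulated check that E[q(y, x, beta)] = sigma^2 + E[(x beta0 - x beta)^2]
# is minimized at beta0 (illustrative DGP: x ~ N(0, I), u ~ N(0, 1)).
rng = np.random.default_rng(3)
N = 200_000
x = rng.normal(size=(N, 2))
beta0 = np.array([1.0, -0.5])
y = x @ beta0 + rng.normal(size=N)

def Eq(beta):
    # Monte Carlo approximation to E[(y - x beta)^2]
    return np.mean((y - x @ np.asarray(beta)) ** 2)

print(Eq(beta0))        # ~ sigma^2 = 1
print(Eq([1.2, -0.5]))  # ~ 1 + E[(0.2 x1)^2] = 1.04 > 1
```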

Score of the objective function

If q(w, ·) is continuously differentiable on the interior of Θ, then with probability approaching one, θ̂ solves the first-order condition

∑_{i=1}^N s(w_i, θ̂) = 0,

where

s(w, θ) = ∇_θ q(w, θ)' = [∂q(w, θ)/∂θ_1, ∂q(w, θ)/∂θ_2, ..., ∂q(w, θ)/∂θ_P]'

is the score of the objective function. The condition above is usually satisfied with equality (but not always; see OLS vs. quantile regression).


If E[q(w, θ)] is continuously differentiable on the interior of Θ, then a necessary condition for the minimisation is

∇_θ E[q(w, θ)]|_{θ=θ0} = 0.

If the derivative and expectation can be interchanged (Dominated Convergence Theorem), then

E[∇_θ q(w, θ0)] = E[s(w, θ0)] = 0.

Moreover, if θ̂ →p θ0, then under standard regularity conditions

N^{-1} ∑_{i=1}^N s(w_i, θ̂) →p E[s(w, θ0)].

An estimator that is defined by

θ̂ = arg min_{θ∈Θ} ‖∑_{i=1}^N s(w_i, θ)‖_M

is called a Z-estimator (for “zero”). Here ‖·‖_M is a specified norm (see GMM). For this case we should assume a different identification condition:

E[s(w, θ)] = 0 iff θ = θ0.
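A minimal Z-estimator sketch (my addition): for the quadratic loss the score has a closed form, and the sample score can be handed to a root-finder. The DGP below is an illustrative assumption.

```python
import numpy as np
from scipy.optimize import root

# Z-estimator sketch: solve sum_i s(w_i, beta) = 0 for the OLS score
# s(w, beta) = -2 x'(y - x beta). Illustrative DGP below.
rng = np.random.default_rng(4)
N, P = 1_000, 2
x = rng.normal(size=(N, P))
beta0 = np.array([1.0, -0.5])
y = x @ beta0 + rng.normal(size=N)

def sample_score(beta):
    # N^{-1} sum_i s(w_i, beta); the -2 scaling does not affect the zero
    return -2.0 * x.T @ (y - x @ beta) / N

beta_hat = root(sample_score, x0=np.zeros(P)).x
print(beta_hat)  # ~ beta0, and identical to the OLS estimator here
```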

Hessian of the objective function

If q(w, ·) is twice continuously differentiable on the interior of Θ, then define the Hessian of the objective function,

H(w, θ) = ∇²_θ q(w, θ) = ∂²q(w, θ)/∂θ∂θ'

By a mean-value expansion of the score about θ0,

∑_{i=1}^N s(w_i, θ̂) = ∑_{i=1}^N s(w_i, θ0) + (∑_{i=1}^N Ḧ_i)(θ̂ − θ0),

where Ḧ_i ≡ H(w_i, θ̃) and θ̃ is a vector whose elements lie on the line segment between θ̂ and θ0.

Using similar arguments as for the score, if θ̂ →p θ0, then by asymptotic theory

N^{-1} ∑_{i=1}^N H(w_i, θ̂) →p E[H(w, θ0)].

For M-estimators, if θ0 is identified then E [H(w, θ0)] is positive definite.

Asymptotic expansions

The expansion about the true value θ0 plays a fundamental role in asymptotic analysis. Rearranging terms we obtain the first-order asymptotic expansion of θ̂:

√N(θ̂ − θ0) = (N^{-1} ∑_{i=1}^N Ḧ_i)^{-1} [−N^{-1/2} ∑_{i=1}^N s_i(θ0)]

where s_i(θ0) = s(w_i, θ0). Moreover, using A0 ≡ E[H(w, θ0)],

√ " N # ˆ −1 −1/2 N(θ − θ0) = A0 −N ∑ si (θ0) i=1   N !−1 " N # −1 ¨ −1 −1/2 +  N ∑ Hi − A0  −N ∑ si (θ0) i=1 i=1

" N # −1 −1/2 = A0 −N ∑ si (θ0) + op (1) · Op (1) i=1 " N # −1 −1/2 = A0 −N ∑ si (θ0) + op (1) i=1

Define the influence function of θ̂ as e(w_i, θ0) ≡ e_i(θ0) ≡ −A0^{-1} s_i(θ0), because this provides the “influence” of each particular i-th observation on the estimator.

Asymptotic normality

Asymptotic normality: In addition to the assumptions for consistency, assume that
(a) θ0 ∈ int(Θ);
(b) s(w, ·) is continuously differentiable on the interior of Θ for all w ∈ W;
(c) each element of H(w, θ) is bounded in absolute value by a function b(w), where E[b(w)] < ∞;
(d) A0 ≡ E[H(w, θ0)] is positive definite;
(e) E[s(w, θ0)] = 0; and
(f) each element of s(w, θ0) has finite second moment.
Then,

√N(θ̂ − θ0) →d N(0, A0^{-1} B0 A0^{-1}),

where B0 ≡ E[s(w, θ0) s(w, θ0)'] = Var[s(w, θ0)]. Thus,

Avar(θ̂) = A0^{-1} B0 A0^{-1} / N.

Note the sandwich formula A0^{-1} B0 A0^{-1}. This will appear many times later on.
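A generic sandwich-variance sketch (my addition): given per-observation scores and Hessians evaluated at θ̂, the estimated Avar follows the formula above. The helper name and array layout are my own conventions.

```python
import numpy as np

def sandwich_avar(scores, hessians):
    """Estimate Avar(theta_hat) = A0^{-1} B0 A0^{-1} / N.

    scores:   (N, P) array of s(w_i, theta_hat)
    hessians: (N, P, P) array of H(w_i, theta_hat)
    """
    N = scores.shape[0]
    A_hat = hessians.mean(axis=0)     # estimates A0 = E[H(w, theta0)]
    B_hat = scores.T @ scores / N     # estimates B0 = E[s s']
    A_inv = np.linalg.inv(A_hat)
    return A_inv @ B_hat @ A_inv / N  # sandwich divided by N

# Standard errors are the square roots of the diagonal:
# se = np.sqrt(np.diag(sandwich_avar(scores, hessians)))
```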

Delta method

The asymptotic expansion above can also be used for functions of the parameters. Consider a function θ ↦ g(θ). We would like to know the distribution of √N(g(θ̂) − g(θ0)). Assume that

√N(θ̂ − θ0) →d N(0, V_θ̂),

and that θ ↦ g(θ) is continuously differentiable at θ0 (i.e. the first derivatives exist and are continuous at θ0). Then,

√N(g(θ̂) − g(θ0)) →d N(0, ∇_θ g(θ0) V_θ̂ ∇_θ g(θ0)')

Note that by Slutsky’s theorem:

∇_θ g(θ̂) V̂_θ̂ ∇_θ g(θ̂)' →p ∇_θ g(θ0) V_θ̂ ∇_θ g(θ0)'

where V̂_θ̂ is a consistent estimator of V_θ̂.
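A generic delta-method sketch (my addition): for a differentiable g, the variance of g(θ̂) is obtained from the Jacobian at θ̂. The numerical forward-difference Jacobian is an implementation choice of mine, not part of the slides.

```python
import numpy as np

def delta_method_avar(g, theta_hat, V_hat, eps=1e-6):
    """Avar of g(theta_hat) via the delta method.

    g:         function R^P -> R^Q
    theta_hat: (P,) estimate
    V_hat:     (P, P) estimated Avar of theta_hat
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    g0 = np.atleast_1d(g(theta_hat))
    P, Q = theta_hat.size, g0.size
    G = np.empty((Q, P))  # Jacobian, grad g(theta_hat)
    for j in range(P):
        step = np.zeros(P)
        step[j] = eps
        G[:, j] = (np.atleast_1d(g(theta_hat + step)) - g0) / eps
    return G @ V_hat @ G.T  # grad g(theta) V grad g(theta)'
```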

Example: OLS

Note that for OLS models

s(w, β) = ∇_β q(w, β)' = −2x'(y − xβ),

which gives us the classical conditions E[x'u] = 0. The Hessian in this case is:

A0 = E[H(w, β0)] = 2E[x'x].

Moreover, assuming homoskedasticity,

B0 = E[s(w, β0) s(w, β0)'] = 4σ²E[x'x].

Then,

√N(β̂ − β0) →d N(0, σ²E[x'x]^{-1}).


If there is heteroskedasticity:

B0 = E[s(w, β0) s(w, β0)'] = 4E[u²x'x]

and then B0 is no longer proportional to A0, so the sandwich formula does not simplify. For this case,

√N(β̂ − β0) →d N(0, E[x'x]^{-1} E[u²x'x] E[x'x]^{-1}).
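An illustrative computation (my addition) of the heteroskedasticity-robust variance for OLS; note that the factors of 2 in A0 and 4 in B0 cancel in the sandwich.

```python
import numpy as np

def ols_robust_avar(y, x):
    """Heteroskedasticity-robust (White) Avar of the OLS estimator:
    E[x'x]^{-1} E[u^2 x'x] E[x'x]^{-1} / N, using sample analogues."""
    N = x.shape[0]
    beta_hat = np.linalg.lstsq(x, y, rcond=None)[0]
    u_hat = y - x @ beta_hat
    Sxx_inv = np.linalg.inv(x.T @ x / N)        # (E[x'x])^{-1}
    meat = (x * u_hat[:, None] ** 2).T @ x / N  # E[u^2 x'x]
    return Sxx_inv @ meat @ Sxx_inv / N, beta_hat
```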

Example: Long-run elasticities

Consider a regression model

y_t = β0 + x_t β1 + y_{t−1} α1 + u_t,

where (y, x) are in logarithms.

The short-run effect (elasticity) of x on y is given by β1.
The long-run effect (elasticity) of x on y is given by β1/(1 − α1).
Exercise: obtain the distribution of β̂1/(1 − α̂1).
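A solution sketch (my addition, assuming |α1| < 1 and that V denotes the 2 × 2 asymptotic variance block of (β̂1, α̂1)), applying the delta method with g(β1, α1) = β1/(1 − α1):

```latex
% Delta method for the long-run elasticity g(\beta_1,\alpha_1)=\beta_1/(1-\alpha_1)
\nabla_\theta g(\beta_1,\alpha_1)
  = \left( \frac{1}{1-\alpha_1},\; \frac{\beta_1}{(1-\alpha_1)^2} \right)
\qquad \Longrightarrow \qquad
\sqrt{N}\left( \frac{\hat\beta_1}{1-\hat\alpha_1} - \frac{\beta_1}{1-\alpha_1} \right)
  \xrightarrow{d}
  N\!\left( 0,\; \nabla_\theta g(\beta_1,\alpha_1)\, V \,\nabla_\theta g(\beta_1,\alpha_1)' \right)
```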

Generalized method of moments

Let g(w, θ) be an L × 1 vector map, W × Θ → R^L. Let g_i(θ) ≡ g(w_i, θ).

Assume that E[g(w_i, θ0)] = 0.

A minimal condition to identify θ0 is L ≥ P.

When L = P, θ0 is estimated by setting the sample counterpart N^{-1} ∑_{i=1}^N g(w_i, θ) to zero. This is the OLS case, where g(w, θ) = x'(y − xβ).

When L > P, the generalized method of moments (GMM) uses an appropriate metric to minimize a quadratic form of ∑_{i=1}^N g(w_i, θ):

" N #0 " N # ˆ min ∑ g(wi , θ) Ξ ∑ g(wi , θ) θ∈Θ i=1 i=1

where Ξ̂ is an L × L symmetric, positive semidefinite weighting matrix. This is the case of IV estimators.

Let Q_N(θ) = [N^{-1} ∑_{i=1}^N g_i(θ)]' Ξ̂ [N^{-1} ∑_{i=1}^N g_i(θ)]. Under standard regularity conditions, Q_N(θ) converges uniformly to {E[g_i(θ)]}' Ξ0 {E[g_i(θ)]}, where Ξ̂ →p Ξ0.
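A linear IV-GMM sketch (my addition): moments g_i(β) = z_i'(y_i − x_i β) with an instrument matrix z (L ≥ P); the identity weighting matrix is an illustrative default, and the helper names are mine.

```python
import numpy as np
from scipy.optimize import minimize

def gmm_objective(beta, y, x, z, Xi):
    # Q_N(beta) = gbar' Xi gbar, with gbar = N^{-1} sum_i z_i'(y_i - x_i beta)
    gbar = z.T @ (y - x @ beta) / y.size
    return gbar @ Xi @ gbar

def gmm(y, x, z, Xi=None):
    # One-step GMM with a given L x L weighting matrix (identity by default)
    L, P = z.shape[1], x.shape[1]
    Xi = np.eye(L) if Xi is None else Xi
    res = minimize(gmm_objective, x0=np.zeros(P), args=(y, x, z, Xi))
    return res.x
```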


Under the assumption that g(w, ·) is continuously differentiable on int(Θ) and θ0 ∈ int(Θ), the first-order condition for θ̂ is

" N #0 " N # ˆ ˆ ˆ ∑ ∇θgi (θ) Ξ ∑ gi (θ) ≡ 0. i=1 i=1 Define L × P matrix with rank P, G0 ≡ E [∇θgi (θˆ)]; 0 P × P matrix with rank P, A0 ≡ G0ΞG0; 0 P × P matrix with rank P, B0 ≡ G0ΞΛ0ΞG0; 0 L × L matrix with rank L, Λ0 ≡ E [gi (θ0)gi (θ0) ] = Var[gi (θ0)]. Then, after some calculations

0 = G0'Ξ0 N^{-1/2} ∑_{i=1}^N g_i(θ0) + A0 √N(θ̂ − θ0) + o_p(1).

Then,

√N(θ̂ − θ0) = −A0^{-1} G0'Ξ0 N^{-1/2} ∑_{i=1}^N g_i(θ0) + o_p(1) →d N(0, A0^{-1} B0 A0^{-1})


The optimal (efficient) GMM estimator requires Ξ0 = Λ0^{-1}, because

(G0'Ξ0G0)^{-1} (G0'Ξ0Λ0Ξ0G0) (G0'Ξ0G0)^{-1} − (G0'Λ0^{-1}G0)^{-1},

i.e. the difference between the asymptotic variances of two hypothetical GMM estimators, can be shown to be positive semidefinite.

Note, however, that we cannot have an estimator of Λ0 before we estimate θ. We only require some consistent estimator, say θ̃, to estimate Λ0 first, and then plug the estimate into the GMM framework (two-step GMM).
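A two-step efficient-GMM sketch (my addition), reusing the gmm helper from the sketch above (run that block first); the first step with the identity weight gives a consistent θ̃, which yields Λ̂ and the second-step weight Λ̂^{-1}.

```python
import numpy as np

def two_step_gmm(y, x, z):
    # Step 1: consistent but inefficient estimate with identity weighting
    beta_tilde = gmm(y, x, z)              # gmm() from the sketch above
    # Estimate Lambda0 = E[g_i g_i'] at beta_tilde
    g = z * (y - x @ beta_tilde)[:, None]  # (N, L) moment contributions
    Lambda_hat = g.T @ g / y.size
    # Step 2: efficient GMM with weight Xi = Lambda_hat^{-1}
    return gmm(y, x, z, Xi=np.linalg.inv(Lambda_hat))
```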

References

These slides are based on Chapters 12, 13 and 14 of Wooldridge, J.M., Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press.

Newey, W.K., and McFadden, D. (1994), “Large Sample Estimation and Hypothesis Testing,” in Handbook of Econometrics, Volume 4, ed. R.F. Engle and D. McFadden. Amsterdam: North Holland, 2111–2245.

Van der Vaart, A.W. (1998), Asymptotic Statistics. Cambridge University Press.
