GMM Estimation and Testing

GMM Estimation and Testing Whitney Newey October 2007 Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. Idea: Estimate parameters by setting sample moments to be close to population counterpart. Definitions: β : p 1 parameter vector, with true value β . × 0 g (β)=g (w ,β):m 1 vector of functions of ith data observation w and paramet i i × i Model (or moment restriction): E[gi(β0)] = 0. Definitions: n def gˆ(β) = gi(β)/n : Sample averages. iX=1 Aˆ : m m positive semi-definite matrix. × GMM ESTIMATOR: ˆ β =argmingˆ(β)0Aˆgˆ(β). β Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. GMM ESTIMATOR: ˆ β =argmingˆ(β)0Aˆgˆ(β). β Interpretation: Choosing βˆ so sample moments are close to zero. For g = g Agˆ , same as minimizing gˆ(β) 0 . k kAˆ 0 k − kAˆ q When m = p,theβ ˆ with gˆ(βˆ)=0will be the GMM estimator for any Aˆ When m>p then Aˆ matters. Methodo f MomentsisSpecial Case: j Moments : E[y ]=hj(β0), (1 j p), ≤ ≤p Specify moment functions : g (β)=(y h1(β),...,y hp(β)) , i i − i − 0 Estimator :ˆg(βˆ)=0same as yj = h (βˆ), (1 j p). j ≤ ≤ Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. Two-stage Least Squares as GMM: Model : yi = Xi0β + εi,E[Ziεi]=0, (i =1,...,n). Specify moment functions : g (β)=Z (y X β). i i i − i0 n Sample moments are :ˆg(β)= Z (y X β)/n = Z (y Xβ)/n i i − i0 0 − iX=1 Z =[Z1,...,Zn]0,X =[X1,...,Xn]0,y =(y1,...,yn)0. 1 Specify distance (weighting) matrix : Aˆ =(Z0Z/n)− . ˆ 1 GMM estimator is 2SLS : β =argmin[(y Xβ)0Z/n](Z0Z/n)− Z0(y Xβ)/n β − − 1 =argmin(y Xβ)0Z(Z0Z)− Z0(y Xβ). β − − Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. Intertemporal CAPM Nonlinear in parameters. ci consumption at time i, Ri is asset return between i and i+1, α0 is time discount factor, u(c, γ0) utility function; Ii is variables observed at time i; Agent maximizes ∞ j E[ α0− u(ci+j,γ0)] jX=0 subject to intertemporal budget constraint. Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. First-order conditions are E[R α0 uc(c ,γ )/uc(c ,γ ) I ]=1. i · · i+1 0 i 0 | i Marginal rate of substitution between i and i +1 equals rate of return (in expected value). Let Z denote a m 1 vector of variables observable at time i (like lagged con- i × sumption, lagged returns, and nonlinear functions of these). Moment function is g (β)=Z R α uc(c ,γ )/uc(c ,γ ) 1 . i i{ i · · i+1 0 i 0 − } Here GMM is nonlinear instrumental variables. Empirical Example: Hansen and Singleton (1982, Econometrica). Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. Dynamic Panel Data It is a simple model that is important starting point for microeconomic (e.g. firm investment) and macroeconomic (e.g. cross-country growth) applications is E[yit yi,t 1,yi,t 2,...,yi0,αi]=β0yi,t 1 + αi, | − − − αi is unobserved individual effect. Microeconomic application is firm investment (with additional covariates). Macroeconomic is cross-country growth equations. Let ηit = yit E[yit yi,t 1,...,yi0,αi]. − | − E[yi,t jηit]=0, (1 j t, t =1, ..., T ), − ≤ ≤ E[αiηit]=0, (t =1,...,T) . Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. ∆ denote first difference, i.e. ∆yit = yit yi,t 1, so ∆yit = β0∆yi,t 1 + ∆ηit. − − − Then E[yi,t j(∆yit β0∆yi,t 1)] = 0, (2 j t, t =1,...,T) . − − − ≤ ≤ IV moment conditions. Twice (and more) lagged levels of yit canbeusedas instruments for differenced equations. Different instruments for different residuals. Additional moment conditions from orthogonality of αi and ηit.Theya re E[(yiT β0yi,T 1)(∆yit β0∆yi,t 1)] = 0, (t =2, ..., T 1). − − − − − These are nonlinear. Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. Combine moment conditions by stacking. Let yi0 t . gi (β)=⎛ . ⎞ (∆yit β∆yi,t 1), (t =2,...,T) , − − yi,t 2 ⎜ ⎟ ⎝ − ⎠ ∆yi2 β∆yi1 α −. gi (β)=⎛ . ⎞ (yiT βyi,T 1). − − ∆yi,T 1 β∆yi,T 2 ⎜ − ⎟ ⎝ − − ⎠ These moment functions can be combined as 2 T α gi(β)=(gi (β)0, ..., g i (β)0,gi (β)0)0. Here there are T (T 1)/2+(T 2) moment restrictions. − − Ahn and Schmidt (1995, Journal of Econometrics) show that the addition of the α nonlinear moment condition gi (β) to the IV ones often gives substantial asymptotic efficiency improvements. Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. Arellano and Bond approach also better is small samples: ∆yi,1 t ∆yi,2 gi(β)=⎛ . ⎞ (yit βyi,t 1) . − − ⎜ ⎟ ⎜ ∆y ⎟ ⎜ i,t 1 ⎟ ⎝ − ⎠ Assumes that have representation ∞ ∞ yit = atjyi,t j + btjηi,t j + Cαi. − − jX=1 jX=1 Hahn, Hausman, Kuersteiner approach: Long differences yi0 yi2 βyi1 gi(β)=⎛ −. ⎞ (yiT yi1 β(yi,T 1 yi0)] . − − − − ⎜ ⎟ ⎜ y βy ⎟ ⎜ i,T 1 i,T 2 ⎟ ⎝ − − − ⎠ Has better small sample properties by getting most of the information with fewer moment conditions. Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. IDENTIFICATION: Identification precedes estimation. If a parameter is not identified then no consistent estimator exists. Identification from moment functions: Let g¯(β)=E[gi(β)]. Identification condition is β0 IS THE ONLY SOLUTION TO g¯(β)=0 Necessary condition for identification is that m p. ≥ When m<p,i.e. there are fewer equations to solve than parameters, there will typically be multiple solutions to the moment conditions. In IV need at least as many instrumental variables as right-hand side variables. Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. Rank condition for identification: Let G = E[∂gi(β0)/∂β]. Rank condition is rank(G)=p. Necessary and sufficient for identification when gi(β) is linear in β. 0=g¯(β)=g¯(β )+G(β β )=G(β β ) 0 − 0 − 0 has a unique solution at β0 if and only if the rank of G equals the number of columns of G. For IV G = E[Z X ] so that rank(G)=p is usual rank condition. − i i0 Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. In the general nonlinear case it is difficult to specify conditions for uniqueness of the solution to g¯(β)=0. rank(G)=p will be sufficient for local identification. There exists a neighborhood of β0 such that β0 is the unique solution to g¯(β) for all β in that neighborhood. Exact identification is m = p. For IV as many instruments as right-hand side variables. ˆ Here the GMM estimator will satisfy gˆ(β)=0asymptotically; see notes. Overidentification is m>p. Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. TWO STEP OPTIMAL GMM ESTIMATOR: When m>pGMM estimator depends on weighting matrix Aˆ. Optimal Aˆ minimizes asymptotic variance of βˆ. n Let Ω be asymptotic variance of √ngˆ(β0)= i=1 gi(β0)/√n,i.e. d P √ngˆ(β ) N(0, Ω). 0 −→ In general, central limit theorem for time series gives Ω = lim E[ngˆ(β )ˆg(β ) ]. n 0 0 0 −→ ∞ Optimal Aˆ satisfies p Aˆ Ω 1 . −→ − Take ˆ 1 Aˆ = Ω− ˆ for Ω a consistent estimator of Ω. Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. Estimating Ω. Let β˜ be preliminary GMM estimator (known Aˆ). Let g˜ = g (β˜) gˆ(β˜) i i − If E[gi(β0)gi+(β0)0]=0thenf orsomepre n ˆ Ω = g˜ig˜i0/n. iX=1 Example is CAPM model above. Cite as: Whitney Newey, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. If have autocorrelation in moment conditions then L ˆ ˆ ˆ ˆ Ω = Λ0 + wL(Λ + Λ0 ), X=1 n ˆ − Λ = g˜ig˜i0+/n.

GMM Estimation and Testing

Instrumental Regression in Partially Linear Models

And Approximately Correct: Estimating Mixed Demand Systems

Efficiency Gains by Modifying GMM Estimation in Linear Models Under Heteroskedasticity

Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain

Strengthening Weak Instruments by Modeling Compliance∗

Valid Post-Selection and Post-Regularization Inference: an Elementary, General Approach

Demand Estimation with High-Dimensional Product Characteristics

Nber Working Paper Series Identification And

Identification and Inference with Many Invalid Instruments

Two-Step Estimation, Optimal Moment Conditions, and Sample Selection Models

GENERALIZED METHOD of MOMENTS Whitney K

Optimal Instruments in Time Series: a Survey