1 the Delta Method

Econ 715 Lecture 6 Testing Nonlinear Restrictions1 The previous lectures prepare us for the tests of nonlinear restrictions of the form: H0 : h(0) = 0 versus H1 : h(0) = 0: (1) 6 In this lecture, we consider Wald, Lagrange multiplier (LM), and quasi-likelihood ratio (QLR) test. The Delta Method will be useful in constructing those tests, especially the Wald test. 1 The Delta Method The delta method can be used to …nd the asymptotic distribution of h(n), suitably normalized, if dn(n 0) d Z: ! b Theoremb (-method): Suppose dn(n 0) d Y where n and Y are random k-vectors, 0 is a ! non-random k-vector, and dn : n 1 is a sequence of scalar constants that diverges to in…nity fk `b g b @ as n . Suppose h( ): R R is di¤erentiable at 0, i.e., h() = h(0) + h(0)( 0) + ! 1 ! @0 o( 0 ) as 0. Then, jj jj ! d @ dn(h(n) h(0)) h(0)Y: ! @0 @ b @ @ If Y N(0;V ), then h(0)Y N 0; h(0)V ( h(0))0 . @0 @0 @0 Proof: By the assumption of di¤erentiability of h at 0, we have @ dn(h(n) h(0)) = h(0)dn(n 0) + dno( n 0 ): @0 jj jj b b @ b The …rst term on the right-hand side converges in distribution to h(0)Y: So, we have the desired @0 result provided dno( n 0 ) = op(1). This holds because jj jj dbno( n 0 ) = dn(n 0) o(1) = Op(1)o(1) = op(1) jj jj jj jj b b and the proof is complete. 1 The notes for this lecture is largely adapted from the notes of Donald Andrews on the same topic I am grateful for Professor Andrews’generosity and elegant exposition. All errors are mine. Xiaoxia Shi Page: 1 Econ 715 2 Tests for Nonlinear Restrictions r The R -valued function h( ) de…ning the restrictions and the matrix 0 (de…ned in Assumption CF) are assumed to satisfy: Assumption R: (i) h() is continuously di¤erentiable on a neighborhood of 0. @ (ii) H = h(0) has full rank r ( d; where d is the dimension of ): @0 (iii) 0 is positive de…nite. For example, we might have h() = 1 2; h() = 12 3, or h() = 1 : (2) 2 + 3 1 ! The requirement that 0 is positive de…nite ensures that the asymptotic covariance matrix, 1 1 B 0B , of pn(n 0) is positive de…nite. 0 0 The Wald statistic for testing H0 is de…ned to be b 1 1 1 n = nh(n)0(HnB nB H0 ) h(n), where W n n n @ Hn = h(n) (3) @0 b b b b b b b b b and Bn and n are consistent estimators of B0 and 0 respectively. See Lecture 5 for the choice of Bn and n: b b As de…ned, the Wald statistic is a quadratic form in the (unrestricted) estimator h(n) of the b b restrictions h(0). The quadratic form has a positive de…nite (pd) weight matrix. If H0 is true, b then h(n) should be close to zero and the quadratic form n in h(n) also should be close to zero. W On the other hand, if H0 is false, h(n) should be close to h(0) = 0 and the quadratic form n b 6 b W in h(n) should be di¤erent from zero. Thus, the Wald test rejects H0 when its value is su¢ ciently b large. Large sample critical values are provided by Theorem 6.1 below. b The LM and QLR statistics depend on a restricted extremum estimator of 0. The restricted extremum estimator, n, is an estimator that satis…es 1=2 n ; h(en) = 0, and Q^n(n) inf Q^n(): ; h() = 0 + op(n ): (4) 2 f 2 g e e e Xiaoxia Shi Page: 2 Econ 715 The LM statistic is de…ned by @ ^ 1 1 1 1 1 @ ^ n = n Qn(n)Bn Hn0 (HnBn nBn Hn0 ) HnBn Qn(n), where LM @0 @ @ Hn = h(n)e e e e e e e e e e e (5) @0 e e and Bn and n are consistent estimators under H0 of B0 and 0, respectively, that are constructed using the restricted estimator n rather than n: e e The LM statistic is a quadratic form in the vector of derivatives of the criterion function eval- e b uated at the restricted estimator n. The quadratic form has a pd weight matrix. If the null hypothesis is true, then the unrestricted estimator should be close to satisfying the restrictions. e In this case, the restricted and unrestricted estimators, n and n, should be close. In turn, this implies that the derivative of the criterion function at n and n should be close. The latter, @ ^ e b ^ @ Qn(n), equals zero by the …rst-order conditions for unrestricted minimization of Qn() over . @ ^ e1=2 b (More precisely, @ Qn(n) is only close to zero, viz., op(n ), by Assumption EE2, since n is not b ^ @ ^ required to exactly minimize Qn() just approximately minimize it.) Then, @ Qn(n) should be b @ b close to zero under H0 and the quadratic form, n, in Q^n(n) also should be close to zero. On LM @ @ ^ e the other hand, if H0 is false, then n and n should not be close, @ Qn(n) should not be close to @ ^ e zero, and the quadratic form, n, in @ Qn(n) also should not be close to zero. In consequence, LM b e e the LM test rejects H0 when the LMn statistic is su¢ ciently large. Asymptotic critical values for e n are determined by Theorem 6.1 below. LM Note that n can be computed without computing n. This can have computational advan- LM tages if the criterion function is simpler when the restrictions h() are imposed than when unre- b stricted. For example, if the unrestricted model is a nonlinear regression model and the restricted model is a linear model, then the LS estimator has a closed-form solution under the restrictions, but not otherwise. Next, we consider the QLR statistic. It has the proper asymptotic null distribution only when the following assumption holds: Assumption QLR: (i) 0 = cB0 for some scalar constant c = 0: 6 (ii) cn p c for some sequence of non-zero random variables cn : n 1 : ! f g Forb example, in the ML example with correctly speci…ed model,b the information matrix equality yields 0 = B0, so Assumption QLR holds with c = cn = 1. In the LS example with conditionally 2 2 homoskedastic errors (i.e., E(Ui Xi) = 0 a.s. and E(U Xi) = a.s.), Assumption QLR holds with j i j c = 2 and b 1 n 2 cn = (Yi g(Xi; n)) : (6) n i=1 P b b Xiaoxia Shi Page: 3 Econ 715 1 In the GMM, MD, and TS examples with optimal weight matrix (e.g., A0A = V0 for the GMM and 1 MD estimators, see Lecture 5), Assumption QLR holds with c = c = 1, since 0 = B0 = 0V0 0. The QLR statistic is de…ned by b n = 2n(Q^n(n) Q^n(n))=cn: (7) QLR When H0 is true, the restricted and unrestrictede estimators,b b n and n, should be close and, hence, so should the value of the criterion function evaluated at these parameter values. Thus, e b under H0, the statistic n should be close to zero. Under H1, n should be noticeably QLR QLR larger than zero, since the minimization of Q^n() over the restricted parameter space, which does not include the minimum value 0 of Q(), should leave Q^n(n) noticeably larger than Q^n(n). In consequence, the QLR test rejects H0 when n is su¢ ciently large. Asymptotic critical values QLR e b are provided by Theorem 6.1 below. For the ML example, the QLR statistic equals the standard likelihood ratio statistic, viz., minus two times the logarithm of the likelihood ratio. When Assumption QLR holds, the n statistic simpli…es to LM @ ^ 1 @ ^ n = n Qn(n)Bn Qn(n)=cn or LM @0 @ @ ^ 1 @ ^ n = n Qn(en) en Qn(en) b (8) LM @0 @ e@ e e (since in this case one can take n = cnBn and Q^n(n) = H0 n for some vector n of Lagrange @ n multipliers). e e e For the LM and QLR tests,e we useb thee following assumption.e p Assumption REE: n 0 under H0: ! This assumption can bee veri…ed using the results of Lecture 3 with the parameter space replaced by the restricted parameter space = : h() = 0 : f 2 g For the Wald and LM tests, we assume that consistent estimators under H0 of B0 and 0 are used in constructing the test statistics:e p p Assumption COV: For the Wald statistic, Bn B0 and n 0 under H0. For the LM p p ! ! statistic, Bn B0 and n 0 under H0: ! ! b b Estimatorse that satisfy thise assumption are discussed in Lecture 5. Xiaoxia Shi Page: 4 Econ 715 Theorem 6.1: (a) Under Assumptions EE2, CF, R, and COV, d 2 n under H0; W ! r 2 where r denotes a chi-square distribution with r degrees of freedom. (b) Under Assumptions EE2, CF, R, COV, and REE, d 2 n under H0: LM ! r (c) Under Assumptions EE2, CF, R, COV, REE, and QLR, d 2 n under H0: QLR ! r r r 2 One rejects H0 using the Wald test if Wn > k , where k is the 1 quantile of the r distribution.

Load more