The local power of the gradient test Artur J. Lemonte, Silvia L. P. Ferrari Departamento de Estat´ıstica, Universidade de Sao˜ Paulo, Rua do Matao,˜ 1010, Sao˜ Paulo/SP, 05508-090, Brazil Abstract The asymptotic expansion of the distribution of the gradient test statistic is de- rived for a composite hypothesis under a sequence of Pitman alternative hypotheses converging to the null hypothesis at rate n¡1=2, n being the sample size. Compar- isons of the local powers of the gradient, likelihood ratio, Wald and score tests reveal no uniform superiority property. The power performance of all four criteria in one- parameter exponential family is examined. Key words: Asymptotic expansions; Chi-square distribution; Gradient test; Likeli- hood ratio test; Pitman alternative; Power function; Score test; Wald test. 1 Introduction The most commonly used large sample tests are the likelihood ratio (Wilks, 1938), Wald (Wald, 1943) and Rao score (Rao, 1948) tests. Recently, Terrell (2002) proposed a new test statistic that shares the same first order asymptotic properties with the likelihood ratio (LR), Wald (W ) and Rao score (SR) statistics. The new statistic, referred to as the gradient statistic (ST ), is markedly simple. In fact, Rao (2005) wrote: “The suggestion by Terrell is attractive as it is simple to compute. It would be of interest to investigate the performance of the [gradient] statistic.” The present paper goes in this direction. > Let x = (x1; : : : ; xn) be a random vector of n independent observations with prob- ability density function ¼(x j θ) that depends on a p-dimensional vector of unknown > parameters θ = (θ1; : : : ; θp) . Consider the problem of testing the composite null hy- > > > > pothesis H0 : θ2 = θ20 against H1 : θ2 6= θ20, where θ = (θ1 ; θ2 ) , θ1 = (θ1; : : : ; θq) > and θ2 = (θq+1; : : : ; θp) , θ20 representing a (p¡q)-dimensional fixed vector. Let ` be the Pn total log-likelihood function, i.e. ` = `(θ) = l=1 log ¼(xl j θ). Let U(θ) = @`=@θ = > > > (U1(θ) ; U2(θ) ) be the corresponding total score function partitioned following the partition of θ. The restricted and unrestricted maximum likelihood estimators of θ are b b> b> > e e> > > θ = (θ1 ; θ2 ) and θ = (θ1 ; θ20) , respectively. 1 The gradient statistic for testing H0 is e > b e ST = U(θ) (θ ¡ θ): e e > b Since U1(θ) = 0, ST = U2(θ) (θ2 ¡ θ20). Clearly, ST has a very simple form and does not involve knowledge of the information matrix, neither expected nor observed, and no matrices, unlike W and SR. Asymptotically, ST has a central chi-square distribution with p ¡ q degrees of freedom under H0. Terrell (2002) points out that the gradient statistic “is not transparently non-negative, even though it must be so asymptotically.” His Theorem 2 implies that if the log-likelihood function is concave and is differentiable at θe, then ST ¸ 0. In this paper we derive the asymptotic distribution of the gradient statistic for a com- posite null hypothesis under a sequence of Pitman alternatives converging to the null hypothesis at a convergence rate n¡1=2. In other words, the sequence of alternative hy- ¡1=2 > potheses is H1n : θ2 = θ20 + n ², where ² = (²q+1; : : : ; ²p) . Similar results for the likelihood ratio and Wald tests were obtained by Hayakawa (1975) and for the score test, by Harris & Peers (1980). Comparison of local power properties of the competing tests will be performed. Our results will be specialized to the case of the one-parameter exponential family. A brief discussion closes the paper. 2 Notation and preliminaries Our notation follows that of Hayakawa (1975, 1977). We introduce the following log- likelihood derivatives 2 3 ¡1=2 @` ¡1 @ ` ¡3=2 @ ` yr = n ; yrs = n ; yrst = n ; @θr @θr@θs @θr@θs@θt their arrays > y = (y1; : : : ; yp) ; Y = ((yrs)); Y::: = ((yrst)); the corresponding cumulants ·rs = E(yrs);·r;s = E(yrys); 1=2 1=2 1=2 ·rst = n E(yrst);·r;st = n E(yryst);·r;s;t = n E(yrysyt) and their arrays K = ((·r;s)); K::: = ((·rst)); K:;:: = ((·r;st)); K:;:;: = ((·r;s;t)): We make the same assumptions as in Hayakawa (1975). In particular, it is assumed that the ·’s are all O(1) and they are not functionally independent; for instance, ·r;s = ¡·rs. 2 Relations among them were first obtained by Bartlett (1953a,b). Also, it is assumed that Y is non-singular and that K is positive definite with inverse K¡1 = ((·r;s)) say. For triple-suffix quantities we use the following summation notation Xp Xp K::: ± a ± b ± c = ·rstarbsct; K:;:: ± M ± b = ·r;stmrsbt; r;s;t=1 r;s;t=1 where M is a p £ p matrix and a, b and c are p £ 1 column vectors. > > > The partition θ = (θ1 ; θ2 ) induces the corresponding partitions: " # " # " # Y Y K K K11 K12 Y = 11 12 ; K = 11 12 ; K¡1 = ; 21 22 Y21 Y22 K21 K22 K K > > > a = (a1 ; a2 ) , etc. Also, Xp Xp K2:: ± a2 ± b ± c = ·rstarbsct: r=q+1 s;t=1 Using a procedure analogous to that of Hayakawa (1975), we can write the asymptotic ¡1=2 expansion of ST for the composite hypothesis up to order n as 1 S = ¡(Zy + »)>Y (Zy + ») ¡ p K ± (Zy + ») ± Y ¡1y ± Y ¡1y T 2 n ::: 1 ¡ p K ± (Zy + ») ± (Z y ¡ ») ± (Z y ¡ ») + O (n¡1); 2 n ::: 0 0 p ¡1 where Z = Y ¡ Z0, " # " # ¡1 ¡1 Y11 0 Y11 Y12 Z0 = ; » = ²; 0 0 ¡Ip¡q Ip¡q being the identity matrix of order p ¡ q. We can now use a multivariate Edgeworth Type A series expansion of the joint density function of y and Y up to order n¡1=2 (Peers, 1971), which has the form · 1 f = f 1 + p (K ± K¡1y ± K¡1y ± K¡1y ¡ 3K ± K¡1 ± K¡1y) 1 0 6 n :;:;: :;:;: ¸ 1 ¡ p K ± K¡1y ± D + O(n¡1); n :;:: where ½ ¾ p 1 Y f = (2¼)¡p=2jKj¡1=2 exp ¡ y>K¡1y ±(y ¡ · ); 0 2 rs rs r;s=1 0 D = ((dbc)), dbc = ± (ybc ¡ ·bc)=±(ybc ¡ ·bc), with ±(¢) being the Dirac delta function (Bracewell, 1999), to obtain the moment generating function of ST , M(t) say. 3 ¡1=2 From f1 and the asymptotic expansion of ST up to order n , we arrive, after long algebra, at µ ¶ ¡ 1 (p¡q) t > M(t) = (1 ¡ 2t) 2 exp ² K ² 1 ¡ 2t 22:1 · ¸ 1 £ 1 + p (A d + A d2 + A d3) + O(n¡1); n 1 2 3 ¡1 where d = 2t=(1 ¡ 2t), K22:1 = K22 ¡ K21K11 K12, 1¡ ¢ A = ¡ K ± K¡1 ± ²¤ + 4K ± A ± ²¤ + K ± A ± ²¤ + K ± ²¤ ± ²¤ ± ²¤ ; 1 4 ::: :;:: ::: ::: 1¡ ¢ A = ¡ K ± K¡1 ± ²¤ ¡ K ± A ± ²¤ ¡ 2K ± ²¤ ± ²¤ ± ²¤ ; 2 4 ::: ::: :;:: " # " # ¡1 ¡1 1 ¤ ¤ ¤ ¤ K11 K12 K11 0 A3 = ¡ K::: ± ² ± ² ± ² ; ² = ²; A = : 12 ¡Ip¡q 0 0 ¡(p¡q)=2 > When n ! 1, M(t) ! (1¡2t) expf2t¸=(1¡2t)g, where ¸ = ² K22:1²=2, and hence the limiting distribution of ST is a non-central chi-square distribution with p ¡ q degrees of freedom and non-centrality parameter ¸. Under H0, i.e. when ² = 0, M(t) = ¡(p¡q)=2 ¡1 (1 ¡ 2t) + O(n ) and, as expected, ST has a central chi-square distribution with p ¡ q degrees of freedom up to an error of order n¡1. Also, from M(t) we may obtain the ¡1=2 first three moments of ST up to order n as 2A 8(A + A ) ¹0 (S ) = p ¡ q + ¸ + p 1 ; ¹ (S ) = 2(p ¡ q + 2¸) + 1p 2 1 T n 2 T n and 6(A + 2A + A ) ¹ (S ) = 8(p ¡ q + 3¸) + 1 p 2 3 : 3 T n 3 Main result The moment generating function of ST in a neighborhood of θ2 = θ20 can be written, after some algebra, as µ ¶ ¡ 1 (p¡q) t > y M(t) = (1 ¡ 2t) 2 exp ² K ² 1 ¡ 2t 22:1 · ¸ 1 X3 £ 1 + p a (1 ¡ 2t)¡k + O(n¡1); n k k=0 4 where 1© a = Ky ± (K¡1)y ± (²¤)y ¡ (4K + 3K )y ± Ay ± (²¤)y 1 4 ::: :;:: ::: y ¤ y ¤ y ¤ y ¡ 2(K::: + 2K:;::) ± (² ) ± (² ) ± (² ) y ¤ y ¤ yª ¡ 2(K2:: + K2;::) ± ² ± (² ) ± (² ) ; 1© (1) a = ¡ Ky ± (K¡1 ¡ A)y ± (²¤)y 2 4 ::: y ¤ y ¤ y ¤ yª ¡ (K::: + 2K:;::) ± (² ) ± (² ) ± (² ) ; 1 a = ¡ Ky ± (²¤)y ± (²¤)y ± (²¤)y; 3 12 ::: > > > and a0 = ¡(a1+a2+a3). The symbol “y” denotes evaluation at θ = (θ1 ; θ20) . Inverting M(t), we arrive at the following theorem, our main result. Theorem 1 The asymptotic expansion of the distribution of the gradient statistic for test- ing a composite hypothesis under a sequence of local alternatives converging to the null hypothesis at rate n¡1=2 is 1 X3 Pr(S · x) = G (x) + p a G (x) + O(n¡1); (2) T f;¸ n k f+2k;¸ k=0 where Gm;¸(x) is the cumulative distribution function of a non-central chi-square variate with m degrees of freedom and non-centrality parameter ¸. Here, f = p ¡ q, ¸ = > y ² K22:1²=2 and the ak’s are given in (1). If q = 0, the null hypothesis is simple, ²¤ = ¡² and A = 0. Therefore, an immediate consequence of Theorem 1 is the following corollary. Corollary 1 The asymptotic expansion of the distribution of the gradient statistic for test- ing a simple hypothesis under a sequence of local alternatives converging to the null hypothesis at rate n¡1=2 is given by (2) with f = p, ¸ = ²>Ky²=2 and 1 1© ª a = Ky ± ² ± ² ± ²; a = ¡ Ky ± (K¡1)y ± ² ¡ 2Ky ± ² ± ² ± ² ; 0 6 ::: 1 4 ::: :;:: 1© ª 1 a = Ky ± (K¡1)y ± ² ¡ (K + 2K )y ± ² ± ² ± ² ; a = Ky ± ² ± ² ± ²: 2 4 ::: ::: :;:: 3 12 ::: 4 Power comparisons between the rival tests To first order ST , LR, W and SR have the same asymptotic distributional properties under either the null or local alternative hypotheses.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages11 Page
-
File Size-