On the Power of the F-test for Linear Restrictions in a Regression Model
Willim E. Griffiths University of Melbourne
R. Carter Hill Louisiana State University
8 May 2020
Abstract
Consider testing a joint null hypothesis on regression coefficients where some of the restrictions in the null hypothesis are true and some are false. We show that an F-test that includes the valid restrictions frequently has greater power than the test that excludes the valid restrictions, a result that may be counterintuitive.
Keywords: noncentrality parameter; joint hypotheses; surplus restrictions
JEL codes: C12, C20
Corresponding author William Griffiths Department of Economics University of Melbourne Vic 3010 Australia [email protected]
2
1. Introduction
Consider the linear regression model E(|yX ) X with covariance matrix 2 I . We assume X is of dimension (N K ) with rank K , that NK , and that y is normally distributed, conditional on X.
We are concerned with the power of the F-test for a set of linear restrictions HR0 : r which can be written as
R11 r R22 r
where Rr11 1 and Rr22 2. That is, the first set of restrictions is false and the second set is true.
We assume that R1 is of dimension J1 K , R2 is J2 K , and J12JK.
We ask the question: which test has the greater power, the F-test for HR0 : r, or the F-test for
HR01: r 1? Since the latter test omits the superfluous valid restrictions Rr22 2, intuition suggests it should have the greater power. In fact, Ruud (2000, p.235) “proves” that the noncentrality
parameters for the two tests are equal, and argues that the power for HR01: r 1 will be greater because of the reduction in the numerator degrees of freedom. There is an error in Ruud’s proof, however. We show that, except under special circumstances, the noncentrality parameter for testing
HR01: r 1 is less than the non-centrality parameter for testing HR0 : r, and hence, whether
testing HR01: r 1 is more powerful will depend on the competing effects of changes in the noncentrality parameter, the degrees of freedom in the numerator, and the critical value of the test.
We then go on to ask: what happens if the model is re-estimated subject to the restrictions
Rr22 2 which are known to be true, and the F-test for HR01: r 1 is applied to this restricted model? We show that the noncentrality parameter for this F-test is identical to that for testing
HR0 : r in the unrestricted model. This result brings the earlier result more in line with our intuition. While it might seem counterintuitive for the noncentrality parameter to be larger when valid
restrictions are included in the null hypothesis HR0 : r, our intuition does suggest that it should be
larger when testing HR01: r 1 in a correctly restricted model. We conclude by exploring the 3
conditions under which the noncentrality parameter for testing HR01: r 1 from the restricted model
is equal to the noncentrality parameter for testing HR01: r 1 from the unrestricted model.
To summarise, we are interested in comparing the noncentrality parameters for the F-test under the following three scenarios.
1. Testing HR0 : r when Rr11 1 and Rr22 2.
2. Testing HR01: r 1 when Rr11 1 and Rr22 2.
3. Testing HR01: r 1 when Rr11 1 and the valid restrictions Rr22 2 have been imposed on
the model.
Let the three noncentrality parameters for these cases be 12, and 3 , respectively. We show that
132, and we specify the conditions under which 132.
The F-statistic for the complete set of restrictions is HR0 : r is
1 1 RˆˆrRXXRRr ˆ 2 F JJ12
1 where ˆ X XXy is the least squares estimator and ˆ 2 yXˆˆ yX () NK is the variance estimator. The noncentrality parameter is
1 R rRXXRRr 2 1
In Section 2 we illustrate our results with some simple examples. Section 3 is devoted to proofs of the general results. Section 4 contains some numerical examples of the power differences. A concluding remark is provided in Section 5.
2. Simple Examples
Consider a model with two explanatory variables x1 and x2 which are expressed in terms of
N N deviations from their means such that i1 x1i 0 , i1 x2i 0 and E yXiii| x11 x 2 2.
Example 1
For the first example we compare noncentrality parameters for null hypotheses H01:0 2
and H01:0 when 1 0 and 2 0 . For testing H01:0 2 , we have 4
10 0 1 1 X X 2 RrRXXR 2 01 0 and
N 1 22x 1 11i1 i 1122 0 XX 0
For testing H01:0, we have Rr1110, 0, and
1 1 1 1 2 212 10XX 0
1 N 1 x2 2 i1 2i 2 1 2 NNxx22 N xx ii1112ii i 1 12 ii
22N 11i1 x i 2 2 1 r
2 where rxxxx222 NNN is the squared sample correlation between x and x . iii111iii121 1 2 1 2
Thus, we have 12, with equality holding when x1 and x2 are uncorrelated, a result that
contradicts Ruud’s general result that 12.
For the third scenario, we impose the restriction 2 0 , leading to the model E yXii| x11.
Then, for testing H01:0 , we see immediately that
22N x 11 i1 i 312
While it may seem strange that 12, it is clear that we would expect 32; otherwise, we could
increase the noncentrality parameter for testing H01:0 by simply including irrelevant explanatory variables in the model.
Example 2
For a second example, consider the hypothesis H01:0, 1 2 where 1 0 and 12. In this case
R1 10 0 Rr R2 11 0 5
The noncentrality parameter for testing the joint hypothesis H01:0, 1 2 is
1 1 101 11 1 112 0 XX 11 01 0
2 1 NNxx22 N xx NN2 ii1112ii i 1 12 ii 10xxx212iii 11 0 ii11 1 2 1 NN 11 xx x2 01 0 ii1112ii 1 i
2 1 NNxx22 N xx NNN22 ii1112iii1 12ii xxxx2212iiii 0 iii1111 2 1 NN NNN xxxxxxx2222 0 ii112121212iiiiiii iii 111
2 1 NNxx22 N xx NN2 ii1112ii i 1 12 ii xzx22iii 0 ii11 1 2 1 NN zx z2 0 ii11ii2 i
22 where zxx. It can be shown that NNx22xxxzxzx N NN 22 N , iii12 ii1112i i i 1 12 ii ii 11 i 2 i i 1 ii 2 and hence that
22N z 1 i1 i 1 2
Omitting 12 from the null hypothesis and testing only H01:0 yields the same noncentrality parameter as in Example 1, namely
22N 11i1 x i 2 2 2 1 r
To test H01:0 on the model with the restriction 12 imposed, we note that the restricted model is
E yXiiii| x121 x z 1 and the noncentrality parameter is
22N z 1 i1 i 312
After some algebra, we can show that
2222NN 111ii11zxii2 12 22 1 r
2 NN2 1 xxx2 0 22N ii11212iii i1 x2i 6
222N Thus, 132. Moreover, when x1 and x2 are uncorrelated, 121i1 x 2i 0 ; in contrast to Example 1, the strict inequality always holds.
3. General Case
To explore the general case of testing HR0 : r when Rr can be partitioned as
Rr11 Rr22 0 with 0, we adopt the notation used by Ruud (2000, p.233-235) and set AARXXR2 1 , with
A being lower triangular, partitioned conformably with R,
A11 0 A AA21 22 and with the inverse of A given by
B11 0 B B21B 22
It follows that
11 1 R XX R R XX R 22RXX R 1112 R XX11 R R XX R 2122 (1)
AA11 11 AA 11 21 AA21 11 AA 21 21 AA 21 21 and
1 211R XX11 R AA A A BB (2) B11BBBBB 11 21 21 21 22 B22BBB 21 22 22
The noncentrality parameter can be written as
1 1 2 1 0 RXX R 0
BB11 11 BB 21 21 BB 21 22 0 BB22 21 BB 22 22 0
BB11 11 BB 21 21 7
1 Now, from (1), BB AA112 R XX R , and thus, 11 11 11 11 1 1
1 2 R XX1 R B B 1112121
Discarding the surplus restrictions Rr22 from the null hypothesis and testing HR01: r 1, we get the noncentrality parameter
1 2 RXXR1 21 1
Since BB21 21 is nonnegative, we have 12; discarding the surplus restrictions decreases the non- centrality parameter and hence may decrease the power of the test if it is not compensated by a reduction in the degrees of freedom. We provide some insights into the conditions under which
BB21 21 0 and hence 12 after showing that 31.
Estimating subject to the valid restrictions Rr22 yields the restricted estimator
1 ˆˆ X XRRXXR11 R ˆ r R 22 2 2 2 with covariance matrix (see, for example, Judge et al. 1988, p.237-239)
1 Vˆ | X22 XX11 XX R R XX 1 R R XX 1 R 22 2 2
ˆ The non-centrality parameter for testing HR01: r 1 using R is
1 1 22RXXR 1111 RXXRRXXRRXXR 31 1122221 1 AA AA AA AA 1 AA 11 11 11 21 21 21 22 22 21 11
1 The term AA AA AA AA1 AA can be recognised as the top-left block in the 11 11 11 21 21 21 22 22 21 11
partitioned inverse of AA . The inverse of AA is BB whose top-left block is BB11 11 BB 21 21 . Hence,
3111121211BB BB
To examine the conditions under which 123, suppose that the model can be rewritten as
Ey(| X ) X11 X 2 2 (3) 8
where X1 is N K1 with J11 K , X 2 is N K2 with J22 K , and J12JKKK 1 2 . Also suppose that the restrictions Rr are of the form
R11111 Rr0 R (4) R22222 0 Rr
with dimensions conformable to the partition X XX12 . There is no overlap between the
coefficients in the invalid restrictions Rr11 1 1 and those in the valid restrictions Rr22 2 2 . We
encountered this situation in Example 1 where we found that 123 if x1 and x2 were
uncorrelated. Thus, we suspect the requirement for 123 in the more general set up in (3) and
(4) would be that X1 and X 2 are orthogonal.
Noting that
1 1 BB BB22 RXXR 1111 RXXRRXXR RXXR 11 11 21 21 1 1 1 2 2 2 2 1
1 1 it follows that BB BB BB 2 R XX R if 11 11 21 21 11 11 1 1
1 2 RXXRRXXRRXXR11 1 0 122221
Partitioning X X 1 in line with (3),
1 QQ11 12 XX QQ21 22
1 1 we find that R2XXRRQR 2 22 22 22, R1XXRRQR 2 11 12 22, and
1 2 R XX11 R R XX R R XX 1 R 2 R Q R R Q R 1 R Q R 1 2 2 2 2 1 11 12 22 22 22 22 22 21 11
This quantity will be zero if Q12 0 which in turn will be true if X1 and X 2 are orthogonal. Thus, the three noncentrality parameters are equal if the valid coefficient restrictions are separate from the invalid ones, and the explanatory variables associated with the valid and invalid restrictions are
orthogonal. As we saw in Example 2, it is not sufficient for X1 and X 2 to be orthogonal; we also require the two sets of restrictions to be separable. 9
4. Numerical Examples
Example 1
To illustrate how the power of a test that includes a valid hypothesis can be greater than the test
that excludes it, consider the regression model yxeiiii12 ,1,,40 . Let xi 10, i 1, , 20
40 40 2 and xii 20, 21, ,40 . We have N 40 , i1 xi 600 and i1 xi 10,000 . Let the true
2 parameter values be 1 100 , 2 10 and 2500 . First consider testing the joint hypothesis
H 01:100,9 2 against the alternative H11: 100 and/or 2 9 . At the 5% level of
significance we reject the joint null hypothesis if the F-test statistic is greater than the critical value
F0.95,2,38 3.24482 . The noncentrality parameter is
XX 1 100 11100 2 9 2 2 9 1 40 600 0 01 2500 600 10,000 1 4
The probability of rejecting the joint null hypothesis is the probability that a value from a non-central
F-distribution with noncentrality parameter 4 will exceed F0.95,2,38 3.24482 . This value for the
PrF 3.24482 0.38738 test power is 2,38, 4 .
Now we omit the true hypothesis 1 100 , and test H 02:9 against H12:9 using an
F-test. The test critical value is the 95th percentile of the F-distribution, F0.95,1,38 4.09817 . The noncentrality parameter of the F-distribution for this single hypothesis is
1 212 0 22901 XX 1
1 1 1 40 600 0 01 2500 600 10,000 1 0.4
The probability of rejecting the null hypothesis H 02:9 versus H12:9 when 2 10 is
PrF 4.09817 0.09457 1,38, 0.4 . Removing the true hypothesis has reduced the power of the test. 10
Example 2
All the above results have been conditional on X. To investigate the effect of relaxing this
assumption, we again use the model in Example 1, but generate 10,000 samples of x from a
N 15,1.62 distribution, and, correspondingly, 10,000 samples of (yx | ) from Nx100 10 , 502 .
This choice of x gives values from 10.2 to 19.8 with probability over 0.99, mimicking the range of x-
values in Example 1. Using the F-test, the percentage of rejections for the single hypothesis was
0.0569, and for the joint hypothesis, 0.3545. Using the chi-square test, with chi-square value equal to
the F-value multiplied by the numerator degrees of freedom, the single hypothesis rejection
percentage was 0.0640, and the joint test rejection percentage was 0.3911. Removing the true
hypothesis reduced the simulated power of the test.
5. Concluding Remark
We have shown that, except under special circumstances, including valid superfluous restrictions
in an F-test for linear restrictions in a regression model increases the non-centrality parameter of the
F-statistic, and hence can also improve its power. This apparently counterintuitive result makes sense
when we observe that the noncentrality parameter from testing both sets of restrictions jointly is identical to that obtained when the F-test for the invalid restrictions is applied to a model that has been restricted in line with the valid restrictions.
References
Judge, G.G., R.C. Hill, W.E. Griffiths, H. Lütkepohl and T.-C. Lee (1988), Introduction to the Theory
and Practice of Econometrics, second edition, New York: John Wiley and Sons.
Ruud, P. A. (2000), An Introduction to Classical Econometric Theory, Oxford: Oxford University
Press.