
Asymptotic Results for the Linear Regression Model

C. Flinn

February 5, 1999

1. Asymptotic Results under Classical Assumptions

The following results apply to the linear regression model $y = X\beta + \varepsilon$, where $X$ is of dimension $(n \times k)$, $\varepsilon$ is an (unknown) $(n \times 1)$ vector of disturbances, and $\beta$ is an (unknown) $(k \times 1)$ parameter vector. We assume that $n > k$ and that $\rho(X) = k$. This implies that $\rho(X'X) = k$ as well. Throughout we assume that the "classical" conditional moment assumptions apply, namely

• $E(\varepsilon_i \mid X) = 0 \;\; \forall i$.

• $V(\varepsilon_i \mid X) = \sigma^2 \;\; \forall i$.

We first show that the probability limit of the OLS estimator is $\beta$, i.e., that it is consistent. In particular, we know that

$$\hat\beta = \beta + (X'X)^{-1}X'\varepsilon \;\Rightarrow\; E(\hat\beta \mid X) = \beta + (X'X)^{-1}X'E(\varepsilon \mid X) = \beta.$$

In terms of the (conditional) variance of the estimator $\hat\beta$, $V(\hat\beta \mid X) = \sigma^2(X'X)^{-1}$. Now we will rely heavily on the following assumption:
$$\lim_{n\to\infty} \frac{X_n'X_n}{n} = Q,$$
where $Q$ is a finite, nonsingular $k \times k$ matrix. Then we can write the covariance of $\hat\beta_n$ in a sample of size $n$ explicitly as
$$V(\hat\beta_n \mid X_n) = \frac{\sigma^2}{n}\left(\frac{X_n'X_n}{n}\right)^{-1},$$
so that
$$\lim_{n\to\infty} V(\hat\beta_n \mid X_n) = \lim_{n\to\infty}\frac{\sigma^2}{n} \times \lim_{n\to\infty}\left(\frac{X_n'X_n}{n}\right)^{-1} = 0 \times Q^{-1} = 0.$$

Since the asymptotic variance of the estimator is 0 and the distribution is centered on $\beta$ for all $n$, we have shown that $\hat\beta$ is consistent. Alternatively, we can prove consistency as follows. We need the following result.

Lemma 1.1.
$$\operatorname{plim}\left(\frac{X'\varepsilon}{n}\right) = 0.$$

Proof. First, note that $E\left(\frac{X'\varepsilon}{n}\right) = 0$ for any $n$. Then the variance of the expression $\frac{X'\varepsilon}{n}$ is given by
$$V\left(\frac{X'\varepsilon}{n}\right) = E\left[\left(\frac{X'\varepsilon}{n}\right)\left(\frac{X'\varepsilon}{n}\right)'\right] = n^{-2}E(X'\varepsilon\varepsilon'X) = \frac{\sigma^2}{n}\,\frac{X'X}{n},$$
so that $\lim_{n\to\infty} V\left(\frac{X'\varepsilon}{n}\right) = 0 \times Q = 0$. Since the asymptotic mean of the random variable is 0 and the asymptotic variance is 0, the probability limit of the expression is 0.

Now we can state a slightly more direct proof of consistency of the OLS estimator, which is
$$\operatorname{plim}(\hat\beta) = \operatorname{plim}\left(\beta + (X'X)^{-1}X'\varepsilon\right) = \beta + \left[\lim_{n\to\infty}\left(\frac{X_n'X_n}{n}\right)^{-1}\right]\operatorname{plim}\left(\frac{X_n'\varepsilon}{n}\right) = \beta + Q^{-1}\times 0 = \beta.$$
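To make the consistency result concrete, the following small Monte Carlo sketch (not part of the original notes; the design, parameter values, and variable names are our own illustrative choices) computes $\hat\beta = (X'X)^{-1}X'y$ at increasing sample sizes and shows it settling down near the true $\beta$.

```python
# Illustrative sketch only: plim(beta_hat) = beta under the classical assumptions.
# The regressor design, parameter values, and seed below are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
beta = np.array([1.0, -2.0])     # true (k x 1) parameter vector
sigma = 1.5                      # common standard deviation of the disturbances

for n in (50, 500, 5000, 50000):
    X = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, n)])  # (n x k), rank k
    eps = rng.normal(0.0, sigma, n)                               # E(eps|X) = 0, V(eps|X) = sigma^2
    y = X @ beta + eps
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)                  # (X'X)^{-1} X'y
    print(n, beta_hat)                                            # approaches beta as n grows
```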

Next, consider whether or not $s^2$ is a consistent estimator of $\sigma^2$. Now
$$s^2 = \frac{SSE}{n-k},$$
where $SSE = (y - X\hat\beta)'(y - X\hat\beta)$. We showed that $E(s^2) = \sigma^2$ for all $n$, that is, that $s^2$ is an unbiased estimator of $\sigma^2$ for all sample sizes. Since $SSE = \varepsilon'M\varepsilon$, with $M = I - X(X'X)^{-1}X'$, then
$$\begin{aligned}
\operatorname{plim} s^2 &= \operatorname{plim}\frac{\varepsilon'M\varepsilon}{n-k}\\
&= \operatorname{plim}\frac{\varepsilon'M\varepsilon}{n}\\
&= \operatorname{plim}\frac{\varepsilon'\varepsilon}{n} - \operatorname{plim}\left[\frac{\varepsilon'X}{n}\left(\frac{X'X}{n}\right)^{-1}\frac{X'\varepsilon}{n}\right]\\
&= \operatorname{plim}\frac{\varepsilon'\varepsilon}{n} - 0\times Q^{-1}\times 0.
\end{aligned}$$
Now
$$\frac{\varepsilon'\varepsilon}{n} = n^{-1}\sum_{i=1}^{n}\varepsilon_i^2,$$
so that
$$E\left(\frac{\varepsilon'\varepsilon}{n}\right) = n^{-1}E\left(\sum_{i=1}^n \varepsilon_i^2\right) = n^{-1}\sum_{i=1}^n E\varepsilon_i^2 = n^{-1}(n\sigma^2) = \sigma^2.$$

Similarly, under the assumption that $\varepsilon_i$ is i.i.d., the variance of the expression being considered is given by
$$V\left(\frac{\varepsilon'\varepsilon}{n}\right) = n^{-2}V\left(\sum_{i=1}^n \varepsilon_i^2\right) = n^{-2}\sum_{i=1}^n V(\varepsilon_i^2) = n^{-2}\left(n\left[E(\varepsilon_i^4) - V(\varepsilon_i)^2\right]\right) = n^{-1}\left[E(\varepsilon_i^4) - V(\varepsilon_i)^2\right],$$
so that the limit of the variance of $\frac{\varepsilon'\varepsilon}{n}$ is 0 as long as $E(\varepsilon_i^4)$ is finite [we have already assumed that the first two moments of the distribution of $\varepsilon_i$ exist]. Thus the asymptotic distribution of $\frac{\varepsilon'\varepsilon}{n}$ is centered at $\sigma^2$ and is degenerate, thus proving consistency of $s^2$.
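A companion check along the same lines (again our own illustration, not from the notes): with i.i.d. disturbances, $s^2 = SSE/(n-k)$ settles down to $\sigma^2$ as $n$ grows.

```python
# Illustrative sketch only: s^2 = SSE/(n-k) converges to sigma^2.
import numpy as np

rng = np.random.default_rng(1)
sigma2 = 2.25                                     # true disturbance variance
beta = np.array([1.0, -2.0])

for n in (50, 500, 5000, 50000):
    X = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, n)])
    y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), n)
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta_hat
    s2 = resid @ resid / (n - X.shape[1])         # SSE/(n-k)
    print(n, round(s2, 4))                        # approaches sigma2 = 2.25
```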

2. Testing without Normally Distributed Disturbances

In this section we look at the distribution of test statistics associated with linear restrictions on the $\beta$ vector when $\varepsilon_i$ is not assumed to be distributed as $N(0,\sigma^2)$ for all $i$. Instead, we will proceed with the weaker condition that $\varepsilon_i$ is independently and identically distributed with the common cumulative distribution function (c.d.f.) $F$. Furthermore, $E(\varepsilon_i) = 0$ and $V(\varepsilon_i) = \sigma^2$ for all $i$. Since we retain the mean independence and variance homogeneity assumptions, and since unbiasedness, consistency, and the Gauss-Markov theorem for that matter, all rely only on these first two conditional moment assumptions, all these results continue to hold when we drop normality. However, the small sample distributions of our test statistics will no longer be accurate, since these were all derived under the assumption of normality. If we made other explicit assumptions regarding $F$, it is possible in principle to derive the small sample distributions of test statistics, though these distributions are not simple to characterize analytically or even to compute. Instead of making explicit assumptions regarding the form of $F$, we can derive distributions of test statistics which are valid for large $n$ no matter what the exact form of $F$ [except that it must be a member of the class of distributions for which the asymptotic results are valid, of course].

We begin with the following useful lemma, which is associated with Lindeberg-Lévy.

Lemma 2.1. If $\varepsilon$ is i.i.d. with $E(\varepsilon_i) = 0$ and $E(\varepsilon_i^2) = \sigma^2$ for all $i$; if the elements of the matrix $X$ are uniformly bounded so that $|X_{ij}| < C < \infty$ for all $i$ and $j$; and if $\lim_{n\to\infty} X'X/n = Q$, a finite and nonsingular matrix; then
$$\frac{1}{\sqrt n}X'\varepsilon \to N(0,\sigma^2 Q).$$

Proof. Consider the case of only one regressor for simplicity. Then
$$Z_n \equiv \frac{1}{\sqrt n}\sum_{i=1}^n X_i\varepsilon_i$$
is a scalar. Let $G_i$ be the c.d.f. of $X_i\varepsilon_i$. Let
$$S_n^2 \equiv \sum_{i=1}^n V(X_i\varepsilon_i) = \sigma^2\sum_{i=1}^n X_i^2.$$
In this scalar case, $Q = \lim n^{-1}\sum_i X_i^2$. By the Lindeberg-Feller Theorem, the necessary and sufficient condition for $Z_n \to N(0,\sigma^2 Q)$ is
$$\lim \frac{1}{S_n^2}\sum_{i=1}^n \int_{|\omega| > \nu S_n} \omega^2\, dG_i(\omega) = 0 \qquad (2.1)$$

for all $\nu > 0$. Now $G_i(\omega) = F\!\left(\frac{\omega}{|X_i|}\right)$. Then rewrite [2.1] as
$$\lim \frac{1}{S_n^2}\sum_{i=1}^n X_i^2 \int_{|\omega/X_i| > \nu S_n/|X_i|} \left(\frac{\omega}{X_i}\right)^2 dF\!\left(\frac{\omega}{|X_i|}\right) = 0.$$

Since $S_n^2 = n\sigma^2\left(n^{-1}\sum_{i=1}^n X_i^2\right)$ and $\lim n^{-1}\sum_{i=1}^n X_i^2 = Q$, we have $\lim \frac{n}{S_n^2} = (\sigma^2 Q)^{-1}$, which is a finite and nonzero scalar. Then we need to show
$$\lim n^{-1}\sum_{i=1}^n X_i^2\,\delta_{i,n} = 0,$$

where
$$\delta_{i,n} \equiv \int_{|\omega/X_i| > \nu S_n/|X_i|} \left(\frac{\omega}{X_i}\right)^2 dF\!\left(\frac{\omega}{|X_i|}\right).$$
Now $\lim \delta_{i,n} = 0$ for all $i$ and any fixed $\nu$, since $|X_i|$ is bounded while $\lim S_n = \infty$ [thus the measure of the set $|\omega/X_i| > \nu S_n/|X_i|$ goes to 0 asymptotically]. Since $\lim n^{-1}\sum_i X_i^2$ is finite and $\lim \delta_{i,n} = 0$ for all $i$, $\lim n^{-1}\sum_i X_i^2\,\delta_{i,n} = 0$.

For vector-valued $X_i$, the result is identical of course, with $Q$ being $k \times k$ instead of a scalar. The proof is only slightly more involved. Now we can prove the following important result.

Theorem 2.2. Under the conditions of the lemma,
$$\sqrt n\,(\hat\beta - \beta) \to N(0, \sigma^2 Q^{-1}).$$

Proof. $\sqrt n\,(\hat\beta - \beta) = \left(\frac{X'X}{n}\right)^{-1}\frac{1}{\sqrt n}X'\varepsilon$. Since $\lim\left(\frac{X'X}{n}\right)^{-1} = Q^{-1}$ and $\frac{1}{\sqrt n}X'\varepsilon \to N(0,\sigma^2 Q)$, then $\sqrt n\,(\hat\beta - \beta) \to N(0, \sigma^2 Q^{-1}QQ^{-1}) = N(0, \sigma^2 Q^{-1})$.
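The content of Theorem 2.2 can be illustrated by simulation. The sketch below (our own construction; the centered exponential errors, the design, and the replication count are arbitrary choices) draws many samples with decidedly non-normal disturbances and compares the sampling covariance of $\sqrt n\,(\hat\beta - \beta)$ with $\sigma^2 Q^{-1}$.

```python
# Illustrative sketch only: sqrt(n)*(beta_hat - beta) is approximately N(0, sigma^2 Q^{-1})
# even when the disturbances are non-normal (here: centered exponential, variance 1).
import numpy as np

rng = np.random.default_rng(2)
beta = np.array([1.0, -2.0])
n, reps = 2000, 5000
X = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, n)])   # held fixed across replications
Q = X.T @ X / n
sigma2 = 1.0                                                   # variance of Exponential(1) - 1

draws = np.empty((reps, beta.size))
for r in range(reps):
    eps = rng.exponential(1.0, n) - 1.0                        # skewed errors, mean 0, variance 1
    y = X @ beta + eps
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    draws[r] = np.sqrt(n) * (beta_hat - beta)

print(np.cov(draws, rowvar=False))     # close to sigma^2 * Q^{-1}
print(sigma2 * np.linalg.inv(Q))
```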

The results of this proof have the following practical implications. For small $n$, the distribution of $\sqrt n\,(\hat\beta - \beta)$ is not normal, though asymptotically the distribution of this random variable converges to a normal. The variance of this random variable converges to $\sigma^2 Q^{-1}$, which is arbitrarily well-approximated by $s^2\left(\frac{X_n'X_n}{n}\right)^{-1} = s^2\, n\,(X_n'X_n)^{-1}$. But the variance of $(\hat\beta - \beta)$ is equal to the variance of $\sqrt n\,(\hat\beta - \beta)$ divided by $n$, so that in large samples the variance of the OLS estimator is approximately equal to $s^2\, n\,(X_n'X_n)^{-1}/n = s^2(X_n'X_n)^{-1}$, even when $F$ is non-normal. Usual $t$ tests of one linear restriction on $\beta$ no longer have exact small-sample distributions. However, an analogous large-sample test is readily available.

Proposition 2.3. Let $\varepsilon_i$ be i.i.d. $(0,\sigma^2)$, $\sigma^2 < \infty$, and let $Q$ be finite and nonsingular. Consider the test $H_0: R\beta = r$, where $R$ is $(1\times k)$ and $r$ is a scalar, both known. Then
$$\frac{R\hat\beta - r}{\sqrt{s^2 R(X'X)^{-1}R'}} \to N(0,1).$$

Proof. Under the null, $R\hat\beta - r = R\hat\beta - R\beta = R(\hat\beta - \beta)$, so that the test statistic is
$$\frac{\sqrt n\,R(\hat\beta - \beta)}{\sqrt{s^2 R(X'X/n)^{-1}R'}}.$$
Since
$$\sqrt n\,(\hat\beta - \beta) \to N(0,\sigma^2 Q^{-1}) \;\Rightarrow\; \sqrt n\,R(\hat\beta - \beta) \to N(0, \sigma^2 R Q^{-1}R').$$


The denominator of the test statistic has a probability limit equal to $\sqrt{\sigma^2 R Q^{-1}R'}$, which is the standard deviation of the random variable in the numerator. A mean-zero normal random variable divided by its standard deviation has the distribution $N(0,1)$.
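For concreteness, here is a sketch of how the statistic in Proposition 2.3 can be computed, assuming numpy and scipy are available. The data-generating process, the particular restriction, and all names are our own illustrative choices; the notes do not prescribe an implementation.

```python
# Illustrative sketch only: the large-sample z statistic of Proposition 2.3,
# z = (R beta_hat - r) / sqrt(s^2 R (X'X)^{-1} R'), referred to the N(0,1) distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 5000
X = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, n), rng.normal(size=n)])
beta = np.array([1.0, 0.5, 0.0])
y = X @ beta + (rng.uniform(size=n) - 0.5) * np.sqrt(12.0)   # uniform errors, mean 0, variance 1

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
s2 = resid @ resid / (n - X.shape[1])                        # s^2 = SSE/(n-k)
XtX_inv = np.linalg.inv(X.T @ X)

R = np.array([0.0, 1.0, 0.0])                                # H0: R beta = r, i.e. beta_2 = 0.5
r = 0.5
z = (R @ beta_hat - r) / np.sqrt(s2 * (R @ XtX_inv @ R))
p_value = 2.0 * stats.norm.sf(abs(z))                        # two-sided p-value from N(0,1)
print(z, p_value)
```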

A similar result holds for the situation in which multiple (nonredundant) linear restrictions on $\beta$ are tested simultaneously.

Proposition 2.4. Let $\varepsilon_i$ be i.i.d. $(0,\sigma^2)$, $\sigma^2 < \infty$, and let $Q$ be finite and nonsingular. Consider the test $H_0: R\beta = r$, where $R$ is $(m\times k)$ and $r$ is an $(m\times 1)$ vector, both known. Then

$$\frac{(r - R\hat\beta)'\left[R(X'X)^{-1}R'\right]^{-1}(r - R\hat\beta)/m}{SSE/(n-k)} \to \frac{\chi^2_m}{m}.$$

Proof. The denominator is a consistent estimator of $\sigma^2$ [as would be $SSE/n$], and has a degenerate limiting distribution. Under the null hypothesis, $r - R\hat\beta = -R(X'X)^{-1}X'\varepsilon$, so that the numerator of the test statistic can be written $\varepsilon' D\varepsilon/m$, where
$$D \equiv X(X'X)^{-1}R'\left[R(X'X)^{-1}R'\right]^{-1}R(X'X)^{-1}X'.$$
Now $D$ is symmetric and idempotent with $\rho(D) = m$. Then write
$$\frac{\varepsilon' D\varepsilon}{m\sigma^2} = \frac{\varepsilon' P P' D P P'\varepsilon}{m\sigma^2} = \frac{1}{m}\, V'\begin{pmatrix} I_m & 0\\ 0 & 0\end{pmatrix} V = \frac{1}{m}\sum_{i=1}^m V_i^2,$$

where $P$ is the orthogonal matrix such that $P'DP = \begin{pmatrix} I_m & 0\\ 0 & 0\end{pmatrix}$ and where $V = P'\varepsilon/\sigma$. Thus the $V_i$ are i.i.d. with mean 0 and standard deviation 1. Because $V = P'\varepsilon/\sigma$,
$$V_i = \sum_{j=1}^n \frac{P_{ji}\,\varepsilon_j}{\sigma},\qquad i = 1,\dots,m.$$

The terms in the summand are independent random variables with mean 0 and variance $\sigma_j^2 = P_{ji}^2$. Since the $\varepsilon_j$ are i.i.d., the Lindeberg-Feller central limit theorem applies, so that

$$\frac{\sum_{j=1}^n P_{ji}\,\varepsilon_j/\sigma}{W_n} \to N(0,1),$$
where $W_n^2 = \sum_{j=1}^n \sigma_j^2 = \sum_{j=1}^n P_{ji}^2 = 1$ because $P$ is orthogonal. Then since each $V_i$ is standard normal, $\frac{1}{m}\sum_{i=1}^m V_i^2 \to \frac{\chi^2_m}{m}$.

The practical use of this theorem is as follows. For large samples, the distribution of the statistic satisfies
$$\frac{(r - R\hat\beta)'\left[R(X'X)^{-1}R'\right]^{-1}(r - R\hat\beta)/m}{SSE/(n-k)} \to \frac{\chi^2_m}{m}, \qquad (2.2)$$
which means that for large enough $n$

$$\frac{(r - R\hat\beta)'\left[R(X'X)^{-1}R'\right]^{-1}(r - R\hat\beta)}{SSE/(n-k)} \to \chi^2_m. \qquad (2.3)$$

Now when disturbances were normally distributed, in a sample of size $n$ the same test statistic given by the left-hand side of [2.2] was distributed as $F(m, n-k)$. Note that $\lim_{n\to\infty} F(x; m, n-k) = \chi^2_m(mx)$. For example, say that the test statistic associated with a null with $m = 3$ restrictions assumed the value 4. In a sample size of $n = 8000$, we have (approximately) $1 - F(4; 3, 8000) = .00741$. The asymptotic approximation given in [2.3] in this example yields $1 - \chi^2_3(3\times 4) = .00738$. In small samples, differences are much greater of course. For example, for the same value of the test statistic, when $n = 20$ we have $1 - F(4; 3, 20-3) = .02523$, which is certainly different from $1 - \chi^2_3(3\times 4) = .00738$ [these figures can be reproduced with the short check at the end of this section].

In summary, when the sample size is very large, the normality assumption is pretty much inconsequential in the testing of linear restrictions on the parameter vector $\beta$. In small samples, some given assumption as to the form of $F(\varepsilon)$ is generally required to compute the distribution of the estimator $\hat\beta$. Under normality, the small sample distributions of test statistics follow the $t$ or $F$ distributions, depending on the number of restrictions being tested. Testing in this environment depends critically on the normality assumption, and if the disturbances are not normally distributed, tests will be biased in general.
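The numerical comparison above can be checked directly with standard distribution functions; a quick check, assuming scipy is available:

```python
# Quick numerical check of the F versus chi-square comparison in the text.
from scipy import stats

m, stat = 3, 4.0                               # m restrictions, test statistic equal to 4
print(stats.f.sf(stat, m, 8000))               # 1 - F(4; 3, 8000)   ~ .0074
print(stats.chi2.sf(m * stat, m))              # 1 - chi2_3(3 x 4)   ~ .0074
print(stats.f.sf(stat, m, 20 - 3))             # 1 - F(4; 3, 17)     ~ .025
```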

3. Heteroskedasticity

We will now consider the behavior of the OLS estimator when the assumption of i.i.d. $(0,\sigma^2)$ error terms is relaxed. Let the vector of errors in a sample of size $n$ be given by $\varepsilon = (\varepsilon_1 \dots \varepsilon_n)'$. We continue to assume mean independence, so $E(\varepsilon \mid X) = 0$, which is an $(n \times 1)$ vector in this case. As opposed to homoskedasticity, let $E(\varepsilon\varepsilon' \mid X) = \Omega_n$, which is an $(n \times n)$ covariance matrix.

Proposition 3.1. $E(\hat\beta \mid X) = \beta$.

Proof. $E(\hat\beta \mid X) = E(\beta + (X'X)^{-1}X'\varepsilon \mid X) = \beta + (X'X)^{-1}X'E(\varepsilon \mid X) = \beta$.

This result just restates the point we have made several times before: unbiasedness is a result of the mean independence assumption and requires no restrictions on the conditional variance. Heteroskedasticity only relates to the conditional variance. What about the conditional variance of the OLS estimator? This does change as a result of heteroskedasticity, because the covariance matrix of $\varepsilon$ is involved.

Proposition 3.2. $V(\hat\beta \mid X) = (X'X)^{-1}X'\Omega X(X'X)^{-1}$.

Proof.

$$\begin{aligned}
V(\hat\beta \mid X) &= E[(\hat\beta - \beta)(\hat\beta - \beta)' \mid X]\\
&= E[(X'X)^{-1}X'\varepsilon\varepsilon'X(X'X)^{-1} \mid X]\\
&= (X'X)^{-1}X'E(\varepsilon\varepsilon' \mid X)X(X'X)^{-1}\\
&= (X'X)^{-1}X'\Omega X(X'X)^{-1}. \qquad (3.1)
\end{aligned}$$

The OLS estimator then does not have a covariance matrix proportional to $(X'X)^{-1}$ when $\Omega \neq \sigma^2 I_n$. In particular, assuming that $V(\hat\beta \mid X) = \left[\sum_{i=1}^n (y_i - X_i\hat\beta)^2/(n-k)\right](X'X)^{-1}$ will lead to incorrect inferences and biased tests. Of course, if $\Omega$ is known, then we can easily obtain the covariance matrix of the OLS estimator from [3.1]. But we first may want to investigate whether or not OLS maintains its optimality properties in this case, in the sense of being BLUE (Best Linear Unbiased Estimator). This turns out not to be the case.
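To fix ideas, here is a small sketch of the covariance calculation in [3.1] for a known, diagonal $\Omega$ (the design and the particular form of the heteroskedasticity are our own illustrative choices): the sandwich matrix $(X'X)^{-1}X'\Omega X(X'X)^{-1}$ generally differs from anything proportional to $(X'X)^{-1}$.

```python
# Illustrative sketch only: the correct OLS covariance under heteroskedasticity,
# equation [3.1], versus the homoskedastic formula sigma^2 (X'X)^{-1}.
import numpy as np

rng = np.random.default_rng(4)
n = 1000
X = np.column_stack([np.ones(n), rng.uniform(0.5, 2.0, n)])
omega_diag = 0.5 * X[:, 1] ** 2                 # conditional variances that rise with the regressor
Omega = np.diag(omega_diag)

XtX_inv = np.linalg.inv(X.T @ X)
V_sandwich = XtX_inv @ X.T @ Omega @ X @ XtX_inv        # (X'X)^{-1} X' Omega X (X'X)^{-1}
V_homoskedastic = omega_diag.mean() * XtX_inv           # sigma^2 (X'X)^{-1} using the average variance

print(np.diag(V_sandwich))                      # the correct conditional variances of beta_hat
print(np.diag(V_homoskedastic))                 # what the classical formula would report
```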
