Chapter 6 General Linear Model: Statistical Inference

BIOS 2083 Linear Models, Abdus S. Wahed

6.1 Introduction

So far we have discussed the formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter 4), least squares estimation (Chapter 4), and generalized least squares estimation (Chapter 5). In discussing LS or GLS estimators, we have not made any probabilistic inference, mainly because we have not assigned any probability distribution to our linear model structure. Statistical modeling very often demands more than just estimating the parameters. In particular, one is usually interested in attaching a measure of uncertainty to the estimates in terms of confidence levels, or in testing whether some linear function of the parameters, such as the difference between two treatment effects, is significant or not. As we know from our statistical methods courses, interval estimation and hypothesis testing almost always require a probability model, and the inference depends on the particular model chosen. The most common probability model used in statistical inference is the normal model. We will start with the Gauss-Markov model, namely,

\[ Y = X\beta + \epsilon, \tag{6.1.1} \]

where

Assumption I. $E(\epsilon) = 0$,
Assumption II. $\mathrm{cov}(\epsilon) = \sigma^2 I$,

and introduce the assumption of normality for the error components. In other words,

Assumption IV. $\epsilon \sim N(0, \sigma^2 I_n)$.

As you can see, Assumption IV incorporates Assumptions I and II. The model (6.1.1) along with Assumption IV will be referred to as the normal theory linear model.
6.2 Maximum likelihood, sufficiency and UMVUE

Under the normal theory linear model described in the previous section, the likelihood function for the parameters $\beta$ and $\sigma^2$ can be written as

\[ L(\beta, \sigma^2 \mid Y, X) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \exp\left\{ -\frac{(Y - X\beta)^T (Y - X\beta)}{2\sigma^2} \right\} \tag{6.2.1} \]

\[ = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \exp\left\{ -\frac{Y^T Y - 2\beta^T X^T Y + \beta^T X^T X \beta}{2\sigma^2} \right\}. \tag{6.2.2} \]

Since $X$ is known, (6.2.2) implies that $L(\beta, \sigma^2 \mid Y, X)$ belongs to an exponential family with a joint complete sufficient statistic $(Y^T Y, X^T Y)$ for $(\beta, \sigma^2)$. Also, (6.2.1) shows that, for given $\sigma^2$, the likelihood is maximized over $\beta$ when

\[ \|Y - X\beta\|^2 = (Y - X\beta)^T (Y - X\beta) \tag{6.2.3} \]

is minimized. But when is (6.2.3) minimized? From Chapter 4, we know that (6.2.3) is minimized when $\beta$ is a solution to the normal equations

\[ X^T X \beta = X^T Y. \tag{6.2.4} \]

Thus the maximum likelihood estimator (MLE) of $\beta$ also satisfies the normal equations. By the invariance property of maximum likelihood, the MLE of an estimable function $\lambda^T \beta$ is given by $\lambda^T \hat\beta$, where $\hat\beta$ is a solution to the normal equations (6.2.4). Using the maximum likelihood estimator $\hat\beta$ of $\beta$, we express the likelihood (6.2.1) as a function of $\sigma^2$ only, namely,

\[ L(\sigma^2 \mid Y, X) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \exp\left\{ -\frac{(Y - X\hat\beta)^T (Y - X\hat\beta)}{2\sigma^2} \right\} \tag{6.2.5} \]

with corresponding log-likelihood

\[ \ln L(\sigma^2 \mid Y, X) = -\frac{n}{2} \ln(2\pi\sigma^2) - \frac{(Y - X\hat\beta)^T (Y - X\hat\beta)}{2\sigma^2}, \tag{6.2.6} \]

which is maximized over $\sigma^2$ when

\[ \hat\sigma^2_{MLE} = \frac{(Y - X\hat\beta)^T (Y - X\hat\beta)}{n}. \tag{6.2.7} \]

Note that while the MLE of $\beta$ is identical to the LS estimator, $\hat\sigma^2_{MLE}$ is not the same as the LS estimator of $\sigma^2$.

Proposition 6.2.1. The MLE of $\sigma^2$ is biased.

Proposition 6.2.2. The MLE $\lambda^T \hat\beta$ is the uniformly minimum variance unbiased estimator (UMVUE) of $\lambda^T \beta$.

Proposition 6.2.3. The MLE $\lambda^T \hat\beta$ and $n\hat\sigma^2_{MLE}/\sigma^2$ are independently distributed, respectively, as $N(\lambda^T \beta, \sigma^2 \lambda^T (X^T X)^g \lambda)$ and $\chi^2_{n-r}$.

6.3 Confidence interval for an estimable function

Proposition 6.3.1. The quantity

\[ \frac{\lambda^T \hat\beta - \lambda^T \beta}{\sqrt{\hat\sigma^2_{LS}\, \lambda^T (X^T X)^g \lambda}} \]

has a t distribution with $n - r$ df.
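Proposition 6.3.1 can be read as a consequence of Proposition 6.2.3. The sketch below assumes that $\hat\sigma^2_{LS}$ denotes the residual-based estimator $(Y - X\hat\beta)^T (Y - X\hat\beta)/(n - r)$ with $r = \mathrm{rank}(X)$, as in the earlier chapters; neither definition is restated in this section, and the symbols $Z$ and $U$ are introduced here only for illustration.

```latex
\begin{align*}
Z &= \frac{\lambda^T\hat\beta - \lambda^T\beta}
          {\sqrt{\sigma^2\,\lambda^T (X^T X)^g \lambda}} \sim N(0,1),
  && \text{(first part of Proposition 6.2.3)} \\
U &= \frac{(n-r)\,\hat\sigma^2_{LS}}{\sigma^2}
   = \frac{n\,\hat\sigma^2_{MLE}}{\sigma^2} \sim \chi^2_{n-r},
  && \text{(second part, independent of } Z\text{)} \\
\frac{Z}{\sqrt{U/(n-r)}}
  &= \frac{\lambda^T\hat\beta - \lambda^T\beta}
          {\sqrt{\hat\sigma^2_{LS}\,\lambda^T (X^T X)^g \lambda}} \sim t_{n-r}.
\end{align*}
```

The unknown $\sigma^2$ cancels in the last ratio, which is what makes the quantity usable as a pivot.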
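Proposition 6.3.1 leads the way for us to construct a confidence interval (CI) for the estimable function $\lambda^T \beta$. In fact, a $100(1 - \alpha)\%$ CI for $\lambda^T \beta$ is given by

\[ \lambda^T \hat\beta \pm t_{n-r,\,\alpha/2} \sqrt{\hat\sigma^2_{LS}\, \lambda^T (X^T X)^g \lambda}. \tag{6.3.1} \]

This confidence interval is in the familiar form Estimate $\pm\ t_{\alpha/2} \times$ SE.

Example 6.3.2. Consider the simple linear regression model of Example 1.1.3, given by Equation (1.1.7),

\[ Y_i = \beta_0 + \beta_1 w_i + \epsilon_i, \tag{6.3.2} \]

where $Y_i$ and $w_i$ respectively represent the survival time and the age at prognosis for the $i$th patient. We have identified in Example 4.2.3 that a unique LS estimator for $\beta_0$ and $\beta_1$ is given by

\[ \hat\beta_1 = \frac{\sum (w_i - \bar w)(Y_i - \bar Y)}{\sum (w_i - \bar w)^2}, \qquad \hat\beta_0 = \bar Y - \hat\beta_1 \bar w, \tag{6.3.3} \]

provided $\sum (w_i - \bar w)^2 > 0$. For a given patient who was diagnosed with leukemia at the age of $w_0$, one would be interested in predicting the length of survival. The LS estimator of the expected survival for this patient is given by

\[ \hat Y_0 = \hat\beta_0 + \hat\beta_1 w_0. \tag{6.3.4} \]

What is a 95% confidence interval for the expected survival $\beta_0 + \beta_1 w_0$, which $\hat Y_0$ estimates?

Note that $\hat Y_0$ is of the form $\lambda^T \hat\beta$, where $\hat\beta = (\hat\beta_0, \hat\beta_1)^T$ and $\lambda = (1, w_0)^T$. The variance-covariance matrix of $\hat\beta$ is given by

\[ \mathrm{cov}(\hat\beta) = (X^T X)^{-1} \sigma^2 = \frac{\sigma^2}{n \sum (w_i - \bar w)^2} \begin{bmatrix} \sum w_i^2 & -\sum w_i \\ -\sum w_i & n \end{bmatrix}. \tag{6.3.5} \]

Thus the variance of the predictor $\hat Y_0$ is given by

\begin{align*}
\mathrm{var}(\hat Y_0) &= \mathrm{var}(\lambda^T \hat\beta) = \lambda^T \mathrm{cov}(\hat\beta) \lambda \\
&= \frac{\sigma^2}{n \sum (w_i - \bar w)^2}\, (1, w_0) \begin{bmatrix} \sum w_i^2 & -\sum w_i \\ -\sum w_i & n \end{bmatrix} \begin{pmatrix} 1 \\ w_0 \end{pmatrix} \\
&= \frac{\sigma^2}{n \sum (w_i - \bar w)^2}\, (1, w_0) \begin{bmatrix} \sum w_i^2 - w_0 \sum w_i \\ -\sum w_i + n w_0 \end{bmatrix} \\
&= \frac{\sigma^2}{n \sum (w_i - \bar w)^2} \left( \sum w_i^2 - 2 w_0 \sum w_i + n w_0^2 \right) \\
&= \frac{\sigma^2 \sum (w_i - w_0)^2}{n \sum (w_i - \bar w)^2}. \tag{6.3.6}
\end{align*}

Therefore, the standard error of $\hat Y_0$ is obtained as

\[ SE(\hat Y_0) = \hat\sigma_{LS} \sqrt{\frac{\sum (w_i - w_0)^2}{n \sum (w_i - \bar w)^2}}, \tag{6.3.7} \]

where $\hat\sigma^2_{LS} = \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 w_i)^2 / (n - 2)$. Using (6.3.1), a 95% CI for the expected survival $\beta_0 + \beta_1 w_0$ is then

\[ \hat Y_0 \pm t_{n-2,\,0.025}\, \hat\sigma_{LS} \sqrt{\frac{\sum (w_i - w_0)^2}{n \sum (w_i - \bar w)^2}}. \tag{6.3.8} \]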
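The calculation in Example 6.3.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mean_response_ci`, its arguments, and the toy data are hypothetical, and the critical value `t_crit` (that is, $t_{n-2,0.025}$) is supplied by the caller from a t table rather than computed here.

```python
import math

def mean_response_ci(w, y, w0, t_crit):
    """CI for beta0 + beta1*w0 in simple linear regression, following (6.3.3)-(6.3.8).

    t_crit is the upper alpha/2 quantile of the t distribution with n - 2 df,
    looked up by the caller (an assumption of this sketch).
    """
    n = len(w)
    w_bar = sum(w) / n
    y_bar = sum(y) / n
    sxx = sum((wi - w_bar) ** 2 for wi in w)
    if sxx <= 0:
        raise ValueError("need at least two distinct w values")

    # LS estimates, equation (6.3.3)
    b1 = sum((wi - w_bar) * (yi - y_bar) for wi, yi in zip(w, y)) / sxx
    b0 = y_bar - b1 * w_bar

    # Residual-based variance estimate with n - 2 degrees of freedom
    rss = sum((yi - b0 - b1 * wi) ** 2 for wi, yi in zip(w, y))
    sigma2_ls = rss / (n - 2)

    # Point estimate (6.3.4) and standard error (6.3.7)
    y0_hat = b0 + b1 * w0
    se = math.sqrt(sigma2_ls * sum((wi - w0) ** 2 for wi in w) / (n * sxx))

    # Interval (6.3.8)
    return y0_hat, se, (y0_hat - t_crit * se, y0_hat + t_crit * se)

# Toy data (hypothetical ages and survival times), n = 5, so df = 3.
ages = [1, 2, 3, 4, 5]
times = [2.1, 3.9, 6.2, 7.8, 10.1]
estimate, se, (lower, upper) = mean_response_ci(ages, times, w0=3, t_crit=3.182)
print(estimate, se, lower, upper)
```

If SciPy is available, `scipy.stats.t.ppf(0.975, n - 2)` could supply `t_crit` instead of a table lookup. Since $\sum (w_i - w_0)^2 = \sum (w_i - \bar w)^2 + n(\bar w - w_0)^2$, the standard error is smallest at $w_0 = \bar w$, where it reduces to $\hat\sigma_{LS}/\sqrt{n}$.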
6.4 Test of Hypothesis

Very often we are interested in testing hypotheses related to some linear function of the parameters in a linear model. From our discussions in Chapter 4, we have learned that not all linear functions of the parameter vector can be estimated. Similarly, not all hypotheses corresponding to linear functions of the parameter vector can be tested. We will see shortly which hypotheses can be tested and which cannot. Let us first look at our favorite one-way ANOVA model. Usually, the hypotheses of interest are:

1. Equality of the $a$ treatment effects:
\[ H_0 : \alpha_1 = \alpha_2 = \ldots = \alpha_a. \tag{6.4.1} \]

2. Equality of two specific treatment effects, e.g.,
\[ H_0 : \alpha_1 = \alpha_2. \tag{6.4.2} \]

3. A linear combination, such as a contrast (to be discussed later), of treatment effects:
\[ H_0 : \sum c_i \alpha_i = 0, \tag{6.4.3} \]
where the $c_i$ are known constants such that $\sum c_i = 0$.

Note that, if we consider the normal theory Gauss-Markov linear model $Y = X\beta + \epsilon$, then all of the above hypotheses can be written as

\[ A^T \beta = b, \tag{6.4.4} \]

where $A$ is a $p \times q$ matrix with $\mathrm{rank}(A) = q$. If $A$ is not of full column rank, we exclude the redundant columns from $A$ to obtain a full column rank matrix.

Now, how would you proceed to test the hypothesis (6.4.4)? Let us digress a little and refresh our memory about tests of hypotheses. If $Y_1, Y_2, \ldots, Y_n$ are $n$ iid observations from a $N(\mu, \sigma^2)$ population, how do we construct a test statistic to test the hypothesis

\[ H_0 : \mu = \mu_0? \tag{6.4.5} \]

Recall the usual construction: $\mu$ is estimated by $\bar Y$; under $H_0$, $\bar Y \sim N(\mu_0, \sigma^2/n)$; and replacing the unknown $\sigma^2$ by the sample variance $S^2$ gives the statistic $\sqrt{n}(\bar Y - \mu_0)/S$, which has a t distribution with $n - 1$ df under $H_0$.

Thus we took 3 major steps in constructing a test statistic:

• Estimated the parametric function ($\mu$),
• Found the distribution of the estimate, and
• Eliminated the nuisance parameter.

We will basically follow the same procedure to test the hypothesis (6.4.4). First we need to estimate $A^T \beta$. That means $A^T \beta$ must be estimable.

Definition 6.4.1. Testable hypothesis. A linear hypothesis $A^T \beta = b$ is testable if the rows of $A^T \beta$ are estimable.
In other words (see Chapter 4), there exists a matrix $C$ such that

\[ A = X^T C. \tag{6.4.6} \]

The assumption that $A$ has full column rank is just a matter of convenience. Given a set of equations $A^T \beta = b$, one can easily eliminate redundant equations to transform it into a system of equations with a full column rank matrix.

Since $A^T \beta$ is estimable, the corresponding LS estimator of $A^T \beta$ is given by

\[ A^T \hat\beta = A^T (X^T X)^g X^T Y, \tag{6.4.7} \]

which is a linear function of $Y$. Under Assumption IV, $A^T \hat\beta$ is distributed as

\[ A^T \hat\beta \sim N_q\left(A^T \beta,\ \sigma^2 A^T (X^T X)^g A = \sigma^2 B\right), \tag{6.4.8} \]

where we introduced the notation $A^T (X^T X)^g A = B$.
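The testability condition (6.4.6) says that each column of $A$ lies in the row space of $X$, which is equivalent to the rank of $X$ being unchanged when that column is appended to $X$ as an extra row. A minimal Python sketch of that check follows, using exact rational arithmetic. The function names `rank` and `is_testable` and the two-group one-way ANOVA design are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        # Look for a nonzero entry in this column at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate the entries below the pivot.
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def is_testable(X, A_columns):
    """True if every column of A lies in the row space of X (condition 6.4.6)."""
    base = rank(X)
    return all(rank(X + [list(col)]) == base for col in A_columns)

# Hypothetical one-way ANOVA design with two groups of two observations;
# the parameter vector is (mu, alpha1, alpha2).
X = [[1, 1, 0],
     [1, 1, 0],
     [1, 0, 1],
     [1, 0, 1]]

print(is_testable(X, [[0, 1, -1]]))  # H0: alpha1 - alpha2 = 0
print(is_testable(X, [[0, 1, 0]]))   # H0: alpha1 = 0
```

In this design the hypothesis $\alpha_1 = \alpha_2$ passes the check, because $(0, 1, -1)$ is the difference of two rows of $X$, while $\alpha_1 = 0$ fails, because $(0, 1, 0)$ is not a linear combination of the rows $(1, 1, 0)$ and $(1, 0, 1)$.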
