Unit 11: Analysis of Covariance (ANCOVA)
STA 643: Advanced Experimental Design
Derek S. Young

Learning Objectives
- Become familiar with the basic linear model theory underlying ANCOVA
- Understand the benefits of including covariates from a designed experiment in an ANOVA, and how to do so
- Become familiar with the single-factor ANCOVA model
- Know the assumptions of ANCOVA
- Know how to construct an ANCOVA model when there is more than one covariate
- Know how to construct and analyze an ANCOVA model with a blocked design
- Know how to construct and analyze a multi-factor ANCOVA model

Outline of Topics
1. Linear Model Theory
2. Single-Factor ANCOVA
3. Generalizations of ANCOVA

Overview
- Analysis of covariance (ANCOVA) is a technique that combines features of ANOVA and regression.
- ANCOVA can be used for either observational studies or designed experiments.
- The idea is to augment the ANOVA model containing the factor effects (we will only consider fixed effects) with one or more additional quantitative variables that are related to the response variable.
- These quantitative variables are called covariates (also called concomitant variables).
- This augmentation is intended to reduce the large error term variances that are sometimes present in ANOVA models.
- Note that if the covariates are qualitative (e.g., gender, political affiliation, geographic region), then the model remains an ANOVA model, in which the original factors are of primary interest and the covariates are included simply for the purpose of error variance reduction.

Coordinatized Version of the Linear Model
- In an ANCOVA model, we have observations that are taken in different categories, as in ANOVA, but we also have some other predictors that are measured on an interval scale, as in regression.
- In the general form of the ANCOVA model, we observe
\[
\mathbf{Y} \sim N_N(\boldsymbol{\delta} + \mathbf{X}\boldsymbol{\gamma},\; \sigma^2\mathbf{I}), \quad \boldsymbol{\delta} \in Q,\; \boldsymbol{\gamma} \in \mathbb{R}^s,\; \sigma^2 > 0, \tag{1}
\]
where $\mathbf{X}$ is a known $N \times s$ matrix (which we call the covariate matrix in ANCOVA) and $Q$ is a known $q$-dimensional subspace.
- We assume that $\boldsymbol{\delta}$, $\boldsymbol{\gamma}$, and $\sigma^2$ are unknown parameters.
- Typically, $Q$ is a subspace from ANOVA.
- We call the ANOVA model with subspace $Q$ the associated ANOVA model.

Coordinate-Free Version of the Linear Model
- Let
\[
\mathbf{X}^* = P_{Q^\perp}\mathbf{X}, \qquad \mathbf{Y}^* = P_{Q^\perp}\mathbf{Y}.
\]
- In order to guarantee that the covariates are not confounded with the ANOVA component of the model, we need to assume that $\mathbf{X}^*$ has full rank; i.e., that $\operatorname{rank}(\mathbf{X}^*) = s$.
- Note that it is not enough that $\mathbf{X}$ have full column rank in order to have a possible covariance analysis.
- Let $U$ be the $s$-dimensional subspace spanned by the columns of $\mathbf{X}^*$ and let $V$ be the subspace of possible values for $\boldsymbol{\mu}$; i.e.,
\[
V = \{\boldsymbol{\mu} = \boldsymbol{\delta} + \mathbf{X}\boldsymbol{\gamma} : \boldsymbol{\delta} \in Q,\; \boldsymbol{\gamma} \in \mathbb{R}^s\}.
\]
- Then
\[
\mathbf{Y} \sim N_N(\boldsymbol{\mu},\; \sigma^2\mathbf{I}), \quad \boldsymbol{\mu} \in V,\; \sigma^2 > 0,
\]
which is just the ordinary linear model.

Adjusted Estimators
- Note that $U \perp Q$ and
\[
\boldsymbol{\mu} = \boldsymbol{\delta} + P_Q\mathbf{X}\boldsymbol{\gamma} + \mathbf{X}^*\boldsymbol{\gamma}, \qquad \boldsymbol{\delta} + P_Q\mathbf{X}\boldsymbol{\gamma} \in Q,\quad \mathbf{X}^*\boldsymbol{\gamma} \in U.
\]
- Therefore, $\boldsymbol{\mu} \in V = Q \oplus U$.
- Hence
\[
\hat{\boldsymbol{\mu}} = P_V\mathbf{Y} = P_Q\mathbf{Y} + P_U\mathbf{Y} = \hat{\boldsymbol{\delta}}_u + \mathbf{X}^*\hat{\boldsymbol{\gamma}},
\]
where
\[
\hat{\boldsymbol{\delta}}_u = P_Q\mathbf{Y}, \qquad \hat{\boldsymbol{\gamma}} = (\mathbf{X}^{*\mathsf{T}}\mathbf{X}^*)^{-1}\mathbf{X}^{*\mathsf{T}}\mathbf{Y} = (\mathbf{X}^{*\mathsf{T}}\mathbf{X}^*)^{-1}\mathbf{X}^{*\mathsf{T}}\mathbf{Y}^*.
\]
- Note that $\hat{\boldsymbol{\delta}}_u$ is just the estimator of $\boldsymbol{\delta}$ based on the associated ANOVA model, and that $\hat{\boldsymbol{\gamma}}$ is just the least squares estimator of $\boldsymbol{\gamma}$ found by regressing $\mathbf{Y}$ (or $\mathbf{Y}^*$) on $\mathbf{X}^*$.
- The "u" subscript means the result is unadjusted for the covariate, while the "a" subscript means the result has been adjusted for the covariate.
- Further results using orthogonal direct sums allow us to write
\[
\|\hat{\boldsymbol{\mu}}\|^2 = \|\hat{\boldsymbol{\delta}}_u\|^2 + \|\mathbf{X}^*\hat{\boldsymbol{\gamma}}\|^2,
\]
and $p = \dim(V) = \dim(Q) + \dim(U) = q + s$.

Optimal Estimators
- Note that
\[
\hat{\boldsymbol{\gamma}} = (\mathbf{X}^{*\mathsf{T}}\mathbf{X}^*)^{-1}\mathbf{X}^{*\mathsf{T}}\hat{\boldsymbol{\mu}}, \qquad \boldsymbol{\gamma} = (\mathbf{X}^{*\mathsf{T}}\mathbf{X}^*)^{-1}\mathbf{X}^{*\mathsf{T}}\boldsymbol{\mu}.
\]
- Therefore, $\hat{\boldsymbol{\gamma}}$ is the optimal (i.e., best unbiased) estimator of $\boldsymbol{\gamma}$; however,
\[
\mathrm{E}[\hat{\boldsymbol{\delta}}_u] = P_Q(\boldsymbol{\delta} + \mathbf{X}\boldsymbol{\gamma}) = \boldsymbol{\delta} + (\mathbf{X} - \mathbf{X}^*)\boldsymbol{\gamma}.
\]
- Therefore, $\hat{\boldsymbol{\delta}}_u$ is the optimal estimator of $\boldsymbol{\delta} + (\mathbf{X} - \mathbf{X}^*)\boldsymbol{\gamma}$ for the ANCOVA model.
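The projection calculations above can be sketched numerically. Below is a minimal illustration on simulated single-factor data (all data and names are hypothetical, not from the text): for the one-way layout, $P_Q$ replaces each observation by its group mean, so $\mathbf{X}^*$ and $\mathbf{Y}^*$ are simply within-group-centered versions of the covariate and response.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-factor data: 3 groups of 5 observations, one covariate x
groups = np.repeat([0, 1, 2], 5)                  # factor levels
x = rng.normal(10, 2, size=15)                    # covariate
y = 2.0 + np.array([0.0, 1.0, 3.0])[groups] + 0.5 * x + rng.normal(0, 0.3, 15)

def P_Q(v):
    """Projection onto the one-way ANOVA subspace Q: group means."""
    means = np.array([v[groups == g].mean() for g in range(3)])
    return means[groups]

# X* = P_{Q-perp} X and Y* = P_{Q-perp} Y: residuals after removing group means
x_star = x - P_Q(x)
y_star = y - P_Q(y)

# gamma-hat: least squares slope from regressing Y (or Y*) on X*
gamma_hat = (x_star @ y) / (x_star @ x_star)

# delta-hat_u: the unadjusted estimator (the ordinary ANOVA group means)
delta_hat_u = P_Q(y)

# mu-hat = delta-hat_u + X* gamma-hat
mu_hat = delta_hat_u + x_star * gamma_hat

# Regressing Y and Y* on X* gives the same gamma-hat, as claimed above
assert np.isclose(gamma_hat, (x_star @ y_star) / (x_star @ x_star))
print(gamma_hat)  # estimated covariate slope (true value here is 0.5)
```

Note the design choice: no matrices are inverted because $s = 1$, so $(\mathbf{X}^{*\mathsf{T}}\mathbf{X}^*)^{-1}\mathbf{X}^{*\mathsf{T}}\mathbf{Y}$ reduces to a ratio of inner products.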
- Note that the optimal estimator of
\[
\boldsymbol{\delta} = \left(P_Q - (\mathbf{X} - \mathbf{X}^*)(\mathbf{X}^{*\mathsf{T}}\mathbf{X}^*)^{-1}\mathbf{X}^{*\mathsf{T}}\right)\boldsymbol{\mu}
\]
is
\[
\hat{\boldsymbol{\delta}}_a = \left(P_Q - (\mathbf{X} - \mathbf{X}^*)(\mathbf{X}^{*\mathsf{T}}\mathbf{X}^*)^{-1}\mathbf{X}^{*\mathsf{T}}\right)\hat{\boldsymbol{\mu}} = \hat{\boldsymbol{\delta}}_u - (\mathbf{X} - \mathbf{X}^*)\hat{\boldsymbol{\gamma}}.
\]
- Thus, we have $\hat{\boldsymbol{\delta}}_a$, which has been adjusted for the covariate.

Adjusted Sums of Squares
- We can now construct the sums of squares and degrees of freedom for the error when adjusting for covariates.
- First, $SSE_a$ and $df_{e,a}$ for the ANCOVA model are given by
\[
SSE_a = \|\mathbf{Y} - \hat{\boldsymbol{\mu}}\|^2 = \|\mathbf{Y}\|^2 - \left(\|\hat{\boldsymbol{\delta}}_u\|^2 + \|\mathbf{X}^*\hat{\boldsymbol{\gamma}}\|^2\right) = SSE_u - \|\mathbf{X}^*\hat{\boldsymbol{\gamma}}\|^2
\]
\[
df_{e,a} = N - \dim(V) = N - (q + s) = df_{e,u} - s,
\]
where $SSE_u$ and $df_{e,u}$ are the sum of squares and degrees of freedom for the error in the associated ANOVA model.
- We can compute $MSE_a = SSE_a / df_{e,a}$ in the obvious way.
- In order to produce the results presented above, it is necessary to find $\mathbf{X}^*$.
- Let $\hat{\boldsymbol{\delta}}_u = P_Q\mathbf{Y}$ be the estimator of $\boldsymbol{\delta}$ under the associated ANOVA model.
- Then, the residual vector for the full ANOVA model is given by
\[
\mathbf{Y} - \hat{\boldsymbol{\delta}}_u = P_{Q^\perp}\mathbf{Y} = \mathbf{Y}^*.
\]
- To find $\mathbf{X}^* = P_{Q^\perp}\mathbf{X}$, we apply the same operation to the $x$ values that we apply to the $y$ values when we find the residuals $\mathbf{Y}^*$ for the associated ANOVA model.

Inference Setup
- Next, consider testing the null hypothesis that $\boldsymbol{\delta} \in T$, a known $t$-dimensional subspace of $Q$.
- Let
\[
\mathbf{X}^{**} = P_{T^\perp}\mathbf{X}, \qquad \mathbf{Y}^{**} = P_{T^\perp}\mathbf{Y}.
\]
- Note that if $\mathbf{X}^*$ has full rank, then $\mathbf{X}^{**}$ does also.
- As written earlier, $\mathbf{Y}^{**}$ is the vector of residuals for the reduced form of the associated ANOVA model, and $\mathbf{X}^{**}$ is the same function of the columns of $\mathbf{X}$ as $\mathbf{Y}^{**}$ is of $\mathbf{Y}$.
- Let $W$ be the subspace of possible values of $\boldsymbol{\mu}$ under the null hypothesis.
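The error sum of squares decomposition above, $SSE_a = SSE_u - \|\mathbf{X}^*\hat{\boldsymbol{\gamma}}\|^2$, can be checked numerically. A small sketch on simulated one-way data (all numbers and names hypothetical), verifying the identity against the residuals of the full ANCOVA fit:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical one-way layout: 3 groups of 6 observations, one covariate
groups = np.repeat([0, 1, 2], 6)
N, q, s = 18, 3, 1                                # q = dim(Q), s = one covariate
x = rng.normal(20, 4, N)
y = np.array([5.0, 6.0, 8.0])[groups] + 0.4 * x + rng.normal(0, 0.5, N)

def P_Q(v):
    """Projection onto the one-way ANOVA subspace (group means)."""
    return np.array([v[groups == g].mean() for g in range(3)])[groups]

# Within-group-centered covariate and response: X* and Y*
x_star, y_star = x - P_Q(x), y - P_Q(y)
gamma_hat = (x_star @ y_star) / (x_star @ x_star)

SSE_u = y_star @ y_star                           # ANOVA error sum of squares
SSE_a = SSE_u - gamma_hat**2 * (x_star @ x_star)  # SSE_u - ||X* gamma-hat||^2

# Direct check: residuals from the full ANCOVA fit give the same SSE_a
resid = y - (P_Q(y) + x_star * gamma_hat)
assert np.isclose(SSE_a, resid @ resid)

df_e_u = N - q          # error df for the associated ANOVA model
df_e_a = df_e_u - s     # adjusting for the covariate costs s df
MSE_a = SSE_a / df_e_a
```

The assertion holds exactly (up to floating point) because $\mathbf{X}^*\hat{\boldsymbol{\gamma}}$ is the orthogonal projection of $\mathbf{Y}^*$ onto $U$.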
- By the same argument from earlier, we see that
\[
\tilde{\boldsymbol{\mu}} = P_W\mathbf{Y} = \tilde{\boldsymbol{\delta}}_u + \mathbf{X}^{**}\tilde{\boldsymbol{\gamma}},
\]
where
\[
\tilde{\boldsymbol{\delta}}_u = P_T\mathbf{Y}, \qquad \tilde{\boldsymbol{\gamma}} = (\mathbf{X}^{**\mathsf{T}}\mathbf{X}^{**})^{-1}\mathbf{X}^{**\mathsf{T}}\mathbf{Y} = (\mathbf{X}^{**\mathsf{T}}\mathbf{X}^{**})^{-1}\mathbf{X}^{**\mathsf{T}}\mathbf{Y}^{**},
\]
and
\[
\|\tilde{\boldsymbol{\mu}}\|^2 = \|\tilde{\boldsymbol{\delta}}_u\|^2 + \|\mathbf{X}^{**}\tilde{\boldsymbol{\gamma}}\|^2, \qquad k = \dim(W) = t + s.
\]

Test Statistic
- Therefore, for testing $\boldsymbol{\delta} \in T$ for the ANCOVA model, the sum of squares, degrees of freedom, and mean square for the hypothesis are given, respectively, by
\[
\begin{aligned}
SSH_a &= \|\hat{\boldsymbol{\mu}}\|^2 - \|\tilde{\boldsymbol{\mu}}\|^2 \\
&= \left(\|\hat{\boldsymbol{\delta}}_u\|^2 + \|\mathbf{X}^*\hat{\boldsymbol{\gamma}}\|^2\right) - \left(\|\tilde{\boldsymbol{\delta}}_u\|^2 + \|\mathbf{X}^{**}\tilde{\boldsymbol{\gamma}}\|^2\right) \\
&= SSH_u + \|\mathbf{X}^*\hat{\boldsymbol{\gamma}}\|^2 - \|\mathbf{X}^{**}\tilde{\boldsymbol{\gamma}}\|^2 \\
&= SSH_u + \|\mathbf{X}^*\hat{\boldsymbol{\gamma}} - \mathbf{X}^{**}\tilde{\boldsymbol{\gamma}}\|^2
\end{aligned}
\]
\[
df_{h,a} = (q + s) - (t + s) = q - t = df_{h,u}, \qquad MSH_a = SSH_a / df_{h,a}.
\]
- $SSH_u$ and $df_{h,u}$ are the sum of squares and degrees of freedom, respectively, for the hypothesis in the associated ANOVA model.
- Note that the degrees of freedom for the hypothesis in the ANCOVA model are the same as the degrees of freedom for the hypothesis in the associated ANOVA model, and that if $\mathbf{X}^{**} = \mathbf{X}^*$, then the sum of squares and mean square for the hypothesis in the ANCOVA model are the same as the sum of squares and mean square for the hypothesis in the associated ANOVA model.
- As usual, the optimal size $\alpha$ test of this hypothesis rejects if
\[
F^* = \frac{MSH_a}{MSE_a} > F_{1-\alpha;\, df_{h,a},\, df_{e,a}}.
\]

Example: Goats Data
A veterinarian carried out an experiment on a goat farm to determine whether a standard worm-drenching program (to prevent the goats from acquiring worms) was adequate. Forty goats were used in the experiment: twenty were chosen completely at random and drenched according to the standard program, while the remaining twenty were drenched more frequently. The goats were individually tagged and weighed at the start and at the end of the year-long study. The objective of the experiment is to compare the liveweight gains between the two treatments. This comparison could be made using an ANOVA; however, a commonly observed biological phenomenon allows us to increase the precision of the analysis.
Namely, the lighter animals gain more weight than the heavier animals, so we have a "regression to the norm" setting. Since this can be assumed to occur within both treatment groups, it is appropriate to adjust the analysis to use that covariate information.

Example: Goats Data
To understand why incorporating the covariate information might be effective, look at the plot of the data below. Here, we plotted the weight gain of the goats separately for each treatment (either "standard" or "intensive"). It is evident that the error terms, as shown by the scatter around the treatment means given by the respective horizontal lines, are fairly large. Thus, this indicates a large error term variance.

[Figure: weight gain versus goat number, in two panels ("Standard Treatment" and "Intensive Treatment"), with horizontal lines marking the treatment means. Vertical axis: Weight Gain; horizontal axis: Goat #.]

Example: Goats Data
Suppose that we now utilize the goats' initial weight. The scatterplot below shows the goats' weight gain versus their initial weight, with different colors corresponding to the different treatments. Note that the two treatment relationships happen to be linear, although this need not be the case. The scatter around the treatment regression lines is much smaller than the scatter around the treatment means shown on the previous slide, as a result of the goats' weight gain being highly linearly related to the initial weight.
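In the spirit of the goats example, the ANCOVA F test for a treatment effect can be sketched end to end. The data below are simulated stand-ins (the actual goats data are not reproduced here): two treatments of twenty animals each, with weight gain linearly related to initial weight. The full model spans $V = Q \oplus U$ (treatment means plus covariate) and the reduced model spans $W$ (common mean plus covariate), so $df_{h,a} = q - t = 1$ and $df_{e,a} = N - (q + s) = 37$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Simulated stand-in for the goats setup (NOT the real data):
# 2 treatments, 20 goats each; lighter goats gain more weight
n_per = 20
trt = np.repeat([0, 1], n_per)                   # 0 = standard, 1 = intensive
initial = rng.normal(22, 3, 2 * n_per)           # initial weight (covariate)
gain = 12 - 0.35 * initial + 1.0 * trt + rng.normal(0, 0.6, 2 * n_per)

def sse(X, y):
    """Residual sum of squares from least squares of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

ones = np.ones_like(gain)
full = np.column_stack([ones, trt, initial])     # ANCOVA: treatment + covariate
reduced = np.column_stack([ones, initial])       # null model: no treatment effect

SSE_a = sse(full, gain)                          # error SS under the full model
SSH_a = sse(reduced, gain) - SSE_a               # hypothesis sum of squares
df_h, df_e = 1, 2 * n_per - 3                    # q - t = 1; N - (q + s) = 37

F = (SSH_a / df_h) / (SSE_a / df_e)              # F* = MSH_a / MSE_a
p_value = stats.f.sf(F, df_h, df_e)
print(F, p_value)
```

Because the covariate soaks up the large initial-weight variation, the adjusted error mean square is small and the treatment contrast is detected with far fewer animals than an unadjusted ANOVA would need.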