Chapter 12 Analysis of Covariance
Any scientific experiment is performed to learn something unknown about a group of treatments and to test certain hypotheses about the corresponding treatment effects.
When the variability of experimental units is small relative to the treatment differences and the experimenter does not wish to use an experimental design, one can simply take a large number of observations on each treatment and compute their means. The variation around a mean can be made as small as desired by taking more observations.
When there is considerable variation among observations on the same treatment and it is not possible to take an unlimited number of observations, the techniques used for reducing the variation are (i) the use of a proper experimental design and (ii) the use of concomitant variables.
The use of concomitant variables is accomplished through the technique of analysis of covariance. If both techniques fail to control the experimental variability, then the number of replications of the different treatments (in other words, the number of experimental units) needs to be increased to a point where adequate control of variability is attained.
Introduction to the analysis of covariance model

In the linear model

Y = \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \epsilon,

if the explanatory variables are a mix of quantitative and indicator variables, i.e., some of them are qualitative and some are quantitative, then the linear model is termed an analysis of covariance (ANCOVA) model.
Note that the indicator variables do not provide as much information as the quantitative variables. For example, quantitative observations on age can be converted into an indicator variable. Let an indicator variable be
Analysis of Variance | Chapter 12 | Analysis of Covariance | Shalabh, IIT Kanpur

D = 1 if age \geq 17 years,
D = 0 if age < 17 years.

Now the following quantitative values of age can be changed into indicator variables.
Ages (years)    D
14              0
15              0
16              0
17              1
20              1
21              1
22              1
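As a quick illustration (not part of the original notes), the conversion above can be written in a few lines of Python; the variable names are my own.

```python
# Convert quantitative ages into the indicator variable D defined above:
# D = 1 if age >= 17 years, D = 0 if age < 17 years.
ages = [14, 15, 16, 17, 20, 21, 22]
D = [1 if age >= 17 else 0 for age in ages]
print(list(zip(ages, D)))  # reproduces the table above
```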
In many real applications, some variables may be quantitative and others qualitative. In such cases, ANCOVA provides a way out.
It helps in reducing the sum of squares due to error, which in turn is reflected in better model adequacy diagnostics.
To see how this works:
In the one-way model Y_{ij} = \mu + \alpha_i + \epsilon_{ij}, we have TSS_1 = SSA_1 + SSE_1.

In the two-way model Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}, we have TSS_2 = SSA_2 + SSB_2 + SSE_2.

In the three-way model Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + \epsilon_{ijk}, we have TSS_3 = SSA_3 + SSB_3 + SSC_3 + SSE_3.

If we have a given data set, then ideally

TSS_1 = TSS_2 = TSS_3,
SSA_1 = SSA_2 = SSA_3,
SSB_2 = SSB_3.

So SSE_1 \geq SSE_2 \geq SSE_3.

Note that in the construction of F-statistics, we use

\frac{SS(\text{effects})/df}{SSE/df}.

So the F-statistic essentially depends on the SSE: smaller SSE \Rightarrow larger F \Rightarrow more chance of rejection.
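The effect of a smaller SSE on the F-statistic can be seen with a small numerical sketch (the numbers are illustrative only, not from the notes):

```python
# F = (SS(effects)/df_effects) / (SSE/df_error): with the effect sum of
# squares held fixed, shrinking SSE inflates F and so increases the
# chance of rejecting the null hypothesis.
def f_statistic(ss_effect, df_effect, sse, df_error):
    return (ss_effect / df_effect) / (sse / df_error)

f_large_sse = f_statistic(50.0, 2, 40.0, 20)  # larger SSE
f_small_sse = f_statistic(50.0, 2, 10.0, 20)  # covariate absorbs variation
print(f_large_sse, f_small_sse)
```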
Since SSA, SSB, etc. here are based on dummy variables, if SSA, SSB, etc. were instead based on quantitative variables, they would provide more information. Such ideas are used in ANCOVA models: we construct the model by incorporating the quantitative explanatory variables into ANOVA models.
In another example, suppose our interest is to compare several different kinds of feed for their ability to put weight on animals. If we use ANOVA, then we use the final weights at the end of the experiment. However, the final weights of the animals depend upon the initial weights of the animals at the beginning of the experiment as well as upon the differences in feeds.
Use of ANCOVA models enables us to adjust or correct for these initial differences.
ANCOVA is useful for improving the precision of an experiment. Suppose the response Y is linearly related to a covariate X (or concomitant variable). Suppose the experimenter cannot control X but can observe it. ANCOVA involves adjusting Y for the effect of X. If such an adjustment is not made, then X can inflate the error mean square and make the true differences in Y due to treatments harder to detect.
If, for a given experimental material, the use of a proper experimental design cannot control the experimental variation, the use of concomitant variables (which are related to the experimental material) may be effective in reducing the variability. Consider the one-way classification model
E(Y_{ij}) = \mu_i,  i = 1, \ldots, p,  j = 1, \ldots, N_i,
Var(Y_{ij}) = \sigma^2.
If the usual analysis of variance for testing the hypothesis of equality of treatment effects shows a highly significant difference in the treatment effects due to some factor affecting the experiment, then consider the model which takes this effect into account:
E(Y_{ij}) = \mu_i + \beta t_{ij},  i = 1, \ldots, p,  j = 1, \ldots, N_i,
Var(Y_{ij}) = \sigma^2,

where t_{ij} are the observations on a concomitant variable (which is related to X_{ij}) and \beta is the regression coefficient associated with t_{ij}. With this model, the variability of treatment effects can be considerably reduced.
For example, in an agricultural experiment, if the experimental units are plots of land, then t_{ij} can be a measure of the fertility characteristic of the j-th plot receiving the i-th treatment and X_{ij} can be the yield.
In another example, suppose the experimental units are animals and the objective is to compare the growth rates of groups of animals receiving different diets. Note that the observed differences in growth rates can be attributed to diet only if all the animals are similar in observable characteristics like weight, age, etc., which influence the growth rates.
In the absence of such similarity, use t_{ij}, which is the weight or age of the j-th animal receiving the i-th treatment.
If we consider quadratic regression in t_{ij}, then

E(Y_{ij}) = \mu_i + \beta_1 t_{ij} + \beta_2 t_{ij}^2,  i = 1, \ldots, p,  j = 1, \ldots, n_i,
Var(Y_{ij}) = \sigma^2.

ANCOVA in this case is the same as ANCOVA with two concomitant variables t_{ij} and t_{ij}^2.
In two-way classification with one observation per cell,

E(Y_{ij}) = \mu + \alpha_i + \beta_j + \gamma t_{ij},  i = 1, \ldots, I,  j = 1, \ldots, J,

or

E(Y_{ij}) = \mu + \alpha_i + \beta_j + \gamma_1 t_{ij} + \gamma_2 w_{ij}

with \sum_i \alpha_i = 0, \sum_j \beta_j = 0; then (y_{ij}, t_{ij}) or (y_{ij}, t_{ij}, w_{ij}) are the observations in the (i, j)-th cell and t_{ij}, w_{ij} are the concomitant variables.
The concomitant variables can be fixed or random. We consider the case of fixed concomitant variables only.
One way classification
Let Y_{ij} (j = 1, \ldots, n_i, i = 1, \ldots, p) be a random sample of size n_i from the i-th normal population with mean

\mu_{ij} = E(Y_{ij}) = \alpha_i + \beta t_{ij},
Var(Y_{ij}) = \sigma^2,

where \alpha_i, \beta and \sigma^2 are the unknown parameters and t_{ij} are known constants which are the observations on a concomitant variable.
The null hypothesis is

H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_p.

Let

y_{io} = \frac{1}{n_i} \sum_j y_{ij},  y_{oj} = \frac{1}{p} \sum_i y_{ij},  y_{oo} = \frac{1}{n} \sum_{ij} y_{ij},
t_{io} = \frac{1}{n_i} \sum_j t_{ij},  t_{oj} = \frac{1}{p} \sum_i t_{ij},  t_{oo} = \frac{1}{n} \sum_{ij} t_{ij},
n = \sum_i n_i.

Under the whole parametric space (\Omega), use the likelihood ratio test, for which we obtain the \hat{\alpha}_i's and \hat{\beta} using the least squares principle or maximum likelihood estimation as follows:
Minimize

S = \sum_{ij} (y_{ij} - \mu_{ij})^2 = \sum_{ij} (y_{ij} - \alpha_i - \beta t_{ij})^2.

Setting \partial S / \partial \alpha_i = 0 for fixed \beta gives

\alpha_i = y_{io} - \beta t_{io}.

Put this in S and minimize the function by \partial S / \partial \beta = 0, i.e., minimizing \sum_{ij} [y_{ij} - y_{io} - \beta (t_{ij} - t_{io})]^2 with respect to \beta gives

\hat{\beta} = \frac{\sum_{ij} (y_{ij} - y_{io})(t_{ij} - t_{io})}{\sum_{ij} (t_{ij} - t_{io})^2}.

Thus

\hat{\alpha}_i = y_{io} - \hat{\beta} t_{io},
\hat{\mu}_{ij} = \hat{\alpha}_i + \hat{\beta} t_{ij}.
Since

\hat{y}_{ij} = \hat{\mu}_{ij} = \hat{\alpha}_i + \hat{\beta} t_{ij} = y_{io} + \hat{\beta} (t_{ij} - t_{io}),

we have

\sum_{ij} (y_{ij} - \hat{\mu}_{ij})^2 = \sum_{ij} (y_{ij} - y_{io})^2 - \frac{\left[\sum_{ij} (y_{ij} - y_{io})(t_{ij} - t_{io})\right]^2}{\sum_{ij} (t_{ij} - t_{io})^2}.
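The closed-form estimates above translate directly into code. The following is a minimal sketch (function and variable names are my own, not from the notes), assuming the data come as flat lists of responses, covariate values and group labels:

```python
def oneway_ancova_fit(y, t, g):
    """Least squares estimates for E(Y_ij) = alpha_i + beta * t_ij.

    beta_hat pools within-group cross-products, exactly as in the
    closed form derived above; alpha_hat_i = y_io - beta_hat * t_io.
    """
    groups = sorted(set(g))
    n_i = {i: g.count(i) for i in groups}
    y_io = {i: sum(yv for yv, gv in zip(y, g) if gv == i) / n_i[i] for i in groups}
    t_io = {i: sum(tv for tv, gv in zip(t, g) if gv == i) / n_i[i] for i in groups}
    num = sum((yv - y_io[gv]) * (tv - t_io[gv]) for yv, tv, gv in zip(y, t, g))
    den = sum((tv - t_io[gv]) ** 2 for tv, gv in zip(t, g))
    beta_hat = num / den
    alpha_hat = {i: y_io[i] - beta_hat * t_io[i] for i in groups}
    return beta_hat, alpha_hat

# Data generated exactly from the model with beta = 2, alpha_1 = 5, alpha_2 = 1:
g = [1, 1, 1, 2, 2, 2]
t = [1.0, 2.0, 3.0, 1.0, 2.0, 4.0]
y = [7.0, 9.0, 11.0, 3.0, 5.0, 9.0]
beta_hat, alpha_hat = oneway_ancova_fit(y, t, g)
print(beta_hat, alpha_hat)  # recovers beta = 2, alpha_1 = 5, alpha_2 = 1
```

With noise-free data the estimates reproduce the generating parameters exactly, which is a convenient sanity check on the formulas.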
Under H_0: \alpha_1 = \cdots = \alpha_p = \alpha (say), consider S_w = \sum_{ij} (y_{ij} - \alpha - \beta t_{ij})^2 and minimize S_w under the sample space (\omega):

\frac{\partial S_w}{\partial \alpha} = 0,  \frac{\partial S_w}{\partial \beta} = 0

give

\hat{\hat{\alpha}} = y_{oo} - \hat{\hat{\beta}} t_{oo},
\hat{\hat{\beta}} = \frac{\sum_{ij} (y_{ij} - y_{oo})(t_{ij} - t_{oo})}{\sum_{ij} (t_{ij} - t_{oo})^2},
\hat{\hat{\mu}}_{ij} = \hat{\hat{\alpha}} + \hat{\hat{\beta}} t_{ij}.
Hence
\sum_{ij} (y_{ij} - \hat{\hat{\mu}}_{ij})^2 = \sum_{ij} (y_{ij} - y_{oo})^2 - \frac{\left[\sum_{ij} (y_{ij} - y_{oo})(t_{ij} - t_{oo})\right]^2}{\sum_{ij} (t_{ij} - t_{oo})^2}

and

\sum_{ij} (\hat{\mu}_{ij} - \hat{\hat{\mu}}_{ij})^2 = \sum_{ij} \left[(y_{io} - y_{oo}) + \hat{\beta} (t_{ij} - t_{io}) - \hat{\hat{\beta}} (t_{ij} - t_{oo})\right]^2.
The likelihood ratio test statistic in this case is given by
\lambda = \frac{\max_{\omega} L(\alpha, \beta, \sigma^2)}{\max_{\Omega} L(\alpha, \beta, \sigma^2)},

which depends on the data only through

\frac{\sum_{ij} (\hat{\mu}_{ij} - \hat{\hat{\mu}}_{ij})^2}{\sum_{ij} (y_{ij} - \hat{\mu}_{ij})^2}.

Now we use the following theorems:
Theorem 1: Let Y = (Y_1, Y_2, \ldots, Y_n)' follow a multivariate normal distribution N(\mu, \Sigma) with mean vector \mu and positive definite covariance matrix \Sigma. Then Y'AY follows a noncentral chi-square distribution with p degrees of freedom and noncentrality parameter \mu'A\mu, i.e., \chi^2(p, \mu'A\mu), if and only if A\Sigma is an idempotent matrix of rank p.
Theorem 2: Let Y = (Y_1, Y_2, \ldots, Y_n)' follow a multivariate normal distribution N(\mu, \Sigma) with mean vector \mu and positive definite covariance matrix \Sigma. Let Y'A_1Y follow \chi^2(p_1, \mu'A_1\mu) and Y'A_2Y follow \chi^2(p_2, \mu'A_2\mu). Then Y'A_1Y and Y'A_2Y are independently distributed if A_1 \Sigma A_2 = 0.
Theorem 3: Let Y = (Y_1, Y_2, \ldots, Y_n)' follow a multivariate normal distribution N(\mu, \sigma^2 I). Then the maximum likelihood (or least squares) estimator L'\hat{\beta} of an estimable linear parametric function L'\beta is independently distributed of \hat{\sigma}^2; L'\hat{\beta} follows N(L'\beta, \sigma^2 L'(X'X)^{-1}L) and \frac{n\hat{\sigma}^2}{\sigma^2} follows \chi^2(n - p), where rank(X) = p.
Using these theorems on the independence of quadratic forms, and dividing the numerator and denominator by their respective degrees of freedom, we have
F = \frac{n - p - 1}{p - 1} \cdot \frac{\sum_{ij} (\hat{\mu}_{ij} - \hat{\hat{\mu}}_{ij})^2}{\sum_{ij} (y_{ij} - \hat{\mu}_{ij})^2} \sim F(p - 1, n - p - 1) under H_0.
So reject H_0 whenever F > F_{1-\alpha}(p - 1, n - p - 1) at the \alpha level of significance. The terms involved in F can be simplified for computational convenience as follows:
We can write
\sum_{ij} (y_{ij} - \hat{\hat{\mu}}_{ij})^2 = T_{yy} - \frac{T_{yt}^2}{T_{tt}},

\sum_{ij} (y_{ij} - \hat{\mu}_{ij})^2 = E_{yy} - \frac{E_{yt}^2}{E_{tt}},

\sum_{ij} (\hat{\mu}_{ij} - \hat{\hat{\mu}}_{ij})^2 = \left(T_{yy} - \frac{T_{yt}^2}{T_{tt}}\right) - \left(E_{yy} - \frac{E_{yt}^2}{E_{tt}}\right),

where

T_{yy} = \sum_{ij} (y_{ij} - y_{oo})^2,
T_{tt} = \sum_{ij} (t_{ij} - t_{oo})^2,
T_{yt} = \sum_{ij} (y_{ij} - y_{oo})(t_{ij} - t_{oo}),
E_{yy} = \sum_{ij} (y_{ij} - y_{io})^2,
E_{tt} = \sum_{ij} (t_{ij} - t_{io})^2,
E_{yt} = \sum_{ij} (y_{ij} - y_{io})(t_{ij} - t_{io}).
The analysis of covariance table for one-way classification is as follows:

Source of    Degrees of   Sum of products         Adjusted sum of squares
variation    freedom      yy      yt      tt      Degrees of   Sum of squares               F
                                                  freedom
--------------------------------------------------------------------------------------------------------------
Population   p - 1        P_yy    P_yt    P_tt    p - 1        q_1 = q_0 - q_2              [(n-p-1)/(p-1)] q_1/q_2
Error        n - p        E_yy    E_yt    E_tt    n - p - 1    q_2 = E_yy - E_yt^2/E_tt
Total        n - 1        T_yy    T_yt    T_tt    n - 2        q_0 = T_yy - T_yt^2/T_tt

where P_yy = T_yy - E_yy, P_yt = T_yt - E_yt, P_tt = T_tt - E_tt.
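The computational route through the T and E quantities can be sketched as follows (an illustrative implementation, with names of my own choosing):

```python
def oneway_ancova_F(y, t, g):
    """F statistic for H0: alpha_1 = ... = alpha_p via q0, q1, q2 above."""
    n, p = len(y), len(set(g))
    groups = sorted(set(g))
    y_oo, t_oo = sum(y) / n, sum(t) / n
    y_io = {i: sum(yv for yv, gv in zip(y, g) if gv == i) / g.count(i) for i in groups}
    t_io = {i: sum(tv for tv, gv in zip(t, g) if gv == i) / g.count(i) for i in groups}
    T_yy = sum((yv - y_oo) ** 2 for yv in y)
    T_tt = sum((tv - t_oo) ** 2 for tv in t)
    T_yt = sum((yv - y_oo) * (tv - t_oo) for yv, tv in zip(y, t))
    E_yy = sum((yv - y_io[gv]) ** 2 for yv, gv in zip(y, g))
    E_tt = sum((tv - t_io[gv]) ** 2 for tv, gv in zip(t, g))
    E_yt = sum((yv - y_io[gv]) * (tv - t_io[gv]) for yv, tv, gv in zip(y, t, g))
    q0 = T_yy - T_yt ** 2 / T_tt   # adjusted total SS, df n - 2
    q2 = E_yy - E_yt ** 2 / E_tt   # adjusted error SS, df n - p - 1
    q1 = q0 - q2                   # adjusted treatment SS, df p - 1
    return (q1 / (p - 1)) / (q2 / (n - p - 1))

g = [1, 1, 1, 2, 2, 2]
t = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0, 10.0, 12.0, 15.0]
F = oneway_ancova_F(y, t, g)
print(F)
```

The returned F would then be compared with the F(p - 1, n - p - 1) critical value.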
If H_0 is rejected, employ multiple comparison methods to determine which of the contrasts in \alpha_i are responsible for this rejection.
For any estimable linear parametric contrast

\varphi = \sum_{i=1}^{p} C_i \alpha_i with \sum_{i=1}^{p} C_i = 0,

\hat{\varphi} = \sum_{i=1}^{p} C_i \hat{\alpha}_i = \sum_{i=1}^{p} C_i y_{io} - \hat{\beta} \sum_{i=1}^{p} C_i t_{io},

Var(\hat{\beta}) = \frac{\sigma^2}{\sum_{ij} (t_{ij} - t_{io})^2},

Var(\hat{\varphi}) = \sigma^2 \left[ \sum_{i=1}^{p} \frac{C_i^2}{n_i} + \frac{\left(\sum_{i=1}^{p} C_i t_{io}\right)^2}{\sum_{ij} (t_{ij} - t_{io})^2} \right].
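These contrast formulas can be coded directly. The sketch below (names my own; sigma2 stands in for the error variance, which in practice is replaced by its estimate) computes \hat{\varphi} and Var(\hat{\varphi}):

```python
def contrast_estimate(y, t, g, C, sigma2):
    """Estimate phi = sum_i C_i alpha_i and Var(phi_hat) per the formulas above.

    C maps group label -> contrast coefficient (coefficients sum to zero);
    sigma2 is the error variance (in practice its estimate is plugged in).
    """
    groups = sorted(set(g))
    n_i = {i: g.count(i) for i in groups}
    y_io = {i: sum(yv for yv, gv in zip(y, g) if gv == i) / n_i[i] for i in groups}
    t_io = {i: sum(tv for tv, gv in zip(t, g) if gv == i) / n_i[i] for i in groups}
    E_tt = sum((tv - t_io[gv]) ** 2 for tv, gv in zip(t, g))
    E_yt = sum((yv - y_io[gv]) * (tv - t_io[gv]) for yv, tv, gv in zip(y, t, g))
    beta_hat = E_yt / E_tt
    ct = sum(C[i] * t_io[i] for i in groups)
    phi_hat = sum(C[i] * y_io[i] for i in groups) - beta_hat * ct
    var_phi = sigma2 * (sum(C[i] ** 2 / n_i[i] for i in groups) + ct ** 2 / E_tt)
    return phi_hat, var_phi

g = [1, 1, 1, 2, 2, 2]
t = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0, 10.0, 12.0, 15.0]
phi_hat, var_phi = contrast_estimate(y, t, g, {1: 1.0, 2: -1.0}, sigma2=1.0 / 9.0)
print(phi_hat, var_phi)
```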
Two way classification (with one observation per cell)

Consider the case of two-way classification with one observation per cell.
Let y_{ij} \sim N(\mu_{ij}, \sigma^2) be independently distributed with

E(y_{ij}) = \mu + \alpha_i + \beta_j + \gamma t_{ij},  i = 1, \ldots, I,  j = 1, \ldots, J,
Var(y_{ij}) = \sigma^2,

where

\mu: grand mean,
\alpha_i: effect of the i-th level of A satisfying \sum_{i=1}^{I} \alpha_i = 0,
\beta_j: effect of the j-th level of B satisfying \sum_{j=1}^{J} \beta_j = 0,
t_{ij}: observation (known) on the concomitant variable.

The null hypotheses under consideration are

H_{0\alpha}: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0,
H_{0\beta}: \beta_1 = \beta_2 = \cdots = \beta_J = 0.
Dimension of the whole parametric space (\Omega): I + J.
Dimension of the sample space (\omega): J + 1 under H_{0\alpha}.
Dimension of the sample space (\omega): I + 1 under H_{0\beta},

with respective alternative hypotheses

H_{1\alpha}: at least one pair of \alpha's is not equal,
H_{1\beta}: at least one pair of \beta's is not equal.
Consider the estimation of parameters under the whole parametric space (\Omega).
Find the minimum value of \sum_{ij} (y_{ij} - \mu_{ij})^2 under \Omega. To do this, minimize

\sum_{ij} (y_{ij} - \mu - \alpha_i - \beta_j - \gamma t_{ij})^2.

For fixed \gamma, this gives on solving the least squares estimates (or the maximum likelihood estimates) of the respective parameters as
\mu = y_{oo} - \gamma t_{oo},
\alpha_i = (y_{io} - y_{oo}) - \gamma (t_{io} - t_{oo}),      (1)
\beta_j = (y_{oj} - y_{oo}) - \gamma (t_{oj} - t_{oo}).
Under these values of \mu, \alpha_i and \beta_j, the sum of squares \sum_{ij} (y_{ij} - \mu - \alpha_i - \beta_j - \gamma t_{ij})^2 reduces to

\sum_{ij} \left[(y_{ij} - y_{io} - y_{oj} + y_{oo}) - \gamma (t_{ij} - t_{io} - t_{oj} + t_{oo})\right]^2.      (2)

Now minimization of (2) with respect to \gamma gives
\hat{\gamma} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} (y_{ij} - y_{io} - y_{oj} + y_{oo})(t_{ij} - t_{io} - t_{oj} + t_{oo})}{\sum_{i=1}^{I} \sum_{j=1}^{J} (t_{ij} - t_{io} - t_{oj} + t_{oo})^2}.

Using \hat{\gamma}, we get from (1)
\hat{\mu} = y_{oo} - \hat{\gamma} t_{oo},
\hat{\alpha}_i = (y_{io} - y_{oo}) - \hat{\gamma} (t_{io} - t_{oo}),
\hat{\beta}_j = (y_{oj} - y_{oo}) - \hat{\gamma} (t_{oj} - t_{oo}).

Hence

\sum_{ij} (y_{ij} - \hat{\mu}_{ij})^2 = \sum_{ij} (y_{ij} - y_{io} - y_{oj} + y_{oo})^2 - \frac{\left[\sum_{ij} (y_{ij} - y_{io} - y_{oj} + y_{oo})(t_{ij} - t_{io} - t_{oj} + t_{oo})\right]^2}{\sum_{ij} (t_{ij} - t_{io} - t_{oj} + t_{oo})^2}
= E_{yy} - \frac{E_{yt}^2}{E_{tt}},

where

E_{yy} = \sum_{ij} (y_{ij} - y_{io} - y_{oj} + y_{oo})^2,
E_{yt} = \sum_{ij} (y_{ij} - y_{io} - y_{oj} + y_{oo})(t_{ij} - t_{io} - t_{oj} + t_{oo}),
E_{tt} = \sum_{ij} (t_{ij} - t_{io} - t_{oj} + t_{oo})^2.
Case (i): Test of H_{0\alpha}
Minimize \sum_{ij} (y_{ij} - \mu - \beta_j - \gamma t_{ij})^2 with respect to \mu, \beta_j and \gamma. This gives the least squares estimates (or the maximum likelihood estimates) of the respective parameters as

\hat{\hat{\mu}} = y_{oo} - \hat{\hat{\gamma}} t_{oo},
\hat{\hat{\beta}}_j = (y_{oj} - y_{oo}) - \hat{\hat{\gamma}} (t_{oj} - t_{oo}),
\hat{\hat{\gamma}} = \frac{\sum_{ij} (y_{ij} - y_{oj})(t_{ij} - t_{oj})}{\sum_{ij} (t_{ij} - t_{oj})^2},      (3)
\hat{\hat{\mu}}_{ij} = \hat{\hat{\mu}} + \hat{\hat{\beta}}_j + \hat{\hat{\gamma}} t_{ij}.

Substituting these estimates, we get

\sum_{ij} (y_{ij} - \hat{\hat{\mu}}_{ij})^2 = \sum_{ij} (y_{ij} - y_{oj})^2 - \frac{\left[\sum_{ij} (y_{ij} - y_{oj})(t_{ij} - t_{oj})\right]^2}{\sum_{ij} (t_{ij} - t_{oj})^2}
= (A_{yy} + E_{yy}) - \frac{(A_{yt} + E_{yt})^2}{A_{tt} + E_{tt}},

where

A_{yy} = J \sum_{i=1}^{I} (y_{io} - y_{oo})^2,
A_{tt} = J \sum_{i=1}^{I} (t_{io} - t_{oo})^2,
A_{yt} = J \sum_{i=1}^{I} (y_{io} - y_{oo})(t_{io} - t_{oo}),

and E_{yy}, E_{yt}, E_{tt} are as defined earlier.
Thus the likelihood ratio test statistic for testing H_{0\alpha} is based on

\frac{\sum_{ij} (y_{ij} - \hat{\hat{\mu}}_{ij})^2 - \sum_{ij} (y_{ij} - \hat{\mu}_{ij})^2}{\sum_{ij} (y_{ij} - \hat{\mu}_{ij})^2}.

Adjusting with degrees of freedom and using the earlier results for the independence of the two quadratic forms and their distributions,

F_1 = \frac{IJ - I - J}{I - 1} \cdot \frac{\sum_{ij} (y_{ij} - \hat{\hat{\mu}}_{ij})^2 - \sum_{ij} (y_{ij} - \hat{\mu}_{ij})^2}{\sum_{ij} (y_{ij} - \hat{\mu}_{ij})^2} \sim F(I - 1, IJ - I - J) under H_{0\alpha}.

So the decision rule is to reject H_{0\alpha} whenever F_1 > F_{1-\alpha}(I - 1, IJ - I - J).
Case (ii): Test of H_{0\beta}
Minimize \sum_{ij} (y_{ij} - \mu - \alpha_i - \gamma t_{ij})^2 with respect to \mu, \alpha_i and \gamma. This gives the least squares estimates (or maximum likelihood estimates) of the respective parameters as

\hat{\hat{\mu}} = y_{oo} - \hat{\hat{\gamma}} t_{oo},
\hat{\hat{\alpha}}_i = (y_{io} - y_{oo}) - \hat{\hat{\gamma}} (t_{io} - t_{oo}),
\hat{\hat{\gamma}} = \frac{\sum_{ij} (y_{ij} - y_{io})(t_{ij} - t_{io})}{\sum_{ij} (t_{ij} - t_{io})^2},      (4)
\hat{\hat{\mu}}_{ij} = \hat{\hat{\mu}} + \hat{\hat{\alpha}}_i + \hat{\hat{\gamma}} t_{ij}.

From (4), we get

\sum_{ij} (y_{ij} - \hat{\hat{\mu}}_{ij})^2 = \sum_{ij} (y_{ij} - y_{io})^2 - \frac{\left[\sum_{ij} (y_{ij} - y_{io})(t_{ij} - t_{io})\right]^2}{\sum_{ij} (t_{ij} - t_{io})^2}
= (B_{yy} + E_{yy}) - \frac{(B_{yt} + E_{yt})^2}{B_{tt} + E_{tt}},

where

B_{yy} = I \sum_{j=1}^{J} (y_{oj} - y_{oo})^2,
B_{tt} = I \sum_{j=1}^{J} (t_{oj} - t_{oo})^2,
B_{yt} = I \sum_{j=1}^{J} (y_{oj} - y_{oo})(t_{oj} - t_{oo}).
Thus the likelihood ratio test statistic for testing H_{0\beta} is

F_2 = \frac{IJ - I - J}{J - 1} \cdot \frac{\sum_{ij} (y_{ij} - \hat{\hat{\mu}}_{ij})^2 - \sum_{ij} (y_{ij} - \hat{\mu}_{ij})^2}{\sum_{ij} (y_{ij} - \hat{\mu}_{ij})^2} \sim F(J - 1, IJ - I - J) under H_{0\beta}.

So the decision rule is to reject H_{0\beta} whenever F_2 > F_{1-\alpha}(J - 1, IJ - I - J).
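The two tests can be assembled into one routine following the adjusted sums of squares above. This is an illustrative sketch (function name and data are my own), assuming a complete I x J layout with one observation per cell:

```python
def twoway_ancova_tests(y, t):
    """F1 (levels of A) and F2 (levels of B) for two-way ANCOVA with one
    observation per cell, via the E, A, B quantities defined above.
    y and t are I x J lists of lists."""
    I, J = len(y), len(y[0])
    y_io = [sum(y[i]) / J for i in range(I)]
    t_io = [sum(t[i]) / J for i in range(I)]
    y_oj = [sum(y[i][j] for i in range(I)) / I for j in range(J)]
    t_oj = [sum(t[i][j] for i in range(I)) / I for j in range(J)]
    y_oo, t_oo = sum(y_io) / I, sum(t_io) / I
    # interaction-style residuals y_ij - y_io - y_oj + y_oo (and same for t)
    ey = [[y[i][j] - y_io[i] - y_oj[j] + y_oo for j in range(J)] for i in range(I)]
    et = [[t[i][j] - t_io[i] - t_oj[j] + t_oo for j in range(J)] for i in range(I)]
    E_yy = sum(e ** 2 for row in ey for e in row)
    E_tt = sum(e ** 2 for row in et for e in row)
    E_yt = sum(ey[i][j] * et[i][j] for i in range(I) for j in range(J))
    A_yy = J * sum((y_io[i] - y_oo) ** 2 for i in range(I))
    A_tt = J * sum((t_io[i] - t_oo) ** 2 for i in range(I))
    A_yt = J * sum((y_io[i] - y_oo) * (t_io[i] - t_oo) for i in range(I))
    B_yy = I * sum((y_oj[j] - y_oo) ** 2 for j in range(J))
    B_tt = I * sum((t_oj[j] - t_oo) ** 2 for j in range(J))
    B_yt = I * sum((y_oj[j] - y_oo) * (t_oj[j] - t_oo) for j in range(J))
    q2 = E_yy - E_yt ** 2 / E_tt                                   # df IJ - I - J
    q3 = (A_yy + E_yy) - (A_yt + E_yt) ** 2 / (A_tt + E_tt)
    q4 = (B_yy + E_yy) - (B_yt + E_yt) ** 2 / (B_tt + E_tt)
    F1 = ((q3 - q2) / (I - 1)) / (q2 / (I * J - I - J))
    F2 = ((q4 - q2) / (J - 1)) / (q2 / (I * J - I - J))
    return F1, F2

t = [[1.0, 2.0, 3.0], [2.0, 4.0, 4.0], [3.0, 4.0, 6.0]]
y = [[3.0, 5.0, 8.0], [6.0, 10.0, 12.0], [9.0, 13.0, 18.0]]
F1, F2 = twoway_ancova_tests(y, t)
print(F1, F2)
```

Since the restricted models are nested inside the full model, q3 and q4 can never fall below q2, so both statistics are nonnegative.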
If H_{0\alpha} is rejected, use multiple comparison methods to determine which of the contrasts in \alpha_i are responsible for this rejection. The same is true for H_{0\beta}.
The analysis of covariance table for two way classification is as follows:
Source of       Degrees of    Sum of products                    Adjusted   Sum of squares                                    F
variation       freedom       yy         yt         tt           df
----------------------------------------------------------------------------------------------------------------------------------------
Between         I - 1         A_yy       A_yt       A_tt         I - 1      q_0 = q_3 - q_2                                   F_1 = [(IJ-I-J)/(I-1)] q_0/q_2
levels of A
Between         J - 1         B_yy       B_yt       B_tt         J - 1      q_1 = q_4 - q_2                                   F_2 = [(IJ-I-J)/(J-1)] q_1/q_2
levels of B
Error           (I-1)(J-1)    E_yy       E_yt       E_tt         IJ-I-J     q_2 = E_yy - E_yt^2/E_tt
Total           IJ - 1        T_yy       T_yt       T_tt         IJ - 2
Error +         IJ - J        A_yy+E_yy  A_yt+E_yt  A_tt+E_tt               q_3 = (A_yy+E_yy) - (A_yt+E_yt)^2/(A_tt+E_tt)
levels of A
Error +         IJ - I        B_yy+E_yy  B_yt+E_yt  B_tt+E_tt               q_4 = (B_yy+E_yy) - (B_yt+E_yt)^2/(B_tt+E_tt)
levels of B