Analysis of Covariance Handout


<p>Statistics 701: Analysis of Covariance Handout</p>
<p>Data set for illustration: a rehabilitation data set, where Y = number of days of treatment and the qualitative factor is "Physical Fitness", coded 1 = Below Average, 2 = Average, and 3 = High Average. The covariate (quantitative factor) is Age, in years. The full data set is given below.</p>
<pre>
YNumDays  APhyFit  Rep  XAge
      29        1    1  18.3
      42        1    2  30.0
      38        1    3  26.5
      40        1    4  28.1
      43        1    5  29.7
      40        1    6  27.8
      30        1    7  19.8
      42        1    8  29.3
      30        2    1  20.8
      35        2    2  25.2
      39        2    3  29.2
      28        2    4  20.0
      31        2    5  21.5
      31        2    6  22.1
      29        2    7  19.7
      35        2    8  24.7
      29        2    9  20.2
      33        2   10  22.9
      26        3    1  22.7
      32        3    2  28.7
      21        3    3  18.9
      20        3    4  18.0
      23        3    5  21.7
      22        3    6  20.0
</pre>
<p>Here is a plot of these data with NumDays as the dependent variable and Age as the independent variable, with different markers indicating the Physical Fitness level. The plot was generated in Minitab using the grouping option.</p>
<p>[Figure: scatter plot of YNumDays versus XAge, with separate markers for Physical Fitness levels 1, 2, and 3.]</p>
<p>The plot shows clearly that the response variable is affected by Age, so the analysis of these data should use a model that takes the effect of Age into account. This is the type of data for which an Analysis of Covariance model is appropriate.</p>
<p>Analysis of Covariance Model (ANOCOVA)</p>
<p>$Y_{ij} = \mu + \alpha_i + \beta X_{ij} + \varepsilon_{ij}$, $j = 1, 2, \ldots, n_i$; $i = 1, 2, \ldots, p$, with $p = 3$.</p>
<p>Here $\mu$ is the grand mean, $\alpha_i$ is the effect of the $i$th level of Physical Fitness, $\beta$ is the regression coefficient for Age, and $\varepsilon_{ij}$ is the random error.</p>
<p>The analysis was performed in Minitab (ANOVA using the General Linear Model option, with Age entered through the Covariate option within it). The output from this run is given below.
</p>
<pre>
Worksheet size: 100000 cells
Retrieving project from file: C:\COURSES\STAT70~1\LECTUR~1\ANALYS~1.MPJ

General Linear Model

Factor   Type   Levels  Values
APhyFit  fixed       3  1 2 3

Analysis of Variance for YNumDays, using Adjusted SS for Tests

Source    DF   Seq SS   Adj SS   Adj MS        F      P
APhyFit    2   672.00   246.08   123.04   399.11  0.000
XAge       1   409.83   409.83   409.83  1329.39  0.000
Error     20     6.17     6.17     0.31
Total     23  1088.00

Term         Coef    StDev      T      P
Constant   3.9083   0.7610   5.14  0.000
XAge      1.16729  0.03201  36.46  0.000

Unusual Observations for YNumDays

Obs  YNumDays      Fit  StDev Fit  Residual  St Resid
 23   23.0000  24.0389     0.2267   -1.0389    -2.05R

R denotes an observation with a large standardized residual.
</pre>
<p>Conclusions</p>
<p>1. The covariate Age has a significant effect, as can be seen from the p-value for XAge in the ANOVA table. The estimate of the regression coefficient is 1.16729, with an estimated standard error of 0.03201. Note that this test is performed after removing the effect of the qualitative factor.</p>
<p>2. After removing the effect of the covariate Age, there are also significant differences among the levels of the factor "Physical Fitness", as can be seen from the p-value for APhyFit. To see which levels differ, we examine the results of the Tukey multiple comparison procedure, which are given below.
Since none of these simultaneous confidence intervals for the difference between two means includes zero, we conclude that the three levels of physical fitness have different mean numbers of days of treatment, even after the effect of age has been removed.</p>
<pre>
Tukey 95.0% Simultaneous Confidence Intervals
Response Variable YNumDays
All Pairwise Comparisons among Levels of APhyFit

APhyFit = 1 subtracted from:

APhyFit   Lower  Center   Upper
2        -2.574  -1.847  -1.121
3        -9.566  -8.723  -7.880

APhyFit = 2 subtracted from:

APhyFit   Lower  Center   Upper
3        -7.606  -6.876  -6.146
</pre>
<p>Also included below are the fitted values and the (estimated) residuals.</p>
<pre>
YNumDays  APhyFit  Rep  XAge    FITS1     RESI1
      29        1    1  18.3  28.7930   0.20697
      42        1    2  30.0  42.4503  -0.45028
      38        1    3  26.5  38.3648  -0.36478
      40        1    4  28.1  40.2324  -0.23244
      43        1    5  29.7  42.1001   0.89991
      40        1    6  27.8  39.8822   0.11775
      30        1    7  19.8  30.5440  -0.54396
      42        1    8  29.3  41.6332   0.36682
      30        2    1  20.8  29.8639   0.13613
      35        2    2  25.2  34.9999   0.00007
      39        2    3  29.2  39.6691  -0.66907
      28        2    4  20.0  28.9300  -0.93004
      31        2    5  21.5  30.6810   0.31903
      31        2    6  22.1  31.3813  -0.38134
      29        2    7  19.7  28.5799   0.42015
      35        2    8  24.7  34.4163   0.58372
      29        2    9  20.2  29.1635  -0.16349
      33        2   10  22.9  32.3152   0.68483
      26        3    1  22.7  25.2062   0.79380
      32        3    2  28.7  32.2099  -0.20991
      21        3    3  18.9  20.7705   0.22949
      20        3    4  18.0  19.7199   0.28005
      23        3    5  21.7  24.0389  -1.03891
      22        3    6  20.0  22.0545  -0.05452
</pre>
<p>[Figure: plot of the fitted values versus XAge, with separate markers for Physical Fitness levels 1, 2, and 3.]</p>
<p>Towards Providing Parameter Estimates</p>
<p>The cell means for the response variable NumDays are given below.
</p>
<pre>
Rows: APhyFit

      YNumDays  YNumDays  YNumDays
             N      Mean     StDev
1            8    38.000     5.477
2           10    32.000     3.464
3            6    24.000     4.427
All         24    32.000     6.878
</pre>
<p>The cell means for the covariate Age are given below.</p>
<pre>
Rows: APhyFit

          XAge      XAge      XAge
             N      Mean     StDev
1            8    26.188     4.566
2           10    22.630     2.990
3            6    21.667     3.858
All         24    23.575     4.098
</pre>
<p>Recall that the estimate of the regression coefficient was 1.16729.</p>
<p>Estimate of the grand mean: 32 - (1.16729)(23.575) = 4.4811</p>
<p>Estimates of the treatment effects for the levels of Physical Fitness:</p>
<p>Below Average: (38 - 32) - (1.16729)(26.188 - 23.575) = 2.9499</p>
<p>Average: (32 - 32) - (1.16729)(22.630 - 23.575) = 1.1031</p>
<p>High Average: (24 - 32) - (1.16729)(21.667 - 23.575) = -5.7728</p>
<p>Note that (8)(2.9499) + (10)(1.1031) + (6)(-5.7728) = -0.0066, which is 0 up to rounding error: the estimated effects, weighted by the sample sizes, sum to zero.</p>
<p>Remark: The testing procedure could also be performed using the "Extra Sum of Squares" approach. To test for differences among the levels of the qualitative variable PhyFit, we fit two models:</p>
<p>Full model: includes Age and PhyFit; its ANOVA was given above.</p>
<p>Reduced model: includes only Age. Its ANOVA is given by</p>
<pre>
Analysis of Variance for YNumDays, using Adjusted SS for Tests

Source  DF   Seq SS  Adj SS  Adj MS      F      P
XAge     1   835.75  835.75  835.75  72.89  0.000
Error   22   252.25  252.25   11.47
Total   23  1088.00

Term        Coef   StDev      T      P
Constant  -2.682   4.121  -0.65  0.522
XAge      1.4711  0.1723   8.54  0.000
</pre>
<p>The extra sum of squares associated with PhyFit is therefore SSE(Reduced) - SSE(Full) = 252.25 - 6.17 = 246.08, with 22 - 20 = 2 degrees of freedom. This extra sum of squares is the same as the Adjusted SS for APhyFit in the first ANOVA table (it corresponds to the Type III SS). The same approach could be used to test whether the regression coefficient is zero, but that test is already available from the first ANOVA table.</p>
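<p>The parameter estimates and the extra-sum-of-squares test above can be reproduced without Minitab. The following is a minimal sketch in plain Python, not the handout's own procedure: it assumes the model $Y_{ij} = \mu + \alpha_i + \beta X_{ij} + \varepsilon_{ij}$ with the sample-size-weighted constraint $\sum_i n_i \alpha_i = 0$ used in the hand calculations, and it estimates $\beta$ by the pooled within-group slope. All variable names are illustrative.</p>

```python
# Rehabilitation data from the handout: (fitness level, age, days of treatment).
data = [
    (1, 18.3, 29), (1, 30.0, 42), (1, 26.5, 38), (1, 28.1, 40),
    (1, 29.7, 43), (1, 27.8, 40), (1, 19.8, 30), (1, 29.3, 42),
    (2, 20.8, 30), (2, 25.2, 35), (2, 29.2, 39), (2, 20.0, 28),
    (2, 21.5, 31), (2, 22.1, 31), (2, 19.7, 29), (2, 24.7, 35),
    (2, 20.2, 29), (2, 22.9, 33),
    (3, 22.7, 26), (3, 28.7, 32), (3, 18.9, 21), (3, 18.0, 20),
    (3, 21.7, 23), (3, 20.0, 22),
]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


n = len(data)
levels = sorted({g for g, _, _ in data})
groups = {g: [(x, y) for lvl, x, y in data if lvl == g] for g in levels}

# Overall and per-level means of the covariate and the response.
x_bar = mean([x for _, x, _ in data])
y_bar = mean([y for _, _, y in data])
x_means = {g: mean([x for x, _ in pts]) for g, pts in groups.items()}
y_means = {g: mean([y for _, y in pts]) for g, pts in groups.items()}

# Pooled within-group sums of squares and cross products (full model).
sxx_w = sum((x - x_means[g]) ** 2 for g, pts in groups.items() for x, _ in pts)
sxy_w = sum((x - x_means[g]) * (y - y_means[g])
            for g, pts in groups.items() for x, y in pts)
syy_w = sum((y - y_means[g]) ** 2 for g, pts in groups.items() for _, y in pts)

beta_hat = sxy_w / sxx_w                  # common slope for Age
sse_full = syy_w - sxy_w ** 2 / sxx_w     # error SS of the full model
df_full = n - len(levels) - 1             # 24 - 3 - 1 = 20

# Total sums of squares and cross products (reduced model: Age only).
sxx_t = sum((x - x_bar) ** 2 for _, x, _ in data)
sxy_t = sum((x - x_bar) * (y - y_bar) for _, x, y in data)
syy_t = sum((y - y_bar) ** 2 for _, _, y in data)

sse_reduced = syy_t - sxy_t ** 2 / sxx_t  # error SS of the reduced model
df_reduced = n - 2                        # 24 - 2 = 22

# Extra-sum-of-squares F statistic for the Physical Fitness factor.
extra_ss = sse_reduced - sse_full
f_stat = (extra_ss / (df_reduced - df_full)) / (sse_full / df_full)

# Estimates of the grand mean and the treatment effects.
mu_hat = y_bar - beta_hat * x_bar
alpha_hat = {g: (y_means[g] - y_bar) - beta_hat * (x_means[g] - x_bar)
             for g in levels}

print(f"slope estimate      : {beta_hat:.5f}")
print(f"SSE full / reduced  : {sse_full:.2f} / {sse_reduced:.2f}")
print(f"extra SS, F         : {extra_ss:.2f}, {f_stat:.2f}")
print(f"grand mean estimate : {mu_hat:.4f}")
for g in levels:
    print(f"effect of level {g}   : {alpha_hat[g]:.4f}")
```

<p>Because the script works from unrounded group means, its treatment-effect estimates can differ from the hand calculations in the fourth decimal place.</p>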
