Analysis of Covariance in Agronomy and Crop Research

Rong-Cai Yang
Alberta Agriculture and Rural Development and University of Alberta
CSA Statistics Workshop, Saskatoon, June 21, 2010

Outline
• Overview of Analysis of Covariance (ANCOVA)
• Basic theory and principles
• Conventional uses
• Elaborated applications
• Take-home messages

• Most stats textbooks devote one chapter to ANCOVA (e.g., Steel et al. 1997, Ch 17; Snedecor & Cochran 1980, Ch 18)
  - Milliken and Johnson (2002, Analysis of Messy Data, Volume 3: Analysis of Covariance) devote an entire book to the subject
• Other books written specifically for SAS users also have a chapter on ANCOVA (e.g., Littell et al. 2006, SAS for Mixed Models, 2nd ed., Ch 7)
However, ANCOVA is a more advanced topic that often appears towards the end of books, and it is taught cursorily or ignored completely in many stats classes.

What is ANCOVA?
• ANCOVA is a statistical technique that combines the methods of ANOVA and regression.
• ANCOVA has two types of independent variables
  - Dummy (0-1) variables for treatment IDs
  - Continuous variables (covariates), which are directly measured
• If there are only dummy variables, ANCOVA becomes ANOVA
• If there are only covariates, ANCOVA becomes regression analysis

Why ANCOVA?
• Two kinds of nuisance factors contribute to experimental error
  - Controlled factors are handled by blocking
  - Measured factors are handled by ANCOVA
• Both reduce experimental error

Choice of covariates
• Nuisance factors that can be measured but not controlled (e.g., by blocking)
  - Closely related to the response variable (y)
• Remove more error variation that cannot be accounted for by blocking
  - e.g., insect movement and soil fertility gradient are not in the same direction
• Pre-study variables (i.e., measured before the start of the study) to ensure that they are not influenced by the treatments being tested
  - Plot-to-plot heterogeneity (e.g., soil moisture, soil nutrients, weed population, non-uniform insect distribution)
  - Residual effects of previous trials

Statistical model
• The simplest ANCOVA model: a one-way treatment structure with effects $t_i$, one independent covariate $x_{ij}$ and its associated regression coefficient β
  - $y_{ij} = \beta_0 + t_i + \beta x_{ij} + \varepsilon_{ij}$
• This model represents a set of parallel lines
• The common slope of the lines is β
• The intercept of the ith line is $\beta_0 + t_i$
• If the $x_{ij}$ were not measured, the term $\beta x_{ij}$ could not be determined and would thus be included in the error term
  - $y_{ij} = \beta_0 + t_i + e_{ij}$

How does ANCOVA work?
• ANCOVA is essentially an ANOVA of the quantity $y_{ij} - \beta x_{ij}$. The value of the slope β is chosen so that the error SS of $y_{ij} - \beta x_{ij}$,
  - $E_{yy} - 2\beta E_{xy} + \beta^2 E_{xx}$
  is minimized. Some rearrangement leads to
  - $E_{xx}(\beta - E_{xy}/E_{xx})^2 + E_{yy} - E_{xy}^2/E_{xx}$
• Thus, the least squares estimate of the slope is
  - $\beta = E_{xy}/E_{xx}$
  and the minimum error SS is
  - $E_{yy} - E_{xy}^2/E_{xx}$

| Source | df | $SS_x$ | $CP_{xy}$ | $SS_y$ | df | $SS_{y \mid x}$ |
|---|---|---|---|---|---|---|
| Treatment (T) | $t-1$ | $T_{xx}$ | $T_{xy}$ | $T_{yy}$ | | |
| Error (E) | $t(r-1)$ | $E_{xx}$ | $E_{xy}$ | $E_{yy}$ | $t(r-1)-1$ | $SS_1 = E_{yy} - E_{xy}^2/E_{xx}$ |
| Total (T+E) | $tr-1$ | $T_{xx}+E_{xx}$ | $T_{xy}+E_{xy}$ | $T_{yy}+E_{yy}$ | $tr-2$ | $SS_2 = T_{yy}+E_{yy} - (T_{xy}+E_{xy})^2/(T_{xx}+E_{xx})$ |
| Adj treatment | | | | | $t-1$ | $T'_{yy} = SS_2 - SS_1$ |

$T'_{yy}$ is the adjusted SS for treatment (= SAS output)

Conventional uses of ANCOVA
• Adjusted means
  - Trt means of the y variable are adjusted to a common value of the covariate => equitable comparison of trt means
• Statistical control of errors
  - Variation in y due to its association with covariates is removed from the error variance => more precise estimates of trt means and more powerful tests
• Testing for homogeneity of slopes for different treatment groups
  - Are regression lines parallel for different groups?
• Estimating missing values
  - Less useful now => present-day stats software can easily handle unbalanced data

Stand (x) and yield (y) (lbs field weight of ear corn) of six varieties in RCBD with four blocks (Snedecor & Cochran 1980, Table 18.5.2)

SAS code:

    proc mixed data=sc;
      class variety block;
      model y=variety block x;
    run;

Output from SAS PROC MIXED:

    Type 3 Tests of Fixed Effects
               Num   Den
    Effect      DF    DF   F Value   Pr > F
    variety      5    14      6.64   0.0023
    block        3    14      5.15   0.0132
    x            1    14     76.01   <.0001

Detailed calculations

$\beta = 917.25/113.83 = 8.06$

| Source | df | $SS_x$ | $CP_{xy}$ | $SS_y$ | df | $SS_{y \mid x}$ | $MS_{y \mid x}$ | F | Pr > F |
|---|---|---|---|---|---|---|---|---|---|
| Block | 3 | 21.67 | 8.50 | 436.17 | | | | | |
| Variety | 5 | 45.83 | 559.25 | 9490.00 | | | | | |
| Error | 15 | 113.83 | 917.25 | 8752.33 | 14 | 1361.26 | 97.23 | | |
| V+E | 20 | 159.67 | 1476.50 | 18242.33 | 19 | 4588.50 | | | |
| Adj V | | | | | 5 | 3227.24 | 645.45 | 6.64 | 0.0023 |

Adjusting treatment means
• Single mean:
  - $\bar{y}_{i.}(\text{adj}) = \bar{y}_{i.} - \beta(\bar{x}_{i.} - \bar{x}_{..})$
  with a standard error (SE)
  - $SE = \{MS_{y \mid x}[1/r + (\bar{x}_{i.} - \bar{x}_{..})^2/E_{xx}]\}^{0.5}$

| Variety | $\bar{x}_{i.}$ | $\bar{y}_{i.}$ | $\bar{y}_{i.}(\text{adj})$ | SE |
|---|---|---|---|---|
| a | 24.0 | 173.0 | 191.8 | 5.38 |
| b | 25.3 | 182.3 | 191.0 | 5.03 |
| c | 26.5 | 194.5 | 193.2 | 4.93 |
| d | 28.0 | 232.8 | 219.3 | 5.17 |
| e | 27.8 | 201.0 | 189.6 | 5.10 |
| f | 26.5 | 215.0 | 213.7 | 4.93 |

$\bar{x}_{..} = 26.3$

For example, for the adjusted mean of variety a,
  - $\bar{y}_{a.}(\text{adj}) = 173.0 - (8.06)(24.0 - 26.3) = 191.8$
  - $SE = \{97.23[1/4 + (-2.3)^2/113.8]\}^{0.5} = 5.38$
(The inputs shown are rounded, so recomputing from them reproduces these results only approximately.)

SAS PROC MIXED output with LSMEANS statement:

    Effect    variety   Estimate   Standard Error   DF   t Value   Pr > |t|
    variety   a           191.80   5.3814           14     35.64     <.0001
    variety   b           190.98   5.0310           14     37.96     <.0001
    variety   c           193.16   4.9328           14     39.16     <.0001
    variety   d           219.32   5.1654           14     42.46     <.0001
    variety   e           189.58   5.1013           14     37.16     <.0001
    variety   f           213.66   4.9328           14     43.31     <.0001

Difference between adjusted treatment means
• Difference between two adjusted means:
  - $\bar{y}_{i.}(\text{adj}) - \bar{y}_{j.}(\text{adj}) = \bar{y}_{i.} - \bar{y}_{j.} - \beta(\bar{x}_{i.} - \bar{x}_{j.})$
  with a standard error (SE)
  - $SE = \{MS_{y \mid x}[1/r_i + 1/r_j + (\bar{x}_{i.} - \bar{x}_{j.})^2/E_{xx}]\}^{0.5}$

For example, for the difference between varieties a and b (means as in the table of adjusted treatment means),
  - $\bar{y}_{a.}(\text{adj}) - \bar{y}_{b.}(\text{adj}) = 191.8 - 191.0 = 0.8$
  - $SE = \{97.23[2/4 + (24.0 - 25.3)^2/113.8]\}^{0.5} = 7.07$

SAS PROC MIXED output with LSMEANS statement:

    Effect    variety   _variety   Estimate   Standard Error   DF   t Value   Pr > |t|
    variety   a         b            0.8223   7.0677           14      0.12     0.9090
    variety   a         c           -1.3554   7.3455           14     -0.18     0.8562
    variety   a         d          -27.5187   7.8920           14     -3.49     0.0036
    variety   a         e            2.2169   7.7865           14      0.28     0.7800
    variety   a         f          -21.8554   7.3455           14     -2.98     0.0100
    ...

Relative efficiency of ANCOVA vs. ANOVA
• Error MS from ANOVA without considering the covariate (x) = 8752.33/15 = 583.49
• Error MS from ANCOVA, after adjusting for the covariate (x) = 97.23
• Effective error MS = $MS_{y \mid x}[1 + T_{xx}/((t-1)E_{xx})]$ = 97.23 × [1 + 45.83/(5 × 113.83)] = 105.06
• Rel Efficiency = 583.49/105.06 = 5.55
  - ANCOVA with 10 replications gives estimates as precise as unadjusted means with about 55 replications!
• The sums of squares and products used here are those in the Detailed calculations table for the corn data

Elaborated applications of ANCOVA to agronomy and crop research
• Application #1: Analysis of dosage response
• Application #2: Analysis of treatment stability across environments
• Application #3: Analysis of spatial variability

Application #1: Analysis of dosage response

Objectives:
1. Examine if there is differential yield response to N rates in dry and wet seasons
2. Determine if it is necessary to have a separate technology recommendation for the two seasons

    /*Gomez and Gomez 1984, pages 317 - 327*/
    /*A fertilizer trial with five nitrogen rates (kg/ha) tested on rice yield (tonne/ha)
      for each of two seasons, dry and wet. Each trial has a RCBD with three replications.*/
    options ls=100 ps=6000;
    data raw;
      input season $ nitrogen r1 r2 r3;  /* r1-r3: yields in replications 1 to 3 */
      n=nitrogen; *n is set to be a covariate;
    datalines;
    dry 0 4.891 2.577 4.541
    dry 60 6.009 6.625 5.672
    dry 90 6.712 6.693 6.799
    dry 120 6.458 6.675 6.639
    dry 150 5.683 6.868 5.692
    wet 0 4.999 3.503 5.356
    wet 60 6.351 6.316 6.582
    wet 90 6.071 5.969 5.893
    wet 120 4.818 4.024 5.813
    wet 150 3.436 4.047 3.740
    ;
    run;

Since there are five N rates, a fourth-degree polynomial can be fit

[Figure: yield (t/ha) against nitrogen (kg/ha, 0 to 150) for each of the three replications in the dry and wet seasons]

Standard (old) method:
• Partition trt SS using orthogonal contrasts due to linear, quadratic and higher-order regression effects
  - Trt SS = SS(linear) + SS(quadratic) + ...
  - Older editions of textbooks give tables of orthogonal polynomial coefficients for balanced data with equally spaced treatment levels
• How about unbalanced data with unequally spaced treatment levels?
• How to estimate regression equations?
• In a factorial experiment, are all regressions the same over all levels of the other factor (i.e., homogeneity of slopes)?

Orthogonal polynomial analysis
• Using the SAS IML ORPOL function to obtain orthogonal polynomial coefficients for five unequally spaced nitrogen levels (0, 60, 90, 120 and 150)

    proc iml;
      levels={0 60 90 120 150};
      coef=orpol(levels`);
      print coef;
    quit;
    run;

| Nitrogen level | 0 | 60 | 90 | 120 | 150 |
|---|---|---|---|---|---|
| Linear | -0.7278 | -0.2080 | 0.0520 | 0.3119 | 0.5719 |
| Quadratic | 0.4907 | -0.4729 | -0.4595 | -0.1160 | 0.5576 |
| Cubic | -0.1677 | 0.6312 | -0.2170 | -0.6213 | 0.3748 |
| Quartic | 0.0367 | -0.3671 | 0.7342 | -0.5507 | 0.1468 |

• The coefficients are orthonormal because the squared coefficients for each contrast sum to one.
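
As a cross-check outside SAS, the coefficients in the table above can be reproduced with a QR decomposition of a polynomial design matrix. The sketch below uses Python with NumPy rather than SAS/IML, which is an assumption of this example (the slides use ORPOL only). The function name orthonormal_poly_contrasts is hypothetical, and the centring and scaling step is added purely for numerical stability; it does not change the resulting contrasts.

```python
import numpy as np

def orthonormal_poly_contrasts(levels):
    """Orthonormal polynomial contrasts (degree 1 .. k-1) for k distinct levels.

    Hypothetical helper for illustration; it is not part of SAS or the slides.
    """
    x = np.asarray(levels, dtype=float)
    # Centre and scale the levels: an affine change of x leaves the
    # orthogonal polynomials unchanged but keeps the matrix well conditioned.
    z = (x - x.mean()) / np.abs(x - x.mean()).max()
    # Columns 1, z, z^2, ..., z^(k-1)
    V = np.vander(z, N=len(x), increasing=True)
    Q, R = np.linalg.qr(V)
    # QR fixes each column only up to sign; force a positive diagonal of R so
    # every polynomial has a positive leading coefficient.
    Q = Q * np.sign(np.diag(R))
    # Drop the constant column and return one contrast per row.
    return Q[:, 1:].T

coef = orthonormal_poly_contrasts([0, 60, 90, 120, 150])
print(np.round(coef, 4))
```

The rows are the linear, quadratic, cubic and quartic contrasts in that order. With the sign convention chosen here, each contrast is positive at the highest nitrogen level, which agrees with the signs in the table above.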
