
Topic 12. The Split-plot Design and its Relatives (continued)
Repeated Measures

12.9 Repeated measures analysis

Sometimes researchers make multiple measurements on the same experimental unit. We have encountered such measurements before and called them subsamples. In the context of the nested designs we have discussed, the primary uses of subsamples are:

1. To obtain a better estimate of the true value of an experimental unit (i.e. to reduce experimental error).
2. To estimate the components of variance in the system.

In these simple nested experiments, "subsample" was not a classification variable but merely an ID. That is, "Subsample 1" from experimental unit 1 was no more similar to "Subsample 1" from experimental unit 2 than it was to "Subsample 2" from experimental unit 2. Even though no two measurements can be made at exactly the same instant in time, we did not refer to these subsamples as repeated measures because we were not interested in characterizing the effect of time between measurements on the response variable.

Now assume that this effect of time is of interest and, again, that the measurements are made repeatedly on the same experimental unit (e.g. plant height over weeks; yield of perennial crops over seasons; animal growth over months; population dynamics over years, etc.). These observations are not replications because they are made on the same experimental unit (i.e. they are not independent). But neither are they subsamples, as defined above, because:

1. We are interested in the effect of time between measurements.
2. "Subsample" is no longer simply an ID: "Subsample 1 from e.u. 1" (e.g. Height of Plant 1 at Week 1) is more related to "Subsample 1 from e.u. 2" (Height of Plant 2 at Week 1) than it is to "Subsample 2 from e.u. 2" (Height of Plant 2 at Week 2).

Not replications, not subsamples, these sorts of measurements possess qualities of both and are called repeated measurements; and the split-plot model offers one means of analyzing such data. Note in this context that a key assumption of the ANOVA is not the independence of measurements but the independence of errors. Indeed, the split-plot model implicitly assumes a correlation among measurements.

Before outlining the split-plot approach to repeated measures data, know that time series or multivariate methods can also be used to analyze such data. Time series methods are more appropriate when analyzing long series of data, with more than 20 repeated measures per individual. Consequently, such analyses are more frequently applied to stock data or weather data than to agricultural experiments. Multivariate methods are very useful for shorter time series, and researchers who consistently rely on repeated measures in their experiments should become familiar with them. That being said, the far simpler univariate analysis presented here is also quite useful. Note that what distinguishes repeated-measures data from any other multivariate data is not so much the existence of the repeated measurements but the desire to examine changes in the measurements taken on each subject.

By simply designating the different points in time as "levels" of a "Time" factor, the split-plot principle can be applied to experiments where successive observations are made on the same experimental unit over a period of time. For example, a fertilizer trial or variety trial with a perennial crop like alfalfa might be harvested several times. Other examples might be repeated picking of fruit from the same trees in an orchard or repeated soil sampling of plots over time for nutrient content. In each case, the experimental units to which the treatment levels are assigned are the "main plots," and the several measurements over time are the "subplots."
A subplot in this case, however, differs from the usual subplot in that it consists of data taken from the entire main plot rather than from some designated portion of the main plot, as is the case with the usual split-plot.

In the simple, univariate split-plot approach to repeated measures data, the dependency among the observations within a main plot is used to adjust the degrees of freedom in the ANOVA to give approximate tests. The approximate ANOVA for a repeated measures CRD is shown below.

12.9.1. Repeated measures ANOVA

Approximate ANOVA of repeated measurement analysis
______________________________________________________________________________________________
Source                       df             SS         MS                       Conservative df
______________________________________________________________________________________________
Among Experimental Units
  Treatment (A)              a-1            SSA        SSA/(a-1)
  Rep*Trt (Error A)          a(n-1)         SS(MPE)    SS(MPE)/[a(n-1)]
Within Experimental Units
  Response in time (B)       b-1            SSB        SSB/(b-1)                1
  Response by Trt. (A*B)     (b-1)(a-1)     SSAB       SSAB/[(b-1)(a-1)]        a-1
  Error B                    a(b-1)(n-1)    SS(SPE)    SS(SPE)/[a(b-1)(n-1)]    a(n-1)
______________________________________________________________________________________________

The analysis looks like a split-plot analysis, except that conservative degrees of freedom are used in all F-tests for the effect of Time (i.e. the repeated measures) and any interactions with Time. No unusual problems arise when analyzing the effects of the main plot (A) because the analysis of the main plot is insensitive to the split, as seen before in the normal split-plot. However, F values generated by testing the effects of Time (subplot B) and the interaction of main plot treatments with Time (A*B) may not follow an F distribution, thereby generating erroneous results.

The normal split-plot model assumes that pairs of observations within the same main plot are equally correlated. With repeated-measures data, however, arbitrary pairs of observations on the same experimental unit are not necessarily equally correlated.
Measurements close in time are often more highly correlated than measurements far apart in time. Since this unequal correlation among the repeated measurements is ignored in a simple split-plot analysis, tests derived in this manner may not be valid.

To compensate for the fact that the assumption of uniform correlation across repeated measurements may not hold, a conservative approach is recommended by many statisticians, "conservative" because it requires larger F values to declare significance for B and A*B effects. In this approach, the degrees of freedom for B, A*B, and Error B are each divided by the degrees of freedom of B (response in time), that is, by (b-1). Critical F values are then taken from these conservative degrees of freedom (previous table, right column).

The uncorrected degrees of freedom are appropriate for independent replications within main plots. The corrected ones (right column) are appropriate for totally dependent replications, a situation equivalent to having all responses represented by a single response (this explains why the corrected df for B is one). Total dependency is the worst theoretically possible scenario and is therefore a severe condition to impose. The true level of dependency among repeated measurements in a real experiment will probably be somewhere between these two extremes.

12.9.2. Example of a repeated measurements experiment

An experiment was carried out to study the differences in yield of four alfalfa cultivars. Five replications of these four cultivars were organized according to a CRD, and four cuttings were made of each replication over time. The data represent the repeated measurements of yield (tons/acre) of the four cultivars and are analyzed as a split-plot CRD with repeated measures:

  4 levels of main plot A: Cultivars 1 – 4
  4 levels of subplot B: Cut times 1 – 4 (9/10/74, 6/25/75, 18/5/75, 9/16/75)

To analyze these data, we begin by carrying out a standard split-plot analysis.
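
As a check before running the program, the degrees of freedom for this layout can be tallied from the formulas in the approximate ANOVA table. The following is a minimal Python sketch, not part of the original SAS analysis; the function name `repeated_measures_df` and its argument names are hypothetical, and it assumes a balanced CRD with a treatments, b measurement times (b of at least 2), and n replications per treatment.

```python
def repeated_measures_df(a, b, n):
    """Degrees of freedom for a balanced repeated-measures CRD.

    a: number of treatment levels (main plots)
    b: number of measurement times (subplots), assumed >= 2
    n: number of replications per treatment
    Returns {source: (df, conservative_df)}; conservative_df is None
    for the main-plot sources, which need no adjustment.
    """
    time_df = b - 1  # df of B, the divisor used for the conservative df
    unadjusted = {
        "A": a - 1,
        "Error A": a * (n - 1),
        "B": b - 1,
        "A*B": (a - 1) * (b - 1),
        "Error B": a * (b - 1) * (n - 1),
    }
    table = {}
    for source, df in unadjusted.items():
        if source in ("B", "A*B", "Error B"):
            # Within-unit sources: divide by the df of Time
            table[source] = (df, df // time_df)
        else:
            table[source] = (df, None)
    return table


# Alfalfa example: 4 cultivars, 4 cuttings, 5 replications
table = repeated_measures_df(a=4, b=4, n=5)
for source, (df, conservative) in table.items():
    print(f"{source:8s} df={df:3d} conservative={conservative}")
```

For the alfalfa example this gives 3, 9, and 48 uncorrected degrees of freedom for B, A*B, and Error B, and 1, 3, and 16 conservative ones.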

SAS program

data rem_mes;
input rep A_var B_time yield @@;
cards;
1 1 1 2.80191 1 1 2 3.73092 1 1 3 3.09856 1 1 4 2.50965
1 2 1 2.76212 1 2 2 5.40530 1 2 3 3.82431 1 2 4 2.72992
1 3 1 2.29151 1 3 2 3.81140 1 3 3 2.92575 1 3 4 2.39863
1 4 1 2.56631 1 4 2 4.96070 1 4 3 2.81734 1 4 4 2.05752
2 1 1 2.96602 2 1 2 4.43545 2 1 3 3.10607 2 1 4 2.57299
2 2 1 3.09636 2 2 2 3.90683 2 2 3 3.26229 2 2 4 2.58614
2 3 1 2.54027 2 3 2 3.82716 2 3 3 2.86727 2 3 4 2.16287
2 4 1 2.31630 2 4 2 3.96629 2 4 3 2.91461 2 4 4 2.15764
3 1 1 2.43232 3 1 2 4.32311 3 1 3 2.81030 3 1 4 2.07966
3 2 1 3.09917 3 2 2 4.08859 3 2 3 3.13148 3 2 4 2.60316
3 3 1 2.41199 3 3 2 4.08317 3 3 3 3.03906 3 3 4 2.07076
3 4 1 2.65834 3 4 2 3.71856 3 4 3 2.92922 3 4 4 2.15684
4 1 1 2.93509 4 1 2 3.99711 4 1 3 2.77971 4 1 4 2.44033
4 2 1 2.65256 4 2 2 5.42879 4 2 3 2.70891 4 2 4 2.30163
4 3 1 2.30420 4 3 2 3.27852 4 3 3 2.72711 4 3 4 2.04933
4 4 1 2.47877 4 4 2 3.92048 4 4 3 3.06191 4 4 4 2.35822
5 1 1 2.42277 5 1 2 3.85657 5 1 3 3.24914 5 1 4 2.34131
5 2 1 2.63666 5 2 2 3.77458 5 2 3 3.09734 5 2 4 2.30082
5 3 1 2.36941 5 3 2 3.44835 5 3 3 2.50562 5 3 4 2.08980
5 4 1 2.23595 5 4 2 4.02985 5 4 3 2.85279 5 4 4 1.85736
;
proc glm;
class rep A_var B_time;
model yield = A_var rep*A_var B_time A_var*B_time;
* Test the main plot factor against Error A (rep*A_var);
test h=A_var e=rep*A_var;
means A_var / lsd e=rep*A_var;
means B_time / lsd;
run; quit;

Output (split plot CRD)

Class Level Information
Class     Levels   Values
REP       5        1 2 3 4 5
A_VAR     4        1 2 3 4
B_TIME    4        1 2 3 4

Number of observations in data set = 80

Dependent Variable: YIELD
                           Sum of        Mean
Source             DF      Squares       Square      F Value   Pr > F
Model              31      42.884847     1.383382    14.46     0.0001
Error              48      4.592605      0.095679
Corrected Total    79      47.477452

R-Square C.V.
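
The overall ANOVA lines in the output above can be checked by hand against the split-plot layout (a = 4 cultivars, b = 4 cuttings, n = 5 replications). The Python sketch below is an illustrative check only, not part of the original analysis: the numbers are copied from the listing, and the variable names are hypothetical. It confirms that the Model degrees of freedom are the sum of A, Rep*A, B, and A*B, that each mean square is its sum of squares divided by its degrees of freedom, and that the overall F value is the ratio of the two mean squares.

```python
# Values copied from the GLM listing (split plot CRD)
model_df, model_ss, model_ms = 31, 42.884847, 1.383382
error_df, error_ss, error_ms = 48, 4.592605, 0.095679
total_df, total_ss = 79, 47.477452
f_listed = 14.46

a, b, n = 4, 4, 5  # cultivars, cuttings, replications

# Model df = A + Rep*A (Error A) + B + A*B; the GLM Error line is Error B
model_df_expected = (a - 1) + a * (n - 1) + (b - 1) + (a - 1) * (b - 1)
error_df_expected = a * (b - 1) * (n - 1)

# Mean squares and the overall F recomputed from SS and df
model_ms_check = model_ss / model_df
error_ms_check = error_ss / error_df
f_check = model_ms / error_ms

print("Model df:", model_df_expected, "Error df:", error_df_expected)
print("Model MS:", round(model_ms_check, 6))
print("Error MS:", round(error_ms_check, 6))
print("F:", round(f_check, 2))
```

Note that the Error line of the overall ANOVA is Error B (48 df); the main plot test for cultivars uses Rep*A_var as its error term, as requested in the test statement.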