Handout #1: Randomization and Blocking

A common problem in comparative experiments is that, because the treatments are applied to different units, the treatment comparisons can be obscured by variation among the experimental units. When different responses are obtained on different units, are they due to differences among the treatments or to the inherent variability of the units? To prevent any treatment from systematically getting more than its fair share of the better units, the treatments are randomly assigned to the units. As we shall see later, in addition to guarding against potential systematic biases, randomization also provides a justification for the statistical analysis.

The simplest randomized designs are the completely randomized designs, in which the allocation of treatments is completely random. A disadvantage of complete randomization is that when the variation among the experimental units is large, the treatment comparisons do not have good precision. Blocking is an effective way to reduce experimental error. The experimental units are divided into more homogeneous groups called blocks. When the treatments are compared on the units within blocks, the variability between blocks is eliminated from the treatment comparisons and the precision is improved.

We shall distinguish between designs and experimental plans. A design is a non-randomized assignment of the treatments to unit labels. An experimental plan is the assignment that will be used in the actual experiment, and is obtained from the initial design by applying an appropriate random permutation of the unit labels. Each design is defined by a unit-treatment incidence matrix $X_T$, whose rows correspond to unit labels. A random permutation is then applied to the rows of $X_T$.
For example, in a completely randomized design where there are $N$ units and $t$ treatments with specified numbers of replications $r_1, \ldots, r_t$, $X_T$ is an arbitrary $N \times t$ (0, 1)-matrix such that there is one 1 in each row and $r_i$ 1's in the $i$th column. A randomized experimental plan is obtained by randomly permuting the rows of $X_T$, i.e., one of the $N!$ permutations is randomly picked.

In a block design in which the units are divided into $b$ blocks of size $k$, randomization is performed independently within each block, and the block labels are also randomly permuted. We make sure that the block structure of the units is preserved: two unit labels originally in the same block remain in the same block after the permutation. The randomization procedure described above picks one permutation of the unit labels randomly from the $(k!)^b\, b!$ allowable permutations that preserve the block structure. A major design problem is to choose $X_T$ according to some statistical criterion. When $k = t$, a clear choice is to make each treatment appear once in each block. In this case, there is no need to randomize among the blocks. This is called a randomized complete block design. The block structure of a block design with $b$ blocks of size $k$, called nesting, is denoted by $b/k$.

Another commonly encountered block structure involves two blocking factors: $rc$ experimental units are classified according to an $r$-level factor (called the row factor) and a $c$-level factor (called the column factor) such that there is one unit corresponding to each combination of the two factors. Although in some experiments (for example, agricultural experiments) the units are indeed laid out in rows and columns, usually the reference to rows and columns is more for convenience. In such row-column designs, we try to eliminate the variability among different levels of each of the two factors that may affect the experimental result.
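The two randomization procedures described so far (complete randomization, and randomization of a block design with nesting structure) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `crd_plan`, `block_plan` and `n_allowable_block_permutations` are hypothetical, treatments are coded 0, 1, 2, ..., and the replication numbers and block sizes in the usage lines are made up for the example.

```python
import random
from math import factorial

def crd_plan(replications, rng):
    """Completely randomized design: randomly reorder treatment labels over N units."""
    # Initial (non-randomized) design: treatment i occupies r_i consecutive unit labels.
    design = [trt for trt, r in enumerate(replications) for _ in range(r)]
    # Shuffling the list plays the role of randomly permuting the rows of the
    # unit-treatment incidence matrix (one of the N! permutations is picked).
    plan = design[:]
    rng.shuffle(plan)
    return plan

def block_plan(design, rng):
    """Block design: `design` is a list of b blocks, each a list of k treatments."""
    # Permute the units independently within each block ...
    blocks = [rng.sample(block, len(block)) for block in design]
    # ... and then permute the block labels.
    rng.shuffle(blocks)
    return blocks

def n_allowable_block_permutations(b, k):
    """Number of unit-label permutations preserving b blocks of size k."""
    return factorial(k) ** b * factorial(b)

rng = random.Random(0)                      # fixed seed only for reproducibility
print(crd_plan([2, 3, 1], rng))             # N = 6 units, t = 3 treatments
rcbd = [[0, 1, 2] for _ in range(4)]        # complete blocks: b = 4, k = t = 3
print(block_plan(rcbd, rng))
print(n_allowable_block_permutations(4, 3))  # (3!)^4 * 4! = 31104
```

In both functions the treatment counts are untouched by the randomization; only the assignment of treatments to unit labels changes, and in `block_plan` units never move between blocks.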
Randomization is performed by randomly permuting the row labels and, independently, randomly permuting the column labels. In other words, a permutation of the $rc$ unit labels is picked randomly from the $r!\,c!$ allowable permutations that preserve the row-column structure: after the permutation, two unit labels originally in the same row (column) remain in the same row (column, respectively). The block structure of an $r \times c$ row-column design, called crossing, is denoted by $r \times c$.

Nelder (1965) defined simple block structures to be those which can be obtained by iterations of nesting ($/$) and crossing ($\times$). For example, in $3/(4 \times 4)$, the 16 units in each of three blocks are arranged into 4 rows and 4 columns. Randomization of a simple block structure is carried out according to the prescription for nesting and crossing in each stage.

Let $S_N$ be the set of all the $N!$ permutations of the unit labels, denoted by $1, \ldots, N$. For each permutation $\pi \in S_N$, we denote the image of unit label $i$ under $\pi$ by $\pi(i)$, $i = 1, \ldots, N$, and for each simple block structure, let $\Pi$ be the set of the allowable permutations: those which preserve the given block structure. It can be shown that for any two unit labels $i$ and $j$,

$$\exists \text{ a permutation } \pi \text{ in } \Pi \text{ such that } \pi(i) = j. \tag{1.1}$$

Given any two pairs $(i, j)$ and $(i', j')$ of unit labels, we say that they are equivalent with respect to $\Pi$ if $\exists\, \pi \in \Pi$ such that $\pi(i) = i'$ and $\pi(j) = j'$. This relation partitions the $N^2$ pairs of unit labels into mutually disjoint equivalence classes: any two pairs of unit labels are in the same class if and only if they are equivalent. For example, for the block structure $b/k$, there are three equivalence classes:

$$\begin{aligned}
&\{(i, j) : i = j\},\\
&\{(i, j) : i \neq j, \text{ but they belong to the same block}\},\\
&\{(i, j) : i \text{ and } j \text{ are in different blocks}\}.
\end{aligned} \tag{1.2}$$

For $r \times c$, it is easy to see that there are four equivalence classes (which four?).

We shall make the important assumption of additivity between treatments and units.
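For small structures the equivalence classes of pairs can be checked by brute force. The following Python sketch enumerates every allowable permutation for a nesting structure and for a crossing structure and then collects the orbits of ordered pairs of unit labels. The function names and the labelling conventions (units numbered from 0, block by block or row by row) are assumptions made for this illustration, not notation from the handout.

```python
from itertools import permutations, product

def nesting_perms(b, k):
    """All allowable permutations for b blocks of size k; unit = block * k + position."""
    perms = []
    for block_perm in permutations(range(b)):
        for within in product(list(permutations(range(k))), repeat=b):
            # Unit (g, p) is sent to block block_perm[g], position within[g][p].
            perms.append([block_perm[u // k] * k + within[u // k][u % k]
                          for u in range(b * k)])
    return perms

def crossing_perms(r, c):
    """All allowable permutations for r rows and c columns; unit = row * c + column."""
    return [[rp[u // c] * c + cp[u % c] for u in range(r * c)]
            for rp in permutations(range(r))
            for cp in permutations(range(c))]

def pair_classes(perms, n):
    """Partition the n^2 ordered pairs (i, j) into equivalence classes under `perms`."""
    seen, classes = set(), []
    for i, j in product(range(n), repeat=2):
        if (i, j) in seen:
            continue
        # The class of (i, j) is the set of its images under all allowable permutations.
        orbit = {(p[i], p[j]) for p in perms}
        seen |= orbit
        classes.append(orbit)
    return classes

nest = nesting_perms(2, 3)
cross = crossing_perms(2, 3)
print(len(nest), len(pair_classes(nest, 6)))    # 72 3
print(len(cross), len(pair_classes(cross, 6)))  # 12 4
```

The counts 72 and 12 agree with $(k!)^b\, b!$ and $r!\,c!$ for $b = 2$, $k = 3$ and $r = 2$, $c = 3$. Inspecting the sets returned by `pair_classes` shows which pairs fall into each class.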
Denote by $y(u_i, t_j)$ an observation on unit $i$ when treatment $j$ is applied there. Our assumption is that

$$y(u_i, t_j) = \alpha_j + \beta_i, \tag{1.3}$$

where $\alpha_j$ is a constant representing the effect of the $j$th treatment, and $\beta_i$ is the effect of the $i$th unit. We allow $\beta_i$ to be a random variable (incorporating measurement errors, uncontrolled variations, etc.). We shall also assume that $\beta_i$ has a finite variance.

For convenience, we shall denote by $t(i)$ the treatment assigned to the $i$th unit label according to a design $X_T$. Let $\omega$ be a random element with $\Pr(\omega = \pi) = 1/|\Pi|$ for all $\pi \in \Pi$, where $|\Pi|$ is the number of permutations in $\Pi$. Note that $\omega$ represents the randomization process. For each $i = 1, \ldots, N$, let $y_i$ be the observation that takes place at unit $\omega(i)$. Then, by (1.3),

$$y_i = \sum_{l=1}^{N} \delta_{il}\,(\alpha_{t(i)} + \beta_l),$$

where $\delta_{il} = 1$ if $\omega(i) = l$, and 0 otherwise. Since $\sum_{l=1}^{N} \delta_{il} = 1$ for all $i$, we have

$$y_i = \alpha_{t(i)} + \sum_{l=1}^{N} \delta_{il}\,\beta_l. \tag{1.4}$$

Let

$$\beta_i' = \sum_{l=1}^{N} \delta_{il}\,\beta_l.$$

It follows from (1.1) that for any $i$, $\Pr(\omega(i) = l) = 1/N$ for all $l = 1, \ldots, N$. Therefore

$$\Pr(\beta_i' = \beta_l) = \frac{1}{N} \quad \text{for all } i \text{ and } l.$$

Consequently, $\beta_1', \ldots, \beta_N'$ have identical distributions. Let $\mu = E(\beta_i')$ and $\varepsilon_i = \beta_i' - \mu$. Then (1.4) can be written as

$$y_i = \mu + \alpha_{t(i)} + \varepsilon_i, \tag{1.5}$$

where $E(\varepsilon_i) = 0$ and $\varepsilon_1, \ldots, \varepsilon_N$ are identically distributed; or, in vector notation,

$$y = \mu \mathbf{1} + X_T\,\alpha + \varepsilon, \tag{1.6}$$

where $\alpha = (\alpha_1, \ldots, \alpha_t)'$ and $\mathbf{1}$ is the $N \times 1$ vector of ones. Let $V = \operatorname{cov}(y)$. It is easy to see that $\operatorname{cov}(y_i, y_j)$ is a constant for all pairs $(i, j)$ belonging to the same equivalence class induced by $\Pi$.

Example 1. In a completely randomized design, clearly there are only two equivalence classes in the partition of the $N^2$ pairs of unit labels induced by the allowable permutations: $\{(i, j) : i = j\}$ and $\{(i, j) : i \neq j\}$. Therefore

$$V = a I_N + b J_N$$

for some $a$ and $b$, where $I_N$ is the identity matrix of order $N$ and $J_N$ is the $N \times N$ matrix of ones.

Example 2. Block designs ($b/k$). In this case, $V$ has three different entries corresponding to the three equivalence classes given in (1.2).

Example 3. Similarly, for $r \times c$, $V$ has four different entries corresponding to whether two units are (i) identical, (ii) on the same row but different columns, (iii) on the same column but different rows, or (iv) on different rows and different columns.
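The claim that $\operatorname{cov}(y_i, y_j)$ is constant within each equivalence class can be checked numerically for a small block structure. The sketch below computes the covariance exactly by averaging over every allowable permutation, under the simplifying assumption that the unit effects are fixed numbers (the handout allows them to be random); since the treatment effects are constants, they do not affect the covariance and are left out. The function names and the particular unit effects are hypothetical.

```python
from itertools import permutations, product

def nesting_perms(b, k):
    """All allowable permutations for b blocks of size k; unit = block * k + position."""
    perms = []
    for block_perm in permutations(range(b)):
        for within in product(list(permutations(range(k))), repeat=b):
            perms.append([block_perm[u // k] * k + within[u // k][u % k]
                          for u in range(b * k)])
    return perms

def randomization_cov(perms, beta):
    """Exact covariance of (beta[w(0)], ..., beta[w(n-1)]) for w uniform on `perms`."""
    n, m = len(beta), len(perms)
    # Mean of the unit effect landing on label i, averaged over all permutations.
    mean = [sum(beta[p[i]] for p in perms) / m for i in range(n)]
    # Covariance as E[xy] - E[x]E[y], again averaged over all permutations.
    return [[sum(beta[p[i]] * beta[p[j]] for p in perms) / m - mean[i] * mean[j]
             for j in range(n)] for i in range(n)]

# Arbitrary fixed unit effects for 2 blocks of size 3 (assumed values).
beta = [0.3, 1.7, -0.4, 2.2, 0.9, -1.1]
V = randomization_cov(nesting_perms(2, 3), beta)

# Rounding removes floating-point noise before counting distinct entries.
distinct = {round(v, 9) for row in V for v in row}
print(len(distinct))  # 3
```

With units labelled block by block, the three values sit on the diagonal, in the off-diagonal positions of the two 3 by 3 diagonal blocks, and in the remaining positions, matching the three classes in (1.2).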