
Unit 9: Confounding and Fractional Factorial Designs

STA 643: Advanced Experimental Design

Derek S. Young

1 Learning Objectives

- Understand what it means for a treatment to be confounded with blocks
- Know how generalized interactions are used in confounding
- Know how to construct and analyze incomplete block designs for $2^k$ and $3^k$ factorial designs
- Become familiar with half-fraction and quarter-fraction designs
- Understand how we use aliasing and design generators for fractional factorial designs
- Understand the resolution of a design
- Understand how to use the Plackett-Burman designs for the purpose of screening a large number of factors

2 Outline of Topics

1 Confounding

2 Fractional Factorial Designs

3 Design Resolution

4 Leveraging Factorial Treatment Designs

- Recall that we designed and analyzed experiments that involved k factors, each at two levels; i.e., $2^k$ factorial designs.
- A major advantage of $2^k$ factorial designs is that they help determine whether the factors act independently or interact with one another as they affect the EUs.
- By far, $2^k$ and $3^k$ (i.e., three levels of each factor) factorial designs are the most common, but the number of treatment combinations increases rapidly as the number of factors increases.
- Thus, we turn to incomplete block designs to help control experimental error, while simultaneously accomplishing what we hope to answer with a factorial design.
- We can leverage the factorial arrangement to provide effective incomplete block designs and subsequent analyses.

5 Example: Popcorn

If you have ever made a bag of microwaveable popcorn, you probably noticed that there are always unpopped kernels at the bottom of the bag. An experiment involving (approximately) 3.5 ounce bags of popcorn was designed. Three factors are studied: brand of popcorn (a cheap brand and a costly brand), amount of time in the microwave (4 minutes and 6 minutes), and percent power of the microwave (75% and 100%). Thus, this is a $2^3$ factorial design. The weight of the remaining unpopped kernels (in ounces) after popping is recorded. See the table below. Note that the observation in the second row is 3.5 oz. For this bag, virtually none of the popcorn popped. The first column is the “run order”; i.e., the order in which the treatments were run.

Run Order   Brand (A)   Time (B, min)   Power (C, %)   Kernels (Y, oz)
8           Cheap       4               75             3.1
1           Costly      4               75             3.5
2           Cheap       6               75             1.6
4           Costly      6               75             1.2
3           Cheap       4               100            0.7
5           Costly      4               100            0.7
7           Cheap       6               100            0.5
6           Costly      6               100            0.3

6 Example: Popcorn Experiment

- Below is the design matrix for the full model with the two-way interactions and the three-way interaction, reported in standard order.
- The low levels of each factor are denoted by a “-”; these are the levels “Cheap”, “4”, and “75”.
- The high levels of each factor are denoted by a “+”; these are the levels “Costly”, “6”, and “100”.

Std. Order   I   A   B   C   AB   AC   BC   ABC   Y
1            +   -   -   -   +    +    +    -     $y_{111} = 3.1$
2            +   +   -   -   -    -    +    +     $y_{211} = 3.5$
3            +   -   +   -   -    +    -    +     $y_{121} = 1.6$
4            +   +   +   -   +    -    -    -     $y_{221} = 1.2$
5            +   -   -   +   +    -    -    +     $y_{112} = 0.7$
6            +   +   -   +   -    +    -    -     $y_{212} = 0.7$
7            +   -   +   +   -    -    +    -     $y_{122} = 0.5$
8            +   +   +   +   +    +    +    +     $y_{222} = 0.3$

7 Example: Popcorn Experiment

A few notes about the design matrix on the previous slide:
- The “+” and “-” symbols let us define contrasts with coefficients of +1 and -1 for each of the treatment combinations.
- The column “I” contains all “+” signs and is used to estimate the grand mean with a divisor of $2^3$ (or more generally, $2^k$).
- We say that the matrix is in standard order because the pattern of “+”s and “-”s in the factor columns has been arranged a certain way. Namely, column A alternates single “-” and “+” signs, column B has pairs of “-” signs followed by pairs of “+” signs, and column C has four “-” signs followed by four “+” signs. In general, the column for the kth factor has $2^{k-1}$ of the “-” signs followed by an equal number of “+” signs. Thus, you can construct a design matrix in standard order for an arbitrary $2^k$ factorial design.
- The coefficients for any interaction are the products of the columns of coefficients that comprise that interaction. For example, ABC in row 7 is found by (-1)(+1)(+1)=(-1), or simply, a “-” sign.

8 Example: Popcorn Experiment

Below are the treatment means plots. We see that there is a potential interaction due to the magnitude of the responses. Recall that no interaction is present if lines at two levels of a treatment are parallel to each other.

[Figure: treatment means plots in two panels, Power = 75% and Power = 100%. Each panel plots Amount of Kernels (oz) against Brand (cheap, costly), with separate lines for Time = 4 (labelled 1) and Time = 6 (labelled 2).]

9 Example: Popcorn Experiment

B   C     Simple Effect of Cheap Brand to Costly Brand (A)
4   75    $\hat\mu_{211} - \hat\mu_{111} = 3.5 - 3.1 = 0.4$
6   75    $\hat\mu_{221} - \hat\mu_{121} = 1.2 - 1.6 = -0.4$
4   100   $\hat\mu_{212} - \hat\mu_{112} = 0.7 - 0.7 = 0.0$
6   100   $\hat\mu_{222} - \hat\mu_{122} = 0.3 - 0.5 = -0.2$

A        C     Simple Effect of 4 Minutes to 6 Minutes (B)
Cheap    75    $\hat\mu_{121} - \hat\mu_{111} = 1.6 - 3.1 = -1.5$
Costly   75    $\hat\mu_{221} - \hat\mu_{211} = 1.2 - 3.5 = -2.3$
Cheap    100   $\hat\mu_{122} - \hat\mu_{112} = 0.5 - 0.7 = -0.2$
Costly   100   $\hat\mu_{222} - \hat\mu_{212} = 0.3 - 0.7 = -0.4$

A        B   Simple Effect of 75% Power to 100% Power (C)
Cheap    4   $\hat\mu_{112} - \hat\mu_{111} = 0.7 - 3.1 = -2.4$
Costly   4   $\hat\mu_{212} - \hat\mu_{211} = 0.7 - 3.5 = -2.8$
Cheap    6   $\hat\mu_{122} - \hat\mu_{121} = 0.5 - 1.6 = -1.1$
Costly   6   $\hat\mu_{222} - \hat\mu_{221} = 0.3 - 1.2 = -0.9$

Note: $\hat\mu_{ijk} = y_{ijk}$

10 Example: Popcorn Experiment

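Grand Mean: $\bar{y}_{\cdot\cdot\cdot} = \frac{1}{8} y_{\cdot\cdot\cdot} = 1.45$

Main Effects:

Factor A: $\frac{1}{4}(y_{211} + y_{221} + y_{212} + y_{222}) - \frac{1}{4}(y_{111} + y_{121} + y_{112} + y_{122}) = 1.425 - 1.475 = -0.050$

Factor B: $\frac{1}{4}(y_{121} + y_{221} + y_{122} + y_{222}) - \frac{1}{4}(y_{111} + y_{211} + y_{112} + y_{212}) = 0.900 - 2.000 = -1.100$

Factor C: $\frac{1}{4}(y_{112} + y_{212} + y_{122} + y_{222}) - \frac{1}{4}(y_{111} + y_{211} + y_{121} + y_{221}) = 0.550 - 2.350 = -1.800$

Two-Factor Interactions:

AB: $\frac{1}{4}(y_{111} + y_{112} + y_{221} + y_{222}) - \frac{1}{4}(y_{211} + y_{212} + y_{121} + y_{122}) = 1.325 - 1.575 = -0.250$

AC: $\frac{1}{4}(y_{111} + y_{121} + y_{212} + y_{222}) - \frac{1}{4}(y_{211} + y_{221} + y_{112} + y_{122}) = 1.425 - 1.475 = -0.050$

BC: $\frac{1}{4}(y_{111} + y_{211} + y_{122} + y_{222}) - \frac{1}{4}(y_{121} + y_{221} + y_{112} + y_{212}) = 1.850 - 1.050 = 0.800$

Three-Factor Interaction:

ABC: $\frac{1}{4}((y_{222} - y_{122}) - (y_{212} - y_{112})) - \frac{1}{4}((y_{221} - y_{121}) - (y_{211} - y_{111})) = -0.050 - (-0.200) = 0.150$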
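As a numerical check, each effect can be recomputed as a signed sum of the eight responses divided by $2^{k-1} = 4$. The following is a minimal Python sketch and not part of the original analysis; the names `y`, `sign`, and `effect` are illustrative, and levels are coded 0 = low and 1 = high.

```python
from itertools import product

# Responses keyed by the (A, B, C) levels, coded 0 = low and 1 = high.
y = {
    (0, 0, 0): 3.1, (1, 0, 0): 3.5, (0, 1, 0): 1.6, (1, 1, 0): 1.2,
    (0, 0, 1): 0.7, (1, 0, 1): 0.7, (0, 1, 1): 0.5, (1, 1, 1): 0.3,
}

def sign(levels, factors):
    """Product of the +/-1 coefficients of the given factors (0 = A, 1 = B, 2 = C)."""
    s = 1
    for f in factors:
        s *= 1 if levels[f] == 1 else -1
    return s

def effect(factors, k=3):
    """Contrast for the effect divided by 2^(k-1)."""
    contrast = sum(sign(t, factors) * y[t] for t in product((0, 1), repeat=k))
    return contrast / 2 ** (k - 1)

names = {"A": (0,), "B": (1,), "C": (2,),
         "AB": (0, 1), "AC": (0, 2), "BC": (1, 2), "ABC": (0, 1, 2)}
effects = {name: round(effect(idx), 3) for name, idx in names.items()}
print(effects)
```

The interaction values agree with the hand calculations on this slide, and the main effect of brand comes out as -0.050.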

11 Example: Popcorn Experiment

Analysis of Variance Table

Response: bullets
            Df Sum Sq Mean Sq  F value  Pr(>F)
brand        1  0.005   0.005   0.1111 0.79517
time         1  2.420   2.420  53.7778 0.08628 .
power        1  6.480   6.480 144.0000 0.05293 .
brand:time   1  0.125   0.125   2.7778 0.34404
brand:power  1  0.005   0.005   0.1111 0.79517
time:power   1  1.280   1.280  28.4444 0.11800
Residuals    1  0.045   0.045
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Since this is a $2^3$ factorial design without replicates, we use the three-way interaction as our estimate of error. Moreover, the three-way interaction effect appears minor given that the effect calculated on the previous slide is only 0.150, which is smaller than most of the other calculated effects. In the above ANOVA table, none of the two-way interactions is significant at the 0.05 level. While we could remove the interactions one at a time, starting with the highest p-value, we will simply remove all two-way interactions from the analysis and focus on the main effects.

12 Example: Popcorn Experiment

Analysis of Variance Table

Response: bullets
          Df Sum Sq Mean Sq F value  Pr(>F)
brand      1  0.005  0.0050  0.0137 0.91232
time       1  2.420  2.4200  6.6529 0.06137 .
power      1  6.480  6.4800 17.8144 0.01347 *
Residuals  4  1.455  0.3637
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In the above ANOVA table, brand is not statistically significant given that it has such a large p-value. This, again, is not unexpected given that the estimated main effect of -0.050 is small. So, we remove the effect due to brand.

13 Example: Popcorn Experiment

Analysis of Variance Table

Response: bullets
          Df Sum Sq Mean Sq F value   Pr(>F)
time       1   2.42   2.420  8.2877 0.034635 *
power      1   6.48   6.480 22.1918 0.005286 **
Residuals  5   1.46   0.292
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In the above ANOVA table, we are left with both time and power as significant effects on the amount of kernels left after popping. In all of these ANOVA tables, the F-tests use the MSE as the denominator.
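The way the residual line changes across the three tables can be traced by pooling the sums of squares of the dropped terms into error. The sketch below is only an illustration in Python (the tables above are R output). The dictionary `ss` copies the single-df sums of squares from the first table, and `f_tests` is a hypothetical helper name.

```python
# Single-df sums of squares from the first ANOVA table (unreplicated 2^3 design);
# the three-way interaction is the "Residuals" row of that table.
ss = {
    "brand": 0.005, "time": 2.420, "power": 6.480,
    "brand:time": 0.125, "brand:power": 0.005, "time:power": 1.280,
    "brand:time:power": 0.045,
}

def f_tests(kept):
    """Pool every term not in `kept` into error; return (MSE, {term: F})."""
    pooled = [term for term in ss if term not in kept]
    mse = sum(ss[term] for term in pooled) / len(pooled)  # each pooled term has 1 df
    return mse, {term: ss[term] / mse for term in kept}

mse_main, f_main = f_tests(["brand", "time", "power"])
mse_final, f_final = f_tests(["time", "power"])
print(round(mse_main, 4), {t: round(v, 4) for t, v in f_main.items()})
print(round(mse_final, 4), {t: round(v, 4) for t, v in f_final.items()})
```

The F values from the sketch can be compared with the second and third tables; the p-values would additionally require an F distribution function, which is omitted here.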

14 Contrasts

- In general, the contrast among treatment means (main effect or interaction) is

  $$l_{AB\cdots} = \sum_t c_t \bar{y}_t,$$

  where t is the index over the different treatments and $c_t = \pm 1$ are the coefficients for the contrast, which are determined by the “+”s and “-”s in the column of the design matrix for effect $AB\cdots$.
- The estimate of an effect and its standard error for any contrast among treatment means in a complete $2^k$ factorial design are, respectively,

  $$AB\cdots = \frac{l_{AB\cdots}}{2^{k-1}} \quad \text{and} \quad s_{AB\cdots} = \sqrt{\frac{4\sigma^2}{r 2^k}},$$

  where r is the number of replicates, which has been and will be 1 for most of our discussion.
- The 1 df SS for the effect is given by

  $$SS(AB\cdots) = \frac{r (l_{AB\cdots})^2}{2^k}.$$
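As a worked instance with the popcorn data ($k = 3$, $r = 1$, so each treatment mean is a single observation), the contrast, effect estimate, and sum of squares for time (B) follow from the signs in column B of the design matrix:

```latex
\begin{align*}
l_B &= -y_{111} - y_{211} + y_{121} + y_{221} - y_{112} - y_{212} + y_{122} + y_{222} \\
    &= -3.1 - 3.5 + 1.6 + 1.2 - 0.7 - 0.7 + 0.5 + 0.3 = -4.4, \\
B   &= \frac{l_B}{2^{3-1}} = \frac{-4.4}{4} = -1.1, \\
SS(B) &= \frac{r\,(l_B)^2}{2^3} = \frac{(-4.4)^2}{8} = 2.42 .
\end{align*}
```

The value 2.42 matches the sum of squares reported for time in the ANOVA tables.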

15 Incomplete Block Designs for $2^k$ Factorials

- A complete replicate of a $2^k$ factorial with many factors may not be possible in a single, complete block; e.g.,
  - If there is insufficient raw material in a manufactured batch to accommodate all treatments, then each batch of raw materials can be used as an incomplete block.
  - If experimental error is too large with an RCBD in an agricultural field, then the variation among field plots can be controlled in the experiment with reduced block sizes for more homogeneous groups of experimental plots.
  - If a mother cat has a litter with too few kittens to test all of the treatments in a particular experiment, then the litter could be used as an incomplete block.

16 Confounding

- Confounding occurs when some treatment effects (either main effects or interactions) are estimated by the same linear combination of the experimental observations as a block effect; i.e., the treatment effect is indistinguishable from the effect of the blocks with which it is confounded.
- Confounding naturally arises in full factorial designs that are run in blocks, where the block size is smaller than the number of different treatment combinations.
- Ordinarily the highest-order interaction effect in a $2^k$ factorial is chosen to be confounded with blocks.
- Usually, main effects and two-factor interactions are the effects of most interest, so confounding higher-order interactions means the other (important) effects are estimated without penalty.
- In other words, we usually avoid confounding main effects and two-factor interactions with blocks.

17 The Yates Notation

- Earlier we presented the design matrix for a $2^3$ factorial design.
- To make referring to specific treatments easier (instead of referring to their “standard order” number), we use a particular labeling convention.
- The first column in the design matrix below is how we refer to the treatments; i.e., it combines the letters of all main effects that occur at their high level (“+”) in that treatment, written in lower case to avoid confusion with the effect labels.
- The first treatment is “(1)”, which simply means that none of the main effects are at their high level.
- This labeling convention is known as Yates’s notation.

Treatment   I   A   B   C   AB   AC   BC   ABC
(1)         +   -   -   -   +    +    +    -
a           +   +   -   -   -    -    +    +
b           +   -   +   -   -    +    -    +
ab          +   +   +   -   +    -    -    -
c           +   -   -   +   +    -    -    +
ac          +   +   -   +   -    +    -    -
bc          +   -   +   +   -    -    +    -
abc         +   +   +   +   +    +    +    +

18 Rules for Standard Order

- There is a simple algorithm for writing a $2^k$ factorial design in standard order:
  1. The first column is I and all entries are set to “+”.
  2. The second column is for factor A. The first entry is “-”, the second entry is “+”, and we alternate between the signs until all $2^k$ entries are filled.
  3. The third column is for factor B. The first two entries are “-”, the next two entries are “+”, and we alternate between the signs with two in a row until all $2^k$ entries are filled.
  4. The fourth column is for factor C. The first four entries are “-”, the next four entries are “+”, and we alternate between the signs with four in a row until all $2^k$ entries are filled.
  5. In general, the column for the jth factor starts with $2^{j-1}$ “-” entries in a row, then has $2^{j-1}$ “+” entries in a row, and this pattern is repeated until all $2^k$ entries are filled.
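The algorithm above can be sketched in a few lines of Python. This is only an illustration: the function names `standard_order` and `yates_label` are made up for this example, and signs are stored as +1 and -1. It relies on the fact that the jth factor changes sign every $2^{j-1}$ runs, which is the same as reading bit j-1 of the zero-based run index.

```python
def standard_order(k):
    """Rows of +1/-1 for factors A, B, C, ... of a 2^k design in standard order."""
    rows = []
    for run in range(2 ** k):
        # Bit j of the run index is 1 exactly when factor j (0-based) is at its high level.
        rows.append([1 if (run >> j) & 1 else -1 for j in range(k)])
    return rows

def yates_label(row):
    """Lower-case letters of the factors at their high level, or "(1)" if none."""
    label = "".join(chr(ord("a") + j) for j, s in enumerate(row) if s == 1)
    return label or "(1)"

design = standard_order(3)
labels = [yates_label(row) for row in design]
# An interaction column is the elementwise product of its factor columns.
abc = [row[0] * row[1] * row[2] for row in design]

print(labels)  # ['(1)', 'a', 'b', 'ab', 'c', 'ac', 'bc', 'abc']
print(abc)     # [-1, 1, 1, -1, 1, -1, -1, 1]
```

The printed ABC column matches the ABC column of the $2^3$ design matrix, including the “-” sign in row 7.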

19 Evens-Odds Rule

- In $2^k$ factorial designs, for any effect contrast, half of the treatments have a “-” coefficient and half of the treatments have a “+” coefficient.
- The treatments can be divided into two groups, each of $2^k/2 = 2^{k-1}$ EUs, using the evens-odds rule, which says that any treatment combination (labelled as in the first column of the design matrix on the previous slide) that has an even number of letters receives one of the coefficients “+” or “-”, while treatments with an odd number of letters receive the other coefficient.
- The convention is that the treatment labelled “(1)” has zero letters, which is an even number.
- In this set-up, we confound the kth-order (i.e., k-factor) interaction with blocks.
- We will illustrate this confounding to achieve incomplete blocks from both $2^3$ and $2^4$ factorials.

20 $2^3$ Factorial

- The contrast for the three-factor interaction ABC is given by

  $$l_{ABC} = (a + b + c + abc) + (-(1) - ab - ac - bc),$$

  where the first four treatments have a coefficient of $c_i = +1$ and the second four treatments have a coefficient of $c_i = -1$.
- Thus, the first four treatments with the “+” sign will be in Block 1 and the second four treatments with the “-” sign will be in Block 2.
  - Four EUs in Block 1 will be randomly assigned to one of the treatments a, b, c, or abc.
  - Four EUs in Block 2 will be randomly assigned to one of the treatments (1), ab, ac, or bc.

21 $2^3$ Factorial

- By confounding the three-factor interaction with the blocking factor, we get the following matrix:

            Block
Treatment   1   2
(1)         0   1
a           1   0
b           1   0
ab          0   1
c           1   0
ac          0   1
bc          0   1
abc         1   0

22 $2^4$ Factorial

- The contrast for the four-factor interaction ABCD is given by

  $$l_{ABCD} = ((1) + ab + ac + ad + bc + bd + cd + abcd) + (-a - b - c - d - abc - abd - acd - bcd).$$

- The first eight treatments have a coefficient of $c_i = +1$ and the second eight treatments have a coefficient of $c_i = -1$.
- Thus, the first eight treatments with the “+” sign will be in Block 1 and the second eight treatments with the “-” sign will be in Block 2.
  - Eight EUs in Block 1 will be randomly assigned to one of the treatments (1), ab, ac, ad, bc, bd, cd, or abcd.
  - Eight EUs in Block 2 will be randomly assigned to one of the treatments a, b, c, d, abc, abd, acd, or bcd.

23 $2^4$ Factorial

- By confounding the four-factor interaction with the blocking factor, we get the following matrix:

            Block
Treatment   1   2
(1)         1   0
a           0   1
b           0   1
ab          1   0
c           0   1
ac          1   0
bc          1   0
abc         0   1
d           0   1
ad          1   0
bd          1   0
cd          1   0
abd         0   1
acd         0   1
bcd         0   1
abcd        1   0
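The evens-odds rule used for both the $2^3$ and $2^4$ factorials amounts to sorting treatment labels by the parity of their length. The following Python sketch illustrates this; the helper name `evens_odds_blocks` is made up for the example, and which group is called Block 1 is left to the caller.

```python
from itertools import combinations

def evens_odds_blocks(k):
    """Split the 2^k treatment labels by the parity of their number of letters."""
    letters = "abcdefghijklmnopqrstuvwxyz"[:k]
    even, odd = [], []
    for size in range(k + 1):
        for combo in combinations(letters, size):
            label = "".join(combo) or "(1)"  # "(1)" has zero letters, an even count
            (even if size % 2 == 0 else odd).append(label)
    return even, odd

even3, odd3 = evens_odds_blocks(3)
even4, odd4 = evens_odds_blocks(4)
print(odd3, even3)
print(even4, odd4)
```

For k = 3 the odd group is a, b, c, abc, and for k = 4 the even group is (1), ab, ac, ad, bc, bd, cd, abcd, which are the Block 1 treatments listed on the slides.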

24 ANOVA Overview for Confounded Design

- The SS are computed as usual for completely confounded designs, except for the exclusion of a SS partition for the interaction effect confounded with the block.
- SSBlk will include the confounded factorial effect.
- Below is the ANOVA table (with df only) for a $2^3$ factorial design with b incomplete blocks and r replicate groups. Note that ABC is confounded with the blocks, so the SSBlk includes the ABC effect.

Source                      df
Replicates                  $r - 1$
Blocks within Replicates    $r(b - 1)$
Treatments                  6
  A                         1
  B                         1
  C                         1
  AB                        1
  AC                        1
  BC                        1
Error                       $r(2^3 - b) - 6$
Total                       $r(2^3) - 1$

25 Method for Creating Incomplete Blocks

- For constructing incomplete block designs with chosen factorial effects confounded with blocks, we use residues modulo m: for an integer k, the residue mod m is the remainder when k is divided by m.
- The residue r for the integer k mod m is written as $k = r \pmod{m}$.
- For example, $7 = 1 \pmod{2}$ since 7 divided by 2 has a remainder of 1.
- With $2^k$ factorial designs, we work with residues (mod 2), so the values we will be working with are 0 and 1.
- To determine the allocation of treatment combinations to incomplete blocks, we use the defining contrast

  $$L = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_k x_k, \qquad (1)$$

  which is a linear function representing the effect that is confounded with blocks, where $\alpha_i = I\{\text{the } i\text{th factor is present in the defining contrast}\}$.
- L is evaluated for each treatment combination.
- The value of $x_i$ is the level of the ith factor (either 0 or 1 for low or high, respectively) in any treatment combination under consideration for allocation to an incomplete block.

26 Example: Creating Incomplete Blocks

Suppose the defining contrast for a $2^4$ factorial in two blocks of eight EUs is the three-factor interaction ABD. A, B, and D are present in the defining contrast, so $\alpha_1 = \alpha_2 = \alpha_4 = 1$, while the absence of C means $\alpha_3 = 0$. The defining contrast for the allocation of treatments is then $L = x_1 + x_2 + x_4$. The treatment combinations with $L = 0 \pmod{2}$ are assigned to Block 1 and those with $L = 1 \pmod{2}$ are assigned to Block 2. The values and assignments are as follows:

Treatment   x1   x2   x4   L = x1 + x2 + x4   r (mod 2)   Block Assignment
(1)         0    0    0    0                  0           Block 1
a           1    0    0    1                  1           Block 2
b           0    1    0    1                  1           Block 2
ab          1    1    0    2                  0           Block 1
c           0    0    0    0                  0           Block 1
ac          1    0    0    1                  1           Block 2
bc          0    1    0    1                  1           Block 2
abc         1    1    0    2                  0           Block 1
d           0    0    1    1                  1           Block 2
ad          1    0    1    2                  0           Block 1
bd          0    1    1    2                  0           Block 1
cd          0    0    1    1                  1           Block 2
abd         1    1    1    3                  1           Block 2
acd         1    0    1    2                  0           Block 1
bcd         0    1    1    2                  0           Block 1
abcd        1    1    1    3                  1           Block 2

Note that if $B_1$ and $B_2$ represent the block totals, the treatments in $B_2$ (odd L) all have a “+” sign in the ABD column while the treatments in $B_1$ (even L) all have a “-” sign. Thus, the contrast among the block totals is equivalent to the ABD contrast: $l_{ABD} = B_2 - B_1$.
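The allocation in this example can be reproduced by evaluating the defining contrast for every treatment combination and reducing it mod 2. The Python sketch below is illustrative only; `assign_blocks` is a hypothetical helper that takes the defining contrast as a word such as "ABD", so that a letter's presence corresponds to $\alpha_i = 1$.

```python
from itertools import product

def assign_blocks(k, contrasts):
    """Map each treatment label to its tuple of residues (L1 mod 2, L2 mod 2, ...)."""
    blocks = {}
    for bits in product((0, 1), repeat=k):
        x = bits[::-1]  # reversed so that factor A varies fastest (standard order)
        label = "".join(chr(ord("a") + i) for i in range(k) if x[i]) or "(1)"
        residues = tuple(
            sum(x[ord(ch) - ord("A")] for ch in word) % 2 for word in contrasts
        )
        blocks[label] = residues
    return blocks

abd = assign_blocks(4, ["ABD"])
block1 = [t for t, res in abd.items() if res == (0,)]
block2 = [t for t, res in abd.items() if res == (1,)]
print(block1)
print(block2)
```

Passing more than one word, e.g. `assign_blocks(4, ["AC", "BD"])`, returns a pair of residues per treatment, which splits the treatments into four groups instead of two.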

27 Use of Two Defining Contrasts

- We have, thus far, focused on $2^k$ factorials with two incomplete blocks of $2^{k-1}$ EUs, each with one defining contrast confounded with blocks.
- The use of two blocks with one factorial effect confounded may not reduce block size sufficiently.
- For example, a $2^6$ factorial with 64 treatments may require block sizes no larger than 16 EUs per block, with four blocks per replicate.
- We can accomplish further reduction in block size by confounding an additional defining contrast with blocks.

28 Example: Creating Incomplete Blocks

Suppose a $2^4$ factorial is to have its 16 treatment combinations placed in four blocks of $2^{4-2} = 4$ EUs each. Suppose that AC and BD are chosen to be confounded with blocks. The defining contrasts are then $L_1 = x_1 + x_3$ and $L_2 = x_2 + x_4$. Each treatment will provide a pair of residues modulo 2: $(L_1, L_2)$. The pairs (0,0), (0,1), (1,0), and (1,1) will be assigned to Block 1, Block 2, Block 3, and Block 4, respectively.

Treatment   x1   x2   x3   x4   L1   L2   Residue   Block Assignment
(1)         0    0    0    0    0    0    (0,0)     Block 1
a           1    0    0    0    1    0    (1,0)     Block 3
b           0    1    0    0    0    1    (0,1)     Block 2
ab          1    1    0    0    1    1    (1,1)     Block 4
c           0    0    1    0    1    0    (1,0)     Block 3
ac          1    0    1    0    2    0    (0,0)     Block 1
bc          0    1    1    0    1    1    (1,1)     Block 4
abc         1    1    1    0    2    1    (0,1)     Block 2
d           0    0    0    1    0    1    (0,1)     Block 2
ad          1    0    0    1    1    1    (1,1)     Block 4
bd          0    1    0    1    0    2    (0,0)     Block 1
cd          0    0    1    1    1    1    (1,1)     Block 4
abd         1    1    0    1    1    2    (1,0)     Block 3
acd         1    0    1    1    2    1    (0,1)     Block 2
bcd         0    1    1    1    1    2    (1,0)     Block 3
abcd        1    1    1    1    2    2    (0,0)     Block 1

29 Example: Generalized Interactions

Let us continue with our previous example for the purpose of defining a generalized interaction. ABCD is the generalized interaction confounded as a consequence of purposely confounding AC and BD with blocks. We can use more generic algebraic rules for determining generalized interactions. We do this by forming the product of the symbols for the defining contrasts with the exponent of any symbol reduced mod 2.

The product of AC and BD is AC × BD = ABCD. The generalized interaction of AC and BD, as determined by the symbol product, is ABCD since all exponents of the symbol product are 1 (mod 2).

If the contrasts ACD and BCD had been chosen for the original defining contrasts, then the product is ACD × BCD = ABCCDD = $ABC^2D^2$ = AB, where $C^2$ and $D^2$ are dropped because their exponents are 0 (mod 2). Therefore, the generalized interaction is AB when ACD and BCD are the defining contrasts.
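Reducing exponents mod 2 means that a letter survives in the product only if it appears an odd number of times, which is a symmetric difference of letter sets. A minimal Python sketch, with `generalized_interaction` as an illustrative name:

```python
def generalized_interaction(*words):
    """Symbolic product of effect words with every exponent reduced mod 2."""
    letters = set()
    for word in words:
        letters ^= set(word)  # symmetric difference drops letters that appear twice
    return "".join(sorted(letters)) or "I"

print(generalized_interaction("AC", "BD"))    # ABCD
print(generalized_interaction("ACD", "BCD"))  # AB
```

Both results agree with the symbol products worked out above.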

30 Incomplete Block Designs for $3^k$ Factorial Designs

- $3^k$ factorial designs make it possible to estimate linear and quadratic trends for quantitative factors and to provide more detailed descriptions of qualitative factors.
- One constraint is that the number of EUs in a $3^k$ factorial design increases by a factor of 3 with each factor that is added.
- Because of the large number of necessary EUs, incomplete block designs for $3^k$ factorials become very useful.
- The levels of a factor are (usually) represented by $x_i = 0, 1, 2$; e.g., a $3^2$ factorial design with factors A and B results in the following t = 9 treatment conditions:

             A
         0    1    2
B   0    00   10   20
    1    01   11   21
    2    02   12   22

31 Confounding for $3^k$ Factorial Designs

- Incomplete block designs for $3^k$ factorials require three blocks to have blocks of equal size.
- We will have 2 df between blocks and a treatment effect with 2 df must be confounded with blocks.
- We have 2 df for main effects, $2^2$ df for two-factor interactions, and so on.
- As before, we would not want to confound main effects with blocks.
- Defining contrasts in $3^k$ factorials and confounding with three or more factors are both done analogously to what we presented for $2^k$ factorials.
- Note that now we must work with residues (mod 3).
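To make the mod 3 arithmetic concrete, the sketch below splits a $3^2$ factorial into three blocks. The defining contrast $L = x_1 + x_2$ is an assumption chosen purely for illustration (it corresponds to one 2 df component of the AB interaction) and is not taken from the slides; `blocks_mod3` is a hypothetical helper name.

```python
from itertools import product

def blocks_mod3(k, alphas):
    """Group 3^k treatment combinations by L = sum(alpha_i * x_i) mod 3."""
    blocks = {0: [], 1: [], 2: []}
    for x in product((0, 1, 2), repeat=k):
        residue = sum(a * xi for a, xi in zip(alphas, x)) % 3
        blocks[residue].append("".join(map(str, x)))  # label is x1 x2 ... xk
    return blocks

# Assumed defining contrast for illustration: L = x1 + x2 (mod 3).
b = blocks_mod3(2, (1, 1))
print(b)
```

Each residue class contains three of the nine treatment combinations, so the three blocks are of equal size.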

33 Utility of Fractional Treatment Designs

- Even when each factor is studied at two levels (e.g., $2^k$ factorial designs), the number of EUs necessary grows geometrically with the number of factors; e.g., $2^2 = 4$, $2^4 = 16$, $2^6 = 64$, $2^8 = 256$, $2^{10} = 1024$, etc.
- Fractional factorial designs use only one-half, one-quarter, or smaller fractions of the $2^k$ treatment combinations.
- Some of the reasons for using fractional factorial designs include:
  - the number of necessary treatments exceeds resources;
  - information is needed only from main effects and low-order interactions;
  - screening studies; and
  - working under the assumption that only a few effects are truly important (this is driven by what is called the factor sparsity hypothesis or sparsity of effects principle).

34 Half-Fraction Design

- The half-fraction design is referred to as a $2^{k-1}$ fractional factorial design because $\frac{1}{2} 2^k = 2^{k-1}$.
- The notation indicates that the design will include k factors, each at two levels, but we only use $2^{k-1}$ EUs.
- For incomplete blocks, we placed one replicate of a $2^k$ factorial design into two blocks using a defining contrast that let us separate the treatment combinations into two sets.
  - Each of the two sets was placed into one of the incomplete blocks according to the “+” and “-” coefficients of the treatments.
  - Each block was half of a complete replication of the treatments and the defining contrast was confounded with blocks (although this still made it possible to estimate the remaining effects).
- We use the same principle for constructing fractional factorial designs, which is best illustrated by first presenting an example.

35 Example: Popcorn Experiment

Recall the designed experiment involving (approximately) 3.5 ounce bags of popcorn. A $2^3$ factorial design was used. Factor A is the brand of popcorn: a cheap brand (-) and a costly brand (+). Factor B is the amount of time in the microwave: 4 minutes (-) and 6 minutes (+). Factor C is the percent power of the microwave: 75% (-) and 100% (+). The weight of the remaining unpopped kernels (in ounces) after popping is the response. The data are reported below, but this time we have divided the t = 8 treatments into two groups of four treatments using the defining contrast based on ABC. This particular division of treatments could have been used to construct an incomplete block design with the ABC interaction confounded with blocks. If researchers wanted to construct a half-replicate fractional factorial design, then they could use the four treatments in the top half of this table.

Treatment   I   A   B   C   AB   AC   BC   ABC   Y
a           +   +   -   -   -    -    +    +     3.5
b           +   -   +   -   -    +    -    +     1.6
c           +   -   -   +   +    -    -    +     0.7
abc         +   +   +   +   +    +    +    +     0.3
(1)         +   -   -   -   +    +    +    -     3.1
ab          +   +   +   -   +    -    -    -     1.2
ac          +   +   -   +   -    +    -    -     0.7
bc          +   -   +   +   -    -    +    -     0.5

36 What is Sacrificed Through Fractional Designs?

- While we gain by reducing the size of the experiment, this comes at a price.
- Namely, we lose some information on the treatment effects.
- If we use a half-replicate of the $2^3$ factorial, then we lose the ability to estimate the three-factor interaction and each main effect is confounded (or aliased) with a two-factor interaction.
- Aliasing is mostly another term for confounding.
  - We usually use confounding in the context of an incomplete block design; e.g., when we confound a treatment effect with a block.
  - We usually use aliasing in the context of fractional designs; e.g., the main effect D is aliased with the three-factor interaction ABC in a half-fraction of a $2^4$ design.

37 Example: Popcorn Experiment

Recall that the contrast for effect $t^*$ is found by $l_{t^*} = \sum_t c_t \bar{y}_t$, where $c_t = \pm 1$ corresponds to the sign in the design matrix for treatment t. If using only the top half of the table for the popcorn experiment, we see the contrasts for the main effects are:

$l_A = a - b - c + abc = 3.5 - 1.6 - 0.7 + 0.3 = 1.5$

$l_B = -a + b - c + abc = -3.5 + 1.6 - 0.7 + 0.3 = -2.3$

$l_C = -a - b + c + abc = -3.5 - 1.6 + 0.7 + 0.3 = -4.1$

Contrasts for the two-factor interactions are:

$l_{BC} = a - b - c + abc = 3.5 - 1.6 - 0.7 + 0.3 = 1.5$

$l_{AC} = -a + b - c + abc = -3.5 + 1.6 - 0.7 + 0.3 = -2.3$

$l_{AB} = -a - b + c + abc = -3.5 - 1.6 + 0.7 + 0.3 = -4.1$

Notice that $l_A = l_{BC}$, $l_B = l_{AC}$, and $l_C = l_{AB}$. Therefore, the above contrasts estimate the combined effect of A+BC, B+AC, and C+AB, respectively. This means that we cannot
- differentiate between the main effect of brand (A) and the interaction between time (B) and power (C);
- differentiate between the main effect of time (B) and the interaction between brand (A) and power (C); and
- differentiate between the main effect of power (C) and the interaction between brand (A) and time (B).

Since the same contrast estimates two different effects, those effects are aliased. Specifically, A is aliased with BC (written as A=BC), B is aliased with AC (written as B=AC), and C is aliased with AB (written as C=AB).

38 Design Generator and Defining Relation

- The higher-order interaction used as the defining contrast (i.e., such that $c_t \equiv 1$ for every treatment in the fraction) is known as the design generator.
- In the popcorn experiment, ABC is the design generator because it has all the same coefficients (+), which is the same as the identity column I.
- The defining relation for a design is the correspondence between an effect and the identity column I.
- In the popcorn experiment, I=ABC is thus the defining relation.
- The aliasing scheme for a design can be determined from the defining relation by multiplying the column on each side of the defining relation by successive columns of the design matrix, such that the multiplication is carried out term-by-term.

39 Multiplication Rules for Design Generation

- It is helpful to establish some multiplication rules that allow us to work out the aliasing relationships:
  1. When multiplying the column I by I, all entries remain “+”; i.e.,

     $$I \times I = I^2 = I$$

  2. When multiplying any column (say, $AB\cdots$) by I, the column entries are unchanged:

     $$I \times (AB\cdots) = AB\cdots$$

  3. When multiplying any column by itself, the resulting column is I:

     $$(AB\cdots) \times (AB\cdots) = I$$

- You can think of the above as being akin to identity properties.

40 Example: Popcorn Experiment

We will use the defining relation I=ABC and the multiplication rules to show the aliasing relationships that we have already obtained. First we multiply the defining contrast by each main effect:

$A \times ABC = A^2BC = BC$

$B \times ABC = AB^2C = AC$

$C \times ABC = ABC^2 = AB$

Next we multiply the defining contrast by each two-factor interaction, which gives us the same relationships established above:

$BC \times ABC = AB^2C^2 = A$

$AC \times ABC = A^2BC^2 = B$

$AB \times ABC = A^2B^2C = C$

Note that if we had used the bottom half of the original table, then the defining relation would have been I=-ABC because all of the entries in the ABC column are “-” while all of the entries in the I column are “+”. The same logic applied above can be used to define the aliasing relationships when I=-ABC is the defining relation. Note that the quantities above are also called generalized interactions because we are taking an effect, symbolically multiplying it by a higher-order interaction, and then using our multiplication rules to reveal the resulting aliasing structure.

41 Constructing Half-Replicate $2^{k-1}$ Designs

- The half-fraction design is constructed with the highest-order interaction as the design generator; e.g., a $2^{3-1}$ half-fraction design has I=ABC, a $2^{4-1}$ half-fraction design has I=ABCD, etc.
- The treatment combinations are identified as follows:
  1. Write the design matrix (using “+”s and “-”s) in standard order for a full $2^{k-1}$ design in the first k-1 factors.
  2. Identify the ± coefficients for the kth factor by equating them to the coefficients for the highest-order interaction of the first k-1 factors.
- For example, to construct a half-replicate $2^{3-1}$ design, we use the first four rows in standard order from the full $2^3$ design (only writing the columns for the first two factors) and then calculate C=AB (see the table below).
- Note that the treatments match those in the first half of the table for the popcorn experiment.

A   B   C=AB   Treatment
-   -   +      c
+   -   -      a
-   +   -      b
+   +   +      abc
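The two construction steps can be sketched as follows in Python. This is an illustration under the stated convention (last factor set equal to the product of the others, so the defining relation has a “+” sign); `half_fraction` is a made-up name and signs are stored as +1 and -1.

```python
def half_fraction(k):
    """Runs of a 2^(k-1) half fraction; the kth factor equals the product of the others."""
    runs = []
    for run in range(2 ** (k - 1)):
        # Step 1: first k-1 factors in standard order.
        base = [1 if (run >> j) & 1 else -1 for j in range(k - 1)]
        # Step 2: kth factor = highest-order interaction of the first k-1 factors.
        last = 1
        for s in base:
            last *= s
        row = base + [last]
        label = "".join(chr(ord("a") + j) for j, s in enumerate(row) if s == 1) or "(1)"
        runs.append((row, label))
    return runs

popcorn_half = half_fraction(3)
for row, label in popcorn_half:
    print(row, label)
```

For k = 3 the labels come out as c, a, b, abc, in the same order as the rows of the table above.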

42 Highest-Order Interaction

- The highest-order interaction of least interest is typically used to generate the half-replicate because the defining contrast chosen for the design generator cannot be estimated.
- For example, suppose that a $2^4$ factorial is generated with the defining relation I=ABCD.
- A half-replicate for a $2^4$ factorial requires 8 EUs and if using the defining relation I=ABCD, we get the following aliasing structure:

  A = BCD
  B = ACD
  C = ABD
  D = ABC
  AB = CD
  AC = BD
  AD = BC
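An aliasing structure like the one above can be generated mechanically by multiplying each effect by every word in the defining relation and dropping squared letters. The Python sketch below is illustrative; `multiply` and `aliases` are hypothetical helper names, and the defining relation is passed as a list of words without the constant I.

```python
from itertools import combinations

def multiply(u, v):
    """Product of two effect words with squared letters dropped (exponents mod 2)."""
    return "".join(sorted(set(u) ^ set(v))) or "I"

def aliases(factors, defining_words):
    """Map each estimable effect to its aliases under the given defining relation."""
    table = {}
    for size in range(1, len(factors) + 1):
        for combo in combinations(factors, size):
            effect = "".join(combo)
            if effect in defining_words:
                continue  # words in the defining relation cannot be estimated
            table[effect] = [multiply(effect, w) for w in defining_words]
    return table

half = aliases("ABCD", ["ABCD"])
print(half["A"], half["AB"])  # ['BCD'] ['CD']
```

Each alias pair appears twice in the result (for example, both `half["AB"]` and `half["CD"]`), since aliasing is symmetric.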

43 Alternative Design

- Suppose we use the defining relation I=ABD to generate a different $2^{4-1}$ fractional factorial design.
- The resulting aliasing structure is as follows:

  A = BD
  B = AD
  C = ABCD
  D = AB
  AC = BCD
  BC = ACD
  CD = ABC

44 Quarter-Fraction and Smaller-Fraction Designs

- When the number of factors is large, the number of treatments in a half-fraction design may still be prohibitive.
- We can proceed to halve our half-fraction design, which results in a quarter-fraction design.
- We can proceed in this manner of halving to obtain smaller and smaller designs; e.g., eighth-fraction design, sixteenth-fraction design, etc.
- Fractional designs are usually written as $2^{k-f}$ fractional designs, where f is a positive integer strictly less than k.
- In general, for a $2^{k-f}$ fractional factorial design, there are $2^f$ terms in the defining relation, which consist of
  1. The constant term, I.
  2. The f interaction terms used to define the f successive fractionations.
  3. The $2^f - f - 1$ generalized interactions, constructed from the crossproducts involving pairs, triples, and so on, of the f interaction terms used to define the f successive fractionations.
- Since there are $2^f$ terms in the defining relation for a $2^{k-f}$ fractional factorial design, we see that each factor effect is confounded with $2^f - 1$ other factor effects.

45 Example: 24−2 Design

To obtain the aliasing structure for this quarter-fraction design, we first note that the half-fraction design is based on the defining relation I=ABCD. We then fractionate this half-fraction design by using the defining relation I=AB. This means that I=ABCD=AB. Moreover, the generalized interaction of AB and ABCD is CD. Therefore, I=ABCD=AB=CD is the defining relation for the 24−2 quarter-fraction design. To continue defining the aliasing structure, note that if we multiply each of the four quantities in the defining relation, we get A=BCD=B=ACD. With a little more work, it can be shown that the full aliasing scheme is given by

I = ABCD = AB = CD (defining relation) A = BCD = B = ACD C = ABD = ABC = D AC = BD = BC = AD

Since main effects are confounded with each other (A with B and C with D), this design is clearly undesirable. But let us proceed to construct the design matrix for this quarter-replicate 24−2 design. First, let column A be written in standard order (-,+,-,+). Since A=B, B will also be (-,+,-,+). Since A is confounded with B, we would then treat C as our “second column” in the standard order routine. Therefore, C is (-,-,+,+), which is what B would have been had it not been confounded with A. The design matrix with only the main effects written down (and it is easy to specify the higher-order terms):

    A    B=A    C    D=C    Treatment
    -    -      -    -      (1)
    +    +      -    -      ab
    -    -      +    +      cd
    +    +      +    +      abcd
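The construction just described can be sketched in Python as follows. It is only an illustration under the assumptions of this example (A and C written in standard order, B=A and D=C); the variable names are hypothetical.

```python
from itertools import product
from math import prod

basic = ["A", "C"]                  # factors written out in standard order
aliased = {"B": "A", "D": "C"}      # B = A and D = C, from I = AB = CD

design = []
for levels in product([-1, 1], repeat=len(basic)):
    # product() varies its last entry fastest; reverse so that A alternates fastest
    run = dict(zip(basic, reversed(levels)))
    for factor, word in aliased.items():
        run[factor] = prod(run[letter] for letter in word)
    # Treatment label: lowercase letters of the factors at their high level
    label = "".join(f.lower() for f in "ABCD" if run[f] == 1) or "(1)"
    design.append((label, [run[f] for f in "ABCD"]))

for label, signs in design:
    print(" ".join("+" if s == 1 else "-" for s in signs), label)
```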

47 Resolution

- The (maximum) resolution of a fractional factorial design, denoted by $R$, is the number of factors involved in the lowest-order effect in the defining relation, excluding the constant I.
- The resolution is important in identifying the severity of the confounding scheme.
- For example, in a $2^{4-1}$ fractional factorial design, we showed that the defining relation is I=ABCD. The resolution of this design is $R = 4$ because there are four factors involved. This tells us that the most severe cases of confounding will involve (i) a main effect and a three-factor interaction (e.g., A=BCD) and (ii) two two-factor interactions (e.g., AB=CD).
- Roman numerals are commonly used to denote the resolution to avoid confusion with the number of factors.
- For example, in the $2^{4-1}$ fractional factorial design discussed above, we showed that $R = 4$. Therefore, we write this as a $2^{4-1}_{IV}$ fractional factorial design.

48 General Comments on Design Resolution

- In general, the higher the resolution of the design, the less severe the degree of confounding.
- The resolution should never be less than III.
- Resolution II designs are not used since at least one pair of main effects will be confounded.
- Designs of resolution III, IV, and V are the most common.
- In designs of resolution III, no main effects are confounded with other main effects, some main effects are confounded with two-factor interactions, and two-factor interactions are confounded with other two-factor interactions.
- In designs of resolution IV, no main effects are confounded with other main effects or two-factor interactions, but some main effects are confounded with three-factor interactions and some two-factor interactions are confounded with other two-factor interactions.
- In designs of resolution V, no main effects or two-factor interactions are confounded with other main effects or two-factor interactions, but some main effects are confounded with four-factor interactions and some two-factor interactions are confounded with three-factor interactions.

49 Resolution for 3 to 8 Factors

    Number of                        Number of   Defining Relation
    Factors     Fraction             Runs        (Omitting Generalized Interactions)
    3           $2^{3-1}_{III}$      4           I=ABC
    4           $2^{4-1}_{IV}$       8           I=ABCD
    5           $2^{5-1}_{V}$        16          I=ABCDE
                $2^{5-2}_{III}$      8           I=ABD=ACE
    6           $2^{6-1}_{VI}$       32          I=ABCDEF
                $2^{6-2}_{IV}$       16          I=ABCE=BCDF
                $2^{6-3}_{III}$      8           I=ABD=ACE=BCF
    7           $2^{7-1}_{VII}$      64          I=ABCDEFG
                $2^{7-2}_{IV}$       32          I=ABCDF=ABDEG
                $2^{7-3}_{IV}$       16          I=ABCE=BCDF=ACDG
                $2^{7-4}_{III}$      8           I=ABD=ACE=BCF=ABCG
    8           $2^{8-2}_{V}$        64          I=ABCDG=ABEFH
                $2^{8-3}_{IV}$       32          I=ABCF=ABDG=BCDEH
                $2^{8-4}_{IV}$       16          I=BCDE=ACDF=ABCG=ABDH

*Note: Other defining relationships can be used for some of these designs.
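Because the table omits the generalized interactions, the resolution has to be read from the full defining relation, not only from the listed words. The sketch below is an illustration with hypothetical function names: it encodes each word as a bitmask, forms every product of the listed words with XOR, and reports the shortest word length.

```python
from functools import reduce
from itertools import combinations
from operator import xor

ROMAN = {2: "II", 3: "III", 4: "IV", 5: "V", 6: "VI", 7: "VII"}

def mask(word):
    """Encode a word such as 'ABCE' as a bitmask (A is bit 0, B is bit 1, ...)."""
    return sum(1 << (ord(ch) - ord("A")) for ch in word)

def resolution(listed_words):
    """Length of the shortest word among all products of the listed words."""
    masks = [mask(w) for w in listed_words]
    lengths = []
    for r in range(1, len(masks) + 1):
        for combo in combinations(masks, r):
            # XOR cancels letters that appear an even number of times
            lengths.append(bin(reduce(xor, combo)).count("1"))
    return min(lengths)

for name, listed in [("2^(6-2)", ["ABCE", "BCDF"]),
                     ("2^(7-4)", ["ABD", "ACE", "BCF", "ABCG"]),
                     ("2^(8-2)", ["ABCDG", "ABEFH"])]:
    print(name, "has resolution", ROMAN[resolution(listed)])
```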

50 Example: Peanut Solids

A scientist is studying the extraction of food solids from peanuts using water. The seven factors of interest include pH level of the water, extraction time, and agitation speed. The scientist is able to run 16 different treatment combinations. Therefore, they conducted a single replicate of a $2^{7-3}$ design; i.e., an eighth-fraction design. Below is output for the design generation:

    $catlg.entry
    Design: 7-3.1
    16 runs, 7 factors,
    Resolution IV
    Generating columns: 7 11 13
    WLP (3plus): 0 7 0 0 0 , 0 clear 2fis

A resolution IV design is achieved for this setting. Near the top of the output are the generating columns. These are positions in the standard order of the effects of the four basic factors (A, B, AB, C, AC, BC, ABC, D, ...), where the first entry is for factor A and not for the constant I. Therefore, 7, 11, and 13 correspond to ABC, ABD, and ACD, respectively. This means the design generator is given by E=ABC, F=ABD, and G=ACD. The aliasing scheme then follows from the multiplication rules.
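One way to see why columns 7, 11, and 13 are ABC, ABD, and ACD is to write each column number in binary, with the lowest bit standing for A, the next for B, and so on. The helper below is a hypothetical illustration of that reading of the column numbers, not a function from the software that produced the output.

```python
def column_to_effect(n, factors="ABCD"):
    """Effect in position n of the standard order A, B, AB, C, AC, BC, ABC, D, ..."""
    return "".join(f for i, f in enumerate(factors) if (n >> i) & 1)

# Attach the three generating columns to the added factors E, F, and G
generators = dict(zip("EFG", (column_to_effect(c) for c in (7, 11, 13))))
print(generators)

# The first eight positions of the standard order
print([column_to_effect(n) for n in range(1, 9)])
```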

51 Example: Peanut Solids

Using the design generator given by E=ABC, F=ABD, and G=ACD, we can cycle through the multiplication rules to obtain the aliasing scheme. Below are the aliasing schemes for main effects as well as two-factor and three-factor interactions:

    $aliased
    $aliased$legend
    [1] "A=A" "B=B" "C=C" "D=D" "E=E" "F=F" "G=G"

    $aliased$main
    [1] "A=BCE=BDF=CDG=EFG" "B=ACE=ADF=CFG=DEG" "C=ABE=ADG=BFG=DEF" "D=ABF=ACG=BEG=CEF"
    [5] "E=ABC=AFG=BDG=CDF" "F=ABD=AEG=BCG=CDE" "G=ACD=AEF=BCF=BDE"

    $aliased$fi2
    [1] "AB=CE=DF" "AC=BE=DG" "AD=BF=CG" "AE=BC=FG" "AF=BD=EG" "AG=CD=EF" "BG=CF=DE"

    $aliased$fi3
    [1] "ABG=ACF=ADE=BCD=BEF=CEG=DFG"

If you take any of the aliased main effects and multiply both sides by that main effect, you recover part of the defining relationship involving I. For example, if you take A=BCE=BDF=CDG=EFG and multiply through by A, you get the (partial) defining relationship I=ABCE=ABDF=ACDG=AEFG. All of these words have four letters, which means the maximum resolution is IV, as we already showed on the previous slide.
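The aliasing can also be checked directly on the design matrix: in this fraction, two effects are aliased exactly when their contrast columns are identical. The sketch below builds the 16 runs from the generators E=ABC, F=ABD, and G=ACD and groups all effects up to three-factor interactions by their columns. Variable and function names are illustrative only.

```python
from itertools import combinations, product
from math import prod

basic = "ABCD"
generators = {"E": "ABC", "F": "ABD", "G": "ACD"}

# Full 2^4 factorial in the basic factors, then add the generated columns
runs = []
for levels in product([-1, 1], repeat=len(basic)):
    run = dict(zip(basic, levels))
    for factor, word in generators.items():
        run[factor] = prod(run[letter] for letter in word)
    runs.append(run)

def column(effect):
    """Contrast column of an effect: product of its factor columns, run by run."""
    return tuple(prod(run[letter] for letter in effect) for run in runs)

# Effects with identical columns cannot be separated, i.e., they are aliased
groups = {}
for order in (1, 2, 3):
    for combo in combinations("ABCDEFG", order):
        effect = "".join(combo)
        groups.setdefault(column(effect), []).append(effect)

for members in groups.values():
    print("=".join(members))
```

The 35 three-factor interactions, 21 two-factor interactions, and 7 main effects fall into 15 groups, one for each non-constant column of a 16-run design.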

52 Analyzing Fractional Factorial Designs

- We begin by estimating the effect of a factor in a fractional factorial design:
  \[ AB\cdots = \frac{2\, l_{AB\cdots}}{N} = \frac{2 \sum_t c_t \bar{y}_t}{N}, \]
  where $N$ is the total number of EUs (or runs) required for the experiment and $l_{AB\cdots}$ is, again, the contrast of the treatment combination $AB\cdots$.
- If we let $C$ be our contrast matrix (which amounts to using our design matrix for the fractional factorial design, but where the "-" and "+" signs are replaced by "-1" and "1", respectively), then the estimated slopes are $\hat{\beta} = (C^T C)^{-1} C^T Y$, where $\hat{\beta}_{AB\cdots} = \sum_t c_t \bar{y}_t / N$; therefore $AB\cdots = 2\hat{\beta}_{AB\cdots}$.
- The 1 df SS for an effect in a $2^{k-f}$ design is
  \[ SS(AB\cdots) = \frac{(l_{AB\cdots})^2}{2^{k-f}}. \]
- Obviously, analyzing data collected using a fractional factorial design and fitting all of the effects does not leave us any df for estimating the error.
- However, we can plot the estimated factor effects and interactions from the $2^{k-f}$ design on a normal probability plot (QQ-plot) to help us decide which, if any, effects appear to be negligible.
- We could then drop those factors that appear negligible (while obeying the hierarchy principle) and combine those effects to represent experimental variation, thus allowing us to use ANOVA.

53 Example: Filtration Experiment

A chemical product is produced in a pressure vessel. A $2^{4-1}$ fractional factorial design is conducted in the pilot plant to study the factors thought to influence the filtration rate (Y) of this product. The four factors are temperature (A), pressure (B), concentration of formaldehyde (C), and stirring rate (D). Each of these has a low and a high level. In the $2^{4-1}$ design, the defining relation I=ABCD is used. The design matrix and data are given below:

    Treatment   I   A   B   C   D=ABC   Y
    (1)         +   -   -   -   -       45
    ad          +   +   -   -   +       100
    bd          +   -   +   -   +       45
    ab          +   +   +   -   -       65
    cd          +   -   -   +   +       75
    ac          +   +   -   +   -       60
    bc          +   -   +   +   -       80
    abcd        +   +   +   +   +       96

54 Example: Filtration Experiment

The full aliasing scheme for this $2^{4-1}$ fractional factorial design is as follows:

    I  = ABCD
    A  = BCD
    B  = ACD
    C  = ABD
    D  = ABC
    AB = CD
    AC = BD
    AD = BC

The 1 df SS (Type I SS) are then as follows:

    Call:
       aov(formula = out)

    Terms:
                        A     B     C     D   A:B   A:C   A:D
    Sum of Squares  722.0   4.5 392.0 544.5   2.0 684.5 722.0
    Deg. of Freedom     1     1     1     1     1     1     1

    Estimated effects may be unbalanced

In the above, the SS for B and AB are small relative to the others. Hence, these effects are likely negligible. The estimated effects used in the calculation of the 1 df SS, which can be plotted on a normal probability plot, are given below:

       A1    B1    C1    D1 A1:B1 A1:C1 A1:D1
     19.0   1.5  14.0  16.5  -1.0 -18.5  19.0
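The estimated effects and 1 df SS can be reproduced by hand from the design matrix and data of this example, using effect = 2l/N and SS = l²/N with N = 8 runs. The following is a minimal sketch in plain Python (not the R code that produced the output above); names such as `contrast` and `results` are illustrative.

```python
from math import prod

# (A, B, C, D, Y) for the eight runs of the filtration experiment
runs = [
    (-1, -1, -1, -1, 45),   # (1)
    (+1, -1, -1, +1, 100),  # ad
    (-1, +1, -1, +1, 45),   # bd
    (+1, +1, -1, -1, 65),   # ab
    (-1, -1, +1, +1, 75),   # cd
    (+1, -1, +1, -1, 60),   # ac
    (-1, +1, +1, -1, 80),   # bc
    (+1, +1, +1, +1, 96),   # abcd
]
N = len(runs)
position = {"A": 0, "B": 1, "C": 2, "D": 3}

def contrast(effect):
    """l = sum over runs of (contrast coefficient) x (response)."""
    return sum(prod(run[position[x]] for x in effect) * run[4] for run in runs)

results = {}
for effect in ["A", "B", "C", "D", "AB", "AC", "AD"]:
    l = contrast(effect)
    results[effect] = (2 * l / N, l ** 2 / N)   # (estimated effect, 1 df SS)

for effect, (estimate, ss) in results.items():
    print(f"{effect:>2}: effect = {estimate:6.1f}, SS = {ss:6.1f}")
```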

55 Example: Filtration Experiment

- The figure below is the normal probability plot of estimated factor effects and interactions for this experiment.
- The further values deviate from the 0-1 line (i.e., the black line), the more substantial their effect.
- From this figure, it looks like B and AB are negligible, which is consistent with the Type I SS that we calculated.

[Figure: normal probability plot of the estimated effects. Horizontal axis: Effect Estimates (-20 to 20); vertical axis: Normal Probability. The plotted points are labeled A, AD, D, C, B, AB, and AC.]

56 Example: Filtration Experiment

We next drop B and AB from the analysis. The variability explained by B and AB (4.5 + 2.0 = 6.5 on 2 df) makes up our estimate of the experimental error. The resulting ANOVA table is presented below. All of the remaining effects on filtration rate are significant at the 0.01 level. Note that each F-test is constructed against the MSE, which is our estimate of the experimental error.

    Analysis of Variance Table

    Response: rate
              Df Sum Sq Mean Sq F value   Pr(>F)
    A          1  722.0  722.00  222.15 0.004471 **
    C          1  392.0  392.00  120.62 0.008189 **
    D          1  544.5  544.50  167.54 0.005916 **
    A:C        1  684.5  684.50  210.62 0.004714 **
    A:D        1  722.0  722.00  222.15 0.004471 **
    Residuals  2    6.5    3.25
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
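The F values and p-values in this table can be checked from the 1 df SS alone. The sketch below pools B and AB into the error term and uses a closed form for the p-value that holds only in this special case: an F statistic on (1, 2) df is the square of a t statistic on 2 df, for which P(|T| > t) = 1 - t/sqrt(t² + 2). The dictionary of SS values is copied from the example; everything else is illustrative.

```python
from math import sqrt

# 1 df sums of squares from the filtration experiment
ss = {"A": 722.0, "B": 4.5, "C": 392.0, "D": 544.5,
      "AB": 2.0, "AC": 684.5, "AD": 722.0}

pooled = ["B", "AB"]                 # effects judged negligible
df_error = len(pooled)               # each pooled effect contributes 1 df
mse = sum(ss[e] for e in pooled) / df_error

anova = {}
for effect, value in ss.items():
    if effect in pooled:
        continue
    f_stat = value / mse             # each retained effect has 1 df, so MS = SS
    # Closed-form upper tail of F(1, 2); NOT valid for other error df
    p_value = 1 - sqrt(f_stat / (f_stat + 2))
    anova[effect] = (f_stat, p_value)

print(f"MSE = {mse}")
for effect, (f_stat, p_value) in anova.items():
    print(f"{effect:>2}: F = {f_stat:7.2f}, p = {p_value:.6f}")
```

With a different number of pooled effects the error df changes, and the p-values would need a general F distribution routine instead of this shortcut.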

57 Plackett-Burman Designs

- A limitation of fractional factorial designs of resolution III is that they require the number of treatment combinations to be a power of 2.
- Plackett-Burman designs are two-level, resolution III designs that can be used for studying up to $N - 1$ factors in $N$ experimental trials, where $N$ is a multiple of 4.
- Valid run sizes of Plackett-Burman designs are, therefore, 4, 8, 12, 16, 20, etc.
- When $N$ is a power of 2, Plackett-Burman designs are equivalent to the resolution III fractional factorial designs that we already presented.
- When $N$ is not a power of 2, the aliasing scheme is complicated.
- Analysis of Plackett-Burman designs is the same as what we presented for other fractional factorial designs.
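As an illustration of a run size that is not a power of 2, the sketch below builds a 12-run Plackett-Burman design for up to 11 factors. It assumes the commonly tabulated generator row for N = 12 (+ + - + + + - - - + -): the first 11 runs are its cyclic shifts and the last run sets every factor to its low level. The check at the end confirms that every column is balanced and every pair of columns is orthogonal.

```python
from itertools import combinations

# Commonly tabulated generator row for the 12-run design (an assumption of this sketch)
generator = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]
n_factors = len(generator)

# Runs 1-11 are cyclic shifts of the generator; run 12 is all low levels
design = [generator[-i:] + generator[:-i] for i in range(n_factors)]
design.append([-1] * n_factors)

columns = list(zip(*design))
balanced = all(sum(col) == 0 for col in columns)
orthogonal = all(sum(x * y for x, y in zip(u, v)) == 0
                 for u, v in combinations(columns, 2))

for run in design:
    print(" ".join("+" if s == 1 else "-" for s in run))
print("balanced:", balanced, "orthogonal:", orthogonal)
```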

58 Final Comments

- As we have seen, factorial experiments are very versatile treatment designs.
- In this lecture, we've only provided the rudimentary principles for constructing fractional factorials.
- Fractional designs can also be constructed for $3^k$ factorials using the same principles as those for $2^k$ factorials.
- However, instead of dividing by powers of two (e.g., half-fractional, quarter-fractional, etc.), you now divide by powers of three (e.g., third-fractional, ninth-fractional, etc.).
- Most statistical software can handle $3^{k-f}$ fractional factorial designs.
- The resolution of such designs can also be determined (or specified).

59 This is the end of Unit 9.
