
CHAPTER 15 Other Design and Analysis Topics

CHAPTER OUTLINE
15.1  NONNORMAL RESPONSES AND TRANSFORMATIONS
      15.1.1  Selecting a Transformation: The Box–Cox Method
      15.1.2  The Generalized Linear Model
15.2  UNBALANCED DATA IN A FACTORIAL DESIGN
      15.2.1  Proportional Data: An Easy Case
      15.2.2  Approximate Methods
      15.2.3  The Exact Method
15.3  THE ANALYSIS OF COVARIANCE
      15.3.1  Description of the Procedure
      15.3.2  Computer Solution
      15.3.3  Development by the General Regression Significance Test
      15.3.4  Factorial Experiments with Covariates
15.4  REPEATED MEASURES

SUPPLEMENTAL MATERIAL FOR CHAPTER 15
S15.1  The Form of a Transformation
S15.2  Selecting λ in the Box–Cox Method
S15.3  Generalized Linear Models
       S15.3.1  Models with a Binary Response Variable
       S15.3.2  Estimating the Parameters in a Logistic Regression Model
       S15.3.3  Interpreting the Parameters in a Logistic Regression Model
       S15.3.4  Hypothesis Tests on Model Parameters
       S15.3.5  Poisson Regression
       S15.3.6  The Generalized Linear Model
       S15.3.7  Link Functions and Linear Predictors
       S15.3.8  Parameter Estimation in the Generalized Linear Model
       S15.3.9  Prediction and Estimation with the Generalized Linear Model
       S15.3.10 Residual Analysis in the Generalized Linear Model
S15.4  Unbalanced Data in a Factorial Design
       S15.4.1  The Regression Model Approach
       S15.4.2  The Type 3 Analysis
       S15.4.3  Type 1, Type 2, Type 3 and Type 4 Sums of Squares
       S15.4.4  Analysis of Unbalanced Data using the Means Model

The supplemental material is on the textbook website www.wiley.com/college/montgomery.

CHAPTER LEARNING OBJECTIVES
1. Know how to use the Box–Cox method to select a variance-stabilizing transformation.
2. Understand how the generalized linear model can be used to analyze some experiments with nonnormal response distributions.
3. Understand some basic analysis methods for unbalanced factorial designs.
4. Know how to analyze an experiment with a covariate.
5. Know how to design an experiment when the values of the covariate are known in advance.
6. Know how to analyze a single-factor design with repeated measures of the response.


The subject of statistically designed experiments is an extensive one. The previous chapters have provided an introductory presentation of many of the basic concepts and methods, yet in some cases we have only been able to provide an overview. For example, there are book-length presentations of topics such as response surface methodology, mixture experiments, and variance component estimation. In this chapter, we provide an overview of several other topics that the experimenter may potentially find useful.

15.1 Nonnormal Responses and Transformations

15.1.1 Selecting a Transformation: The Box–Cox Method

In Section 3.4.3, we discussed the problem of nonconstant variance in the response variable y from a designed experiment and noted that this is a departure from the standard assumptions. This inequality of variance problem occurs relatively often in practice, often in conjunction with a nonnormal response variable. Examples would include a count of defects or particles, proportion data such as yield or fraction defective, or a response variable that follows some skewed distribution (one "tail" of the response distribution is longer than the other). We introduced transformation of the response variable as an appropriate method for stabilizing the variance of the response. Two methods for selecting the form of the transformation were discussed—an empirical graphical technique, and essentially trial and error in which the experimenter simply tries one or more transformations, selecting the one that produces the most satisfactory or pleasing plot of residuals versus the fitted response.

Generally, transformations are used for three purposes: stabilizing response variance, making the distribution of the response variable closer to the normal distribution, and improving the fit of the model to the data. This last objective could include model simplification, say by eliminating interaction terms. Sometimes a transformation will be reasonably effective in simultaneously accomplishing more than one of these objectives.

We have noted that the power family of transformations y* = y^λ is very useful, where λ is the parameter of the transformation to be determined (e.g., λ = 1/2 means use the square root of the original response). Box and Cox (1964) have shown how the transformation parameter λ may be estimated simultaneously with the other model parameters (overall mean and treatment effects). The theory underlying their method uses the method of maximum likelihood. The actual computational procedure consists of performing, for various values of λ, a standard analysis of variance on

y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda \dot{y}^{\lambda - 1}} & \lambda \neq 0 \\ \dot{y} \ln y & \lambda = 0 \end{cases} \quad (15.1)

where \dot{y} = \ln^{-1}[(1/n) \Sigma \ln y] is the geometric mean of the observations. The maximum likelihood estimate of λ is the value for which the error sum of squares, say SS_E(λ), is a minimum. This value of λ is usually found by plotting a graph of SS_E(λ) versus λ and then reading the value of λ that minimizes SS_E(λ) from the graph. Usually between 10 and 20 values of λ are sufficient for estimating the optimum value. A second iteration using a finer mesh of values can be performed if a more accurate estimate of λ is necessary.

Notice that we cannot select the value of λ by directly comparing the error sums of squares from analyses of variance on y^λ because for each value of λ, the error sum of squares is measured on a different scale. Furthermore, a problem arises in y^λ when λ = 0; namely, as λ approaches zero, y^λ approaches unity. That is, when λ = 0, all the response values are a constant. The component (y^λ − 1)/λ of Equation 15.1 alleviates this problem because as λ tends to zero, (y^λ − 1)/λ goes to a limit of ln y. The divisor component \dot{y}^{\lambda - 1} in Equation 15.1 rescales the responses so that the error sums of squares are directly comparable.

In using the Box–Cox method, we recommend that the experimenter use simple choices for λ because the practical difference between λ = 0.5 and λ = 0.58 is likely to be small, but the square root transformation (λ = 0.5) is much easier to interpret. Obviously, values of λ close to unity would suggest that no transformation is necessary.
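Since SS_E(λ) is simply the error sum of squares from a standard analysis of variance on the transformed data, the search over λ is easy to script. The following Python sketch is our own illustration for a single-factor design (the function name and data layout are assumptions, not from the text):

```python
import numpy as np

def boxcox_sse(y, groups, lambdas):
    """Compute SS_E(lambda) from Equation 15.1 for a single-factor design.

    y       : 1-D array of positive responses
    groups  : 1-D array of treatment labels, same length as y
    lambdas : iterable of candidate transformation parameters
    """
    ydot = np.exp(np.mean(np.log(y)))  # geometric mean of the observations
    sse = []
    for lam in lambdas:
        if abs(lam) < 1e-12:
            z = ydot * np.log(y)                         # the lambda = 0 case
        else:
            z = (y ** lam - 1.0) / (lam * ydot ** (lam - 1.0))
        # Error sum of squares from a one-way ANOVA on z: pooled
        # within-treatment sum of squared deviations from the cell means.
        sse.append(sum(((z[groups == g] - z[groups == g].mean()) ** 2).sum()
                       for g in np.unique(groups)))
    return np.array(sse)
```

Plotting the returned values against the λ grid gives a curve like the one in Figure 15.1, from which the maximum likelihood estimate is read off at the minimum.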


Once a value of λ is selected by the Box–Cox method, the experimenter can analyze the data using y^λ as the response, unless of course λ = 0, in which case ln y is used. It is perfectly acceptable to use y^{(λ)} as the actual response, although the model parameter estimates will have a scale difference and origin shift in comparison to the results obtained using y^λ (or ln y). An approximate 100(1 − α) percent confidence interval for λ can be found by computing

SS^* = SS_E(\lambda)\left(1 + \frac{t^2_{\alpha/2,\nu}}{\nu}\right) \quad (15.2)

where ν is the number of degrees of freedom, and plotting a line parallel to the λ axis at height SS* on the graph of SS_E(λ) versus λ. Then, by locating the points on the λ axis where SS* cuts the curve SS_E(λ), we can read confidence limits on λ directly from the graph. If this confidence interval includes the value λ = 1, this implies (as noted above) that the data do not support the need for transformation.
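The confidence limits of Equation 15.2 can be scripted the same way. A minimal sketch, assuming the boxcox_sse() helper above and arrays y and groups holding the experiment's data:

```python
import numpy as np
from scipy.stats import t

nu, alpha = 20, 0.05                  # error degrees of freedom, 1 - confidence
lambdas = np.linspace(-1.0, 1.5, 26)
sse = boxcox_sse(y, groups, lambdas)

# Equation 15.2: horizontal cutoff line at SS*
ss_star = sse.min() * (1.0 + t.ppf(1.0 - alpha / 2.0, nu) ** 2 / nu)

# Approximate confidence limits: lambda values where SS_E(lambda) <= SS*
in_interval = lambdas[sse <= ss_star]
print(in_interval.min(), in_interval.max())
```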

EXAMPLE 15.1

We will illustrate the Box–Cox procedure using the peak discharge data originally presented in Example 3.5. Recall that this is a single-factor experiment (see Table 3.7 for the original data). Using Equation 15.1, we computed values of SS_E(λ) for various values of λ:

λ          SS_E(λ)
−1.00      7922.11
−0.50       687.10
−0.25       232.52
 0.00        91.96
 0.25        46.99
 0.50        35.42
 0.75        40.61
 1.00        62.08
 1.25       109.82
 1.50       208.12

A graph of values close to the minimum is shown in Figure 15.1, from which it is seen that λ = 0.52 gives a minimum value of approximately SS_E(λ) = 35.00. An approximate 95 percent confidence interval on λ is found by calculating the quantity SS* from Equation 15.2 as follows:

SS^* = SS_E(\lambda)\left(1 + \frac{t^2_{0.025,20}}{20}\right) = 35.00\left[1 + \frac{(2.086)^2}{20}\right] = 42.61

By plotting SS* on the graph in Figure 15.1 and reading the points on the λ scale where this line intersects the curve, we obtain lower and upper confidence limits on λ of λ⁻ = 0.27 and λ⁺ = 0.77. Because these confidence limits do not include the value 1, use of a transformation is indicated, and the square root transformation (λ = 0.50) actually used is easily justified.

◾ FIGURE 15.1  Plot of SS_E(λ) versus λ for Example 15.1 (the horizontal line at SS* = 42.61 cuts the curve at λ⁻ = 0.27 and λ⁺ = 0.77)


◾ FIGURE 15.2  Output from Design-Expert for the Box–Cox procedure (Box–Cox plot for power transforms of the peak discharge data; the vertical axis is ln of the residual SS. Lambda: current = 1, best = 0.541377, low CI = 0.291092, high CI = 0.791662. Recommended transformation: square root, lambda = 0.5)

Some computer programs include the Box–Cox procedure for selecting a power family transformation. Figure 15.2 presents the output from this procedure as implemented in Design-Expert for the peak discharge data. The results agree closely with the manual calculations summarized in Example 15.1. Notice that the vertical scale of the graph in Figure 15.2 is ln[SS_E(λ)].

15.1.2 The Generalized Linear Model

Data transformations are often a very effective way to deal with the problem of nonnormal responses and the associated inequality of variance. As we have seen in the previous section, the Box–Cox method is an easy and effective way to select the form of the transformation. However, the use of a data transformation poses some problems.

One problem is that the experimenter may be uncomfortable in working with the response in the transformed scale. That is, he or she is interested in the number of defects, not the square root of the number of defects, or in resistivity instead of the logarithm of resistivity. On the other hand, if a transformation is really successful and improves the analysis and the associated model for the response, experimenters will usually quickly adopt the new metric.

A more serious problem is that a transformation can result in a nonsensical value for the response variable over some portion of the design factor space that is of interest to the experimenter. For example, suppose that we have used the square root transformation in an experiment involving the number of defects observed on semiconductor wafers, and for some portion of the region of interest the predicted square root of the count of defects is negative. This is likely to occur for situations where the actual number of defects observed is small. Consequently, the model for the experiment has produced an obviously unreliable prediction in the very region where we would like this model to have good predictive performance. Finally, as noted in Section 15.1.1, we often use transformations to stabilize variance, induce normality, and simplify the model. There is no assurance that a transformation will effectively accomplish all of these objectives simultaneously.

An alternative to the typical approach of data transformation followed by standard analysis of the transformed response is to use the generalized linear model (GLM). This is an approach developed by Nelder and Wedderburn (1972) that essentially unifies linear and nonlinear models with both normal and nonnormal responses. McCullagh and Nelder (1989) and Myers et al. (2010) give comprehensive presentations of generalized linear models, and Myers and Montgomery (1997) provide a tutorial. More details are also given in the supplemental text material for this chapter. We will provide an overview of the concepts and illustrate them with three short examples.


A generalized linear model is basically a regression model (an experimental design model is also a regression model). Like all regression models, it is made up of a random component (what we have usually called the error term) and a function of the design factors (the x's) and some unknown parameters (the β's). In a standard normal-theory model, we write

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \epsilon \quad (15.3)

where the error term ε is assumed to have a normal distribution with mean zero and constant variance, and the mean of the response variable y is

E(y) = \mu = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k = \mathbf{x}'\boldsymbol{\beta} \quad (15.4)

The portion x′β of Equation 15.4 is called the linear predictor. The generalized linear model contains Equation 15.3 as a special case. In a GLM, the response variable can have any distribution that is a member of the exponential family. This family includes the normal, Poisson, binomial, exponential, and gamma distributions, so the exponential family is a very rich and flexible collection of distributions applicable to many experimental situations. Also, the relationship between the response mean μ and the linear predictor x′β is determined by a link function

g(\mu) = \mathbf{x}'\boldsymbol{\beta} \quad (15.5)

The regression model that represents the mean response is then given by

E(y) = \mu = g^{-1}(\mathbf{x}'\boldsymbol{\beta}) \quad (15.6)

For example, the link function leading to the ordinary linear regression model in Equation 15.3 is called the identity link because μ = g⁻¹(x′β) = x′β. As another example, the log link

\ln(\mu) = \mathbf{x}'\boldsymbol{\beta} \quad (15.7)

produces the model

\mu = e^{\mathbf{x}'\boldsymbol{\beta}} \quad (15.8)

The log link is often used with count data (Poisson response) and with continuous responses that have a distribution with a long tail to the right (the exponential or gamma distribution). Another important link function used with binomial data is the logit link

\ln\left(\frac{\mu}{1 - \mu}\right) = \mathbf{x}'\boldsymbol{\beta} \quad (15.9)

This choice of link function leads to the model

\mu = \frac{1}{1 + e^{-\mathbf{x}'\boldsymbol{\beta}}} \quad (15.10)

Many choices of link function are possible, but the link must always be monotonic and differentiable. Also, note that in a generalized linear model, the variance of the response variable does not have to be a constant; it can be a function of the mean (and of the predictor variables through the link function). For example, if the response is Poisson, the variance of the response is exactly equal to the mean.

To use a generalized linear model in practice, the experimenter must specify a response distribution and a link function. Then the model fitting or parameter estimation is done by the method of maximum likelihood, which for the exponential family turns out to be an iterative version of weighted least squares. For ordinary linear regression or experimental design models with a normal response variable, this reduces to standard least squares. Using an approach that is analogous to the analysis of variance for normal-theory data, we can perform inference and diagnostic checking for a GLM. Refer to Myers and Montgomery (1997) and Myers et al. (2010) for the details and examples. Two software packages that support the generalized linear model nicely are SAS (PROC GENMOD) and JMP.


EXAMPLE 15.2

A consumer products company is studying the factors that impact the chances that a customer will redeem a coupon for one of its personal care products. A 2³ factorial experiment was conducted to investigate the following variables: A = coupon value (low, high), B = length of time for which the coupon is valid, and C = ease of use (easy, hard). A total of 1000 customers were randomly selected for each of the eight cells of the 2³ design, and the response is the number of coupons redeemed. The experimental results are shown in Table 15.1.

◾ TABLE 15.1  Design and Data for the Coupon Redemption Experiment

 A    B    C    Number of Coupons Redeemed
 −    −    −              200
 +    −    −              250
 −    +    −              265
 +    +    −              347
 −    −    +              210
 +    −    +              286
 −    +    +              271
 +    +    +              326

We can think of the response as the number of successes out of 1000 Bernoulli trials in each cell of the design, so a reasonable model for the response is a generalized linear model with a binomial response distribution and a logit link. This particular form of the GLM is usually called logistic regression. Minitab and JMP will fit logistic regression models. The experimenters decided to fit a model involving only the main effects and two-factor interactions. Therefore, the model for the expected response is

E(y) = \frac{1}{1 + \exp\left[-\left(\beta_0 + \sum_{i=1}^{3} \beta_i x_i + \sum\sum_{i<j} \beta_{ij} x_i x_j\right)\right]}

Table 15.2 presents a portion of the Minitab output for the data in Table 15.1. The upper portion of the table fits the full model involving all three main effects and the three two-factor interactions. Notice that the output contains a display of the estimated model coefficients and their standard errors. It turns out that the ratio of the estimated regression coefficient to its standard error (a t-like ratio) has an asymptotically normal distribution under the null hypothesis that the regression coefficient is equal to zero. Thus, the ratios Z = β̂/se(β̂) can be used to test whether each main effect and interaction term contributes significantly to the model. Here, the word "asymptotic" means when the sample size is large. Now the sample size here is certainly not large, and we should be careful about interpreting the P-values associated with these t-like ratios, but they can be used as a guide to the analysis of the data. The P-values in the table indicate that the intercept, the main effects of A and B, and the BC interaction are significant.

The goodness-of-fit section of the table presents three different test statistics (Pearson, deviance, and Hosmer–Lemeshow) that measure the overall adequacy of the model. All of the P-values for these goodness-of-fit statistics are large, implying that the model is satisfactory. The bottom portion of the table presents the analysis for a reduced model containing the three main effects and the BC interaction (factor C was included to maintain model hierarchy). The fitted model is

\hat{y} = \frac{1}{1 + \exp[-(-1.01 + 0.169x_1 + 0.169x_2 + 0.023x_3 - 0.041x_2x_3)]} = \frac{1}{1 + \exp(1.01 - 0.169x_1 - 0.169x_2 - 0.023x_3 + 0.041x_2x_3)}

Since the effects of C and the BC interaction are very small, these terms could likely be dropped from the model with no major consequences. Minitab reports an odds ratio for each regression model coefficient.


◾ TABLE 15.2 Minitab Output for the Coupon Redemption Experiment

Binary Logistic Regression: Full Model

Link Function: Logit

Response Information
Variable  Value    Count
C5        Success   2155
          Failure   5845
C6        Total     8000

Logistic Regression Table
                                                 Odds     95% CI
Predictor   Coef        SE Coef       Z      P   Ratio  Lower  Upper
Constant   -1.01154    0.0255150  -39.65  0.000
A           0.169208   0.0255092    6.63  0.000   1.18   1.13   1.25
B           0.169622   0.0255150    6.65  0.000   1.18   1.13   1.25
C           0.0233173  0.0255099    0.91  0.361   1.02   0.97   1.08
A*B        -0.0062854  0.0255122   -0.25  0.805   0.99   0.95   1.04
A*C        -0.0027726  0.0254324   -0.11  0.913   1.00   0.95   1.05
B*C        -0.0410198  0.0254339   -1.61  0.107   0.96   0.91   1.01

Log-Likelihood = -4615.310

Goodness-of-Fit Tests
Method            Chi-Square  DF      P
Pearson              1.46458   1  0.226
Deviance             1.46451   1  0.226
Hosmer-Lemeshow      1.46458   6  0.962

Binary Logistic Regression: Reduced Model Involving A, B, C, and BC

Link Function: Logit

Response Information
Variable  Value    Count
C5        Success   2155
          Failure   5845
C6        Total     8000

Logistic Regression Table
                                                 Odds     95% CI
Predictor   Coef        SE Coef       Z      P   Ratio  Lower  Upper
Constant   -1.01142    0.0255076  -39.65  0.000
A           0.168675   0.0254235    6.63  0.000   1.18   1.13   1.24
B           0.169116   0.0254321    6.65  0.000   1.18   1.13   1.24
C           0.0230825  0.0254306    0.91  0.364   1.02   0.97   1.08
B*C        -0.0409711  0.0254307   -1.61  0.107   0.96   0.91   1.01

Log-Likelihood = -4615.346

Goodness-of-Fit Tests
Method            Chi-Square  DF      P
Pearson              1.53593   3  0.674
Deviance             1.53602   3  0.674
Hosmer-Lemeshow      1.53593   6  0.957


The odds ratio follows directly from the logit link in Equation 15.9 and is interpreted much like factor effect estimates in standard two-level designs. For factor A, it can be interpreted as the ratio of the odds of redeeming a coupon of high value (x₁ = +1) to the odds of redeeming a coupon of value x₁ = 0. The computed value of the odds ratio is e^{β̂₁} = e^{0.168675} = 1.18. The quantity e^{2β̂₁} = e^{2(0.168675)} = 1.40 is the ratio of the odds of redeeming a coupon of high value (x₁ = +1) to the odds of redeeming a coupon of low value (x₁ = −1). Thus, a high-value coupon increases the odds of redemption by about 40 percent.

Logistic regression is widely used and may be the most common application of the GLM. It finds wide application in the biomedical field with dose-response studies (where the design factor is the dose of a particular therapeutic treatment and the response is whether or not the patient responds successfully to the treatment). Many experiments involve binary (success–failure) data, such as when units of products or components are subjected to a stress or load and the response is whether or not the unit fails.
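Readers without Minitab can reproduce the gist of Example 15.2 with any GLM routine. Below is a hedged sketch using Python's statsmodels (the variable names are our own; the counts come from Table 15.1, with n = 1000 trials per cell):

```python
import numpy as np
import statsmodels.api as sm

# Coded factor levels and redeemed counts from Table 15.1.
A = np.array([-1, 1, -1, 1, -1, 1, -1, 1])
B = np.array([-1, -1, 1, 1, -1, -1, 1, 1])
C = np.array([-1, -1, -1, -1, 1, 1, 1, 1])
redeemed = np.array([200, 250, 265, 347, 210, 286, 271, 326])
trials = np.full(8, 1000)

# Reduced model: A, B, C, and the BC interaction. The logit link is the
# default (canonical) link for the Binomial family.
X = sm.add_constant(np.column_stack([A, B, C, B * C]))
endog = np.column_stack([redeemed, trials - redeemed])  # (successes, failures)
fit = sm.GLM(endog, X, family=sm.families.Binomial()).fit()

print(fit.params)          # should be close to the "Coef" column in Table 15.2
print(np.exp(fit.params))  # odds ratios for the factor columns, as in Minitab
```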

EXAMPLE 15.3 The Grill Defects Experiment

Problem 8.29 introduced an experiment to study the effects of nine factors on defects in sheet-molded grill opening panels. Bisgaard and Fuller (1994) performed an interesting and useful analysis of these data to illustrate the value of data transformation in a designed experiment. As we observed in Problem 8.29 part (f), they used a modification of the square root transformation that led to the model

(\sqrt{y} + \sqrt{y + 1})/2 = 2.513 - 0.996x_4 - 1.21x_6 - 0.772x_2x_7

where, as usual, the x's represent the coded design factors. This transformation does an excellent job of stabilizing the variance of the number of defectives. The first two panels of Table 15.3 present some information about this model. Under the "Transformed" heading, the first column contains the predicted response. Notice that there are two negative predicted values. The "Untransformed" heading presents the untransformed predicted values, along with 95 percent confidence intervals on the mean response at each of the 16 design points. Because there were some negative predicted values and negative lower confidence limits, we were unable to compute values for all of the entries in this panel of the table.

The response is essentially a square root of the count of defects, so a negative predicted value is clearly illogical. Notice that this occurs where the observed counts were small. If it is important to use the model to predict performance in this region, the model may be unreliable. This should not be taken as a criticism of either the original experimenters or Bisgaard and Fuller's analysis. This was an extremely successful screening experiment that clearly defined the important processing variables. Prediction was not the original goal, nor was it the goal of the analysis done by Bisgaard and Fuller.

If obtaining a prediction model had been important, however, a generalized linear model would probably have been a good alternative to the transformation approach. Myers and Montgomery use a log link (Equation 15.7) and a Poisson response to fit exactly the same linear predictor as given by Bisgaard and Fuller. This produces the model

\hat{y} = e^{1.128 - 0.896x_4 - 1.176x_6 - 0.737x_2x_7}

The third panel of Table 15.3 contains the predicted values from this model and the 95 percent confidence intervals on the mean response at each point in the design. These results were obtained from SAS PROC GENMOD. JMP will also fit the Poisson regression model. There are no negative predicted values (assured by the choice of link function) and no negative lower confidence limits. The last panel of the table compares the lengths of the 95 percent confidence intervals for the untransformed response and the GLM. Notice that the confidence intervals for the generalized linear model are uniformly shorter than their least squares counterparts. This is a strong indication that the generalized linear model approach has explained more variability and produced a superior model compared to the transformation approach.
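To see the nonnegativity guarantee concretely, the fitted Poisson model can be evaluated at any coded factor setting. A small sketch of our own, using the coefficients quoted in the example:

```python
import numpy as np

# Fitted Poisson GLM from Example 15.3; exp() can never return a
# negative predicted defect count, whatever the factor settings.
def predict_defects(x4, x6, x2x7):
    return np.exp(1.128 - 0.896 * x4 - 1.176 * x6 - 0.737 * x2x7)

print(predict_defects(-1, -1, -1))  # about 51.3, matching 51.26 in Table 15.3
print(predict_defects(1, 1, 1))     # about 0.19, matching the smallest entry
```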


◾ TABLE 15.3  Least Squares and Generalized Linear Model Analysis for the Grill Opening Panels Experiment
(Least squares: Freeman and Tukey modified square root data transformation. GLM: Poisson response, log link. Entries marked * could not be computed because of negative predicted values or confidence limits.)

      Transformed (LS)            Untransformed (LS)             GLM (Poisson, log link)       Length of 95% CI
Obs   Predicted   95% CI          Predicted   95% CI             Predicted   95% CI            Least Squares    GLM
 1     5.50   (4.14, 6.85)         29.70   (16.65, 46.41)         51.26   (42.45, 61.90)          29.76        19.45
 2     3.95   (2.60, 5.31)         15.12   (6.25, 27.65)          11.74   (8.14, 16.94)           21.39         8.80
 3     1.52   (0.17, 2.88)          1.84   (1.69, 7.78)            1.12   (0.60, 2.08)             6.09         1.47
 4     3.07   (1.71, 4.42)          8.91   (2.45, 19.04)           4.88   (2.87, 8.32)            16.59         5.45
 5     1.52   (0.17, 2.88)          1.84   (1.69, 7.78)            1.12   (0.60, 2.08)             6.09         1.47
 6     3.07   (1.71, 4.42)          8.91   (2.45, 19.04)           4.88   (2.87, 8.32)            16.59         5.45
 7     5.50   (4.14, 6.85)         29.70   (16.65, 46.41)         51.26   (42.45, 61.90)          29.76        19.45
 8     3.95   (2.60, 5.31)         15.12   (6.25, 27.65)          11.74   (8.14, 16.94)           21.39         8.80
 9     1.08   (−0.28, 2.43)         0.71   (*, 5.41)               0.81   (0.42, 1.56)              *            1.13
10    −0.47   (−1.82, 0.89)           *    (*, 0.36)               0.19   (0.09, 0.38)              *            0.29
11     1.96   (0.61, 3.31)          3.36   (0.04, 10.49)           1.96   (1.16, 3.30)            10.45         2.14
12     3.50   (2.15, 4.86)         11.78   (4.13, 23.10)           8.54   (5.62, 12.98)           18.96         7.35
13     1.96   (0.61, 3.31)          3.36   (0.04, 10.49)           1.96   (1.16, 3.30)            10.45         2.14
14     3.50   (2.15, 4.86)         11.78   (4.13, 23.10)           8.54   (5.62, 12.98)           18.97         7.35
15     1.08   (−0.28, 2.43)         0.71   (*, 5.41)               0.81   (0.42, 1.56)              *            1.13
16    −0.47   (−1.82, 0.89)           *    (*, 0.36)               0.19   (0.09, 0.38)              *            0.29

EXAMPLE 15.4 The Worsted Yarn Experiment

Table 15.4 presents a 3³ factorial design conducted to investigate the performance of worsted yarn under cycles of repeated loading. The experiment is described thoroughly by Box and Draper (2007). The response is the number of cycles to failure. Reliability data such as this is typically nonnegative and continuous and often follows a distribution with a long right tail.

The data were initially analyzed using the standard (least squares) approach, and data transformation was necessary to stabilize the variance. The natural log of the cycles-to-failure data is found to yield an adequate model in terms of overall model fit and satisfactory residual plots. The model is

\ln \hat{y} = 6.33 + 0.82x_1 - 0.63x_2 - 0.38x_3

or, in terms of the original response, cycles to failure,

\hat{y} = e^{6.33 + 0.82x_1 - 0.63x_2 - 0.38x_3}


◾ TABLE 15.4  The Worsted Yarn Experiment

Run   x1   x2   x3   Cycles to Failure   Natural Log of Cycles to Failure
  1   −1   −1   −1          674                 6.51
  2   −1   −1    0          370                 5.91
  3   −1   −1    1          292                 5.68
  4   −1    0   −1          338                 5.82
  5   −1    0    0          266                 5.58
  6   −1    0    1          210                 5.35
  7   −1    1   −1          170                 5.14
  8   −1    1    0          118                 4.77
  9   −1    1    1           90                 4.50
 10    0   −1   −1         1414                 7.25
 11    0   −1    0         1198                 7.09
 12    0   −1    1          634                 6.45
 13    0    0   −1         1022                 6.93
 14    0    0    0          620                 6.43
 15    0    0    1          438                 6.08
 16    0    1   −1          442                 6.09
 17    0    1    0          332                 5.81
 18    0    1    1          220                 5.39
 19    1   −1   −1         3636                 8.20
 20    1   −1    0         3184                 8.07
 21    1   −1    1         2000                 7.60
 22    1    0   −1         1568                 7.36
 23    1    0    0         1070                 6.98
 24    1    0    1          566                 6.34
 25    1    1   −1         1140                 7.04
 26    1    1    0          884                 6.78
 27    1    1    1          360                 5.89

This experiment was also analyzed using the generalized linear model, selecting the gamma response distribution and the log link. We used exactly the same model form found by least squares analysis of the log-transformed response. The model that resulted is

\hat{y} = e^{6.35 + 0.84x_1 - 0.63x_2 - 0.39x_3}

Table 15.5 presents the predicted values from the least squares model and the generalized linear model, along with 95 percent confidence intervals on the mean response at each of the 27 points in the design. A comparison of the lengths of the confidence intervals reveals that the generalized linear model is likely to be a better predictor than the least squares model.

◾ TABLE 15.5  Least Squares and Generalized Linear Model Analysis for the Worsted Yarn Experiment
(Least squares: log data transformation. GLM: gamma response, log link.)

      Transformed (LS)           Untransformed (LS)              GLM (Gamma, log link)          Length of 95% CI
Obs   Predicted   95% CI         Predicted   95% CI              Predicted   95% CI             Least Squares     GLM
 1    2.83 (2.76, 2.91)      682.50 (573.85, 811.52)       680.52 (583.83, 793.22)        237.67      209.39
 2    2.66 (2.60, 2.73)      460.26 (397.01, 533.46)       463.00 (407.05, 526.64)        136.45      119.59
 3    2.49 (2.42, 2.57)      310.38 (260.98, 369.06)       315.01 (271.49, 365.49)        108.09       94.00
 4    2.56 (2.50, 2.62)      363.25 (313.33, 421.11)       361.96 (317.75, 412.33)        107.79       94.58
 5    2.39 (2.34, 2.44)      244.96 (217.92, 275.30)       246.26 (222.55, 272.51)         57.37       49.96
 6    2.22 (2.15, 2.28)      165.20 (142.50, 191.47)       167.55 (147.67, 190.10)         48.97       42.42
 7    2.29 (2.21, 2.36)      193.33 (162.55, 229.93)       192.52 (165.69, 223.70)         67.38       58.01
 8    2.12 (2.05, 2.18)      130.38 (112.46, 151.15)       130.98 (115.43, 148.64)         38.69       33.22
 9    1.94 (1.87, 2.02)       87.92 (73.93, 104.54)         89.12 (76.87, 103.32)          30.62       26.45
10    3.20 (3.13, 3.26)     1569.28 (1353.94, 1819.28)    1580.00 (1390.00, 1797.00)      465.34      407.00
11    3.02 (2.97, 3.08)     1058.28 (941.67, 1189.60)     1075.00 (972.52, 1189.00)       247.92      216.48
12    2.85 (2.79, 2.92)      713.67 (615.60, 827.37)       731.50 (644.35, 830.44)        211.77      186.09
13    2.92 (2.87, 2.97)      835.41 (743.19, 938.86)       840.54 (759.65, 930.04)        195.67      170.39
14    2.75 (2.72, 2.78)      563.25 (523.24, 606.46)       571.87 (536.67, 609.38)         83.22       72.70
15    2.58 (2.53, 2.63)      379.84 (337.99, 426.97)       389.08 (351.64, 430.51)         88.99       78.87
16    2.65 (2.58, 2.71)      444.63 (383.53, 515.35)       447.07 (393.81, 507.54)        131.82      113.74
17    2.48 (2.43, 2.53)      299.85 (266.75, 336.98)       304.17 (275.13, 336.28)         70.23       61.15
18    2.31 (2.24, 2.37)      202.16 (174.42, 234.37)       206.95 (182.03, 235.27)         59.95       53.23
19    3.56 (3.48, 3.63)     3609.11 (3034.59, 4292.40)    3670.00 (3165.00, 4254.00)     1257.81     1089.00
20    3.39 (3.32, 3.45)     2433.88 (2099.42, 2821.63)    2497.00 (2200.00, 2833.00)      722.21      633.00
21    3.22 (3.14, 3.29)     1641.35 (1380.07, 1951.64)    1699.00 (1462.00, 1974.00)      571.57      512.00
22    3.28 (3.22, 3.35)     1920.88 (1656.91, 2226.90)    1952.00 (1720.00, 2215.00)      569.98      495.00
23    3.11 (3.06, 3.16)     1295.39 (1152.66, 1455.79)    1328.00 (1200.00, 1470.00)      303.14      270.00
24    2.94 (2.88, 3.01)      873.57 (753.53, 1012.74)      903.51 (793.15, 1029.00)       259.22      235.85
25    3.01 (2.93, 3.08)     1022.35 (859.81, 1215.91)     1038.00 (894.79, 1205.00)       356.10      310.21
26    2.84 (2.77, 2.90)      689.45 (594.70, 799.28)       706.34 (620.99, 803.43)        204.58      182.44
27    2.67 (2.59, 2.74)      464.94 (390.93, 552.97)       480.57 (412.29, 560.15)        162.04      147.86
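Since Table 15.4 lists the full data set, the gamma/log-link analysis is easy to replicate. A sketch using Python's statsmodels (our own illustration, not the original SAS/JMP analysis):

```python
import numpy as np
import statsmodels.api as sm

# Worsted yarn data from Table 15.4; x1 varies slowest, x3 fastest,
# matching the run order in the table.
levels = [-1, 0, 1]
X123 = np.array([(x1, x2, x3) for x1 in levels for x2 in levels for x3 in levels])
cycles = np.array([
    674, 370, 292, 338, 266, 210, 170, 118, 90,
    1414, 1198, 634, 1022, 620, 438, 442, 332, 220,
    3636, 3184, 2000, 1568, 1070, 566, 1140, 884, 360,
])

# Gamma response distribution with the log link.
X = sm.add_constant(X123)
fit = sm.GLM(cycles, X,
             family=sm.families.Gamma(link=sm.families.links.Log())).fit()
print(fit.params)  # should be close to (6.35, 0.84, -0.63, -0.39) quoted above
```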

Generalized linear models have found extensive application in biomedical and pharmaceutical research and development. As more software packages incorporate this capability, it will find widespread application in the general industrial research and development environment. The examples in this section used standard experimental designs in conjunction with a response distribution from the exponential family. When the experimenter knows in advance that a GLM analysis will be required, it is possible to design the experiment with this in mind. Optimal designs for GLMs are a special type of optimal design for a nonlinear model. For more discussion and examples, see Johnson and Montgomery (2009, 2010).

15.2 Unbalanced Data in a Factorial Design

The primary focus of this book has been the analysis of balanced factorial designs—that is, cases where there are an equal number of observations n in each cell. However, it is not unusual to encounter situations where the number of observations in the cells is unequal. These unbalanced factorial designs occur for various reasons.


For example, the experimenter may have designed a balanced experiment initially, but because of unforeseen problems in running the experiment, resulting in the loss of some observations, he or she ends up with unbalanced data. On the other hand, some unbalanced experiments are deliberately designed that way. For instance, certain treatment combinations may be more expensive or more difficult to run than others, so fewer observations may be taken in those cells. Alternatively, some treatment combinations may be of greater interest to the experimenter because they represent new or unexplored conditions, and so the experimenter may elect to obtain additional replicates in those cells.

The orthogonality property of main effects and interactions present in balanced data does not carry over to the unbalanced case. This means that the usual analysis of variance techniques do not apply. Consequently, the analysis of unbalanced factorials is much more difficult than that for balanced designs.

In this section, we give a brief overview of methods for dealing with unbalanced factorials, concentrating on the case of the two-factor fixed effects model. Suppose that the number of observations in the ijth cell is n_{ij}. Furthermore, let n_{i•} = \sum_{j=1}^{b} n_{ij} be the number of observations in the ith row (the ith level of factor A), n_{•j} = \sum_{i=1}^{a} n_{ij} be the number of observations in the jth column (the jth level of factor B), and n_{••} = \sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij} be the total number of observations.

15.2.1 Proportional Data: An Easy Case

One situation involving unbalanced data presents little difficulty in analysis; this is the case of proportional data. That is, the number of observations in the ijth cell is

n_{ij} = \frac{n_{i•} n_{•j}}{n_{••}} \quad (15.11)

This condition implies that the number of observations in any two rows or columns is proportional. When proportional data occur, the standard analysis of variance can be employed. Only minor modifications are required in the manual computing formulas for the sums of squares, which become

SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}} y_{ijk}^2 - \frac{y_{•••}^2}{n_{••}}

SS_A = \sum_{i=1}^{a} \frac{y_{i••}^2}{n_{i•}} - \frac{y_{•••}^2}{n_{••}}

SS_B = \sum_{j=1}^{b} \frac{y_{•j•}^2}{n_{•j}} - \frac{y_{•••}^2}{n_{••}}

SS_{AB} = \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{y_{ij•}^2}{n_{ij}} - \frac{y_{•••}^2}{n_{••}} - SS_A - SS_B

SS_E = SS_T - SS_A - SS_B - SS_{AB} = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}} y_{ijk}^2 - \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{y_{ij•}^2}{n_{ij}}

This produces an ANOVA based on a sequential model fitting analysis, with factor A fit before factor B (an alternative would be to use an "adjusted" model fitting strategy similar to the one used with incomplete block designs in Chapter 4—both procedures can be implemented using the Minitab Balanced ANOVA routine).

As an example of proportional data, consider the battery design experiment in Example 5.1. A modified version of the original data is shown in Table 15.6. Clearly, the data are proportional; for example, in cell 1,1 we have

n_{11} = \frac{n_{1•} n_{•1}}{n_{••}} = \frac{10(8)}{20} = 4 observations.


◾ TABLE 15.6  Battery Design Experiment with Proportional Data

                                  Temperature (°F)
Material Type        15                     70                    125               Row Totals
1    (n11 = 4) 130, 155, 74, 180   (n12 = 4) 34, 40, 80, 75   (n13 = 2) 70, 58   n1• = 10, y1•• = 896
2    (n21 = 2) 159, 126            (n22 = 2) 136, 115         (n23 = 1) 45       n2• = 5,  y2•• = 581
3    (n31 = 2) 138, 160            (n32 = 2) 150, 139         (n33 = 1) 96       n3• = 5,  y3•• = 683
     n•1 = 8,  y•1• = 1122         n•2 = 8,  y•2• = 769       n•3 = 4, y•3• = 269   n•• = 20, y••• = 2160

The results of applying the usual analysis of variance to these data are shown in Table 15.7. Both material type and temperature are significant, and the interaction is only significant at about α = 0.17. Therefore, the conclusions mostly agree with the analysis of the full data set in Example 5.1, except that the interaction effect is not significant.

15.2.2 Approximate Methods

When unbalanced data are not too far from the balanced case, it is sometimes possible to use approximate procedures that convert the unbalanced problem into a balanced one. This, of course, makes the analysis only approximate, but the analysis of balanced data is so easy that we are frequently tempted to use this approach. In practice, we must decide when the data are not sufficiently different from the balanced case to make the degree of approximation introduced relatively unimportant. We now briefly describe some of these approximate methods. We assume that every cell has at least one observation (i.e., n_{ij} ≥ 1).

Estimating Missing Observations. If only a few n_{ij} are different, a reasonable procedure is to estimate the missing values. For example, consider the unbalanced design in Table 15.8. Clearly, estimating the single missing value in cell 2,2 is a reasonable approach. For a model with interaction, the estimate of the missing value in the ijth cell that minimizes the error sum of squares is \bar{y}_{ij•}. That is, we estimate the missing value by taking the average of the observations that are available in that cell. The estimated value is treated just like actual data. The only modification to the analysis of variance is to reduce the error degrees of freedom by the number of missing observations that have been estimated. For example, if we estimate the missing value in cell 2,2 in Table 15.8, we would use 26 error degrees of freedom instead of 27.

◾ TABLE 15.7  Analysis of Variance for Battery Design Data in Table 15.6

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square    F0
Material types              7,811.6            2              3,905.8     4.78
Temperature                16,090.9            2              8,045.5     9.85
Interaction                 6,266.5            4              1,566.6     1.92
Error                       8,981.0           11                816.5
Total                      39,150.0           19
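Returning to the proportional-data formulas of Section 15.2.1, they are straightforward to verify numerically. The following sketch (our own illustration) reproduces the sums of squares in Table 15.7 from the data of Table 15.6:

```python
import numpy as np

# (material, temperature) -> observations, from Table 15.6
cells = {
    (0, 0): [130, 155, 74, 180], (0, 1): [34, 40, 80, 75], (0, 2): [70, 58],
    (1, 0): [159, 126],          (1, 1): [136, 115],       (1, 2): [45],
    (2, 0): [138, 160],          (2, 1): [150, 139],       (2, 2): [96],
}
a, b = 3, 3
y_all = np.concatenate([np.asarray(v, float) for v in cells.values()])
corr = y_all.sum() ** 2 / y_all.size          # correction term y...^2 / n..

row = lambda i: np.concatenate([cells[i, j] for j in range(b)])
col = lambda j: np.concatenate([cells[i, j] for i in range(a)])

SST = (y_all ** 2).sum() - corr
SSA = sum(row(i).sum() ** 2 / row(i).size for i in range(a)) - corr
SSB = sum(col(j).sum() ** 2 / col(j).size for j in range(b)) - corr
SSAB = sum(np.sum(v) ** 2 / len(v) for v in cells.values()) - corr - SSA - SSB
SSE = SST - SSA - SSB - SSAB
print(SSA, SSB, SSAB, SSE)   # 7811.6, 16090.9, 6266.5, 8981.0 (Table 15.7)
```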


◾ TABLE 15.8  The n_ij Values for an Unbalanced Design

          Columns
Rows    1    2    3
1       4    4    4
2       4    3    4
3       4    4    4

◾ TABLE 15.9  The n_ij Values for an Unbalanced Design

          Columns
Rows    1    2    3
1       4    4    4
2       4    5    4
3       4    4    4

Setting Data Aside. Consider the data in Table 15.9. Note that cell 2,2 has only one more observation than the others. Estimating missing values for the remaining eight cells is probably not a good idea here because this would result in estimates constituting about 18 percent of the final data. An alternative is to set aside one of the observations in cell 2,2, giving a balanced design with n = 4 replicates. The observation that is set aside should be chosen randomly. Furthermore, rather than completely discarding the observation, we could return it to the design, randomly choose another observation to set aside, and repeat the analysis. We hope that these two analyses will not lead to conflicting interpretations of the data. If they do, we suspect that the observation that was set aside is an outlier or a wild value and should be handled accordingly. In practice, this confusion is unlikely to occur when only small numbers of observations are set aside and the variability within the cells is small.

Method of Unweighted Means. In this approach, introduced by Yates (1934), the cell averages are treated as data and are subjected to a standard balanced data analysis to obtain sums of squares for rows, columns, and interaction. The error mean square is found as

MS_E = \frac{\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}} (y_{ijk} - \bar{y}_{ij•})^2}{n_{••} - ab} \quad (15.12)

Now MS_E estimates σ², the variance of y_{ijk}, an individual observation. However, we have done an analysis of variance on the cell averages, and because the variance of the average in the ijth cell is σ²/n_{ij}, the error mean square actually used in the analysis of variance should be an estimate of the average variance of the \bar{y}_{ij•}, say

V(\bar{y}_{ij•}) = \frac{\sum_{i=1}^{a}\sum_{j=1}^{b} \sigma^2 / n_{ij}}{ab} = \frac{\sigma^2}{ab} \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{1}{n_{ij}} \quad (15.13)

Using MS_E from Equation 15.12 to estimate σ² in Equation 15.13, we obtain

MS'_E = \frac{MS_E}{ab} \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{1}{n_{ij}} \quad (15.14)

as the error mean square (with n•• − ab degrees of freedom) to use in the analysis of variance. The method of unweighted means is an approximate procedure because the sums of squares for rows, columns, and interaction are not distributed as chi-square random variables. The primary advantage of the method seems to be its computational simplicity. When the nij are not dramatically different, the method of unweighted means often works reasonably well.
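A compact sketch of Equations 15.12–15.14 (an illustration we add here; the cells dictionary of cell observations is a hypothetical input):

```python
import numpy as np

def unweighted_means_error(cells, a, b):
    """Return (MS_E, MS_E') for an a x b unbalanced factorial.

    cells : dict mapping (i, j) -> sequence of observations in that cell
    """
    n_total = sum(len(v) for v in cells.values())
    # Equation 15.12: pooled within-cell variability
    sse = sum(((np.asarray(v, float) - np.mean(v)) ** 2).sum()
              for v in cells.values())
    ms_e = sse / (n_total - a * b)
    # Equation 15.14: rescale to estimate the average variance of a cell mean
    ms_e_prime = (ms_e / (a * b)) * sum(1.0 / len(v) for v in cells.values())
    return ms_e, ms_e_prime
```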


A related technique is the weighted squares of means method, also proposed by Yates (1934). This technique is also based on the sums of squares of the cell means, but the terms in the sums of squares are weighted in inverse proportion to their . For further details of the procedure, see Searle (1971) and Speed, Hocking, and Hackney (1978).

15.2.3 The Exact Method

In situations where approximate methods are inappropriate, such as when empty cells occur (some n_{ij} = 0) or when the n_{ij} are dramatically different, the experimenter must use an exact analysis. The approach used to develop sums of squares for testing main effects and interactions is to represent the analysis of variance model as a regression model, fit that model to the data, and use the general regression significance test approach. However, this may be done in several ways, and these methods may result in different values for the sums of squares. Furthermore, the hypotheses that are being tested are not always direct analogs of those for the balanced case, nor are they always easily interpretable. For further reading on the subject, see the supplemental text material for this chapter. Other good references are Searle (1971); Hocking and Speed (1975); Hocking, Hackney, and Speed (1978); Speed, Hocking, and Hackney (1978); Searle, Speed, and Henderson (1981); Milliken and Johnson (1984); and Searle (1987). The SAS system of statistical software provides an excellent approach to the analysis of unbalanced data through PROC GLM.
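Outside SAS, the same regression-based exact analysis is available in other packages. For instance, a hedged sketch with Python's statsmodels, which fits the effects model by regression and produces the different types of sums of squares (the tiny unbalanced data set below is hypothetical):

```python
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Unbalanced two-factor layout: unequal numbers of observations per cell.
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2, 2, 2],
    "B": [1, 2, 2, 1, 1, 1, 2],
    "y": [10.1, 11.8, 12.0, 9.3, 9.9, 10.4, 13.2],
})

# Fit the ANOVA model as a regression model with categorical factors.
model = ols("y ~ C(A) * C(B)", data=df).fit()
print(anova_lm(model, typ=2))  # Type 2 sums of squares; typ=1 and typ=3
                               # give the other model-fitting sequences
```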

15.3 The Analysis of Covariance

In Chapters 2 and 4, we introduced the use of the blocking principle to improve the precision with which comparisons between treatments are made. The paired t-test was the procedure illustrated in Chapter 2, and the randomized block design was presented in Chapter 4. In general, the blocking principle can be used to eliminate the effect of controllable nuisance factors. The analysis of covariance (ANCOVA) is another technique that is occasionally useful for improving the precision of an experiment. Suppose that in an experiment with a response variable y there is another variable, say x, and that y is linearly related to x. Furthermore, suppose that x cannot be controlled by the experimenter but can be observed along with y. The variable x is called a covariate or concomitant variable. The analysis of covariance involves adjusting the observed response variable for the effect of the concomitant variable. If such an adjustment is not performed, the concomitant variable could inflate the error mean square and make true differences in the response due to treatments harder to detect. Thus, the analysis of covariance is a method of adjusting for the effects of an uncontrollable nuisance variable. As we will see, the procedure is a combination of analysis of variance and regression analysis.

As an example of an experiment in which the analysis of covariance may be employed, consider a study performed to determine if there is a difference in the strength of a monofilament fiber produced by three different machines. The data from this experiment are shown in Table 15.10.

◾ TABLE 15.10  Breaking Strength Data (y = strength in pounds and x = diameter in 10⁻³ in.)

   Machine 1      Machine 2      Machine 3
    y     x        y     x        y     x
   36    20       40    22       35    21
   41    25       48    28       37    23
   39    24       39    22       42    26
   42    25       45    30       34    21
   49    32       44    28       32    15
  207   126      216   130      180   106


◾ FIGURE 15.3  Breaking strength (y) versus fiber diameter (x)

Figure 15.3 presents a scatter diagram of strength (y) versus the diameter (or thickness) of the sample. Clearly, the strength of the fiber is also affected by its thickness; consequently, a thicker fiber will generally be stronger than a thinner one. The analysis of covariance could be used to remove the effect of thickness (x) on strength (y) when testing for differences in strength between machines.

15.3.1 Description of the Procedure

The basic procedure for the analysis of covariance is now described and illustrated for a single-factor experiment with one covariate. Assuming that there is a linear relationship between the response and the covariate, we find that an appropriate statistical model is

y_{ij} = \mu + \tau_i + \beta(x_{ij} - \bar{x}_{••}) + \epsilon_{ij}, \quad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, n \quad (15.15)

where y_{ij} is the jth observation on the response variable taken under the ith treatment or level of the single factor, x_{ij} is the measurement made on the covariate or concomitant variable corresponding to y_{ij} (i.e., the ijth run), \bar{x}_{••} is the mean of the x_{ij} values, μ is an overall mean, τ_i is the effect of the ith treatment, β is a linear regression coefficient indicating the dependency of y_{ij} on x_{ij}, and ε_{ij} is a random error component. We assume that the errors ε_{ij} are NID(0, σ²), that the slope β ≠ 0 and the true relationship between y_{ij} and x_{ij} is linear, that the regression coefficients for each treatment are identical, that the treatment effects sum to zero (\sum_{i=1}^{a} \tau_i = 0), and that the concomitant variable x_{ij} is not affected by the treatments.

This model assumes that all treatment regression lines have identical slopes. If the treatments interact with the covariates, this can result in nonidentical slopes, and covariance analysis is not appropriate in these cases; estimating and comparing separate regression models is the correct approach. Equation 15.15 assumes a linear relationship between y and x. However, any other relationship, such as a quadratic (for example), could be used.

Notice from Equation 15.15 that the analysis of covariance model is a combination of the linear models employed in analysis of variance and regression. That is, we have treatment effects {τ_i} as in a single-factor analysis of variance and a regression coefficient β as in a regression equation. The concomitant variable in Equation 15.15 is expressed as (x_{ij} - \bar{x}_{••}) instead of x_{ij} so that the parameter μ is preserved as the overall mean. The model could have been written as

y_{ij} = \mu' + \tau_i + \beta x_{ij} + \epsilon_{ij}, \quad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, n \quad (15.16)

where μ′ is a constant not equal to the overall mean, which for this model is \mu' + \beta \bar{x}_{••}. Equation 15.15 is more widely found in the literature.


To describe the analysis, we introduce the following notation:

S_{yy} = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{••})^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{••}^2}{an} \quad (15.17)

S_{xx} = \sum_{i=1}^{a}\sum_{j=1}^{n} (x_{ij} - \bar{x}_{••})^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} x_{ij}^2 - \frac{x_{••}^2}{an} \quad (15.18)

S_{xy} = \sum_{i=1}^{a}\sum_{j=1}^{n} (x_{ij} - \bar{x}_{••})(y_{ij} - \bar{y}_{••}) = \sum_{i=1}^{a}\sum_{j=1}^{n} x_{ij} y_{ij} - \frac{(x_{••})(y_{••})}{an} \quad (15.19)

T_{yy} = n \sum_{i=1}^{a} (\bar{y}_{i•} - \bar{y}_{••})^2 = \frac{1}{n} \sum_{i=1}^{a} y_{i•}^2 - \frac{y_{••}^2}{an} \quad (15.20)

T_{xx} = n \sum_{i=1}^{a} (\bar{x}_{i•} - \bar{x}_{••})^2 = \frac{1}{n} \sum_{i=1}^{a} x_{i•}^2 - \frac{x_{••}^2}{an} \quad (15.21)

T_{xy} = n \sum_{i=1}^{a} (\bar{x}_{i•} - \bar{x}_{••})(\bar{y}_{i•} - \bar{y}_{••}) = \frac{1}{n} \sum_{i=1}^{a} (x_{i•})(y_{i•}) - \frac{(x_{••})(y_{••})}{an} \quad (15.22)

E_{yy} = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i•})^2 = S_{yy} - T_{yy} \quad (15.23)

E_{xx} = \sum_{i=1}^{a}\sum_{j=1}^{n} (x_{ij} - \bar{x}_{i•})^2 = S_{xx} - T_{xx} \quad (15.24)

E_{xy} = \sum_{i=1}^{a}\sum_{j=1}^{n} (x_{ij} - \bar{x}_{i•})(y_{ij} - \bar{y}_{i•}) = S_{xy} - T_{xy} \quad (15.25)

Note that, in general, S = T + E, where the symbols S, T, and E are used to denote sums of squares and cross products for total, treatments, and error, respectively. The sums of squares for x and y must be nonnegative; however, the sums of cross products (xy) may be negative.

We now show how the analysis of covariance adjusts the response variable for the effect of the covariate. Consider the full model (Equation 15.15). The least squares estimators of μ, τ_i, and β are \hat{\mu} = \bar{y}_{••}, \hat{\tau}_i = \bar{y}_{i•} - \bar{y}_{••} - \hat{\beta}(\bar{x}_{i•} - \bar{x}_{••}), and

\hat{\beta} = \frac{E_{xy}}{E_{xx}} \quad (15.26)

The error sum of squares in this model is

SS_E = E_{yy} - (E_{xy})^2 / E_{xx} \quad (15.27)

with a(n − 1) − 1 degrees of freedom. The experimental error variance is estimated by

MS_E = \frac{SS_E}{a(n - 1) - 1}

Now suppose that there is no treatment effect. The model (Equation 15.15) would then be

y_{ij} = \mu + \beta(x_{ij} - \bar{x}_{••}) + \epsilon_{ij} \quad (15.28)


and it can be shown that the least squares estimators of μ and β are \hat{\mu} = \bar{y}_{••} and \hat{\beta} = S_{xy}/S_{xx}. The sum of squares for error in this reduced model is

SS'_E = S_{yy} - (S_{xy})^2 / S_{xx} \quad (15.29)

with an − 2 degrees of freedom. In Equation 15.29, the quantity (S_{xy})²/S_{xx} is the reduction in the sum of squares of y obtained through the linear regression of y on x. Furthermore, note that SS_E is smaller than SS'_E [because the model (Equation 15.15) contains additional parameters {τ_i}] and that the quantity SS'_E − SS_E is a reduction in sum of squares due to the {τ_i}. Therefore, the difference between SS'_E and SS_E provides a sum of squares with a − 1 degrees of freedom for testing the hypothesis of no treatment effects. Consequently, to test H₀: τ_i = 0, compute

F_0 = \frac{(SS'_E - SS_E)/(a - 1)}{SS_E/[a(n - 1) - 1]} \quad (15.30)

which, if the null hypothesis is true, is distributed as F_{a-1, a(n-1)-1}. Thus, we reject H₀: τ_i = 0 if F₀ > F_{α, a-1, a(n-1)-1}. The P-value approach could also be used.

It is instructive to examine the display in Table 15.11. In this table we have presented the analysis of covariance as an "adjusted" analysis of variance. In the source of variation column, the total variability is measured by S_{yy} with an − 1 degrees of freedom. The source of variation "regression" has the sum of squares (S_{xy})²/S_{xx} with one degree of freedom. If there were no concomitant variable, we would have S_{xy} = S_{xx} = E_{xy} = E_{xx} = 0. Then the sum of squares for error would be simply E_{yy}, and the sum of squares for treatments would be S_{yy} − E_{yy} = T_{yy}. However, because of the presence of the concomitant variable, we must "adjust" S_{yy} and E_{yy} for the regression of y on x as shown in Table 15.11. The adjusted error sum of squares has a(n − 1) − 1 degrees of freedom instead of a(n − 1) degrees of freedom because an additional parameter (the slope β) is fitted to the data.

Manual computations are usually displayed in an analysis of covariance table such as Table 15.12. This layout is employed because it conveniently summarizes all the required sums of squares and cross products as well as the sums of squares for testing hypotheses about treatment effects. In addition to testing the hypothesis that there are no differences in the treatment effects, we frequently find it useful in interpreting the data to present the adjusted treatment means. These adjusted means are computed according to

Adjusted \bar{y}_{i•} = \bar{y}_{i•} - \hat{\beta}(\bar{x}_{i•} - \bar{x}_{••}), \quad i = 1, 2, \ldots, a \quad (15.31)

where \hat{\beta} = E_{xy}/E_{xx}. This adjusted treatment mean is the least squares estimator of μ + τ_i, i = 1, 2, ..., a, in the model (Equation 15.15). The standard error of any adjusted treatment mean is

S_{\text{adj } \bar{y}_{i•}} = \left[ MS_E \left( \frac{1}{n} + \frac{(\bar{x}_{i•} - \bar{x}_{••})^2}{E_{xx}} \right) \right]^{1/2} \quad (15.32)

◾ TABLE 15.11  Analysis of Covariance as an "Adjusted" Analysis of Variance

Source of Variation   Sum of Squares                                                      Degrees of Freedom   Mean Square                  F0
Regression            (S_{xy})²/S_{xx}                                                         1
Treatments            SS'_E − SS_E = S_{yy} − (S_{xy})²/S_{xx} − [E_{yy} − (E_{xy})²/E_{xx}]   a − 1                (SS'_E − SS_E)/(a − 1)       [(SS'_E − SS_E)/(a − 1)]/MS_E
Error                 SS_E = E_{yy} − (E_{xy})²/E_{xx}                                         a(n − 1) − 1         MS_E = SS_E/[a(n − 1) − 1]
Total                 S_{yy}                                                                   an − 1


◾ TABLE 15.12  Analysis of Covariance for a Single-Factor Experiment with One Covariate

                                        Sums of Squares and Products         Adjusted for Regression
Source of Variation   Degrees of Freedom     x        xy        y            y                                    Degrees of Freedom   Mean Square
Treatments            a − 1                T_{xx}   T_{xy}   T_{yy}
Error                 a(n − 1)             E_{xx}   E_{xy}   E_{yy}          SS_E = E_{yy} − (E_{xy})²/E_{xx}     a(n − 1) − 1         MS_E = SS_E/[a(n − 1) − 1]
Total                 an − 1               S_{xx}   S_{xy}   S_{yy}          SS'_E = S_{yy} − (S_{xy})²/S_{xx}    an − 2
Adjusted treatments                                                          SS'_E − SS_E                         a − 1                (SS'_E − SS_E)/(a − 1)

Finally, we recall that the regression coefficient β in the model (Equation 15.15) has been assumed to be nonzero. We may test the hypothesis H₀: β = 0 by using the test statistic

F_0 = \frac{(E_{xy})^2 / E_{xx}}{MS_E} \quad (15.33)

which under the null hypothesis is distributed as F_{1, a(n-1)-1}. Thus, we reject H₀: β = 0 if F₀ > F_{α, 1, a(n-1)-1}.

EXAMPLE 15.5

Consider the experiment described at the beginning of Section 15.3. Three different machines produce a monofilament fiber for a textile company. The process engineer is interested in determining if there is a difference in the breaking strength of the fiber produced by the three machines. However, the strength of a fiber is related to its diameter, with thicker fibers being generally stronger than thinner ones. A random sample of five fiber specimens is selected from each machine. The fiber strength (y) and the corresponding diameter (x) for each specimen are shown in Table 15.10.

The scatter diagram of breaking strength versus the fiber diameter (Figure 15.3) shows a strong suggestion of a linear relationship between breaking strength and diameter, and it seems appropriate to remove the effect of diameter on strength by an analysis of covariance. Assuming that a linear relationship between breaking strength and diameter is appropriate, we see that the model is

y_{ij} = \mu + \tau_i + \beta(x_{ij} - \bar{x}_{••}) + \epsilon_{ij}, \quad i = 1, 2, 3; \; j = 1, 2, \ldots, 5

Using Equations 15.17 through 15.25, we may compute

S_{yy} = \sum_{i=1}^{3}\sum_{j=1}^{5} y_{ij}^2 - \frac{y_{••}^2}{an} = (36)^2 + (41)^2 + \cdots + (32)^2 - \frac{(603)^2}{(3)(5)} = 346.40

S_{xx} = \sum_{i=1}^{3}\sum_{j=1}^{5} x_{ij}^2 - \frac{x_{••}^2}{an} = (20)^2 + (25)^2 + \cdots + (15)^2 - \frac{(362)^2}{(3)(5)} = 261.73

S_{xy} = \sum_{i=1}^{3}\sum_{j=1}^{5} x_{ij} y_{ij} - \frac{(x_{••})(y_{••})}{an} = (20)(36) + (25)(41) + \cdots + (15)(32) - \frac{(362)(603)}{(3)(5)} = 282.60


T_{yy} = \frac{1}{n}\sum_{i=1}^{3} y_{i•}^2 - \frac{y_{••}^2}{an} = \frac{1}{5}[(207)^2 + (216)^2 + (180)^2] - \frac{(603)^2}{(3)(5)} = 140.40

T_{xx} = \frac{1}{n}\sum_{i=1}^{3} x_{i•}^2 - \frac{x_{••}^2}{an} = \frac{1}{5}[(126)^2 + (130)^2 + (106)^2] - \frac{(362)^2}{(3)(5)} = 66.13

T_{xy} = \frac{1}{n}\sum_{i=1}^{3} x_{i•} y_{i•} - \frac{(x_{••})(y_{••})}{an} = \frac{1}{5}[(126)(207) + (130)(216) + (106)(180)] - \frac{(362)(603)}{(3)(5)} = 96.00

Eyy = Syy − Tyy = 346.40 − 140.40 = 206.00

Exx = Sxx − Txx = 261.73 − 66.13 = 195.60

E_{xy} = S_{xy} - T_{xy} = 282.60 - 96.00 = 186.60

From Equation 15.29, we find

SS'_E = S_{yy} - (S_{xy})^2 / S_{xx} = 346.40 - (282.60)^2 / 261.73 = 41.27

with an − 2 =(3)(5)−2 = 13 degrees of freedom; and from Equation 15.27, we find

SS_E = E_{yy} - (E_{xy})^2 / E_{xx} = 206.00 - (186.60)^2 / 195.60 = 27.99

with a(n − 1)−1 = 3(5 − 1)−1 = 11 degrees of freedom.

◾ TABLE 15.13  Analysis of Covariance for the Breaking Strength Data

                              Sums of Squares and Products        Adjusted for Regression
Source of Variation    df        x        xy         y            y        df    Mean Square     F0      P-Value
Machines                2     66.13     96.00    140.40
Error                  12    195.60    186.60    206.00        27.99      11        2.54
Total                  14    261.73    282.60    346.40        41.27      13
Adjusted machines                                              13.28       2        6.64        2.61    0.1181

The sum of squares for testing H₀: τ₁ = τ₂ = τ₃ = 0 is

SS'_E - SS_E = 41.27 - 27.99 = 13.28

with a − 1 = 3 − 1 = 2 degrees of freedom. These calculations are summarized in Table 15.13.


To test the hypothesis that machines differ in the breaking strength of fiber produced, that is, H₀: τ_i = 0, we compute the test statistic from Equation 15.30 as

F_0 = \frac{(SS'_E - SS_E)/(a - 1)}{SS_E/[a(n - 1) - 1]} = \frac{13.28/2}{27.99/11} = \frac{6.64}{2.54} = 2.61

Comparing this to F0.10,2,11 = 2.86, we find that the null hypothesis cannot be rejected. The P-value of this test statistic is 0.1181. Thus, there is no strong evidence that the fibers produced by the three machines differ in breaking strength. The estimate of the regression coefficient is computed from Equation 15.26 as

\hat{\beta} = \frac{E_{xy}}{E_{xx}} = \frac{186.60}{195.60} = 0.9540

We may test the hypothesis H₀: β = 0 by using Equation 15.33. The test statistic is

F_0 = \frac{(E_{xy})^2 / E_{xx}}{MS_E} = \frac{(186.60)^2 / 195.60}{2.54} = 70.08

and because F_{0.01,1,11} = 9.65, we reject the hypothesis that β = 0. Therefore, there is a linear relationship between breaking strength and diameter, and the adjustment provided by the analysis of covariance was necessary. The adjusted treatment means may be computed from Equation 15.31. These adjusted means are

and

훽̂ Adjusted y3• = y3• − (x3• − x••) = 36.00 −(0.9540)(21.20 − 24.13)=38.80

Comparing the adjusted treatment means with the unadjusted treatment means (the yi•), we note that the adjusted means are much closer together, another indication that the covariance analysis was necessary. A basic assumption in the analysis of covariance is that the treatments do not influence the covariate x because the technique

removes the effect of variations in the xi•. However, if the variability in the xi• is due in part to the treatments, then analysis of covariance removes part of the treatment effect. Thus, we must be reasonably sure that the treatments do not affect the values

xij. In some experiments this may be obvious from the nature of the covariate, whereas in others it may be more doubtful. In our example, there may be a difference in fiber diameter (xij) between the three machines. In such cases, Cochran and Cox (1957) suggest that an analysis of variance on the xij values may be helpful in determining the validity of this assumption. For our problem, this procedure yields 66.13∕2 33.07 F = = = 2.03 0 195.60∕12 16.30

which is less than F0.10,2,12 = 2.81, so there is no reason to believe that machines produce fibers of different diameters.

k k

15.3 The Analysis of Covariance 677

Diagnostic checking of the covariance model is based on residual analysis. For the covariance model, the residuals are e_{ij} = y_{ij} - \hat{y}_{ij}, where the fitted values are

\hat{y}_{ij} = \hat{\mu} + \hat{\tau}_i + \hat{\beta}(x_{ij} - \bar{x}_{••}) = \bar{y}_{••} + [\bar{y}_{i•} - \bar{y}_{••} - \hat{\beta}(\bar{x}_{i•} - \bar{x}_{••})] + \hat{\beta}(x_{ij} - \bar{x}_{••}) = \bar{y}_{i•} + \hat{\beta}(x_{ij} - \bar{x}_{i•})

Thus,

e_{ij} = y_{ij} - \bar{y}_{i•} - \hat{\beta}(x_{ij} - \bar{x}_{i•}) \quad (15.34)

To illustrate the use of Equation 15.34, the residual for the first observation from the first machine in Example 15.5 is

e_{11} = y_{11} - \bar{y}_{1•} - \hat{\beta}(x_{11} - \bar{x}_{1•}) = 36 - 41.4 - (0.9540)(20 - 25.2) = 36 - 36.4392 = -0.4392

A complete listing of observations, fitted values, and residuals is given in the following table:

Observed Value, yij   Fitted Value, ŷij   Residual, eij = yij − ŷij
36                    36.4392             −0.4392
41                    41.2092             −0.2092
39                    40.2552             −1.2552
42                    41.2092             0.7908
49                    47.8871             1.1129
40                    39.3840             0.6160
48                    45.1079             2.8921
39                    39.3840             −0.3840
45                    47.0159             −2.0159
44                    45.1079             −1.1079
35                    35.8092             −0.8092
37                    37.7171             −0.7171
42                    40.5791             1.4209
34                    35.8092             −1.8092
32                    30.0852             1.9148

The residuals are plotted versus the fitted values $\hat{y}_{ij}$ in Figure 15.4, versus the covariate $x_{ij}$ in Figure 15.5, and versus the machines in Figure 15.6. A normal probability plot of the residuals is shown in Figure 15.7. These plots do not reveal any major departures from the assumptions, so we conclude that the covariance model (Equation 15.15) is appropriate for the breaking strength data.

It is interesting to note what would have happened in this experiment if an analysis of covariance had not been performed, that is, if the breaking strength data (y) had been analyzed as a completely randomized single-factor experiment in which the covariate x was ignored. The analysis of variance of the breaking strength data is shown in Table 15.14. We immediately notice that the error estimate is much larger in the CRD analysis (17.17 versus 2.54). This is a reflection of the effectiveness of analysis of covariance in reducing error variability. We would also conclude, based on the CRD analysis, that machines differ significantly in the strength of fiber produced. This is exactly opposite the conclusion reached by the covariance analysis. If we suspected that the machines differed significantly in their effect on fiber


◾ FIGURE 15.4 Plot of residuals versus fitted values for Example 15.5
◾ FIGURE 15.5 Plot of residuals versus fiber diameter x for Example 15.5

◾ FIGURE 15.6 Plot of residuals versus machine for Example 15.5
◾ FIGURE 15.7 Normal probability plot of residuals from Example 15.5

◾ TABLE 15.14 Incorrect Analysis of the Breaking Strength Data as a Single-Factor Experiment

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F0     P-Value
Machines              140.40           2                    70.20         4.09   0.0442
Error                 206.00           12                   17.17
Total                 346.40           14


strength, then we would try to equalize the strength output of the three machines. However, in this problem the machines do not differ in the strength of fiber produced after the linear effect of fiber diameter is removed. It would be helpful to reduce the within-machine fiber diameter variability, because this would probably reduce the strength variability in the fiber.

Analysis of Covariance as an Alternative to Blocking. In some situations, the experimenter may have a choice between running a completely randomized design with a covariate or running a randomized block design with the covariate used in some fashion to form the blocks. If the relationship between the covariate and the response really is well approximated by a straight line, and that is the form of the covariance model that the experimenter chooses, then either of these approaches is about equally effective. However, if the relationship is not linear and a linear model is assumed, the analysis of covariance will be outperformed in error reduction by the randomized block design; randomized block designs do not make any explicit assumptions about relationships between the nuisance variables (covariates) and the response. Generally, a randomized block design will have fewer degrees of freedom for error than a covariance model, but the resulting loss of statistical power is usually quite small.

15.3.2 Computer Solution

Several widely available software packages can perform the analysis of covariance. The output from the Minitab General Linear Models procedure for the data in Example 15.5 is shown in Table 15.15. This output is very similar to

◾ TABLE 15.15 Minitab Output (Analysis of Covariance) for Example 15.5

Factor    Type    Levels   Values
Machine   fixed   3        1 2 3

Analysis of Variance for Strength, using Adjusted SS for Tests

Source     DF   Seq SS   Adj SS   Adj MS   F       P
Diameter   1    305.13   178.01   178.01   69.97   0.000
Machine    2    13.28    13.28    6.64     2.61    0.118
Error      11   27.99    27.99    2.54
Total      14   346.40

Term       Coef     Std. Dev.   T      P
Constant   17.177   2.783       6.17   0.000
Diameter   0.9540   0.1140      8.36   0.000
Machine
1          0.1824   0.5950      0.31   0.765
2          1.2192   0.6201      1.97   0.075

Means for Covariates

Covariate   Mean    Std. Dev.
Diameter    24.13   4.324

Least Squares Means for Strength

Machine   Mean    Std. Dev.
1         40.38   0.7236
2         41.42   0.7444
3         38.80   0.7879


those presented previously. In the section of the output entitled "Analysis of Variance," the "Seq SS" correspond to a "sequential" partitioning of the overall model sum of squares, say
$$SS(\text{Model}) = SS(\text{Diameter}) + SS(\text{Machine}|\text{Diameter}) = 305.13 + 13.28 = 318.41$$
whereas the "Adj SS" corresponds to the "extra" sum of squares for each factor, that is,
$$SS(\text{Machine}|\text{Diameter}) = 13.28$$
and
$$SS(\text{Diameter}|\text{Machine}) = 178.01$$
Note that SS(Machine|Diameter) is the correct sum of squares to use for testing for no machine effect, and SS(Diameter|Machine) is the correct sum of squares to use for testing the hypothesis that $\beta = 0$. The test statistics in Table 15.15 differ slightly from those computed manually because of rounding. The program also computes the adjusted treatment means from Equation 15.31 (Minitab refers to these as least squares means on the sample output) and their standard errors. The program will also compare all pairs of treatment means using the pairwise multiple comparison procedures discussed in Chapter 3.
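For readers working outside Minitab, the same sequential-versus-adjusted comparison can be obtained with statsmodels in Python. The sketch below is ours, not part of the text; as in the earlier sketch, the diameters are our reconstruction of the Example 15.5 data, and the column names are our own choices.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Example 15.5 data: strength (y), fiber diameter (x), machine (1, 2, 3)
df = pd.DataFrame({
    "strength": [36, 41, 39, 42, 49, 40, 48, 39, 45, 44, 35, 37, 42, 34, 32],
    "diameter": [20, 25, 24, 25, 32, 22, 28, 22, 30, 28, 21, 23, 26, 21, 15],
    "machine":  [1] * 5 + [2] * 5 + [3] * 5,
})

model = smf.ols("strength ~ diameter + C(machine)", data=df).fit()

# Type I ("Seq SS"): each term adjusted only for the terms listed before it
print(sm.stats.anova_lm(model, typ=1))

# Type II (the "Adj SS" here): each term adjusted for every other term,
# giving SS(Machine | Diameter) = 13.28 and SS(Diameter | Machine) = 178.01
print(sm.stats.anova_lm(model, typ=2))
```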

15.3.3 Development by the General Regression Significance Test

It is possible to develop formally the ANCOVA procedure for testing $H_0\colon \tau_i = 0$ in the covariance model
$$y_{ij} = \mu + \tau_i + \beta(x_{ij} - \bar{x}_{\bullet\bullet}) + \epsilon_{ij} \qquad \begin{cases} i = 1, 2, \ldots, a \\ j = 1, 2, \ldots, n \end{cases} \qquad (15.35)$$
using the general regression significance test. Consider estimating the parameters in the model (Equation 15.15) by least squares. The least squares function is
$$L = \sum_{i=1}^{a}\sum_{j=1}^{n}[y_{ij} - \mu - \tau_i - \beta(x_{ij} - \bar{x}_{\bullet\bullet})]^2 \qquad (15.36)$$
and from $\partial L/\partial\mu = \partial L/\partial\tau_i = \partial L/\partial\beta = 0$, we obtain the normal equations
$$\mu\colon\; an\hat{\mu} + n\sum_{i=1}^{a}\hat{\tau}_i = y_{\bullet\bullet} \qquad (15.37a)$$
$$\tau_i\colon\; n\hat{\mu} + n\hat{\tau}_i + \hat{\beta}\sum_{j=1}^{n}(x_{ij} - \bar{x}_{\bullet\bullet}) = y_{i\bullet}, \qquad i = 1, 2, \ldots, a \qquad (15.37b)$$
$$\beta\colon\; \sum_{i=1}^{a}\hat{\tau}_i\sum_{j=1}^{n}(x_{ij} - \bar{x}_{\bullet\bullet}) + \hat{\beta}S_{xx} = S_{xy} \qquad (15.37c)$$
Adding the a equations in Equation 15.37b, we obtain Equation 15.37a because $\sum_{i=1}^{a}\sum_{j=1}^{n}(x_{ij} - \bar{x}_{\bullet\bullet}) = 0$, so there is one linear dependency in the normal equations. Therefore, it is necessary to augment Equations 15.37 with a linearly independent equation to obtain a solution. A logical side condition is $\sum_{i=1}^{a}\hat{\tau}_i = 0$. Using this condition, we obtain from Equation 15.37a
$$\hat{\mu} = \bar{y}_{\bullet\bullet} \qquad (15.38a)$$
and from Equation 15.37b
$$\hat{\tau}_i = \bar{y}_{i\bullet} - \bar{y}_{\bullet\bullet} - \hat{\beta}(\bar{x}_{i\bullet} - \bar{x}_{\bullet\bullet}) \qquad (15.38b)$$


Equation 15.37c may be rewritten as
$$\sum_{i=1}^{a}(\bar{y}_{i\bullet} - \bar{y}_{\bullet\bullet})\sum_{j=1}^{n}(x_{ij} - \bar{x}_{\bullet\bullet}) - \hat{\beta}\sum_{i=1}^{a}(\bar{x}_{i\bullet} - \bar{x}_{\bullet\bullet})\sum_{j=1}^{n}(x_{ij} - \bar{x}_{\bullet\bullet}) + \hat{\beta}S_{xx} = S_{xy}$$
after substituting for $\hat{\tau}_i$. But we see that
$$\sum_{i=1}^{a}(\bar{y}_{i\bullet} - \bar{y}_{\bullet\bullet})\sum_{j=1}^{n}(x_{ij} - \bar{x}_{\bullet\bullet}) = T_{xy}$$
and
$$\sum_{i=1}^{a}(\bar{x}_{i\bullet} - \bar{x}_{\bullet\bullet})\sum_{j=1}^{n}(x_{ij} - \bar{x}_{\bullet\bullet}) = T_{xx}$$
Therefore, the solution to Equation 15.37c is
$$\hat{\beta} = \frac{S_{xy} - T_{xy}}{S_{xx} - T_{xx}} = \frac{E_{xy}}{E_{xx}}$$
which was the result given previously in Section 15.3.1, Equation 15.26. We may express the reduction in the total sum of squares due to fitting the full model (Equation 15.15) as
$$\begin{aligned} R(\mu, \tau, \beta) &= \hat{\mu}y_{\bullet\bullet} + \sum_{i=1}^{a}\hat{\tau}_i y_{i\bullet} + \hat{\beta}S_{xy} \\ &= (\bar{y}_{\bullet\bullet})y_{\bullet\bullet} + \sum_{i=1}^{a}[\bar{y}_{i\bullet} - \bar{y}_{\bullet\bullet} - (E_{xy}/E_{xx})(\bar{x}_{i\bullet} - \bar{x}_{\bullet\bullet})]y_{i\bullet} + (E_{xy}/E_{xx})S_{xy} \\ &= y_{\bullet\bullet}^2/an + \sum_{i=1}^{a}(\bar{y}_{i\bullet} - \bar{y}_{\bullet\bullet})y_{i\bullet} - (E_{xy}/E_{xx})\sum_{i=1}^{a}(\bar{x}_{i\bullet} - \bar{x}_{\bullet\bullet})y_{i\bullet} + (E_{xy}/E_{xx})S_{xy} \\ &= y_{\bullet\bullet}^2/an + T_{yy} - (E_{xy}/E_{xx})(T_{xy} - S_{xy}) \\ &= y_{\bullet\bullet}^2/an + T_{yy} + (E_{xy})^2/E_{xx} \end{aligned}$$
This sum of squares has $a + 1$ degrees of freedom because the rank of the normal equations is $a + 1$. The error sum of squares for this model is
$$\begin{aligned} SS_E &= \sum_{i=1}^{a}\sum_{j=1}^{n}y_{ij}^2 - R(\mu, \tau, \beta) \\ &= \sum_{i=1}^{a}\sum_{j=1}^{n}y_{ij}^2 - y_{\bullet\bullet}^2/an - T_{yy} - (E_{xy})^2/E_{xx} \\ &= S_{yy} - T_{yy} - (E_{xy})^2/E_{xx} \\ &= E_{yy} - (E_{xy})^2/E_{xx} \end{aligned} \qquad (15.39)$$
with $an - (a + 1) = a(n - 1) - 1$ degrees of freedom. This quantity was obtained previously as Equation 15.27.

Now consider the model restricted to the null hypothesis, that is, to $H_0\colon \tau_1 = \tau_2 = \cdots = \tau_a = 0$. This reduced model is
$$y_{ij} = \mu + \beta(x_{ij} - \bar{x}_{\bullet\bullet}) + \epsilon_{ij} \qquad \begin{cases} i = 1, 2, \ldots, a \\ j = 1, 2, \ldots, n \end{cases} \qquad (15.40)$$
This is a simple linear regression model, and the least squares normal equations for this model are
$$an\hat{\mu} = y_{\bullet\bullet} \qquad (15.41a)$$
$$\hat{\beta}S_{xx} = S_{xy} \qquad (15.41b)$$


The solutions to these equations are $\hat{\mu} = \bar{y}_{\bullet\bullet}$ and $\hat{\beta} = S_{xy}/S_{xx}$, and the reduction in the total sum of squares due to fitting the reduced model is
$$R(\mu, \beta) = \hat{\mu}y_{\bullet\bullet} + \hat{\beta}S_{xy} = (\bar{y}_{\bullet\bullet})y_{\bullet\bullet} + (S_{xy}/S_{xx})S_{xy} = y_{\bullet\bullet}^2/an + (S_{xy})^2/S_{xx} \qquad (15.42)$$
This sum of squares has two degrees of freedom.

We may find the appropriate sum of squares for testing $H_0\colon \tau_1 = \tau_2 = \cdots = \tau_a = 0$ as
$$\begin{aligned} R(\tau|\mu, \beta) &= R(\mu, \tau, \beta) - R(\mu, \beta) \\ &= y_{\bullet\bullet}^2/an + T_{yy} + (E_{xy})^2/E_{xx} - y_{\bullet\bullet}^2/an - (S_{xy})^2/S_{xx} \\ &= S_{yy} - (S_{xy})^2/S_{xx} - [E_{yy} - (E_{xy})^2/E_{xx}] \end{aligned} \qquad (15.43)$$
using $T_{yy} = S_{yy} - E_{yy}$. Note that $R(\tau|\mu, \beta)$ has $a + 1 - 2 = a - 1$ degrees of freedom and is identical to the sum of squares $SS_E' - SS_E$ given in Section 15.3.1. Thus, the test statistic for $H_0\colon \tau_i = 0$ is
$$F_0 = \frac{R(\tau|\mu, \beta)/(a - 1)}{SS_E/[a(n - 1) - 1]} = \frac{(SS_E' - SS_E)/(a - 1)}{SS_E/[a(n - 1) - 1]} \qquad (15.44)$$
which we gave previously as Equation 15.30. Therefore, by using the general regression significance test, we have justified the heuristic development of the analysis of covariance in Section 15.3.1.

15.3.4 Factorial Experiments with Covariates

Analysis of covariance can be applied to more complex treatment structures, such as factorial designs. Provided enough data exists for every treatment combination, nearly any complex treatment structure can be analyzed through the analysis of covariance approach. We now show how the analysis of covariance could be used in the most common family of factorial designs used in industrial experimentation, the $2^k$ factorials.

Imposing the assumption that the covariate affects the response variable identically across all treatment combinations, an analysis of covariance similar to the procedure given in Section 15.3.1 could be performed. The only difference would be the treatment sum of squares. For a $2^2$ factorial with n replicates, the treatment sum of squares ($T_{yy}$) would be $(1/n)\sum_{i=1}^{2}\sum_{j=1}^{2}y_{ij\bullet}^2 - y_{\bullet\bullet\bullet}^2/(2)(2)n$. This quantity is the sum of the sums of squares for factors A, B, and the AB interaction. The adjusted treatment sum of squares could then be partitioned into individual effect components, that is, adjusted main effects sums of squares $SS_A$ and $SS_B$, and an adjusted interaction sum of squares, $SS_{AB}$.

The amount of replication is a key issue when broadening the design structure of the treatments. Consider a $2^3$ factorial arrangement. A minimum of two replicates is needed to evaluate all treatment combinations with a separate covariate slope for each treatment combination (covariate by treatment interaction). This is equivalent to fitting a simple regression model to each treatment combination or design cell. With two observations per cell, one degree of freedom is used to estimate the intercept (the treatment effect) and the other is used to estimate the slope (the covariate effect). With this saturated model, no degrees of freedom are available to estimate the error. Thus, at least three replicates are needed for a complete analysis of covariance, assuming the most general case. This problem becomes more pronounced as the number of distinct design cells (treatment combinations) and covariates increases.

If the amount of replication is limited, various assumptions can be made to allow some useful analysis. The simplest assumption (and typically the worst) that can be made is that the covariate has no effect. If the covariate is erroneously not considered, the entire analysis and subsequent conclusions could be dramatically in error. Another choice is to assume that there is no treatment by covariate interaction. Even if this assumption is incorrect, the average effect of the covariate across all treatments will still increase the precision of the estimation and testing of the treatment effects. One disadvantage of this assumption is that if several treatment levels interact with the covariate, the various terms may cancel one another out and the covariate term, if estimated alone with no interaction, may be insignificant. A third choice would be to assume some of the factors (such as some two-factor and higher interactions) are insignificant. This allows some degrees of freedom to be used to estimate error.


This course of action, however, should be undertaken carefully, and the subsequent models evaluated thoroughly, because the estimation of error will be relatively imprecise unless enough degrees of freedom are allocated for it. With two replicates, each of these assumptions will free some degrees of freedom to estimate error and allow useful hypothesis tests to be performed. Which assumption to enforce should be dictated by the experimental situation and how much risk the experimenter is willing to assume. We caution that in the effects model-building strategy, if a treatment factor is eliminated, then the resulting two "replicates" of each original $2^3$ design are not truly replicates. These "hidden replicates" do free degrees of freedom for parameter estimation but should not be used as replicates to estimate pure error, because the execution of the original design may not have been randomized that way.

To illustrate some of these ideas, consider the $2^3$ factorial design with two replicates and a covariate shown in Table 15.16 (a sketch of the model-fitting sequence follows the table). If the response variable y is analyzed without accounting for the covariate, the following model results:
$$\hat{y} = 25.03 + 11.20A + 18.05B + 7.24C - 18.91AB + 14.80AC$$
The overall model is significant at the $\alpha = 0.01$ level with $R^2 = 0.786$ and $MS_E = 470.82$. The residual analysis indicates no problem with this model except that the observation with y = 103.01 is unusual.

If the second assumption, common slopes with no treatment by covariate interaction, is chosen, the full effects model and the covariate effect can be estimated. The JMP output is shown in Table 15.17. Notice that the $MS_E$ has been reduced considerably by considering the covariate. The final resulting analysis after sequentially removing each nonsignificant interaction and the main effect C is shown in Table 15.18. This reduced model provides an even smaller $MS_E$ than does the full model with the covariate in Table 15.17.

Finally, we could consider a third course of action, assuming certain interaction terms are negligible. We consider the full model that allows for different slopes between treatments and treatment by covariate interaction. We assume that the three-factor interactions (both ABC and ABCx) are not significant and use their associated degrees of freedom to estimate error in the most general effects model that can be fit. This is often a practical assumption: three-factor and higher interactions are usually negligible in most experimental settings. We used JMP for the analysis, and the results are shown in Table 15.19. The type III sums of squares are the adjusted sums of squares that we require. With a near-saturated model, the estimate of error will be fairly imprecise. Even with only a few terms being individually significant at the $\alpha = 0.05$ level, the overall sense is that this model is better than the two previous scenarios

◾ TABLE 15.16 Response and Covariate Data for a 2³ with 2 Replicates

A     B     C     x       y
−1    −1    −1    4.05    −30.73
1     −1    −1    0.36    9.07
−1    1     −1    5.03    39.72
1     1     −1    1.96    16.30
−1    −1    1     5.38    −26.39
1     −1    1     8.63    54.58
−1    1     1     4.10    44.54
1     1     1     11.44   66.20
−1    −1    −1    3.58    −26.46
1     −1    −1    1.06    10.94
−1    1     −1    15.53   103.01
1     1     −1    2.92    20.44
−1    −1    1     2.48    −8.94
1     −1    1     13.64   73.72
−1    1     1     −0.67   15.89
1     1     1     5.13    38.57
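To make the three courses of action concrete, here is a sketch (ours, not from the text) that enters the Table 15.16 data and fits the corresponding models with statsmodels. The fit summaries (error degrees of freedom, R², $MS_E$) should track Tables 15.17 and 15.19; individual coefficients are parameterized slightly differently because JMP centers the covariate at its mean, 5.28875, in the interaction columns.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Table 15.16 data: factors A, B, C coded +/-1, covariate x, response y
df = pd.DataFrame({
    "A": [-1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1],
    "B": [-1, -1, 1, 1, -1, -1, 1, 1, -1, -1, 1, 1, -1, -1, 1, 1],
    "C": [-1, -1, -1, -1, 1, 1, 1, 1, -1, -1, -1, -1, 1, 1, 1, 1],
    "x": [4.05, 0.36, 5.03, 1.96, 5.38, 8.63, 4.10, 11.44,
          3.58, 1.06, 15.53, 2.92, 2.48, 13.64, -0.67, 5.13],
    "y": [-30.73, 9.07, 39.72, 16.30, -26.39, 54.58, 44.54, 66.20,
          -26.46, 10.94, 103.01, 20.44, -8.94, 73.72, 15.89, 38.57],
})

# (1) Ignore the covariate entirely
m1 = smf.ols("y ~ A * B * C", data=df).fit()

# (2) Common slope: covariate enters additively, no treatment-by-x terms
m2 = smf.ols("y ~ A * B * C + x", data=df).fit()

# (3) Separate slopes, dropping the three-factor terms (ABC and ABCx)
#     so that two degrees of freedom remain for error
m3 = smf.ols("y ~ (A + B + C + A:B + A:C + B:C) * x", data=df).fit()

for m in (m1, m2, m3):
    print(int(m.df_resid), round(m.rsquared, 4), round(m.mse_resid, 2))
```

Model m2 corresponds to the common-slope analysis (7 error degrees of freedom, $MS_E \approx 89.7$ as in Table 15.17), and m3 to the separate-slopes model with the three-factor terms removed (2 error degrees of freedom, as in Table 15.19).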


◾ TABLE 15.17 JMP Analysis of Covariance for the Experiment in Table 15.16, Assuming a Common Slope

.Response Y

.Summary of Fit

RSquare                      0.971437
RSquare Adj                  0.938795
Root Mean Square Error       9.47287
Mean of Response             25.02875
Observations (or Sum Wgts)   16

.Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model      8    21363.834        2670.48       29.7595   <.0001*
Error      7    628.147          89.74
C. Total   15   21991.981

.Parameter Estimates

Term        Estimate    Std Error   t Ratio   Prob > |t|
Intercept   −1.015872   5.454106    −0.19     0.8575
x           4.9245327   0.928977    5.30      0.0011*
X1          9.4566966   2.39091     3.96      0.0055*
X2          16.128277   2.395946    6.73      0.0003*
X3          2.4287693   2.536347    0.96      0.3702
X1*X2       −15.59941   2.448939    −6.37     0.0004*
X1*X3       −0.419306   3.72135     −0.11     0.9135
X2*X3       −0.863837   2.824779    −0.31     0.7686
X1*X2*X3    1.469927    2.4156      0.61      0.5621

.Sorted Parameter Estimates

Term        Estimate    Std Error   t Ratio   Prob > |t|
X2          16.128277   2.395946    6.73      0.0003*
X1*X2       −15.59941   2.448939    −6.37     0.0004*
x           4.9245327   0.928977    5.30      0.0011*
X1          9.4566966   2.39091     3.96      0.0055*
X3          2.4287693   2.536347    0.96      0.3702
X1*X2*X3    1.469927    2.4156      0.61      0.5621
X2*X3       −0.863837   2.824779    −0.31     0.7686
X1*X3       −0.419306   3.72135     −0.11     0.9135


◾ TABLE 15.18 JMP Analysis of Covariance, Reduced Model for the Experiment in Table 15.16

.Response Y

.Whole Model

.Summary of Fit

RSquare                      0.96529
RSquare Adj                  0.952668
Root Mean Square Error       8.330324
Mean of Response             25.02875
Observations (or Sum Wgts)   16

.Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model      4    21228.644        5307.16       76.4783   <.0001*
Error      11   763.337          69.39
C. Total   15   21991.981

.Parameter Estimates

Term        Estimate    Std Error   t Ratio   Prob > |t|
Intercept   −1.878361   3.224761    −0.58     0.5720
x           5.0876125   0.465535    10.93     <.0001*
X1          9.3990071   2.089082    4.50      0.0009*
X2          16.064472   2.090531    7.68      <.0001*
X1*X2       −15.48994   2.105895    −7.36     <.0001*

◾ TABLE 15.19 JMP Output for the Experiment in Table 15.16

.Response Y

.Summary of Fit

RSquare                      0.999872
RSquare Adj                  0.999044
Root Mean Square Error       1.18415
Mean of Response             25.02875
Observations (or Sum Wgts)   16

.Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio    Prob > F
Model      13   21989.177        1691.48       1206.292   0.0008*
Error      2    2.804            1.40
C. Total   15   21991.981


◾ TABLE 15.19 (Continued)

.Parameter Estimates

Term                Estimate    Std Error   t Ratio   Prob > |t|
Intercept           10.063551   1.018886    9.88      0.0101*
x                   2.1420242   0.361866    5.92      0.0274*
X1                  13.887593   1.29103     10.76     0.0085*
X2                  19.4443     1.12642     17.26     0.0033*
X3                  5.0908008   1.114861    4.57      0.0448*
X1*X2               −19.59556   0.701412    −27.94    0.0013*
X1*X3               −0.23434    1.259161    −0.19     0.8695
X2*X3               −0.368071   0.887678    −0.41     0.7186
X1*(x-5.28875)      2.1171372   0.429771    4.93      0.0388*
X2*(x-5.28875)      3.0175357   0.364713    8.27      0.0143*
X3*(x-5.28875)      −0.096716   0.356887    −0.27     0.8118
X1*X2*(x-5.28875)   −2.949107   0.190284    −15.50    0.0041*
X1*X3*(x-5.28875)   −0.012116   0.415868    −0.03     0.9794
X2*X3*(x-5.28875)   0.0848196   0.401999    0.21      0.8524

.Sorted Parameter Estimates

Term                Estimate    Std Error   t Ratio   Prob > |t|
X1*X2               −19.59556   0.701412    −27.94    0.0013*
X2                  19.4443     1.12642     17.26     0.0033*
X1*X2*(x-5.28875)   −2.949107   0.190284    −15.50    0.0041*
X1                  13.887593   1.29103     10.76     0.0085*
X2*(x-5.28875)      3.0175357   0.364713    8.27      0.0143*
x                   2.1420242   0.361866    5.92      0.0274*
X1*(x-5.28875)      2.1171372   0.429771    4.93      0.0388*
X3                  5.0908008   1.114861    4.57      0.0448*
X2*X3               −0.368071   0.887678    −0.41     0.7186
X3*(x-5.28875)      −0.096716   0.356887    −0.27     0.8118
X2*X3*(x-5.28875)   0.0848196   0.401999    0.21      0.8524
X1*X3               −0.23434    1.259161    −0.19     0.8695
X1*X3*(x-5.28875)   −0.012116   0.415868    −0.03     0.9794

(based on R² and the mean square for error). Because the treatment effects aspect of the model is of more interest, we sequentially remove terms from the covariate portion of the model to add degrees of freedom to the estimate of error. If we sequentially remove the ACx term followed by BCx, the $MS_E$ decreases and several terms are insignificant. The final model is shown in Table 15.20 after sequentially removing Cx, AC, and BC. This example emphasizes the need to have degrees of freedom available to estimate experimental error in order to increase the precision of the hypothesis tests associated with the individual terms in the model. This process should be done sequentially to avoid eliminating significant terms masked by a poor estimate of error. Reviewing the results obtained from the three approaches, we note that each method successively improves the model fit in this example. If there is a strong reason to believe that the covariate does not interact with the factors, it may


◾ TABLE 15.20 JMP Output for the Experiment in Table 15.16, Reduced Model

.Response Y

.Summary of Fit

RSquare                      0.999743
RSquare Adj                  0.99945
Root Mean Square Error       0.898184
Mean of Response             25.02875
Observations (or Sum Wgts)   16

.Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Ratio    Prob > F
Model      8    21986.334        2748.29       3406.688   <.0001*
Error      7    5.647            0.81
C. Total   15   21991.981

.Parameter Estimates

Term                Estimate    Std Error   t Ratio   Prob > |t|
Intercept           10.234044   0.544546    18.79     <.0001*
x                   2.05036     0.118305    17.33     <.0001*
X1                  13.694768   0.273254    50.12     <.0001*
X2                  19.709839   0.272411    72.35     <.0001*
X3                  5.4433644   0.321496    16.93     <.0001*
X1*X2               −19.40839   0.27236     −71.26    <.0001*
X1*(x-5.28875)      2.0628522   0.121193    17.02     <.0001*
X2*(x-5.28875)      3.0321079   0.116027    26.13     <.0001*
X1*X2*(x-5.28875)   −3.031387   0.11686     −25.94    <.0001*

.Sorted Parameter Estimates

Term                Estimate    Std Error   t Ratio   Prob > |t|
X2                  19.709839   0.272411    72.35     <.0001*
X1*X2               −19.40839   0.27236     −71.26    <.0001*
X1                  13.694768   0.273254    50.12     <.0001*
X2*(x-5.28875)      3.0321079   0.116027    26.13     <.0001*
X1*X2*(x-5.28875)   −3.031387   0.11686     −25.94    <.0001*
x                   2.05036     0.118305    17.33     <.0001*
X1*(x-5.28875)      2.0628522   0.121193    17.02     <.0001*
X3                  5.4433644   0.321496    16.93     <.0001*


be best to make that assumption at the outset of the analysis. This choice may also be dictated by software. Although experimental design software packages may only be able to model covariates that do not interact with treatments, the analyst may have a reasonable chance of identifying the major factors influencing the process, even if there is some covariate by treatment interaction. We also note that all the usual tests of model adequacy are still appropriate and are strongly recommended as part of the ANCOVA model-building process.

Another situation involving covariates arises often in practice. The experimenter has available a collection of experimental units, and these units can be described or characterized by some measured quantities that have the potential to affect the outcome of the experiment in which they are used. This happens frequently in clinical studies, where the experimental units are patients and they are characterized in terms of factors such as gender, age, blood pressure, weight, or other parameters relevant to the specific study. The design factors for the experiment can include type of pharmacological agent, dosage, and how frequently the dose is administered. The experimenter wants to select a subset of the experimental units that is optimal with respect to the design factors that are to be studied. In this type of problem the experimenter knows the values of the covariates in advance of running the experiment and wants to obtain the best possible design taking the values of the covariates into account. This design problem can be solved using the D-optimality criterion (the mathematical details are beyond the scope of this book, but see the supplemental material for this chapter).

To illustrate, suppose that a chemical manufacturer is producing an additive for motor oil. He knows that the final properties of his additive depend on two design factors that he can control and also to some extent on the viscosity and molecular weight of one of the basic raw materials. There are 40 samples of this raw material available for use in his experiment. Table 15.21 contains the viscosity and molecular weight measurements for the 40 raw material samples. Figure 15.8 is a scatter plot of the viscosity and molecular weight data in Table 15.21. As the manufacturer suspects, there is a relationship between the two quantities.

◾ TABLE 15.21 Viscosity and Molecular Weight Data

Sample Viscosity Molecular Weight

1     32     1264
2*    33.5   1250
3     37     1290
4*    30.9   1250
5*    50     1325
6     37     1296
7     55     1340
8     58     1359
9*    46     1278
10    44     1260
11*   58.3   1329
12    31.9   1250
13*   32.5   1246
14*   59     1304
15*   57.8   1303
16*   60     1336
17    33     1275
18*   36     1290
19    57.1   1326


◾ TABLE 15.21 (Continued)

Sample   Viscosity   Molecular Weight
20       58.3        1330
21*      32          1264
22*      33.5        1250
23*      37          1290
24*      30.9        1250
25*      50          1325
26*      37          1296
27*      55          1340
28       58          1359
29*      46          1278
30       44          1260
31       58.3        1329
32       31.9        1250
33       32.5        1246
34       59          1304
35       57.8        1303
36       60          1336
37       33          1275
38*      36          1290
39*      57.1        1326
40       58.3        1330

◾ FIGURE 15.8 Scatter plot of the viscosity versus molecular weight data from Table 15.21


There are two quantitative factors x3 and x4 that the manufacturer suspects affect the final properties of his product. He wants to conduct a designed experiment to study these two factors. Each run of the experiment will require one sample of the available 40 samples of raw material. He feels that a 20-run experiment will be adequate to investigate the effects of the two factors. Table 15.22 is the D-optimal design from JMP, assuming a main-effects-plus-two-factor-interaction model in the design factors and a sample size of 20. The 20 raw material samples that are selected by the D-optimal algorithm are indicated in Table 15.21 with asterisks in the sample column. Figure 15.9 is the scatter plot of viscosity versus molecular weight with the selected samples shown as larger dots. Notice that the selected samples "spread out" over the boundary of the convex hull of points. This is consistent with the tendency of D-optimal designs to spread observations to the boundaries of the experimental region. The design points selected in the two factors x3 and x4 correspond to a 2² factorial experiment with n = 5 replicates. Table 15.23 presents the relative variances of the coefficients for the model for the D-optimal design. Notice that the design factors are all estimated with the best possible precision (relative variance = 1/N = 1/20 = 0.05), while the covariates are estimated with different precisions that depend on their spread in the original set of 40 samples. The alias matrix is shown in Table 15.24. The numbers in the table are the correlations between model terms. All correlations are small, a consequence of the D-optimality criterion spreading the design points out as much as possible.

◾ TABLE 15.22 The 20-Run D-Optimal Design from JMP

Run   Viscosity   Molecular Weight   X3    X4
1     44          1260               1     1
2     32.5        1246               1     −1
3     36          1290               −1    −1
4     60          1336               −1    1
5     30.9        1250               1     1
6     33.5        1250               −1    1
7     59          1304               −1    −1
8     58          1359               1     1
9     37          1290               −1    1
10    33          1275               1     −1
11    58.3        1330               1     −1
12    32          1264               −1    −1
13    57.1        1326               1     −1
14    58.3        1329               −1    1
15    46          1278               −1    −1
16    57.8        1303               1     1
17    31.9        1250               −1    1
18    37          1296               1     1
19    55          1340               −1    −1
20    50          1325               1     −1


◾ FIGURE 15.9 Scatter plot of the viscosity versus molecular weight data from Table 15.21, with the selected design points shown as larger dots

◾ TABLE 15.23 Relative Variances of the Coefficients for the D-Optimal Design in Table 15.22

Effect             Relative Variance
Intercept          0.058
Viscosity          0.314
Molecular weight   0.516
X3                 0.050
X4                 0.050
X3X4               0.050

◾ TABLE 15.24 The Alias Matrix

Effect             12      13      14      23      24       34
Intercept          0.417   0.066   0.007   0.084   0.005    0
Viscosity          −0.19   −0.2    −0.15   −0.24   −0.18
Molecular weight   0.094   0.252   0.33    0.387   0.415    0
X3                 0.011   −0.01   0.008   −0.14   −0.02    0
X4                 0.063   0.019   0.005   0       −0.12    0
X3X4               −0.09   −0.03   −0.04   −0.04   −0.042   1
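The details of the D-optimal algorithm are beyond the scope of this book, but the idea behind a candidate-set search is easy to sketch. The simplified point-exchange routine below (ours, not JMP's algorithm) selects the subset of candidate rows that maximizes det(X'X). Here it is applied only to the covariate portion of the model, with the 2² settings in x3 and x4 assigned afterward, whereas JMP optimizes the sample selection and the design-factor settings jointly; the arrays visc and mw are assumed to hold the 40 values from Table 15.21.

```python
import numpy as np

def d_opt_exchange(F, n_runs, n_passes=50, seed=7):
    """Point-exchange search: choose n_runs rows of the candidate
    model matrix F so that det(X'X) is (locally) maximized."""
    rng = np.random.default_rng(seed)
    N = F.shape[0]
    idx = rng.choice(N, size=n_runs, replace=False)
    best = np.linalg.det(F[idx].T @ F[idx])
    for _ in range(n_passes):
        improved = False
        for pos in range(n_runs):          # try swapping each design point
            for cand in range(N):          # against every unused candidate
                if cand in idx:
                    continue
                trial = idx.copy()
                trial[pos] = cand
                d = np.linalg.det(F[trial].T @ F[trial])
                if d > best:
                    idx, best, improved = trial, d, True
        if not improved:                   # stop when no swap helps
            break
    return idx, best

# Covariate-only model matrix for the 40 raw material samples:
# F = np.column_stack([np.ones(40), visc, mw])
# picked, det_xtx = d_opt_exchange(F, n_runs=20)
```

Exchange algorithms of this type only guarantee a local optimum, which is why production implementations restart from several random initial designs.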


15.4 Repeated Measures

In experimental work in the social and behavioral sciences, some aspects of engineering, the physical sciences, and business, the experimental units are frequently people. Because of differences in experience, training, or background, the differences in the responses of different people to the same treatment may be very large in some experimental situations. Unless it is controlled, this variability between people would become part of the experimental error, and in some cases, it would significantly inflate the error mean square, making it more difficult to detect real differences between treatments. In many repeated measures experiments the experimental units are not necessarily people; they could be different stores in a marketing study, or plants, or experimental animals, and so on. We typically think of these experimental units as subjects. It is possible to control this variability between subjects by using a design in which each of the a treatments is used on each subject. Such a design is called a repeated measures design. In this section, we give a brief introduction to repeated measures experiments with a single factor.

Suppose that an experiment involves a treatments and every treatment is to be used exactly once on each of n subjects. The data would appear as in Table 15.25. Note that the observation $y_{ij}$ represents the response of subject j to treatment i and that only n subjects are used. The model that we use for this design is
$$y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij} \qquad (15.45)$$
where $\tau_i$ is the effect of the ith treatment and $\beta_j$ is a parameter associated with the jth subject. We assume that treatments are fixed (so $\sum_{i=1}^{a}\tau_i = 0$) and that the subjects employed are a random sample of subjects from some larger population of potential subjects. Thus, the subjects collectively represent a random effect, so we assume that the mean of $\beta_j$ is zero and that the variance of $\beta_j$ is $\sigma_\beta^2$. Because the term $\beta_j$ is common to all a measurements on the same subject, the covariance between $y_{ij}$ and $y_{i'j}$ is not, in general, zero. It is customary to assume that this covariance is constant across all treatments and subjects.

Consider an analysis of variance partitioning of the total sum of squares, say
$$\sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{\bullet\bullet})^2 = a\sum_{j=1}^{n}(\bar{y}_{\bullet j} - \bar{y}_{\bullet\bullet})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{\bullet j})^2 \qquad (15.46)$$
We may view the first term on the right-hand side of Equation 15.46 as a sum of squares that results from differences between subjects and the second term as a sum of squares of differences within subjects. That is,

$$SS_T = SS_{\text{Between Subjects}} + SS_{\text{Within Subjects}}$$

◾ TABLE 15.25 Data for a Single-Factor Repeated Measures Design

Treatment        Subject                             Treatment
                 1      2      ···    n              Totals
1                y11    y12    ···    y1n            y1•
2                y21    y22    ···    y2n            y2•
⋮                ⋮      ⋮             ⋮              ⋮
a                ya1    ya2    ···    yan            ya•
Subject totals   y•1    y•2    ···    y•n            y••


The sums of squares $SS_{\text{Between Subjects}}$ and $SS_{\text{Within Subjects}}$ are statistically independent, with degrees of freedom
$$an - 1 = (n - 1) + n(a - 1)$$

The differences within subjects depend on both differences in treatment effects and uncontrolled variability (noise or error). Therefore, we may decompose the sum of squares resulting from differences within subjects as follows:
$$\sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{\bullet j})^2 = n\sum_{i=1}^{a}(\bar{y}_{i\bullet} - \bar{y}_{\bullet\bullet})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{i\bullet} - \bar{y}_{\bullet j} + \bar{y}_{\bullet\bullet})^2 \qquad (15.47)$$
The first term on the right-hand side of Equation 15.47 measures the contribution of the difference between treatment means to $SS_{\text{Within Subjects}}$, and the second term is the residual variation due to error. Both components of $SS_{\text{Within Subjects}}$ are independent. Thus,

$$SS_{\text{Within Subjects}} = SS_{\text{Treatments}} + SS_E$$
with the degrees of freedom given by

$$n(a - 1) = (a - 1) + (a - 1)(n - 1)$$

respectively. To test the hypothesis of no treatment effect, that is,
$$H_0\colon \tau_1 = \tau_2 = \cdots = \tau_a = 0$$
$$H_1\colon \text{At least one } \tau_i \neq 0$$
we would use the ratio
$$F_0 = \frac{SS_{\text{Treatments}}/(a - 1)}{SS_E/[(a - 1)(n - 1)]} = \frac{MS_{\text{Treatments}}}{MS_E} \qquad (15.48)$$

◾ TABLE 15.26 Analysis of Variance for a Single-Factor Repeated Measures Design

Source of Variation   Sums of Squares                                                                Degrees of Freedom   Mean Square                                                      F0
1. Between subjects   $\sum_{j=1}^{n} \frac{y_{\bullet j}^2}{a} - \frac{y_{\bullet\bullet}^2}{an}$   n − 1
2. Within subjects    $\sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \sum_{j=1}^{n} \frac{y_{\bullet j}^2}{a}$   n(a − 1)
3. Treatments         $\sum_{i=1}^{a} \frac{y_{i\bullet}^2}{n} - \frac{y_{\bullet\bullet}^2}{an}$    a − 1                $MS_{\text{Treatments}} = \frac{SS_{\text{Treatments}}}{a - 1}$   $\frac{MS_{\text{Treatments}}}{MS_E}$
4. Error              Subtraction: line (2) − line (3)                                               (a − 1)(n − 1)       $MS_E = \frac{SS_E}{(a - 1)(n - 1)}$
5. Total              $\sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{\bullet\bullet}^2}{an}$      an − 1


If the model errors are normally distributed, then under the null hypothesis $H_0\colon \tau_i = 0$, the statistic $F_0$ follows an $F_{a-1,(a-1)(n-1)}$ distribution, and the null hypothesis would be rejected if $F_0 > F_{\alpha,a-1,(a-1)(n-1)}$.

The analysis of variance procedure is summarized in Table 15.26, which also gives convenient computing formulas for the sums of squares. Readers should recognize the analysis of variance for a single-factor design with repeated measures as equivalent to the analysis for a randomized complete block design, with subjects considered to be the blocks. Residual analysis and model adequacy checking are done exactly as in the RCBD.
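The partition in Table 15.26 is straightforward to compute directly. The following Python sketch (ours, not from the text) implements the computing formulas and the test statistic of Equation 15.48 for a response array with one row per treatment and one column per subject.

```python
import numpy as np

def repeated_measures_anova(y):
    """Single-factor repeated measures ANOVA per Table 15.26.

    y : (a, n) array with y[i, j] = response of subject j to treatment i.
    Returns (F0, SS_between_subjects, SS_treatments, SS_error).
    """
    a, n = y.shape
    CT = y.sum() ** 2 / (a * n)                        # correction term y..^2 / an

    ss_between_subj = np.sum(y.sum(axis=0) ** 2) / a - CT
    ss_within_subj = np.sum(y ** 2) - np.sum(y.sum(axis=0) ** 2) / a
    ss_treat = np.sum(y.sum(axis=1) ** 2) / n - CT
    ss_error = ss_within_subj - ss_treat               # line (2) - line (3)

    ms_treat = ss_treat / (a - 1)
    ms_error = ss_error / ((a - 1) * (n - 1))
    return ms_treat / ms_error, ss_between_subj, ss_treat, ss_error
```

Because the analysis is equivalent to a randomized complete block design with subjects as blocks, the same F statistic can also be obtained from any RCBD routine.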

15.5 Problems

15.1 Reconsider the experiment in Problem 5.29. Use the Box–Cox procedure to determine whether a transformation on the response is appropriate (or useful) in the analysis of the data from this experiment.

15.2 In Example 6.3 we selected a log transformation for the drill advance rate response. Use the Box–Cox procedure to demonstrate that this is an appropriate data transformation.

15.3 Reconsider the smelting process experiment in Problem 8.27, where a $2^{6-3}$ fractional factorial design was used to study the weight of packing material that is stuck to carbon anodes after baking. Each of the eight runs in the design was replicated three times, and both the average weight and the range of the weights at each test combination were treated as response variables. Is there any indication that a transformation is required for either response?

15.4 In Problem 8.28 a replicated fractional factorial design was used to study substrate camber in semiconductor manufacturing. Both the mean and standard deviation of the camber measurements were used as response variables. Is there any indication that a transformation is required for either response?

15.5 Reconsider the photoresist experiment in Problem 8.29. Use the variance of the resist thickness at each test combination as the response variable. Is there any indication that a transformation is required?

15.6 In the grill defects experiment described in Problem 8.51, a variation of the square root transformation was employed in the analysis of the data. Use the Box–Cox method to determine whether this is the appropriate transformation.

15.7 In the central composite design of Problem 11.14, two responses were obtained, the mean and variance of an oxide thickness. Use the Box–Cox method to investigate the potential usefulness of transformation for both of these responses. Is the log transformation suggested in part (c) of that problem appropriate?

15.8 In the 3³ factorial design of Problem 12.11, one of the responses is a standard deviation. Use the Box–Cox method to investigate the usefulness of transformations for this response. Would your answer change if we used the variance as the response?

15.9 Problem 12.9 suggests using ln(s²) as the response [refer to part (b)]. Does the Box–Cox method indicate that a transformation is appropriate?

15.10 Myers, et al. (2010) describe an experiment to study spermatozoa survival. The design factors are the amount of sodium citrate, the amount of glycerol, and equilibrium time, each at two levels. The response variable is the number of spermatozoa that survive out of 50 that were tested at each set of conditions. The data are shown in the following table. Analyze the data from this experiment with logistic regression.

Sodium Citrate   Glycerol   Equilibrium Time   Number Survived
−                −          −                  34
+                −          −                  20
−                +          −                  8
+                +          −                  21
−                −          +                  30
+                −          +                  20
−                +          +                  10
+                +          +                  25

15.11 A soft drink distributor is studying the effectiveness of delivery methods. Three different types of hand trucks have been developed, and an experiment is performed in the company's methods engineering laboratory. The variable of interest is the delivery time in minutes (y); however, delivery time is also strongly related to the case volume delivered (x). Each hand truck is used four times and the data that follow are obtained. Analyze these data and draw appropriate conclusions. Use α = 0.05.


Hand Truck Type
     1           2           3
y    x      y    x      y    x
27   24     25   26     40   38
44   40     35   32     22   26
33   35     46   42     53   50
41   40     26   25     18   20

15.12 Compute the adjusted treatment means and the standard errors of the adjusted treatment means for the data in Problem 15.11.

15.13 The sums of squares and products for a single-factor analysis of covariance follow. Complete the analysis and draw appropriate conclusions. Use α = 0.05.

Source of    Degrees of   Sums of Squares and Products
Variation    Freedom      x       xy      y
Treatment    3            1500    1000    650
Error        12           6000    1200    550
Total        15           7500    2200    1200

15.14 Find the standard errors of the adjusted treatment means in Example 15.5.

15.15 Four different formulations of an industrial glue are being tested. The tensile strength of the glue when it is applied to join parts is also related to the application thickness. Five observations on strength (y) in pounds and thickness (x) in 0.01 inches are obtained for each formulation. The data are shown in the following table. Analyze these data and draw appropriate conclusions.

Glue Formulation
      1             2             3             4
y     x       y     x       y     x       y     x
46.5  13      48.7  12      46.3  15      44.7  16
45.9  14      49.0  10      47.1  14      43.0  15
49.8  12      50.1  11      48.9  11      51.0  10
46.1  12      48.5  12      48.2  11      48.1  12
44.3  14      45.2  14      50.3  10      48.6  11

15.16 Compute the adjusted treatment means and their standard errors using the data in Problem 15.15.

15.17 An engineer is studying the effect of cutting speed on the rate of metal removal in a machining operation. However, the rate of metal removal is also related to the hardness of the test specimen. Five observations are taken at each cutting speed. The amount of metal removed (y) and the hardness of the specimen (x) are shown in the following table. Analyze the data using an analysis of covariance. Use α = 0.05.

Cutting Speed (rpm)
     1000          1200          1400
y    x        y    x        y    x
68   120      112  165      118  175
90   140      94   140      82   132
98   150      65   120      73   124
77   125      74   125      92   141
88   136      85   133      80   130

15.18 Show that in a single-factor analysis of covariance with a single covariate a 100(1 − α) percent confidence interval on the ith adjusted treatment mean is
$$\bar{y}_{i\bullet} - \hat{\beta}(\bar{x}_{i\bullet} - \bar{x}_{\bullet\bullet}) \pm t_{\alpha/2,\,a(n-1)-1}\left[MS_E\left(\frac{1}{n} + \frac{(\bar{x}_{i\bullet} - \bar{x}_{\bullet\bullet})^2}{E_{xx}}\right)\right]^{1/2}$$
Using this formula, calculate a 95 percent confidence interval on the adjusted mean of machine 1 in Example 15.5.

15.19 Show that in a single-factor analysis of covariance with a single covariate, the standard error of the difference between any two adjusted treatment means is
$$S_{\text{Adj }\bar{y}_{i\bullet}-\text{Adj }\bar{y}_{j\bullet}} = \left[MS_E\left(\frac{2}{n} + \frac{(\bar{x}_{i\bullet} - \bar{x}_{j\bullet})^2}{E_{xx}}\right)\right]^{1/2}$$

15.20 Discuss how the operating characteristic curves for the analysis of variance can be used in the analysis of covariance.

15.21 Three different Pinot Noir wines were evaluated by a panel of eight judges. The judges were considered a random panel of all possible judges. The wines were evaluated on a 100-point scale. The wines were presented in random order to each judge, and the following results obtained. Analyze the data from this experiment. Is there a difference in wine quality? Analyze the residuals and comment on model adequacy.

            Wine
Judge   1    2    3
1       85   88   93
2       90   89   94
3       88   90   98
4       91   93   96
5       92   92   95
6       89   90   95
7       90   91   97
8       91   89   98

15.22 A transformation that stabilizes the variance of the response also may make the response distribution closer to normal.
(a) True  (b) False

15.23 A primary reason for inequality of variance is nonnormality of the response distribution.
(a) True  (b) False

15.24 Inequality of variance in an experiment can be caused by operator fatigue, tool wear, or depletion of some chemical reagent.
(a) True  (b) False

15.25 A generalized linear model has the following components: a response distribution, a linear predictor, and a link function.
(a) True  (b) False

15.26 A commonly used link function for binomial data is the logistic link.
(a) True  (b) False

15.27 The generalized linear model is actually a nonlinear model.
(a) True  (b) False

15.28 Use of a generalized linear model is often a good alternative to a data transformation on the response.
(a) True  (b) False

15.29 A covariate is a nuisance factor that can be easily controlled by the experimenter.
(a) True  (b) False

15.30 Analysis of covariance is an extension of the ANOVA intended to incorporate the effects of uncontrolled but measured variables into the analysis.
(a) True  (b) False

15.31 An experiment that includes a covariate can improve the precision of the comparisons between controlled factors.
(a) True  (b) False
