
8 RESPONSE SURFACE DESIGNS

Desirable Properties of a Response Surface Design

1. It should generate a satisfactory distribution of information throughout the design region.

2. It should ensure that the fitted value ŷ(x) at x is as close as possible to the true value at x.

3. It should provide an estimate of the pure experimental error.

4. It should give good detectability of model lack of fit.

5. It should allow the experiment to be performed in blocks.

6. It should allow for designs of increasing order to be constructed sequentially.

7. It should be efficient with respect to the number of experimental runs.

8. It should be cost effective.

9. It should provide a check of the homogeneity of variance assumption.

10. It should provide a good distribution of the scaled prediction variance Var[ŷ(x)]/σ² throughout the design region.

11. It should be robust to the presence of outliers in the data.

8.1 Prediction Variance

• Recall that the prediction variance at any point x in the design region is given by

$$\mathrm{Var}[\hat{y}(\mathbf{x})] = \sigma^2\, \mathbf{x}^{(m)\prime}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}^{(m)} \qquad (13)$$

where X is the model matrix and x^(m) is the vector corresponding to the model terms evaluated at x. For example:

– For the first-order model, x^(m)′ = [1, x1, x2, . . . , xk]

– For the interaction model, x^(m)′ = [1, x1, x2, . . . , xk, x1x2, . . . , xk−1xk]

– For the second-order model, x^(m)′ = [1, x1, x2, . . . , xk, x1², x2², . . . , xk², x1x2, . . . , xk−1xk]

• The definition of Var[ŷ(x)] in (13) indicates that

1. Var[ŷ(x)] varies based on the location of x.
2. Var[ŷ(x)] is dependent on the choice of experimental design (because of (X′X)⁻¹).

• In design comparison studies, the scaled prediction variance function

$$V(\mathbf{x}) = \frac{N\,\mathrm{Var}[\hat{y}(\mathbf{x})]}{\sigma^2}$$

is often used because the division by σ² makes the quantity scale-free and the multiplication by the design size N allows this quantity to reflect variance on a per-observation basis.

• That is, when two designs are being compared, scaling by N penalizes the design with the larger sample size. Thus, emphasis is also placed on design size efficiency.
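As a concrete illustration of (13) and the scaled form above, here is a minimal numpy sketch; the function name scaled_prediction_variance and the argument layout are illustrative choices, not from the notes.

```python
import numpy as np

def scaled_prediction_variance(X, x_m):
    """Scaled prediction variance V(x) = N * x_m' (X'X)^{-1} x_m.

    X   : (N, p) model matrix of the design actually run
    x_m : (p,)   vector of model terms evaluated at the point x
    """
    N = X.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    return N * float(x_m @ XtX_inv @ x_m)

# First-order model in k = 2 factors fitted to a 2^2 factorial design:
X = np.array([[1, -1, -1],
              [1, -1,  1],
              [1,  1, -1],
              [1,  1,  1]], dtype=float)
print(scaled_prediction_variance(X, np.array([1.0, 0.5, -0.5])))  # V(x) at x = (0.5, -0.5) -> 1.5
```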

EXAMPLE:

• Consider a k-factor experiment having N experimental runs.

• The design matrix D is the N × k matrix whose rows correspond to the N experimental runs.

• For any p-parameter polynomial response surface model, the model matrix X associated with D is the N × p matrix whose columns correspond to the p terms in the model.

• For the 3² design, (x1, x2) ∈ {−1, 0, 1} × {−1, 0, 1}, with k = 2, N = 9, and the second-order model y = β0 + β1x1 + β2x2 + β12x1x2 + β11x1² + β22x2² + ε (p = 6):

Design Matrix D        Model Matrix X (p = 6)
 x1   x2               1   x1   x2   x1x2   x1²   x2²
 -1   -1               1   -1   -1     1     1     1
 -1    0               1   -1    0     0     1     0
 -1    1               1   -1    1    -1     1     1
  0   -1               1    0   -1     0     0     1
  0    0               1    0    0     0     0     0
  0    1               1    0    1     0     0     1
  1   -1               1    1   -1    -1     1     1
  1    0               1    1    0     0     1     0
  1    1               1    1    1     1     1     1
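The design and model matrices in this example can also be generated programmatically; a short sketch (the column order matches the table above):

```python
import numpy as np
from itertools import product

# Design matrix D: all 9 combinations of (x1, x2) with levels -1, 0, 1
D = np.array(list(product([-1, 0, 1], repeat=2)), dtype=float)

# Model matrix X for y = b0 + b1*x1 + b2*x2 + b12*x1*x2 + b11*x1^2 + b22*x2^2 + e
x1, x2 = D[:, 0], D[:, 1]
X = np.column_stack([np.ones(len(D)), x1, x2, x1 * x2, x1**2, x2**2])

print(D.shape, X.shape)   # (9, 2) (9, 6)
print(X.astype(int))      # reproduces the model matrix rows shown above
```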

• The problem of choosing a “best” design D for fitting the parameters of a linear model can be interpreted in more than one way. Examples:

– D produces model coefficient estimates with smallest variance.
– X should have orthogonal columns.

• For second or higher-order polynomial response surface models, there is not a unique class of “best” designs (Box and Hunter 1957).

• Coefficient estimates should be studied simultaneously. Therefore, one desirable design property is to produce predicted values ŷ(x) with small variance, i.e., small Var(ŷ(x)).

• For example, consider two 6-point designs with 1 factor. The experiment consists of collecting data at temperatures between 80° and 100°.

• Temperature is coded as

80° → −1, 85° → −.5, 90° → 0, 95° → .5, 100° → 1, i.e., x = (temperature − 90)/10.

Design D1 (N = 6)                     Design D2 (N = 6)
y = β0 + β1x + ε                      y = β0 + β1x + ε

   D1      X for D1                      D2      X for D2
   -1       1  -1                        -1       1  -1
   -1       1  -1                        -.5      1  -.5
    0       1   0                         0       1   0
    0       1   0                         0       1   0
    1       1   1                         .5      1   .5
    1       1   1                         1       1   1

For D1: $X'X = \begin{bmatrix} 6 & 0 \\ 0 & 4 \end{bmatrix}$,  $(X'X)^{-1} = \begin{bmatrix} 1/6 & 0 \\ 0 & 1/4 \end{bmatrix}$

For D2: $X'X = \begin{bmatrix} 6 & 0 \\ 0 & 5/2 \end{bmatrix}$,  $(X'X)^{-1} = \begin{bmatrix} 1/6 & 0 \\ 0 & 2/5 \end{bmatrix}$

For D1: $V(x) = 6\,[\,1 \;\; x\,]\begin{bmatrix} 1/6 & 0 \\ 0 & 1/4 \end{bmatrix}\begin{bmatrix} 1 \\ x \end{bmatrix} = 1 + \tfrac{3}{2}x^2$

For D2: $V(x) = 6\,[\,1 \;\; x\,]\begin{bmatrix} 1/6 & 0 \\ 0 & 2/5 \end{bmatrix}\begin{bmatrix} 1 \\ x \end{bmatrix} = 1 + \tfrac{12}{5}x^2$

• If we consider the second-order model

y = β0 + β1x1 + β11x1² + ε,

then we can show that

For D1: V(x) = 3 − (9/2)x² + (9/2)x⁴

For D2: V(x) = 51/26 − (144/65)x² + (72/13)x⁴ ≈ 1.96 − 2.22x² + 5.54x⁴
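These two expressions (and the first-order ones above) can be checked numerically; a small sketch using the design points from the tables above, with a helper V() of my own naming:

```python
import numpy as np

D1 = np.array([-1, -1, 0, 0, 1, 1], dtype=float)
D2 = np.array([-1, -0.5, 0, 0, 0.5, 1], dtype=float)

def V(design, x, order=2):
    """Scaled prediction variance N * x_m'(X'X)^{-1} x_m for a 1-factor polynomial model."""
    powers = list(range(order + 1))                  # columns 1, x, x^2, ...
    X = np.column_stack([design**p for p in powers])
    x_m = np.array([x**p for p in powers])
    return len(design) * float(x_m @ np.linalg.inv(X.T @ X) @ x_m)

for x in (0.0, 0.5, 1.0):
    print(x, V(D1, x), V(D2, x))
# e.g. V(D1, 1.0) = 3 - 4.5 + 4.5 = 3.0, while V(D2, 1.0) is roughly 5.28
```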

• For k = 1 factor, V (x) can be plotted across the interval design space.

• For the first-order response surface model depicted in Figure 1a, V (x) is uniformly better for D1 than for D2.

• This is not true, however, for V (x) for the second-order response surface model shown in Figure 1b.

– V(x) is smaller for D1 than D2 near ±1 because D1 has replicated endpoints at ±1.
– V(x) is smaller for D2 than D1 near 0 because D2 has replicated points at 0 and points at ±.5.
– Each design has its strengths and weaknesses with respect to the prediction variance of the second-order response surface model.

• It is important to remember that the prediction variance function is dependent on the response surface design and the response surface model.

[Figure 1a: First-Order Model.  Figure 1b: Second-Order Model.  (• Design D1, ◦ Design D2)]

• For designs with k = 2 factors (x1, x2), V (x) can be displayed using a contour plot or a 3-dimensional surface plot.

• Example: For the N = 9 point 3² design with (x1, x2) ∈ {−1, 0, 1} × {−1, 0, 1}:

V(x) = 5 − (9/2)(x1² + x2²) + (9/2)(x1⁴ + x2⁴) + (9/4)x1²x2²

• A contour plot and a 3-dimensional surface plot of V(x) for the 3² design are shown in Figure 2a and Figure 2b.

[Figure 2a: Contour plot of V(x).  Figure 2b: 3-dimensional surface plot of V(x).]
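The plots in Figure 2 can be reproduced by evaluating V(x) on a grid over the design region; a minimal matplotlib sketch (the plotting details are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt
from itertools import product

# 3^2 design and second-order model terms (columns 1, x1, x2, x1x2, x1^2, x2^2)
D = np.array(list(product([-1, 0, 1], repeat=2)), dtype=float)
terms = lambda x1, x2: np.stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2], axis=-1)
X = terms(D[:, 0], D[:, 1])
XtX_inv = np.linalg.inv(X.T @ X)

# Evaluate V(x) = N * x_m'(X'X)^{-1} x_m on a grid
g = np.linspace(-1, 1, 101)
G1, G2 = np.meshgrid(g, g)
Xm = terms(G1, G2)                                   # shape (101, 101, 6)
V = len(D) * np.einsum('ijp,pq,ijq->ij', Xm, XtX_inv, Xm)

plt.contour(G1, G2, V, levels=10)
plt.xlabel('x1'); plt.ylabel('x2'); plt.title('Scaled prediction variance V(x)')
plt.show()
```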

• Graphical techniques for evaluating prediction variance properties throughout the experimental region have been developed for studying k ≥ 3 factors. These will be discussed later in the course.

8.2 Designs for First-Order Models

• Situation: An experimental design consisting of N runs is to be conducted on x1, x2, . . . , xk, a single response y is to be recorded, and a first-order model y = β0 + β1x1 + β2x2 + · · · + βkxk + ε is considered adequate.

• We will use coded design variables such that xj ∈ [−1, 1] ∀j. That is, the design region R(x) is a k-dimensional hypercube.

• Once the data is collected, we use the method of least squares to fit the first-order model

$$\hat{y} = \hat{\beta}_0 + \sum_{i=1}^{k} \hat{\beta}_i x_i$$

• An orthogonal design, or orthogonal first-order design, is an experimental design whose model matrix

X = [1, x1, x2, . . . , xk],

where xj is the jth column of X, satisfies xi′xj = 0 for all i ≠ j and 1′xj = 0 for all j.

• As a result, if two columns are orthogonal, then the two corresponding variables are linearly independent and the corresponding estimates are independent.

• The goal is to simultaneously minimize the variances of the bj's and, if possible, retain orthogonality among the columns of X. In other words, we want (X′X)⁻¹ to be a diagonal matrix whose diagonal entries are minimized. (Or equivalently, we want X′X to be a diagonal matrix whose diagonal entries are maximized.)

• Variance Optimality Theorem: Suppose the first-order model $\hat{y} = \hat{\beta}_0 + \sum_{i=1}^{k} \hat{\beta}_i x_i$ is to be fitted and the design size N is fixed. If xj ∈ [−1, 1] for j = 1, 2, . . . , k, then var(bj) is minimized if the design is orthogonal and all xj levels are set at ±1 for j = 1, 2, . . . , k.

Outline of proof:

– The ith diagonal element of X′X is $\sum_{j=1}^{N} x_{ij}^2$. This is maximized when each xij = ±1, which yields diagonal values equal to N.

– If the columns of X are also orthogonal, then X′X is the diagonal matrix N·Ik+1.

– Thus, (X′X)⁻¹ = (1/N)Ik+1, and the variance/covariance matrix σ²(X′X)⁻¹ = (σ²/N)Ik+1. Thus, the diagonal entries are minimized.

– Recall that the variance of any parameter estimate var(bj) is a diagonal entry of σ²(X′X)⁻¹. Thus, for each j, var(bj) is minimized by designs satisfying the stated conditions.

• The two-level full-factorial $2^k$ designs, fractional-factorial $2^{k-p}$ designs of at least Resolution III, and Plackett-Burman designs satisfy this theorem. (A numerical check is sketched below.)
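A quick numerical check of this conclusion for the 2³ full factorial (taking σ² = 1, so the coefficient variances are just the diagonal entries of (X′X)⁻¹):

```python
import numpy as np
from itertools import product

# 2^3 full factorial in coded units; first-order model matrix [1, x1, x2, x3]
D = np.array(list(product([-1, 1], repeat=3)), dtype=float)
X = np.column_stack([np.ones(len(D)), D])

print(X.T @ X)                          # N * I_{k+1} = 8 * I_4, so the design is orthogonal
print(np.diag(np.linalg.inv(X.T @ X)))  # var(b_j)/sigma^2 = 1/N = 0.125 for every coefficient
```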

• A saturated design for a model with P parameters (excluding the intercept) is a design which allows estimation of all P model parameters (plus the intercept) in N = P + 1 points. This implies there are no degrees of freedom for error. Therefore, saturated designs are used primarily for parameter estimation.
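For instance, for the first-order model in k = 3 factors (P = 3 parameters excluding the intercept), a $2^{3-1}$ fraction with x3 = x1x2 gives a saturated design with N = P + 1 = 4 runs; a quick check (this particular fraction is my own illustrative choice):

```python
import numpy as np

# 2^{3-1} fraction (x3 = x1 * x2): saturated for the first-order model in k = 3 factors
D = np.array([[-1, -1,  1],
              [-1,  1, -1],
              [ 1, -1, -1],
              [ 1,  1,  1]], dtype=float)
X = np.column_stack([np.ones(4), D])    # N = 4 runs, p = 4 parameters

print(np.linalg.matrix_rank(X))         # 4: all parameters are estimable
print(X.shape[0] - X.shape[1])          # 0: no degrees of freedom left for error
```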

• Another type of orthogonal first-order design is the simplex design. A simplex design is an orthogonal saturated design with N = k + 1 points whose points represent the vertices of a regular-sided figure. In general, the design points are given by the rows of the matrix

$$\mathbf{D} = \begin{bmatrix} x_{11} & x_{21} & \cdots & x_{k1} \\ x_{12} & x_{22} & \cdots & x_{k2} \\ \vdots & \vdots & & \vdots \\ x_{1,N} & x_{2,N} & \cdots & x_{k,N} \end{bmatrix}$$

such that the angle θ that any two design points make with the origin satisfies cos(θ) = −1/k.

• For k = 2, the points form an equilateral triangle with cos(θ) = −1/2 (i.e., θ = 120◦).

• For k = 3, the points form a regular tetrahedron (pyramid) with cos(θ) = −1/3.

• Rotation of a k-variable simplex design will yield another k-variable simplex design.

• To construct a k-variable simplex design with N = k + 1 points, you begin with any N × N orthogonal matrix P with equal first-column elements. The last N − 1 columns of the matrix √N·P yield a simplex design.

• A matrix P can be obtained by selecting any nonsingular N × N matrix Q with equal first-column elements and then applying the Gram-Schmidt orthonormalization technique to the columns of Q.
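A sketch of this construction in code, using numpy's QR factorization in place of an explicit Gram-Schmidt step (the random choice of Q and the sign fix on the first column are illustrative details of mine):

```python
import numpy as np

def simplex_design(k, seed=0):
    """k-variable simplex design with N = k + 1 points, built from an N x N
    orthogonal matrix P whose first column has equal elements."""
    N = k + 1
    rng = np.random.default_rng(seed)
    # Nonsingular Q with equal first-column elements, then orthonormalize (QR ~ Gram-Schmidt)
    Q = np.column_stack([np.ones(N), rng.standard_normal((N, k))])
    P, _ = np.linalg.qr(Q)
    P *= np.sign(P[0, 0])           # make the first column equal to +1/sqrt(N) in every row
    return np.sqrt(N) * P[:, 1:]    # last N - 1 columns of sqrt(N) * P

D = simplex_design(2)
# cos(angle) between any two design points, viewed as vectors from the origin: -1/k = -1/2
v1, v2 = D[0], D[1]
print(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))
```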

• A simpler method is to use the following table (from Montgomery’s Design and Analysis of Experiments text).

• The table gives the coefficients of orthogonal polynomials that can be used to form Q, after which the Gram-Schmidt orthonormalization technique is applied.

[Figure: A Three-Factor Simplex Design]

8.3 Central Composite Designs

One of the most popular and commonly used classes of experimental designs for fitting the second-order model is the class of central composite designs introduced by Box and Wilson (1951). Assuming k ≥ 2 design variables, the central composite design (CCD) consists of:

(i) An f = $2^{k-p}$ full (p = 0) or fractional (p > 0) factorial design of at least Resolution V. Each point is of the form (x1, . . . , xk) = (±1, ±1, . . . , ±1).

(ii) 2k axial or star points of the form (x1, . . . , xi, . . . , xk) = (0, . . . , 0, ±α, 0, . . . , 0) for 1 ≤ i ≤ k.

(iii) nc center points (x1, . . . , xk) = (0, 0, . . . , 0).

If α = 1 for the axial points, then the design is referred to as a face-centered cube design.

The CCD contains N = f + 2k + nc points to estimate the $\binom{k+2}{2} = (k+1)(k+2)/2$ parameters in the second-order model. Each of the three types of points in a central composite design plays a different role (a construction sketch in code follows this list):

• The factorial points allow estimation of the first-order and interaction terms.

• The axial points allow estimation of the squared terms.

• The center points provide an internal estimate of pure error used to test for lack of fit and also contribute toward estimation of the squared terms.
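A sketch that assembles the three types of CCD points for a full 2^k factorial portion (the fractional case is not handled, and the function name build_ccd is my own):

```python
import numpy as np
from itertools import product

def build_ccd(k, alpha=1.0, n_center=3):
    """Central composite design: 2^k factorial points, 2k axial points at +/- alpha,
    and n_center center points (alpha = 1 gives the face-centered cube)."""
    factorial = np.array(list(product([-1, 1], repeat=k)), dtype=float)
    axial = np.vstack([alpha * np.eye(k), -alpha * np.eye(k)])
    center = np.zeros((n_center, k))
    return np.vstack([factorial, axial, center])

D = build_ccd(3, alpha=1.0, n_center=3)
print(D.shape)   # (2^3 + 2*3 + 3, 3) = (17, 3)
```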

Table of Central Composite Designs

k   Fraction   Resolution   Defining Relation      Factorial Points   Axial Points   Center Points   Total Points
2   2^2        —            —                        4                  4              nc              8 + nc
3   2^3        —            —                        8                  6              nc             14 + nc
4   2^4        —            —                       16                  8              nc             24 + nc
5   2^(5−1)    V            E = ABCD                16                 10              nc             26 + nc
6   2^(6−1)    VI           F = ABCDE               32                 12              nc             44 + nc
7   2^(7−1)    VII          G = ABCDEF              64                 14              nc             78 + nc
8   2^(8−2)    V            G = CDEF, H = ABEF      64                 16              nc             80 + nc
9   2^(9−2)    VI           H = CDEFG, J = ABEFG   128                 18              nc            146 + nc

Central Composite Designs (Face-Centered Cube)

[Design listings for k = 3, 4, 5: for each k, the factorial points (±1, . . . , ±1), the 2k face-centered axial points (±1, 0, . . . , 0), (0, ±1, 0, . . . , 0), . . . , (0, . . . , 0, ±1), and the center points (0, . . . , 0).]

• For a spherical design region, replace the (±1, 0, 0, . . . , 0), (0, ±1, 0, . . . , 0), . . . , (0, 0, . . . , ±1) axial points with (±α, 0, 0, . . . , 0), (0, ±α, 0, . . . , 0), . . . , (0, 0, . . . , ±α). The "optimal" choice of α will be discussed later.

8.4 Box-Behnken Designs

A second class of experimental designs for quadratic regression is the class of Box-Behnken designs introduced by Box and Behnken (1960). Assuming k ≥ 3, most of the Box-Behnken designs (BBD) are constructed by combining two-level factorial designs with balanced incomplete block designs (BIBD). Associated with every BIBD, and hence every BBD considered, are the following parameters:

k = the number of design variables.
b = the number of blocks in the BIBD.
t = the number of design variables per block.
r = the number of blocks in which a design variable appears.
λ = the number of times that each pair of design variables appears in the same block.

It must hold that λ = r(t − 1)/(k − 1). When constructing a BBD:

1. The t columns defining a $2^t$ factorial design (or $2^{t-p}$ fractional-factorial design for larger t) with levels ±1 replace the t design variables appearing in each block of the BIBD.

2. The remaining k − t columns are set at mid-level 0.

3. nc mid-level center points (0, . . . , 0) are included in the design.

The total BBD size is N = bf + nc, where f = $2^t$. It should be noted that several of the designs (e.g., when k = 6) proposed by Box and Behnken (1960) are based on partially balanced incomplete block designs (PBIBDs).

Converting a 3-Factor BIBD into a 3-Variable BBD

BIBD (factors A, B, C):  Block 1 = {A, B},  Block 2 = {B, C},  Block 3 = {A, C}

BBD:
Block    x1   x2   x3
  1       1    1    0
  1       1   -1    0
  1      -1    1    0
  1      -1   -1    0
  2       0    1    1
  2       0    1   -1
  2       0   -1    1
  2       0   -1   -1
  3       1    0    1
  3       1    0   -1
  3      -1    0    1
  3      -1    0   -1
 CP       0    0    0
 CP       0    0    0
 CP       0    0    0
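The conversion above can be written out directly in code; a sketch (the blocks argument encodes the BIBD with variables indexed from 0, and the helper name build_bbd is hypothetical):

```python
import numpy as np
from itertools import product

def build_bbd(k, blocks, n_center=3):
    """Box-Behnken design: for each BIBD block, run a 2^t factorial in that block's
    t variables with the remaining k - t variables held at 0, then add center points."""
    rows = []
    for block in blocks:
        for levels in product([-1, 1], repeat=len(block)):
            run = np.zeros(k)
            run[list(block)] = levels
            rows.append(run)
    rows.extend(np.zeros(k) for _ in range(n_center))
    return np.array(rows)

# 3-variable BBD from the BIBD with blocks {x1, x2}, {x2, x3}, {x1, x3}:
D = build_bbd(3, blocks=[(0, 1), (1, 2), (0, 2)], n_center=3)
print(D)   # 3 blocks * 2^2 runs + 3 center points = 15 runs
```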

Box-Behnken Designs for k ≤ 7
