Robust Design: Statistical Analysis for Taguchi Methods

In the previous lecture, we saw how to set up the design of a series of experiments to test different configurations of potential designs under different conditions of uncontrollable factors (noise). In this set of notes, we will briefly see how to interpret the experimental results. A more complete treatment requires more time spent in the study of statistics; we will just look at some basics to get an understanding of the main principles.

Analysis of the experimental results

Upon completion of the full array of experiments, we must analyze the results in order to select the best design. The different designs are compared in terms of their signal-to-noise ratio, or the S/N ratio. We shall not go into the complete details of the statistical analysis in this course, but some insights are useful to understand how the method works.

We shall look at several methods, starting from the most basic to a somewhat analytical method. For more complete details, you will need to study a course in design of experiments (DOE).

I will use a few simple examples to explain these methods.

Example 1. Water Pump Design

The design of a water pump is being studied to set the optimum design parameters. The evaluation criterion is water leakage during operation. The table below lists the factors and the levels at which they are tested.

Table 1. Water pump design parameter levels.

Factor                  Level 1        Level 2
A: Cover design         cover1         cover2
B: Gasket design        gasket1        gasket2
C: Front bolt torque    LSL            USL
D: Sealant              No             Yes
E: Surface finish       rough          smooth
F: Back bolt torque     LSL            USL
G: Torque sequence      Front, Back    Back, Front

The outcome of the testing is measured in terms of the number of leaks in the pump assembly. The tests are conducted using an $L_8(2^7)$ orthogonal array, and the raw results are tabulated below. Leakage was rated on a scale from 0 to 5, where 0 means no leaks and 5 means extremely high leaking.


Table 2. Experimental results on the water pump designs.

        Cover  Gasket  Front   Sealant  Surface  Back    Torque
                       torque           finish   torque  sequence
Trial   A      B       C       D        E        F       G         Test: leaking
1       1      1       1       1        1        1       1         4
2       1      1       1       2        2        2       2         3
3       1      2       2       1        1        2       2         1
4       1      2       2       2        2        1       1         0
5       2      1       2       1        2        1       2         2
6       2      1       2       2        1        2       1         4
7       2      2       1       1        2        2       1         0
8       2      2       1       2        1        1       2         1

Method 1. Observation

This is the simplest analysis one can perform. For example, looking at leakage data, it appears design configurations 4 and 7 are the best (zero leakage).

On analysis, we see that these two designs have different levels for factor A (i.e. different cover designs), and therefore the likelihood that cover design does not affect the performance is increased.

Secondly, three columns are at the same level in the two best configurations: B (new gasket design, level 2), E (smooth surface finish, level 2), and G (front-back torque sequence, level 1). Therefore the likelihood that these particular levels of these factors increase the design quality is high.

At this simple level of analysis, interaction effects are mostly ignored, and only a basic idea of the design quality is obtained, which may be used to reduce the number of variables in future experiments.


Method 2. Ranking method

We re-organize the data from the experimental runs in increasing order of the measured output (leaking intensity). Table 3 below shows the resulting data for our example.

Table 3. Ranking method.

        Cover  Gasket  Front   Sealant  Surface  Back    Torque
                       torque           finish   torque  sequence
Trial   A      B       C       D        E        F       G         Test: leaking
4       1      2       2       2        2        1       1         0
7       2      2       1       1        2        2       1         0
3       1      2       2       1        1        2       2         1
8       2      2       1       2        1        1       2         1
5       2      1       2       1        2        1       2         2
2       1      1       1       2        2        2       2         3
1       1      1       1       1        1        1       1         4
6       2      1       2       2        1        2       1         4

Note the following:

(a) Parameter B, gasket design, clearly has a strong effect on the design. The four best designs all use the new gasket design, while the four worst use the existing gasket.

(b) If we partition the runs by gasket design, then factor E, surface finish, appears to have a consistent secondary effect on the performance (in the first four runs in Table 3, pumps with a smooth surface finish have lower leakage than those with a rough finish; this trend is repeated in the last four runs).

(c) From these observations, one may conclude that using a design with the new gasket and smooth surface finish will result in a good design.

The observations of the ranking method can usually be identified by using first-order statistical data about each parameter, without considering interaction effects. We call this the main effect due to a factor X, or ME(X). The easiest MEs to compute are the mean values.

Let us denote ME1(X) and ME2(X) as the mean value of the output when the factor X was at levels 1 and 2 respectively.

Thus:
ME1(B) = (2+3+4+4)/4 = 3.25
ME2(B) = (0+0+1+1)/4 = 0.5

For comparison, let us look at the main effects due to Factors A and E:

ME1(A) = (4+3+1+0)/4 = 2
ME2(A) = (2+4+0+1)/4 = 1.75

ME1(E) = (1+1+4+4)/4 = 2.5
ME2(E) = (0+0+2+3)/4 = 1.25

These are consistent with the conclusions above and also with the conclusions from the observation method.
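The main-effect means above can be reproduced with a short script. This is a minimal sketch in Python, assuming the data of Table 2; the names `L8`, `LEAKS`, `FACTORS` and `main_effect` are illustrative choices and do not come from the notes.

```python
# L8 orthogonal array from Table 2: one row per trial, columns A..G, levels 1 or 2.
L8 = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 2, 2, 2, 2],
    [1, 2, 2, 1, 1, 2, 2],
    [1, 2, 2, 2, 2, 1, 1],
    [2, 1, 2, 1, 2, 1, 2],
    [2, 1, 2, 2, 1, 2, 1],
    [2, 2, 1, 1, 2, 2, 1],
    [2, 2, 1, 2, 1, 1, 2],
]
LEAKS = [4, 3, 1, 0, 2, 4, 0, 1]  # leak rating of trials 1..8
FACTORS = "ABCDEFG"               # column labels (hypothetical helper name)


def main_effect(factor, level):
    """Mean leak rating over the trials in which `factor` is at `level`."""
    col = FACTORS.index(factor)
    values = [y for row, y in zip(L8, LEAKS) if row[col] == level]
    return sum(values) / len(values)


for factor in "BAE":
    print(factor, main_effect(factor, 1), main_effect(factor, 2))
```

Calling `main_effect` for the remaining factors (C, D, F, G) works the same way.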

Method 3. Column Effects Method

This method was suggested by Taguchi as a quick check on the relative importance of each factor; the idea is to list the ME1(X) and ME2(X) for each parameter (or for each column, if the column is assigned to study an interaction). In this sense, we can look at this method as an extended form of the ranking method.

Table 4. Column effects method (Taguchi).

                 Cover  Gasket  Front   Sealant  Surface  Back    Torque
                                torque           finish   torque  sequence
Trial No.        A      B       C       D        E        F       G         Test: leaking
4                1      2       2       2        2        1       1         0
7                2      2       1       1        2        2       1         0
3                1      2       2       1        1        2       2         1
8                2      2       1       2        1        1       2         1
5                2      1       2       1        2        1       2         2
2                1      1       1       2        2        2       2         3
1                1      1       1       1        1        1       1         4
6                2      1       2       2        1        2       1         4

ME1(X)           8      13      8       7        10       7       8
ME2(X)           7      2       7       8        5        8       7
ME2(X) - ME1(X)  -1     -11     -1      1        -5       1       -1

Note that the ME1(X) and ME2(X) rows of this table list the sum of the four leak ratings at each level; dividing by 4 gives the mean values used in Method 2.
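The bottom rows of Table 4 can be generated, and the factors ranked by the size of their column effect, along the following lines. This is a hedged sketch: the function name `column_effects` and the ranking step are illustrative assumptions, and the level totals (not means) are used, as in the table.

```python
# L8 array and leak ratings from Table 2 (rows are trials 1..8, columns A..G).
L8 = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 2, 2, 2, 2],
    [1, 2, 2, 1, 1, 2, 2],
    [1, 2, 2, 2, 2, 1, 1],
    [2, 1, 2, 1, 2, 1, 2],
    [2, 1, 2, 2, 1, 2, 1],
    [2, 2, 1, 1, 2, 2, 1],
    [2, 2, 1, 2, 1, 1, 2],
]
LEAKS = [4, 3, 1, 0, 2, 4, 0, 1]
FACTORS = "ABCDEFG"


def column_effects(array, response):
    """Return (total at level 1, total at level 2, difference) for each column."""
    effects = []
    for col in range(len(array[0])):
        total1 = sum(y for row, y in zip(array, response) if row[col] == 1)
        total2 = sum(y for row, y in zip(array, response) if row[col] == 2)
        effects.append((total1, total2, total2 - total1))
    return effects


effects = column_effects(L8, LEAKS)

# Rank the factors by the magnitude of the column effect, largest first.
ranking = sorted(FACTORS, key=lambda f: abs(effects[FACTORS.index(f)][2]), reverse=True)

for name, (t1, t2, diff) in zip(FACTORS, effects):
    print(name, t1, t2, diff)
print("ranking:", ranking)
```

Sorting by the absolute difference is one simple way to read off the relative importance of the columns; it says nothing about statistical significance, which is the subject of Method 5.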

Method 4. Graphical method

The graphical method plots the MEs. The simplest case involves plotting the ME for each factor individually. This yields practically the same information as the previous methods (Figure 1a). It is also possible to plot pair-wise interactions of factors. Assume we want to plot the interaction of two factors, each of which can take two levels. We compute the mean output for each combination of the two factors. Then we can plot iso-parametric lines: keep one factor constant, and plot how the output varies as the other one changes. This is shown for factors B and E in Figure 1b. We get two plotted lines. If the two lines are (nearly) parallel, then the likelihood is high that the two factors have little or no interaction (why?).

From the table above, we have:
ME(B1E1) = (4+4)/2 = 4
ME(B1E2) = (3+2)/2 = 2.5
ME(B2E1) = (1+1)/2 = 1
ME(B2E2) = (0+0)/2 = 0

[Plots: (a) Gasket design; (b) Interaction of B and E. Both panels show the mean leak rating (vertical axis) against gasket design (horizontal axis: B=1 current, B=2 new).]

Figure 1. Graphical method for simple analysis of experimental data
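The four cell means behind the interaction plot, and the slopes of the two iso-parametric lines, can be computed as in the sketch below. The helper name `cell_mean` and the use of the slope difference as a rough parallelism check are assumptions made for illustration.

```python
# L8 array and leak ratings from Table 2 (rows are trials 1..8, columns A..G).
L8 = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 2, 2, 2, 2],
    [1, 2, 2, 1, 1, 2, 2],
    [1, 2, 2, 2, 2, 1, 1],
    [2, 1, 2, 1, 2, 1, 2],
    [2, 1, 2, 2, 1, 2, 1],
    [2, 2, 1, 1, 2, 2, 1],
    [2, 2, 1, 2, 1, 1, 2],
]
LEAKS = [4, 3, 1, 0, 2, 4, 0, 1]
B, E = 1, 4  # zero-based column indices of factors B and E


def cell_mean(col_x, level_x, col_y, level_y):
    """Mean response over trials with column x at level_x and column y at level_y."""
    values = [
        y for row, y in zip(L8, LEAKS)
        if row[col_x] == level_x and row[col_y] == level_y
    ]
    return sum(values) / len(values)


# means[(b, e)] is the mean leak rating with B at level b and E at level e.
means = {(b, e): cell_mean(B, b, E, e) for b in (1, 2) for e in (1, 2)}

# One line per level of E: change in mean leak rating as B goes from 1 to 2.
slope_e1 = means[(2, 1)] - means[(1, 1)]
slope_e2 = means[(2, 2)] - means[(1, 2)]

print(means)
print("slopes:", slope_e1, slope_e2, "difference:", slope_e1 - slope_e2)
```

A small difference between the two slopes corresponds to nearly parallel lines in the interaction plot.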

Method 5. ANOVA (Analysis of Variance)

ANOVA is a statistical tool that allows us to break down the variation of the test results into components contributed by the different sources. This allows us to attribute the total variation to different factors, and even to combinations of different factors. ANOVA usage in Taguchi methods can therefore be seen as a two-step procedure. In the first stage, the test data points for the different designs are used to compute the total variance of the measured output, along with the variance due to each individual factor and to each combination of factors that was studied. In the second stage, the variances due to any pair of factors (or combinations) are compared. By doing so, we can conclude which factor has the more significant effect on the design output. The measure of “relative significance” is based on an F-test, which gives us a probabilistic evaluation, i.e. the probability that the two factors being compared have a sufficiently different effect on the output. When this probability is high enough (typically, higher than 95%), we may conclude that the two factors have significantly different effects. From this, we can then decide which design configuration is more robust.

In this section, we shall take a brief look at ANOVA basics. We begin with a simplified example, and build up to the case for design evaluations.

Example 1. Assume that we have designed a water pump, and test it for performance several times; the criterion is flow rate. The test results are:

Table 5. Simple example of ANOVA computations.

Run        1  2  3  4  5  6  7  8
Flow rate  5  6  8  2  5  4  4  6

Let:
$y_i$ = i-th data value
$N$ = total number of observations
$T$ = total (sum) of all observations
$\bar{T}$ = average of all observations = $T/N$ = $\bar{y}$

For our case, $N = 8$, $T = 40$, $\bar{T} = 5.0$

Since we are interested in variations about the mean (and in some cases, the variation of the mean from zero), we further compute:

SST = total sum of squares = $5^2 + 6^2 + 8^2 + 2^2 + 5^2 + 4^2 + 4^2 + 6^2 = 222$

We denote $(y_i - \bar{y})$ as the error, and can therefore compute the error sum of squares, SSe, as follows:

SSe = $0^2 + 1^2 + 3^2 + (-3)^2 + 0^2 + (-1)^2 + (-1)^2 + 1^2 = 22$

Further, we may write the square of the mean as $\bar{T}^2$, and thus the sum of squares of the “error” due to the deviation of the mean from zero can be denoted as SSm = $N\bar{T}^2 = 8 \times 25 = 200$.

Notice that SST = SSe + SSm

Further, we may think of the 8 observations as possessing 7 degrees of freedom (since we do not know the population mean, we give up one degree of freedom to compute the sample mean, $\bar{y}$, from the 8 data points). You may think of this as the data set having 7 dof, and the mean having one dof. This notion is useful mostly in computing the sample variance, which is the measure of interest for us.

We use the standard definition of variance, $\sigma^2 = SS/v$, where SS is the sum of squares of the quantity of interest, and v is the degrees of freedom of this quantity. We shall denote $\sigma^2$ with the symbol V.

Thus, for our data, Ve = SSe/ve = 22/7 = 3.14
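The sums of squares in this first example are easy to check numerically. The following is a minimal Python sketch using the flow-rate data of Table 5; the variable names are illustrative only.

```python
flow = [5, 6, 8, 2, 5, 4, 4, 6]  # flow rates of runs 1..8 (Table 5)

N = len(flow)
T = sum(flow)
mean = T / N

SST = sum(y ** 2 for y in flow)           # total sum of squares (about zero)
SSm = N * mean ** 2                       # sum of squares due to the mean
SSe = sum((y - mean) ** 2 for y in flow)  # sum of squares about the mean

v_e = N - 1      # degrees of freedom left after estimating the mean
V_e = SSe / v_e  # error variance

print(N, T, mean)
print(SST, SSm, SSe, round(V_e, 2))
```

The decomposition SST = SSe + SSm can be confirmed by comparing `SST` with `SSe + SSm`.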

Example 2. Consider the design of a water sprinkler, where the key design parameter (DP) is the diameter of the sprinkler hole. If the hole is too small, the water is sprayed too far, while if it is too large, the water pours out at a very low speed. The measured parameter is the exit velocity of the water, and three different values (levels) of the DP are to be tested: diameter = 0.15mm, 0.25mm and 0.35mm.

We denote the factor under study, hole diameter, as factor A, and thus A can take kA levels. Here kA = 3.

Several tests are performed on sprinklers at each level of A, and the data from the tests is tabulated below. Using Ai to denote the sum of the observations at level Ai, we have:

Table 6. ANOVA for multiple data points at different levels.

Level  Diameter  Water velocity      $A_i$  $n_{A_i}$  $\bar{A}_i = A_i/n_{A_i}$
A1     0.15      2.2  1.9  2.7  2.0  8.8    4          2.2
A2     0.25      1.5  1.9  1.7  -    5.1    3          1.7
A3     0.35      0.6  0.7  1.1  0.8  3.2    4          0.8

Further, the grand totals are as follows:

$T = \sum A_i = 17.1$
$N = \sum n_{A_i} = 11$
$\bar{T} = T/N \approx 1.55$

In this case, the variations have three components: (a) variation of the mean from zero, (b) variations due to the different mean values for the three designs, and (c) variation of each sprinkler type about the mean of that type.

(a) As before, we can compute, for the entire data set,

SST = $2.2^2 + \dots + 0.8^2 = 31.19$, and

SSm = variation of the mean about 0 = $N\bar{T}^2 = T^2/N = 26.583$

(b) The variations due to the different mean values of the three designs can be written:

SSA = $\sum_{i=1}^{k_A} n_{A_i} (\bar{A}_i - \bar{T})^2 = \left[ \sum_{i=1}^{k_A} \frac{A_i^2}{n_{A_i}} \right] - \frac{T^2}{N}$
= $8.8^2/4 + 5.1^2/3 + 3.2^2/4 - 17.1^2/11 = 4.007$

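(c) Variation of the individual sprinklers about the mean of their type, or the sum of squares of the error:

SSe = $\sum_{j=1}^{k_A} \sum_{i=1}^{n_{A_j}} (y_{ij} - \bar{A}_j)^2$
= $0^2 + (-0.3)^2 + 0.5^2 + (-0.2)^2 + (-0.2)^2 + 0.2^2 + 0^2 + (-0.2)^2 + (-0.1)^2 + 0.3^2 + 0^2 = 0.600$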

You can verify that SST = SSm + SSA +SSe

Table 7 summarizes the data statistics:

Table 7. ANOVA statistics for the sprinkler example.

Source  SS      v   V
m       26.583  1   26.583
A       4.007   2   2.0035
e       0.600   8   0.075
T       31.19   11
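The entries of Table 7 can be recomputed from the raw velocities of Table 6. This is a sketch under the assumption that the data are stored per level in a dictionary; the names `velocities`, `k_A` and the variable names for the sums of squares are illustrative.

```python
# Water velocities at each level of factor A (hole diameter), from Table 6.
velocities = {
    "A1": [2.2, 1.9, 2.7, 2.0],
    "A2": [1.5, 1.9, 1.7],
    "A3": [0.6, 0.7, 1.1, 0.8],
}

all_values = [y for ys in velocities.values() for y in ys]
N = len(all_values)
T = sum(all_values)
k_A = len(velocities)

SST = sum(y ** 2 for y in all_values)  # total sum of squares (about zero)
SSm = T ** 2 / N                       # variation of the mean about zero
# Variation between the level means: sum of A_i^2 / n_Ai, minus T^2 / N.
SSA = sum(sum(ys) ** 2 / len(ys) for ys in velocities.values()) - SSm
# Variation of each observation about the mean of its own level.
SSe = sum((y - sum(ys) / len(ys)) ** 2 for ys in velocities.values() for y in ys)

v_m, v_A, v_e = 1, k_A - 1, N - k_A    # degrees of freedom
V_A, V_e = SSA / v_A, SSe / v_e        # variances

print(round(SST, 3), round(SSm, 3), round(SSA, 3), round(SSe, 3))
print(v_A, v_e, round(V_A, 4), round(V_e, 4))
```

Because the script carries full precision, `V_A` may differ from the tabulated 2.0035 in the last digit.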

The F test

Let us first look at a generic form of the F test before applying it to Taguchi methods. We are given two sets of data (they may be observation values from two sets of experiments). We can then compute the sample variance for each of these sets; these measures are estimates of the true variance of the data sets (note that since we only have a finite number of readings, we don’t really know the population variance, we only know the sample variance). Now, can we comment, by looking at the variance values of the two sets, whether they arise out of different samples from the same population, or whether they are actually data sets from two totally different distributions?

Notice the parallel question for Taguchi designs: we are interested to know whether two designs are significantly different in their measured behavior, or not.

To answer this, we need to know several things:
(i) What is the size of the sample of the first set of data?
(ii) What is the size of the sample of the second set?
(iii) If we say that the two sets are samples from the same population, how confident do we want to be in our statement?

Intuitively, the first two factors are important because as our sample size increases, our estimate of the variance moves closer to the true variance.

The third factor is important for the obvious reason – if we want to be more confident that the two are from different populations, then the two samples must have very different variance values.

The actual F test is based on the most remarkable theorem in statistics, called the Central Limit Theorem (CLT). Roughly, the CLT states that:

Central Limit Theorem:
(a) No matter what the population distribution from which we take samples, the sample means will be approximately normally distributed, provided the samples are large enough.
(b) As the sample sizes increase, the mean of the sample means approaches the mean of the population.
(c) The variance of the sample means is smaller than the variance of the population. In particular, if the population has variance $\sigma^2$, then the variance of the sample means is $\sigma^2/N$, where N is the sample size. [As N increases, the sample mean is so close to the true mean that it has almost no variance, hence the name central limit.]
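Statements (b) and (c) can be illustrated with a small simulation. The sketch below is not part of the original notes: it assumes an exponential population with mean 1 and variance 1, a sample size of 30, and a fixed random seed, all chosen only for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the run is repeatable

N = 30               # size of each sample
NUM_SAMPLES = 10000  # number of samples drawn

# Population: exponential distribution with rate 1 (mean 1, variance 1),
# which is clearly not a normal distribution.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.pvariance(sample_means)

print("mean of sample means:", round(mean_of_means, 3), "(population mean is 1)")
print("variance of sample means:", round(var_of_means, 4), "(sigma^2 / N is", round(1 / N, 4), ")")
```

Plotting a histogram of `sample_means` would show the roughly bell-shaped distribution described in statement (a).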

Using this theorem, F-test tables have been constructed, and can be used to estimate if two sets of sample variances belong to the same population or not. You can find these tables in many statistics text-books.

The convention in ANOVA is to write the confidence as (1 - risk), and to denote the risk as a parameter, $\alpha$. A commonly used value is $\alpha = 0.05$ (which corresponds to 95% confidence).
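As an illustration of how an F table is used, the variance ratio of the sprinkler example (Table 7) can be compared with a tabulated critical value. In this sketch the critical value 4.46 for $F_{0.05;2;8}$ is taken from standard F tables and is an assumption that does not appear in these notes; the variable names are also illustrative.

```python
# Variances from Table 7 (sprinkler example).
V_A = 2.0035  # variance due to factor A, with 2 degrees of freedom
V_e = 0.075   # error variance, with 8 degrees of freedom

# Critical value F_{0.05;2;8} from a standard F table (assumption, see above).
F_CRIT_05_2_8 = 4.46

f_ratio = V_A / V_e
significant_at_95 = f_ratio > F_CRIT_05_2_8

print("F =", round(f_ratio, 2), "significant at 95%:", significant_at_95)
```

A ratio well above the critical value suggests that the hole diameter has a significant effect on the exit velocity at the chosen risk level.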

Example 3. We now look at some examples of the application of ANOVA to the Taguchi method of design. Let us first consider a simple 2-factor experiment to get familiar with the ideas. Assume that a candy manufacturer is testing different combinations of sugar and vegetable oil for the plasticity of the resulting candy, and tests at two levels for each factor. At each combination of settings, the plasticity is tested two times. The resulting data is as follows:

A = % sugar; A1 = 3.5, A2 = 4.5
B = % oil; B1 = 1.2, B2 = 1.8

Table 8. Data from candy experiments.

      A1     A2
B1    6, 8   3, 4
B2    7, 8   9, 10

The numbers in each cell represent the two test results at that setting. In the ANOVA, the variations this time are composed of four sources:

(i) variation due to A
(ii) variation due to B
(iii) variation due to interaction of A and B
(iv) variation due to error.

Thus, SST = SSA + SSB + SSAXB + SSe

We may compute them as follows:

A1 = sum of all outputs at level A1 = 29; A2 = 26; B1 = 21; B2 = 34
nA1 = nA2 = 4; nB1 = nB2 = 4; N = 8; T = 55

SST = $\sum_{i=1}^{N} y_i^2 - \frac{T^2}{N} = 6^2 + \dots + 10^2 - 55^2/8 = 40.875$

SSA = $\sum_{i=1}^{k_A} \frac{A_i^2}{n_{A_i}} - \frac{T^2}{N} = 29^2/4 + 26^2/4 - 55^2/8 = 1.125$

Likewise, SSB = $21^2/4 + 34^2/4 - 55^2/8 = 21.125$

Computing SSAXB requires some care: if we just total up all the sums of squares of each combination of Ai and Bj, this total will include all the terms of SSA and SSB also. In order to isolate the interaction from these lower-order terms, we need to subtract out these values. In other words:

SSAXB = $\sum_{i=1}^{c} \frac{(AXB)_i^2}{n_{AXB_i}} - \frac{T^2}{N}$ - SSA - SSB, where c is the number of different combinations of Ai, Bj. In our example, c = 4; A1B1 = 14; A2B1 = 7; A1B2 = 15 and A2B2 = 19.

Thus SSAXB = $14^2/2 + 7^2/2 + 15^2/2 + 19^2/2 - 55^2/8 - 1.125 - 21.125 = 15.125$

SSe = SST - SSA - SSB - SSAXB = 3.5

To compute the variances, we need the degrees of freedom:
vT = N - 1 = 8 - 1 = 7
vA = 1; vB = 1 (each factor has only two levels, so only one dof)
vAXB = vA x vB = [(how many times we can change A) x (for each value of A, how many times we can change B)] = 1 x 1 = 1

The remaining degrees of freedom belong to the noise, ve = vT - vA - vB - vAXB = 4. You can think of this as follows. The noise is attributed to the variation of the data around the mean at each setting; at each combination of Ai, Bj, we have two data points, or one free value of data contributing to the noise. Since there are four settings, the dof of noise is 4.

From this data, we can now construct the ANOVA table, as shown below. For significance analysis, it is conventional to compare the variance of each source with that of the error term.

Table 9. ANOVA data for the candy example.

Source    SS      v  V       F      Confidence ($\alpha$, $F_{\alpha;1;4}$)
A         1.125   1  1.125   1.29   < 90%
B         21.125  1  21.125  24.14  > 99%
AXB       15.125  1  15.125  17.29  > 95%
e         3.5     4  0.875
Total, T  40.875  7
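The whole of Table 9 can be rebuilt from the raw data of Table 8 with the sketch below. The critical values $F_{\alpha;1;4}$ (4.54, 7.71 and 21.20 for $\alpha$ = 0.10, 0.05 and 0.01) are taken from standard F tables and are assumptions not given in these notes; the names `cells`, `level_total` and `confidence` are illustrative.

```python
# Plasticity readings from Table 8, keyed by (level of A, level of B).
cells = {(1, 1): [6, 8], (2, 1): [3, 4], (1, 2): [7, 8], (2, 2): [9, 10]}

values = [y for ys in cells.values() for y in ys]
N = len(values)
T = sum(values)
CF = T ** 2 / N   # correction factor T^2 / N
n_level = N // 2  # observations at each level of a two-level factor


def level_total(index, level):
    """Sum of all readings with factor `index` (0 = A, 1 = B) at `level`."""
    return sum(sum(ys) for key, ys in cells.items() if key[index] == level)


SST = sum(y ** 2 for y in values) - CF
SSA = sum(level_total(0, lv) ** 2 / n_level for lv in (1, 2)) - CF
SSB = sum(level_total(1, lv) ** 2 / n_level for lv in (1, 2)) - CF
SSAB = sum(sum(ys) ** 2 / len(ys) for ys in cells.values()) - CF - SSA - SSB
SSe = SST - SSA - SSB - SSAB

v_e = (N - 1) - 3  # total dof minus one dof each for A, B and AXB
V_e = SSe / v_e

# Critical values F_{alpha;1;4} from standard F tables (assumption, see above).
F_CRIT = {0.10: 4.54, 0.05: 7.71, 0.01: 21.20}


def confidence(f_ratio):
    """Coarse confidence label, in the style of the last column of Table 9."""
    if f_ratio > F_CRIT[0.01]:
        return "> 99%"
    if f_ratio > F_CRIT[0.05]:
        return "> 95%"
    if f_ratio > F_CRIT[0.10]:
        return "> 90%"
    return "< 90%"


for name, ss in (("A", SSA), ("B", SSB), ("AXB", SSAB)):
    f_ratio = ss / V_e  # each source has one dof, so V = SS
    print(name, ss, round(f_ratio, 2), confidence(f_ratio))
print("e", SSe, v_e, V_e)
```

Each factor and the interaction has one degree of freedom here, so its variance equals its sum of squares.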

From this analysis, one may conclude that the amount of sugar has relatively little impact on the plasticity. The plasticity is very sensitive to the amount of oil. Further, the interaction of sugar and oil amounts is quite significant.

There is an interesting relationship between ANOVA calculations as above, and using the orthogonal array. Since we performed 8 experiments, corresponding to a typical L8 array, we can tabulate the data as follows:

Table 10. The L8 orthogonal array interpretation of the candy example data.

Trial No.  A  B  AXB  D  E  F  G  Plasticity
1          1  1  1    1  1  1  1  6
2          1  1  1    2  2  2  2  8
3          1  2  2    1  1  2  2  7
4          1  2  2    2  2  1  1  8
5          2  1  2    1  2  1  2  3
6          2  1  2    2  1  2  1  4
7          2  2  1    1  2  2  1  9
8          2  2  1    2  1  1  2  10

We can now use the first column of this table to compute SSA, and from there, Variance of A. If we compute the SS in the third column, we get:

SScolumn3 = $[(6+8+9+10)^2/4 + (7+8+3+4)^2/4 - 55^2/8] = 15.125$, which is exactly the SSAXB as we computed earlier.

Further, if we also carry out the SS computations for the dummy columns D, E, F and G, we shall find that each column gives us precisely one component of the SSe, and that: SSD + SSE + SSF + SSG = SSe.

[The reason this happens is that we had precisely one test result for each experimental run.]

Thus, the sum of the SS for all columns is exactly equal to SST.
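The column-by-column sums of squares for Table 10 can be checked with a few lines of Python. This is a sketch only; the helper name `column_ss` and the column labels are illustrative.

```python
# L8 array of Table 10; columns are A, B, AXB, D, E, F, G.
L8 = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 2, 2, 2, 2],
    [1, 2, 2, 1, 1, 2, 2],
    [1, 2, 2, 2, 2, 1, 1],
    [2, 1, 2, 1, 2, 1, 2],
    [2, 1, 2, 2, 1, 2, 1],
    [2, 2, 1, 1, 2, 2, 1],
    [2, 2, 1, 2, 1, 1, 2],
]
PLASTICITY = [6, 8, 7, 8, 3, 4, 9, 10]
COLUMNS = ["A", "B", "AXB", "D", "E", "F", "G"]


def column_ss(col):
    """Sum of squares of one column: sum of (level total)^2 / n, minus T^2 / N."""
    cf = sum(PLASTICITY) ** 2 / len(PLASTICITY)
    total = 0.0
    for level in (1, 2):
        ys = [y for row, y in zip(L8, PLASTICITY) if row[col] == level]
        total += sum(ys) ** 2 / len(ys)
    return total - cf


ss = {name: column_ss(i) for i, name in enumerate(COLUMNS)}
SSe = ss["D"] + ss["E"] + ss["F"] + ss["G"]  # dummy columns carry the error
SST = sum(ss.values())

print(ss)
print("SSe =", SSe, "SST =", SST)
```

The four dummy columns together reproduce the error sum of squares, and all seven columns together reproduce SST.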

What factors to compare?

We may look at each of columns D, …, G as four components of error, each with 1 dof. However, a better estimate of the error is obviously the sum of the error terms from the four columns. This is called pooling.

Taguchi used pooling up to estimate the relative significance of the effect of columns (factors, or their interactions). The idea is as follows: compare the lowest column effect to the next lowest to see if they are significantly different. If not, then pool their effect and compare with the next larger column effect, until a significant factor is discovered.

Using this strategy, our F-test table will appear like the following:

Table 11. Pooling up to compute significance of columns.

Source    SS      v  V       F      Confidence
A         1.125   1  1.125
B         21.125  1  21.125  22.83  > 99%
AXB       15.125  1  15.125  16.35  > 95%
eD        3.125   1  3.125
eE        0.125   1  0.125
eF        0.125   1  0.125
eG        0.125   1  0.125
Total, T  40.875  7

Here, the pooling takes place in the sequence eG, eF, eE, eA (factor A, pooled into the error), eD. At the end, we are comparing the pooled effect of these five columns, V = (0.125+0.125+0.125+1.125+3.125)/5 = 0.925, with the next higher variance. Compare this table with Table 9.
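The final step of the pooling can be sketched as follows. For simplicity, this illustration fixes the number of pooled columns at five, as in the text, instead of running an F test at each pooling step; the names `column_ss`, `pooled` and `f_ratios` are assumptions made for the example.

```python
# Sum of squares of each column of the L8 array (one dof each), from Table 11.
column_ss = {
    "A": 1.125, "B": 21.125, "AXB": 15.125,
    "eD": 3.125, "eE": 0.125, "eF": 0.125, "eG": 0.125,
}

# Order the columns from the smallest to the largest effect.
ordered = sorted(column_ss, key=column_ss.get)

# Pool the five smallest columns into a single error estimate
# (simplification: the cut-off is taken from the text, not tested here).
pooled, remaining = ordered[:5], ordered[5:]
v_pooled = len(pooled)  # one dof per pooled column
V_pooled = sum(column_ss[name] for name in pooled) / v_pooled

# Each remaining column has one dof, so its variance equals its SS.
f_ratios = {name: column_ss[name] / V_pooled for name in remaining}

print("pooled columns:", pooled)
print("pooled variance:", V_pooled)
print({name: round(f, 2) for name, f in f_ratios.items()})
```

A fuller implementation would test, at each step, whether the next column differs significantly from the pooled estimate before adding it to the pool.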

In general, some of the remaining columns of the orthogonal array may actually have been assigned to other factors. In this case, the Taguchi method merely shifts the comparison (instead of using variance of the error term) to the insignificant factors, once again, using the strategy of pooling up.

