Analyze Phase 331

[Figure content: boxplots with salary (0 to 60,000) on the y-axis and employment category on the x-axis; group sizes N = 227, 136, 27, 41, 32, 5, ...]

FIGURE 10.7 Boxplots of salary by job category.

Boxplots are particularly useful for comparing the distribution of values in several groups. Figure 10.7 shows boxplots of the salaries for several different job titles. The boxplot makes it easy to see the different properties of the distributions: the location, variability, and shape of each distribution are obvious at a glance. This ease of interpretation is something that summary statistics alone cannot provide.
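The five-number summary that each box encodes can be computed directly. The following is a minimal Python sketch using only the standard library; the salary figures are invented for illustration and are not the data behind Fig. 10.7:

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) -- the values a boxplot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4)  # default exclusive method
    return (min(values), q1, median, q3, max(values))

# Invented salary data for two hypothetical job categories.
salaries = {
    "Clerical": [21000, 24000, 27000, 30000, 33000],
    "Manager": [40000, 48000, 55000, 63000, 78000],
}

for job, values in salaries.items():
    print(job, five_number_summary(values))
```

Comparing these tuples across groups conveys the same location and spread information that the boxplots show at a glance.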

Statistical Inference
This section discusses the basic concept of statistical inference. The reader should also consult the glossary in the Appendix for additional information. Inferential statistics belong to the enumerative class of statistical methods. All statements made in this section are valid only for stable processes, that is, processes in statistical control. Although most applications of Six Sigma are analytic, there are times when enumerative statistics prove useful. The term inference is defined as (1) the act or process of deriving logical conclusions from premises known or assumed to be true, or (2) the act of reasoning from factual knowledge or evidence. Inferential statistics provide information that is used in the process of inference. As can be seen from the definitions, inference involves two domains: the premises and the evidence or factual knowledge. Additionally, there are two conceptual frameworks for addressing premises questions in inference: the design-based approach and the model-based approach. As discussed by Koch and Gillings (1983), a statistical analysis whose only assumptions are random selection of units or random allocation of units to experimental conditions results in design-based inferences; or, equivalently, randomization-based inferences. The objective is to structure the study such that the sampled population has the same

characteristics as the target population. If this is accomplished, then inferences from the study are said to have internal validity. A limitation on design-based inferences for experimental studies is that formal conclusions are restricted to the finite population of subjects that actually received treatment; that is, they lack external validity. However, if sites and subjects are selected at random from larger eligible sets, then models with random effects provide one possible way of addressing both internal and external validity considerations. One important consideration for external validity is that the sample coverage includes all relevant subpopulations; another is that treatment differences be homogeneous across subpopulations. A common application of design-based inference is the survey. Alternatively, if assumptions external to the study design are required to extend inferences to the target population, then statistical analyses based on postulated probability distributional forms (e.g., binomial, normal, etc.) or other stochastic processes yield model-based inferences. A focus of distinction between design-based and model-based studies is the population to which the results are generalized rather than the nature of the statistical methods applied. When using a model-based approach, external validity requires substantive justification for the model's assumptions, as well as statistical evaluation of the assumptions. Statistical inference is used to provide probabilistic statements regarding a scientific inference. Science attempts to provide answers to basic questions, such as: Can this machine meet our requirements? Is the quality of this lot within the terms of our contract? Does the new method of processing produce better results than the old? These questions are answered by conducting an experiment, which produces data. If the data vary, then statistical inference is necessary to interpret the answers to the questions posed.
A statistical model is developed to describe the probabilistic structure relating the observed data to the quantity of interest (the parameters); that is, a scientific hypothesis is formulated. Rules are applied to the data and the scientific hypothesis is either rejected or not. In formal tests of a hypothesis, there are usually two mutually exclusive and exhaustive hypotheses formulated: a null hypothesis and an alternate hypothesis.

Chi-Square, Student's T, and F Distributions
In addition to the distributions presented earlier in the Measure phase, these three distributions are used in Six Sigma to test hypotheses, construct confidence intervals, and compute control limits.

Chi-Square
Many characteristics encountered in Six Sigma have normal or approximately normal distributions. It can be shown that in these instances the distribution of sample variances has the form (except for a constant) of a chi-square distribution, symbolized χ². Tables have been constructed giving abscissa values for selected ordinates of the cumulative χ² distribution. One such table is given in Appendix 4. The χ² distribution varies with the quantity ν, which for our purposes is equal to the sample size minus 1. For each value of ν there is a different χ² distribution. Equation (10.3) gives the pdf for χ².

f(χ²) = (χ²)^(ν/2 − 1) e^(−χ²/2) / [2^(ν/2) Γ(ν/2)]    (10.3)
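The relationship between sample variances and the χ² distribution can be checked by simulation: when sampling from a normal population, (n − 1)s²/σ² follows a χ² distribution with ν = n − 1 degrees of freedom, which has mean ν. A sketch using only the Python standard library (the sample size, σ, and seed are arbitrary choices):

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible
n, sigma = 10, 2.0

# Draw many samples from a normal population and scale each sample variance
# by (n - 1) / sigma^2, which should follow a chi-square with nu = n - 1 df.
scaled = [
    (n - 1) * statistics.variance([random.gauss(0.0, sigma) for _ in range(n)]) / sigma**2
    for _ in range(5000)
]

# A chi-square variable with nu degrees of freedom has mean nu (here, 9).
print(round(statistics.mean(scaled), 1))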

[Figure content: two F distribution density curves, including F(2,2), plotted for F from 0 to 10.]

FIGURE 10.12 F distributions.

denominator. Appendixes 5 and 6 provide values for the 1 and 5% percentage points of the F distribution. The percentages refer to the areas to the right of the values given in the tables. Figure 10.12 illustrates two F distributions.

Point and Interval Estimation
So far, we have introduced a number of important sample statistics, including the sample mean and the sample standard deviation. These sample statistics are called point estimates because they are single values used to represent population parameters. It is also possible to construct an interval about the statistic that has a predetermined probability of including the true population parameter. This interval is called a confidence interval. Interval estimation is an alternative to point estimation that gives us a better idea of the magnitude of the sampling error. Confidence intervals can be either one-sided or two-sided. A one-sided confidence interval places an upper or lower bound on the value of a parameter with a specified level of confidence. A two-sided confidence interval places both upper and lower bounds. In almost all practical applications of enumerative statistics, including Six Sigma applications, we make inferences about populations based on data from samples. In this chapter, we have talked about sample averages and standard deviations; we have even used these numbers to make statements about future performance, such as long-term yields or potential failures. A problem arises that is of considerable practical importance: any estimate that is based on a sample has some amount of sampling error. This is true even though the sample estimates are the "best estimates" in the sense that they are (usually) unbiased estimators of the population parameters.

Estimates of the Mean
For random samples with replacement, the sampling distribution of X̄ has a mean μ and a standard deviation equal to σ/√n. For large samples the sampling distribution of X̄ is approximately normal and normal tables can be used to find the probability that a sample mean will be within a given distance of μ. For example, in 95% of the samples we will observe a mean within ±1.96σ/√n of μ. In other words, in 95% of the samples the interval from X̄ − 1.96σ/√n to X̄ + 1.96σ/√n will include μ. This interval is called a "95% confidence interval for estimating μ." It is usually shown using inequality symbols:

X̄ − 1.96σ/√n < μ < X̄ + 1.96σ/√n

The factor 1.96 is the Z value obtained from the normal table in Appendix 2. It corresponds to the Z value beyond which 2.5% of the population lies. Since the normal distribution is symmetric, 2.5% of the distribution lies above Z and 2.5% below −Z. The notation commonly used to denote Z values for confidence interval construction or hypothesis testing is Zα/2, where 100(1 − α) is the desired confidence level in percent. For example, if we want 95% confidence, α = 0.05, 100(1 − α) = 95%, and Z0.025 = 1.96. In hypothesis testing the value of α is known as the significance level.

Example: Estimating μ When σ Is Known
Suppose that σ is known to be 2.8. Assume that we collect a sample of n = 16 and compute X̄ = 15.7. Using the equation from the previous section, we find the 95% confidence interval for μ as follows:

X̄ − 1.96σ/√n < μ < X̄ + 1.96σ/√n

15.7 − 1.96(2.8/√16) < μ < 15.7 + 1.96(2.8/√16)

14.33 < μ < 17.07

There is a 95% level of confidence associated with this interval. The numbers 14.33 and 17.07 are sometimes referred to as the confidence limits. Note that this is a two-sided confidence interval. There is a 2.5% probability that 17.07 is lower than μ and a 2.5% probability that 14.33 is greater than μ. If we were only interested in, say, the probability that μ were greater than 14.33, then the one-sided confidence interval would be μ > 14.33 and the one-sided confidence level would be 97.5%.
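The interval above can be reproduced in a few lines of Python; this sketch pulls the critical value from the standard normal distribution via the standard library instead of a table:

```python
from statistics import NormalDist

xbar, sigma, n, alpha = 15.7, 2.8, 16, 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

half_width = z * sigma / n**0.5
lower, upper = xbar - half_width, xbar + half_width
print(round(lower, 2), round(upper, 2))  # 14.33 17.07
```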

Example of Using Microsoft Excel to Calculate the Confidence Interval for the Mean When Sigma Is Known
Microsoft Excel has a built-in capability to calculate confidence intervals for the mean. The dialog box in Fig. 10.13 shows the input. The formula result near the bottom of

[Figure content: the Excel CONFIDENCE function dialog, with Alpha = 0.05, Standard_dev = 2.8, and Size = 16 (the sample size); the formula result is 1.371972758.]

FIGURE 10.13 Example of finding the confidence interval when sigma is known using Microsoft Excel.

the screen gives the interval width as 1.371972758. To find the lower confidence limit subtract the width from the mean. To find the upper confidence limit add the width to the mean.

Example: Estimating μ When σ Is Unknown
When σ is not known and we wish to replace σ with s in calculating confidence intervals for μ, we must replace Zα/2 with tα/2 and obtain the value from tables for Student's t distribution instead of the normal tables. Let's revisit the example above and assume that instead of knowing σ, it was estimated from the sample; that is, based on the sample of n = 16, we computed s = 2.8 and X̄ = 15.7. Then the 95% confidence interval becomes:

X̄ − 2.131s/√n < μ < X̄ + 2.131s/√n

15.7 − 2.131(2.8/√16) < μ < 15.7 + 2.131(2.8/√16)

14.21 < μ < 17.19

It can be seen that this interval is wider than the one obtained for known σ. The tα/2 value found for 15 df is 2.131 (see Table 3 in the Appendix), which is greater than Zα/2 = 1.96 above.
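The same t-based interval can be sketched in Python. The standard library has no inverse t function, so the 2.131 critical value is taken from the t table, as above:

```python
t_crit = 2.131  # t value for 15 df at 95% confidence, from the t table
xbar, s, n = 15.7, 2.8, 16

half_width = t_crit * s / n**0.5  # 2.131 * 2.8 / 4
lower, upper = xbar - half_width, xbar + half_width
print(round(lower, 2), round(upper, 2))  # 14.21 17.19
```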

Example of Using Microsoft Excel to Calculate the Confidence Interval for the Mean When Sigma Is Unknown
Microsoft Excel has no built-in capability to calculate confidence intervals for the mean when sigma is not known. However, it does have the ability to calculate t-values when given probabilities and degrees of freedom. This information can be entered into an equation and used to find the desired confidence limits. Figure 10.14 illustrates the approach. The formula bar shows the formula for the 95% upper confidence limit for the mean in cell B7.

[Figure content: cell B7 contains the formula =$B$1+TINV($B$4,$B$3-1)*$B$2/SQRT($B$3); the worksheet holds Mean = 15.7, sigma = 2.8, n = 16, Alpha = 0.05, Lower Confidence Limit = 14.21, Upper Confidence Limit = 17.19.]

FIGURE 10.14 Example of finding the confidence interval when sigma is unknown using Microsoft Excel.

Hypothesis Testing
Statistical inference generally involves four steps:

1. Formulating a hypothesis about the population or "state of nature"
2. Collecting a sample of observations from the population
3. Calculating statistics based on the sample
4. Either accepting or rejecting the hypothesis based on a predetermined acceptance criterion

There are two types of error associated with statistical inference:

Type I error (α error)-The probability that a hypothesis that is actually true will be rejected. The value of α is known as the significance level of the test.

Type II error (β error)-The probability that a hypothesis that is actually false will be accepted. Type II errors are often plotted in what is known as an operating characteristic curve.

Confidence intervals are usually constructed as part of a statistical test of hypotheses. The hypothesis test is designed to help us make an inference about the true population value at a desired level of confidence. We will look at a few examples of how hypothesis testing can be used in Six Sigma applications.

Example: Hypothesis Test of Sample Mean
Experiment: The nominal specification for filling a bottle with a test chemical is 30 cc. The plan is to draw a sample of n = 25 units from a stable process and, using the sample mean and standard deviation, construct a two-sided confidence interval (an interval that extends on either side of the sample average) that has a 95% probability of including the true population mean. If the interval includes 30, conclude that the lot mean is 30; otherwise conclude that the lot mean is not 30.

Result: A sample of 25 bottles was measured and the following statistics computed

X̄ = 28 cc, s = 6 cc

The appropriate test is t, given by the formula

t = (X̄ − μ)/(s/√n) = (28 − 30)/(6/√25) = −1.67

Table 3 in the Appendix gives values for the t statistic at various degrees of freedom. There are n − 1 degrees of freedom (df). For our example we need the t.975 column and the row for 24 df. This gives a t value of 2.064. Since the absolute value of our test statistic, 1.67, is less than this critical value, we fail to reject the hypothesis that the lot mean is 30 cc. Using statistical notation this is shown as:

H0: μ = 30 cc (the null hypothesis)

H1: μ ≠ 30 cc (the alternate hypothesis)
α = 0.05 (Type I error or level of significance)
Critical region: −2.064 ≤ t0 ≤ +2.064
Test statistic: t = −1.67

Since t lies inside the critical region, we fail to reject H0, and accept the hypothesis that the lot mean is 30 cc for the data at hand.
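The whole test can be sketched in a few lines of Python; the critical value 2.064 is taken from the t table rather than computed:

```python
n, xbar, mu0, s = 25, 28.0, 30.0, 6.0

t = (xbar - mu0) / (s / n**0.5)  # (28 - 30) / (6/5) = -1.67
t_crit = 2.064                   # t_.975 for 24 df, from the t table

reject = abs(t) > t_crit
print(round(t, 2), "reject H0" if reject else "fail to reject H0")
```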

Example: Hypothesis Test of Two Sample Variances
The variance of machine X's output, based on a sample of n = 25 taken from a stable process, is 100. Machine Y's variance, based on a sample of 10, is 50. The manufacturing representative from the supplier of machine X contends that the result is a mere "statistical fluke." Assuming that a "statistical fluke" is something that has less than 1 chance in 100, test the hypothesis that both variances are actually equal. The test statistic used to test for equality of two sample variances is the F statistic, which, for this example, is given by the equation

F = s₁²/s₂² = 100/50 = 2, numerator df = 24, denominator df = 9

Using Table 5 in the Appendix for F.99, we find that for 24 df in the numerator and 9 df in the denominator, F = 4.73. Since 2 < 4.73, we conclude that the manufacturer of machine X could be right; the result could be a statistical fluke. This example demonstrates the volatile nature of the sampling error of sample variances and standard deviations.
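A sketch of the same F comparison in Python; the 4.73 critical value comes from the F table rather than being computed:

```python
var_x, n_x = 100.0, 25  # machine X: sample variance and sample size
var_y, n_y = 50.0, 10   # machine Y

F = var_x / var_y                  # 2.0
df_num, df_den = n_x - 1, n_y - 1  # 24 and 9
F_crit = 4.73                      # F_.99(24, 9), from the F table

fluke_plausible = F < F_crit
print(F, fluke_plausible)  # 2.0 True -> cannot rule out equal variances
```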

Example: Hypothesis Test of a Standard Deviation Compared to a Standard Value
A machine is supposed to produce parts in the range of 0.500 inch plus or minus 0.006 inch. Based on this, the absolute worst standard deviation tolerable is computed to be 0.002 inch. In looking over your capability charts you find that the best machine in the shop has a standard deviation of 0.0022, based on a sample of 25 units.