8 Process Capability Analysis

8.1 Introduction

• A process capability analysis relates the inherent variability in a process to specifications or requirements for the product produced by that process.

• There are many ways of analyzing the capability of a process. The most common are: (1) histograms and probability plots, (2) the [text missing in the source], (3) process capability ratios, and (4) designed experiments.

• Process capability measures the uniformity of a process. Process variability (variance) and systematic deviations from a target value (bias) are the primary sources of nonuniformity.

• We will study the two major components of process variability:

– Short-term variability, which reflects the inherent random variability at a point in time.

– Long-term variability, which reflects the variability over time.

• It is common to take a 6σ spread as a measure of process capability (where σ comes from the distribution of the product quality characteristic of interest).

• When the distribution is assumed to be normal N(µ, σ), we define the natural tolerance limits to be µ ± 3σ. In this case, 99.73% of process output will be within the tolerance limits.

• One way to estimate process capability is to find a probability distribution that best describes data from that process (e.g., normal, Weibull, gamma, lognormal). Once an acceptable distribution has been found, a process capability analysis is performed by comparing the properties of the fitted distribution to the specification limits.

• When the researcher observes the process directly and can control or monitor the data-collection procedure, the study is a true process capability study: by controlling data collection and knowing the time sequence of the data, inferences can be made about the stability of the process over time.

• Major applications of data from a process capability analysis are:

1. Predicting how well the process will meet tolerances.
2. Assisting, when necessary, in adjusting a process.
3. Reducing the variability in a manufacturing process.
4. Specifying performance requirements for new equipment.
5. Selecting between competing suppliers.
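The 99.73% coverage quoted above for the natural tolerance limits µ ± 3σ can be checked numerically. The following is a minimal Python sketch (standard library only, not part of the SAS analysis in these notes); the function name `coverage_within` is hypothetical.

```python
from statistics import NormalDist

def coverage_within(k: float) -> float:
    """Probability that a normal variable falls within mu +/- k*sigma."""
    z = NormalDist()  # standard normal; the coverage does not depend on mu or sigma
    return z.cdf(k) - z.cdf(-k)

# Natural tolerance limits use k = 3, i.e. a total spread of 6 sigma.
natural_tolerance_coverage = coverage_within(3.0)
print(f"{100 * natural_tolerance_coverage:.2f}% of output within mu +/- 3 sigma")
```

This figure holds only under the normality assumption; for a skewed distribution the fraction inside µ ± 3σ can differ.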

8.2 Using a Histogram or Probability Plots

• One advantage of using a histogram is the immediate visual impression of process performance; it may also indicate a reason for poor performance (off-target, outliers, skewness, bimodality, etc.).

• For a histogram to be moderately stable so that it can reliably estimate process capability, Montgomery recommends that at least 100 observations be taken from the process.

• The histogram, along with the sample mean x̄ and sample standard deviation s, enables us to assess process capability by looking first at the shape of the histogram. If it reasonably approximates a normal distribution, then x̄ ± 3s can be used when assessing process capability.

• A normal probability plot and a test for normality (such as a Kolmogorov-Smirnov test) are commonly used as supplementary checks of normality.
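The x̄ ± 3s calculation and the Kolmogorov-Smirnov statistic mentioned above can be sketched as follows. This is an illustrative Python sketch on simulated data, not the SAS procedure used in these notes; the sample, the seed, and the function name `ks_statistic` are assumptions made for the example.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(1)  # arbitrary seed, used only to make the sketch reproducible
sample = [random.gauss(20, 1) for _ in range(250)]  # hypothetical process data

# Estimated natural tolerance limits xbar +/- 3s.
xbar, s = mean(sample), stdev(sample)
lower, upper = xbar - 3 * s, xbar + 3 * s

def ks_statistic(data, dist):
    """Kolmogorov-Smirnov D: largest gap between the empirical CDF and dist.cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = dist.cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at x, so check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

d_stat = ks_statistic(sample, NormalDist(xbar, s))
print(f"xbar = {xbar:.3f}, s = {s:.3f}, limits = ({lower:.3f}, {upper:.3f})")
print(f"KS D = {d_stat:.4f}")
```

The sketch stops at the D statistic. Because the normal parameters are estimated from the same data, the p-value requires tables adjusted for estimation, which is why it is left to the statistical software.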

Example 1: I used SAS to generate two data sets of 250 values from two distributions having µ = 20.

• The first data set contains 250 random values from a normal N(20, 1) distribution. The variable is denoted NORMAL.

• The second data set contains 250 random values from a gamma (.5,40) distribution. The variable is denoted GAMMA.

• Suppose the lower and upper specification limits are LSL=17 and USL=23, respectively.

• Histograms (1) and (2) have a normal pdf superimposed on the normal and gamma data histograms, respectively.

• Histograms (3) and (4) have a gamma pdf superimposed on the normal and true gamma data histograms, respectively.

• The estimated parameters shown below each plot are the maximum likelihood estimates (MLEs).

• How well a hypothesized distribution fits the data can be assessed with goodness-of-fit tests.

• SAS can output the results for (1) the Anderson-Darling test, (2) the Cramér-von Mises test, (3) the Kolmogorov-Smirnov test, and (4) the (not recommended) chi-square goodness-of-fit test.

SAS Summary Statistics for the Normal(20,1) sample data:

The CAPABILITY Procedure
Variable: _normal

Moments
N                        250    Sum Weights              250
Mean              20.0544549    Sum Observations  5013.61374
Std Deviation     0.98469082    Variance          0.96961602
Skewness          0.12451955    Kurtosis          0.28268622
Uncorrected SS    100786.725    Corrected SS      241.434388
Coeff Variation   4.91008519    Std Error Mean    0.06227732

Basic Statistical Measures
Location               Variability
Mean     20.05445      Std Deviation          0.98469
Median   19.98213      Variance               0.96962
Mode     .             Range                  5.90262
                       Interquartile Range    1.36525

Tests for Normality
Test                  --Statistic---     -----p Value------
Shapiro-Wilk          W      0.994503    Pr < W       0.5028
Kolmogorov-Smirnov    D      0.037547    Pr > D      >0.1500
Cramer-von Mises      W-Sq   0.059499    Pr > W-Sq   >0.2500
Anderson-Darling      A-Sq   0.378384    Pr > A-Sq   >0.2500

Quantiles
Quantile      Estimate
100% Max      23.3518731
99%           22.4471940
95%           21.7529221
90%           21.2904965
75% Q3        20.7315978
50% Median    19.9821281
25% Q1        19.3663452
10%           18.9203281
5%            18.4856329
1%            17.8064801
0% Min        17.4492496

Extreme Observations
------Lowest------      ------Highest-----
Value          Obs      Value          Obs
17.4492496     239      22.0698526      33
17.5315695      29      22.1224929     213
17.8064801     234      22.4471940      70
17.8153951     214      22.9921756      56
17.8168153       5      23.3518731      61

Specification Limits
------Limit------            ------Percent------
Lower (LSL)    17.00000      % < LSL       0.00000
Target         20.00000      % Between    99.60000
Upper (USL)    23.00000      % > USL       0.40000

Process Capability Indices
Index    Value       95% Confidence Limits
Cp       1.015547    0.926361    1.104631
CPL      1.033981    0.934039    1.133481
CPU      0.997113    0.900105    1.093675
Cpk      0.997113    0.900280    1.093946
Cpm      1.013998    0.926976    1.104973
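The index values in the table above can be reproduced from the sample mean, the sample standard deviation, and the specification limits. The sketch below assumes the standard textbook definitions of Cp, CPL, CPU, Cpk, and Cpm (the indices themselves are discussed in the next section); it is a Python illustration, not the SAS implementation, and the function name `capability_indices` is hypothetical.

```python
import math

def capability_indices(mean, sd, lsl, usl, target):
    """Point estimates of the usual capability indices (textbook definitions)."""
    cpl = (mean - lsl) / (3 * sd)   # one-sided index against the lower limit
    cpu = (usl - mean) / (3 * sd)   # one-sided index against the upper limit
    return {
        "Cp": (usl - lsl) / (6 * sd),
        "CPL": cpl,
        "CPU": cpu,
        "Cpk": min(cpl, cpu),
        # Cpm penalizes deviation of the mean from the target value.
        "Cpm": (usl - lsl) / (6 * math.sqrt(sd**2 + (mean - target) ** 2)),
    }

# Mean and standard deviation taken from the Moments table for _normal.
indices = capability_indices(20.0544549, 0.98469082, lsl=17, usl=23, target=20)
for name, value in indices.items():
    print(f"{name:4s}{value:.6f}")
```

The confidence limits reported by SAS are not reproduced here; they require the sampling distributions of the estimators.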

SAS Summary Statistics for the Gamma(.5,40) sample data:

The CAPABILITY Procedure
Variable: _gamma

Moments
N                        250    Sum Weights              250
Mean               19.834919    Sum Observations  4958.72976
Std Deviation     3.20121968    Variance          10.2478074
Skewness            0.300444    Kurtosis          -0.1739938
Uncorrected SS    100907.707    Corrected SS      2551.70405
Coeff Variation   16.1393131    Std Error Mean    0.20246291

Basic Statistical Measures
Location               Variability
Mean     19.83492      Std Deviation          3.20122
Median   19.73126      Variance              10.24781
Mode     .             Range                 18.15474
                       Interquartile Range    4.66168

Tests for Normality
Test                  --Statistic---     -----p Value------
Shapiro-Wilk          W      0.990459    Pr < W       0.1010
Kolmogorov-Smirnov    D      0.048292    Pr > D      >0.1500
Cramer-von Mises      W-Sq   0.109171    Pr > W-Sq    0.0878
Anderson-Darling      A-Sq   0.650947    Pr > A-Sq    0.0909

Quantiles
Quantile      Estimate
100% Max      30.1185447
99%           27.1717936
95%           25.3220567
90%           23.9723333
75% Q3        22.2090666
50% Median    19.7312570
25% Q1        17.5473890
10%           15.7160413
5%            14.9488884
1%            13.5740292
0% Min        11.9638075

Extreme Observations
------Lowest------      ------Highest-----
Value          Obs      Value          Obs
11.9638075     218      26.9835419     173
13.2888758     174      27.1096947      28
13.5740292     234      27.1717936      96
13.6244628       1      27.2287022     176
13.6678656      17      30.1185447       5

Specification Limits
------Limit------            ------Percent------
Lower (LSL)    17.00000      % < LSL      18.40000
Target         20.00000      % Between    64.00000
Upper (USL)    23.00000      % > USL      17.60000

Process Capability Indices
Index    Value       95% Confidence Limits
Cp       0.312381    0.284947    0.339783
CPL      0.295192    0.246202    0.343748
CPU      0.329570    0.278902    0.379784
Cpk      0.295192    0.246412    0.343971
Cpm      0.311966    0.285194    0.339956
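The sample statistics above can be compared with the theoretical moments of the gamma distribution that generated the data. The sketch below assumes the Gamma(.5, 40) notation means scale 0.5 and shape 40, which matches the mean of 20 stated in the example; it uses the standard gamma moment formulas and is a Python illustration only.

```python
import math

shape, scale = 40, 0.5   # assumed reading of Gamma(.5, 40): scale 0.5, shape 40
lsl, usl = 17, 23        # specification limits from the example

gamma_mean = shape * scale                  # 20
gamma_sd = scale * math.sqrt(shape)         # about 3.162
gamma_skewness = 2 / math.sqrt(shape)       # about 0.316
gamma_excess_kurtosis = 6 / shape           # 0.15, on the same scale as SAS kurtosis

# With a standard deviation this large relative to the 6-unit specification
# width, Cp is small regardless of the shape of the distribution.
theoretical_cp = (usl - lsl) / (6 * gamma_sd)

print(gamma_mean, round(gamma_sd, 3), round(gamma_skewness, 3),
      gamma_excess_kurtosis, round(theoretical_cp, 3))
```

The sample values (mean 19.83, standard deviation 3.20, skewness 0.300, Cp 0.31) are close to these targets; the sample kurtosis of -0.174 differs in sign from 0.15, which is plausible sampling variation for n = 250.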

PROCESS CAPABILITY COMPARISON OF NORMAL AND GAMMA DATA

[Histogram (1): Distribution of _normal (percent versus _normal) with fitted curve Normal(Mu=20.054 Sigma=0.9847). Specifications: Lower=17, Target=20, Upper=23. Insets: N 250, Cp 1.02, Cpk 1.00, Cpm 1.01; Mean 20.05, Std Dev 0.985, Skewness 0.125, Kurtosis 0.283.]

[Histogram (2): Distribution of _gamma (percent versus _gamma) with fitted curve Normal(Mu=19.835 Sigma=3.2012). Specifications: Lower=17, Target=20, Upper=23. Insets: N 250, Cp 0.31, Cpk 0.30, Cpm 0.31; Mean 19.83, Std Dev 3.201, Skewness 0.300, Kurtosis -0.174.]

[Histogram (3): Distribution of _normal (percent versus _normal) with fitted curve Gamma(Theta=0 Alpha=417 Sigma=0.05). Specifications and insets as in Histogram (1).]

[Histogram (4): Distribution of _gamma (percent versus _gamma) with fitted curve Gamma(Theta=0 Alpha=38.6 Sigma=0.51). Specifications and insets as in Histogram (2).]

SAS Code for Process Capability Example with Normal and Gamma Data

DM 'LOG; CLEAR; OUT; CLEAR;';
* ODS PRINTER PDF file='C:\COURSES\ST528\SAS\cp1.pdf';
ODS LISTING;
OPTIONS LS=78 PS=500 NONUMBER NODATE;

*******************************************************************;
*** NORMAL AND GAMMA VARIATES FROM DISTRIBUTIONS WITH MEAN = 20 ***;
*******************************************************************;
DATA in;
  DO N = 1 TO 250;
    _normal = 20 + RANNOR(5510);      ** NORMAL(20,1) **;
    _gamma  = .5*RANGAM(20921,40);    ** GAMMA(.5,40) **;
    OUTPUT;
  END;

SYMBOL1 VALUE=dot WIDTH=3 L=1;
TITLE 'PROCESS CAPABILITY COMPARISON OF NORMAL AND GAMMA DATA';

PROC CAPABILITY DATA=in;
  VAR _normal _gamma;               ** Specify responses ;
  SPEC LSL=17 USL=23 TARGET=20 ;    ** Enter specifications;

  *** Make histograms of the normal and gamma data        ;
  *** with the MLE normal pdf and statistics superimposed ;
  HISTOGRAM _normal _gamma / NORMAL(INDICES);
  INSET MEAN (5.3) STD='Std Dev' (5.3) SKEWNESS (5.3) KURTOSIS (5.3)
        / HEADER = 'Summary Statistics' POS = NE;
  INSET N CP (4.2) CPK (4.2) CPM (4.2) / POS = NW;

  *** Make histograms of the normal and gamma data        ;
  *** with the MLE gamma pdf and statistics superimposed  ;
  HISTOGRAM _normal _gamma / GAMMA(THETA=0 INDICES);
  INSET MEAN (5.3) STD='Std Dev' (5.3) SKEWNESS (5.3) KURTOSIS (5.3)
        / HEADER = 'Summary Statistics' POS = NE;
  INSET N CP (4.2) CPK (4.2) CPM (4.2) / POS = NW;

  *** Make empirical CDF plots of the normal and gamma data ;
  *** with the MLE normal CDF superimposed                  ;
  CDFPLOT _normal _gamma / NORMAL;

  *** Make QQ and PP plots of the normal data ;
  QQPLOT _normal / NORMAL;
  PPPLOT _normal / NORMAL;
RUN;

• As an alternative to the histogram, we can use probability plots (such as CDF plots, percentile-percentile (PP) plots, and quantile-quantile (QQ) plots) to study process capability.

• Plot (5) has the fitted normal CDF plot superimposed on the empirical CDF plot for the random normal data.

• Plot (6) has the fitted normal CDF plot superimposed on the empirical CDF plot for the random gamma data.

• Plot (7) is a quantile-quantile (QQ) plot of the random normal data versus the quantiles expected under a normal distribution.

• Plot (8) is a normal percentile-percentile (PP) plot of the random normal data.
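The coordinates behind the three kinds of plots listed above (empirical CDF, QQ, and PP) can be computed directly. The following Python sketch uses simulated data; the seed, the sample, and the plotting positions (i - 0.375)/(n + 0.25) are assumptions made for illustration and are not claimed to match the SAS defaults exactly.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(2)  # arbitrary seed for reproducibility
data = sorted(random.gauss(20, 1) for _ in range(250))  # hypothetical sample
n = len(data)
fitted = NormalDist(mean(data), stdev(data))  # fitted normal distribution

# Empirical CDF: the i-th ordered value is paired with the proportion i/n.
ecdf = [(x, i / n) for i, x in enumerate(data, start=1)]

# QQ plot: ordered data against standard normal quantiles at the
# assumed plotting positions.
positions = [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]
qq = [(NormalDist().inv_cdf(p), x) for p, x in zip(positions, data)]

# PP plot: fitted CDF at each ordered value against the empirical proportion.
pp = [(fitted.cdf(x), i / n) for i, x in enumerate(data, start=1)]

print(ecdf[0], qq[0], pp[0])
```

If the normal model is adequate, the QQ points fall near a straight line with intercept close to the mean and slope close to the standard deviation, and the PP points fall near the 45-degree line.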

[Plot (5): Cumulative Distribution Function for _normal (cumulative percent versus _normal) with fitted normal curve, Mu=20.054 Sigma=0.9847. Specifications: Lower=17, Target=20, Upper=23.]

[Plot (6): Cumulative Distribution Function for _gamma (cumulative percent versus _gamma) with fitted normal curve, Mu=19.835 Sigma=3.2012. Specifications: Lower=17, Target=20, Upper=23.]

[Plot (7): Q-Q Plot for _normal (_normal versus Normal Quantiles). Specifications: Lower=17, Target=20, Upper=23.]

[Plot (8): P-P Plot for _normal (Cumulative Distribution of _normal versus Normal(Mu=20.054 Sigma=0.9847)).]

• SAS output (9A) contains the goodness-of-fit test results for fitting a (hypothesized) normal distribution to the random normal data. All p-values are large, so we fail to reject the null hypothesis of a normal distribution.

• SAS output (9B) contains the goodness-of-fit test results for fitting a (hypothesized) normal distribution to the random gamma data. The Cramér-von Mises and Anderson-Darling p-values (0.088 and 0.091) are relatively small, giving some evidence against the null hypothesis of a normal distribution, although the Kolmogorov-Smirnov (>0.150) and chi-square (0.494) tests do not reject it.

OUTPUT 9A

The CAPABILITY Procedure
Fitted Normal Distribution for _normal

Parameters for Normal Distribution
Parameter    Symbol    Estimate
Mean         Mu        20.05445
Std Dev      Sigma     0.984691

Goodness-of-Fit Tests for Normal Distribution
Test                  ----Statistic-----    DF    ------p Value------
Kolmogorov-Smirnov    D        0.0375466          Pr > D        >0.150
Cramer-von Mises      W-Sq     0.0594991          Pr > W-Sq     >0.250
Anderson-Darling      A-Sq     0.3783838          Pr > A-Sq     >0.250
Chi-Square            Chi-Sq  11.3270226     7    Pr > Chi-Sq    0.125

Percent Outside Specifications for Normal Distribution
Lower Limit                    Upper Limit
LSL              17.000000     USL              23.000000
Obs Pct < LSL    0             Obs Pct > USL     0.400000
Est Pct < LSL    0.096127      Est Pct > USL     0.138878

Capability Indices Based on Normal Distribution
Cp     1.015547
CPL    1.033981
CPU    0.997113
Cpk    0.997113
Cpm    1.013998

Quantiles for Normal Distribution
           ------Quantile------
Percent    Observed    Estimated
 1.0       17.8065     17.7637
 5.0       18.4856     18.4348
10.0       18.9203     18.7925
25.0       19.3663     19.3903
50.0       19.9821     20.0545
75.0       20.7316     20.7186
90.0       21.2905     21.3164
95.0       21.7529     21.6741
99.0       22.4472     22.3452
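The "Est Pct" values and the estimated quantiles in Output 9A follow directly from the fitted normal distribution. The sketch below recomputes a few of them in Python from the reported estimates (Mu = 20.05445, Sigma = 0.984691); it is an illustration, not the SAS computation itself.

```python
from statistics import NormalDist

fitted = NormalDist(mu=20.05445, sigma=0.984691)  # estimates from Output 9A
lsl, usl = 17, 23

# Estimated percent of process output outside each specification limit.
est_pct_below_lsl = 100 * fitted.cdf(lsl)
est_pct_above_usl = 100 * (1 - fitted.cdf(usl))

# Estimated quantiles of the fitted distribution at selected percents.
est_quantiles = {p: fitted.inv_cdf(p / 100) for p in (1, 50, 99)}

print(f"Est Pct < LSL = {est_pct_below_lsl:.4f}")
print(f"Est Pct > USL = {est_pct_above_usl:.4f}")
print({p: round(q, 4) for p, q in est_quantiles.items()})
```

Comparing these model-based percentages (about 0.096 and 0.139) with the observed ones (0 and 0.4) shows how a fitted distribution can estimate tail fractions that a sample of 250 observations resolves only coarsely.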

OUTPUT 9B

The CAPABILITY Procedure
Fitted Normal Distribution for _gamma

Parameters for Normal Distribution
Parameter    Symbol    Estimate
Mean         Mu        19.83492
Std Dev      Sigma     3.20122

Goodness-of-Fit Tests for Normal Distribution
Test                  ----Statistic-----    DF    ------p Value------
Kolmogorov-Smirnov    D       0.04829201          Pr > D        >0.150
Cramer-von Mises      W-Sq    0.10917094          Pr > W-Sq      0.088
Anderson-Darling      A-Sq    0.65094719          Pr > A-Sq      0.091
Chi-Square            Chi-Sq  6.39603200     7    Pr > Chi-Sq    0.494

Percent Outside Specifications for Normal Distribution
Lower Limit                    Upper Limit
LSL              17.000000     USL              23.000000
Obs Pct < LSL    18.400000     Obs Pct > USL    17.600000
Est Pct < LSL    18.792339     Est Pct > USL    16.140229

Capability Indices Based on Normal Distribution
Cp     0.312381
CPL    0.295192
CPU    0.329570
Cpk    0.295192
Cpm    0.311966

Quantiles for Normal Distribution
           ------Quantile------
Percent    Observed    Estimated
 1.0       13.5740     12.3878
 5.0       14.9489     14.5694
10.0       15.7160     15.7324
25.0       17.5474     17.6757
50.0       19.7313     19.8349
75.0       22.2091     21.9941
90.0       23.9723     23.9374
95.0       25.3221     25.1005
99.0       27.1718     27.2821

• Outputs 10A and 10B contain the parameter estimates for fitting a gamma distribution to the normal data (10A) and to the gamma data (10B).

• They also contain tables of the observed versus estimated quantiles. If the fitted distribution is a good choice, then the quantiles should be close.

• The output also contains process capability indices that we will discuss in the next section.

OUTPUT 10A

The CAPABILITY Procedure
Fitted Gamma Distribution for _normal

Parameters for Gamma Distribution
Parameter    Symbol    Estimate
Threshold    Theta     0
Scale        Sigma     0.048128
Shape        Alpha     416.6901
Mean                   20.05445
Std Dev                0.982436

Goodness-of-Fit Tests for Gamma Distribution
Test                  ----Statistic-----    DF    ------p Value------
Kolmogorov-Smirnov    D        0.0313208          Pr > D        >0.500
Cramer-von Mises      W-Sq     0.0488707          Pr > W-Sq     >0.500
Anderson-Darling      A-Sq     0.3381429          Pr > A-Sq     >0.500
Chi-Square            Chi-Sq  11.2562464     7    Pr > Chi-Sq    0.128

Percent Outside Specifications for Gamma Distribution
Lower Limit                    Upper Limit
LSL              17.000000     USL              23.000000
Obs Pct < LSL    0             Obs Pct > USL     0.400000
Est Pct < LSL    0.054527      Est Pct > USL     0.199484

Capability Indices Based on Gamma Distribution
Cp     1.017742
CPL    1.083847
CPU    0.957809
Cpk    0.957809
Cpm    0.968746

Quantiles for Gamma Distribution
           ------Quantile------
Percent    Observed    Estimated
 1.0       17.8065     17.8400
 5.0       18.4856     18.4663
10.0       18.9203     18.8062
25.0       19.3663     19.3834
50.0       19.9821     20.0384
75.0       20.7316     20.7081
90.0       21.2905     21.3234
95.0       21.7529     21.6973
99.0       22.4472     22.4105

OUTPUT 10B

Fitted Gamma Distribution for _gamma

Parameters for Gamma Distribution
Parameter    Symbol    Estimate
Threshold    Theta     0
Scale        Sigma     0.513866
Shape        Alpha     38.59943
Mean                   19.83492
Std Dev                3.192567

Goodness-of-Fit Tests for Gamma Distribution
Test                  ----Statistic-----    DF    ------p Value------
Kolmogorov-Smirnov    D       0.03369770          Pr > D        >0.500
Cramer-von Mises      W-Sq    0.04294087          Pr > W-Sq     >0.500
Anderson-Darling      A-Sq    0.27089970          Pr > A-Sq     >0.500
Chi-Square            Chi-Sq  3.64833166     7    Pr > Chi-Sq    0.819

Percent Outside Specifications for Gamma Distribution
Lower Limit                    Upper Limit
LSL              17.000000     USL              23.000000
Obs Pct < LSL    18.400000     Obs Pct > USL    17.600000
Est Pct < LSL    18.947093     Est Pct > USL    15.960379

Capability Indices Based on Gamma Distribution
Cp     0.312763
CPL    0.330698
CPU    0.299782
Cpk    0.299782
Cpm    0.269220

Quantiles for Gamma Distribution
           ------Quantile------
Percent    Observed    Estimated
 1.0       13.5740     13.1701
 5.0       14.9489     14.8915
10.0       15.7160     15.8692
25.0       17.5474     17.5986
50.0       19.7313     19.6639
75.0       22.2091     21.8850
90.0       23.9723     24.0206
95.0       25.3221     25.3618
99.0       27.1718     28.0075
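One simple way to act on the advice that observed and estimated quantiles "should be close" is to summarize their differences numerically. The Python sketch below does this for the gamma data, using the quantile tables from Outputs 9B (normal fit) and 10B (gamma fit). The mean and maximum absolute differences are illustrative summaries chosen for this sketch, not statistics reported by SAS.

```python
# Observed and estimated quantiles for _gamma at the 1, 5, 10, 25, 50, 75,
# 90, 95, and 99 percent points, copied from Outputs 9B and 10B.
observed   = [13.5740, 14.9489, 15.7160, 17.5474, 19.7313,
              22.2091, 23.9723, 25.3221, 27.1718]
normal_fit = [12.3878, 14.5694, 15.7324, 17.6757, 19.8349,
              21.9941, 23.9374, 25.1005, 27.2821]
gamma_fit  = [13.1701, 14.8915, 15.8692, 17.5986, 19.6639,
              21.8850, 24.0206, 25.3618, 28.0075]

def abs_differences(obs, est):
    """Absolute observed-minus-estimated differences, quantile by quantile."""
    return [abs(o - e) for o, e in zip(obs, est)]

normal_diffs = abs_differences(observed, normal_fit)
gamma_diffs = abs_differences(observed, gamma_fit)

normal_mean_error = sum(normal_diffs) / len(normal_diffs)
gamma_mean_error = sum(gamma_diffs) / len(gamma_diffs)

print(f"normal fit: mean {normal_mean_error:.4f}, max {max(normal_diffs):.4f}")
print(f"gamma fit:  mean {gamma_mean_error:.4f}, max {max(gamma_diffs):.4f}")
```

On these tables the gamma fit has the smaller mean and maximum discrepancy, with the normal fit missing mostly in the lower tail (the 1% quantile) and the gamma fit mostly in the upper tail (the 99% quantile).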

• In general, we will relate the empirical distribution to a theoretical distribution. The parameters of the theoretical distribution can be specified or estimated from the data.

• We will choose a distribution that ‘best’ represents the data. This can be based on scientific or engineering principles or by empirical modeling among competing distributions. The following figure is a guide to the choice of a distribution by locating the measures of skewness (β1) and kurtosis (β2) on the figure.

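• From the data we get the estimates

$$\sqrt{\hat\beta_1} = \frac{M_3}{M_2^{3/2}} \qquad \text{and} \qquad \hat\beta_2 = \frac{M_4}{M_2^{2}},$$

where $M_j = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^j$ is the jth centered sample moment.

• The relations between the SAS measures of skewness and kurtosis and β1 and β2 are (i) $\hat\beta_1 \approx (\text{SAS skewness})^2$ and (ii) $\hat\beta_2 \approx (\text{SAS kurtosis}) + 3$.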

• If the point (β̂1, β̂2) falls in a region where none of the distributions seem appropriate, you will need to consider other families of distributions (e.g. Weibull).
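The estimates of β1 and β2 can be computed from the centered sample moments as sketched below. This Python sketch assumes the standard moment definitions (squared skewness M3²/M2³ and kurtosis M4/M2²); the function names and the simulated sample are hypothetical.

```python
import random

def centered_moment(xs, j):
    """j-th centered sample moment M_j, with divisor n."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** j for x in xs) / n

def beta_estimates(xs):
    """Return (beta1_hat, beta2_hat) from the centered sample moments."""
    m2, m3, m4 = (centered_moment(xs, j) for j in (2, 3, 4))
    sqrt_beta1 = m3 / m2 ** 1.5      # moment-based skewness
    return sqrt_beta1 ** 2, m4 / m2 ** 2

# Small symmetric data set: skewness is exactly zero.
b1_sym, b2_sym = beta_estimates([1, 2, 3, 4, 5])

# Simulated gamma sample (shape 40, scale 0.5); arbitrary seed.
random.seed(3)
gamma_sample = [random.gammavariate(40, 0.5) for _ in range(250)]
b1_gam, b2_gam = beta_estimates(gamma_sample)
print(f"beta1_hat = {b1_gam:.3f}, beta2_hat = {b2_gam:.3f}")
```

The point (β̂1, β̂2) obtained this way is what would be located on the skewness-kurtosis figure. The values differ slightly from (SAS skewness)² and (SAS kurtosis) + 3 because SAS applies small-sample adjustments, hence the approximate relations above.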
