WSEAS TRANSACTIONS on BUSINESS and ECONOMICS Martin Kovářík, Libor Sarga

Process Capability Indices for Non-Normal Data

MARTIN KOVÁŘÍK, LIBOR SARGA, Department of Statistics and Quantitative Methods, Tomas Bata University in Zlín, Mostní 5139, 760 01 Zlín, CZECH REPUBLIC, [email protected], [email protected]

Abstract: - When the probability distribution of a process characteristic is non-normal, Cp and Cpk indices calculated using conventional methods often lead to erroneous interpretation of process capability. Various methods have been proposed for surrogate process capability indices (PCIs) under non-normality, but few literature sources offer their comprehensive evaluation and comparison, in particular whether they adequately capture process capability under mild and severe departures from normality, and what the best method is to compute true capability under each of these circumstances. We overview 9 methods as to their performance when handling PCI non-normality. The comparison is carried out by simulating log-normal and Weibull data and the results are presented using box plots. We show performance to be dependent on the ability to capture the tail behavior of the underlying distribution.

Key-Words: - process, capability, index, distribution, method, transformation, asymmetric

1 Introduction
Process capability indices (PCIs) are widely used to determine whether a given process is capable of outputting products within a given tolerance band. C_p and C_pk, ranking among the most popular, are defined as:

C_p = tolerance width / process width, (1)

C_pk = min(C_pu, C_pl) = min( (USL − μ)/(3σ), (μ − LSL)/(3σ) ), (2)

where USL and LSL denote the upper and lower specification limits, respectively, and C_pk is the minimum of (C_pu, C_pl). Because the process mean μ and variance σ² are unknown, they are usually estimated from their sample counterparts x̄ and s².

English and Taylor analyzed non-normality in PCIs and posited that C_pk is more sensitive to deviations than C_p. Kotz and Johnson tested properties of the indices under non-normal distributions as well, e.g., Clements' method, the Johnson–Kotz–Pearn method, and "distribution-independent" PCIs. The index by Wright [19] integrates a skewness correction factor into the denominator of the C_pmk proposed by Johnson. Choi and Bai [5] suggested a heuristic weighted variance method to correct PCIs for skewness by treating observations above and below the mean individually. Some authors proposed new-generation PCIs based on assumptions about the population distribution. Pearn and Kotz [16] tied their capability indices to the Pearson distribution. Johnson et al. [11] substituted the 6σ in equation (2) by 6θ, where θ was selected so that "capability" was not substantially influenced by the shape of the distribution. Castagliola presented computational methods for non-normal PCIs by estimating the ratio of unsatisfactory products using the Burr distribution. Vännman introduced the C_p(u,v) index family, with (u,v) comprising many other indices as special cases. Deleryd investigated suitable u's and v's for skewed process distributions. C_p(1,1), equivalent to C_pmk, is recommended as the best fit for treating non-normality in PCIs. Even though it is accepted that C_p and C_pk are not suitable for process capability with non-normal distributions and the previously mentioned methods constitute viable substitutes, a comprehensive treatment is missing. Practitioners are interested in which methods are mutually comparable, which are sensitive to the normality assumption, and which are suitable for use with non-normal distributions.

We will investigate 9 methods with respect to process capability by comparing the C_pu obtained from simulations with a target C_pu. In case the data are non-normally distributed, it will be necessary to obtain estimates of the L̂_0.00135, Û_0.99865, and M̂e_0.5 quantiles; we will evaluate the methods on this basis.
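As a running illustration, the conventional indices of equations (1) and (2) can be estimated from a sample in a few lines. This is a minimal Python sketch (the function name is ours, not from any referenced software):

```python
import statistics

def cp_cpk(data, lsl, usl):
    """Conventional C_p and C_pk of equations (1) and (2), with the unknown
    process mean and variance replaced by the sample counterparts."""
    xbar = statistics.fmean(data)   # sample mean, substitute for mu
    s = statistics.stdev(data)      # sample standard deviation, substitute for sigma
    cp = (usl - lsl) / (6 * s)
    cpu = (usl - xbar) / (3 * s)    # upper one-sided index
    cpl = (xbar - lsl) / (3 * s)    # lower one-sided index
    return cp, min(cpu, cpl)
```

For a process centered between the limits, C_p and C_pk coincide; any mean shift lowers C_pk only, which is why C_pk reacts to deviations more sensitively than C_p.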

E-ISSN: 2224-2899, Volume 11, 2014

Sometimes the shape of the distribution is known a priori or from knowledge of the physical process properties. A goodness-of-fit test must then be used to determine whether the model correctly represents the data; statistical software estimates the distribution parameters and computes the quantiles for the log-normal, Weibull, exponential, and other distributions. Some literature sources list equations for parameter and quantile estimation which can also be entered into software. Fig. 1 depicts how a PCI is obtained for a non-normally distributed quality attribute [16,17,18].

Fig. 1 PCI for a non-normally distributed quality attribute [8]

2 Surrogate PCIs for Non-Normal Distributions
Here we present methods to compute PCIs for non-normal data distributions.

2.1 Probability Graph
A popular approach to computing PCIs is the normal probability graph, which also allows the normality assumption to be examined. By analogy with the normal distribution, where the process width spans the range between the 0.135th and 99.865th percentiles, a surrogate PCI can be obtained from a suitable probability distribution:

C_P = (USL − LSL)/(upper 0.135% centile − lower 0.135% centile) = (USL − LSL)/(U_p − L_p), (3)

where L_p and U_p are the 0.135th and 99.865th percentiles, respectively, trivially obtained in statistical software such as Minitab, NCSS, XLStatistics, etc.

Fig. 2 Normal distribution: interval spanning 99.73% of values, and non-normal distribution: interval spanning 99.73% of values [Own work]

Because the median is preferred as the location parameter of asymmetric distributions, C_pu and C_pl are defined as

C_pu = (USL − median)/(x_0.99865 − median), (4)

C_pl = (median − LSL)/(median − x_0.00135). (5)

Because the method is graphical, it can produce errors if statistical software is eschewed in favor of probability paper, where the x axis is linear and the y axis probabilistic, conforming to a selected probability distribution. The N observations are plotted as (x_(i)) against the respective values of the empirical distribution function ((i)/N). The interpolated line is the best fit of the quality attribute's expected probability distribution function. The quantiles corresponding to (0.00135, 0.50, 0.99865) can be read off and substituted into the performance indicators. We repeat, however, that the method can be erroneous if statistical software is not involved [19,20].

2.2 Distribution-Independent Tolerance Intervals
Chan et al. [4] devised a method utilizing distribution-independent tolerance intervals to compute C_p and C_pk for non-normally distributed processes:

C_p = (USL − LSL)/(6σ) = (USL − LSL)/(3·(2σ)) = (USL − LSL)/((3/2)·(4σ)). (6)

This implies

C_p = (USL − LSL)/ω = (USL − LSL)/(3·(ω/3)) = (USL − LSL)/((3/2)·(2ω/3)), (7)

C_pk = min(USL − μ, μ − LSL)/(ω/2), (8)

where ω is the tolerance interval's width. If 99% reliability is required, the interval ensures 75% coverage of the population statistics; the method is thus the only one giving a different result even if the underlying distribution is normal. In that case, 99.73% of the quality attribute's values lie within the ±3s tolerance bounds, and for C_p = 1, 99.73% of values lie within the bounds. In equation (6), the 6 in the denominator generally depends on the sample size


out of which s is estimated, and on the quality attribute's distribution. The non-normality limitation can be mitigated by selecting a suitable percentage of items within the tolerance bounds; e.g., for 99%, the constant 5.15 may be used, valid for many probability distributions with skewness from 0 to 3.111 and kurtosis from 1 to 5.997 [5,10].

2.3 Weighted Variance Method
Choi and Bai [5] proposed a heuristic weighted variance method for PCI correction according to the population skewness. Let P_x be the probability that a variable X is lower than or equal to the population mean,

P_x = (1/n)·Σ_{i=1}^{n} I(x̄ − X_i), (9)

where I(x) = 1 when x ≥ 0 and I(x) = 0 when x < 0. The PCI is then defined as [5]

C_p = (USL − LSL)/(6σW_x), (10)

where W_x = 1 + |1 − 2P_x|. Therefore

C_pu = (USL − μ)/(3σ√(2P_x)), (11)

C_pl = (μ − LSL)/(3σ√(2(1 − P_x))). (12)

2.4 Clements' Method
Clements [6] supplanted the 6σ in equation (6) with U_p − L_p:

C_p = (USL − LSL)/(U_p − L_p), (13)

where U_p is the 99.865th percentile. For C_pk, μ is estimated by the median M, and 3σ by U_p − M and M − L_p. We get

C_pk = min( (USL − M)/(U_p − M), (M − LSL)/(M − L_p) ). (14)

Clements' method employs traditional estimates of skewness and kurtosis based on the third and fourth central moments, both of which are unreliable for small sample sizes. The PCIs were devised for the class of Pearson distributions, i.e., beta, gamma, and Student [6,16].
Procedure for calculating the PCIs:
a) tolerance bounds USL, LSL,
b) x̄, s, Sk (skewness), Ku (kurtosis),
c) standardized quantiles L'_p, U'_p (for selected p and known Sk, Ku, obtainable from tables [6] or from software),
d) standardized median M' from tables [6],
e) L_p, U_p: L_p = x̄ − s·L'_p, U_p = x̄ + s·U'_p,
f) M: M = x̄ + s·M',
g) C_p, C_pk from equations (13) and (14).
Pearn and Kotz [16] expanded the method with C'_pm, C'_pm*, C'_pmk defined as:

C'_pm = (USL − LSL) / ( 6·√( ((U_p − L_p)/6)² + (M − T)² ) ), (15)

C'_pm* = min(USL − T, T − LSL) / ( 3·√( ((U_p − L_p)/6)² + (M − T)² ) ), (16)

C'_pmk = min( (USL − M)/( 3·√( ((U_p − M)/3)² + (M − T)² ) ), (M − LSL)/( 3·√( ((M − L_p)/3)² + (M − T)² ) ) ). (17)

2.5 Burr Percentile Method
Burr [2,3] proposed the Burr XII distribution to compute the random variable X's percentiles. Its probability density function is of the form

f(y | c, k) = ck·y^(c−1)/(1 + y^c)^(k+1) when y ≥ 0, c ≥ 1, k ≥ 1; 0 when y < 0, (18)

where c and k represent skewness and kurtosis, respectively.
Liu and Chen introduced a modified Clements' method with percentiles taken from the Burr XII distribution instead of from the Pearson curves.
1. Estimate the sample mean x̄ and standard deviation s. Skewness s_3 and kurtosis s_4 of the empirical sample are calculated as:

s_3 = ( n/((n − 1)(n − 2)) ) · Σ_j ( (x_j − x̄)/s )³, (19)

and


s_4 = ( n(n + 1)/((n − 1)(n − 2)(n − 3)) ) · Σ_j ( (x_j − x̄)/s )⁴ − 3(n − 1)²/((n − 2)(n − 3)). (20)

2. Compute the standardized skewness and kurtosis moments α_3 and α_4, respectively:

α_3 = ( (n − 2)/√(n(n − 1)) ) · s_3, (21)

and

α_4 = ( (n − 2)(n − 3)/(n² − 1) ) · s_4 + 3(n − 1)/(n + 1). (22)

Kurtosis diminished by 3 is known as excess; if α_4 equals 3, the distribution is normal (i.e., mesokurtic). Negative excess indicates a platykurtic (broader) distribution with higher standard deviation, positive excess a leptokurtic (slender) distribution with lower standard deviation [1,2,3].
3. Use α_3 and α_4 to select suitable c and k, then calculate the Burr XII distribution standardized value

Z = (Y − μ)/σ, (23)

where Y is the selected Burr value, and μ and σ its mean and standard deviation, respectively. All central moment characteristics are tabulated [2,3]; the tables also supply the standardized 0.00135, 0.5, and 0.99865 percentiles corresponding to Z_0.00135, Z_0.5, and Z_0.99865. We get them by comparing

(X − x̄)/s = (Y − μ)/σ.

4. We now have estimates for the lower, middle, and upper percentiles:

L_p = x̄ + s·Z_0.00135,
M = x̄ + s·Z_0.5,
U_p = x̄ + s·Z_0.99865.

5. Compute the PCIs from the equations for Clements' method.
Other procedures can be used to estimate the Burr distribution's parameters apart from the third and fourth central moments, e.g., the maximum likelihood method, probability-weighted moments, etc. Moment characteristics are the most usual due to their simplicity and speed of computation [2,3].

2.6 Transformation Techniques
1. Variance stabilization requires a transformation y = g(x) with constant variance σ²(y). In case the variance of x is a function of the type σ²(x) = f_1(x), σ²(y) is computed as

σ²(y) ≈ (dg(x)/dx)²·f_1(x) = C, (24)

where C is a constant; g(x) is then obtained by solving the differential equation

g(x) ≈ C ∫ dx/√(f_1(x)). (25)

Multiple methods and tools were devised for constant relative error δ(x); e.g., σ²(x) given by f_1(x) = δ²(x)·x² = constant. After substituting, we get g(x) = ln x; here the logarithmic transformation is optimal, together with the geometric mean. When σ²(x) = f_1(x) exhibits a power relation, g(x) will be of the power type as well. For the normal distribution, the mean is independent of the variance, i.e., a variance-stabilizing transformation will ensure convergence to normality.
2. Sample distribution symmetrization is possible using the simple power transformation

y = g(x) = x^λ for λ > 0, ln x for λ = 0, −x^λ for λ < 0. (26)

The scale is not preserved, the transformation is discontinuous in λ, and it is suitable only for positive values. The optimal estimate λ̂ is found by minimizing a suitable asymmetry characteristic. The skewness defined in the following equation is robust and can be used in place of ĝ_1(y):

ĝ_1R(y) = ( (y_0.75 − y_0.50) − (y_0.50 − y_0.25) )/(y_0.75 − y_0.25). (27)

For a symmetric distribution, ĝ_1(y) = 0, as well as ĝ_1R(y) = 0. The quality of the optimal λ̂ can be checked in a rankit plot; the quantiles will lie approximately on a line.
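The symmetrization step of equations (26) and (27) is easy to sketch. The following Python fragment (the helper names are our own) applies the simple power transformation and scores symmetry by the robust quartile skewness ĝ_1R:

```python
import math
import statistics

def power_transform(x, lam):
    """Simple power transformation of equation (26); requires x > 0."""
    if lam > 0:
        return x ** lam
    if lam == 0:
        return math.log(x)
    return -(x ** lam)   # lam < 0 branch; the sign keeps the mapping increasing

def quartile_skewness(y):
    """Robust skewness g_1R of equation (27), built from the three quartiles."""
    q1, q2, q3 = statistics.quantiles(y, n=4)   # y_0.25, y_0.50, y_0.75
    return ((q3 - q2) - (q2 - q1)) / (q3 - q1)
```

Scanning a grid of λ values and keeping the one whose |ĝ_1R| is closest to zero mimics the minimization described above.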


3. Hine–Hines selection graph (x axis: x_0.5/x_{1−P_i}, y axis: x_{P_i}/x_0.5)
The selection graph is a diagnostic tool for finding the optimal λ value. It originates from the requirement of symmetry of quantiles around the median,

(x_{P_i}/x_0.5)^λ + (x_0.5/x_{1−P_i})^{−λ} = 2, (28)

where for each ordinal probability the values P_i = 2^{−i}, i = 2, 3, …, are usually assigned. To compare the experimental and theoretical observations for a selected λ, the curve x^λ + y^{−λ} = 2 for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1 is plotted into the graph:
a) for λ = 0, the solution is the line y = x,
b) for λ < 0, the solution is y = (2 − x^λ)^{−1/λ},
c) for λ > 0, the solution is x = (2 − y^{−λ})^{1/λ}.

Fig. 3 Selection graph for a sample from the log-normal distribution [14]

Judging from the positions of the experimental points on the theoretical curves in the selection graph, λ can be inferred and the quality of the transformation assessed at varying distances from the median. To converge the sample distribution to normality with respect to skewness and kurtosis, the Box-Cox transformation is a viable method [14].

2.6.1 Box-Cox Transformation
The disadvantages of a simple power transformation (discontinuity around zero and non-aligned transformed scales) are mitigated by the Box-Cox transformation X^(λ), a linear variation of the simple power transformation X_p^(λ). The Box-Cox class of transformations is of the form

X^(λ) = (X^λ − 1)/λ for λ ≠ 0, ln X for λ = 0. (29)

Fig. 4 Shape of the Box-Cox transformation for r = −1, 0, 0.5, 1, 2, 3 [13]

To determine the transformation's quality and the optimal λ, quantile dispersion graphs (QDG) and quantile–quantile (Q–Q) plots can be drawn, with normality tests after the power transformation being more suitable. The well-known Shapiro–Wilk test is equivalent to testing the significance of the tangent in a Q–Q plot, i.e., linearity can be analyzed as well. The Box-Cox family of transformations depends on λ, estimated by the maximum likelihood estimate (MLE) or the least-squares estimate (LSE). For λ = 1, an additive measurement model is suitable, while for λ = 0 a multiplicative one should be preferred. First, λ is chosen from a selected range and the following statistic is computed:

L_max(λ) = −(n/2)·ln σ̂²(λ) + ln J(λ, X) = −(n/2)·ln σ̂²(λ) + (λ − 1)·Σ_{i=1}^{n} ln X_i, (30)

where

J(λ, X) = Π_{i=1}^{n} |∂X_i^(λ)/∂X_i| = Π_{i=1}^{n} X_i^{λ−1}, ∀λ,

so that ln J(λ, X) = (λ − 1)·Σ ln X_i. The estimate for fixed λ is σ̂²(λ) = S(λ)/n, where S(λ) is the residual sum of squares in the analysis of variance of X^(λ). After computing L_max(λ) for several λ in the range, L_max(λ) is plotted against λ. The MLE of λ is the λ maximizing L_max(λ). With the optimal λ*, every X within the specification bounds is transformed to a normal value with the help of equation (29) [4,18].
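The maximum likelihood search for λ in equation (30) can be sketched with a coarse grid search; this is a simplification of what statistical packages do, and the grid and function names are our own choices:

```python
import math

def boxcox(x, lam):
    """Box-Cox transformation of equation (29), applied element-wise."""
    if lam == 0:
        return [math.log(v) for v in x]
    return [(v ** lam - 1) / lam for v in x]

def lmax(x, lam):
    """Profile log-likelihood of equation (30):
    L_max = -(n/2) ln sigma^2(lam) + (lam - 1) sum(ln x_i)."""
    n = len(x)
    y = boxcox(x, lam)
    ybar = sum(y) / n
    s2 = sum((v - ybar) ** 2 for v in y) / n   # sigma_hat^2(lam) = S(lam)/n
    return -0.5 * n * math.log(s2) + (lam - 1) * sum(math.log(v) for v in x)

def best_lambda(x, grid=None):
    """MLE of lambda over a coarse grid; real software refines the search."""
    grid = grid if grid is not None else [i / 10 for i in range(-20, 21)]
    return max(grid, key=lambda lam: lmax(x, lam))
```

For a log-normal sample, the maximum is found near λ = 0, i.e., the logarithmic transformation, in agreement with the variance-stabilization remark above.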


The Box-Cox transformation estimates λ by minimizing the standard deviation of the normalized transformed variable; software is capable of evaluating many candidate values. Here y is the transformed value of the quality attribute x [18], and λ determines the function's shape and allows finding an F(x) satisfying the maximum normality or symmetry condition. Different shapes of the Box-Cox transformation function (specified as r) for various λ are depicted in Fig. 4. The aim is to find the r ensuring maximum symmetry, "measured" by skewness, or (better) maximum normality, "measured" by likelihood, after the transformation. The maximum symmetry (zero skewness) or maximum likelihood condition is computed iteratively [15].

2.6.2 Backward Transformation After Box-Cox and Power Transformations
If a suitable transformation y = g(x) leading to approximate normality is found, ȳ, s²(y), and the confidence intervals ȳ ± t_{1−α/2}(n − 1)·s(y)/√n can be computed and statistical tests performed. However, we first need to transform the statistics back to their original forms.
1. The incorrect approach simply backward-transforms x_R = g^{−1}(ȳ). For the simple power case, the backward transformation leads to a mean of the form

x_R = x̄_λ = ( (1/n)·Σ_{i=1}^{n} x_i^λ )^{1/λ}. (31)

For λ = 0, ln x is used instead of x^λ and e^{x̄} instead of x̄^{1/λ}. Here x_R = x̄_{−1} represents the harmonic, x̄_0 the geometric, x̄_1 the arithmetic, and x̄_2 the quadratic mean, respectively. This approach leads to biased results.
2. The correct (more exact) approach is based on the Taylor expansion of y = g(x) around ȳ. For the untransformed x_R, an approximate relation is

x_R ≈ g^{−1}( ȳ − 0.5·(d²g(x)/dx²)·(dg(x)/dx)^{−2}·s²(y) ) (32)

for the mean and

s²(x_R) ≈ (dg(x)/dx)^{−2}·s²(y) (33)

for the variance. Each derivative is computed at x = x_R. For the 100(1 − α)% confidence interval for the mean of the original data, it holds that x_R − I_L ≤ μ ≤ x_R + I_U, where

I_L = g^{−1}( ȳ + G − t_{1−α/2}(n − 1)·s(y)/√n ), (34)

I_U = g^{−1}( ȳ + G + t_{1−α/2}(n − 1)·s(y)/√n ), (35)

G = −0.5·(d²g(x)/dx²)·(dg(x)/dx)^{−2}·s²(y). (36)

t_{1−α/2}(n − 1) denotes the 100(1 − α/2)% Student quantile with (n − 1) degrees of freedom. With y = g(x) and ȳ, s²(y), it is trivial to compute x_R and s²(x_R):
a) For the special case λ = 0, i.e., a logarithmic transformation of the type g(x) = ln x:

x_R ≈ exp( ȳ + 0.5·s²(y) ), s²(x_R) ≈ x_R²·s²(y).

b) For λ ≠ 0, x_R is a root of a quadratic equation:

x_{R;1,2} = ( 0.5·(1 + λȳ) ± 0.5·√( (1 + λȳ)² + 2λ(1 − λ)·s²(y) ) )^{1/λ}. (37)

The root closer to x_0.5 = g^{−1}(y_0.5) is taken as the estimate of x_R. With the untransformed mean known, the variance can be computed as s²(x_R) = x_R^{−2λ+2}·s²(y) [14].

2.6.3 Johnson Transformation
When the observations are non-normally distributed, the Johnson transformation can be used; the transformed data will then be N(0,1)-distributed. It is of the form

z = γ + η·τ(x; ε, λ), η > 0, −∞ < γ < ∞, λ > 0, −∞ < ε < ∞, (38)

where z is a standardized normal random variable and x a variable fitted by the Johnson transformation. The four parameters to be estimated are γ, η, ε, and λ, with τ an arbitrary function taking one of the following three forms. The Johnson transformation therefore selects one of three types depending on whether the random variable is "log-normal", "bounded", or "unbounded" [9,11,12].
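Returning to the backward transformation of Section 2.6.2, the logarithmic special case (λ = 0) is compact enough to sketch in code; the function name is ours:

```python
import math
import statistics

def back_transform_log(y):
    """Backward transformation for lambda = 0, g(x) = ln x (Section 2.6.2a):
    x_R ~ exp(ybar + 0.5 s^2(y)),  s^2(x_R) ~ x_R^2 s^2(y)."""
    ybar = statistics.fmean(y)
    s2y = statistics.variance(y)       # sample variance of the transformed data
    x_r = math.exp(ybar + 0.5 * s2y)   # corrected mean on the original scale
    return x_r, x_r ** 2 * s2y
```

Note the 0.5·s²(y) term: the naive back-transform exp(ȳ) of the "incorrect approach" in point 1 omits it and therefore underestimates the mean of a log-normal quality attribute.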


Log-normal System (S_L)

τ_1(x; ε, λ) = log( (x − ε)/λ ), x ≥ ε. (39)

Titled the Johnson log-normal S_L distribution, it is tied to a log-normal system. The parameter estimates are

η̂ = 1.645 · [ log( (x_0.95 − x_0.5)/(x_0.5 − x_0.05) ) ]^{−1}, (40)

γ̂* = η̂ · log( (1 − exp(−1.645/η̂))/(x_0.5 − x_0.05) ), (41)

ε̂ = x_0.5 − exp(−γ̂*/η̂), (42)

where the 100α% percentile is computed as the α(n + 1)-th value from n observations. If necessary, linear interpolation can be used to get the desired percentile [11,12].

Unbounded System (S_U)

τ_2(x; ε, λ) = sinh^{−1}( (x − ε)/λ ), −∞ < x < ∞. (43)

The curves are unbounded and the system comprises, among others, the Student and normal distributions. Hahn and Shapiro supplied a table for γ̂ and η̂ based on the given skewness and kurtosis.

Bounded System (S_B)

τ_3(x; ε, λ) = log( (x − ε)/(λ + ε − x) ), ε ≤ x ≤ ε + λ. (44)

The system includes bounded distributions, i.e., gamma and beta. Because each may be lower-bounded (ε) or upper-bounded (ε + λ), the following cases can occur:
• The fluctuation range is known: both end points are specified, and the parameters are obtained from

η̂ = (z_{1−α′} − z_α) / log( ( (x_{1−α′} − ε)(λ + ε − x_α) ) / ( (x_α − ε)(λ + ε − x_{1−α′}) ) ), (45)

γ̂ = z_{1−α′} − η̂·log( (x_{1−α′} − ε)/(λ + ε − x_{1−α′}) ). (46)

• One end point is known: an additional equation (given by the respective medians) is required to supplement η̂ and γ̂; it is given as

λ̂ = (x_0.5 − ε) · [ (x_0.5 − ε)(x_α − ε) + (x_0.5 − ε)(x_{1−α} − ε) − 2(x_α − ε)(x_{1−α} − ε) ] · [ (x_0.5 − ε)² − (x_α − ε)(x_{1−α} − ε) ]^{−1}. (47)

• No end point is known: four percentiles must be compared to those from the normal distribution. The equation, for i = 1, 2, 3, 4,

z_i = γ̂ + η̂·log( (x_i − ε̂)/(ε̂ + λ̂ − x_i) ), (48)

being non-linear, is solved numerically.
Hill et al. devised an algorithm to assign the first four central moments of X to the above-mentioned distribution types. The PCIs are then quantified from equations (1) and (2) [11,12].

2.7 Wright's C_s
Wright proposed a process capability index C_s incorporating a skewness correction factor in the denominator of C_pmk. It is of the form

C_s = min(USL − μ, μ − LSL) / ( 3·√( σ² + (μ − T)² + |μ_3/σ| ) ) = min(USL − μ, μ − LSL) / ( 3·√( σ² + |μ_3/σ| ) ), (49)

where T = μ and μ_3 is the third central moment [19].

Some methods are widely employed in industrial applications, such as probability graphs and Clements' method. Conversely, the Box-Cox transformation is relatively unknown among practitioners. If the distribution is normal, all methods except the distribution-independent one should provide results identical to the traditional C_p and C_pk of equations (1) and (2). However, as different statistics and procedures are used for the parameter estimates, there will be some variability in C_p and C_pk, with that of C_p in particular decreasing with increasing sample size, e.g., under the χ²_{1−α} distribution assuming normality. The variability of C_pk can be substantial for all sample sizes as it depends on the process variability caused by mean shifts and other changes.

3 Case Study

3.1 Data Comparison Metric
Different test metrics may lead to different results, which makes it imperative to set a suitable


performance comparison measure. Researchers use various measures to address the PCI non-normality problem. English and Taylor fixed C_p and C_pk at 1.0 for all their simulations investigating PCI robustness under non-normality. The basis of their comparison was a Ĉ_p and Ĉ_pk ratio (established from the simulation) higher than 1.0 in case of a non-normal distribution. Deleryd focused on the ratio of unsatisfactory items to determine C_p. The bias and variance of the estimated C_p(u,v) were compared with a target C_p. As a direct relation between a PCI and the defect rate exists, the choice of index also sets the acceptable percentage of defects. Rivera et al. [17] tweaked the population's upper specification bounds to get the true number of defects and the respective C_pk. The C_pk values from simulations of transformed data were compared with the target values of C_pk.
In practice, PCIs are frequently used to monitor performance and compare processes. Even though such applications are not recommended when normality is untested, "substitute" PCIs for non-normal data should be compatible with PCIs computed under normality when the defect ratios are approximately equal. This explains the solution proposed by Rivera et al.: the defect ratio is set a priori using suitable specification limits and the PCIs are then compared with the target value, leading to the analysis of a one-sided tolerance, where C_pu, a one-sided tolerance-limit PCI, is used as the comparison measure. C_pu is computed from

ratio of unsatisfactory items = Φ(−3C_pu); (50)

the ratio of unsatisfactory items for an asymmetric two-sided tolerance is

ratio of unsatisfactory items = Φ(−3C_pl) + Φ(−3C_pu). (51)

In the study, C_pu = 1, 1.5, and 1.667, derived from the respective USLs for the log-normal and Weibull distributions with identical ratios of unsatisfactory items as targets. The USLs serve to estimate C_pu, which is then compared with the target values. The method whose C_pu sample mean estimate exhibits the least deviation with the smallest variance, as measured by dispersion or standard deviation, is the best. A graphical representation depicting the two characteristics is a simple box-and-whisker plot.

3.1.1 Test Distributions
To investigate the effects of non-normal data on PCIs, the log-normal and Weibull distributions were selected, as it is known that their parameters can represent fractional, medium, and high deviations from normality. They are also known for different properties near the tails, which substantially influence process capability. The skewness and kurtosis of both are summarized in Table 1; the distribution functions are given respectively as

F(x; μ, σ) = Φ( (ln x − μ)/σ ), x ≥ 0, (52)

F(x; η, σ) = 1 − exp( −(x/σ)^η ), x ≥ 0. (53)

Log-normal distribution    Weibull distribution
μ      σ²                  Shape parameter   Scale parameter
0.0    0.1                 1.0               1.0
0.0    0.3                 2.0               1.0
0.0    0.5                 4.0               1.0
1.0    0.5                 0.5               1.0

Tab. 1 Parameters used in the performance comparison simulations [Own work]

Fig. 5 depicts the shapes of the log-normal and Weibull distributions with the parameters from Tab. 1.

Fig. 5 Probability density functions for the log-normal and Weibull distributions with parameters from Table 1 [18], modified

3.2 Monte Carlo Simulations
A series of simulations was performed with sample sizes n = 50, 75, 100, 125, and 150; the target C_pu was set to 1.0, 1.33, 1.5, 1.667, and 1.8, using the log-normal and Weibull distributions. We only list the results for n = 100 and C_pu's of 1.0, 1.5, and 1.667. In each run, statistics such as x̄, s, the upper and lower 0.135 percentiles, as well as the median, were obtained from random variables generated from both distributions. For the transformation methods, they are obtained from the transformed values. The C_pu's were subsequently determined using the 9 methods presented above with the statistics for each USL target [4,7,18].
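A minimal version of this simulation loop can be sketched for the log-normal case and the fitted-quantile estimator of equation (4); everything here, including the names, the choice of estimator, and the seed, is our illustrative assumption, not the exact code behind the reported results:

```python
import math
import random
import statistics

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def defect_ratio(cpu):
    """Ratio of unsatisfactory items of equation (50)."""
    return phi(-3.0 * cpu)

def usl_for_target_cpu(mu, sigma, cpu):
    """USL whose exceedance probability under log-normal(mu, sigma) equals the
    target defect ratio Phi(-3 Cpu): the Phi(3 Cpu) quantile exp(mu + 3 sigma cpu)."""
    return math.exp(mu + 3.0 * sigma * cpu)

def cpu_hat(sample, usl):
    """C_pu of equation (4) with the median and the 99.865th percentile taken
    from a log-normal fitted to the data (the known-shape approach of Sec. 2.1)."""
    logs = [math.log(v) for v in sample]
    mu_hat, s_hat = statistics.fmean(logs), statistics.stdev(logs)
    median = math.exp(mu_hat)
    x_99865 = math.exp(mu_hat + 3.0 * s_hat)   # z_0.99865 = 3
    return (usl - median) / (x_99865 - median)

def simulate(mu=0.0, sigma=0.5, target=1.0, n=100, runs=100, seed=2014):
    """Mean estimated C_pu over `runs` log-normal samples of size n."""
    random.seed(seed)
    usl = usl_for_target_cpu(mu, sigma, target)
    estimates = [cpu_hat([math.exp(random.gauss(mu, sigma)) for _ in range(n)], usl)
                 for _ in range(runs)]
    return statistics.fmean(estimates)
```

With a target of 1.0, the USL coincides with the true 99.865th percentile, so the mean Ĉ_pu should hover near 1; the spread of the 100 per-run estimates is what a box-and-whisker plot summarizes.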

Each iteration was run 100 times to obtain the mean Ĉ_pu from the C_pu values. To investigate which method is the best for handling non-normality, box plots of the Ĉ_pu's together with the target C_pu's for both distributions were generated. The plots depict several statistics pertaining to C_pu, e.g., the mean, variance, outliers, and extreme values. In box plots which capture the target C_pu, the mean will intersect the horizontal line at the target level [18].

Log-normal distribution
μ      σ²     Skewness √β1   Kurtosis β2
0.0    0.5    2.9388         21.5073
0.0    0.3    1.9814         10.7057
0.0    0.1    1.0070         4.8558
1.0    0.5    2.9388         21.5073

Weibull distribution
Shape parameter   Scale parameter   Skewness √β1   Kurtosis β2
1.0               1.0               2.0000         9.0000
2.0               1.0               0.6311         3.2451
4.0               1.0               -0.0872        2.7478
0.5               1.0               2.0000         9.0000

Tab. 2 Skewness and kurtosis for the investigated parameter values [Own work]

Fig. 6 Probability density functions for the log-normal and Weibull distributions with parameters from Tab. 1 [18], modified

Fig. 9 Ĉ_pu box plots (100 observations each) for the log-normal distribution (μ = 0, σ² = 0.5, 0.3), target C_pu = 1.0, 1.5, 1.667 [Own work]

4 Discussion and Conclusion
Generally, the methods can be put into two categories. The probability graph, distribution-independent tolerance intervals, the weighted variance method, and Wright's process capability index can be classified as non-transformation methods. The second category comprises the transformation techniques: Clements', Burr's, Box-Cox's, and Johnson's. In the performance comparison, two measures are considered: accuracy and correctness with respect to sample size. For the former, we look at the difference between the simulated mean C_pu and the target C_pu; for the latter, we consider the variance or range of the simulated means. The lower the variance or range, the better the performance. The results show the transformation methods are better for both distributions, with two exceptions. First, Clements' method does not work as well as the rest; for the Weibull distribution, its performance is lower than that of the probability graph method. Second, the box plots show the probability graph method is the only one which outperforms the transformation techniques for the Weibull distribution with σ = 1.0 and η = 2.0. For the log-normal distribution, the transformation techniques supply consistently better Ĉ_pu's, closer to the true C_pu's; all the others inflate the results. From a practical point of view, the non-transformation methods are easier to compute but were proven insufficient for quantifying process capability except where the population distribution is approximately normal. The differences in Ĉ_pu's are mostly higher compared to the transformation techniques, and their use is not recommended for distributions significantly


deviating from normal. When one method fails, transformed data are required which converge closer to the normal distribution; for further computations, they need to be backward-transformed. The transformation class's performance is fairly consistent with respect to correctness and accuracy; however, it is sensitive to sample size, as evident for C_pu with n = 50 and 150: the differences are 33%, 11%, and 26% for the Clements', Box-Cox, and Johnson transformation methods, respectively, which is more than a 10% difference compared to the non-transformation ones. To summarize, the transformation techniques exhibit lower Ĉ_pu variability as well as higher differences between the Ĉ_pu computed from small and large sample sizes, which cannot be ascribed only to inherent statistical variability. This hints at them being less favorable for smaller sample sizes. Possible reasons include:
1. They usually involve conversion from the distribution's scale or range to its shape. This may induce unwanted process shifts influencing Ĉ_pu, especially for smaller sample sizes.
2. For monitoring capability, their success depends on the ability to detect behavior near the tails via the UCL percentiles.

The transformation techniques should therefore be used to monitor process capability with n ≥ 100. Nevertheless, Box-Cox's dominance is apparent from the graphs, particularly for the log-normal distribution, which is to be expected since it provides an exact transformation to natural logarithms; in case of the log-normal distribution, this results in convergence to normality. Comparing the Box-Cox and Johnson methods, the former is more accurate as it requires maximizing the logarithmic likelihood function to get a precise λ, while the latter operates on estimates of the first four central moments. The probability graph method is recommended for processes slightly deviating from normality; it can also detect anomalies such as bimodality and outliers. Another finding is that the Burr percentile method is comparable to the Box-Cox transformation. Their performance for small sample sizes, specifically 30 samples of size n = 50, was analyzed to determine which one is better for samples frequently encountered in practice. The results are depicted in Fig. 8.

Fig. 8 Box plots (100 observations each) for the Weibull distribution (σ = 0.0, 1.0, η = 1.0, 2.0), target C_pu = 1.0, 1.5, 1.667 [Own work]

The graph shows the Burr percentile method generally allows for better process capability estimates in small, non-normal samples. The Monte Carlo simulations corroborated that traditional PCIs produce biased process capability results in case of non-normal distributions, as they firmly rely on the normality assumption.

Acknowledgements
This contribution was written within the framework of the GA ČR (Czech Science Foundation) grant-maintained project Reg. No. 407/12/0821, Creating a Czech Instrument for Measuring Academic Tacit Knowledge, and with the financial support of GA ČR. The authors are also thankful to the Operational Programme Education for Competitiveness, co-funded by the European Social Fund (ESF) and the national budget of the Czech Republic, for grant No. CZ.1.07/2.3.00/20.0147, "Human Resources Development in the field of Measurement and Management of Companies, Clusters and Regions Performance", which provided financial support for this research.

References:
[1] E. Abouheif, A method for testing the assumption of phylogenetic independence in comparative data, Evol. Ecol. Res. 1 (1999) 895-909.
[2] I. W. Burr, Cumulative frequency functions, Ann. Math. Stat. 13:2 (1942) 215-232.
[3] I. W. Burr, Parameters for a general system of distributions to match a grid of α3 and α4, Commun. Stat. 2:1 (1973) 1-21.
[4] L. K. Chan, S. W. Cheng, F. A. Spiring, A graphical technique for process capability, Trans. ASQC Annu. Qual. Congr. 42 (1988) 268-275.
[5] I. S. Choi, D. S. Bai, Process capability indices for skewed populations, Proc. 20th Int. Conf. Comput. Ind. Eng. (1996) 1211-1214.


[6] J. A. Clements, Process capability calculations for non-normal distributions, Qual. Prog. 22 (1989) 49-55.
[7] M. Dlouhý, J. Fábry, M. Kuncová, T. Hladík, Simulace podnikových procesů, Brno, Computer Press, 2007.
[8] F. Fabian, V. Horálek, J. Křepela, J. Michálek, V. Chmelík, J. Chodounský, J. Král, Statistické metody řízení jakosti, Praha, ČSJ, 2007.
[9] I. D. Hill, R. Hill, R. L. Holder, Algorithm AS99: Fitting Johnson curves by moments, J. R. Stat. Soc. 25:2 (1976) 180-189.
[10] K. Ishikawa, What Is Total Quality Control? The Japanese Way, New York, Prentice Hall, 1988.
[11] N. L. Johnson, S. Kotz, W. L. Pearn, Flexible process capability indices, Pak. J. Stat. 10 (1992) 23-31.
[12] V. E. Kane, Process capability indices, J. Qual. Technol. 18 (1986) 41-52.
[13] K. Kupka, Statistické řízení jakosti, Pardubice, TriloByte, 2001.
[14] M. Meloun, J. Militký, Statistická analýza experimentálních dat, second ed., Praha, Academia, 2004.
[15] M. Meloun, J. Militký, Kompendium statistického zpracování dat, second ed., Praha, Academia, 2006.
[16] W. Pearn, S. Kotz, Application of Clements' method for calculating second- and third-generation process capability indices for non-normal Pearsonian populations, Qual. Eng. 7:1 (1994) 139-145.
[17] L. A. R. Rivera, N. F. Hubele, F. D. Lawrence, Cpk index estimation using data transformation, Comput. Ind. Eng. 29:1-4 (1995) 55-58.
[18] L. C. Tang, S. E. Than, B. W. Ang, A graphical approach to obtaining confidence limits of Cpk, Qual. Reliab. Eng. Int. 13:6 (1997) 337-346.
[19] P. A. Wright, A process capability index sensitive to skewness, J. Stat. Comput. Sim. 52:3 (1995) 195-203.
[20] K. Yang, B. El-Haik, Design for Six Sigma: A Roadmap for Product Development, New York, McGraw-Hill, 2003.
