Tolerance Intervals

K. Krishnamoorthy

University of Louisiana at Lafayette, Lafayette, LA, USA

Dakar International Conference on Recent Developments in Applied

March 17, 2014

A motivating example

Air lead levels were collected by the National Institute for Occupational Safety and Health (NIOSH) at a laboratory, for health hazard evaluation. The air lead levels were collected from 15 different areas within the facility.

Air lead levels (µg/m³):
200 120 15 7 8 6 48 61 380 80 29 1000 350 1400 110

A normal distribution fitted the log-transformed lead levels quite well (that is, the sample is from a lognormal distribution).

Objective: Are 90% of air lead levels in the facility below the occupational exposure limit (OEL) of 50 µg/m³?

$Y$: air lead levels; $X$: log-transformed air lead levels.

$\mu$ and $\sigma^2$: population mean and variance for $X$, that is, $X \sim N(\mu, \sigma^2)$.

$\exp(\mu)$: median air lead level.

The usual confidence interval for $\mu$: let $\bar X$ and $S$ be the sample mean and standard deviation of the log-transformed measurements for a sample of size $n$.

A 95% confidence interval for $\mu$: $\bar X \pm t_{n-1;.975}\, S/\sqrt{n}$.

A 95% upper confidence bound for $\mu$: $\bar X + t_{n-1;.95}\, S/\sqrt{n}$.

Confidence intervals for the median air lead level $\exp(\mu)$ can be obtained by exponentiating these limits.

To predict the air lead level at a particular area within the laboratory, a 95% prediction interval

$$\bar X \pm t_{n-1;.975}\, S \sqrt{1 + \frac{1}{n}}$$

for the log-transformed lead level can be used. However, neither the confidence interval nor the prediction interval can answer the question, “Are 90% of the population lead levels below a threshold?” What is required is a tolerance interval; more specifically, an upper tolerance limit.

One-Sided Tolerance Limits

Let $\mathbf X = (X_1, ..., X_n)$ be a sample from a population.

A $(p, 1-\alpha)$ upper tolerance limit $U(\mathbf X)$ is constructed so that at least $100p$ percent of the population is $\le U(\mathbf X)$ with confidence $1-\alpha$.

A $(p, 1-\alpha)$ lower tolerance limit $L(\mathbf X)$ is constructed so that at least $100p$ percent of the population is $\ge L(\mathbf X)$ with confidence $1-\alpha$.

A $(p, 1-\alpha)$ tolerance interval $(L(\mathbf X), U(\mathbf X))$ is constructed so that the interval would include at least $100p$ percent of the population with confidence $1-\alpha$.

A $(p, 1-\alpha)$ upper tolerance limit is a $100(1-\alpha)\%$ upper confidence limit for the $100p$th percentile of the population of interest.

For example, let $Q_{.90}$ denote the 90th percentile of a population, and let $U(\mathbf X)$ be a 95% upper confidence limit for $Q_{.90}$. Note that

90% of the population $\le Q_{.90} \le U(\mathbf X)$ with confidence 95%.

So at least 90% of the population is less than or equal to $U(\mathbf X)$ with confidence 95%.

Similarly, we can argue that the $(p, 1-\alpha)$ lower tolerance limit is a $100(1-\alpha)\%$ lower confidence limit for the $100(1-p)$th percentile of the population of interest.

Two-Sided Tolerance Intervals

Construction of one-sided tolerance limits simplifies to finding one-sided confidence limits for appropriate percentiles of the population. Thus, the problem simplifies to interval estimation of some parametric function.

A $(p, 1-\alpha)$ two-sided tolerance interval $(L(\mathbf X), U(\mathbf X))$ contains at least a proportion $p$ of the population with confidence $1-\alpha$, i.e.,

$$P_{\mathbf X}\{\text{proportion of population in } (L(\mathbf X), U(\mathbf X)) \ge p\} = 1-\alpha,$$

or equivalently,

$$P_{\mathbf X}\left\{P_X\left[L(\mathbf X) \le X \le U(\mathbf X) \mid \mathbf X\right] \ge p\right\} = 1-\alpha.$$

We notice here that the computation of $L(\mathbf X)$ and $U(\mathbf X)$ does not reduce to the computation of confidence limits for certain percentiles.

Normal Distribution: One-Sided Tolerance Limits

Let $X_1, ..., X_n$ be a sample from a $N(\mu, \sigma^2)$ population with unknown mean $\mu$ and unknown variance $\sigma^2$. The sample mean $\bar X$ and sample variance $S^2$ are defined by

$$\bar X = \frac{1}{n}\sum_{i=1}^{n} X_i \quad \text{and} \quad S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar X)^2.$$

We shall describe the computation of one-sided tolerance limits based on $\bar X$ and $S^2$ for a normal population.

The $100p$th percentile of $N(\mu, \sigma^2)$ is

$$q_p = \mu + z_p \sigma,$$

where $z_p$ is the $100p$th percentile of the standard normal distribution. A $1-\alpha$ upper confidence limit for $q_p$ is a $(p, 1-\alpha)$ one-sided upper tolerance limit for the normal population.

(.90, .95) upper tolerance limit is a 95% upper confidence limit for µ + z.90σ.


(.90, .95) lower tolerance limit is a 95% lower confidence limit for µ − z.90σ.

The $(p, 1-\alpha)$ upper tolerance limit is taken to be of the form $\bar X + k_1 S$, where $k_1$ is referred to as the tolerance factor and is determined so that

$$P(\bar X + k_1 S \ge \mu + z_p \sigma) = 1 - \alpha.$$

It can be shown that

$$\frac{\mu + z_p\sigma - \bar X}{S} \sim \frac{1}{\sqrt n}\, t_{n-1}(z_p \sqrt n),$$

where $t_m(\delta)$ is the noncentral $t$ distribution with df $= m$ and noncentrality parameter $\delta$, and $t_{m;q}(\delta)$ denotes its $q$ quantile. Therefore,

$$\frac{\mu + z_p\sigma - \bar X}{S} \le \frac{1}{\sqrt n}\, t_{n-1;1-\alpha}(z_p \sqrt n) \quad \text{with probability } 1-\alpha.$$

Rearranging the terms, we find

$$\mu + z_p\sigma \le \bar X + k_1 S \quad \text{with probability } 1-\alpha, \quad \text{where } k_1 = \frac{1}{\sqrt n}\, t_{n-1;1-\alpha}(z_p \sqrt n).$$

Similarly, it can be shown that

$$\bar X - k_1 S$$

is a $(p, 1-\alpha)$ lower tolerance limit for the $N(\mu, \sigma^2)$ distribution.
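The factor $k_1$ can be checked numerically without special software. The sketch below is only an illustration, not the method used by StatCalc: it assumes the representation $T = (Z + \delta)/S^*$ with $S^* = \sqrt{\chi^2_m/m}$, so that $P(T \le x) = E[\Phi(xS^* - \delta)]$, and evaluates the expectation by a midpoint rule. The function names (`scaled_chi_grid`, `nct_cdf`, `nct_quantile`, `k1_factor`) and the grid size are hypothetical choices made for this example.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def scaled_chi_grid(m, cells=4000):
    """Midpoint grid and normalized weights for S* = sqrt(chi2_m / m)."""
    upper = 1.0 + 12.0 / math.sqrt(2.0 * m)  # far beyond the bulk of S*
    h = upper / cells
    points = [(i + 0.5) * h for i in range(cells)]
    # log of the density constant 2 (m/2)^(m/2) / Gamma(m/2)
    log_c = math.log(2.0) + 0.5 * m * math.log(0.5 * m) - math.lgamma(0.5 * m)
    weights = [
        math.exp(log_c + (m - 1) * math.log(s) - 0.5 * m * s * s) * h
        for s in points
    ]
    total = sum(weights)
    return points, [w / total for w in weights]


def nct_cdf(x, delta, grid):
    """P(T <= x) for T ~ noncentral t, using P(T <= x) = E[Phi(x S* - delta)]."""
    points, weights = grid
    return sum(w * STD_NORMAL.cdf(x * s - delta) for s, w in zip(points, weights))


def nct_quantile(q, m, delta):
    """q quantile of the noncentral t(m, delta) distribution by bisection."""
    grid = scaled_chi_grid(m)
    lo, hi = delta - 100.0, delta + 100.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if nct_cdf(mid, delta, grid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def k1_factor(n, p, conf):
    """One-sided (p, conf) tolerance factor for a normal sample of size n."""
    delta = STD_NORMAL.inv_cdf(p) * math.sqrt(n)
    return nct_quantile(conf, n - 1, delta) / math.sqrt(n)


k1 = k1_factor(15, 0.90, 0.95)
print(round(k1, 3))
```

For $n = 15$, $p = 0.90$ and $1-\alpha = 0.95$ the result should agree closely with the factor 2.068 used in Example 1.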

Normal Distribution: Equal-Tailed TI

A $(p, 1-\alpha)$ equal-tailed tolerance interval $(L, U)$ is constructed so that it will include the interval

$$\left(\mu - z_{\frac{1+p}{2}}\,\sigma,\; \mu + z_{\frac{1+p}{2}}\,\sigma\right)$$

with probability $1-\alpha$. Note that the interval is determined so that

no more than a proportion $\frac{1-p}{2}$ of the population is $< L$ and

no more than a proportion $\frac{1-p}{2}$ of the population is $> U$.

In many applications, such a restriction is not needed, and so we shall consider only two-sided TIs in the sequel.

Normal Distribution: Two-Sided Tolerance Intervals

A two-sided tolerance interval: $\bar X \pm k_2 S$.

$k_2$ is determined such that the interval would contain at least a proportion $p$ of the normal population with confidence $1-\alpha$:

$$P_{\bar X, S}\left\{P_X\left(\bar X - k_2 S \le X \le \bar X + k_2 S \mid \bar X, S\right) \ge p\right\} = 1-\alpha,$$

where $X \sim N(\mu, \sigma^2)$, independently of $\bar X$ and $S$. The computation of the tolerance factor $k_2$ is numerically involved. Software packages such as StatCalc can be used.

We can approximate $k_2$ as

$$k_2 \simeq \left(\frac{m\,\chi^2_{1;p}(1/n)}{\chi^2_{m;\alpha}}\right)^{1/2}, \qquad (1)$$

where $m = n-1$, $\chi^2_{m;\alpha}(\delta)$ denotes the $\alpha$ quantile of a noncentral chi-square distribution with df $m$ and noncentrality parameter $\delta$, and $\chi^2_{m;\alpha}$ is the corresponding central quantile.

Normal Distribution: Example 1 (Air Lead Level)

In this example, we would like to assess the air lead level in a laboratory. The data in Table 2.1 represent air lead levels collected by the National Institute for Occupational Safety and Health (NIOSH) at a laboratory, for health hazard evaluation. The air lead levels were collected from 15 different areas within the facility.

Table 2.1 Air lead levels (µg/m³)
200 120 15 7 8 6 48 61 380 80 29 1000 350 1400 110

Log-transformed lead levels fit a normal distribution (that is, the sample is from a lognormal distribution).

We compute an upper tolerance limit based on the log-transformed data in order to assess the maximum air lead level in the laboratory. The sample mean and standard deviation of the log-transformed data are $\bar x = 4.333$ and $s = 1.739$. To compute a (0.90, 0.95) upper tolerance limit for the air lead level, the tolerance factor is $k_1 = 2.068$, and

$$\bar x + k_1 s = 4.333 + 2.068(1.739) = 7.929.$$

Thus, exp(7.929) = 2777 is a (0.90, 0.95) upper tolerance limit for the air lead levels.
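The arithmetic of this example can be reproduced from the raw data. The following is a minimal sketch that takes the factor 2.068 from the text as given; the variable names are illustrative. Because it uses unrounded summary statistics, the back-transformed limit differs slightly from 2777.

```python
import math
import statistics

# Air lead levels (micrograms per cubic meter) from the 15 sampled areas
lead = [200, 120, 15, 7, 8, 6, 48, 61, 380, 80, 29, 1000, 350, 1400, 110]

logs = [math.log(v) for v in lead]
xbar = statistics.mean(logs)   # sample mean of the log-transformed data
s = statistics.stdev(logs)     # sample standard deviation (divisor n - 1)

k1 = 2.068                     # (0.90, 0.95) factor quoted in the text
utl_log = xbar + k1 * s        # upper tolerance limit on the log scale
utl = math.exp(utl_log)        # back-transformed to the original scale

OEL = 50.0
print(round(xbar, 3), round(s, 3), round(utl_log, 3), round(utl))
print("exceeds OEL" if utl > OEL else "within OEL")
```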

The occupational exposure limit (OEL) for lead exposure set by the Occupational Safety and Health Administration (OSHA) is 50 µg/m³. A workplace is considered safe if an upper tolerance limit does not exceed the OEL. In this case, the upper limit of 2777 far exceeds the OEL; hence we cannot conclude that the workplace is safe.

Normal Distribution: Assessing Survival Probability

In many applications it is desired to estimate the probability that a random variable exceeds a specified value. For example, in lifetime data analysis, it is of interest to assess the probability that the lifetime of an item exceeds a value (survival probability). In industrial hygiene, it is of interest to estimate the probability that the exposure level (level of exposure to a contaminant in a workplace) of a worker exceeds the occupational exposure limit (OEL). This is referred to as the exceedance probability. To assess the lifetime of an item, a lower confidence limit for the survival probability is warranted, and to assess the exposure level in a workplace, one needs an upper confidence limit for the exceedance probability. Confidence limits for these probabilities can be easily deduced from suitable one-sided tolerance limits.

A Lower Confidence Limit for $S_t = P(X > t)$: A $1-\alpha$ one-sided lower confidence limit for $S_t = P(X > t)$ is the value of $p$ for which the $(p, 1-\alpha)$ lower tolerance limit $= t$.

That is, the $1-\alpha$ lower confidence limit for $P(X > t)$ is the value of $p$ that satisfies

$$\bar X - \frac{1}{\sqrt n}\, t_{n-1;1-\alpha}(z_p \sqrt n)\, S = t, \qquad (2)$$

or equivalently, $p$ is determined so that

$$t_{n-1;1-\alpha}(z_p \sqrt n) = \frac{\bar X - t}{S/\sqrt n}.$$

As $t_{n-1;1-\alpha}(z_p \sqrt n)$ is increasing in $p$, a root finding method can be used to find the value of $p$ that satisfies (2).

Normal Distribution: Assessing Exceedance Probability

An Upper Confidence Limit for $P(X > t)$: If $p_*$ is a lower confidence limit for $P(X \le t)$, then $1 - p_*$ is an upper confidence limit for $1 - P(X \le t) = P(X > t)$.

A $1-\alpha$ lower confidence limit for $P(X \le t)$ is the $p_*$ that satisfies

$(p_*, 1-\alpha)$ upper tolerance limit $= t$.

That is,

$$\bar X + \frac{1}{\sqrt n}\, t_{n-1;1-\alpha}(z_{p_*} \sqrt n)\, S = t \iff t_{n-1;1-\alpha}(z_{p_*} \sqrt n) = \frac{t - \bar X}{S/\sqrt n}. \qquad (3)$$

The desired $1-\alpha$ upper confidence limit for $P(X > t)$ is given by $1 - p_*$.

Normal Distribution: Example 1 (Air Lead Level)

Test for the Exceedance Probability: A workplace could be considered in compliance if the exceedance probability is less than 0.05. For the air lead level example, the exceedance probability is $P(X > \ln(\text{OEL}))$ with OEL = 50, and we shall find a 95% upper CL for this probability. Setting the $(p, .95)$ upper tolerance limit equal to $\ln(\text{OEL})$, we have

$$t_{n-1;0.95}(z_{p_*} \sqrt n) = \frac{\ln(50) - \bar x}{s/\sqrt n} \iff t_{14;0.95}(z_{p_*} \sqrt{15}) = -0.9376,$$

where $n = 15$, $\bar x = 4.333$ and $s = 1.739$.

Now, using StatCalc, we get $z_{p_*}\sqrt{15} = -2.5913$ or $p_* = 0.252$, which is a 95% lower confidence limit for $P(X \le \ln(50))$.

Thus, $1 - p_* = 1 - 0.252 = 0.748$ is a 95% upper confidence limit for $P(X > \ln(50))$.

For the workplace to be in compliance, the exceedance probability should not exceed .05. As the upper confidence limit .748 is not less than .05, we can’t conclude that the facility is in compliance. Exposure monitoring should be improved by taking appropriate safety measures.
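The root finding step for $p_*$ can be sketched as follows. This is an illustration under stated assumptions rather than the StatCalc routine: the noncentral $t$ distribution function is evaluated through $P(T \le x) = E[\Phi(xS^* - \delta)]$ with $S^* = \sqrt{\chi^2_m/m}$, using a midpoint rule, and the noncentrality parameter is found by bisection. The names `scaled_chi_grid`, `nct_cdf` and `exceedance_upper_cl` are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def scaled_chi_grid(m, cells=4000):
    """Midpoint grid and normalized weights for S* = sqrt(chi2_m / m)."""
    upper = 1.0 + 12.0 / math.sqrt(2.0 * m)
    h = upper / cells
    points = [(i + 0.5) * h for i in range(cells)]
    log_c = math.log(2.0) + 0.5 * m * math.log(0.5 * m) - math.lgamma(0.5 * m)
    weights = [
        math.exp(log_c + (m - 1) * math.log(s) - 0.5 * m * s * s) * h
        for s in points
    ]
    total = sum(weights)
    return points, [w / total for w in weights]


def nct_cdf(x, delta, grid):
    """P(T <= x) for T ~ noncentral t with noncentrality delta."""
    points, weights = grid
    return sum(w * STD_NORMAL.cdf(x * s - delta) for s, w in zip(points, weights))


def exceedance_upper_cl(xbar, s, n, threshold, conf=0.95):
    """Upper confidence limit for P(X > threshold) and the solved delta."""
    grid = scaled_chi_grid(n - 1)
    t_obs = (threshold - xbar) / (s / math.sqrt(n))
    # Solve P(T_{n-1}(delta) <= t_obs) = conf; the cdf decreases in delta.
    lo, hi = t_obs - 50.0, t_obs + 50.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if nct_cdf(t_obs, mid, grid) > conf:
            lo = mid
        else:
            hi = mid
    delta = 0.5 * (lo + hi)
    p_star = STD_NORMAL.cdf(delta / math.sqrt(n))  # delta = z_{p*} sqrt(n)
    return 1.0 - p_star, delta


upper_cl, delta = exceedance_upper_cl(4.333, 1.739, 15, math.log(50.0))
print(round(delta, 3), round(upper_cl, 3))
```

With the summary statistics of the air lead data, the output should be close to the values $-2.5913$ and $0.748$ reported above.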

Normal Distribution: Example 2

Filling machine monitoring: A machine is set to fill a liter of milk in plastic containers. At the end of a shift operation, a sample of 20 containers was selected, and the actual amount of milk in each container was measured using an accurate method. The measurements are given in Table 1.

Table 1. Actual amount of milk (in liters) in containers
0.968 0.982 1.030 1.003 1.046 1.020 0.997 1.010 1.027 1.010
0.973 1.000 1.044 0.995 1.020 0.993 0.984 0.981 0.997 0.992

A normal model fits the data very well. The sample statistics: $\bar x = 1.0036$ and $s = 0.0221085$.

(0.95, 0.95) tolerance interval

The two-sided tolerance factor $k_2$ is 2.760, and the TI is

$$\bar x \pm k_2 s = 1.0036 \pm 2.760(0.0221085) = 1.0036 \pm 0.0610.$$

$\Rightarrow$ At least 95% of containers contain between 0.9426 and 1.0646 liters of milk, with confidence 95%.
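The factor 2.760 is the exact value; approximation (1) can be evaluated with a few lines of code as a rough check. In the sketch below the central chi-square distribution function is computed from the series for the regularized incomplete gamma function, and the noncentral chi-square quantile with 1 df uses the fact that $(Z + \sqrt{\delta})^2$ has that distribution. The helper names (`chi2_cdf`, `solve_increasing`, `k2_approx`) are made up for this illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def chi2_cdf(x, df):
    """Central chi-square cdf via the series for the lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, h = 0.5 * df, 0.5 * x
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= h / (a + k)
        total += term
        k += 1
    return total * math.exp(-h + a * math.log(h) - math.lgamma(a))


def solve_increasing(f, lo, hi, target, iters=100):
    """Bisection for f(x) = target when f is increasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def chi2_quantile(q, df):
    upper = df + 10.0 * math.sqrt(2.0 * df) + 20.0
    return solve_increasing(lambda x: chi2_cdf(x, df), 0.0, upper, q)


def ncchi2_1df_quantile(q, ncp):
    """q quantile of (Z + sqrt(ncp))^2, a noncentral chi-square with 1 df."""
    d = math.sqrt(ncp)
    root = solve_increasing(
        lambda r: STD_NORMAL.cdf(r - d) - STD_NORMAL.cdf(-r - d), 0.0, 10.0 + d, q
    )
    return root * root


def k2_approx(n, p, conf):
    """Approximation (1) to the two-sided (p, conf) normal tolerance factor."""
    m = n - 1
    return math.sqrt(m * ncchi2_1df_quantile(p, 1.0 / n) / chi2_quantile(1.0 - conf, m))


k2 = k2_approx(20, 0.95, 0.95)
xbar, s = 1.0036, 0.0221085
print(round(k2, 3), round(xbar - k2 * s, 4), round(xbar + k2 * s, 4))
```

For $n = 20$ the approximate factor is slightly smaller than the exact 2.760, and the resulting interval differs from (0.9426, 1.0646) only in the fourth decimal place.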

Exponential Distribution

The probability density function of a two-parameter exponential distribution is given by

$$f(x \mid a, b) = \frac{1}{b} \exp\left(-(x - a)/b\right), \quad x > a,\; b > 0. \qquad (4)$$

Let $X_1, ..., X_n$ be a sample. The maximum likelihood estimators of $a$ and $b$ are given by

$$\hat a = X_{(1)} \quad \text{and} \quad \hat b = \frac{1}{n}\sum_{i=1}^{n} (X_i - X_{(1)}) = \bar X - X_{(1)}, \qquad (5)$$

where $X_{(1)}$ is the smallest of the $X_i$'s. The MLEs $\hat a$ and $\hat b$ are independent with

$$\frac{2n(\hat a - a)}{b} \sim \chi^2_2 \quad \text{and} \quad \frac{2n\hat b}{b} \sim \chi^2_{2n-2} \qquad (6)$$

[see Lawless (1982), Section 3.5].

Exponential Distribution: One-Sided TLs

Pivotal Quantity: The $100p$th percentile of the exponential$(a, b)$ distribution is given by $q_p = a - \ln(1-p)\, b$.

We shall find a pivotal quantity for estimating $a + cb$, where $c$ is a known constant. Using the distributional results in (6), we find

$$\frac{a + cb - \hat a}{\hat b} = \frac{cb - (\hat a - a)}{\hat b} = \frac{c - (\hat a - a)/b}{\hat b / b} \sim \frac{2nc - \chi^2_2}{\chi^2_{2n-2}} = f_{n,c}, \text{ say}, \qquad (7)$$

where the chi-square random variables are independent. For $0 < \alpha < .5$, let $f_{n,c;\alpha}$ denote the $\alpha$ quantile of $f_{n,c}$. Then

$$\left(\hat a + f_{n,c;\alpha}\,\hat b,\; \hat a + f_{n,c;1-\alpha}\,\hat b\right) \qquad (8)$$

is an exact $1 - 2\alpha$ confidence interval for $a + cb$.

Recall that the $100p$th percentile of the exponential$(a, b)$ distribution is $q_p = a - \ln(1-p)\, b$. The pivotal quantity for $q_p$ is given by

$$\frac{a - \ln(1-p)\, b - \hat a}{\hat b} \sim -\frac{2n \ln(1-p) + \chi^2_2}{\chi^2_{2n-2}} = -E_p, \text{ say}. \qquad (9)$$

For $0 < \alpha < .5$, let $E_{p;\alpha}$ denote the $\alpha$ quantile of $E_p$. Then

$$\hat a - E_{p;\alpha}\,\hat b \qquad (10)$$

is an exact $1-\alpha$ upper confidence limit for $q_p$, or $(p, 1-\alpha)$ upper tolerance limit for the exponential$(a, b)$ distribution. As the distribution of $E_p$ does not depend on any unknown parameters, the percentiles can be estimated by simulation.

The $100(1-p)$th percentile of an exponential distribution is

$$q_{1-p} = a - \ln(p)\, b.$$

Let $E_{1-p;1-\alpha}$ denote the $100(1-\alpha)$th percentile of

$$E_{1-p} = \frac{2n\ln(p) + \chi^2_2}{\chi^2_{2n-2}}.$$

Then

$$\hat a - E_{1-p;1-\alpha}\,\hat b$$

is an exact $(p, 1-\alpha)$ lower tolerance limit for the exponential$(a, b)$ distribution.

Exponential Distribution: Estimation of P(X > t)

Let $t$ denote the specified time at which we would like to estimate the survival probability

$$S_t = P(X > t \mid a, b) = 1 - F(t \mid a, b) = \exp\left(-\frac{t - a}{b}\right).$$

The value of $p$ for which the $(p, 1-\alpha)$ lower tolerance limit is equal to $t$ is the $1-\alpha$ lower confidence limit for $P(X > t)$. That is, for given MLEs $\hat a$ and $\hat b$, $p$ is determined so that

$$\hat a - E_{1-p;1-\alpha}\,\hat b = t \iff \frac{\hat a - t}{\hat b} = E_{1-p;1-\alpha}.$$

The value of $p$ that satisfies the equation is the $100\alpha$th percentile of

$$\exp\left(-\frac{1}{2n} A\right), \quad \text{with } A = \frac{t - \hat a}{\hat b}\,\chi^2_{2n-2} + \chi^2_2.$$

Finally, the $1-\alpha$ lower confidence limit for $P(X > t)$ is expressed as

$$\exp\left(-\frac{1}{2n} A_{1-\alpha}\right), \qquad (11)$$

where $A_q$ is the $q$ quantile of $A$. Note that, for given MLEs $\hat a$ and $\hat b$, the distribution of $A$ does not depend on any unknown parameters, and so its percentiles can be estimated by simulation.

Let $\hat\eta_t = \frac{t - \hat a}{\hat b}$. The $1-\alpha$ quantile of $A$ can be approximated by

$$A_{1-\alpha} \simeq \hat\eta_t(2n-2) + 2 + \left\{\hat\eta_t^2\,(2n - 2 - U^*)^2 + (2 - \chi^2_{2;1-\alpha})^2\right\}^{1/2}, \qquad (12)$$

where $U^* = \chi^2_{2n-2;1-\alpha}$ if $\hat\eta_t > 0$, and is $\chi^2_{2n-2;\alpha}$ otherwise. This approximation is quite satisfactory even for samples of sizes as small as three [Krishnamoorthy, 2014].

Exponential Distribution: Example 3

The data in Table 1 represent the failure mileage of 19 military carriers. The probability plot by Krishnamoorthy and Mathew (2009, Example 7.3) indicated that data fit a two-parameter exponential distribution.

Table 1: Failure mileages of 19 military carriers

162 200 271 302 393 508 539 629 706 777 884 1008 1101 1182 1463 1603 1984 2355 2880

The MLEs based on the data are $\hat a = X_{(1)} = 162$ and $\hat b = 835.21$.

In this type of problem, it is of interest to find a lower tolerance limit to judge the minimum life span of a product. So we shall compute a (.95, .95) lower tolerance limit for the failure mileage distribution. The (.95, .95) lower tolerance factor based on 1,000,000 simulation runs is $-0.119$, and the lower tolerance limit is

$$162 - .119 \times 835.21 = 62.61.$$

This means that at least 95% of military carriers will last at least 62.61 units of miles, with confidence 95%.
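The simulation behind the tolerance factor can be sketched as below. It draws from $E_{1-p} = (2n\ln p + \chi^2_2)/\chi^2_{2n-2}$ and takes the empirical $1-\alpha$ quantile. The number of runs (200,000 rather than 1,000,000), the seed and the function name `exp_lower_factor` are arbitrary choices for this illustration, so the last digits may differ from the values quoted above.

```python
import math
import random


def exp_lower_factor(n, p, conf, runs=200_000, seed=1):
    """Simulated (1 - alpha) quantile of E_{1-p} for a two-parameter exponential."""
    rng = random.Random(seed)
    c = 2.0 * n * math.log(p)
    draws = sorted(
        # chi-square with k df is gamma(shape k/2, scale 2)
        (c + rng.gammavariate(1.0, 2.0)) / rng.gammavariate(n - 1.0, 2.0)
        for _ in range(runs)
    )
    return draws[math.ceil(conf * runs) - 1]


mileage = [162, 200, 271, 302, 393, 508, 539, 629, 706, 777,
           884, 1008, 1101, 1182, 1463, 1603, 1984, 2355, 2880]

n = len(mileage)
a_hat = min(mileage)                 # MLE of the threshold a
b_hat = sum(mileage) / n - a_hat     # MLE of the scale b

factor = exp_lower_factor(n, 0.95, 0.95)
lower_limit = a_hat - factor * b_hat  # (.95, .95) lower tolerance limit
print(round(b_hat, 2), round(factor, 3), round(lower_limit, 1))
```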

Suppose it is desired to find a 95% lower confidence limit for the probability that a military carrier lasts 300 or more units of miles, that is, $P(X > 300)$, where $X$ represents the failure mileage of a military carrier. Recall that $\hat a = 162$ and $\hat b = 835.21$. The 95% lower confidence limit based on (11) with 1,000,000 simulation runs is 0.719. This means that at least 72% of military carriers work 300 units of mileage or more, with confidence 95%.

To compute the approximate lower confidence limit based on (12), we found $U^* = \chi^2_{36;.95} = 50.998$, $\chi^2_{2;.95} = 5.991$, $\hat\eta_t = .1652$ and $A_{.95} = 12.646$. So the approximate 95% lower confidence limit for $P(X > 300)$ is $\exp(-12.646/2/19) = .717$, which is very close to the one based on simulation.
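Both calculations can be sketched in a few lines. The simulation follows (11) with fewer runs than in the text (200,000, with an arbitrary seed), and the approximation follows (12) with the chi-square quantile $U^* = 50.998$ taken from the text as an input; only the 2 df quantile is computed, using its closed form $-2\ln\alpha$. Function names are hypothetical, and the approximate version as written covers only the case $\hat\eta_t > 0$.

```python
import math
import random


def survival_lower_cl_sim(t, a_hat, b_hat, n, conf, runs=200_000, seed=1):
    """Lower confidence limit for P(X > t) from simulated quantiles of A."""
    rng = random.Random(seed)
    eta = (t - a_hat) / b_hat
    draws = sorted(
        eta * rng.gammavariate(n - 1.0, 2.0) + rng.gammavariate(1.0, 2.0)
        for _ in range(runs)
    )
    a_q = draws[math.ceil(conf * runs) - 1]
    return math.exp(-a_q / (2.0 * n))


def survival_lower_cl_approx(t, a_hat, b_hat, n, conf, u_star):
    """Approximation (12); u_star is the chi-square(2n-2) quantile, eta > 0."""
    eta = (t - a_hat) / b_hat
    chi2_2_q = -2.0 * math.log(1.0 - conf)  # exact chi-square quantile, 2 df
    a_q = (eta * (2 * n - 2) + 2.0
           + math.sqrt(eta ** 2 * (2 * n - 2 - u_star) ** 2 + (2.0 - chi2_2_q) ** 2))
    return math.exp(-a_q / (2.0 * n))


cl_sim = survival_lower_cl_sim(300.0, 162.0, 835.21, 19, 0.95)
cl_approx = survival_lower_cl_approx(300.0, 162.0, 835.21, 19, 0.95, u_star=50.998)
print(round(cl_sim, 3), round(cl_approx, 3))
```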

Nonparametric Tolerance Intervals

If a sample is from a continuous population, and does not fit a parametric model, or fits a parametric model for which tolerance intervals are difficult to obtain, then one may seek nonparametric tolerance intervals for an intended application. NP procedures are applicable to find tolerance limits for any continuous population. However, for some sample sizes, a nonparametric tolerance interval that satisfies specified content and coverage requirements may not exist. Furthermore, nonparametric tolerance intervals are typically wider than their parametric counterparts.

Nonparametric Tolerance Intervals: Introduction

The nonparametric methods are based on Wilks' (1941) result that if a sample is from a continuous distribution, then the distribution of the proportion of the population between two order statistics is independent of the population sampled, and is a function of only the particular order statistics chosen.

Let $\mathbf X = (X_1, ..., X_n)$ be a random sample from a continuous distribution $F_X(x)$, and let $X_{(1)} < ... < X_{(n)}$ be the order statistics for the sample. Recall that a $(p, 1-\alpha)$ tolerance interval $(L(\mathbf X), U(\mathbf X))$ is such that

$$P_{\mathbf X}\left\{P_X\left[L(\mathbf X) \le X \le U(\mathbf X) \mid \mathbf X\right] \ge p\right\} = 1-\alpha,$$

where $X$ also follows the same continuous distribution $F_X$, independently of the sample $\mathbf X$.

Wilks' result allows us to choose $L(\mathbf X) = X_{(r)}$ and $U(\mathbf X) = X_{(s)}$, $r < s$, and the problem is to determine the values of $r$ and $s$ so that

$$P_{X_{(r)}, X_{(s)}}\left\{P_X\left[X_{(r)} \le X \le X_{(s)} \mid X_{(r)}, X_{(s)}\right] \ge p\right\} = 1-\alpha. \qquad (13)$$

One-sided tolerance limits are defined similarly. For example, the $r$th order statistic is a $(p, 1-\alpha)$ lower tolerance limit if

$$P_{X_{(r)}}\left\{P_X\left[X \ge X_{(r)} \mid X_{(r)}\right] \ge p\right\} = 1-\alpha. \qquad (14)$$

Nonparametric Tolerance Intervals: One-Sided

Let $X_1, ..., X_n$ be a sample from a continuous distribution $F_X$. To construct a nonparametric $(p, 1-\alpha)$ lower tolerance limit, we need to find the positive integer $r$ so that

$$P_{X_{(r)}}\left[P_X\left(X \ge X_{(r)} \mid X_{(r)}\right) \ge p\right] = 1-\alpha.$$

The value of $r$ that satisfies the above requirement is the largest integer so that

$$\sum_{i=r}^{n} \binom{n}{i} (1-p)^i p^{n-i} \ge 1-\alpha. \qquad (15)$$

The $r$th order statistic $X_{(r)}$ is the desired $(p, 1-\alpha)$ lower tolerance limit.

The $r$th largest order statistic, that is, $X_{(n-r+1)}$, is the $(p, 1-\alpha)$ upper tolerance limit.

As an example, let $n = 50$, $p = 0.90$ and $1-\alpha = 0.95$. Then

r                                   0     1      2      3     ....
P(X ≥ r | n = 50, 1 − p = .10)      1     .995   .966   .888  ....

$X_{(2)}$ is the (.90, .95) one-sided lower tolerance limit, and $X_{(49)}$ is the (.90, .95) upper tolerance limit.
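The search for $r$ in (15) is a short loop over binomial tail probabilities. The sketch below is a minimal illustration; the function names are hypothetical, and returning 0 when no order statistic qualifies is a convention chosen here, not something stated in the text.

```python
from math import comb


def binom_upper_tail(r, n, q):
    """P(X >= r) for X ~ binomial(n, q)."""
    return sum(comb(n, i) * q ** i * (1.0 - q) ** (n - i) for i in range(r, n + 1))


def lower_limit_rank(n, p, conf):
    """Largest r with P(X >= r) >= conf, X ~ binomial(n, 1 - p); 0 if none."""
    r = 0
    while r + 1 <= n and binom_upper_tail(r + 1, n, 1.0 - p) >= conf:
        r += 1
    return r


n, p, conf = 50, 0.90, 0.95
r = lower_limit_rank(n, p, conf)
tail_probs = [round(binom_upper_tail(i, n, 1.0 - p), 3) for i in range(4)]
print(tail_probs)
print("lower limit: X(%d), upper limit: X(%d)" % (r, n - r + 1))
```

A return value of 0 signals that the sample is too small for the requested content and confidence, which is the situation Table 2 is designed to avoid.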

Nonparametric Tolerance Intervals: Two-Sided

To construct a $(p, 1-\alpha)$ nonparametric tolerance interval, we have to determine a pair of order statistics $X_{(r)}$ and $X_{(s)}$, $r < s$, so that

$$P_{X_{(r)}, X_{(s)}}\left\{P_X\left[X_{(r)} \le X \le X_{(s)} \mid X_{(r)}, X_{(s)}\right] \ge p\right\} = 1-\alpha. \qquad (16)$$

Equivalently, $r$ and $s$ are to be determined so that

$$P(X \le s - r - 1) \ge 1-\alpha, \qquad (17)$$

where $X \sim$ binomial$(n, p)$.

Let $k$ be the least value for which

$$P(X \le k) \ge 1-\alpha. \qquad (18)$$

Then any interval $(X_{(r)}, X_{(s)})$ is a $(p, 1-\alpha)$ tolerance interval, provided $1 \le r < s \le n$ and $s - r = k + 1$.

Suppose we choose $s = n - r + 1$. Then

$$s - r = k + 1 \iff n - 2r + 1 = k + 1 \iff r = \frac{n-k}{2}.$$

If $r$ is determined as above, then

$$\left(X_{(r)}, X_{(n-r+1)}\right) = (r\text{th smallest}, r\text{th largest})$$

is a $(p, 1-\alpha)$ tolerance interval.

Nonparametric TIs: Example 4

Assessing the percentage of shaft holes satisfying the tolerance specifications

Example: The following is a sample of 40 shaft-hole diameters (in mm) given in ascending order:

10.1450 10.2010 10.2105 10.2160 10.2165 10.2361 10.2751 10.2766
10.2806 10.2853 10.2871 10.2886 10.2971 10.2999 10.3057 10.3133
10.3171 10.3209 10.3242 10.3392 10.3404 10.3414 10.3417 10.3426
10.3428 10.3466 10.3615 10.3687 10.3701 10.3710 10.3825 10.3967
10.4214 10.4284 10.4463 10.4568 10.4784 10.5007 10.5892 10.6129

(.75, .95) Tolerance Interval: To determine the order statistics, the least value of $k$ for which $P(X \le k \mid n = 40, p = .75) \ge 0.95$ is $k = 34$, at which the probability is 0.9567. Also, $r = (40 - 34)/2 = 3$. Thus, $(X_{(r)}, X_{(n-r+1)}) = (X_{(3)}, X_{(38)}) = (10.2105, 10.5007)$ is the desired TI.
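The steps of this example can be sketched as follows. The function names are illustrative, and rounding $r = (n-k)/2$ down when $n - k$ is odd is an assumption of this sketch (it can only widen the interval), since the text treats only the even case.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k + 1))


def two_sided_ranks(n, p, conf):
    """Return (k, r, s) with s = n - r + 1, or None if no interval exists."""
    k = next(j for j in range(n + 1) if binom_cdf(j, n, p) >= conf)
    r = (n - k) // 2  # assumption: round down when n - k is odd
    if r < 1:
        return None
    return k, r, n - r + 1


diameters = [
    10.1450, 10.2010, 10.2105, 10.2160, 10.2165, 10.2361, 10.2751, 10.2766,
    10.2806, 10.2853, 10.2871, 10.2886, 10.2971, 10.2999, 10.3057, 10.3133,
    10.3171, 10.3209, 10.3242, 10.3392, 10.3404, 10.3414, 10.3417, 10.3426,
    10.3428, 10.3466, 10.3615, 10.3687, 10.3701, 10.3710, 10.3825, 10.3967,
    10.4214, 10.4284, 10.4463, 10.4568, 10.4784, 10.5007, 10.5892, 10.6129,
]

ordered = sorted(diameters)
k, r, s = two_sided_ranks(len(ordered), 0.75, 0.95)
interval = (ordered[r - 1], ordered[s - 1])  # order statistics are 1-based
print(k, r, s, interval, round(binom_cdf(k, 40, 0.75), 4))
```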

Figure 1: Probability plots of shaft hole diameters (normal Q-Q plot; horizontal axis: theoretical quantiles, vertical axis: sample quantiles)

Normal based tolerance intervals: The (.76, .95) two-sided tolerance factor is $k_2 = 1.4687$. The TI is

$$\bar X \pm k_2 S = 10.3419 \pm 1.4687 \times 0.097560 = (10.1986, 10.4852).$$

This means at least 76% of shaft holes are within the tolerance specification, with confidence 95%.

Nonparametric TIs: Sample Size for Two-Sided TIs

Table 2: Values of $n$ so that (a) $(X_{(1)}, X_{(n)})$ is a two-sided $(p, 1-\alpha)$ tolerance interval, (b) $X_{(1)}$ is a $(p, 1-\alpha)$ one-sided lower tolerance limit; equivalently, $X_{(n)}$ is a $(p, 1-\alpha)$ one-sided upper tolerance limit

                          1 − α
p      Interval type    0.80   0.90   0.95   0.99
0.80   one-sided           8     11     14     21
       two-sided          14     18     22     31
0.90   one-sided          16     22     29     44
       two-sided          29     38     46     64
0.95   one-sided          32     45     59     90
       two-sided          59     77     93    130
0.99   one-sided         161    230    299    459
       two-sided         299    388    473    662

Binomial Tolerance Intervals

Let $X \sim$ binomial$(n, \pi)$, where $n$ is the number of trials, and $\pi$ is the success probability. For a given $X = k$, the problem is to find a TI for a binomial$(m, \pi)$ distribution. Recall that one-sided tolerance limits are one-sided confidence bounds on appropriate quantiles. On the basis of the normal approximation to the quantity $\frac{Y - m\pi}{\sqrt{m\pi(1-\pi)}}$, where $Y \sim$ binomial$(m, \pi)$, the $p$ quantile of a binomial$(m, \pi)$ distribution is given by

$$k_p(\pi, m) \simeq m\pi + z_p \sqrt{m\pi(1-\pi)}.$$

Noting that the above quantile is an increasing function of $\pi$, an approximate $(p, 1-\alpha)$ upper tolerance limit for the binomial$(m, \pi)$ distribution can be obtained by replacing the $\pi$ in the above expression by a $1-\alpha$ upper confidence limit $\pi_u$. More specifically,

$$k_p(\pi_u, m) \simeq \left[m\pi_u + z_p \sqrt{m\pi_u(1-\pi_u)}\right], \qquad (20)$$

where $[x]$ is the integer nearest to $x$, is a $(p, 1-\alpha)$ upper tolerance limit for the binomial$(m, \pi)$ distribution.

Similarly, an approximate $(p, 1-\alpha)$ lower tolerance limit can be obtained as

$$k_p(\pi_l, m) \simeq \left[m\pi_l - z_p \sqrt{m\pi_l(1-\pi_l)}\right]. \qquad (21)$$

If $(\pi_l, \pi_u)$ is a $1-\alpha$ confidence interval for $\pi$, then

$$\left(\left[m\pi_l - z_{\frac{1+p}{2}} \sqrt{m\pi_l(1-\pi_l)}\right],\; \left[m\pi_u + z_{\frac{1+p}{2}} \sqrt{m\pi_u(1-\pi_u)}\right]\right) \qquad (22)$$

is an approximate $(p, 1-\alpha)$ equal-tailed TI.

Among all available CIs for $\pi$, the following score CI is the most popular in applications, and is also quite comparable with others:

$$(\pi_l, \pi_u) = \frac{\hat\pi + \frac{c^2}{2n}}{1 + \frac{c^2}{n}} \pm \frac{\frac{c}{\sqrt n}\sqrt{\hat\pi(1-\hat\pi) + c^2/(4n)}}{1 + \frac{c^2}{n}}, \qquad (23)$$

where $c = z_{1-\alpha/2}$ and $\hat\pi = k/n$ is the sample proportion.

Numerical studies by Krishnamoorthy et al. (2011) indicated that a $(p, 1-2\alpha)$ equal-tailed TI for a binomial distribution can be used as an approximate $(p, 1-\alpha)$ two-sided TI.

For example, if it is desired to find a (.90, .95) two-sided TI for a binomial distribution, then we simply find a (.90, .90) equal-tailed TI $(L, U)$, which satisfies the requirement that

$$P_{\mathbf X}\left\{P_X\left(L(\mathbf X) \le X \le U(\mathbf X) \mid \mathbf X\right) \ge p\right\} \simeq 1-\alpha.$$

Binomial Tolerance Intervals: Example

To illustrate the methods for constructing binomial tolerance intervals, we shall use the data posted on the NIST webpage; they represent fractions of defective chips in a sample of 21 wafers. A chip in a wafer is considered to be defective whenever a misregistration, in terms of horizontal and/or vertical distances from the center, is recorded. On each wafer, locations of 50 chips were measured and the proportion of defective chips was recorded.

Here the $n_i$'s are all equal to 50, the $\hat\pi_i$'s are the sample fractions of defective chips given in Table 3, and the overall proportion of defective chips is

$$\hat\pi = \frac{\sum_{i=1}^{21} n_i \hat\pi_i}{\sum_{i=1}^{21} n_i} = \frac{196}{1050} = 0.1867.$$

Table 3: Fractions of defective chips $\hat\pi_i$ in a sample of 21 wafers ($n_i = 50$, $i = 1, 2, ..., 21$)

.24 .10 .22 .16 .12 .18 .20
.24 .24 .14 .16 .14 .18 .20
.26 .28 .10 .18 .20 .26 .12

We shall compute (0.90, 0.95) one-sided as well as (0.90, 0.95) two-sided TIs for the binomial$(m, \pi)$ distribution. Towards this, we note that $n = \sum_{i=1}^{21} n_i = 1050$, $k$ = the total number of defective chips, which is 196, and $m = 50$.

To find the TIs (20), (21) and (22) using the score CIs for $\pi$, the necessary normal percentiles are $z_{.90} = 1.282$ and $z_{.95} = 1.645$. The 90% score CI (using $\hat\pi = 0.1867$ and $n = 1050$ in (23)) is (0.1677, 0.2072). Using the above CI in (22), we get the (0.90, 0.95) tolerance interval to be the interval [4, 15]. The exact tolerance interval based on a numerically involved method is also [4, 15].
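The computation via (22) and (23) can be sketched as follows. The function names are hypothetical, and clamping the limits to the range 0 to $m$ is a safeguard added for this illustration rather than part of the formulas.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def score_ci(k, n, conf):
    """Score confidence interval (23) for a binomial proportion."""
    c = STD_NORMAL.inv_cdf(1.0 - (1.0 - conf) / 2.0)
    pi_hat = k / n
    denom = 1.0 + c * c / n
    center = (pi_hat + c * c / (2.0 * n)) / denom
    half = (c / math.sqrt(n)) * math.sqrt(pi_hat * (1.0 - pi_hat) + c * c / (4.0 * n)) / denom
    return center - half, center + half


def nearest_int(x):
    """[x]: the integer nearest to x."""
    return math.floor(x + 0.5)


def equal_tailed_ti(k, n, m, p, conf):
    """Approximate (p, conf) equal-tailed TI (22) for binomial(m, pi)."""
    pi_l, pi_u = score_ci(k, n, conf)
    z = STD_NORMAL.inv_cdf((1.0 + p) / 2.0)
    low = nearest_int(m * pi_l - z * math.sqrt(m * pi_l * (1.0 - pi_l)))
    high = nearest_int(m * pi_u + z * math.sqrt(m * pi_u * (1.0 - pi_u)))
    return max(low, 0), min(high, m)  # clamping is an added safeguard


# A (.90, .95) two-sided TI is approximated by a (.90, .90) equal-tailed TI.
ci_90 = score_ci(196, 1050, 0.90)
ti = equal_tailed_ti(196, 1050, 50, 0.90, 0.90)
print(tuple(round(v, 4) for v in ci_90), ti)
```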

References

[1] Wilks, S. S. (1941). Determination of sample sizes for setting tolerance limits. Annals of Mathematical Statistics, 12, 91-96.
[2] Wilks, S. S. (1942). Statistical prediction with special reference to the problem of tolerance limits. Annals of Mathematical Statistics, 13, 400-409.
[3] Wald, A. (1943). An extension of Wilks' method for setting tolerance limits. Annals of Mathematical Statistics, 14, 45-55.
[4] Wald, A. and Wolfowitz, J. (1946). Tolerance limits for a normal distribution. Annals of Mathematical Statistics, 17, 208-215.
[5] Krishnamoorthy, K., Mathew, T. and Mukherjee, S. (2008). Normal based methods for a gamma distribution: prediction and tolerance interval and stress-strength reliability. Technometrics, 50, 69-78.
[6] Krishnamoorthy, K. and Mathew, T. (2009). Statistical Tolerance Regions: Theory, Applications and Computation. Wiley.
[7] Krishnamoorthy, K., Xia, Y. and Xie, F. (2011). A simple approximate procedure for constructing tolerance intervals for binomial and Poisson distributions. Communications in Statistics - Theory and Methods, 40, 2443-2458.

Thank You! Comments/Questions?