
Henry Ford Hospital Medical Journal

Volume 24 | Number 3 Article 7

9-1976

Use of the Weibull distribution for analysis of a clinical therapeutic study in rheumatoid arthritis

W. R. McCrum

J. T. Sharp

G. B. Bluhm


Recommended Citation McCrum, W. R.; Sharp, J. T.; and Bluhm, G. B. (1976) "Use of the Weibull distribution for analysis of a clinical therapeutic study in rheumatoid arthritis," Henry Ford Hospital Medical Journal : Vol. 24 : No. 3 , 173-182. Available at: https://scholarlycommons.henryford.com/hfhmedjournal/vol24/iss3/7

Henry Ford Hosp. Med. Journal Vol. 24, No. 3, 1976

Use of the Weibull distribution for analysis of a clinical therapeutic study in rheumatoid arthritis

W. R. McCrum, PhD *, J. T. Sharp, MD **, and G. B. Bluhm, MD ***

This paper introduces the use of the Weibull distribution function for the analysis of small samples of clinical data. It is compared with the use of the conventional analysis of variance in determining the results of a study comparing the use of gold salts and a placebo in the treatment of rheumatoid arthritis. Although the samples were small, 14 control patients and 13 who were treated, analysis of variance determined several significant differences between the two groups. The use of the Weibull distribution, however, not only confirmed these differences, but also determined several more differences between the two groups that were undetected by analysis of variance.

A brief description and discussion of the Weibull distribution function is presented. It includes a method for determining the Weibull parameters, and the use of these parameters in identifying unknown samples as belonging to more well-known distribution functions such as the normal, exponential and Chi Square. A method for comparing two samples using an integration of the sum of the alpha and beta errors is also presented.

Finally there is offered an explanation as to why the use of the Weibull distribution should be more sensitive in determining differences between small samples of data than more conventional methods of hypothesis testing.

* Department of Neurosurgery and Neurology, Henry Ford Hospital

** Department of Internal Medicine, Rheumatic Disease Section, Baylor College of Medicine, Houston, Texas, 77025

*** Division of Rheumatology, Henry Ford Hospital

Address reprint requests to Dr. McCrum at Henry Ford Hospital, 2799 West Grand Boulevard, Detroit MI 48202

173 Analysis of clinical data

In 1951 Waloddi Weibull described a simple distribution function with the only conditions that it be non-zero and non-decreasing.¹ He said the function had wide applicability and presented seven examples of the use of this function in [...]. Most of the examples were in biostatistics. In one example (the fatigue life of steel) the independent variable was time; subsequently this use of the function as a [...] distribution has been found most useful in reliability testing. It is also finding increasing acceptance in engineering and physics for use in other than life testing procedures. Johnson and Leone² refer to "Weibullizing" a variety of distribution functions such as the normal, exponential, beta, gamma and Chi Square. However, its use has been almost totally neglected in the field of biostatistics, as described in his remaining examples. The purpose of this paper is to show that this function is useful in evaluating clinical data and has theoretical advantages over the classical statistical hypothesis-testing techniques which assume a normal distribution.

To illustrate these points the Weibull distribution is compared with analysis of variance in evaluating a double blind trial of gold therapy in rheumatoid arthritis.

Gold salts have been used to treat rheumatoid arthritis for many years. However, there are few studies on the effectiveness of this treatment. Recently Sigler et al³ reported that some features of the disease were improved by treatment with gold salts, using analysis of variance for data evaluation. It was recognized that the use of "classical" statistics such as analysis of variance or Student's "t" test was perhaps inappropriate for evaluating small samples of 13 and 14 data points. For this reason the data were re-evaluated using the Weibull distribution function. To our knowledge this is the first application of the Weibull distribution function to a therapeutic trial.

Methods

A. Clinical

Twenty-seven patients meeting the ARA criteria for classical rheumatoid arthritis, and with the disease of less than 5 years' duration, were randomized to either a control or treated group. There were 14 patients in the control group and 13 in the treated group. The similarity of disease severity and duration in the two groups was established previously.³ Gold salts or placebo were administered for at least two years. The patients were examined at specified times regularly throughout the study.³ The details of the clinical measurements evaluated herein were published previously.³

This paper compares the initial pre-treatment examinations and the examinations at 6, 12 and 24 months. The differences between pre-treatment and subsequent measurements of grip strength, ring size, joints showing synovitis, walking time and lace tying time were analyzed. The serum proteins were determined at the beginning and at the end of the study.

B. Statistical

Since our samples are drawn from unknown populations, the population parameters must be estimated from the samples. This is true of any statistical procedure when the population parameters are unknown. The Weibull distribution uses three parameters to identify the population. These are the origin (alpha), the shape (beta) and the scale (theta). Thus the cumulative Weibull distribution function is

$$F(x) = 1 - \exp\left[-\left(\frac{x-\alpha}{\theta-\alpha}\right)^{\beta}\right]$$

and the density function is

$$f(x) = \frac{\beta\,(x-\alpha)^{\beta-1}}{(\theta-\alpha)^{\beta}}\,\exp\left[-\left(\frac{x-\alpha}{\theta-\alpha}\right)^{\beta}\right]$$

It is evident that F(x) is a log log function. When the relationship between the logarithmic values of the sample x and the log log values of F(x) is linear, the sample is said to have come from a population having a Weibull distribution. Thus the slope of the best fit line through the sample data points is the best estimate of the shape parameter beta. The intercept of this line with the abscissa yields the origin (parameter alpha). When x = theta the Weibull function reduces to

$$F(\theta) = 1 - e^{-1} = 0.632$$

For this study a computer was used to determine the best fit line through the data using successive estimates of alpha as a starting point. Appendix I describes the format of the computer program used in determining the parameters of samples of grip strength in the control and treated groups.

Having established the Weibull parameters for our two samples to be compared, we then establish the probability of a difference between them as a confidence "c". This in essence represents the minimum integral or overlap of the Type I (alpha) and Type II (beta) errors of classical hypothesis testing. The use of alpha and beta in this instance should not be confused with Weibull parameters having the same name. The rationale for the confidence "c" is explained by L. G. Johnson.⁴ Since that publication has a limited circulation, the introduction to the article is reproduced in Appendix III.

Results

Grip strength analysis

Grip strength was measured in ounces of water per square inch for each hand in all patients in this study. Initial measurements, made before treatment, revealed no difference between the patients selected as controls and those to be treated. Differences in strength between the initial measurements and measurements after six months were compared between the two groups. Since there were 14 control and 13 treated patients, data points were 28 and 26 for the respective groups.

When these data points were entered into the computer (see Appendix I) the following values were derived: alpha (the origin) of the control group was zero; beta (shape) was 1.28 (a nearly exponential distribution); and theta (scale) was equal to 17.69. The median was 6.5 ounces. Alpha was 5, beta was 1.2, and theta was 39.06 for the treated group. The median was 18.72 ounces.

The confidence "c" that the treated group had the greater grip strength after six months was 0.9964 (Appendix II). The analysis of variance of these two samples yielded an alpha-error of 0.035 (significant at the 0.05 level).

The data for grip strength differences at 12 and 24 months were evaluated in the same fashion. All samples showed nearly exponential distributions with zero origin. At 12 months the confidence that the treated group was better was 0.9428, and at 24 months was 0.9885. Comparison of the Weibull distribution analysis with the analysis of variance is shown in Table I.

TABLE I
(Grip Strength)

ANOVA     H0: C_6 = T_6       α = 0.035
WEIBULL   H1: T_6 > C_6       C = 0.9964

ANOVA     H0: C_12 = T_12     α = 0.008
WEIBULL   H1: T_12 > C_12     C = 0.9428

ANOVA     H0: C_24 = T_24     α = 0.025
WEIBULL   H1: T_24 > C_24     C = 0.9885

Using analysis of variance, the null hypothesis H0 is accepted if the alpha error is > 0.05. Using the confidence method the hypothesis H1 is accepted if "c" is > 0.90. In this instance there was no conflict between analysis of variance and the confidence method. H0 was rejected by the former and H1 accepted by the latter.

The choice of a 0.90 confidence level is purely arbitrary in this instance. It is the usual level of confidence accepted in physics and [...], and was considered in this case an acceptable level for the purpose of the study. It must be remembered that we are considering the probability of a difference between two samples whereas in using analysis of variance the level of the alpha error is not necessarily indicative of a difference between two samples. In point of fact, selection of too small an alpha error (0.01 for instance) may mask the correct probability of a difference by creating too large a beta error. This point is enlarged upon later.

Ring size analysis

The differences in ring sizes at 6, 12 and 24 months were evaluated by the same method. The Weibull parameter alpha was zero for all samples. The parameter beta ranged from 1.4 to 1.8. These values indicate that all the samples were highly skewed to the left of the mean but were not exponential. Confidence levels that the ring size differences for the treated group were smaller than those for the control group were 0.9999 (certainty) for the 6- and 12-month periods and 0.9032 for 24 months.

Analysis of variance yielded alpha-errors sufficiently small that the null hypothesis would be rejected for the 6- and 12-month periods. The data from the 24-month period, however, yielded a large alpha-error (see Table II).

TABLE II
(Ring Size)

ANOVA     H0: C_6 = T_6       α = 0.05
WEIBULL   H1: T_6 < C_6       C = 0.9999

ANOVA     H0: C_12 = T_12     α = 0.005
WEIBULL   H1: T_12 < C_12     C = 0.9999

ANOVA     H0: C_24 = T_24     α = 0.23
WEIBULL   H1: T_24 < C_24     C = 0.9032

Sum of joints analysis

Analysis of the difference in the number of joints showing synovitis at 6, 12 and 24 months revealed origins of the Weibull distributions were zero and the shape parameter ranged from 1.0 to 1.4, indicating the distribution form was nearly exponential.

For the 6-month period, the confidence of a difference between the treated and control group was 0.55, which is near pure chance. At 12 and 24 months, however, the confidence that the treated group showed fewer joints with synovitis was 0.9999 for both periods. It will be noted (Table III) that at both the 6-month and 24-month evaluations, the alpha-error was essentially the same, whereas the confidence level changed from 0.55 to 0.9999. This points out the importance of considering the beta-error as well as the alpha-error in determining differences between samples. This becomes especially important when dealing with highly skewed distributions as in this case.

TABLE III
(Sum of Joints Showing Synovitis)

ANOVA     H0: C_6 = T_6       α = 0.10
WEIBULL   H1: T_6 < C_6       C = 0.55

ANOVA     H0: C_12 = T_12     α = 0.015
WEIBULL   H1: T_12 < C_12     C = 0.9999

ANOVA     H0: C_24 = T_24     α = 0.09
WEIBULL   H1: T_24 < C_24     C = 0.9999

Walking time analysis

Initially the treated and control groups showed no difference in the time required to walk a measured distance. Differences in walking time were calculated at 6, 12 and 24 months. Both treated and control patients showed a decrease in walking time over the entire length of the study. At 6 months the decrease in time was similar for both groups. At 12 and 24 months, however, the treated group showed a greater decrease than the control group with the high probability, "c" = 0.9999 for both 12 and 24 months, that these changes were valid. These samples were also nearly exponentially distributed. Analysis of variance failed to indicate any differences between the treated and controls (see Table IV).

TABLE IV
(Walking Time)

6 months
H0: C = T     α = 0.75
H1: T < C     C = 0.58

12 months
H0: C = T     α = 0.75
H1: T < C     C = 0.9999

24 months
H0: C = T     α = 0.75
H1: T < C     C = 0.9999

Serum proteins analysis

Serum protein electrophoresis was performed on samples from each subject at the beginning and end of the study. There were no differences between the treated and control groups in either total protein, albumin or globulin at the beginning. At the end of the study there was no change in total protein but albumin values were lower (c = 0.956) and globulin levels higher (c = 0.981) in the control group than in the treated group. Analysis of variance showed only the albumin level to be lower in the control group at the end of the study (see Table V).

TABLE V
(Serum Proteins)

Total Protein
H0: T_I = C_I     α = 0.35
H1: [...]         C = 0.7332
H0: C_F = T_F     α = 0.75
H1: T_F < C_F     C = 0.8039

Albumin
H0: C_I = T_I     α = 0.26
H1: C_I < T_I     C = 0.8044
H0: C_F = T_F     α = 0.028
H1: C_F < T_F     C = 0.9562

Globulin
H0: C_I = T_I     α = 0.08
H1: C_I < T_I     C = 0.914
H0: C_F = T_F     α = 0.005
H1: C_F < T_F     C = 0.9811

Tests showing no treatment effect

Several other measurements were evaluated which showed no differences between the treated and control groups by either the ANOVA or Weibull methods. These tests include lace tying time, hemoglobin, and white blood cell count.

Discussion

The Weibull distribution function is a very convenient and flexible probability distribution. It requires of the sample data only that it be non-zero and non-decreasing in its cumulative sum. It is widely used in industry because empirically it has been proven reliable in evaluating small samples of test data. Almost all medical data sets fit these conditions. Yet use of the Weibull distribution for evaluating clinical data has had such limited application that it is essentially a new approach. Its use, therefore, deserves careful consideration.

Perhaps its strongest attribute for its usefulness is its ability to replace a number of better-known functions such as the normal, exponential, Chi Square, gamma and beta distributions. When the shape parameter beta equals one the distribution becomes exponential:

$$F(x) = 1 - \exp\left[-\frac{x-\alpha}{\theta-\alpha}\right]$$

Further, it has been determined empirically that when the shape parameter equals 3.5 the mean and median coincide and we have essentially a normal distribution. A value less than 3.5 indicates skewing of the data to the left of the mean and a value greater than 3.5 shows skewing to the right of the mean.

If the parameters of the parent population are unknown, then for any statistical inference the sample must be assumed to be a random sample of that population. Since this must be the case, the parameters of the sample are the best estimators of the parameters of the parent population. Thus, for example, if the sample is exponentially distributed, it must be assumed that the parent population is also exponentially distributed. In such case the parameters of mean and variance will have little or no usefulness in establishing a meaningful inference. On the other hand, if it is arbitrarily assumed that the parent population is normally distributed then it is obvious that the exponentially distributed sample is not a random sample of that population. A statistical inference in such case would not be valid. If the Weibull parameters of the sample are established first, errors concerning population character will not be made. This is in contrast to using analysis of variance. In using analysis of variance, as in the present study, the mean and variance of the population were estimated from the sample. This required two arbitrary assumptions: 1) the parent population was normally distributed, and 2) the sample was a truly random representation of this normal population as defined mathematically (5) by the relationship:

$$P\{X_k < x_k\} = F_k(x_k)\ \ldots$$

In point of fact, neither of these assumptions was supported by the data available from this study. As can be seen from the results, most of the samples were either exponentially distributed or highly skewed to the left of the mean. It may be argued that small samples drawn from normal populations are skewed and corrections can be made for the skewness. This argument is valid only when the population distribution is known with certainty.
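The three properties of the distribution function stated above (the value 0.632 at x = theta, the exponential form at beta = 1, and the near coincidence of mean and median at beta = 3.5) can be checked numerically. The following is a minimal sketch only. It assumes that the distribution function has the form F(x) = 1 - exp[-((x - alpha)/(theta - alpha))^beta], which is the form consistent with F(theta) = 0.632 for any origin; the function names and the parameter values are illustrative and are not taken from the study.

```python
import math

def weibull_cdf(x, alpha, beta, theta):
    """F(x) for origin alpha, shape beta and characteristic value theta (assumed form)."""
    if x <= alpha:
        return 0.0  # no probability at or below the origin
    z = (x - alpha) / (theta - alpha)
    return 1.0 - math.exp(-(z ** beta))

def weibull_quantile(p, alpha, beta, theta):
    """Value of x at which F(x) = p, for 0 < p < 1."""
    return alpha + (theta - alpha) * (-math.log(1.0 - p)) ** (1.0 / beta)

def weibull_mean(alpha, beta, theta):
    """Mean of the distribution, obtained from the gamma function."""
    return alpha + (theta - alpha) * math.gamma(1.0 + 1.0 / beta)

# Illustrative parameter values only (hypothetical, not study data).
alpha, theta = 2.0, 12.0

# 1) At x = theta the function equals 1 - 1/e, whatever the shape.
at_theta = [weibull_cdf(theta, alpha, b, theta) for b in (0.8, 1.0, 3.5)]

# 2) With beta = 1 the function is the exponential distribution.
x = 7.0
exponential = 1.0 - math.exp(-(x - alpha) / (theta - alpha))
weibull_beta_1 = weibull_cdf(x, alpha, 1.0, theta)

# 3) With beta = 3.5 the mean and the median nearly coincide.
mean_35 = weibull_mean(alpha, 3.5, theta)
median_35 = weibull_quantile(0.5, alpha, 3.5, theta)

print(at_theta, exponential, weibull_beta_1, mean_35, median_35)
```

The mean and median at beta = 3.5 agree closely but not exactly, which matches the description of this value as an empirical rule.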


The distribution functions of biological data are generally not known, and, furthermore, should not be assumed to have a normal distribution. This is particularly true if the data are oriented in the time domain. Clinical studies often are concerned with a "life" process, either that of the patient or of a disease. Such "life" processes differ in their distribution functions. Some are characterized by "infant mortality", some by "random decay" and others by "old age survival". This being the case, such studies should be evaluated by an appropriate "life testing" distribution such as the exponential, log normal, gamma, etc. The Weibull distribution proves most useful in this regard because it characterizes most of these. As mentioned before, Weibull parameters adequately describe numerous distribution functions including the normal and Chi Square.

Another advantage of the Weibull method is that populations can be compared at different percentiles other than the mean. Thus two different groups can be compared at the 10% level, the 90% level or the 50% level (the median). Analysis of variance and Student's "t" test compare only the means, which usually vary from the 40th percentile to the 60th percentile, depending on the amount and direction of the skewness of the population. Thus, for example, some treatment might make 10% of the population worse, improve another 10%, but "on the average" show no effect. This could be revealed by Weibull analysis at varying percentiles.

The shape parameter beta may also be compared for different Weibull distributions. It may happen that two populations will "average out" equally but the character of their distributions may be quite different. These differences can be determined and quantified by evaluating the shape parameter.

In the study reported here, most of the conclusions based on evaluation by the Weibull analysis were in agreement with those reached by analysis of variance. In some instances, however, analysis of variance failed to detect small differences between the treated and control groups that were held to be significantly different when evaluated by the Weibull method. The basis for such discrepancies has already been discussed. In the present study in particular two factors stand out.

In the first instance analysis of variance depends upon the mean and variance for information. Since most data samples in this study were nearly exponential in distribution, their mean and variance carried little meaningful information.

In the second instance the consideration of only alpha-errors rather than consideration of both alpha and beta errors together is emphasized. In classical statistics, an inference is accomplished by hypothesis testing. An hypothesis is made, and from the sample data a probability of an error is established if the hypothesis is accepted. This probability is called the alpha-error. A probability that an error will be made if the hypothesis is rejected is also present. This probability is called the beta-error. The two probabilities are mutually exclusive; both the alpha-error and the beta-error contribute to the total probability of an error in either accepting or rejecting the hypothesis. The correct level of probability that should be chosen for rejecting or accepting an hypothesis then is that level which minimizes the sum of the alpha and beta errors (see Appendix III). In actual practice this is rarely done. An alpha-error (usually very small) is arbitrarily chosen, and the hypothesis is accepted or rejected, depending on the alpha-error calculated from the samples.

The Weibull analysis uses a different method of making a statistical inference. A "confidence" is established which is a measure of the overlap between two adjacent probability density functions. It is then in essence an integration of the alpha and beta errors (Figure 1).
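The comparison of two groups at percentiles other than the middle of the distribution, described above, can be illustrated with a short sketch. It assumes the distribution function F(x) = 1 - exp[-((x - alpha)/(theta - alpha))^beta], and the two groups are hypothetical: they are constructed to share the same median while differing in shape, so that a comparison at the 50% level shows no difference although the 10% and 90% levels differ markedly. The names `group_a`, `group_b` and `compare_at` are illustrative only, and the sketch compares point values without attaching a confidence to them.

```python
import math

def weibull_quantile(p, alpha, beta, theta):
    """Value of x at which F(x) = p under the assumed Weibull form."""
    return alpha + (theta - alpha) * (-math.log(1.0 - p)) ** (1.0 / beta)

# Hypothetical group A: exponential in form (beta = 1), zero origin.
group_a = {"alpha": 0.0, "beta": 1.0, "theta": 20.0}
median_a = weibull_quantile(0.5, **group_a)

# Hypothetical group B: shape 3, with theta chosen so the medians agree.
beta_b = 3.0
theta_b = median_a / math.log(2.0) ** (1.0 / beta_b)
group_b = {"alpha": 0.0, "beta": beta_b, "theta": theta_b}

def compare_at(levels, first, second):
    """Return (level, value in first group, value in second group) per percentile."""
    return [(p, weibull_quantile(p, **first), weibull_quantile(p, **second))
            for p in levels]

rows = compare_at((0.10, 0.50, 0.90), group_a, group_b)
for level, value_a, value_b in rows:
    print(f"{level:.0%} level: group A = {value_a:.2f}, group B = {value_b:.2f}")
```

In this constructed case the lowest tenth of group B lies above that of group A and the highest tenth lies below it, a difference in character that a comparison of central values alone would not show.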


[Figure 1 graphic: overlapping CONTROL and TREATED curves labelled DISTRIBUTION OF MEANS, with the computed alpha error (α = 0.23) and the beta error marked, and the outcomes of the WEIBULL method and the ANOVA method for the NULL HYPOTHESIS noted beneath.]

Figure 1 Hypothesis testing of ring size difference at 24 months. Comparison of hypothesis testing using a predetermined alpha error, as in ANOVA, with the confidence level determined by integrating the overlap of the alpha and beta errors.
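The trade-off between the two errors that Figure 1 depicts can be sketched numerically. The example below is an illustration under stated assumptions rather than the authors' computation: the two distributions of means are taken to be normal with hypothetical values (means 0 and 2, standard deviation 1), the alpha error at a cutoff is taken as the area of the first distribution above the cutoff, and the beta error as the area of the second distribution below it. It compares the cutoff that minimizes the sum of the two errors with a cutoff fixed in advance at an alpha error of 0.01.

```python
from statistics import NormalDist

# Hypothetical distributions of means (not study data).
null_dist = NormalDist(mu=0.0, sigma=1.0)  # distribution under the null hypothesis
alt_dist = NormalDist(mu=2.0, sigma=1.0)   # distribution under the alternative

def errors_at(cutoff):
    """Alpha and beta errors when the hypothesis is rejected above the cutoff."""
    alpha_error = 1.0 - null_dist.cdf(cutoff)
    beta_error = alt_dist.cdf(cutoff)
    return alpha_error, beta_error

def best_cutoff(lower, upper, steps=2000):
    """Grid search for the cutoff that minimizes alpha error + beta error."""
    best = None
    for i in range(steps + 1):
        cutoff = lower + (upper - lower) * i / steps
        a, b = errors_at(cutoff)
        if best is None or a + b < best[1] + best[2]:
            best = (cutoff, a, b)
    return best

t_min, a_min, b_min = best_cutoff(0.0, 2.0)

# Cutoff implied by fixing the alpha error at 0.01 beforehand.
t_fixed = null_dist.inv_cdf(0.99)
a_fixed, b_fixed = errors_at(t_fixed)

print(f"minimum sum: cutoff={t_min:.3f} alpha={a_min:.4f} beta={b_min:.4f}")
print(f"fixed alpha: cutoff={t_fixed:.3f} alpha={a_fixed:.4f} beta={b_fixed:.4f}")
```

With these hypothetical values the fixed, very small alpha error is bought at the price of a much larger beta error, so that the total probability of error exceeds the minimum attainable sum.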

Conclusion

We believe that the application of Weibull distribution analysis to clinical data, as illustrated in the gold trial reported here, is valid and will yield reliable results. In addition, the method appears to be sensitive in detecting small differences in small samples. A major advantage in using the Weibull distribution for clinical data analysis is that one does not have to assume the population has a particular shape such as the bell-shaped curve of the normal distribution or the decaying slope of an exponential distribution. The shape is determined from the sample, much as the shape of the Chi Square distribution is determined by the sample. We believe that Weibull distribution analysis deserves wider trial in clinical problems since only experience from more extensive use will establish its true value.

Acknowledgements

The authors wish to thank the J. M. Richards' Laboratory, Grosse Pointe Park, Michigan, for its contribution of computer time and programs in completing this project; and the Michigan Chapter, Arthritis Foundation, for the grant to the clinical study.

References

1. Weibull W: A statistical distribution function of wide applicability. J Appl Mechanics 293-297, Sept 1951

2. Johnson NL and Leone FC: Statistics and Experimental Design in Engineering and the Physical Sciences. Vol I, John Wiley and Sons, New York, pp 112-115, 1964

3. Sigler JW, Bluhm GB, Duncan H, Sharp JT, Ensign DC, and McCrum WR: Gold salts in treatment of rheumatoid arthritis. Ann Intern Med 80:21-26, 1974

4. Johnson LG: Sample comparison vs hypothesis testing. Statistical Bulletin of the Detroit Research Institute 2:104, 1972

5. Fisz M: Probability Theory and Mathematical Statistics, 3rd Ed, John Wiley and Sons, New York, 335-336, 1963

APPENDIX I

Computer Program Format for Weibull Parameters of Grip Strength of Treated Patients

Input
Step 1. Estimates of origin (alpha): 0, 5, 10.
Step 2. Data (not ordered): x1, x2, ..., x24.

Output
Step 1. Data (ordered listing): xj1, xj2, ..., xj24.
Step 2. Median ranks of: xj1, xj2, ..., xj24.
Step 3. Origin (alpha) = 0 (from input).
Step 4. Shape (beta) = 1.0280.
Step 5. Scale (theta) = 43.6483.
Step 6. Goodness of fit = 0.9500.
Step 7. Median = 18.8633.
Step 8. Origin (alpha) = 5 (from input).
Step 9. Shape (beta) = 1.1231.
Step 10. Scale (theta) = 39.0681.
Step 11. Goodness of fit = 0.9705.
Step 12. Median = 18.72.
Step 13. Origin (alpha) = 10 (from input).
Step 14. Shape (beta) = 10.0581.
Step 15. Scale (theta) = 29.2672.
Step 16. Goodness of fit = 0.9371.
Step 17. Median = 18.8111.

Comment
The best goodness of fit test equals 0.9705 (Step 11). Therefore the best estimates of the Weibull parameters are: alpha = 5 (Step 8), beta = 1.1231 (Step 9) and theta = 39.0618 (Step 10).
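The sequence of steps listed in Appendix I (ordering the data, assigning median ranks, fitting a line for each trial origin and keeping the origin with the best fit) can be sketched as follows. This is not the program used in the study, whose details are not given. Several points are assumptions made for illustration: the median ranks use Benard's approximation (i - 0.3)/(n + 0.4), the line is fitted by ordinary least squares on ln(x - alpha) against ln ln[1/(1 - F)], the goodness of fit is taken to be the correlation coefficient, and the distribution function is taken as F(x) = 1 - exp[-((x - alpha)/(theta - alpha))^beta]. The sample is synthetic, generated from known parameters so that the recovered values can be checked.

```python
import math

def median_ranks(n):
    """Median ranks by Benard's approximation (an assumption of this sketch)."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

def fit_for_origin(data, alpha):
    """Least-squares line on the log / log-log scale for one trial origin."""
    xs = sorted(data)
    if xs[0] <= alpha:
        raise ValueError("origin must lie below the smallest observation")
    n = len(xs)
    u = [math.log(x - alpha) for x in xs]
    v = [math.log(-math.log(1.0 - f)) for f in median_ranks(n)]
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    beta = s_uv / s_uu                    # slope of the line = shape
    intercept = mean_v - beta * mean_u
    # The line crosses v = 0 where x = theta (F = 0.632).
    theta = alpha + math.exp(-intercept / beta)
    goodness = s_uv / math.sqrt(s_uu * s_vv)  # correlation coefficient
    return beta, theta, goodness

def best_fit(data, origins):
    """Try each origin estimate and keep the one with the best goodness of fit."""
    fits = [(alpha,) + fit_for_origin(data, alpha) for alpha in origins]
    return max(fits, key=lambda fit: fit[3])

# Synthetic sample of 24 values built from known parameters (hypothetical).
true_alpha, true_beta, true_theta = 5.0, 1.5, 40.0
sample = [true_alpha + (true_theta - true_alpha)
          * (-math.log(1.0 - f)) ** (1.0 / true_beta)
          for f in median_ranks(24)]

alpha_hat, beta_hat, theta_hat, goodness = best_fit(sample, origins=(0.0, 2.5, 5.0))
print(alpha_hat, beta_hat, theta_hat, goodness)
```

A trial origin must lie below the smallest observation, since the logarithm of x - alpha is otherwise undefined; the sketch raises an error in that case rather than guessing.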

APPENDIX II

Computer Program Format for "Confidence" of a Difference in Grip Strength Between Treated and Control Groups.

Input
Step 1. Control group values: alpha = 0, beta = 1.276, theta = 17.69, sample size = 28.
Step 2. Treated group values: alpha = 5, beta = 1.1231, theta = 39.06, sample size = 24.
Step 3. Percentile compared = 0.5 (median).
Step 4. Null ratio = 1 (at least different).

Output
Step 1. Estimated mean of control population = 14.329.
Step 2. Estimated standard deviation of control population = 2.836.
Step 3. Estimated mean of treated population = 31.97.
Step 4. Estimated standard deviation of treated population = 6.43.
Step 5. Confidence = 0.9964.

Comment
Given the Weibull parameters for two samples, the program compares the two samples at any percentile level and at any selected magnitude of difference. In this example the medians were compared and found to be at least different with a confidence of 0.9964. If we had selected a null ratio of 2, then the program would have computed the confidence that the median of one sample was at least twice as large as the median of the other sample.

APPENDIX III

From Statistical Bulletin: Reliability & Variation Research. Leonard G. Johnson, Editor, Vol. 2, Bulletin 2, May 1972, Page 1.

Sample comparison vs hypothesis testing

In comparing two samples we calculate the confidence C that one of the samples represents a population which is superior to the population represented by the other sample.

In testing the hypothesis that a given sample belongs to a population superior to some standard population we specify so-called α and β errors, and determine whether under the (α, β) pair chosen we should accept or reject the hypothesis that the sample belongs to the standard population.

The question which now can be asked is "How are these two statistical techniques related (i.e., sample comparison and hypothesis testing)?" More specifically, we can ask "What is the relationship between α, β, and C?"

A comparison of two samples is generally done by comparing a particular type of population parameter or quantile level in the two samples. The parameter chosen could be the mean in the case of normal distribution, or characteristic values in the case of Weibull distributions.

The corresponding hypothesis which is tested is then the one which states that the mean of an observed sample belongs in population₁ with mean₁, or that the characteristic value θ of an observed sample belongs in population₁ with characteristic value θ = θ₁ (θ₁ and θ₂ being the characteristic values of the two possible populations from which the sample could come).

In order to make a comparison of parameter₂ vs parameter₁ we need distribution functions for both parameter₁ (say, the estimated mean of population₁) and parameter₂ (the estimated mean of population₂).

Once these distribution functions are known we construct an interference diagram, as in Figure 1.

[Figure 1 of the Bulletin: interference diagram showing the distribution of parameter₁ (I) and the distribution of parameter₂, with the overlap area or interference area between them; horizontal axis X = value of parameter.]

Bulletin No. 2, Page 2

From Figure 1 it is possible to calculate the probability that a random element from II exceeds a random element from I. This is then defined to be the confidence that parameter₂ > parameter₁.

Let the PDF (Probability Density Function) of I be f₁(X).
Let f₂(X) = PDF of II.
Let F₁(X) = CDF (Cumulative Distribution Function) of I.
Let F₂(X) = CDF of II.

Then

$$C = \int F_1(X)\, f_2(X)\, dX \qquad (1)$$

According to formula 1 the confidence C increases as the overlap area between f₁(X) and f₂(X) is made smaller, until when there is no overlap, the confidence becomes 1 (i.e., 100% or certainty) that whatever is selected in II will surely exceed whatever is selected in I. On the other hand, if f₁(X) and f₂(X) are identical (100% overlap), it follows that there is a 50-50 chance (i.e., 50% confidence or C = .5) that the selection from II will exceed the selection from I.
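Formula 1 can be evaluated numerically for any pair of distributions whose CDF and PDF are available. The following sketch uses the trapezoidal rule and is an illustration only: it does not attempt to reproduce the confidence of 0.9964 reported in Appendix II, because the distribution functions of the estimated parameters used by that program are not given. The Weibull form F(x) = 1 - exp[-((x - alpha)/(theta - alpha))^beta], the function names and the parameter values are assumptions. Two checks with known answers are included: identical distributions should give C = 0.5, and two exponential distributions with characteristic values 1 and 3 should give C = 3/(1 + 3) = 0.75.

```python
import math
from functools import partial

def weibull_cdf(x, alpha, beta, theta):
    """Assumed Weibull CDF with origin alpha, shape beta, characteristic value theta."""
    if x <= alpha:
        return 0.0
    z = (x - alpha) / (theta - alpha)
    return 1.0 - math.exp(-(z ** beta))

def weibull_pdf(x, alpha, beta, theta):
    """Matching density; taken as 0 at or below the origin in this sketch."""
    if x <= alpha:
        return 0.0
    z = (x - alpha) / (theta - alpha)
    return (beta / (theta - alpha)) * z ** (beta - 1.0) * math.exp(-(z ** beta))

def confidence(cdf_1, pdf_2, lower, upper, steps=20000):
    """Trapezoidal estimate of C = integral of F1(X) * f2(X) dX over [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (cdf_1(lower) * pdf_2(lower) + cdf_1(upper) * pdf_2(upper))
    for i in range(1, steps):
        x = lower + i * h
        total += cdf_1(x) * pdf_2(x)
    return total * h

# Check 1: identical distributions (hypothetical parameters), expected C = 0.5.
same = dict(alpha=0.0, beta=1.5, theta=10.0)
c_same = confidence(partial(weibull_cdf, **same), partial(weibull_pdf, **same),
                    0.0, 200.0)

# Check 2: exponential distributions (beta = 1) with theta = 1 and theta = 3.
pop_1 = dict(alpha=0.0, beta=1.0, theta=1.0)
pop_2 = dict(alpha=0.0, beta=1.0, theta=3.0)
c_exp = confidence(partial(weibull_cdf, **pop_1), partial(weibull_pdf, **pop_2),
                   0.0, 60.0)

print(c_same, c_exp)
```

The integration limits must be wide enough to contain essentially all of the probability of distribution II; the upper limits above were chosen generously for the hypothetical parameters used.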
