Confidence Intervals in Analysis and Reporting of Clinical Trials
Guangbin Peng, Eli Lilly and Company, Indianapolis, IN
ABSTRACT

Regulatory agencies around the world have recommended reporting confidence intervals for treatment differences along with the results of significance tests. SAS provides easy and convenient ways to produce confidence intervals using procedures such as PROC GLM and PROC UNIVARIATE in conjunction with ODS (Output Delivery System). In this paper, I will discuss the relationship between significance tests and confidence intervals, summarize the types of confidence intervals used in clinical study reports, and provide examples from clinical trials to illustrate the computation of distribution-dependent confidence intervals for the mean treatment difference and distribution-free confidence intervals for the median response within each treatment group using SAS.

INTRODUCTION

Confidence interval estimation and significance testing (hypothesis testing) are the two most commonly used statistical inference methods for clinical trials (Walker, 1997). Because p-values have been widely presented and, in the past, more easily obtained from standard statistical software than have confidence intervals, significance tests have also been more widely accepted by the medical community than have confidence intervals. The extensive use of significance tests with clinical trial data has further increased their popularity and made confidence intervals less popular. However, the overuse and misinterpretation of significance tests have led to advocates for more frequent use of confidence intervals (Simon, 1993).

There is a close relationship between confidence intervals and significance tests (Hahn and Meeker, 1991); in fact, a confidence interval can often be used to test a hypothesis. If the 100(1-α)% confidence interval for the mean treatment difference in a clinical trial does not contain zero, there is evidence to indicate a treatment difference at the 100α% significance level. This strategy is equivalent to the hypothesis test that rejects the null hypothesis of no mean treatment difference at the level of α.

Compared to p-values, confidence intervals are generally more informative. They provide quantitative bounds that express the uncertainty inherent in estimation, instead of merely an accept or reject statement. The length of a confidence interval depends on the sample size; the influence of sample size is evident from the length of the interval, which is not the case for a significance test. So confidence intervals are usually more meaningful than statistical hypothesis tests alone. Moreover, they are easier to explain to those with no formal training in statistics (Hahn and Meeker, 1991).

Regulatory agencies around the world have emphasized in recent years the importance of reporting confidence intervals in clinical study reports. The ICH Harmonized Tripartite Guideline on statistical principles for clinical trials states, “Estimates of treatment effects should be accompanied by confidence intervals, whenever possible, and the way in which these will be calculated should be identified…. it is important to bear in mind the need to provide statistical estimates of the size of treatment effects together with confidence intervals (in addition to significance tests).”

In this paper, I will discuss the types of confidence intervals and the different methods for constructing them. I will also illustrate the calculation of such confidence intervals for clinical trial data by providing sample SAS code and output. Finally, I will discuss the advantages of presenting confidence intervals along with p-values in clinical study reports.

CHARACTERISTICS OF CONFIDENCE INTERVALS

Because there exist a variety of confidence intervals, the analyst must determine which type of interval to use depending on the application (Hahn and Meeker, 1991). Two commonly used types are confidence intervals for population parameters and confidence intervals for distribution percentiles. The most frequently used type of confidence interval attempts to capture the population mean. Sometimes, however, the analyst may construct confidence intervals for the standard deviation or other shape parameters of a distribution, as the analysis requires. When the assumed distribution parameters are not suitable to describe the sampled population, the analyst may focus on one or more percentiles of the sampled distribution and construct confidence intervals for them (for example, for the median or the quartiles). Sometimes one-sided confidence bounds are desired in situations where the major interest is restricted to the lower limit or the upper limit alone.

All statistical intervals have an associated confidence level. The analyst must determine the confidence level based on what seems to be an acceptable degree of assurance for the specific application (Hahn and Meeker, 1991). For instance, the analyst may construct a 95% confidence interval for the mean. This indicates that the method of construction guarantees that 95% of all intervals so constructed will contain the true population mean. (Of course, this also means that 5% of them will not.) One can request a higher level of confidence, which will reduce the chances of obtaining an interval that does not contain the population mean. However, increasing the confidence level results in a wider (that is, less precise) interval for a fixed sample size. On the other hand, when there is a fixed confidence level, the length of intervals becomes shorter as the sample size increases. So, the analyst may choose higher confidence levels with large samples and lower confidence levels with small samples. In some cases, obtaining meaningful confidence intervals becomes impractical because of the small sample size or complexity of analysis.

CONFIDENCE INTERVALS FOR CLINICAL TRIALS

The most frequently used confidence interval for clinical trial data is the 95% confidence interval for the mean treatment difference. The selection of 95% for the confidence level is common across disciplines. For clinical trials, and especially for confirmatory trials, this selection follows naturally from the study design, and the 95% confidence level provides reasonable assurance along with adequate precision for most trials. The methods for calculating these confidence intervals can be generally put into two categories: distribution-dependent and distribution-free methods.

Construction of distribution-dependent confidence intervals requires one to assume a particular distribution, such as the normal distribution. From experience, the normal assumption appears to be valid for many clinical trial data analyses. However, it may be inappropriate to calculate distribution-dependent confidence intervals when the assumed distribution does not fit the data well. In such cases, distribution-free confidence intervals should be constructed. A distribution-free interval sometimes may not exist, and its length is generally longer than the corresponding distribution-dependent interval for a particular distribution. This is the price that one pays for not making the distribution assumption (Hahn and Meeker, 1991). So, a distribution-dependent confidence interval should be chosen whenever there is solid evidence that the data follow a tractable distribution.

DISTRIBUTION-DEPENDENT CONFIDENCE INTERVAL

If the assumption that the data are normally distributed is valid, one can construct confidence intervals for the mean treatment difference. The general form of a confidence interval for the mean difference between two treatment groups (Group A and Group B) is

$$(\bar{Y}_a - \bar{Y}_b) \pm t_{1-\alpha/2,\,df} \cdot S_{(\bar{Y}_a - \bar{Y}_b)} \qquad (1)$$

where $\bar{Y}$ is the mean or least squares mean, $S_{(\bar{Y}_a - \bar{Y}_b)}$ is the standard error of the estimate of $(\bar{Y}_a - \bar{Y}_b)$, and $t_{1-\alpha/2,\,df}$ is the $100(1-\alpha/2)$ percentile from Student's …

… difference through analysis of variance, although it is critical to obtain the degrees of freedom and estimate the standard error (or mean square error) correctly. Before the release of SAS version 8, the analyst had to extract these values from the SAS procedures and then calculate the confidence intervals using the formula above (or one of its variations) with custom-written SAS code. The following code using SAS version 6.09 represents one way to obtain the 95% confidence interval for the mean treatment difference from an ANOVA model.

    *--------------------------------------*;
    * The following statements get output  *;
    * datasets containing statistics       *;
    * needed for calculation of confidence *;
    * interval                             *;
    *--------------------------------------*;
    proc glm data=final outstat=glmdt noprint;
      class &trt &str &invcd;
      model &dep=&indp;
      lsmeans &trt/pdiff stderr tdiff out=lsmeandt;
    run;
    *--------------------------------------*;
    * The following statements get the     *;
    * degrees of freedom from the model    *;
    *--------------------------------------*;
    data _null_;
      set glmdt;
      if _type_='ERROR';
      call symput('df', df);
    run;
    *--------------------------------------*;
    * The following statements get the     *;
    * LSMEANS from the model               *;
    *--------------------------------------*;
    data _null_;
      set lsmeandt;
      if &trt="&control" then call symput('pLSM', lsmean);
      if &trt="&trt1" then call symput('tLSM', lsmean);
    run;
    *--------------------------------------*;
    * The following statements get the 95% *;
    * confidence interval                  *;
    *--------------------------------------*;
    data mgmean;
      set lsmeandt;
      lb=(&tLSM -&pLSM)-tinv(.975, &df)*abs(&tLSM -&pLSM)/sqrt(f);
      ub=(&tLSM -&pLSM)+tinv(.975,