Evaluating Methods of Estimating Common Risk Difference for Stratified Binomial Trials

Kate Fisher, MA
Biostatistics, Global Data Operations, Clinical Research Services

Evaluating methods of estimating common risk difference for stratified binomial trials for less common events

Background

Many clinical trials use a stratified randomization approach to ensure balanced treatment assignment within subgroups, which helps to estimate the treatment effect more accurately. For example, a trial may stratify the randomization on gender, age groupings, or another baseline covariate that investigators believe may influence the treatment effect on a binomial outcome. For stratified binomial trials, a risk difference can be calculated per stratum. These estimates can be combined to produce a common risk difference that describes the overall treatment effect. However, there are many ways to estimate the common risk difference and its confidence interval (CI).

This research explores confidence intervals using a variety of methods: Wald-type (Sato)[3], Stratified Newcombe[4], Score (Miettinen-Nurminen)[5,6], and a new approach presented by Klingenberg[7]. Cochran-Mantel-Haenszel[1,2] weights are used when constructing the Wald and Stratified Newcombe confidence intervals.

Methods

MH Wald-type (Sato) CI
• Cochran-Mantel-Haenszel (MH) weights are used to combine the stratum risk differences to produce the common risk difference, and the variance of the estimate is computed using the Sato variance estimator.

New CI presented by Klingenberg
• Similar to the Wald-type CI, but estimates the variance under the null to construct the CI.

Stratified Newcombe confidence limits
• The stratified Wilson confidence limits (CL) for each treatment group's probability of event are combined using MH weights to get common upper and lower CLs, which are used to calculate a common risk difference CL using the method described by Yan and Su[4]. See also Kim and Won[8].

Score (Miettinen-Nurminen) confidence limits
• Stratum risk difference CIs are computed using the Score method and combined using weights dependent on the CI lengths.

Simulation

Data are simulated using 4 strata and an overall event probability similar to an actual trial. Across all simulations the underlying overall event probability (p_overall = 2.5%) and the sample sizes per stratum remain stable (n_1j = 2,350, n_2j = 125, n_3j = 1,050, n_4j = 75, where j = 1, 2 for treatment or control group, i.e. using 1:1 allocation to treatment and control).

[Figure: Sample sizes per stratum per treatment. x-axis: randomization strata (Young, Low risk; Young, High risk; Old, Low risk; Old, High risk). Bar labels: 2350, 1050, 125, 75.]

Simulation Cont.

The performance of the common risk difference estimation methods is investigated under 2 settings:
1. Varying the magnitude of the risk difference (p_ct - p_tx), maintaining a homogeneous risk difference across strata
2. Varying the magnitude of heterogeneity of the risk difference

For 1), the magnitude of the difference between the probabilities of event for the control arm (p_ct) and the treatment arm (p_tx) is varied from 0% to 2.5%, using .05% increments.

For 2), heterogeneity is introduced by sampling from a uniform distribution centered at 0 with bounds varying at 0.01%, 0.1% and 1% to add noise to p_tx for each stratum. For this setting, p_ct is kept constant at 3% and p_tx is 2% + noise. Each simulation is run for 10,000 iterations.

The following is captured to help determine whether certain methods outperform others across simulations:
– How often 0, indicating an insignificant result, is in the CI
– How often the true underlying risk difference is captured in the CI (coverage probability)
– Average width of CI

This simulation was run in RStudio (R version 3.2.0)[9].

Results

[Figure: Example Risk Difference Estimates (p_ct - p_tx). Recoverable labels: Stratified Score CI; rows Young, Low risk; Young, High risk; Old, Low risk; Old, High risk; Common: MH Wald-type (Sato), New (Klingenberg), Stratified Newcombe, Score (Miettinen-Nurminen). x-axis from -0.05 to 0.075.]

[Figure: Average CI width by method: Wald (Sato), New (Klingenberg), Stratified Newcombe, Score (M-N). y-axis from 0.0144 to 0.0149.]

[Figure: Probability of significant CIs over varying treatment differences. Panels: p_overall = 2.5% (entire range and zoomed ranges); p_tx - p_ct = -0.01 with no heterogeneity, low heterogeneity uni[-1e-04, 1e-04], med heterogeneity uni[-0.001, 0.001], high heterogeneity uni[-0.01, 0.01]. x-axis: (p_ct - p_tx)*100. y-axis: probability of significant 95% CI. Methods: Wald (Sato), New (Klingenberg), Stratified Newcombe, Score (M-N).]

[Figure: Coverage probability over varying treatment differences. Panels: p_overall = 2.5%; p_tx - p_ct = -0.01 with no heterogeneity, low heterogeneity uni[-1e-04, 1e-04], med heterogeneity uni[-0.001, 0.001], high heterogeneity uni[-0.01, 0.01]. x-axis: (p_ct - p_tx)*100, annotated p_ct = 2.5%, p_tx = 2.5% and p_ct = 3.75%, p_tx = 1.25%. y-axis: probability true risk difference is captured in 95% CI (true average risk difference in the heterogeneity panels). Methods: Wald (Sato), New (Klingenberg), Stratified Newcombe, Score (M-N).]

*Heterogeneity introduced by sampling from a uniform distribution and adding to p_tx.

Summary

• The Score CI has a slightly higher chance of being declared significant compared to the Stratified Newcombe CI in the ranges of a reasonably powered study (P(significant) > 80%).
• The probability of capturing the true treatment difference was below the 95% value for all methods, but increasing for values between 0.08 and 0.024.
• Score (Miettinen-Nurminen) CIs were more likely to capture smaller differences (<0.08), and Stratified Newcombe CIs were more likely to capture larger treatment differences.
• Stratified Newcombe CIs were the widest, followed by Score.
• The MH Wald-type CI and the new (Klingenberg) method performed similarly, with narrow confidence intervals, a slightly higher chance of being significant, and worse coverage probability.
• Introducing heterogeneity didn't appreciably alter the results when the average risk difference was 1%.

Conclusions

• In this particular setting of sample sizes and p_overall there are no drastic differences in conclusions between the methods, perhaps because the event probability is not too rare and the sample sizes are very large and uneven.
• However, these results suggest using the Score CI due to a good balance between high coverage probability of the true risk difference, probability of declaring significance, and CI width.
• An R function has now been written for the Stratified Newcombe and the Score (M-N) methods, which are implemented in SAS 9.4 but were not available in R, along with R code to run a simulation for implementing projected sample sizes and event probabilities in 2 treatment arms. Code available upon request.

References

[1] Cochran, W.G. (1954). “Some methods for strengthening the common chi-square tests”. Biometrics, 10, 417-451.
[2] Mantel, N., and Haenszel, W. (1959). “Statistical aspects of the analysis of data from retrospective studies of disease”. Journal of the National Cancer Institute, 22, 719-748.
[3] Sato, T. (1989). “On the Variance Estimator of the Mantel-Haenszel Risk Difference”. Biometrics, 45, 1323–1324, letter to the editor.
[4] Yan, X. and Su, X.G. (2010). “Stratified Wilson and Newcombe confidence intervals for multiple binomial proportions”. Statistics in Medicine, 19, 811-825.
[5] Agresti, A. (2013). Categorical Data Analysis, 3rd Edition. Hoboken, NJ: John Wiley & Sons.
[6] Miettinen, O. S. and Nurminen, M. M. (1985). “Comparative Analysis of Two Rates”. Statistics in Medicine, 4, 213–226.
[7] Klingenberg, B. (2013). “A new and improved confidence interval for the Mantel-Haenszel risk difference”. Statistics in Medicine, 33(17), 2968-83.
[8] Kim, Y., Won, S. (2013). “A new and improved confidence interval for the Mantel-Haenszel risk difference”, in Proceedings of PharmaSUG 2013 (Pharmaceutical Industry SAS Users Group), Cary, NC: SAS Institute Inc.
[9] RStudio Team (2015). RStudio: Integrated Development for R. RStudio, Inc., Boston, MA. URL http://www.rstudio.com/.

Contact Information:
[email protected]
919-294-5562
PAREXEL International Corp
2520 Meridian Parkway, Suite 200
Durham, North Carolina 27713

Acknowledgements: Thank you to Daniel Bonzo, Benedict Dormitorio, and Marlina Nasution of PAREXEL for their contributions of ideas and edits.

© 2015 PAREXEL International Corporation. All rights reserved.
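
Illustration of the MH Wald-type (Sato) CI

The MH Wald-type (Sato) CI described under Methods can be sketched in a few lines. This is a minimal Python illustration, not the R code written for the poster. The function name `mh_risk_difference`, the sign convention (control minus treatment, matching p_ct - p_tx), and the example event counts are all assumptions made for illustration. The variance formula is the Sato estimator as it is commonly stated; for a single stratum it reduces to the ordinary Wald variance p1(1-p1)/n1 + p2(1-p2)/n2.

```python
import math


def mh_risk_difference(x_ct, n_ct, x_tx, n_tx, z=1.959964):
    """Mantel-Haenszel common risk difference (control minus treatment)
    with a Wald-type CI based on the Sato variance estimator.

    x_ct, x_tx: event counts per stratum; n_ct, n_tx: sample sizes per stratum.
    Returns (estimate, variance, (lower, upper)).
    """
    sum_w = sum_wd = sum_p = sum_q = 0.0
    for x1, n1, x2, n2 in zip(x_ct, n_ct, x_tx, n_tx):
        n = n1 + n2
        w = n1 * n2 / n                      # MH weight for this stratum
        sum_w += w
        sum_wd += w * (x1 / n1 - x2 / n2)    # weighted stratum risk difference
        # Sato (1989) variance components
        sum_p += (n1**2 * x2 - n2**2 * x1 + n1 * n2 * (n2 - n1) / 2) / n**2
        sum_q += (x1 * (n2 - x2) + x2 * (n1 - x1)) / (2 * n)
    est = sum_wd / sum_w
    var = (est * sum_p + sum_q) / sum_w**2
    half = z * math.sqrt(var)
    return est, var, (est - half, est + half)


# Per-stratum sample sizes from the poster (1:1 allocation).
n = [2350, 125, 1050, 75]
# Hypothetical event counts, chosen only to illustrate the call.
x_ct = [70, 4, 32, 2]
x_tx = [47, 3, 21, 1]

est, var, (lo, hi) = mh_risk_difference(x_ct, n, x_tx, n)
print(f"common risk difference = {est:.4f}, 95% CI = ({lo:.4f}, {hi:.4f})")
```

A 95% CI that excludes 0 corresponds to a "significant" result in the sense used in the simulation. Repeating this call over simulated binomial counts would give the three quantities tracked there: how often 0 is in the CI, coverage of the true difference, and average CI width.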
