
Comparison of Negative Binomial Regression Models in Analyzing Hypoglycemia Events

Junxiang Luo, PhD; Yongming Qu, PhD
Global Statistical Science, Eli Lilly and Company

Diabetes and Hypoglycemia Events

Diabetes and hypoglycemia events
• Diabetes is a disease characterized by elevated blood glucose levels
• Hypoglycemia events, a complication of insulin therapy, are a limiting factor for patients to further titrate insulin to achieve optimal glycemic control
• It is important to develop new diabetes medications with low rates of hypoglycemia events
• Statistical analysis of hypoglycemia events is important

Commonly used statistical models to analyze hypoglycemia events
• – hypoglycemia events are not continuous variables and the distribution is very skewed
• – too strong on model assumption
• Wilcoxon Rank-Sum test – difficult to adjust for baseline covariates and hard to interpret the results
• Negative binomial regression [1, 2] is widely used in current clinical research and publications – the properties of negative binomial regression with different options in analyzing hypoglycemia events are not well studied

Objective

To compare negative binomial models with different options through Monte Carlo simulation and bootstrap simulation, and to identify the most appropriate and robust method for analyzing hypoglycemia events with and without baseline adjustment.

Methods

Let $Y_1, Y_2, \ldots, Y_n$ denote the observed event counts. The negative binomial distribution is defined by

$$P(Y_i = y_i;\, k, \mu_i) = \frac{\Gamma\!\left(y_i + \frac{1}{k}\right)}{y_i!\,\Gamma\!\left(\frac{1}{k}\right)} \left(\frac{k\mu_i}{1 + k\mu_i}\right)^{y_i} \left(\frac{1}{1 + k\mu_i}\right)^{1/k}, \qquad y_i = 0, 1, 2, \ldots$$

where $\mu_i = E(Y_i)$ and $k$ is the dispersion parameter. The variance of $Y_i$ is $\mathrm{Var}(Y_i) = \mu_i + k\mu_i^2$. Negative binomial regression uses the negative binomial distribution with a link function that connects the mean parameter $\mu_i$ with the independent variables $\boldsymbol{X}_i = (X_{i1}, \ldots, X_{ip})'$ such that $\mu_i = g(\boldsymbol{X}_i; \boldsymbol{\beta})$. Typically, a log link function is used:

$$\log(\mu_i) = \boldsymbol{X}_i'\boldsymbol{\beta}$$

The log-likelihood of the negative binomial distribution is

$$l(y_i; \mu_i) = y_i \log(k\mu_i) - \left(y_i + \frac{1}{k}\right)\log(1 + k\mu_i) + \log\frac{\Gamma\!\left(y_i + \frac{1}{k}\right)}{\Gamma(y_i + 1)\,\Gamma\!\left(\frac{1}{k}\right)} \qquad (1)$$

NB1K
In a basic negative binomial regression, we assume the two treatment groups share the same dispersion parameter $k$ as in (1). We denote this basic negative binomial regression model (1) as NB1K.

NB2K
In some count data, the assumption of the same dispersion across treatment groups may not be satisfied, which means that $k$ varies across treatment groups. This leads to the following log-likelihood:

$$l(y_{Gi}; \mu_{Gi}) = y_{Gi} \log(k_G\mu_{Gi}) - \left(y_{Gi} + \frac{1}{k_G}\right)\log(1 + k_G\mu_{Gi}) + \log\frac{\Gamma\!\left(y_{Gi} + \frac{1}{k_G}\right)}{\Gamma(y_{Gi} + 1)\,\Gamma\!\left(\frac{1}{k_G}\right)}$$

where $G = 0$ or $1$ indicates the treatment group. We denote this negative binomial regression model as NB2K.

NBWP
Some real clinical count data are so ill-distributed that the overdispersion cannot be explained by $k$ (or the $k_G$) alone. One way to solve this problem is to introduce another overdispersion parameter $\phi$ to explain the extra overdispersion [1]. In this case, $\mathrm{Var}(Y) = \phi(\mu + k\mu^2)$.

We let NBWP denote the negative binomial regression with $\phi$ estimated using the Pearson Chi-square:

$$\chi^2 = \sum_i \frac{(y_i - \mu_i)^2}{\mathrm{Var}(\mu_i)} = \sum_i \frac{(y_i - \mu_i)^2}{\mu_i + k\mu_i^2}$$

NBWD
Let NBWD denote the negative binomial regression with $\phi$ estimated using the deviance:

$$D = 2\sum_i \left[l(y_i; y_i) - l(y_i; \mu_i)\right] = 2\sum_i \left[y_i \log\frac{y_i}{\mu_i} - \left(y_i + \frac{1}{k}\right)\log\frac{1 + ky_i}{1 + k\mu_i}\right]$$

ZINB
To account for excessive zeros, the zero-inflated negative binomial (ZINB) model was developed [3, 4, 5]. In this model, each observation is generated to be zero with probability $\varphi_i$ and to be from a negative binomial distribution $g(Y_i \mid X_i)$ with probability $1 - \varphi_i$:

$$P(Y_i = y_i \mid X_i) \sim \begin{cases} 0, & \text{with probability } \varphi_i \\ g(y_i \mid X_i), & \text{with probability } 1 - \varphi_i \end{cases}$$

NBSP
Empirical or sandwich estimators are useful for obtaining inferences that are not sensitive to the choice of the model [6, 7]:

$$\mathrm{cov}(\hat{\beta}) = \left(\sum_{i=1}^{n} D_i'\Sigma_i^{-1}D_i\right)^{-1} \left(\sum_{i=1}^{n} D_i'\Sigma_i^{-1}e_ie_i'\Sigma_i^{-1}D_i\right) \left(\sum_{i=1}^{n} D_i'\Sigma_i^{-1}D_i\right)^{-1},$$

where $e_i = y_i - \mu_i$, $\Sigma_i$ is the variance of $y_i$, and $D_i$ is the matrix of first derivatives of $\mu_i$ with respect to the fixed effect $\beta$. We let NBSP denote the negative binomial regression with $\phi$ estimated using the Pearson Chi-square statistic and $\mathrm{cov}(\hat{\beta})$ estimated using the sandwich method.

Simulation

Table 1 shows the mean and standard deviation of the negative binomial distributions for baseline and post baseline for treatment Group 0 and Group 1. Case A was chosen to closely mimic the nocturnal hypoglycemia event rate per 30 days for Type 1 Diabetes (T1D) patients. Nocturnal hypoglycemia events are hypoglycemia events occurring during the night and can be dangerous due to the lack of awareness. We increased the dispersion parameter $k$ of the post baseline hypoglycemia event rate from 2.02 in Case A to 14 in Case B in order to evaluate the model performance when data are extremely over-dispersed. Cases A and B were used to evaluate the Type I error of the models for these nocturnal hypoglycemia event rates, while Case C was selected to assess the models' statistical power. Similarly, Cases D, E and F were chosen to assess the model performance for total hypoglycemia event rates.

In the simulation, sample sizes of 100, 200, and 500 per treatment group were used, so the total sample size was N = 200, 400 and 1000. The correlation $\rho(X_{Gi}, Y_{Gi})$ was assumed to be the same in both groups and was assessed from mild to strong at values of 0.0, 0.3, 0.5, and 0.8. For each scenario, 3,000 random samples were generated.

Table 1. Mean and standard deviation of the negative binomial distribution in simulations

        Group 1                            Group 0
Case    Baseline X1    Post baseline Y1    Baseline X0    Post baseline Y0
A       (0.4, 1.0)     (0.7, 1.3)          (0.4, 1.0)     (0.7, 1.3)
B       (0.4, 1.0)     (0.5, 2.0)          (0.4, 1.0)     (0.5, 2.0)
C       (0.4, 1.0)     (0.5, 0.9)          (0.4, 1.0)     (0.7, 1.3)
D       (3.7, 5.6)     (5.6, 5.0)          (3.7, 5.6)     (5.6, 5.0)
E       (3.7, 5.6)     (5.6, 10.0)         (3.7, 5.6)     (5.6, 10.0)
F       (3.7, 5.6)     (4.7, 4.2)          (3.7, 5.6)     (5.6, 5.0)

Figure 1. Type I error based on simulations for different cases

Figure 2. Statistical power based on simulation; (A) for Case C and N=400; (B) for Case C and N=1000; (C) for Case F and N=400; (D) for Case F and N=1000. Vertical axis: statistical power (%). Series: ρ = 0, 0.3, 0.5 and 0.8, each with and without baseline (BL) adjustment.

Summary of simulation results (Figures 1 and 2):
• NBSP is the most robust method in controlling the Type I error regardless of baseline adjustment
• Baseline adjustment generally improves the statistical power
• Baseline adjustment methods had minimal bias and generally resulted in smaller standard deviations and mean squared errors (data not shown here)

Application

We applied various analysis methods with and without adjustment for baseline values to a Phase 2 clinical study (Table 2):
• Comparing insulin B with insulin A for patients with Type 2 Diabetes
• N = 93 for insulin A and N = 195 for insulin B
• 4-week lead-in period at baseline
• 12-week treatment period
• A hypoglycemia event was defined as a blood glucose value < 70 mg/dL. Nocturnal hypoglycemia events are hypoglycemia events occurring between bedtime and wake-up time.

We also assessed Type I error by repeatedly and randomly splitting subjects into 2 treatment groups 3,000 times (Figure 3).

Table 2. The comparison of hypoglycemia events between insulin A and insulin B using various models (relative rate, SE, 95% CI and p-value for insulin B vs. insulin A)

                                   Without Adjusting for Baseline          With Adjusting for Baseline
Variable                  Model    RR    SE    95% CI        P-Value       RR    SE    95% CI        P-Value
Nocturnal Hypoglycemia    NB1K     0.66  0.20  (0.37, 1.19)  0.167         0.53  0.15  (0.31, 0.93)  0.026
Events                    NB2K     0.66  0.19  (0.37, 1.18)  0.160         0.53  0.15  (0.30, 0.93)  0.028
                          NBWD     0.66  0.16  (0.41, 1.06)  0.087         0.53  0.12  (0.34, 0.84)  0.006
                          NBWP     0.66  0.21  (0.36, 1.23)  0.196         0.53  0.16  (0.29, 0.98)  0.043
                          ZINB     0.61  0.18  (0.34, 1.09)  0.097         0.49  0.14  (0.28, 0.85)  0.011
                          NBSP     0.66  0.20  (0.37, 1.19)  0.167         0.53  0.16  (0.30, 0.94)  0.031
Total Hypoglycemia        NB1K     0.91  0.20  (0.59, 1.40)  0.668         0.87  0.18  (0.58, 1.30)  0.499
Events                    NB2K     0.91  0.19  (0.60, 1.37)  0.652         0.87  0.17  (0.59, 1.29)  0.483
                          NBWD     0.91  0.20  (0.59, 1.39)  0.668         0.87  0.18  (0.58, 1.31)  0.502
                          NBWP     0.91  0.20  (0.59, 1.40)  0.672         0.87  0.19  (0.56, 1.34)  0.530
                          ZINB     0.89  0.18  (0.60, 1.34)  0.582         0.83  0.17  (0.55, 1.25)  0.362
                          NBSP     0.91  0.18  (0.62, 1.34)  0.635         0.87  0.18  (0.58, 1.30)  0.494

Figure 3. Rejection rates based on random re-splitting of subjects into treatment groups, using NBSP, for nocturnal and total hypoglycemia events

Summary of results from the example
• Insulin B significantly reduced nocturnal hypoglycemia events compared to insulin A when adjusting for baseline value, regardless of model options
• NBSP with baseline adjustment was one of the best methods in controlling Type I error
• The Type I error rate for NBSP was controlled at 5% for total hypoglycemia events, but was slightly higher than 5% for nocturnal hypoglycemia events

Conclusion and Discussion

• Negative binomial regression with overdispersion estimated by the Pearson statistic and the variance-covariance matrix of parameters estimated by the sandwich method (NBSP) is the most robust method in controlling the Type I error
• Adjusting for baseline hypoglycemia events can dramatically improve the statistical power if the correlation between baseline and post baseline hypoglycemia events is moderate or high
• Recommendation
  • Baseline hypoglycemia events should be collected for insulin studies if hypoglycemia events are important endpoints
  • Baseline hypoglycemia events should be adjusted for in the analysis of post baseline hypoglycemia events

References

1. McCullagh P., Nelder J.A. Generalized Linear Models, 2nd ed. Chapman & Hall, London 1989.
2. Lawless J.F. Negative Binomial and Mixed Poisson Regression. Canad. J. Statist. 1987; 15: 209–225.
3. Mullahy J. Specification and Testing of Some Modified Count Data Models. J. of Econometrics 1986; 33: 341–365.
4. Lambert D. Zero-Inflated Poisson Regression Models with an Application to Defects in Manufacturing. Technometrics 1992; 34: 1–14.
5. Greene W.H. Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models. Technical report 1994.
6. White H. A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica 1980; 48: 817–838.
7. Cameron A., Trivedi P.K. Regression Analysis of Count Data. Cambridge University Press, Cambridge, MA 1998.

ACKNOWLEDGEMENT

The authors would like to thank Drs Scott Jacober, Cory Heilmann and Honghua Jiang for useful comments and scientific reviews.
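
Worked check of the dispersion parameters. The Simulation section quotes k = 2.02 for Case A and k = 14 for Case B, while Table 1 lists only means and standard deviations. A minimal sketch, assuming the two are linked solely through the variance relation Var(Y) = μ + kμ² given in Methods, recovers k from each (mean, SD) pair; the function name `nb_dispersion` is hypothetical and used only for illustration.

```python
def nb_dispersion(mean, sd):
    """Solve Var(Y) = mu + k * mu**2 for k, given the mean and SD."""
    return (sd ** 2 - mean) / mean ** 2

# Post baseline (mean, SD) pairs from Table 1.
k_a = nb_dispersion(0.7, 1.3)  # Case A
k_b = nb_dispersion(0.5, 2.0)  # Case B

print(f"Case A: k = {k_a:.2f}")
print(f"Case B: k = {k_b:.2f}")
```

Both values agree with the dispersion parameters quoted in the Simulation text, which suggests Table 1 and the text describe the same distributions.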
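
Sketch of the log-likelihood and the two scale statistics. The following illustrates equation (1), the Pearson Chi-square used by NBWP and the deviance used by NBWD, with the standard library only. Two points are assumptions rather than statements from the poster: φ is taken to be the statistic divided by the residual degrees of freedom (the usual generalized linear model convention), and all function names are hypothetical.

```python
import math

def nb_loglik(y, mu, k):
    """Log-likelihood contribution of one count y, following equation (1)."""
    inv_k = 1.0 / k
    return (y * math.log(k * mu)
            - (y + inv_k) * math.log(1.0 + k * mu)
            + math.lgamma(y + inv_k)
            - math.lgamma(y + 1.0)
            - math.lgamma(inv_k))

def pearson_chi2(ys, mus, k):
    """Pearson Chi-square with variance function mu + k * mu**2."""
    return sum((y - mu) ** 2 / (mu + k * mu ** 2) for y, mu in zip(ys, mus))

def deviance(ys, mus, k):
    """Deviance D = 2 * sum[l(y; y) - l(y; mu)]; y * log(y / mu) is 0 at y = 0."""
    total = 0.0
    for y, mu in zip(ys, mus):
        term = -(y + 1.0 / k) * math.log((1.0 + k * y) / (1.0 + k * mu))
        if y > 0:
            term += y * math.log(y / mu)
        total += 2.0 * term
    return total

def scale_estimate(stat, n_obs, n_params):
    """Assumed convention: phi = statistic / residual degrees of freedom."""
    return stat / (n_obs - n_params)

# Toy counts (made up for illustration) with a common fitted mean.
counts = [0, 0, 1, 3, 0, 2, 7, 0]
fitted = [sum(counts) / len(counts)] * len(counts)
k = 2.0
phi_pearson = scale_estimate(pearson_chi2(counts, fitted, k), len(counts), 1)
phi_deviance = scale_estimate(deviance(counts, fitted, k), len(counts), 1)
print(phi_pearson, phi_deviance)
```

The two estimates of φ generally differ on the same data, which is one reason NBWP and NBWD can give different standard errors in Table 2 despite identical relative rates.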
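
Sketch of the sandwich estimator in the simplest case. To illustrate why the sandwich formula in the NBSP section is described as insensitive to the choice of model, the example below specializes it to an intercept-only log-link model, log(μ_i) = β, where D_i = μ and Σ_i = μ + kμ² are scalars. This reduction, the function name and the toy data are assumptions made for illustration and are not the authors' implementation.

```python
def log_mean_variances(ys, k):
    """Return (sandwich, model_based) variance of beta for log(mu_i) = beta."""
    n = len(ys)
    mu = sum(ys) / n            # solves the score equation for any fixed k
    d = mu                      # D_i = d(mu)/d(beta) under the log link
    v = mu + k * mu ** 2        # working variance Sigma_i
    bread = n * d * d / v       # sum_i D_i' Sigma_i^{-1} D_i
    meat = sum((d / v) ** 2 * (y - mu) ** 2 for y in ys)
    return meat / bread ** 2, 1.0 / bread

# Toy counts (made up for illustration).
counts = [0, 0, 1, 3, 0, 2, 7, 0, 1, 0]
for k in (0.5, 5.0):
    sandwich, model_based = log_mean_variances(counts, k)
    print(f"k={k}: sandwich={sandwich:.4f}, model-based={model_based:.4f}")
```

In this special case the working variance cancels out of the sandwich expression, so the result does not change with the assumed k, whereas the model-based variance does.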