Recommendations for the Primary Analysis of Continuous Endpoints in Longitudinal Clinical Trials
Craig H. Mallinckrodt, PhD
Research Advisor, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, Indiana

Peter W. Lane, MA, CStat
Director of Consultancy and Training, Research Statistics Unit, GlaxoSmithKline, Harlow, United Kingdom

Dan Schnell, PhD
Section Head, Pharmaceutical Statistics, Procter & Gamble Pharmaceuticals, Mason, Ohio

Yahong Peng, PhD
Senior Biometrician, Clinical Biostatistics, Merck Research Lab, Upper Gwynedd, Pennsylvania

James P. Mancuso, PhD
Associate Director, Statistics, Pfizer Inc, Groton, Connecticut

This position paper summarizes relevant theory and current practice regarding the analysis of longitudinal clinical trials intended to support regulatory approval of medicinal products, and it reviews published research regarding methods for handling missing data. It is one strand of the PhRMA initiative to improve efficiency of late-stage clinical research and gives recommendations from a cross-industry team. We concentrate specifically on continuous response measures analyzed using a linear model, when the goal is to estimate and test treatment differences at a given time point. Traditionally, the primary analysis of such trials handled missing data by simple imputation using the last, or baseline, observation carried forward method (LOCF, BOCF) followed by analysis of (co)variance at the chosen time point. However, the general statistical and scientific community has moved away from these simple methods in favor of joint analysis of data from all time points based on a multivariate model (eg, of a mixed-effects type). One such newer method, a likelihood-based mixed-effects model repeated measures (MMRM) approach, has received considerable attention in the clinical trials literature. We discuss specific concerns raised by regulatory agencies with regard to MMRM and review published evidence comparing LOCF and MMRM in terms of validity, bias, power, and type I error. Our main conclusion is that the mixed model approach is more efficient and reliable as a method of primary analysis, and should be preferred to the inherently biased and statistically invalid simple imputation approaches. We also summarize other methods of handling missing data that are useful as sensitivity analyses for assessing the potential effect of data missing not at random.

Key Words
Missing data; Longitudinal data; Primary analysis; Clinical trials

Correspondence Address
Craig Mallinckrodt, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN 46285 (email: [email protected]).

Drug Information Journal, Vol. 42, pp. 303–319, 2008. 0092-8615/2008. Printed in the USA. All rights reserved. Copyright © 2008 Drug Information Association, Inc.
Submitted for Publication: August 6, 2007
Accepted for Publication: November 29, 2007

INTRODUCTION

In longitudinal trials, efficacy is often assessed in terms of treatment differences at a specific time point, usually the last time at which observations are planned while patients are under treatment. A major difficulty in analyses of such trials is missing data at the chosen time point, often due to patients withdrawing (or dropping out) from treatment. Inference from the results of a trial can be complicated by the method used to handle the missing data because the inference may depend on the method and its assumptions.

Historically, the simple imputation method, called last observation carried forward (LOCF), has been used for the primary efficacy analysis of clinical trials supporting registration of new medicines (1). This approach is simple to carry out and is generally regarded as conservative in that it tends to under- rather than overestimate treatment effects. Although the appropriateness of LOCF hinges on strong assumptions, it is also generally regarded as less biased than an analysis of completing subjects only, potentially counteracting bias caused by differential timing, rates, and reasons for dropout in the various treatment arms.

Over the past 20 years, statistical methodology and software have been developed that allow for the routine use of alternative approaches with less restrictive assumptions than LOCF. These methods are based on analyzing the observations made at all time points. One such longitudinal approach, which has been extensively studied in regulatory settings, uses a model referred to as multivariate, or mixed, and is increasingly denoted in the literature by the abbreviation MMRM (mixed model for repeated measures) (2–14).

The MMRM method is from the broader class of direct-likelihood analyses and makes use of fully and partially observed data sequences from individual patients by estimating the covariance between data from different time points (1). As is described in an upcoming section, it is often useful to implement MMRM using an unstructured approach to modeling both the treatment-by-time means and the (co)variances, leading to what is essentially a multivariate normal model wherein treatment group means at the primary time point are adjusted to reflect both the actually observed data and the projected outcomes from the patients with missing data (see, eg, articles by Cnaan et al. [15], Molenberghs and colleagues [5], and Molenberghs and Kenward [1]). Other methods, such as multiple imputation, are also the result of advances in methodology and software but have not been studied as extensively as MMRM in regulatory settings.

Given the strong theoretical and empirical evidence favoring MMRM over LOCF, it is not surprising that use of LOCF as the primary analysis has been questioned by statisticians and clinicians in academic, industry, and regulatory settings. However, regulatory agencies frequently require that primary analyses of efficacy use LOCF. For example, Dr. Linda Yau surveyed statisticians working in phases 2 and 3 from a wide range of therapeutic areas, including neuroscience, antivirals, respiratory, gastrointestinal, urology, and cardiovascular. In her presentation at the DIA Conference in Philadelphia, June 2006, Dr. Yau noted that LOCF was almost universally preferred by regulatory agencies as the primary analysis. However, there was generally no objection to using more recent methods such as MMRM for primary analyses in phase 1, nor for trials on medical devices or diagnostic tests. In addition, plans for some vaccine trials in phase 2 have included MMRM or multiple imputation as the primary analysis.

In our experience, decisions regarding choice of the primary analysis have been hampered by misunderstandings of concepts, some of which stem from inconsistency in terminology. This, in turn, has led to misunderstandings regarding the implications of the research comparing LOCF and MMRM. Additional difficulties may have arisen from differences in the perspectives of pharmaceutical companies and regulators, either real or perceived.

The purpose of this article is to capitalize on the diverse experience of researchers at a number of pharmaceutical companies in order to (1) clarify terminology and concepts regarding use of MMRM and LOCF in regulatory settings, (2) address specific concerns raised by regulatory agencies regarding use of MMRM as the primary analysis, and (3) make specific recommendations for analysis of data from confirmatory longitudinal clinical trials with continuous endpoints.

Regarding our perspective on the choice of primary analysis, this article is the consensus of an expert working team from the Efficiency in Clinical Trials Initiative of the Pharmaceutical Research and Manufacturers of America (PhRMA). We believe there is a compelling public health need to develop drugs using the best possible scientific methods in all disciplines in order to meet patient needs with better and more affordable medicines. We believe regulators share this perspective, as evidenced by the various Critical Path initiatives. Hopefully, this article will help drug developers and regulators achieve their common goal.

TERMINOLOGY AND CONCEPTS REGARDING USE OF MMRM AND LOCF IN REGULATORY SETTINGS

MISSING DATA TERMINOLOGY AND CONCEPTS

In order to understand the potential impact of missing data, the process (ie, mechanisms) leading to the missingness must be considered. The following taxonomy of missing-data mechanisms is now common in the statistical literature (16).

Data are considered missing completely at random (MCAR) if, conditional upon the independent variables in the analytic model, the missingness does not depend on either the observed or unobserved outcomes of the variable being analyzed (Y).
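
The MCAR condition can also be stated in symbols. The notation in the sketch below is assumed for illustration and is not taken from the article: R stands for the vector of missingness indicators, X for the independent variables in the analytic model, and Y is split into the outcomes that were observed and those that were not.

```latex
% Assumed notation (not from the article):
%   R                   missingness indicators for the planned observations
%   X                   independent variables in the analytic model
%   Y_obs, Y_miss       observed and unobserved parts of the outcome Y
\[
  \Pr\bigl(R \mid X,\, Y_{\mathrm{obs}},\, Y_{\mathrm{miss}}\bigr)
  \;=\;
  \Pr\bigl(R \mid X\bigr)
  \qquad \text{(MCAR)}
\]
```

Neither part of Y appears on the right-hand side, which is one way of reading the statement that, given the independent variables, the missingness depends on neither the observed nor the unobserved outcomes.
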
Data are missing at random (MAR) if, conditional upon the independent variables in the analytic model, the missingness depends on the observed outcomes of the variable being analyzed (Yobs) but does not depend on the unobserved outcomes of the variable being analyzed (Ymiss). Data are missing not at random (MNAR) if, conditional upon the independent variables in the analytic model, the missingness depends on the unobserved outcomes of the variable being analyzed.

Several key points arise from these definitions. First, the characterization of the missingness mechanism does not rest on the data alone; it involves both the data and the model used to analyze the data.

[...] be ignorable if parameters were estimated via maximum likelihood but would not be ignorable if parameters were estimated via a frequentist method that assumes MCAR (18).

These subtleties can be easy to overlook in practice, leading to misunderstandings about missing data and its consequence. For example, when dropout rates differ by treatment group, then it can be said that dropout is not random. But it would be incorrect to conclude that the missingness mechanism giving rise to the dropout is MNAR and that analyses assuming MCAR or MAR would be invalid. Although dropout is not completely random in the sim-