Common Methodological Problems in Randomized Controlled Trials of Preventive Interventions
Christine M. Steeger†, PhD, [email protected]
Pamela R. Buckley†, PhD, [email protected]
Fred C. Pampel, PhD, [email protected]
Charleen J. Gust, MA, [email protected]
Karl G. Hill, PhD, [email protected]

Institute of Behavioral Science, University of Colorado Boulder, 1440 15th St., Boulder, CO 80309
† These authors contributed equally to this work.

Correspondence concerning this article should be addressed to Christine Steeger, Institute of Behavioral Science, University of Colorado Boulder, 1440 15th St., Boulder, CO 80309; phone: 303-735-7146; [email protected]

Acknowledgements: The authors thank Abigail Fagan, Delbert Elliott, Denise Gottfredson, and Amanda Ladika for their comments and critical reading of the manuscript, Sharon Mihalic for paper concepts, and Jennifer Balliet for participating in data entry and data coding.

Declarations
Funding: This study was funded by Arnold Ventures.
Conflicts of interest/competing interests: The authors declare that they are members of the Blueprints for Healthy Youth Development staff and that they have no financial or other conflict of interest with respect to any of the specific interventions, policies, or procedures discussed in this article.
Ethics approval/consent: This paper does not contain research with human participants or animals.
Data: Available from the authors upon request.
Materials and/or code availability: n/a
Author contributions: Concepts and design (CS; PB; FP); data entry, coding, management, and analysis (CS; PB; FP; CG); drafting of manuscript (CS; PB; FP); intellectual contributions, reviewing, and critical editing of manuscript content (CS; PB; FP; CG; KH). All authors have read and approved the final manuscript.

Abstract

Objective. Randomized controlled trials (RCTs) are often considered the gold standard for evaluating whether intervention results support causal claims of beneficial effects. However, because poor design and incorrect analysis may lead to biased outcomes, simply employing an RCT is not enough to conclude that an intervention "works." This paper applies a subset of the Society for Prevention Research (SPR) Standards of Evidence for Efficacy, Effectiveness, and Scale-up Research, with a focus on internal validity (making causal inferences), to determine the degree to which RCTs of preventive interventions are well designed and analyzed, and whether authors provide a clear description of the methods used to report their study findings.

Methods. We conducted a descriptive analysis of 851 RCTs published from 2010 to 2020 and reviewed by Blueprints for Healthy Youth Development, a web-based registry of scientifically proven and scalable interventions. We used Blueprints' evaluation criteria that correspond to a subset of SPR's standards of evidence.

Results. Only 22% of the sample satisfied important criteria for minimizing biases that threaten internal validity. Overall, we identified an average of 1-2 methodological weaknesses per RCT. The most frequent sources of bias were problems related to baseline non-equivalence (i.e., differences between conditions at randomization) or differential attrition (i.e., differences between completers and attritors, or between study conditions, that may compromise the randomization). Additionally, over half the sample (51%) had missing or incomplete tests to rule out these potential sources of bias.
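For readers unfamiliar with these checks, the following is a minimal illustrative sketch, not taken from the reviewed trials, of how baseline equivalence and differential attrition might be tested for a simple two-arm trial. The DataFrame layout, the column names "condition" and "dropped_out", and the covariate list are hypothetical assumptions for the example.

```python
# Illustrative sketch only: hypothetical data layout and column names,
# not code from the study or the Blueprints review process.
import pandas as pd
from scipy import stats


def baseline_equivalence(df: pd.DataFrame, covariates: list) -> pd.DataFrame:
    """Compare treatment and control arms on continuous baseline covariates
    using Welch's independent-samples t-test."""
    rows = []
    for cov in covariates:
        treat = df.loc[df["condition"] == "treatment", cov].dropna()
        control = df.loc[df["condition"] == "control", cov].dropna()
        t_stat, p_val = stats.ttest_ind(treat, control, equal_var=False)
        rows.append({"covariate": cov, "t": t_stat, "p": p_val})
    return pd.DataFrame(rows)


def differential_attrition(df: pd.DataFrame):
    """Chi-square test of whether loss to follow-up differs by condition."""
    table = pd.crosstab(df["condition"], df["dropped_out"])
    chi2, p_val, dof, expected = stats.chi2_contingency(table)
    return chi2, p_val
```

In applied reviews, such significance tests are typically considered alongside effect sizes for baseline differences and covariate-adjusted outcome models; the sketch only indicates the kind of test the criteria ask authors to report.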
Conclusions. Most preventive intervention RCTs need improved rigor to permit causal claims that an intervention is effective. Researchers also must improve the reporting of methods and results so that methodological quality can be fully assessed. These advancements will increase the usefulness of preventive interventions by ensuring the credibility and usability of RCT findings.

Keywords: Randomized controlled trial, RCT, preventive interventions, internal validity, CONSORT

Introduction

Randomized controlled trials (RCTs) are often considered the gold standard for determining experimental validity and the causal effects of preventive interventions (Shadish, Cook, & Campbell, 2002; West & Thoemmes, 2010). With high-quality implementation, RCTs allow for causal inferences and estimates of average treatment effects that are more reliable and credible than those from other empirical methods (Deaton & Cartwright, 2018). Despite the strength and appropriateness of the RCT for evaluating an intervention (i.e., a program, practice, or policy), simply using an RCT design and reporting results is not sufficient to determine whether an intervention "works." Because poorly implemented RCTs may produce biased outcomes (Schulz, Altman, Moher, & the CONSORT Group, 2010), an RCT must be correctly designed, implemented, and analyzed in order to make causal inferences and claim beneficial effects of an intervention. That is, RCTs must be internally valid, minimizing several sources of bias (systematic error).

When feasible and appropriate, randomization is necessary to ensure sound causal conclusions of positive intervention effects, which inform policy and practice decisions for communities (Montgomery et al., 2018). Social and psychological interventions, however, are often complex and contextually dependent on the difficult-to-control environments in which they are delivered (e.g., schools, correctional facilities, health care settings; Bonell, 2002; Grant, Montgomery, et al., 2013; Grant, Mayo-Wilson, Melendez-Torres, & Montgomery, 2013). Understanding RCTs therefore requires a detailed, transparent description of the interventions tested and the methods used to evaluate them (Grant, Montgomery, et al., 2013). Transparent reporting is crucial for assessing the validity and the efficacy or effectiveness of intervention studies used to inform evidence-based decision making and policymaking.

To guide social science researchers in the methodological criteria required for establishing efficacy (i.e., the extent to which an intervention does more good than harm when delivered under optimal conditions) and effectiveness (i.e., intervention effects when delivered in real-world conditions; Flay et al., 2005), two seminal papers on methodological standards of evidence were developed by prevention scientists and endorsed by the Society for Prevention Research (SPR). With the goal of increasing consistency in reviews of prevention research, these standards originated in Flay et al. (2005) and were updated in Gottfredson et al. (2015) as the prevention science field progressed in the number and quality of preventive interventions for reducing youth problem behaviors.

How well researchers apply and report the SPR standards of evidence criteria for high-quality RCTs in the prevention science field, however, is unknown. This paper uses a subset of the SPR standards of evidence that must be met for preventive interventions to be judged "tested and efficacious" or "tested and effective" (Flay et al., 2005; Gottfredson et al., 2015).
We focus on threats to internal validity to determine whether RCTs of preventive interventions are well implemented and well reported; if they are not, what are the most common design and analysis flaws, and what information on methods is missing? This study's larger goal is to improve the design, analysis, and reporting of potential threats to internal validity in intervention research that uses an experimental design. To answer these research questions, we present findings from a large-scale descriptive analysis of RCTs testing program efficacy or effectiveness, using the Blueprints for Healthy Youth Development online clearinghouse database. Blueprints identifies scientifically proven and scalable interventions that prevent or reduce the likelihood of antisocial behavior and promote a healthy course of youth development (Buckley, Fagan, Pampel, & Hill, 2020; Fagan & Buchanan, 2016; Mihalic & Elliott, 2015).

The Status of RCTs in the Prevention Science Field

Increased public and private funder investments in experimental studies of social programs have led to a higher volume of preventive intervention research over the past several decades (Bastian, Glasziou, & Chalmers, 2010). Along with a greater number of published intervention studies, there is some evidence (up to 2010) that the methodological rigor of RCT design and evaluation has improved over time in the medical and child health fields (Falagas, Grigori, & Ioannidou, 2009; Thomson et al., 2010). Still, more recent publications have highlighted that many RCTs in the social sciences have design and/or analysis flaws (Ioannidis, 2018), which contribute to sources of bias that weaken internal validity and call into question causal claims of intervention effectiveness. In addition, the use of RCT findings to inform policy and practice decisions is hindered by poor or incomplete reporting of study design, procedures, and analysis (Montgomery et al., 2018; Walleser, Hill, & Bero, 2011).

Grant, Mayo-Wilson, and colleagues (2013) were, to our knowledge, the first to conduct a comprehensive review of the reporting quality of trials of social programs. They identified a sample of 239 RCTs of complex interventions published in 2010 across 40 high-impact-factor academic journals in the fields of clinical psychology, criminology, education, and social work. Findings revealed that many standards concerning randomization procedures were poorly reported, such as participant allocation to conditions,