
Behaviour Research and Therapy 120 (2019) 103417


Behaviour Research and Therapy

journal homepage: www.elsevier.com/locate/brat

A commentary on research rigour in clinical psychological science: How to avoid throwing out the innovation baby with the research credibility bath water in the depression field

Barnaby D. Dunn (a,∗), Heather O'Mahen (a), Kim Wright (a), Gary Brown (b)

a Mood Disorders Centre, University of Exeter, UK
b Psychology Department, Royal Holloway University, UK

ARTICLE INFO

Keywords: Research waste; Innovation; Intervention development; Intervention implementation; Clinical trial design

ABSTRACT

Proponents of the research credibility movement make a number of recommendations to enhance research rigour in psychology. These represent positive advances and can enhance replicability in clinical psychological science. This article evaluates whether there are any risks associated with this movement. We argue that there is the potential for research credibility principles to stifle innovation and exacerbate type II error, but only if they are applied too rigidly and beyond their intended scope by funders, journals and scientists. We outline ways to mitigate these risks. Further, we discuss how research credibility issues need to be situated within broader concerns about research waste. A failure to optimise the process by which basic science findings are used to inform the development of novel treatments (the first translational gap) and effective treatments are then implemented in real-world settings (the second translational gap) is also a significant source of research waste in depression. We make some suggestions about how to better cross these translational gaps.

In the past ten years there has been a growing recognition that many findings fail to replicate in psychology, which is in part driven by the widespread use of questionable research practices (QRPs) that inflate type I errors (i.e. false positives). A series of practical meta-science proposals have been put forward about how the rigour of psychological science can be enhanced (which can be referred to as the research credibility movement). These proposals are now being extended from the basic science arena to other domains of applied psychology, including clinical psychology. This is a timely, important and useful change that may greatly benefit the field. However, as with most change, there could be risks as well as opportunities depending on how these recommendations are implemented.

This article will identify possible risks in implementing research credibility recommendations in clinical psychological science, consider how likely these are to occur, and discuss ways to mitigate against them. We use intervention development in depression as a focus to illustrate these points. We argue there is a risk that the research credibility movement may inhibit innovation and exacerbate type II error in the field if its principles are applied in a rigid, “one size fits all” way that goes beyond their intended scope and fails to take into account the context in which clinical psychology operates. We will then suggest that research credibility priorities in clinical psychology need to be situated within broader concerns about research waste. Pragmatically, the central aim of clinical psychological science is to develop effective means of preventing or treating mental health difficulties. Efficient research practices are those that optimise the quality, validity and speed, and minimise the cost (in every sense), of this treatment development pathway. While issues of rigour do need to be addressed in clinical psychology, the bigger problem is a failure to optimise this translational research pipeline. We will discuss how waste occurs due to sub-optimal translation of findings from basic science into novel treatments (the t1 gap) and a failure to implement treatments shown to be effective in controlled trials in real world settings (the t2 gap) (Cooksey, 2006). We will suggest ways to enhance the efficiency of this translational pipeline.

We take a UK perspective, identifying what we see as challenges for research credibility and efficiency in the context of the UK National Health Service and research funding bodies. We are aware that many but not all of the issues we raise will apply globally and should therefore be considered within each national context.

1. Emergence of the research credibility movement

It is increasingly recognised that many key findings in science generally, and psychology specifically, fail to replicate (e.g., Camerer

∗ Corresponding author. Mood Disorders Centre, University of Exeter, Exeter, EX4 4QF, UK. E-mail address: [email protected] (B.D. Dunn).
https://doi.org/10.1016/j.brat.2019.103417
Received 31 January 2019; Received in revised form 14 May 2019; Accepted 3 June 2019. Available online 05 June 2019.
0005-7967/ © 2019 Elsevier Ltd. All rights reserved.

et al., 2018; Chalmers & Glasziou, 2009; Ioannidis, 2005; Open Science Collaboration, 2015). Over 90% of respondents to a recent Nature survey agreed that there was a ‘reproducibility crisis’ across science (Baker, 2016). The endemic use of questionable research practices (QRPs) has also become apparent, which can lead to spurious false positive findings in basic psychological science (Simmons, Nelson, & Simonsohn, 2011). At the design stage, these include chronic under-powering (Button et al., 2013; Fraley & Vazire, 2014) and inadequate protocol specification. At the data-collection stage, adaptive stopping rules are sometimes used (ceasing data collection when effects become significant). At the analysis stage, questionable practices include measure-hacking, p-hacking and selective exclusion (e.g., Ioannidis et al., 2014; Flake & Fried, 2019). At the write-up phase, problematic approaches include hypothesizing after the results are known, spin, and providing insufficient detail to enable replication (Kerr, 1998; Glasziou et al., 2014). At the publication phase, there are ‘file drawer’ and citation problems (e.g., Ingre & Nilsonne, 2018). These QRPs were until recently seen as acceptable standard practice, were incentivized by the grant and publication reward structures, and most likely contributed to low rates of replicability in the field.

Clinical psychology research is not immune to these problems (e.g., Perepletchikova, Treat, & Kazdin, 2008). Cuijpers and Cristea (2015), in a ‘tongue in cheek’ article, discuss ways that may be (intentionally or unintentionally) used to prove a novel therapy is effective, even when it is not. These include: allegiance effects leading to enhanced supervision and training for the preferred therapy (Munder, Brutsch, Leonhart, Gerger, & Barth, 2013); maximising expectancy effects for the preferred treatment; exploiting weak spots in trial methodologies to favour the preferred treatment (e.g. selection of a ‘straw-man’ comparator, preferential allocation to treatment arm, non-blinded assessment, focusing on completer not intention-to-treat data, measure and p-hacking); and running small trials and only publishing the ones that work. Similar uses of QRPs in clinical psychology and psychiatry research have been identified by other authors (Leichsenring et al., 2017; Reardon et al., 2018; DeVries et al., 2018; Turner, Matthews, Linardatos, Tell, & Rosenthal, 2008).

In response to these concerns, a number of recommendations have been made about how to improve rigour and replicability in science (e.g., Munafo et al., 2017). These include: training in good methods to minimise QRPs; encouraging protocol pre-registration and/or registered reports; incentivising replication; ensuring studies are adequately powered; and strengthening the peer review process to ensure that it more rigorously detects bad practices (e.g., Munafo et al., 2017). Ways to adapt these recommendations to clinical psychological science have also been considered (Hopwood & Vazire, 2018; Tackett et al., 2017; Tackett, Brandon, King & Markon, 2019).

We wholeheartedly endorse these recommendations. However, there are some risks to innovation in science if these principles are over-rigidly applied, particularly to clinical fields where there is still significant uncertainty about how to operationalise and measure phenomena of interest. We begin by outlining what is required to maximise innovation in science.

2. Striking a balance between creativity and rigour in science

There is a long theoretical tradition considering how to optimise scientific development and innovation. This highlights a synergy (and sometimes inevitable tension) that exists between maximising creativity and preserving rigour (see reviews by Wagenmakers, Dutilh, & Sarafoglou, 2018; Fiedler, 2018). Whewell (1840) proposed two mutually reinforcing modes of reasoning: inductive reasoning that generates a new creative leap, and deductive reasoning that then tests whether the leap was justified (i.e. the deduction verifies the induction). More recently, Kelly (1955a and 1955b) described two essential components of creativity: ‘loosening’ (breaking out of the shackles of existing knowledge and rules, developing unconventional ideas that go against the zeitgeist) and ‘tightening’ (critical selection and then implementation of the ideas generated by ‘loosening’).

Optimal scientific development arguably involves striking a considered balance between (and appropriately sequencing) these inductive-loosening and deductive-tightening processes (Kaufman & Glaveanu, 2018). There must be space to engage in both induction and deduction, with induction tending to occur first to generate the input to the deductive process. When engaging in deductive, tightening processes, the optimal balance also needs to be struck between protecting from type I error (a false positive finding) and type II error (a false negative finding) and between ensuring internal validity (the experiment is free from error and any difference in measurement is due solely to the independent variable) and external validity (i.e. generalisability beyond the experiment) of findings. As is well known, there is a trade-off between each of these (attempts to reduce type I error can enhance type II error; attempts to maximise internal validity can reduce external validity).

Having outlined the factors that maximise innovation in science, we will now evaluate the thesis that an overly rigid application of the research credibility movement could lead to some unanticipated costs. In particular, we will assess whether it could stifle innovation and tilt the balance too far in favour of protecting against type I error (and therefore maximise type II error). It is important to acknowledge we are predominantly focusing on possible risks rather than benefits of the research credibility movement in what follows. We are taking this ‘one-sided’ stance because others have already written eloquently about the positives associated with a thoughtful application of research rigour principles (e.g., Hopwood & Vazire, 2018; Munafo et al., 2017), while the negatives of an extreme application have rarely been explicitly evaluated. Our intention is to contribute to a dialectical synthesis around the merits of research credibility principles and to promote a balanced implementation in the field. In the following sections we will focus on depression intervention research to illustrate these risks.

3. The state of the depression field

Depression remains a chronic, recurrent, prevalent condition that is a major cause of disability despite extensive research efforts (Kessler et al., 2003; Ustun et al., 2004). Current psychological treatments for depression are only partially effective (with high rates of non-response or relapse after response: Cuijpers, van Straten, Andersson, & van Oppen, 2008; Vittengl, Clark, Dunn, & Jarrett, 2007). It is not clear that current treatments are always superior to placebo-control when unpublished trials are taken into account (Cuijpers et al., 2014a; 2014b). There is uncertainty about whether therapies work by the distinct mechanisms their developers postulated (e.g. for cognitive-behavioural therapy, see Longmore & Worrell, 2007; Lorenzo-Luaces, German & DeRubeis, 2015) and there is debate about whether therapy ‘non-specific’ factors are instead driving improvement (Cuijpers, Reijnders & Huibers, 2018).

There has been a proliferation of different treatments that appear to be equally effective, albeit with variation across treatments in the extent of their evidence-bases. Arguably, there have been no step-wise gains in the capacity to treat acute depression in the past thirty years (Dunn & Roberts, 2016). There is an urgent need for treatment innovation and implementation.

It is unclear if this failure to differentiate treatments is due to genuine equivalence of efficacy, or to heterogeneity in depression. There is significant variability in the clinical presentation of depression, with approximately 1000 unique symptom combinations all leading to the same diagnosis (Fried & Nesse, 2016). Moreover, there is marked variation across the disease life span as a function of severity, chronicity, number of previous episodes, and previous treatment response (‘staging’ variables; Lorenzo-Luaces, 2018). There are also distinct differences in patterns of comorbidity (Hasin et al., 2018). As a result, the adequacy of the current diagnostic system has been increasingly

challenged (e.g., Borsboom, 2017; Fried et al., 2017; Hoffmann, Curtis & McNally, 2016; Insel et al., 2010) and difficulties in knowing how best to measure depression have been highlighted (Fried, 2017; Fried et al., 2016; Fried & Nesse, 2016). This heterogeneity is not limited to client characteristics. There is also variability in how competently individual therapists deploy complex therapy protocols to treat depression (Saxon, Firth, & Barkham, 2017) and how well these are implemented between different settings (Clark et al., 2018).

Given this heterogeneity, it is unsurprising that there has been a failure to find clear differences between treatments, or that treatments fail to be reliably superior to placebo. This is exacerbated in subsequent meta-analyses, which only partly control for heterogeneity between trials and do not take into account within-trial heterogeneity (for a balanced critique of meta-analyses, see Serghiou & Goodman, 2018). The result is that there is a danger of simply accepting the null in these trials. However, such a conclusion would ignore the possibility that different treatments are more effective for particular subtypes of clinical presentations, in particular contexts, or when delivered in targeted ways.

To move the depression field forwards, it is critical to gain a better understanding of boundary conditions between treatments, to help answer what works for whom and when. For example, an increasing body of work is now examining treatment selection in depression (for example, DeRubeis et al., 2014). Some of this work, using novel applications of machine learning approaches, is beginning to reveal that even when trials show no difference between treatments in an overall sample, there can nevertheless be significant differences in treatment outcomes in particular subgroups (Lorenzo-Luaces, 2018; Cohen & DeRubeis, 2018).

In many ways, the depression field, despite ancient historical roots, has many of the features of what Rozin (2001) calls a ‘young’ science. Rozin points out it is helpful for a young science to begin by identifying, conceptualising and learning how to measure the underlying phenomena of interest. Ideally these are homogeneous and invariant (i.e. present in a similar way across individuals and contexts). These early phases are more descriptive and/or exploratory. Only once these steps have been achieved should there be a move to more ‘mature’ scientific methods that follow a confirmatory approach (using experimental methods to test clear hypotheses).

Depression research using deductive methods is vulnerable to the QRPs and replication issues identified by the research credibility movement (e.g., DeVries et al., 2018; Turner et al., 2008). However, the primary problem the depression treatment field faces is a lack of innovation of novel treatments that are then worthy of rigorous examination using these deductive methods. Moreover, when conducting deductive research there is a danger of tilting the balance too far towards protecting against type I error (and therefore exacerbating type II error). We evaluate whether an overly rigid application of the recommendations of the research credibility movement could exacerbate these problems.

4. Potential risks of an extreme application of research credibility principles

4.1. Evaluating risks identified in the broader basic science literature

A recent symposium in Perspectives on Psychological Science debated the impact of research credibility principles on creativity in research and (alongside a number of likely benefits) identified a series of risks. ‘Tightening’ of research practices could result in: a reduction in the number of studies conducted because of the increased workload required by new policies (Wai & Halpern, 2018); an over-emphasis on methods and analysis leading to an under-emphasis on theory construction (Fiedler, 2018); and researchers being increasingly reluctant to pursue ‘risky’ questions (Vazire, 2018). As a result there could be a shift towards ‘little c creativity’ (where scientists incrementally develop the current dominant theory rather than bring about paradigm shifts; Kuhn, 1970).

Of these concerns, we see a particular risk for clinical-psychological science with regards to prioritising the theory construction that underpins subsequent treatment development. Popper (2005) articulated how initial theory development is often an inductive process that precedes formal scientific testing, but is nevertheless a rigorous process. In the early phases of research, scientists are guided by intuition or imagination, which they then subject to a process of rational criticism to begin to refine a theory iteratively. What is essential to this creative thinking is the coupling of imaginative freedom, a capacity to engage in highly critical thinking about the products of this imaginative freedom, and an intense interest in the problem (so the scientist is willing to immerse themselves in the issue and go through repeated iterations before they come up with the best account). What typically occurs is an iterative cycle between hypothesis, test and reflection (in relation to extant theory and literature), which ends when the optimal account is generated (i.e. the most consistent and powerfully explanatory model has been built). This is similar to the test-operate-test-exit (TOTE) cycle believed to guide behaviour in early cybernetic theory (Miller, Galanter & Pribram, 1960). At times, each loop of this cycle can refine and simplify the problem space (Simon & Newell, 1971) in a linear fashion. At other times, fresh information may be introduced into the system that requires a new (and potentially non-linear) inductive leap to account for it. This refinement of ideas prior to formal testing has been described by later writers as the ‘pursuit’ phase, where a theory-sketch has been generated but is not yet sufficiently well specified to be tested using formal scientific methods (see Chakrabarty, 2010). We need to ensure that this process of theory generation prior to empirical testing continues to be legitimised and that scientists are given time and space to engage in it.

Relatedly, there is a risk that the field sees the only valid source of information for the development of theories that underpin psychological interventions as being basic science laboratory data. We are in danger of increasingly viewing therapy in purely scientific terms as value-free techniques, failing to take into account the social, economic, and political roots from which therapies emerged (Marks, 2017). For example, cognitive-behavioural perspectives were strongly influenced by stoicism – the idea that we are disturbed not by things but by our view of things (Evans, 2012). The emergence of CBT also coincided with the cognitive revolution in other areas of science and the emergence of computing (Miller, 2003). Not all of these broader influences are easily reduced into testable experiments to run in the laboratory. We need to preserve a broad set of influences on our theorising, with experimental evidence being a key, but not the sole, source of theory development. Moreover, a theory should be evaluated not solely on the basis of how well it generates testable hypotheses that can be deductively examined in the laboratory but also on the extent to which it serves as a set of ‘guiding principles’ to inform clinical management in real-world settings. This echoes Lewin's views about the importance of ensuring a good theory is applicable (Lewin, 1943; 1951).

We will now consider a range of additional risks of an overly rigid application of research credibility principles.

4.2. Risk of inhibiting accidental discovery

A risk of rigid implementation of the research credibility movement is that it may inhibit accidental discovery. The history of medicine is replete with examples of serendipitous discoveries revolutionizing practice (Ban, 2006). For example, Fleming first identified penicillin when a staphylococcus sample became contaminated and developed a mould culture that inhibited subsequent bacterial growth. The psychotropic effects of many drugs subsequently used to treat mental health problems were also discovered by accident, including lithium, tricyclic antidepressants, and monoamine oxidase inhibitors.

A strong form of this argument is that nearly all of the significant

advances in our field were of this accidental form and very few emerged from a pure application of the deductive method. We do not endorse this strong form. While accidental and free discovery should be enabled, it is also overstating the case to say that good discoveries only come about by these means. We do think that a theory-driven ‘experimental psychopathology’ approach has delivered advances in clinical psychology (e.g. see Clark, 2004; Salkovskis, 2002). Moreover, even in the case of free or accidental discoveries, it is of course then essential to test them robustly.

4.3. Risk of over-interpreting non-replications as true negative findings

A rigid application of research credibility principles could lead to overly simplistic thinking about how to interpret non-replications. The underlying assumption behind attempts to replicate is that there is a univariate, homogeneous phenomenon that can be measured and manipulated in a pure fashion. This may be possible in a basic biomedical science field. However, the complex psychological phenomena studied in clinical psychology are typically ‘bounded’ by person and context (Rozin, 2001), making direct replication (repeating the experiment in exactly the same way and seeing if the effects hold) virtually impossible. Instead, it is more realistic to aim for conceptual replications (varying aspects of the experiment to see if the effects still hold; typically different settings or different subjects) (cf., Zwaan, 2018).

Taking too narrow a view of replication in clinical psychology could lead to the dismissal of all ‘near miss phenomena’ (i.e. an interesting effect that is not significant in a priori planned analyses but is in some secondary ones) or results that do not fully replicate (i.e. effects that are found in one sample but not another). It is a logical error to conclude these ‘near misses’ are always ‘true negatives’ that are not worthy of further exploration. In particular, it may be that they are legitimate findings that are bounded by participant characteristics and context (and clarifying these boundary conditions is of value to the field). Of course, some findings in the present literature will be ‘true negatives’ that emerge from questionable practice and it would be a waste of resource to look for boundary effects of these phenomena. Balanced rather than extreme application of research credibility principles should be helpful in this regard, as over time the frequency of these ‘true negatives’ in the literature should reduce as QRPs are gradually eliminated, and the field as a result will be able to have greater confidence in the findings that emerge in each individual study that is published.

4.4. Risk of devaluating descriptive and exploratory analyses

An overly strict interpretation of research credibility recommendations regarding pre-registration could devalue descriptive and exploratory data analysis. It is already challenging to publish work of this kind in the field. For example, we rarely see in leading psychology journals work that clearly acknowledges its descriptive or exploratory origins, and there is a pressure to provide a clean, narrative story that makes all analyses conducted seem like a priori, linear tests of a predetermined nature (e.g., hypothesizing after the results are known [HARK-ing]; Kerr, 1998). If journals and reviewers apply simplistic binary decision rules to pre-registration (i.e. rejecting research that is not pre-registered, deviates in even small ways from pre-registered protocols, or is not clearly deductive in nature) this could exacerbate the problem. A new ‘file drawer’ problem may emerge where descriptive and exploratory findings are never publishable.

This would be a pity, as in our experience descriptive and exploratory analyses can be an extremely productive phase of research, particularly for ‘young’ sciences in need of innovation (Rozin, 2001). A specific example of the value of descriptive approaches is provided by Skinner (1955) when writing about the trajectory of scientific discovery over his career. Core to his early work was careful description and characterisation of sometimes unexpected observations about the conditions in which animals did or did not learn a behavioural contingency, without any need to be guided by or attempt to test a pre-specified learning theory (Skinner, 1949). Each of these observations was followed up rigorously, but not solely by use of deductive hypothesis testing.

A criticism of exploratory analyses is that they are merely ‘fishing expeditions’. However, at their best these analyses are often an iterative, theory-informed discovery process that is both inductive and deductive in nature. The researcher observes an unexpected pattern of data, generates a revised or novel theory and set of hypotheses to account for this unexpected pattern, then conducts further analyses to test this theory. Through immersion in the data and becoming familiar with its constraints and possibilities, this can facilitate creativity and ‘flow’ that help to bring about new understandings of complex problems. As with initial theory generation, this typically follows a rapid TOTE cycle (Miller, Galanter, & Pribram, 1960), which ends when the optimal explanation of the data is reached. If the analysis protocol had to be registered and new data prospectively collected to test it at each phase of this iterative discovery process, this would interrupt creative flow and significantly slow down the evolution of that line of work.

Our hope and expectation, however, is that these risks regarding pre-registration will not materialise. Advocates of the credibility movement do clearly say that there is a place for exploratory, post hoc research as long as it is properly reported (Frankenhuis & Nettle, 2018). Moreover, there has been a move away from a binary distinction of research being either confirmatory or exploratory, to recognising there is a continuum. There is now increasing flexibility in the kind of pre-registration that can be completed (Nosek, Ebersole, DeHaven, & Mellor, 2018), for example allowing some data-driven decisions to be made (but encouraging people to articulate how these decisions will be made a priori) and also allowing pre-registration of secondary analysis protocols. Similarly, there is no reason not to encourage pre-registration of purely descriptive studies. If these more flexible kinds of pre-registration are widely adopted by journals and reviewers this may legitimise rather than inhibit well-conducted descriptive and exploratory research.

4.5. Risk of rigid specification of trial design

There is a risk that the way in which the field conducts clinical trials of interventions will become overly rigid and fail to take into account the particular requirements of mental health research (arguably akin to some of the problems associated with ‘teaching to the test’ in education; see Chomsky & Robichaud, 2014). This could exacerbate type II error. In an elegant thought experiment, Cuijpers and Cristea (2015) describe how one might go about designing a research programme to show a novel therapy to be effective when it is not (capitalising on type I error). We borrow from their framework here to consider the reverse: the features of a research programme designed to show a novel therapy to be no better than standard care, when it is in fact superior. We are intentionally taking a reductio ad absurdum stance here.

First, no one with any particular interest in a therapy should be allowed to evaluate it, ostensibly to eliminate allegiance effects but also ensuring people will not work hard to optimise treatment delivery and trial data collection. Second, care should be taken not to provide extensive training and supervision for the novel therapy prior to starting the trial, and instead this should be exactly matched against the treatment-as-usual arm (ignoring the fact that the therapists recruited into the treatment-as-usual arm have undergone extensive training in usual care prior to the trial and have then practised it clinically for a number of years). Third, to minimise measure-hacking, only a single outcome measure should be collected (despite the fact that we are really not sure what depression is, that multiple outcomes are valued by different stakeholders, and that most of our measures are flawed). Fourth, to minimise p-hacking and eliminate multiple comparisons, a rigid analysis protocol should be pre-specified with no room for data-driven modification (for example, if the primary measure is shown to have

high rates of missingness, to have errors in its administration, or have poor reliability we should not move to another measure).

Fifth, any attempts to conduct moderation analyses to unpick what works for whom and when should be dismissed as a fishing expedition (despite the fact that there is overwhelming evidence of heterogeneity in the presentation of depression). Sixth, any variance between sites and therapists should be ignored as noise, not as key information to inform successful implementation of a complex intervention. Seventh, attempts to evaluate mechanism of action of complex multi-component treatments should either be avoided or be suitably reductionist, so that the burden to participants is minimised and so that no real opposition to the developers’ preferred theory of change is inadvertently generated. An added advantage of this narrow focus of outcome, moderator and mediator measurement is that the trial data are of no use to any other researchers in the field for secondary analyses, so the principal investigator will not often be bothered by requests to share data.

Eighth, after the trial has been conducted, the outcomes should then be entered into trial-level meta-analyses. Meta-regression and subgroup analyses should be run and interpreted to support the null, without making reference to issues of chronic under-powering that exacerbate type II error. Individual patient data meta-analysis should be avoided, as it can throw up complexities in conclusions about whether to recommend a treatment or not at the population level.

Further, steps can be taken to maximise the chances that a treatment will fail to be implementable in real world practice, by focusing on maximising internal validity with no consideration of external validity and generalisation. An artificially ‘clean’ group of patients with no comorbidity and no differences in depression ‘staging’ should be recruited, ideally so tightly specified that these clients very rarely present in real-world practice. A very rigid treatment protocol should be specified with no scope for any tailoring to different presentations. There should be very prescriptive requirements about setting and supervision throughout the trial that cannot be easily recreated in real world settings.

Of course, no one in the research credibility movement, and we would hope nobody at all, is suggesting anything like the above would be a good way to conduct a trial in depression. However, there is a serious point here. The above is an extreme example of how to try and minimise type I error, without any consideration of type II error. Maximising internal validity of trials makes sense in areas of medicine where the phenomenon of interest is clearly operationalised, can be measured with a single outcome, the delivery of the intervention targeting the phenomena is simple and fixed, and implementation is not a challenge (e.g. see recommendations in Heneghan, Goldacre, & Mahtani, 2017). This is not the case in complex fields like mental health. This nuance may not always be appreciated by generic biomedical funding panels, who may expect mental health trials to follow the same ground rules as those used in ‘cleaner’, more mature areas of health research. Mental health trials may not be funded as a result.

We now make a number of recommendations to promote optimal trial design in mental health. The collection of multiple outcomes should be supported as a legitimate approach, ideally with a ‘core set’ agreed across trials. Indeed, triangulation of different measures of the same construct should be seen as good practice not ‘data fishing’ (e.g. Denzin, 1978). We agree with Guidi et al. (2018) who recommend multiple measurement of outcomes (clinimetrics) that move beyond simply indexing symptom relief to also cover social functioning, well-being, and patient satisfaction and cover staging variables. This broad array of outcomes beyond symptoms is often of particular importance to service-users (for examples in depression see: Zimmerman et al., 2006; Demyttenaere et al., 2015). Each additional outcome measure needs to add unique information to justify its inclusion (i.e. display incremental validity). Sensible suggestions have also been put forward by Flake and Fried (2019) about selection of measures (defining the construct of interest, operationalising the construct, justifying the choice of measure, justifying any modifications to the measure, and justifying the creation of any bespoke measure) that will help preserve rigour.

Trials should have broad recruitment criteria that reflect real-world presentations of mental health problems. Given the chronic and/or recurrent nature of depression, there needs to be multiple outcome measurement over a long time period (rather than rigidly pre-specifying one time point as the primary outcome and not examining longer term benefits). A good example of this is follow-up analyses of the COBALT trial, which demonstrated five-year clinical and cost-effectiveness of CBT (compared to treatment as usual) for cases of depression that had not previously responded to medication (Wiles et al., 2016).

A particularly important area for progress in the field is to ensure trials are optimally designed to explore moderators and mediators. It may be the case that no single treatment will ever end up being superior for all cases of depression given its inherent heterogeneity. It is therefore of value to be able to better match the right treatment to the right presentation. A wide-ranging set of individual moderators should be collected (and ideally standardised across trials) to help establish what works for whom and when. Given clear evidence that staging (severity, chronicity, number of episodes) impacts on treatment response, the list of moderators should include staging variables (Guidi et al., 2018; Lorenzo-Luaces, 2018). Given increasing evidence from network psychopathology that not all symptoms of disorders are ‘created equal’ (e.g., Fried et al., 2017), whether particular elevations in particular symptom clusters predict outcome should be examined. It is also important to capture contextual moderators (including therapist and service-context), which can be achieved by using methods from realist complex intervention science (Fletcher et al., 2016) and also ensuring a robust process evaluation is built into trial design (see Moore et al., 2014).

There should be an emphasis on the development of ways of analysing moderation to maximise the capacity to match the right treatment to the right patient (Cohen & DeRubeis, 2018). Robust validation of putative mediators and ways to evaluate them should occur using laboratory methods and only then be included in trials when they are well specified. Mediators and outcomes should be assessed at multiple points during treatment, given the importance of establishing temporal precedence and the difficulty in predicting a priori exactly when in treatment change will occur. Analysis methods should be developed that make it possible to allow for individual differences in when this change comes about, rather than assuming this is fixed for all individuals (for general recommendations around mediation analysis, see Emsley et al., 2010; Kraemer, Wilson, Fairburn, & Agras, 2002; Hayes & Rockwood, 2017). Trials should be adequately powered to conduct these mediation and moderation analyses (and ideally to examine site and therapist effects; Spirito et al., 2009; Kraemer & Robinson, 2005; Magnusson, Andersson & Carlbring, 2018), although we recognise this may not be pragmatically possible in all cases. One approach might be to form large, cross-national consortia to enable trials with sufficient sample sizes to be run that are powered to explore these issues. Such an approach is being successfully followed in other areas of psychiatric research (for example, the Psychiatric Genomics Consortium: see https://www.med.unc.edu/pgc/about-us/).

Meta-analytic techniques that are sensitive for examining moderation at the individual participant rather than trial level should be utilized where possible. For example, using individual patient data meta-analysis, it has been demonstrated that Mindfulness Based Cognitive Therapy is more effective as a relapse prevention treatment for depression (relative to maintenance anti-depressant medication) in participants with more marked residual depression symptoms (Kuyken et al., 2016). Further, another individual patient data meta-analysis in older adults found a superiority of anti-depressants to placebo only in those with a long illness duration and at least moderate depression severity (Nelson, Delucchi, & Schneider, 2013). At present, individual patient data meta-analyses are hampered by the extensive variation in trial design and measurement, so establishing a standard set of

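As an aside, the treatment-by-moderator logic discussed here (does a staging variable such as chronicity change the treatment effect?) can be made concrete with a toy simulation. Everything below is hypothetical: the variable names, effect sizes, and the simple 2×2 cell-means contrast are illustrative only, not drawn from any of the trials cited.

```python
import random
import statistics

random.seed(1)

# Hypothetical moderation example: treatment benefit differs by 'staging'
# (first-episode vs. chronic depression). All values are simulated.
def simulated_change_score(treated: bool, chronic: bool) -> float:
    true_benefit = 6.0 if (treated and not chronic) else 2.0 if treated else 0.0
    return true_benefit + random.gauss(0, 3)  # add measurement noise

cells = {(t, c): [simulated_change_score(t, c) for _ in range(200)]
         for t in (True, False) for c in (True, False)}
means = {cell: statistics.mean(scores) for cell, scores in cells.items()}

# Treatment effect within each stratum, and the moderation (interaction)
# contrast of a 2x2 design: a non-zero value means the answer to
# 'what works for whom' depends on staging.
effect_first_episode = means[(True, False)] - means[(False, False)]
effect_chronic = means[(True, True)] - means[(False, True)]
interaction = effect_first_episode - effect_chronic
print(round(effect_first_episode, 1), round(effect_chronic, 1),
      round(interaction, 1))
```

In a real trial this contrast would be estimated in a regression model with a treatment × moderator interaction term and appropriate covariates; the sketch only shows the logic of the comparison.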
moderator measures to include across different future trials will be beneficial.

Careful consultation with patients should occur at the trial design phase to find ways to ensure measurement burden is not overly onerous and to maximise chances of high rates of data completeness (e.g., see Bodart et al., 2018). Given that this will create multi-factorial data sets, care should be taken at the protocol registration stage to be clear about what are primary and secondary outcome measures (or what decision rules will be used to inform selection of the primary outcome after data collection is completed) and how analyses will be conducted.

The correct control group should be selected based on the question of interest and how well developed the field is (see Gold et al., 2017), not just automatically defaulting to a two-arm comparison of two active therapies in all scenarios. Where the key question is to understand how and for whom a treatment works, it may be appropriate to use an ‘experimental’ control condition (for example, one that is identical to the treatment of interest apart from some particular processes/elements of interest). Where it is uncertain how effective standard care is relative to no intervention, it may be appropriate to include a third arm like a waitlist control (although for ethical reasons it is inappropriate to deny clients at risk treatments that we do definitely know are effective; Gold et al., 2017). Where there is no current standard of care (for example, evaluating what to do with treatment resistant clients after all standard treatments have been exhausted), it may be appropriate to have a treatment as usual or no intervention control condition.

Pilot trials are often conducted prior to a definitive trial to determine if it is feasible to run. While the recommendation is not to analyse clinical outcomes in pilot trials as they are under-powered (Thabane et al., 2010), there may nevertheless be other ways to evaluate proof-of-concept of the intervention including interpreting confidence intervals and using Bayesian methods (Lee, Whitehead, Jacques, & Julious, 2014). There may also be merit in analysing pilot trial data at the individual participant level, for example assessing reliable and clinically significant improvement and deterioration rates (cf. Jacobson & Truax, 1991). If data are collected at multiple intervals (for example, weekly sessions), with the addition of a baseline phase it may also be possible to conduct intensive time series analyses at the individual level.

It should be accepted as legitimate practice for intervention developers to be involved in evaluating them, at least in the early stages of the development pipeline. However, there is no way to escape the fact that treatment developers are likely to have direct and indirect financial and career gains if their treatment is widely adopted, meaning they are vulnerable to intentional or unintentional bias. When developers do evaluate their own treatments, they should clearly document steps they have taken to minimise allegiance bias and ensure equipoise between arms. For example, adversarial collaboration may be useful (where primary researchers who each favour different treatments jointly work on the same trial, with both centres running both arms) (see Haaga, 2009; Leykin & DeRubeis, 2009). For a treatment to be considered well validated, while a portion of trials can be conducted by the treatment developers, some should be conducted by genuinely neutral research groups.

When funding panels review mental health trials, a specialist in this area should be part of the committee and also panel members from broader bio-medical fields should be made aware of the ‘messy’ context in which mental health operates and why this changes the requirements of good trial design.

Having now reviewed the extent to which an overly-rigid focus upon reducing type I error might exacerbate type II error, we will now extend the focus to look at broader issues of research waste in the clinical psychology field.

5. Broadening the focus to other areas of research waste

Psychology to date has primarily concerned itself with problems of non-replication and associated QRPs that may underpin these. However, this is only one part of a broader set of factors contributing to research waste (see Chalmers et al., 2014; Al-Shahi et al., 2014; Chan et al., 2014). By shining the light solely on the replication issue in psychology, there is a risk that other areas of waste will be left in the dark (and therefore will not be minimised). We will now review other sources of waste relevant to the development of psychological therapies.

We see therapies as optimally emerging via a translational research pathway, with basic research informing treatment development, evaluation and then implementation. It is currently recognised that a disappointingly small amount of basic science research goes on to inform healthcare practice in this way (Grimshaw, Eccles, Lavis, Hill, & Squires, 2012) and that it takes too long to move through this pathway (Morris, Wooding, & Grant, 2011), meaning treatments may be obsolete or no longer fit for context at the end of the process. Anything that enhances the efficiency of this pipeline (and eliminates waste) will therefore be of benefit. Two key rate limiting steps identified in the treatment development pipeline are translating basic science insight into new treatment protocols (the t1 gap; ‘the valley of death’) and then ensuring that novel treatment protocols, shown to work in ideal settings, are implemented in a real world context (the t2 gap) (Butler, 2008; Coller & Califf, 2009; Cooksey, 1996). We will review ways to overcome these gaps. In addition, we will discuss a third issue regarding how to allow organic growth of psychological therapies after the development pipeline has been completed.

These other sources of research waste are best seen as intimately intertwined with the research credibility movement. A thoughtful, flexible application of research credibility principles will help, while an overly rigid implementation of research credibility principles will hinder, progress in reducing research waste in each of these areas. While these issues are interlinked, what counts as optimal methodological rigour may look somewhat different at each phase of the translational pipeline (development, evaluation and implementation).

5.1. Failure to optimise treatment development

Psychological therapies are complex interventions, which are not straightforward to develop. It is a mistake to neglect this development phase. The MRC complex intervention framework (Craig et al., 2006) is explicit that excessive focus upon definitive trial evaluation, neglecting the development, piloting, and implementation phases, can result in interventions that are: less likely to be effective; harder to evaluate; less likely to be implemented; and less likely to be worth implementing in the first place. In other words, not optimising a treatment prior to trial significantly inflates the risk it will not be any better than existing treatments.

Clark (2004) and Salkovskis (2002) both write eloquently about the systematic steps they followed to develop novel treatments for anxiety disorders, via a creative synergy between the clinic and the laboratory. There is a rapid, iterative cycle between theory development, experimental science and treatment development, at the core of which is an attention to phenomenology when working with patients.

Therefore, the optimal context in which to develop new therapies is to have time and space to foster a creative synergy between clinical practice, research, and broader contextual influences. In modern day clinical-academia, this is hard to achieve for a variety of reasons. First, it is challenging to gain funding to develop novel treatments – funders tend to cover either the basic science arena or the treatment evaluation arena. As a result, it is difficult to do the iterative development work to translate basic science findings into a novel treatment to evaluate. Second, it is difficult (in a UK context at least) to have a split academic and clinical post. After clinical training individuals are typically forced to choose either a pure research or clinical pathway. This means researchers lose their links to ‘grass roots’ practice, both inhibiting their creativity and inflating the risk they will develop a treatment that is not fit for real world context. The UK National Institute of Health Research

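As a concrete aside, the Jacobson and Truax (1991) indices recommended above for individual-level pilot analyses are simple to compute. The sketch below uses the standard formulae but entirely hypothetical questionnaire values (the means, SDs, and reliability are made up for illustration), and it assumes lower scores indicate better functioning.

```python
import math

def reliable_change_index(pre: float, post: float, sd_pre: float,
                          reliability: float) -> float:
    """Jacobson & Truax (1991) RCI: observed change divided by the standard
    error of the difference; |RCI| > 1.96 is unlikely (p < .05) to reflect
    measurement error alone."""
    se_measurement = sd_pre * math.sqrt(1 - reliability)
    s_diff = math.sqrt(2) * se_measurement
    return (pre - post) / s_diff  # positive = improvement when lower is better

def clinically_significant(post: float, mean_clinical: float,
                           sd_clinical: float, mean_functional: float,
                           sd_functional: float) -> bool:
    """Criterion 'c': the post-treatment score lies closer to the functional
    than to the dysfunctional population distribution."""
    cutoff = ((sd_clinical * mean_functional + sd_functional * mean_clinical)
              / (sd_clinical + sd_functional))
    return post < cutoff

# Hypothetical example values (not taken from any cited trial)
rci = reliable_change_index(pre=28, post=12, sd_pre=8, reliability=0.9)
print(round(rci, 2))  # 4.47: reliable improvement
print(clinically_significant(post=12, mean_clinical=25, sd_clinical=8,
                             mean_functional=8, sd_functional=7))  # True
```

Counting how many pilot participants meet both criteria (and how many reliably deteriorate) gives the individual-level improvement and deterioration rates referred to in the text.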
(NIHR) has recognised this as a significant problem for allied health professionals (NIHR trainee coordinating centre, 2017). Third, if an individual has found a way to develop a novel treatment with encouraging preliminary data, it can then be challenging to publish these findings. Journals often view this development work as insufficiently robust, so reject it at the review phase.

These factors may impact on the career choices that researchers make. Scientists may be put off becoming treatment developers at all, instead choosing to operate in the basic science arena or the trials arena. Alternatively, researchers may rush the development of an intervention and take this to RCT evaluation before it is optimised (both in terms of how it works and how to implement it effectively). This then increases the odds of finding a null result in the subsequent trial (even if that trial is conducted in a way that is fully compliant with research credibility guidelines).

To minimise this source of waste, greater emphasis should be placed on systematic treatment development. The MRC guidelines for complex intervention development can be a helpful starting point to follow here (Craig et al., 2006). These recommend: being clear what the treatment is targeting; building a coherent underlying theoretical basis to the treatment; having a clear, well specified protocol that others can follow; and being clear how change will be measured. Further, any uncertainties about whether the planned definitive trial evaluation can be conducted (including recruitment, retention, intervention acceptability and feasibility, likely effect size and variability) should be resolved in piloting and feasibility work before proceeding to evaluation. These guidelines are of optimal utility when they are seen as a useful set of guiding principles which can be adapted to each specific intervention context, rather than a prescriptive set of rules that must be followed in a ‘tick-box’ fashion.

Moreover, the MRC guidelines are arguably under-specified as to how complex protocols are optimally designed. The steps outlined by Clark (2004) can be particularly helpful when developing psychological therapies: i) use clinical interviews and experimental paradigms to identify core mechanisms triggering and maintaining a disorder; ii) construct theoretical accounts as to why these mechanisms do not self-correct; iii) use experimental studies to test these hypothesized maintaining factors; and iv) develop specialised treatments that reverse these maintaining factors (sometimes by direct translation of experimental manipulation procedures used in the earlier step). We would add to this the need to test evidence for an underlying mechanism by establishing it is cross-sectionally and prospectively linked to the target outcome using questionnaire and experience-sampling designs, and showing that manipulating it in the laboratory and in real world settings changes the target outcome (Dunn et al., 2017). Well conducted basic science that thoughtfully rather than rigidly follows research credibility principles is likely to be helpful in this regard, as it will help ensure that any mechanisms that treatments target are more likely to have a sound evidence-based footing. The notion of ‘full-cycle’ evaluation from social psychology may optimise the utility of such work (e.g. Mortensen & Cialdini, 2010). This recommends that researchers start with naturalistic observation to establish the presence of an effect in the real world. They then develop a theory to determine what processes underlie this effect and use experimental methods to verify this theory. Critically, they then return to observational work in the natural environment to corroborate the experimental findings.

Case series methods can be particularly helpful in the preliminary evaluation of treatments, being ideally suited to assess treatment acceptability and feasibility and provide preliminary evidence of effectiveness and proof-of-concept without requiring significant investment of resources (Kazdin, 2011; Morley, 2017). Use of a randomised multiple baseline design (randomising individuals to different lengths of baseline phase before starting treatment) helps to differentiate between a genuine treatment effect and natural recovery over time or other confounding factors (Kratochwill & Levin, 2010). Intensive time series analyses have adequate power to test statistically the efficacy of an intervention for an individual participant (Borckardt et al., 2008). By replicating findings across a series of individual cases, this begins to assess whether findings are generalizable (for example using single-case meta-analytic techniques; Jamshidi et al., 2018). There is potential for multi-centre collaboration here, with the same interventions being evaluated in different settings to explore context effects and to ‘road test’ implementation issues. It is possible to refine the intervention between case series waves, allowing for rapid optimisation of treatment protocols.

Case series can also help identify treatment non-responders, who can be overlooked within the overall effect of a group-based design. Case series can be further enhanced by incorporating qualitative methods, allowing for detailed exploration of patient views on feasibility, acceptability, efficacy, mechanisms of action, reasons for non-response, and ways in which the treatment can be improved (Onghena, Maes, & Heyvaert, 2018). It is also possible to estimate pre to post effect sizes at the group level in these case series, which ideally should be at least of a large magnitude according to Cohen's rules of thumb (Cohen, 1988).

Case series continuation rules can be pre-specified (e.g., the treatment is acceptable to patients and a majority will complete a minimum acceptable course; at least 50% of participants show reliable and clinically significant change; on average a large effect size is observed; therapists can be trained to deliver with minimum adequate fidelity) to determine whether a protocol needs further refinement or is ready to proceed to RCT evaluation. If required, further iterations of the protocol can be made and the case series can be repeated on the optimised protocol before proceeding to the trial stage. Arguably a definitive trial should be conducted only where there is little uncertainty that the treatment will be effective. The purpose of a definitive trial should be to establish how the treatment performs relative to other treatments (Clark, 2004).

Funders should ensure they have ‘balanced portfolios of investment’ that fall in different places on the translational pipeline (critically including treatment development and not just focusing on treatment evaluation). There should be more explicit consideration of value for money in the work funded (for example, adaptation of value of information analysis methods from health economics; Tuffaha, Gordon, & Scuffham, 2014). Wherever possible, joint clinical-academic posts should be supported to allow treatment developers to be clinically active whilst they develop new treatments. In a UK context, this would be best served by joint posts between universities and the NHS, similar to those available to medical doctors.

Journal editors and reviewers should be encouraged to publish development work where it is done to adequate standards and reported in an appropriately conservative fashion given its preliminary nature. For example, journals should be willing to consider rigorously conducted case series using intensive time series analyses that follow published methodological guidelines; thoughtfully analysed qualitative studies of stakeholder views of the problem; and pilot/feasibility studies that establish the planned definitive trial can be conducted and which include appropriate proof of concept analyses given sample size limitations.

5.2. Neglecting implementation

There is an acknowledged divide between clinical work in an ‘ivory tower’ academic setting and at the clinical ‘coal face’ (Stirman, Gutner, Langdon, & Graham, 2016). The argument is that treatments are developed in the ivory tower that can only work with a carefully screened, homogeneous group of patients where therapy is delivered under ideal conditions. These treatments are not always implementable in real world settings, and if they can be implemented they are significantly less effective. This may reflect two distinct underlying issues: i) the treatments designed in the ‘ivory tower’ may not actually be fit for the real world service context as the developers did not consult widely enough at the design stage; ii) the means by which the treatment is

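Returning briefly to the case series continuation rules described above: because they are simple pre-specified thresholds, they can be checked mechanically. The sketch below is illustrative only; the thresholds mirror the examples in the text, the effect size convention (standardising mean change by the baseline SD) is one of several in use, and the six-person data set is entirely hypothetical.

```python
import statistics

def pre_post_effect_size(pre, post):
    """Within-group Cohen's d: mean pre-to-post change standardised by the
    baseline SD (one common convention among several)."""
    change = [b - a for b, a in zip(pre, post)]
    return statistics.mean(change) / statistics.stdev(pre)

def ready_for_rct(pre, post, completed, reliably_improved,
                  d_threshold=0.8, completion_threshold=0.5,
                  improvement_threshold=0.5):
    """Example pre-specified continuation rules: a large average effect,
    a majority completing a minimum course, and a majority showing reliable
    and clinically significant change."""
    return (pre_post_effect_size(pre, post) >= d_threshold
            and sum(completed) / len(completed) >= completion_threshold
            and (sum(reliably_improved) / len(reliably_improved)
                 >= improvement_threshold))

# Hypothetical six-person case series (lower scores = better)
pre = [30, 28, 25, 32, 27, 29]
post = [18, 20, 24, 15, 22, 16]
completed = [True, True, True, False, True, True]
reliably_improved = [True, True, False, True, True, True]
print(ready_for_rct(pre, post, completed, reliably_improved))  # True
```

If any rule fails, the protocol would be refined and the case series repeated rather than proceeding to a definitive trial.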
being disseminated into the real world context are not optimised (therapist selection, training and supervision; service context). Each of these issues raises slightly different challenges and responses.

With regards to how to design an intervention that is fit for the real world context, it is important to consider implementation from the outset rather than view it as something that is thought about only after a trial has shown that a treatment works in ideal settings. This can be facilitated by ensuring key stakeholders such as patients, clinicians and commissioners are consulted at all stages (Dunn, 2017). In the UK, the NIHR INVOLVE programme has helped ensure that patient consultation is now embedded into all stages of the research process in NIHR funded projects (INVOLVE, 2012). There are emerging examples of intervention co-design in the mental health field, where patients and researchers work together to develop the treatment (e.g. see Nakarada-Kordic, Hayes, Reay, Corbet, & Chan, 2017; Larkin, Boden, & Newton, 2015). While the benefits of patient involvement are only starting to be empirically evaluated, and a sense of how best to use PPI input is still evolving, results are encouraging (see Brett et al., 2014).

There is a range of formal frameworks that can guide this initial design process. For example, the intervention mapping approach – a framework pioneered in health psychology to aid the effective development and roll out of public health interventions – emphasises the importance of this co-design process (Bartholomew et al., 2016).

Likely affordability and cost-effectiveness of an intervention should be considered from the outset. There is little point in developing a novel therapy that is too expensive to implement in its target context and/or that is worse value for money than existing treatments. Therefore, health economic methods should be incorporated from an early stage, including detailed costing of intervention delivery and development of health economic models to look at likely long term cost-effectiveness. Ideally these models should take a broad societal perspective and consider likely savings over the longer term, given the widespread impacts of mental health and the chronic, relapsing course of presentations. Arguably, only treatments that emerge as likely to fall under acceptable absolute cost thresholds and to be at least neutral in terms of cost-effectiveness relative to standard care should be allowed to proceed to definitive trial phase.

With regards to how to disseminate an appropriate treatment into the real-world context after it has been trialled, implementation science perspectives may be helpful. Implementation studies focus on the rates and quality of use of evidence-based approaches, rather than whether those evidence based approaches are clinically effective in their own right (Bauer, Damschroder, Hagedorn, Smith, & Kilbourne, 2015). Methods used in such implementation research can include both process evaluation (evaluating implementation without any direct intention to change the ongoing process) and formative evaluation (evaluating implementation and feeding back to the implementation team to try and enhance ongoing practice) (Bauer et al., 2015). If it is not possible to conduct formal implementation controlled trials in real-world settings, interrupted time series designs may be appropriate (measurement of outcome occurs at multiple time points before and after the implementation effort) (Bauer et al., 2015). One example of a helpful framework derived from implementation science is Normalisation Process Theory (May, 2013), which discusses ways to ensure that a health care intervention becomes a routine part of clinical practice over the long term. In particular, it is useful to think of the sense-making people do when first tasked with implementing a new practice, the relational work that is necessary to build and sustain a community of practice around a complex intervention, the operational work that helps individuals enact a set of practices, and the appraisal work that individuals do when evaluating the impacts of a set of practices on themselves and others. This methodology has been increasingly applied in the complex interventions field to aid intervention development and implementation (May et al., 2018).

In addition, there needs to be further research into the best methods to select, train and supervise effective therapists for a given treatment approach, how to ensure ongoing assessment and monitoring of therapist competence, and what counts as optimal dosage for treatment (e.g., see Shafran et al., 2009). Analysis of routine registry data after a treatment has been rolled out can provide a ‘natural experiment’ to identify what factors predict successful implementation (for example, looking at differences in outcomes between therapists and sites and seeing what underpins these differences; e.g., Clark et al., 2018). As this relies on large scale roll out of therapy provision, it may be viable only in contexts with well-developed public mental health systems that capture routine registry data.

5.3. Inhibiting organic evolution of treatment

A neglected issue is how to allow complex interventions to evolve after the definitive trials evaluating them have been completed. The delivery of most drug and biomedical procedures is relatively fixed. In contrast, psychological therapies often continue to be refined in the years after the definitive trials have been conducted, informed by clinicians’ experiences of implementation and through integration of other non-trial sources of empirical evidence. For example, CBT for depression has been enriched over the years by mechanistic insights in basic science being translated into novel interventions (e.g. rumination being identified as a key maintenance factor; Watkins, 2008). Strictly speaking, integrating these techniques into routine CBT practice means practice is no longer truly evidence-based, as the protocols have not been trialled with these novel elements included. However, these adaptations are clearly evidence informed. It would be counter-productive to prevent these techniques from being used.

What tends to happen is that when innovation of the above kind happens, researchers are nudged into labelling each of these adaptations as a distinct new therapy, as this is the only way to gain funding to develop and evaluate them properly. They go on to conduct new trials, for example head-to-head comparisons to see if the new treatment is clearly better than what came before. This process is resource intensive and slow. Given the incremental rather than stepwise nature of these refinements in some cases, it is unlikely that anything other than small effect size differences will be found (and very large and costly trials would be needed to demonstrate this). This proliferation of multiple protocols may not be helpful for the field, not least because clinicians then end up having to learn and choose between multiple protocols (when it would be more efficient to train them in a universal way of working that could be tailored based on client presentation).

Rapid, resource-efficient ways of refining existing treatments like CBT need to be developed that avoid this artificial proliferation of ‘distinct’ protocols and allow existing protocols to continue to develop rather than calcify. It is not viable to go through the conventional trial pipeline for each new adaptation that is made, comparing them to the standard protocol. It is likely that each adaptation will result in a ‘marginal gain’ and probably only for the subset of clients where that particular mechanism/feature is present, meaning head-to-head comparisons are unlikely to show superiority. We are unsure of the optimal way to achieve this goal, and here make some tentative suggestions.

One approach could be stepped wedge ‘training’ trial designs in real world settings (Hemming, Haines, Chilton, Girling & Lilford, 2014). For example, CBT clinicians (such as high intensity Improving Access to Psychological Therapies [IAPT] workforces) could be trained in novel additional techniques to add to the standard depression protocols at the cluster level, and the outcomes be evaluated. Another approach could be to conduct single session experimental designs, for example adding a novel session into existing treatment and carefully monitoring the impact this has on outcomes (see Clark, 2004). Further, there may be value in adaptive rolling designs, where multiple treatment options are tested simultaneously and sequential Bayesian analysis removes poorly performing arms (see Blackwell, Woud, Margraf & Schönbrodt, 2018). However, such an approach may not be easily operable except in the e-health domain. A potentially more radical solution is to move to a

8 B.D. Dunn, et al. Behaviour Research and Therapy 120 (2019) 103417

‘process-based’ approach to evaluating therapies. The emphasis here is on identifying the underlying mechanisms that maintain distress and the therapeutic procedures that are effective at altering these mechanisms, irrespective of what school of therapy is being practised (see Hofmann & Hayes, 2019).

5.4. Additional sources of waste

There are a range of other sources of waste that are not specific to clinical psychology, so we only briefly allude to them here. There is emerging recognition that the current grant system may not optimise efficiency: the efforts researchers make in writing unsuccessful proposals more than offset the gains made from selecting the best proposals, especially when only a few proposals can be funded (Gross & Bergstrom, 2019). There is now interest in alternative funding methods, like partial lotteries or funding researchers based on past scientific success, each with different pros and cons (Gross & Bergstrom, 2019; Smaldino, Turner, & Contreras Kallens, 2019). While we have not seen anything written on the topic, there may also be significant waste in publication practices, for example the time wasted writing multiple submissions of the same piece to different journals, working down the journal impact factor hierarchy to ensure papers are published in the most career-enhancing outlets; arguably it would be better practice to publish more rapidly. There are also sources of waste in the regulation of science, for example inefficient administrative processes around gaining ethics approval, negotiating intellectual property, and difficulties in optimising the use of routine registry data due to failures to implement electronic records, to name but a few (see Al-Shahi Salman et al., 2014).

6. Conclusion

In this article we have discussed the evolution of the research credibility movement, which we agree is very helpful in eliminating type I error in fields characterised by high degrees of theoretical innovation but limited rigorous evaluation of these ideas. However, we have argued that there are risks of inhibiting innovation and creativity in clinical psychology if these principles are applied too rigidly by funders, journals or scientists. These include inhibiting accidental discovery; placing treatment and theory development in a scientific vacuum and neglecting broader cultural influences; over-interpreting non-replications as true negative findings; devaluing exploratory analyses; and conducting rigid clinical trials that follow a biomedical ideal and fail to take into account the heterogeneity of mental health. We have also reviewed broader contributors to research waste when developing psychological therapies. These include not investing in initial treatment development, not optimising treatment implementation, and not allowing treatments to grow organically.

The net product of all these factors could be a significant reduction in the extent and the pace of innovation in the clinical field. Fewer novel treatments will emerge, and those that do could take longer to move through the translational pipeline. Our hope is that by discussing these issues the benefits of the research credibility movement can be gained without leading to inadvertent costs. Moreover, we hope that the broader translational pipeline can be optimised. While we have largely restricted our discussion to depression, many of these concerns may be relevant for other areas of the clinical psychology field that are also characterised by significant heterogeneity and a need for innovation.

Acknowledgements

Dunn, Wright, O'Mahen and Brown met for a day to discuss the themes and issues to address in this article. Dunn wrote the initial manuscript and Wright, O'Mahen and Brown commented on and edited multiple drafts. The authors received no funding from an external source and declare no conflict of interest. Thanks to Sophie Dunn for proof reading.

References

Al-Shahi Salman, R., Beller, E., Kagan, J., et al. (2014). Increasing value and reducing waste in biomedical research regulation and management. Lancet, 383, 176–185.
Baker, M. (2015). Is there a reproducibility crisis? A Nature survey lifts the lid on how researchers view the ‘crisis’. Nature, 533, 452–454.
Ban, T. A. (2006). The role of serendipity in drug discovery. Dialogues in Clinical Neuroscience, 8, 335–344.
Bartholomew, E. L. K., Markham, C. M., Ruiter, R. A. C., Fernàndez, M. E., Kok, G., & Parcel, G. S. (2016). Planning health promotion programs: An Intervention Mapping approach. Hoboken, NJ: Wiley.
Bauer, M. S., Damschroder, L., Hagedorn, H., Smith, J., & Kilbourne, A. M. (2015). An introduction to implementation science for the non-specialist. BMC Psychology, 3, 32.
Blackwell, S. E., Woud, M., Margraf, J., & Schönbrodt, F. D. (2018). Introducing the leapfrog design: A simple Bayesian adaptive rolling trial design for accelerated treatment development and optimization. https://doi.org/10.31234/osf.io/zywpr
Bodart, S., Byrom, B., Crescioni, M., Eremenco, S., & Flood, E., ePRO Consortium (2018). Perceived burden of completion of patient-reported outcome measures in clinical trials: Results of a preliminary study. Therapeutic Innovation & Regulatory Science. https://doi.org/10.1177/2168479018788053
Borckardt, J. J., Nash, M. R., Murphy, M. D., Moore, M., Shaw, D., & O'Neil, P. (2008). Clinical practice as natural laboratory for psychotherapy research: A guide to case-based time-series analysis. American Psychologist, 63, 77.
Borsboom, D. (2017). A network theory of mental disorders. World Psychiatry, 16, 5–13.
Brett, J., Staniszewska, S., Mockford, C., Herron-Marx, S., Hughes, J., Tysall, C., et al. (2014). Mapping the impact of patient and public involvement on health and social care research: A systematic review. Health Expectations, 17, 637–650.
Butler, D. (2008). Translational research: Crossing the valley of death. Nature, 453, 840–842.
Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., et al. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14, 365–376.
Camerer, C. F., Dreber, A., Holzmeister, F., et al. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 2, 637–644.
Chakrabarty, M. (2010). Karl Popper on theory creation. International Journal of Arts and Sciences, 3, 377–392.
Chalmers, I., Bracken, M. B., Djulbegovic, B., et al. (2014). How to increase value and reduce waste when research priorities are set. Lancet, 383, 156–165.
Chalmers, I., & Glasziou, P. (2009). Avoidable waste in the production and reporting of research evidence. Lancet, 374, 86–89.
Chan, A. W., Song, F., Vickers, A., et al. (2014). Increasing value and reducing waste: Addressing inaccessible research. Lancet, 383, 257–266.
Chomsky, N., & Robichaud, A. (2014). Standardized testing as an assault on humanism and critical thinking in education. Radical Pedagogy, 11, 3–11.
Clark, D. M. (2004). Developing new treatments: On the interplay between theories, experimental science and clinical innovation. Behaviour Research and Therapy, 42, 1089–1104.
Clark, D. M., Canvin, L., Green, J., Layard, R., Pilling, S., & Janecka, M. (2018). Transparency about the outcomes of mental health services (IAPT approach): An analysis of public data. The Lancet, 391, 679–686.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155–159.
Cohen, Z. D., & DeRubeis, R. J. (2018). Treatment selection in depression. Annual Review of Clinical Psychology, 14, 209–236.
Coller, B. S., & Califf, R. M. (2009). Traversing the valley of death: A guide to assessing prospects for translational success. Science Translational Medicine, 1, 1–5.
Cooksey, D. (2006). A review of UK health research funding.
Craig, P., Dieppe, P., Macintyre, S., Michie, S., Nazareth, I., & Petticrew, M. (2008). Developing and evaluating complex interventions: The new Medical Research Council guidance. British Medical Journal, 337, a1655.
Cuijpers, P., & Cristea, I. A. (2016). How to prove that your therapy is effective, even when it is not: A guideline. Epidemiology and Psychiatric Sciences, 25, 428–435.
Cuijpers, P., Karyotaki, E., Weitz, E., Andersson, G., Hollon, S. D., & van Straten, A. (2014a). The effects of psychotherapies for major depression in adults on remission, recovery and improvement: A meta-analysis. Journal of Affective Disorders, 159, 118–126.
Cuijpers, P., Turner, E. H., Mohr, D. C., Hofmann, S. G., Andersson, G., Berking, M., et al. (2014b). Comparison of psychotherapies for adult depression to pill placebo control groups: A meta-analysis. Psychological Medicine, 44, 685–695.
Cuijpers, P., van Straten, A., Andersson, G., & van Oppen, P. (2008). Psychotherapy for depression in adults: A meta-analysis of comparative outcome studies. Journal of Consulting and Clinical Psychology, 76, 909–922.
De Vries, Y. A., Roest, A. M., de Jonge, P., Cuijpers, P., Munafò, M. R., & Bastiaansen, J. A. (2018). The cumulative effect of reporting and citation biases on the apparent efficacy of treatments: The case of depression. Psychological Medicine, 48, 2453–2455.
Demyttenaere, K., Donneau, A. F., Albert, A., Ansseau, M., Constant, E., & Van Heeringen, K. (2015). What is important in being cured from depression? Discordance between physicians and patients (1). Journal of Affective Disorders, 174, 390–396.
Denzin, N. K. (1978). The research act (2nd ed.). New York: McGraw-Hill.
DeRubeis, R. J., Cohen, Z. D., Forand, N. R., Fournier, J. C., Gelfand, L. A., & Lorenzo-Luaces, L. (2013). The personalized advantage index: Translating research on prediction into individualized treatment recommendations. A demonstration. PLoS One, 9, e83875.
Dunn, B. D. (2017). Opportunities and challenges for the emerging field of positive emotion regulation: A commentary on the special edition on positive emotions and cognitions in clinical psychology. Cognitive Therapy and Research, 41, 469–478.
Dunn, B. D., & Roberts, H. (2016). Improving the capacity to treat depression using talking therapies: Setting a positive clinical psychology agenda. In A. Wood, & J. Johnson (Eds.), Handbook of positive clinical psychology.
Emsley, R., Dunn, G., & White, I. R. (2010). Mediation and moderation of treatment effects in randomised controlled trials of complex interventions. Statistical Methods in Medical Research, 19, 237–270.
Evans, J. (2012). Philosophy for life and other dangerous situations. London: Rider Books.
Fiedler, K. (2018). The creative cycle and the growth of psychological science. Perspectives on Psychological Science, 13, 433–438.
Flake, J. K., & Fried, E. I. (2019, January 17). Measurement schmeasurement: Questionable measurement practices and how to avoid them. https://doi.org/10.31234/osf.io/hs7wm
Fletcher, A., Jamal, F., Moore, G., Evans, R. E., Murphy, S., & Bonell, C. (2016). Realist complex intervention science: Applying realist principles across all phases of the Medical Research Council framework for developing and evaluating complex interventions. Evaluation, 22, 286–303.
Fraley, R. C., & Vazire, S. (2014). The N-Pact factor: Evaluating the quality of empirical journals with respect to sample size and statistical power. PLoS One, 9(10), e109019.
Frankenhuis, W. E., & Nettle, D. (2018). Open science is liberating and can foster creativity. Perspectives on Psychological Science, 13, 439–447. https://doi.org/10.1177/1745691618767878
Fried, E. I. (2017). The 52 symptoms of major depression: Lack of content overlap among seven common depression scales. Journal of Affective Disorders, 208, 191–197.
Fried, E. I., & Nesse, R. M. (2016). Depression sum-scores don't add up: Why analysing specific depression symptoms is essential. BMC Medicine, 13, 72.
Fried, E. I., van Borkulo, C. D., Cramer, A. O. J., Boschloo, L., Schoevers, R. A., & Borsboom, D. (2017). Mental disorders as networks of problems: A review of recent insights. Social Psychiatry and Psychiatric Epidemiology, 51, 1–10.
Fried, E. I., van Borkulo, C. D., Epskamp, S., Schoevers, R. A., Tuerlinckx, F., & Borsboom, D. (2016). Measuring depression over time… Or not? Lack of unidimensionality and longitudinal measurement invariance in four common rating scales of depression. Psychological Assessment, 28, 1354–1367.
Glasziou, P., Altman, D. G., Bossuyt, P., et al. (2014). Reducing waste from incomplete or unusable reports of biomedical research. Lancet, 383, 267–276.
Gold, S., Enck, P., Hasselmann, H., Friede, T., Ulrich, U. H., & Mohr, D. C. (2017). Control conditions for randomised trials of behavioural interventions in psychiatry: A decision framework. Lancet Psychiatry, 4, 725–732.
Grimshaw, J. M., Eccles, M. P., Lavis, J. N., Hill, S. J., & Squires, E. J. (2012). Knowledge translation of research findings. Implementation Science, 7, 50.
Gross, K., & Bergstrom, C. T. (2019). Contest models highlight inherent inefficiencies of scientific funding competitions. PLoS Biology, 17(1), e3000065.
Guidi, J., Brakemeier, E. L., Bockting, C. L. H., et al. (2018). Methodological recommendations for trials of psychological interventions. Psychotherapy and Psychosomatics, 87, 276–284.
Haaga, D. A. F. (2009). A timely reminder on research design and interpretation. Clinical Psychology: Science and Practice, 16, 66–68.
Hasin, D. S., Sarvet, A. L., Meyers, J. L., Saha, T. D., Ruan, W. J., Stohl, M., et al. (2018). Epidemiology of adult DSM-5 major depressive disorder and its specifiers in the United States. JAMA Psychiatry, 75, 336–346.
Hayes, A. F., & Rockwood, N. J. (2017). Regression-based statistical mediation and moderation analysis in clinical research: Observations, recommendations and implementation. Behaviour Research and Therapy, 98, 39–57.
Hemming, K., Haines, T. P., Chilton, P. J., Girling, A. J., & Lilford, R. J. (2015). The stepped wedge cluster randomised trial: Rationale, design, analysis and reporting. British Medical Journal, 350, h391.
Heneghan, C., Goldacre, B., & Mahtani, K. R. (2017). Why clinical trial outcomes fail to translate into benefits for patients. Trials, 18, 122.
Hofmann, S. G., Curtiss, J., & McNally, R. J. (2016). A complex network perspective on clinical science. Perspectives on Psychological Science, 11, 597–605.
Hofmann, S. G., & Hayes, S. C. (2019). The future of intervention science: Process-based therapy. Clinical Psychological Science, 7, 37–50.
Hopwood, C. J., & Vazire, S. (2018). Reproducibility in clinical psychology. In A. G. C. Wright, & M. N. Hallquist (Eds.), Handbook of research methods in clinical psychology.
Ingre, M., & Nilsonne, G. (2018). Estimating statistical power, posterior probability and publication bias of psychological research using the observed replication rate. Royal Society Open Science, 5, 181190.
Insel, T., Cuthbert, B., Garvey, M., Heinssen, R., Pine, D. S., Quinn, K., et al. (2010). Research domain criteria (RDoC): Toward a new classification framework for research on mental disorders. American Journal of Psychiatry, 167, 748–751.
INVOLVE (2012). Briefing notes for researchers: Involving the public in NHS, public health and social care research.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124.
Ioannidis, J. P. A., Greenland, S., Hlatky, M. A., et al. (2014). Increasing value and reducing waste in research design, conduct, and analysis. Lancet, 383, 166–175.
Jacobson, N. S., & Truax, P. (1991). Clinical significance: A statistical approach to defining meaningful change in psychotherapy research. Journal of Consulting and Clinical Psychology, 59, 12–19.
Jamshidi, L., Heyvaert, M., Declercq, L., et al. (2018). Methodological quality of meta-analyses of single-case experimental designs. Research in Developmental Disabilities, 79, 97–115.
Kaufman, J. C., & Glaveanu, V. P. (2018). The road to uncreative science is paved with good intentions: Ideas, implementations, and uneasy balances. Perspectives on Psychological Science, 13, 457–465.
Kazdin, A. E. (2011). Single-case research designs: Methods for clinical and applied settings (2nd ed.). New York, NY: Oxford University Press.
Kelly, G. A. (1955a). The psychology of personal constructs. Vol. 1: Theory and personality. Oxford, England: Norton.
Kelly, G. A. (1955b). The psychology of personal constructs. Vol. 2: Clinical diagnosis and psychotherapy. Oxford, England: Norton.
Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2, 196–217.
Kessler, R. C., Berglund, P., Demler, O., Jin, R., Koretz, D., Merikangas, K. R., et al. (2003). The epidemiology of major depressive disorder: Results from the National Comorbidity Survey Replication (NCS-R). Journal of the American Medical Association, 289, 3095–3105.
Kraemer, H. C., & Robinson, T. N. (2005). Are certain multicenter randomized clinical trial structures misleading clinical and policy decisions? Contemporary Clinical Trials, 26, 518–529.
Kraemer, H. C., Wilson, G. T., Fairburn, C. G., & Agras, W. S. (2002). Mediators and moderators of treatment effects in randomised clinical trials. Archives of General Psychiatry, 59, 877–883.
Kratochwill, T. R., & Levin, J. R. (2010). Enhancing the scientific credibility of single-case intervention research: Randomization to the rescue. Psychological Methods, 15, 124–144.
Kuhn, T. S. (1970). The structure of scientific revolutions (2nd ed.). Chicago: University of Chicago Press.
Kuyken, W., Warren, F. C., Taylor, R. S., et al. (2016). Efficacy of mindfulness-based cognitive therapy in prevention of depressive relapse. JAMA Psychiatry, 73, 565–574.
Larkin, M., Boden, Z. V. R., & Newton, E. (2015). On the brink of genuinely collaborative care: Experience-based co-design in mental health. Qualitative Health Research, 25, 1463–1476.
Lee, E. C., Whitehead, A. L., Jacques, R. M., & Julious, S. A. (2014). The statistical interpretation of pilot trials: Should significance thresholds be reconsidered? BMC Medical Research Methodology, 14, 41.
Leichsenring, F., Abbass, A., Hilsenroth, M. J., Leweke, F., Luyten, P., Keefe, J. R., et al. (2017). Biases in research: Risk factors for non-replicability in psychotherapy and pharmacotherapy research. Psychological Medicine, 47, 1000–1011.
Lewin, K. (1943). Psychology and the process of group living. The Journal of Social Psychology, 17, 113–131.
Lewin, K. (1951). Field theory in social science. Chicago: University of Chicago Press.
Leykin, Y., & DeRubeis, R. J. (2009). Allegiance in psychotherapy outcome research: Separating association from bias. Clinical Psychology: Science and Practice, 16, 54–65.
Longmore, R. J., & Worrell, M. (2007). Do we need to challenge thoughts in cognitive behavior therapy? Clinical Psychology Review, 27, 173–187.
Lorenzo-Luaces, L. (2018). Representing the heterogeneity of depression in treatment research. Acta Psychiatrica Scandinavica, 138(4). https://doi.org/10.1111/acps.12914
Lorenzo-Luaces, L., German, R. E., & DeRubeis, R. J. (2015). It's complicated: The relation between cognitive change procedures, cognitive change, and symptom change in cognitive therapy for depression. Clinical Psychology Review, 41, 3–15.
Magnusson, K., Andersson, G., & Carlbring, P. (2018). The consequences of ignoring therapist effects in trials with longitudinal data: A simulation study. Journal of Consulting and Clinical Psychology, 86, 711–725.
Marks, S. (2017). Psychotherapy in historical perspective. History of the Human Sciences, 30, 3–16.
May, C. (2013). Towards a general theory of implementation. Implementation Science, 8, 18.
May, C. R., Cummings, A., Girling, M., Bracher, M., Mair, F. S., May, C. M., et al. (2018). Using normalization process theory in feasibility studies and process evaluations of complex health interventions: A systematic review. Implementation Science, 13, 80.
Miller, G. A. (2003). The cognitive revolution: A historical perspective. Trends in Cognitive Sciences, 7, 141–144.
Miller, G. A., Galanter, E., & Pribram, K. H. (1960). Plans and the structure of behaviour. New York: Holt.
Moore, G., Audrey, S., Barker, M., Bond, L., Bonell, C., Hardeman, W., et al. (2014). Process evaluation of complex interventions: Medical Research Council guidance. London: MRC Population Health Science Research Network.
Morley, S. (2017). Single-case methods in clinical psychology: A practical guide. UK: Routledge.
Morris, Z. S., Wooding, S., & Grant, J. (2011). The answer is 17 years, what is the question: Understanding time lags in translational research. Journal of the Royal Society of Medicine, 104, 510–520.
Mortensen, C. R., & Cialdini, R. B. (2010). Full-cycle social psychology for theory and application. Social and Personality Psychology Compass, 4, 53–63.
Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Percie du Sert, N., et al. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1, 0021.
Munder, T., Brutsch, O., Leonhart, R., Gerger, H., & Barth, J. (2013). Researcher allegiance in psychotherapy outcome research: An overview of reviews. Clinical Psychology Review, 33, 501–511.
Nakarada-Kordic, I., Hayes, N., Reay, S. D., Corbet, C., & Chan, A. (2017). Co-designing for mental health: Creative methods to engage young people experiencing psychosis. Design for Health, 1, 229–244.
Nelson, J. C., Delucchi, K. L., & Schneider, L. S. (2013). Moderators of outcome in late-life depression: A patient-level meta-analysis. American Journal of Psychiatry, 170, 651–659.
NIHR Trainee Coordinating Centre (2017). Ten years on: Adapting and evolving to new challenges in developing tomorrow's health research leaders.
Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2018). The preregistration revolution. Proceedings of the National Academy of Sciences, 115, 2600–2606.
Onghena, P., Maes, B., & Heyvaert, M. (2018). Mixed methods single case research: State of the art and future directions. Journal of Mixed Methods Research. https://doi.org/10.1177/1558689818789530
Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.
Perepletchikova, F., Treat, T. A., & Kazdin, A. E. (2008). Treatment integrity in psychotherapy research: Analysis of the studies and examination of the associated factors. Journal of Consulting and Clinical Psychology, 75, 829–841.
Popper, K. R. (2005 [1974]). Unended quest: An intellectual autobiography (Rev. ed.). London; New York: Routledge.
Reardon, K. W., Corker, K. S., & Tackett, J. L. (2018, August 31). The emerging relationship between clinical psychology and the credibility movement. The Behavior Therapist. https://doi.org/10.31234/osf.io/46rk5
Rozin, P. (2001). Social psychology and science: Some lessons from Solomon Asch. Personality and Social Psychology Review, 5, 2–14.
Salkovskis, P. M. (2002). Empirically grounded clinical interventions: Cognitive-behavioural therapy progresses through a multi-dimensional approach to clinical science. Behavioural and Cognitive Psychotherapy, 30, 3–9.
Saxon, D., Firth, N., & Barkham, M. (2017). The relationship between therapist effects and therapy delivery factors: Therapy modality, dosage and non-completion. Administration and Policy in Mental Health and Mental Health Services Research, 44, 705–715.
Serghiou, S., & Goodman, S. N. (2018). Random-effects meta-analysis: Summarizing evidence with caveats. JAMA Guide to Statistics and Methods. https://doi.org/10.1001/jama.2018.19684
Shafran, R., Clark, D. M., Fairburn, C. G., Arntz, A., Barlow, D. H., Ehlers, A., et al. (2009). Mind the gap: Improving the dissemination of CBT. Behaviour Research and Therapy, 47, 902–909.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359–1366.
Simon, H. A., & Newell, A. (1971). Human problem solving: The state of the theory in 1970. American Psychologist, 26, 145–159.
Skinner, B. F. (1949). Are theories of learning necessary? Psychological Review, 57, 193–216.
Skinner, B. F. (1955). A case history in scientific method. Address of the President at the Eastern Psychological Association meetings, Philadelphia, April 1955.
Smaldino, P., Turner, M. A., & Contreras Kallens, P. (2019, January 28). Open science and modified funding lotteries can impede the natural selection of bad science. https://doi.org/10.31219/osf.io/zvkwq
Spirito, A., Abebe, K. Z., Iyengar, S., Brent, D., Vitiello, B., et al. (2009). Sources of site differences in the efficacy of a multi-site clinical trial: The treatment of SSRI-resistant depression in adolescents. Journal of Consulting and Clinical Psychology, 77, 439–450.
Stirman, S. W., Gutner, C. A., Langdon, K., & Graham, J. R. (2016). Bridging the gap between research and practice in mental health service settings: An overview of developments in implementation theory and research. Behavior Therapy, 47, 920–936.
Tackett, J., Brandes, C., King, K., & Markon, K. (2019). Psychology's replication crisis and clinical psychological science. Annual Review of Clinical Psychology, 15, 579–604.
Tackett, J. L., Lilienfeld, S. O., Patrick, C. J., Johnson, S. L., Krueger, R. F., Miller, J. D., et al. (2017). It's time to broaden the replicability conversation: Thoughts for and from clinical psychological science. Perspectives on Psychological Science, 12, 742–756.
Thabane, L., Ma, J., Chu, R., Cheng, J., Ismaila, A., et al. (2010). A tutorial on pilot studies: The what, why and how. BMC Medical Research Methodology, 10, 1.
Tuffaha, H. W., Gordon, L. G., & Scuffham, P. A. (2014). Value of information analysis in healthcare: A review of principles and applications. Journal of Medical Economics, 17, 377–383.
Turner, E. H., Matthews, A. M., Linardatos, E., Tell, R. A., & Rosenthal, R. (2008). Selective publication of antidepressant trials and its influence on apparent efficacy. New England Journal of Medicine, 358, 252–260.
Üstün, T. B., Ayuso-Mateos, J. L., Chatterji, S., Mathers, C., & Murray, C. J. (2004). Global burden of depressive disorders in the year 2000. The British Journal of Psychiatry, 184, 386–392.
Vazire, S. (2018). Implications of the credibility revolution for productivity, creativity and progress. Perspectives on Psychological Science, 13, 411–417.
Vittengl, J. R., Clark, L. A., Dunn, T. W., & Jarrett, R. B. (2007). Reducing relapse and recurrence in unipolar depression: A comparative meta-analysis of cognitive-behavioral therapy's effects. Journal of Consulting and Clinical Psychology, 75, 475–488.
Wagenmakers, E. J., Dutilh, G., & Sarafoglou, A. (2018). The creativity-verification cycle in psychological science: New methods to combat old idols. Perspectives on Psychological Science, 13, 418–427.
Wai, J., & Halpern, D. F. (2018). The impact of changing norms on creativity in psychological science. Perspectives on Psychological Science, 13, 466–472.
Watkins, E. R. (2008). Constructive and unconstructive repetitive thought. Psychological Bulletin, 134, 163–206.
Whewell, W. (1840). The philosophy of the inductive sciences, founded upon their history: Vol. 2. London, England: John W. Parker. https://archive.org/details/philosophyofindu01whewrich
Wiles, N. J., Thomas, L., Turner, N., Garfield, K., Kounali, D., Campbell, J., et al. (2016). Long-term effectiveness and cost-effectiveness of cognitive behavioural therapy for treatment-resistant depression in primary care: Follow-up of the CoBalT randomised controlled trial. Lancet Psychiatry, 3, 137–144.
Zimmerman, M., McGlinchey, J. B., Posternak, M. A., Friedman, M., Attiullah, N., & Boerescu, D. (2006). How should remission from depression be defined? The depressed patient's perspective. American Journal of Psychiatry, 163, 148–150.
Zwaan, R. A., Etz, A., Lucas, R. E., & Donnellan, M. B. (2018). Making replication mainstream. Behavioral and Brain Sciences, 41, e120.
