Paradoxical Conclusions in Randomised Controlled Trials with 'Doubly Null

Open access Original research BMJ Open: first published as 10.1136/bmjopen-2019-029596 on 9 January 2020. Downloaded from ‘Not clinically effective but cost- effective’ - paradoxical conclusions in randomised controlled trials with ‘doubly null’ results: a cross- sectional study James Raftery ,1 HC Williams,2 Aileen Clarke,3 Jim Thornton,2 John Norrie,4 Helen Snooks,5 Ken Stein6 To cite: Raftery J, Williams HC, ABSTRACT Strengths and limitations of this study Clarke A, et al. ‘Not clinically Objectives Randomised controlled trials in healthcare effective but cost- effective’ increasingly include economic evaluations. Some show - paradoxical conclusions in ► A strength of this study is the identification of a small differences which are not statistically significant. randomised controlled trials with problem to do with results on cost- effectiveness ‘doubly null’ results: a cross- Yet these sometimes come to paradoxical conclusions from randomised trials which is fairly common but sectional study. BMJ Open such as: ‘the intervention is not clinically effective’ but has not otherwise been reported or examined. ‘is probably cost- effective’. This study aims to quantify 2020;10:e029596. doi:10.1136/ ► A limitation was that the sample was confined bmjopen-2019-029596 the extent of non- significant results and the types of to that of trials funded and published by the UK’s conclusions drawn from them. ► Prepublication history and Health Technology Assessment (HTA) programme. additional material for this Design Cross- sectional retrospective analysis of ► The study was strengthened by using full reports paper are available online. To randomised trials published by the UK’s National of each study in the HTA monograph series, which view these files, please visit Institute for Health Research (NIHR) Health Technology requires conclusions and recommendations for the journal online (http:// dx. doi. Assessment programme. We defined as ‘doubly null’ research. org/ 10. 1136/ bmjopen- 2019- those trials that found non- statistically significant ► The generalisability of this study was enhanced by 029596). differences in both primary outcome and cost per the fact that most of the trials reviewed were pub- patient. Paradoxical was defined as concluding in favour lished in major peer- reviewed medical journals. Received 01 February 2019 http://bmjopen.bmj.com/ Revised 11 November 2019 of an intervention, usually compared with placebo or Accepted 19 November 2019 usual care. No human participants were involved. Our sample was 226 randomised trial projects published by BACKGROUND the Health Technology Assessment programme 2004 to 2017. All are available free online. Randomised trials are widely seen as Results The 226 projects contained 193 trials with a providing the most robust evidence of the full economic evaluation. Of these 76 (39%) had at least effectiveness of healthcare interventions. one ‘doubly null’ comparison. These 76 trials contained For that reason, they are mandatory for drug 94 comparisons. In these 30 (32%) drew economic licensing. Many trials show null results, that is on October 2, 2021 by guest. Protected copyright. conclusions in favour of an intervention. Overall report a non- statistically significant difference in the conclusions split roughly equally between those primary outcome. Several studies have identi- favouring the intervention (14), and those favouring fied ‘spin’ whereby authors nonetheless draw either the control (7) or uncertainty (9). possibly unwarranted conclusions.1–3 Discussion Trials with ‘doubly null’ results and Given the importance of costs, trials increas- paradoxical conclusions are not uncommon. The ingly include estimation of cost- effectiveness. differences observed in cost and quality- adjustedlife Even when both outcome and cost show only © Author(s) (or their year were small and non- statistically significant. Almost small, non- significant differences, some trials all these trials were also published in leading peer- employer(s)) 2020. Re- use make strong claims. Apparently paradoxical permitted under CC BY-NC. No reviewed journals. Although some guidelines for reporting commercial re- use. See rights economic results require cost- effectiveness estimates conclusions are sometimes drawn along the and permissions. Published by regardless of statistical significance, the interpretability lines that ‘the intervention was not clinically BMJ. of paradoxical results has nowhere been addressed. effective’ but ‘is probably cost- effective’. For numbered affiliations see Conclusions Reconsideration is required of the The definition of ‘spin’ relies largely on end of article. interpretation of cost- effectiveness analyses in statistical significance, a concept which some 4 Correspondence to randomised controlled trials with ‘doubly null’ have seen as outdated. Economic evaluation Professor James Raftery; results, particularly when economics favours a novel in randomised trials has abandoned statistical j. p. raftery@ soton. ac. uk intervention. significance in favour of decision analysis.5 Raftery J, et al. BMJ Open 2020;10:e029596. doi:10.1136/bmjopen-2019-029596 1 Open access BMJ Open: first published as 10.1136/bmjopen-2019-029596 on 9 January 2020. Downloaded from This new approach has become the norm, as reflected published by the UK National Health Service (NHS). in guidelines for reporting such as CHEERS (Consol- The NHS funds research through National Institute idated Health Economics Evaluation Reporting state- for Health Research (NIHR). Its largest programme, ment).6 Most trials however continue to be designed to the Health Technology Assessment (HTA) programme test hypotheses using statistical significance7 not least due mainly funds systematic reviews and randomised trials. to drug licensing requirements.8 Those trials usually include economic evaluation. The Claims that a new intervention is cost- effective usually programme publishes the entire findings of each trial imply that it should be adopted. Claims to the contrary, in its monograph series. These combine reports of clin- in favour of placebo or usual care, by contrast imply no ical efficacy, cost and cost-effectiveness in a format that change. Our interest has mainly to do with the former requires an overall conclusion.9 due to the implications for healthcare. We defined as ‘doubly null’ those trials that found no statistically significant difference in both primary METHODS outcome and cost per patient. Paradoxical was defined The criteria for trials with ‘doubly null’ results were non- as going on to conclude in favour of an intervention, statistically significant differences in both (a) the primary whether compared with placebo, usual care or to another outcome, and (b) mean cost per patient. Significance was intervention. defined at the 0.05 level (or 95% CIs). We aimed to establish how often ‘doubly null’ results Our sampling frame was all 226 projects which occur and how commonly paradoxical conclusions were included a randomised trial published by the NIHR HTA drawn. Our sample was randomised trials funded and programme 2004 to 2017. We then excluded three types http://bmjopen.bmj.com/ on October 2, 2021 by guest. Protected copyright. Figure 1 Flowchart: projects including randomised trials identifying those which included a ‘doubly null’ result. 2 Raftery J, et al. BMJ Open 2020;10:e029596. doi:10.1136/bmjopen-2019-029596 Open access BMJ Open: first published as 10.1136/bmjopen-2019-029596 on 9 January 2020. Downloaded from of trials: those testing non- inferiority, those which did The probability of an intervention being cost- effective not report an economic evaluation and those which were was only loosely associated with the conclusion drawn. stopped prematurely. Those favouring the intervention had on average 89% This study did not have an original protocol. It evolved probability of being cost- effective, compared with 79% from an initial cross-sectional review of ‘doubly null’ for those favouring a control (see online supplemen- results to consideration of the implications of paradoxical tary appendix 2). These averages however spanned wide conclusions. This led to preparation of focussed summa- ranges and are based on relatively few studies. Some proj- ries of each trial with such conclusions. All the reports ects found small differences convincing while others saw reviewed are available free at https://www. journalsli- them as indicating lack of evidence. Other factors such as brary. nihr. ac. uk/ hta/#/. The volume and issue numbers secondary outcomes, borderline significance and patient in online supplementary appendix 1 identify all those convenience were sometimes mentioned as having influ- included and those with ‘doubly null’ results. enced the overall conclusions. Monograph titles were used to identify those containing a randomised trial. Each trial was checked to identify those with ‘doubly null’ results. For each of these, the DISCUSSION abstract was extracted along with additional text to do The potential for drawing conclusions from ‘doubly null’ with cost- effectiveness. Only primary comparisons were results arises due to different methodological perspec- included, as defined by each project. Overall conclusions tives that can be described as hypothesis testing versus were classified based on the conclusion section of the decision analysis. Rejection of a hypothesis of superiority Abstract. Quotes from those whose economic conclusion precludes any conclusion other than the null hypothesis. favoured

Paradoxical Conclusions in Randomised Controlled Trials with 'Doubly Null

Experimental Evidence During a Global Pandemic

Sensitivity and Specificity. Wikipedia. Last Modified on 16 November 2013

A Randomized Control Trial Evaluating the Effects of Police Body-Worn

Why Replications Do Not Fix the Reproducibility Crisis: a Model And

Which Findings Should Be Published?∗

1 Maximizing the Reproducibility of Your Research Open

(Salmo Salar) with Amoebic Gill Disease (AGD) Chlo

The Significance of Null Results in Physics Education Research

Using Bias Analysis to Address Misclassification in Epidemiology

Null Hypothesis Significance Testing: a Review of an Old and Continuing Controversy

ORBITA and Coronary Stents: a Case Study in the Analysis and Reporting

The Promise and Peril of Surrogate End Points in Cancer Research