CORRESPONDENCE

Failed trials for central nervous system disorders do not necessarily invalidate preclinical models and drug targets

Anton Bespalov, Thomas Steckler, Bruce Altevogt, Elena Koustova, Phil Skolnick, Daniel Deaver, Mark J. Millan, Jesper F. Bastlund, Dario Doller, Jeffrey Witkin, Paul Moser, Patricio O’Donnell, Ulrich Ebert, Mark A. Geyer, Eric Prinssen, Theresa Ballard and Malcolm Macleod

A recent article identified five key technical determinants that make substantial contributions to the outcome of drug R&D projects (Lessons learned from the fate of AstraZeneca’s drug pipeline: a five-dimensional framework. Nat. Rev. Drug Discov. 13, 419–431 (2014))¹. Careful consideration of such determinants might be particularly valuable in the fields of neurology and psychiatry, in which successful drug development has declined precipitously over the past decade. This decline has largely been fuelled by a high failure rate in the translation of preclinical efficacy findings, caused by multiple factors (see Supplementary information S1 (table)), including limited training and poor protocol design, inadequate animal models, insufficiently validated therapeutic targets and problems with data handling and reporting.

Here, we focus on three factors that can be addressed immediately in order to re-evaluate the therapeutic potential of older drugs and targets and to increase the probability of success for future preclinical-to-clinical translation: data robustness, data generalizability and target engagement data, a factor that was also highlighted in the recent article¹. We argue that the many failed clinical trials in neuropsychiatry do not necessarily invalidate the potential of a drug target or an animal model. Rather, these failures indicate a need for improved experimental designs and a robust translational strategy to better inform compound and dose selection for clinical trials. We conclude that many of the drugs and targets in neuropsychiatry that have been discarded because of negative clinical trial outcomes may deserve re-evaluation using contemporary knowledge, methodology and tools.

Robustness

The problem of robustness in preclinical data is best illustrated by an example from research in amyotrophic lateral sclerosis (ALS), a severe progressive neurodegenerative disease. There is currently one approved medication for ALS, riluzole, which has only modest effects on survival. Numerous other drug candidates have reported efficacy in a superoxide dismutase 1 (SOD1) mouse model of ALS (one of the common animal models for this disease), but none of these candidates produced an efficacy signal in clinical trials. The ALS Therapy Development Institute later rigorously retested more than 100 of those molecules in the SOD1 model (using adequate statistical power, treatment groups matched for litter and gender, blinding, uniform end point criteria, tracking of non-ALS deaths and quantitative analysis of transgene copy number prior to assigning mice to a study), and they were unable to replicate any of the previously reported preclinical efficacy findings². In this context, the lack of clinical efficacy is not surprising.

Similar problems related to deficiencies in experimental design (such as inadequate blinding and randomization) had previously been observed in studies with animal models of stroke and multiple sclerosis³. The potential impact of such experimental design problems can be assessed retrospectively for drugs that have already been approved or abandoned, and steps can be taken to improve the robustness of experiments for drugs in development (see Supplementary information S2 (table)).

A hallmark of the scientific method is the replication of findings both within and between laboratories. However, such replication is limited by cost, human resources, time and bioethical considerations. Additionally, “there is an almost irresistible pressure to stop when the result is about what one expects it to be,” according to Terry Quinn⁴. However, the more novel the findings of an experiment appear, the less likely they are to be true⁵, especially in the context of poorly designed and underpowered studies. This problem reflects the pressure to publish novel findings in high-impact journals before being scooped by a rival laboratory or funding runs out. The conventional value attached to such publications for career advancement and future funding often outweighs the efforts required to rigorously challenge the novel findings; thus, verification in an independent laboratory is unlikely.

As the costs of clinical studies are so much higher than those of preclinical development, one might assume that pharmaceutical companies would conduct robust replications of key findings. This is indeed the case during lead optimization, candidate selection, testing different administration routes and the use of primary preclinical disease models. However, this is far less common for studies in the more complex disease models that are used in late stages of development and are potentially more relevant for predicting clinical efficacy.

In general, the rigorousness with which preclinical data is obtained, and the resulting robustness of the data, is quite low; few studies report randomization, blinding, sample size calculations or attrition.

Generalizability

Every laboratory has a unique combination of protocols, suppliers of tools and reagents, source of animals, and animal husbandry characteristics. As drugs are used in highly heterogeneous patient populations, efficacy observed in a single lab is more likely to be successfully translated when similar findings can also be obtained under different conditions in other laboratories. There is empirical evidence to support this assumption: the broader the range of circumstances and laboratory environments in which preclinical efficacy can be demonstrated, the higher the likelihood of detecting efficacy signals in clinical studies⁶. A recent European Union-funded initiative (the Multicentre Preclinical Animal Research Team (MultiPART)) established web-based platforms for multicentre animal studies, and the National Institute of Ageing supports an Interventions Testing Program that seeks to validate the efficacy of treatments for ageing across several test sites with adequately powered, rigorous experiments using genetically heterogeneous mice of both sexes.

Generalizability of preclinical data is not only an issue concerned with laboratory conditions and animal strains, age and sex. For example, there is a remarkable paucity of studies employing chronic or subchronic drug administration. Given that even a second dose of most drugs can alter the biological milieu (for example, tolerance, sensitization or receptor regulation), chronic dosing studies are important for generating the best predictions of effects in patients. It is largely unknown whether a compound’s efficacy has been confirmed in properly designed studies

NATURE REVIEWS | DRUG DISCOVERY www.nature.com/nrd ©2016 Macmillan Publishers Limited. All rights reserved.


with chronic administration in animals before initiation of a clinical trial⁷. This information is particularly important for those indications for which preclinical models do not require repeated administration to detect drug efficacy, while the treatment duration in clinical trials can range from weeks to several months, depending on the indication.

Target engagement

In the context of hypothesis-driven drug research, any observation of clinical efficacy is serendipitous if the molecule does not engage its biological targets at the dose tested. Various modelling tools can be used to assess target engagement, and direct target occupancy assays including positron emission tomography (PET) are increasingly available. The aim is to demonstrate that, at relevant doses, the drug is present in the same compartment as the target and in appropriate free concentrations to bind to it. PET studies demonstrate receptor occupancy but not targeted downstream effects; such approaches may not be appropriate for novel drugs working through allosteric or non-competitive molecular mechanisms, and for some targets PET tracers are not yet available.

Scientists at Pfizer considered evidence of exposure at the site of action, target binding and expression of functional pharmacological activity for 44 Phase II programmes across several therapeutic areas⁸. In 43% of cases, it was reported that the target mechanism had not been adequately tested owing to the lack of evidence of target engagement. Similarly, AstraZeneca reported that 40% of efficacy failures in Phase II projects could be attributed to a lack of clear target linkage to a disease or validated animal model, and 29% could be attributed to a lack of data establishing tissue exposure¹.

Even the use of cerebrospinal fluid concentrations to guide dosing may be misleading for intracellular targets or for compounds that are actively transported across the blood–brain barrier. Similar considerations may apply to antibody therapeutics. Key parameters required for brain penetration have been discussed⁹, and the failure of AZD8529 (a positive allosteric modulator of metabotropic glutamate receptor 2 (mGluR2)) in a Phase II study in schizophrenia has been attributed, in part, to unreliable target engagement¹.

We have used the Thomson Reuters Cortellis database to identify drug development projects in schizophrenia between 1994 and 2014; we could not find evidence of biomarker-driven dose selection for 80% of 72 novel drugs subjected to Phase II clinical proof-of-concept studies (FIG. 1).

[Figure 1: bar chart. Vertical axis: number of projects (0 to 20). Horizontal axis: target. Bars distinguish projects without biomarker from projects with biomarker.]

Figure 1 | Analysis of the use of biomarkers in the development of novel treatments for schizophrenia. The Thomson Reuters Cortellis database (searched on January 21, 2014) was used to identify drug development projects in schizophrenia between 1994 and 2014. We could not find evidence of biomarker-driven (for example, using positron emission tomography (PET), MRI or electroencephalography (EEG)) dose selection for 80% of 72 novel drugs that were evaluated in Phase II clinical proof-of-concept studies in this time period. 5-HT2A, 5-hydroxytryptamine receptor 2A; 5-HT6, 5-hydroxytryptamine receptor 6; α7, α7 nicotinic receptor; σ, σ receptor; DA D1, dopamine receptor 1; GABA, γ-aminobutyric acid; GlyT1, sodium- and chloride-dependent glycine transporter 1; H3, histamine receptor 3; M1, muscarinic acetylcholine receptor 1; mGlu, metabotropic glutamate receptor; NK3, neurokinin receptor 3; NMDAR, N-methyl-D-aspartate receptor; PDE10, phosphodiesterase 10.

Path forward

Efforts to improve the robustness of preclinical data will lead to better study designs. Strengthening of publication policies and open access to data (including negative data) is an additional key way to improve data reliability and transparency. The Preclinical Data Forum was established with support from the European College of Neuropsychopharmacology (ECNP)¹⁰ and is developing an online platform to enable scientists to exchange unpublished data in a pre-competitive manner and to share knowledge on the use of tool compounds. This platform should facilitate disclosure of large amounts of pre-competitive information and should be paralleled by the development of consensus approaches to data robustness and demonstrating generalizability.

There may be compounds that have failed in clinical proof-of-concept studies owing to poor target engagement or that might be more appropriate for other diseases sharing similar mechanisms. The ECNP medicines chest provides a list of pharmacological tools no longer under development and can be used to obtain further clinical information on a particular target. Such target revalidation efforts should ideally occur in a pre-competitive space and may involve the development of new business models.

Finally, improved training on appropriate study designs will ensure that preclinical research is conducted to the highest standards, with appropriate protocol design informed by statistical and power analyses similar to those processes now standard for clinical trials.

Summary and conclusions

Herein, we challenge the widely held view that the high failure rate in neuropsychiatry trials invalidates both the drug targets chosen and the preclinical models used. We argue instead that the scientific community should attach greater importance to issues of data robustness, data generalizability and target engagement when designing preclinical studies. We do not seek to detract attention from the fundamental need to strengthen our scientific understanding of disease mechanisms, improve clinical testing strategies, and develop better disease models. Our premise is that an increase in robustness, generalizability and evidence of target engagement will increase the probability of successful translation of preclinical findings into Phase II efficacy. We are optimistic that the adoption of these approaches will enhance our ability to bring improved and much needed medicines to patients.

Anton Bespalov was previously at Neuroscience Research, AbbVie, 6706 Ludwigshafen, Germany. Present address: Partnership for Assessment and Accreditation of Scientific Practice, Am Aukopf 14/1, D-69118 Heidelberg, Germany and Institute of Pharmacology, Pavlov Medical University, 197022 St Petersburg, Russia.


Thomas Steckler is at Janssen Research and Development, B-2340 Beerse, Belgium.

Bruce Altevogt is at Global Policy and International Public Affairs, Pfizer, New York, 10017 New York, USA.

Elena Koustova and Phil Skolnick are at the National Institute on Drug Abuse, National Institutes of Health, Bethesda, Maryland 20892, USA.

Daniel Deaver is at Non-Clinical Research and Development, Alkermes, Waltham, Massachusetts 02451, USA.

Mark J. Millan is at the Institut de Recherche Servier, 78290 Croissy sur Seine, France.

Jesper F. Bastlund is at Neuroscience Research, H. Lundbeck A/S, Copenhagen, 2500 Valby, Denmark.

Dario Doller was previously at Discovery Chemistry & DMPK; Lundbeck Research USA, Paramus, New Jersey 07652, USA. Present address: Concert Pharmaceuticals, Inc., 99 Hayden Avenue, Lexington, Massachusetts 02421, USA.

Jeffrey Witkin is at Neuroscience Discovery Research, Lilly Research Labs, Indianapolis, Indiana 46285, USA.

Paul Moser was previously at the Pierre Fabre Research Institute, 81106 Castres, France. Present address: BIAL-Portela & Ca S.A., Avenida da Siderurgia Nacional, 4745–457 São Mamede do Coronado, Portugal.

Patricio O’Donnell is at the Neuroscience and Pain Research Unit, Pfizer, Cambridge, Massachusetts 02139, USA.

Ulrich Ebert is at Boehringer Ingelheim Pharma, 55218 Ingelheim am Rhein, Germany.

Mark A. Geyer is at the University of California San Diego, La Jolla, California 92093, USA.

Eric Prinssen and Theresa Ballard are at Roche Pharma Research and Early Development, Neuroscience, Ophthalmology and Rare Diseases, Roche Innovation Center Basel, CH-4070 Basel, Switzerland.

Malcolm Macleod is at Edinburgh University, Old College, South Bridge, Edinburgh EH8 9YL, UK.

Correspondence to A.B. [email protected]

doi:10.1038/nrd.2016.88
Published online 17 Jun 2016

1. Cook, D. et al. Lessons learned from the fate of AstraZeneca’s drug pipeline: a five-dimensional framework. Nat. Rev. Drug Discov. 13, 419–431 (2014).
2. Scott, S. et al. Design, power, and interpretation of studies in the standard murine model of ALS. Amyotroph. Lateral Scler. 9, 4–15 (2008).
3. van der Worp, H. B. et al. Can animal models of disease reliably inform human studies? PLoS Med. 7, e1000245 (2010).
4. Quinn, T. Don’t stop the quest to measure Big G. Nature 505, 455 (2014).
5. Ioannidis, J. P. Why most published research findings are false. PLoS Med. 2, e124 (2005).
6. Richter, S. H. et al. Systematic variation improves reproducibility of animal experiments. Nat. Methods 7, 167–168 (2010).
7. Bespalov, A. et al. Drug tolerance: a known unknown in translational neuroscience. Trends Pharmacol. Sci. http://dx.doi.org/10.1016/j.tips.2016.01.008 (2016).
8. Morgan, P. et al. Can the flow of medicines be improved? Fundamental pharmacokinetic and pharmacological principles toward improving Phase II survival. Drug Discov. Today 17, 419–424 (2012).
9. Di, L. et al. Demystifying brain penetration in central nervous system drug discovery. J. Med. Chem. 56, 2–12 (2013).
10. Steckler, T. et al. The preclinical data forum network: A new ECNP initiative to improve data quality and robustness for (preclinical) neuroscience. Eur. Neuropsychopharmacol. 25, 1803–1807 (2015).

Acknowledgements
The authors thank Martien Kas (Utrecht University), Michael Decker (AbbVie), Lynn Butler-David (Exciva), Martin Weber (Genentech), Jurgen Gottowik (Roche) and Katja Brose (Cell Press) for stimulating discussion and helpful comments.

Competing interests statement
The authors declare competing interests: see Web version for details.

FURTHER INFORMATION
ECNP medicines chest: https://www.ecnp.eu/projects-initiatives/ECNP-medicines-chest.aspx
Interventions Testing Program: https://www.nia.nih.gov/research/dab/interventions-testing-program-itp
MultiPART: http://www.dcn.ed.ac.uk/multipart/
Thomson Reuters Cortellis database: https://cortellis.thomsonreuterslifesciences.com/

SUPPLEMENTARY INFORMATION
See online article: S1 (table) | S2 (table)
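The text above notes that few preclinical studies report sample size calculations and calls for protocol design informed by statistical and power analyses. As an illustration only, the following minimal sketch shows what such a calculation can look like for a simple two-group comparison. It is not taken from the correspondence or its references: the function name `animals_per_group`, the default significance level and power, and the choice of a normal approximation are all assumptions made for this example. The normal approximation slightly underestimates the number required by an exact t-test calculation, so the result should be read as a lower bound.

```python
import math
from statistics import NormalDist


def animals_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate group size for a two-sided comparison of two group means.

    Hypothetical helper for illustration. Uses the normal approximation
        n = 2 * (z_(1 - alpha/2) + z_(power))^2 / d^2
    where d is the standardized effect size (difference in means divided
    by the common standard deviation).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)          # quantile for the desired power

    n = 2 * (z_alpha + z_power) ** 2 / effect_size ** 2
    return math.ceil(n)  # round up: a fraction of an animal cannot be enrolled


if __name__ == "__main__":
    # Assumed effect sizes, chosen only to show how quickly n grows as d shrinks.
    for d in (0.5, 1.0, 1.5):
        print(f"d = {d}: about {animals_per_group(d)} animals per group")
```

Under these assumptions, halving the expected effect size roughly quadruples the number of animals needed per group, which is one reason small studies of modest effects tend to be underpowered.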
