Chemical Predictive Modelling to Improve Compound Quality
Total Page:16
File Type:pdf, Size:1020Kb
REVIEWS Chemical predictive modelling to improve compound quality John G. Cumming1, Andrew M. Davis2, Sorel Muresan3,4, Markus Haeberlein5,6 and Hongming Chen3 Abstract | The ‘quality’ of small-molecule drug candidates, encompassing aspects including their potency, selectivity and ADMET (absorption, distribution, metabolism, excretion and toxicity) characteristics, is a key factor influencing the chances of success in clinical trials. Importantly, such characteristics are under the control of chemists during the identification and optimization of lead compounds. Here, we discuss the application of computational methods, particularly quantitative structure–activity relationships (QSARs), in guiding the selection of higher-quality drug candidates, as well as cultural factors that may have affected their use and impact. Reducing attrition rates remains a major challenge in physics of the system, although they can indicate what drug development. Recent estimates indicate that the the controlling physical properties might be. QSAR chances of a drug candidate successfully reaching methods use statistical regression and classification- the start of Phase II trials, where the role of the target based approaches to identify quantitative patterns that biology in the disease can be tested, are only ~37%1. are present within the existing data. The rules in rule- Moreover, the probability of success in Phase II trials is based methods may be either manually or automatically 1Chemistry Innovation only ~34%. Although a substantial proportion of failures generated. Physics-based approaches are often used in Centre, Discovery Sciences, AstraZeneca R&D, in Phase II trials may be due to flaws in the underlying conjunction with QSAR methods, but the large scale of Alderley Park, Macclesfield biological hypothesis, the physicochemical properties of data sets available can limit the degree to which a physics- SK10 4TG, UK. small-molecule drug candidates also have an important based approach may be rigorously applied owing to limi- 2Respiratory, Inflammation impact, through their influence on ADMET (absorption, tations in computational resources. Empirical methods, and Autoimmunity Innovative distribution, metabolism, excretion and toxicity) char- however, are particularly suited for the analysis of the large Medicines Unit, AstraZeneca R&D, Pepparedsleden 1, acteristics and on the effectiveness of the compounds at volumes of data that are now available from the routine 2 431 83 Mölndal, Sweden. selectively engaging their targets in humans . use of high- and medium-throughput in vitro biological 3Chemistry Innovation Given such issues, there has long been interest in and ADMET assays in drug discovery. We term the suite Centre, Discovery Sciences, the use of computational approaches to help guide the of available empirically based methods ‘chemical pre- AstraZeneca R&D, (TABLE 1) Pepparedsleden 1, selection and optimization of compounds for synthesis dictive modelling’ . 431 83 Mölndal, Sweden. and testing in order to reduce the risks of failure related In this Review, we describe the development of some 4Present address: AkzoNobel to their physicochemical properties3. Many papers have of the most important chemical predictive modelling tools Surface Chemistry, been published that discuss the properties of a ‘quality that are currently used in the industry and discuss some Hamnvägen 2, 444 85 compound’ — a compound that is more likely to robustly of their limitations as well as the cultural aspects that may Stenungsund, Sweden. 5CNS & Pain Innovative test the biological hypothesis in the clinic. prevent these approaches from realizing their potential. Medicines, AstraZeneca R&D, Such computational approaches can be broadly 151 85 Södertälje, Sweden. divided into physics-based and empirically based meth- What defines compound quality? 6Present address: ods. Physics-based methods encompass, for example, A universally accepted definition of compound quality Proteostasis Therapeutics, 200 Technology Square, molecular dynamics and the prediction of binding has not been established. However, multiple landmark Cambridge, Massachusetts affinity by methods such as free energy perturbation publications over the past 15 years or so have indicated 02139, USA. and quantum chemical calculations. Empirical methods the importance of various physicochemical properties, Correspondence to are based on observed patterns in existing data, which are particularly lipophilicity. J.G.C. and A.M.D. used to guide the design of future compounds; examples The pioneering ‘rule of five’ guidelines published by e-mails: John.Cumming@ astrazeneca.com; of such methods include quantitative structure–activity Lipinski et al. in 1997 proposed simple physicochemical 4 [email protected] relationships (QSARs), rule-based systems and expert property-based guidelines for drug permeability (BOX 1). doi:10.1038/nrd4128 systems. They do not rely on any understanding of the By analysing a set of drugs that had entered clinical trials, 948 | DECEMBER 2013 | VOLUME 12 www.nature.com/reviews/drugdisc © 2013 Macmillan Publishers Limited. All rights reserved REVIEWS Table 1 | Selected commonly used tools for chemical predictive modelling Tool or toolkit URL Cheminformatics toolkits Daylight toolkit http://www.daylight.com OpenEye Scientific Software toolkit http://www.eyesopen.com ChemAxon http://www.chemaxon.com The Chemistry Development Kit102 http://sourceforge.net/projects/cdk/files/cdk RDkit http://www.rdkit.org Dragon descriptors103 http://www.talete.mi.it Batch Modules for ACD/Percepta http://www.acdlabs.com/products/percepta/batch.php Statistical tools and toolkits The R Project for Statistical Computing http://www.r-project.org Bioclipse59 http://bioclipse.net JMP statistical discovery software http://www.jmp.com Pipelining tools Pipeline Pilot (Accelrys) http://accelrys.com/products/pipeline-pilot Knime104 http://www.knime.org Data visualization tools Spotfire http://spotfire.tibco.com Vortex (Dotmatics) http://www.dotmatics.com/products/vortex it was found that the following rules pertained to a large study of the toxicological outcomes of 245 compounds proportion of the compounds: molecular mass ≤500 Da; in development at Pfizer, which found that compounds calculated LogP (cLogP) ≤5; number of hydrogen-bond with cLogP >3 and total polar surface area <75 Å2 were donors ≤5; and number of hydrogen-bond acceptors ≤10. six times more likely to show an adverse event in a rat or It was suggested that compounds that violate any two of dog in vivo safety study than a compound with cLogP <3 the ‘rule of five’ conditions are unlikely to be oral drugs4. and total polar surface area >75 Å2 (REF. 12). Flexibility, The recognition that lead optimization often resulted molecular complexity and shape are additional properties in increased lipophilicity and molecular size prompted that have received attention from some in the field14,15. the definition of the ‘lead-like’ concept5,6. This sug- Where potency information is available, the most gested that screening libraries should be preferentially recent evolution of these guidelines is the proposal of populated with smaller and less lipophilic compounds various ligand efficiency concepts that that build on the than those described by Lipinski’s ‘drug-like’ definitions. concepts of lead-likeness and of discriminating between Leads that are smaller and less lipophilic than drug-like optimal and non-optimal binders, as first suggested by compounds would provide ‘headroom’ for lead optimi- Andrews16. Ligand efficiency17, ligand lipophilic efficiency zation. These publications had a huge impact on how (LLE)18 and LogP divided by ligand efficiency (LELP)19 are medicinal chemists defined compound quality and led size-, lipophilicity- and size-plus-lipophilicity-corrected to an increase in the use of in silico approaches for drug measures of potency that help to identify compounds design; for example, medicinal chemists would compu- that are maximizing the use of their chemical struc- tationally filter compound collections and compounds ture in desirable binding and are therefore likely to be proposed for synthesis to only include those with cal- better leads. culated physiochemical properties that were sufficiently For example, scientists at Astex have reported that lead- or drug-like. companies focusing on leads with ligand efficiency and Over the past 15 years, many developments on these lipophilic efficiency find more robust SARs and produce guidelines have been published7,8 to supplement and candidate drugs with a more acceptable compound prop- fine-tune recommended molecular property ranges for erty profile20. Similar findings were reported by Leeson fragments9, target classes, disease areas10 and ADMET and St Gallay21; in their target-by‑target comparison, characteristics11,12. Some of the guidelines have been companies that applied rigorous ligand efficiency and challenged and new ones proposed. Based on studies of LLE optimization produced candidate drugs that were the temporal invariance of physicochemical properties, smaller and less lipophilic. Finally, Keserü and colleagues cLogP it has been questioned whether the focus on molecular analysed data on the ADMET properties of compounds The calculated logarithm of the 1‑octanol–water partition mass is justified, as lipophilicity is the more fundamen- published by Pfizer and showed that LELP can discrim 13 coefficient of the non-ionized tal controlling property . The importance of lipophi-