Data for Drugs
John Overington, CIO
@johnpoverington
©2017 Medical Discovery Catapult. All rights reserved. Medical Discovery Catapult and the Medical Discovery Catapult logo are among the trademarks or registered trademarks owned by or licensed to Medical Discovery Catapult. All other marks are the property of their respective owners.

• The Medicines Discovery Catapult
• ChEMBL, SureChEMBL & UniChem
• Errors, Errors, Everywhere
• Drug Blending
• Resistance
• Competitive Intelligence
• Are Antibacterials Really Different?
• Assay Networks

The Medicines Discovery Catapult

The UK Catapult Programme
The Catapult centres are a network of world-leading centres designed to transform the UK’s capability for innovation in specific areas and help drive future economic growth.

Medicines Discovery Catapult
• Supporting innovative ‘Fast-to-Patient’ Medicines Discovery
• A not-for-profit company set up and funded by Innovate UK
• Helping to solve shared problems through new disease-based Syndicates corner-stoned by medical research charities
• Focus on translating potential drug candidates into clinical trials as quickly as possible for the good of the wealth and health of the UK
• Doing wet science, informatics, virtual discovery, technology development, process challenge
• Lower barrier to entry and improve market liquidity

ChEMBL, SureChEMBL & UniChem

ChEMBL – https://www.ebi.ac.uk/chembl
• The world’s largest primary public database of medicinal chemistry data
• ~1.7 million compounds
• ~11,000 targets
• ~14 million bioactivities
• Truly Open Data - CC-BY-SA license
• ChEMBL data also loaded into BindingDB, PubChem BioAssay and BARD
• MyChEMBL VM, RDF, full relational download…
A. Gaulton et al (2012) Nucleic Acids Research Database Issue.
40 D1100-1107

ChEMBL
Compound
Assay
• Inhibition of human Thrombin: Ki = 4.5 nM
• PTT (partial thromboplastin time): ED2 = 230 nM
>Thrombin
MAHVRGLQLPGCLALAALCSLVHSQHVFLAPQQARSLLQRVRRANTFLEEVRKGNLERECVEETCSY
EEAFEALESSTATDVFWAKYTACETARTPRDKLAACLEGNCAEGLGTNYRGHVNITRSGIECQLWRS
RYPHKPEINSTTHPGADLQENFCRNPDSSTTGPWCYTTDPTVRRQECSIPVCGQDQVTVAMTPRSEG
SSVNLSPPLEQCVPDRGQQYQGRLAVTTHGLPCLAWASAQAKALSKHQDFNSAVQLVENFCRNPDGD
EEGVWCYVAGKPGDFGYCDLNYCEEAVEEETGDGLDEDSDRAIEGRTATSEYQTFFNPRTFGSGEAD
CGLRPLFEKKSLEDKTERELLESYIDGRIVEGSDAEIGMSPWQVMLFRKSPQELLCGASLISDRWVL
TAAHCLLYPPWDKNFTENDLLVRIGKHSRTRYERNIEKISMLEKIYIHPRYNWRENLDRDIALMKLK
KPVAFSDYIHPVCLPDRETAASLLQAGYKGRVTGWGNLKETWTANVGKGQPSVLQVVNLPIVERPVC
KDSTRIRITDNMFCAGYKPDEGKRGDACEGDSGGPFVMKSPFNNRWYQMGIVSWGEGCDRDGKYGFY
THVFRLKKWIQKVIDQFGE

Affinity of Drugs for Their Efficacy Targets
Ki, Kd, IC50, EC50, & pA2 endpoints for drugs against their ‘efficacy targets’
[Figure: histogram of frequency (0 to 400) against -log10 affinity, from 2 (10 mM) to 12 (1 pM)]
Overington, et al, Nature Rev. Drug Disc. 5 pp. 993-996 (2006)
Gleeson et al, Nature Rev. Drug Disc. 10 pp. 197-208 (2011)

SureChEMBL – https://www.surechembl.org
• New Public chemistry patent resource
• Donated by Digital Science – SureChem commercial product
• Automatically extracted chemical structures from full-text patents
• ~15 million chemical structures
• Updated daily
• Full chemistry download

UniChem – https://www.ebi.ac.uk/unichem
• Simple chemical integration service
• >144 million structures from ~30 sources
• URI/resource ID/Standard InChI based lookups
• Available chemicals, PubChem, ZINC, real time, private
• Chemical structure ‘Time Machine’
J. Chambers et al (2013) J. Cheminf.
DOI:10.1186/1758-2946-5-3

Some Personal Perspectives on ChEMBL
• Things that worked well
  • Single, major visionary funder – Wellcome Trust
  • Focus on data content/backend not GUI
  • Bioinformatics and cheminformatics
  • Clear License – CC-BY-SA - same license as Wikipedia content
  • Private/secure https: services from start
  • Opportunism – SureChEMBL
  • Open data reinvigorated cheminformatics research
• Things that didn’t work so well
  • Community curation attempts
  • Publisher interactions – except Royal Society of Chemistry
  • Insufficient staff in outreach/training
  • Migration to FOSS was too slow

Errors, Errors, Everywhere

The Reproducibility Crisis
Begley & Ellis (2012) Nature DOI:10.1038/483531a & Prinz et al (2011) NRDD DOI:10.1038/nrd3439-c1

Errors in ChEMBL
“The more complex the parameter, the more frequent the errors”
Enhanced data model for ChEMBL can appear as errors – complexes, receptor sets, model organisms
Tiikkainen et al (2013) JCIM DOI:10.1021/ci400099q

Errors in SureChEMBL
Senger et al J Cheminf (2015) DOI:10.1186/s13321-015-0097-z

Inter-species Assay Variability
Same compound, same end-point for rat and human orthologs
[Figure, two panels. Scatter plot of measured potencies: human pKi against rat pKi, axes 2 to 12. Distribution of potency differences: normalised density of diff(human, rat), -4 to 4. n = 2.781]
Krüger & Overington (2012) PLoS Comp. Biol. DOI:10.1371/journal.pcbi.1002333

Inter-lab Variability
Same compound, same species, different publication
[Figure, two panels. Scatter plot of measured potencies: pKi in assay 1 against pKi in assay 2, axes 2 to 12. Distribution of potency differences: normalised density of diff(assay1, assay2), -4 to 4. n = 3.000]
Krüger & Overington (2012) PLoS Comp. Biol.
DOI:10.1371/journal.pcbi.1002333

Inter-species vs Inter-lab Variability
[Figure: density of pKi differences (pKi_i - pKi_j), inter-publication compared with inter-orthologue]
Krüger & Overington (2012) PLoS Comp. Biol. DOI:10.1371/journal.pcbi.1002333

Large-Scale Cell-line Screening Data
M.J. Garnett et al (2012) Nature DOI:10.1038/nature11005 & J. Barretina et al (2012) Nature DOI:10.1038/nature11003

Inconsistent Cell-line Screening Data
B. Haibe-Kains et al (2013) Nature DOI:10.1038/nature12831 (see also Stransky et al (2015) Nature DOI:10.1038/nature15736)

Primary Data – Batches and Replicates
http://www.wexlerwallace.com/wp-content/uploads/2012/04/Southeast-Laborers-Health-v-Pfizer.pdf

Incorrect Chemical Structures
• Bosutinib
• Voxtalisib
http://cen.acs.org/articles/90/web/2012/05/Bosutinib-Buyer-Beware.html and Overington & Wennerberg unpublished

Drug Blending

Drug Targeting
                    Single Drug                      Multiple Drugs
Single Target       Classic Drug Discovery,          Drug Blending
                    Ehrlich’s ‘Magic Bullet’
Multiple Targets    Designed Polypharmacology        Combination Therapy
Overington and Al-Lazikani - unpublished

Monotherapy vs Polypharmacology
Illustrative only
• Monotherapy, monopharmacology – Cetuximab : EGFR
• Monotherapy, polypharmacology – Erlotinib : EGFR
Overington and Al-Lazikani - unpublished

Combination Therapy vs Blending
• Combination therapy, polypharmacology – Erlotinib : EGFR, Losmapimod : p38α
• Combination therapy, monopharmacology – Erlotinib : EGFR, Gefitinib : EGFR
Overington and Al-Lazikani - unpublished

Drug Targeting
                    Single Drug                      Multiple Drugs
Single Target       Classic Drug Discovery,          Drug Blending
                    Ehrlich’s ‘Magic Bullet’
Multiple Targets    Designed/Serendipitous           Combination Therapy
                    Polypharmacology
Overington and Al-Lazikani - unpublished

Drug Pharmacokinetics
• Drugs do not work under steady state conditions
[Figure: plasma concentration-time curve marking Cmax and Tmax, with absorption rate ka and elimination rate kel]
30 10.05.2017 Master headline

Drug Action
• n.b.
effective concentration at site of drug action can be higher or lower than plasma concentration
• MEC = Minimum Effective Concentration
[Figure: % effect (25 to 100) against concentration; MEC ∝ XC50 of the ‘efficacy’ target]

Adverse Drug Reactions (ADRs)
• Acute ADRs are usually related to adverse pharmacology at/around Cmax
• Cmax can vary greatly due to drug dose and a wide range of environmental and genetic factors
• Occurrence and duration of side-effects appear stochastic
• Examples
  • QT prolongation/hERG effects for cisapride – potentially fatal
  • Blurred vision side effect for sildenafil – an inconvenience
[Figure: % effect (25 to 100) against concentration, marking the MEC and the ADR target]

One Drug - Many Targets
• As concentration increases, an ever larger number of targets are modulated
• Polypharmacology – many effects from one drug
• Same target -> different effects
• Different target -> different effects
• Dose dependent
[Figure: concentration-time curve with thresholds for the MEC of the efficacy target, the XC50 of an ‘off’ target and the XC50 of an ADR target]

Lower Dose, Shorter Duration of Action
[Figure: concentration-time curves against the MEC, marking Cmax for each dose]
• 75 mg dose, ka = 0.5, kel = 0.2
• 37.5 mg dose, ka = 0.5, kel = 0.2
• 18.75 mg dose, ka = 0.5, kel = 0.2

Imatinib Polypharmacology Spectra
[Figure: target list shown against a concentration axis (ng.ml-1)]
Tyrosine-protein kinase FYN 5.38
ATP-binding cassette sub-family G member 2 5.39
c-Jun N-terminal kinase 1 5.40
Serine/threonine-protein kinase 17A 5.41
c-Jun N-terminal kinase 3 5.50
Dual specificity protein kinase CLK4 5.53
Mixed lineage kinase 7 5.59
Tyrosine-protein kinase FGR 5.62
Tyrosine-protein kinase FRK 5.64
Maternal embryonic leucine zipper kinase 5.72
Serine/threonine-protein kinase GAK 5.72
Ephrin type-A receptor 8 5.77
Serine/threonine-protein kinase RAF 5.77
Interleukin-1 receptor-associated kinase 1 5.92
Carbonic anhydrase XII 6.01
Homeodomain-interacting protein kinase 4 6.02
Tyrosine-protein kinase Lyn 6.05
Carbonic anhydrase III 6.28
Tyrosine-protein kinase BLK 6.28
Carbonic anhydrase XIV 6.33
BCR/ABL p210 fusion protein 6.41
Carbonic anhydrase VI 6.41
Phosphatidylinositol-5-phosphate 4-kinase type-2 gamma 6.42
Macrophage colony stimulating factor receptor 6.54
Stem cell growth factor receptor