Data For Drugs
John Overington, CIO [email protected]
@johnpoverington
©2017 Medical Discovery Catapult. All rights reserved. Medical Discovery Catapult and the Medical Discovery Catapult logo are among the trademarks or registered trademarks owned by or licensed to Medical Discovery Catapult. All other marks are the property of their respective owners.
• The Medicines Discovery Catapult
• ChEMBL, SureChEMBL & UniChem
• Errors, Errors, Everywhere
• Drug Blending
• Resistance
• Competitive Intelligence
• Are Antibacterials Really Different?
• Assay Networks
The Medicines Discovery Catapult
The UK Catapult Programme
The Catapult centres are a network of world-leading centres designed to transform the UK’s capability for innovation in specific areas and help drive future economic growth.
Medicines Discovery Catapult
• Supporting innovative ‘Fast-to-Patient’ Medicines Discovery
• A not-for-profit company set up and funded by Innovate UK
• Helping to solve shared problems through new disease-based Syndicates corner-stoned by medical research charities
• Focus on translating potential drug candidates into clinical trials as quickly as possible for the good of the wealth and health of the UK
• Doing wet science, informatics, virtual discovery, technology development, process challenge
• Lower barrier to entry and improve market liquidity
ChEMBL, SureChEMBL & UniChem
ChEMBL – https://www.ebi.ac.uk/chembl
• The world’s largest primary public database of medicinal chemistry data
• ~1.7 million compounds
• ~11,000 targets
• ~14 million bioactivities
• Truly Open Data - CC-BY-SA license
• ChEMBL data also loaded into BindingDB, PubChem BioAssay and BARD
• MyChEMBL VM, RDF, full relational download…
A. Gaulton et al (2012) Nucleic Acids Research Database Issue. 40 D1100-1107 ChEMBL
[Figure: a ChEMBL record links a compound to a target (human thrombin, shown as its protein sequence) through assay results: inhibition of human thrombin, Ki = 4.5 nM; PTT (partial thromboplastin time) assay, ED2 = 230 nM]
Affinity of Drugs for Their Efficacy Targets
Ki, Kd, IC50, EC50, & pA2 endpoints for drugs against their ‘efficacy targets’
[Figure: histogram of frequency (0 to 400) against -log10 affinity (2 to 12); the axis corresponds to 10mM, 1mM, 100µM, 10µM, 1µM, 100nM, 10nM, 1nM, 100pM, 10pM, 1pM]
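The two axis scales on this histogram are related by a simple logarithm. The sketch below shows the conversion in both directions; the function names and the unit table are illustrative and are not part of ChEMBL.

```python
import math

# Mol/L per unit for the concentration units shown on the axis.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}

def to_p_value(value, unit):
    """Return -log10 of a concentration given in the stated unit."""
    molar = value * UNIT_TO_MOLAR[unit]
    return -math.log10(molar)

def from_p_value(p_value, unit="nM"):
    """Convert a -log10 affinity back to a concentration in the stated unit."""
    return 10.0 ** (-p_value) / UNIT_TO_MOLAR[unit]

# The thrombin example quoted earlier in the deck (Ki = 4.5 nM).
print(round(to_p_value(4.5, "nM"), 2))
print(round(from_p_value(9.0, "nM"), 3))
```

Working on the -log10 scale makes a tenfold change in affinity a shift of one unit, which is why the histogram bins are evenly spaced.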
Overington, et al, Nature Rev. Drug Disc. 5 pp. 993-996 (2006)
Gleeson et al, Nature Rev. Drug Disc. 10 pp. 197-208 (2011)
SureChEMBL – https://www.surechembl.org
• New Public chemistry patent resource
• Donated by Digital Science – SureChem commercial product
• Automatically extracted chemical structures from full-text patents
• ~15 million chemical structures
• Updated daily
• Full chemistry download
UniChem – https://www.ebi.ac.uk/unichem
• Simple chemical integration service
• >144 million structures from ~30 sources
• URI/resource ID/Standard InChI based lookups
• Available chemicals, PubChem, ZINC, real time, private
• Chemical structure ‘Time Machine’
J. Chambers et al (2013) J. Cheminf. DOI:10.1186/1758-2946-5-3 Some Personal Perspectives on ChEMBL
• Things that worked well
  • Single, major visionary funder – Wellcome Trust
  • Focus on data content/backend not GUI
  • Bioinformatics and cheminformatics
  • Clear License – CC-BY-SA - same license as Wikipedia content
  • Private/secure https: services from start
  • Opportunism – SureChEMBL
  • Open data reinvigorated cheminformatics research
• Things that didn’t work so well
  • Community curation attempts
  • Publisher interactions – except Royal Society of Chemistry
  • Insufficient staff in outreach/training
  • Migration to FOSS was too slow
Errors, Errors, Everywhere
The Reproducibility Crisis
Begley & Ellis (2012) Nature DOI:10.1038/483531a & Prinz et al (2011) NRDD DOI:10.1038/nrd3439-c1
Errors in ChEMBL
“The more complex the parameter, the more frequent the errors”
Enhanced data model for ChEMBL can appear as errors – complexes, receptor sets, model organisms Tiikkainen et al (2013) JCIM DOI:10.1021/ci400099q Errors in SureChEMBL
Senger et al J Cheminf (2015) DOI:10.1186/s13321-015-0097-z Inter-species Assay Variability Same compound, same end-point for rat and human orthologs
[Figure, two panels: scatter plot of measured potencies (human pKi against rat ortholog pKi, both axes 2 to 12) and distribution of potency differences (normalised density against diff(human, rat), -4 to 4); n = 2.781]
Krüger & Overington (2012) PLoS Comp. Biol. DOI:10.1371/journal.pcbi.1002333 Inter-lab Variability Same compound, same species, different publication
[Figure, two panels: scatter plot of measured potencies (pKi in assay 1 against pKi in assay 2, both axes 2 to 12) and distribution of potency differences (normalised density against diff(assay1, assay2), -4 to 4); n = 3.000]
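The two variability slides compare paired measurements of the same compound. A minimal sketch of that comparison follows, assuming the data are already available as (pKi, pKi) pairs; the values below are invented for illustration and are not taken from ChEMBL or from Krüger & Overington.

```python
import statistics

def potency_differences(pairs):
    """Difference in pKi for each pair of measurements of one compound."""
    return [first - second for first, second in pairs]

def summarise(diffs):
    """Basic descriptors of the distribution of potency differences."""
    return {
        "n": len(diffs),
        "mean": statistics.mean(diffs),
        "stdev": statistics.stdev(diffs),
        # Share of pairs that agree to within one log unit (tenfold).
        "frac_within_1_log": sum(abs(d) <= 1.0 for d in diffs) / len(diffs),
    }

# Hypothetical paired pKi values (e.g. human vs rat, or assay 1 vs assay 2).
pairs = [(7.2, 6.9), (8.1, 8.4), (5.5, 6.8), (9.0, 8.7), (6.3, 6.1)]
summary = summarise(potency_differences(pairs))
print(summary)
```

Running the same summary on inter-species pairs and on inter-lab pairs gives two spreads that can be compared directly, which is the comparison the slides draw.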
Krüger & Overington (2012) PLoS Comp. Biol. DOI:10.1371/journal.pcbi.1002333
Inter-species vs Inter-lab Variability
[Figure: density of pKi differences, inter-orthologue compared with inter-publication]
Krüger & Overington (2012) PLoS Comp. Biol. DOI:10.1371/journal.pcbi.1002333
Large-Scale Cell-line Screening Data
M.J. Garnett et al (2012) Nature DOI:10.1038/nature11005 & J. Barretina et al (2012) Nature DOI:10.1038/nature11003
Inconsistent Cell-line Screening Data
B. Haibe-Kains et al (2013) Nature DOI:10.1038/nature12831 (see also Stransky et al (2015) Nature DOI:10.1038/nature15736) Primary Data – Batches and Replicates
http://www.wexlerwallace.com/wp-content/uploads/2012/04/Southeast-Laborers-Health-v-Pfizer.pdf Incorrect Chemical Structures Bosutinib Voxtalisib
http://cen.acs.org/articles/90/web/2012/05/Bosutinib-Buyer-Beware.html and Overington & Wennerberg unpublished Drug Blending Drug Targeting
                  | Single Drug                                       | Multiple Drugs
Single Target     | Classic Drug Discovery, Ehrlich’s ‘Magic Bullet’  | Drug Blending
Multiple Targets  | Designed Polypharmacology                         | Combination Therapy
Overington and Al-Lazikani - unpublished Monotherapy vs Polypharmacology
Monotherapy, monopharmacology Monotherapy, polypharmacology
Illustrative only
Cetuximab : EGFR Erlotinib : EGFR
Overington and Al-Lazikani - unpublished Combination Therapy vs Blending
Combination therapy, polypharmacology Combination therapy, monopharmacology
Erlotinib : EGFR Losmapimod : p38a Erlotinib : EGFR Gefitinib : EGFR
Overington and Al-Lazikani - unpublished
Drug Targeting
                  | Single Drug                                       | Multiple Drugs
Single Target     | Classic Drug Discovery, Ehrlich’s ‘Magic Bullet’  | Drug Blending
Multiple Targets  | Designed/Serendipitous Polypharmacology           | Combination Therapy
Overington and Al-Lazikani - unpublished Drug Pharmacokinetics
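• Drugs do not work under steady state conditions
[Figure: plasma concentration against time after a single dose, with Cmax and Tmax marked; absorption phase (rate ka) followed by elimination phase (rate kel)]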
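A curve of this shape is commonly sketched with a one-compartment model with first-order absorption and elimination (the Bateman function). The code below is such a sketch, not the model used to draw the slides: the volume of distribution, bioavailability and threshold are hypothetical, and ka = 0.5, kel = 0.2 with halving doses are used purely as illustrative values.

```python
import math

def concentration(t, dose, ka, kel, volume=1.0, bioavail=1.0):
    """Plasma concentration at time t for first-order absorption and elimination."""
    if ka == kel:
        raise ValueError("this closed form needs ka != kel")
    scale = bioavail * dose * ka / (volume * (ka - kel))
    return scale * (math.exp(-kel * t) - math.exp(-ka * t))

def t_max(ka, kel):
    """Time of peak concentration (Tmax); it does not depend on dose."""
    return math.log(ka / kel) / (ka - kel)

def time_above(threshold, dose, ka, kel, horizon=72.0, step=0.01):
    """Approximate hours spent above a threshold, by scanning a time grid."""
    n_steps = int(horizon / step)
    hits = sum(1 for i in range(n_steps)
               if concentration(i * step, dose, ka, kel) > threshold)
    return hits * step

KA, KEL = 0.5, 0.2          # illustrative rate constants (per hour)
THRESHOLD = 5.0             # hypothetical threshold concentration
for dose in (75.0, 37.5, 18.75):
    peak = concentration(t_max(KA, KEL), dose, KA, KEL)
    print(dose, round(peak, 2), round(time_above(THRESHOLD, dose, KA, KEL), 1))
```

In this model Cmax scales linearly with dose while Tmax stays fixed, so halving the dose lowers the whole curve and shortens the time spent above any fixed threshold.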
Drug Action
• n.b. effective concentration at site of drug action can be higher or lower than plasma concentration
[Figure: concentration-time curve with a % effect axis (25, 50, 75, 100); MEC ∝ XC50 of the ‘efficacy’ target]
MEC = Minimum Effective Concentration
Adverse Drug Reactions (ADRs)
• Acute ADRs are usually related to adverse pharmacology at/around Cmax
• Cmax can vary greatly due to drug dose and a wide range of environmental and genetic factors
• Occurrence and duration of side-effects appear stochastic
• Examples
  • QT prolongation/hERG effects for cisapride – potentially fatal
  • Blurred vision side effect for sildenafil – an inconvenience
[Figure: concentration-time curve with a % effect axis (25, 50, 75, 100); MEC of the ADR target marked]
One Drug - Many Targets
• As concentration increases, an ever larger number of targets are modulated
• Polypharmacology – many effects from one drug
  • Same target -> different effects
  • Different target -> different effects
• Dose dependent
[Figure: concentration-time curve with thresholds marked for XC50 ADR target, XC50 ‘off’ target and MEC efficacy target]
Lower Dose, Shorter Duration of Action
[Figure: concentration-time curves for 75 mg, 37.5 mg and 18.75 mg doses (each with ka = 0.5, kel = 0.2), with the Cmax of each dose and the MEC marked]
Imatinib Polypharmacology Spectra
[Figure: imatinib plasma concentration (ng.ml-1) against time (hr), with targets placed at their pChEMBL values]
Tyrosine-protein kinase FYN 5.38
ATP-binding cassette sub-family G member 2 5.39
c-Jun N-terminal kinase 1 5.40
Serine/threonine-protein kinase 17A 5.41
c-Jun N-terminal kinase 3 5.50
Dual specificity protein kinase CLK4 5.53
Mixed lineage kinase 7 5.59
Tyrosine-protein kinase FGR 5.62
Tyrosine-protein kinase FRK 5.64
Maternal embryonic leucine zipper kinase 5.72
Serine/threonine-protein kinase GAK 5.72
Ephrin type-A receptor 8 5.77
Serine/threonine-protein kinase RAF 5.77
Interleukin-1 receptor-associated kinase 1 5.92
Carbonic anhydrase XII 6.01
Homeodomain-interacting protein kinase 4 6.02
Tyrosine-protein kinase Lyn 6.05
Carbonic anhydrase III 6.28
Tyrosine-protein kinase BLK 6.28
Carbonic anhydrase XIV 6.33
BCR/ABL p210 fusion protein 6.41
Carbonic anhydrase VI 6.41
Phosphatidylinositol-5-phosphate 4-kinase type-2 gamma 6.42
Macrophage colony stimulating factor receptor 6.54
Stem cell growth factor receptor 6.62
Bcr/Abl fusion protein 6.66
Carbonic anhydrase VII 6.96
Tyrosine-protein kinase LCK 7.00
Platelet-derived growth factor receptor alpha 7.09
Carbonic anhydrase 15 7.11
Carbonic anhydrase IX 7.12
Platelet-derived growth factor receptor beta 7.14
Tyrosine-protein kinase ABL 7.20
Platelet-derived growth factor receptor 7.30
Discoidin domain-containing receptor 2 7.34
Epithelial discoidin domain-containing receptor 1 7.37
Carbonic anhydrase I 7.50
Carbonic anhydrase II 7.52
Tyrosine-protein kinase ABL2 7.94
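To relate a plasma concentration in ng/mL to the pChEMBL scale, the concentration has to be converted to mol/L first. The sketch below does this for imatinib (molecular weight 493.6 g/mol) and lists which targets have a pChEMBL at or above the resulting value. It uses a handful of entries from the list above and assumes total plasma concentration, ignoring protein binding and tissue distribution, so it is a rough illustration only; the function names are hypothetical.

```python
import math

IMATINIB_MW = 493.6  # g/mol

# A few median pChEMBL values copied from the list above.
PCHEMBL = {
    "Tyrosine-protein kinase ABL2": 7.94,
    "Tyrosine-protein kinase ABL": 7.20,
    "Stem cell growth factor receptor": 6.62,
    "Tyrosine-protein kinase Lyn": 6.05,
    "Tyrosine-protein kinase FYN": 5.38,
}

def ng_per_ml_to_p(conc_ng_ml, mol_weight):
    """Convert ng/mL to -log10(mol/L); 1 ng/mL equals 1e-6 g/L."""
    molar = conc_ng_ml * 1e-6 / mol_weight
    return -math.log10(molar)

def targets_engaged(conc_ng_ml, mol_weight=IMATINIB_MW, table=PCHEMBL):
    """Targets whose pChEMBL is at or above the concentration on the p-scale."""
    p_conc = ng_per_ml_to_p(conc_ng_ml, mol_weight)
    return sorted(name for name, p in table.items() if p >= p_conc)

for conc in (50.0, 1000.0):
    print(conc, round(ng_per_ml_to_p(conc, IMATINIB_MW), 2), targets_engaged(conc))
```

As the concentration rises, its value on the p-scale falls and more of the listed targets cross the line, which is the point the spectrum figure makes.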
Imatinib 400 mg single dose from Jawhari et al (2011) J Bioequiv Availab 3: 161-164; data are median pChEMBL for human targets from ChEMBL 16
Approved Statin Drugs
Atorvastatin 7.5 nM
Cerivastatin 1.3 nM
Fluvastatin 0.3 nM
Pitavastatin 0.6 nM
Rosuvastatin 1.7 nM
Pravastatin 2.3 nM
Lovastatin* 12.0 nM
Simvastatin* 0.12 nM
Tenivastatin is the hydrolyzed form of Simvastatin
IC50 values in a rat microsomal assay. From ‘HMGCoA Reductase Inhibitors’, ed. Schmitz and Torzewski (2002) Statin Efficacy & Safety
[Figure: mean difference in LDL-C (mmol/L) against daily dose (mg), with curves labelled by IC50 (nM): 1.7, 7.5, 12.0, 0.12, 2.3, 0.3]
Clinical safety of Atorvastatin
[Figure: incidence of side effects (%) against daily dose (mg)]
Statin Systems Pharmacology
Columns: INN; trade name; pro-drug; M.Wt.; lipophilicity Log D; absorption (%); oral bioavailability F (%); plasma protein binding (%); volume of distribution Vd (L/kg); clearance CL (L/hr/kg); Tmax (hr); half-life T1/2 (hr); passive permeability (nm/s); main metabolic enzymes; main transporters
Atorvastatin Lipitor 559 1.00-1.25 30 12 80-90 5.2 0.25 2-3 15-30 23 3A4; BCRP, MRP2, NTCP, OATP1A2, OATP1B1, OATP1B3, OATP2B1, Pgp
Cerivastatin Baycol 460 1.5-1.75 98 60 >99 0.33 0.2 2-3 137 2C8 3A4; BCRP and OATP1B1
Fluvastatin Lescol 411 1.00-1.25 98 19-29 >99 0.15-0.17 0.97 <1 0.5-2.3 2C8 2C9 3A4; BCRP, OATP1B1, OATP1B3 and OATP2B1
Lovastatin Mevacor Y 405 3.91 30 5 >95 0.26-1.1 2-4 2.9 328 3A4; OATP1B1 and P-gp
Pitavastatin Livalo 421 1.5 80 >60 96 2 1 11 35 2C9; BCRP, MRP2, NTCP, OATP1B1, OATP1B3 and Pgp
Pravastatin Pravachol 425 -1.00 - -0.7 34 18 43-55 0.46 0.81 1-1.5 1.3-2.8 7.5 non CYP; OATP1B1, OATP1B3, OATP2B1, Pgp, MRP2, BCRP and OAT3 in renal clearance
Rosuvastatin Crestor 482 -0.5 - -0.25 50 20 88 1.8 0.67 3-4 20.8 4.4 2C9; OATP1A2, OATP1B1, OATP1B3, OATP2B1, BCRP, Pgp, MRP2 and NTCP, OAT3 in renal clearance
Simvastatin Zocor Y 419 4.4 60-80 5 94-98 0.45 4 2-3 352 2C8 3A4; BCRP and Pgp
[Figure: statins ordered from lipophilic to polar: Lovastatin*, Simvastatin*, Atorvastatin, Cerivastatin, Fluvastatin, Pitavastatin, Rosuvastatin, Pravastatin]
Data adapted from Generaux et al. Xenobiotica, 2011; 41: 639–651, and 2011 US prescribing information Statin Pharmacogenetics
Figures from Niemi, Clinical Pharmacology & Therapeutics, 87, pp. 130-133 (2010) Combination Therapy
Combination therapy, polypharmacology Combination therapy, monopharmacology
Erlotinib : EGFR Losmapimod : p38a Erlotinib : EGFR Gefitinib : EGFR
Multiple Targets Single Target Combination Therapy – Multiple Targets
• Combine drugs against different targets and look for improved outcomes
  • mechanistic synergy, sensitization, treatment of comorbidities, increased compliance?
• Examples
  • Hyzaar (losartan potassium and hydrochlorothiazide) – hypertension
  • Vytorin (simvastatin and ezetimibe) – hypercholesterolemia
• Dosing levels not usually reduced in combination products compared to monotherapy
Combination Drugs - Dosing
88 Drugs used in combinations, covering 36 targets
[Figure: dose (mg) for each drug, labelled by USAN]
Al-Lazikani & Overington, unpublished. Data from Chembl13 load of FDA Orange Book
Drug Blending
• Combine drugs against the same targets and look for improved outcomes
• Combined ‘Me too’ drugs, but…
  • Differing off-target bioactivity spectra, Cmax, Tmax, AUC∞
• Benefits
  • Minimize effects of genetic variation of target/ADMET system
  • Efficacy target ‘sees’ pooled concentrations
  • Off-targets ‘see’ reduced concentrations of components
  • Reduced resistance in anti-infective and anti-cancer settings
  • Ability to dose higher in anti-infective setting
  • Improved population-level safety and minimized intrapatient variability
Al-Lazikani & Overington, unpublished. Data from Chembl13 load of FDA Orange Book Drug Blending – single agent
[Figure: concentration-time curves for Drug A at 75 mg and 150 mg, with the efficacy MEC and the ADR MEC of Drug A marked]
Drug Blending – two agents
[Figure: concentration-time curves for Drug A (75 mg), Drug B (30 mg) and the pooled concentration of A and B, with the efficacy MEC and the ADR MEC of each drug marked]
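The blending idea can be sketched numerically. The code below assumes that the two drugs act additively on the shared efficacy target when each concentration is expressed in multiples of its own MEC, and that each drug's adverse effects depend only on its own concentration. Those assumptions, the rate constants, and the MEC and ADR thresholds are all hypothetical; only the 75 mg and 30 mg doses come from the slide.

```python
import math

def concentration(t, dose, ka, kel, volume=1.0):
    """One-compartment oral model (first-order absorption and elimination)."""
    scale = dose * ka / (volume * (ka - kel))
    return scale * (math.exp(-kel * t) - math.exp(-ka * t))

# Hypothetical parameters: mec = efficacy threshold, adr = side-effect threshold.
DRUGS = {
    "A": {"dose": 75.0, "ka": 0.5, "kel": 0.2, "mec": 20.0, "adr": 60.0},
    "B": {"dose": 30.0, "ka": 0.8, "kel": 0.1, "mec": 10.0, "adr": 40.0},
}

def pooled_efficacy(t, drugs):
    """Sum of each drug's concentration in multiples of its own efficacy MEC."""
    return sum(concentration(t, d["dose"], d["ka"], d["kel"]) / d["mec"]
               for d in drugs.values())

def adr_margin(t, drug):
    """A drug's concentration as a fraction of its own ADR threshold."""
    return concentration(t, drug["dose"], drug["ka"], drug["kel"]) / drug["adr"]

for hour in (1.0, 3.0, 8.0, 16.0):
    print(hour,
          round(pooled_efficacy(hour, DRUGS), 2),
          round(adr_margin(hour, DRUGS["A"]), 2),
          round(adr_margin(hour, DRUGS["B"]), 2))
```

Under these assumptions the efficacy target responds to the sum of the two contributions, while each off-target is exposed to one component only, which is the asymmetry that blending tries to exploit.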
Resistance
Mechanisms of Drug Resistance
Classes: Efficacy Target | Metabolism | Transport
Routes: Expression (each class); Coding Mutation
Examples: HIV-1 Proteinase (antivirals) | Beta lactamase (antibacterials) | PGP (anti-cancers)
Real World Drug Resistance
Gainor and Shaw (2013) J. Clin. Oncol. 31 3987-3996
Selected Clinical EGFR Inhibitors
Erlotinib (Tarceva): OSI-774, CP-358774
Gefitinib (Iressa): AZD-1839
Lapatinib (Tykerb/Tyverb): GW-572016
AEE-788
Selectivity data taken from Ghoreschi et al, Nature Immunol. (2009) 10 356-360
Overlay of EGFR Inhibitors
2-D overlay of Erlotinib, Gefitinib, Lapatinib and AEE-788 Hydrophobic Pocket II, Allosteric site
Adenine mimic
Adenine ring of ATP
Hydrophobic Pocket I
Overington and Van Westen - unpublished Drug Resistance
Subsite 1 Core site Subsite 2 Mutation / Selection Wild-type Target Wild-type Target Mutant Target
n.b. Many alternative mechanisms for resistance exist! Resistance – Switched Sequential Therapy
Mutation / Selection Wild-type Target Mutant Target Mutant Target
Mutation / Selection Wild-type Target Mutant Target Mutant Target Resistance – Blending
Mutant Target Blend sensitive
Mutation / Wild-type Target Selection Mutant Target Wild-type Target Blend sensitive Blend sensitive
What is probability of jointly resistant mutant simultaneously arising? Mutant Target Blend resistant DNA Mutations Can Change Coded Protein
Protein: Phe Pro Met Arg Gly Asp
Gene: TTC CCA ATG CGT GGA GAC
Mutation of the second base of the first codon (TTC, Phe): A gives Tyr, C gives Ser, G gives Cys
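The example on this slide can be reproduced with the standard genetic code. The sketch below translates the gene fragment and lists the amino acid produced by each alternative base at one position of a codon; the function names are illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Stop",
}

def translate(cds):
    """Translate a coding sequence (length divisible by 3) codon by codon."""
    return [THREE_LETTER[CODON_TABLE[cds[i:i + 3]]] for i in range(0, len(cds), 3)]

def point_variants(codon, position):
    """Amino acid encoded after each possible substitution at one codon position."""
    result = {}
    for alt in BASES:
        if alt != codon[position]:
            mutant = codon[:position] + alt + codon[position + 1:]
            result[alt] = THREE_LETTER[CODON_TABLE[mutant]]
    return result

GENE = "TTCCCAATGCGTGGAGAC"
print(translate(GENE))
print(point_variants("TTC", 1))
```

Only a small set of amino acid changes is reachable from any codon by a single base substitution, which is what bounds the repertoire of resistance mutations discussed later.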
translation Mutation Probabilities Are Not Random
Alexandrov et al., Nature 500, 415–421 (22 August 2013) doi:10.1038/nature12477 Mutation Probabilities Are Not Random
Alexandrov et al, Nature 500, 415–421 (22 August 2013) doi:10.1038/nature12477 Different Profiles = Different Mutants
Gene: TTC CCA ATG CGT GGA GAC
Protein: Phe Pro Met Arg Gly Asp
Signature 7 (melanoma), C to T substitutions: Phe, Ser or Leu, Cys, Asp
Signature 18 (neuroblastoma), C to A substitutions: Leu, Thr or Gln, Ser, Glu
Resistance Is Practically Bounded
• What if only a relatively limited repertoire of mutations were possible in a tumour?
• Take all CDS in Ensembl
• Apply Alexandrov et al frequencies to score all possible mutations in all genes
• Gives precomputed library of mutants specific to particular cancer background
• Can select from this set efficacy targets for drugs
• Can model binding site differences
• Effect + Likelihood
• Does a ‘blend’ of inhibitors to same target offer significant advantages in resistance?
• Inhibitors will have distinct target binding, metabolism and transport SARs – more robust to multiple types of resistance
Mutational Profiles in Different Cancers
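The scoring step described in the bullets above (weight every possible coding mutation by a background-specific frequency) can be sketched as follows. This is a deliberate simplification and not the Alexandrov et al method: the published signatures are defined over trinucleotide contexts and both strands, whereas the weights here are hypothetical values per base substitution on the coding strand only. The joint probability function assumes the resistance mutations arise independently.

```python
import math

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Hypothetical relative weights for a background dominated by C to T changes.
C_TO_T_BACKGROUND = {("C", "T"): 0.8}

def score_mutations(cds, weights, default=0.01):
    """Rank all non-synonymous single-base substitutions by their weight."""
    scored = []
    for pos, ref in enumerate(cds):
        start, offset = pos - pos % 3, pos % 3
        codon = cds[start:start + 3]
        for alt in BASES:
            if alt == ref:
                continue
            mutant = codon[:offset] + alt + codon[offset + 1:]
            old_aa, new_aa = CODON_TABLE[codon], CODON_TABLE[mutant]
            if old_aa != new_aa:
                weight = weights.get((ref, alt), default)
                scored.append((weight, old_aa, start // 3 + 1, new_aa))
    return sorted(scored, key=lambda item: item[0], reverse=True)

def joint_probability(probabilities):
    """Probability that independent events all occur together."""
    return math.prod(probabilities)

GENE = "TTCCCAATGCGTGGAGAC"
for weight, old_aa, residue, new_aa in score_mutations(GENE, C_TO_T_BACKGROUND)[:3]:
    print(f"{old_aa}{residue}{new_aa}", weight)
print(joint_probability([1e-6, 1e-6]))
```

If escaping a blend needs separate mutations against each component, the joint probability is the product of the individual ones under the independence assumption, which is the argument for blending in a resistance setting.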
Competitive Intelligence
Privileged Target Families
ChEMBL17 Drugs
Santos & Overington, unpublished Clinical Kinome
Overington, Al-Lazikani & Wennerberg, unpublished Clinical Kinome
• 399 Clinical stage human kinase inhibitors
• 29 Approved small molecule kinase inhibitors
  • 15 -tinib – tyrosine kinase inhibitors
  • 5 -rolimus – mTOR inhibitors
  • 4 -rafenib – Raf inhibitors
  • 2 -anib – angiogenesis inhibitors
  • 1 -metinib – met inhibitor
  • 1 -brutinib – Bruton tyrosine kinase inhibitor
  • 1 -dil – Rho kinase inhibitor (Japan only)
• 38 Phase 3
• 143 Phase 2
• 189 Phase 1
• Phase 1:2 ratio is atypical due to many kinase inhibitor trials being phase 1/2 oncology trials
Kinase Inhibitors in Clinical Development
Overington, Bellis, Al-Lazikani & Wennerberg, unpublished Kinase Inhibitor Attrition
Overington, unpublished Kinase Inhibitor Productivity
Overington, unpublished
Are Antibacterials Really Different?
Antibacterial Physicochemical Properties
• Antibacterials widely known to fall in a different region of ‘chemical space’ to ‘human’ drugs
• Larger and more polar
• Mostly natural products
• Seen as exceptions to Lipinski’s rule-of-five
O’Shea & Moser (2008) J Med Chem DOI:10.1021/jm700967e Antibacterial Drug Target Classes
Protein target: PBP. Example: Amoxicillin (oral, natural product-derived)
RNA/riboprotein target: 30S ribosomal subunit. Example: Tetracycline (oral, natural product)
[Chemical structures of amoxicillin and tetracycline]
Target Class View of Physicochemistry
Mugumbate & Overington (2015) Bioorg Med Chem DOI:10.1016/j.bmc.2015.04.063
Oligonucleotide vs Oligopeptide Polarity
23 natural amino acids, 4 RNA nucleosides
Element | % protein | % RNA   (unweighted monomer composition)
C       | 65        | 45
N       | 17        | 33
O       | 17        | 17
S       | 1         | 0
P       | 0         | 5
• RNA species are significantly more polar than proteins
• Binding site composition comparisons underway
RNA Target Ligands
• Discovery of a novel class of antibiotics, binding to a ncRNA riboflavin riboswitch - ribB
• Screened 57,000 known synthetic antibacterials for riboflavin essentiality
Roseoflavin (IC50 = 300nM): FMN antimetabolite, binds ribB; MIC > 128 mg/ml, E.coli MB5746
Ribocil (IC50 = 300nM): competitive binding wrt FMN; MIC > 2 mg/ml, E.coli MB5746
[Chemical structures of roseoflavin and ribocil]
Howe et al (2015) Nature DOI:10.1038/nature15542
Implications
• Clear difference observed in physicochemical properties of antibacterials
• Basis of the differences is likely to be target class, not organism
• Structural analysis supports larger, more polar ligands for RNA targets
• Antibacterial protein-targeted compounds are similar to human protein-targeted compounds
• Historical antibacterial data likely biased to RNA-directed chemotypes
• Alignment of compound collections to RNA-directed property profile likely to generate more RNA-directed compounds from phenotypic hits
• These however may make great drugs!
Mugumbate & Overington (2015) Bioorg Med Chem DOI:10.1016/j.bmc.2015.04.063
Assay Networks
Assays from Target to Clinic
Biochemical assay -> Functional assay -> Cell-based screen -> Animal disease model -> Human clinical trial
• Build assay networks from co-occurrence
• Link to animal models and genetics
• Directed graph of all assays from targets to trials
• Understand attrition through drug development
Zwierzyna, Atkinson & Overington, unpublished
Assay Graph of Approved Drugs
FDA approved drugs linked by shared activity in a phenotypic assay
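A graph of this kind can be sketched from a table of which drugs are active in which assay. The example below links two drugs whenever they share an assay and then extracts connected groups. The assay identifiers and activity sets are invented for illustration; the actual construction behind the figure is unpublished.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical assay-to-active-drug sets (not real ChEMBL records).
ASSAY_ACTIVES = {
    "assay_1": {"IBUPROFEN", "NAPROXEN", "DICLOFENAC"},
    "assay_2": {"IMATINIB", "NILOTINIB"},
    "assay_3": {"DICLOFENAC", "CELECOXIB"},
}

def build_drug_graph(assay_actives):
    """Undirected graph: two drugs are linked if both are active in one assay."""
    graph = defaultdict(set)
    for actives in assay_actives.values():
        for first, second in combinations(sorted(actives), 2):
            graph[first].add(second)
            graph[second].add(first)
    return graph

def connected_components(graph):
    """Groups of drugs reachable from one another through shared assays."""
    seen, components = set(), []
    for node in sorted(graph):
        if node in seen:
            continue
        stack, group = [node], set()
        while stack:
            current = stack.pop()
            if current not in group:
                group.add(current)
                stack.extend(graph[current] - group)
        seen |= group
        components.append(sorted(group))
    return components

graph = build_drug_graph(ASSAY_ACTIVES)
print(connected_components(graph))
```

Drugs that never share an assay directly can still end up in the same group through intermediates, which is how therapeutic-area clusters emerge in the figure.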
[Figure: network of FDA approved drugs, each node labelled with a drug name (for example IBUPROFEN, MORPHINE, IMATINIB, DOXORUBICIN, RITONAVIR), with clusters annotated: Inflammation, Pain, Signal transduction, Dopaminergics, DNA replication & regulation, Oncology]
Zwierzyna & Overington, unpublished
Assay Clustering Using Word Embedding
Word2vec clustering on noun phrases from ChEMBL phenotypic (F) assays
Zwierzyna & Overington, unpublished PCA of Word2Vec Assay Descriptions
Each assay description: average over its word vectors. Data points projected from a 200-dimensional space to 2D using PCA
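The averaging step can be sketched without any machine learning library. The example below uses tiny invented 3-dimensional word vectors in place of trained 200-dimensional word2vec vectors, averages them per assay description, and compares descriptions by cosine similarity. The PCA projection to 2D is not shown.

```python
import math

# Hypothetical 3-D word vectors standing in for trained 200-D embeddings.
WORD_VECTORS = {
    "analgesic": [0.9, 0.1, 0.0],
    "pain": [0.8, 0.2, 0.1],
    "writhing": [0.7, 0.3, 0.0],
    "tumor": [0.0, 0.9, 0.2],
    "growth": [0.1, 0.8, 0.3],
    "inhibition": [0.3, 0.5, 0.4],
}

def embed(description, vectors):
    """Average the vectors of the known words in a description."""
    words = [w for w in description.lower().split() if w in vectors]
    if not words:
        return None  # nothing in the vocabulary, so no embedding
    dim = len(next(iter(vectors.values())))
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

pain_a = embed("Analgesic activity in pain model", WORD_VECTORS)
pain_b = embed("Writhing pain assay", WORD_VECTORS)
onc = embed("Inhibition of tumor growth", WORD_VECTORS)
print(round(cosine(pain_a, pain_b), 3), round(cosine(pain_a, onc), 3))
```

Averaging discards word order but gives every description a vector of the same length, which is what allows descriptions to be clustered or projected together.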
Zwierzyna & Overington, unpublished
Word2vec Embedding of Assays
ChEMBL assays of known drugs annotated with different ATC codes (~15k of ~94k)
[Figure legend: N03 (antiepileptic), M01 (anti-inflammatory), L01 (antineoplastic), C02 (antihypertensive), A10 (antidiabetic), N02 (analgesic)]
Zwierzyna & Overington, unpublished
Word2vec Assay Graph
[Figure: assay graph with clusters labelled by ATC code: C09 angiotensin system, C02 antihypertensive, G04 urological, J01 antibacterial, L01 antineoplastic, A10 antidiabetic, P01 antiprotozoal, N03 antiepileptic, N05 psycholeptics, C10 lipid modifying, M01 anti-inflammatory, N02 analgesics, M02 muscular pain, C01 cardiac therapy, A03 antiemetics, A07 antidiarrheals, N06 antidepressants, N01 anesthetic, A11 vitamins]
Zwierzyna & Overington, unpublished
Acknowledgements
Bissan Al-Lazikani Aroon Hingorani, Marc Marti-Renom Juan Pablo-Casas Francesco Martinez
Magda Zwierzyna, Mark Davies Krister Wennerberg
WT086151/Z/08/Z (2008-2014) WT104104/Z/14/Z (2014-2019) Medicines Discovery Catapult Mereside, Alderley Park, Alderley Edge, Cheshire, SK10 4TG md.catapult.org.uk @MediDiscCat [email protected]