
Building innovative alliances

What to make next? Augmented Design

Cresset UGM, 29th June 2017 Building innovative drug discovery alliances


Or

2 Old blokes and a couple of buns


PAGE 2 What’s medicinal ?

PAGE 3 What do I make next? Lead Telemetry

1st Candidate

• MPO scores comprised of:
   - MPO
   - Solubility Score
   - Protein binding
   - Metabolic stability
   - LogD

Project progression

PAGE 4 Good medicinal chemistry? Project Progress as a Function of Time & compound number

[Figure: pIC50 against time and compound number for Target, Off Target 1 and Off Target 2; scale bar: 6 months]

Compounds marked on the plot: Initial Hit (LLE 3.2), Comp. 1 (LLE 6.8), Comp. 2 (LLE 7.9), Comp. 3 (LLE 8.0), Comp. 4 (LLE 8.0)

Annotations on the plot:
• Ames liability identified
• Selectivity improved
• Ames liability removed … with drop in potency
• Rapidly regained potency
• AO liability identified
• Core redesigned; docking based design:
   - Replacement of aromatic CH with heteroatom
   - Introduction of axial methyl “lock”
   - Rigidification
   - Intramolecular H-bonding
• First co-crystal structure delivered
• Structure based design:
   - Targeting specific unconserved cysteine residue
• Rapid progress increasing potency
• Advanced Lead identified: No Ames liability, No AO activity

PAGE 5 Key tactics for early series evaluation

Establishing the
• Determine which molecular features are driving or limiting potency
• Exploration of molecular scaffolds using series hybridisation and computational scaffold hopping approaches

Understanding Conformation
• Assessing conformational landscape of hit series
• Evolution of molecular features limiting conformational freedom and assessing their impact on biological profile

Focus on Properties
• Focus on aligning molecular properties of a series with desirable property space*
• Establish independence between trends in molecular properties and biological profile

Ensuring Compound Efficiency
• Hypothesis driven, iterative approach to compound design
• Preparing minimum number of compounds required to address an issue or assess potential of a series

* Desirable property space depends on route of administration, target organ, etc.

PAGE 6 What do we do well? An under-utilised resource

200 chemists

~10 reactions per week

lab notebooks (eLN)

proprietary reaction database

Adding say ~100,000 reactions per year, strong medicinal chemistry bias

PAGE 7 The ‘Bread & Butter’ A brief history of synthetic medicinal chemistry ….

‘Our study also shows a steady increase in the number of different reaction types used in pharmaceutical patents but a trend toward lower median yield for some of the reaction classes.’

Huge pool of known chemistry waiting to be tapped

Landrum et al., J. Med. Chem. 2016, 59, 4385−4402; Roughley & Jordan, J. Med. Chem. 2011, 54, 3451−3479

PAGE 8 Reaction Vectors

[Reaction scheme: R1 + R2 → P]

reactant vector, R = (R1 + R2)

I      1     2     3      4
Bond   C-C   C=O   C-OH   C-OR
#      4     1     2      0

product vector, P

I      1     2     3      4
Bond   C-C   C=O   C-OH   C-OR
#      4     1     0      2

reaction vector, D = P - R

I      1     2     3      4
Bond   C-C   C=O   C-OH   C-OR
#      0     0     -2     2
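The subtraction on this slide can be checked in a few lines. This is a minimal sketch, assuming each molecule (or set of reactants) is summarised as a dictionary of bond-type counts; the function name `reaction_vector` is illustrative and is not taken from the authors' software.

```python
# Bond types in the order used on the slide (descriptor indices 1 to 4).
BOND_TYPES = ["C-C", "C=O", "C-OH", "C-OR"]

def reaction_vector(reactant_counts, product_counts):
    """Return D = P - R over BOND_TYPES; entries can be negative."""
    return {
        bond: product_counts.get(bond, 0) - reactant_counts.get(bond, 0)
        for bond in BOND_TYPES
    }

# Counts copied from the slide: R is the combined vector for R1 + R2.
R = {"C-C": 4, "C=O": 1, "C-OH": 2, "C-OR": 0}
P = {"C-C": 4, "C=O": 1, "C-OH": 0, "C-OR": 2}

D = reaction_vector(R, P)
print(D)  # {'C-C': 0, 'C=O': 0, 'C-OH': -2, 'C-OR': 2}
```

Negative entries are bond types consumed by the reaction and positive entries are bond types created, so the vector describes the transformation independently of the rest of the molecule.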

Broughton, H. B., Hunt, P. & Mackey, M. (2003) Methods for Classifying and Searching Chemical Reactions. United States Patent Application 367550.

PAGE 9 Reaction Vectors in Structure Generation

• The reaction vector, D, equals the difference between the product vector, P, and the reactant vector, R:
   D = P – R

• Given a reaction vector, D, and a reactant vector, R, the product vector, P, can be obtained:
   P = D + R

• Given a product vector, P, can we reconstruct the product molecule(s)?

I      1     2     3      4
Bond   C-C   C=O   C-OH   C-OR
#      4     1     0      2

[Structure drawings]

A better descriptor is required
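The forward direction, P = D + R, is equally short to sketch. The check for negative counts is an assumption added here for illustration (a reaction cannot remove bonds the reactant does not have); the function name `apply_reaction_vector` is hypothetical.

```python
def apply_reaction_vector(reactant_counts, d):
    """Return P = R + D, or None if any bond count would go negative."""
    keys = set(reactant_counts) | set(d)
    product = {k: reactant_counts.get(k, 0) + d.get(k, 0) for k in keys}
    if any(count < 0 for count in product.values()):
        return None  # the reactant lacks bonds that this reaction removes
    return product

# Vectors from the slides.
R = {"C-C": 4, "C=O": 1, "C-OH": 2, "C-OR": 0}
D = {"C-C": 0, "C=O": 0, "C-OH": -2, "C-OR": 2}

P = apply_reaction_vector(R, D)
print(P == {"C-C": 4, "C=O": 1, "C-OH": 0, "C-OR": 2})  # True

# A reactant with no C-OH bonds cannot undergo this transformation.
print(apply_reaction_vector({"C-C": 2}, D))  # None
```

Note that the result is only a table of counts. It does not say how the bonds are connected, which is why the slide asks for a better descriptor before product molecules can be reconstructed.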

Knowledge-Based Approach to de Novo Design Using Reaction Vectors. J. Chem. Inf. Model., 2009, 49 (5), pp 1163–1184.

PAGE 10 So… Using known chemistry we can…

PAGE 11 Augmented Design: scaffold hopping in CDK2 Feature Similarity + Docking

• Reactions: 26K
• Reagents: 18K extracted from Aldrich
   - No rings < 4
   - Contain C, N, O, S, P, F, Cl, Br, I, B, Si, Se
   - < 3 F
   - Hann substructures removed
   - Rb < 3
   - Max 15 heavy atoms
• Scored feature similarity to 1H1S
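The reagent criteria listed above can be read as a simple pass/fail filter. The sketch below is one possible reading, not the authors' implementation: it assumes the properties have already been computed by a cheminformatics toolkit and stored in a dictionary whose keys (`ring_sizes`, `elements`, and so on) are hypothetical. It also assumes "Contain C, N, O, …" means "contain only these elements" and "Rb" means rotatable bonds.

```python
ALLOWED_ELEMENTS = {"C", "N", "O", "S", "P", "F", "Cl", "Br", "I", "B", "Si", "Se"}

def passes_reagent_filter(props):
    """Apply the six slide criteria to a dict of precomputed properties."""
    return (
        all(size >= 4 for size in props["ring_sizes"])   # no rings < 4
        and set(props["elements"]) <= ALLOWED_ELEMENTS   # allowed elements only
        and props["fluorine_count"] < 3                  # < 3 F
        and not props["has_hann_substructure"]           # Hann substructures removed
        and props["rotatable_bonds"] < 3                 # Rb < 3
        and props["heavy_atoms"] <= 15                   # max 15 heavy atoms
    )

# Hypothetical reagent records for illustration only.
small_amine = {
    "ring_sizes": [6], "elements": ["C", "N"], "fluorine_count": 0,
    "has_hann_substructure": False, "rotatable_bonds": 1, "heavy_atoms": 8,
}
epoxide = dict(small_amine, ring_sizes=[3], elements=["C", "O"])

print(passes_reagent_filter(small_amine))  # True
print(passes_reagent_filter(epoxide))      # False
```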

Hinge Polar Cyclohexyl Fragment 4400 products 16000 products

8 starting materials    10 starting materials

PAGE 12 Augmented Design: Fast Follower Approach Find me a new IP free series?

Type II Kinase Inhibitor

PAGE 13 Spark meets Reaction Vectors Kinase inhibitor programme

Program   Target             Starting Point             End Point       Evotec contribution   Timeline
H2L       Kinase inhibitor   Literature fast follower   Advanced Lead   5-6 FTEs              9 Months

Literature Type II kinase inhibitor, Cmpd A: IC50 = 10 nM, crystal structure published
[Structure of Cmpd A with regions labelled: Hinge binder, Core, Back pocket]

Stage 1: Generation of new cores
1) Spark replacement of Cmpd A core
2) ROCs overlay, GOLD docking & pharmacophore match using Cmpd A crystal structure
3) 4 new cores building blocks selected and conformational analysis completed

Stage 2: RV Backpocket expansion
1) New structures generated with novel building blocks shown
2) New structures filtered on phys. chem. properties (cLogD 1.5-3.5, MW<500)
3) ROCs overlay / GOLD docking & pharmacophore match completed
4) Compounds docked in key kinase crystal structures to test for selectivity
5) Predictive DMPK workflow run
6) Scifinder IP searches conducted
7) 14 New compounds selected for synthesis

Novel lead compound generated with desirable potency suitable for further SAR expansion IC50 = 80 nM

PAGE 14 So…

PAGE 15 What do I make next? Options

RVs & Bayes FMO

Pan-Omics eAPPS

PAGE 16 What to make next? GPR Bayesian optimisation meets reaction sequence vectors

Balance between Exploitation and Exploration

Acquisition Functions

• For finding the maximum:
   - Probability of improvement (PI)
   - Expected improvement (EI)
   - Upper confidence bound (UCB)

• For finding the minimum:
   - Lower confidence bound (LCB)
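For reference, the four acquisition functions have standard closed forms when the model returns a Gaussian prediction (mean `mu`, standard deviation `sigma`) for a candidate. The sketch below uses those textbook definitions with only the Python standard library; the margin `xi` is an optional parameter introduced here as an assumption, and `kappa` defaults to 2, the value quoted on a later slide.

```python
import math

def _pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def _cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probability_of_improvement(mu, sigma, best, xi=0.0):
    """P(prediction exceeds the best observed value by at least xi)."""
    if sigma <= 0.0:
        return 1.0 if mu > best + xi else 0.0
    return _cdf((mu - best - xi) / sigma)

def expected_improvement(mu, sigma, best, xi=0.0):
    """Expected amount by which the prediction exceeds best + xi."""
    if sigma <= 0.0:
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * _cdf(z) + sigma * _pdf(z)

def upper_confidence_bound(mu, sigma, kappa=2.0):
    """Optimistic estimate, used when searching for a maximum."""
    return mu + kappa * sigma

def lower_confidence_bound(mu, sigma, kappa=2.0):
    """Pessimistic estimate; pick the lowest value when searching for a minimum."""
    return mu - kappa * sigma

# Hypothetical candidate: predicted pIC50 6.0 +/- 0.5, best measured so far 6.0.
print(probability_of_improvement(6.0, 0.5, best=6.0))  # 0.5
print(upper_confidence_bound(6.0, 0.5))                # 7.0
print(lower_confidence_bound(6.0, 0.5))                # 5.0
```

A larger `kappa` weights the uncertainty term more heavily, which shifts UCB towards exploration; a smaller one favours exploitation of the current best predictions.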

Nafisa Sharif & Mike Osborne 2016

PAGE 17 The Dataset The ’s QSAR

The Experiment: apply acquisition function

[Figure: pIC50 against compounds (ordered chronologically), divided into Train, Train +1+2+3…n and Test regions]

1. Build model (50).
2. Calculate acquisition function. Add top scoring compound to model. Rebuild model.
3. Predict on final compounds (42).
4. Repeat 100x… (50+1+2…151)
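The loop on this slide can be sketched as follows. This is a toy illustration only: the slide uses 50 initial compounds, up to 100 additions and 42 held-out test compounds with a GPR model, whereas the code below uses a handful of made-up points and a deliberately crude stand-in surrogate (`toy_surrogate`, an assumption made so the example runs with the standard library alone).

```python
def toy_surrogate(train, x):
    """Stand-in for a GPR: inverse-distance-weighted mean, and the distance
    to the nearest training point as a rough uncertainty."""
    weights = [1.0 / (abs(x - xi) + 1e-6) for xi, _ in train]
    mu = sum(w * yi for w, (_, yi) in zip(weights, train)) / sum(weights)
    sigma = min(abs(x - xi) for xi, _ in train)
    return mu, sigma

def ucb(mu, sigma, kappa=2.0):
    return mu + kappa * sigma

def run_experiment(train, pool, n_rounds, acquisition=ucb):
    """Repeatedly score the pool, move the top scoring compound into the
    training set, and 'rebuild' (the surrogate reads the enlarged set)."""
    train, pool = list(train), list(pool)
    picked = []
    for _ in range(min(n_rounds, len(pool))):
        scores = [acquisition(*toy_surrogate(train, x)) for x, _ in pool]
        best = max(range(len(pool)), key=scores.__getitem__)
        chosen = pool.pop(best)
        train.append(chosen)
        picked.append(chosen)
    return train, picked

# Hypothetical (descriptor, pIC50) pairs.
initial = [(0.0, 5.0), (1.0, 5.5)]
candidates = [(0.5, 5.2), (3.0, 7.0), (1.2, 5.6)]

train, picked = run_experiment(initial, candidates, n_rounds=2)
print(picked[0])  # (3.0, 7.0): furthest from the training data, so UCB explores it first
```

Swapping `acquisition` for a different function changes which compound is added at each round, which is the comparison the following slides report.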

PAGE 18 Results: Observing Signal Formation

Application of the acquisition function to build the training set

[Figure: five panels, each plotting test set r2 value against number of compounds added to training set: No Bayesian Optimisation (chronologically ordered), PI, UCB, LCB, EI]

PAGE 19 Balancing Exploration with Exploitation Mixing the acquisition functions

[Figure: test set r2 value against number of compounds added to training set, for No Bayesian Optimisation and for LCB followed by UCB (kappa = 2); Train +1+2+3…n and Test regions marked]

Exploration followed by Exploitation

Values and labels shown on the panels: r2 = 0.627, r2 = -0.345; No Correlation, Good Correlation

PAGE 20 Modelling

“By definition all models are wrong … it’s just that some are more useful than others” George EP Box (1919-2013)

“Modelling isn’t about getting it right … It’s about understanding why you get the answer you get”

If you understand the why then you can build a better model

PAGE 21 Summary

• Demonstrated that given a molecule we can automatically suggest what chemistry we can apply and hence what we can make

• Successful impact of RVs in combination with Cresset tools

• Described research into the application of Acquisition Functions to drive choice of compound(s) for synthesis from within the RV networks

PAGE 22 Acknowledgments

Dimitar Hristozov Val Gillet Mike Osborne Atanas Patronov Beining Chen Nafisa Sharif Craig Johnstone Hina Patel Michelle Southey Ben Allen Chris Stimson James Wallace

PAGE 23

Your contact: Mike Bodkin, VP Research Informatics
114 Innovation Drive, Milton Park, Abingdon, Oxfordshire OX14 4RZ, UK
T: +44 (0)1235 44 1207

What do I make next? Sequence vector network

PAGE 25 AI & Augmenting Design

Local QSAR (Bench Chemist): Given a single change, what is the effect?

Global QSAR (Computational Chemist): Take all the data, what is the effect?

[Plot: fitted line y = 0.7044x + 0.9289, R2 = 0.6091]

PAGE 26 Augmented Design

Input → Mutate → Score → Rank → Output

• Input: Load Molecule
• Mutate: Generate new molecules
• Score: Any scoring tool
• Rank: Multi-objective
• Output: Best new molecules

[Workflow figure: Starting material → In-silico reaction (Reaction Vectors) → Pool of possible products → Score → Select (Multi-objective Pareto optimisation algorithm) → Cream off top scoring molecules for evolution]

Scoring inputs shown in the figure:
• Activity QSAR, Local Models (Predicted against Actual, Q2 = 0.76)
• Docking Score
• ADMET QSAR, Global Models (Predicted against Actual, Q2 = 0.68)
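The multi-objective Pareto step can be illustrated with a short non-dominated sort. This is a generic sketch, not the algorithm used in the workflow: it assumes every objective is to be maximised and that rank 1 is the best front, and the scores are invented for illustration.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives maximised)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_ranks(scores):
    """Rank 1 = non-dominated molecules; rank 2 = non-dominated once rank 1
    is removed; and so on."""
    remaining = set(range(len(scores)))
    ranks = [0] * len(scores)
    rank = 1
    while remaining:
        front = {
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        }
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks

# Hypothetical (predicted activity, ADMET score) pairs for four products.
scores = [(7.0, 0.8), (6.0, 0.9), (5.0, 0.5), (6.5, 0.7)]
print(pareto_ranks(scores))  # [1, 1, 3, 2]
```

The first two molecules trade activity against ADMET score without either beating the other on both, so they share the top front; "creaming off" the best molecules then amounts to keeping the lowest ranks for the next round.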

1. Multi-dimensional de novo design of drug-like compounds. 2013, De Novo Molecular Design. ISBN 978-3-527-67700-9.
2. Validation of Reaction Vectors for de Novo Design. 2011, Library Design, Search Methods, and Applications of Fragment-Based Drug Design. ISBN 9780841224926.
3. Knowledge-Based Approach to de-Novo Design using Reaction Vectors. J. Chem. Inf. Model., 2009, 49 (5), pp 1163–1184.

PAGE 27 channel dual Inhibitor design QSARs drive the objective function

[Figure: KV1.5 pIC50 Prediction (log scale) against NaV1.5 pIC50 Prediction (log scale). Results (3 Iterations), with 2nd/3rd Iterations marked. Coloured by Pareto Rank: Red (High) -> Blue (Low)]

PAGE 28 Antipsychotic polypharmacology 4 objectives: QSAR + Pharmacophore Similarity

The chart shows the known affinity (Ki) values of antipsychotic drugs for a panel of receptors.

How can we go about designing a novel antipsychotic?

26K Reactions 93K Reagents

Naunyn Schmiedebergs Arch Pharmacol. 2015 Mar 14

PAGE 29 Fragment Growth in-situ PHIP2 Bromo-Domain

Reactions based around

Objective function

Cox, O. B., Krojer, T., Collins, P., Monteiro, O., Talon, R., Bradley, A., … Glick, M. (2016). A poised fragment library enables rapid synthetic expansion yielding the first reported inhibitors of PHIP(2), an atypical bromodomain. Chem. Sci., 7(3), 2322–2330. http://doi.org/10.1039/C5SC03115J

PAGE 30 Interaction sampling using SIFts & K-means

• Approx 25% molecules in cluster 0 make interactions with GLU-1339 and GLN-1343

• All in cluster 2 make interactions with GLU-1349
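Statements like the two above can be derived from cluster labels and interaction fingerprints with a simple tally. The sketch assumes each fingerprint has been reduced to the set of residues a molecule contacts and that the cluster labels come from a separate K-means run (not shown); the data are invented so that the output mirrors the percentages on the slide.

```python
from collections import defaultdict

def interaction_frequency(fingerprints, labels, residues):
    """Fraction of molecules in each cluster that contact ALL given residues."""
    totals, hits = defaultdict(int), defaultdict(int)
    for contacts, label in zip(fingerprints, labels):
        totals[label] += 1
        if set(residues) <= contacts:
            hits[label] += 1
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical residue-contact sets, using only residues named on the slide.
fingerprints = [
    {"GLU-1339", "GLN-1343"}, {"GLN-1343"}, set(), {"GLN-1343"},  # cluster 0
    {"GLU-1349"}, {"GLU-1349", "GLN-1343"},                        # cluster 2
]
labels = [0, 0, 0, 0, 2, 2]

print(interaction_frequency(fingerprints, labels, ["GLU-1339", "GLN-1343"]))
# {0: 0.25, 2: 0.0}
print(interaction_frequency(fingerprints, labels, ["GLU-1349"]))
# {0: 0.0, 2: 1.0}
```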

PAGE 31 Evolutionary Design KNIME framework

PAGE 32 Fragment Molecular Orbital (FMO) QM-Based SBDD

• FMO is a quantum mechanical method that has been developed for application to large (biological) systems
• FMO provides detailed analysis of protein-ligand interactions and their chemical nature
   - Calculate individual contribution of each residue and water molecule to binding enthalpy

[Figure: ligand interactions with residues Glu81, Phe80, Phe82, His84 and Leu134, broken down into Electrostatic, Exchange Repulsion, Dispersion and Charge Transfer terms]

“Strength of molecular interaction at that snapshot in time”

Heifetz et al., J. Med. Chem., Article ASAP; J. Chem. Inf. Model., 2016, 56 (1), pp 159–172

PAGE 33 QM Virtual SAR expansion using FMO 1WCC (CDK2) core modifications

PDB: 1WCC

IC50 = 350mM LE < 0.51

[Figure: core modifications 1 to 13; axis label ΔE]

Removal of the chlorine detrimental to the fragment binding

• Medium throughput FMO analysis can be rapidly carried out to answer SAR questions
• The technique is highly effective for prioritizing the initial fragment expansion directions or optimization for larger

IC50 = 7mM

ΔE = Sum PIE – Sum PIE (1WCC fragment)
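The ΔE score is a difference of two sums of pair interaction energies (PIEs), one for the modified core and one for the 1WCC reference fragment. The sketch below shows the arithmetic only; the residue names are those shown on the FMO slide, while the energy values and the kcal/mol unit are invented for illustration.

```python
def delta_e(pie_analogue, pie_reference):
    """Sum of PIEs for the analogue minus sum of PIEs for the reference."""
    return sum(pie_analogue.values()) - sum(pie_reference.values())

# Hypothetical per-residue PIEs (assumed kcal/mol; more negative = stronger).
reference_1wcc = {"Glu81": -10.0, "Phe80": -2.0, "Phe82": -3.0,
                  "His84": -1.5, "Leu134": -2.5}
modified_core = {"Glu81": -12.0, "Phe80": -2.0, "Phe82": -4.0,
                 "His84": -1.5, "Leu134": -3.0}

print(delta_e(modified_core, reference_1wcc))  # -3.5
```

Under this sign convention a negative ΔE means the modified core is predicted to interact more strongly than the reference fragment, so candidates can be ordered by ΔE before any are made.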

PAGE 34 Modified Pairs

Atom Pairs 2 (AP2): X1(n, p, r)-2(BO)-X2(n, p, r)

Atom Pairs 3 (AP3): X1(n, p, r)-3-X2(n, p, r)

X: element type
n: number of bonds to heavy atoms
p: number of π bonds
r: number of ring memberships
BO: bond order

• Extending the bond distance in atom pairs encodes more of the environment of the reaction centre
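To make the notation concrete, the sketch below generates AP2 and AP3 strings for a small hand-coded molecule (acetone). It is an illustration under stated assumptions rather than the authors' code: ring memberships are supplied by the caller, π bonds are counted from explicit bond orders only (aromatic systems are not handled), and the two atom labels in each pair are sorted alphabetically, which is a convention chosen here.

```python
from itertools import combinations

def atom_label(i, elements, bonds, ring_counts):
    """Build X(n, p, r) for atom i from its bonds to heavy atoms."""
    orders = [order for a, b, order in bonds if i in (a, b)]
    n = len(orders)                          # bonds to heavy atoms
    p = sum(order - 1 for order in orders)   # double adds 1 pi bond, triple adds 2
    return f"{elements[i]}({n},{p},{ring_counts[i]})"

def atom_pairs(elements, bonds, ring_counts):
    """Return sorted AP2 (bonded atoms) and AP3 (atoms two bonds apart) lists."""
    labels = [atom_label(i, elements, bonds, ring_counts)
              for i in range(len(elements))]
    neighbours = {i: set() for i in range(len(elements))}
    ap2 = []
    for a, b, order in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)
        x1, x2 = sorted((labels[a], labels[b]))
        ap2.append(f"{x1}-2({order})-{x2}")
    ap3 = []
    for nbrs in neighbours.values():
        for a, b in combinations(sorted(nbrs), 2):
            x1, x2 = sorted((labels[a], labels[b]))
            ap3.append(f"{x1}-3-{x2}")
    return sorted(ap2), sorted(ap3)

# Acetone, heavy atoms only: CH3-C(=O)-CH3. Bonds are (atom, atom, order).
elements = ["C", "C", "O", "C"]
bonds = [(0, 1, 1), (1, 2, 2), (1, 3, 1)]
ring_counts = [0, 0, 0, 0]

ap2, ap3 = atom_pairs(elements, bonds, ring_counts)
print(ap2)
print(ap3)
```

The carbonyl shows up as `C(3,1,0)-2(2)-O(1,1,0)`, and the AP3 list adds pairs such as `C(1,0,0)-3-O(1,1,0)` that describe what sits one atom further out from each bond.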

PAGE 35 Beckmann Rearrangement

[Reaction scheme: Beckmann rearrangement of a chlorinated oxime to the corresponding amide]

Atom Pairs 2 (AP2): X1(n, p, r)-2(BO)-X2(n, p, r)
Atom Pairs 3 (AP3): X1(n, p, r)-3-X2(n, p, r)

X: element type
n: number of bonds to heavy atoms
p: number of π bonds
r: number of ring memberships
BO: bond order

Reaction vector

      Negative APs                   Positive APs
AP2   1- C(3,2,1)-2(1)-C(3,1,0)      1+ C(3,2,1)-2(1)-N(2,0,0)
      2- C(3,1,0)-2(2)-N(2,1,0)      2+ C(3,1,0)-2(1)-N(2,0,0)
      3- N(2,1,0)-2(1)-O(1,0,0)      3+ C(3,1,0)-2(2)-O(1,1,0)
AP3   a- C(3,2,1)-3-N(2,1,0)         a+ C(2,2,1)-3-N(2,0,0)
      b- C(3,2,1)-3-C(1,0,0)         b+ C(2,2,1)-3-N(2,0,0)
      c- C(3,1,0)-3-C(2,2,1)         c+ C(2,2,1)-3-N(2,0,0)
      d- C(3,1,0)-3-C(2,2,1)         d+ N(2,0,0)-3-C(1,0,0)
      e- C(3,1,0)-3-O(1,0,0)         e+ N(2,0,0)-3-O(1,1,0)
      f- N(2,1,0)-3-C(1,0,0)         f+ O(1,1,0)-3-C(1,0,0)
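The negative and positive lists are the two halves of D = P - R when the descriptor is a multiset of atom pairs. A minimal sketch with `collections.Counter` follows; the three changed AP2s on each side are copied from the slide, while the `UNCHANGED` ring pairs are hypothetical placeholders for the part of the molecule the reaction does not touch.

```python
from collections import Counter

def split_reaction_vector(reactant_aps, product_aps):
    """Return (negative, positive): atom pairs lost and gained in the reaction."""
    r, p = Counter(reactant_aps), Counter(product_aps)
    negative = sorted((r - p).elements())  # present in reactant, not in product
    positive = sorted((p - r).elements())  # present in product, not in reactant
    return negative, positive

# Hypothetical atom pairs shared by reactant and product; they cancel out.
UNCHANGED = ["C(2,2,1)-2(1)-C(2,2,1)"] * 4

reactant_ap2 = UNCHANGED + [
    "C(3,2,1)-2(1)-C(3,1,0)",
    "C(3,1,0)-2(2)-N(2,1,0)",
    "N(2,1,0)-2(1)-O(1,0,0)",
]
product_ap2 = UNCHANGED + [
    "C(3,2,1)-2(1)-N(2,0,0)",
    "C(3,1,0)-2(1)-N(2,0,0)",
    "C(3,1,0)-2(2)-O(1,1,0)",
]

negative, positive = split_reaction_vector(reactant_ap2, product_ap2)
print(negative)
print(positive)
```

Because the unchanged pairs cancel, the stored vector contains only the reaction centre and its immediate environment, so it can be matched against reactants with different substituents.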

PAGE 36 Applying a RV to a reactant to generate a Product

1. Removing the negative atom pairs from the reactant

[Figure: the reactant with the bonds matching negative AP2s 1-, 2- and 3- marked, and the atom-numbered fragment that remains once they are removed. The Negative APs and Positive APs lists shown alongside are the same as on the Beckmann Rearrangement slide.]

PAGE 37 Applying a RV to a reactant to generate a Product

2. Adding positive atom pairs to the fragment

[Figure: stepwise reconstruction tree starting from the fragment (A) and passing through intermediates B to H. Steps are labelled with the positive atom pairs involved: 2+ (d+), 3+ (f+), 1+ (a+,b+,c+), 2+ (d+,e+), 1+ (a+), 3+ (e+,f+), 1+ (a+,b+,c+). Branch end points are labelled "No AP2s left in the reaction vector that match atom", "Final Solution" and "Duplicate solution". The Negative APs and Positive APs lists and the AP2 legend are repeated from the Beckmann Rearrangement slide.]

PAGE 38 How well does it work? Database

Reaction Type                   Number of Reactions   Correctly Reproduced   Per cent
Epoxide reduction               450                   449                    99.8
Epoxide formation               450                   444                    98.7
Ester to amide                  172                   172                    100.0
Alcohol dehydration             171                   169                    98.8
Claisen rearrangement           61                    54                     88.5
Beckmann rearrangement          123                   123                    100.0
Friedel-Crafts acylation        113                   113                    100.0
Olefin metathesis               9                     7                      77.8
Dieckmann condensation          98                    91                     92.9
Nitro reduction                 231                   230                    99.6
oxidation                       272                   272                    100.0
Cope rearrangement              453                   306                    67.5
Aldol condensation              134                   134                    100.0
Alcohol amination               97                    97                     100.0
Amide reduction                 51                    51                     100.0
Diels-Alder hetero              441                   320                    72.6
Ether halogenation              58                    58                     100.0
Ozonolysis                      132                   125                    94.7
Claisen condensation            98                    77                     78.6
Carboxylic acids to aldehydes   194                   194                    100.0
Nitrile reduction               102                   102                    100.0
Diels-Alder cycloaddition       106                   65                     61.3
Fischer indole                  230                   94                     40.9
Alkene halogenation             310                   281                    90.6
Nitrile hydrolysis              460                   460                    100.0
Olefination                     455                   427                    93.8
Wittig-Horner                   211                   190                    90.0
Robinson annulation             13                    10                     76.9
Total                           5,695                 5,115                  89.8

Version II: Stored as SQL database
Includes:
• the reaction vector
• the environment around each broken bond
• the fragmentation path
• the reconstruction path
• the original reagent and reactants

Improved algorithm

[Figure: reconstruction tree (structures A to H, "Final Solution", "Duplicate solution") repeated from the "Applying a RV to a reactant to generate a Product" slide]

Hristozov, D., Bodkin, M., Chen, B., Patel, H. & Gillet, V. J. 2011. Validation of Reaction Vectors for De Novo Design. Library Design, Search Methods, and Applications of Fragment-Based Drug Design. American Chemical Society.

PAGE 39 The algorithm looks great but ...

1. The algorithm only knows about transformation types that are in the Db!
2. The AP2/3’s cover 1 and 2 bonds. Remote functionality isn’t considered.
3. A reaction path is not a drug “optimisation”!

i ii iii iv

a b c d e

But how do we get from here to here?

PAGE 40 Reaction Sequence Vectors Tools for molecular design

i ii iii iv

a b c d e

Sequence Vectors: a→b, a→c, a→d, a→e

One more to think about: b→x

Molecules = Nodes // RVs = Edges
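With molecules as nodes and reaction vectors as edges, "how do we get from here to here?" becomes a path search. The sketch below uses a plain breadth-first search; it assumes the slide's schematic means a→b→c→d→e via reactions i to iv, and the function name `find_route` is illustrative only.

```python
from collections import deque

def find_route(edges, start, goal):
    """Return the shortest list of reaction labels from start to goal,
    or None if the goal cannot be reached."""
    graph = {}
    for source, reaction, target in edges:
        graph.setdefault(source, []).append((reaction, target))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for reaction, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [reaction]))
    return None

# Directed edges (molecule, reaction vector, molecule) read from the schematic.
edges = [("a", "i", "b"), ("b", "ii", "c"), ("c", "iii", "d"), ("d", "iv", "e")]

print(find_route(edges, "a", "e"))  # ['i', 'ii', 'iii', 'iv']
print(find_route(edges, "e", "a"))  # None: the edges are directed
```

The list of reactions returned for a→e is the kind of multi-step sequence that a sequence vector would summarise as a single a-to-e transformation.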

PAGE 41 SAR Exploration Succinyl Hydroxamates

Bailey, S., et al., 2008. Bioorganic & Medicinal Chemistry Letters, 18, 6562-6567

PAGE 42 Novel SAR Succinyl Hydroxamates

James Wallace ukQSAR 2014

PAGE 43 Principal Components Analysis of Property Space Succinyl Hydroxamates

James Wallace ukQSAR 2014.

PAGE 44