
Supporting Information for A simple representation of three-dimensional molecular structure

Seth D. Axen1, Xi-Ping Huang2,4, Elena L. Cáceres1,3, Leo Gendelev1,3, Bryan L. Roth2,4,5, Michael J. Keiser1,3*

1. Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, 675 Nelson Rising Ln NS 416A, San Francisco, CA 94143

2. Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, NC 27599

3. Department of Pharmaceutical Chemistry, Institute for Neurodegenerative Diseases, and Institute for Computational Health Sciences, University of California, San Francisco, 675 Nelson Rising Ln NS 416A, San Francisco, CA 94143

4. National Institute of Mental Health Psychoactive Screening Program (NIMH PDSP), University of North Carolina, Chapel Hill, North Carolina, USA.

5. Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.

Table of Contents

Supporting Figures and Tables

Supporting References


Supporting Figures and Tables

Figure S1. Diagram of information flow in the E3FP fingerprinting algorithm for a large molecule. Overview of fingerprinting for the large, flexible molecule (CHEMBL210990) in Figure 1c.


Figure S2. Performance of SEA cross-validation. Fold-average AUCs from cross-validation using A) 10,000 random molecules from ChEMBL17 over a broad parameter set and B-C) 100,000 random molecules from ChEMBL20 over B) a narrower and C) an even narrower parameter set (see Experimental Section).


Figure S3. Comparison of individual fold AUPRCs and AUROCs from the second stage of E3FP parameter optimization. A) All folds, B) high SEA Tanimoto coefficient (TC) cutoff folds, and C) low SEA TC cutoff folds. Points are colored according to SEA TC cutoff. Red lines indicate the maximum fold AUPRC and AUROC from ECFP4. Folds using a low (very permissive) SEA TC cutoff (light gray, panels (A) and (C)) achieve high AUROC at the cost of AUPRC.


Figure S4. Fold-wise performance by AUCs observed across parameter ranges, from second stage of E3FP parameter optimization. Boxplots show the first and third quartiles, while a horizontal line indicates the median AUC. Whiskers extend to the minimum and maximum AUCs obtained for parameters within that range. Ranges for radius multipliers in (A-B) are exclusive of the lowest value and include the highest value.


Table S1. Performance of ECFP at various radii.

Name Radius Mean AUPRC Mean AUROC SEA Tanimoto Cutoff

ECFP0 0 0.1898 ± 0.0011 0.9526 ± 0.0004 0.78

ECFP2 1 0.5401 ± 0.0024 0.9869 ± 0.0002 0.42

ECFP4 2 0.5803 ± 0.0014 0.9882 ± 0.0001 0.30

ECFP6 3 0.5595 ± 0.0004 0.9874 ± 0.0002 0.24

ECFP8 4 0.5854 ± 0.0010 0.9874 ± 0.0001 0.22

ECFP10 5 0.5793 ± 0.0016 0.9876 ± 0.0002 0.20

Fold-average of target-average AUCs from 5-fold cross-validation against ChEMBL20 for various ECFP radii/number of iterations. Tanimoto thresholds determined by SEA are included.
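
The aggregation named in this note can be sketched as follows. This is a minimal illustration that assumes per-target AUCs have already been computed for each fold. The target names and AUC values are hypothetical, and treating the ± values as a standard deviation across folds is an assumption, not something stated here.

```python
from statistics import mean, stdev

def fold_average_auc(auc_by_fold):
    """Average per-target AUCs within each fold, then summarize across folds."""
    per_fold = [mean(targets.values()) for targets in auc_by_fold]
    return mean(per_fold), stdev(per_fold)

# Hypothetical per-target AUCs for three folds (illustrative values only).
folds = [
    {"targetA": 0.90, "targetB": 0.80},
    {"targetA": 0.92, "targetB": 0.84},
    {"targetA": 0.88, "targetB": 0.80},
]
avg, spread = fold_average_auc(folds)
```

Averaging over targets first gives every target equal weight within a fold, regardless of how many molecules it has.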

Table S2. Performance of E3FP and ECFP4 at various levels of folding.

Name Bit Number Mean AUPRC Mean AUROC SEA Tanimoto Cutoff

ECFP4 1024 0.5803 ± 0.0014 0.9882 ± 0.0001 0.30

ECFP4 2048 0.5665 ± 0.0011 0.9882 ± 0.0003 0.28

ECFP4 4096 0.5822 ± 0.0014 0.9883 ± 0.0002 0.28

E3FP 1024 0.6427 ± 0.0020 0.9886 ± 0.0003 0.20

E3FP 2048 0.5031 ± 0.0013 0.9832 ± 0.0002 0.16

E3FP 4096 0.5503 ± 0.0011 0.9862 ± 0.0002 0.16
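
One common way to fold a sparse fingerprint to a fixed bit number is to map each integer identifier to a bit index by taking it modulo the fingerprint length. The sketch below assumes that modulo scheme; the identifiers are hypothetical, and the function name `fold_fingerprint` is illustrative rather than part of any library.

```python
def fold_fingerprint(identifiers, n_bits):
    """Fold sparse integer identifiers into a set of 'on' bit indices below n_bits."""
    return {ident % n_bits for ident in identifiers}

# Hypothetical sparse identifiers (illustrative values only).
sparse = [5, 1029, 2053, 4100, 70000]
folded_1024 = fold_fingerprint(sparse, 1024)  # three identifiers collide on bit 5
folded_4096 = fold_fingerprint(sparse, 4096)  # no collisions at this length
```

Shorter fingerprints produce more collisions, so distinct substructures can end up sharing a bit.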


Figure S5. PRC Performance of E3FP and ECFP using various classifiers. Best fold PRC curve resulting from 5-fold cross-validation of E3FP and ECFP4 1024-bit fingerprints using A) Naive Bayes classifier (NB), B) Random Forest (RF), C) Support Vector Machine with a linear kernel (LinSVM), and D) artificial neural network (NN). Each tautomer/conformer fingerprint is treated as a separate molecule during training and testing, and the molecule prediction is set to the maximum prediction for all conformers of that molecule. For Mean E3FP and Mean ECFP4, all fingerprints for a molecule are bitwise averaged to form a “float” fingerprint.
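
The two aggregation strategies described in this caption can be sketched as below. This is a minimal illustration under stated assumptions: the function names, scores, and bit vectors are hypothetical, and the classifier that produces the per-conformer scores is not shown.

```python
def max_conformer_score(scores_by_molecule):
    """Molecule-level prediction: the maximum score over its conformer fingerprints."""
    return {mol: max(scores) for mol, scores in scores_by_molecule.items()}

def mean_fingerprint(bit_vectors):
    """Bitwise average of equal-length 0/1 fingerprints, giving a 'float' fingerprint."""
    n = len(bit_vectors)
    return [sum(bits) / n for bits in zip(*bit_vectors)]

# Hypothetical per-conformer classifier scores for one molecule.
predictions = max_conformer_score({"mol1": [0.2, 0.7, 0.4]})

# Hypothetical 4-bit fingerprints for two conformers of one molecule.
averaged = mean_fingerprint([[1, 0, 1, 0], [1, 1, 0, 0]])
```

In the averaged fingerprint, each position holds the fraction of conformers in which that bit is set.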


Figure S6. ROC Performance of E3FP and ECFP using various classifiers. Best fold ROC curve resulting from 5-fold cross-validation of E3FP and ECFP4 1024-bit fingerprints using A) Naive Bayes classifier (NB), B) Random Forest (RF), C) Support Vector Machine with a linear kernel (LinSVM), and D) artificial neural network (NN). Each tautomer/conformer fingerprint is treated as a separate molecule during training and testing, and the molecule prediction is set to the maximum prediction for all conformers of that molecule. For Mean E3FP and Mean ECFP4, all fingerprints for a molecule are bitwise averaged to form a “float” fingerprint.


Figure S7. PRC Curves for ECFP and variants of E3FP using SEA. All PRC curves resulting from 5 independent 5-fold cross-validation runs of A) ECFP4, B) ECFP4 with information about chiral bonds, C) E2FP, D) E2FP with stereochemical information, E) E3FP without stereochemical information, and F) E3FP.


Figure S8. ROC Curves for ECFP and variants of E3FP using SEA. All ROC curves resulting from 5 independent 5-fold cross-validation runs of A) ECFP4, B) ECFP4 with information about chiral bonds, C) E2FP, D) E2FP with stereochemical information, E) E3FP without stereochemical information, and F) E3FP.


Table S3. References for Examples of Molecule Pairs where E3FP and ECFP4 Tanimoto Coefficients Differ.

Molecule 1 Molecule 2 Target Reference 1 Reference 2

CHEMBL113217 CHEMBL113907 GABA-B receptor 1 1

CHEMBL329431 CHEMBL365849 Nitric-oxide synthase, inducible 2 3

CHEMBL329431 CHEMBL365849 Nitric-oxide synthase, brain 2 3

CHEMBL329431 CHEMBL365849 Nitric-oxide synthase, endothelial 2 3

CHEMBL218710 CHEMBL8839 Metabotropic glutamate receptor 2 4 4

CHEMBL218710 CHEMBL8839 Metabotropic glutamate receptor 3 4 4

CHEMBL158261 CHEMBL333193 Carbonic anhydrase II 5 5

CHEMBL186856 CHEMBL306541 Nitric-oxide synthase, inducible 3 6

CHEMBL186856 CHEMBL306541 Nitric-oxide synthase, brain 3 6

CHEMBL186856 CHEMBL306541 Nitric-oxide synthase, endothelial 3 6

CHEMBL255141 CHEMBL270807 Histamine H3 receptor 7 7

CHEMBL20429 CHEMBL21309 Vesicular transporter 8 8

CHEMBL148543 CHEMBL35860 HIV type 1 protease 9 10

CHEMBL1807550 CHEMBL3125318 Sodium/glucose cotransporter 1 11 12

CHEMBL1807550 CHEMBL3125318 Sodium/glucose cotransporter 2 11 12

CHEMBL606937 CHEMBL606938 Sigma opioid receptor 13 13

CHEMBL2051761 CHEMBL2051978 Maltase-glucoamylase 14 14

CHEMBL2051761 CHEMBL2051978 Sucrase-isomaltase 14 14

CHEMBL301670 CHEMBL58824 M1 15 15

CHEMBL301670 CHEMBL58824 M2 15 15

CHEMBL301670 CHEMBL58824 M3 15 15

CHEMBL301670 CHEMBL58824 M4 15 15

CHEMBL263575 CHEMBL354652 Kappa opioid receptor 16 16

CHEMBL263575 CHEMBL354652 Mu opioid receptor 16 16

CHEMBL263575 CHEMBL354652 Nociceptin receptor 16 16
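
For reference, the Tanimoto coefficient (TC) compared in this table is the number of bits two fingerprints share divided by the number of bits set in either. A minimal sketch on hypothetical sets of "on" bit indices follows; returning 0.0 for two empty fingerprints is a convention assumed here.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two collections of 'on' bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

# Hypothetical fingerprints: two shared bits out of five set in total.
tc = tanimoto({1, 2, 3, 4}, {3, 4, 5})
```

Because E3FP and ECFP4 set different bits for the same pair of molecules, the same pair can score quite differently under the two fingerprints.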

Table S4. Motivating target-molecule pairs using early optimal and most optimal parameter sets.

Compound Target ECFP4 TC ECFP4 P-value Early E3FP TC Early E3FP P-value New E3FP TC New E3FP P-value Motivating Confirmed

Alphaprodine M3 0.3167 0.19 0.2857 1.40×10^-95 0.3043 9.16×10^-74 yes -

Alphaprodine M5 0.3167 0.03 0.2537 1.45×10^-95 0.2466 5.34×10^-68 - yes

Anpirtoline α3β4 0.2400 >1 0.2609 1.04×10^-75 0.2553 3.63×10^-26 - yes

Anpirtoline α4β2 0.2979 >1 0.2683 3.04×10^-85 0.2609 1.45×10^-33 yes -

Cypenamine α2β4 0.2381 >1 0.2308 2.37×10^-37 0.2727 5.44×10^-56 yes yes

Cypenamine α3β4 0.2564 >1 0.1951 2.47×10^-12 0.2051 1.69×10^-03 - yes

Cypenamine α4β4 0.2381 >1 0.2632 2.69×10^-30 0.3125 1.56×10^-48 - yes

Only target-molecule pairs for which the molecule was a confirmed binder to a member of the target family are shown. The early optimal E3FP parameter set was chosen early in the optimization procedure and differed from the final parameter set in that the eight lowest energy conformers for each tautomer were considered, and the shell radius multiplier was 1.671 Å.


Table S5. Compounds used in binding assays and source where purchased.

Compound ZINC ID ChEMBL ID SMILES Vendor Product ID

Alphaprodine ZINC1087483 CHEMBL1529817 CCC(=O)O[C@]1(c2ccccc2)CCN(C)C[C@@H]1C Mcule MCULE-7919657032

Anpirtoline ZINC958 CHEMBL1316374 Clc1cccc(SC2CCNCC2)n1 Tocris 703

Cinepazet ZINC38595418 CHEMBL2106028 CCOC(=O)CN1CCN(C(=O)/C=C/c2cc(OC)c(OC)c(OC)c2)CC1 Sigma-Aldrich T137642

Cypenamine ZINC2037167 CHEMBL2110918 N[C@@H]1CCC[C@H]1c1ccccc1 Enamine EN300-183376

(R)-(-)-Mepivacaine ZINC456 CHEMBL1087 Cc1cccc(C)c1NC(=O)[C@H]1CCCCN1C Toronto Research Chemicals M225070

Quinine Ethylcarbonate ZINC2041225 CHEMBL2107019 C=C[C@@H]1CN2CC[C@@H]1C[C@@H]2[C@@H](OC(=O)OCC)c1ccnc2ccc(OC)cc21 Mcule MCULE-3651128133

Tranylcypromine ZINC1482197 CHEMBL313833 N[C@@H]1C[C@H]1c1ccccc1 Sigma-Aldrich P8511

Zoniporide ZINC3933046 CHEMBL355862 NC(N)=NC(=O)c1cnn(-c2cccc3ncccc32)c1C1CC1 Sigma-Aldrich SML0076


Table S6. All predictions tested by first-pass assays.

Target Compound Average Inhibition (%) of Four Replicates at 10 µM

α2β2 Tranylcypromine* 1.2

α2β4 Tranylcypromine* 5.1

α3β2 Tranylcypromine -6.3

α3β4 Tranylcypromine -3

α4β2 Tranylcypromine -2.1

α4β2** Tranylcypromine -4.4

α4β4 Tranylcypromine 1.9

α7 Tranylcypromine -7.2

α7** Tranylcypromine 0.3

α2β2 Cypenamine 20.2

α2β4 Cypenamine* 30.2

α3β2 Cypenamine 22.4

α3β4 Cypenamine 49.1

α4β2 Cypenamine 18.3

α4β2** Cypenamine 4.9

α4β4 Cypenamine 47.4

α7 Cypenamine -13

α7** Cypenamine 7.2

α2β2 Zoniporide 6.2


α2β4 Zoniporide 3.6

α3β2 Zoniporide 1.2

α3β4 Zoniporide -3.5

α4β2 Zoniporide -3.4

α4β2** Zoniporide -8

α4β4 Zoniporide 3.7

α7 Zoniporide* 0.5

α7** Zoniporide -4

M1 Alphaprodine 7.1

M2 Alphaprodine 11.2

M3 Alphaprodine* 13.3

M4 Alphaprodine 104.1

M5 Alphaprodine 77

α2β2 Anpirtoline 6.2

α2β4 Anpirtoline 16.5

α3β2 Anpirtoline 18.7

α3β4 Anpirtoline 42.4

α4β2 Anpirtoline* -1.5

α4β2** Anpirtoline -9.7

α4β4 Anpirtoline 21.6

α7 Anpirtoline 19.6


α7** Anpirtoline 0.9

D1 Cinepazet 7

D2 Cinepazet* -19

D3 Cinepazet -14.3

D4 Cinepazet -3.9

D5 Cinepazet -3.6

5-HT1A (R)-(-)-Mepivacaine 22.1

5-HT1B (R)-(-)-Mepivacaine 1.7

5-HT1D (R)-(-)-Mepivacaine 5.4

5-HT1E (R)-(-)-Mepivacaine -7.5

5-HT2A (R)-(-)-Mepivacaine 16.7

5-HT2B (R)-(-)-Mepivacaine 11.1

5-HT2C (R)-(-)-Mepivacaine 5.3

5-HT3 (R)-(-)-Mepivacaine* 3.7

5-HT4 (R)-(-)-Mepivacaine -7.8

5-HT5 (R)-(-)-Mepivacaine 47.5

5-HT6 (R)-(-)-Mepivacaine -5.4

5-HT7 (R)-(-)-Mepivacaine 84.3

5-HT1A Quinine Ethylcarbonate 14.2

5-HT1B Quinine Ethylcarbonate -6.6

5-HT1D Quinine Ethylcarbonate 33.6


5-HT1E Quinine Ethylcarbonate -4

5-HT2A Quinine Ethylcarbonate 18.6

5-HT2B Quinine Ethylcarbonate 33

5-HT2C Quinine Ethylcarbonate 27.1

5-HT3 Quinine Ethylcarbonate 2.5

5-HT4 Quinine Ethylcarbonate* -0.4

5-HT5 Quinine Ethylcarbonate 13

5-HT6 Quinine Ethylcarbonate -12.1

5-HT7 Quinine Ethylcarbonate 28

Each row corresponds to a single molecule-target pair which was tested with four replicates at a concentration of 10 µM. Asterisks (*) indicate molecule/target pairs motivating the experiments. Double asterisks (**) indicate experiments performed in rat brain tissues.


Table S7. All predictions tested by complete radioligand binding assays.

Target Compound Experiment Number Mean LogKi Pooled LogKi

α2β4 Cypenamine 4 (1, 1, 1, 1) -5.2960 ± 0.1189 -5.3330 ± 0.0555

α3β4 Anpirtoline 4 (1, 1, 1, 1) -5.4512 ± 0.1113 -5.4670 ± 0.0633

α3β4 Cypenamine 4 (1, 1, 1, 1) -5.6603 ± 0.2143 -5.5700 ± 0.0698

α4β4 Cypenamine 4 (1, 1, 1, 1) -5.3807 ± 0.0654 -5.3860 ± 0.0592

M1 Alphaprodine 4 (3, 3, 3, 3)

M2 Alphaprodine 4 (3, 3, 3, 3)

M3 Alphaprodine 5 (3, 3, 3, 3, 3)

M4 Alphaprodine 3 (3, 3, 3)

M5 Alphaprodine 5 (3, 3, 3, 3, 3) -6.2116 ± 0.3153 -6.1130 ± 0.0558

Each row corresponds to a result from pooling the specified number of experiments, whose replicate numbers are indicated in parentheses. The reported standard error of the pooled logKi results from multiple repeats of the curve fitting procedure in GraphPad Prism 5.0.
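
To read the tabulated values as affinities, a log Ki can be converted to a concentration. The sketch below assumes LogKi is the base-10 logarithm of Ki in molar units, which the table does not state explicitly; the function name is illustrative.

```python
def log_ki_to_micromolar(log_ki):
    """Convert log10(Ki in mol/L) to Ki in micromolar (assumes molar units)."""
    return 10.0 ** log_ki * 1e6

# Mean LogKi values copied from Table S7.
ki_a2b4_cypenamine = log_ki_to_micromolar(-5.2960)  # roughly 5 µM
ki_m5_alphaprodine = log_ki_to_micromolar(-6.2116)  # roughly 0.6 µM
```

Under this assumption, the confirmed interactions fall in the high-nanomolar to low-micromolar range.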


Figure S9. Results of binding assays to muscarinic acetylcholine receptors. Each panel corresponds to a pooled set of binding experiments in triplicate with reference molecule atropine against A) M1, B) M2, C) M3, D) M4, and E) M5. See Table S7 for numbers of experiments.


Figure S10. Results of Tango agonist assays of alphaprodine against muscarinic acetylcholine receptors. A) M3 and B) M5. Each panel corresponds to a pool of experiments with reference agonist carbachol.


Figure S11. Results of antagonist assays of alphaprodine versus acetylcholine and carbachol against M5. Acetylcholine-mediated (A-B) and carbachol-mediated (C-D) calcium release dose-responses in the absence and presence of increasing concentrations of alphaprodine. Results were analyzed in GraphPad Prism 5.0 with pooled normalized values from three individual assays, each in triplicate. The antagonist potency of alphaprodine is (B) pA2 of 4.60 ± 0.04 with a Schild slope of 1.17 ± 0.07 for carbachol and (D) pA2 of 4.65 ± 0.04 and Schild slope of 1.06 ± 0.04 for acetylcholine.
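
For context, a pA2 and Schild slope can be obtained from a classical Schild regression: log10(DR − 1) is regressed on log10 of the antagonist concentration, where DR is the agonist dose ratio, and pA2 is the negative of the x-intercept. The sketch below uses ordinary least squares on hypothetical data; it is not the fitting procedure used in GraphPad Prism, and the function name is illustrative.

```python
import math

def schild_regression(antagonist_molar, dose_ratios):
    """Least-squares fit of log10(DR - 1) against log10([B]); returns (slope, pA2)."""
    xs = [math.log10(b) for b in antagonist_molar]
    ys = [math.log10(dr - 1.0) for dr in dose_ratios]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # pA2 is the negative of the x-intercept, where log10(DR - 1) = 0.
    return slope, intercept / slope

# Hypothetical dose ratios for an ideal competitive antagonist with KB = 10 µM.
slope, pa2 = schild_regression([1e-5, 1e-4, 1e-3], [2.0, 11.0, 101.0])
```

A slope close to 1 is consistent with simple competitive antagonism, in which case pA2 approximates −log10 of the antagonist dissociation constant.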


Supporting References

(1) Froestl, W.; Mickel, S. J.; Hall, R. G.; von Sprecher, G.; Strub, D.; Baumann, P. A.; Brugger, F.; Gentsch, C.; Jaekel, J.; Olpe, H. R. Phosphinic Acid Analogues of GABA. 1. New Potent and Selective GABAB Agonists. J Med Chem 1995, 38, 3297–3312.

(2) Moormann, A. E.; Metz, S.; Toth, M. V.; Moore, W. M.; Jerome, G.; Kornmeier, C.; Manning, P.; Hansen, D. W.; Pitzele, B. S.; Webber, R. K. Selective Heterocyclic Amidine Inhibitors of Human Inducible Nitric Oxide Synthase. Bioorg Med Chem Lett 2001, 11, 2651–2653.

(3) Shankaran, K.; Donnelly, K. L.; Shah, S. K.; Caldwell, C. G.; Chen, P.; Hagmann, W. K.; Maccoss, M.; Humes, J. L.; Pacholok, S. G.; Kelly, T. M.; Grant, S. K.; Wong, K. K. Synthesis of Analogs of (1,4)-3- and 5-Imino Oxazepane, Thiazepane, and Diazepane as Inhibitors of Nitric Oxide Synthases. Bioorg Med Chem Lett 2004, 14, 5907–5911.

(4) Monn, J. A.; Massey, S. M.; Valli, M. J.; Henry, S. S.; Stephenson, G. A.; Bures, M.; Hérin, M.; Catlow, J.; Giera, D.; Wright, R. A.; Johnson, B. G.; Andis, S. L.; Kingston, A.; Schoepp, D. D. Synthesis and Metabotropic Glutamate Receptor Activity of S-Oxidized Variants of (-)-4-Amino-2-Thiabicyclo-[3.1.0]Hexane-4,6-Dicarboxylate: Identification of Potent, Selective, and Orally Bioavailable Agonists for MGlu2/3 Receptors. J Med Chem 2007, 50, 233–240.

(5) Ponticello, G. S.; Freedman, M. B.; Habecker, C. N.; Lyle, P. A.; Schwam, H.; Varga, S. L.; Christy, M. E.; Randall, W. C.; Baldwin, J. J. Thienothiopyran-2-Sulfonamides: A Novel Class of Water-Soluble Carbonic Anhydrase Inhibitors. J Med Chem 1987, 30, 591–597.

(6) Moore, W. M.; Webber, R. K.; Fok, K. F.; Jerome, G. M.; Connor, J. R.; Manning, P. T.; Wyatt, P. S.; Misko, T. P.; Tjoeng, F. S.; Currie, M. G. 2-Iminopiperidine and Other 2-Iminoazaheterocycles as Potent Inhibitors of Human Nitric Oxide Synthase Isoforms. J Med Chem 1996, 39, 669–672.

(7) Nersesian, D. L.; Black, L. A.; Miller, T. R.; Vortherms, T. A.; Esbenshade, T. A.; Hancock, A. A.; Cowart, M. D. In Vitro SAR of Pyrrolidine-Containing Histamine H3 Receptor Antagonists: Trends across Multiple Chemical Series. Bioorg Med Chem Lett 2008, 18, 355–359.

(8) Rogers, G. A.; Parsons, S. M.; Anderson, D. C.; Nilsson, L. M.; Bahr, B. A.; Kornreich, W. D.; Kaufman, R.; Jacobs, R. S.; Kirtman, B. Synthesis, in Vitro Acetylcholine-Storage-Blocking Activities, and Biological Properties of Derivatives and Analogues of Trans-2-(4-Phenylpiperidino)Cyclohexanol (Vesamicol). J Med Chem 1989, 32, 1217–1230.

(9) Kaltenbach, R. F.; Nugiel, D. A.; Lam, P. Y.; Klabe, R. M.; Seitz, S. P. Stereoisomers of Cyclic Urea HIV-1 Protease Inhibitors: Synthesis and Binding Affinities. J Med Chem 1998, 41, 5113–5117.

(10) Lam, P. Y.; Ru, Y.; Jadhav, P. K.; Aldrich, P. E.; DeLucca, G. V.; Eyermann, C. J.; Chang, C. H.; Emmett, G.; Holler, E. R.; Daneker, W. F.; Li, L.; Confalone, P. N.; McHugh, R. J.; Han, Q.; Li, R.; Markwalder, J. A.; Seitz, S. P.; Sharpe, T. R.; Bacheler, L. T.; Rayner, M. M.; Hodge, C. N. Cyclic HIV Protease Inhibitors: Synthesis, Conformational Analysis, P2/P2’ Structure-Activity Relationship, and Molecular Recognition of Cyclic Ureas. J Med Chem 1996, 39, 3514–3525.

(11) Xu, B.; Feng, Y.; Cheng, H.; Song, Y.; Lv, B.; Wu, Y.; Wang, C.; Li, S.; Xu, M.; Du, J.; Peng, K.; Dong, J.; Zhang, W.; Zhang, T.; Zhu, L.; Ding, H.; Sheng, Z.; Welihinda, A.; Roberge, J. Y.; Seed, B.; Chen, Y. C-Aryl Glucosides Substituted at the 4’-Position as Potent and Selective Renal Sodium-Dependent Glucose Co-Transporter 2 (SGLT2) Inhibitors for the Treatment of Type 2 Diabetes. Bioorg Med Chem Lett 2011, 21, 4465–4470.

(12) Xu, G.; Lv, B.; Roberge, J. Y.; Xu, B.; Du, J.; Dong, J.; Chen, Y.; Peng, K.; Zhang, L.; Tang, X.; Feng, Y.; Xu, M.; Fu, W.; Zhang, W.; Zhu, L.; Deng, Z.; Sheng, Z.; Welihinda, A.; Sun, X. Design, Synthesis, and Biological Evaluation of Deuterated C-Aryl Glycoside as a Potent and Long-Acting Renal Sodium-Dependent Glucose Cotransporter 2 Inhibitor for the Treatment of Type 2 Diabetes. J Med Chem 2014, 57, 1236–1251.

(13) de Costa, B. R.; Rice, K. C.; Bowen, W. D.; Thurkauf, A.; Rothman, R. B.; Band, L.; Jacobson, A. E.; Radesca, L.; Contreras, P. C.; Gray, N. M. Synthesis and Evaluation of N-Substituted Cis-N-Methyl-2-(1-Pyrrolidinyl)Cyclohexylamines as High Affinity Sigma Receptor Ligands. Identification of a New Class of Highly Potent and Selective Sigma Receptor Probes. J Med Chem 1990, 33, 3100–3110.

(14) Horii, S.; Fukase, H.; Matsuo, T.; Kameda, Y.; Asano, N.; Matsui, K. Synthesis and Alpha-D-Glucosidase Inhibitory Activity of N-Substituted Valiolamine Derivatives as Potential Oral Antidiabetic Agents. J Med Chem 1986, 29, 1038–1046.

(15) Gao, L.-J.; Waelbroeck, M.; Hofman, S.; Van Haver, D.; Milanesio, M.; Viterbo, D.; De Clercq, P. J. Synthesis and Affinity Studies of Himbacine Derived Muscarinic Receptor Antagonists. Bioorg Med Chem Lett 2002, 12, 1909–1912.

(16) Röver, S.; Wichmann, J.; Jenck, F.; Adam, G.; Cesura, A. M. ORL1 Receptor Ligands: Structure-Activity Relationships of 8-Cycloalkyl-1-Phenyl-1,3,8-Triaza-Spiro[4.5]Decan-4-Ones. Bioorg Med Chem Lett 2000, 10, 831–834.
