The Open Conference Proceedings Journal, 2011, 2, 113-129 113 Open Access Predicting Drug Promiscuity Using Spherical Harmonic Surface Shape-Based Similarity Comparisons Violeta I. Pérez-Nueno*, Vishwesh Venkatraman, Lazaros Mavridis and David W. Ritchie*

INRIA Nancy – Grand Est, 615 rue du Jardin Botanique, 54506 Vandoeuvre-lès-Nancy, France

  Polypharmacology is becoming an increasingly important aspect in drug design. Pharmaceutical companies are discovering more and more cases in which multiple drugs bind to a given target (promiscuous targets) and in which a given drug binds to more than one target (promiscuous ligands). These phenomena are clearly of great importance when considering drug side-effects. In the last 4 years, more than 30 drugs have been tested against more than 40 novel secondary targets based on promiscuity predictions. Current methods for predicting promiscuity typically aim to relate protein receptors according to their primary sequences, the similarity of their ligands, and more recently, the similarity of their ligand binding pockets. Here, we present a spherical harmonic (SH) surface shape-based approach to predict rapidly promiscuous ligands and targets by comparing sets of SH ligand and protein shapes, respectively. We present details of our approach applied to a wide range of PDB complexes comprising ligands in a selected subset of the MDL Drug Data Report (MDDR) database which are distributed over 249 diverse pharmacological targets. The shape similarity of each ligand to each target’s ligand set is quantified and used to predict promiscuity. We also analyse the correlation between binding pocket and ligand shapes. We compare our promiscuity predictions with experimental activity values extracted from the BindingDB database.    Consensus shapes, drug promiscuity, ligand shape space, protein pocket space, protein sequence space, shape similarity, spherical harmonic shapes.

INTRODUCTION early, before the cost associated with developing a drug candidate rises steeply [9]. On the other hand, promiscuity is Drug promiscuity may be defined as the specific binding not always unwelcome and it can even be exploited for drug of a drug-like molecule to more than one target. On the other development. The use of old drugs for new targets has been hand, if a protein binds different ligands, it can be considered shown to provide a promising way to reduce both the time as a promiscuous receptor [1]. These are notions illustrated and cost of drug development [10]. in Fig. (1). The concept of ‘target-hopping,’ whereby a binder for one target can be considered as the basis for leads Given that it is currently infeasible to screen a drug for another target has historically been extremely fruitful in against all of the proteins expressed by the human genome, lead discovery [2]. Nowadays, polypharmacology is several computational techniques have been developed to becoming an increasingly important aspect in drug design. In predict the pharmacological profiles of known drugs. These the last 4 years, more than 30 drugs have been tested against range from the well-known docking of compounds into more than 40 novel secondary targets based on promiscuity protein structures to the use of machine learning methods predictions [3]. Pharmaceutical companies are discovering [11, 12], sequence comparison [13] side-effect similarity [6], more and more cases in which multiple drugs bind to a given and fingerprint/pharmacophore comparison methods [14, target (promiscuous targets) and in which a given drug binds 15]. These in silico methods typically aim to relate protein to more than one target (promiscuous ligands). Both of these receptors to each other quantitatively based on their phenomena are clearly of great importance when considering similarity in primary sequence space [14], ligand chemical drug side-effects. For example, a common reason for descriptor space [15], and more recently in their terminating a drug development program is that the leads are pharmacophoric pocket descriptor space [16]. found to be non-selective or promiscuous [4]. Thus, the in The first and most common techniques use e.g. the silico prediction of unwanted side effects caused by the BLAST [17, 18] or FASTA [19] sequence alignment tools to promiscuous behaviour of drugs and their targets is highly create similarity maps in protein sequence space [20]. Later, relevant to the pharmaceutical industry. Considerable effort other approaches were developed which compare the is now being put into the computational [5, 6] and chemistry of the targets’ ligands following the hypothesis experimental [7, 8] screening of several suspected off-target that similar molecules are likely to have similar properties proteins in the hope that side effects might be identified [21]. For example, Keiser et al., relate receptors to each other according to the chemical similarity of their ligands. In their Similarity Ensemble Approach (SEA) [22], the *Address correspondence to these authors at the INRIA Nancy – Grand Est, 615 rue du Jardin Botanique, 54506 Vandoeuvre-lès-Nancy, France; calculated probability that two molecules might interact with Tel: + +33-3-83593045; Fax: + +33-3-83413079; the same target by chance is expressed using an expectation E-mails: [email protected], [email protected]

2210-2892/11 2011 Bentham Open 114 The Open Conference Proceedings Journal, 2011, Volume 2 Pérez-Nueno et al. value [23], which is conceptually similar to the E-value used mechanics theory and encodes these properties as SH in sequence alignment software such as BLAST [17]. expansions [35, 36]. Surface shapes are represented as radial Mestres et al., relate proteins in ligand space using three in– distance expansions of the molecular surface, r(,), with house molecular descriptors (PHRAG, FPD, SHED) [24, respect to a selected harmonic coordinate origin (CoH), 25]. With the increase in recent years of approaches for which is normally set equal to the molecular center of comparing protein pockets [26] such as four-point gravity (CoG). This allows an entire molecular surface shape pharmacophoric descriptors (FLAP [27]), three-point to be captured using a Fourier-like polynomial expansion, pharmacophoric descriptors (Cavbase [28], SiteEngine [29], such as Equation 1

SuMo [30]), geometric hashing methods (Kinnings and L l Jackson [31]), and graph-matching-base algorithms (IsoCleft Equation 1 r(), =  almylm(), [32]), the notion of a binding pocket similarity space has l =0 m=l recently been proposed. This is based on the principal that protein binding pockets are the place where the interactions where  and  are the spherical coordinates, ylm(,) are real between a protein and a ligand are formed. Hence, they must spherical harmonics, alm are the expansion coefficients, and have complementary shapes and physicochemical properties L is the order or highest polynomial power of the expansion. to the small molecules that they accommodate. Therefore, As determined previously, we use L = 6 for shape calculating the similarity of binding pockets allows proteins comparisons [43]. Mathematically, the SH approach applies with similar function and selectivity for the same binding only to “starlike” shapes which have single-valued surfaces partners to be related. For example, Weskamp et al. with respect to radial rays projections from the chosen CoH. compared targets by the similarity of their binding pockets Most molecules do not satisfy this requirement. Hence, the using the LIGSITE program [14]. Milleti et al., recently SH surfaces described here should be considered as “surface related receptors to each other using pocket-based “shape envelopes” which enclose the true molecular surface. For context” descriptors [16, 33, 34]. highly non-starlike molecules this surface envelope can be a rather severe approximation to the true surface. Indeed, in Here, we present a shape-based approach which uses extreme cases, the CoH can fall outside the molecular spherical harmonic (SH) representations [35, 36] to compare surface, and this can cause the software to fail. molecular surfaces efficiently. This approach relates receptors to each other by the SH shape similarity of their PARAFIT and MSSH calculate shape superpositions by ligands and their binding pockets. Since shape exploiting the special rotational properties of the SH complementarity is an essential feature for molecular functions. For example, rotated SH expansion coefficients recognition, using ligand and binding pocket shapes should for a molecule B can be calculated as provide a good way to characterise their properties. If two l ' ( l ) Equation 2 binding pockets of different proteins share a common shape, blm = Rmm' (),, blm' it is likely that ligands that bind to part of one binding pocket m' =l (l) will also be recognized in the corresponding part of the other where (, , ) are zyz Euler rotation angles and R (, , ) pocket. On the other hand, if two ligands of different is the l’th real Wigner rotation matrix. To calculate a proteins share a similar shape, it is likely that both of them superposition between a pair of molecules A and B, the will complement the shape of each binding pocket. Hence, harmonic coordinate origin (CoH) of molecule B is by identifying similar ligands and binding pocket shapes, our translated to that of the fixed reference molecule A, and a approach aims to provide a shape-based way to predict rotational search to a resolution of 1 degree in each Euler promiscuous ligands and targets. rotation angle is then performed to find the rotation which We present the results of applying our approach to a wide minimizes the distance, DAB, between the corresponding range of ligands which Shuffenhauer et al., [37] previously pairs of SH expansions (Equation 3). selected from the MDL Drug Data Report (MDDR) database Equation 3 [38], and for which crystallographic protein-ligand complexes exist in the Protein Data Bank (PDB) [39]. This Thanks to the orthogonality of the basis functions, gives an annotated list of ligands for 249 protein targets of Equation 3 reduces to pharmacological interest. The shape similarity between 2 2 ' Equation 4 ligands and between binding pockets for these selected DAB = a + b  2ab protein targets is quantified and used to predict promiscuity. We analyse the correlation between binding pocket and where a and b represents vectors of expansion coefficients. ligand shape spaces. We also compare our promiscuity In practice, it is often more convenient to compare predictions with experimental activity values extracted from molecules using a normalised similarity score such as the BindingDB database [40]. Equation 5 [43, 44]. METHODS ab Equation 5 S = 2 2 Calculating SH Shapes (a + b  ab) We use the PARASURF and PARAFIT modules [41] to Using PARAFIT it is also straight-forward to construct calculate and superpose SH molecular surfaces, and the the average or “consensus” shape, r(), , of a group of N MSSH [42] program to calculate the SH shapes of protein molecules by calculating the average of their superposed SH k pockets. PARASURF calculates molecular shape and expansion coefficients, alm, for k=1,..., N, as shown in electronic properties from semi-empirical quantum Equation 6 [44]. Predicting Drug Promiscuity Using Spherical Harmonic Surface The Open Conference Proceedings Journal, 2011, Volume 2 115

Fig. (1). Drug Promiscuity. Left: multiple drugs bind to a given target (promiscuous target). Right: a given drug binds to more than one target (promiscuous ligand).

N L l 1 k Equation 6 explicitly consider multiple ligand conformations. On the r(), = almylm(), N  other hand, calculating SH consensus shapes can take into k=1 l=0 m=l account 3D shape features of several conformations to be However, before computing the average, each molecule used as an averaged pseudo-molecule [48]. Schuffenhauer’s in the consensus must first be rotated to minimize the subset is based on a simple ontology [37] that maps distance between it and the remaining N - 1 molecules. In Commission (EC) numbers [49], GPCRs, ion channels, and practice, because these rotations are not known a priori, the nuclear receptors to MDDR activity classes. In this dataset, consensus shape is constructed iteratively as described all ligands are annotated where possible according to the previously [44]. We have shown that the consensus shape- MDDR activity class. It should be noted that these ligand based representation can be used to capture the essential 3D annotations can be rather general text descriptions, such as shape features of several known high-affinity ligands and to “dihydrofolate reductase inhibitor” or “anticancer agent”, for encode them in the form of a single representative pseudo- example. Therefore, an annotation can correspond to molecule which can be used as a VS query [45-48]. multiple targets. Supplementary Table 1 shows the list of The SH consensus approach may also be used to target annotations used for our promiscuity predictions. calculate the consensus shapes of both ligand molecules and Fig. (2) shows the workflow followed to process the data protein receptor pockets. Here, the SH coefficients of the for our promiscuity predictions. The MDDR database was binding pockets are computed using MSSH. Each pocket filtered according to Schuffenhauer’s subset to give 8659 surface is calculated around the bound ligand coordinates ligands with 196 unique annotations. These ligands were using the default MSSH radial cut-off distance of 20 Å. It is then used to search the PDB hetero-atom dictionary to find worth noting that MSSH can usually calculate a good ligands for which crystal structures exist. These structures representation of closed pockets, but open pockets on the were extracted from the PDB using their three-letter PDB protein surface are not represented well, and this limits the ligand codes to give a total of 957 protein-ligand complexes. quality of the results. Nonetheless, all pockets are analysed. Any structures solved by NMR and those without a CATH Once the SH coefficients of the binding pockets are code were removed to give 687 proteins belonging to 76 calculated, the same consensus algorithm is applied as unique annotations. Only one chain was kept for each described above. protein. Ligand and Target Data Preparation Predicting Promiscuity Using SH Shape-Based Similarity We applied our approach to those PDB complexes which Ligand SH Shape Similarity contain the ligands in Schuffenhauer’s subset of the MDDR database (version 2010.2) [38], comprising 65367 To predict ligand promiscuity, all the ligands from the compounds distributed over 249 diverse pharmacological PDB complexes were extracted and transformed into a targets for which experimental binding information is canonical orientation using PARAFIT. The SH similarity of known. In other words, we consider only crystallogra- each ligand with each target’s ligand set was calculated to phically determined ligand coordinates and we do not give an all-vs-all ligand interaction matrix. The matrix was then analyzed by using three Tanimoto thresholds. 116 The Open Conference Proceedings Journal, 2011, Volume 2 Pérez-Nueno et al.

Fig. (2). Data processing flowchart. We used a subset of the MDDR database comprising 76 diverse pharmacological targets having at least one ligand in the PDB heteroatom dictionary. The similarity of each ligand SH shape to each target’s ligand set shape is calculated and quantified using the Tanimoto coefficient. The similarity between each SH consensus shape pocket to each target’s consensus binding pocket is calculated in the same way. Finally, the ligand-pocket shape interaction matrix is analyzed to predict promiscuity.

Binding Pocket SH Shape Similarity and labelled as in Fig. (3a). Each annotation is represented by a consensus binding pocket shape. The interaction matrix For promiscuity prediction in binding pocket space, we is clustered by consensus binding pocket shape similarity. It positioned all protein-ligand complexes according to the can be seen that this matrix also groups together related canonical orientations of their ligands [50]. This placed all targets. For example, several distinct groups each with high pockets in a ligand-defined standard orientation in order to shape similarity (Tanimoto  0.9) are found for the serine calculate SH binding pocket shapes using MSSH. The proteases, other proteases, nuclear hormone receptors, consensus shapes of the pockets belonging to the same kinases, GPCRs, ion channels, and metallo MDDR annotation were calculated with PARAFIT. The SH enzymes. More specifically, the serine proteases cluster similarity between consensus pocket shapes was calculated includes coagulation factors Xa and VIIIa, thrombin, trypsin, to give an all-vs-all binding pocket interaction matrix. As -lactamase, interleukin-8, and serine type-d ala-d-ala before, this matrix was analyzed using the same three carboxypeptidase. The main nuclear hormone receptor Tanimoto thresholds. cluster includes the vitamin d3-like receptors, estrogen, and Ligand vs Binding Pocket SH Shape Similarity androgen. Similarly, the GPCR cluster includes the - adrenoreceptor type 1, -adrenoreceptor, ETA, and Finally, a ligand-pocket SH shape interaction matrix was endothelin. The main enzyme cluster includes the peptidase, also analyzed in the same way. serine endopeptidase, alcohol dehydrogenase, purine RESULTS nucleoside phosphorilase hydroxymethylglutaryl CoA reductase, cholestenone 5-reductase, adenosylhomo- Ligand-Ligand Interaction Matrix cysteinase, lanosterol synthase, and guanylate cyclase. It can Fig. (3a) shows the form of the 677 ligand-ligand also be seen that each consensus binding pocket matches interaction matrix. A high resolution zoom of a portion of itself (diagonal in dark blue). Hence, it can be seen that this figure is shown in Fig. (4). A ligand Tanimoto score comparing receptor pocket shapes correctly groups many greater than 0.9 is shown in dark blue. Ligands with a protein targets into their expected macromolecular target Tanimoto score between 0.7 and 0.9 are shown in blue, and family. scores lower than 0.7 are shown in light blue. Both axes are Analysis of the pocket-pocket matrix shows some highly labelled according to the MDDR annotations to which the similar binding pockets. Fig. (6) shows four examples of 677 ligands belong. It can be observed that each ligand shape shape supperpositions between the correlated pockets. matches itself (diagonal in dark blue). Similarly, ligands Firstly, the consensus pocket shape of androgen (Fig. 6 top) belonging to related targets are found to have similar shapes shares high shape similarity with those of -adrenoreceptor, (dark blue areas) such as some nuclear hormone receptors purine nucleoside phosphorilase, thimidine kinase, hydroxy- (vitamin d3-like receptors, estrogen, androgen, progesterone) methylglutaryl CoA reductase, and adenosylhomocysteinase. and serine proteases (coagulation factors Xa and VIIIa, The promiscuity predicted for androgen is consistent with thrombin, trypsin, -lactamase). several existing MDDR activity classes (e.g. androgen, Pocket – Pocket Interaction Matrix inhibitor, antiandrogen) for the androgen ligands. Secondly, the consensus pocket of hydroxymethylglutaryl Fig. (3b) (high resolution in Fig. 5) shows the 76 CoA reductase (Fig. 6 middle) shares high shape similarity consensus pocket-pocket interaction matrix. It is color coded with the consensus binding pockets of lanosterol synthase, Predicting Drug Promiscuity Using Spherical Harmonic Surface The Open Conference Proceedings Journal, 2011, Volume 2 117

a) Ligand vs ligand b) Pocket vs pocket c) Ligand vs pocket T animoto ≥ 0.9 Dark blue Tanimoto ≥ 0.9 Dark blue Similarity found & PDB complex exists Dark blue 0.7 ≤ Tanimoto < 0.9 Blue 0.7 ≤ Tanimoto < 0.9 Blue Similarity found & no PDB complex exists Blue Tanimoto < 0.7 Light blue Tanimoto < 0.7 Light blue No similarity found Light blue Tanimoto threshold 0.7 Tanimoto threshold 0.9 )

c l us t e re d (

677 L i ga nds 677 L i ga nds 677 L i ga nds

P oc ke t s 76

677 Ligands 76 Pockets (clustered) 76 Pockets 76 Pockets

d) In silico vs in vitro interaction matrices

In silico In vitro (IC50/nM) In vitro vs in silico Tanimoto ≥ 0.9 Dark blue 0 < IC50/nM ≤ 1 Dark blue 0.7 ≤ Tanimoto < 0.9 Blue 1 < IC50/nM ≤ 10 Blue Tanimoto < 0.7 Light blue IC50/nM > 10 Light blue

127 L i ga nds 127 L i ga nds 127 L i ga nds

76 Pockets 76 Pockets 76 Pockets

Fig. (3). Predicting Promiscuity Using SH Shape-Based Similarity. This figure shows illustrative images of a) the ligand-ligand interaction matrix, b) the binding pocket-binding pocket interaction matrix, c) the ligand-binding pocket interaction matrix, d) the in silico vs in vitro interaction matrices. purine nucleotide phosphorilase, thimidine kinase, Overall, analysis of the pocket interaction matrix in Fig. cholestenone 5-reductase, and adenosylhomocysteinase. 5 points to the prediction of several very promiscuous targets Several MDDR activity classes (e.g. hypolipidemic, HMG- (dark blue rows) such as the aforementioned examples, i.e. CoA reductase  inhibitor) are also related to the GABA-A alpha subunit, androgen, hydroxymethylglutaryl hydroxymethylglutaryl CoA reductase ligands. Thirdly, CoA reductase, and thrombin, as well as estrogen, vitamin GABA-A alpha is predicted to be another highly 3d-like, acetylcholinesterase, RNA directed DNA promiscuous target. Its pocket shows high shape similarity polymerase, and purine nucleoside phosphorylase, and also with the pockets of -adrenoreceptor, purine nucleoside selective targets such as caspase (light blue row). These phosphorilase, thimidine kinase, hydroxymethylglutaryl CoA predictions agree with several related activity classes found reductase, cholestenone 5-reductase, adenosylhomo- in the MDDR for each of these annotations. cysteinase, lanosterolsynthase, vitamin d3-like receptors, estrogen, androgen, adenosindeaminase, acetylcholine- Ligand-Pocket Interaction Matrix sterase, and RNA directed DNA polymerase. This predicted Fig. (3c) shows the ligand-consensus binding pocket high promiscuity is also supported by the large number of interaction matrix using a Tanimoto threshold of 0.7 and a related MDDR activity classes for this annotation (non Tanimoto threshold of 0.9. A high resolution zoom of opioid analgesic, GABA-A/benzodiazepine receptor, portions of these figures is shown in Figs. (7 and 8), sedative/hypnotic, anxiolytic, agent for sleep disorders, respectively. The x-axis represents the 76 annotation benzodiazepine agonist, alcohol deterrent, anticonvulsant, consensus binding pockets and the y-axis the 677 ligands agent for premedication, antimigraine, and intravenous distributed in those annotations. Both axes are labelled anesthetic). Finally, the thrombin consensus binding pocket according to the MDDR annotations to which the 677 is found to be similar to the consensus pocket shapes of complexes belong. The matrix with the lower threshold coagulation factors Xa and VIIIa, -lactamase, trypsin, highlights the confirmed interactions according to the interleukin-8, and serine-type d-ala-d-ala carboxypeptidase. crystallised structures present in the PDB. It can be observed This also agrees with the activity classes of the thrombin that each ligand shape matches the annotation-based ligands in MDDR (e.g. anticoagulant, thrombin inhibitor, consensus pocket that the ligand belongs to (diagonal in dark factox Xa inhibitor, trypsin inhibitor, protease inhibitor). blue). A lower and more permissive threshold helps to 118 The Open Conference Proceedings Journal, 2011, Volume 2 Pérez-Nueno et al.

transf err

ing.alky h phosphor ydr X3..5..cy o l.or xyme ser .ar r t pr e ine.type.d.ala.d.ala.carbo ibonucleoside.diphosphat X3..5..cy trah ibosy y os t glut h l.gr clic.nucleo y t glut pur aglandin.endoper pr ydr lglut X1 choles oups..o amat lgly ocollagen.pr ine.nucleoside.phosphor .phosphatidy amat of phosphor rna.dir ar clic.gmp.phosphodies oly unspecif cinamide.f nad.adp.r e.r y adenosy be l.coa.r t e.r lpoly enone.5alpha.r t ornit plat dih tide.phosphodies ecept alcohol.deh her carbonat t pr ect dipeptidy me ur monoamine.o adenosine.deaminase a.adr nucleo coagulation.fact peptidy ecept ser t gaba.a.alpha.subunit coagulation.fact o ydr h ace nitr phosphodies .t ele idine.phosphor glut aldeh hine.decarbo ic.dies lanos t ed.dna.polymerase ymidy t han.me ic.monoo ein.tyr educt ine.endopeptidase alloendopeptidase be oline.dio alpha.glucosidase guan aden cy cy or phospholipase.a2 of ic.o adenosine.type.1 adenosine.type.3 t.activating.fact ibosy xant orm enocept ty lhomocy t linosit hiv his or h amat .kainat t cloo cloo olat tidy be a.adr o lcholines s ymidine.kinase l.dipeptidase.a .nmda.subunit

e.deh vit yde.r t pr .1 t xide.synt xide.synt t er lat er pr s ase..nadph2. l.peptidase.iv t y y y amine.type.2 t hine.o osine.kinase er .r a.lact tr int xypeptidase o amin.d3.lik ltransf e.r lat lat ydr ltransf e.r ltransf xy xy cat ol.synt t oges y e.synt e.synt ol.4.kinase e t omely h cat .h endo caspase.1 ein.kinase l.sulfat enocept er peptidase xy xy his tr y andr e.cy e.cy e.subunit genase.2 genase.1 educt educt educt educt or es t ydr ydrat ogenase hepsin.b l.gr xidase.a s opepsin hr tr leukin.8 hepsin.l genase genase t t .type.1 erase.i amase t einase tr ypt xidase xy t tr t t t ombin amine or er erase erase erase erase erase erase olase t clase clase oups y ypsin y hase or ogen ogen hase hase hase hase sin.1 helin lase .viia lase lase one ase ase ase ase ase ase ase .xa et

or or

e a

ase educt r e diphosphat − ibonucleoside

r

genase xy dio oline pr − ocollagen

ribonucleoside.diphosphate.reductasepr

e subunit e kainat or ecept r e amat

procollagen.proline.dioxygenaseglut

1 −

glutamate.receptor.kainate.subunitcaspase

lase xy decarbo hine ornit

caspase.1

clase cy e lat y

ornithine.decarboxylaseaden

opepsin tr e r 1 −

adenylate.cyclasehiv

erase ltransf tidy

hiv.1.retropepsinnucleo

ase ypt

nucleotidyltransferasetr

tryptasepeptidase

or type 1 type or enocept adr a t

peptidasebe oups gr l y h t me han t her t o oups, gr l y ar or l alky ing err transf

beta.adrenoceptor.type.1

ine endopeptidase ine

transferring.alkyl.or.aryl.groups..other.than.methyl.groupsser ogenase ydr

serine.endopeptidasedeh alcohol helin t

alcohol.dehydrogenaseendo a et

or fact activating t ele endothelinplat

or enocept adr a t etbe a

lase y phosphor nucleoside − ine platelet.activating.factorpur

ymidine kinase ymidine h beta.adrenoceptort

ase (nadph2) ase educt r coa − l y ar lglut y h t xyme o ydr purine.nucleoside.phosphorylaseh

ase educt r − 5alpha enone t

thymidine.kinasecholes

einase t s lhomocy

hydroxymethylglutaryl.coa.reductase..nadph2.adenosy

hase synt ol er t

cholestenone.5alpha.reductaselanos

clase cy e lat y

adenosylhomocysteinaseguan

amine type 2 type amine t

lanosterol.synthasehis

e lik − d3 amin

guanylate.cyclasevit

ogen tr

histamine.type.2es

ogen

vitamin.d3.likeandr

gaba a alpha subunit esalpha a tr ogengaba

adenosine deaminase androgenadenosine

erase t lcholines ty

gaba.a.alpha.subunitace

ed dna polymerase dna ed ect dir − rna

adenosine.deaminase

acety3 lcholinestype teraseadenosine

rna.directed.dna.polymerasea2 phospholipase sin 1 sin omely tr

adenosine.type.3s lase y phosphor idine

phospholipase.a2ur

adenosine type 1 strtype omelysin.1adenosine hase synt xide o endoper − aglandin t os pr

genase xy monoo ic uridine.phosphorylaseunspecif

genase 1 genase xy cloo adenosine.type.1cy

xidase o hine prostaglandin.endoperoxide.synthasexant

xidase a xidase

unspecific.monooo xygenasemonoamine

ase educt r yde .1aldeh

erase ltransf ibosy r − xanthine.oxidaseadp nad

glucosidase −

monoamine.oxidase.aalpha

ase ydrat deh e

aldehyde.reductasecarbonat

one er t oges

nad.adp.ribosyltransferasepr

osine kinase osine tyr − ein t o

alpha.glucosidasepr

erase ltransf y orm f cinamide lgly ibosy

carbonate.dehydratasephosphor

kinase − 4 ol linosit phosphatidy −

progesterone1

erase t phosphodies tide nucleo − clic cy − ,5’

protein.tyrosine.kinase3’

peptidase iv peptidase − l

phosphoribosylglycinamide.formyltransferasedipeptidy

ase sulfat − l y er t

s

X1.phosphatidylinositol.4.kinase

hase synt e amat glut lpoly oly of ydr trah e

X3..5..cyclic.nucleotide.phosphodiesteraset

hase synt xide o − ic

dipeptidyl.peptidase.ivnitr

erase t phosphodies gmp − clic cy − ,5’

steryl.sulfatase3’

ase educt r e olat of ydr

tetrahydrofolylpolyglutamate.synthasedih hase synt e lat ymidy h

nitric.oxide.synthaset erase i erase t

X3..5..cyclic.gmp.phosphodiesterasephosphodies alloendopeptidase t me

dihydrofolate.reductase dipeptidase a dipeptidase − l peptidy

ein kinase ein t o thymidylate.synthasepr

olase ydr h er t dies ic

phosphodiesterase.iphosphor

amine t metalloendopeptidasehis

hepsin b hepsin peptidyl.dipeptidase.acat

hepsin l hepsin protein.kinasecat

or nmda subunit nmda or ecept r e amat phosphoric.diester.hydrolaseglut

genase 2 genase xy cloo

histaminecy

xypeptidase carbo ala − d − ala − d type − ine

cathepsin.bser

8 − leukin er

cathepsin.lint

ypsin

glutamate.receptor.nmda.subunittr

amase lact − a t

cyclooxygenase.2be

ombin hr

serine.type.d.ala.d.ala.carboxypeptidaset

or viia or

fact interleukin.8coagulation

or xa or coagulation fact trypsincoagulation beta.lactamase r pr glut caspase ornit aden hiv nucleo tr peptidase be transf ser alcohol deh endo et plat be pur t h choles adenosy lanos guan his vit es andr gaba a alpha subunit adenosine deaminase ace rna adenosine type 3 phospholipase a2 s ur adenosine type 1 pr unspecif cy xant monoamine o aldeh nad adp alpha carbonat pr pr phosphor 1 3’ dipeptidy s t nitr 3’ dih t phosphodies me peptidy pr phosphor his cat cat glut cy ser int tr be t coagulation fact coagulation fact e h h hr ibonucleoside tr t ydr − ypt ypsin ,5’ ,5’ er trah ymidine kinase ymidy a ocollagen os oges o o idine phosphor

thrombin cloo cloo

amin d3 er tr t t omely t phosphatidy ombin t t − ydr hepsin b hepsin l a adr a adr t a ine endopeptidase ic ine t t − ine amine type 2 amine ele ty amat amat alloendopeptidase ogen y ein ein kinase − − 1 r hine o o leukin − t hine decarbo ase ogen dir

coagulation.factor.viia − l y y aglandin t cy cy lcholines t − yde r − xyme err helin ydr lact er o of lat lat − − xy xy t t activating fact tidy sulfat e − glucosidase t ect enone 5alpha l xide synt type d er

coagulation.factor.xa clic clic nucleoside phosphor lat − olat ol synt tr tyr − enocept enocept ing alky ic monoo − e cy e cy sin 1 genase 1 genase 2 lhomocy e r e r e deh of l dipeptidase a amase opepsin ibosy ic dies one r − 1 ed dna polymerase e synt ltransf ibosy − − xidase educt osine kinase oly − − peptidase iv t ecept ecept ydr h e r 8 lik − ase nucleo gmp phosphodies clase clase y t pr erase i − lpoly xidase a − e lglut t educt ydrat hase − ogenase lgly erase linosit endoper ala oline dio ltransf or viia or xa t diphosphat ase l or ar hase hase or type 1 or erase er h or kainat or nmda subunit y s xy xy − lase ar cinamide f t

Fig. (4). High resolution zoom of the top right corner (red box) of the ligand-ligand SH shapeglut interaction matrix. The interaction einase tide phosphodies d ase lase ase genase y ydr ol 4 − − l erase or − amat ala carbo y r educt l gr coa r o xy matrix is color coded using darker blue for a high shape similarity (Tanimoto  0.9), blue for medium similarity (Tanimotoolase between 0.7 and − xide synt e subunit kinase genase oups, o e synt e r educt orm ase y educt 0.9) and light blue for low similarity (Tanimoto < 0.7).lase Both axes are labelled according to the MDDR annotations in which the ligands are t erase xypeptidase y hase hase ltransf t ase (nadph2) her t distributed. ase

t erase han me erase t h

identify the similarities between known ligandsy for a given reductase, glutamate receptor kainite subunit, estrogen, l gr target and some other ligands but it causesoups quite a lot of false vitamin d3-like receptors, lanosterol synthase, adenosylho- positives (large blue areas), whereas using a more restrictive mocysteinase, acetylcholinesterase, RNA directed DNA threshold clearly highlights the predicted promiscuous polymerase, thymidine kinase, purine nucleoside targets (i.e. a high number of dark blue pixels in a single phosphorylase and ribonucleoside diphosphate reductase). column). The 0.9 threshold matrix highlights as blue The fact that there exist in the PDB several complexes with columns several predicted promiscuous targets which were ligands for different targets confirms these predictions (dark identified previously in the ligand-ligand and pocket-pocket blue columns highlighted in the 0.7 threshold matrix). It is matrices (e.g. GABA-A alpha subunit, androgen, or interesting to note that the caspase pocket (which appears in hydroxymethylglutaryl CoA reductase) as well as some the centre of Fig. 7, 8, and 9) has a completely light blue others (procollagen proline dioxygenase, cholestenone 5 column indicating that this is a highly specific target. Predicting Drug Promiscuity Using Spherical Harmonic Surface The Open Conference Proceedings Journal, 2011, Volume 2 119

transf

err ing.alky h phosphor ydr

X3..5..cy o l.or xyme ser .ar r t pr e ine.type.d.ala.d.ala.carbo ibonucleoside.diphosphat X3..5..cy trah ibosy y os t glut h l.gr clic.nucleo y t glut pur aglandin.endoper pr ydr lglut X1 choles oups..o amat lgly ocollagen.pr ine.nucleoside.phosphor .phosphatidy amat of phosphor rna.dir ar clic.gmp.phosphodies oly unspecif cinamide.f nad.adp.r e.r y adenosy be l.coa.r t e.r lpoly enone.5alpha.r t ornit plat dih tide.phosphodies ecept alcohol.deh her carbonat t pr ect dipeptidy me monoamine.o ur adenosine.deaminase a.adr nucleo coagulation.fact peptidy ecept ser t gaba.a.alpha.subunit coagulation.fact o ydr h nitr ace phosphodies .t ele idine.phosphor glut aldeh hine.decarbo ic.dies lanos t ed.dna.polymerase ymidy t han.me ic.monoo ein.tyr educt ine.endopeptidase alloendopeptidase be oline.dio alpha.glucosidase cy cy guan aden or phospholipase.a2 of ic.o adenosine.type.1 adenosine.type.3 t.activating.fact ibosy xant orm enocept ty lhomocy t linosit hiv his or h amat .kainat t cloo cloo olat tidy be a.adr o lcholines s ymidine.kinase l.dipeptidase.a .nmda.subunit e.deh vit yde.r t pr .1 t xide.synt xide.synt t er lat er pr s ase..nadph2. l.peptidase.iv t y y y amine.type.2 t hine.o osine.kinase er .r a.lact tr int xypeptidase o amin.d3.lik ltransf e.r lat lat ydr ltransf e.r ltransf xy xy cat ol.synt t oges y e.synt e.synt ol.4.kinase e t omely h cat .h endo caspase.1 ein.kinase l.sulfat enocept er peptidase xy xy his tr y andr e.cy e.cy e.subunit genase.2 genase.1 educt educt educt educt or es t ydr ydrat ogenase hepsin.b l.gr xidase.a s opepsin hr tr leukin.8 hepsin.l genase genase t t .type.1 erase.i amase t einase tr ypt xidase xy t tr t t t ombin amine or er erase erase erase erase erase erase olase t clase clase oups y ypsin y hase or ogen ogen hase hase hase hase sin.1 helin lase .viia lase lase one ase ase ase ase ase ase ase .xa et or or

e

a

ase educt r e diphosphat − ibonucleoside r

genase xy dio oline pr − ocollagen ribonucleoside.diphosphate.reductase pr

e subunit e kainat or ecept r e amat

procollagen.proline.dioxygenase glut

1 − glutamate.receptor.kainate.subunit caspase

lase xy decarbo hine caspase.1 ornit

clase cy e lat y

ornithine.decarboxylase aden

opepsin tr e r 1 − adenylate.cyclase hiv

erase ltransf tidy hiv.1.retropepsin nucleo

ase ypt nucleotidyltransferase tr

tryptase peptidase

or type 1 type or enocept adr a t peptidase be

oups gr l y h t me han t her t o oups, gr l y ar or l alky ing err beta.adrenoceptor.type.1 transf

ine endopeptidase ine

transferring.alkyl.or.aryl.groups..other.than.methyl.groups ser

ogenase ydr serine.endopeptidasedeh alcohol

helin t alcohol.dehydrogenase endo

a

endothelin et

or fact activating t ele eta plat

or enocept adr a t platelet.activating.factor be

lase y phosphor nucleoside − ine beta.adrenoceptor pur

ymidine kinase ymidine h purine.nucleoside.phosphorylase t

thymidine.kinase ase (nadph2) ase educt r coa − l y ar lglut y h t xyme o ydr

hydroxymethylglutaryl.coa.reductase..nadph2. h ase educt r − 5alpha enone t

cholestenone.5alpha.reductase choles einase t s lhomocy adenosy

adenosylhomocysteinase hase synt ol er t lanos

clase cy e lat y lanosterol.synthase guan

guanylate.cyclase amine type 2 type amine t

histamine.type.2 his e lik − d3 amin

vitamin.d3.like vit ogen tr

estrogen es ogen

androgen andr

gaba.a.alpha.subunitsubunit alpha a gaba

adenosine.deaminasedeaminase adenosine erase t lcholines ty

acetylcholinesterase ace ed dna polymerase dna ed ect dir − rna rna.directed.dna.polymerase

adenosine.type.33 type adenosine

phospholipase.a2a2 phospholipase sin 1 sin omely tr

stromelysin.1 s lase y phosphor idine

uridine.phosphorylase ur

adenosine.type.11 type adenosine hase synt xide o endoper − aglandin t os

prostaglandin.endoperoxide.synthase pr genase xy monoo ic

unspecific.monooxygenase unspecif genase 1 genase xy cloo

cyclooxygenase.1 cy xidase o hine

xanthine.oxidase xant xidase a xidase o monoamine

monoamine.oxidase.a ase educt r yde

aldehyde.reductase aldeh erase ltransf ibosy r −

nad.adp.ribosyltransferaseadp nad glucosidase −

alpha.glucosidase alpha ase ydrat deh e

carbonate.dehydratase carbonat one er t oges

progesterone pr osine kinase osine tyr − ein t o

protein.tyrosine.kinase pr erase ltransf y orm f cinamide lgly ibosy

phosphoribosylglycinamide.formyltransferase phosphor kinase − 4 ol linosit phosphatidy −

X1.phosphatidylinositol.4.kinase 1 erase t phosphodies tide nucleo − clic cy − ,5’

X3..5..cyclic.nucleotide.phosphodiesterase 3’ peptidase iv peptidase − l dipeptidy

dipeptidyl.peptidase.iv ase sulfat − l y er t

steryl.sulfatase s hase synt e amat glut lpoly oly of ydr trah e

tetrahydrofolylpolyglutamate.synthase t hase synt xide o − ic

nitric.oxide.synthase nitr erase t phosphodies gmp − clic cy − ,5’

X3..5..cyclic.gmp.phosphodiesterase 3’ ase educt r e olat of ydr

dihydrofolate.reductase dih hase synt e lat ymidy h

thymidylate.synthase t erase i erase t

phosphodiesterase.i phosphodies alloendopeptidase t

metalloendopeptidase me dipeptidase a dipeptidase − l

peptidyl.dipeptidase.a peptidy ein kinase ein t o pr

protein.kinase olase ydr h er t dies ic

phosphoric.diester. phosphor amine t

histamine his hepsin b hepsin

cathepsin.b cat hepsin l hepsin

cathepsin.l cat or nmda subunit nmda or ecept r e amat

glutamate.receptor.nmda.subunit glut genase 2 genase xy cloo

cyclooxygenase.2 cy xypeptidase carbo ala − d − ala − d type − ine

serine.type.d.ala.d.ala.carboxypeptidase ser 8 − leukin er

interleukin.8 int ypsin

trypsin tr amase lact − a t be

beta.lactamase ombin hr

thrombin t or viia or

coagulation.factfact or.viia coagulation or xa or

coagulation.factfact or.xa coagulation r pr glut caspase ornit aden hiv nucleo tr peptidase be transf ser alcohol deh endo et plat be pur t h choles adenosy lanos guan his vit es andr gaba a alpha subunit adenosine deaminase ace rna adenosine type 3 phospholipase a2 s ur adenosine type 1 pr unspecif cy xant monoamine o aldeh nad adp alpha carbonat pr pr phosphor 1 3’ dipeptidy s t nitr 3’ dih t phosphodies me peptidy pr phosphor his cat cat glut cy ser int tr be t coagulation fact coagulation fact e h h hr ibonucleoside tr t ydr − ypt ypsin ,5’ ,5’ er trah ymidine kinase ymidy a ocollagen os oges o o idine phosphor cloo cloo amin d3 er tr t t omely t phosphatidy ombin t t − ydr hepsin b hepsin l a adr a adr t a ine endopeptidase ic ine t t − ine amine type 2 amine ele ty amat amat alloendopeptidase ogen y ein ein kinase − − 1 r hine o o leukin − t hine decarbo ase ogen dir − l y y aglandin t cy cy lcholines t − yde r − xyme err helin ydr lact er o of lat lat − − xy xy t t activating fact tidy sulfat e − glucosidase t ect enone 5alpha l xide synt type d er clic clic nucleoside phosphor lat − olat ol synt tr tyr − enocept enocept ing alky ic monoo − e cy e cy sin 1 genase 1 genase 2 lhomocy e r e r e deh of l dipeptidase a amase opepsin ibosy ic dies one r − 1 ed dna polymerase e synt ltransf ibosy − − xidase educt osine kinase oly − − peptidase iv t ecept ecept ydr h e r 8 lik − nucleo gmp phosphodies ase clase clase y t pr erase i − lpoly xidase a Fig. (5). High resolution pocket-pocket SH shape interaction matrix. The interaction matrix is color coded using darker − blue for a high e lglut t educt ydrat hase − ogenase lgly erase linosit endoper ala oline dio ltransf or viia or xa t diphosphat ase l or ar hase hase or type 1 or erase er h or kainat or nmda subunit y s xy xy − lase ar cinamide f t glut einase tide phosphodies

shape similarity (Tanimoto  0.9), blue for medium similarity (Tanimoto between 0.7 and 0.9) and light blue for low similarityd (Tanimoto < ase lase ase genase y ydr ol 4 − − l erase or − amat ala carbo y r educt l gr coa r o xy olase − 0.7). Both axes are labelled according to the 76 annotations in which the 677 complexesxide synt are distributed. Each annotation is represented by a e subunit kinase genase oups, o e synt e r educt orm ase y educt lase t consensus binding pocket shape. The interaction matrix is clustered by consensus binding pocket shape similarity.erase This also groups together xypeptidase y hase hase ltransf t ase (nadph2) her t ase t related targets. erase han me erase t h

This analysis of the ligand-pocket matrixy mainly agrees consensus pocket (Fig. 10 middle) shares high shape l gr with the correlations found between bindingoups pockets. Fig. similarity with the ligands of lanosterol synthase, purine (10) shows the ligand-pocket superpositions of the same nucleotide phosphorilase, thimidine kinase, cholestenone 5- consensus pockets shown in Fig. (6) with selected examples reductase, and adenosylhomocysteinase. of high similarity ligands. For example, the androgen Unlike the pocket-pocket matrix, the thrombin consensus consensus binding pocket (Fig. 10 top) shares high shape pocket is found to have lower similarity with other serine similarity with the ligands of -adrenoreceptor, purine  protease ligands. The pocket-ligand superpositions look nucleoside phosphorilase, thimidine kinase, hydroxymethyl- worse in this case. This is because MSSH does not represent glutaryl CoA reductase, and adenosylhomocysteinase. accurately the shapes of surface pockets. Fig. (10 bottom) Similarly, the hydroxymethylglutaryl CoA reductase 120 The Open Conference Proceedings Journal, 2011, Volume 2 Pérez-Nueno et al.

Fig. (6). Example superpositions of consensus pocket shapes. This figure shows the consensus pocket shapes of four example targets (androgen, hydroxymethilglutaryl CoA reductase, GABA-A alpha subunit, and thrombin) in lilac, along with the superpositions of the pockets of several other targets with similar consensus pocket shapes (grey). shows the thrombin consensus pocket superposed with the serine protease ligands superpose correctly on the left side of ligands of coagulation factors Xa and VIIIa, -lactamase, the thrombin pocket, while the poor quality surface pocket trypsin, interleukin-8, and serine-type d-ala-d-ala representation on the right remains unmatched. Finally, in carboxypeptidase. As can be seen, the thrombin SH pocket the pocket-pocket correlation matrix, the GABA-A alpha is shape is bigger than the shapes of these ligands. Hence, only predicted to be a highly promiscuous target. The GABA-A the large trypsin ligands match well this large surface pocket alpha consensus pocket shows high shape similarity with the shape. Nevertheless, it can be observed in Fig. (10) that the ligands of -adrenoreceptor, purine nucleoside phosphori- Predicting Drug Promiscuity Using Spherical Harmonic Surface The Open Conference Proceedings Journal, 2011, Volume 2 121

Fig. (7). High resolution zoom of the top right corner (red box) of the ligand-pocket SH shape interaction matrix at a Tanimoto threshold of 0.7. The interaction matrix is colored in dark blue if there is SH similarity between ligand and binding pocket shapes over a Tanimoto of 0.7 and a PDB complex exists, blue if there is SH similarity between ligand and binding pocket shapes over a Tanimoto of 0.7 and a PDB complex does not exist, and light blue if there is no SH similarity between ligand and binding pocket shapes over a Tanimoto of 0.7. Both axes are labelled according to the MDDR annotations in which the complexes are distributed. 122 The Open Conference Proceedings Journal, 2011, Volume 2 Pérez-Nueno et al.

Fig. (8). High resolution zoom of the top right corner (red box) of the ligand-pocket SH shape interaction matrix at a Tanimoto threshold of 0.9. The interaction matrix is colored in dark blue if there is SH similarity between ligand and binding pocket shapes over a Tanimoto of 0.9 and a PDB complex exists, blue if there is SH similarity between ligand and binding pocket shapes over a Tanimoto of 0.9 and a PDB complex does not exist, and light blue if there is no SH similarity between ligand and binding pocket shapes over a Tanimoto of 0.9. Both axes are labelled according to the MDDR annotations in which the complexes are distributed. It can be observed that a high threshold value highlights targets which are predicted to be promiscuous (i.e. those columns with a high number of dark blue pixels). Predicting Drug Promiscuity Using Spherical Harmonic Surface The Open Conference Proceedings Journal, 2011, Volume 2 123

RS1 ED2 ED4 GD9 RIV D56 NNI NND NBV RO4 356 A00 PZF 230 47D 417 HBQ 277 SY1 MYF 565 872 PO1 DH1 DT8 FR6 T50 COX BPR PNX TEP TIO 198 62P 3NA AEJ 4TR 4TZ 4PP FIS FRL TAZ 7IN ARR FR7 HA1 BI8 VDN CIA 338 667 MIN H20 H12 SYM BMV JAN 778 IMN 5AP MKC GW5 100 BPG BFI CEL COP 336 486 TIN A3M AB1 DMP U49 OAP MEL IGN LYA SBN CP6 PNU STI ZES RIT D16 G26 478 TBO C20 R01 ZPR HUP ZST 7NI NVP EFZ SU1 HTA DX9 E20 IBP S11 BFU ILO NRG IPU LY2 MID DHT PVB ETS STL L75 ROC BZB BA1 HBY MNO BM9 BM2 MTX GW3 TOL TES ADC STR EST .i e .a .1 .2 .1 .3 .1 .2 .a .1 .iv eta .a2 .xa ase ase unit ase ase ase ase ase unit unit .viia ase actor ypsin ylase ylase atase atase er er er xylase yptase xidase tr .type .type .type .type .kinase .kinase leukin.8 tr .sub actor .cyclase .cyclase .o xidase ydrolase ydr thrombin estrogen actor yl.groups androgen ..nadph2. histamine .synthase xygenase .synthase xygenase .synthase peptidase .synthase ansf ansf ansf caspase .o endothelin .h .reductase .reductase .reductase cathepsin.l ating.f yl.sulf cathepsin.b xygenase xygenase .deaminase ydrogenase xypeptidase inter ylate ylate yltr .deh stromelysin.1 progesterone .dio xide xide protein.kinase yde vitamin.d3.lik .nmda.sub m ymidine ster .decarbo .phosphor beta.lactamase .phosphor .endopeptidase olate .kainate histamine hiv.1.retropepsin th or xanthine ibosyltr

adenosine adenosine ic.o phospholipase cycloo cycloo aden guan .5alpha.reductase .f alpha.glucosidase beta.adrenoceptor ine .r ymidylate ic.diester lanosterol.synthase aldeh .phosphodiester .phosphodiester idine .than.meth phosphodiester acetylcholinester nitr ydrof ucleotidyltr coagulation.f gaba.a.alpha.sub th ser nithine peptidyl.dipeptidase coagulation.f n ur monoamine adenosine metalloendopeptidase dipeptidyl.peptidase protein.tyrosine carbonate .receptor alcohol.deh dih .diphosphate platelet.activ or .receptor ucleoside beta.adrenoceptor adenosylhomocysteinase yl.coa.reductase ..other nad.adp .n unspecific.monoo olylpolyglutamate na.directed.dna.polymer r ucleotide phosphor .d.ala.d.ala.carbo ine cholestenone X1.phosphatidylinositol.4.kinase ydrof procollagen.proline ucleoside pur glutamate ylglutar .type glutamate ah yl.groups ibosylglycinamide X3..5..cyclic.gmp ibon ine prostaglandin.endopero tetr r .ar ser xymeth

X3..5..cyclic.n

ydro ! phosphor h

ing.alkyl.or err Tanimoto shape similarity score ansf

! tr Tanimoto ! 0.9 Dark blue 0.7 " Tanimoto < 0.9 Blue ! Tanimoto < 0.7 Light blue

Fig. (9). High resolution in silico promiscuity prediction matrix. The in silico ligand-pocket interaction matrix is shown for the 127 ligands for which biological activities are available in BindingDB. The matrix is color coded using dark blue for a high shape similarity Supplementary Fig. (5) High resolution in silico promiscuity prediction matrix. The in silico ligand-pocket (Tanimoto  0.9), blue for medium similarity (Tanimoto between 0.7 and 0.9) and light blue for low similarity (Tanimoto < 0.7). The horizontalinterac axistion is m labelledatrix is according shown tofo ther th 76e 127annotations ligand ins whichfor w htheic h677 bi complexesological a arecti vidistributed,ties are avandai thelab verticalle in Bi axisnd isnhowsgDB the. T 3-letterhe code for the ligands used in the promiscuity predictions. The targets with high shape similarity amongst all the ligands (i.e. columns with a highma numbertrix is of c odarklor pixels)coded are us predicteding dark to bl beue promiscuous. for a high shape similarity (Tanimoto ! 0.9), blue for medium similarity

(Tanimoto between 0.7 and 0.9) and light blue for low similarity (Tanimoto < 0.7).!The horizontal axis is labelled

according to the 76 annotations in which the 677 complexes are distributed, and the vertical axis shows the 3-letter

code for the ligands used in the promiscuity predictions. The targets with high shape similarity along all the

ligands (dark blue columns) are predicted to be promiscuous. 124 The Open Conference Proceedings Journal, 2011, Volume 2 Pérez-Nueno et al.

Fig. (10). Example superpositions of consensus pocket shapes with selected ligands. This figure shows the same consensus pockets as in Fig. (6) (lilac), with high similarity ligands shown as multi-coloured surfaces (red/green/blue/yellow). lase, thimidine kinase, hydroxymethylglutaryl CoA silico ligand-pocket interaction matrix is color coded as in reductase, cholestenone 5-reductase, adenosylhomocy- Fig. (3a and 3b). Fig. (9) shows this in silico matrix in detail. steinase, lanosterolsynthase, vitamin d3-like receptors, As before, a high similarity threshold helps to highlight the estrogen, androgen, adenosindeaminase, acetylcholine- possible promiscuous targets. Hence, we use a Tanimoto sterase, and RNA directed DNA polymerase. Again, the threshold of 0.9 to predict promiscuous targets (dark blue promiscuity predicted for all these targets agrees well with columns in Fig. 9) and we compare these predictions with the existing MDDR activity classes for their ligands. promiscuity evidence from in vitro results. Supplementary Table 2 lists the predicted promiscuous targets as well as Comparison with Experimental Results several MDDR activity classes related to their ligands. The In order to validate our approach, we compared our in predicted promiscuous targets agree with the existing silico interaction matrices with biological activity data MDDR activity classes for their ligands except for vitamin extracted from BindingDB [40]. Fig. (3d) compares our in d3-like receptors and thimidine kinase, which are only found silico results with the experimental results for the 127 to be related to vitamin 3D analog class and thymidine ligands for which data is available in BindingDB. The in kinase inhibitor, respectively. Fig. (11) shows in detail the Predicting Drug Promiscuity Using Spherical Harmonic Surface The Open Conference Proceedings Journal, 2011, Volume 2 125

RS1 ED2 ED4 GD9 RIV D56 NNI NND NBV RO4 356 A00 PZF 230 47D 417 HBQ 277 SY1 MYF 565 872 PO1 DH1 DT8 FR6 T50 COX BPR PNX TEP TIO 198 62P 3NA AEJ 4TR 4TZ 4PP FIS FRL TAZ 7IN ARR FR7 HA1 BI8 VDN CIA 338 667 MIN H20 H12 SYM BMV JAN 778 IMN 5AP MKC GW5 100 BPG BFI CEL COP 336 486 TIN A3M AB1 DMP U49 OAP MEL IGN LYA SBN CP6 PNU STI ZES RIT D16 G26 478 TBO C20 R01 ZPR HUP ZST 7NI NVP EFZ SU1 HTA DX9 E20 IBP S11 BFU ILO NRG IPU LY2 MID DHT PVB ETS STL L75 ROC BZB BA1 HBY MNO BM9 BM2 MTX GW3 TOL TES ADC STR EST .i e .a .1 .2 .1 .3 .1 .2 .a .1 .iv eta .a2 .xa ase ase unit ase ase ase ase ase unit unit .viia ase actor ypsin ylase ylase atase atase er er er xylase yptase xidase tr .type .type .type .type .kinase .kinase leukin.8 tr .sub actor .cyclase .cyclase .o xidase ydrolase ydr thrombin estrogen actor yl.groups androgen ..nadph2. histamine .synthase xygenase .synthase xygenase .synthase peptidase .synthase ansf ansf ansf caspase .o endothelin .h .reductase .reductase .reductase cathepsin.l ating.f yl.sulf cathepsin.b xygenase xygenase .deaminase ydrogenase xypeptidase inter ylate ylate yltr .deh stromelysin.1 progesterone .dio xide xide protein.kinase yde vitamin.d3.lik .nmda.sub m ymidine ster .decarbo .phosphor beta.lactamase .phosphor .endopeptidase olate .kainate histamine hiv.1.retropepsin th or xanthine ibosyltr adenosine adenosine ic.o phospholipase cycloo cycloo aden guan .5alpha.reductase .f alpha.glucosidase beta.adrenoceptor ine .r ymidylate ic.diester lanosterol.synthase aldeh .phosphodiester .phosphodiester idine .than.meth phosphodiester acetylcholinester nitr ydrof ucleotidyltr coagulation.f gaba.a.alpha.sub th ser nithine peptidyl.dipeptidase coagulation.f n ur monoamine adenosine metalloendopeptidase dipeptidyl.peptidase protein.tyrosine carbonate .receptor alcohol.deh dih .diphosphate platelet.activ or .receptor ucleoside beta.adrenoceptor adenosylhomocysteinase

yl.coa.reductase ..other nad.adp .n unspecific.monoo olylpolyglutamate na.directed.dna.polymer r ucleotide phosphor .d.ala.d.ala.carbo ine cholestenone X1.phosphatidylinositol.4.kinase ydrof procollagen.proline ucleoside pur glutamate ylglutar

.type glutamate ah yl.groups ibosylglycinamide X3..5..cyclic.gmp ibon ine prostaglandin.endopero tetr r .ar ser xymeth

X3..5..cyclic.n ! ydro phosphor h

ing.alkyl.or err

! ansf 0 < IC50/nM ! 1 Dark blue tr 1 < IC50/nM ! 10 Blue ! IC50/nM > 10 Light blue

Fig. (11). High resolution in vitro promiscuity prediction matrix. The in vitro interaction matrix is color coded according to the biological activity values, using dark blue for high activity (IC50/nM  1), blue for medium activity (1 < IC50/nM  10) and light blue for low activity Supplementary Fig. (6) High resolution in vitro promiscuity prediction matrix. The in vitro interaction matrix (IC50/nM > 10). The horizontal axis is labelled according to the 76 annotations in which the 677 complexes are distributed, and the vertical axis shows the 3-letter code for the ligands used in the promiscuity predictions. is color coded according to the biological activity values, using dark blue for high activity (IC50/nM ! 1), blue for

medium activity (1 < IC50/nM ! 10) and light blue for low activity (IC50/nM > 10). The horizontal axis is

labelled according to the 76 annotations in which the 677 complexes are distributed, and the vertical axis shows

the 3-letter code for the ligands used in the promiscuity predictions.! 126 The Open Conference Proceedings Journal, 2011, Volume 2 Pérez-Nueno et al.

RS1 ED2 ED4 GD9 RIV D56 NNI NND NBV RO4 356 A00 PZF 230 47D 417 HBQ 277 SY1 MYF 565 872 PO1 DH1 DT8 FR6 T50 COX BPR PNX TEP TIO 198 62P 3NA AEJ 4TR 4TZ 4PP FIS FRL TAZ 7IN ARR FR7 HA1 BI8 VDN CIA 338 667 MIN H20 H12 SYM BMV JAN 778 IMN 5AP MKC GW5 100 BPG BFI CEL COP 336 486 TIN A3M AB1 DMP U49 OAP MEL IGN LYA SBN CP6 PNU STI ZES RIT D16 G26 478 TBO C20 R01 ZPR HUP ZST 7NI NVP EFZ SU1 HTA DX9 E20 IBP S11 BFU ILO NRG IPU LY2 MID DHT PVB ETS STL L75 ROC BZB BA1 HBY MNO BM9 BM2 MTX GW3 TOL TES ADC STR EST .i e .a .1 .1 .3 .2 .1 .2 .a .1 .iv eta .a2 .xa ase ase unit ase ase ase ase ase unit unit .viia ase actor ypsin ylase ylase atase atase er er er xylase yptase xidase tr .type .type .type .type .kinase .kinase leukin.8 tr .sub actor .cyclase .cyclase .o xidase ydrolase ydr thrombin estrogen actor yl.groups androgen ..nadph2. histamine .synthase xygenase .synthase xygenase .synthase peptidase .synthase ansf ansf ansf caspase .o endothelin .h .reductase .reductase .reductase cathepsin.l ating.f yl.sulf cathepsin.b xygenase xygenase .deaminase ydrogenase xypeptidase inter ylate ylate yltr .deh stromelysin.1 progesterone .dio xide xide protein.kinase yde vitamin.d3.lik .nmda.sub m ymidine ster .decarbo .phosphor beta.lactamase .phosphor .endopeptidase olate .kainate histamine hiv.1.retropepsin th or xanthine ibosyltr adenosine adenosine ic.o phospholipase cycloo cycloo aden guan .5alpha.reductase .f alpha.glucosidase beta.adrenoceptor ine .r ymidylate ic.diester lanosterol.synthase aldeh .phosphodiester .phosphodiester idine .than.meth phosphodiester acetylcholinester nitr ydrof ucleotidyltr coagulation.f gaba.a.alpha.sub th

ser nithine peptidyl.dipeptidase coagulation.f n ur monoamine adenosine metalloendopeptidase dipeptidyl.peptidase protein.tyrosine carbonate .receptor alcohol.deh dih .diphosphate platelet.activ or .receptor ucleoside beta.adrenoceptor adenosylhomocysteinase yl.coa.reductase ..other nad.adp .n unspecific.monoo olylpolyglutamate na.directed.dna.polymer r ucleotide phosphor .d.ala.d.ala.carbo ine cholestenone X1.phosphatidylinositol.4.kinase ydrof procollagen.proline ucleoside pur glutamate ylglutar .type glutamate ah yl.groups ibosylglycinamide X3..5..cyclic.gmp ibon ine prostaglandin.endopero tetr r .ar ser xymeth X3..5..cyclic.n

ydro ! phosphor 0 < IC50/nM ! 1 Dark blue h ing.alkyl.or

err 1 < IC50/nM ! 10 Blue ansf ! tr IC50/nM > 10 Light blue High similarity found (In silico

! predicted promiscuous targets) Light steel blue bands

Supplementary Fig. (7) High resolution in silico vs in vitro promiscuity prediction matrix. The in vitro vs in Fig. (12).sili cHigho ma resolutiontrix is col oinr silicocoded vs u insin vitrog da promiscuityrk blue whe predictionn there is evmatrix.idence The of inhi ghvitro bi vsol ogiin silicocal a cmatrixtivity is(I colorC50/ nMcoded ! 1using), bl udarke blue when there is evidence of high biological activity (IC50/nM  1), blue when there is evidence of medium biological activity (1 < IC50/nM  10), andwh lighten th blueere iwhens evid thereence iso fevidence medium of b ilowolo gbiologicalical activ iactivityty (1 < (IC50/nMIC50/nM > ! 10).10) ,The and in li gsilicoht bl upredictede when ttargetshere i s(Tanimoto evidence similarityof score  0.9) are highlighted as light blue bands. Analyzing the matches it can be observed that, with the available biological data, in vitro evidencelow of bpromiscuousiological a ctargetstivity often(IC5 0agrees/nM >with 10 in). silicoThe inpredicted silico promiscuouspredicted ta targets.rgets ( THowever,animoto there sim iisla norit yexperimental score " 0. data9) a rfore some targets that are predicted to be promiscuous, and there are some targets (mainly proteins with binding pockets on the surface which MSSH does nothi ghlrepresentighted accurately) as light s twhicheel bl areue banotnds predicted. Anal ytoz ibeng promiscuous the matche ands it therecan bise experimentalobserved th evidenceat, with ofth eit. a vailable biological data, in vitro evidence of promiscuous targets often agrees with in silico predicted promiscuous targets. However,

there is no experimental data for some targets that are predicted to be promiscuous, and there are some targets

(mainly proteins with binding pockets on the surface which MSSH does not represent accurately) which are not

predicted to be promiscuous and there is experimental evidence of it.! Predicting Drug Promiscuity Using Spherical Harmonic Surface The Open Conference Proceedings Journal, 2011, Volume 2 127 in vitro matrix. In both matrices, the horizontal axis is to small differences in residues surrounding the site, or labelled according to the 76 annotations in which the 677 existence of a flexible lid to enable the to complexes are distributed and the vertical axis shows the 3- accommodate a broad range of ligand sizes with good letter ligand code used in the promiscuity predictions. It can potency [52]). be seen that the available experimental data from BindingDB As shown in previous studies [23, 51], there are cases supports the notion that many existing targets are where targets highly related by sequence are unrelated by promiscuous (e.g. thrombin, coagulation factor Xa, dipeptidil their ligands, and cases where receptors unrelated by peptidase IV, aldehyde reductase, dihydrofolate reductase, sequence are highly related by their ligands. Our approach steryl sulfatase, cyclooxigenase 2, RNA directed DNA has the advantage to combine information from both the polymerase, hiv retropepsin). It also shows more general ligand and binding pocket spaces. By mapping small- annotations which are expected to involve multiple targets molecule shape space to protein binding pocket shape space, (e.g. transferring alkyl groups, peptidase, metalloendo- we are able to identify groups of receptors that can be peptidase, or unspecific monooxy- unrelated by sequence and structure but which have ligands genase). with common shapes. In this way, previously unknown The experimental evidence of promiscuous targets agrees cross-interactions can be detected. We have shown here that well with several MDDR activity classes existent for their shape clustering can help to identify off-target relationships. ligands. Supplementary Table 3 shows the promiscuous Cross-shape matching can be used as a first approach to targets identified by the experimental data and the various identify promiscuous ligands and targets, and this should MDDR activity classes related to their ligands. save time and costs compared to using standard functional assays. This should also facilitate the search for novel targets Fig. (12) shows the comparison between the in silico and of marketed drugs. in vitro promiscuity predictions. The in silico predicted targets (Tanimoto score  0.9) are highlighted as light blue The work presented here has focused on a retrospective bands over the in vitro experimental data. Comparing the study of known MDDR drugs. We are currently working to matches with the available experimental data, in vitro determine a better similarity threshold and to calculate a evidence of promiscuous targets often agrees with in silico more rigorous interaction probability. predicted promiscuous targets (e.g. GABA-A alpha subunit, androgen, estrogen, acetylcholinesterase, RNA directed CONCLUSION DNA polymerase). However, there are cases where We have presented a 3D shape-based approach for experimental data is not available for some targets that are predicting drug promiscuity by correlating both ligand and predicted to be promiscuous (e.g. procollagen proline binding pocket SH shapes. The method has been validated dioxygenase, cholestenone 5 reductase, vitamin d3-like using a subset of the MDDR for which experimental receptors, ribonucleoside diphosphate reductase), and other information is available and has been demonstrated to be cases where targets are predicted not to be promiscuous but effective in identifying related targets which are known to where there is experimental evidence of promiscuity (e.g. have related MDDR activity classes. adenosine deaminase, metalloendopeptidase, hiv 1 retropepsin, coagulation factor Xa, thrombin, dipeptidil When assessing similarity between two targets, the peptidase IV, aldehyde reductase, peptidase). This later advantage of examining ligand-pocket shape similarity group mainly consists of proteins with surface pockets which compared to protein sequence is the ability to identify targets are not represented well in MSSH. which may have different folds but which have unexpectedly similar binding sites. Normally, polypharmacology DISCUSSION prediction methods operate in ligand space or protein space. Previous computational studies to predict pharmaco- Comparing ligands with consensus pocket shapes leads to logical profiles have used the similarity of chemical ligand interesting promiscuity predictions which are often structures, protein sequences, pharmacophoric binding consistent with known crystallographic examples of pockets, or phenotypic side-effects to infer whether two promiscuous ligands and protein targets. Our results show drugs share a target. Our novel approach relates targets by that the performance in ligand space is comparable to that in SH shape similarity in both the ligand and binding pocket binding pocket space, which provides supporting evidence spaces. This allows promiscuous ligands and targets to be for the pocket-based predictions. Moreover, the comparison predicted using an explicit shape-based representation. Since of our in silico promiscuity predictions with the available in 3D shape complementarity is essential for molecular vitro results from Binding DB shows a similar agreement. recognition, it is perhaps not surprising that our 3D shape- Overall, we have presented a new protocol to detect based approach can give very good promiscuity predictions. promiscuous ligands and targets and we have validated it We have shown some specific examples in which our using experimental information. Our results indicate that method can detect a promiscuous target, such as androgen, promiscuous ligands and targets are more common than GABA-A alpha subunit, hydroxymethylglutaryl CoA previously assumed. Detecting and quantifying the reductase, and thrombin. The ligands found to be similarities between target families will help the promiscuous are often small and hydrophobic, as observed identification and exploitation of possible promiscuous previously [25]. Moreover, the binding pockets for our targets. predicted promiscuous targets are consistent with the general requirements for promiscuity (i.e. large hydrophobic binding CONFLICT OF INTEREST sites, evidence of alternative binding modes for the same Non declared. ligand at the same site, sensitivity of the exact binding mode 128 The Open Conference Proceedings Journal, 2011, Volume 2 Pérez-Nueno et al.

ACKNOWLEDGEMENTS [24] Vidal, D.; Mestres, J. In Silico Receptorome Screening of Antipsychotic Drugs. Mol. Inf., 2010, 29, 543-551. The authors thank Cepos Insilico Ltd. for providing an [25] Mestres, J.; Gregori-Puigjané, E.; Valverde, S.; Solé, R.V. The Academic Licence for PARASURF. VIPN is grateful for a topology of drug-target interaction networks: implicit dependence Marie Curie IEF Fellowship, grant reference DOVSA on drug properties and target families. Mol. BioSyst., 2009, 5, 1051-1057. 254128. [26] Henrich, S.; Salo-Ahen, O.M.H.; Huang, B.; Rippmann, F.F.; Cruciani, G.; Wade, R.C. Computational approaches to identifying SUPPLEMENTARY MATERIAL and characterizing protein binding sites for ligand design. J. Mol. Supplementary material is available on the publishers Web Recog., 2010, 23, 209-219. [27] Sciabola, S.; Stanton, R.V.; Mills, J.E.; Flocco, M.M.; Baroni, M.; site along with the published article. Cruciani, G.; Perruccio, F.; Mason, J.S. High-Throughput Virtual Screening of Proteins Using GRID Molecular Interaction Fields. J. REFERENCES Chem. Inf. Model., 2010, 50, 155-169. [1] Xie, L.; Xie, L.; Bourne, P.E. Structure-based systems biology for [28] Schmitt, S.; Kuhn, D.; Klebe, G.A. New Method to Detect Related analyzing off-target binding. Curr. Opin. Struct. Biol., 2011, 21, 1- Function Among Proteins Independent of Sequence and Fold 11. Homol- ogy. J. Mol. Biol., 2002, 323, 387-406. [2] Wermuth, C.G. Selective optimization of side activities: another [29] Shulman-Peleg, A.; Nussinov, R.; Wolfson, H.J. SiteEngines: way for drug discovery. J. Med. Chem., 2004, 47, 1303-1314. recognition and comparison of binding sites and protein-protein [3] Keiser, M.J; Irwin, J.J.; Shoichet, B.K. The Chemical Basis of interfaces. Nucleic Acids Res., 2005, 33, W337-W341. Pharmacology. Biochemistry, 2010, 49, 10267-10276. [30] Jambon, M.; Imberty, A.; Delage, G. Geourjon, C.A new [4] Azzaoui, K. et al. Modeling promiscuity based on in vitro safety bioinformatic approach to detect common 3D sites in protein pharmacology profiling data. Chem. Med. Chem., 2007, 2, 874-880. structures. Proteins: Struct., Funct., Genet., 2003, 52, 137-145. [5] Merlot, C. In silico methods for early toxicity assessment. Curr. [31] Kinnings, S.L.; Jackson, R.M. Binding Site Similarity Analysis for Opin. Drug Discov. Devel., 2008, 11, 80-85. the Functional Classification of the Protein Kinase Family. J. [6] Campillos, M.; Kuhn, M.; Gavin, A.C.; Jensen, L.J; Bork, P. Drug Chem. Inf. Model., 2009, 49, 318-329. target identification using side-effect similarity. Science, 2008, 321, [32] Najmanovich, R.; Kurbatova, N.; Thornton, J. Detection of 3D 263-266. atomic similarities and their use in the discrimination of small- [7] Fedorov, O. et al. A systematic interaction map of validated kinase molecule protein-binding sites. Bioinformatics, 2008, 24, i105- inhibitors with Ser/Thr kinases. Proc. Natl. Acad. Sci. USA, 2007, i111. 104, 20523-20528. [33] Belongie, S.; Malik, J.; Puzicha, J. Shape Matching and Object [8] Trubetskoy, O.V. et al. High throughput screening assay for UDP- Recognition Using Shape Contexts. IEEE Trans. Pattern Anal. glucuronosyltransferase 1A1 glucuronidation profiling. Assay Drug Mach. Intell., 2002, 24, 509-522. Dev. Technol., 2007, 5, 343-354. [34] Richmond, N.J.; Willett, P.; Clark, R.D. Alignment of three- [9] Nobeli, I.; Favia, A.D.; Thornton, J.M. Protein promiscuity and its dimensional molecules using an image recognition algorithm. J. applications for biotechnology. Nat. Biotechnol., 2009, 27, 157- Mol. Graphics Model., 2004, 23, 199-209. 167. [35] Lin, J.; Clark, T. An analytical, variable resolution, complete [10] Chong, C.R.; Sullivan, D.J. New uses for old drugs. Nature, 2007, description of static molecules and their intermolecular binding 448, 645-646. properties. J. Chem. Inf. Model., 2005, 45, 1010-1016. [11] Nigsch, F.; Bender, A.; Jenkins, J.L.; Mitchell, J.B.O. Ligand- [36] Ritchie, D.W.; Kemp, G.J.L. Protein Docking Using Spherical target prediction using Winnow and naive Bayesian algorithms and Polar Fourier Correlations. Proteins: Struct. Func. Genet., 2000, the implications of overall performance statistics. J. Chem. Inf. 39, 178-194. Model., 2008, 48, 2313-2325. [37] Schuffenhauer, A.; Zimmermann, J.; Stoop, R.; van der Vyver, J-J.; [12] Niijima, S.; Yabuuchi, H.; Okuno, Y. Cross-Target View to Feature Lecchini, L.; Jacoby, E. An ontology for pharmaceutical ligands Selection: Identification of Molecular Interaction Features in and its application for in silico screening and library design. J. LigandTarget Space. J. Chem. Inf. Model., 2011, 51, 15-24. Chem. Inf. Comput. Sci., 2002, 42, 947-955. [13] Paolini, G.V.; Shapland, R.H.B.; van Hoorn, W.P.; Mason, J.S.; [38] MDL Drug Data Report, 2010.2 (MDL Informations Systems Inc., Hopkins, A.L. Global mapping of pharmacological space. Nat. San Leandro, CA, 2010). Biotechnol., 2006, 24, 805-815. [39] Berman, H.M.;Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T.N.; [14] Weskamp, N.; Hüllermeier, E; Klebe, G. Merging chemical and Weissing, H.; Shindyalov, I.N.; Bourne, P.E. The protein data biological space: Structural mapping of enzyme binding pocket bank. Nucleic Acids Res., 2000, 28, 235-242. space. Proteins, 2009, 76, 317-330. [40] Liu, T.; Lin, Y.; Wen, X.; Jorrisen, R.N.; Gilson, M.K. Binding [15] Keiser, M. J. et al. Predicting new molecular targets for known DB: a web-accessible database of experimentally determined drugs. Nature, 2009, 462, 175-181. protein-ligand binding affinities. Nucleic Acids Res., 2007, 35, [16] Milletti, F.; Vulpetti, A. Predicting Polypharmacology by Binding D198-D201. Site Similarity: From Kinases to the Protein Universe. J. Chem. Inf. [41] CEPOS InSilico Ltd.: Erlangen, Germany, 2009. Model., 2010, 50, 1418-143. http://www.ceposinsilico.de/ (accessed: June 24, 2009). [17] Altschul, S.A.; Gish, W.; Miller, W.; Myers, E.W.; Lipman, D.J. [42] Beautrait, A.; Leroux, V.; Chavent, M.; Ghemtio, L.; Devignes, Basic local alignment search tool. J. Mol. Biol., 1990, 215, 403- M.-D.; Smaïl-Tabbone, M.; Cai, W.; Shao, W.; Moreau, G.; 410. Bladon, P.; Yao, Maigret, B. J. Multiple-step virtual screening [18] Altschul, S.A.; Madden, T.L.; Scha ffer, A.A.; Zhang, J.; Zhang, using VSM-G: overview and validation of fast geometrical Z.; Miller, W.; Lipman, D.J. Gapped BLAST and PSI-BLAST: a matching enrichment. J. Mol. Model., 2008, 14, 135-148. new generation of protein database search programs. Nucleic Acids [43] Mavridis, L.; Hudson, B.D.; Ritchie, D.W. Toward High Res., 1997, 25, 3389-3402. Throughput 3D Virtual Screening Using Spherical Harmonic [19] Pearson, W.R. Rapid and sensitive sequence comparison with fastp Surface Representations. J. Chem. Inf. Model., 2007, 47, 1787- and fasta. Methods Enzymol., 1990, 183, 63-98. 1796. [20] Arnold, R.; Rattei, T.; Tischler, P.; Truong, M.D.; Stumpflen, V.; [44] Pérez-Nueno, V.I.; Ritchie, D.W.; Borrell, J.I.; Teixidó, J. Mewes, W. Simap-the similarity matrix of proteins. Bioinformatics, Clustering and classifying diverse HIV entry inhibitors using a 2005, 21, ii42-ii46. novel consensus shape based virtual screening approach: Further [21] Johnson, M.A.; Maggiora, G.M. Concepts and applications of evidence for multiple binding sites within the CCR5 extracellular molecular similarity; Wiley & Sons: New York, 1990. pocket. J. Chem. Inf. Model., 2008 , 48, 2146-2165. [22] Keiser, M.J. et al. Relating protein pharmacology by ligand [45] Pérez-Nueno, V.I.; Ritchie, D.W.; Rabal, O.; Pascual, R.; Borrell, chemistry. Nature Biotechnol., 2007, 25, 197-206. J.I.; Teixidó, J. Comparison of ligand-based and receptor-based [23] Hert, J.; Keiser, M.J.; Irwin, J.J.; Oprea, T.I.; Shoichet, B.K. virtual screening of HIV entry inhibitors for the CXCR4 and CCR5 Quantifying the relationships among drug classes. J. Chem. Inf. receptors using 3D ligand shape-matching and ligand-receptor Model., 2008, 48, 755-765. docking. J. Chem. Inf. Model., 2008, 48, 509-533. Predicting Drug Promiscuity Using Spherical Harmonic Surface The Open Conference Proceedings Journal, 2011, Volume 2 129

[46] Venkatraman, V.; Pérez-Nueno, V.I.; Mavridis, L.; Ritchie, D. W. 1992: Recommendations of the Nomenclature Committee of the A comprehensive comparison of ligand-based virtual screening International Union Of Biochemistry and Molecular Biology on the reveals limitations of current 3D methods tools against the DUD Nomenclature and Classification of Enzymes; Academic Press: San dataset. J. Chem. Inf. Model., 2010, 50, 2079-2093. Diego, 1992. [47] Perez-Nueno, V.I.; Ritchie, D.W. Using consensus-shape clustering [50] Ritchie, D.W.; Kemp, G.J.L. Fast computation, rotation, and to identify promiscuous ligands and protein targets and to choose comparison of low resolution spherical harmonic molecular the right query for shape-based virtual screening. J. Chem. Inf. surfaces. J. Comput. Chem., 1999, 20, 383-395. Model., 2011, 51, 1233-1248. [51] Adams, J.C.; Keiser, M.J.; Basuino, L.; Chambers, H.F.; Lee, D.S.; [48] Perez-Nueno, V.I.; Venkatraman, V.; Mavridis, L.; Clark, T; Wiest, O.G.; Babbitt, P.C. A mapping of drug space from the Ritchie D.W. Using spherical harmonic surface property viewpoint of small molecule . PLoS Comput. Biol., representations for ligand-based virtual screening. Mol. Inf., 2010, 2009, 5, e1000474, 1-12. 30, 151-159. [52] Hopkins, A.; Mason, J.S.; Overington, J.P. Can we rationally [49] International Union of Biochemistry and Molecular Biology, design promiscuous drugs? Curr. Opin. Struct. Biol., 2006, 16, Nomenclature Committee & Webb, E.C. Enzyme Nomenclature 127-136.

Received: June 17, 2011 Revised: July 28, 2011 Accepted: August 05, 2011

© Pérez-Nueno et al.; Licensee Bentham Open.

This is an open access article licensed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted, non-commercial use, distribution and reproduction in any medium, provided the work is properly cited.