Development and evaluation of in silico toxicity screening panels

Will Krawszik1, Maja Aleksic2, Paul Russell2, Jonathan G.L. Mullins3

1 – Moleculomics In Silico Discovery Inc (Canada), 500 Boulevard Cartier Ouest, Bureau 115, Laval, Quebec, H7V 5B7, Canada
2 – Safety and Environmental Assurance Centre, Unilever, Colworth Science Park, Sharnbrook, Bedfordshire, MK44 1LQ, UK
3 – Moleculomics Ltd (UK), Institute of Life Sciences, Swansea University Medical School, Singleton, Swansea, SA2 8PP, UK

Contact email: [email protected]
Contact telephone: 514-701-2771

Background - Modelling receptor interactions is of significant interest to the scientific community, with many computational tools available. However, current tools are designed for the prediction of on-target effects and are widely used in the pharmaceutical industry, where compounds are routinely screened for binding affinity to only a single receptor of interest.

Approach - An off-target screening approach was adopted, screening compounds of interest against the structures of 44 receptors known to be associated with toxic or adverse reactions. This work involved extensive testing, comparison and cross-referencing of at least 3 independent docking methods against in vitro results. A fundamental feature of this work was establishing “hit thresholds” for both known toxins and FDA approved compounds to provide a useful reference for compounds of unknown toxicity.

Stage 1: Structural modelling and validation
Stage 2: In silico screening of “blind test” compounds
Stage 3: Continued validation / refinement of the approach

Figures: high throughput docking locations of the validation data sets and specific docking orientations of the “blind test” compounds. Panels: Cholecystokinin A receptor (CCKAR); 5-HT3 (HTR3A); CB2 (CNR2). Legend: heavy atoms in the receptor, indicated in silver; dockings extracted from training data sets, indicated by blue circles; dockings of “low interactivity” and “interactivity” ligands, indicated in yellow; and dockings of “high interactivity” ligands, indicated by red ‘x’s.

Predicted interaction scores are in the range [0, 1]. Results have been normalised such that >0.5 is the criterion for a predicted hit, with a true positive rate of 90% and a true negative rate of 30-40% (estimated). To aid analysis of these results, a colour coding system has been applied whereby blue denotes a score <0.5, and therefore a miss. Hits are indicated by a score >0.5, shaded with increasing intensity of red towards a score of 1.0 (full confidence hit). This colour coding enables the user to readily identify trends within the results. Data sets: ToxCast compounds; FDA Approved; DrugBank “experimental”.

In summary, the project developed a workflow capable of reliable prediction, at >90% accuracy, of protein-ligand hits when compared with recorded in vitro interactions detailed in DrugBank and ToxCast. This is the first successful implementation of an in silico panel for pharmacological profiling and is a highly promising development.

Targets to provide an early assessment of the potential hazard of a compound or chemical series, as recommended by Bowes et al (2012), Nature Reviews Drug Discovery, 11: 909-922:

G protein-coupled receptors
Adenosine receptor A2A (ADORA2A)
α1A-adrenergic receptor (ADRA1A)
α2A-adrenergic receptor (ADRA2A)
β1-adrenergic receptor (ADRB1)
β2-adrenergic receptor (ADRB2)‡
Cannabinoid receptor CB1 (CNR1)
Cannabinoid receptor CB2 (CNR2)
Cholecystokinin A receptor (CCKAR)
Dopamine receptor D1 (DRD1)‡
Dopamine receptor D2 (DRD2)‡
Endothelin receptor A (EDNRA)
Histamine H1 receptor (HRH1)‡
Histamine H2 receptor (HRH2)
δ-type opioid receptor (OPRD1)
κ-type opioid receptor (OPRK1)‡
μ-type opioid receptor (OPRM1)‡
Muscarinic acetylcholine receptor M1 (CHRM1)
Muscarinic acetylcholine receptor M2 (CHRM2)‡
Muscarinic acetylcholine receptor M3 (CHRM3)
5-HT1A (HTR1A)
5-HT1B (HTR1B)
5-HT2A (HTR2A)‡
5-HT2B (HTR2B)
Vasopressin V1A receptor (AVPR1A)

Ion channels
Acetylcholine receptor subunit α1 or α4 (CHRNA1 or CHRNA4)‡
Voltage-gated calcium channel subunit α Cav1.2 (CACNA1C)‡
GABAA receptor α1 (GABRA1)‡
Potassium voltage-gated channel subfamily H member 2; hERG (KCNH2)
Potassium voltage-gated channel KQT-like member 1 (KCNQ1) and minimal potassium channel MinK (KCNE1)
NMDA receptor subunit NR1 (GRIN1)‡
5-HT3 (HTR3A)‡
Voltage-gated sodium channel subunit α

Enzymes
Acetylcholinesterase (ACHE)
Cyclooxygenase 1; COX1 (PTGS1)
Cyclooxygenase 2; COX2 (PTGS2)‡
Monoamine oxidase A (MAOA)‡
Phosphodiesterase 3A (PDE3A)
Phosphodiesterase 4D (PDE4D)‡
Lymphocyte-specific protein tyrosine kinase (LCK)

Transporters
Dopamine transporter (SLC6A3)
Noradrenaline transporter (SLC6A2)‡
Serotonin transporter (SLC6A4)‡

Nuclear receptors
Androgen receptor (AR)
Glucocorticoid receptor (NR3C1)

Figure panels: 19 “test” compounds vs. Bowes et al “Panel 44 set”; Pathway Analysis.

Conclusions - The project benefited from an extensive validation exercise involving large databases of in vitro results. This enabled analysis of the accuracy of prediction with reference to in vitro results, for which the prediction of hits was pleasingly high, although there remains work to improve upon the delineation of misses. In addition, comparisons with the in vitro data for the blind test compounds were promising. The resulting technology provides valuable molecular knowledge and forms the basis of a novel screening tool. It enables a paradigm shift from reliance on observing effects at a system level to predicting effects based on understanding at the molecular level, with a reduced reliance upon animal testing, whilst also reducing drug development costs through the ability to screen for toxic or adverse reactions earlier within the drug development cycle. Such molecular knowledge is a valuable commodity to a range of industries including agrochemical, biotech, synthetic biology and medical/health research.

Normalisation techniques – A number of techniques were applied to normalise both the in vitro control data and the results of a given docking in the context of other dockings, as follows:

Normalisation of the in vitro data – It was observed that the control data obtained from the FDA Approved and ToxCast compounds naturally featured more hits than misses. In vitro control data was normalised to ensure the resulting prediction was not biased towards the prediction of a hit over a miss or vice versa.

Normalisation of the ligand results – Results were normalised because the in vitro studies tended to identify a higher level of larger molecule "hits" and smaller molecule "misses". The developed algorithm normalises the predicted energy of interaction independent of the number of atoms.

Normalisation of the docking results – This was undertaken to account for scoring functions that take into account how tightly a drug binds relative to other ligands that are predicted to bind at the same site, and to allow comparison of the respective binding affinities at different sites within the same protein.
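
The poster does not describe how the ligand results are made independent of the number of atoms. As an illustration only, the sketch below assumes the simplest possible size correction: dividing the predicted interaction energy by the ligand's heavy-atom count. The `Docking` class, its field names and the example values are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class Docking:
    """One docking result. All field names here are hypothetical."""
    ligand: str
    energy: float     # predicted interaction energy (more negative = stronger)
    heavy_atoms: int  # number of non-hydrogen atoms in the ligand


def per_atom_energy(d: Docking) -> float:
    """Return the predicted energy divided by the heavy-atom count.

    Assumption: a plain per-atom division. The algorithm actually used
    in the project is not described in the source.
    """
    if d.heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return d.energy / d.heavy_atoms


# Made-up example values: the larger ligand has the stronger raw energy,
# but the smaller ligand has the stronger energy per heavy atom.
dockings = [
    Docking("small_ligand", energy=-6.0, heavy_atoms=12),
    Docking("large_ligand", energy=-9.0, heavy_atoms=30),
]

# Rank from strongest to weakest size-normalised energy.
ranked = sorted(dockings, key=per_atom_energy)
```

Ranking on the raw energy alone would place the larger ligand first; the per-atom value reverses that order, which is the kind of size bias the normalisation is meant to remove.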
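
The hit criterion and colour coding used for the predicted interaction scores (range [0, 1], >0.5 a hit, blue for a miss, red of increasing intensity towards 1.0) can be sketched as follows, together with the calculation of true positive and true negative rates against in vitro observations. This is a minimal sketch: the function names are hypothetical, the linear red intensity ramp is an assumption, and a score of exactly 0.5 is treated as a miss because the poster does not say how that boundary is handled.

```python
def classify(score: float, threshold: float = 0.5) -> str:
    """Call a hit when the score exceeds the threshold, otherwise a miss."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    # Assumption: a score exactly equal to the threshold counts as a miss.
    return "hit" if score > threshold else "miss"


def colour(score: float, threshold: float = 0.5) -> tuple:
    """Return ("blue", 0.0) for a miss or ("red", intensity) for a hit.

    Assumption: intensity rises linearly from 0 at the threshold
    to 1 at a score of 1.0.
    """
    if classify(score, threshold) == "miss":
        return ("blue", 0.0)
    return ("red", (score - threshold) / (1.0 - threshold))


def rates(scores, observed_hits, threshold: float = 0.5) -> tuple:
    """Return (true positive rate, true negative rate) against in vitro calls."""
    tp = fn = tn = fp = 0
    for score, observed in zip(scores, observed_hits):
        predicted = classify(score, threshold) == "hit"
        if observed and predicted:
            tp += 1
        elif observed:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one observed hit and one observed miss")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up scores and in vitro outcomes, for illustration only.
example_scores = [0.9, 0.7, 0.4, 0.6, 0.2]
example_observed = [True, True, True, False, False]
tpr, tnr = rates(example_scores, example_observed)
```

With a fixed threshold, the two rates trade off against each other, which is consistent with the reported combination of a high true positive rate and a lower true negative rate.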