<<

Chemical Research in Toxicology


Identification of potential aryl hydrocarbon receptor ligands by virtual screening of industrial chemicals

Journal: Chemical Research in Toxicology

Manuscript ID tx-2016-004609

Manuscript Type: Article

Date Submitted by the Author: 20-Dec-2016

Complete List of Authors:
Larsson, Malin; Umeå University, Department of Chemistry
Fraccalvieri, Domenico; Università degli Studi di Milano-Bicocca, Dipartimento di Scienze dell’Ambiente e del Territorio
Andersson, C. David; Umeå University, Department of Chemistry
Bonati, Laura; University of Milano-Bicocca, Department of Earth and Environmental Sciences
Linusson, Anna; Umeå University, Department of Chemistry
Andersson, Patrik; Umeå University, Department of Chemistry

ACS Paragon Plus Environment Page 1 of 52 Chemical Research in Toxicology

Identification of potential aryl hydrocarbon receptor ligands by virtual screening of industrial chemicals

Malin Larsson†, Domenico Fraccalvieri‡, C. David Andersson†, Laura Bonati‡, Anna Linusson†, and Patrik L. Andersson†*

†Department of Chemistry, Umeå University, SE-901 87 Umeå, Sweden

‡Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, 20126 Milano, Italy

*Corresponding author: Tel.: +46-90-786-5266; fax: +46-90-786-7655. Email address: [email protected]

Malin Larsson†: [email protected]
Domenico Fraccalvieri‡: [email protected]
C. David Andersson†: [email protected]
Anna Linusson†: [email protected]
Laura Bonati‡: [email protected]


ABSTRACT

We developed a virtual screening procedure to identify potential ligands of the aryl hydrocarbon receptor (AhR) among a set of industrial chemicals. AhR is a key target for dioxin-like compounds, and binding to it is related to these compounds’ potential to induce cancer and a wide range of endocrine and immune system-related effects. The virtual screening procedure included an initial filtration step aimed at identifying chemicals with structural similarities to 66 known AhR binders, followed by three enrichment methods run in parallel. These included two ligand-based methods (structural fingerprints and nearest neighbor analysis) and one structure-based method using an AhR homology model. A set of 6,445 commonly used industrial chemicals was processed, and each step identified unique potential ligands. Seven compounds were identified by all three enrichment methods, and these compounds included known activators and suppressors of AhR. Only approximately 0.7% of the studied industrial compounds were identified as potential AhR ligands, and of these we suggest that 2-chlorotrityl chloride, nisoldipine, 3-p-hydroxyanilino-carbazole, and 3-(2-chloro-4-nitrophenyl)-5-(1,1-dimethylethyl)-1,3,4-oxadiazol-2(3H)-one should be addressed in future studies of AhR-mediated activities.


1. INTRODUCTION

2,3,7,8-Tetrachlorodibenzo-p-dioxin (2378-TCDD) is a persistent organic pollutant that is known to induce several toxicological outcomes ranging from acute syndromes such as chloracne in humans to severe long-term adverse health effects, including immune response-related effects, cancer, and reproductive disorders¹⁻⁵. AhR plays a central role in the toxicological outcomes of 2378-TCDD, which is the most potent known ligand of AhR⁶. In addition to dioxins and dioxin-like chemicals, AhR can bind to and be activated by a number of diverse natural, endogenous, and synthetic compounds⁶. The AhR signaling pathway is also known to cross-talk with the estrogen receptor pathway, which implies that AhR-activating compounds can cause endocrine disruption⁷⁻⁹. Furthermore, activation of AhR has been linked to many immune response processes that are associated with numerous diseases¹⁰⁻¹⁶. Thus, the identification and regulation of chemicals that induce AhR-related pathways is a critical human health and environmental issue. Non- methods, including in vitro and in silico methods, are promoted in the European chemical legislation REACH as ways to identify and prioritize chemicals for further testing¹⁷. In the United States, the Tox21 and ToxCast programs were initiated to identify hazardous chemicals by using high-throughput screening approaches, including large batteries of in vitro assays¹⁸⁻²⁷. In the ToxCast program, 320 chemicals were screened for five nuclear receptor signaling pathways, including the AhR pathway,²⁴ and a small percentage of these chemicals showed weak AhR-related responses.

Virtual screening is frequently used in medicinal chemistry, where in silico methodologies are used to evaluate large compound libraries for potential associations with a well-defined target, usually a protein²⁸. To evaluate the virtual screening output, the hits are often verified by competitive binding assays and/or other in vitro assays²⁹. Virtual screening is based on the structural data of known ligands of the studied target (ligand-based screening) and/or characteristics of the target protein (structure-based screening)³⁰⁻³⁸. Ligand-based information includes structural fingerprints and specific chemical properties, while structure-based virtual screening is based on interactions between candidate ligands and the receptor as evaluated by molecular docking methods³⁷,³⁸. Currently no X-ray crystal structure


of the AhR ligand binding domain (LBD) is available, and thus homology models have been developed to enable detailed studies of ligand interactions³⁶,³⁹⁻⁴¹. The binding free energies of docking poses obtained from a homology model of the AhR LBD have been shown to correlate well with the experimentally derived competitive binding affinities of 14 polychlorinated dibenzo-p-dioxins (PCDDs), including 2378-TCDD, indicating that this homology model can be used for virtual screening purposes⁴⁰.

The aim of this study was to use virtual screening to identify new potential AhR ligands among a set of commonly used industrial chemicals. A virtual screening protocol was developed based on structural information from AhR binders that have been shown to induce AhR-related responses and on information from a rat AhR homology model⁴⁰. Ligand similarities were determined by structural fingerprints and by nearest neighbor analysis based on 2D descriptors. Fingerprint-based approaches identify similar substances based on their molecular substructures, while nearest neighbor analysis identifies similarities in chemical and structural properties. Protein–ligand interactions were evaluated using the binding free energies of the docking poses. When creating the protocol, we ran the three screening steps in parallel because multiple scoring and data fusion have proven to be more robust than, and often outperform, a single virtual screening method³⁴,³⁵,⁴²,⁴³. In the literature search for known AhR ligands for the virtual screening protocol, we compiled a database of 214 AhR modulators. This study begins with a multivariate analysis of this set of compounds to obtain a broad overview of the structural and chemical variation of currently known AhR-related compounds.

2. MATERIALS AND METHODS

2.1 Datasets

We searched the literature for compounds that activate AhR in rat or mouse cell assays, especially those whose AhR activity was identified with ethoxyresorufin-O-deethylase (EROD) activity and


dioxin responsive chemically activated luciferase expression (DR-CALUX) assays. These compounds are hereafter called “AhR modulators”. The search was performed in SciFinder (2014-09-25) using the following search terms: “CALUX dioxin”; “EROD” with the refinement “relative potency, not soil, not sediment, not contamination, not diet”; and “EROD” with the refinements a) “in vitro”, b) “luciferase dioxin AhR”, and c) “luciferase AhR”. Reported data from the EROD and CALUX assays showed high correlation, and we thus decided to merge these data (Supporting Information).

A second database was compiled of compounds that bind to AhR, here called the “AhR binders”. These were selected based on a reported half-maximal inhibitory concentration (IC50), inhibition constant (Ki), or dissociation constant (Kd) at or below 10 µM as measured in competitive binding assays using labeled 2378-TCDD in rat or mouse cell systems (Dataset S2). The binding data were obtained from SciFinder (2014-09-25) using the search term “Ah receptor” with the refinements “agonist affinity” and “Ah receptor competitive”. We also used the available ligand binding data from the studies obtained in the literature search for AhR modulators. In addition, compounds were defined as binders if 50% of the labeled 2378-TCDD was displaced at compound concentrations at or below 10 µM⁴⁴. The threshold of 10 µM was adopted to avoid false positives because in vitro data above that limit might be erroneous: several classical AhR ligands are very hydrophobic and have very limited water solubility, which increases the experimental uncertainty in that concentration range. Applying this threshold also means that we use a safety factor of approximately 1,000 in relation to the levels of AhR ligands reported in humans. For example, very high levels of PCBs in human blood (up to 15 nM) have been reported for populations in Eastern Slovakia⁴⁵.

The third data set covered an inventory of high and low production volume chemicals (H/LPVCs)⁴⁶, i.e. the “industrial chemicals”. Compounds containing atoms (i.e., Al, As, Ba, Bi, Cd, Co, Cr, Mn, Ni, Pb, Sb, Sn, Sr, Ti, and Zr) that are not available for parameterization in the MMFF94x force field were removed. In practice, this resulted in the removal of metal complexes, and the final H/LPVC database consisted of 6,445 industrial chemicals⁴⁷,⁴⁸. Chemical Abstracts Service (CAS) numbers were used to identify each chemical. The molecular structures of all compounds discussed are given in Figures 1, 6, and S7 and in Tables S3–S5.
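The element-based clean-up of the industrial chemical inventory can be illustrated with a minimal Python sketch. It assumes each compound is available as a molecular formula string, which is our simplification; the function names and the three example entries are hypothetical and serve only to show the filtering logic, not the tooling actually used in the study.

```python
import re

# Elements listed in the text as unavailable for MMFF94x parameterization.
UNSUPPORTED = {"Al", "As", "Ba", "Bi", "Cd", "Co", "Cr", "Mn",
               "Ni", "Pb", "Sb", "Sn", "Sr", "Ti", "Zr"}


def elements(formula: str) -> set[str]:
    """Return the element symbols found in a molecular formula string."""
    return set(re.findall(r"[A-Z][a-z]?", formula))


def is_parameterizable(formula: str) -> bool:
    """True if the formula contains none of the unsupported elements."""
    return elements(formula).isdisjoint(UNSUPPORTED)


# Illustrative (name, formula) pairs; hypothetical input, not the real inventory file.
inventory = [
    ("triphenylphosphine", "C18H15P"),
    ("lead(II) acetate", "C4H6O4Pb"),
    ("tetradifon", "C12H6Cl4O2S"),
]

kept = [name for name, formula in inventory if is_parameterizable(formula)]
print(kept)  # ['triphenylphosphine', 'tetradifon']
```

Working on element sets rather than substrings avoids confusing, for example, cobalt (Co) with a formula that merely contains carbon followed by oxygen.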


2.2 Chemical structures and molecular descriptors

The substances in the AhR modulators set were characterized using 68 calculated 2D⁴⁶ and 18 3D descriptors (Table S1). These 86 descriptors were used to create an overview of the structural and chemical variation of the AhR modulators. For the 3D descriptors, the chemical structures were energy minimized using the MMFF94x force field and Austin Model 1 (AM1)⁴⁷⁻⁴⁹ prior to calculations of force field-based and quantum chemistry-based descriptors, respectively. The descriptors were obtained using MOE version 2012.10,⁴⁸ except for one shape descriptor, DistMax (the longest distance in Ångströms between two atoms in the molecule), which was calculated based on the AM1 3D coordinates using MATLAB version R2012a⁵⁰. Only 2D descriptors were used for the virtual screening, and therefore the structures of the H/LPVC database were solely characterized by the 68 2D descriptors (cf. the 2D descriptors calculated for the AhR modulators). The structures were pre-treated, i.e. strong acids were deprotonated, strong bases were protonated, and counter ions and duplicates were removed, prior to calculations of the molecular descriptors using MOE⁴⁸.

The descriptors covered features describing hydrophobicity, polarizability, reactivity, aromaticity, flexibility, size, and shape. Before the calculations of descriptors and the energy minimization, the structures were treated with the Wash procedure in MOE using the default settings. Examples of descriptors included the hydrophobicity of the compounds as reflected by the octanol-water partition coefficient (logP) and the reactivity of the compounds as reflected by molecular orbital energies such as the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO) energies and the difference between these energies (GAP). Polarity was captured by the implementation of the Gasteiger and Marsili Partial Equalization of Orbital Electronegativities (PEOE) method, which describes the polarity of the molecular surface (the van der Waals surface) based on atomic partial charges (PEOE_VSA)⁴⁸. The shape of the molecules was estimated by connectivity indices, such as the Balaban distance connectivity index J (balabanJ) and kappa shape indices (Kier1, Kier2, and Kier3),⁵¹,⁵² as well as by the ratios of the 3D descriptors pmi1, pmi2, and


pmi3, which reflect relations in the width, height, and length of the molecule⁵³. Many known AhR ligands are aromatic, and thus descriptors related to this property were included, such as the numbers of aromatic bonds and rings.

2.3 Techniques used for multivariate analysis and virtual screening

2.3.1 Principal component analysis (PCA)

PCA is a projection method where the largest variation in the data is extracted by creating new latent orthogonal variables. Here, each object is a molecule and the variables are the molecular descriptors. The first principal component (PC) shows the largest variation in the data, i.e., the largest variation between the molecules based on their chemical and structural characteristics. The second PC is orthogonal to the first one and reflects the second largest variation, and so on. Each PC is defined by score values representing the locations of the molecules in the multivariate space (using projections to the latent variable) and by loading values that indicate which descriptors are responsible for the distribution seen in the score values. PCA was used for the analysis of the AhR modulators and was used in the initial filtration step and in the nearest neighbor analysis of the virtual screening procedure (Section 2.3.2.2). To evaluate the PCA model, the following statistical measures were used: Distance to the Model (DModX), eigenvalues, the explained variation in each PC (R²X), and Hotelling statistics (Hotelling’s T² range at the 95% confidence level). DModX and Hotelling’s T² were used to assess the applicability domain of the studied compounds and to detect outliers among the compounds⁵⁴. A PC was considered significant if it had an eigenvalue equal to or above 2. The PCA was performed with SIMCA version 13.0⁵⁴.
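In matrix form, the PCA model and the explained variation per component described above can be written as follows. This is the standard textbook formulation, stated under the assumption that the descriptor matrix X (molecules in rows, descriptors in columns) has been centered and scaled; the symbols are ours, and the exact pre-processing and the SIMCA definitions of DModX and Hotelling’s T² are not reproduced here.

```latex
X = T P^{\mathsf{T}} + E = \sum_{a=1}^{A} t_a p_a^{\mathsf{T}} + E,
\qquad
R^2X_a = \frac{\lVert t_a p_a^{\mathsf{T}} \rVert_F^2}{\lVert X \rVert_F^2}
```

Here t_a and p_a are the score and loading vectors of component a, A is the number of retained components, E holds the residuals, and the subscript F denotes the Frobenius norm (the sum of squares over all matrix elements).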
2.3.2 Ligand-based similarity techniques

2.3.2.1 Tanimoto coefficients

Molecular ACCess System (MACCS) fingerprints were used to define the fragments for the fingerprinting⁵⁵. MACCS contains a total of 166 predefined structural keys that correspond to various fragments that form binary fingerprints of the molecules. The fingerprints between pairs of molecules


were compared using the Tanimoto coefficient (TC), which describes the similarity between the binary strings according to:

$$TC = \frac{n_{SAME}}{n_A + n_B - n_{SAME}} \qquad (1)$$

where $n_{SAME}$ is the number of identical fragments in molecule A and molecule B and $n_A$ and $n_B$ are the numbers of fragments in molecules A and B, respectively⁴³. A TC of 0.60 was set as the cut-off for calling molecules A and B similar in this study (Supporting Information). The TCs were calculated in MOE version 2012.10⁴⁸.

2.3.2.2 Euclidean distances in the chemical property space

The distance between molecules in a defined chemical property space (here defined by latent variables from the PCA) can be used as a measure of molecular similarity. Here, the Euclidean distance (ED) between two molecules based on the Cartesian coordinates of the latent variables from PCA was calculated according to:

$$ED_{pq} = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + \cdots + (q_n - p_n)^2} \qquad (2)$$

where p and q are two molecules, $p_1, p_2, \ldots, p_n$ are the Cartesian coordinates given by the score values of PCs 1 to n for molecule p, and $q_1, q_2, \ldots, q_n$ are the Cartesian coordinates given by the score values of PCs 1 to n for molecule q⁴³. EDs were used to locate the closest neighbors of each of the AhR binders, and the cut-offs for the EDs differed according to the scaling of the data for the PCAs used in the screening steps. The ED cut-offs were set at the point at which the structures no longer shared the same number of rings and/or similar functional groups in the same positions as in the AhR binders. An ED of 1.5 was used in the initial filtration step to provide structurally similar compounds to a few structurally diverse AhR binders. For the nearest neighbor analysis in the parallel virtual screening step, the ED cut-off was set to 5.0 and a maximum of 10 neighbors was kept for each AhR binder. The ED cut-off in the initial filtration step was much smaller because the descriptors (except those already log-transformed) were log-transformed prior to analysis to normalize


their distribution and to minimize the influence of extreme values⁴⁶. More information on the cut-off procedure is given in the Supporting Information.

2.3.3 Docking protocol and evaluation

A previously generated homology model of the LBD of the rat AhR³⁵, which was derived from the template structures of three HIF-2α PAS-B domains in complexes with artificial ligands⁵⁶,⁵⁷, was used to study the molecular interactions between the potential ligands and the LBD. The docking procedure was based on a previously developed protocol³⁵ for docking to homology models and included refinement of the model containing a template ligand (THS-017⁵⁷) by energy minimization with the MacroModel program included in Maestro⁵⁸, docking using the Glide SP program (Glide 6.2, Schrödinger Suite 2014-3⁵⁹,⁶⁰), and refinement and rescoring of the docking poses with the molecular mechanics generalized Born/surface area (MM-GBSA) method as implemented in the Prime software⁶¹. In contrast to the previously adopted ensemble-docking protocol³⁵, only one receptor conformation was selected for docking in this work so as to reduce the computational costs. The receptor grid for docking was centered on the THS-017 ligand, and docking was performed within a 12 Å distance from the ligand position⁴⁰,⁵⁷. Tautomers and protonation states at pH 7.4, as well as stereoisomers, were generated for the studied ligands using the program LigPrep in Maestro. The ten highest-ranked docking poses of each ligand stereoisomer, according to the GlideScore SP scoring function, were rescored with the Prime MM-GBSA method, which estimates the binding free energy (ΔGbind) between the compounds and AhR and accounts for the interaction energies and desolvation effects that occur upon complex formation. This method yields ΔGbind values that correlate well with experimental IC50 values of docked and rescored PCDD/Fs and PAHs¹,⁴⁰,⁶², and it has been shown to perform better for congeneric series of ligands⁶³. In the rescoring procedure, the ligands and protein residues within 8.0 Å of the ligand were energy minimized while the remaining residues were kept fixed. The AhR–ligand complexes with the lowest ΔGbind, including one pose of a specific stereoisomer of each ligand, were analyzed further.
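The pose selection described above, which we read as keeping for each ligand the single rescored pose with the lowest ΔGbind across all of its stereoisomers, could be sketched as follows. This is only an illustration in Python: the `Pose` record, the function name, and the energy values are hypothetical and are not output from Glide or Prime.

```python
from typing import NamedTuple


class Pose(NamedTuple):
    ligand: str        # ligand identifier, e.g. a CAS number
    stereoisomer: str  # label of the docked stereoisomer
    dg_bind: float     # rescored binding free energy in kcal/mol


def best_pose_per_ligand(poses):
    """Keep, for each ligand, the pose with the lowest (most favourable) dg_bind."""
    best = {}
    for pose in poses:
        current = best.get(pose.ligand)
        if current is None or pose.dg_bind < current.dg_bind:
            best[pose.ligand] = pose
    return best


# Hypothetical rescored poses; the energies are made up for illustration.
poses = [
    Pose("ligand_1", "R", -95.0),
    Pose("ligand_1", "S", -104.5),
    Pose("ligand_2", "-", -88.2),
    Pose("ligand_2", "-", -91.7),
]

selected = best_pose_per_ligand(poses)
for ligand, pose in selected.items():
    print(ligand, pose.stereoisomer, pose.dg_bind)
# ligand_1 S -104.5
# ligand_2 - -91.7
```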
We estimated a cutoff based on the ΔGbind values of the 65 known AhR binders that were successfully docked (one failed in the docking procedure), which had an average ΔGbind of −112.5 kcal/mol. Industrial chemicals with a ΔGbind value below the cutoff of −99.3 kcal/mol, which lies one standard deviation above the average ΔGbind of the 65 known binders, were considered more likely to be potential AhR binders than those with higher ΔGbind values. This selection became the final enrichment set from the molecular docking. More information on the cutoff procedure is given in the Supporting Information. Docked and rescored ligands were classified based on MACCS fingerprint descriptors using a hierarchical clustering model based on Tanimoto coefficients (eq. 1)⁶⁴ and a cluster linkage method based on the weighted average intra-cluster and inter-cluster distances⁶⁵ with a beta value of 0.25. Hydrogen bonds, halogen bonds, and aromatic π–π and hydrogen–π bonds between AhR and ligands were mapped using MOE⁴⁸ with a distance cut-off of 4.5 Å and a maximum interaction energy of −0.5 kcal/mol, taking atom pair distance and directionality into account.

2.4 Search for AhR-mediated effects for selected industrial chemicals from the virtual screening

We performed a literature search for AhR-mediated effects for the compounds that were selected as potential AhR ligands in the virtual screening procedure (Dataset S7, Table S5). The search was performed in SciFinder (2016-03-16) using the compounds’ CAS numbers and limiting the search to the option “Adverse Effect, including toxicity” while also retrieving “Additional related references, e.g., activity studies, disease studies”. This resulted in an individual list of SciFinder references for each compound, to which the following refinements were applied: a) “CYP1A1”, b) “CYP1A”, c) “CYP1”, and d) “AhR”.

3. RESULTS AND DISCUSSION

3.1 Chemical characterization of AhR modulators

The AhR modulator database includes a range of small and halogenated aromatic compounds and polycyclic aromatic compounds that are known to induce AhR-mediated effects (Dataset S1).
Examples of typical AhR modulators, besides 2378-TCDD, are 3,3’,4,4’,5-pentachlorobiphenyl (PCB 126), benzo[a]pyrene (BaP), and beta-naphthoflavone (BNF) (Figure 1). The database also included

endogenous rigid aromatics such as 6-formylindolo[3,2-b]carbazole (FICZ) and dinaphtho[1,2-b;1’,2’-d]furan (DNF), flexible aromatic compounds such as 2-(1’H-indole-3’-carbonyl)-thiazole-4-carboxylic acid ester (ITE), and large, complex, and flexible compounds like G2 and .

PCA of the structural variation of the 214 AhR modulators resulted in five significant PCs explaining 87%, 48%, 17%, 13%, and 5% of the variation in the data, respectively. By studying the first three components, a clear separation into four clusters emerged, including 1) halogenated aromatics, 2) polycyclic aromatic hydrocarbons (PAHs), 3) natural products, and 4) endogenous substances (Figure S3). The largest cluster in the score plot included the majority of the halogenated aromatic compounds, and this showed that most of the AhR modulators are structurally similar (Figure 2). The first PC described differences between compounds with regard to size and surface characteristics; the second PC described differences in hydrophobicity; and the third PC described differences in density, the numbers of rings, aromatic bonds, and halogens, GAP, and the shape/branching index balabanJ. The fourth PC separated compounds based primarily on their LUMO energy, the number of rotatable bonds, and the number of rings in relation to the number of atoms. The fifth PC was related to variations in molecular shape described by the ratios of shape descriptors, including the length, width, and height of the molecules (pmi2/pmi1, pmi3/pmi1, npr2 = pmi2/pmi3). The set of AhR modulators also included roughly 40 structurally diverse compounds, many of which were more flexible and less hydrophobic than most AhR modulators. As seen in the PCA (Figure 2), the AhR modulators cover a large variation in structural characteristics, and it might be questioned whether all of these compounds actually bind to AhR⁶,⁶⁶.
The development of the virtual screening protocol was therefore based on known AhR binders, and we identified 66 compounds in the literature that have AhR binding data (Dataset S2, Figure 2).

3.2 Virtual screening of industrial chemicals

The virtual screening procedure consisted of an initial filtration step followed by three parallel enrichment steps, including two ligand-based methods (nearest neighbor analysis and structural fingerprints) and one structure-based method (molecular docking) (Figure 3).
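As a rough illustration of how the two ligand-based similarity measures (eqs. 1 and 2) and the combination of the parallel hit lists could be computed, the following Python sketch uses toy data. The function names, example fingerprints, score vectors, and compound labels are hypothetical; the study itself used MOE and SIMCA, not this code.

```python
from math import sqrt


def tanimoto(keys_a, keys_b):
    """Tanimoto coefficient (eq. 1) for two sets of 'on' fingerprint keys."""
    n_same = len(keys_a & keys_b)
    denominator = len(keys_a) + len(keys_b) - n_same
    return n_same / denominator if denominator else 0.0


def euclidean_distance(p, q):
    """Euclidean distance (eq. 2) between two PCA score vectors."""
    return sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))


def consensus(*hit_lists):
    """Compounds selected by every enrichment method."""
    return set.intersection(*(set(hits) for hits in hit_lists))


# Toy fingerprints: indices of the structural keys that are set (hypothetical).
binder_keys = {1, 2, 3}
candidate_keys = {2, 3, 4}
print(tanimoto(binder_keys, candidate_keys))       # 2 / (3 + 3 - 2) = 0.5

# Toy score vectors in a two-component PCA space (hypothetical).
print(euclidean_distance((0.0, 0.0), (3.0, 4.0)))  # 5.0

# Toy hit lists from the three parallel enrichment steps (hypothetical labels).
fingerprint_hits = {"cmpd_A", "cmpd_B"}
neighbor_hits = {"cmpd_B", "cmpd_C"}
docking_hits = {"cmpd_B", "cmpd_D"}
print(consensus(fingerprint_hits, neighbor_hits, docking_hits))  # {'cmpd_B'}
```

Representing each fingerprint as the set of keys that are switched on keeps the Tanimoto calculation a direct transcription of eq. 1; a bit-vector representation would give the same result.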


3.2.1 Initial filtration based on structural and chemical properties

The majority of the 66 known AhR binders from the literature were aromatic compounds containing carbon, hydrogen, oxygen, and chlorine, with a few exceptions, e.g. FICZ, ITE, flutamide, leflunomide, and nimodipine. In order to identify potential AhR binders among the compounds in the H/LPVC database, we used an initial filtration step based on PCA. A total of 330 H/LPVCs were identified by PCA within the applicability domain of the 66 AhR binders. FICZ, ITE, flutamide, leflunomide, and nimodipine were identified as being outside the domain of the PCA model, and a parallel filtration step was introduced to cover substances with structural characteristics similar to those of these five compounds. This procedure was based on a PCA including FICZ, ITE, flutamide, leflunomide, and nimodipine and the 6,445 compounds in the H/LPVC database, and it yielded an additional set of 99 industrial chemicals that are referred to here as atypical potential AhR binders. These compounds generally show a larger structural variation than the remaining 330 compounds, which are referred to as typical AhR binders (Figure 1; 1–3 and 9). In summary, the filtration step resulted in 429 industrial chemicals that were further processed in the three parallel virtual screening procedures (Figure S6).

3.2.2 Ligand-based methods

The enrichment based on the nearest neighbor analysis and the structural fingerprints of the 429 industrial chemicals (Section 3.2.1) yielded 93 and 32 compounds, respectively (Figure 3). The 93 compounds of the nearest neighbor analysis (Dataset S5) were identified close to known AhR binders in the chemical property space.
These compounds typically included two or three aromatic rings; 29 of the 93 compounds were halogenated, and 8 of them had chlorine atoms in lateral positions on both outer rings, thus resembling the most potent AhR agonist, 2378-TCDD. These eight compounds typically included an oxygen, nitrogen, or sulfur atom connecting the halogenated aromatic rings, for example 5-chloro-2-(2,4-dichlorophenoxy)aniline (56966-52-0), tetradifon (116-29-0), and 2-(4-chlorophenoxy)-5-(trifluoromethyl)aniline (349-20-2). Similar structures, such as polychlorinated diphenyl sulfides, have been shown to activate mouse AhR in the H4IIE-luc activation assay⁶⁷. In addition, 14 of the compounds were polycyclic and as such resembled PAHs, i.e. they had 3–5 phenyl


rings and typically had no substituents on the rings. Eight of these 14 compounds had fused rings similar to the known potent PAHs with nitrogen or a carbonyl carbon in the phenyl rings, for example 12H-phthaloperin-12-one (6925-69-5). The aromatic rings were halogenated in two cases: 3-bromobenz[de]anthracen-7-one (81-96-9) and 3,9-dibromo-7H-benz[de]anthracen-7-one (81-98-1). Molecules similar to the last two compounds have been shown to activate AhR in yeast cells⁶⁸. Unlike the other polycyclic compounds, six compounds had non-fused rings and typically had three rings connected by a central phosphorus atom, for example triphenylphosphine (603-35-0). There were 46 neighbors for the AhR binder BNF, which shared hits with PAHs but also had unique neighbors, including naphthalene-like compounds with two fused rings and carbonyl functionalities, for example 2-(2-quinolyl)-1H-indene-1,3(2H)-dione (83-08-9). The remaining 50 compounds (of the 93) were similar to the atypical AhR binders (Figure 1; 4–8). The atypical AhR binder FICZ, which, like the more typical binders, is structurally quite rigid, was identified close to 2,2'-thiophene-2,5-diylbis(benzoxazole) (2866-43-5) and 5,12-dihydroquino[2,3-b]acridine-7,14-dione (1047-16-1). ITE and leflunomide were found close to over 70 compounds. The 10 closest neighbors of ITE all had hydrogen bond acceptor atoms at similar positions. For instance, 1,8-dihydroxy-4,5-bis(methylamino)anthraquinone (56524-76-6) has hydrogen bond acceptor atoms situated pair-wise in the middle and at both ends of the structure, which is similar to ITE (Figure S7).

The 32 compounds identified using structural fingerprints (Dataset S4) had a similar share of halogenated compounds (approximately one third) as the 93 compounds identified by the nearest neighbor analysis.
However, only 4 of these 32 compounds had chlorine atoms in lateral positions on both outer rings, i.e. resembling 2378-TCDD. The compounds identified by structural fingerprints included 12 halogenated aromatic compounds, and of the 32 compounds, 15 were similar to typical AhR binders (halogenated aromatics, PAHs, and beta-naphthoflavone) and 17 were similar to atypical AhR binders. The three industrial chemicals nisoldipine (63675-72-9), bis(isopropyl)naphthalene (38640-62-9), and 4-bromo-2-fluoro-1,1’- (41604-19-7) were very similar to known AhR binders (TC above 0.8). The highest similarity (TC = 0.86) was obtained for nisoldipine, which has the same core structure as nimodipine but different substituents. None of the identified 32 compounds (i.e.

having TC ≥ 0.60) were similar to ITE or any of the PCDFs, unlike what was seen in the nearest neighbor analysis. However, the structural fingerprint-based method identified isoxazoles, one acidic isoflavone, and diphenyl-azo compounds with ester functionalities that were not identified in the nearest neighbor analysis. Three isoxazole acyl chlorides were found, which are used as precursors in the synthesis of antibiotics, e.g. cloxacillin⁶⁹,⁷⁰. Isoflavones with certain substitution patterns on their rings have been shown to induce luciferase activity in mouse H1L6.1c2 cells⁷¹, and the isoflavone identified by the structural fingerprint approach, 3-methylflavone-8-carboxylic acid (3468-01-7), is a metabolite of the urinary incontinence drug flavoxate⁷². The azo-compounds, such as 2-[[4-[(2-cyano-3-nitrophenyl)azo]-m-tolyl](2-acetoxyethyl)amino] ethyl acetate (66882-16-4)⁷³, are used as textile dyes.

3.2.3 Structure-based method

The 66 previously established AhR binders from the literature and the 429 industrial chemicals remaining after the initial filtering step were docked to a homology model of the AhR LBD⁴⁰. The AhR homology model used for the molecular docking shows a tunnel-shaped and buried ligand-binding cavity. Out of the 23 residues flanking the binding site, 10 residues with hydrophobic side chains (Ala, Leu, Ile, Pro, Phe, Tyr) form three patches of hydrophobic surface covering a large proportion of the binding site surface. Eight amino acids line the site with O, N, or S atoms that can participate in hydrogen or halogen bonds. Analysis of the docking of the established AhR binders revealed that few molecules bonded to AhR with classical hydrogen bonds (O–H or N–H to O or N) even though a majority of the ligands contained a heteroatom with hydrogen bonding capability. Chlorine- or bromine-containing ligands frequently halogen bonded to oxygen (Thr287 (OH), Ala332, His330, Ser334 (OH)) or to sulfur (Cys298, Cys331, Met346, Met338). A few compounds interacted with the aromatic residue His289 through aromatic stacking (23479-PeCDF and 124678-HxCDF) or hydrogen–arene interactions (PCB77, PCB157, and 3-). Notably, only 35% of the known binders participated in these specific interactions with AhR, and the remaining molecules’ binding to AhR could be attributed to shape complementarity, van der Waals interactions, and desolvation effects. Based on the ΔGbind cutoff applied to the top-ranked poses, 177 of the 429 potential
binders identified in the initial filtration step were singled out as more probable AhR binders than the remaining 252 chemicals. The 177 industrial chemicals (Dataset S6) had a higher proportion of specific interactions when bonding to AhR than the docked and rescored known AhR binders: 69% had at least one hydrogen bond, 11% had halogen bonds, and 39% had aromatic interactions. One third of the 177 industrial chemicals were halogenated, and most of these contained one or two halogen atoms.

The potential binders were assessed for their structural similarity to the known binders using a hierarchical clustering based on fingerprint descriptors, and the resulting clusters are presented in Figure S7. Clusters 1, 2, 4, and 5 only contained known typical binders, and five clusters (3, 14, 15, 16, and 35) contained at least one known binder. A total of 153 compounds were identified in the remaining 26 clusters. Some of these clusters included small aromatic and often halogenated compounds (e.g. 20, 21, and 31), but the majority contained flexible and hydrophilic compounds with fused or nonfused aliphatic and aromatic rings (e.g. 7, 8, 9, and 25). The five clusters with at least one binder consisted of small aromatic, often halogenated, compounds, for example 2-chlorotrityl chloride (42074-68-0) (cluster 3). A unique feature captured by the molecular docking was the steroid ring structures. Examples of such chemicals were cholic acid (81-25-4) and ethisterone (434-03-7) (cluster 7). Cholic acid is a bile acid with low water solubility⁷⁴, indicating that a hydrophobic environment would suit it well, and it has similarities with the endogenous compound lipoxin 4A. Both compounds are acidic, flexible, and contain three hydroxyl groups, and lipoxin 4A has been reported to bind to AhR in guinea pig cells and to activate AhR-dependent gene expression in mouse cells⁷⁵. The synthetic female sex hormone ethisterone shares structural similarities with the estrogenic steroid hormone equilenin (Table S4). The latter has been shown to induce CYP1A1 mRNA in treated human hepatoma (HepG2) cell lines and to weakly displace 3H-BaP from AhR⁷⁶. Another unique feature was the three flexible multi-fused ring structures with one hexa-chlorinated ring and the other rings with none or single hydrophilic groups, for example aldrin (309-00-2) (cluster 27). A structurally similar compound to aldrin is hexachlorobenzene, which has been shown to induce AhR expression in the HepG2 cell line⁷⁷. Aldrin and hexachlorobenzene are banned by the Stockholm Convention, but this is
not the case for endosulfan alcohol (2157-19-9), which is also located in the same cluster (27). A number of benzothiazoles and thiazoles (clusters 11 and 12, respectively) were identified as potential AhR binders. This included the earlier mentioned di(benzothiazol-2-yl) disulphide (120-78-5) (Section 3.2.2) but also benzothiazoles that have an aliphatic ring structure. An example of such a compound is N-cyclohexyl-2-benzothiazolylsulfenamide (95-33-0) (Table S4). He et al.⁷⁸ showed that several such derivatives induce AhR-dependent luciferase reporter gene expression in recombinant mouse hepatoma cells. More examples and an extended analysis of some of the above-mentioned chemicals and clusters are given in the Supporting Information.

3.3 Consensus compounds and potential AhR ligands

The virtual screening protocol resulted in 41 compounds that were identified by at least two of the three parallel methods (Figure 5, Table S9a), among which seven were identified by all three enrichment methods and are referred to as consensus compounds (Figure 6). Among the 41 identified compounds, 17 were halogenated, 16 contained fused rings, and 14 were atypical potential ligands. To our knowledge, data on AhR-related activation or suppression has been reported for only 8 of the 41 chemicals (Table S11). The ligand-based enrichment steps (structural fingerprints and nearest neighbor analysis) jointly identified 12 compounds, and 22 compounds were jointly identified by the nearest neighbor and molecular docking enrichments (Figure 5, Table S7). The structural fingerprints and molecular docking methods did not identify any compounds in common other than the consensus compounds.
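The overlap bookkeeping described here (compounds found by all three methods, by exactly two, and by at least two) can be sketched with plain set operations. The following Python sketch is illustrative only: the function name `consensus_summary`, the method labels, and the toy identifiers c1 to c7 are hypothetical and are not taken from the study's datasets.

```python
from itertools import combinations


def consensus_summary(hits):
    """Summarize overlaps between hit lists.

    hits: dict mapping a method name to a set of compound identifiers.
    Returns (found_by_all, found_by_exactly_two_per_pair, found_by_at_least_two).
    """
    # Compounds identified by every method (the "consensus" set).
    found_by_all = set.intersection(*hits.values())

    # For each pair of methods, compounds shared by that pair
    # but not by all methods.
    pairwise_only = {}
    for a, b in combinations(sorted(hits), 2):
        pairwise_only[(a, b)] = (hits[a] & hits[b]) - found_by_all

    # Union of the consensus set and all pairwise-only sets.
    at_least_two = set(found_by_all)
    for shared in pairwise_only.values():
        at_least_two |= shared
    return found_by_all, pairwise_only, at_least_two


# Hypothetical toy hit lists (not data from the study).
toy_hits = {
    "docking": {"c1", "c2", "c5", "c7"},
    "fingerprints": {"c1", "c3", "c6"},
    "nearest_neighbor": {"c1", "c2", "c3", "c4"},
}
consensus, pairwise_only, at_least_two = consensus_summary(toy_hits)
print(sorted(consensus), sorted(at_least_two))
```

Applied to the counts reported above, the same decomposition gives 7 (all three methods) + 12 (fingerprints and nearest neighbor only) + 22 (nearest neighbor and docking only) + 0 (fingerprints and docking only) = 41 compounds identified by at least two methods.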
The 41 compounds identified by at least two enrichment methods included 11 chemicals registered in REACH: five were identified as compounds being produced between 100 and 100,000 tons per year, and six were registered for intermediate usage⁷⁹ (Table S10). Three of these 11 compounds, namely di(benzothiazol-2-yl) disulphide, triclosan, and bumetrizole, have been reported to activate or suppress AhR (Table S7).

The seven consensus compounds included structural analogues of both typical (5 compounds) and atypical (2 compounds) AhR binders, and these were aromatic, mainly halogenated (five out of seven compounds), and relatively hydrophobic. These compounds covered well-studied environmental
contaminants, including the biocide triclosan, the brominated flame retardant 2,2’,4,4’,5-pentabromodiphenylether (BDE99), and the pesticide p,p'-dichlorodiphenyltrichloroethane (pp’-DDT). The DDT isomers pp’-DDT and o,p'-dichlorodiphenyltrichloroethane and their metabolites have been shown to suppress the activity of CYP1A1 in human placenta cells⁸⁰. BDE99 has been shown to be a partial AhR-agonist, and also an antagonist with an EC50 value of 13 µM⁸¹. The AhR-related activity of triclosan has been studied by Ahn et al., who used an in vitro luciferase reporter gene assay and concluded that it is a weak agonist⁸². To our knowledge, AhR-mediated effects have not been studied for the remaining four consensus compounds. These include the pesticide 2-chlorotrityl chloride, the dye 3-p-hydroxyanilino-carbazole, nisoldipine, which is a pharmaceutical sold under the brand name Sular and is used as a calcium channel blocker⁸³, and the herbicide precursor 3-(2-chloro-4-nitrophenyl)-5-(1,1-dimethylethyl)-1,3,4-oxadiazol-2(3H)-one⁸⁴,⁸⁵. The three latter compounds have structural similarities to FICZ, nimodipine, and flutamide, all of which are atypical AhR binders.

Among the 12 solely ligand-based hits, i.e. the compounds that were identified in both the nearest neighbor analysis and the structural fingerprints, three were halogenated and three included fused rings. Three of the 12 were atypical binders, namely 2-amino-5-nitrobenzophenone (1775-95-7), 2-chloroethyl 3-nitro-p-toluate (59383-11-8), and 4-methoxy-3-nitro-N-phenylbenzamide (97-32-5). The following halogenated compounds were identified: 2,6-dichloro-N-phenylaniline (15307-93-4), 2-chloroethyl 3-nitro-p-toluate (59383-11-8), and 4-bromo-2-fluoro-1,1’-biphenyl (41604-19-7). Three of the hits resembled PAHs, namely anthracene-9-carbaldehyde (642-31-9), 3-(9-anthryl) acrylaldehyde (38982-12-6), and 2,8-dimethylnaphtho(3,2,1-kl)xanthene (81-37-8) (Table S7). 2,6-dichloro-N-phenylaniline is an analogue of the anti-inflammatory drug diclofenac and has been used to synthesize diclofenac analogues in an attempt to identify new anti-inflammatory drug candidates⁸⁶. Anthracene-9-carbaldehyde is a common intermediate in the production of dyes and pigments and has been shown to be metabolized mainly by CYP2B1 and CYP2C11 in rats and primarily by CYP3A (and to some extent by CYP2A6, CYP2B6, and CYP2C9) in human liver microsomes⁸⁷. Among these 12 compounds was terphenyl, which is generally a mixture of the three isomers ortho-, meta-, and
para-terphenyl (where the para isomer dominates in technical mixtures). Terphenyl has been used as a dye since the 1980s and more recently in new nanomaterials due to its optical properties⁸⁸,⁸⁹.

The structure-based and the nearest neighbor enrichment processes identified 22 common compounds, and among these 9 were halogenated, 12 had fused ring systems, and 9 were atypical. Among the halogenated compounds was dicofol, a hydroxylated derivative of DDT⁹⁰ that has a weak thyroidogenic effect⁹¹. Moreover, these 22 compounds included many that have hydrogen acceptors in similar positions as the known atypical AhR binders ITE and FICZ. Examples of such compounds are bumetrizole (3896-11-5), di(benzothiazol-2-yl) disulphide (120-78-5), and 5-chloro-2-(2,4-dichlorophenoxy)aniline (56966-52-0). Bumetrizole is a benzotriazole used as an ultraviolet absorption stabilizer (UV326), and this compound has been shown to activate AhR-related pathways through the induction of and AHR2 in zebrafish embryos⁹². Benzothiazoles, such as di(benzothiazol-2-yl) disulphide, are used as vulcanization accelerators in rubber production⁹³, and these compounds have been identified as a new class of AhR agonists in a recent study of 16 benzothiazoles⁷⁸. 5-chloro-2-(2,4-dichlorophenoxy)aniline is an analogue of triclosan (a hydroxyl group is replaced by an amino group) and is an intermediate in the synthesis of other triclosan analogues, and it has significant potency against drug-resistant strains of the human malaria parasite Plasmodium falciparum⁹⁴.

4. CONCLUSIONS

We have developed a virtual screening protocol to identify potential AhR ligands among a set of industrial chemicals, and the protocol was validated by successfully identifying chemicals that are known to be AhR agonists and/or antagonists. Overall, relatively few novel compounds were identified as potential AhR ligands among the 6,445 industrial chemicals studied here. Only 41 were identified by at least two of the three parallel steps, and this suggests that among these industrial chemicals only a small fraction are able to induce AhR-mediated effects and that there are very few compounds that resemble the most well-studied AhR ligands. The multivariate analysis of 214 known AhR modulators identified in the existing literature also showed that most of these compounds belong
to the typical chemical classes related to AhR activation. Of these 214 known modulators, approximately 40 compounds had more flexible/aliphatic, larger, and more complex structures (e.g. compounds with atom types other than carbon, hydrogen, halogens, and oxygen). The structure-based screening, which consisted of molecular docking using a rat AhR homology model, emphasized the versatile nature of AhR and indicated that AhR interacts with both non-aromatic polycyclic compounds and aromatic-substituted hydrocarbons. The 41 novel compounds identified as potential AhR ligands are prime candidates for further in vitro studies. In particular, we strongly suggest detailed studies on the four consensus compounds that, to our knowledge, have not been tested for AhR-mediated effects, namely 2-chlorotrityl chloride, nisoldipine, 3-p-hydroxyanilino-carbazole, and 3-(2-chloro-4-nitrophenyl)-5-(1,1-dimethylethyl)-1,3,4-oxadiazol-2(3H)-one.

FUNDING INFORMATION

This work was financially supported by the European Union through the project SYSTEQ (226694-FP7-ENV-2008-1).

ACKNOWLEDGEMENTS

The authors thank Professor Mats Tysklind (Umeå University) for fruitful discussions in the initial planning of the study.

SUPPORTING INFORMATION

Additional information on the selection of the 66 known AhR binders, the PCA of the 214 known AhR modulators, the comparisons of EROD and CALUX data, the histograms of ΔGbind-values from the rescoring, the PCA of the 429 industrial chemicals and AhR binders after the initial screening step, representative structures from the parallel virtual screening steps, and the structures of the final selection of the 41 potential AhR ligands. The supporting material also includes a file of datasets, including the median relative effect potencies (REPs) for the AhR modulators, the Simplified Molecular Input Line Entry Specification (SMILES) of the AhR modulators, the AhR binders, the
subsets of industrial chemicals after the different steps in the virtual screening protocol, and of the final list of potential AhR ligands.

ABBREVIATIONS

2378-TCDD, 2,3,7,8-tetrachlorodibenzo-p-dioxin; 23479-PeCDD, 2,3,4,7,9-pentachlorodibenzo-p-dioxin; 124678-HxCDD, 1,2,4,6,7,8-hexachlorodibenzo-p-dioxin; AhR, aryl hydrocarbon receptor; BaP, benzo-a-pyrene; BDE99, 2,2’,4,4’,5-pentabromodiphenylether; CAS, Chemical Abstracts Service; DNF, dinaphtho[1,2-b;1’2’-d]furan; DR-CALUX, dioxin responsive chemically activated luciferase expression; DModX, Distance to the Model; ED, Euclidean distance; EROD, ethoxyresorufin-O-deethylase; FICZ, 6-formylindolo[3,2-b]carbazole; H/LPVC, high and low production volume chemicals; ITE, 2-(1’H-indole-3’-carbonyl)-thiazole-4-carboxylic acid ester; LBD, ligand binding domain; MACCS, Molecular ACCess System; MM-GBSA, molecular mechanics generalized Born/surface area; PC, principal component; PCA, principal component analysis; HOMO, highest occupied molecular orbital; LUMO, lowest unoccupied molecular orbital; PCB, polychlorinated biphenyl; PCB126, 3,3’,4,4’,5-pentachlorobiphenyl; PCB157, 2,3,3’,4,4’,5’-hexachlorobiphenyl; PCB77, 3,3',4,4'-tetrachlorobiphenyl; PCDD/Fs, polychlorinated dibenzo-p-dioxins and dibenzofurans; pp’-DDT, p,p'-dichlorodiphenyltrichloroethane; R2X, explained variation in each PC; REACH, Registration, Evaluation, Authorisation and restriction of Chemicals; REP, relative effect potency; SMILES, simplified molecular-input line-entry system; TC, Tanimoto coefficient.
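As a brief illustration of the Tanimoto coefficient (TC) used for the structural fingerprint comparison, the sketch below computes TC for binary fingerprints represented as sets of on-bit indices and applies a similarity threshold. It assumes the standard definition TC = c / (a + b − c), where a and b are the numbers of bits set in each fingerprint and c is the number of bits they share. The function names and the toy bit sets are hypothetical, and the default threshold of 0.60 simply mirrors the TC ≥ 0.60 criterion mentioned in the Results; this is not the software used in the study.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient for two fingerprints given as iterables of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def best_match(query_bits, reference_fps, threshold=0.60):
    """Return (name, TC) of the most similar reference, or None if below threshold."""
    name, score = max(
        ((ref, tanimoto(query_bits, bits)) for ref, bits in reference_fps.items()),
        key=lambda pair: pair[1],
    )
    return (name, score) if score >= threshold else None


# Hypothetical toy fingerprints (not real MACCS keys or study compounds).
references = {"ref_A": {1, 2, 3, 4, 5}, "ref_B": {10, 11, 12}}
query = {1, 2, 3, 4, 6}

hit = best_match(query, references)       # shares 4 of 6 distinct bits with ref_A
miss = best_match({20, 21}, references)   # shares no bits with any reference
print(hit, miss)
```

Representing fingerprints as sets of on-bit indices keeps the example free of external dependencies; a cheminformatics toolkit would normally supply both the fingerprints and the similarity function.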

REFERENCES

(1) Safe, S. (1990) Polychlorinated biphenyls (PCBs), dibenzo-para-dioxins (PCDDs), dibenzofurans (PCDFs), and related compounds - environmental and mechanistic considerations which support the development of toxic equivalency factors (TEFs). Cr. Rev. Toxicol. 21, 51-88.

(2) Sorg, O., Zennegg, M., Schmid, P., Fedosyuk, R., Valikhnovskyi, R., Gaide, O., Kniazevych, V., Saurat, J.H. (2009) 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) poisoning in Victor Yushchenko: identification and measurement of TCDD metabolites. Lancet 374, 1179-1185.

(3) Van den Berg, M., Birnbaum, L., Bosveld, A.T.C., Brunstrom, B., Cook, P., Feeley, M., Giesy, J.P., Hanberg, A., Hasegawa, R., Kennedy, S.W., Kubiak, T., Larsen, J.C., Van Leeuwen, F.X.R., Liem, A.K.D., Nolt, C., Peterson, R.E., Poellinger, L., Safe, S., Schrenk, D., Tillitt, D., Tysklind, M., Younes, M., Waern, F., Zacharewski, T. (1998) Toxic equivalency factors (TEFs) for PCBs, PCDDs, PCDFs for humans and wildlife. Environ. Health Perspect. 106, 775-792.

(4) Van den Berg, M., Birnbaum, L.S., Denison, M., De Vito, M., Farland, W., Feeley, M., Fiedler, H., Hakansson, H., Hanberg, A., Haws, L., Rose, M., Safe, S., Schrenk, D., Tohyama, C., Tritscher, A., Tuomisto, J., Tysklind, M., Walker, N., Peterson, R.E. (2006) The 2005 World Health Organization Reevaluation of Human and Mammalian Toxic Equivalency Factors for Dioxins and Dioxin-Like Compounds. Toxicol. Sci. 93, 223-241.

(5) Landi, M.T., Consonni, D., Patterson, D.G., Needham, L.L., Lucier, G., Brambilla, P., Cazzaniga, M.A., Mocarelli, P., Pesatori, A.C., Bertazzi, P.A., Caporaso, N.E. (2002) Half-lives of tetra-, penta-,
hexa-, hepta-, and octachlorodibenzo-p-dioxin in rats, monkeys, and humans - a critical review. Chemosphere 48, 631-644.

(6) Denison, M.S., Nagy, S.R. (2003) Activation of the aryl hydrocarbon receptor by structurally diverse exogenous and endogenous chemicals. Annu. Rev. Pharmacolog. 43, 309-334.

(7) Klinge, C.M., Bowers, J.L., Kulakosky, P.C., Kamboj, K.K., Swanson, H.I. (1999) The aryl hydrocarbon receptor (AHR)/AHR nuclear translocator (ARNT) heterodimer interacts with naturally occurring estrogen response elements. Mol. Cell. Endocrinol. 157, 105-119.

(8) Ohtake, F., Fujii-Kuriyama, Y., Kawajiri, K., Kato, S. (2011) Cross-talk of dioxin and estrogen receptor signals through the ubiquitin system. J. Steroid Biochem. Mol. Biol. 127, 102-107.

(9) Swedenborg, E., Pongratz, I. (2010) AhR and ARNT modulate ER signaling. Toxicol. 268, 132-138.

(10) Bessede, A., Gargaro, M., Pallotta, M.T., Matino, D., Servillo, G., Brunacci, C., Bicciato, S., Mazza, E.M.C., Macchiarulo, A., Vacca, C., Iannitti, R., Tissi, L., Volpi, C., Belladonna, M.L., Orabona, C., Bianchi, R., Lanz, T.V., Platten, M., Della Fazia, M.A., Piobbico, D., Zelante, T., Funakoshi, H., Nakamura, T., Gilot, D., Denison, M.S., Guillemin, G.J., DuHadaway, J.B., Prendergast, G.C., Metz, R., Geffard, M., Boon, L., Pirro, M., Iorio, A., Veyret, B., Romani, L.,
Grohmann, U., Fallarino, F., Puccetti, P. (2014) Aryl hydrocarbon receptor control of a disease tolerance defence pathway. Nature 511, 184-190.

(11) Boule, L.A., Winans, B., Lawrence, B.P. (2014) Effects of Developmental Activation of the AhR on CD4(+) T-Cell Responses to Influenza Virus Infection in Adult Mice. Environ. Health Perspect. 122, 1201-1208.

(12) Frawley, R., DeVito, M., Walker, N.J., Birnbaum, L., White, K., Jr., Smith, M., Maynor, T., Recio, L., Germolec, D. (2014) Relative Potency for Altered Humoral Immunity Induced by Polybrominated and Polychlorinated Dioxins/Furans in Female B6C3F1/N Mice. Toxicol. Sci. 139, 488-500.

(13) Heilmann, C., Budtz-Jorgensen, E., Nielsen, F., Heinzow, B., Weihe, P., Grandjean, P. (2010) Serum Concentrations of Antibodies Against Vaccine Toxoids in Children Exposed Perinatally to Immunotoxicants. Environ. Health Perspect. 118, 1434-1438.

(14) Hochstenbach, K., van Leeuwen, D.M., Gmuender, H., Gottschalk, R.W., Stolevik, S.B., Nygaard, U.C., Lovik, M., Granum, B., Namork, E., Meltzer, H.M., Kleinjans, J.C., van Delft, J.H.M., van Loveren, H. (2012) Toxicogenomic Profiles in Relation to Maternal Immunotoxic Exposure and Immune Functionality in Newborns. Toxicol. Sci. 129, 315-324.

(15) Stølevik, S.B., Nygaard, U.C., Namork, E., Haugen, M., Meltzer, H.M., Alexander, J., Knutsen, H.K., Aaberge, I., Vainio, K., van Loveren, H., Løvik, M., Granum, B. (2013) Prenatal exposure to polychlorinated biphenyls and dioxins from the maternal diet may be associated with immunosuppressive effects that persist into early childhood. Food Chem. Toxicol. 51, 165-172.

(16) Winans, B., Humble, M.C., Lawrence, B.P. (2011) Environmental toxicants and the developing immune system: A missing link in the global battle against infectious disease? Reprod. Toxicol. 31, 327-336.

(17) REACH, 2007. European Parliament and Council Regulation (EC) No 1907/2006 concerning the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH). Available at: http://ec.europa.eu/growth/sectors/chemicals/reach/index_en.htm [2016-09-08].

(18) Berg, E.L., Polokoff, M.A., O'Mahony, A., Nguyen, D., Li, X. (2015) Elucidating mechanisms of toxicity using phenotypic data from primary human cell systems - a chemical biology approach for thrombosis-related side effects. Int. J. Mol. Sci. 16, 1008-1029.

(19) Dix, D.J., Houck, K.A., Martin, M.T., Richard, A.M., Setzer, R.W., Kavlock, R.J. (2007) The ToxCast Program for Prioritizing Toxicity Testing of Environmental Chemicals. Toxicol. Sci. 95, 5-12.

(20) EPA, Tox21. United States Environmental Protection Agency (EPA). Available at: http://www.epa.gov/ncct/Tox21/ [2016-09-08].

(21) EPA, ToxCast. United States Environmental Protection Agency (EPA). Available at: http://www.epa.gov/ncct/toxcast/ [2016-09-08].

(22) Filer, D., Patisaul, H.B., Schug, T., Reif, D., Thayer, K. (2014) Test driving ToxCast: endocrine profiling for 1858 chemicals included in phase II. Curr. Opin. Pharmacol. 19, 145-152.

(23) Judson, R.S., Houck, K.A., Kavlock, R.J., Knudsen, T.B., Martin, M.T., Mortensen, H.M., Reif, D.M., Rotroff, D.M., Shah, I., Richard, A.M., Dix, D.J. (2010) In Vitro Screening of Environmental Chemicals for Targeted Testing Prioritization: The ToxCast Project. Environ. Health Perspect. 118, 485-492.

(24) Rotroff, D.M., Beam, A.L., Dix, D.J., Farmer, A., Freeman, K.M., Houck, K.A., Judson, R.S., LeCluyse, E.L., Martin, M.T., Reif, D.M. (2010) Xenobiotic-metabolizing enzyme and transporter gene expression in primary cultures of human hepatocytes modulated by ToxCast chemicals. J. Tox. Env. Health, Part B 13, 329-346.

(25) Rotroff, D.M., Dix, D.J., Houck, K.A., Knudsen, T.B., Martin, M.T., McLaurin, K.W., Reif, D.M., Crofton, K.M., Singh, A.V., Xia, M., Huang, R., Judson, R.S. (2013) Using in vitro high
throughput screening assays to identify potential endocrine-disrupting chemicals. Environ. Health Perspect. 121, 7-14.

(26) Oki, N.O., Edwards, S.W. (2016) An integrative data mining approach to identifying adverse outcome pathway signatures. Toxicol. 350, 49-61.

(27) Richard, A.M., Judson, R.S., Houck, K.A., Grulke, C.M., Volarath, P., Thillainadarajah, I., Yang, C., Rathman, J., Martin, M.T., Wambaugh, J.F. (2016) ToxCast chemical landscape: Paving the road to 21st century toxicology. Chem. Res. Toxicol. 29, 1225-1251.

(28) Gohlke, H., Klebe, G. (2002) Approaches to the Description and Prediction of the Binding Affinity of Small-Molecule Ligands to Macromolecular Receptors. Angew. Chem. Int. Ed. 41, 2644-2676.

(29) Kouskoumvekaki, I., Petersen, R.K., Fratev, F., Taboureau, O., Nielsen, T.E., Oprea, T.I., Sonne, S.B., Flindt, E.N., Jonsdottir, S.O., Kristiansen, K. (2013) Discovery of a Novel Selective PPAR gamma Ligand with Partial Agonist Binding Properties by Integrated in Silico/in Vitro Work Flow. J. Chem. Inf. Model. 53, 923-937.

(30) AlQudah, D.A., Zihlif, M.A., Taha, M.O. (2016) Ligand-based modeling of diverse aryalkylamines yields new potent P-glycoprotein inhibitors. Eur. J. Med. Chem. 110, 204-223.

(31) Ai, G., Tian, C., Deng, D., Fida, G., Chen, H., Ma, Y., Ding, L., Gu, Y. (2015) A combination of 2D similarity search, pharmacophore, and molecular docking techniques for the identification of vascular endothelial growth factor receptor-2 inhibitors. Anti-Cancer Drugs 26, 399-409.

(32) Xie, H., Qiu, K., Xie, X. (2014) 3D QSAR studies, pharmacophore modeling and virtual screening on a series of steroidal aromatase inhibitors. Int. J. Mol. Sci. 15, 20927-20947.

(33) Cross, S., Baroni, M., Goracci, L., Cruciani, G. (2012) GRID-based three-dimensional pharmacophores I: FLAPpharm, a novel approach for pharmacophore elucidation. J. Chem. Inf. Model. 52, 2587-2598.

(34) Svensson, F., Karlén, A., Sköld, C. (2011) Virtual screening data fusion using both structure-and ligand-based methods. J. Chem. Inf. Model. 52, 225-232.

(35) Swann, S.L., Brown, S.P., Muchmore, S.W., Patel, H., Merta, P., Locklear, J., Hajduk, P.J. (2011) A unified, probabilistic framework for structure-and ligand-based virtual screening. J. Med. Chem. 54, 1223-1232.

(36) Bisson, W.H., Koch, D.C., O'Donnell, E.F., Khalil, S.M., Kerkvliet, N.I., Tanguay, R.L., Abagyan, R., Kolluri, S.K. (2009) Modeling of the Aryl Hydrocarbon Receptor (AhR) Ligand Binding Domain and Its Utility in Virtual Ligand Screening to Predict New AhR Ligands. J. Med. Chem. 52, 5635-5641.

(37) Kitchen, D.B., Decornez, H., Furr, J.R., Bajorath, J. (2004) Docking and scoring in virtual screening for drug discovery: methods and applications. Nature rev. Drug Disc. 3, 935-949.

(38) Spyrakis, F., Cavasotto, C.N. (2015) Open challenges in structure-based virtual screening: Receptor modeling, target flexibility consideration and active site water molecules description. Arch. Biochem. Biophys. 583, 105-119.

(39) Lo Piparo, E., Koehler, K., Chana, A., Benfenati, E. (2006) Virtual screening for aryl hydrocarbon receptor binding prediction. J. Med. Chem. 49, 5702-5709.

(40) Motto, I., Bordogna, A., Soshilov, A.A., Denison, M.S., Bonati, L. (2011) New Aryl Hydrocarbon Receptor Homology Model Targeted To Improve Docking Reliability. J. Chem. Inf. Model. 51, 2868-2881.

(41) Pandini, A., Soshilov, A.A., Song, Y., Zhao, J., Bonati, L., Denison, M.S. (2009) Detection of the TCDD binding-fingerprint within the Ah receptor ligand binding domain by structurally driven mutagenesis and functional analysis. Biochem. 48, 5972-5983.

(42) Baber, J.C., Shirley, W.A., Gao, Y., Feher, M. (2006) The use of consensus scoring in ligand-based virtual screening. J. Chem. Inf. Model. 46, 277-288.

(43) Willett, P., Barnard, J.M., Downs, G.M. (1998) Chemical similarity searching. J. Chem. Inf. Comp. Sci. 38, 983-996.

(44) Hu, W., Sorrentino, C., Denison, M.S., Kolaja, K., Fielden, M.R. (2007) Induction of Cyp1a1 is a nonspecific biomarker of aryl hydrocarbon receptor activation: Results of large scale screening of pharmaceuticals and toxicants in vivo and in vitro. Mol. Pharm. 71, 1475-1486.

(45) Petrik, J., Drobna, B., Pavuk, M., Jursa, S., Wimmerova, S., Chovancova, J. (2006) Serum PCBs and organochlorine pesticides in Slovakia: Age, gender, and residence as determinants of organochlorine concentrations. Chemosphere 65, 410-418.

(46) Rannar, S., Andersson, P.L. (2010) A Novel Approach Using Hierarchical Clustering To Select Industrial Chemicals for Environmental Impact Assessment. J. Chem. Inf. Model. 50, 30-36.

(47) Halgren, T.A. (1999) MMFF VII. Characterization of MMFF94, MMFF94s, and other widely available force fields for conformational energies and for intermolecular-interaction energies and geometries. J. Comput. Chem. 20, 730-748.

(48) Molecular operating environment (MOE) 2012.10. Chemical Computing Group, Quebec, Canada.

(49) Dewar, M.J.S., Zoebisch, E.G., Healy, E.F., Stewart, J.J.P. (1985) The Development and Use of Quantum-Mechanical Molecular-Models .76. AM1 - a New General-Purpose Quantum-Mechanical Molecular-Model. J. Am. Chem. Soc. 107, 3902-3909.

(50) MATLAB version R2012a. The MathWorks Inc., Natick, Massachusetts, U.S.A.

(51) Balaban, A.T. (1979) Chemical Graphs .34. Five New Topological Indexes for the Branching of Tree-like Graphs. Theor. Chem. Acta 53, 355-375.

(52) Hall, L.H., Kier, L.B., 1991. The Molecular Connectivity Chi Indexes and Kappa Shape Indexes in Structure-Property Modeling. John Wiley & Sons, Inc., Hoboken, NJ, USA.

(53) Sauer, W.H.B., Schwarz, M.K. (2003) Molecular shape diversity of combinatorial libraries: A prerequisite for broad bioactivity. J. Chem. Inf. Comp. Sci. 43, 987-1003.

(54) SIMCA 13.0. Umetrics AB, Umeå, Sweden, 2009.

(55) MACCS Keys. MDL Information Systems, Inc., 14600 Catalina Street, San Leandro, CA 94577.


(56) Scheuermann, T.H., Tomchick, D.R., Machius, M., Guo, Y., Bruick, R.K., Gardner, K.H. (2009) Artificial ligand binding within the HIF2α PAS-B domain of the HIF2 transcription factor. Proc. Natl. Acad. Sci. 106, 450-455.

(57) Key, J., Scheuermann, T.H., Anderson, P.C., Daggett, V., Gardner, K.H. (2009) Principles of Ligand Binding within a Completely Buried Cavity in HIF2 alpha PAS-B. J. Am. Chem. Soc. 131, 17647-17654.

(58) Schrödinger Release 2014-3: MacroModel. Schrödinger, LLC. 120 West 45th Street, 17th Floor, Tower 45, New York, NY, 10036-4041.

(59) Friesner, R.A., Banks, J.L., Murphy, R.B., Halgren, T.A., Klicic, J.J., Mainz, D.T., Repasky, M.P., Knoll, E.H., Shelley, M., Perry, J.K., Shaw, D.E., Francis, P., Shenkin, P.S. (2004) Glide: A new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J. Med. Chem. 47, 1739-1749.

(60) Schrödinger Release 2014-3: Glide. Schrödinger, LLC, New York, NY, 2014.

(61) Schrödinger Release 2014-3: Prime. Schrödinger, LLC, New York, NY, 2014.


(62) Piskorska-Pliszczynska, J., Keys, B., Safe, S., Newman, M.S. (1986) The cytosolic receptor-binding affinities and AHH induction potencies of 29 polynuclear aromatic-hydrocarbons. Toxicol. Lett. 34, 67-74.

(63) Jogalekar, A.S., Reiling, S., Vaz, R.J. (2010) Identification of optimum computational protocols for modeling the aryl hydrocarbon receptor (AHR) and its interaction with ligands. Bioorg. Med. Chem. Lett. 20, 6616-6619.

(64) Canvas 2.5. Schrödinger, LLC. 120 West 45th Street, 17th Floor, Tower 45, New York, NY, 10036-4041.

(65) Williams, W., Lance, G. A general theory of classification sorting strategies: 1. Hierarchical systems, 2. Clustering systems. Comput. J. 9.

(66) Denison, M.S., Soshilov, A.A., He, G., DeGroot, D.E., Zhao, B. (2011) Exactly the Same but Different: Promiscuity and Diversity in the Molecular Mechanisms of Action of the Aryl Hydrocarbon (Dioxin) Receptor. Toxicol. Sci. 124, 1-22.

(67) Zhang, J., Zhang, X., Xia, P., Zhang, R., Wu, Y., Xia, J., Su, G., Zhang, J., Giesy, J.P., Wang, Z. (2016) Activation of AhR-mediated toxicity pathway by emerging pollutants polychlorinated diphenyl sulfides. Chemosphere 144, 1754-1762.


(68) Ohura, T., Morita, M., Makino, M., Amagai, T., Shimoi, K. (2007) Aryl hydrocarbon receptor-mediated effects of chlorinated polycyclic aromatic hydrocarbons. Chem. Res. Toxicol. 20, 1237-1241.

(69) Gujral, R.S., 2014. Process for preparing isoxazolyl penicillins. Vardhman Chemtech Limited, India. WO2014072843A1. CAPLUS AN 2014:786600 (Patent).

(70) Li, X., Zhang, Y., Hao, X., Guo, X., Jin, S., An, X., Han, Z., Liu, M., Li, Z., Liu, X., Jiao, W., Sun, Z., 2011. Method for preparation of crystal Cloxacillin sodium. Hebei Huari Pharmaceuticals Co., Ltd., Peop. Rep. China. CN102070653A. CAPLUS AN 2011:662696 (Patent).

(71) Wall, R.J., He, G., Denison, M.S., Congiu, C., Onnis, V., Fernandes, A., Bell, D.R., Rose, M., Rowlands, J.C., Balboni, G., Mellor, I.R. (2012) Novel 2-amino-isoflavones exhibit aryl hydrocarbon receptor agonist or antagonist activity in a species/cell-specific context. Toxicol. 297, 26-33.

(72) Zaazaa, H.E., Mohamed, A.O., Hawwam, M.A., Abdelkawy, M. (2015) Spectrofluorimetric determination of 3-methylflavone-8-carboxylic acid, the main active metabolite of flavoxate hydrochloride in human urine. Spectrochim. Acta, Part A 134, 109-113.

(73) Arnold, M., Murgatroyd, A., Grund, C., Goerlitz, G., Liebig, T., 2012. Disperse dye mixtures, their preparation and use. DyStar Colours Distribution GmbH, Germany. WO2012095284A1. CAPLUS AN 2012:1042720 (Patent).


(74) Moroi, Y., Kitagawa, M., Itoh, H. (1992) Aqueous solubility and acidity constants of cholic, deoxycholic, chenodeoxycholic, and ursodeoxycholic acids. J. Lipid. Res. 33, 49-53.

(75) Schaldach, C.M., Riby, J., Bjeldanes, L.F. (1999) Lipoxin A(4): A new class of ligand for the Ah receptor. Biochem. 38, 7594-7600.

(76) Jinno, A., Maruyama, Y., Ishizuka, M., Kazusaka, A., Nakamura, A., Fujita, S. (2006) Induction of cytochrome P450-1A by the equine estrogen equilenin, a new endogenous aryl hydrocarbon receptor ligand. J. Steroid Biochem. Mol. Biol. 98, 48-55.

(77) de Tomaso Portaz, A.C., Caimi, G.R., Sánchez, M., Chiappini, F., Randi, A.S., de Pisarev, D.L.K., Alvarez, L. (2015) Hexachlorobenzene induces cell proliferation, and aryl hydrocarbon receptor expression (AhR) in rat liver preneoplastic foci, and in the human hepatoma cell line HepG2. AhR is a mediator of ERK1/2 signaling, and cell cycle regulation in HCB-treated HepG2 cells. Toxicol. 336, 36-47.

(78) He, G., Zhao, B., Denison, M.S. (2011) Identification of benzothiazole derivatives and polycyclic aromatic hydrocarbons as aryl hydrocarbon receptor agonists present in tire extracts. Environ. Toxicol. Chem. 30, 1915-1925.

(79) ECHA, European Chemicals Agency. Registered Substances Database. Available at http://echa.europa.eu/web/guest/information-on-chemicals/registered-substances [2016-09-08].


(80) Wojtowicz, A.K., Honkisz, E., Zieba-Przybylska, D., Milewicz, T., Kajta, M. (2011) Effects of two isomers of DDT and their metabolite DDE on CYP1A1 and AhR function in human placental cells. Pharmacol. Res. 63, 1460-1468.

(81) Hamers, T., Kamstra, J.H., Sonneveld, E., Murk, A.J., Kester, M.H.A., Andersson, P.L., Legler, J., Brouwer, A. (2006) In vitro profiling of the endocrine-disrupting potency of brominated flame retardants. Toxicol. Sci. 92, 157-173.

(82) Ahn, K.C., Zhao, B., Chen, J., Cherednichenko, G., Sanmarti, E., Denison, M.S., Lasley, B., Pessah, I.N., Kultz, D., Chang, D.P.Y., Gee, S.J., Hammock, B.D. (2008) In vitro biologic activities of the antimicrobials triclocarban, its analogs, and triclosan in bioassay screens: Receptor-based bioassay screens. Environ. Health Perspect. 116, 1203-1210.

(83) Arai, K., Homma, T., Mizuno, M., 2015. Drug for preventing or treating hypertension or disease caused by hypertension. Daiichi Sankyo Company, Limited, Japan. WO2015012205A1. CAPLUS AN 2015:165663 (Patent).

(84) Gidwani, R.M., Jain, N.J., Vishwanath, M.H., 2002. Preparation of 3-aryl-1,3,4-oxadiazol-2(3H)-one derivatives as herbicide synthesis intermediates. Gharda Chemicals Ltd., India. IN188913A1. CAPLUS AN 2006:1163075 (Patent).


(85) Pilgram, K.H.G., 1979. Herbicidal anilide derivative. Shell Internationale Research Maatschappij B. V., Netherlands. BR7900088A. CAPLUS AN 1980:514541 (Patent).

(86) Moser, P., Sallmann, A., Wiesenberg, I. (1990) Synthesis and quantitative structure-activity relationships of diclofenac analogs. J. Med. Chem. 33, 2358-2368.

(87) Marini, S., Grasso, E., Longo, V., Puccini, P., Riccardi, B., Gervasi, P.G. (2003) 4-Biphenylaldehyde and 9-anthraldehyde: two fluorescent substrates for determining P450 enzyme activities in rat and human. Xenobiotica 33, 1-11.

(88) Liphardt, B., Luettke, W. (1981) Laser dyes. I. Bifluorophoric laser dyes for increase of the efficiency of dye lasers. Liebigs Ann. Chem., 1118-1138.

(89) Fan, Y., Xu, J., Jian, W., Chen, M., 2010. Environmental friendly dark blue and black disperse dye compositions. Jiangsu Jihua Chemical Co., Ltd., Peop. Rep. China. CN101792615A. CAPLUS AN 2010:982508 (Patent).

(90) Ricking, M., Schwarzbauer, J. (2012) DDT isomers and metabolites in the environment: an overview. Environ. Chem. Lett. 10, 317-323.


(91) Ishihara, A., Sawatsubashi, S., Yamauchi, K. (2003) Endocrine disrupting chemicals: interference of thyroid hormone binding to transthyretins and to thyroid hormone receptors. Mol. Cell. Endocrinol. 199, 105-117.

(92) Fent, K., Chew, G., Li, J., Gomez, E. (2014) Benzotriazole UV-stabilizers and benzotriazole: Antiandrogenic activity in vitro and activation of aryl hydrocarbon receptor pathway in zebrafish eleuthero-embryos. Sci. Total Environ. 482-483, 125-136.

(93) Wik, A., Dave, G. (2009) Occurrence and effects of tire wear particles in the environment – A critical review and an initial risk assessment. Environ. Pollut. 157, 1-11.

(94) Anderson, J.W., Sarantakis, D., Terpinski, J., Kumar, T.R.S., Tsai, H.-C., Kuo, M., Ager, A.L., Jacobs, W.R., Jr., Schiehser, G.A., Ekins, S., Sacchettini, J.C., Jacobus, D.P., Fidock, D.A., Freundlich, J.S. (2013) Novel diaryl ureas with efficacy in a mouse model of malaria. Bioorg. Med. Chem. Lett. 23, 1022-1025.


Figure legends:

Figure 1. Molecular structures of selected compounds that have shown AhR-mediated effects in vitro (AhR modulators, 1–12), some of which have been shown to bind competitively to AhR (AhR binders, 1–9). Numbers 1, 2, 3, and 9 are examples of typical AhR ligands, i.e. small, rigid, aromatic compounds. Numbers 4, 5, 6, 7, and 8 are atypical AhR ligands, i.e. rather flexible aromatic compounds that contain additional functional groups and atom types compared to the typical AhR ligands. 1) 2,3,7,8-tetrachloro-dibenzo-p-dioxin (2378-TCDD), 2) 3,3’,4,4’,5-pentachloro-biphenyl (PCB 126), 3) benzo[a]pyrene (BaP), 4) 6-formylindolo[3,2-b]carbazole (FICZ), 5) 2-(1’H-indole-3’-carbonyl)-thiazole-4-carboxylic acid ester (ITE), 6) flutamide, 7) leflunomide, 8) nimodipine, 9) beta-naphthoflavone (BNF), 10) dinaphtho[1,2-b;1’,2’-d]furan (DNF), 11) , 12) bilirubin.

Figure 2. Score plot of the first and second principal components for the 214 AhR modulators identified in the literature, calculated from 68 2D descriptors and 18 3D descriptors. The 66 known AhR binders are shown as red circles, and the remaining AhR modulators are shown as blue stars. The numbers refer to the molecular structures given in Figure 1.

Figure 3. Flowchart of the virtual screening protocol for identifying AhR ligands among a set of 6,445 industrial chemicals. The numbers next to the boxes correspond to the total number of compounds remaining at each step. The steps comprised an initial filter using principal component analysis followed by three parallel steps: two similarity measurements based on the 66 known AhR binders (nearest neighbor analysis and structural fingerprints) and molecular docking with an AhR homology model.

Figure 4. Cartoon representation of the AhR homology model showing the rather small and buried cavity occupied by the docked 2,3,7,8-tetrachlorodibenzo-p-dioxin. The cavity is mostly hydrophobic (depicted as brown sticks) but has a few polar amino-acid residues (depicted as blue sticks).
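As a rough illustration of the fingerprint-similarity step named in the Figure 3 legend, the sketch below scores a candidate against a set of reference binders and keeps it if its best match exceeds a cutoff. The Tanimoto coefficient, the representation of a fingerprint as a set of "on" bit indices, the toy data, and the 0.7 cutoff are all assumptions made for illustration; the legend does not state which similarity measure or threshold was used.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def max_similarity(candidate, references):
    """Highest Tanimoto similarity of a candidate to any reference fingerprint."""
    return max(tanimoto(candidate, ref) for ref in references)


# Hypothetical toy fingerprints; real structural keys would have many more bits.
known_binders = [frozenset({1, 4, 7, 9}), frozenset({2, 4, 8})]
candidates = {
    "chem_A": frozenset({1, 4, 7}),
    "chem_B": frozenset({3, 5, 6}),
}

CUTOFF = 0.7  # illustrative value only, not taken from the manuscript

hits = [
    name
    for name, fingerprint in candidates.items()
    if max_similarity(fingerprint, known_binders) >= CUTOFF
]
print(hits)
```

Taking the maximum over the reference set means a candidate is retained when it closely resembles any single known binder, which is one plausible reading of a similarity search against the 66 binders.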


Figure 5. The number of potential AhR ligands identified by the three parallel virtual screening methods: nearest neighbor analysis based on chemical/physical properties, structural fingerprints, and molecular docking with an AhR homology model. The 429 industrial chemicals that passed the initial filtering procedure were used as input for the three screening methods; the numbers in the figure give how many of these compounds the methods identified. A total of 41 compounds were identified as potential AhR ligands by at least two methods, and seven compounds were identified by all three methods.

Figure 6. Chemical structures of the seven consensus compounds identified by the virtual screening protocol. Triclosan (13), 2,2’,4,4’,5-pentabromodiphenyl ether (BDE99) (14), p,p’-dichlorodiphenyltrichloroethane (pp’-DDT) (15), 2-chlorotritylchloride (16), 3-p-hydroxy-anilinocarbazole (17), 3-(2-chloro-4-nitrophenyl)-5-(1,1-dimethylethyl)-1,3,4-oxadiazol-2(3H)-one (18), and nisoldipine (19). The number strings show the Chemical Abstracts Service (CAS) numbers of the corresponding compounds.
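The consensus counts described in the Figure 5 legend (compounds found by at least two methods, and by all three) amount to simple set bookkeeping. The following is a minimal sketch under that reading; the function name, method labels, and compound identifiers are hypothetical placeholders, not data from the study.

```python
from collections import Counter


def consensus(hit_sets):
    """Group compounds by how many screening methods flagged them.

    hit_sets maps a method name to the set of compound identifiers it flagged.
    """
    counts = Counter(
        compound for hits in hit_sets.values() for compound in hits
    )
    n_methods = len(hit_sets)
    return {
        "at_least_two": {c for c, n in counts.items() if n >= 2},
        "all_methods": {c for c, n in counts.items() if n == n_methods},
    }


# Hypothetical toy hit lists standing in for the three screening outputs.
toy_hits = {
    "nearest_neighbor": {"A", "B", "C"},
    "fingerprints": {"B", "C", "D"},
    "docking": {"C", "E"},
}

result = consensus(toy_hits)
print(sorted(result["at_least_two"]))  # ['B', 'C']
print(sorted(result["all_methods"]))   # ['C']
```

Applied to the real hit lists, the sizes of the two returned sets would correspond to the 41 and seven compounds reported in the legend.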

Figure 1. Structure drawings of compounds 1–12.

Figure 2. Score plot with axes t1 (horizontal) and t2 (vertical); compounds 1–12 are labelled.

Figure 3. Flowchart: Industrial Chemicals (6445); Initial filter based on structural and chemical properties (429); Structural Fingerprints, Nearest Neighbor Analysis, Molecular Docking (429 compounds entering each; 32, 93, and 177 remaining, in the order listed); Potential AhR ligands (7).

Figure 4. Image of the AhR homology model.

Figure 5. Diagram with the labels Nearest Neighbor, Molecular Docking, and Structural Fingerprints.

Figure 6. Structure drawings with CAS numbers: 13: 3380-34-5; 14: 32534-81-9; 15: 50-29-3; 16: 42074-68-0; 17: 86-72-6; 18: 31399-83-4; 19: 63675-72-9.