
Perspective

pubs.acs.org/jmc

How Beyond Rule of 5 Drugs and Clinical Candidates Bind to Their Targets

Bradley C. Doak, Jie Zheng, Doreen Dobritzsch, and Jan Kihlberg*

Department of Chemistry−BMC, Uppsala University, Box 576, SE-751 23 Uppsala, Sweden

Supporting Information

ABSTRACT: To improve discovery of drugs for difficult targets, the opportunities of chemical space beyond the rule of 5 (bRo5) were examined by retrospective analysis of a comprehensive set of structures for complexes between drugs and clinical candidates and their targets. The analysis illustrates the potential of compounds far beyond rule of 5 space to modulate novel and difficult target classes that have large, flat, and groove-shaped binding sites. However, ligand efficiencies are significantly reduced for flat- and groove-shaped binding sites, suggesting that adjustments of how to use such metrics are required. bRo5 ligands appear to benefit from an appropriate balance between rigidity and flexibility to bind with sufficient affinity to their targets, with macrocycles and nonmacrocycles being found to have similar flexibility. However, macrocycles were more disk- and spherelike, which may contribute to their superior binding to flat sites, while rigidification of nonmacrocycles leads to rodlike ligands that bind well to groove-shaped binding sites. These insights should contribute to altering perceptions of what targets are considered “druggable” and provide support for drug design in beyond rule of 5 space.

Received: August 18, 2015
Published: October 12, 2015
© 2015 American Chemical Society. DOI: 10.1021/acs.jmedchem.5b01286. J. Med. Chem. 2016, 59, 2312−2327

1. INTRODUCTION

Drug discovery is at a crossroads where ground-breaking advances in our understanding of how diseases develop are now made at an unprecedented pace. However, efficiency of drug discovery has continued to decline as the number of new drugs approved each year has essentially been constant during the past 30 years, while the costs of pharmaceutical development have increased dramatically.1−3 This decline has been attributed to a few fundamental issues, including a need to deliver first-in-class treatments for complex diseases while at the same time meeting increased demands for safety and efficacy.4 As a result there is high attrition in phase II and III clinical trials, mainly due to lack of efficacy and safety issues.2,5,6 Therefore, it has been emphasized that improved selection of targets that are associated with diseases is the single most important factor required to increase efficacy and deliver innovative medicines.2

During the past two decades the human genome and various other genomes have been mapped,7 and significant progress has been made toward mapping the human proteome.8,9 These rapid advances have made an increased number of potential drug targets accessible that belong to both established and novel target classes. Despite the advances in target identification, less than a quarter of recently approved drugs are directed against novel targets, and the majority of these drugs target established classes of G protein-coupled receptors (GPCRs), transporters, or enzymes.1,10 A limiting factor may be that approximately 3000 of the genes in the human genome have been estimated to be related to disease. Out of these only 600−1500 have been considered amenable for manipulation with “traditional” small molecule drugs,11 i.e., drugs that comply with the rule of 5 (Ro5) guidelines and are highly likely to be cell permeable and orally bioavailable. Still, it has been pointed out that large portions of well-established target classes such as ion channels, GPCRs, and nuclear receptors remain unexplored.10 However, an even larger number of targets from less explored and novel classes which are “difficult-to-drug” using Ro5 compliant compounds could provide significant, additional opportunities for drug discovery. For example, the human proteome8,9 is estimated to have 100 000−1 000 000 binary protein−protein interactions (PPIs)12,13 and may constitute one of the most important sources of novel targets for drug discovery. However, the proportion of the proteome and its massive number of PPIs that are involved in pathogenic mechanisms remains to be established. Even with that caveat, the recent and rapid developments in target identification urgently need to be matched by innovative approaches for modulating nontraditional target classes, such as PPIs.14,15

Targets currently classified as “difficult-to-drug” with Ro5 compliant ligands characteristically have binding sites that are large, highly lipophilic, or highly polar, flexible, flat, or featureless (i.e., contain few opportunities for molecular interactions such as hydrogen bond donors and acceptors).16−19 In addition, the perceived lack of oral drugs outside Ro5 space has led many to abandon these targets and classify them as “undruggable”. Thus, what initially appears as vast opportunities of novel targets emerging from advances in genomics and proteomics will, to a large extent, require small molecule drug discovery to move outside Ro5 space into what has been termed beyond Ro5 (bRo5) or “middle space”.20 Interestingly, recent analysis of drugs and clinical candidates that fall outside Ro5 space has shown that this space offers significant possibilities for discovery of orally bioavailable and cell permeable compounds, possibly more than previously thought.21 It can therefore be argued that a too strict implementation of the Ro5 may have hampered the pharmaceutical industry from seizing opportunities involving novel but more difficult targets.21−25

We and others have hypothesized the benefits of using bRo5 drugs for difficult targets,14,15,21,26,27 and examples and case studies have been reported in the literature. Here we present a comprehensive analysis of bRo5 drugs and clinical candidates that highlights their ability to modulate difficult targets, thereby expanding the number of targets for which we can design oral and parenteral drugs. First, we assessed what target classes current drugs and clinical candidates outside Ro5 space are directed toward in comparison to Ro5 compliant drugs. Analysis then focused on how drugs and clinical candidates outside Ro5 space bind to their targets based on crystal structures of 130 clinically relevant complexes, which were compared to drug−target complexes in Ro5 space. This allowed us to define to what extent binding site and ligand characteristics such as size, shape, molecular interactions, affinity, and ligand efficiencies differ between different drug spaces. The influence of conformational flexibility of the ligand and its shape was also investigated for compounds in beyond Ro5 space. The results are then discussed to provide guidance for design of bioactive small molecule drugs outside Ro5 space for difficult targets.

2. DRUGS AND CLINICAL CANDIDATES DATA SETS

To facilitate this in-depth analysis of how drugs and clinical candidates that do not comply with the Ro5 bind to their targets, a comprehensive data set of 475 drugs and clinical candidates with MW > 500 Da was classified by the compounds’ calculated physicochemical properties. They were then divided into two data sets where intuitive and natural divisions in the ligand property distributions appeared as previously reported,21 each representing different chemical spaces (Figure 1a). Two data sets of Ro5 compliant drugs were also compiled from ChEMBL28 and the recent literature10 for comparison during analysis (Figures 1, 2, and 3). In this analysis compounds in rule of 5 space adhere to all of Lipinski’s guidelines, whereas compounds that break one Ro5 guideline (MW = 500−700 Da) and also have other properties which may extend a short distance outside strict Ro5 space were classified as being in extended Ro5 space (eRo5).21 Finally, compounds in beyond Ro5 space (bRo5) all have MW > 500 Da and in addition have one or more properties outside the eRo5 ranges. They are thus far beyond Ro5 space but with an upper MW limit of 3000 Da set to exclude biologics such as insulin. The classification into eRo5 and bRo5 space is useful to completely separate compounds in Ro5 space from those that reside far away in bRo5 space and do not conform to its trends. Thus, eRo5 space may be thought of as a buffer zone between Ro5 and bRo5 space, representing the natural tail of the distribution of Ro5 drugs, in line with the original report of Lipinski,29 and the beginning of bRo5 space. The rationale for the eRo5 and bRo5 classification is further highlighted by the two data sets having mean quantitative estimates of druglikeness (QED) scores of 0.31 and 0.16, respectively, providing a single measure of the distance from traditional rule of 5 space. Both of these QED scores are significantly below 0.67 and 0.49, which are the mean values identified by medicinal chemists for compounds being “attractive” and “unattractive” for drug development, respectively.21,30 Our data set of drugs and clinical candidates was obtained by searching different databases for compounds with MW ranging from 500 to 3000 Da followed by filtering to remove contrast agents, veterinary products, etc.21 Therefore, some drugs and clinical candidates outside our strict definition of Ro5 space, i.e., those with MW < 500 Da and one of HBD > 5, HBA > 10, ClogP > 5, or ClogP < 0, have not been included in the analysis. We also highlight that some calculated properties are highly correlated to each other (e.g., HBA and PSA, rs = 0.89−0.94), as illustrated in the correlation tables and principal component analysis, which can be found in the Supporting Information (Figures S1 and S2). It is nonetheless useful to base the classification on these seemingly redundant properties to aid filtering and analysis in different situations, ranging from computer assisted to practical, “back of the envelope” calculations. The three data sets were then analyzed and compared extensively across different ligand−target interaction properties. Differences are described as being “significant” where statistically significantly different mean values were found (unpaired t test, with a p-value of <0.05); full details and p-values can be found in the Supporting Information but are also denoted in the figures.

The data set of 475 drugs and clinical candidates in eRo5 and bRo5 space that make up the current data set was previously curated and used to investigate oral bioavailability in eRo5 and bRo5 space.21 It was also classified with regard to chemical class, route of administration, and phase of development.21 This allows discussion of trends in drug development and demonstrates that de novo designed compounds are in the majority (43%), with equal numbers of natural products and peptides/peptidomimetics (26% each) across the full data set.21 The majority of de novo designed compounds are oral (64%), whereas natural products and in particular peptides/peptidomimetics are mainly parenteral (59% and 80%, respectively). Analyzing the data set by phase of development, chemical space, and chemical class demonstrates that de novo designed compounds dominate strongly in all clinical phases in eRo5 space and that the majority of them are intended for oral administration (Figure 1b). In bRo5 space peptides constitute the largest group across clinical candidates, with proportions of de novo designed compounds and natural products being only somewhat lower. Among phase II, phase III, and approved compounds, bRo5 natural products and peptides are mainly for parenteral administration, whereas the proportion of orals is >45% for de novo designed compounds. In summary, the drug discovery industry is focusing on development of de novo designed drug candidates for oral administration in eRo5 space while relying on all three chemical classes and more on parenteral delivery in bRo5 space. It is noteworthy that a significant number of compounds (161 in total) from all three chemical classes in bRo5 space are in clinical development, indicating a willingness to venture outside the Ro5. This is further supported by the emergence of a number of biotechnology companies that focus on this chemical space,27 often in partnership with larger pharmaceutical companies. In the current analysis we aim to globally assess bRo5 ligand−target interactions to define if moving far from traditional Ro5 space is warranted in efforts to
conquer difficult targets and for what target classes and types of binding sites compounds in bRo5 space provide advantages.

Figure 1. (a) Classification of 475 drugs and clinical candidates that have MW > 500 Da into extended rule of 5 (blue) and beyond rule of 5 (green) chemical space based on calculated physicochemical properties. (b) Development pipeline by chemical space and chemical class showing peptides/peptidomimetics (green), natural products and derivatives (blue), and de novo designed drugs and clinical candidates (red) by phase. Orals are in dark and parenterals in light colors, respectively.

3. TARGET CLASS MODULATION BY CHEMICAL SPACE

To analyze the target class preferences of drugs and clinical candidates in eRo5 and bRo5 space, the data set of 475 drugs and clinical candidates with MW > 500 Da21 was first classified using a similar taxonomy as employed to dissect trends and innovations in drug development for a large data set of approved drugs (Figure 2a).10 A Ro5 target class reference data set was obtained by filtering this large literature data set of approved drugs10 by all of the Ro5 guidelines and selection of only one target per drug. Drugs in eRo5 and bRo5 space showed interesting differences in their target class preferences compared to Ro5 compliant drugs (Figure 2b). For instance, an increased proportion of eRo5 and bRo5 drugs and clinical candidates modulate protease and kinase targets, which have been increasingly explored only in the past decade.10 As for the approved Ro5 compliant kinase inhibitors in our data set, tyrosine kinases were the largest subgroup targeted by eRo5 and bRo5 drugs and clinical candidates (42%). Kinase inhibitors currently in clinical trials originate from Ro5, as well as from eRo5 and bRo5 space, and no significant trends linking subgroups of kinases to a particular chemical space were apparent. Nevertheless it is clear that medicinal chemists are drawing from compounds outside traditional Ro5 space to target an expanding number of kinases. Structural and adhesion targets such as tubulin as well as transferases and isomerases are also more prevalent for eRo5 and/or bRo5 drugs and clinical candidates. In our analysis there is also a higher prevalence of bRo5 and eRo5 drugs and clinical candidates at “other” targets, a class consisting of antioxidants, vitamin and hormone replacements, orphan drugs, and other unclassifiable molecular targets. Moreover, as compared to Ro5 drugs, a smaller proportion of bRo5 drugs and clinical candidates bind to well-established target classes such as ion channels and nuclear receptors. Similar trends in target class preference are also observed for the oral only and approved only subsets of eRo5 and bRo5 drugs and clinical candidates (Supporting Information Figure S3). Importantly, a number of classes that are more frequently targeted by eRo5 and bRo5 drugs and clinical candidates, such as proteases, kinases, and transferases, are among those recently concluded to be underexplored in drug discovery.10

As different targets have been found to have different preferred ligand chemical spaces,31,32 we conclude that increased exploration of eRo5 and bRo5 space should be beneficial for future development of drugs for underexplored target classes. The discovery of protein kinase inhibitors, now commonly used in oncology, is an important example of how exploration of novel chemical space can expand what targets are considered “druggable”.33 It should also be noted that eRo5 and bRo5 compounds also appear to be among the most suitable
for modulating the increasing number of protein−protein interactions (PPIs) that are emerging as therapeutic targets.14,34

Figure 2. (a) Creation of target class preference data sets. The data set compiled by Rask-Andersen et al.10 was filtered by all the rule of 5 (Ro5) guidelines, and the primary target reported in the literature was selected for each drug to give the Ro5 data set. The primary targets reported in the literature for the full set of extended Ro5 (eRo5) and beyond Ro5 (bRo5) drugs and clinical candidates were classified using the same taxonomy. (b) Proportion of Ro5 drugs (red), eRo5 (blue), and bRo5 (green) drugs and clinical candidates modulating the indicated target classes. The proportion of compounds in each of the three chemical space data sets that modulate a specific target class (the number of compounds that modulate that target class divided by the total number in the data set, N) is shown by the vertical bars. Proportions were calculated to show differences in target preferences as the number of compounds differs significantly between the three data sets. Compounds that are in phase (P) I, II, or III or approved (App.) are shown by increasingly darker color shadings of the data sets. Targets are arranged by Ro5 target class preference from highest (left) to lowest (right). Alternative plots that show only approved, clinical, or orally bioavailable drugs and clinical candidates in eRo5 and bRo5 space, as well as the exact number of compounds in each category, are included in Supporting Information Figure S3.

4. CHARACTERIZATION OF DRUG−TARGET COMPLEXES BY CHEMICAL SPACE

4.1. Generating Drug−Target Structure Data Sets. To probe how drugs outside Ro5 space bind to their targets, the structural data for all available complexes of eRo5 and bRo5 drugs and clinical candidates with their targets were extracted by cross-referencing the 475 drugs and clinical candidates with the Protein Data Bank (PDB). In total, 93 drugs had crystal structures of relevant drug−target complexes that fulfilled the chosen quality requirements (20% of the data set, Figure 3). A reference set of 37 crystal structures of relevant Ro5 drug−target complexes was also obtained after clustering a Ro5 filtered ChEMBL drugs data set according to their physicochemical properties (Figure 3). In order to remove bias toward highly explored targets or drug classes and ensure that conclusions reflect the true variation between chemical spaces, redundant complexes between a target and other members of the same drug class were excluded. For example, the erythromycin A−ribosome complex 1JZY was used as representative of all erythronolide−ribosome complexes. This produced three nonredundant data sets of crystal structures of Ro5 drugs (N = 29), eRo5 (N = 26), and bRo5 (N = 22) drugs and clinical candidates bound to their targets (Table 1). An extensive set of additional annotated figures for all structures and the nonredundant data sets can be found in the Supporting Information along with statistical analysis.

4.2. Shape and Size of Binding Sites. The binding site shape of drug−target complexes can be assessed manually or with the aid of descriptors calculated from binding sites identified by automated methods. Two such methods, the recently described Difference of Gaussian Site (DoGSite)35,36 and MetaPocket 2.0,37 the latter of which is based on several algorithms, were used to analyze the three data sets of drug−target complexes. The threshold for successful identification of binding sites was set as >20% of the volume of the bound drug being covered by the calculated binding site. With this lenient cutoff only 54% and 43% of all bRo5 drug−target binding sites were successfully identified by DoGSite and MetaPocket, respectively (Supporting Information Figure S4a). In addition, the successfully calculated bRo5 binding sites covered a significantly lower proportion of the bound drug than Ro5
drug binding sites (mean 54% vs 94%, respectively; Supporting Information Figure S4). Hence, binding site shape was also assessed manually by visual inspection and classification as being flat, groove, tunnel, pocket, or internal. These correspond to the drug interacting with its target by a single face for a flat site, two or three faces for a groove, and four faces with two noninteracting opposing faces for a tunnel-shaped site. Interaction of the drug through four or five faces, leaving one noninteractive face, characterizes a pocket, and for an internal binding site the drug is completely buried inside the target (Supporting Information Figure S5). For binding sites that were successfully calculated with DoGSite, descriptors such as enclosure and depth corresponded well to the shape classifications and thereby support the manual classification (Supporting Information Figure S6). However, volume and sphericity descriptors failed to accurately describe the size and shape of flat- and groove-shaped binding sites, as the calculated sites poorly covered the actual drug or clinical candidate binding site. The manual classification is also supported by the increase in the mean proportion of drug surface areas (SA) that become buried upon binding to the target from flat to internal binding sites (Supporting Information Figure S8). Due to the low success rate in binding site calculation and the inability of descriptors to characterize those sites that were successfully calculated, the manual classification of binding site shapes was used throughout the current analysis.

Figure 3. Cross-referencing the extended Ro5, beyond Ro5 data sets, and a representative set of ChEMBL28 Ro5 drugs with the Protein Data Bank (PDB) and filtering by quality constraints gave three data sets of relevant Ro5, eRo5, and bRo5 drug−target structures. To avoid bias toward highly explored drug classes, further filtering to remove redundant drug class−target structures gave unbiased, nonredundant data sets which contain only one compound per drug−target class (e.g., one erythronolide, one azole anti-infective, etc.). The number of compounds approved (App.) and the number in phases (P) III, II, and I development are shown. Results of the analysis of the complete data sets can be found in the Supporting Information for comparison, while the nonredundant data sets are analyzed within the paper as well as in the Supporting Information.

The distribution of binding site shapes showed striking differences between the three sets of drug−target complexes (Figure 4a), with higher proportions of bRo5 drugs and clinical candidates binding to the “difficult” open, flat, and groove binding sites compared to Ro5 drugs. In contrast, Ro5 drugs display a preference for pocket and internal binding sites, which conforms well with the view of such sites as being highly “druggable” with Ro5 compliant compounds. Binding site shapes observed for eRo5 drugs and clinical candidates are more evenly distributed between groove, tunnel, pocket, and internal sites, revealing an ability of compounds residing just outside Ro5 chemical space to target a wide range of sites, with the exception of flat binding sites. It should be noted that compounds in eRo5 space also bind to pocket-shaped and internal sites, which are then larger than those that have Ro5 compliant ligands (Supporting Information Figure S8). Too few ligands in bRo5 space bind to pockets to draw any statistically significant conclusions regarding binding site size. The shapes of the ligands, in their target bound conformations, were also assessed using normalized principal moment of inertia (nPMI) plots, which characterize ligands by their similarity to rod, disk, and sphere shapes (Figure 4b). In agreement with previous analyses based on calculated 3D conformations,38 Ro5 compliant drugs were predominantly rodlike, while those in eRo5 and particularly in bRo5 space were more disk- and spherelike. Flat and groove binding sites were also found to have ligands that were significantly more disk- and spherelike compared to ligands for pocket and internal binding site shapes (Supporting Information Figures S9 and S10).

In addition to binding site shape, the ligand surface area (SA) that is buried upon binding and its proportion to the total ligand SA also provide useful information about the nature of the binding sites. While the buried SA indicates the size of the binding site, the proportion of buried ligand SA can indicate how open or exposed the binding site is. The plot of buried ligand SA against total ligand SA shows that eRo5 and bRo5 drugs have larger SAs buried in complexes with their targets than Ro5 drugs (Figure 4c,d). The proportion of buried ligand SA is, however, lower for drugs in eRo5 and bRo5 space compared to Ro5 space (Figure 4c,e, Supporting Information Figure S8), which is consistent with the preference of eRo5 and
bRo5 drugs for flat and groove binding sites. Although the buried SA of eRo5 and bRo5 drugs is larger than that of Ro5 drugs, it is still slightly smaller than interface areas in weak protein−protein interactions (PPIs with Kd values of >1 μM) and significantly smaller than those in strong protein−protein interactions (PPIs with Kd values of <1 μM, Figure 4d).39 However, most of the affinity of protein−protein interactions arises from smaller hotspot areas within the interface40,41 and drugs in bRo5 space may reach the critical size required to bind such hotspots and gain sufficient affinity toward PPI interfaces.14,25 In conclusion, drugs in bRo5 space are less rodlike in ligand shape and bind to larger binding sites than Ro5 drugs but with a lower proportion of their total SA buried in the complex. This agrees well with the observation that eRo5 and particularly bRo5 drugs bind to difficult, larger, and more open binding sites that approach the size of PPI interfaces. In addition, compounds in bRo5 space are likely large enough to interact effectively with hotspots of PPI interfaces.

Table 1. Analyzed Nonredundant Drug−Target Complexes in Extended and Beyond Ro5 Chemical Space
a Route of administration used in the indicated phase of development.

4.3. Molecular Interactions at Binding Sites. The overall polarity of the drug and the target at the binding site and the number of intermolecular interactions, such as hydrogen bonds, are important descriptors for binding sites. The proportion of nonpolar heavy atoms at the binding site interface, i.e., the number of carbon and sulfur atoms divided by the total number of heavy atoms at the interface, provides a measure of the overall polarity of the ligand and target interfaces. No significant differences in the mean values of this measure of interface polarity from Ro5 to bRo5 space were
found for the ligand or target interface data sets (Figure 5a, Supporting Information Figure S11). The mean proportion of nonpolar atoms at the target interfaces is similar to previous estimates42 and is also consistently lower than that of the ligand interfaces. Since bRo5 drugs have a smaller proportion of their total surface area buried upon binding to the target compared to Ro5 drugs, they might enrich polar or lipophilic atoms at the binding site interface to improve interactions. However, the polarity of the ligand binding site interface is similar to the overall polarity of the ligand for all three chemical spaces, indicating that neither polarity nor lipophilicity is enriched at the interface with the target (Figure 5a).

Figure 4. (a) Distribution of binding site shapes, (b) ligand normalized principal moment of inertia (nPMI) shape plot, (c) buried ligand surface area (SA, Å²) as a function of total ligand SA (Å²), (d) box plot of buried ligand SA (Å²), and (e) box plot of proportion of buried ligand SA. Each panel presents data for rule of 5 drugs (Ro5, red), extended Ro5 (eRo5, blue), and beyond Ro5 (bRo5, green) drugs and clinical candidates in complex with their respective targets. Protein−protein interaction (PPI) interface SA data were extracted from Luo et al.39 and reanalyzed. Box plots show minimum and maximum values as whiskers; the 25th, 50th, and 75th percentiles as boxes; and mean values as crosses. Horizontal lines indicate unpaired and unequal variance t test of compared data sets with their p-values.

Similar to interface polarity, the number of atoms forming hydrogen bonding interactions between the ligand and target does not significantly differ between the three sets of drugs. Ligands in bRo5 space display a slightly higher mean number of HBA atom interactions but a statistically similar mean number of HBD atom interactions compared to Ro5 ligands (Figure 5b,c, Supporting Information Figures S13 and S14). At first glance it may appear that bRo5 drugs show a much wider distribution, particularly a higher maximum number of HBD interactions. However, it should be taken into account that a criterion for including drugs in the Ro5 and eRo5 data sets is ≤5 HBD atoms, limiting their possible HBD interactions. In addition, the eRo5 and bRo5 data sets encompass both oral and parenteral drugs and clinical candidates, with two parenterally delivered polysaccharides, amikacin and β-acarbose, being the only two drugs with >5 HBD interactions in the data set. The number of π−π interactions in drug−target complexes is also similar for all three data sets (Supporting Information Figure S15). Other types of interactions, such as ionic, cation−π interactions, and halogen bonding, were less frequently observed in all three data sets; hence, no confident conclusions could be drawn about these types of interactions. Therefore, despite the increase in size and the difference in shape observed for binding sites of bRo5 drugs, the overall polarity as well as type and number of molecular interactions in the binding site remain similar to those of Ro5 drugs. These findings indicate
that the same approaches for lead optimization may be applied to design of bRo5 ligands as for Ro5 ligands, with recent illustrative examples being the discovery of hepatitis C virus NS3/4a protease inhibitors.43

Figure 5. (a) Proportion of nonpolar heavy atoms (carbon and sulfur) to the total number of heavy atoms for the ligand binding site interface, the overall ligand, and the target binding site interface. The distributions of the number of (b) hydrogen bond acceptor (HBA) atom interactions and (c) hydrogen bond donor (HBD) atom interactions of ligands with their targets. Each panel presents data for rule of 5 (Ro5, red) drugs, extended Ro5 (eRo5, blue), and beyond Ro5 (bRo5, green) drugs and clinical candidates in complexes with their respective targets. Box plots show minimum and maximum values as whiskers; the 25th, 50th, and 75th percentiles as boxes; and mean values as crosses. Horizontal lines indicate unpaired and unequal variance t test of compared data sets with their p-values.

4.4. Affinities and Ligand Efficiencies. Affinities, measured as equilibrium dissociation constants, inhibition constants, or concentrations giving 50% inhibition of target activity (Kd, Ki, IC50), were extracted from the literature for drug−target complexes in the three data sets where available. Affinity data were consistent with those previously reported for a large data set of drugs,44 and drugs in Ro5, eRo5, and bRo5 space had similar mean values and distributions of affinities (Figure 6a). This also holds true for 102 of the 177 approved drugs in the original eRo5 and bRo5 data sets21 for which affinities were available (Supporting Information Figure S16), and their analysis leads to at least two important conclusions. First, drugs outside Ro5 space do not require higher affinities for their targets compared to Ro5 compliant drugs to compensate for any perceived or actual unfavorable pharmacokinetics. Second, despite being perceived as “difficult”, binding sites that are larger and more open can be modulated by drugs with similar affinities as drugs directed to sites traditionally considered highly “druggable”. This correlates well with the observations that similar numbers and types of molecular interactions are formed between large, open, and smaller, enclosed binding sites and their respective ligands (cf. Figure 5 and Supporting Information Figures S11−S15).

Figure 6. (a) Affinities and (b) ligand efficiencies (LE, kcal/(mol·HAC)) for rule of 5 (Ro5, red), extended Ro5 (eRo5, blue), and beyond Ro5 (bRo5, green) drugs and clinical candidates. (c) Relationship between LE and binding site shape. As a rule of thumb, Ro5 drug candidates are optimized to have 10 nM affinities, corresponding to LEs of 0.30 for a compound with a molecular weight of 500 Da (heavy atom count, HAC ≈ 36). This “guideline” for optimization is marked with a gray line in (b) and (c). Box plots show minimum and maximum values as whiskers; the 25th, 50th, and 75th percentiles as boxes; and mean values as crosses. Horizontal lines indicate unpaired and unequal variance t test of compared data sets with their p-values.


Figure 7. Chemical structures of drugs and clinical candidates in bRo5 space that bind to flat and groove-shaped binding sites on targets. Compounds are grouped by therapeutic indication and their status, route of administration and clinically relevant targets are given below each structure.

Ligand efficiency metrics have found widespread use;45 however, they also have some limitations associated with their application, particularly outside traditional Ro5 drug space.46 We nonetheless believe it is useful to characterize the ligand efficiency (LE) and lipophilic ligand efficiency (LLE) distributions observed in eRo5 and bRo5 space to provide guides for those who wish to use them in drug development. As the drugs in eRo5 and bRo5 space are significantly bigger than Ro5 drugs, i.e., they have higher molecular weights and more heavy atoms, their LE is significantly lower (Figure 6b, Supporting Information Figure S17). While LE is known to not be completely independent of the size of compounds,47 we also find that it is correlated to the proportion of buried SA and, hence, also the shape of the binding site (Figure 6c). It should be noted that this correlation is not lost for size corrected ligand efficiencies (see Supporting Information Figures S18 and S19 for full details). Hence, ligand efficiencies are significantly lower for flat and groove binding sites than for pocket and internal sites, with mean values increasing from 0.19 to 0.42 kcal/(mol·HAC) from flat to internal sites. LLE,45 a measure of the affinity of a compound with respect to its lipophilicity, is similar for all three data sets and across all binding site shapes (mean values ± standard deviations: 4.0 ± 1.4 to 6.6 ± 3.7, Supporting Information Figure S17), indicative of Ro5, eRo5, and bRo5 drugs having similar lipophilicities and affinities.

Therefore, efforts to develop drugs in eRo5 and bRo5 space should aim to generate compounds with similar affinities to those in Ro5 space but with altered guidelines for LE, particularly for difficult binding sites. Typical bRo5 drugs have a LE of 0.11−0.30 kcal/(mol·HAC), with both increased compound size and open, flat binding sites contributing to their reduced values. It should be emphasized that the ligand efficiencies discussed herein reflect the historical development of drugs and that a recent review of inhibitors of PPIs indicates that LE values in the top half of our data set, 0.20 and upward, are possible for compounds targeting difficult binding sites.14
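The rule of thumb quoted for Figure 6 (10 nM affinity, HAC ≈ 36, LE ≈ 0.30) can be checked with a short calculation. The sketch below assumes the commonly used definitions LE = −ΔG/HAC with ΔG = RT ln Kd at 298.15 K, and LLE = pKd − ClogP; neither formula is written out in the text, and the function names and the 72-heavy-atom example are illustrative assumptions only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ligand_efficiency(kd_molar, heavy_atoms, temperature=298.15):
    """LE in kcal/(mol*HAC), assuming LE = -RT ln(Kd) / HAC."""
    delta_g = R_KCAL * temperature * math.log(kd_molar)  # negative when Kd < 1 M
    return -delta_g / heavy_atoms


def lipophilic_ligand_efficiency(kd_molar, clogp):
    """LLE, assuming LLE = pKd - ClogP (Ki or IC50 may be substituted for Kd)."""
    return -math.log10(kd_molar) - clogp


# Ro5 guideline from the Figure 6 caption: 10 nM affinity, 36 heavy atoms.
le_ro5 = ligand_efficiency(10e-9, 36)

# Hypothetical bRo5 ligand with the same affinity but twice the heavy atoms.
le_bro5 = ligand_efficiency(10e-9, 72)

# Same affinity with ClogP = 4, the value around which ClogP is said to center.
lle = lipophilic_ligand_efficiency(10e-9, 4.0)

print(f"LE (HAC 36) = {le_ro5:.2f}, LE (HAC 72) = {le_bro5:.2f}, LLE = {lle:.1f}")
```

Under these assumptions, doubling the heavy atom count at constant affinity halves LE, which is the size effect the text describes for eRo5 and bRo5 ligands; LLE is unchanged because it does not depend on size.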


Figure 8. (a) Distribution of parenterals (triangles) and orals (circles) for macrocycles (right) in beyond Ro5 (bRo5 MC, green) and extended Ro5 space (eRo5 MC, orange), as well as nonmacrocycles (left) in the two chemical spaces (bRo5 non-MC, red, and eRo5 non-MC, blue). The dashed black box shows the eRo5 space limits for ClogP and MW. (b) Comparison of average root-mean-square deviation (rmsd, Å) values of the representative conformers of bRo5 macrocyclic drugs (N = 8, mean number of representative conformers of 3.5) and bRo5 nonmacrocyclic drugs (N = 4, mean number of representative conformers of 2.5). For the macrocycles, rmsd values for core atoms (atoms in the macrocycle ring), core plus periphery atoms (the core atoms plus all single heavy atoms attached to the core), as well as all atoms are compared. Box plots show minimum and maximum values as whiskers; the 25th, 50th, and 75th percentiles as boxes, and mean values as crosses. Horizontal lines indicate unpaired and unequal variance t test of compared data sets with their p-values. (c, d) Chemical structure, all observed crystal structures, and the most representative structure colored by the circular standard deviation of each bond of erythromycin A and ritonavir, respectively. For erythromycin A core atoms are in purple, periphery atoms in orange, and side chain atoms in black. The superimpositions show all experimentally determined structures from the Cambridge Structural Database and Protein Databank for the two drugs with heavy atoms colored by chemical element (green for carbon, blue for nitrogen, red for oxygen, and yellow for sulfur). Flexible bonds giving rise to different conformers of each drug were identified from the circular standard deviation of the dihedral angles and are color-coded from white (0) to red (1) representing rigid to flexible bonds, respectively.
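Because dihedral angles are periodic, their spread across conformers has to be measured with circular rather than linear statistics. The following is a minimal sketch, assuming the standard mean resultant length R of the angles; it reports the circular variance 1 − R, which is bounded between 0 and 1 like the color scale in the caption, and the circular standard deviation sqrt(−2 ln R), which is unbounded. The caption does not state which normalization the authors used, so treating the 0 to 1 scale as 1 − R is an assumption, and the example angles are invented for illustration.

```python
import math


def mean_resultant_length(angles_deg):
    """Length R (0..1) of the mean unit vector of a set of angles."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return min(1.0, math.hypot(s, c) / len(angles_deg))  # clamp rounding noise


def circular_variance(angles_deg):
    """1 - R: 0 for identical angles, 1 for angles spread evenly on the circle."""
    return 1.0 - mean_resultant_length(angles_deg)


def circular_std_deg(angles_deg):
    """Circular standard deviation sqrt(-2 ln R), returned in degrees."""
    r = mean_resultant_length(angles_deg)
    if r == 0.0:
        return math.inf
    return math.degrees(math.sqrt(-2.0 * math.log(r)))


# Hypothetical dihedral angles (degrees) of one bond across several conformers.
rigid_bond = [178.0, -179.0, 180.0, 179.0]   # all trans, straddling +/-180
flexible_bond = [60.0, 180.0, -60.0]         # gauche+, trans, gauche-

print(circular_variance(rigid_bond), circular_variance(flexible_bond))
```

A linear standard deviation would wrongly flag the first bond as highly variable because its values wrap around ±180°, whereas the circular measures treat it as rigid.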

5. EXAMINING BEYOND RULE OF 5 DRUGS AND CLINICAL CANDIDATES

A key result of this analysis is that drugs and clinical candidates in bRo5 space are of particular interest because of their greater ability to modulate difficult more open, flat, and groove-shaped binding sites. Two recent investigations provide additional support for this observation. The first found that a representative set of macrocyclic natural products with high MW bind either face-on to flat binding sites or edge-on to groove-shaped binding sites.25 The second highlighted that de novo designed inhibitors of protein−protein interactions that entered clinical trials during the past decade are predominantly nonmacrocyclic and bind to groove- or pocket-shaped binding sites in rodlike conformations, which likely reflects the higher druggability of groove- and pocket-shaped binding sites.14 Our nonredundant data set of bRo5 ligand−target structures contains three drug classes that bind to flat binding sites, all of which are used to treat infectious diseases (Figure 7) and 11 that target groove-shaped binding sites, five of which are used in oncology. Investigating if the chemical class of the ligand in bRo5 space (i.e., de novo designed, natural product, or peptide/peptidomimetic) affected the properties of the complex with the target showed no significant differences for properties discussed above. This is most likely due to the small number of complexes in each subset of the bRo5 data set.


Interestingly, more than half of the nonredundant data set of bRo5 drugs and clinical candidates that bind to flat and groove-shaped binding sites are macrocycles (Figure 7). Examination of the full data set of 475 drugs and clinical candidates that have a MW > 500 Da also reveals that orally bioavailable, as well as parenteral, macrocycles are significantly enriched in bRo5 space compared to nonmacrocycles (Figure 8a), a finding that has been highlighted before.21,25,27 Though the reason for this enrichment has not been conclusively identified, it is known that conformational constraints imposed by the macrocyclic structure can convey improved bioavailability but also higher potency and better selectivity as compared to related acyclic analogues at similar binding sites.26,27,48,49 Conformational restriction has therefore been postulated as a general principle leading to macrocycle enrichment in bRo5 space26,27,48,49 and is consistent with studies showing that increasing the number of rotatable bonds has a negative effect on the oral bioavailability of drugs, independent of their chemical class.50,51 Generally increasing the conformational flexibility of drug candidates is expected to reduce both the affinity and selectivity of target binding,48 although the correlation between flexibility and promiscuity has been questioned.52,53 As the influence of conformational restriction through macrocyclization on target binding still remains unclear, at least in some cases,54 we also investigated flexibility and shape of macrocycles and nonmacrocycles for our data set of bRo5 drugs and clinical candidates to obtain an overview of flexibility in bRo5 space.

5.1. Flexibility of Macrocycles and Nonmacrocycles. We first investigated whether macrocyclic drugs and clinical candidates are more rigid than nonmacrocyclic ones in bRo5 chemical space. Due to the well-known difficulty in predicting the conformations of bRo5 compounds using computational methods,49,55 we analyzed all available experimental conformers from crystal structures in the Protein Databank (PDB) and Cambridge Structural Database (CSD) for our data set of bRo5 drugs and clinical candidates. Arguably such an analysis is limited by the size of the data set and the data that are available for each compound and possibly biased by crystal packing effects and poorly refined geometries,56,57 but it has the advantage of being based on experimental data. We found a total of 24 drugs and clinical candidates in bRo5 space, with 12 nonredundant drug classes that displayed multiple conformations in the crystalline state. In spite of its somewhat limited size, this data set consists of the largest number of bRo5 drugs for which experimentally determined conformations have been analyzed and allowed us to reach some interesting but potentially preliminary conclusions.

All conformers observed for a given drug or clinical candidate were clustered to identify a set of representative conformers that showed >1 Å root-mean-square deviation (rmsd) of all heavy atoms between different conformers. This ensured that the subsequent analysis was not biased by multiple crystal structures of the same conformation. The representative conformers were then compared using average rmsd, where a high value indicates that the compound is flexible and can adopt one or more significantly different conformations. In addition to the rmsd values for all atoms, rmsd values for the macrocyclic core atoms subset and core plus single heavy atoms attached directly to the core (peripheral atoms) subset were also calculated for macrocyclic drugs and clinical candidates. We found that rmsd values of the macrocyclic core atoms are similar to those of the core plus peripheral atom subset, but that rmsd values are significantly higher when all atoms in the macrocycle drugs are taken into account (Figure 8b, Supporting Information Figure S20). This indicates that the side chains are commonly the most dynamic regions of macrocyclic drugs. Furthermore, the all atom rmsd of macrocycles is similar to that of nonmacrocyclic drugs and clinical candidates in bRo5 space, suggesting that both chemical classes have a similar degree of overall flexibility.

Although rmsd values are useful for comparison of conformations, they give little insight into the source of conformational flexibility. Therefore, an analysis of the location of bonds that rotate to give the different conformations was conducted. For macrocyclic drugs in bRo5 space, bonds from all regions, i.e., the macrocyclic core, the side chains, as well as bonds linking the two, were rotated to form the different conformers (Figure 8c, Supporting Information Figure S21). However, most bonds throughout all regions of the macrocycle display only modest or limited rotational freedom. Nonmacrocyclic drugs in bRo5 space also demonstrate a similar distribution of rotational freedom about bonds (Figure 8d, Supporting Information Figure S21). Conformationally constrained regions of nonmacrocyclic drugs arise from aromatic rings, other π-systems such as amides, and substituted aliphatic rings that occur in higher proportions than for macrocyclic drugs (Supporting Information Figure S20e,f). The flexibility of nonmacrocycles mainly originates from rotation around bonds that connect these rigid elements. Overall, this analysis of conformational flexibility and its origins indicates that macrocycles are not more rigid than nonmacrocyclic drugs in bRo5 space across different drug classes. Instead, macrocyclization could be considered as a complementary strategy to other, more traditional approaches for introducing rigidity into drugs. Hence, decreased flexibility may not be the primary reason for the enrichment of oral macrocycles in bRo5 space.

5.2. Ligand Efficiency and Shapes of Ligands and Binding Sites. To further investigate the differences and similarities of macrocyclic and nonmacrocyclic drugs and clinical candidates in bRo5 space the affinity, ligand efficiency (LE), binding site shape, and ligand shape were analyzed. Affinities did not differ significantly between macrocyclic and nonmacrocyclic drugs and clinical candidates in bRo5 space, but macrocycles were found to have a slightly lower LE compared to nonmacrocycles (Figure 9a,b, Supporting Information Figure S22). This is consistent with the observations that drugs and clinical candidates binding to flat binding sites have lower LE and that only macrocyclic drugs in bRo5 space bind to flat binding sites for the nonredundant data sets (Figure 9c). In contrast, nonmacrocyclic drugs display a slight preference for groove- and pocket-shaped binding sites. Although the number of nonredundant drugs and clinical candidates in these two categories is low, the complete bRo5 drug−target structure data set also retains a higher proportion of flat binding sites for macrocycles (Supporting Information Figure S23). In line with these findings, the nPMI shape of bRo5 drugs indicates that macrocycles have a trend to be more disk- and spherelike than nonmacrocycles (Figure 9d and Supporting Information Figure S23). As already discussed above, recent investigations of macrocyclic natural products25 and of inhibitors of protein−protein interactions14 provide additional support for these trends. Thus, one may conclude that the shape of macrocycles in combination with suitable rigidity and conformational preferences make them well suited for binding to difficult flat binding sites with sufficient potency and selectivity. The more linear and often aromatic nonmacrocycles appear to be somewhat better adapted for groove- and pocket-shaped binding sites, although macrocycles also bind to these sites.

5.3. Advantages of Macrocycles in Beyond Rule of 5 Space. As highlighted in the literature26,27,48 and discussed above, macrocyclization of a linear compound can improve both oral bioavailability and affinity at the binding site of a specific target. As macrocyclization decreases flexibility, it has been suggested that decreased flexibility is the main reason for enrichment of macrocycles in bRo5 space.48 However, in the current data set we examine drugs and clinical candidates acting at different targets giving a global picture of flexibility in bRo5 space. This indicates that both macrocyclic and nonmacrocyclic drugs in bRo5 space have similar flexibilities and affinities across different targets and binding sites. Interestingly, we also find that disk- and spherelike macrocycles bind more commonly to flat binding sites than rodlike nonmacrocycles, which more frequently target groove-shaped binding sites. Hence, we conclude that the unique ability of macrocycles to adopt disk- and spherelike shapes that are better suited for binding to flat and groove-shaped sites is an important reason for enrichment of macrocycles in bRo5 space. Improved permeability across membranes, which translates into higher oral bioavailability, has been demonstrated for a number of macrocycles and constitutes another reason.48 It should also be remembered that most macrocycles in bRo5 space are natural products that were discovered prior to target-based drug discovery and high throughput screening. Thus, their current enrichment as oral drugs in beyond Ro5 space also reflects the history of drug discovery for difficult targets while at the same time providing inspiration and insight for future discovery of oral drugs for difficult targets.
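The rod, disk, and sphere classification used in Sections 5.2 and 5.3 rests on normalized principal moments of inertia. As a rough illustration, the sketch below computes the two nPMI ratios (I1/I3 and I2/I3, with I1 ≤ I2 ≤ I3) from 3D coordinates; in this representation an ideal rod lies at (0, 1), a disk at (0.5, 0.5), and a sphere at (1, 1). It is not the authors' protocol: unit atomic masses are assumed by default, the eigenvalues come from a closed-form trigonometric solution for symmetric 3 × 3 matrices so that no external library is needed, and the three test geometries are idealized point sets rather than real ligands.

```python
import math


def inertia_tensor(coords, masses=None):
    """Inertia tensor about the center of mass (unit masses unless given)."""
    m = masses or [1.0] * len(coords)
    total = sum(m)
    cx, cy, cz = (sum(w * p[k] for w, p in zip(m, coords)) / total for k in range(3))
    ixx = iyy = izz = ixy = ixz = iyz = 0.0
    for w, (x, y, z) in zip(m, coords):
        x, y, z = x - cx, y - cy, z - cz
        ixx += w * (y * y + z * z)
        iyy += w * (x * x + z * z)
        izz += w * (x * x + y * y)
        ixy -= w * x * y
        ixz -= w * x * z
        iyz -= w * y * z
    return [[ixx, ixy, ixz], [ixy, iyy, iyz], [ixz, iyz, izz]]


def symmetric_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix in ascending order."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return sorted(a[i][i] for i in range(3))
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    e_max = q + 2.0 * p * math.cos(phi)
    e_min = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return sorted([e_min, 3.0 * q - e_max - e_min, e_max])


def npmi(coords, masses=None):
    """Return (I1/I3, I2/I3) for a set of 3D coordinates."""
    i1, i2, i3 = symmetric_eigenvalues(inertia_tensor(coords, masses))
    if i3 <= 0.0:
        raise ValueError("need at least two distinct points")
    return i1 / i3, i2 / i3


# Idealized shapes: a tilted rod, a planar square, and an octahedron.
rod = [(-1.0, -1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
disk = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
sphere = disk + [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]

for name, shape in (("rod", rod), ("disk", disk), ("sphere", sphere)):
    print(name, npmi(shape))  # approximately (0, 1), (0.5, 0.5), (1, 1)
```

Real ligands fall inside the triangle spanned by these three corners, so a macrocycle that is "more disk- and spherelike" would plot further from the rod corner than a linear, aromatic nonmacrocycle.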

Figure 9. (a) Affinities, (b) ligand efficiencies (LE), (c) binding site shapes, and (d) normalized principal moment of inertia (nPMI) plot of bioactive conformations of drugs for beyond Ro5 macrocyclic (bRo5 MC, green) and beyond Ro5 nonmacrocyclic (bRo5 non-MC, red) drugs and targets. As a rule of thumb, Ro5 drug candidates are optimized to have 10 nM affinities, corresponding to a LE of 0.30 for a compound with a molecular weight of 500 Da (heavy atom count, HAC ≈ 36). This “guideline” for optimization is marked with a gray line in (b). Box plots show minimum and maximum values as whiskers; the 25th, 50th, and 75th percentiles as boxes; and mean values as crosses.

6. PERSPECTIVE

6.1. Designing Drugs beyond the Rule of 5. A key conclusion from this analysis is that drugs and clinical candidates in bRo5 space are better suited to modulate “difficult” and emerging target classes compared to Ro5 compliant drugs, in particular when binding sites are large and flat or groove-shaped (Table 2). In previous analyses we demonstrated that 93% of the current oral drugs and clinical candidates in eRo5 and bRo5 space fall within an “outer limit” of physicochemical space where there still remains a reasonable chance to design orally bioavailable drugs.21,27 This space was delineated by MW ≤ 1000 Da, −2 ≤ ClogP ≤ 10, HBD ≤ 6, HBA ≤ 15, PSA ≤ 250 Å², and NRotB ≤ 20. The current target binding analysis and the previous oral bioavailability analyses therefore provide further guidance for design of bRo5 ligands for “difficult” targets, in particular with regard to the shape of the binding site, the flexibility of the ligand, and the LE that may be attained for binding sites of different shapes.

When designing orals in bRo5 space, molecular weight may be increased up to 1000 Da which allows high-affinity binding to “difficult” extended, open, flat, and groove-shaped binding sites that bury up to 800−900 Å² of ligand surface area (Table 2, Supporting Information Figure S25). In contrast, drugs that adhere strictly to the Ro5 bind to pockets and internal sites and bury up to 600 Å² of ligand surface area. Overall polarity at the ligand interface is similar for bRo5, eRo5, and Ro5 drugs and clinical candidates which is mirrored in their similar lipophilicity, i.e., ClogP values centered around 4. Hence, an increase in 2D PSA up to 250 Å² is often found at higher molecular weights in bRo5 space to maintain this balance of


Table 2. Summary of Interactions between Drugs and Clinical Candidates and Their Targets by Chemical Spaceᵃ

ᵃProperty values shown are the 10th to 90th percentile of all structures encompassing both orals and parenterals. Orals and parenterals show very similar property values (Supporting Information Figures S24−S28). ᵇBoxes indicate differences between bRo5 space and small molecule drug space. ᶜLE values should be adjusted based on the size of the ligand and shape of binding site during optimization with the final aim of being in this range and as high as possible.

polarity. The majority of this increased PSA should originate from an increased number of HBA, as HBD must be strictly controlled at ≤6 to avoid reducing oral bioavailability. This corresponds well to the observed small increase in the number of ligand HBA atom interactions between the ligand and its target in bRo5 space as compared to eRo5 and Ro5 space, while ligand HBD atom interactions are similar. Similar overall ligand−target affinities were observed between Ro5, eRo5, and bRo5 space, which resulted in reduced ligand efficiency in bRo5 space. Additional, detailed comparisons of the properties of ligand−target interfaces in each chemical space revealed that trends are similar between orals and parenterals for drugs and clinical candidates (Supporting Information Figures S24−S28). However, parenterals in bRo5 space have reduced LE and increased LLE as compared to orals in bRo5 space (Supporting Information Figure S27). These differences are caused by drugs in parenteral bRo5 space which have high molecular weight and high polarity; they are often peptidic in nature and thereby difficult to administer orally. In conclusion, this analysis indicates that the same approaches for optimization of oral drugs can be applied in bRo5 space as in Ro5 space, provided that physicochemical properties are kept within the recently reported “outer limits”.21,27

Compounds in bRo5 space require an overall appropriate balance between flexibility and rigidity that allows them to bind with high affinity to the target without paying an unnecessary entropic penalty. Conformational flexibility may be reduced through macrocyclization and use of aromatic and substituted aliphatic rings and other π-systems, such as amides, with equal success but at different types of binding sites. Interestingly, macrocyclization appears to favor disk- and spherelike conformations that facilitate binding of the ligand to the very difficult flat binding sites. Incorporation of other rigidifying structural elements in nonmacrocycles is more prone to yield rodlike conformations that bind to groove-shaped binding sites.

During optimization of beyond Ro5 drugs it is also important to adjust goals for LE. Compounds and series with a LE value below 0.30 kcal/(mol·HAC) are often not prioritized in drug discovery. However, during optimization of bRo5 drugs for difficult targets, leads with LE < 0.30 kcal/(mol·HAC) should not be discarded, as this analysis indicates that they retain a reasonable chance of having sufficient affinity for their target while still being developed into a “possible to be oral” drug space. Instead, the size and flexibility of the ligand and the shape of the target binding site should be taken into account, allowing progression of compounds that may give candidate drugs with ligand efficiencies of ≥0.12 kcal/(mol·HAC), a guideline that captures 90% of current oral drugs and clinical candidates in bRo5 space. Finally, intramolecular hydrogen bonding, saturation of efflux transporters with high doses, use of improved formulations, and capitalizing on selective transporter mediated distribution to target organs can also play a role in producing a cell permeable and orally bioavailable drug in bRo5 space.21

6.2. Conclusion and Future Challenges. This analysis suggests that many of the currently recognized “difficult” targets, and the multitude of novel targets that are emerging from genomics and proteomics, are out of reach for drugs in Ro5 space but that they may be suited to manipulation by drugs in bRo5 space. Disk- and sphere-shaped macrocycles and rodlike nonmacrocycles in bRo5 space are particularly well suited to targets that have large, flat, or groove-shaped binding sites, respectively. Orally bioavailable and cell permeable drugs for difficult targets can be designed within the current “outer limits” of physicochemical property space beyond which pharmacokinetics currently present an almost insurmountable challenge.21 In order to capitalize on opportunities in bRo5 space, changes in the perception of target properties that are considered “druggable” and which ligand properties allow oral bioavailability are required. Realizing that physicochemical property space can be expanded and introduction of reduced LE goals, which should be tailored to binding site shape and size, will enable development of drugs in bRo5 space. It may also be important to focus optimization so that candidate drugs have an appropriate balance between rigidity and flexibility in order to obtain satisfactory potency and ADMET properties. The increased interest in drug discovery for difficult targets is becoming apparent from the increasing number of relevant

publications from industry and academia, the emergence of several companies specializing in this chemical space, and the growing number of approved drugs outside Ro5 chemical space.21 Thus, it is important to consider the discoveries and breakthroughs that will be required to enhance drug discovery in bRo5 space. Improved lead generation for novel classes of targets with difficult binding sites will be key to success. Extrapolation of current trends indicates that natural products and peptides will continue to constitute valuable starting points in this chemical space.21 Fragment-based lead generation is also beginning to prove its worth in delivering drugs and clinical candidates for difficult targets58 and may become of increasing value for bRo5 lead discovery. Compounds in bRo5 space that bind to difficult targets will be bigger and more complex than Ro5 drugs. Although significant progress has been made in preparation of complex structures, e.g., macrocycles,59 synthesis will likely remain difficult, time-consuming, and costly for the immediate future. Therefore, development of improved predictive methods would also have a major impact on bRo5 drug discovery by allowing only the most highly prioritized compounds to be selected for synthesis. Access to efficient and reliable methods for 3D conformer generation would allow more accurate predictions of flexibility, lipophilicities, and polar surface areas and facilitate modeling of cell permeability and target binding. Predictive models for cellular efflux and metabolism would also be extremely valuable, even though it may not be realistic to expect them to be developed in the near future. In summary, we believe that developments in lead generation, synthetic methodology, and predictive methods, in combination with insights into how targets are engaged and improved understanding of ADMET properties, will allow more effective drug discovery in bRo5 space in the near future. It is our hope that this will ultimately contribute to an enhanced delivery of innovative medicines and increased pharmaceutical R&D efficiency.

■ ASSOCIATED CONTENT

*S Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jmedchem.5b01286.
Full methods for generation of the data set and its analysis; figures for all data discussed in the manuscript as well as figures for the full structure data set along with statistical analysis. (PDF)

■ AUTHOR INFORMATION

Corresponding Author
*Phone: +46 (0)18 4713801. E-mail: [email protected].

Author Contributions
All authors contributed to generation of data, to its analysis, and to the writing of the manuscript. They have approved the final version of the manuscript.

Notes
The authors declare no competing financial interest.

Biographies
Bradley C. Doak is a postdoctoral researcher in the Department of Organic Chemistry, Uppsala University, Sweden. Brad obtained a Bachelor of Medicinal Chemistry at the Department of Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, Australia. He then completed a Ph.D. in Synthetic Medicinal Chemistry at Monash University in 2012, before taking his current position in Sweden. His research interests include beyond rule of 5 drug design, specifically macrocycles, as well as fragment-based drug design and difficult targets.

Jie Zheng holds a Masters of Organic Chemistry at the Department of Organic Chemistry, Uppsala University, Sweden. She obtained a Bachelor of Environmental Engineering from Shandong University of Technology, China, and then went on to study organic chemistry at Umeå University and Uppsala University, Sweden. Following research in environmental and technical chemistry, her research interests now include computational chemistry with an emphasis on how compounds bind to their targets and applications in molecular design.

Doreen Dobritzsch is a senior lecturer in Biochemistry at Uppsala University, Sweden, since 2013. Previously she held an Assistant Professorship at the Karolinska Institute, where she also performed her postdoctoral studies after obtaining a Ph.D. in Biochemistry from the Martin-Luther-Universität Halle-Wittenberg, Germany, in 1999. Her research in the field of structural biochemistry is focused on structure−function relationships in enzymes, primarily those involved in pyrimidine degradation, and interactions occurring between immune molecules and self-antigens in rheumatoid arthritis.

Jan Kihlberg holds a Chair in Organic Chemistry at Uppsala University, Sweden, since 2013. During the previous 10 years at AstraZeneca R&D Mölndal he held positions as Director of Medicinal Chemistry, then as Director of Competitive Intelligence and Business Foresight Analysis. He became Professor in Organic Chemistry at Umeå University in 1996 after having established his research group at Lund Institute of Technology in 1991. He holds a Ph.D. in Organic Chemistry from Lund Institute of Technology. His key research interests are to understand what properties convey cell permeability and target binding to compounds outside the rule of 5 as well as studies of the chemical biology of glycopeptides, peptides, and their mimetics.

■ ACKNOWLEDGMENTS

B.C.D. was supported by a postdoctoral fellowship funded by Uppsala University. The authors thank Prof. Helena Danielson, Uppsala University, Sweden, for providing valuable comments on the manuscript.

■ ABBREVIATIONS USED

bRo5, beyond rule of 5; eRo5, extended rule of 5; LLE, lipophilic ligand efficiency; nPMI, normalized principal moment of inertia; QED, quantitative estimate of drug-likeness; Ro5, rule of 5

■ REFERENCES

(1) Rask-Andersen, M.; Almen, M. S.; Schioth, H. B. Trends in the exploitation of novel drug targets. Nat. Rev. Drug Discovery 2011, 10, 579−590.
(2) Bunnage, M. E. Getting pharmaceutical R&D back on target. Nat. Chem. Biol. 2011, 7, 335−339.
(3) Kinch, M. S.; Hoyer, D.; Patridge, E.; Plummer, M. Target selection for FDA-approved medicines. Drug Discovery Today 2015, 20, 784−789.
(4) Scannell, J. W.; Blanckley, A.; Boldon, H.; Warrington, B. Diagnosing the decline in pharmaceutical R&D efficiency. Nat. Rev. Drug Discovery 2012, 11, 191−200.
(5) Hay, M.; Thomas, D. W.; Craighead, J. L.; Economides, C.; Rosenthal, J. Clinical development success rates for investigational drugs. Nat. Biotechnol. 2014, 32, 40−51.
(6) Cook, D.; Brown, D.; Alexander, R.; March, R.; Morgan, P.; Satterthwaite, G.; Pangalos, M. N. Lessons learned from the fate of


AstraZeneca’s drug pipeline: a five-dimensional framework. Nat. Rev. Drug Discovery 2014, 13, 419−431.
(7) International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature 2004, 431, 931−945.
(8) Kim, M. S.; Pinto, S. M.; Getnet, D.; Nirujogi, R. S.; Manda, S. S.; Chaerkady, R.; Madugundu, A. K.; Kelkar, D. S.; Isserlin, R.; Jain, S.; Thomas, J. K.; Muthusamy, B.; Leal-Rojas, P.; Kumar, P.; Sahasrabuddhe, N. A.; Balakrishnan, L.; Advani, J.; George, B.; Renuse, S.; Selvan, L. D.; Patil, A. H.; Nanjappa, V.; Radhakrishnan, A.; Prasad, S.; Subbannayya, T.; Raju, R.; Kumar, M.; Sreenivasamurthy, S. K.; Marimuthu, A.; Sathe, G. J.; Chavan, S.; Datta, K. K.; Subbannayya, Y.; Sahu, A.; Yelamanchi, S. D.; Jayaram, S.; Rajagopalan, P.; Sharma, J.; Murthy, K. R.; Syed, N.; Goel, R.; Khan, A. A.; Ahmad, S.; Dey, G.; Mudgal, K.; Chatterjee, A.; Huang, T. C.; Zhong, J.; Wu, X.; Shaw, P. G.; Freed, D.; Zahari, M. S.; Mukherjee, K. K.; Shankar, S.; Mahadevan, A.; Lam, H.; Mitchell, C. J.; Shankar, S. K.; Satishchandra, P.; Schroeder, J. T.; Sirdeshmukh, R.; Maitra, A.; Leach, S. D.; Drake, C. G.; Halushka, M. K.; Prasad, T. S.; Hruban, R. H.; Kerr, C. L.; Bader, G. D.; Iacobuzio-Donahue, C. A.; Gowda, H.; Pandey, A. A draft map of the human proteome. Nature 2014, 509, 575−581.
(9) Wilhelm, M.; Schlegl, J.; Hahne, H.; Moghaddas Gholami, A.; Lieberenz, M.; Savitski, M. M.; Ziegler, E.; Butzmann, L.; Gessulat, S.; Marx, H.; Mathieson, T.; Lemeer, S.; Schnatbaum, K.; Reimer, U.; Wenschuh, H.; Mollenhauer, M.; Slotta-Huspenina, J.; Boese, J. H.; Bantscheff, M.; Gerstmair, A.; Faerber, F.; Kuster, B. Mass-spectrometry-based draft of the human proteome. Nature 2014, 509, 582−587.
(10) Rask-Andersen, M.; Masuram, S.; Schioth, H. B. The druggable genome: evaluation of drug targets in clinical trials suggests major shifts in molecular class and indication. Annu. Rev. Pharmacol. Toxicol. 2014, 54, 9−26.
(11) Hopkins, A. L.; Groom, C. R. The druggable genome. Nat. Rev. Drug Discovery 2002, 1, 727−730.
(12) Venkatesan, K.; Rual, J. F.; Vazquez, A.; Stelzl, U.; Lemmens, I.; Hirozane-Kishikawa, T.; Hao, T.; Zenkner, M.; Xin, X.; Goh, K. I.; Yildirim, M. A.; Simonis, N.; Heinzmann, K.; Gebreab, F.; Sahalie, J. M.; Cevik, S.; Simon, C.; de Smet, A. S.; Dann, E.; Smolyar, A.; Vinayagam, A.; Yu, H.; Szeto, D.; Borick, H.; Dricot, A.; Klitgord, N.; Murray, R. R.; Lin, C.; Lalowski, M.; Timm, J.; Rau, K.; Boone, C.; Braun, P.; Cusick, M. E.; Roth, F. P.; Hill, D. E.; Tavernier, J.; Wanker, E. E.; Barabasi, A. L.; Vidal, M. An empirical framework for binary interactome mapping. Nat. Methods 2009, 6, 83−90.
(13) Zhang, Q. C.; Petrey, D.; Deng, L.; Qiang, L.; Shi, Y.; Thu, C. A.; Bisikirska, B.; Lefebvre, C.; Accili, D.; Hunter, T.; Maniatis, T.; Califano, A.; Honig, B. Structure-based prediction of protein-protein interactions on a genome-wide scale. Nature 2012, 490, 556−560.
(14) Arkin, M. R.; Tang, Y.; Wells, J. A. Small-molecule inhibitors of protein-protein interactions: progressing toward the reality. Chem. Biol. 2014, 21, 1102−1114.
(15) Surade, S.; Blundell, T. L. Structural biology and drug discovery of difficult targets: the limits of ligandability. Chem. Biol. 2012, 19, 42−50.
(16) Seco, J.; Luque, F. J.; Barril, X. Binding site detection and druggability index from first principles. J. Med. Chem. 2009, 52, 2363−2371.
(17) Krasowski, A.; Muthas, D.; Sarkar, A.; Schmitt, S.; Brenk, R. DrugPred: A structure-based approach to predict protein druggability developed using an extensive nonredundant data set. J. Chem. Inf. Model. 2011, 51, 2829−2842.
(18) Perola, E.; Herman, L.; Weiss, J. Development of a rule-based method for the assessment of protein druggability. J. Chem. Inf. Model. 2012, 52, 1027−1038.
(19) Hajduk, P. J.; Huth, J. R.; Fesik, S. W. Druggability indices for protein targets derived from NMR-based screening data. J. Med. Chem. 2005, 48, 2518−2525.
(20) Terrett, N. Drugs in middle space. MedChemComm 2013, 4, 474−475.
(21) Doak, B. C.; Over, B.; Giordanetto, F.; Kihlberg, J. Oral druggable space beyond the rule of 5: insights from drugs and clinical candidates. Chem. Biol. 2014, 21, 1115−1142.
(22) Abad-Zapatero, C. A sorcerer’s apprentice and the rule of five: from rule-of-thumb to commandment and beyond. Drug Discovery Today 2007, 12, 995−997.
(23) Zhang, M.-Q.; Wilkinson, B. Drug discovery beyond the ‘rule-of-five’. Curr. Opin. Biotechnol. 2007, 18, 478−488.
(24) Walters, W. P. Going further than Lipinski’s rule in drug design. Expert Opin. Drug Discovery 2012, 7, 99−107.
(25) Villar, E. A.; Beglov, D.; Chennamadhavuni, S.; Porco, J. A., Jr; Kozakov, D.; Vajda, S.; Whitty, A. How proteins bind macrocycles. Nat. Chem. Biol. 2014, 10, 723−731.
(26) Driggers, E. M.; Hale, S. P.; Lee, J.; Terrett, N. K. The exploration of macrocycles for drug discovery - an underexploited structural class. Nat. Rev. Drug Discovery 2008, 7, 608−624.
(27) Giordanetto, F.; Kihlberg, J. Macrocyclic drugs and clinical candidates: what can medicinal chemists learn from their properties? J. Med. Chem. 2014, 57, 278−295.
(28) Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012, 40, D1100−1107.
(29) Lipinski, C. A.; Lombardo, F.; Dominy, B. W.; Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Delivery Rev. 2001, 46, 3−26.
(30) Bickerton, G. R.; Paolini, G. V.; Besnard, J.; Muresan, S.; Hopkins, A. L. Quantifying the chemical beauty of drugs. Nat. Chem. 2012, 4, 90−98.
(31) Vieth, M.; Sutherland, J. J. Dependence of molecular properties on proteomic family for marketed oral drugs. J. Med. Chem. 2006, 49, 3451−3453.
(32) Morphy, R. The influence of target family and functional activity on the physicochemical properties of pre-clinical compounds. J. Med. Chem. 2006, 49, 2969−2978.
(33) Zhang, J.; Yang, P. L.; Gray, N. S. Targeting cancer with small molecule kinase inhibitors. Nat. Rev. Cancer 2009, 9, 28−39.
(34) Nero, T. L.; Morton, C. J.; Holien, J. K.; Wielens, J.; Parker, M. W. Oncogenic protein interfaces: small molecules, big challenges. Nat. Rev. Cancer 2014, 14, 248−262.
(35) Wirth, M.; Volkamer, A.; Zoete, V.; Rippmann, F.; Michielin, O.; Rarey, M.; Sauer, W. H. Protein pocket and ligand shape comparison and its application in virtual screening. J. Comput.-Aided Mol. Des. 2013, 27, 511−524.
(36) Volkamer, A.; Kuhn, D.; Grombacher, T.; Rippmann, F.; Rarey, M. Combining global and local measures for structure-based druggability predictions. J. Chem. Inf. Model. 2012, 52, 360−372.
(37) Huang, B. MetaPocket: a meta approach to improve protein ligand binding site prediction. OMICS 2009, 13, 325−330.
(38) Ruddigkeit, L.; van Deursen, R.; Blum, L. C.; Reymond, J. L. Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J. Chem. Inf. Model. 2012, 52, 2864−2875.
(39) Luo, J.; Guo, Y.; Zhong, Y.; Ma, D.; Li, W.; Li, M. A functional feature analysis on diverse protein-protein interactions: application for the prediction of binding affinity. J. Comput.-Aided Mol. Des. 2014, 28, 619−629.
(40) Clackson, T.; Wells, J. A. A hot spot of binding energy in a hormone-receptor interface. Science 1995, 267, 383−386.
(41) Bogan, A. A.; Thorn, K. S. Anatomy of hot spots in protein interfaces. J. Mol. Biol. 1998, 280, 1−9.
(42) Schmidtke, P.; Barril, X. Understanding and predicting druggability. A high-throughput method for detection of drug binding sites. J. Med. Chem. 2010, 53, 5858−5867.
(43) Rosenquist, A.; Samuelsson, B.; Johansson, P. O.; Cummings, M. D.; Lenz, O.; Raboisson, P.; Simmen, K.; Vendeville, S.; de Kock, H.; Nilsson, M.; Horvath, A.; Kalmeijer, R.; de la Rosa, G.; Beumont-Mauviel, M. Discovery and development of simeprevir (TMC435), a HCV NS3/4A protease inhibitor. J. Med. Chem. 2014, 57, 1673−1693.

(44) Overington, J. P.; Al-Lazikani, B.; Hopkins, A. L. How many drug targets are there? Nat. Rev. Drug Discovery 2006, 5, 993−996.
(45) Hopkins, A. L.; Keseru, G. M.; Leeson, P. D.; Rees, D. C.; Reynolds, C. H. The role of ligand efficiency metrics in drug discovery. Nat. Rev. Drug Discovery 2014, 13, 105−121.
(46) Kenny, P. W.; Leitao, A.; Montanari, C. A. Ligand efficiency metrics considered harmful. J. Comput.-Aided Mol. Des. 2014, 28, 699−710.
(47) Reynolds, C. H.; Tounge, B. A.; Bembenek, S. D. Ligand binding efficiency: trends, physical basis, and implications. J. Med. Chem. 2008, 51, 2432−2438.
(48) Mallinson, J.; Collins, I. Macrocycles in new drug discovery. Future Med. Chem. 2012, 4, 1409−1438.
(49) Chen, I. J.; Foloppe, N. Tackling the conformational sampling of larger flexible compounds and macrocycles in pharmacology and drug discovery. Bioorg. Med. Chem. 2013, 21, 7898−7920.
(50) Veber, D. F.; Johnson, S. R.; Cheng, H. Y.; Smith, B. R.; Ward, K. W.; Kopple, K. D. Molecular properties that influence the oral bioavailability of drug candidates. J. Med. Chem. 2002, 45, 2615−2623.
(51) Varma, M. V.; Obach, R. S.; Rotter, C.; Miller, H. R.; Chang, G.; Steyn, S. J.; El-Kattan, A.; Troutman, M. D. Physicochemical space for optimum oral bioavailability: contribution of human intestinal absorption and first-pass elimination. J. Med. Chem. 2010, 53, 1098−1108.
(52) He, M. W.; Lee, P. S.; Sweeney, Z. K. Promiscuity and the conformational rearrangement of drug-like molecules: insight from the Protein Data Bank. ChemMedChem 2015, 10, 238−244.
(53) Haupt, V. J.; Daminelli, S.; Schroeder, M. Drug promiscuity in PDB: protein binding site similarity is key. PLoS One 2013, 8, e65894.
(54) DeLorbe, J. E.; Clements, J. H.; Whiddon, B. B.; Martin, S. F. Thermodynamic and structural effects of macrocyclic constraints in protein−ligand interactions. ACS Med. Chem. Lett. 2010, 1, 448−452.
(55) Watts, K. S.; Dalal, P.; Tebben, A. J.; Cheney, D. L.; Shelley, J. C. Macrocycle conformational sampling with MacroModel. J. Chem. Inf. Model. 2014, 54, 2680−2696.
(56) Sondergaard, C. R.; Garrett, A. E.; Carstensen, T.; Pollastri, G.; Nielsen, J. E. Structural artifacts in protein-ligand X-ray structures: implications for the development of docking scoring functions. J. Med. Chem. 2009, 52, 5673−5684.
(57) Liebeschuetz, J.; Hennemann, J.; Olsson, T.; Groom, C. R. The good, the bad and the twisted: a survey of ligand geometry in protein crystal structures. J. Comput.-Aided Mol. Des. 2012, 26, 169−183.
(58) Baker, M. Fragment-based lead discovery grows up. Nat. Rev. Drug Discovery 2013, 12, 5−7.
(59) Marsault, E.; Peterson, M. L. Macrocycles are great cycles: applications, opportunities, and challenges of synthetic macrocycles in drug discovery. J. Med. Chem. 2011, 54, 1961−2004.
