To Mom and Dad

List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Elinder, M., Geitmann, M., Gossas, T., Källblad, P., Winquist, J., Nordström, H., Hämäläinen, M., and Danielson, UH. (2011) Experimental validation of a fragment library for lead discovery using SPR biosensor technology. Journal of Biomolecular Screening. 16(1):15-25
II Winquist, J., Nordström, H., Geitmann, M., Gossas, T., Homan, E., Hämäläinen, M., and Danielson, UH. New Scaffolds for Design of Inhibitors of Resistant HIV-1 Protease Identified by Fragment Library Screening. Submitted.
III Winquist, J., Abdurakhmanov, E., Baraznenok, V., Henderson, I., Vrang, L., and Danielson, UH. Resolution of the Interaction Mechanisms and Characteristics of Non-nucleoside Inhibitors of Hepatitis C Virus Polymerase – Laying the Foundation for Discovery of Allosteric HCV . Submitted.
IV Winquist, J., Gustafsson, L., Nilsson, I., Musil, D., Deinum, J., Geschwindner, S., Xue, Y., and Danielson, UH. Identification of structure-kinetic and structure-thermodynamic relationships for thrombin inhibitors. Manuscript.

Reprints were made with permission from the respective publishers.

Contents

Introduction
Drug discovery
Attrition – a costly problem
Phases of pre-clinical drug discovery
Clinical trials
Levels of therapeutic intervention
Enzymes as drug targets
Proteases
Polymerases
RNA viruses – a moving target
Human immunodeficiency virus, HIV
Therapies
Taxonomy and molecular virology
HIV-1 protease
Hepatitis C virus, HCV
Therapies
Taxonomy and molecular virology
HCV polymerase, NS5B
Endogenous targets
Cardiovascular diseases
Thrombin
Physicochemical properties of drugs
Fragment-based drug discovery, FBDD
General approaches for drug discovery
Metrics in drug discovery
Interaction analysis
Affinity, kinetics, and interaction mechanisms
Thermodynamic interaction analysis
Techniques for interaction analysis
Surface plasmon resonance-based biosensors
Aims
Present investigations
Paper I – Design and validation of a fragment library
Interaction profiles
Fragments with slow dissociation rates
Superstoichiometry may be an insidious and non-general trait
Conclusions
Paper II – Screening for novel inhibitor scaffolds
Tversky coefficient as a measure of similarity
Conclusions
Logistical challenges in compound screening (Papers I and II)
Paper III – Cellular , mechanism, and kinetics of allosteric enzyme inhibitors
Assessment of interaction mechanism
Chemo- and thermodynamic evaluation of interactions
Inhibitor interference of NS5BΔ21 RNA-binding
Conclusions
Paper IV – Structure-kinetic and -thermodynamic relationships of enzyme inhibitors
Conclusions
Perspectives
Summary
Future studies
Notes
Computer software for visualization
Sammanfattning på svenska (Summary in Swedish)
Bakgrund och målsättning
Delarbeten
Slutsatser
Acknowledgements
References

Abbreviations

CD Candidate drug
DMSO Dimethyl sulfoxide
DSF Differential scanning fluorometry
DTT Dithiothreitol
EDC 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride
FBDD Fragment-based drug discovery
FDA US Food and Drug Administration
FRET Förster resonance energy transfer
GPCR G-protein-coupled receptor
HCV Hepatitis C virus
HIV Human immunodeficiency virus
HTS High throughput screening
IRES Internal ribosomal entry site
ITC Isothermal titration calorimetry
MS Mass spectrometry
NHS N-hydroxysuccinimide
NME New molecular entity
NMR Nuclear magnetic resonance
NS Non-structural protein
NTP Nucleoside triphosphate
OGP n-octyl-β-D-glucopyranoside
PCR Polymerase chain reaction
PD Pharmacodynamic
PDB ID Protein Data Bank identity
PK Pharmacokinetic
R&D Research and development
RT Reverse transcriptase
SAR Structure-activity relationship
SF Stopped-flow spectroscopy
SKR Structure-kinetic relationship
SPR Surface plasmon resonance
v/v Volume to volume ratio

Introduction

Drug discovery

Drug discovery is a complex, multivariate process that aims to design a stable, potent compound with high therapeutic and low . Furthermore, production of the compound should be chemically and economically tractable. All of these aspects are affected by the structural and physicochemical properties of the compound, which are established in the pre-clinical phase of drug discovery.

This thesis has the overall aim of reducing attrition in drug discovery via two interim aims. The first is to better understand which strategies are efficient for drug discovery. The second, in order to ultimately reach compounds that are effective in the clinic, has been to develop methods for extracting information that is valuable when prioritizing and optimizing substances during the development process.

Attrition – a costly problem

The number of new molecular entities (NME) approved by the US Food and Drug Administration (FDA) was <1 per company annually for nine of the largest pharmaceutical companies during the period 2005-2010. R&D spending increased continuously during the same period, and the annual amount had reached 60 billion US dollars by the end of 2010 [1]. The cost of developing a drug from a screening hit to a compound approved for clinical use has been estimated at approximately US$ 1.8 billion [2].

In each step of the drug discovery process, many potential drugs are excluded. In the early 1990s the most common causes of failure were issues with and . By the year 2000, these causes had been reduced considerably, and the main problems were then and efficacy [3].

The cost of development increases rapidly with each step of the ladder towards approval, which is why thorough evaluation of drug hits and leads is important for identifying failing compounds at an early stage [4]. There is currently a 95 % attrition rate in the clinical phases [2]. If this could be decreased by only a few per cent, and undesired compounds excluded at an early stage, great effort and resources could be saved.

Phases of pre-clinical drug discovery

Pre-clinical drug discovery can be divided into five more or less distinct phases (Figure 1). Initially, a drug target is identified. Typically, it is an essential part of the system to be affected. The target type must be considered early, since the demands on the properties of a compound that targets an extracellular receptor differ from those on a compound that targets an intracellular enzyme, e.g. due to uptake issues.

Next, the target is validated to be ligandable, i.e. able to bind small molecules with reasonably high affinities [5], and, crucially, druggable, i.e. its role in the mechanism to be altered can be affected by ligand binding, for example by blocking an interaction or inhibiting an activity. At this stage, assays are developed that can also be used in hit identification, the phase in which compounds with desired effects, acting with some degree of selectivity towards their target, are identified. Different types of screening methods are often applied, e.g. high throughput screening (HTS) or fragment library screening, the latter being a cornerstone of fragment-based drug discovery (FBDD). A library in this context is a collection of unique molecular entities. An alternative method is the purely rational design approach, where structural and mechanistic information is used to design leads. The kinetic profile and target selectivity are ascribed increasing importance throughout pre-clinical research, and selectivity assessment with anti-targets is a valuable tool for avoiding in vivo toxicity.

Target identification: disease mechanism; target type
Target validation: druggability; assay development; (selectivity)
Hit identification: kinetic profile; specificity
Lead generation: kinetic profile; interaction mechanism
Lead optimization: kinetic profile; SAR / SKR; selectivity; animal PK/PD/ADMET

Figure 1. Phases of pre-clinical drug discovery.

In the lead generation phase, the mechanisms of interaction, and the functional effects, are examined. Thermodynamic evaluation of the interaction is believed to provide a better understanding of both the interaction itself, and the of the compound [6]. In the last pre-clinical stage, the leads are optimized by structure/activity-relationship (SAR) and structure/kinetics-relationship (SKR) studies, resulting in a candidate drug (CD).

Before the CDs enter clinical testing, pharmacokinetic (PK), pharmacodynamic (PD), and ADME/Tox (absorption, distribution, metabolism, excretion, and toxicity) evaluations are performed in vitro, e.g. through serum protein and -binding assays, as well as in animal model systems. The importance of valid model systems in the pre-clinical phase of drug discovery, be it in silico, in vitro, or in vivo, cannot be overemphasized.

Clinical trials

After evaluation of the CDs in animal models, they enter clinical trials with human subjects. Where no reliable animal models are available, e.g. for some psychiatric illnesses, the actual validation of the drug target is performed as late as in clinical phase III trials [4].

The clinical evaluation of a CD begins in phase 0, where a handful to a dozen healthy volunteers are exposed to sub-therapeutic doses of the compound to evaluate PD and PK properties. The results serve as a guideline for the next stage in development [7].

In phase I, general safety issues and identification of side-effects are addressed in small groups (20-80 individuals) of healthy volunteers. Again, PD and PK properties are monitored closely.

Phase II trials include 100-300 patients with the disease or condition to be treated. Here, the efficacy of the potential drug is assessed for the first time.

In phase III, the candidate drug is compared to already existing treatments in large-scale trials (1000-3000 patients). Side-effects are usually noted in this stage, which is the most expensive phase in the development of a new drug. To reach the market, new drugs need to display some advantage over compounds already approved and in clinical use.

After a drug has been approved, a phase IV trial is usually conducted. It addresses, e.g., the risks of long-term use, optimal use, and interactions with other drugs [8].

Levels of therapeutic intervention

Medical treatment can, in simple terms, be directed towards five different hierarchical levels (Figure 2): genomic DNA (in some cases RNA), gene expression and the transcriptome, the proteome, the interactome, and the metabolome.

1. Genome: gene therapy
2. Transcriptome: RNA interference
3. Proteome: vaccines
4. Interactome: receptor blockade
5. Metabolome: enzyme inhibition

Figure 2. The five levels available for therapeutic intervention. One example for each category is given. Epigenetic aspects have been omitted for clarity.

1) With gene therapy, an individual’s genomic DNA can be altered, usually by introduction of a new gene copy. Most commonly, vectors of lenti- or adenoviral origin are used, and the effect can be either transient or permanent. Some successful examples of gene therapy as a therapeutic regimen were published in the late 2000s (see [9] and references therein).

2) Post-transcriptional gene silencing by RNA interference (RNAi) is a naturally occurring process hypothesized to be feasible for therapeutic purposes [10].

3) Some medical conditions require modifications of the proteome. Injection of a blood clotting factor for a haemophiliac, or of insulin for a patient suffering from diabetes, are two examples. Vaccines can also be considered a form of proteome alteration.

4) and 5) The interactome and metabolome are, in the end, involved in all diseases and conditions. They can be targeted directly, often by small-molecular drugs, or through the lower therapeutic levels.

The vast majority of all modern drugs act on proteins [11]. In the early 2000s, the three most common drug targets were G-protein-coupled receptors (GPCRs), protein kinases, and ion channels. Close to half of all the small-molecule drug targets were enzymes [11], and >50 % of all targets of small-molecule and biological drugs are membrane bound [12]. Drugs directed towards proteins can also take the form of, e.g., peptidomimetics [13], RNA [14], and therapeutic affinity probes such as antibodies or artificial derivatives thereof [15, 16].

Although many drugs target intracellular structures, the relative occurrence and importance of active and passive uptake of the compounds must be considered unknown and is actively debated (see [17] and references therein).

Enzymes as drug targets

The importance of enzymes as drug targets is explained by their essential role in catalyzing reactions that would otherwise be too slow to support life as we know it. They consequently have a direct effect on the metabolome.

The catalytic efficiency of enzymes can generally be affected by compounds that bind and sterically hinder substrate access to the area of the catalytically active residues (the active site). Catalysis can also be indirectly influenced by altering the features of the active site or the orientation of the catalytic residues, e.g. through allostery. Allosteric effectors interact with a site distant from the active site and regulate enzyme activity via conformational effects. In contrast to active site inhibitors, allosteric effectors can also enhance efficiency (allosteric activators).

Proteases

Proteases, also referred to as proteinases or peptidases, constitute a group of enzymes that hydrolyze peptide bonds, i.e. cleave proteins. They are classified according to their catalytic mechanism. Almost one third of all proteases belong to the serine protease class [18]. They utilize a serine moiety in catalysis, while other classes rely on cysteine, threonine, aspartate, or glutamate residues, or on a metal ion. The catalytic triad of serine proteases (Ser-His-Asp) is found in four structure families, the different connectivity being a sign of convergent evolution [19].

Proteases are involved in many different processes in the human body, such as degradation of damaged proteins, regulation of blood clotting, and signaling pathways. Due to difficulties in achieving selectivity, proteases have been considered a challenging class of drug targets.

The selectivity of proteases depends on both the catalytic mechanism and the substrate. Because the catalytic residues differ in reactivity, the protease classes have, e.g., different pH optima. Aspartic proteases, such as HIV protease, have an unusually low pH optimum, in the range 2-5. The substrate selectivity is a result of the shape and other features of the active site. Typically, specificity is sequence dependent and can be expressed according to the Schechter and Berger nomenclature [20], where the residues of a substrate are denoted, from the N-terminus, P3, P2, P1, P1’, P2’, etc., with the scissile bond between the non-primed and the primed residues. The enzyme interaction sites are numbered correspondingly with the letter S.
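The Schechter and Berger numbering described above can be sketched in a few lines of Python. This is only an illustration: the function name, its arguments, and the example peptide are hypothetical and not taken from the thesis or from reference [20].

```python
def schechter_berger_labels(sequence: str, scissile_after: int) -> list[tuple[str, str, str]]:
    """Label each substrate residue with its P position and the matching S subsite.

    sequence       -- substrate residues, written from the N- to the C-terminus
    scissile_after -- 1-based index of the residue on the N-terminal side of the
                      scissile bond (this residue becomes P1)
    """
    if not 1 <= scissile_after < len(sequence):
        raise ValueError("the scissile bond must lie between two residues")

    labels = []
    for index, residue in enumerate(sequence, start=1):
        if index <= scissile_after:
            # Non-primed side: numbering increases towards the N-terminus.
            position = f"{scissile_after - index + 1}"
        else:
            # Primed side: numbering increases towards the C-terminus.
            position = f"{index - scissile_after}'"
        labels.append((residue, "P" + position, "S" + position))
    return labels


# Hypothetical hexapeptide cleaved between its third and fourth residue.
for residue, p_site, s_site in schechter_berger_labels("ABCDEF", 3):
    print(residue, p_site, s_site)
```

For the hypothetical hexapeptide, the third residue is labeled P1 and the fourth P1’, with enzyme subsites S1 and S1’ respectively.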

Polymerases

Polymerases are the central replicative machineries of life, with the task of turning single-stranded polynucleotides into double strands. In humans, three types are represented: DNA-dependent DNA polymerases (DdDp; active during replication), DNA-dependent RNA polymerases (DdRp; pivotal for gene expression), and RNA-dependent RNA polymerases (RdRp; suggested to be involved in transposon control [21]). A fourth type, RNA-dependent DNA polymerases (RdDp), is found in retroviruses.

Due to their processive nature, polymerases have a quite intricate catalytic mechanism. In addition, they utilize both large macromolecules, in the form of polynucleotide template strands, and small substrates, nucleoside triphosphates, in catalysis. This calls for a dynamic and flexible structure. Most polymerases function according to the common ‘two-metal-ion’ mechanism: one ion is believed to deprotonate the 3’ OH of the nascent strand, creating a nucleophile that attacks the α-phosphate group of an incoming NTP, held in place by the second metal ion, to elongate the polynucleotide [22].

RNA viruses – a moving target

Lacking metabolism, viruses are normally not classified as living organisms. However, these obligate intracellular parasites have genomes, consisting of single- or double-stranded RNA or DNA, that encode crucial enzymes and structural components required for their replication and for the formation of new infectious virions. The most common genetic setup produces viruses and proteins referred to as the wild type (wt).

Viruses do not acknowledge any taxonomical boundaries or restrictions defined by man. For RNA viruses, whose genomes are amplified by polymerases lacking proofreading mechanisms, genomically less well-defined virus populations of quasi-species [23] evolve very fast when subjected to a new selective pressure. A high viral load in the host further increases the rate of this process. The result is rapid evolution and, thus, development of resistance to selective agents such as drugs [24].

However, the probability of a specific multiple-site mutation is rather low, even at a high mutational rate and viral titer. This is the rationale for combination treatment of viral infections. The evolutionary distance between the wild type and the mutant is sometimes referred to as the genetic barrier to resistance; a longer distance, i.e. multiple mutations, is hard for the virus to cover in one leap (generation) and thus corresponds to a higher resistance barrier.

Resistance mutations in genes encoding viral enzymes do not only affect the structure of the protein and, thus, its interaction with the drug. They potentially also affect secondary polynucleotide structures, expression efficiency of the corresponding gene, post-transcriptional and post-translational modifications, and catalytic activity. This prevents the virus from unrestrained mutagenesis. However, some viral enzymes, e.g. HIV protease, recognize the backbone of the substrate rather than a specific amino acid sequence, making them less sensitive to mutations in their substrates [25].
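The argument that a specific multiple-site mutant is unlikely to arise in a single generation can be illustrated with a back-of-the-envelope calculation. The sketch below assumes that the required substitutions occur independently and with the same probability per site and replication; the function names and all numerical values are hypothetical and chosen only to show the scaling, not taken from the cited literature.

```python
def prob_specific_mutant(per_site_rate: float, n_sites: int) -> float:
    """Probability that one new genome carries all n_sites specific substitutions.

    Assumes independent substitutions with the same probability at every site.
    """
    if not 0.0 <= per_site_rate <= 1.0:
        raise ValueError("per_site_rate must be a probability")
    if n_sites < 1:
        raise ValueError("n_sites must be at least 1")
    return per_site_rate ** n_sites


def expected_mutants(per_site_rate: float, n_sites: int, new_genomes: float) -> float:
    """Expected number of genomes per generation carrying the specific mutant."""
    return new_genomes * prob_specific_mutant(per_site_rate, n_sites)


# Hypothetical values: 1e-5 substitutions per site and replication,
# 1e10 new genomes per generation.
for n_sites in (1, 2, 3):
    print(n_sites, expected_mutants(1e-5, n_sites, 1e10))
```

Under these assumed values, a given single-site mutant is expected many times over in each generation, whereas a specific triple mutant is expected far less than once. This is the sense in which several simultaneous mutations raise the genetic barrier to resistance.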

Human immunodeficiency virus, HIV

Human immunodeficiency virus (HIV), initially referred to as human T-lymphotropic virus type III (HTLV-III), was isolated in 1983 as the causative agent of human acquired immunodeficiency syndrome (AIDS) [26]. HIV is a blood-borne infection whose most common routes of transmission include blood transfusions, sharing of non-sterile needles for injecting drug use, and unprotected sexual relations. With over 34 million people infected worldwide, HIV is a global concern [27].

Therapies

Despite massive efforts in academia and industry, no curative regimen for HIV infection has been identified so far. Many stages of the viral life cycle have been targeted in the quest for effective therapy, including, but not limited to, cell entry, genomic integration, and viral maturation [28].

To tackle the rapidly emerging resistance to monotherapy, the standard of care is now combination therapy. The specific combination of drugs is selected with respect to the genetic profile of the viral population infecting the individual patient. These HAART (highly active anti-retroviral therapy) regimens often consist of three different drugs, typically targeting HIV reverse transcriptase (RT) with a combination of non-nucleoside and nucleoside analogues, and an HIV protease inhibitor [29]. For treatment-naïve patients, the use of protease inhibitors has historically been avoided due to side-effects. The second-generation inhibitors are improved in that respect and are therefore used more frequently in combination therapies [30]. Currently, 13 HIV RT and 10 HIV protease inhibitors are approved for clinical use [28]. However, to battle ever-developing resistance and to reduce pill burden and therapeutic side-effects, not to mention the prospect of finding a curative agent, the demand for new enzyme inhibitors remains high.

Taxonomy and molecular virology

HIV is a lentivirus (‘lenti’ from the Latin word for “slow”) of the Retroviridae family. There are two major types, HIV-1 and HIV-2, both with a tropism for human CD4-positive T-cells. The less widespread HIV-2 type is less virulent than HIV-1. HIV-1 is further divided into four groups (M, N, O, and P), of which one (M), often simply referred to as ‘HIV’, is responsible for the majority of all infections [31].

The enveloped virion contains two copies of a single-stranded positive-sense RNA strand. After entry into the host cell, the RNA is reverse transcribed to DNA by HIV RT, which is also part of the virion. The DNA is subsequently transported into the nucleus, where it is integrated into the host genome. From here, it replicates by use of the replicative machinery of the host cell.

The HIV genome encodes 19 proteins from 9 genes. Additional transcripts are generated by alternative splicing. Some of the proteins are translated in the form of polyproteins, which are released into their functional form through proteolytic processing by cellular and viral proteases. Three of the gene products are enzymes: integrase, RT, and protease.

New virions are assembled at the cell membrane, where they bud off to infect other T-cells. Once integrated into the genome of immune cells, the virus produces new virions for some time before the cells die from damage caused by the infection. Thus, the immune system is worn down, resulting in acquired immunodeficiency syndrome (AIDS) if left untreated. At this stage, little of the immune system is left to battle all the opportunistic parasites we are exposed to, but normally not affected by, in our everyday lives [32].

HIV-1 protease

HIV-1 protease (HIVp) is a homodimeric aspartic protease (EC 3.4.23.16) with its active site built up by the two subunits assembled symmetrically (Figure 3). Each 11 kDa monomer consists of 99 amino acid residues. Two β-sheet region flaps (top in Figure 3) control access to the quite hydrophobic active site. The catalytic residues are situated at the bottom of the cavity beneath the flaps and bind the water molecule used for hydrolysis of the scissile substrate peptide bond.

Figure 3. Structure of the symmetric homodimer of HIV-1 protease in complex with an active site inhibitor (green). In the apoenzyme, the flexible top flaps are not locked down but are open to varying degrees. PDB ID: 1KZK.

HIVp has an essential role in the production of new infective viral particles, since it hydrolyzes polypeptide precursors expressed from the viral genome [33]. In in vitro assays it has also repeatedly proven to be autohydrolytic. The enzyme is flexible and tolerates mutations well; polymorphism has been observed in >70 % of all residues [34], and resistance has been observed to all approved HIVp inhibitors [35].

Hepatitis C virus, HCV

In 1989, the agent causing non-A-non-B hepatitis was identified as Hepatitis C virus (HCV) [36]. It is transmitted via blood, and the main route for spread of the infection today is via needle sharing amongst injecting drug users, or in tattoo and body-piercing studios. The primary infection is often rather mild and asymptomatic, but patients have an 80 % risk of developing a chronic condition. Of these, up to 20 % develop cirrhosis and subsequently hepatocellular carcinoma. This makes HCV infection one of the major causes of liver cancer worldwide and the most common reason for liver transplants in the US. Chronic HCV infection also predisposes patients to other complications, e.g. renal dysfunction [37].

Therapies

Progress towards anti-HCV therapies was initially slow since the only model system available until 1999 was the chimpanzee, an expensive system that raises difficult ethical issues. In 1999, a cell-based replicon system was invented [38]. It was subsequently refined and now serves as an efficient model system for identification and characterization of potential HCV drugs [37, 39].

Routes to new therapies tried so far include different types of vaccines [40, 41], prevention of virion cell entry and of assembly and maturation of new virions, disruption of replication via inhibition of the NS3 protease or the NS5B polymerase, and blockage of the internal ribosome entry site (IRES) [42].

Until 2011, the standard of care was a general regimen of ribavirin (an antiviral guanosine nucleoside analog) and pegylated IFN-α. Patients subjected to this long therapy suffer severe side-effects. In the end, less than 50 % of HCV genotype 1 infected patients who complete the treatment achieve a sustained viral response (SVR), i.e. are cured [41]. The approval in 2011 of two drugs targeting the NS3 protease was therefore a considerable relief [43].

Taxonomy and molecular virology

HCV is a hepacivirus and belongs to the taxonomic family Flaviviridae together with flaviviruses (including, for example, Dengue and West Nile viruses) and pestiviruses. There are six main genotypes of HCV, some with additional subtypes, with approximately 30 % amino acid sequence divergence upon pairwise comparison. Genotypes 1b and 2 are most prevalent in Europe [44]. The virus has a liver cell tropism, but has also been observed to infect leukocytes [45].

HCV is a single-stranded, enveloped RNA virus with a 9.6 kb positive-sense genome, consisting of a 5’ non-coding region (NCR), an IRES sequence, a protein coding section, and a 3’ NCR. The positive sense of the genome enables direct translation into a polyprotein. This precursor of some 3000 amino acids is co- and post-translationally processed into at least 10 mature viral proteins (C, E1, E2, p7, NS2, NS3, NS4A, NS4B, NS5A, and NS5B). Some of these have structural roles, e.g. building up the icosahedral viral capsid, while other, non-structural (NS) proteins are responsible for the polyprotein processing (e.g. the NS2 and NS3 proteases) and viral replication (e.g. the helicase domain of NS3 and the NS5B polymerase). The structural proteins are located in the amino-terminal part of the polyprotein.

Viral replication takes place at the surface of the endoplasmic reticulum (ER), and most NS proteins are membrane associated or have a transmembrane region [46]. This is thought to enable compartmentalization, with a more protected environment and a high local concentration of the replicative machinery increasing the rate of replication [47]. Compartmentalization can also help to avoid early cellular recognition of the infection.

The HCV polymerase lacks proofreading, which together with a high viral load results in a less well-defined virus population of quasi-species [48].

HCV polymerase, NS5B

HCV NS5B is an RNA-dependent RNA polymerase (EC 2.7.7.48). It has a carboxy-terminal transmembrane region of 21 amino acids, which anchors it to the host cell ER and is essential for in vivo replication [49]. However, the in vitro activity of NS5B has been reported to be independent of the transmembrane region [50]. Due to the difficulties of working with transmembrane proteins, the truncated form lacking the carboxy-terminal 21 amino acids (NS5BΔ21) is often used in research. The crystal structure of NS5BΔ21 was determined in 1999 [51, 52] and, like that of numerous polymerases, can be described by the analogy of a right hand with thumb, palm, and fingers subdomains (Figure 4).

Figure 4. Structure of HCV NS5BΔ21 with its thumb (green), palm (red), and fingers (blue) domains highlighted. The latter domain forms the NTP channel in the center of this illustration. The two metal ions (purple) are situated in the active site. PDB ID: 1GX6. Five allosteric sites are indicated by reported inhibitor structures: Thumb I, beige (PDB ID: 2BRL); Thumb II, orange (PDB ID: 3FRZ); Palm I, dark grey (PDB ID: 3H2L); Palm II, light grey (PDB ID: 3FQL); and, somewhat concealed in the figure, Palm III, black (PDB ID: 3LKH).

Two channels, one for the RNA template and one for NTPs, supply the active site, situated in the palm domain. The active site aspartic acid residues (Asp220, Asp318, and Asp319) coordinate two Mg2+ ions used as cofactors according to the ‘two-metal-ion’ mechanism. Magnesium can be exchanged for manganese with maintained activity, but altered preference for, e.g., mode of initiation [53].

NS5B has a highly dynamic structure and can initiate RNA synthesis from circular templates [54]. During synthesis of double-stranded RNA, the polymerase must undergo substantial conformational rearrangements for the template and the new strand to exit [55, 56]. The flexibility of a thumb-domain β-hairpin is important for this event, as well as for positioning the viral template genome correctly for initiation of replication [57]. Although it is able to prime its own replication by so-called back priming, NS5B is thought to utilize de novo synthesis in vivo [56].

The activity of NS5B can be regulated by active site inhibition, or through one of five allosteric sites. These are referred to as the thumb-I and II, and palm-I, II, and III sites, and have been reported to bind, e.g., indole, pyridine, benzothiadiazine, benzofuran, and benzamide based scaffolds, respectively (as summarized in [58]).

Nucleoside and nucleotide-like inhibitors directed against the active site of DNA polymerases, e.g. HIV RT, have been reported to suffer from toxicity due to inhibition of human mitochondrial polymerase γ [59]. RdRps have a eukaryote-wide distribution, so corresponding NS5B inhibitors could potentially suffer from similar problems.

Endogenous targets

In addition to the drug targets in invasive disease-causing agents, such as viruses and bacteria, there are numerous endogenous targets. While anti-infective therapy often aims to eradicate the intruder(s), endogenous targets cannot always be affected without severe side-effects, e.g. as for some neurodegenerative diseases [60, 61].

Cardiovascular diseases

In 2008, over 17 million people died from cardiovascular disease [62]. However, disease in the heart and blood vessels can take many forms and have many causes. In atherosclerotic vascular disease, remnants of plaque ruptures can induce blood clotting in the vessels (thrombosis). To counteract this, thrombin has been identified as a good therapeutic target [63].

Thrombin

Human α-thrombin is a serine protease (EC 3.4.21.5) essential for blood clotting. It is an important drug target and has consequently been extensively investigated.

Thrombin is produced in the form of prothrombin (coagulation factor II), an inactive zymogen, and activated by proteolytic cleavage. The activity of thrombin is regulated reversibly via Na+, which interacts with an allosteric site [64], and can be irreversibly inactivated by antithrombin. The substrate selectivity of thrombin is altered when it binds to thrombomodulin, a membrane-bound thrombin receptor.

Thrombin is a heterodimer, with the two polypeptide chains linked by disulfide bonds. The catalytic triad, consisting of H57, D102, and S195, is situated in the B chain (259 residues), which has a chymotrypsin-like fold. The A chain (36 residues) has no known function [65].

Hirudin is a peptide with anticoagulant properties, since it binds to thrombin and inhibits its activity. It was originally discovered in leeches, which benefit from its anticoagulant properties. A recombinantly produced truncated form is often used in research laboratories for determination of thrombin concentration via active site titration (as in paper IV).

Figure 5. A surface representation of thrombin in complex with melagatran (red sticks) and recombinant hirudin (on the left in green). The A chain, seen at the top of the figure in yellow, stretches across the side of the B chain (in gray) opposite the active site. The Na+ allosteric binding site is shown in magenta, close to the active site. PDB ID: 1K22.

Physicochemical properties of drugs

A majority of orally administered drugs obey a set of criteria referred to as the Lipinski ‘Rule-of-Five’ (Ro5) [66]. Compliance with the rules increases the probability that a small-molecule compound will be an efficient drug. The rules were based on drugs accepted for clinical phase II trials, thus including compounds that failed in later stages. However, to establish guidelines based on drug candidates successful in phase III, where they must show an advantage over already approved regimens, would be problematic since it would introduce a confounded and irrelevant historical parameter of order-of-discovery [67]. The Ro5 criteria are all physicochemical: ≤ 5 hydrogen-bond donors, ≤ 10 hydrogen-bond acceptors (e.g. oxygen or nitrogen atoms), a log P octanol-water partition coefficient ≤ 5, and a molecular mass ≤ 500 Da. The average number of heavy atoms in a 500 Da drug-like molecule is 38, thus with an average atomic mass of 13.3 Da [68].

Affinity for a target can generally be improved by addition of hydrophobic motifs to the ligand. This has detrimental effects on the probability of clinical success and has been referred to as molecular obesity [69]. As candidates progress through the clinical phases of development, the average molecular weight of the orally administered compounds that remain decreases. Simultaneously, the most lipophilic substances are excluded [67].

The Ro5 should really be considered a guideline rather than a set of rules; approximately 20 % of all approved orally administered drugs fail at least one of the criteria [12]. Furthermore, it does not apply to naturally occurring compounds for which specific mechanisms of absorption exist [66].

Natural products used as drugs often differ in structure from those put together by researchers. The number of stereogenic centers and the overall complexity are typically higher in the former, with a higher relative number of carbon, hydrogen, and oxygen atoms. The molecular weights are often higher than the 500 Da suggested by Ro5, and some even contain reactive groups, a feature preferably avoided by medicinal chemists [70].

Inspired by Lipinski, there are now other rules of thumb, for example Veber’s criteria for bioavailability in the rat model system: polar surface area ≤ 140 Å², or a sum of H-bond donors and acceptors ≤ 12, and ≤ 10 rotatable bonds [71]. Compounds designed to, e.g., pass the blood-brain barrier have yet other typical characteristics.
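The four Ro5 criteria listed above lend themselves to a simple programmatic check. The following is a minimal sketch, assuming that the descriptors have already been computed elsewhere; the class name, the function name, and the two example compounds are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Pre-computed descriptors for one compound (hypothetical input format)."""
    h_bond_donors: int
    h_bond_acceptors: int
    log_p: float
    mass_da: float


def ro5_violations(d: Descriptors) -> list[str]:
    """Return the names of the Ro5 criteria that the compound fails."""
    checks = {
        "h_bond_donors": d.h_bond_donors <= 5,
        "h_bond_acceptors": d.h_bond_acceptors <= 10,
        "log_p": d.log_p <= 5,
        "mass_da": d.mass_da <= 500,
    }
    return [name for name, passed in checks.items() if not passed]


# Two invented compounds, used only to exercise the function.
small = Descriptors(h_bond_donors=2, h_bond_acceptors=5, log_p=2.1, mass_da=320.0)
large = Descriptors(h_bond_donors=6, h_bond_acceptors=12, log_p=3.0, mass_da=640.0)

print(ro5_violations(small))  # []
print(ro5_violations(large))  # ['h_bond_donors', 'h_bond_acceptors', 'mass_da']
```

Returning the list of failed criteria, rather than a single pass/fail flag, reflects the view that the Ro5 is a guideline: a compound failing one criterion can then be treated differently from one failing several.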

Fragment-based drug discovery, FBDD

FBDD is based on a concept of modular assembly of drug molecules. These modular units are binding motifs and are referred to as fragments. One of the advantages of fragments over conventional larger molecules in drug discovery is explained by combinatorics (Figure 6). The number of theoretical carbon-based compounds with a molecular mass less than 500 Da has been estimated to exceed 10⁶⁰ [72], often referred to as the total chemical space of drugs. A large compound library with some 10⁶ entities covers only a minute fraction of that hypervolume, here 10⁻⁵⁴.


Figure 6. Example of combinatorial advantage of fragments. Three fragments here cover the same chemical space as 27 larger, more drug-like, molecules of a certain size. Inclusion of pairwise combinations increases the difference in numbers further.

By the use of smaller compounds, fragments of potential drug molecules, which probe sub-binding motifs, the numbers can be improved. With a typical molecular weight of 150-300 Da, corresponding to some 12-25 heavy (i.e. non-hydrogen) atoms, the chemical space for fragments is smaller, with approximately 10⁸ to 10²⁵ theoretical compounds [73]. Thus, a fragment library of 1000 compounds covers at least 10⁻²² of the corresponding theoretical chemical space. However, the chemically tractable space, let alone the biological-activity space, is certainly much smaller than the full theoretical space.

With a model based on binary traits, Hann et al. showed that the risk of mismatch due to high complexity increases rapidly with the size of the ligand [74]. A low-complexity hit, i.e. a fragment, could instead be decorated into a full match.

Fragments found to convey the desired action (e.g. inhibition of enzyme activity or blockage of receptor binding) by binding to a drug target in a concentration-dependent and reversible manner can be matured into larger ligands by growth in an iterative fashion, linking to other fragments, or merging with larger structures (Figure 7). Proof of principle has been shown for all approaches. However, fragment linking with sustained overall affinity has proven difficult (see [75] and references within). The size of the final drug molecule is generally not affected by the type of discovery approach, fragment or HTS. Importantly, the additivity of fragment affinities has been shown not to apply to macromolecular systems due to non-local interactions [76, 77].
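The coverage figures quoted above for a conventional library and a fragment library follow from simple division, which is most conveniently done on a logarithmic scale. The sketch below reproduces them; the function name is an assumption made for illustration, and the space sizes are the estimates cited in the text.

```python
import math


def log10_coverage(library_size: float, space_size: float) -> float:
    """Return log10 of the fraction of a chemical space sampled by a library."""
    return math.log10(library_size) - math.log10(space_size)


# Conventional library: 10^6 compounds in a space of 10^60 compounds.
hts = log10_coverage(1e6, 1e60)
# Fragment library: 1000 compounds in a space of at most 10^25 compounds.
fragments = log10_coverage(1e3, 1e25)

print(f"conventional library: 10^{hts:.0f}")
print(f"fragment library:     10^{fragments:.0f}")
print(f"gain in coverage:     10^{fragments - hts:.0f}")
```

With the upper estimate of the fragment space, the fragment library samples its space some 32 orders of magnitude more densely, despite containing a thousand times fewer compounds.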

Figure 7. A schematic joining of fragments into a larger ligand.

The first practical example of fragment screening was published in 1996 by Shuker et al., where ‘SAR-by-NMR’ was described [78]. The approach has since been adopted by, mainly, SPR-based platforms [79], crystallographic methods [80], and mass spectrometry (MS) [81]. Yet other screening techniques are under development, e.g. weak affinity (liquid) chromatography (WAC) in combination with MS [82]. Fragment screens are also performed in silico, often in combination with other techniques [83].

The small size of fragments also brings challenges. The sensitivity of any assay used for interaction analysis of fragments needs to be high due to the low molecular weights (in particular for biosensor setups) and affinities. Furthermore, a smaller binding motif generally cannot be expected to display high selectivity. Very small fragments, with a molecular weight < 160 Da, can be expected to display a promiscuous binding profile with low affinities [74]. However, quite high target selectivity can be achieved with high-affinity fragments for highly ligandable targets [84, 85].

How to compile a fragment library well suited for screening is an open question, and several approaches have been suggested and tried, e.g. the use of metabolite-like compounds [86] and different ratios of sp2/sp3 hybridized carbons [87]. The library should also cover the maximal hypervolume of chemical space [88]. The purpose of screening a fragment library is important to consider, since the screening for novel inhibitor scaffolds of a specific target and the general interrogation of target ligandability may require two different compound library compositions.

While nature can readily produce very intricate molecules, medicinal chemists are often restrained to a smaller toolbox. The compounds must be synthetically tractable and economically feasible, reducing, for instance, the number of incorporated stereogenic centers. Desired interaction characteristics add another layer of constraints, in addition to the demands on bioavailability, localization, time and mechanism, etc.

In analogy with the ‘Rule-of-Five’, a ‘Rule-of-Three’ has been suggested for FBDD [89]: ≤ 3 hydrogen-bond donors, ≤ 3 hydrogen-bond acceptors (i.e. oxygen or nitrogen atoms), calculated logP ≤ 3, and molecular mass ≤ 300 Da. The validity of this ‘rule’ has, however, been questioned [90].
In August 2011, the first drug developed by a fragment-based approach, the kinase inhibitor PLX4032, was approved by the FDA [91], showing proof-of-concept for this strategy.

General approaches for drug discovery

Before the molecular biology era, drug discovery was purely based on phenotypic screening and evaluation, using a holistic top-down approach. Nowadays, target based discovery is more common, with its reductionistic bottom-up take on the discovery process. However, target based techniques have historically had a lower NME production rate in comparison with phenotypic based approaches [92].

Both phenotypic and target based approaches can utilize rational drug design (RDD), which can be seen as the opposite of purely combinatorial techniques, indeed a wide definition. Mechanism based inhibitors and transition-state analogues (see section on thermodynamics below) are often examples of results from RDD.

Target based RDD can have two different starting points, either based on information about the ligands, e.g. structural and functional feature maps (pharmacophores), or on the structure of the target, often obtained by x-ray crystallography or NMR spectroscopy. The target binding site topology and features can also be deduced with the help of a ligand analogue series by structure-activity relationship (SAR) studies. The non-additive nature of binding energies of macromolecules and their dynamic structures can make SAR, and RDD, challenging [93].

Metrics in drug discovery

Many metrics and indices have been suggested to help discriminate compounds with high potential to reach the clinic and market from those with lower. To compare the potency of same-size hits in pre-clinical drug discovery, the ligand efficiency (LE) metric has been widely used. The free energy of binding (ΔG) is considered in relation to the number of heavy (i.e. non-hydrogen) atoms. Thus, the affinity is normalized with respect to size:

$$\mathrm{LE} = \Delta g = \frac{\Delta G}{N_\text{non-hydrogen atoms}} \tag{Eq. 1}$$
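As a worked illustration of the size normalization, the sketch below computes LE from an equilibrium dissociation constant, using the standard relation ΔG = RT ln KD (1 M standard state). The function names, the temperature default, and the two example compounds are assumptions made for illustration only; with this sign convention a more negative LE indicates a more efficient binder.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def binding_free_energy(kd_molar: float, temp_k: float = 298.15) -> float:
    """Return the binding free energy in kJ/mol from K_D via dG = RT ln K_D."""
    return R * temp_k * math.log(kd_molar) / 1000.0


def ligand_efficiency(kd_molar: float, heavy_atoms: int, temp_k: float = 298.15) -> float:
    """Return LE in kJ/mol per heavy atom (negative for favorable binding)."""
    return binding_free_energy(kd_molar, temp_k) / heavy_atoms


# Invented examples: a weak fragment and a potent, larger lead compound.
fragment_le = ligand_efficiency(kd_molar=100e-6, heavy_atoms=13)
lead_le = ligand_efficiency(kd_molar=20e-9, heavy_atoms=38)

print(f"fragment: {fragment_le:.2f} kJ/mol per heavy atom")
print(f"lead:     {lead_le:.2f} kJ/mol per heavy atom")
```

In this invented example, the fragment binds almost four orders of magnitude more weakly than the lead, yet contributes more binding energy per heavy atom.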

LE can be used to identify strong binding motifs and compare close analogues. Since this metric does not scale linearly with ligand size above 20 heavy atoms, several other indices have been suggested. These also include features increasingly important in the later stages of pre-clinical research, e.g. weight adjusted % target inhibition [94], residence time [95], and enthalpic [6] and kinetic efficiency [96] (EE and KE, respectively); kinetic profiles hold increasingly more information in the later stages of compound development. However, the general understanding of how specific structural components affect the kinetic profile of a compound (i.e. structure kinetic relationships; SKR) is still limited.

For structural similarity comparisons of smaller molecules, Tanimoto and Tversky [97] coefficients are often used [98]. They are both based on pairwise comparisons of binary traits according to Eq. 2, where A and B are the numbers of traits true for molecule 1 and 2, respectively (see Figure 8 for a conceptual visualization with a Venn diagram), and S is the similarity score:

$$S = \frac{A \cap B}{A \cap B + \alpha \cdot A \setminus B + \beta \cdot B \setminus A} \tag{Eq. 2}$$

Figure 8. The intersection of A and B (denoted A∩B), the relative complement of B in A (A\B), and the relative complement of A in B (B\A) are visualized.

The Tanimoto coefficient is obtained when α = β = 1. It is used for comparison of same sized molecules. To identify substructures of one molecule in another, the Tversky coefficient is used. The Tversky score is calculated when α = 0 and β = 1, or when α = 1 and β = 0 (see use in paper II and [88]).
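The two coefficients can be sketched with Python sets standing in for the binary traits of each molecule. The function names and the integer trait identifiers below are hypothetical; in practice the traits would come from a molecular fingerprint.

```python
def tversky(a: set, b: set, alpha: float, beta: float) -> float:
    """Similarity score S of Eq. 2 for two sets of binary traits."""
    common = len(a & b)
    denominator = common + alpha * len(a - b) + beta * len(b - a)
    return common / denominator if denominator else 0.0


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: the symmetric case alpha = beta = 1."""
    return tversky(a, b, alpha=1.0, beta=1.0)


# Invented trait sets: every trait of the fragment also occurs in the ligand.
fragment = {1, 2, 3, 4}
ligand = {1, 2, 3, 4, 5, 6, 7, 8}

print(tanimoto(fragment, ligand))                     # 0.5
print(tversky(fragment, ligand, alpha=1.0, beta=0.0))  # 1.0
print(tversky(fragment, ligand, alpha=0.0, beta=1.0))  # 0.5
```

The asymmetric score of 1.0 shows why the Tversky coefficient suits substructure searches: traits found only in the larger molecule are not penalized when β = 0, whereas the Tanimoto coefficient penalizes the size difference.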

Interaction analysis

Affinity, kinetics, and interaction mechanisms

The interaction between two molecules can be described in various ways. A characteristic often used to describe the binding strength is affinity, preferably reported as the equilibrium association or dissociation constant, KA and KD, respectively, the two constants being each other’s reciprocals. They state the ratio of the concentrations of the free interactants and their complex at equilibrium (for Scheme 1 as stated in Eq. 3).

The overall affinity is a result of all forces, attractive and repulsive, between the two molecules. These forces are electrostatic in nature, or have an entropic origin (see below). Electrostatic interactions involve structural motifs with formal charges (ions), dipoles (including hydrogen bonding), and dispersive forces from transient dipoles. The median affinity for small-molecule drugs on the market is approximately 20 nM [12].

The specificity of a ligand can be defined as the difference in affinity towards the intended target and other interactants. Thus, specificity is dependent on the number of off-targets and their relative abundance.

Although KD is a useful metric in early stage pre-clinical drug discovery, it is too blunt a tool on its own in later phases. In the lead generation and optimization stages, kinetic profiles are a valuable asset. Affinity is a composite of the rate constants describing the interaction. The association and dissociation rate constants are generally denoted kon, k1, or ka, and koff, k-1, or kd, respectively. Consider the binding of an inhibitor (I) to an enzyme (E). In the least complex case the interaction can be described

$$\mathrm{E} + \mathrm{I} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{EI} \tag{Scheme 1}$$

and the affinity expressed

$$K_\mathrm{D} = \frac{[\mathrm{E}][\mathrm{I}]}{[\mathrm{EI}]} \tag{Eq. 3}$$

However, macromolecular binding is generally more complex, with multiple steps including, e.g., conformational changes. There are two theories of how molecular recognition occurs, both replacing the old key-lock model with its rigid protein structures. Either the liganding event induces a conformational change, a so-called induced fit (the model is also known as the KNF model, named after Koshland, Némethy, and Filmer [99]):

$$\mathrm{E} + \mathrm{I} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{EI} \underset{k_{-2}}{\overset{k_{2}}{\rightleftharpoons}} \mathrm{E^{*}I} \tag{Scheme 2}$$

or the ligand binds the enzyme as it samples transient conformations, also known as selected fit, population shift, or the MWC (Monod-Wyman-Changeux) model [100]:

$$\mathrm{E} \underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{E^{*}} + \mathrm{I} \underset{k_{-2}}{\overset{k_{2}}{\rightleftharpoons}} \mathrm{E^{*}I} \tag{Scheme 3}$$

The affinity expressions for Scheme 2 and Scheme 3 are somewhat more complex in comparison to Eq. 3.

The interaction mechanism has a large impact on both the toxicity and the efficacy of a drug [101]. For example, the rate of complex decay, and, thus, the residence time of the ligand, can have important implications in a biological system. The maximal duration of effect is limited by the turnover rate of the target population. Two drugs with the same affinity towards their target, but with different rate constants and interaction mechanisms, can differ in efficacy, as was the case in paper III. This demonstrates the importance and advantage of access to kinetic rate data [95]. In the design of, e.g., TS analogues, information about the interaction mechanism and structure is vital.
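The point that equal affinities can hide very different kinetics can be made concrete for the one-step mechanism of Scheme 1, for which KD = k-1/k1 and the residence time is the reciprocal of the dissociation rate constant. The sketch below assumes this simplest mechanism; the class name and the rate constants are invented for illustration and are not data from paper III.

```python
import math
from dataclasses import dataclass


@dataclass
class Kinetics:
    """Rate constants for a one-step reversible interaction (Scheme 1)."""
    k_on: float   # association rate constant, M^-1 s^-1
    k_off: float  # dissociation rate constant, s^-1

    @property
    def k_d(self) -> float:
        """Equilibrium dissociation constant in M."""
        return self.k_off / self.k_on

    @property
    def residence_time(self) -> float:
        """Mean lifetime of the complex in s."""
        return 1.0 / self.k_off

    @property
    def half_life(self) -> float:
        """Half-life of the complex in s."""
        return math.log(2) / self.k_off


# Two invented inhibitors with the same affinity but different kinetics.
fast = Kinetics(k_on=1e7, k_off=1e-1)
slow = Kinetics(k_on=1e4, k_off=1e-4)

for name, inhibitor in (("fast", fast), ("slow", slow)):
    print(f"{name}: KD = {inhibitor.k_d:.1e} M, "
          f"residence time = {inhibitor.residence_time:.0f} s")
```

Both inhibitors have a KD of 10 nM, but the complex of the slowly dissociating one persists a thousand times longer, a difference that an equilibrium measurement alone would not reveal.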

Thermodynamic interaction analysis

An interaction can also be described in terms of thermodynamic entities. The Gibbs free energy, ΔG, is the overall potential energy in a system at constant temperature and pressure. The maximal free energy contribution per non-hydrogen atom in a reversible drug-target interaction, determining the maximal affinity of the ligand, has been estimated to be approximately -6.3 kJ/mol (i.e. -1.5 kcal/mol) for small molecules, excluding metal ions, which display unusually high values [102]. Thus, the addition of, for instance, a single methyl group can increase compound potency 10-fold [68]. This estimate holds well up to 20 heavy atoms, and at 25 the maximal affinities plateau [103].

According to transition-state (TS) theory, a transient high energy complex (indicated by the double dagger, ‡) is formed during enzymatic catalysis. Similarly, a higher energy state is visited upon mere liganding. The catalytic effect of an enzyme is generally believed to originate in a stabilized TS with a reduced energy barrier. The energy levels of individual reactions are affected by the solvent properties, e.g. ionic strength. The effect depends on the individual reaction mechanism. The change in ΔG during an interaction, or a full catalytic turnover, can be visualized schematically as in Figure 9.
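The link between the per-atom estimate of -6.3 kJ/mol and the roughly 10-fold gain in potency quoted above follows from the exponential relation between free energy and affinity. A minimal check, assuming a temperature of 25 °C (the function name is hypothetical):

```python
import math

R = 8.314  # gas constant, J/(mol K)


def affinity_fold_change(delta_delta_g_kj: float, temp_k: float = 298.15) -> float:
    """Fold change in affinity for a change in binding free energy (kJ/mol).

    A negative value (more favorable binding) gives a factor larger than 1.
    """
    return math.exp(-delta_delta_g_kj * 1000.0 / (R * temp_k))


fold = affinity_fold_change(-6.3)
print(f"{fold:.1f}-fold tighter binding for one maximally efficient heavy atom")
```

The result is of the order of 10, in line with the quoted effect of a single, optimally placed methyl group.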

Figure 9. Schematic energy profiles (energy versus reaction coordinate). a) Energy profile for a catalytic cycle of substrate ‘S’ to product ‘P’ by an enzyme ‘E’, and for a spontaneous reaction of S to P. b) Changes in Gibbs free energy during a liganding event, illustrated by an enzyme-inhibitor interaction. c) and d) display the energy profiles of selected and induced fit, respectively.

ΔG can be dissected into an enthalpy term, ΔH, describing the total energy of a system, and an entropy term, ΔS, popularly explained as a measurement of randomness in a system:

$$\Delta G = \Delta H - T \Delta S \tag{Eq. 4}$$

A molecule with a hydrophobic structural motif dissolved in water will decrease the randomness of the system with an ordered first shell of solvent molecules around its lipophilic part. If similar motifs aggregate, a smaller non-polar surface area is exposed to the polar solvent. The clathrate structure of the water will be less rigid and, thus, the energy of the system lower. This is referred to as the hydrophobic effect. Thus, the stability of a protein fold or complex is not only determined by the strength of the bonds formed between the interacting molecules, but also by the degree of freedom in the system.

The Gibbs free energy of an interaction can also be expressed by an affinity entity as

$$\Delta G = RT \ln K_\mathrm{D} \tag{Eq. 5}$$

Thus, selectivity can be expressed as ΔΔG. Combining Eq. 4 and Eq. 5 gives the van’t Hoff equation, which enables the assessment of ΔH and ΔS from KD values determined at different temperatures:

$$\ln K_\mathrm{D} = \frac{\Delta H}{R} \cdot \frac{1}{T} - \frac{\Delta S}{R} \tag{Eq. 6}$$

To improve statistical reliability, the temperature range of the experiments should be chosen as wide as possible, with the temperature for the sought ΔG in the middle [104].

Eq. 4 holds true also for the separate association and dissociation events, which can be studied through two general methods. The Eyring equation, derived from transition-state theory, can for the rate constant k be expressed as follows (kB and h being the Boltzmann and Planck constants, respectively):

$$k = \frac{k_\mathrm{B} T}{h} \cdot e^{-\Delta G^{\ddagger} / RT} \tag{Eq. 7}$$

Combined with Eq. 4:

$$k = \frac{k_\mathrm{B} T}{h} \cdot e^{-\Delta H^{\ddagger} / RT} \cdot e^{\Delta S^{\ddagger} / R} \tag{Eq. 8}$$

After rearrangement it takes on a linear form

$$\ln \frac{k}{T} = -\frac{\Delta H^{\ddagger}}{R} \cdot \frac{1}{T} + \frac{\Delta S^{\ddagger}}{R} + \ln \frac{k_\mathrm{B}}{h} \tag{Eq. 9}$$

The transmission coefficient (κ) is here approximated to 1 and omitted from the equations, i.e. all reaction complexes reaching TS are assumed to follow the reaction trajectory forward. Although this is common practice, the theoretical justification of this general approximation has been questioned. However, experimentally determined values of κ have been close to 1 [105].

With the empirically derived Arrhenius equation, an activation energy (Ea) can be calculated

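$$k = A \cdot e^{-E_\mathrm{a} / RT} \tag{Eq. 10}$$

where Ea relates to ΔH‡ as

$$E_\mathrm{a} = \Delta H^{\ddagger} + RT \tag{Eq. 11}$$

ΔS‡ can be obtained from the Arrhenius parameters by using the Eyring equation, combining Eq. 8, Eq. 10 and Eq. 11 into Eq. 12:

$$A = \frac{e \cdot k_\mathrm{B} T}{h} \cdot e^{\Delta S^{\ddagger} / R} \tag{Eq. 12}$$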
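The linear form of the Eyring equation (Eq. 9) suggests a straightforward analysis: regress ln(k/T) on 1/T and read ΔH‡ from the slope and ΔS‡ from the intercept. The sketch below does so with an ordinary least-squares fit on synthetic, noise-free rate constants, and then converts the result to Arrhenius parameters via Eq. 11 and Eq. 12. All function names and numerical values are assumptions made for illustration, not data or procedures from papers III and IV.

```python
import math

R = 8.314462618      # gas constant, J/(mol K)
K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J s


def linear_fit(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def eyring_rate(temp_k, dh, ds):
    """Rate constant from Eq. 8 (dh in J/mol, ds in J/(mol K))."""
    return (K_B * temp_k / H) * math.exp(-dh / (R * temp_k)) * math.exp(ds / R)


def eyring_fit(temps_k, rates):
    """Return (dH, dS) of activation from the linear form in Eq. 9."""
    x = [1.0 / t for t in temps_k]
    y = [math.log(k / t) for k, t in zip(rates, temps_k)]
    slope, intercept = linear_fit(x, y)
    return -slope * R, (intercept - math.log(K_B / H)) * R


def arrhenius_parameters(dh, ds, temp_k):
    """Return (Ea, A) at one temperature via Eq. 11 and Eq. 12."""
    e_a = dh + R * temp_k
    a = (math.e * K_B * temp_k / H) * math.exp(ds / R)
    return e_a, a


# Synthetic dissociation rate constants for invented activation parameters.
true_dh, true_ds = 60e3, -50.0
temps = [278.15 + 5.0 * i for i in range(7)]  # 5 to 35 degrees C
rates = [eyring_rate(t, true_dh, true_ds) for t in temps]

dh, ds = eyring_fit(temps, rates)
e_a, a = arrhenius_parameters(dh, ds, 298.15)
print(f"dH = {dh / 1000:.1f} kJ/mol, dS = {ds:.1f} J/(mol K)")
print(f"Ea = {e_a / 1000:.1f} kJ/mol at 25 degrees C")
```

With real data the fit would carry uncertainty, which is one reason for choosing a wide temperature range; the noise-free input here only demonstrates that the rearrangement recovers the parameters used to generate it.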

Figure 10. a) Thermodynamic analysis of the equilibrium by a van’t Hoff plot (ln KD versus 1/T; slope ΔH/R, intercept -ΔS/R). The association and dissociation events can be analyzed correspondingly by b) Arrhenius (ln k versus 1/T; slope -Ea/R, intercept ln A) and c) Eyring (ln(k/T) versus 1/T; slope -ΔH‡/R, intercept ΔS‡/R + ln(kB/h)) plots.

Due to the many non-local intramolecular interactions in proteins, simple rules of additivity for thermodynamic entities do not apply. All the interactions of the system, i.e. the protein and its binding partner, must be considered as a whole [76, 93]. For the same reason, ΔH has been argued to characterize an interaction better than ΔG, since enthalpy-entropy compensation may obscure details of the interaction [76]. Examples of how to apply the above techniques are found in papers III and IV.
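A van’t Hoff analysis of equilibrium data can be sketched in a few lines: ln KD is regressed on 1/T, the slope gives ΔH/R and the intercept -ΔS/R. The example below uses synthetic, noise-free KD values generated from invented parameters, so it only illustrates the procedure under the assumption that ΔH and ΔS are constant over the temperature range; the function names are hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def linear_fit(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def vant_hoff_fit(temps_k, kd_values):
    """Return (dH in J/mol, dS in J/(mol K)) from K_D at several temperatures."""
    x = [1.0 / t for t in temps_k]
    y = [math.log(kd) for kd in kd_values]
    slope, intercept = linear_fit(x, y)
    return slope * R, -intercept * R


# Synthetic K_D values for an invented, enthalpy-driven interaction.
true_dh, true_ds = -40e3, -20.0
temps = [278.15 + 5.0 * i for i in range(7)]  # 5 to 35 degrees C, 20 in the middle
kds = [math.exp(true_dh / (R * t) - true_ds / R) for t in temps]

dh, ds = vant_hoff_fit(temps, kds)
dg_mid = dh - 293.15 * ds  # free energy at the middle temperature
print(f"dH = {dh / 1000:.1f} kJ/mol, -T*dS = {-293.15 * ds / 1000:.1f} kJ/mol, "
      f"dG = {dg_mid / 1000:.1f} kJ/mol at 20 degrees C")
```

Reporting ΔH and -TΔS side by side, rather than ΔG alone, makes any enthalpy-entropy compensation between compounds visible.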

Techniques for interaction analysis

There are many different techniques with which to study interactions between molecules. To gain a proper understanding of the interaction, complementary techniques are used in an orthogonal fashion.

For structural information, x-ray crystallography and nuclear magnetic resonance spectroscopy (NMR) are widely used, while a catalytic mechanism can be kinetically dissected by stopped-flow spectroscopy. For further characterization, thermodynamic entities can either be assessed indirectly, e.g. through van’t Hoff analysis of affinity data from a range of temperatures, or directly by calorimetric techniques such as isothermal titration calorimetry (ITC). A surface plasmon resonance-based biosensor can provide, among other things, measurements of affinity and kinetic rate constants (see further below).

Some techniques for interaction analysis use extra reporter molecule(s), e.g. an enzyme substrate with suitable spectrophotometric features. One such indirect technique, often used for protease activity measurement, is Förster resonance energy transfer (FRET, also known as fluorescence resonance energy transfer), e.g. as in paper II. By use of a fluorescent group attached at one end of a substrate peptide, and a so-called quencher at the other end, activity can be observed as an increase in fluorescence when the peptide is hydrolyzed and the two groups diffuse apart.

Surface plasmon resonance-based biosensors

Surface plasmon resonance (SPR)-based biosensors provide a direct binding, real-time, and label-free technique with suitable dynamic range to study molecular interactions [106] in terms of affinity, kinetics, and thermodynamics. The instruments used in the work presented in this thesis were all of the Biacore type, the most commonly used type today [107].

The technique

A biosensor chip, i.e. a glass slide with a thin layer of gold, is mounted on a prism. On the opposite side of the slide, a flow chamber is created by a silicon microfluidic system. A laser beam is sent through the prism towards the sensor chip at an angle giving total reflection (Figure 11). The reflection of the light gives rise to a weak evanescent electromagnetic field close to the flow cell floor on the other side of the chip. At a specific angle of the incoming light (the SPR angle), the resonance frequencies of the surface plasmon in the gold layer and the evanescent field match and an energy transfer occurs. This can be detected as a dip in intensity of the reflected light. The SPR angle is dependent on the refractive indices of the prism and of the medium in the flow chamber, close to the chip. The former is constant whereas the latter is variable. When a compound in solution (analyte) binds to, or dissociates from, a molecule immobilized on the surface (ligand), the mass close to the chip surface, and thus the refractive index, changes [108]. For biochemically relevant molecules the signal varies linearly with mass [109]. SPR-based biosensors are sometimes able to detect conformational changes of the ligands since a change in structure can affect the refractive index [110]. The innate effect of ligands and analytes on the refractive index, i.e. the refractive index increment, is generally disregarded, despite known differences [111, 112].

Figure 11. A schematic of a Biacore type SPR-biosensor (light source, prism, sensor chip, flow cell, and detector) with analyte (red spheres) injected over a protein ligand (green skeins) immobilized on the dextran matrix surface (black).

A technique based on refractive indices is highly sensitive to differences in buffer composition. During early phase drug discovery, the compounds investigated are often quite aliphatic, and, thus, hydrophobic. To ensure solubility, the organic solvent dimethyl sulfoxide (DMSO) is often added to the buffers used, to 3-5 % (v/v). Unfortunately, DMSO has a high refractive index. Thus, small variations in DMSO concentration between sample and running buffer will translate to substantial signal differences, referred to as bulk response. Furthermore, a detection area (spot) with immobilized ligand will display a lower bulk response compared to an unaltered spot, the latter having a larger surface area exposed to solvent. This excluded volume effect and the previously described artifacts are removed from the data by reference and blank sample subtraction, and a DMSO standard solvent correction procedure [113].
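The reference and blank subtraction mentioned above is often described as double referencing: the signal from a reference spot is subtracted from that of the ligand spot, and the same difference recorded for a blank (buffer) injection is then subtracted from the result. The sketch below illustrates only this arithmetic on invented signal traces; it is an assumption about the general procedure rather than a description of the instrument software, and it does not include the DMSO solvent correction, which relies on a separate calibration series.

```python
def double_reference(active, reference, blank_active, blank_reference):
    """Subtract the reference spot and a blank injection from a sensorgram.

    All arguments are equally long lists of signals (RU) sampled at the same
    time points: sample over the ligand spot, sample over the reference spot,
    and the corresponding two traces for a blank injection.
    """
    return [
        (a - r) - (ba - br)
        for a, r, ba, br in zip(active, reference, blank_active, blank_reference)
    ]


# Invented traces: a 40 RU bulk response dominates the raw sample signal.
active = [0.0, 55.0, 60.0, 5.0]
reference = [0.0, 40.0, 40.0, 0.0]
blank_active = [0.0, 2.0, 2.0, 0.0]
blank_reference = [0.0, 1.0, 1.0, 0.0]

print(double_reference(active, reference, blank_active, blank_reference))
# [0.0, 14.0, 19.0, 5.0]
```

In this invented example, most of the raw signal during the injection is bulk response; only the remainder reflects binding to the ligand.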

Acquisition of data and extraction of information

The interaction event is recorded in a signal to time plot, referred to as a sensorgram, with three distinct phases. First, the baseline signal level is defined with the continuous buffer flow. Thereafter, a sample is injected over the chip surface during the association phase. Finally, during the dissociation phase, the analyte is washed off by buffer flow (Figure 12). If the target surface has not been sufficiently regenerated, an additional harsher washing step, e.g. a brief pulse of buffer with high ionic strength or low pH, can be included.

Figure 12. The sensorgram (signal in RU versus time in s) is composed of an association phase (A) and a dissociation phase (D). The signal level prior to sample injection (arrow in figure) is referred to as baseline.

Data is often presented in overlay plots with multiple sensorgrams (Figure 13), and, if they display sufficient curvature, global non-linear regression of an interaction model of choice is used to extract the kinetic rate constants. When the association and dissociation events are very fast, the sensorgrams are shaped like square pulses. The lack of curvature prevents proper determination of the rate constants. If steady state (see below) has been reached during the association phase, a saturation plot (Figure 13), i.e. a signal to concentration plot, can be used to determine the equilibrium dissociation constant as the concentration at which the signal is half the maximal. Square-pulse sensorgrams are often observed when working with fragment sized molecules due to rapid kinetics and low affinities.

When the rate of complex formation equals the rate of breakdown, the association phase sensorgram is horizontal. This steady state does not, however, imply that the surface is saturated. In Figure 13, at least the three highest concentrations reach steady state. Saturation would demand higher concentrations still. Often, analyte solubility issues prevent this. The theoretical maximal signal (Rmax) can in the least complex case be calculated as

$$R_\mathrm{max} = \frac{M_\mathrm{analyte}}{M_\mathrm{ligand}} \cdot R_\mathrm{ligand} \tag{Eq. 13}$$

where M denotes molecular mass and R_ligand the immobilization level of the ligand, assuming a 1:1 interaction.

The ratio of the apparent and theoretical Rmax gives an estimate of the fraction of complex-competent ligand molecules. This is commonly referred to as surface functionality.
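As a numerical illustration of surface functionality, the sketch below assumes the commonly used mass-ratio relation for the theoretical Rmax of a 1:1 interaction, i.e. the analyte-to-ligand molecular mass ratio multiplied by the immobilization level. This relation, the function names, and all numbers are assumptions made for illustration and are not values from the thesis.

```python
def theoretical_r_max(analyte_mass_da: float, ligand_mass_da: float,
                      immobilized_ru: float, stoichiometry: float = 1.0) -> float:
    """Theoretical maximal signal (RU), assuming the signal scales with mass."""
    return analyte_mass_da / ligand_mass_da * immobilized_ru * stoichiometry


def surface_functionality(apparent_r_max: float, theoretical: float) -> float:
    """Fraction of immobilized ligand molecules able to form a complex."""
    return apparent_r_max / theoretical


# Invented example: a 300 Da analyte and 6000 RU of a 30 kDa protein ligand.
r_max = theoretical_r_max(analyte_mass_da=300.0, ligand_mass_da=30000.0,
                          immobilized_ru=6000.0)
functionality = surface_functionality(apparent_r_max=45.0, theoretical=r_max)

print(f"theoretical Rmax: {r_max:.0f} RU")
print(f"surface functionality: {functionality:.0%}")
```

The example also shows why high immobilization levels are needed for small analytes: even a fully functional surface of this kind yields only some tens of RU at saturation.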


Figure 13. Overlay plot of sensorgrams (left) and the corresponding saturation plot (right) for a simulated data set. The analyte concentrations differ by a factor of two, with the highest concentration yielding the highest signal. The theoretical maximal signal was 100 RU.
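A data set of the kind shown in Figure 13 can be simulated with the one-step model of Scheme 1. The sketch below assumes ideal 1:1 binding without mass transfer limitation or bulk effects; the rate constants, the injection times, and the function names are invented, and only the two-fold dilution series and the Rmax of 100 RU are taken from the figure legend.

```python
import math


def steady_state(conc: float, k_d: float, r_max: float) -> float:
    """Steady-state signal (RU) of a 1:1 interaction at analyte concentration conc."""
    return r_max * conc / (k_d + conc)


def sensorgram(conc, k_a, k_d_rate, r_max, t_assoc, t_dissoc, dt=1.0):
    """Simulate association and dissociation phases; returns (times, signals)."""
    r_eq = steady_state(conc, k_d_rate / k_a, r_max)
    k_obs = k_a * conc + k_d_rate  # observed rate constant during association
    times, signals = [], []
    for i in range(int(t_assoc / dt) + 1):
        t = i * dt
        times.append(t)
        signals.append(r_eq * (1.0 - math.exp(-k_obs * t)))
    r_end = signals[-1]  # signal at the end of the injection
    for i in range(1, int(t_dissoc / dt) + 1):
        t = i * dt
        times.append(t_assoc + t)
        signals.append(r_end * math.exp(-k_d_rate * t))
    return times, signals


K_A, K_D_RATE, R_MAX = 1e5, 1e-2, 100.0      # M^-1 s^-1, s^-1, RU (invented)
concentrations = [1e-6 / 2 ** i for i in range(6)]  # two-fold dilution series, M

for conc in concentrations:
    _, signals = sensorgram(conc, K_A, K_D_RATE, R_MAX, t_assoc=60.0, t_dissoc=60.0)
    plateau = steady_state(conc, K_D_RATE / K_A, R_MAX)
    print(f"{conc * 1e9:7.1f} nM: end of injection {max(signals):5.1f} RU, "
          f"steady state {plateau:5.1f} RU")
```

Comparing the signal at the end of the injection with the calculated steady-state level shows which concentrations have reached steady state within the injection time, and that even the highest concentration remains below Rmax.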

Experimental design

The quality of information achieved through SPR-biosensors is highly dependent on the experimental design and data evaluation. It is, e.g., advisable to perform experiments at different surface immobilization levels. If many ligand molecules are attached to the biosensor surface and the association rate of the analyte is high, the buffer film closest to the surface risks being depleted of analyte, and the diffusion from the buffer bulk to the surface will be rate determining. This condition of mass transfer limitation (MT) can often be solved by a lower surface immobilization level and a higher buffer flow rate.

Other variables to consider are, e.g., the method of immobilization, which can heavily influence the final surface functionality; the choice of which molecule to be the ligand and which the analyte; and buffer conditions. By a clever choice of target combinations, additional information, such as location of binding site, can be obtained. In paper III, same-site binding is suggested by a competition experiment.

Aims

The aim of this thesis has been to provide suitable starting materials, efficient experimental approaches and informative data analysis procedures that enable identification and selection of compounds that can be developed into efficient drugs. The research has focused on strategies that will contribute to decreasing the attrition in drug discovery.

Present investigations

Paper I – Design and validation of a fragment library

The aim was to compile and experimentally validate a fragment library designed for primary screening using SPR biosensor technology. It was expected that the procedure would enable identification and removal of compounds unsuitable for screening by this platform, for example compounds aggregating at the high concentration levels required for screening, compounds with pronounced affinity towards dextran, and fragments with irreversible, or very slow, dissociation.

From 4.6 million commercially available compounds, 930 fragments were selected for the new library by a set of filters, including physicochemical ones. With access to a Biacore A100 SPR biosensor, and an intended fragment screen in mind (see paper II), the library was evaluated using a target panel of an aspartic protease (HIV-1 protease, HIVp), a serine protease (human α-thrombin), a metalloenzyme (carbonic anhydrase II, CA; EC 4.2.1.1), and, as an anti-target, human serum albumin (HSA). All proteins were known to be stable under the same experimental conditions.

The fragments were assessed for interaction with the panel at a single concentration of 200 µM in phosphate buffered saline with 5 % DMSO, pH 7.4, at 25 °C. In a duplicate run, a reverse sample order was used to minimize the risk of losing data due to ill-behaving compounds. The responses from blank samples, i.e. negative controls, were used to define a detection threshold as the signal average plus three standard deviations.
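The detection threshold described above can be expressed in a few lines. The sketch below assumes that the sample standard deviation is meant, which the text does not specify; the function names, fragment identifiers, and response values are invented for illustration.

```python
import statistics


def detection_threshold(blank_responses, n_sd: float = 3.0) -> float:
    """Mean blank response plus n_sd sample standard deviations (RU)."""
    return statistics.mean(blank_responses) + n_sd * statistics.stdev(blank_responses)


def significant(responses: dict, threshold: float) -> dict:
    """Keep only the fragments whose response exceeds the threshold."""
    return {name: r for name, r in responses.items() if r > threshold}


# Invented blank (negative control) and fragment responses in RU.
blanks = [0.5, -0.3, 0.1, 0.4, -0.2, 0.0, 0.3, -0.4]
fragment_responses = {"F001": 0.6, "F002": 7.2, "F003": 1.4, "F004": 0.9}

threshold = detection_threshold(blanks)
print(f"threshold: {threshold:.2f} RU")
print(sorted(significant(fragment_responses, threshold)))
```

A separate threshold would be derived for each target surface, since the noise level of the blanks is surface specific.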

Interaction profiles

The screening hit rate is target specific and linked to the inherent ligandability of the target [85]. Among the interactions displaying significant signal levels, 50 fragments were found to interact with a single target only. None of these interactions were with HIVp. The duplicate screening runs were analyzed together, revealing thrombin and HSA to individually interact with more than 90 % of the fragments. CA and HIVp interacted with 72 % and 35 %, respectively. The figures should be compared with caution, if at all, since the protein surfaces were not normalized with respect to functionality. More importantly, the library was found to have broad and versatile interaction profiles for the individual targets, not just containing general protein binders.

Fragments with slow dissociation rates

Most fragments displayed fast association and dissociation, seen as square-pulse shaped sensorgrams. 35 fragments, however, displayed very slow dissociation rates from at least one target protein. Thus, they risk affecting the analysis of succeeding compounds. Two fragments showed virtually no dissociation from any of the panel proteins during the 60 s dissociation phase, but were later found to behave better in a screen against human cytomegalovirus protease (manuscript in preparation). In conclusion, these fragments should be run at the end of screens and with longer dissociation times.

Superstoichiometry may be an insidious and non-general trait

The degree of functionality of an immobilized biosensor chip surface is typically assessed by the use of a positive control molecule larger than a fragment. A semi-unfolded immobilized target may provide more binding sites for the smaller fragments, so that the binding capacity of the surface in a fragment screen is underestimated. Furthermore, since the refractive index increment of small molecules can differ 2- to 3-fold [112] and conformational changes [110] can affect the signal output, estimated values of stoichiometry should be interpreted with caution.

Indications of superstoichiometric interactions in the screen were most commonly recorded for CA and HIVp. Notably, these surfaces displayed the lowest degree of functionality, as assessed by their respective positive controls.

An advantage of a target panel of multiple protein types was the opportunity to compare estimated stoichiometries for all compounds. A fragment was said to bind superstoichiometrically to a target if the binding stoichiometry was greater than 5:1 [114]. No fragment was found to fulfill this criterion for all four target proteins. This is in line with previous reports of superstoichiometric binding behavior being target dependent [114].
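The 5:1 criterion can be sketched as follows, assuming that the stoichiometry is estimated as the observed response divided by the expected maximal response for 1:1 binding to the functional surface. This way of estimating stoichiometry, the function names, and all numbers are assumptions made for illustration; the actual procedure is described in paper I.

```python
def estimated_stoichiometry(response_ru: float, r_max_1to1_ru: float) -> float:
    """Observed response relative to the expected maximum for 1:1 binding."""
    return response_ru / r_max_1to1_ru


def superstoichiometric_targets(responses: dict, r_max_1to1: dict,
                                cutoff: float = 5.0) -> list:
    """Targets for which a fragment exceeds the stoichiometry cutoff."""
    return [
        target for target, response in responses.items()
        if estimated_stoichiometry(response, r_max_1to1[target]) > cutoff
    ]


# Invented expected 1:1 responses (RU) for one fragment on the four surfaces.
r_max_1to1 = {"thrombin": 20.0, "HSA": 25.0, "CA": 8.0, "HIVp": 6.0}
# Invented observed responses (RU) for the same fragment.
responses = {"thrombin": 15.0, "HSA": 30.0, "CA": 56.0, "HIVp": 12.0}

flagged = superstoichiometric_targets(responses, r_max_1to1)
print(flagged)                           # ['CA']
print(len(flagged) == len(responses))    # False
```

Comparing the flagged targets across the whole panel distinguishes a target-dependent effect, as in this invented example, from a fragment that behaves badly on every surface.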

Conclusions

The compiled fragment library was found to be compatible with an SPR-based platform. It displayed varied interaction profiles with the target panel members and, after a minor rearrangement of the sample order, contained no compounds that obstructed the analysis and therefore needed to be excluded. The compilation and validation procedure can be considered a success.

Paper II – Screening for novel enzyme inhibitor scaffolds

HIV-1 protease is a well studied drug target against which several generations of drugs have been produced. Previous generations of protease inhibitors have been associated with severe side effects. Due to the high mutational rate and the heavy viral load of HIV, combination therapies are preferred. Optimally, the compounds used should have different resistance profiles. The many quasi-species of HIV also increase the demand for a large drug arsenal. Thus, the need for new drugs is high.

In an effort to find novel HIVp inhibitor scaffolds not sensitive to some common resistance mutations, a fragment screen was launched. As a target panel, the wild type (wt) enzyme and three mutant variants (G48V, V82A, and I84V) were used [115]. This panel covers known signature resistance mutations towards all but one of the nine HIVp inhibitors clinically approved to date (summarized in [35]). All targets contained the stabilizing Q7K mutation, known to reduce autohydrolysis [116].

Figure 14. Structure of HIV-1 protease with the K7 (blue), G48 (orange), V82 (green) and I84 (magenta) amino acid moieties highlighted, in complex with an active site inhibitor (red). PDB ID: 1KZK.

HTS has been argued to provide leads with efficiencies matching those from fragment-based lead discovery. The latter, however, has a higher probability of identifying novel drug scaffolds due to a better coverage of chemical space [117].

With a fragment library of 930 compounds known to be suitable for use with SPR-based biosensors (paper I), the panel was assessed for interaction using concentration series of five non-zero samples (31 to 500 µM in two-fold dilutions). The screen was performed at 25 °C in phosphate buffered saline, pH 7.4, with 5 % DMSO. A concentration series was considered successful if the signals for at least four out of five samples were above the noise threshold, defined as the average of the blank sample signals plus three standard deviations. Apparent affinities were calculated by non-linear regression of an interaction model to the saturation plots. The model was based on steady state affinity but contained a linear term to account for signals from any low affinity interactions.

A screening hit was defined as a fragment interacting with HIVp wt and two or three of the mutant variants in a concentration dependent and reversible manner, with an apparent affinity in the range 0-10 mM and a positive Rmax (Figure 15). The duplicate runs were analyzed separately and together, the latter to increase the prospects of finding both true and false hits. In total, 47 compounds fulfilled these criteria and were assessed for inhibitory potential in a FRET-based orthogonal activity assay. The activities of the wt enzyme in the presence and absence of fragments were compared. Compounds with a high innate fluorescence were excluded. Ten fragments were identified to interact with the protease panel and inhibit its activity.
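The noise threshold and the success criterion for a concentration series can be sketched as follows. This is a minimal illustration, not the evaluation code used in paper II. The function names are hypothetical, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption, since the text does not specify which was used.

```python
from statistics import mean, stdev


def dilution_series(top=500.0, n=5, factor=2.0):
    """Concentrations (µM) of an n-point dilution series, lowest first."""
    return sorted(top / factor ** i for i in range(n))


def noise_threshold(blank_signals, n_sd=3.0):
    """Average of the blank signals plus n_sd standard deviations.

    Assumption: the sample standard deviation is used.
    """
    return mean(blank_signals) + n_sd * stdev(blank_signals)


def series_is_successful(sample_signals, threshold, min_above=4):
    """True if at least min_above samples exceed the noise threshold."""
    return sum(signal > threshold for signal in sample_signals) >= min_above


# Hypothetical signals (RU) for three blanks and one five-point series.
threshold = noise_threshold([0.0, 1.0, 2.0])
print(dilution_series())
print(threshold, series_is_successful([3.0, 5.0, 6.0, 7.0, 8.0], threshold))
```

A two-fold series ending at 500 µM starts at 31.25 µM, which matches the "31 to 500 µM" range given in the text.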

Figure 15. Interaction profile for the two separate SPR-screening runs. Fragments interacting with wt and at least two mutant enzyme variants (boxed in black) were selected for assessment of inhibitory potential. An analysis of the combination of the two runs was also performed (not shown in figure). In total, 47 fragments were selected.

All clinically approved HIVp inhibitors were searched for sub-structure similarities with the identified fragments by Tversky similarity analysis. No full match was found. Furthermore, a subset of compounds deposited as HIV-1 protease binders in the Binding Database, all with a reported IC50 or KI < 1 µM, was searched for similarities, using the fragments as queries. One full match was found. These results suggest that most of the compounds are novel structures.

Tversky coefficient as a measure of similarity

Similarity between molecules can be defined as matches and overlaps of their structures. The use of Tversky coefficients as a tool for similarity analysis is a pragmatic method. However, it has drawbacks due to its simplistic nature. The scoring function is strictly structure based, ignoring the possibility that two chemical motifs have a common biological function, e.g. H-bonding of carbonyl and sulphonyl groups. As identified substructures of the query can be scattered over the target structure, comparing molecules of fragment size with larger, more complex structures often results in relatively high similarity scores. Figure 16 presents an example from the fragment screen of paper II.

Figure 16. Similar substructures of fragment hit (a) as identified by Tversky analysis in the clinical HIVp inhibitors saquinavir (b) and indinavir (c). The similarity coefficients were 0.92 and 0.65, respectively. Structures considered similar are indicated with bold lines. Figure courtesy of Evert Homan.
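The behavior described above can be illustrated with a small sketch of the Tversky coefficient computed on feature sets. The feature labels are hypothetical placeholders for fingerprint bits or substructure keys, and the weights (alpha = 1, beta = 0) are an assumption chosen to mimic a substructure-style search. The weights used in paper II are not stated in the text.

```python
def tversky(query, target, alpha=1.0, beta=0.0):
    """Tversky similarity between two feature sets.

    S = |Q & T| / (|Q & T| + alpha * |Q - T| + beta * |T - Q|)
    alpha = beta = 1 gives the Tanimoto coefficient.
    """
    query, target = set(query), set(target)
    common = len(query & target)
    denominator = common + alpha * len(query - target) + beta * len(target - query)
    return common / denominator if denominator else 0.0


# Hypothetical feature sets: a small fragment and a larger drug-sized molecule.
fragment = {"amide", "phenyl", "amine"}
drug = {"amide", "phenyl", "amine", "hydroxyl", "sulfonyl", "piperazine", "pyridine", "tert-butyl"}

print(tversky(fragment, drug))                       # features of the query only are penalized
print(tversky(fragment, drug, alpha=1.0, beta=1.0))  # Tanimoto, penalizes size difference
```

With beta = 0, features present only in the larger molecule are not penalized, so a fragment whose features are all found somewhere in the larger molecule scores as a full match, even if those features are scattered over the structure.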

Conclusions

Of the 930 fragments, 47 were identified as interaction screening hits, i.e. a 5.1 % hit rate. Ten fragments were identified to interact with, and inhibit, HIVp wt and, to different extents, the mutants of the panel. This corresponds to 1.1 % of the total fragment library and 21.3 % of the interaction screening hits. None of the compounds are substructures of clinically approved HIVp inhibitors. The novelty of the compounds was assessed by Tversky similarity analysis with a subset of the Binding Database as reference. Only one fragment was found to be a full substructure match. To evolve these fragments, their site(s) of interaction should be identified.

Logistical challenges in compound screening (Papers I and II)

Screening campaigns present not only scientific challenges, but also logistical ones. The purity and identity of all compounds need to be verified. Before the screening commences, all compounds must be diluted and the samples properly mixed. In the projects described in papers I and II, the interactions between some 930 fragments and four target molecules were investigated. Furthermore, in project II, individual concentration series of all fragments were used.

The SPR platform is associated with certain constraints on sample preparation procedures. To minimize signal artifacts from altered DMSO concentrations due to evaporation of water, the total time of exposure to air without a lid was restricted to a maximum of 30 minutes once the fragments had been introduced into an aqueous buffer. To meet the time limit, an automated liquid handling system was used. Concentration series with four samples can easily be produced with a plate replicator in a 96 to 384 well microtiter plate format, whereas five concentrations call for a more intricate solution.

The large number of samples assayed in a compound screen gives rise to a vast raw dataset. To evaluate the results, an automated procedure is preferable. In paper II, the 930 fragments, each in 6 samples with different concentrations (blank sample included), targeting four ligands and one reference spot, resulted in approximately 28000 interactions, i.e. time series (930 × 6 × 5 = 27900). After referencing and solvent correction signal adjustments, two different strategies were used. In paper I, the signal values at some fixed time points (4 s before the end of the association phase, and 5 and 55 s into the dissociation phase) were extracted. By a simple quotient, samples displaying slow or virtually non-existent dissociation could easily be identified.
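The report-point approach of paper I can be sketched as below. The text does not specify which quotient was used, so the ratios computed here (signal remaining 5 s and 55 s into the dissociation phase, relative to the signal 4 s before the end of the association phase) and the 0.5 cutoff are assumptions made for illustration. All function names are hypothetical.

```python
import math


def report_point(times, responses, t):
    """Response sampled closest to time t (s)."""
    index = min(range(len(times)), key=lambda k: abs(times[k] - t))
    return responses[index]


def dissociation_quotients(times, responses, injection_end,
                           before_end=4.0, early=5.0, late=55.0):
    """Fraction of the bound signal remaining early and late in dissociation."""
    bound = report_point(times, responses, injection_end - before_end)
    if bound <= 0:
        return (float("nan"), float("nan"))
    q_early = report_point(times, responses, injection_end + early) / bound
    q_late = report_point(times, responses, injection_end + late) / bound
    return (q_early, q_late)


def is_slow_dissociator(q_late, cutoff=0.5):
    """Flag a sample if more than `cutoff` of the signal remains late in dissociation."""
    return q_late > cutoff


# Synthetic sensorgrams: a square pulse and a slowly dissociating compound.
times = list(range(0, 121))
fast = [10.0 if t <= 60 else 0.0 for t in times]
slow = [10.0 if t <= 60 else 10.0 * math.exp(-0.001 * (t - 60)) for t in times]

for name, trace in (("fast", fast), ("slow", slow)):
    q_early, q_late = dissociation_quotients(times, trace, injection_end=60.0)
    print(name, round(q_early, 3), round(q_late, 3), is_slow_dissociator(q_late))
```

Because only three report points per sensorgram are needed, this kind of filter scales easily to the tens of thousands of time series generated in a screen.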
In paper II, data were imported into a MATLAB environment where apparent affinities could be determined through non-linear regression to a suitable interaction model.

To process each interaction series individually would take a long time. However, a brief visual inspection of sensorgrams is often necessary.

Biacore biosensors intended for larger screening campaigns use parallel flow cells with multiple immobilization spots in each cell. Thus, samples from the same microtiter plate will be assayed in different flow cells. This further increases the complexity of sample handling and evaluation, but also provides an opportunity to analyze interactions with many targets simultaneously.
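A steady state affinity model with an added linear term, as described for paper II, can be written as R(C) = Rmax·C/(KD + C) + m·C. The sketch below fits this model in plain Python and is not the MATLAB routine used in the paper. It swaps general non-linear regression for a simpler technique: a logarithmic grid search over KD, with Rmax and the slope m solved by linear least squares at each grid point. The grid limits (1 µM to 10 mM), the function names, and the synthetic data are assumptions.

```python
import math


def steady_state_model(c, rmax, kd, slope):
    """Steady state response with a linear term for low affinity interactions."""
    return rmax * c / (kd + c) + slope * c


def _fit_linear_terms(conc, resp, kd):
    """For a fixed KD, solve the 2x2 normal equations for Rmax and slope."""
    x1 = [c / (kd + c) for c in conc]
    s11 = sum(a * a for a in x1)
    s12 = sum(a * c for a, c in zip(x1, conc))
    s22 = sum(c * c for c in conc)
    t1 = sum(a * y for a, y in zip(x1, resp))
    t2 = sum(c * y for c, y in zip(conc, resp))
    det = s11 * s22 - s12 * s12
    if det <= 1e-12 * s11 * s22:  # the two terms are nearly collinear
        return None
    rmax = (t1 * s22 - t2 * s12) / det
    slope = (s11 * t2 - s12 * t1) / det
    sse = sum((y - steady_state_model(c, rmax, kd, slope)) ** 2
              for c, y in zip(conc, resp))
    return rmax, slope, sse


def fit_apparent_affinity(conc, resp, kd_min=1.0, kd_max=1e4, steps_per_decade=100):
    """Return (KD, Rmax, slope, SSE) for the best KD on a logarithmic grid (µM)."""
    n_steps = int(round(math.log10(kd_max / kd_min) * steps_per_decade))
    best = None
    for i in range(n_steps + 1):
        kd = kd_min * 10 ** (i / steps_per_decade)
        result = _fit_linear_terms(conc, resp, kd)
        if result is None:
            continue
        rmax, slope, sse = result
        if best is None or sse < best[3]:
            best = (kd, rmax, slope, sse)
    return best


# Synthetic, noise-free example with KD = 200 µM, Rmax = 20 RU, slope = 0.005 RU/µM.
conc = [31.25, 62.5, 125.0, 250.0, 500.0]
resp = [steady_state_model(c, 20.0, 200.0, 0.005) for c in conc]
kd, rmax, slope, sse = fit_apparent_affinity(conc, resp)
print(round(kd, 1), round(rmax, 2), round(slope, 4))
```

With only five concentrations and three parameters, the fitted values are strongly correlated when the data are noisy, which is one reason to treat the resulting affinities as apparent.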

Paper III – Cellular potency, mechanism, and kinetics of allosteric enzyme inhibitors

The in vivo efficacy of allosteric inhibitors is not easily deduced from in vitro assays, such as inhibition and interaction studies. A deeper understanding of these relationships would be of great use in the pre-clinical phase of drug discovery. Gaining such an understanding was the aim of paper III.

Filibuvir, VX-222, and tegobuvir are all small molecule candidate drugs in clinical phase II studies. They target the RNA-dependent RNA polymerase of hepatitis C virus, NS5B (EC 2.7.7.48), of the 1b genotype. To study the correlation between in vivo anti-HCV potency and inhibitor mode-of-action, the interaction characteristics between the compounds and their target were investigated. NS5B is a complex target having multiple allosteric sites, two forms of substrates (one a macromolecule and the other of fragment size), and a processive catalytic mechanism. It is therefore complicated to resolve interaction mechanisms for the system, and multiple platforms had to be used.

The anti-HCV potencies of the three inhibitors were assessed in a replicon cell assay. The potencies, quantified as EC50 values, were found to be 0.2 µM, 0.005 µM, and 0.003 µM for filibuvir, VX-222, and tegobuvir, respectively. For all in vitro assays, a truncated form of NS5B was used, lacking the 21 C-terminal amino acid residues constituting a transmembrane helix (NS5BΔ21).

Tegobuvir has previously been reported not to interact with NS5B in in vitro assays, indicating an indirect mechanism of action [118]. Filibuvir and VX-222 are reported to be allosteric inhibitors acting via the thumb-II site [119, 120]. In a scintillation proximity assay, measuring the polymerase activity, dose-response curves were noted for all inhibitors. Tegobuvir displayed an unexpected behavior, activating the polymerase at lower concentrations and partially inhibiting it at higher concentrations. An IC50 value for VX-222 could be determined, whereas filibuvir did not reach full inhibition at the highest concentration used.

With a simple differential scanning fluorometry (DSF) assay, filibuvir and VX-222 were found to affect the melting temperature of NS5BΔ21, as expected of a ligand. In an SPR-based interaction assay, the two inhibitors were found to compete for binding, as expected for ligands binding to the same site. However, a difference in interaction mechanism was observed. In contrast to the data for filibuvir, the VX-222 interaction data could not be explained by a simple Langmuir 1:1 interaction model. Tegobuvir was not observed to interact with the polymerase in either the DSF or the SPR assay, in full agreement with previous reports [118].

Assessment of interaction mechanism

To evaluate the nature of the inhibitor-enzyme binding complexity, a single concentration of inhibitor was injected for different lengths of time over surfaces with two types of immobilized NS5BΔ21 in the SPR biosensor. To one surface, the enzyme had been tethered by standard amine coupling. The protein on the other surface was cross-linked, i.e. rigidified by intramolecular covalent bonds through the use of amine coupling reagents. This analysis was performed at 5 and 25 ºC.

Again, VX-222 displayed a complex interaction, seen as a lack of steady state with two-phase association and dissociation events (Figure 17, right panel). This is in line with, e.g., a two-state induced-fit model (Scheme 2) with a slow conformational change step. Slight differences in interactions with the two surfaces were observed, mainly for VX-222. This can be a sign of differences in interaction mechanism. At the low temperature, a tendency towards non-Langmuir binding behavior was also seen for filibuvir. This suggests a temperature sensitive transformation between the EI and E*I states that becomes rate limiting at low temperatures (i.e. a small k-2 rate constant in Scheme 2). Conformational changes upon ligand binding are expected for allosteric inhibitors due to their indirect mechanism of action.
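For reference, a two-state induced-fit model of the kind referred to above is commonly written as follows. The notation is an assumption based on the usual convention for biosensor data and may differ in detail from Scheme 2. Here C is the inhibitor concentration, R_EI and R_E*I are the response contributions of the two complexes, and R_max is the saturation response.

```latex
\begin{align}
\mathrm{E} + \mathrm{I}
  \;\underset{k_{-1}}{\overset{k_{1}}{\rightleftharpoons}}\;
  \mathrm{EI}
  \;\underset{k_{-2}}{\overset{k_{2}}{\rightleftharpoons}}\;
  \mathrm{E^{*}I}
\\[4pt]
\frac{\mathrm{d}R_{\mathrm{EI}}}{\mathrm{d}t}
  &= k_{1} C \left(R_{\max} - R_{\mathrm{EI}} - R_{\mathrm{E^{*}I}}\right)
   - \left(k_{-1} + k_{2}\right) R_{\mathrm{EI}}
   + k_{-2} R_{\mathrm{E^{*}I}}
\\
\frac{\mathrm{d}R_{\mathrm{E^{*}I}}}{\mathrm{d}t}
  &= k_{2} R_{\mathrm{EI}} - k_{-2} R_{\mathrm{E^{*}I}}
\\[4pt]
K_{D}
  &= \frac{k_{-1}}{k_{1}} \cdot \frac{k_{-2}}{k_{2} + k_{-2}}
\end{align}
```

Setting k_2 = 0 recovers the Langmuir 1:1 model, with K_D = k_{-1}/k_1. When the second step is slow compared with the injection time, the observed response depends on the injection length, which is why injections of different lengths can reveal this type of mechanism.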

Figure 17. Overlay sensorgrams (response [RU] versus time [s]) from single concentration experiments with various injection lengths. Filibuvir interactions with cross-linked NS5BΔ21 at 25 ºC (left) are well described by a 1:1 model. The interaction mechanism between VX-222 and the cross-linked polymerase at 5 ºC (right) seems to be more complex.

Chemo- and thermodynamic evaluation of interactions

The binding of filibuvir and VX-222 to the polymerase was further characterized by a chemodynamic analysis where effects of buffer pH and ionic strength (in the form of NaCl concentration) were assessed at room temperature. All observed rate constants and affinities were rather unaffected in the ranges analyzed (pH: 7.0-8.0, NaCl: 30-500 mM). VX-222 data were again best explained by a two-state model. Alternative interaction models with a complexity comparable to that of the two-state model, e.g. heterogeneous ligand and population shift models, were rejected based on protein purity and the observed kinetic profile. At 25 ºC, in a buffer with pH 7.5 and 130 mM NaCl, the affinities of filibuvir and VX-222 for NS5BΔ21 were determined to be 54 and 25 nM, respectively, i.e. quite similar.

The different mechanisms of interaction described by the interaction models used for filibuvir and VX-222 also brought differences in time of target occupancy. Despite similar KD values for the two inhibitors, their individual residence times differed 15-fold.

To assess the temperature dependence of the interaction and the thermodynamic contributions for filibuvir, concentration series were analyzed at various temperatures in the range 5-35 ºC (Figure 18). The complex interaction mechanism prevented a similar evaluation of VX-222.

As exemplified by Figure 18, sensorgram data can be used quantitatively, to determine affinities and rate constants, or qualitatively, to visualize trends. From the sensorgrams, it is clear that the rate constants increase with increasing temperature.

Figure 18. Sensorgrams from temperature series for the interaction of filibuvir and native and cross-linked NS5BΔ21. Most of the affinity values noted in the signal vs. concentration plots are to be considered apparent since steady state has not been reached for all samples.

Again, a 1:1 interaction model fit the data increasingly well at higher analysis temperatures. Clear trends were found for the dissociation rate and affinity with increasing temperature: koff increased and KD decreased.

By a van’t Hoff plot, the enthalpy and entropy changes for the entire interaction were extracted. The native enzyme displayed a favorable enthalpy and entropy, whereas the interaction with the cross-linked enzyme had a favorable enthalpy but an unfavorable entropy. This indicates the importance of protein flexibility for the interaction.

Interestingly, the thermodynamic profile of the interaction between native NS5BΔ21 and filibuvir was similar to that reported for HIV RT and an allosteric inhibitor [121]. Whether this profile is a common theme for allosteric inhibitors and polymerases is, however, too early to tell. Notably, both profiles are different from those of HIVp and active site inhibitors [122].

Similarly to the van’t Hoff analysis, Eyring plots were used to examine the dissociation phase contributions. From the thermodynamic profile of the interaction at equilibrium, the profile of the association event could then be calculated. All contributions in both phases were unfavorable, their relative sizes giving rise to the overall profile discussed above.
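A van’t Hoff analysis of the kind described above can be sketched as a linear regression of ln KD against 1/T, using ln KD = ΔH/(RT) − ΔS/R. The sketch assumes that ΔH and ΔS are constant over the temperature range and that KD is expressed in M (1 M standard state). The function names and the synthetic values are hypothetical and are not data from paper III. An Eyring analysis follows the same pattern, with ln(k/T) regressed against 1/T.

```python
import math

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1


def linear_fit(x, y):
    """Ordinary least squares; returns (slope, intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def vant_hoff(temps_celsius, kd_molar):
    """Return (dH in J/mol, dS in J/mol/K) for binding from KD versus temperature."""
    inv_t = [1.0 / (t + 273.15) for t in temps_celsius]
    ln_kd = [math.log(kd) for kd in kd_molar]
    slope, intercept = linear_fit(inv_t, ln_kd)
    return slope * R_GAS, -intercept * R_GAS


def kd_from_thermodynamics(d_h, d_s, temp_celsius):
    """KD (M) from dG = dH - T*dS = R*T*ln(KD)."""
    t = temp_celsius + 273.15
    return math.exp((d_h - t * d_s) / (R_GAS * t))


# Synthetic example: generate KD values from assumed dH and dS, then recover them.
temps = [5.0, 15.0, 25.0, 35.0]
kds = [kd_from_thermodynamics(-40000.0, 10.0, t) for t in temps]
d_h, d_s = vant_hoff(temps, kds)
print(round(d_h), round(d_s, 2))
```

In this convention, a negative ΔH and a positive ΔS are both favorable for binding, and the entropy term enters the free energy as −TΔS.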

Inhibitor interference of NS5BΔ21 RNA-binding

With an SPR assay setup, filibuvir and VX-222 were found to partially block RNA binding of NS5BΔ21. The result was independent of which macromolecule was immobilized. However, the interaction mechanism could not be explained by models of lower complexity. This was possibly caused by oligomerization of the enzyme, a previously reported propensity of NS5B [123].

Filibuvir and VX-222 have been reported to inhibit elongative RNA synthesis. The latter inhibitor is further said to have a modest effect on de novo initiation, whereas filibuvir does not [124]. Interference with RNA binding, as seen in our experiments, should affect both types of synthesis.

Conclusions

The two allosteric inhibitors filibuvir and VX-222 were both found to interfere with RNA binding of the polymerase and to have similar affinities for NS5BΔ21, as measured in vitro. The differences in residence time, deduced from the proposed mechanisms of interaction, were hypothesized to cause the differences in anti-HCV potency observed in cell cultures. This project exemplifies the importance of the extra level of information that kinetic profiling provides. In future studies, a comparison of efficacy in different HCV genotypes coupled with a structure analysis could shed further light on the mechanisms of interaction and action.

The results of this project identify filibuvir as a suitable reference compound in future interaction analysis experiments due to its kinetic profile.

Paper IV – Structure-kinetic and -thermodynamic relationships of enzyme inhibitors

A full understanding of how structure can be translated into thermodynamic entities, in turn dictating the kinetic characteristics of an interaction, is still distant. To take a step closer to this understanding, relationships between structure, kinetics, and thermodynamics were investigated and analyzed in combination with structural data for a set of close inhibitor analogues and their target, human α-thrombin (Figure 19).

Human α-thrombin (EC 3.4.21.5) is a serine protease. It is a well studied drug target and has been included in multiple fragment screening campaigns (e.g. paper I and [125]). Melagatran was approved for clinical use as a thrombin inhibitor in the mid-2000s [126]. It was, however, withdrawn from the market due to reported liver toxicity [127]. Nevertheless, as an optimized inhibitor of thrombin, it can, in combination with five close analogues, all with amino substituted P3 moieties, be used as a chemical tool to assess, e.g., SKR properties.

Figure 19. The thrombin inhibitor melagatran and five close analogues. The varied structure is encircled.

All six inhibitors were assessed for interaction characteristics by stopped-flow spectroscopy (SF), SPR, and ITC-based assays. Individual weaknesses in the three techniques were well compensated by their respective strengths. The thermodynamic contributions were determined directly by ITC, and indirectly, through the use of temperature series and Arrhenius, Eyring, and van’t Hoff analysis, by the SF and SPR platforms. The latter techniques were also used for a kinetic evaluation of the inhibitors. ITC reported a binding stoichiometry close to 1, and 1:1 interaction mechanisms were observed for all inhibitors by SPR.

Due to very fast, close to diffusion controlled, association rates, it was challenging to obtain SPR results that were not obscured by mass transport limitation [128]. The ITC platform also struggled with the almost ‘too good’ compounds, in the end providing only ΔHobs values for the interactions. This could have been circumvented with a competition experiment setup, where a ligand with weaker affinity is first titrated and subsequently displaced by the high affinity ligand. However, data from this type of assay suffer from higher noise compared to data from a direct titration assay. The thermodynamic data recorded with SF differed from the SPR and ITC results. This may be a consequence of SF being an indirect method making use of a substrate, in contrast to SPR and ITC, which are direct binding methods. The trends in kinetic and thermodynamic profiles were, however, in agreement.

Affinity was found to increase with increasing lipophilicity, primarily as a result of decreased dissociation rates. The kinetic rate constants were also observed to depend on the degree of substitution of the amine nitrogen. A primary amine had the fastest dissociation rate and the tertiary the slowest. The length of the aliphatic chain also affected the rate of the interaction. A shorter or longer chain than the apparently optimal ethyl substituent resulted in faster dissociation rates. The P3 amide group was found to affect the thermodynamic profile considerably in comparison with the carboxyl group of melagatran, probably due to their different interactions with a glutamate residue of the binding site (Glu192). The degree of substitution and shielding of the amide-nitrogen consequently seems to be important for achieving a low dissociation rate constant.

The structural analysis by x-ray crystallography confirmed inhibitor binding to the S1 and S2 sites to be largely unaffected by the P3-moiety. The orientations of the P3-moieties were, however, not uniform. Thus, most of the differences in binding kinetics and thermodynamics were concluded to be attributable to the P3-substituents.
Interestingly, the enthalpic contribution was, for one inhibitor, found not to increase upon formation of an additional , in contrast to what could be expected.

Conclusions

Our results show that minor changes in the structure of the melagatran analogue inhibitors have significant effects on their interaction with thrombin, which translate directly into kinetic and thermodynamic effects. Although a general understanding of structure-thermodynamic relationships remains elusive, an empirical observation still advocates its use in drug discovery: to reach the highest affinity of a ligand, the favorable contributions of both ΔS and ΔH must be maximized. The latter is apparently harder to optimize, which is why an enthalpic binder could be considered a potentially better starting point for drug discovery [129].

Perspectives

Summary

Time resolved interaction studies have proven to be important in pre-clinical drug discovery. They enable a more thorough evaluation and optimization of drug candidates, opening the way for biologically relevant concepts such as residence time in the lead generation and optimization phases. Assessment of the mechanism of interaction and action of inhibitors can identify superior compounds in an isoaffinity group, providing the best CDs for clinical testing. Thus, the rational design of drugs can truly benefit from SKR studies.

In the work described in this thesis, we exemplified this in two different papers. In one (paper III), two same-site allosteric inhibitors with similar affinities for their target were studied by kinetic, thermodynamic, and chemodynamic analysis. The large differences in anti-viral potency observed for the compounds were suggested to be caused by their individual interaction mechanisms. In the other paper (paper IV), kinetic and thermodynamic data for a series of close inhibitor analogues were analyzed in combination with crystallographic data to understand the fundamental nature of molecular interactions and their incitement.

The never ending plague of of today was addressed by an SPR aided fragment screen against a panel of four viral enzyme variants (paper II). By using wt and mutant targets known to convey resistance towards clinically approved drugs, we identified ten hit compounds, i.e. novel inhibitor scaffolds with activity towards all panel members. If developed into full scale drug molecules, these would be a valuable asset in combination therapies due to their different resistance profile.

A standard procedure for how to compile, and determine the suitability of, a fragment library for use with an SPR-based biosensor assay has also been described (paper I). This confirms that selection of compounds using Ro3 and other molecular property cutoff filters is an efficient way of compiling a library with suitable properties. The procedure will hopefully help to eliminate ill-behaving compounds from present libraries, and from those still to be compiled, but, maybe just as importantly, also prevent the exclusion of interesting fragments that behave poorly only with the occasional target.

Thus, SPR-based biosensors have been used to advance projects in most stages of pre-clinical drug discovery with the aim of reducing future drug attrition and understanding molecular interactions on a fundamental level.

Future studies

A fairly uncharted field of research, although continuously growing, is that of full-length membrane proteins. With many drug targets residing in this group, the interest has not been lacking, but rather the techniques to interrogate them. Most of the methods described in this thesis are applicable to membrane proteins. An obvious way of combining them would be an SPR biosensor-aided fragment screening campaign, with a validated fragment library, for novel inhibitor scaffolds of full-length HCV NS5B that are not sensitive to resistance mutations reported from replicon assays for advanced clinical trial compounds. Furthermore, this assay would be more holistic and could validate inhibitors identified with an assay based on the truncated enzyme form. Biophysical techniques tend to be quite reductionistic. The continuous comparison of results with more holistic assays is vital. Again, successful drug discovery is dependent on valid model systems and interrogation techniques.

To further develop the fragments identified to inhibit HIVp in paper II, the site(s) of binding should be assessed, e.g. by x-ray crystallography. This would simplify the interpretation of the results from a subsequent analogue-by-catalogue expansion of a few selected scaffolds into more mature inhibitor compounds.

The combined use of estimated affinities, kinetic rate constants, and thermodynamic contributions should be further evaluated and developed. Today, we lack, e.g., the tools to dissect a more complex interaction mechanism in terms of thermodynamic contributions.

Notes

Computer software for visualization The illustrations in this thesis were made using PyMOL [130], gnuplot (open source plotting tool, version 4.4), and Microsoft Office PowerPoint 2007.

Summary in Swedish (Sammanfattning på svenska)

Background and aims

Our bodies and cells contain a multitude of different enzymes, which can be likened to small machines. Among other things, they help us to break down our food, divide our cells, and protect us from intruders such as viruses and bacteria. Viruses and bacteria have their own enzymes, which are vital to them. A drug often works like a wedge or a stone that gets stuck in one of the intruder's machines and renders it useless. A molecule that blocks the function of an enzyme is called an inhibitor. For a drug to be usable, it must not affect the body's own enzymes too much. Such unintended effects are called side effects. Sometimes, however, the aim is to affect the body's own cells and enzymes, for example in cancer and blood clot formation, or when you have a headache.

Interactions between molecules are a fundamental prerequisite for, and part of, life. The rate at which molecules interact and the strength of binding between them are central. The way in which they interact, the mechanism, also plays a major role. This applies to the efficacy of drugs as well. Here lies, for example, the basis of drug resistance in certain viruses. Through an altered structure (mutation) in the virus, the drug no longer binds as strongly, or even at all, and the virus can survive.

To create a drug, a starting point is needed. In the mid-1990s, drug fragments were used for the first time to create an enzyme inhibitor. Fragments are, in this context, small pieces that can be put together in pairs or threes, like a jigsaw puzzle, into drug-like molecules.

Developing new drugs is a long and costly process. During development, which is divided into a pre-clinical and a clinical part, properties are often discovered that make compounds unsuitable as drugs, and these compounds are then excluded from the studies. Since the development cost increases successively, unsuitable compounds should be sifted out as early as possible. To be able to do so, access to the best possible information is needed. This thesis has therefore focused on developing new methods for drug development, and methods for obtaining and using information about interactions between drugs and their targets in an optimal way.

Individual papers

The studies presented in this thesis have all aimed at making the pre-clinical part of today's drug development more efficient. With a technique based on the physical phenomenon of surface plasmon resonance, it is possible, among other things, to study the interaction between two molecules, e.g. an enzyme and an inhibitor, in isolation, that is, without using whole cells. Using this technique, a method was developed to check whether the platform and a general collection of fragments can be used together (paper I).

The fragment collection was then used to identify new basic structures of molecules that can inhibit an enzyme that is vital to HIV (paper II). These are interesting starting points for a new generation of drugs against HIV, since they are not affected by the resistance mutations in the enzyme that are common during treatment with today's medicines.

In paper III, three drug candidates against the hepatitis C virus (HCV) were studied, all in clinical trials (phase II) and directed against the HCV polymerase (the enzyme that the virus uses to copy its genome). One of the inhibitors could not be shown to interact with the polymerase. Presumably, the drug is converted into an active form in living cells. The other two compounds bound about equally tightly to the enzyme and at the same site. Despite this, there was a large difference in efficacy against the virus in cell-based experiments. This can be explained by a difference in the interaction mechanism between enzyme and inhibitors. How the interaction was affected by the acidity and salt content of the surrounding environment was also studied, which is one way of getting to know the enzyme and its properties better.

In the last paper (paper IV), it was studied how small differences in the structure of an inhibitor can affect the interaction with an enzyme. Thrombin, which is central to blood coagulation, and six inhibitors with similar structures were used as a model. Three different technical platforms were used to analyze the rate, strength, and thermodynamic driving force of the interactions. The results were compared and analyzed together with structural data. In this way, small changes in structure could be linked to the character of the interactions. Such information and understanding are valuable for the development of future drugs.

Conclusions

Surface plasmon resonance-based biosensors have been used with good results in several phases of the pre-clinical part of drug development. With these papers as a basis, compounds that are not suitable as drugs can be sifted out earlier in the development chain, thereby saving both money and working time.

Acknowledgements

I would like to express my sincere gratitude to my supervisor Helena Danielson – a role model for students and supervisors alike – for accepting me as her PhD student and helping me to, amongst many things, cast a solid scientific foundation. It has truly been a privilege, and great fun.

I would also like to acknowledge Mikael Widersten for being my co-supervisor, Gun Stenberg for teaching me the wet-lab side of life, Bengt Mannervik and Gunnar Johansson for gladly sharing their vast biochemical knowledge, and Françoise Raffalli-Mathieu for many interesting conversations and valuable help.

Inger Hermanson, Lilian Forsberg, Bo Fredriksson, and Gunnar Svensson: You have always, with a smile, been quick to lend a helping hand when needed. Thank you. Without you, chaos would rule the department.

A sincere thank you to my co-authors for a fruitful collaboration and valuable input for my work in general: Despite a full schedule, people at Beactica have always found the time to help me out: Evert Homan, Peter Brandt, Maria Lindgren, Mari Kullman-Magnusson, Matthis Geitmann, Thomas Gossas, and Per Källblad. Markku Hämäläinen, Stefan Löfås, Ewa Pol, Robert and Olof Karlsson (formerly Biacore) have all been an invaluable asset in every aspect of using the SPR instruments and analyzing the data. Without the rapid aid of Anders Cronhäll, I would have been lost. The help of Lotta Vrang and Vera Baraznenok with coworkers at Medivir, and Stefan Geschwindner, Johanna Deinum, and Yafeng Xue at AstraZeneca, Mölndal, has been both important and appreciated.

I would like to thank the former and present members of the Danielson group: My former room-mate Malin, for all the un-, non-, and pure scientific discussions, often done in profile of one another, and the in-between laughs. Your enthusiasm is contagious. Helena N (a woman with an impeccable taste in, e.g., chocolate!), Matthis, and Thomas: you have all, with incredible patience, shared your vast BIAcore expertise. Eldar, the humble lab-problem solver; it did not take many days before the supervisor was supervised. My intellectual role model Göran: “Traveling through fragmentspace ain't like dusting crops, boy. Without precise calculations we could fly right through a library or bounce too close to a HCV project, and that'd end your trip real quick, wouldn't it?” With a couple of publications before you started your graduate studies and a semi-post doc position before its end, without ever making a fuss, Tony, you are inspiring. Sofia, the royal backbone of the group: Talented song-writer and scientist. I have really enjoyed taking this journey, all the way from 2007, with you. E-flasks in straight lines are beautiful. Angelica, you are an endless source of sunshine and a talented teacher. Don’t let the S51 act up on you! Christian, your combination of focus, success, good spirits, and humbleness is impressive! I would never have made the 90 km without your waxing skills. I have also had good help from my project students: Weiqiao, Zhao, Emil, Harsha, and Arne. Thank you all!

Many other co-workers, present and previous, at the department(s) have provided a really nice working environment and many memorable laughs. To see a lunch time discussion get a life of its own and evolve is hilarious! Once in a while, I even learned some chemistry. Thank you, one and all!

I find this a good opportunity to also acknowledge some former supervisors and scholars of science: Sherry Mowbray and Eva-Lena Andersson, Mary Phillips-Jones and Victor Blessie, Anders Isaksson and Hanna Göransson-Kultima. I have had great use of your help. Thank you.

Further, I would like to thank the people close to me for accepting my special needs as an aspiring scientist, i.e. to tuck my shirt in and talk about strange science-stuff. Your patience and understanding have been a blessing: Team Söderbärke: Anders and Peter, Svante, Anders, and Oskar, with partners: Over the last 25 years you have kept me sane through more or less insane adventures. It has been a blast. Let it never end. Barbazoo, was it? Helena, Sofia, Sten, and Susanne, with spouses. 10 years of friendship! Soon I, too, can leave BMC and Polacksbacken. The occasional get-togethers with the klin-farm youngsters and the biotech-guys have always been appreciated. Sara and Anders. You, good people, who always have a smile and a solid piece of good advice in store, I am very happy you are my friends. Daniel and Emma, Martin and Annika, Ola and Anna: What can go wrong with fat, sugar, and chocolate? Feeding frenzies and discussions – I have enjoyed them all. Except maybe that soft ice-cream buffet. But again… The Backströms: Thank you for welcoming a southerner like me so open-heartedly to your family, despite the occasionally unattended hairstyle. Mother, Father, Thomas, Anders, and Morfar: You have provided me with the foundation and support required for an undertaking such as graduate studies, including a reality check once in a while. Thank you. Finally, my marvelous wife Ellenor – a private scholar in life, science and love (not necessarily listed in order of importance) – Jag älskar dig ♥ Now would be a good time to flick through the rest of the thesis.
