
To my family

List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Junaid M, Lapins M, Eklund M, Spjuth O, Wikberg JE (2010) Proteochemometric modeling of the susceptibility of mutated variants of the HIV-1 to inhibitors. PLoS ONE 5: 1-10.

II Salaemae W, Junaid M, Angsuthanasombat C, Katzenmeier G (2010) Structure-guided mutagenesis of active site residues in the dengue virus two-component protease NS2B-NS3. J Biomed Sci 17: 68.

III Junaid M, Chalayut C, Torrejon AS, Angsuthanasombat C, Shutava I, Lapins M, Wikberg JE, Katzenmeier G (2012) Enzymatic analysis of recombinant Japanese encephalitis virus NS2B(H)-NS3pro protease with model peptide substrates. PLoS ONE (Accepted).

IV Prusis P, Junaid M, Petrovska R, Yahorava S, Yahorau A, Katzenmeier G, Lapins M, Wikberg JE (2012) Design and evaluation of substrate-based octapeptide and nonsubstrate-based tetrapeptide inhibitors of dengue virus NS2B/NS3 proteases. (Manuscript).

V Junaid M, Lapins M, Katzenmeier G, Cui T, Wikberg JE (2012) Design and Evaluation of Dengue NS3 Protease Peptide Inhibitors. (Manuscript).

Reprints were made with permission from the respective publishers.

I have also contributed to the following article, which is not included in this thesis. This publication reports further development of the research described in paper I.

VI Spjuth O, Eklund M, Lapins M, Junaid M, Wikberg JES (2010) Services for prediction of drug susceptibility for HIV proteases and reverse transcriptases at the HIV Drug Research Centre. Bioinformatics 27: 1719-1720.

Contents

1 Introduction
  1.1 Modern Drug Discovery and Development (MDDD) Process
  1.2 Computational Methods in Drug Discovery and Design
    1.2.1 Virtual High-throughput Screening (V-HTS)
    1.2.2 Structure-Based Drug Design (SBD)
    1.2.3 Quantitative Structure-Activity Relationship (QSAR)
    1.2.4 Statistical Molecular Design (SMD)
    1.2.5 Proteochemometrics (PCM)
  1.3 Enzyme as Drug Target in Modern Medicine
  1.4 The Virus in Modern Medicine
  1.5 Retroviruses
    1.5.1 HIV/AIDS
    1.5.2 Life Cycle of HIV
    1.5.3 HIV/AIDS Drug Targets and Drugs
    1.5.4 Highly Active Antiretroviral Therapy (HAART)
    1.5.5 HIV Drug Resistance: The Bottle Neck
  1.6 Flaviviruses
    1.6.1 Dengue
    1.6.2 Japanese encephalitis (JE)
    1.6.3 Life Cycle of Flavivirus
    1.6.4 Flaviviral Drug Targets: NS2B-NS3 Protease
    1.6.5 Drug Development for Flaviviral NS2B-NS3 Proteases
2 Aims
3 Results and Discussion
  3.1 PCM analysis of HIV mutants susceptibility to NRTIs (paper I)
  3.2 Kinetics of DEN NS2B(H)-NS3pro mutants (paper II)
  3.3 JEV NS2B(H)-NS3pro substrate specificity (paper III)
  3.4 DEN NS2B(H)-NS3pro inhibitors design & evaluation (papers IV & V)
4 Conclusions and Perspectives
5 Acknowledgements
6 References

Abbreviations

Abz        o-aminobenzoic acid
Ac         Acetyl
ADE        Antibody-dependent enhancement
ADMET      Absorption, distribution, metabolism, excretion, toxicity
AIDS       Acquired immunodeficiency syndrome
amc        4-Methylcoumaryl-7-amide
ARV        Antiretroviral
Boc        tert-butoxycarbonyl
CV         Cross validation
DEN        Dengue virus
DF         Dengue fever
DHF        Dengue hemorrhagic fever
DoE        Design of experiments
DSS        Dengue shock syndrome
ECV        External cross validation
FBD        Fragment-based design
FDA        Food and Drug Administration
FDS        Fold decrease in susceptibility
FeLV       Feline leukemia virus
GRT        Genotypic resistance testing
HAART      Highly active anti-retroviral therapy
HIV-1      Human immunodeficiency virus type 1
HIV-1RT    HIV-1 reverse transcriptase
HTLV       Human T-cell leukemia virus
ICTV       International Committee on Taxonomy of Viruses
IND        Investigational new drug
JEV        Japanese encephalitis virus
kcat       Turnover number, rate of catalysis
Ki         Inhibition constant
Km         Michaelis constant
LOOCV      Leave-one-out cross validation
NDA        New drug application
NNRTIs     Non-nucleoside reverse transcriptase inhibitors
NRTIs      Nucleos(t)ide reverse transcriptase inhibitors
NS2B(H)    Non-structural protein 2B hydrophilic sequence
NS3        Non-structural protein 3
NS3pro     NS3 protease
PAGE       Polyacrylamide gel electrophoresis
PCA        Principal component analysis
PCM        Proteochemometrics
PLS        Partial least squares projections to latent structures
PRT        Phenotypic resistance testing
QSAR       Quantitative structure-activity relationship
QSPR       Quantitative structure-property relationship
RdRp       RNA-dependent RNA polymerase
RS         Rough sets
SBD        Structure-based drug design
SED        Statistical experimental design
SIV        Simian immunodeficiency virus
SMD        Statistical molecular design
SOE-PCR    Splicing by overlap extension polymerase chain reaction
SVM        Support vector machine
TBEV       Tick-borne encephalitis virus
V-HTS      Virtual high-throughput screening
VPT        Virtual phenotypic testing
WNV        West Nile virus
YFV        Yellow fever virus

1 Introduction

1.1 Modern Drug Discovery and Development (MDDD) Process

In contrast to traditional trial-and-error drug design, which involves testing chemicals on tissues, organs or whole animals, the modern drug discovery process exploits knowledge about drug targets and is therefore referred to as rational drug design. Modern drug discovery is a lengthy, costly and complex process that usually consists of three stages, namely discovery research, pre-clinical development and clinical development (Figure 1). A brief overview of the different drug discovery and development phases is given below.

1) Drug Target Identification

A drug target is a key biomolecule involved in a particular functional (metabolic or signaling) pathway that is specific to a pathological condition or to the survival or infectivity of a pathogen. The revolution in automation and information technologies has made whole-genome sequencing and proteome characterisation routine tasks, and has consequently opened windows to new worlds of potential drug targets.

2) Target Validation

Proving the relevance of a target to a disease process or to pathogenicity is difficult, and there is no general approach for doing so. A plausible relevance of a target may be demonstrated by knock-out or knock-in animal models, site-directed mutagenesis studies [paper II], use of ribozymes or neutralising antibodies, along with many other approaches.

3) Lead Discovery

A ‘lead’ is a chemical compound that has pharmacological or biological activity and the potential to be optimised into a drug candidate.

Large libraries (10^6–10^7 compounds) of diverse substances may be tested on a drug target using real (usually in vitro) or virtual (in silico) high-throughput screening [1]. Generating millions of compounds and screening them in the wet lab is often practically unfeasible. Alternative approaches that require fewer compounds include statistical molecular design (SMD) [paper V], structure-based design (SBD) [papers II, III], fragment-based design (FBD), pharmacophore-based design (PBD), quantitative structure-activity/property relationship (QSAR/QSPR) modelling and proteochemometrics (PCM) [paper I]. These methods are of great importance in drug discovery processes.

4) Lead Optimisation

The structure of a ‘lead’ is used as a starting point for chemical modifications in order to obtain compounds with certain desired properties. These properties include ease of synthesis, specificity, efficacy, adherence to Lipinski’s rules [2] and many other criteria. Various in vitro and in vivo methods are used to test these properties [3]. Lead optimisation requires the simultaneous optimisation of multiple parameters and is thus a time-consuming and costly step.

5) Preclinical Development

Preclinical development involves in vitro studies and animal trials. Different formulations and wide-ranging dosages of the compounds are introduced and evaluated in order to obtain preliminary efficacy, general safety, toxicity patterns and pharmacokinetic information. A compound reaching the preclinical stage is referred to as a ‘candidate drug’, and a candidate drug with a good safety profile is called an ‘investigational new drug’ (IND).

An IND application, containing data about animal pharmacology and toxicology studies, chemistry and manufacturing, clinical protocols and investigators, is filed at the Food and Drug Administration (FDA) for approval of clinical trials.

6) Clinical Trials

Pharmaceutical clinical trials are commonly divided into four phases (Phase I–IV). At the clinical drug development stage the selected IND passes through three phases (Phase I–III). Phase I uses a fairly small number (20–80) of healthy volunteers; the primary goal is to assess the safety of the agent and to determine an acceptable dose for Phase II. Phase II normally uses about 30–300 patients; the main goal is to make an initial assessment of the activity of the drug against a disease. Only about 25% of the candidates that enter clinical trials complete Phases I and II successfully and pass to Phase III. In Phase III the candidate drug is administered to a larger group (hundreds to thousands) of patients with the goal of demonstrating its safety and proving its effectiveness over existing standard therapies. After successful completion of a Phase III trial, a new drug application (NDA) is filed with the regulatory authorities for approval to launch the new drug onto the market [4].

Phase IV trials (also called pharmaco-epidemiologic studies or post-marketing surveillance) are designed to detect rare or long-term adverse effects over a larger patient population and a longer timescale than was possible during Phases I–III of the clinical trials.
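Lipinski’s rules, mentioned under Lead Optimisation above, lend themselves to a simple computational filter. The sketch below is illustrative only. It assumes that the four properties have already been computed for each compound; the `Compound` class, its field names and the example values are hypothetical. The thresholds are those commonly quoted for the rule of five, and tolerating a single violation is a common convention rather than something prescribed in this thesis.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Pre-computed properties of a (hypothetical) compound."""
    name: str
    mol_weight: float       # molecular weight in Da
    log_p: float            # octanol-water partition coefficient
    h_bond_donors: int      # number of hydrogen-bond donors
    h_bond_acceptors: int   # number of hydrogen-bond acceptors


def lipinski_violations(c):
    """Return the rule-of-five criteria that the compound violates."""
    violations = []
    if c.mol_weight > 500:
        violations.append("molecular weight > 500 Da")
    if c.log_p > 5:
        violations.append("logP > 5")
    if c.h_bond_donors > 5:
        violations.append("more than 5 hydrogen-bond donors")
    if c.h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen-bond acceptors")
    return violations


def passes_rule_of_five(c, max_violations=1):
    """Assumed convention: accept compounds with at most one violation."""
    return len(lipinski_violations(c)) <= max_violations


# Invented example values, for illustration only.
small_lead = Compound("lead-A", mol_weight=350.0, log_p=2.1,
                      h_bond_donors=2, h_bond_acceptors=5)
bulky_lead = Compound("lead-B", mol_weight=720.0, log_p=6.3,
                      h_bond_donors=7, h_bond_acceptors=12)

for lead in (small_lead, bulky_lead):
    print(lead.name, passes_rule_of_five(lead), lipinski_violations(lead))
```

In practice such a filter is only one of the many criteria weighed during lead optimisation, which is why the function reports the individual violations instead of a bare yes/no answer.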


Target Identification
• Classical biological techniques (molecular, cellular)
• Advanced techniques (genomics, proteomics, bioinformatics)

Target Validation
• Knock-in/knock-out animal models
• 3D structural studies (real/virtual)
• Interfering molecules (ribozymes, antibodies, etc.)

Lead Identification
• High-throughput screening (real/virtual)
• Structure (real/virtual)-based design
• Statistical molecular design (SMD)
• QSAR/PCM

Lead Optimization
• In vitro/in vivo screening
• Structure-based design
• SMD
• QSAR/PCM modeling

Pre-clinical Development
• In vitro studies (cell lines)
• Trials on animal populations

Clinical Trials
• Phase I (healthy volunteers, 20-80): safety, PK, PD, tolerability
• Phase II (patients, 20-300): safety confirmation
• Phase III (patients, 100s-1000s): efficacy, compared with standard drug

Marketing
• Drug registration & marketing
• Post-marketing surveillance (Phase IV trial)

Figure 1. Overview of drug discovery and development process.

1.2 Computational Methods in Drug Discovery and Design

All existing drugs on the market together hit only about 500 drug targets, representing less than 1% of the potential targets of the presumed druggable proteome. Similarly, the existing drugs (slightly over a thousand in number) represent a tiny portion of the chemical space of potential drug-like molecules. Thus a large number of potential targets and a huge number of small molecules of likely therapeutic value remain to be exploited.

Drug discovery and development is an expensive and time-consuming process. The average total cost of developing a drug is around US$ 1–2 billion, the typical development time is 10–15 years [5] and 98% of initially promising projects fail at later stages of development. Integration of novel computational approaches at every stage of drug discovery and development, such as lead discovery, lead optimisation and other pre-clinical stages, might reduce the failure rate, costs and time for drug discovery and development. In recent years, many different computational methods have been developed [6-9]; the most common of these are briefly introduced below.

1.2.1 Virtual High-throughput Screening (V-HTS)

Conventional high-throughput screening (HTS) is the process of assaying large libraries of compounds against drug targets for their biological and potential therapeutic activity.

Although conventional HTS is a common method at early stages of the drug discovery process, its application in large projects is costly and inefficient. Many of the hits identified from HTS fail in later stages, mainly due to undesired effects in vivo [10].

Virtual high-throughput screening (V-HTS) is the process in which large libraries of diverse compounds are created virtually in a computer and then sorted using suitable algorithms in order to find potential hits, which may then be obtained physically and tested in vitro to confirm their activity against the target [11]. V-HTS reduces the costly synthesis and testing of large numbers of compounds and thus makes it possible to cover a large chemical space without the commitment in time, materials and infrastructure that conventional HTS demands.

One example of V-HTS is high-throughput docking (HTD), which is applied in cases where 3D structural information on the target is available (see Structure-based drug design).

1.2.2 Structure-Based Drug Design (SBD)

When the 3D structure of the target is available, various computational methods can be applied to predict the interaction of ligands with the target. One commonly used approach is molecular docking, where the 3D structures of the target macromolecule and the ligand are used to model the target-ligand binding [5,8][paper IV].

Structure-based design is a highly useful approach during the early phases of modern drug discovery [12]. For example, high-throughput docking is applied to screen a large number of compounds (V-HTS) on targets with known 3D structures. However, the main problem restricting its application is that reliable 3D information (i.e. a crystal structure or homology model) on the target macromolecule is often unavailable [13,14].

1.2.3 Quantitative Structure-Activity Relationship (QSAR)

Quantitative structure-activity relationship (QSAR) modelling is a computational technique that correlates the chemical structure (represented by multiple physicochemical property descriptors) of a series of compounds with their activities, especially biological activities. The model thus created can be used to analyse the properties of compounds and to predict new structures with improved properties [15,16].

QSAR modelling is of fundamental importance in the early and preclinical drug development stages. Initially, global QSAR models are generated on previously collected data (from in-house experimentation, databases and/or the literature). At a later stage, when a large number of compounds are selected and synthesised in specific series, local and more accurate QSAR models are created based on the specific compound series [15].

In the pharmaceutical industry, QSAR is generally used to screen for ADMET (absorption, distribution, metabolism, excretion and toxicity) properties, which are crucial to the success of candidate drugs.

Many of the mathematical techniques used in QSAR are also used in proteochemometrics (PCM), and are summarised in the PCM section below.
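The idea of correlating descriptors with activity can be illustrated with a minimal sketch. It is not the modelling procedure used in this thesis: ordinary least squares stands in for the multivariate techniques named later (PLS, SVM, RS), and the descriptor values, the activity rule and the function names `fit_qsar` and `predict` are all invented for the illustration.

```python
import numpy as np

# Hypothetical descriptor matrix: one row per compound, one column per
# physicochemical descriptor (values invented for illustration).
X = np.array([
    [1.2, 0.5],
    [2.3, 1.1],
    [3.1, 0.7],
    [4.0, 1.8],
    [5.2, 1.2],
])

# Synthetic activities generated from a known linear rule so that the
# fitted coefficients can be checked: activity = 0.5 + 0.8*d1 - 0.3*d2.
y = 0.5 + 0.8 * X[:, 0] - 0.3 * X[:, 1]


def fit_qsar(X, y):
    """Least-squares fit of activity on descriptors (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Predict activities for one or more descriptor rows."""
    X = np.atleast_2d(X)
    return coef[0] + X @ coef[1:]


coef = fit_qsar(X, y)

# Predict the activity of a new, untested (hypothetical) compound.
new_compound = np.array([2.0, 1.0])
predicted = predict(coef, new_compound)[0]
print(coef, predicted)
```

With real data the descriptors are usually numerous and correlated, which is one reason why projection methods such as PLS are preferred over plain least squares.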

1.2.4 Statistical Molecular Design (SMD)

Congeneric series of compounds are often assembled from two or more chemical building blocks. SMD provides an approach for selecting chemical building blocks so as to efficiently span the chemical space and increase the information content of compound libraries [15][paper V]. However, SMD can also be used to optimally select compounds from large collections, even if they do not comprise series constructed from distinct building blocks.

Statistical experimental design (SED), or design of experiments (DoE) as it is also called, is the approach used to design a set of experiments for investigating the effects of a number of experimental variables (factors) on a response [17,18].

SMD is DoE applied to the design of molecules, i.e. the factors, experiments and responses represent the chemical features of the compounds to be synthesised and their biological actions [19].

The advantage of SMD is that it produces more representative and accurate data for experimental assessment at less expense, compared with random testing of chemicals. Therefore, SMD has a great potential in modern drug discovery.
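The design principle can be sketched with a two-level factorial design. The example is a simplified illustration, not the design used in paper V: the tripeptide positions and the two candidate building blocks per position are hypothetical, and each position is treated as one factor with a low (-1) and a high (+1) level. A full factorial design enumerates all eight combinations, whereas a half fraction (defining relation I = ABC) keeps four of them while remaining balanced in every factor.

```python
from itertools import product

import numpy as np

# Hypothetical building blocks: two candidates per peptide position,
# coded as the low (-1) and high (+1) level of the corresponding factor.
BUILDING_BLOCKS = {
    "P1": {-1: "Gly", 1: "Trp"},
    "P2": {-1: "Ala", 1: "Arg"},
    "P3": {-1: "Ser", 1: "Lys"},
}


def full_factorial(n_factors, levels=(-1, 1)):
    """All combinations of the coded levels, one run per row."""
    return np.array(list(product(levels, repeat=n_factors)))


def half_fraction(design):
    """Keep the runs whose coded levels multiply to +1 (I = ABC)."""
    return design[design.prod(axis=1) == 1]


def decode(design):
    """Translate coded runs into building-block sequences."""
    positions = list(BUILDING_BLOCKS)
    return [
        "-".join(BUILDING_BLOCKS[p][int(level)] for p, level in zip(positions, run))
        for run in design
    ]


design = full_factorial(len(BUILDING_BLOCKS))   # 8 candidate compounds
reduced = half_fraction(design)                 # 4 compounds to synthesise
peptides = decode(reduced)
print(peptides)
```

In the reduced design every building block still appears equally often at each position, which is what allows the main effect of each position to be estimated from half the number of compounds.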

1.2.5 Proteochemometrics (PCM)

Proteochemometrics (PCM), which was developed by Wikberg and co-workers in 1999 [20], is a novel bioactivity modelling technique that describes the interaction space (the space spanned by the non-covalent interactions between all molecules) between a series of multiple ligands and targets (macromolecules).

PCM is similar to QSAR, but superior in the sense that PCM utilises not only the chemical information contained in a series of chemical compounds, as does QSAR, but also the information contained in a series of biological targets (Figure 2). PCM models are therefore more descriptive and generally have higher predictive ability than QSAR models.

A simplified schema for PCM modelling is given below. (Note that many of the steps in PCM modelling also apply to QSAR modelling.)

Derivation of descriptors of biological macromolecules (targets) and chemical compounds (ligands) – The descriptor matrix of the targets can be viewed as a set of points in an n-dimensional space representing the chemical space of macromolecules. Similarly, the descriptor matrix of the ligands can be viewed as a set of points in an m-dimensional space representing the chemical space of ligands. To improve the quality of the PCM model, the descriptor matrices are pre-processed prior to modelling by methods such as data centring and scaling to unit variance, or block-scaling of groups of related descriptors. Two matrices (target and ligand) are thus obtained in this step.

Generation of interaction-term (cross-term) descriptors – These descriptors are obtained by multiplying any two descriptors in the target and ligand matrices generated in the previous step. Usually three types of interaction (cross-term) descriptor matrices are generated, namely target-target, ligand-ligand and target-ligand.

Correlation of independent and response variables – The descriptors of targets, ligands and the cross-terms are used as independent variables, and the values of measured biological activity are considered to be the dependent (response) variables in PCM modelling. The matrix of independent variables is correlated with the matrix of response variables using various machine learning techniques such as PLS, SVM and RS, and PCM models are created.

Model validation – Validation (the process of determining the degree to which a model is an accurate representation of the real process from the perspective of the intended uses of the model) is a very important step in modelling because it allows us to evaluate, compare and eventually choose the best model in terms of predictive ability.

The most common validation techniques include cross-validation, external validation, permutation techniques and interpretation validation [21,22]. The internal and external cross-validation used in this thesis [paper I] are introduced below.

Cross-validation (CV) – CV is the most popular approach for estimating the predictive power and potential over-fitting of models [21,23,24]. In G-fold CV, the data are divided into G groups of equal size. Each group g is excluded in turn, the model is developed on the remaining G-1 groups (training set) and a prediction error is calculated on the excluded group (test set). This procedure is repeated for g = 1, 2, …, G and the prediction errors for all test sets are combined. In general, the number of groups (G) is 5–10. When G is equal to the number of observations in the data set, the CV is known as leave-one-out cross-validation (LOOCV).

The fraction of the predicted y-variation ($Q^2$, or cross-validated $R^2$), which is the measure of the predictive ability of the regression model, is calculated as:

$$Q^2 = 1 - \frac{\sum_{i=1}^{n} (y_{i.ob} - y_{i.pr})^2}{\sum_{i=1}^{n} (y_{i.ob} - y_{ob.av})^2},$$

Here $y_{i.ob}$, $y_{i.pr}$ and $y_{ob.av}$ represent the observed, predicted and mean values, respectively, of the response variables.

External cross-validation (ECV) – A trained model is used to predict the output variable for a set of data points for which an observed activity value is available, but which were not included in the model training and hence are completely unknown (i.e. external) to the model [25]. After model training, the performance of the model on the test set is estimated using the explained variance for prediction of the external test set ($Q^2_{ext}$):

$$Q^2_{ext} = 1 - \frac{\sum_{i=1}^{n} (y_{i.ob} - y_{i.pr})^2}{\sum_{i=1}^{n} (y_{i.ob} - y_{ob.av})^2},$$

Here, $y_{i.ob}$, $y_{i.pr}$ and $y_{ob.av}$ represent the observed, predicted and mean values, respectively, of the response variables.

ECV provides a more reliable performance estimate than internal validation for assessing the quality of model predictions on a data set of unknown compounds.

Model interpretation – A PCM model can be used for interpretation of the factors that determine target-ligand interaction activity and for prediction of the activity of novel ligands and novel targets. The simplest way to interpret a validated PCM model, in the case that PLS is used to create it, is to inspect the coefficients of the PLS regression equation: a large absolute value of a coefficient for a ligand or protein descriptor indicates that the property represented by this descriptor has a large influence on the interaction activity, and vice versa. In the same way, a large absolute value of a regression coefficient for a cross-term indicates a high influence of the represented combination of features on selective interactions, and vice versa. The sign (±) of a coefficient indicates the direction (positive/negative) of the influence of the descriptor on the response. Further, the contribution of a group of descriptors (e.g. an amino acid, a sequence segment, part of a ligand, etc.) to the activity (e.g. drug resistance, binding affinity, inhibition, etc.) can be computed from the sum of the values for all the descriptors involved.

All such information presents a detailed picture of the nature of the molecular interactions and provides a basis for the application of PCM modelling in rational drug design and therapy. In drug design, the information gives directions on how to design or modify a ligand in order to achieve a desired property for its interaction with the target.
Moreover, in target validation, PCM can be used to identify and quantify not only the role of the active-site residues but also the influence of other residues distant from the active site in a target [paper I & VI], [26,27].

The major advantage of PCM, in contrast to QSAR models, which are built on single targets, is that PCM incorporates data from multiple targets in a single general model, which can be used to extrapolate the activities not only of novel ligands on known targets or mutants but also of known compounds on new targets.

Another advantage of PCM is that, unlike structure-based drug design, it can also be applied in situations where reliable 3D information on the target macromolecules is unavailable.

Overall, PCM is a robust and general method and therefore has a wide scope and many applications in modern drug discovery, design and therapy. Since its introduction in 1999, PCM has been successfully applied in several studies, including analysis of drug interactions with different classes of G-protein coupled receptors [20,28-33], antibody-antigen interactions [34], cleavability of protease substrates [35], substrate interactions with dengue virus NS3 proteases [36], HIV resistance to protease inhibitors [37] and HIV resistance to NRTIs [papers I & VI].

In conclusion, PCM will have a pivotal role in the future, not only in the discovery of novel drugs and targets but also in optimising drug therapy.
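The main steps of the PCM schema described in this section (scaling of descriptors, generation of target-ligand cross-terms, correlation with activity and G-fold cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the workflow of paper I: the data are synthetic, every row is assumed to describe one target-ligand pair, only target-ligand cross-terms are generated, and ordinary least squares stands in for PLS. All function names are hypothetical.

```python
import numpy as np


def autoscale(M):
    """Centre each descriptor column and scale it to unit variance."""
    return (M - M.mean(axis=0)) / M.std(axis=0, ddof=1)


def cross_terms(T, L):
    """Target-ligand cross-terms: each target descriptor multiplied by
    each ligand descriptor, row by row."""
    return np.einsum("ij,ik->ijk", T, L).reshape(len(T), -1)


def q2(y_obs, y_pred):
    """Q2 = 1 - sum((y_obs - y_pred)^2) / sum((y_obs - mean(y_obs))^2)."""
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def cross_validated_q2(X, y, n_groups=5, seed=0):
    """G-fold CV; least squares is used here as a stand-in for PLS."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_groups)
    y_pred = np.empty(len(y), dtype=float)
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        A_train = np.column_stack([np.ones(train.sum()), X[train]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        y_pred[test_idx] = A_test @ coef
    return q2(y, y_pred)


# Synthetic data: 60 target-ligand pairs with 2 target and 3 ligand descriptors.
rng = np.random.default_rng(42)
n_pairs = 60
T = autoscale(rng.normal(size=(n_pairs, 2)))
L = autoscale(rng.normal(size=(n_pairs, 3)))
TL = cross_terms(T, L)                      # 2 x 3 = 6 cross-term columns

# The synthetic activity depends on one target-ligand interaction term.
y = (1.0 + T[:, 0] - 0.5 * L[:, 1] + 2.0 * T[:, 1] * L[:, 2]
     + rng.normal(scale=0.1, size=n_pairs))

q2_with_cross = cross_validated_q2(np.hstack([T, L, TL]), y)
q2_without_cross = cross_validated_q2(np.hstack([T, L]), y)
print(q2_with_cross, q2_without_cross)
```

Because the synthetic activity contains a target-ligand interaction, the model without cross-terms cannot describe it, which mirrors the argument that cross-terms carry the information on selective interactions.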

Figure 2. General Schema of PCM Modelling

1.3 Enzyme as Drug Target in Modern Medicine

Modern medicine is a molecular science, mainly directed towards diagnosis and therapy. For the latter, the main approach is the use of medicines, which to the largest extent are the small molecules that we term drugs and which are directed against macromolecular drug targets whose perturbation has a beneficial effect on disease states. A majority of the drug targets are enzymes, which have a vital role in the life processes. Enzymes are ideal drug targets because of their: 1) essentiality in biochemical processes, 2) specificity for particular physiological and pathological processes and substrates, 3) simplicity, which allows them to be characterised structurally and mechanistically in fairly simple ways, 4) general availability (i.e. they can generally be obtained on a large scale and with high purity for high-throughput studies), and 5) the propensity to find small molecules capable of inhibiting them.

Today, almost half of the small-molecule drugs on the market are inhibitors of enzymes [38]. Current human genome surveys suggest that druggable targets are dominated by enzymes. In 2001 the worldwide sales of small-molecule drug inhibitors of enzymes were about US$ 65 billion, and the market was growing with an average annual increase of 8% [38]. Another survey suggests that 50–75% of all drug discovery research at the major US pharmaceutical and biotechnological companies is focused on enzymes as primary targets [39]. It is therefore likely that specific enzyme inhibition will remain a major focus of pharmaceutical research for a long time to come.

1.4 The Virus in Modern Medicine

“It’s time to close the book on infectious diseases and pay more attention to chronic ailments such as cancer and heart disease…the war against infectious diseases has been won…”

This remark was made in 1967 by the eminent epidemiologist, paediatrician and American Surgeon General, William H. Stewart, who, like many others at that time, was buoyed up by the pace of scientific advancement in the mid-20th century and believed that the battle against bacteria and viruses was over [40,41]. Unfortunately, as we now know, he has been proved wrong: since then more than 20 new infectious diseases, including HIV/AIDS, have appeared on the world stage. Today, more than 25% of deaths worldwide are caused by infectious diseases, mainly viral and bacterial.

Twenty-six years later Donald A. Henderson at the U.S. Office of Science and Technology Policy had to say:

“The recent emergence of AIDS and Dengue haemorrhagic infections, among others, is serving usefully to disturb our ill-founded complacency about infectious diseases. Such complacency has prevailed in this country (USA) throughout much of my career ... It is evident now, as it should have been then, that mutation and change are facts of nature, that the world is increasingly interdependent, and that human health and survival will be challenged, ad infinitum, by new mutant microbes, with unpredictable pathophysiological manifestations...” [42].

A virus (Latin: toxin or poison [43]) is a sub-microscopic particle and the simplest infectious microorganism found in nature. It has neither a cellular organisation, nor the ability to replicate outside living host cells, nor the genetic information encoding the molecular machinery required for the generation of genetic information and for protein synthesis. It possesses only one type of nucleic acid, either DNA or RNA [44,45]. When the virus enters a cell and hijacks its internal machinery, things change, and this non-living miniature capsule of genetic information emerges as a dominant, living and replicating organism, which may cause cancers or acute or chronic infections.

Classification

The simplest and most useful classification system for viruses is considered to be the Baltimore classification (Figure 3) [46], which places each virus into one of seven groups based on its genetic content (DNA or RNA) and replication strategy (mRNA synthesis).

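[Figure 3: diagram of the routes from each type of viral genome to mRNA. Groups shown: I (dsDNA), II (+ssDNA), III (dsRNA), IV (+ssRNA), V (-ssRNA), VI (+ssRNA), VII (dsDNA). Intermediates shown: DNA/RNA hybrid, dsDNA, ssRNA.]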

Figure 3. The Baltimore Classification System. The unique pathways from the various viral genomes to messenger RNA define specific virus groups based on the nature and polarity of the viral genome [46].
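The seven groups of the Baltimore classification can be written down as a simple lookup table. The sketch below is only a convenience for the reader: the dictionary and function names are hypothetical, and the example assignments (HIV-1 to group VI; dengue virus and Japanese encephalitis virus, both flaviviruses, to group IV) follow the standard classification of the viruses studied in this thesis.

```python
# Baltimore groups: genome type and, where relevant, replication strategy.
BALTIMORE_GROUPS = {
    "I": "dsDNA",
    "II": "(+)ssDNA",
    "III": "dsRNA",
    "IV": "(+)ssRNA",
    "V": "(-)ssRNA",
    "VI": "(+)ssRNA, replicating through a DNA intermediate (reverse transcription)",
    "VII": "dsDNA, replicating through an RNA intermediate (reverse transcription)",
}

# Example assignments for the viruses discussed in this thesis.
VIRUS_GROUP = {
    "HIV-1": "VI",
    "Dengue virus": "IV",
    "Japanese encephalitis virus": "IV",
}


def genome_type(virus):
    """Return the Baltimore group and genome description of a listed virus."""
    group = VIRUS_GROUP[virus]
    return group, BALTIMORE_GROUPS[group]


for name in VIRUS_GROUP:
    print(name, genome_type(name))
```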

1.5 Retroviruses

Retroviruses (Class VI in the Baltimore Classification) form the family Retroviridae, which includes the genus Lentivirus. They are viruses that contain two copies of positive-sense ssRNA as genetic material, together with the reverse transcriptase enzyme needed to transcribe ssRNA into dsDNA.

Retroviruses are unique in their reproduction: they reproduce by transcribing their RNA into DNA using the retroviral reverse transcriptase enzyme. All retroviruses have a similar genomic organisation and express three main polypeptides, namely 1) gag, which contains the matrix, capsid and nucleocapsid proteins that make up the internal structure of the virion; 2) gag-pol, which contains the above gag structural proteins and three enzymes, namely reverse transcriptase, protease and integrase; and 3) env, which contains two proteins that are located at the exterior of the mature virion and take an important part in the infection process through their involvement in cell receptor recognition [47].

Retroviruses cause a broad spectrum of diseases, which include certain cancers in humans and animals, immunodeficiencies such as AIDS, myelopathies, and chronic conditions such as arthritis, infective dermatitis and uveitis.

The first retrovirus to be identified was the feline leukemia virus (FeLV) in 1973 [48]. The first human retrovirus to be isolated was the human T-cell leukemia virus (HTLV-1), by Poiesz and co-workers in 1980 [49].

1.5.1 HIV/AIDS

HIV-1 was discovered in 1983 in AIDS patients in France and the United States [50,51]. HIV-2 was subsequently isolated from patients in West Africa in 1986 [52]. Both HIV-1 and HIV-2 are transmitted by body fluids, and they have indistinguishable clinical manifestations, including acquired immunodeficiency syndrome (AIDS) (see also under Progression and Symptoms).

Prevalence and Transmission

Molecular phylogenetic studies suggest that HIV evolved from the closely related simian immunodeficiency virus (SIV), a lentivirus that infects chimpanzees and gorillas in Africa [53,54]. It is believed that SIV was transmitted in the late 19th or early 20th century to humans involved in bushmeat activities.

AIDS was first recognised among homosexual men in the United States in 1981 [55]. Since then, HIV/AIDS has killed more than 25 million people. In 2010, around 34 million people were living with HIV (an increase of 17% since 2001), 1.8 million people died, and 2.7 million people became infected with the virus (Figure 4). From 2001 to 2010, the number of people living with HIV increased by 34% in North America and Western and Central Europe, reaching 2.2 million cases (Figure 5). In Eastern Europe and Central Asia it increased by about 250%, reaching 1.5 million cases, and in South and South-East Asia by about 5%, reaching 4.0 million cases. However, sub-Saharan Africa is the worst affected region, with around 68% (23 million) of all HIV-positive people worldwide living there [56].

[Figure 4 map labels: North America 1.3 million; Western & Central Europe 0.84 million; Eastern Europe & Central Asia 1.5 million; East Asia 0.79 million; Middle East & N. Africa 0.47 million; Caribbean 0.2 million; South & South-East Asia 4.0 million; Latin America 1.5 million; Oceania 0.054 million; Sub-Saharan Africa 22.9 million.]

Figure 4. Estimated HIV prevalence among young adults (age 15–49 years) worldwide, as of 2010 [56].

Figure 5. Number of people living with HIV worldwide from 1990 to 2010. (Figure reproduced from [56], with permission).

HIV can be transmitted with body fluids from infected individuals: during unprotected sex (semen and vaginal secretions); via blood transfusions (blood and its products); during pregnancy, childbirth and breast-feeding (mother to child, with blood or milk); and through injection, surgery, tattooing and the like (sharing of contaminated injection and surgical equipment and skin-piercing tools).

Progression and Symptoms

There are three stages of HIV progression: 1) acute or primary infection, 2) chronic infection or latency, and 3) AIDS (Figure 6).

During the first stage, the virus replicates rapidly, which leads to acute viremia, with millions of virus particles per millilitre of blood [57]. This stage is associated with a marked drop in the level of CD4+ T cells, activation of CD8+ T cells, and production of antibodies. The acute stage may last for several (usually 2–4) weeks, and may be associated with flu-like symptoms (fever, swollen lymph nodes, sore throat, muscle pain, rash and weakness). Sometimes headache, nausea and vomiting, enlarged liver/spleen, weight loss, and neurological symptoms are also seen [58]. This is the most infectious stage.

[Figure 6 plot: CD4+ T-lymphocyte count (cells/µl, left axis, 0–1000) and plasma HIV RNA level (log copies/ml, right axis, 1–7) against time (weeks, months, years) across the acute, chronic (latency) and AIDS phases.]

Figure 6. Schematic representation of the typical course of HIV progression. During the primary infection (acute phase), the viremia (red) rises to a maximum, the CD4+ T-cell count (blue) drops rapidly and then rises again to a subnormal level, and the immune system is rapidly activated. During the latency phase, the CD4+ T-cell count decreases slowly, while the viremia and the immune response rise slowly. In the AIDS phase, the virus count increases rapidly and opportunistic infections appear. (Modified with permission from the journal [59]).

During the latency stage, the number of viral particles is reduced by the immune system, and the acute infection turns into a chronic infection. This stage can last from two weeks up to 20 years. Lymph nodes and other tissues rich in CD4+ T cells become infected. There may be few or no symptoms, but the individuals are still infectious in this stage [60].

AIDS is the final stage of the HIV infection, in which the CD4+ T cell count drops below a critical level (< 200 cells/µl), and various opportunistic bacterial, viral and fungal infections (respiratory tract, skin and throat infections), cancers and other conditions appear due to loss of cell-mediated immunity. The viral load also increases during this stage.

Vaccination

Despite decades of research in the field, no effective HIV vaccine has been developed so far. The major challenge to HIV vaccine development is the hyper-variability of the virus [61].

There is one clinical trial, however, which suggests that development of an effective HIV vaccine might be feasible. It was conducted in Thailand in 2009. In this community-based trial, the test subjects were given a combination of two previously unsuccessful recombinant vaccines (the canarypox vector vaccine ALVAC-HIV and the glycoprotein 120 subunit vaccine AIDSVAX B/E). The trial showed a 31.2% reduction in the probability of becoming infected over a 42-month period, compared to controls that did not receive the vaccine [62]. Further investigation is underway to assess the effects of additional boosting doses of the vaccine and to determine its efficacy against HIV subtypes found in other parts of the world, such as sub-Saharan Africa [56].

1.5.2 Life Cycle of HIV

All retroviruses share a similar replication pathway, which is exemplified here by the HIV-1 replication cycle (Figure 7). The cycle starts when the virus attaches to a receptor on the surface of a host cell, passes through a series of crucial steps including the multiplication of the virus, and finishes with the release of immature virus particles from the host cell.

The main target cell for HIV-1 is the CD4+ cell, where the virus in the first step binds to the CD4 receptor and one of two co-receptors, CCR5 or CXCR4. The viral envelope then fuses with the cell membrane of the CD4+ cell, and the HIV matrix, capsid protein, two RNA molecules, reverse transcriptase, protease, and integrase enter the cytoplasm. During this process, so-called uncoating, the viral RNA is released into the CD4+ cell. Reverse transcription of the viral ssRNA into dsDNA then occurs with the help of the HIV-1 reverse transcriptase (HIV-1 RT). HIV-1 RT, unlike many other DNA polymerases, lacks proof-reading activity; reverse transcription is therefore highly error-prone, and multiple mutations may be introduced into the HIV-1 genome during this stage. The transcribed DNA must then be imported into the nucleus of the CD4+ cell, where it is integrated into a host chromosome by the HIV-1 integrase enzyme. In the next step, viral DNA is transcribed into mRNA to produce several regulatory proteins, such as Tat (trans-activator of transcription) and Rev (regulator of virion expression). Next, the gag, gag-pol and env polyproteins are expressed from viral mRNA. In the final step, the viral genome and polyproteins are assembled into immature virions that bud out of the cell. The virions mature and become infectious after the gag and gag-pol polyproteins are cleaved into different structural and functional proteins by the HIV-1 protease.

[Figure 7 labels: glycoprotein (gp120), envelope, capsid, CD4 receptor, HIV RNA, HIV proteins, host cell; steps numbered 1–8.]

Figure 7. Schematic representation of HIV-1 replication cycle. 1. Binding, 2. Fusion and infection, 3. Reverse transcription, 4. Integration, 5. Transcription, 6. Assembly, 7. Budding, 8. Maturation.

1.5.3 HIV/AIDS Drug Targets and Drugs

Access to antiviral therapy has transformed HIV infection from an almost certainly fatal disease into a chronic condition. In 2002 only 2.7% of HIV-positive people in low- and middle-income countries had access to anti-HIV therapy, while by the end of 2010 this figure had risen to 47%. The major factor behind this increased access is the falling price of antiretroviral therapy, mainly due to generic competition. In the year 2000, the cost of a three-drug HAART combination lay in the range US$ 10 000–15 000 per person per year. The current cost of the same regimen is less than US$ 120 per person per year [63].

More than 25 anti-HIV drugs have been approved, and all except three target one of three HIV-1 enzymes, namely the reverse transcriptase, protease or integrase [64,65]. The main classes of anti-HIV drugs are described below.

HIV-1 RT Inhibitors

HIV-1 RT is the sole enzyme that produces dsDNA from the ssRNA genome, which is an essential step in retroviral replication [66-68]. The enzyme’s substrate binding site is the drug target for more than half of the anti-HIV drugs. There are two types of drugs that target HIV-1 RT, namely nucleoside/nucleotide analogue reverse transcriptase inhibitors (NRTIs) and non-nucleoside analogue reverse transcriptase inhibitors (NNRTIs). NRTIs are analogues of the nucleotide substrates that lack the 3′-OH group present in the deoxyribose ring of the four natural deoxyribonucleotides (Figure 8).

[Figure 8 panels: AZT, d4T, FTC, ddC, 3TC, ddI, ABC, TDF.]

Figure 8. Chemical structures of clinical NRTIs.

NRTIs are phosphorylated by cellular kinases to form 5′-triphosphates, which are then used by HIV-1 RT as substrates and become incorporated in the extending DNA chain [66,68]. Due to the absence of a 3′-OH group in their sugar moiety, NRTIs act as chain terminators, blocking further elongation of the DNA [69]. Eight NRTIs have been approved for clinical use, namely zidovudine (AZT), didanosine (ddI), zalcitabine (ddC), tenofovir (TDF), lamivudine (3TC), emtricitabine (FTC), abacavir (ABC), and stavudine (d4T). Of these, zalcitabine (ddC) was discontinued in 2006 due to the availability of better alternatives and its severe adverse effects, which include peripheral neuropathies [70].

NNRTIs represent a group of HIV-1 RT inhibitors that bind in the so-called NNRTI binding pocket, where they induce conformational changes in the RT that halt the reverse transcription of ssRNA. The approved NNRTIs are nevirapine (NVP), delavirdine (DLV), rilpivirine (RPV), etravirine (ETR) and efavirenz (EFV).

HIV-1 Protease Inhibitors (PIs)

Protease inhibitors represent the second largest group of HIV-1 inhibitors in use. PIs are analogues and competitors of the natural peptide substrates for the enzyme active site. Most of the clinically approved PIs, such as amprenavir (APV), darunavir (DRV), atazanavir (ATV), lopinavir (LPV), indinavir (IDV), nelfinavir (NFV), ritonavir (RTV) and saquinavir (SQV), are peptidomimetic substrate analogues that resemble the normal substrates but contain a non-cleavable hydroxyethylene core in place of the natural peptidic linkage [71]. A second-generation PI named tipranavir (TPV) has also been approved. TPV has been found to be effective against a majority of drug-resistant strains of HIV-1 and HIV-2. However, new strains resistant to TPV have also been identified [72,73].

HIV-1 Integrase Inhibitors

There is only one approved inhibitor belonging to this class, namely raltegravir (RGV). RGV inhibits the HIV-1 integrase, which is the enzyme responsible for integration of viral DNA into the host cell DNA [74].

Other Inhibitors of HIV-1

There are two inhibitors that target viral entry into the cell. One of them, enfuvirtide (ENF), binds to the viral trans-membrane protein gp41 and thereby prevents fusion of the virus with the host cell; it is classified as a ‘fusion inhibitor’ [65]. The other, more recently developed inhibitor, maraviroc (MVC), binds to the CCR5 chemokine co-receptor and is classified as a ‘CCR5 co-receptor antagonist entry inhibitor’ [75].

Another class of potential HIV inhibitors, called ‘maturation inhibitors’, inhibit the last step of gag polyprotein processing, which results in the formation of defective and non-infective viral particles [76].

1.5.4 Highly Active Antiretroviral Therapy (HAART)

None of the available antiretrovirals administered as monotherapy is able to suppress HIV replication for any extended period of time, due to the rapid emergence of resistant strains of HIV-1. Serial use of individual drugs leads to multi-drug resistance. Moreover, resistance to one drug may cause resistance to other similar drugs of the same class of antiretrovirals (cross-resistance). Therefore, highly active antiretroviral therapy (HAART), which is the combination of three or more drugs from at least two different classes of antiretrovirals, was introduced. The first-line treatments include two NRTIs in combination with one NNRTI or a PI [77].

HAART offers several advantages in anti-HIV therapy: 1) multiple drugs suppress viral replication more effectively than single drugs [78,79]; 2) the generation of new HIV variants (including resistant variants) is arrested due to suppression of viral replication; and 3) the emergence of resistance becomes unlikely, because resistance requires several different mutations to appear concomitantly that together render the virus resistant to all drugs in the regimen. In patients who receive HAART as their first-line therapy, the emergence of HIV resistance to the multiple drugs in the HAART regimen is possible only if the levels of drugs are insufficient to block viral replication, but sufficient to exert a positive selective pressure on variants with decreased drug susceptibility [52].

Although HAART has increased the life expectancy of people with HIV/AIDS dramatically since its introduction in 1996 [80,81], the virus is not eradicated from the patient. Therefore the therapy has to be continued throughout the patient’s life [65].
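The third advantage can be illustrated with a back-of-envelope calculation. The sketch below is not taken from the cited literature: it assumes that roughly 10^10 virions are produced per day, that any one specific point mutation arises with a probability of about 10^-5 per replication cycle, and that the required mutations arise independently of each other. Both numerical values are illustrative assumptions, and the function name is hypothetical.

```python
# Back-of-envelope sketch; all numeric inputs are illustrative assumptions.

def expected_mutants_per_day(virions_per_day: float,
                             per_site_probability: float,
                             n_required_mutations: int) -> float:
    """Expected number of new virions per day that carry all of
    n_required_mutations specific point mutations at once, assuming the
    mutations arise independently within a single replication cycle."""
    return virions_per_day * per_site_probability ** n_required_mutations


VIRIONS_PER_DAY = 1e10        # assumed daily virion production
PER_SITE_PROBABILITY = 1e-5   # assumed chance of one specific substitution

for k in (1, 2, 3):
    n = expected_mutants_per_day(VIRIONS_PER_DAY, PER_SITE_PROBABILITY, k)
    print(f"{k} simultaneous mutation(s): about {n:.0e} new virions per day")
```

Under these assumptions, a specific single mutant arises on the order of 10^5 times per day, a specific double mutant about once per day, and a specific triple mutant only about once in 10^5 days. This is the qualitative reason why a regimen that demands several concomitant mutations is much harder for the virus to escape than monotherapy, provided that replication is kept suppressed.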

1.5.5 HIV Drug Resistance: The Bottleneck

HIV drug resistance is the ability of the virus to withstand the inhibitory effects of a given anti-HIV drug and thus continue to reproduce in its presence [82].

Although access to highly active antiretroviral therapy (HAART) has reduced mortality, the high replication rate combined with the high mutability of HIV leads to the emergence of drug-resistant strains, which undermines efforts to stop the AIDS pandemic [83]. About 10⁹ to 10¹⁰ virions are produced every day in an untreated HIV-infected individual, and it has been estimated that each possible single-point mutation arises 10⁴ to 10⁵ times per day in this population [84]. While some mutations result in function-impaired viruses, others lead to drug-resistant forms.

Drug resistance can be considered a natural evolutionary response to escape from drug pressure. In the general case, the evolutionary responses can, for example, be decreased drug influx or increased drug efflux, development of alternate pathways to bypass the inhibited reaction, increased production of drug-sensitive enzymes, increased amounts of competing substrates, or decreased requirement for the product of the inhibited reaction. However, the most common mechanism of resistance in infectious organisms is mutation of their genes that alters the respective drug target macromolecule and consequently its interaction with the drug [85-89].

HIV-1 develops resistance to NRTIs by two distinct mechanisms. The first is by mutations that primarily cause steric hindrance, decreasing the rate of incorporation of the NRTI into the elongating DNA chain. The other is by mutations that increase phosphorolysis, leading to removal of already incorporated chain-terminating inhibitors from the DNA and thereby allowing reverse transcription to continue [83].

HIV-1 resistance to NNRTIs involves prevention of the drug binding to its binding pocket near the HIV-1 RT active site, thus allowing DNA polymerization to proceed normally [83].

HIV-1 resistance to PIs involves mutations in the HIV-1 protease which reduce the binding affinity of PIs to the enzyme, e.g. by modifying the points of interaction of the drug with the enzyme [90].

When HIV develops resistance to the first-line therapy, a new regimen has to be selected from multiple possible alternative drug combinations. Since anti-HIV drugs that act at the same target (and binding site) are rather similar in their molecular properties, cross-resistance is frequent, and a new regimen cannot be based on the assumption that the virus will be susceptible to the drugs remaining in the therapeutic arsenal [83]. Therefore resistance testing has become an important tool in the management of HIV/AIDS. There are three types of such resistance tests. Two of these are phenotypic resistance testing (PRT) and genotypic resistance testing (GRT). In PRT, the viral activity is measured in the absence and presence of varying concentrations of ARV medication.
In GRT, the viral drug target coding genes are sequenced and examined for resistance-related mutations. PRT is comparatively easy to interpret but difficult to perform. GRT is much faster and more cost-effective than PRT. However, sequence data provide only indirect evidence of resistance, and interpretation of the results is difficult for complex mutational combinations. In an attempt to overcome the complexity of GRT interpretation, a third approach has recently been introduced, so-called virtual phenotypic testing (VPT). VPT combines GRT and PRT: first a GRT is performed, and the data obtained are then subjected to computational/statistical models that correlate the GRT data to phenotype, using large datasets containing genotype and phenotype information from individual viral isolates [91].

Several types of statistical and machine learning modelling techniques have been used for finding relationships between mutations in the HIV genome and phenotypic drug susceptibility. These include artificial neural networks [92,93], support vector regression [92,94], least-squares regression, decision trees [92], and non-linear regression methods [95]. However, models created by these methods have hitherto been developed for a single drug at a time, and do not allow conclusions to be drawn for any drug other than the one for which the model was created.

Similarities and differences in the interaction of drugs with susceptible or resistant drug targets are determined by the similarities and differences in structural and physicochemical properties of the drugs as well as the targets. If a model for a drug is based only on the mutation pattern in the targets, it will not be able to generalise towards another drug; it will be valid only for the drug on which it was trained. Such a model is thus not able to explain cross-resistance and is unable to extrapolate to new anti-HIV agents. Further, in some previous modelling approaches, mutations were represented by binary indicator variables for the amino acid letter codes, rather than by quantitative molecular property descriptors of the sequence residues that are relevant for drug-protein interactions (e.g. amino acid size and shape, charge, hydrophobicity, solubility, etc.). Therefore, these models have limited ability to generalise and to predict the effect of less common mutations on interactions between the target and the drug.

One of the aims of this thesis is to present the application of an advanced and more generalised modelling approach, termed proteochemometrics (PCM), to the analysis of HIV-1 resistance to NRTIs (see 1.2.5 & papers I & VI).
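The difference between the two ways of representing a mutated residue can be made concrete with a small sketch. It is an illustration only and does not reproduce the descriptors or models used in the papers: the Kyte-Doolittle hydropathy index and the formal side-chain charge near neutral pH stand in for "quantitative molecular property descriptors", the two-element drug descriptor vector is invented, and all function names are hypothetical. The last function shows one simple way of combining drug and target descriptors into cross-terms, so that a single model can be trained over several drugs and several target variants.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Stand-in property descriptors (illustrative choice, not the thesis's own):
# Kyte-Doolittle hydropathy index and formal side-chain charge near pH 7.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def indicator(residue):
    """Binary indicator (one-hot) encoding of a residue letter code."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def properties(residue):
    """Encoding of a residue by quantitative property descriptors."""
    return [HYDROPATHY[residue], float(CHARGE.get(residue, 0))]


def distance(u, v):
    """Euclidean distance between two descriptor vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def cross_terms(drug_descriptors, target_descriptors):
    """All pairwise products of drug and target descriptors."""
    return [d * t for d, t in product(drug_descriptors, target_descriptors)]


# Indicator variables place every pair of different residues equally far
# apart; property descriptors place similar residues (V, I) close together
# and dissimilar residues (V, D) far apart.
for a, b in [("V", "I"), ("V", "D")]:
    print(a, b,
          round(distance(indicator(a), indicator(b)), 2),
          round(distance(properties(a), properties(b)), 2))

hypothetical_drug = [0.7, -1.2]   # invented drug descriptors
row = (hypothetical_drug + properties("V")
       + cross_terms(hypothetical_drug, properties("V")))
print(len(row), "features in one drug-target row")
```

With indicator variables, a model has no basis for predicting the effect of a substitution it has never seen, since every unseen residue is equally unlike all others. With property descriptors, an unseen residue lies close to chemically similar residues that the model has seen, which is what allows some extrapolation to less common mutations.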

1.6 Flaviviruses

The Flavivirus genus (family Flaviviridae) comprises over 70 viruses, including dengue virus (DEN), Japanese encephalitis virus (JEV), West Nile virus (WNV), tick-borne encephalitis virus (TBEV), and yellow fever virus (YFV), which are all primarily transmitted by arthropods. These flaviviruses share a similar genomic organisation and replication strategy, yet cause a broad spectrum of distinct clinical diseases, ranging from mild fever and malaise to fatal haemorrhagic fever, encephalitis and severe inflammation of the liver. They can be broadly grouped into viruses that have the capacity to cause vascular leakage and haemorrhage, such as DEN, and those that can cause encephalitis, including JEV, WNV and TBEV.

The flaviviruses (class IV in the Baltimore Classification [46]), as defined in the eighth report of the International Committee on Taxonomy of Viruses (ICTV), are viruses that contain a positive-polarity single-stranded RNA genome and an icosahedral nucleocapsid, have a diameter of about 40–65 nm, and are surrounded by a lipoprotein envelope [96].

1.6.1 Dengue

Dengue is considered the most important arthropod-borne viral infection of humans [97]. The origin of the word ‘dengue’ is unknown, but some believe that it is derived from the Swahili phrase ‘Ka-Dinga Pepo’, meaning ‘cramp-like seizure caused by an evil spirit’, which was the term used for a dengue-like illness during 1823 in Zanzibar and on the East African coast [98]. Dengue is caused by the dengue virus (DEN), a mosquito-borne positive single-stranded RNA virus of the genus Flavivirus, family Flaviviridae.

There are four serotypes, DEN1–4, characterised by genetic and immunological differences [99]. Infection induces life-long protection against the infecting serotype, but gives only short-term (2–3 months) cross-protective immunity against the other types.

Prevalence and Transmission

Dengue is the most rapidly spreading mosquito-borne disease in the world today, with a 30-fold increase over the last 50 years [97]. WHO estimates that globally, 2.5 billion people (two-fifths of the world’s population) are at risk, and that annually, 50–100 million people become infected and 0.5 million people with severe dengue require hospitalization [100].

The disease is endemic in more than 100 countries of the tropical and subtropical regions (Figure 9), and dengue haemorrhagic fever (DHF) has occurred in more than 60 of these countries.

[Figure 9 map labels: January isotherm (10 °C); July isotherm (10 °C); areas at risk of dengue transmission.]

Figure 9. Areas where dengue has been reported, as of 2008. The contour lines of the January and July isotherms indicate the potential limits in the northern and southern hemispheres for year-round survival of the dengue vector Aedes aegypti [97]. (Adapted with permission from WHO).

The factors contributing to the increasing incidence of dengue include growing urban populations, increased movement of infected persons, spread of mosquitoes, resistance of mosquitoes to insecticides, and global warming.

The dengue viruses are maintained in a cycle that involves humans as the primary amplifying hosts and mosquitoes of the genus Aedes, most commonly Aedes aegypti, as the primary vectors (Figure 10). The virus enters the female mosquito when she feeds on the blood of an infected individual during viremia, which develops about 4 days after infection and lasts for about 5–12 days [101].

[Figure 10 labels: urban cycle; enzootic cycle; vector Aedes aegypti.]

Figure 10. Dengue virus transmission cycles: urban epidemic and enzootic (modified from [102] with permission from the journal).

Once in the mosquito, the virus replicates during an extrinsic incubation period of 8–12 days. The mosquito then remains able to infect susceptible individuals for the rest of its life, which amounts to around 3–4 weeks. The mosquitoes may also pass the virus to their offspring by trans-ovarian transmission, i.e. via their eggs. The mosquito eggs can survive without water for several months, which poses a challenge to vector control efforts. Some dengue ecology studies in sylvatic habitats in West Africa [103] and Malaysia [104] have identified an enzootic cycle involving non-human primates as reservoir hosts and tree-hole dwelling Aedes spp. as vectors (Figure 10) [102].

When the mosquito (female Aedes aegypti) ingests blood containing the virus from a dengue-infected viremic person, the virus enters and replicates in the mosquito’s midgut, ovaries, nerve tissues, fat body, and finally in the salivary glands. When the mosquito then bites a non-infected individual, the virus enters his or her body with the mosquito’s saliva, and the cycle continues (Figure 11).


Figure 11. Dengue transmission phases [105]. (Adapted with permission from WHO).

Progression and Symptoms

After the virus enters the human body, it localises and replicates in various target organs such as the liver and lymph nodes. It then starts spreading through the blood and infects white blood cells and other lymphatic tissues, and circulates in the blood for about five days. During this viremic phase, the virus can be transmitted to feeding mosquitoes.

The dengue infection causes a broad spectrum of symptoms, ranging from mild undifferentiated fever and dengue fever (DF) to the life-threatening dengue haemorrhagic fever (DHF) with plasma leakage that may lead to hypovolaemic shock, called dengue shock syndrome (DSS). The symptoms appear after an incubation period of 3–14 days (most commonly 4–7 days), shortly after the onset of viremia, and they may last for 3–10 days (Figure 11). Dengue fever in its classic form is a non-fatal illness, characterised by fever, severe headache, muscle and joint pain, rash, nausea and vomiting, that lasts for about a week. The potentially deadly severe form, dengue haemorrhagic fever, is characterised by high continuous fever that usually lasts for 2–7 days and can rise as high as 40–41 °C, possibly with febrile convulsions, and by a haemorrhagic condition with liver enlargement and circulatory failure due to massive plasma leakage from vascular capillaries. Many patients recover spontaneously after a short period of fluid and electrolyte therapy. In severe cases, the patient’s condition suddenly deteriorates. After 2–7 days of fever, the body temperature drops, followed by signs of circulatory failure with rapid and weak pulse, hypotension, and cold and clammy skin, and the patient may reach a critical state of shock (DSS). Death may occur within 12–24 hours if appropriate supportive treatment is not provided [106,107].

The pathogenesis of DHF is not well defined, though most cases of DHF are attributed to immune-pathologic mechanisms [108]. Further, it has been shown that secondary infection with a different dengue serotype can lead to severe disease (DHF and DSS), and that anti-DEN antibodies passively transferred from mother to infant increase the risk of DHF and DSS in infants for a certain period of time. The proposed mechanism for this phenomenon is antibody-dependent enhancement (ADE) [109,110].

ADE can be defined as the process where non-neutralising cross-reactive antibodies (raised during a primary infection, or acquired passively at birth) bind to epitopes on the surface of a heterologous infecting virus and facilitate virus entry into Fc-receptor-bearing cells such as monocytes or macrophages, and thus increase virus replication and produce massive infection [97]. The rapid increase in the levels of inflammatory cytokines and chemical mediators (caused by activation of CD4+ and CD8+ T-lymphocytes and lysis of infected monocytes by cytotoxic T-cells) induces plasma leakage from vascular endothelial cells and causes DHF/DSS [108].

Dengue Vaccine Development

Four types of dengue vaccines are in various stages of development (Phase I and II trials). These are based on live attenuated viruses, chimeric live attenuated viruses, inactivated viruses, and nucleic acids.

There are several challenges to dengue vaccine development. First, the vaccine must be tetravalent, because there are four serotypes responsible for dengue pathogenesis. Second, the mechanism of protective immunity has not been fully elucidated. Third, there is no suitable animal model that recapitulates human disease and can be used to evaluate candidate vaccines. Finally, there is the potential for immune-pathogenesis, including antibody-dependent enhancement (ADE), which poses a danger that the vaccine could cause severe disease (DHF or DSS) in vaccine recipients if solid immunity is not established against all four serotypes [97].

Despite the significant progress in dengue vaccine development research, there is still a long way to go and several hurdles to cross. Therefore, along with vaccine development, there is a dire need to search for dengue antivirals.

1.6.2 Japanese encephalitis (JE)

Japanese encephalitis (JE) is an acute encephalitis that can progress to paralysis, seizures, coma and death. JE is caused by Japanese encephalitis virus (JEV), a mosquito-borne single-stranded positive-sense RNA virus of the genus Flavivirus, family Flaviviridae. There are four major genotypes of JEV, specifically G-I, G-II, G-III and G-IV, classified on the basis of nucleotide differences that amount to more than 12%. All genotypes can cause encephalitis and paralysis, but they differ in virulence owing to this genetic variation. For unknown reasons, G-III is the most virulent genotype.

Prevalence and Transmission

The first cases of this disease were recorded in the 1870s in Japan, hence the name Japanese encephalitis. The prototype Nakayama strain of JEV was first isolated from the brain of a fatal case in 1934 [111] and from the Culex tritaeniorhynchus mosquito in 1938 [112]. JE is endemic in many countries in Asia, Africa and South America and it appears sporadically in Australia and the western Pacific [113] (Figure 12).

Figure 12. Distribution of Japanese encephalitis in Asia. The dark areas in the map indicate the countries of a large portion of Asia and the western Pacific region in which Japanese encephalitis is prevalent [113].

The worldwide annual number of JE cases is about 68 000, of which approximately 20–30% are fatal, and 30–50% of survivors develop significant neurologic sequelae [114,115]. However, the true annual incidence of encephalitis cases is estimated to be closer to 175 000 if all cases were properly reported [116].

Pigs and wild birds are the amplifying hosts, and Culex tritaeniorhynchus is the primary vector [117]. Pigs are necessary for pre-epizootic amplification of the virus [118]. Humans and horses can develop fatal encephalitis, but they are only incidentally infected and are dead-end hosts for the virus, as mosquitoes do not acquire the infection from JE patients, essentially due to the low and short-lived viremia (Figure 13).

The female Culex tritaeniorhynchus mosquitoes become infected by feeding on viremic pigs and, after an extrinsic incubation period (7–14 days), transmit the virus to other susceptible animals and humans.

[Figure 13 labels: enzootic/rural epizootic cycle; vertebrate reservoir; vector; dead-end hosts.]

Figure 13. The cyclical pattern of JEV transmission between vertebrate reservoir (pigs, birds), vectors (mosquitoes) and dead-end hosts (humans, horses). (Modified from [102] with permission from the journal).

Progression and Symptoms

After the virus enters the human body from an infected mosquito, it replicates peripherally and causes a transient viremia before invading the central nervous system. The most likely mechanism by which JEV crosses the blood-brain barrier and invades the nervous system is considered to be passive transfer [119].

The symptoms of infection, which appear after an incubation period of 6–14 days, are non-specific and include fever of about 38 °C, chills, muscle pain, and meningitis-type headaches with vomiting. In children, gastrointestinal symptoms such as nausea, vomiting and abdominal pain appear first. The symptoms last for 2–4 days, and then the condition of the patient deteriorates rapidly, with a progressive decline in alertness eventually resulting in coma [120]. The majority of patients have painful neck stiffness and convulsions. Motor paralysis, tremor, rigidity, and abnormal movements are also present in some patients [121]. Respiratory distress and opisthotonos (i.e. tetanic spasm of back muscles with forward arching of the spine) indicate infection of the brain. Fatality is associated with acute cerebral oedema or severe respiratory distress from pulmonary oedema.

Vaccination

Currently there is no antiviral therapy for JEV or any other flaviviral infection, and so far the main strategy to control the disease is prevention, such as vaccination and avoiding mosquito bites [122-124]. Since the zoonosis is endemic in large parts of Asia, it is not likely ever to be eradicated. For prevention, human vaccination is considered the most reliable method. There are two vaccines licensed by the FDA, as follows:

1. Mouse brain-derived, formalin-inactivated vaccine (JE-MB): This vaccine was developed from the prototype Nakayama strain by the Biken Institute in Japan [125], and it has caused a substantial reduction in JE incidence in several countries [126,127]. It was approved in 1992, but due to certain allergic and neurologic reactions [128] its manufacture was terminated by the Japanese government in 2006, and it is therefore no longer available.
2. Inactivated Vero cell culture-derived vaccine (JE-VC): This vaccine was approved by the FDA in 2009 for use in persons 17 years of age and older. The safety of JE-VC in children is under investigation, and hence the vaccine has not been approved for use in children less than 17 years of age [129].

Other JE vaccines also exist, such as the live-attenuated SA14-14-2 JE vaccine; they are manufactured and used in Asia, but are not approved in the United States and Europe [130-132].

Although improvements in JEV vaccination coverage have reduced the JE incidence, about 55 000 annual cases (more than 80% of the total) still occur in areas with well established or developing JE vaccination programs [133]. Therefore, therapeutic agents for JE would be of great importance.

1.6.3 Life Cycle of Flavivirus

For most flaviviruses, the natural reservoirs are mainly birds, rodents and non-human wild primates. Humans and domestic mammals are considered to be mostly dead-end hosts, because the viral load in the blood remains low and transmission from one human host to another is ineffective. One exception is DEN, for which humans can become a reservoir host because of the high viral load in the blood, so that further transmission of the virus can occur [134].

The arthropod vector (mosquito or tick) is infected when taking blood from the animal reservoir and then introduces the virus into a host during a subsequent blood meal. The viruses enter the target cells (e.g. dendritic cells) by receptor-mediated endocytosis and reach the endosomes. In the endosomes, the acidic environment triggers conformational changes in the viral envelope glycoprotein (E) that induce fusion of the viral and host cell membranes [135,136] (Figure 14).


Figure 14. Schematic representation of the flavivirus replication cycle. 1. Virus-host cell binding; 2. Receptor-mediated endocytosis; 3. Membrane fusion and uncoating; 4. Translation; 5. Polyprotein processing (5A. non-structural proteins, e.g. viral helicase and RdRp; 5B. structural proteins, e.g. prM and capsid); 6. RNA synthesis; 7. Encapsidation and budding of virions in the ER; 8. Transport through the trans-Golgi network; 9. Maturation; 10. Release of mature virions by exocytosis [137].

Like a cellular mRNA, the released genome encodes a single open reading frame (ORF), which is translated into a single precursor polyprotein 3400 amino acids long. The polyprotein is cleaved by the viral NS2B-NS3 protease and host signalases to produce three structural proteins (C, prM, and E), which constitute the viral particle, and seven non-structural (NS) proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B, NS5) (Figure 15), which are involved in viral RNA replication, virus assembly, and modulation of the host cell responses [137].

Replication occurs in modified membrane structures derived from the endoplasmic reticulum (ER). After translation of the input genomic RNA, a non-capped negative-strand RNA is copied from the genomic RNA by the NS5 RNA-dependent RNA polymerase (NS5 RdRp). This negative-strand RNA serves as a template for the synthesis of new positive-strand viral RNA [138]. Capped genomic RNAs are generated that provide a template for translation into the precursor polypeptide and for packaging into virions. An immature, non-infectious flavivirus virion is formed in the ER by the assembly of a single copy of viral positive-strand genomic RNA and the structural proteins C, prM and E: the complex of viral RNA and protein C is packaged into an ER-derived lipid bilayer containing heterodimers of the prM and E proteins [139,140]. Mature infectious viral particles are produced by structural rearrangements, such as cleavage of prM into M by a host cell protease in the low pH of the trans-Golgi network, and are released into the extracellular medium by exocytosis, which often results in cell lysis and cell death [141-143].

1.6.4 Flaviviral Drug Targets: NS2B-NS3 Protease

In principle, every step in a viral life cycle can host a potential drug target. However, the molecular events in the morphogenesis of flaviviruses such as DEN and JEV are poorly characterised, which makes the design of inhibitors a daunting task. Conversely, viral proteases have long been a focus of drug discovery and development for other viruses, with several success stories, such as the development of clinically useful HIV protease inhibitors. However, as yet no effective antivirals are available for the treatment of flavivirus infection in humans, and the identification and characterisation of novel drug targets is critical to developing new types of anti-flaviviral drugs.

The Flaviviral NS3 Protein: The flaviviral NS3 protein is a multifunctional enzyme that contains protease, helicase and RNA triphosphatase (RTPase) activities. The N-terminal domain of ~180 amino acids of NS3 conveys the protease activity and is called NS3pro. The two C-terminal domains consist of 450 amino acids that convey the RNA helicase and RTPase activities [144-146].

NS3pro: NS3pro is a protease domain that exists in a complex with an essential cofactor, NS2B, which is also encoded by the viral genome. The complex is crucial for the cleavage of the viral precursor polyprotein (Figure 15) and consequently for viral replication [147,148].

[Figure 15: polyprotein schematic, NH2-C-prM-E-NS1-NS2A-NS2B-NS3-NS4A-NS4B-NS5-COOH, divided into structural (C, prM, E) and non-structural (NS1 to NS5) regions, with protein functions annotated.]

Figure 15. Schematic diagram of the flavivirus polyprotein, cleaved by the virus-encoded NS2B-NS3pro complex (black down-pointing triangles) and by host-encoded proteases (grey up-pointing triangles).

This makes the NS2B-NS3pro an attractive target for anti-flaviviral inhibitor development. The NS2B-NS3 proteases of the Flavivirus genus are now collectively termed flavivirin (EC 3.4.21.91) [149].

Deletion analysis of DEN NS2B indicates that the central hydrophilic domain of NS2B, NS2B(H), is sufficient as a cofactor for activation of NS3pro [150-152]. In addition, proteolytic activity of WNV NS3pro in complex with the NS2B hydrophilic region of WNV (residues 50-97) has been demonstrated using recombinant enzyme expressed in Escherichia coli (E. coli) [153].

It is important to establish the structural and functional determinants of flaviviral NS2B-NS3pro in order to understand their roles and to facilitate the design of inhibitors. Site-directed mutagenesis studies of NS2B and NS3pro have provided useful insights [154,155]. The high level of sequence conservation and similarity among the flaviviral proteases suggests that specificity characteristics found for one flavivirus protease could also be relevant for other closely related flaviviral NS3 proteases. For example, NS2B-NS3pro requires a dibasic recognition sequence (P1-Arg, P2-Lys) for substrate cleavage, which is conserved throughout the flaviviruses [156]. However, the NS3 proteases from different flaviviruses, such as DEN and WNV, exhibit different substrate specificities, suggesting that their respective active sites are conformationally distinct [157]. Active site mutagenesis and SAR studies are useful for gaining further insight.

Some early site-directed mutagenesis studies of ultra-conserved residues among the flaviviral proteases identified His51, Asp75 and Ser135 as the residues putatively involved in catalysis or substrate binding [158,159]. However, in these studies the protease variants were assayed by polyacrylamide gel electrophoresis (PAGE) to reveal auto-proteolytic cleavage of the NS2B-NS3 precursor in vivo. Although this approach yielded semi-quantitative data on the activity of the mutant enzymes, it did not provide precise numerical kinetic parameters for the enzyme mutants with substrates supplied for trans-cleavage reactions. Furthermore, a number of important residues, such as L115, S163 and I165, were not included; possible roles of these residues in enzyme activity were later indicated by structural experiments [160,161]. Therefore, one of the studies of the present thesis is directed at site-directed mutagenesis of selected positions in the active site of dengue NS2B(H)-NS3pro to determine their roles in the catalytic efficiency of the dengue virus NS3 protease [paper II].

The functional profiles and substrate specificity patterns of the NS2B-NS3 proteases of some flaviviruses, such as DEN2 and the related WNV, have been studied recently [162-166] [paper II], and efforts have focused on design strategies for inhibitors [papers IV & V].

The JEV NS2B-NS3pro is a hitherto neglected potential drug target that needs to be characterised structurally and mechanistically. The cloning, expression, purification, biochemical characterisation, substrate specificity and inhibition studies of JEV NS2B(H)-NS3pro [paper III] were therefore among the objectives of this thesis.

1.6.5 Drug Development for Flaviviral NS2B-NS3 Proteases

Despite the tremendous success in the development of inhibitors for HIV (AIDS) and other viral proteases, the potential of dengue and related flaviviral proteases as drug targets remains to be exploited [167].

One of the strategies applied in the discovery of anti-flavivirals has been HTS of non-peptidic small-molecule libraries. Several such studies have been reported, but very few small-molecule inhibitors have been identified so far using this technique [168-174].

A more popular strategy has been the design of peptide-based substrate-analogue inhibitors. In one such study, synthetic peptides mimicking the uncleavable transition state isosteres of the P6-P2 residues of the native polyprotein cleavage sites were tested. A peptide with an α-keto amide group in place of the cleavable amide bond inhibited the DEN NS3pro competitively in the micromolar range, and another NS3/NS4A-derived peptide with a C-terminal aldehyde group in place of P1-P2 (Ac-FAAGRR-CHO) demonstrated competitive inhibition of NS3pro with a Ki of 16 μM [152]. In another study, hexapeptides derived from the cleavage sites of DEN NS3 protease showed low micromolar inhibitory activity; the best one, Ac-RTSKKR-CHO, was based on the NS2A/NS2B cleavage site and showed a Ki of 12 μM [175]. In a similar study, small peptides resembling the non-prime amino acid sequences of the natural substrates, with different electrophilic, aldehydic, ketonic or acidic warheads added, were designed and screened. Boronic acid and trifluoromethyl ketone derivatives were found to be potent competitive inhibitors of DEN2 NS2B-NS3 protease (Bz-nKRR-B(OH)2, Ki: 0.04 μM; Bz-nKRR-CF3, Ki: 0.9 μM) [176]. Recently, a series of N-terminal-capped tripeptide aldehyde inhibitors was reported, showing inhibitory activity against DEN2 and WNV protease in the micromolar and submicromolar IC50 ranges, respectively [177]. However, because of the basic nature of the non-prime residues of the natural substrate (arginine, lysine) and the large size of the peptides, these highly charged peptide-based substrate-analogue inhibitors are presumably not cell permeable and are therefore of little value as leads for further drug discovery.

X-ray crystal structures of the target protein and of the target-ligand complex provide a basis for structure-based inhibitor discovery and help accelerate the discovery process. Although the earlier crystal structures of DEN NS3pro alone and in complex with the Bowman-Birk inhibitor did not represent the active form of the enzyme, they were successfully used in structure-based discovery of small-molecule inhibitors [168,169]. Recently, the active NS2B-NS3 protease was crystallized [161,178], and it was found that although the protease catalytic triad (His51, Asp75, Ser135) was located similarly in the NS3 and NS2B-NS3 crystal structures, large conformational differences between the NS3pro and NS2B-NS3pro structures were evident. For example, the S1 site in the substrate-binding region formed a deep pocket in NS3pro, while in NS2B-NS3pro it formed a shallow depression. These differences are expected to affect ligand binding at the active site and must be accounted for in structure-based drug discovery [179].

The increasing availability of molecular structural and process information and the existence of powerful computer modelling approaches accelerate rational drug discovery. Several computational approaches are in use, including computer-aided structure-based design and virtual screening, with notable success stories in drug discovery [180-182]. As already reviewed above, proteochemometric modelling (PCM) is a novel tool that can be used to correlate structural and physicochemical properties of targets, such as viral enzymes (HIV, DEN), and of ligands (substrates and inhibitors) with functional responses to create predictive PCM models. The results already achieved by this approach suggest that PCM models are valid and useful and that they can be used for accelerated design of novel antivirals [36,183], as well as in the analysis of viral resistance to viral enzyme inhibitors [27,35,184,185] [paper I].

Intensive efforts and sustained multidisciplinary research are required in the future to explore the scope and exploit the potential of viral enzymes as drug targets for inhibitor design, and thus to cope with the growing threat to human health posed by HIV, DEN, JEV and other viruses.

2 Aims

The overall aim of this thesis was to study retroviral reverse transcriptase (HIV-1RT) and flaviviral protease (NS2B-NS3pro) enzymes as drug targets, with the main focus on the major challenges faced by enzyme inhibitors in combating retro- and flaviviruses. In vitro (kinetic) and in silico (computational) approaches were applied to characterise the inhibition of HIV-1RT and the DEN/JEV NS2B-NS3pro enzymes.

The specific aims were:

1. To analyse the contribution of different mutations in the active site of HIV-1RT for the susceptibility of HIV to NRTIs, aiming for a computer model with potential use in informed anti-HIV combination therapy.

2. To investigate the effect of specific amino acid substitutions in the DEN NS2B(H)-NS3pro active site on the activity of the enzyme, in order to achieve a better understanding of the structural determinants of enzymatic activity of the dengue virus NS2B(H)-NS3 protease.

3. To present a simple and efficient procedure to clone, express and purify JEV NS2B(H)-NS3pro, and to characterise its substrate specificity and inhibition kinetics.

4. To analyse new NS2B(H)-NS3 protease peptide inhibitors, providing insights into how such compounds can be designed and optimised.

3 Results and Discussion

3.1 PCM analysis of HIV mutant susceptibility to NRTIs (paper I)

HIV-RT is the sole enzyme producing double-stranded DNA from the single-stranded RNA genome of HIV. It hence participates in an essential step in retroviral replication [66-68], and it is a major target for approved antiretrovirals, namely the NRTIs and NNRTIs.

Although combination therapy (HAART) greatly increases the life expectancy of people with HIV, the virus still mutates the genes encoding the drug targets, which creates resistant strains and allows HIV to escape antiretroviral therapy.

When first-line therapy fails, the complex mutational patterns make it difficult to select a new regimen from multiple alternative drug combinations. Computational models that can predict HIV drug susceptibilities from virus genotype and drug molecular properties might be a useful remedy. Therefore, the main aim of paper I was to model the susceptibility of mutated HIV-1RT variants to the clinically used NRTIs. The approach presented might find use in genome-based optimisation of HIV therapy.

The data set comprised 728 HIV variants with unique RT sequences and their corresponding phenotypic assays for eight NRTIs. The molecular property descriptors of the NRTIs, the physicochemical properties (z-scales) of RT, and their cross-terms were correlated with the log-fold-decrease in susceptibility (logFDS) using PLS regression, which resulted in a highly valid predictive PCM model for HIV-1RT drug susceptibility. Validation of the model demonstrated its robustness, accuracy and predictive ability in modelling the susceptibility of the excluded RT variants and inhibitors.

There are previous models that address HIV-RT drug resistance; however, the resolution of the PCM model, as estimated in terms of Q²ext and RMSEP, was superior to that of the other hitherto best-performing approaches, which analysed the susceptibilities for each NRTI in separate models [92,93,186].

The model of paper I was used to analyse the role of individual mutations and mutation combinations in drug resistance, and to pinpoint the mutations that have specific effects on either single or multiple inhibitors (Figure 16).

A web server (www.hivdrc.org) was developed based on the PCM model to facilitate predictions of the drug susceptibility of HIV-RT variants bearing clinically known or hypothetical mutations. The server has the potential to guide the selection of inhibitors for combination therapy against particular RT variants, thus indicating how genome-based HIV therapy can be optimised [paper VI].
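The descriptor layout behind such a PCM model can be sketched as follows. This is a minimal illustration under stated assumptions, not the model of paper I: the helper names (`pcm_features`, `rmsep`, `q2_ext`) and all numbers are hypothetical, the PLS regression step itself is omitted, and Q² is computed with the common 1 - PRESS/SS definition, with SS taken around the mean of the external observations.

```python
from itertools import product
from math import sqrt


def pcm_features(ligand, protein):
    """Concatenate ligand descriptors, protein descriptors and their cross-terms.

    Cross-terms are the pairwise products of every ligand descriptor
    with every protein descriptor.
    """
    cross = [l * p for l, p in product(ligand, protein)]
    return list(ligand) + list(protein) + cross


def rmsep(observed, predicted):
    """Root mean square error of prediction on an external test set."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def q2_ext(observed, predicted):
    """External Q2 = 1 - PRESS / SS (SS around the mean of the observations)."""
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values, for illustration only
ligand = [0.5, -1.0]          # e.g. two scaled molecular properties of an NRTI
protein = [2.0, 0.0, -1.0]    # e.g. three z-scale values of an RT position
features = pcm_features(ligand, protein)

# Hypothetical observed and predicted logFDS values for three held-out variants
observed = [0.0, 1.0, 2.0]
predicted = [0.0, 1.0, 1.0]
```

With two ligand and three protein descriptors the feature vector has 2 + 3 + 2 × 3 = 11 entries; the cross-terms are what allow a single model to capture drug-specific effects of a mutation.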

[Figure 16: structure of HIV-1 RT with the p66 and p51 subunits labelled and the mutated positions 62, 65, 67, 69, 74, 75, 77, 151, 184, 208, 210, 215, 221 and 228 marked.]

Figure 16. Locations of major drug resistance mutations in the HIV-1 RT, identified by the PCM model.

3.2 Kinetics of DEN NS2B(H)-NS3pro mutants (paper II)

Being indispensable for flaviviral replication, the NS2B-NS3 proteases of human-pathogenic flaviviruses, such as DEN and WNV, have received substantial scientific attention as potential targets for the development of anti-flaviviral therapeutics.

Because of the unusually high substrate specificity of the NS2B-NS3 protease [147], an understanding of the interaction of the enzyme active site residues with the substrates is fundamental to protease inhibitor design.

The aim of paper II was to achieve a better understanding of the structural determinants of substrate specificity of the dengue virus NS2B(H)-NS3 protease. To this end, alanine substitutions were introduced at selected positions within the active site by site-directed mutagenesis. The recombinant enzyme mutants were over-expressed, purified and assayed enzymatically, using the model substrate Boc-GRR-amc [154].


Figure 17. Presentation of kinetic parameters for samples of NS2B(H)-NS3pro wild-type and mutant derivatives. Samples were assayed by using fluorescence emission from cleavage of the peptide substrate GRR-amc at 37 °C as described under Methods. The bar graph shows a comparison of numerical constants obtained for Km (panel A), kcat (panel B) and catalytic efficiency kcat/Km (panel C) for the wild-type protein NS2B(H)-NS3pro and the active site mutant proteins L115A, D129A, G133A, T134A, N152A, S163A and I165A. Samples of Y150A and G151A were inactive in the enzyme assay, and a 23-fold increase in enzyme concentration did not result in detectable activity. Data represent the mean of triplicate measurements and error bars are indicated.

All mutations except L115A decreased the activity of the protease, to different degrees, compared with the wild-type enzyme. The catalytic efficiencies of the NS3pro enzymes fall in the order: L115A > wild-type > T134A > S163A > G133A > D129A > N152A > I165A. The Y150A and G151A mutants displayed negligible activity under the conditions of the assay (Figure 17). Even a 23-fold increase in enzyme concentration over the amount used in the normal assays for the wild-type protein did not result in detectable activity, suggesting that these mutations inactivate the enzyme completely.

Previously, certain ultra-conserved residues among flaviviral proteases were found to be involved in substrate binding and catalysis of the dengue NS3 protease [158]. However, the activities of the mutant proteases were assayed by SDS-PAGE analysis of auto-proteolytic cleavage of the NS2B-NS3 precursor in vivo, and no precise numerical values for the kinetic activity of the mutant proteases were reported. Moreover, a number of residues, such as L115, S163 and I165, were not included in the previous investigation. To the best of my knowledge, paper II describes for the first time a kinetic analysis of mutations in the dengue virus NS3 protease by a trans-cleavage assay with a synthetic peptide model substrate. These structural requirements provide a basis for informed discovery and optimisation of selective inhibitors against the flaviviral NS3 proteases. The high level of sequence conservation within flaviviral proteases indicates that the findings may also be useful in designing protease inhibitors for other flaviviruses such as WNV and JEV.
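For reference, the three parameters compared in Figure 17 are related by the standard Michaelis-Menten rate law. The expression below is the textbook form, not reproduced from paper II; $[E]_0$ and $[S]$ denote the total enzyme and substrate concentrations and $v$ the initial rate.

```latex
v = \frac{k_{cat}\,[E]_0\,[S]}{K_m + [S]},
\qquad
v \approx \frac{k_{cat}}{K_m}\,[E]_0\,[S] \quad \text{for } [S] \ll K_m .
```

In the low-substrate limit the rate depends only on the ratio kcat/Km, which is why this ratio serves as the measure of catalytic efficiency when ranking the mutants.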

3.3 JEV NS2B(H)-NS3pro substrate specificity (paper III)

In contrast to related flaviviral proteases, the JEV NS2B-NS3 protease is structurally and mechanistically much less well characterised. Here we aimed to establish a straightforward procedure for the cloning, expression and purification of JEV NS2B(H)-NS3 protease and to characterise it biochemically for the first time.

The full-length NS2B-NS3 polyprotein region of JEV was obtained by time- and cost-efficient gene synthesis, and the synthetic gene was used as template for the generation of NS2B(H)-NS3pro by SOE-PCR. The recombinant protease, produced in E. coli as a soluble protein, was purified by metal chelate affinity chromatography to >95% purity. The kinetic parameters (Table 1, paper III) were obtained by fluorescence release from several small synthetic peptide substrates resembling the dibasic cleavage site sequences of the JEV polyprotein precursor.

The highest binding affinity and cleavage efficiency were observed for the PyrRTKR-amc substrate, which represents the native JEV NS2B/NS3 cleavage site at the P1 to P3 positions. The binding affinity for a substrate with leucine at P3 (Boc-LRR-amc) is 3-fold higher than for one with glycine at P3 (Boc-GRR-amc). However, the catalytic efficiency for the latter was about 10-fold higher than for Boc-LRR-amc, suggesting a significant contribution of the P3 position to the catalytic mechanism, as was also seen in an earlier study on the DEN NS3 protease [163]. The prominent contribution to catalytic efficiency of residues at the P3 and P4 positions (and possibly also of prime-side residues) is also reflected in the comparatively weak activity towards the internally quenched peptide Abz-(R)4SAGnY-NH2, a potent NS3pro substrate originally designed on the basis of the capsid protein sequence RRRRSAG of DEN2 [162]. Whereas the tetrabasic non-prime side sequence is strongly favoured by DEN NS3pro, the presence of Q-N at P4-P3 of the JEV capsid cleavage site apparently confers a large difference in specificity between the two enzymes. Moreover, it cannot be ruled out that the prime-side sequences (SAG in DEN and GGN in JEV) contribute substantially to the decrease in efficiency seen for this substrate peptide.

It can be concluded that the enzyme from JEV favours substrates with K-R at the P2-P1 positions, whereas the NS3 protease from DEN prefers arginine at these positions, suggesting a greater similarity of the JEV protease to that of WNV than to that of DEN [166]. This view is supported by amino acid sequence alignments, crystal structures of the enzymes from DEN and WNV, and structure-guided mutagenesis studies, which show that the functional determinants of activation and substrate recognition for JEV and WNV are more closely related to each other than to those of other flaviviral proteases. It is noteworthy that the sequence of the NS2B cofactor from JEV shows greater identity to NS2B from WNV (67.2%) than to NS2B from other mosquito-borne flaviviruses (28–34%) or from tick-borne flaviviruses (18–22%) [155].

The serine protease inhibitor aprotinin was tested on the JEV NS2B/NS3 protease and was found to inhibit it with surprisingly low potency (IC50: 4.13 ± 0.167 μM).

Using multiple sequence alignment of the NS2B-NS3 proteases of WNV, JEV, DEN and YFV, we identified four amino acid residues (positions 84 and 86 in NS2B and 132 and 155 in NS3pro) that differed among all four proteases. To investigate the role of these residues in the substrate binding and cleavage preferences of NS2B-NS3 proteases, a 3D homology model of the JEV protease was built. The model revealed that, in spite of the high degree of overall similarity of the 3D conformation of the JEV protease to that of the WNV NS2B-NS3pro complex [155,161,187], the change of N84, Q86, T132 and I155 in the WNV NS2B-NS3 protease to D84, H86, R132 and E155 in the JEV NS2B-NS3 protease leads to rearrangement of the possible H-bonds between the ligand and the S1, S3, and S4 pockets of the enzyme (paper III, Figure 9, panels B and C). Moreover, the presence of a basic amino acid in S1 and an acidic amino acid in S4 may influence both the binding pocket geometry and ligand-protease interactions. Similarly, an earlier study employing hybrid NS2B-NS3 constructs of DEN and JEV sequences showed that only a (DEN)NS2B-(JEV)NS3 protein could efficiently process the JEV polyprotein, whereas the (JEV)NS2B-(DEN)NS3 construct was inactive, supporting the notion that NS2B proteins of different origins modulate the structure and substrate affinity of the protease [155,188].

To sum up, in paper III we present for the first time an enzyme-kinetic analysis of a recombinant JEV protease using model substrate peptides. Our data demonstrate that the JEV protease differs markedly from other known flaviviral proteases. Our study may serve as an entry point to the development of efficient JEV inhibitors, e.g. by high-throughput screening.

3.4 DEN NS2B(H)-NS3pro inhibitor design & evaluation (papers IV & V)

In response to the global problem of flaviviral diseases, considerable efforts are being devoted to exploiting the potential of drug targets in flaviviruses, using different strategies such as:

1. HTS studies of non-peptidic small-molecule inhibitors, including charged [168,173,174] and uncharged [169-172] molecules. Although several such studies have been conducted for the DEN protease, very few active molecules have been found.

2. Substrate-analogue peptide derivatives. Several such peptides are effective, with Ki in the lower micromolar or even nanomolar range [176,189]. However, because of the basic nature of the non-prime residues of the natural substrate (arginine, lysine) and the large size of the peptides, these substrate-analogue inhibitors are not cell permeable.

The aim of papers IV and V was to design and screen series of peptides of varying size and residue composition in order to determine how these features contribute to NS3 protease inhibition.

In the first step (paper IV), a series of 45 peptides varying in the number (four to nine), type and relative position of their residues was designed by substitutions, deletions, extensions, and insertions, using the internally quenched dengue protease substrate Abz-(R)4SAGnY-NH2 as starting point (see paper IV, Table 1). The general structures of the peptides were Abz-(R)0/2/4-XXXX-nY-NH2, (R)0/1/2/4-XXXX-NH2, and Ac-(R)0/1/2/4-XXXX-NH2. The peptides were assayed for their inhibition of the DEN1–4 NS2B(H)-NS3 proteases using the fluorogenic internally quenched peptide Abz-(R)4SAGnY-NH2 as substrate. Several of the designed octapeptides showed inhibitory activities towards all four proteases in the low micromolar to submicromolar range. Systematic removal of the cationic non-prime side residues of the substrate and variations in the prime side sequence finally resulted in an uncharged tetrapeptide, WYCW-NH2, with Ki values of 4.2, 4.8, 24.4, and 11.2 μM against the proteases from DEN1–4, respectively. While the tetrapeptides HWCW-NH2, HLCW-NH2, and WYCG-NH2 were found to be inactive, they became active when the N-terminus was extended by one or several arginine residues; arginine is the most common residue at the P1 and P2 positions of the natural substrates of the NS2B-NS3 protease. While the tetrapeptide SRP22 (WYCG-NH2) was inactive, the peptide SRP12 (Abz-WYCG-nY-NH2) showed high inhibitory activity against three of the four proteases. A possible explanation is that the absence of steric hindrance from the neighbouring glycine (G) residue allows 3-nitrotyrosine (nY) to adopt a conformation that is favourable for interactions with the protease.

The peptides SRP25-SRP36 were designed and screened to investigate the influence of N-terminal acetylation on peptide activity. These peptides were synthesised in parallel with their non-acetylated analogues (SRP13-SRP24), and in most cases they showed similar or up to two-fold lower activity. However, in one case the activity of the acetylated peptide SRP27 was higher than that of its analogue SRP15 (6.3 μM versus 142.6 μM against DEN2 protease). Adding an N-terminal acetyl group decreased the activity of the most active peptide, WYCW-NH2 (Ac-WYCW-NH2, Ki: 18 μM for DEN2, versus 4.8 μM for the non-acetylated variant).
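The size of the effects quoted above is easier to compare as fold changes. The snippet below is a small sketch using only the values given in the text; the function name `fold_change` and the variable names are illustrative, and the 6.3 μM and 142.6 μM figures are treated as inhibition constants, which is an assumption.

```python
def fold_change(value_a, value_b):
    """Return how many times larger value_a is than value_b (same units)."""
    return value_a / value_b


# Ki values (in micromolar) of WYCW-NH2 against the four proteases, from the text
ki_wycw = {"DEN1": 4.2, "DEN2": 4.8, "DEN3": 24.4, "DEN4": 11.2}

# Spread of potency across serotypes: weakest Ki divided by strongest Ki
serotype_spread = fold_change(max(ki_wycw.values()), min(ki_wycw.values()))

# Effect of N-terminal acetylation against the DEN2 protease
acetylation_gain = fold_change(142.6, 6.3)  # SRP15 versus acetylated SRP27
acetylation_loss = fold_change(18.0, 4.8)   # Ac-WYCW-NH2 versus WYCW-NH2

print(f"serotype spread:  {serotype_spread:.1f}-fold")
print(f"SRP15 -> SRP27:   {acetylation_gain:.1f}-fold more potent")
print(f"WYCW -> Ac-WYCW:  {acetylation_loss:.2f}-fold less potent")
```

On these numbers acetylation improves one peptide by roughly 23-fold but weakens the lead tetrapeptide by about 4-fold, so its effect is sequence dependent.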

Paper V aimed to find improved peptides using the best peptide from paper IV (WYCW-NH2) as starting point. To this end, a targeted library of 25 tetrapeptides with the general structure XXXX-NH2 was designed with the aid of statistical molecular design (SMD). To overcome the cell permeability issues posed by the basic residues (R, K) used in classical substrate analogues, neutral residues (W, F, Y, C, L, S, T, A and N) were used instead. The peptide library was custom-synthesised by Sigma (Singapore).

The active target DEN2 NS2B(H)-NS3 protease was purified and renatured from inclusion bodies expressed in E. coli. The kinetic parameters Km, kcat and kcat/Km were 100 ± 8.6 μM, 0.112 ± 0.003 s⁻¹ and 1125 ± 66.78 M⁻¹ s⁻¹, respectively, for Ac-nKRR-amc.

The library was screened biochemically for inhibition of the DEN2 NS2B(H)-NS3 protease. Of the 25 peptides, four showed significant enzyme inhibition, with inhibition constants (Ki) of 10.84 ± 0.98 μM, 11.2 ± 0.86 μM, 11.3 ± 1.05 μM, and 21.5 ± 1.97 μM for FFCW-NH2, WFCW-NH2, WWCW-NH2, and WYCF-NH2, respectively.

The results show that both the type and the relative position of all four residues in the peptide are important for inhibitory activity. The only amino acid tolerated in the third position of the tetrapeptides is cysteine. Its replacement with residues such as alanine (A) and asparagine (N) resulted in complete loss of inhibitory activity. Further, even physico-chemically seemingly small modifications, such as replacement of cysteine (C) with serine (S) or threonine (T), resulted in loss of activity.

This study illustrates and encourages the development of improved peptide inhibitors for dengue protease based on a selection of residues not restricted to the classical residues of the natural cleavage sites, such as arginine and lysine. The findings may serve as a starting point for designing more potent, lower molecular weight, drug-like inhibitors using natural amino acids and their synthetic derivatives.
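The kinetic parameters reported above for Ac-nKRR-amc can be cross-checked against each other, since the catalytic efficiency is simply kcat divided by Km once Km is expressed in molar units. The following is a minimal arithmetic sketch; the variable names are illustrative, and only the values quoted in the text are used.

```python
# Values quoted in the text for the DEN2 NS2B(H)-NS3 protease with Ac-nKRR-amc
km_molar = 100e-6          # Km = 100 micromolar, converted to mol/L
kcat_per_s = 0.112         # kcat in s^-1
reported_efficiency = 1125.0   # kcat/Km in M^-1 s^-1, as reported
reported_error = 66.78         # reported uncertainty in M^-1 s^-1

# Catalytic efficiency recomputed from the two individual parameters
efficiency = kcat_per_s / km_molar

# The recomputed value should agree with the reported one within its error
consistent = abs(efficiency - reported_efficiency) <= reported_error

print(f"kcat/Km = {efficiency:.0f} M^-1 s^-1, consistent with report: {consistent}")
```

The recomputed value of about 1120 M⁻¹ s⁻¹ lies well within the stated uncertainty of the reported 1125 ± 66.78 M⁻¹ s⁻¹.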

4 Conclusions and Perspectives

The work in this thesis covered different in silico and in vitro methods to investigate the influence of structural variations in viral targets (RT, NS3pro), viral enzyme substrates (fluorogenic peptides) and viral enzyme inhibitors (peptides, organic compounds) on their functions, with the aim of promoting drug discovery and drug therapy. The work is summarised as follows:

1. Role of HIV-1RT mutations in drug susceptibility: The generalised PCM model created here is capable of predicting HIV susceptibility to NRTIs for existing and, to some extent, for novel, never-before-seen mutations in RT. The web server (hivdrc.org) developed on the basis of the model facilitates predictions of the drug susceptibility of HIV-RT variants, and has the potential to guide the selection of inhibitors for combination therapy against particular RT variants, thus indicating how genome-based HIV therapy can be optimised.

2. Effects of DEN-NS3pro mutations on enzyme activity: The study encompassed mutations in the DEN NS3 protease of importance for the substrate cleavage activity of the enzyme. The structural requirements of DEN NS3pro presented may aid informed discovery and optimisation of selective inhibitors against the flaviviral NS3 proteases. The high level of sequence conservation among flaviviral proteases indicates that the design of inhibitors for the DEN proteases will aid the design of inhibitors for other flaviviral proteases, such as those of WNV and JEV.

3. Role of substrate residues in JEV-NS3pro activity: This was the first study characterising substrate preferences and inhibitor profiles for the JEV-NS3pro. The study can serve as an entry point to the design of efficient screening assays aimed at the discovery of antiviral inhibitors.

4. Role of basic peptide residues in DEN-NS3pro inhibition: Peptides with basic residues (Arg, Lys) at the P1–P4 positions yield the most efficiently cleaved substrates and the most potent inhibitors. However, because of their basic nature they show weak membrane permeability, which disqualifies them as antiviral drug candidates. The study successfully applied a strategy of stepwise removal of basic residues and shortening of the peptide, which resulted in a tetrapeptide with low micromolar inhibitory activity against the DEN proteases. The compound is promising for further optimisation to achieve high-affinity inhibitors of DEN proteases.

5. Role of non-basic tetrapeptides in DEN-NS3pro inhibition: Tetrapeptides containing Trp, Cys, Tyr, Phe, Leu, Thr, Ser, Ala and Asn residues were designed on the basis of the DEN inhibitory tetrapeptide from the previous study [paper IV]. The residues were varied systematically using a statistical molecular design approach. Four novel peptides with micromolar inhibitory activities were found. The data showed that both the type and the position of the residues are important for inhibitory activity. The results indicate that improved inhibitors may be developed using the tetrapeptides found as leads, which might yield potent inhibitors with lower molecular weight and drug-like properties for potential use in the treatment of flaviviral infections.

5 Acknowledgements

I would like to express my sincere gratitude to all the nice people who motivated and supported me over the years and who made this long journey enjoyable and rewarding.

I am especially grateful to:

Prof. Dr. Jarl Wikberg, my supervisor, for giving me the opportunity to be part of your group, for providing a creative, pleasant and collaborative work environment, for patient support, continuous encouragement and for sharing your exceptional knowledge, enthusiasm, and experience in science. Thank you also for giving me free choice to steer the project in my own direction.

Dr. Maris Lapins, my co-supervisor, for the introduction to PCM, valuable suggestions, critical comments, expert guidance and for always being ready to help. It would not have been possible to complete this thesis without your collaboration, contribution, guidance and help. Thank you for all the patient support I could wish for.

Prof. Dr. Gerd Katzenmeier, Mahidol University, for the nice working environment, never-ending support, patience, understanding, concern and exceptional hospitality. Thank you for always taking time to discuss, guide and help. This work would never have been completed without your collaboration and support.

Prof. Eva Brittebo (IPBS, Uppsala University), Prof. Prasert Auewarakul, Prof. Chanan Angsuthanasombat (IMBS, Mahidol University), and Prof. Staffan Uhlen (University of Bergen, Norway), for teaching, understanding, support, concern and hospitality.

My wonderful co-authors – Maris Lapins, Gerd Katzenmeier, Sviatlana Yahorava, Aleh Yahorau, Martin Eklund, Ola Spjuth, Ramona Petrovska, Iryna Shutava, Prof. Chanan Angsuthanasombat, Peteris Prusis, Anna S. Torrejon, Dr. Taian Cui, Wanisa Salaemae and Ckakard Chalayut for your input, support, collaboration and friendship.

Prof. Margareta Hammarlund-Udenaes, Prof. Lena Bergstrom, Prof. Ingrid Nylander, Dr. Anne-Lie Svensson, Dr. Erika Roman, Prof. Mats Karlsson, Prof. Georgy Bakalkin, Prof. Fred Nyberg, Prof. Ernst Oliw, Dr. Tatiana I, Dr. Hiroyuki W, Prof. Albert Ketterman, Prof. Varaporn A, Dr. Sarin C and Prof. Duncan R. Smith for your valuable time, suggestions and guidance.

The honourable members of the graduate studies committee and the examination committee for volunteering your expertise, energy and time in assessing and ensuring the overall quality of this work.

My nice present and former colleagues and friends at IPBS – Iryna Shutava (again), Stephy Prakash, Aleksejs Kontijevskis, Anton Shulukh, Ilona Mandrika, Sadia Oreland, Jonathan Alvrtsan, Berg Arvid, Klas Jonsson, Valentin Georgiev, Polina Georgiev, Loudin Daoura, Apilak W, MW Sadiq, M. Mumtaz and M. Zubair for your company, friendship and support.

My wonderful friends at IMBS – esp. Sarbast Algubare, Thanate Juntadech (Bot), Dr. Somphob L, Anchalee N, M. Yousof, M. Ijaz, Surian, Shumpo, Jum, Somsri, Opas, Pan, Ae, Niramon, Tunmai, and all for your support, friendship and hospitality.

Magnus Jansson, the late Erica Johnsson, Ulrica Bergstrom, Agneta Bergstrom, Marina, Marianne Danersund, Agneta Hortlund, Kjel Akerlund, Pernille Husberg, Umnaj, Pepo, Jesper Andersson & colleagues (thesis production team) and all other administrative and technical staff for administrative and technical support. Thank you for always being ready to help.

My ULLA Summer School fellows and friends from all over the world for sharing their knowledge, wonderful experiences and for all the fun.

My fellows and friends in the ‘Training of Teaching in Higher Education’, for sharing knowledge and experiences.

All my teachers for sharing knowledge and experience, for guidance, support and encouragement throughout my studies, from pre-school to university.

All my other fellows and friends, no one mentioned, no one forgotten.

All my family members and relatives for encouragement and support, optimism and love, prayers and care. Special thanks to my parents and grandparents (Abai au Baba) for the immortal love, prayers and memories.

Higher Education Commission (HEC) and University of Malakand (UOM), Government of Pakistan, for financial support.

M. Junaid
April 12, 2012

56 6 References

1. Hertzberg RP, Pope AJ (2000) High-throughput screening: new technology for the 21st century. Curr Opin Chem Biol 4: 445-451. 2. Lipinski CA, Lombardo F, Dominy BW, Feeney PJ (2001) Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev 46: 3-26. 3. Bachmann KA, Ghosh R (2001) The use of in vitro methods to predict in vivo pharmacokinetics and drug interactions. Current Drug Metabolism 2: 299- 314. 4. Brown D, Superti-Furga G (2003) Rediscovering the sweet spot in drug discovery. Drug Discov Today 8: 1067-1077. 5. Praksh N, Devangi P (2010) Drug Discovery. Journal Antivir Antiretrovir 2: 63- 68. 6. Kitchen DB, Decornez H, Furr JR, Bajorath J (2004) Docking and scoring in virtual screening for drug discovery: methods and applications. Nat Rev Drug Discov 3: 935-949. 7. Ortiz AR, Gomez-Puertas P, Leo-Macias A, Lopez-Romero P, Lopez-Vinas E, et al. (2006) Computational approaches to model ligand selectivity in drug design. Curr Top Med Chem 6: 41-55. 8. Leach AR, Shoichet BK, Peishoff CE (2006) Prediction of protein-ligand inter- actions. Docking and scoring: successes and gaps. J Med Chem 49: 5851- 5855. 9. Bohm HJ, Schneider G (2003) Protein-ligand Interactions: From Molecular Recognition to Drug Design (Methods and Principles in Medicinal Chemistry); Mannhold R, Kubinyi H, Folkers G, editors: Wiley-VCH. 10. Schlyer S, Horuk R (2006) I want a new drug: G-protein-coupled receptors in drug development. Drug Discov Today 11: 481-493. 11. Shoichet BK (2004) Virtual screening of chemical libraries. Nature 432: 862- 865. 12. Schneider G, Bohm HJ (2002) Virtual screening and fast automated docking methods. Drug Discov Today 7: 64-70. 13. Palczewski K, Kumasaka T, Hori T, Behnke CA, Motoshima H, et al. (2000) Crystal structure of rhodopsin: A G protein-coupled receptor. Science 289: 739-745. 14. Cherezov V, Rosenbaum DM, Hanson MA, Rasmussen SG, Thian FS, et al. 
(2007) High-resolution crystal structure of an engineered human beta2- adrenergic G protein-coupled receptor. Science 318: 1258-1265. 15. Wikberg JES, Eklund M, Willighagen EL, Spjuth O, Lapins M, et al. (2011) Introduction to Pharmaceutical Bioinformatics. Stockholm: Oakleaf Academic Publishing House. 1-416 p.

57 16. Ibezim EC, Duchowicz PR, Ibezim NE, Mullen LMA, Onyishi IV, et al. (2009) Computer - aided linear modeling employing QSAR for drug discovery. Scientific Research and Essays 4: 1559–1564. 17. Lundstedt T, Seifert E, Abramo L, Thelin B, Nyström A, et al. (1998) Experimental design and optimization. Chemom Intell Lab Syst: 3-40. 18. Tye H (2004) Application of statistical 'design of experiments' methods in drug discovery. Drug Discovery Today 9: 485-491. 19. Linusson A, Elofsson M, Andersson IE, Dahlgren MK (2010) Statistical Mole- cular Design of Balanced Compound Libraries for QSAR Modeling. Current Medicinal Chemistry 17: 2001-2016. 20. Lapinsh M, Prusis A, Gutcaits A, Lundstedt T, Wikberg JE (2001) Development of proteochemometrics: a novel technology for the analysis of drug- receptor interactions. Biochim Biophys Acta 1525: 180-190. 21. Witten IH, Frank E (2005) Data mining: Practical machine learning tools and techniques. Oxford: Elsevier Inc. 22. Hastie T, Tibshirani RJ, Friedman J (2001) The elements of statistical learning. New York: Springer. 23. Efron B (1983) Estimating the Error Rate of a Prediction Rule - Improvement on Cross-Validation. Journal of the American Statistical Association 78: 316- 331. 24. Osten DW (1988) Selection of optimal regression models via cross-validation. J Chemom 2: 39. 25. Gramatica P (2007) Principles of QSAR models validation: internal and external. Qsar & Combinatorial Science 26: 694-701. 26. Prusis P, Uhlen S, Petrovska R, Lapinsh M, Wikberg JE (2006) Prediction of indirect interactions in proteins. BMC Bioinformatics 7: 167. 27. Kontijevskis A, Prusis P, Petrovska R, Yahorava S, Mutulis F, et al. (2007) A look inside HIV resistance through retroviral protease interaction maps PLoS Comput Biol 3: e48. 28. Lapinsh M, Prusis P, Lundstedt T, Wikberg JE (2002) Proteochemometrics modeling of the interaction of amine G-protein coupled receptors with a diverse set of ligands. Mol Pharmacol 61: 1465-1475. 29. 
Lapinsh M, Prusis P, Mutule I, Mutulis F, Wikberg JE (2003) QSAR and proteo- chemometric analysis of the interaction of a series of organic compounds with melanocortin receptor subtypes. J Med Chem 46: 2572-2579. 30. Lapinsh M, Prusis P, Uhle´n S, Wikberg JE (2005) Improved approach for pro- teochemometrics modeling: application to organic compound - amine G pro-tein-coupled receptor interactions. . Bioinformatics 21: 4289-4296. 31. Lapinsh M, Veiksina S, Uhlen S, Petrovska R, Mutule I, et al. (2005) Proteochemometric mapping of the interaction of organic compounds with melanocortin receptor subtypes. Mol Pharmacol 67: 50-59. 32. Prusis P, Muceniece R, Andersson P, Post C, Lundstedt T, et al. (2001 ) PLS modeling of chimeric MS04/MSH-peptide and MC1/MC3-receptor interac- tions reveals a novel method for the analysis of ligand-receptor interactions. Biochim Biophys Acta 1544: 350-357.

58 33. Strombergsson H, Prusis P, Midelfart H, Lapinsh M, Wikberg JE, et al. (2006) Rough set-based proteochemometrics modeling of G-protein-coupled receptor-ligand interactions. Proteins 63: 24-34. 34. Mandrika I, Prusis P, Yahorava S, Shikhagaie M, Wikberg JE (2007) Proteochemo-metric modeling of antibody-antigen interactions using SPOT synthesised peptide arrays. . Protein Eng Des Sel 20: 301-307. 35. Kontijevskis A, Wikberg JE, Komorowski J (2007) Computational proteomics analysis of HIV-1 protease interactome. Proteins 68: 305-312. 36. Prusis P, Lapins M, Yahorava S, Petrovska R, Niyomrattanakit P, et al. (2008 ) Proteochemometrics analysis of substrate interactions with den-gue virus NS3 proteases Bioorg Med Chem 16: 9369-9377. 37. Lapins M, Eklund M, Spjuth O, Prusis P, Wikberg JE (2008) Proteochemometric modeling of HIV protease susceptibility. BMC Bioinformatics 9: 181. 38. Hopkins AL, Groom CR (2002) The druggable genome. Nat Rev Drug Discov 1: 727-730. 39. Copeland RA (2005) Evaluation of enzyme inhibitors in drug discovery. A guide for medicinal chemists and pharmacologists. Methods Biochem Anal 46: 1- 265. 40. Weiss RA, McMichael AJ (2004) Social and environmental risk factors in the emergence of infectious diseases. Nat Med 10: S70-76. 41. Stewart W (1967) 'A Mandate for State Action,' presented at the Association of State and Territorial Health Officers. Washington, DC. 42. Henderson AD (1993) Surveillance System and Intergovernmental Cooperation. In: Morse SS, editor. Emerging Viruses. New York: Oxford University Press. pp. 283-289. 43. Huges SS (1997) The virus: a history of concept: Heinemann Education Books. 44. Lwoff A, Tournier P (1966) The classification of viruses. Annu Rev Microbiol 20: 45-74. 45. Flint SJ, Enquist LW, Krug RM, Lacanielo VR, Shalka AM (2000) Principles of virology: molecular biology, pathogenesis and control. ASM Press. 46. Baltimore D (1971) Expression of animal virus genomes. Bacteriol Rev 35: 235- 241. 47. 
Coffin JM, Hughes SH, Varmus HE (1997) Retroviruses. Woodbury, NY.: Cold Spring Harbor La-boratory Press. 48. Hardy WD, Old LJ, Hess PW, Essex M, Cotter S (1973) Horizontal transmission of feline leukaemia virus (FeLV). Nature 244: 266-269. 49. Poiesz BJ, Ruscetti FW, Gazdar AF, Bunn PA, Minna JD, et al. (1980) Detection and isolation of type C retrovirus particles from fresh and cultured lympho-cytes of a patient with cutaneous T-cell lymphoma. Proc Natl Acad Sci USA 77: 7415-7419. 50. Gallo RC, Thomson MM (1996) Human T-cell Lymphotropic Virus Type I; Hollsberg P, Hafler DA, editors. Chichester: Wiley and Sons. 325 p. 51. Barre-Sinoussi F, Chermann JC, Rey F, Nugeyre MT, Chamaret S, et al. (1983) Isolation of a T-lymphotropic retrovirus from a patient at risk for acquired immune deficiency syndrome (AIDS). Science 220: 868-871.

59 52. Clavel F, Guetard D, Brun-Vezinet F, Chamaret S, Rey MA, et al. (1986) Isolation of a new human retrovirus from West African patients with AIDS. Science 233: 343-346. 53. Heeney JL, Dalgleish AG, Weiss RA (2006) Origins of HIV and the evolution of resistance to AIDS. Science 313: 462-466. 54. Worobey M, Telfer P, Souquiere S, Hunter M, Coleman CA, et al. (2010) Island biogeography reveals the deep history of SIV. Science 329: 1487. 55. Gottlieb MS (2006) Pneumocystis pneumonia--Los Angeles. 1981. Am J Public Health 96: 980-981; discussion 982-983. 56. UNAIDS (2011) World AIDS day report 2011: How to get zero: Faster, Smarter, Better. Geneva: Joint United Nations Programme on HIV/AIDS (UNAIDS). 57. Piatak M Jr, Saag MS, Yang LC, Clark SJ, Kappes JC, et al. (1993) High levels of HIV-1 in plasma during all stages of infection determined by competitive PCR. Science 259: 1749-1754. 58. Kahn JO, Walker BD (1998) Acute human immunodeficiency virus type 1 infection. N Engl J Med 339: 33-39. 59. Grossman Z, Meier-Schellersheim M, Paul WE, Picker LJ (2006) Patho genesis of HIV infection: what the virus spares is as im portant as what it destroys. Nat Med 12: 289-295. 60. Burton GF, Keele BF, Estes JD, Thacker TC, Gartner S (2002) Follicular dendritic cell contributions to HIV pathogenesis. Semin Immunol 14: 275- 284. 61. Koff WC (2011) HIV vaccine development: Challenges and opportunities towards solving the HIV vaccine-neutralizing antibody problem. Vaccine. 62. Rerks-Ngarm S, Pitisuttithum P, Nitayaphan S, Kaewkungwal J, Chiu J, et al. (2009) Vaccination with ALVAC and AIDSVAX to prevent HIV-1 infection in Thailand. N Engl J Med 361: 2209-2220. 63. UNAIDS (2011) Doha+10 Trips flexibilities and access to antiretroviral therapy: Lessons from the past, opportunities for the future. UNAIDS technical brief. 64. Food and Drug Administration - FDA (2011) Antiretroviral drugs used in the treatment of HIV infection. US Food and Drug Administration. pp. 
http://www.fda.gov/ForConsumers/byAudience/ForPatientAdvocates/HIVa ndAIDSActivities/ucm118915.htm. 65. Lengauer T, Sing T (2006) Bioinformatics-assisted anti-HIV therapy. Nat Rev Microbiol 4: 790-797. 66. Parniak MA, Sluis-Cremer N (2000) Inhibitors of HIV-1 reverse transcriptase. Adv Pharmacol 49: 67-109. 67. Goff SP (1990) Retroviral reverse transcriptase: synthesis, structure, and function. J Acquir Immune Defic Syndr 3: 817-831. 68. De Clercq E (2002) Strategies in the design of anti-viral drugs. Nat Rev Drug Discov 1: 13-25. 69. Goody RS, Muller B, Restle T (1991) Factors contributing to the inhibition of HIV reverse transcriptase by chain-terminating nucleotides in vitro and in vivo. FEBS Lett 291: 1-5.

60 70. Birgerson LE (2006) M.D./Alert. Roche Professional Product Information Department. pp. http://www.fda.gov/downloads/Drugs/DrugSafety/DrugShortages/ucm0860 99.pdf. 71. Randolph JT, DeGoey DA (2004) Peptidomimetic inhibitors of HIV protease. Curr Top Med Chem 4: 1079-1095. 72. Chrusciel RA, Strohbach JW (2004) Non-peptidic HIV protease inhibitors. Curr Top Med Chem 4: 1097-1114. 73. Johnson VA, Brun-Vezinet F, Clotet B, Gunthard HF, Kuritzkes DR, et al. (2007) Update of the drug resistance mutations in HIV-1: 2007. Top HIV Med 15: 119-125. 74. Zeinalipour-Loizidou E, Nicolaou C, Nicolaides A, Kostrikis LG (2007) HIV- 1integrase: from biology to chemotherapeutics. Curr HIV Res 5: 365-388. 75. Meanwell NA, Kadow JF (2007) Maraviroc, a chemokine CCR5 receptor an- tagonist for the treatment of HIV infection and AIDS. Curr Opin Investig Drugs 8: 669-681. 76. Temesgen Z, Feinberg JE (2006) Drug evaluation: bevirimat-HIV Gag protein and viral maturation inhibitor. Curr Opin Investig Drugs 7: 759-765. 77. Adolescents PoAGfAa (2011) Guidelines for the use of antiretroviral agents in HIV-1-infected adults and adolescents Department of Health and Human Services. pp. 1-167. 78. Gulick RM, Mellors JW, Havlir D, Eron JJ, Gonzalez C, et al. (1997) Treatment with indinavir, zidovudine, and lamivudine in adults with human immuno- deficiency virus infection and prior antiret-roviral therapy. N Engl J Med 337: 734-739. 79. Hammer SM, Squires KE, Hughes MD, Grimes JM, Demeter LM, et al. (1997) A controlled trial of two nucleoside an-alogues plus indinavir in persons with hu-man immunodeficiency virus infection and CD4 cell counts of 200 per cubic millimeter or less. N Engl J Med 337: 725-733. 80. Mocroft A, Ledergerber B, Katlama C, Kirk O, Reiss P, et al. (2003) Decline in the AIDS and death rates in the EuroSIDA study: an observational study. Lancet 362: 22-29. 81. Palella FJ, Jr., Baker RK, Moorman AC, Chmiel JS, Wood KC, et al. 
(2006) Mortality in the highly active antiretroviral therapy era: changing causes of death and disease in the HIV outpatient study. J Acquir Immune Defic Syndr 43: 27-34. 82. UNAIDS (2011) HIV Drug resistance Fact sheet. April 2011. W. H. O. pp. http://www.who.int/hiv/facts/WHD2011-HIVdr-fs-final.pdf. 83. Clavel F, Hance AJ (2004) HIV drug resistance. N Engl J Med 350: 1023-1035. 84. Coffin JM (1995) HIV population dynamics in vivo: implications for genetic variation, pathogenesis, and therapy. Science 267: 483-489. 85. Blanchard JS (1996) Molecular mechanisms of drug resistance in Mycobacte- rium tuberculosis. Annu Rev Biochem 65: 215-239. 86. Borst P (1991) Genetic mechanisms of drug resistance. A review. Acta Oncol 30: 87-105. 87. Remy S, et a (2003) A novel mechanism underlying drug resistance in chronic epilepsy. Ann Neurol 53: 469-479.

61 88. Schafer JM, Bentrem DJ, Takei H, Gajdos C, Badve S, et al. (2002) A mechanism of drug resistance to tamoxifen in breast cancer. J Steroid Biochem Mol Biol 83: 75-83. 89. Erickson JW, Burt SK (1996) Structural mechanisms of HIV drug resistance. Annu Rev Pharmacol Toxicol 36: 545-571. 90. Hong L, Zhang XC, Hartsuck JA, Tang J (2000) Crystal structure of an in vivo HIV-1 protease mutant in complex with saquinavir: insights into the mechanisms of drug resistance. Protein Sci 9: 1898-1904. 91. Schutten M (2006) Resistance assays. In: AM G, editor. Antiretroviral Resistance in Clinical Practice. London: Mediscript. 92. Rhee SY, Taylor J, Wadhera G, Ben-Hur A, Brutlag DL, et al. (2006) Genotypic predictors of human immunodeficiency virus type 1 drug resistance. Proc Natl Acad Sci U S A 103: 17355-17360. 93. Kjaer J, Hoj L, Fox Z, Lundgren JD (2008) Prediction of phenotypic suscepti- bility to antiretroviral drugs using physicochemical properties of primary enzy-matic structure combined with artificial neural networks. HIV Med 9: 642-652. 94. Beerenwinkel N, Daumer M, Oette M, Korn K, Hoffmann D, et al. (2003) Geno2pheno: Estimating phenotypic drug resistance from HIV-1 genotypes. Nucleic Acids Res 31: 3850-3855. 95. Saigo H, Uno T, Tsuda K (2007) Mining complex genotypic features for pre- dicting HIV-1 drug resistance. Bioinformatics 23: 2455-2462. 96. Fauquet CM, Mayo MA, Maniloff J, Desselberger U, Ball LA (2005) Virus taxonomy, VIIIth Report of the ICTV. London: Elsevier/Academic Press. 125916 p. 97. WHO (2009) Dengue: Guidelines for diagnosis, treatment, prevention and control. Geneva: World Health Organization (WHO) and the Special Programme for Research and Training in Tropical Diseases (TDR) pp. 147. 98. Christie J (1872) Remarks on ‘KiDinga Pepo’: a peculiar form of exanthematous disease. BMJ 1: 577-579 99. Kawaguchi I, Sasaki A, Boots M (2003) Why are dengue virus serotypes so distantly related? 
Enhancement and limiting serotype similarity be-tween dengue virus strains. Proc R Soc Lond 270: 2241-2247. 100. WHO (2012) Dengue and severe dengue (fact sheet). pp. http://www.who.int/mediacentre/factsheets/fs117/en/. Accessed: April, 2012. 101. Gubler DJ, Kuno G, Markoff L (1997) Dengue and dengue hemorrhagic fever: its history and resurgence as a global public health problem. In: Gubler DJ KG, Markoff L, editor. Dengue and dengue hemorrhagic fever. Wallingford: CAB International. pp. 1-22. 102. Weaver SC, Barrett AD (2004) Transmission cycles, host range, evolution and emergence of arboviral disease. Nat Rev Microbiol 2: 789-801. 103. Diallo M, Ba Y, Sall AA, Diop OM, Ndione JA, et al. (2003) Amplification of the sylvatic cycle of dengue virus type 2, Senegal, 1999-2000: entomologic findings and epidemiologic considerations. Emerg Infect Dis 9: 362-367. 104. Rudnick A (1965) Studies of the ecology of dengue in Malaysia. J Med Ent 2: 203-208.

62 105. WHO (2011) Practical manual and Guidelines for Dengue Vector Surveillance Sri Lanka: Medical Research Institute and Dengue coordination Unit. pp. 1- 138. 106. WHO (1997) Dengue haemorrhagic fever: Clinical diagnosis, treatment and control. Geneva. pp. 12-23. 107. George R, Lum LCS (1997) Clinical spectrum of dengue infection. In: Gubler DJ, Kuno G, editors. Dengue and dengue hemorrhagic fever. New York: CAB International. pp. 89-113. 108. Kurane I, Ennis FA (1997) Immunopathogenesis of dengue virus infections. In: Gubler DJ KG, editor. Dengue and dengue hemorrhagic fever. New York: CAB International. pp. 273-290. 109. Halstead SB (1970) Observations related to pathogenesis of dengue hemorrhagic fever. IV. Hypothesis and discussion. Yale J of Biol Med 42: 350-362. 110. Moren DM (1994) Anti-body dependent enhancement of infection and the pathogenesis of dengue disease. Clin Infec Dis 19: 500-512. 111. Mitamura T, Kitaoka M, Watanabe M, Okuba K, Tenjin S, et al. (1936) Study on Japanese encephalitis virus. Animal experiments and mosquito transmission experiments. Kansai Iji 1: 260-261. 112. Mitamura T, Kitaoka M, Mori K, Okuba K (1938) Isolation of the virus of Japanese epidemic encephalitis from mosquitoes caught in nature. Tokyo Iji Shinshi 62: 820-824. 113. Fischer M, Lindsey N, Staples JE, Hills S, Centers for Disease C, et al. (2010) Japanese encephalitis vaccines: recommendations of the Advisory Committee on Immunization Practices (ACIP). MMWR Recomm Rep 59: 1-27. 114. Gould EA, Solomoncn T, Mackenzie JS (2008) Does antiviral therapy have a role in the control of Japanese encephalitis? . Antiviral Research 78: 140- 149. 115. Misra UK, Kalita J (2010) Overview: Japanese encephalitis. Prog Neurobiol 91: 108-120. 116. Tsai TF (2000) New initiatives for the control of Japanese encephalitis by vaccination: minutes of a WHO/CVI meeting, Bangkok, Thailand,13-15 Oct. 1998. Vaccine 18: 1-25. 117. 
Buescher EL, Scherer WF (1959) Ecologic studies of Japanese encephalitis virus in Japan. IX. Epidemiologic correlations and conclusions. Am J Trop Med Hyg 8: 719-722. 118. Soman RS, Rodrigues FM, Guttikar SN, Guru PY (1977) Experimental viraemia and transmission of Japanese encephalitis virus by mosquitoes in ardeid birds. Indian J Med Res 66: 709-718. 119. Liou ML, Hsu CY (1998) Japanese encephalitis virus is transported across the central blood vessels by endocytosis in mouse brain. Cell Tissue Res 293: 389-394. 120. Kumar R, Mathur A, Kumar A, Sharma S, Chakraborty S, et al. (1990) Clinical features and prognosis indicators of Japanese encephalitis in childern in Lucknow (India). Indian J Med Res 91: 321-327.

63 121. Solomon T, Dung NM, Kneen R, Thao Le TT, Gainsborough M, et al. (2002) Siesures and raised intracranial pressure in Vietnamese patients with Japanese encephalitis. Brain 125: 1084-1093. 122. Beasley D, Lewthwaite P, Solomon T (2006) Current use and development of vaccines for Japanese encephalitis. Expert opinion on Biological Therapy 8: 95-106. 123. Solomon T (2003) Recent advances in Japanese encephalitits. Journal of Neuro Virology: 274-283. 124. Solomon T (2008) New vaccines for Japanese encephalitis. Lancet Neurol 7: 116-118. 125. Shlim DR, Solomon T (2002) Japanese encephalitis vaccine for travelers: exploring the limits of risk. Clin Infect Dis 35: 183-188. 126. Igarashi A (2002) Control of Japanese encephalitis in Japan: immunization of humans and animals, and vector control. Curr Top Microbiol Immunol 267: 139-152. 127. Sohn YM (2000) Japanese encephalitis immunization in South Korea: past, present, and future. Emerg Infect Dis 6: 17-24. 128. Takahashi H, Pool V, Tsai TF, T. CR (2000) Adverse events after Japanese encephalitis vaccination: a review of postmarketing surveillance data from Ja-pan and the United States.The VAERS Working Group. Vaccine 18: 2963-2969. 129. FDA (2010) Product approval information [package insert]. Ixiaro (Japanese encephalitis virus vaccine inactivated). Livingston, United Kingdom: Intercell Biomedical. 130. Organization WH (2006) Japanese encephalitis vaccines. Wkly Epidemiol Rec 81: 331-340. 131. Monath TP (2002) Japanese encephalitis vaccines: current vaccines and future prospects. Curr Top Microbiol Immunol 267: 105-138. 132. Beasley DW, Lewthwaite P, Solomon T (2008) Current use and develop- ment of vaccines for Japanese encephalitis. . Expert Opin Biol Ther 8: 95-10. 133. Campbell GL, Hills SL, Fischer M, Jacobson JA, Hoke CH, et al. (2011) Estimated global incidence of Japanese encephalitis: a systematic review. Bull World Health Organ 89: 766-774, 774A-774E. 134. Gubler DJ, Kuno G, Markoff L (2007) Flaviviruses. 
In: Knipe D HP, editor. Fields Virology. Philadelphia: Lippincott Williams and Wilkins. pp. 1153- 1252. 135. Bressanelli S, Stiasny K, Allison SL, Stura EA, Duquerroy S, et al. (2004) Structure of a flavivirus envelope glycopro-tein in its low-pH-induced membrane fusion conformation. EMBO J 23: 728-738. 136. ModisY, Ogata S, Clements D, Harrison SC (2004) Structure of the dengue virus envelope protein after membrane fusion. Nature 427: 313-319. 137. Lindenbach BD, Thiel HJ, Rice CM (2007) Flaviviridae: the viruses and their replication. In: Knipe DM, Howley PM, editors. Fields Virology. 5 ed. Philadelphia: Lippincott, Williams, and Wilkins. pp. 1101-1152. 138. Brinton MA (2002) The molecular biology of West Nile Virus: a new invader of the western hemisphere. Annu Rev Microbiol 56: 371-402.

64 139. Lorenz IC, Kartenbeck J, Mezzacasa A, Allison SL, Heinz FX, et al. (2003) Intracellular assembly and secretion of recombinant sub-viral particles from tick-borne encephalitis virus. J Virol 77: 4370-4382. 140. Mackenzie JM, Westaway EG (2001) Assembly and maturation of the flavivirus Kunjin virus appear to occur in the rough endoplasmic reticulum and along the secretory pathway, respectively. J Virol 75: 10787-10799. 141. Li L, Lok SM, Yu IM, Zhang Y, Kuhn RJ, et al. (2008) The flavivirus precursor membrane-envelope protein complex: structure and maturation. Science 319: 1830-1834. 142. Mukhopadhyay S, Kuhn RJ, Rossmann MG (2005) Structural perspectoive of the flavivirus life cycle. Nat Rev Microbiol 3: 13-22. 143. Yu IM, Zhang W, Holdaway HA, Li L, Kostyuchenko VA, et al. (2008) Structure of the imma-ture dengue virus at low pH primes proteolytic maturation. Science 319: 1834-1837. 144. Luo DH, Xu T, Watson RP, et a (2008) Insights into RNA unwinding and ATP hy-drolysis by the Flavivirus NS3 protein. EMBO J 27: 3209-3219. 145. Sampath A, Xu T, Chao A, Luo D, Lescar J, et al. (2006) Structure-based mutationalanalysis of the NS3 helicase from dengue virus. J Virol 80: 6686-6690. 146. Johansson A, Poliakov A, Akerblom E, et a (2003) Acyl sulfonamides as potent pro-tease inhibitors of the full-length NS3 (protease- helicase/NTPase): a comparative study of different C-terminals. Bioorg Med Chem 11: 2551-2568. 147. Chambers TJ, Weir RC, Grakoui A, McCourt DW, Bazan JF, et al. (1990) Evidence that the N-terminal domain of nonstructural protein NS3 from yellow fever virus is a serine protease responsible for site-specific cleavages in the viral polyprotein. Proc Natl Acad Sci USA 87: 8898-8902. 148. Falgout B, Pethel M, Zhang YM, Lai CJ (1991) Both nonstructural proteins NS2B and NS3 are required for the proteolytic processing of dengue virus nonstructural proteins. J Virol 65: 2467-2475. 149. 
Amberg SM, Rice CM (1998) Handbook of proteolytic enzymes; Barrett AJ RN, Woessner JF, editor. London: Academic Press. 150. Falgout B, Miller HR, Lai CJ (1993) Deletion analysis of dengue virus type 4 nonstructural protein NS2B: identification of a domain required for NS2B- NS3 proteinase activity. J Virol 67: 2034-2042. 151. Yusof R, Clum S, Wetzel M, Murthy HM, Padmanabhan R (2000) Purified NS2B/NS3 serine protease of dengue virus type 2 exhibits cofactor NS2B dependence for cleavage of substrates with dibasic amino acids in vitro. J Biol Chem 275: 9963-9969. 152. Leung D, Schroder K, White H, Fang NX, Stoermer MJ, et al. (2001) Activity of recombinant dengue 2 virus NS3 protease in the presence of a truncated NS2B co-factor, small peptide substrates, and inhibitors. J Biol Chem 276: 45762-45771. 153. Nall TA, Chappell KJ, Stoermer MJ, Fang NX, Tyndall JD, et al. (2004) Enzymatic characterization and homology model of a catalytically active recombinant West Nile virus NS3 protease. J Biol Chem 279: 48535- 48542.

154. Niyomrattanakit P, Winoyanuwattikun P, Chanprapaph S, Angsuthanasombat C, Panyim S, et al. (2004) Identification of residues in the dengue virus type 2 NS2B cofactor that are critical for NS3 protease activation. J Virol 78: 13708-13716.
155. Lin CW, Huang HD, Shiu SY, Chen WJ, Tsai MH, et al. (2007) Functional determinants of NS2B for activation of Japanese encephalitis virus NS3 protease. Virus Research 127: 88-94.
156. Chambers TJ, Hahn CS, Galler R, Rice CM (1990) Flavivirus genome organization, expression, and replication. Annu Rev Microbiol 44: 649-688.
157. Shiryaev SA, Kozlov IA, Ratnikov BI, Smith JW, Lebl M, et al. (2007) Cleavage preference distinguishes the two-component NS2B-NS3 serine proteinases of Dengue and West Nile viruses. Biochem J 401: 743-752.
158. Valle RPC, Falgout B (1998) Mutagenesis of the NS3 protease of dengue virus type 2. J Virol 72: 624-632.
159. Ryan MD, Monaghan S, Flint M (1998) Virus-encoded proteinases of the Flaviviridae. J Gen Virol 79 (Pt 5): 947-959.
160. Aleshin AE, Shiryaev SA, Strongin AY, Liddington RC (2007) Structural evidence for regulation and specificity of flaviviral proteases and evolution of the Flaviviridae fold. Protein Sci 16: 795-806.
161. Erbel P, Schiering N, D'Arcy A, Renatus M, Kroemer M, et al. (2006) Structural basis for the activation of flaviviral NS3 proteases from dengue and West Nile virus. Nat Struct Mol Biol 13: 372-373.
162. Niyomrattanakit P, Yahorava S, Mutule I, Mutulis F, Petrovska R, et al. (2006) Probing the substrate specificity of the dengue virus type 2 NS3 serine protease by using internally quenched fluorescent peptides. Biochem J 397: 203-211.
163. Li J, Lim SP, Beer D, Patel V, Wen D, et al. (2005) Functional profiling of recombinant NS3 proteases from all four serotypes of dengue virus using tetra- and octa-peptide substrate libraries. J Biol Chem 280: 28766-28774.
164. Shiryaev SA, Ratnikov BI, Aleshin AE, Kozlov IA, Nelson NA, et al. (2007) Switching the substrate specificity of the two-component NS2B-NS3 flavivirus proteinase by structure-based mutagenesis. J Virol 81: 4501-4509.
165. Iempridee T, Thongphung R, Angsuthanasombat C, Katzenmeier G (2008) A comparative biochemical analysis of the NS2B(H)-NS3pro protease complex from four dengue virus serotypes. Biochim Biophys Acta 1780: 989-994.
166. Mueller NH, Yon C, Ganesh VK, Padmanabhan R (2007) Characterization of the West Nile virus protease substrate specificity and inhibitors. The International Journal of Biochemistry & Cell Biology 39: 606-614.
167. Katzenmeier G (2004) Inhibition of the NS2B-NS3 Protease - Towards a causative Therapy for Dengue Virus Disease. Dengue Bulletin 28: 58-67.
168. Ganesh VK, Muller N, Judge K, Luan C, Padmanabhan R, et al. (2005) Identification and characterization of non-substrate based inhibitors of the essential dengue and West Nile virus proteases. Bioorg Med Chem 13: 257-264.

169. Tomlinson SM, Malmstrom RD, Russo A, Mueller N, Pang YP, et al. (2009) Structure-based discovery of dengue virus protease inhibitors. Antiviral Res 82: 110-114.
170. Sidique S, Shiryaev SA, Ratnikov BI, Herath A, Su Y, et al. (2009) Structure-activity relationship and improved hydrolytic stability of pyrazole derivatives that are allosteric inhibitors of West Nile virus NS2B-NS3 proteinase. Bioorg Med Chem Lett 19: 5773-5777.
171. Johnston PA, Phillips J, Shun TY, Shinde S, Lazo JS, et al. (2007) HTS identifies novel and specific uncompetitive inhibitors of the two-component NS2B-NS3 proteinase of West Nile virus. Assay Drug Dev Technol 5: 737-750.
172. Mueller NH, Pattabiraman N, Ansarah-Sobrinho C, Viswanathan P, Pierson TC, et al. (2008) Identification and biochemical characterization of small-molecule inhibitors of west nile virus serine protease by a high-throughput screen. Antimicrob Agents Chemother 52: 3385-3393.
173. Ekonomiuk D, Su XC, Ozawa K, Bodenreider C, Lim SP, et al. (2009) Discovery of a non-peptidic inhibitor of west nile virus NS3 protease by high-throughput docking. PLoS Negl Trop Dis 3: e356.
174. Ekonomiuk D, Su XC, Ozawa K, et al. (2009) Flaviviral protease inhibitors identified by fragment-based library docking into a structure generated by molecular dynamics. J Med Chem 52: 4860-4868.
175. Chanprapaph S, Saparpakorn P, Sangma C, Niyomrattanakit P, Hannongbua S, et al. (2005) Competitive inhibition of the dengue virus NS3 serine protease by synthetic peptides representing polyprotein cleavage sites. Biochem Biophys Res Commun 330: 1237-1246.
176. Yin Z, Patel SJ, Wang W, Wang G, Chan W, et al. (2006) Peptide inhibitors of dengue virus NS3 protease. Part 1: warhead. Bioorg Med Chem Lett 16: 36-39.
177. Schuller A, Yin Z, Brian Chia CS, Doan DN, Kim HK, et al. (2011) Tripeptide inhibitors of dengue and West Nile virus NS2B-NS3 protease. Antiviral Res 92: 96-101.
178. D'Arcy A, Chaillet M, Schiering N, Villard F, Lim SP, et al. (2006) Purification and crystallization of dengue and West Nile virus NS2B-NS3 complexes. Acta Crystallogr Sect F Struct Biol Cryst Commun 62: 157-162.
179. Tomlinson SM, Watowich SJ (2008) Substrate inhibition kinetic model for West Nile virus NS2B-NS3 protease. Biochemistry 47: 11763-11770.
180. Damm KL, Ung PM, Quintero JJ, Gestwicki JE, Carlson HA (2008) A poke in the eye: inhibiting HIV-1 protease through its flap-recognition pocket. Biopolymers 89: 643-652.
181. Luzhkov VB, Selisko B, Nordqvist A, Peyrane F, Decroly E, et al. (2007) Virtual screening and bioassay study of novel inhibitors for dengue virus mRNA cap (nucleoside-2'O)-methyltransferase. Bioorg Med Chem 15: 7795-7802.
182. Mukherjee P, Desai P, Ross L, White EL, Avery MA (2008) Structure-based virtual screening against SARS-3CL(pro) to identify novel non-peptidic hits. Bioorg Med Chem 16: 4138-4149.

183. Wikberg JES, Lapinsh M, Prusis P (2004) Proteochemometrics - a tool for modelling the molecular interaction space. In: Mueller G KH, editor. Chemogenomics Methods and Principles in Medicinal Chemistry. Weinheim: Wiley-VCH.
184. Kontijevskis A, Petrovska R, Yahorava S, Komorowski J, Wikberg JE (2009) Proteochemometrics mapping of the interaction space for retroviral proteases and their substrates. Bioorg Med Chem 17: 5229-5237.
185. Lapins M, Wikberg JE (2009) Proteochemometric modeling of drug resistance over the mutational space for multiple HIV protease variants and multiple protease inhibitors. J Chem Inf Model 49: 1202-1210.
186. Beerenwinkel N, Sing T, Lengauer T, Rahnenfuhrer J, Roomp K, et al. (2005) Computational methods for the design of effective therapies against drug resistant HIV strains. Bioinformatics 21: 3943-3950.
187. Salaemae W, Junaid M, Angsuthanasombat C, Katzenmeier G (2010) Structure-guided mutagenesis of active site residues in the dengue virus two-component protease NS2B-NS3. J Biomed Sci 17: 68.
188. Jan LR, Yang CS, Trent DW, Falgout B, Lai CJ (1995) Processing of Japanese encephalitis virus non-structural proteins: NS2B-NS3 complex and heterologous protease. J Gen Virol 76: 573-580.
189. Stoermer MJ, Chappell KJ, Liebscher S, Jensen CM, Gan CH, et al. (2008) Potent cationic inhibitors of West Nile virus NS2B/NS3 protease with serum stability, cell permeability and antiviral activity. J Med Chem 51: 5714-5721.
