
A Thesis

entitled

Homology-based Structural Prediction of the Binding Interface Between the Tick-Borne Encephalitis Virus Restriction Factor TRIM79 and the Flavivirus Non-structural 5 Protein.

by

Heather Piehl Brown

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the Master of Science Degree in Biomedical Science

______R. Travis Taylor, PhD, Committee Chair

______Xiche Hu, PhD, Committee Member

______Robert M. Blumenthal, PhD, Committee Member

______Amanda Bryant-Friedrich, PhD, Dean College of Graduate Studies

The University of Toledo

December 2016

Copyright 2016, Heather Piehl Brown

This document is copyrighted material. Under copyright law, no parts of this document may be reproduced without the expressed permission of the author.

An Abstract of

Homology-based Structural Prediction of the Binding Interface Between the Tick-Borne Encephalitis Virus Restriction Factor TRIM79 and the Flavivirus Non-structural 5 Protein.

by

Heather P. Brown

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the Master of Science Degree in Biomedical Sciences

The University of Toledo December 2016

The innate immune system of the host is vital for determining the outcome of virulent virus infections. Successful immune responses depend on detecting the specific virus through interactions between host factors and the proteins or genomic material of the virus. We previously identified a host antiviral protein of the tripartite motif (TRIM) family, TRIM79, which plays a critical role in the antiviral response to flaviviruses. The Flavivirus genus includes many arboviruses that are significant human pathogens, such as tick-borne encephalitis virus (TBEV) and West Nile virus (WNV). We found that TRIM79 directly interacts with the viral polymerase, nonstructural protein 5 (NS5), and leads to lysosomal degradation of NS5. This restriction is specific to TBEV: direct binding of TRIM79 and subsequent degradation of NS5 were not seen for the mosquito-borne flavivirus WNV, even though the TBEV and WNV NS5 proteins share 59% identity. Thus, dissecting the TRIM79/NS5 interaction will provide an effective model of how antiviral proteins differentiate between similar viral proteins. To begin addressing how TRIM79 targets only TBEV NS5, the 3D structures of TBEV and WNV NS5 were compared, and the TBEV NS5/TRIM79 interaction complex was modeled to identify residues critical for this interaction. Because the structures of TRIM79 and TBEV NS5 are unsolved, homology-based protein modeling was used to create preliminary structures for both proteins. These structures were then used to predict the binding interface for TRIM79 monomers and dimers. From the predicted binding interfaces, residues that are important for binding and unique to TBEV NS5 were identified; these could then be mutated to disrupt the interaction, rendering TBEV NS5 resistant to TRIM79 restriction.


To my husband, TJ, who believed in me even when I did not believe in myself. I would never have gotten to here without your love and support.

Acknowledgements

This thesis would not have been possible without the support of many individuals. A special thanks to my committee members Dr. Bob Blumenthal and Dr. Xiche Hu, both of whom contributed lots of support with learning new techniques and methods, and by posing thought provoking questions that helped guide the research and kept it from being an impossibly large question to answer.

I would like to thank all the past and present members of the Taylor Lab, but especially John Presloid, Adaeze Izuogu, and Brian Youseff, who have provided insight, guidance, and encouragement that helped me develop into a better scientist. I owe a special debt of gratitude to Dr. Travis Taylor, who was kind enough to invite a budding bioinformatician into a virology lab. Travis’s ability to get on the same level and explain things in a way that made daunting concepts seem easy was invaluable. His endless guidance and patience mean more to me than I can ever express.

This endeavor would never have been possible without the support of my family. My parents have provided a guiding light for me to follow and never tried to quell my relentless desire to learn more about the world around me. My brothers and my sisters have all provided unconditional emotional support and let me ramble on about my project. My grandmother Paula provided not only emotional and financial support of this endeavor, but was also a valuable source with which to discuss my project.

Table of Contents

Abstract

Acknowledgements

Table of Contents

List of Tables

List of Figures

I. Introduction

A. Flaviviruses

B. Flavivirus Entry

C. Flavivirus Replication

D. Flavivirus Immune Evasion

E. Host TRIM Restriction Factors

F. TRIM79

G. Protein Structural Modeling

II. Methods

A. Homology Modeling

B. Predicting the Interaction

C. Selecting Key Residues

D. Calculating Interaction Energies

E. In silico Mutagenesis

F. Eukaryotic Linear Motif

III. Results

A. Structural Modeling

B. Interaction Complex Modeling

C. Modeling Effects of

D. Summary

IV. Discussion

References

List of Tables

Table 1 Excerpt from a contact map output file

Table 2 Preliminary list of TBEV mutagenesis candidates with the corresponding

Table 3 List of TBEV mutagenesis candidates with the corresponding amino acid mutations from TRIM79 monomer cross referenced with the alignment of TRIM79 and TRIM30

Table 4 Final list of TBEV mutagenesis candidates with the corresponding amino acid mutations from the TRIM79 dimer and cross referenced with the alignment of TRIM79 and TRIM30

Table 5 Energy calculations for all interaction complexes

Table 6 Energy calculations for the impact of point mutations on complex 19

Table 7 Energy calculations for the impact of the mutation P95K on other interaction complexes

List of Figures

Figure 1 Code used in TK Console to generate the contact map

Figure 2 of JEV NS5 and TBEV NS5

Figure 3 Homology modeled structure of TBEV NS5 utilizing the solved crystal structure of JEV NS5

Figure 4 Structural comparison of TBEV NS5 and WNV NS5 modeled on the TBEV NS5 structure

Figure 5 Multiple sequence alignment (MSA) of WNV NS5, LGTV NS5, and TBEV NS5

Figure 6 Structure of TBEV NS5 with all non-identical and non-similar residues marked

Figure 7 Sequence alignment of TRIM5α and TRIM79

Figure 8 Homology modeled partial structure of mouse TRIM79 utilizing the solved partial crystal structure of rhesus macaque TRIM5α

Figure 9 Sequence alignment of mouse TRIM79 and mouse TRIM30

Figure 10 Top four predicted interaction complexes

Figure 11 Structure of TBEV NS5 with preliminary candidate residues marked

Figure 12 Model of interaction complex 19

Figure 13 Graph of ΔE impact of point mutations

Figure 14 Graph of ΔΔE impact of

Chapter One

Introduction

The Flaviviruses.

Mankind has always battled foes that are invisible to the naked eye. Bacteria and viruses predate the human species and, though antibiotics have been developed to treat many of the bacteria that can cause us harm, viruses have proved more difficult to treat. Among the myriad pathogenic viruses, one genus, the flaviviruses, has been of particular interest to the Taylor Laboratory because of the high incidence of disease it causes. Members of this genus are geographically widespread and can be found on every continent that has a dense human population. There are currently over 70 members of this genus, of which 40 species are able to cause disease in humans. The genus includes several well-known and emerging human pathogens, including Dengue virus (DENV), Yellow Fever virus (YFV), Tick-borne Encephalitis virus (TBEV), Japanese Encephalitis virus (JEV), St. Louis Encephalitis virus (SLEV), West Nile virus (WNV), Zika virus (ZIKV), and Powassan virus (POWV). These viruses are responsible for hundreds of thousands of reported cases each year, which often lead to hospitalization or, in extreme cases, death. They cycle between their arthropod vectors and vertebrate reservoir hosts, including monkeys, mice, and birds; humans are a dead-end host. The members of this genus that cause disease in humans are transmitted via the bite of an infected arthropod, which also classifies them as arboviruses. Flaviviruses are usually separated into two major groups depending on which arthropod serves as the vector of transmission: mosquito, which accounts for 34 species, or tick, which accounts for 17. However, 22 can be zoonotically transmitted and currently have no known vector. Infection by these viruses in humans can lead to encephalitis, meningitis, myelitis, and in extreme cases, hemorrhagic fever. With many of these viruses the infection proceeds in two stages. The initial stage often presents as a febrile disease with symptoms similar to the flu. If the immune system fails to clear the infection, it can progress to the second stage, in which the more severe symptoms are seen. Among those who survive the infection, long-term neurological sequelae may be observed, ranging from impaired speech to loss of balance. The severity and morbidity of disease depend on the viral titer at exposure, the physical location of the inoculation site, physiological and genetic factors of the host, and the viral strain (Knipe et al.).

Flaviviruses are responsible for over 50,000 deaths per year, and with the expansion of their endemic regions and increasing incidence of these human infections, they are important to study (Zmurko, Neyts, & Dallmeier). Currently, the United States only has one approved vaccine for flaviviruses (Heinz & Stiasny). For those unfortunate enough to contract these diseases, only supportive care is available, as no specific antiviral treatment exists for members of this genus.

Flavivirus Entry

The virus is transmitted into the host through the bite of an infected arthropod. The primary sites of replication are dendritic cells and neutrophils (Robertson, Mitzel, Taylor, Best, & Bloom). From there the virus enters the regional lymph nodes and then the efferent lymphatics, which leads to plasma viremia. From the blood the virus can enter the vascular endothelium, the olfactory epithelium, and/or the neural parenchyma, where it can spread to neurons and glia (Knipe et al.). The virus enters the cell via receptor-mediated endocytosis. The envelope (E) glycoprotein of the virus interacts with multiple cell receptors, while other native proteins serve as attachment factors, including heat shock proteins 90 and 70, C-type lectins, and mannose receptors. The specific mode of entry is dictated by the cell type and the strain of virus. In WNV and DENV this internalization has been observed to occur very quickly in the infection process. Once the virus has entered the cell within an endosome, the low pH environment of the endosome causes a conformational change in the E protein, which results in uncoating of the virus and fusion of the host and viral membranes. This releases the viral genome into the cell (Smit, Moesker, Rodenhuis-Zybert, & Wilschut).

Flavivirus Replication.

Flaviviruses are classified as Group IV viruses because of their positive-sense single-stranded RNA genome. They are also enveloped, with an icosahedral capsid. Because of the nature of their genome, the host cell treats the flavivirus genome as messenger RNA (mRNA) and immediately translates it into a single polyprotein. The polyprotein is then cleaved into the individual proteins, and viral replication occurs on modified endoplasmic reticulum (ER) membranes. The genome of flaviviruses is relatively small, usually between 10 and 11 kilobases long. It comprises a 5’ untranslated region (UTR) with a cap structure and a 3’ UTR that, in most species, forms a complex RNA fold rather than being polyadenylated. Both the 3’ UTR and 5’ UTR seem to aid in the replication of the virus. The 3’ UTR forms a stem-loop and has been found to interact with the host protein elongation factor 1A (EF1A), which regulates translation through its function of transporting charged transfer RNA (tRNA) into the ribosome. This region may also regulate translation and replication of the viral genome to keep these processes balanced. The 5’ UTR seems to serve as a site of initiation of positive strand synthesis, and deletions in this region are lethal for the replication of the flavivirus, but not for its translation. The genome has one open reading frame (ORF), which encodes the three structural proteins: the envelope (E), pre-membrane (prM), and capsid (C) proteins. The E protein plays a role in membrane fusion as well as mediating binding to the target cell. The prM protein is vital for proper folding of the nascent E protein. prM is cleaved by the host protease furin; the M portion remains in the mature virion and the pr segment is secreted from the cell. The capsid mediates membrane association, and deletions within the internal hydrophobic α-helix of the capsid protein decrease the amount of virus released from infected cells (Lindenbach & Rice).

The seven non-structural proteins, all designated NS to signify non-structural, are NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5. Many of the NS proteins participate in the formation of the replication complex. NS2B and NS3 form the viral protease, with NS2B serving as an essential cofactor, while the C-terminal region of NS3 serves as the viral helicase and nucleoside triphosphatase. NS5 is the largest and most conserved protein in flaviviruses and comprises two domains: the RNA-dependent RNA polymerase (RdRP), responsible for genome replication, and the methyltransferase (MTase), which adds methyl groups to the guanine N-7 and ribose 2’-O positions of the RNA cap, preventing the host from mounting a triphosphate-triggered innate immune response (Dong et al.). NS1 is necessary for a productive flavivirus infection and is essential for viral RNA synthesis. The viral replication complex does not form efficiently when functional NS1 is not present (Youn, Ambrose, Mackenzie, & Diamond). NS2A, NS2B, NS4A, and NS4B are integral membrane proteins and serve as a scaffold for the formation of the replication complex. NS4A and NS4B have multiple roles in viral replication and in how the virus interacts with the host. NS4A can facilitate ER membrane reorganization, the formation of new replication complexes, and the induction of autophagy to prevent the death of the host cell. In addition, NS4A helps regulate the ATPase activity of the helicase function of NS3, but dissociation of single-stranded RNA from the helicase is accomplished through association with NS4B. With the assistance of NS1, NS4A and NS4B help modulate replication of the virus (Zou et al.). NS4B has been found to be important for the ability of the flavivirus to adapt to new host environments (Zmurko et al.). NS2A binds to regions of the replication complex along with the 3’ UTR and associates with the ER membrane. It also plays a role in the assembly and secretion of mature virus particles (Leung et al.). Immature virus particles are then transported through the trans-Golgi network, where the prM protein is cleaved by the host protease furin to the mature M protein, and the mature infectious particles are exocytosed from the cell. Many viral proteins also antagonize host immune proteins or exploit them to assist in the replication process.

Flavivirus Immune Evasion.

Pathogenicity and persistence of infection are influenced by the response of the host immune system. Viruses have coevolved with their hosts to evade or utilize components of the host’s immune response for their benefit, and many of the viral proteins contribute to the pathogenesis of the virus. For example, two mutations in the NS4B protein lead to decreased neurovirulence as well as a reduction in the ability of the virus to extend further into the body, whereas a single mutation can increase the ability of the virus to replicate and lead to death of the host. One host protein that is utilized by some members of the flavivirus genus is Ube2i. The standard function of this protein is to catalyze small ubiquitin-like modifier (SUMO) transfer to target proteins. Specifically, in the case of dengue virus, Ube2i is a host factor essential for viral replication (Zmurko et al.).

The immune response is divided into two major components, innate and adaptive. The innate response is usually triggered within minutes of the virus entering the body. A major component of the innate antiviral response is type I interferon (IFN), which includes IFN-α and IFN-β. The production of IFN occurs when the virus is recognized by a pattern recognition receptor (PRR). In the case of flaviviruses these are usually the Toll-like receptors (TLRs), typically TLR3 and TLR7/8, or retinoic acid-inducible gene I (RIG-I). For example, Japanese encephalitis virus (JEV) interacts with TLR7, and a defect or deficiency of these receptors leads to a significant difference in susceptibility to this virus (Nazmi et al.). Once the receptor is bound, a signaling cascade leads to the activation of transcription factors. Signaling cascades may progress through conformational changes, cleavage, or the addition of other molecules to the substrates. Though the signaling cascade for each flavivirus receptor varies, all can activate interferon regulatory factor (IRF) 3, IRF7, and nuclear factor κB (NF-κB). NF-κB then increases expression of the type I IFNs, IFN-α and IFN-β, as well as pro-inflammatory cytokines, including interleukin-18 (IL-18) (Tsai et al.). The binding of interferon to the interferon alpha-beta receptors (IFNARs) leads to signaling through the Janus kinase/signal transducer and activator of transcription (JAK-STAT) pathway and to the increased expression of a large number of interferon-stimulated genes (ISGs), which include, but are not limited to, interferon regulatory factors (IRFs), interferon-induced proteins with tetratricopeptide repeats (IFITs), and tripartite motif (TRIM) proteins (Best et al.). The production of IFN can also induce the adaptive immune response via the major histocompatibility complex (MHC) class I pathway, which can increase antigen presentation as well as trigger the maturation of dendritic cells.

Host Cell TRIM Restriction Factors.

TRIM proteins have many functions within a cell. They received their name from the conserved RBCC order of domains in the protein: a RING domain (R), followed by one or two B-boxes (B), followed by a coiled-coil (CC). The RING domain contains a zinc-binding motif which, in many TRIM proteins, confers E3 ubiquitin ligase activity; some have been shown to participate in SUMOylation as well. The B-box also contains zinc-binding motifs and is important for innate resistance to specific viruses, including HIV. The B-box has also been shown to play a role in recruiting substrates for proteasomal degradation (Meroni). The coiled-coil domain mediates the homomeric and heteromeric interactions of members of the TRIM family with one another and can be vital for the antiviral activity of these proteins. TRIM proteins are subdivided into 11 different families based on the presence or combination of 11 unique motifs. These motifs are the C-terminal subgroup one signature (COS), fibronectin type 3 (FN3), SPRY-associated (PRY), splA kinase and ryanodine receptor (SPRY), plant homeodomain (PHD), bromodomain (BR), filamin-type immunoglobulin (FIL), NHL repeats (NHL), meprin and tumour-necrosis factor receptor-associated factor homology (MATH), ADP ribosylation factor-like (ARF), and transmembrane motifs. The COS domain usually regulates binding to microtubules and contributes to resistance to retroviruses. FN3 domains can bind heparin and DNA, while PHD domains seem to regulate transcription by modifying chromatin structure. Bromodomains recognize acetylated residues and are always paired with a PHD domain; these two domains work together to repress transcription. NHL repeats tend to play a role in protein-protein interactions, and FIL domains mediate actin cross-linking and dimerization. The MATH domain modulates interaction with TRAF proteins, and ARF domains help control the traffic of vesicles within the cell. The SPRY domain is found across the Eukarya domain, whereas the PRYSPRY domain is found only in vertebrates; these domains form an interaction complex that resembles the antigen-antibody complex and can play a role in the immune response (Ozato, Shin, Chang, & Morse). Many members of the TRIM family have virus-specific roles in immunity. TRIM5 has been shown to restrict HIV-1 and other retroviruses (Grutter & Luban), TRIM22 can protect cells from the encephalomyocarditis virus (Eldin et al.), and TRIM56 can inhibit pestivirus infection (Khadka et al.). In the case of flaviviruses, another TRIM protein, TRIM79, has been shown to inhibit viral replication (Taylor et al.); this interaction is discussed in further detail later.

The function of these antiviral TRIM proteins is mainly as E3 ubiquitin ligases. E3 is the final enzyme in the ubiquitin pathway. The pathway begins with a ubiquitin-activating enzyme (E1) that forms a thioester bond with ubiquitin. The ubiquitin-conjugating enzyme (E2) stabilizes binding of activated ubiquitin to the protein substrate. The ubiquitin ligase (E3) then forms an isopeptide bond between the ubiquitin and a lysine residue on the target protein. The result of ubiquitination often depends on the number of ubiquitin molecules and the type of linkage between them. Typically, polyubiquitin linked at the Lys48 residue results in proteasomal degradation of the substrate protein. Polyubiquitin linked at the Lys63 residue has been found to function in pathways for cellular signaling, ribosomal biogenesis, DNA repair, and intracellular trafficking. Monoubiquitination helps in membrane trafficking by providing a recognition signal (Miranda & Sorkin).

In 2010, several TRIM proteins were found to also function as small ubiquitin-like modifier (SUMO) E3 ligases. TRIM19, also known as promyelocytic leukemia protein (PML), is an effective SUMO E3 ligase. TRIM19 modifies p53, a tumor suppressor protein; Mdm2, an oncoprotein; and c-Jun, another oncoprotein, while also binding to Ubc9, a SUMO E2 conjugating enzyme. TRIM27 is also a SUMO E3 ligase, acting on Mdm2 and p53 as well as interacting with Ubc9. Other TRIM proteins that enhance SUMOylation of Mdm2 are TRIM32, TRIM36, TRIM28, TRIM1, TRIM22, and TRIM39 (Chu & Yang). As with ubiquitination, the consequences of SUMOylation vary. One consequence of the addition of SUMO is a change in the conformation of the substrate, which regulates its function. This can be seen in thymine-DNA glycosylase (TDG), which loses its affinity for DNA when SUMOylated. Another potential consequence of SUMOylation is to provide a scaffold or interaction center that recruits new proteins, as seen when SUMOylation of PML (TRIM19) recruits RNF4, a ubiquitin ligase. The final consequence of SUMOylation is blocking the binding of another protein to the substrate. One example is E2-25k, a ubiquitin-conjugating enzyme whose SUMOylation inhibits its ability to interact with the E1 ubiquitin-activating enzyme (Wilkinson & Henley).

SUMO and ubiquitin are functionally related within the cell. There have been reports of both of these post-translational modifiers working to co-regulate substrate proteins. One way in which they do this is via antagonism. This can be seen in the case of inhibitor of κB α (IκBα), which is a regulator of NFκB. When IκBα is ubiquitinated, it is marked for degradation by the proteasome; however, when IκBα is SUMOylated it is protected from ubiquitin-mediated proteasomal degradation. Ubiquitin and SUMO can also act sequentially, as exhibited in the case of the NFκB essential modulator (NEMO), where SUMOylation causes nuclear accumulation of NEMO, which is then phosphorylated and ubiquitinated to export it from the nucleus. There has also been evidence of interplay between SUMO and ubiquitin in regulating p53, where ubiquitination of the substrate leads to the recruitment of a SUMO E3, enhancing its SUMOylation. Ubiquitination and SUMOylation can also directly cross-regulate one another by acting on each other’s enzymatic proteins. SUMO depresses the ubiquitination pathway by SUMOylating E2-25k so that it can no longer conjugate ubiquitin. Ubiquitin also downregulates SUMO by ubiquitinating the SUMO E1 protein SAE1, resulting in its degradation. Sometimes these two processes can form negative feedback loops, as is the case with the ubiquitin E3 ligase parkin. Parkin ubiquitinates RanBP2, a SUMO E3, leading to the degradation of RanBP2, and SUMO binding to parkin increases its activity as an E3 (Wilkinson & Henley).

TRIM79.

As discussed earlier, many members of the TRIM family of proteins serve an antiviral function. In the case of flaviviruses, a recently discovered TRIM protein was found to potently restrict the ability of members of this genus to replicate. In a yeast two-hybrid study, TRIM79 was identified as a potential binding partner for Langat virus (LGTV) NS5. It was later found that this interaction was specific to the NS5 protein. The region of NS5 found to interact lay within the methyltransferase domain, and the interacting region of TRIM79 excluded the RING domain. The promoter region of the TRIM79 gene contained recognized binding sites for immune response transcription factors. To test the specificity of the interaction, LGTV NS5 was also tested against TRIM30, which shares 82% sequence identity with TRIM79 but does not bind NS5. The result of the interaction between TRIM79 and NS5 is lysosomal degradation of NS5, leading to restriction of viral replication. To further understand this interaction, TRIM79 was tested against four other flaviviruses: TBEV, POWV, WNV, and JEV. Restriction of replication was observed for TBEV and POWV, but not for WNV or JEV, indicating that TRIM79 is specific to tick-borne members of the flavivirus genus; however, the reason for this specificity is not fully understood (Taylor et al.).

Dissecting this interaction became the focus of my project. We were faced with a challenge because the structures of both TBEV NS5 and TRIM79 are unsolved. Thus, we turned to using in silico methods to help us determine the interaction complex and determine which residues are most likely to be critical for the interaction to take place.

Protein Structural Modeling.

When the structure of a protein of interest is unknown, several modeling methods may be employed to predict its three-dimensional structure. The first is homology modeling, which is used when homologs of the protein of interest, or of its domain of interest, have solved crystal structures. It is based on two reliable assumptions: two homologous proteins will share very similar structures, and protein folding patterns are more conserved between homologs than the overall amino acid sequence (Rost). Homology modeling is done by aligning the amino acid sequence of the protein of interest to a template protein and identifying the conserved regions, then creating a model backbone by duplicating the conserved regions of the template structure. The side chains are then added. If the protein of interest does not have a homolog with a solved crystal structure but has fold patterns similar to those of a non-homologous protein, the next modeling method that may be used is threading, or fold comparison. This method separates the protein of interest into sub-fragments, which are then compared against a database that contains all known protein folds. It is based on the observations that the number of possible folds found in nature is relatively small, and that over the past three years more than 90% of the structures submitted to the Protein Data Bank (PDB) have had a structural fold similar to that of an existing database member. The third method of structural modeling is ab initio, which means from the beginning. This method is based on the assumptions that all information about a protein’s structure is contained in its amino acid sequence, and that a protein will always fold into the state with the lowest free energy. Ab initio modeling is accomplished by generating multiple possible conformations for the protein of interest and then calculating which conformation has the lowest energy state, which is the most favorable in nature. While each of these modeling methods may be useful, homology modeling is the most reliable when quality templates are available, and so was used for this study.
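The alignment step that underlies homology modeling is usually summarized as a percent identity between the two aligned sequences. The following is a minimal Python sketch of that calculation, not the procedure of any particular alignment tool. It assumes the two sequences have already been aligned to equal length with `-` as the gap character, and it assumes one possible convention, counting only columns in which both sequences have a residue; other tools use different denominators. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap.

    Both inputs must come from the same pairwise alignment (equal length).
    The choice of denominator is an assumption; conventions differ between tools.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment: 4 matches over 5 gap-free columns.
toy_identity = percent_identity("MKT-AY", "MRTQAY")
```

With real proteins the aligned strings would come from an alignment program, and the resulting value is what the template selection criteria are applied to.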

In homology modeling, the most important step is choosing the best template with which to model the protein of interest. When selecting which homolog structure to use as a template, there are several criteria to consider. The first is how closely the sequence of the protein of interest matches the sequence of the potential template protein. The higher the percent identity between the protein of interest and the template protein, the more confident one can be in the predicted results. Ideally a threshold of 40% identity is used, but the higher the identity, the more reliable the predicted structure. A lower percent identity can be used if few template proteins exist. If the percent identity is between 25% and 35%, the template is considered to be in the “twilight zone” and the results are more likely to contain false positives. It is also important that this similarity is maintained across the entire length of the protein, not only in small regions (Rost). The second criterion is that the template protein should be a contiguous hit from N-terminus to C-terminus. If no such template exists, but the region of interest of the protein is known and a solved crystal structure of that region exists, that structure may be used. If no contiguous templates exist, it is possible to use overlapping templates, but this is not ideal. The third criterion is the resolution of the potential template structure. A resolution of 2.5 angstroms (Å) or better (that is, a numerically smaller value) is best, but structures with resolutions of up to 4.0 Å may be used, though the side chains may be modeled incorrectly because of the poor resolution of the template (Pevsner).
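The three template selection criteria can be expressed as a small screening routine. The Python sketch below is illustrative only and is not the procedure used in this thesis. The identity and resolution thresholds are the ones quoted in the text; the `Template` class, the function names, the candidate names, and the `min_coverage` cutoff of 0.9 are hypothetical assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Template:
    name: str                # hypothetical label for a candidate template
    percent_identity: float  # identity to the protein of interest, in percent
    coverage: float          # fraction of the target covered by one contiguous hit (0 to 1)
    resolution: float        # crystal structure resolution in angstroms (smaller is better)


def identity_class(percent_identity: float) -> str:
    """Classify identity using the thresholds quoted in the text."""
    if percent_identity >= 40.0:
        return "preferred"
    if percent_identity > 35.0:
        return "usable with caution"
    if percent_identity >= 25.0:
        return "twilight zone"
    return "below twilight zone"


def template_warnings(t: Template, min_coverage: float = 0.9) -> list:
    """Return the concerns raised by the three criteria (empty list if none).

    min_coverage is an illustrative assumption, not a value from the text.
    """
    warnings = []
    if identity_class(t.percent_identity) != "preferred":
        warnings.append("identity: " + identity_class(t.percent_identity))
    if t.coverage < min_coverage:
        warnings.append("hit does not span the target from N- to C-terminus")
    if t.resolution > 4.0:
        warnings.append("resolution worse than 4.0 angstroms")
    elif t.resolution > 2.5:
        warnings.append("side chains may be modeled incorrectly")
    return warnings


def rank_templates(templates):
    """Order candidates by higher identity first, then by better resolution."""
    return sorted(templates, key=lambda t: (-t.percent_identity, t.resolution))
```

A candidate that returns an empty warning list satisfies all three criteria as stated; candidates with warnings may still be usable when few templates exist, as noted above.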

In silico methods have proven useful in determining and predicting interactions of ligands. Using in silico methods, it is also possible to generate virtual mutations in proteins and assess their likely impact. In 2009, a study was published that used homology modeling to determine the structures of the ectodomains of human Toll-like receptors (TLRs) 7, 8, and 9. The authors found that these domains had a similar structure, and that this could be used to help interpret experimental data from these receptors (Wei et al.). More recently, homology modeling has been incorporated into a collection of tools for determining targets of T- and B-cell responses, the immune epitope database analysis resource (IEDB-AR), and has helped identify what type of immune response may occur to various substances (Kim et al.). In another study, saturation mutagenesis was performed on endothelial protein C receptor (EPCR) to assess the impact of each mutation on EPCR’s ability to interact with phosphatidylethanolamine (PTY). With the in silico methods, both single point mutations and double mutations could be analyzed, and the results revealed 12 single codon mutations and 9 double mutants of EPCR that could decrease its affinity for PTY (Chiappori, D'Ursi, Merelli, Milanesi, & Rovida). In 2014, a study was released that used multiple in silico methods to predict the effect of single nucleotide polymorphisms (SNPs) resulting in missense mutations on the structure of anaplastic lymphoma kinase (ALK). Using multiple systems, the authors were able to predict which missense mutations, and therefore which SNPs, would have the biggest impact on ALK structure (Priya Doss, Chakraborty, Chen, & Zhu). These studies demonstrate that the use of in silico methods is not only possible, but can also help to narrow the range of in vitro and in vivo experiments that need to be conducted.

In order to determine how two or more molecules interact with one another using in silico methods, docking software is often the preferred method. Docking software uses several criteria to determine possible interaction complexes. The first step is usually an assessment of shape complementarity. This is accomplished by holding one molecule stationary as the receptor while moving the other molecule as the ligand. The software moves the ligand in small increments while also rotating it through all potential orientations. It removes those predicted structures in which one molecule passes through the other, but this still leaves thousands of potential results. The software then narrows the results by finding the predicted interactions with the lowest electrostatic energy, that is, the energy arising from the interaction of stationary electric charges. The software then seeks the lowest desolvation free energy, which is a calculation of the energy required to break the bonds between the proteins and the surrounding solvent, and then to create new bonds between the two proteins and within the solvent. The software then clusters and ranks the predicted interaction interfaces based on which interactions have the most neighboring atoms, thus allowing for a stronger interaction (Kozakov et al.). Through the course of this thesis, modeling and docking studies were used to understand the interaction complex between the mouse restriction factor TRIM79 and the NS5 protein of flaviviruses.
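The final clustering and ranking step can be illustrated with a toy greedy clustering. This Python sketch is not the docking server's implementation. It assumes a precomputed matrix of pairwise distances between docked poses (for example, an interface RMSD) and ranks poses by how many neighboring poses fall within a chosen radius. The function name, the distance measure, and the radius are all assumptions made for the example.

```python
def cluster_poses(dist, radius):
    """Greedy clustering of docked poses by neighbor count.

    dist   -- symmetric matrix (list of lists) of pairwise pose distances,
              with zeros on the diagonal
    radius -- poses within this distance of each other count as neighbors
    Returns a list of (center_index, member_indices); each cluster center
    is the remaining pose with the most remaining neighbors.
    """
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        def n_neighbors(i):
            return sum(1 for j in remaining if j != i and dist[i][j] <= radius)

        # Ties are broken in favor of the lowest pose index.
        center = max(sorted(remaining), key=n_neighbors)
        members = sorted(j for j in remaining if dist[center][j] <= radius)
        clusters.append((center, members))
        remaining -= set(members)
    return clusters


# Toy data: poses 0-2 are mutually close, pose 3 is an outlier.
toy = [
    [0, 1, 1, 10],
    [1, 0, 1, 10],
    [1, 1, 0, 10],
    [10, 10, 10, 0],
]
print(cluster_poses(toy, radius=2.0))
```

Because the most populated neighborhood is removed first, the clusters are returned in order of decreasing size, which serves as the ranking in this sketch.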

Chapter Two

Methods

Homology Modeling.

To begin modeling our proteins of interest, we first determined whether there were any homologous proteins with known structures. To address this question, we turned to the National Center for Biotechnology Information (NCBI) and used the Basic Local Alignment Search Tool (BLAST). BLAST searches the repository of all known sequences to find the closest homologs (Altschul, Gish, Miller, Myers, & Lipman). We uploaded our protein sequences and ran BLAST against the Protein Data Bank (PDB) (Berman et al.), which is a repository of solved protein structures. We selected our template proteins based upon the criteria listed previously in the introduction to protein structural modeling. We then downloaded the structure files of our template proteins from the PDB and uploaded the amino acid sequence of the protein of interest, together with the structure file of the template protein, to the modeling server SWISS-MODEL (Biasini et al.). We then submitted this information to the server, where modeling proceeded as previously described.

Predicting the Interaction.

The ClusPro server was selected due to its prior success in predicting at least one near-native complex within its top 10 predicted interfaces (Kozakov et al.). We first designated one protein, TBEV NS5, as the ligand and the other, TRIM79, as the receptor. We then uploaded our structure files from the homology modeling to the ClusPro server (Comeau, Gatchell, Vajda, & Camacho). Once the server had completed its predictions, we selected the top 30 predicted interfaces to be investigated further. We also created a dimer of TRIM79 using the ClusPro server, as members of the TRIM family can dimerize to increase their activity level. Following the trend of previous studies, we chose the TRIM79 dimer with the proper orientation that was dominated by hydrophobic interactions. We then used this dimer as the receptor and TBEV NS5 as the ligand. The top 30 interfaces between the TRIM79 dimer and TBEV NS5 were also selected for further investigation. As with the monomer, we cross-referenced the predicted TBEV residues with their TRIM79 interaction partners and used the TRIM30 and TRIM79 alignment to narrow the list of candidates.

Selecting Key Residues.

Once the interactions were predicted by the ClusPro server, the results were narrowed to find the top predicted interacting residues. We utilized the Tk Console in VMD (Humphrey, Dalke, & Schulten) to write a script (Figure 1) that would generate a contact map. The contact map calculates the distance between the alpha carbon (Cα) of each amino acid of TBEV NS5 and the Cα of each amino acid of TRIM79. The program was written to output a file with all residue pairs that were within 10 Å of one another, because amino acids whose Cα atoms are more than 10 Å apart are highly unlikely to interact with one another. Without this cutoff value we would have a list of hundreds of thousands of interaction pairs. A multiple sequence alignment (MSA) was performed on NS5 from TBEV, LGTV, and WNV. These three were chosen because both TBEV and LGTV NS5 have been shown to interact with TRIM79, while WNV NS5 does not. For this MSA we used Clustal Omega (Sievers et al.), software that uses a point accepted mutation (PAM) matrix to align our three proteins. The residues that were completely conserved or highly similar between all three flaviviruses were eliminated from our list of mutagenesis candidates. If a residue differed between LGTV and TBEV, we removed it as a candidate, as both still interact with TRIM79. If residues shared the same general classification, as defined by Clustal Omega, a lower ranking was assigned. Residues that had a charged side chain on one NS5, but a neutral or oppositely charged side chain on the other NS5, were given a higher rank. Final ranking of candidates was assigned by counting the number of times a specific residue was predicted to be a part of the interaction interface, with a minimum value of 5 out of 30 occurrences. This left us with a list of twenty preliminary candidates. To further narrow the list, we eliminated the residues that were not part of the bait region used in a prior yeast two-hybrid screen (Taylor et al.). This narrowed our list of candidates to fourteen. We then looked at where these residues were predicted to interact with TRIM79 and performed a sequence alignment between TRIM79 and TRIM30, the most closely related TRIM protein that did not interact with TBEV NS5. If a TRIM79 residue was predicted, across all predicted interactions, to interact with one of the candidates, but that residue was conserved between TRIM79 and TRIM30, then we eliminated it as a candidate. This left us with 11 potential candidate residues. When we cross-referenced the top results from the dimer docking with the top results of the monomer docking, this further narrowed our list of final candidates to 8.
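The contact-map cutoff and the occurrence-based ranking described above can be sketched in Python. This is not the VMD script shown in Figure 1. It is a minimal stand-in that assumes the Cα coordinates have already been extracted into dictionaries, and the function names, the receptor label `R1`, and the toy coordinates are hypothetical.

```python
import math
from collections import Counter


def contact_pairs(ligand_ca, receptor_ca, cutoff=10.0):
    """List (ligand_residue, receptor_residue, distance) for C-alpha pairs
    no more than `cutoff` angstroms apart.

    ligand_ca / receptor_ca -- dict mapping a residue label to the (x, y, z)
    coordinates of that residue's C-alpha atom.
    """
    pairs = []
    for lig_res, lig_xyz in ligand_ca.items():
        for rec_res, rec_xyz in receptor_ca.items():
            d = math.dist(lig_xyz, rec_xyz)
            if d <= cutoff:
                pairs.append((lig_res, rec_res, round(d, 2)))
    return pairs


def rank_by_occurrence(interfaces, min_count=5):
    """Count how many docked models place each ligand residue in the interface.

    interfaces -- iterable of collections of ligand residue labels, one per model
    Returns (residue, count) tuples, most frequent first, keeping only
    residues seen in at least `min_count` models.
    """
    counts = Counter(res for iface in interfaces for res in set(iface))
    return [(res, n) for res, n in counts.most_common() if n >= min_count]


# Toy coordinates (hypothetical): one pair at exactly 10 angstroms, one farther apart.
ns5 = {"P95": (0.0, 0.0, 0.0), "K240": (30.0, 0.0, 0.0)}
trim79 = {"R1": (6.0, 8.0, 0.0)}
print(contact_pairs(ns5, trim79))

# Toy interface lists (hypothetical): G69 appears in 7 models, K108 in 5.
models = [{"G69", "K108"}] * 5 + [{"G69"}] * 2
print(rank_by_occurrence(models, min_count=5))
```

The default `min_count=5` mirrors the minimum of 5 out of 30 occurrences used for the final ranking.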

Calculating Interaction Energies.

To determine which interactions were the most likely and most favorable, the interaction energy for each predicted interaction was calculated. This was accomplished by solving the following equation for the change in Gibbs free energy, ΔE:

$$\Delta E = \sum E_{\text{complex}} - \sum E_{\text{ligand}} - \sum E_{\text{receptor}}$$

Each summation of energy was calculated in Swiss-PdbViewer 4.1.0 (Guex & Peitsch) using the program's Compute Energy tool. The energies for the ligand and receptor were calculated from files in which the position and orientation of each molecule were retained from the complex, but in which the molecule was not exposed to the force and influence of the other molecule. Once the change in free energy was calculated, the interactions were ranked by which one produced the most favorable result, and we then proceeded with in silico mutagenesis studies of that interaction.
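As a worked example of this equation, the short Python sketch below recomputes ΔE for two complexes from the component energies reported in Table 5. The function name is an assumption made for illustration, and the energies themselves come from Swiss-PdbViewer, not from this code.

```python
def interaction_energy(e_complex, e_ligand, e_receptor):
    """Delta E = E(complex) - E(ligand) - E(receptor), all in the same units."""
    return e_complex - e_ligand - e_receptor


# Component energies in kJ/mol from Table 5: (E-Complex, E-NS5, E-TRIM79)
table5 = {
    19: (-38847.8, -33156.3, -4201.5),
    0: (-38543.5, -33248.9, -4167.0),
}

for complex_id, energies in table5.items():
    delta_e = interaction_energy(*energies)
    print(f"Complex {complex_id}: dE = {delta_e:.1f} kJ/mol")
```

A more negative ΔE indicates a more favorable predicted interaction, which is why Complex 19 ranks first.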

In Silico Mutagenesis.

In silico mutagenesis was carried out utilizing the Mutate function in Swiss-PdbViewer (Guex & Peitsch). Initially, the pdb file containing only NS5 was loaded, the point mutation was created, and the energy was minimized to eliminate improper angles on the mutated residue. The energy was then calculated for the mutated NS5 alone. Next, the pdb file containing the NS5-TRIM79 complex was loaded and the same point mutation was created on NS5, ensuring that the mutated residue was in the identical conformation as in NS5 alone. The energy for the complex was then calculated, followed by the calculation of the change in Gibbs free energy for the mutated complex. This procedure was carried out for all predicted residues in the most favorable interaction complex. We then calculated the change in ΔE (ΔΔE) for each mutant, with the wildtype ΔE as our baseline. A positive ΔΔE indicates that the mutation destabilized the interaction, whereas a negative ΔΔE indicates that the mutation strengthened the interaction.
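The ΔΔE comparison can be illustrated with the values reported in Table 6. The Python sketch below is a minimal example. The function names are assumptions, and the conversion factor of 4.184 kJ per kcal is the standard thermochemical value.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie


def delta_delta_e(de_mutant, de_wildtype):
    """Change in interaction energy relative to the wildtype baseline."""
    return de_mutant - de_wildtype


def classify(dde):
    """Positive values destabilize the complex; negative values stabilize it."""
    if dde > 0:
        return "destabilizing"
    if dde < 0:
        return "stabilizing"
    return "neutral"


# Delta E values in kJ/mol from Table 6
de_wildtype = -1490.0
mutants = {"P95K": 1533.6, "L119Y": -1704.8}

for name, de in mutants.items():
    dde = delta_delta_e(de, de_wildtype)
    print(f"{name}: ddE = {dde:.1f} kJ/mol "
          f"({dde / KJ_PER_KCAL:.1f} kcal/mol), {classify(dde)}")
```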

Eukaryotic Linear Motif.

The predicted residues were compared against the results of the Eukaryotic Linear Motif (ELM) resource (Dinkel et al.) to see if they fell within predicted motifs that may have been absent in the other NS5 proteins. ELM uses experimentally confirmed motifs as the foundation of its database. The database contains curated cleavage, docking, ligand, modification, and target sites. The amino acid sequences of our NS5 proteins were uploaded to the ELM server. The server then runs each amino acid sequence against its database of manually curated short linear motifs (SLiMs) to find any matches.

Chapter Three

Results

Structural Modeling.

To build the initial three-dimensional structure of TBEV NS5 with SWISS-MODEL, we selected the PDB file 4K6M (Lu & Gong), which is a crystal structure of Japanese encephalitis virus (JEV) NS5. The identity between JEV NS5 and TBEV NS5 is 57%, with a similarity of 73% (Figure 2), and the resolution of the template is 2.6 Å, making it fall well within the guidelines for template selection outlined in the introduction. Our results (Figure 3) indicated that TBEV NS5 very likely follows the same basic structure as other solved NS5 domains, with the distinct RNA-dependent RNA polymerase (RdRP) domain containing a cleft with high affinity for binding single-stranded RNA templates (Figure 3C). The MTase domain also contained a curved “cove” area consistent with other solved flavivirus MTases (Figure 3D). The domain also shares the structural characteristics of class I AdoMet-dependent MTases (Schubert, Blumenthal, & Cheng).

To confirm our template choice for TBEV NS5, we also used 4K6M (Lu & Gong) as a template to model WNV NS5, and then compared the modeled version of the WNV domains to the solved crystal structures by means of a root-mean-square deviation (RMSD) calculation. An RMSD of 0.75 Å was obtained between the modeled structure and the crystal structure. We then compared the generated structure of TBEV NS5 to the solved structure of WNV NS5 but found few significant differences in structure (Figure 4). Because WNV NS5 does not interact with TRIM79, while TBEV and LGTV NS5 both do, we then looked at the alignment of TBEV, WNV, and LGTV NS5 (Figure 5) to identify residues that varied between WNV and TBEV, but were conserved between TBEV and LGTV, and mapped these residues onto the three-dimensional structure (Figure 6) for further study once the interaction interface was predicted.
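The alignment filter described here (conserved between TBEV and LGTV, different in WNV) can be sketched as a column-by-column scan of the aligned sequences. The Python example below is a simplified illustration. It tests only for exact identity, whereas the analysis in this thesis also took residue similarity classes into account, and the function name and the short toy sequences are hypothetical, not real NS5 sequences.

```python
def differential_positions(tbev, lgtv, wnv):
    """Find alignment columns where TBEV and LGTV agree but WNV differs.

    The three arguments are rows of one multiple sequence alignment
    (equal length, '-' for gaps). Columns with a gap in any row are skipped.
    Returns (tbev_residue_number, tbev_or_lgtv_residue, wnv_residue) tuples,
    numbered along the ungapped TBEV sequence starting at 1.
    """
    if not (len(tbev) == len(lgtv) == len(wnv)):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    tbev_pos = 0
    for a, b, c in zip(tbev, lgtv, wnv):
        if a != "-":
            tbev_pos += 1
        if "-" in (a, b, c):
            continue
        if a == b and a != c:
            hits.append((tbev_pos, a, c))
    return hits


# Toy alignment rows (hypothetical)
print(differential_positions("GK-LA", "GKTLA", "RK-LV"))
```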

Predicting the complete structure of TRIM79 proved to be more of a challenge, as our BLAST results failed to find a high-quality template that was contiguous from the N- to the C-terminus. We submitted the amino acid sequence to both a fold-comparison modeler, RaptorX (Källberg et al.), and an ab initio modeler, I-TASSER (Zhang), to see if they could be used to determine the full-length structure of TRIM79. When the results of these two methods contradicted one another, we focused on the region of TRIM79 that we knew to be important since, as stated previously, removal of the RING domain did not impair its ability to bind to TBEV NS5. We also previously observed that removal of the SPRY domain did not impair NS5 binding. For this reason, we could safely proceed with homology modeling using a template that covered just the B-box, coiled-coil, and partial SPRY domain, so we selected 4TN3 (Goldstone et al.), which is a crystal structure of the B-box and coiled-coil region of TRIM5α from rhesus macaque. The resolution of our template protein was 3.2 Å, and it has an identity of 45% and a similarity of 64% (Figure 7), keeping it within the desired parameters. The resulting structure (Figure 8) was very similar to the prediction from RaptorX. We next examined an alignment of TRIM79 to TRIM30 (Figure 9) for context once the interaction interface was predicted. We also predicted the potential structure of a TRIM79 dimer, as some TRIM proteins function more efficiently as dimers, taking into account that these dimers are usually oriented in an anti-parallel configuration (Sanchez et al.). However, TRIM79 has been found to create multimers beyond dimers, and so our focus was on how a TRIM79 monomer would form an interaction complex with TBEV NS5.

Interaction Complex Modeling.

Once we had our predicted NS5 and TRIM79 structures, we modeled the potential interaction complexes that might occur. To analyze the predicted interfaces more expeditiously, we created a contact map (Table 1) to help predict which residues were interacting with one another. Some predicted complexes could be discarded immediately, either because they involved domains of the protein that were not included in the truncation studies or because they did not have any unique interaction pairs. However, this still left us with several potential interaction interfaces (Figure 10). Once we calculated all the contact maps, we narrowed the residues as previously described and were left with a primary list of candidates (Table 2). These candidates included a putative SUMOylation site, K74, as found by the ELM resource, that is not present in WNV or JEV, and two other lysine residues, K108 and K240, that might be modified by ubiquitin. When we further narrowed the list of candidates by cross-referencing the TBEV NS5 candidates with the TRIM79-TRIM30 alignment to find unique residues, we obtained our second list of candidates (Table 3). Importantly, both K74 and K108 were retained as potential candidates. We narrowed our pool of candidates even further by looking at the predicted interaction interfaces with the TRIM79 dimer (Table 4), and were left with eight mutagenesis candidates, with K74 and K108 still retained on the list. Significantly, SUMO interacting motifs (SIMs) were located proximal to the SUMO motif on NS5, and TRIM79 was also found to have a SIM in the targeted area as well as a putative SUMO site. While JEV and WNV also contain motifs for SUMO and SIMs, they are located outside of the TRIM79-binding target area of NS5. Additionally, TBEV NS5 residue G69 was found to be a part of a LIR motif, which plays a role in selective autophagy, and a similar motif was found in the targeted region of TRIM79. It is therefore possible that these ligands (LIR and SUMO) could serve as intermediates or scaffolds for the interaction between TRIM79 and TBEV NS5, while not being vital to complex formation. The final candidates were then mapped onto the TBEV structure (Figure 11).

Modeling Effects of Mutation.

To assess which mutations would have the most impact on disrupting the interaction between TBEV NS5 and TRIM79, we calculated the energy of the interactions as previously described (Table 5). Complex 19 (Figure 12) had the most favorable result, and that interaction yielded three residues that had been part of our primary and secondary lists of candidates as well as three residues that were not on our primary list of candidates. The in silico point mutations were then performed and the impact of those mutations on the interaction interface was calculated (Table 6 and Figure 13). The values were then normalized with the wildtype serving as the baseline, and the change in energy relative to the baseline was calculated for each point mutation (Table 6 and Figure 14). We then considered the Coulomb interaction equation, which states:

$$E = \frac{q_1 q_2}{4 \pi \varepsilon r}$$

where $q_1$ and $q_2$ are the interacting charges and $r$ is the distance between them. The permittivity ε can be expanded as ε0D, where D is the dielectric constant. Because the energy calculations are done with GROMOS96 (van Gunsteren et al.), they are conducted as if the molecules were suspended in the gas phase, where the dielectric constant is equal to one. Since our structures interact as proteins, and not as gases, we must account for this by changing the dielectric constant. The dielectric constant for proteins is generally believed to fall between that of methanol (dielectric constant of 4) and water (dielectric constant of 80). Dividing our results for ΔΔE by the corresponding dielectric constant, we find that the energy calculations fall within the expected range of 10 kcal/mol to 100 kcal/mol. Based on our calculations, only one mutation, L119Y, led to an increase in stability of the complex. While all the other point mutations decreased the predicted stability of the complex, the one that had the largest impact by far was P95K (ΔΔE > 720 kcal/mol). When adjusted for a dielectric constant of 4 (methanol), ΔΔE was 180 kcal/mol. A ΔΔE of 9 kcal/mol was obtained when using a dielectric constant of 80 (water). This result was similar to the change in energy resulting from the interaction of IL-4 and its receptor (Kumar & Gromiha). The effect of the P95K mutation was then tested on the next four complexes with the most favorable energy calculations, complexes 0, 29, 20, and 12. The results (Table 7) indicated that this point mutation did not have as large an impact on the other interaction complexes. This may be because P95 was not predicted to be a part of the interaction interface in these complexes.
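The dielectric adjustment amounts to dividing the gas-phase ΔΔE by the assumed dielectric constant. The Python sketch below reproduces that arithmetic for P95K using the 722.7 kcal/mol value from Table 6. The function name is an assumption, and the two dielectric constants are the bounds quoted in the text.

```python
def scale_by_dielectric(dde_gas_phase, dielectric):
    """Rescale a gas-phase (D = 1) electrostatic energy for a medium with
    dielectric constant D, using the 1/D dependence of the Coulomb term."""
    if dielectric <= 0:
        raise ValueError("dielectric constant must be positive")
    return dde_gas_phase / dielectric


dde_p95k = 722.7  # kcal/mol, gas-phase value from Table 6

for label, d in (("lower bound, D = 4", 4.0), ("water, D = 80", 80.0)):
    print(f"{label}: {scale_by_dielectric(dde_p95k, d):.1f} kcal/mol")
```

The two results, roughly 181 and 9 kcal/mol, bracket the effect of P95K under the assumed range of protein dielectric constants.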

Summary.

In the fight against emerging global flaviviruses, the challenge remains in dissecting interactions between the virus and the host so that we can find novel means of preventing and treating the debilitating diseases brought on by these viruses. By having a known potent antiviral protein, and knowing the dynamics of its interaction with the virus, we can design targeted therapeutics that can be used to fight the progression of these diseases and prevent their sequelae from becoming an issue. The difference may come down to studying the differences between the vectors of these viruses and how their internal environment can impact the robustness of these pathogens. While we have new insights as to how this interaction may occur, the war rages on, with humanity equipped with more knowledge than it had before.

Chapter Four

Discussion

Our study showed that, given an adequate template, it was possible to create a model of TBEV NS5 that we could use to visualize the structure. We could then use this structure to perform further studies about which residues were on the surface and, therefore, able to interact with other proteins. We were also able to determine a preliminary structure for TRIM79 that could be used in our study of the interaction complex. These structures then allowed for the prediction of how the interaction complex could form between the monomers of TBEV NS5 and TRIM79.

This study demonstrated that knowledge of the three-dimensional structure, when combined with sequence alignments, could narrow our list of potential candidates from over 60 to fewer than 15. This reduces the number of in vitro studies that need to be performed. The energy calculations allow us to put forth candidates based on the impact of each mutation on the stability of the proposed complex.

Our study indicates that the point mutation P95K should be pursued further, by testing in vitro whether a measurable difference in outcome is observed when this mutation is present. However, it is possible that the mutation of a single residue does not sufficiently disrupt the interaction complex. If this is the case, the mutations N50T and M51G should also be made. These mutations, when combined with P95K, may have an additive effect and lead to a more potent disruption of the interaction complex.

Figure 1. Code used to generate the contact map. The code in red is determined by the modeling software used. The first red text is the pdb file generated by the docking software, while the second two are the segment names designated for the ligand (LA0) and the receptor (RA0). The code in green is designated by the user and is the output file for the contact map data.


Figure 2. Sequence alignment of JEV NS5 and TBEV NS5 by BLAST. The query sequence is JEV, while the subject sequence is TBEV.

Figure 3A. Homology modeled structure of TBEV NS5 utilizing the solved crystal structure of JEV NS5. TBEV NS5 and JEV NS5 are 57% identical, with a similarity score of 73% utilizing the BLOSUM62 matrix. The RdRP domain is in red, while the MTase is in green.

Figure 3B. As in Figure 3A, but rotated 180° about the vertical axis.

Figure 3C. As in Figure 3A, but in a space filling representation. This angle highlights the cleft in the RdRP domain that has a high affinity for binding single stranded RNA templates.

Figure 3D. As in Figure 3C, but rotated 90° about the vertical axis. This angle highlights the “cove” formed in the MTase domain.

Figure 3E. As in Figure 3B, but in space filling representation. Rotation angle is measured from Figure 3C.

Figure 4A. Structural comparison of TBEV NS5 and WNV NS5 modeled on the TBEV NS5 structure. Areas of structural identity are blue, while areas of structural similarity are green. Areas in red are those with the greatest difference using the BLOSUM62 matrix.

Figure 4B. As in Figure 4A, but rotated 180° about the vertical axis.

Figure 4C. As in Figure 4A, but in a space filling representation.

Figure 4D. As in Figure 4B, but in space filling representation.

Figure 5. Multiple sequence alignment (MSA) of WNV NS5, LGTV NS5, and TBEV NS5 done utilizing the CLUSTAL Omega program on the European Bioinformatics Institute (EBI) server


Figure 5 (cont.). Multiple sequence alignment (MSA) of WNV NS5, LGTV NS5, and TBEV NS5 done utilizing the CLUSTAL Omega program on the European Bioinformatics Institute (EBI) server


Figure 6. Structure of TBEV NS5 color coded as follows: MTase domain in red, RdRP in green. Residues in blue indicate potential TRIM79 interaction partners, as they are not conserved between WNV and TBEV, but are conserved between TBEV and LGTV.


Figure 7. Sequence alignment of TRIM5α and TRIM79 by BLAST. Query sequence is TRIM5α, and the subject sequence is TRIM79.

Figure 8A. Homology modeled partial structure of mouse TRIM79 utilizing the solved partial crystal structure of rhesus macaque TRIM5α. TRIM79 and TRIM5α are 45% identical, with a similarity score of 64% utilizing the BLOSUM62 matrix. The B-box domain is in red, coiled-coil domain in cyan, and SPRY domain in purple. The area modeled is designated by the dotted box.

Figure 8B. As in Figure 8A, but rotated 180° about the vertical axis.

Figure 8C. As in Figure 8A, but in space filling representation.

Figure 8D. As in Figure 8C, but rotated 180° about the vertical axis.

Figure 9. Clustal Omega alignment of TRIM79 and TRIM30, the two closest homologs that show specificity in binding and not binding to TBEV NS5, respectively. They are 80% identical with a 90% similarity using a BLOSUM62 scoring matrix. Domains, as established by UniProt, are highlighted as follows: RING in magenta, B-box in cyan, coiled-coil in green, and PRY/SPRY in yellow.


Figure 10. Top four predicted interaction interfaces. Molecules are colored as follows: TBEV NS5 in green and TRIM79 in blue.

Table 2. Preliminary list of mutagenesis candidates* with the corresponding amino acid mutations

Rank  TBEV/LGTV residue  Residue number  WNV residue
1     G                  69              R
2     K                  108             P
3     L                  44              R
4     E                  66              V
5     A                  96              R
6     P                  95              K
7     A                  63              R
8     K                  74              V
9     M                  98              Q
10    T                  177             R
11    K                  240             V
12    P                  137             S
13    M                  51              G
14    L                  43              A

*Only residues that were included in the truncated region that interacted were considered. Candidates are arbitrarily ranked by the number of modeled interfaces they were predicted to occur in.

Table 3. List of mutagenesis candidates with the corresponding amino acid mutations from the TRIM79 monomer

Rank  TBEV/LGTV residue  Residue number  WNV residue
1     G                  69              R
2     K                  108             P
3     L                  44              R
4     E                  66              V
5     A                  96              R
6     P                  95              K
7     K                  74              V
8     M                  98              Q
9     T                  177             R
10    M                  51              G
11    L                  43              A

The list was determined by accounting for the differences between WNV NS5 and TBEV/LGTV NS5 and cross-referencing those with residues that differed between TRIM79 and TRIM30.

Table 4. Final list of mutagenesis candidates with the corresponding amino acid mutations from docking the TRIM79 dimer

Rank  TBEV/LGTV residue  Residue number  WNV residue
1     G                  69              R
2     K                  108             P
3     L                  44              R
4     E                  66              V
5     A                  96              R
6     P                  95              K
7     K                  74              V
8     T                  177             R

The list was determined by accounting for the differences between WNV NS5 and TBEV/LGTV NS5, cross-referencing those with residues that differed between TRIM79 and TRIM30, and cross-referencing the results from the TRIM79 dimer and the TRIM79 monomer.

Figure 11. Structure of TBEV NS5 color coded as follows: MTase domain in red, RdRP in green. Residues in blue indicate potential TRIM79 interaction partners, as they are not conserved between WNV and TBEV, but are conserved between TBEV and LGTV. Candidate residues in yellow.

Table 5. Calculations of the change in energy (ΔE) for all predicted interaction complexes that contained residues in the correct region of interaction.

Complex #  E-Complex (kJ/mol)  E-NS5 (kJ/mol)  E-TRIM79 (kJ/mol)  ΔE (kJ/mol)
19         -38847.8            -33156.3        -4201.5            -1490.0
0          -38543.5            -33248.9        -4167.0            -1127.6
29         -38343.6            -33072.8        -4148.6            -1122.1
20         -37042.7            -31790.3        -4167.7            -1084.7
12         -38319.2            -33176.2        -4092.8            -1050.3
18         -38514.3            -33405.5        -4082.0            -1026.8
9          -38769.2            -33328.4        -4427.6            -1013.2
17         -38219.2            -33140.3        -4087.8            -991.0
7          -37881.1            -32841.6        -4059.1            -980.3
2          -38573.3            -33676.2        -3936.8            -960.3
22         -38543.0            -33459.3        -4151.8            -931.9
10         -37574.5            -32655.0        -4031.0            -888.4
13         -38656.6            -33606.3        -4170.9            -879.4
3          -37783.8            -33022.3        -3901.4            -860.1
8          -38760.9            -33536.9        -4373.8            -850.2
21         -38126.3            -33059.4        -4226.1            -840.8
14         -38284.3            -33348.3        -4127.6            -808.3
16         -38334.5            -33539.1        -4011.5            -784.0
15         -38025.9            -33318.4        -3968.6            -738.9
11         -38114.1            -33492.5        -3884.4            -737.3
27         -38398.3            -33690.8        -4039.1            -668.4
4          451237184.0         -32728.9        277.8              451269635.2


Figure 12. A model of Complex 19 with TRIM79 in blue and TBEV NS5 in green.

Table 6. Calculation of the change in free energy (ΔE) for each point mutation.

Specimen  E-Complex (kJ/mol)  E-NS5 (kJ/mol)  E-TRIM79 (kJ/mol)  ΔE (kJ/mol)  ΔΔE (kJ/mol)  ΔΔE (kcal/mol)
Wildtype  -38847.8            -33156.3        -4201.5            -1490.0      0.0           0.0
R46E      -38618.0            -32931.1        -4201.5            -1485.3      4.7           1.1
N50T      -38422.5            -32970.3        -4201.5            -1250.7      239.3         57.2
M51G      -38734.9            -33273.0        -4201.5            -1260.3      229.7         54.9
E66V      -38838.4            -33156.9        -4201.5            -1480.0      10.0          2.4
P95K      -35857.1            -33189.2        -4201.5            1533.6       3023.6        722.7
L119Y     -38444.5            -32538.2        -4201.5            -1704.8      -214.8        -51.3

[Figure 13: bar chart titled "ΔE for point mutations"; y-axis ΔE (kJ/mol), from -2000 to 2000; x-axis categories Wildtype, R46E, N50T, M51G, E66V, P95K, L119Y.]

Figure 13. Graphical representation of the change in energy caused by the indicated point mutations to TBEV NS5.

[Figure 14: bar chart titled "ΔΔE for point mutations"; y-axis ΔΔE (kJ/mol), from -500 to 3500; x-axis categories Wildtype, R46E, N50T, M51G, E66V, P95K, L119Y.]

Figure 14. Graphical representation of the change in energy relative to the change in energy seen in the wildtype caused by the indicated point mutations to TBEV NS5. This was calculated by subtracting the change in energy seen in the wildtype complex from the change in energy seen in the complex where the point mutation occurred.

Table 7. Calculation of the impact of the point mutation P95K on the remaining top energetically favorable interaction complexes

Complex  E-Complex (kJ/mol)  E-NS5 (kJ/mol)  E-TRIM79 (kJ/mol)  ΔE (kcal/mol)  ΔΔE (kcal/mol)
0        -38574.8            -33285.8        -4167.0            -268.2         1.3
29       -38379.3            -33101.5        -4148.6            -269.9         -1.7
20       -37076.1            -31828.4        -4167.7            -258.1         1.1
12       -38345.2            -33206.9        -4092.8            -249.9         1.1


References

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., & Lipman, D. J. (1990). Basic local alignment search tool. Journal of molecular biology, 215(3), 403-410.

Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., . . . Bourne, P. E. (2000). The protein data bank. Nucleic acids research, 28(1), 235-242.

Best, S. M., Morris, K. L., Shannon, J. G., Robertson, S. J., Mitzel, D. N., Park, G. S., . . . Bloom, M. E. (2005). Inhibition of interferon-stimulated JAK-STAT signaling by a tick-borne flavivirus and identification of NS5 as an interferon antagonist. J Virol, 79(20), 12828-12839. doi:10.1128/JVI.79.20.12828-12839.2005

Biasini, M., Bienert, S., Waterhouse, A., Arnold, K., Studer, G., Schmidt, T., . . . Bordoli, L. (2014). SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic acids research, gku340.

Chiappori, F., D'Ursi, P., Merelli, I., Milanesi, L., & Rovida, E. (2009). In silico saturation mutagenesis and docking screening for the analysis of protein-ligand interaction: the Endothelial Protein C Receptor case study. BMC bioinformatics, 10(Suppl 12), S3.

Chu, Y., & Yang, X. (2011). SUMO E3 ligase activity of TRIM proteins. Oncogene, 30(9), 1108-1116. doi:10.1038/onc.2010.462

Comeau, S. R., Gatchell, D. W., Vajda, S., & Camacho, C. J. (2003). ClusPro: an automated docking and discrimination method for the prediction of protein complexes. Bioinformatics, 20(1), 45-50. doi:10.1093/bioinformatics/btg371

Dinkel, H., Van Roey, K., Michael, S., Kumar, M., Uyar, B., Altenberg, B., . . . Behrendt, A. (2015). ELM 2016—data update and new functionality of the eukaryotic linear motif resource. Nucleic acids research, gkv1291.

Dong, H., Fink, K., Zust, R., Lim, S. P., Qin, C. F., & Shi, P. Y. (2014). Flavivirus RNA methylation. J Gen Virol, 95(Pt 4), 763-778. doi:10.1099/vir.0.062208-0

Eldin, P., Papon, L., Oteiza, A., Brocchi, E., Lawson, T. G., & Mechti, N. (2009). TRIM22 E3 ubiquitin ligase activity is required to mediate antiviral activity against encephalomyocarditis virus. J Gen Virol, 90(Pt 3), 536-545. doi:10.1099/vir.0.006288-0


Goldstone, D. C., Walker, P. A., Calder, L. J., Coombs, P. J., Kirkpatrick, J., Ball, N. J., . . . Stoye, J. P. (2014). Structural studies of postentry restriction factors reveal antiparallel dimers that enable avid binding to the HIV-1 capsid lattice. Proceedings of the National Academy of Sciences, 111(26), 9609-9614.

Grutter, M. G., & Luban, J. (2012). TRIM5 structure, HIV-1 capsid recognition, and innate immune signaling. Curr Opin Virol, 2(2), 142-150. doi:10.1016/j.coviro.2012.02.003

Guex, N., & Peitsch, M. (1996). Swiss-PdbViewer: a fast and easy-to-use PDB viewer for Macintosh and PC. Protein Data Bank Quarterly Newsletter, 77(7).

Heinz, F. X., & Stiasny, K. (2012). Flaviviruses and flavivirus vaccines. Vaccine, 30(29), 4301-4306.

Humphrey, W., Dalke, A., & Schulten, K. (1996). VMD: visual molecular dynamics. Journal of molecular graphics, 14(1), 33-38.

Källberg, M., Wang, H., Wang, S., Peng, J., Wang, Z., Lu, H., & Xu, J. (2012). Template-based protein structure modeling using the RaptorX web server. Nature protocols, 7(8), 1511-1522.

Khadka, S., Vangeloff, A. D., Zhang, C., Siddavatam, P., Heaton, N. S., Wang, L., . . . LaCount, D. J. (2011). A physical interaction network of dengue virus and human proteins. Mol Cell Proteomics, 10(12), M111 012187. doi:10.1074/mcp.M111.012187

Kim, Y., Ponomarenko, J., Zhu, Z., Tamang, D., Wang, P., Greenbaum, J., . . . Bourne, P. E. (2012). Immune epitope database analysis resource. Nucleic acids research, gks438.

Knipe, D., Howley, P. M., Griffin, D., Lamb, R., Martin, M., & Roizman, B. (2001). Fields virology, vol. 1. Philadelphia (EUA): Lippincott Williams & Wilkins.

Kozakov, D., Beglov, D., Bohnuud, T., Mottarella, S. E., Xia, B., Hall, D. R., & Vajda, S. (2013). How good is automated protein docking? Proteins, 81(12), 2159-2166. doi:10.1002/prot.24403

Kumar, M. S., & Gromiha, M. M. (2006). PINT: protein–protein interactions thermodynamic database. Nucleic acids research, 34(suppl 1), D195-D198.

Leung, J. Y., Pijlman, G. P., Kondratieva, N., Hyde, J., Mackenzie, J. M., & Khromykh, A. A. (2008). Role of nonstructural protein NS2A in flavivirus assembly. J Virol, 82(10), 4731-4741. doi:10.1128/JVI.00002-08

Lindenbach, B. D., & Rice, C. M. (2003). Molecular biology of flaviviruses. Adv Virus Res, 59, 23-61.

Lu, G., & Gong, P. (2013). Crystal Structure of the full-length Japanese encephalitis virus NS5 reveals a conserved methyltransferase-polymerase interface. PLoS Pathog, 9(8), e1003549.

Meroni, G. (2012). TRIM/RBCC Proteins: Springer.

Miranda, M., & Sorkin, A. (2007). Regulation of receptors and transporters by ubiquitination: new insights into surprisingly similar mechanisms. Molecular interventions, 7(3), 157.

Nazmi, A., Mukherjee, S., Kundu, K., Dutta, K., Mahadevan, A., Shankar, S. K., & Basu, A. (2014). TLR7 is a key regulator of innate immunity against Japanese encephalitis virus infection. Neurobiol Dis, 69, 235-247. doi:10.1016/j.nbd.2014.05.036

Ozato, K., Shin, D. M., Chang, T. H., & Morse, H. C., 3rd. (2008). TRIM family proteins and their emerging roles in innate immunity. Nat Rev Immunol, 8(11), 849-860. doi:10.1038/nri2413

Pevsner, J. (2015). Bioinformatics and functional genomics: John Wiley & Sons.

Priya Doss, C. G., Chakraborty, C., Chen, L., & Zhu, H. (2014). Integrating in silico prediction methods, molecular docking, and molecular dynamics simulation to predict the impact of ALK missense mutations in structural perspective. BioMed research international, 2014.

Robertson, S. J., Mitzel, D. N., Taylor, R. T., Best, S. M., & Bloom, M. E. (2009). Tick-borne flaviviruses: dissecting host immune responses and virus countermeasures. Immunol Res, 43(1-3), 172-186. doi:10.1007/s12026-008-8065-6

Rost, B. (1999). Twilight zone of protein sequence alignments. Protein engineering, 12(2), 85-94.

Sanchez, J. G., Okreglicka, K., Chandrasekaran, V., Welker, J. M., Sundquist, W. I., & Pornillos, O. (2014). The tripartite motif coiled-coil is an elongated antiparallel hairpin dimer. Proceedings of the National Academy of Sciences, 111(7), 2494-2499.

Schubert, H. L., Blumenthal, R. M., & Cheng, X. (2003). Many paths to methyltransfer: a chronicle of convergence. Trends in Biochemical Sciences, 28(6), 329-335. doi:10.1016/s0968-0004(03)00090-2

Sievers, F., Wilm, A., Dineen, D., Gibson, T. J., Karplus, K., Li, W., . . . Söding, J. (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular systems biology, 7(1), 539.

Smit, J. M., Moesker, B., Rodenhuis-Zybert, I., & Wilschut, J. (2011). Flavivirus cell entry and membrane fusion. Viruses, 3(2), 160-171. doi:10.3390/v3020160

Taylor, R. T., Lubick, K. J., Robertson, S. J., Broughton, J. P., Bloom, M. E., Bresnahan, W. A., & Best, S. M. (2011). TRIM79alpha, an interferon-stimulated gene product, restricts tick-borne encephalitis virus replication by degrading the viral RNA polymerase. Cell Host Microbe, 10(3), 185-196. doi:10.1016/j.chom.2011.08.004

Tsai, C. Y., Liong, K. H., Gunalan, M. G., Li, N., Lim, D. S., Fisher, D. A., . . . Wong, S. B. (2015). Type I IFNs and IL-18 regulate the antiviral response of primary human gammadelta T cells against dendritic cells infected with Dengue virus. J Immunol, 194(8), 3890-3900. doi:10.4049/jimmunol.1303343

van Gunsteren, W. F., Billeter, S., Eising, A., Hünenberger, P. H., Krüger, P., Mark, A. E., . . . Tironi, I. G. (1996). Biomolecular simulation: the GROMOS96 manual and user guide.

Wei, T., Gong, J., Jamitzky, F., Heckl, W. M., Stark, R. W., & Rössle, S. C. (2009). Homology modeling of human Toll-like receptors TLR7, 8, and 9 ligand-binding domains. Protein Science, 18(8), 1684-1691.

Wilkinson, K. A., & Henley, J. M. (2010). Mechanisms, regulation and consequences of protein SUMOylation. Biochemical Journal, 428(2), 133-145.

Youn, S., Ambrose, R. L., Mackenzie, J. M., & Diamond, M. S. (2013). Non-structural protein-1 is required for West Nile virus replication complex formation and viral RNA synthesis. Virol J, 10, 339. doi:10.1186/1743-422X-10-339

Zhang, Y. (2008). I-TASSER server for protein 3D structure prediction. BMC bioinformatics, 9(1), 1.

Zmurko, J., Neyts, J., & Dallmeier, K. (2015). Flaviviral NS4b, chameleon and jack-in-the-box roles in viral replication and pathogenesis, and a molecular target for antiviral intervention. Rev Med Virol, 25(4), 205-223. doi:10.1002/rmv.1835

Zou, J., Xie, X., Wang, Q. Y., Dong, H., Lee, M. Y., Kang, C., . . . Shi, P. Y. (2015). Characterization of dengue virus NS4A and NS4B protein interaction. J Virol, 89(7), 3455-3470. doi:10.1128/JVI.03453-14
