
CHAPTER 1 INTRODUCTION AND REVIEW OF LITERATURE

Part 1

1.1. Positive sense single stranded RNA replication

Viruses containing single stranded positive-sense (+ve) RNA genomes are the largest group of viruses. These include major pathogens of vertebrate and invertebrate animals, fungi, bacteria and plants. The ability of RNA viruses to mutate at a high rate helps them adapt to varied environmental conditions and also contributes to virus evolution (Domingo et al., 1999).

1.2. Gene expression in positive sense RNA viruses

The +ve sense RNA viruses complete their life cycle in the cytoplasm of infected cells. Following uncoating, the +ve sense genome is directly translated by the host cell translational machinery to produce viral proteins. The translated product is one or several large polyproteins, each comprising several concatenated proteins that must be processed by host and/or virally encoded proteases to form structural and non-structural components. Upon entry into host cells, only the first open reading frame (ORF) in the viral genome is efficiently translated. Downstream ORFs need to be expressed either via novel translation events or by synthesizing subgenomic mRNA (sgRNA).

1.2.1. Subgenomic mRNAs

The first ORF on the viral genome is directly translated to produce non-structural proteins that include RNA dependent RNA polymerase (RdRp). The RdRp, with the help of other viral and host proteins, transcribes downstream ORFs as sgRNAs to produce structural proteins or proteins needed in intermediate or late stages of infection. The sgRNAs usually have the same 3’end as the genomic RNA but may have different 5’ends. All sgRNAs are produced via subgenomic promoters (SgP) present within the genomic negative (-ve) sense RNA or antigenome. The lengths of SgPs can range from 24 nucleotides (nt) to over 100nt (Koev and Miller 2000; Wang and Simon 1997; van der Vossen et al., 1995; Balmori et al., 1993). In some viruses, including Brome mosaic virus (BMV), the subgenomic promoter is shorter than 100nt. Although subgenomic promoter features are diverse and complex, they share a common characteristic: at least a small part of the promoter sequence lies just upstream of the 5’end of the resulting sgRNA (Miller and Koev 2000).
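The shared 3’end of sgRNAs can be pictured with a toy model. The Python sketch below treats the genome as a plain string and assumes, purely for illustration, that each sgRNA begins at a given internal position and runs to the genomic 3’end. The sequence and the start positions are invented and do not correspond to any virus.

```python
def make_sgrnas(genome, start_sites):
    """Return toy sgRNAs that begin at each start site and share the genomic 3' end."""
    return [genome[start:] for start in sorted(start_sites)]


# Hypothetical 30-nt "genome" and two hypothetical sgRNA start sites (0-based).
toy_genome = "GUAAACCGGAUUUCCAUGGCAUCGAUUAGC"
toy_sgrnas = make_sgrnas(toy_genome, [10, 20])

for sg in toy_sgrnas:
    # Each sgRNA is a suffix of the genome: same 3' end, different 5' end.
    print(len(sg), sg, toy_genome.endswith(sg))
```

The model ignores promoters, leader sequences and strand polarity; it only shows why such sgRNAs form a 3’-coterminal set.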

In some viruses, sgRNA promoters may be located partially downstream of the 5’end of the sgRNA, distantly upstream, or even on a separate RNA molecule. The SgPs can vary greatly between or even within viruses. In barley yellow dwarf virus (BYDV), the three sgRNA promoters have completely different primary and secondary structures and positions in relation to their start sites. The sgRNA1 promoter is located upstream of the initiation site, whereas the sgRNA2 and sgRNA3 promoters are situated downstream of their respective initiation sites (Koev et al., 1999; Koev and Miller 2000). Diverse promoter arrangements probably have different affinities for viral and host factors and allow differential regulation of sgRNA synthesis. The +ve sense RNA viruses that produce sgRNAs include animal viruses belonging to the families Coronaviridae, Togaviridae, Arteriviridae, Nodaviridae, Caliciviridae, Hepeviridae and Astroviridae, and plant viruses belonging to the families Tombusviridae, Luteoviridae and Closteroviridae. There are several strategies to initiate sgRNA synthesis.

1.2.1.1. Internal initiation

The viral polymerase synthesizes full-length -ve sense RNA and then binds to an internal promoter on the -ve sense RNA to initiate synthesis of +ve sense sgRNA (Miller et al., 1985) (fig. 1). A group of plant viruses and the alphaviruses are known to utilize this mechanism (Wang and Simon 1997; Levis et al., 1990; Van der Kuyl et al., 1990; Miller et al., 1985).

Figure 1: Internal initiation

1.2.1.2. Premature termination during -ve sense synthesis on the genomic RNA template

The viral polymerase synthesizes a partial -ve sense RNA by early termination and uses it as a template to synthesize +ve sense sgRNA (fig. 2). The termination signal may be a structure formed by base pairing between sequences of two RNA molecules. Such signals are recognized by the viral RdRp alone or with the help of other protein factors (Sit et al., 1998). Termination signals formed by cis base pairing have been proposed for tomato bushy stunt tombusvirus (TBSV) (Zhang et al., 1999), and secondary structure within the RNA can also facilitate termination (Nagy et al., 1998).

Figure 2: Premature termination

1.2.1.3. Discontinuous transcription

A short RNA sequence is generated from the 5’-end of the genomic RNA, followed by internal initiation on the -ve sense RNA genome. This mechanism has been proposed for the Nidovirales (Zhang and Lai et al., 1994) (fig. 3A). Nidoviruses have large genomes (15 to 31kb) and complex subgenomic promoters. They produce a nested set of sgRNAs that contain a 90nt leader sequence derived from the 5’-end of the genomic RNA (gRNA) (Lai and Cavanagh 1997). Discontinuous transcription usually involves hopping of the polymerase from one promoter to another (Sawicki and Sawicki 1990) (fig. 3B).

Figure 3: Discontinuous transcription. (A) Leader priming; (B) recombination during negative strand synthesis

1.3. Positive strand RNA viruses

Positive strand animal RNA viruses can be grouped into three superfamilies based on their genomic organization, replication strategies and replicase proteins (Koonin and Dolja 1993). The replicase proteins include methyltransferases (MT), chymotrypsin-like and papain-like proteases (PRO), RNA-dependent RNA polymerase (RdRp) and RNA helicase (HEL). On the basis of comparative analysis of the amino acid sequences of viral proteins, +ve sense RNA viruses are broadly classified into the picornavirus-like, flavivirus-like and alphavirus-like superfamilies (van der Heijden and Bol 2002).

1.3.1. Picornavirus-like superfamily

This superfamily is the largest, most diverse and most widely represented across the range of eukaryotic hosts. In addition to a distinct RdRp lineage, the picornavirus-like superfamily is defined by a conserved array of signature genes encoding a superfamily-III helicase, a small genome-linked protein (VPg) and a distinct chymotrypsin-like protease, 3Cpro (Eugene et al., 2008; Koonin and Dolja 1993). The prototype of the family Picornaviridae is poliovirus. It has a positive sense single stranded genome of ~7.5kb with a long 5’untranslated region (5’UTR), a single ORF and a short 3’UTR followed by a poly-A tail. Instead of carrying a cap structure, the 5’-end of the gRNA is covalently linked to the VPg protein. On entry, the virus hijacks the cellular translation machinery and inhibits host cell protein synthesis. The 5’ end of poliovirus RNA contains a highly structured sequence known as the internal ribosome entry site (IRES), which directs translation of the viral RNA. The genome is translated as a polyprotein containing three domains: P1 (coding for structural proteins) and P2 and P3 (coding for non-structural proteins) (fig. 4). Polyprotein processing by viral proteases yields the structural proteins (VP1-4) needed to assemble virus capsids and the non-structural proteins (2A-2B-2C-3A-3B-3Cpro-3Dpol). The VPgs are small peptides of 21–24 amino acids. A conserved Tyr3 links VPg to the 5’terminal UMP of the genome via a phosphodiester bond (Paul 2002; Wimmer et al., 1993). The uridylylated form of VPg is used as a primer for synthesis of both +ve and -ve strand RNA. The RNA polymerase, 3Dpol, has two synthetic activities in vitro (Cameron et al., 2002; Paul 2002): it covalently attaches UMP to the hydroxyl group of the tyrosine in VPg, and it extends RNA/DNA primers on homopolymeric or heteropolymeric RNA templates (Paul et al., 1998).

Figure 4: Schematic representation of poliovirus genome

1.3.2. Alphavirus-like superfamily

The alphavirus-like superfamily contains animal viruses belonging to families such as Hepeviridae (hepatitis E virus) and Togaviridae (genera Rubivirus and Alphavirus), plant viruses of the families Bromoviridae and Closteroviridae, and insect viruses of the family Tetraviridae. Members of this group have three conserved domains arranged in the same order: methyltransferase, helicase and polymerase (Kääriäinen and Ahola 2002; van der Heijden and Bol 2002). In addition to this typical arrangement of the non-structural polyprotein, hepatitis E virus, alphaviruses and rubiviruses carry an X domain (now known as the macro domain) and a protease (PRO) domain. Another conserved domain of unidentified function, called the Y domain, is present downstream of the methyltransferase domain in hepatitis E virus and rubella virus, but is absent from some other members of the superfamily (Koonin and Dolja 1993; Koonin et al., 1992). The methyltransferase is present in all members of the alphavirus-like superfamily and is a hallmark of this superfamily (Rozanov et al., 1992). These viruses have a 5’ cap structure and synthesize one or more subgenomic RNAs during their life cycle. Large differences among their structural proteins give these viruses different virion structures (Kääriäinen and Ahola 2002; Strauss and Strauss 1994). Homologous viral proteins show very little overall sequence conservation and share only short stretches of amino acid sequence containing motifs involved in enzymatic functions (Koonin and Dolja 1993).

Semliki forest virus (SFV), a prototype virus of this superfamily, has its RNA genome (11.5 kb) enclosed in a capsid surrounded by an envelope. The envelope is made of heterodimers of the transmembrane glycoproteins E1 and E2. The viral genome has a 5’ cap and a poly-A tail (Strauss et al., 1984; Simmons and Strauss 1972). In contrast to picornaviruses and flaviviruses, the non-structural protein coding region is situated in the 5’-proximal region and the genome consists of two large ORFs. The 5’-proximal ORF encodes the non-structural polyprotein (nsp1234), while the 3’-proximal ORF encodes the structural polyprotein (C, E3, E2, 6K, E1) (fig. 5). In addition to genomic RNA, these viruses also synthesize sgRNA for production of structural proteins (Cancedda and Shatkin 1979). Upon infection of cells, the upstream two-thirds of the RNA genome is translated into the non-structural polyprotein P1234 (Takkinen 1986), which is further processed to generate four non-structural proteins (nsp1, nsp2, nsp3 and nsp4). The nsp1 is an RNA capping enzyme and shows both methyltransferase and guanylyltransferase activities. These activities are essential for capping of viral genomic and subgenomic RNAs (Ahola and Kääriäinen 1995; Mi and Stollar 1991). It is also involved in anchoring the replication complex to cell membranes during replication (Peranen et al., 1995). The N-terminal half of nsp2 possesses NTPase, RNA triphosphatase and RNA helicase activities (Vasiljeva et al., 2000; Gomez de Cedrón et al., 1999), while the C-terminal half contains the viral papain-like cysteine protease, which catalyzes processing of the non-structural polyprotein precursor (Vasiljeva et al., 2001). The nsp3 has a role in infection and pathogenesis (Park and Griffin 2009; Tuittila and Hinkkanen 2003). The nsp4 is the viral RdRp. The structural polyprotein is produced from sgRNA transcribed via an internal subgenomic promoter.

Figure 5: Schematic representation of genome organization of alphavirus

1.3.3. Flavivirus-like superfamily

The flavivirus-like superfamily includes plant and animal viruses that have different genome organizations but share related RdRps. The animal viruses belonging to the genus Flavivirus (family Flaviviridae) include mosquito-borne viruses such as yellow fever virus (YFV), dengue virus (DENV), Japanese encephalitis virus (JEV), West Nile virus (WNV) and Zika virus, and the tick-borne encephalitis virus (TBEV). Most of these viruses are transmitted by arthropods (mosquitoes or ticks) and hence are also known as arboviruses. Flaviviruses are enveloped viruses with a ~11kb genome that has a 5’cap (Cleaves and Dubin 1979; Wengler and Gross 1978) and a non-polyadenylated 3’end (Wengler and Gross 1978). The viral RNA encodes a single ORF flanked by a highly structured ~100nt 5’UTR and a 350–700nt 3’UTR. The ORF is translated into a large polyprotein that is cleaved co-translationally and post-translationally into functional proteins: three structural proteins (C-prM-E) followed by seven nonstructural (NS) proteins (NS1-NS2A-NS2B-NS3-NS4A-NS4B-NS5) (Rice et al., 1985) (fig. 6). The N-termini of prM, E, NS1 and NS4B are cleaved in the ER lumen by host signal peptidase, while most of the NS proteins and the C-terminus of the C protein are processed in the cytoplasm by the NS3 serine protease together with its cofactor NS2B. NS5 is the largest and most conserved flavivirus protein. It harbours an N-terminal methyltransferase domain and a C-terminal RdRp domain. NS3 is a multifunctional protein containing an N-terminal serine protease domain and C-terminal NTPase/RNA helicase and 5’RNA triphosphatase (RTPase) activities (Yon et al., 2005; Benarroch et al., 2004; Borowski et al., 2001).

Figure 6: Schematic representation of a typical flavivirus genome

1.4. Viral RNA dependent RNA polymerases

RNA dependent RNA polymerases (RdRps) are essential components of RNA replication and transcription complexes and replicate the viral RNA genome without going through a DNA stage. RNA viruses multiply their genomes from a single copy to many copies in only a few hours, so speed and efficiency are important characteristics of these viral enzymes. They must also be highly specific for viral RNA, because the cytoplasm of the infected cell contains huge numbers of cellular RNAs. Other virus- and host cell-encoded components may also be essential for viral RNA replication and RdRp activity through the formation of RNA–protein complexes. In +ve sense RNA viruses, the genomic RNA initially directs synthesis of viral proteins. Once the viral RdRp and other essential proteins have accumulated, the viral RNA is copied from the 3′ end to generate complementary negative sense RNA, which in turn is transcribed into new positive sense genomic RNA molecules. The integrity of the viral genome requires correct initiation of RNA synthesis by viral polymerases.

1.4.1. Initiation mechanisms

Although diverse RNA viruses use different strategies for genome replication, there are only two basic mechanisms for initiating RNA synthesis: primer-independent (de novo) initiation and primer-dependent initiation (Ranjith-Kumar et al., 2002a; Kao et al., 2001; Paul et al., 1998). RNA viruses use one or both of these mechanisms. Almost all viruses initiate both replication and transcription from the end of the genome; for subgenomic RNA synthesis, however, viruses employ internal initiation.

1.4.1.1. De novo initiation

De novo initiation by RdRp usually starts with either GTP or ATP. The initiating nucleotide provides the 3’OH for addition of the incoming nucleotide, which forms the first phosphodiester bond. Initiation by the de novo mechanism does not require any additional host or viral enzyme to generate a primer for replication. The RdRp subsequently elongates the RNA in a template-dependent manner. In some cases abortive initiation may occur, and the short RNA products may serve as primers in a further replication cycle. Such abortive priming has been reported in reovirus (Yamakawa et al., 1981), turnip crinkle carmovirus (Nagy et al., 1997) and rotavirus (Chen and Patton 2000). De novo initiation is utilized by viruses with positive sense, negative sense, double stranded and ambisense RNA genomes (Kao et al., 2001). Negative-strand RNA viruses such as vesicular stomatitis virus (VSV) (Testa and Banerjee 1979) and dsRNA viruses such as members of the Cystoviridae (Yang et al., 2003; Makeyev and Bamford 2000) and rotavirus (Chen and Patton 2000) use this strategy. De novo initiation is also widely used by positive sense RNA viruses, including members of the family Flaviviridae, namely hepatitis C virus (HCV) (Kao et al., 2001; Luo et al., 2000; Zhong et al., 2000; Kao et al., 1999; Oh et al., 1999), dengue virus (Ackermann and Padmanabhan, 2001) and Kunjin virus (Guyatt et al., 2001), and by plant alphavirus-like viruses (Strauss and Strauss, 1994; Goldbach et al., 1991).

1.4.1.2. Primer dependent initiation

In some viruses, initiation of polymerization is mediated by a primer, which can be either a protein or an oligonucleotide.

1) Protein primers: The VPg protein is covalently attached to the 5′-end of the viral genome and can act as a primer during RNA synthesis in virus families including Picornaviridae and Caliciviridae. The RNA polymerases of these viruses use VPg as a primer for both -ve and +ve strand RNA synthesis. Upon uridylylation, VPg provides the free hydroxyl group that the viral polymerase requires to form a phosphodiester bond.

2) Oligonucleotide primers:
   i. Cap snatching: the primers are produced by cleavage of the 5’end of a capped cellular mRNA. Many segmented -ve sense RNA viruses use this mechanism for transcription (Hagen et al., 1995).
   ii. Leader RNA: a short RNA primer or leader RNA of about 2-4nt, generated by abortive cycling, can serve as a primer (McClure 1985).
   iii. Template-primed or copy-back initiation: the 3’end of the RNA template loops back and serves as a primer (Laurila et al., 2002).

1.5. Structural and functional insights into RNA dependent RNA polymerases (RdRps)

Although overall primary sequence conservation is not observed among viral RdRps, they contain highly conserved motifs (Butcher et al., 2001; Bruenn 1993). There are seven motifs arranged in the order G, F1-3, A, B, C, D and E from the N-terminus to the C-terminus of these enzymes (Bruenn 2003; Gorbalenya et al., 2002). The seven motifs are distributed across three subdomains known as the fingers, palm and thumb (fig. 7). The loops in the fingers subdomain, called fingertips, connect the fingers with the thumb and create an overall ‘closed right hand’ conformation that is unique to RdRps, in contrast to the ‘open hand’ conformation found in other polymerases (Ferrer 2006). The closed right hand conformation allows correct arrangement of metal ions and substrate molecules at the active site of the enzyme for catalysis (Brautigam and Steitz 1998). Divalent metal ions are important for the nucleotide polymerization reaction, and most polymerases use Mg2+ (Steitz 1998). Mn2+ has been shown to increase de novo initiation by RdRps (Ranjith-Kumar et al., 2002b); however, it has also been suggested that Mn2+ may not play a physiologically relevant role in synthesis because of its low intracellular concentration (Quamme et al., 1993; Zhang and Ellis, 1989).

Three-dimensional structures are known for the RdRps of +ve sense RNA viruses such as poliovirus, human rhinovirus and foot and mouth disease virus in the family Picornaviridae (Appleby et al., 2005; Ferrer-Orta et al., 2004; Thompson and Peersen 2004; Love et al., 2004; Hansen et al., 1997), rabbit hemorrhagic disease virus and Norwalk virus in the family Caliciviridae (Ng et al., 2002 and 2004), and hepatitis C virus and bovine viral diarrhea virus in the family Flaviviridae (Choi et al., 2006 and 2004; O’Farrell et al., 2003; Ago et al., 1999; Bressanelli et al., 1999; Lesburg et al., 1999).

Figure 7: A typical structure of RdRp of a positive sense RNA virus

1.5.1. The finger subdomain

The fingers subdomain comprises two regions: an inner region and an outer region. The inner region mainly contains a cluster of α-helices that encircle the palm subdomain. The outer fingers, which project away from the palm, contain several β-strands. This region forms the fingertip region extending towards the thumb domain. A central β-sheet forms the core of the outer fingers and is a common feature among RdRps, although the conformation of the polypeptide chain in the fingertips differs considerably among different RdRps. The fingers subdomain contains motifs F and G, which are conserved among RdRps of +ve sense RNA viruses (Bruenn 2003; Koonin, 1991). Motif F extends from the fingers subdomain towards the thumb subdomain and contains several basic amino acids that interact with the incoming nucleotide. Motif G is found in many single stranded +ve sense RNA viruses. It has an S-X-G consensus sequence and forms a loop that is part of the template entrance tunnel (Pan et al., 2007; Gorbalenya et al., 2002). An additional conserved structural motif H, located in the thumb subdomain of ss +ve RNA viruses, has been suggested, but its actual function has not been described (Cameron et al., 2009; Cerny et al., 2014).

1.5.2. The thumb subdomain

The thumb is the most diverse domain and is located in the C-terminal region of the protein. The RdRps of picornaviruses and caliciviruses have small thumb domains consisting of a bundle of about four helices. These viruses use primer dependent initiation and lack the β-hairpin, priming loop or C-terminal folds, and thus have a wider template tunnel that accommodates a template-primer complex for initiation (Ferrer-Orta et al., 2006). Flavivirus polymerases contain a significantly larger thumb domain with more than twice the number of residues found in the 3D polymerases of Picornaviridae. They are composed of a β-thumb region and three additional α-helices that protrude into the active site and reduce the volume of the template channel so that only ssRNA can access the active site during initiation (Choi et al., 2004). In addition, in some RdRps (e.g. HCV) much of the active site cavity is occupied by C-terminal extensions that fold back into the molecule (Leveque et al., 2003; Butcher et al., 2001). All known RdRps have a small positively charged tunnel on the back side of the thumb, which possibly serves as the nucleotide diffusion tunnel (Cameron et al., 2009; Yap et al., 2007; Bressanelli et al., 2002).

1.5.3. The palm domain: the catalytic domain

The palm domain of all viral RdRps contains motifs A–E and includes most of the highly conserved sequence motifs known in polymerases. Motif A has the consensus sequence DX4–5D, while motif C contains the GDD tripeptide. Motif B is situated between the fingers and palm subdomains and facilitates binding of the template and the incoming nucleotide; its N-terminal loop is mainly involved in template binding (Gong et al., 2010; Crotty et al., 2003). Motif C is a comparatively highly conserved sequence in viral RdRps. Motif D is known to change its conformation upon nucleotide binding and also functions as a structural scaffold for the palm domain. At its N-terminal end, motif D has a highly conserved glycine residue (Gong et al., 2010). Motif E forms a tight loop located at the junction of the palm and thumb domains. Part of this loop lies in the active site groove and is assumed to help position the 3’-OH end of the primer for phosphodiester bond formation. This motif is comparatively less conserved (Cameron et al., 2009) (fig. 8).
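The two consensus patterns quoted above (DX4–5D for motif A and GDD for motif C) can be written as regular expressions. The Python sketch below is only an illustration of that idea: the function name and the test sequence are hypothetical, and such short patterns also match by chance, so a hit is a candidate and not evidence of a real motif.

```python
import re

# Consensus patterns quoted in the text, written as regular expressions.
# The lookahead lets overlapping motif A candidates all be reported.
MOTIF_A = re.compile(r"(?=D.{4,5}D)")  # D, then 4-5 residues of any kind, then D
MOTIF_C = re.compile(r"GDD")           # the GDD tripeptide


def find_palm_motif_candidates(protein):
    """Return 0-based start positions of candidate motif A and motif C matches."""
    protein = protein.upper()
    return {
        "A": [m.start() for m in MOTIF_A.finditer(protein)],
        "C": [m.start() for m in MOTIF_C.finditer(protein)],
    }


# Hypothetical peptide, invented for this example (not from any viral RdRp).
toy_protein = "MKTDLSKFDAALYGDDLV"
print(find_palm_motif_candidates(toy_protein))  # {'A': [3, 8], 'C': [13]}
```

In practice, motif assignment relies on alignments and structures; this pattern search only locates positions worth inspecting.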

Figure 8: Structural motifs of RdRps (te Velthuis A. J. W., 2014)

1.6. Cis-acting regulatory elements in positive sense RNA virus genomes

The ends of positive sense RNA virus genomes form secondary and higher-order RNA structures. These functionally important structures are known as cis-acting elements because their functions cannot be complemented in trans. They contribute to genome stability and to inter- or intramolecular RNA-RNA interactions. They are also involved in binding viral and cellular proteins and regulate viral replication, transcription or translation during the viral life cycle. Most are located in the 5′NCR and 3′NCR of the viral genome, but some extend into the adjacent coding sequences. In addition, some cis-acting RNA elements are found within the coding sequences, distant from the genomic ends.

1.6.1. The cis-acting elements in Picornavirus-like superfamily

5’ cloverleaf (oriL):

The genome of poliovirus (PV), the prototype of the picornavirus-like superfamily, contains a highly conserved cloverleaf structure at the 5’-terminal end, where VPg is covalently linked to the terminal UMP (fig. 9). The cloverleaf is organized in four stem-loop structures (SLA, SLB, SLC and SLD) and regulates both translation and RNA replication. The cellular poly(rC)-binding protein (PCBP) and the viral protein 3CDpro form a ribonucleoprotein (RNP) complex on the cloverleaf that is essential for viral RNA replication (Parsley et al., 1997; Xiang et al., 1995; Harris et al., 1994; Andino et al., 1993; Andino et al., 1990). Stem-loop B interacts with either PCBP or 3AB, while a tetraloop (UGCG) in stem-loop D binds 3CDpro (Rieder et al., 2003; Paul 2002). Mutations that disrupt RNP complex formation inhibit RNA replication but do not affect translation. The C residues in stem-loop B are required for PCBP binding and participate in RNA replication. An adjacent C-rich sequence between the cloverleaf and the IRES, called the spacer sequence, is also an essential part of the 5’-terminal cis-acting element (oriL) of the poliovirus genome (Toyoda et al., 2007).

Figure 9: The 5’terminal cloverleaf region

3’NTR-poly-A (oriR):

The 3’non-coding region (NTR) is very diverse among picornaviruses. The enteroviral 3′NTR together with the poly-A tail has been proposed to be the origin of replication (oriR) for -ve strand RNA synthesis. This region provides specific binding sites for viral and/or cellular proteins. The stem-loops X and Y in the 3’NTR of poliovirus display a ‘kissing interaction’, which is crucial for RNA replication. The poly-A tail of picornaviruses is genetically encoded (Wimmer et al., 1993), unlike that of cellular mRNAs, to which poly-A tails are added post-transcriptionally. The poly-A tail is ~90nt long, but studies have shown that a poly-A tail of 20nt is sufficient for RNA replication and infectivity of the viral RNA (Silvestri et al., 2006). Interestingly, the complementary poly-U tract contains only about 20nt (van Ooij et al., 2006) (fig. 10).

Figure 10: Structure and nucleotide sequence of the poliovirus 3’ NTR with the poly-A tail

Internal origin of replication (oriI or cre):

This important cis-acting element is found either in the 5’NTR or within the coding sequences of picornavirus genomes (fig. 11) (Paul 2002). An oriI element was first identified in the coding sequence of capsid protein VP1 of human rhinovirus 14 (HRV14) (McKnight and Lemon 1998), and oriI elements were subsequently identified in 2C ATPase of poliovirus and coxsackievirus B3, in capsid protein VP2 of cardioviruses and in 2Apro of HRV2 (van Ooij et al., 2006). The oriI of FMDV is an exception and is situated in the 5’NTR (Mason et al., 2002). The oriIs contain a small stem-loop structure and have diverse nucleotide sequences. However, enterovirus and rhinovirus oriIs share a conserved motif (G1XXXA5A6A7XXXXXXA14) (Yin et al., 2003; Yang et al., 2002). The A5 residue in this motif helps link UMPs to VPg to generate VPgpU and VPgpUpU, which act as primers for RNA synthesis. Mutational studies have shown that the function of the enteroviral oriI is independent of its position within the genome. When the oriI element within the PV coding region was inactivated by mutation, insertion of a second oriI (PV or HRV14) into the 5′NTR restored its function (Yin et al., 2003).
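The conserved motif G1XXXA5A6A7XXXXXXA14 can be read as a 14-nt pattern: G at position 1, A at positions 5 to 7 and 14, and any nucleotide elsewhere. The Python sketch below scans an RNA string for that pattern. The function name and the test sequence are hypothetical, and the scan ignores the stem-loop context that a functional oriI also requires.

```python
import re

# G1 XXX A5 A6 A7 XXXXXX A14, with X read as any of A, C, G or U.
# The lookahead with a capture group reports overlapping candidates too.
ORII_MOTIF = re.compile(r"(?=(G[ACGU]{3}AAA[ACGU]{6}A))")


def find_orii_candidates(rna):
    """Return (0-based start, matched 14-mer) pairs for the consensus motif."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in ORII_MOTIF.finditer(rna)]


# Hypothetical sequence, invented for this example (not a viral sequence).
toy_rna = "CCGUUCAAACUGGCUACC"
for start, site in find_orii_candidates(toy_rna):
    # A5 of the motif sits four nucleotides after the G1 position.
    print(start, site, "A5 at index", start + 4)
```

A sequence match alone is a weak criterion, since the pattern fixes only 5 of 14 positions.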

Figure 11: The structure and nucleotide sequence of the PV1 oriI [cre(2C)] element. The bold letters represent the conserved cre(2C) sequences

Internal Ribosomal Entry Site (IRES):

The PV IRES is located in the 5’NTR between nt 124 and 630 (Wimmer et al., 1993; Paul 2002). It is a highly structured element that allows ribosome binding and initiates translation without a 5’-m7G cap (fig. 12). The PV IRES is a Type I IRES, which directs the ribosome to a first AUG downstream of the IRES (Pelletier et al., 1988), from which it scans along the RNA until it reaches a second AUG, where translation is initiated (Jackson et al., 2010; Bonnal et al., 2005; Belsham 1992). IRES-dependent translation is known to be inefficient compared with cap-dependent translation initiation (Andreev et al., 2009). The IRES also contains signals for RNA replication in stem-loops II, IV and V (Cameron et al., 2009).

Figure 12: 5’NTR of poliovirus genome with cloverleaf and IRES secondary structures

Cloverleaf structure at the 3’end of negative sense genome:

Both cellular and viral (2CATPase, 2BC) proteins have been shown to bind the 3’terminal sequence of PV -ve sense RNA (Paul 2002), although the importance of these RNA/protein interactions has not yet been established. Using an in vitro translation and RNA replication system, the 5’terminal sequence of stem A in the +ve sense genome, and consequently the 3’terminal sequence of the -ve sense strand, was shown to be important for efficient +ve strand RNA synthesis (Sharma et al., 2005).

1.6.2. Cis-acting elements in flavivirus-like superfamily

5’NTR elements and promoter signals for RNA synthesis:

The 5’NTRs of flavivirus genomes are relatively short and show little conservation of nucleotide sequence (Brinton and Dispoto 1988); however, they share a similar secondary structure that contains a large stem-loop with a side stem-loop (SLA) (fig. 13). A second, short stem-loop (SLB) is present downstream of SLA and ends at the translation initiator codon. The SLA and the sequence upstream of the translation initiator AUG were found to be crucial cis-acting elements for viral replication (Filomatori et al., 2006; Kofler et al., 2006; Cahour et al., 1995). Additionally, a conserved stem-loop structure just downstream of the initiator AUG of the DENV genome was reported to act as a regulatory element that is important for start codon selection during translation initiation (Clyde and Harris 2006). In vivo and in vitro studies suggested that the SLA functions as the promoter for DENV negative strand RNA synthesis.

Figure 13: Schematic representation of the 5’and 3’NTR regulatory elements of dengue virus genome

3’NTR cis-acting elements:

The flavivirus 3’NTRs are about 350-700nt long. Based on sequence and structure conservation, the 3’NTR can be divided into three domains (fig. 14). Domain I, which starts immediately after the stop codon, is a hypervariable region that contains insertions and deletions in most flaviviruses. Domain II displays moderate conservation and contains several hairpin motifs. It has a distinctive dumbbell (DB) structure containing a conserved sequence called the CS2 motif, which is found in all mosquito-borne flaviviruses (Gritsun and Gould 2006; Romero et al., 2006; Olsthoorn and Bol 2001). The DEN and JE subgroups have a tandem repeat of the DB structure, which contains a repeated conserved sequence motif (RCS2). Domain III, the most conserved region, has a stable terminal stem-loop structure (3’SL). The region upstream of the 3’SL contains a highly conserved sequence, the CS1 motif (Hahn et al., 1987).

Inverted complementary sequences at the 5’ and 3’ ends of genome:

Flaviviruses have inverted complementary sequences at the ends of their genomes (fig. 14). It has been suggested that these sequences at the 5’ and 3’ends base pair and form a circular conformation (panhandle-like structures) (Hahn et al., 1987). High conservation of the 5’-3’CS was observed among mosquito-borne viruses, whereas the 5’-3’UAR displays less sequence conservation. In tick-borne flaviviruses (TBF), two pairs of complementary sequences (CSA and CSB) were proposed as cyclization elements (CS) (Khromykh et al. 2001; Mandl et al., 1993). The requirement of 5’-3’CS base pairing for polymerase activity has been demonstrated (You and Padmanabhan 1999). Crosslinking studies and electrophoretic mobility shift assays have shown RNA–RNA interaction between the 5’ and 3’-end sequences of flaviviruses. The specific roles of both the 5’-3’CS and the 5’-3’UAR have been shown by mutagenesis studies (Alvarez et al., 2005; You et al., 2001).
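Whether two terminal elements are inverted complementary can be checked by comparing one with the reverse complement of the other. The Python sketch below does this for perfect Watson-Crick pairing only. The helper names and the sequences are hypothetical, and real cyclization elements may also tolerate G-U pairs or mismatches, which this check would reject.

```python
# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def are_inverted_complementary(five_prime_seq, three_prime_seq):
    """True if the two elements can pair perfectly along their full length."""
    return reverse_complement(five_prime_seq) == three_prime_seq.upper()


# Hypothetical 9-nt elements, invented for this example (not flavivirus CS sequences).
toy_5cs = "GGAUCCAUG"
toy_3cs = "CAUGGAUCC"
print(reverse_complement(toy_5cs))                   # CAUGGAUCC
print(are_inverted_complementary(toy_5cs, toy_3cs))  # True
```

A mutation in one element makes the check fail, and the matching compensatory change in the other element restores it, which mirrors the logic of the mutagenesis experiments cited above.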

Figure 14: Location and sequence of flavivirus cyclization sequences. A schematic representation of the location of the 5’-3’UAR and 5’-3’CS regions of mosquito-borne flaviviruses is shown in the top panel. The lower panels show sequences of the complementary regions 5’-3’CS and 5’-3’UAR of flaviviruses. The highlighted sequences are the inverted complementary sequences. The initiation codon AUG is indicated in bold.


1.6.3. Cis-acting elements in the alphavirus-like superfamily

5′-terminal elements

The Sindbis virus (SINV) genome has a 200nt long 5′-terminal region which contains two cis-acting elements called conserved sequence elements (CSEs). The first CSE (nt 1 to 59) consists of a small stem-loop and a large stem-loop (SL1 and SL2). The second CSE, predicted to form 2 smaller stem-loop structures (SL3 and SL4), is located between nt 155 and 205 in the nsP1 coding sequence (Frolov et al., 2001) (fig. 15). SL3 and SL4 enhance RNA replication, and their integrity is more important for SINV replication in mosquito cells than in mammalian cells (Fayzulin et al., 2004; Frolov et al., 2001; Niesters et al., 1990).

3′NTR

The length of the 3′NTR in alphaviruses varies from 121 to 524nt. The prototype SINV 3′NTR is 323nt long (fig. 15). It contains a poly-A tail and a highly conserved CSE of 19nt without any secondary structure. There is an AU-rich segment and repeated sequence elements 25-72nt upstream of the CSE. The SINV 3′NTR contains 3 copies of a 40nt long repeat element, and deletion of these repeats yields viable virus with impaired growth efficiency (Kuhn et al., 1990). The 3′NTR is crucial for initiation of negative strand RNA synthesis from a +ve strand genomic template (Kuhn et al., 1990).

Internal RNA elements

The nsP1-encoding region of the SINV genome harbors a 132nt long segment (nt 944-1076) which is predicted to form 4 stem-loop structures and has been identified as the encapsidation signal (fig. 15) (Linger et al., 2004; Frolova et al., 1997; Weiss et al., 1994). A second internal cis-acting element in the SINV genome is located in the ‘junction region’ (JR), which includes sequences preceding and within the coding region of the 26S subgenomic RNA. This region (in the –ve sense RNA) contains the promoter for the subgenomic RNA and is highly conserved in alphaviruses (Ou et al., 1982). In a subsequent study with DI particles, the minimal subgenomic promoter for SINV was mapped to 18-19nt upstream and 5nt downstream of the start of the subgenomic RNA (Levis et al., 1990).


Figure 15: Cis-acting elements in the genome of Sindbis virus (Alphavirus, Togaviridae)

1.7. Role of host factors in RNA replication

Replication of +ve RNA viruses is a complex process which involves interactions among viral RNAs, viral proteins and host factors. After translation and processing of viral proteins, there is extensive rearrangement of cytoplasmic membranes, involving viral and host proteins, to form the replication complex (RC) (Jouvenet et al., 2010; Miller et al., 2008). The RCs of picorna-, flavi-, hepaci-, corona- and arteriviruses are associated with membrane vesicles derived from the endoplasmic reticulum (ER), while in togaviruses, e.g., Rubella virus and Semliki Forest virus, RCs are derived from endosomes and lysosomes. Members of the family Nodoviridae, e.g., flock house virus, assemble their RCs on mitochondrial membranes (Miller et al., 2008; Novoa et al., 2005). The NS4A of flaviviruses, the NS4B of hepatitis C virus (HCV), the 2BC and 3A of picornaviruses and the nsp3 to nsp4 of arteriviruses and coronaviruses are known to induce membrane rearrangements (Fernandez-Garcia 2009; Clementz et al., 2008; Miller et al., 2008; Posthuma et al., 2008; Egger et al., 2002).


Part 2

1.8. Hepatitis E virus

Hepatitis E virus (HEV) causes self-resolving acute liver inflammation. Hepatitis E was first recognized during an epidemic of hepatitis that occurred in Kashmir (India) in 1978. Based on the symptoms and epidemiological features, the existence of a new human hepatitis virus was postulated (Khuroo 1980). Subsequently, a retrospective study of an earlier waterborne epidemic in New Delhi (India) in 1955, which had been attributed to hepatitis A virus (Vishwanathan 1957), showed that the causative agent was neither hepatitis A nor hepatitis B virus but a new (non-A, non-B) virus (Wong et al., 1980). Balayan transmitted the virus to himself by oral administration and, 36 days post inoculation, developed abnormal liver enzymes and the clinical signs and symptoms of acute non-A, non-B hepatitis. Electron microscopic analysis of the stool samples revealed 27-30nm spherical virus-like particles (VLPs) (Balayan et al., 1983). Reyes et al (1990) characterized the virus further by constructing cDNA libraries in the vector λgt10 and screening them with probes prepared from RNA extracted from bile. The screening identified a clone (~1.3 kb) that selectively hybridized to cDNA prepared from RNA extracted from stool samples of hepatitis patients. Sequence analysis showed that it corresponded to the RNA dependent RNA polymerase region of the virus genome. Later, the viral genome was molecularly cloned from faeces of a macaque experimentally infected with a Burmese strain of the virus (Tam et al., 1991). Yarbough et al (1991) developed an immunoassay to detect antibodies against this virus. Retrospective analysis of serum samples from the viral hepatitis epidemics in India in 1955-56 (Delhi) and 1975-76 (Ahmedabad) confirmed that both were due to the enterically transmitted non-A, non-B (ET-NANB) agent, which was later named hepatitis E virus.

1.8.1. Virus transmission

HEV is excreted in the feces of infected persons and is transmitted by the faeco-oral route. Person-to-person transmission is rare in both epidemic (Arankalle et al., 2000) and sporadic settings (Aggarwal et al., 1994). There is occasional nosocomial transmission (Robson et al., 1992). Vertical transmission from mother to infant is also possible (Khuroo et al., 1995). Transfusion-associated hepatitis E is possible in endemic situations (Arankalle et al., 2000 and 1998). Hepatitis E is an important public health concern in many developing countries, where the virus is hyperendemic or endemic mainly because of poor sanitary facilities. In such situations hepatitis E can occur either in epidemic form (Naik et al., 1992; Khuroo, 1980; Vishwanathan et al., 1957) or in sporadic form. Hepatitis E is the major cause of water-borne epidemics in India. A common feature is contamination of water with sewage (Arankalle et al., 1994; Chadha et al., 1991; Wong et al., 1980; Sreenivasan et al., 1978; Viswanathan et al., 1957). In rural areas, outbreaks are seen during floods that lead to contamination of drinking water sources by human excreta. In summer months, scarcity of water in rivers increases the chances of infection by concentrating fecal contaminants. Overall, attack rates during outbreaks range from 1% to 15%. The rates are highest among young adults (3%-30%); the reason for this is not clear. Infections in children are probably asymptomatic. About 40-50% of sporadic hepatitis among Indian adults is due to HEV (Arankalle et al., 1993). The mortality rate for HEV ranges from 0.5 to 3%; in pregnant women, however, it increases with each succeeding trimester and may reach 10-30% in epidemic situations (Acharya et al., 1996; Khuroo et al., 1981). Outbreaks of hepatitis E in endemic regions are mainly caused by genotype 1 virus. HEV is endemic in Southeast and Central Asian countries, parts of the Middle East, Africa and Mexico.

Recently, sporadic cases or smaller outbreaks due to autochthonous transmission of hepatitis E have been recorded worldwide. In the USA, Europe (including France, UK, Germany, the Netherlands, Spain, Austria and Greece), and Asian–Pacific countries (Japan, Korea, Taiwan, Hong Kong, New Zealand and Australia), zoonotic transmission of the virus is mainly observed. Swine HEV (genotype 3) was shown to be endemic in pigs and able to cross the species barrier (Meng et al., 1998a; Schlauder et al., 1998). Direct evidence of zoonotic transmission came from human infections following consumption of undercooked sika deer and wild boar meat (Tamada et al., 2004; Tei et al., 2004). Several animal species, including swine, cattle, sheep, goats, macaques, horses, cats, dogs, rats and mice, carry anti-HEV antibodies, suggesting the existence of additional animal reservoirs (Meng, 2000; Arankalle et al., 1994; Balayan et al., 1990).


1.8.2. Clinical features

Clinical outcomes associated with HEV infection are quite diverse. Hepatitis E is a moderately severe icteric disease with a self-limited course. The liver is the principal target organ. Clinical HEV infection is most common in adolescents and young adults, but also occurs to a lesser extent in children (Hyams et al., 1992; Arankalle et al., 1988). During epidemics of HEV, the ratios of clinical to subclinical infection were shown to be ~1:10 in adults, 1:4 in children and 1:26 in pregnant women (Arankalle et al., 1998; Clayson et al., 1997). In a proportion of patients, the disease manifestations may be severe and lead to fulminant hepatitis (acute liver failure). Pregnancy is associated with an increased risk of severe disease and with mortality of up to 15-20% (Khuroo et al., 1981). Increased morbidity and mortality are observed in chronic liver disease patients superinfected with HEV (Hamid et al., 2002).

The onset of jaundice is marked by yellowish skin and sclera, dark urine and pale coloured faeces. The prodromal symptoms usually subside once jaundice is visible; however, some patients show no visible signs yet may experience severe symptoms. The level of the liver enzyme alanine aminotransferase (ALT) peaks (1,000-2,000U/liter) at the onset of symptoms, and the serum bilirubin level then peaks (5-25mg/dl). The prothrombin time may be increased in acute viral hepatitis, especially in fulminant hepatitis, indicating extensive hepatocellular necrosis and a worse prognosis. Similarly, a reduction in the serum albumin level may occur.

In experimentally infected primates, HEV RNA was found in serum, bile, and feces several days before elevation of serum ALT activity (Ticehurst et al., 1992; McCaustland et al., 2000; Arankalle et al., 1988; Bradley et al., 1987). Expression of HEV capsid protein, indicative of viral replication, could be visualized microscopically about 7 days post infection, and in 70% to 90% of hepatocytes at the peak of viral replication. HEV antigens in hepatocytes have been detected simultaneously with the appearance of HEV in bile and feces, before or concurrent with elevation of ALT levels and morphologic changes in the liver (Krawczynski and Bradley 1989). These findings suggest that HEV may be released from hepatocytes into bile, and consequently into feces, before morphological changes occur in the liver.

HEV RNA detection in experimentally infected chimpanzees revealed viral genomic sequences in serum and stools from the very beginning of the infection and a sudden drop in viral titer with development of the antibody response (Li et al., 2006). The most commonly used diagnostic tests for HEV are serological assays (ELISA) for detection of anti-HEV IgG and IgM. Synthetic peptides or recombinant proteins (usually ORF2 alone or ORF2+ORF3) are used as antigens in these assays. Anti-HEV IgM antibodies appear early in the infection, can be detected during the acute phase of the illness and can last approximately 4-5 months. IgG antibodies appear soon after the IgM rise in the acute phase and persist into the convalescent phase.

HEV infection does not usually lead to chronic liver disease, but progression of acute hepatitis E to chronic hepatitis has recently been reported in organ transplant recipients (Kamar et al., 2008). Chronic hepatitis E is usually caused by genotype 3 virus. Risk factors include immunosuppression, solid organ transplantation, HIV infection, hemodialysis, and hematological malignancies. The route of transmission is not different from that seen in acute infection. Most patients are either asymptomatic or present with non-specific symptoms. Liver pathology shows progressive fibrosis, portal hepatitis with lymphocytic infiltration, and piecemeal necrosis with progression to cirrhosis (Haagsma et al., 2008; Kamar et al., 2008; Lockwood et al., 2008). Extra-intestinal manifestations such as neuralgic amyotrophy, peripheral neuropathies, encephalitis, encephalopathy, Parsonage Turner syndrome, paroxysmal myopathy, and bilateral pyramidal syndromes are known to occur. Chronic infection in immunocompromised patients carries a poor prognosis and, if left untreated, progresses rapidly to cirrhosis (10% in 2 years) and end-stage liver disease. Hence, anti-HEV antiviral drugs have gained importance in treating such patients. Peg-IFN α-2b and ribavirin therapy was recently found to be effective in treating severe acute hepatitis E in HEV chronic infections (Debing and Neyts 2014). Some drugs, such as mycophenolic acid, cyclophilins A and B and sofosbuvir, have recently been tried and were found to reduce viral loads in HEV-infected individuals and in cell culture systems (Dao Thi et al., 2015; Wang et al., 2014).


1.8.3. Taxonomy and classification

HEV was initially classified as a calicivirus on the basis of morphological similarity. However, later sequence analyses suggested that the putative functional motifs in HEV non-structural proteins are similar to those present in alphaviruses and Rubella virus of the Togaviridae family (Berke et al., 2000; Koonin et al., 1992) and also to those of plant viruses (Koonin et al., 1992). The methyltransferase, the enzyme responsible for HEV capping, has properties similar to those of viruses in the alphavirus-like superfamily (Ropp et al., 2000). The International Committee on Taxonomy of Viruses (ICTV) subsequently established a new family, Hepeviridae, for classifying hepatitis E virus. The family has been further divided into two genera: Orthohepevirus, which includes all mammalian and avian HEV isolates, and Piscihepevirus, which includes cutthroat trout HEV. HEV infecting humans belongs to species and has 4 genotypes: Genotype 1 (HEV-1), HEV-2, HEV-3 and HEV-4 (Smith et al., 2014). HEV-1 and HEV-2 viruses infect only humans and are predominant in developing countries, whereas HEV-3 and HEV-4 viruses can infect humans as well as other animals and are prevalent in both developing and industrialized countries (fig. 16). These 4 genotypes have been further divided into subtypes. HEV-1 has 5 subtypes and HEV-2 has 2 subtypes, whereas HEV-3 and HEV-4 display greater diversity and have 10 and 7 subtypes, respectively (Lu et al., 2006).

HEV-1 comprises human strains circulating in Asia and Africa. HEV-2 comprises human strains from Mexico, Nigeria and Chad. HEV-3 comprises human and animal strains from the USA, Canada, France, UK, Argentina, Spain, New Zealand, the Netherlands, Austria, etc. HEV-4 includes human and animal strains detected in China, Japan, Taiwan, Vietnam, India, Italy and France (Lu et al., 2006). HEV-1 and HEV-2 are responsible for waterborne outbreaks in Asian and African countries and in Mexico. HEV-3 and HEV-4 strains are reported to cause acute autochthonous cases in European countries, USA, Argentina, China and Japan (Teshale et al., 2010a).

Hepatitis E-like viruses have also been found in Central American, European, and African bats, forming a novel phylogenetic clade in the family Hepeviridae. Bat viruses comprise a distinct genus within the family Hepeviridae and cannot infect humans (Drexler et al., 2012). Another strain, isolated from trout in the western USA, has been termed the cutthroat trout virus.

Figure 16: Phylogenetic tree of hepatitis E virus (HEV) isolates, based on a partial nucleotide sequence of the capsid gene (Krain et al., 2014)

1.8.4. Physicochemical properties

Hepatitis E virus is a nonenveloped, spherical particle with a diameter of ~30 to 34nm (Balayan et al., 1983). The virus is sensitive to CsCl and to freeze-thawing. It is less stable than the hepatitis A virus. The buoyant density of hepatitis E viral antigen or virus particles has been found to be 1.35 g/cm3 (Balayan et al., 1990) and 1.39 to 1.40 g/cm3 (Favorov et al., 1989) in CsCl and 1.29 g/cm3 in potassium tartrate or glycerol. The sedimentation coefficient of HEV is 183S (Bradley et al., 1988). The HEV particle contains genomic RNA enclosed within a capsid made up of a single protein. It is not known whether subgenomic RNA transcripts are packaged in the capsid with the genomic RNA.


1.8.5. Animal models and cell culture systems

For HEV transmission studies, non-human primates such as rhesus, cynomolgus and owl monkeys and chimpanzees have been used successfully by many researchers, providing important information regarding HEV pathogenesis, immunology and vaccinology (Vitral et al., 2005; McCaustland et al., 2000; Ticehurst et al., 1992; Uchida et al., 1991). The ability of HEV to infect pigs has been studied experimentally, and pigs are suspected to be potential reservoirs for HEV (Williams et al., 2001). Anti-HEV antibodies have been detected in a number of other animal species including rats, dogs, cattle, sheep, camels, etc., but it is not known whether these HEV strains are transmissible to humans. Experimental infection of laboratory Wistar rats with HEV has been demonstrated, but no further experimental data have been reported (Maneerat et al., 1996). Recently, experimental infection with HEV genotype 4 was demonstrated in BALB/c nude mice (Huang et al., 2009b). HEV RNA and antigens were detected by PCR and indirect immunofluorescence in liver, kidney, spleen, jejunum, ileum and colon in all of the inoculated mice and in one of the contact-exposed nude mice. Circulation of a ‘novel HEV genotype’ among rabbits in China and subsequent experimental infection of naïve rabbits have been demonstrated (Zhao et al., 2009).

HEV can replicate in a limited number of cell lines of hepatic origin, but these are not robust systems and HEV remains a difficult virus to study in vitro. There has been only limited success in developing suitable tissue culture systems for HEV replication. Earlier studies reported limited propagation of HEV in 2BS and A549 cells (Wei et al., 2000; Huang et al., 1995; Huang et al., 1992) and FRKH cells (Kazachkov et al., 1992). Infection of primary cynomolgus hepatocytes and PLC/PRF/5 cells has been shown, but replication was not robust (Tam et al., 1996; Meng et al., 1997). Tanaka et al (2007) reported an efficient cell-culture system for HEV-3 (JE03-1760F strain) replication. A ten percent (w/v) human fecal suspension with a high HEV load (2 × 10^7 copies/ml) was used to inoculate a hepatocarcinoma cell line (PLC/PRF/5). HEV progeny released into PLC/PRF/5 cell culture supernatants could be passaged 5 times successively, and the highest titer achieved was 8.6 × 10^7 copies/ml on day 60. Tanaka et al (2009) further showed efficient replication of an HEV-4 virus (HE-JF5/15F), recovered from a fulminant hepatitis patient, in PLC/PRF/5 and A549 cells. Recently, selection of a rare virus recombinant containing an insertion of 174nt (58 amino acids) from a human ribosomal protein gene has been reported. This selection occurred during adaptation of the Kernow-C1 strain (HEV-3), from a chronically infected patient, to growth in human hepatoma cells. This virus exhibited a broad host range and produced extracellular virus upon infecting HepG2/C3A (human hepatocarcinoma) cells (Shukla et al., 2011). Zhang et al (2011) recently reported that swine HEV can replicate in both swine and human cells. Using swine anal swab or liver as inoculum for infection of IBRS-2 (swine) and A549 cells, they could detect positive and negative strand RNA. Cytopathic effect could also be seen in both cell lines, though at different passage levels. However, to date there is no robust cell culture system for genotype 1 and 2 viruses, which are human viruses.

1.8.6. Infectious cDNA clones

Most previous studies of the molecular events during HEV replication have used infectious cDNA clones. Successful development of genotype 1 full-genome cDNA clones, in vitro transcription to generate genomic transcripts and transfection of hepatoma cells to obtain infectious virions have been reported (Emerson et al., 2001; Panda et al., 2000). These recombinant genomes could establish infection in rhesus macaques or chimpanzees and produce hepatitis-like symptoms. A further study by Emerson et al (2004) showed that capped full-genome transcripts are translated 32-38 times more efficiently than their uncapped counterparts. Several primate cell lines used in this study, such as PLC/PRF/5, Huh-7, Caco-2, HepG2/C3A, Vero and AGMK, supported HEV replication; however, none of the non-primate cells used supported replication, indicating a requirement for species-specific factors for transcription and translation. Our group has recently demonstrated, using genotype 1 and 4 chimeric constructs, that ORF1 is important in crossing the species barrier (Chatterjee et al., 2016). Quantitative measurement of HEV replication became possible with the development of fused ORF2-GFP and ORF2-luciferase constructs (Cao et al., 2010; Thakral et al., 2005; Emerson et al., 2004).


1.8.7. Genome Organization of HEV

The viral genome is a single-stranded, +ve sense RNA of ~7.2 kb. It consists of a short 5′ non-coding region (NCR) of 27-35nt followed by three open reading frames (ORF1, ORF2 and ORF3) which overlap partially (Tam et al., 1991). The 3’ NCR is 65-74nt long and terminates in a poly(A) tail of 150-200nt. The genome has an m7G cap at the 5’ end (Kabrane et al., 1999) (fig. 17).

Figure 17: A schematic diagram of HEV genome organization

1.8.7.1. Open Reading Frame 1 (ORF1)

ORF1, encompassing approximately two thirds of the viral genome, consists of 5079 nucleotides and encodes a non-structural polyprotein of 1693 amino acids. Based on ORF1 sequence alignments, HEV has been classified in the alphavirus-like superfamily of +ve sense single stranded RNA viruses. The putative domains in the HEV nonstructural polyprotein include methyltransferase (MeT), papain-like cysteine protease (PCP), RNA helicase (Hel) and RNA dependent RNA polymerase (RdRp) (Koonin et al., 1992). Downstream of the MeT and PCP domains there are two domains, namely ‘Y’ and ‘macro’ (earlier known as the ‘X’ domain), respectively. There is also a proline-rich hypervariable region that shows a high degree of nucleotide and amino acid variation between HEV isolates. It is still not clear whether the ORF1 protein is processed into distinct units, as is the case for other +ve sense RNA viruses.
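
The nucleotide and amino acid lengths quoted for ORF1 are related through the triplet code. The short Python sketch below only illustrates that arithmetic: the function name codons_in is hypothetical, and since the text does not state whether the quoted residue count includes the stop codon, the sketch does no more than check that a length is a whole number of codons. The same check applies to the 174nt (58 amino acid) insertion mentioned in section 1.8.5.

```python
def codons_in(nt_length: int) -> int:
    """Return the number of complete codons in a coding segment.

    Raises ValueError if the length is not a multiple of three,
    i.e. if the segment would not stay in frame.
    """
    if nt_length % 3 != 0:
        raise ValueError(f"{nt_length} nt is not a whole number of codons")
    return nt_length // 3


# Lengths quoted in the text (nucleotides).
orf1_codons = codons_in(5079)    # ORF1
insert_codons = codons_in(174)   # ribosomal-gene insertion in Kernow-C1

print(orf1_codons, insert_codons)
```

Both quoted lengths are multiples of three, which is consistent with the insertion being in frame with the ORF1 polyprotein.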


Methyltransferase:

The presence of this domain was hypothesized by Koonin et al (1992) and it was shown to be functional by Magden et al (2001). Evidence of an m7G cap at the 5'-end of the HEV gRNA supported a functional role for the HEV methyltransferase (Kabrane et al., 1999). In the capping process, the 5′ gamma-phosphate is removed from the nascent RNA molecule by an RNA triphosphatase. A recent report from our group demonstrated that this function is associated with the HEV helicase domain. Incubation of purified recombinant HEV helicase protein with either α-32P-RNA or γ-32P-RNA removes 32P only from γ-32P-RNA, indicating that it has a γ-phosphatase activity, which might catalyze the first step in RNA cap formation (Karpe and Lole 2010a). The cap at the 5’ end of the viral genome helps the virus to circumvent interferon-mediated innate immunity, as 5’ triphosphate groups on uncapped RNA are known to be potent activators of the interferon response.

Protease

The conserved association of the X-domain with the viral papain-like cysteine protease (PCP) in HEV suggests that the HEV protease is similar to the proteases of other alphavirus-like viruses. PCP is proposed to be involved in processing of the ORF1 polyprotein. This hypothesis has been tested by many researchers. Ropp et al (2000) observed products of 78 and 107 kDa after long incubations of ORF1 protein expressed in mammalian cells using recombinant vaccinia virus. Mutagenesis of Cys483, the predicted cysteine of the cysteine-histidine catalytic dyad, did not affect the generation of these products. Processing was absent when the protein was expressed in E. coli or HepG2 (human hepatoma) cells (Ansari et al., 2000). Metabolic labeling and immunoprecipitation of lysates from HepG2 cells transfected with in vitro transcribed gRNA yielded distinct ~35 kDa, ~36 kDa and ~38 kDa products, detected with anti-MeT, anti-RdRp and anti-helicase antibodies respectively (Panda et al., 2000). However, the functionality of these individual subunits was never demonstrated. Sehgal et al (2006) observed multiple processed fragments, detectable with anti-hexahistidine or anti-FLAG antibodies, when a His6-ORF1-FLAG fusion protein was expressed in insect cells using recombinant baculoviruses. The processing was inhibited in the presence of E-64d, a cell-permeable cysteine protease inhibitor. It is likely that the processing was carried out by cellular protease/s. In the same study, a 35-kDa N-terminal fragment was characterized as the methyltransferase protein by mass spectrometry. In another study, Suppiah et al (2011) could detect only a 185 kDa precursor when the ORF1 protein was expressed from an HEV infectious cDNA clone (Burma strain; genotype 1) using a plasmid-based system.

Papain-like cysteine proteases are predominantly found in viruses belonging to the alphavirus-like superfamily. These proteases perform the dual function of polyprotein processing and suppression of host defenses. HEV-encoded PCP recognized the LXGG sequence and cleaved synthetic substrates containing ubiquitin or ubiquitin-like small proteins (ubiquitin-AMC, ISG15-AMC, Nedd8-AMC, and SUMO-AMC). The enzyme also removed ISG15 from cellular proteins. Considering recent studies showing the important role of ubiquitination, ISGylation and sumoylation in cellular antiviral pathways, it was suggested that HEV PCP may have a dual role during virus infection: ORF1 polyprotein (pORF1) processing and downregulation of the cellular antiviral response (Karpe and Lole 2011). Subsequently, HEV PCP was shown to play a role in deubiquitinating TBK-1 and RIG-I. Ubiquitination of these proteins is an important step in their activation, which in turn induces type-I interferon (Nan et al., 2014). Karpe and Meng (2012) demonstrated that an active ubiquitin-proteasome system is required for HEV replication.
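
The LXGG recognition sequence mentioned above (leucine, any residue, glycine, glycine) can be located in a protein sequence with a simple pattern search. The following Python sketch is illustrative only: the function find_lxgg and the toy sequence are hypothetical and are not taken from any of the cited studies, and the presence of the motif in a sequence does not by itself show that the site is cleaved.

```python
import re

# L, then any one residue, then GG. The lookahead lets overlapping
# occurrences be reported as separate hits.
LXGG = re.compile(r"(?=(L[A-Z]GG))")


def find_lxgg(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched tetrapeptide) for each LXGG site."""
    return [(m.start() + 1, m.group(1)) for m in LXGG.finditer(protein.upper())]


# Hypothetical toy sequence, used only to exercise the function.
toy_substrate = "MSTLHLVLRLRGG"
print(find_lxgg(toy_substrate))
```

Positions are reported with 1-based numbering, following the usual convention for protein residues.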

Helicase

The putative HEV helicase was shown to contain all 7 conserved motifs found in superfamily-1 helicases and was proposed to contain RNA-binding as well as NTPase domains (Koonin et al., 1992). The functionality of the HEV helicase was recently demonstrated by our group. HEV-Hel showed NTPase and RNA unwinding activities. The enzyme could hydrolyze all rNTPs efficiently; dATP and dCTP were hydrolyzed more efficiently than dGTP and dTTP. The enzyme showed 5’-to-3’ polarity, unwinding only RNA duplexes with 5’ overhangs (Karpe and Lole 2010b).

RNA dependent RNA polymerase

Amino acids 1207-1693 of ORF1 showed strong homology with the RdRp proteins of Rubivirus and of beet necrotic yellow vein virus (BNYVV) (Koonin et al., 1992). HEV-RdRp contains 8 conserved motifs (i-viii or A-H) that are similar to those of RdRps from other +ve sense RNA viruses. HEV-RdRp belongs to supergroup III. It also carries the canonical Glycine-Aspartate-Aspartate (GDD) tripeptide, an amino acid motif conserved across +ve sense RNA viruses. Changing the GDD motif to GAD results in a replication-deficient HEV replicon (Graff et al., 2005a).
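
The GDD to GAD change described above is a single-residue substitution at the first aspartate of the motif. A minimal Python sketch of locating the motif and making that substitution is given here. The helper names find_motif and substitute and the toy fragment are hypothetical, and the fragment is not real HEV sequence.

```python
def find_motif(protein: str, motif: str = "GDD") -> list[int]:
    """Return the 1-based start position of every occurrence of motif."""
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start + 1)          # convert to 1-based numbering
        start = protein.find(motif, start + 1)
    return positions


def substitute(protein: str, position: int, new_residue: str) -> str:
    """Return a copy of protein with the residue at a 1-based position replaced."""
    return protein[:position - 1] + new_residue + protein[position:]


# Hypothetical toy fragment containing a single GDD motif.
toy_rdrp = "MKTAYLSGDDNLVFG"
sites = find_motif(toy_rdrp)

# The first aspartate sits one residue after the motif start: GDD -> GAD.
mutant = substitute(toy_rdrp, sites[0] + 1, "A")
print(sites, mutant)
```

After the substitution the fragment no longer contains a GDD tripeptide, which mirrors the GAD mutant used as a replication-deficient control.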

A putative RdRp-containing polypeptide, encoded by nt 3546-5106 of ORF1, was demonstrated to interact with the 3’ NCR of the HEV gRNA. The protein could also synthesize the complementary strand in vitro (Agrawal et al., 2001). Rehman et al (2008) showed localization of this protein in the ER, suggesting that HEV probably replicates in the ER. The protein was also able to copy transfected HEV +ve sense RNA in cells, indicating that the replicase was functional.

Macro domain or X-domain

In animal cells, macro domains are associated with proteins involved in ADP-ribosylation or poly(ADP-ribose) polymerization and also in ATP-dependent chromatin remodeling. Very few viruses encode a macro domain in their genomes: alphaviruses, coronaviruses, rubella virus and hepatitis E virus. The role of macro domains in the replication of these RNA viruses or in interactions with their hosts is not yet clear. The first direct biochemical evidence indicated that they might be involved in the downstream processing of ADP-ribose 1″-phosphate, which is a side product of cellular pre-tRNA splicing. A polypeptide encompassing amino acids 775-960 (Burmese strain, genotype 1) of HEV ORF1 showed ADPR-1″ phosphatase activity and in vitro binding to poly(ADP-ribose) and ADP-ribose (Neuvonen and Ahola 2009; Egloff et al., 2006). The protein had more affinity for poly(ADP-ribose) than for ADP-ribose. However, the significance of these functions is not yet clear.

Hypervariable region (HVR)

The HEV genome has one hypervariable region (HVR) located in ORF1, which overlaps the proline-rich sequence between the N terminus of the X-domain and the C-terminal part of the putative PCP domain. HEV can tolerate small insertions or deletions in the HVR and changes to amino acid residues in this region (Pudupakam et al., 2009); however, these deletions impair replication efficiency (Pudupakam et al., 2011). Interchange of the HVR sequences between different HEV genotypes results in differential replication efficiencies. A virus strain with a 174-base insertion from a human ribosomal protein gene in the HVR was isolated when the Kernow-C1 strain of HEV-3 was adapted to grow in human hepatocellular carcinoma cells (Shukla et al., 2011). This insertion was stable during serial passages in cell culture and, using an infectious cDNA clone, it was shown to be the crucial factor in adaptation of the virus to growth in cell culture.

1.8.7.2. Open Reading Frame 2 (ORF2)

ORF2 constitutes 1980nt at the 3’-end of the HEV genome and encodes the major viral capsid protein of 660 amino acids. Asparagine (Asn) residues at positions 137, 310 and 562 are glycosylated. This modification is important for the formation of infectious virus particles (Graff et al., 2008). The pORF2 carries an N-terminal arginine-rich signal peptide by which it translocates across the ER, where it is glycosylated by addition of N-linked high-mannose sugar residues. It has been suggested that the glycosylated form of ORF2, gpORF2, is an unstable form of the protein (Torresi et al., 1999). The capsid also binds to HEV genomic RNA and may play a role in viral assembly (Xing et al., 2010). The virus was attenuated when three amino acid mutations (F51L, T59A and S390L) were introduced into the HEV ORF2 capsid protein (Cordoba et al., 2011). This protein is immunogenic and has been shown to be targeted by neutralizing antibodies (Yamashita et al., 2009).

When expressed in insect cells, a truncated HEV capsid protein (112–660aa) was produced which forms virus-like particles (VLPs) with a diameter of 23.7 nm and T (triangulation number) = 1 icosahedral symmetry (Xing et al., 1999), whereas 14–608aa forms T = 3 VLPs (Xing et al., 2011). The three functional domains of the pORF2 molecule, S (shell), M (middle), and P (protruding), were identified from the X-ray crystal structure (Yamashita et al., 2009; Guu et al., 2009). The S domain forms an icosahedral shell that serves as the base on which the M and P domains are arranged, and it is highly conserved among HEV genotypes. The P domain provides the putative binding sites for neutralizing antibodies and the cellular receptor (Xing et al., 2011; Guu et al., 2009; Yamashita et al., 2009). Expression in E. coli has also been shown to produce truncated ORF2 proteins that assemble into higher order structures: p239 (aa 368-606), E2 (aa 394-606) and E2a (aa 459-660). These proteins appear as homodimers (Li et al., 2005a; Zhang et al., 2001).

The recombinant ORF2 proteins/peptides were shown to contain immunodominant epitopes and hence formed the basis of many ELISAs, proliferation assays and vaccine candidates. Immunogenic epitopes were observed to be distributed among 6 antigenic domains in the ORF2 protein (Khudyakov et al., 1999). The minimal neutralizing domain of the virus is located at amino acids 458-607 of the ORF2 protein (Zhou et al., 2006a), which maps to the protruding (P) domain. The ORF2 protein has been used for vaccine development by different approaches. A vaccine based on the 56 kDa recombinant protein (aa 112–607) expressed in a baculovirus expression system went up to a phase III clinical trial, with 95% efficacy (Shrestha et al., 2007). This subunit vaccine protected against genotype 1 (homologous) as well as genotype 2 and 3 (heterologous) HEV challenge in a monkey model (Purcell et al., 2003). However, there has been no further progress with this vaccine. Another success story is the vaccine based on a bacterially expressed recombinant genotype 1 HEV ORF2 protein (p239, aa 368-606). The phase III clinical trial of this vaccine showed >95% efficacy and cross-protection against HEV genotype 4 (Zhu et al., 2010). This vaccine is licensed and commercially available only in China.

The p239 protein can bind to and enter different cell lines susceptible to HEV infection (He et al., 2008). In pull-down experiments, the p239 protein was found to interact with alpha-tubulin, Grp78/BiP and heat shock protein 90 (HSP90). Studies with an inhibitor showed that HSP90 is important for the intracellular trafficking of p239 (Zheng et al., 2010). ORF2 VLPs expressed in insect cells were also shown to bind Huh7 liver cells, and this binding was associated with heparan sulphate proteoglycans (HSPGs) (Kalia et al., 2009).

1.8.7.3. Open Reading Frame 3 (ORF3)

HEV ORF3 is translated from a bicistronic sgRNA (Graff et al., 2006). It overlaps ORF2 by 300 nucleotides at its 3’ end and is the most variable among HEV strains. It encodes a small phosphoprotein of 114 amino acids that remains associated with the cytoskeleton (Zafrullah et al., 1997). The protein bears two hydrophobic domains, D1 and D2, at its N-terminal end and two proline-rich regions, P1 and P2, at its C-terminal end (Ahmad et al., 2011). Phosphorylation at the Ser71 residue guides the interaction of the ORF3 protein with the non-glycosylated ORF2 protein via a 25 aa region in the ORF3 protein (Tyagi et al., 2002). This interaction probably plays a regulatory role in the assembly of HEV virions. However, ORF3 phosphorylation is not essential for infection in macaques (Graff et al., 2005b). The C-terminal region of the ORF3 protein is multifunctional and appears to be involved in virion morphogenesis and pathogenesis. The expressed ORF3 protein displays a filamentous distribution pattern when it interacts with microtubules and a punctate pattern upon interaction with early and recycling endosomes (Kannan et al., 2009). Using an in vitro replication system, it was shown that the ORF3 protein is dispensable for replication in cells (Emerson et al., 2006) but essential for the release of virions. ORF3 is also essential for infection of rhesus macaques (Graff et al., 2005b). Monoclonal antibodies against ORF3 were observed to capture HEV particles from cell culture supernatants and from the serum of HEV-infected patients, but not from fecal samples, indicating that nascent virus has ORF3 on its surface.

ORF3 colocalizes with the cytoskeleton through its D1 hydrophobic domain (Kar-Roy et al., 2004). The D2 hydrophobic domain has been shown to affect cellular iron homeostasis by interacting with hemopexin (Ratra et al., 2008). The P2 region contains a proline-rich PxxP motif that binds to sarcoma (src)-homology 3 (SH3) domain-containing proteins (Korkaya et al., 2001). These proteins are important for signal transduction pathways that promote cell survival.

In polarized Caco-2 cells supporting HEV replication, the ORF3 protein accumulated at the apical membrane, and virus egress also took place from this surface. In a recent study, the ORF3 protein was shown to bind Tsg101 via its PSAP motif and to interact with Tsg101 and α1-microglobulin to help the assembly of the endosomal sorting complex required for transport (Nagashima et al., 2011). In addition, Tsg101 and the vacuolar protein-sorting (Vps) proteins Vps4A and Vps4B have been shown to be involved in the release of HEV particles, suggesting that the multivesicular body (MVB) pathway is involved in viral release. It has also been demonstrated that ORF3 increases the phosphorylation of hepatocyte nuclear factor 4 (HNF4) in Huh7 liver cells. Phosphorylation impairs the nuclear translocation of HNF4 and downregulates HNF4-dependent gene expression (Chandra et al., 2011).

1.9. cis-acting regulatory elements (CREs) in HEV genome (fig. 18, 19)

5’NCR

The 5’NCR of HEV is only 25-27 nt long and has a methylated cap. It forms a complex secondary structure or hairpin (Huang et al., 1992; Pardigon and Strauss 1992). Downstream of the 5’NCR, the coding region encompassing nt 150-208 of ORF1 has homology with the 51 nt conserved sequence present in alphaviruses. A stem-loop structure similar to that of alphaviruses is present in this region of the HEV genome (Dominguez et al., 1990; Niesters and Strauss 1990). The ORF2 protein binds to a 76 nt region at the 5’ end of the HEV genome, suggesting that this interaction may participate in encapsidation of the viral genome (Surjit et al., 2004).

3’NCR

The 3’NCR contains two highly conserved stem-loop structures (SL1 and SL2) (Emerson et al., 2001). The interaction of the viral RdRp with the 3’NCR of HEV RNA and its importance in replication were first documented by Agrawal et al. (2001). The 3’NCR without the poly(A) tail failed to bind the RdRp. Deletion mutants lacking either SL1 or SL2 also failed to form a complex with the RdRp, indicating that the complete 3’NCR, including both stem-loop structures and the poly(A) tail, is required for RdRp binding. Sequence variation in the 3’NCR among genotypes is very high. Chimeric genomes made by swapping the 3’NCR of a genotype 1 virus with the divergent 3’NCR sequences of HEV-2 and HEV-3 replicate as efficiently as the parental genome in culture and infect rhesus monkeys with an efficiency similar to that of the wild-type genome (Graff et al., 2005a). However, genomes harbouring a mutation at nt position 7106 in the 3’NCR cannot infect rhesus monkeys. These results suggest that both the secondary structures and the sequence of this region are important for genomes to be infectious and replication competent (fig. 19).

Putative subgenomic promoter

A highly conserved stem–loop structure identified within the HEV junction region (JR) has very high sequence homology with the JR present in the Rubella virus genome and with the conserved alphavirus subgenomic promoter (SgP) sequence (Cao et al., 2010; Huang et al., 2007). It has been suggested that the putative SgP is located in this region of the HEV genome and that it directs sgRNA synthesis. A 12-nucleotide region (5105-5116) within the JR is conserved in all mammalian HEV genotypes. Graff et al. (2005a) reported that synthesis of the ORF2 and ORF3 proteins is abolished when sequences in this region are altered. An in-depth study involving genotype 3 swine HEV and mutants with changes in the junction region identified a double stem-loop (SL) structure of approximately 50 nucleotides (Cao et al., 2010). A conserved 6-nucleotide sequence, AAUAAC, was found in the loop region, and mutations altering this sequence resulted in less efficient virus replication than in the wild type. The AAU/C triplet was found to have a three-nucleotide overlap with one arm of the SL stem. A mutation changing the nucleotide sequence to AACAAG prevented the pairing of two base pairs in the SL stem and inhibited HEV replication. An attempt to replace the stem with a mutated complementary sequence did not result in virus recovery, indicating that both the sequence and the structure play an important role in HEV replication (fig. 19).

A regulatory element situated in ORF2 encoding region

Emerson et al. (2013) have recently reported the presence of two highly conserved stem-loop structures (ISL1 and ISL2) in the centre of the ORF2 coding region that are essential for capsid protein synthesis. Silent mutations in this region were shown to have a negative effect on capsid protein synthesis.

Figure 18: Proposed cis-regulatory elements in HEV genome (Purdy et al., 1993)

Figure 19: The secondary structure of the proposed cis-regulatory elements of HEV-1 (strain Sar-55) as predicted using Mfold (Cao and Meng 2012). (A) Secondary structures of the 3’ end of the HEV genome (nt 7082–7192 An) showing SL1 and SL2. (B) Negative-polarity complement of the JR present in the HEV genome (nt 5096–5157); the subgenomic mRNA, ORF2 and ORF3 start sites are indicated by arrows.

1.10. HEV life cycle and genome replication (fig. 20)

HEV entry into target cells

Virus attachment and entry into the host cell are the first steps in a successful infection. For HEV, these processes remain largely unknown because a robust cell culture system or small animal model is lacking. It has recently been reported that dimerization of the antigenic domain E2s (aa 455–602) of the ORF2 protein is required for virus-host interaction and for binding of neutralizing antibodies. The HEV ORF2 protein interacts with heparan sulphate proteoglycans (HSPGs) present on the host cell surface to initiate viral infection (Kalia et al., 2009). It has been demonstrated that p239 VLPs bind to and penetrate the HepG2 cell membrane and are then transported by the cellular HSP90 protein to perinuclear sites (Zheng et al., 2010). HEV has been shown to enter cells through clathrin-mediated endocytosis (Kapur et al., 2012). Following internalization, the HEV particle traffics to Rab5-positive compartments on the way to acidic lysosomal compartments. Entry of HEV into cells requires dynamin-2, clathrin, actin and membrane cholesterol, but is independent of factors associated with macropinocytosis (Holla et al., 2015).

Translation and processing of ORF1 proteins

After entry of the HEV genome into the cell, ORF1 is translated by the host cell machinery through a cap-dependent mechanism to produce a non-structural polyprotein containing multiple domains that are required for viral replication and protein processing (Ansari et al., 2000; Ropp et al., 2000). It is still not known whether the ORF1 polyprotein, with its multiple enzymatic domains, is processed into individual functional subunits or remains intact and functions as a single protein.

Translation of downstream Open Reading Frames (ORF2 and ORF3)

Hepatitis E virus is a +ve sense RNA virus, and its replication proceeds via synthesis of a –ve sense replicative intermediate. Tam et al. (1991) found three RNA species of ~7.2, 3.7 and 2.0 kb in liver tissue of experimentally infected macaques, which were designated the genomic RNA and two subgenomic RNAs, respectively. As per this model, the two subgenomic RNAs of 3.7 and 2.0 kb would be used to translate the ORF3 and ORF2 proteins, respectively. Of the four AUG codons (5104, 5113, 5131 and 5145) present in the junction region, it was proposed that ORF3 starts at 5104 and overlaps by one nucleotide with the ORF1 stop codon at position 5105 (with respect to the sequence of the genotype 1 SAR-55 strain). The codons at nt positions 5113 and 5131 were considered to code for internal methionine residues in the ORF3 protein, and the AUG codon at 5145 was considered to be the start codon for the ORF2 protein. This hypothesis was disproved by Graff et al. (2006). They generated stably transformed cell lines that were selected for their ability to survive in the presence of G418 owing to expression of the neomycin resistance gene from either ORF2 or ORF3 of an HEV replicon. Although they could detect two RNA species of approximately 7.3 and 2.2 kb, they failed to detect the proposed second subgenomic RNA of 3.7 kb. The 2.2 kb subgenomic RNA was capped, and its 5’ end mapped to nucleotide 5122. On the basis of these results, this group proposed a new model with the ORF2 start at 5145 (as in the previous model), the ORF3 start at 5131, and translation of the ORF2 and ORF3 proteins from the same bicistronic subgenomic mRNA. Replacement of the AUG codon at 5131 with an alanine codon abolished expression of the ORF3 protein, confirming the suggested start. This model could also explain the differences in reading frames observed in HEV-4 isolates, which contain a T base insertion between nucleotides 5116/5117 (in SAR-55). AUG3 (the third AUG, at 5131) was proposed to be the translation start of ORF3 for all mammalian HEV genotypes (HEV-1 to HEV-4). This model was confirmed by Huang et al. (2007) through intrahepatic inoculation of pigs with wild-type and mutant genotype 3 swine HEV replicons. They demonstrated that mutation of AUG1 (the first AUG) or insertion of a T base as in HEV-4 does not affect virus infectivity, whereas mutation of AUG3 abolishes it. Support for a single subgenomic mRNA model also came from the study by Ichiyama et al. (2009), who either transfected PLC/PRF/5 cells with in vitro transcribed full-genome RNA of HEV-3 or infected them with a fecal suspension containing HEV-4 virus. The isolated RNA showed only one subgenomic RNA of 2.2 kb, with its start at 5122. It is therefore proposed that ORF2 and ORF3, which overlap each other but not ORF1, are translated from a single bicistronic mRNA.
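The reading-frame argument in this paragraph can be checked with simple arithmetic on the coordinates quoted in the text. The sketch below is illustrative only: it uses the four AUG positions, the subgenomic RNA 5’ end (5122) and the HEV-4 insertion site (5116/5117) exactly as given above in SAR-55 numbering, while the labels AUG1 to AUG4 (assigned in the order the positions are listed) and all function names are assumptions made for the example. The frame of a codon is taken as its start position modulo 3.

```python
# Coordinates as quoted in the text (genotype 1, SAR-55 numbering).
# The labels AUG1..AUG4 simply follow the order in which the text lists them.
AUG_POSITIONS = {"AUG1": 5104, "AUG2": 5113, "AUG3": 5131, "AUG4": 5145}
SGRNA_5_END = 5122      # mapped 5' end of the 2.2 kb subgenomic RNA
INSERTION_AFTER = 5116  # HEV-4 carries an extra T between 5116 and 5117


def frame(position: int) -> int:
    """Reading frame (0, 1 or 2) of a codon that starts at a 1-based position."""
    return position % 3


def same_frame(a: int, b: int) -> bool:
    """True if codons starting at positions a and b are read in the same frame."""
    return frame(a) == frame(b)


def after_insertion(position: int, inserted_after: int = INSERTION_AFTER) -> int:
    """Coordinate of a codon once one extra nucleotide is inserted upstream of it."""
    return position + 1 if position > inserted_after else position


def on_sgrna(position: int) -> bool:
    """True if a codon starts at or downstream of the subgenomic RNA 5' end."""
    return position >= SGRNA_5_END


if __name__ == "__main__":
    for name, pos in AUG_POSITIONS.items():
        shifted = after_insertion(pos)
        print(name, pos, "frame", frame(pos),
              "| with T insertion:", shifted, "frame", frame(shifted),
              "| on sgRNA:", on_sgrna(pos))
```

With these coordinates, the codons at 5104, 5113 and 5131 share one frame and the codon at 5145 lies in another, which fits both models' assignment of ORF2 to a separate frame. Only the codons at 5131 and 5145 lie downstream of the sgRNA 5’ end, and a single-nucleotide insertion after 5116 takes the codon at 5104 out of frame with the codon at 5131, so it could not serve as a conserved ORF3 start in HEV-4.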

A cis-reactive element, UGAAUAACAUGU (nucleotides 5106-5117), located upstream of nucleotide 5122, is highly conserved among mammalian HEV strains and has been reported to be critical for synthesis of the 2.2 kb subgenomic RNA (Huang et al., 2007).
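As a small consistency check on this element, the sketch below confirms that the quoted sequence has the length implied by its coordinates, that it contains the AAUAAC hexamer described for the junction region, and derives its negative-sense complement, since subgenomic promoters are read on the antigenome. The sequence and coordinates are taken from the text; the function and variable names are hypothetical and serve only this illustration.

```python
# Element and coordinates as quoted in the text (positive-sense RNA).
ELEMENT = "UGAAUAACAUGU"
ELEMENT_START, ELEMENT_END = 5106, 5117
HEXAMER = "AAUAAC"  # conserved loop sequence described for the junction region

# Watson-Crick pairing for RNA; only A, C, G and U are handled.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the complementary strand of an RNA sequence, written 5' to 3'."""
    return rna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print("length:", len(ELEMENT))
    print("span  :", ELEMENT_END - ELEMENT_START + 1)
    print("contains hexamer:", HEXAMER in ELEMENT)
    print("negative-sense complement (5'->3'):", reverse_complement(ELEMENT))
```

The negative-sense string is what the RdRp would encounter on the antigenome template; whether recognition depends on this sequence, on the surrounding stem-loop, or on both is not something the sketch addresses.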

Virus Assembly and release

It has been suggested that the ORF3 protein may play a role in virus replication and virion morphogenesis (Yamada et al., 2009). The viral genome has an encapsidation signal at the 5’ end, which is recognized by the capsid protein for packaging during virus assembly (Surjit et al., 2004). The 76 nt region (130-206) at the 5’ end of the HEV genome was found to be important for binding to the ORF2 protein, and a further downstream 44 nt sequence contributed to strengthening this interaction. It is also known that the ORF3 protein interacts with the ORF2 protein (Tyagi et al., 2002) and with cellular Tsg101 (Surjit et al., 2006). The PSAP motif of the ORF3 protein, which is conserved in all HEV isolates including avian HEV, binds to Tsg101 (Surjit et al., 2006). Tsg101 is a key factor in the formation of multivesicular bodies (MVBs), in which cellular vesicles bud away from the cytoplasm into the lumen to create the MVBs. The MVB pathway is employed in the budding of several enveloped viruses from the plasma membrane (Pornillos et al., 2002). It has been suggested that the ORF3 protein recruits Tsg101 to promote budding of ‘enveloped’ HEV.

Figure 20: Proposed life cycle of HEV (Cao and Meng 2012)

1.11. Rationale of the study

Positive-sense RNA viruses translate their proteins immediately after entering the host cell to initiate virus replication. Their RNA genomes encode viral proteins and also contain cis-acting RNA elements that direct different viral processes, such as protein translation, genome replication and transcription of subgenomic RNAs. Most of the regulatory RNA elements are present in the 5’ and 3’ non-coding regions of viral genomes, while subgenomic promoters are located within the genome.

The replication and transcription strategies of HEV are poorly understood. It is presumed that the first step in HEV genome replication is synthesis of a replicative intermediate RNA (the negative-sense antigenome) from the genomic RNA. The antigenome then serves as the template for synthesis of two classes of positive-sense RNA molecules, genomic (gRNA) and subgenomic (sgRNA). Synthesis of the sgRNA requires the RdRp to initiate from an internal promoter (SgP) within the antigenome. The recognition elements that determine the template specificity of the HEV RdRp and ensure amplification of the appropriate viral RNA species are not yet well characterized. It is also not known how the replicase recognizes the subgenomic promoter in the presence of the 5’ and 3’ ends of the genome, which are also binding sites (promoters) for it. Nor is it clear whether the subgenomic mRNAs are transcribed from a full-length minus strand or from two different subgenomic minus strands. In the current study, we aimed to characterize the HEV RdRp and to analyze how it interacts with the cis-regulatory elements in the HEV genome.
