University of Tennessee, Knoxville TRACE: Tennessee Research and Creative Exchange

Doctoral Dissertations Graduate School

8-2010

5’-Proximal cis-Acting RNA Signals for Genome Replication

Bo-Jhih Guan Microbiology, [email protected]

Follow this and additional works at: https://trace.tennessee.edu/utk_graddiss

Part of the Virology Commons

Recommended Citation Guan, Bo-Jhih, "5’-Proximal cis-Acting RNA Signals for Coronavirus Genome Replication. " PhD diss., University of Tennessee, 2010. https://trace.tennessee.edu/utk_graddiss/802

This Dissertation is brought to you for free and open access by the Graduate School at TRACE: Tennessee Research and Creative Exchange. It has been accepted for inclusion in Doctoral Dissertations by an authorized administrator of TRACE: Tennessee Research and Creative Exchange. For more information, please contact [email protected]. To the Graduate Council:

I am submitting herewith a dissertation written by Bo-Jhih Guan entitled "5’-Proximal cis-Acting RNA Signals for Coronavirus Genome Replication." I have examined the final electronic copy of this dissertation for form and content and recommend that it be accepted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, with a major in Microbiology.

David A. Brian, Major Professor

We have read this dissertation and recommend its acceptance:

Chunlei Su, Gladys Alexandre, Albrecht von Arnim

Accepted for the Council:

Carolyn R. Hodges

Vice Provost and Dean of the Graduate School

(Original signatures are on file with official studentecor r ds.) To the Graduate Council:

I am submitting herewith a dissertation written by Bo-Jhih Guan entitled “5’-Proximal cis-Acting RNA Signals for Coronavirus Genome Replication.” I have examined the final electronic copy of this dissertation for form and content and recommend that it be accepted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, with a major in Microbiology.

David A. Brian, Major Professor

We have read this dissertation and recommend its acceptance:

Chunlei Su

Gladys Alexandre

Albrecht von Arnim

Accepted for the Council:

Carolyn R. Hodges Vice Provost and Dean of the Graduate School

(Original signatures are on file with official student records.) 5’-Proximal cis-Acting RNA Signals for Coronavirus Genome Replication

A Dissertation Presented for the Doctor of Philosophy Degree The University of Tennessee, Knoxville

Bo-Jhih Guan August 2010

Copyright © 2010 by Bo-Jhih Guan All rights reserved.

ii ACKNOWLEDGEMENTS I am heartily thankful to my major professor, Dr. David A. Brian, for his guidance, support, and encouragement from the initial to the final level enabled me to develop an understanding of the subject. It is a great honor and pleasure doing research under Dr. Brian’s mentoring and supervision. I am grateful to my committee members, Dr. Chunlei Su, Dr. Gladys Alexandre, and Dr. Albrecht von Arnim, for their enthusiasm, discussion, and inspiration. I would also like to make a special reference to Dr. Ralph S. Baric who is a professor of the University of North Carolina. Without his kindness I could not have gotten such relevant data. In addition, this dissertation would not have been possible without the help of the current and past Brian lab’s colleagues, Kimberley Nixon, Kortney Gustin, Hung-Yi Wu, Yu-Pin Su, Yi-Hsin Fan, Agnieszka Dziduszko, and Tara Tucker. I would like to show my gratitude to all their advices and friendships. Lastly, I offer my regards and blessings to all of those who supported me in any respect during the completion of the dissertation research. Thank you.

iii ABSTRACT RNA sequences and higher-order structures in the 5’ and 3’ untranslated regions (UTRs) of positive-strand RNA are known to function as cis-acting elements for translation, replication, and transcription. In , these are best characterized in the group 2a (BCoV) and mouse hepatitis (MHV), yet their precise mechanistic features are largely undefined. Here, we use a reverse genetics system in MHV to exploit the ~30% nt sequence divergence between BCoV and MHV to establish structure/function relationships of 5’ UTR cis-replication elements. It had been previously shown that a precise replacement of the 391-nt MHV 3’ UTR with the 288-nt BCoV 3’ UTR yields wt-like MHV. Our attempts to replace the 209-nt MHV 5’ UTR with the 210-nt BCoV 5’ UTR, however, yielded a non-viable chimera. Therefore, a systematic analysis of individual 5’-terminal structures was made to identify compatible elements. By placing each of four putative cis-acting domains from the BCoV 5’ UTR into the MHV genome, we learned that (i) stem-loops (SLs) I & II and SLIII are functionally compatible, (ii) SLIV is compatible if it spans parts of the 5’ UTR and the nonstructural protein 1 (nsp1) cistron, thus identifying this part of ORF 1 as a component of the cis-replication signal, (iii) a relatively unstructured 32-nt region mapping between SLIII and SLIV defines a novel virus species-specific cis-replication element, (iv) spontaneous suppressor mutations within MHV SLI and nsp1 cistron compensated for growth defects arising from the BCoV 32-nt element in the MHV genome, (v) cross talk between the 32-nt element, SLI, and the nsp1 cistron appears essential for virus replication, (vi) the BCoV 5’ UTR and nsp1 cistron function together in the MHV genome to generate a wt-like MHV phenotype, and (vii) a functional 5’ UTR-nsp1 domain in group 2a coronaviruses cannot be substituted by the corresponding genomic element from the group 2b SARS-CoV. We postulate that the interaction between the 5’ UTR and nsp1 cistron (or possibly nsp1 protein) functions as a molecular switch between genome translation and ignition of negative-strand RNA synthesis.

iv TABLE OF CONTENTS

Chapter Page I. LITERATURE REVIEW ...... 1 Background...... 1 Questions that led to this dissertation research...... 16 II. CIS-REPLICATION STEM-LOOP IV IN THE MOUSE AND BOVINE CORONAVIRUS GENOMES SPANS PARTS OF THE 5’ UNTRANSLATED REGION AND NONSTRUCTURAL PROTEIN 1 CISTRON ...... 19 Introduction...... 19 Materials and methods ...... 22 Results...... 28 Discussion...... 39

III. GENETIC EVIDENCE FOR INTERACTION BETWEEN TWO 5’-TERMINAL GENOMIC CIS-REPLICATION ELEMENTS AND NONSTRUCTURAL PROTEIN 1 IN THE MOUSE HEPATITIS CORONAVIRUS...... 42 Introduction...... 42 Materials and methods ...... 44 Results...... 48 Discussion...... 59

IV. STEM-LOOP III IN THE 5’ UTR, IDENTIFIED PREVIOUSLY AS A CIS-

REPLICATION ELEMENT FOR BCOV DI RNA, IS NOT ESSENTIAL FOR REPLICATION OF THE MOUSE HEPATITIS CORONAVIRUS ENOME ...... 68 Introduction...... 68 Materials and methods ...... 71 Results...... 73 Discussion...... 79

V. FUTURE DIRECTIONS ...... 82 Introduction...... 82 To identify other viral factors that interact with 5’ genomic RNA cis-replication elements ...... 82

v To identify the functional defect arising from the exchange of the incompatible species-specific 32 (30)-nt 5’ UTR segment between BCoV and MHV ...... 83 To characterize the 5’ UTR-nsp1 interaction ...... 84 To explore the possibility and means of 5’-3’-end cross talk of the viral genome..85 LIST OF REFERENCES ...... 86 APPENDIX...... 106 Expression and purification of BCoV nucleocapsid protein using SUMO-fusions ...... 107 BCoV N binds cis-replication structures in the 5’ UTR ...... 110 BCoV N binds SLIII with specificity and micromolar affinity ...... 113 BCoV nsp1 binds SLIII and its flanking sequences with specificity and micromolar affinity...... 118 BCoV N and nsp1 bind the negative-strand counterpart of SLIII ...... 122 There is no detectable RNA-RNA interaction between the 5’ and 3’ UTRs of BCoV based on a gel-shift assay...... 127 VITA...... 130

vi LIST OF TABLES TABLE Page TABLE 1.1. Representative coronavirus species and groups………………………..2 TABLE 2.1. Oligonucleotides used in Chapter II…..………………………………24 TABLE 3.1. Oligonucleotides used in Chapter III……..…………………………...46 TABLE 4.1. Oligonucleotides used in Chapter IV………..………………………...72

vii LIST OF FIGURES

Figure Page FIG. 1.1. Schematic depiction of coronavirus particle and life cycle...... 5 FIG. 1.2. Organization and gene expression of the MHV genome...... 7 FIG. 1.3. Model for coronavirus RNA synthesis...... 10 FIG. 1.4. Structure of the BCoV DI RNA and some of the observations that led to this

research...... 12 FIG. 2.1. Comparison of RNA structures in the 5’-terminal regions of the mouse

hepatitis coronavirus, A59 strain (MHV-A59) and the bovine coronavirus,

Mebus strain (BCoV-Mebus)...... 20 FIG. 2.2. BCoV SLs I & II, and SLIII domains support virus replication in the MHV

background...... 30 FIG. 2.3. BCoV nt 142-226 can substitute for its MHV counterpart after one blind cell

passage...... 32

FIG. 2.4. A variety of SLIV structures are tolerated for virus replication...... 34 FIG. 2.5. Detrimental effects of the BCoV 32-nt unstructured region in the MHV

background...... 36 FIG. 2.6. Disruption of the lower helix of stem-loop V in the nsp1 coding region does

not disable MHV replication...... 38 FIG. 3.1. Cis-replication elements in the MHV 5’ UTR...... 43 FIG. 3.2. Severely impaired growth of the BCoV/MHV chimera (B142-173/M; also

32-nt named B /M) is restored to wild-type levels through 1-10 viral passages.....49 FIG. 3.3. Putative suppressor mutations found in the 5’ UTR and nsp1 coding regions

142-173 32-nt after serial passaging of B /M (B /M)...... 50 FIG. 3.4. Identification of suppressor mutations by reconstituting viable viruses

harboring revertant mutations...... 52 FIG. 3.5. Genetic cross talk between the 5’ UTR and nsp1 in the MHV genome...... 54 FIG. 3.6. Functional exchangeability of the 5’ UTR-nsp1 regions in the viral genome

between BCoV and MHV...... 56

viii FIG. 3.7. Functional inexchangeability of the 5’ UTR-nsp1 domains in the viral genome between SARS-CoV and MHV...... 58

FIG. 3.8. Phylogenetic relationships of 5’ UTR nucleotides and nsp1 amino acids

among group 2a coronaviruses...... 61

FIG. 3.9. Hypothetical model for a switch from translation to initiation of

negative-strand RNA synthesis during coronavirus replication...... 65

FIG. 4.1. Cis-replication SLIII in the MHV 5’ UTR...... 69

FIG. 4.2. Stem and loop mutants of BCoV SLIII in the MHV genome...... 74

FIG. 4.3. Predicted alternative “long” secondary structure for MHV and BCoV SLIII...76

FIG. 4.4. Growth of MHV SLIII truncation mutants in cell culture...... 78 FIG. A.1. Expression and purification of BCoV N protein by SUMO-fusion in E. coli. 108 FIG. A.2. RNA structures in the BCoV genome tested for BCoV N binding...... 111 FIG. A.3. EMSA of purified BCoV N protein binding to the positive-strand RNA probe, (+) 69-182 nt...... 114 FIG. A.4. EMSA of purified BCoV N binding to the positive-strand RNA probe, (+) 85-126 nt...... 116 FIG. A.5. EMSA of purified BCoV nsp1 binding to the positive-strand probes, (+) 69-182 nt and (+) 85-126 nt...... 119 FIG. A.6. EMSA of purified BCoV N binding to the negative-strand probes, (-) 69-182 nt and (-) 85-126 nt...... 123 FIG. A.7. EMSA of purified BCoV nsp1 binding to the negative-strand RNA probes, (-) 69-182 nt and (-) 85-126 nt...... 125 FIG. A.8. Positive-strand RNA-RNA binding analysis between the BCoV 5’ and 3’ terminal cis-replication RNA elements...... 128

ix LIST OF ABBREVIATION aa amino acid BCoV bovine coronavirus BP blind passage BSL bulged stem-loop BSL-PK bulged stem-loop along with pseudoknot CPE cytopathic effect cpm counts per minute DI RNA defective interfering RNA eIF eukaryotic initiation factor E protein envelope protein EMSA electrophoretic mobility shift assay ER endoplasmic reticulum ERGIC ER-Golgi intermediate compartment HCoV human coronavirus HE hemagglutinin-esterase hnRNP A1 heterogeneous nuclear ribonucleoprotein A1 hpe hours post electroporation hpi hours post hpt hours post transfection h hour HSP heat shock protein HVR hypervariable region IBV infectious bronchitis virus IDD intrinsically disordered domain kb kilobase

Kd dissociation constant kDa kilodalton MHV mouse hepatitis coronavirus MOI multiplicity of infection

x M protein membrane protein N protein nucleocapsid protein NMR nuclear magnetic resonance nsp nonstructural protein nt nucleotide ORF open reading frame PABP poly(A) binding protein PAGE polyacrylamide gel electrophoresis PFU plaque forming unit PK pseudoknot PL1pro papain-like protease 1 PL2pro papain-like protease 2 PTB polypyrimidine tract binding protein RdRp RNA-dependent RNA polymerase RTC replication/transcription complex rpm revolutions per minute S protein spike protein SARS severe acute respiratory syndrome SARS-CoV SARS coronavirus sgmRNA subgenomic messenger RNA SL stem-loop ssRNA single-stranded RNA SUMO small ubiquitin-related modifier SYNCRIP synaptotagmin-binding cytoplasmic RNA-interacting protein TGEV transmissible virus TRS transcription regulating sequence uORF upstream open reading frame UTR untranslated region VP viral passage wt wild type

xi CHAPTER I. LITERATURE REVIEW

BACKGROUND What are coronaviruses? Coronaviruses are a genus of animal viruses within the , a family grouped with Roniviridae and Arteriviridae to form the order (22). Coronaviruses infect primarily the respiratory and enteric tracts of numerous animal species, including humans, often causing clinically significant acute respiratory and gastrointestinal diseases. can be systemic or localized and can occasionally cause hepatitis, cardiovascular disease, or neurological illness (Table 1.1.) (118). In human medicine it has been known since the 1960’s that coronaviruses cause ~10-20% of common colds and were not considered serious human pathogens until the outbreak of severe acute respiratory syndrome (SARS) in 2002-2003 which was a coronavirus infection characterized by a rapid spread and ~10% mortality rate. Two new human coronaviruses associated with severe in children, human coronavirus NL63 (HCoV-NL63) and coronavirus HKU1 (HCoV-HKU1), have also recently been found (29, 79). In veterinary medicine, coronavirus infections in domestic animals are of major economic importance since mortality rates can reach 100% in newborns and weanlings. Thus, coronaviruses are currently considered important viral pathogens in both human and veterinary medicine (75, 116). The Coronavirus Family. Before 2003, coronaviruses were classified as three groups, 1, 2, and 3, based on serologic relationships. Groups 1 and 2 comprised all mammalian coronaviruses and group 3 comprised the avian coronavirus (Table 1.1.) (52). In 2003 the sequences of only 10 complete genomes were known. Since the discovery of the SARS coronavirus (SARS-CoV) a burst of research has taken place world-wide and many new coronaviruses from different species have been discovered and their genomes sequenced. Currently, 26 complete genomes have led to a newly proposed sub- classification in each group based on the phylogenetic clustering of nucleotide sequences and genome organization (Table 1.1.) (52, 119, 164). Group 1 coronavirus have been divided into subgroups 1a and 1b where the transmissible gastroenteritis virus (TGEV) and human enteric coronavirus strain 229E (HCoV-229E) respectively serve as prototypes.

1 TABLE 1.1. Representative coronavirus species and groups GenBank Group Designation Species Hosta Disease Cellular accession receptor numberb Transmissible TGEV Pig Gastroenteritis APN NC_002306 gastroenteritis virus FCoV Feline enteric coronavirus Cat Enteritis APN Y13921 Feline infectious Enteritis, 1a FIPV Cat APN AY994055 peritonitis virus peritonitis CCoV Dog Enteritis APN D13096 Group BtCoV* 1 Bat coronavirus Bat Unknown Unknown NC_010437 Human enteric coronavirus HCoV-229E Human Nasopharyngitis APN NC_002645 strain 229E Human coronavirus strain 1b HCoV-NL63 Human Pneumonia ACE2 NC_005831 NL63 Porcine epidemic diarrhea PEDV Pig Enteritis Unknown NC_003436 virus Hepatitis, MHV Mouse hepatitis virus Mouse CEACAM 1 NC_001846 encephalitis 9-O-acetylated BCoV Bovine coronavirus Cow Enteritis U00735 sialic acid Enteritis, 9-O-acetylated ECoV Equine coronavirus Horse NC_010327 diarrhea sialic acid Porcine hemagglutinating PHEV Pig Encephalitis Unknown NC_007732 encephalomyelitis virus Antelope Sable 9-O-acetylated Sable antelope coronavirus Unknown EF424621 CoV antelope sialic acid 2a Calf-Giraffe 9-O-acetylated Calf-giraffe coronavirus Giraffe Unknown EF424624 CoV sialic acid Canine respiratory Group CRCoV Dog Pneumonia Unknown CQ772298 2 coronavirus Human coronavirus strain 9-O-acetylated HCoV-OC43 Human Nasopharyngitis NC_005147 OC43 sialic acid Human coronavirus strain HCoV-HKU1 Human Pneumonia Unknown NC_006577 HKU1 Human enteric coronavirus HECoV-4408 Human Enteritis Unknown FJ415324 strain 4408 Severe acute respiratory SARS-CoV Human Pneumonia ACE2 NC_004718 2b syndrome coronavirus BtCoV* Bat coronavirus Bat Unknown Unknown NC_009696 2c BtCoV* Bat coronavirus Bat Unknown Unknown NC_008315 2d BtCoV* Bat coronavirus Bat Unknown Unknown NC_009021 IBV Infectious bronchitis virus Chicken Pneumonia Unknown NC_001451 3a PhCoV Pheasant coronavirus Pheasant Pneumonia Unknown AJ618988 TCoV Turkey coronavirus Turkey Gastroenteritis Unknown AY342357 Group 3b SW1 Beluga whale coronavirus Whale Unknown Unknown NC_010646 3 BuCoV Bulbul coronavirus Bulbul Unknown Unknown NC_011548 3c ThCoV Thrush coronavirus Thrush Unknown Unknown NC_011549 MuCoV Munia coronavrius Munia Unknown Unknown NC_011550

a Host denotes animals from which virus was first isolated. b One representative GenBank accession number is given for each species. When available, a complete genomic sequence is given. *More than 60 bat coronavirus species have been identified and tentatively classified as members of group 1 or group 2. Abbreviations: APN, aminopeptidase N; ACE2, angiotensin-converting enzyme 2; CEACAM 1, carcinoembryonic antigen adhesion molecule 1.

2 The best established sub-classification lies in group 2 coronaviruses where group 2a is considered to be one lineage harboring genes of hemagglutinin-esterase (HE) and two papain-like proteases (PL1pro and PL2pro) (164). Two of the most extensively studied coronaviruses, mouse hepatitis virus (MHV) and bovine coronavirus (BCoV), belong to the group 2a and will be the focus of this dissertation. The human SARS-CoV and other SARS-CoV-like viruses isolated from civets and bats are most likely an early split-off from the group 2a, and were classified as group 2b coronaviruses (140). Group 2c and 2d are two additional subgroups of group 2 coronaviruses isolated from different genera of bats. Infectious bronchitis virus (IBV) and other avian coronaviruses comprise the group 3a and 3c coronaviruses whereas the first non-avian coronavirus isolated from beluga whale is classified into group 3b (104, 165). The striking diversity in species is one of the significant features of the coronavirus genus. Most individual coronavirus species naturally infect only one or a few closely- related animal species due to receptor specificity or compatibility within another replication feature. However, as the result of currently incompletely-understood attributes of the coronavirus RNA-dependent RNA polymerase (RdRp), an unusual mechanism of template switching during RNA synthesis (described below) occurs during coronavirus RNA replication and transcription such that homologous and heterologous recombination takes place at high rates leading to coronaviral diversity (39, 76, 115). For the ~27-32 kilobase (kb) coronaviral genome, the largest known among all RNA viruses, the recombination rate has been measured at ~25% across the genome among all progeny of a double infection (8). Accordingly, it is tempting to propose that bats and birds are gene pools for coronavirus recombination in their respective viral subgroups (164). The fact that bats and birds display species diversity, are able to fly long distances, are adaptable to a wide variety of environments, and roost and flock together probably promotes virus recombination and spreading to different hosts (160, 164). Interspecies jumping and rapid adaptation by naturally-occurring recombinant coronaviruses could explain the observation that closely related coronaviruses were found from distantly related host species and thus may have been the cause of the disastrous SARS-CoV outbreak (166).

3 Coronavirus Virion Morphology. Coronavirus virions are spherical particles of medium size (80-120 nm diameters) with an endoplasmic reticulum (ER)-derived membrane envelope containing characteristic glycoprotein spike structures (peplomers). The spikes serve as the attachment protein for cell entry and cause the negatively-stained virus to appear to have a crown resembling the solar corona, hence the name coronavirus (Fig. 1.1. A) (77). All coronaviruses possess four structural proteins: a spike glycoprotein (S), a small envelope glycoprotein (E), a membrane glycoprotein (M), and a nucleocapsid phosphoprotein (N). The S, E, and M proteins are anchored in the as transmembrane proteins. The group 2a coronaviruses (e.g., all BCoV strains and some MHV strains) contain an additional shorter glycoprotein spike, HE, that, curiously, may have originated from the HE protein of the type C human virus (72, 171). The S protein is a large (150-200 kDa) transmembrane glycoprotein that assembles into trimers and projects 12-20 nm from the virion surface. S mediates receptor attachment as well as viral-host cell membrane fusion (38). M is a triple-spanning membrane glycoprotein and the most abundant of the envelope glycoprotein. M together with the small, minor component E glycoprotein, determines the shape of virion (178). N is a phosphorylated RNA-binding protein mainly responsible for the formation of the nucleocapsid. N inside the cell also has a regulatory nonstructural function leading to an enhancement of viral genome replication. While it is still being debated, a consensus seems to be emerging that coronaviruses have a helical rather than an icosahedral nucleocapsid structure inside the virion (126). Coronavirus Life Cycle. Coronavirus replication takes place exclusively within the cytoplasm (Fig. 1.1. B) (11, 97). Most coronaviruses naturally infect epithelial cells of the respiratory or enteric tracts. Viral infections are initiated by the binding of viral S proteins to specific cellular receptors (Table 1.1.) which triggers a major conformational change in the S molecule. Changes in S protein structure following its attachment lead to entry of the viral nucleocapsid into the cytoplasm. For MHV, entry is via virus-cell membrane fusion (147) whereas for the SARS-CoV, entry appears to be by receptor-mediated endocytosis (139). After nucleocapsid entry the positive-strand viral RNA genome is released into the cytoplasm where the first open reading frame (ORF 1) is immediately

4 A

B

FIG. 1.1. Schematic depiction of coronavirus particle and life cycle. (A) A model of coronavirus particle (left) along with an electron-microscopic image of human SARS-CoV (right). The coronavirus particle, called a virion, is organized with the spike (S), membrane (M), and envelope (E) glycoproteins. The single-stranded positive-sense viral RNA genome is packaged into helical capsid with nucleocapsid (N) proteins. (B) Summary of the coronavirus life cycle as depicted in a review by Bergmann and colleagues (11). The entire replication process takes place in the cytoplasm. After attachment to the specific receptor, e. g., the CEACAM-1 molecule for MHV, the coronavirus enters host cells by membrane fusion or receptor-mediated endocytosis. The positive-strand RNA genome is unpackaged and directly translated to make the replicase/transcriptase proteins. The genome and replication/transcription complexes are located in a poorly understood membranous compartment where viral genome replication and transcription occur. After transcription, the nascent subgenomic mRNAs are transported to the cytoplasm or the endoplasmic reticulum (ER) where they are translated into structural (S, E, M, N) and accessory proteins (also called non-structural proteins or NSPs). Genomic RNA is packaged by N protein and which later becomes assembled with structural proteins into a virion by budding into the ER or early Golgi compartments. The mature virion then exits the cell via the exocytic pathway and is released.

5 translated and processed into ~16 replicase/transcriptase proteins. These include two kinds of proteases, an RdRp, a helicase, an exonuclease, an endonuclease, an N-methyltransferase, and a 2’-O-methyltransferase (described below). The replicase/transcriptase proteins function for the synthesis of progeny genome (replication) and a 3’-coterminal set of subgenomic mRNAs (sgmRNAs) (transcription) (details described below). Viral RNA synthesis are thought to occur on or within the cytoplasmic surface of double membrane vesicles derived from ER (73, 74, 141). The sgmRNAs are translated into structural proteins or accessory proteins some of which are important pathogenic factors. The newly synthesized genomes along with newly made structural proteins come together at the ER- Golgi intermediate compartment (ERGIC) where nucleocapsid formation (encapsidation) and virion assembly (packaging) occur by budding into the ERGIC (98). Glycoprotein S, M and E are made by membrane translocation and are glycosylated within the lumen of the ER. Encapsidation selectively incorporates only genomic RNA into nucleocapsid structures by means of unique packaging signals (95, 112). Virion assembly is carried out through cooperative interactions among structural proteins, particularly the M and E (13, 159). Finally, the mature virions are transported to the plasma membrane in vesicles via exocytosis. For some coronaviruses, such as MHV, a fraction of S protein incorporated into the ER at the time of synthesis is not packaged into virions but rather is carried to the plasma membrane where it protrudes from the cell surface and interacts with receptors of adjacent cells to cause cell fusion. This gives rise to the formation of large, multinucleate syncytia (described in Chapter II, III, and IV). Inactivated virus added externally to cells in culture can also cause cell fusion and syncytia formation (45). Coronavirus Genome Organization and Gene Expression. The coronavirus genome is a single-stranded, positive-sense RNA ranging from 27 to 32 kb in length that is 5’ capped and 3’ polyadenylated (Fig. 1.2). Its coding region is flanked by terminal untranslated regions (UTRs) (77, 97). The genomes of coronaviruses are structurally polycistronic containing 8~14 ORFs. Gene 1, the 5’-most ORF, is functionally polycistronic as well (described below), but the remaining ORFs are functionally monocistronic since (in most cases) only the 5’-terminal ORF on each sgmRNA is translated. Translation of genomic and subgenomic RNAs is presumably by a cap-dependent mechanism (77, 78). When

6

51015202530 ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●kb 1234567

mRNA 1 5’ 1a A 3’ 1b n (genome) 2a/HE S4 EM N

subgenomic mRNA transcription genome translation A mRNA 2 n mRNA 2-1 An pp1a mRNA 3 An pp1ab A mRNA 4 n A mRNA 5 n proteolytic autoprocessing mRNA 6 An

mRNA 7 An

PL1 PL2 3CL subgenome translation

PL1 PL2 3CL RdRp HEL ExoN NeU MT 12 3 456789101112 13141516 2 A mRNA 2 n mRNA 2-1 HE An

mRNA 3 S An nonstructura l proteins 4 mRNA 4 (nsp 1-16) An

E A mRNA 5 n M mRNA 6 structural proteins An (HE, S, E, M, N) mRNA 7 N An

FIG. 1.2. Organization and gene expression of the MHV genome. Adapted and modified from Sawicki and colleagues (130). The organization of 31.3-kb MHV-A59 genome is shown at the top. On the left, the positive-strand genomic RNA (gRNA) serves as a template for translation during which polyproteins (pp) 1a and 1ab are made from a single open reading frame 1 (ORF 1) as a result of a -1 programmed framshifting at the ORF 1a/1b junction. The pp1a and pp1ab undergo proteolytic autoprocessing into 16 nonstructural proteins (nsps) with a number of confirmed and putative functional domains as depicted. On the right, the gRNA also functions as a template for subgenomic mRNA (sgmRNA) transcription. The structural relationship between genome and subgenomic mRNAs is shown. A nested set of 3’ co-terminal sgmRNAs are synthesized by transcription, and then each sgmRNA is usually translated to produce only the protein encoded by the 5’ most open reading frame on the sgmRNA. All of the structural proteins and accessory proteins are made by translation of the subgenomic mRNA. Open box representing the translated ORF in each sgmRNA is denoted in gray. Open circles at the 5’ termini of the gRNA and sgmRNAs identifies the 5’ methylated the cap structure. PL1, papain-like proteinase 1; PL2, papain-like proteinase 2; 3CL, 3C-like proteinase; RdRp, RNA-dependent RNA polymerase; HEL, helicase; ExoN, 3’Æ5’ exoribonuclease; NeU, uridylate-specific endoribonuclease; MT, ribose-2’-O-methyltransferase; HE, hemagglutinin-esterase glycoprotein; S, spike glycoprotein; E, small envelope protein, M, membrane glycoprotein; N, nucleocapsid phosphoprotein.

7 genomic RNA is translated directly, the ORF comprising the 5’-proximal two-thirds of the genome is translated to make two polyproteins. The first polyprotein (pp1a, ~500 kD) is translated from ORF 1a. The second polyprotein (pp1ab, ~800 kD) is translated from ORF 1ab by a mechanism of programmed ribosomal frameshifting at the ORF 1a/1b junction (108). The pp1a and pp1ab are cotranslationally and/or posttranslationally processed by two or three internal-encoded viral proteases into 15~16 nonstructural proteins (nsps) that together make up the replicase/transcriptase (42, 54, 130). The major structural proteins, S, E, M, and N, and several accessory proteins are translated from the 3’-proximal one-third of the genome by way of individually expressed sgmRNAs (115). The nsps generated by translation of ORF 1 on genomic RNA are usually referred to as replicase/transcriptase proteins in recognition of their roles in viral RNA synthesis (Fig. 1.2). Both the precise arrangement of the nsps among themselves and with host factors to form the membrane-bound replication/transcription complexes (RTC), and the mechanisms by which the complexes lead to specific RNA synthesis, are currently unknown or poorly understood (16, 141, 158). The recently developed reverse genetics systems for coronaviruses (described below) and the occurrence of the SARS-CoV epidemic have stimulated research that has begun to characterize the structural features and enzyme activities of many of the ORF 1 proteins. In general, it is postulated that the products of pp1a (nsp1-11) function both to prepare a favorable cellular environment for viral replication and to assemble the RNA synthesizing machinery for carrying out the enzyme activities required for RNA synthesis (Fig. 1.2) (97). Specifically, currently documented functional activities include suppression of host gene expression and cell cycle progression (nsp1) (26, 69, 70), protease activity (nsp3 and nsp5) (7, 91), anchoring transmembrane domains (nsp3, nsp4, and nsp6) (12, 137, 156), protein-protein interactions in complex assembly (nsp7 and nsp8) (120, 180), a putative RNA primase (nsp8) (66), single-stranded RNA (ssRNA) binding proteins of unknown function (nsp9 and nsp10) (41, 148), primer-dependent RdRp activity (nsp12) (152), 5’Æ3’ helicase activity (nsp13) (135), 3’Æ5’ exoribonuclease and N7- methyl- transferase activity (nsp14) (28, 105), endoribonuclease activity (nsp15) (67), and a 2’-O-methyltransferase (nsp16) (34).

8 Coronavirus RNA Synthesis. The coronavirus RNA synthesis includes genome replication and transcription of sgmRNAs (Fig. 1.3). By using the positive-strand viral genome as a template, both replication and transcription begin with the synthesis of negative-strand RNAs, which in turn serve as templates for the amplification of positive- strand virus genomes and subgenomes. The mechanism of initiation of negative-strand and positive-strand syntheses for genomic and subgenomic mRNAs is presumably the same or very similar for both. In addition, the ignition process possibly requires 5’-3’-end cross-talking that is guided by interactions between the distant segments of the genome, which may involve specific RNA signals and functional RTC (64, 77, 85, 185). The major difference between replication and transcription is that during negative-strand synthesis to make the viral antigenome, RNA synthesis from the genomic RNA template continues for the full length of the template, whereas the RNA synthesis is discontinuous while making the viral anti-subgenomes (Fig. 1.3) (128, 129). The discontinuous step is triggered by transcription regulating sequences (TRS) whereby the RdRp undergoes a template switch to a spot near the 5’ end of the genome where transcription continues on the leader sequence of the genome. The TRS, including a core consensus (which is 5’-AAUCUAAAC-3’ for both BCoV and MHV) and nearby flanking sequences, are quite conserved within each coronavirus group. The synthesis of negative-strand RNAs is initiated and elongated, along with consecutive events of scanning, reading through, and template switching, which requires interactions between TRS in the leader and TRSs in the body ahead of each ORF. Accordingly, continuous and discontinuous RNA syntheses occur and generate genome- length and a series of smaller, subgenome-length negative-strand RNAs, respectively (115, 130). The negative-strand RNAs all possess elements that are complementary to counterparts on the genome, i.e., a 5’ poly(U) tract and a 3’ anti-leader. The complete negative-strand RNAs, probably in association with their positive-strand counterparts as double-stranded structures, then serve as templates for positive-strand RNA synthesis. The positive-sense sgmRNAs are amplified to a level approximately ten to one hundred times as abundant as their negative-sense counterparts (60, 134). Collectively, a 3’-coterminal nested set of six to nine sgmRNAs are made of which each sgmRNA contains an identical 5’ leader sequence (65~93 nt, depending on viruses) derived from the 5’ end of the genome.

9

1234567

+ 5’ 1a A 3’ 1b n TRS TRS TRS TRS TRS TRS TRS + 5’ An 3’ NEGATIVE-STRAND RNA SYNTHESIS (continuous and discontinuous)

- 3’ Un 5’ - 3’ U 5’ n - 3’ Un 5’ - 3’ Un 5’ 5’…AAUCUAAAC…3’ 5’…AAUCUAAAC…3’ - 3’ Un 5’ - 3’ Un 5’

TRS TRS - 3’ U 5’ ORF1a ORF2 n Leader ORF1b Body

POSITIVE-STRAND RNA SYNTHESIS

- 3’ U 5’ n sgmRNA 1 + 5’ An 3’ REPLICATION (genome) - 3’ U 5’ n sgmRNA 2 + 5’ An 3’

- 3’ Un 5’ + 5’ An 3’ sgmRNA 3

- 3’ Un 5’ + 5’ An 3’ sgmRNA 4 TRANSCRIPTION - 3’ U 5’ n sgmRNA 5 + 5’ An 3’ - 3’ U 5’ n sgmRNA 6 + 5’ An 3’ - 3’ U 5’ n sgmRNA 7 + 5’ An 3’

FIG. 1.3. Model for coronavirus RNA synthesis. Adapted and modified from Paul S. Masters (97). Using MHV as an example, the replication and transcription of coronaviruses are shown. The positive-strand (solid line) genome serves as a template to be copied continuously or discontinuously into genome-length and subgenome-length negative-strand (broken line) RNAs, respectively. The inset shows details of the arrangement of leader and body copies of the transcription regulating sequences (TRS). The initiation for template switching possibly arises from the base-pairing between leader TRS and body TRSs, by which leading to the 3’-coterminal nested set of negative-strand subgenomic mRNAs. Subsequently, the corresponding negative-strand RNAs are used as templates for synthesis of the progeny positive-sense genome (replication) and positive-sense subgenomic mRNAs (transcription).

10 Cis-acting RNA elements in coronavirus defective interfering RNA replication. Viral cis-acting RNA replication elements are unique signals in the form of primary sequences or higher-order structures that lead to the replication of the viral genome (15). In coronaviruses, defective interfering RNAs (DI RNAs) have been used for over two decades to define the cis-acting elements specific for their replication. Because DI RNAs are extensively deleted variants of the viral genome and are only able to replicate as molecular parasites in the presence of the RNA synthesizing machinery of helper viruses, they are used to identify the cis-replication elements which are postulated to function similarly in the virus genome. By using DI RNAs as working molecules for identifying cis- replication elements, there can be a clear distinction from trans-acting factors that might be affected by any RNA structure mutations in analysis of the complete genome. In DI RNA replication analyses, the required trans-acting elements are provided by the helper virus. Naturally occurring and artificially constructed DI RNAs have been discovered and manipulated from all three groups of coronaviruses, and in most cases, the DI RNAs contain the entire terminal UTRs and a portion of the 5’ end of replicase 1a ORF (24, 92, 94, 103, 117). For BCoV, a naturally occurring 2.13 kb DI RNA has been cloned, modified, and studied by our laboratory for the identification and characterization of cis-acting replication elements (Fig. 1.4) (23-25). The BCoV DI RNA is composed of the 5’ UTR of the viral genome, a contiguous ORF harboring the first 288 nt of the nsp1 gene, the intact N gene, and the whole genomic 3’ UTR plus a 68-nt poly(A) tail (24). By using the BCoV DI RNA as a working molecule, along with closely related MHV DI RNAs, cis-acting replication signals have been best defined in the group 2a coronaviruses (Fig. 1.4). To date, nine cis-replication elements have been identified for the replication of group 2a coronaviruses. Of these, three are in the 5’ UTR, two are in the nsp1 gene, and four are in the 3’ UTR (Fig. 1.4). These are each described as follows: (i) The BCoV 5’-terminal 90-nt harboring stem-loop I (SLI) (nt 5-54) and SLII (nt 55-80). These were demonstrated to be important as a primary sequence and/or structure for BCoV DI RNA replication (24, 25). Recently, the SLI region in MHV (homologous to that in BCoV) has been reanalyzed and is now described as being comprised of SL1 and SL2, of which SL1 was shown to be structurally labile for putative 5’ UTR-3’ UTR interaction in MHV replication and SL2 was

11

5’ UTR (210 nt) 3’ UTR (288 nt)

V

BSL HVR

PK zz z I z z zz 1 IV VI z II III 2

210 zz z 5’ 3’ 5’ UAA 53954558037 97116 186 215 239 310 311 340 288 An 3’ 1

5’ 1a 3’ 1b An N

5’ partial nsp1 3’ DI RNA N An (2.2 Kb) 65 211 499 1842 2133

sgmRNA 7 5’ N An 3’ (1.7 Kb) 65 78 1421 1712

5’ cis-acting 5’ partial nsp1 3’ region (421 nt) 75 211 499 421 nt

FIG. 1.4. Structure of the BCoV DI RNA and some of the observations that led to this research. Shown below the genome is the structure of BCoV DI RNA relative to the BCoV genome. Open boxes represent open reading frames (ORFs). The solid line denotes the 5’ and 3’ untranslated regions (UTRs) where the common leader RNA sequence at the 5’ terminal is indicated by a hatched pattern. Base 499 of the DI RNA is the first base in the ORF of N which forms a contiguous ORF with the 288 nucleotide (nt) of the nonstructural protein 1 (nsp1) gene. Also shown are the 5’-proximal 421-nt region that provides cis-acting replication signals for the DI RNA, a sequence that differentiates the DI RNA molecule from the shortest subgenomic mRNA (sgmRNA 7). Shown above the genome is a schematic depiction of current model for the cis-replication higher-order RNA structures postulated in the 5’ and 3’ UTRs. Four stem-loops (I, II, III, and IV) are found in the 5’ UTR. Two stem-loops (V and VI) map within the nsp1 coding region. A bulged stem-loop (BSL) with overlapping pseudoknot (PK) and a hypervariable region (HVR) are demonstrated in the 3’ UTR. Solid circles in the 5’ UTR indicate the AUG start codon of ORF 1 whose coding sequence is indicated by a thickened line. The stop codon for the N-gene preceding the 3’ UTR is boxed. All nt numbering is from the first base at the 5’ and 3’ end of the genome, respectively.

12 proposed to be a uYNMG(U)a-like tetraloop structure critical for stimulating sgmRNA negative-strand synthesis (83, 87, 88). These same structures are predicted in the BCoV genome. (ii) SLIII (nt 97-116). SLIII has associated with it an internal translation start codon for a short intra-5’ UTR ORF (8 aa in BCoV) of unknown function. Both SLIII and the short upstream ORF (uORF) are phylogenetically conserved structures in group 2a and possibly all coronaviruses (122). (iii) SLIV (nt 186-215). SLIV is located immediately upstream of the coding region of ORF 1, holds the start codon for genome translation, and appears to be conserved only within the group 2a coronaviruses (123). (iv) SLV (nt 239- 310) and (v) SLVI (nt 311-340). SLV and SLVI map within the nsp1 coding region and both function as higher-order cis-acting signals for replication of BCoV DI RNA (19, 55). (vi) A bulged stem-loop (BSL) at the 5’ end of the 3’ UTR. The BSL maps just downstream from the stop codon of the N gene. It partially overlaps with an adjacent hairpin pseudoknot. (vii) Hairpin-like RNA pseudoknot (PK). The BSL and PK are shown to be mutually exclusive since they share a sequence and cannot simultaneously exist. Both structures, however, are required for the replication of group 2a DI RNAs and the MHV genome which has led to the idea of that they function together as a molecular switch modulating some features of RNA synthesis. These two structures appear to be conserved in group 2a and group 2b coronaviruses (49, 51, 61-63, 163). (viii) A complex octamer- associated BSL at the 3’ end of the 3’ UTR. This BSL is poorly conserved in both sequence and secondary structure among closely related coronaviruses such as BCoV and MHV. While this structure is viewed as a hypervariable region (HVR), an octanucleotide 5’-GGAAGAGC-3’ harbored within this region is nearly conserved among all coronaviruses (89, 177). Intriguingly, although the HVR has been demonstrated to be important for replication of MHV DI RNA, removal of this element from the MHV genome has little effect on virus replication in tissue culture but affects pathogenesis in mice (50). (ix) A 3’-terminal poly(A) tail. The 3’ poly(A) tail has been shown to be a crucial cis-acting signal for BCoV DI RNA replication (143, 144). Proteins identified to bind cis-replication elements. Extremely little is understood about how these cis-replication elements function. It has been generally presumed that RNA-protein interactions direct the various steps leading to RNA synthesis but RNA-RNA

13 interactions have never been ruled out (15). Several viral proteins with RNA-binding activity have been described as the result of structural, biochemical, and genetic evidence. A majority of these are replicase/trancriptase gene products generated from ORF 1a and 1b, that include nsp1, nsp7, nsp8, nsp9 and nsp10 (41, 55, 68, 97, 148, 149, 180). Plus, N, a structural protein, is also a RNA binding protein. The nsp1 of BCoV has strong and weak binding affinity to the positive-strands of 5’ and 3’ UTRs, respectively (55). Nsps 7-10 of MHV are implicated in interactions with the 3’ BSL-PK accounting for the potential molecular switch (190). Specific cis-acting RNA substrates bound with N protein have mapped to positive-strand of TRS in the MHV 5’ UTR and in the IBV 3’ UTR (113, 182). Moreover, a number of host proteins have also been identified and postulated to participate in the connection between cis-acting signals and RNA synthesis. Heterogeneous nuclear ribonucleoprotein A1 (hnRNP A1), synaptotagmin-binding cytoplasmic RNA-interacting protein (SYNCRIP), and polypyrimidine tract-binding protein (PTB, also known as hnRNP I) have been shown to strongly bind positive strand forms of SLI and SLII in the 5’ UTR of MHV. In addition, hnRNP A1 and SYNCRIP also bind the negative strand of SLI and SLII (30, 81, 82). For SLIV and SLV-VI of BCoV, six and two uncharacterized cellular proteins appeared to be UV cross-linked with their positive strands, respectively (55, 123). Additionally, a complex of proteins including hnRNP A1, mitochondrial aconitase and heat-shock protein 70 (HSP70), and chaperones HSP60 and HSP40 are demonstrated to bind the positive-strand form of the 3’ UTR in MHV (64, 109, 111) . The negative strand of the 3’ UTR of MHV is bound with the PTB, specifically at the motif that is complementary to the conserved octanucleotide (65). Finally, the binding ability of poly(A) binding protein (PABP) to the poly(A) tail is proportional to the rate of RNA replication and appears to be independent of its function on translation (143). The precise role of the interactions between proteins (viral and cellular) and cis-replication elements remains to be determined. Coronavirus Reverse Genetics. While studies with DI RNAs have provided a foundation for approaching the molecular biology of coronaviruses, the availability of reverse genetics systems has enabled a more in-depth analyses of the mechanisms of RNA synthesis, viral protein functions, virus-host interactions, pathogenesis, and development of

14 recombinant vaccines (97). Reverse genetics for positive-strand RNA virus is defined as the transcription of infectious RNA from a full-length cDNA clone. It was established to investigate the correlation between artificial genetic modifications and consequential phenotypic effects by manipulating the real virus (121). The reverse genetics for coronaviruses became possible only recently because of the historic difficulties in working with the huge genome size. Remarkably, to date four different and independent strategies of coronavirus reverse genetics systems have been developed. These are as follows: (i) Targeted RNA recombination, the first-developed reverse genetics system for coronaviruses, was based on the rationale that a recombinant virus can be created and selected from cells infected with a recipient coronavirus carrying a selectable marker. The mutation is introduced with a synthetic donor RNA bearing the mutation of interest (99). The success of this method was made feasible by the exceptionally high rate of homologous RNA recombination in coronaviruses. While so far applicable only in MHV, this system can be used for making mutations only in the downstream one-third of the genome. (ii) The bacterial artificial chromosome (BAC) in a low copy number and with an upstream CMV promoter has been utilized to assemble a full-length cDNA copy of the coronavirus genome (43). Full-length infectious viral RNA is produced in vivo by host RNA polymerase II after transfection of the BAC DNA into host cells. Although to date engineered for only TGEV and HCoV-OC43, this strategy overcomes the low efficiency of RNA transfection, avoids the limitation of in vitro RNA preparation, and ensures complete 5’ capping of the viral RNA (2, 145). (iii) Vaccinia virus cloning vectors have been used to assemble the entire coronavirus cDNAs by using enzyme-directed insertion methods (155). Coronaviruses are recovered from cells transfected with infectious RNA that is produced in vitro from recombinant vaccinia virus DNA supplemented with bacteriophage T7 RNA polymerase. Alternatively, coronaviruses can also be rescued from cells transfected with a DNA mixture containing the vaccinia virus DNA and expression vector for T7 RNA polymerase. This system was first applied to human coronavirus-229E, but the genomes of two other coronaviruses, MHV and IBV, have also been successfully used (21, 32, 153). (iv) Systematic assembly of a full-length infectious cDNA was designed and carried out by in vitro ligation of five to seven contiguous cDNAs that span the entire genome (10).

15 These cDNA fragments are arranged in such a way so as to interrupt the viral genomic regions that are toxic to the E. coli host in cloning vectors. They are also engineered with unique flanking restriction sites such that only one precise assembly of the entire genome takes place. The 5’-terminal fragment contains the T7 RNA polymerase promoter. Genome-length infectious RNAs made from in vitro transcription are electroporated into host cells where recombinant coronaviruses are produced and recovered. One major advantage of this system is that it allows for simple, rapid, and straightforward site-directed mutagenesis of virtually any site in the genome with minimal chances of introducing unexpected mutations elsewhere. This method enables the independent genetic manipulation of cDNA subclones. This strategy is now widely applied in coronaviruses of all three groups. Viruses in which it has been established include TGEV, HCoV-NL63, MHV, SARS-CoV, and IBV (37, 172-175).

QUESTIONS THAT LED TO THIS DISSERTATION RESEARCH With the BCoV and MHV DI RNAs as major working molecules, nine cis-replication RNA elements have been identified as described above. Among these is a 421-nucleotide (nt) 5’ cis-acting replication region that is the only difference between the replicating BCoV DI RNA and the shortest subgenome that is incapable of replication (sgmRNA 7) (Fig. 1.4). This region therefore undoubtedly bears distinct signals specific for RNA replication and has been a focus of investigation in our laboratory in recent years (19, 24, 55, 122, 123). It has been postulated that the 421-nt is providing signals for initiation of negative-strand RNA synthesis from the positive-strand or initiation of positive-strand from negative-strand. Interestingly, within the 421-nt region not only are the four higher-order structures (SLIII-SLVI) present, but so too is the translation of the contiguous ORF necessary for DI RNA replication (19, 23). Additionally, deletion or mutation of the 5’-proximal region of the nsp1 coding region, but not the 3’-proximal region, have been demonstrated to be detrimental for virus replication in the context of the MHV genome (17). It has also been shown that MHV nsp1 co-localizes and interacts with viral replication complexes, nsp7 and nsp10, during early times postinfection (18). Accordingly, the mechanistic contribution of nsp1 has been speculated to be a dual function of RNA structures and protein product.

16 Therefore, question #1 was what RNA structures in the 5’-proximal 500-nt regulatory region of the genome (including the 5’ UTR) and the 3’ UTR does nsp1 bind? Does this protein have a regulatory function in translation that can be determined by an in vitro assay? The results of this part of my study have been published as a collaborative paper and make up part of the Appendix. Other questions led to other components of the Appendix. What 5’ and 3’ terminal RNA structures are bound by nsp16 and the N protein? Might these be regulatory proteins? Despite significant sequence divergence among group 2a coronaviruses, many of their putative cis-replication RNA elements have predicted or documented common higher-order structure and some are functionally interchangeable. For instance, the BCoV 3’ UTR was found to be able to entirely replace the MHV 3’ UTR in the context of the MHV genome (62). By contrast, neither the group 1 TGEV 3’ UTR nor group 3 IBV 3’ UTR was capable of substituting for MHV 3’ UTR. Moreover, it was suggested that common RNA replication signals exist among group 2a coronaviruses as evidenced by support of replication of the BCoV DI RNA by all the other group 2a coronaviruses including MHV (169). Yet little of the functional interchangeability of 5’-proximal cis-replication signals among coronaviruses has been examined in the context of the viral genome. Therefore, question #2 was can it be determined whether the BCoV higher-order cis-acting structures within 5’ UTR function in the context of the MHV genome when analyzed by a reverse genetics approach? If so, what are they? The answers to this question led to Chapter II. In the study described in Chapter II, it was learned that (i) SLIV of BCoV functioned as a cis-replication element in the MHV genome only after it was extended into the nsp1 coding region, and (ii) a 30-nt region immediately upstream of the newly defined SLIV could not be supplied by the BCoV counterpart without severe detriment to MHV growth. It was noted that after two blind cell passages following transfection with chimeric virus genomes carrying the 32-nt BCoV element, wt plaques emerged. Therefore, question #3 was what potential suppressor mutations might have arisen to yield the wt-like phenotype? Might the position and character of the suppressor mutations identify potential RNA-RNA, RNA-protein, or protein-protein interactions that would explain

17 the debilitated phenotype? Could these interactions be established by reconstituting a wt MHV with compensatory mutations? In attempts to recapitulate mutational distortions of SLIII structure that would impair MHV genome replication as they did for BCoV DI RNA replication (122), we learned that mutational distortions in the upper loop had little or no effect on MHV replication. Therefore, question #4 evolved that led to Chapter IV. What are the limits to SLIII changes that would still enable MHV replication? That is, can parts of SLIII be removed and still leave a viable virus?

18 CHAPTER II. CIS-REPLICATION STEM-LOOP IV IN THE MOUSE AND BOVINE CORONAVIRUS GENOMES SPANS PARTS OF THE 5’ UNTRANSLATED REGION AND NONSTRUCTURAL PROTEIN 1 CISTRON

INTRODUCTION Cis-acting structures in positive-strand RNA virus genomes that function as signals for genome translation, transcription and replication are potential sites for antiviral drug design. The fact that replication signals from evolutionarily divergent coronaviruses can be exchanged, for example between the 3’ UTR of the bovine coronavirus (BCoV) and mouse hepatitis coronavirus (MHV) (61, 62), or the SARS-CoV and MHV (51), would suggest that antiviral molecules designed against common signals might be broadly effective therapeutic agents. Several studies have identified cis-replication structures in the 3’ UTR of MHV and BCoV that are likely to be involved in the initial steps of genome translation and initiation of negative-strand RNA synthesis (143, 190). The 3’-proximal cis-replication elements in the positive-strand identified thus far, either in a helper virus- dependent DI RNA replicon of MHV or BCoV, or in the full-length MHV genome, are (i) the 3’ poly(A) tail (84, 143), (ii) the 3’-terminal 55 nt of the 3’ UTR (50, 84, 190), and (iii) an upstream bulged stem-loop (SL) and associated hairpin pseudoknot (49, 61, 62, 163). Curiously, a ~140-nt hypervariable region in the 3’ UTR that comprises part of a 3-proximal bulged stem-loop and harbors a coronavirus-universal octamer sequence (GGAAGAGC) in MHV is not required for virus replication but plays a role in pathogenesis (50). Cis-replication structures have also been identified in the 5’-proximal region of group 2 coronaviruses, but few studies have characterized their common signaling features among the viruses. The higher-order RNA structures within the BCoV 5’ UTR were initially predicted by the Tinoco algorithm and more recently by the Mfold algorithm of Zuker (100, 183). They were characterized as helical SLs I-IV (Fig. 2.1 B) and their predicted structures were consistent with enzyme structure probing analyses (24, 25, 122, 123). In the context of a BCoV DI RNA in helper virus-infected cells, the higher-order structure of SLIII and SLIV (identified by the shaded areas in Fig. 2.1 B) were shown by mutation

19

A IV CC I III CA G C AU C G CA C-G C-G C-G MHV-A59 C-G G-C G-U (SL1) 100 U•C 5’ UTR: 209 nt A-U UA A-U C-G C-G G-C A GC C-G U 20 CG U U A 209 CU G C A-U A U-A U U C-G G-C G-U C-G C II C-G G-C G-C G-U C A G-C TRS-CS U G U•U (SL2) C C G A U•C A A C A C A A-U U U A U U U-G G-C U G C C U-A U-A A C U U U C-G A-U U-A U-A A-U A-U C-G G-C C-G A-U C-G C-G A-U A-U U-A G-C U-G G-C A-U C•U G-U G-C 5’ UAUAA-U AA-U AAAU-A AA C-GACAUUUGUAGUUCCUUGACUUUCGUUCUCUGCCAGUGACGU-AAAUAC 3’ 1405 42 56 60 77 80 130 140 160 171 225 230 B IV CC I III CA AU CA C-G G U C-G UG G-U U-G G-C (SL1) C-G BCoV-Mebus U-A G-C G-C U-G 5’ UTR: 210 nt A 210 20 G C II 100 A-U A U G U-A A-U G-U TRS-CS C-G C-G C-G U-A G-U G-C A A U U G-C U-A U A A U A G U•U C C UU U A U•C (SL2) U U G U C A A•C A-U U U C-G G-C U G A-U C C U-A C-G U U U-A C A A-U G-C C U A-U C U U A-U U-A U-A U-A G-C U C-G U-A C-G U-A G-C U-A U-A A-U U-A U-A A-U U-A C-G G-C 5’ GAUUG-C UG-C U-A CAUC-GUGUCUAUAUUCAUUUCUGCUGUUAACAGCUUUCAGCCAGGGACGU-AAAUAC 3’ 1375 39 54 55 80 84 129 140 160 174 226 231 C MHV SLI MHV SLII MHV SLIII

MHV -UAUAAGAGUGAUUGGCGUCCGUACGUACCCUCUCAACUCUAAAACUCUUGU-AGUU-UAAAUCUAAUCUAAACUUUAUAAACGG---CACUUCCUGCGUGUCCAUGCCCGCGGGCCUGGUCUUGU BCoV GAUUGUGAGCGAUUUGCGUGCGUGCAU-CCCGCUUCACUG---AUCUCUUGUUAGAUCUUUUUAUAAUCUAAACUUUAUAAAAACAUCCACUCCCUGUAU-UCUAUGCUUGUGGGCGUAGAUUUUU

BCoV SLI BCoV SLII BCoV SLIII

MHV SLIV

MHV CAUAGUGCUGACAUUUGUAGUUCCUUGACUUUC--GUUCUCUGCCAGUGACGUGUCCAUUCGGCGCCAGCAGCCCACCCAUAGGUUGCAUAAUGGCAAAGAUGGGCAAAUACGGUCUCG BCoV CAUAGUGGUGUCUAUAUUCAUUUCUGCUGUUAACAGCUUUCAGCCAGGGACGUGUUGUAUCCUAGGCAGUGGCCCACCCAUAGGU--CACAAUGUCGAAGAUCAACAAAUACGGUCUCG BCoV SLIV

FIG. 2.1. Comparison of RNA structures in the 5’-terminal regions of the mouse hepatitis coronavirus, A59 strain (MHV-A59) and the bovine coronavirus, Mebus strain (BCoV-Mebus). (A) Higher-order structures for MHV. (B) Higher-order structures for BCoV. For both viruses, the structures of stem-loops I (comprised of two smaller stem-loops named SL1 and SL2), II, III, and IV, and a short piece of the nsp1 ORF are shown. The “short” forms of SLIII and SLIV, the previously-described SLIII and SLIV forms for BCoV (122, 123) are high-lighted in gray. The start and stop codons for the SLIII-associated intra-5’ UTR upstream ORF and the start codon for the nsp1 ORF are boxed. (C) Aligned 5’ UTR and adjacent 28-nt (partial ORF 1) sequences of MHV and BCoV. The alignment was constructed to minimize inserted gaps by Vector NTI. Identical regions are noted by shading. Positions of the MHV and BCoV stem-loops are noted.

20 analyses to be required for DI RNA replication and were then identified as cis-replication signals (122, 123). Similar analyses could not be carried out for SLI and SLII since SLI resides within the leader and SLII resides largely within the leader and also harbors the transcription regulatory core sequence (UCUAAAC) (Fig. 2.1 B) (168, 170). The transfected DI RNAs mutated in the SLI and SLII regions rapidly acquired the leader sequence of the helper virus genome in a process known as “leader switching” (24, 25, 93). It was shown, however, that the 5’-terminal 13 nt of SLI are required for BCoV DI RNA replication (24). The BCoV SLIII, and its predicted homologue in other coronaviruses, has associated with it the start codon for a short upstream open reading frame (uORF) also found in the 5’ UTR of virtually all coronaviruses (122). The coronavirus uORF encodes typically 8-11 aa, but its function has not been determined. Recent Mfold analyses have predicted the BCoV SLIII to be 46-nt in length (a “long” SLIII) (19) rather than 20-nt as previously predicted (Fig. 2.1 B) (122). The BCoV long SLIII is equivalent to the SLIII predicted for MHV (Fig. 2.1 A) (19, 88). Between SLIII and a predicted SLIV in both BCoV and MHV is a ~40-nt region which appears to be a relatively unstructured region by Mfold prediction (Fig. 2.1 A and B). SLIV and its predicted homologue in other coronaviruses (123) has associated with it the start codon for ORF 1, an ~ 20-kilobase ORF encoding 16 nonstructural proteins that make up the replication/transcription complex (53). In BCoV cross linking studies have identified SLIV to be a binding site for cellular but not viral proteins (123). Downstream of SLIV and within the partial ORF 1 found in BCoV DI RNA are SLs V and VI which were shown by stem disruption and restoration studies to be cis-replication structures for DI RNA replication (19, 55). Recently, the SLI region in MHV (nt 1-56) has been analyzed by NMR spectroscopy and studied in the reverse genetics system of MHV and has been described as comprising SLs 1 and 2 (Fig. 2.1 A and B) (83, 88). The terminal loop of SL2 has been shown to be a U-turn motif that behaves as a cis-replication signal in the MHV genome (88). In the same study, the MHV homologue of BCoV SLII (nt 57-74 in MHV) was predicted to be single stranded or weekly folded structure (88). In the study of Liu et al. (88), MHV SL4b, nt 96-115, is equivalent to the higher-order structure we have named here and elsewhere as SLIII, nt 97-116, in BCoV (122). All in all, the predictions in the 5’ UTR higher-order

21 structures in BCoV and MHV show similarities suggesting they might be functionally conserved among group 2a coronaviruses (24, 122, 123). Here, we postulated that since the MHV and BCoV genomes are structurally similar, their 5’ UTRs would be interchangeable as was their 3’ UTRs. An initial set of experiments, however, demonstrated that a precise replacement of the 209-nt MHV 5’ UTR with the 210-nt BCoV 5’ UTR yielded no viable chimeras. We therefore made replacements of regionally-characterized cis-acting BCoV structures in an effort to more precisely define the required 5’-proximal elements for MHV genome replication. For this, the BCoV 5’ UTR SLs I & II region, the SLIII region, a 32-nt unstructured region between SLIII and SLIV, and eventually a 16-nt-extended version of the short SLIV were used to replace the equivalent regions in the 5’ UTR of the MHV genome. From this, we learned (i) that the SLs I & II region and the SLIII region can immediately functionally replace homologous regions in the MHV genome, (ii) that the BCoV 32-nt unstructured region produces virus with slow-growing small plaques after two blind cell passages, (iii) that only the 16-nt extended version of the intra-5’ UTR BCoV SLIV (the newly-described “long” SLIV) can replace the MHV long SLIV for virus replication, and (iv) that structure disruptions of SLV within the partial nonstructural protein 1 (nsp1) cistron did not hinder MHV replication indicating this structure may be less critical as a cis-replication element in the MHV genome than in the BCoV DI RNA replicon. These results together establish (i) that the cis-replication SLIV of BCoV and MHV includes sequence from both the 5’ UTR and nsp1 coding region, and (ii) that a relatively unstructured 30 to 32-nt hypervariable region within the 5’ UTR strongly prefers the context of the parent genome for its function in virus replication, thus identifying a subgroup 2a-specific cis-replication element.

MATERIALS AND METHODS Cells and viruses. Delayed brain tumor (DBT) cells (58), mouse L2 cells (47), and baby hamster kidney cells expressing the MHV receptor (BHK-MHVR) (175) were grown in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% defined fetal calf serum (FCS) (Hyclone) and 20 μg/ml gentamicin (Invitrogen), and maintained at 37 °C

with 5% CO2 for all experiments. BHK-MHVR cells were maintained in selection

22 medium containing 0.8 mg/ml Geneticin (G418 sulfate; Invitrogen) (175). Sequence alignment and GenBank accession numbers. 5’ UTR sequence alignments of group 2a coronaviruses were made with Vector NTI Suite 8 (Invitrogen). GenBank Accession Numbers were as follows: NC_001846 for MHV-A59, NC_006852 for MHV-JHM, AF201929 for MHV-2, NC_006577 for CoV-HKU1, NC_010327 for ECoV, U00735 for BCoV-Mebus, NC_003045 for BCoV-ENT, NC_005147 for HCoV-OC43, FJ415324 for HECoV-4408, NC_007732 for PHEV, EF424624 for Calf-Giraffe CoV, and EF424621 for Antelope CoV. RNA structure prediction. The Mfold program of Zuker (http://mfold.bioinfo.rpi.edu/) (100, 183) was used for RNA structure predictions. Plasmid construction for mutations in fragment A of the infectious clone for MHV-A59. Fragments A through G for the infectious clone of MHV-A59 (icMHV) (175) were a kind gift from Ralph Baric, University of North Carolina. cDNA plasmid A (containing fragment A) which contains an upstream T7 promoter and nt 1- 4882 of the MHV genome was prepared in plasmid pCR-XL-TopoA (Invitrogen) and used to replace MHV sequences with homologous regions of the BCoV-Mebus. Site-directed mutagenesis was carried out by overlap PCR with primers shown in Table 2.1. All PCR reactions used AccuPrimeTM Taq Polymerase High Fidelity (Invitrogen) under conditions recommended by the manufacturer. Briefly, the first PCR was done with primer T7startBCoV (or T7startMHV) and the designated reverse (+) primer, and the second PCR was done with primer MHV-ScaI and the designated forward (-) primer. PCR fragments were chromatographically purified (QIAEX II; Qiagen) and used in the third PCR with primers T7startBCoV (or T7startMHV) and MHV-ScaI. The resulting PCR fragments were TA cloned into pCR-XL-Topo followed by plasmid preparation and sequence confirmation. With purified cloned DNA, a restriction fragment exchange was performed to minimize unwanted mutations in the wt icMHV plasmid DNA. Briefly, both wt icMHV plasmid A DNA and site-directed mutagenized TOPO plasmid DNA were digested with restriction enzymes SfiI and ScaI (New England Biolabs), whose sites are in the vector at nt 1616 upstream from the T7 promoter and in the MHV genome at nt 910, respectively. The ~2.5 kb SfiI-ScaI fragment from TOPO clones were ligated with the ~6.5 kb SfiI-ScaI fragment

23 TABLE 2.1. Oligonucleotides used in Chapter II

a b c Binding Oligonucleotide Polarity Sequence (5’Æ3’) region (nt) T7 promoter T7startMHV + gtcggcctcttaatacgactcactatagTATAAGAGTGATTGGCGTCCG MHV 1-21 T7 promoter gtcggcctcttaatacgactcactataggattgtgagcgatttgc T7startBCoV + BCoV 1-17 A59-Sca I - CGCCTTGCAGGAGTACTTACCCTTC MHV 899-923 MHV 83-104 A59-BCoV(1-84)(-) + ctaaactttataaaaacatcCACTTCCTGCGTGTCCATGCCC BCoV 65-84 MHV 83-104 gggcatggacacgcaggaagtgGATGTTTTTATAAAGTTTAG A59-BCoV(1-84)(+) - BCoV 65-84 cttgtgggcgtagatttttcatagtggtgtctatattcaTTCCTTGACTTTCGTTCTC MHV 141-162 A59-BCoV(85-141)(-) + TGC BCoV 103-141 ctatgaaaaatctacgcccacaagcatagaatacagggagtgCCGTTTATAAAGTTT MHV 61-82 A59-BCoV(85-141)(+) - AGATTAG BCoV 85-126 gccagggacgtgttgtatcctaggcagtggcccacccataggtcacAATGGCAAAGA MHV 209-230 A59-BCoV(142-210)(-) + TGGGCAAATAC BCoV 164-209 ctaggatacaacacgtccctggctgaaagctgttaacagcagaaaCTACAAATGTCA MHV 119-140 A59-BCoV(142-210)(+) - GCACTATGAC BCoV 142-186 MHV 226-237 cccacccataggtcacaatgtcgaagatcaacaAATACGGTCTCG A59-BCoV(142-226)(-) + BCoV 194-226 MHV 226-237 A59-BCoV(142-226)(+) - CGAGACCGTATTtgttgatcttcgacattgtgacctatgggtggg BCoV 194-226 gtgacctatgggtgggccactgcctaggatacaacaCGTCACTGGCAGAGAAC MHV 153-170 A59-BCoV(174-226)(+) - G BCoV 174-209 MHV 171-190 tttctgctgttaacagctttcagaaagggacgTGTCCATTCGGCGCCAGCAG A59-BCoV(142-173)(-) + BCoV 142-173 MHV 119-140 A59-BCoV(142-173)(+) - gctgttaacagcagaaaCTACAAATGTCAGCACTATGAC BCoV 142-158 MHV 210-230 BCoV-SLIVtop(-) + gtggcccacccataggtcacaATGGCAAAGATGGGCAAATAC BCoV 190-210 MHV 169-186 tgtgacctatgggtgggccacTGGCGCCGAATGGACACG BCoV-SLIVtop(+) - BCoV 190-210 GCAGCCCACCCATAGGTTGCATAatgtcgaagATGGGCAAATAC MHV 187-209, BCoV-SLIVmid(-) + G 219-231 BCoV 211-219 TATGCAACCTATGGGTGGGCTGCtgcctagGAATGGACACGTC MHV 187-209, BCoV-SLIVmid(+) - AC 165-179 BCoV 183-189 GCCCACCCATAGGTTGCATAATGGCAAAgatcaacaAATACGGT MHV 190-217, BCoV-SLIVbot(-) + CTCG 226-237 BCoV 219-226 TGCAACCTATGGGTGGGCTGCTGGCGCCgatacaacaCGTCACT MHV 180-207, BCoV-SLIVbot(+) - GGCAG 159-170 BCoV 174-182 MHV 218-230 BCoV-SLIVtop+mid(-) + gtggcccacccataggtcacaatgtcgaaGATGGGCAAATAC BCoV 190-218 MHV 167-179 tgtgacctatgggtgggccactgcctagGAATGGACACGTC BCoV-SLIVtop+mid(+) - BCoV 183-210 MHV 187-209, BCoV-SLIVmid+bot(-) + GCAGCCCACCCATAGGTTGCATAatgtcgaagatcaacaATACG 226-231 BCoV 211-226 MHV 187-206, BCoV-SLIVmid+bot(+) - GCAACCTATGGGTGGGCTGCtgcctaggatacaacaCGTCACTGG 162-170 BCoV 174-189 MHV 210-217 gtggcccacccataggtcacaATGGCAAAgatcaacaaatacg BCoV BCoV-SLIVtop+bot(-) + 190-210, 219-232 MHV 179-186, 165-171 BCoV-SLIVtop+bot(+) - tgtgacctatgggtgggccacTGGCGCCGatacaacACGTCAC BCoV 190-210, 175-181 BCoV-SLIV-L(-) + GCCCACCCATAGGTTGCATAATG MHV 190-212 MHV 190-212 BCoV-SLIV-L(+) - CATTATGCAACCTATGGGTGGGCcac BCoV 189-191 MHV 185-203 BCoV-SLIV-R(-) + CAGCAGCCCACCCATAGGTcac BCoV 207-209 MHV 185-203 gtgACCTATGGGTGGGCTGCTG BCoV-SLIV-R(+) - BCoV 207-209

24

TABLE 2.1. Continued.

a b c Binding Oligonucleotide Polarity Sequence (5’Æ3’) region (nt) MHV 156-175 tttctgctgttaacagcTCTCTGCCAGTGACGTGTCC A59-BCoV(142-158)(-) + BCoV 142-158 BCoV(122-158)(+) - gctgttaacagcagaaatgaatatagacaccactatg BCoV 122-158 MHV 172-190 A59-BCoV(159-174)(-) + tttcagccagggacgtGTCCATTCGGCGCCAGCAG BCoV 159-174 BCoV(140-175)(+) - cacgtccctggctgaaagctgttaacagcagaaatg BCoV 140-175 MHV 190-209 A59-BCoV(176-192)(-) + ttgtatcctaggcagtgGCCCACCCATAGGTTGCATA BCoV 176-192 BCoV(157-193)(+) - ccactgcctaggatacaacacgtccctggctgaaagc BCoV 157-193 MHV CGATCAACGTGCCAAGCCACAAGG MHV-1094(+) - 1094-1117 DI3(+) - cgggatccgtcgacacgcgtttttttttttttttttttt Poly(A) MHV-leader(-) + TATAAGAGTGATTGGCGTCCG MHV 1-21 BCoV-leader(-) + gattgtgagcgatttgcgtgcg BCoV 1-22 MHV(605-623)(+) - GTTACACAGGCAGACGCGC MHV 605-623 MHV(261-284)(-) + CCATGGATGCTTCCGAACGCATCG MHV 261-284 MHV GGATGGTGGTGCAGATGTGG MHV(30811-30830)(-) + 30811-30830 MHV MHV-NPol(A)(+) - ttttttttttttttttttttttttttACACATTAGAGTC 31019-31033 MHV T7MHV-NATG(-) + gtaatacgactcactatagATGTCTTTTGTTCCTGGGC 29669-29687 MHV CAGCAAGACATCCATTCTGATAGAGAGTG MHV(31094-31122)(+) - 31094-31122

a The positive and negative symbols in the oligonucleotide names indicate the polarity of the nucleic acids to which the oligonucleotides anneal. b Polarity of the oligonucleotide relative to the positive-strand viral genome. c Lower-case bases represent non-MHV sequences. Capitalized bases represent MHV sequences.

25 from the wt icMHV plasmid A DNA. The BCoV/MHV chimeric plasmid A was then used to create a recombinant virus with fragments B through G in the MHV-A59 reverse genetics system. Assembly of the full-length MHV-A59 infectious recombinant virus RNA. Wild-type (wt) and chimeric MHV viruses were produced as described by Yount et al. (175) and as modified by Denison et al. (35). Briefly, plasmids containing the seven cDNA cassettes (A, B, C, D, E, F, and G) of the MHV-A59 genome were digested with the appropriate restriction enzymes (New England Biolabs) followed by gel electrophoresis purification of the fragments (QIAEX II; Qiagen). In this study, fragment A was modified by RT-PCR mutagenesis to contain the described mutations using the appropriate primers (Table 2.1). Gel purified DNA fragments were in vitro ligated with T4 DNA ligase (New England Biolabs) in a total reaction volume of 210 μl overnight at 4 °C. Ligated cDNA was chloroform extracted, isopropanol precipitated, and size confirmed by electrophoresis in a 0.6 % nondenaturing agarose gel. Infectious MHV genomic RNA was generated in vitro with the mMessage mMachine T7 transcription kit (Ambion) from the ligated cDNA in a total reaction volume of 50 μl supplemented with the addition of 7.5 μl of 30 mM GTP. In vitro transcription was done at 40.5 °C for 30 min, 37 °C for 60 min, 40.5 °C for 30 min, 37 °C for 30 min, and 40.5 °C for 30 min. In parallel, RNA transcripts encoding the MHV nucleocapsid (N) protein were generated from a cloned N cDNA plasmid containing an upstream T7 promoter using the mMessage mMachine in a total reaction volume of 25 μl at 37 °C for 3 h. Both icMHV genomic RNA and N RNA were treated with 5 μl Turbo DNAse (Ambion) at 37 °C for 30 min and mixed for electroporation. To prepare for electroporation, BHK-MHVR cells were separated by trypsinization, washed twice with phosphate–buffered saline (PBS), and resuspended in PBS at a concentration of ~1×107 cells/ml. The RNA mixture was added to 650 μl of the resuspended BHK-MHVR cells in a 4-mm-gap cuvette (Phenix) and pulsed three times by 850 V at 25 μF with a Gene PulserTM electroporator (Bio-Rad). Electroporated cells were then seeded on a layer of ~1×106 DBT cells in a 75-cm2 flask and incubated at 37 °C. Virus viability was determined by the development of cytopathic effect (CPE, syncytia) in cells in 72 h or less. Cultures that did not develop syncytia after 72 h were blind cell passaged at 48 h intervals

26 three times at 37 °C in a further attempt to recover infectious virus. At least three independent experiments were done before a mutated MHV genome was claimed nonviable. Characterization of viral mutants by RT-PCR and viral RNA sequence analyses. Cell cultures that developed complete CPE were freeze-thawed once and supernatant fluids were centrifuged at 5,000 × g for 10 min to clear cell debris. Virus from cells that had been electroporated (i.e., virus passage 0 [VP0]) was passaged onto DBT cells (i.e. VP1), and the VP1 cells were lysed with Trizol reagent (Invitrogen) followed by RNA extraction in accordance with the manufacturer’s protocol. To analyze the 5’ ~1000-nt and 3’ ~500-nt sequences of chimeric viruses, total cellular RNA harvested from VP1 cells was reverse transcribed with Superscript II reverse transcriptase (Invitrogen) using MHV-1094(+) and DI3(+) primers, respectively. The cDNA was PCR amplified with primers MHV- leader(-) or BCoV-leader(-) and MHV-1094(+) for 5’ terminal sequences, and with primers DI3(+) and MHV(30811-30830)(-) for 3’ terminal sequences. PCR products were gel purified (QIAEX II; Qiagen) prior to automated sequencing with primers MHV(261-284)(-) and MHV(605-623)(+) for the 5’ ~1000-nt fragment, and with primer MHV(30811-30830)(-) for the 3’ ~500-nt fragment. To determine virus titers and plaque morphology, plaque assays were done on L2 cells. Briefly, L2 cells of ~80% confluence in 6-well flasks were inoculated with serial diluted viruses for 2 h at 37 °C, washed once with PBS, and overlaid with DMEM supplemented with 1% low melting agarose and 2% FCS. After incubation at 37 °C for ~60 h, the overlay were fixed with 10% formaldehyde at room temperature for 30 min, and then discarded. The fixed cells were stained with 1% crystal violet for 10 min, washed with water, and air dried before plaques were counted. Representative plaque images were taken with a Nikon digital camera and prepared using Adobe Photoshop CS. Growth kinetics and sequence analyses. To measure growth kinetics, DBT cells of ~80% confluence in 25-cm2 flasks (~4 × 106 cells) were inoculated with wt or chimeric MHV in DMEM at a multiplicity of infection (MOI) of 0.01 or 5.0 PFU/cell for 2 h at 37 °C. Inocula were removed, cells washed twice with DMEM, and 5 ml DMEM with 2% FCS were added and incubation was continued at 37 °C. 100 μl samples of supernatant fluid

27 were collected at the indicated times post-infection and viral titers were determined by plaque assays on L2 cells as described above. To analyze sequence changes following viral passage, wt and chimeric MHV viruses were used to infect DBT cells of ~80% confluence in 25-cm2 flask at an MOI of ≤ 0.1 PFU/cell. After complete CPE was observed (usually around 24 h post-infection), supernatant fluids were collected, clarified at 5,000 × g for 10 min, and used for subsequent infections. Intracellular RNA from infected cells was Trizol-extracted, RT-PCR amplified, gel purified, and sequences analyzed as described above. Northern blot analysis of viral RNAs. DBT cells of ~80% confluence in 25-cm2 flasks (~4×106 cells) were infected with wt or chimeric MHV-A59 viruses at an MOI of 0.01 PFU/cell. At 20 hpi, cells were lysed with Trizol (Invitrogen) and intracellular RNA was isolated in accordance with the manufacturer’s protocol. RNA (~6 μg) from one-tenth of each extraction was electrophoresed in a 1.0 % agarose-formaldehyde gel at 150 V for 4 h. RNA was transferred to HyBond N+ nylon membrane (Amersham Biosciences) by vacuum blotting for 3 h followed by UV cross-linking. After pre-hybridizing the membrane with NorthernMax Prehybridization/Hybridization Buffer (Ambion) at 55 °C for 4 h, the blot was probed with 20 pmol of γ-32P-end-labeled 3’ UTR-specific oligonucleotide MHV(31094-31122)(+) at ~4×105 cpm/pmol, at 55 °C overnight for detecting viral genomic and subgenomic RNAs. Probed blots were exposed to Kodak XAR-5 film for 24 h at -80 °C for imaging, and images were prepared using Adobe Photoshop CS.

RESULTS A precise 210-nt BCoV 5’ UTR failed to replace the 209-nt MHV 5’ UTR in the MHV genome. Since the 288-nt 3’ UTR of BCoV can functionally replace the 301-nt 3’ UTR of MHV in the MHV genome (61, 62), and the BCoV DI RNA can replicate in MHV-infected cells indicating the BCoV cis-replication signals can be recognized by the MHV RNA replication machinery (169), we reasoned that a precise substitution of the BCoV 5’ UTR in the MHV genome might function for genome replication as well. To test this, a 5’ UTR replacement was made by using the MHV reverse genetics system developed by Yount et al. (175). The BCoV replacement of the MHV 5’ UTR, B1-210/M chimera,

28 yielded no viable virus as determined by syncytia formation in DBT cells and plaque assay in L2 cells following three blind cell passages (Fig. 2.2 A and B). We therefore chose to separately analyze shorter domains of the 5’ UTR to determine which 5’ UTR components were compatible, if any, based on the identity of cis-replication elements described in the BCoV DI RNA. The BCoV 5’ UTR stem-loops I & II region (nt 1-84) and stem-loop III region (nt 85-141), but not the 5’ UTR-confined stem-loop IV region (nt 142-210), can functionally substitute for the homologous MHV region. To test the hypothesis that individual parts of the BCoV 5’ UTR can functionally replace homologous regions in the MHV genome, the BCoV 5’ UTR was initially divided into three segments, SLs I & II (nt 1-84), SLIII (nt 85-141), and SLIV (nt 142-210) as can be identified in Figs. 2.1 and 2.2 A, and each was tested separately or jointly in the MHV background as a chimeric BCoV/ MHV construct. Virus-induced syncytia were detected within 24 h post-electroporation (hpe) from wt genomic RNA and within 24-48 hpe from cells electroporated with chimeric genomic RNAs containing BCoV SLs I & II (B1-84/M), BCoV long SLIII (B85-141/M), and BCoV SLs I & II & long SLIII (B1-141/M). In each, plaques appeared similar in size and shape to wt MHV (Fig. 2.2 B). RT-PCR and sequence analyses of RNA from virus passages 1 (VP1) and 10 (VP10) from all chimeras revealed no additional mutations within the 5’-1000 or 3’-500 genomic nt (data not shown). Growth kinetics determined on DBT cells using MOIs of 5 PFU/cell and 0.01 PFU/cell (Fig. 2.2 C) revealed little difference between wt MHV and B85-141/M progeny during the early part of the growth curve (i.e., within 12 h), but the progeny from B1-84/M and B1-141/M replicated more slowly during these periods. All three chimeras, however, reached a maximal viral titer of ~1.5×107 PFU/ml, ~ ten-fold lower than wt, by 12 and 20 h for the MOIs of 5 and 0.01, respectively. These results demonstrated that while the BCoV SLs I & II and SLIII in chimeras are not identical to wt in structure and sequence, both supported replication of the chimera with near wt MHV-like properties. The B142-210/MHV, B85-210/M, and B1-84, 142-210/M chimeras, on the other hand, failed to develop CPE by 72 hpe and no viable virus was recovered during three blind cell passages (Fig. 2.2 B and data not shown). These results, therefore, identified one or more regions of

29

ABI III

MHV-A59 5’ UTR: 209 nt IV

II

209 zz lethal z Mock wt MHV B1-210/M 5’ 3’

1 82 140 209 82 nt 58 nt 69 nt wt MHV

B1-84/M

85-141 B /M lethal B1-84/M B85-141/M B142-210/M B142-210/M

B1-141/M B85-210/M B1-84, 142-210/M

B1-210/M lethal lethal 84 nt 57 nt 69 nt B1-141/M B85-210/M B1-84, 142-210/M 1 84 141 210 C

9 8 8 8 8 7 7 7 6 7 6 6 5 5 6

5 4 4 moi = 5.0 moi = 0.01 5 log PFU/ml PFU/ml log PFU/ml log log PFU/ml moi = 5.0 log PFU/mL PFU/mL wt MHVMHV 3 wt MHVMHV 3 moi = 0.01 4 wt MHV wt MHVMHV wtwt MHV MHV wt MHV 4 wtwt MHV MHV M/B(1-84)1-84 M/B(1-84)1-84 B /M B /M wtwt MHV MHV wt MHV 1-84 2 1-84 2 wt MHV M/B(1-84)B /M M/B(1-84)B /M 1-141 M/B(1-141)B /M 1-141 85-141 85-141 M/B(1-141)B /M 3 M/B(85-141)B /M M/B(85-141)B /M 1-141 3 M/B(1-141)B /M 1-141 M/B(85-141)B85-141/M 1 M/B(85-141)B85-141/M 1 M/B(1-141)B /M 2 0 2 0 0 4 8 12162024 0 4 8 121620242832 04812162024 0 4 8 12 16 20 24 28 32 Hours Post Infection Hours Post Infection Hours Post Infection Hours Post Infection

FIG. 2.2. BCoV SLs I & II, and SLIII domains support virus replication in the MHV background. (A) Scheme showing RNA domains in the BCoV 5’ UTR that were tested for supporting replication in BCoV/MHV chimeras. Stem-loop structures coinciding with domains in MHV (solid line) and BCoV (stippled line) are depicted at the top and bottom, respectively. Line lengths are proportional to the genetic distance. Solid circles identify the AUG start codon for the nsp1 ORF. (B) Plaque phenotypes of wt MHV and BCoV/MHV chimeras described in (A). Plaque assays were performed on monolayer mouse L2 cells and stained with crystal violet. Nonviable chimeras formed no visible plaques during 1-3 blind cell passages and were indistinguishable from mock-infected cells. (C) Single-cycle and multiple-cycle growth kinetics of the viable BCoV/MHV chimeras relative to those of wt MHV. DBT cells were infected with the indicated viruses at an MOI of 5.0 or 0.01 PFU/cell. At the indicated times post infection, aliquots of medium were removed and infectious titers were determined by plaque assays on mouse L2 cells. Open and solid symbols represent results from two independent experiments.

30 incompatibility between the BCoV nt 142-210 sequence and the remainder of the MHV genome. The BCoV stem-loop IV region (nt 142-210) when extended by 16-nt into the nsp1 cistron to form a longer stem-loop IV (nt 142-226) enabled recovery of MHV chimeras, but only after one blind cell passage. A predicted long SLIV for BCoV after a 16-nt extension into ORF 1 shows 12 base pairings within the 16-nt extension (Fig. 2.1 B), and the same for MHV SLIV (Fig. 2.1 A). However, with the B142-210/M chimera, only 8 base pairings are likely for the same region between the BCoV 5’ UTR and MHV ORF 1. Therefore, a chimera utilizing BCoV nt 142-226 in the chimera B142-226/M was tested (Fig. 2.3 A). Following electroporation, viral plaques became evident after one blind cell passage but these were small and grew to a much lower titer (1.5× 104 PFU/ml) than did wt (Fig. 2.3 B). In addition, 5’ 1000-nt sequence analyses of isolated plaques revealed a mixed population of mutations mapping at nt 30, 162, 168, and 363 (data not shown). These results suggested that these mutations were compensatory and enabled the replication (data not shown). Thus, the replicating but debilitated chimera indicated that either the 32-nt unstructured BCoV region between SLIII and SLIV, or the long SLIV itself, or both, was not compatible with the rest of the MHV genome. To further analyze the compatibility of BCoV sequences within these domains, the 32-nt domain was divided into two parts and studied separately in new chimeras, and the long SLIV was studied separately in a chimera. The results with the BCoV long SLIV alone are described first. The BCoV long stem-loop IV alone (nt 174-226) is immediately fully functional within the context of the MHV genome. A chimera with the BCoV long SLIV alone, B174-226/M (Fig. 2.3 A), formed plaques within 24 hpe and progeny from VP0 exhibited MHV wt-like plaques (Fig. 2.3 B). Sequence analyses of progeny genomes demonstrated no additional mutations within the 5’ 1000-nt and 3’ 500-nt regions (data not shown). In addition, growth kinetics with VP1 at low MOI was indistinguishable from wt MHV (Fig. 2.3 C), and Northern blot analyses on RNA from infected cells displayed a wt-like pattern (Fig. 2.3 D). These results together demonstrated that the long SLIV from BCoV is fully functional in the context of the MHV genome despite an 18-nt (33%) difference between

31

ABIV I III

MHV-A59

209 zz II z

30-nt mutations 5’ 3’ wt MHV B142-226/M B174-226/M

1 140 155 170 225

wt MHV B142-226/M B174-226/M

B1-158/M mutations B1-173/M lethal lethal B1-158/M B1-173/M B1-226/M B1-226/M 1 141 158 173 226

C D M 6 / V 2 2 k H - 4 c M 7 o t 1 8 m w B

7 gRNA

6 sgRNA2 sgRNA3 5

4 log PFU/mL log 3 sgRNA4 sgRNA5 moi = 0.01 2 wtwt MHV MHV sgRNA6 M/B(174-226)B174-226/M sgRNA7 1

0 0 4 8 12 16 20 24 28 32 123 Hours Post Infection

FIG. 2.3. BCoV nt 142-226 can substitute for its MHV counterpart after one blind cell passage. (A) Top: Schematic of predicted secondary structures within the 5’-proximal 250-nt of the MHV and BCoV genomes showing long stem-loops III and IV and position of the MHV 30-nt unstructured region. Bottom: Selected regions of the 30-nt from MHV and 32-nt from BCoV being tested in the MHV background are shown. (B) Plaque phenotypes shown with the corresponding mutant chimera. (C) Growth kinetics of the B174-226/M chimera as compared to that of wt MHV. DBT cells were infected with virus at a MOI of 0.01 PFU/cell and virus titers at the indicated times post infection were determined by plaque assays on L2 cells. The number depicted at each time point is the average of three independent experiments. (D) Northern analysis showing the RNA patterns for wt MHV and the B142-226/M chimera. Cultures of DBT cells were mock-infected or infected with MOI of 0.01 PFU/cell, and intracellular RNA was extracted at 20 hours post infection. The Northern analysis used a 32P-radiolabeled probe specific for the 3’ UTR as described in the Materials and Methods. gRNA, genomic RNA; sgRNA, subgenomic RNA.

32 the BCoV and MHV long SLIV structures and 5’-terminal nsp1 aa sequence differences of MSKINK and MAKMGK, respectively. It also indicates that the 5’-terminal 16-nt of the nsp1 ORF act as part of the cis-acting SLIV RNA. A surprising number of variations in stem-loop IV structure are tolerated, although replication is impaired. Additional experiments demonstrated that deviations from intact wt-like BCoV or MHV SLIV structures were tolerated for virus replication, but a wt-like structure is important for robust growth. To examine the structural features of the long SLIV that are important for virus replication, eight chimeric constructs with various combinations of long SLIV components were tested. Four independent trials were made for each construct and the results of one trial for each are shown (Fig. 2.4). (i) Using chimeric SLIV constructs in which the left side (B177-192/M) or the right side (B207-224/M) of the entire stem is made with BCoV sequence (thus replacing the homologous sequence in the MHV SLIV as defined by the alignment in Fig. 2.1 C), mixed plaque sizes were obtained (Fig. 2.4 A and B). However, for these, two blind cell passages were required before plaques appeared. Sequence analyses of selected plaque-purified populations revealed that from the left-side chimera (B177-192/M), two populations of progeny with deletions were found, one (plaque 1) with a deletion of nt 173 through 190 (counting from MHV nt 171, M171, at the base of the chimeric SLIV) and another (plaque 2) with a deletion of nt 174 through 178 (Fig. 2.4 A). From the right-side chimera (B207-224/M), the progeny carried a deletion of nt 184 through 203 that included the terminal loop (Fig. 2.4 B, plaque 1). (ii) Using chimeric constructs in which the stem of only the lower half of SLIV (B177-187, 214-224/M), or the lower half of the left side (B177-187/M), or lower half of the right side (B214-224/M) was made BCoV-like, wt-like plaques were obtained immediately after transfection (Fig. 2.4 C), and only sequences that matched those of the input chimeras were found (Fig. 2.4 C). (iii) Using chimeric constructs in which the stem of only the lower fourth of SLIV (B177-180, 222-224/M), or lower fourth of the left side (B177-180/M), or lower fourth of the right side (B222-224/M) was made BCoV-like, wt-like plaques were obtained immediately after transfection (Fig. 2.4 D), and only sequences that matched those of the input chimeras were found (Fig. 2.4 D). Taken together, these data show that there is tolerance for variation in SLIV structure, but that optimal virus replication in the chimera

33

A B177-192/M B B207-224/M CC CC CC CC △ △ CA CA CA CA △ △ AU AU AU AU △ △ CA CA CA CA △ △ C-G C-G C-G C-G △ △ C-G C-G C-G C-G △ △ G-U (M190) △ U G-U G-U △ △ (M203) (B192) G A-U △ U G A-U A-U C (B207) △ U C C-G △ G C-G C-G △ G U mixed U A A G-C A △ C A G-C A G-C A △ C A U deletion U U U △ deletion U △ A A A A △ A △ A-U A △ U A A-U A A-U A △ U A C-G △ G C-G C-G △ G G C-G △ G G C-G C-G U (M184) △ G U G-C △ C + G-C G-C G-C A C A △ A A C A C A G C A G U G △ U G G G C G A △ A C G A G A G A C A △ A C A C A C A U-G △ G (M178) △ G U-G U-G A U-A △ A △ A U-A U-A U A-U △ U △ U A-U A-U G C-G △ G △ G C-G C C-G C (B177) U C-G △ G (M174) △ G C-G A C-G A U-G (M173) △ G U-G U-G A (B224) U-G A G-C G-C G-C G-C G-C (M171) U-A (M225) U-A U-A U-A U-A

deletion deletion deletion

C D B177-187, 214-224/M B177-187/M B214-224/M B177-180, 222-224/M B177-180/M B222-224/M CC CC CC CC CC CC CA CA CA CA CA CA AU AU AU AU AU AU CA CA CA CA CA CA C-G C-G C-G C-G C-G C-G C-G C-G C-G C-G C-G C-G G-U G-U G-U G-U G-U G-U A-U A-U A-U A-U A-U A-U C-G C-G C-G C-G C-G C-G G-C A G-C A G-C A G-C A G-C A G-C A U U U U U U A A A A A A A-U A A-U A A-U A A-U A A-U A A-U A C-G C-G C-G C-G C-G C-G (B187) G C-G U (B214) G C-G C-G U C-G C-G C-G G-C G-C G-C G-C G-C G-C A C A G A C A C A G C A C A C A U G U G G G G G C G A C G A G A G A G A G A C A C A C A C A C A C A U-G U-G U-G U-G U-G U-G A U-A A U-A U-A (B180) A U-A A U-A U-A U A-U U A-U A-U U A-U U A-U A-U G C-G C G C-G C-G C G C-G C (B222) G C-G C-G C (B177) U C-G A U C-G C-G A (B177) U C-G A U C-G C-G A U-G A (B224) U-G U-G A U-G A (B224) U-G U-G A G-C G-C G-C G-C G-C G-C U-A U-A U-A U-A U-A U-A

2.0 × 106 2.5 × 106 6.5 × 103 7.2 × 106 1.0× 107 5.0 × 104

FIG. 2.4. A variety of SLIV structures are tolerated for virus replication. Wild-type bases for the long SLIV in BCoV that differ from MHV according to the alignment in Fig. 2.1 C are noted by arrows. The AUG start codon for ORF 1 is boxed. (A) Results for a chimera in which the upstream-half of SLIV is made BCoV-like in the MHV background. The genome sequence of two individually plaque-purified progeny isolates show one 18-nt upstream deletion pattern and one 5-nt upstream deletion pattern. (B) Results for a chimera in which the downstream-half of SLIV is made BCoV-like in the MHV background. The genomic sequence of the plaque-purified progeny isolate shows a 20-nt deleted upstream region that includes the terminal loop. (C) Results for chimeras in which the lower half of SLIV, or the upstream or downstream parts of the lower half of SLIV, are made BCoV-like in the MHV background. The progeny of these chimeras were viable but debilitated, and they retained the parental BCoV sequences. (D) Results for chimeras in which the lower fourth of SLIV, or the upstream or downstream parts of the lower fourth of SLIV, are made BCoV-like in the MHV background. The progeny of these chimeras were viable but debilitated, and they retained the parental BCoV sequences.

34 takes place with an intact wt BCoV or wt MHV SLIV. In light of this, it is curious that B1-210/M in which only the lower part of the SLIV mismatched was not viable (Fig. 2.2 A and B). It is possible that the lower part of the stem is more critical for SLIV function. The BCoV 32-nt region upstream of the long stem-loop IV (nt 142-173) behaves as a coronavirus subgroup 2a-specific cis-replication domain. Given that chimeras containing the BCoV SLs I & II and SLIII domains together (B1-141/M), or the long BCoV SLIV domain alone (B174-226/M), but not those containing the BCoV unstructured 32-nt domain (B1-226/M or B142-226/M), were immediately functional in the MHV background for virus replication (Fig. 2.2 and Fig. 2.3), it was postulated that the BCoV 32-nt domain (corresponding to a 30-nt domain in the MHV 5’ UTR) is an unstructured cis-replication RNA element that might be coronavirus subgroup 2a-specific. To examine this hypothesis, chimeras were made to separately examine each of two subdomains in the 32-nt unstructured region. For this, chimeras B1-158/M which extends the BCoV sequence in the viable B1-141/M chimera 17 nt into the 32-nt domain, and B1-173/M which extends completely through the BCoV 32-nt domain, were tested (Fig. 2.3 A and B). The 17-nt extension in B1-158/M enabled replication, but only after two blind cell passages, and the plaques were heterogeneous in size (Fig. 2.3 B). Sequence analyses of isolated plaques revealed a mixed population of mutations mapping at nt 55, 145, and 147, the latter two of which were within the 17-nt extension (data not shown). These results suggested that the mutations were compensatory and enabled the replication. For chimera B1-173/M, no viable virus was obtained immediately or after three blind cell passages (Fig. 2.3 B). This pattern was consistent with results from B1-226/M which showed no viable progeny after three blind passages of transfected cells (Fig. 2.3 B). To examine the compatibility of the BCoV 32-nt region alone in the MHV background, reciprocal 5’ UTR constructs were tested in which the BCoV 32-nt domain was the only BCoV sequence within the MHV background, (B142-173/M, or for brevity, B32-nt/M), and two in which the MHV 30-nt domain within the BCoV 5’ UTR or 16-nt-extended BCoV 5’ UTR with partial nsp1 cistron was placed in the MHV background (B1-210, M30-nt/M and B1-226, M30-nt/M) (Fig. 2.5 B). After transfection with the B32-nt/M chimera, viable virus yielding heterogeneous-size plaques (Fig. 2.5 C) appeared after two blind cell passages, thus conforming to the above patterns with the BCoV 32-nt

35 MHV-A59 141 UUCCUUGA--CUUUCGUUCUCUGCCAGUGACG 170 (30 nt) A MHV-JHM 146 UUCCUUGA--CUUUCGUUCUCUGCCAGUGACG 175 (30 nt) MHV-like MHV-2 141 UUCCUUGG--UUUUUGUUCUCUGCCAGUGACG 170 (30 nt) CoV-HKU1 138 UUUCCACACUUUUCAUCUCUCUGCCAGUGACG 169 (32 nt) Intermediate ECoV 141 UUUCUACUGUUAACAACUCUCAGCCAGGGACG 172 (32 nt) BCoV-Mebus 142 UUUCUGCUGUUAACAGCUUUCAGCCAGGGACG 173 (32 nt) BCoV-ENT 142 UUUCUGCUGUUAACAGCUUUCAGCCAGGGACG 173 (32 nt) HCoV-OC43 142 UUUCUGCUGUUAACAGCUUUCAGCCAGGGACG 173 (32 nt) BCoV-like HECoV-4408 142 UUUCUGCUGUUAACAGCUUUCAGCCAGGGACG 173 (32 nt) PHEV 142 UUUCUGCUGUUAACAGCUUUCAGCCAGGGACG 173 (32 nt) Calf-Giraffe CoV 126 UUUCUGCUGUUAACAGCUUUCAGCCAGGGACG 157 (32 nt) Antelope CoV 127 UUUCUGCUGUUAACAGCUUUCAGCCAGGGACG 158 (32 nt)

IV B I III C MHV-A59 209 zz II z

mutations 30-nt 142-173 5’ 3’ wt MHV B /M

1 140 170 209 225 140 nt 30 nt 39 nt 16 nt wt MHV B142-173/M B1-210, M30-nt/M

B1-226, M30-nt/M lethal B1-210, M30-nt/M B1-226, M30-nt/M wt BCoV 141 nt 32 nt 37 nt 16 nt 1 141 173 210 226

t /M -n 0 3 M V , DE6 k H 2 c M -2 1 8 o t m w B

gRNA 7

6 sgRNA2 sgRNA3 5

4

log PFU/mL sgRNA4 3 sgRNA5 moi = 0.01 sgRNA6 2 wt MHV BM/B226+M301-226, M30-nt/M sgRNA7 1

0 0 4 8 121620242832 123 Hours Post Infection FIG. 2.5. Detrimental effects of the BCoV 32-nt unstructured region in the MHV background. (A) Alignment of the 32-nt unstructured region among group 2a coronaviruses. Identical nucleotides are shaded. Abbreviations: MHV, mouse hepatitis virus (strains -A59, -JHM, and -2); CoV-HKU1, human coronavirus- HKU1; ECoV, equine coronavirus; BCoV, bovine coronavirus (strains -Mebus and -ENT); HCoV-OC43, human coronavirus-OC43; HECoV-4408, human enteric coronavirus-4408; PHEV, porcine hemagglutinating encephalomyelitis virus; Calf-Giraffe CoV, calf-giraffe coronavirus; Antelope CoV, sable antelope coronavirus. (B) Diagram depicting chimeric constructions. (C) Plaque phenotypes of the chimeric constructs. (D) Growth kinetics at MOI of 0.01 PFU/cell of the B1-226, M30-nt/M chimera relative to that of the wt. The number depicted at each time point is the average of three independent experiments. (E) Northern analysis for viral RNA synthesis of wt and the B1-226, M30-nt/M chimera was done as described for Fig. 2.3. gRNA: genomic RNA. sgRNA: subgenomic RNA.

36 domain in the MHV background. For chimera B1-210, M30-nt/M, no viable virus was obtained immediately or after three blind cell passages (Fig. 2.5 C). This result was consistent with the idea that a “long” SLIV is critically preferred for viral replication. On the other hand, after transfection with B1-226, M30-nt/M, medium-size plaques formed within 48 h after electroporation of genomic RNA (Fig. 2.5 B) and a final replication titer 100-fold lower than wt MHV (Fig. 2.5 C) was obtained. Northern blot analysis of viral RNAs from infected cells shows that sgmRNAs are made in the same relative proportions as with wt MHV but are less abundant (Fig. 2.5 D). These results indicate that the 32-nt unstructured region in BCoV is not immediately compatible with the MHV background and thus support the idea that the unstructured region may be a subgroup 2a-specific cis-replication element. Interestingly, an alignment of ~30-nt homologous sequences within the unstructured regions among twelve group 2a coronaviruses were examined (Fig. 2.5 A). The results show that sequences in this region can be classified as MHV-like, BCoV-like, or intermediate. This result is consistent with the notion that this region represents a subgroup 2a-specific cis-acting region. Disrupting the lower stem of SLV within the nsp1 cistron blocks BCoV DI RNA replication but does not block MHV genome replication. The structures of several coronavirus DI RNAs have shown that the most 5’-terminal region of the nsp1 ORF is present in each of them (15). Group 3 coronaviruses do not have an apparent nsp1 cistron, however their 5’ UTR is long suggesting an equivalent cis-replication element could be located within the 5’ UTR. Early experiments with the BCoV DI RNA suggested that both the 5’-partial nsp1 ORF and the N gene were required in cis for replication (23). However it could not be determined whether it is the product of translation, or the act of translation itself, that is required in cis. Therefore, to determine whether the Mfold-predicted SLV and SLVI behaved as cis-replication elements in the BCoV DI RNA, mutants that had disrupted and then restored the stem-loops were examined for replication. Disruption in both stem-loops blocked BCoV DI RNA replication (19, 55). In MHV, SLV and SLVI are predicted by Mfold but only SLV allowed the design of silent mutations to disrupt and then restore stem structure, and these were tested (Fig. 2.6). None of the mutations significantly disrupted MHV genome replication or virus production

37

ABV CG Lt: C239U, C242U CA 8 UA Rt: G305A, G308A UC Lt/Rt: C239U, C242U 270 C-G G-C G305A, G308A 7 U-A A-U G-C G 6 U A C-G C-G U-A 5 U-G U-A A A A G 290 4 G-U A-U

C-G VI log PFU/mL C-G moi = 5.0 C-G UU 3 U G U wt MHV 250 C A G U A G-C G G 2 Lt G-C U C G-C A C 330 Rt U U G-C A G-C Lt/Rt A G A-U 1 A A G-C 242U C-G 305A G-U U-A A G U-G G-C 0 239U C-G 308A A-U 5’ CGGUCUCG G-U C-GCGCAAGAA 3’ 0 4 8 12162024

230 238 309 310 339 347 Hours Post Infection

FIG. 2.6. Disruption of the lower helix of stem-loop V in the nsp1 coding region does not disable MHV replication. (A) Mfold-predicted structures of stem-loops V and VI in the nsp1 coding region. Mutations in the lower helix of stem-loop V in the nsp1 coding region, analogous to those in stem-loop V of the BCoV nsp1 coding region that disabled BCoV DI RNA replication (19), are shown. (B) Growth kinetics for MHV containing wt stem-loop V (wt MHV), and helix-disrupting mutations separately in the left side (mutant Lt) and right side (mutant Rt) of the lower helix, and a combination of left and right mutations (mutant Lt/Rt) that restore the helix structure.

38 (Fig. 2.6 A and B). These results together suggest SLV may not be a cis-replication element in the MHV genome, at least as determined by disruption of the base of the stem-loop.

DISCUSSION In this study we asked whether the BCoV 5’ UTR would function in the context of the MHV genome and support virus replication. It was anticipated that it would since the two viruses are phylogenetically closely related and the BCoV DI RNA has been shown to replicate in the presence of MHV helper virus indicating the two share replication signals. Surprisingly, a precise replacement of the 209-nt MHV 5’ UTR with the 210-nt BCoV 5’ UTR was not functional in the MHV genome. By replacing individual homologous regions in the MHV 5’ UTR with cis-acting domains in the BCoV 5’ UTR (as defined by DI RNA replication studies), it was learned that whereas domain I (comprised of SLs I and II), and domain II (comprised of SLIII) worked well despite sequence differences of 37 and 36%, respectively, (i) SLIV (domain IV) with a nt sequence difference of 33% was functional only after it had been extended into the nsp1 coding region by 16 nt (to form a “long” SLIV), and (ii) a relatively unstructured 32-nt region (domain III) between SLs III and IV showing a nt difference of 40% yielded a viable virus only after two blind cell passages. The first conclusion from this study is that an RNA sequence is identified within the 5’-terminal 16-nt region of ORF 1 that functions as part of a cis-acting RNA signal, SLIV, for virus replication. The idea that part of the nsp1 coding region could provide an important cis-replication function had been hinted at since a 5’-terminal portion of the nsp1 cistron was found as a part of coronavirus DI RNAs (with the exception of group III coronaviruses which have no nsp1 cistron) (14, 142), and this part of the nsp1 cistron was found to be necessary for DI RNA replication in the BCoV DI RNA (23). More recently, SLs V and VI in the BCoV DI RNA, as higher-order RNA structures, have been shown to behave as cis-replication elements (19, 55). In addition, in MHV, mutations in the 5’- terminal region of the nsp1 cistron have been shown to be detrimental to MHV replication (17), but whether these were the result of RNA structural changes or protein changes were

39 not determined. The presence of the nsp1 start codon in the downstream terminal portion of SLIV leaves open the possibility that SLIV may function as a riboswitch for regulation of ORF 1 translation or other function in RNA replication. SLIV could also play a role in initiation of negative-strand RNA synthesis by interacting with proteins that in turn engage the 3’ UTR to initiate formation of the replication complex. Or, since SLIV mirrors the negative-strand which harbors the initiation site for new positive-strand RNA synthesis for generating new genomes, it could be that structural changes are impairing this step. Whether SLIV interacts in any way with SLV or SLVI in the nsp1 cistron to regulate some aspect of replication remains to be seen. Certainly it is puzzling that SLV functions as a cis-replication element in the DI RNA but not the intact genome. It could be that the four point mutations as used in the current study were insufficient for disrupting SLV structure in the context of the whole genome, or that the genome has a mechanism for coping with this disruption. The second major conclusion from the current study is that the relatively unstructured BCoV 32-nt domain III (30-nt in MHV) is a coronavirus subgroup 2a-specific cis- replication element. This is supported by the fact that a chimeric MHV genome with BCoV domains I-II, III, and IV (the long SLIV version) and the 30-nt MHV domain III replicated immediately after transfection of the recombinant genome to yield MHV wt-like plaques (B1-226, M30-nt/M, Fig. 2.5). That is, no blind cell passaging was required. What the 30-nt unstructured region does for virus replication is not known at this time. The fact that it appears to be tailored to different species within subgroup 2a viruses would suggest it has evolved with other viral elements. Perhaps this explains why MHV supported the replication of BCoV DI RNA less well than did the other BCoV-like coronaviruses such as ECoV, HCV-OC43, and HCoV-4448 (169). In an associated follow-up study to this one (Guan and Brian, to be submitted) we learned that reversion of mutations within the 32-nt region to become more MHV-like together with suppressor mutations within nsp1 suggest that the nsp1 protein interacts with the 32-nt region. Also, in a separate study the binding of BCoV N to SLIII was found to be enhanced by the presence of its flanking 32-nt region suggesting that perhaps there is an interaction between the N protein and the 32-nt region (Guan and Brian, to be submitted).

40 Thus, the structure at the lower stem of SLIV and the 32-nt unstructured region between SLIII and SLIV appear to be sites critical to coronavirus replication, and the mechanism of their function should be pursued in the interest of finding sites for designing inhibitors of coronavirus replication.

41 CHAPTER III. GENETIC EVIDENCE FOR INTERACTION BETWEEN TWO 5’-TERMINAL GENOMIC CIS-REPLICATION ELEMENTS AND NONSTRUCTURAL PROTEIN 1 IN THE MOUSE HEPATITIS CORONAVIRUS

INTRODUCTION In a previous study (Chapter II), we exploited the ~35% nucleotide sequence divergence between the 5’ UTRs of the BCoV and MHV to identify the 5’-proximal cis-replication elements for replication in the MHV genome. We learned that domain I in MHV (nt 1-82) which contains higher-order structures named SLI and SLII can be functionally replaced by its BCoV counterpart, as can domain II (nt 83-140) which contains SLIII, and domain IV (nt 171-225) which contains a long SLIV that extends into the nsp1 cistron (Fig. 3.1). In each of these substitutions, despite sequence differences of 36, 34, and 33%, respectively, the chimeric virus replicated for at least 3 passages without compensatory suppressor mutations being observed within the 5’-most 1000-nt and 3’-most 500-nt of the genome. Surprisingly, a domain III of 30-nt unstructured region in the MHV 5’ UTR (nt 141-170) that maps between SLIII and SLIV could not be immediately functionally replaced by its BCoV counterpart, a 32-nt region (nt 142-173) in the BCoV 5’ UTR (Fig. 3.1). However, the resulting chimeric B142-173/M genome (also called B32-nt/M) was observed to be viable (i.e., make plaques) after two blind cell passages. In two out of three experimental attempts, there were no viable viruses after three blind cell passages. The results yielding viable B32-nt/M virus suggested that there had been compensatory mutations in the chimeric genome that enabled replication of the chimera. Here, we describe the analyses that found and identified three suppressor mutations that could each act alone to bring about wt-like plaques in the context of B32-nt/M background, but not without a simultaneous single change (a “reversion” at one of two sites) within the transplanted BCoV 32-nt element. One of the suppressor mutations (C30U) mapped within SLI in the 5’ UTR, and two (A353U and A363G) mapped within the nsp1 coding region. These experiments were done using the reverse genetics system for MHV-A59 developed by the Ralph Baric laboratory at the University of North Carolina (175) which

42

IV CC I III CA G C AU C G CA C-G C-G C-G MHV-A59 C-G G-C G-U (SL1) U•C A-U UA 99 A-U C-G C-G G-C A GC C-G U 20 CG U U A 209 CU G C A-U A U-A U U C-G G-C G-U C-G C II C-G G-C G-C G-U C A G-C TRS-CS U G U•U (SL2) C C G A U•C A A C A C A A-U U U A U U 123 U-G G-C U G C C U-A U-A A C U U U C-G A-U U-A U-A A-U A-U 30 nt C-G G-C C-G A-U C-G C-G A-U A-U U-A G-C 141 170U-G G-C A-U C•U G-U G-C 5’ UAUAA-U AA-U AAAU-A AA C-GACAUUUGUAGUUCCU UGACUUUCGUUCUCUGCCAGUGACGU-AAAUAC 3’

1405 42 56 60 77 80 130 230 BCoV UUUCUGCUGUUAACAGCUUUCAGCCAGGGACG

142 173 32 nt

5’ 1a 3’ 1b An 2a/HE S4 EM N ●●●● ●●●● ●●●● ●●●● ●●●● ●●●● ●kb 51015202530

FIG. 3.1. Cis-replication elements in the MHV 5’ UTR. The organization of the 31.3-kb MHV genome is shown at the bottom. Above the genome is a detailed view of the current model of cis-replication RNA stem-loops (SLs) I, II, III, and IV within the MHV 5’ UTR. These four SL structures can be functionally replaced by their counterpart in the BCoV 5’ UTR. Open boxes at nt 99 and 210 identify AUG start codons for the short upstream ORF and ORF 1, respectively. An open box at nt 123 identifies the UAG stop codon for the short upstream ORF. A dashed box in the loop of SLII denotes the core signal within the transcription regulatory sequence (TRS-CS). The putative SLI is comprised of SL1 and SL2 identified by NMR and proposed by Liu and colleagues (88). The highlighted 30-nt region (nt 141-170) indicates a newly identified MHV-specific cis-replication element (identified in Chapter II) which cannot be readily replaced with the BCoV counterpart. After replacing this element with the 32-nt BCoV counterpart (BCoV nt 142-173), the BCoV/MHV chimera, named B32-nt/M, is viable only after appearance of suppressor mutations at other 5’-proximal sites in the MHV.

43 enabled reconstitution of B32-nt/M with each of the three suppressor mutations to demonstrate the sufficiency of each suppressor nucleotide (along with a commensurate single intra-BCoV 32-nt “reversion”) for generating a wt-like MHV virus. The pattern of suppressor and reverting mutations, furthermore, provides evidence of a 3-way cross talk between site C30 in SLI, sites U160 and U166 in the 30-nt region, and A363 (the first base of a lysine codon) in the nsp1 coding region. Genetic evidence also indicated that this cross talk is virus species-specific, for at least the BCoV and MHV members of group 2a coronaviruses, since a chimeric virus containing the entire BCoV 5’ UTR and nsp1 cistron within the MHV background had the phenotype of a wt MHV in cell culture. No such chimera could be made between the SARS-CoV and MHV. From these data, a model is proposed to show that the putative 3-way interaction acts as part of a switch that toggles between genome translation and initiation of negative-strand RNA synthesis on the genome.

MATERIALS AND METHODS Cells and viruses. DBT cells, L2 cells, and BHK-MHVR cells were grown in media and conditions as described in Chapters II. Sequence alignment and GenBank accession numbers. The nsp1 coding sequences of group 2a coronaviruses and SARS-CoV were translated by computer to amino acid sequences using Vector NTI Suite 8 (Invitrogen). The 5’ UTR nucleotide sequence and nsp1 amino acids alignments, and phylogenetic tree construction, were performed by using the neighbor-joining method available with CLUSTAL X 1.83 in 1,000 bootstrap trials. The generated phylogenetic tree was viewed by TreeView (Win32) 1.6.6. Putative intrinsically disordered region and potential RNA binding residues within the BCoV and MHV nsp1s were predicted with Disembl (http://dis.embl.de) (86) and BindN programs (http://bioinformatics.ksu.edu/bindn) (162), respectively . GenBank Accession Numbers for the sequences studied here are as follows: NC_001846 for MHV-A59, NC_006852 for MHV-JHM, AF201929 for MHV-2, NC_006577 for CoV-HKU1, NC_010327 for ECoV, U00735 for BCoV-Mebus, NC_003045 for BCoV-ENT, NC_005147 for HCoV-OC43, FJ415324 for HECoV-4408, NC_007732 for PHEV, EF424624 for Calf-Giraffe CoV, EF424621 for Antelope CoV, and NC_004718 for SARS-CoV-Toronto.

44 Plasmid construction for mutations in fragment A of the infectious clone for MHV-A59. The general strategy to generate chimeric and mutated MHV-A59 infectious clone was as described in Chapter II. Site-directed mutagenesis was made by overlap PCR with the primers shown in Table 3.1. Briefly, to make fragment A for rec1 infectious clone, a PCR was done with the primers MHV-C30U(-) and A59-ScaI in the presence of B32-nt/M and wt-MHV DNA template. To make rec2, the first PCR was done with the primers T7startMHV and MHV-A353T(+), and the second PCR was done with the primers MHV-A353T(-) and A59-ScaI, wherein both PCR reactions the A fragment for B32-nt/M assembly was used as the amplification DNA template. PCR fragments were chromatographically purified and used in a third PCR with the primers A59-ScaI and T7startMHV. Same procedure was used for constructing other recombinant clones except: to make rec3, primers MHV-A363G(-) and MHV-A363G(+) were used; to make rec4, primers MHV-C30U(-), MHV-A353T(+), and MHV-A353T(-) were used; to make rec5, primers MHV-C30U(-), MHV-A363G(+), and MHV-A363G(-) were used. To make mutated and chimeric clones, similar PCR strategy was used with designated primers and cognate DNA templates. The mutated and chimeric plasmid A then was used to create a recombinant virus with fragment B through G in the MHV-A59 reverse genetics system. It should be noted that in the case of replacing MHV nsp1 with SARS-CoV counterpart, additional synonymous mutations by primer-mediated PCR were made to eliminate Esp3 I and Sca I restriction sites appearing at nt 330 and 582, respectively, in order to enable the assembly of a full-length SARS-CoV/MHV chimera. Assembly of the full-length MHV-A59 infectious recombinant virus RNA. Assembly of full-length MHV-A59 infectious recombinant construct was carried out as described in Chapter II. Characterization of viral mutants by RT-PCR, viral RNA sequence analyses, and growth kinetics measurement. Analyses of plaque morphology and genome sequencing of recovered progeny viruses were carried out as described in Chapter II. Northern blot analysis of viral RNAs. Northern blot assays were performed as described in Chapter II.

45 TABLE 3.1. Oligonucleotides used in Chapter III

a b c Binding Oligonucleotide Polarity Sequence (5’Æ3’) region (nt) T7 promoter T7startMHV + gtcggcctcttaatacgactcactatagTATAAGAGTGATTGGCGTCCG MHV 1-21 T7 promoter gtcggcctcttaatacgactcactataggattgtgagcgatttgc T7startBCoV + BCoV 1-17 A59-Sca I - CGCCTTGCAGGAGTACTTACCCTTC MHV 899-923 MHV-C30U (-) + GAGTGATTGGCGTCCGTACGTACCTTCTCAAC MHV 6-37 MHV-A353U (-) + CGCAAGAACCGAATGTTAAAGG MHV 340-361 MHV-A353U (+) - CCTTTAACATTCGGTTCTTGCG MHV 340-361 MHV-A363G (-) + GGAGAAACTTTGGTTAATCACGTG MHV 360-383 MHV-A363G (+) - CACGTGATTAACCAAAGTTTCTCC MHV 360-383 MHV-U160A+U166G(-) + CGTTCTCAGCCAGGGACGTGTCC MHV 153-175 MHV-U160A+U166G(+) - GGACACGTCCCTGGCTGAGAACG MHV 153-175 MHV 221-223 GGGcaaatacggtctcgaactac BCoV (225-244) (-) + BCoV 222-244 MHV 221-223 BCoV (225-244) (+) - gtagttcgagaccgtatttgCCC BCoV 222-244 SARS-CoV T7start-SARS (-) + gtcggcctcttaatacgactcactatagatattaggtttttacctacccagg 1-24 CGCCTTGCAGGAGTACTTACCCTTCGGgactgcacctccattgagctca MHV 897-923 SARS-ScaI (+) - c SARS-CoV 788-810 SARS-CoV SARS-330 mu (-) + gtccttcaggttagggatgtgctagtgcg 322-350 SARS-CoV SARS-330 mu (+) - cgcactagcacatccctaacctgaaggac 322-350 SARS-CoV ggtataacactgggggtgctcgtgccac SARS-582 mu (-) + 565-592 SARS-CoV SARS-582 mu (+) - gtggcacgagcacccccagtgttatacc 565-592 cctcgatctcttgtagatctAATCTAATCTAAACTTTAtttaaaatctgtgtagctgtc MHV 58-75 SARS-CoV MHV-TRS1 (-) + g 38-57, 73-94 MHV 58-75 MHV-TRS1 (+) - cgacagctacacagattttaaaTAAAGTTTAGATTAGATTagatctacaagagatc SARS-CoV gagg 38-57, 73-94 gtagatctgttctctaaacgaacAATCTAATCTAAACTTTAtttaaaatctgtgtagc MHV 58-75 MHV-TRS2 (-) + tgtcg SARS-CoV 50-72, 73-94 cgacagctacacagattttaaaTAAAGTTTAGATTAGATTgttcgtttagagaaca MHV 58-75 MHV-TRS2 (+) - gatctac SARS-CoV 50-72, 73-94 MHV 210-230 MS-nsp1 (-) + ATGGCAAAGATGGGCAAATACatggagagccttgttcttggtgtc SARS-CoV 265-288 MHV 210-230 MS-nsp1 (+) - gacaccaagaacaaggctctccatGTATTTGCCCATCTTTGCCAT SARS-CoV 265-288 MHV 210-230 SM-nsp1 (-) + gggtgtgaccgaaaggtaagATGGCAAAGATGGGCAAATAC SARS-CoV 245-264 MHV 210-230 SM-nsp1 (+) - GTATTTGCCCATCTTTGCCATcttacctttcggtcacaccc SARS-CoV 245-264 MHV MHV-1094(+) - CGATCAACGTGCCAAGCCACAAGG 1094-1117 DI3(+) - cgggatccgtcgacacgcgtttttttttttttttttttt Poly(A) MHV-leader(-) + TATAAGAGTGATTGGCGTCCG MHV 1-21 MHV(605-623)(+) - GTTACACAGGCAGACGCGC MHV 605-623 MHV(261-284)(-) + CCATGGATGCTTCCGAACGCATCG MHV 261-284 MHV MHV(30811-30830)(-) + GGATGGTGGTGCAGATGTGG 30811-30830

46

TABLE 3.1. Continued.

a b c Binding Oligonucleotide Polarity Sequence (5’Æ3’) region (nt) MHV ttttttttttttttttttttttttttACACATTAGAGTC MHV-NPol(A)(+) - 31019-31033 MHV T7MHV-NATG(-) + gtaatacgactcactatagATGTCTTTTGTTCCTGGGC 29669-29687 MHV MHV(31094-31122)(+) - CAGCAAGACATCCATTCTGATAGAGAGTG 31094-31122

a The positive and negative symbols in the oligonucleotide names indicate the polarity of the nucleic acids to which the oligonucleotides anneal. b Polarity of the oligonucleotide relative to the positive-strand viral genome. c Lower-case bases represent non-MHV sequences. Capitalized bases represent MHV sequences. Bold-capitalized bases denote mutated changes.

47 RESULTS Genetic analysis of phenotypic revertants of a MHV chimera harboring the poorly compatible 32-nt unstructured region from the BCoV 5’ UTR identify three potential second site suppressor mutations and two potential intra-BCoV 32-nt “revertant” mutations. In Chapter II it was demonstrated that a 30-nt unstructured region mapping between SLIII and SLIV within the 5’ UTR of MHV behaved as a species-specific cis-acting replication element since the 32-nt homologous region from BCoV was essentially incompatible with the MHV genome as shown with the chimeric construct B32-nt/M (Fig. 3.1). The chimeric B32-nt/M genome after transfection was found to be nonviable in two experiments, but in a third, it was debilitated for growth and genetically unstable during the earliest stages of virus recovery (Fig. 3.2). The recovered virus from blind cell passage 2 exhibited heterogeneous plaque sizes that ranged from very small to nearly wt-like (large, ~2.5 mm diameter) (Fig. 3.2 A). The mutations observed within the 5’-terminal 1000-nt from small and large plaque isolates was analyzed at virus passages 1, 6 and 10 and the results are summarized in Fig. 3.3 A. Details of mutations that arose at given times are shown for the S3 isolate as an example in Fig. 3.3 B. Sequence analyses of the serially-passaged, plaque-purified viruses that had developed a stable large plaque size and wt-like growth kinetics showed a commensurate acquisition of either one or both of two sets of mutations. (i) Development of a wt-like phenotype correlated with the acquisition of one or more of three possible “suppressor” mutations, one within SLI (C30U) and two within the nsp 1 coding region (A353U and A363G). (Possibilities mapping outside the 5’-terminal 1000-nt of the genome may exist, but as shown in this study, any of the three identified here can cause a reverted phenotype.) (ii) Development of a wt-like phenotype correlated with the acquisition of one or both of two possible “reverting” mutations within the transplanted 32-nt region from BCoV (A162U and G168U) (Fig. 3.3). This series of observations led us to hypothesize that “suppressor” mutations outside of the 32-nt transplanted region, and/or possibly “revertant” mutations within the 32-nt region, or some combination of both, had arisen to enable the reverted phenotype. These mutations also might, upon further analysis, reveal specific RNA-RNA interactions within the 5’ UTR, and/or between the 5’ UTR and the nsp1 coding region or

48

A B32-nt/M progeny wt MHV

Electroporation Blind cell Blind cell of B32-nt/M RNA passage passage 72 h 1 48 h 2 48 h

Small plaques Large plaques

S1 S2 S3 L1 L2

VP1

VP6

VP10

B 8 8 8 VP1 VP6 VP10 7 7 7

6 6 6 5 5 5 4 4 4 log PFU/mL log PFU/mL PFU/mL log PFU/mL PFU/mL 3 moi = 0.01 3 moi = 0.01 3 moi = 0.01 wt MHV wt MHV wt MHV S1 2 S1 2 2 S1 S2 S2 S2 S3 S3 S3 1 L1 1 L1 1 L1 L2 L2 L2 0 0 0 0 4 8 121620242832 0 4 8 12 16 20 24 28 32 0 4 8 12 16 20 24 28 32 Hours Post Infection Hours Post Infection Hours Post Infection

FIG. 3.2. Severely impaired growth of the BCoV/MHV chimera (B142-173/M; also named B32-nt/M) is restored to wild-type levels through 1-10 viral passages. (A) Chimera B32-nt/M exhibits heterogeneity in plaque size but becomes uniform and wt-like through 10 viral passages. At the center-top is a picture of the plaques from recovered B32-nt/M viruses after blind cell passage 2. Note the heterogeneity in plaque size as compared to the uniformly large size of the wt MHV plaques (top-right). Shown below are the plaques obtained from three small-plaque isolates (S1, S2, and S3), two large-plaque isolates (L1 and L2), and a wt control, when plated at viral passages 1, 6, and 10. (B) The B32-nt/M chimera initially displays growth defects but wt-like growth rates appear after a few viral passages. Multiple-cycle growth kinetics of small-plaque (S1, S2, and S3) and large-plaque (L1 and L2) isolates were compared to that of wt virus. DBT cells were inoculated with passage-1, passage-6, and passage-10 viruses, respectively, at an MOI of 0.01 PFU/cell. Supernatants were obtained at the indicated time points post infection and virus titers were determined by plaque assays on L2 cells. The number depicted at each time point is the average of three independent experiments.

49 IV A I III

950 209 nsp1 Chimera zz B142-173/M II z (also named B32-nt/M)

5’ 3’

MHV 141 UUCCU UGACUUUCGUUCUCUGCCAGUGACG 170 348 371 23 36 P K V K G K T L Original BCoV 142 173 sequences ACGUACCCUCUCAA UUUCUGCUGUUAACAGCUUUCAGCCAGGGACG CCG AAA GTT AAA GGA AAA ACT TTG

Nucleotide C30U C144U C157G A162U G168U A353U A363G changes (K48N) (K52E) VP1 C30U G168Umix S1 VP6 C30U A162Umix G168Umix VP10 C30U C157Gmix A162Umix G168Umix

mix VP1 C30U G168U S2 VP6 C30U C144U G168U VP10 C30U C144U G168U

VP1 C30U A162Umix S3 VP6 C30U C157Gmix A162Umix A363Gmix VP10 C30U C157Gmix A162Umix G168Umix A363Gmix

VP1 C30U C157G L1 VP6 C30U C157G G168Umix VP10 C30U C157G G168U

VP1 C30U G168U L2 VP6 C30U G168U A353Umix VP10 C30U G168U A353Umix

B SLI BCoV 32 nt nsp1 36 23 172 141 368 350 C30U G168U A162U C157G A363G

3’AACUCUuCCAUGCA 5’ 3’GCAGuGACCGuCUUUgGACAAUUGUCGUCUUU 5’ 3’UCAAAgAGGAAAUUGAAAG 5’

VP1 TTG A G AAGGT A C G T C G T CCC T GGC WGAAAG C T G TTAAC A G C A G AAA A G TTTTTCC TTTAAC TTTC

S3 VP6

TTG A G AAGGT A C G T C G T CCC T GGC A G AAAG C T G TTAAC A G C A G AAA A G TTTY T CC TTTAAC TTTC

VP10 TTG A G AAGGT A C G T C G T C M C T GGC WG AAAS C T G TTAAC A G C A G AAA A G TTTY T CC TTTAAC TTTC

FIG. 3.3. Putative suppressor mutations found in the 5’ UTR and nsp1 coding regions after serial passaging of B142-173/M (B32-nt/M). (A) Summary of sequencing results of RT-PCR products from viral passages 1, 6, and 10 of small- and large-plaque isolates. At the top is a depiction of the 5’-proximal genomic region in which the potential repressor mutations were found. Beneath this depiction is a table showing the nucleotide changes in comparison with the original chimera sequences. The loci of nucleotide changes are boxed with arrows. Mutations arising over time are marked with superscript “mix” according to the sequencing data shown in panel (B). (B) Sequencing results for viral passages 1, 6, and 10 of the S3 isolate are shown for the SLI region, the BCoV 32-nt region, and the nsp1 coding region. The nucleotides shown are complementary to the sequencing data because of the downstream primers used for automated sequencing. The loci of the completed and ongoing mutations are indicated by arrows. Foci with clearly mutated or mixed nucleotides are boxed.

50 coding region product (i.e., either as RNA-RNA or RNA-protein interactions) that are important for virus replication. While it is possible that the reversions within the transplanted BCoV 32-nt region alone would suffice to give rise to wt-like (revertant) phenotypes in the context of the B32-nt/M background, we chose here to first analyze the three potential suppressor mutations (the one in SLI and the two in the nsp1 coding region) (Fig. 3.3 A) in the B32-nt/M background for sufficiency to cause a reversion to a wt-like phenotype. Any one of the three potential suppressor mutations in the 5’-terminal genomic region is sufficient to cause a MHV wt-like phenotype in the B32-nt/M background, but (usually) not without a simultaneous reversion of G168U within the BCoV 32-nt region. To test whether any one of the three potential second-site suppressor mutations C30U, A353U, or A363G (Fig. 3.3 A) are authentic suppressors, genome reconstructions were made that tested each individually in the B32-nt/M background (Fig. 3.4 A, B and C). In contrast to the slow development of plaques following transfection with the B32-nt/M genomic RNA in the original experiment (Fig. 3.2), B32-nt/M genomic RNA recombinants harboring the single putative suppressors C30U (rec1), A353U (rec2), or A363G (rec3) developed near wt-like plaques within 24-36 h post electroporation (Fig. 3.4 B). These results demonstrated an immediate suppressor function for each of the potential suppressor mutations. Cells electroporated with double putative suppressor mutations, C30U and A353U (rec4), and C30U and A363G (rec5), also showed rapid development of wt-like plaques in the electroporated cells (Fig. 3.4 B). Within the first virus passage of recovered recombinants rec 1-5, plaques were nearly uniform in size and only slightly smaller than wt plaques. Growth curves similar to wt were found for rec 1, 3 and 5, but for rec 2, the growth rate lagged that of wt, and for both rec 2 and 4, the final titer reached was approximately ten-fold lower than for wt (Fig. 3.4 C). Interestingly, for each of rec 1-5, the potential revertant G168U was also present in plaques isolated from virus passage 1, indicating this was possibly a simultaneous change enabling rapid recovery (Fig. 3.4 D). In rec1, C144U was also found, and in rec3 and rec4, the potential reversion A162U was also found. From these results, i.e., from the fact that there was an apparent simultaneous selection of the intra-BCoV 32-nt A162U and/or G168U reversions, it cannot be determined

51

A B 1 5’ UTR 209 nsp1 950 wt MHV rec1 rec2 wt MHV C30U BCoV 32nt rec1 A353U rec2 A363G rec3 rec4 rec5 rec3

C30U A353U rec4

C30U A363G rec5

C30U mut1 mut2 mut3 mut1 A353U mut2 A363G mut3

C D 8 BCoV VP1 32nt

7

6 MHV 141 UUCCU UGACUUUCGUUCUCUGCCAGUGACG 170 Original sequences BCoV 142 UUUCUGCUGUUAACAGCUUUCAGCCAGGGACG 173 5 Nucleotide changes C144U A162U G168U 4 rec1 C144U G168U log PFU/mL log moi = 0.01 3 rec2 mix wt MHV G168U rec1 rec3 A162Umix G168Umix

2 rec2 VP1 rec3 rec4 A162Umix G168Umix rec4 1 rec5 G168Umix rec5 0 0 4 8 121620242832 Hours Post Infection

FIG. 3.4. Identification of suppressor mutations by reconstituting viable viruses harboring revertant mutations. (A) Structures of recombinant and mutant viruses in comparison with wt MHV. Recombinants rec1, rec2, and rec3 were generated in the context of the B142-173/M (B32-nt/M) backbone with additional mutations C30U, A353U, and A363U. These three mutations were second-site mutations found from revertants of plaque isolates described in Fig. 3.3. Synergetic suppression effects were examined by constructing the recombinants rec4 and rec5 harboring the C30U mutation along with A353U and A363U, respectively, in the presence of BCoV 32-nt. In contrast, mutants, mut1, mut2, and mut3, respectively, contain mutations of C30U, A353U, and A363U, in the MHV background. (B) Shown are the plaque phenotypes of recovered recombinant and mutant viruses, compared with that of the wt MHV. (C) Growth kinetics of the constructed recombinants, rec1 through rec5, compared to that of wt. DBT cells were inoculated with passage-1 viruses at an MOI of 0.01 PFU/cell as described in the Materials and methods. Data shown are the averages of three independent experiments. (D) Sequencing results of the BCoV 32-nt from RT-PCR products of recombinants from passage-1 of rec1 through rec5 as described in the legend for Fig. 3.3.

52 whether or not each of the potential suppressor mutations alone (C30U, A353U, or A363G) was sufficient to cause a reverted phenotype. To ask the inverse question, that is, do any of the three potential suppressor mutations (C30U, A353U, and A363G) alone in the wt MHV background cause a detrimental effect on virus growth (that might have been neutralized by the simultaneous intra-BCoV 32-nt mutations), each mutation was placed into the wt MHV background, forming mut1, mut2 and mut3, and tested (Fig. 3.4 A). Interestingly, of the three, only the intra-nsp1 ORF mutation, A363G (which causes a K52E mutation, a charge change from the basic lysine at amino acid 52 in nsp1 to a glutamic acid), showed a detrimental effect that was manifested by very small plaques that appeared after one blind cell passage of the electroporated cells (Fig. 3.4 B). This result suggests there may be an interaction between the 30-nt element in the MHV genome and nt 363 in the nsp 1 coding region, or the nsp1 protein itself. No other mutations were found in the 5’-terminal 1000-nt of recovered virus. A severe detrimental effect resulting from placement of the third suppressor mutation (A363G) alone into the wt MHV background enabled demonstration of a three-way interaction between SLI, the 30-nt element, and the nsp1 coding region (or the nsp1 protein). The severely detrimental effect of the A363G (K52E) mutation in mut3 enabled a more detailed examination of potential genetic cross talk between SLI, the MHV 30-nt region, and the nsp1 coding region (or nsp1 protein). For this, the A363G mutation was tested for restoration of the wt-like phenotype in the MHV background in combination with the C30U mutation in SLI alone, with the paired U160A and U166G mutations (homologous to the BCoV 32-nt element at these sites) alone, and with both the C30U, U160A, and U166G mutations together (Fig. 3.5). To test for cross talk between A363G in the nsp1 coding region and C30U in SLI, both mutations were placed into the wt MHV background to form mut5 (Fig. 3.5 A) and the recombinant genomic RNA was tested for replication by electroporation into cells. Remarkably, mut5 demonstrated wt-like MHV plaques immediately after transfection (Fig. 3.5 C), and wt-like growth kinetics at virus passage 1 (Fig. 3.5 B), whereas the A363G mutant alone demonstrated very small plaques and growth kinetics with a final titer at 104-105-fold lower than mut5 or wt (Fig. 3.5 B and C). This result demonstrated that these

53

A B 8 moi = 0.01 wt MHV mut3 1 209 950 5’ UTR nsp1 7 mut4 mut5 wt MHV mut6 6

A363G 5 mut3 A363G U160A+U166G 4 mut4 log PFU/mL log 3 C30U A363G mut5 2

C30U U160A+U166G A363G 1 mut6 0 0 4 8 121620242832 Hours Post Infection

C wt MHV mut3 mut4 mut5 mut6

FIG. 3.5. Genetic cross talk between the 5’ UTR and nsp1 in the MHV genome. (A) Shown are diagrams indicating the construction of mutants as compared to wt MHV and mut3 described in Fig. 3.4. (B) Growth kinetics of the recovered mutants, mut3 through mut5, and their comparison with that of wt MHV. (C) Plaque phenotypes of recovered mutants and their comparison with those of wt MHV. The severely impaired growth of mut3 (with the A363G mutation) is restored to nearly wt growth by the simultaneous introduction of C30U to SLI (mut5), or BCoV counterparts U160A and U166G to the MHV 30-nt region (mut4), or both (mut6).

54 two mutated sites together enabled wt-like virus and suggested there was some form of cross talk between these two elements, either directly (as RNA-RNA) or indirectly (as RNA-protein). To test for cross talk between the A363G in the nsp1 coding region and the (BCoV-like) U160A and U166G in the 30-nt region, all three mutations were simultaneously placed into the MHV background to form mut4 and tested for replication by electroporation into cells. Remarkably, as with mut5, mut4 demonstrated wt-like plaques immediately after transfection (Fig. 3.5 C), and wt-like growth kinetics were demonstrated at virus passage 1 (Fig. 3.5 B). This result suggested that A363G was selected in the B32-nt/M chimera to accommodate the A162 and the G168 in the BCoV 32-nt region, and therefore strengthens the notion that there is an interaction between the nsp1 coding region (or Nsp1 protein) and the 30-nt element. To test whether all three mutants function mutually to yield a wt-like phenotype, all were used together to make mut6 (Fig. 3.5 A). As demonstrated in Fig. 3.5 C, wt-like plaques were formed, and in Fig. 3.5.B, wt-like growth kinetics resulted. These results together, therefore, support the hypothesis that there is a 3-way interaction between SLI, the 30-nt region, and nsp1 coding region (or the nsp1 protein). The BCoV 5’ UTR and entire BCoV nsp1 cistron function together in the MHV background to yield wt-like MHV. The notion that there is a necessary virus species-specific cross talk between group 2a coronavirus 5’ UTRs and nsp1 cistrons (or, more likely, their nsp1 proteins) for virus replication was strengthened by the results from four other chimeric constructs that tested the compatibility between the 5’ UTR and nsp 1 coding region. Since we had previously shown (Chapter II) that a functional SLIV extends 16-nt into the nsp1 cistron, we tested whether the entire BCoV 5’ UTR along with the extended 16-nt in the context of the MHV background was viable. For this, construct B1-226/M was tested and it was nonviable even after three blind cell passages from three separate electroporation attempts (Fig. 3.6 A and B). Likewise, a chimera with the entire MHV 5’ UTR through 16-nt into the nsp1 cistron along with the remaining nsp1 cistron of BCoV in the MHV background, B227-906/M, was nonviable (Fig. 3.6 A and B). These results are consistent with the idea that the 5’ UTR and nsp 1 cistron have to be compatible

55

A B 1 5’ UTR 209 nsp1 (247 aa) 950 MHV wt MHV B1-906/M 141 170 225 908 wt MHV

B1-226/M 1-5 aa B227-906/M B1-226/M B227-906/M 1-5 234-247 aa aa MHV 30nt B1-226, M30-nt/M 1-5 aa BCoV 32nt B32-nt, 227-906/M 1-5 234-247 aa aa B1-226, M30-nt/M B32-nt, 227-906/M

B1-906/M 234-247 142 173 226 aa BCoV 5’ UTR nsp1 (246 aa) 906 1 210 948

M t /M 6 / -n 0 0 -9 3 7 2 M 2 , , M CDV 6 t 6 / H 2 n 0 k 2 - 9 c M - 2 - o t 1 3 1 8 m w B B B

7 gRNA

6 sgRNA2 sgRNA3 5 4

log PFU/mL sgRNA4 3 moi = 0.01 sgRNA5 wt MHV wt MHV sgRNA6 2 BM/B226+M301-226, M30-nt/M BM/B32+B(227-906)32-nt, 227-906/M sgRNA7 1 BM/B(1-906)1-906/M 0 0 4 8 121620242832 12 3 4 5 Hours Post Infection

FIG. 3.6. Functional exchangeability of the 5’ UTR-nsp1 regions in the viral genome between BCoV and MHV. (A) Specific interaction between the poorly compatible region within the 5’ UTR and nsp1 of BCoV and MHV. Shown is the schematic of testing the interaction between 5’ UTR and nsp1 by replacing MHV sequences (solid line) with BCoV counterparts (dashed line) in the reverse genetics system of MHV. Note that the 5’ 16-nt of nsp1 from either BCoV or MHV is along with its cognate 5’ UTR to maintain the integrity of long-SLIV as described in the Chapter II. (B) Plaque phenotypes of lethal or viable chimeric viruses. (C) Indistinguishable growth kinetics of the chimera with entire BCoV 5’ UTR and nsp1 relative to that of wt MHV. Multiple-cycle growth kinetics at an MOI of 0.01 PFU/cell of the chimeric viruses were compared to that of wt. Data shown are average of three independent experiments. (D) Northern analysis for viral RNA synthesis of wt and the chimeric viruses. Cultures of DBT cells were mock-infected or infected with an MOI of 0.01, and intracellular RNA was extracted at 20 hours post infection. The Northern analysis used a 32P-radiolabeled probe specific for the 3’ UTR as described in the Materials and methods. gRNA: genomic RNA. sgRNA: subgenomic RNA.

56 throughout (presumably to enable the 3-way interaction described above). The results with three more chimeras support this conclusion as well. In the first, a chimera with the BCoV 5’ UTR extended 16-nt into the nsp1 cistron and containing the MHV 30-nt element in the MHV background, B1-226, M 30-nt/M, was tested, and it was viable but debilitated, giving rise to medium-size plaques and growth kinetics that lagged wt MHV and reached a final titer 100-fold lower than wt MHV (Fig. 3.6 A, B and C). In the second, a chimera with the MHV 5’ UTR extending 16-nt into the nsp1 cistron and containing the BCoV 32-nt element and a BCoV nsp1 cistron, B32-nt, 227-906/M, was tested, and its behavior essentially mimicked that of B1-226, M 30-nt/M (Fig. 3.6 A, B and C). That is, it replicated but was somewhat debilitated (Fig. 3.6 A, B and C). In the third construct, B1-906/M, the entire BCoV 5’-terminal sequence through the nsp1 codon was placed into the MHV background and was viable and indistinguishable from wt MHV (Fig. 3.6 A, B and C). Thus, the BCoV 5’ UTR and adjacent nsp1 cistron function together in the MHV background to support the replication of a wt-like MHV. This result is fully consistent with the notion that there is virus species-specific cross talk between the 5’ UTR region and the nsp1 coding region, or the nsp1 protein itself, for wt-like replication. The 5’ UTR and adjacent nsp1 cistron of the group 2b SARS-CoV in the MHV background does not support chimera replication. While it has been previously demonstrated that the SARS-CoV 3’ UTR was able to functionally replace the MHV 3’ UTR (51), the same was not true for the 5’ UTR (71). It was shown, however, that some putative cis-acting elements from the SARS-CoV 5’ UTR can function in the MHV genome (71). It was also shown in the same study that a SARS-CoV transcription regulatory sequence (TRS) (i.e., the RdRp template-switching signal) in the SARS-CoV SL3 (nt 58-72) cannot support MHV replication (71). Our repeat of these experiments utilizing two different strategies for introducing the MHV TRS into the SARS-CoV 5’ UTR in chimeras S5’UTR-1/M and S5’UTR-2/M yielded lethal viruses (Fig. 3.7). Nor did replacement of the MHV nsp 1 cistron with the SARS-CoV cistron in chimera S265-804/M yield a viable virus (Fig. 3.7). To apply the lesson learned above with the BCoV and MHV chimeras, namely, that the cognate 5’ UTR and nsp1 cistron from one virus (e.g., BCoV) can function together to

57

10 11 1 2 3 4 5 6 7 8 9 5’ 12 13 14 15 16 An 3’

MHV SARS-CoV 1 209 950 5’ UTR nsp1 (247 aa) MHV Virus recovery

8 wt MHV ~1 × 10 PFU/mL MHV-TRS S5’UTR-1/M lethal 72 73 nt MHV-TRS S5’UTR-2/M lethal 57 73 nt

S265-804/M lethal 1-5 230-247 aa aa MHV-TRS S804-1/M lethal 72 73 nt 230-247 aa MHV-TRS 804-2 57 73 S /M lethal nt 230-247 SARS-CoV aa 5’ UTR nsp1 (180 aa) 1 264 804

FIG. 3.7. Functional inexchangeability of the 5’ UTR-nsp1 domains in the viral genome between SARS-CoV and MHV. The diagrams illustrate the construction of MHV chimeras whose 5’ UTR and/or nsp1 fragments (solid lines) were replaced with their SARS-CoV counterparts (hatched lines) in the reverse genetics system of MHV. Lengths are proportional to the genetic distance. To enable TRS-dependent transcription in the MHV background, the chimeras bearing SARS-CoV 5’ UTR also contained the MHV-TRS as indicated. The carboxyl-terminal 18 amino acids of MHV were also fused with SARS-CoV nsp1 to enable proteolytic cleavage of nsp1 from the polyprotein by the MHV papain-like cysteine protease. MHV-TRS: MHV transcription regulatory sequence (5’ AATCTAATCTAAACTTTA 3’).

58 yield a viable chimera with another virus (e.g., MHV), we tested just such chimeras between the SARS-CoV and MHV. Two chimeras were made, both with a MHV TRS to make leader fusion compatible but with different lengths of TRS-containing flanking sequence, however both constructs were nonviable (Fig. 3.7). Thus, while some cis-replication elements in the 5’ UTR of SARS-CoV are compatible in the MHV background, e.g., stem-loops 1, 2 and 3, the specific 5’ UTR-nsp1 interaction was not functionally conservative between group 2a and 2b coronaviruses, at least by the analytic method used here.

DISCUSSION In this study we attempted to determine how it is that a 32-nt region of the BCoV 5’ UTR mapping between SLIII and SLIV (nt 142-173) cannot readily replace its 30-nt counterpart in the MHV 5’ UTR (nt 141-170) in the MHV background while the three remaining 5’ UTR structures in BCoV (SLI and SLII [together], SLIII, and [a long form of] SLIV) function immediately in the MHV background to yield viable chimeric viruses with near wt-like MHV properties (Chapter II). We learned that a BCoV142-173/MHV chimera (also called B32-nt/M chimera), after first appearing nonviable in the electroporated cells and in the first cell passage post-transfection, grew poorly with mixed plaque sizes after a second blind cell passage. After further virus passaging (i.e., virus passages 2 through 10), progeny viruses became more like MHV wt. That is, plaques became large and growth kinetics mimicked that of wt MHV. Genome sequence analyses of the 5’-terminal 1000-nt and 3’-terminal 500-nt at virus passage 6 and 10, furthermore, identified three potential suppressor mutations outside of the transplanted BCoV 32-nt region, and two potential “revertant” mutations within the transplanted 32-nt region. While suppressor mutations could have occurred anywhere within the 32-kb genome, reconstruction experiments in which the suppressor mutations were placed back into the MHV genome along with the BCoV-specific 32-nt sites, revealed that one or more combinations of suppressor and revertant mutations found within the 5’-terminal 400-nt alone were sufficient for phenotypic reversion of the chimera to a MHV wt-like virus. These results indicated that these probably are the only mutations causing the reverted phenotype. Furthermore, a study of

59 the suppressor and reversion mutations led us to conclude that there are 2- and 3-way interactions between nt 30 in the SLI, nt 160 and/or nt 166 in the 30-nt region within the 5’ UTR, and aa 52, encoded by nt 363-365, within the nsp1 cistron, that are necessary for virus replication. To our knowledge, this is the first demonstration of a functional role for nsp1 in the behavior of viral genomic RNA. In earlier studies, we had shown that nsp1 is an RNA binding protein that binds cis-replication elements in both the 5’-terminal and 3’-terminal regions of the genome (55) (see Appendix), and in studies in other laboratories it has been shown that nsp1 in MHV and the SARS-CoV exerts profound effects on cellular mRNA stability and translational capability (69, 70, 189). In the current study, we observed that the mutation at nt 363 (and possibly also at nt 353) within the nsp1 ORF behaved as a suppressor mutation enabling replication of the B32-nt/M chimera for BCoV. We postulate, therefore, that this charged amino acid in the nsp1 protein is involved in a specific interaction with C30 in SLI and G168 in the 32-nt region to facilitate replication of the chimera. That is, the charged amino acid is important for replication of MHV but not BCoV in this context. The rationale for our hypothesis that the nsp1 protein (and not the encoding RNA) is acting in the 3-way interaction for regulation of replication as observed in this study is the following: (i) A detailed analysis of this region of nsp1 described in Fig. 3.8 shows an amino-proximal pattern of charged amino acids that are different between the BCoV and MHV clades of subgroup 2a coronaviruses (Fig. 3.8 A). These patterns, as well as the overall aa similarity of 58% between the two viruses, probably have an important influence on the folding of nsp1 and its interaction with RNA. (The group 2 SARS-CoV, by comparison, a distant group 2b outlier, has yet a more remarkable difference [not shown]). The suppressor mutations K48N and K52E found in MHV nsp1 are both potential RNA-binding residues located in a predicted intrinsically disordered region (86). They are, however, not conserved between MHV-like and BCoV-like viruses making them candidates for virus-specific RNA-protein interactions such as might be the case with nsp1 (Fig. 3.8 B). (ii) The charged amino acids within the amino-proximal half of nsp1 are vital for MHV genome replication as documented in studies from Brockway et al. (17). This is

60

FIG. 3.8. Phylogenetic relationships of 5’ UTR nucleotides and nsp1 amino acids among group 2a coronaviruses. (A) Alignments of amino acids 1 through 83 of nsp1 of group 2a coronaviruses. At the top is the diagram of putative secondary structure of MHV-A59 nsp1 as derived from the nuclear magnetic resonance data of the homologous region of SARS-CoV nsp1 reported by Almeida et al. (3). Loci of the revertant mutations, K48N and K52E, in nsp1 are shown. Beneath this is nsp1 amino acids alignment of group 2a coronaviruses. Potential intrinsically disordered regions in nsp1 of MHV and BCoV predicted by Disembl (86) are shown on top with dashed rectangles. Potential RNA-binding residues predicted by BindN (162) for MHV and BCoV are shown in “+” and “Δ”, respectively. The tentative clade classification of group 2a coronaviruses is noted at the right. Amino acids identical across all group 2a coronaviruses are highlighted in black. Amino acids conserved in BCoV-like but are similar in MHV-like and intermediate group 2a coronaviruses are highlighted in gray. Mutations made in MHV-A59 nsp1 by Brockway et al. (17) that are detrimental (open asterisk) or lethal (solid asterisk) for viral replication are shown. Regions in the proteins coinciding with the coding regions harboring SLV and SLVI are underlined with grid rectangles. (B) Phylogenetic analysis of 5’ UTR nucleotides and nsp1 amino acids of group 2a coronaviruses and SARS-CoV. The trees were constructed by the neighbor-joining method and bootstrap values calculated from 1,000 trees are shown. Putative clades designated as MHV-like, BCoV-like, or intermediate as describe in Chapter II are shown. Abbreviations for coronaviruses are as follows: MHV, mouse hepatitis virus (strains -A59, -JHM, and -2); CoV-HKU1, human coronavirus-HKU1; ECoV, equine coronavirus; BCoV, bovine coronavirus (strains -Mebus and -ENT); HCoV-OC43, human coronavirus-OC43; HECoV-4408, human enteric coronavirus-4408; PHEV, porcine hemagglutinating encephalomyelitis virus; Calf-Giraffe CoV, calf-giraffe coronavirus; Antelope CoV, sable antelope coronavirus; SARS-CoV, severe acute respiratory syndrome coronavirus.

61

A 1 132 3 4 5 6 flexibly disordered β α β 10 β β β β flexibly disordered MHV-A59 nsp1 1 80 160 247 48 52

potential intrinsically disordered regions

48 52 3 6 57 64 69 7879 N E +++ ++++ +++++ ++++ MHV-A59 1 MAKMGKYGLGFKWAPEFPWMLPNASEKLGNPERSEEDGFCPSAAQEPKVKGKTLVNHVRVNCSRLPA-LECCVQSAIIRDIFVD 83 MHV-JHM 1 MAKMGKYGLGFKWAPEFPWMLPNASEKLGNPERSEEDGFCPSAAQEPKVKGKTLVNHVRVDCSRLPA-LECCVQSAIIRDIFVD 83 MHV-like MHV-2 1 MAKMGKYGLGFKWAPEFPWMLPNASEKLGSPERSEEDGFCPSAAQEPKTKGKTLINHVRVDCSRLPA-LECCVQSAIIRDIFVD 83 CoV-HKU1 1 MIKTSKYGLGFKWAPEFRWLLPDAAEELASPMKSDEGGLCPSTGQAMESVGFVYDNHVKIDC-RCILGQEWHVQSNLIRDIFVH 83 Intermediate ECoV 1 MAKINKYGLDLQWAPEFPWMFEDTEEKLDNPSGSEVGIICPTTAQELGTCRITFENHVMVDCRRLI--QEGCVQSNLIREIKMN 82 BCoV-Mebus 1 MSKINKYGLELHWAPEFPWMFEDAEEKLDNPSSSEVDIVCSTTAQKLETGGICPENHVMVDCRRLLK-QECCVQSSLIREIVMN 83 BCoV-ENT 1 MSKINKYGLELHWAPEFPWMFEDAEEKLDNPSSSEVDIVCSTTAQKLETGGICPENHVMVDCRRLLK-QECCVQSSLIREIVMN 83 HCoV-OC43 1 MSKINKYGLELHWAPEFPWMFEDAEEKLDNPSSSEVDMICSTTAQKLETDGICPENHVMVDCRRLLK-QECCVQSSLIREIVMN 83 BCoV-like HECoV-4408 1 MSKINKYGLELHWAPEFPWMFEDAEEKLDNPSSSEVDIVCSTTAQKLETGGICPENHVMVDCRRLLK-QECCVQSSLIREIVMN 83 PHEV 1 MSKINKYGLELYWAPEFPWMFEDAEEKLDNPSSSEVDIVCSTTAQKLETGGICPENHVMVDCRRLLK-QECCVQSSLIREIVMN 83 Calf-Giraffe CoV 1 MSKINKYGLELHWAPEFPWMFEDAEEKLDNPSSSEVDIVCSTTAQKLETGGICPENHVMVDCRRLLK-QECCVQSSLIREIVMN 83 Antelope CoV 1 MSKINKYGLELHWAPEFPWMFEDAEEKLDNPSSSEVDIVCSTTAQKLETGGICPENHVMVDCRRLLK-QECCVQSSLIREIVMN 83 ΔΔ ΔΔΔ Δ ΔΔ ΔΔΔ ΔΔ Δ ΔΔ Δ Δ

SLV region SLVI region

B Co BC V- B oV- C like like Calf-Giraffe CoV e Antelope CoV ik HCoV-OC43 Calf-Giraffe CoV -l BCoV-Mebus PHEV Antelope CoV V H BCoV-ENT HECoV-4408 e BCoV-ENT HECoV-4408 M ik MHV-JHM 934 -l BCoV-Mebus PHEV V MHV-A59 H HCoV-OC43 1000 M MHV-JHM ECoV MHV-2 MHV-A59 981 1000 1000 ECoV MHV-2

1000 679 1000 574 group 2a group 2a Intermediate

CoV-HKU1 Intermediate

CoV-HKU1

5’ UTR nsp1

group 2b group 2b 0.10 0.10 SARS-CoV SARS-CoV

62 specifically true for residues conserved among all group 2a coronaviruses known at the time including K3, K6, H57, R64, E69, R78, and D79 which were crucial or essential for viral replication (Fig. 3.8 A) (17). The K48 and K52 residues were not included in this set of analyses since they do not exist in the genomes of BCoV-like viruses (17). (iii) None of the nt point suppressor (and revertant) mutations encountered in the present study caused more than a trivial disruption of the corresponding RNA sequence as evaluated by Mfold predictions. Nor could any base-pairing between nsp1 coding sequence at these sites and 5’ UTR sites be found by inspection. (iv) The BCoV nsp1 has been previously shown to be an RNA-binding protein that interacts specifically with 5’ UTR cis-acting RNA elements (55). (v) The two amino residues, K48 and K52 belong to an intrinsically disordered domain (IDD) based on bioinformatic predictions (90) and on homologous alignment with NMR-determined SARS-CoV nsp1 structure (Fig. 3.8 A) (3). Many of the IDD containing proteins undergo induced folding, i.e., disorder-to-order transition, upon binding to their physiological partners, including DNA, RNA and proteins (167). The precise 3-way interactions proposed here for regulatory function are difficult to envision in a precise way. Of special interest is the interaction between SLI (C30) and nsp1. It is not clear how the C30U mutation located within SLI relieves the problem caused by the downstream long-ranged BCoV 32-nt substitution because no apparent RNA base-pairing can be identified. The C30U mutation, however, did enable replication of the chimeric genome (Fig. 3.3) and it was able to rescue the severe growth defect resulting from the A363G (K52E) mutation in nsp1 (Fig. 3.5). Thus, we speculate that the 5’ UTR-nsp1 interaction is tri-partite among SLI, 30-nt and nsp1 and that the cross talk between SLI and 30-nt RNAs is probably mediated by a protein bridge involving nsp1 and possibly other RNA structures and/or other viral or cellular proteins. Hence, we envision that the substitution of BCoV 32-nt forces a slightly different conformation of the nsp1 protein that leads to an SLI mutation (C30U) and/or nsp1 interactions as the result of mutations such as K48N and K52E. This idea is consistent with the fact that the C30U mutation, in the context of wt 5’ UTR, did not cause a detectable effect on the viral phenotype in the MHV background (Fig. 3.4). It is also possible that the BCoV 32-nt element in the MHV background interferes with the folding of the entire 5’ UTR cis-acting structures such that

63 mutations within the SLI and/or nsp1 are required to regain their functional activities. A related observation is that placement of the BCoV 5’ UTR and nsp1 in the MHV background enabled the chimera to replicate as efficiently as wt MHV (Fig. 3.7). Further mutational analysis will be needed to precisely define the nsp1-5’ UTR interactions identified in this study. What is the biological significance of 5’ UTR-nsp1 interaction? Given that the coronavirus genomic RNAs are capped and polyadenylated, and that many precedents exist that describe translation regulation through the binding of regulatory proteins to 5’-proximal elements, it is reasonable to speculate that nsp1, an early protein in coronavirus replication, regulates genome translation in a similar fashion. For positive-strand RNA viruses that replicate in the cytoplasm, there is a need for a mechanism to inhibit genome translation at some point since translation (requiring a 5’Æ3’ movement of ribosome) and negative- strand RNA synthesis from the genome template for genome replication (requiring a 3’Æ5’ movement of the viral RdRp) cannot simultaneously occur on the same molecule. A model is proposed here for such a role for coronavirus nsp1 and is illustrated in Fig. 3.9. The proposed model combines some features of a mechanism proposed by Zust et al. (190) for initiating negative-strand RNA synthesis on the coronavirus genome. Consistent with models for translation of eukaryotic mRNAs and positive-stand RNA virus genomes (5, 6, 48, 57), efficient translation would probably be facilitated by circularization of the RNA template through a PABP1-eIF4G-eIF4E protein bridge (Fig. 3.9 A). The 5’-3’-end bridging model for coronavirus genomic RNA is supported by data showing a minimum length requirement for the 3’ poly(A) tail that matches that for PABP1 binding and efficient DI RNA replication (64, 83, 143, 185). In the model proposed by Zust et al. for initiating coronavirus negative-strand synthesis, an essential 3’-terminal cis-acting bulged stem-loop forms a thermodynamically stable interaction with an upstream intra-pseudoknot loop sequence to prevent the pseudoknot formation in order to support the 5’-3’-end bridging and genome translation at the appropriate time (Fig. 3.9 A). According to their model, once genome translation is completed, viral proteins signal a 3’ end change wherein a supercomplex is formed to initiate negative-strand RNA synthesis (Fig. 3.8 B). We postulate that nsp1 is one of these signaling proteins. We envision that nsp1 interacts with

64

FIG. 3.9. Hypothetical model for a switch from translation to initiation of negative-strand RNA synthesis during coronavirus replication. (A) Upon cell entry after infection, the translation of positive-strand viral genome utilizes host cap-dependent machinery and a circularization resulting from a bridging by proteins of eIF4E, eIF4G and PABP. This step is based on the widely-supported translation model of most eukaryotic mRNAs. At this stage, formation of the pseudoknot (PK) is prohibited by closure of the bottom segment of the 3’-terminal bulged stem-loop through base-pairing between the PK loop and the 3’ end of the genome. (B) After translation and processing of the coronavirus polyproteins pp1a and pp1ab, viral nsp1 binds the 5’ UTR to exclude initiation factors and down-regulates genome translation. The 5’ end translation-silencing signal is then transduced to the 3’ end by protein-protein interaction between nsp1, nsp7 and nsp10, thus recruiting nsp8 and nsp9 to bind the stem formed by the PK loop and the 3’ end. Negative-strand RNA synthesis is then initiated by the primase activity of nsp8, then allowing the PK to form. (C) Once PK formation is achieved, the main RNA synthesizing enzyme, RdRp (nsp12), along with other replication-related nonstructural proteins from the translation of ORF 1 join the giant protein complex and carry out elongation of the negative-strand primer. Not shown are possible roles played by other viral and cellular factors. eIF: eukaryotic initiation factor; PABP, poly(A) binding protein.

65 A BSL HVR

z zz z z z zz

PK loop

S A 600S AA 6 AA 0SS AA 440 3’AAAAAAAAAAAA AAA AA IV III PABP I

Genome zz z 40S 60S eIF 4G II 40S translation 60S

eIF 4E 5’

60S60S 60S60S 40S40S 40S40S

B

HVR z zz z z z zz

9 9 A8 AA 8 AA AA 77 3’AAAAAAAAAAAA AAA AA 1010

PABP

nsp1 Transition nsp1 zz eIF 4G z

eIF 4E 5’

C HVR PK zz z z z z zz

14 1313 1515 14 1616 RdRpRdRp UU U A UU A 99 U AA 88 UU A 5’ UUUUUUUUUUUUUUUU AA 3’AAAAAAAAAAAA AAA AA 77 1010

Negative-strand RNA synthesis zz z

5’

66 the 5’ UTR and with the nsp7-nsp10 complex (18) which in turn accompanies nsp8 (a primase) and nsp9 (another RNA-binding protein) to bind the stem formed between the pseudoknot loop 2 and the extreme 3’ end of the genome (190). The nsps7-8-9-10 complex is a postulated pre-replication complex required for synthesis of an oligonucleotide primer by nsp8 which is later used by the primer-dependent RdRp (or nsp12) to complete negative-strand RNA synthesis (36, 41, 66, 68, 120, 148, 149, 180, 190). Following primer synthesis, the pseudoknot takes shape and recruits the RdRp and other viral proteins, and a mature replication complex is established which then elongates the negative-strand antigenome (Fig. 3.9 C). It should be noted that the viral protein N (113, 151, 182, 184, 186) and cellular proteins such as PTB (65, 81), hnRNP A1 (64), and chaperones (109, 110, 176, 177), shown previously to interact with 5’ and 3’-terminal cis-replication elements, are not depicted here but are highly likely participants in the process. In a final note, many intrinsically disordered proteins fold flexibly and bind different targets depending on physiological conditions and intracellular locations, and in some cases play key roles in the assembly of supercomplexes (40). These properties may describe the multi-functional nsp1 which we postulate is playing key roles as a regulatory protein in viral genome translation and replication. This postulate arises from the genetic evidence obtained in this study. We anticipate that our model will facilitate further studies on the mechanisms governing coronavirus RNA translation and replication.

67 CHAPTER IV. STEM-LOOP III IN THE 5’ UTR, IDENTIFIED

PREVIOUSLY AS A CIS-REPLICATION ELEMENT FOR BCOV DI RNA, IS NOT ESSENTIAL FOR REPLICATION OF THE MOUSE HEPATITIS CORONAVIRUS GENOME

INTRODUCTION Primary sequences and higher-order structures in the 5’ and 3’ UTRs of coronaviruses have been shown to function as cis-acting elements for replication of either a helper virus- dependent DI RNA replicon, or for replication of the full-length viral genome. The terminal genomic cis-replication RNA signals have been best characterized in the group 2 coronaviruses through studies using the BCoV (for which only a DI RNA replicon has been studied since a reverse genetics system is not yet available), the MHV (for which both a DI RNA replicon and a reverse genetics system for the full-length viral genome are available), and the SARS-CoV (for which a DI RNA replicon is not available, and a reverse genetics system is available) (15, 97). To date, from several studies with BCoV and MHV that have been carried out in different laboratories, and from Chapters II and III above, an overall configuration of the cis-replication elements within the 5’ UTR of the group 2a coronaviruses (BCoV and MHV) can be summarized as depicted in Fig. 4.1. The 5’ UTR cis-replication elements are four highly structured stem-loops (SLs) named I (comprised of the two smaller SL 1 and 2), II, III, and IV, and a relatively unstructured segment of 30 nt in MHV (32 nt in BCoV) that shows virus species-specific replication properties (Chapters II and III). At this time it is not known precisely what each cis-acting element does. The 5’ UTR RNA replication signals could be regulating genome translation and/or the subsequent steps of genomic and subgenomic RNA syntheses. The regulation involving any of these steps presumably requires an interaction with trans-acting viral and/or cellular proteins. It should be noted that the initial studies used to identify the coronavirus 5’ UTR cis-replication signals utilized exclusively the DI RNA systems which have the advantage of identifying cis-acting signals independent from any trans-acting signaling function. That is, the cis-acting activity can be evaluated independently from any alternative function (such as encoding a trans-acting

68

wt MHV

III MHV (83-140 nt) [BCoV (85-141 nt) ]

G C ↑ U

U ↑ C G

U ↑ C-G C-G

G-C ↑ B85-141/M 100 U•C G

83 A-U ↑ 120 140

U ↑ C-G A

↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ 5’ CACUUCCUGCGUGUC-GUCUUGUCAUAGUGCUGACAUU-UGUAG↑ 3’

C UAΔ U AU U G U UA A UCAΔ

IV I

209 zz II III z

5’ 3’ 405 42 56 60 77 97 114 171 225

5’ 1a 3’ 1b An 2a/HE S4 EM N ●●●●●●●● ●●●● ●●●● ●●●● ●●●● ●kb 51015202530

FIG. 4.1. Cis-replication SLIII in the MHV 5’ UTR. The organization of the 31.3-kb MHV genome is shown at the bottom. Above the genome is a current model for the four cis-replication RNA stem-loop structures I, II, III, and IV in the MHV 5’ UTR. Distances as drawn are proportional to the distances in the nucleotide sequence. Solid circles indicate the AUG start codon for ORF 1. The enlarged section is a detailed view of nt 83-140 which are comparable to BCoV nt 85-141 (variant bases in the BCoV sequence are identified by arrows). Note that this region contains the SLIII structure as defined in earlier studies (122) as well as flanking sequences of 14-nt upstream and 26-nt downstream. The start and stop codons for the SLIII-associated intra-5’ UTR ORF are boxed. The plaque phenotypes of wild type (wt) and chimeric MHV B85-141/M are displayed at the right.

69 protein) for the same genetic element (19). Making a distinction between a cis and a trans function for a given genetic element is difficult when using only the viral genome and a reverse genetics system. Therefore, DI RNA analyses are still the method of choice for identifying viral cis-replication elements. The identity of the BCoV 5’-proximal SLIII as a higher-order cis-replication element was determined by enzymatic probing and by mutation analysis (122). The structure of the stem was shown to be required for DI RNA accumulation through mutation experiments that disrupted its structure and concurrently destroyed its replicating ability and then compensatory mutations that restored its base-paired structure and concurrently restored its replicating ability (122). In addition, a SLIII-associated intra-5’ UTR AUG-initiated short upstream ORF (uORF) was shown to be strongly preferred by the BCoV DI RNA for optimal replicating ability (122). Consistent with these apparent requirement for DI RNA replication, SLIII, along with its associated uORF, appears highly conserved among group 2 coronaviruses and to be similarly conserved in coronavirus groups 1 and 3 implying an important biological function (122). Further studies also showed that seven host proteins of unknown identity as well as three unidentified viral proteins of 55, 38, and 22 kDa from BCoV-infected cells could be UV-crosslinked to the SLIII probe, suggesting that these proteins may be trans-acting factors for SLIII function (S. Raman, PhD dissertation, 2003). These experimentally determined properties together suggested that SLIII is a crucial cis-replication element for coronavirus DI RNA replication and therefore probably is also required for coronavirus genome replication. Here, to establish the functional importance of SLIII and its associated short uORF for MHV replication, a set of mutation experiments on SLIII in the context of the MHV genome was carried out with the use of a reverse genetics system. It was learned, surprisingly, that several structural mutations of SLIII and even deletion of a large portion of SLIII, including deletion of the associated short uORF, and eventually deletion of the entire long form of SLIII, are tolerated by MHV for virus replication. The results of this study raise new questions about the role of SLIII in coronavirus replication and lead us to speculate on how its function could be (at least partially) determined.

70 MATERIALS AND METHODS Cells and viruses. DBT cells, L2 cells, and BHK-MHVR cells were grown in media and conditions as described in Chapters II and III. Plasmid constructions for mutations in fragment A of the infectious clone for MHV-A59. The general strategy to generate mutated or truncated MHV-A59 infectious clone was as described in Chapters II and III. Site-directed mutagenesis was made by overlap PCR with the primers shown in Table 4.1. Briefly, to make B85-141/M-loopC, the first PCR was done with the primers G106C-G108C(-) and A59-ScaI, and the second PCR was done with the primers G106C-G108C(+) and T7startMHV, wherein both PCR reactions the A fragment for B85-141/M assembly was used as the amplification DNA template. PCR fragments were chromatographically purified and used in a third PCR with the primers A59-ScaI and T7startMHV. The other SLIII-associated mutants were established following the same PCR scheme except for the use of different designated primers. To make B85-141/M-loopA, primers G106A-G108A(-) and G106A-G108A(+) were used; for B85-141/M-loopU, primers G106T-G108T(-) and G106T-G108T(+) were used; for B85-141/M-stemL, primers BCoV SL3-2L(-) and BCoV SL3-2L(+) were used; for B85-141/M-stemR, primers BCoV SL3-2R(-) and BCoV SL3-2R(+) were used; for B85-141/M-stemLR, primers BCoV SL3-2LR(-) and BCoV SL3-2LR(+) were used; for MHVΔ96-115, primers MHV(96-115)del(-) and MHV(96-115)del(+) were used; for MHVΔ91-120, primers MHV(91-120)del(-) and MHV(91-120)del(+) were used; for MHVΔ80-130, primers MHV(80-130)del(-) and MHV(80-130)del (+) were used; and for MHVΔ75-138, MHV(75-138)del(-) and MHV(75-138)del(+) were used. Assembly of the full-length MHV-A59 infectious recombinant virus RNA. Assembly of full-length MHV-A59 infectious recombinant construct was carried out as described in Chapters II and III. Characterization of viral mutants by RT-PCR, viral RNA sequence analyses, and growth kinetics measurement. Analyses of plaque morphology and genome sequencing of recovered progeny viruses were carried out as described in Chapters II and III. Northern blot analysis of viral RNAs. Northern blot assays were performed as described in Chapters II and III.

71 TABLE 4.1. Oligonucleotides used in Chapter IV

a b c Binding Oligonucleotide Polarity Sequence (5’Æ3’) region (nt) T7 promoter T7startMHV + gtcggcctcttaatacgactcactatagTATAAGAGTGATTGGCGTCCG MHV 1-21 A59-Sca I - CGCCTTGCAGGAGTACTTACCCTTC MHV 899-923 cttgtgggcgtagatttttcatagtggtgtctatattcaTTCCTTGACTTTCGTTCTC MHV 141-162 A59-BCoV(85-141)(-) + TGC BCoV 103-141 ctatgaaaaatctacgcccacaagcatagaatacagggagtgCCGTTTATAAAGTTT MHV 61-82 A59-BCoV(85-141)(+) - AGATTAG BCoV 85-126 G106C-G108C(-) + gtattctatgcttCtCggcgtaga BCoV 93-116 G106C-G108C(+) - tctacgccGaGaagcatagaatac BCoV 93-116 G106A-G108A(-) + gtattctatgcttAtAggcgtaga BCoV 93-116 G106A-G108A(+) - tctacgccTaTaagcatagaatac BCoV 93-116 G106T-G108T(-) + gtattctatgcttTtTggcgtaga BCoV 93-116 G106T-G108T(+) - tctacgccAaAaagcatagaatac BCoV 93-116 BCoV SL3-2L(-) + gtatAGAatgGGtgtgggcgtagatttttcatagtggtgtc BCoV 93-133 BCoV SL3-2L(+) - gacaccactatgaaaaatctacgcccacaCCcatTCTatac BCoV 93-133 BCoV SL3-2R(-) + gtattctatgcttgtgUCcgtUCUtttttcatagtggtgtc BCoV 93-133 BCoV SL3-2R(+) - gacaccactatgaaaaaAGAacgGAcacaagcatagaatac BCoV 93-133 BCoV SL3-2LR(-) + gtatAGAatgGGtgtgUCcgtUCUtttttcatagtggtgtc BCoV 93-133 BCoV SL3-2LR(+) - gacaccactatgaaaaaAGAacgGGcacaCCcatTCTatac BCoV 93-133 MHV 80-95, CGGCACTTCCTGCGTGCTTGTCATAGTGCTGAC MHV(96-115)del(-) + 116-132 MHV 80-95, MHV(96-115)del(+) - GTCAGCACTATGACAAGCACGCAGGAAGTGCCG 116-132 MHV 71-90, MHV(91-120)del(-) + CTTTATAAACGGCACTTCCTCATAGTGCTGACATTTGTAG 121-140 MHV 71-90, CTACAAATGTCAGCACTATGAGGAAGTGCCGTTTATAAAG MHV(91-120)del(+) - 121-140 GTTTAAATCTAATCTAAACTTTATAAAACATTTGTAGTTCCT MHV 53-79, MHV(80-130)del(-) + TGACTTTCG 131-154 CGAAAGTCAAGGAACTACAAATGTTTTATAAAGTTTAGATT MHV 53-79, MHV(80-130)del(+) - AGATTTAAAC 131-154 GTAGTTTAAATCTAATCTAAACTTTAGTTCCTTGACTTTCGT MHV 50-74, MHV(75-138)del(-) + TCTC 139-159 GAGAACGAAAGTCAAGGAACTAAAGTTTAGATTAGATTTA MHV 50-74, MHV(75-138)del(+) - AACTAC 139-159 MHV MHV-1094(+) - CGATCAACGTGCCAAGCCACAAGG 1094-1117 DI3(+) - cgggatccgtcgacacgcgtttttttttttttttttttt Poly(A) MHV-leader(-) + TATAAGAGTGATTGGCGTCCG MHV 1-21 MHV(605-623)(+) - GTTACACAGGCAGACGCGC MHV 605-623 MHV(261-284)(-) + CCATGGATGCTTCCGAACGCATCG MHV 261-284 MHV MHV(30811-30830)(-) + GGATGGTGGTGCAGATGTGG 30811-30830 MHV MHV-NPol(A)(+) - ttttttttttttttttttttttttttACACATTAGAGTC 31019-31033 MHV gtaatacgactcactatagATGTCTTTTGTTCCTGGGC T7MHV-NATG(-) + 29669-29687 MHV MHV(31094-31122)(+) - CAGCAAGACATCCATTCTGATAGAGAGTG 31094-31122 a The positive and negative symbols in the oligonucleotide names indicate the polarity of the nucleic acids to which the oligonucleotides anneal. b Polarity of the oligonucleotide relative to the positive-strand viral genome. c Lower-case bases represent non-MHV sequences. Capitalized bases represent MHV sequences. Bold-capitalized bases denote mutated changes.

72 RESULTS A wild-type-like loop sequence and a wild-type-like stem integrity of the “short” SLIII are not required for MHV replication. Experiments described in Chapter II demonstrated that the wt BCoV short SLIII with flanking sequences (nt 85-141), differing by ~30% in nt sequence from wt MHV counterpart (nt 83-140) (Fig. 4.1), when used to construct a B85-141/M chimera enabled the chimera to make nearly MHV wt-like plaques, and to have nearly wt-like growth kinetics, although the final virus titer of the chimera is approximately tenfold less than that of wt. This demonstrated that some variation in SLIII structure was tolerated for MHV virus replication. To further explore the structural requirements for SLIII function, the B85-141/M chimera was used as a starting point and further mutations were made to alter the loop and stem (Fig. 4.2), and effect of these on virus replication were studied. Since loops (terminal and internal) and bulges within an RNA helix structure are known to be sites for protein binding (101), we first sought the effects of changes in these for virus growth and began with an analysis of the SLIII terminal tetraloop. Tetraloops are secondary structural motifs that stabilize hairpins in a variety of RNAs (106). Among all currently identified group 2a coronaviruses, the consensus sequence for the SLIII tetraloop is YGYG, where Y stands for any pyrimidine (27, 122). This conservation in sequence suggests the second and fourth guanines probably play a role in the function of the tetraloop and they were therefore selected for mutation analysis. Mutants B85-141/ M-loopC, B85-141/M-loopA, and B85-141/M-loopU in which both guanines in the tetraloop were changed to cytosines, adenines, and uracils, respectively, were made and tested (Fig. 4.2 A). In vitro-prepared mutant genomic RNAs were electroporated into host cells and the character of progeny viruses, if any, were monitored. To our surprise, all three mutants were viable and none were impaired with respect to plaque size or progeny titers (~1×107 PFU/ml) when compared to those of B85-141/M (Fig. 4.2 A). This indicated that the character of the two tetraloop bases, despite their conservation among known group 2a coronaviruses, is not critical for SLIII function in virus replication. We next tested the group 2a-conserved features of the SLIII helix. By site-directed mutagenesis three mutants were created for testing (Fig. 4.2 B). In mutant B85-141/M- stemL, five of the eight bases in the left arm were changed leaving only the SLIII-associated

73

A B85-141/M-loopC B85-141/M-loopA B85-141/M-loopU

C U A U U U U C U A U U U-G U-G U-G C-G C-G C-G G-C G-C G-C U-G U-G U-G A-U A-U A-U U-A U-A U-A C-G C-G C-G 5’UAUU-AUUU 3’ 5’UAUU-AUUU 3’ 5’UAUU-AUUU 3’

B B85-141/M-stemL B85-141/M-stemR B85-141/M-stemLR G U G U G U UG UG UG G G U U G-U G G C C G-C G-C G-C G-C U-G U-G U-G A-U A-U A-U A A U U A-U G G C C G-C 5’UAUA AUUU 3’ 5’UAUU UUUU 3’ 5’UAUA-UUUU 3’

FIG. 4.2. Stem and loop mutants of BCoV SLIII in the MHV genome. For each mutant, highlighted nucleotides in black are those that were changed from the chimera B85-141/M. The start codon for the SLIII-associated intra-5’ UTR ORF is boxed. Plaque phenotypes are shown below the depiction of mutants.

74 AUG codon intact and altering the second amino acid (LÆG) in the putative 8-aa short peptide derived from the intra-5’ UTR uORF. In mutant B85-141/M-stemR, five of the eight bases in the right arm were changed that maintained the CGU base pairing with the AUG start codon in the left arm and altered two amino acids (GÆS and DÆL) in the putative 8-aa peptide product. In mutant of B85-141/M-stemLR, the entire stem structure was replaced with one that used a different set of bases yielding a stem with the same free energy as B85-141/M, and yielded a peptide differing by three amino acids (LÆG, GÆS, and DÆL) from B85-141/M. For each mutant, infectious RNA was prepared and electroporated into host cells and the progeny were evaluated. Surprisingly, each mutant yielded progeny virus with plaque sizes that were indistinguishable from those of B85-141/M (Fig. 4.2 B). That is, they were essentially MHV wt-like. These results suggested that the integrity of SLIII is not necessary for MHV replication in cell culture. Thus, the wt-like plaques and growth phenotypes of the six short SLIII mutants led us to hypothesize that the principal role of SLIII may be to supply a platform for the intra-5’ UTR short ORF which may play a biological role during infection, but is not in itself required for virus replication in cell culture. A deleted “short” SLIII mutant virus is viable and is only mildly debilitated. To further examine the required features of SLIII for MHV replication, we chose to evaluate a “long” form of SLIII that had been previously suggested by alternative Mfold-predictions (Fig. 4.3 A). Interestingly, the long form of SLIII is consistent with a model predicted by other algorithms as well (88) and seems to be a structure phylogenetically conserved among all known coronaviruses (27). The long SLIII in MHV contains two central small loops separated by a helix of three base pairs with a seven base-paired helix both above and below the two central loops. The upper-most helix and the terminal tetraloop comprise the entire short SLIII evaluated above (Fig. 4.3 A). The long SLIII of BCoV has a large (16-nt) central loop with an eight base-paired stem above and a five base-paired helix below the loop. The upper stem and terminal tetraloop here also comprise the entire short SLIII in BCoV (Fig. 4.3 A). For both MHV and BCoV, the existence of the long SLIII remains to be established by structure probing.

75

A III G C C G III C-G MHV C-G BCoV G U G-C UG 100 U•C U-G A-U C-G C-G G-C C-G II U-G U U 100 A-U G C U-A U U TRS-CS C-G G-U U-A II C-G A A U U G-U 120 U A A U TRS-CS U C C UU C C U U G U 120 A A C A A-U U U U A U U A-U C C C C U-A U-A C A U U C-G 60 A-U C U A-U A-U U-A U-A A-U C-G U-A C-G U-A G-C U-A A-U C•U G-U U-A C-G AAAU-A AA C-GACAUUUGUAG U-A CAUC-GUGUCUAUAUUCA

57 60 77 80 130 140 55 80 84 129 141 B MHVΔ 96-115 MHVΔ 91-120 MHVΔ 80-130 MHVΔ 75-138 95 G C 116 U U G-U II C-G II II II G-U 120 TRS-CS U TRS-CS 90 U TRS-CS TRS-CS C C C C 121 A A C A A A C A A A A A U A U U U A U U U A U A C C U-A C C U-A C C C C U U C-G U U C-G U U U U A-U A-U A-U A-U A-U A-U A-U C-G A-U C-G A-U A-U 74 U-A G-C U-A G-C U-A U C•U G-U C•U G-U C•U C AAAU-A AA C-GACAUUUGUAG AAAU-A AA C-GACAUUUGUAG AAAU-A AA ACAUUUGUAG AAAU AG 57 60 77 80 130 140 57 60 77 80 130 140 57 60 77 79 131 140 57 60 140

mutations lethal

FIG. 4.3. Predicted alternative “long” secondary structure for MHV and BCoV SLIII. (A) At left is the Mfold-predicted wild-type (wt) “long” structure for MHV SLIII along with its adjacent upstream SLII. At right is the predicted “long” BCoV counterpart. The previously-described “short” SLIII shown to be required for BCoV DI RNA replication is highlighted in gray. The heptameric UCUAAAC template- switching signal (also called the transcription regulatory sequence or TRS) for transcription initiation is shown in the loop of SLII and is boxed with a dashed line. The start and stop codons for the SLIII-associated intra-5’ UTR ORF are boxed. (B) MHV SLIII truncation mutations tested for replication in MHV. Nucleotides are numbered according to the wt MHV sequence from the first base at the 5’ end of the genome. Plaque phenotypes are shown below the respective mutant with their names given at the top.

76 To establish whether the long SLIII in MHV plays a role in viral replication, truncation and deletion mutations of it in the genetic background of the wt MHV genome were constructed and tested (Fig. 4.3 B). In the first mutation, the upper part of the long SLIII, that is, the part comprising the entire short SLIII, was precisely removed to form MHVΔ96-115. Remarkably, this mutant was viable and immediately after electroporation of the mutated genome formed plaques nearly wt MHV in size (Fig. 4.3 B). A phenotype characterization of this surprising mutant was done by first establishing single-cycle and multiple-cycle growth curves of MHVΔ96-115 progeny (Fig. 4.4 A). Under both conditions of growth, the growth rate of MHVΔ96-115 lagged slightly behind that of wt MHV and the maximal titer reached was approximately tenfold less than that of wt MHV. A Northern blot analysis of viral RNAs from infected cells, furthermore, revealed that the MHVΔ96-115 produced the same set of sgmRNA species and in the same relative proportions as wt MHV (Fig. 4.4 B). These results indicated that the RNA sequence making up the entire structure of the short SLIII is not required for MHV replication in cell culture or for the generation of wt MHV plaques. A deleted “long” SLIII mutant virus is viable, but is greatly debilitated. We next evaluated the requirement for the remaining part of the long SLIII (i.e., the part without the short SLIII component) for MHV replication. For this, a second mutant of wt MHV, MHVΔ91-120, was made and tested in which nt 91-120 was deleted. Nucleotides 91-120 comprise all of the short SLIII and all of the top bulge and central stem in the long SLIII (Fig. 4.3 B). Surprisingly, again, MHVΔ91-120 replicated and formed plaques indistinguishable from wt MHV immediately after electroporation of the mutant genomic RNA (Fig. 4.3 B). Growth curves of this mutant have not yet been done; however, Northern blot analysis of viral RNAs from infected cells shows that sgmRNAs are made in the same relative proportions as with wt MHV but are far less abundant (Fig. 4.4 B). Thus, MHV without nt 91-120 can replicate, but viral growth is slightly more debilitated than virus just missing the short SLIII. In the next mutant evaluated, the entire full-length of the long stem-loop, nt 80-130, was precisely deleted from the wt MHV genome to form mutant MHVΔ80-130. Most surprisingly, this mutant too was viable after electroporation of the mutated genomic RNA

77

A 9 8 8 7 6 7

5 6 4 5

PFU/ml log log PFU/mL moi = 0.01 moi = 5.0 3 wt MHV wt MHV wt MHV 4 wt MHV 96-115 2 MHV(ΔΔ96-11596-115) MHV(ΔΔ96-115) 96-115 MHV(ΔΔ96-11596-115) MHV(ΔΔ96-115) 3 1

0 2 0 4 8 12 16 20 24 28 32 0 4 8 12162024 Hours Post Infection Hours Post Infection

5 0 0 1 2 3 -1 -1 -1 B 6 1 0 V 9 9 8 k H Δ Δ Δ c M V V V o t H H H m w M M M

gRNA

sgRNA2 sgRNA3

sgRNA4 sgRNA5 sgRNA6

sgRNA7

12 345

FIG. 4.4. Growth of MHV SLIII truncation mutants in cell culture. (A) Growth of the truncation mutant MHVΔ96-115 relative to that of wild-type (wt) MHV. DBT cells were infected with wt MHV or MHVΔ 96-115 at an MOI of 0.01 PFU/cell (left) or 5.0 PFU/cell (right). At the indicated times postinfection, aliquots of medium were removed and infectious titers were determined on mouse L2 cells. Open and solid symbols represent results from two independent experiments. (B) RNA synthesis of SLIII truncation mutants in cell culture relative to that of wt MHV. Northern analyses for viral RNA synthesis of wt and the SLIII truncation mutants. Cultures of DBT cells were mock-infected or infected with MOI of 0.01 PFU/cell, and intracellular RNA was extracted at 24 hours post infection. The Northern analysis used a 32P-radiolabeled probe specific for the 3’ UTR as described in the Materials and Methods. gRNA, genomic RNA; sgRNA, subgenomic RNA.

78 and the plaques generated, albeit after a long recovery period after transfection, were large and indistinguishable from wt plaques. A Northern blot analysis of viral RNAs from cells infected with progeny of MHVΔ80-130 revealed that sgmRNA species were produced in the same relative proportions as from wt MHV (Fig. 4.4 B). In addition, a sequence analysis of the genomic 5’-proximal 1000-nt region in the recovered virus showed two mutations, U33A and C37U (data not shown). These two nucleotide changes may be functioning as suppressor mutations and their further analyses might reveal a mechanistic feature of long SLIII function. Finally, an extensive deletion mutation that removed nt 75-138 which included some upstream and some downstream flanking sequences of the long SLIII and formed mutant MHVΔ75-138 was found to be lethal. The results with MHVΔ75-138, therefore, indicate an essential role for at least part of this region of the genome for virus replication.

DISCUSSION In this study, we have shown that the MHV genome is capable of tolerating a wide variety of mutations in the SLIII region (Figs. 4.1 and 4.2). More strikingly, we found that both the “short” (nt 96-115) and “long” (nt 80-130) forms of SLIII, along with the SLIII- associated 24-nt uORF, can be completely removed without killing virus replication in cell culture (Figs. 4.3 and 4.4). These findings differ considerably from those our laboratory reported earlier for the replication of BCoV DI RNA (122). This apparent discrepancy points out some fundamental differences between the replication requirements for DI RNAs and viral genomes that are not fully understood but that have been noted before (50, 55). Namely, these differences might reflect the fact that successful competition by the DI RNA for the helper virus replication machinery might require the DI RNAs to be more sensitive to alternations in structure in order to achieve replication fitness (55). In the virus genome, removal of short SLIII significantly impaired viral replication efficiency as observed in kinetic assays (Fig. 4.4 A), suggesting that in this context SLIII may act as a modulator of RNA synthesis rather than an essential element, per se. That is, its function might differ to suit a particular set of physiological circumstances. Interestingly, a similar observation

79 was reported earlier by S. Goebel et al. (50) for the HVR in the MHV 3’ UTR which acts as a cis-acting element for DI RNA replication. In a DI RNA, this region is required for replication (89), whereas it is not required for replication in the full-length genome. In the mouse host, however, deletion of the HVR from the genome led to a dramatic attenuation of virulence (50). This finding also established that some genetic elements are responsible for tissue tropism which determines the site of virus replication and associated virulence. Indeed, it is known that diseases and clinical symptoms caused by MHV infection vary greatly depending on the strain, dose, and mode of the inoculation of the virus, as well as the genetic background, age, and immune status of the mouse host (33, 56, 118). Thus, it is possible that SLIII may also play a critical role in pathogenesis. This, then, calls for further mouse infectivity studies to determine the potential roles of SLIII RNA element in pathogenesis. It remains to be determined whether the SLIII-associated short uORF benefits the replication of the viral genome. Mutagenesis studies in the group 2a BCoV DI RNA showed that the short uORF or its product may act to enhance replication through a cis-acting influence on translation although its upstream area does not contain highly favorable Kozak sequences (23, 122, 157). Several short uORFs have also been shown to affect the translation of downstream ORFs in eukaryotic and viral RNAs through a variety of mechanisms under a range of pathological condition (80, 107). It would seem possible that the SLIII-associated intra-5’ UTR uORF behaves similarly in the coronavirus genome. In this regard, translational analysis of the genome ORF 1 in the absence of the uORF is needed to determine the uORF’s importance for virus replication in cell culture and the animal host, and to determine the pathogenic consequences of its removal. If the SLIII really plays an important, but not essential, role in viral RNA synthesis, then its function may be that of a regulatory element that could be functioning at any of several steps leading to RNA replication. As such, it is reasonable to speculate that its function (s) will be mediated by trans-acting factors, based on precedents in other positive-strand RNA viruses (6, 48, 57). In this regard, both host and viral proteins should be considered candidate factors for binding the positive- and/or negative-strand of SLIII. This hypothesis is consistent with an earlier finding in our laboratory that translation

80 efficiency of a reporter gene attached to the genomic 5’ UTR of BCoV differs between infected and uninfected cells (133). It is also consistent with the fact that seven unknown cellular proteins and three viral proteins of 55, 38, and 22 kDa bind the positive-strand BCoV SLIII as determined by a UV-crosslinking assay (S. Raman, PhD dissertation, 2003). In addition, a preliminary study with purified BCoV N and nsp1 proteins showed that these interact with SLIII (see Appendix) suggesting their potential roles as SLIII trans-acting factors. Since N has also been shown to be enhancing RNA replication in transfection studies (154), its association with SLIII in connection with this behavior needs to be studied. From experiments described here it was not established whether SLIII functions as a structure in the “short” or “long” form. However, from the preliminary evidence of potential suppressor mutations in SLI for replicating mutants lacking the long-SLIII (Fig. 4.3) it is possible that the flanking areas of long-SLIII may be involved in an interaction with SLI. This inference is consistent with what we learned from studies described in Chapter III wherein the relatively unstructured hypervariable 30-nt (nt 141-170) just downstream of the long SLIII specifically interacts with SLI and the nsp1 coding region (or possibly nsp1 itself) as demonstrated by genetic evidence. It is also conceivable that the long-SLIII itself may be one constituent of a giant interactive complex, and removal of it would potentially reshape the complex thereby leading to the unexpected suppressor mutations within SLI. The dispensability of the SLIII element in the MHV genome for virus replication raises the question of just what is the minimal 5’ UTR requirement for virus replication in cell culture. Answering a similar question regarding the 3’ UTR led to an experimental shortening of a functional MHV 3’ UTR (50, 190). A systematic analysis of the minimal structural requirements for 5’ end may help to define the 5’-proximal mechanisms involved in genome translation, genome replication, and genome-directed discontinuous transcription steps that generate the leader-containing sgmRNAs.

81 CHAPTER V. FUTURE DIRECTIONS INTRODUCTION After the SARS epidemic in 2003, world-wide interest was stimulated to identify new coronaviruses from a variety of hosts that might have served as reservoirs for the origin of the SARS-CoV via interspecies virus jumping and recombination. Significant research effort around the world was also boosted to depict coronavirus replication mechanisms and coronavirus RNA and protein functions with the hope of developing new anti-coronaviral therapeutic agents, and to characterize host immune response with the hope of developing effective new vaccines. Yet many details of coronavirus replication mechanisms remain largely unknown. In this dissertation, two model group 2a coronaviruses, BCoV and MHV, have been investigated to examine their 5’-terminal cis-acting RNA signals for viral genome replication. The results led to the surprising findings that (i) part of the 5’-proximal RNA signaling structure lies within the nsp1 coding region (Chapter II), (ii) a species-specific interaction between sites within the 5’ UTR and nsp1 occurs (Chapter III), and (iii) a large part of 5’ UTR-mapping SLIII, formerly thought of as a critical cis-replication structure, can be deleted without killing the virus (Chapter IV). The second of these discoveries has opened up the possibility that such interactions may occur in other if not all coronaviruses. It also suggests a potential mechanism by which a transition between genome translation and negative-strand RNA synthesis might be regulated. Understanding this mechanism in detail might identify a target for antiviral drug design that could be broadly effective against coronaviruses.

TO IDENTIFY OTHER VIRAL FACTORS THAT INTERACT WITH 5’ GENOMIC RNA CIS-REPLICATION ELEMENTS In our substitution experiments (Chapters II and III) and deletion experiments (Chapter IV), the genomic RNA sequences of recovered viruses were only confirmed for the 5’-terminal 1000 nt and the 3’-terminal 500 nt mainly based on the fact that these components appear in all characterized coronavirus DI RNAs, and they therefore must be essential for replication (15). To explore the possibility that suppressor mutations might

82 exist elsewhere in the genome, whole genome sequencing should be done to find them. Finding them might identify other viral proteins that contribute to suppression of debilitating mutations. Because of the large genome size, however, this has not been feasible for all of the engineered viruses. Particular interests should focus on the phenotypic revertants of the B32-nt/M chimera and the MHVΔ80-130 mutant whose behavior led to the finding of 5’ UTR-nsp1 interactions and nonessential nature of SLIII, respectively. Once nucleotide changes are probed, reconstruction experiments could be done to establish the authenticity of putative suppressor mutations. Further structural and biochemical analyses would likely arise from the identification of novel viral factors. An alternative method for finding other viral trans-acting factors that interact with the 5’ UTR might be affinity purification wherein the entire 5’ UTR RNA from different constructs would be biotin-labeled and used for streptavidin bead pull-down of intracellular proteins extracted from virus-infected cells at different times postinfection. The purified proteins could then be identified by mass spectrometry and sequencing (125). Given that the 5’ UTR likely behaves as a unit for replication based on our results, attention would be given to the intact 5’ UTR instead of individual cis-acting RNA elements. Confirmation of the 5’ UTR- bound viral proteins could then be done by using electrophoretic mobility shift assays (EMSAs) along with purified proteins. Mapping of the putative viral proteins on the 5’ UTR would then need to be done to understand their function.

TO IDENTIFY THE FUNCTIONAL DEFECT ARISING FROM THE EXCHANGE OF THE INCOMPATIBLE SPECIES-SPECIFIC 32 (30)-NT

5’ UTR SEGMENT BETWEEN BCOV AND MHV From genetic evidence described in Chapter III, a species-specific interaction between 5’ UTR and nsp1 was shown to be necessary for virus replication. The precise nature and function of this interaction, however, remains to be elucidated. To learn these, some of the lethal chimeras, such as B1-226/M and B227-906/M (Fig. 3.7), might be useful. First attempt could be addressed to their effects on translation. It is known that the capped infectious RNAs prepared in vitro must undergo translation immediately after transfection into host cells in order to synthesize the replicase proteins needed to initiate replication.

83 Therefore, it should be tested whether the 5’-end replication structures identified in the dissertation and elsewhere are regulators of translation. To establish the capability of these lethal constructs to support translation, in vitro translation could be employed by using the rabbit reticulocyte lysate along with radiolabeling and SDS-PAGE analysis. The study could then be expanded to in vivo system by evaluating translational consequences driven by 5’-terminal cis-replication elements fused with reporter gene. Another way to examine the defects of these mutants is the coupled in vitro translation-replication assays, which has been shown to be quite successful in distinguishing viral genome replication from genome translation under the same experimental system (20). This strategy can be joined with two-cycle ribonuclease (RNase) protection assay for specifically detecting negative-strand RNAs (114). Furthermore, it might also be enormously rewarding to establish an in vitro replication system whereby active MHV replication complexes are isolated from MHV- infected cell lysates and used to assay viral RNA synthesis. This system would also serve as a framework for further investigation of the coronaviral replication cycle. Given that a similar system has been demonstrated for SARS-CoV (158), development of the in vitro MHV replication machinery would certainly facilitate the progress on understanding the modulation of virus replication.

TO CHARACTERIZE THE 5’ UTR-NSP1 INTERACTION The evidence described in Chapter III suggested specific interactions between SLI, the hypervariable 30-nt region between SLIII and SLIV, and the nsp1 which has been partly proven for BCoV as shown in the Appendix. However, it is unknown how SLI and the 30-nt region interact or how this interaction is affected by nsp1 binding. A number of analytical procedures are available for characterizing RNA-protein complexes that could be used. In-line probing and nuclear magnetic resonance (NMR) are two such techniques. The in-line probing assay relies on structure-dependent spontaneous cleavage of RNA and has been shown useful for finding altered structure or altered RNA orientation after ligand binding (102, 124). Comparing the in-line probing profiles of the MHV or BCoV 5’ UTR in the presence and absence of nsp1 might give hints about the structural changes of 5’ UTR as a function of binding. NMR analysis of the nsp1-5’ UTR complex is an alternative way

84 to map and characterize the interactive sites at high-resolution. Both methods can be employed with the nsp1 mutant (K52E) that is detrimental for viral replication (Chapter III). In addition, the 5’ UTR may function as a tertiary RNA structure which is a structure that is poorly understood at present. Current algorithms for predictions of RNA tertiary conformations are available but are not accurate (136). A comprehensive effort will be needed to understand the structure/function relationship of the 5’ UTR and nsp1 binding.

TO EXPLORE THE POSSIBILITY AND MEANS OF 5’-3’-END CROSS TALK OF THE VIRAL GENOME Genome circularization as a function of viral and cellular factors may be a common feature of positive-strand RNA viruses. Circularization potentially provides several advantages to the virus for its replication including coordination of translation and genome replication, coordination of transcription and genome replication, coordination of viral macromolecules for intracellular trafficking of viral components, and controlling viral genome integrity (stability) (44, 48, 57, 179). To further test our proposed model of an nsp1-mediated molecular switch between genome translation and negative-strand RNA synthesis, several fundamental questions should be studied. First, does translation and replication occur at the same template simultaneously? This would not be expected since the ribosome travels in the 5’Æ3’ direction and the RdRp travels in the 3’Æ5’ direction. To answer this question, the genome template can be analyzed for its ability to synthesize RNAs by purified viral RdRp in the presence or absence of translation inhibitor. Besides, translation and RNA synthesis likely take place at different intracellular compartments, meaning that the positive-strand template would be guided to the double membrane vesicle after protein synthesis (158). With regard to this presumption, is nsp1 involved in the events of translation shut-off and genome trafficking? An extended question is progeny RNAs after genome replication still capable of serving as templates for translation in vivo, or they are all packaged into virions and exit the host cells? Are there any 5’-3’-end cross talks during the course of viral infection? What roles the end-to-end communications might play? Future experiments addressed to these issues could bring us closer to understanding how viral translation and replication are regulated in infected cells.

85 LIST OF REFERENCES

86 LIST OF REFERENCES

1. Almazan, F., C. Galan, and L. Enjuanes. 2004. The nucleoprotein is required for efficient coronavirus genome replication. J Virol 78:12683-8. 2. Almazan, F., J. M. Gonzalez, Z. Penzes, A. Izeta, E. Calvo, J. Plana-Duran, and L. Enjuanes. 2000. Engineering the largest RNA virus genome as an infectious bacterial artificial chromosome. Proc Natl Acad Sci U S A 97:5516-21. 3. Almeida, M. S., M. A. Johnson, T. Herrmann, M. Geralt, and K. Wuthrich. 2007. Novel beta-barrel fold in the nuclear magnetic resonance structure of the replicase nonstructural protein 1 from the severe acute respiratory syndrome coronavirus. J Virol 81:3151-61. 4. Alvarez, D. E., M. F. Lodeiro, S. J. Luduena, L. I. Pietrasanta, and A. V. Gamarnik. 2005. Long-range RNA-RNA interactions circularize the dengue virus genome. J Virol 79:6631-43. 5. Andino, R., G. E. Rieckhof, P. L. Achacoso, and D. Baltimore. 1993. RNA synthesis utilizes an RNP complex formed around the 5'-end of viral RNA. Embo J 12:3587-98. 6. Andino, R., G. E. Rieckhof, and D. Baltimore. 1990. A functional ribonucleoprotein complex forms around the 5' end of poliovirus RNA. Cell 63:369-80. 7. Baker, S. C., K. Yokomori, S. Dong, R. Carlisle, A. E. Gorbalenya, E. V. Koonin, and M. M. Lai. 1993. Identification of the catalytic sites of a papain-like cysteine proteinase of . J Virol 67:6056-63. 8. Baric, R. S., K. Fu, M. C. Schaad, and S. A. Stohlman. 1990. Establishing a genetic recombination map for murine coronavirus strain A59 complementation groups. Virology 177:646-56. 9. Baric, R. S., G. W. Nelson, J. O. Fleming, R. J. Deans, J. G. Keck, N. Casteel, and S. A. Stohlman. 1988. Interactions between coronavirus nucleocapsid protein and viral RNAs: implications for viral transcription. J Virol 62:4280-7. 10. Baric, R. S., and A. C. Sims. 2005. Development of mouse hepatitis virus and

87 SARS-CoV infectious cDNA constructs. Curr Top Microbiol Immunol 287:229-52. 11. Bergmann, C. C., T. E. Lane, and S. A. Stohlman. 2006. Coronavirus infection of the central nervous system: host-virus stand-off. Nat Rev Microbiol 4:121-32. 12. Bi, W., J. D. Pinon, S. Hughes, P. J. Bonilla, K. V. Holmes, S. R. Weiss, and J. L. Leibowitz. 1998. Localization of mouse hepatitis virus open reading frame 1A derived proteins. J Neurovirol 4:594-605. 13. Bos, E. C., W. Luytjes, H. V. van der Meulen, H. K. Koerten, and W. J. Spaan. 1996. The production of recombinant infectious DI-particles of a murine coronavirus in the absence of helper virus. Virology 218:52-60. 14. Brian, D. A., and Baric, R.S. 2005. Coronavirus genome structure and replication. Current Topics in Microbiology and Immunology 287:1-30. 15. Brian, D. A., and R. S. Baric. 2005. Coronavirus genome structure and replication. Curr Top Microbiol Immunol 287:1-30. 16. Brockway, S. M., C. T. Clay, X. T. Lu, and M. R. Denison. 2003. Characterization of the expression, intracellular localization, and replication complex association of the putative mouse hepatitis virus RNA-dependent RNA polymerase. J Virol 77:10515-27. 17. Brockway, S. M., and M. R. Denison. 2005. Mutagenesis of the murine hepatitis virus nsp1-coding region identifies residues important for protein processing, viral RNA synthesis, and viral replication. Virology 340:209-23. 18. Brockway, S. M., X. T. Lu, T. R. Peters, T. S. Dermody, and M. R. Denison. 2004. Intracellular localization and protein interactions of the gene 1 protein p28 during mouse hepatitis virus replication. J Virol 78:11551-62. 19. Brown, C. G., K. S. Nixon, S. D. Senanayake, and D. A. Brian. 2007. An RNA stem-loop within the bovine coronavirus nsp1 coding region is a cis-acting element in defective interfering RNA replication. J Virol 81:7716-24. 20. Brunner, J. E., J. H. Nguyen, H. H. Roehl, T. V. Ho, K. M. Swiderek, and B. L. Semler. 2005. Functional interaction of heterogeneous nuclear ribonucleoprotein C with poliovirus RNA synthesis initiation complexes. J Virol 79:3254-66. 21. Casais, R., V. Thiel, S. G. Siddell, D. Cavanagh, and P. Britton. 2001. Reverse

88 genetics system for the avian coronavirus infectious bronchitis virus. J Virol 75:12359-69. 22. Cavanagh, D. 1997. Nidovirales: a new order comprising Coronaviridae and Arteriviridae. Arch Virol 142:629-33. 23. Chang, R. Y., and D. A. Brian. 1996. cis Requirement for N-specific protein sequence in bovine coronavirus defective interfering RNA replication. J Virol 70:2201-7. 24. Chang, R. Y., M. A. Hofmann, P. B. Sethna, and D. A. Brian. 1994. A cis-acting function for the coronavirus leader in defective interfering RNA replication. J Virol 68:8223-31. 25. Chang, R. Y., R. Krishnan, and D. A. Brian. 1996. The UCUAAAC promoter motif is not required for high-frequency leader recombination in bovine coronavirus defective interfering RNA. J Virol 70:2720-9. 26. Chen, C. J., K. Sugiyama, H. Kubo, C. Huang, and S. Makino. 2004. Murine coronavirus nonstructural protein p28 arrests cell cycle in G0/G1 phase. J Virol 78:10410-9. 27. Chen, S. C., and R. C. Olsthoorn. Group-specific structural features of the 5'-proximal sequences of coronavirus genomic RNAs. Virology 401:29-41. 28. Chen, Y., H. Cai, J. Pan, N. Xiang, P. Tien, T. Ahola, and D. Guo. 2009. Functional screen reveals SARS coronavirus nonstructural protein nsp14 as a novel cap N7 methyltransferase. Proc Natl Acad Sci U S A 106:3484-9. 29. Chiu, S. S., K. H. Chan, K. W. Chu, S. W. Kwan, Y. Guan, L. L. Poon, and J. S. Peiris. 2005. Human coronavirus NL63 infection and other coronavirus infections in children hospitalized with acute respiratory disease in , China. Clin Infect Dis 40:1721-9. 30. Choi, K. S., A. Mizutani, and M. M. Lai. 2004. SYNCRIP, a member of the heterogeneous nuclear ribonucleoprotein family, is involved in mouse hepatitis virus RNA synthesis. J Virol 78:13153-62. 31. Clerte, C., and K. B. Hall. 2006. Characterization of multimeric complexes formed by the human PTB1 protein on RNA. Rna 12:457-75.

89 32. Coley, S. E., E. Lavi, S. G. Sawicki, L. Fu, B. Schelle, N. Karl, S. G. Siddell, and V. Thiel. 2005. Recombinant mouse hepatitis virus strain A59 from cloned, full-length cDNA replicates to high titers in vitro and is fully pathogenic in vivo. J Virol 79:3097-106. 33. Compton, S. R., S. W. Barthold, and A. L. Smith. 1993. The cellular and molecular pathogenesis of coronaviruses. Lab Anim Sci 43:15-28. 34. Decroly, E., I. Imbert, B. Coutard, M. Bouvet, B. Selisko, K. Alvarez, A. E. Gorbalenya, E. J. Snijder, and B. Canard. 2008. Coronavirus nonstructural protein 16 is a cap-0 binding enzyme possessing (nucleoside-2'O)-methyltransferase activity. J Virol 82:8071-84. 35. Denison, M. R., B. Yount, S. M. Brockway, R. L. Graham, A. C. Sims, X. Lu, and R. S. Baric. 2004. Cleavage between replicase proteins p28 and p65 of mouse hepatitis virus is not required for virus replication. J Virol 78:5957-65. 36. Donaldson, E. F., A. C. Sims, R. L. Graham, M. R. Denison, and R. S. Baric. 2007. Murine hepatitis virus replicase protein nsp10 is a critical regulator of viral RNA synthesis. J Virol 81:6356-68. 37. Donaldson, E. F., B. Yount, A. C. Sims, S. Burkett, R. J. Pickles, and R. S. Baric. 2008. Systematic assembly of a full-length infectious clone of human coronavirus NL63. J Virol 82:11948-57. 38. Du, L., Y. He, Y. Zhou, S. Liu, B. J. Zheng, and S. Jiang. 2009. The spike protein of SARS-CoV--a target for vaccine and therapeutic development. Nat Rev Microbiol 7:226-36. 39. Duffy, S., L. A. Shackelton, and E. C. Holmes. 2008. Rates of evolutionary change in viruses: patterns and determinants. Nat Rev Genet 9:267-76. 40. Dyson, H. J., and P. E. Wright. 2005. Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol 6:197-208. 41. Egloff, M. P., F. Ferron, V. Campanacci, S. Longhi, C. Rancurel, H. Dutartre, E. J. Snijder, A. E. Gorbalenya, C. Cambillau, and B. Canard. 2004. The severe acute respiratory syndrome-coronavirus replicative protein nsp9 is a single-stranded RNA-binding subunit unique in the RNA virus world. Proc Natl Acad Sci U S A

90 101:3792-6. 42. Enjuanes, L., F. Almazan, I. Sola, and S. Zuniga. 2006. Biochemical aspects of coronavirus replication and virus-host interaction. Annu Rev Microbiol 60:211-30. 43. Enjuanes, L., I. Sola, S. Alonso, D. Escors, and S. Zuniga. 2005. Coronavirus reverse genetics and development of vectors for gene expression. Curr Top Microbiol Immunol 287:161-97. 44. Filomatori, C. V., M. F. Lodeiro, D. E. Alvarez, M. M. Samsa, L. Pietrasanta, and A. V. Gamarnik. 2006. A 5' RNA element promotes dengue virus RNA synthesis on a circular genome. Genes Dev 20:2238-49. 45. Frana, M. F., J. N. Behnke, L. S. Sturman, and K. V. Holmes. 1985. Proteolytic cleavage of the E2 glycoprotein of murine coronavirus: host-dependent differences in proteolytic cleavage and cell fusion. J Virol 56:912-20. 46. Furuya, T., and M. M. Lai. 1993. Three different cellular proteins bind to complementary sites on the 5'-end-positive and 3'-end-negative strands of mouse hepatitis virus RNA. J Virol 67:7215-22. 47. Gaertner, D. J., D. F. Winograd, S. R. Compton, F. X. Paturzo, and A. L. Smith. 1993. Development and optimization of plaque assays for rat coronaviruses. J Virol Methods 43:53-64. 48. Gamarnik, A. V., and R. Andino. 1998. Switch from translation to RNA replication in a positive-stranded RNA virus. Genes Dev 12:2293-304. 49. Goebel, S. J., B. Hsue, T. F. Dombrowski, and P. S. Masters. 2004. Characterization of the RNA components of a putative molecular switch in the 3' untranslated region of the murine coronavirus genome. J Virol 78:669-82. 50. Goebel, S. J., T. B. Miller, C. J. Bennett, K. A. Bernard, and P. S. Masters. 2007. A hypervariable region within the 3' cis-acting element of the murine coronavirus genome is nonessential for RNA synthesis but affects pathogenesis. J Virol 81:1274-87. 51. Goebel, S. J., J. Taylor, and P. S. Masters. 2004. The 3' cis-acting genomic replication element of the severe acute respiratory syndrome coronavirus can function in the murine coronavirus genome. J Virol 78:7846-51.

91 52. Gonzalez, J. M., P. Gomez-Puertas, D. Cavanagh, A. E. Gorbalenya, and L. Enjuanes. 2003. A comparative sequence analysis to revise the current taxonomy of the family Coronaviridae. Arch Virol 148:2207-35. 53. Gorbalenya, A. E., L. Enjuanes, J. Ziebuhr, and E. J. Snijder. 2006. Nidovirales: evolving the largest RNA virus genome. Virus Res 117:17-37. 54. Graham, R. L., J. S. Sparks, L. D. Eckerle, A. C. Sims, and M. R. Denison. 2008. SARS coronavirus replicase proteins in pathogenesis. Virus Res 133:88-100. 55. Gustin, K. M., B. J. Guan, A. Dziduszko, and D. A. Brian. 2009. Bovine coronavirus nonstructural protein 1 (p28) is an RNA binding protein that binds terminal genomic cis-replication elements. J Virol 83:6087-97. 56. Haring, J., and S. Perlman. 2001. Mouse hepatitis virus. Curr Opin Microbiol 4:462-6. 57. Herold, J., and R. Andino. 2001. Poliovirus RNA replication requires genome circularization through a protein-protein bridge. Mol Cell 7:581-91. 58. Hirano, N., K. Fujiwara, and M. Matumoto. 1976. Mouse hepatitis virus (MHV-2). Plaque assay and propagation in mouse cell line DBT cells. Jpn J Microbiol 20:219-25. 59. Hochstrasser, M. 2009. Origin and function of ubiquitin-like proteins. Nature 458:422-9. 60. Hofmann, M. A., and D. A. Brian. 1991. The 5' end of coronavirus minus-strand RNAs contains a short poly(U) tract. J Virol 65:6331-3. 61. Hsue, B., T. Hartshorne, and P. S. Masters. 2000. Characterization of an essential RNA secondary structure in the 3' untranslated region of the murine coronavirus genome. J Virol 74:6911-21. 62. Hsue, B., and P. S. Masters. 1997. A bulged stem-loop structure in the 3' untranslated region of the genome of the coronavirus mouse hepatitis virus is essential for replication. J Virol 71:7567-78. 63. Hsue, B., and P. S. Masters. 1998. An essential secondary structure in the 3' untranslated region of the mouse hepatitis virus genome. Adv Exp Med Biol 440:297-302.

92 64. Huang, P., and M. M. Lai. 2001. Heterogeneous nuclear ribonucleoprotein a1 binds to the 3'-untranslated region and mediates potential 5'-3'-end cross talks of mouse hepatitis virus RNA. J Virol 75:5009-17. 65. Huang, P., and M. M. Lai. 1999. Polypyrimidine tract-binding protein binds to the complementary strand of the mouse hepatitis virus 3' untranslated region, thereby altering RNA conformation. J Virol 73:9110-6. 66. Imbert, I., J. C. Guillemot, J. M. Bourhis, C. Bussetta, B. Coutard, M. P. Egloff, F. Ferron, A. E. Gorbalenya, and B. Canard. 2006. A second, non-canonical RNA-dependent RNA polymerase in SARS coronavirus. Embo J 25:4933-42. 67. Ivanov, K. A., T. Hertzig, M. Rozanov, S. Bayer, V. Thiel, A. E. Gorbalenya, and J. Ziebuhr. 2004. Major genetic marker of nidoviruses encodes a replicative endoribonuclease. Proc Natl Acad Sci U S A 101:12694-9. 68. Joseph, J. S., K. S. Saikatendu, V. Subramanian, B. W. Neuman, A. Brooun, M. Griffith, K. Moy, M. K. Yadav, J. Velasquez, M. J. Buchmeier, R. C. Stevens, and P. Kuhn. 2006. Crystal structure of nonstructural protein 10 from the severe acute respiratory syndrome coronavirus reveals a novel fold with two zinc-binding motifs. J Virol 80:7894-901. 69. Kamitani, W., C. Huang, K. Narayanan, K. G. Lokugamage, and S. Makino. 2009. A two-pronged strategy to suppress host protein synthesis by SARS coronavirus Nsp1 protein. Nat Struct Mol Biol 16:1134-40. 70. Kamitani, W., K. Narayanan, C. Huang, K. Lokugamage, T. Ikegami, N. Ito, H. Kubo, and S. Makino. 2006. Severe acute respiratory syndrome coronavirus nsp1 protein suppresses host gene expression by promoting host mRNA degradation. Proc Natl Acad Sci U S A 103:12885-90. 71. Kang, H., M. Feng, M. E. Schroeder, D. P. Giedroc, and J. L. Leibowitz. 2006. Putative cis-acting stem-loops in the 5' untranslated region of the severe acute respiratory syndrome coronavirus can substitute for their mouse hepatitis virus counterparts. J Virol 80:10600-14. 72. King, B., B. J. Potts, and D. A. Brian. 1985. Bovine coronavirus hemagglutinin protein. Virus Res 2:53-9.

93 73. Knoops, K., M. Kikkert, S. H. Worm, J. C. Zevenhoven-Dobbe, Y. van der Meer, A. J. Koster, A. M. Mommaas, and E. J. Snijder. 2008. SARS-coronavirus replication is supported by a reticulovesicular network of modified endoplasmic reticulum. PLoS Biol 6:e226. 74. Knoops, K., C. Swett-Tapia, S. H. van den Worm, A. J. Te Velthuis, A. J. Koster, A. M. Mommaas, E. J. Snijder, and M. Kikkert. Integrity of the early secretory pathway promotes, but is not required for, severe acute respiratory syndrome coronavirus RNA synthesis and virus-induced remodeling of endoplasmic reticulum membranes. J Virol 84:833-46. 75. Ksiazek, T. G., D. Erdman, C. S. Goldsmith, S. R. Zaki, T. Peret, S. Emery, S. Tong, C. Urbani, J. A. Comer, W. Lim, P. E. Rollin, S. F. Dowell, A. E. Ling, C. D. Humphrey, W. J. Shieh, J. Guarner, C. D. Paddock, P. Rota, B. Fields, J. DeRisi, J. Y. Yang, N. Cox, J. M. Hughes, J. W. LeDuc, W. J. Bellini, and L. J. Anderson. 2003. A associated with severe acute respiratory syndrome. N Engl J Med 348:1953-66. 76. Lai, M. M. 1992. RNA recombination in animal and plant viruses. Microbiol Rev 56:61-79. 77. Lai, M. M., and D. Cavanagh. 1997. The molecular biology of coronaviruses. Adv Virus Res 48:1-100. 78. Lai, M. M., C. D. Patton, and S. A. Stohlman. 1982. Further characterization of mRNA's of mouse hepatitis virus: presence of common 5'-end nucleotides. J Virol 41:557-65. 79. Lau, S. K., P. C. Woo, C. C. Yip, H. Tse, H. W. Tsoi, V. C. Cheng, P. Lee, B. S. Tang, C. H. Cheung, R. A. Lee, L. Y. So, Y. L. Lau, K. H. Chan, and K. Y. Yuen. 2006. Coronavirus HKU1 and other coronavirus infections in Hong Kong. J Clin Microbiol 44:2063-71. 80. Le Quesne, J. P., K. A. Spriggs, M. Bushell, and A. E. Willis. Dysregulation of protein synthesis and disease. J Pathol 220:140-51. 81. Li, H. P., P. Huang, S. Park, and M. M. Lai. 1999. Polypyrimidine tract-binding protein binds to the leader RNA of mouse hepatitis virus and serves as a regulator of

94 viral transcription. J Virol 73:772-7. 82. Li, H. P., X. Zhang, R. Duncan, L. Comai, and M. M. Lai. 1997. Heterogeneous nuclear ribonucleoprotein A1 binds to the transcription-regulatory region of mouse hepatitis virus RNA. Proc Natl Acad Sci U S A 94:9544-9. 83. Li, L., H. Kang, P. Liu, N. Makkinje, S. T. Williamson, J. L. Leibowitz, and D. P. Giedroc. 2008. Structural lability in stem-loop 1 drives a 5' UTR-3' UTR interaction in coronavirus replication. J Mol Biol 377:790-803. 84. Lin, Y. J., C. L. Liao, and M. M. Lai. 1994. Identification of the cis-acting signal for minus-strand RNA synthesis of a murine coronavirus: implications for the role of minus-strand RNA in RNA replication and transcription. J Virol 68:8131-40. 85. Lin, Y. J., X. Zhang, R. C. Wu, and M. M. Lai. 1996. The 3' untranslated region of coronavirus RNA is required for subgenomic mRNA transcription from a defective interfering RNA. J Virol 70:7236-40. 86. Linding, R., L. J. Jensen, F. Diella, P. Bork, T. J. Gibson, and R. B. Russell. 2003. Protein disorder prediction: implications for structural proteomics. Structure 11:1453-9. 87. Liu, P., L. Li, S. C. Keane, D. Yang, J. L. Leibowitz, and D. P. Giedroc. 2009. Mouse hepatitis virus stem-loop 2 adopts a uYNMG(U)a-like tetraloop structure that is highly functionally tolerant of base substitutions. J Virol 83:12084-93. 88. Liu, P., L. Li, J. J. Millership, H. Kang, J. L. Leibowitz, and D. P. Giedroc. 2007. A U-turn motif-containing stem-loop in the coronavirus 5' untranslated region plays a functional role in replication. Rna 13:763-80. 89. Liu, Q., R. F. Johnson, and J. L. Leibowitz. 2001. Secondary structural elements within the 3' untranslated region of mouse hepatitis virus strain JHM genomic RNA. J Virol 75:12105-13. 90. Longhi, S., P. Lieutaud, and B. Canard. Conformational disorder. Methods Mol Biol 609:307-25. 91. Lu, Y., X. Lu, and M. R. Denison. 1995. Identification and characterization of a serine-like proteinase of the murine coronavirus MHV-A59. J Virol 69:3554-9. 92. Makino, S., N. Fujioka, and K. Fujiwara. 1985. Structure of the intracellular

95 defective viral RNAs of defective interfering particles of mouse hepatitis virus. J Virol 54:329-36. 93. Makino, S., and M. M. Lai. 1989. High-frequency leader sequence switching during coronavirus defective interfering RNA replication. J Virol 63:5285-92. 94. Makino, S., C. K. Shieh, J. G. Keck, and M. M. Lai. 1988. Defective-interfering particles of murine coronavirus: mechanism of synthesis of defective viral RNAs. Virology 163:104-11. 95. Makino, S., K. Yokomori, and M. M. Lai. 1990. Analysis of efficiently packaged defective interfering RNAs of murine coronavirus: localization of a possible RNA-packaging signal. J Virol 64:6045-53. 96. Malakhov, M. P., M. R. Mattern, O. A. Malakhova, M. Drinker, S. D. Weeks, and T. R. Butt. 2004. SUMO fusions and SUMO-specific protease for efficient expression and purification of proteins. J Struct Funct Genomics 5:75-86. 97. Masters, P. S. 2006. The molecular biology of coronaviruses. Adv Virus Res 66:193-292. 98. Masters, P. S., L. Kuo, R. Ye, K. R. Hurst, C. A. Koetzner, and B. Hsue. 2006. Genetic and molecular biological analysis of protein-protein interactions in coronavirus assembly. Adv Exp Med Biol 581:163-73. 99. Masters, P. S., and P. J. Rottier. 2005. Coronavirus reverse genetics by targeted RNA recombination. Curr Top Microbiol Immunol 287:133-59. 100. Mathews, D. H., J. Sabina, M. Zuker, and D. H. Turner. 1999. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J Mol Biol 288:911-40. 101. Mattaj, I. W. 1993. RNA recognition: a family matter? Cell 73:837-40. 102. McCormack, J. C., X. Yuan, Y. G. Yingling, W. Kasprzak, R. E. Zamora, B. A. Shapiro, and A. E. Simon. 2008. Structural domains within the 3' untranslated region of Turnip crinkle virus. J Virol 82:8706-20. 103. Mendez, A., C. Smerdou, A. Izeta, F. Gebauer, and L. Enjuanes. 1996. Molecular characterization of transmissible gastroenteritis coronavirus defective interfering genomes: packaging and heterogeneity. Virology 217:495-507.

96 104. Mihindukulasuriya, K. A., G. Wu, J. St Leger, R. W. Nordhausen, and D. Wang. 2008. Identification of a novel coronavirus from a beluga whale by using a panviral microarray. J Virol 82:5084-8. 105. Minskaia, E., T. Hertzig, A. E. Gorbalenya, V. Campanacci, C. Cambillau, B. Canard, and J. Ziebuhr. 2006. Discovery of an RNA virus 3'->5' exoribonuclease that is critically involved in coronavirus RNA synthesis. Proc Natl Acad Sci U S A 103:5108-13. 106. Moore, P. B. 1999. Structural motifs in RNA. Annu Rev Biochem 68:287-300. 107. Morris, D. R., and A. P. Geballe. 2000. Upstream open reading frames as regulators of mRNA translation. Mol Cell Biol 20:8635-42. 108. Namy, O., S. J. Moran, D. I. Stuart, R. J. Gilbert, and I. Brierley. 2006. A mechanical explanation of RNA pseudoknot function in programmed ribosomal frameshifting. Nature 441:244-7. 109. Nanda, S. K., R. F. Johnson, Q. Liu, and J. L. Leibowitz. 2004. Mitochondrial HSP70, HSP40, and HSP60 bind to the 3' untranslated region of the Murine hepatitis virus genome. Arch Virol 149:93-111. 110. Nanda, S. K., and J. L. Leibowitz. 2001. Mitochondrial aconitase binds to the 3'-UTR of mouse hepatitis virus RNA. Adv Exp Med Biol 494:603-8. 111. Nanda, S. K., and J. L. Leibowitz. 2001. Mitochondrial aconitase binds to the 3' untranslated region of the mouse hepatitis virus genome. J Virol 75:3352-62. 112. Narayanan, K., and S. Makino. 2001. Cooperation of an RNA packaging signal and a viral envelope protein in coronavirus RNA packaging. J Virol 75:9059-67. 113. Nelson, G. W., S. A. Stohlman, and S. M. Tahara. 2000. High affinity interaction between nucleocapsid protein and leader/intergenic sequence of mouse hepatitis virus RNA. J Gen Virol 81:181-8. 114. Novak, J. E., and K. Kirkegaard. 1991. Improved method for detecting poliovirus negative strands used to demonstrate specificity of positive-strand encapsidation and the ratio of positive to negative strands in infected cells. J Virol 65:3384-7. 115. Pasternak, A. O., W. J. Spaan, and E. J. Snijder. 2006. Nidovirus transcription: how to make sense.? J Gen Virol 87:1403-21.

97 116. Peiris, J. S., S. T. Lai, L. L. Poon, Y. Guan, L. Y. Yam, W. Lim, J. Nicholls, W. K. Yee, W. W. Yan, M. T. Cheung, V. C. Cheng, K. H. Chan, D. N. Tsang, R. W. Yung, T. K. Ng, and K. Y. Yuen. 2003. Coronavirus as a possible cause of severe acute respiratory syndrome. Lancet 361:1319-25. 117. Penzes, Z., C. Wroe, T. D. Brown, P. Britton, and D. Cavanagh. 1996. Replication and packaging of coronavirus infectious bronchitis virus defective RNAs lacking a long open reading frame. J Virol 70:8660-8. 118. Perlman, S. 1998. Pathogenesis of coronavirus-induced infections. Review of pathological and immunological aspects. Adv Exp Med Biol 440:503-13. 119. Perlman, S., and J. Netland. 2009. Coronaviruses post-SARS: update on replication and pathogenesis. Nat Rev Microbiol 7:439-50. 120. Peti, W., M. A. Johnson, T. Herrmann, B. W. Neuman, M. J. Buchmeier, M. Nelson, J. Joseph, R. Page, R. C. Stevens, P. Kuhn, and K. Wuthrich. 2005. Structural genomics of the severe acute respiratory syndrome coronavirus: nuclear magnetic resonance structure of the protein nsP7. J Virol 79:12905-13. 121. Racaniello, V. R., and D. Baltimore. 1981. Cloned poliovirus complementary DNA is infectious in mammalian cells. Science 214:916-9. 122. Raman, S., P. Bouma, G. D. Williams, and D. A. Brian. 2003. Stem-loop III in the 5' untranslated region is a cis-acting element in bovine coronavirus defective interfering RNA replication. J Virol 77:6720-30. 123. Raman, S., and D. A. Brian. 2005. Stem-loop IV in the 5' untranslated region is a cis-acting element in bovine coronavirus defective interfering RNA replication. J Virol 79:12434-46. 124. Regulski, E. E., and R. R. Breaker. 2008. In-line probing analysis of riboswitches. Methods Mol Biol 419:53-67. 125. Rhode, B. M., K. Hartmuth, H. Urlaub, and R. Luhrmann. 2003. Analysis of site-specific protein-RNA cross-links in isolated RNP complexes, combining affinity selection and mass spectrometry. Rna 9:1542-51. 126. Risco, C., I. M. Anton, L. Enjuanes, and J. L. Carrascosa. 1996. The transmissible gastroenteritis coronavirus contains a spherical core shell consisting of

98 M and N proteins. J Virol 70:4773-7. 127. Romero-Lopez, C., and A. Berzal-Herranz. 2009. A long-range RNA-RNA interaction between the 5' and 3' ends of the HCV genome. Rna 15:1740-52. 128. Sawicki, S. G., and D. L. Sawicki. 1995. Coronaviruses use discontinuous extension for synthesis of subgenome-length negative strands. Adv Exp Med Biol 380:499-506. 129. Sawicki, S. G., and D. L. Sawicki. 1998. A new model for coronavirus transcription. Adv Exp Med Biol 440:215-9. 130. Sawicki, S. G., D. L. Sawicki, and S. G. Siddell. 2007. A contemporary view of coronavirus transcription. J Virol 81:20-9. 131. Schelle, B., N. Karl, B. Ludewig, S. G. Siddell, and V. Thiel. 2006. Nucleocapsid protein expression facilitates coronavirus replication. Adv Exp Med Biol 581:43-8. 132. Schelle, B., N. Karl, B. Ludewig, S. G. Siddell, and V. Thiel. 2005. Selective replication of coronavirus genomes that express nucleocapsid protein. J Virol 79:6620-30. 133. Senanayake, S. D., and D. A. Brian. 1999. Translation from the 5' untranslated region (UTR) of mRNA 1 is repressed, but that from the 5' UTR of mRNA 7 is stimulated in coronavirus-infected cells. J Virol 73:8003-9. 134. Sethna, P. B., S. L. Hung, and D. A. Brian. 1989. Coronavirus subgenomic minus-strand RNAs and the potential for mRNA replicons. Proc Natl Acad Sci U S A 86:5626-30. 135. Seybert, A., A. Hegyi, S. G. Siddell, and J. Ziebuhr. 2000. The superfamily 1 helicase has RNA and DNA duplex-unwinding activities with 5'-to-3' polarity. Rna 6:1056-68. 136. Shapiro, B. A., Y. G. Yingling, W. Kasprzak, and E. Bindewald. 2007. Bridging the gap in RNA structure prediction. Curr Opin Struct Biol 17:157-65. 137. Shi, S. T., J. J. Schiller, A. Kanjanahaluethai, S. C. Baker, J. W. Oh, and M. M. Lai. 1999. Colocalization and membrane association of murine hepatitis virus gene 1 products and De novo-synthesized viral RNA in infected cells. J Virol 73:5957-69. 138. Silvera, D., A. V. Gamarnik, and R. Andino. 1999. The N-terminal K homology

99 domain of the poly(rC)-binding protein is a major determinant for binding to the poliovirus 5'-untranslated region and acts as an inhibitor of viral translation. J Biol Chem 274:38163-70. 139. Simmons, G., J. D. Reeves, A. J. Rennekamp, S. M. Amberg, A. J. Piefer, and P. Bates. 2004. Characterization of severe acute respiratory syndrome-associated coronavirus (SARS-CoV) spike glycoprotein-mediated viral entry. Proc Natl Acad Sci U S A 101:4240-5. 140. Snijder, E. J., P. J. Bredenbeek, J. C. Dobbe, V. Thiel, J. Ziebuhr, L. L. Poon, Y. Guan, M. Rozanov, W. J. Spaan, and A. E. Gorbalenya. 2003. Unique and conserved features of genome and proteome of SARS-coronavirus, an early split-off from the coronavirus group 2 lineage. J Mol Biol 331:991-1004. 141. Snijder, E. J., Y. van der Meer, J. Zevenhoven-Dobbe, J. J. Onderwater, J. van der Meulen, H. K. Koerten, and A. M. Mommaas. 2006. Ultrastructure and origin of membrane vesicles associated with the severe acute respiratory syndrome coronavirus replication complex. J Virol 80:5927-40. 142. Spaan, D. A. B. a. W. J. M. 1997. Recombination and coronavirus defective interfering RNAs. Seminars in Virology 8:101-111. 143. Spagnolo, J. F., and B. G. Hogue. 2000. Host protein interactions with the 3' end of bovine coronavirus RNA and the requirement of the poly(A) tail for coronavirus defective genome replication. J Virol 74:5053-65. 144. Spagnolo, J. F., and B. G. Hogue. 2001. Requirement of the poly(A) tail in coronavirus genome replication. Adv Exp Med Biol 494:467-74. 145. St-Jean, J. R., M. Desforges, F. Almazan, H. Jacomy, L. Enjuanes, and P. J. Talbot. 2006. Recovery of a neurovirulent human coronavirus OC43 from an infectious cDNA clone. J Virol 80:3670-4. 146. Stohlman, S. A., R. S. Baric, G. N. Nelson, L. H. Soe, L. M. Welter, and R. J. Deans. 1988. Specific interaction between coronavirus leader RNA and nucleocapsid protein. J Virol 62:4288-95. 147. Sturman, L. S., C. S. Ricard, and K. V. Holmes. 1990. Conformational change of the coronavirus peplomer glycoprotein at pH 8.0 and 37 degrees C correlates with

100 virus aggregation and virus-induced cell fusion. J Virol 64:3042-50. 148. Su, D., Z. Lou, F. Sun, Y. Zhai, H. Yang, R. Zhang, A. Joachimiak, X. C. Zhang, M. Bartlam, and Z. Rao. 2006. Dodecamer structure of severe acute respiratory syndrome coronavirus nonstructural protein nsp10. J Virol 80:7902-8. 149. Sutton, G., E. Fry, L. Carter, S. Sainsbury, T. Walter, J. Nettleship, N. Berrow, R. Owens, R. Gilbert, A. Davidson, S. Siddell, L. L. Poon, J. Diprose, D. Alderton, M. Walsh, J. M. Grimes, and D. I. Stuart. 2004. The nsp9 replicase protein of SARS-coronavirus, structure and functional insights. Structure 12:341-53. 150. Tahara, S. M., T. A. Dietlin, C. C. Bergmann, G. W. Nelson, S. Kyuwa, R. P. Anthony, and S. A. Stohlman. 1994. Coronavirus translational regulation: leader affects mRNA efficiency. Virology 202:621-30. 151. Tahara, S. M., T. A. Dietlin, G. W. Nelson, S. A. Stohlman, and D. J. Manno. 1998. Mouse hepatitis virus nucleocapsid protein as a translational effector of viral mRNAs. Adv Exp Med Biol 440:313-8. 152. te Velthuis, A. J., J. J. Arnold, C. E. Cameron, S. H. van den Worm, and E. J. Snijder. The RNA polymerase activity of SARS-coronavirus nsp12 is primer dependent. Nucleic Acids Res 38:203-14. 153. Thiel, V., J. Herold, B. Schelle, and S. G. Siddell. 2001. Infectious RNA transcribed in vitro from a cDNA copy of the human coronavirus genome cloned in vaccinia virus. J Gen Virol 82:1273-81. 154. Thiel, V., J. Herold, B. Schelle, and S. G. Siddell. 2001. Viral replicase gene products suffice for coronavirus discontinuous transcription. J Virol 75:6676-81. 155. Thiel, V., and S. G. Siddell. 2005. Reverse genetics of coronaviruses using vaccinia virus vectors. Curr Top Microbiol Immunol 287:199-227. 156. van der Meer, Y., E. J. Snijder, J. C. Dobbe, S. Schleich, M. R. Denison, W. J. Spaan, and J. K. Locker. 1999. Localization of mouse hepatitis virus nonstructural proteins and RNA synthesis indicates a role for late endosomes in viral replication. J Virol 73:7641-57. 157. van der Most, R. G., W. Luytjes, S. Rutjes, and W. J. Spaan. 1995. Translation but not the encoded sequence is essential for the efficient propagation of the

101 defective interfering RNAs of the coronavirus mouse hepatitis virus. J Virol 69:3744-51. 158. van Hemert, M. J., S. H. van den Worm, K. Knoops, A. M. Mommaas, A. E. Gorbalenya, and E. J. Snijder. 2008. SARS-coronavirus replication/transcription complexes are membrane-protected and need a host factor for activity in vitro. PLoS Pathog 4:e1000054. 159. Vennema, H., G. J. Godeke, J. W. Rossen, W. F. Voorhout, M. C. Horzinek, D. J. Opstelten, and P. J. Rottier. 1996. Nucleocapsid-independent assembly of coronavirus-like particles by co-expression of viral envelope protein genes. Embo J 15:2020-8. 160. Vijaykrishna, D., G. J. Smith, J. X. Zhang, J. S. Peiris, H. Chen, and Y. Guan. 2007. Evolutionary insights into the ecology of coronaviruses. J Virol 81:4012-20. 161. Villordo, S. M., and A. V. Gamarnik. 2009. Genome cyclization as strategy for flavivirus RNA replication. Virus Res 139:230-9. 162. Wang, L., and S. J. Brown. 2006. BindN: a web-based tool for efficient prediction of DNA and RNA binding sites in amino acid sequences. Nucleic Acids Res 34:W243-8. 163. Williams, G. D., R. Y. Chang, and D. A. Brian. 1999. A phylogenetically conserved hairpin-type 3' untranslated region pseudoknot functions in coronavirus RNA replication. J Virol 73:8349-55. 164. Woo, P. C., S. K. Lau, Y. Huang, and K. Y. Yuen. 2009. Coronavirus diversity, phylogeny and interspecies jumping. Exp Biol Med (Maywood) 234:1117-27. 165. Woo, P. C., S. K. Lau, C. S. Lam, K. K. Lai, Y. Huang, P. Lee, G. S. Luk, K. C. Dyrting, K. H. Chan, and K. Y. Yuen. 2009. Comparative analysis of complete genome sequences of three avian coronaviruses reveals a novel group 3c coronavirus. J Virol 83:908-17. 166. Woo, P. C., S. K. Lau, and K. Y. Yuen. 2006. Infectious diseases emerging from Chinese wet-markets: zoonotic origins of severe respiratory viral infections. Curr Opin Infect Dis 19:401-7. 167. Wright, P. E., and H. J. Dyson. 2009. Linking folding and binding. Curr Opin

102 Struct Biol 19:31-8. 168. Wu, H. Y., and D. A. Brian. 2007. 5'-proximal hot spot for an inducible positive-to-negative-strand template switch by coronavirus RNA-dependent RNA polymerase. J Virol 81:3206-15. 169. Wu, H. Y., J. S. Guy, D. Yoo, R. Vlasak, E. Urbach, and D. A. Brian. 2003. Common RNA replication signals exist among group 2 coronaviruses: evidence for in vivo recombination between animal and human coronavirus molecules. Virology 315:174-83. 170. Wu, H. Y., A. Ozdarendeli, and D. A. Brian. 2006. Bovine coronavirus 5'-proximal genomic acceptor hotspot for discontinuous transcription is 65 nucleotides wide. J Virol 80:2183-93. 171. Yokomori, K., N. La Monica, S. Makino, C. K. Shieh, and M. M. Lai. 1989. Biosynthesis, structure, and biological activities of envelope protein gp65 of murine coronavirus. Virology 173:683-91. 172. Youn, S., J. L. Leibowitz, and E. W. Collisson. 2005. In vitro assembled, recombinant infectious bronchitis viruses demonstrate that the 5a open reading frame is not essential for replication. Virology 332:206-15. 173. Yount, B., K. M. Curtis, and R. S. Baric. 2000. Strategy for systematic assembly of large RNA and DNA genomes: transmissible gastroenteritis virus model. J Virol 74:10600-11. 174. Yount, B., K. M. Curtis, E. A. Fritz, L. E. Hensley, P. B. Jahrling, E. Prentice, M. R. Denison, T. W. Geisbert, and R. S. Baric. 2003. Reverse genetics with a full-length infectious cDNA of severe acute respiratory syndrome coronavirus. Proc Natl Acad Sci U S A 100:12995-3000. 175. Yount, B., M. R. Denison, S. R. Weiss, and R. S. Baric. 2002. Systematic assembly of a full-length infectious cDNA of mouse hepatitis virus strain A59. J Virol 76:11065-78. 176. Yu, W., and J. L. Leibowitz. 1995. A conserved motif at the 3' end of mouse hepatitis virus genomic RNA required for host protein binding and viral RNA replication. Virology 214:128-38.

103 177. Yu, W., and J. L. Leibowitz. 1995. Specific binding of host cellular proteins to multiple sites within the 3' end of mouse hepatitis virus genomic RNA. J Virol 69:2016-23. 178. Yu, X., W. Bi, S. R. Weiss, and J. L. Leibowitz. 1994. Mouse hepatitis virus gene 5b protein is a new virion envelope protein. Virology 202:1018-23. 179. Yuan, X., K. Shi, A. Meskauskas, and A. E. Simon. 2009. The 3' end of Turnip crinkle virus contains a highly interactive structure including a translational enhancer that is disrupted by binding to the RNA-dependent RNA polymerase. Rna 15:1849-64. 180. Zhai, Y., F. Sun, X. Li, H. Pang, X. Xu, M. Bartlam, and Z. Rao. 2005. Insights into SARS-CoV transcription and replication from the structure of the nsp7-nsp8 hexadecamer. Nat Struct Mol Biol 12:980-6. 181. Zhang, X., H. P. Li, W. Xue, and M. M. Lai. 1998. Cellular protein hnRNP-A1 interacts with the 3'-end and the intergenic sequence of mouse hepatitis virus negative-strand RNA to form a ribonucleoprotein complex. Adv Exp Med Biol 440:227-34. 182. Zhou, M., A. K. Williams, S. I. Chung, L. Wang, and E. W. Collisson. 1996. The infectious bronchitis virus nucleocapsid protein binds RNA sequences in the 3' terminus of the genome. Virology 217:191-9. 183. Zuker, M. 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31:3406-15. 184. Zuniga, S., J. L. Cruz, I. Sola, P. A. Mateos-Gomez, L. Palacio, and L. Enjuanes. Coronavirus nucleocapsid protein facilitates template switching and is required for efficient transcription. J Virol 84:2169-75. 185. Zuniga, S., I. Sola, S. Alonso, and L. Enjuanes. 2004. Sequence motifs involved in the regulation of discontinuous coronavirus subgenomic RNA synthesis. J Virol 78:980-94. 186. Zuniga, S., I. Sola, J. L. Moreno, P. Sabella, J. Plana-Duran, and L. Enjuanes. 2007. Coronavirus nucleocapsid protein is an RNA chaperone. Virology 357:215-27. 187. Zuo, X., S. Li, J. Hall, M. R. Mattern, H. Tran, J. Shoo, R. Tan, S. R. Weiss, and

104 T. R. Butt. 2005. Enhanced expression and purification of membrane proteins by SUMO fusion in Escherichia coli. J Struct Funct Genomics 6:103-11. 188. Zuo, X., M. R. Mattern, R. Tan, S. Li, J. Hall, D. E. Sterner, J. Shoo, H. Tran, P. Lim, S. G. Sarafianos, L. Kazi, S. Navas-Martin, S. R. Weiss, and T. R. Butt. 2005. Expression and purification of SARS coronavirus proteins using SUMO-fusions. Protein Expr Purif 42:100-10. 189. Zust, R., L. Cervantes-Barragan, T. Kuri, G. Blakqori, F. Weber, B. Ludewig, and V. Thiel. 2007. Coronavirus non-structural protein 1 is a major pathogenicity factor: implications for the rational design of coronavirus vaccines. PLoS Pathog 3:e109. 190. Zust, R., T. B. Miller, S. J. Goebel, V. Thiel, and P. S. Masters. 2008. Genetic interactions between an essential 3' cis-acting RNA pseudoknot, replicase gene products, and the extreme 3' end of the mouse coronavirus genome. J Virol 82:1214-28.

105 APPENDIX

106 EXPRESSION AND PURIFICATION OF BCOV NUCLEOCAPSID PROTEIN USING SUMO-FUSIONS Other than its structural role in the virion, the coronaviral nucleocapsid protein (N) has been shown to have regulatory significance in viral RNA and protein syntheses (97). Giving that N was found to specifically bind the TRS, a template-switching signal, at the 3’ end of leader-encoding RNA, it has been proposed that N participates in viral transcription (i.e., production of sgmRNAs) either in a direct or regulatory manner (9, 146, 154). N has also been proven to enhance the efficiency of viral replication after transfection of the full-length synthetically-produced RNA genome (during reverse genetics studies) (1, 131, 132, 175). Likewise, a correlation has been found between the affinity of N binding to the leader sequence on sgmRNAs and the translational efficiency of these and other mRNAs (experimentally constructed) (150, 151). Although N has been known to perform these several non-structural functions, the mechanism by which it does so remains to be determined. To begin to characterize the interaction between N and viral RNAs, we systematically sought the binding of E. coli-expressed BCoV N to the known putative cis-replication RNA elements located at the 5’ and 3’ termini of BCoV genome. To prepare BCoV N for electrophoretic mobility shift assay (EMSA), the pET SUMO Protein Expression System (Invitrogen) was utilized for high-level expression of recombinant N in E. coli and cleavage of native proteins (Fig. A.1). The small ubiquitin- related modifier (SUMO) is a protein that functions in post-translational modification (SUMOylation) by covalent attachment to other proteins (59). SUMOylation is reversible through the action of specific deconjugating enzymes, i.e., SUMO proteases. It has been demonstrated that fusion of SUMO at the N-terminus of other proteins can lead to enhanced solubility of the modified proteins (96, 187, 188). This property makes SUMO a useful tag for expression of recombinant proteins. In addition, cleavage by SUMO protease can result in the production of native proteins without extra conjugated moieties. Accordingly, expression of a construct encoding the SUMO-N utilized the pET SUMO plasmid as the backbone (Fig. A.1 A). The pET SUMO vector also bears an N-terminal 6×His tag which enables rapid purification of expressed SUMO-N by Ni-NTA affinity chromatography.

107

AB Culture and induce E. coli to express SUMO-N protein Harvest cells

Extract cell soluble proteins

Purify SUMO-N protein with Ni-NTA column

Dialyze for removal of elution agents Add SUMO protease to cleave SUMO-N protein Re-apply to Ni-NTA column to subtract SUMO and SUMO protease

Collect purified BCoV N protein

C 1 2 3 4 5 d n d d d d d e i e e e e e v d tete h fi fi fi fi fi a d e g i i i i i le n e c ro u r r r r r o c u u u u u u c ti u d p ro p p p p p e d e d s lu n in r l th ty ty ty ty ty e a e -i -i- e b - h i i i i i z e l G k u w s n n n n n ly t a n r l w fi fi fi fi fi o o T a o lo a ff f f f f ia r in kDa N IP M S F W A A A A A D P F 220 120

80 SUMO-N 60 50 N 40

30

25

20

12345 6 7891011121314

FIG. A.1. Expression and purification of BCoV N protein by SUMO-fusion in E. coli. (A) Map and features of pET SUMO vector for expression of SUMO-N fusion protein in E. coli. (B) The procedure for purification of BCoV N expressed with SUMO-fusion system in E. coli and cleavage of 6x His-SUMO-tagged protein. (C) Aliquots of the samples from various steps of the purification procedure were resolved by 12% SDS-PAGE and stained with Coomassie blue. The migration positions of molecular weight markers, SUMO-N, and N resulting from final protease cleavage are as indicated.

108 The procedure for purifying BCoV N is illustrated in Fig. A.2 B. SDS-PAGE and Coomassie blue staining were used to evaluate the effectiveness of the purification steps and the cleavage of the SUMO-N fusion protein (Fig. A.2 C). Briefly, BL21 (DE3) E. coli cells transformed with pSUMO-N was inoculated 5 ml of LB medium containing 50 g/ml kanamycin and cells were grown at 37 °C overnight with shaking at 250 rpm. The overnight culture was transferred into 500 ml fresh LB medium to permit exponential growth until ~0.6 OD600. Protein expression was induced with 1 mM IPTG (isopropyl-β- D-thiogalactopyranoside) and cells were incubated for another 20 h at 20 °C with shaking at 200 rpm. The E. coli cells were harvested by centrifugation at 5,000 × g for 20 min at 4 °C,

flash frozen in liquid N2, and resuspended by probe sonication (3 × 10 sec bursts, on ice) in lysis buffer (PBS containing 200 mM NaCl, 10 mM imidazole, 5% glycerol, 1% Triton X-100, 1mg/ml lysozyme, 1 mM PMSF [phenylmethanesulphonylfluoride], 25 U/ml Benzonase endonuclease [Novagen], and 1× EDTA-free Complete Protease Inhibitor [Roche], pH 7.8) at 3 ml per 1 g of cells. The lysed cells were centrifuged at 20,000 × g for 30 min at 4 °C, and supernatant was collected as the soluble fraction. The soluble fraction was incubated with ~2 ml Ni-NTA superflow resin (Qiagen) for 2 h at 4 °C on an orbital shaker. The mixture was loaded onto a Poly-Prep chromatography column (Bio-Rad) and the samples of flow-through containing unbound proteins were collected. The resin was extensively washed with ~50-100 ml of wash buffer (lysis buffer containing

° 20 mM imidazole) at 4 C until OD280 value reached the base line (UV value = 0). The SUMO-N fusion protein was eluted with 4 ml elution buffer (PBS containing 200 mM NaCl, 300 mm imidazole, pH 7.8) where different fractions were checked on SDS-PAGE and pooled. The purified SUMO-N was dialyzed and concentrated with YM-30 membranes (30 kDa cutoff, Millipore) against PBS at 4 °C to remove high salt and imidazole. Cleavage of SUMO by SUMO protease and recovery of native BCoV N by reloading onto a column with Ni-NTA resin were achieved in accordance with the manufacturer’s protocol (Invitrogen and Qiagen). The concentration of final eluted BCoV N was determined using the Bradford color-reaction assay (Bio-Rad) measured spectrophotometrically at 595 nm with bovine serum albumin as a standard. Purified BCoV N samples were stored at -80 °C after glycerol was added to 10%.

109 BCOV N BINDS CIS-REPLICATION STRUCTURES IN THE 5’ UTR To determine the binding capability of purified E. coli-expressed BCoV N protein to terminal cis-replication RNA elements, N protein was systematically incubated with 32P- labeled RNA probes that included the three 5’ UTR-located elements, SLI-SLII [(+) 1-98 nt], SLIII [(+) 69-182 nt], SLIV [(+) 152-262 nt], the nsp1-cistron-located SLV-SLVI [(+) 231- 364 nt], and the two 3’ UTR-located elements, the BSL-PK [(+) 291-159 nt] and the HVR [(+) 162-1 nt] (Fig. A.2 A). To make the RNA probes for EMSAs, primers bordering the respective elements that had incorporated the EcoRI and HindIII restriction enzyme sites for directional ligation into EcoRI/HindIII-digested pGEM3Zf(-) vector (Promega) were used for PCR amplification. To synthesize and radiolabel RNA probes, 2 μg of Hind III- digested plasmid DNA was transcribed in a 50-μl reaction mixture containing 40 U of T7 RNA polymerase (Promega), 0.5 mM rNTP mixture of ATP, GTP, and CTP, 120 μCi of [α32P]UTP (3,000 Ci/mmole) (MP Biomedicals), 100 mM DTT, 0.1 μg/μl acetylated BSA, and 1 U/μl RNasin (Promega) at 37 °C for 2 h. The mixture was then treated with 2.5 U of RQ1 RNase-free DNase (Promega) at 37 °C for 30 min, and probes were purified by electrophoresis on a 6%-polyacrylamide-7M urea gel. RNA visualized by autoradiography was eluted from the excised gel band overnight at 4 °C in elution buffer (1 mM EDTA, 0.5 M ammonium acetate) and ethanol precipitated. Cerenkov counts were determined by scintillation counting in water. RNA probes were quantified spectrophotometrically. The procedure for EMSA was as described by Andino et al. (6) and modified by Silvera et al. (138). Briefly, ~4 × 104 cpm of α32P-labeled RNA probe (2-5 ng) was incubated in a 20-μl binding reaction mixture containing 5 mM HEPES (pH 7.9), 40 mM

KCl, 2 mM MgCl2, 4% glycerol, 2 mM DTT, 0.25 μg/μl heparin, 0.5 μg/μl yeast tRNA, 1 U/μl RNasin (Promega), 1× EDTA-free Complete Protease Inhibitor (Roche), and indicated amounts of N protein. The mixture was pre-incubated without probe at 30 °C for 10 min, followed by additional 10 min incubation after probe was added. Two μl of 50% glycerol was added, and then mixtures were electrophoretically separated on a 6% nondenaturing polyacrylamide gel at 4 °C for 4 h using 0.5× TBE (45 mM Tris-borate and 1 mM EDTA) as running buffer. Gels were dried and exposed to Kodak XAR-5 film, and the radiolabeled bands were visualized by autoradiography and quantified by phosphor imaging.

110

A 5’ 1a N A 3’ 1b n

5’ UTR partial ORF 1a 3’ UTR

V

IV BSL HVR

I PK zz III z z z z 1 210 VI zz II zz z 2

5’ 3’ 5’ UAA 53954558037 84 129 174 226 239 310 311 340 288 An 3’ 1 (+) 1-98 nt 5’ 3’ (+) 69-182 nt (+) 291-159 nt 5’ 3’ 5’ 3’ (+) 152-262 nt (+) 162-1 nt 5’ 3’ 5’ 3’ (+) 231-364 nt 5’ 3’

B

Hot (+) 1-98 nt (+) 69-182 nt (+) 152-262 nt (+) 231-364 nt (+) 291-159 nt (+) 162-1 nt probes NN N N N N

[µM] - 0.25 0.5 [µM] - 0.25 0.75 1.25 [µM] - 0.25 0.75 1.5 [µM] - 0.25 0.75 1.5 [µM] - 0.1 0.25 0.5 1.0 [µM] - 0.1 0.25 0.5 1.0

C2

C C1

1 2 3 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 5 1 2 3 4 5

FIG. A.2. RNA structures in the BCoV genome tested for BCoV N binding. (A) The organization of the BCoV genome is shown at the top. Below this is a schematic diagram of the current model for the BCoV 5’ and 3’ terminal cis-replication RNA secondary structures required for BCoV DI RNA replication. Six stem-loops (SL I-VI) and three higher-order structures are depicted. Solid circles at nt 210 denote the AUG start codon for ORF 1 whose coding sequence is depicted by a thickened line. The stop codon UAA for the N-gene preceding the 3’ UTR is boxed. All nt numbering is from the first base at the 5’ and 3’ ends of the genome, respectively. Shown below the RNA structures are the RNA segments (solid lines) used as 32P-labeled probes in the EMSAs. BSL, bulged stem-loop; PK, pseudoknot; HVR, 3’ hypervariable region. (B) EMSA results when purified BCoV N protein mixes with the respective probes indicated at the top of each gel. Arrows identify the positions of free RNA probes. Brackets [C] identify the shifted probe as the result of an N protein:RNA complex.

111 In the EMSA experiments, N was not found to bind the RNA probes encompassing the 5’-proximal SLIV [(+) 152-262 nt] and the SLV-SLVI [(+) 231-364 nt], or the 3’-proximal BSL-PK [(+) 291-159 nt] and HVR [(+) 162-1 nt] (Fig. A.2 B). By contrast, N bound weakly to the RNA probe harboring the 5’-proximal SLI-SLII [(+) 1-98 nt] and strongly bound to the probe bearing the SLIII [(+) 69-182 nt] (which contained the extended flanking sequences of 15 nt upstream and 53 nt downstream). The weak interaction between N and SLI-SLII is consistent with previous observations showing this same interaction by co-immunoprecipitation and by Northwestern blotting analysis (9, 146). The EMSA results, therefore, are consistent with the hypothesis that this interaction plays a role in the discontinuous RNA transcription event (i.e., RdRp template switching event) since the UCUAAAC template switching signal is within the loop of SLII. In addition, this is the first report of a direct interaction between any coronaviral protein and SLIII. The binding of SLIII shown here confirms and extends earlier observations from our laboratory (S. Raman, PhD dissertation, 2003) that an unknown viral protein of 55 kDa (similar in molecular weight to N and therefore presumed to be N) in BCoV-infected cell was UV-crosslinked to the SLIII RNA probe, (+) 69-182 nt, and supershifted in an EMSA with antiserum to N. Taken together, these results indicated N may function to regulate RNA synthesis or translation of the genome through its interaction with 5’ cis-replication SLI, SLII, or SLIII, or some combination of these, but not through the binding of the 5’-proximal SLIV, SLV, and SLVI, or the 3’-proximal RNA cis-replication elements.

112 BCOV N BINDS SLIII WITH SPECIFICITY AND MICROMOLAR AFFINITY To determine the binding specificity of BCoV N to the SLIII and its flanking sequences [nt 69 to182 in the 5’ UTR; the probe: (+) 69-182 nt], EMSAs were carried out following the procedure described for Fig. A.2 except that specific or non-specific RNA competitors, as indicated, was incubated in the binding reaction mixture for 10 min at 30 °C prior to the addition of the labeled probe (Fig. A.3 A-B). Figure A.3 B, lane 1, shows that the (+) 69-182 nt probe migrates as upper and lower bands, presumably representing two folded forms of the molecule. Lanes 2-4 show a positive correlation between increasing amounts of N in the reaction and a greater abundance of the shifted bands. Lanes 5-7 show a complete inhibition of probe shifting with 5, 10, and 15 μg of unlabeled probe, representing 2500, 5000, and 7500 molar excess over labeled probe, respectively, whereas no inhibition was seen with 5, 10, and 15 µg of yeast tRNA (lanes 8-10). These results demonstrate a specific binding of BCoV N to the SLIII-bearing probe. To determine the binding affinity of N to the SLIII-bearing probe, (+) 69-182 nt, EMSA was carried out in which serial dilutions of N were added to a constant nanomolar amount of the probe (Fig. A.3 C). Lane 1 shows the two migrating forms of the free probe as observed in Fig. A.3 B. Lanes 2-9 show the band-shifting complexes with increasing concentrations of N. From these results it appears that complex 1 resulted from N binding to the lower-migrating form of the probe with a Kd (dissociation constant) of ~0.2 μM. This result suggests a single N-binding site on the lower-migrating form of the probe. Surprisingly, multiple shifted bands (complexes 2-4) were observed in the mobility shifting that apparently arose from the upper-migrating form of the probe, suggesting sequential loading of N onto the RNA probe as the amounts of N increased, perhaps reflecting cooperative binding. The binding affinities of N to these multiple sites were not determined. To verify the correlation between the two folded forms of the free probe and corresponding shifted bands, the RNA probe was heated to 95 °C for 5 min and snap-cooled on ice in an attempt to generate a single secondary structure for the probe. The EMSA procedure was then carried out as described for Fig. A.3 C with this treated probe (Fig. A.3 D) (31). Interestingly, after the heatÆsnap-cool treatment, the upper-migrating form of

113

A B Untreated hot probe: (+) 69-182 nt RNA 5’ 1a N 3’ 1b An yeast tRNA ------(+) 69-182 nt RNA ------N[µM] - 0.25 0.75 1.25 0.75 0.75 0.75 0.75 0.75 0.75 5’ UTR

IV C2

I III

1 210 zz II z 2 C1 5’ 3’ 53954558037 84 129 174 226

69-182 nt 5’ 3’ (+)

12345678910

CDUntreated HeatÆsnap-cooled hot probe: (+) 69-182 nt RNA hot probe: (+) 69-182 nt RNA

N N

[µM] - 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 [µM] - 0.10.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6

C4 C3 C2

C1 C

123456789 12345678910

FIG. A.3. EMSA of purified BCoV N protein binding to the positive-strand RNA probe, (+) 69-182 nt. (A) Schematic identification of cis-replication stem-loops I-IV in the BCoV 5’ UTR. Shown below the stem-loops is the RNA segment (solid line) used as the 32P-labeled probe in the EMSA described in panels B, C, and D. 2.5 nM of 32P-labeled probe was used in each reaction for the EMSA. (B) Lane 1, probe alone showing the upper and lower migrating forms; lanes 2-4, shifted bands into lower [C1] and upper [C2] complexes that form with the indicated amounts of N; lanes 5-7, competition of N binding with 5, 10, and 15 µg of unlabeled probe; lanes 8-10, competition of N binding with 5, 10, and 15 µg of yeast tRNA over and above the initial yeast tRNA in the binding buffer present in all lanes. (C) Lane 1, probe alone showing the upper and lower migrating forms; lanes 2-9, shifted bands into lower [C1] and upper [C2-C4] complexes that form with the indicated amounts of N. (D) The same experiment as described in panel (C) except that the 32P-labeled probe was heated and then snap-cooled (95 °C, 5 min, then 4 °C, 10 min) before adding to the probe reaction mixture. Lane 1, probe alone showing a single migrating form; lanes 2-10, a shifted band [C] identifying the complex that forms with the indicated amounts of N. In panels B, C, and D, arrows indicate the positions of free probes.

114 free probe disappeared (Fig. A.3 D, lane 1), suggesting the formation of the upper-migrating form of the probe had occurred gradually in the buffered conditions during overnight elution at 4 °C from the denaturing gel during probe preparation. As anticipated, lanes 2-10 show

only a single shifted band, identifying a Kd value of ~0.2 μM. These results indicate that indeed the single and fastest migrating RNA:protein complex resulted from the lower- migrating form of the RNA probe. To confirm the binding specificity of N to the SLIII proper, two analyses were carried out that were based on the EMSA procedure (Fig. A.4). First, the radiolabeled probe nearly representing the SLIII, (+) 85-126 nt, was used for EMSA along with specific and non-specific competitors (Fig. A.4 B). Lane 1 shows the probe migrating as a single band even without the heatÆsnap-cool treatment, indicating a spontaneous tendency for the probe to yield a single secondary structure. Lanes 2-4 show far less quantitative shifting of the probe [than the probe of (+) 69-182 nt, Fig. A.3] with increasing concentrations of N, thus binding affinity could not be determined in this experiment. Lanes 5-7, however, show that shifted band with 0.75 μM N was completely inhibited by the addition of 4, 8, and 12 μg of unlabeled probe, whereas lanes 8-10 show no inhibition of binding with 4, 8, 12 μg of yeast tRNA. These results show weaker but still specific binding of N to the (+) 85-126 nt probe as compared to the (+) 69-182 nt probe. In the second approach, binding of N to the heatÆsnap-cooled (+) 69-182 nt probe was evaluated by adding competitors of SLIII and flanking sequences, respectively (Fig. A.4 C). The RNA segments used as a 32P-labeled probe, (+) 69-182 nt, or unlabeled competitor, (+) 85-126 nt and (+) 127-182 nt, were depicted along with putative 5’ UTR cis-acting stem-loop structures in Fig. A.4 A. Figure A.4 C, lane 1, shows the migration of a single band of free (+) 69-182 nt probe after heatÆsnap-cool treatment as demonstrated in Fig. A.3 D, and thereby a single shifted band was observed with the addition of N as shown in lanes 2-4. Significantly, lanes 5-7 show the shifted band with 0.2 μM of N was completely inhibited by the addition of 80, 160, and 240 ng, respectively, of unlabeled (+) 85-126 nt probe. Yet the competitions were not seen in lanes 8-10 where 80, 160, and 240 ng of unlabeled (+) 127-182 nt probe was added, respectively. Neither was the competition observed with the addition of 0.4, 0.8, and 1.6 μg yeast tRNA as shown in lanes 11-13. Taken together, these results show that whereas

115 A B Untreated hot probe: (+) 85-126 nt RNA

5’ 1a N A 3’ 1b n tRNA ------(+) 85-126 nt RNA ------N[µM] - 0.25 0.75 1.25 0.75 0.75 0.75 0.75 0.75 0.75 5’ UTR IV

I III 1 210 C zz II z

2

5’ 3’ 53954558037 84 129 174 226

69-182 nt 5’ 3’ (+) 85-126 nt 5’ 3’ (+) 127-182 nt 12345678910 5’ 3’ (+)

HeatÆsnap-cooled C hot probe: (+) 69-182 nt RNA yeast tRNA ------(+) 127-182 nt RNA ------(+) 85-126 nt RNA ------N[µM] - 0.05 0.1 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2

C

12345678910111213

FIG. A.4. EMSA of purified BCoV N binding to the positive-strand RNA probe, (+) 85-126 nt. (A) Schematic identification of cis-replication stem-loops I-IV in the BCoV 5’ UTR. Shown below the stem-loops are the RNA segments (solid lines) used as 32P-labeled probes in the EMSAs described in panels B and C. (B) 5 nM of 32P-labeled probe, (+) 85-126 nt, was used in each reaction for the EMSA. Lane 1, probe alone showing a single migrating form; lanes 2-4, a shifted band [C] that forms with the indicated amounts of N protein; lanes 5-7, N binding in the presence of 4, 8, and 12 µg unlabeled probe, (+) 85-126 nt; lanes 8-10, N binding in the presence of 4, 8, and 12 µg yeast tRNA over and above the initial yeast tRNA in the binding buffer present in all lanes. (C) 10 nM of 32P-labeled probe, (+) 69-182 nt, was heated and snap-cooled, and used in each reaction for the EMSA. Lane 1, probe alone showing a single migrating form; lanes 2-4, a shifted band [C] that forms with the indicated amounts of N; lanes 5-7, N binding in the presence of 80, 160, and 240 ng of unlabeled probe, (+) 85-126 nt; lanes 8-10, N binding in the presence of 80, 160 and 240 ng of unlabeled probe, (+) 127-182 nt; lanes 11-13, N binding in the presence of 0.4, 0.8, and 1.6 µg of yeast tRNA over and above the initial yeast tRNA in the binding buffer present in all lanes. In panels B and C, arrows indicate the positions of free probes.

116 SLIII per se is a specific target for N protein binding, the presence of its extended flanking sequences probably contributes to increase the binding affinity and/or permit the folding of an alternative higher-order structure of SLIII.

117 BCOV NSP1 BINDS SLIII AND ITS FLANKING SEQUENCES WITH SPECIFICITY AND MICROMOLAR AFFINITY In an early collaborative set of experiments with Kortney Gustin (PhD, 2008) and others in the laboratory, binding of BCoV nsp1 to BCoV 5’ UTR cis-replication elements was studied (55). Here, is a detailed description of that piece of work. In retrospect, since MHV nsp1 in more recent experiments has been shown by genetic evidence to specifically interact with its homologous 5’ UTR (Chapter III), these experiments have become even more important. To determine whether BCoV nsp1 binds the putative 5’-proximal cis-replication elements within the BCoV 5’ UTR, same experimental scheme was employed. First, a similar procedure as described for BCoV N was utilized to construct, express, and purify recombinant BCoV nsp1 protein from E. coli expression system (55). Secondly, in cooperation with lab colleagues, we have learned that BCoV nsp1 is an RNA-binding protein that weakly binds 5’-terminal SLI-SLII and SLIV, moderately binds 3’-terminal BSL-PK, and strongly binds the SLIII with its extended flanking sequence (55). These data indicates that nsp1 may function with these cis-replication elements by using a variety of binding sites and affinities. Having learned that BCoV nsp1 causes a strong band shifting in the (+) 69-182 nt probe representing SLIII with its extended flanking sequences (55), we aimed to establish both its specificity and affinity of binding. To determine the binding specificity and affinity of nsp1 to the probe (+) 69-182 nt, EMSAs were used following the procedure described for N binding. Figure A.5 B, lane 1, shows that the SLIII-bearing probe, (+) 69-182 nt, migrates as upper and lower bands, representing two presumably different folded forms. Lane 2-4 shows a positive correlation between an increasing amount of nsp1 in the reaction and a greater abundance of the shifted bands. Two separately shifted bands were found, presumably arising from the binding of nsp1 to the upper- and lower-migrating forms of the free probe, respectively. In contrast to the multiple shifted complexes from the binding of N to the upper-migrating form of the probe, nsp1 appeared to form only a single shifted complex from both forms of the probe, suggesting a single nsp1-binding site or multiple binding sites with very strong cooperativity. In addition, lane 2 shows approximately 50% of the lower-migrating form and all of the upper-migrating form of the

118 A B Untreated hot probe: (+) 69-182 nt RNA 5’ 1a N 3’ 1b An yeast tRNA ------(+) 69-182 nt RNA ------nsp1 [µM] - 2.5 5 7.5 5 5 5 5 5 5 5’ UTR

IV

I III

1 210 C2 zz II z

2

5’ 3’ C1 53954558037 84 129 174 226 69-182 nt 5’ 3’ (+) 85-126 nt 5’ 3’ (+) 127-182 nt 12345678910 5’ 3’ (+)

HeatÆsnap-cooled CDUntreated hot probe: (+) 69-182 nt RNA hot probe: (+) 85-126 nt RNA yeast tRNA ------yeast tRNA ------(+) 127-182 nt RNA ------(+) 85-126 nt RNA ------(+) 85-126 nt RNA ------nsp1 [µM] - 0.8 2.5 5 2.5 2.5 2.5 2.5 2.5 2.5 nsp1 [µM] - 0.4 0.8 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6

C C

12345678910 12345 678910111213

FIG. A.5. EMSA of purified BCoV nsp1 binding to the positive-strand probes, (+) 69-182 nt and (+) 85-126 nt. (A) Schematic identification of cis-replication stem-loops I-IV in the BCoV 5’ UTR. Shown below the stem-loops are the RNA segments (solid lines) used as 32P-labeled probes in the EMSAs shown in panels B, C, and D. (B) 2.5 nM of 32P-labeled probe, (+) 69-182 nt, was used in each reaction for the EMSA. Lane 1, probe alone showing upper and lower migrating forms; lanes 2-4, shifted bands into lower [C1] and upper [C2] complexes that form with the indicated amounts of nsp1; lanes 5-7, competition of nsp1 binding with 5, 10, and 15 µg of unlabeled probe; lanes 8-10, competition of nsp1 binding with 5, 10, and 15 µg of yeast tRNA over and above the tRNA in the binding buffer present in all lanes. (C) 5 nM of 32P-labeled probe, (+) 85-126 nt, was used in each reaction for the EMSA. Lane 1, probe alone showing a single migrating form; lanes 2-4, a shifted band [C] that forms with the indicated amounts of nsp1; lanes 5-7, competition of nsp1 binding with 4, 8, and 12 µg of unlabeled probe, (+) 85-126 nt; lanes 8-10, competition of nsp1 binding with 4, 8, and 12 µg of yeast tRNA over and above the initial yeast tRNA in the binding buffer present in all lanes. (D) 10 nM of 32P-labeled probe, (+) 69-182 nt, was heated and snap-cooled, and used in each reaction for the EMSA. Lane 1, probe alone showing a single migrating form; lanes 2-4, a shifted band [C] that forms with the indicated amounts of nsp1; lanes 5-7, competition of nsp1 binding with 80, 160, and 240 ng of unlabeled probe, (+) 85-126 nt; lanes 8-10, competition of nsp1 binding with 80, 160 and 240 ng of unlabeled probe, (+) 127-182 nt; lanes 11-13, competition of nsp1 binding with 0.4, 0.8, and 1.6 µg of yeast tRNA over and above the tRNA in the binding buffer present in all lanes. Arrows indicate the positions of free probes.

119 probe fully shifting positions with 2.5 μM nsp1, indicating a Kd of ~2.5 μM for nsp1 binding. Lanes 5-7 show a complete inhibition of probe (+) 69-182 nt shifting with 5, 10, and 15 μg of unlabeled probe, representing 2500, 5000, and 7500 molar excess over labeled probe, respectively, whereas no inhibition was seen with 5, 10 and 15 μg of yeast tRNA. These results together show specific binding of nsp1 to the SLIII-bearing probe (+) 69-182 nt. To confirm the binding specificity of nsp1 to the SLIII proper, a probe which harbors only the SLIII was also used and the results were compared to SLIII with the longer flanking sequences. Two EMSAs were carried out (Fig. A.5 C and D). First, a probe nearly representing the SLIII per se, (+) 85-126 nt, was used for the EMSA along with specific (unlabeled probe RNA) and non-specific (yeast tRNA) competitors (Fig. A.5 C). Lane 1 shows the probe (+) 85-126 nt migrating as a single band even without heatÆsnap- cool treatment, indicating a tendency of this SLIII structure to readily form spontaneously. Lanes 2-4 show far less quantitative shifting of the probe to an upper band with increasing amounts of nsp1 than was seen with the longer flanking regions (Fig. A.5 B). Thus, a binding affinity could not be determined by these experiments. Lanes 5-7, however, show that a shifted band with 2.5 μM nsp1 was completely inhibited by the addition of 4, 8, and 12 μg, respectively, of unlabeled (+) 85-126 nt probe, whereas lanes 8-10 show no inhibition of binding with 4, 8, 12 μg of yeast tRNA. These results show weaker but still specific binding of nsp1 to the SLIII per se. In the second approach, binding of nsp1 to the heatÆsnap-cooled (+) 69-182 nt probe (with the longer flanking regions) was sought in the presence of three potential competitors: unlabeled SLIII [(+) 69-182 nt], unlabeled 55-nt RNA region between SLIII and SLIV representing no stem-loop [(+) 127-182 nt], and unlabeled yeast tRNA (Fig. A.5 A and D). Figure A.5 D, lane 1, shows the migration of a single band of free (+) 69-182 nt probe after heatÆsnap-cool treatment as demonstrated in Figs. A.3 D and A.4 C, and therefore only a single shifted band were observed with the addition of nsp1 as shown in lanes 2-4. Lanes 5-7 show a weak shifted band with 1.6 μM nsp1 that was essentially all competed out by the addition of 240 ng competitor of SLIII RNA probe (+) 85-126 nt, as shown in lane 7. Little or no competition was seen in lanes 8-10 with 80, 160, and 240 ng of unlabeled nonstructured RNA probe (+) 127-182 nt, or in

120 lanes 11-13 with 0.4, 0.8, and 1.6 μg of yeast tRNA. Taken together, these results show that whereas SLIII per se is likely a specific target for nsp1 binding, the presence of its extended flanking sequences probably contribute to an increase of the binding affinity and/or permit the folding of an alternative higher-order structure of SLIII that enhances binding.

121 BCOV N AND NSP1 BIND THE NEGATIVE-STRAND COUNTERPART OF SLIII The functional importance of the negative-strand counterparts of genomic and subgenomic RNAs of coronaviruses is much less understood. One widely supported idea for the role of negative-strand RNA in coronaviruses life cycle is that template switching takes place from an internal site on the positive-strand genome to a site near the 5’ end of the genome during the generation of negative-strand RNAs to produce a subgenomic mRNA-length negative strand from which the positive-strand subgenomic RNA are synthesized (130). It is also speculated that the coronavirus genome 5'-proximal end is a partially double-stranded structure that, as a double-stranded structure, facilitates the template switching step (168, 170). The functional activities of the negative-strands are probably also achieved through interactions with cellular and viral proteins. To date, however, only cellular proteins hnRNP A1, SYNCRIP and PTB have been shown to bind the negative-strand counterparts of 5’ UTR and 3’ UTR of MHV genome (30, 46, 65, 181). Having shown that BCoV N and nsp1 specifically bind the SLIII with high affinity and behaves in a potentially multiple site-binding manner, we sought to examine the binding capacity of N and nsp1 to the negative-strand of SLIII with the same experimental approach (Fig. A.6). The plasmids for generating the positive-strand probes for EMSAs were used to make the RNA probes of negative-strand counterparts except the DNA template was digested with EcoRI (not HindIII) followed by in vitro RNA transcription with SP6 RNA polymerase (not T7 RNA polymerase). EMSAs were done following the same procedure described above for binding positive-strand RNA. Figure A.6 B, lane 1, shows that the probe, (-) 69-182 nt, migrates similarly to its positive-strand counterpart in that it migrates as upper and lower bands, probably representing two folded forms of the probe. The structure of the predicted negative-strand of SLIII has not yet been examined by structure probing. Lanes 2-4 show a positive correlation between increasing amounts of N and shifted bands. Interestingly, the shifted bands appeared to result from only the upper-migrating but not from the lower-migrating form of the probe, suggesting a certain structure of the negative-strand probe is a prerequisite for N binding. In addition, shifted band with 0.75 μM of N was completely inhibited by the addition of unlabeled probe, (-)

122

A B Untreated hot probe: (-) 69-182 nt RNA

5’ 1a N A 3’ 1b n yeast tRNA ------(-) 69-182 nt RNA ------N[µM] - 0.25 0.75 1.25 0.75 0.75 0.75 0.75 0.75 0.75 5’ UTR

IV C

I III

1 210 II zz z 2

5’ 3’ 53954558037 84 129 174 226

69-182 nt 3’ 5’ (-) 85-126 nt 3’ 5’ (-) 12345678910

Untreated CDhot probe: (-) 85-126 nt RNA HeatÆsnap-cooled hot probe: (-) 69-182 nt RNA yeast tRNA ------N (-) 85-126 nt RNA ------[µM] - 0.20.4 0.8 1.2 1.6 N[µM] - 0.25 0.75 1.25 0.75 0.75 0.75 0.75 0.75 0.75

C

123456 12345678910

FIG. A.6. EMSA of purified BCoV N binding to the negative-strand probes, (-) 69-182 nt and (-) 85-126 nt. (A) Schematic identification of cis-replication stem-loops in the BCoV genomic 5’ UTR. Shown below the stem-loops are the RNA segments (broken lines) used as negative-strand 32P-labeled probes for the EMSAs described in panels B, C, and D. (B) 2.5 nM of 32P-labeled probe, (–) 69-182 nt, was used in each reaction for the EMSA. Lane 1, probe alone showing upper and lower migrating forms; lanes 2-4, a shifted band [C] that forms with indicated amounts of N; lanes 5-7, competition of N binding with 0.5, 1, and 1.5 µg of unlabeled probe, (–) 69-182 nt; lanes 8-10, competition of N binding with 4, 8, and 12 µg of yeast tRNA over and above the tRNA in the binding buffer present in all lanes. (C) The same experiment as shown in panel (B) except that the 32P-labeled probe was heated and snap-cooled (95 °C, 5 min then 4 °C, 10 min) before adding to the reaction mixture. Lane 1, probe alone showing migration of a single form; lanes 2-6, no shifted bands observed under the conditions described for panel B. (D) 5 nM of 32P-labeled probe, (–) 85-126 nt, was used in each reaction for the EMSA. Lane 1, probe alone showing a single migrating form; lanes 2-4, a shifted band [C] that forms with the indicated amounts of N; lanes 5-7, competition of N binding with 1.5, 3, and 4.5 µg of unlabeled probe, (–) 85-126 nt; lanes 8-10, competition of N binding with 4, 8, and 12 µg of yeast tRNA over and above the tRNA in the binding buffer present in all lanes. In panels B, C, and D, arrows indicate the positions of free probes.

123 69-182 nt (lanes 5-7), but not by yeast tRNA (lanes 8-10), indicating a specific interaction between N protein and the negative-strand RNA region of SLIII. The shifted band of the (-) 69-182 nt probe disappeared once the probe had been heatÆsnap-cooled, indicating that the upper-migrating form of the probe for N binding can only occur after it has properly re-folded (after the urea treatment) to a native structure under the overnight elution conditions at 4 °C in buffered solution (Fig. A.6 C). To confirm the binding specificity of N to the negative-strand form of SLIII RNA and not its flanking sequences, a similar strategy was designed but for a probe that represented the negative-strand form of SLIII proper, (-) 85-126 nt (Fig. A.6.A). This probe was used for the EMSA along with putative specific and non-specific competitors (Fig. A.6 D). Lane 1 shows the (-) 85-126 nt probe migrating as a single band without the heatÆsnap- cool treatment, indicating a tendency for the negative-strand SLIII structure to maintain a single folded form. Lanes 2-4 show that very little shifting of the probe occurs, far less quantitative, with increasing concentration of nsp1, yet the shifted band with 0.75 μM N was clearly inhibited by unlabeled (-) 85-126 nt probe (lanes 5-7) but not inhibited by yeast tRNA (lanes 8-10). Taken together, these results show a much weaker but still specific binding of N to the negative-strand form of SLIII per se, whereas its extended downstream flanking sequences greatly enhanced this binding affinity. It is possible that N binds this negative-strand RNA, i.e., the RNA between SLIII and SLIV, with significant affinity. Inasmuch as this sequence in the positive-strand represents an important 32-nt unstructured RNA for nsp1 binding (Chapter III), further studies should be done on it in the MHV reverse genetics system. To determine if BCoV nsp1 also binds the negative-strand counterpart of SLIII, similar EMSAs were carried out with the same probes described for Fig. A.6 (Fig. A.7). Interestingly, shifted bands were observed not only with the upper-migrating form of (-) 69-182 nt probe with the long flanking regions, but also the lower-migrating form of the probe (Fig. A.7 B, lanes 2-4). Both shifted bands were specifically inhibited by unlabeled (-) 69-182 nt RNA (lanes 5-7) but not by yeast tRNA (lanes 8-10). EMSA was also carried out with SLIII proper (-) 85-126 nt probe (Fig. A.7 C). In this experiment shifted bands were observed (lanes 2-4), although far less than quantitative, and binding was shown to be

124

A B Untreated hot probe: (-) 69-182 nt RNA

5’ 1a N A 3’ 1b n yeast tRNA ------(-) 69-182 nt RNA ------nsp1 [µM] - 2.5 5 7.5 5 5 5 5 5 5 5’ UTR

IV

I III C2

1 210 zz II z

2

5’ 3’ C1 53954558037 84 129 174 226

69-182 nt 3’ 5’ (-) 85-126 nt 3’ 5’ (-)

1 2345 678910

C Untreated hot probe: (-) 85-126 nt RNA

yeast tRNA ------(-) 85-126 nt RNA ------nsp1 [µM] - 0.8 2.5 5 2.5 2.5 2.5 2.5 2.5 2.5

C

12345678910

FIG. A.7. EMSA of purified BCoV nsp1 binding to the negative-strand RNA probes, (-) 69-182 nt and (-) 85-126 nt. (A) Schematic identification of cis-replication stem-loops in the BCoV 5’ UTR. Shown below the stem-loops are the RNA segments (broken lines) used as negative-strand 32P-labeled probes for the EMSAs described in panels B and C. (B) 2.5 nM of 32P-labeled probe, (-) 69-182 nt, was used in each reaction for the EMSA. Lane 1, probe alone showing upper and lower migrating forms; lanes 2-4, shifted bands identify the lower [C1] and upper [C2] complexes that formed with indicated amounts of nsp1; lanes 5-7, competition of nsp1 binding with 0.5, 1, and 1.5 µg of unlabeled probe, (–) 69-182 nt; lanes 8-10, competition of nsp1 binding with 4, 8, and 12 µg of yeast tRNA over and above the tRNA in the binding buffer present in all lanes. (C) 5 nM of 32P-labeled probe, (–) 85-126 nt, was used in each reaction for the EMSA. Lane 1, probe alone showing a single migrating form; lanes 2-4, a shifted band [C] that forms with the indicated amounts of nsp1; lanes 5-7, competition of nsp1 binding with 1.5, 3, and 4.5 µg of unlabeled probe, (–) 85-126 nt; lanes 8-10, competition of nsp1 binding with 4, 8, and 12 µg of yeast tRNA over and above the tRNA in the binding buffer present in all lanes. In panels B and C, arrows indicate the positions of free probes.

125 specific as evidenced by inhibition of binding with unlabeled (-) 85-126 nt probe (lanes 5-7) but not with yeast tRNAs (lanes 8-10). These results together indicate that negative-strand SLIII per se is also a specific binding target for nsp1 binding, and suggests that the extended flanking sequence contributes to the binding. Enhanced binding by the flanking sequence could be a function of direct binding to the flanking sequence, or altered capacity of SLIII binding due to altered folding. Whether interplay between N and nsp1 is cooperative or competitive for the binding of SLIII-bearing negative-strand RNA and its flanking sequence remains to be determined.

126 THERE IS NO DETECTABLE RNA-RNA INTERACTION BETWEEN

THE 5’ AND 3’ UTRS OF BCOV BASED ON A GEL-SHIFT ASSAY Accumulative evidence has pointed to a hypothesis that RNA genomic circularization through a protein-protein bridge or RNA-RNA cross talk may be a general replication mechanism for positive-strand RNA viruses (4, 57, 127). The long-range interaction between the 5’ and 3’ ends of viral genome has been implicated in the regulation of translation, initiation of negative-strand RNA synthesis, and the switch between them (44, 48, 161). While several cellular and viral proteins have been proposed to function in coronaviruses as mediators of 5’-3’-end cross talk, based on their UTR-binding properties, terminal RNA-RNA interactions have never been experimentally ruled out. To directly examine an RNA-RNA interaction between the termini of the BCoV genome, RNA binding assays were established as described by Alvarez et al. (4) who used the assay to examine the termini interaction of flavivirus genome. 32P-labeled RNA probes of the 3’ UTR-located BSL-PK structure [(+) 291-159 nt] and the HVR structure [(+) 162-1 nt] were obtained by in vitro transcription and were purified through 6% PAGE with 7M urea (Fig. A.8 A). Unlabeled RNA probes encompassing different 5’- and 3’-terminal cis-replication elements were prepared from the RiboMax kit (Promega) along with Bio-spin 6 column purification (Bio-Rad) and quantified spectrophotometrically (Fig. A.8 A). Both labeled and unlabeled RNA samples were heat denatured at 85 °C for 5 min and slow-cooled to room temperature. Different unlabeled RNA probes were added to the binding mixture which contained 5mM

32 HEPES, pH 7.9, 100mM KCl, 5mM MgCl2, 3.8% glycerol, 2.5 μg tRNA, and P-labeled RNA probe (0.4 nM, 13,000 cpm) in a final volume of 30 μl. RNA:RNA complexes were analyzed by electrophoresis through native 6% polyacrylamide gels supplemented with 5% glycerol. Gels were pre-run for 1 h at 4 °C at 150 V, and then mixtures were loaded with additional 2 μl 50% glycerol followed by electrophoresis at 4 °C for 4 h. Gels were dried and visualized by autoradiography. Non-detectable RNA-binding shifts were observed when four different RNA probes representing the 5’-terminal located cis-replication elements were analyzed for their binding ability to the 3’ UTR-located BSL-PK and HVR, respectively (Fig. A.8 B and C). These results suggest the absence of a physical RNA-RNA contact between 5’ and 3’ ends of the BCoV genome. This negative evidence

127

A

5’ 1a N A 3’ 1b n

5’ UTR partial ORF 1a 3’ UTR V

IV BSL HVR

I PK z III zz z z z 1 210 VI zz II zz z 2

5’ 3’ 5’ UAA 53954558037 84 129 174 226 239 310 311 340 288 An 3’ 1 (+) 1-98 nt (+) 291-159 nt 5’ 3’ 5’ 3’ (+) 69-182 nt (+) 162-1 nt 5’ 3’ 5’ 3’ (+) 152-262 nt (-) 291-159 nt 5’ 3’ 3’ 5’ (+) 231-364 nt (-) 162-1 nt 5’ 3’ 3’ 5’

BCHot probe: 0.4 nM (+) 291-159 nt RNA Hot probe: 0.4 nM (+) 162-1 nt RNA

Cold RNA (+) 1-98 nt (+) 69-182 nt (+) 152-262 nt (+) 231-364 nt (-) 291-159 nt Cold RNA (+) 1-98 nt (+) 69-182 nt (+) 152-262 nt (+) 231-364 nt (-) 162-1 nt

1:5 1:100 1:1000 1:3000 1:0 1:5 1:100 1:1000 1:5 1:100 1:1000 1:3000 1:5 1:100 1:1000 1:3000 1:0 1:5 1:100 1:1000 1:5 1:100 1:1000 1:3000 1:5 1:100 1:1000 1:3000 1:5 1:100 1:1000 1:3000 1:5 1:100 1:1000 1:3000 1:5 1:100 1:1000 1:3000 [hot:cold] [hot:cold]

C2

C C1

1 2 3 4 5 6 7 8 991011121314151617181920 10 11 12 13 14 15 16 17 18 19 20 1 2 3 4 5 6 7 8 9 101011121314151617181920 11 12 13 14 15 16 17 18 19 20

FIG. A.8. Positive-strand RNA-RNA binding analysis between the BCoV 5’ and 3’ terminal cis-replication RNA elements. (A) A model of BCoV 5’ and 3’ terminal cis-replication RNA secondary structures identified for BCoV DI RNA replication. Shown below the 3’ UTR RNA structures are two (+) RNA segments (mosaic lines) used as 32P-labeled probes. Also shown are two (-) RNA segments (broken lines) used as respective unlabeled binder to each probe in the binding assays. Shown below the 5’ terminal RNA structures are four unlabeled (+) RNAs (solid lines) tested for 5’-3’ RNA-RNA interaction. (B) and (C) The RNA probes present in the binding assays are indicated at the top. Binding assays were carried out with increasing amounts of unlabeled RNA (+) 1-98 nt (lanes 1-4), (+) 69-182 nt (lanes 5-8), (+) 152-262 nt (lanes 9-12), (+) 231 to 364 nt (lanes 13-16), and respective negative-strand counterpart of each probe (lanes 17-20). Arrows indicate the positions of free RNA probes. Shifted band in (B) and shifted bands in (C) showed the complexes [C] that form with the indicated amounts of RNAs.

128 is consistent with the fact that significant base-pairing between the 5’ and 3’ ends of coronavirus genome have not been found by inspection or by computer analysis. However, it remains possible that RNA conformational changes arising from protein binding may lead to potential RNA-RNA interaction.

129 VITA

Bo-Jhih Guan was born in Chiayi, Taiwan. He received his Bachelor of Science degree in Aquaculture from the National Taiwan Ocean University, Taiwan, in 1999, and Master of Science degree in Fisheries Science from the National Taiwan University, Taiwan, in 2001. After that, he served as a Squad leader and Educational foreman in the Army of Taiwan for the fulfillment of military obligation from 2001 to 2003. Thereafter he worked as a research assistant in the Department of Microbiology and Immunology, National Defense Medical Center, Taiwan, from 2003 to 3005. He entered the graduate program in Microbiology at the University of Tennessee, Knoxville, in the fall of 2005. He received his Doctor of Philosophy degree in the summer of 2010.

130