<<

Iowa State University Capstones, Theses and Retrospective Theses and Dissertations Dissertations

1999 Replication of yellow dwarf RNA and transcriptional control of gene expression Guennadi Koev Iowa State University

Follow this and additional works at: https://lib.dr.iastate.edu/rtd Part of the Molecular Biology Commons, and the Plant Pathology Commons

Recommended Citation Koev, Guennadi, "Replication of barley yellow dwarf virus RNA and transcriptional control of gene expression " (1999). Retrospective Theses and Dissertations. 12464. https://lib.dr.iastate.edu/rtd/12464

This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Retrospective Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. INFORMATION TO USERS

This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer.

The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bteedthrough, substandard margins, and improper alignment can adversely affect reproduction.

In the unlikely event tiiat the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate tine deletion.

Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at tiie upper left-hand comer and continuing from left to right in equal sections with small overiaps.

Photographs included in the original manuscript have been reproduced xerographicaliy in this copy. Higher quality 6° x 9" black and white photographic prints are available for any photographs or illustrations appearing in this copy for an additional charge. Contact UMI directiy to order.

Bell & Howell Information and teaming 300 North Zeeb Road, Ann Arbor. Ml 48106-1346 USA UIVQ 800-521-0600

Replication of barley yellow dwarf virus RNA

and transcriptional control of gene expression

by

Guennadi Koev

A dissertation submitted to the graduate faculty in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Major: Plant Pathology

Major Professor: W. Allen Miller

Iowa State University

Ames, Iowa

1999 UMI Number 9950100

® UMI

UMI Microfomtl9950100 Copyright 2000 by Bell & Howell Information and Leaming Company. All rights reserved. This microfonm edition is protected against unauthorized copying under Title 17, United States Code.

Bell & Howell Information and Leaming Company 300 North Zeeb Road P.O. Box 1346 Ann Arbor, Ml 48106-1346 ii

Graduate College

Iowa State University

This is to certify that the Doctoral dissertation of

Guennadi Koev has met the dissertation requirements of Iowa State University

Signature was redacted for privacy.

Signature was redacted for privacy. Foi^he Major Program

Signature was redacted for privacy. iii

TABLE OF CONTENTS

ABSTRACT iv

CHAPTER L GENERAL INTRODUCTION I

CHAPTER 2. PRIMARY AND SECONDARY STRUCTURAL ELEMENTS REQUIRED FOR SYNTHESIS OF BARLEY YELLOW DWARF VIRUS SUBGENOMIC RNAI 20

CHAPTER 3. BARLEY YELLOW DWARF VIRUS UTILIZES VERY DIFFERENT PROMOTERS FOR TRANSCRIPTION OF ITS SUBGENOMIC RNAs 51

CHAPTER 4. CHARACTERIZATION OF THE BARLEY YELLOW DWARF VIRUS 3" ORIGIN OF REPLICATION 82

CHAPTERS. GENERAL CONCLUSIONS 105

REFERENCES 110

ACKNOWLEDGMENTS 130 Barley yellow dwarf virus (B YDV) is an important pathogen of cereal crops. It has a single-stranded positive sense genomic RNA (gRNA), 5.7 kilobases. During infection of a plant cell, B YDV generates a nested set of three subgenomic mRNAs (sgRNAs) for expression of its 3'-proximal genes. The goal of this study was to map and characterize cis- acting RNA signals involved in transcription and replication of B YDV RNA. Three sgRNA promoters and the 3' origin of replication were characterized. The sgRNA 1 promoter was mapped to a 98 nt region that contains two stem-loop structures. A combination of primary and secondary structural elements was required for promoter activity. Most of the promoter sequence (75 nt) is located upstream of the transcription initiation site. In contrast, sgRNA2 and sgRNA3 promoters contained the majority of their sequences downstream of the sgRNA start sites. SgRNA2 promoter was mapped to a 143 nt region predicted to fold into a cloverleaf-like stmcture. SgRNA3 promoter is 44 nt long and was predicted to form a hairpin and a single-stranded RNA region. Sequence and structure comparisons of the three subgenomic promoters did not reveal any similarities indicating that transcription of B YDV sgRNAs is controlled by extremely different cis-elements.

The 3' origin of replication of BYDV capable of supporting a basal level of replication was mapped to the 104 3' terminal nucleotides. Based on the computer prediction, phylogenetic and mutational analysis, and nuclease sensitivity assays, this region forms four stable stem-loop structures (SL1-SL4, 3' to 5') that contain terminal GNRA and

UNCG tetraloops. Each stem-loop was indispensable for virus replication according to results of deletion mutagenesis. Loop sequences of SLl and SL2 were not important for V replication, whereas loops of SL3 and SL4 preferred the GNRA consensus sequence. RNA secondary structure and not the primary sequence of the stems in SLl through SL4 is important for viral RNA replication in protoplasts.

Characterization of the cis-acting elements required for transcription and replication of BYDV is an important step towards better understanding of the basic mechanisms of the viral life cycle, which may help in development of new antiviral strategies. 1

CHAPTER 1. GENERAL INTRODUCTION

Luteoviridae

Members of the family Luteoviridae are single-stranded positive sense RNA that form icosahedral virions, approximately 25-30 nm in diameter (86). Based on their genome organizations, serological, cytopathological and other properties, the Luteoviridae are divided in three genera; , Polerovirus, and Enamovinis (78). Viruses that belong to the three genera share highly homologous genes for the coat protein and its readthrough extension, which has been implicated in aphid transmission. This accounts for their serological relatedness, similarity of virion morphology and means of transmission.

Based on these features, these viruses have been grouped into the family Luteoviridae. On the other hand, the genes that encode viral RNA-dependent RNA polymerases (RdRp) are as unrelated as they could be. The RdRp of the members of the genus Luteovirus belongs to the carmo-like virus supergroup (supergroup 2), whereas those of the genera Polerovirus and

Enamovirus belong to the sobemo-like supergroup (supergroup 1) (60). Such seemingly inconsistent genome organization exemplifies the modular character of the evolution of RNA viruses where RNA recombination is an important driving force (60).

The role of RNA recombination in the evolution of the Luteoviridae was suggested by Miller and colleagues (89). The proposed recombination event could have happened according to one of the two scenarios (Fig. 1). A carmo-like virus (e.g. a dianthovirus) RNA may have recombined with an existing ancestor of poleroviruses resulting in a luteovirus genome, or, alternatively, a sobemo-like virus RNA may have recombined with an existing 2

Origin of Luteovirus Dianthovirus RNA1

Polerovirus

Luteovirus

Origin of Polerovirus Sobemovirus iliil

Luteovirus

Polerovirus

^Esa

Fig. 1. Role of recombination in the evolution of the genera Luteovirus and Polerovirus of the Luteoviridae (modified from Miller et al., 1997). Solid lines represent genomic RNAs, dashed lines represent subgenomic RNAs. Boxes show open reading frames (ORFs). Regions with identical shading pattems are phylogenetically related. Solid lines with arrow denote the proposed path of the crossover. Gray boxes represent origins of replication and subgenomic promoters ~ putative hot spots of RNA recombination. ORF designations: POL, RNA polymerase; CP, coat protein; AT, aphid transmission protein; MP?, putative movement protein; PRO, protease; VPg, genome-linked protein. 3 luteovirus ancestor resulting in a polerovirus genome (Fig. 1). Bipartite genus Enamovirus has evolved by combining a polerovirus-like RNAI and an umbravirus-like RNA2. This genus contains only one species, RNAI of pea enation .

The family Luteoviridae includes important crop pathogens. Barley yellow dwarf virus (B YDV), that belongs to the genus Luteovirus, is one of the most devastating virus pathogens that infect cereals. It is especially serious in . Average yield losses due to

B YDV infection range between 11 and 33% (4-3). Beet western yellows virus (BWYV) and potato leafiroU virus (PLRV) exemplify important plant pathogens that belong to the genus

Polerovirus. All members of the Luteoviridae are transmitted by certain aphid species with which they have co-evolved intimate relationships. The viruses are transmitted in a circulative non-propagative manner (rev. in (33)).

Due to the mandatory aphid transmission, one of the most effective methods to control luteovirus epidemics has been insecticide applications targeting aphid vectors.

However, chemical control is costly and potentially harmful to the environment.

Conventional breeding for resistance to has not been able to provide high levels of crop protection. Genetic engineering, on the other hand, is a promising new approach, which potentially combines high effectiveness and low cost. Using the pathogen-mediated protection approach, several research groups have developed transgenic plants with increased resistance to PLRV (rev. in (8)) and BYDV (58).

Because of the constant evolution of the virus populations in the field, none of the plant protection methods can be used indefinitely. Therefore, continuous research efforts need to be undertaken that are aimed at better understanding of virus replication and 4 interaction with the host. Such efforts will help to identify new targets for antiviral agents,

which may become the bases of new control strategies.

Barley yellow dwarf virus

Genome organization. B YDV is the type member of the genus Luteovirus. It has a single-stranded genomic RNA (gRNA) of positive polarity. B YDV gRNA contains six open

reading frames (ORPs) (Fig. 2) (94). ORFl and ORF2 encode proteins that are required for

virus replication (96). The product of ORFl (39 kDa protein) was suggested to be a helicase

because it contains several amino acid motifs conserved in most known helicases (37).

However, based on the lack of other conserved motifs found in helicases, a different study

suggested that BYDV, as well as the majority of RNA viruses with genomes smaller that 6

kb, does not contain a heUcase (60). The role of the 39 kDa protein in virus replication

remains to be determined. The product of ORF2 (60 kDa protein) has a GDD amino acid

motif conserved in RdRps of the majority of viruses (60). Therefore, this protein is suggested to be viral RdRp, although this function has not been demonstrated biochemically.

The BYDV coat protein (22 kDa) is encoded by 0RF3. It is not required for virus replication in protoplasts (96) but is necessary for transmission. Product of 0RF5 (50 kDa) is fused to the viral coat protein. It has been implicated in aphid Uransmission (18). 0RF4, which overlaps the coat protein gene, encodes a 17 (kDa) protein that is not required for virus

replication in protoplasts (96). The 17 kDa protein was suggested to be a movement protein

based on its requirement for plant systemic infection in BYDV (18) and its localization in

plasmodesmata in PLRV-infected plants (118). On the other hand, Domier and colleagues 5

BYDV

a gRNA 5.7 kb

sgRNA1 3.0 kb

subgenomic RNAs

sgRNA 2 0.9 kb

sgRNAS — 0.3 kb

Fig. 2. Genome organization of barley yellow dwarf virus. Solid lines represent genomic and subgenomic RNAs. Black boxes indicate ORFs with their numbers. Shown on the right, is a northern blot profile of BYDV RNAs. Total RNA from an infected plant was subjected to northern blot hybridization as described in Materials and Methods. Numbers indicate the sizes of the viral l^As in kilobases (kb). 6 have detected the 17 kDa protein of BYDV associated with RNA in the cytoplasm and the nucleus of infected cells, but never in the plasmodesmata, which argues against its function as a movement protein (100). The function of the 6.7 kDa protein encoded by ORF6 is unknown, but it is not required for virus replication in oat protoplasts (96). It has no aniino acid sequence homology to any known proteins.

Transcriptional control of gene expression. BYDV gRNA serves as the mRNA for expression of ORFs 1 and 2. For expression of its 3' proximal ORFs, BYDV generates three subgenomic RNAs (sgRNAs) during infection (Fig. 2). SgRNAl is the message for the coat protein (ORFS), the putative movement protein (ORF4), and the protein involved in aphid transmission (ORF5). SgRNA2 contains 0RF6, and the third viral sgRNA (sgRNA3) does not contain any genes, therefore, its function is unknown. In plants, the abundance of the

BYDV sgRNAs is inversely proportional to their size (Fig. 2), a phenomenon observed in other viruses (29). As demonstrated in the studies included in this dissertation, synthesis of the three viral sgRNAs is controlled by very divergent cis-acting elements named subgenomic promoters. The sgRNAl promoter is located at the border of the homologous and divergent regions in the genera Luteovirus and Polerovirus. Therefore it was suggested to be a hot spot of RNA recombination that has occurred in the evolution of the Luteoviridae

(89).

Translational control of gene expression. Because BYDV RNA does not have a 5' cap, it is translated by a cap-independent mechanism (2). Cap-independent translation has been demonstrated in a variety of RNA vimses and some cellular mRNAs (114). The best- characterized examples are RNAs of picomaviruses that contain internal ribosome entry sites

(IRES) (114). BYDV, on the other hand, has a different cis-element that mediates its cap- 7 independent translation. This translational enhancer (TE) is located in the 3' proximal region of BYDV gRNA (142, 143). By communication with the 5' UTR it mediates ribosome recruitment in the absence of the 5' cap.

The major protein translated from the viral gRNA is the 39 kDa product of ORFl.

For expression of the ORF2 product, the putative RdRp, BYDV uses (-1) translational frameshifting (11). A shifty heptanucleotide conserved in several viruses in combination with a cis-acting signal located at the 3' end of the viral genome, effects an estimated 1-8% frameshifting that results in translation of a 99 kDa fusion protein (ORF1+ORF2), the viral

RdRp (W.A. Miller and C. Paul, in preparation). Subde alterations in the frameshifting rate result in the lack of virus replication (W.A. Miller and C. Paul, in preparation), probably indicating either the strict requirement for a certain amount of the 99 kDa protein or for the stoichiometry of the 39 kDa and the 99 kDa products.

Translation of sgRNAl is also cap-independent (141). The major product expressed from SgRNAl is the viral coat protein. At the estimated rate of 1%, the stop codon of the coat protein ORF (0RF3) is not recognized by the ribosomes, and translation continues resulting in the expression of the 0RF5 product, the aphid transmission protein (24). This translational readthrough is mediated by a cis-acting element located immediately downstream of the 0RF3 stop codon and by a distant region approximately 700 nt 3' to the stop codon (12).

For tianslation of ORF4 (putative 17 kDa movement protein), which overlaps the coat protein gene, BYDV utilizes the leaky scanning strategy (25). The signals that regulate this process include RNA secondary structure in the 5' UTR of sgRNAl and primary sequence in the vicinity of the start codons of ORFs 3 and 4 (25). 8

Transcriptional control in RNA viruses

Because of the inability to translate multicistronic messages in eukaryotic cells, RNA viruses have evolved an array of strategies to express their 3' proximal genes. One such strategy is deployment of subgenomic mRNAs. Numerous RNA viruses synthesize one or more sgRNAs, which are 3'-coterminal with their genomes, for expression of downstream genes. The most common genes expressed off the sgRNAs are structural proteins (capsid, envelope, etc.) and movement proteins. There are a number of genes translated from sgRNAs whose functions are unknown. SgRNAs are transcribed later in infection, after replicase proteins are expressed from the gRNA, and initial cycles of RNA replication have occurred. Therefore, sgRNAs express products needed dviring intermediate and late stages of infection.

Subgenomic promoters and regulatory elements. The RNA cis-acting elements that effect transcription of sgRNAs have been named subgenomic promoters. Such promoters have been mapped in a variety of RNA viruses (6, 10, 29, 52, 59, 70, 132, 139,

140, 149). They range between 24 nt (70) and over 100 nt in size (6) and are usually located mosdy upstream of the transcription initiation sites. However, there are exceptions, such as the sgRNA promoter of beet necrotic yellow vein virus (6) and two of the three sgRNA promoters of B YDV described in this dissertation. One of the best-characterized subgenomic promoters is that of brome mosaic virus (BMV). Mapped to a region that includes 74 to 95 nt upstream and 16 nt downstream of the initiation site, it contains several elements, a combination of which provides full transcription activity in vivo (29). However, in vitro, a 9

22-nt promoter element is sufficient for the basal level of sgRNA synthesis by the BMV

RdRp (1).

Unlike promoters for DNA-dependent RNA polymerases, such as those found in eukaryotes, promoters used by viral RdRps often depend on RNA secondary structure for specific recognition. Requirements for stable stem-loops have been predicted and experimentally documented in several RNA viruses, such as tobraviruses (36), BYDV (59),

TCV (139), and AMV (50). This is not unusual because higher order RNA structure has been shown to play an important role in various processes that rely on specific RNA-protein interactions (4, 16, 22,99,104, 109). Interestingly, in vitro experiments with purified BMV

RdRp demonstrated sequence-specific recognition of the subgenomic promoter by the enzyme and the lack of the RNA secondary structure requirement (122). This very clearly demonstrates limitations of the in vitro approach in studying complex biological systems.

While under in vitro conditions the viral enzyme does not require extensive promoter sequence or secondary strucmre, in the infected cell the situation is apparendy different.

In addition to the regions immediately adjacent to the transcription start sites, some distant cis-elements have been shown to regulate sgRNA synthesis. Some of these elements are proposed to exert their effect by direct base-pairing to the regions near the start site.

Others that do not show apparent sequence complementarity to the promoter regions, may function through protein mediators. Long-distance RNA base-pairing has been implicated in activation of sgRNA transcription in potexviruses that synthesize up to three sgRNAs (136).

Experiments with potato virus X (PVX) have shown that mutations disrupting potential base- pairing between a region located at the 5' end of the gRNA and regions upstream of the transcription start sites of the sgRNAs abolish transcription, whereas compensatory 10 mutations restore the activity, although not to the wild type level. This suggests the requirement for both the base-pairing and the primary RNA sequence of these elements (54,

55). Another example of proposed direct base-pairing of distant RNA elements is stimulation of sgRNA2 transcription of tomato bushy stunt tombusvirus (TBSV). Mutational analysis similar to that performed in PVX, demonstrated the requirement for complementarity between a region immediately upstream of the sgRNA2 start site and a region located approximately 1000 nt fiirther upstream (150). In both PVX and TBSV phylogenetic analysis showed conservation of such long-distance interactions in related viruses reiterating their biological relevance. Base-pairing between regulatory elements located on two different genomic RNAs of the bipartite red clover necrotic mosaic virus

(RCNMV) is the only known example of trans-activation of sgRNA transcription (124). The requirement for such RNA-RNA interactions was demonstrated by mutational analysis.

SgRNA transcription in mouse hepatitis coronavirus (MHV) exemplifies a requirement for potential long-distance interactions that do not involve base-pairing. A requirement for the 3' UTR of MHV for sgRNA transcription was suggested by the fact that certain 3' UTR mutants did not produce a sgRNA in the MHV defecdve interfering RNA

(DI-RNA) system, but were able to synthesize both positive and negative strands of gRNA

(73). It was suggested that the elements in the 3' UTR must somehow interact with the intergenic regions that correspond to the 5' ends of sgRNAs and that have been previously implicated in regulation of coronavirus transcription. Because no apparent sequence homology or complementarity between the 3' UTR and the intergenic regions of MHV was observed, the putative interaction was proposed to be mediated by proteins (73). Recently, polypyrimidine tract-binding protein (PTB) binding to the minus strand of the 3' UTR was 11 shown to correlate with sgRNA transcription (46). Also, PTB binding caused an RNA conformational change in the minus strand of the 3' UTR which correlated with transcription activity (46). PTB is part of eukaryotic RNA splicing machinery. Interestingly, it was shown that PTB and another splicing factor, hnRNP Al, also bind to other coronavirus transcription regulatory cis-elements (71, 72), suggesting a possibility of interaction of the transcription cis-elements mediated by RNA-protein and protein-protein interactions.

Mechanisms of sgRNA synthesis. Several models have been suggested to describe the mechanism of sgRNA synthesis in positive strand RNA viruses. The most widely recognized and the only unequivocally proven model is internal initiation of plus-strand

SgRNA synthesis by viral replicase on the promoter located on the minus strand template.

This mechanism was first demonstrated in BMV (90) and later in other RNA viruses (131,

140). Another model was suggested for sgRNA synthesis of RCNMV, the virus that requires trans-activation of transcription by base-pairing of elements located on different gRNAs.

This model suggests that transcription occurs by premature termination of minus strand synthesis and subsequent independent repHcation of the minus strand product (124). It was proposed that the base-paring between the two RNA sequences creates a termination strncmre which by itself or with the help of protein factors makes the viral RdRp pause and dissociate releasing newly synthesized minus strand. This minus strand sgRNA then serves as a template for the plus strand sgRNA synthesis.

Two different versions of discontinuous RNA transcription were suggested for

SgRNA synthesis by coronaviruses and arteriviruses. These viruses have extremely large

RNA genomes (15 to over 30 kb) and produce a nested set of seven 3' coterminal sgRNAs

(66). Interestingly, all sgRNAs contain a leader sequence identical to that present at the 5' 12

end of the gRNA, but which is absent in their putative subgenomic promoters. This indicates

that during transcription, sgRNAs must somehow acquire the leader sequence from the 5' end

of gRNA. A reasonable way to explain such a phenomenon is by discontinuous RNA

transcription. A leader-primed transcription model was proposed by Lai and colleagues,

which suggested that viral replicase produces free plus strand leader RNAs by abortive

synthesis (7, 62, 63). These leader molecules then prime (+)sgRNA synthesis at the short

homologous regions in subgenomic promoters on the (-)gRNA. An alternative model

suggested that discontinuous sgRNA transcription occurs during (-) strand RNA synthesis.

Having reached an intergenic region, viral replicase would dissociate and re-attach at the 5'

leader region of the gRNA, thus incorporating the antileader into the (-)sgRNA (116). The (-

) strand sgRNAs then serve as templates for (+)sgRNA synthesis. A large amount of data

has been accumulated that supports both models, but no consensus has been reached on the

mechanism of coronavirus transcription.

Replication of RNA viruses

Because of the abundance of the host RNAs in virus infected cells, viruses evolved

strategies for specific amplification of their gRNA. Among those strategies, the best- characterized is the use of specific cis-acting RNA elements present in various parts of viral

genomes. By their location, these elements can be divided into 5' terminal, 3' terminal, and

internal signals. By binding viral repUcase complex and host factors necessary for viral RNA

rephcation, these signals ensure efficient and specific amplification of viral genomes.

5' terminal replication elements. The most obvious function of the cis-acting

signals located at the 5' termini of viral genomic RNAs is the role of the promoter for the 13 positive strand synthesis from the antigenomic template. Therefore, technically, these elements should be considered at the 3' ends of the negative strands. A number of such signals containing RNA secondary structural elements have been described that act as promoters for (+)gRNA synthesis (14). Unexpectedly, it was demonstrated in several viruses, that the 5' cis-elements required for replication are, in fact, present in the positive strands. Therefore, their requirement for replication could not be explained by their predicted fimction as plus strand promoters. Experiments in poliovirus and BMV clearly demonstrated that RNA structural elements indispensable for virus viability are formed in the

(+)gRNA and not in the (-)gRNA (4). Baltimore and colleagues showed formation of a protein complex around the cloverleaf RNA structure at the 5' end of poliovirus genome (3,

4). Strong stem-loop structure formation at the 5' end of gRNA was demonstrated in BMV

(109). Hall and colleagues suggested that strong secondary structural elements in viral 5'

UTRs, with the help of protein factors, may facilitate release of the otherwise double- stranded 3' end of the (-)gRNA, making it accessible to viral RdRp for initiation of positive strand synthesis (109). In poliovirus, the 5' terminal cloverleaf structure has also been shown to play an important role in viral RNA stability, translation (104), and a regulatory switch between translation and replication (31).

RNA secondary structural elements at the 5' genomic termini have been described in numerous viruses (rev. in (14)), and most of them have been suggested to play a role in viral

RNA replication, which may be similar to those proposed in poliovirus and BMV.

3' terminal replication elements. In order to accomplish their replication cycle, viruses need to synthesize a negative strand gRNA, which serves as a template for positive strand gRNA synthesis and in some cases also for sgRNA synthesis. In most viruses. 14 promoters for (-)gRNA synthesis, also known as origins of replication, are located in the genomic 3' UTRs (14). Specific RNA sequence and higher order structure are among factors that ensure the specificity of viral RNA amplification. An overview of the best-characterized types of 3' elements involved in RNA virus replication is presented in the introduction to chapter 4 of this dissertation.

Internal replication elements. During mutational analyses of various RNA viruses researchers have come across internal regions that are required for replication. Those regions either do not encode any proteins or encode proteins that are not important for replication.

Instead, they contain cis-acting RNA signals that regulate viral RNA amplification. One such region (cis-acting replication element, ere) was discovered in human rhinovirus 14

(HRV14) (79). It was shown that a stable RNA secondary structure (stem-loop) formed in the ere. Maintainance of this stem-loop, regardless of its sequence, was important for replication. The role of ere in RNA synthesis is unclear. It was suggested to possess RdRp binding properties (79). Another internal element important for replication was identified in bacteriophage Q|3 RNA. It is located on the positive strand gRNA and serves as a replicase- binding site for initiation of minus strand synthesis about 1.5 kb downstream. BMV RNA3 has an intergenic replication enhancer (IRE) that increases the rate of minus strand synthesis and gready stabilizes RNA3 (30). A replication enhancer region has been found in TBSV

(112). It is located within the region corresponding to the region IH of the viral DI RNA.

While this element is not absolutely required for viral RNA replication, it greatly enhances synthesis of both positive and negative strands of TBSV gRNA.

General features of RNA virus replication. One of the most common features shared by RNA viruses is that replication of their RNA occurs on cellular membranes (14). 15

Replication complexes are anchored to the membrane lipid bilayer by the transmembrane domains of the proteins that constitute them. One such viral protein, 6 kDa, has been described in tobacco etch potyvirus (117). This integral membrane protein was suggested to interact with the ER, which in the process of viral infection disintegrates and forms virus- induced vesicles, the loci of viral replication. Similar vesicles were observed in cells infected with numerous RNA viruses (14) including BYDV (100). The origins of the membranes that participate in vesicle formation differ and include the endoplasmic reticulum (ER) and compartments of the Golgi system.

As soon as the replication complex has assembled, viral RNA synthesis can initiate.

There are two major mechanisms of transcription initiation employed by RNA viruses: de novo initiation and elongation of a primer. Oligonucleotide primers are required for initiation of DNA synthesis by eukaryotic DNA polymerases during DNA replication.

Similarly, RdRps of picoraaviruses and some other virus groups need a primer for initiation of RNA synthesis. In poUovirus, which is a member of the picomavirus group, a genome-

Unked protein, VPg, serves as a primer for initiation (105). The 22 amino acid peptide (3B) is uridylylated by the viral replicase, and the covalendy attached uridine residues prime RNA synthesis at the 3' poIy(A) tail of the poUovirus genome. Viruses in other groups have been shown or suggested to possess VPg (14). Alternatively, viruses that do not encode VPgs, use de novo RNA synthesis for replication and transcription. Viral RdRps recognize promoters located on the template RNA and initiate transcription without the requirement for a primer.

VPg was not found in BYDV, therefore, the virus probably utilizes de novo initiation for

RNA synthesis. An important feature shared by many viruses is cis-preferential replication. Cis- preferentiality is the inability of a wild type virus to rescue amplification of a replication- defective mutant. Various degrees of cis-preferential replication have been observed in poliovirus (102), turnip yellow mosaic virus (TYMV) (144), BYDV (96), and others.

Several explanations have been proposed for cis-preferentiality. One possibility is mandatory translation of the viral RNA in order to be replicated. Conceivably, a ribosome passage through the genomic RNA might melt out secondary structures inhibitory to replication (102). Also, the inability of the viral replicase or other important proteins to diffuse to other locations in the cell may result in preferential replication of the RNA molecules from which those proteins were translated (102). Cis-preferential replication in

BYDV was suggested based on the inability of the wild type virus to complement a number of replication-deficient mutants (96). On the other hand, bro mo viruses, carmoviruses, tombusviruses, coronaviruses, and others can be successfully replicated in trans. Numerous members of these groups have associated satellite and DI RNAs, which are molecular parasites incapable of independent replication that rely upon helper virus RdRps. Therefore, it was suggested that cis-preferential replication has evolved to guard against amplification of defective genomes (102). Only intact wild type RNAs capable of translating functional proteins would be replicated in the cis-preferendal manner. Cis-preferential replication is not possible in multipartite viruses where viral replicase is encoded by one RNA component but is responsible for amplification of all components.

Regulation of viral life cycle is essential for successful replication and pathogenesis.

In the infected host cell, numerous factors compete for the access to viral RNA: ribosomes and other elements of translation machinery, viral replicase, coat protein, movement protein. 17 etc. A regulatory mechanism is required to ensure that viral genome is first translated, then replicated and transcribed, and subsequently moved to adjacent cells or encapsidated (or both). Translation and replication of viral RNA are two processes incompatible on the same molecule at the same time. Translation is accomplished by ribosomes moving 5' to 3', whereas (-)gRNA synthesis is performed by viral RdRps moving in the 3' to 5' direction. An elegant study by Andino and colleagues demonstrated how this potential head-on collision of

RdRp and ribosomes is avoided in polio virus (31). Binding of the cellular poly (rC) binding protein (PCBP) to the cloverleaf structure at the 5' end of viral RNA stimulates translation.

When a certain amount of the viral polyprotein is accumulated and processed, viral 3CD protein (which contains the RdRp, 3Dpol) binds to a different region of the cloverleaf and mediates dissociation of PCBP (which leads to translation repression) and formation of the replicase complex that subsequently initiates minus strand synthesis at the 3' end. Therefore, binding of either PCBP or 3CD determines whether poliovims genome is translated or used as a template for minus strand synthesis (31).

PCBP is one of the cellular proteins that participate in replication of RNA viruses.

Due to the limited amount of genetic information a viral genome can accommodate, numerous viruses have evolved means to subvert the cellular machinery to serve their needs.

The use of the translational machinery by viruses was relatively obvious, due to the requirement of numerous factors and complex structures like ribosomes for the translation process to occur. However, involvement of cellular factors in replication was not necessarily expected because viruses had been known to encode their own RNA polymerases.

Numerous cellular proteins have been described that bind transcription regiilatory regions of various RNA viruses (64). Those include elements of the RNA splicing machinery 18

(hnRNPs), translation factors. La antigen, actin, tubulin, and others (64). Importance of cellular factors for virus RNA replication was also demonstrated by isolation of yeast mutants, that belong to several complementation groups, which were unable to support replication of BMV RNA (47).

It was suggested that cellular factors required for virus replication might bind either to the viral replicase or directly to the RNA recognition elements and then recruit the replicase to the template. Involvement of cellular factors in virus replication by binding to cis-acting regulatory regions is reminiscent of the DNA-dependent transcription in eukaryotes and suggests conservation of the most basic features of RNA synthesis (64).

Dissertation organization

Chapters 2-4 of this dissertation contain three manuscripts. The manuscript in chapter 2 is entitled "Primary and secondary structural elements required for synthesis of barley yeUow dwarf virus subgenomic RNAl". This paper was published in the Journal of

Virology in 1999 (J. Virol., 73:2876-2885). It is co-authored by B.R. Mohan and W. Allen

Miller. Dr. Mohan's contribution to this manuscript is the experiment illustrated in Fig. 2.

Involved in the project at its early stages, he also performed experiments that were not included in the manuscript but which provided valuable preliminary data. The manuscript in chapter 3, "Barley yellow dwarf virus utilizes very different promoters for transcription of its subgenomic RNAs," is prepared for submission to the Journal of Virology. It is also co- authored by W. AUen Miller, as is the third manuscript included in chapter 4. Its tide is

"Characterization of the barley yellow dwarf virus 3' origin of replication," and it is prepared for submission to the journal Virology. As my major professor. Dr. Miller contributed to the 19 original design of die experiments in all three manuscripts and to the interpretation of results.

Dr. Miller played the major role in writing the Discussion in chapter 2, he closely edited chapter 3 and made minor grammar corrections in chapter 4. In addition to the manuscripts, the dissertation contains the General Introduction and General Conclusions chapters

(chapters 1 and 5, respectively). References cited in all five chapters are placed at the end of the dissertation. 20

CHAPTER 2. PRIMARY AND SECONDARY STRUCTURAL ELEMENTS

REQUIRED FOR SYNTHESIS OF BARLEY YELLOW DWARF VIRUS

SUBGENOMIC RNAl

A paper published in the Journal of Virology 73:2876-2885 (1999)

Guennadi Koev, B.R. Mohan, and W. Allen Miller

Abstract

Barley yellow dwarf luteovirus (BYDV) generates three 3'-cotenninal subgenomic

RNAs (sgRNAs) in infected cells. The promoter of sgRNAl is a putative hot spot for RNA recombination in luteovirus evolution. The sgRNAl transcription start site was mapped previously to either nucleotides 2670 or 2769 of BYDV genomic RNA (gRNA) in two independent studies. Our data support the former initiation site. The boundaries of the

SgRNAl promoter map between nucleotides 2595 and 2692 on genomic RNA. Computer prediction, phylogenetic comparison, and structural probing revealed two stem-loops (SLl and SL2) in the sgRNAl promoter region on the negative strand. Promoter function was analyzed by inoculating protoplasts with a full-length infectious clone of the BYDV genome containing mutations in the sgRNA promoter. Because the promoter is located in an essential coding region of the replicase gene, we duplicated it in a nonessential part of the genome from which a new sgRNA was expressed. Mutational analysis revealed that secondary stmcture, but not the nucleotide sequence was important at the base of SLl.

Regions with both RNA primary and secondary structural features that contributed to 21 transcription initiation were found at the top of SLl. Primary sequence, but not the secondary structure was required in SL2 which includes the initiation site. Disruption of base-pairing near the sgRNAl start site increased the level of transcription 3 to 4-fold. We propose that both primary and secondary structure of the sgRNAl promoter of BYDV play unique roles in sgRNAl promoter recognition and transcription initiation.

Introduction

Many positive strand RNA viruses express genes via RNA-templated transcription of subgenomic mRNAs. Several mechanisms of subgenomic RNA (sgRNA) synthesis have been proposed for various viruses, including internal initiation of the replicase at subgenomic promoters, premature termination of negative-sense RNA synthesis with subsequent independent replication, 5' leader-primed synthesis, and RNA recombination (reviewed in

(66, 87). So far, only internal initiation on the negative strand template has been demonstrated as a mechanism of sgRNA synthesis in plant RNA viruses (32, 90, 140), although premature termination during negative strand synthesis has been suggested as an alternative mechanism (87, 124). Despite the lack of direct evidence for internal initiation of transcription in most viruses, we will adhere to convention and refer to cw-elements responsible for synthesis of sgRNAs as subgenomic RNA promoters.

Boundaries of sgRNA promoters have been determined in several RNA viruses in vivo (6, 10, 29, 52, 70, 132, 140, 149). Their sizes vary from 24 nucleotides in Sindbis vims

(70) and 27 nucleotides in cucumber necrosis virus (CNV) (52) to nearly 100 nt in turnip crinkle virus (TCV) (140) and over 100 nt in beet necrotic yellow vein virus (BNYW) (6). 22

With the exception of BNYVY (6), the larger portions of subgenomic promoters are located upstream of the transcription initiation site (in the positive sense).

In vitro analysis of the sgRNA4 promoter of brome mosaic virus (BMV) and its comparison with other alpha-like virus subgenomic promoters revealed 4 major structural elements: a core promoter, an AU tract downstream of the initiation site, an oUgo A tract and an enhancer element located upstream of the core (76). However, wild t5^e levels of RNA4 synthesis in vivo require an additional upstream element that contains repeats of sequences from the core (29). The core promoter of BMV RNA4 has been characterized extensively in vitro, revealing sequence requirements for transcription initiation and suggesting that the primary and not the secondary structure of RNA is critical for specific and accurate initiation, much like in DNA-dependent RNA polymerase promoters (122). However, RNA secondary and tertiary structures play important roles in various processes of the virus Ufe cycle including RNA replication, recombination, translation, and others. These processes involve

RNA-protein interactions (4, 16, 22, 99, 109). The combination of primary and secondary

RNA structure requirements is common in control of vims replication (68, 104, 109, 113).

Therefore, we set out to determine the role of RNA structure in sgRNA promoter function.

The object of this study is the PAV strain of barley yellow dwarf virus (B YDV) which belongs to the genus Luteovirus (formerly called subgroup I) of the family

Luteoviridae (formerly the luteovirus group) (85). BYDV has a positive sense, 5.7 kb genomic RNA (gRNA). In the infected cell, three 3'-coterminal sgRNAs are produced (24,

53). They are not encapsidated. All three sgRNAs play different roles in virus replication.

SgRNAl is the mRNA for the coat protein, a readthrough extension of the coat protein involved in aphid transmission, and a 17 kDa protein required for plant systemic infection (18). SgRNA2 codes for a small 4.3-7.0 kDa peptide that is dispensable for virus replication in protoplasts (96). SgRNA2 may also regulate translation from gRNA and sgRNAl. The role of sgRNA3 which lacks ORFs is unclear (87, 92).

Luteoviridae are split into two major genera, Luteoviriis and Polerovirus, based on differences in the 5' halves of their genomes (85). The genes in these regions, including the

RNA-dependent RNA polymerase (RdRp), are unrelated between the genera (89). The border between divergent and homologous regions is located between ORFs 2 and 3 (RdRp and coat protein genes) (89). This region also includes the 5' end of sgRNAl. Based on this observation, we proposed that recombination has occurred during luteovirus evolution by replicase strand switching at the subgenomic promoters (89, 91).

As a first step in testing the recombination model we have begun mapping the sgRNA

promoters of BYDV. In this smdy, we mapped the primary and secondary structures required for sgRNAl synthesis in detail. We show that both primary and secondary RNA stmcture play unique roles in promoter recognition by viral replicase in vivo.

Materials and methods

Plasmids. pPAV6 is a fiill-length cDNA clone of BYDV-PAV described in (23). pGK-1 was constructed by cloning the Ava I (2456)-5'5'p I (2737) fragment of pPAV6 into the pGEM-3Z digested with Ava I and Ssp I. Mutants pKel-6, pKel-f, and p2670M were constructed by two-step PGR (67). To make pKel-6 and pKel-f, in the first round of PGR we

used an upstream mutagenic primer, 5'-

GCCCAACTCCAGTC[G/C/A]GT[T/C]AAAGTGACGACTCCACAT-3', (altered bases in

bold) spanning bases 2655-2690 and a downstream primer (5'-CTGAATTCGTTCACCACC- 3') complementary to bases 2867-2850. For 2670M we used an upstream mutagenic primer

(5'-CCAGAGTCTGAAGGTGACGACT-3') complementary to bases 2663-2684 and the

above downstream primer. The product of the first round was gel-purified and used in the

second round of PGR as the downstream primer with the upstream primer RB1100 (5'-

TGGCTCTTGCACTTGAAC-3') spanning bases 1927-1945. The resulting PGR product

was digested with 5jfl 107 I and Tthl 111 and cloned into pPAV6 cut with 5jrl 107 I and

TthWl I. Mutants with the duplicated sgRNAl and sgRNA3 promoters were constructed by

PGR amplification of the promoter region with the primers listed in Table 1, contciining

flanking Kpn I restriction sites, and cloning Kpn I-digested PGR products into the unique

Kpn I site of pPAV6 (4154). For the sgRNAl promoter secondary strucmre probing,

pT7SGPl was constructed by amplifying a region of pPAV-6 between nts. 2595-2716 with

primers 2595 (Table 1) and T7SGP1

(GCGGAMTCrAArACGACTCACrArAGGGATGGAAAGCAGTATTGATT, Eco RI site

underlined, T7 promoter italicized) which contained flanking Kpn I and Eco RI restriction

sites, respectively, and cloning the PGR product into the plasmid pUGl 18/1180 described in

(23).

Infection of protoplasts and northern blot analysis. Infectious RNA transcripts

were obtained by transcripdon in vitro of plasmids linearized with Sma I, by using T7 RNA

polymerase (RiboMax kit, Promega, Madison, WI). Oat protoplasts were prepared and

inoculated with 10 |ig of RNA as described in (24). Total RNA was extracted from

inoculated protoplasts by using the QIAGEN (Los Angeles, CA) RNeasy plant RNA

isolation kit. RNA (5-10 [ig) was analyzed by northem blot hybridization essentially as

described in (119). A probe complementary to the 3' terminal 1.5 kb of the PAV genome was used to detect viral genomic and subgenomic RNA accumulation. This probe was obtained by in vitro transcription of the plasmid pSPlO (24) linearized with Hind IE using T7

RNA polymerase. Membranes hybridized with this probe were exposed to Phosphorimager screens for 1-2 days. The subgenomic promoter activity was quantified as the ratio of the

SgRNAlA signal intensity to that of the gRNA. Because of the significant accumulation of the lower molecular mass RNA due to RNA degradation, the value of the sgRNAl A signal was determined by subtracting the background value (region under the sgRNAlA band) from the SgRNAlA signal. All mutants were evaluated in 2 to 5 separate experiments.

RNA structure analysis. pT7SGPl was linearized with Kpn I, prior to transcription with T7 RNA. Transcripts were 5' end-labeled using [7-^~P]ATP as described in (28, 137).

RNA was purified using denaturing 5% polyacrylamide, 8M urea gel electrophoresis.

Structural probing with imidazole was performed in 0.04 mM NaCl 1 mM EDTA 10 mM

MgCU with 0 M, 0.8 M, and 1.6 M imidazole for 17 and 22 hrs at 25°C as described in (28,

137). Partial digestion with T1 ribonuclease was done as described in (93). Reaction products were separated by using denaturing 6% polyacrylamide, 8M urea, gel electrophoresis. The gels were dried and exposed to Phoshorimager screens for 1-3 days, and visualized with a STORM 840 Phosphorimager (Molecular Dynamics, Sunnyvale, CA). The

U2 and T1 RNA sequencing ladders were generated as described in (28, 137).

Results

Re-examining the 5' end of sgRNAl. The 5' end of sgRNAl of BYDV-PAV was mapped to position 2769 of gRNA by Dinesh-Kumar et al. (24), and to position 2670 by

Kelly et al. (53). The 99 nt discrepancy has been attributed to the two different isolates of B YDV-PAV used in these studies: Australia and Illinois isolates were used by Kelly et al., and Dinesh-Kumar et al., respectively. However, the high homology shared by the two isolates as well as a conserved hexanucleotide GUGAAG present at the 5' ends of the gRNA and sgRNAs 1 and 2 revealed in the study by Kelly et al. (53), prompted us to revisit this issue. We constructed a probe, pGK-1, that is complementary to the region of BYDV gRNA

(bases 2456-2737) that spans the sgRNAl start site as mapped by KeUy et al. (2670), but should not detect sgRNAl as mapped by Dinesh-Kumar et al. (2769, Fig. 1). This probe hybridized with sgRNAl of BYDV-PAV-IUinois and an in vitro transcript that contains the

5' end at 2670. It did not detect an in vitro transcript with the 5' end at 2769. Therefore, the

5' extremity of sgRNAl of this isolate is in fact well upstream of 2769, and consistent with the 5' end at 2670 in BYDV-PAV-Australia (Fig. 1). Promoter mapping described herein supports an initiation site at or near 2670.

Mutations near the sgRNAl start site affect transcription. The sgRNAl initiation site (2670) is located within ORF 2 which encodes viral RdRp (Fig. 1), which is required for replication (96). To examine the possibility of using deletion mutagenesis to map the subgenomic promoter, we introduced a stop codon at position 2650 of the infectious clone, pPAV6, which truncated ORF2 by thirty 3' terminal codons. No replication of this mutant transcript in oat protoplasts was detected by northern blot analysis (data not shown). Thus, deletion mapping of the subgenomic promoter in this region was not possible.

To determine the importance of individual nucleotides around the start site for sgRNA synthesis, we introduced point mutations in this region of pPAV6. Two mutants with five base changes around the start site replicated but synthesized no sgRNAl (Fig. 2). One had a Val-> Asp amino acid change, the other had no amino acid changes (Fig. 2). To test the 27 importance of the G at the initiation site, it alone was mutated to a C in mutant 2670M. This resulted in an unavoidable amino acid substitution (Val->Leu). This mutant did not replicate at all (Fig. 2). This is sxarprising considering that the more radical change of Val-> Asp at this site did not knock out replication. Thus, either the RdRp tolerated Asp but not Leu at amino acid number 825, or the particular base that coincides with the 5' end of sgRNAl is essential for gRNA replication. Mutants Kel-6 and Kel-f demonstrated the sensitivity of sgRNAl transcription to changes in the conserved hexanucleotide GUGAAG and immediately upstream of the initiation site. Mutant 2670M further emphasized the difficulty of the promoter characterization due to potential undesired alterations in RdRp function.

Mapping the boundaries of the sgRNAl promoter. To allow more mutagenesis of the SgRNAl promoter, we moved a copy of it to a nonessential portion of the genome. This duplication of subgenomic promoters and synthesis of sgRNAs from ectopic locations has been demonstrated in several RNA vimses (6, 10,29, 69, 132, 140). We duplicated a 314- nucleotide region flanking the putative sgRNAl initiation site (nts. 2503-2816) and introduced it into a unique Kpn I site in ORF 5 (position 4154, Fig. 3A). This ORF is not required for virus replication in protoplasts (96). As expected, the promoter duplication resulted in the expression of an additional 1.7 kb sgRNA (sgRNAlA, Fig. 3B). This caused a dramatic drop in accumulation of sgRNAl from its natural setting and also reduced gRNA levels. This phenomenon of reduced synthesis of sgRNAs from upstream promoters when additional promoters are inserted downstream has been observed with other RNA viruses

(29, 140). Reduced gRNA accumulation is likely due to lack of coat protein synthesis (96) caused by reduced levels of its mRNA (sgRNAl). This ectopic expression of sgRNAl set 28

the stage for the deletion mapping and detailed characterization of the sgRNAl promoter

without interfering with the RdRp coding region.

We tested a set of mutants containing various portions of the 314-nucleotide promoter

region for sgRNAl A synthesis in oat protoplasts. The smallest construct capable of directing

SgRNAl A transcription, 2SL, consisted of the 98-nt RNA sequence from bases 2595 to 2692.

Three smaller constructs I (nts. 26II-2692), J (nts. 2595-2679), and K (nts. 26U-2679), were incapable of producing the artificial sgRNAl A (Fig. 3C). The 2SL construct did not cause such a large reduction in sgRNAl and gRNA accumulation as did construct PAVSGl A

perhaps because it was not as strong a promoter as the 314 nt insert. Furthermore, the 314 nt duplication in PAVSGl A may have given a gRNA too large to be encapsidated, while the smaller overall size of 2SL gRNA (98 nt duplication) may have permitted encapsidation.

Secondary structure prediction for the sgRNAl promoter. To explore the

possible role of RNA secondary structure in sgRNAl synthesis, we analyzed the 98-nt

promoter sequence for potential RNA folding patterns using the MFOLD program (151).

Because sgRNA synthesis has been shown to occur by internal initiation of transcription on

the negative strand template in other viruses, we used the complement of the mapped subgenomic promoter region for the secondary structure predictions. Most of the suboptimal

folding patterns contained two stem-loop structures (SLl and SL2). There were two major

variations of the SLl folding: stmcture I and structure II (Fig. 4A). To establish which one is

more likely to exist, we compared the sgRNAl promoter regions of other BYDV isolates and

the related soybean dwarf virus (SbDV). The BYDV sequences were too highly conserved

to shed light on the secondary stmcture, while SbDV diverged significantly (Fig. 4B). The

predicted structure of the SbDV subgenomic promoter region resembled only that of BYDV- 29

PAV structure n. Sequence co-variations found in SbDV revealed four sites at which base changes retained base pairing (boxed base pairs. Fig. 4A) in structure H. Thus, structure 11 is more likely to exist in BYDV, even though it is calculated to be slightly less stable than structure I (Fig. 4A).

Comparison of the sgRNAl promoter with those of sgRNAs 2 and 3 and the 3' end of the negative strand of gRNA did not show any significant sequence homology except for the earlier identified hexanucleotide 3'-CACUUC-5' in sgRNAs 1 and 2 and gRNA (5'-

GUGAAG-3' in the positive sense). RNA secondary structure predictions of the negative strand RNA in those regions revealed a hairpin with the initiation site in its stem similar to

SL2 (Fig. 4C). No structure like SLl was predicted in the other sgRNA promoter regions.

This suggested that SL2-like structure may be a common structural element of RdRp recognition regions.

Nuclease probing of the sgRNAl promoter secondary structure. To test the existence of the two computer-predicted stem-loops in the sgRNAl promoter, we constructed a plasmid, pTTSGPl, which contained a region of the negative strand gRNA spaiming the minimal sgRNAl promoter (nts. 2595-2716) cloned behind the bacteriophage T7 promoter.

For RNA secondary structure analysis, we used 5' end-labeled in vitro transcripts of the

SgRNAl promoter obtained from Kpn I-linearized pTTSGPl. To detect double- and single- stranded RNA regions within the subgenomic promoter, we performed partial digests of the

5' end-labeled transcripts with RNase T1 (cuts single stranded G's) and imidazole, a chemical ribonuclease that cleaves RNA at all single-stranded bases. Imidazole has been used to resolve secondary structure of various RNAs (28, 137). 30

Both the T1 and the imidazole analyses identified most of the single-stranded RNA regions corresponding to those predicted by computer (Fig. 5A). The lower part of SL2 was sensitive to nuclease, suggesting the existence of a single-stranded junction between the two major stem-loop domains. The upper and the lower regions of SLl were well protected from both T1 and imidazole digestion implying stable RNA helices. Highly protected

Q&48/Q2W9/A2650 and U2602/G2603/G2604 appeared paired to each other, whereas both G2605 and

G2647 were nuclease-sensitive and consistent with the bulge in structure E (Figs. 4A, 5A).

Therefore, despite the lower predicted stability, structure II appears more likely to form in solution than structure I.

The middle portion of SLl exhibited ambiguous base-pairing. Both the base-paired and the single-stranded conformations may coexist in dynamic equilibrium ("breathing"), reflecting the weak base-pairing of the AU-rich AUUCU:AGAAU helix. Based on their nuclease sensitivity, both terminal loops of SLl and SL2 seemed well-defined and consistent with the computer prediction (Fig. 5A). The ribonuclease probing analysis superimposed on the phylogenetically conserved, computer-generated secondary structure allowed us to propose the solution structure of the sgRNAl promoter (Fig. 5B).

Primary and secondary RNA structure of SLl are required for transcription of sgRNAl. To test the involvement of primary and secondary RNA structure elements of SLl in sgRNA synthesis, we introduced a series of mutations into the duplicated sgRNAl promoter. Oat protoplasts were inoculated with the mutants, and total RNA was isolated 24 hpi and analyzed by northern blot hybridization. We used the ratio of steady-state levels of sgRNAlA:gRNA as a measure of the promoter activity. 31

To examine the role of the helix at the base of SLl, we introduced nucleotide alterations (blocks of 4 nucleotides) in both strands of the helix that disrupted and restored the base-pairing (mutants SLllA, SLl IB, and SLllC, Fig. 6A). Transcription of sgRNAlA was eliminated when the base-pairing was disrupted in mutants SLIIA

(sgRNAlA:gRNA=0.06+/-0.02) and SLl IB (0.02+/-0.01), and was restored to a level higher than wild type (construct 2SL, 0.5I+/-0.08, Fig. 6B) in the compensatory mutant SLl IC

(1.564-/-0.25, Fig. 6B). These results indicated that the secondary structure, and not the nucleotide sequence at the bottom of SLl, is required for transcription.

To test the role of the upper part of SLl, we introduced similar mutations to the upper helix (SL12A, SL12B, and SL12C, Fig. 6A). Both mutations SL12A and SL12B that disrupted the base-pairing exhibited low levels of sgRNAlA accumulation (0.12+/-0.05 and

0.11+/-0.02, respectively. Fig 6B). The compensatory mutant SL12C failed to restore the promoter activity (0.05+/-0.02, Fig. 6B), indicating that specific nucleotide sequences on both sides of the helix, (and possibly RNA secondary structure as well) are important in this region.

To examine the role of the single-stranded and ambiguous regions of SLl in

SgRNAlA synthesis, mutant SLID was constructed with two sequence tracts deleted; Uigoi"

A2620 and C,64i-G2657 (Fig- 6A). Surprisingly, this mutant produced low, but significant levels of SgRNAlA (0.10+/-0.01, Fig. 6B), indicating that the middle portion of SLl could be deleted while still retaining a low level of transcription. Deletion of the bulged U2635 (SLIU,

Fig. 6A) reduced transcription (0.19+/-0.04, Fig 6B), indicating its importance, but not an absolute requirement for the promoter activity. Changing the sequence of the terminal loop of SLl to its complement (Fig. 6A) yielded low levels of sgRNAlA (mutant SL13, 0.10+/- 32

0.03, Fig. 6B). Thus, the specific sequence of the terminal loop is also important for transcription.

Nucleotide sequence but not the secondary structure of SL2 is required for transcription of sgRNAl. We next characterized the sequence and structural elements of

SL2 that are involved in sgRNA synthesis. As reported in initial mutants, sequence alterations within five bases of the transcription start site in its natural location in ORF2, knocked out sgRNAl synthesis (Fig. 2). Mutations in Kel-6, Kel-f, and 2670M were predicted to disrupt secondary structure of SL2. To separate the influence of the nucleotide sequence alteration from the RNA secondary structure disruption, we designed a series of

SgRNA I promoter mutants that disrupted and restored the secondary structure of SL2 in the duplicated promoter. Three nucleotides near the start site were altered to disrupt base-pairing in either strand of SL2 in mutants trpll and trpl2 (Fig. 7A). The mutant trplc contained both of the above sets of substimtions to restore the structure of SL2 (Fig. 7A). No sgRNA 1A synthesis was detected in trpll (0.01+/-0.02) and trplc (0.0+/-0.02) both of which are mutated in the conserved hexanucleotide at the start of sgRNAl (Fig. 7B). However, trpl2 exhibited a level of transcription three times higher than wild type (1.60+/-0.02). Thus, the primary structure adjacent to the start is required for sgRNA synthesis (Fig. 7B), and SL2 secondary structure may acmally inhibit transcription.

In order to examine the role of the initiating C (Gigvo the positive strand), we changed it to a G (mutant 2670G, Fig. 7A). We also weakened the bottom of SL2 by altering the complementary G to a C (2688C, Fig. 7A). In the double mutant, compl, the base- pairing was restored while keeping the primary structure altered (Fig. 7A). The results were similar to those in the previous experiment: no transcription in 2670G (-0.02+/-0.01) and 33 compl (-0.16+/-0.21), and almost four times higher than wild type level of transcription in

2688C (L95+/-0.32) (Fig. 7B). Finally, altering the sequence of the SL2 terminal loop to its complement in the mutant SL21 (Fig. 7A) reduced the sgRNAlA level to 0.11+/-0.01 (Fig.

7B), indicating the importance, but not the absolute requirement, of the nucleotide sequence of the SL2 terminal loop. These results stressed the importance of the primary RNA sequence and negative effect of the secondary structure at the initiation site.

In pursuit of the nucleotides in SL2 that could be altered without affecting transcription, we replaced the 5'-terminal conserved, hexanucleotide of sgRNAl (GUGAAG in the positive strand) with the 5' terminus of sgRNA3 (GACGAC in the positive strand).

Interestingly, this mutant, SL2SG3 (Fig. 7A), did not produce detectable levels of sgRNAl A

(0.04-/-0.02, Fig. 7B). We also constructed a mutant, SGP300, with die duplicated putative sgRNA3 promoter region (301 nts, 5150-5450) sparming sgRNA3 start site (nt 5348) inserted in the Kpn I site in ORF 5 in order to test if this promoter could function outside its wild-type location. The mutant produced sgRNAl A (0.72+/-0.04, Fig. 7B) which indicated that the sgRNA3-specific 5' terminal hexanucleotide could function in the context of its own promoter. This shows that the replicase recognizes very different promoters.

Discussion

Re-evaluation of the 5' end of sgRNAl. Here we show that the 5' end of sgRNAl is most likely at position 2670 as reported by Kelly et al (53) and not at our originally reported site of 2769 (24). The error may have occurred due to a variety of technical difficulties, involving mismatches between the probe (from B YDV-PAV Australia) and the viral RNA (B YDV-PAV Illinois), and the unexpectedly far upstream location of the 5' end. 34

The difficulty of mapping 5' ends of Luteoviridae subgenomic RNAs is further indicated by discrepancies reported in potato leafroll virus (PLRV). Initially, the 5' end of sgRNAI of

PLRV was reported to be only 40 nt upstream of the coat protein start codon (126).

However, subsequent analysis mapped it to 212 nt upstream at a region which, like B YDV

(nts. 2670-2675), shows homology to the 5' end of the genomic RNA (83). Phylogenetic comparisons support this latter start site for other Luteoviridae (89).

Structure and function of sgRNAl. Mutations in the subgenomic promoter unveiled roles for three types of structures: (i) one in which the secondary and not the primary structure is important (base of SLl), (ii) an ambiguous region where the primary and possibly secondary structure both may influence the transcription efficiency (upper stem and loop of SLl), and (iii) a region where the primary and not the secondary structure is required

(stem of SL2). The nuclease probing and phylogenetic analysis support the existence of the base of SLl. Stem-loops have been predicted in sgRNA promoters of other RNA viruses

(76, 140, 149), with the initiation site usually located within a single-stranded region. To our knowledge, our data represent the first actual demonstration of a requirement for a specific helical domain in a viral sgRNA promoter. The requirement for primary and not secondary structure at the sgRNAl initiation site result is consistent with other studies showing the role of single-stranded regions for specific protein recognition (5, 99, 104).

The ambiguous results of the stmcture probing of the distal portion of SLl lead us to propose that this portion of the subgenomic promoter forms metastable structures.

Alternative conformations have been demonstrated for other viral RNAs (93, 110), therefore a portion of molecules may also fold as predicted in structure I, or the entire

AUUCUrAGAAU stem of structure It may be unpaired, giving a very large bulge between steins at the top and base of SLl. The 36-base deletion mutant (SLID) lacking this ambiguously paired region still retained 20% of wild-type promoter activity. This indicates that the deleted region probably does not contact the viral replicase directly, but may provide favorable spatial localization of the essential elements.

Recognition of the sgRNAl promoter by replicase. We propose that SLl acts as a replicase recognition site, placing the replicase in proximity to the start site at the base complementary to 2670. The double-stranded proximal end (bottom) of SLl may serve only to provide the structural foundation that ensures that the sequence at the distal end (top) of

SLl (Fig. 6) is presented in the proper orientation for specific binding by the replicase. The ambiguously structured, nonessential, central portion of SLl may play only an auxiliary role in spacing between the distal region and the initiation site.

This model of a separate RNA binding site and adjacent initiation site resembles those proposed for a recombination site in turnip crinkle virus (TCV) satellite RNAs, subgenomic promoter recognition by BMV replicase (122), and recognition of bacteriophage

QP RNA by its replicase (13). The TCV satellite RNA sequence includes a bulged stem-loop as the putative repUcase binding site adjacent to the actual site at which RNA synthesis takes place (99). The well-characterized BMV promoter differs by its lack of requirement for the secondary structure in replicase recognition (122). However, this difference may be explained in part by the use of a cell-free transcription system which may have less stringent requirements for cis-elements in vitro than those observed in vivo (29). The divergent sequences of the three BYDV promoters may allow differential expression of each. Each sgRNA promoter may have a separate recognition site on the replicase holoenzyme, each of which may be on a separate protein factor as shown for (-h) and (-) strand recognition by Q(3 36 replicase (13), or the sequences may have different affinities for the same site on the replicase.

Possible alternative mechanisms of sgRNAl synthesis. Our data leave open the possibility that sgRNAs could be synthesized by premature termination during (-) strand synthesis on genomic template followed by independent replication of the sgRNA (87).

Evidence suggesting such a mechanism has been provided for coronaviruses (17) and dianthoviruses (124). Upstream of the sgRNAl 5' end, stem-loops complementary to SLl and SL2 are predicted to exist in the (+) strand using MFOLD. The mutations that support a role for the heUx at the base of SLl in (-) strand could equally support the role of the complementary helix in (+) strand. Such a structure in the (+) strand could inhibit replicase migration along the template, favoring termination. The resulting 3' end of the truncated (-) strand would resemble that of full-length (-) strand owing to the CACUUC homology, allowing it to be recognized and replicated by the replicase. Thus, that essential sequence would still be serving the promoter function.

The premature termination model has been invoked for red clover necrotic mosaic dianthovirus (RCNMV), owing to a remarkable base-pairing between the (+) strands of the two genomic RNAs of the virus that is essential for formation of sgRNA from genomic

RNAl (124). The polymerase of RCNMV is closely related to that of B YDV, so a similar replication mechanism might apply to BYDV. Because BYDV has only one genomic RNA, the termination structure would form either as the complement of SL1 as discussed above, or intermolecularly in which two genomic RNA molecules dimerize by base-pairing at the complementary sequences that would otherwise form the stem-loop. Alternatively, it is 37 possible that other inter- or intramolecular interactions could generate a transcription termination structure.

Some observations argue against a premature termination model. The disruption of the secondary structure at the initiation site (SLl) that increases sgRNAl A synthesis (Fig. 7) should have had the opposite effect if sgRNAlA was synthesized by premature termination during the negative strand synthesis because stable stems should increase termination (107).

Furthermore, the role of SLl is more than just providing a stem-loop structure, that would (in the (+) strand) block replicase migration, because mutations at the distal end of SLl reduced

SgRNAlA accumulation iadependent of their effect on secondary structure.

Role in recombination. The presence of a sgRNA promoter at the 3' end of ORF2 is consistent with our proposed model that Luteoviridae genomes recombine at a sub genomic promoter in the vicinity of the intergenic region between ORFs 2 and 3 (89). This model requires that sgRNAs are generated by intemal initiation of the replicase on the (-) strand. A premature termination mechanism, followed by independent replication of the sgRNA, is difficult to reconcile with the sgRNA promoter being a recombination hot spot, if a stem- loop in the (+) strand facilitates termination. However, if base-pairing between genomic

RNAs occurs as with RCNMV, one could imagine replicase occasionally switching genomic

RNA strands, rather than terminating at this base-paired region. This type of recombination, mediated by base-pairing between template strands, has been demonstrated in brome mosaic virus (98).

Evolution of sgRNA promoters. The small sizes of viral genomes often requires the overlapping of protein coding regions with c/j-acting RNA elements. It is an intriguing question, how genetic information coding for a protein and an RNA cw-element with which 38 it interacts co-evolved on the same region of viral genome. 0RF2 is very sensitive to deletions and point mutations (Fig. 2), while the sgRNA promoters tolerate changes and consist of quite diverse sequences. Thus, we propose that sgRNA promoters evolved independently at the appropriate genomic locations while allowing overlapping ORPs to maintain their function. The size of the BYDV sgRNA1 promoter is comparable to that in other RNA viruses, but no apparent sequence homology can be found with subgenomic promoters of members of other virus groups. This diversity among sgRNA promoters of related virus taxa and ability to tolerate movement to different regions of the genome (6, 10,

29, 69, 132, 140) further supports the hypothesis of multiple, independent origins of sgRNA promoters. The only conserved secondary structures among the BYDV promoters are the

SL2-like hairpins flanking all initiation sites (Fig. 4C), yet the SL2 stem structure inhibits

SgRNA 1 transcription (Fig. 7). Perhaps the SL2-like stems serve as negative regulatory elements to prevent too much transcription of the sgRNAs at inappropriate stages in RNA replication. Obviously much additional research is necessary to unveil this complex interplay of replicase-RNA interactions.

Acknowledgments

The authors thank Brice Felden for advice on RNA structure probing methods, the

USDA Biotechnology Risk Assessment Research Grants Program (no. 94-39210-0531) and the USDA National Research Initiative (grant no. 98-35303-6447) for funding. This is paper

No. J-18114 of the Iowa State University Agricultural and Home Economics Experiment

Station Project 3545, and supported by Hatch Act and State of Iowa funds. 39

Figure legends

Fig. 1. Mapping the 5' end of sgRNAl of BYDV. The upper part of the figure shows the genome organization of BYDV. Boxes represent ORPs (1-6) with the si2es of protein products indicated in kilodaltons (K). Bold horizontal lines represent genomic

(gRNA) and subgenomic (sgRNA) RNAs. The lower part of the figure shows the putative

SgRNAl promoter region with the reported initiation sites indicated by right-angled arrows.

SgRNA transcripts (which extend to the 3' end of the genome) that represent sgRNAl's with

5' ends determined by KeUy et al. (53) (pSGl) and by Dinesh-Kumar et al. (24) (pSP-18) are indicated by bold lines below map. The probe for the northem blot hybridization is complementary to the region of gRNA between nucleotides 2456 and 2737 (pGK-1, dashed line). Northem blot contains total RNA from infected plants (lane 1) or the indicated in vitro transcripts.

Fig. 2. Effect of point mutations around the sgRNAl start site. Northem blot analysis showing gRNA and sgRNAl accumulation in protoplasts inoculated with mutant full-length genomic transcripts. Total RNA from oat protoplasts (24 hpi) was hybridized with a ^'P-labeled transcript complementary to bases 2737-2985 of BYDV gRNA. Each mutant was analyzed in duplicate. Altered nucleotides and amino acids are in bold. Arrows above each sequence show the sgRNAl transcription start site identified by Kelly et al. (53)

(position 2670).

Fig. 3. Ectopic expression of the sgRNAl promoter. (A) Map of the constmct

PAVSGl A that contains a dupUcated subgenomic promoter region (gray box) inserted in the unique Kpn I site of PAV6. The Kpn I site was duplicated in the cloning process. Dashed line represents the expected artificial sgRNAl A produced from the duplicated promoter. (B) Northern blot shows viral RNAs from protoplasts (24 hpi) inoculated with PAV6 (lanes 2,3) and PAVSGIA which has the 314 nt region expected to contain the sgRNAl promoter duplicated in the Kpn I site (sgRNAlA, lanes 4 and 5). Uninf., uninoculated protoplasts

(lane I). RNA degradation products formed a band just below the position of the 18S ribosomal RNA (rRNA 'shadow') caused by the very abundant rRNA. The probe is transcript from pSPlO, complementary to the 3' terminal 1.5 kb of the viral gRNA. (C)

Northern blot analysis of RNAs from protoplasts (24 hpi) infected with full-length transcripts containing the following portions of the sgRNAl promoter region duplicated in the Kpn I site: 2SL, nts 2595-2692; I, nts 2611-2692; J, nts 2595-2679; K, 2611-2679.

Fig. 4. RNA sequence and secondary structure analysis of the sgRNAl promoter of

BYDV. (A) Two secondary strucmres (I and H), with the calculated free energies predicted using MFOLD (151) contain two stem-loops, SLl and SL2. Bases in bold italics differ among Luteoviruses (mostly SbDV, panel B). Boxed base pairs indicate co-variations that preserve the predicted secondary structure. Right-angled arrow indicates initiation site (nt

2670). (B) Alignment of RNA sequences in the subgenomic promoter regions of five BYDV strains and SbDV. The BYDV strains are: FAV-Australia (also known as PAV-Vic), (pav- aus), PAV-Japan, (pav-jap), PAV-Purdue (pav-p), PAV-129 (pavl29), and MAV (mav).

Bottom line shows consensus sequence (cons). Dashes indicate bases that do not differ from consensus. (C) Computer-predicted stem-loop structures in the genomic and subgenomic

SgRNAl promoter regions of BYDV. Sequence is negative sense, numbering is positive sense. The conserved hexanucleotide at the initiation sites is in bold.

Fig. 5. Nuclease probing of the sgRNAl promoter secondary structure. (A)

Imidazole and Tl ribonuclease partial digests of 5' end-labeled transcript of pT7SGPl 41 containing the sgRNAl promoter region (negative sense). Gel-purified, end-labeled RNA was incubated in 0 M (0) and 0.8 M (I) imidazole under non-denaturing (native) conditions for 17 hours (lanes 3 and 4). The non-denaturing T1 digest was performed with 0.01 units of the enzyme for 5 min at 37°C (lane 5) (see Materials and Methods). Denaturing digests with the T1 (cuts after G) and U2 (cuts A>U) ribonucleases generated markers in lanes 1 and 2.

The products were separated on a 6% polyacrylamide gel containing 8M urea. The straight line beside lane 4 indicates predicted base-paired regions, dashed lines represent predicted single stranded junctions and ambiguous regions, curved lines show predicted loops and bulges. Double and single arrows represent G's that were cleaved strongly and weakly, respectively, by T1 nuclease. Filled circles indicate uncut or very weakly cut G's. (B)

Solution structure of the sgRNAl promoter. Arrowheads represent the T1 analysis data, triangles represent the imidazole digestion data. Larger and smaller symbols indicate strong and weak cuts, respectively.

Fig. 6. Effect of site-specific mutations in the SLl region on sgRNAl A accumulation. (A) Mutations in the SLl region of the duplicated sgRNAl promoter. Altered structures are boxed, mutant sequences are italicized. The names of mutant constructs are above the diagrammed mutations. (B) Subgenomic RNA synthesis by the mutants as determined by northern blot analysis. Total RNA from oat protoplasts (24 hpi) was blotted and probed with labeled T7 transcript from pSPlO. The names of the mutants are shown above individual lanes. The promoter activity values calculated as the ratio of sgRNAl A to gRNA level are shown under each lane (+/- standard deviation). Data represent averages from three separate experiments for each mutant. 42

Fig. 7. Effect of site-specific mutations in tiie SL2 region on the subgenomic promoter activity. (A) Mutations in the SL2 region of the duplicated subgenomic promoter.

(B) Activity of the subgenomic promoter mutants. All designations and methods are as in

Fig. 6. 43

Table I. Primers used to construct mutants of the duplicated sgRNA promoters. Kpn I recognition sequences are underlined, mutant nucleotides are in bold, deletions are designated as underlined spaces. Construct 5' primer 3' primer pPAVSGlA p2SL 2595. ATAGGTACC- SL ATAGGTACr- GGAGGTGACCCTAAGAT AGATGTGGAGTCGTCAC pi 2611, ATAGGTACC- SL TACAGCAAATCGTCGAG pi 2595 2679. ATAGGTACC- TCACCTTCACACTCTGG pK 2611 2679 pSLlIA 2595 SLllA. ATAGGTACCAGATGTGGAGTC- GTCACCTTCACACTCTGCTCATGGGCA- CTTACCGTA pSLllB SLllB. ATAGGTACC- SL GCTCATGACCCTAAGATACA pSLl IC SUIB SLllA pSL12A SL12A, ATAGGTACCGGAGTTGACCCr- SL AAGATACAGCAAATCGTCGAGAGGTACT- AGCTCGGTCTTACGGTAAGT pSLUB SU2B, ATAGGTACCGGAGTTGACCCT- SL AAGATACAGCAAATCGAGCAGAGGTAC- TACG pSLl2C SL12C. ATAGGTACCGGAGTTGACCCT- SL AAGATACAGCAAATCGAGCAGAGGTACT- AGCrCGGTCTTACGGTAAGT pSLlD SLID. ATAGGTACCGGAGTTG CGTCG- SL AGAGGTACTACGACG_CAACTCCAGAGTG pSLlU SUU, ATAGGTACCGGAGTTGACCCT- SL AAGATACAGCAAATCGTCGAGAGGTAC- T_CGACGGTCTTAC pSLI3 SU3. ATAGGTACCGGAGTTGACCCT- SL AAGATACAGCAAATCGTCGAGTCCATC- TACGACGGTCTTAC p2670G 2595 2670G. ATAGGTACCAGATGTGGAGTC- GTCACCTTCAGACTCTGGAGTT p2688C 2595 2688C. ATAGGTACCAGATCTGGAGTC- GTCACCTTCACACTCTGGAGTT pComp I 2595 compl. ATAGGTACCAGATCTGGAGTG- GTCACCTTCAGACTCTGGAGTT pTrpII 2595 trpll. ATAG£ZEA£CAGATGTGGAGTC- GTCACCAGGACACTCTGGAGTT pTrpI2 2595 trpl2. ATAGGTACCAGATGTCTTGTC- GTCACdTCACACrCTGGAGTr pTrplc 2595 tqilc. ATAGGTACCAGATGTCTTGTC- GTCACCAGGACACrCTGGAGTT pSL21 2595 SUI. ATAGGTACCAGATGTGGAGAG- CAGTGCTTCACACTCTGGAGTT pSL2SG3 2595 SL2SC3. ATAGGTACCAGATGTGGA- GTCGTCACGTCGTCACrCTGGAGTTGG pSGP300 5150. ATAGGTACCACATAAATAACCC- 5450. ATAGGTACCGTGGCTCCAAGA- GCTA GACCC 44

22K 3 CP I 5 AT 39K ( 2 RdRp I glitisit 50K 4.3-7.IK 1 I 60K 17K [H gRNA sgRNAI — sgRNA2 — sgRNA3

QRF2 10RF3 Ava I Ssp I PAV- 2456 2670 2737 2769 I pSGl {+•) in vitro transcript

probe (-), pGK-1 pSP-18 (+) in vitro transcript RNA from PAV-lL-infectedI plantspia / / k

gRNA_ 5.7 kb

sgRNAI _ 3kb

12 3 4 45

PAV6 (wt) Kel-6 KeM 2670M Ser Val Lys Ser Asp Lys Ser Val Lys Ser Leu Lys r* AGUGUGAAG UCGGAUAAG UCCGUUAAA AGUCUGAAG

'— ^ ^I r-^ 11 I I 1

isi ->S: i : - l£i' 12 3 4 5 6

Fig. 2. 46

22K 3 CP 15 AT 39K I 2_RdRp ! CTTTS 50K 4.3-7.1K I 1 I 60K 17K Kpn r^Kpn / gRNA PAVSG1A sgHNAl sgRNAI A sgRNA2 sgRNA3 B PAV6 PAVSG1A uninf. I 1 r gRNA (5.7 kb)

sgRNAI sgRNAI (3kb) (3.3 kb)

artificial rRNA 'shadow' sgRNAI A (1.7 kb)

sgRNA2 (0.9 kb)

sgRNA3 (0.3 kb)

PAV6 2SL I J K gRNA RNA (5.7 kb) 5.8 kb)

sgRNAI (3kb)

rRNA 'shadow' CI artificial sgRNAIA (1.6 kb) sgRNA2 (0.9 kb)

sgRNA3 (0.3 kb)

1 2 3 4 5 Fig. 3. 47

Structure I Structure II fiT^ ~ \U-263S -C, SL1 G I c SL1 V |j g\ ^ gT^a-2635

U c SL2 GU c-G 1133 G-C r^r U - \.U ^ CU 4U - £/A ^r ®r,9 lU-X- Al0 C" GC U„A - u J:' GG C U oG-CA u"9 G-C 2575-0-Q § C G-CyUf. U-4 •^U-GGI U»G U-A'''^. U»G C-G26S5^f-G C-G^^C ^ |~.^|A - U T A ". ,A-U®G7 *A-U

¥?? g:| P? 3C - G U- A 5- 3' C- G U- A 5' 2595 2SS2 2595 2692 AG=-25.0 kcal/mol AG=-21.6 kcal/mol 2595 2S24 pav-aus pav-jap pav-p niav U A pavl29 U G sbdv —C —G G -CAGG-A cons CCuCAACUGG GAuUCUaUGU cguuuAgCAG

2625 2554 pav-aus U C pav-jap U C pav-p 0 C mav G UU pavi29 G A U sbdv —UG-U—A- AA UU C GGU- cons CUc-CcAUgA ugCUGc-AGA aUGCCAuucA

2655 2692 pav-aus pav-jap G pav-p G mav A pavl29 -U UA— sbdv UA-C G— CGC U— -UCUAA C —AAAU— cons cgGgUUGaGG ucuCaCAcUU CcacugcuGa GGuguaGA

U C^G C U C G AC GU C-G AG CU UU A-U A - U sgRNAI C-G C G G»U gRNA U - A SL2 U - A sgRNA2 U U sgRNA3 G - C nC-G U»G U»G U-A U-A C-G C-G C-G C-G |A-U A-U G-C lA-U ^C-G 4C-G AU-A 3'LUC-G5' 3'A - U 5' a*'" U - A S' 3'^C A 5' t 2r 2669 2689 4S09 4826 5348 5368 48

denaturing native native pri U2 I [~0 Tj

) m

- Au<-2635 SL1 y - G

UCA-UAGA5L 1^2692 49

G^U SL13 A A ~T~ SL1 C^A SL1U Itfe/iete SL12C SL12B C - G A 11 - A ful G -C G G C - G C -G 4— C C G - C SL12A U -A U U A - U UUUA y£UAU;A U - A U - A SLID SL2 A - U delete G G G - C C ^ G A C U:au"C C U C - G SL11C SL11B C - G GG*^ U -A U U A - U A A U • G G -C G -4- G C - G c c C -G A -U A A U - A U U G -C G G C - G C C ^C-G C - G UCUCA- UAGA 3' 5' 2595 2692 B

gRNA -f5.7- 5.8 kb)

sgRNAI -(3- H Hit 3.1 kb)

m artificial -sgRNAIA (1.6 kb)

sgRNA2 "(0.9 kb)

sgRNA3 "(0.3 kb)

CJ in Ift CNJ o o CJ o o o o o o o + + CO CVJ CO CVJ m o O) m o o to o r- o d d d d

Fig. 6. 50

suu^.|

G - C ^ U U^^AG - C C ,A^'^ SL2SG3 UG^ C - G U - A U - ft A - ^ G G G - C G-CyUc U - A 5:§eQ= 2688C |:| |g AGA5'cS 2692 3' GC '- ®G nVJ 0 UC- 2595

B gRNA

artificial ^ (?.6- , 1.7 W3) sgBN^ • (0.9 kb)

sgBNA3

- (M ?4 o o ^

Fig. "7- 51

CHAPTER 3. BARLEY YELLOW DWARF VIRUS UTILIZES VERY DIFFERENT

PROMOTERS FOR TRANSCRIPTION OF ITS SUBGENOMIC RNAs

A paper to be submitted to the Journal of Virology

Guennadi Koev and W. Allea Miller

Abstract

Numerous RNA viruses generate subgenomic mRNAs (sgRNAs) for expression of

their S'-proximal genes. A major step in control of viral gene expression is the regulation of sgRNA synthesis by specific promoter elements. We used barley yellow dwarf virus

(B YDV) as a model system to smdy transcriptional control in a virus with multiple sgRNAs.

BYDV generates three sgRNAs during infection. The sgRNAI promoter has been mapped

previously to a 98 nt region which forms two stem-loop strucmres. It was determined that

SgRNAI is not required for BYDV RNA replication in oat protoplasts. In this study we show that neither sgRNA2 nor sgRNA3 is required for BYDV RNA replication. We identified negative sense subgenomic-sized RNA species in oat protoplasts infected with

BYDV. These negative sense RNAs were not detected in mutants deficient in synthesis of corresponding sgRNAs. By using deletion mutagenesis, we mapped promoters for sgRNA2 and sgRNAS. The minimal sgRNA2 promoter is 143 nt long (nt 4810-4952) and is located immediately downstream of the putative sgRNA2 start site (nt 4810). The minimal sgRNAS promoter is at most 44 nt long (nt 5345-5388) with most of the sequence located downstream of sgRNA3 start site (nt 5351). This is in contrast to the sgRNAI promoter, in which most of 52 the promoter sequence is located upstream of the transcription initiation site. Computer prediction and phylogenetic analysis of the minus strand RNA suggested formation of a branched stem-loop structure in the sgRNA2 promoter region and a hairpin and a single- stranded stretch in the sgRNA3 promoter region. RNA sequence and secondary structure comparisons of the three subgenomic promoters of BYDV revealed no homology between these elements demonstrating that an RNA virus with multiple sgRNAs can have very different subgenomic promoters.

Introduction

Synthesis of subgenomic mRNAs (sgRNA) is a common strategy used by positive sense RNA viruses for expression of their 3'-proximal genes. In combination with other strategies such as unconventional translational events and posttranslational proteolytic processing of precursor polyproteins, it allows efficient utilization of the viral genetic material. During infection, many viruses produce sets of one or more sgRNAs that are 3' coterminal with their genomes. Synthesized later in infection, sgRNAs encode late viral genes whose products are required for pathogenesis and particle formation. such as Sindbis virus and the alpha-like multipartite produce one sgRNA for expression of the coat protein. Plant viruses that belong to such groups as potexvirus, tombusvirus, carmovirus, and tobamovirus, as well as some other alpha-like viruses, produce two or three sgRNAs for expression of the coat protein and movement proteins. RNA viruses with larger genomes such as (also alpha-like) and the produce up to nine and seven sgRNAs, respectively (44, 66). 53

Several potential mechanisms for sgRNA synthesis have been proposed. However, only one mechanism, de novo internal initiation at a subgenomic promoter, has been unequivocally demonstrated, (90, 131, 140). Other models include premature termination of minus strand synthesis with subsequent independent replication, RNA splicing, and discontinuous transcription models. The premature termination mechanism has been suggested for red clover necrotic mosaic dianthovirus (RCNMV) (124). RNA splicing and two different versions of the discontinuous transcription mechanism (leader priming and recombination during minus strand synthesis) have been proposed for coronaviruses and other Nidovirales (jc&v. in (66)).

Regardless of the actual mechanism of sgRNA synthesis, cis-acting elements required for transcription have been named subgenomic promoters. Subgenomic promoters have been mapped and characterized in several viruses (6, 10, 29, 52, 59, 70, 132, 139, 140, 149). The best-studied example is the promoter of RNA4 of brome mosaic virus (BMV) (1, 29, 122).

The length of the subgenomic promoters, as mapped in vivo, ranges from 24 nt in Sindbis virus (70) to over 100 nt in beet necrotic yellow vein virus (BNYVV) (6). Almost all subgenomic promoters characterized to date, with the exception of that of BNYW (6), are located largely upstream of the transcription start site. A combination of primary RNA sequence and secondary structural elements has been found to be required for sgRNA transcription in vivo (59, 139). In contrast, in vitro experiments with purified BMV replicase showed the importance of the primary RNA sequence, but not the secondary structure, for promoter activity (122). This may indicate that some components of the transcription complex are missing from in vitro systems. Interestingly, genomic locations of transcriptional control elements are not always confined to the areas colinear with the sgRNA 54

5' ends, which suggests involvement of long-distance interactions in transcriptional regulation. Synthesis of sgRNAs of the coronavirus mouse hepatitis virus (MHV) is controlled not only by the intergenic regions (putative subgenomic promoters) but also by the

3' UTR (46,73). The 5' end of potato virus X (PVX) influences sgRNA transcription (55,

82), and an enhancer element exerts long distance regulation of tomato bushy smnt virus

(TBSV) sgRNA synthesis at a promoter located over 1000 bases downstream (150).

Insertion of a heterologous sequence in the 3' UTR of tobacco mosaic virus (TMV) resulted in increased sgRNA transcription (121). One of the most unusual types of transcriptional control has been uncovered in the bipartite virus RCNMV where transcription of the sgRNA from RNAl template is activated by base-pairing between regulatory elements in RNAs 1 and 2 (124).

Considering RNA promoters to be regions recognized by viral RNA polymerases or associated subunits, it is natural to expect conservation of certain feamres that determine specificity of this recognition. Indeed, viruses that have more than one sgRNA, often contain short stretches of homologous sequence near their transcription initiation sites (rev. in (66,

74)). However, these short regions by themselves are insufficient for transcription and serve as parts of larger subgenomic promoters. Turnip crinkle virus (TCV), which has two sgRNAs, also contains stable hairpins in both promoters in addition to the conserved 5'-

GGG-3' sequence at the initiation sites (140). To our knowledge, no data are available on mapping and detailed characterization of subgenomic promoters in other viruses containing multiple sgRNAs. However, based on the notion that the expression of gene products encoded by sgRNAs may be differentially regulated, it is reasonable to predict differences in the promoter structures within one virus. 55

In this study we characterize transcriptional control of the three sgRNAs of barley yellow dwarf virus (BYDV). B YDV belongs to the genus Luteovirus of the family

Luteoviridae (84). The virus has a 5.7 kb gRNA that encodes 6 open reading frames (ORFs)

(Fig. lA). Only ORFs 1 and 2, which encode viral replication proteins, are translated from the gRNA. SgRNAl serves as the mRNA for ORFs 3-5. ORFs 3, 4, and 5 encode the 22 kDa coat protein, the 17 kDa protein required for plant systemic infection (18), which is translated by the leaky scarming mechanism (25), and the extension of the coat protein required for aphid transmission, produced by the translational readthrough (24). ORF6 encodes a highly variable 6.7 kDa protein of unknown flmction which is expressed via sgRNA2. SgRNA3, at 0.3 kb, is the smallest sgRNA. It does not encode any protein, and its role in the viral life cycle is unclear.

The family Luteoviridae contains two major genera, Luteovirus and Polerovirus (78).

Members of both genera share high degree of homology in the part of their genomes that contains ORFs 3-5. The 5' halves, which contain replicase genes, are as divergent as possible: the Polerovirus replicase belongs to the supergroup 1, and the Luteovirus replicase belongs to the supergroup 2 (60). Such genomic organization implies occurrence of a recombination event in the evolution of the family. The putative crossover sites are located at the regions corresponding to the sgRNAl and sgRNA2 promoters of BYDV (89). We have previously characterized the sgRNAl promoter and mapped it to a 98 nt region with the majority of the sequence (75 nt) located upstream of the transcription start site (59). We have also shown that the promoter folds into two stem-loops, and that both RNA sequence and secondary structural elements are important for its activity. 56

We proposed recently that sgRNA2 plays a role in translational regxilation of B YDV gene expression (141). Unlike the majority of eukaryotic mRNAs, BYDV genomic RNA is uncapped (2). To compensate for the lack of the 5' cap, the virus has evolved a cis-element at the 3' UTR that mediates its cap-independent translation (142, 143). This 3' translational enhancer (3'TE) located between nt 4810 and 4920 is indispensable for virus replication (2).

Interestingly, when acting in trans, the 3'TE inhibits translation of BYDV RNA. In vitro, sgRNA2, which contains the 3'TE in its 5' UTR, strongly inhibits translation of the gRNA in trans, but only weakly inhibits translation of sgRNAl (143). Thus, we suggest that sgRNAZ, which accumulates to a 20-40-fold molar excess over gRNA, mediates the switch from the early (replicase) to the late (coat protein) gene expression (141). Understanding sgRNA2 transcriptional regulation wiU help in testing this model.

Here we report characterization of the sgRNA2 and sgRNA3 promoters of BYDV and compare them to the sgRNAl promoter. We demonstrate that the three subgenomic promoters of BYDV have very limited sequence homology and share no structural resemblance.

Materials and Methods

Plasmids. The full-length infectious clone of BYDV-PAV, pPAV6 (23), was used to develop mutant constructs. To make sgRNA2 knock-out mutants (SG2A/U, SG2G/C), PGR amplification product of downstream mutagenic primers (SG2A/T and SG2G/C, respectively) and the upstream primer CB0416 (Table 1) was digested with Kpn I and BamH

I and subcloned into pSGl (141), containing region corresponding to sgRNAl of BYDV, cut

with the same enzymes. The resulting plasmids were digested with Kpn I and Sma I, the 51 fragments corresponding to the 3' region of BYDV were purified by 0.8% low-melt agarose gel electrophoresis and subcloned into pPAV6 cut with the same enzymes. SgRNA3 knock­ out mutants (SG3G/C and SG3G/C2) were constructed by two-step PGR (67). In the first step, a region of PAV6 was amplified using the upstream mutagenic primer (SG3G/C and

SG3G/C2) and the downstream primer, 3'wt (Table 1). The gel-purified product was used as a downstream primer in PGR with the upstream primer CB0416. The resulting product was digested with Kpn I and Sma I and subcloned into pPAV6 cut with the same enzymes.

The PAVDSGP2-(I-8) deletion series was constructed by using two-step PGR (67).

In the first step, a region of PAV6 was amplified by a mutagenic upstream primer (4620D,

4686D, 4706D, 4729D, 4763D, 4790D, 4810D, 481 ID, respectively) and a downstream

primer pOlS (Table 1). The product was gel-purified and used as a downstream primer in the second step PGR with the upstream primer GB0416. The product of the second step PGR

was digested with Kpn I and BamHl, gel-purified and subcloned into pSGl cut with the same enzymes. The resulting plasmid was cut with Kpn I and Sma I and the fragment containing the deletion mutation was gel-purified and subcloned into pPAV6 cut with the same enzymes.

For sgRNA2 and sgRNA3 promoter mapping by duplication in the Kpn I site of

PAV6, promoter regions were PCR-amplified with primers listed in Table 1, which contained

flanking Kpn I sites. The products were digested with Kpn I and subcloned into pPAV6 cut

with Kpn I except constructs SGP2BG/G, SGP2D, SGP2E, and SGP2E-BF, which were

subcloned into the Kpn I site of SG2G/G.

All mutants were confirmed by sequencing. 58

Protoplast infection and northern blot analysis. Oat protoplasts were prepared and electroporated with infectious transcripts essentially as described in (25). Infectious transcripts were prepared using the Megascript T7 RNA in vitro transcription system

(Ambion, Austin, TX). Ten to fifteen micrograms of RNA was used for electroporation.

Total RNA was extracted from protoplasts -24 hours postinoculation (hpi) using RNeasy plant RNA isolation kit (QIAGEN, Los Angeles, CA). For minus strand detection, RNA was isolated following the procedure described in (119), using aurin tricarboxylic acid as RNase inhibitor. RNA (5-10 |ig) was analyzed by northern blot hybridization essentially as described in (119). A ^^P-labeled riboprobe complementary to the 3' terminus of BYDV was used to detect viral gRNA and sgRNAs. For positive strand detection, the plasmid pSPlO

(24) was linearized with Hind EH and used in an in vitro transcription reaction with T7 RNA polymerase. For negative strand detection, the same plasmid was linearized with Sma I and transcribed in vitro with SP6 RNA polymerase (Promega, Madison, WI). GeneScreen nylon membranes (Dupont) were hybridized with the probes and exposed to Phoshporlmager

(Molecular Dynamics, Sunnyvale, CA) screens for 5-24 hours (positive strand detection) or

2-4 days (negative strand detection).

RNA sequence and structure analysis. Sequence alignments of BYDV isolates were performed using GCG software. RNA secondary structure predictions were done using the MFOLD program, version 3.0, at the MFOLD website

(http://mfold2.wustl.edu/-mfoId/ma/fonnl.cgi) (77, 152). 59

Results

SgRNA2 and sgRNA3 are not required for vims replication in oat protoplasts.

In our previous study we showed that sgRNAl, which encodes the coat protein, is not required for viral RNA replication in oat protoplasts (59). To determine the roles of sgRNA2 and sgRNA3 in BYDV replication, we developed mutants defective in their synthesis. In order to make sgRNA2 and sgRNA3-deficient mutants, we changed their initiation sites in the fuU-length viral infectious clone PAV6 (23). The 5' ends of sgRNAs 2 and 3 were previously mapped to the nucleotides 4809 and 5348, respectively (53). By using site- directed mutagenesis, we changed the sgRNA2 5' terminal nucleotide A to a U (mutant

SG2A/U) and the sgRNA3 5' terminal nucleotide G to a C (mutant SG3G/C) (Fig. 1A).

Northem blot analysis of the total RNA from infected protoplasts showed, surprisingly, that neither of these mutations had any effect on the accumulation of these sgRNAs (Fig. IB).

Based on the requirement for the correct initiation nucleotide for BYDV sgRNAl synthesis (59) and sgRNA transcription in other viruses (122, 132, 139), it appears that the 5' termini of BYDV sgRNAs 2 and 3 may have been mismapped. SgRNAs 1 and 2, as well as the genomic RNA of BYDV, share a conserved hexanucleotide GUGAAG at their 5' ends

(53), and sgRNAl starts at the first G of this hexanucleotide, whereas sgRNA2 start was mapped to an A one nucleotide upstream of the conserved G (53). Therefore, by analogy with SgRNAl, we mutated the first G of the conserved hexanucleotide at position 4810 to a C in the mutant SG2G/C (Fig. lA). SgRNA3 5' terminus lacks the conserved hexanucleotide.

In an attempt to make a sgRNA3-defficient mutant, we changed the nearest G at position

5351 to a C in mutant SG3G/C2 (Fig. 1A). Both mutants failed to produce their respective sgRNAs. Neither mutation negatively affected viral replication and transcription (Fig. IB), indicating that neither sgRNA2 nor sgRNA3 is required for virus RNA replication in protoplasts. These data also suggest that the 5' ends of sgRNA2 and sgRNA3 correspond to the nucleotides 4810 and 5351, respectively. Thus, all B YDV sgRNA 5' termini appear to be guanosine residues which is consistent with those in other members of the virus supergroup

2, such as carmoviruses (140) and tombusviruses (150).

Mapping the boundaries of sgRNA2 promoter. We attempted to map the sgRNA2 promoter 5' boundary by using deletion mutagenesis of the sequence upstream of the start site (Fig. 2A). Surprisingly, all deletions, except PAVDSGP2-8, had litde effect on sgRNA2 accumulation, indicating that the 5' border of the sgRNA2 promoter is located at the putative transcription start site (nt 4810) (Fig. 2B). It also confirmed the result of the previous experiment, which indicated that synthesis of sgRNA2 probably does not initiate at position

4809 (Fig. IB), as reported previously (53), because the deletion in mutant PAVDSGP2-7 included A4go9 (replacing it with a uridine). This did not abolish sgRNA2 accumulation. This result indicated that no RNA sequence upstream of the transcription start site is needed for sgRNA2 synthesis.

The 5' end of sgRNA2 includes the 3'TE (nt 4810-4920), therefore, the promoter may overlap it. Thus, mutations in the subgenomic promoter may knock out 3'TE, which would

be lethal (2). To map the 3' boundary of the sgRNA2 promoter, we duplicated it in the unique Kpn I site (nt 4154) of the BYDV genome in ORF5 (Fig. 3A), as we did previously for SgRNA1 promoter mapping (59). This ectopic expression of sgRNA has been done in other viruses (6, 10, 29, 69, 132, 140). ORF5 is not required for virus replication in protoplasts (96). Mutants SGP2B and SGP2C that contained 143 nt downstream of the transcription start site (nt 4810), duplicated in the Kpn I site, produced an artificial sgRNA of the expected size (sgRNA2A, Fig. 3B). Mutant SGP2A, which contained only 91 nt downstream of the start site, gave no sgRNA (Fig. 3B).

The amount of the artificial sgRNA2A synthesized by the mutants SGP2B and

SGP2C was very low (Fig. 3B). Previous studies of the BMV subgenomic promoter (29) and the sgRNAl promoter of BYDV (59) showed that downstream sgRNA promoters are preferentially used, and they reduce expression from those located upstream. Thus, to increase transcription from the ectopic (upstream) promoter, we subcloned it into the mutant

SG2G/C that does not produce sgRNAZ in its natural location (Fig. 1A). Indeed, the same

267-nt region used in mutant SGP2B resulted in a much higher level of artificial sgRNAZA transcription in the context of SG2G/C (mutant SG2BG/C, Fig. 3B). To map the 3' boundary of the promoter, we tested two additional mutants that contained the duplicated sgRNA2 promoter region extending from the transcription start site (4810) to positions 4922 and 4952, respectively (SGP2D and SGP2E, Fig. 3A). Only SGP2E produced artificial sgRNA2A, defining the 3' promoter boundary to a region between nucleotides 4922 and 4952 (Fig. 3B).

Thus, we mapped the minimal sequence required for sgRNA2 synthesis to the 143 nt located between nucleotides 4810 and 4952 of the BYDV genomic RNA. SGP2BG/C, which included 120 nt upstream, gave more sgRNA than SGP2E indicating that it may contain an enhancer sequence. However, it is clear that region from 4810-4952 is sufficient for transcription. Interestingly, this includes an essential sequence located between nucleotides

4922 and 4952, over 100 nt downstream of the start site.

To test whether the 3'TE and sgRNA2 promoter are functionally connected, we introduced a small mutation into the sgRNA2 promoter construct in the Kpn I site, which was shown previously to abolish the function of the 3'TE (143). The mutation is a 4 nt 62 duplication inserted in the BamHl site within the 3'TE (SGP2E-BF, Fig. 3) (143). When this duplication was present in the extra copy of the sgRNA2 promoter, it had no effect on the sjnithesis of artificial sgRNA2A. This indicates that the sgRNA2 promoter does not require a functional 3'TE sequence, suggesting that they function independently of each other

(Fig. 3B).

Mapping the sgRNA3 promoter. The reported transcription start site of sgRNA3

(nt 5348) is located within the 3'IJTR of BYDV RNA (53). A mutant widi a 100-nt deletion spanning the sgRNA3 start site failed to replicate in protoplasts (C. Paul and WAM, unpublished data). This made it potentially difficult to map the sgRNA3 promoter in its original location by deletion mutagenesis. Thus, as with sgRNAs 1 and 2, we duplicated the sgRNA3 promoter and inserted it in the Kpn I site of BYDV RNA. The minimal sequence required for sgRNA3 synthesis was a 44 nt region located between nucleotides 5345 and

5388 (Fig. 4). This is the smallest subgenomic promoter of BYDV. Similar to the sgRNA2 promoter and unlike the sgRNAI promoter, it is located mostiy downstream of the transcription start site.

BYDV sgRNA promoters have limited sequence homology. In an attempt to find common elements within sgRNA promoters of BYDV, we analyzed nucleotide sequences of the three subgenomic promoters. Because only the sgRNA2 and sgRNA3 promoters are located mostiy downstream of the start site, proper alignment of the three promoters was difficult. We chose sgRNA transcription initiation sites as the start points of our analysis.

Surprisingly, besides the conserved hexanucleotide (CUCAAC) in the (-) strand of promoters for sgRNAs 1 and 2, no significant sequence homology was found between overlapping regions of the three subgenomic promoters (Fig. 5A). 63

To explore the conservation of sgRNA promoters, we performed sequence alignments of the promoter regions in the negative strands of different strains of B YDV. Aligimients of both sgRNA2 and sgRNA3 promoter regions showed significant sequence conservation among various strains of the virus (Fig. 5 B and C). Sequences of BYDV-PAV129 and

BYDV—MAV sgRNA2 promoters appeared most divergent, containing stretches of non­ homologous regions. The conservation of the sgRNA2 promoter sequence could be due to conservation of the essential 3'TE that is contained within it, rather than to conservation of promoter function.

Interestingly, BYDV-PAV129 lacks any sequence resembling the sgRNA3 promoter.

Thus, we were unable to use sequence of BYDV-PAV129 for alignment of the sgRNA3 promoter regions. It is not known whether BYDV-PAV129 produces sgRNA3.

BYDV SgRNA promoters have different secondary structures. We have shown previously that both RNA primary and secondary structures are required for sgRNA 1 synthesis (59). Therefore, we analyzed sgRNA2 and sgRNA3 promoter regions for potential secondary structure. Figure 6A presents optimal (AG=-42.2 kcal/mol) and suboptimal (AG=-

40.8 kcal/mol) conformations of sgRNA2 promoter predicted using MFOLD (77, 152). The two conformations share identical stems at the base and two stem-loops (SL2 and SL3) at the top of the structure. The suboptimal stracture has a different midsection and an additional stem-loop (SLl) which creates a four-way junction. The conservation of the SL3 structure, inspite of the sequence divergence in BYDV-MAV and BYDV-PAV129 (Fig. 5B), suggests its potential importance. The co-variation found in the middle helix of the suboptimal structure favors formation of that structure and not of the theoretically most energetically stable one. The G-U pairs that co-vary with Watson-Crick pairs indicate selection for 64 conservatioa of base-paired regions that would not form in the positive strand. The sgRNA3 promoter sequence is predicted to fold into a hairpin structure and a single-stranded region and due to its small size does not offer any ambiguities.

Comparison of the secondary structure of sgRNAl promoter and the structures predicted for the other two promoters showed no common elements (Fig. 6C), indicating that

B YDV SgRNA transcription is controlled by very divergent cis-elements.

BYDV generates (-) strand RNAs of the subgenomic RNA size. In an attempt to gain insights into the mechanism of sgRNA synthesis of BYDV, we analyzed negative strand

RNA species generated during protoplast infection. In addition to the negative strand genomic RNA, we detected subgenomic-size negative strands (Fig. 7). BYDV mutants deficient in sgRNA1 synthesis, Kel-6 and Kel-f (59), the mutant deficient in sgRNA2 synthesis, SG2G/C, and the mutant deficient in sgRNA3 synthesis, SG3G/C2, failed to produce detectable levels of respective (-)sgRNAs (Fig. 7). Interestingly, we also detected an

RNA species migrating between gRNA and sgRNA 1, which we believe, is an undenatured double-stranded replicative form of sgRNAl. This RNA is missing in both sgRNAl- deficient mutants.

Discussion

One of the surprising findings of this study was the fact that mutations of sgRNA2 and sgRNAS start sites as mapped by Kelly et al. (53), did not abolish sgRNA synthesis.

Both sgRNAs were mapped by using primer extension assuming that sgRNAs are colinear with the gRNA. However, this is not true for all viruses. To our knowledge, no data exist on 65 direct sequencing of isolated sgRNAs of BYDV. Potentially, there may be non-templated addition of S'-terminal nucleotides that would complicate precise mapping of their 5' ends.

Roles of BYDV sgRNAs. Our experiments have shown the lack of requirement for sgRNAs 2 and 3 for viral RNA replication ia protoplasts. We have demonstrated previously that sgRNAl is also dispensable for viral RNA replication. It serves as the mRNA for the viral coat protein which is usually not required for replication of positive sense RNA viruses, with the exception of alfamoviruses and ilarviruses (rev. in (51)). Lack of requirement for sgRNA2 supports a previous observation that the ORF6 product encoded by sgRNA2 is not required for replication (96). Nevertheless, we expected that sgRNA2 may influence gRNA replication based on a novel regulatory role it may play in viral translation (141). This role is to preferentially inhibit translation of the viral replicase from the gRNA, thereby mediating the switch to the late gene expression (coat protein) from sgRNAl. Based on this model, it is not surprising that absence of sgRNA2 does not abolish virus replication, because none of the products of SgRNAl are necessary for viral RNA replication in protoplasts.

While the lack of sgRNA2 transcription in SG2G/C did not adversely affect viral

RNA replication, it might have deleterious consequences for other aspects of the viral life cycle that would not be detected by the assays used in this study. For example, the disruption of the stoichiometry of viral RNA and protein accumulation may result in inefficient encapsidation or movement of the virus. The product of 0RF6, while dispensable for replication, may be needed for other processes. Another potential role for both sgRNAs 2 and 3 would be attenuation of virus replication, which could prevent premature death of the host and allow a larger window for transmission. Therefore, sgRNAs 2 and 3 may be a type of molecular parasites (much like satellites and DIs), whose presence, however, created 66 selective advantage for the virus. In planta infection experiments with mutants lacking sgRNAs will address these possibiUties.

Recognitioa of divergent promoters. The mapping experiments, sequence alignments, and RNA secondary structure predictions clearly demonstrate the lack of homology between the sgRNA promoters of B YDV. Location of sgRNA2 and sgRNA3 promoters downstream of the transcription initiation sites, in contrast to the sgRNAl promoter and the vast majority of subgenomic promoters mapped in other RNA viruses (10,

29, 52, 59, 70, 132, 139, 140, 149), is especially striking. This raises the question: how does the replicase complex recognize such different promoters? Either it has multiple RNA recognition domains or there are separate RNA-recognizing proteins for each promoter, and each of these interacts with the replicase. Various host proteins have been found associated with viral replication and transcription complexes (rev. in (64)). Different host factors control the specificity of bacteriophage QP replicase for the positive and negative strands of

QP RNA (13).

An RNA virus may evolve divergent promoters for differential temporal regulation of

SgRNA accumulation, if such regulation provides selective advantage. Also, overlapping of subgenomic promoters and important ORFs and cis-acting elements in a small virus may result in evolution of dissimilar promoters. For example, the sgRNA1 promoter sequence has to accommodate an essential part of the replicase coding region and part of the 5' UTR of

SgRNAl that controls translation (2). The region that contains the sgRNA2 promoter overlaps the 3'TE and part of ORF6, as well as the sgRNA2 5' UTR. The sgRNA3 promoter is located in the region important for virus RNA replication (C. Paul and WAM, unpublished). More detailed characterization of sgRNA2 and sgRNA3 promoters is needed to elucidate RNA primary and secondary structural requirements for promoter activity. The negative strand-specific conservation of the sgRNA2 promoter secondary structure suggests its potential importance for transcription. Studies with other RNA viruses have shown involvement of RNA secondary structure in various processes of the viral life cycle including

RNA transcription and replication. The lack of stmctural homology between B YDV subgenomic promoters is surprising. This contrasts to TCV whose promoters have similar stable hairpins immediately upstream of the start site (140). The hairpin predicted to form within the sgRNA3 promoter structurally resembles SL2 of sgRNAl promoter (Fig. 6, compare B and C). However, not only is SL2 secondary structure not required for sgRNAl transcription, it inhibits promoter activity (59). It will be interesting if future smdies show that sgRNA3 promoter has no secondary structure requirement, and that the hairpin has evolved to attenuate transcription, the role we suggested for SL2 of sgRNAl promoter (59).

Mechanism of sgRNA synthesis. Based on the fact that internal initiation of sgRNA synthesis on the negative strand is the only unequivocally demonstrated mechanism of transcription in plant RNA viruses, and that B YDV replicase belongs to the supergroup 2, the same supergroup as TCV, which employs internal initiation for its transcription, we favor the internal initiation model as the mechanism of sgRNA synthesis in B YDV. However, as we have indicated (59), we do not exclude the possibility that premature termination with independent replication could be the true mechanism as suggested for RCNMV (124). For example, the region identified as the sgRNAl promoter, which contains a helical structure indispensable for sgRNA synthesis, can form identical secondary structures in both the positive and the negative strands (59). The remarkable overlapping of the 3'TE region with 68

the sgRNA2 promoter suggested that the same RNA element has a dual function.

Potentially, the 3'TE could function indirectly as a sgRNA2 "promoter" in the positive strand, by serving as a terminator of the (-) strand sgRNA2 synthesis, which is consistent

with the premature termination mechanism of sgRNA production. Conceivably, binding of

the factor(s) that interacts with the 3'TE, could create an obstacle for the viral RdRp

synthesizing negative strand RNA, causing termination and release of the (-)sgRNA2 which

would serve as a template for (+)sgRNA2. However, the BamH I fill-in mutation, while

lethal for the 3'TE, did not affect synthesis of sgRNA (Fig. 3), which argues against

functional connection between 3'TE and sgRNA2 promoter. Also, the phylogenetic

conservation of the predicted secondary structure of sgRNA2 promoter in the negative strand

(G-U pairs. Fig. 6A) argues that the negative strand contains an important transcription

initiation signal. No such phylogenetic evidence has been found for either sgRNA I or

sgRNA3 promoters.

Our data show that during infection, BYDV generated both genomic and

subgenomic-sized negative strands of RNA. Mutants deficient in sgRNA production,

however, did not accumulate detectable levels of (-)sgRNAs. These data are consistent with

both transcription models because, while (-)sgRNAs may be primary transcription products

and serve as templates for (+)sgRNA synthesis, they may instead be dead-end products

synthesized off the (+)sgRNA templates that contain the 3' replication origin. They also may

potentially function in replication of (+)sgRNA which is originally produced by internal

initiation. A similar compromise model has been suggested for coronaviruses by Brian and

colleagues: (+)sgRNAs initially synthesized by leader priming might be amplified by

replication via a (-)sgRNA intermediate (120). 69

Mutations that resulted in the lack of sgRNA synthesis may have disrupted either the replicase recognition region in the negative strand or the termination element in the positive strand. Unfortunately, to our knowledge, no conclusive experimental data are available on the termination process in positive sense RNA viruses, which would allow us to compare sgRNA promoter regions with structures involved in termination of RNA synthesis.

Involvement of RNA secondary structure has been suggested in RdRp pausing that leads to recombination in RNA viruses (27, 65). Evidence from studies done with rhabdoviruses and suggests that termination could be both sequence and secondary structure dependent (9, 40, 56, 129).

The increase of activity of the ectopic sgRNA2 promoter due to a knock-out of its downstream endogenous copy is consistent with observations made in both the alpha-like

BMV (29) and the coronaviruses (133). BMV uses internal initiation for sgRNA synthesis

(90), however consensus has not been reached on the coronavirus transcription model.

Subgenomic promoter attenuation by other downstream promoters is also consistent with both transcription models (133). The attenuation effect could result either from viral RdRp dissociating from the plus strand template when it encounters a subgenomic promoter

(terminator) or from RdRp pausing and subsequendy dissociating at the promoters on the minus strand where other transcription initiation complexes assemble. Either way, the largest

SgRNA would experience the highest number of obstacles during its synthesis and therefore will be the least abundant. More studies need to be done in order to unveil the mechanism of

SgRNA synthesis in BYDV. 70

Acknowledgments

This work was supported by a grant from USDA-NRI. The authors thank Kay

Scheets for advice on RNA extraction for detection of minus strand viral RNA.

Figure legends

Fig. 1. SgRNA2 and sgRNA3 knock-out mutagenesis. (A) BYDV genome organization. Black boxes represent ORFs with their numbers. Numbered solid lines represent sgRNAs. Nucleotides at the 5' end of each sgRNA are shown in boxes with mutations in bold-italic. Name of each construct is to the left of each box. (B) Northern blot analysis of total RNA from oat protoplasts infected with the wild type and mutant transcripts

(-24 hpi). A riboprobe complementary to the 3' terminal 1.5 kb of BYDV gRNA was used.

Left panel shows the wQd type virus (PAV6) and the constructs whose mutations failed to abolish sgRNA synthesis (SG2A/U and SG3G/C). Right panel shows PAV6 and sgRNA2 and sgRNA3-deficient mutants (SG2G/C and SG3G/C2). Bands corresponding to gRNA and sgRNAs are indicated.

Fig. 2. Deletion mapping of the 5' border of sgRNA2 promoter. (A) Maps of the constructs (PAVDSGP2-1 through PAVDSGP2-8) used in the experiment. Location of the sgRNA2 start site, as reported by Kelly et al. (53), is indicated by an arrow. The 5' terminal sequence of sgRNA2 is shown in italics underneath. Sites of deletions are indicated by the broken lines. Boundaries of deletions are designated by nucleotide positions not included in the deletions. (B) Northern blot analysis of total RNA from oat protoplasts infected with the wild type and deletion mutant transcripts (-24 hpi). 71

Fig. 3. Mapping sgRNA2 promoter by ectopic expression. (A) Map of the constructs that contain duplicated copies of the sgRNA2 promoter (box with an arrow) in the Kpn I site which is duplicated in the cloning process. The gray box represents the 3'TE. The artificial sgRNA2A is indicated by the dashed line. Arrows with numbers indicate PCR primers used to amplify promoter regions for ectopic expression. Lengths of the duplicated regions (in nt) shown in italics. SGP2E-BF contains the same sgRNA2 promoter region duplication as does

SGP2E, but with the BamH I fiU-in mutation (143) (shown in boldface italics). (B) Northern blot analysis of total RNA from oat protoplasts (-24 hpi) infected with the wild t>^e and

mutant transcripts. Left panel shows PAV6 and the constructs with sgRNA2 promoter duplicated in the Kpn I site of PAV6. Right panel shows PAV6 and the constructs with sgRNA2 promoter duplicated in the Kpn I site of SG2G/C, a mutant deficient in sgRNA2 synthesis.

Fig. 4. Mapping sgRNA3 promoter by ectopic expression. (A) Map of the promoter

regions duplicated in the Kpn I site of PAV6. Arrows with numbers indicate PCR primers

used to amplify promoter regions for ectopic expression. Lengths of the duplicated regions

(in nt) shown in italics. Position of the sgRNA3 start site, as reported by Kelly et al. (53), is shown in bold. (B) Northem blot analysis of total RNA from oat protoplasts (~24 hpi) infected with the wild type and mutant transcripts.

Fig. 5. Sequence analysis of sgRNA promoters. All sequences are negative sense.

Numbers that show nucleotide positions refer to the positive sense of B YDV-PAV-Australia gRNA. (A) Alignment of subgenomic promoter sequences of BYDV. Transcription initiation sites of the three subgenomic promoters are aligned, and only the overlapping regions are shown. The conserved hexanucleotide CACUUC is shown in bold. Dashes 72 indicate the lack of consensus. (B) Alignment of sgRNA2 promoters in different strains of

BYDV: PAV-Japan (pav-jap, accession D85783), PAV-Purdue (pav-p, accession D11032,

DO 1214), PAV-Australia (pav-aus, accession X07653), PAV-129 (pavl29), and MAV (mav, accession D11028, DO 1213). In the consensus sequence, bases conserved in all strains are shown in upper case, those conserved in three or four strains are shown in lower case. Gaps are designated by dots. The 3'TE is located between nt 4810 and 4920. (C) Alignment of sgRNA3 promoters in three strains of BYDV. Strains are designated as in (B). In the consensus sequence, bases conserved in all three strains are shown in upper case, those conserved in two strains are shown in lower case.

Fig. 6. Analysis of secondary structures of sgRNA promoters of BYDV (negative sense). Angled arrows indicate sgRNA initiation sites. (A) Potential secondary structures of sgRNA2 promoter predicted by MFOLD. The optimal and suboptimal conformations are shown with the calculated free energy (AG) indicated for each structure. Nucleotides that vary in different strains of BYDV are in boldface italics. Base-pair co-variations that preserve proposed secondary strucmre are boxed. Extra bases in PAV129 extend the stem of

SL3 (boxed). The variable distal end of SL3 in MAV is also boxed. (B) An MFOLD secondary structure prediction for sgRNA3 promoter. (C) Secondary structure of sgRNAl promoter supported by MFOLD analysis, phylogenetic evidence, and nuclease probing (59).

Fig. 7. Detection of positive and negative sense RNA in oat protoplasts infected with

PAV6 and mutants deficient in sgRNAl synthesis (Kel-6 and Kel-f), sgRNAZ synthesis

(SG2G/C), and sgRNA3 synthesis (SG3G/C2). Northem blot analysis of total RNA from infected protoplasts probed for the positive (left panel) and the negative (right panel) strands 73 of BYDV. The putative undenatured double-stranded replicative intermediate of sgRNAl is designated sgRNAl RI. 74

Table 1. Primers used in this study. Altered bases are in boldface italics, deletions are shown as underlined spaces, Kpn I recognition sites are underlined.

Expeiimenc Primer name and sequence

SgRNA knock.out SG2A/r. CCCAGGATCCGATTTGTGCTAGTGGTGTrGTCTTCACA- GGAATTGCGCCTTGTA SG2G/C.CCCAGGATCCGATTrGTGCTAGTGGTGTTGTCTTCACT- GGAATTGCGCCrrGTA SG3G/C, CCACGACCTGGTACAAGT SG3G/C2, GAAGACGTTAAAACTCGACCACCTGGTACAAGTCGT 3'wt. ATACCCCGGGTTGCCGAA CB04I6, GGTCTAGATAATACGACTCACTATAGGGTACACAAACAAGCGAAT

SgRNA2 promoter 4620D. CCTGAAGACGTACCTCCAAT_AAAGAAGAACCCCA deletions 4686D. TCCACGCTACrCTATGAAAG_CTACTCCCACTGCC 4706D. GGCAACmTTATCCAGACT.GTGTCAACTACTTCAAACAT 4729D. CCAGACrTGTAGAAGCGA_AAGGGAGCAGCTCCG 4763D. ACTCCCACTGCCCCATCC_AATTCCAGCGGAATC 4790D, CAAACATGACAAGGGAGC_GCGTACAAGGCGCAA 48 lOD. CCGGGAGTACACTAGGATT_GTGAAGACAACACCAC 48 UD. CCGGGAGTACACTAGGATT_TGAAGACAACACCACTA pO 18. GCCTGTTTCCCAGGATCCTATAGTGAGTCGTATTACA

Promoter duplications 4620, ATAGGTACCAAAGAAGAACCCCA 4686. ATAGGTACCCTACTCCCACTGCC 4810. ATAGGTACCGTGAAGACAACACCACTA 4900. ATAGGTACCACGGCGGTAGGTTG 4922. ATAGGTACCCATCGGCCAAACACAATA 4952. ATAGGTACCAATACAAACGGCGA 5249. ATAGGTACCAAGGCTATCCCACC 5293. ATAGGTACCAGTGGGTGACTTCG 5319. ATAGGTACCGATCGTC^GGATTGA 5330. ATAGGTACCTTGAAGACGTTAAAACTC 5345. ATAGGTACCCTCGACGACCTGGTACAA 5373. ATAGGTACCAGnTAACGACTTGTACCA 5388. ATAGGTACCGTATCCACCCGAGTC 5402. ATAGGTACCGGGCCGGGTGTGGTG 5432. ATAGGTACCCGTTTCGTATCGTG 75

PAV6 (wt) lAGUGAAGI PAV6 IGACGACI SG2AAJ IfJGUGAAGi SG3G/C ICACGACI SG2G/C lACUGAAGl SG3G/C2 IGACCACI 1 n 4809 5348 B vO vC?'

- gRNA -

- sgRNAI —

ig? m

sgRNA2 —

—sgRNAS — 76

4809

PAV6 AGUGAAG 4567 4619 PAVDSGP2-1 4624 4685 PAVDSGP2-2 4663 4705 PAVDSGP2-3 F'AVD$6f>2-4 4707 4762 PAVDSGP2-5 4736 4790 PAVDSGP2-6 4759 4810 PAVDSGP2-7

PAVDSGP2-8

PAVDSGP2 PAV6 -gRNA S&if ^SsaK — sgRNAI

— sgRNA2

— sgRNAS

Fig. 2. 77

3TE

Kpn I Kpn I sgRNAI sgRNA2A

sgRNA2 sgRN^

4620 4686 4809 4900 4952

4810 4922 SGP2A 281 nt SGP2B (SGP2BG/C) 267nt

SGP2C 333 nt

SGP2D 113 nt

SGP2E 142 nt

SGP2E-BF 1 5' GGAUCGyiiyCC 3" B A ^

3 gRNA Ciii|pi#iifC

3 sgRNAI C

artificial ^ sgRNA2A'

— sgRNA2 —

— sgRNA3 -

Fig. 3. 78

5348 5150 5249 5293 5330 5388 54325450

5319 5345 5373 5402 SGP300 301 nt SGP3C 125 nt SGP3D 140 nt SGP3H 84 nt SGP3K 59 nt SGP3L 44 nt

-gRNA W- pS ^ 0. ^ ^0 isgRNM

, artificial sgRNASA

- sgRNA2

— sgRNAS

Fig. 4. 79

sgRNM I 2695 '2798 sgRNA2 fS!" USW : 4952 sgRNA3 | 53451 .•53SS

sgRNAl CACUTJCCACT GCUGAGGUGU AGA sgRNA2 uCACXnJCUGUU GUGGUGAUCG UGU sgRNA3 cugCUGGACCAUG UUCAGCAAUU UGA consensus

B 4810 pav-aus pav-jap C pav-p C-A- pavl2 9 UAA U CG—U mav U— G- —.—AU consensus CACUUCu.gu UGUGGUGAuc GUgUUuaGCc UAGGACCCUU

pav-aus -U pav-jap pav-p -U- - G pavl2 9 -U- -UAACCUUC CUC mav - UU—UCA consensus UGUCCGUCUU GAagCCAAGc AUUCGAGCCC AU ccGacag

pav-aus -C- pav-jap pav-p pavl2 9 GUGGUG GUAGA G—A-GG G- mav GGAA G-A C consensus uugg AUGGC GGCaUAgCau aAcACAaACC GGCUACCUaC

4952 pav-aus pav-jap _G pav-p _G pavl2 9 A— G A C mav U—A— C—U—C —GC-C consensus UaGAgGUgCA aUAgCGgCAA ACauAa

5345 5388 pav-aus pav-jap mav Consensus GAGCUGCUGG AuCAcGUUCA GCAAUUUGAC UGAuCCCACC UAUG

Fig. 5. 80

A C A A C-G C SL2 PAV129 SL2 g PAV129 A UU U ACC CCAC U I III II I I U • G c A^.UGG ,,GGUG G SL1 A 11 G GC U^GUCC CCAU CCG^C CCAU CCG^C U I I I I SL3 • Ml II* / SL3 CcCAGG^ GGUA aau G_CC GGUA GGUf.G C G K "GI^IC-GI G C -G Cql)UUQUUGUC-G^^UA ®C>*>JAQ 4 " ' C ^GGAucC A U ^U-A^ AAqqA ^u- ACACA [UZaJ-^IUTG] u- A MAV _ G — C >iA MAV u- A |U • GL'^-FU - A\U A G- C , , G-CC. C U G -C [yZi^EE^G ^G • AG • U^ U - A u - A G - C- 4922 G -C-4922 guG-CU .uG-Cu C U„ u ItT^+'lc^ c- G _ U- A ^[U u- n [yZ ^,fA-U ^ A A- 3 T-C- GCAAUAGCGGCAAACAUAA« 5' 3'tc- GCA>IUAGCGGCAAAC>»AA4 5' 4810 4952 4810 4952 AG=-40.8 kcal/mol AG=-42.2 kcal/mol C^A A U U B C f/ C-G [CjiGJ-»-lU»GI U - Ai. A-U SL1 C - G^ G*U G-C G-C A-U f U-A |UUAg:g ^C-G c G-C AU - A U-A UqU C-G C A U-A G U U-A A-U SL2 3'G^ ~ ^UGACUGAGCCCACCUAUG 5' G G 5345 5388 G-C C^ G A C AG=-9.3 kcal/mol UIauUC C U C-G U-A A-U U •G C-G C-G U-A f A -U C-G •-C-G 3'C-GUCUCA -UAGA 5' 2595 2692

Fig. 6. 81

P _\C?'

— gRNA -

sgRNAI _ Rl

— sgRNAI —

il«l

sgRNA2-

— sgRNA3— 9

(+) STRAND (-) STRAND

Fig. 7. 82

CHAPTER 4. CHARACTERIZATION OF THE BARLEY YELLOW DWARF

VIRUS 3' ORIGIN OF REPLICATION.

A paper to be submitted to Virology

Gueimadi Koev and W. Allen Miller

Abstract

We characterized the 3'-terminal cis-acting RNA element required for replication of

barley yellow dwarf virus (B YDV). Computer predictions, nuclease sensitivity assays, and

phylogenetic and mutational analyses revealed that the 3'-terminal 104 nucleotides (nt) of

B YDV genomic RNA (gRNA) form four stable stem-loops (SL) with GNRA and UNCG

terminal tetraloops (SLl, SL2, SL3, and SL4, 3' to 5'). Northem blot analysis of total RNA

from oat protoplasts inoculated with wild type and mutant infectious transcripts indicated that

each of the four stem-loops was required for virus RNA replication. No full-length minus

strand gRNA was detected in the cells inoculated with the stem-loop deletion mutants,

however the mutant lacking SLl produced much higher than the wild type levels of minus

strand RNAs that were shorter than gRNA. RNA secondary structure of the stems and not

their primary sequence was essential for replication. The sequence of the terminal loops of

SLl and SL2 was not important, whereas mutations in the terminal loops of SL3 and SL4

abolished or reduced viral RNA replication to various degrees. Deletion mutagenesis showed that the 104 nt region containing the four stem-loops was sufficient for low level

RNA replication, however sequences upstream were needed for full activity. We propose 83 that the 104 nt region at the 3' terminus of BYDV is the viral replication origin that promotes minus strand synthesis of BYDV gRNA.

Introduction

Involvement in RNA replication is an obvious role for the 3' termini of positive sense

RNA viruses. In amplification of viral genome, negative sense (antigenomic) RNA synthesis initiates at the 3' genomic terminus. Viruses have evolved a variety of 3' terminal structures that serve as promoters for the minus strand synthesis (14, 26). These are referred to as 3' origins of replication. A well-characterized 3' element is the tRNA-like structure (TLS) found in several groups of plant RNA vimses (rev. in (75)). The TLS is required for replication of brome mosaic virus (BMV) RNA because it contains promoter elements for antigenomic RNA synthesis (88). Recently, it was demonstrated that in vitro, a promoter for the mmip yellow mosaic virus (TYMV) replicase can be reduced to a single-stranded

CC(A/G) run, and a new role for the TLS was suggested where its secondary structure prevents initiation at internal CCA elements (123) and downregulates antigenomic RNA synthesis by binding translation factor EF-la (26). Despite the sequence and secondary structure differences from true tRNAs, TLS exhibit tRNA mimicry. They can be aminoacylated by host aminoacyl transferases (not always required for replication (35)) and are recognized as a substrate by the CCA nucleotidyltransferase, which helps maintain the correct 3' terminus over numerous rounds of replication. Thus, TLS serve a telomeric function in viruses that possess them (111). It has been suggested that TLS are molecular remnants of the RNA world, and that their original function was to tag RNA genomes for replication (145). Thus, RNA promoter may have been the original function of tRNAs, and their role in translation evolved later.

Heteropolymeric RNA sequences that do not exhibit tRNA mimicry constitute another broad group of viral 3' termini. Variable in size and sequence, many of them contain conserved secondary structural elements that play a role in RNA replication. Tne 3' UTRs of viruses in such unrelated groups as carmovirus (125), coronavirus (45, 146), pestivirus (148), tombusvirus (41), rubivirus (19), and others, contain stable secondary and tertiary structural elements indispensable for replication. Host factors and viral replication proteins can bind specifically to such structures, and often their binding ability correlates with replication competence (rev in (64)).

Similar to eukaryotic mRNAs, many viruses contain poly(A) tails at their 3' termini.

While generally recognized as stability and translational control elements, poly(A) tails can also affect virus replication (115). However, presence of a poly(A) tail does not exclude involvement of the upstream genomic 3' sequences, and the higher order RNA structures they can form, in virus replication. For example, the polyadenylated genomes of coronaviruses and potyviruses also contain stem-loops in their 3' UTRs that are required for replication (39, 45). Picomaviruses have tertiary structural elements, pseudoknots, in their 3'

UTRs that support replication. Several smdies have demonstrated the importance of the 3' pseudoknots or kissing stem-loops for efficient replication of different picomaviruses (80,

95, 106, 138). On the other hand, deletion of the entire 3' UTRs in poliovirus and human rhinovims, while severely inhibiting viral growth, did not abolish replication, suggesting that template recognition by the replicase might not be as specific as originally believed (128). 85

Co-compartmentalization of the replicase with the template RNA may also play an important role in specificity of RNA amplification (128).

In this study we characterized the 3' terminal region that is necessary for replication of the barley yellow dwarf virus (BYDV) genome. B YDV is a single-stranded, positive sense RNA virus, the tj^e member of the genus Luteovirus in the family Luteoviridae (84).

It has an uncapped, non-polyadenylated 5.7 kb genomic RNA (gRNA) that contains six open reading frames (ORFs) (Fig. lA). During infection, BYDV generates three subgenomic

RNAs (sgRNAs). Viral gRNA serves as a messenger for ORFs 1 and 2, where ORF 2 is expressed by translational frameshifting (11). The role of the 39 kDa protein encoded by

ORFl is unclear, whereas the 99 icDa frameshift product of ORFs 1 and 2 is the viral RNA- dependent RNA polymerase (RdRp). The RdRp gene of BYDV has been assigned to supergroup 2 (60). Like the majority of viruses in this group, BYDV contains a heteropolymeric sequence at its 3' end. Our previous smdies have shown that sequences in the 3' region of BYDV genome are involved in regulation of translation (143). Here we characterize 3' proximal cis-elements required for BYDV RNA replication. We show that the 104 3' terminal nucleotides of BYDV gRNA form four stable phylogenetically conserved stem-loop structures that contain terminal GNRA and UNCG tetraloops. We suggest that these structures constitute the viral 3' origin of replication. A combinadon of primary RNA sequence and secondary structure in this region is required for virus replication in protoplasts. I

86

Results

The 3' terminus of BYDV is predicted to fold into four stem-loops. Because of

the well-documented importance of higher order RNA structure in viral replication, we

analyzed the 3' terminus of BYDV gRNA for potential formation of RNA secondary

structure. Using the MFOLD program (77, 151, 152) we predicted two alternative

conformations of BYDV 3' terminus (Fig. IB): the most energetically stable (AG=-46.9

kcal/mol) and a suboptimal conformation (AG=-45.4 kcal/mol). Both structures consist of

four stem-loops (SLl through SL4) that contain terminal tetraloops. The tetraloops of SLl,

SL3, and SL4 fit the GNRA consensus (N is any nucleotide, R is a purine). The SL2

terminal loop is a UNCG tetraloop. Both GNRA and UNCG tetraloops have been shown to

confer unusually high stability to their stem-loop structures due to non-Watson-Crick

interactions between the nucleotides in the loops (20, 42, 135). The most stable predicted

conformation differs from the suboptimal strucmre only by a shorter SL2 and base-pairing of

the 3'-terminal four bases (ACCC-3') (Fig. IB).

To test if the predicted structure is phylogenetically conserved, we aligned the 3' ends

of several BYDV strains . This showed high sequence conservation in this region (Fig. IC).

However, variation was found that supported existence of some structures. For example, the

first A in the GAAA tetraloop of SLl is changed to a U in BYDV-MAV. Also, in SL3, the

third nucleotide of ±e tetraloop GCAA is changed to a G in BYDV-MAV. These changes

stiU fit the GNRA tetraloop consensus. The variable nucleotides in the single-stranded

regions do not change their single-stranded nature. Two base-pair covariations were found in

SL3 and SL4 that support existence of these strucmres. The variation analysis, however, did

not allow us to distinguish between the two predicted stmctures. Sequence variation 87 suggested that both BYDV-MAV and BYDV-PAV-Japan contain larger internal bulge in

SL4, which in B YDV-PAV-Australia is limited to three unpaired adenosines (Fig. IB).

Nuclease probing of the 3* terminal RNA secondary structure. To determine the type of structure formed by the 3' terminus of BYDV in solution, we probed its structure using the single strand-specific chemical nuclease imidazole and ribonuclease T1 that cuts single-stranded guanosines (28, 93, 137). Imidazole failed to cleave the tetraloop nucleotides probably due to the non-Watson-Crick interactions within the loops (Fig. 2). However, guanosines in the GNRA tetraloops were sensitive to the T1 nuclease, whereas those in the regions predicted to be base-paired appeared insensitive. The region 5'-ACUC-3' (nt 5638-

5641) appeared sensitive to imidazole cleavage, which was inconsistent with the suboptimal strucmre (Fig. IB) where this region forms the base of SL2 by pairing with 5'-GGGU-3' (nt

5619-5622). However, the region 5'-GGGU-3' (nt 5619-5622) was imidazole-insensitive

(Fig. 2). This can be explained by base-pairing of the 5'-GGGU-3' region with the 3' terminus (5'-ACCC-3') of BYDV, as predicted in the optimal conformation. Therefore, the structural probing data supported formation of the calculated most stable structure with the 3' terminal nucleotides base-paired to the region between SL2 and SL3 (Fig. 2).

The 3' terminal stem-loops are required for viral RNA replication. To test the importance of the 3' terminal stem-loop structures for viral RNA replication, we constructed mutants in the full-length infectious clone of BYDV (PAV6) that contained deletions of each stem-loop (3'SLID through 3'SL4D, Fig. 3A). Also, in order to test whether the region containing the four stem-loops is sufficient to support viral RNA replication, we designed an additional mutant, 3'SL5D, that contained a deletion of nt 5503-5567 just upstream of SL4.

Because portions of the 3' terminal region of BYDV genomic RNA have been implicated in 88 translational regulation, we tested these mutants for translation in germ extract. All five mutant transcripts translated similarly to wild type (PAV6), producing abundant 39 kDa product of ORFl and low levels of the 99 kDa frameshift product, the viral RdRp (Fig. 3B).

None of the full-length viral in vitro transcripts containing the stem-loop deletions replicated

(Fig. 3C). The mutant 3'SL5D replicated very poorly (Fig. 3C) suggesting that while the region containing the four 3' terminal stem-loops is sufficient for a basal-level replication, upstream sequence (bases 5503-5567) is required for wild type activity.

Low levels of (-)gRNA and (-)sgRNAs accumulated in cells infected with the wild type virus and none were detected in cells inoculated with the deletion mutants 3'SL2D through 3'SL5D (Fig. 3C). Due to the very low abundance of the negative strand viral RNA,

Northern blot hybridization was not sensitive enough to detect negative strand RNA species of the poorly replicating mutant 3'SL5D. Literestingly, the mutant 3'SLID, which lacked

SLl, consistently produced very high amounts of negative strand RNA with the molecular mass lower than that of the full-length gRNA (Fig. 3C).

Helical regions of the 3' terminal stem-loops are necessary for replication. To test the role of the primary sequence versus the secondary stmcture of the stems in SL1 through SL4, we designed a series of mutations that disrupted and restored these heUces.

Three mutants were generated for each stem-loop: the 3'-proximal arm, the 5'-proximal arm, or both arms were mutated such that each single arm mutant disrupted base-pairing and the double mutant restored base-pairing (Fig. 4A). To distinguish between the two possible conformations of SL2 (Fig. IB), we designed two groups of mutants for this strucmre. In the first group we assumed that SL2 formed a long stem, as predicted in the suboptimal conformation (mutants 3'SL21 through 3'SL23, Fig. 4A), and in the second group we 89 assumed that it formed a short stem, as predicted in the optimal structure and supported by the nuclease sensitivity assay (mutants 3'SL2U1 through 3'SL2U3, Fig. 4A). Mutants with the stem structures disrupted in SLI, SL3, and SL4 failed to repUcate, whereas restoration of these stems in the double mutants restored virus replication to the wild type level in SL3 and to levels lower than the wild type in SLI and SL4 (Fig. 4B). The mutant 3'SL14, which contained a deletion of the four bulged nucleotides in SLI (Fig. 4A), replicated at the wild- type level indicating that the bulges in SLI are not required for replication (Fig. 4B). These data showed that the secondary structures of the stems in SLI, SL3, and SL4, and not their primary RNA sequences, are required for replication.

The disruption of the proximal end of the predicted stem in SL2 (alternative) was lethal in 3'SL22, and dramatically decreased but did not abolish RNA replication in 3'SL21

(Fig. 4B). The double mutant 3'SL23 failed to restore replication (Fig. 4B) indicating either a primary RNA sequence (GGGU) requirement in this region or that the double mutant folded in an unpredicted nonfiinctional conformation or the erroneous nature of the suboptimal conformation of SL2. On the other hand, distal disruption mutants (3'SL2U1 and

3'SL2U2) were severely debilitated, and the double mutant 3'SL2U3 restored replication to the wild type level (Fig. 4B). Taken together, these data suggest that SL2 forms the most energetically stable conformation, and that its secondary strucmre is important for replication.

Role of the terminal tetraloops in replication. Because of the previously reported importance of GNRA and UNCG tetraloops in RNA structure and function (34, 49, 97, 147), we elucidated the requirements for the terminal loop sequences in SLI, SL2, SL3, and SL4 for virus RNA replication. In a series of mutants (Fig. 5A), the sequence of the loop in SLI 90

(GAAA) was altered to a UNCG tetraloop (UUCG), another GNRA tetraloop (GAGA), and a non-tetraloop sequence (GACA). All these nautants replicated at the wild type level, as did

the mutant with the SL2 tetraloop (UUCG) changed to the complementary sequence (AAGC)

(Fig. 5B), indicating that the sequences of the terminal loops in SLl and SL2 are not important for virus RNA replication.

Alteration of the SL3 tetraloop sequence (GCAA) to its complement (CGUU)

abolished virus replication (Fig. 5B). Changing it to a UNCG tetraloop (UUCG), a different

GNRA tetraloop (GCGA), and a non-tetraloop sequence with only one base different from

the wild type (GCCA), restored replication to various degrees but never to the wUd type level

(Fig. 5B). The mutant 3'SL3GCGA, which had one base different from the wild type

sequence and fit the GNRA consensus, replicated at the highest level. Similar results were

obtained with the SL4 tetraloop mutants designed in the same fashion as those of SL3 (Fig.

5A). Although none of the mutations abolished virus replication, they all inhibited it to

various degrees (Fig. 5B), indicating that the original wild type sequence GAAA is optimal.

Discussion

Role of stem-loop structures in replication. In this smdy we identified a cis-acting element at the 3' terminus of BYDV RNA that is required for vims replication. The fmding

of extensive RNA secondary strucmre in this region was not unexpected, but identification of

four consecutive stem-loops with terminal tetraloops was rather striking. Extremely stable

structures that contain terminal GNRA and UNCG tetraloops have been implicated in protein

binding and formation of RNA tertiary structure (34, 49, 97, 147). Binding of genomic 3'

UTRs by viral and host cellular proteins has been well documented in various viruses (rev. in (64)), therefore a similar protein binding role could be expected of the 3' terminal tetraloops of BYDV. Formation of the RNA tertiary structure important for virus replication has also been shown (80, 95, 106, 138). However, the types of tertiary folding found in viruses are limited to pseudoknots and kissing stem-loops formed by conventional Watson-Crick base- pairing of relatively distant RNA elements. GNRA tetraloops have been involved in a different type of tertiary interactions, in which they bind RNA helical regions containing specific receptors, as found in the group I intron of Tetrahymena thermophilia and the td intron of the bacteriophage T4 (15, 21, 108). A possibility of such tertiary interactions in

BYDV can be explored.

Interestingly, our mutagenesis showed that sequences of the SLl and the SL2 terminal tetraloops were not important for replication. The loop sequence changes in the SL3 and the SL4 tetraloops were more deleterious. However, only one tetraloop mutation,

3'SL3CGUU, completely abolished the virus' ability to replicate. Nevertheless, a preference for a GNRA tetraloop was obvious, because mutants that maintained tetraloop consensus

(3'SL3GCGA and 3'SL4GAGA) replicated at the highest levels in their groups. The lack of a requirement for specific sequences in the loops of essential stem-loops has been demonstrated in turnip crinkle virus (TCV) (125) and cymbidium ringspot virus (CymRSV), which also contains a 3' terminal GNRA tetraloop (41). On the other hand, red clover necrotic mosaic virus (RCNMV) RNA2 was unable to replicate with an alteration in the loop of the 3' terminal stem-loop structure (130). A GNRA tetraloop in the 5' terminal stem-loop structure of potato virus X (PVX) is important for virus replication (82). A PVX mutant with a loop sequence alteration was able to replicate in plants. However, after two passages in planta, the wild type tetraloop sequence was recovered, indicating selective advantage of the 92

GNRA sequence (81). A similar in planta experiment with mutant B YDV tetraloops may provide some clues to the optimal structural features of the replication origin of the virus.

Due to the mandatory aphid transmission for plant infection by B YDV, such experiments are technically difficult, and therefore they were not attempted in this study.

The stem mutagenesis in SLl through SL4 showed the importance of the secondary

* structure of helical regions and a relaxed primary sequence requirement. This is consistent with similar smdies performed with 3' replication elements of other RNA viruses like TCV

(125), CymRSV (41), RCNMV (130), tobacco etch vims (TEV) (39), bovine viral diarrhea virus (BVDV) (148), alfalfa mosaic virus (AMV) (134), and others. On the other hand, in beet necrotic yellow vein virus, both the primary and the secondary structures of the 3' helical regions were indispensable for replication (68). The secondary structure of 3' terminal stem-loops of RNA viruses has been shown to be important for protein binding (19,

48). Perhaps, failure to bind important replication factors rendered the stem disruption mutants non-viable.

Role of upstream sequence elements in replication. Based on our inability to detect full-length (-)gRNA in replication-deficient stem-loop deletion mutants and the lack of apparent translational defects in these mutants, we suggest that the 3' terminus of B YDV serves as a replication origin by promoting (-)gRNA synthesis. The 3' terminal 104 nt that contain the four stem-loops, however, were not sufficient for the wild type replication level so sequence upstream (bases 5503-5567) must play a supporting role. Elements located upstream of the 3' terminal TLS stimulate replication of brome mosaic virus (BMV) and tobacco mosaic virus (TMV) RNAs (61, 127). In TCV and RCNMV genomes, elements located hundreds of nucleotides upstream of the 3' end are needed for efficient replication 93

(16, 130). This reflects the complexity of RNA virus replication. It is possible that other upstream elements of the B YDV genome may play a role in the (-)gRNA synthesis as was reported in other viruses. We can rule out bases 2789-4515 that can be deleted without negatively affecting B YDV replication (96). In the bacteriophage QP RNA, an internal region 1.5 kb upstream of the 3' end is the actual replicase binding site (13). Long-distance base-pairing delivers the replicase complex to the 3' end (57). Human rhinovirus contains an internal RNA element required for replication (79), and a part of the intergenic region of the

BMV RNA3 greatly enhances its replication (30).

Potential means of regulation of (-) strand RNA synthesis. Accumulation of abundant, less-than full-length negative strand RNA in the SLl deletion mutant was surprising. This could indicate increased minus strand initiation rate in the absence of SLl combined with the decreased processivity of the viral replicase which made it unable to complete full-length (-)gRNA synthesis. This phenomenon requires further investigation.

Another surprising finding of this smdy was the inferred double-stranded nature of the

BYDV 3' terminus (5'-ACCC-3'). The biological relevance of this base-pairing is suggested by the replication deficiency of the mutants 3'SL22 and 3'SL23, where it would be disrupted.

Because of the conservation of the 3'-terminal 5'-CCC-3' sequence in the majority of viruses in the supergroup 2, a compensatory double mutation restoring this base-pairing most likely would be lethal and therefore it was not attempted. In viruses that contain a TLS, the 3' terminal nucleotides are usually unpaired, which makes them accessible to the replication complex. It is tempting to speculate that the putatively base-paired 3' end of BYDV RNA is a strategy for the negative regulation of the (-)gRNA synthesis. At a certain stage of infection, a conformational change may melt the 3'-terminal base-pairing and make it 94 accessible to the RdRp. Base-pairing has been implicated in preventing initiation of minus strand RNA synthesis from promoter-like elements within the TLS in TYMV (123). Also in

TYMV, binding of the EF-la is suggested to limit RdRp access to the minus strand promoter

(26). Conceivably, such a mechanism may mediate an important regulatory switch between negative and positive strand synthesis, or between translation and. replication, since these two processes are incompatible on the same RNA molecule. Possible regulatory conformational changes in viral 3' UTRs have been suggested for AMV (103) and flaviviruses (110).

The findings of this study represent yet another example of a viral replication origin, further demonstrating the diversity of replication cis-elements that RNA viruses have evolved. Our characterization of the B YDV 3' origin of rephcation is a step towards understanding the composition and the mechanism of formation of viral rephcation complexes. Strategies for identification of the viral and host cell proteins interacting with the

3' origin of BYDV are being considered.

Materials and Methods

Plasmids. The mutant constructs were derived from the full-length infectious clone of BYDV, pPAV6 (23), by PCR-mediated site-directed mutagenesis using Vent DNA polymerase (New England Biolabs). The mutants p3'SLlD, p3'SLlUUCG, p3'SLlGAGA, p3'SLlGACA, p3'SLll, p3'SL12, p3'SL13, p3'SL14, and p3'SL2D were constructed by PGR amplification of the 3' portion of the BYDV genome with the downstream mutagenic primers listed in Table I and the upstream primer CB0416

(GGTCTAGATAATACGACTCACTATAGGGTACACAAACAAGCGAAT). The product was digested with the restriction endonucieases Kpn I and Sma I, purified by 0.8% low-melt 95 agarose gel electrophoresis, and inserted into pPAV6 cut with the same enzymes. The remaining mutants were generated by two-step PCR (67). In the first step, upstream mutagenic primers listed in Table 1 were used with the downstream primer 3'wt

(ATACCCCGGGTTGCCGAA). The product was gel-purified and used as a downstream primer in the second step PCR with the upstream primer CB0416. The second step PCR product was cut with Kpn I and Sma I, gel-purified and placed in pPAV6 cut with the same enzymes. The construct p3' used for RNA structural probing was generated by subcloning a

Bgl IL-Sal I fragment of pgl016, containing the 3' end of BYDV, into pSS I (93) cut with

BamH I and Sal I. Plasmid preparations were performed using the QuantumPrep DNA purification kit (Bio-Rad). All mutants were confirmed by sequencing.

Protoplast infection and Northern blot analysis. Wild type and mutant infectious transcripts were generated by in vitro transcription of Sma I-linearized plasmids using

Megascript T7 RNA polymerase system (Ambion, Austin, TX). We used 10-15 fig of RNA for electroporation of oat protoplasts prepared as described in (25). Total RNA was extracted from protoplasts -24 hours postinoculation (hpi) using RNeasy plant RNA isolation kit

(QIAGEN, Los Angeles, CA). For minus strand detection, RNA was isolated following the procedure described in (119), using aurin tricarboxylic acid as RNase inhibitor. RNA (5-10

|ig) was analyzed by northern blot hybridization essentially as described in (119). A "P- labeled riboprobe complementary to the 3' terminus of BYDV was used to detect viral gRNA and sgRNAs. For positive strand detection, plasmid pSPlO (24) was Linearized with Hind EH and transcribed in vitro with T7 RNA polymerase (New England Biolabs). For negative strand detection, the same plasmid was linearized with Sma I and transcribed in vitro with

SP6 RNA polymerase (Promega, Madison, WI). GeneScreen nylon membranes (Dupont) 96 were hybridized with the probes and exposed to Phoshporlmager screens (Molecular

Dynamics, Sunnyvale, CA) for 5-24 hours (positive strand detection) or 2-4 days (negative strand detection).

Transcripts of the stem-loop deletion mutants were tested for translation in wheat germ extract translation system (Promega, Madison, WI) following manufacturer's instructions, using ^^S-labeled methionine, essentially as described in (142). Translation products were analyzed by SDS-PAGE (10% acrylamide) as described in (142). All experiments were performed at least twice.

RNA sequence and structure analysis. Sequence alignments of BYDV isolates

were performed using GCG software. RNA secondary structure predictions were done using the MFOLD program, version 3.0, at the MFOLD website

(http://mfold2.wustl.edu/~mfold/ma/forml.cgi) (77, 152). RNA secondary structure probing

was performed on in vitro transcripts 5' end-labeled with ^"P essentially as described in (28,

93, 137). The transcripts were derived from p3' linearized with Sma I.

Acknowledgments

This work was supported by a grant from USDA-NRI. The authors thank Kay

Scheets for advice on RNA extraction for detection of minus strand viral RNA. We are grateful to Mark Young and Sergei Filichkin for providing unpublished sequence of the 3' end of B YDV-MAV. 97

Figure legends

Fig. 1. (A) Genome organization of BYDV. The horizontal lines represent genomic and subgenomic RNAs (sgRNA). Filled rectangles show open reading frames (ORFs) indicated by number (1 through 6) and by encoded products (RdRp, RNA dependent RNA polymerase; CP, coat protein; MP, putative movement protein; AT, aphid transmission protein). Molecular masses of protein products are indicated in kilodaltons (K). (B)

Potential secondary structores of BYDV 3' terminus predicted by MFOLD (77, 152). The optimal and suboptimal conformations are shown with the free energy (AG) indicated for each structure. Nucleotides that vary in different strains of BYDV are in boldface italics.

Base-pair covariations that preserve proposed secondary structure are boxed. Nucleotide variations that do not preserve secondary structure are circled. (C) Sequence alignment of the 3" termini of BYDV strains: PAV-Australia (pav-aus, accession X07653), MAV (mav),

PAV-Japan (pav-jap, accession D85783). In the consensus sequence, bases conserved in all strains are shown in upper case, those conserved in two strains are shown in lower case.

Gaps are designated by dots. Dashes indicate bases identical with the consensus. Numbers that show nucleotide positions refer to the positive sense of BYDV-PAV-Australia gRNA.

Fig. 2. Solution structure of the 3' terminus of BYDV as determined by nuclease sensitivity assays (data not shown). Bases sensitive to imidazole cleavage are indicated by arrowheads, those sensitive to the T1 ribonuclease are indicated by open triangles. The assays were performed as described in (28, 93, 137).

Fig. 3. Characterization of the stem-loop deletion mutants. (A) Diagram of the mutants with deleted regions shown in boxes. (B) Translation of the wild type (PAV6) and mutant RNA transcripts in a wheat germ extract translation system (Promega, Madison, WI).

'^S-labeled translation products were resolved by SDS-PAGE (10% acrylamide). Names of mutants are shown above each lane. Marker lane, designated BMV, shows translation products of brome mosaic virus (BMV) RNAs. Product sizes are indicated in kilodaltons

(kDa). (C) Accumulation of PAV6 and deletion mutant RNAs as determined by northern blot analysis of total RNA isolated from infected oat protoplasts -24 h after inoculation. The left panel shows accumulation of the positive strand viral RNAs, the right panel shows accumulation of the negative strand. The probe was complementary to the 3' terminal 1.5 kb of BYDV gRNA. The migration of the viral genomic RNA (gRNA) and subgenomic RNAs

(sgRNA) are indicated.

Fig. 4. Effect of the stem-loop secondary stmcture on virus RNA replication. (A)

Diagram of the mutants with disrupted and restored helical regions shown in boxes. Altered bases are in boldface italics. The SL2 mutants based on the suboptimal strucmre (SL2 alternative) are in the dashed box. (B) Accumulation of viral RNAs in protoplasts inoculated with the wild type (PAV6) or the mutant transcripts as detected by northern blot hybridization of total RNA isolated from infected oat protoplasts ~24 h after inoculation.

Fig. 5. Effect of the terminal tetraloop sequence changes on virus RNA replication.

(A) Diagram of the mutants with altered loop sequences shown in boxes. (B) Accumulation of viral RNAs from protoplasts inoculated with the wild type (PAV6) or the mutant transcripts as detected by northem blot hybridization of total RNA isolated from infected oat protoplasts -24 h after inoculation. 99

Table 1. PCR primers used for site-directed mutagenesis. Altered bases are in boldface itaUcs, deletions are shown as underlined spaces.

Constnict Primer name and sequence

p3'SLlD 3'SLID. ATACCCCGGGTT_GTTAAGAGTACCCCGA p3'SL2D 3'SL2D. ATACCCCGGGTTGCCGAACTGCTCnTCGAGTGTCAGCGTT- AA.CCAACACATTGCTGT p3'SUD 3'SL3D. ACCGAAAGGAAAGCCAGTAT_GGGTCACACCTTCGGG p3'SL4D 3'SL4D. GAACCGTGTCCACGGGC_TATCCAACACAGCAA p3'SL5D 3'SL5D.AGCAAGCTCTGAGCCAGGAGA_CGGGCCTGGTTACCG p3'SLll 3'SLl 1. ATACCCCGGGTTGCACrACTGCTCTTTCGAG p3'SL12 3"SL1Z ATACCCCGGGTrGCCGAAGTGCTCTTTCGAGTGAGCCCGTTA- AGAGTACCCCGA p3'SL13 3'SLl3.ATACCCCGGGTTG

22K 50K 39|^ 1 i&il 14.3-7.1 K |60K^ m 17K 5' 3' sgRNAI —• sgRNA2' sgRNA3

SL4 B SL2 SL1 u c U G C-G , , ,•^2)0 -G C-G C-G LG-CH»FIZA A-U RTT\ C-G G-CV-J ^ MA-U - U ^^ACA-UAC,C I , - A " S»J~^IV ' Vaf r-rt A C-G ^ C/r?v .CA-S 9-9.U 5'C-GUAUC-G_GGGU Szp^r-5 ^cu-A 5573 ^-cccAAC - gcaa^ (Ofiig 55^1:1 5'G -G UAU C - G G - CUUAACG - CAACCC 3' AG=-46.9 kcal/mol C " G ^573 ^ S677 UU : A C aG=-45.4 kcal/mol ^ G C A-U G-C

SL1

5573 pav-aus GU GC mav —U G pav-jap Consensus CUGccUACCG AAAGGAAAag CAGUAUCCAA CAcAGCaAUG UGUUGGGGGU 5677 pav-aus C mav U G pav-jap U Consensus uACACCCUUC GGGGUACUC. UUAACGCUGA CACUCGaAAG AGCAGUUCGG CAACCC

Fig. 1. 101

SL4 AA A SL3~~ C>G A C A C-G A SL2 C-Ga UC A 2 ^ ® G-5633 U -A A -U-5613 C-G U-G c-G C-G o A-U m TL; -laT * „ G -C-5593A - U ^ACA - UA r. U-A c-G •-C '-'o. 5'C-G UAU C-GGGGU 444 I I I I . A U A 3'CCCAAC " G^A^ > ' I G -c444 5677 5673G • U c-G ••U U - A "•A^" c A-5653 &G - C A-U G - C %A°0 SU

Fig. 2. 102

3'SL1D 3'SL4D 3'SL2D delete delete 3'SL3D delete delete t u c U G SL4 SL3 C-G SL2 SL1 C-G A-U C-G S'SLSD C-G delete CA -U A-U CU - A G-C G . U C-G Y-LESNTLCGGGC UAU G-C I UUAAC AACCC3' 5568 I 5677

104kDa 99kDa- 97kDa

39kDa- -35kDa

<2^

gRNA —

— sgRNAI — --

rRNA shadow

• sgRNA2 —

^ sgRNA3 —

plus strand minus strand 103

3'SL2U3

• 3'SL32 3'SL31 3'SL2U2 U G 3'SL2U1 3'SL42 G G C - 61 5^ U U G G U U U U G G G-C ACA - UAc,, C C fi_S C GGGU (J

N-rCAA^ U-A C C G G C-G A-U U U G G U«G G G C C G-C 3'SL43 A A U {/ 4-1/ 3'SL33 3-SL11

(alternative) A - U SU

3'SL14

3'SL23 3'SI_21 U U G 6 G G G G B

SL1 SL2 SL2 SL3 SL4 alternative v'V *5b jx A o'V o'> i\ v

sgRNAI im r •64 sgRNA2 sgRNAS

Fig. 4. W. !^o CA c ?c DO o ocoocc > oo o. 0 o o>»ooo^I 1I I• I ooII >> O- r o v%

\% V?' c> c o n v»'.- oo>>o>o> ooccocoII I II I I c I V\ (/> o c o O) CO r tj o> 6 - o %sV. Ci o - o 5 ^p --co> o o> c 0 o o>o ocoooo >ooo c !b," III II I • II r > 1 II I -• ,0>0C00 o 0 o o oco J cooo o > w o " G (Q \ ^ C -f C o z > 105

CBLAPTER5. GENERAL CONCLUSIONS

Transcriptional control of BYDV

Research results presented in this dissertation demonstrate the complexity of

transcriptional regulation of BYDV gene expression. BYDV has evolved three extremely

divergent promoters for transcription of its sgRNAs. They vary in size, RNA sequence,

secondary structure and location relative to the transcription start sites. The sgRNAl

promoter is 98 nt long and located mostly upstream of the sgRNAl start site. It folds into

two stem-loop structures. The sgRNA2 promoter was mapped to a 143 nt region located

entirely downstream of the sgRNA2 start site. It is predicted to fold into a cloverleaf-like

structure (branched stem-loops). The sgRNA3 promoter is the smallest promoter of BYDV.

At most 44 nt long, it is predicted to fold into a hairpin and a single-stranded region. No

sequence homology could be found between the three promoters except the six nucleotides

GUGAAG (positive sense) in sgRNAl and sgRNA2 promoters. These smdies demonstrated

for the first time that an RNA vims with multiple sgRNAs can have such different promoters

that control their synthesis.

Several directions of future research can be envisioned based on the findings

presented here. First, the mechanism of sgRNA synthesis in BYDV has yet to be elucidated.

Development of an in vitro transcription-replication system, in which to test BYDV RNA

constructs containing the mapped promoter regions, would help to address this issue.

Second, the roles of sgRNA2 and sgRNAS remain unclear. The mutants deficient in synthesis of these sgRNAs should be tested for their ability to form virions and to be

transmitted to plants by aphids. Monitoring stability of the mutations and possible recovery 106 of revertants and second-site mutants will shed light on the role of the sgRNAs 2 and 3 in the viral life cycle.

Third, the sgRNA2-deficient mutant will help to test the translation regulation model in which sgRNA2 plays the role of a switch between the early and the late gene expression

(141). Monitoring the levels of viral protein accumulation in protoplasts (or plants) infected with the wild type and sgRNA2-lacking viruses may yield some clues as to the validity of the model. It was also suggested that the 3' TE might cause plant gene expression shut-off during infection. The sgRNA2-deficient mutant may be useful in testing this hypothesis because sgRNA2 provides a very high supply of the 3' TE.

Fourth, the issue of specific recognition of the different promoters by B YDV replicase needs to be addressed. To test the idea that the specificity is controlled by the host factors, an attempt should be made to isolate host proteins that bind to the subgenomic promoters. This could be accomplished by using several approaches including, electrophoretic mobility shift assays (gel-shifts), UV-crosslinking, or yeast 3-hybrid system.

Finally, characterization of the sgRNA2 and sgRNA3 promoters needs to be continued. Site-directed mutagenesis of the promoters will help to identify RNA primary and secondary structural requirements for sgRNA synthesis.

RepUcation of BYDV RNA

Characterization of the BYDV 3' origin of replication has contributed to our knowledge of cis-acting signals involved in replication of RNA viruses. The 3' replication origin of BYDV forms four stable stem-loops with terminal tetraloops. A combination of the

RNA primary sequence and secondary structure is required for viral repUcation. This 107 element represents another divergent promoter recognized by the B YDV replicase. This study is the first step towards elucidation of the mechanism of B YDV RNA replication.

These are some of the potential research directions to be explored in future.

First, viral and host proteins that bind the 3' origin of replication should be isolated and identified. Host factor involvement in replication of various RNA viruses has been repeatedly documented (64).

Second, the possibility of the RNA tertiary folding mediated by the tetraloops should be explored. RNA structure probing methods sensitive to tertiary structures and UV- crosslinking should be used.

Third, the SLl and SL2 tetraloop mutants should be tested in planta in an attempt to determine the role of these tetraloops in the viral life cycle.

Fourth, the increased minus strand RNA synthesis by the mutant 3'SLID, which lacks

SLl, needs to be explored. An in vitro replication system would be helpful for this purpose.

Fifth, other repUcation cis-elements need to be identified in the B YDV genome. The

5" end of gRNA is predicted to form a series of stem-loops which may be important for replication. Also a possibility of the internal regulatory regions involved in replication should be explored.

Regulation of transcription and replication

As demonstrated in the studies presented in this dissertation, B YDV RNA has several promoter elements that serve as signals for RNA synthesis. All these promoters are extremely divergent in their sequence and secondary structure, which raises a question; how are such different elements recognized by the same viral replicase? One possibility is that 108

B YDV RdRp uses different host factors for recognition of different promoters (Fig. 1). The factors may bind promoters on BYDV RNA and recruit RdRp via protein-protein interactions. The rate of transcription can be affected by the relative abundance of the host factors, their affinity to the promoters and affinity of the RdRp to the host factors. The amount of certain factors at various stages of viral life cycle may also regulate transcription and replication. For example, the excess of factor A at early stages will result in intensive (-

)gRNA synthesis, whereas increasing concentrations of factors B, C, D, and E later would potentially inhibit (-)gRNA synthesis and mediate transcription of sgRNAs and (+)gRNA synthesis. As speculated in chapter 4, binding of the putative factor A (Fig. 1) to the 3' origin of replication may cause an RNA conformational change that would unwind the very 3' end of the gRNA and make it accessible to the RdRp.

Alternatively, the viral replicase itself may have different recognition sites for different BYDV promoters. Biochemical studies with the purified enzyme would help to determine the mechanism of transcription and replication of BYDV RNA.

Understanding molecular mechanisms of the processes that constitute viral life cycle is essential to development of antiviral strategies. This study is a modest contribution to the ongoing exploration of the world of infectious diseases of plants and animals. 109

factor A plus strand 5' 3' origin of replication (minus strand promoter)

BYDV RdRp, 99 kDa

60 kDa

factor D

factor C

factor B factor E

minus strand 3' l?l I til 5' plus strand sgRNAI sgRNA2 sgRNA3 promoter promoter promoter promoter

Fig. 1. A potential model for reco^tion of divergent promoters by BYDV RdRp. Each RNA promoter is specifically recognized by a separate host protein (factors A through E). The viral RdRp, depicted as a 99 kDa fusion product of the 39 kDa and the 60 kDa proteins, is recruited by the factors bound to the promoter regions. This allows the viral RdRp to initiate RNA synthesis at appropriate sites. Solid lines indicate plus and minus strand gRNA of BYDV. The putative promoter for plus strand synthesis is depicted by a question mark. 110

REFERENCES

1. Adkins, S., R. W. Siegel, J. H. Sun, and C. C. Kao. 1997. Minimal templates directing

accurate initiation of subgenomic RNA synthesis in vitro by the brome mosaic virus

RNA-dependent RNA polymerase. Rna. 3:634-47.

2. Allen, E., S. Wang, and W. A. Miller. 1999. Barley yellow dwarf virus RNA requires

a cap-independent translation sequence because it lacks a 5' cap. Virology. 253:139-

144.

3. Andino, R., G. E. Rieckhof, P. L. Achacoso, and D. Baltimore. 1993. Poliovirus RNA

synthesis utilizes an RNP complex formed around the 5'- end of viral RNA. EMBO J.

12:3587-98.

4. Andino, R., G. E. Rieckhof, and D. Baltimore. 1990. A functional ribonucleoprotein

complex forms around the 5' end of poliovirus RNA. Cell. 63:369-80.

5. Ansel-McKiimey, P., and L. Gehrke. 1998. RNA determinants of a specific RNA-

coat protein peptide interaction in alfalfa mosaic virus: conservation of homologous

features in ilarvirus RNAs. J Mol Biol. 278:767-85.

6. Balmori, E., D. Gilmer, K. Richards, H. Guilley, and G. Jonard. 1993. Mapping the

promoter for subgenomic RNA synthesis on beet necrotic yeUow vein virus RNA 3.

Biochimie. 75:517-521.

7. Baric, R. S., S. A. Stohlman, and M. M. Lai. 1983. Characterization of replicative

intermediate RNA of mouse hepatitis virus: presence of leader RNA sequences on

nascent chains. J Virol. 48:633-40. Ill

8. Barker, H., and P. M. Waterhouse. 1999. The development of resistance to

luteoviruses mediated by host genes and pathogen-derived transgenes, p. 169-210. In

H. G. Smith and H. Barker (ed.). The Luteoviridae. CABI Publishing, Wallingford,

UK.

9. Barr, J. N., S. P. Whelan, and G. W. Wertz. 1997. cis-Acting signals involved in

termination of vesicular stomatitis virus mRNA synthesis include the conserved

AUAC and the U7 signal for polyadenylation. J Virol. 71:8718-25.

10. Boccard, F., and D. Baulcombe. 1993. Mutational analysis of cis-acting sequences

and gene function in RNA3 of cucumber mosaic virus. Virology. 193:563-578.

11. Brault, v., and W. A. Miller. 1992. Translational frameshifting mediated by a viral

sequence in plant cells. Proc Natl Acad Sci Usa. 89:2262-2266.

12. Brown, C. M., S. P. Dinesh-Kumar, and W. A. Miller. 1996. Local and distant

sequences are required for efficient read-through of the barley yeUow dwarf virus-

PAV coat protein gene stop codon. J. Virol. 70:5884-5892.

13. Brown, D., and L. Gold. 1996. RNA replication by QP replicase: A working model.

Proc. Natl. Acad. Sci. USA. 93:11558-11562.

14. Buck, K. W. 1996. Comparison of the replication of positive-stranded RNA viruses of

plants and animals. Adv Virus Res. 47:159-251.

15. Butcher, S. E., T. Dieckmann, and J. Feigon. 1997. Solution structure of a GAAA

tetraloop receptor RNA. EMBO J. 16:7490-7499. 16. Carpenter, C. D., J.-W. Oh, C. Zhang, and A. E. Simon. 1995. Involvement of a stem-

loop structure in the location of junction sites in viral RNA recombination.

J.Mol.Biol. 245:608-622.

17. Chang, R.-Y., R. Krishnan, and D. A. Brian. 1996. The UCUAAAC promoter motif

is not required for high-frequency leader recombination in bovine coronavirus

defective interfering RNA. J. Virol. 70:2720-2729.

18. Chay, C. A., U. B. Gunasinge, S. P. Dinesh-Kumar, W. A. Miller, and S. M. Gray.

1996. Aphid transmission and systemic plant infection determinants of barley yellow

dwarf luteovirus-PAV are contained in the coat protein readthrough domain and 17-

kDa protein, respectively. Virology. 219:57-65.

19. Chen, M. H., and T. K. Frey. 1999. Mutagenic analysis of the 3' cis-acting elements

of the rubella virus genome. J Virol. 73:3386-403.

20. Cheong, C., G. Varani, and 1. Tinoco, Jr. 1990. Solution structure of an unusually

stable RNA hairpin, 5'GGAC(UUCG)GUCC. Nature. 346:680-2.

21. Costa, M., and F. Michel. 1997. Rules for RNA recognition of GNRA tetraloops

deduced by in vitro selection: comparison with in vivo evolution. Embo J. 16:3289-

302.

22. Deiman, B. A. L. M., R. M. Kortlever, and C. W. A. Pleij. 1997. The role of the

pseudoknot at the 3' end of tumip yellow mosaic virus RNA in minus-strand synthesis

by the viral RNA-dependent RNA polymerase. J.Virol. 71:5990-5996. 113

23. Di, R., S. P. Dinesh-Kumar, and W. A. Miller. 1993. Translational frameshifting by

barley yellow dwarf virus RNA (PAV serotype) in Escherichia coli and in eukaryotic

cell-free extracts. Molec.Plant-Microbe Interact. 6:444-452.

24. Dinesh-Kumar, S. P., V. Brault, and W. A. Miller. 1992. Precise mapping and in vitro

translation of a trifunctional subgenomic RNA of barley yellow dwarf virus.

Virology. 187:711-722.

25. Dinesh-Kumar, S. P., and W. A. Miller. 1993. Control of start codon choice on a

plant viral RNA encoding overlapping genes. Plant Cell. 5:679-692.

26. Dreher, T. W. 1999. Functions of the 3'-untranslated regions of positive strand RNA

viral genomes. Annu. Rev. Phytopathol. 37:151-174.

27. Duggal, R., and E. Wimmer. 1999. Genetic recombination of poliovirus in vitro and

in vivo; Temperature-dependent alteration of crossover sites. Virology. 258:30-41.

28. Felden, B., H. Himeno, A. Muto, J. McCutcheon, J. Atkins, and R. F. Gesteland.

1997. Probing the structure of the Escherichia coli lOSa RNA (tmRNA). RNA. 3:89-

103.

29. French, R., and P. Ahlquist. 1988. Characterization and engineering of sequences

controlling in vivo synthesis of brome mosaic virus subgenomic RNA. J. Virol.

62:2411-2420.

30. French, R., and P. Ahlquist. 1987. Intercistronic as well as terminal sequences are

required for efficient amplification of brome mosaic virus RNA3. J Virol. 61:1457-

65. 114

31. Gamamik, A. V., and R. Andino. 1998. Switch from translation to RNA replication in

a positive-stranded RNA vims. Genes Dev. 12:2293-304.

32. Gargouri, R., R. L. Joshi, J. F. Bol, S. Astier-Manifacier, and A.-L. Haenni. 1989.

Mechanism of synthesis of turnip yeUow mosaic virus coat protein subgenomic RNA

in vivo. Virology. 171:386-393.

33. Gildow, F. E. 1999. Luteoviras transmission and mechanisms regulating vector

specificity, p. 88-113. In H. G. Smith and H. Barker (ed.). The Luteoviridae. CABI

Publishing, Wallingford, UK.

34. Gluck, A., Y. Endo, and I. G. Wool. 1992. Ribosomal RNA Identity Elements for

Ricin-A-Chain Recognition and Catalysis - Analysis with Tetraloop Mutants. J Mol

Biol. 226:411-424.

35. Goodwin, J. B., J. M. Skuzeski, and T. W. Dreher. 1997. Characterization of chimeric

turnip yellow mosaic virus genomes that are infectious in the absence of

aminoacylation. Virology. 230:113-24.

36. Goulden, M. G., G. P. Lomonossoff, J. W. Davies, and K. R. Wood. 1990. The

complete nucleotide sequence of PEB V RNA2 reveals the presence of a novel open

reading frame and provides insights into the structure of tobraviral subgenomic

promoters. Nucleic Acids Res. 18:4507-12.

37. Habili, N., and R. H. Symons. 1989. Evolutionary relationship between luteoviruses

and other RNA plant viruses based on sequence motifs in their putative RNA

polymerases and nucleic acid helicases. Nucleic Acids Res. 17:9543-9556. 115

38. Hagiwara, Y., V. Peremyslov, and V. V. Dolja. 1999. Regulation of closterovirus

gene expression examined by insertion of a self-processing reporter and by northern

hybridization. J. Virol. 73:7988-7993.

39. Haldeman-CahiU, R., J. A. Daros, and J. C. Carrington. 1998. Secondary structures in

the capsid protein coding sequence and 3' nontranslated region involved in

amplification of the tobacco etch virus genome. J Virol. 72:4072-9.

40. Harrison, G. P., M. S. Mayo, E. Hunter, and A. M. Lever. 1998. Pausing of reverse

transcriptase on retroviral RNA templates is influenced by secondary structures both

5' and 3' of the catalytic site. Nucleic Acids Res. 26:3433-42.

41. Havelda, Z., and J. Burgyan. 1995. 3' Terminal putative stem-loop structure required

for the accumulation of cymbidium ringspot viral RNA. Virology. 214:269-72.

42. Heus, H. A., and A. Pardi. 1991. Structural features that give rise to the unusual

stabihty of RNA hairpins containing GNRA loops. Science. 253:191-4.

43. Hewings, A. D., and C. E. Eastman. 1995. Epidemiology of barley yellow dwarf in

North America, p. 75-106. In C. J. D'Arcy and P. A. Burnett (ed.). Barley Yellow

Dwarf: 40 Years of Progress. APS Press, St. Paul.

44. Hilf, M. E., A. V. Karasev, H. R. Pappu, D. J. Gumpf, C. L. Niblett, and S. M.

Gamsey. 1995. Characterization of citrus tristeza virus subgenomic RNAs in infected

tissue. Virology. 208:576-582.

45. Hsue, B., and P. S. Masters. 1997. A bulged stem-loop structure in the 3' untranslated

region of the genome of the coronavirus mouse hepatitis virus is essential for

replication. J. Virol. 71:7567-7578. 116

46. Huang, P., and M. M. C. Lai. 1999. Polypyrimidine tract-binding protein binds to the

complementary strand of the mouse hepatitis virus 3' untranslated region, thereby

altering RNA conformation. J. Virol. 73:9110-9116.

47. Ishikawa, M., J. Diez, M. Restrepo-Hartwig, and P. Ahlquist. 1997. Yeast mutations

in multiple complementation groups inhibit brome mosaic virus RNA replication and

transcription and perturb regulated expression of the viral polymerase-like gene. Proc

Natl Acad Sci USA. 94:13810-13815.

48. Ito, T., and M. M. Lai. 1997. Determination of the secondary structure of and cellular

binding to the 3'-untranslated region of the Hepatitis C Vims RNA genome. J. Virol.

71:8698-8706.

49. Jaeger, L., F. Michel, and E. Westhof. 1994. Involvement of a GNRA tetraloop in

long-range RNA tertiary interactions. J Mol Biol. 236:1271-6.

50. Jaspars, E. M. 1998. A core promoter hairpin is essential for subgenomic RNA

synthesis in alfalfa mosaic alfamovirus and is conserved in other Bromoviridae. Virus

Genes. 17:233-42.

51. Jaspars, E. M. J. 1999. Genome activation in alfamo- and ilarviruses. Arch. Virol.

144:843-863.

52. Johnston, J. C., and D. M. Rochon. 1995. Deletion analysis of the promoter for the

cucumber necrosis virus 0.9-kb subgenomic RNA. Virology. 214:100-109.

53. Kelly, L., W. L. Gerlach, and P. M. Waterhouse. 1994. Characterisation of the

subgenomic RNAs of an Australian isolate of barley yellow dwarf luteovirus.

Virology. 202:565-573. 117

54. Kim, K.-H., and C. Hemenway. 1996. The 5' nontranslated region of potato virus X

RNA affects both genoniic and subgenomic RNA synthesis. J. Virol. 70:5533-5540.

55. Kim, K. H., and C. L. Hemenway. 1999. Long-distance RNA-RNA interactions and

conserved sequence elements affect potato virus X plus-strand RNA accumulation.

RNA. 5:636-645.

56. Klarmarm, G. J., C. A. Schauber, and B. D. Preston. 1993. Template-directed pausing

of DNA synthesis by HIV-1 reverse transcriptase during polymerization of HTV-l

sequences in vitro. J Biol Chem. 268:9793-802.

57. Klovins, J., V. Berzins, and J. van Duin. 1998. A long-range interaction in Qbeta

RNA that bridges the thousand nucleotides between the M-site and the 3' end is

required for replication. Rna. 4:948-57.

58. Koev, G., B. R. Mohan, S. P. Dinesh-Kumar, K. A. Torbert, D. A. Somers, and W. A.

Miller. 1998. Extreme reduction of disease in oats tranformed with the 5' half of the

barley yellow dwarf virus-PAV genome. Phytopathology. 88:1013-1019.

59. Koev, G., B. R. Mohan, and W. A. Miller. 1999. Primary and secondary structural

elements required for synthesis of barley yellow dwarf virus subgenomic RNAl. J.

Virol. 73:2876-2885.

60. Koonin, E. V., and V. V. Dolja. 1993. Evolution and taxonomy of positive-strand

RNA viruses: implications of comparative analysis of amino acid sequences. Critic.

Rev. Biochem. Molec. Biol. 28:375-430. 118

61. Lahser, F. C., L. E. Marsh, and T. C. Hall. 1993. Contributions of the brome mosaic

virus RNA-3 3'-nontranslated region to replication and translation. J Virol. 67:3295-

3303.

62. Lai, M. M., R. S. Baric, P. R. Brayton, and S. A. Stohlman. 1984. Characterization of

leader RNA sequences on the virion and mRNAs of mouse hepatitis virus, a

cytoplasmic RNA virus. Proc Natl Acad Sci USA. 81:3626-30.

63. Lai, M. M., C. D. Patton, R. S. Baric, and S. A. Stohlman. 1983. Presence of leader

sequences in the n[iRNA of mouse hepatitis virus. J Virol. 46:1027-33.

64. Lai, M. M. C. 1998. Cellular factors in the transcription and replication of viral RNA

genomes: A parallel to DNA-dependent RNA transcription. Virology. 244:1-12.

65. Lai, M. M. C. 1996. Recombination in large RNA viruses: Coronaviruses. Semin.

ViroL 7:381-388.

66. Lai, M. M. C., and D. Cavanagh. 1997. The molecular biology of coronaviruses. Adv.

Virus Res.: 1-100.

67. Landt, O., H. P. Grunert, and U. Hahn. 1990. A general method for rapid site-directed

mutagenesis using the polymerase chain reaction. Gene. 96:125-8.

68. Lauber, E., H. Guilley, K. Richards, G. Jonard, and D. Gilmer. 1997. Conformation

of the 3'-end of beet necrotic yellow vein benevirus RNA 3 analysed by chemical and

enzymatic probing and mutagenesis. Nucleic Acids Res. 25:4723-9.

69. Lehto, K., G. L. Grantham, and W. O. Dawson. 1990. Insertion of sequences

containing the coat protein subgenomic RNA promoter and leader in front of the 119

tobacco mosaic virus 30K ORF delays its expression and causes defective cell-to-cell

movement. Virology. 174:145-157.

70. Levis, R., S. Schdesinger, and H. V. Huang. 1990. Promoter for Sindbis virus RNA-

dependent subgenomic RNA transcription. J. Virol. 64:1726-1733.

71. Li, H. P., P. Huang, S. Park, and M. M. Lai. 1999. Polypyrimidine tract-binding

protein binds to the leader RNA of mouse hepatitis virus and serves as a regulator of

viral transcription. J Virol. 73:772-7.

72. Li, H. P., X. Zhang, R. Duncan, L. Comai, and M. M. Lai. 1997. Heterogeneous

nuclear ribonucleoprotein A1 binds to the transcription- regulatory region of mouse

hepatitis virus RNA. Proc Natl Acad Sci USA. 94:9544-9.

73. Lin, Y.-J., X. Zhang, R.-C. Wu, and M. M. C. Lai. 1996. The 3" untranslated region of

coronavirus RNA is required for subgenomic mRNA transcription from a defective

interfering RNA. J. Vkol. 70:7236-7240.

74. Maia, L G., K. Seron, A.-L. Haenni, and F. Beraardi. 1996. Gene expression from

viral RNA genomes. Plant Molec. Biol. 32:367-391.

75. Mans, R. M. W., C. W. A. Pleij, and L. Bosch. 1991. Transfer RNA-Like Structures -

Structure, Function and Evoludonary Significance. Eur J Biochem. 201:303-324.

76. Marsh, L. E., T. W. Dreher, and T. C. Hall. 1988. Mutational analysis of the core and

modulator sequences of the BMV RNA3 subgenomic promoter. Nucleic Acids Res.

16:981-995. 77. Mathews, D. H., J. Sabina, M. Zuker, and D. H. Turner. 1999. Expanded sequence

dependence of thermodynamic parameters provides robust prediction of RNA

secondary structure. J. Mol. Biol. 288:911-940.

78. Mayo, M. A., and C. J. D'Arcy. 1999. Family Luteoviridae-. a reclassification of

luteoviruses, p. 15-22. In H. G. Smith and H. Barker (ed.). The Luteoviridae. CABI

Publishing, Wallingford, UK.

79. McKnight, K. L., and S. M. Lemon. 1998. The rhinovirus type 14 genome contains an

internally located RNA structure that is required for viral replication. RNA. 4:1569-

84.

80. Melchers, W. J., J. G. Hoenderop, H. J. Bruins Slot, C. W. Pleij, E. V. Pilipenko, V. I.

Agol, and J. M. Galama. 1997. Kissing of the two predominant hairpin loops in the

coxsackie B virus 3' untranslated region is the essential structural feature of the origin

of replication required for negative-strand RNA synthesis. J Virol. 71:686-96.

81. Miller, E. D., K. H. Kim, and C. Hemenway. 1999. Restoration of a stem-loop

structure required for potato virus X RNA accumulation indicates selection for a

mismatch and a GNRA tetraloop. Virology. 260:342-53.

82. Miller, E. D., C. A. Plante, K. H. Kim, J. W. Brown, and C. L. Hemenway. 1998.

Stem-loop strucmre in the 5' region of potato virus X genome required for plus-strand

RNA accumulation. J. Mol. Biol. 284:591-608.

83. Miller, J. S., and M. A. Mayo. 1991. The location of the 5' end of the potato leafroU

luteovirus subgenomic coat protein mRNA. J. Gen. Virol. 72:2633-2638. 84. Miller, W. A. 1999. Luteoviridae, p. In press. In R. G. Webster and A. Granoff (ed.).

Encyclopedia of Virology, Second ed. Academic Press, London.

85. Miller, W. A. 1998. Luteoviridae. In R. G. Webster and A. Granoff (ed.).

Encyclopedia of virology. Academic Press, London, UK.

86. Miller, W. A. 1994. Luteoviruses, p. 792-798. In R. G. Webster and A. Granoff (ed.),

Encylopedia of Virology, vol. 2. Academic Press, London.

87. Miller, W. A., C. M. Brown, and S. Wang. 1997. New punctuation for the genetic

code: luteovirus gene expression. Seminars in Virology. 8:3-13.

88. Miller, W. A., J. J. Bujarski, T. W. Dreher, and T. C. Hall. 1986. Minus-strand

initiation by the brome mosaic virus replicase within the 3' tRNA-like structure of

native and modified RNA templates. J. Mol. Biol. 187:537-546.

89. Miller, W. A., S. P. Dinesh-Kumar, and C. P. Paul. 1995. Luteovirus gene expression.

Critic Rev Plant Sci. 14:179-211.

90. Miller, W. A., T. W. Dreher, and T. C. Hall. 1985. Synthesis of brome mosaic virus

subgenomic RNA in vitro by internal initiation on (-) sense genomic RNA. Nature.

313:68-70.

91. Miller, W. A., G. Koev, and B. R. Mohan. 1997. Are there risks associated with

transgenic resistance to luteoviruses? Plant Disease. 81:700-710.

92. Miller, W. A., and L. Rasochova. 1997. Barley yellow dwarf viruses. Annu. Rev.

Phytopathol. 35:167-190. 122

93. Miller, W. A., and S. L. Silver. 1991. Alternative tertiary structure attenuates self-

cleavage of the ribozyme in the satellite RNA of barley yellow dwarf virus. Nucleic

Acids Res. 19:5313-5320.

94. Miller, W. A., P. M. Waterhouse, and W. L. Gerlach. 1988. Sequence and

organisation of barley yellow dwarf virus genomic RNA. Nucl. Acids Res. 16:6097-

6111.

95. Mirmomeni, M. H., P. J. Hughes, and G. Stanway. 1997. An RNA tertiary structure in

the 3' untranslated region of enteroviruses is necessary for efficient replication. J

Virol. 71:2363-70.

96. Mohan, B. R., S. P. Dinesh-Kumar, and W. A. Miller. 1995. Genes and cw-acting

sequences involved in replication of barley yellow dwarf virus-PAV RNA. Virology.

212:186-195.

97. Murphy, F. L., and T. R. Cech. 1994. Gaaa Tetraloop and Conserved Bulge Stabilize

Tertiary Structure of a Group I Intron Domain. J Mol Biol. 236:49-63.

98. Nagy, P. D., and J. J. Bujarski. 1993. Targeting the site of RNA-RNA recombination

in brome mosaic virus with antisense sequences. Proc. Nad. Acad. Sci. USA.

90:6390-6394.

99. Nagy, P. D., C. Zhang, and A. E. Simon. 1998. Dissecting RNA recombination in

vitro: role of RNA sequences and the viral replicase. EMBO J. 17:2392-2403.

100. Nass, P. H., L. L. Domier, B. P. Jakstys, and C. J. D'Arcy. 1998. In situ localization

of barley yeUow dwarf virus-PAV 17 kDa protein and nucleic acids in oats.

Phytopathology. 88:1031-1039. 101. Navas-Castillo, J., M. R. Aibiach-Marti, S. Gowda, M. E. Hilf, S. M. Gamsey, and

W. O. Dawson. 1997. Kinetics of accumulation of citrus tristeza virus RNAs.

Virology. 228:92-97.

102. Novak, J. E., and K. Kirkegaard. 1994. Coupling between genome translation and

replication in an RNA virus. Genes Dev. 8:1726-1737.

103. Olsthoom, R. C. L., S. Mertens, F. T. Brederode, and J. F. Bol. 1999. A

conformational switch at the 3' end of a plant virus RNA regulates viral replication.

EMBO J. 18:4856-4864.

104. Parsley, T. B., J. S. Towner, L. B. Blyn, E. Ehrenfeld, and B. L. Semler. 1997. Poly

(rC) binding protein 2 forms a ternary complex with the 5'- terminal sequences of

poliovirus RNA and the viral 3CD proteinase. Rna. 3:1124-34.

105. Paul, A. v., J. H. van Boom, D. Filippov, and E. Wimmer. 1998. Protein-primed

RNA synthesis by purified poliovirus RNA polymerase. Nature. 393:280-4.

106. Pilipenko, E. V., K. V. Poperechny, S. V. Maslova, W. J. Melchers, H. J. Slot, and V.

I. Agol. 1996. Cis-element, oriR, involved in the initiation of (-) strand poliovirus

RNA: a quasi-globular multi-domain RNA structure maintained by tertiary ('kissing')

interactions. Embo J. 15:5428-36.

107. Piatt, T. 1981. Termination of transcription and its regulation in the tryptophan

operon of E. coli. Cell. 24:10-23.

108. Pley, H. W., K. M. Flaherty, and D. B. McKay. 1994. Model for an RNA tertiary

interaction from the structure of an intennolecular complex between a GAAA

tetraloop and an RNA helix. Nature. 372:111-113. 124

109. Pogue, G. P., and T. C. Hall. 1992. The requirement for a 5' stem-loop strucmre in

brome mosaic virus replication supports a new model for viral positive-stamd RNA

initiation. J. Virol. 66:674-684.

110. Proutski, V., E. A. Gould, and E. C. Holmes. 1997. Secondary structure of the 3'

untranslated region of flaviviruses: similarities and differences. Nucleic Acids Res.

25:1194-202.

111. Rao, A. L., T. W. Dreher, L. E. Marsh, andT. C. Hall. 1989. Telomeric function of

the tRNA-like structure of brome mosaic virus RNA. Proc Natl Acad Sci USA.

86:5335-9.

112. Ray, D., and K. A. White. 1999. Enhancer-like properties of an RNA element that

modulates Tombusvirus RNA accumulation. Virology. 256:162-71.

113. Rohll, J. B., D. H. Moon, D. J. Evans, and J. W, Almond. 1995. The 3' untranslated

region of picomavirus RNA: Features required for efficient genome replication.

Journal of Virology. 69:7835-7844.

114. Samow, P. (ed.). 1995. Cap-independent translation, vol. 203. Springer, Berlin.

115. Samow, P. 1989. Role of 3'-end sequences in infectivity of poliovirus transcripts

made in vitro. J Virol. 63:467-70.

116. Sawicki, S. G., and D. L. Sawicki. 1990. Coronavirus transcription: subgenomic

mouse hepatitis vims replicative intermediates function in RNA synthesis. J Virol.

64:1050-6. 125

117. Schaad, M. C., P. E. Jensen, and J. C. Carrington. 1997. Formation of plant RNA

virus replication complexes on membranes: role of an endoplasmic reticulum-targeted

viral protein. Embo J. 16:4049-59.

118. Schmitz, J., C. Stussi-Garaud, E. Tacke, D. Prufer, W. Rohde, and O. Rohfritsch.

1997. In sim localization of the putative movement protein (prl7) from potato leafroll

luteovirus (PLRV) in infected and transgenic potato plants. Virology. 235:311-22.

119. Seeley, K. A., D. H. Byrne, and J. T. Colbert. 1992. Red light-independent instability

of oat phytochrome mRNA in vivo. Plant CeU. 4:29-38.

120. Sethna, P. B., S. L. Hung, and D. A. Brian. 1989. Coronavirus subgenomic minus-

strand RNAs and the potential for mRNA replicons. Proc Nad Acad Sci USA.

86:5626-30.

121. Shivprasad, S., G. P. Pogue, D. J. Lewandowski, J. Hidalgo, J. Donson, L. K. Grill,

and W. O. Dawson. 1999. Heterologous sequences gready affect foreign gene

expression in tobacco mosaic virus-based vectors. Virology. 255:312-323.

122. Siegel, R. W., S. Adkins, and C. C. Kao. 1997. Sequence-specific recognition of a

subgenomic RNA promoter by a viral RNA polymerase. Proc. Natl. Acad. Sci. USA.

94:11238-11243.

123. Singh, R. N., and T. W. Dreher. 1998. Specific site selection in RNA resulting from a

combination of nonspecific secondary structure and -CCR- boxes: initiation of minus

strand synthesis by turnip yellow mosaic virus RNA-dependent RNA polymerase.

Rna. 4:1083-95. 126

124. Sit, T. L., A. A. Vaewhongs, and S. A. Lommel. 1998. RNA-mediated transactivadon

of transcription from a viral RNA. Science. 281:829-832.

125. Song, C. Z., and A. E. Simon. 1995. Requirement of a 3'-terminal stem-loop in in

vitro transcription by an RNA-dependent RNA polymerase. J.Mol.BioL

254:6-14.

126. Tacke, E., D. Prufer, F. Salamini, and W. Rohde. 1990. Characterization of a potato

leafroll luteovirus subgenomic RNA: differential expression by internal translation

initiation and UAG suppression. J. Gen. Virol. 71:2265-2272.

127. Takamatsu, N., Y. Watanabe, T. Meshi, and Y. Okada. 1990. Mutational analysis of

the pseudoknot region in the 3' noncoding region of tobacco mosaic virus RNA. J

Virol. 64:3686-93.

128. Todd, S., J. S. Towner, D. M. Brown, and B. L. Semler. 1997. Replication-competent

picomaviruses with complete genomic RNA 3' noncoding region deletions. J. Virol.

71:8868-8874.

129. Tuerk, C., P. Gauss, C. Thermes, D. R. Groebe, M. Gayle, N. Guild, G. Stormo, Y.

d'Aubenton-Carafa, O. C. Uhlenbeck, I. Tinoco, Jr., and et al. 1988. CUUCGG

hairpins: extraordinarily stable RNA secondary structures associated with various

biochemical processes. Proc Natl Acad Sci USA. 85:1364-8.

130. Turner, R. L., and K. W. Buck. 1999. Mutational analysis of cis-acting sequences in

the 3'- and 5'- untranslated regions of RNA2 of red clover necrotic mosaic virus.

Virology. 253:115-24. 131. van der Kuyl, A. C., K. Langereis, C. J. Houwing, E. M. Jaspars, and J. F. Bol. 1990.

czj-acting elements involved in replication of alfalfa mosaic virus RNAs in vitro.

Virology. 253:327-336.

132. van der Vossen, E. A., T. Notenboom, and J. F. Bol. 1995. Characterization of

sequences controUing the synthesis of alfalfa mosaic virus subgenomic RNA in vivo.

Virology. 212:663-672.

133. Van Marie, G., W. Luytjes, R. G. van der Most, T. van der Straaten, and W. J. Spaan.

1995. Regulation of coronavirus mRNA transcription. J Virol. 69:7851-6.

134. van Rossum, C. M. A., F. T. Brederode, L. Neeieman, and J. F. Bol. 1997. Functional

equivalence of common and unique sequences in the 3' untranslated regions of alfalfa

mosaic vims RNAs 1, 2, and 3. J. Virol. 71:3811-3816.

135. Varani, G., C. Cheong, and I. Tinoco, Jr. 1991. Structure of an unusually stable RNA

hairpin. Biochemistry. 30:3280-9.

136. Verchot, J., S. M. Angell, and D. C. Baulcombe. 1998. In vivo translation of the triple

gene block of potato virus X requires two subgenomic mRNAs. J Virol. 72:8316-20.

137. Vlassov, V. V., G. Zuber, B. Felden, J. P. Behr, and R. Giege. 1995. Cleavage of

tRNA with imidazole and spermine imidazole constructs: A new approach for

probing RNA structure. Nucleic Acids Res. 23:3161-3167.

138. Wang, J., J. M. Bakkers, J. M. Galama, H. J. Bruins Slot, E. V. PiUpenko, V. I. Agol,

and W. J. Melchers. 1999. Structural requirements of the higher order RNA kissing

element in the enteroviral 3'UTR. Nucleic Acids Res. 27:485-90. 128

139. Wang, J., C. D. Carpenter, and A. E. Simon. 1999. Minimal sequence and structural

requirements of a subgenomic RNA promoter for turnip crinkle virus. Virology.

253:327-336.

140. Wang, J., and A. E. Simon. 1997. Analysis of the two subgenomic RNA promoters

for turnip crinkle virus in vivo and in vitro. Virology. 232:174-186.

141. Wang, S., L. Guo, E. Allen, and W. A. Miller. 1999. A potential mechanism for

selective control of cap-independent translation by a viral RNA sequence in cis and in

trans. RNA. 5:728-738.

142. Wang, S., and W. A. Miller. 1995. A sequence located 4.5 to 5 kilobases from the 5'

end of the barley yellow dwarf virus (PAV) genome strongly stimulates translation of

uncapped mRNA. J Biol Chem. 270:13446-13452.

143. Wang, S. P., K. S. Browning, and W. A. Miller. 1997. A viral sequence in the 3'

untranslated region mimics a 5' cap in facilitating translation of uncapped mRNA.

EMBO J. 16:4107-4116.

144. Weiland, J. J., and T. W. Dreher. 1993. Cis-preferential replication of the turnip

yellow mosaic virus RNA genome. Proc. Natl. Acad. Sci. USA. 90:6095-6099.

145. Weiner, A. M., and N. Maizels. 1987. tRNA-like structures tag the 3' ends of genomic

RNA molecules for replication: impHcations for the origin of protein synthesis. Proc

Natl Acad Sci USA. 84:7383-7.

146. Williams, G. D., R. Y. Chang, and D. A. Brian. 1999. A phylogenetically conserved

hairpin-type 3' untranslated region pseudoknot functions in coronavirus RNA

replication. J Virol. 73:8349-55. 147. Woese, C. R., S. Winker, and R. R. Gutell. 1990. Architecture of ribosomal RNA:

constraints on the sequence of "tetra- loops". Proc Natl Acad Sci USA. 87:8467-71.

148. Yu, H., C. W. Grassmann, and S. E. Behrens. 1999. Sequence and structural elements

at the 3' terminus of bovine viral diarrhea vims genomic RNA: functional role during

RNA replication. J Virol. 73:3638-48.

149. Zavriev, S. K., C. M. Hickey, and S. A. Lommel. 1996. Mapping of the red clover

necrotic mosaic virus subgenomic RNA. Virology. 216:407-410.

150. Zhang, G., V. Slowinski, and K. A. White. 1999. Subgenomic mRNA regulation by a

distal RNA element in a (+)-strand RNA virus. RNA. 5:550-561.

151. Zuker, M. 1989. On finding all suboptimal foldings of an RNA molecule. Science.

244:48-52.

152. Zuker, M., D. H. Mathews, and D. H. Turner. 1999. Algorithms and thermodynamics

for RNA secondary structure prediction: A practical guide, /n B. J. and B. F. C. Clark

(ed.), RNA biochemistry and biotechnology. Klluwer Academic Publishers. 130

ACKNOWLEDGMENTS

I would like to thank people who have helped me during my graduate career. First, very special thanks to my major professor Dr. W. Allen Miller for his excellent advise, help, guidance, and patience. For all these years he has been a great source of ideas and the standard of creativity to measure up to. I am very grateful to the lab manager Randy Beckett for his help with the laboratory techniques and software and for keeping our lab up and running. Thanks to all people who worked in the lab during my training for their advise and stimulating discussions: B.R. Mohan, Jennifer Barry, Liang Guo, Shanping Wang, Lada

Rasochova, Mark Stevens, Cindy Paul, Chris Brown, Michelle Aulik, Michael Walter, Sang

Dc Song, and Ed Allen. I appreciate the help and guidance from the members of my POS committee; Dr. Susan Carpenter, Dr. John Hill, Dr. Thomas Baum, and Dr. Bryony Bonning.

Thanks to my family, especially to my wife Victoria for support.