Iowa State University Capstones, Theses and Graduate Theses and Dissertations Dissertations

2020

Decoding the initiation mechanism of chlorotic mottle

Elizabeth Jacqueline Carino Iowa State University

Follow this and additional works at: https://lib.dr.iastate.edu/etd

Recommended Citation Carino, Elizabeth Jacqueline, "Decoding the translation initiation mechanism of maize chlorotic mottle virus" (2020). Graduate Theses and Dissertations. 17962. https://lib.dr.iastate.edu/etd/17962

This Thesis is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. Decoding the translation initiation mechanism of maize chlorotic mottle virus

by

Elizabeth Jacqueline Carino

A dissertation submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Major: Genetics and Genomics

Program of Study Committee: Wyatt A. Miller, Major Professor Steven Whitham Thomas Lubberstedt Walter Moss Bing Yang

The student author, whose presentation of the scholarship herein was approved by the program of study committee, is solely responsible for the content of this dissertation. The Graduate College will ensure this dissertation is globally accessible and will not permit alterations after a degree is conferred.

Iowa State University

Ames, Iowa

2020

Copyright © Elizabeth Jacqueline Carino, 2020. All rights reserved.

ii

DEDICATION

I want to dedicate this thesis to all I hold dear: friends, family, mentors, and colleagues.

During my scientific journey at Iowa State University, I had the privilege to cultivate long-

lasting friendships from diverse academic fields from bioinformatics, agronomy, business,

electrical engineering, physics, chemistry, mechanical engineering, graphic design, genetics and

genomics, plant pathology and microbiology, kinesiology, mathematics, chemical engineering,

psychology, to the department of education and many others fields. Thank you! I am grateful for

all your unwavering support through good times and rough times. Thank you all for being a

marvelous council of friends.

To my mom: thank you for supplying me with the building blocks I needed to build up my strength and resilience. Unfortunately, you didn’t have the chance to see the final product of this dissertation as your end arrived abruptly during the first stages of my dissertation journey.

Nonetheless, I dedicate this thesis to you. I miss you. ¡Lo logramos!

To my GMAP people: I would like to thank all of the graduate college staff and the graduate student volunteers who served with me in the GMAP research symposium committee during my four years of service. It was inspiring to be surrounded by brilliant minds who were willing to showcase and support the greatness of others.

To all the unsung graduate student heroes with whom I have the opportunity to collaborate and work towards making a greater impact on the ISU campus and the Ames community, this dissertation also goes to you. I look forward to the contributions in your respective fields and society. iii

TABLE OF CONTENTS

Page

LIST OF FIGURES ...... vi

LIST OF TABLES ...... ix

ACKNOWLEDGMENTS ...... x

ABSTRACT ...... xi

CHAPTER 1: LITERATURE REVIEW ...... 1 Preface ...... 1 Family ...... 1 Tombusviridae Family- Brief History ...... 1 General Overview ...... 4 Virus Replication ...... 7 Cap-Independent Translation Elements (CITEs) ...... 14 Cap-independent translation elements classification ...... 15 Maize Lethal Necrosis (MLND) ...... 19 What causes Maize Lethal Necrosis Disease? ...... 19 From its origins until now ...... 21 Economic Impact ...... 21 Detection and Prevention ...... 22 Maize Chlorotic Mottle Virus ...... 22 Overview ...... 22 MCMV Symptoms ...... 23 MCMV transmission ...... 25 MCMV detection ...... 26 Molecular Information ...... 28 References ...... 31 Figures ...... 44 Tables ...... 62

CHAPTER 2: THE RNA OF MAIZE CHLOROTIC MOTTLE VIRUS - THE ESSENTIAL VIRUS IN MAIZE LETHAL NECROSIS DISEASE - IS TRANSLATED VIA A PANICUM -LIKE CAP-INDEPENDENT TRANSLATION ELEMENT ...... 67 Abstract ...... 67 Author Summary ...... 68 Introduction ...... 69 Results ...... 71 Mapping the 3’-cap-independent translation element in MCMV ...... 71 Determining the secondary structure of MCMV 3’-CITE ...... 73 Comparison of the MTE to PTEs: role of the pseudoknot ...... 74 The MTE binds eIF4E ...... 77 Secondary structure of the MCMV 5’-UTR ...... 79 iv

Long distance base pairing of the MTE to the 5’ UTR ...... 79 Effects of mutations on the translation of full-length MCMV genomic RNA ...... 81 Replication of mutant MCMV RNA ...... 82 Discussion ...... 84 Identification of the 3’CITE in MCMV genome ...... 84 Long-distance kissing stem-loop interactions ...... 86 Translation of sgRNA1 ...... 87 Efficient replication vs efficient translation? ...... 88 Toward host resistance to MCMV ...... 89 Materials and methods ...... 90 Plasmids...... 90 In vitro Transcription...... 90 In vitro Translation ...... 91 Translation in Protoplasts ...... 92 RNA Structure Probing ...... 92 Expression and purification of wheat eIF4E ...... 93 Electrophoretic mobility shift assay (EMSA) ...... 94 Inoculation of Black Mexican Sweet (BMS) protoplasts ...... 94 Plant Inoculations ...... 95 Dot blot and northern blot hybridizations ...... 95 RT-PCR ...... 96 Acknowledgments ...... 97 References ...... 97 Figures ...... 107 Tables ...... 124

CHAPTER 3: PANICUM MOSAIC LIKE 3'-CAP-INDEPENDENT TRANSLATION ENHANCERS AND EIF4E INTERACTIONS ...... 126 Abstract ...... 126 Introduction ...... 126 Eukaryotic 4E ...... 126 eIF4E and viruses ...... 127 Computational RNA- dynamics ...... 129 Results and Discussion ...... 130 Structure alignment of eIF4E ...... 130 Wheat eIF4E cap-binding site structural changes ...... 131 Maize eIF4E 3D structure prediction ...... 132 Triticum aestivum eIF4E-PEMV2 PTE simulation docking model ...... 136 Modeling of maize chlorotic mottle virus 3’-CITE interaction with eIF4E ...... 137 Wheat eIF4E mutation design ...... 139 GST-eIF4E fusion: Protein solubility problems ...... 141 Construction of maize cap-binding protein expression vectors ...... 147 Conclusion ...... 153 Materials and Methods ...... 157 References ...... 167 Figures ...... 175 Tables ...... 193 v

CHAPTER 4: GENERAL CONCLUSION ...... 201 References ...... 204 vi

LIST OF FIGURES

Page

Figure 1.1. Genome organization of the type species for each genus in the Tombusviridae family grouped based on sub-families ...... 44

Figure 1.2. Tobacco bushy stunt virus (TBSV) capsid architecture...... 45

Figure 1.3. Examples of capsid structure observed in Tombusviridae viruses...... 46

Figure 1.4. Maximum likelihood phylogenetics of RdRp from Tombusviridae viruses...... 47

Figure 1.5. Examples of Tombusviridae viruses targeting different membranous organelles for replication in the plant cell...... 48

Figure 1.6. Depiction of cap-dependent ...... 49

Figure 1.7. Depiction of cap-independent translation of tomato bushy stunt virus (TBSV)...... 50

Figure 1.8. 3'-Cap Independent Translation Elements identified in Tombusviridae viruses...... 51

Figure 1.9. Readthrough translation of the Carnation Italian Ringspot virus (CIRV) ...... 52

Figure 1.10. Pea enation mosaic virus RNA2 (PEMV2) programmed ribosomal -1 frameshift...... 53

Figure 1.11. Maize Lethal Necrosis Disease Global Distribution...... 54

Figure 1.12. Maize chlorotic mottle virus global distribution...... 54

Figure 1.13. Maize chlorotic mottle virus genome organization...... 55

Figure 1.14. Sequence alignment of MCMV complete genomes available in GeneBank...... 56

Figure 1.15. Maize chlorotic mottle virus (MCMV) phylogeny tree analysis...... 57

Figure 1.16. Alignment of maize chlorotic mottle virus (MCMV) 3'-CITE (MTE)...... 58

Figure 1.17. Maize chlorotic mottle virus predicted 3'-CITE (MTE) structures from different isolates...... 60

Figure 1.18. Maize chlorotic mottle virus 3'-CITE (MTE) structures from different isolates...... 61

Figure 2.1. In vitro translation of MCMV genomic RNA...... 107 vii

Figure 2.2. Mapping the sequence in MCMV 3’UTR required for cap-independent translation...... 108

Figure 2.3. SHAPE analysis of MTE structure from the MCM41 infectious clone...... 109

Figure 2.4. Alignment of known and predicted PTEs...... 110

Figure 2.5. Effect of MTE mutations in the potential pseudoknot interaction on translation. ... 111

Figure 2.6. Electrophoretic mobility shift assays (EMSA) of mutant uncapped MTE and PTE RNAs with eIF4E...... 112

Figure 2.7. SHAPE analysis of the MCMV 5’ UTR...... 113

Figure 2.8. Long-distance interaction between MCMV 5’ UTR and MTE...... 114

Figure 2.9. Long-distance interaction between sgRNA1 5’ UTR and MTE...... 115

Figure 2.10. Effects of mutations on translation and replication of full-length MCMV genome...... 116

Figure 2.11-S1. Luciferase translation assay including the 3’ end of coat protein-coding region...... 117

Figure 2.12-S2. Secondary structures of known and predicted PTEs...... 118

Figure 2.13-S3. Electrophoretic mobility shift assays (EMSA) of MTE and PTE capped RNA probes with eIF4E...... 119

Figure 2.14-S4. Non-linear regression graphs of EMSAs...... 120

Figure 2.15-S5. RT-PCR of plants inoculated with MCMV mutants...... 121

Figure 2.16-S6. SHAPE probing of mutant MTE sequence in MCM41 mutant constructs used for infectivity assays...... 122

Figure 2.17-S7. MTE SHAPE structures of selected mutants...... 123

Figure 3.1. Sequence alignment of plant eIF4E proteins...... 175

Figure 3.2. Triticum aestivum eIF4E (2IDR) 3D model...... 176

Figure 3.3. Wheat eIF4E conformational changes upon interaction with eIF4E simulation...... 176

Figure 3.4. Structure sequence alignment and superimposition of Triticum Aestivum and Pisum sativum eIF4E structures...... 177 viii

Figure 3.5.Simulated Zea mays eIF4E 3D protein structure...... 177

Figure 3.6. Vina dock log files and ligand conformations of wheat eIF4E and maize eIF4E. ... 178

Figure 3.7. Superimposition of eIF4E crystal structures of plant proteins and predicted Zea mays...... 179

Figure 3.8. Pea Enation mosaic virus 2 (PEMV2) 3'-CITE 3D model prediction...... 180

Figure 3.9. Secondary structures of known and predicted PTEs...... 181

Figure 3.10. Triticum aestivum eIF4E and PEMV2 protein-RNA docking simulation...... 182

Figure 3.11. Maize chlorotic mottle virus 3'CITE (MTE) 3D predicted model...... 183

Figure 3.12. Triticum aestivum eIF4E and maize chlorotic mottle virus 3'CITE (MTE) 3D predicted protein-RNA docking simulation...... 183

Figure 3.13. Zea mays eIF4E and maize chlorotic mottle virus 3'CITE (MTE) 3D predicted protein-RNA docking simulation...... 184

Figure 3.14. Sequence alignment of wheat eIF4E protein mutations...... 185

Figure 3.15. Visualization of mutations in Triticum aestivum (wheat) eIF4E...... 186

Figure 3.16. Expressing eIF4E mutants in Tuner (DE3) pLysS at 37°C...... 186

Figure 3.17. Troubleshooting eIF4E solubility using different BL21 expression competent cells...... 187

Figure 3.18. Optimization eIF4E solubility using different temperatures...... 188

Figure 3.19. IPTG fine-tuning to optimize eIF4E expression...... 189

Figure 3.20. Expression of eIF4E-pGEX using optimized conditions...... 190

Figure 3.21. Purification of wheat eIF4E-GST tag proteins...... 190

Figure 3.22. Purification of His-tag eIF4E wild type protein, nickel vs. cobalt purification. .... 191

Figure 3.23. Construction of Zea mays cap-binding proteins expression vectors...... 191

Figure 3.24. Induction examples of Zea mays cap-binding protein vectors...... 192 ix

LIST OF TABLES

Page

Table 1-1. Viruses of the family Tombusviridae for which reference genomes were available in GenBank as in January 2020...... 62

Table 1-2. Tombusviridae that use proximal (PRTE) and distal (DRTE) readthrough elements to activate readthrough translation of RNA-dependent RNA- polymerase associated proteins...... 65

Table 1-3. Summary of characterized 3’-cap-independent translation elements/enhancers ...... 66

Table 2-1. Summary replication symptoms of inoculated maize plants with MCMV mutants . 124

Table 2-2. Table of primers. List of primer sets used in this publication...... 125

Table 3-1. Summary of T. aestivum eIF4E mutations ...... 193

Table 3-2. Novagen competent cells genetic marker key ...... 194

Table 3-3. Protein expression solubility summary results ...... 195

Table 3-4. Summary of maize eIF4E search results ...... 196

Table 3-5. Protein-RNA summary interactions from Electrophoretic mobility shift (EMSA) .. 197

Table 3-6. Identification of homologous amino acid residues reported conferring virus resistance on eIF4E ...... 198

Table 3-7. List of primers ...... 199

x

ACKNOWLEDGMENTS

I would like to thank my committee chair, W. Allen Miller, and my committee members,

Steven Whitham, Walter Moss, Bing Yang, and Thomas Lubberteadt, for their guidance and support throughout the course of this research.

Also, I would like to thank my friends, colleagues, department faculty, and staff for making my time at Iowa State University an enjoyable experience. I would also like to thank all my people from GMAP, LGSA (Latin@ Graduate Student Association), PLPM-GSO (Plant

Pathology and Microbiology Graduate Student Organization), G3 (Graduate Genetics and

Genomics group), GPSS (Graduate and Professional Student Senate), and TSC for making my experience at Iowa State University wholesome. xi

ABSTRACT

Maize chlorotic mottle virus (MCMV) is the key player of Maize Lethal Necrosis

Disease (MLND). MLND is caused by the co-infection and synergistic interaction of MCMV and any that infects grasses. Recent outbreaks of MLND have ravaged maize fields in

Kenya and neighboring regions of East Africa. The catastrophic economic losses brought back the interest of stakeholders to learn more about the synergistic interaction between MCMV and and find ways to use this knowledge to create MLND resistant lines.

MCMV is a positive-sense single-stranded RNA virus from the Tombusviridae family.

Similarly to other members of the Tombusviridae family, MCMV does not have a 5’-cap or a poly (A) tail and must use alternative mechanisms to translate viral proteins. The lack of a 5’-cap in the viral RNA does not prevent it from interacting with host translation factors. Instead of a

5’-cap, most tombusvirids are known to employ unconventional mechanisms to sequester translation factors to the viral RNA. One mechanism of recruiting host’s translation factors is 3’- cap-independent translation elements (3’-CITEs). 3’-CITEs are secondary structures located at the 3’-end of the virus genome used to recruit the translation initiation machinery. The work presented in this dissertation revolves around finding out the translation initiation control elements of MCMV RNA.

The unearthing of MCMV’s translation initiation process began with a series of deletions at the 3’-end of the virus genome. The genome sequence mapping indicated that there was an essential sequence for virus translation between nucleotides 4164 to 4333. Structural RNA probing of this region showed the presence of a panicum mosaic virus 3-CITE (PTE) -like structure located in the 3’-untranslated region (UTR) of the viral RNA. Similar to other PTEs, the MCMV 3’-CITE secondary structure was composed of a hammer-like helix structure that xii

branched out into two side loops connected by a pyrimidine bridge. On the main stem of the

hammer-like structure is a single-stranded bulge that is hyper-modified by SHAPE probing

reagents in the presence of magnesium. PTEs are characterized by a pyrimidine rich bridge

composed of cytosines and a purine-rich bulge that interacts with each other forming a

pseudoknot. In contrary to most PTEs, the MTE was predicted to have a weak pseudoknot.

The PTE C-G pseudoknot formation enables the virus to interact with the cap-binding pocket of eIF4E. Although the establishment of a suitable pseudoknot between the C-G domains of MTE is questionable, MTE interacts with initiation factor 4E. Similar to other 3’CITEs, the

MTE used long-distance base-pairing to bring the translation machinery to the 5’ end. The eIF4E-MTE RNA-protein interaction model was investigated by mutating both the RNA and the protein. Comparison of electrophoretic mobility shift assays (EMSA) results using mutated eIF4E with other PTE-like structure indicated that even though MCMV interact with eIF4E, it might use a mechanism that has yet to be characterized.

1

CHAPTER 1: LITERATURE REVIEW

Preface

The work discussed in this dissertation revolved around the decryption of the translation

initiation mechanism of maize chlorotic mottle virus. Maize chlorotic mottle virus (MCMV) is a single-stranded positive RNA virus from the Machlomovirus genus within the Tombusviridae family. Abiotic factors and synergistic interaction with other viruses often influence the magnitude of crop damage caused by this virus. For instance, when a potyvirus and MCMV co- infect susceptible crops, it results in substantial economic losses. Since the work discussed in the next chapters focused on experimental assays used to decode MCMV’s translation mechanism, the goal of this chapter is to provide the reader with background information that would allow him or her to understand the contribution of this dissertation to the scientific community at large.

This chapter walks you through the history of the Tombusviridae family and some of the general molecular characteristics of this virus family. It also provides you with a brief introduction to cap-independent translation elements. It describes the overall impacts of the synergistic interaction between potyviruses and MCMV that result in Maize Lethal Necrosis

Disease (MLND). Finally, it provides a general overview of maize chlorotic mottle virus from its play symptoms and transmission to molecular information.

Tombusviridae Family

Tombusviridae Family- Brief History

For many years, viruses that are now classified under the Tombusviridae family umbrella lacked a proper family classification. In 1971, sixteen groups of plant viruses were described by the 1966-1970 Plant Virus Subcommittee of the International Committee on Nomenclature of

Viruses (ICNV or currently known as ICTV) (Harrison et al., 1971) and the tombusvirus group 2 was among the sixteen virus groups proposed that year. However, the tombusvirus group classification was not legitimized by ICNV until 1975 at the Third International Congress for

Virology in Madrid after it was suggested that this grouping system was now commonly used among scientists (Fenner and Maurin, 1976). By 1979, only about 5 species of viruses composed the tombusvirus group (Matthews, 1979). It was not until 1993 that the Tombusviridae family was created by the ICTV (Mayo and Martelli, 1993).

Over the years, the Tombusviridae family grew and acquired new genera as new viruses were found, and more viral features were identified. In 1993, the Tombusviridae consisted of only two genera: Tombusvirus and Carmovirus. Then, in 1996 three more genera, Dianthovirus,

Machlomovirus, and Necrovirus (which already existed as their own separate group) were moved into the Tombusviridae family (Van Regenmortel et al., 2000). Subsequently, in 1998 three new genera, Aureusvirus, Avenavirus, and Panicovirus, were proposed and included in the

Tombusviridae family (Pringle, 1998). In 2012, the Necrovirus genus was divided into two new genera, Alphanecrovirus and Betanecrovirus, based on phylogenetic analysis of the polymerase and movement proteins 1 and 2 of the seven known virus species at the time (Adams et al., 2013; Rochon, 2011d). In the same year, three new genera, Zeavirus, Macanavirus, and

Gallantivirus, were created to classify previously unclassified viruses and new proposed viral species (Rochon, 2011a, b, c). Two years later, in 2013, the Umbravirus genus was moved into the Tombusviridae family, accrediting the phylogenetic results of the RNA-dependent RNA polymerase (RdRp) (Rochon et al., 2013). One of the last few new genera created was

Pelarspovirus; when this genus was proposed, it was not supported because it falls within a monophyletic lineage of the Carmovirus genus (Scheets et al., 2014). However, after the

Carmovirus genus was further analyzed, it was decided to divide this genus into three and accept 3

the previously proposed Pelarspovirus as a new genus of the Tombusviridae family (Scheets et

al., 2015). The new divided Carmovirus genus gave birth to the latest three genera of

Alphacarmovirus, Betacarmovirus, and Gammacarmovirus. As of 2018, the Tombusviridae

family is composed of 76 species of viruses allocated among 16 different genera (Table 1-1,

Figure 1.1).

Over 47 years, the now Tombusviridae family went from being a group of virus species

to a virus family composed of diverse genera which shared morphological, structural, and

molecular features. In 2018, a new layer of organization was added to this virus family when the

creation of three sub-families was proposed (Scheets et al., 2018). The three new proposed

subfamilies were Procedovirinae, Regressovirinae, and Calvusvirinae. Due to the diversity of

the viruses added to the Tombusviridae family over almost five decades, the new sub-family

categories were created to organize viruses based on the virus mechanism to produce RdRp

(Scheets et al., 2018). The members of the Tombusviridae family produce their RdRp by mainly two mechanisms: 1) using readthrough of the amber (UAG) codon to synthesize RdRp (Russo et

al., 1994), and 2) using frameshift to synthesize RdRp (Rochon et al., 2012). The sub-family

Procedovirinae includes all genera of viruses that use readthrough of the amber codon

mechanism to generate RdRp (Scheets et al., 2018). For the genera that use -1 frameshift

mechanism to synthesize its RdRp, the sub-families were split into two categories:

Regressovirinae for the genera that encodes coat protein (CP) and Calvusvirinae for the

Umbravirus genus which does not code for CP (Figure 1.1).

4

One of the most notable and latest changes that affected the Tombusviridae family was the establishment of the realm (Walker et al., 2019). The realm Riboviria was derived from the word ribonucleic acid (Gorbalenya et al., 2017) and it encompasses all viruses with positive single-stranded RNA ((+) ssRNA), (+) double-strand RNA (dsRNA), or negative-strand genomic RNA ((-) RNA) that use cognate RdRps for replication (Gorbalenya et al., 2017;

Walker et al., 2019). The suggestion of the establishment of the new Riboviria realm was proposed in 2016 but approved in 2017 (Gorbalenya et al., 2017). The authors advocating for the creation of the new realm in the virus taxonomy wanted to place RNA viruses that use similar

RdRp under a universal basal realm rank. It was suggested that positive (+) ssRNA viruses are most likely to be monophyletic and ancient because there was no evidence of multiple origins of

(+) ssRNA isolated from eukaryotic and prokaryotic hosts. Since it was suspected that (+) ssRNAs with similar RdRp function descend from a common ancestor, they could qualify for an independent basal taxon on their own virus taxonomy (Gorbalenya et al., 2017). After much debate (Gorbalenya et al., 2017), the Riboviria realm was placed at the highest taxonomic rank permitted by the ICTV code (Walker et al., 2019).

The Tombusviridae family has grown over the years with the increasing surge of information on plant viruses. As technology and research investigation advances, it should not be surprising to see this virus family grow over the next decade or so. Neither should we be surprised if another genus is split into new genera such as the Necrovirus and Carmovirus were split in 2012 and 2015, respectively.

General Overview

Their similar morphological features can characterize the members of the Tombusviridae family. Some of these features include a T=3 icosahedral capsid composed of 180 identical protein subunits (Figure 1.2-C), 32-35 nm virions, single-stranded positive-sense RNA genomes, 5

the lack of a 5’cap and polyadenylated 3’-end, and their highly conserved non-structural proteins

(Harrison et al., 1971; Rochon et al., 2012; Russo et al., 1994). Some of these phylogenetically

conserved non-structural proteins (Figure 1.1) include pre-readthrough proteins of 23-48 kDa,

RdRp of 82-112 kDa, and movement proteins (MP) that vary in size from 6 to 33 kDa (Rochon

et al., 2012; Russo et al., 1994).

The coat protein (CP) of most members of the Tombusviridae (tombusvirids) is divided

into four domains (Figure 1.2-A). The N-terminus forms a flexible region composed of two domains: the RNA-binding (R) domain at the N-terminus and the arm region (a), which connects the R domain to the S domain (Russo et al., 1994). The R domain contains many basic amino acids and extends into the virion in a nonspecific manner (Giesman-Cookmeyer et al., 2001).

The arm region (Figure 1.2 A-B) is suspected of allowing dense packing of the virion RNA by

neutralizing the charged phosphates on the RNA. The globular S domain (Figure 1.2-B) forms the face of the virion, and it is composed of two sets of four-stranded antiparallel β-sheets

(Hopper et al., 1984). The protruding (P) domain forms an antiparallel β sheet structure in a jellyroll conformation, with one six-stranded β sheet and one four-stranded β sheet (Figure 1.2-

B). Then, there is a small hinge (h) sequence that connects the S and P domains, which facilitates

the possibility of adopting two different angle configurations between these domains (Figure 1.2

A-B). The P domain forms surface projections that give the virions a rough appearance;

however, not all the Tombusviridae possess the P domain (Figure 1.3-A). In comparison to the most Tombusviridae, necroviruses, panicoviruses, and machlomoviruses have smooth virions, small CPs, and lack a P domain (Figure 1.3).

6

Core proteins such as RdRp, capsid, and movement proteins are often used to generate

phylogenic associations among viruses (Stuart et al., 2004; Wolf et al., 2018). RdRps are

suspected to be very ancient enzymes that originated from the junctions of proto-tRNAs of a

primitive RNA translation system (de Farias et al., 2017). The proto-translation system assumes

that initial peptides might have been synthesized from Direct RNA Templates (DRTs)

(Demongeot and Seligmann, 2019; Ma, 2010; Root-Bernstein et al., 2016). Since RdRp is

involved in the replication of RNA viral genomes, it is suspected that the evolution of RdRp can

be intrinsically linked with the evolution of RNA viruses (de Farias et al., 2017). Because it is

present in all RNA viruses and has conserved structural domains, the RdRp is often used to

construct viral phylogenetic analysis (de Farias et al., 2017). RdRp sequence/ structural

information can be used to extrapolate information about virus evolution. In this chapter,

sequences resembling known Tombusviridae viral species (Table 1-1) were subjected to a phylogenic analysis (Figure 1.4). The study of the RdRp genome phylogeny using maximum likelihood phylogeny (King and Van Doorslaer, 2018) on this chapter was designed to identify any potential viruses in GeneBank which have yet to be included on the Tombusviridae classification. From this search 10 viruses not currently listed on the ICTV list were found:

Adonis mosaic virus (AdMV_LC171345.1), Jasmine virus H (JaVH_KX897157.1), Jasmine mosaic-associated virus (JMaV_MG958506.1), Gompholobium virus A

(GomVA_NC_030742.1), Corn salad necrosis virus (CSNV_MF125267.1), Bermuda grass latent virus (BGLV_NC_032405.1), Gentian virus A (GeVA_LC373507.1), Lisianthus necrosis virus (LNV_DQ011234.1), Rose yellow leaf virus (RYLV_KC166239.1), Manawatu river virus

(ManRV_JN204350.1), and Turitea creek virus (TuCV_JN204349.1) (black-colored sequence on Figure 1.4). 7

The Tombusviridae family produces its RdRp and associated proteins from genomic

RNA (Rochon et al., 2012). The size of the RdRp among the species in this family can range

from 77-kDa to 112-kDa, while the overlapping auxiliary replication-associated proteins range from 22-kDa to 50-kDa (Figure 1.1). In comparison to most RNA polymerases, the

Tombusviridae polymerase is small and lacks the helicase motif (Giesman-Cookmeyer et al.,

2001). The viral RdRp catalyzes RNA synthesis from RNA templates and is involved in viral genome replication and translation processes (Cimino et al., 2011; Jia and Gong, 2019).

Although viruses on this family use the RdRp to synthesize virus progeny genomes, the specific mechanism used for replication can differ among virus groups (Gunawardene et al., 2017). In general, virus replication involves the assembly of replicase complexes composed of virus RNA template and viral- and host-proteins (Nagy and Pogany, 2011).

Virus Replication

Upon entering the cell, tombusvirids release the viral genomic RNA into the cytoplasm, where replication and translation take place (Rochon et al., 2012). Similarly to viruses, plant viruses use viral-coded proteins and host factors to generate viral replication organelles/compartments (VROs) (Nagy and Lin, 2020). Replication of tombusvirids occurs in the cytoplasm. For replication to take place, membranous organelles are recruited to the replicase-associated proteins (Hwang et al., 2008; Jin et al., 2018; Rochon et al., 2012; Sasvari et al., 2018; Xu and Nagy, 2016). Some examples of these membranous organelles include peroxisomes, endoplasmic reticulum (ER), mitochondria, chloroplasts, and endosomes

(Lazarow, 2011) (Figure 1.5). These positive-RNA viruses can remodel the selected membrane and induce the formation of spherules where RNA replication takes place (Lazarow, 2011; Nagy and Lin, 2020; Nagy and Pogany, 2011; Sasvari et al., 2018). 8

Even though all members of the Tombusviridae family replicate in the cytoplasm, they

use diverse membranous organelles to support viral replication in infected cells (Hwang et al.,

2008; Rochon et al., 2012; Sasvari et al., 2018). For example, tomato bushy stunt virus (TBSV) uses peroxisomal and endoplasmic reticulum (ER) membranes for replication (Sasvari et al.,

2018). TBSV hijacks endosomal sorting complexes required for transport (ESCRT) proteins for

virus replication (Barajas et al., 2009). The recruitment of ESCRT proteins facilitates the

assembly of the replicase complex. ESCRT factors are recruited for virus replication when viral

p33 protein interacts with Vps23p ESCRT-I protein and Bro1p accessory ESCRT protein

(Barajas et al., 2009). It is suspected that the interaction between TBSV-p33 and Vps23p is dependent on the ubiquitination of p33, which facilitates the binding to the Vps23p N-terminal

domain. The recruitment of ESCRT proteins facilitates the rearrangement of the peroxisome

membrane, membrane proliferation, and changes in lipid composition that result in the formation

of viral replication organelles/compartments (VROs) (Barajas et al., 2009; Christ et al., 2017;

McCartney et al., 2005; Nagy and Lin, 2020). The formation of these VROs is critical for virus replication as they provide a protective environment from defense mechanisms lurking in the cytosol (Nagy et al., 2016).

In the absence of peroxisomes, TBSV can utilize the ER membranes for replication

(Jonczyk et al., 2007; Rubino et al., 2007). Studies in yeast using peroxins (pex) (Jonczyk et al.,

2007) suggested that the replication protein p33 was re-localized to the ER membrane in the absence of peroxisomes. In Jonczyk et al. 2007, the deletion of PEX3 or PEX19 resulted in the elimination of peroxisomes. In the lack of PEX3 and PEX19, TBSV-p33 replication protein was mislocalized to the perinuclear and cortical ER (Jonczyk et al., 2007). In the ER, p33 created punctuate structures that increased in size during the late replication stage. It was suggested that 9

TBSV replication took place on these punctuated sites similar to peroxisome VROs. Overall, this study inferred that TBSV replication could overcome shortcomings/obstacles presented in the host’s cell types because the virus recruited host factors to the ER membrane as efficiently as in peroxisomes. In the case of TBSV, even though it can replicate in the ER, it might favor peroxisomes over the ER to reduce ER stress and apoptosis (Park and Park, 2019).

Another example of the Tombusviridae’s ability to recruit different membranous organelles for virus replication is Carnation Italian ringspot virus (CIRV). CIRV generates spherules using the mitochondrial outer membranes for replication (Hwang et al., 2008;

Richardson et al., 2014). In CIRV, p36 protein is an auxiliary RNA-binding protein that serves a similar purpose as TBSV p33. Similarly to TBSV p33, CIRV p36 protein binds and recruits

Vps23 to the mitochondria through a unique sequence at the N-terminal of p36 (Richardson et al., 2014). CIRV p36 enables the formation of multivesicular bodies (MVBs) by transmogrification of the mitochondria (Hwang et al., 2008). Even though CIRV relies on

ESCRT components for replication, the ESCRT recruitment mechanism differs from the one used by TBSV (Richardson et al., 2014). TBSV p33 protein interacts with VPs23 through the ubiquitin E2 variant (UEV) domain, while CIRV p36 requires the C-terminal steadiness box domain (StBox) of Vps23. The interaction of both auxiliary replication proteins (p33 or p36) with ESCRT Vps23 is a good illustration of how two members of the same family can use the same protein for a similar purpose yet different recruitment mechanisms.

10

Translation Mechanisms of the Tombusviridae Family

In the eukaryotic system, the presence of a 7-methylguanosine cap (m7G(5’)ppp(5’)N) structure at the 5’-end and the 3’ poly(A) tail facilitates the formation of an mRNA “closed- loop” (Archer et al., 2015; Shirokikh and Preiss, 2018). The configuration of this translation initiation closed-loop model enables the interaction of cap-binding eukaryotic initiation factor 4E

(eIF4E) with the adaptor protein eukaryotic initiation factor 4G (eIF4G), and the poly(A)- binding protein (PAB) (Figure 1.6-A) (Archer et al., 2015). The interactions of these proteins

(eIF4E, eIF4G, eIF2A, and PAB) hold the 5’- and 3’-end of the mRNA in close proximity, which promotes recruitment of the small ribosomal subunit to the 5’-end of the mRNA (Figure 1.6-B)

(see (Shirokikh and Preiss, 2018) for in-depth details). The small ribosomal subunit and corresponding translation initiation associated factors begin the scanning of the mRNA towards the 3’-end until a recognition site is reached (Figure 1.6-B). Upon recognition of the start codon, a cascade of rearrangements occurs such that some factors disassociate from the complex and the large ribosomal subunit join the small ribosomal subunit to initiate protein synthesis (Shirokikh and Preiss, 2018). During elongation, amino acids are assembled to form polypeptide chains, which eventually will fold into a functional protein. The polypeptide chain will continue to grow until the ribosomal subunit complex detects a termination codon (UAA,

UGA, or UAG) (Dever and Green, 2012). The encounter with the termination codon triggers the polypeptide chain release by 1 (eRF1) or eRF2 and other changes on the ribosomal translation complex, which culminate in the dissociation of the ribosomal subunits, mRNA, and deacetylated tRNA (for more details refer to (Dever and Green, 2012). Upon this disassociation, the recently released translation components undergo regeneration such that they can be recycled for subsequent translation rounds. 11

Since the Tombusviridae viral RNA genome lacks both a 5’-cap and a 3’-poly (A) tail, they used alternative mechanisms to recruit the host’s translation machinery to the viral RNA

(Fabian and White, 2004; Rochon et al., 2012). Viruses in the Tombusviridae family use a 3’- terminal RNA sequence that functions as a cap-independent translation element/enhancer (3’-

CITE) (Fabian and White, 2004, 2006; Wu and White, 1999). For example, TBSV 3’-CITE has a

Y-shaped RNA structure that base-pairs with another stem-loop located at the 5’-end of the viral genome (Fabian and White, 2004, 2006). The 3’-CITE recruits the translation initiation complex to the 3’-end of the viral RNA (Simon and Miller, 2013). The translation initiation complex is brought up to the 5’-end of the viral genome by mRNA circularization (Fabian and White, 2004,

2006; Miller et al., 2007; Simon and Miller, 2013). The circularization of the viral mRNA

produced by the long-distance RNA-RNA interaction between the 5’-UTR and the 3’-CITE

brings the translation initiation complex to the 5’-end of the viral RNA (Fabian and White, 2004,

2006; Miller et al., 2007). Similar to cap-dependent translation in eukaryotes (Figure 1.6), the

host ribosomal complex begins the translation of the viral proteins (Figure 1.7). There are

diverse cap-independent translation elements structures characterized, and not all recruit the same factors of the translation initiation complex (Miller et al., 2007; Simon and Miller, 2013).

Although 3’-CITEs have been found in various members of the Tombusviridae, there are still

viruses for which their translation initiation mechanisms have not been characterized (Figure

1.8).

The Tombusviridae family encodes two replication proteins in the 5’-most open reading

frames (ORFs), and they are translated from the genomic RNA (Hearne et al., 1990) using

readthrough sequences or -1 frameshifting. TBSV is a well-characterized example of

readthrough translation. TBSV translates replication-associated proteins p33 and p92 from its 12

first ORF(Hearne et al., 1990). Translation of the first ORF originates at the initiation codon

AUG166-168 and extends to an amber (UAG) terminator at nucleotide 1054, which results in the

production of p33. However, when the readthrough this amber terminator, elongation

continues until the next (UGA) is reached, resulting in p92 readthrough protein

(Hearne et al., 1990). The misreading of the amber codon is commonly achieved by

incorporation of a tRNA instead of the release factor (Beier and Grimm, 2001). Readthrough of

the amber codon requires an unconventional anticodon-codon base-pairing interaction. Because

the readthrough of the amber codon happens infrequently, the accumulation ratio of TBSV p33

and p92 products is about 20:1 (Hearne et al., 1990; Oster et al., 1998).

Carnation Italian ringspot virus (CIRV) is another member of the tombusvirus genus that

is well characterized (White and Nagy, 2004). CIRV uses proximal and distal readthrough

elements (Figure 1.9) to activate the readthrough translation of RdRp associated proteins

(Cimino et al., 2011). CIRV forms a recoding stimulatory element (RSE) structure near the

amber terminator codon, where the proximal readthrough element (PRTE) interacts with a distal

readthrough element (DRTE) via long-range base-pairing (Figure 1.9-A) (Cimino et al., 2011).

The interaction between PRTE-DRTE enables the readthrough of p95 protein. Also, an upstream

linker-downstream linker (UL-DL), located upstream of the PRTE and downstream of the

DRTE, interact to mediate the folding of the viral RNA. The RSE hairpin structure, where the

PRTE sequence is embedded, forms a pseudoknot, which causes the RSE structure to undergo

structural changes (Kuhlmann et al., 2016). The formation of this pseudoknot within the RSE

stabilizes the RNA structure promoting the to pause and insert a tRNA that decodes the

amber stop codon. Some of the viruses suspected to use PRTE/DRTE for readthrough translation

include turnip crinkle virus (TCV), panicum mosaic virus (PMV), maize chlorotic mottle virus 13

(MCMV), oat chlorotic stunt virus (OCSV), cucumber leaf-spot virus (CLSV), and tobacco necrosis virus-D (TNV-D) (Table 1-2) (Cimino et al., 2011; Kuhlmann et al., 2016; Newburn and White, 2017). However, the 3’-end location of the DRTEs varies among viruses (refer to

(Cimino et al., 2011) for more details). Hence, even though some members of the Tombusviridae family use readthrough translation to generate RdRp, the specific mechanisms used might differ from virus to virus.

Ribosomal frameshifting is used instead of readthrough by Calvusvirinae and

Regressovirinae (Miras et al., 2017; Scheets et al., 2018). Programmed ribosomal frameshifting

(PRF) is often induced by a seven nucleotide slippery sequence and adjacent stimulatory elements, which are commonly folded into a secondary RNA structure (Choi et al., 2020). The slippery sequence causes the ribosome to shift from the string of codons in the zero frame to the overlapping string of codons in the -1 frame. This sequence dictates the location and direction of the -1 frameshift (-1 FS). In plant viruses, ribosomal frameshifting was found first in barley yellow dwarf virus, a member of genus Luteovirus (Luteoviridae family) that closely resembles genus Dianthovirus of the Tombusviridae. It requires the requisite 7-nucleotide slippery sight, followed by a bulged stem-loop that base pairs to a stem-loop 4 kb downstream, forming a pseudoknot (Barry and Miller, 2002). Pea enation mosaic virus RNA2 (PEMV2) is a member of the Calvusvirinae that uses programmed ribosomal frameshifting (Figure 1.10). PEMV2 produces its RNA polymerase (p94) using -1 ribosomal frameshift of ORF1 (Gao and Simon,

2016, 2017). Similarly to other Umbraviruses, PEMV2 forms three hairpins near the slippery site

(Gao and Simon, 2016). One of the three structures near the slick site is the RSE. Similarly to the readthrough translation, RSE interacts with a hairpin (Pr), located in the 3’-end of the virus, via long-distance base-pairing (Figure 1.10-B). However, the length of the spacer sequence between 14 the slippery sequence and the RSE is vital for optimal ribosomal frameshift (Gao and Simon,

2016). In addition to the spacer sequence, the structure and location of the stop codon are important for frameshift in PEMV2. However, the pseudoknot formation within RSE and structural stability is not sufficient for successful ribosomal frameshift in all viruses. Some viruses such as red clover necrosis mosaic virus (RCNMV) rely on sequence specificity in addition to structural stability (Kim and Lommel, 1998)

Cap-Independent Translation Elements (CITEs)

Cap-independent translation elements\enhancers (CITEs) are secondary structures located towards the 3’-end of the viral genome and are used to recruit the host translation machinery for the viral functions (Fabian and White, 2006; Miller et al., 2007; Simon and Miller, 2013;

Truniger et al., 2017). These 3’-CITEs compensate for the lack of 5’-cap structure in the viral

RNA (Simon and Miller, 2013). Thus far, 3’-CITEs have been reported or predicted on plant viruses that lack a 5’-cap and polyadenylation at the 3’-end (Gao and Simon, 2017; Miller et al.,

2007; Nicholson and White, 2011; Simon and Miller, 2013). In general, the 3’-CITEs recruit the host translation initiation factors to the 3’-end of the viral RNA and circularized to the 5’-end of the genome through long-distance base-pairing (Simon and Miller, 2013; Truniger et al., 2017)

(Figure 1.7). The type of translation initiation factors/subunits recruitment varies according to the 3’-CITE structure and molecular mechanism (Table 1-3). Thus far, there are seven characterized 3’-CITE structures reported (Table 1-3). Some of these characterized 3’-CITEs can be interchanged among viruses as long as the long-distance base-pairing is maintained

(Nicholson et al. 2010). Some viruses posses multiple CITE structures at the 3’-end of the genome (Du et al., 2017). 15

Cap-independent translation elements classification

The 3’-CITEs are categorized based on their RNA secondary structures. Some of these

well-characterized structures include translation enhancer domain (TED), barley yellow draft

virus (BYDV)-like translation element (BTE), panicum mosaic virus-like element (PTE), T-

shaped structure (TSS), I-shaped structure (ISS), Y-shaped structure (YSS), and cucurbit -

borne yellow virus (CABYV-Xinjiang) like translation element (CXTE) (Miras et al., 2018;

Nicholson and White, 2011; Simon and Miller, 2013; Truniger et al., 2017). A 3’-CITE consists

of conserved structures that often includes a loop-structure that contain conserved interacting

sequences YGCCA/UGGCR, which allows it to engage with a loop (UGGCR) at the 5’end of the

viral RNA (Simon and Miller, 2013). The translation initiation factor binding properties differ among structures (Table 1-3).

TED. TED translation was first described in the satellite tobacco necrosis virus (STNV)

(Danthinne et al., 1993). A long stem-loop with several internal bulges and a top hairpin

characterizes TED-like structures (Blanco-Perez et al., 2016; Truniger et al., 2017). TED-like

structures (Table 1-3) contain complementary sequences in an apical loop that interact with an apical loop in the 5’-terminal hairpin through long-distance base-pairing (Blanco-Perez et al.,

2016; Simon and Miller, 2013). TED-like structures bind to the eukaryotic cap-binding complex,

initiation factor (eIF) 4F, and eIF(iso)4F (Gazo et al., 2004). In the absence of eIF4G and

eIFiso4G subunits, TED structure binds to cap-binding proteins eIF4E and eIFiso4E (Gazo et al.,

2004). However, TED binding affinity to eIF4E/iso4E increases in the presence of eIF4G/iso4G,

thus it has highest affinity for eIF4F/iso4F. In contrary to other 3’-CITEs that recruit eIF4E, TED does not appear to interact with cap-binding pocket since the presence of m7GTP does not inhibit the TED-eIF4E interaction (Gazo et al., 2004).

16

BTE. The barley yellow draft virus (BYDV)-like translation element (BTE) is characterized by its cloverleaf-like structure (Table 1-3) (Simon and Miller, 2013). The cloverleaf-like structure branches into two-to-five stem-loops, depending on the virus, and is connected to the rest of the genome by a long basal helix (Guo et al., 2000b; Truniger et al.,

2017; Wang et al., 2010). The BTE interacts with the 5’-end of the virus through long-distance base-pairing with the 5’-untranslated region (UTR) (Allen et al., 1999; Guo et al., 2000a; Miras et al., 2017; Simon and Miller, 2013; Treder et al., 2008; Truniger et al., 2017). The BTE structure binds to translation initiation factor 4F (eIF4F) though eIF4G interaction (preferably)

(Treder et al., 2008; Wang et al., 2017). In the presence of helicase factors eIF4A, eIF4B, eIF4F, eIF3 and ATP, BTE recruits 40S subunit (Bhardwaj et al., 2019; Sharma et al., 2015; Zhao et al.,

2017). The eIF3 factor produces a bridge between BYDV’s UTRs, resulting in an increase in 5’ to 3’ interaction and recruitment of 40S-eIF complex to the 5’-UTR. The recruitment of the 40S ribosomal subunit to the BTE leads to conformational changes and structural rearrangements in the initiation translation complex, which locks the initiation machinery in place and active for scanning (Bhardwaj et al., 2019).

PTE. PMV-like translation element (PTE) consists of two helical branches (Table 1-3).

The bridge that connect the helical branches is a pyrimidine-rich (C-domain), and the basal helix

contains a guanylate rich bulge (G-domain) on its 5’-side (Batten et al., 2006; Wang et al., 2009).

Two of the three base pairs are predicted to form a pseudoknot between C-domain and G-domain

(Wang et al., 2011). Electrophoretic mobility shift assays (EMSA) indicated the PTE binds to

eIF4E (Wang et al., 2009). Computational modeling, footprinting, and SHAPE protein protection

assays suggest an interaction between the G-C domain pseudoknot and the cap-binding pocket of

eIF4E (Wang et al., 2011; Wang et al., 2009).The PTE’s 5’ side hairpin contains an apical loop 17

that has a complementary sequence to a hairpin located at the 5’-end of the virus genome (Batten

et al., 2006; Miller et al., 2007; Simon and Miller, 2013). The complementary sequences in the

PTE and the 5’-end hairpin engage in RNA-RNA interaction through long-distance base-pairing

(Batten et al., 2006; Simon and Miller, 2013; Truniger et al., 2017). However, in addition to the above interaction , the PEMV2 PTE-like 3’CITE also relies on an upstream kl-TSS (kissing-loop

T-shaped structure) structure to base-pair to the 5’-end of the viral genome (Du et al., 2017).

TSS. The T-shape structure (TSS) of turnip crinkle virus (TCV) contains three hairpins and two pseudoknots which fold the RNA structure similarly to tRNAs (Table 1-3) (McCormack et al., 2008). The TSS structure binds to the P-site of 80S ribosomes and 60S subunits

(McCormack et al., 2008; Yuan et al., 2009). The TSS forms a stable scaffold frame that enables simultaneous base-pairing interactions with external sequences from both sides of its internal symmetric loop (Table 1-3) (Yuan et al., 2009). When the synthesis of RdRp reaches a specific threshold, the RdRp binds to one of the TSS sides resulting in RNA folding conformational changes of the structure (Le et al., 2017). The TSS conformational structural change facilitates

the TSS structural transition required for inhibiting translation to permit replication of the viral

RNA. Thus, the TSS structure shape required for translation is activated when RdRp is absent or degraded.

18

Long-range RNA-RNA interactions between 5’-hairpin and TSS are generated through

the kissing-loop TSS (kl-TSS) (Gao et al., 2013). Since not all TSS engage on 5’ to 3’ long-

distance base-pairing, TSS that possess structural features that allow them to interact with the 5’-

end are referred to as kl-TSS (Gao et al., 2012). The kl-TSS structure resembles the TSS

structure. It has two hairpins, and three-way branched element that forms a T shape structure

tridimensionally (3D) (Gao et al., 2014). In addition to binding 60S and 80S subunits, kl-TSS

binds 40S and simultaneously engages in long-distance kissing-loop interaction with 5’-end

hairpin (Gao et al., 2014).

YSS. The Y-shaped structure (YSS) CITEs contain three major long helices (Table 1-3)

(Fabian and White, 2004). In comparison to PTEs, the YSS helices are longer, and they lack the

G-C bulge (Simon and Miller, 2013; Truniger et al., 2017). Similarly to most PTEs, the 5’-side hairpin engages in long-distance RNA-RNA interaction with the 5’-terminal hairpin loop (Fabian and White, 2004). The YSS structure recruits eIF4F or eIFiso4F (Nicholson et al., 2013).

ISS. The I-shaped structure (ISS) is composed of a stem-loop structure with internal bulges protruding from the stem (Table 1-3) (Liu and Goss, 2018; Nicholson et al., 2010;

Truniger et al., 2008). The apical loop of ISS contains a complementary sequence to the 5’end of the viral genome, which facilitates long-distance base-pairing (Nicholson et al., 2010; Truniger et al., 2008). The ISS structure binds to eIF4F through interaction with the eIF4E subunit (Liu and Goss, 2018; Nicholson et al., 2010; Truniger et al., 2017). The presence of eIF4A and eIF4B increases the ISS-eIF4F binding affinity (Liu and Goss, 2018). The ISS structure can prevent replication from disrupting ongoing translation by moderately binding the 40S subunit causing the complex to stall at the 3’end of the viral RNA (Liu and Goss, 2018).

19

CXTE. The CABYV-Xinjiang-like translation element (CXTE) structure consists of two

helices protruding from a central hub (Miras et al., 2014). The exact translation mechanisms of

this 3’-CITE have yet to be defined. However, it appears that CXTE interacts with the 5’-end of the virus genome (Miras et al., 2014). For example, a resistance-breaking strain of MNSV has been reported to possess two 3’-CITEs at its 3’-end, ISS, and CXTE (Miras et al., 2014). In

susceptible melon plants, both 3’-CITEs are active. However, in knockout eIF4E melon plants, only CXTE is functional. The CXTE is the least characterized of the 3’-CITEs. Interestingly,

European isolates of CABYV lack a CXTE or any other known CITE. CABYV is in the polerovirus genus, which is not in the Tombusviridae. No other polerovirus is known to have a

CITE.

Maize Lethal Necrosis (MLND)

What causes Maize Lethal Necrosis Disease?

Maize Lethal Necrosis Disease (MLND) is a ferocious maize disease caused by the synergistic interaction between maize chlorotic mottle virus (MCMV) and a potyvirus (Uyemoto,

1980a). The interaction of MCMV and a potyvirus produces severe chlorosis, leaf yellowing and browning, rotting cobs, and premature plant death (FAO, 2013; Uyemoto, 1980a). Some of the interacting potyviruses include maize dwarf mosaic virus (MDMV), wheat streak mosaic virus

(WSMV), (SCMV), and Johnsongrass mosaic virus (JGMV)

(Redinbaugh and Stewart, 2018; Scheets, 1998; Stewart et al., 2017; Uyemoto, 1980a). Although the potential interaction between MCMV and poleroviruses, maize yellow dwarf virus (MYDV-

RMV) and the mastrevirus maize streak virus (MSV), is suspected, it is not confirmed (Massawe et al., 2018; Wamaitha et al., 2018). Even though MSV and MYDV-RMV were isolated along with MLND viruses in infected fields, synergistic assays between these viruses have yet to be performed. 20

The dynamics of the synergistic interactions vary according to the environmental

conditions and the potyvirus interacting with MCMV (Scheets, 1998). For example, in a co-

infection of MCMV and WSMV, the accumulation of both viruses increases where WSMV was

benefiting the most (Scheets, 1998). On the other hand, MCMV increases 2- to 5-fold in the presence of SCMV, while SCMV accumulation not affected by MCMV presence (Scheets,

1998). The mechanistic details of MCMV synergistic interactions are not well known; however,

MLND triggers changes on the host’s microRNAs (miRNAs) expression levels. Characterization of miRNAs from SCMV and MCMV synergistic interactions suggested that miR159, miR393, and miR394 downregulated as a response to the host’s antiviral defense mechanism (Xia et al.,

2019). Individually, SCMV upregulated five miRNAs (miR:159, 393, 394, 444, and 827), whereas MCMV increased the accumulation of two miRNAs (miR: 166, and 529) while decreasing expression of three miRNAs (miR: 159, 394, and 827) (Xia et al., 2019).

In plants, miRNAs participate in diverse regulatory pathways, including plant-pathogen defense mechanisms (Liu et al., 2017). MicroR159 (miR159) is a highly conserved 21-nt miRNA that is present in most plants (Millar et al., 2019). The miR159 is associated with pathogen defense response, abiotic stress response, and plant development (Millar et al., 2019; Wang et al.,

2012; Zheng et al., 2020). Overexpression of miR159 can lead to flower malformation and pollen sterility, while suppression results in plant stunting and malformation (Liao et al., 2015;

Wang et al., 2012). miR393 operates as an auxin signaling regulator and affects diverse plant development and biotic responses (Yuan et al., 2019). Overexpression of miR393 leads to an increase in abaxial stomatal density; on the other hand, miR393 suppression results in stomatal density (Yuan et al., 2019). Finally, miR394 is involved in shoot meristem development and abiotic stress regulation responses (Kumar et al., 2019; Ni et al., 2012; Tian et al., 2018). 21

Overexpression of miR394 leads to drought tolerance and susceptibility to pathogens (Ni et al.,

2012; Tian et al., 2018). The virus-host miRNAs interactions might influence the symptoms

observed in MLND.

From its origins until now

MLND appeared in the Americas in the mid-1970s (Figure 1.11). The first report of

MLND in the USA occurred in 1978 in Kansas, where this disease caused an estimated maize crop loss of 50% or more in three different counties (Uyemoto, 1980a). Then in 1979, field screening of sick maize and plants in Peru revealed the presence of MCMV and

MDMV (Castillo and Nault, 1982b; Nault, 1979). In 1990, there was an MLND outbreak in

Kauai, Hawaii (Jensen et al., 1990). Twenty years later, new MLND outbreaks emerged in China and East Africa (Lukanda et al., 2014; Redinbaugh and Stewart, 2018; Wang et al., 2014;

Wangai et al., 2012b; Xie et al., 2011). Some of the African countries affected include Kenya,

Tanzania, Uganda, Rwanda, and Ethiopia (Figure 1.11). Finally, the latest case of MLND surfaced in Ecuador in 2015 (Quito-Avila et al., 2016).

Economic Impact

MLND is a ruinous disease that has caused severe crop losses since 2011 in East Africa

(De Groote et al., 2016; FAO, 2013, 2017; Gitonga, 2014; Mahuku et al., 2015; Redinbaugh and

Stewart, 2018; Wangai et al., 2012b). MLND caused economic losses at the farm, county, and

national levels (FAO, 2013, 2017; Gitonga, 2014). In 2014, Kenya lost approximately 23% (~2.1

million metric tons) of maize production due to MLND (CIMMYT, 2014a). Farmers affected by

MLND suffered total crop losses, which were aggravated by the unavailability of feeding

infected plant’s foliage to livestock due to fear of fungal infection (FAO, 2013, 2017; Gitonga,

2014). The prices of maize progressively increased by approximately 18% due to the decrease in 22 seed harvest over the years; this increase in prices contributes towards the food insecurity and farming crisis in several regions of the African continent (FAO, 2019a, b; Mahuku et al., 2015).

Detection and Prevention

To prevent further economic and food security threats by MLND, global research organizations, local governments, and companies from the private sector have developed several prevention and defense initiatives (CIMMYT, 2014a, b; Gitonga, 2014). Because MLND involves the synergistic interaction of MCMV and a potyvirus (Goldberg, 1987; Niblett and

Claflin, 1978; Scheets, 1998; Wangai et al., 2012b), humanitarian groups are searching for maize germplasm that confer resistance against both viruses (Gowda et al., 2018; Jones et al., 2018a;

Redinbaugh and Zambrano, 2014; Xing et al., 2006; Zambrano et al., 2014). Despite humanitarian efforts to increase agricultural production in Africa, since 2014, production levels are still below pre-crisis levels (FAO, 2019a). Therefore, identification of resistant maize varieties is crucial to combat this ongoing disease as a sustainable long-term solution.

Maize Chlorotic Mottle Virus

Overview

Maize chlorotic mottle virus (MCMV) belongs to the Machlomovirs genus within the

Tombusviridae family (King et al., 2012b). It has a 4.4 kb RNA with four ORFs. MCMV virions are approximately 30 nm in diameter with an icosahedral symmetry like the rest of the members of the Tombusviridae family (Figure 1.3) (King et al., 2012a; Wang et al., 2015). MCMV has a conserved β-barrel structure composed of anti-parallel β-sheets, similar to other Tombusviridae viruses (Wang et al., 2015). Identical to other icosahedral viruses with a T=3 lattice, the MCMV capsid is made of 60 identical asymmetric units. MCMV host range infection is limited to members of the Gramineae family (King et al., 2012a; WU et al., 2013a). The monocot host 23

range includes crops such as barley (Hordeum vulgare L.), proso millet (Panicum miliaceum L.) foxtail millet (Setaria italic), wheat (Triticum aestivum), and corn (Zea mays) (Cabanas et al.,

2013b).

Maize chlorotic mottle virus infection and symptoms were first described in 1974 on the central coast of Peru (Castillo and Hebert, 1974). Later, cases of MCMV were reported in the

United States of America in 1976 (R.C.Nutter et al., 1989). Then, in 1978 MCMV was detected in different regions of Peru during an -borne pathogen survey (L. R. Nault et al., 1979).

Other cases of outbreaks include agricultural areas in Mexico, Brazil, Argentina, Taiwan, China,

Hawaii, and most recently in Kenya and Rwanda (Figure 1.12) (Cabanas et al., 2013b; Wang et al., 2015; Wangai et al., 2012a; WU et al., 2013a; Zeng et al., 2013). The severity of the crop loss varies among outbreaks. However, synergistic interaction between MCMV and potyvirus increases lethality, killing >70% of the crop (Wangai et al., 2012a).

MCMV Symptoms

Infected maize plants have stunted growth with short internodes and sometimes premature death (Castillo and Hebert, 1974; Plantwise). Early leaf symptoms include thin chlorotic stripes running parallel to the veins, which consequently turn into elongated chlorotic blotches that produced leaf necrosis and epinasty. In the synergistic interaction with a potyvirus

(MLND), necrosis of young leaves leads to a “dead heart” symptom and eventually plant death

(Wangai et al., 2012a). As the disease progresses, it causes malformation and partially filled ears; most of the time, plants die before tasseling.

MCMV causes chlorosis and mottling in the infected plant. Proteomic analysis (Dang et al., 2019) of MCMV infected plants indicated a decrease in photosynthesis activity. A drop in photosynthesis was associated with MCMV viral symptoms. The abundance of 14 differentially abundant proteins (DAPs) decreased while 13 DAPs increased as a response to MCMV infection 24

(Dang et al., 2019). Among these DAPs, photosystem oxygen-evolving enhancer proteins, PsbP and PsbQ, were downregulated. The abundance of proteins involved in photosynthesis electron transport such as PetE and PetF decreased. In contrast to DAPs associated with photosynthesis, ribosomal-related proteins were significantly upregulated (40/50 DAPs). In infected MCMV plants, 19 DAPs were associated with the 60S ribosomal subunit, while 13 DAPs related to the

40S ribosomal subunit. The increase in ribosomal subunits during infection can be associated with the immune response of the plant and the viral replication/translation mechanism. The study of virus and host-protein interactions could help us understand the effects triggered in the plant as a whole when infected by viruses.

Infection with MCMV causes abnormalities in maize (Wangai et al., 2012a). MCMV

induces upregulation of steroid phytohormones, brassinosteroids (BRs), increasing the plant’s

susceptibility (Cao et al., 2019). BRs are involved in diverse physiological functions, including

seed germination, cell elongation, cell division, senescence, vascular-differentiation, reproduction, root development, photomorphogenesis, and plant immunity response (Saini et al.,

2015). The BR pathway changes upon infection of MCMV making the plant more susceptible.

Upon infection, MCMV free radical reactive nitrogen species (RNS) such as nitric oxide accumulate in the plant (Cao et al., 2019). The increase in nitric oxide serves as a downstream signal that enhances the maize susceptibility to MCMV. The knockout of an essential

(ZmDWRF4) in BR synthesis results in increase resistance against MCMV(Cao et al., 2019).

Although maize susceptibility was associated with BR phytohormones and RNS, there is much work to be done to understand the complex plant-pathogen crosstalk interactions.

25

MCMV transmission

MCMV is transmitted mechanically (King et al., 2012a). It can be seed-transmitted or by six species of the Chrysomelidae family (Nault et al., 1978). The six species of reported to be positive insect vectors carriers were: cereal (Oulema melanopa), corn (Chaetocnema pulicaria), the flea beetle ( frontalis), the southern corn rootworm (Diabrotica undecimpunctata), the northern corn rootworm (D. longicornis), and the western corn rootworm (D. virgifera). O. melanopa was determined to transmit the virus in both larvae and adult form. Based on Jensen (1985), the MCMV transmission mechanism is similar to any other beetle-transmitted plant viruses. The ability to transmit the virus does not correlate to the sex, age, or genotype of the beetle. Even though the virus does not inactivate upon ingestion and transmission continued for more than two days, the viral transmission by an individual insect is random. Temperature seems to affect the transmission efficiency of the virus because the feeding behavior of the is different depending on the temperature.

In recent years, thrips (Frankliniella williamsi Hood, Thrips tabaci, and Frankliniella schultzei) has been reported as one of the main MCMV transmission vectors (Cabanas et al.,

2013a; Mwando et al., 2018). However, similar to early beetle-vector transmission assays, thrips transmit MCMV soon after acquisition and retain the virus for only a few days with no evidence of latency (Cabanas et al., 2013a). Similar to beetle vectors, thrips infected maize plants in both larvae stage and adult stage. Yet, the MCMV virus transmission dimished when insects fed on healthy tissue or the thrips larvae developed into adults.

MCMV induces changes in the maize plants such the emission of volatile organic compounds (VOC) that are appealing to thrips insect vectors (Mwando et al., 2018). Other plant viruses induce changes in their hosts to stimulate the production of VOC emissions (Claudel et al., 2018; Tungadi et al., 2017). The increase of VOC emissions acts as a semiochemical lure for 26

insect vectors to infected plant tissues (Claudel et al., 2018; Tungadi et al., 2017). In MCMV

VOC studies, thrips preferred MCMV infected plants over healthy plants (Mwando et al., 2018).

MCMV-infected plants produced more (E)-4,8-dimethyl-1,3,7-nonatriene (DMNT), methyl salicylate (MeSA), (E,E)-4,8,12-trimethyltrideca-1,3,7,11-tetraene (TMTT) volatile compounds.

The increase of DMNT, MeSA, and TMTT augmented the attraction of thrips to MCMV

infected plants. The changes in volatile compounds produced upon virus infection were

associated with the virus spread enhancement (Mwando et al., 2018). The study of VOC

semiochemical compounds produced by virus-infected plants could provide insights for virus

spread management strategies.

MCMV detection

Conventional methods to detect MCMV include enzyme-linked immunosorbent assay

(ELISA), western blot, reverse transcription-polymerase chain reaction (PCR), real-time PCR, and biosensors based on surface plasmon resonance (SPR) (Cabanas et al., 2013b; Castillo and

Nault, 1982a; Uyemoto, 1979; Wangai et al., 2012a; Zeng et al., 2013). ELISA appears to be the most popular method used to detect and identify infected maize samples with MCMV. However,

Zeng et al. (2013) proposed the SPR sensor as a novel fast method to detect MCMV. This sensor appears to have a dynamic sensor range from 1 to 1000 ppb. When specificity for MCMV was compared against other maize viruses, maize dwarf mosaic virus (MDMV), wheat streak mosaic virus (WSMV), and sugarcane mosaic virus (SCMV), the SPR sensor showed highly specific signal response for MCMV. The innovative design of this sensor allows for the fast detection of

MCMV using crude extracts of infected samples and reduces sample preparation time.

As more cases of MCMV were reported in different countries, research groups focused on developing methods to detect MCMV fast and accurately. Some of these methods revolved around polymerase chain reaction (PCR). In 2011 a real-time TaqMan RT-PCR was proposed as 27

a sensitive method to detect MCMV (Zhang et al., 2011). However, the drawbacks of TaqMan

RT-PCR were that it was time-consuming, complicated, and had false-negative results during sampling. In 2013, immunocapture reverse transcription-PCR (IC-RT-PCR) was proposed as a sensitive, specific, and rapid MCMV detection method (Wu et al., 2013b). The IC-RT-PCR used primers that identified MCMV-coat protein (CP). Although this method improved the sensitivity to MCMV, even in the presence of other viruses, the process was still time-consuming.

In 2017, one-step reverse transcription loop-mediated isothermal amplification (RT-

LAMP) was used (Chen et al., 2017). RT-LAMP reduced the diagnosis time to 60 minutes under

63°C isothermal conditions, improved sensitivity, and results could be visualized by SYBER

green I staining in a closed-tube (Chen et al., 2017). RT-LAMP used 4 different primers attach to

MCMV-CP highly conserved regions. The creative component of RT-LAMP was samples could

be screeneed in one tube using a fluorescent dye that indicated the presence of MCMV. If further

analysis was needed for the MCMV infected samples, RT-LAMP product could be used for

downstream applications. Finally, in 2019 recombinase polymerase amplification (RPA) was

proposed as a reliable, sensitive, and efficient method to use in equipment-limited facilities (Jiao

et al., 2019). In comparison to RT-LAMP, RT-RPA used a 38°C isothermal reaction and was

completed within 30 minutes (Jiao et al., 2019). The decrease in time and isothermal

temperature, facilitate the use of simple equipment which was ideal for in-site field testing.

Virus Prevention

When farmers use standard integrated pest management (IPM) strategies, MCMV

spreading is prevented. Some of these precautions include crop rotation, using disease-free seeds,

insect management, weeding of fields to remove alternative hosts, rogue infected plants, and

proper sanitation to prevent virus spread (CIMMYT, 2014a). One of the most effective ways to 28

manage MCMV infections is through the integration of different cultural practices that vary from

insecticides to plant resistant lines (Nelson, 2011). The use of insecticides is a common approach

to disrupt the vector-maize interaction because it reduced the vector numbers on susceptible

maize (Broadbent, 1957; Kannan et al., 2018). Roguing is another method commonly used to removed infected plants from the fields and prevents virus spread (Jeger et al., 2018). In addition

to roguing, sanitation of agricultural tools decreases the chances of accidental virus inoculation

and prevents spreading (Kiruwa et al., 2016).

Crop rotation for at least one could reduce MCMV occurrence (Uyemoto, 1980b). Thus,

plants outside the Gramineae family should be planted to minimize virus incidence (Kiruwa et

al., 2016). Recommended non-host plants include potatoes, sweet potatoes, cassava, beans, bulb

onions, vegetables, legumes, and garlic (Franke et al., 2018; Kiruwa et al., 2016). Crop rotations

not only reduce the pest and pathogen occurrence but also improve the soil microbiome and

fertilization resulting in a sustainable agricultural system (Franke et al., 2018; Uzoh et al., 2019).

However, one of the most sought strategies against this virus is resistant or tolerant varieties

(Kiruwa et al., 2016). Creating resistant/tolerant varieties reduce crop losses and reduce

environmental impact (Jhansi Rani and Usha, 2013; Kiruwa et al., 2016). Thus far, germplasm

screening of maize varieties has yielded tolerant MCMV lines (Awata et al., 2019; Gowda et al.,

2015; Jones et al., 2018b; Nyaga et al., 2019).

Molecular Information

MCMV is a 4,437 nt positive-sense single-stranded RNA virus belonging to the

Machomovirus genus from the Tombusviridae family (Figure 1.1) (Nutter et al., 1989a; Scheets,

2016). Similar to other Tombusviridae viruses, MCMV does not have a 5’-cap and a

polyadenylated tail (Nutter et al., 1989a; Rochon et al., 2012; Scheets, 2016). MCMV has a 5’

untranslated region (UTR) of 117 nucleotides long and a 3’ UTR of approximately 343 29 nucleotides. The MCMV genome contains seven ORFs (Figure 1.13) (Scheets, 2000, 2016). The first three ORFs, p50, p32, and p111 (RdRp), are translated from the genomic RNA as a template

(Nutter et al., 1989a). Similarly to other Procedovirinae members, MCMV is predicted to use readthrough translation (Figure 1.9) to generate its RNA-dependent RNA polymerase (RdRp)

(Nutter et al., 1989a). MCMV replication requires proteins p50 and p111, RdRp (Scheets, 2016).

The p32 ORF (Figure 1.13) helps towards virus accumulation, and it is suspected to perform functions assisting replication and RNAi suppression (Scheets, 2016).

MCMV generates two sub-genomic RNAs (sgRNAs), sgRNA1, and sgRNA2 (Lommel et al., 1991). The sgRNA1 is 1.47 kb, while sgRNA2 is 0.34 kb (Scheets, 2000). The sgRNA1 serves as mRNA for translation of the capsid protein, p25, which is generated through leaky scanning of the p7 AUG codon (Figure 1.12) (Lommel et al., 1991; Nutter et al., 1989b). The sgRNA1 also codes for p31, p7a, and p7b proteins (Scheets, 2000). Translation of p7a starts at the first AUG codon, and suppression of opal stop codon results in the production of p31 protein

(Figure 1.13). MCMV p31 protein enhances systematic movement in plants while p7a and p7b serve as cell-to-cell and/ or long-distance virus movement proteins (Scheets, 2016).

Among the different MCMV outbreaks around the world (Figure 1.12), there seems to be little sequence divergence (Figure 1.14). Previous analysis of MCMV sub-populations indicated that sequence diversity partitioned between populations (Braidwood et al., 2018). Alignment and phylogenetic analysis of MCMV complete genomes showed that African and Asian MCMV virus sequences were closely related (Figure 1.15), similar to Braidwood et al. (2018) observations. In comparison to African MCMV isolates, American MCMV isolates have more variation (Figure 1.15). MCMV sequences surrounding coding regions were highly conserved

(Figure 1.14). Braidwood et al. (2018) observed a natural non-sense mutation within p31 ORF of 30

Hawaiian and Ecuadorian isolates. The mutation identified in the Hawaiian and Ecuadorian isolates introduced an early “stop” codon six amino acids after the opal readthrough codon resulting in a truncated p31 protein (Braidwood et al., 2018). In concurrence with Scheets

(2016), who showed that truncations of p31 delayed virus spread, Braidwood et al. (2018)

observed a lower systematic spread of MCMV in Hawaiian and Ecuadorian isolates. Mixed

MCMV populations with and without the truncated p31 were isolated from the

Hawaiian/Ecuadorian viral strains. Although the truncated p31 was prevalent in those isolates, it seems that the MCMV virus somehow corrects the introduced “stop codon,” even if it is at a lower rate.

Since many members of the Tombusviridae family use 3’-CITEs to compensate for the lack of translation, we suspected that MCMV contains a 3’-CITE (refer to chapter 2). Alignment of 169-nt within the 3’-untranslated region (UTR) of MCMV isolates indicated a highly conserved 3’-CITE (Figure 1.16). Similar to the complete genome phylogenetic comparisons, the Chinese and African MCMV 3-CITE (MTE) were closely related. Secondary structure predictions were performed for representative MTEs variants (Figure 1.17). Most of the MTE variations occurred in the stem-helix structure. Refolding of the variant MTE secondary structure using experimental conditions (Chapter 2) indicated that the hammer-shape structure of the MTE was highly conserved (Figure 1.18). It would be interesting to see if the natural mutations on the

MTE stem-structure affect the virus infectivity.

31

References

Adams, M.J., King, A.M., Carstens, E.B., 2013. Ratification vote on taxonomic proposals to the International Committee on Taxonomy of Viruses (2013). Arch Virol 158, 2023-2030. Allen, E., Wang, S., Miller, W.A., 1999. Barley yellow dwarf virus RNA requires a cap- independent translation sequence because it lacks a 5' cap. Virology 253, 139-144. Archer, S.K., Shirokikh, N.E., Hallwirth, C.V., Beilharz, T.H., Preiss, T., 2015. Probing the closed-loop model of mRNA translation in living cells. RNA Biol 12, 248-254. Awata, L.A.O., Beyene, Y., Gowda, M., L, M.S., Jumbo, M.B., Tongoona, P., Danquah, E., Ifie, B.E., Marchelo-Dragga, P.W., Olsen, M., Ogugo, V., Mugo, S., Prasanna, B.M., 2019. Genetic Analysis of QTL for Resistance to Maize Lethal Necrosis in Multiple Mapping Populations. (Basel) 11, 32. Barajas, D., Jiang, Y., Nagy, P.D., 2009. A unique role for the host ESCRT proteins in replication of Tomato bushy stunt virus. PLoS Pathog 5, e1000705. Batten, J.S., Turina, M., Scholthof, K.B., 2006. Panicovirus accumulation is governed by two membrane-associated proteins with a newly identified conserved motif that contributes to pathogenicity. Virol J 3, 12. Beier, H., Grimm, M., 2001. Misreading of termination codons in eukaryotes by natural nonsense suppressor tRNAs. Nucleic Acids Res 29, 4767-4782. Bhardwaj, U., Powell, P., Goss, D.J., 2019. Eukaryotic initiation factor (eIF) 3 mediates Barley Yellow Dwarf Viral mRNA 3'-5' UTR interactions and 40S ribosomal subunit binding to facilitate cap-independent translation. Nucleic Acids Res 47, 6225-6235. Blanco-Perez, M., Perez-Canamas, M., Ruiz, L., Hernandez, C., 2016. Efficient Translation of Pelargonium line pattern virus RNAs Relies on a TED-Like 3 -Translational Enhancer that Communicates with the Corresponding 5 -Region through a Long-Distance RNA-RNA Interaction. PLoS One 11, e0152593. Braidwood, L., Quito-Avila, D.F., Cabanas, D., Bressan, A., Wangai, A., Baulcombe, D.C., 2018. Maize chlorotic mottle virus exhibits low divergence between differentiated regional sub- populations. Sci Rep 8, 1173. Broadbent, L., 1957. Insecticidal Control of the Spread of Plant Viruses. Annual Review of Entomology 2, 339-354. Cabanas, D., Watanabe, S., Higashi, C.H., Bressan, A., 2013a. Dissecting the mode of maize chlorotic mottle virus transmission (Tombusviridae: Machlomovirus) by Frankliniella williamsi (Thysanoptera: Thripidae). J Econ Entomol 106, 16-24. Cabanas, D., Watanabe, S., Higashi, C.H.V., Bressan, A., 2013b. Dissecting the Mode of Maize Chlorotic Mottle Virus Transmission(Tombusviridae: Machlomovirus). J. Econ. Entomol. 106, 16-24. 32

Cao, N., Zhan, B., Zhou, X., 2019. Nitric Oxide as a Downstream Signaling Molecule in Brassinosteroid-Mediated Virus Susceptibility to Maize Chlorotic Mottle Virus in Maize. Viruses 11, 368. Castillo, J., Hebert, T.T., 1974. Nueva Enfermedad Virosa Afectando al Maiz en el Peru. Fitopatologia 9, 79-84. Castillo, J., Nault, L.R., 1982a. Enfermedades Causadas por Virus and Mollicutes en Mais en el Peru. Firopatologia 17, 40-48. Castillo, J., Nault, L.R., 1982b. Enfermedades Causadas por Virus y Mollicutes en Maiz en el Peru. Fitopatologia 17, 40-47. Chen, L., Jiao, Z., Liu, D., Liu, X., Xia, Z., Deng, C., Zhou, T., Fan, Z., 2017. One-step reverse transcription loop-mediated isothermal amplification for the detection of Maize chlorotic mottle virus in maize. Journal of virological methods 240, 49-53. Choi, J., O'Loughlin, S., Atkins, J.F., Puglisi, J.D., 2020. The energy landscape of -1 ribosomal frameshifting. Sci Adv 6, eaax6969. Christ, L., Raiborg, C., Wenzel, E.M., Campsteijn, C., Stenmark, H., 2017. Cellular Functions and Molecular Mechanisms of the ESCRT Membrane-Scission Machinery. Trends Biochem Sci 42, 42-56. Cimino, P.A., Nicholson, B.L., Wu, B., Xu, W., White, K.A., 2011. Multifaceted regulation of translational readthrough by RNA replication elements in a tombusvirus. PLoS Pathog 7, e1002423. CIMMYT, 2014a. Maize Lethal Necrosis: Building a comprehensive response, 2014 Annual Report : turning research into impact, Mexico, pp. 24-25. CIMMYT, 2014b. Update: CIMMYT maize inbred lines and pre-commercial hybrids with potential resistance to maize lethal necrosis (MLN), www.cummyt.org. Claudel, P., Chesnais, Q., Fouche, Q., Krieger, C., Halter, D., Bogaert, F., Meyer, S., Boissinot, S., Hugueney, P., Ziegler-Graff, V., Ameline, A., Brault, V., 2018. The Aphid-Transmitted Turnip yellows virus Differentially Affects Volatiles Emission and Subsequent Vector Behavior in Two Brassicaceae Plants. Int J Mol Sci 19, 2316. Dang, M., Cheng, Q., Hu, Y., Wu, J., Zhou, X., Qian, Y., 2019. Proteomic Changes during MCMV Infection Revealed by iTRAQ Quantitative Proteomic Analysis in Maize. Int J Mol Sci 21, 35. Danthinne, X., Seurinck, J., Meulewaeter, F., Van Montagu, M., Cornelissen, M., 1993. The 3' untranslated region of satellite tobacco necrosis virus RNA stimulates translation in vitro. Mol Cell Biol 13, 3340-3349. de Farias, S.T., Dos Santos Junior, A.P., Rego, T.G., Jose, M.V., 2017. Origin and Evolution of RNA-Dependent RNA Polymerase. Front Genet 8, 125. 33

De Groote, H., Oloo, F., Tongruksawattana, S., Das, B., 2016. Community-survey based assessment of the geographic distribution and impact of maize lethal necrosis (MLN) disease in Kenya. Crop Prot 82, 30-35. Demongeot, J., Seligmann, H., 2019. More Pieces of Ancient than Recent Theoretical Minimal Proto-tRNA-Like RNA Rings in Genes Coding for tRNA Synthetases. J Mol Evol 87, 152-174. Dever, T.E., Green, R., 2012. The elongation, termination, and recycling phases of translation in eukaryotes. Cold Spring Harb Perspect Biol 4, a013706. Du, Z., Alekhina, O.M., Vassilenko, K.S., Simon, A.E., 2017. Concerted action of two 3' cap- independent translation enhancers increases the competitive strength of translated viral genomes. Nucleic Acids Res 45, 9558-9572. Fabian, M.R., White, K.A., 2004. 5'-3' RNA-RNA interaction facilitates cap- and poly(A) tail- independent translation of tomato bushy stunt virus mrna: a potential common mechanism for tombusviridae. J Biol Chem 279, 28862-28872. Fabian, M.R., White, K.A., 2006. Analysis of a 3'-translation enhancer in a tombusvirus: a dynamic model for RNA-RNA interactions of mRNA termini. Rna 12, 1304-1314. FAO, 2013. Maize Lethal Necrosis Disease (MLND) - A snapshot, in: FSNWG (Ed.), Food Security and Nutrition Working Group, La FAO en situaciones de emergencias. FAO, 2017. Status of Maize lethal necrosis disease (MLND)in kenya, Pest Reports, Food and Agriculture Organization of the United Nations. FAO, 2019a. Central African Republic - Situation report March 2019, Food and Agriculture Organization of the United Nations. FAO, 2019b. Democratic Republic of the Congo - Situation report Februrary 2019, Democratic Republic of the Congo, Food and Agriculture Organization of the United Nations. Fenner, F., Maurin, J., 1976. The classification and nomenclature of viruses. Archives of Virology 51, 141-149. Franke, A.C., van den Brand, G.J., Vanlauwe, B., Giller, K.E., 2018. Sustainable intensification through rotations with grain legumes in Sub-Saharan Africa: A review. Agric Ecosyst Environ 261, 172-185. Gao, F., Gulay, S.P., Kasprzak, W., Dinman, J.D., Shapiro, B.A., Simon, A.E., 2013. The kissing-loop T-shaped structure translational enhancer of Pea enation mosaic virus can bind simultaneously to ribosomes and a 5' proximal hairpin. J Virol 87, 11987-12002. Gao, F., Kasprzak, W., Stupina, V.A., Shapiro, B.A., Simon, A.E., 2012. A ribosome-binding, 3' translational enhancer has a T-shaped structure and engages in a long-distance RNA-RNA interaction. J Virol 86, 9828-9842. Gao, F., Kasprzak, W.K., Szarko, C., Shapiro, B.A., Simon, A.E., 2014. The 3' untranslated region of Pea Enation Mosaic Virus contains two T-shaped, ribosome-binding, cap-independent translation enhancers. J Virol 88, 11696-11712. 34

Gao, F., Simon, A.E., 2016. Multiple Cis-acting elements modulate programmed -1 ribosomal frameshifting in Pea enation mosaic virus. Nucleic Acids Res 44, 878-895. Gao, F., Simon, A.E., 2017. Differential use of 3'CITEs by the subgenomic RNA of Pea enation mosaic virus 2. Virology 510, 194-204. Gazo, B.M., Murphy, P., Gatchel, J.R., Browning, K.S., 2004. A novel interaction of Cap- binding protein complexes eukaryotic initiation factor (eIF) 4F and eIF(iso)4F with a region in the 3'-untranslated region of satellite tobacco necrosis virus. J Biol Chem 279, 13584-13592. Giesman-Cookmeyer, D., Sit, T.L., Lommel, S.A., 2001. Tombusviridae, Encyclopedia of Life Sciences. Gitonga, K., 2014. Maize Lethal Necrosis - The growing challenge in Eastern Africa in: Service, U.F.A. (Ed.). Global Agricultural Informaiton Network, https://gain.fas.usda.gov. Goldberg, K.B., 1987. Concentration of Maize Chlorotic Mottle Virus Increased in Mixed Infections with Maize Dwarf Mosaic Virus, Strain B. Phytopathology 77, 162-167. Gorbalenya, A., Krupovic, M., Siddell, S., Varsani, A., Kuhn, J., 2017. Riboviria: establishing a single taxon that comprises RNA viruses at the basal rank of virus taxonomy. ICTV. Gowda, M., Beyene, Y., Makumbi, D., Semagn, K., Olsen, M.S., Bright, J.M., Das, B., Mugo, S., Suresh, L.M., Prasanna, B.M., 2018. Discovery and validation of genomic regions associated with resistance to maize lethal necrosis in four biparental populations. Mol Breed 38, 66. Gowda, M., Das, B., Makumbi, D., Babu, R., Semagn, K., Mahuku, G., Olsen, M.S., Bright, J.M., Beyene, Y., Prasanna, B.M., 2015. Genome-wide association and genomic prediction of resistance to maize lethal necrosis disease in tropical maize germplasm. Theor Appl Genet 128, 1957-1968. Gunawardene, C.D., Donaldson, L.W., White, K.A., 2017. Tombusvirus polymerase: Structure and function. Virus Res 234, 74-86. Guo, L., Allen, E., Miller, W.A., 2000a. Structure and function of a cap-independent translation element that functions in either the 3' or the 5' untranslated region. Rna 6, 1808-1820. Guo, L., Allen, E., Miller, W.A., 2000b. Structure and function of a cap-independent translation element that functions in either the 3 ' or the 5 ' untranslated region. Rna-a Publication of the Rna Society 6, 1808-1820. Harrison, B.D., Finch, J.T., Gibbs, A.J., Hollings, M., Shepherd, R.J., Valenta, V., Wetter, C., 1971. Sixteen groups of plant viruses. Virology 45, 356-363. Hearne, P.Q., Knorr, D.A., Hillman, B.I., Morris, T.J., 1990. The complete genome structure and synthesis of infectious RNA from clones of tomato bushy stunt virus. Virology 177, 141-151. Hopper, P., Harrison, S.C., Sauer, R.T., 1984. Structure of tomato bushy stunt virus. V. Coat protein sequence determination and its structural implications. J Mol Biol 177, 701-713. Hwang, Y.T., McCartney, A.W., Gidda, S.K., Mullen, R.T., 2008. Localization of the Carnation Italian ringspot virus replication protein p36 to the mitochondrial outer membrane is mediated by an internal targeting signal and the TOM complex. BMC Cell Biol 9, 54. 35

Jeger, M.J., Madden, L.V., van den Bosch, F., 2018. Plant Virus Epidemiology: Applications and Prospects for Mathematical Modeling and Analysis to Improve Understanding and Disease Control. Plant Dis 102, 837-854. Jensen, S., 1985. Laboratory Transmission of Maize Chlorotic Mottle Virus by Three Species of Corn Rootworms. Plant Disease 69, 864-868. Jensen, S.G., Ooka, J.J., Lockhart, B.E., Lommel, S.A., Lane, D., Wysong, D.S., Doupnik, B., 1990. Corn Lethal Necrosis is Hawaii. Phytopathology 80, 1022. Jhansi Rani, S., Usha, R., 2013. Transgenic plants: Types, benefits, public concerns and future. Journal of Pharmacy Research 6, 879-883. Jia, H., Gong, P., 2019. A Structure-Function Diversity Survey of the RNA-Dependent RNA Polymerases From the Positive-Strand RNA Viruses. Front Microbiol 10, 1945. Jiao, Y., Jiang, J., An, M., Xia, Z., Wu, Y., 2019. Recombinase polymerase amplification assay for rapid detection of maize chlorotic mottle virus in maize. Arch Virol 164, 2581-2584. Jin, X., Cao, X., Wang, X., Jiang, J., Wan, J., Laliberte, J.F., Zhang, Y., 2018. Three- Dimensional Architecture and Biogenesis of Membrane Structures Associated with Plant Virus Replication. Frontiers in plant science 9, 57. Jonczyk, M., Pathak, K.B., Sharma, M., Nagy, P.D., 2007. Exploiting alternative subcellular location for replication: tombusvirus replication switches to the endoplasmic reticulum in the absence of peroxisomes. Virology 362, 320-330. Jones, M.W., Penning, B.W., Jamann, T.M., Glaubitz, J.C., Romay, C., Buckler, E.S., Redinbaugh, M.G., 2018a. Diverse Chromosomal Locations of Quantitative Trait Loci for Tolerance to Maize chlorotic mottle virus in Five Maize Populations. Phytopathology 108, 748- 758. Jones, M.W., Penning, B.W., Jamann, T.M., Glaubitz, J.C., Romay, C., Buckler, E.S., Redinbaugh, M.G., 2018b. Diverse Chromosomal Locations of Quantitative Trait Loci for Tolerance to Maize chlorotic mottle virus in Five Maize Populations. Phytopathology™ 108, 748-758. Kannan, M., Ismail, I., Bunawan, H., 2018. Maize Dwarf Mosaic Virus: From Genome to Disease Management. Viruses 10, 492. Kim, K.H., Lommel, S.A., 1998. Sequence element required for efficient -1 ribosomal frameshifting in red clover necrotic mosaic dianthovirus. Virology 250, 50-59. King, A., Adams, M., Carstens, E., Leftkowitz, E., 2012a. Virus Taxonomy. Academic Press, San Diego, California, USA. King, A.M., Adams, M.J., Carstens, E.B., Lefkowitz, E.J., 2012b. Machlomovirus, Virus Taxonomy. ELSEVIER Academic Press, United States of America, pp. 1134-1138. King, K.M., Van Doorslaer, K., 2018. Building (Viral) Phylogenetic Trees Using a Maximum Likelihood Approach. Current protocols in microbiology 51, e63. 36

Kiruwa, F., Feyissa, T., Ndakidemi, P., 2016. Insights of maize lethal necrotic disease: A major constraint to maize production in East Africa. African Journal of Microbiology Research 10, 271-279. Kuhlmann, M.M., Chattopadhyay, M., Stupina, V.A., Gao, F., Simon, A.E., 2016. An RNA Element That Facilitates Programmed Ribosomal Readthrough in Turnip Crinkle Virus Adopts Multiple Conformations. J Virol 90, 8575-8591. Kumar, A., Gautam, V., Kumar, P., Mukherjee, S., Verma, S., Sarkar, A.K., 2019. Identification and co-evolution pattern of stem cell regulator miR394s and their targets among diverse plant species. BMC Evol Biol 19, 55. L. R. Nault, D.T.G., R. E., Gingery, E., Bradfute, Loayza, J.C., 1979. Identification of Maize Viruses and Mollicutes and Their Potential Insect Vectors in Peru. PHYTOPATHOLOGY 69, 824-828. Lazarow, P.B., 2011. Viruses exploiting peroxisomes. Curr Opin Microbiol 14, 458-469. Le, M.T., Kasprzak, W.K., Kim, T., Gao, F., Young, M.Y., Yuan, X., Shapiro, B.A., Seog, J., Simon, A.E., 2017. Folding behavior of a T-shaped, ribosome-binding translation enhancer implicated in a wide-spread conformational switch. Elife 6. Liao, Q., Tu, Y., Carr, J.P., Du, Z., 2015. An improved cucumber mosaic virus-based vector for efficient decoying of plant microRNAs. Sci Rep 5, 13178. Liu, Q., Goss, D.J., 2018. The 3' mRNA I-shaped structure of maize necrotic streak virus binds to eukaryotic translation factors for eIF4F-mediated translation initiation. J Biol Chem 293, 9486-9495. Liu, S.R., Zhou, J.J., Hu, C.G., Wei, C.L., Zhang, J.Z., 2017. MicroRNA-Mediated Gene Silencing in Plant Defense and Viral Counter-Defense. Front Microbiol 8, 1801. Lommel, S.A., Kendall, T.L., Xiong, Z., Nutter, R.C., 1991. Identification of the maize chlorotic mottle virus capsid protein cistron and characterization of its subgenomic messenger RNA. Virology 181, 382-385. Lukanda, M., Owati, A., Ogunsanya, P., Valimunzigha, K., Katsongo, K., Ndemere, H., Kumar, P.L., 2014. First Report of Maize chlorotic mottle virus Infecting Maize in the Democratic Republic of the Congo. Plant Dis 98, 1448. Ma, W., 2010. The scenario on the origin of translation in the RNA world: in principle of replication parsimony. Biol Direct 5, 65. Mahuku, G., Lockhart, B.E., Wanjala, B., Jones, M.W., Kimunye, J.N., Stewart, L.R., Cassone, B.J., Sevgan, S., Nyasani, J.O., Kusia, E., Kumar, P.L., Niblett, C.L., Kiggundu, A., Asea, G., Pappu, H.R., Wangai, A., Prasanna, B.M., Redinbaugh, M.G., 2015. Maize Lethal Necrosis (MLN), an Emerging Threat to Maize-Based Food Security in Sub-Saharan Africa. Phytopathology 105, 956-965. Massawe, D.P., Stewart, L.R., Kamatenesi, J., Asiimwe, T., Redinbaugh, M.G., 2018. Complete sequence and diversity of a maize-associated Polerovirus in East Africa. Virus Genes 54, 432- 437. 37

Matthews, R.E., 1979. Third report of the International Committee on Taxonomy of Viruses. Classification and nomenclature of viruses. Intervirology 12, 129-296. Mayo, M.A., Martelli, G.P., 1993. Virology division news. Archives of Virology 133, 491-498. McCartney, A.W., Greenwood, J.S., Fabian, M.R., White, K.A., Mullen, R.T., 2005. Localization of the tomato bushy stunt virus replication protein p33 reveals a peroxisome-to- endoplasmic reticulum sorting pathway. Plant Cell 17, 3513-3531. McCormack, J.C., Yuan, X., Yingling, Y.G., Kasprzak, W., Zamora, R.E., Shapiro, B.A., Simon, A.E., 2008. Structural domains within the 3' untranslated region of Turnip crinkle virus. J Virol 82, 8706-8720. Millar, A.A., Lohe, A., Wong, G., 2019. Biology and Function of miR159 in Plants. Plants (Basel) 8, 255. Miller, W.A., Wang, Z., Treder, K., 2007. The amazing diversity of cap-independent translation elements in the 3'-untranslated regions of plant viral RNAs. Biochem Soc Trans 35, 1629-1633. Miras, M., Miller, W.A., Truniger, V., Aranda, M.A., 2017. Non-canonical Translation in Plant RNA Viruses. Frontiers in plant science 8, 494. Miras, M., Rodriguez-Hernandez, A.M., Romero-Lopez, C., Berzal-Herranz, A., Colchero, J., Aranda, M.A., Truniger, V., 2018. A Dual Interaction Between the 5'- and 3'-Ends of the Melon Necrotic Spot Virus (MNSV) RNA Genome Is Required for Efficient Cap-Independent Translation. Frontiers in plant science 9, 625. Miras, M., Sempere, R.N., Kraft, J.J., Miller, W.A., Aranda, M.A., Truniger, V., 2014. Interfamilial recombination between viruses led to acquisition of a novel translation-enhancing RNA element that allows resistance breaking. New Phytol 202, 233-246. Mwando, N.L., Tamiru, A., Nyasani, J.O., Obonyo, M.A.O., Caulfield, J.C., Bruce, T.J.A., Subramanian, S., 2018. Maize Chlorotic Mottle Virus Induces Changes in Host Plant Volatiles that Attract Vector Thrips Species. J Chem Ecol 44, 681-689. Nagy, P.D., Lin, W., 2020. Taking over Cellular Energy-Metabolism for TBSV Replication: The High ATP Requirement of an RNA Virus within the Viral Replication Organelle. Viruses 12, 56. Nagy, P.D., Pogany, J., 2011. The dependence of viral RNA replication on co-opted host factors. Nat Rev Microbiol 10, 137-149. Nagy, P.D., Strating, J.R., van Kuppeveld, F.J., 2016. Building Viral Replication Organelles: Close Encounters of the Membrane Types. PLoS Pathog 12, e1005912. Nault, L.R., 1979. Identification of Maize Viruses and Mollicutes and Their Potential Insect Vectors in Peru. Phytopathology 69, 824. Nault, L.R., Styer, W.E., Coffey, M.E., Gordon, D.T., Negi, L.S., Niblett, C.L., 1978. Transmission of Maize Chlorotic Mottle Virus by Chrysomelid Beetles. Phytopathology 68, 1071-1074. Nelson, S., 2011. Maize Chlorotic Mottle. Plant Disease. 38

Newburn, L.R., White, K.A., 2017. Atypical RNA Elements Modulate Translational Readthrough in Tobacco Necrosis Virus D. J Virol 91, JVI.02443-02416. Ni, Z., Hu, Z., Jiang, Q., Zhang, H., 2012. Overexpression of gma-MIR394a confers tolerance to drought in transgenic Arabidopsis thaliana. Biochemical and biophysical research communications 427, 330-335. Niblett, C.L., Claflin, L.E., 1978. Corn lethal necrosis - a new virus disease of corn in Kansas. Plant Disease Reporter 62, 15-19. Nicholson, B.L., White, K.A., 2011. 3' Cap-independent translation enhancers of positive-strand RNA plant viruses. Curr Opin Virol 1, 373-380. Nicholson, B.L., Wu, B., Chevtchenko, I., White, K.A., 2010. Tombusvirus recruitment of host translational machinery via the 3' UTR. Rna 16, 1402-1419. Nicholson, B.L., Zaslaver, O., Mayberry, L.K., Browning, K.S., White, K.A., 2013. Tombusvirus Y-shaped translational enhancer forms a complex with eIF4F and can be functionally replaced by heterologous translational enhancers. J Virol 87, 1872-1883. Nutter, R.C., Scheets, K., Panganiban, L.C., Lommel, S.A., 1989a. The complete nucleotide sequence of the maize chlorotic mottle virus genome. Nucleic Acids Res 17, 3163-3177. Nutter, R.C., Scheets, K., Panganiban, L.C.a.L., S.A., 1989b. The complete nucleotide sequence of the maize chlorotic mottle virus genome. Nucleic Acids Research. Nyaga, C., Gowda, M., Beyene, Y., Muriithi, W.T., Makumbi, D., Olsen, M.S., Suresh, L.M., Bright, J.M., Das, B., Prasanna, B.M., 2019. Genome-Wide Analyses and Prediction of Resistance to MLN in Large Tropical Maize Germplasm. Genes (Basel) 11, 16. Oster, S.K., Wu, B., White, K.A., 1998. Uncoupled expression of p33 and p92 permits amplification of tomato bushy stunt virus RNAs. J Virol 72, 5845-5851. Park, C.J., Park, J.M., 2019. Endoplasmic Reticulum Plays a Critical Role in Integrating Signals Generated by Both Biotic and Abiotic Stress in Plants. Frontiers in plant science 10, 399. Plantwise, Plantwise knowledge bank: Maize chlorotic mottle virus, Plantwise. Pringle, C.R., 1998. Virus taxonomy--San Diego 1998. Arch Virol 143, 1449-1459. Quito-Avila, D.F., Alvarez, R.A., Mendoza, A.A., 2016. Occurrence of maize lethal necrosis in Ecuador: a disease without boundaries? European Journal of Plant Pathology 146, 705-710. R.C.Nutter, K.Scheets, L.C.Panganiban, S.A.Lommel, 1989. The complete nucleotide sequence of the maize chlorotic mottle virus genome. Nucleic acids research 17, 3163-3177. Redinbaugh, M.G., Stewart, L.R., 2018. Maize Lethal Necrosis: An Emerging, Synergistic Viral Disease. Annu Rev Virol 5, 301-322. Redinbaugh, M.G., Zambrano, J.L., 2014. Chapter Eight - Control of Virus Diseases in Maize, in: Loebenstein, G., Katis, N. (Eds.), Advances in Virus Research. Academic Press, pp. 391-429. 39

Richardson, L.G., Clendening, E.A., Sheen, H., Gidda, S.K., White, K.A., Mullen, R.T., 2014. A unique N-terminal sequence in the Carnation Italian ringspot virus p36 replicase-associated protein interacts with the host cell ESCRT-I component Vps23. J Virol 88, 6329-6344. Rochon, D., Rubino, L., Russo, J., Martelli, G., Lommel, S.A., 2012. Tombusviridae, in: King, A.M.Q., Adams, M.J., Carstens, E.B., Lefkowitz, E.J. (Eds.), Virus Taxonomy. Elsevier, San Diego, pp. 1111-1138. Rochon, D.M., 2011a. Create a new genus, Macanavirus, in the family Tombusviridae, ICTV. Rochon, D.M., 2011b. Create genus named Zeavirus in the Family Tombusviridae. ICTV. Rochon, D.M., 2011c. Create new genus, Gallantivirus, in the family Tombusviridae. ICTV, ICTV. Rochon, D.M., 2011d. Divide the genus Necrovirus into 2 new genera, Alphanecrovirus and Betanecrovirus. ICTV, ICTV. Rochon, D.M., Martelli, G., Rubino, L., Scheets, K., White, A., 2013. Transfer of the Umbravirus genus from “unassigned genus” to Tombusviridae family. ICTV, ICTV. Root-Bernstein, R., Kim, Y., Sanjay, A., Burton, Z.F., 2016. tRNA evolution from the proto- tRNA minihelix world. Transcription 7, 153-163. Rubino, L., Navarro, B., Russo, M., 2007. Cymbidium ringspot virus defective interfering RNA replication in yeast cells occurs on endoplasmic reticulum-derived membranes in the absence of peroxisomes. J Gen Virol 88, 1634-1642. Russo, M., Burgyan, J., Martelli, G.P., 1994. Molecular biology of tombusviridae. Adv Virus Res 44, 381-428. Saini, S., Sharma, I., Pati, P.K., 2015. Versatile roles of brassinosteroid in plants in the context of its homoeostasis, signaling and crosstalks. Frontiers in plant science 6, 950. Sasvari, Z., Kovalev, N., Gonzalez, P.A., Xu, K., Nagy, P.D., 2018. Assembly-hub function of ER-localized SNARE proteins in biogenesis of tombusvirus replication compartment. PLoS Pathog 14, e1007028. Scheets, K., 1998. Maize chlorotic mottle machlomovirus and wheat streak mosaic rymovirus concentrations increase in the synergistic disease corn lethal necrosis. Virology 242, 28-38. Scheets, K., 2000. Maize chlorotic mottle machlomovirus expresses its coat protein from a 1.47- kb subgenomic RNA and makes a 0.34-kb subgenomic RNA. Virology 267, 90-101. Scheets, K., 2016. Analysis of gene functions in Maize chlorotic mottle virus. Virus Res 222, 71- 79. Scheets, K., Jordan, R., White, A., Hernandez, C., 2014. Create the genus Pelarspovirus in the family Tombusviridae, ICTV. Scheets, K., Surapathrudu, K., Miller, W.A., 2018. Create three subfamilies in family Tombusviridae, ICTV. 40

Scheets, K., White, A., Rubino, L., Martelli, G.P., Rochon, D.M., 2015. Divide the genus Carmovirus into three new genera: Alphacarmovirus, Betacarmovirus, and Gammacarmovirus., ICTV. Sharma, S.D., Kraft, J.J., Miller, W.A., Goss, D.J., 2015. Recruitment of the 40S ribosome subunit to the 3'-untranslated region (UTR) of a viral mRNA, via the eIF4 complex, facilitates cap-independent translation. J Biol Chem 290, 11268-11281. Shirokikh, N.E., Preiss, T., 2018. Translation initiation by cap-dependent ribosome recruitment: Recent insights and open questions. Wiley Interdiscip Rev RNA 9, e1473. Simon, A.E., Miller, W.A., 2013. 3' cap-independent translation enhancers of plant viruses. Annu Rev Microbiol 67, 21-42. Stewart, L.R., Willie, K., Wijeratne, S., Redinbaugh, M.G., Massawe, D., Niblett, C.L., Kiggundu, A., Asiimwe, T., 2017. Johnsongrass mosaic virus Contributes to Maize Lethal Necrosis in East Africa. Plant Dis 101, 1455-1462. Stuart, G., Moffett, K., Bozarth, R.F., 2004. A whole genome perspective on the phylogeny of the plant virus family Tombusviridae. Arch Virol 149, 1595-1610. Tian, X., Song, L., Wang, Y., Jin, W., Tong, F., Wu, F., 2018. miR394 Acts as a Negative Regulator of Arabidopsis Resistance to B. cinerea Infection by Targeting LCR. Frontiers in plant science 9, 903. Treder, K., Kneller, E.L., Allen, E.M., Wang, Z., Browning, K.S., Miller, W.A., 2008. The 3' cap-independent translation element of Barley yellow dwarf virus binds eIF4F via the eIF4G subunit to initiate translation. Rna 14, 134-147. Truniger, V., Miras, M., Aranda, M.A., 2017. Structural and Functional Diversity of Plant Virus 3'-Cap-Independent Translation Enhancers (3'-CITEs). Frontiers in plant science 8, 2047. Truniger, V., Nieto, C., Gonzalez-Ibeas, D., Aranda, M., 2008. Mechanism of plant eIF4E- mediated resistance against a Carmovirus (Tombusviridae): cap-independent translation of a viral RNA controlled in cis by an (a)virulence determinant. Plant J 56, 716-727. Tungadi, T., Groen, S.C., Murphy, A.M., Pate, A.E., Iqbal, J., Bruce, T.J.A., Cunniffe, N.J., Carr, J.P., 2017. Cucumber mosaic virus and its 2b protein alter emission of host volatile organic compounds but not aphid vector settling in tobacco. Virol J 14, 91. Uyemoto, J.K., 1979. Detection of Maize Chlrotic MOttle Virus Serotypes by Enzyme-Linked Immunosorbent Assay. Phytopathology, 290-292. Uyemoto, J.K., 1980a. Severe Outbreak of Corn Lethal Necrosis Disease in Kansas. Plant Disease 64, 99-100. Uyemoto, J.K., 1980b. Severre Outbreak of Corn Lehtal Necrosis Disease in Kansas. Plant Disease, 99-100. Uzoh, I.M., Igwe, C.A., Okebalama, C.B., Babalola, O.O., 2019. Legume-maize rotation effect on maize productivity and soil fertility parameters under selected agronomic practices in a sandy loam soil. Sci Rep 9, 8539. 41

Van Regenmortel, M.H.V., Fauquet, C.M., Bishop, D.H.L., 2000. Virus Taxonomy: Classification and Nomenclature of Viruses : Seventh Report of the International Committee on Taxonomy of Viruses. Academic Press. Walker, P.J., Siddell, S.G., Lefkowitz, E.J., Mushegian, A.R., Dempsey, D.M., Dutilh, B.E., Harrach, B., Harrison, R.L., Hendrickson, R.C., Junglen, S., Knowles, N.J., Kropinski, A.M., Krupovic, M., Kuhn, J.H., Nibert, M., Rubino, L., Sabanadzovic, S., Simmonds, P., Varsani, A., Zerbini, F.M., Davison, A.J., 2019. Changes to virus taxonomy and the International Code of and Nomenclature ratified by the International Committee on Taxonomy of Viruses (2019). Arch Virol 164, 2417-2429. Wamaitha, M.J., Nigam, D., Maina, S., Stomeo, F., Wangai, A., Njuguna, J.N., Holton, T.A., Wanjala, B.W., Wamalwa, M., Lucas, T., Djikeng, A., Garcia-Ruiz, H., 2018. Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya. Virol J 15, 90. Wang, C.-Y., Zhang, Q.-F., Gao, Y.-Z., Ji, G., Huang, X.-J., Hong, J., Zhang, C.-X., 2015. Insight into the three-dimentional structure of maize chlorotic mottle virus revelaed by Cryo-EM single particle analysis. Virology, 171-178. Wang, D., Yu, C., Liu, S., Wang, G., Shi, K., Li, X., Yuan, X., 2017. Structural alteration of a BYDV-like translation element (BTE) that attenuates p35 expression in three mild Tobacco bushy top virus isolates. Sci Rep 7, 4213. Wang, Q., Zhou, X.P., Wu, J.X., 2014. First Report of Maize chlorotic mottle virus Infecting Sugarcane (Saccharum officinarum). Plant Dis 98, 572. Wang, Y., Sun, F., Cao, H., Peng, H., Ni, Z., Sun, Q., Yao, Y., 2012. TamiR159 directed wheat TaGAMYB cleavage and its involvement in anther development and heat response. PLoS One 7, e48445. Wang, Z., Kraft, J.J., Hui, A.Y., Miller, W.A., 2010. Structural plasticity of Barley yellow dwarf virus-like cap-independent translation elements in four genera of plant viral RNAs. 402, 177- 186. Wang, Z., Parisien, M., Scheets, K., Miller, W.A., 2011. The cap-binding translation initiation factor, eIF4E, binds a pseudoknot in a viral cap-independent translation element. Structure 19, 868-880. Wang, Z., Treder, K., Miller, W.A., 2009. Structure of a viral cap-independent translation element that functions via high affinity binding to the eIF4E subunit of eIF4F. J Biol Chem 284, 14189-14202. Wangai, A.W., Redinbaugh, M.G., Kinyua, M., Miano, D.W., Leley, P.K., Kasina, M., Mahuku, G., Scheets, K., Jeffers, D., 2012a. First Report of Maize chlorotic mottle virus and Maize Lethal Necrosis in Kenya. Plant disease, 1582. Wangai, A.W., Redinbaugh, M.G., Kinyua, Z.M., Miano, D.W., Leley, P.K., Kasina, M., Mahuku, G., Scheets, K., Jeffers, D., 2012b. First Report of Maize chlorotic mottle virus and Maize Lethal Necrosis in Kenya. Plant Dis 96, 1582. White, K.A., Nagy, P.D., 2004. Advances in the Molecular Biology of Tombusviruses: Gene Expression, Genome Replication, and Recombination. Elsevier, pp. 187-226. 42

Wolf, Y.I., Kazlauskas, D., Iranzo, J., Lucia-Sanz, A., Kuhn, J.H., Krupovic, M., Dolja, V.V., Koonin, E.V., 2018. Origins and Evolution of the Global RNA Virome. mBio 9. Wu, B., White, K.A., 1999. A Primary Determinant of Cap-Independent Translation Is Located in the 3′-Proximal Region of the Tomato Bushy Stunt Virus Genome. Journal of Virology 73, 8982-8988. WU, J.-x., WANG, Q., LIU, H., QIAN, Y.-j., XIE, Y., ZHOU, X.-p., 2013a. Monoclonal antibody-based serological methods formaize chlorotic mottle virus detection in China. Biomed & Biotechnol, 555-562. Wu, J.X., Wang, Q., Liu, H., Qian, Y.J., Xie, Y., Zhou, X.P., 2013b. Monoclonal antibody-based serological methods for maize chlorotic mottle virus detection in China. J Zhejiang Univ Sci B 14, 555-562. Xia, Z., Zhao, Z., Gao, X., Jiao, Z., Wu, Y., Zhou, T., Fan, Z., 2019. Characterization of Maize miRNAs in Response to Synergistic Infection of Maize Chlorotic Mottle Virus and Sugarcane Mosaic Virus. Int J Mol Sci 20, 3146. Xie, L., Zhang, J., Wang, Q., Meng, C., Hong, J., Zhou, X., 2011. Characterization of Maize Chlorotic Mottle Virus Associated with Maize Lethal Necrosis Disease in China. J Phytopathol 159, 191-193. Xing, Y., Ingvardsen, C., Salomon, R., Lubberstedt, T., 2006. Analysis of sugarcane mosaic virus resistance in maize in an isogenic dihybrid crossing scheme and implications for breeding potyvirus-resistant maize hybrids. Genome 49, 1274-1282. Xu, K., Nagy, P.D., 2016. Enrichment of Phosphatidylethanolamine in Viral Replication Compartments via Co-opting the Endosomal Rab5 Small GTPase by a Positive-Strand RNA Virus. PLoS Biol 14, e2000128. Yuan, W., Suo, J., Shi, B., Zhou, C., Bai, B., Bian, H., Zhu, M., Han, N., 2019. The barley miR393 has multiple roles in regulation of seedling growth, stomatal density, and drought stress tolerance. Plant Physiol Biochem 142, 303-311. Yuan, X., Shi, K., Meskauskas, A., Simon, A.E., 2009. The 3' end of Turnip crinkle virus contains a highly interactive structure including a translational enhancer that is disrupted by binding to the RNA-dependent RNA polymerase. Rna 15, 1849-1864. Zambrano, J.L., Jones, M.W., Brenner, E., Francis, D.M., Tomas, A., Redinbaugh, M.G., 2014. Genetic analysis of resistance to six virus diseases in a multiple virus-resistant maize inbred line. Theor Appl Genet 127, 867-880. Zeng, C., a, X.H., a, J.X., a, G.L., Maa, J., b, H.-F.J., a, S.Z., Chen, H., 2013. Rapid and sensitive detection of maize chlorotic mottle virus using surface plasmon resonance-based biosensor. Analytical Biochemistry 440, 18-22. Zhang, Y., Zhao, W., Li, M., Chen, H., Zhu, S., Fan, Z., 2011. Real-time TaqMan RT-PCR for detection of maize chlorotic mottle virus in maize seeds. Journal of virological methods 171, 292-294. 43

Zhao, P., Liu, Q., Miller, W.A., Goss, D.J., 2017. Eukaryotic translation initiation factor 4G (eIF4G) coordinates interactions with eIF4A, eIF4B and eIF4E in binding and translation of the barley yellow dwarf virus 3' cap-independent translation element (BTE). jbc.M116.764902. Zheng, Z., Wang, N., Jalajakumari, M.B., Blackman, L., Shen, E., Verma, S., Wang, M.B., Millar, A.A., 2020. miR159 represses a constitutive pathogen defense response in tobacco. Plant Physiol, pp.00786.02019.

44

Figures

Figure 1.1. Genome organization of the type species for each genus in the Tombusviridae family grouped based on sub-families Boxes represent known and predicted ORFs. The yellow boxes represent ORFs encoding the phylogenetically conserved RNA dependent RNA polymerase (RdRp) and their associated protein(s). Red boxes represent the capsid protein (CP) encoding ORFs. 45

Grey boxes represent unique proteins to the indicated virus. Grey-gradient boxes represent uncharacterized putative proteins. Green boxes correspond to silencing associated suppressor proteins. The Blue color-coded proteins represent viral movement protein (MP). For Alpha- /Beta-/gamma-carmovirus, alpha-/beta- necrovirus, and Panicovirus, the light-blue shaded boxes correspond to a conserve second movement protein among these viruses. For Avenavirus, the darkest blue shaded box represents a putative movement protein that does not resemble any of the Tombusviridae encoded proteins. The asterisk (*) indicates that a 3’-cap-independent translation element has been identified for that virus. ORF: open . RT: translational readthrough of termination codon. -1FS: -1 ribosomal frameshift event. CTG: protein has a CTG start codon.

Figure 1.2. Tobacco bushy stunt virus (TBSV) capsid architecture. A) Order of TBSV domains in a polypeptide chain from N-terminus to C-terminus. The number on top of each domain indicates the position of the domain based on uniport (P11795) data. B) Schematic view of a polypeptide chain of TBSV coat protein (CP). The R-domain on this schematic is missing the first 64 amino acids (aa) from the N-terminus. For parts A) & B), the domains were colored as follows: R-domain (red), S-domain (orange), P-domain (yellow), connecting loops (grey). C) Arrangement of TBSV subunits in virus particles. A, B, C, denote the 3 packing environments for the subunit. Top left: visualization of one asymmetric unit. Subunit A is in red, subunit B is in white/cream, and subunit C is in blue. Top right: visualization of pentamer composed of A, B, C subunits. Images generated using pdb files from (PDB: 2BTV) data and Hopper, Harrison, and Sauer (1984) publication.

46

Figure 1.3. Examples of capsid structure observed in Tombusviridae viruses. A) Schematic view of a capsid polypeptide chain of TBSV, MCMV, and TNV. B) visualization of subunits assembly. A, B, C, denote the 3 packing environments for the subunit. Left: illustration of one asymmetric unit. Right: the display of pentamer composed of A, B, C subunits. C) Arrangement of virus particle subunits. Image created using pdb files from the protein data bank, TBSV (Hopper et al., 1984), MCMV (Wang et al., 2015), and TNV (Oda et al., 2000). 47

68 TuCV_JN204352.1 ManRV_JN204351.1 CBV_NC_004725.1 47 EMCV_NC_023339.1 100 PeLV_AY100482.1 LNV_DQ011234.1 52 21 CuNV_NC_001469.1 GeVA_LC373507.1 48 MPV_NC_020073.2 80 CymRSV_NC_003532.1 39 31 34 TBSV_NC_001554.1 Tombusvirus GALV_NC_011535.1 19 48 44 100 PLCV_NC_030452.1 34 CIRV_NC_003500.3 AMCV_NC_001339.1 NRV_DQ663765.1 44 65 TMV_NC_043206.1 PetAMV_DQ663763.2 MNeSV_NC_007729.1 Zeavirus 100 MWLMV_NC_009533.1 JCSMV_NC_005287.1 98 SNMV_DQ367845.1 Aureusvirus 98 100 YSV_NC_022895.1 89 CLSV_NC_007816.2 83 PoLV_NC_000939.2 EIAV1_MG967280.1 100 SCNM-1_NC003806.1 100 RCNMV-1_NC003756.1 Dianthovirus 68 CRSV_NC_003530.1

99 TBTV_NC_004366.1 39 76 OPMV_NC_027710.2 GRV_NC_003603.1 47 100 eTBTV_NC_024808.1 87 PEMV2_NC_03853.1 Umbravirus 100 CMoMV_NC001726.1 CMoV_NC011515.1 OLV-1_NC_001721.1 99 TNV-A_NC_001777.1 100 Alphanecrovirus 100 CSNV_MF125267.1 54 OMMV_NC_006939.1 53 37 PNV_NC_029900.1 100 FNSV_NC_020469.1 Macanavirus GaMV_NC_001818.1 Gallantivirus CMMV_NC_011108.1 70 100 BGLV_NC_032405.1 100 46 100 TPAV_NC_021705.2 Panicovirus PMV_NC_002598.1 MCMV_NC_003627.1 Machlomovirus TLV1_NC_015227.2 99 85 GomVA_NC_030742.1 86 81 NLVCV_NC_009017.1 CbMV_NC_021926.1 PFBV_NC_005286.1 99 100 Alphacarmovirus 65 HnRSV_NC_014967.1 100 CarMV_NC_001265.2 87 AdMV_LC171345.1

93 40 AnFB_NC_007733.2 75 94 SCV_NC_001780.1 PLPV_NC_007017.2 Pelarspovirus 52 100 RYLV_KC166239.1 89 RrLDV_NC_020415.1 59 PCRPV_NC_005985.1 76 100 JMaV_MG958506.1 88 JaVH_KX897157.1 100 100 PelRSV_NC_026240.1 ELV_NC_026239.1 CICMV_NC_033777.1 26 TCV_NC_003821.3 91 100 Betacarmovirus 91 CCFV_NC001600.1 99 HCRSV_NC_003608.1 JINRV_NC_002187.1 100 PSNV_NC_004995.1 35 MNSV_NC_001504.1 Gammacarmovirus 100 SoyMMV_NC_011643.1 CPMoV_NC_003535.1 OCSV_NC_003633.1 Avenavirus 78 TNVD_NC_003487.1 100 BBSV_NC_004452.3 Betanecrovirus 0.250 LWSV_NC_001822.1

Figure 1.4. Maximum likelihood phylogenetics of RdRp from Tombusviridae viruses. The viruses were color-coded based on ICTV genus classification. Viruses in black have yet to be assigned to a specific genus. Virus abbreviations were used based on named given in reference genome’s publication, followed by the reference number on NCBI. A list of names and current classification of the viruses can be found in Table 1.1. RNA dependent RNA polymerase (RdRp) reference genome sequences were extracted from the NCBI database. Phylogeny trees were created using CLC genomics workbench 12.2. Sequences were aligned using the Maximum Likelihood phylogeny method. The tree was constructed using the UPGMA method with 1,000 bootstrap replicates. 48

Figure 1.5. Examples of Tombusviridae viruses targeting different membranous organelles for replication in the plant cell. Upper left corner, a depiction of the plant cell. The upper right corner notes the illustration used for each organelle. Bottom left box notes examples of well- studied Tombusviridae viruses replication sites. The bottom right corner box depicts the formation of spherules or multivesicular bodies (MVBs). Abbreviations: endoplasmic reticulum (ER), beet black scorch virus (BBSV), Carnation Italian ringspot virus (CIRV), Cymbidium ringspot virus (CymRSV), cucumber leaf spot virus (CLSV), cucumber necrosis virus (CuNV), melon necrotic spot virus (MNSV), red clover necrotic mosaic virus (RCNMV), tomato bushy stunt virus (TBSV). 49

Figure 1.6. Depiction of cap-dependent eukaryotic translation.(A) Recruitment of protein factors to the mRNA extremities. Eukaryotic initiation factor 4E (eIF4E) interacts with the mRNA cap and binds scaffold protein eIF4G. Then, eIF4A, ATP-dependent RNA helicase protein factor, interacts ephemerally with eIF4G and eIF4E to form eIF4F complex. Poly(A)-binding protein (PABP) binds to poly (A) tail of mRNA. The mRNA forms a close-loop when PABP interacts with eIF4G from the eIF4F complex. (B) Circularization of mRNA and cap-dependent translation. Cap-dependent translation initiation (orange) commences upon the assembly of the small ribosomal subunit (40s), eukaryotic initiation factors (eIFs), and initiator methionyl-tRNA (Met-tRNA) at the 5’-end of the mRNA. Small ribosomal subunit scans mRNA towards 3’-end until it recognizes a start codon. At the start codon, translation initiator complex rearranges, and large ribosomal subunit (60S) attaches to the complex, which activates translation elongation (green). Polypeptide chains synthesis begins as the ribosome complex, aided by elongation factors, moves through the open reading frame (ORF). Translation termination (red) is trigger by a nonsense stop codon where eukaryotic release factors (eRFs) facilitate the release of the completed protein and ribosomal subunits. The released eIFs and ribosomal subunits can be recycled for subsequent translation. 50

A) 4G 4E

2A

A U G U A A

B)

4G

4E

Initiation Elongation Termination & recycling Peptide tRNA protein factor Ribosome polypeptide/ protein

Figure 1.7. Depiction of cap-independent translation of tomato bushy stunt virus (TBSV).(A) Recruitment of protein factors to the 3’-cap-independent translation element (CITE). The 3’-CITE interacts with the viral RNA 5’-end via long-distance base-pairing with an apical loop at the 5’-UTR. (B) Circularization of mRNA through long-distance base-pairing between 5’-and 3’-end of viral RNA. Cap- dependent translation initiation (orange) commences upon the assembly of the small ribosomal subunit (40s), eukaryotic initiation factors (eIFs), and initiator methionyl-tRNA (Met-tRNA) at the 5’-end of the mRNA. Small ribosomal subunit scans mRNA towards 3’-end until it recognizes a start codon. At the start codon, translation initiator complex rearranges, and large ribosomal subunit (60S) attaches to the complex, which activates translation elongation (green). Polypeptide chains synthesis begins as the ribosome complex, aided by elongation factors, moves through the open reading frame (ORF). Translation termination (red) is trigger by a nonsense stop codon where eukaryotic release factors (eRFs) facilitate the release of the completed protein and ribosomal subunits. The released eIFs and ribosomal subunits can be recycled for subsequent translation. 51

68 TuCV_JN204352.1 ManRV_JN204351.1 CBV_NC_004725.1

47 EMCV_NC_023339.1 C XTE 100 P e LV _AY 100482.1 LNV_DQ011234.1 52 21 CuNV_NC_001469.1 GeVA _LC373507.1 48 MPV_NC_020073.2 TSS 80 CymRSV_NC_003532.1 YSS 0–3 stem loops 39 31 34 TBSV_NC_001554.1 Tombusvirus GA LV _NC_011535.1 CCC 19 48 44 A 100 PLCV_NC_030452.1 A C A G G G G G G U C C SL-III 34 CIRV_NC_003500.3 G U G SL-I A G AMCV_NC_001339.1 G

NRV_DQ663765.1 S-IV 44 65 TMV_NC_043206.1 PTE BTE PetAMV_DQ663763.2 MNeSV_NC_007729.1 Zeavirus U MWLMV_NC_009533.1 X G 100 G C C G JCSMV_NC_005287.1 X U 98 GU A G G SNMV_DQ367845.1 U Aureusvirus G X 98 100 YSV_NC_022895.1 GU X 89 CLSV_NC_007816.2 83 P o LV _NC_000939.2 TED ISS E IAV 1_MG967280.1 100 SCNM-1_NC003806.1 100 RCNMV-1_NC003756.1 Dianthovirus 68 CRSV_NC_003530.1

99 TBTV_NC_004366.1 39 76 OPMV_NC_027710.2 GRV_NC_003603.1 47 100 eTBTV_NC_024808.1 87 PEMV2_NC_03853.1 Umbravirus 100 CMoMV_NC001726.1 CMoV_NC011515.1 OLV -1_NC_001721.1 99 TNV-A_NC_001777.1 100 Alphanecrovirus 100 CSNV_MF125267.1 54 OMMV_NC_006939.1 53 37 PNV_NC_029900.1 100 FNSV_NC_020469.1 Macanavirus GaMV_NC_001818.1 Gallantivirus CMMV_NC_011108.1 70 100 B GLV _NC_032405.1 100 46 100 TPAV _NC_021705.2 Panicovirus PMV_NC_002598.1 MCMV_NC_003627.1 Machlomovirus TLV 1_NC_015227.2 99 85 GomVA _NC_030742.1 86 81 NLVCV_NC_009017.1 CbMV_NC_021926.1 PFBV_NC_005286.1 99 100 Alphacarmovirus 65 HnRSV_NC_014967.1 100 CarMV_NC_001265.2 87 AdMV_LC171345.1

93 40 AnFB_NC_007733.2 75 94 SCV_NC_001780.1 PLPV_NC_007017.2 Pelarspovirus 52 100 RY LV _KC166239.1 89 RrLDV_NC_020415.1 59 PCRPV_NC_005985.1 76 100 JMaV_MG958506.1 88 JaVH_KX897157.1 100 100 PelRSV_NC_026240.1 E LV _NC_026239.1 CICMV_NC_033777.1 26 TCV_NC_003821.3 91 100 Betacarmovirus 91 CCFV_NC001600.1 99 HCRSV_NC_003608.1 JINRV_NC_002187.1 100 PSNV_NC_004995.1 35 MNSV_NC_001504.1 Gammacarmovirus 100 SoyMMV_NC_011643.1 CPMoV_NC_003535.1 OCSV_NC_003633.1 Avenavirus 78 TNVD_NC_003487.1 100 BBSV_NC_004452.3 Betanecrovirus 0.250 LW S V _NC_001822.1

Figure 1.8. 3'-Cap Independent Translation Elements identified in Tombusviridae viruses.Phylogenetic tree of Tombusviridae viruses based on RdRp sequences (refer to Table 1-1). Virus acronyms are color- coded based 3’cap independent translation element (CITE) predicted or reported. The 3’-CITE’s structures are shown on the top left corner. No 3’-CITE has been reported or predicted for viruses in black ink. The light shaded regions on the CITEs indicate the regions suspected to interact with 5’-end. RNA dependent RNA polymerase (RdRp) reference genome sequences were extracted from the NCBI database. Phylogeny trees were created using CLC genomics workbench 12.2. Sequences were aligned using the Maximum Likelihood phylogeny method. The tree was constructed using the UPGMA method with 1,000 bootstrap replicates. 52

A) Initiation RNA-RNA Ribosome interaction Elongation polypeptide/ PRTE/DRTE Termination protein & recycling tRNA Peptide protein factor

p33 5’

B)

p94

5’

Figure 1.9. Readthrough translation of the Carnation Italian Ringspot virus (CIRV)A) Depiction of the translation of pre-readthrough (pre-RT) product, p36. (B) Drawing of readthrough translation of readthrough product, p95. The proximal readthrough element (PRTE) and distal readthrough element (DRTE) long-range base-pairing facilitate translation of readthrough (RT) product. B) The 3'-CITE interacts with the viral RNA 5'-end via long-distance base-pairing with an apical loop at the 5'-UTR. Cap- dependent translation initiation (orange) commences upon the assembly of the small ribosomal subunit (40s), eukaryotic initiation factors (eIFs), and initiator methionyl-tRNA (Met-tRNA) at the 5'-end of the mRNA. Small ribosomal subunit scans mRNA towards 3'-end until it recognizes a start codon. At the start codon, translation initiator complex rearranges, and large ribosomal subunit (60S) attaches to the complex, which activates translation elongation (green). Polypeptide chains synthesis begins as the ribosome complex, aided by elongation factors, moves through the open reading frame (ORF). Translation termination (red) is trigger by a nonsense stop codon where eukaryotic release factors (eRFs) facilitate the release of the completed protein and ribosomal subunits. The released eIFs and ribosomal subunits are recycled for subsequent translation. PRTE and DRTE sequences depicted in light blue. Illustrations were drawn based on the information reported on Cimino et al. 2011. 53

A)

p33

Initiation RNA-RNA Ribosome interaction Elongation polypeptide/ Termination Peptide protein & recycling tRNA protein factor

B)

C)

p94

Figure 1.10. Pea enation mosaic virus RNA2 (PEMV2) programmed ribosomal -1 frameshift.(A) Depiction of 3'-CITE circularization and translation of pre-readthrough (pre-RT) product, p33. (B) Illustration of RSE interaction with the 3'-end hairpin (Pr) promoting frameshift (C) Sketch of -1 frameshift (-1FS) translation of frameshift product, p94. Cap-dependent translation initiation (orange) commences upon the assembly of the small ribosomal subunit (40S), eukaryotic initiation factors (eIFs), and initiator methionyl-tRNA (Met-tRNA) at the 5'-end of the mRNA. Small ribosomal subunit scans mRNA towards 3'-end until it recognizes a start codon. At the start codon, translation initiator complex rearranges, and large ribosomal subunit (60S) attaches to the complex, which activates translation elongation (green). Polypeptide chains synthesis begins as the ribosome complex, aided by elongation factors, moves through the open reading frame (ORF). Translation termination (red) is trigger by a nonsense stop codon where eukaryotic release factors (eRFs) facilitate the release of the completed protein and ribosomal subunits. The released eIFs and ribosomal subunits are recycled for subsequent translation.

54

Figure 1.11. Maize Lethal Necrosis Disease Global Distribution. Countries are color-coded based on the latest reported cases. Map created using mapchart.net.

Figure 1.12. Maize chlorotic mottle virus global distribution. Countries are color-coded based on the latest reported cases. Map created using mapchart.net.

55

Machlomovirus CTG ORF3 ORF5 ORF1 RT ORF3-RT MCMV (4,437 nts) 5’ * 3’OH (NC_003627.1) OR F2 ORF2-RT ORF4 CTG ORF3 OR F5 p32K ORF3-RT (11 8 -987) sgRNA1 (1.47kb) 5’ 3’OH (2,971-4,437) RT ORF4 p50K sgRNA2 (337 nt) 5’ 3’OH (137-1,453) (4,101-4,437)

p111K - RdRp p25K - CP (137-3,034) (3,384-4,094) p7a (2,995-3,201) p31K (2,995-3,834)

p7b (3,203-3,397) Figure 1.13. Maize chlorotic mottle virus genome organization.Boxes represent known and predicted ORFs. The yellow boxes represent ORFs encoding the phylogenetically conserved RNA dependent RNA polymerase (RdRp) and their associated protein(s). Red boxes represent the capsid protein (CP) encoding ORFs. Grey boxes represent unique proteins to the virus. The Blue color-coded proteins represent viral movement protein (MP). The asterisk (*) indicates that a 3'-cap-independent translation element has been identified for that virus. ORF: open reading frame. RT: translational readthrough of a termination codon. 56

Figure 1.14. Sequence alignment of MCMV complete genomes available in GeneBank. Complete MCMV sequences were extracted from the NCBI database. Multiple sequence alignment was performed using CLC genomics workbench 12.2.

57

87 JQ982468.1_MCMV_YunnanChina 86 KF010583.1_MCMV_YunnanChina Asia JQ982470.1_MCMV_SichuanChina 81 JQ982469_MCMV_YunnanChina MF510242.1_MCMV_Rwanda 83 KF744396.1_MCMV_Rwanda KF744395.1_MCMV_Rwanda MF467386.1_MCMV_Kilimanjaro Tanzania KF744394.1_MCMV_Rwanda MF510248.1_MCMV_Rwanda KF744393.1_MCMV_Rwanda MF510251.1_MCMV_Rwanda MF510249.1_MCMV_Rwanda KP851970.3_MCMV_Rwanda MF510226.1_MCMV_Kenya MF510223.1_MCMV_Kenya MF510230.1_MCMV_Kenya MF510244.1_MCMV_Kenya MF510225.1_MCMV_Kenya MF510233.1_MCMV_Kenya MF510231.1_MCMV_Kenya KP798452_MCMV_Ethiopia MF510229.1_MCMV_Kenya MF510228.1_MCMV_Kenya MF510224.1_MCMV_Kenya Africa MF510227.1_MCMV_Kenya 92 KP798454_MCMV_Ethiopia MF510247.1_MCMV_Kenya KP798453.2_MCMV_Ethiopia 92 MF510243.1_MCMV_Ethiopia MF510241.1_MCMV_Kenya KP772217.1_MCMV_Ethiopia MF467389.1_MCMV_Arusha Tanzania MF467384.1_MCMV_Arusha Tanzania MF510240.1_MCMV_Kenya MF467387.1_MCMV_Kilimanjaro Tanzania MF510246.1_MCMV_Kenya MH238453.1_MCMV_Kenya MF510237.1_MCMV_Kenya JX286709.1_MCMV_Kenya 1 MF467391.1_MCMV_Kilimanjaro Tanzania MF510245.1_MCMV_Kenya KP798455.1_MCMV_Ethiopia 16 MH205605.1_MCMV_Kenya MF467377.1_MCMV_Manyara Tanzania MF510238.1_MCMV_Kenya MF510239.1_MCMV_Kenya MH238451.1_MCMV_Kenya MH238454.1_MCMV_Kenya MF510235.1_MCMV_Kenya MF510236.1_MCMV_Kenya 100 MF510234.1_MCMV_Kenya MF467380.1_MCMV_Arusha Tanzania MF467383.1_MCMV_Arusha Tanzania MF467381.1_MCMV_Arusha Tanzania 100 MF467392.1_MCMV_Kilimanjaro Tanzania MF467388.1_MCMV_Kilimanjaro Tanzania MF467379.1_MCMV_Manyara Tanzania MF467390.1_MCMV_Manyara Tanzania MF510232.1_MCMV_Kenya MF467382.1_MCMV_Manyara Tanzania MF467375.1_MCMV_Arusha Tanzania 100 MH238450.1_MCMV_Kenya MH238449.1_MCMV_Kenya MH238452.1_MCMV_Kenya MF467376.1_MCMV_Kilimanjaro Tanzania 100 MH238455.1_MCMV_Kenya 60 MF467385.1_MCMV_Kilimanjaro Tanzania KJ782300.1_MCMV_Taiwan GU138674.1_MCMV_Yunnan-China America 73 MF510222.1_MCMV_Ecuador 80 MF510221.1_MCMV_Ecuador 100 MF510220.1_MCMV_Hawaii 100 MF510219.1_MCMV_Ecuador 100 NC_003627.1_MCMV_Kansas-USA EU358605.1_MCMV_Nebraska-USA

0.004 Figure 1.15. Maize chlorotic mottle virus (MCMV) phylogeny tree analysis. Sequences were color-coded based on the continent where the sequence was extracted. Complete MCMV sequences were obtained from the NCBI database. Phylogeny trees were created using CLC genomics workbench 12.2. Sequences were aligned using the Maximum Likelihood phylogeny method. The tree was constructed using the UPGMA method with 1,000 bootstrap replicates. 58

Figure 1.16. Alignment of maize chlorotic mottle virus (MCMV) 3'-CITE (MTE). The predicted MTE was mapped to 4164-4333 of the MCMV-reference genome. The MTE sequence (reference genome: 4156-4326 nt) was subjected to multiple sequence alignment using CLC genomics workbench 12.2 software. The alignment was divided into two sections. The section observed here represents nucleotides (nt) 4156-4245 (reference genome location).

59

Figure 1.16. (continue).

60

Figure 1.17. Maize chlorotic mottle virus predicted 3'-CITE (MTE) structures from different isolates. Vienna RNAfold software predicted the secondary structure of selected MCMV strains that possess variation from the reference sequence (Figure 1.16). Variant nucleotides highlighted in red. 61

Figure 1.18. Maize chlorotic mottle virus 3'-CITE (MTE) structures from different isolates. Vienna RNAfold software predicted the secondary structure of selected MCMV strains that possess variation from the reference sequence (Figure 1.16). Experimental SHAPE data was used to re-draw the structures to fit known data (refer to chapter 2).Variant nucleotides highlighted in red.

62

Tables

Table 1-1. Viruses of the family Tombusviridae for which reference genomes were available in GenBank as in January 2020.

Accesion Genus Virus name abbreviation Subfamily code

Carrot mottle mimic virus CMoMV NC_001726.1

Carrot mottle virus CMoV NC_011515.1

Ethiopian tobacco bushy top virus TBTV NC_024808.1

Groundnut rosette virus GRV NC_003603.1 Calvusvirinae Umbravirus Lettuce speckles mottle virus LSMV

Opium poppy mosaic virus OPMV NC_027710.2

Pea enation mosaic virus 2 PEMV2 NC_003853.1

Tobacco bushy top virus TBTV NC_004366.1

Tobacco mottle virus TMV NC_043206.1

Angelonia flower break virus AnFB NC_007733.2

Calibrachoa mottle virus CbMV NC_021926.1

Carnation mottle virus CarMV NC_001265.2 Alphacarmovirus Honeysuckle ringspot virus HnRSV NC_014967.1

Nootka lupine vein clearing virus NLVCV NC_009017.1

Pelargonium flower break virus PFBV NC_005286.1

Saguaro cactus virus SCV NC_001780.1

Alphanecrovirus Olive latent virus 1 OLV-1 NC_001721.1

Alphanecrovirus Olive mild mosaic virus OMMV NC_006939.1

Alphanecrovirus Potato necrosis virus PNV NC_029900.1

Alphanecrovirus Tobacco necrosis virus A TNV-A NC_001777.1

Aureusvirus Cucumber leaf spot virus CLSV NC_007816.2

Aureusvirus Johnsongrass chlorotic stripe mosaic virus JCSMV NC_005287.1

Aureusvirus Maize white line mosaic virus MWLMV NC_009533.1 Procedovirinae

Aureusvirus Pothos latent virus PoLV NC_000939.2

Aureusvirus Yam spherical virus YSV NC_022895.1

Avenavirus Oat chlorotic stunt virus OCSV NC_003633.1

Betacarmovirus Cardamine chlorotic fleck virus CCFV NC_001600.1

Betacarmovirus Hibiscus chlorotic ringspot virus HCRSV NC_003608.1

Betacarmovirus Japanese iris necrotic ring virus JINRV NC_002187.1

Betacarmovirus Turnip crinkle virus TCV NC_003821.3

Betanecrovirus Beet black scorch virus BBSV NC_004452.3

Betanecrovirus Leek white stripe virus LWSV NC_001822.1

Betanecrovirus Tobacco necrosis virus D TNVD NC_003487.1

Gallantivirus Galinsoga mosaic virus GaMV NC_001818.1

Gammacarmovirus Cowpea mottle virus CPMoV NC_003535.1

Gammacarmovirus Melon necrotic spot virus MNSV NC_001504.1

Gammacarmovirus Pea stem necrosis virus PSNV NC_004995.1 63

Table 1-1 (continued)

Subfamily Genus Virus name abbreviation Accesion code

Gammacarmovirus Soybean yellow mottle mosaic virus SoyMMV NC_011643.1

Macanavirus Furcraea necrotic streak virus FNSV NC_020469.1

Machlomovirus Maize chlorotic mottle virus MCMV NC_003627.1

Panicovirus Cocksfoot mild mosaic virus CMMV NC_011108.1

Panicovirus Panicum mosaic virus PMV NC_002598.1

Panicovirus Thin paspalum asymptomatic virus TPAV NC_021705.2

Pelarspovirus Clematis chlorotic mottle virus ClCMV NC_033777.1

Pelarspovirus Elderberry latent virus ELV NC_026239.1

Pelarspovirus Pelargonium chlorotic ring pattern virus PCRPV NC_005985.1

Pelarspovirus Pelargonium line pattern virus PLPV NC_007017.2

Pelarspovirus Pelargonium ringspot virus PelRSV NC_026240.1

Pelarspovirus Rosa rugosa leaf distortion virus RrLDV NC_020415.1

Tombusvirus Artichoke mottled crinkle virus AMCV NC_001339.1

Tombusvirus Carnation Italian ringspot virus CIRV NC_003500.3

Tombusvirus Cucumber Bulgarian latent virus CBV NC_004725.1 Procedovirinae Tombusvirus Cucumber necrosis virus CuNV NC_001469.1

Tombusvirus Cymbidium ringspot virus CymRSV NC_003532.1

Tombusvirus Eggplant mottled crinkle virus EMCV NC_023339.1

Tombusvirus Grapevine Algerian latent virus GALV NC_011535.1

Tombusvirus Havel River virus HaRV NC_038690.1

Tombusvirus Lato River virus LRV

Tombusvirus Limonium flower distortion virus LFDV NC_038691.1

Tombusvirus Moroccan pepper virus MPV NC_020073.2

Tombusvirus Neckar River virus NRV NC_038927.1

Tombusvirus Pelargonium leaf curl virus PLCV NC_030452.1

Tombusvirus Pelargonium necrotic spot virus PNSV NC_005285.1

Tombusvirus Petunia asteroid mosaic virus PetAMV NC_038692.1

Tombusvirus Sitke waterborne virus SWBV NC_038693.1

Tombusvirus Tomato bushy stunt virus TBSV NC_001554.1

Zeavirus Maize necrotic streak virus MNeSV NC_007729.1

Trailing lespedeza virus 1 TLV1 NC_015227.2

Carnation ringspot virus CRSV NC_003530.1

Red clover necrotic mosaic virus RCNMV-1 NC_003756.1

Dianthovirus Regressovirinae Red clover necrotic mosaic virus RCNMV-2 NC_003775.1

Sweet clover necrotic mosaic virus SCNMV-1 NC_003806.1

Sweet clover necrotic mosaic virus SCNMV-2 NC_003807.1

64

Table 1-1 (continued)

Subfamily Genus Virus name abbreviation Accesion code

Ahlum waterborne virus AWV

unassinged, but listed Bean mild mosaic virus BMMV on ICTV taxonomy Chenopodium necrosis virus ChNV database Cucumber soil-borne virus CuSBV Weddel waterborne virus WWBV

Elderberry aureusvirus 1 ElAV1 MG967280.1

Adonis mosaic virus AdMV LC171345.1

Jasmine virus H JaVH KX897157.1

Jasmine mosaic-associated virus JMaV MG958506.1

Gompholobium virus A GomVA NC_030742.1

Corn salad necrosis virus CSNV MF125267.1 unassinged and not

currently listed on Bermuda grass latent virus BGLV NC_032405.1 ICTV taxonomy database Sesame necrotic mosaic virus SNMV DQ367845.1

Pear latent virus PeLV AY100482.1

Gentian virus A GeVA LC373507.1

Lisianthus necrosis virus LNV DQ011234.1

Rose yellow leaf virus RYLV KC166239.1

Manawatu river virus ManRV JN204350.1

Turitea creek virus TuCV JN204349.1

65

Table 1-2. Tombusviridae that use proximal (PRTE) and distal (DRTE) readthrough elements to activate readthrough translation of RNA-dependent RNA-polymerase associated proteins.

PRTE

DRTE 5’-... -OH 5’-... -3’ Sequence involved in Sequence involved in intra PRTE DRTE Virus internal pseudoknot RIVc internal loop (ntb location) (nt location) RSEa formation Carnation Italian ringspot virus AGUGCU d AGCACU GGGCU (4704-4708) (CIRV) (1148-1153) (4724-4729) AGCCC (4759-4763) Turnip crinkle virus UGCGCG GGGG (860-863) CGCGCG GGGC (3991-3994) (TCV) (878-883) CCCC (902-905) (4033-4038) GCCC (4051-4054) Panicum mosaic virus UCUCAG CUGAGA GGGCCG (4246-4251) (PMV) (1366-1371) (4283-4288) CGGCCC (4246-4251) Maize chlorotic mottle virus ACUCCGU ACGGAGU GGGCCG (4366-4371) (MCMV) (1513-1519) (4294-4400) CGGCCC (4432-4437) Oat chlorotic stunt virus GUUGCG UAGCC (725-729) CGCAAU GCCCAC (3616-3621) (OCSV) (749-754) GGCUA (763-767) (4078-4083) GUGGGC (4109-4114) Cucumber leaf-spot virus AAGGGAUGG AGGGG (799-803) UCAUCCCUU GGGCUA (4375-4380) (CLSV) (877-885) CCCCT (903-907) (4403-4411) UAGCCC (4426-4431) Tobacco necrosis virus-D GAGGGG CCCCUC GGGU (3713-3716) (TNV-D) (701-706) (3728-3733) ACCC (3539-3542)

Cardamine chlorotic fleck virus GCGCG GGGG (826-829) CGCGC (CCFV) (844-848) CCCC (866-869) (4021-4025) Saguaro cactus virus AACAUG GAGGG (795-799) CUGUU GGGCG (3056-3060) (SCV) (811-816) CCCUC (836-840) (3589-3593) CGCCC (3875-3879)

Melon necrotic spot virus CAUGA GGGAC (940-944) UCAUG (MNSV) (954-958) GTCCC (977-981) (4229-4233)

Pelargonium flower break virus GAUGA GGGGG (808-812) UCAUC (PFBV) (824-828) CCCCC (847-851) (3840-3844)

Calibrachoa mottle virus CCUCAA GGGGG (821-826) UUGAGG (CbMV) (837-842) CCCCC (863-867) (3900-3905) Pea enation mosaic virus 2 GGGCG (4083-4087) GAGGUAA UUACCUC (PEMV2) (995-1001) (4237-4243) CGCCC (4249-4253) a RSE: recoding stimulatory elements. Sequences form a pseudoknot (pink nucleotides on top diagram) between a lower stem of the RSE and an internal bulge near the PRTE. bnt: nucleotide location of sequence based on reference genomes c RIV: essential RNA replication region. Forms a pseudoknot (purple nucleotides on top diagram) between the 3’- end of the genome and a bulge in the large side loop near the DRTE d No favorable internal pseudoknot interactions identified for viruses with grey shaded boxes

66

Table 1-3. Summary of characterized 3’-cap-independent translation elements/enhancers

TED BTE PTE TSS YSS ISS CXTE

Secondary RNA structure

cucurbit aphid- barley yellow borne yellow translation panicum Name draft virus T-shaped I-shaped Y-shaped virus (CABYV- enhancer mosaic virus- description (BYDV)-like structure structure structure Xinjiang) like domain like enhancer element translation element Satellite Cucurbit aphid- Virus tobacco Barley yellow Panicum Turnip crinkle Tomato bushy Maize necrotic borne yellow reported necrosis dwarf virus mosaic virus virus stunt virus streak virus virus virus eIF4G/ eIF4F/ ribosome eIF4E/ 40S ribosome eIF4E eIF4F ? eIFiso4F subunit 60S eIF4G subunit Ligand interacts binds to 4F via with binds to directly interaction with 4E/iso4E 4IF4E cap- recruits 4F/iso4F eIF4G interacts recruits and 4E through the through binding by interaction required eIFs. Ligand with 17-nt SL binds to P- cap-binding the cap- pocket with all four Active on 4E interaction conserved site of 60S pocket. binding through PTE eIF4F silence plants sequence subunit and Functional in pocket. internal components 80S ribosome the absence of Binds to pseudoknot 4E/iso4E 4G Long- distance base-pairing no base- Long-distance (complement Long-distance Long-distance 5'-to 3' pairing base-pairing ary base-pairing base-pairing RNA between 5'-to 3' RNA between SL-III sequences) (complementary (complementary interaction 3'UTR and interactions 5' to 3' (BTE) and between SL1 sequences) sequences) s using 5'UTR. 5'-3' using interaction 5'UTR hairpin (PTE) and between SL1 between SL1 compleme interaction complementary using hairpin (YSS) and (ISS) and ntary might require sequences. complementary structure on hairpin structure hairpin structure sequences an additional sequences the on the 5'UTR on the 5'UTR 3'-CITE 5'UTR/coding region of first ORF

67

CHAPTER 2: THE RNA OF MAIZE CHLOROTIC MOTTLE VIRUS - THE ESSENTIAL VIRUS IN MAIZE LETHAL NECROSIS DISEASE - IS TRANSLATED VIA A PANICUM MOSAIC VIRUS-LIKE CAP-INDEPENDENT TRANSLATION ELEMENT

Elizabeth J. Carino1,2, Kay Scheets3, W. Allen Miller*1,2

1Department of Plant Pathology & Microbiology, 1344 Advanced Teaching and Research

Building, Iowa State University, Ames, IA 50011

2Interdepartmental Genetics & Genomics Program, 2102 Molecular Biology, Iowa State

University, Ames, IA 50011

3Department of Plant Biology, Ecology & Evolution, 301 Physical Sciences, Oklahoma

State University, Stillwater, OK 74078

Modified from a manuscript to be submitted to scientific journal

Author contributions: EJC formal analysis, investigation, methodology, writing, reviewing, editing. KS investigation, writing preparation. WAM conceptualization, supervision, writing, reviewing, and editing.

Funding: NIH grant R01 GM067104 to WAM, Iowa State University Plant Sciences

Institute to WAM, Iowa State University Graduate Minority Assistance Program (GMAP) to

EJC.

Abstract

Maize chlorotic mottle virus (MCMV) combines with a potyvirus in maize lethal necrosis disease (MLND), an emerging disease worldwide that often causes catastrophic yield loss. To inform resistance strategies, we characterized the translation initiation mechanism of MCMV.

We report that, like other tombusvirids, MCMV RNA contains a cap-independent translation element (CITE) in its 3’ untranslated region (UTR). The MCMV 3’ CITE (MTE) was mapped to nucleotides 4164-4333 in the genomic RNA. SHAPE probing revealed that the MTE is a 68

variant of the panicum mosaic virus-like 3’ CITE (PTE). Like the PTE, electrophoretic mobility

shift assays (EMSAs) indicated that eukaryotic translation initiation factor 4E (eIF4E) binds the

MTE despite the absence of a m7GpppN cap structure, which is normally required for eIF4E to

bind RNA. The MTE interaction with eIF4E, suggests eIF4E may be a soft target for engineered

resistance to MCMV. Using a luciferase reporter system, mutagenesis to disrupt and restore base

pairing revealed that the MTE interacts with the 5’ UTRs of both genomic RNA and the 3’-

coterminal subgenomic RNA1 via long-distance kissing stem-loop base pairing to facilitate translation in wheat germ extract and in protoplasts. However, the MTE is a relatively weak stimulator of translation and has a weak, if any, pseudoknot, which is present in the most active

PTEs. Most mutations designed to form a pseudoknot decreased translation activity. Mutations in the viral genome that reduced or restored translation prevented and restored virus replication in maize protoplasts and in plants. We propose that MCMV favors a weak translation element to allow highly efficient viral RNA synthesis.

Author Summary

In recent years, maize lethal necrosis disease has caused massive crop losses in East

Africa. It has also emerged in East Asia and parts of South America. Maize chlorotic mottle virus (MCMV) infection is required for this disease. While some tolerant maize lines have been identified, there are no known resistance genes that confer full immunity to MCMV. In order to design better resistance strategies against MCMV, we focused on how the MCMV genome is translated, the first step of gene expression required for infection by all positive-strand RNA viruses. We identified a structure (cap-independent translation element) in the 3’ untranslated region of the viral RNA genome that allows the virus to usurp a host translation initiation factor in a way that differs from host mRNA interactions with the translational machinery. This difference may guide engineering of – or breeding for – resistance to MCMV. Moreover, this 69

work adds to the diversity of known eukaryotic translation initiation mechanisms, as it provides

more information on mRNA structural features that permit noncanonical interaction with a

translation factor. Finally, owing to the conflict between ribosomes translating and viral

replicase copying viral RNA, we propose that MCMV has evolved a relatively weak translation

element in order to permit highly efficient RNA synthesis and that this replication-translation

trade-off may apply to other positive-strand RNA viruses.

Introduction

Maize lethal necrosis disease (MLND, also referred to as corn lethal necrosis) first identified in the Americas in the 1970s [1], has recently spread worldwide, causing devastating crop losses and food insecurity across East Africa, where maize is the most important subsistence and cash crop [2-9]. It has also emerged in China [10], Taiwan [11], Spain [12], and

Ecuador, where the damage was so catastrophic in 2015 and 2016 that a state of emergency was declared [13, 14]

MLND is caused by a mixed infection of maize chlorotic mottle virus (MCMV) and any potyvirus that infects maize, usually sugarcane mosaic virus (SCMV) [1, 15, 16]. However,

MCMV infection can also be severe when combined with abiotic stress such as drought, or with viruses outside the family [9]. Efforts to identify genetic resistance against MCMV and potyviruses have revealed resistance to SCMV [19-22], but only tolerance to MCMV [23] with reduced symptoms and virus levels. To our knowledge, no genes that confer complete resistance to MCMV have been identified. Despite its economic importance [5, 24-26], little is known about the molecular mechanisms of MCMV replication, gene expression, or its interactions with the host, which could provide valuable knowledge toward identifying targets for resistance breeding or engineering strategies. 70

MCMV is the sole member of the genus Machlomovirus in the family Tombusviridae

[27]. The 4437-nucleotide (nt) positive-sense RNA genome contains no 5’ cap, no poly(A) tail,

and encodes seven open reading frames (ORFs) [28-30]. The 5’ end of the genome contains two overlapping ORFs that code for a 32 kDa protein (P32) and a 50 kDa (P50) replicase-associated protein (RAP). The P50 ORF has a leaky stop codon, which allows for readthrough translation of a 61 kDa C-terminal extension on P50 to form the 111 kDa RNA-dependent RNA polymerase

(RdRp) [31] (Figure 2.1-A). In infected cells, MCMV generates two 3’-coterminal subgenomic

RNAs that are 5’-truncated versions of the genomic RNA. Subgenomic RNA1 (sgRNA1), spanning nts 2971-4437 serves as mRNA from which the coat protein (CP), and the movement proteins P7a, P7b, and P31 are translated. The 337 nt sgRNA2, representing the 3’ untranslated region (UTR), is a noncoding RNA [28]. Although the MCMV genome has been characterized to some extent [17, 28, 31], little is known about its translation mechanisms, an essential process

in the replication cycle.

Many positive-strand RNA viruses use noncanonical translation mechanisms, including

cap-independent translation. This frees the virus from having to encode capping enzymes, and

also allows the virus to avoid the host’s translational control system which often acts through

cap-binding proteins [32-34]. Because it differs from host mechanisms, this virus-specific

translation mechanism may provide unique targets for antiviral strategies. A translation strategy

used by all studied tombusvirids is to harbor a cap-independent translation element (CITE) in the

3’ UTR of the virus’ genomic RNA, which is uncapped [35-37]. The 3’ CITE replaces the role of

the m7GpppN cap structure present at the 5’ end of all eukaryotic mRNAs. About seven different 71

structural classes of 3’ CITE are known [35, 38, 39]. Most 3’ CITEs attract the key ribosome-

recruiting eukaryotic translation initiation factor heterodimer, eIF4F, by binding to one or both of

its subunits, eIF4G or eIF4E [35, 39-43].

Because all other tested tombusvirids harbor a 3’ CITE, we predicted that MCMV RNA

harbors a 3’ CITE to facilitate its translation. In silico analysis of the MCMV 3’ UTR using

MFOLD and ViennaRNAfold to predict RNA secondary structures did not reveal a structure

resembling a known 3’ CITE. Here we provide experimental data that demonstrate the presence

and function of a 3’ CITE, that we call the MCMV 3’ CITE (MTE). The MTE is structurally

similar to the panicum-mosaic virus-like translation element (PTE) class of CITE. We identify a

key translation initiation factor with which the MTE interacts (eIF4E), and show how the MTE

base pairs to the 5’ UTR to facilitate cap-independent translation, and that the functional MTE and the long-distance interaction are required for infection of maize. The results contribute to our understanding of structure-function relationships of cap-independent translation elements, and valuable information on the first step of gene expression of an important pathogen.

Results

Mapping the 3’-cap-independent translation element in MCMV

To roughly map the 3’ CITE of MCMV, we translated 3’-truncated transcripts from a full-length cDNA clone of the MCMV genome (pMCM41 [29]). pMCM41 DNA templates were transcribed in the absence of cap analog while pMCM721, which lacks the 5’ terminal A of the

MCMV genome, was used for capped transcripts because the 5' A of pMCM41 (identical to

MCMV RNA) cannot be capped using an m7GpppA cap analog and T7 RNA polymerase (KS

unpublished observation). Plasmid psgRNA1 was the template for transcription of full-length 72

sgRNA1, the mRNA for the 25 kDa CP and the movement proteins [28]. Transcribed RNAs and

RNA isolated from virions (vRNA) were translated in wheat germ extract (WGE) in the presence of 35S-methionine. The full-length, infectious transcript from SmaI-linearized pMCM41 and

vRNA yielded two protein products, P32 and P50, from the 5’-proximal overlapping ORFs

(Figure 2.1-B). Interestingly, vRNA yielded much less P32 protein, relative to P50, than did the

transcribed mRNA. Also, a faint band comigrating with CP is visible from both vRNA and the

full-length transcript. The expected 111 kDa protein generated by readthrough of P50 stop codon

was not detected, most likely because ribosomal readthrough occurs at a very low rate in these

conditions. Readthrough products have been difficult to detect among the in vitro translation

products of other tombusvirid genomes as well [44-46]. Unlike the full length genomic RNA

from SmaI-cut pMCM41, which yielded substantial protein products, the uncapped 3’-truncated transcripts produced almost no detectable protein product, suggesting that the 3’ CITE is downstream of the SpeI site at nt 4191 (Figure 2.1-B). It is noteworthy that translation in the presence of a 5’ cap on full-length and truncated pMCM41-derived RNAs gave much more translation product than uncapped full-length pMCM41 transcript or vRNA, indicating that the viral genome may be a relatively inefficient mRNA.

To rapidly map the 3’ CITE location at high resolution, a luciferase reporter (MlucM) was constructed such that the coding region of the virus was replaced by the firefly luciferase

(Fluc) coding sequence (Figure 2.2-A). Deletion analyses showed slight decrease in translation in vitro or in vivo when either the 3’-terminal 104 nt (nts 4334-4437) or the first 169 nt (nts 4095-

4263) of the 3’ UTR were deleted (Figure 2.2-B). Additional constructs that included the adjacent sequence upstream of the 3’ UTR, up to nt 3578 in the CP ORF, translated more efficiently than those that contained only the 3’ UTR (Figure 2.11-S1). However, the sequence 73

upstream of the 3’ UTR (3578-4108) alone was not enough to support translation, and the

greatest contributor to translation was mapped to the 3’ UTR. Numerous deletions in the MCMV

3’ UTR revealed that the region between nucleotides 4164-4333 produced luciferase activity

>100% of that from the full-length 3’ UTR in vitro and about 50% in vivo. The lower level of

translation in vivo may be due to reduced RNA stability owing to the absence of the 3’ end,

which is thought to confer stability in related viruses because of its highly base-paired terminal bases [47-49]. Deletions within nts 4164 to 4333, primarily of nts 4200-4300, reduced luciferase

translation in vitro, so in subsequent studies, we focused on nts 4164-4333 to characterize the

MCMV 3’ CITE (MTE).

Determining the secondary structure of MCMV 3’-CITE

To determine the structure of the MTE, we first attempted to predict its secondary

structure computationally. Two different algorithms, MFOLD [50], and ViennaRNA Package

[51], predicted a hammer-shaped structure unlike any known 3’ CITE, so we proceeded to

determine its secondary structure experimentally by subjecting the MTE (nts 4164-4333) to

selective 2’-hydroxyl acylation analyzed by primer extension (SHAPE) probing (Figure 2.3-A).

This revealed that the MTE consists of a long helix with various asymmetric internal loops

topped by two branching stem-loops (Figure 2.3-B), which differed from the computer-predicted

structure. The main stem contains a purine-rich bulge between nucleotides 4216-4223. In the

presence of magnesium ion, bases G4215, A4216, and G4219 were hypermodified by the SHAPE

reagent benzoyl cyanide, while bases AGA4221-4223 became hypomodified (Figure 2.3-A). The

MTE also contains a single-stranded “bridging domain” (nts 4246-4250) connecting the two branching stem-loops, which was moderately modified in the presence and absence of magnesium. Side loop-I (SL-I4235:4241) houses a pentamer, UGCCA4236-4240, in its loop that is 74

complementary to sequence UGGCA in the 5’-UTR. These pentamers may create a long- distance base-pairing interaction between the 3’ and 5’ UTRs (discussed later). The overall structure obtained from the SHAPE probing assays indicates that the MTE has a similar structure to those of Panicum mosaic virus-like 3’ CITES (PTEs) [38, 52], but differing by the presence of three hypermodified bases rather than a single hypermodified G in the purine-rich bulge in the presence of Mg2+ [53].

Comparison of the MTE to PTEs: role of the pseudoknot

Because the MTE SHAPE probing experiments suggested that the MTE resembled a

PTE, we compared the secondary structure of the MTE with known and predicted PTEs using

the alignment program, LocARNA [54, 55] (Figure 2.4-A). This alignment revealed a consensus structure with more variability than reported previously [53] because more predicted PTE sequences are aligned than previously. The MTE and PTEs contain a purine-rich bulge with at least one highly conserved G. However, the previously termed “C-rich” domain of PTE that bridges between stem-loops 1 and 2 is not always C-rich, thus we now call it the bridging domain. One putative PTE, from Pea stem necrosis virus (PSNV), contains no bridging domain, and only a two base-pair stem in stem-loop 2 (Figure 2.4-A). However, it has not been demonstrated to be functional. Potential pseudoknot base pairing between the purine-rich bulge and the bridging domain (square brackets Figure 2.4-A), can be drawn for all PTEs except

PSNV. However, for the MTE and some other PTE-like structures – the pseudoknot, if it exists at all, would consist of only two Watson-Crick base pairs: AG4221-4222:CU4249-4250 in the MTE.

The SHAPE probing (Figure 2.3-A) indicates that modification of AG4221-4222 decreased in the

presence of Mg2+, and they are thus probably base-paired (which favors pseudoknot formation), but the already-modest SHAPE sensitivity of bases CU4249-4250 in the bridging domain does not 75

decrease in the presence of Mg2+, as would be expected if the proposed pseudoknot forms.

However, the bridging domain of other PTEs in which this pseudoknot is likely also shows little

change in the modification in the presence of Mg2+ [53]. Thus, as with other PTEs, although

phylogenetic and structural data suggest this pseudoknot occurs, we cannot conclude this without

a doubt.

Because the MTE at least partially resembles the PTE consensus, we compared the MTE

translation stimulation activity with that of other PTEs. Translation activity of luciferase reporter

constructs containing PTEs in the 3’ UTR and the corresponding viral 5’ UTR [56] were

compared to a construct containing MCMV 5’ and 3’ UTRs (MlucM). MlucM stimulated

translation at a low level relative to most PTEs (Figure 2.4-C). However, it stimulates translation

9-fold more than MCMV mutant C4238G, which prevents base pairing of the MTE to the 5’

UTR (below), and 20-fold more than the negative control PEMV2-m2, (a CC to AA mutation in

the bridging domain), which was shown previously to virtually eliminate PTE activity [53, 56].

As reported previously [56], the PTE of Thin paspalum asymptomatic virus (TPAV)

stimulated translation to a much higher level than the others. The PTEs of Japanese iris necrotic

ring virus (JINRV) and Pelargonium flower break virus (PFBV) were not statistically significantly more stimulatory of translation than that of MCMV. Thus, even though the MTE resembles the PTE structure, it appears that MCMV (and JINRV and PFBV) have relatively weak PTE-like 3’ CITEs, compared to other characterized PTEs. It is noteworthy that here and previously [56], the most stimulatory PTEs (TPAV, PMV) have strong GGG:CCC pseudoknot base pairing between the purine-rich bulge and the bridging domain, whereas “weak” PTEs, such as the MTE and JINRV have little if any pseudoknot base pairing (Figure 2.3-B and Figure 2.12-

S2, respectively). However, the presence of a strong pseudoknot does not guarantee a strong 76

translation enhancer, as indicated by PEMV2 and HCRSV PTEs (Figure 2.4-A, Figure 2.4-C,

Figure 2.11-S1, Figure 2.12-S2). The role of potential pseudoknot base pairing is explored further below.

We constructed a series of mutations in the purine-rich bulge and bridging domain to test whether changes in these areas predicted to strengthen or weaken the pseudoknot had effects on translation efficiency (Figure 2.5). These included mutations designed to determine if a stronger pseudoknot could increase translation activity. Mutant A4248U, which should lengthen the proposed wild type pseudoknot from two (AG4221-4222:CU4249-4250) to three (AGA4221-

4223:UCU4248-4250) base pairs, translated only 55% as efficiently as wild type in WGE (Figure 2.5-

B). This mutant could also potentially form an ACU4216-4218:AGU4246-4248 pseudoknot helix. To

disrupt that possibility, a U4218A mutation was added. This double mutant translated 70% as

efficiently as wild type in WGE (Figure 2.5-C). However, neither of these mutants translated

appreciably in the more competitive conditions in protoplasts. In other constructs, mutations in

both the purine-rich bulge and the bridging domain were introduced to generate pseudoknot

base-pairing predicted to be more stable than wild type. In constructs in which the purine-rich

bulge remained purine-rich and the bridging domain became pyrimidine-rich, changing the

purine-rich domain or the bridging domain alone reduced translation (Figure 2.5 panels D, E, G,

H), while the double mutants capable of forming the pseudoknot (GGG:CCC or AAAA:UUUU)

translated more efficiently than the single-domain mutants. The GGG:CCC pseudoknot actually

yielded 50% more luciferase than wild type in WGE and protoplasts (Figure 2.5-F), whereas the

AAAA:UUUU predicted pseudoknot translated 35% as efficiently as wild type in WGE (Figure

2.5-I), which was slightly greater than the UUU mutation alone (which may form a weak

pseudoknot containing two G:U pairs) or the AAA mutation in the purine-rich bulge. Each of 77

these mutants translated about 15-20% as efficiently as wild type in WGE. However, none of

this set of mutants translated detectably in protoplasts (Figure 2.5, panels G, H, I). Swapping the

purines and pyrimidines to create a potential: ACCC:GGGU pseudoknot helix gave low and no

cap-independent translation in WGE and protoplasts, respectively (Figure 2.5-J). One mutation,

G4219U in the purine-rich bulge, was not predicted to affect pseudoknot interactions and did not

affect the translation activity of the MTE (Figure 2.5-K). This is interesting because G4219 is

hypermodified in the presence of Mg2+ (Figure 2.3). Overall, with one rather modest exception

(Figure 2.5-F), mutations designed to increase pseudoknot base-pairing altered the structure in such a way as to decrease translation efficiency.

The MTE binds eIF4E

Previously, PTEs have been shown to bind and require eukaryotic translation initiation factor 4E (eIF4E, the cap-binding protein), despite the absence of methylation (cap structure) on

the PTE RNA [53, 57]. Because the MTE resembles PTEs, we used electrophoretic mobility

shift assays (EMSA) to determine whether the MTE also binds eIF4E. To confirm eIF4E

integrity (cap-binding ability), capped forms of the tested MTE constructs, were incubated in the

presence of eIF4E and shown to confer strong mobility shifts (Figure 2.13-S3). We used the highly efficient TPAV PTE as a positive control for eIF4E binding to an uncapped PTE, as shown previously, and the nonfunctional TPAVm2, which contains mutations (CC to AA) in the bridging domain that inactivate the TPAV PTE and greatly reduced the binding affinity of the

PTE to eIF4E (in the absence of a cap) as a negative control [56]. As expected, the TPAV PTE, formed a protein-RNA complex (Figure 2.6), as indicated by the reduced mobility of 32P-labeled

PTE in the presence of eIF4E. Also as expected, the nonfunctional TPAVm2 PTE bound to

eIF4E only at very high concentrations, and most RNA remained unbound (Figure 2.6). Some 78 nonspecific binding to any RNA by eIF4E is expected, as it is a low-affinity nonspecific RNA- binding protein [58, 59]. The wild type MTE4187-4326 migrated as two bands in the absence of added protein. Based on previous studies and migration patterns of other 3’ CITES [56, 57], we surmise that the fastest moving (and most abundant) band represents the properly folded MTE.

Note that this band disappeared in the presence of ≥100 nM eIF4E (Figure 2.6-B, MTE-wt).

Thus, like PTEs, the MTE interacts with eIF4E with high affinity (Kd = 80-100 nM).

We next tested the ability of mutant MTEs to bind eIF4E, in order to determine if eIF4E binding correlates with the translation enhancement function, as was observed previously for the

TPAV PTE. Purine-rich bulge mutants U4218G and G4219U, which gave 50% and 100% of wild type translation, respectively, showed very similar EMSA profiles to wild type MTE

(Figure 2.6). The mutant designed to strengthen the pseudoknot interaction, U4218G/GA4247-

4248CC, and which gave 150% of wild type translation (Figure 2.5-F) shifted similarly to

U4218G alone, but with slightly less complete binding. This may be due to partial misfolding, as detected in the full-length genome context (below). MTE mutant GAC4247-4249UUU, which had greatly reduced translation, showed significantly less binding than the others at 80-

200 nM eIF4E, and some RNA remained unbound at the highest eIF4E concentrations. Finally, the mutant designed to prevent base pairing to the 5’ UTR (C4238G), but not expected to disrupt eIF4E binding because it enables cap-independent translation in the presence of complementary sequence in the 5’ UTR (shown below), gave an EMSA profile very similar to other functional mutants. In summary, functional MTEs bind eIF4E with higher affinity than a nonfunctional mutant, consistent with a requirement for eIF4E binding.

79

Secondary structure of the MCMV 5’-UTR

Most plant viral 3’ CITEs that have been studied interact with the 5’ end of the viral genomic RNA and subgenomic mRNA via long-distance base pairing of the 3’ CITE to the 5’

UTR, presumably to deliver initiation factors to the 5’ end where they recruit the ribosomal 40S subunit to the RNA [35, 52, 60-62]. Thus, we sought to determine if the same interaction occurs

in MCMV RNA. Initial in silico analysis of the 5’-end structure of MCMV revealed two sites

(GGCA12-15 or UGGCA103-107) that could potentially with the MTE sequence

UGCCA4236-4240 in loop 1. An RNA transcript containing bases 1-140 of MCMV RNA,

including the P32 (AUG118-120) and P50 (AUG137-139) ORF start codons, was subjected to SHAPE probing to determine which region was most likely to be available (single stranded) to interact with the MTE (Figure 2.7-A). The 5’ UTR (nts 1-117) was found to consist of a large stem-loop

with several large bulges, followed by a short stable stem-loop terminating 5 nt upstream of the

start codon (Figure 2.7-B). The first potential MTE-interacting sequence (GGCA12-15) is buried

in a stem helix, while the UGGCA103-107 is in a favorable loop (Figure 2.7-B). This led us to suspect that UGGCA103-107 is the potential base-pairing sequence that interacts with MTE side

loop 1.

Long distance base pairing of the MTE to the 5’ UTR

We next defined functionally which (if any) of the above candidate sequences base pairs

to the MTE. Mutations were introduced in the XGCCA regions in the 5’ UTR (nts 11-15 or 103-

107) and the MTE UGCCA4236-4240 region (Figure 2.8-A). Mutation of G13 to C caused only a

small decrease in luciferase activity in WGE and in oat protoplast translation systems (Figure

2.8-B). In contrast, mutation of G105 to C, reduced luciferase activity by ~75% in WGE and

protoplasts. Even more extreme, the C4238G mutation of the middle base in the MTE

UGCCA4236-4240 tract decreased luciferase activity by 80% to 90% (Figure 2.8-B, see also Figure 80

2.4-C). These mutations were then combined to restore any long-distance base pairing that may

have been disrupted. The MlucMG13C/C4238G double mutant, which would restore long-distance

base pairing to the 5’-proximal complementary sequence in the 5’ UTR, yielded the same low

translation activity as C4238G alone. In contrast, double mutant MlucMG105C/C4238G, which is

predicted to restore long-distance base pairing of the MTE to the 5’ distal complementary

sequence in the 5’ UTR, yielded a two-fold increase in translation activity when compared to

MlucMG105C and a 3-4-fold increase in translation relative to the more deleterious C4238G single

mutation (Figure 2.8-B). While the compensating mutations did not fully restore a wild type

level of translation, the fact that the double mutant MlucMG105C/C4238G but not MlucMG13C/C4238G

translated significantly more efficiently than MlucMC4238G supports base pairing between 5’ UTR

nts 103-107 and MTE nts 4236-4240 as a requirement for efficient cap-independent translation.

The MTE should also base pair to the 5’ UTR of sgRNA1, to allow translation of the

viral coat and movement proteins. Indeed, we identified a sequence, UGGCA2979-2983 in the short

25 nt 5’ UTR of sgRNA1, which matches the UGGCA103-107 that base pairs to the MTE. This

sequence is predicted to be in the terminal loop of the stem-loop that occupies the 5’ UTR of sgRNA1 (Figure 2.9-A), which starts at nt 2971 [28]. We investigated both the effect of this short 25 nt 5’ UTR on MTE-mediated translation efficiency and the role of base pairing (if any) between UGGCA2979-2983 and UGCCA4236-4240 in the MTE. In WGE, the sgRNA1 5’ UTR

enabled translation about equally efficiently as did the genomic 5’ UTR, while the translation

was about two-thirds as efficient in oat protoplasts (Figure 2.9-B), perhaps due to less RNA

stability conferred by the shorter 5’ UTR. Separate G2981C and C4238G mutations in the

sgRNA1 5’ UTR and the MTE, respectively, reduced translation significantly in both WGE and

oat protoplasts. The double mutant, designed to restore predicted base pairing, with G2981C and 81

C4238G in the same construct, gave surprising results. In WGE, as predicted, the double mutant

fully restored translation to wild type levels. However, in oat protoplasts, the same mRNA was

as nonfunctional as those containing single G2981C and C4238G mutations, showing no

restoration of translation whatsoever. Because WGE is a high-fidelity translation system that

measures only translation, independent of the complicated environment of the cell, we conclude

that the long-distance base pairing is necessary for efficient cap-independent translation. We

speculate that the G2981C mutation altered the RNA in such a way as to make it highly unstable

in cells or able to fortuitously interact with cellular components that preclude translation, and

which are absent in WGE.

Effects of mutations on the translation of full-length MCMV genomic RNA

To determine the effects of mutations that affect translation in the natural context of

genomic RNA, selected mutations from the luciferase experiments were introduced into the

MCMV infectious clone pMCM41. Firstly, uncapped, full-length genomic RNA transcripts from pMCM41 mutants were translated in WGE, and the predominant 35S-met-labeled viral protein

products (P50, P32 and P25) were observed. In agreement with the luciferase reporter

constructs, mutants MCM41C4238G and MCM41G105C, which disrupt the long-distance base

pairing between MTE and 5’ UTR, yielded less viral protein than wild type (Figure 2.10-A). In contrast, the double mutant, MCM41G105C/C4238G, which restores the long-distance base pairing,

translated more efficiently than wild type RNA for the P32 and P50 proteins. A 25 kDa protein,

presumably the viral coat protein (MW 25 kDa), was not expected to be translated much from

genomic RNA, as seen in Fig 1B, but it appeared in this experiment. Its translation remained at

about the same reduced level in the double mutant as in the single mutants. MCM41 mutants

G4219U, U4218G, and Δ4200-4300 translated similarly to the luciferase constructs, relative to

wild type. However, the mutant designed to form a GGG:CCC pseudoknot in the MTE, 82

MCM41U4218G/GA4247-4248CC, showed a substantial decrease in translation, in contrast to the same

mutation in the luciferase reporter, in which translation was increased by 50% (Figure 2.5-F).

To determine if the mutants that translated poorly in the full-length genome context did

so due to the gross misfolding of the MTE, we performed SHAPE probing in the context of the

MCMV genome (Figure 2.16-S6). Mutant C4238G, which disrupted long-distance base pairing

with 5’UTR, maintained near-wild type MTE secondary structure, the only difference was the

SHAPE reactivity data decreased in the side loop 1 (4235-4241) (Figure 2.17-S7). Functional

mutants G4219U and U4218G were also similar in structure to WT (Figure 2.17-S7), with the

exception that in U4218G, the purine-rich bulge was less modified in the absence and presence

of magnesium (Figure 2.16-S6). Interestingly, secondary structures of the MTE mutants

designed to have a strong pseudoknot, GA4247-4248CC and U4218G/GA4247-4248CC were

changed radically, with either an increased SL-I stem at the expense of the pseudoknot, or

formation of an unbranched, multiply-bulged stem-loop structure (Figure 2.17-S7). Both

mutants changed the wild type MTE structure in such a way that forces the reverse transcriptase

to stop around nucleotides 4228-4241 (~SLI) even in the absence of SHAPE chemicals (Figure

2.16-S6). These SHAPE results may explain the difference in the function of this mutant

between reporter assay (functional) and viral genomic context (nonfunctional).

Replication of mutant MCMV RNA

To test the effects of mutations on viral RNA replication and accumulation, maize protoplasts from a Black Mexican Sweet (BMS) cell culture were transfected with full-length mutant MCM41 transcripts. Unfortunately, in BMS protoplast preparations, high levels of RNase obscured detection of distinct viral RNAs in northern blot hybridization, so the level of viral

RNA was quantified by simple dot blot hybridization. Single mutants, which disrupted long- distance base-pairing, MCM41C4238G, and MCM41G105C, yielded about 35-40% as much viral 83

RNA replication product as wild type, while the double mutant, MCM41G105C/C4238G, yielded

about twice as much viral RNA as the single disruptive mutants (Figure 2.10-B). On the other

hand, the mutant designed to form a GGG:CCC pseudoknot, MCM41U4218G/GA4247-4248CC, yielded

80% less viral RNA than the wild type, whereas replication of MCM41G4219U was not significantly less than wild type. Finally, MCM41Δ4200-4300 produced virtually no viral RNA.

Overall, efficiently translating mutants replicated at near-wild type levels, while mutants with reduced translation in the full-length context accumulated much lower levels of viral RNA, as expected.

We next attempted to validate the observations in protoplasts by inoculating maize plants with pMCM41 mutant transcripts (average of 36 individual inoculations per mutant). Maize B73 plants inoculated with wild type MCM41 transcript began to exhibit chlorotic mottling symptoms between 8-10 dpi, while sweet corn (Golden Bantam) plants exhibited symptoms around 6-9 dpi. Viral RNA was detected via RT-PCR in inoculated leaves of all plants at 8 dpi, including those that never showed chlorotic mottling (Figure 2.15-S5). At 14 dpi, samples from the newest systemic leaves were subjected to northern blot hybridization with a probe complementary to the 3’ end of the MCMV genome (Figure 2.10-C). Following inoculation of both B73 and Golden Bantam maize plants, only three of the nine MCMV mutants tested elicited symptoms (chlorotic mottling). MCMV RNA was never detected in asymptomatic plants via northern blot hybridization. Viral RNA from samples displaying positive northern blot signals was subjected to Sanger sequencing for verification of introduced mutations (Table 2-1). The few single mutants (C4238G, G105C, U4218G) that showed symptoms at 14 dpi had reverted to wild type MCM41 (Table 2-1). MCM41G4219U had a similar infectivity to wild type, but in about

half of the infected plants the virus reverted to wild type (Table 2-1). Moreover, the 84

MCM41G4219U that did not revert to wild type accumulated less RNA (Figure 2.10-C), suggesting

that, although the G4219U mutation is tolerated, the wild type sequence is more competitive.

With one exception (below) MCM41G105C/C4238G retained its introduced mutations but the

infectivity (Table 2-1) and RNA accumulation (Figure 2.10-C) also was reduced relative to wild

type.

Interestingly, the viral RNA from one sweet corn plant inoculated with the

MCM41G105C/C4238G mutant acquired additional mutations. These spontaneous mutations

consisted of deletion of G94 in the 5’ UTR, and a U492A point mutation in the overlapping P32

and P50 ORFs. This mutation introduced a stop codon in the P32 frame, shortening the protein

from 289 aa to 125 aa, and changed amino acid 119 from valine to glutamic acid in the P50 and

P111 (RdRp) proteins. This spontaneous mutant did not induce symptoms until 12 dpi, but after

14 dpi, symptoms were more extreme than wild type, giving nearly translucent leaves (Figure

2.10-D). In summary, the MCMV genome tolerated few mutations, and only those mutations

that allowed the most efficient translation replicated in maize plants.

Discussion

Identification of the 3’CITE in MCMV genome

Given the severe losses it has caused since 2011 in mixed infection with the potyvirus

SCMV in East Africa [7, 63, 64], China [10, 65, 66] and Ecuador [13], and given the cost of

screening seed worldwide to ensure absence of this seed-transmitted virus [63, 67, 68], MCMV

is almost certainly the most economically important virus in the >76-member Tombusviridae family. Thus, it is imperative to understand the life cycle of MCMV, including translation, to identify molecular targets for genetic approaches to control this virus. All tombusvirids that have been studied contain a 3’ CITE [35, 40, 60, 69, 70], yet these are not predictable because

the class of 3’ CITE a tombusvirid possesses does not always correlate with phylogeny [35, 52]. 85

Moreover, we were unable to predict the class of 3’ CITE in MCMV RNA using ViennaFold or

MFOLD. Therefore, we used experimental approaches to determine that MCMV contains a 3’

CITE in the PTE class, which we call MTE. Unlike the most efficient PTEs, the MTE lacks a

strong pseudoknot and stimulates cap-independent translation less efficiently than most PTEs.

Previous studies of nine PTEs revealed a common secondary RNA structure composed of a branched structure with two side loops (Figure 2.4) connected by a pyrimidine-rich bridging domain [52] and a purine-rich bulge (formerly called G bulge) in the basal stem. A uniformly conserved feature is a guanosine in the purine-rich bulge that can be hypermodified by SHAPE reagents such as benzoyl cyanide in the presence of magnesium [53]. Despite this hypermodification, previous evidence also supports a pseudoknot interaction between purine-rich

bulge bases and the bridging domain [53]. In the case of the MTE and some PTEs, the predicted

pseudoknot interaction is tenuous, as the bridging domain is not always pyrimidine-rich,

however, the magnesium-dependent hypermodified G in the G-bulge is still present. In the PMV

and PEMV PTEs, this region is protected from SHAPE reagents by eIF4E and is thus the likely

eIF4E-binding site [53]. The structural basis of this interaction is unclear, but the strongest PTEs

have a relatively strong pseudoknot base pairing (GGG: CCC), whereas PTEs that have weak or

no Watson-Crick base pairs are generally less stimulatory of translation (Figure 2.4). One

mutant MTE designed to have a strong(er) pseudoknot yielded 50% more translation product

than the wild type MTE in the luciferase reporter context (MlucMU4218G/GA4247-4248CC, Figure 2.5-

F), but these same mutations reduced translation, and thus replication, when in the context of the

MCMV full-length clone (Figure 2.10) owing to misfolding (Figure 2.17-S7). Other mutants designed to have pseudoknots also reduced translation because of misfolding (Figure 2.17-S7). 86

Long-distance kissing stem-loop interactions

A sequence in the 5’ UTR complementary to a loop in the 3’-CITE required for efficient translation has been found in most 3’-CITE-containing viral genomes [35]. This presumably facilitates initiation factor-mediated recruitment of the ribosomal subunit to the 5’-end of the

RNA [35, 52, 60-62, 71].The sequence of the MTE long-distance kissing stem-loop interactions, between either the genomic 5’ UTR or the subgenomic RNA 5’ UTR and the MTE is

UGGCA:UGCCA. This sequence fits the consensus found in many tombusvirids:

YGGCA:UGCCR [35, 52], supporting our experimental evidence (Figure 2.8, Figure 2.9). Why this particular sequence pair has been conserved is unknown.

The unexpected translation of CP (P25) from full-length MCM41 RNA observed in Figure 2.10-

A, but only minimally in Figure 2.1-B, maybe due to different WGE batches. The two experiments were performed in Miller’s and Scheet’s labs, respectively. This may be due to variation in initiation factor levels or nucleases among batches of WGE used. It is possible that nucleolytic degradation in WGE generates small amounts of sgRNA1-sized RNA available for translation without internal ribosome entry.

In fact, in WGE, the noncoding sgRNA2 is indeed generated by exonucleolytic degradation of all but the

3’ UTR of genomic RNA (our unpublished data, and [72]). Alternatively, internal ribosome entry is possible, as Simon’s lab has reported that another tombusvirid, TCV, harbors an internal ribosome entry site (IRES) to allow direct translation of CP from genomic RNA (as well as from sgRNA) [73].

However, the TCV IRES region is a tract of unstructured RNA, and we found no similar sequence upstream of the MCMV CP start codon.

The MCM41C105G/G4238C transcript, which translates more efficiently than wild type in WGE, replicates indistinguishably from wild type MCM41 in protoplasts, but accumulates to much lower levels in whole plants. This is to be expected because the G4238C mutation in the MTE would prevent base pairing to the 5’ end of sgRNA1, which remains a wild type sequence. Thus, we predict translation of the

CP and movement proteins is reduced in this construct. CP and movement proteins are not necessary for 87

replication of other tombusvirids in protoplasts but would of course be necessary for the virus to

accumulate in plants, thus explaining the difference in accumulation in Figure 2.10-B and Figure 2.10-

C.

Translation of sgRNA1

Comparison of translation efficiencies of genomic (MlucM) and subgenomic RNA

(Msg1lucM) reporter constructs revealed no striking difference in translation efficiency

conferred by the 5’ UTR (Figure 2.9). This is interesting, because the secondary structure and

length of the 5’ UTR is much less in the sgRNA1 5’ UTR (24 nt, G = -6.6 kcal/mol) compared

to that of genomic RNA 5’ UTR (117 nt, G = -21.2 kcal/mol). We expected the sgRNA1 5’

UTR to provide superior translation efficiency because it should provide less resistance to

ribosome scanning. Perhaps in infected cells, in the presence of the abundant sgRNA2, which

contains the MTE, the sgRNA1 5’ UTR could outcompete the genomic RNA 5’ UTR for

limiting eIF4E as discussed below.

In addition to stimulating translation of genomic RNA and sgRNA1 in cis, the MTE may

regulate translation initiation in trans. Like certain other tombusvirids, such as red clover

necrotic mosaic virus (RCNMV) [74, 75], tobacco necrosis virus-D [76] and the flaviviruses

[77], MCMV generates a noncoding sgRNA (sgRNA2) corresponding to most of the 3’ UTR

[28]. These RNAs are generated by exonucleolytic degradation of the larger viral RNAs until the exonuclease reaches a blocking structure called xrRNA at which point it stops, leaving the sgRNA, the 337 nt sgRNA2 in the case of MCMV, intact [72]. Because this abundant RNA contains the MTE, we propose that it regulates viral and host translation by binding and sequestering the eIF4E subunit of eIF4F [78]. sgRNA2 of BYDV (related to the Tombusviridae) binds the eIF4G subunit of eIF4F and, as a result, inhibits translation of BYDV genomic RNA, 88 while favoring translation of BYDV sgRNA1, which, like MCMV sgRNA1, codes for movement and coat proteins. This would free genomic RNA of ribosomes, making it available for replication, encapsidation, and cell-to-cell movement. The selective inhibition of gRNA was on to be due to its highly structured 5’ UTR, which likely increases dependence on eIF4F relative to the unstructured 5’ UTR of sgRNA1 [79]. The same regulation may occur on MCMV

RNA. Thus, the MTE may play an essential role in MCMV infection by acting in trans. The role of sgRNA2 may be relevant to flaviviruses, which also produce noncoding sgRNAs from the 3’ UTRs that bind a variety of host proteins [77, 80-82], and which can affect translation

[83].

Efficient replication vs efficient translation?

It is possible that the relatively weak stimulation of translation by the MTE, and the relatively inefficient translation of viral genomic RNA, relative to artificially capped genomic

RNA (Figure 2.1-B), in contrast to the extremely high accumulation of MCMV RNA in infected cells (which is often visible by staining of total RNA from infected cells), may reveal a fundamental principle in RNA virus replication strategy. We propose that MCMV sacrifices efficient translation in exchange for extremely efficient RNA synthesis. RNA synthesis is inhibited by the presence of translating ribosomes on the viral RNA [82, 84, 85]. Thus, MCMV genomic RNA may have evolved to translate “poorly” in order to make it available to the viral replicase for highly efficient RNA replication. Moreover, owing to its abundance, genomic RNA may simply not need to translate efficiently. Indeed, highly efficient translation of such an abundant RNA could overwhelm the ribosomes, preventing translation of host mRNAs, harming the cell, and thus the virus itself. The 5’ UTR of the unrelated wheat yellow mosaic bymovirus harbors an IRES in a dynamic equilibrium state, which also gives suboptimal translation [86], perhaps also to allow efficient replication. In contrast, BYDV RNA has a highly efficient 3’ 89

CITE (BTE) [87], but accumulates to much lower levels in cells (not detectable by staining of

total RNA). BYDV may have adopted a strategy to accumulate low levels of RNA and virus

particles, which are adapted to very efficient acquisition by . This RNA would require

much more efficient translation in order to compete with host mRNAs for ribosomes. Thus,

viruses such as MCMV are “replication strategists”, going “all in” on RNA synthesis at the

expense of translation efficiency, while others, such as BYDV are “translation strategists”,

translating efficiently at a cost to RNA replication.

Toward host resistance to MCMV

As mentioned above, understanding translation may lead to strategies for resistance. The

MTE binds eIF4E via what must include different molecular contacts than the binding of a 5’

m7G cap to eIF4E via the cap-binding pocket [88]. Thus, it may be possible to identify mutants of eIF4E that lose the ability to bind the MTE but retain a functional cap-binding pocket to allow translation of host (capped) mRNAs. In melon, such a resistance mechanism has been identified against melon necrotic spot virus (MNSV). A point mutation in melon eIF4E confers recessive resistance to most strains of MNSV, while having no negative effect on melon agronomic performance [89, 90]. This mutation was shown to preclude translation of MNSV RNA (thus blocking infection) by preventing efficient binding of eIF4E to the MNSV 3’ CITE (an I-shaped structure) [91]. In the case of MCMV, in addition to traditional screening of maize genotypes for

MCMV resistance (which has achieved limited success), directed studies could identify natural eIF4E alleles, or guide construction of engineered eIF4E mutants, that bind poorly to the MTE, while still binding capped host mRNAs with high affinity, in order to achieve durable, recessive resistance to MCMV. 90

Materials and methods

Plasmids

The full-length infectious clone of the MCMV genome (pMCM41) was described

previously [29]. For psgRNA1 construction, template DNA pMCM41 was used to amplify

MCMV nt 2972-4437 using Vent DNA polymerase and oligonucleotides 3'SmaI (5'-

agcaagcttcccGGGCCGGAAGAG [29] and sgRNA1 (5'GGTATTTTGGCAGAAATTCC) that were phosphorylated with T4 polynucleotide kinase. The vector pT7E19 [92] was digested with

Sac I followed by mung bean nuclease digestion. Vector and insert were digested with HindIII, phenol/chloroform extracted and precipitated prior to ligation with T4 DNA ligase. DNA was added to competent E. coli DH5α, selected on ampicillin/XGal plates, and screened by restriction digests and sequencing. Transcripts made from psgRNA1 linearized with SmaI contain MCMV nt 2971-4437, the complete sgRNA1. All enzymes were from New England Biolabs. MlucM was constructed using Gibson Assembly kit (New England Biolabs) such that a firefly luciferase

(luc2, Promega) reporter gene was flanked by the 5’ and 3’ UTRs of MCMV (Figure 2.1). A Q5

Site-directed mutagenesis kit (New England Biolabs) with custom forward and reverse primers

(Table 2-2-S1) was used to generate the deletions (Figure 2.2) and mutants (Figure 2.5, Figure

2.8, Figure 2.9) on the UTRs of the luciferase constructs. Resulting luciferase plasmid constructs were verified by sequencing at the Iowa State University DNA Sequencing Facility.

In vitro Transcription

At Iowa State University (all experiments except Figure 2.1), plasmid DNA templates

were linearized by restriction digestion or amplified by PCR to ensure correct template length.

The RNAs were synthesized by in vitro transcription with T7 polymerase using MEGAscript (for uncapped RNAs) or mMESSAGE mMACHINE (for capped RNAs) kits (Ambion). RNAs used as probes for uncapped EMSA assays were generated using MEGAshortscript kit (Ambion). 91

RNA transcripts were purified according to the manufacturer’s instructions, RNA clean and

concentrator kit (ZymoResearch) was used for non-radioactive RNA preparation. RNAse-free

Bio-Spin columns P-30 (Bio-Rad) were used for radiolabeled RNA. The RNA integrity was verified by 0.8% agarose gel electrophoresis. The RNA concentration was determined by spectrophotometry for non-radiolabeled RNA. The radiolabeled RNA concentration was calculated by measuring the amount of incorporated radioisotope using a scintillation counter.

At Oklahoma State University (Figure 2.1) full length (nt 1-4437) or 3'-truncated (nt 1-4195) genomic RNA was synthesized with T7 RNA polymerase (New England Biolab) and either pMCM41 or pMCM721 linearized with Sma I or SpeI following company protocols for uncapped or capped RNA synthesis. Unincorporated NTPs were removed by three rounds of ammonium acetate/ethanol precipitation and resuspension in nuclease-free water. pMCM721 contains a G residue between the T7 promoter and MCMV sequence, allowing synthesis of capped or uncapped RNAs 1 nt longer than WT [29]. The 3K MCMV RNA transcript templates were made by PCR using primers p9KO2 (CAGAAATTCCCGAgTGTC, nt 2982-2999) and

M13 forward (-20) with both plasmid templates. These oligonucleotides were synthesized by the

Oklahoma State University Core Facility.

In vitro Translation

In vitro translation reactions were performed in WGE (Promega) as described [93].

Nonsaturating amounts of RNAs (30 fmol) were translated in WGE in a total volume of 12.5 µl

35 with amino acids mixture or [ S]-methionine amino acid mixture (Ct= 3.06 μCi/0.14 MBq, 5.8

Ci/mmol, Perkin Elmer), 93 mM potassium acetate, and 2.1 mM MgCl2 based on manufacturer’s

instructions. Translation reactions were incubated at room temperature (25°C) for 30 minutes.

Translation products were separated by electrophoresis on a NuPAGE® 4-12% Bis-Tris gel

(Invitrogen), detected with Pharox FXTM plus Molecular Imager and quantified by Quantity One 92

one-dimensional analysis software (Bio-Rad). For luciferase reactions, 10 µl of the translation reaction product was mixed into 50 µl of Luciferase assay reagent (Promega) and measured immediately on a GloMaxTM20/20 Luminometer (Promega). Statistical and data analysis were performed using GraphPad Prism software.

Translation in Protoplasts

Uncapped MlucM (10 pmol) or its derived mutants were co-electroporated into ~2x106 oat (Avena sativa cv. Stout) protoplasts along with capped mRNA encoding Renilla luciferase (1 pmol) [94] as an internal control. Protoplasts were prepared and assays performed as described previously [95]. After 4 h of incubation at room temperature, protoplasts were harvested and lysed in Passive Lysis Buffer (Promega). Luciferase activities were measured using Dual

Luciferase Reporter Assay System (Promega) in a GloMaxTM 20/20 luminometer (Promega).

To minimized variation among electroporation replicates, firefly luciferase activities were normalized to the Renilla luciferase. Background, firefly relative light units (RLUs), measured in the absence of added luciferase mRNA, were subtracted from the values obtained with MlucM and its deletions/mutant derivatives. Statistical and data analysis was performed using GraphPad

Prism software.

RNA Structure Probing

Selective 2’-Hydroxyl Acylation analyzed by Primer Extension (SHAPE) was applied to selected UTR fragments of MCMV following the procedure previously described [96] except that SHAPE was conducted in the context of the complete genome instead of using the SHAPE cassette. In brief, 500 ng of RNA was denatured by heating to 95°C and renatured in SHAPE buffer (50mM HEPES-KOH, pH 7.2, 100 mM KCl, and ± 8 mM MgCl2) for 30 minutes at 30°C.

RNA was modified by mixing 1/10 (v/v) ratio of renatured RNA with 60 mM benzoyl cyanide 93

(Sigma) in anhydrous dimethyl sulfoxide (DMSO) (Sigma). After 2 min at room temperature, the

RNA was mixed with four-fold excess tRNA and precipitated in 3 volumes of 99% ethanol and

1/10 volume 3 M sodium acetate. Control RNA was treated with the same amount of DMSO

without benzoyl cyanide. The primer (3’UTR: 5’-TACTCCGTTGAGTTCAGAAACC-3’, or

5’UTR: 5’-CCATAAGTGCAGGGAGAGGG-3’) was 5’-end-labeled with γ-[32P] ATP and used

for primer extension. Gel electrophoresis conditions and phosphorimager visualizations were

done as described previously [57, 96].

Expression and purification of wheat eIF4E

Wheat eIF4E pET22b plasmid clone [97] was obtained from Dr. Karen Browning.

Plasmid was introduced in E. coli (BL21) cells and eIF4E expression was induced at O.D.600nm=

0.7, with 0.5mM IPTG. From here on, the purification procedure of the protein was similar to

previous published work [97]. In short, eIF4E expression in E. coli cells was induced by

incubating in IPTG for 2 hours at 37°C with shaking (160 rpm). Cells were harvested by

centrifugation at 6,000 x g for 15 minutes at 4°C, and cell pellets were quick frozen before

purification. E. coli cells were disrupted by sonication in buffer B-50 (20 mM HEPES-KOH

pH7.6, 0.1 mM EDTA, 1 mM DTT, 10% glycerol, 50 mM KCl) containing complete protease

inhibitor cocktail tablets (Roche). Cell debris was separated from supernatant through 3-5 rounds

of centrifugation (16, 000 x g for 15 minutes); for each centrifugation supernatant was transfer to

a clean round-bottom centrifugation tube. Wheat eIF4E protein was purified using m7GTP

agarose beads as described previously [97]. Protein was eluted using buffer B-100 (20 mM

HEPES-KOH pH7.6, 0.1 mM EDTA, 1 mM DTT, 10% glycerol, 100 mM KCl) plus 20 mM of

GTP. Protein purity was evaluated in a 4-12% NuPage Bis-Tris gel (Invitrogen). Protein

concentration was determined by spectrophotometry and using Bio-Rad protein assay kit with a

BSA protein standard curve. 94

Electrophoretic mobility shift assay (EMSA)

Calculated specific activities of probes were used to determine the molarity of RNA in

each purified stock. RNA probes were subjected to 6M urea-TBE gel electrophoresis to verify

quality of RNA. RNA was diluted to 10 fmol/10μL for EMSA assays. As previously described

[56],32P RNA labeled probes were incubated at indicated wheat eIF4E protein concentrations in

EMSA binding buffer. Protein and RNA probes were incubated in a total volume of 10 μL of 1X

EMSA binding buffer (10 mM HEPES pH 7.5, 20 mM KCl, 1 mM dithiothreitol, 3 mM MgCl2),

0.1 μg/μl yeast tRNA, 1 μg/μl bovine serum albumen, 1 unit/μl RNaseOUT ™ Recombinant

Ribonuclease Inhibitor (Invitrogen), and 20mM Tris-HCl/10% glycerol for 25 minutes in ice.

Then, 3 μL of 50% glycerol was added to each reaction. Immediately after, 7μL of RNA-protein mixture were loaded into a 5% polyacrylamide (acrylamide:bis-acrylamide 19:1), Tris- borate/EDTA (TBE) gel, which was run at ~4°C at 110V for 45 minutes in 0.5X cold TBE buffer. Gels were dried on Whatman 3MM paper and exposed to a phosphorimager screen overnight. Phosphor screens were scanned in a Bio-Rad PhosphorImager, and radioactivity counts were analyzed using Quantity One software (Bio-Rad). Statistical analysis was performed using GraphPad Prism software.

Inoculation of Black Mexican Sweet (BMS) protoplasts

Protoplasts were isolated from BMS suspension cultures as described previously [28, 29].

For each experiment, aliquots of isolated protoplasts (2.0 X 106) were transfected with 5 pmol of

pMCM41 and pMCM41 mutant transcript RNA using PEG-1540 (40% PEG, 6 mM CaCl2, 5

mM MES, pH 5.6). Protoplasts were diluted slowly in 5 mL of solution M (8.5% mannitol, 5

mM MES, pH5.6, 6 mM CaCl2), followed by incubation at 4°C for 20 minutes. Protoplasts then

were centrifuged at 100 x g for 5 min, followed by two washes with solution M. PGM buffer

(6% mannitol, 3% sucrose, M5524 MS salts (Sigma), M7150 vitamins (Sigma), 0.005% 95 phytagel) was then added to each protoplast sample, followed by incubation in the dark at room temperature for 24 hours. RNA was isolated using Plant RNAeasy kit (Qiagen) following manufacturer’s instructions. Purified RNA was analyzed on a 0.8% native agarose gel and quantified using a Nanodrop Spectrophotometer.

Plant Inoculations

Maize (B73 and sweet corn varieties) was grown in growth chambers on a 16:8 photoperiod with a temperature setting of 25°C/22°C (day/night). Maize seedlings were inoculated at the three-leaf stage. Plants were dusted with carborundum and inoculated on the third leaf with 10 μg of purified MCMV transcripts in 10mM sodium acetate, 5mM calcium chloride, and 0.5% bentonite by stroking three times with a freshly-gloved finger. Inoculated leaves were rinsed with water 10 minutes post inoculation; rinsed water was collected and autoclaved prior to disposal. Total plant RNA was extracted using Trizol (Invitrogen) from 100 mg of the newest systemic leaves for northern blot hybridization (14 dpi) and cDNA synthesis.

Quality of RNA was evaluated in 0.8% native agarose gel, and RNA was quantified using a

Nanodrop Spectrophotometer. RNA extracted from infected plants was subjected to cDNA synthesis and RT-PCR and sent to the Iowa State University DNA Facility for sequencing to evaluate the presence of introduced mutations.

Dot blot and northern blot hybridizations

RNA isolated from BMS protoplasts had higher ratio of degraded RNA genomic/ sub- genomic RNA in northern blot hybridizations such that lower molecular weight fragments overpowered the genomics and sub-genomics RNAs. However, a trend on the amount of MCMV

RNA detected suggested that replication was still occurring in protoplast; thus, dot blots were used instead. Unfractionated RNA from protoplasts was denatured prior to immobilization on a nylon membrane using a vacuum manifold apparatus as described in Brown et. al. 2004 [98]. 96

RNA from maize plants was not degraded so it was subjected to northern blot hybridization.

Total RNA from 100 mg newest leaf samples (14 dpi) was denatured in a

formaldehyde/formamide buffer solution by heating at 65°C for 15 minutes, follow by separation

in a 0.8% denaturing agarose gel and transferred to a nylon membrane. Both dot blot and

northern blot nylon membranes were UV cross-linked, follow by prehybridization (50% formamide, 5X SSC, 200 μg/mL polyanetholesulfonic acid, 0.1% SDS, 20mM sodium phosphate, pH 6.5) at 55°C for 2 hours. Membranes were hybridized using 32P-labeled probe

complementary to MCMV nts 3811-4356, which had been transcribed using SP6 RNA

polymerase. Washed blots were wrapped in plastic and exposed to phosphor screens. Phosphor

screens were scanned in a Bio-Rad PhosphorImager, radioactivity counts were analyzed using

Quantity One software (Bio-Rad). Statistical analysis was performed using GraphPad Prism

software.

RT-PCR

The presence of MCMV virus RNA was evaluated by RT-PCR. Total RNA isolated from

inoculated plants was extracted from washed maize 3rd leaves using Trizol. The concentration

and purity of extracted RNA were confirmed using a NanoDrop ND-2000 spectrophotometer, and the quality of RNA was evaluated via 0.8% native agarose gel electrophoresis. One microgram of RNA was subjected to cDNA synthesis following the manufacturer’s instructions

for Maxima first-strand cDNA synthesis kit (Thermo Fisher Scientific). The cDNA was

amplified using MCMV-CP primers (R: 5’- TGTGCTCAATGATTTGCCAGCCC, F: 5’-

ATGGCGGCAAGTAGCCGGTCT) for 25 cycles, and the products were separated on 1% agarose gels, visualized by SYBR safe DNA stain (Thermo Fisher Scientific) and photographed.

Similarly to MCMV-CP, maize ubiquitin cDNA expression (F: 5’-

TAAGCTGCCGATGTGCCTGCGTCG and R: 5’- 97

CTGAAAGACAGAACATAATGAGCACAG) was analyzed in the same sample set to serve as

an endogenous positive control. Overall results from RT-PCR screenings were reported on Table

2-1. Examples of RT-PCR agarose gel observations can be observed on Figure 2.15-S5.

Acknowledgments

The authors thank Dr. Karen Browning for advice on the purification of eIF4E. Dr. Song

Ki Cho and Dr. Jelena Kraft for advice on experimental protocols.

References

1. Niblett CL, Claflin LE. Corn lethal necrosis - a new virus disease of corn in Kansas. Plant Disease Reporter. 1978;62(1):15-9.

2. FAO. Maize Lethal Necrosis Disease (MLND) - A snapshot La FAO en situaciones de emergencias2013 [March]. Available from: http://www.fao.org/fileadmin/user_upload/emergencies/docs/MLND%20Snapshot_FINAL.pdf.

3. FAO. Status of Maize lethal necrosis disease (MLND)in kenya Food and Agriculture Organization of the United Nations2017. Available from: https://www.ippc.int/en/countries/kenya/pestreports/2017/11/status-of-maize-lethal-necrosis- disease-mlndin-kenya-1/.

4. Wangai AW, Redinbaugh MG, Kinyua ZM, Miano DW, Leley PK, Kasina M, et al. First Report of Maize chlorotic mottle virus and Maize Lethal Necrosis in Kenya. Plant Dis. 2012;96(10):1582. Epub 2012/10/01. doi: 10.1094/PDIS-06-12-0576-PDN. PubMed PMID: 30727337.

5. Mahuku G, Lockhart BE, Wanjala B, Jones MW, Kimunye JN, Stewart LR, et al. Maize Lethal Necrosis (MLN), an Emerging Threat to Maize-Based Food Security in Sub-Saharan Africa. Phytopathology. 2015;105(7):956-65. Epub 2015/03/31. doi: 10.1094/PHYTO-12-14- 0367-FI. PubMed PMID: 25822185.

6. Redinbaugh MG, Stewart LR. Maize Lethal Necrosis: An Emerging, Synergistic Viral Disease. Annu Rev Virol. 2018;5(1):301-22. Epub 2018/07/31. doi: 10.1146/annurev-virology- 092917-043413. PubMed PMID: 30059641.

7. Gitonga K. Maize Lethal Necrosis - The growing challenge in Eastern Africa. In: Service UFA, editor. https://gain.fas.usda.gov: Global Agricultural Informaiton Network; 2014.

8. Kusia ES, Subramanian S, Nyasani JO, Khamis F, Villinger J, Ateka EM, et al. First Report of Lethal Necrosis Disease Associated With Co-Infection of Finger Millet With Maize chlorotic mottle virus and Sugarcane mosaic virus in Kenya. Plant Disease. 2015;99(6):899-900. doi: 10.1094/Pdis-10-14-1048-Pdn. PubMed PMID: WOS:000360866000056. 98

9. Wamaitha MJ, Nigam D, Maina S, Stomeo F, Wangai A, Njuguna JN, et al. Metagenomic analysis of viruses associated with maize lethal necrosis in Kenya. Virol J. 2018;15(1):90. Epub 2018/05/25. doi: 10.1186/s12985-018-0999-2. PubMed PMID: 29792207; PubMed Central PMCID: PMCPMC5966901.

10. Xie L, Zhang JZ, Wang QA, Meng CM, Hong JA, Zhou XP. Characterization of Maize Chlorotic Mottle Virus Associated with Maize Lethal Necrosis Disease in China. J Phytopathol. 2011;159(3):191-3. doi: 10.1111/j.1439-0434.2010.01745.x. PubMed PMID: WOS:000286838000011.

11. Deng TC, Chou CM, Chen CT, Tsai CH, Lin FC. First Report of Maize chlorotic mottle virus on Sweet Corn in Taiwan. Plant Dis. 2014;98(12):1748. Epub 2014/12/01. doi: 10.1094/PDIS-06-14-0568-PDN. PubMed PMID: 30703919.

12. Achon MA, Serrano L, Clemente-Orta G, Sossai S. First Report of Maize chlorotic mottle virus on a Perennial Host, Sorghum halepense, and Maize in Spain. Plant Disease. 2017;101(2):393-. doi: 10.1094/Pdis-09-16-1261-Pdn. PubMed PMID: WOS:000392355200054.

13. Quito-Avila DF, Alvarez RA, Mendoza AA. Occurrence of maize lethal necrosis in Ecuador: a disease without boundaries? European Journal of Plant Pathology. 2016;146(3):705- 10. doi: 10.1007/s10658-016-0943-5.

14. Vega H, Beillard MJ. Ecuador declares state of emergency in corn production areas. In: Network GAI, editor. Quite, Ecuador: USDA Foreign Agricultural Service; 2016.

15. Goldberg KB, Brakke MK. Concentration of Maize Chlorotic Mottle Virus Increased in Mixed Infections with Maize-Dwarf Mosaic-Virus, Strain-B. Phytopathology. 1987;77(2):162-7. doi: DOI 10.1094/Phyto-77-162. PubMed PMID: WOS:A1987G522800008.

16. Scheets K. Maize chlorotic mottle machlomovirus and wheat streak mosaic rymovirus concentrations increase in the synergistic disease corn lethal necrosis. Virology. 1998;242(1):28- 38. Epub 1998/03/17. doi: 10.1006/viro.1997.8989. PubMed PMID: 9501040.

17. Wang Q, Zhang C, Wang C, Qian Y, Li Z, Hong J, et al. Further characterization of Maize chlorotic mottle virus and its synergistic interaction with Sugarcane mosaic virus in maize. Sci Rep. 2017;7:39960. Epub 2017/01/07. doi: 10.1038/srep39960. PubMed PMID: 28059116; PubMed Central PMCID: PMCPMC5216416.

18. Xia Z, Zhao Z, Chen L, Li M, Zhou T, Deng C, et al. Synergistic infection of two viruses MCMV and SCMV increases the accumulations of both MCMV and MCMV-derived siRNAs in maize. Sci Rep. 2016;6:20520. Epub 2016/02/13. doi: 10.1038/srep20520. PubMed PMID: 26864602; PubMed Central PMCID: PMCPMC4808907.

19. Zambrano JL, Jones MW, Brenner E, Francis DM, Tomas A, Redinbaugh MG. Genetic analysis of resistance to six virus diseases in a multiple virus-resistant maize inbred line. Theor Appl Genet. 2014;127(4):867-80. Epub 2014/02/07. doi: 10.1007/s00122-014-2263-5. PubMed PMID: 24500307. 99

20. Gowda M, Beyene Y, Makumbi D, Semagn K, Olsen MS, Bright JM, et al. Discovery and validation of genomic regions associated with resistance to maize lethal necrosis in four biparental populations. Mol Breed. 2018;38(5):66. Epub 2018/05/19. doi: 10.1007/s11032-018- 0829-7. PubMed PMID: 29773962; PubMed Central PMCID: PMCPMC5945787.

21. Redinbaugh MG, Zambrano JL. Chapter Eight - Control of Virus Diseases in Maize. In: Loebenstein G, Katis N, editors. Advances in Virus Research. 90: Academic Press; 2014. p. 391- 429.

22. Xing Y, Ingvardsen C, Salomon R, Lubberstedt T. Analysis of sugarcane mosaic virus resistance in maize in an isogenic dihybrid crossing scheme and implications for breeding potyvirus-resistant maize hybrids. Genome. 2006;49(10):1274-82. Epub 2007/01/11. doi: 10.1139/g06-070. PubMed PMID: 17213909.

23. Jones MW, Penning BW, Jamann TM, Glaubitz JC, Romay C, Buckler ES, et al. Diverse Chromosomal Locations of Quantitative Trait Loci for Tolerance to Maize chlorotic mottle virus in Five Maize Populations. Phytopathology. 2018;108(6):748-58. Epub 2017/12/30. doi: 10.1094/PHYTO-09-17-0321-R. PubMed PMID: 29287150.

24. FAO. Central African Republic - Situation report March 2019 Food and Agriculture Organization of the United Nations2019 [cited 2019 March]. Available from: http://www.fao.org/fileadmin/user_upload/emergencies/docs/FAOCARsitrep_March2019.pdf.

25. FAO. Democratic Republic of the Congo - Situation report Februrary 2019 [Online report]. Food and Agriculture Organization of the United Nations2019 [cited 2019 March]. Available from: http://www.fao.org/fileadmin/user_upload/emergencies/docs/FAO%20DRC%20sit%20rep_Febr uary%202019.pdf.

26. CIMMYT. Maize Lethal Necrosis: Building a comprehensive response. Report. Mexico: 2014.

27. King AM, Adams MJ, Carstens EB, Lefkowitz EJ. Machlomovirus. Virus Taxonomy. Report of the International Committee on Taxonomy of Viruses. 9th. United States of America: ELSEVIER Academic Press; 2012. p. 1134-8.

28. Scheets K. Maize chlorotic mottle machlomovirus expresses its coat protein from a 1.47- kb subgenomic RNA and makes a 0.34-kb subgenomic RNA. Virology. 2000;267(1):90-101. Epub 2000/01/29. doi: 10.1006/viro.1999.0107. PubMed PMID: 10648186.

29. Scheets K, Khosravi-Far R, Nutter RC. Transcripts of a maize chlorotic mottle virus cDNA clone replicate in maize protoplasts and infect maize plants. Virology. 1993;193(2):1006- 9. Epub 1993/04/01. doi: 10.1006/viro.1993.1216. PubMed PMID: 8460472.

30. Nutter RC, Scheets K, Panganiban LC, Lommel SA. The complete nucleotide sequence of the maize chlorotic mottle virus genome. Nucleic Acids Res. 1989;17(8):3163-77. Epub 1989/04/25. doi: 10.1093/nar/17.8.3163. PubMed PMID: 2726455; PubMed Central PMCID: PMCPMC317721. 100

31. Scheets K. Analysis of gene functions in Maize chlorotic mottle virus. Virus Res. 2016;222:71-9. Epub 2016/06/01. doi: 10.1016/j.virusres.2016.04.024. PubMed PMID: 27242072.

32. Browning KS, Bailey-Serres J. Mechanism of cytoplasmic mRNA translation. Arabidopsis Book. 2015;13(13):e0176. Epub 2015/05/29. doi: 10.1199/tab.0176. PubMed PMID: 26019692; PubMed Central PMCID: PMCPMC4441251.

33. Browning KS. The plant translational apparatus. Plant Mol Biol. 1996;32(1-2):107-44. Epub 1996/10/01. doi: 10.1007/bf00039380. PubMed PMID: 8980477.

34. Walsh D, Mathews MB, Mohr I. Tinkering with translation: protein synthesis in virus- infected cells. Cold Spring Harb Perspect Biol. 2013;5(1):a012351. Epub 2012/12/05. doi: 10.1101/cshperspect.a012351. PubMed PMID: 23209131; PubMed Central PMCID: PMCPMC3579402.

35. Simon AE, Miller WA. 3' cap-independent translation enhancers of plant viruses. Annu Rev Microbiol. 2013;67:21-42. Epub 2013/05/21. doi: 10.1146/annurev-micro-092412-155609. PubMed PMID: 23682606; PubMed Central PMCID: PMCPMC4034384.

36. Dreher TW, Miller WA. Translational control in positive strand RNA plant viruses. Virology. 2006;344(1):185-97. Epub 2005/12/21. doi: 10.1016/j.virol.2005.09.031. PubMed PMID: 16364749; PubMed Central PMCID: PMCPMC1847782.

37. Guo L, Allen E, Miller WA. Structure and function of a cap-independent translation element that functions in either the 3' or the 5' untranslated region. Rna. 2000;6(12):1808-20. Epub 2001/01/06. doi: 10.1017/s1355838200001539. PubMed PMID: 11142380; PubMed Central PMCID: PMCPMC1370050.

38. Miller WA, Wang Z, Treder K. The amazing diversity of cap-independent translation elements in the 3'-untranslated regions of plant viral RNAs. Biochem Soc Trans. 2007;35(Pt 6):1629-33. Epub 2007/11/23. doi: 10.1042/BST0351629. PubMed PMID: 18031280; PubMed Central PMCID: PMCPMC3081161.

39. Nicholson BL, White KA. 3' Cap-independent translation enhancers of positive-strand RNA plant viruses. Curr Opin Virol. 2011;1(5):373-80. Epub 2012/03/24. doi: 10.1016/j.coviro.2011.10.002. PubMed PMID: 22440838.

40. Nicholson BL, Wu B, Chevtchenko I, White KA. Tombusvirus recruitment of host translational machinery via the 3' UTR. Rna. 2010;16(7):1402-19. Epub 2010/05/29. doi: 10.1261/rna.2135210. PubMed PMID: 20507975; PubMed Central PMCID: PMCPMC2885689.

41. Gazo BM, Murphy P, Gatchel JR, Browning KS. A novel interaction of Cap-binding protein complexes eukaryotic initiation factor (eIF) 4F and eIF(iso)4F with a region in the 3'- untranslated region of satellite tobacco necrosis virus. J Biol Chem. 2004;279(14):13584-92. Epub 2004/01/20. doi: 10.1074/jbc.M311361200. PubMed PMID: 14729906. 101

42. Treder K, Pettit Kneller EL, Allen EM, Wang Z, Browning KS, Miller WA. The 3′ cap- independent translation element of Barley yellow dwarf virus binds eIF4F via the eIF4G subunit to initiate translation. Rna. 2008;14(1):134-47. doi: 10.1261/rna.777308.

43. Miras M, Truniger V, Querol-Audi J, Aranda MA. Analysis of the interacting partners eIF4F and 3'-CITE required for Melon necrotic spot virus cap-independent translation. Mol Plant Pathol. 2017;18(5):635-48. Epub 2016/05/05. doi: 10.1111/mpp.12422. PubMed PMID: 27145354; PubMed Central PMCID: PMCPMC6638222.

44. Huang M, Koh DC, Weng LJ, Chang ML, Yap YK, Zhang L, et al. Complete nucleotide sequence and genome organization of hibiscus chlorotic ringspot virus, a new member of the genus Carmovirus: evidence for the presence and expression of two novel open reading frames. J Virol. 2000;74(7):3149-55. Epub 2000/03/09. doi: 10.1128/jvi.74.7.3149-3155.2000. PubMed PMID: 10708431; PubMed Central PMCID: PMCPMC111815.

45. Altenbach SB, Howell SH. In vitro translation products of turnip crinkle virus RNA. Virology. 1982;118(1):128-35. Epub 1982/04/15. doi: 10.1016/0042-6822(82)90326-9. PubMed PMID: 18635130.

46. Johnston JC, Rochon DM. Translation of cucumber necrosis virus RNA in vitro. J Gen Virol. 1990;71 ( Pt 10)(10):2233-41. Epub 1990/10/01. doi: 10.1099/0022-1317-71-10-2233. PubMed PMID: 2230730.

47. Na H, Fabian MR, White KA. Conformational organization of the 3' untranslated region in the tomato bushy stunt virus genome. Rna. 2006;12(12):2199-210. Epub 2006/11/02. doi: 10.1261/rna.238606. PubMed PMID: 17077273; PubMed Central PMCID: PMCPMC1664717.

48. Koev G, Liu S, Beckett R, Miller WA. The 3prime prime or minute-terminal structure required for replication of Barley yellow dwarf virus RNA contains an embedded 3prime prime or minute end. Virology. 2002;292(1):114-26. Epub 2002/03/07. doi: 10.1006/viro.2001.1268. PubMed PMID: 11878914.

49. Zhang J, Simon AE. Importance of sequence and structural elements within a viral replication repressor. Virology. 2005;333(2):301-15. Epub 2005/02/22. doi: 10.1016/j.virol.2004.12.015. PubMed PMID: 15721364.

50. Zuker M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 2003;31(13):3406-15. Epub 2003/06/26. doi: 10.1093/nar/gkg595. PubMed PMID: 12824337; PubMed Central PMCID: PMCPMC169194.

51. Lorenz R, Bernhart SH, Honer Zu Siederdissen C, Tafer H, Flamm C, Stadler PF, et al. ViennaRNA Package 2.0. Algorithms Mol Biol. 2011;6(1):26. Epub 2011/11/26. doi: 10.1186/1748-7188-6-26. PubMed PMID: 22115189; PubMed Central PMCID: PMCPMC3319429. 102

52. Chattopadhyay M, Kuhlmann MM, Kumar K, Simon AE. Position of the kissing-loop interaction associated with PTE-type 3'CITEs can affect enhancement of cap-independent translation. Virology. 2014;458-459:43-52. Epub 2014/06/15. doi: 10.1016/j.virol.2014.03.027. PubMed PMID: 24928038; PubMed Central PMCID: PMCPMC4101382.

53. Wang Z, Parisien M, Scheets K, Miller WA. The cap-binding translation initiation factor, eIF4E, binds a pseudoknot in a viral cap-independent translation element. Structure. 2011;19(6):868-80. Epub 2011/06/08. doi: 10.1016/j.str.2011.03.013. PubMed PMID: 21645857; PubMed Central PMCID: PMCPMC3113551.

54. Raden M, Ali SM, Alkhnbashi OS, Busch A, Costa F, Davis JA, et al. Freiburg RNA tools: a central online resource for RNA-focused research and teaching. Nucleic Acids Res. 2018;46(W1):W25-W9. Epub 2018/05/23. doi: 10.1093/nar/gky329. PubMed PMID: 29788132; PubMed Central PMCID: PMCPMC6030932.

55. Will S, Joshi T, Hofacker IL, Stadler PF, Backofen R. LocARNA-P: accurate boundary prediction and improved detection of structural RNAs. Rna. 2012;18(5):900-14. Epub 2012/03/28. doi: 10.1261/rna.029041.111. PubMed PMID: 22450757; PubMed Central PMCID: PMCPMC3334699.

56. Kraft JJ, Peterson MS, Cho SK, Wang Z, Hui A, Rakotondrafara AM, et al. The 3' Untranslated Region of a Plant Viral RNA Directs Efficient Cap-Independent Translation in Plant and Mammalian Systems. Pathogens. 2019;8(1):28. Epub 2019/03/03. doi: 10.3390/pathogens8010028. PubMed PMID: 30823456; PubMed Central PMCID: PMCPMC6471432.

57. Wang Z, Treder K, Miller WA. Structure of a viral cap-independent translation element that functions via high affinity binding to the eIF4E subunit of eIF4F. J Biol Chem. 2009;284(21):14189-202. Epub 2009/03/12. doi: 10.1074/jbc.M808841200. PubMed PMID: 19276085; PubMed Central PMCID: PMCPMC2682867.

58. Magee J, Warwicker J. Simulation of non-specific protein-mRNA interactions. Nucleic Acids Res. 2005;33(21):6694-9. Epub 2005/11/30. doi: 10.1093/nar/gki981. PubMed PMID: 16314302; PubMed Central PMCID: PMCPMC1297708.

59. Carberry SE, Friedland DE, Rhoads RE, Goss DJ. Binding of protein synthesis initiation factor 4E to oligoribonucleotides: effects of cap accessibility and secondary structure. Biochemistry. 1992;31(5):1427-32. Epub 1992/02/11. doi: 10.1021/bi00120a020. PubMed PMID: 1737000.

60. Fabian MR, White KA. 5'-3' RNA-RNA interaction facilitates cap- and poly(A) tail- independent translation of tomato bushy stunt virus mrna: a potential common mechanism for tombusviridae. J Biol Chem. 2004;279(28):28862-72. Epub 2004/05/05. doi: 10.1074/jbc.M401272200. PubMed PMID: 15123633. 103

61. Chattopadhyay M, Shi K, Yuan X, Simon AE. Long-distance kissing loop interactions between a 3' proximal Y-shaped structure and apical loops of 5' hairpins enhance translation of Saguaro cactus virus. Virology. 2011;417(1):113-25. Epub 2011/06/15. doi: 10.1016/j.virol.2011.05.007. PubMed PMID: 21664637; PubMed Central PMCID: PMCPMC3152624.

62. Miller WA, White KA. Long-distance RNA-RNA interactions in plant virus gene expression and replication. Annu Rev Phytopathol. 2006;44:447-67. Epub 2006/05/18. doi: 10.1146/annurev.phyto.44.070505.143353. PubMed PMID: 16704356; PubMed Central PMCID: PMCPMC1894749.

63. Braidwood L, Quito-Avila DF, Cabanas D, Bressan A, Wangai A, Baulcombe DC. Maize chlorotic mottle virus exhibits low divergence between differentiated regional sub-populations. Sci Rep. 2018;8(1):1173. Epub 2018/01/21. doi: 10.1038/s41598-018-19607-4. PubMed PMID: 29352173; PubMed Central PMCID: PMCPMC5775324.

64. Fatma HK, Tileye F, Patrick AN. Insights of maize lethal necrotic disease: A major constraint to maize production in East Africa. African Journal of Microbiology Research. 2016;10(9):271-9. doi: 10.5897/ajmr2015.7534.

65. Huang J, Wen GS, Li MJ, Sun CC, Sun Y, Zhao MF, et al. First Report of Maize chlorotic mottle virus Naturally Infecting Sorghum and Coix Seed in China. Plant Disease. 2016;100(9):1955-. doi: 10.1094/Pdis-02-16-0251-Pdn. PubMed PMID: WOS:000381658700043.

66. Wang Q, Zhou XP, Wu JX. First Report of Maize chlorotic mottle virus Infecting Sugarcane (Saccharum officinarum). Plant Dis. 2014;98(4):572. Epub 2014/04/01. doi: 10.1094/PDIS-07-13-0727-PDN. PubMed PMID: 30708716.

67. Uyemoto JK, Bockelman DL, Claflin LE. Severe Outbreak of Corn Lethal Necrosis Disease in Kansas. Plant Disease. 1980;64(1):99-100. doi: Doi 10.1094/Pd-64-99. PubMed PMID: WOS:A1980JL74200028.

68. Jiang XQ, Meinke LJ, Wright RJ, Wilkinson DR, Campbell JE. Maize Chlorotic Mottle Virus in Hawaiian-Grown Maize - Vector Relations, Host Range and Associated Viruses. Crop Prot. 1992;11(3):248-54. doi: Doi 10.1016/0261-2194(92)90045-7. PubMed PMID: WOS:A1992HU12400009.

69. Gao F, Simon AE. Differential use of 3'CITEs by the subgenomic RNA of Pea enation mosaic virus 2. Virology. 2017;510:194-204. Epub 2017/07/28. doi: 10.1016/j.virol.2017.07.021. PubMed PMID: 28750323; PubMed Central PMCID: PMCPMC5891822.

70. Truniger V, Miras M, Aranda MA. Structural and Functional Diversity of Plant Virus 3'- Cap-Independent Translation Enhancers (3'-CITEs). Frontiers in plant science. 2017;8:2047. Epub 2017/12/15. doi: 10.3389/fpls.2017.02047. PubMed PMID: 29238357; PubMed Central PMCID: PMCPMC5712577. 104

71. Sharma SD, Kraft JJ, Miller WA, Goss DJ. Recruitment of the 40S ribosome subunit to the 3'-untranslated region (UTR) of a viral mRNA, via the eIF4 complex, facilitates cap- independent translation. J Biol Chem. 2015;290(18):11268-81. Epub 2015/03/21. doi: 10.1074/jbc.M115.645002. PubMed PMID: 25792742; PubMed Central PMCID: PMCPMC4416834.

72. Steckelberg AL, Vicens Q, Kieft JS. Exoribonuclease-Resistant RNAs Exist within both Coding and Noncoding Subgenomic RNAs. MBio. 2018;9(6):e02461-18. Epub 2018/12/20. doi: 10.1128/mBio.02461-18. PubMed PMID: 30563900; PubMed Central PMCID: PMCPMC6299227.

73. May J, Johnson P, Saleem H, Simon AE. A Sequence-Independent, Unstructured Internal Ribosome Entry Site Is Responsible for Internal Expression of the Coat Protein of Turnip Crinkle Virus. J Virol. 2017;91(8):e02421-16. Epub 2017/02/10. doi: 10.1128/JVI.02421-16. PubMed PMID: 28179526; PubMed Central PMCID: PMCPMC5375686.

74. Iwakawa HO, Mizumoto H, Nagano H, Imoto Y, Takigawa K, Sarawaneeyaruk S, et al. A viral noncoding RNA generated by cis-element-mediated protection against 5'->3' RNA decay represses both cap-independent and cap-dependent translation. J Virol. 2008;82(20):10162-74. Epub 2008/08/15. doi: 10.1128/JVI.01027-08. PubMed PMID: 18701589; PubMed Central PMCID: PMCPMC2566255.

75. Basnayake VR, Sit TL, Lommel SA. The Red clover necrotic mosaic virus origin of assembly is delimited to the RNA-2 trans-activator. Virology. 2009;384(1):169-78. Epub 2008/12/09. doi: 10.1016/j.virol.2008.11.005. PubMed PMID: 19062064.

76. Offel SK, Coutts RHA. Location of the 5' Termini of Tobacco Necrosis Virus Strain D Subgenomic mRNAs. J Phytopathol. 1996;144(1):13-7. doi: 10.1111/j.1439- 0434.1996.tb01481.x.

77. Clarke BD, Roby JA, Slonchak A, Khromykh AA. Functional non-coding RNAs derived from the flavivirus 3' untranslated region. Virus Res. 2015;206:53-61. Epub 2015/02/11. doi: 10.1016/j.virusres.2015.01.026. PubMed PMID: 25660582.

78. Miller WA, Jackson J, Feng Y. Cis- and trans-regulation of luteovirus gene expression by the 3' end of the viral genome. Virus Res. 2015;206:37-45. Epub 2015/04/11. doi: 10.1016/j.virusres.2015.03.009. PubMed PMID: 25858272; PubMed Central PMCID: PMCPMC4722956.

79. Shen R, Rakotondrafara AM, Miller WA. trans regulation of cap-independent translation by a viral subgenomic RNA. J Virol. 2006;80(20):10045-54. Epub 2006/09/29. doi: 10.1128/JVI.00991-06. PubMed PMID: 17005682; PubMed Central PMCID: PMCPMC1617300.

80. Damas ND, Fossat N, Scheel TKH. Functional Interplay between RNA Viruses and Non- Coding RNA in Mammals. Noncoding RNA. 2019;5(1):7. Epub 2019/01/17. doi: 10.3390/ncrna5010007. PubMed PMID: 30646609; PubMed Central PMCID: PMCPMC6468702. 105

81. Michalski D, Ontiveros JG, Russo J, Charley PA, Anderson JR, Heck AM, et al. Zika virus noncoding sfRNAs sequester multiple host-derived RNA-binding proteins and modulate mRNA decay and splicing during infection. J Biol Chem. 2019;294(44):16282-96. Epub 2019/09/15. doi: 10.1074/jbc.RA119.009129. PubMed PMID: 31519749; PubMed Central PMCID: PMCPMC6827284.

82. Sanford TJ, Mears HV, Fajardo T, Locker N, Sweeney TR. Circularization of flavivirus genomic RNA inhibits de novo translation initiation. Nucleic Acids Res. 2019;47(18):9789-802. Epub 2019/08/09. doi: 10.1093/nar/gkz686. PubMed PMID: 31392996; PubMed Central PMCID: PMCPMC6765113.

83. Holden KL, Harris E. Enhancement of dengue virus translation: role of the 3' untranslated region and the terminal 3' stem-loop domain. Virology. 2004;329(1):119-33. Epub 2004/10/13. doi: 10.1016/j.virol.2004.08.004. PubMed PMID: 15476880.

84. Murray KE, Steil BP, Roberts AW, Barton DJ. Replication of poliovirus RNA with complete internal ribosome entry site deletions. J Virol. 2004;78(3):1393-402. Epub 2004/01/15. doi: 10.1128/jvi.78.3.1393-1402.2004. PubMed PMID: 14722294; PubMed Central PMCID: PMCPMC321374.

85. Herold J, Andino R. Poliovirus RNA replication requires genome circularization through a protein-protein bridge. Mol Cell. 2001;7(3):581-91. doi: Doi 10.1016/S1097-2765(01)00205-2. PubMed PMID: WOS:000167966600013.

86. Geng G, Yu C, Li X, Yuan X. A unique internal ribosome entry site representing a dynamic equilibrium state of RNA tertiary structure in the 5'-UTR of Wheat yellow mosaic virus RNA1. Nucleic Acids Res. 2019. Epub 2019/11/13. doi: 10.1093/nar/gkz1073. PubMed PMID: 31713626.

87. Fan Q, Treder K, Miller WA. Untranslated regions of diverse plant viral RNAs vary greatly in translation enhancement efficiency. BMC Biotechnol. 2012;12(1):22. Epub 2012/05/09. doi: 10.1186/1472-6750-12-22. PubMed PMID: 22559081; PubMed Central PMCID: PMCPMC3416697.

88. Monzingo AF, Dhaliwal S, Dutt-Chaudhuri A, Lyon A, Sadow JH, Hoffman DW, et al. The structure of eukaryotic translation initiation factor-4E from wheat reveals a novel disulfide bond. Plant Physiol. 2007;143(4):1504-18. Epub 2007/02/27. doi: 10.1104/pp.106.093146. PubMed PMID: 17322339; PubMed Central PMCID: PMCPMC1851841.

89. Nieto C, Piron F, Dalmais M, Marco CF, Moriones E, Gomez-Guillamon ML, et al. EcoTILLING for the identification of allelic variants of melon eIF4E, a factor that controls virus susceptibility. BMC Plant Biol. 2007;7(1):34. Epub 2007/06/23. doi: 10.1186/1471-2229-7-34. PubMed PMID: 17584936; PubMed Central PMCID: PMCPMC1914064.

90. Truniger V, Nieto C, Gonzalez-Ibeas D, Aranda M. Mechanism of plant eIF4E-mediated resistance against a Carmovirus (Tombusviridae): cap-independent translation of a viral RNA controlled in cis by an (a)virulence determinant. Plant J. 2008;56(5):716-27. Epub 2008/07/23. doi: 10.1111/j.1365-313X.2008.03630.x. PubMed PMID: 18643998. 106

91. Nieto C, Morales M, Orjeda G, Clepet C, Monfort A, Sturbois B, et al. An eIF4E allele confers resistance to an uncapped and non-polyadenylated RNA virus in melon. Plant J. 2006;48(3):452-62. Epub 2006/10/10. doi: 10.1111/j.1365-313X.2006.02885.x. PubMed PMID: 17026540.

92. Petty IT. A plasmid vector for cloning directly at the transcription initiation site of a bacteriophage T7 promoter. Nucleic Acids Res. 1988;16(17):8738. Epub 1988/09/12. doi: 10.1093/nar/16.17.8738. PubMed PMID: 3047690; PubMed Central PMCID: PMCPMC338616.

93. Wang Z, Kraft JJ, Hui AY, Miller WA. Structural plasticity of Barley yellow dwarf virus- like cap-independent translation elements in four genera of plant viral RNAs. Virology. 2010;402(1):177-86. Epub 2010/04/16. doi: 10.1016/j.virol.2010.03.025. PubMed PMID: 20392470; PubMed Central PMCID: PMCPMC2905845.

94. Shen R, Miller WA. The 3' untranslated region of tobacco necrosis virus RNA contains a barley yellow dwarf virus-like cap-independent translation element. J Virol. 2004;78(9):4655-64. Epub 2004/04/14. doi: 10.1128/jvi.78.9.4655-4664.2004. PubMed PMID: 15078948; PubMed Central PMCID: PMCPMC387721.

95. Rakotondrafara AM, Jackson JR, Kneller EP, Miller WA. Preparation and Electroporation of Oat Protoplasts from Cell Suspension Culture. Current Protocols in Microbiology: John Wiley & Sons, Inc.; 2005.

96. Miras M, Sempere R, Kraft J, Miller W, Aranda M, Truniger V. Determination of the Secondary Structure of an RNA fragment in Solution: Selective 2`-Hydroxyl Acylation Analyzed by Primer Extension Assay (SHAPE). Bio-Protocol. 2015;5(2):e1386. doi: 10.21769/BioProtoc.1386.

97. Mayberry LK, Dennis MD, Leah Allen M, Ruud Nitka K, Murphy PA, Campbell L, et al. Expression and purification of recombinant wheat translation initiation factors eIF1, eIF1A, eIF4A, eIF4B, eIF4F, eIF(iso)4F, and eIF5. Methods Enzymol. 2007;430:397-408. Epub 2007/10/05. doi: 10.1016/S0076-6879(07)30015-3. PubMed PMID: 17913646.

98. Brown T, Mackey K, Du T. Analysis of RNA by northern and slot blot hybridization. Curr Protoc Mol Biol. 2004;Chapter 4(1):Unit 4 9. Epub 2008/02/12. doi: 10.1002/0471142727.mb0409s67. PubMed PMID: 18265351.

107

Figures

Figure 2.1. In vitro translation of MCMV genomic RNA.(A) Genome organization of MCMV. Boxes indicate open reading frames (ORFs) with encoded protein (named by its molecular weight in kDa) indicated. Dashed boundaries indicate leaky stop codons. RdRp: RNA-dependent RNA polymerase domain of P111; CP: coat protein. Positions of key restriction enzymes are indicated (SmaI: 4437, SpeI: 4191). Black oval indicates the location of the 3’ CITE (nts 4164- 4333). (B) Translation of capped (+) and uncapped (-) transcripts and virion RNA. pMCM41 was linearized at the indicated locations in the genome prior to transcription. Capped transcripts were from pMCM721, which is identical to pMCM41 but with one nonviral G at the 5’ end, which was required for capping. sgRNA1 is uncapped transcript from psgRNA1 linearized with Sma I. Uncapped transcript from Sma I-linearized pMCM41 is infectious [29]. sgRNA1 is uncapped in vitro transcript from p Gel shows 35S-met-labeled products translated in WGE. The prominent bands identified on the gel correspond to P32 and P50 products (See ORFs in Fig. 1A). 108

A) 1 120 4095 4437 5’ UTR 3’ UTR Fluc

B) 4095 4164 4278 4333 4437 WGE Oat

4095 4333 4095 4278

4095 4191 4164 4437

4192 4437

4164 4333

4192 4333

4164 4278

4192 4278

4095 4200 4300 4437

0 50 100 150 200 Normalized Relative luciferase activity (%)

Figure 2.2. Mapping the sequence in MCMV 3’UTR required for cap-independent translation. (A) Map of MlucM containing a firefly luciferase reporter gene (Fluc) flanked by the complete MCMV UTRs, with base numbering as in the full-length genome. (B) Black bars below left indicate the portions of the 3’ UTR present in each MlucM deletion construct. Sequence covering nt 4164-4333 (grey shading) shows the minimal amount of sequence required for efficient cap-independent translation. Relative luciferase activity obtained from indicated uncapped transcripts in WGE and oat protoplasts is shown as percentages of relative light units where constructs were normalized to MlucM wild type 3’ UTR (top bar). Data are shown as averages, relative to full-length 3’ UTR (±S.D.), from 4 independent experiments (for each construct: WGE: n=10, Oats: n=14). Protoplasts were co-electroporated with the indicated uncapped transcript plus capped Renilla luciferase mRNA as an electroporation control. Fluc/Rluc sample ratios were normalized to MlucMwt. One-way ANOVA was used to analyze the significance of each set of samples. For Dunnett’s multiple comparison test significance, an “uppercase” was assigned to a construct if the difference between MlucMwt and deletion mutant was P<0.001 whereas a “lowercase” was given to P<0.05. “A or a” was assigned to data comparisons against MlucMwt in WGE, while “B or b” was assigned to comparisons in oat protoplasts. 109

A) B) MgCl - + T G A C 0 1 0 1 U4205 A4210 G4215

G4215 bulge Purine G4220 U4225 G4219 G4230 G4231 C4235 MgCl - + I loop Side 4241 4250 A4240 A T G A C 0 1 0 1 U U G C4245 A A A Bridging U4153 Domain C CC GC A G A C U G G G C | | | | | | | | U A4160 U4250 C G4170 G G GC G U C C G A C A C4180 U G U G C4254 G A U U4190 C A 4230 _ U4200 MgCl - + Side loop I G C 4270 Side loop II G _ C T G A C 0 1 0 1 G _ C _ G4220 U A G4170 _ AG C G C A C4180 II loop internal Asymmetric Purine rich G A4240 G U4190 bulge U | C A _ 4219 G C _ U4200 U A 4280 4216 _ U4260 A U _ C G _ C G A4210 4210 A A Mismatch pair II G _ C

bulge Purine _ G4220 U A A4280 A _ A Mismatch pair I G C _ U A _ G _ C 4290 G4230 C G _ C4290 G4237 4200 G U loop I loop Side _ U A U A4240 small bulge I C U _ A4300 G U Bridging Domain A _ U _ U A U4250 _ G C G A A _ 4300 Asymmetric U

C4310 II loop Side A SHAPE Reactivity internal loop II C C 100% U4260 A _ G U G A _ U 40% _ 4190 U G 20% _ G C C4320 _ Chemical modification C _ G U4270 U A in the presence of Mg2+ A C 4310 A in the SHAPE buffer Asymmetric A A internal loop I U U C _ G _ C G _ _ 4164 C A A AGUGU U UG GG A G U A G G A U A C A CUUA AC C_ 4333 C4330 A C U A4280

Figure 2.3. SHAPE analysis of MTE structure from the MCM41 infectious clone. (A) SHAPE (selective 2’-hydroxyl acylation analyzed by primer extension) probing gel. RNA was modified with either 60 mM benzoyl cyanide (1) in dimethyl sulfoxide (DMSO) or DMSO only (0) in the presence (+) or absence (-) of magnesium in SHAPE buffer. (See Methods for details.) Gels show the products of reverse transcriptase extension of 32P-labeled primer on SHAPE-modified (1) and unmodified (0) templates. The sequencing ladders (four lanes at left, TGAC) were generated by dideoxy sequencing of unmodified RNA with the same 5’-labeled primer used in the modification lanes. Mobilities of selected nucleotides, numbered as in the viral genome are shown at left. (B) Superposition of the degree of modification of each nucleotide on the best- fitting predicted secondary structure of MTE. Relative benzoyl cyanide modification is indicated by the color-coding scheme of the SHAPE radioactivity scale. Nucleotides inside boxes are suspected to base pair with the 5’-UTR (Fig 7). Modification levels of nucleotides in the presence of magnesium are indicated by stars following the reactivity color-coding scheme. Potential pseudoknot interaction between the purine-rich bulge and bridging domain are indicated by dashed lines. Note: products of primer extension inhibition obtained from the SHAPE reaction are 1 nt shorter than those of dideoxy sequencing. 110

A) Mg Func. HyperG H1 G-domain H2 SL-1 Bridging-Domain SL-2 H2 H1 CarMV ------AGA CACACACGGCGGG------ACG UGUUGCC--- GGAGCUG------CCACUCC GC--CC AAGAUGGAUCGA------UAUUAACCAUCUU U--- GGCAACA------GCGUGCGUGGCU + + ------... (( (. ((( . (( .[[------[.. ((((((( --- ((((...------...)))) .]--]] (((((((.....------...... ))))))) .--- ))))))) ------))))) . ))) ... HnRSV -----UG UCUCAAGACUAUAGACG------GGA UGAGCUUCU- GGGGGCUG------CCACCCUC GC--CU CAUGCGCAUUGCCUGC------UCUGGGCGCAUGU- GGAAGCUGA------UAUAGUUGAGAGCU ? ? -----.. ((((( .. (((((( ...[------[[. ( . ((((((( - (((((...------...))))) .]--]] (((((((....((...------...))))))))) .- ))))))) . ) ------))))))))))) ... PFBV ------UA GGCUAUGAGAGG------GCA GUCACCCUA- CCUUCUG------CCAGAGG UU--CC CGGAGGAGUA------GUGAUCCUUUG --- GGGGUGAU------UCAUAGCUAG + + ------.. (((((((( ..[[------[.. (((((((( .- ((((...------...)))) .]--]] ((((((...------....))))))) --- )))))))) ------)))))))) .. SCV ------GUUGCUCGACUGUGAGG------GGA CCUACCCA-- CUGUGCUG------CCACACAG GAACUUCCAACCUU------CGGGUGG CGA-- GGUAGG------GCAGAAGAGUGAC + + ------(( .. ((( .. (((( ..[[------[.. (((((( ..-- (((((...------...))))) ...]]] (((.((..------..))))) ...-- )))))) ------)))) .. ))) .. )) PoLV ------CCAUAGUCGGAGG------GGUA AUCUAUCU-- GGGGAAGUUGGUGGUUCUUCUUCCUCCC---U GCCCUCAGGUGC------ACAAACUGGGGAGUG--- GAUAGAUAA----- UGAUUAUGG ? ? ------((((((((( ..[[------[... ((((((( .-- (((((((..((....))..))))))) ]]---] ((((((((....------.....)))))).)) .--- ))))))) ..----- ))))))))) HCRSV ------GGGCCUCCGUGCUGUUAGGGGCAGUGGAAACGUCGGUCUA------GCCAGCCG UCCCCUGGGUAGUGUGCUCCGUCUAAGUACACCACUACUCG-- GUUUCCACAAACGAUCGGACUCCC + + ------((( .. (((( ...... [[[[[.. (((((((( .. ((((...------....)))) .]]]]] ((((((((((((...... )))))...))))))) .-- )))))))) ...... )))) .. ))) JINRV ------U GGGUUUGCAGCGA------GAUG CCACGU---- GCCAGAGGAUA----GUACCACUGGCUGACCCUACCAGCGUUU------GCGAGUUGGUA U---- GCGUGGA------UGUAAAUUCCA + + ------. ((((((((( ..[[------.... (((((( ---- (((((.((...----...)).))))) ]].... (((((((....------....))))))) .---- )))))) .------))))))) . )) . GaMV ---CAGAAUGGUCUAUCAGGGAGC------AGGC ACUUUUCC-- GGCUUUG------GCGAGUC G----U CUCCACUUGUAUA------CGAACAGUGAG G—GAAAAGUG UA------UCUGAUAGAAGAA ? ? ---...... ((((((((( ..[[------.... (((((((( -- (((((..------..))))) ]----] (((.(((.((...------...)))))))) .-- )))))))) ..----- ))))))))) .... PSNV ----- CUGGUAGCAAUGGGUGGGG------UCGA UUUCCUACGUCUAGCUUUG------GCCGGCUAG ------CGUCU------CUCG UA- GUGGGGAAAUA---- ACUCAUUGAGACCAG ? ? ----- ((((( .. (((((((( ....------.... (((((((( .. ((((((...------...)))))) ------((...------..)) ..- )))))))) ...---- )))))))) .. ))))) CMMV ----- CCAGAGAACGUGUCAUCAGGG---GAUGACCAUGUC--- CACCUUG------CCGGGUG CC--CU CGGACCU------AUGUCCG U--- GACAUGGCG----- UGACGCUAGUAAUGG + + ----- ((( ...... (((((( ...[[[---[.... ((((((( --- ((((...------...)))) ]]--]] (((((..------..))))) .--- ))))))) ..----- )))))) ...... ))) PMV ACAGACCGUACAGCAGUCACACGG------GACG CCACACC--- ACCUUUG------CAGAGGU GC--CC UUGGGA------AACCAA U--- GGUGUGGG------GUGACACUGAUUAGUC ++ + ... ((( .... ((( .. ((((( ..[[------[... ((((((( --- (((((..------..))))) .]--]] ((((..------..)))) .--- ))))))) .------))))) .))).... ))) TPAV ------GGACAAGACCCCAGGCG------GGAC GCCACACC-- CAUGUUUG------CAGACAUG GC--CC GAGGA------AACUC U- GGUGUGGGC------UGGGGGGAAUCC +++ + ------((( ..... ((((( ...[------[[.. (((((((( -- (((((...------...))))) .]--]] (((..------..))) .- ))))))) . ) ------))))) .... ))) PEMV2 ------UAGAACACGUGGGAUAGGG---GAUGACCUUGUC--- GACCGUUU------GUCGGUC CC--CU GCUCCU------UAGAGC U--- GGCAAGGCG----- CCCAUUGGUUCUA + + ------(((((( .. ((((( ...[[[---[.... ((((((( --- (((((...------..))))) ]]--]] ((((..------..)))) .--- ))))))) ..----- ))))) .. )))))) MCMV ------A UGACCAUGACUG------GAGA GUGGGCG--- GCGGCUG------CCAACCGC AGA-CUGGGCGUAUA------UAGUAAGCCU U--- GACCCACC------CAUGGACAA + + ------. (( . ((((( ....------[[.. (((((( .--- ((((...------....)))) ...-]] ((((.....------...... )))) .--- ) . ))))) .------))))) . )) .

B) C) 10,000 C-domain **** C Side Y B 1,000 Side ** ** *** loop I loop II ns ns ns 100 Helix II

G **** Y W 10 G g ****

g Relative translation activity RR N G-domain 1

Helix I

Figure 2.4. Alignment of known and predicted PTEs.(A) Secondary RNA structure alignment of PTEs. Structural input for alignment was used from previously published data (S2 Fig). LocARNA [55] was used to identify conserved regions of the PTE structure. Then structures were aligned to fit the consensus. Bases are color-coded based on the specific location in the PTE structure (panel B). Purple: conserved helix I region (H1), orange: conserved purine-rich bulge, magenta: conserved helix II region (H2), blue: conserved stem-loop I (SL1), red: bridging domain, green: conserved stem-loop II (SL2). Parentheses or square brackets of the same color in opposite orientation are below complementary bases. Square brackets show potential pseudoknot base pairing. (B) Sketch of conserved consensus shape of PTEs. Color-coding corresponds to alignment results on part (A). Conserved bases are shown using IUPAC nomenclature (Y = C or U, R = A or G, W = A or U, B = all except A). Lower case g indicates G in ≥78 % of PTEs. (C) Relative translation level activity of different PTEs in WGE with wild type MCMV indicated as 100%. Previously constructed luciferase reporter constructs contain a firefly luciferase reporter gene flanked by 5’ and 3’-UTRs of the indicated plant viral genomes containing PTE in the 3’ UTR [56]. PEMV2-m2 contains a CC to AA mutation in the bridging domain. Uncapped RNA transcripts were incubated for 30 min in WGE at room temperature. Data shown are percentage averages (±S.D.) from 3 independent experiments (n=9), with two, three, and four asterisks indicating P<0.01, P<0.001, or P<0.0001, respectively, for the significance of difference from wild type MCMV (MlucM) construct. ns = not significantly different from MCMV. 111

A) B) C) AGACU AGUCU AGUCU

250 250 250

200 200 200 (%) A A (%)

A (%) G G G A A A G 150 G 150 G 150 G G G U UC A CA 100 A 100 CA 100 Normalized Normalized Normalized ****

AverageRLU **** AverageRLU 50 50 AverageRLU 50

0 0 **** 0 **** WGE Oat WGE Oat WGE Oat WT A4248U U4218A A4248U

D) E) F) AGACU ACCCU ACCCU

250 250 250 ** 200 200 A 200

A (%) (%) G (%) A G A G A **** 150 A G G 150 G 150 G G G G U G CA 100 CA 100 CA 100 Normalized Normalized Normalized **** **** AverageRLU

AverageRLU 50 AverageRLU 50 50 ******** 0 0 0 WGE Oat WGE Oat WGE Oat U4218G GA4247-4248CC U4218G GA4247-4248CC

G) H) I) AUUU U AGACU AUUU U

250 250 250

200 200 200 A (%) A A G G (%) G (%) A A A G 150 A 150 A 150 G A A U A A CA 100 CA 100 CA 100 Normalized Normalized Normalized AverageRLU 50 AverageRLU 50 AverageRLU 50 **** **** **** 0 **** 0 **** 0 **** WGE Oat WGE Oat WGE Oat GAC4247-4249UUU UGG4218-4220AAA UGG4218-4220AAA GAC4247-4249UUU J) K)

AGGGU AGACU

250 250 200 A (%) A 200 G (%) A G 150 A C G 150 C U C CA 100 UC

Normalized A 100 Normalized

AverageRLU 50 **** AverageRLU 50 0 **** 0 WGE Oat WGE Oat AC4248-4249GG G4219U UGG4218-4220CCC

Figure 2.5. Effect of MTE mutations in the potential pseudoknot interaction on translation. Structures of predicted pseudoknot interaction mutants. Mutated bases in the MTE of each MlucM construct are in red. Constructs are named for the base changes at the numbered positions. Potential pseudoknot interactions are marked by dashed lines, with G:U pairs in a lighter shade. Relative luciferase translation activities in WGE (black bars) or oat protoplasts (gray bars) are shown in percentages of relative light units normalized to the MlucM wild type. Data are averages (±S.D.) from 4 independent experiments (n=15). Fluc/Rluc sample ratios were normalized to MlucM wt. One-way ANOVA was used to analyze the significance of each set of samples. One-way ANOVA-Dunnett’s multiple comparison test was performed to compare statistical differences among double mutants and single mutants. Four asterisks indicate the statistical difference with P<0.0001; two asterisks indicate the statistical difference with P<0.005.

112

TPAV PTE TPAV-m2 PTE

eIF4E (nM) 0 5 10 20 40 80 100 120 150 200 250 500 eIF4E (nM) 0 5 10 20 40 80 100 120 150 200 250 500

4E + PTE Shift 4E + PTE Shift

Free PTE Free PTE

MTE-wt MTE GAC4247-4249UUU

eIF4E (nM) 0 5 10 20 40 80 100 120 150 200 250 500 eIF4E (nM)0 5 10 20 40 80 100 120 150 200 250 500

4E + MTE 4E + MTE Shift Shift

Free MTE Free MTE

MTE U4218G MTE U4218G, GA4247-4248CC

eIF4E (nM)0 5 10 20 40 80 100 120 150 200 250 500 eIF4E (nM) 0 5 10 20 40 80 100 120 150 200 250 500

4E + MTE 4E + MTE Shift Shift

Free MTE Free MTE

MTE G4219U MTE GA4247-4248CC

eIF4E (nM) 0 5 10 20 40 80 100 120 150 200 250 500 eIF4E (nM)0 5 10 20 40 80 100 120 150 200 250 500

4E + MTE Shift 4E + MTE Shift Free MTE Free MTE

MTE C4238G

eIF4E (nM)0 5 10 20 40 80 100 120 150 200 250 500

4E + MTE Shift

Free MTE

Figure 2.6. Electrophoretic mobility shift assays (EMSA) of mutant uncapped MTE and PTE RNAs with eIF4E. Ten fmol of the indicated uncapped 32P-labeled transcripts were incubated with the indicated concentrations of wheat eIF4E prior to electrophoresis on a non-denaturing gel.

113

A) B) MgCl - + MgCl - + T G A C 0 1 0 1 5’-end of MCMV genome A1 T G A C 0 1 0 1 105 G10

G G C A20 U A Small loop II A20 Bulge 1 C _ A G30 A _ U 100 G _ C A40 U _ A C _ G A40 C G U U UC A UGC CUCU CC C UGC ACU UA UG G 140 P32 start P50 start C60 _ C G C G U C _ U A C G A _ 10 U G _ C G MgCl - + U A G _ A T G A C 0 1 0 1 U A 1 U80 C Asymmetric U A 5’ A40 internal loop I 90 A G A 20 A C Bulge 1 G U50 C G C C A G100 Large loop I _ A C60 G _ C U _ G G C C _ G 30 A _ U G110 _ G70 A _ U 80 U _ A C _ G U A C A G78 C Asymmetric G120 C internal loop I G G MgCl - + A A 40 T G A C 0 1 0 1 _ C G _ C U71 A U _ G U U _ A U80 _ 70 G _ C 50 C132 U _ A C G C U U U A90 A G A C G C Large loop I G140 U U (Pyrimidine Rich ) G100 U C C A U G C Small 60 C145 loop II SHAPE Reactivity C110 100%

40% 20%

Chemical modification in the presence of Mg2+ in the SHAPE buffer

Figure 2.7. SHAPE analysis of the MCMV 5’ UTR. (A) SHAPE probing gel of the 5’ end of MCMV RNA (nts 1-140). RNA was treated with either 60 mM benzyl cyanide (1) in dimethyl sulfoxide (DMSO) or DMSO only (0) in the presence (+) or absence (-) of magnesium in SHAPE buffer. The sequencing ladders (lanes TGAC) were generated by dideoxy sequencing of RNA with the same 5’-labeled primer used in the modification lanes. Zoom-in sections (gels at right) were obtained in a different gel by allowing electrophoresis to run longer. (B) Superposition of the probe activities on the best-fitting predicted secondary structure of the 5’ UTR. SHAPE activity level indicated by color-coded bases. Nucleotides inside the squares are tracts that have the potential to base-pair with loop I of the MTE (Fig 3). Modification levels of nucleotides in the presence of magnesium are indicated by stars following the reactivity color- coding scheme. AUG start codons underlined.

114

A) C 4240

A G G C C 103 U A 107 G C G 11 8 137 U 4236 AUG AUG 3’

G 12 G C C A 15 5’

5’ 3’

5’- UTR MTE B) 5’-UTR 3’-UTR WGE Oat protoplast CGGCA UGGCA FLUC UGCCA 11 15 103 107 4236 4240 1 121 4095 4437 aB A AB

CGCCA UGGCA FLUC UGCCA A 11 13 15 103 107 4236 4240 1 4095

121 4437 Ac A

CGGCA UGCCA FLUC UGCCA AC 11 15 103 105 107 4236 4240 1

121 4095 4437 AC

CGGCA UGGCA FLUC UGGCA AC 11 15 103 107 4236 4238 4240 1 121 4095 4437 A CGCCA UGGCA FLUC UGGCA 11 13 15 103 107 4236 4238 4240 A 1 121 4095 4437 A CGGCA UGCCA FLUC UGGCA 11 15 103 105 107 4236 4238 4240 A 1 121 4095 4437 0 50 100 150 Relative luciferase activity (%)

Figure 2.8. Long-distance interaction between MCMV 5’ UTR and MTE. (A) Wire diagrams of 5’ UTR and MTE showing tracts (boxed bases) capable of base pairing between 5’ UTR and MTE. Mutations introduced to disrupt and restore potential long-distance base pairing are indicated in red. AUG start codons are shown at positions 118 and 137. (B) Effects of mutations (enlarged, red letters) on translation of MlucM. Relative luciferase translation activity of indicated uncapped transcripts in WGE and oat protoplasts are shown as percentages of relative light units relative to MlucM wild type (100%). Data are average percentages (±S.D.) from 4 independent experiments (for each construct: WGE: n=16; Oat: n=13). Two-way ANOVA multiple comparisons was used to analyze the significance of each set of samples against MlucMWT where A: P<0.0001, a: P<0.005. Two-way ANOVA Dunnett’s multiple comparison test was performed to compare the statistical difference among double mutants and single mutants. Mutants compared with MlucMG13C/C4238G were designated B if P<0.0001 or b if P<0.005. Mutants compared with MlucMG105C/G4238G were designated C if P<0.0001 or c if P<0.005. ANOVA multiple comparisons were performed in the context of each sample collection (i.e., WGE or oat protoplast).

115

A) 4240

C A G C G U

4236

C

G C G A 2983 _ 2979 U _ G U A U _ A U _ A A _ U U U _ 2994 2971 G _ C G _ C 5’ G C G A A U G 3’ 5’ 3’ s gRNA 1 5’-UTR MTE

B) Oat WGE

UGGCA FLUC UGCCA 103 107 4236 4240 1 121 4095 4437

UGGCA FLUC UGCCA 2979 2983 4236 4240 A

2971 2994 4095 4437 ABC

UGCCA FLUC UGCCA AB 2979 2981 2983 4236 4240

2971 2994 4095 4437 ABC

UGGCA FLUC UGGCA AB 2979 2983 4236 4238 4240 2971 2994 4095 4437

UGCCA FLUC UGGCA AB 2979 2981 2983 4236 4238 4240 2971 2994 4095 4437 0 50 100 150 Normalized Relative luciferase activity (%)

Figure 2.9. Long-distance interaction between sgRNA1 5’ UTR and MTE. (A) Wire diagrams of sgRNA1 5’ UTR and MTE showing tracts (boxed bases) capable of base pairing between 5’ UTR and MTE. Mutations introduced to disrupt and restore potential long-distance base pairing are indicated in red. The first start codon is shown at nt 2995. (B) Translation of MlucM (top row) or Msg1lucM (remaining rows), with mutations shown in enlarged red text. Relative luciferase translation activity of indicated uncapped transcripts in WGE and oat protoplasts are shown as percentages of relative light units relative to MlucM wild type (100%). Data are average percentages (±S.D.) from 4 independent experiments (For each construct: WGE: n=12; Oat: n=16). One-way ANOVA multiple comparison was used to analyze the significance of each set of samples against MlucMWT where A: P<0.0001. One-way ANOVA Dunnett’s multiple comparison test was performed to compare the statistical difference among double mutants and single mutants. Mutants compared with Msg1lucM were designated B if P<0.0001. Mutants compared with Msg1lucMG2981C/G4238G were designated C if P<0.0001. Mutations in UTRs are indicated in bold red letters.

116

A) CC P50 P32 P25 300 4248 (%) - G CC 4247

4238 200 4248 GA - / C 4300 / - G G U G C C 4247 4200 4218 4218 4219 4238 105 105 wt Δ GA U U G G G no RNA wt C 100 P50 P32 P25 0 Relative translation activity

B) Relative viral RNA accumulation (%) 150 100 50 0 1 2 3 Mock

wt

*** C4238G

*** G105C

G105C/C4238G **** U4218G **** GA4247-4248CC **** U4218G/GA4247-4248CC **** Δ4200-4300

G4219U

C) CC D) G105C/C4238G 4248 - G CC 4247 G4219U 4238 4248 GA - / C 4300 / G - G G U C C 4247 4218 4200 4238 105 105 4218 4219 U GA Δ Mock wt C G G U G

Mock gRNA

sgRNA1 WT sgRNA2

G105C/C4238G/ 28S rRNA ΔG94/U492A

Figure 2.10. Effects of mutations on translation and replication of full-length MCMV genome.(A) Translation products from full-length MCM41 transcript in WGE. Prominent bands correspond to P50, P32, and P25 protein products. Percentage relative protein levels quantified using ImageQuant from two independent experiments were plotted in the graph at right. (B) Dot- blots of transfection assays in BMS protoplasts. 24 hr post-transfection, total RNA was extracted, and vacuum filtered through a nylon membrane (see Methods). Blots were probed with 32P- labeled antisense transcript complementary to nts 3811-4356 of MCMV genomic RNA, quantified by phosphorimager, and normalized to that obtained with wild type MCM41 infection. One-way ANOVA was used to analyze the significance of each set of samples. Three or four asterisks indicate the statistical difference with P<0.001, or P<0.0001, respectively. Shown are blots from two experiments performed in triplicate, and one experiment with a single sample for each mutant. Mean relative spot intensity (viral RNA accumulation) for the three independent experiments was plotted in the bar graph to the left-side of the panel. (C) Northern blot hybridization of total RNA extracted from maize (B73) plants 14 dpi with the indicated mutants of MCM41. Mobilities of genomic RNA (gRNA), subgenomic RNAs 1 and 2 (sgRNA1 and sgRNA2) are indicated. (D) Symptoms at 14 dpi in systemically infected leaves from plants inoculated with indicated mutants. The bottom leaf indicated the severe symptoms observed in plants infected with MCM41G105C/C4238 in which two spontaneous mutations, ΔG94 and U492A, appeared. 117

P32 P7a P31

P50 RdRp (P111) P7b 3585 SpeI SmaI 1 3K CP (P25) 4191 4437 5’ 3’ CP (P25)

3578 4437

1 120 3578 4437 5’ UTR 3’ UTR Fluc

3578 4437 Capped uncapped CP (P25) * 4095 4437

3578 4108 CP (P25)

4095 4191

0 10 100 1,000 10,000 100,000 Relative translation activity B) Capped uncapped 3578-4437 3578-4108 4095-4437 4095-4191 100,000 3’UTR: ) % ( Cap: + _ + _ + _ + _ l e

v 10,000 e L n o

i 1,000 t a Luc l s n

a 100 r T e v i

t 10 a l e R 1

Figure 2.11-S1. Luciferase translation assay including the 3’ end of coat protein-coding region. (A) Genome organization of MCMV. Boxes indicate open reading frames (ORFs) with encoded protein (named by its molecular weight in kDa) indicated. Black oval shows the location of the 3’-CITE (4164-4333). The zigzag line indicates the truncated sequence of the CP ORF included in the luciferase construct. Luciferase translation assay results in WGE are shown to the right of the corresponding luciferase construct with the indicated 3’ UTR sequence. All constructs here contain the full MCMV 5’ UTR. Luciferase activities (relative light units) were normalized to MlucM construct containing the full 3’-UTR (4095-4437). Data samples were subjected to one- way ANOVA and Dunnett’s multiple comparison test was used to analyze the significance of each set of samples against MlucM, where one asterisk equals 0.015 P-value. (B) In vitro translation assays using 35S-methionine. Left: a representative 35S-met-labeled gel sample out of 3 independent assays. Right: Quantification of the overall radioactivity detected in each assay. 118

U C G U G A A A U C C CUCCGCCCA AGAUGG A C A G ______A C A ______U G_ A_ G G U_ G CC C U U G G ______C C C C C U C G C C U CA UG C_ G C U C G A G G A U CGAGG UUC UA CC A G G GG GG GUA CGCG U U C C A A A C C U A U C U C U U A _ U U _ G _G C G U _ G GU C U C _ C _ G C _ G C G _ U A A _ U G C U _ A _ U _ A _ C G _ C _ G A _ U _ A G _ C _ U G _ C A U C _G U A G G C G G _ CG C AU A A G A G G G G G G G G G _ C C C G A A _ _ G _ C _G G C A _ U A U G U A C _G C _ G _ _ _ A U U A A U U _ A G _ C C _ G _ A C C _ G A A _ A U C _ C _ G G G C A U A _ A _ U 3738 C _ G 3817 A U _ 3700 C _ G 3791 4121 C G 4193 CarMV HnRSV PMV Carnation mottle virus Honeysuckle ringspot virus Panicum mosaic virus

C G C U C ______C G_ G_ U_ G C C CU CGG AC U _ _ _ _ G CG_ G_ UCCCCUG C_ U_ C_ G U A U A A U U C CA C GC CUG U U U G U ______U U U GCC AG C GA G A C_ C CUGGCUGACC C UA CCAGC C _ G U U G U _ C _ G U A _ A GG GACCG AUGGUUG G _ U G G _ C G _ C U A A U A G C U A _ A _ A _ U U _ U G _ U A G _ C C G C _ G _ C _ G _ C G A AC G _ U G C UG C A _ AU G A C G G G _ G G GC G G U A G G A G G G A _ A _ A CU A U U AG C C _ G _ G _ G _ C CG A _ U _ A G _ C _ U G _ C U A C _ G U G G _ G _ U _ UU U A C G CU C _ A A A _ G U _ A A G C G U A G A _ U G _ U U _ U AG _ A A _ U _ A UA G _ C G _ C C _ G A U 3757 G C 3849 _ U _ A 3906 3979 C G 4078 3822 JINRV CMMV PEMV2 Japanese iris necrotic ring virus Cockfoot mild mosaic virus Pea enation mosaic virus-2

C A A U AG G _ _ _ _ CA_ CA_ G_ G A A C U U C_ C_ A_ CC U ______C C AC_ A_ UGGCCCGAG A G C G GUGUC GGU GG G UGUAC CU C A C A G U ______U C A U U A GAGGUUC CCGGA_ GGA_ A GC _ U C C _ A C _ G G C _ G C _ G G UUCC _ GUUUCCU U C _ G A _ U U C U _ G A G A _ U C _ G C _ G U _ A A _ U C _ G C _ G C _ G C _ G AC G C G A _ U G G C G G G _ C _ AC U A G G G _ U G G A A _ G C G U _ G C G G C G G _ GA _ U G U _ A _ A C G C _ G _ A A C _ G GA _ U C _ G G C G _ A C G _ A C G A G U _ G G A _ U U _ A A _ C G A U A G A A _ U C A _ C G U G _ U G _ C _ A G C _ U _ 3983 G _ C 4063 3680 G U 3758 3641 G C 3727 SCV T PAV PFBV Saguaro cactuc virus Thin paspalum asymptomatic virus Pelargonium flower break virus

Figure 2.12-S2. Secondary structures of known and predicted PTEs. The conserved bases predicted to interact with the 5’ UTR are inside black boxes. Predicted nucleotides that create the pseudoknot between the purine-rich bulge and the connecting bridge are indicated with dash lines. 119

capped TPAV PTE capped TPAV-m2 PTE eIF4E (nM) 0 5 10 20 40 80 100 120 150 200 250 500 eIF4E (nM)0 5 10 20 40 80 100 120 150 200 250 500

4E + PTE 4E + PTE Shift Shift

Free PTE Free PTE

capped MTE-wt capped MTE G4219U

eIF4E (nM)0 5 10 20 40 80 100 120 150 200 250 500 eIF4E (nM)0 5 10 20 40 80 100 120 150 200 250 500

4E + MTE Shift 4E + MTE Shift

Free MTE Free MTE

capped MTE U4218G capped MTE C4238G eIF4E (nM)0 5 10 20 40 80 100 120 150 200 250 500 eIF4E (nM)0 5 10 20 40 80 100 120 150 200 250 500

4E + MTE Shift 4E + MTE Shift

Free MTE Free MTE

capped MTE U4218G, GA4247-4248CC capped MTE GAC4247-4249UUU eIF4E (nM)0 5 10 20 40 80 100 120 150 200 250 500 eIF4E (nM)0 5 10 20 40 80 100 120 150 200 250 500

4E + MTE 4E + MTE Shift Shift

Free MTE Free MTE

Figure 2.13-S3. Electrophoretic mobility shift assays (EMSA) of MTE and PTE capped RNA probes with eIF4E. Ten fmol of the indicated capped 32P-labeled transcripts were incubated with the indicated concentrations of wheat eIF4E prior to electrophoresis on a non-denaturing gel.

120

A)

1.0 Uncapped RNA

TPAV TPAV -m2 MTE-wt MTE G4219U MTE U4218G 0.5 MTE C4238G MTE U4218G, GA4247-4248CC MTE GAC4247-4249UUU

0.0 0.0 1.0x10 -7 2.0x10 -7 4.0x10 -7 6.0x10 -7

eIF4E (M)

B)

1.0 Capped RNA

TPAV TPAV -m2 MTE-wt MTE G4219U 0.5 MTE U4218G MTE C4238G MTE U4218G, GA4247-4248CC MTE GAC4247-4249UUU

0.0 0.0 4.0x10 -8 6.0x10 -8 2.6x10 -7 4.6x10 -7 eIF4E (M)

Figure 2.14-S4. Non-linear regression graphs of EMSAs. Radioactivity in bands on gels in Fig 6 and S3 Fig was quantitated by phosphorimager. The level of radioactivity in the unshifted and shifted bands was used to generate nonlinear regression curves. 121

Figure 2.15-S5. RT-PCR of plants inoculated with MCMV mutants. A representative sample of results observed in RT-PCR screening of inoculated maize plants (8 dpi). RNA isolated from 3rd leaf was subjected to cDNA synthesis (see methods); cDNA from each isolation was divided into two fractions where one underwent RT-PCR to amplify Ubiquitin1 mRNA (top) as an internal control, while the other RT-PCR amplified the MCMV CP ORF (bottom). The summary of overall RT-PCR screenings is summarized in Table 1. For Table 1, even the faint bands observed in some mutants (e.g., samples 27-31) were counted as positive.

122

Figure 2.16-S6. SHAPE probing of mutant MTE sequence in MCM41 mutant constructs used for infectivity assays. (A) SHAPE probing gel of MTE in the whole genome context of MCM41 mutants. RNA was modified with either 60 mM benzyl cyanide (1) in dimethyl sulfoxide (DMSO) or DMSO only (0) in the presence (+) or absence (-) of magnesium in SHAPE buffer. The sequencing ladders (lanes AC) were generated by dideoxy sequencing of RNA with the same 5’-labeled primer used in the modification lanes. (B) Reactivity information was superimposed into the characterized MTE structure. To compare where modifications occurred in the presence of magnesium among all mutants, specific geometric shapes were assigned to each hyper-modified nucleotide. Five-point star: wt, four-point star: C4238G, hexagon: GA4247- 4248CC, rhombus: U4218G, triangle: GA4247-4248, U4218G, cross: G4219U. The color of each geometric figure corresponds to reactivity data coloring. 123

4251 4251 4250 A A U U U U U A U G G A A G A A A A A A A A C CC GC A G A C U G G G C C CC GC A G A C U G G G C C CC GC CC C U G G G C _ | | | | | | | | | | | 4238 _ C | | | | | | | | U 4238 _ | | | | | | | | U 4238 C G G GC G G G G G GC G U C C G G GC G U C C G G U C C G A G A G A C G C A U C U A C U A U U G G A U G U G A U G U A U G G G C C 4230 A A A G 4230 _ 4230 _ G C 4270 G C 4270 _ G 4270 G C G _ C A A _ Purine rich _ G C G _ C G C _ _ bulge _ U A U A G C _ _ _ AG C AG C G C G C G C A A A C C Purine rich G G A _ C G G G C bulge _ U | U | U A 4280 C Purine rich _ A _ C A _ A U G C G C _ _ bulge _ C G U A 4280 U A 4280 _ _ _ C G A U A U _ 4210 A A Mismatch pair II C G _ _ C G G _ C C G _ C G U _ A 4210 A A Mismatch pair II 4210 A A Mismatch pair II ...... G _ C G _ C _ _ U4218G,GA4147-4248CC . . . U A ...... U A . . . WT C4238G

4251 C U 4251 C 4251 C C U G U A U G C A G C A G A A A G _ C CC GC C U G G G C A C G CC C _ G G _ C _ | | | | | | | | | | | _ 4238 C U G _ C CC G U G GC G G G G U C C G C _ G A G A AC U A A U C G C A A U U A G A C U U _ A 4230 G G C U C G U C G _ A _ A A _ U 4238 U A U A _ G G 4270 4238 _ C _ A C C G G U Purine rich _ _ G C G U G A bulge _ A A G C G _ _ A C G U A C _ G G _ C _ _ C C C G _ C 4230 G U 4230 G C 4270 A C U 4270 C U _ U G G C C A _ G G _ C U A 4280 _ A _ G C G _ C A U _ _ _ G C G C C G _ _ _ G _ C U _ A C G Purine rich U _ A AG C 4210 A A Mismatch pair II AG C G C bulge G A G _ C C _ A G . . . U A . . . G Purine rich G G GC U bulge A _ C _ G C GA4247-4248CC A G C _ _ U A 4280 U A 4280 _ _ A U A U _ _ C G C G _ _ C G C G 4210 A A Mismatch pair II 4210 A A Mismatch pair II G _ C _ G C . . . U _ A ...... U _ A . . . GA4247-4248CC U4218G,GA4147-4248CC Alternative Alternative structure 4251 4251 A A U U U U G G A A A A A A C CC GC A G A C U G G G C C CC GC A G A C U G G G C _ | | | | | | | | U 4238 C 4238 _ C | | | | | | | | U G GC G U C C G G A G G GC G U C C G A C U A C A U G A U G U G U G G G A U C A C A 4230 _ G C 4270 4230 _ _ G C 4270 G C G _ C G _ C _ _ G C U A U _ A _ _ AG C G C G C A C G Purine rich Purine rich bulge U U | bulge G | C A _ C A _ G C G C _ _ U A 4280 U A 4280 _ _ A U A U _ _ C G C G _ _ C G C G 4210 A A Mismatch pair II 4210 A A Mismatch pair II G _ C _ _ G C . . . U A ...... U _ A . . . G4219U U4218G

Figure 2.17-S7. MTE SHAPE structures of selected mutants. SHAPE data was superimposed into best fitted secondary structures. The base of the long stem on MTE was conserved throughout all the structures; thus, only areas with major changes are shown. SHAPE reactivity data color-coding is the same as the previous figure. Major changes in the MTE structure are indicated by the creamy-yellow shading.

124

Tables

Table 2-1. Summary replication symptoms of inoculated maize plants with MCMV mutants

Maize a Northern Systematic Rate of Sy stematic Virus Construct RT-PCR Genotype Blot c Sympyoms d infection e recov ered _ _ B73 N/A 0% (0/15) N/A f Mock Sweet corn _ _ N/A 0% (0/13) N/A B73 + b + Normal 89% (17/19) MCMV-WT MCM41-WT Sweet corn + + Normal 88% (21/24) MCMV-WT B73 ± ± N/A; Delayed, mild 7% (1/15) Reverted to MCMV-WT MCM41-C4238G Sweet corn ± ± N/A; Delayed, mild 7% (1/15) Reverted to MCMV-WT _ B73 ± N/A 0% (0/15) N/A MCM41-G105C Sweet corn ± ± Delayed, normal 18% (2/11) Reverted to MCMV-WT

B73 + + N/A, Delayed, mild 20% (3/15) MCM41 G105C, C4238G MCM41-G105C, C4238G MCM41 G105C, C4238G; Delayed, mild & Sweet corn + _ 27% (3/11) MCM41 G105C, C4238G + T422A, severe ΔG94 _ B73 ± N/A 0% (0/17) N/A MCM41-GA4247-4248CC Sweet corn ± _ N/A 0% (0/15) N/A

B73 ± ± N/A, Normal 12% (2/17) Reverted to MCMV-WT MCM41-U4218G Sweet corn ± ± Delayed, normal 7% (1/15) Reverted to MCMV-WT _ B73 ± N/A 0% (0/19) N/A MCM41-U4218G, GA42247-4248CC _ Sweet corn ± N/A 0% (0/19) N/A _ B73 ± N/A 0% (0/17) N/A MCM41- Δ4200-4300 _ Sweet corn ± N/A 0% (0/15) N/A MCM41-G4219U; B73 + + Normal 88% (15/17) MCM41 G4219U Reverted to MCMV-WT [7/15] MCM41-G4219U; Sweet corn + + Normal 74% (14/19) Reverted to MCMV-WT [8/14] a MCMV coat protein detected by RT-PCR at 8 dpi on the 4th leaf. PCR bands detected in plants with no symptoms were faint

b +: samples tested positive 70%-100% of the time, -: samples were negative all the time (0%),± : any positive samples under 60% of the time c 14 dpi infected plants where MCMV genomic, sgRNA1, and sgRNA2 was detected by northern blot

d Mottling symptoms were observed on non-inoculated new leaves (4-6)

e Percentage was calculated by dividing plants showing mottle symptoms/ number of plants inoculated f N/A: not applicable

125

Table 2-2. Table of primers. List of primer sets used in this publication.

Con structs: Deletion or mutation Forward Primer Reve rse Primer Δ420 0-430 0 5'-ACGGTGCGACATGGTAAC-3' 5'-AACTACTAGTATACGATTTAGGC-3' Δ409 5-416 4 5'-AAAGTGTTTGGGAGCCTAAATC-3' 5'-CGCGGCTTACAATTTGGACTTTC-3' Δ409 5-419 1 5'-CTAGTAGTTTGCGTGATGAC-3' 5'-CGCGGCTTACAATTTGGACTTTC-3' AC424 8-4249 GG 5'-GCCAACCGCAGGGTGGGCGTATATAGTAAGCCTTGACC-3' 5'-AGCCGCCGCCCACTCTCC-3' UGG421 8-4220 CCC 5'-TGACCATGACCCCAGAGTGGGCG-3' 5'-TCACGCAAACTACTAGTATAC-3' GA424 7-4248 CC 5'-GCCAACCGCACCCTGGGCGTATATAGTAAGCCTTGACCC-3' 5'-AGCCGCCGCCCACTCTCC -3' U4218 G 5'-TGACCATGACGGGAGAGTGGG-3' 5'-TCACGCAAACTACTAGTATAC-3' GAC424 7-4249 UUU 5'-GCCAACCGCATTTTGGGCGTATATAGTAAGCCTTGACCCAC-3' 5'-AGCCGCCGCCCACTCTCC -3' UGG421 8-4220 AAA 5'-TGACCATGACAAAAGAGTGGGCG -3' 5'-TCACGCAAACTACTAGTATAC -3' G4219 U 5'-TGACCATGACTtGAGAGTGGG-3' 5'-TCACGCAAACTACTAGTATAC-3' A4248 U 5'-GCCAACCGCAGTCTGGGCGTATATAGTAAGCCTTGACC-3' 5'-AGCCGCCGCCCACTCTCC-3' U4218 A 5'-TGACCATGACAGGAGAGTGGG-3' 5'-TCACGCAAACTACTAGTATAC-3' MTE EMSA probe 5'-AATTAATACGCTCACTATAGGGTATACTAGTAGTTTGCGT-3' 5'-TGTATCCAGTTACCATGTCGCACCG-3' TPAV EMSA probe 5'-taatacgactcactatagg GCTCTATCCGAAA CTCCAGTG-3' 5'-ACTCTCTACCTTCCGTCCAG-3' G13C 5'-GTAATCTGCGCCAACAGACCC -3' 5'- CTCCCTATAGTGAGTCGTATTAGTG -3' G105 C 5'-CCCCTGACTGCCAATCAGGTTTC -3' 5'-AAATTCCCACGTTAGAGCTC -3' C4238 G 5'-GCGGCGGCTGgCAACCGCAGA-3' 5'-CCACTCTCCAGTCATGGTCATCACGCAAAC-3' sg1MlucM 5'-AGAAATTCCCGACGCGCGCATGGAAGACGC -3' 5'-GCCAAAATACCCCCTATAGTGAGTCGTATTAGTGGCTTTACC -3' G2981 C 5'-AGAAATTCCCGACGCGCGCATGGAAGACGC -3' 5'-GGCAAAATACCCCCTATAGTGAGTCGTATTAGTGGCTTTACC -3' MCMV-coat protein 5'-ATGGCGGCAAGTAGCCGGTCT -3' 5'- TGTGCTCAATGATTTGCCAGCCC -3' Maize Ubiquitin 1 5’-TAAGCTGCCGATGTGCCTGCGTCG -3' 5’- CTGAAAGACAGAACATAATGAGCACAG -3' Northen Blot probe 5'-ctgaATTTAGGTGACACTATAGTTCCTAGCATCTACTTGCCCC -3' 5'-GGATCGTGCCCTCAGCTACAATAGCTCTGAA -3'

126

CHAPTER 3: PANICUM MOSAIC VIRUSES LIKE 3'-CAP-INDEPENDENT TRANSLATION ENHANCERS AND EIF4E INTERACTIONS

Abstract

Panicum mosaic virus 3’-cap-independent translation elements (PTEs) are secondary structures on the 3’-end of the virus genome that posses a hammer-like structure that forms a pseudoknot between a G- and C-domain bulge. The G-C pseudoknot formed by the interaction of the G-C domains is suspected of interacting with the cap-binding pocket of eukaryotic initiation factor 4E (eIF4E). Computational modeling and site-directed mutagenesis were used to identify mutations around the cap-binding pocket to disrupt the binding of eIF4E to PTEs. However, due to complications in the protein purification procedure, there is much work to be done to verify the effect of the mutations reported here. Nonetheless, this chapter aims to illustrate the benefits of using computational models to supplement the work in the lab bench.

Introduction

Eukaryotic initiation factor 4E

Translation initiation factors play essential roles in the plant development process

(Dinkova et al., 2016). Eukaryotic initiation factor 4E (eIF4E) plays a crucial role in translation initiation (Hernandez and Vazquez-Pianzola, 2005; Schmitt-Keichinger, 2019; Yanagiya et al.,

2012). In plants, there are three types of 4E like proteins: eIF4E and eIFiso4E (Allen et al., 1992;

Browning et al., 1987; Kropiwnicka et al., 2015; Rodriguez et al., 1998) and novel cap-binding protein (nCBP) (Gomez et al., 2019; Keima et al., 2017; Ruud et al., 1998). Although eIF4E and eIFiso4E belong to class I and nCBP belong to class II, they all have affinities towards the mRNA 5'cap (Joshi et al., 2005). The conservation or substitution of tryptophans in conserved regions determines the classification of these proteins (Dinkova et al., 2016; Joshi et al., 2005). 127

The translation initiation cascade begins when eIF4E recognizes the 7-methylguanosine triphosphate (m7GTP) cap in the mRNA (Andreev et al., 2017; Monzingo et al., 2007; Yanagiya et al., 2012). The binding of eIF4E to the 5’-cap of the mRNA results in the formation of the

trimeric eIF4F complex, which is composed of eIF4G, eIF4E, and eIF4A (Andreev et al., 2017;

Hinnebusch, 2014, 2017). The interaction within eIF4G binding domains of eIF4E, poly (A)

binding protein (PABP), and mRNA produces a closed-loop structure (Hinnebusch, 2017). The mRNA circularization enables the formation of the 43S preinitiation complex (PIC)

(Hinnebusch, 2014). PIC scans untranslated regions (UTR) of the mRNA until it encounters upon a start codon. eIF4E and viruses

Although plant eIF4E factors play an essential role in translation initiation, they are often targeted by viruses for the translation of viral proteins (Gomez et al., 2019; Nicaise et al., 2003;

Saha and Makinen, 2020; Schmitt-Keichinger, 2019). Viruses in the potyvirus genus possess a

viral protein genome linked (VPg), which is covalently attached to the 5’-end of the viral RNA

(Jiang and Laliberté, 2011). The VPg interacts with host protein, and one of the best-

characterized interactions is eIF4E-VPg (Gao et al., 2020; Leonard, 2004; Saha and Makinen,

2020; Wittmann et al., 1997). The recognition of eIF4E as a major susceptibility factor for

potyviruses led research groups to develop resistance strategies by targeting this protein factor.

eIF4E mediated virus resistance has been shown in tomato (Gauffier et al., 2016; Moury et al.,

2020), melon (Rodríguez-Hernández et al., 2012), plum (Wang et al., 2013), peanut (Xu et al.,

2017), lettuce (Tavert-Roudet et al., 2012), potato (Cavatorta et al., 2011), peas (Bruun-

Rasmussen et al., 2007) and soybean (Gao et al., 2020). Since potyviruses also interact with

eIFiso4E, this protein has also been targeted for virus resistance. Some of the success stories

involving eIFiso4E include peach plants (Cui and Wang, 2017), and peanuts (Xu et al., 2017). 128

The binding affinity of eIF4E towards cap analogs is often 4-10 fold higher than that of

eIF(iso)4E (Kropiwnicka et al., 2015). In contrast, nCBP’s binding affinity is weak or about the

same as eIFiso4E (Kropiwnicka et al., 2015; Ruud et al., 1998). Studies in Arabidopsis thaliana indicated that nCBP acts as a recessive resistance gene against viruses in the

and families (Keima et al., 2017). Research studies using the yeast-two hybrid system indicated that VPg of cassava brown streak virus (CBSV) interacted strongly with cassava nCBP1 and nCBP2 (Gomez et al., 2019). The studies in Arabidopsis and cassava suggest that nCBP may be involved in the accumulation of virus movement protein (Gomez et al., 2019;

Keima et al., 2017). However, compared to eIF4E and eIFiso4E, nCBP is less characterized, and potential mechanisms of nCBP and virus interactions are not fully understood.

Potyviruses are not the only virus family that targets eIF4E. The absence of a 5’-cap and

poly (A) tail requires translation mechanisms such as 3’CITEs (Truniger et al., 2017; Truniger et

al., 2008). 3’-CITEs are unique structures located at the 3’-end of the virus genome that recruit

host translation initiation factors (Simon and Miller, 2013). The panicum mosaic virus 3’-CITE

(PTE) like structures have an affinity for eIF4E (Batten et al., 2006; Wang et al., 2011; Wang et al., 2009). PTE like structures posses a hammer-like structure composed of a long bulged helix bifurcating into two-stem loops (Wang et al., 2011). A C-rich bulge (C-domain) connects the two side stem-loops (Figure 3.8) (Wang et al., 2009). The C-domain interacts with a G-rich (G- domain) region of the PTE, forming a pseudoknot. Protein-RNA docking simulation between pea enation mosaic virus RNA2 (PEMV2) and eIF4E predicted interaction between the PTE G-C pseudoknot and the cap-binding pocket of eIF4E (Wang et al., 2011).

129

Computational RNA-protein dynamics

Structural information produced by x-ray crystallography, nuclear magnetic resonance

(NMR) spectroscopy, and electron microscopy generate various biological puzzle pieces (Madan

et al., 2016). Simulation computational tools can help us predict interactions and assembly of the

puzzle pieces. The continuous flow of experimental data enables the creation of bioinformatics

tools to generate more accurate in silico models (Mann et al., 2017; Stöcker et al., 2018;

Zagrovic et al., 2018). A process often used in protein simulations is “dock/docking.” Docking is

a process of sampling and scoring, where the software samples all possible binding interactions

between two independent structures (Fu and Meiler, 2018; Huang, 2014; Madan et al., 2016;

Morris et al., 2009; Trott and Olson, 2010; Yan et al., 2017). Due to the paucity of information

involving protein binding sites, protein-protein, and protein-nucleic acid, most global docking

simulations are ab initio (Huang, 2014; Yan et al., 2017). Although there are many docking tools available, most of them are based on protein-protein interactions (Tuszynska et al., 2015a; Yan et al., 2017).

NPDock and HDOCK are good platform examples to simulate protein-RNA docking interactions for non-bioinformatic experts (Tuszynska et al., 2015a; Yan et al., 2017). The pipeline workflow of both of these web servers was designed to handle nucleic acid-protein interactions. In comparison to NPDock, HDOCK accepts sequences and structure inputs (Yan et al., 2017). In this chapter, NPDock was used to simulate the interaction between wheat and maize eIF4E proteins with maize chlorotic mottle virus 3’-CITE (MTE). Because NPDock only accepts PDB files, RNA 3D structure prediction software such as SimRNA (Magnus et al., 2016) was used to predict 3’-CITE structure using SHAPE experimental data. Protein-RNA interactions observed from the in silico model were taken into account when designing mutations on the eIF4E protein. 130

The main objective of this study was to characterize the interaction between eIF4E and

PTE-like 3’CITE structures. In this chapter, mutations were introduced to the eIF4E protein

expression bacteria vectors. The mutations introduced were designed based on previously

reported data and in silico simulations. Experimental data involving eIF4E-PTE interaction

indicated an interaction between the binding pocket of eIF4E and G-C pseudoknot in the PTE

(Kraft et al., 2019; Wang et al., 2011; Wang et al., 2009). However, due to unforeseen challenges

in the experimental procedure, the experimental data presented in this chapter are preliminary

and must be optimized and repeated prior to publication.

Results and Discussion

Structure alignment of eIF4E proteins.

There are only a handful of plant eIF4E crystal structures available in the protein data

bank (PDB). From the 93 eIF4E structures available, only three of them are from plants,

Cucumis melo (PDB ID: 5ME7 and 5ME6) (Miras et al., 2017c), Pisum sativum (PDB ID:

2WMC) (Ashby et al., 2011), and Triticum aestivum (PDB ID: 2IDR, 2IDV) (Monzingo et al.,

2007). However, since the eIF4E structure is highly conserved (Dinkova et al., 2016; Joshi et al.,

2005), information interpolated from these structures can be used to estimate the location of cap-

biding amino acids in other eIF4E proteins for which crystal structure is unknown. Multiple

sequence alignment of 13 eIF4E protein sequences, including P. sativum (pea), T. aestivum

(wheat), and C. melo (melon), indicated that eIF4E in plants is highly conserved (Figure 3.1).

The crystal structures of eIF4E binding to m7GTP ligand revealed that there are 4-6 core amino acids (wheat: W62, W108, E109, R158, R163; melon: R114, W128, R178, K183; pea: W75,

W121, E122, R171, K176, R220) in eIF4E that interact with m7GTP (bold, orange sequences on protein alignment, Figure 3.1) (Ashby et al., 2011; Miras et al., 2017c; Monzingo et al., 2007).

In all of the plant eIF4E-m7GTP crystal structure models, the location of three amino acids 131

(Figure 3.1 red boxes) in the cap-binding pocket was highly conserved (wheat: W108, R158,

R63; melon: W128, R178, K183; pea: W121, R171, R220). The main tryptophan (wheat: W108,

melon: W128, pea: W75) that interacts with m7GTP appears to always be flanked by a lysine

and a glutamic acid. Interestingly, the glutamic acid (E122pea or E109wheat ) next to W108wheat only interacted with m7GTP in T. aestivum and P. sativum proteins (Figure 3.7).

Protein structures are not rigid or static; they undergo constant conformational changes around their flexible non-structured loops (Khrennikov and Yurova, 2017; Mishra et al., 2019;

Sankar et al., 2018). Proteins are continually moving and changing their conformations based on their interactions with other effectors (Sankar et al., 2018). Environmental factors, such as temperature, pH, binding effectors, or allosteric modulators, often regulate the catalytic efficiency of proteins (Guarnera and Berezovsky, 2016; Mishra et al., 2019). Protein binding sites are classified as functional binding sites (active sites) or regulatory binding sites (allosteric sites) (Mishra et al., 2019). In a functional binding site, the binding interaction between ligand and protein triggers a chemical modification on the binding site (Song and Zhang, 2018). In a regulatory binding site, the protein activity is regulated by the effector molecule (Guarnera and

Berezovsky, 2016; Song and Zhang, 2018). Frequently, a protein’s active site is composed of a group of amino acid residues located in a deep pocket or surrounded by a network of channels

(Mishra et al., 2019).

Wheat eIF4E cap-binding site structural changes

The cap-binding active site of wheat eIF4E (T. aestivum), contains five amino acid

residues involved in the eIF4E-m7GTP (Figure 3.2). The highly conserved tryptophan

(W108wheat) (Figure 3.1) is located on a flexible loop in the eIF4E 3D structure (Figure 3.2).

Protein dynamic simulations indicated that the elastic loops in wheat eIF4E structure are locked

in place upon the interaction of amino acid residues, W62, W108, E109, R163, and R158 with 132

m7GTP. The interaction eIF4E-m7GTP makes the protein structure more stable (Monzingo et

al., 2007). Simulations using the eIF4G-eIF4E crystal structure data from Miras et al. (2017b), the conformational changes suggested a structural rearrangement of eIF4E protein (Figure 3.3).

In melon, the eIF4E-4G interaction results in the formation of disulfide bonds between C133 and

C171 (C113 and C151 in wheat (Figure 3.3)) (Miras et al., 2017c). This disulfide bond rearranges one of the flexible loops such that it forms a helix at amino acid residues P131 to

A134 (P111 to A114 in wheat (Figure 3.3)). The eIF4E rearrangement from a flexible loop to a helix pulls out the W108wheat (W128melon) from the cap-binding pocket (Figure 3.3), which

inhibits eIF4E-m7GTP binding interaction (Miras et al., 2017c). Crystal structure data and

protein dynamic simulations allow us to visualize mobility protein dynamics, which can be

useful when designing mutations for functional assays.

Maize eIF4E 3D structure prediction

Similar to eIF4E orthologs in other plant species, maize eIF4E has higher binding activity

towards m7GTP than eIFiso4E (Lázaro-Mixteco and Dinkova, 2017). The expression levels of

eIF4E and eIFiso4E vary depending on the developmental stage of the maize plant (Dinkova and

Sánchez de Jiménez, 1999). Although the protein sequence of maize eIF4E has been

characterized (Dinkova et al., 2000; Lázaro-Mixteco and Dinkova, 2017; Manjunath et al.,

1999), there is no crystal structure of this protein yet. As there was no maize eIF4E at the protein

data bank (PDB), homologous protein structures of eIF4E were searched for comparison. After

searching for eIF4E structures on the PDB website, 93 crystal structures were found. Out of

those 93 structures, 45% were in the eIF4F complex while the rest of the structures were bound

to a ligand. Using BLASTp from the NCBI tools, 100 accession numbers of protein sequences

were found. Blast results were filtered to remove sequences of the same protein. The filtered

BLAST result indicated that the alpha chain dimer of wheat eIF4E had an 88% identity with a 133

significant E-value (6e-118). Sequence alignment of plant eIF4E proteins (Figure 3.1) and

multiple structure superimposition of eIF4E structured in the PDB database were employed to

predict maize eIF4E structure.

Bio3D-R-package (Grant et al., 2006) was used to analyze protein structure, sequence

alignments, and trajectory data of eIF4E. Zea mays eIF4E amino acid sequence (DAA5517.1)

was used to search for proteins with a similar structure; only the top 11 hits were used for

superimposition and principal component analysis (PCA). The structure multiple sequence

alignment (Figure 3.4-A) resulted in 11 sequences and 177 alignment positions (from which 155

had no-gaps). Bio3D-R-package used ward.D2 method to identify the clustering of the aligned

sequences to produce a dendrogram (Figure 3.4-B). The cluster alignment was partitioned into

four groups, where each sequence cluster had its specific color. The structures were

superimposed on their 155 aligned C-alpha atom positions (Figure 3.4-C). Since most crystal

eIF4E structures are missing amino acids at the N-terminus, the sequence closer to the N-

terminus was colored in grey.

PCA analysis was performed in the superimposition (Figure 3.4-C) to lower the

dimensional representation of protein hits. The PCA analysis revealed that 2IDR-A had a root mean square inner product (RMSIP) of 0.29 while 2WMC-G an RMSIP of 0.24, respectively.

RMSIP values measure the similarity between two sets of modes obtained from PCA (David and

Jacobs, 2011). RMSIP scores range from 0 to 1, where a score of 0 is given to orthogonal vectors and a score of 1 is given to identical vectors. RMSIP above 0.7 are considered excellent correspondence, and a score of 0.5 is considered fair (David and Jacobs, 2011). However,

RMSIP scores are dependent on subspace size. In other words, if the plant eIF4E superimposition PCA analysis points were larger, then the RMSIP values were most likely to be 134

higher. Based on these results, I decided the 2IDR-A chain would contain the best-fit coordinates

to simulate Zea mays eIF4E 3D structure folding.

The accuracy and structure prediction of proteins varies on the computational methods

and pipeline used (Deng et al., 2018). Based on structure alignments and PCA analysis from

Bio3D-R package, Triticum aestivum 2IDR-A crystal structure chain coordinates information

was the best fit to use as the template for Z. mays eIF4E prediction. The predicted Zea mays

eIF4E of two online ab initio protein prediction web-servers, I-TASSER (Yang and Zhang,

2015) and Phyre2 (Kelley et al., 2015), were compared for similarities. Phyre2 is a multiple-step

ab initio folding simulation server that used PSI-BLAST and PSIRED for Z. mays eIF4E (Zm4E)

secondary structure prediction. Phyre2 clusters 12-15aa secondary structures of the predicted

Zm4E secondary structure and compared it with known structural regions or domains on the

PDB database. The Zm4E protein structure model was predicted based on the structural

alignments of known fragments. Although I-TASSER is also an ab initio 3D protein prediction web-server, the user can add restraints and protein templates for simulation models. So, for the I-

TASSER web-server simulations, the coordinates from 2IDR-A were included as a template restraints. The comparison of Zm4E 3D structure prediction from Phyre2 and I-TASSER indicated that structures were very similar to one another. However, the I-TASSER Zm4E model

(Figure 3.6) was used for downstream application because cap-binding active site coordinates had a greater resemblance to wheat eIF4E (2IDR).

Molecular docking allows one to predict interactions between a ligand and its receptor protein (Feinstein and Brylinski, 2015). The Zm4E and m7GTP ligand interactions were predicted using AutoDock (Morris et al., 2009) and AutoDock Vina (Trott and Olson, 2010) suite docking tools. AutoDock is a suite of automated docking tools that allows one to predict the 135 interaction of small molecules, such as substrates and drug candidates, and binding receptors on a 3D structure (Morris et al., 2009). AutoDock Vina facilitates the virtual screening and automatically calculates the grid maps and clusters the results (Trott and Olson, 2010). First, m7GTP and 2IDR-A (Figure 3.6-A) docking simulations were performed to gain an understanding of the process and compare the docking accuracy by comparing it to the known crystal structure. From the obtained ligand conformations, ligand #2 had the closest polar interactions to the crystal structure published (Monzingo et al., 2007). The docking grid area coordinates for docking Zm4E-m7GTP were optimized based on the 2IDR-A-m7GTP docking simulation (Figure 3.6-B). The docking conformation from mode five (Figure 3.6-B) was determined to be the best fit as it interacted with six conserved amino acids (W65, W111, E112,

R161, R166, W170) around known cap-binding regions (Figure 3.1).

The 3’-CITE of maize chlorotic mottle virus interacts with eIF4E; however, the interaction dynamics are unknown. Mutations on various plant eIF4E resulted in virus resistance

(Ashby et al., 2011; Tavert-Roudet et al., 2012; Xu et al., 2017). Several of the mutations conferring resistance were located on the cap-binding region of eIF4E. The PCA analysis performed using Bio3D indicated that Pisum sativum eIF4E G-chain and Triticum aestivum eIF4E A-chain had the best RMSIP, so these chains were used for superimposition to predicted

Zm4E. The superimposition of P. sativum and T. aestivum eIF4E to predicted Z. mays eIF4E

(Figure 3.7) confirmed that the structural location of the amino acid residues in the cap-binding pocket active site was similar. The simulated Z. mays eIF4E 3D model would allow us to locate the amino acids in the cap-binding pocket, information that will be useful during the experimental design of mutants. 136

Triticum aestivum eIF4E-PEMV2 PTE simulation docking model

Potyviruses interact with eIF4E via its VPg through protein-protein interaction (Ala-

Poikela et al., 2019; Kanyuka et al., 2005; Nicaise et al., 2003; Saha and Makinen, 2020). In

contrast to potyviruses, tombusvirids use RNA-protein interactions to hijack the translation

machinery (Nicholson et al., 2010). The RNA-protein interaction between tombusvirids and protein factors is achieved through 3’-CITEs (Bhardwaj et al., 2019; Miras et al., 2017a;

Truniger et al., 2017; Wang et al., 2011). The objective of this study was to characterize the interaction between eIF4E and PTE-like 3’CITE structures. Nucleotide changes on the RNA structure of the PTE indicated the formation of a pseudoknot between the C- and G-domains is fundamental for eIF4E recruitment (Du et al., 2017; Kraft et al., 2019; Wang et al., 2009)

(Figure 3.8). However, the pseudoknot stability in all known PTE-like structures comes in a wide variety of strengths and sizes (Figure 3.9). In silico simulation and SHAPE probing of eIF4E-PEMV2-PTE complex suggested the possible interaction between the cap-binding pocket of eIF4E with PTE pseudoknot (Wang et al., 2011).

Analysis of the eIF4E-PEMV2 PTE predicted model by Wang et al. (2011) revealed potential amino acid residue targets for mutagenesis. Wang et al. (2011) protein-RNA docking model indicated that a hyper-modified G3840 located in the G-domain of the PTE (Figure 3.8)

many interacts with cap-binding site residues W108 and W62 in wheat eIF4E. RNA-protein binding assays using PEMV2-PTE and eIF4E mutants at these positions supported this model

(Wang et al., 2009). Single mutation of W108L decreased the wheat eIF4E-m7GTP binding, whereas eIF4E double mutant W108L/W62L precluded eIF4E-m7GTP binding (Wang et al.,

2009).

137

In contrast to the data discussed in Wang et al. (2011), additional nucleotide-amino acid

interactions were predicted (Figure 3.10). The nucleotide G3838 was a sandwich in direct contact

with W108 and W62 (Figure 3.10) instead of the hypermodified G3040. In addition to W108 and

W62 residues in eIF4E, residues (E109, Q59, K207, S205, G208, and P209) around the flexible

loops near the cap-binding active site interacted with three Gs on the PTE G-domain (3838-

3840) (Figure 3.10). An additional cap-binding (E109) residue was identified as a target for eIF4E-PTE interaction. Since the accuracy of docking simulations vary based on the pipeline use, parameters assumptions, and protein-RNA complex scoring, experimental data is required to validate computational models. Nevertheless, the protein-RNA simulation models were taken into consideration during the selection of target areas.

Modeling of maize chlorotic mottle virus 3’-CITE interaction with eIF4E

Maize chlorotic mottle virus (MCMV) is a single-stranded positive RNA virus from the

Tombusviridae family (Scheets, 2000, 2016). Similar to other members in the Tombusviridae family, MCMV lacks the presence of a poly-A tail and a 5’-cap (Scheets, 2016). MCMV uses a

3’-CITE located on its 3’-end of the genome to recruit eIF4E (refer to chapter 2). SHAPE structural data revealed a resemblance of the MCMV 3’-CITE (MTE) to the PTE-class 3’CITE.

However, in contrast to previously characterized PTEs (Figure 2.10), the secondary structure of the MTE does not have the C-domain, and a weak pseudoknot interaction between the G-domain the connecting bridge between the two side-loops is predicted.

The MTE 3D model was simulated using the SimRNA webserver (Magnus et al., 2016).

SimRNA is a computational method that uses Monte Carlo scheme for conformational space sampling of RNA folding simulations and 3D structure prediction (Magnus et al., 2016). RNA simulation was performed using a 100 nucleotide RNA fragment (4200-4300). SHAPE data restriction was converted to space-efficient bracket notation and fed to the SimRNA server. The 138

final simulated structure cluster was exported as a ‘.pdb’ file and use for RNA-protein

simulations (Figure 3.11). In contrast to PEMV2 PTE 3D simulation (Figure 3.8), the MTE did

not seem to form a pseudoknot between the predicted G-domain and connecting bridge between

the side loops (Figure 3.11). Even though G-C domain pseudoknot was not present in the MTE model, the rest of the structure folded similarly to PTEs.

The absence of G-C pseudoknot on the MTE 3D model impacted the simulation

recognition of the G-domain by the eIF4E binding site (Figure 3.12, Figure 3.13). The wheat

eIF4E-MTE model was generated using NP-Dock (nucleic acid-protein docking) web server

(Tuszynska et al., 2015a). NP-Dock workflow implements docking using GRAMM, scores all

the compatible decoys and clusters the best-scored models, and outputs the top 3 complex

representatives. The top protein-nucleic acid 3D complex structures were visualized and

compared in PyMol. The best fit structure model is shown in Figure 3.12 (wheat) and Figure

3.14 (maize).

All the amino acids predicted to interact with MTE were located on the flexible loops

surrounding the cap-binding pocket of eIF4E. In comparison to PEMV2-PTE, MTE interacted

only with one of the cap-binding active site residues, R163 (wheat) or W170 (maize). In the

MTE, the bulge corresponding to the G-domain in PTEs is located between nucleotides 4216-

4223; while the C-domain (connecting bridge) is located between nucleotides 4246-4250. In

contrast to the wheat eIF4E (Ta4E)-PEMV2-PTE model, the MTE-TA4E model indicated that

wheat eIF4E showed no interaction of eIF4E with the G-domain regions. Instead, the protein

primarily interacted with 4-nucleotides upstream of the G-domain (G4224-G4228). Interestingly,

wheat eIF4E intermingled with a nucleotide located in the C-domain (A4248). A comparison of

the amino acid residues interacting in the PEMV2-PTE-Ta4E vs. MTE-Ta4E model suggested 139

that both 3’-CITEs interacted with residues S205 and G208. However, the RNA-protein model’s accuracy of the MTE-Ta4E model is lower than the PEMV2-PTE-Ta4E.

In comparison to the Ta4E-MTE simulation complex (Figure 3.12), Zm4E-MTE had

more points of interaction (Ta4E: 13, Zm4E: 17) (Figure 3.13). Similar to the Ta4E-MTE, Zm4E

had only one point of interaction with the amino acid residues involved in m7GTPcap-binding

interactions. The simulation of 4E-MTE interaction suggested that regardless of the protein use

(Zm4E or Ta4E), nucleotides C4243, and A4348 interacted with the flexible loops are around the

cap-binding region. Interestingly, in the simulated Zm4E-MTE model, the 3-nucleotides (G4247-

C4249) corresponding to the C-domain interacted with Zm4E. In comparison to the PEMV2

model, the RMSD structural superimposition of the MTE models (Ta4E: 3.8 Å, Zm4E: 4.2 Å)

was higher than PEMV2 0.1 Å RMSD. One way to improve the quality of the docking MTE

model would be to optimize the molecular dynamics using a physics-based force field (Madan et al., 2016). Unfortunately, to make adaptations to the dynamic molecular interactions would require extensive system preparation and advance computational expertise outside of the scope of a conventional biologist.

Wheat eIF4E mutation design

Natural or introduced mutations in eIF4E have conferred viral resistance (Dinkova et al.,

2016; Leonard, 2004; Moury et al., 2020; Nicaise et al., 2003; Nieto et al., 2006). Mutations on eIF4E were designed based on the recessive resistance observations in the PTE-Ta4E model

(Wang et al., 2011) and natural eIF4E resistance in pea, melon, and lettuce (German-Retana et al., 2008). Three areas in the eIF4E were targeted for mutagenesis: 1) the 5’-cap-binding region,

2) the outer surface near the cap-binding region, and 3) the areas of Ta4E-PTE interaction

(Figure 3.14, Figure 3.15). All the amino acids targeted for mutagenesis were changed to alanine. The five amino acid residues (W62, W108, E109, R158, R163) in the cap-binding 140

pocket were mutated to disrupt binding (Figure 3.15). In addition to the five amino acids in the

cap-binding pocket of eIF4E, two additional amino acids (W49, W167) near the cap-binding region were mutated (Table 1-1). Mutant W49A was constructed based on an analogous mutation in lettuce (W64A) that reduced the cap-binding efficiency (German-Retana et al.,

2008). Mutant W167 is located at the bottom of the cap-binding pocket and suspected to stabilize cap-binding (Marcotrigiano et al., 1997; Monzingo et al., 2007).

There are some 3’-CITE elements that require the eIF4E-eIF4G complex to function correctly (Miras et al., 2017b). Plant RNA viruses that use an I-shaped RNA structure (ISS) bind to eIF4F (eIF4E and eIF4G) complex and are unable to interact with each subunit individually

(Liu and Goss, 2018; Miras et al., 2017b). Whether the ISS binds to both eIF4E and eIF4G simultaneously or binds to eIF4E upon the conformational change when bound to eIF4G is not known. In some cases, a 3’-CITE can have a higher affinity when linked to eIF4G. So far, we have only tested the eIF4E-MTE interactions, but we have yet to check if the MTE only binds to eIF4E or can bind other factors. The MTE-4E model suggested that MCMV could interact with eIF4E differently than most PTEs. One way to determine if MTE is affected by the conformational change eIF4E undergoes upon the binding of eIF4G is to mutate the binding site of eIF4G. Mutant W79A was designed to investigate the potential involvement of the eIF4G binding site with PTE-like structures (Figure 3.15, Table 1-1). The location of this mutation was based on W94 in lettuce, which partially disrupted the eIF4G binding while restoring the potyvirus interaction of naturally resistant plants (German-Retana et al., 2008).

141

The outer surface area near the cap-binding pocket of eIF4E was a target for mutagenesis

due to its association with virus resistance (German-Retana et al., 2008). Cap-binding residue

W62 is associated with resistance since a double mutation in this region led to the disruption of eIF4E-PTE interaction (Wang et al., 2009). Mutant H67A is next to W62 in the flexible loop-2 associated with the cap-binding activity. The proximity to the amino acid associated with PTE resistance makes H67 a good mutation target. Studies in lettuce-potyvirus resistance (German-

Retana et al., 2008) suggested that H67’s structural position did not affect cap-binding activity; however, its positive charge may stabilize PTE binding by to the RNA phosphate backbone.

Mutant F50A is located in a flexible loop between ɑ1 and β3 sheet and is associated with the cap-binding activity (German-Retana et al., 2008).

The last set of mutations were designed to test the PEMV2-PTE-Ta4E complex model

(Figure 3.10). The additional interactions predicted by the 3D complex model were located in flexible loops 2 and 3 around the cap-binding active site (Figure 3.15). If the simulation model is accurate, mutants Q59A, S205A, K207A, G208A, and P209A are predicted to partially or entirely inhibit 4E-PTE interaction without having much effect on cap-binding activity. In particular, mutants S205A and G208A are interesting because both Ta4E-PEMV2-PTE and

Ta4E-MTE models suggested interaction with these proteins.

GST-eIF4E fusion: Protein solubility problems

The use of m7GTP-sepharose affinity chromatography is likely the standard method of purification of eIF4E associated protein due to eIF4E’s m7GTP binding affinity (Carberry et al.,

1989; Dinkova et al., 2000; Mayberry et al., 2007; Monzingo et al., 2007). Our lab purifies eIF4E proteins using this method with no problems. Since cap-binding mutants disrupted binding with m7GTP cap, an alternative method of purification was pursued. The glutathione-S- transferase (GST) gene fusion system is commonly used due to its high-level expression and 142

purification simplicity (Harper and Speicher, 2008). GST fusion purification is a standard method of expression and purification of proteins because proteins can be purified under nondenaturing conditions (Harper and Speicher, 2008). Wheat eIF4E was cloned into pGEX-5X-

2 such that eIF4E was in frame with the GST protein expression, and a Factor Xa cleavage site facilitated the removal of GST from the protein.

The wheat GST-eIF4E (GST-Ta4E) constructs exhibited solubility problems. Initially, growth conditions of GST-eIF4E constructs were kept similar to Mayberry et al. (2007), where bacteria were transformed into BL21 competent cells and induced using 0.5mM isopropyl-β-D-

1-thiogalactopyranoside (IPTG) to induce production of eIF4E. However, GST-Ta4E fusion expressed in the insoluble fraction and premature proteolytic cleavage was observed (Sheber,

2018). The temperature incubation was reduced to 30°C and 25°C, but the GST-Ta4E fusion protein continued to be expressed on the insoluble fraction (Sheber, 2018).

The problem of solubility was addressed by changing to Novagen’s Rosetta-gami B

(DE3) pLysS and Tuner (DE3) pLysS competent cells, which contained deletions and key codon features that facilitated the formation of disulfide bonds in the E. coli cytoplasm (Novagen,

2010). The Tuner strains have a lacZ deletion in BL21 cells that allows for the protein expression control by adjusting the IPTG concentration (Chu et al., 2020; Novagen, 2010). The Tuner

(DE3)pLyss (refer to as Tuner-cells in the rest of the text) competent cells were composed of the following genotype: F-ompT hsdSB (rB-mb-) gal dcm lacY1(DE3) pLysS(CamR) (Refer to Table

3-2 for genetic key definitions). The Tuner-cell’s lacY mutation encodes lac permease, which

allows for uniform penetration of IPTG in the cells; consequently, lower IPTG concentrations

could enhance solubility and protein activity (Chu et al., 2020; Novagen, 2010). The other

competent cell type, Rosetta-gami B (DE3) pLysS, used had additional codon feature 143

composition. The Rosetta-gami B (DE3) pLysS (refer to R-gami B cells from this point forward)

- - - genotype contains the following modifications: F ompT hsdSB (rB mb ) gal dcm lacY1 ahpC

(DE3) gor522::Tn10 trxB pLysSRARE (CamR, KanR, TetR) (Refer to Table 3-2 for genetic key

definitions). R-gami B cells are hybrid strains designed to enhance the expression of eukaryotic

proteins (Novagen, 2010). The R-gami B cells’ genetic composition improves recombinant

expression issues by enhancing protein expression and formation of disulfide bonds in the

beacteria cytoplasm (Fathi-Roudsari et al., 2016; Li et al., 2019; Novagen, 2010).

At 37°C, the GST-Ta4E expression in Tuner cells were 95%-100% present on the

insoluble fraction (Figure 3.16). Although the GST-Ta4E complex (~52 kDa where GST=

26kDa and wheat eIF4E=26kDa) was expressed at a higher level than previously observed with

conventional BL21 (DE3) cells, most of the protein was present in the insoluble fraction (Figure

3.16). From all the constructs tested, only the wild type (WT) and two of the cap-binding mutants (W108A, R163A) has about 5-10% of the protein expressed on the soluble fraction

(Table 3.3). Next, the Tuner and R-gami-B competent cells expression and solubility were compared at 37°C using the wild type construct (Figure 3.17). Comparison of Tuner vs. R-gami-

B cells revealed that Tuner cells expressed the proteins at a higher level than R-gami-B cell expression. The growth rate on Tuner cells was higher than R-gami-B cells, which resulted in the production of a larger cell pellet for constructs in Tuner cells. However, the fast-growing

bacterial rate did not guarantee protein solubility. Even though growth rates were overall lower

in R-gami-B cells, some mutant constructs did better on slow-growing bacteria (Table 3-3).

The reduction of temperature and increased aeration are often optimized to increase solubility (Harper and Speicher, 2008). The protein expression comparison between 30°C and

25°C did not yield significant differences in the past; thus, induction at 30°C vs. 16°C was 144

compared (Figure 3.18). Upon discussion about GST solubility problems with other protein

researchers in the department, it was suggested that induction overnight at 16°C could improve

solubility issues for insoluble proteins. In this experiment, the comparison of inductions of Ta4E-

GST at 30°C for two hours versus 16°C for 12 hours revealed that based on ratios, only about

50% of the protein was expressed on the soluble fraction regardless of the temperature and

induction time (Figure 3.18). Because a more substantial amount of cells was obtained from

30°C incubation, we decided to proceed with inductions at 30°C as the optimal incubation

temperature.

The competent Tuner cells enable us to fine-tune the IPTG concentration to determine the ideal concentration for solubility of the GST-4E fusion complex (Novagen, 2010). Titration of four additional IPTG concentrations suggested than IPTG concentrations lower than 0.3 mM decreased the amount of protein expressed on the soluble fraction (Figure 3.19). The difference in Ta4E-GST fusion protein expressed in the soluble fraction was slightly bigger on 0.3mM; therefore, we proceeded to perform inductions using 0.3 mM IPTG from this point forward. The experiment illustrated in Figure 3.19 was executed at a small scale, so we infer that large scale protein induction would provide enough protein in the soluble fraction to purify.

Since bacterial growth conditions were based on small scale experiments, we needed to test if these conditions would work on a medium scale production. Ta4E-GST constructs were induced for 2:30 hours using a 0.3 mM IPT concentration for induction at 30°C. The lower concentration of IPTG determined by TUNER-IPTG optimization also benefitted R-gami-B cell’s expression (Figure 3.20). Under these conditions, the amount of protein present in the soluble fraction was greater than or equal to 50%. Since we observed that not all mutant constructs expressed at the same rates in each competent cells (Table 3-3), we examined all the 145

protein expression gels and decided the ideal competent cell for each construct. For example,

mutants W62A, and F50A expressed more soluble protein in the Tuner cells (Table 3-3). On the

other hand, W79A, R163A, and W167A expressed more soluble protein int the R-gami-B cells

(Table 3-3). For the proteins that behave similarly to wild type where the expression of soluble proteins was about the same in either cell, they were transformed in Tuner cells as they produced a higher amount of overall proteins.

Ta4E-GST proteins were grown under optimal parameters and underwent the Factor Xa cleavage to separate the GST tag from eIF4E. Cleavage of Factor Xa was performed following

NEB-instruction recommendations. A small amount of Factor Xa reaction was taken periodically

(every 2-3 hours) to check for complete cleavage. After Factor Xa treatment, the proteins underwent another round of glutathione purification to remove GST. Ironically, the second round of purification yield a clean GST protein, while the eluted eIF4E had several contaminants

(Figure 3.21-A). Initially, a small amount of contaminant was visible on the protein gels, so we decided to overload the gels to identify the size of the pollutants (Figure 3.21-B). The smaller bands at the bottom of the eIF4E bands were attributed to protein degradation due to Factor Xa cleavage incubation periods. Some mutant constructs still have small traces of the Ta4E-GST complex, but all of them had larger bands contaminant between 60-65 kDa (Figure 3.21-B).

During protein expression and purification, molecular chaperons in E. coli, such as

DnaK, are co-purified with target proteins (Morales et al., 2019). DnaK is a homolog of Hsp70 molecular chaperone, and its size is about 70kDa (Uhl et al., 2018). Changes to the sonication parameters were made to ensure cell pellets did not overheat during sonication to prevent co- purification of Hsp70-like proteins. Different size exclusion purification membranes were tested for removal of contaminants, but results were not so successful. We started to suspect potential 146

dimerization on the GST-4E constructs when the eluted protein was passed over a 50 kDa

column such that we saved the flowthrough while discarding the contaminants. However, protein

did not filter through the column and about 40-50% of the loaded protein was stuck on the filter membrane. This filter-bound protein could be recovered only using denaturing reagents. While searching the literature to find ways to troubleshoot Ta4E-GST purification, it was discovered that GST was not recommended for smaller protein, and because of its size, it tended to dimerize

(Bell et al., 2013).

The GST tag is commonly used for protein purification due to its versatility and high expression level (Harper and Speicher, 2008). However, GST tends to aggregate and form inclusion bodies when the tag is used with hydrophobic and large proteins (Bell et al., 2013;

Deceglie et al., 2012). In its native form, GST is a homodimer, but when it is bound to a protein that can also oligomerize, forming a large complex that is not easy to purify (Bell et al., 2013;

Maru et al., 1996). The wheat eIF4E crystal structure in the PDB database indicated that wheat eIF4E has the ability to oligomerize as the crystal was purified as a dimer (Monzingo et al.,

2007). Because of the proper folding protein uncertainties and difficulties encounters purifying the GST-fusion, we decided to change to a smaller tag, His6x tag.

Wheat eIF4E mutants were transferred to His-tag wheat eIF4E construct already available in our lab. Although sequencing results indicated the presence of wheat eIF4E wild type, the first 8-amino acids of the full-length sequence were replaces by 6XHis tag plus proline and valine amino acids. Despite this 8-missing amino acids, we decided to use this plasmid as we believed that a small fragment in the N-terminus should not disrupt the cap-binding activity and overall protein structure. Similar to before, the wild type plasmid was mutated using a Q5-site directed mutagenesis kit, and sequences were confirmed through sequencing. Growth and 147 expression conditions were kept the same as the ones optimized with GST-4E fusion. Protein solubility was improved (Table 3-3) in all the proteins. The increase in protein solubility suggested that something in the GST fusion complex was causing solubility problems.

Purification of His-proteins was compared using the two types of resin kits available in our lab, Ni-NTA and HisPur Cobalt from Thermo Fisher Scientific. For the nickel resin (Ni-

NTA), gravity purification was used, and for HisPur Cobalt, spin-column cleaning was used.

Similar to the GST-4E fusion purified proteins, the proteins purified with Ni-NTA resin also expressed chaperone proteins (Figure 3.22-A). However, the protein was much cleaner than the

GST-4E fusion. The small fraction of the sonicated protein lysate was purified using HisPur

Cobalt resin beads, resulting in a cleaner eIF4E product similar to the results when m7GTP sepharose beads are used (Figure 3.22-B). Because half of the eIF4E protein was lost in the flowthrough, the manufacturer’s protocol was modified to ensure the capture of protein. The flowthrough was collected and passed twice through the beads, which increased the protein capture by 40%.

Construction of maize cap-binding protein expression vectors

Electrophoresis mobility shift assays (EMSA) assays using wheat eIF4E indicated that

MCMV-3’CITE (MTE) posses the ability to bind eIF4E to initiate translation of viral proteins

(Chapter 2). Although wheat eIF4E is homologous to maize eIF4E, we wanted to compare binding affinities between wheat eIF4E and host-specific eIF4E. Thus far, we have tested MTE- wheat eIF4E interaction; however, we have not tested eIF4E isoform for interaction. Since maize has three 4E-like proteins that bind to 5’-cap (Dinkova et al., 2011; Dinkova et al., 2016), we decided to clone the genes of these proteins for protein-RNA interaction assays to test MTE’s binding affinity. In other plant systems, mutations in each of these cap-binding proteins, eIF4E

(Ashby et al., 2011; Moury et al., 2020; Nieto et al., 2006a; Rodríguez-Hernández et al., 2012), 148

eIFiso4E (Gallois et al., 2010; Hwang et al., 2009; Wang et al., 2013), and CBP (Gomez et al.,

2019; Keima et al., 2017), resulted in virus resistance. Our goal is to identify the cap-binding proteins that interacted with MTE and then look for mutations that would result in virus resistance.

The expression of eIF4E, eIFiso4E, and CBP varies based on the plant tissue and growth stage (Dinkova et al., 2000; Dinkova and Sánchez de Jiménez, 1999; Lázaro-Mixteco and

Dinkova, 2017). Maize genome database (MaizeGDB) has a wide variety of tools accessible to the public; these tools were used to search for eIF4E in the B73 genome. The search yielded five gene IDs; two corresponded to eIF4E located on 3, two corresponded to eIFiso4E located in one and five, and finally, nCBP, which is located in chromosome one

(Table 3-4). The eFB Atlas browser was used to search for protein and mRNA levels expressed in from each maize gene, and genes with the highest expression level were chosen for cloning.

Based on mRNA and protein expression levels, the following gene candidates were selected for cloning eIFiso4E from GRMZM2G22019-Chr5, eIF4E from GRMZM2G002616-Chr3, and

Zm00001d028515-Chr1.

The cDNA prepared from mRNA isolated from B73 seedlings was cloned into two different expression vectors. The first approach for cloning the PCR product (Figure 3.23-A) from cDNA synthesis was gateway cloning, and the second one was Gibson assembly. For

Gibson assembly, a pet vector containing His6-MBP tag (Figure 3.23-B) was used to prevent solubility problems for downstream applications (as observed with wheat eIF4E). The Gibson assembly was successful, while the Gateway cloning had some minor issues. However, these issues were resolved, and expression vectors were constructed using both protein expression backbone vectors. During the first protein expression test, pDest vectors (Gateway cloning) were 149

not expressing the protein as efficiently as the His6-MBP vectors. It was also observed that the

gateway cloning method added additional amino acids to the N-terminus of the protein, so I

proceeded to use His6X-MBP vectors instead. The His6-MBP constructs were induced following

similar conditions as optimized with wheat his-4E constructs and purified using m7GTP sepharose (Figure 3.24). Zm4E-MBP complexes were purified with no problems (Figure 3.24).

Screening of protein-RNA interaction using electrophoresis mobility shift assay

Electrophoresis mobility shift assays (EMSAs) detect protein-RNA complexes by combining a labeled RNA probe in the presence or absence of protein (Hellman and Fried,

2007). In this chapter, we struggled to get clean eIF4E mutant proteins due to solubility problems and co-expression of E. coli chaperon proteins. However, towards the end, we were able to purify enough protein for EMSA assays. Some of the mutant proteins described in this chapter had deficient expression levels or had problems growing that proteins were not purified from them. For example, W79A produced one to two colonies during transformation, which was low comparing the wild type that produced more than 100 colonies every time. The correct sequence was present on this plasmid, and it transformed okay in top-10-cells, but when transformed to protein expression competent cells, transformation efficiency decreased. Due to low expression levels of W79A, this mutant was not included in the EMSA screening. A similar problem was observed for two of the latest protein designs, K207A and G208A. Therefore, mutants W79A,

K207A, and G208A were omitted from EMSA screenings (Table 3-5).

Three different PTE-like structures were used for EMSA protein screenings. Translation and RNA-protein binding assays indicated that thin paspalum asymptomatic virus (TPAV) 3’-

CITE (TPAV-PTE) contained a strong PTE (Kraft et al., 2019); therefore, this PTE was used as a

positive control for PTE-eIF4E interaction. When the C-domain of this TPAV-PTE is changed to 150

adenines (TPAV-m2), cap-binding interaction is inhibited and the presence of a 5’-cap in the

RNA partially restored eIF4E binding (Kraft et al., 2019). TPAV-m2-PTE would serve as a negative control for PTE-eIF4E interaction. Finally, since the MTE resembles a PTE-like structure without the G-C domain pseudoknot (Figure 3.13), we wanted to see if MTE-eIF4E was similar to PTE-eIF4E interaction.

In Wang et al. (2009), mutations in amino acids W108 and W62 in the cap-binding active site of eIF4E led to the disruption of PTE-4E interaction. Changing the tryptophan to leucine at position 108 (W108L) decreased the PTE binding affinity to eIF4E. Then, when a second mutation was added in tryptophan 62 (W62L), the double mutations inhibited interaction between PEMV2-PTE and eIF4E (Wang et al., 2009). These experiments were the key to the

PEMV2-PTE-Ta4E model simulation. When we target the same regions but this time changing the tryptophan to alanine, W62A, neither the PTE-TPAV nor the MTE bound to the protein

(Table 3-5). However, the protein used for this screen was GST purified, so the lack of activity could not be attributed to the protein mutation until the protein is purified once again with His- tag.

Similarly to Wang et al. (2009), W108A affected PTE-4E interaction. In the absence of a cap, PTE did not bind W108A, but the presence of 5’-cap partially increased the eIF4E PTE interaction. The decrease in PTE binding affinity in the W108A mutant serves a validation for the prediction of the 3D model since this amino acid-residue-nucleotide interaction is likely to occur. Surprisingly, W108A-MTE binding was not drastically affected as W108A-TPAV-PTE

(Table 3-5). Although the MTE-4E model had potential computational issues, it did not predict interaction with W108 (Figure 3.12). The purified his-W108A-MTE binding decreased in the presence of a cap while binding was not affected in the absence of the 5’-cap. Since both PTE 151

and MTE partially bound to W108A in the presence of a 5’-cap, it could indicate that mutating

this region affects cap-binding affinity. The fact that uncap MTE binding is not affected by

W108A mutation could imply that the lack of a G-C pseudoknot in MCMV 3’-CITE (Figure

3.11) stimulates a different interaction mechanism to eIF4E than most PTEs.

Mutat E109A is located in the cap-binding pocket (Figure 3.2) of wheat eIF4E and interacts with m7GTP (Monzingo et al., 2007). Since this mutant is involved in m7GTP interaction, we expected cap-binding activity to decrease. As expected, mutant E109A binding to

TPAV-PTE and m7GTP was reduced (Table 3-5). Interestingly, uncapped TPAV-m2-PTE, which is nonfunctional when incubated with WT-4E, had some binding interaction when the

E109A was GST purified. Similar to uncapped TPAV-PTE, uncapped MTE-E109A interaction

was partial. However, in contrast to the expectations, capped-MTE had a complete bind to

E109A. Since uncapped MTE-4E binding was affected by the mutation, it could indicate that this

amino acid position has some interaction with MTE.

Mutant F50A is located in the outer surface of the protein in a flexible loop close β1

strand. Mutation at this amino residue is associated with potyvirus resistance in lettuce (Table

3-6) (German-Retana et al., 2008). Correlating to the mutant location, it did not affect cap-

binding (Table 3-5). However, it was observed that the F50A mutant partially inhibited the PTE-

4E binding. Surprisingly, this mutation promoted the binding interaction between F50A-4E and

TPAV-m2 nonfunctional PTE. Since this protein was purified using the GST tag, it was

uncertain whether this mutation compensates for the lack of G-C pseudoknot in TPAV-m2 or the

suspected oligomerization with GST influenced the EMSA results.

152

Mutants Q59A and S205A were predicted to have some involvement in PTE-4E interaction based on the docking model (Figure 3.10). However, TPAV-PTE EMSA binding patterns were similar to wild type (Table 3-5). Intriguingly, when the MTE was capped, a decreased binding affinity was observed. This mutation (S205A) is located in the flexible loop-3 near the cap-binding region (Figure 3.15), and the MTE-Ta4E model (Figure 3.12) predicted a potential docking interaction between S205 and G4247. The puzzling results could suggest that

MTE binds multiple points of interaction near the cap-binding flexible loops.

Similarly to mutants Q59A and S205A, mutant P209A did not affect the binding of capped TPAV-PTE and eIF4E. It was encouraging to see that in the absence of a cap, the protein binding affinity slightly decrease because this could indicate that the docking model was not far off (Figure 3.10). On the other hand, the MTE-Ta4E model (Figure 3.12) did not predict interaction with this residue, and yet, there was next to no binding (Table 3-5). However, the 3D docking model predicted MTE-4E interactions on residues S205 and G208, which are close to mutant P209A. Thus, if MCMV interacts with flexible loop 3, there is a chance that it comes in contact with this amino acid.

The binding activity of GST-purified proteins was unreliable because we were uncertain if the supposed dimerization of eIF4E and GST would affect the cap-binding site folding.

Although mutants W49A, R163A, W167 were predicted to inhibit cap-binding interactions, the wild type uncapped TPAV-PTE did not bind completely to GST-Ta4E. To verify that it was not an RNA probe problem, TPAV-PTE was incubated with m7GTP purified full-length Ta4E under the same conditions, and protein concentration and complete binding were observed. Since the

RNA binding for PTE-TPAV with GST-4Ewt was not 100%, a weak binding affinity could be

misinterpreted as inhibition when it is just a protein purification problem. Therefore, the EMSA 153

results using GST purified eIF4E mutant interactions reported in Table 3-5 need to be repeated as they are inconclusive.

Conclusion

A good molecular computational model can help us design more impactful mutants during lab-bench experimental design. Computational models not only provide amazing visuals but enable the researcher to integrate protein motion dynamics into experimental design. Proteins are marvelous molecules that are constantly in motion and re-arranging to perform diverse functions. They can dimerize to a friendly fusion tag, or they can undergo conformational changes that would allow it to go from RNA-protein to protein-protein interactions. When designing a disruptive protein mutant, there are a couple of questions one must ask: what is the

3D structure of my protein? Where is the location of my active site? Is my active site in a region that is likely to be involved in protein motion dynamics? How would the mutation affect structural folding and dynamics? For example, the effect of a modification will be dependent on the ionic charges, and the residue structure being replaced. In addition, a mutation will have a greater impact if it is located near a core region or a segment involved in the movement.

Many approaches can be taken while designing mutations in proteins for functional assay. In our case, we wanted to find mutations that would disrupt the PTE-eIF4E interaction. To design our mutants, we use data from previous publications (German-Retana et al., 2008; Wang et al., 2011; Wang et al., 2009) to identify locations of interest in eIF4E (Table 3-6). Also, we used simulation models to predict in silico interactions between eIF4E and our RNA of interest.

Experimental and 3D docking modeling data suggested the PTEs use their hyper-modified G located in the G-C pseudoknot to interact with eIF4E’s W62 and W108 amino acid residues

(Wang et al., 2011; Wang et al., 2009). Since we narrowed down the PTE-eIF4E interaction to the cap-biding site, we focused on naturally-occurring and engineer mutations in the eIF4E that 154

resulted in virus resistance. Most naturally-occurring resistance involves more than one amino acid (Bastet et al., 2017; German-Retana et al., 2008; Ruffel et al., 2002). When viruses

encounter resistance genes, they frequently seem to find ways to co-evolve with their host to find

ways around that. Since virus eradication is not always possible, one of the best ways to prolong

virus resistance is by targeting more than one protein-RNA virus interaction.

Wang et al. (2009) showed that double mutant W108L-W62L inhibited PTE-4E and

m7GTP binding activity. The presence of a single mutation only reduced PTE-4E binding, but the combination of both interacting amino acids resulted in complete inhibition. In this chapter, we wanted to identify mutants that would have a more significant effect on PTE-4E binding than m7GTP binding. The ideal host-resistance protein is the one that would prevent protein-virus interaction while allowing the host to continue business as usual. For this reason, we decided to screen for mutations for amino acid residues within and outside the cap-binding pocket to find the best combination that would be less detrimental for the host. For example, the first EMSA screening of the mutations described in this chapter suggests that mutants H67A, Q59A, P209A, and E109A should be further analyzed because they partially inhibited the PTE-eIF4E while maintaining some of the cap-binding activity.

The eIF4E gene family can be very redundant because there could be several different eIF4E isoforms and paralogs (Bastet et al., 2017; Dinkova et al., 2016; Joshi et al., 2005).

However, even though there could be several paralogs and eIF4E isoforms, it does not necessarily mean that they will be active at the same time (Dinkova et al., 2011; Kropiwnicka et al., 2015; Rodriguez et al., 1998). For example, during seed storage, eIFiso4E is expressed at higher levels than eIF4E, even though eIF4E levels are low in the mature seed, they increase upon germination (Dinkova et al., 2011). During germination and growth stages, almost all 155

eIF4E paralogs and isoforms are active (Dinkova et al., 2000; Dinkova and Sanchez de Jimenez,

1999; Lázaro-Mixteco and Dinkova, 2017; Xu et al., 2017), and this makes sense as the plants' needs develop essential organs that would ensure its survival in the environment. Because of the initial growth spurts, the plants rely on their translation factors to translate various proteins. It is also during these developmental stages that a plant can be more susceptible to viruses.

Since eIF4E has highly conserved regions among its structures, it is important to investigate the interaction mechanism between viral RNAs and host proteins. Sometimes in the absence of eIF4E, the virus can recruit eIFiso4E at lower rates. To investigate the binding affinity of MTE to eIF4E, we cloned maize eIF4E, eIFiso4E, and CBP into protein expression vectors. Unfortunately, we did not get to run binding assays on these proteins due to time limitations as we spend a lot of time troubleshooting and optimizing for wheat eIF4E constructs.

Since not many wheat eIF4E mutants affect MTE binding (Table 3-5), it will be interesting to see if binding affinities differ from wheat 4E interactions. The MTE element is highly conserved among MCMV strains and does not tolerate single mutations since it reverts to wild type almost

80% of the time (E.Carino’s observations). Although MCMV can infect other members of the

Gramineae family, it seems that symptoms are more visible in maize. If MCMV has developed an affinity for maize, it is expected that Zm4E-MTE binding affinity interactions would be higher than Ta4E-MTE interactions. Lastly, MTE looks like a PTE, but the 4E-MTE mutant binding interactions differ from the ones observed in TPAV-PTE-4E binding interactions. The differences in protein-MTE interactions could be attributed to the lack or weak pseudoknot in the

G-C domain. But there is also the question of whether all the currently known PTEs interact with eIF4E the same way. Since the G-C pseudoknot in PTEs (Figure 3.9) comes in different sizes, and it will be interesting to screen all the PTEs to see if all the eIF4E-PTE interactions are the 156

same. Although PTEs have similar structures, they might use a slightly different mechanism to

interact with eIF4E based on their pseudoknot properties.

For future steps, it is recommended to characterize these mutants further. A more

extensive search for virus resistance analogs (than the one observed in Table 3-6) would be

needed to identify the amino acid residues across involved in virus protein-protein interactions

(Vpgs) and RNA-protein interactions. Although the properties of protein-protein interactions are

different from RNA-protein interactions, there is a chance that there are amino acids that have

homologous proteins in different eIF4Es conferring resistance. For example, there are amino

acids associated with potyvirus resistance in W49wheat and G94wheat positions for lettuce,

Arabidopsis, and tomato. However, knowing the critical amino acids to mutate is not enough;

one must note the amino acid substitution. Not only would we be able to determine the atoms

interacting, but it could also provide us with information on interacting hydrogen bonds.

One of the main issues in this chapter was troubleshooting the formation of inclusion

bodies and hydrogen bonds that changed the solubility of the protein. Although we cannot avoid

working with insoluble proteins, some commercialized kits and resources could facilitate protein

expression. In this chapter, we used Tuner and Rosetta-gami cells that contain distinct genes that promoted the protein expression in the soluble fraction. Since each protein has its own physical and chemical properties, they all need to be taken into account when designing protein expression vectors. For example, when searching for fusion proteins, it is recommended to read the benefits and problems of such tags. In our case, we only became aware of the GST solubility issues reported in the literature only after we were having similar issues. If protein tags need to be used for purification, it will be recommended to use a tag that is smaller than the protein of

interest or to add the proper spacers between the protein-tag to prevent folding issues. Finally, 157

one should avoid using tags that are the same size as your protein; not only could it get confusing

during protein purification, but also it will increase the chances of oligomerization.

Materials and Methods

Construction of protein expression vectors

Wheat eIF4E-GST pGEX-5X-2 expression vectors. MS student, Melissa Sheber, constructed the pGEX-5X wheat eIF4E protein expression vector (Sheber, 2018). In essence, the amplified full-length sequence of Triticum aestivum eIF4E coding region from eIF4E-pET22b constructs obtained from Browning's lab (Monzingo et al., 2007) was inserted into the multi- cloning site of the pGEX-5X-2b vector between BamHI and SalI restriction sites. Colony PCR screening using eIF4E-specific primers was used to verify the presence of the insert. Positive bacteria colonies were sent to the Iowa State DNA facility for sequencing confirmation.

Sequencing chromatograms were analyzed to ensure that the eIF4E protein sequence was inserted in the correct frame. The eIF4E-pGEX vectors have a tac promoter follow by GST tag protein sequence, in frame factor Xa protease recognition site, 3 amino acid spacer, followed by wheat eIF4E (Z12616.2) protein sequence. For full details on the construction of pGEX-wheat eIF4E described in this chapter, refer to (Sheber, 2018).

Wheat eIF4E pET28a protein expression vectors. Former lab members constructed the pET28a wheat eIF4E His-tag expression vector used in this chapter (Treder et al., 2008; Wang et al., 2009). In essence, the wheat eIF4E gene was introduced into the pET28a vector by PCR amplification and recombination. One of the main differences from all the other full-length constructs is that the first eight amino acids of the protein are substituted by the six amino acids of the his-tag. Besides the replacement of the first eight amino acids, there are no other changes to the protein. The wheat pET28a eIF4E expression vector contains a T7 promoter followed by a 158

6-Histidine tag and the eIF4E sequence protein immediately. This expression vector uses

kanamycin as a selective agent.

Maize cap-binding protein expression vectors. A blast search was performed in the

B73 genome to identify the location of maize eIF4E genes. The mRNA and protein expression levels of eIF4E, eIFiso4E, and cap-binding protein (CBP) were analyzed using the eFB Atlas browser from maizegdb.org to identify the genes with the higher mRNA level abundance. The gene with the highest expression levels for each of these proteins was selected for cloning: eIF4E

(GRMZM2G002616-Chr3), eIFiso4E (GRMZM2G22019-Chr5), and CBP (Zm0001d028515-

Chr1). Because the mRNA levels for eIF4E, eIFiso4E, and CBP were high in seedlings, RNA

was isolated from 100 mg of leaf tissue of 8-day old B73 seedlings. The extracted RNA was used as the template for cDNA synthesis using Maxima first-strand cDNA synthesis kit (Thermo

Fisher). Polymerase chain reaction (PCR) was used to amplify cDNA gene inserts. PCR product size was verified using gel electrophoresis.

Gibson assembly and Gateway cloning were simultaneously performed to compare efficiency rate. For Gateway cloning, PCR insert was cloned into pENTR-D entry vector

(Thermo Fisher) following manufacturer's instructions. Colony PCR was performed to verify insert was present inside entry vector. Next, LR recombination was performed using pDEST17 as the destination vector (as recommended by the manufacturer). Recombined pDEST17 eIF4E, eIFiso4E, and CBP vectors were then transformed into E.coli calcium chloride competent cells.

Colonies were screen using PCR and restriction digest profiling. Colonies with the correct restriction digest profiling were sent for sequencing confirmation. For Gibson Assembly, specific primers were designed such that cDNA products contained overlapping regions with the destination vector, pET His6 MBP TEV Lic (Addgene plasmid #29656). The pET His6 MBP 159

TEV Lic cloning vector was linearized with the SspI enzyme. Gibson Assembly kit (New

England Biolabs) was used to assemble and ligate the pieces together. Assembled DNA was

transformed into calcium chloride competent cells, and colony PCR was performed to screen

colonies. Bacteria colonies that passed the PCR screening and contained the correct restriction

enzyme digest were sent for sequencing.

Site-directed mutagenesis. The eIF4E mutant constructs were generated using Q5-site

directed mutagenesis kit (New England Biolabs). Specific primers targeting specific amino acid

codon sequences were designed to change selected amino acids to Alanine (Table 3-7).

Following manufacturer's recommendations, 12.5 µL reactions were assembled on ice using the

appropriate forward and reserve primers. Thermocycling conditions were similar across the

primer sets (98°C-45 seconds, (98°C-10 seconds, 50-70°C-30 seconds, 72°C-3:30 minutes)x25 cycles, 72°C-2min) except for the annealing temperatures. Five microliters of PCR reaction were checked using gel electrophoresis amplicons with the correct size were then subjected to kinase, ligase, and DpnI (KLD) treatment for 15 minutes at room temperature. Treated KLD PCR products were then transformed into competent cells. Mutations were verified through sequencing analysis.

Protein inductions. Protein plasmid constructs were transformed into Tuner (DE3) pLysS or Rosetta-gami B (DE3) pLysS BL21 competent cells. For a medium scale purification, three 10-mL glass tubes containing 5-mL of LB media were inoculated with a single colony and incubated overnight at 37°C. Overnight bacteria cultures of the same constructs were pooled together and inoculated 1:40 overnight culture to new media. Cultures were incubated in temperature control rotatory shakers (37°C or 30°C) until OD600:0.4-0.6 was reached (~ 2-3 hours). A small 5mL aliquot was removed before induction to serve as a negative control. Induce 160

cells by adding x-amount of 1M isopropyl-β-D-1-thiogalactopyranoside (IPTG) such that the final concentration is 0.5mM (later experiments used 0.3mM). Bacteria cultures were placed back into the shaking incubator (37°C or 30°C) and incubated for 2 hours.

E.coli cell harvest and lysis. Upon induction incubation period completion, cells were harvest by centrifugation at 6,000xg for 15 minutes at 4°C. Cells were then quick-frozen or place at -80°C overnight. Next, cells were thawed and resuspended on lysis buffer. Lysis buffer varied based on downstream applications. If cells were to be purified using 7-methylguanosine beads,

B50 (20 mM HEPES-KOH pH7.6, 0.1 mM EDTA, 1 mM DTT, 10% glycerol, 50 mM KCl, protease inhibitor, 1mg/mL lysozyme) buffer was used. If proteins were to be purified using glutathione sepharose 4B beads, PBS (140 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM

KH2PO4, pH 7.3, 1 mg/mL lysozyme, protease inhibitor tablets) lysis buffer was used. Finally, if cobalt resin was used for purification, lysis equilibrium buffer was used (50mM sodium phosphate, 300mM sodium chloride, 10mM imidazole, pH 7.4, protease inhibitor tablets, 1 mg/mL lysozyme). Due to the lysozyme presence in the buffers, resuspended cell pellets were incubated on ice for 30 minutes prior to sonication. Bacteria cells were disrupted using sonication in 3 to 4 sets of 10 seconds sonication x 20 seconds pause in between sonications.

Sonicate solution was separated through centrifugation at 20,000xg for 15 minutes at 4°C.

Centrifugation steps were repeated until no bacterial pellet was observed in the supernatant.

Soluble supernatant was filtered thought a 45µM filter before a quick freeze, and storage at -

20°C or filtrate went directly to purification.

Glutathione sepharose purification. The glutathione sepharose beads (GE Healthcare) were equilibrated by washing them with 5-column volumes of binding buffer (140 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, pH 7.3). The bottom of the column was capped 161

and add protein lysate was added to the equilibrated beads. The top of the column close with a

cap and column was incubated for 30 minutes at 4°C with gently shaking. The bottom cap and

top column caps were removed after incubation and columns were centrifuged at 300xg for 5

minutes. The flowthrough was passed twice through the column to ensure protein capture. The

flowthrough was stored at -20°C until capture of the eIF4E-GST complex was verified through

protein gel electrophoresis. The glutathione beads were washed with at least 5-volumes of binding buffer three to four times or until flowthrough reached background levels. The column was allowed to drain; then, the bottom was capped. Next, the GST beads were washed with elution buffer (50 mM Tris-HCl, 10 mM reduced glutathione, pH 8.0) using 0.5 mL per 1mL of beads. The glutathione beads were incubated at room temperature for 10 minutes to elute GST-

4E protein. Elution steps were repeated until no protein was detected through the Bradford protein quantification assay (Bio-Rad). Buffer exchange and protein concentration was performed using dialysis spin columns such that proteins were concentrated in protein storage buffer (50% glycerol, 20 mM Tris-HCl). A small volume of eluted protein was aliquoted for quantification and quality analysis. Prior to storage, the proteins were quick-freeze and stored at -

80°C.

Glutathione beads were cleaned right after use following the manufacturer's recommendations. Precipitated and denatured substances were removed from the glutathione beads by washing the beads with 2 column volumes of 6M guanidine hydrochloride, quickly followed by 5-volumes of binding buffer. Then 3 column volumes of 70% ethanol were added to wash out hydrophobic substances, immediately followed by 5-volumes of binding buffer. The glutathione sepharose beads were stored in 20% ethanol at 4°C.

162

7-methylguanosine purification. The m7-GTP agarose beads were equilibrated with 5-

volumes of B100 buffer (20 mM HEPES-KOH pH7.6, 0.1 mM EDTA, 1 mM DTT, 10%

glycerol, 100 mM KCl). The flowthrough speed was adjested to 1mL/min. The lysate was

diluted with 5-volumes of B100 buffer then loaded into equilibrated m7-GTP agarose beads. The

flowthrough was stored at -20°C until eIF4E capture was verified through protein gel

electrophoresis. The gravity column was washed with 10-volumes of B100 buffer or until OD

reached baseline. The m7GTP beads were washed with Buffer 100 plus 100 µM GTP to remove

contaminant proteins. The protein was eluted by adding 500 µL of Buffer B100 plus 20 mM

GTP. The elution step was repeated until all the protein was recovered from the beads. All the

fractions with the highest protein concentration were pool together. Buffer exchange and protein

concentration were performed using dialysis spin columns such that proteins were concentrated

in protein storage buffer (50% glycerol, 20 mM Tris-HCl). A small volume of eluted protein was

aliquoted for quantification and quality analysis. Prior to storage, the proteins were quick-freeze

and stored at -80°C. For extended details, refer to (Mayberry et al., 2007).

His-Pur cobalt purification. The His-Pur cobalt resin (Thermo Fisher) was equilibrated with 2-volumes of equilibrium buffer (50mM sodium phosphate, 300mM sodium chloride,

10mM imidazole, pH 7.4). The His-Pur cobalt resin column was centrifuged for 2 minutes at

400xg. The protein lysate was diluted to a 1:1 ratio of lysate to equilibrium buffer. The spin- column bottom was capped, and the lysate was added to the cobalt resin. The cobalt resin containing the diluted protein lysate was incubated in an end-over rotator for 30 minutes at 4°C.

After incubation, cap seals were removed, and the column was centrifuge spin column for 3 minutes at 400xg. The flowthrough was passed through the column one more time to ensure protein capture. The flowthrough was stored at -20°C until protein gel analysis verified that all 163

protein was capture. The cobalt resin was washed with equilibrium buffer until absorbance

reached baseline. eIF4E protein was eluted by adding 0.5 resin-bed volumes of elution buffer

(50mM sodium phosphate, 300mM sodium chloride, 150mM imidazole; pH 7.4). The elution

step was repeated until all protein was recover. Buffer exchange and protein concentration was

performed using dialysis spin columns such that proteins were concentrated in protein storage

buffer (50% glycerol, 20 mM Tris-HCl). A small volume of eluted protein was aliquoted for

quantification and quality analysis. Prior to storage, the proteins were quick-freeze and stored at -

80°C.

The His-Pur cobalt resin columns were clean right after use following the manufacturer's recommendations. The resin was regenerated by washing cobalt resin with 10-resin-bed volumes of 20mM MES buffer, 0.1M sodium chloride, pH 5.0. A second wash was performed by adding

10-resin-bed volumes of ultrapure water. The resin was stored by adding 50% slurry in 20% ethanol at 4°C.

In vitro Transcription. DNA templates for EMSA probes were amplified using PCR.

Uncapped RNAs probes were generated using MEGAshortscript kit (Ambion) and mMESSAGE mMACHINE kit (Ambion) for cap RNAs. Two microliters of 72.608 mM (90.76% 32P isotope of an 800Ci/mmol stock in a 10mCi/mL dilution) ɑ32P CTP (Perkin Elmer, BLU008X250UC) and 1µL of 75mM of non-radioactive CTP were added to per each 20µL transcription reaction; the reminding reagents were kept on the ratio recommended by kit’s instructions. Probes were incubated at 37°C for 3 hours followed by DNAse treatment for 15 minutes. One microliter of the reaction was set aside for radioactive measurement counts. Nonincorporated radionuclides were removed from RNA probes using RNAse-free Bio-Spin columns P-30 (Bio-Rad). The radioactivity counts of one microliter of the clean probe were measured using counter 164

scintillation machine. Radiolabeled RNA concentration was calculated by measuring the amount

of incorporated radioisotope using a scintillation counter.

Electrophoretic mobility shift assay (EMSA). Calculated specific activities of probes were used to determine the amount of molarity of RNA in each purified stock. RNA probes were subjected to 6M TBE-urea gels electrophoresis to verify quality of RNA. RNA was diluted to 10 fmol/μL for EMSA assays. As previously described (Kraft et al., 2019), 32P RNA labeled probes were incubated with 250nM of protein in EMSA binding buffer. Protein and RNA-probes were incubated in a total volume of 10 μL of 1X EMSA binding buffer (10 mM HEPES pH 7.5, 20 mM KCl, 1 mM dithiothreitol, 3 mM MgCl2), 1 μg/μl yeast tRNA, 0.5 μg/μl bovine serum albumen, 1 unit/μl RNaseOUT ™ Recombinant Ribonuclease Inhibitor (Invitrogen), and tris-

10% glycerol for 25 minutes in ice. Then, 3 μL of 5% glycerol was added to each reaction.

Immediately after, 7μL of RNA-protein mixture were loaded into a 5% polyacrylamide

(acrylamide:bis-acrylamide 19:1), Tris-borate/EDTA (TBE) gel, which was run at ~4°C at 110V for 45 minutes in 0.5X cold TBE buffer. Gels were dried on Whatman 3MM paper and exposed to phosphorimager screen overnight. Phosphor screens were scanned in a Bio-Rad

PhosphorImager, and radioactivity counts were analyzed using Quantity One software (Bio-

Rad).

Multiple sequence protein alignments. Protein sequences were obtained from NCBI database using Blastp tool. For the sequence alignments of Figure 1, only plant eIF4E from reference sequences were selected for sequencing alignments. Extracted protein sequences were aligned using Cluster Omega (Sievers et al., 2011) software. Protein alignments to thread the common structures and dynamic features of Zea mays eIF4E were performed using the Bio3D R 165

package tools (Skjaerven et al., 2014). Multiple sequence alignments, structure superposition,

and principal component data obtained from Bio3D were used for downstream applications.

3D protein folding prediction. Two online tool were used to determine and compare the predicted 3D structure of maize eIF4E, I-TASSER (Zhang, 2008) and Phyre2 (Kelley et al.,

2015). Information gathered from Bio3D analysis was used to assign constraints and template

guides to I-TASSER modeling software. The 2IDR_A chain PDB structure was used as a template structure. All other I-TASSER parameters were left as recommended in the Online

server. Phyre2 settings were set to 'Intensive' modeling construction parameters. PDB generated

files were visualized in PyMol software and analysis report were saved in PDF files.

Protein-ligand Docking. AutoDock (Morris et al., 2009) and AutoDock Vina (Trott and

Olson, 2010) suite packages were used to predict binding receptors between 7-methylguanosine

triphosphate and predicted maize eIF4E. The ligand (7-methylguanosine triphosphate) file

information was obtained from DrugBank ID: DB01960. AutoDock was used to assign the grid

area where the program would search for ligand binding. Grid area location was set

approximately around the cap-binding pocket, as observed in the 2IDV wheat eIF4E structure

(Monzingo et al., 2007). Center grid box coordinates (center_x=7, center_y=5, and center_z=31)

were noted and saved for downstream applications. Files were saved with .pdbqt extension for

AutoDock Vina simulations. Vina docking simulation using (m7G.pdbqt) and protein

(Zm4Eprotein.pdbqt) files were performed using the following coordinates: center_x=7,

center_y=5, center_z=31, size_y=28, size_x=26, size_z=28. Vina docking outputs were

visualized using PyMol. Maize eIF4E and m7GTP binding locations were compared to the

homologous superimposed locations found in 2IDV pdb file coordinates (Monzingo et al., 2007).

Identified binding locations reported in Figure 2. 166

Protein superimposition. Bio3D software searched identified similar PDB sequences in

the Research Collaboratory for Structural Bioinformatics (RCSB) as maize eIF4E; top structures

were selected for analysis (bitscore cutoff was 330). Protein structures with the highest fit were

used for superimposition in PyMol. The easiest way to superimpose sequences in PyMol is to

open PDB files of interest in PyMol software, in the command line, write "align filename,

filename." For Figure 2 image, superimposition the command line was "align, 2IDR, 2WMC,

Zm4E". More in dept principal component analysis (PCA) was performed using Bio3D software.

RNA 3D structure predictions. SimRNA webserver (Magnus et al., 2016) was used to

predict RNA 3D structure. RNA sequence was uploaded into the server containing SHAPE

probing restrictions. Simulation steps were set to 1,000. Over 500 structures were analyzed, and

the best-fit PDB file was saved for downstream applications.

RNA-protein docking. NPDock (Tuszynska et al., 2015a) was used to generate a nucleic acid-protein interaction model. Protein and RNA PDB files were uploaded into the server, for T. aestivum PDB file (2IDV) A-chain was analyzed. Parameters were set as follow: 50,000 decoys to be generated with GRAMM, Protein interface: location of amino acid located in the cap- binding pocket, number of residues of interface contact was left as default, the top 500 best score models were used for clustering, RMSD cut-off was set to 5, simulation steps were set to 1,000 and temperature of final step was set to 299 Kelvin. PyMol was used to analyze and visualize

PDB output files.

Analysis of RNA-protein interaction simulations was performed using PyMol. In short,

PDB simulation was loaded into PyMol software, RNA sequence, and protein sequences were saved as their own separate selections. Generation of separate selections can be achieved by selecting the nucleotide sequence on the sequence display, then select 'Action' (A)> 'rename'> 167 name of protein/RNA > 'enter' key. If the model structure is not is already in cartoon form, then go to 'All'> 'show' (S)> 'cartoon'. Color-code RNA and Protein structures by element or prefer color, to do this, go to 'color' (C) > select color scheme desired. On PyMol command line type

"show sticks, byres all within 5 of 'ligand/RNA name'; this command line changes all potential the amino acid and nucleotide interacting. Atomic interaction can be visualized by selecting

‘action’ (A) > ‘find’ > ‘polar contacts’ > ‘to any atoms.’ To add hydrogen bonds go to ‘action’

(A) > ‘find’ >. ‘hydrogens’ > ‘add’. Zoom in and check for polar interactions, add residue labels so that they are easy to identify; revert to cartoon form all the amino acids or nucleotides that do not interact. To change the visualization of a single sequence section without affecting the complete model, select the sequence of interest, right-click > 'show'> 'as'> 'cartoon/or sticks/or surface, etc.

References

Ala-Poikela, M., Rajamaki, M.L., Valkonen, J.P.T., 2019. A Novel Interaction Network Used by Potyviruses in Virus-Host Interactions at the Protein Level. Viruses 11, 1158. Allen, M.L., Metz, A.M., Timmer, R.T., Rhoads, R.E., Browning, K.S., 1992. Isolation and sequence of the cDNAs encoding the subunits of the isozyme form of wheat protein synthesis initiation factor 4F. J Biol Chem 267, 23232-23236. Andreev, D.E., O'Connor, P.B., Loughran, G., Dmitriev, S.E., Baranov, P.V., Shatsky, I.N., 2017. Insights into the mechanisms of eukaryotic translation gained with ribosome profiling. Nucleic Acids Res 45, 513-526. Ashby, J.A., Stevenson, C.E., Jarvis, G.E., Lawson, D.M., Maule, A.J., 2011. Structure-based mutational analysis of eIF4E in relation to sbm1 resistance to pea seed-borne mosaic virus in pea. PLoS One 6, e15873. Bastet, A., Robaglia, C., Gallois, J.-L., 2017. eIF4E Resistance: Natural Variation Should Guide Gene Editing. Trends in Plant Science 22, 411-419. Batten, J.S., Desvoyes, B., Yamamura, Y., Scholthof, K.B., 2006. A translational enhancer element on the 3'-proximal end of the Panicum mosaic virus genome. FEBS Lett 580, 2591- 2597. Bell, M.R., Engleka, M.J., Malik, A., Strickler, J.E., 2013. To fuse or not to fuse: what is your purpose? Protein Sci 22, 1466-1477. 168

Bhardwaj, U., Powell, P., Goss, D.J., 2019. Eukaryotic initiation factor (eIF) 3 mediates Barley Yellow Dwarf Viral mRNA 3'-5' UTR interactions and 40S ribosomal subunit binding to facilitate cap-independent translation. Nucleic Acids Res 47, 6225-6235. Browning, K.S., Maia, D.M., Lax, S.R., Ravel, J.M., 1987. Identification of a new protein synthesis initiation factor from wheat germ. J Biol Chem 262, 538-541. Bruun-Rasmussen, M., Moller, I.S., Tulinius, G., Hansen, J.K., Lund, O.S., Johansen, I.E., 2007. The same allele of translation initiation factor 4E mediates resistance against two Potyvirus spp. in Pisum sativum. Mol Plant Microbe Interact 20, 1075-1082. Carberry, S.E., Rhoads, R.E., Goss, D.J., 1989. A spectroscopic study of the binding of m7GTP and m7GpppG to human protein synthesis initiation factor 4E. Biochemistry 28, 8078-8083. Cavatorta, J., Perez, K.W., Gray, S.M., Van Eck, J., Yeam, I., Jahn, M., 2011. Engineering virus resistance using a modified potato gene. Plant Biotechnol J 9, 1014-1021. Chu, I.T., Speer, S.L., Pielak, G.J., 2020. Rheostatic Control of Protein Expression Using Tuner Cells. Biochemistry 59, 733-735. Cui, H., Wang, A., 2017. An efficient viral vector for functional genomic studies of Prunus fruit trees and its induced resistance to Plum pox virus via silencing of a host factor gene. Plant Biotechnol J 15, 344-356. David, C.C., Jacobs, D.J., 2011. Characterizing protein motions from structure. J Mol Graph Model 31, 41-56. Deceglie, S., Lionetti, C., Roberti, M., Cantatore, P., Loguercio Polosa, P., 2012. A modified method for the purification of active large enzymes using the glutathione S-transferase expression system. Anal Biochem 421, 805-807. Deng, H., Jia, Y., Zhang, Y., 2018. Protein structure prediction. Int J Mod Phys B 32, 1840009. Dinkova, T.D., Aguilar, R., Sanchez de Jimenez, E., 2000. Expression of maize eukaryotic initiation factor (eIF) iso4E is regulated at the translational level. Biochem J 351 Pt 3, 825-831. Dinkova, T.D., Márquez-Velázquez, N.A., Aguilar, R., Lázaro-Mixteco, P.E., Sánchez de Jiménez, E., 2011. Tight translational control by the initiation factors eIF4E and eIF(iso)4E is required for maize seed germination. Seed Science Research 21, 85-93. Dinkova, T.D., Martinez-Castilla, L., Cruz-Espíndola, M.A., 2016. The Diversification of eIF4E Family Members in Plants and Their Role in the Plant-Virus Interaction, Evolution of the Protein Synthesis Machinery and Its Regulation. Springer International Publishing, pp. 187-205. Dinkova, T.D., Sanchez de Jimenez, E., 1999. Differential expression and regulation of translation initiation factors -4E and -iso4E during maize germination. Physiologia Plantarum 107, 419-425. Du, Z., Alekhina, O.M., Vassilenko, K.S., Simon, A.E., 2017. Concerted action of two 3' cap- independent translation enhancers increases the competitive strength of translated viral genomes. Nucleic Acids Res 45, 9558-9572. 169

Fathi-Roudsari, M., Akhavian-Tehrani, A., Maghsoudi, N., 2016. Comparison of Three Escherichia coli Strains in Recombinant Production of Reteplase. Avicenna J Med Biotechnol 8, 16-22. Feinstein, W.P., Brylinski, M., 2015. Calculating an optimal box size for ligand docking and virtual screening against experimental and predicted binding pockets. J Cheminform 7, 18. Fu, D.Y., Meiler, J., 2018. RosettaLigandEnsemble: A Small-Molecule Ensemble-Driven Docking Approach. ACS Omega 3, 3655-3664. Gallois, J.L., Charron, C., Sanchez, F., Pagny, G., Houvenaghel, M.C., Moretti, A., Ponz, F., Revers, F., Caranta, C., German-Retana, S., 2010. Single amino acid changes in the turnip mosaic virus viral genome-linked protein (VPg) confer virulence towards Arabidopsis thaliana mutants knocked out for eukaryotic initiation factors eIF(iso)4E and eIF(iso)4G. J Gen Virol 91, 288-293. Gao, L., Luo, J., Ding, X., Wang, T., Hu, T., Song, P., Zhai, R., Zhang, H., Zhang, K., Li, K., Zhi, H., 2020. Soybean RNA interference lines silenced for eIF4E show broad potyvirus resistance. Mol Plant Pathol 21, 303-317. Gauffier, C., Lebaron, C., Moretti, A., Constant, C., Moquet, F., Bonnet, G., Caranta, C., Gallois, J.L., 2016. A TILLING approach to generate broad-spectrum resistance to potyviruses in tomato is hampered by eIF4E gene redundancy. Plant J 85, 717-729. German-Retana, S., Walter, J., Doublet, B., Roudet-Tavert, G., Nicaise, V., Lecampion, C., Houvenaghel, M.C., Robaglia, C., Michon, T., Le Gall, O., 2008. Mutational analysis of plant cap-binding protein eIF4E reveals key amino acids involved in biochemical functions and potyvirus infection. J Virol 82, 7601-7612. Gomez, M.A., Lin, Z.D., Moll, T., Chauhan, R.D., Hayden, L., Renninger, K., Beyene, G., Taylor, N.J., Carrington, J.C., Staskawicz, B.J., Bart, R.S., 2019. Simultaneous CRISPR/Cas9- mediated editing of cassava eIF4E isoforms nCBP-1 and nCBP-2 reduces cassava brown streak disease symptom severity and incidence. Plant Biotechnol J 17, 421-434. Grant, B.J., Rodrigues, A.P., ElSawy, K.M., McCammon, J.A., Caves, L.S., 2006. Bio3d: an R package for the comparative analysis of protein structures. Bioinformatics 22, 2695-2696. Guarnera, E., Berezovsky, I.N., 2016. Allosteric sites: remote control in regulation of protein activity. Curr Opin Struct Biol 37, 1-8. Harper, S., Speicher, D.W., 2008. Expression and purification of GST fusion proteins. Curr Protoc Protein Sci Chapter 6, Unit 6 6. Hellman, L.M., Fried, M.G., 2007. Electrophoretic mobility shift assay (EMSA) for detecting protein-nucleic acid interactions. Nat Protoc 2, 1849-1861. Hernandez, G., Vazquez-Pianzola, P., 2005. Functional diversity of the eukaryotic translation initiation factors belonging to eIF4 families. Mech Dev 122, 865-876. Hinnebusch, A.G., 2014. The scanning mechanism of eukaryotic translation initiation. Annu Rev Biochem 83, 779-812. 170

Hinnebusch, A.G., 2017. Structural Insights into the Mechanism of Scanning and Start Codon Recognition in Eukaryotic Translation Initiation. Trends Biochem Sci 42, 589-611. Huang, S.Y., 2014. Search strategies and evaluation in protein-protein docking: principles, advances and challenges. Drug Discov Today 19, 1081-1096. Hwang, J., Li, J., Liu, W.Y., An, S.J., Cho, H., Her, N.H., Yeam, I., Kim, D., Kang, B.C., 2009. Double mutations in eIF4E and eIFiso4E confer recessive resistance to Chilli veinal mottle virus in pepper. Mol Cells 27, 329-336. Jiang, J., Laliberte, J.F., 2011. The genome-linked protein VPg of plant viruses-a protein with many partners. Curr Opin Virol 1, 347-354. Joshi, B., Lee, K., Maeder, D.L., Jagus, R., 2005. Phylogenetic analysis of eIF4E-family members. BMC Evol Biol 5, 48. Kanyuka, K., Druka, A., Caldwell, D.G., Tymon, A., McCallum, N., Waugh, R., Adams, M.J., 2005. Evidence that the recessive bymovirus resistance locus rym4 in barley corresponds to the eukaryotic translation initiation factor 4E gene. Mol Plant Pathol 6, 449-458. Keima, T., Hagiwara-Komoda, Y., Hashimoto, M., Neriya, Y., Koinuma, H., Iwabuchi, N., Nishida, S., Yamaji, Y., Namba, S., 2017. Deficiency of the eIF4E isoform nCBP limits the cell- to-cell movement of a plant virus encoding triple-gene-block proteins in Arabidopsis thaliana. Sci Rep 7, 39678. Kelley, L.A., Mezulis, S., Yates, C.M., Wass, M.N., Sternberg, M.J., 2015. The Phyre2 web portal for protein modeling, prediction and analysis. Nat Protoc 10, 845-858. Khrennikov, A., Yurova, E., 2017. Automaton model of protein: Dynamics of conformational and functional states. Prog Biophys Mol Biol 130, 2-14. Kraft, J.J., Peterson, M.S., Cho, S.K., Wang, Z., Hui, A., Rakotondrafara, A.M., Treder, K., Miller, C.L., Miller, W.A., 2019. The 3' Untranslated Region of a Plant Viral RNA Directs Efficient Cap-Independent Translation in Plant and Mammalian Systems. Pathogens 8, 28. Kropiwnicka, A., Kuchta, K., Lukaszewicz, M., Kowalska, J., Jemielity, J., Ginalski, K., Darzynkiewicz, E., Zuberek, J., 2015. Five eIF4E isoforms from Arabidopsis thaliana are characterized by distinct features of cap analogs binding. Biochemical and biophysical research communications 456, 47-52. Lázaro-Mixteco, P.E., Dinkova, T.D., 2017. Identification of Proteins from Cap-Binding Complexes by Mass Spectrometry During Maize (Zea mays L.) Germination. Journal of the Mexican Chemical Society 56. Leonard, S., Viel, C., Beauchemin, C., Daigneault, N., Fortin, M.G., Laliberte, J.F., 2004. Interaction of VPg-Pro of turnip mosaic virus with the translation initiation factor 4E and the poly(A)-binding protein in planta. J Gen Virol 85, 1055-1063. Li, D., Ji, F., Huang, C., Jia, L., 2019. High Expression Achievement of Active and Robust Anti- beta2 microglobulin Nanobodies via E.coli Hosts Selection. Molecules 24, 2860. 171

Liu, Q., Goss, D.J., 2018. The 3' mRNA I-shaped structure of maize necrotic streak virus binds to eukaryotic translation factors for eIF4F-mediated translation initiation. J Biol Chem 293, 9486-9495. Madan, B., Kasprzak, J.M., Tuszynska, I., Magnus, M., Szczepaniak, K., Dawson, W.K., Bujnicki, J.M., 2016. Modeling of Protein–RNA Complex Structures Using Computational Docking Methods. Springer New York, pp. 353-372. Magnus, M., Boniecki, M.J., Dawson, W., Bujnicki, J.M., 2016. SimRNAweb: a web server for RNA 3D structure modeling with optional restraints. Nucleic Acids Res 44, W315-319. Manjunath, S., Williams, A.J., Bailey-Serres, J., 1999. Oxygen deprivation stimulates Ca2+- mediated phosphorylation of mRNA cap-binding protein eIF4E in maize roots. Plant J 19, 21-30. Mann, C.M., Muppirala, U.K., Dobbs, D., 2017. Computational Prediction of RNA-Protein Interactions. Springer New York, pp. 169-185. Marcotrigiano, J., Gingras, A.C., Sonenberg, N., Burley, S.K., 1997. Cocrystal structure of the messenger RNA 5' cap-binding protein (eIF4E) bound to 7-methyl-GDP. Cell 89, 951-961. Maru, Y., Afar, D.E., Witte, O.N., Shibuya, M., 1996. The dimerization property of glutathione S-transferase partially reactivates Bcr-Abl lacking the oligomerization domain. J Biol Chem 271, 15353-15357. Mayberry, L.K., Dennis, M.D., Leah Allen, M., Ruud Nitka, K., Murphy, P.A., Campbell, L., Browning, K.S., 2007. Expression and purification of recombinant wheat translation initiation factors eIF1, eIF1A, eIF4A, eIF4B, eIF4F, eIF(iso)4F, and eIF5. Methods Enzymol 430, 397- 408. Miras, M., Truniger, V., Querol-Audi, J., Aranda, M.A., 2017a. Analysis of the interacting partners eIF4F and 3'-CITE required for Melon necrotic spot virus cap-independent translation. Mol Plant Pathol 18, 635-648. Miras, M., Truniger, V., Silva, C., Verdaguer, N., Aranda, M.A., Querol-Audi, J., 2017b. Structure of eIF4E in Complex with an eIF4G Peptide Supports a Universal Bipartite Binding Mode for Protein Translation. Plant Physiol 174, 1476-1491. Mishra, S.K., Kandoi, G., Jernigan, R.L., 2019. Coupling dynamics and evolutionary information with structure to identify protein regulatory and functional binding sites. Proteins 87, 850-868. Monzingo, A.F., Dhaliwal, S., Dutt-Chaudhuri, A., Lyon, A., Sadow, J.H., Hoffman, D.W., Robertus, J.D., Browning, K.S., 2007. The structure of eukaryotic translation initiation factor-4E from wheat reveals a novel disulfide bond. Plant Physiol 143, 1504-1518. Morales, E.S., Parcerisa, I.L., Ceccarelli, E.A., 2019. A novel method for removing contaminant Hsp70 molecular chaperones from recombinant proteins. Protein Sci 28, 800-807. Morris, G.M., Huey, R., Lindstrom, W., Sanner, M.F., Belew, R.K., Goodsell, D.S., Olson, A.J., 2009. AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility. J Comput Chem 30, 2785-2791. 172

Moury, B., Lebaron, C., Szadkowski, M., Ben Khalifa, M., Girardot, G., Bolou Bi, B.A., Kone, D., Nitiema, L.W., Fakhfakh, H., Gallois, J.L., 2020. Knock-out mutation of eukaryotic initiation factor 4E2 (eIF4E2) confers resistance to pepper veinal mottle virus in tomato. Virology 539, 11- 17. Nicaise, V., German-Retana, S., Sanjuan, R., Dubrana, M.P., Mazier, M., Maisonneuve, B., Candresse, T., Caranta, C., LeGall, O., 2003. The eukaryotic translation initiation factor 4E controls lettuce susceptibility to the Potyvirus Lettuce mosaic virus. Plant Physiol 132, 1272- 1282. Nicholson, B.L., Wu, B., Chevtchenko, I., White, K.A., 2010. Tombusvirus recruitment of host translational machinery via the 3' UTR. Rna 16, 1402-1419. Nieto, C., Morales, M., Orjeda, G., Clepet, C., Monfort, A., Sturbois, B., Puigdomenech, P., Pitrat, M., Caboche, M., Dogimont, C., Garcia-Mas, J., Aranda, M.A., Bendahmane, A., 2006. An eIF4E allele confers resistance to an uncapped and non-polyadenylated RNA virus in melon. Plant J 48, 452-462. Novagen, E., 2010. Competent Cells: What a difference a strain makes, in: Chemicals, E. (Ed.), Competent cells, EMD. Piatkowski, P., Kasprzak, J.M., Kumar, D., Magnus, M., Chojnowski, G., Bujnicki, J.M., 2016. RNA 3D Structure Modeling by Combination of Template-Based Method ModeRNA, Template- Free Folding with SimRNA, and Refinement with QRNAS. Methods Mol Biol 1490, 217-235. Rodriguez-Hernandez, A.M., Gosalvez, B., Sempere, R.N., Burgos, L., Aranda, M.A., Truniger, V., 2012. Melon RNA interference (RNAi) lines silenced for Cm-eIF4E show broad virus resistance. Mol Plant Pathol 13, 755-763. Rodriguez, C.M., Freire, M.A., Camilleri, C., Robaglia, C., 1998. The Arabidopsis thaliana cDNAs coding for eIF4E and eIF(iso)4E are not functionally equivalent for yeast complementation and are differentially expressed during plant development. Plant J 13, 465-473. Ruffel, S., Dussault, M.-H., Palloix, A., Moury, B., Bendahmane, A., Robaglia, C., Caranta, C., 2002. A natural recessive resistance gene against potato virus Y in pepper corresponds to the eukaryotic initiation factor 4E (eIF4E). The Plant Journal 32, 1067-1075. Ruud, K.A., Kuhlow, C., Goss, D.J., Browning, K.S., 1998. Identification and characterization of a novel cap-binding protein from Arabidopsis thaliana. J Biol Chem 273, 10325-10330. Saha, S., Makinen, K., 2020. Insights into the Functions of eIF4E-Biding Motif of VPg in Potato Virus A Infection. Viruses 12, 197. Sankar, K., Mishra, S.K., Jernigan, R.L., 2018. Comparisons of Protein Dynamics from Experimental Structure Ensembles, Molecular Dynamics Ensembles, and Coarse-Grained Elastic Network Models. J Phys Chem B 122, 5409-5417. Scheets, K., 2000. Maize chlorotic mottle machlomovirus expresses its coat protein from a 1.47- kb subgenomic RNA and makes a 0.34-kb subgenomic RNA. Virology 267, 90-101. Scheets, K., 2016. Analysis of gene functions in Maize chlorotic mottle virus. Virus Res 222, 71- 79. 173

Schmitt-Keichinger, C., 2019. Manipulating Cellular Factors to Combat Viruses: A Case Study From the Plant Eukaryotic Translation Initiation Factors eIF4. Front Microbiol 10, 17. Sheber, M.K., 2018. Point mutational examination of eukaryotic initiation factor 4E (eIF4E), inclusion bodies analysis, and Barley Yellow Dwarf Virus (BYDV) exonuclease resistant RNA structure mapping and prediction, Plant Pathology and Microbiology. Iowa State University, Graduate Theses and Dissertations, p. 100. Sievers, F., Wilm, A., Dineen, D., Gibson, T.J., Karplus, K., Li, W., Lopez, R., McWilliam, H., Remmert, M., Soding, J., Thompson, J.D., Higgins, D.G., 2011. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7, 539. Simon, A.E., Miller, W.A., 2013. 3' cap-independent translation enhancers of plant viruses. Annu Rev Microbiol 67, 21-42. Skjaerven, L., Yao, X.Q., Scarabelli, G., Grant, B.J., 2014. Integrating protein structural dynamics and evolutionary analysis with Bio3D. BMC Bioinformatics 15, 399. Song, K., Zhang, J., 2018. Single Binding Pockets Versus Allosteric Binding. Methods Mol Biol 1825, 295-326. Stocker, B.K., Koster, J., Zamir, E., Rahmann, S., 2018. Modeling and simulating networks of interdependent protein interactions. Integr Biol (Camb) 10, 290-305. Tavert-Roudet, G., Abdul-Razzak, A., Doublet, B., Walter, J., Delaunay, T., German-Retana, S., Michon, T., Le Gall, O., Candresse, T., 2012. The C terminus of lettuce mosaic potyvirus cylindrical inclusion helicase interacts with the viral VPg and with lettuce translation eukaryotic initiation factor 4E. J Gen Virol 93, 184-193. Treder, K., Kneller, E.L., Allen, E.M., Wang, Z., Browning, K.S., Miller, W.A., 2008. The 3' cap-independent translation element of Barley yellow dwarf virus binds eIF4F via the eIF4G subunit to initiate translation. Rna 14, 134-147. Trott, O., Olson, A.J., 2010. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J Comput Chem 31, 455-461. Truniger, V., Miras, M., Aranda, M.A., 2017. Structural and Functional Diversity of Plant Virus 3'-Cap-Independent Translation Enhancers (3'-CITEs). Frontiers in plant science 8, 2047. Truniger, V., Nieto, C., Gonzalez-Ibeas, D., Aranda, M., 2008. Mechanism of plant eIF4E- mediated resistance against a Carmovirus (Tombusviridae): cap-independent translation of a viral RNA controlled in cis by an (a)virulence determinant. Plant J 56, 716-727. Tuszynska, I., Magnus, M., Jonak, K., Dawson, W., Bujnicki, J.M., 2015. NPDock: a web server for protein-nucleic acid docking. Nucleic Acids Res 43, W425-430. Uhl, L., Dumont, A., Dukan, S., 2018. A passive physical model for DnaK chaperoning. Phys Biol 15, 026003. Wang, X., Kohalmi, S.E., Svircev, A., Wang, A., Sanfacon, H., Tian, L., 2013. Silencing of the host factor eIF(iso)4E gene confers plum pox virus resistance in plum. PLoS One 8, e50627. 174

Wang, Z., Parisien, M., Scheets, K., Miller, W.A., 2011. The cap-binding translation initiation factor, eIF4E, binds a pseudoknot in a viral cap-independent translation element. Structure 19, 868-880. Wang, Z., Treder, K., Miller, W.A., 2009. Structure of a viral cap-independent translation element that functions via high affinity binding to the eIF4E subunit of eIF4F. J Biol Chem 284, 14189-14202. Wittmann, S., Chatel, H., Fortin, M.G., Laliberte, J.F., 1997. Interaction of the viral protein genome linked of turnip mosaic potyvirus with the translational eukaryotic initiation factor (iso) 4E of Arabidopsis thaliana using the yeast two-hybrid system. Virology 234, 84-92. Xu, M., Xie, H., Wu, J., Xie, L., Yang, J., Chi, Y., 2017. Translation Initiation Factor eIF4E and eIFiso4E Are Both Required for Peanut stripe virus Infection in Peanut (Arachis hypogaea L.). Front Microbiol 8, 338. Yan, Y., Zhang, D., Zhou, P., Li, B., Huang, S.Y., 2017. HDOCK: a web server for protein- protein and protein-DNA/RNA docking based on a hybrid strategy. Nucleic Acids Res 45, W365-W373. Yanagiya, A., Suyama, E., Adachi, H., Svitkin, Y.V., Aza-Blanc, P., Imataka, H., Mikami, S., Martineau, Y., Ronai, Z.A., Sonenberg, N., 2012. Translational homeostasis via the mRNA cap- binding protein, eIF4E. Mol Cell 46, 847-858. Yang, J., Zhang, Y., 2015. Protein Structure and Function Prediction Using I-TASSER. Curr Protoc Bioinformatics 52, 5 8 1-5 8 15. Zagrovic, B., Bartonek, L., Polyansky, A.A., 2018. RNA-protein interactions in an unstructured context. FEBS Lett 592, 2901-2916. Zhang, Y., 2008. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics 9, 40-40.

175

Figures

Figure 3.1. Sequence alignment of plant eIF4E proteins. Amino acids in bold orange color indicate the amino acids identified to interact with 7-methylguanosine 5'-cap based on their crystal structure. The red boxes indicate the highly conserved amino acids in the cap-binding pocket of eIF4E. The blue boxes indicate other regions where at least 2/3 eIF4E crystal structures were conserved. Multiple sequence alignment of reference eIF4E proteins using the Cluster Omega alignment program (Sievers et al., 2011). Protein fasta sequences were obtained from the NCBI database. 176

Figure 3.2. Triticum aestivum eIF4E (2IDR) 3D model. Cartoon visualization of T. aestivum eIF4E 3D model with close up images of the cap-binding pocket. Stick structure of amino acids identified to interact with 7-methylguanosine triphosphate are colored in orange. PyMol software was used to visualize the PDB-2IDR crystal structure.

Figure 3.3. Wheat eIF4E conformational changes upon interaction with eIF4E simulation. The data of the crystal structure of melon eIF4E bound to eIF4G (Miras et al., 2017) was used to visualize the conformational structural changes on the eIF4E 3D structure. Cap-binding amino residues are colored in orange. The flexible loop (I112-A114) that forms a helix upon eIF4G binding interaction is shown in purple. Amino acid residues that create disulfide bonds upon eIF4E-eIF4G interaction are shown in light green. Top images show the cartoon illustration of the 3D model, while the bottom images show the surface view of the model. Files were visualized using PyMol. 177

Figure 3.4. Structure sequence alignment and superimposition of Triticum Aestivum and Pisum sativum eIF4E structures. A) Sequence alignment overview. In the dendrogram, the sequence groups are colored in grey and gap positions are indicated by the white segments. B) Sequence identity clustering dendrogram. Each sequence cluster is colored differently; cluster 1 (red), cluster 2 (black), cluster 4 (blue), and cluster 4 (green). C) Superimposed PBDs. The alignment positions are colored from N-terminal (grey) to C-terminal (red).

Figure 3.5.Simulated Zea mays eIF4E 3D protein structure. I-TASSER suite tools generated the predicted PDB file of Z. mays eIF4E by threading known PDB protein structures and using T. aestivum 2IDR-A folding coordinates as a restriction. Amino acids located in the 7- methylguanosine triphosphate 5'cap binding site are indicated in orange. The top panel shows the surface view of the protein, while the bottom plane shows the cartoon view of the 3D model.

178

Figure 3.6. Vina dock log files and ligand conformations of wheat eIF4E and maize eIF4E. A) Wheat eIF4E (2IDR-A) and m7GTP Vina docking results. Vina dock log file on the left and ligand conformations on the right. The white number on the left corner of each conformation matches the mode log values on the left table. The structure at the far right corresponds to the 2IDV-m7GTP complex, as published in Monzingo et al. (2007). B) Zea mays eIF4E predicted structure and m7GTP Vina docking results. Vina dock log file on the left and ligand conformations on the right. The black number on the left corner of each conformation matches the mode log values on the left table. The structure at the far right corresponds to the best-fitted docking interaction.

179

Figure 3.7. Superimposition of eIF4E crystal structures of plant proteins and predicted Zea mays. Superimposition of PDB crystal structures of Pisum sativum (pea) and Triticum aestivum (wheat) against predicted Zea mays (maize) eIF4E protein structure. Superimposition showed in cartoon form from different angles. The crystal protein structures of P. sativum and T. aestivum are missing the first 30 amino acids at the N-terminus. B) Surface view of eIF4E 3D structures. Orange color represents the amino acids involved in cap-binding interaction with 7- methylguanosine. The known 3D structure of wheat eIF4E (PDB:2IDR) is indicated in blue, while pea eIF4E (PDB: 2WMC) is shown in purple. The predicted structure of maize eIF4E is indicated in green. PDB files were analyzed using PyMol software.

180

Figure 3.8. Pea Enation mosaic virus 2 (PEMV2) 3'-CITE 3D model prediction. SimRNA software (Piatkowski et al., 2016) was used to predict PEMV2 3D structure based on SHAPE data (Wang et al., 2011). C-domain is colored in orange while G-domain is colored in red. Predicted nucleotides that create the pseudoknot between the G-bulge and the connecting bridge between side loop 1 and 2 are indicated with grey dash lines. PEMV2 nucleotides 3829-3899 were used for simulation. The simulation was set to 1,000 steps. The final cluster prediction is shown on the left. PDB file visualization performed in PyMol. Secondary structure-based on SHAPE data indicated on the right. 181

Figure 3.9. Secondary structures of known and predicted PTEs. The conserved interacting sequences with the 5' hairpin sequence are inside black boxes. Predicted nucleotides that create the pseudoknot between the G-bulge and the connecting bridge between side loop 1 and 2 are indicated with grey dash lines.

182

Figure 3.10. Triticum aestivum eIF4E and PEMV2 protein-RNA docking simulation. Hydrogen bond interactions from the 3D protein-RNA interaction model published in (Wang et al., 2011) were analyzed. The table lists predicted RNA-protein interactions based on the simulated model. The bottom left corner shows the PEMV2-wheat eIF4E predicted model, as published in (Wang et al., 2011). The right panel shows close-up photos of the predicted interactions.

183

Figure 3.11. Maize chlorotic mottle virus 3'CITE (MTE) 3D predicted model. SimRNA software (Piatkowski et al., 2016) was used to predict MTE 3D structure based on SHAPE data (Refer to Chapter 2). Maize chlorotic mottle virus (MCMV) nucleotides 4200-4300 were used for simulation. The simulation was set to 1,000 steps. The final cluster prediction is shown on the left. PDB file visualization performed in PyMol. Secondary structure-based on SHAPE data indicated on the right.

Figure 3.12. Triticum aestivum eIF4E and maize chlorotic mottle virus 3'CITE (MTE) 3D predicted protein-RNA docking simulation. Nucleic acid-Protein docking software (Tuszynska et al., 2015) predicted Triticum aestivum (wheat) eIF4E-MTE complex interaction. The interaction of active sites was analyzed using PyMol. The summary table of interactions is indicated on the left corner table. Red shows amino acid (s) involved in 7-methylguanosine triphosphate interactions. Wheat eIF4E is colored in blue while the MTE is colored in teal blue. Polar contact interactions are illustrated by yellow dot lines. Four figure panel on the right illustrates a close- up of the wheat eIF4E-MTE model.

184

Figure 3.13. Zea mays eIF4E and maize chlorotic mottle virus 3'CITE (MTE) 3D predicted protein-RNA docking simulation. Nucleic acid-Protein docking software (Tuszynska et al., 2015) predicted Zea mays (maize) eIF4E-MTE complex interaction. The interaction of active sites was analyzed using PyMol. The summary table of interactions is indicated on the left corner table. The amino acid (s) involved in 7-methylguanosine triphosphate interactions are colored in red. Four figure panel on the right illustrates a close-up of the wheat eIF4E-MTE model. 185

Figure 3.14. Sequence alignment of wheat eIF4E protein mutations. Multiple sequence alignment of wheat eIF4E mutant sequences using the Clustal Omega alignment program (Sievers et al., 2011). Orange boxes indicate the amino acids involved in cap-binding stability. Red boxes indicate amino acids that interact with 7-methylguanosine triphosphate. Blue boxes indicate amino acids involved in the eIF4E interaction binding site. Purple boxes indicate amino acids locates on the outer surface of eIF4E and homologous to regions associated with virus resistance in lettuce (Nicaise et al., 2003). Magenta boxes indicate amino acids predicted to interact with PEMV2-PTE based on docking Protein-RNA simulations (Figure 5). 186

Figure 3.15. Visualization of mutations in Triticum aestivum (wheat) eIF4E. Mutated amino acids are color-coded based in the 2IDR pdb structure based on their location in the structure. Protein structure is illustrated in cartoon form (left) or surface view (right). The amino acid coloring scheme is indicated on the top-left table.

Figure 3.16. Expressing eIF4E mutants in Tuner (DE3) pLysS at 37°C. Wheat eIF4E-GST protein constructs were expressed in Tuner (DE3) pLysS BL21 cells at 37°C. Induced (+ IPTG, 0.5mM) and non-induced (- IPTG) were sonicated to disrupt bacterial cells and release protein to soluble fraction. Induced cells were separated into soluble (S) and insoluble (IS) fractions. Total fractions (T) were included in protein gel electrophoresis to determine the amount of protein in each fraction. The black arrow indicates the eIF4E-GST express protein complex. 187

Figure 3.17. Troubleshooting eIF4E solubility using different BL21 expression competent cells. Wheat eIF4E-GST (WT) protein construct expressed in Tuner (DE3) pLysS and Rosetta-gami B (DE3) pLysS BL21 competent cells at 37°C. Induced (+ IPTG, 0.5 mM) and non-induced (- IPTG) were sonicated to disrupt bacterial cells and release protein to soluble fraction. Induced cells were separated into soluble (S) and insoluble (IS) fractions. Total fractions (T) were included in protein gel electrophoresis to determine the amount of protein in each fraction. The black arrow indicates the eIF4E-GST fusion protein.

188

Figure 3.18. Optimization eIF4E solubility using different temperatures. Wheat eIF4E-GST (WT) protein expressed in Tuner (DE3) pLysS and Rosetta-gami B (DE3) pLysS BL21 competent cells. A) Induction of bacterial cultures overnight at 16°C. B) Induction of bacterial cultures at 30°C for 2 hours. Induced (+ IPTG, 0.5mM) and non-induced (- IPTG) were sonicated to disrupt bacterial cells and release protein to soluble fraction. Induced cells were separated into soluble (S) and insoluble (IS) fractions. Total fractions (T) were included in protein gel electrophoresis to determine the amount of protein in each fraction. The black arrow indicates the eIF4E-GST express fusion protein.

189

Figure 3.19. IPTG fine-tuning to optimize eIF4E expression. Wheat eIF4E-GST (WT) protein expressed in Tuner (DE3) pLysS at 30°C. Bacterial expression cultures were subjected to various concentrations of IPTG, ranging from 0.1mM to 0.4mM. Induced (+ IPTG) and non-induced (- IPTG) were sonicated to disrupt bacterial cells and release protein to soluble fraction. Induced cells were separated into soluble (S) and insoluble (IS) fractions. Total fractions (T) were included in protein gel electrophoresis to determine the amount of protein in each fraction. The black arrow indicates the eIF4E-GST fusion protein.

190

Figure 3.20. Expression of eIF4E-pGEX using optimized conditions. Wheat eIF4E-GST (WT) protein expressed in Tuner (DE3) pLysS and Rosetta-gami B (DE3) pLysS BL21 competent cells at 30°C. Bacteria expression cultures were subjected to 2:30 hours of 0.3mM IPTG induction. Induced (+ IPTG) and non-induced (- IPTG) were sonicated to disrupt bacterial cells and release protein to soluble fraction. Induced cells were separated into soluble (S) and insoluble (IS) fractions. Total fractions (T) were included in protein gel electrophoresis to determine the amount of protein in each fraction. The black arrow indicates the eIF4E-GST fusion protein.

Figure 3.21. Purification of wheat eIF4E-GST tag proteins. A) Example of GST purification after Factor Xa cleavage. B) Example overloaded concentrated purified eIF4E. Similar to GST (26 kDa), the expected size of wheat eIF4E is 26 kDa. The gels were overloaded with protein to visualize contaminants. Arrow points to eIF4E expected size. 191

Figure 3.22. Purification of His-tag eIF4E wild type protein, nickel vs. cobalt purification. Induced (+ IPTG) and non-induced (- IPTG) were sonicated to disrupt bacterial cells and release protein to soluble fraction. Gel label abbreviations: soluble (S), total fractions (T), flowthrough (FL), wash 1 (W1), final wash (Wf), elution (E#), and concentrated and desalted protein (C). The black arrow indicates the eIF4E-GST fusion protein.

Figure 3.23. Construction of Zea mays cap-binding proteins expression vectors. A) PCR amplification of cDNA products from B73 seedlings mRNA. Primers were designed to target the most abundant form of eIF4E (GRMZM2G002616-Chr3), eIFiso4E (GRMZM2G22019-Chr5), and cap-binding protein (CBP) (Zm00001d028515-Chr1). Gibson assembly was used to clone maize proteins into protein expression vectors. B) Example of the bacterial expression vector for maize constructs. The maize cap-binding protein genes were expressed in 6Xhis tag Maltose binding protein (MBP) TEV vectors with Kanamycin as the selective agent. 192

Figure 3.24. Induction examples of Zea mays cap-binding protein vectors. pET His6 Maltose binding protein (MBP) TEV-eIF4E, eIFiso4E, or cap-binding protein (CBP) were expressed in Rosetta-gami B (DE3) pLysS BL21 competent cells at 30°C for 2 hours. Induced (+ IPTG) and non-induced (- IPTG) were sonicated to disrupt bacterial cells and release protein to soluble fraction. Induced cells were separated into soluble (S) and insoluble (IS) fractions. Total fractions (T) were included in protein gel electrophoresis to determine the amount of protein in each fraction. MBP and Z. mays protein complexes were cleaned up using 7-methylguanosine triphosphate agarose beads (E). The white arrow indicates the eIF4E-MBP fusion protein.

193

Tables

Table 3-1. Summary of T. aestivum eIF4E mutations

Region of Protein Cap binding Rationale for mutant creation mutation Mutation properties Disruption of 4E binding to 4G to evaluate the involvement of eIF4G binding W79A +a 4G binding site in PTE +4E binding. Involved in structural stability. Located in β4 Bind the cap through stacking π bond interactions. Partially b W62A ± disrupt restoration of LMV susceptibility. Expected to affect PTE binding. Located in a flexible loop 2 Bind the cap through stacking π bond interactions. Expected to c W108A - affect PTE binding. Located in a flexible loop between ɑ2 and β5 Involved in cap-binding stability. Human homologous mutant E109A - causes loss of cap-binding. Expected to disrupt binding to PTE. Located in a flexible loop between ɑ2 and β5 5'-cap-binding R158A + Involved in cap-binding stability. May not disrupt 4E-PTE binding. Located on flexible loop 3 Partial availability on the surface and cap-binding pocket. W49A - Involved in structural stability. Located in a flexible loop between ɑ1 and β3 R163A ± Involved in cap-binding and structure stability. May not disrupt 4E-PTE binding. Located on flexible loop 3 W167A - Involved in cap-binding stability. Suspected to disrupt 4E-PTE binding. Located at the end of loop 3 near β8 F50A ± Might affect cap-binding negatively. Expected to affect PTE-4E binding adversely. Located in a flexible loop between ɑ1 and β3 Outer surface H67A + Located on region associated with resistance. Might stabilize PTE binding. Located on loop 2 Expected to have some effect on PTE binding activity. Q59A + Conserved in one plant isoform (>90%). Located on flexible loop 2 S205A + Expected to have some effect on PTE binding activity. Located PTE-eIF4E on flexible loop 3 model K207A + Expected to have some effect on PTE binding activity. Located interaction on flexible loop 3 G208A + Expected to have some effect on PTE binding activity. Located on flexible loop 3 P209A + Expected to have some effect on PTE binding activity. Located on flexible loop 3 WT + control a: cap-binding activity expected b: partial cap-binding activity expected c: cap-binding activity disrupted

194

Table 3-2. Novagen competent cells genetic marker key

Genetic Marker key ahpC: mutation in alkyl hydroperoxide reductase conferring disulfide reductase activity. dcm: no methylation of cytosines in the sequence CCWGG F-: Strain does not contain the F episome gal: unable to utilize galactose gor: abolishes glutathione reductase. Allows formation of disulfide bonds in the E. coli cytoplasm lacY: abolishes lac permease ompT: lacks an outer membrane protease; improves recovery of intact recombinant proteins pLysS: CamR plasmid that carries the gene for T7 lysozyme pLysSRARE: CamR plasmid that carries the gene for T7 lysozyme and tRNA genes for six codons rarely used in E. coli. Tn10: contains the TetR transposable element, Tn10 trxB: abolished thioredoxin reductase. Allows formation of disulfide bonds in E. coli cytoplasm.

195

Table 3-3. Protein expression solubility summary results

Protein solubility Protein solubility: GST tag (pGEX backbone) 6X-His (pET28a) Region of Protein mutation Mutation 37°C 30°C 30°C Tunera R-gami-B Tuner cells R-gami-B cells Tuner/R-gami-Bg cells cells eIF4G mostly soluble W79A b insoluble (0%) binding (50%) Insolublec insoluble mostly soluble Insoluble mostly soluble W62A (≤5%) (≤5%) (50%) (≤10%) (~50-65%) partially solubled mostly soluble W108A Soluble (>70%)f (10-20%) (50-60%) partially soluble partially soluble mostly soluble E109A (20-25%) (20-25%) (~50%) 5'-cap mostly solublee mostly soluble R158A binding (50%) (50-60%) insoluble insoluble partially soluble W49A insoluble (≤5%) (0%) (0%) (25-30%) partially soluble mostly soluble R163A (25-30%) (50%) mostly soluble W167A insoluble (≤5%) (50-60%) insoluble insoluble partially soluble Insoluble F50A Outer (0%) (0%) (40-50%) (≤10%) surface insoluble insoluble partially soluble partially soluble H67A (≤5%) (≤5%) (10-25%) (20-25%) mostly soluble Q59A (~50%) mostly soluble S205A PTE-eIF4E (~50%) model K207A partially soluble interaction G208A (<40%) mostly soluble P209A (~50%) partially insoluble partially soluble partially soluble WT soluble Soluble (>70%) (10%) (40%) (40-50%) (10-25%) a: Solubility category, number in parenthesis represents the percent of protein in the soluble fraction b: grey colored box indicates that the construct was not grown/expressed c: constructs with less than 10% of the protein in the soluble fraction were categorized "insoluble" and colored in light red d: constructs expressing the protein in the soluble fraction between 10-40% were classified as "partially soluble" and colored in light orange e: constructs expressing the protein in the soluble fraction between 40-60% were classified as "mostly soluble" and colored yellow f: constructs with more than 70% of the protein in the soluble fraction were categorized "soluble" and colored in light green g: Tuner or R-gami-B cells were used based on observations of previous expression levels

196

Table 3-4. Summary of maize eIF4E search results

number of Protein amino nucleotide size gene transcript protein Gene ID of Chromosome acid (coding region synonyms splicing size (kDa) interest length cDNA) in bp conformations

eIF4E 1 3 218 657 24.5 GRMZM2G002616

NCBP eIF4E 1 1 229 690 26.5 Zm00001d028515

Zm00001d041973 eIF4E-1 eIF4E 1 3 220 884 26.4

GRMZM2G22019 eif7 eIFiso4E 1 5 216 651 24.1

eIFiso4E 3 1 210 633 23.5 Zm00001d032775 eIFiso4E

197

Table 3-5. Protein-RNA summary interactions from Electrophoretic mobility shift (EMSA) assay using 3'-cap-independent translation elements of thin paspalum asymptomatic virus (TPAV) (PTE) or maize chlorotic mottle virus (MTE) as RNA probes and wheat eIF4E (100 nM) as protein

PTE-TPAV PTE-m2 TPAV MTE Region of Protein Notes mutation Mutation cap RNA uncap RNA cap RNA uncap RNA cap RNA uncap RNA eIF4G W79A NA NA NA NA NA NA binding GST W62A c purified

GST purified

W108A 6X His purified

GST purified

5'-cap E109A binding 6X His purified

R158A NA NA NA NA NA NA GST W49A purified

GST R163A purified

GST W167A purified

GST F50A purified

Outer surface GST H67A purified

6X His Q59A purified

6X His PTE- S205A eIF4E purified

model K207A NA NA NA NA NA NA interaction G208A NA NA NA NA NA NA 6X His P209A purified

GST purified

6X His WT purified

m7GTP

aGrey shaded areas indicate proteins with expression difficulties; thus, no data available. eThis column indicates whether the method used to purified the protein tested for the EMSA assays cThe boxes contain images of EMSA test such that the left lane contains the RNA probe only. In contrast, the right lane contains the RNA-protein interaction reaction.

198

Table 3-6. Identification of homologous amino acid residues reported conferring virus resistance on eIF4E

Homologous location amino acid

eIF4E Virus Genus Plant structure Lettuce arabidopsis Pea Wheat maize location Publication source Clover yellow vein Bastet et al. β1 W64 W69 L62 W49a W52 virus Potyvirus Arabidopsis (2019) loop 1 V75 T80 D73 V60 A63

Bastet et al. loop 1 A76 S81 D74 A61 A64 (2018) loop 1 S84 S84 S78 S65 S69

loop G109 G114 R107 G94 G97

β5 N171 N176 K169 S156 S159

Ruffer et al. β6 D111 D116 D109 D96 D99 Potato virus Y Potyvirus pepper (2002) β2 M81 L86 M79 I66 I69

loop 1 S69 A74 A67 Q54 Q57

German- β1 W64 W69 L62 W49 W52 Retana et al.

Lettuce mosaic virus Potyvirus lettuce (2008) loop 1 W77 W82 W75 W62b W65

β5 R173 R178 R71 R158 R61

β6 W182 187 180 W167 170

Loop near F65 F70 F63 F50 F53 β1 Melon necrotic spot Nieto et al. loop 3 S223 N228 N221 G208 G211 virus Carmovirus melon (2006) Capsicum-Tobacco Yeam et al. loop G109 G114 R107 G94 G97 etch virus (TEV) Potyvirus tomato (2007) a Amino acid in bold letters indicate mutations at these locations are associated with virus resistance for at least two different eIF4E proteins b Amino acids colored in red indicate residues whose interaction with m7GTP has been confirmed

199

Table 3-7. List of primers

Q5- polymerase Primer name Sequence Description Anneal temp amplification of maize eIFiso4E Zmi4Ec5f-F caccATGGCGGAGGTCGAGGCTCCAGCTA 70 °C from GRMZM2G22019-Chr5, Zmi4Ec5f-R TTACACGGTGTACCGCCCACCTCT forward strand amplification of maize eIF4E Zm4Ec3f-F caccATGGCCGAGGAGACCGACACCAGGC 68 °C from GRMZM2G002616-Chr3, Zm4Ec3f-R TCAAACCGTGTAACGGTTCTTCAGGCCTTTGTCC forward strand amplification of maize NCBP ZmCBPc1r-F caccATGGAGGCGGCGGCGGAGAA 70 °C from Zm00001d028515-Chr1, ZmCBPc1r-R CTATCCTCTCAGCCATGTGTTCCTG reverse strand amplification of maize eIF4E Zm4Ec3r-F caccATGGCCGACGAGATTGACAC 65 °C from Zm00001d041973-Chr3, Zm4Ec3r-R TCAAACCGTGTAACGGTTCT revese strand Gibson assembly cloning of GA-Ta4E-F ctgtacttccaatccaatATGGCCGAGGACACGGAGAC 69 °C wheat eIF4E-FL into pET his6 GA-Ta4E-R cgctcgaattcggatccgTCAAACGGTGTAGCGGTTC MBP TEV LIC ctgtacttccaatccaatATGGCGGAGGTCGAGGCTCCAGCTA Gibson assembly cloning of zea Zmi4Ec5f-GA-F mays eIFiso4E 70 °C (GRMZM2G22019-Chr5- cgctcgaattcggatccgTTACACGGTGTACCGCCCACCTCT forward strand) into pET his6 Zmi4Ec5f-GA-R MBP TEV LIC ctgtacttccaatccaatATGGCCGAGGAGACCGACACCAGGC Gibson assembly cloning of zea Zm4Ec3f-GA-F mays eIF4E 70 °C (GRMZM2G002616-Chr3- cgctcgaattcggatccgTCAAACCGTGTAACGGTTCTTCAGGCCTTTGTCC forward strand) into pET his6 Zm4Ec3f-GA-R MBP TEV LIC ctgtacttccaatccaatATGGAGGCGGCGGCGGAGAA Gibson assembly cloning of zea ZmCBPc1r-GA-F mays CBP (Zm00001d041973- 70 °C Chr3-reverse strand) into pET cgctcgaattcggatccgCTATCCTCTCAGCCATGTGTTCCTG ZmCBPc1r-GA-R his6 MBP TEV LIC

Q52A-F CAAGTCCAGGgccGTGGCCTGGGGG Site-directed mutagenesis in 71 °C wheat eIF4E. Mutant Q59A Q52A-R CCCTGCGGGTTGTCGAAC TGCAAAGAGGgccGACAAAGGCC S198A-F Site-directed mutagenesis in 63 °C wheat eIF4E. Mutant S205A S198A-R TCCTCATGAACAATGAACCC GAGGTCTGACgccGGCCCCAAGAAC K200A-F Site-directed mutagenesis in 62 °C wheat eIF4E. Mutant K207A K200A-R TTTGCATCCTCATGAACAATG

G201A-F GTCTGACAAAgccCCCAAGAACC Site-directed mutagenesis in 62 °C wheat eIF4E. Mutant G208A G201A-R CTCTTTGCATCCTCATGAAC TGACAAAGGCgccAAGAACCGCT P202A-F Site-directed mutagenesis in 58 °C wheat eIF4E. Mutant P209A P202A-R GACCTCTTTGCATCCTCATGAAC GCG GGC CTT TAC AAC AAT ATC CAT AAC CC W79A-F Site-directed mutagenesis in 70 °C wheat eIF4E. Mutant W79A W79A-R GAA GTC CTC GAC GGT GGA GAA GG

W49A-F GCG TTC GAC AAC CCG CAG Site-directed mutagenesis in 70 °C wheat eIF4E. Mutant W49A W49A-R GAA GGT CCA GGC GTT CTC GAG GCG GGG AGC ACC ATC CAC W62A-F Site-directed mutagenesis in 70 °C wheat eIF4E. Mutant W62A W62A-R GGC CAC CTG CCT GGA CTT GCG GAA GAC CCC ATT TGT GCC W108A-F Site-directed mutagenesis in 69 °C wheat eIF4E. Mutant W108A W108A-R TTT TGG CTC AAT CTT GTT CTT GAA GCA ATG G GCG GAC CCC ATT TGT GCC E109A-F Site-directed mutagenesis in 69 °C wheat eIF4E. Mutant E109A E109A-R CCA TTT TGG CTC AAT CTT GTT CTT GAA GCA

200

Table 3.7 (continued) Q5- polymera Primer name Sequence Description se Anneal temp

R158A-F GCG CAG AAA CAG GAA AGA GTA GCT ATC T Site-directed mutagenesis in 70 °C wheat eIF4E. Mutant R158A R158A-R CAC GCT AAC GAC TGC TCC ACA

R163A-F GCG GTA GCT ATC TGG ACC AAA AAT GCT Site-directed mutagenesis in 70 °C wheat eIF4E. Mutant R163A R163A-R TTC CTG TTT CTG ACG CAC GCT AAC

W167A-F GCG ACC AAA AAT GCT GCC AAT GAA GC Site-directed mutagenesis in 71 °C wheat eIF4E. Mutant W167A W167A-R GAT AGC TAC TCT TTC CTG TTT CTG ACG CA

F50A-F GCG GAC AAC CCG CAG GGC Site-directed mutagenesis in 70 °C wheat eIF4E. Mutant F50A F50A-R CCA GAA GGT CCA GGC GTT CTC

H67A-F GCG CCC ATC CAC ACC TTC TCC ACC Site-directed mutagenesis in 70 °C wheat eIF4E. Mutant H67A H67A-R GAT GGT GCT CCC CCA GGC CAC CTG

201

CHAPTER 4: GENERAL CONCLUSION

Maize chlorotic mottle virus contributes to tremendous losses in regions with MLND outbreaks (Achon et al., 2017; FAO, 2017; Quito-Avila et al., 2016; Vega and Beillard, 2016).

Although by itself, the disease severity is affected by environmental factors such as abiotic stress and drought conditions (Scheets, 1998), MCMV becomes lethal upon interaction with a potyvirus (Gowda et al., 2015). The interaction between a potyvirus and MCMV results in maize lethal necrosis disease MLND (CIMMYT, 2014). The devastating damages in crops led to different research groups like CIMMYT to collaborate in the search for resistant lines (Awata et al., 2019; Gowda et al., 2018). The most desirable method for virus disease management is developing host resistance (Jones et al., 2018; Nyaga et al., 2019).

The research presented in this dissertation focused on understanding the translation mechanism of maize chlorotic mottle virus (MCMV). The deciphering of MCMV translation initiation began by determining if the virus uses a 3'-cap-independent translation elements (3'-

CITE) to promote viral translation. Deletion analysis on the genomic RNA indicated the presence of a 3'-CITE in the 3'-untranslated region of the viral RNA. The 3'-CITE was mapped between nucleotides 4164-4333 using the luciferase constructs. The second step was to determine the secondary structure of this 3'-CITE. Selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) probing revealed a hammer-like stable helix structure with two side loops at the apex connected by a pyrimidine connecting tract. The MCMV 3'-CITE (MTE) resembled a previously characterized classification of 3'-CITEs, panicum-mosaic virus-like translation element (PTE) class. The PTEs are marked by the bridge that connects the helical branches is a pyrimidine-rich (C-domain), and the basal helix contains a guanylate rich bulge (G- domain) on its 5'-side (Batten et al., 2006; Wang et al., 2009). Two to four nucleotide base- 202

pairing between the C-domain and the G-domain such that a pseudoknot is formed. This G-C pseudoknot drives the interaction between the eIF4E cap-binding site to the PTE.

The pseudoknot interaction between the G- and C-domain differs among characterized or predicted PTEs. Because of the MTE's weak pseudoknot prediction, we wanted to explore if the number of nucleotide interactions affected eIF4E-PTE interaction. Constructs indicated that compared to other PTEs, cap-independent translation mediated by MTE was among the lowest.

However, we could also observe that having too many C-G base-pairs did not make the translation activity more efficient than those with 3-nucleotides interacting to create a pseudoknot. Since translation activity was affected by a weak pseudoknot or a very strong one, it seems that there is an ideal range of G-C pair for PTEs; and going outside the specific range could have negative effects on virus translation. In summary, strong PTEs have a three base-pair pseudoknot, whereas, presence of a three base-pair pseudoknot (even if all GC pairs) does not guarantee a strongly stimulating PTE.

We know that the G-C pseudoknot can affect translation, but we do not if the size of the pseudoknot or lack thereof will affect PTE-eIF4E interactions. So far, our lab (Wang et al., 2011;

Wang et al., 2009) and other labs (Gao and Simon, 2017) have focused on translation and eIF4E recruitment through the RNA perspective. For chapter three, we created mutations on eIF4E to enable us to understand the PTE-4E interaction from both sides of the scope. Since the MTE fits most of the PTE-class characteristics, we suspected it was a PTE-like structure anomaly. MTE contained the YGGCA/UGCCR conserved sequence that facilitates the long-distance interaction with the 5'-end of the viral RNA. MTE also shows hyper-modification around the G-domain bulge, but it lacks the series of Gs. Lastly, the MTE does bind to eIF4E, but it seems that it might bind in different regions of eIF4E than the PTE it was tested against. 203

Whether the MTE is a deviant PTE or not, more work needs to be done to get the complete picture of the MCMV translation mechanism. The translation mechanism of a virus is not whole until the host's response is taken into account. This dissertation only presents a small glimpse of a potentially bigger picture. This is just one piece of the puzzle. We still need to figure out if MTE can interact with other initiation factors or if there are any other RNA structures in the MCMV genome involved in translation stability or inhibition.

The next steps for chapter three would be to purify more proteins and screen PTEs with different points of connection in the G-C pseudoknot. It would be interesting to see if a larger G- bulge would increase the points of contact with the protein's binding pocket or if it will affect the protein conformation at all. The best way to learn about the PTE-4E interaction would be to get a crystal structure of the complex.

Protein-ligand docking programs are often used to increase the chances of finding binding affinities of two or more molecules with known structures (Huang and Zou, 2010).

Molecular docking simulations are sometimes used to identify drug-like compounds that could bind to specific biological targets (Ballante, 2018). In chapter three, RNA-protein docking simulations were performed to simulate the interaction between PTEs and eIF4E. Since the interaction of eIF4E and PTE play a fundamental role in the initiation of viral RNA translation using host translation machinery, understanding the viral RNA-protein interaction could allow us to engineer mutations in the host proteins that would disrupt this interaction. The increasing availability of structural data of RNA-protein interactions has facilitated the development of bioinformatics tools to compute simulations of these complexes (Bissaro et al., 2020). However, molecular docking of RNA structures into proteins is complicated due to the conformational flexibility, singular charge distribution, and effects of solvents in the RNA structure (Bissaro et 204

al., 2020; Delgado Blanco et al., 2019). Even so, the power of these RNA-protein docking tools could be harness to evaluate the putative viral RNA-host protein interactions. By harnessing these tools, we attempted to identify mutations on eIF4E that would only affect viral RNA binding with minimal effect on the host.

We were able to identify potential amino acid residous that had more detrimental effects in the uncapped PTE than capped RNA. My proposal for future work is to combine the power of in silico RNA-protein docking tools to complement the design of protein mutations to confer virus resistance. The accuracy of RNA-protein docking models decreases when an algorithm neglects to take into consideration the dynamic behavior of RNAs in the presence of various solvents. Since we have studied the dynamics of virus 3’-CITE structures from the biological perspective, one of the next steps would be to partner up with computational labs to improve docking algorithms which include the appropriate assumptions regarding RNA flexibility and dynamics. The development of better models would open the doors for a better understanding of the complexity behind RNA structures and their involvement in biological functions. It could also serve as a gateway to engineer strategic mutations in eIF4E that would not affect plant development and foliage production but would prevent virus infection.

References

Achon, M.A., Serrano, L., Clemente-Orta, G., Sossai, S., 2017. First Report of Maize chlorotic mottle virus on a Perennial Host, Sorghum halepense, and Maize in Spain. Plant Disease 101, 393-393.

Awata, L.A.O., Beyene, Y., Gowda, M., L, M.S., Jumbo, M.B., Tongoona, P., Danquah, E., Ifie, B.E., Marchelo-Dragga, P.W., Olsen, M., Ogugo, V., Mugo, S., Prasanna, B.M., 2019. Genetic Analysis of QTL for Resistance to Maize Lethal Necrosis in Multiple Mapping Populations. Genes (Basel) 11, 32.

Ballante, F., 2018. Protein-Ligand Docking in Drug Design: Performance Assessment and Binding-Pose Selection. Methods Mol Biol 1824, 67-88. 205

Batten, J.S., Turina, M., Scholthof, K.B., 2006. Panicovirus accumulation is governed by two membrane-associated proteins with a newly identified conserved motif that contributes to pathogenicity. Virol J 3, 12.

Bissaro, M., Sturlese, M., Moro, S., 2020. Exploring the RNA-Recognition Mechanism Using Supervised Molecular Dynamics (SuMD) Simulations: Toward a Rational Design for Ribonucleic-Targeting Molecules? Frontiers in Chemistry 8.

CIMMYT, 2014. Maize Lethal Necrosis: Building a comprehensive response, 2014 Annual Report : turning research into impact, Mexico, pp. 24-25.

Delgado Blanco, J., Radusky, L.G., Cianferoni, D., Serrano, L., 2019. Protein-assisted RNA fragment docking (RnaX) for modeling RNA–protein interactions using ModelX. Proceedings of the National Academy of Sciences, 201910999.

FAO, 2017. Status of Maize lethal necrosis disease (MLND)in kenya, Pest Reports, Food and Agriculture Organization of the United Nations.

Gao, F., Simon, A.E., 2017. Differential use of 3'CITEs by the subgenomic RNA of Pea enation mosaic virus 2. Virology 510, 194-204.

Gowda, M., Beyene, Y., Makumbi, D., Semagn, K., Olsen, M.S., Bright, J.M., Das, B., Mugo, S., Suresh, L.M., Prasanna, B.M., 2018. Discovery and validation of genomic regions associated with resistance to maize lethal necrosis in four biparental populations. Mol Breed 38, 66.

Gowda, M., Das, B., Makumbi, D., Babu, R., Semagn, K., Mahuku, G., Olsen, M.S., Bright, J.M., Beyene, Y., Prasanna, B.M., 2015. Genome-wide association and genomic prediction of resistance to maize lethal necrosis disease in tropical maize germplasm. Theor Appl Genet 128, 1957-1968.

Huang, S.-Y., Zou, X., 2010. Advances and Challenges in Protein-Ligand Docking. International Journal of Molecular Sciences 11, 3016-3034.

Jones, M.W., Penning, B.W., Jamann, T.M., Glaubitz, J.C., Romay, C., Buckler, E.S., Redinbaugh, M.G., 2018. Diverse Chromosomal Locations of Quantitative Trait Loci for Tolerance to Maize chlorotic mottle virus in Five Maize Populations. Phytopathology 108, 748- 758.

Nyaga, C., Gowda, M., Beyene, Y., Muriithi, W.T., Makumbi, D., Olsen, M.S., Suresh, L.M., Bright, J.M., Das, B., Prasanna, B.M., 2019. Genome-Wide Analyses and Prediction of Resistance to MLN in Large Tropical Maize Germplasm. Genes (Basel) 11, 16.

Quito-Avila, D.F., Alvarez, R.A., Mendoza, A.A., 2016. Occurrence of maize lethal necrosis in Ecuador: a disease without boundaries? European Journal of Plant Pathology 146, 705-710.

Scheets, K., 1998. Maize chlorotic mottle machlomovirus and wheat streak mosaic rymovirus concentrations increase in the synergistic disease corn lethal necrosis. Virology 242, 28-38. 206

Vega, H., Beillard, M.J., 2016. Ecuador declares state of emergency in corn production areas, in: Network, G.A.I. (Ed.). USDA Foreign Agricultural Service, Quite, Ecuador.

Wang, Z., Parisien, M., Scheets, K., Miller, W.A., 2011. The cap-binding translation initiation factor, eIF4E, binds a pseudoknot in a viral cap-independent translation element. Structure 19, 868-880.

Wang, Z., Treder, K., Miller, W.A., 2009. Structure of a viral cap-independent translation element that functions via high affinity binding to the eIF4E subunit of eIF4F. J Biol Chem 284, 14189-14202.