A Distinct Endogenous Pararetrovirus Family in Nicotiana Tomentosiformis, a Diploid Progenitor of Polyploid Tobacco1[W]
Total Page:16
File Type:pdf, Size:1020Kb
A Distinct Endogenous Pararetrovirus Family in Nicotiana tomentosiformis, a Diploid Progenitor of Polyploid Tobacco1[w] Wolfgang Gregor, M. Florian Mette, Christina Staginnus, Marjori A. Matzke*, and Antonius J.M. Matzke Gregor Mendel Institute of Molecular Plant Biology, Austrian Academy of Sciences, A-1090 Vienna, Austria Downloaded from https://academic.oup.com/plphys/article/134/3/1191/6112290 by guest on 28 September 2021 A distinct endogenous pararetrovirus (EPRV) family corresponding to a previously unknown virus has been identified in the genome of Nicotiana tomentosiformis, a diploid ancestor of allotetraploid tobacco (Nicotiana tabacum). The putative virus giving rise to N. tomentosiformis EPRVs (NtoEPRVs) is most similar to tobacco vein clearing virus, an episomal form of a normally silent EPRV family in Nicotiana glutinosa; it is also related to a putative virus giving rise to the NsEPRV family in Nicotiana sylvestris (the second diploid progenitor of tobacco) and in the N. sylvestris fraction of the tobacco genome. The copy number of NtoEPRVs is significantly higher in N. tomentosiformis than in tobacco. This suggests that after the polyploidiza- tion event, many copies were lost from the polyploid genome or were accumulated specifically in the diploid genome. By contrast, the copy number of NsEPRVs has remained constant in N. sylvestris and tobacco, indicating that changes have occurred preferentially in the NtoEPRV family during evolution of the three Nicotiana species. NtoEPRVs are often flanked by Gypsy retrotransposon-containing plant DNA. Although the mechanisms of NtoEPRV integration, accumulation, and/or elimination are unknown, these processes are possibly linked to retrotransposon activity. Integrated (endogenous) viral sequences are in- have originated when a pre-existing virus captured creasingly recognized as common constituents of an RT gene (Xiong and Eickbush, 1990). many plant genomes. Endogenous retroviruses have Little is known about the potential pathogenicity of been detected recently in diverse plant species endogenous viral sequences or their impact on plant (Vicient et al., 2001; Wright and Voytas, 2001). In genome structure and function. Some endogenous addition, sequences derived from the two types of pararetroviruses (EPRVs), such as those derived from plant DNA virus, the single-stranded DNA gemini- banana streak virus (Harper et al., 1999; Ndowora et viruses (Bejarano et al., 1996; Ashby et al., 1997) and al., 1999) and tobacco vein clearing virus (TVCV; the double-stranded DNA pararetroviruses (Harper Lockhart et al., 2000), were initially noticed because et al., 2002) have been identified in various plant they can be activated and cause symptoms of infec- genomes. tion in hybrid plants. By contrast, the EPRV family in Retroviruses, which have an RNA genome, must Nicotiana sylvestris (NsEPRV; formerly named TPVL integrate into host chromosomes by means of a [Jakowitsch et al., 1999] and TEPRV [Mette et al., retrovirus-encoded integrase activity to complete 2002]) comprises mutated viral sequences that are their replication cycle. By contrast, neither type of presumably unable to reconstitute a functional virus. plant DNA virus encodes an integrase function and NsEPRVs were first detected during a routine char- their replication normally proceeds without incorpo- acterization of plant DNA flanking transgene inserts ration into the host genome. Thus, the mechanism by in tobacco (Nicotiana tabacum; Jakowitsch et al., 1999). which DNA viral sequences integrate into plant chro- The conserved methylation pattern of NsEPRVs sug- mosomes remains to be clarified. Together with ret- gested that they might confer resistance to the exog- roviruses and retrotransposons, pararetroviruses are enous form of the virus, which has yet to be detected, classified as retroelements because they use a virus- through an epigenetic gene silencing mechanism act- encoded reverse transcriptase (RT) to replicate their ing at the genome level (Mette et al., 2002). genome. However, the genome structure of pararet- Tobacco provides an interesting system for study- roviruses differs significantly from that of retrovi- ing EPRVs and their contribution to plant genome ruses or retrotransposons, and they are thought to evolution. Tobacco is an allotetraploid that was prob- ably created by humans around 10,000 years ago (M. 1 This work was supported by the Austrian Fonds zur Fo¨rde- Chase, personal communication) from two wild dip- rung der Wissenschaftlichen Forschung (grant no. Z21–MED) and loid ancestors, N. sylvestris as maternal “S” genome by the European Union (contract no. QLK3–CT–2002–02098). donor and Nicotiana tomentosiformis as paternal “T” [w] The online version of this article contains Web-only data. * Corresponding author; e-mail [email protected]; genome donor (Kenton et al., 1993; Lim et al., 2000a). fax 43–1–4272–29749. Recent work analyzing the presence of geminiviral- Article, publication date, and citation information can be found related DNA sequences in different accessions of N. at www.plantphysiol.org/cgi/doi/10.1104/pp.103.031112. tomentosiformis pinpointed specifically N. tomentosi- Plant Physiology, March 2004, Vol. 134, pp. 1191–1199, www.plantphysiol.org © 2004 American Society of Plant Biologists 1191 Gregor et al. formis ac. NIC 479/84 as the paternal parent of to- distributed throughout the putative virus genome bacco (Murad et al., 2002). In principle, each diploid (Fig. 1, top: short vertical arrows), suggesting there progenitor could have contributed one or more dis- are no preferential sites in viral DNA for recombina- tinct EPRV families to the tetraploid tobacco genome. tion into plant chromosomes. Clones comprising DNA-blot hybridization experiments indicated that virus-plant junctions are discussed in more detail the aforementioned NsEPRV family accumulated to below. approximately 1,000 copies in the N. sylvestris ge- The putative pararetrovirus giving rise to Nto- nome before polyploid formation (Jakowitsch et al., EPRVs, which has not been described previously, is 1999). The copy number and epigenetic state of highly similar to TVCV (Lockhart et al., 2000), with NsEPRVs have remained essentially unchanged in N. amino acid identities in the four major open reading sylvestris and in tobacco since polyploidization. frames (ORFs) ranging from 88% to 94% (Fig. 3). The Downloaded from https://academic.oup.com/plphys/article/134/3/1191/6112290 by guest on 28 September 2021 When DNA isolated from diploid N. tomentosiformis putative N. tomentosiformis pararetrovirus is also re- was hybridized to a probe derived from NsEPRV lated to the putative virus giving rise to the NsEPRV sequences, a unique pattern comprising DNA frag- family in N. sylvestris and the S subgenome of to- ments not visible in tobacco or N. sylvestris was ob- bacco (Jakowitsch et al., 1999; Mette et al., 2002), served (Mette et al., 2002). This finding suggested although the amino acid identities in the four major that N. tomentosiformis harbors a related but distinct ORFs are lower (range of 61%–87%; Fig. 3). The or- family of EPRVs that is enriched in, or exclusive to, ganization of the four ORFs is identical among the the diploid genome. three viruses. In all cases, several short ORFs follow To examine this idea further, we have isolated and a region containing several tandem repeats and the sequenced a number of genomic clones containing putative enhancer-promoter. A distinguishing fea- portions of N. tomentosiformis EPRV (NtoEPRV) fam- ture of the putative N. tomentosiformis pararetrovirus ily members. We report here a characterization of this is the unusually short 5Ј leader, placing the tRNA- dispersed repetitive sequence family, a consensus binding site immediately upstream of the first ORF sequence for the putative virus giving rise to Nto- (Fig. 3). The significance of this is unknown. EPRVs, and we discuss the differential accumulation To determine the copy numbers of NtoEPRVs in the of the two EPRV families during evolution of the genomes of N. tomentosiformis and tobacco, a slot-blot three Nicotiana species. analysis was performed. The availability of the Nto- EPRV sequence allowed us to identify a region in ORF4 of the putative virus that showed relatively RESULTS low DNA sequence identity (69%) to the NsEPRV and To isolate NtoEPRVs, genomic libraries were pre- could therefore be used as an NtoEPRV-specific pared using DNA isolated from N. tomentosiformis ac. probe. From the slot-blot analysis, we estimate that NIC 479/84. These libraries were hybridized to a the genome of N. tomentosiformis contains approxi- 5.5-kb probe containing the region of NsEPRV rang- mately 4,000 copies of NtoEPRV, whereas the genome ing from approximately 2 to 7.5 kb (Jakowitsch et al., of tobacco DNA harbors around 600 copies (Fig. 4). 1999). Twenty-four positive clones were chosen for Thus, the genome of the diploid species contains further analysis (Fig. 1). Partial sequence analysis of approximately seven times more copies of NtoEPRV 16 clones that contained primarily viral sequences than the polyploid species. (Fig. 1, clones tom1 through tom16; viral sequences The difference in NtoEPRV copy number is not indicated by white bars) identified overlapping re- consistent with a simple additive model in which the gions, which could be used to assemble a consensus polyploid tobacco genome comprises an unaltered sequence of a putative viral genome that is approxi- combination of the two diploid progenitor genomes.