
Proc. Natl. Acad. Sci. USA
Vol. 88, pp. 5631-5634, July 1991
Evolution

Phylogeny of viroids, viroidlike satellite RNAs, and the viroidlike domain of hepatitis δ virus RNA
(statistical geometry/quasispecies/RNA world/living fossils)

SANTIAGO F. ELENA*, JOAQUÍN DOPAZO*, RICARDO FLORES†, THEODOR O. DIENER‡, AND ANDRÉS MOYA*§

*Departament de Genètica i Servei de Bioinformàtica, Universitat de València, Dr. Moliner 50, 46100 Burjassot, València, Spain; †Unidad de Biología Molecular y Celular de Plantas, Instituto de Agroquímica y Tecnología de Alimentos, Consejo Superior de Investigaciones Científicas, 46010 València, Spain; and ‡Center for Agricultural Biotechnology, and Department of Botany, University of Maryland, College Park, MD 20742, and Agricultural Research Service, U.S. Department of Agriculture, Beltsville, MD 20705

Contributed by Theodor O. Diener, March 21, 1991

ABSTRACT We report a phylogenetic study of viroids, some satellite RNAs, and the viroidlike domain of human hepatitis δ virus RNA. Our results support a monophyletic origin of these RNAs and are consistent with the hypothesis that they may be "living fossils" of a precellular RNA world. Moreover, the viroidlike domain of human hepatitis δ virus RNA appears closely related to the viroidlike satellite RNAs of plants, with which it shares some structural and functional properties. On the basis of our phylogenetic analysis, we propose a taxonomic classification of these RNAs.

Viroids, subviral pathogens of higher plants, are small (246-375 nucleotide residues), unencapsidated, single-stranded, circular RNAs characterized by highly base-paired, rodlike secondary structures (1). Viroids do not code for any proteins, yet they replicate autonomously (without the assistance of helper viruses) in susceptible cells. Viroidlike satellite RNAs resemble viroids, but they are found within the capsids of specific helper viruses required for their replication (2). Human hepatitis δ virus (HDV) RNA is a circular RNA requiring hepatitis B virus as a helper for packaging and transmission (3). HDV RNA contains a region, termed the viroidlike domain, with significant similarities to viroids and viroidlike satellite RNAs (4). Viroids, viroidlike satellite RNAs, and HDV RNA appear to replicate via oligomeric RNA intermediates by some type of rolling-circle mechanism (2-6).

Several hypotheses have been advanced to explain the evolution of viroids. It has been suggested, for example, that viroids may have originated from [...] or transposable elements by deletion of interior sequences (7), that they may represent escaped introns (8, 9), or that they have evolved comparatively recently from hypothetical "antenna" or "signal" RNAs, which eukaryotic cells are assumed to interchange (10).

With the demonstration that certain RNAs have catalytic properties (11, 12), the idea that RNA preceded DNA as a carrier of genetic information has gained support. Most recent models for self-replicating precellular RNAs assume the existence of primitive RNA enzymes with properties that are derived from extant self-splicing introns (13, 14). Because one viroid, all known viroidlike satellite RNAs, and the viroidlike domain of HDV RNA are self-cleaving (3, 15, 16), it is equally plausible to consider these RNAs as relics of the RNA world, thus leading to an alternative hypothesis for the evolution of viroids and viroidlike satellite RNAs (17). In this view, one must suppose that these RNAs have evolved from "free-living" molecules and that they, like introns, have assumed an intracellular mode of existence sometime after the evolution of cellular organisms.

Implicit in this proposal is the possibility that all viroids and viroidlike RNAs may have been derived from a common ancestor. To obtain evidence for or against this proposition, we have conducted a phylogenetic analysis of these small pathogenic RNAs and report here that our results are consistent with a monophyletic origin of viroids and viroidlike satellite RNAs, as well as possibly of the viroidlike domain of HDV RNA.

Phylogenetic Methods

The nucleotide sequences of 15 viroids, 4 circular (viroidlike) satellite RNAs, 2 linear satellite RNAs, and the viroidlike domain of HDV RNA were used (Table 1).

The first and most critical step in this work was the alignment of the sequences. It was carried out by means of a multiple sequence alignment algorithm with hierarchical clustering (37, 38), followed by minor adjustments of some stretches to preserve previously described homologies (25, 39).

Two methods of phylogenetic reconstruction were used: least squares (40) and maximum parsimony. In both cases the bootstrap method (41, 42) was applied to evaluate the reliability of the inferred trees. In the first case, 1000 replicates were sampled from the original set of sequences, and the corresponding distance matrix was obtained. The number of mutations between sequences was measured, correcting for multiple substitutions by using the formula $d = -\ln(1 - K)\,L$ (43), K being the frequency of changes and L the homologous length of the sequences. Based on the distance matrix, a phylogenetic tree was obtained for each bootstrap replicate by means of the FITCH program of the PHYLIP package, version 3.2 (44). Finally, a consensus tree was obtained by using the CONSENSE program of the same package. It is worth noting that the trees obtained were not significantly different whether gaps were excluded or treated as substitutions in the estimation of d.

The bootstrap estimate (also based on 1000 replicates) of the tree obtained by using the second method was directly implemented in the program DNABOOT of the PHYLIP package. In this case the estimate of the branch lengths from the inferred tree was obtained with the FITCH program, using the previously obtained consensus topology.

As a method for testing the tree-likeness of the data (i.e., their possible common origin), we have employed the method of statistical geometry in distance space (45, 46). This method is especially useful for the analysis of "old" relationships that have suffered to a large extent from subsequent randomizations.
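The distance correction and the bootstrap resampling described in Phylogenetic Methods can be illustrated with a short sketch. It is not the PHYLIP code used in the study; it only assumes that the formula quoted in the text, d = -ln(1 - K) L, is applied to a pair of aligned sequences, and that a bootstrap replicate is formed by resampling alignment columns with replacement, as in the standard bootstrap for phylogenies (41). The function names, the `-` gap symbol, and the `gaps_as_substitutions` switch are hypothetical choices made for this example.

```python
import math
import random


def corrected_distance(seq_a, seq_b, gaps_as_substitutions=False):
    """Estimated number of mutations, d = -ln(1 - K) * L, for two aligned sequences.

    K is the fraction of compared positions that differ and L is the number
    of compared (homologous) positions. The '-' gap symbol and the gap switch
    are assumptions of this sketch, not details taken from the paper.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    changes = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue  # column gapped in both sequences: nothing to compare
        if a == "-" or b == "-":
            if gaps_as_substitutions:
                compared += 1
                changes += 1
            continue  # otherwise the gapped column is excluded
        compared += 1
        if a != b:
            changes += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    k = changes / compared
    if k >= 1.0:
        raise ValueError("K = 1: the correction is undefined")
    return -math.log(1.0 - k) * compared


def bootstrap_columns(alignment, rng):
    """One bootstrap replicate: alignment columns resampled with replacement."""
    length = len(alignment[0])
    columns = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[c] for c in columns) for seq in alignment]


def distance_matrix(alignment, gaps_as_substitutions=False):
    """Symmetric matrix of corrected distances for all sequence pairs."""
    n = len(alignment)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = corrected_distance(alignment[i], alignment[j], gaps_as_substitutions)
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

In such a setup, each replicate returned by `bootstrap_columns` would be passed to `distance_matrix`, and the resulting matrix would be handed to a tree-building program; running the sketch with and without `gaps_as_substitutions` mirrors the two gap treatments compared in the text.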

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Abbreviations: HDV, hepatitis δ virus; ASBVd, avocado sunblotch viroid; see Table 1 for other abbreviations.
§To whom reprint requests should be addressed.

The present sequences represent such an old, extensively randomized relationship. The statistical geometry method is based on the statistical analysis of the six distances defined for four sequences in all the possible quartets that can be formed with all the sequences.

Table 1. Viroids and satellite RNAs examined

RNA                                 Abbreviation   Size, nt   Ref.
Avocado sunblotch viroid            ASBVd          247        18
Potato spindle tuber viroid         PSTVd          359        19
Tomato apical stunt viroid          TASVd          360        7
Tomato planta macho viroid          TPMVd          360        7
Citrus exocortis viroid             CEVd           371        20
Chrysanthemum stunt viroid          CSVd           356        21
Columnea latent viroid              CLVd           370        22
Coconut cadang-cadang viroid        CCCVd          246        23
Hop stunt viroid                    HSVd           297        24
Hop latent viroid                   HLVd           256        25
Apple scar skin viroid              ASSVd          330        26
Grapevine yellow speckle viroid     GYSVd          367        27
Grapevine 1B viroid                 G1BVd          363        28
Australian grapevine viroid         AGVd           369        29
Coconut tinangaja viroid            CTiVd          254        30
Velvet tobacco mottle virus*        vVTMoV         366        31
Solanum nodiflorum mottle virus*    vSNMV          378        31
Lucerne transient streak virus*     vLTSV          324        32
Subterranean clover mottle virus*   vSCMoV         332        33
Tobacco ringspot virus†             sTRSV          359        34
Arabis mosaic virus†                sArMV          300        35
Hepatitis δ virus RNA‡              vHDV           368        36

nt, Nucleotides.
*Circular (viroidlike) satellite RNA of virus listed.
†Linear satellite RNA of virus listed.
‡Viroidlike domain as indicated in ref. 4.

[Figure 1: tree diagram; scale bar, 100 substitutions]

FIG. 1. Consensus phylogenetic tree obtained for the RNA sequences listed in Table 1 (see text). ***, Group monophyletic in all of the replicates; **, monophyletic in more than 99% of the replicates; *, in more than 95%; +, in more than 90%; and -, in more than 80%. The remaining groups appeared as monophyletic in less than 80% of 1000 replicates. ASBVd has been taken as outgroup. Groups are considered from ASBVd to the left of the figure within the viroid family, and from ASBVd to the right within the satellite family (including the viroidlike domain of HDV RNA). For example, sTRSV and sArMV (satellite family) or CCCVd, CTiVd, and HLVd (viroid family) conform to two well-defined monophyletic groups in all of the bootstrap replicates.

RESULTS

The phylogenetic trees relating the different RNAs that were obtained by the two phylogenetic procedures were similar. The inferred evolutionary process is shown in Fig. 1. The following groups can be distinguished:

Linear satellite RNAs (sTRSV, sArMV)
Circular (viroidlike) satellite RNAs (vLTSV, vSCMoV, vSNMV, vVTMoV), together with the viroidlike domain of HDV RNA
Avocado sunblotch viroid (ASBVd)
Typical viroids, which can be subdivided into
    Pospiviroids (potato spindle tuber-like viroids; PSTVd, TPMVd, CLVd, CEVd, TASVd, CSVd)
    Cocadviroids (coconut cadang-cadang-like viroids; CCCVd, HLVd, CTiVd)
    Apscaviroids (apple scar skin-like viroids; ASSVd, GYSVd, G1BVd, AGVd)
    Hop stunt viroid (HSVd).

In analogy with the nomenclature of plant virus groups, we propose to name viroid groups and subgroups by words composed of a prefix indicating in abbreviated form the name of the type member of the group, followed by the suffix viroid. The main criteria underlying this classification are the frequency with which a group appears as monophyletic in the consensus tree, the length of the branch that separates the group from the remaining sequences (which is a reflection of how long the group has evolved independently), and its agreement with experimental evidence and biological criteria.

Linear and circular satellite RNAs are clearly clustered by the first criterion. Interestingly, the viroidlike domain of HDV RNA appears closely associated in the tree with the subgroup formed by most of the plant viroidlike satellite RNAs. In fact, despite its mammalian host, HDV RNA is closer to this subgroup than is the plant viroidlike satellite RNA of Lucerne transient streak virus (Fig. 1).

Within viroids themselves the high frequency of appearance of groups in the consensus tree supports the four groups described above. The length of the branch that separates ASBVd from all other sequences clearly shows that ASBVd constitutes an independent group. It has been suggested that, on the basis of its intermediate biological properties, ASBVd could represent an evolutionary link between typical viroids and satellite RNAs (17). The location of ASBVd in the tree supports this hypothesis.

Application of the method of statistical geometry in distance space to the whole set of sequences resulted in a value of x/y = 0.18, where x/y stands for the ratio of the deviation from tree-likeness to the true branch length (Table 2). This relatively high value is expected in view of the degree of variability displayed by the sequences; it is, nevertheless, significantly lower than that obtained from random sequences having the same base composition (in 1000 replicates).

The same analysis was repeated for the branches from ASBVd to the viroid group and from ASBVd to the satellite group. Results were similar (Table 2); they clearly indicate that the phylogenetic tree presented in Fig. 1 reflects a true phylogenetic relationship. The fact that such results cannot be obtained from two groups of independent sequences strongly supports the monophyletic origin of the RNAs examined.

Table 2. Deviation from tree-likeness to the true branch length

Branches                         x/y       Random x/y         n
All possible                     0.181**   0.4924 ± 0.0004    1000
From typical viroids to ASBVd    0.723*    0.9891 ± 0.0047    500
From ASBVd to satellite RNAs     0.647*    0.9301 ± 0.0046    500

x/y has been statistically tested for three sets of branches: (i) all possible branches, corresponding to all possible quartets; (ii) the branch that separates typical viroids and ASBVd from the satellites, corresponding to all possible quartets [(viroid, ASBVd), (satellite, satellite)]; and (iii) the branch that separates viroids from ASBVd and satellites, corresponding to all possible quartets [(viroid, viroid), (ASBVd, satellite)]. The x/y ratios have been compared with the corresponding ones obtained from the original set of sequences after n randomizations (random x/y ± standard error). **, P ≤ 0.001; *, P ≤ 0.01.

The first comparisons of viroid nucleotide sequences were published by Gross et al. (20), who compared those of CEVd and CSVd with the sequence of PSTVd (19) and found that a central region was conserved in all three sequences. On the basis of pairwise comparisons of viroid sequences, Keese and Symons (39) have proposed a model specifying five structural/functional domains, and Koltunow and Rezaian (47) later found this model to apply to all typical viroids. The former authors (39) also pointed out that viroid evolution could have involved the rearrangement of these domains, for example, by discontinuous [...] mediated by a jumping polymerase in a way similar to that established for defective interfering (DI) RNAs of viruses (48).

The percentage of sequence similarity has been proposed as a criterion for classifying typical viroids into five groups, whose type representatives are PSTVd, CCCVd, HSVd, ASSVd, and HLVd (25). In this scheme the separation limits among groups are arbitrarily fixed, and as a result ASSVd is forced to belong to a group different from that formed by GYSVd and G1BVd, even though all three share a strict consensus sequence in the central conserved region (CCR) (28, 47). Koltunow and Rezaian (47), on the other hand, have suggested that, in view of the proposed involvement of the CCR in viroid processing (49), classification should be based on the type of CCR present, and not solely on sequence similarities among the viroids.

It has been pointed out (39) that the processes of recombination, duplication, and deletion undergone by viroids would make phylogenetic inferences difficult. Indeed, excellent evidence for the occurrence of extensive RNA recombination in viroids has come to light. CLVd, for example, is mostly a mosaic of sequences known from other viroids. Its left and right terminal domains are essentially identical with those of Pospiviroids, whereas its CCR is identical with that of HSVd (22). Thus, placement of CLVd with one of these groups depends on what one considers as the most important criteria for classification. Nevertheless, as is shown in our analysis, the existence of these processes does not blur the true phylogenetic relationships among these RNAs.

An Evolutionary Model

We have presented evidence suggesting a common origin for viroids, some satellite RNAs, and the viroidlike domain of HDV RNA. In our tree (Fig. 1), satellite RNAs (including the viroidlike domain of HDV RNA) are clustered on one side and typical viroids on the other side. The tree thus separates autonomously replicating, but not self-cleaving, viroids from self-cleaving, but not autonomously replicating, satellite RNAs. ASBVd, which displays functional similarities with both groups (self-cleaving and autonomously replicating), is located in the middle between the two clusters. Hence, ASBVd may represent an evolutionary link between viroids and satellite RNAs, as has been proposed on the basis of different criteria (17).

The phylogenetic association of the viroidlike domain of HDV RNA with the viroids is consistent with other, previously identified, characteristics (4), namely size, circular structure, rodlike conformation, and RNA to RNA rolling circle replication. On the other hand, the viroidlike domain of HDV RNA shares with the viroidlike satellite RNAs self-cleavage, leading to the formation of specific termini, and dependence upon a helper virus (interestingly, hepatitis B virus, the helper virus of HDV, replicates through a mechanism which has its closest counterpart in a group of plant viruses, the Caulimoviruses).

Our phylogenetic analysis permits the distinction of four subgroups among the typical viroids. These may have evolved from a common ancestor by means of a "star" evolutionary process (43), suggesting that the subgroup individualization, as an independent evolutionary lineage, may have occurred within a relatively short time span, possibly reflecting a process of continuous reintroduction of variants from a natural reservoir. These variants would have undergone a certain amount of mutation (represented in the individual branches) until becoming, in most cases, pathogenic variants that can be easily detected by the symptoms they produce in infected plants. The division into subgroups could also reflect the properties of a population displaying a structure similar to that proposed by Eigen in the quasispecies model (50). Evolution of this putative viroidal quasispecies could be the consequence of the lack of proofreading mechanisms in the RNA polymerase system (probably RNA polymerase II) that catalyzes the replication of viroids and the intense processes of intermolecular recombination of viroids produced during the infection of hosts with two or more viroids.

Our results strongly support a monophyletic origin of viroids, some satellite RNAs, and the viroidlike domain of HDV RNA; they are also consistent with the hypothesis stating that these RNAs may be "living fossils" of a precellular RNA world (17). In such a hypothetical RNA world, reactions would have been catalyzed mostly by ribozymes. Interestingly, the three enzymatic activities apparently required for viroid and satellite RNA replication (RNA-dependent RNA polymerase, RNase, and RNA ligase) are precisely those that have so far been demonstrated with ribozymes.

We thank Drs. J. Felsenstein, A. Dress, A. v. Haeseler, F. González-Candelas, C. Hernández, and R. A. Owens for valuable comments. Computer resources of Centre de Càlcul de l'Universitat de València have been used in this study. J.D. has been supported by a fellowship of Conselleria de Cultura Educació i Ciència de la Comunitat Valenciana. This work has been partially supported by Grants BIO89-0668-C03-03 and PB87-0346 from the Comisión Interministerial de Ciencia y Tecnología to A.M. and R.F., respectively.

1. Diener, T. O. & Owens, R. A. (1989) in The Biochemistry of Plants. Molecular Biology, Vol. 15, ed. Marcus, A. (Academic, New York), pp. 537-562.
2. Francki, R. I. B. (1987) in The Viroids, ed. Diener, T. O. (Plenum, New York), pp. 205-218.
3. Taylor, J. (1990) Semin. Virol. 1, 135-141.
4. Branch, A. D., Levine, B. J. & Robertson, H. D. (1990) Semin. Virol. 1, 143-152.
5. Robertson, H. D. & Branch, A. D. (1987) in Viroids and Viroidlike Pathogens, ed. Semancik, J. S. (CRC, Boca Raton, FL), pp. 49-69.
6. Sanger, H. L. (1987) in The Viroids, ed. Diener, T. O. (Plenum, New York), pp. 117-166.
7. Kiefer, M. C., Owens, R. A. & Diener, T. O. (1983) Proc. Natl. Acad. Sci. USA 80, 6234-6238.
8. Diener, T. O. (1981) Proc. Natl. Acad. Sci. USA 78, 5014-5015.
9. Dinter-Gottlieb, G. (1986) Proc. Natl. Acad. Sci. USA 83, 6250-6254.
10. Zimmern, D. (1982) Trends Biochem. Sci. 7, 205-207.
11. Kruger, K., Grabowski, P. J., Zaug, A. J., Sands, J., Gottschling, D. E. & Cech, T. R. (1982) Cell 31, 147-157.
12. Guerrier-Takada, C., Gardiner, K., Marsh, T., Pace, N. & Altman, S. (1983) Cell 35, 849-857.
13. Sharp, P. A. (1985) Cell 42, 397-400.

14. Cech, T. R. (1986) Proc. Natl. Acad. Sci. USA 83, 4360-4363.
15. Prody, G. A., Bakos, J. T., Buzayan, J. M., Schneider, I. R. & Bruening, G. (1986) Science 231, 1577-1580.
16. Forster, A. C., Jeffries, A. C., Sheldon, C. C. & Symons, R. H. (1987) Cold Spring Harbor Symp. Quant. Biol. 52, 249-259.
17. Diener, T. O. (1989) Proc. Natl. Acad. Sci. USA 86, 9370-9374.
18. Symons, R. H. (1981) Nucleic Acids Res. 9, 6527-6537.
19. Gross, H. J., Domdey, H., Lossow, C., Jank, P., Raba, M., Alberty, H. & Sanger, H. L. (1978) Nature (London) 273, 203-208.
20. Gross, H. J., Krupp, G., Domdey, H., Raba, M., Jank, P., Lossow, C., Alberty, H., Ramm, K. & Sanger, H. L. (1982) Eur. J. Biochem. 121, 249-257.
21. Haseloff, J. & Symons, R. H. (1981) Nucleic Acids Res. 9, 2741-2752.
22. Hammond, R. W., Smith, D. R. & Diener, T. O. (1989) Nucleic Acids Res. 17, 10083-10094.
23. Haseloff, J., Mohamed, N. A. & Symons, R. H. (1982) Nature (London) 299, 316-321.
24. Ohno, T., Takamatsu, N., Meshi, T. & Okada, Y. (1983) Nucleic Acids Res. 11, 6185-6197.
25. Puchta, H., Ramm, K. & Sanger, H. L. (1988) Nucleic Acids Res. 16, 4197-4216.
26. Hashimoto, J. & Koganezawa, H. (1987) Nucleic Acids Res. 15, 7045-7052.
27. Koltunow, A. M. & Rezaian, M. A. (1988) Nucleic Acids Res. 16, 849-864.
28. Koltunow, A. M. & Rezaian, M. A. (1989) Virology 170, 575-578.
29. Rezaian, M. A. (1990) Nucleic Acids Res. 18, 1813-1818.
30. Keese, P., Osorio-Keese, M. E. & Symons, R. H. (1988) Virology 162, 508-510.
31. Haseloff, J. & Symons, R. H. (1982) Nucleic Acids Res. 10, 3681-3691.
32. Keese, P., Bruening, G. & Symons, R. H. (1983) FEBS Lett. 159, 185-190.
33. Davies, D., Haseloff, J. & Symons, R. H. (1990) Virology 177, 216-224.
34. Buzayan, J. M., Gerlach, W. L. & Bruening, G. (1986) Virology 151, 186-199.
35. Kaper, J. M., Tousignant, M. E. & Steger, G. (1988) Biochem. Biophys. Res. Commun. 154, 318-325.
36. Chen, P. J., Kalpana, G., Goldberg, J., Mason, W., Werner, B., Gerin, J. & Taylor, J. (1986) Proc. Natl. Acad. Sci. USA 83, 8774-8778.
37. Higgins, D. G. & Sharp, P. M. (1988) Gene 73, 237-244.
38. Higgins, D. G. & Sharp, P. M. (1989) Comput. Appl. Biosci. 5, 151-153.
39. Keese, P. & Symons, R. H. (1985) Proc. Natl. Acad. Sci. USA 82, 4582-4586.
40. Fitch, W. M. & Margoliash, E. (1967) Science 155, 279-284.
41. Felsenstein, J. (1985) Evolution 39, 783-791.
42. Efron, B. (1982) The Jackknife, the Bootstrap and Other Resampling Plans (Soc. Ind. Appl. Math., Philadelphia).
43. Kimura, M. (1983) The Neutral Theory of Molecular Evolution (Cambridge Univ. Press, Cambridge, U.K.).
44. Felsenstein, J. (1990) PHYLIP Manual (Univ. Herbarium, Univ. of California, Berkeley).
45. Eigen, M., Winkler-Oswatitsch, R. & Dress, A. (1988) Proc. Natl. Acad. Sci. USA 85, 5913-5917.
46. Winkler-Oswatitsch, R., Dress, A. & Eigen, M. (1986) Chem. Scr. 26B, 59-66.
47. Koltunow, A. M. & Rezaian, M. A. (1989) Intervirology 30, 194-201.
48. Huang, A. S. (1991) in RNA Genetics, eds. Domingo, E., Holland, J. J. & Ahlquist, P. (CRC, Boca Raton, FL), Vol. 3, pp. 195-208.
49. Diener, T. O. (1986) Proc. Natl. Acad. Sci. USA 83, 58-62.
50. Eigen, M. & Biebricher, C. K. (1988) in RNA Genetics, eds. Domingo, E., Holland, J. J. & Ahlquist, P. (CRC, Boca Raton, FL), Vol. 3, pp. 211-245.
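As a supplementary illustration of the quartet-based test of tree-likeness described in Phylogenetic Methods (six distances for every set of four sequences), the following sketch may be helpful. It is not the authors' implementation. It assumes, on the basis of the four-point condition for additive trees, that for each quartet the three pairwise sums of distances are sorted, that y is half the difference between the largest and the smallest sum (the internal branch length when the quartet is perfectly tree-like), and that x is half the difference between the two largest sums (zero for a perfect tree). Taking x/y as the ratio of the totals over all quartets is also an assumption of this example; the names `quartet_box` and `mean_x_over_y` are hypothetical.

```python
from itertools import combinations


def quartet_box(dist, a, b, c, d):
    """Return (x, y) for one quartet from its six pairwise distances.

    Assumed definitions (see lead-in): with the three distance sums sorted
    as small <= medium <= large, x = (large - medium) / 2 measures the
    deviation from tree-likeness and y = (large - small) / 2 estimates the
    internal branch length.
    """
    sums = sorted([
        dist[a][b] + dist[c][d],
        dist[a][c] + dist[b][d],
        dist[a][d] + dist[b][c],
    ])
    small, medium, large = sums
    x = (large - medium) / 2.0
    y = (large - small) / 2.0
    return x, y


def mean_x_over_y(dist):
    """Ratio of summed x to summed y over all quartets (None if y totals 0)."""
    total_x = 0.0
    total_y = 0.0
    for a, b, c, d in combinations(range(len(dist)), 4):
        x, y = quartet_box(dist, a, b, c, d)
        total_x += x
        total_y += y
    if total_y == 0.0:
        return None  # no resolvable internal branch in any quartet
    return total_x / total_y
```

Under these assumptions, distances that fit a tree exactly give a ratio of 0, and larger values indicate a greater departure from tree-likeness, which is the direction of the comparison between observed and randomized sequences reported in Table 2.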