The Evolution of Endogenous Viral Elements
Total Page:16
File Type:pdf, Size:1020Kb
View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Elsevier - Publisher Connector Cell Host & Microbe Review The Evolution of Endogenous Viral Elements Edward C. Holmes1,2,* 1Center for Infectious Disease Dynamics, Department of Biology, The Pennsylvania State University, University Park, PA 16802, USA 2Fogarty International Center, National Institutes of Health, Bethesda, MD 20892, USA *Correspondence: [email protected] DOI 10.1016/j.chom.2011.09.002 Endogenous retroviruses are a common component of the eukaryotic genome, and their evolution and potential function have attracted considerable interest. More surprising was the recent discovery that eu- karyotic genomes contain sequences from RNA viruses that have no DNA stage in their life cycle. Similarly, several single-stranded DNA viruses have left integrated copies in their host genomes. This review explores some major evolutionary aspects arising from the discovery of these endogenous viral elements (EVEs). In particular, the reasons for the bias toward EVEs derived from negative-sense RNA viruses are considered, as well as what they tell us about the long-term ‘‘arms races’’ between hosts and viruses, characterized by episodes of selection and counter-selection. Most dramatically, the presence of orthologous EVEs in diver- gent hosts demonstrates that some viral families have ancestries dating back almost 100 million years, and hence are far older than expected from the phylogenetic analysis of their exogenous relatives. Although it has long been known that the eukaryote genome retroviruses (Katzourakis and Tristem, 2005). These comprise contains a myriad of complete and partial relatives of retroviruses at least 31 distinct families, such that there have been at least called endogenous retroviruses that are now inherited passively 31 separate integration events, and likely many more. In some with the host genetic component, it was surprising to discover animal species, it has even proven possible to see retroviral that sequences of RNA viruses, which do not make a DNA inter- endogenization in action (Tarlinton et al., 2006). Not surprisingly, mediate and usually do not enter the host cell nucleus, were also those EVEs derived from nonretroviruses are far rarer and hence present in eukaryotic genomes (Crochu et al., 2004). Although represent the consequence of sporadic evolutionary events. In integrated copies of single-strand DNA (ssDNA) viruses were addition, while effectively complete genomes of endogenous found in the genomes of plant viruses some years ago (Bejarano retroviruses are relatively commonplace, this is not the case et al., 1996), a number of endogenized ssDNA viruses were for those EVEs generated by other types of virus, which are recently described in a diverse set of animal genomes (Belyi usually composed of partial genomic fragments. et al., 2010a; Katzourakis and Gifford, 2010). Such an array of Although the presence of retroviral EVEs was described many virally derived genetic material raises a number of important years ago (Benveniste and Todaro, 1974; Weiss et al., 1973), the evolutionary questions, which forms the basis of this review. first description of a gene sequence of an RNA virus (that repli- Perhaps the most important theme is that more than expanding cate using an RNA-dependent RNA polymerase [RdRp]) inte- our basic knowledge of the composition of eukaryotic genomes, grated into the host genome did not occur until 2004. This the presence of endogenous viruses has had a profound impact example involved the insect flavivirus cell fusing agent (CFA), on our understanding of the time-scale of virus evolution, the fragments of which were found to be integrated into the consequences of which are yet to be fully understood. genomes of Aedes spp. mosquitoes (Crochu et al., 2004), and which has also been observed in some other insect flaviviruses The Types and Phylogenetic Distribution (Roiz et al., 2009). Shortly afterward, integrated sequences of of Endogenous Viruses the RNA virus Potato virus Y (PVY, a potyvirus) were found in Although endogenous viruses have been classified in different the genomes of some grapevine varieties (Tanne and Sela, ways, a useful collective term for them all that reflects their 2005). Since this time, a number of other endogenous RNA generally fragmentary nature is ‘‘endogenous viral elements’’ viruses have been discovered, comprising bornaviruses (also (EVEs) (Katzourakis and Gifford, 2010). From here on, the term referred to as endogenous Borna-like N elements [EBLN]; EVE will be used to refer to all endogenous viruses, whether Figure 2)(Belyi et al., 2010b; Horie et al., 2010), bunyaviruses derived from retroviruses, DNA viruses, or RNA viruses, although (Katzourakis and Gifford, 2010), filoviruses (Belyi et al., 2010b; this review will generally focus on the latter group. EVEs are Taylor et al., 2010), orthomyxoviruses (Katzourakis and Gifford, generated when a double-stranded DNA (dsDNA) copy of the 2010), reoviruses (Katzourakis and Gifford 2010), and rhabdovi- viral genome is integrated into the host germline (Figure 1). ruses (Katzourakis and Gifford, 2010), although usually at very Although the production of dsDNA intermediates that integrate low copy numbers (i.e., less than 100 elements per genome). A into host genomic DNA is an obligatory part of the retroviral life list of animal EVEs is provided in Table 1, with bornaviruses being cycle, germline as opposed to somatic cell integration is the only endogenous RNA viruses found in the human genome. expected to be a rare event. Despite this, endogenous retrovi- Similarly, endogenous viruses have been observed in various ruses are surprisingly common; for example, approximately fungal (Frank and Wolfe, 2009; Taylor and Bruenn, 2009) and 5%–8% of the human genome is composed of endogenous bacterial (Salanoubat et al., 2002) genomes. 368 Cell Host & Microbe 10, October 20, 2011 ª2011 Elsevier Inc. Cell Host & Microbe Review species A species B species C dsDNA RNA dsDNA (EVE+) (EVE+) (EVE-) e.g. Hepadna- protein dsDNA integrated EVE integrated EVE empty insertion site e.g. Adeno-, Polyoma-, Papilloma- ssDNA dsDNA non-homologous recombination e.g. Parvo-, Circo- dsRNA TRANSLATION minimum age of EVE FIXATION e.g. Reo- mRNA interaction INHERITANCE RNA-ve (RNA+ve) VERTICAL with cellular retroelements e.g. Filo-, Rhabdo- RNA+ve RNA-ve maximum age e.g. Flavi-, Picorna of EVE integrated dsDNA germ line RNA+ve dsDNA host integration retroviral genome time integration e.g Retro- ancestral host Figure 1. Processes Involved in the Generation of EVEs, and How EVEs Can Be Used to Estimate the Age of Viruses, Using Both RNA and DNA Viruses as Examples The presence of EVEs in related species A and B and integrated into the same genomic position such that they are orthologous indicates that this integration event occurred prior to the divergence of these two species (species C is an outgroup). If it is known when species A and B diverged, then the minimum age of the insertion event can also be estimated. See Katzourakis and Gifford (2010) for more details. Figure kindly provided by Rob Gifford. As well as the expanded catalog of endogenous RNA viruses, have codiverged with these species—it is evident that a number the sequences of a number of small DNA viruses are also inte- of independent insertion events have occurred (Taylor et al., grated into host genomes, namely the circovirus (Belyi et al., 2010). Most notably, those filovirus EVEs infecting placental 2010a), geminivirus (Bejarano et al., 1996), and parvovirus (Belyi and marsupial mammals are not sister taxa, as expected given et al., 2010a; Kapoor et al., 2010) families of ssDNA viruses, as ancient virus-host codivergence, with the EVEs from marsupials well as the hepadnaviruses (dsDNA) (Gilbert and Feschotte, more closely related to the exogenous filoviruses. This is a 2010). For example, endogenous fragments of the Parvoviridae remarkable observation, given the geographic separation of are found in mammalian hosts as diverse as cats, elephants, these two groups; most known exogenous filoviruses are of platypus, and wallabies. With the continual increase in the African origin, whereas the marsupial endogenous filoviruses number of eukaryotic genome sequences, it is certain that are largely from Australian species, with a single representative more EVEs will be described. In addition, it is likely that the found in an American opossum that constitutes the closest rela- phylogenetic signal for the integration of some very ancient tive of the exogenous filoviruses (Taylor et al., 2010). Such EVEs, particularly those that occurred early in eukaryote evolution a disjunct distribution is highly suggestive of the presence of and that evolve without functional constraints, has been lost further, currently uncharacterized, exogenous filoviruses. through the accumulation of multiple nucleotide substitutions. Despite the evident under- and biased sampling of EVEs, Of those EVEs recently discovered, perhaps the most there is a striking imbalance in the taxonomic origins of those surprising were the endogenous filoviruses (Figure 3). Although EVEs derived from RNA viruses described to date; at the time novel exogenous filoviruses have recently been discovered of writing, only eight families of RNA viruses have been shown (Barrette et al., 2011), they generally infect a small number of to possess endogenous relatives, five of which represent viruses mammalian species—particularly primate and bat species