Time-Dependent Rate Phenomenon in Viruses
JVI Accepted Manuscript Posted Online 1 June 2016
J. Virol. doi:10.1128/JVI.00593-16
Copyright © 2016 Aiewsakun and Katzourakis. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license.
Downloaded from http://jvi.asm.org/ on June 7, 2016 by guest

Title
Time-dependent rate phenomenon in viruses

Running title
Time-dependent rate phenomenon in viruses

Authors
Pakorn Aiewsakun^a and Aris Katzourakis^a,#

Author Affiliations
^a Department of Zoology, University of Oxford, Oxford, UK
# Address correspondence to Aris Katzourakis, [email protected].

Word count
Abstract: 244; text: 7226

Keywords
Short-term long-term rate discrepancy, time-dependent rate phenomenon, rate of viral evolution, power law

Abstract
Among the most fundamental questions in viral evolutionary biology are how fast viruses evolve and how their rates vary among viruses and fluctuate through time. Traditionally, viruses are loosely classed into two groups: slow-evolving DNA viruses and fast-evolving RNA viruses. As viral evolutionary rate estimates become more available, it appears that the rates are negatively correlated with the measurement timescales, and that the boundary between the rates of DNA and RNA viruses might not be as clear as previously thought. In this study, we collected 396 viral evolutionary rate estimates across almost all viral genome types and replication strategies, and examined their rate dynamics. We showed that the time-dependent rate phenomenon exists across multiple levels of viral taxonomy, from the Baltimore classification viral groups to genera. We also showed that, by taking the rate-decay dynamics into account, a clear division between the rates of DNA and RNA as well as reverse-transcribing viruses could be recovered.
Surprisingly, despite large differences in their biology, our analyses suggested that the rate-decay speed is independent of viral types, and thus it might be useful for better estimation of the evolutionary timescale of any virus. To illustrate this, we used our model to re-estimate evolutionary timescales of extant lentiviruses, which were previously suggested to be very young by standard phylogenetic analyses. Our analyses suggested that they are millions of years old, consistent with paleovirological evidence, and therefore, for the first time, reconciled molecular analyses of ancient and extant viruses.

Importance
This work provides direct evidence that viral evolutionary rate estimates decay with their measurement timescales, and that the rate-decay speeds do not differ significantly among viruses despite the vast differences in their molecular features. With the rate-decay dynamics adjusted for, the division between the rates of dsDNA, ssRNA, and ssDNA/reverse-transcribing viruses could be seen more clearly than before. Our results provide a guideline for further improvement of the molecular clock. As a demonstration of this, we used our model to re-estimate the timescales of modern lentiviruses, which were previously thought to be very young, to be millions of years old. This result matches the estimate from paleovirological analyses, thus bridging the gap between ancient and extant viral evolutionary studies.

Introduction
An accurate and precise knowledge of the rate of viral evolution is central to the reconstruction of viral natural history, and is necessary for the calculation of many evolutionary parameters, from viral age estimates to population size. Generally, viruses are loosely classed into two groups according to their rates of evolution: ‘slow-evolving’ and ‘fast-evolving’ viruses.
DNA viruses, especially double-stranded DNA (dsDNA) viruses, are traditionally thought of as slow-evolving viruses. To estimate molecular evolutionary rates, the absolute timescales for the observed genetic differences are required, and these can be derived from the divergence dates and/or sampling dates of the study subjects. Many dsDNA viruses have been shown to have an extremely stable co-speciation history with their hosts, and therefore their divergence dates can be directly inferred from those of their hosts. On the basis of these observations, their rates have been estimated to be in the order of 10⁻⁷ to 10⁻⁹ nucleotide substitutions per site per year (s/n/y) (1–5), comparable to those of their hosts (6, 7).

RNA viruses are, on the other hand, typically regarded as fast-evolving viruses. RNA viruses are generally characterised by frequent cross-species transmissions in nature; as a result, it is often difficult to calibrate their evolutionary rates using host evolutionary timescales. Their rates are thus often calculated by using molecular sequences collected at different time points (heterochronous molecular datasets). In this case, the differences among sampling times provide the timescales for the observed genetic divergence. Based on these analyses, their rates are commonly estimated to be between 10⁻² and 10⁻⁵ s/n/y (8–11), 2 to 7 orders of magnitude higher than the typical rates of dsDNA viruses.

This conventional concept of a dichotomy between the rates of DNA and RNA viral evolution has recently been challenged, however, and it seems that the boundary between the rates of DNA and RNA viruses might not be as clear as previously thought (11–14).
For example, analyses of heterochronous molecular datasets of dsDNA and single-stranded DNA (ssDNA) viruses revealed that DNA viruses are in fact capable of evolving very rapidly over short timescales, with rates ranging between 10⁻³ and 10⁻⁶ s/n/y (15–19), comparable to the established rates of many RNA viruses. On the other hand, when the co-speciation assumption is applicable to RNA viruses, such as deltaretroviruses, hantaviruses, and foamy viruses, analyses suggest that their long-term rates of evolution are extremely low, estimated to be in the range of 10⁻⁷ to 10⁻⁸ s/n/y (20–23), comparable to those of dsDNA viruses. Paleovirological analyses also showed that many ancient endogenous viruses related to RNA viruses exhibit high similarity to their modern-day counterparts despite being millions of years old (24). This provides independent evidence indicating that RNA viruses can indeed evolve very slowly over geological timescales (24). By naïvely combining all of the rate estimates together, the ranges of the rate estimates of both DNA and RNA viral evolution appear to be extremely wide, spanning from 10⁻³ to 10⁻⁹ s/n/y for DNA viruses and 10⁻² to 10⁻⁸ s/n/y for RNA viruses, largely overlapping with one another. As a result, there is an emerging consensus that there is no strict division between the rates of DNA and RNA viruses (13, 14).

As viral evolutionary rate estimates become more available, it is becoming increasingly clear that viral evolutionary rates appear to vary over time, continuously decreasing with the timescale of rate measurement (25–27).
Many hypotheses have been proposed to explain this time-dependent rate phenomenon (TDRP), including temporal changes in selection pressure and/or viral biology, as well as the facts that short-term rates are methodologically prone to overestimation and the long-term ones tend to be underestimated (see (25) for review). This phenomenon may at least partly explain the observed large variation and overlap of viral evolutionary rate estimates.

Another consequence of the TDRP is that naïvely transferring the rate estimates over different timescales for evolutionary inference can severely bias the outcome (25–27). The best illustration of this is perhaps the severe underestimation of the evolutionary timescales of extant viruses by current standard phylogenetic tools. For example, while paleovirological analyses unequivocally show that simian immunodeficiency viruses (SIVs) are millions of years old (28, 29), all previous standard phylogenetic analyses, which do not account for the TDRP, suggest that they are young, sharing a most recent common ancestor (MRCA) less than a million years ago (8, 30–32). One pragmatic approach to solve this problem is to use models describing the empirical relationship between the rate estimates and their measurement timescales to correct for the TDRP effects in evolutionary analyses (27, 33). Our study of foamy viruses has shown that a power-law rate decay model can describe the TDRP very well empirically, and thus it may be useful as a tool for correcting for the effects of the TDRP (27).

In this study, we collected 396 viral nucleotide substitution rates across almost all viral molecular features and replication strategies, and examined their TDRP dynamics at various taxonomical levels by using the power-law rate decay model as the basis of our investigation.
We also examined how the rate dynamics differ among viruses, and re-examined the concept of fast- and slow-evolving viruses. Lastly, we demonstrated the use of our TDRP model by estimating the evolutionary timescale of extant lentiviruses, which has always been severely underestimated by standard phylogenetic analyses.

Materials and Methods

Data collecting and constructing phylogenetically independent datasets of rate estimates
A total of 396 viral nucleotide substitution rate estimates were collected from 133 published sources.
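
As a rough illustration of how rate estimates of this kind can be related to their measurement timescales, the sketch below fits a power-law decay by ordinary least squares in log-log space. It assumes the generic form rate = a × t^(-b), i.e. log10(rate) = intercept + slope × log10(t); this parameterisation, the function names `fit_power_law` and `predict_rate`, and all data values are hypothetical and chosen only for illustration. They are not the estimates or the fitting procedure used in this study.

```python
import math


def fit_power_law(timescales, rates):
    """Least-squares fit of log10(rate) = intercept + slope * log10(timescale).

    Assumes a generic power-law form; not the study's actual procedure.
    """
    xs = [math.log10(t) for t in timescales]
    ys = [math.log10(r) for r in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squares of x and cross-products of x and y about their means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_rate(intercept, slope, timescale):
    """Rate (s/n/y) implied by the fitted power law at a given timescale (years)."""
    return 10 ** (intercept + slope * math.log10(timescale))


# HYPOTHETICAL data: measurement timescales (years) and rate estimates (s/n/y),
# constructed to lie exactly on rate = 1e-2 * t^(-0.5).
timescales = [1e2, 1e4, 1e6, 1e8]
rates = [1e-3, 1e-4, 1e-5, 1e-6]

intercept, slope = fit_power_law(timescales, rates)
print("intercept:", intercept, "slope:", slope)
print("rate at 1e6 years:", predict_rate(intercept, slope, 1e6))
```

A negative fitted slope corresponds to rate estimates that decrease as the measurement timescale lengthens. Real rate estimates carry uncertainty and are not phylogenetically independent, so a plain regression like this one is only a starting point.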