Dates from the Molecular Clock: How Wrong Can We Be?
Total Page:16
File Type:pdf, Size:1020Kb
TREE-758; No of Pages 5 Opinion TRENDS in Ecology and Evolution Vol.xxx No.x Dates from the molecular clock: how wrong can we be? Ma´ rio J.F. Pulque´ rio and Richard A. Nichols School of Biological and Chemical Sciences, Queen Mary, University of London, London, E1 4NS, UK Large discrepancies have been found in dates of measures of uncertainty): is there likely to be so much evolutionary events obtained using the molecular clock. uncertainty about molecular dating that the estimates are Twofold differences have been reported between the no longer useful? We fear that, for many current studies, dates estimated from molecular data and those from the answer is yes. However, it might be possible to gain the fossil record; furthermore, different molecular meth- extra precision using recently developed methods. The ods can give dates that differ 20-fold. New software degree of improvement depends on the pattern of variation attempts to incorporate appropriate allowances for this in the rate of molecular evolution and the availability of uncertainty into the calculation of the accuracy of date calibration points. We currently do not know enough to be estimates. Here, we propose that these innovations confident in the prospects of these new methods, and some represent welcome progress towards obtaining reliable initial results are discouraging. dates from the molecular clock, but warn that they are currently unproven, given that the causes and pattern of Why might the molecular clock have a relatively the discrepancies are the subject of ongoing research. constant rate? This research implies that many previous studies, even During the 1960s, Zuckerkandl and Pauling [1] observed some of those using recently developed methods, might that the number of amino acid differences between the have placed too much confidence in their date estimates, haemoglobin of different species had an approximately and their conclusions might need to be revised. linear relationship with the time since their common ances- tor, as estimated from the fossil record. Kimura [2,3] Molecular clocks and substitution rates explained the unexpectedly constant and rapid rate of evo- This article was motivated by the experience of a colleague lution by assuming that most substitutions have little effect who estimated the time since the separation of two taxa on fitness, contrary to the orthodoxy of the time (Box 1). This from the number of substitutions that had accumulated ’Neutral Theory’ is not a complete explanation, however. For between their DNA sequences; in other words, he was example, it predicts a constant substitution rate per gen- using the molecular clock. On submitting the work for eration, whereas empirical evidence suggests something publication, he was startled to be advised by a referee that closer to a constant rate per year [3]. Consider, as an his estimate was wrong by a factor of ten. The argument example, two related lineages: the elephants and the ele- concerned the tick rate of the molecular clock; that is, the phant shrews, which appear to have diverged 80 million rate of accumulation of substitutions per million years. years ago (Mya) [4]. They have very different generation How could the scientific community hold two such contra- times: for elephants, it is 25 years, whereas for elephant dictory opinions simultaneously? shrews, it is approximately two orders of magnitude less. Our colleague’s original calculation was based on a rate Despite their much shorter generations, the estimated rate estimated from inter-species comparisons, whereas the of substitution per year is only 2.5 times faster in elephant referee preferred a rate obtained from a pedigree study. shrews compared with that in elephants [4]. Later, we address why such discrepancies exist between In fact, Ohta [5] found that generation-time effects are estimates of substitution rates. The central lesson for more apparent in DNA-sequence data than in comparisons this article, however, is the realization that reasonable of amino-acid sequence. She explained this pattern with scientists working with the molecular clock can be using her Nearly-Neutral extension of Kimura’s theory (Box 1), estimates that are so different. If neither the fast estimate which argues that substitution rates can be elevated in nor the slow estimate were self-evidently wrong, it sug- small populations by the fixation of mildly deleterious gests that it is difficult to validate them using our knowl- mutations, and that this effect, among others [6–8], can edge of biogeography and the fossil record. Methods are compensate for longer generation times. Mutations are currently being devised that deal with uncertainty about more likely to be deleterious if they are amino-acid chan- the variation in the rate and about the timing of the ging (i.e. non-synonymous), hence the compensation calibration points. Here, we consider the prospects of between population size and generation time might be obtaining date estimates that take account of these issues more effective for amino-acid sequences than for DNA. when constructing their standard errors (or analogous The fundamental principles associated with the Nearly Neutral Theory mean that the rate of the molecular clock Corresponding author: Pulque´rio, M.J.F. ([email protected]). is known to vary between evolutionary lineages, and that Available online xxxxxx. it does so in a way that is not precisely predictable, www.sciencedirect.com 0169-5347/$ – see front matter ß 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.tree.2006.11.013 Please cite this article in press as: Pulque´rio, M.J.F. and Nichols, R.A., Dates from the molecular clock: how wrong can we be?, Trends Ecol. Evol. (2006), doi:10.1016/j.tree.2006.11.013 TREE-758; No of Pages 5 2 Opinion TRENDS in Ecology and Evolution Vol.xxx No.x How bad is the problem? Box 1. The neutral and nearly neutral rate Any assessment of the errors associated with molecular The apparently constant rate at which nucleotide substitutions have clock estimates must face the fundamental problem that accumulated among species was initially thought puzzling. This there is no definitive yardstick against which they can be pattern contrasts with the rate of morphological divergence between species, which differs considerably along different lineages checked: we do not know the true dates of the divergences [47], probably because of differences in the selection moulding the of species. There are three broad approaches to circumvent form and structure of the species. Kimura [3] resolved this problem this issue. by proposing that most molecular changes have negligible effects First, the variation in the substitution rate can be on fitness, and argued that advantageous mutations affecting detected without knowing any times, because we know morphology, and the phenotype in general, are rare. An important feature of his so-called ‘Neutral Theory’ is that the substitution rate logically that, for any two extant species, the time back to (mutations fixed per generation, as can be inferred by comparing their common ancestor is the same for both lineages. The sequences from different species) is equal to the mutation rate, mn problem with this type of analysis is low power [15]: (mutations per gamete per generation) and is independent of differences of fourfold can escape detection. Consequently, population size. This result can be explained by considering a convincing tests must combine information from many (diploid) population of size N that is produced by 2N gametes, each species to characterize the overall degree of rate variation, of which might contain a mutation; thus, 2Nmn new neutral mutations arrive in a population on average each generation. and our knowledge of a particular evolutionary branch is Because they are neutral, the mutations have just as much chance less precise. as any of those in the other 2N lineages of eventually becoming A second, rarely used approach is to check the inferred fixed, so the probability of fixation is 1/2N for each of 2Nmn substitution rate from phylogenetic studies against the mutations per generation, making the overall rate 1/2N Â 2Nmn = mn. When DNA sequence data became readily available, it was mutation rate observed in pedigrees, given that, according apparent that the rates of nucleotide substitution were more rapid to the Neutral Theory, mutation rate and substitution rate in the evolutionary lineages of organisms with shorter generation should be the same [3]. However, these comparisons sug- times. This trend was not so clearly discerned in the rates of gest large differences: the mutation rate estimated from substitution of amino acids. This incongruence was one of the reasons that led Ohta [48] to propose the Nearly Neutral Theory of pedigrees of humans is a hundredfold higher than the Evolution, which highlights the effect of population size on the substitution rate for the primate mitochondrial DNA con- spread of mildly deleterious mutations. Such alleles can increase in trol region [16]. One response is to avoid calibrating the frequency by chance, under the action of genetic drift, in a similar molecular clock using pedigree-based mutation rates, but manner to genuinely neutral alleles. If the selection acting against a it would be reassuring to understand why these rates differ new mutation is sufficiently mild, then the deleterious allele can spread to fixation. Genetic drift becomes sufficiently rapid relative to so much. selection if the population size is smaller than the inverse of the The third approach is to compare the clock-based dates selection coefficient [6]. There does appear to be an inverse with those obtained by other approaches, such as fossil or relationship between generation time and population size that biogeographical studies. These dates are themselves could obscure the generation time effect on substitution for amino imprecise: the geological dating has some error; more acids (which might be subject to selection), but not on synonymous mutations in the nucleotide sequence.