Testing the Relationship Between Morphological and Molecular Rates of Change Along Phylogenies
Total Page:16
File Type:pdf, Size:1020Kb
Evolution, 56(10), 2002, pp. 1921±1930 TESTING THE RELATIONSHIP BETWEEN MORPHOLOGICAL AND MOLECULAR RATES OF CHANGE ALONG PHYLOGENIES LINDELL BROMHAM,1,2,3 MEGAN WOOLFIT,1,3,4 MICHAEL S. Y. LEE,3,5,6 AND ANDREW RAMBAUT7,8 1School of Biological Sciences, University of Sussex, Falmer, Brighton BN1 9QG, United Kingdom 2E-mail: [email protected] 3Department of Zoology, University of Queensland, Brisbane 4072, Australia 4E-mail: M.R.Q.Wool®[email protected] 5Department of Palaeontology, South Australian Museum, Adelaide 5000, Australia 6E-mail: [email protected] 7Department of Zoology, University of Oxford, South Parks Road, OX1 3PS, United Kingdom 8E-mail: [email protected] Abstract. Molecular evolution has been considered to be essentially a stochastic process, little in¯uenced by the pace of phenotypic change. This assumption was challenged by a study that demonstrated an association between rates of morphological and molecular change estimated for ``total-evidence'' phylogenies, a ®nding that led some researchers to challenge molecular date estimates of major evolutionary radiations. Here we show that Omland's (1997) result is probably due to methodological bias, particularly phylogenetic nonindependence, rather than being indicative of an underlying evolutionary phenomenon. We apply three new methods speci®cally designed to overcome phylogenetic bias to 13 published phylogenetic datasets for vertebrate taxa, each of which includes both morphological characters and DNA sequence data. We ®nd no evidence of an association between rates of molecular and morphological rates of change. Key words. Maximum likelihood, molecular clock, node density effect, phylogenetic independence, relative rates, substitution rate. Received January 16, 2002. Accepted June 26, 2002. The relationship between rates of phenotypic evolution and to a more general, genomewide process, either directly (e.g., genetic change has been a matter of debate for many decades, higher mutation rate producing more novel advantageous al- but in practice the rate of molecular evolution is considered leles) or indirectly (e.g., effect of population size on rate of to be effectively disassociated from rate of morphological change; Omland 1997). Such a relationship could have se- change. This assumption was challenged by a study of phy- rious implications for the way molecular data is used in evo- logenies for which both morphological characters and mo- lutionary biology. For example, it has been suggested that lecular data were available (Omland 1997). A comparison of metazoan lineages underwent an explosive burst in rates of morphological and molecular branch lengths for eight phy- morphological evolution in the early Cambrian and that this logenies, covering a broad range of taxa from ducks to dan- drove high molecular rates, making molecular dates of the delions, revealed a signi®cant association between rates of ``Cambrian explosion'' divergences between metazoan phyla molecular and morphological change, both for the tips of the unreliable (Vermeij 1996; Conway Morris 1998; Knoll and phylogenies (branches leading to terminal taxa) and for root- Carroll 1999; Lee 1999; Valentine et al. 1999). to-tip pathways for each species. This association has been However, Omland's relationship may be an artifact of the interpreted as indicating a link between the rate of morpho- methods used to measure rates of change along phylogenies, logical evolution and rate of molecular evolution (e.g., Om- rather than re¯ecting an underlying evolutionary process. land 1997; Conway Morris 1998; Lee 1999). This is a sur- There are a number of potential biases in comparing mor- prising claim, because neither molecular evolutionary theory, phological and molecular branch lengths that could cause a experimental studies, nor observation from molecular phy- spurious association: (1) estimates of amount of morpholog- logenies support a close association between morphological ical and molecular change along any given branch are non- and molecular rates of change (see Bromham and Hendy independent because they share the confounding variable of 2000). time; (2) both morphological and molecular branch lengths Selection for a particular adaptation can prompt localized may be subject to similar measurement biases; and (3) phy- increases in substitution rates in genes associated with that logenetically independent comparisons are essential to avoid trait (Gillespie 1991). But any given morphological change artifactual associations. is likely to affect only a handful of nucleotide sites in one Omland (1997) was aware of these problems and attempted or a few genes, representing an almost insigni®cant propor- to minimize their impact on his analysis. However, we believe tion of the genome. Furthermore, much of the observed mo- that the problem of phylogenetic nonindependence was not lecular change is unlinked to adaptive evolution, occurring removed by his methodology and that it may have resulted at nucleotide sites that do not affect protein or RNA products in an artifactual association between morphological and mo- or causing molecular changes that have no appreciable affect lecular rates. Comparison of amount of morphological and on ®tness (Kimura 1983). So, if there is a link between mor- molecular change along branches of a phylogeny is problem- phological and molecular rates of change, it must be due not atic because they share the confounding factor of time (Om- to a direct link between adaptation and genetic change, but land 1997). Even if morphological and molecular rates are 1921 q 2002 The Society for the Study of Evolution. All rights reserved. 1922 LINDELL BROMHAM ET AL. unlinked, it is expected that the deeper the divergence, the at overcoming phylogenetic bias. Because we ®nd no evi- more substitutions and morphological changes will have ac- dence of a link between morphological and molecular rates cumulated. Because a long branch is likely to have both more for these datasets, we conclude that the association noted by morphological changes and more molecular substitutions Omland (1997) is likely to be due to phylogenetic bias, rather than a shorter branch, an artifactual association between mor- than being indicative of an underlying association between phological and molecular change may be caused by their rates of morphological and molecular evolution. covariation with time. By comparing root-to-tip pathways, summing branch lengths from the base of the tree to the tips, METHODS Omland ensured that he compared lineages of the same age. But this approach introduced two other sources of bias: node Data density effect and phylogenetic nonindependence. Systematic datasets for vertebrate taxa were selected from Parsimony infers the minimum number of changes along the literature that contained suf®cient morphological data any given branch. Long unbroken branches will tend to be (more than 30 characters) and DNA sequence data (generally underestimated by parsimony, because they have more sites more than 600 nucleotides) for at least seven taxa (Table 1). that have undergone multiple hits that cannot be directly Phylogenetic overlap was avoided by ensuring no lineages reconstructed. Adding taxa that break up long branches (in- were included in more than one dataset (see legend to Table creasing the density of nodes) allows more state changes to 1). Sequences were aligned by eye using Se-Al (Rambaut be inferred, so the amount of change estimated for a lineage 1996), and any saturated regions that could not be con®dently is expected to rise with the number of intersecting lineages aligned (primarily from 12S rRNA) were excluded from the (Sanderson 1990). If both morphological and molecular analysis. For datasets with multiple gene sequences, sequenc- branch lengths are estimated by parsimony, then both mor- es were concatenated into a single alignment and analyzed phological and molecular rates will be underestimated for together. long unbroken branches (e.g., species-poor clades), gener- In addition to the 13 vertebrate datasets in Table 1, we ating an apparent association between molecular and mor- have reanalyzed three of the phylogenies presented in Om- phological rates. Omland (1997) addressed the problem of land (1997; Table 2). We were unable to analyze all eight of node density by using the residuals of a least-squares re- the datasets in Omland, because our methods are not appli- gression of total path length against node sums (number of cable to restriction data and some of the sequence data was nodes separating a terminal taxon from the root); however, not available on GenBank. this approach is compromised by the use of nonindependent datapoints in the regression. When root-to-tip path lengths Estimation of Rates are used, the same internal branches contribute to multiple datapoints. This increases the apparent degrees of freedom Three new methods were applied to comparing rates of of the test, arti®cially in¯ating the power of the regression. morphological and molecular change along phylogenies. The Furthermore, branches lower in the tree will have an undue ®rst step for all three methods was the construction of a in¯uence on the inferred relationship. phylogeny from DNA sequence data using maximum like- Comparisons between lineages can only be considered sta- lihood. Molecular phylogenies were constructed for each of tistically independent if their paths neither meet nor cross on the 13 datasets listed in Table 1 using PAUP* (ver. 4.0b3a, a phylogeny (Harvey and Purvis 1991). Because of the hi- Swofford 1999) with an HKY 1Gmodel of nucleotide sub- erarchical nature of phylogenies, most of the root-to-tip path- stitution (Hasegawa et al. 1985; Yang 1994) with transition: way of one species will be shared with other species in the transversion ratio (ti:tv) and gamma shape parameter (a) es- phylogeny. Using root-to-tip pathways for all species there- timated from the data. This model allows for variation in fore does not satisfy statistical independence because all but substitution rates across sites, base frequency bias, and tran- the terminal branches will contribute to more than one da- sition-transversion bias, yet is both computationally tractable tapoint.