Ree-2005-Evolution.Pdf
Total Page:16
File Type:pdf, Size:1020Kb
EVOLUTION INTERNATIONAL JOURNAL OF ORGANIC EVOLUTION PUBLISHED BY THE SOCIETY FOR THE STUDY OF EVOLUTION Vol. 59 February 2005 No. 2 Evolution, 59(2), 2005, pp. 257±265 DETECTING THE HISTORICAL SIGNATURE OF KEY INNOVATIONS USING STOCHASTIC MODELS OF CHARACTER EVOLUTION AND CLADOGENESIS RICHARD H. REE Department of Botany, Field Museum of Natural History, 1400 South Lake Shore Drive, Chicago, Illinois 60605 E-mail: rree@®eldmuseum.org Abstract. Phylogenetic evidence for biological traits that increase the net diversi®cation rate of lineages (key in- novations) is most commonly drawn from comparisons of clade size. This can work well for ancient, unreversed traits and for correlating multiple trait origins with higher diversi®cation rates, but it is less suitable for unique events, recently evolved innovations, and traits that exhibit homoplasy. Here I present a new method for detecting the phylogenetic signature of key innovations that tests whether the evolutionary history of the candidate trait is associated with shorter waiting times between cladogenesis events. The method employs stochastic models of character evolution and cladogenesis and integrates well into a Bayesian framework in which uncertainty in historical inferences (such as phylogenetic relationships) is allowed. Applied to a well-known example in plants, nectar spurs in columbines, the method gives much stronger support to the key innovation hypothesis than previous tests. Key words. Aquilegia, Bayesian inference, character mapping, diversi®cation rate, macroevolution, phylogeny. Received June 11, 2004. Accepted November 10, 2004. The tempo of evolution is a subject of general interest to al. 1988), plant latex and resin canals (Farrell et al. 1991), evolutionary biologists, from nucleotide mutation rates with- ¯oral nectar spurs (Hodges 1997), and ¯ower symmetry (Sar- in populations to the proliferation of branches on the tree of gent 2004). Alternatively, parametric approaches make it pos- life. At the macroevolutionary level, key innovation hypoth- sible to identify unusually large clades by ®tting data to pre- eses (Sanderson and Donoghue 1994) have been put forth to dictions of null models of stochastic cladogenesis, and this explain unusual disparities in species number between clades. can increase the statistical power associated with trait cor- A key innovation is commonly regarded as a biological trait relations (Slowinski and Guyer 1993; Sims and McConway that promotes lineage diversi®cation via mechanisms that in- 2003; McConway and Sims 2004). crease the rate of speciation and/or decrease the rate of ex- In contrast, it is more dif®cult to evaluate unique origins tinction (e.g., Hodges and Arnold 1995). (Hereafter, ``di- of key innovations. If a trait arises only once, distinguishing versi®cation'' will be used to convey the net change in clade its effect on diversi®cation from the effects of other, coin- diversity as a result of speciation and extinction.) Changes cidental, factors becomes problematic (see Discussion). De- in the rate of lineage diversi®cation are expected to leave an spite this, general methods for detecting shifts in diversi®- imprint in the phylogeny of the affected species, underscoring cation rate on phylogenies can be used to associate an inferred the importance of historical inference in studies of evolu- rate shift with the origin of the trait of interest. These use tionary innovations. stochastic branching models to evaluate the observed data, Key innovation hypotheses are most frequently studied by either the distribution of clade sizes on a phylogenetic tree correlation, which measures the degree to which multiple (e.g., Slowinski and Guyer 1989; Sanderson and Donoghue origins of a trait are associated with clades of unusual size. 1994; Sims and McConway 2003; McConway and Sims Correlations are appealing because they use phylogenetically 2004; Moore et al. 2004), or the distribution of the number independent datapoints, typically comparisons of sister of lineages through time (e.g., Nee et al. 1992, 1994; Paradis clades, to demonstrate a general, repeatable effect on diver- 1997). The dichotomy of approaches has led to them being sity by the candidate trait. Increasing the sample size in this termed topological and temporal methods, respectively (Chan way reduces the potential impact of other factors that affect and Moore 2002). diversi®cation rate. A popular approach has been to compare Key innovation tests based on topological data are ap- the relative sizes of sister groups using nonparametric sign pealing for a number of reasons. They are intuitive in that tests, for example, in studies of insect phytophagy (Mitter et shifts in diversi®cation rate will eventually lead to disparities 257 q 2005 The Society for the Study of Evolution. All rights reserved. 258 RICHARD H. REE in clade size. They are also pragmatic in generally requiring stochastic models of character evolution and cladogenesis in a minimum of taxon sampling in phylogenetic analyses, so continuous time. long as all relevant lineages are represented and the sizes of those lineages are known. Furthermore, they can be applied TESTING WHETHER CHARACTER EVOLUTION COVARIES WITH when no information about branch lengths is available DIVERSIFICATION RATE (Moore et al. 2004). However, methods for detecting rate shifts that rely solely on clade size data are not generally The test requires an estimate of the phylogeny of the spe- well-suited for cases in which the hypothesized key inno- cies of interest (i.e., those with the putative innovation) and their relatives. This can be a single tree, or it may be a set vation has arisen recently, with insuf®cient time having of trees representing plausible candidates (e.g., from boot- passed for the shift in diversi®cation rate to affect species strap replicates or Bayesian samples from the posterior dis- diversity to an extent that is statistically detectable. In these tribution of trees). For simplicity, consider ®rst the case of situations, they lose inferential power as a result of small a single tree. The terminal nodes on the tree are assumed to sample sizes. One exception to this is the so-called whole- represent extant species. Because we are interested in the tree approach (Chan and Moore 2002), which examines tree tempo of cladogenesis, the tree's branches should have symmetry in the context of a Markovian branching model lengths that are proportionate to time. A variety of statistical and can detect rate shifts even in small trees. tools are available for estimating the temporal durations of A potentially limiting aspect of all current key innovation branches from molecular sequence data, of which the simplest tests is their reliance on unreversed synapomorphies, mean- (and least often defensible) is to enforce the assumption of ing they tend to assume that once arisen, key innovations are a molecular clock (Felsenstein 1985; Hasegawa et al. 1985). never lost. While this may be a safe assumption when a trait However, while branch lengths should represent relative tem- is diagnostic of an entire clade, it is certainly reasonable to poral durations, they should be expressed in units of evo- expect cases in which key innovations are secondarily lost, lutionary change in the character of interest, such that the especially considering that the trait's adaptive value may be length of the tree is the total amount of change in the character contingent on a particular ecological setting or association expected over the whole phylogeny. that is transient (e.g., de Queiroz 2002). In theory, losses The steps involved in the test can be summarized as fol- provide additional information relevant to the key innovation lows. First, the history of the character of interest is mapped hypothesis, as they should be associated with decreases in on the tree, identifying partitions of the phylogeny over which diversi®cation rate just as gains are expected to cause in- the candidate trait was present and absent. Next, the number creases in rate. Thus, it is clearly desirable to have a phy- of branching events and relative time in each partition are logenetic method for testing key innovation hypotheses that used to estimate the difference in rates of diversi®cation be- accommodates bidirectional evolution in the trait of interest. tween character states. This difference in rates, averaged over A related problem with existing tests concerns their ten- alternative histories in proportion to their posterior proba- dency to rely on only the most parsimonious reconstructions bilities, is taken as the test statistic. Finally, the signi®cance of character change, that is, minimum-length solutions that of the test statistic is evaluated by generating its expected are known to be biased (see Harvey and Pagel 1991) and that density under a null model of waiting times between clad- also ignore the variance of the estimate (Nielsen 2002). His- ogenesis events. torical inference invariably introduces some uncertainty into Consider the two hypothetical phylogenies in Figure 1. considerations of character evolution, the magnitude of which Each has an identical binary character of which state 1 is depends on how much information we have about the past. hypothesized to be a key innovation. The trees have the same The increasing prevalence of Bayesian statistics and sto- topology, but differ in their branch lengths. By convention, chastic estimation methods is enabling uncertainty to be in- the tree topology is denoted t, the vector of branch lengths corporated into many aspects of phylogenetic inference (e.g., v, and the character data at the tips x. Given these data, the see Huelsenbeck et al. 2000). Here, the same approach is objective is to determine in each case whether a higher di- applied to the study of key innovations. versi®cation rate is associated with the history of state 1, the In summary, existing phylogenetic tests of key innovation candidate innovation, compared to state 0. Our ®rst clue to hypotheses are not ideal for studying singular instances, the outcome is found in the relative durations of terminal smaller clades, or traits that exhibit homoplasy.