What drives biological diversification? Detecting traits under species selection
by
Richard Gareth FitzJohn
B.Sc., Victoria University of Wellington, 2000
M.Sc. (Hons), Victoria University of Wellington, 2003
A thesis submitted in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
in
The Faculty of Graduate Studies (Zoology)
The University of British Columbia (Vancouver)
August 2012
© Richard Gareth FitzJohn, 2012

Abstract
Species selection, defined as heritable trait-dependent differences in rates of speciation or extinction, may be responsible for variation in both taxonomic and trait diversity among clades. While initially controversial, interest in species selection has been revived by the accumulation of evidence of widespread trait-dependent diversification. In my thesis, I developed and applied a number of new likelihood-based methods for investigating species selection by detecting the association between species traits and speciation or extinction rates. These methods are explicitly phylogenetic and incorporate simple, but commonly used, models of speciation, extinction, and trait evolution; I assume throughout that speciation and extinction can be modelled as a birth-death process where rates depend in some way on one or more traits, and that these traits evolve under a Markov process. In particular, I extended the BiSSE (Binary State Speciation and Extinction) method to allow use with incompletely resolved phylogenies, and developed analogous methods for multi-state discrete traits or combinations of binary traits (MuSSE; Multi-State Speciation and Extinction) and quantitative traits (QuaSSE; Quantitative State Speciation and Extinction). I tested the statistical performance of the methods using simulations, investigating their performance with variation in tree size, degree of resolution, number of traits, and departure from the true model. I used each method to consider a different biological question; I found that sexual dimorphism was short-lived but associated with elevated rates of speciation in shorebirds; that solitariness and monogamy are associated with decreased speciation rates in primates (showing that a previous analysis was robust to treating both traits simultaneously); and that body size was a poor predictor of speciation rates in primates. In chapter 5, I extended this analysis of body size to all mammals, and investigated if within-lineage increases in body size (Cope’s rule) were balanced by species selection against large bodied species. I found little support for this hypothesis, with clade-specific differences in the direction of species selection and idiosyncratic variation in speciation rates. Together, the methods I have developed allow testing of long-standing hypotheses about causes of variation in biological diversity.
Preface
Several chapters from my thesis have been published elsewhere:
Chapter 2 has been previously published as: FitzJohn R.G., Maddison W.P., and Otto S.P. 2009. Estimating trait-dependent speciation and extinction rates from incompletely resolved phylogenies. Systematic Biology 58:595–611.
I derived most of the mathematical results, developed the software, carried out the analysis and wrote the manuscript. Wayne P. Maddison and Sarah P. Otto had the original idea for the “skeletal tree” and “unresolved clades” methods, respectively, and both also helped with writing, editing, and guidance.
Chapter 3 has been previously published as: FitzJohn R.G. 2010. Quantitative traits and diversification. Systematic Biology 59:619–633.
Sarah P. Otto provided guidance and helped edit the manuscript.
Chapter 4 has been accepted for publication at Methods in Ecology and Evolution, pending minor revisions. I am the sole author of this manuscript. Sarah P. Otto provided guidance and advice, and helped edit the manuscript. The R package this paper is based on is my own work, but contains methods by Emma E. Goldberg (University of Illinois at Chicago), Karen Magnuson-Ford, and Sarah P. Otto.
Chapter 5 was coauthored with Nick D. Pyenson (Smithsonian Institution) and Sarah P. Otto. I developed the idea, ran the analyses and wrote the first version of the paper. Nick D. Pyenson provided data and advice about Cetacea. Sarah P. Otto provided guidance, advice, and editing.
Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication
1 Introduction
  1.1 Detecting species selection
  1.2 Speciation, extinction, & phylogenies
  1.3 Structure & contents of this thesis
2 Estimating Trait-Dependent Speciation & Extinction Rates From Incompletely Resolved Phylogenies
  2.1 Summary
  2.2 Introduction
  2.3 BiSSE for complete phylogenies
  2.4 Likelihood calculations for incompletely resolved phylogenies
  2.5 Bayesian inference
  2.6 Numerical methods & application
  2.7 Results
  2.8 Application to shorebird data
  2.9 Discussion
3 Quantitative Traits & Diversification
  3.1 Summary
  3.2 Introduction
  3.3 Character evolution & diversification
  3.4 Likelihood calculations
  3.5 Simulation results
  3.6 Application to primate body size data
  3.7 Discussion
4 Diversitree: Comparative Phylogenetic Analyses of Diversification in R
  4.1 Summary
  4.2 Introduction
  4.3 The methods
  4.4 The approach
  4.5 The MuSSE model
  4.6 Simulation test assessing the power of MuSSE
  4.7 Closing comments
5 Survival of the Littlest? Species Selection, Cope’s Rule, & Mammal Body Size
  5.1 Introduction
  5.2 Results & discussion
  5.3 Methods
6 Conclusion
  6.1 Some issues with these methods
  6.2 Future directions
  6.3 Conclusion
Bibliography
Appendices
A Supplementary Information to Chapter 2
  A.1 Root-state calculations
  A.2 Character-independent model
B Supplementary Information to Chapter 3
  B.1 Single character derivation
  B.2 Multivariate character derivation
C Supplementary Information to Chapter 4
  C.1 Tuning diversitree
  C.2 A faster algorithm for BM & OU likelihood calculations
  C.3 MuSSE & multitrait diversification in primates
D Supplementary Information to Chapter 5
  D.1 Partition compositions
List of Tables

Table 3.1 Summary of model fits for the association between body size and diversification for primates
Table 4.1 Summary of model types available in diversitree
Table 5.1 Properties of the 10 clades used
Table 5.2 Properties of partitions recovered by Medusa
List of Figures

Figure 1.1 Lineage-through-time plots for simulated trees
Figure 2.1 Ways that phylogenetic information may be incomplete
Figure 2.2 Schematic of the BiSSE method with and without full phylogenetic knowledge
Figure 2.3 Posterior probability densities for BiSSE parameters
Figure 2.4 Posterior probability densities for diversification rates
Figure 2.5 Uncertainty around BiSSE parameter estimates as a function of phylogenetic knowledge
Figure 2.6 Uncertainty around diversification rate estimates as a function of phylogenetic knowledge
Figure 2.7 Phylogenetic tree of the 350 species of shorebirds (Charadriiformes)
Figure 2.8 Association between sexual dimorphism and diversification and character transition rates in shorebirds
Figure 3.1 Possible ways a lineage extant at time t + ∆t might go extinct
Figure 3.2 Possible ways a lineage extant at time t + ∆t might lead to exactly the clade N as observed
Figure 3.3 Power to detect differential speciation and extinction with QuaSSE on simulated phylogenies
Figure 3.4 Representative speciation and extinction function fits
Figure 3.5 Power to detect trait-dependent speciation and directional character change
Figure 3.6 Phylogenetic tree of the primates
Figure 3.7 Primate speciation and extinction model fits
Figure 4.1 Uncertainty around multitrait-MuSSE parameter estimates as a function of tree size and number of traits
Figure 4.2 Power and error rates of multitrait MuSSE
Figure 4.3 “Main effects” of monogamy and solitariness on speciation rate in primates
Figure 5.1 Distribution of mammal body masses
Figure 5.2 Mammal supertree showing the 10 focal clades
Figure 5.3 Inferred body size/speciation relationships in 10 mammal clades
Figure 5.4 Slopes of speciation rate/body mass relationships within Medusa-derived partitions
Figure 5.5 Relationship between speciation rate and body size within mammals, across partitions with different diversification rates
Figure 5.6 Fits for the multi-slope analyses
Figure 5.7 Regression of body mass against length for cetacea
Figure C.1 Performance of Brownian motion likelihood algorithms
Figure C.2 Primate phylogeny, showing distribution of monogamy and solitariness among species
Acknowledgements

First, a big thank you to my supervisor, Sally Otto. Sally has been a fantastic mentor, and provided me with enough freedom to let me focus on whatever I found interesting, while simultaneously providing enough guidance so I never felt too lost. Her ability to encourage, enlighten, and (at times) coax has been critical to the completion of the work here.

I also thank my committee, Wayne Maddison, Dolph Schluter, and Jeannette Whitton, for their helpful advice at a number of points in this thesis and for always being willing to stop and chat with no notice. Other faculty at UBC and SFU were always generous with their time and ideas; Arne Mooers and Michael Whitlock in particular. The SOWD/Delta-Tea group provided a great venue for airing and refining half-baked ideas.

The Otto lab has been a fantastic place to work over the last five years, thanks to Dilara Ally, Aleeza Gerstein, Philip Greenspoon, Kay Hodgins, Crispin Jordan, Liz Kleynhans, Nathan Kraft, Leithen M’Gonigle, Karen Magnuson-Ford, Itay Mayrose, Jasmine Ono, Kate Ostevik, Alirio Rosales, Michael Scott, and Thor Veen. Thanks also to the regular extras at lab meetings (especially over the last year): Florence Débarre, Kim Gilbert, Fred Guillaume, Katie Lotterhos, and Alana Schick. The diversity of ideas in the lab has helped me keep my interests broad.

This thesis is primarily theoretical, and I have benefited from data from other researchers. Gavin Thomas and Terje Lislevand, Jordi Figuerola, and Tamás Székely have made the data used in chapter 2 freely available. Arne Mooers, David Redding, and Rutger Vos generously shared data for the primate analysis in chapters 3 and 4, and Tyler Kuhn generated the trees with simulated branch lengths in chapters 3–5. Kate Jones (and coauthors) provided the trait data used in chapter 5. More generally, I thank those involved in Open Data efforts, including Heather Piwowar and Tim Vines at UBC.

In addition to those listed above, a number of people provided feedback on various chapters directly: David Bapst, Folmer Bokma, Will Cornwell, Luke Harmon, Dan Rabosky, and Graham Slater. Several people also contributed specific ideas. Rick Ree initially suggested expanding BiSSE to account for trees containing exemplars, which inspired most of chapter 2. Wayne Maddison suggested the connection to species selection, which helped me clarify my thinking on my research. Graham Slater suggested I investigate the power of MuSSE, which turned out to be more interesting than I thought it would be. Matt Pennell suggested creating figure C.1, which made me feel a lot better about how painfully slow most of the models in diversitree are. I have also benefited tremendously from discussions with friends, collaborators, and colleagues from outside of UBC: Jeremy Beaulieu, Juan-López Cantalapiedra, Emma Goldberg, Gene Hunt, Boris Igić, Marc Johnson, Marcos Alexandrou, Lynsey McInnes, Brian O’Meara, Nick Pyenson, Dan Rabosky, Stacey Smith, and Amy Zanne.

I thank everyone who has helped by testing the perpetual beta that is “diversitree”. Special thanks to Emma Goldberg, Karen Magnuson-Ford, and Sally Otto for contributing code, to Wayne Maddison for developing the interface with Mesquite, and to Emmanuel Paradis for developing the “ape” package without which diversitree would never have been written.

This work was financially supported by a University Graduate Fellowship from the University of British Columbia, a Vanier Commonwealth Graduate Scholarship from NSERC, and a Capability Fund Grant from Manaaki Whenua/Landcare Research. Vast amounts of computing time were provided by the Zoology Computing Unit and Westgrid. Particular thanks to Alistair Blachford, Andy LeBlanc, and Richard Sullivan for running the Zoology Computing Unit, without which most of the research here would not have been possible.

UBC has been a fun and ridiculously interactive place to be, thanks in part to the people above, and Alistair Blachford, Dave Allen, Rose Andrews, Rowan Barrett, Ella Bowles, Alan Brelsford, Carla Crossman, Josh Chang Mell, Dan Ebert, Edd Hammill, Matt Herron, Will Iles, Andy LeBlanc, Robin LeCraw, Julie Lee-Yaw, Andrew MacDonald, Janet Maclean, Blake Matthews, Jon Mee, JS Moore, Jana Petermann, Jess Purcell, Seth Rudman, Kieran Samuk, Alana Schick, Laura Southcott, Travis Ingram, Kathryn Turner, Jabus Tyerman, and Sam Yeaman. Thanks also to everyone I’ve played boardgames with, drunk coffee or beer with, eaten a burger at the Fringe with, climbed with, or hiked with. Copying Jon Mee, I thank the beer barons, reading group and seminar organisers, and retreat planners who make UBC Zoology/Biodiversity the place it is.

I thank my family for always being supportive of my research. Also, for not giving me too much of a hard time for leaving a perfectly good job and going back to school. Finally, thank you Andréa, for help in all sorts of ways.
To my grandparents
Anthony & Pamela FitzJohn
Dewi & Eruwen Davies
Chapter 1: Introduction
“At yet higher levels, the species and the community, natural selection obviously must occur. Species evolve to survive in a certain environmental range, and if the environment should suddenly change, some species will become extinct but others will survive.” Lewontin (1970), p. 15
The tree of life is remarkably uneven. The Passeriformes consist of over 5,000 species, but the deepest division in this clade is between two species of New Zealand wrens (the family Acanthisittidae) and the rest of the order. Similarly, a single species (Amborella trichopoda) is probably the sister to the remaining 250,000 species of angiosperms (Moore et al., 2007). Such asymmetries in diversity are highly unlikely to have occurred by chance alone. One explanation is that species traits may cause differences in rates of speciation and extinction. By analogy to natural selection (trait-based differential survival and reproduction of individuals within populations), trait-dependent speciation and extinction can be thought of as “species selection”.

This idea has a long history, dating back as far as Lyell (1832), who was the first to argue that differential extinction among species may shape patterns of diversity (Van Valen, 1975). The concept is implicit in E.O. Wilson’s “taxon cycle” (Wilson, 1959, 1961) and was discussed (but dismissed) by Fisher (1958, p. 50) and Williams (1966, p. 115). Lewontin (1970) clarified the idea considerably by drawing parallels to other levels of selection, and the term “species selection” itself was coined by Stanley (1975b). While there has been little disagreement that species traits may affect their rates of speciation or extinction, species selection became controversial immediately after being proposed.
First, there has been disagreement about the nature of traits required to call trait-dependent speciation and extinction “species selection” (e.g., table 1 in Lieberman and Vrba, 2005). A “narrow-sense” view of species selection argues that the trait leading to differential speciation or extinction must be apparent only as an emergent property of species (e.g., Vrba and Gould, 1986). Examples of such traits include geographic range, population structure, and variability of traits. This view is motivated by the idea that for selection to operate at the species level, traits must be apparent at this same level. Differential speciation and extinction due to traits apparent below the species level is referred to as “species sorting”. The alternative, “broad-sense”, view of species selection argues that any trait, whether it is the property of individuals or of the species as a whole, can be involved in species selection (e.g., Lloyd and Gould, 1993). What matters is that fitness is emergent at the level of species; species give birth (speciation) and die (extinction), and species selection operates on any trait that affects emergent fitness. Increasingly, the distinction between emergent and aggregate traits has been considered not sufficiently informative to warrant the recognition of two different processes (arguments in Damuth and Heisler, 1988; Grantham, 1995; Okasha, 2006; Jablonski, 2008b). As such, “emergent fitness” is sufficient for species selection to operate, regardless of the level at which the trait is apparent. That said, there is an interesting consequence of the emergent trait criterion; as individual selection cannot act on emergent traits, if species selection affects the distribution of traits among species then macroevolution may be fundamentally decoupled from microevolution (Erwin, 2010). If species selection on emergent traits is widespread then the distribution of any trait that “hitch-hikes” with these will be unpredictable from microevolutionary studies.
Second, species selection has been argued to be too slow, relative to individual-level selection, to be a significant force in shaping the distribution of traits (Fisher, 1958, p. 50; Williams, 1966, p. 115; Maynard Smith, 1983; Rice, 1995). Strong directional selection at the individual level should always be able to trump species selection due to the large differences in their relative speeds. That is, the number of selective deaths of individuals
(and the resulting opportunity for selection to act) will outweigh the number of selective deaths of species. However, this view eventually softened. Strong sustained directional selection may not be common, with changes in both strength and sign (e.g., Grant and Grant, 2002; Siepielski et al., 2009), and stabilising selection may be pervasive (e.g., Estes and Arnold, 2009), leaving room for species selection to operate. Brown (1995) argued that if the rate of environmental change can outpace rates of individual adaptation, then species selection may play an important role during mass extinction; this frames the problem as a purely empirical question of how often such conditions are met. Maynard Smith (1989) argued that species selection has had a major effect on the distributions of different types of organisms, even if a trivial effect on the production of adaptations within organisms. Jablonski (2000) put it nicely when he said “species sorting may not construct a complex eye or a long neck, but it may determine how many species possess complex eyes or long necks over evolutionary timescales” (p. 38).

Finally, species selection has also been controversial due to its association with other contentious ideas; particularly group selection and punctuated equilibrium. Early arguments both for and against species selection often invoked traits that decreased the probability of extinction (e.g., Leigh, 1977). However, the extinction of a species requires the death of all individuals of that species (in contrast with speciation, which is entirely decoupled from the birth of individuals). Therefore, any trait that increases the probability of species extinction is perfectly correlated with decreased individual survival (at least in small populations), and individual selection would tend to oppose that trait, without having to invoke species selection.
Similarly, the association with punctuated equilibrium and its perceived challenge to neo-Darwinian orthodoxy (Eldredge and Gould, 1972; Gould and Eldredge, 1977) did not initially endear the theory to many biologists (e.g., Lande, 1980; Charlesworth, 1981; Charlesworth et al., 1982).

As a result of this opposition, species selection has not been a popular concept among evolutionary biologists. While debate about the nature of species selection remained active in philosophy journals, I was unable to find any primary research explicitly framed as species selection between 1987 and 1999. In a rare primary paper that even mentioned species selection during this period, Herrera (1992) avoids the term not because it is an unlikely process, but because it is controversial (p. 423). This view persists; an anonymous reviewer of chapter 3 commented:
“Why start a paper that has very broad implications out on the narrow and controversial topic of species selection? [or at least cite Coyne and Orr’s book for a counterargument] Whatever evidence there is for it is limited, yet there is little doubt that traits can affect speciation rates and there are many implications of that connection irrespective of whether selection on clades has molded biodiversity in some major way.”
A survey of 15 biologists by Grimshaw (2001, p. 17) found that most regarded species selection as an unlikely phenomenon, too weak to be generally important for biology, although their objections were vague. Species selection seems to have become theoria non grata among most biologists, though often without a firm idea why.

Recently, species selection appears to have gained favour amongst evolutionary biologists. A series of influential reviews has been published, each making the case that trait-dependent differences in speciation and extinction rates can and should be considered species selection (Coyne and Orr, 2004 [1]; Jablonski, 2008b; Rabosky and McCune, 2009). At least 27 primary papers framing their work as species selection have been published since 1999 (16 since 2010). The degree of embrace varies from mentioning species selection as one possible interpretation (e.g., Pillon et al., 2010; Sallan et al., 2011) to declaring it in the title of the paper (e.g., Duda and Palumbi, 1999; McGlone et al., 2001; Goldberg et al., 2010; Simpson, 2010; Eastman and Storfer, 2011). Despite claims of controversy, I found no articles by biologists arguing against species selection over the last decade, suggesting that even if it is not fully accepted, opposition has waned considerably relative to 20 years ago. My personal view, and the view taken in this thesis, is to define species selection as any trait-dependent differences in speciation and extinction where the trait is “heritable” (i.e., does not drastically change at speciation).

[1] The quote above cites Coyne and Orr (2004) as opposing species selection. However, Coyne and Orr (2004) write “Thus, if we define species selection on a trait as repeatable effects of that trait on the rate of diversification of species possessing it, we avoid unproductive arguments about whether or not selection acts on emergent properties.” (p. 444, their emphasis) and “we regard most of the examples given in Table 12.2 as true cases of species selection” (p. 445).
1.1 Detecting species selection
If species selection is a potentially important biological phenomenon, how do we detect it? More specifically, how do we detect whether a trait is associated with increased or decreased rates of speciation or extinction? The most direct route would be to know the state of a species immediately before speciation or extinction. However, this requires a more complete fossil record, phylogeny, and knowledge of ancestral states than we are ever likely to have. Instead, we must infer the effect of traits on speciation by looking for signatures in the patterns of diversity and trait distributions.

The first attempt to measure diversity-trait associations statistically (rather than descriptively) using extant species was Mitter et al. (1988), who introduced “sister-clade comparisons”. The idea is simple; by definition, sister clades have diversified over the same length of time. If each clade is characterised by a different state of a trait thought to affect diversification rates, then differences in diversity might be ascribed to that trait. If enough such diversity–trait comparisons show the same relationship, then there is statistical evidence for the nonrandom association of a state and increased rates of diversification. Such sister-clade comparisons have been used to detect trait-dependent differences in diversification in a wide range of species (e.g., table 12.2 in Coyne and Orr, 2004) and have been the dominant approach since their introduction.

However, sister-clade comparisons discard most of the potential information available from phylogenies. Clades consisting of hundreds of species are collapsed into a binary categorisation: are they larger or smaller than their sister clade? Any information about the relative timings of speciation events is discarded (e.g., Rosenzweig, 1996). If the trait varies among species within a clade, that clade cannot easily be used. Furthermore, the effects of differential speciation and differential extinction cannot be disentangled. One example that illustrates these issues is the long-standing hypothesis of species selection against asexual reproduction (Stanley, 1975a). It is unclear if asexual species tend to speciate less or go extinct more than sexual species (Schwander and Crespi, 2009), and sister-clade approaches cannot distinguish between these possibilities. Furthermore, as asexual species are often “tippy” (i.e., recently diverged species-poor groups scattered within a broader clade) most asexual species are sister to a single sexual species, which is uninformative in a sister-clade analysis. In such cases, methods that make better use of available phylogenetic information are needed to address questions about traits and diversification.
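The logic of a sister-clade comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it combines comparisons with a one-sided sign test, which is one simple choice and not necessarily the test used in the studies cited above; the function name `sister_clade_sign_test` and the species counts are hypothetical and invented for this example. The sketch also makes the loss of information visible, since each pair of clade sizes is reduced to a single win or loss.

```python
from math import comb


def sister_clade_sign_test(pairs):
    """One-sided sign test over sister-clade pairs.

    pairs: iterable of (n_with_trait, n_without_trait) species counts,
    one tuple per sister-clade comparison. Tied pairs carry no
    information about direction and are dropped.

    Returns (k, n, p): k pairs where the trait-bearing clade is larger,
    n informative pairs, and p = P(X >= k) for X ~ Binomial(n, 1/2).
    """
    informative = [(a, b) for a, b in pairs if a != b]
    n = len(informative)
    # Clade sizes are collapsed to a binary outcome here.
    k = sum(1 for a, b in informative if a > b)
    p = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return k, n, p


if __name__ == "__main__":
    # Hypothetical species counts, for illustration only.
    example_pairs = [(10, 2), (5, 1), (7, 3), (4, 4), (2, 9), (8, 1)]
    k, n, p = sister_clade_sign_test(example_pairs)
    print(f"{k} of {n} informative pairs favour the trait; one-sided p = {p:.4f}")
```

Because only the direction of each difference enters the test, a pair of clades with 500 and 2 species counts exactly as much as a pair with 3 and 2, and nothing in the calculation can separate a speciation effect from an extinction effect.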
1.2 Speciation, extinction, & phylogenies
Beyond sister-species comparisons, the temporal position of nodes in a time-calibrated phylogeny can provide information about the timing of speciation; this information has been widely used to infer rates of speciation. The simplest model is the pure birth, or “Yule”, process (Yule, 1925), in which new species arise through speciation at rate λ and there is no extinction. This leads to exponential growth in species diversity at rate λ, which can be visualised by plotting the number of lineages over time on a semi-log plot, where the log-number of lineages is expected to increase linearly over time (figure 1.1). If a tree with N species has a total branch length of s since the root node (i.e., the summed length of all branches in the tree), the maximum likelihood speciation rate estimate is (N − 2)/s (Nee, 2001).

The pure birth process is a special case of the “birth-death” process, in which lineages give rise to new species at rate λ and go extinct at rate µ. With extinction, we must make a distinction between lineages that survive to the present (and are therefore present in a phylogeny of all extant species: the “reconstructed phylogeny” of Nee et al., 1994b) and those that go extinct. Under this process, for most of the history of a clade, lineages in the reconstructed phylogeny accumulate at the diversification rate, λ − µ, with an upturn in the log-number of lineages over time close to the present (figure 1.1). Loosely, this occurs because recently diverged species have not yet had time to go extinct. This temporal difference in slope suggests that it may be possible to estimate both speciation and extinction rates from extant taxa only (Harvey et al., 1994). Likelihood calculations under this model were derived by Nee et al. (1994a,b).

[Figure 1.1 appears here: three lineage-through-time panels, a) pure-birth, b) birth-death, and c) birth-death (expected), with slopes labelled λ and λ − µ. x-axis: time (before present); y-axis: number of lineages (log scale).]

Figure 1.1: Lineage-through-time plots for pure-birth (a) and birth-death (b) processes, which count the lineages surviving to the present over time. Fifty simulated phylogenies are represented by the thin grey lines. The dashed red line is the expected number of lineages that will survive to the present. In the pure-birth process, the slope of the expected log number of surviving lineages over time is λ over the entire time course. In the birth-death process, this slope is λ − µ initially, increasing towards the present to reach a slope of λ (grey lines panel c). The number of lineages at a given time (including those that subsequently go extinct) is larger than this (teal upper line in c). Figure adapted from Harvey et al. (1994).

Even if we can estimate speciation and extinction rates from phylogenies, linking these rates to differences in traits is nontrivial. If we knew the character state at every point along the phylogeny we could construct lineage-through-time plots for each state and compare these. While these states are never directly known, some methods have attempted to combine reconstructed ancestral states with speciation rate estimation (Paradis, 2005; Ree, 2005; Freckleton et al., 2008). However, these methods ignore the trait-dependence of speciation or extinction when estimating the rate at which the state changes, introducing biases in both the transition rates (Maddison, 2006), and the estimated ancestral states (Goldberg and Igić, 2008).

Recently, the “BiSSE” (Binary State Speciation and Extinction) method of Maddison et al. (2007) was developed to account simultaneously for both the evolution of a trait and that trait’s effect on speciation and extinction. This is a likelihood method that computes the probability of a phylogenetic tree and the distribution of species traits without directly reconstructing ancestral states, avoiding the limitations above. BiSSE is the foundation for the techniques developed in this thesis and is described more fully in the next chapter.
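As a rough check on the pure-birth estimate given earlier in this section, (N − 2)/s, the following Python sketch simulates Yule trees and recovers the speciation rate from the number of tips and the total branch length. It is a minimal illustration under stated assumptions: each tree starts at the root with two lineages and is grown for a fixed length of time, and only lineage counts are tracked (no topology is built). The function names `simulate_yule` and `yule_mle`, the rate 0.1, and the duration 60 are hypothetical choices for this example and are not taken from diversitree.

```python
import random


def yule_mle(n_tips, total_branch_length):
    """Maximum likelihood speciation rate under pure birth: (N - 2) / s."""
    if n_tips < 2 or total_branch_length <= 0:
        raise ValueError("need at least two tips and positive branch length")
    return (n_tips - 2) / total_branch_length


def simulate_yule(lam, t_max, rng):
    """Grow a pure-birth clade from a two-lineage root for time t_max.

    Returns (n_tips, total_branch_length).
    """
    n, t, s = 2, 0.0, 0.0
    while True:
        # With n lineages, the wait to the next speciation is Exp(n * lam).
        wait = rng.expovariate(n * lam)
        if t + wait > t_max:
            # No further event before the present; extend all branches.
            s += n * (t_max - t)
            return n, s
        s += n * wait
        t += wait
        n += 1


if __name__ == "__main__":
    rng = random.Random(1)
    true_rate = 0.1  # hypothetical value, for illustration only
    trees = [simulate_yule(true_rate, 60.0, rng) for _ in range(20)]
    # Pool events and branch length across independent trees.
    events = sum(n - 2 for n, _ in trees)
    length = sum(s for _, s in trees)
    print(f"true rate {true_rate}, pooled estimate {events / length:.4f}")
```

The estimator is simply the number of speciation events after the root divided by the time over which they could have occurred. With extinction added, the same count of events in the reconstructed tree would no longer identify λ on its own, which is why the birth-death likelihood is needed.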
1.3 Structure & contents of this thesis
In this thesis, I develop and apply a number of new methods for investigating species selection by detecting the association between species traits and speciation or extinction rates. In chapter 2, I derive a method for estimating trait-dependent speciation and extinction rates from incompletely resolved phylogenies, extending the BiSSE method and allowing it to be used with currently available phylogenies where an assumption of complete sampling is often violated. In chapter 3, I develop a method (QuaSSE: Quantitative State Speciation and Extinction) for detecting an association between continuous traits (such as body size) and speciation or extinction rate. With continuous traits, the associations can take any form, and I explore the degree of detail that we can reasonably expect to recover under optimal circumstances. In chapter 4, I describe the R (R Development Core Team, 2012) package, “diversitree”, that I developed during my thesis and which implements BiSSE, QuaSSE, and other comparative phylogenetic methods. I also describe an extension of BiSSE to multiple states and multiple traits, and discuss ways in which models of discrete character evolution can be parametrised more intuitively. In chapter 5, I use QuaSSE to test the long-standing hypothesis that while body size tends to increase over time (Cope’s rule), this is opposed by species selection. If both of these processes are operating, then the distribution of mammalian body size might represent a trade-off between two levels of selection: individual selection for larger body size and species selection for smaller body size. Finally, in chapter 6, I close with a discussion of the limitations of the methods used in the thesis, and suggest some future directions.
É hZ£±u§ ó E«±Z± T§Z±-Du£uou± S£uhZ± b Eì±h± RZ±u« F§ Ih£u±uí Ru«êuo Píuu«
ó.Õ «¶Z§í
Species traits may inuence rates of speciation and extinction, aecting both the pat- terns of diversication among lineages and the distribution of traits among species. Ex- isting likelihood approaches for detecting dierential diversication require complete phylogenies; i.e., every extant species must be present in a well-resolved phylogeny. We developed two likelihood methods that can be used to infer the eect of a trait on spe- ciation and extinction without complete phylogenetic information, generalising the re- cent Binary State Speciation and Extinction (BiSSE) method. Our approaches can be used where a phylogeny can be reasonably assumed to be a random sample of extant species, or where all extant species are included but some are assigned only to terminal unresolved clades. We explored the eects of decreasing phylogenetic resolution on the ability of our approach to detect dierential diversication within a Bayesian framework, using simulated phylogenies. Dierential diversication caused by an asymmetry in spe- ciation rates was nearly as well detected with only ¢þÛ of extant species phylogenetically resolved as with complete phylogenetic knowledge. We demonstrate our unresolved clade method with an analysis of sexual dimorphism and diversication in shorebirds (Charadriiformes). Our methods allow for the direct estimation of the eect of a trait on speciation and extinction rates using incompletely resolved phylogenies.
2.2 Introduction
Just as differences in traits may affect the relative survival and reproductive success of individuals, traits may affect the relative rate at which lineages go extinct or speciate (Stanley, 1975b; Coyne and Orr, 2004; Ricklefs, 2007). Sister-clade comparisons (Barraclough et al., 1998) have been widely used to detect traits that are correlated with differential diversification. Using this method, traits that have been found to have a significant impact on diversification rates include diet in insects (Mitter et al., 1988; Farrell, 1998), latitude in birds and butterflies (Cardillo, 1999), mating system in birds (Mitra et al., 1996), and sex allocation in flowering plants (Heilbuth, 2000). These analyses have often been framed as tests of whether a character is a “key innovation”; i.e., has a particular character state led to elevated rates of diversification? More recently, a variety of statistical approaches that directly estimate speciation rates have been developed that incorporate phylogenetic tree topology and the pattern of branching times (e.g., Pagel, 1997; Paradis, 2005; Ree, 2005; Maddison et al., 2007). These approaches allow for greater statistical power than sister-clade comparisons, because they incorporate more information about the patterns of diversification. Among these is the BiSSE method (Binary State Speciation and Extinction; Maddison et al. 2007), a whole-tree likelihood method that can be used to detect the effect of a trait on diversification, where the trait can be classified into two states.
The BiSSE method as formulated by Maddison et al. (2007) assumes that the phylogenetic tree is complete and fully resolved; i.e., the tree must include every extant species. It also assumes that all character state information is known. These assumptions currently restrict its applicability, as few published phylogenies are both complete to the species level and large enough to detect differential diversification.
Without appropriate correction, BiSSE will not produce valid likelihoods for incompletely resolved trees. Incomplete phylogenetic coverage decreases the apparent number of events over a phylogeny; there are fewer inferred speciation and character change events. Because of this, the BiSSE likelihood surface shifts to favour lower rates of diversification and character change. Furthermore, inferred phylogenies that include only a fraction of extant species tend to have longer terminal branches (figure 2.1 b), and as a result the estimated extinction rates approach zero because there is a smaller increase in the number of lineages in time near the present (Nee et al., 1994b).
Similar limitations have been overcome in likelihood approaches that estimate speciation and extinction rates when these rates do not depend on a character (character-independent diversification). The character-independent likelihood method of Nee et al. (1994b) includes corrections that assume that the species present in a phylogeny represent a random sample of extant species from a clade by incorporating the sampling process into the likelihood calculations. Recently, Bokma (2008a) developed a Bayesian approach for estimating character-independent diversification rates that treats the branching times for missing taxa as additional parameters to be estimated. In these studies, because speciation and extinction rates do not depend on a species’ character state, only the branching times are required. However, if speciation and extinction rates depend on a character’s state, then branching times are insufficient because the topology of the tree will depend on how the character evolves.
Here, we extend BiSSE to allow estimation of character-dependent speciation and extinction rates from incompletely resolved phylogenies. We develop likelihood calculations that compensate for incomplete phylogenetic knowledge in two cases: (1) where the species in a phylogeny represent a random sample of all extant species within a group (figure 2.1 b), and (2) where species not directly represented as tips in the phylogeny can be assigned to terminal unresolved clades (figure 2.1 c). We also develop methods to allow for incomplete character state knowledge, for both complete and incompletely resolved trees. We describe how these likelihoods can be used in Bayesian inference and apply our methods to simulated data sets. Finally, we demonstrate our method by applying it to the correlation between diversification and sexual dimorphism in shorebirds (Charadriiformes).
[Figure 2.1 panels: (a) Complete phylogeny; (b) Skeleton phylogeny (random sampling); (c) Terminally unresolved phylogeny; (d) Other phylogeny, not handled. Tips are labelled a to t in each panel.]
Figure 2.1: Different ways that phylogenetic information may be incomplete. Tree (a) is complete; every extant species is included and the tree is fully resolved. Black and white boxes above the tips refer to different character states. Tree (b) is a “skeletal tree”; species are included randomly from the full tree in (a). Sampled taxa are indicated by solid lines, and missing taxa are indicated by dashed lines. In general, nothing is known about the placement of these taxa. Tree (c) is a “terminally unresolved tree”; in this case the species not explicitly included as tips in the phylogeny are all known to belong to terminal unresolved clades. This tree is therefore “complete” in that it includes all extant taxa, but is incompletely resolved. This tree has the same branching structure as (b). Tree (d) contains a paraphyletic unresolved group and cannot be directly handled by either of the methods presented here. The relationships among species n–q are not resolved, and this group is known to be paraphyletic (see panel a). To convert this tree into a terminally unresolved tree, the known relationships within the r–t clade would have to be discarded to create an unresolved clade spanning species n–t.
2.3 BiSSE for complete phylogenies
Because our aim is to generalise the BiSSE model of Maddison et al. (2007), we start with a brief description of this method. BiSSE computes the probability of a phylogenetic tree and the observed distribution of character states among extant species, given a model of character evolution, speciation, and extinction. The character states must be binary; we denote the possible character states as 0 or 1 (e.g., herbivorous or non-herbivorous insects). The likelihood calculation tracks two variables for each character state $i$ along branches in a phylogeny: $D_{Ni}(t)$, the probability that a lineage in state $i$ at time $t$ would evolve into the extant clade $N$ as observed, and $E_i(t)$, the probability that a lineage in state $i$ at time $t$ would go completely extinct by the present, leaving no extant members. (For compactness, we will often refer to the clade whose most recent common ancestor is node $N$ as “clade $N$”.) Time is measured backwards with the present at $t = 0$, and $t > 0$ representing some time in the past. The changes in these quantities over time are described by a set of ordinary differential equations
$$\frac{dD_{Ni}}{dt} = -(\lambda_i + \mu_i + q_{ij})\,D_{Ni}(t) + q_{ij}\,D_{Nj}(t) + 2\lambda_i E_i(t)\,D_{Ni}(t) \tag{2.1a}$$
$$\frac{dE_i}{dt} = \mu_i - (\lambda_i + \mu_i + q_{ij})\,E_i(t) + q_{ij}\,E_j(t) + \lambda_i E_i(t)^2 \tag{2.1b}$$
where $\lambda_i$ is the speciation rate in state $i$, $\mu_i$ is the extinction rate in state $i$, and $q_{ij}$ is the rate of transition from state $i$ to $j$ forward in time (Maddison et al., 2007). These equations are solved numerically along each branch backwards in time to compute $D_{Ni}(t)$ (figure 2.2). On each branch, the character state at the tip provides the initial conditions for equations (2.1). $D_{Ni}(0) = 1$ if the sampled tip $N$ is in state $i$ and 0 otherwise because the lineage must be in its observed state. Similarly, $E_0(0) = E_1(0) = 0$ as a lineage cannot go extinct in zero time. At a node joining the lineages leading to clades $N$ and $M$, the
[Figure 2.2 panels: (a) Complete phylogeny; (b) Terminally unresolved phylogeny, in which one tip represents an unresolved clade with 3 species in state 0 and 1 species in state 1. Times $t_0$, $t_1$ and $t_2$ are marked, with $D_{Ni}$, $D_{Mi}$ and $D_{N'i}$ labelled at the corresponding nodes.]
Figure 2.2: BiSSE with (a) and without (b) full phylogenetic knowledge. In panel (a), the values at the base of the nodes leading to the first tips ($D_{Ni}(t_1)$ and $D_{Mi}(t_1)$) are calculated backwards in time using equations (2.1) and then combined with equation (2.2) to become the initial condition $D_{N'i}(t_1)$ for calculating $D_{N'i}(t_2)$. In panel (b), the four species on the left are unresolved but can be assigned to a tip that branches at time $t_1$. $D_{Ni}(t)$ would be calculated forward in time using our new method, and $D_{Mi}(t)$ with BiSSE, with these values combined as above.
probability of generating both daughter clades given that the node is in state $i$ is
$$D_{N'i}(t) = D_{Ni}(t)\,D_{Mi}(t)\,\lambda_i \tag{2.2}$$
where $N'$ represents the union of clades $N$ and $M$ (see figure 2.2). The likelihood calculation proceeds backwards in time down the tree from the tips until it reaches the root.
At the root, $R$, we have the two probabilities $D_{R0}$ and $D_{R1}$, corresponding to the possible character states at the root. The overall likelihood, $D_R$, must sum over the probabilities that the root was in each state (see Appendix A.1).
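As an illustration of the calculation described in this section, the following is a minimal Python sketch that integrates equations (2.1) along a branch and combines two daughter lineages with equation (2.2). The function names, the fixed-step Runge-Kutta integrator, and the example rates are assumptions made for illustration; this is not the implementation used in the analyses reported here.

```python
import math

def bisse_rhs(y, lam, mu, q):
    """Right-hand side of equations (2.1) for y = [D0, D1, E0, E1].

    lam[i], mu[i]: speciation and extinction rates in state i.
    q[i]: transition rate out of state i (q[0] = q01, q[1] = q10).
    """
    out = [0.0, 0.0, 0.0, 0.0]
    for i in (0, 1):
        j = 1 - i
        d_i, d_j, e_i, e_j = y[i], y[j], y[2 + i], y[2 + j]
        total = lam[i] + mu[i] + q[i]
        out[i] = -total * d_i + q[i] * d_j + 2.0 * lam[i] * e_i * d_i
        out[2 + i] = mu[i] - total * e_i + q[i] * e_j + lam[i] * e_i ** 2
    return out

def integrate_branch(y, length, lam, mu, q, steps=1000):
    """Integrate y backwards in time along a branch with fixed-step RK4."""
    h = length / steps
    for _ in range(steps):
        k1 = bisse_rhs(y, lam, mu, q)
        k2 = bisse_rhs([a + 0.5 * h * k for a, k in zip(y, k1)], lam, mu, q)
        k3 = bisse_rhs([a + 0.5 * h * k for a, k in zip(y, k2)], lam, mu, q)
        k4 = bisse_rhs([a + h * k for a, k in zip(y, k3)], lam, mu, q)
        y = [a + h / 6.0 * (p + 2.0 * r + 2.0 * s + u)
             for a, p, r, s, u in zip(y, k1, k2, k3, k4)]
    return y

def combine_at_node(y_n, y_m, lam):
    """Equation (2.2): D_{N'i} = D_{Ni} * D_{Mi} * lambda_i.

    E_i depends only on time, so it is identical on both daughter branches.
    """
    return [y_n[0] * y_m[0] * lam[0], y_n[1] * y_m[1] * lam[1], y_n[2], y_n[3]]

# Two sister tips (states 0 and 1) joined by branches of length 1.0.
lam, mu, q = (0.1, 0.2), (0.03, 0.03), (0.01, 0.01)
tip0 = integrate_branch([1.0, 0.0, 0.0, 0.0], 1.0, lam, mu, q)
tip1 = integrate_branch([0.0, 1.0, 0.0, 0.0], 1.0, lam, mu, q)
node = combine_at_node(tip0, tip1, lam)
```

The vector `node` then serves as the initial condition for the branch below the node, and the process repeats until the root is reached.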
2.4 Likelihood calculations for incompletely resolved phylogenies
Incompleteness in phylogenetic information can come in many forms. A species may be entirely unplaced phylogenetically, or placed into a clade but not into a precise relationship within the clade. Its character state may be known or unknown. We will derive methods for two situations: “skeletal trees,” where we have a fully resolved tree for a random sample of species whose states are fully known, and “terminally unresolved trees,” where trees include all extant species and are fully resolved except for terminal clades that are completely unresolved phylogenetically and whose character states are known to varying degrees.
Skeletal trees (figure 2.1 b) could arise when a biologist samples species simultaneously for their presence in a phylogenetic analysis and having data for the character of interest. For these trees, we assume that nothing is known about the phylogenetic placement of the missing taxa. Terminally unresolved trees arise frequently when the species included in a molecular phylogeny are exemplars and where information on the non-included (unplaced) species is available (e.g., from previous systematic studies). If the unplaced species can be assigned to terminal clades containing the exemplar species, then our method can be used (figure 2.1 c). Here, we assume that every species can be assigned to an unresolved clade. Note that terminally unresolved trees are phylogenetically complete, in that they include all extant taxa, but are incompletely resolved, in that not all phylogenetic relationships are known. A broader class of incomplete phylogenies do not match either of these cases, and the methods we describe below cannot be used directly. This includes paraphyletic unresolved groups (figure 2.1 d).
2.4.1 Skeletal trees: unplaced missing taxa
First, we consider skeletal trees, where a given phylogenetic tree represents a random sample of all extant species in a taxonomic group. To account for incomplete phylogenies, we model a “sampling” event at the present that corresponds to a biologist obtaining data for the species. This event occurs during an infinitesimally small time period, during which a species in state $i$ has a probability $f_i$ of being sampled for inclusion in a phylogeny. The $f_i$ values should be determined from estimates of the numbers of species having each character state that are unsampled versus sampled. If the character states of unsampled species are unknown, then the $f_i$ values could be set equal for all states and reflect the proportion of all extant species that have been sampled.
With this sampling event, $E_i(t)$ can be interpreted as the probability of a lineage not being present in the phylogeny, either by going extinct or not being sampled. The initial condition $E_i(0)$ is therefore $(1 - f_i)$. Similarly, rather than representing the probability that a lineage in state $i$ at time $t$ would evolve into the full extant clade $N$ at the present, $D_{Ni}(t)$ includes the probability that the tip taxa present in the phylogeny are sampled. The initial conditions become $D_{Ni}(0) = f_i$ if the sampled tip is in state $i$ and 0 otherwise. After these modifications to the initial conditions, the calculations continue as described in Maddison et al. (2007). This is similar to the method used by Nee et al. (1994b) to correct likelihood calculations for inferring character-independent speciation and extinction rates from incomplete phylogenies. Indeed, in Appendix A.2.1 we show that when the speciation and extinction rates are independent of character state, the calculations are equivalent.
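The modified initial conditions can be written down directly. The short Python sketch below is illustrative only; the function names and the layout of the state vector as [D0, D1, E0, E1] are assumptions, not part of any published interface.

```python
def sampling_fractions(n_sampled, n_total):
    """f_i: fraction of extant species in state i that appear in the tree."""
    return tuple(s / t for s, t in zip(n_sampled, n_total))

def tip_initial_conditions(state, f=(1.0, 1.0)):
    """Return [D0, D1, E0, E1] at t = 0 for a sampled tip in `state`.

    D_i(0) = f_i for the observed state and 0 otherwise; E_i(0) = 1 - f_i.
    With f = (1, 1) this reduces to the complete-phylogeny case.
    """
    if state not in (0, 1):
        raise ValueError("state must be 0 or 1")
    d = [f[i] if i == state else 0.0 for i in (0, 1)]
    e = [1.0 - f[0], 1.0 - f[1]]
    return d + e

# Example: 10 of 20 state-0 species and 5 of 20 state-1 species were sampled.
f = sampling_fractions((10, 5), (20, 20))
y0 = tip_initial_conditions(1, f)
```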
This approach assumes that the taxon sampling process is independent of the position in the phylogeny. It need not be independent of the character state, however, as the $f_i$ can differ between states 0 and 1. The approach also assumes that taxon sampling is even across the phylogeny, although in many cases it is not.
2.4.2 Terminally unresolved trees
Terminally unresolved trees contain all species, but their relationships are not fully resolved, with some species grouped into unresolved clades (figure 2.1 c). We can envision this situation as comparable to the skeletal trees, but with the unplaced species not entirely unknown: we know to which terminal clades they belong, and we may know their character states. This extra information about placement and character states can be used to improve inference.
We will use the word “tip” to refer to a terminal unit in the tree, which may represent either a single extant species, or a terminal unresolved clade. Thus, the number of tips in the tree will be less than the number of species implied if there are unresolved terminal clades. We assume that the sampling of species is complete; i.e., every extant species is either present directly as a tip or can be assigned to a tip that represents an unresolved clade. We do not assume knowledge of the timing of the last common ancestor for a terminal unresolved clade; instead, we assume that diversification may happen at any point after splitting from its sister clade (figure 2.2). Nor do we assume any particular topology for the unresolved clades; rather, we sum over all possible phylogenetic histories, according to their probability. We initially assume complete knowledge of character states, but we relax this assumption in the next section.
If we can compute the probability of a terminal clade, $D_{Ni}(t)$, then we can combine this with the probability of its sister lineage using equation (2.2) and continue with BiSSE down the rest of the tree (figure 2.2 b). In contrast to the backward-time approach employed by BiSSE, we use a forward-time method to calculate $D_{Ni}(t)$ for an unresolved clade. Because it has no phylogenetic structure, we cannot distinguish among the different possible evolutionary histories of an unresolved clade. Consequently we model clade evolution as a Markov process, tracking only the probability of different clade compositions over time. The possible clade compositions can be distinguished by the number of species in each state; let $(n_0, n_1)$ represent a clade with $n_0$ species in state 0 and $n_1$ species in state 1. Although the number of possible clade types is infinite, we truncate the state space to a finite number of species. This process is similar to two birth-death processes (Nee, 2006), one for each character state, but includes transitions between the processes. We term this a “birth-death-transition process”. If $\mathbf{x}(t)$ is a column vector representing probabilities of different clade types at time $t$ and $\mathbf{Q}$ is a transition rate matrix describing the rates of changes between clade types, then the probability of generating any possible clade is given by:
$$\mathbf{x}(t_0) = \exp\bigl((t_1 - t_0)\,\mathbf{Q}\bigr) \cdot \mathbf{x}(t_1), \qquad t_1 > t_0 \tag{2.3}$$
where $t_1$ represents an earlier point in time than $t_0$ and ‘exp’ represents the matrix exponential (Sidje, 1998). The values of $\mathbf{x}(t_0)$ that correspond to the observed data can then be used as the probability of the clade evolving as observed, $D_{Ni}(t_1)$, for subsequent BiSSE calculations. For example, for a clade that begins in state 0 at time $t_1$ and ends in clade type $(3, 1)$ at the present ($t_0$, as in figure 2.2 b), we find $\mathbf{x}(t_0)$ from equation (2.3) and pick out the probability of generating a $(3, 1)$ clade from this vector, which is used as $D_{N0}(t_1)$. The probability of generating a clade with no species, $(0, 0)$, is the probability that the clade would have gone extinct, $E_0(t_1)$. This process is then repeated assuming that the lineage leading to the clade was initially in state 1, giving $D_{N1}(t_1)$ and $E_1(t_1)$.
Before describing the transition rate matrix $\mathbf{Q}$ we must first specify the structure of the state space $\mathbf{x}(t)$. Let the first element represent the probability of having zero species, the next two elements represent the two single-species clades with one species in state 0 or one species in state 1 (respectively), and so on. That is, probabilities are assigned to the positions of $\mathbf{x}(t)$ in the order:
$$\underbrace{(0,0)}_{\text{zero species}},\ \underbrace{(1,0),\,(0,1)}_{\text{one species}},\ \underbrace{(2,0),\,(1,1),\,(0,2)}_{\text{two species}},\ \cdots,\ \underbrace{(k,0),\,(k-1,1),\,(k-2,2),\,\ldots,\,(1,k-1),\,(0,k)}_{k\text{ species}},\ \cdots \tag{2.4}$$
so that a clade with $k$ species is represented by the $k + 1$ elements in positions $k(k+1)/2 + 1$ to $(k+1)(k+2)/2$. To keep the state space finite, the final element of $\mathbf{x}(t)$ is an absorbing state, representing the probability that a clade has at least $n_{\max}$ species. By doing this, we assume that once a clade reaches $n_{\max}$ species it is so large that there is a negligible probability of generating the observed number of species by time $t_0$. In practice, $n_{\max}$ can be chosen to be large enough so that it does not significantly affect calculations (e.g., by monitoring the change in relevant values in $\mathbf{x}(t_0)$ as $n_{\max}$ is increased).
At the base of the clade, $t_1$, there must have been a single ancestral lineage in either state 0 or 1. The state of the system at this time must have been a vector of zeros except for a 1 in either the second position (corresponding to $(1, 0)$, to calculate $D_{N0}(t)$) or third position (corresponding to $(0, 1)$, to calculate $D_{N1}(t)$). To calculate $\mathbf{Q}$, we assume that each time step is small enough that only a single event may happen; a lineage currently in state $(n_0, n_1)$ may have one species
• speciate, moving from state $(n_0, n_1)$ to $(n_0 + 1, n_1)$ or $(n_0, n_1 + 1)$ at rate $n_0\lambda_0$ or $n_1\lambda_1$, respectively,
• go extinct, moving to $(n_0 - 1, n_1)$ or $(n_0, n_1 - 1)$ at rate $n_0\mu_0$ or $n_1\mu_1$,
• change character state, moving to state $(n_0 - 1, n_1 + 1)$ or $(n_0 + 1, n_1 - 1)$ at rate $n_0 q_{01}$ or $n_1 q_{10}$.
Using these rules, the transition rate matrix has a block structure, involving the blocks $\mathbf{S}_k$, $\mathbf{E}_k$ and $\mathbf{C}_k$. The block $\mathbf{S}_k$ is a $(k + 2) \times (k + 1)$ matrix describing speciation from $k$ to $k + 1$ species:
$$\mathbf{S}_k = \begin{pmatrix}
k\lambda_0 & 0 & 0 & & \\
0 & (k-1)\lambda_0 & 0 & & \\
0 & \lambda_1 & (k-2)\lambda_0 & & \\
0 & 0 & 2\lambda_1 & \ddots & \\
 & & \ddots & \lambda_0 & 0 \\
 & & & (k-1)\lambda_1 & 0 \\
 & & & 0 & k\lambda_1
\end{pmatrix}.$$
$\mathbf{E}_k$ is a $k \times (k + 1)$ matrix describing extinction from $k$ to $k - 1$ species:
$$\mathbf{E}_k = \begin{pmatrix}
k\mu_0 & \mu_1 & 0 & & \\
0 & (k-1)\mu_0 & 2\mu_1 & & \\
0 & 0 & (k-2)\mu_0 & \ddots & \\
 & & \ddots & (k-1)\mu_1 & 0 \\
 & & & \mu_0 & k\mu_1
\end{pmatrix}.$$
$\mathbf{C}_k$ is a $(k + 1) \times (k + 1)$ square matrix describing character state changes, leaving the number of species constant at $k$:
$$\mathbf{C}_k = \begin{pmatrix}
\cdot & q_{10} & 0 & & & \\
k q_{01} & \cdot & 2 q_{10} & & & \\
0 & (k-1) q_{01} & \cdot & \ddots & & \\
0 & 0 & (k-2) q_{01} & \ddots & (k-1) q_{10} & 0 \\
 & & & \ddots & \cdot & k q_{10} \\
 & & & & q_{01} & \cdot
\end{pmatrix}$$
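where the dotted elements along the diagonal of $\mathbf{C}_k$ are chosen so that the columns of $\mathbf{Q}$ sum to zero. Denoting matrices of zeros with $\mathbf{0}$, the transition rate matrix $\mathbf{Q}$ is
$$\mathbf{Q} = \begin{pmatrix}
\mathbf{C}_0 & \mathbf{E}_1 & \mathbf{0} & & & \\
\mathbf{0} & \mathbf{C}_1 & \mathbf{E}_2 & & & \\
\mathbf{0} & \mathbf{S}_1 & \mathbf{C}_2 & \ddots & & \\
\mathbf{0} & \mathbf{0} & \mathbf{S}_2 & \ddots & \mathbf{E}_{n_{\max}-1} & \mathbf{0} \\
 & & & \ddots & \mathbf{C}_{n_{\max}-1} & \mathbf{0} \\
 & & & & \mathbf{S}_{n_{\max}-1} & \mathbf{0}
\end{pmatrix} \tag{2.5}$$
The final speciation block $\mathbf{S}_{n_{\max}-1}$, describing speciation into the absorbing state, is an $n_{\max}$-element row vector:
$$\bigl((n_{\max} - 1)\lambda_0,\ (n_{\max} - 2)\lambda_0 + \lambda_1,\ \ldots,\ (n_{\max} - 1)\lambda_1\bigr)$$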
As a special case, this approach can be used to calculate likelihoods for terminally unresolved trees where speciation and extinction do not depend on a character’s state, as described in Appendix A.2.2.
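The construction of the transition rates and the forward-time calculation of equation (2.3) can be sketched as follows. This is a minimal illustration in Python, not the implementation used here: the function names are hypothetical, the rates are stored as a sparse list of moves rather than as explicit blocks, and the matrix exponential is replaced by a simple fixed-step Runge-Kutta integration of $d\mathbf{x}/dt = \mathbf{Q}\mathbf{x}$, which is adequate only for small examples.

```python
def index(n0, n1):
    """0-based position of clade type (n0, n1) in the ordering of (2.4)."""
    k = n0 + n1
    return k * (k + 1) // 2 + n1

def transitions(lam, mu, q, n_max):
    """Sparse (dest, src, rate) triples for Q; the last state is absorbing."""
    absorbing = n_max * (n_max + 1) // 2
    triples = []
    for k in range(n_max):
        for n1 in range(k + 1):
            n0 = k - n1
            moves = [
                ((n0 + 1, n1), n0 * lam[0]),      # speciation in state 0
                ((n0, n1 + 1), n1 * lam[1]),      # speciation in state 1
                ((n0 - 1, n1), n0 * mu[0]),       # extinction in state 0
                ((n0, n1 - 1), n1 * mu[1]),       # extinction in state 1
                ((n0 - 1, n1 + 1), n0 * q[0]),    # change 0 -> 1
                ((n0 + 1, n1 - 1), n1 * q[1]),    # change 1 -> 0
            ]
            for (m0, m1), rate in moves:
                if rate > 0.0:
                    dest = index(m0, m1) if m0 + m1 < n_max else absorbing
                    triples.append((dest, index(n0, n1), rate))
    return triples, absorbing + 1

def apply_q(triples, x):
    """Compute Q x; each move leaves its source and enters its destination."""
    y = [0.0] * len(x)
    for dest, src, rate in triples:
        flow = rate * x[src]
        y[dest] += flow
        y[src] -= flow
    return y

def clade_probabilities(root_state, t, lam, mu, q, n_max=30, steps=400):
    """Approximate x(t0) = exp(t Q) x(t1) for a clade founded by one lineage."""
    triples, size = transitions(lam, mu, q, n_max)
    x = [0.0] * size
    x[index(1 - root_state, root_state)] = 1.0    # (1, 0) or (0, 1)
    h = t / steps
    for _ in range(steps):                        # classical RK4 on dx/dt = Q x
        k1 = apply_q(triples, x)
        k2 = apply_q(triples, [a + 0.5 * h * b for a, b in zip(x, k1)])
        k3 = apply_q(triples, [a + 0.5 * h * b for a, b in zip(x, k2)])
        k4 = apply_q(triples, [a + h * b for a, b in zip(x, k3)])
        x = [a + h / 6.0 * (b + 2.0 * c + 2.0 * d + e)
             for a, b, c, d, e in zip(x, k1, k2, k3, k4)]
    return x

x = clade_probabilities(0, 5.0, (0.1, 0.2), (0.03, 0.03), (0.01, 0.01))
d_n0 = x[index(3, 1)]    # probability of the (3, 1) clade: D_N0(t1)
e_0 = x[index(0, 0)]     # probability of extinction: E_0(t1)
```

Because every move subtracts from its source exactly what it adds to its destination, the columns of the implied matrix sum to zero and total probability is conserved.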
2.4.3 Incomplete character state knowledge
Regardless of the level of phylogenetic completeness and resolution, character state information may be unknown for some species. Here, we describe corrections to the BiSSE likelihood for missing character state information for fully resolved phylogenies, skeletal trees, and terminally unresolved trees.
For fully resolved phylogenies, if no information on a character for a tip is available, then the “data” becomes the presence of the tip only. On the single branch leading to this tip, we can then interpret $D_{Ni}(t)$ as the probability of giving rise to a single species, regardless of its character state. The initial conditions must therefore be $D_{Ni}(0) = 1$ for both states, because with no time for extinction there is a 100% probability that the branch will lead to the observed data. Using this logic, for skeletal trees, the initial conditions are $D_{Ni}(0) = f_i$.
For terminally unresolved trees, character state information may not be known for all members of an unresolved clade. In this case, we can calculate the joint probability that a clade evolved to a particular composition and that it was sampled as observed. Say that the unresolved clade of interest truly has $x_0$ species in state 0 and $x_1$ species in state 1 but that we know the state information only for a sample of these species so that $s_i$ species are known to be in state $i$. If the probability of a species’ state being known is independent of its state, then we can assume that the $s_N = s_0 + s_1$ known species represent samples without replacement from a pool of $x_N = x_0 + x_1$ species, and compute the sampling probability using the hypergeometric distribution. Of the $\binom{x_N}{s_N}$ ways of sampling $s_N$ species from this pool, there are $\binom{x_i}{s_i}$ ways of sampling $s_i$ species in state $i$. The sampling probability is therefore given by
$$\Pr(s_0, s_1 \mid x_0, x_1) = \frac{\binom{x_0}{s_0}\binom{x_1}{s_1}}{\binom{x_N}{s_N}}. \tag{2.6}$$
While we do not know the true number of species in each state, we can use equation (2.3) to compute the probability that the clade composition is $(x_0, x_1)$ and then use equation (2.6) to give the probability of knowing that $s_i$ species are in state $i$. To do this, we multiply the probability of generating the clade by the probability of sampling the clade as observed and sum over all possible clade compositions:
$$D_{Ni}(t) = \sum_{j = s_0}^{x_N - s_1} \Pr(j, x_N - j)\,\frac{\binom{j}{s_0}\binom{x_N - j}{s_1}}{\binom{x_N}{s_N}}. \tag{2.7}$$
Here, $\Pr(x_0, x_1)$ is the probability of a clade with $x_0$ species in state 0 and $x_1$ species in state 1, calculated from equation (2.3). This calculation assumes that we know that there are $x_N$ species in the clade, but this calculation can be generalised if $x_N$ itself is not known exactly but can be described by a probability distribution. Where we have full state information (i.e., $s_0 + s_1 = x_N$) equation (2.7) reduces to $\Pr(s_0, s_1)$.
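Equations (2.6) and (2.7) translate almost directly into code. In the Python sketch below, `pr` is a hypothetical callable standing in for the clade-composition probabilities obtained from equation (2.3); both function names are illustrative assumptions.

```python
from math import comb

def sampling_probability(s0, s1, x0, x1):
    """Equation (2.6): probability of knowing s0 and s1 states given (x0, x1)."""
    return comb(x0, s0) * comb(x1, s1) / comb(x0 + x1, s0 + s1)

def clade_probability_partial_states(pr, s0, s1, x_n):
    """Equation (2.7): sum over all compositions consistent with the sample.

    pr(x0, x1) must return the probability of clade composition (x0, x1).
    """
    total = 0.0
    for j in range(s0, x_n - s1 + 1):
        total += pr(j, x_n - j) * sampling_probability(s0, s1, j, x_n - j)
    return total

# Toy composition probabilities, used only to exercise the functions.
def toy_pr(x0, x1):
    return 0.1 * (x0 + 1)

full = clade_probability_partial_states(toy_pr, 3, 1, 4)      # full state knowledge
partial = clade_probability_partial_states(toy_pr, 1, 1, 4)   # two states unknown
```

With full state information the sum has a single term and the sampling probability equals one, so `full` equals `toy_pr(3, 1)`.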
2.5 Bayesian inference
The above equations can be used to calculate from an incomplete phylogeny the likelihood, that is, the probability of the data given a model of speciation, extinction, and character evolution. This method can then be used to estimate rates using maximum likelihood and to compare models using likelihood ratio tests. Here we will discuss their application to Bayesian inference so that measures of parameter uncertainty can be simultaneously obtained. For a general introduction to Bayesian inference in phylogenetics, see Huelsenbeck et al. (2002). We will focus on the posterior probability distribution of the model parameters; that is, the probability of the parameters given the data.
To compute the posterior probability we need to specify the prior probability distribution for the parameters. We use an exponential prior for the six parameters (see Churchill, 2000). This choice reflects the philosophical preference for explanations requiring fewer events, all else being equal (Occam’s razor). For example, if few species are present in state 0, then there is no information about the extinction rate for species in state 0 ($\mu_0$). An exponential prior would then generate a posterior distribution with the same mean as the prior. Other common priors include a uniform prior and a uniform prior on the log of each parameter (e.g., on $\ln(\mu_0)$). These are both “improper priors” because they do not integrate to a finite value over the possible range of the parameters $(0, \infty)$. Because of this, in the case where little signal is present in the data, the posterior will not integrate to a finite value, and cannot easily be interpreted (Gelman et al., 1995). Because it is itself proper, the exponential prior always produces a proper posterior probability distribution, and it has the additional benefit that its influence on the posterior distribution can be easily detected by comparing the mean of prior and posterior distributions.
The prior probability density associated with the parameter $\theta_j$ is set to:
$$\Pr(\theta_j) = c_j e^{-c_j \theta_j} \tag{2.8}$$
where $\theta_j$ is the value of the $j$th parameter, and $c_j$ is a rate parameter. The posterior probability of the model given the data is proportional to
$$D_R(t_R) \prod_j c_j e^{-c_j \theta_j} \tag{2.9}$$
where the product is taken over the six model parameters. An exploration of the alternative priors indicated that the priors generally had negligible influence except where there were very few extant species in a given character state.
To choose values for $c_j$, we use a preliminary measure of the rate of diversification from the tree. Ignoring state changes and asymmetries in speciation or extinction rates, the expected number of species in a tree of length $t_R$ is $n = e^{(\lambda - \mu) t_R}$, where $\lambda$ and $\mu$ are the character-independent speciation and extinction rates (Nee et al., 1994b). Rearranging, the diversification rate $(\lambda - \mu)$ that would produce $n$ species at time $t_R$ is $\ln(n)/t_R$. We chose the prior rates so that the mean of the exponential distribution was twice this value (i.e., $c_j = t_R / (2\ln(n))$). The same prior was used for all model parameters.
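The prior and the unnormalised log posterior of equations (2.8) and (2.9) can be sketched in a few lines of Python. The function names are illustrative assumptions, and `log_likelihood` stands for $\ln D_R(t_R)$ computed elsewhere.

```python
import math

def prior_rate(n_species, tree_length):
    """c_j = t_R / (2 ln n): the prior mean is twice the crude rate ln(n) / t_R."""
    return tree_length / (2.0 * math.log(n_species))

def log_prior(params, c):
    """Sum of log exponential densities, equation (2.8), with a shared rate c."""
    if any(theta < 0.0 for theta in params):
        return -math.inf            # rates cannot be negative
    return sum(math.log(c) - c * theta for theta in params)

def log_posterior(log_likelihood, params, c):
    """Log of equation (2.9), up to an additive constant."""
    return log_likelihood + log_prior(params, c)

# Example: a 500-species tree of length 50 time units (illustrative values).
c = prior_rate(500, 50.0)
prior_mean = 1.0 / c
```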
2.6 Numerical methods and application
To test our method, we followed the same approach as Maddison et al. (2007) by simulating trees and character states using known rates and then attempting to infer those rates from the tree. We simulated trees containing 500 species with rates $\lambda_0 = \lambda_1 = 0.1$, $\mu_0 = \mu_1 = 0.03$, and $q_{01} = q_{10} = 0.01$ (equal rate trees) or with $\lambda_1 = 0.2$ (unequal rate trees). These are the same rates as Maddison et al. (2007) for comparison, and the trees were simulated using their method.
We generated random incomplete phylogenies from these complete simulated phylogenies. To perform random taxonomic sampling to create skeletal trees (figure 2.1 b), we sampled a proportion of all tips independently of tip state. The fraction of species in each state that were present in the final sample was calculated and used to specify $f_0$ and $f_1$ when calculating likelihoods. To simulate terminally unresolved trees, a similar sampling routine can be used. Insofar as terminally unresolved trees can arise when character data are available for all species but detailed phylogenetic placement is available for only a sample of species, we can simulate this by choosing which species were sampled for detailed phylogenetic placement. The remaining unsampled species would be assigned to terminally unresolved clades represented by a single species that was sampled, the exemplar of the clade. However, this sampling requires some additional care, because every extant species must be either present in the phylogeny or assigned to an unresolved clade (compare figures 2.1 c and 2.1 d). Simply sampling species can leave orphaned species that fall below resolved clades and so cannot be placed into fully unresolved clades. For example, suppose that species j and k were chosen to have resolved placement from the phylogeny in figure 2.1 a, but species i was left unresolved. Species i cannot then be placed into an unresolved clade represented by a single sampled exemplar species and is thus “orphaned”. As a way of guaranteeing that there were no orphans in the final tree, we included a fraction of the orphan species in the sample and reassessed which species remained orphans, repeating until no orphan species were present. Note that this sampling approach does not generate a random sample of species, as assumed in our skeletal tree approach. For the results reported in this chapter, we assumed that the character states of all species were known.
2.6.1 Implementation
We implemented the above methods in the R package “diversitree” (available from http://www.zoology.ubc.ca/prog/diversitree). The “diversitree” package is also accessible through recent versions of Mesquite (Maddison and Maddison, 2008). The matrix exponentiations were calculated numerically using the dmexpv routine in Expokit (Sidje, 1998). Because the transition rate matrix $\mathbf{Q}$ is very sparse, it is practical to use this approach for unresolved clades containing up to several hundred species. The posterior probability distribution cannot be sampled from directly, so we use Markov Chain Monte Carlo (MCMC) to approximate the distribution, using slice sampling for the parameter updates (MacKay, 2003; Neal, 2003). For each tree, we ran three independent MCMC chains for 10,000 steps from random starting locations, discarding the first 2,500 steps of each chain. While these chains are short compared to those used in tree inference, the sampler here is exploring a reasonably smooth continuous probability surface (data not shown), rather than tree-space, where disjoint regions of high probability are separated by areas of low probability. Consequently, convergence of the MCMC chains was very rapid.
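To make the sampling scheme concrete, the following Python sketch shows a univariate slice sampler with stepping out and shrinkage, applied to each parameter in turn. It is a simplified illustration rather than the sampler in “diversitree”: the function names, the interval width `w`, and the stand-in target density are all assumptions, and the cap on stepping-out iterations is a plain safeguard rather than the randomised limit described by Neal (2003).

```python
import math
import random

def slice_update(log_f, theta, j, w, rng, max_steps=100):
    """Update coordinate j of theta by univariate slice sampling."""
    def g(value):
        trial = list(theta)
        trial[j] = value
        return log_f(trial)

    x0 = theta[j]
    log_y = g(x0) + math.log(1.0 - rng.random())   # slice level under the density
    left = x0 - w * rng.random()                   # randomly placed initial interval
    right = left + w
    for _ in range(max_steps):                     # step out to the left
        if g(left) <= log_y:
            break
        left -= w
    for _ in range(max_steps):                     # step out to the right
        if g(right) <= log_y:
            break
        right += w
    while True:                                    # shrink until a point is accepted
        x1 = left + (right - left) * rng.random()
        if g(x1) >= log_y:
            return x1
        if x1 < x0:
            left = x1
        else:
            right = x1

def run_chain(log_f, start, w, n_steps, seed=0):
    """Run one chain, updating every parameter once per step."""
    rng = random.Random(seed)
    theta = list(start)
    chain = []
    for _ in range(n_steps):
        for j in range(len(theta)):
            theta[j] = slice_update(log_f, theta, j, w, rng)
        chain.append(tuple(theta))
    return chain

def log_target(theta):
    """Stand-in for the log posterior: independent exponentials with rate 2."""
    if min(theta) < 0.0:
        return -math.inf
    return -2.0 * sum(theta)

chain = run_chain(log_target, start=[1.0, 1.0], w=1.0, n_steps=4000)
kept = chain[1000:]    # discard an initial portion of the chain
```

In a real analysis `log_target` would be replaced by the log of expression (2.9), and several chains would be started from different random locations.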
2.7 Results
We briefly present the results of Bayesian inference using BiSSE with complete phylogenetic knowledge, then discuss how the statistical power is affected by incomplete phylogenetic knowledge.
2.7.1 Bayesian inference with BiSSE
Where speciation rates were equal for each character state ($\lambda_0 = \lambda_1$; equal rate trees), the mean inferred speciation rates were close to the true values used to simulate the trees, and the posterior probability density was tightly distributed around this true value (figure 2.3, solid curves). Where speciation rates were unequal ($\lambda_0 < \lambda_1$), species in state 0 were relatively rare (approximately 10% of extant species). Consequently, the rates for transitions in state 0 ($\lambda_0$, $\mu_0$ and $q_{01}$) were less precisely estimated than for state 1, though still largely centred around their true values (figure 2.3, dashed curves). This pattern is consistent with that in Maddison et al. (2007), who found that the maximum likelihood estimates for the rare character state were more widely distributed than the estimates for the more common character state (their figure 4).
Some of the model parameter estimates were correlated; in particular the speciation and extinction rates for a particular character state were positively and linearly correlated (data not shown), indicating that a range of speciation/extinction rate combinations
[Figure 2.3 panels: (a) $\lambda_0$, (b) $\mu_0$, (c) $q_{01}$ for state 0; (d) $\lambda_1$, (e) $\mu_1$, (f) $q_{10}$ for state 1. x axes: speciation rate, extinction rate, character transition rate; y axis: posterior probability density. Legend: equal rate, unequal rate, true value.]
Figure 2.3: Posterior probability densities for the six BiSSE parameters on a fully-resolved phylogeny. Two trees were generated, each containing 500 species and with either all rates equal ($\lambda_0 = \lambda_1$, $\mu_0 = \mu_1$, $q_{01} = q_{10}$; solid curves) or with unequal speciation rates ($\lambda_0 < \lambda_1$; dashed curves). The histograms display posterior probabilities over the last 7,500 points from three independent MCMC chains, after discarding the first 2,500 points of each chain. The vertical lines indicate the true parameter values used in simulating the trees. The y axes differ between plots but are scaled so the area under each curve integrates to 1. The horizontal bars indicate the 95% credibility intervals for the equal rate (upper bar) and unequal rate (lower bar) tree.
had similar posterior probabilities. The diversification rate is the difference between the speciation and extinction rates (r_i = λ_i − µ_i; Nee et al. 1994b). The uncertainty around the diversification rate estimate was similar to the uncertainty around the speciation rates, even where extinction rates were poorly estimated (figure 2.4). The difference between the diversification rates for the two character states (relative diversification rate; r_rel = r1 − r0) gives a summary of the strength of differential diversification. The relative diversification rate for equal rate trees was well estimated: centred around the true value and with a narrow credibility interval. For unequal rate trees, the posterior probability distribution was flatter, but still centred around the true value. For the example shown in figure 2.4, the posterior probability of r_rel ≤ 0 for the unequal rate tree was ý.ýý¥, so we would correctly conclude that character state 1 increased diversification rate in this case.
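The summary statistics described above can be computed directly from posterior samples. The following is a minimal sketch, assuming each MCMC sample is stored as a tuple (λ0, λ1, µ0, µ1); the function names and the four toy samples are hypothetical illustrations and are not results from this chapter.

```python
from typing import Iterable, Tuple

# One posterior sample: (lambda0, lambda1, mu0, mu1). This layout is an assumption.
Sample = Tuple[float, float, float, float]


def net_diversification(lam: float, mu: float) -> float:
    """Net diversification rate r_i = lambda_i - mu_i."""
    return lam - mu


def relative_diversification(sample: Sample) -> float:
    """Relative diversification rate r_rel = r1 - r0 for one sample."""
    lam0, lam1, mu0, mu1 = sample
    return net_diversification(lam1, mu1) - net_diversification(lam0, mu0)


def prob_rrel_at_most_zero(samples: Iterable[Sample]) -> float:
    """Fraction of posterior samples in which r_rel <= 0."""
    values = [relative_diversification(s) for s in samples]
    if not values:
        raise ValueError("need at least one posterior sample")
    return sum(v <= 0.0 for v in values) / len(values)


# Four made-up samples, used only to show the calculation.
toy_samples = [
    (0.10, 0.20, 0.03, 0.03),
    (0.12, 0.18, 0.05, 0.02),
    (0.15, 0.14, 0.01, 0.04),
    (0.09, 0.22, 0.02, 0.06),
]
print(prob_rrel_at_most_zero(toy_samples))
```

A small value of this fraction corresponds to strong posterior support for a higher diversification rate in state 1.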
2.7.2 Effect of decreasing phylogenetic knowledge

As fewer species were included in a phylogeny or as more species fell within unresolved clades, parameters were less accurately and precisely estimated (figure 2.5). Accuracy and precision were essentially unaffected for nearly completely resolved phylogenies (75%–100% complete), and for most parameters precision did not deteriorate substantially until trees contained fewer than ≈ 50% of the total possible tips. For all parameters, the mean parameter estimate increased when phylogenetic resolution became very low. This reflects the skew in the posterior probability distribution (figure 2.3), which increased with reduced phylogenetic resolution as the prior distribution increasingly dominated the posterior distribution. In addition, the prior means were higher than any of the simulated rates (approximately ý.ç for unequal rate trees and ý.Ôâ for equal rate trees). Because medians are less sensitive to skew, the median parameter estimates were less affected by decreasing phylogenetic information than the mean, but they still tended to increase at very low phylogenetic resolution (not shown). The decrease in accuracy and precision was most pronounced for the rate parameters for the rare state
Figure 2.4: Posterior probability distribution for the diversification rates (a) in state 0 (r0) and (b) in state 1 (r1) and (c) the relative diversification rate (r_rel = r1 − r0), for an equal rate tree (λ0 = λ1, solid curves) and an unequal rate tree (λ0 < λ1, dashed curves). [x axes: diversification rate in state 0, diversification rate in state 1, relative diversification rate; y axis: posterior probability density.] The 95% credibility intervals are indicated by the horizontal bars, and the vertical lines indicate the true parameter values used in simulating the trees. Parameters are as indicated in the text.
on unequal rate trees (λ0, µ0, and q01). Decreasing phylogenetic resolution increased the uncertainty of the parameter estimates, with the widths of the credibility intervals growing as the proportion of tips sampled decreased (figure 2.5). On unequal rate trees at very low phylogenetic resolution, the posterior probability distribution for q01 became very similar to the prior distribution.

The decrease in accuracy and precision with decreasing phylogenetic knowledge was more pronounced for skeletal trees than for terminally unresolved trees. This is because the unresolved clades contain information beyond the branching structure (i.e., the number of species and their states). In general, for a given number of tips present in a phylogeny (i.e., sampled species in skeletal trees, resolved species plus unresolved clades in terminally unresolved trees), terminally unresolved trees had lower bias in the mean parameter estimates and narrower credibility intervals than did skeletal trees. However, for well estimated parameters (e.g., λ1 and µ1 on the unequal rate tree; figure 2.5 b, e) the difference in uncertainty between these methods was small. The difference in precision was particularly pronounced for the character transition rates, which were typically well estimated on terminally unresolved trees, even with low phylogenetic resolution (figure 2.5 g–i).

Rates of net diversification were well estimated as phylogenetic resolution decreased, despite increasing uncertainty in extinction rates (figure 2.6). For equal rate trees, the net diversification rate for each trait (r_i = λ_i − µ_i) and the relative diversification rate r_rel were fairly insensitive to phylogenetic resolution, with no bias in the mean parameter estimates and little increase in the width of the credibility intervals, especially for terminally unresolved trees (figure 2.6 a, c, e). Where speciation rates differed (unequal rate trees), the estimated diversification rates were sensitive to decreasing phylogenetic resolution, but less so than the individual parameters. Particularly for terminally unresolved trees, the net diversification rate was well estimated even with low phylogenetic resolution. With the parameters used here, differential diversification was detectable on unequal rate trees at the 5% significance level until fewer than 30% of taxa were
Figure 2.5: Uncertainty around BiSSE parameter estimates as a function of phylogenetic knowledge. [Rows: (a–c) speciation rate, (d–f) extinction rate, (g–i) character change rate. Columns: unequal rate, state 0; unequal rate, state 1; equal rate. x axis: proportion of tips phylogenetically resolved; y axis: mean and 95% credibility interval around parameter.] Points represent the mean estimate of each parameter, and the curves above and below indicate the mean 95% credibility interval, averaged over 30 different phylogenies. Dashed curves/open circles represent skeletal trees and solid curves/filled circles represent terminally unresolved trees. The horizontal dotted line indicates the true rate from the simulations. For skeletal trees, the proportion of tips reflects phylogenetic completeness, while for terminally unresolved trees it represents the level of phylogenetic resolution. Trees were evolved with unequal speciation rates (λ1 > λ0, first two columns) or equal speciation rates (final column) and contained 500 species before sampling. Credibility intervals were calculated over the last 7,500 points of three independent MCMC chains per tree, discarding the first 2,500 points.
Figure 2.6: Uncertainty around diversification rate estimates as a function of phylogenetic knowledge. [Columns: unequal rate trees, equal rate trees. x axis: proportion of tips phylogenetically resolved; y axis: mean and 95% credibility interval around parameter.] Panels (a) and (b) show the net diversification rate in state 0 (λ0 − µ0), panels (c) and (d) show the net diversification rate in state 1 (λ1 − µ1), and panels (e) and (f) show the relative diversification rate, r_rel. See figure 2.5 for details.
explicitly included in terminally unresolved trees and until 50% of taxa were included using skeletal trees (figure 2.6 f).
2.8 Application to shorebird data
Sexual dimorphism in body size or other traits in birds is thought to be driven by sexual selection (Darwin, 1871). Larger males might be favoured by females or fare better in intra-sexual conflict over mates, while small males (reversed sexual dimorphism) might be favoured when sexual displays are acrobatic (Figuerola, 1999). Sexual differences in any trait may indicate different optima for the two sexes, and therefore inter-sexual conflict, which may increase speciation rates (Parker and Partridge, 1998; Gavrilets, 2000; Jablonski, 2008b). Comparative evidence linking sexual dimorphism with speciation is mixed. Several studies have found that sexual dimorphism in plumage or other display traits might promote increased diversification (Barraclough et al., 1995; Parker and Partridge, 1998; Owens et al., 1999), while other studies failed to find correlations between measures of sexual selection and diversification rates (e.g., Gage et al., 2002; Morrow and Pitcher, 2003; Morrow et al., 2003).

Here, we use a recent supertree of shorebirds (Charadriiformes) to investigate the correlation between speciation rate and sexual dimorphism. Thomas et al. (2004) constructed a complete supertree of all 350 shorebird species. While complete, this tree lacks resolution among many of the terminal clades, with large polytomies including up to 50 species. For each polytomy, we collapsed all species descended from any lineage within the polytomy into a terminal unresolved clade (figure 2.7). The resulting tree had 134 tips (with the 215 unresolved species included in 14 unresolved terminal clades). Many of the branch lengths in this tree are not strictly proportional to time, which reduces the information about extinction rates available in the tree. We used a database of bird traits with separate measurements for males and females of body mass, wing length, tarsus length, bill length and tail length (Lislevand et al.,
Figure 2.7: Phylogenetic tree of the 350 species of shorebirds (Charadriiformes) and measures of sexual dimorphism, based on Thomas et al. (2004). Gray triangles indicate unresolved clades, with the height of the triangle being proportional to the square root of the number of species. Character states at the 15% threshold level are indicated at the tips; grey indicates sexually dimorphic, black indicates sexually monomorphic, and white indicates no data. For unresolved clades, the degree of shading indicates the proportion of species in each state. For clarity, only family or subfamily names are shown. [Labelled clades: Rynchopini, Sternini, Stercorariini, Dromadidae, Larini, Alcinae, Pluvianellidae, Chionidae, Glareolidae, Burhinidae, Haematopodini, Recurvirostrini, Charadriinae, Scolopacidae, Pedionomidae, Thinocoridae, Rostratulidae, Jacanidae.]
2007). For each trait, we computed a standardised measure of dimorphism as (x_m − x_f)/x̄, where x_m and x_f are the trait values in the males and females and x̄ is the mean of the male and female values. We regarded species as dimorphic if the absolute value of this dimorphism measure was greater than some threshold value for at least one of the five traits. This data set did not include state information for 77 species (22%). These were treated using the methods described in section 2.4.3.

While the general form of the marginal posterior distributions was well characterised after 10,000 steps of the MCMC algorithm, it was difficult to characterise some of the peaks in the multimodal posterior distributions. To improve resolution we ran eight independent MCMC chains for 100,000 iterations. The precise credibility intervals changed slightly, but not our general conclusions.

The relationship between sexual dimorphism and speciation and diversification rates depended on the threshold difference in body size used. For low to medium thresholds of sexual dimorphism (≤ 15%), the maximum likelihood and mean posterior probability speciation and diversification rates were higher for sexually dimorphic lineages than monomorphic lineages (figure 2.8). In contrast, for very high thresholds (20%) the diversification rate for dimorphic species was lower than that of monomorphic species. The difference was supported by high posterior probability values only for the 15% threshold. The maximum likelihood extinction rate and the mode of the posterior probability distribution were generally zero for both character states across all threshold values examined. We found that character transition rates from sexual dimorphism to monomorphism were higher than the reverse across most thresholds used (figure 2.8), with this difference being most pronounced at 15% (significant at the 5% level for the 10% and 15% thresholds). It is perhaps not surprising that the choice of threshold has such an effect, as the dimorphic state becomes rare as the threshold is raised, and the rarer a state the more likely its diversification rate would be biased downward.

As with previous studies, our results suggest that evidence for a correlation between sexual dimorphism and diversification rates is mixed at best (Barraclough et al., 1995;
Figure 2.8: Marginal posterior probability distributions for the sexual dimorphism-dependent diversification rates (a–d) and character transition rates (e–h) inferred from a supertree of shorebirds (Thomas et al., 2004). Panels in different rows use a different threshold level of sexual dimorphism (p = 5%, 10%, 15%, and 20%, from top to bottom) to classify species as monomorphic and dimorphic. Solid curves show the distribution for sexually monomorphic species, and dashed curves for dimorphic species. The horizontal bar and point indicate the 95% credibility interval and maximum likelihood estimate. [x axes: diversification rate, character transition rate; y axis: posterior probability density.]
Parker and Partridge, 1998; Owens et al., 1999; Gage et al., 2002; Morrow and Pitcher, 2003; Morrow et al., 2003). However, rather than dividing groups arbitrarily into clades that have just one character state (e.g., Barraclough et al., 1995), our approach allowed us to make use of all of the available phylogenetic and character state information.
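The dimorphism classification used in this section can be sketched as follows. This is an illustrative sketch only: the function names, the dictionary layout of the trait data, and the choice to return `None` for species with no measurements are assumptions, not the code used for the analysis.

```python
from typing import Dict, Optional, Tuple


def dimorphism(x_male: float, x_female: float) -> float:
    """Standardised dimorphism (x_m - x_f) / x_bar, where x_bar is the mean of
    the male and female values."""
    x_bar = (x_male + x_female) / 2.0
    return (x_male - x_female) / x_bar


def is_dimorphic(
    traits: Dict[str, Tuple[float, float]], threshold: float
) -> Optional[bool]:
    """Classify a species from a mapping trait name -> (male value, female value).

    A species counts as dimorphic if the absolute dimorphism exceeds the
    threshold for at least one trait. Species without any measurements are
    returned as None (missing state); this handling is an assumption.
    """
    if not traits:
        return None
    return any(abs(dimorphism(m, f)) > threshold for m, f in traits.values())


# Hypothetical species measured for two of the five traits.
example = {"body_mass": (110.0, 90.0), "wing_length": (201.0, 199.0)}
print(is_dimorphic(example, threshold=0.15))
print(is_dimorphic(example, threshold=0.20))
```

Because the comparison is strict, raising the threshold can only move species from the dimorphic to the monomorphic class, which is why the dimorphic state becomes rarer at high thresholds.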
2.9 Discussion
In this chapter, we have developed two methods for estimating the effect of a trait on speciation and extinction rates from incomplete and incompletely resolved phylogenies. In simulation tests, it was possible to estimate diversification rates from even poorly sampled phylogenies (figure 2.6). Where trees were simulated with equal speciation rates, there was little increase in uncertainty in the estimates of differential diversification with decreasing phylogenetic information, even when as few as 20% of species were phylogenetically placed. This is surprising, because terminally unresolved trees lack much of the fine branching structure present at the tips of a completely resolved phylogeny (figure 2.1). However, the power to detect differences in individual parameters depended more strongly on phylogenetic structure.

Because the terminally unresolved clade method uses the branching structure available to the skeletal method (for a given number of tips in a tree), differences between these two methods are due to the additional information about the placement of the missing taxa in the terminal unresolved clades. In cases where a given species sample can be reasonably assumed to be a random draw from all extant species, the skeletal tree method provides a simple way of estimating speciation and extinction. In particular, the phylogenetic relationships and character states of non-sampled species do not need to be incorporated. Where phylogenies are almost complete, the loss of power using this method is fairly low. For poorly sampled phylogenies (fewer than 25% of species included in our 500 species phylogenies), the uncertainty around parameters became so large that inference was not possible (figures 2.5–2.6).
The terminal unresolved clade approach can avoid most of this loss of power, provided all species not included in the phylogeny can be grouped into terminally unresolved clades. The effect of including the terminally unresolved clades was strongest for the character transition rates (q01 and q10, figure 2.5), and it allowed detection of differential diversification on poorly sampled trees (figure 2.6). However, the terminally unresolved tree method can only be used where every species can be assigned to a terminal unresolved clade. Deeper phylogenetic uncertainties such as unresolved paraphyletic groups have not yet been incorporated, and some known phylogenetic information may need to be discarded to use the current methods by including only terminal unresolved clades (see figure 2.1 d). This method is also substantially more computationally demanding than the skeletal tree approach and is limited at present to unresolved clades that contain fewer than approximately 200 species. With 200 species, there are over 20,000 possible clade compositions (numbers of species in each state), and even with modern matrix exponentiation techniques the calculations become both very slow and prone to numerical underflow (Sidje, 1998).

Missing data and incompleteness are generally unavoidable in comparative macroevolutionary analyses. Frequently, the species included in a phylogenetic tree are less closely related than expected by chance, because taxa were chosen to maximise coverage over the true phylogeny (e.g., Moyle et al., 2009). In these cases, the terminally unresolved tree method will be appropriate. If a phylogeny is almost complete, missing only a few taxa whose placement is uncertain (e.g., the cases considered by Bokma, 2008a), the skeletal tree method should be satisfactory (figure 2.5). It may not always be possible to know with complete certainty where taxa that are not included in a tree should be placed within a terminally unresolved tree. In this case, one could run an analysis over possible placements of missing taxa, integrating over this uncertainty (Lutzoni et al., 2001). For cases such as figure 2.1 d, it is probably not possible to know in general whether collapsing known structure into a terminal unresolved clade or treating the tree as a skeletal tree will suffer the smaller reduction in power, as both approaches lose data.
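The figure of over 20,000 clade compositions quoted above for 200 species can be checked with a short count. The sketch below assumes that a composition is a pair (n0, n1) of species counts in states 0 and 1 with 1 ≤ n0 + n1 ≤ N, which is one plausible reading of "numbers of species in each state"; the function names are hypothetical.

```python
def count_clade_compositions(max_species: int) -> int:
    """Count pairs (n0, n1) of non-negative integers with 1 <= n0 + n1 <= max_species."""
    return sum(
        1
        for n0 in range(max_species + 1)
        for n1 in range(max_species + 1 - n0)
        if n0 + n1 >= 1
    )


def count_clade_compositions_closed_form(max_species: int) -> int:
    """Closed form of the same count: sum over k = 1..N of (k + 1) = N * (N + 3) / 2."""
    return max_species * (max_species + 3) // 2


print(count_clade_compositions(200))
```

Under this reading the number of compositions grows quadratically with clade size, so the matrix that must be exponentiated grows with the fourth power of the number of species, which is consistent with the computational limit described in the text.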
A final caution about the pattern of missing taxa: outgroups are generally poorly sampled relative to the ingroup and should be removed from the tree prior to calculation of likelihoods with BiSSE. Poorly sampled outgroups would certainly violate assumptions of random taxon sampling.

Maddison et al. (2007) tested whether simpler models may fit the data better by comparing models where rates were character dependent to simpler models where rates were character independent (e.g., a model where λ0 ≠ λ1 to a model where λ0 = λ1) using likelihood ratio tests. Model selection between the full model and reduced models could also be done in a Bayesian framework using reversible jump Markov Chain Monte Carlo (RJMCMC; Green 1995). RJMCMC alters the Markov chain to propose different models at some steps, for example changing from the full six parameter model to one of the three simpler five parameter models. The posterior probability distribution of different models can then be directly compared. An RJMCMC approach would provide a natural way of removing parameters from the analysis where there is little to no phylogenetic signal (e.g., q01 in the poorly resolved unequal rate tree, figure 2.5 g). This approach has been used successfully elsewhere in phylogenetic inference (e.g., Pagel and Meade, 2006).

Using either of the sampling methods explored here, BiSSE likelihoods may be computed for partially complete phylogenies. Future work is needed to handle much larger unresolved clades (> 200 species) and to handle orphan taxa, whose phylogenetic position is deep in the tree and uncertain. Such extensions are needed to analyse “higher-level” phylogenies that are complete at a taxonomic level above species but contain large numbers of species in their unresolved clades (e.g., Hackett et al. 2008; Davies et al. 2004).
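The likelihood ratio comparison mentioned above (for example, a model with λ0 ≠ λ1 against one with λ0 = λ1) can be sketched as follows. The sketch assumes the usual asymptotic result that twice the log-likelihood difference between nested models differing by one parameter follows a chi-squared distribution with one degree of freedom; that assumption, the function name, and the log-likelihood values are illustrative and are not taken from the thesis.

```python
import math
from typing import Tuple


def likelihood_ratio_test_1df(lnl_full: float, lnl_reduced: float) -> Tuple[float, float]:
    """Likelihood ratio test for nested models that differ by one parameter.

    Returns (test statistic, p-value). For one degree of freedom the chi-squared
    survival function is P(X > s) = erfc(sqrt(s / 2)), so only the standard
    library is needed.
    """
    statistic = 2.0 * (lnl_full - lnl_reduced)
    # The full model cannot fit worse than the nested model; clamp numerical noise.
    statistic = max(statistic, 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical maximised log-likelihoods for the six and five parameter models.
stat, p = likelihood_ratio_test_1df(lnl_full=-100.0, lnl_reduced=-103.0)
print(stat, p)
```

With these made-up values the statistic is 6, which would favour the character-dependent model at the 5% level.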
Chapter 3: Quantitative Traits and Diversification
3.1 Summary
Quantitative traits have long been hypothesised to affect speciation and extinction rates. For example, smaller body size or increased specialisation may be associated with increased rates of diversification. Here, I present a phylogenetic likelihood-based method (QuaSSE; Quantitative State Speciation and Extinction) that can be used to test such hypotheses using extant character distributions. This approach assumes that diversification follows a birth-death process where speciation and extinction rates may vary with one or more traits that evolve under a diffusion model. Speciation and extinction rates may be arbitrary functions of the character state, allowing much flexibility in testing models of trait-dependent diversification. I test the approach using simulated phylogenies and show that a known relationship between speciation and a quantitative character could be recovered in up to 80% of the cases on large trees (500 species). Consistent with other approaches, detecting shifts in diversification due to differences in extinction rates was harder than when due to differences in speciation rates. Finally, I demonstrate the application of QuaSSE to investigate the correlation between body size and diversification in primates, concluding that clade-specific differences in diversification may be more important than size-dependent diversification in shaping the patterns of diversity within this group.
3.2 Introduction
Species selection may be responsible for much of the variation in diversity among clades. Species fulfil the Lewontin (1970) definition of units of selection; different species have different traits, differential “fitness” (rates of speciation and extinction) that may be attributable to these traits, and the traits and their fitness differences are heritable. While the concept of species selection has been controversial since its inception (e.g., Stanley, 1975b; Vrba and Gould, 1986), there now seems to be reasonable agreement that the process may operate (recently discussed in Okasha, 2006; Jablonski, 2008b; Rabosky and McCune, 2009). Many traits have been proposed to affect rates of speciation and extinction, such as body size (Gittleman and Purvis, 1998), sexual system (Heilbuth, 2000), and dispersal ability (Phillimore et al., 2006). In addition, hypotheses that invoke “key innovations” (e.g., floral nectar spurs: Hodges and Arnold, 1995) or evolutionary “dead ends” (e.g., asexuality: Schwander and Crespi, 2009) generally invoke trait-dependent variation in rates of diversification.

Phylogenies contain information about the timings of speciation events and patterns of diversification (Nee et al., 1994b) and have been used extensively in comparative analyses to attempt to identify correlates of elevated speciation or extinction rates. Sister-clade analyses have been widely used for detecting correlates of diversification for binary traits (e.g., Mitter et al., 1988; Heilbuth, 2000; Vamosi and Vamosi, 2005). These require that clades are characterised by a single character state and assume that all lineages within the clade have taken this value for the majority of its evolutionary history. More recently, likelihood approaches such as BiSSE (Binary State Speciation and Extinction; Maddison et al. 2007) have allowed this assumption to be relaxed, allowing any distribution of characters among extant species by using the entire pattern of branching in a phylogeny. For example, BiSSE has recently been used to demonstrate a correlation between live-bearing (vs. egg-laying) snake species and elevated speciation rates (Lynch, 2009).

Sister-clade analyses will not generally be appropriate for detecting correlation between diversification rates and continuous traits because a clade does not have a single body size, geographic range, or latitude (but see Gittleman and Purvis, 1998, for an approach that uses mean clade traits). Several recent methods have been developed explicitly for continuous traits. Clauset and Erwin (2008) used a diffusion model to calculate equilibrium trait frequency distributions under a model where species selection is opposed by individual-level selection. However, this approach cannot incorporate phylogenetic information. The methods developed by Paradis (2005) and Freckleton et al. (2008) are explicitly phylogenetic, but they assume that ancestral character states can be estimated without accounting for the effect of the character on speciation and extinction (see below). None of these methods can distinguish between differential speciation and differential extinction. However, speciation and extinction rates may be correlated (high extinction rates may accompany high speciation rates; e.g., Gilinsky 1994; Coyne and Orr 2004; Liow et al. 2008), and many traits are thought to change diversification rates through their effect on extinction, rather than speciation (e.g., Harcourt et al., 2002; Cardillo et al., 2005). Approaches that allow differential speciation to be distinguished from differential extinction will therefore allow testing of a broader array of evolutionary hypotheses.

Inferring the states of ancestral nodes is problematic when the character affects speciation or extinction (Maddison, 2006; Paradis, 2008). For example, if the processes of character evolution and speciation/extinction were treated separately, then trait values might be inferred at ancestral nodes that are unlikely because they would lead to high rates of extinction. More subtly, the trait values estimated in this way will be incorrect if the evolution of the trait has a directional tendency, as it is not possible to detect directional shifts in a character under a Brownian motion model of character evolution on an ultrametric tree (Schluter et al., 1997). However, where speciation, extinction, and character evolution are treated simultaneously, and where speciation rates are state-dependent, these directional changes can potentially be inferred. This would allow detection of cases where species selection and individual-level selection oppose one another (i.e., directional character change towards trait values that are disfavoured by species selection). For example, it has been suggested that large-bodied individuals tend to have higher fitness (leading to an increase in mean size over time), while populations of large-bodied individuals are more prone to extinction (Clauset and Erwin, 2008).

Here, I describe a new comparative phylogenetic method, QuaSSE (Quantitative State Speciation and Extinction), for inferring the effect of quantitative traits on speciation and extinction rates. I first derive likelihood equations that can be used to calculate the probability of a phylogenetic tree and distribution of character states among species under a general model of cladogenesis and character evolution. I then investigate the power of this method to detect differential diversification by applying it to simulated trees. Finally, I demonstrate the method and illustrate some potential pitfalls by investigating the correlation between body size and diversification in primates.
3.3 Character evolution and diversification
I model speciation and extinction as a birth-death process (similar to Nee et al., 1994b), allowing the rates of speciation and extinction to vary with a simultaneously evolving character. Assume that a species can be characterised by its mean value of some character trait, x, which varies on the interval (−∞, ∞), and that this character affects diversification through its effect on the rate of speciation or extinction (or both). Let the rate of speciation for a lineage in state x be λ(x) and the rate of extinction be µ(x). These may be arbitrary non-negative functions of x, and I do not assume anything about their form. In the most general case, these can also be functions of time, so that the speciation rate for a lineage in state x at time t is λ(x, t), but for notational brevity I will omit this time dependence. Incorporating time-dependence allows modelling of clade-wide changes in diversification rates (e.g., Rabosky, 2006).

It is convenient to model character evolution along lineages using a diffusion process (Allen, 2003). Diffusion processes are attractive for modelling character evolution because they allow for stochasticity while being mathematically tractable. I will measure time backwards, with the present at time 0, and t > 0 representing some time in the past. Let g(z, t | x, t + ∆t) be the transition probability density function for the diffusion process; the probability density that a character state changes from x at time t + ∆t to state z at time t, where t is closer to the present than t + ∆t (0 < t < t + ∆t). The diffusion assumptions state that

\begin{align}
\phi(x, t) &= \lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{-\infty}^{\infty} (z - x)\, g(z, t \mid x, t + \Delta t)\, dz \tag{3.1a} \\
\sigma^2(x, t) &= \lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{-\infty}^{\infty} (z - x)^2\, g(z, t \mid x, t + \Delta t)\, dz \tag{3.1b} \\
0 &= \lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{-\infty}^{\infty} (z - x)^k\, g(z, t \mid x, t + \Delta t)\, dz, \quad k > 2, \tag{3.1c}
\end{align}

where the integral is taken over all possible character transitions (Allen, 2003). I will refer to ϕ(x, t) as the “directional” term, which captures the deterministic or directional component of character evolution; this is the expected rate of change of the character over time and may be due to selection or any other within-lineage process that has a directional tendency. This term is typically referred to as the “drift” term (e.g., Allen, 2003), but I avoid this terminology to prevent confusion with genetic drift. The term σ²(x, t) is the “diffusion” term, and is the expected squared rate of change; this captures the stochastic elements of character evolution. The condition (3.1c) formally captures the assumption that large changes are unlikely by asserting that character evolution is described entirely by the first two moments of the transition probability density function. Note that both ϕ(x, t) and σ²(x, t) may be functions of both the character state and time. I assume that the character state is perfectly inherited by both daughter species during speciation (i.e., speciation does not lead to character displacement).

The above diffusion process generalises other models of character evolution. Brownian motion can be modelled by setting the functions ϕ(x, t) and σ²(x, t) to the constants ϕ and σ². Where ϕ = 0, this is standard Brownian motion, and where ϕ is nonzero this is Brownian motion with a directional tendency (Felsenstein, 1988). The Ornstein-Uhlenbeck process captures stabilising selection that pulls the character towards a long-term mean x̂ (Hansen and Martins, 1996). This can be modelled by setting the directional term, ϕ(x, t), to the linear function α(x̂ − x), where α is the strength of this stabilising force, and setting the diffusion term, σ²(x, t), to the constant σ². Note that when the model of character evolution is Brownian motion (with no directional tendency), this birth-death-diffusion process is essentially that described by Paradis (2005) and Freckleton et al. (2008), and by Slatkin (1981) and Clauset and Erwin (2008), but disallowing character changes at nodes.

Before describing the approach, it is worth emphasising some limitations that follow from the above assumptions. The birth-death process leads to an exponential growth in the number of species (at rate λ − µ when these rates are character-independent; Nee et al. 1994b), which is clearly not sustainable indefinitely. In addition, no interaction is possible between any of the lineages in the phylogeny; the rates of speciation, extinction and character evolution cannot depend on the number of extant species or the character states of those species. This prevents modelling of density-dependent diversification (Phillimore and Price, 2008) or frequency-dependent character evolution.
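The way the directional and diffusion terms specialise to Brownian motion and the Ornstein-Uhlenbeck process can be illustrated with a small simulation. The sketch below uses a simple Euler-Maruyama discretisation along a single lineage and runs forward in time, whereas the text measures time backwards; the discretisation, the function names, and all parameter values are assumptions made for illustration and are not part of the likelihood method.

```python
import math
import random
from typing import Callable, List

# A rate function takes (character state x, time t) and returns a float.
RateFunction = Callable[[float, float], float]


def simulate_trait(x0: float, phi: RateFunction, sigma2: RateFunction,
                   total_time: float, dt: float, rng: random.Random) -> List[float]:
    """Euler-Maruyama simulation of dx = phi(x, t) dt + sqrt(sigma2(x, t)) dW."""
    n_steps = int(round(total_time / dt))
    x = x0
    path = [x]
    for step in range(n_steps):
        t = step * dt
        noise = rng.gauss(0.0, 1.0)
        x += phi(x, t) * dt + math.sqrt(sigma2(x, t) * dt) * noise
        path.append(x)
    return path


def constant(value: float) -> RateFunction:
    """Constant term: Brownian motion uses this for both phi and sigma^2."""
    return lambda x, t: value


def ornstein_uhlenbeck(alpha: float, optimum: float) -> RateFunction:
    """Directional term alpha * (x_hat - x), pulling x towards the long-term mean."""
    return lambda x, t: alpha * (optimum - x)


rng = random.Random(1)
# Standard Brownian motion: phi = 0 and constant sigma^2.
bm_path = simulate_trait(0.0, constant(0.0), constant(0.01), 10.0, 0.01, rng)
# Ornstein-Uhlenbeck: linear directional term and constant sigma^2.
ou_path = simulate_trait(0.0, ornstein_uhlenbeck(1.0, 1.0), constant(0.01), 10.0, 0.01, rng)
print(len(bm_path), len(ou_path))
```

Setting the diffusion term to zero in this sketch leaves only the deterministic pull of the directional term, which makes the role of ϕ(x, t) easy to inspect in isolation.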
3.4 Likelihood calculations
In this section, I will derive equations to compute the probability of a phylogenetic tree and character state distribution among extant species under the above model of character evolution and character-dependent speciation and extinction. I will assume that the calculations are carried out on a single ultrametric phylogenetic tree that has branch lengths proportional to time. It is straightforward to extend this analysis to integrate over a family of trees (e.g., bootstrapped trees or samples from a Bayesian analysis). I also assume that the tree is complete and fully resolved; i.e., that it includes every extant species above a common ancestor and contains no polytomies. Later, I will relax this assumption slightly to allow for partial taxon sampling.
The calculations follow the same general structure as those of BiSSE (Maddison et al., 2007). Following the notation of BiSSE, let E(x, t) be the probability that a lineage in state x at time t goes completely extinct, leaving no descendants by the present (time 0). This is a continuous function in both trait-space and time, in contrast to the analogous quantities in BiSSE that were continuous only in time. Similarly, let D_N(x, t) be the probability that a lineage in state x at time t would evolve into the extant clade N as observed, including branch lengths and present-day character states. The subscript N denotes that this function applies to a particular lineage N.
3.4.1 Probability of extinction
Assume that we know the function E(x, t) at some time t in the past. If we can express E at a time immediately prior to this, t + ∆t, in terms of its values at time t, then we can continue to do this until reaching the origin of a branch (Felsenstein, 1981). To do this, consider all the events that could occur over a very short period of time, ∆t, and write E(x, t + ∆t) by multiplying the probability of each event happening by the probability of extinction given that a particular event happened (figure 3.1). I assume that in this small period of time at most one speciation or extinction event may occur (specifically, I assume that the probability of two or more events occurring is of order (∆t)² and therefore negligible with sufficiently small ∆t). Over this period of time, there are three possibilities: the lineage (1) goes extinct with probability µ(x)∆t, (2) speciates with probability λ(x)∆t(1 − µ(x)∆t), or (3) neither speciates nor goes extinct with probability (1 − λ(x)∆t)(1 − µ(x)∆t) (see Maddison et al., 2007). If extinction does not occur in this time interval, character change may have occurred along the branch, and we must account for all possible character transitions that might have occurred. Where speciation occurred, this character change occurs independently on both lineages and both lineages must be extinct by the present (figure 3.1). Summing over these possibilities
[Figure 3.1 panels: a) Extinction, b) Speciation, c) No change; each panel shows a lineage in state x between times t + ∆t and t]
Figure 3.1: Possible ways a lineage extant at time t + ∆t might go extinct. If at most a single lineage-changing event occurs, then a) extinction happens with probability µ(x)∆t, leading to total extinction with probability 1, b) a speciation event happens with probability λ(x)∆t(1 − µ(x)∆t), leading to total extinction with probability E(x, t)², or c) no speciation or extinction happens with probability (1 − µ(x)∆t − λ(x)∆t), leading to total extinction with probability E(x, t). We must integrate over the character change that might occur during this period of time; lineages in which the character may change are indicated in black.
gives
\[
\begin{aligned}
E(x, t + \Delta t) ={}& \mu(x)\Delta t \times 1 \\
&+ (1 - \mu(x)\Delta t)\lambda(x)\Delta t \left[ \int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t) E(z, t)\, dz \right]^2 \\
&+ (1 - \lambda(x)\Delta t)(1 - \mu(x)\Delta t) \int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t) E(z, t)\, dz \\
&+ O(\Delta t^2)
\end{aligned} \tag{3.2}
\]
where O(∆t²) includes terms of order (∆t)² or higher (see figure 3.1). Subtracting E(x, t) from both sides, dividing by ∆t, taking the limit ∆t → 0, and using the diffusion conditions above, the following partial differential equation can be derived (see Appendix B.1 for details):
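\[
\begin{aligned}
\frac{\partial E(x, t)}{\partial t} ={}& \mu(x) + \lambda(x) E(x, t)^2 - (\lambda(x) + \mu(x)) E(x, t) \\
&+ \phi(x, t) \frac{\partial E(x, t)}{\partial x} + \frac{\sigma^2(x, t)}{2} \frac{\partial^2 E(x, t)}{\partial x^2}.
\end{aligned} \tag{3.3}
\]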
3.4.2 Probability of the data
Next, consider the probability of a lineage including topology, branch lengths, and character states amongst its extant descendants in clade N, D_N(x, t). Because the calculations here assume we are not at a node, only a single lineage can be present in the reconstructed phylogeny. The three possible events that could occur over the period of time ∆t that are consistent with this are (1) extinction, with no chance to explain the data (the extant clade N), (2) speciation, requiring the eventual extinction of either of the resulting lineages (with the other becoming clade N), and (3) no speciation or extinction, leaving a single lineage to become clade N (figure 3.2). Incorporating the
[Figure 3.2 panels: a) Extinction, b) Speciation, c) No change; each panel shows a lineage in state x between times t + ∆t and t, leading to clade N]
Figure 3.2: Possible ways a lineage extant at time t + ∆t might lead to exactly the clade N as observed. If at most a single lineage-changing event occurs, then a) extinction happens, with no chance of explaining the data, b) a speciation event happens, requiring the extinction of either lineage (probability 2D_N(x, t)E(x, t) of explaining the data), or c) no speciation or extinction happens, with probability D_N(x, t) of explaining the data. See figure 3.1 for other details.
possible character transitions in all non-extinct lineages as for E(x, t) gives
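\[
\begin{aligned}
D_N(x, t + \Delta t) ={}& \mu(x)\Delta t \times 0 \\
&+ 2\lambda(x)(1 - \mu(x)\Delta t)\Delta t \int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t) D_N(z, t)\, dz \times \int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t) E(z, t)\, dz \\
&+ (1 - \lambda(x)\Delta t)(1 - \mu(x)\Delta t) \int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t) D_N(z, t)\, dz \\
&+ O(\Delta t^2).
\end{aligned} \tag{3.4}
\]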
The 2 in equation (3.4) appears because either of the two lineages that are extant after a speciation event could be consistent with the observed data, provided the other goes extinct (Maddison et al., 2007). Using the same logic as was used to derive E(x, t) gives the partial differential equation
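\[
\begin{aligned}
\frac{\partial D_N(x, t)}{\partial t} ={}& 2\lambda(x) D_N(x, t) E(x, t) - (\lambda(x) + \mu(x)) D_N(x, t) \\
&+ \phi(x, t) \frac{\partial D_N(x, t)}{\partial x} + \frac{\sigma^2(x, t)}{2} \frac{\partial^2 D_N(x, t)}{\partial x^2}.
\end{aligned} \tag{3.5}
\]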
Equations (3.3) and (3.5) form the core of QuaSSE.
3.4.3 Initial & boundary conditions
Equations (3.3) and (3.5) do not have known analytic solutions. However, given appropriate initial and boundary conditions, they may be integrated numerically along a branch towards the root of the tree. For the initial condition for E, note that a lineage cannot go extinct in zero time, so E(x, 0) = 0 for all x. The initial condition for D_N(x, t) must be a probability distribution function; i.e., it must integrate to 1 over all x because at time 0 a lineage does exist. If we knew with absolute certainty that an extant species had state
x_obs, we could use a Dirac delta function,
\[
D_N(x, 0) = \delta(x - x_{\mathrm{obs}}),
\]
which concentrates the probability distribution on the observed character state x_obs and integrates to 1. However, in contrast with discrete data, a species’ state is never known without error, due to both within-species variation and measurement error. In the examples below, I will use a normal distribution centred on x_obs, with standard deviation σ_obs, but any probability distribution could be used.
To integrate these equations numerically, a finite domain and boundary conditions need to be specified. Suppose that the range (x_l, x_r) is modelled; I assume that at these boundaries the derivatives of E(x, t) and D_N(x, t) with respect to x are zero (i.e., Neumann boundary conditions). This requires that the derivatives of λ(x) and µ(x) with respect to x are approximately zero at the boundaries and that the region is sufficiently wide that D_N(x, t) is very close to zero at the boundaries so that the probability of explaining the data from states beyond these boundaries is negligible.
3.4.4 Calculations at the nodes
Given the initial and boundary conditions above, equations (3.3) and (3.5) can be integrated along a branch to give distributions at the base of nodes. At the node N′ that joins the branches leading from nodes N and M, the initial condition is
\[
D_{N'}(x, t) = D_N(x, t) D_M(x, t) \lambda(x). \tag{3.6}
\]
This is the probability of the lineage at the node being in state x at time t speciating, then giving rise to both the N and M clades. This value is then used as the initial condition for the integration along the branch leading down from this node.
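On a discretised character axis, the node calculation in equation (3.6) is an elementwise product. The following Python sketch illustrates this under assumed inputs; the function name `combine_at_node`, the grid, and the example arrays are hypothetical and are not taken from the diversitree implementation.

```python
import numpy as np

def combine_at_node(d_n, d_m, lam_x):
    """Initial condition at a node: D_N'(x) = D_N(x) * D_M(x) * lambda(x)."""
    return d_n * d_m * lam_x

# Illustrative grid and daughter-branch functions (assumed values).
x = np.linspace(-5.0, 5.0, 201)
lam_x = np.full_like(x, 0.1)             # constant speciation rate
d_n = np.exp(-0.5 * (x - 0.5) ** 2)      # D_N at the base of one daughter branch
d_m = np.exp(-0.5 * (x + 0.5) ** 2)      # D_M at the base of the other
d_node = combine_at_node(d_n, d_m, lam_x)
```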
3.4.5 Calculations at the root
At the base of the tree, we have a function D_R(x, t_R), where t_R is the time at the root. To get a single likelihood value, D_R, we must integrate over all possible character states x. This has been discussed elsewhere for the binary case (Goldberg and Igić, 2008; FitzJohn et al., 2009). The simplest approach is to integrate over the possible states:
\[
D_R = \int_{x_l}^{x_r} D_R(x, t_R)\, dx. \tag{3.7}
\]
This is equivalent to assigning a flat prior to the character state at the root (e.g., Pagel, 1994). However, the tree and model contain some information about the likely state at the root, and we can use this by weighting the state x by its relative probability of yielding the observed data
\[
D_R = \int_{x_l}^{x_r} D_R(x, t_R) \frac{D_R(x, t_R)}{\int_{x_l}^{x_r} D_R(y, t_R)\, dy}\, dx. \tag{3.8}
\]
The latter approach is used in the calculations in this paper.
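The two root treatments can be compared numerically. The sketch below approximates equations (3.7) and (3.8) on a grid with the trapezoidal rule; the function names and the choice of quadrature are assumptions made for illustration only.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule on a (possibly uneven) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def root_likelihood_flat(d_root, x):
    """Equation (3.7): integrate D_R(x, t_R) over the modelled range."""
    return trapezoid(d_root, x)

def root_likelihood_weighted(d_root, x):
    """Equation (3.8): weight each state by its relative probability of
    yielding the observed data before integrating."""
    weights = d_root / trapezoid(d_root, x)
    return trapezoid(d_root * weights, x)

# Illustrative root function on the range (0, 1): D_R(x, t_R) = x.
x = np.linspace(0.0, 1.0, 1001)
d_root = x.copy()
flat = root_likelihood_flat(d_root, x)
weighted = root_likelihood_weighted(d_root, x)
```

For a root function that is constant in x the two approaches agree; they differ whenever some root states explain the data better than others, as in the linear example above.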
3.4.6 Extensions
Incomplete phylogeny: Where a phylogenetic tree does not include all extant relatives the above calculations may not be used, as they will produce incorrect likelihoods (see FitzJohn et al., 2009). However, if the species included in a phylogeny represent a random sample from the extant taxa, a simple modification to the calculations above allows the tree to be used, following Nee et al. (1994b) and FitzJohn et al. (2009). Suppose that a species in state x has probability f(x) of being sampled for inclusion in the tree. We can model this sampling event in a similar way to a mass extinction at the present (Nee et al., 1994b). If the probability of being included in a phylogenetic tree is thought to be independent of the character of interest, then we may set f(x) = f. Above, E(x, t) was defined as the probability that a lineage in state x at time t would have no extant descendants. For incomplete trees, we can interpret this as the probability of failing to
appear in the phylogenetic tree, either by extinction or by not being sampled. The initial conditions then become E(x, 0) = 1 − f(x). Likewise, we can interpret D_N(x, t) as the probability that the lineage evolves into a clade N and is sampled. The initial condition D_N(x, 0) is then the product of a distribution describing uncertainty in the extant species state and f(x).
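A minimal Python sketch of these tip conditions on a discretised character axis follows. It assumes the normal error distribution described in section 3.4.3; the function name `initial_conditions`, the grid, and the sampling fraction are illustrative assumptions.

```python
import numpy as np

def initial_conditions(x, x_obs, sd_obs, f):
    """Tip conditions on the grid x for a species sampled with probability f(x).

    Returns E(x, 0) = 1 - f(x) and D_N(x, 0) = Normal(x; x_obs, sd_obs) * f(x).
    """
    f_x = np.broadcast_to(np.asarray(f(x), dtype=float), x.shape)
    e0 = 1.0 - f_x
    pdf = np.exp(-0.5 * ((x - x_obs) / sd_obs) ** 2) / (sd_obs * np.sqrt(2.0 * np.pi))
    d0 = pdf * f_x
    return e0, d0

# Character-independent sampling: half of the extant species are in the tree.
x = np.linspace(-5.0, 5.0, 1001)
e0, d0 = initial_conditions(x, x_obs=0.0, sd_obs=0.1, f=lambda x: 0.5)
```

With complete sampling (f(x) = 1) this reduces to E(x, 0) = 0 and a tip distribution that integrates to 1.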
Multiple characters: Multiple traits may affect diversification rate, and these may not evolve independently. For example, body size and latitude are likely to be correlated and are thought to both have effects on speciation and/or extinction rates (Jablonski, 2008b). Suppose that we are tracking k traits. Let x be a vector of character states of length k, and let x_i be the ith trait (i = 1, 2, . . . , k). The speciation and extinction functions become λ(x) and µ(x). As above, we retain just the first two moments of character evolution (which now include covariances) so that ϕ_i(x, t) is the rate of directional evolution of the ith trait, σ_{i,i}(x, t) is the rate of diffusion of the ith trait, and σ_{i,j}(x, t) is the instantaneous covariance between the ith and jth traits (i ≠ j). In Appendix B.2, I derive the multivariate analogues to equations (3.3) and (3.5):
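\[
\begin{aligned}
\frac{\partial E(\mathbf{x}, t)}{\partial t} ={}& \mu(\mathbf{x}) + \lambda(\mathbf{x}) E(\mathbf{x}, t)^2 - (\lambda(\mathbf{x}) + \mu(\mathbf{x})) E(\mathbf{x}, t) \\
&+ \sum_{i=1}^{k} \phi_i(\mathbf{x}, t) \frac{\partial E(\mathbf{x}, t)}{\partial x_i} + \sum_{i=1}^{k} \sum_{j=1}^{k} \frac{\sigma_{i,j}(\mathbf{x}, t)}{2} \frac{\partial^2 E(\mathbf{x}, t)}{\partial x_i\, \partial x_j}
\end{aligned} \tag{3.9}
\]
and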
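\[
\begin{aligned}
\frac{\partial D_N(\mathbf{x}, t)}{\partial t} ={}& 2\lambda(\mathbf{x}) D_N(\mathbf{x}, t) E(\mathbf{x}, t) - (\lambda(\mathbf{x}) + \mu(\mathbf{x})) D_N(\mathbf{x}, t) \\
&+ \sum_{i=1}^{k} \phi_i(\mathbf{x}, t) \frac{\partial D_N(\mathbf{x}, t)}{\partial x_i} + \sum_{i=1}^{k} \sum_{j=1}^{k} \frac{\sigma_{i,j}(\mathbf{x}, t)}{2} \frac{\partial^2 D_N(\mathbf{x}, t)}{\partial x_i\, \partial x_j},
\end{aligned} \tag{3.10}
\]
where the single sums are taken over the directional parameters, and the double sums are taken over the diffusion parameters (when i = j) and the covariances between characters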
(i ≠ j). A similar approach can be used where the second character is a binary state (see Appendix B.2).
3.4.7 Implementation & technical details
To integrate equations (3.3) and (3.5) numerically, I used an implicit integration scheme, where the propagation of the values in character space is performed using future values of E and D_N. To do this, I discretised both the character space and time. In each time step of size ∆t, the changes in E and D_N through time are initially set to the character-independent solutions to equations (3.3) and (3.5),
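\[
E(x, t + \Delta t) = \frac{\mu(x) - \lambda(x) E(x, t) + e^{(\lambda(x) - \mu(x))\Delta t} (E(x, t) - 1)\, \mu(x)}{\mu(x) - \lambda(x) E(x, t) + e^{(\lambda(x) - \mu(x))\Delta t} (E(x, t) - 1)\, \lambda(x)} \tag{3.11a}
\]
\[
D_N(x, t + \Delta t) = D_N(x, t)\, \frac{e^{(\lambda(x) - \mu(x))\Delta t} (\lambda(x) - \mu(x))^2}{\left[ e^{(\lambda(x) - \mu(x))\Delta t} \lambda(x) (1 - E(x, t)) - \mu(x) + \lambda(x) E(x, t) \right]^2}. \tag{3.11b}
\]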
(While these are character-independent, these equations need to be evaluated for each of the discretised x positions.) Following this, for each step I integrate over the character evolution that may have occurred during this period of time. For constant directional and diffusion terms (i.e., when the character evolves under Brownian motion), this can be done efficiently by convolving the functions E(x, t) and D_N(x, t) with a normal distribution with mean ϕ∆t and variance σ²∆t, which is the solution to the diffusion process described by the partial derivatives on the right hand side of equations (3.3) and (3.5).
I implemented these calculations in R (R Development Core Team, 2012), using the Fast Fourier Transform routines in the package FFTW (Frigo and Johnson, 2005) to perform the convolutions. I focus on maximum likelihood estimation in the analyses in this paper, using the subplex algorithm in R to maximise the likelihood function with respect to the parameters of λ(x), µ(x), and the directional and diffusion coefficients. However, the likelihoods computed could be used in Bayesian calculations (e.g., FitzJohn et al., 2009), although the choice of appropriate priors may not be trivial and the number
of calculations required to draw samples from the posterior using Markov Chain Monte Carlo will make this fairly slow in practice. This implementation (currently allowing only one character) is available in the R package “diversitree” (available from http://www.zoology.ubc.ca/prog/diversitree).
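One time step of the scheme described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the diversitree code: it uses a direct discrete sum rather than the FFT, pads the grid by repeating edge values as a crude stand-in for the zero-derivative boundaries, and applies the kernel in the orientation implied by equation (3.2), so that the new value at x averages the old values at x + o over trait changes o with mean ϕ∆t and variance σ²∆t. The function names `bd_step` and `diffuse` are hypothetical.

```python
import numpy as np

def bd_step(e, d, lam, mu, dt):
    """Character-independent update of E and D_N over one step (needs lam != mu)."""
    r = lam - mu
    z = np.exp(r * dt)
    e_new = (mu - lam * e + z * (e - 1.0) * mu) / (mu - lam * e + z * (e - 1.0) * lam)
    d_new = d * z * r**2 / (z * lam * (1.0 - e) - mu + lam * e) ** 2
    return e_new, d_new

def diffuse(y, dx, phi, sigma2, dt):
    """Average y over trait changes o ~ Normal(phi*dt, sigma2*dt): sum_o K(o) y(x + o)."""
    half = int(np.ceil((abs(phi) * dt + 5.0 * np.sqrt(sigma2 * dt)) / dx))
    offsets = np.arange(-half, half + 1) * dx
    kernel = np.exp(-0.5 * (offsets - phi * dt) ** 2 / (sigma2 * dt))
    kernel /= kernel.sum()                    # discrete kernel sums to one
    padded = np.pad(y, half, mode="edge")     # assumption: flat beyond the boundaries
    return np.correlate(padded, kernel, mode="valid")

# One step on an illustrative grid with constant rates (assumed values).
x = np.linspace(-5.0, 5.0, 1001)
dx = x[1] - x[0]
lam = np.full_like(x, 0.1)
mu = np.full_like(x, 0.03)
e = np.zeros_like(x)
d = np.exp(-0.5 * (x / 0.5) ** 2)             # unnormalised tip function, for illustration
e, d = bd_step(e, d, lam, mu, dt=0.01)
e = diffuse(e, dx, phi=0.0, sigma2=0.025, dt=0.01)
d = diffuse(d, dx, phi=0.0, sigma2=0.025, dt=0.01)
```

The direct sum costs more than an FFT-based convolution on fine grids, but it makes the kernel orientation and the boundary handling explicit.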
3.4.8 Tree simulation
I tested the performance of QuaSSE on simulated trees. To simulate a tree, I started with a single lineage in state x_0. Each time step, I allowed at most one lineage to speciate or go extinct, and then updated the character state of every lineage stochastically following a Brownian motion model of character evolution. I scaled time so that on average 500 time steps would occur between lineage-changing events by setting the time step equal to 1/(500 ∑_i (λ(x_i) + µ(x_i))), where x_i is the character state of the ith lineage and the sum is taken over all extant lineages in the tree. This ensured that the character had adequate time to evolve between speciation events to approximate continuous character evolution. I simulated trees where there was no effect of a character on speciation or extinction (i.e., λ(x) = λ, µ(x) = µ), and where speciation or extinction were sigmoidal functions of the character state,
\[
y_0 + \frac{y_1 - y_0}{1 + \exp(r(x_{\mathrm{mid}} - x))},
\]
where y_0 and y_1 are the asymptotic values at low and high x, r describes the steepness of the sigmoid and x_mid is the inflection point. I chose a sigmoidal function as this captures a directional effect of a character on speciation or extinction, while preventing negative or extremely large speciation or extinction rates. When constant, the speciation rate was 0.1 and the extinction rate was 0.03. For the differential speciation simulations the speciation rates varied with x from 0.1–0.15 (low difference), 0.1–0.2 (medium), or 0.1–0.3 (high). For differential extinction, the rates varied from 0.03–0.045 (low difference),
0.03–0.06 (medium), or 0.03–0.09 (high). For all simulations, I set x_mid = 0 and r = 2.5. I also used two rates of character diffusion; low (σ² = 0.01) or high (σ² = 0.025). For these
simulations, there was no directional tendency (ϕ = 0). Note that the scale used for x is arbitrary (making the choice of x_mid arbitrary), and changing r is equivalent to changing σ² when only one of λ(x) and µ(x) varies with x and ϕ = 0. For the parameter values used, most of the variation in speciation rate with respect to the character occurred over the region [−2, 2]; I started simulated trees in a character state chosen randomly from a uniform distribution on this range.
Sigmoidal functions require four parameters (plus parameters for extinction and character evolution), and there may not be sufficient signal in the data to be able to fit such complicated models (see results). Therefore, I fit models where speciation and extinction were constant, linear, or sigmoidal functions of the trait. To make the linear models satisfy the boundary condition that ∂E(x, t)/∂x is effectively zero at extreme values of x, I set the slope to zero for λ(x) and µ(x) once the character was 20 times the character-independent maximum likelihood diffusion coefficient away from the extant character distribution. I also set the functions to zero if they became negative.
To test if speciation or extinction functions that vary with character state fit better than constant functions, I used likelihood ratio tests where model comparisons were nested. Even with large trees, the appropriate cutoff value for the χ² statistic may not be the expected 3.84 (e.g., Maddison et al., 2007). I used simulated trees where speciation and extinction rates were independent of any character states to estimate the false positive rate. For all tree sizes the false positive rate for tests of differential speciation was close to 5% at the 5% level (Kolmogorov–Smirnov test of observed distribution vs. χ² distribution with 1 d.f.: p > 0.45). However, for differential extinction, the false positive rate was higher (12% significant results at the 5% level), and the distribution
of likelihood ratios significantly deviated from a χ²₁ distribution (p < 0.008). I therefore used empirically determined cutoff values for the three tree sizes of 6.50 (125 species), 5.89 (250 species), and 5.72 (500 species) based on these simulations for the power calculations reported below.
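The sigmoidal rate function and the time-step rule used in the simulations can be sketched in Python as follows. The function names are hypothetical; the parameter values follow the medium differential speciation setting described in this section.

```python
import numpy as np

def sigmoid_rate(x, y0, y1, x_mid, r):
    """Sigmoidal rate: y0 at low x, y1 at high x, inflection point at x_mid."""
    return y0 + (y1 - y0) / (1.0 + np.exp(r * (x_mid - x)))

def time_step(states, lam, mu, steps_per_event=500):
    """Step size giving, on average, steps_per_event steps between events."""
    total_rate = np.sum(lam(states) + mu(states))
    return 1.0 / (steps_per_event * total_rate)

# Medium differential speciation: lambda from 0.1 to 0.2, constant mu = 0.03.
lam = lambda x: sigmoid_rate(x, 0.1, 0.2, 0.0, 2.5)
mu = lambda x: np.full_like(x, 0.03, dtype=float)

states = np.zeros(4)                 # four extant lineages, all at x = 0
dt = time_step(states, lam, mu)
```

Because the step size depends on the summed rates, it shrinks as the tree grows and as lineages move into regions of higher speciation or extinction.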
3.5 Simulation results
On simulated trees, there was often power to detect differential speciation, while differential extinction was always difficult, but not impossible, to detect (figure 3.3). Because a linear function was significant for almost all cases where a sigmoidal function was significant, I interpret a significant fit of the linear function as detection of differential speciation or extinction. As tree size increased, the power to detect differential speciation grew from 10–40% on 125 species trees to up to 70% on 500 species trees. As the difference between minimum and maximum speciation or extinction rates increased, differential speciation and extinction became easier to detect. However, there was essentially no power to detect differential extinction for the simulated trees unless the character had a large effect on rates of extinction (figure 3.3 f–h).
Differential speciation was easier to detect in simulations with higher diffusion parameters. Increasing the diffusion parameter increases the sampling of the character space where λ(x) changes most with respect to x, effectively increasing the sampling of the relevant character states. Power to detect trait-dependent speciation also depended strongly on the starting position of the simulation (data not shown). The power to detect differential speciation was highest on trees where the simulation started slightly below the mean diversification rate, again reflecting the amount of sampling of the informative region of differential speciation.
Even though the trees were simulated using a sigmoidal effect of character on speciation or extinction, linear models were often preferred to the full sigmoidal model when fits were compared using the Akaike information criterion (figure 3.4). Sigmoidal models were rarely preferred for the extinction function. As tree size increased, the sigmoidal speciation models were more often preferred over linear models (data not shown). However, even for 500 species trees, sigmoidal functions were preferred in less than 40% of significant cases.
This is probably because, for most smaller trees, most species occupy the roughly linear part of the sigmoidal function (figure 3.4). In addition, the sigmoidal model that was fit was often a step function rather than a
[Figure 3.3 panels: columns No effect, Low, Medium, High; rows Speciation (a–d) and Extinction (e–h); x-axis: Number of species; y-axis: Proportion of tests significant; legend: σ² = 0.01, σ² = 0.025]
Figure 3.3: Power to detect differential speciation (a–d) and differential extinction (e–h) with QuaSSE on simulated phylogenies. Trees were evolved with 125, 250 or 500 species, and no, low, medium, or high effect of a character state on speciation or extinction. The character state evolved under Brownian motion at a low or high rate σ², as indicated by dashed or solid lines. Error bars are 95% binomial confidence intervals over the 200 replicate simulated trees. Horizontal dotted lines indicate the 5% significance level.
[Figure 3.4 panels: columns Constant, Linear, Sigmoid; rows Speciation rate (a–c; 48%, 36%, 16%) and Extinction rate (d–f; 70%, 23%, 6%); x-axis: Character state]
Figure 3.4: Representative speciation (a–c) and extinction (d–f) function fits. Functions were fit to 200 trees containing 250 taxa with a high rate of character evolution (σ² = 0.025) and a medium effect of speciation (λ(x) ranged from 0.1 to 0.2) or a high effect of extinction (µ(x) ranged from 0.03 to 0.09). Thick black lines show the true values used in the simulations. Gray lines show fits for individual trees, and span the range of observed character data for each tree. Only the best fit chosen by likelihood ratio test or AIC is displayed (see text for details).
smooth sigmoid (figure 3.4), showing that while differences in extreme speciation rates are detectable, the exact pattern may not be.
To investigate if there is power to detect directional changes in character states, I ran some simulations that included a nonzero directional parameter, ϕ. I used only the set of parameters with the highest power in the absence of directional evolution (500 species, high rate of character evolution, large effect of the trait on speciation). I ran simulations where this directional tendency was negative and opposed species selection (i.e., the trait tended to decrease along a lineage, while species with larger trait values had higher rates of speciation) and where the tendency was positive and reinforced species selection. When rates of this directional tendency were very high in either direction, the power to detect differential diversification was reduced as character states tended to evolve into flatter regions of the speciation function (figure 3.5). When the directional tendency opposed species selection, there was some power to detect the trend, but this power was never high for the parameter values explored (figure 3.5). There was essentially no power to detect the presence of the directional tendency where it reinforced species selection, as both processes moved most character states into regions where speciation was constant with respect to the character state.
3.6 Application to primate body size data
Several studies have suggested that speciation and/or extinction are correlated with animal body size. Typically, smaller bodied species have been hypothesised to have higher speciation rates or lower extinction rates than larger bodied species (e.g., Cardillo et al., 2005; Clauset and Erwin, 2008). In some groups there is also palaeontological evidence for increases in body size over evolutionary time, with species tending to be larger than their ancestors (Cope’s rule; Jablonski 1997; Alroy 1998). I investigated body size evolution in primates, testing whether body size is a correlate of speciation or extinction. I used a recent primate supertree (Vos and Mooers, 2006).
[Figure 3.5 plot: x-axis: Rate of directional character change (−0.10 to 0.10); y-axis: Proportion of tests significant; legend: Differential speciation, Directional trend]
Figure 3.5: Power to detect trait-dependent speciation (dashed lines/open circles) and directional character change (solid lines/filled circles) at different rates of the directional tendency on simulated 500-species phylogenies.
This tree contains several polytomies that need to be resolved before running the analysis (213 of 232 internal nodes are resolved; figure 3.6). The polytomies reflect phylogenetic uncertainty, but the likelihood calculations above would misinterpret them as bursts of speciation followed by relatively low rates of speciation. It is not sufficient to randomly resolve the nodes and leave branch lengths as effectively zero, as branch lengths need to be specified in some way to prevent this misinterpretation. To do this, I used a bifurcating tree generated by T. Kuhn, in which he randomly resolved the topology of the polytomies and then used BEAST (Drummond and Rambaut, 2007) to simulate unknown branch lengths under a constant-rates birth-death model (Kuhn et al., 2011). I used log-transformed female body mass from a recent collection of primate trait data as a measure of size (Redding et al., 2010).
I fit several functions: constant rates for both speciation and extinction, and models where the speciation or extinction function was linear, sigmoidal, or modal. For the modal function, I used a vertically offset Gaussian
\[
y_0 + (y_1 - y_0) \exp\left( -\frac{(x - x_{\mathrm{mid}})^2}{2\omega_x^2} \right)
\]
where y_0 is the rate at low and high x, y_1 is the rate at the mid point x_mid, and ω_x² is the width (variance) of the Gaussian kernel. To test for the presence of directional body size evolution (e.g., Cope’s rule) I also ran models where there was a nonzero, but constant, directional term (ϕ). Among models with a directional relationship between log body size and speciation, there was strong support for a positive linear relationship between log body size and
speciation rates (likelihood ratio test (LRT) against the constant rate model: χ²₁ = 10.8, p = 0.001), with a step-shaped sigmoidal curve preferred (∆AIC = 2.8, figure 3.7). Contrary to the predictions above, speciation rates were inferred to increase with increasing body size. However, the best fit model was a modal-speciation model (table 3.1), where species with body masses around 2.5–13.4 kg had elevated speciation rates (figure 3.7).
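The vertically offset Gaussian used for the modal models can be written as a short Python function. This is a sketch for illustration; the function name `modal_rate` and the example parameter values are assumptions, not fitted estimates.

```python
import numpy as np

def modal_rate(x, y0, y1, x_mid, omega2):
    """Vertically offset Gaussian: rate y1 at x_mid, decaying to y0 at low and high x.

    omega2 is the variance (width) of the Gaussian kernel.
    """
    return y0 + (y1 - y0) * np.exp(-((x - x_mid) ** 2) / (2.0 * omega2))

# Illustrative values only: base rate 0.05, peak rate 0.15 at log mass 8.5.
log_mass = np.linspace(2.0, 12.0, 101)
rates = modal_rate(log_mass, y0=0.05, y1=0.15, x_mid=8.5, omega2=0.5)
```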
[Figure 3.6 plot: primate phylogeny with species tip labels and clade labels Cercopithecoidea, Hominoidea, Platyrrhini, Tarsiidae, and Strepsirhini; x-axes: Time (Ma), 60 to 0, and ln(mass), 2 to 10]
Figure 3.6: Phylogenetic tree of the primates, from Vos and Mooers (2006). Log body size (in grams) is shown by the horizontal bar for each species. The vertical dashed lines indicate the approximate range of body masses for which QuaSSE inferred elevated speciation rates under the “modal” speciation model (the lines indicate the masses at which the speciation rate is 10% above the base rate). The arrow indicates where MEDUSA inferred a shift in speciation and extinction rates compared with the rest of the tree.
Including a positive directional term, consistent with increasing average body size along lineages, improved model fit significantly (table 3.1).
The body size with elevated speciation rates inferred using the modal-speciation function is concentrated in the Cercopithecoidea and Hominoidea (old world monkeys and apes), and this section of the tree does appear to have undergone a recent burst of relatively rapid diversification compared with the rest of the tree (figure 3.6). It is possible that any character that is concentrated in this clade could lead to a significant correlation with diversification, so this body size result could be spurious. To explore this further, I used MEDUSA (Modelling Evolutionary Diversification Under Stepwise AIC; Alfaro et al. 2009) to test for clade-specific differences in diversification across the tree. Using the suggested AIC difference of 4, there was support for a single partition that separated the tree into the old world monkey clade (superfamily Cercopithecoidea; figure 3.6), and
the rest of the tree (LRT χ²₃ = 24.9, p < 0.0001).
I modified QuaSSE to allow this partition. I allowed a “background” group (all clades except for the old world monkeys) to have one set of speciation and extinction functions and a “foreground” group (the old world monkeys) to have another. The two groups shared a common diffusion coefficient and I set the directional term to zero. A model with constant speciation and extinction functions that could differ between the partitions had a lower (better) AIC and fewer parameters than the unpartitioned modal-speciation model (table 3.1). I found no support for any relationship between body size and either speciation or extinction for the “background” group (table 3.1). However, there was support for a model where speciation was a decreasing linear function of log-body size among the old world monkeys (LRT vs. the constant-rate partitioned model:
ò χÔ = .Ô, p = ý.ýò¥) or where extinction was an increasing function of log-body size ò within this group (χÔ = À., p = ý.ýýò), suggesting decreasing diversication with increased body size in this group. However, the latter t suggested an extremely high rate of extinction among large old world monkeys (gure ì.ß).
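The likelihood ratio tests quoted here can be checked with a few lines of R. The sketch below is an illustration only: it takes the log likelihoods as read from table 3.1 (rounded to one decimal place, so a statistic can differ from the quoted value in the last digit) and assumes that each comparison adds a single parameter. The helper name `lrt` is hypothetical.

```r
# Likelihood ratio test for two nested models.
# lnl.full, lnl.null: maximised log likelihoods; df: difference in parameters.
lrt <- function(lnl.full, lnl.null, df) {
  stat <- 2 * (lnl.full - lnl.null)
  c(chisq = stat, df = df, p = pchisq(stat, df, lower.tail = FALSE))
}

# Partitioned tree: constant rates (null) against linear rates in the
# foreground clade. Values as read from table 3.1, rounded to 1 d.p.
lnl.constant  <- -822.0
lnl.lambda.fg <- -819.4  # speciation linear in log body size
lnl.mu.fg     <- -817.1  # extinction linear in log body size

test.lambda <- lrt(lnl.lambda.fg, lnl.constant, df = 1)
test.mu     <- lrt(lnl.mu.fg, lnl.constant, df = 1)
print(round(rbind(lambda.fg = test.lambda, mu.fg = test.mu), 3))
```

Both comparisons are valid only because the constant-rate partitioned model is nested within each linear model.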
Table 3.1: Summary of model fits for the association between body size and diversification for primates.
Model type            ln L     n    AIC      ∆AIC
Constant              -834.8   3    1675.7   29.6
Linear λ              -829.4   4    1666.9   20.8
Sigmoidal λ           -826.0   6    1664.1   18.0
Modal λ               -822.4   6    1656.8   10.7
With directional tendency:
Linear λ              -826.0   5    1661.9   15.8
Sigmoidal λ           -823.8   7    1661.7   15.6
Modal λ               -818.8   7    1651.7    5.6
Partitioned tree (no directional tendency):
Constant              -822.0   5    1654.0    7.8
Linear λ (fg)         -819.4   6    1650.9    4.7
Linear λ (bg)         -821.6   6    1655.2    9.0
Linear λ (both)       -819.1   7    1652.2    6.1
Linear µ (fg)         -817.1   6    1646.1    0.0
Linear µ (bg)         -821.7   6    1655.4    9.3
Linear µ (both)       -816.8   7    1647.7    1.6
Notes: “fg” and “bg” refer to the Cercopithecoidea clade (old world monkeys) and the rest of the tree, respectively. “Both” is where the functions were fit to both groups separately. ln L is the log likelihood of the maximum likelihood fit, n is the number of parameters, and ∆AIC is the AIC difference relative to the best model (linear µ (fg)).
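The AIC columns follow directly from the log likelihoods and parameter counts: AIC = 2n - 2 ln L, and ∆AIC is the difference from the smallest AIC. As a hedged illustration, the sketch below recomputes both for a subset of rows, using values as read from the table. Because ln L is rounded to one decimal place, the recomputed AICs can differ from the tabulated ones by about 0.1.

```r
# A subset of the model fits: log likelihood (lnL) and number of parameters (n).
fits <- data.frame(
  model = c("Constant", "Modal lambda", "Partitioned constant",
            "Linear lambda (fg)", "Linear mu (fg)"),
  lnL   = c(-834.8, -822.4, -822.0, -819.4, -817.1),
  n     = c(3, 6, 5, 6, 6)
)

# AIC adds a penalty of 2 for every estimated parameter.
fits$AIC  <- 2 * fits$n - 2 * fits$lnL
fits$dAIC <- fits$AIC - min(fits$AIC)

# Models ordered from best (lowest AIC) to worst.
print(fits[order(fits$AIC), ])
```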
The results presented here are in broad accord with the analyses of body size evolution in primates by Paradis (2005), Gittleman and Purvis (1998) and Freckleton et al. (2008), despite using different methods, phylogenetic trees, and data sets. These studies all initially inferred a relationship of increasing diversification rates with increasing body size, though this was not significant in Gittleman and Purvis (1998). Paradis (2005) also looked at a partitioned data set and found support for decreasing diversification with increasing body size after allowing clade-specific diversification rates.
3.7 Discussion
How much signal is there within a phylogeny about the evolutionary processes that generated it? On the simulated trees used here, it was generally possible to infer the correct trend in the character dependence of speciation, but difficult to infer the exact functional form of the trend. For instance, both the linear and sigmoidal functions capture the tendency of speciation to increase or decrease with increasing character state, and the inferred linear speciation function was often a rough characterisation of the true function (figure 3.4). It is often difficult to infer ancestral states with confidence (which are needed to identify a speciation-trait correlation, even though this is only done implicitly here), as the information provided by the tips attenuates deeper into the past. Here, adding more species improved the ability to recover the more specific model, but this may be through the larger number of shallow nodes, rather than through more accurate information about deep ancestral states (Mossel and Steel, 2005).
Extinction may not be reliably detectable on real (non-simulated) molecular phylogenies. Accurate detection of extinction requires that we determine the rate at which species fail to appear in our phylogeny, which is a difficult task. Maximum likelihood estimates of the extinction rates are frequently zero, despite fossil evidence of nonzero extinction (e.g., Nee, 2006; Purvis, 2008). However, even when ML estimates are zero, the confidence intervals around extinction rate estimates may be large, allowing potentially high levels of extinction to be consistent with the observed data. Where we have strong independent evidence of high extinction rates, perhaps our analyses would be improved by including these rates directly, either through a prior distribution on extinction rates in a Bayesian analysis, or by using this estimated rate and not attempting to directly estimate it from the phylogeny. The likelihood calculations proposed here would hold in either case.
[Plot: three panels sharing an x-axis of body size (g), log scale from 10^2 to 10^5; y-axes are (a) frequency, (b) speciation rate, (c) speciation or extinction rate.]
Figure 3.7: Primate speciation and extinction rate model fits. (a) Primate body mass distribution. (b) Maximum likelihood speciation rate model fits for the complete tree, showing constant, linear, sigmoidal and modal functions. The modal model provided the best fit to the data (table 3.1). (c) Maximum likelihood speciation and extinction functions on the partitioned tree: grey lines are fits for the “background group”, and the black lines are fits for the Cercopithecoidea. The reverse-L shape of the Cercopithecoidea extinction rate function (black dashed line in panel c) indicates a zero extinction rate for all observed body sizes and an extremely high extinction rate for species with body masses slightly greater than the largest observed mass in the Cercopithecoidea.
Many phylogenies appear to show some sort of slowdown in lineage accumulation towards the present, which will generate low extinction rate estimates. The response to this has generally been to alter the model of diversification. Most commonly, slowdowns have been interpreted as evidence that speciation rates may be density dependent (e.g., McPeek, 2008; Phillimore and Price, 2008), and various alternative models of cladogenesis have been proposed and tested based on this pattern (e.g., McPeek, 2008; Rabosky, 2009b). Because QuaSSE uses the birth-death model, which does not allow interaction among lineages, it would not be straightforward to incorporate these types of dynamics directly, though it is possible that they may be approximated (Rabosky and Lovette 2008, but see Bokma 2009). Care should be taken to interpret results from QuaSSE and other birth-death based models (e.g., Nee et al., 1994b; Paradis, 2005; Rabosky, 2006; Maddison et al., 2007; Freckleton et al., 2008; Alfaro et al., 2009) in light of these limitations.
An alternative explanation for the observed “slowdown”, and consequent problems for estimating speciation and extinction rates, is that our methods of tree construction and ultrametricisation create trees that are incongruous with the model. Extinction rate estimates will always be sensitive to the precise lengths of terminal branches, and any consistent bias towards lengthening the terminal branches will cause problems (Purvis, 2008).
Furthermore, our delineation of species is generally retrospective, with lineages counted as species once both morphological changes and reproductive isolation have occurred. However, many isolated lineages may be considered “species” in that they will never again exchange genes. Some of these would eventually become recognised species, but most will go extinct. However, simple birth-death models do not include this sort of process; incorporating such lags in species recognition into tree construction or diversification models, along with information from the fossil record where available, may help with efforts to infer meaningful speciation and extinction rates.
The likelihood equations derived here provide exact solutions to the forward-time dynamics described by Paradis (2005) and Freckleton et al. (2008), and also to the early model of Slatkin (1981), but ignoring character evolution at nodes. The key advance of this work is that it treats character evolution and cladogenesis simultaneously. Though the equations cannot be solved directly, likelihoods computed using this approach will correspond exactly to those under this model of character evolution and cladogenesis.
Because the likelihood method here uses all of the available phylogenetic and character data, it should have higher statistical power than methods based on approximations, such as first inferring ancestral states and ignoring the character-dependent diversification process when doing so. Run on the same trees, the model of Freckleton et al. (2008) had approximately 26% of the power of QuaSSE at detecting differential speciation (data not shown). However, the factors that affect power were the same as identified by Freckleton et al. (2008); increased rates of character evolution, stronger effects of a character on speciation, and larger trees all increased power (figure 3.3). QuaSSE does retain some ability to detect differential extinction in contrast to the method of Freckleton et al. (2008), but this power appears to be limited and parameter dependent (figure 3.3). QuaSSE was also robust to the levels of background extinction used here (cf. Paradis, 2005).
Despite their assumptions, diffusion models of character evolution and birth-death models of cladogenesis have given us insights over the last few decades into correlated character evolution (Felsenstein, 1985), evolutionary constraints (Hansen and Martins, 1996) and patterns of diversification (Alfaro et al., 2009). While the combination of the birth-death and diffusion methods used in QuaSSE may inherit the limitations of both methods, it presents a tractable and powerful method that will help to answer long-standing questions about the correlates of diversification from phylogenetic data and current character distributions. As Freckleton et al. (2008) noted, we have no general expectation of what the relationship between speciation or extinction and character states might look like. Because QuaSSE can use arbitrary speciation and extinction functions, it allows investigation of alternative functions. However, we should not generally expect to extract more than general trends from the data, especially where variation in extinction is important in affecting patterns of diversification.
Chapter 4
Diversitree: Comparative Phylogenetic Analyses of Diversification in R
4.1 Summary
The R package “diversitree” contains a number of classical and contemporary comparative phylogenetic methods. Key included methods are BiSSE (Binary State Speciation and Extinction), MuSSE (a multi-state extension of BiSSE), and QuaSSE (Quantitative State Speciation and Extinction). Diversitree also includes methods for analysing trait evolution and estimating speciation/extinction rates independently. In this chapter, I describe the features and demonstrate use of the package, using a new method, MuSSE (Multi State Speciation and Extinction), to examine the joint effects of two traits on speciation. Diversitree is open source and available on CRAN (the Comprehensive R Archive Network). A tutorial and sample data sets can be downloaded from http://www.zoology.ubc.ca/prog/diversitree.
4.2 Introduction
The tree of life is remarkably uneven in both taxonomic and trait diversity; describing this unevenness and revealing its underlying causes are major focuses of evolutionary biology. Comparative phylogenetic methods have been widely used to study patterns and rates of both trait evolution (Felsenstein, 1985; Pagel, 1994) and diversification (Nee et al., 1994b). A recently developed set of models unites both trait evolution and species diversification, avoiding biases that occur when the two are treated separately (Maddison, 2006). This includes the “BiSSE” method (Binary State Speciation and Extinction; Maddison et al., 2007), as well as similar methods that generalise the approach to non-anagenetic trait evolution and to quantitative traits (see below).
In this chapter, I describe the “diversitree” package for R (R Development Core Team, 2012). Diversitree implements several recently developed methods for analysing trait evolution, speciation, extinction, and their interactions. Below, I describe the general approach of the package and the methods that it contains. I introduce a generalisation of the BiSSE method to multi-state characters or to combinations of binary traits (MuSSE: Multi State Speciation and Extinction). Finally, I demonstrate the package, and MuSSE, with an example of social trait evolution in primates.
4.3 The methods
The diversitree package implements a series of methods for detecting associations between traits and rates of speciation and/or extinction given a phylogeny and trait data, including the BiSSE method (Maddison et al., 2007). Under BiSSE, speciation and extinction follow a birth-death process, where the rates of speciation and extinction may vary with a binary trait, itself evolving following a continuous-time Markov process. BiSSE has been used to look at the associations between many different traits and speciation or extinction, including migration in warblers (Winger et al., 2012), fruiting body morphology in fungi (Wilson et al., 2011), and recombination in plants (Johnson et al., 2011).
In its original formulation, BiSSE assumes that character change occurs only along branches (anagenetic change), using the same model of character evolution as used in the “discrete” (Pagel, 1994) or “Mk” models (Lewis, 2001). This may not always be a reasonable assumption, and we might expect some characters to show considerable change during speciation (cladogenetic change). One such example is geographic range; while geographic ranges are expected to change anagenetically, allopatric speciation should also alter range sizes. The GeoSSE (Geographic SSE; Goldberg et al., 2011) method allows speciation rates to vary depending on a species’ presence in two different geographic regions, allowing within- and between-region speciation. This has been used to examine diversification in plants endemic to serpentine regions (Anacker et al., 2010). More recently, the BiSSE-ness (BiSSE-Node Enhanced State Shift; Magnuson-Ford and Otto, 2012) and ClaSSE (Cladogenetic SSE; Goldberg and Igić, in press) models have been developed to allow both anagenetic and cladogenetic character evolution, such as that expected for traits involved in ecological speciation (Schluter, 2009). Importantly, with extinction or incomplete taxonomic sampling not all speciation events will appear as nodes in a phylogeny; these missing nodes must be modelled to accurately estimate the rate of cladogenetic trait change (see Nee et al., 1994b and Bokma, 2008b, and note that the placement of these missing nodes is nonlinear in time).
Diversitree also includes methods for non-binary traits. QuaSSE (Quantitative SSE; FitzJohn, 2010) allows speciation and extinction rates to be modelled as any function of a continuously varying trait, which itself evolves under Brownian motion. This has been used to test for associations between diversification rates and body size in snakes (Burbrink et al., 2012) and dispersal ability in birds (Claramunt et al., 2012). Finally, MuSSE extends BiSSE to multi-state traits or combinations of binary traits.
Diversitree includes variants that relax some of the original assumptions of the included methods. Birth-death based speciation/extinction models will give biased parameter estimates unless all extant taxa in the focal clade are present in a phylogeny. For cases where not all extant species are included in a phylogeny, diversitree includes methods for where species are included randomly or where all species are represented in “unresolved clades” (FitzJohn et al., 2009).
Rates of speciation, extinction or character change can be set to vary as any user-supplied function of time. Similar approaches have been used elsewhere to model slowdowns in speciation or diversification over time (e.g., Rabosky and Glor, 2010). Rates of speciation, extinction, and character change may also be allowed to vary in different regions of a tree. This is similar to MEDUSA (Modelling Evolutionary Diversification Under Stepwise AIC: Alfaro et al., 2009) for diversification and AUTEUR (Eastman et al., 2011) for continuous character evolution. Such methods can be used to test whether membership of a clade that has undergone a shift in diversification rates is misleading BiSSE or other methods. For example, if particular trait values are concentrated in a highly diverse clade, BiSSE may detect an association when none exists (see applications in Johnson et al., 2011 and FitzJohn, 2010, the diversitree tutorial for a worked example, and further discussion in Read and Nee, 1995).
In the above models, if speciation and extinction do not vary with character state, the models converge on classical models of character evolution (e.g., Pagel, 1994) and state-independent speciation and extinction (Nee et al., 1994b). For completeness, these models are also included. However, when comparing models to determine if traits are associated with speciation or extinction using likelihood ratio tests, comparisons must involve only nested models to be valid. For example, BiSSE and Mk2 are not directly comparable, but BiSSE can be compared to a constrained version of BiSSE that disallows state-dependent diversification. See table 4.1 for a summary of included methods.
In addition to the likelihood calculations, tree simulation routines are implemented for birth-death, BiSSE, MuSSE, and QuaSSE. Simulating character evolution on a given tree is possible for discrete (binary or multi-state) characters and continuous characters under Brownian motion and Ornstein-Uhlenbeck processes. Likelihood-based ancestral state reconstruction (Schluter et al., 1997) and stochastic character mapping (Bollback, 2006) are implemented for discrete characters.
4.4 The approach
In diversitree the inference process is decoupled from the likelihood calculations, allowing users to take advantage of the programmatic flexibility of R. Analyses therefore require at least two steps. First, the user creates a likelihood function from their tree and data, using a make.xxx function (where xxx is one of the model types available).
Table 4.1: Summary of model types available in diversitree (as of version 1.0).
Name        Trait(a)  Missing taxa(b)  Extensions(c)  Description and reference
bd          -         Sk, Un           Sp, Tv         Constant-rate birth-death (Nee et al., 1994b)
mk2, mkn    B, M      -                Sp, Tv         Markov discrete character evolution (Pagel, 1994; Lewis, 2001)
bisse       B         Sk, Un           Sp, Tv         Binary State Speciation and Extinction (Maddison et al., 2007; FitzJohn et al., 2009)
bisseness   B         Sk, Un           -              BiSSE-ness (Magnuson-Ford and Otto, 2012)
geosse      T         Sk               -              Geographic State Speciation and Extinction (Goldberg et al., 2011)
musse       M         Sk, Un           Sp, Tv         Multi State Speciation and Extinction
classe      M         Sk               -              Cladogenetic State Speciation and Extinction (Goldberg and Igić, in press)
bm          Q         -                -              Brownian motion
ou          Q         -                -              Ornstein-Uhlenbeck
quasse      Q         Sk               Sp             Quantitative State Speciation and Extinction (FitzJohn, 2010)
(a) Trait type key: B = Binary (0/1), T = Ternary (three combinations of presence/absence in two regions), M = Multi-state (1, 2, 3, ...), Q = Quantitative (real-valued).
(b) Missing taxa support: Sk = “Skeleton tree” (random sampling) correction, Un = “Unresolved clade”.
(c) Extensions: Sp = “Split tree” (allows MEDUSA-style different rate classes in different partitions of the tree), Tv = Time-varying rates.
For example, to look at character evolution under a two-state Markov model (Lewis, 2001), we would enter:
lik <- make.mk2(tree, states)
Secondly, we can find the maximum likelihood (ML) parameter vector:
fit <- find.mle(lik, starting.pars)
or use it in a Bayesian analysis by running an MCMC (Markov chain Monte Carlo) chain (with an appropriate prior):
samples <- mcmc(lik, starting.pars, nsteps, proposal.widths, prior)
or use it in some other way (for example, integrating the function numerically to compute the “integrated likelihood” for Bayes factors, e.g., Kass and Raftery, 1995).
Between these steps, the likelihood function can be constrained arbitrarily. Diversitree’s constrain function allows several natural constraints, such as setting one parameter equal to another, or to a specific numerical value. For example, to constrain the forward and backward transition rates to be equal (reducing the Mk2 model to the Jukes-Cantor model):
lik.jc <- constrain(lik, q01 ~ q10)
We could then find the ML parameter by entering
fit.jc <- find.mle(lik.jc, starting.parameters)
These nested models could then be compared using a likelihood ratio test.
Most of the methods included in diversitree are computationally challenging, but there are a number of options for controlling how the calculations are performed. Among these, the user can use different ODE solvers, and the accuracy of the calculations can be traded off against speed for most methods. Algorithms that have proven to be reasonably robust (in my experience) are used by default. For some models, such as Mk2, Brownian motion, and Ornstein-Uhlenbeck, diversitree provides alternative algorithms that perform better with large numbers of states or large trees. The possible options and algorithms are discussed in Appendix C.1.
Diversitree builds on much existing software: ape (Paradis et al., 2004) is used for tree loading and manipulation, the deSolve package (Soetaert et al., 2010) and sundials library (Hindmarsh et al., 2005) are used for solving the systems of differential equations for the discrete trait models, and FFTW (Frigo and Johnson, 2005) is used to solve the partial differential equations in QuaSSE. In addition to the R interface, Wayne Maddison has developed a wrapper around some of diversitree’s functionality to allow use from within Mesquite (Maddison and Maddison, 2008), using a user-friendly point-and-click interface.
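The two-step workflow described in this section can be strung together on simulated data. The following is a minimal sketch, assuming diversitree is installed; the simulation parameters, the seed, and the starting values are arbitrary choices for illustration and are not values used in this thesis.

```r
library(diversitree)

# Simulate a 50-species tree and a binary character under BiSSE.
# Parameter order: lambda0, lambda1, mu0, mu1, q01, q10.
pars <- c(0.1, 0.1, 0.03, 0.03, 0.05, 0.1)
set.seed(1)
phy <- trees(pars, "bisse", n = 1, max.taxa = 50, x0 = 0)[[1]]

# Step 1: build the likelihood function for the two-state Markov model.
lik <- make.mk2(phy, phy$tip.state)

# Step 2: maximum likelihood fit of the full model (q01 and q10 both free).
fit <- find.mle(lik, c(0.1, 0.1))

# Constrain q01 = q10 and refit; only q10 remains as a free parameter.
lik.jc <- constrain(lik, q01 ~ q10)
fit.jc <- find.mle(lik.jc, 0.1)

# Likelihood ratio test of the nested pair.
print(coef(fit))
print(anova(fit, equal.rates = fit.jc))
```

Note that the starting vector passed to `find.mle` must match the number of free parameters, which drops from two to one after the constraint.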
4.5 The MuSSE model
MuSSE is a straightforward extension of BiSSE to discrete traits with more than two states. Some characters are not naturally binary (e.g., mating systems, diets, or count data), and MuSSE allows these to be treated naturally. This method has been used to examine the effect of diet (faunivore, folivore, frugivore) in primates (Gómez and Verdú, 2012). Alternatively, MuSSE can be used to disentangle the relative importance of two or more traits to diversification (see below).
Suppose that we have a trait that takes values $1, 2, \ldots, k$ that might influence speciation and/or extinction. Using the notation and approach of Maddison et al. (2007), let lineages in state $i$ speciate at rate $\lambda_i$, go extinct at rate $\mu_i$, and transition to state $j \neq i$ at rate $q_{ij}$. For $k$ states, there are $k$ speciation rates, $k$ extinction rates, and $k(k - 1)$ transition rates.
4.5.1 Derivation
Let $D_{N,i}(t)$ be the probability of a lineage in state $i$ at time $t$ before the present ($t = 0$) evolving into its descendant clade as observed, and let $E_i(t)$ be the probability that a lineage in state $i$ at time $t$, and all of its descendants, goes extinct by the present. Under the same assumptions as Maddison et al. (2007) and using the same approach, it is possible to derive a set of ordinary differential equations that describe the evolution of the $D$ and $E$ variables over time:
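\begin{align}
\frac{dE_i(t)}{dt} &= \mu_i - \Bigl(\lambda_i + \mu_i + \sum_{j \neq i} q_{ij}\Bigr) E_i(t) + \lambda_i E_i(t)^2 + \sum_{j \neq i} q_{ij} E_j(t) \tag{4.1a} \\
\frac{dD_{N,i}(t)}{dt} &= -\Bigl(\lambda_i + \mu_i + \sum_{j \neq i} q_{ij}\Bigr) D_{N,i}(t) + 2 \lambda_i E_i(t) D_{N,i}(t) + \sum_{j \neq i} q_{ij} D_{N,j}(t). \tag{4.1b}
\end{align}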
For $k$ states, there are $2k$ equations. We can solve this system of equations numerically from the tip to base of a branch. As with BiSSE, the initial conditions for the $D$ variables are 1 when the trait combination is consistent with the data, and 0 otherwise, while the initial conditions for all $E$ variables are zero. Missing trait data are allowed by setting all $D$ values to 1 (any state is consistent with the observed data). For the multi-trait case, if state information is available for some traits and not the others, the initial conditions are modified to allow any trait combination consistent with the observed data. For example, if trait A is in state 0 and the state of trait B is unknown, the $D$ variables will be 1 for the combinations (0, 0) and (0, 1) and zero for combinations (1, 0) and (1, 1). When the phylogeny is incomplete, the initial conditions can be modified by assuming random sampling (see FitzJohn et al., 2009). At the node $N'$ that joins lineages $N$ and $M$, we multiply the probabilities of both daughter lineages together with the rate of speciation:
\begin{equation}
D_{N',i}(t) = D_{N,i}(t) \, D_{M,i}(t) \, \lambda_i. \tag{4.2}
\end{equation}
The equations here assume no cladogenetic change, but this can be added following the approach in Magnuson-Ford and Otto (2012) or Goldberg and Igić (in press).
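To make the structure of equations 4.1a and 4.1b concrete, the sketch below integrates them along a single branch in plain R with a fixed-step fourth-order Runge-Kutta scheme, then applies the node calculation of equation 4.2. This is only an illustration under stated assumptions: the function names (`musse.derivs`, `rk4.branch`), the step count, and the parameter values are all hypothetical, and diversitree itself relies on dedicated ODE solvers rather than on this code.

```r
# Right-hand side of equations 4.1a and 4.1b for k states.
# y = c(E_1..E_k, D_1..D_k); lambda, mu: length-k rate vectors;
# Q: k x k matrix of transition rates q_ij (the diagonal is ignored).
musse.derivs <- function(y, lambda, mu, Q) {
  k <- length(lambda)
  E <- y[seq_len(k)]
  D <- y[k + seq_len(k)]
  diag(Q) <- 0
  out.rate <- lambda + mu + rowSums(Q)  # total rate of events in state i
  dE <- mu - out.rate * E + lambda * E^2 + drop(Q %*% E)
  dD <- -out.rate * D + 2 * lambda * E * D + drop(Q %*% D)
  c(dE, dD)
}

# Integrate from the tip end of a branch back in time by len time units.
rk4.branch <- function(y0, len, lambda, mu, Q, nsteps = 1000) {
  h <- len / nsteps
  y <- y0
  for (s in seq_len(nsteps)) {
    k1 <- musse.derivs(y, lambda, mu, Q)
    k2 <- musse.derivs(y + h / 2 * k1, lambda, mu, Q)
    k3 <- musse.derivs(y + h / 2 * k2, lambda, mu, Q)
    k4 <- musse.derivs(y + h * k3, lambda, mu, Q)
    y <- y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
  }
  y
}

# Three states; the tip is observed in state 2, so D = (0, 1, 0) and E = 0.
lambda <- c(0.1, 0.15, 0.2)
mu     <- c(0.03, 0.03, 0.03)
Q      <- matrix(0.01, 3, 3)
y.base <- rk4.branch(c(rep(0, 3), 0, 1, 0), len = 5, lambda, mu, Q)
print(round(y.base, 5))

# Node calculation (equation 4.2) for two identical daughter branches.
D.node <- y.base[4:6]^2 * lambda
```

A fixed step size keeps the example short; an adaptive solver would be the more usual choice for stiff or long branches.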
As the number of parameters in MuSSE grows quadratically with the number of states, care will often be required to prevent over-fitting and pathological behaviour associated with estimation of rate parameters involving states that are rarely observed. In particular, if some state $i$ is not observed, then the likelihood surface never has a negative slope with increasing $q_{ij}$ ($j \neq i$) and $\mu_i$, causing ML values for these parameters to tend to infinity, in turn causing problems for both the maximisation and likelihood calculation routines. For ordinal data, constraining the transition rates so that $q_{ij} = 0$ for $|i - j| > 1$ may be useful.
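To see how quickly the parameter count grows, and how much the ordinal constraint helps, the following sketch counts free parameters with and without the restriction $q_{ij} = 0$ for $|i - j| > 1$. The helper names are hypothetical; in diversitree itself the restriction would be imposed on the likelihood function with `constrain`.

```r
# Number of MuSSE parameters for k states: k speciation rates,
# k extinction rates, and k(k - 1) transition rates.
n.pars.full <- function(k) 2 * k + k * (k - 1)

# Logical matrix of transitions allowed for an ordinal trait:
# only moves between adjacent states (|i - j| == 1).
ordinal.mask <- function(k) {
  idx <- seq_len(k)
  abs(outer(idx, idx, "-")) == 1
}

# Parameters remaining once non-adjacent transitions are fixed at zero.
n.pars.ordinal <- function(k) 2 * k + sum(ordinal.mask(k))

k <- 2:6
print(data.frame(k = k,
                 full = sapply(k, n.pars.full),
                 ordinal = sapply(k, n.pars.ordinal)))
```

Under the ordinal restriction the count grows linearly in the number of states rather than quadratically.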
4.5.2 Analysing multiple traits simultaneously
Alternatively, this method can be generalised to combinations of binary traits, following Pagel (1994); in this scheme, a discrete state would represent the combination of different binary traits; for $n$ binary traits there are $2^n$ possible states. For example, for a pair of binary traits there are four possible state combinations: (0, 0), (0, 1), (1, 0), (1, 1). We can denote these (1, 2, 3, 4), and use MuSSE directly. However, in this “multi-trait” model, parameters may be unintuitive to interpret, particularly as the number of traits increases. Moreover, with multiple traits we may be explicitly interested in asking if combinations of traits affect speciation or extinction non-additively, and this is difficult to determine with this parametrisation.
In diversitree, an alternative parametrisation is available to facilitate interpretation and model testing. Let $\lambda_{i,j}$ be the speciation rate of a species with states $A = i$, $B = j$, for two binary traits A and B. We can use a linear modelling approach and write
\begin{equation}
\lambda_{i,j} = \lambda_0 + \lambda_A X_A + \lambda_B X_B + \lambda_{AB} X_A X_B, \tag{4.3}
\end{equation}
where $X_A$ and $X_B$ are indicator variables that are 1 when traits A and B are in the “1” state (respectively), $\lambda_0$ is the “intercept” speciation rate (if all traits are in state 0), $\lambda_A$ and $\lambda_B$ are the “main effects” of traits A and B, and $\lambda_{AB}$ is the interaction between these. If a combination of A and B drives speciation, then a model with $\lambda_{AB}$ will fit better than a model with just the main effects. Similarly, for the extinction rate, we write
\begin{equation}
\mu_{i,j} = \mu_0 + \mu_A X_A + \mu_B X_B + \mu_{AB} X_A X_B. \tag{4.4}
\end{equation}
The same approach can be used for the character transition rates. If we follow Pagel (1994) and allow change in only a single trait during a single point in time, then for $n$ traits there are only $2n$ possible “types” of transitions (i.e., a $0 \to 1$ or $1 \to 0$ transition in one of the $n$ traits). However, the rate at which these transitions happen may vary depending on the state of the other traits. For example, with two traits, we can write the rate of transition in trait A from 0 to 1, given that trait B is in state $j$, as
\begin{equation}
q_{A01,j} = q_{A01,0} + q_{A01,B} X_B, \tag{4.5}
\end{equation}
where $q_{A01,0}$ is the intercept term and $q_{A01,B}$ is the main effect of trait B. In this scheme, if a model with $q_{A01,B}$ fits better than a model without, then the rate of $0 \to 1$ transition of trait A depends on the state of trait B. Similar schemes can be derived for more traits; for more than two traits, interaction terms will appear in the equations. For example, with three traits (A, B, and C)
\begin{equation}
q_{A01,j,k} = q_{A01,0} + q_{A01,B} X_B + q_{A01,C} X_C + q_{A01,BC} X_B X_C, \tag{4.6}
\end{equation}
where $q_{A01,C}$ is the main effect of trait C on the rate of character change of trait A from 0 to 1, and $q_{A01,BC}$ is an interaction effect that specifies the level of non-additivity of the traits B and C on character change of trait A. Of course, this parametrisation of transition rates is valid for studying character evolution in multiple binary traits without modelling its effect on diversification (as in Pagel, 1994), and this can be done with the make.mkn.multitrait function.
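The linear parametrisation of the speciation and extinction rates amounts to building a design matrix over the trait combinations. As an illustrative sketch (the function name `multitrait.rates` and the parameter values are assumptions chosen for the example, not part of diversitree), the per-combination rates for two binary traits can be recovered from the intercept, main effects, and interaction as follows.

```r
# Convert linear-model coefficients into a rate for each combination of two
# binary traits A and B (the same form applies to lambda and to mu).
# coefs: named vector with elements "0" (intercept), "A", "B", and "AB".
multitrait.rates <- function(coefs) {
  # Rows ordered (0,0), (0,1), (1,0), (1,1), written as (A,B).
  states <- expand.grid(B = 0:1, A = 0:1)[, c("A", "B")]
  X <- cbind("0" = 1, A = states$A, B = states$B, AB = states$A * states$B)
  rates <- drop(X %*% coefs[colnames(X)])
  names(rates) <- paste0("(", states$A, ",", states$B, ")")
  rates
}

# Trait A doubles the speciation rate; trait B has no effect.
lambda <- multitrait.rates(c("0" = 0.1, A = 0.1, B = 0, AB = 0))
print(lambda)

# A purely interactive effect: elevated speciation only when A = B = 1.
lambda.int <- multitrait.rates(c("0" = 0.1, A = 0, B = 0, AB = 0.1))
print(lambda.int)
```

Comparing fits with and without the `AB` term is what distinguishes additive from non-additive effects of the two traits.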
4.6 Simulation test assessing the power of MuSSE
There are a large number of distinct ways of modelling diversification with MuSSE, and I expect that the power of the model will depend strongly on the model specification. For example, one might have an ordinal multi-state trait, where transitions can only occur between adjacent states, and be interested in asking whether large or small values of that trait are associated with elevated rates of diversification. For a given number of states (> 2), such a model will have far fewer parameters (and greater power) than a model where the trait is purely categorical, such as diet, if all transitions are possible. The power of MuSSE will strongly depend on the number of estimated parameters (especially the character transition parameters), and I expect that for any more than four states, careful consideration of constraints in the transition parameters will be needed.
Here, I focus on a simple multi-trait case where there is some number of uncorrelated binary traits that evolve at the same rate, one of which influences the rate of speciation. I investigate the ability of MuSSE to correctly identify the trait associated with elevated speciation and to rule out the association with other traits, as a function of clade size and number of possible traits.
To simulate trees, I set the intercept speciation and extinction rates (λ0 and µ0) to 0.1 and 0.03 respectively, and character transition rates (qX01, qX10, for traits X = A, B, ...) to 0.01. I set λA = 0.1 so that when trait A is in state 1, the speciation rate is 0.2. When only a single trait is considered, these are the same parameters used by Maddison et al. (2007) in their “asymmetric speciation” case. I simulated phylogenies and character state transitions under the multitrait MuSSE model, starting at the root in one of the “low” speciation states (with A in state 0), sampling randomly for the other traits. Trees were simulated to contain 50, 100, 200, or 400 species, with 1, 2, 3, or 4 traits, and with 100 replicate trees for each of the 16 combinations. For each tree, I ran a Markov chain Monte Carlo (MCMC) analysis on a model where all speciation main effects were free to vary (but excluded interactions), fitting only intercepts for extinction and character change. For example, with two traits this meant that the free parameters were λ0, λA, λB, µ0, qA01,0, qA10,0, qB01,0, and qB10,0. This model is very close to the true model, but allows for uncertainty in which trait is responsible for increased speciation (trait A or B). I used an exponential prior with a mean of twice the state-independent diversification rate for all the underlying rate parameters (see Appendix C.3). I ran each chain for 10,000 steps, and discarded the first 500 steps as “burn-in”.
Because the “dummy” traits B, C, and D are equivalent where present, I report results primarily for trait A (which increases speciation rates when in state 1) and trait B (which does not affect speciation rates).
As the size of the tree increased, the credibility intervals around the main effects on speciation narrowed, and the mean estimated effect converged on the true values (see figure 4.1). The uncertainty around the dummy trait, B, was not strongly affected by the number of dummy traits that were included, and decreased slightly as more traits were included. For small trees (≤ 100 species), MuSSE underestimated the effect of trait A on speciation rates, especially as the number of traits increased.
Significance showed similar patterns. As tree size increased, power to correctly identify A as the trait associated with increased speciation increased (figure 4.2, blue lines), but for trees with 100 species or more this varied only weakly with the number of included traits. The dummy trait B was significant approximately 5% of the time (based on 95% credibility intervals): the rate expected due to type I error (figure 4.2, solid green lines).
To test how model misspecification would affect the results, I also reran the analyses with trait A omitted so that none of the analysed traits were truly associated with state-dependent diversification. The dummy trait B was incorrectly associated with increased speciation in up to 27% of trees (figure 4.2). While this effect was strongest when there were fewer dummy traits, the possibility of any trait being falsely associated with diversification increased. Indeed, where three dummy traits are included, the probability of associating any trait with increased speciation increased to 59% for the 400-species trees (figure 4.2, dotted orange lines).
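The parameterisation used in these simulations can be sketched in a few lines of base R. This is only an illustration of how an intercept and additive main effects map onto per-state speciation rates; it is not the simulation code itself, and the helper name `speciation_rate` is hypothetical. The values 0.1 (intercept) and 0.1 (main effect of trait A) are taken from the text, and the dummy trait B is given an effect of zero.

```r
# Hypothetical helper (not part of diversitree): speciation rate for one
# combination of binary trait states under a main-effects-only model.
speciation_rate <- function(states, lambda0, effects) {
  # states:  0/1 vector, one entry per trait
  # effects: main effect of each trait on speciation (same order as states)
  lambda0 + sum(effects * states)
}

lambda0 <- 0.1                 # intercept speciation rate (from the text)
effects <- c(A = 0.1, B = 0)   # A raises speciation; B is a dummy trait

# Enumerate the four state combinations of traits A and B.
combos <- expand.grid(A = 0:1, B = 0:1)
combos$lambda <- apply(combos[, c("A", "B")], 1, speciation_rate,
                       lambda0 = lambda0, effects = effects)
print(combos)
```

Under these assumptions the rate depends only on the state of A, which is why B should be flagged as significant only at the nominal type I error rate when A is included in the model.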
[Figure 4.1: four panels, a) 1 trait, b) 2 traits, c) 3 traits, d) 4 traits. x-axis: Number of species (50, 100, 200, 400); y-axis: Speciation rate main effect (−0.10 to 0.20).]
Figure 4.1: Uncertainty around multitrait MuSSE parameter estimates as a function of tree size and number of traits. The solid blue line and blue region represent the mean and 95% credibility interval (CI) over 100 trees for the estimated speciation rate main effect of trait A, which increases speciation rates (true value is 0.1, indicated by the grey dotted line). The solid green line and region represent the mean and 95% CI for the speciation rate main effect for trait B, which has no effect on speciation rates (true value of zero indicated by dotted grey line). Panel (a), with one trait, is equivalent to BiSSE.
[Figure 4.2: four panels, a) 1 trait, b) 2 traits, c) 3 traits, d) 4 traits. x-axis: Number of species (50, 100, 200, 400); y-axis: Proportion of trees significant (0.0 to 1.0).]
Figure 4.2: Power and error rates of multitrait MuSSE, as a function of tree size. The lines are the proportion of 100 simulated trees that have 95% credibility intervals of speciation main effects that do not include zero (indicating significant state-dependent speciation). The blue line represents trait A, which increases speciation rates when in state 1. The solid green line represents a trait B with no effect on speciation. The dashed green line indicates the same trait B, but when trait A is omitted from the analysis. The dotted orange line in panels (c) and (d) is the probability of finding any of the dummy traits (B, C, or, where present, D) significant in an analysis that omits trait A. The 5% expected type I error rate is indicated by the dotted grey line. Panel (a), with one trait, and the dashed green line in panel (b) are equivalent to BiSSE.
These results are simultaneously encouraging and sobering. When a trait that affects speciation is included in the model, it is easily detected, and this is robust to the number of additional traits included. However, if none of the included traits affects speciation, adding further traits risks false positives at an alarming rate. These false positive rates are perhaps not surprising: the trees used do not conform well to the expectations of a constant rate birth-death tree (there is strong phylogenetically structured variation in speciation rates), and the model is using the only parameters it has to explain this deviation. I expect that similar problems will affect other comparative analyses, such as detecting correlated trait evolution with the Mk/discrete models. The code for this analysis is available on the diversitree github site².
4.6.1 Social evolution and speciation in primates
Here I give a worked example, using the trait data compiled by Redding et al. (2010) to look at social evolution in primates. Previously, Magnuson-Ford and Otto (2012) found that both monogamy and solitary behaviour in primates reduced speciation rates, though this was only marginally significant for solitariness. However, if these characters are correlated, then it is possible that the decreased speciation rates could be truly associated with just one trait. That is, the effect of one character might bias the estimated effects of the other when these are treated independently. Alternatively, it could be that an elevated (or decreased) speciation rate occurs only with some combination of trait states (e.g., only social, polygamous taxa speciate more rapidly).
Here, I illustrate the method with R input in italics (preceded by “>”), while output is upright. The full version of this analysis is presented in Appendix C.3.
The phylogeny is stored in NEXUS format (Maddison et al., 1997), and loaded using the read.nexus function in ape as the object “tree”. For multi-trait MuSSE, the data must be stored in a data frame with species names as row labels. The two traits are “M” (TRUE for monogamous, FALSE otherwise) and “S” (TRUE for solitary, FALSE otherwise).
² http://github.com/richfitz/diversitree/tree/pub/simulations
> head(dat)
                                M     S
Allenopithecus_nigroviridis    NA FALSE
Allocebus_trichotis          TRUE  TRUE
Alouatta_belzebul              NA FALSE
Alouatta_caraya                NA FALSE
Alouatta_coibensis          FALSE FALSE
Alouatta_fusca                 NA FALSE
Note that some of the species lack state information (i.e., have NA values). These are accommodated using the method described above.
The first step is to make a likelihood function with make.musse.multitrait. The “depth” argument controls the number of terms to include from equations (4.3), (4.4), and (4.5); 0 includes only intercepts, 1 also includes main effects, 2 includes interactions between two parameters, and so on. If specified as a 3-element vector, the elements apply to the λ, µ, and q parameters; if a scalar is given, the same depth is used for all three parameter types. To make a model with intercepts only:
> lik.0 <- make.musse.multitrait(tree, dat, depth=0)
This likelihood function takes a vector of parameters as its first argument. To get the vector of names for the parameters, use the argnames function:
> argnames(lik.0)
[1] "lambda0" "mu0"     "qM01.0"  "qM10.0"  "qS01.0"  "qS10.0"
This shows the six parameters: the speciation rate (lambda0), extinction rate (mu0) and four transition rates (e.g., qM01.0 is the rate of transition of the breeding system from non-monogamous to monogamous, and this rate does not depend on the social state S).
To find the maximum likelihood (ML) point, a sensible starting point must be supplied (discussed in Appendix C.3); with such a point, p.0, we can find the ML parameters using the find.mle function:
> fit.0 <- find.mle(lik.0, p.0)
This returns an object (fit.0) that contains estimated parameters, likelihood values, and other information about the fit (see the help page ?find.mle for more information).
> round(coef(fit.0), 4)
lambda0     mu0  qM01.0  qM10.0  qS01.0  qS10.0
 0.1912  0.1110  0.0251  0.0259  0.0009  0.0163
> fit.0$lnLik
[1] -786.3427
By default “subplex” (Rowan, 1990) is used for the optimisation. However, different optimisation algorithms can be selected through the “method” argument to find.mle.
To include state-dependent diversification, we construct a likelihood function that includes “main effects” of the two traits on speciation and extinction. To allow this while retaining the independent model of character evolution, we change the depth argument:
> lik.1 <- make.musse.multitrait(tree, dat, depth=c(1, 1, 0))
> argnames(lik.1)
 [1] "lambda0" "lambdaM" "lambdaS" "mu0"     "muM"     "muS"
 [7] "qM01.0"  "qM10.0"  "qS01.0"  "qS10.0"
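The six and ten parameter names listed so far are consistent with a simple count of terms for each depth. The base R sketch below illustrates that count; `n_terms` and `count_params` are hypothetical helpers, not diversitree functions, and the treatment of the transition rates (two rates per trait, each allowed to depend on the other traits up to the stated depth) is my assumption about how the terms are organised.

```r
# Hypothetical helpers (not part of diversitree).
# Number of terms up to a given depth: intercept, main effects,
# two-way interactions, and so on.
n_terms <- function(n, depth) sum(choose(n, 0:depth))

# Free parameters for n binary traits with depth = c(lambda, mu, q).
count_params <- function(n, depth) {
  depth <- rep_len(depth, 3)   # a scalar depth applies to all three types
  n_lambda <- n_terms(n, depth[1])
  n_mu     <- n_terms(n, depth[2])
  # Assumption: each trait has two transition rates (0->1 and 1->0), each
  # of which may depend on the other n - 1 traits.
  n_q      <- 2 * n * n_terms(n - 1, depth[3])
  c(lambda = n_lambda, mu = n_mu, q = n_q,
    total = n_lambda + n_mu + n_q)
}

count_params(2, 0)            # intercepts only
count_params(2, c(1, 1, 0))   # main effects on speciation and extinction
count_params(2, c(2, 2, 0))   # adds the two-way interaction terms
```

For two traits the totals are 6, 10, and 12, the first two of which match the lengths of the argnames output shown above.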
Running an ML search from a suitable point p.1:
> fit.1 <- find.mle(lik.1, p.1)
These models can be compared with a likelihood ratio test using the anova function; the model with state-dependent speciation and extinction fits much better than the state-independent version (χ²₄ = 24.7, p < 0.001).
> anova(fit.1, noSDD=fit.0)
      Df   lnLik    AIC  ChiSq Pr(>|Chi|)
full  10 -773.97 1568.0
noSDD  6 -786.34 1584.7 24.739  5.677e-05
(The use of anova for general model comparison is a fairly widespread convention in R packages, and does not imply that an ANOVA was performed!)
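The comparison above can be reproduced by hand from the reported log-likelihoods and parameter counts. The following base R sketch uses the rounded values printed in the table, so its results agree with the anova output only to the precision of those inputs.

```r
# Values copied from the anova table above (rounded to two decimals).
lnL_full  <- -773.97   # fit.1: 10 parameters
lnL_noSDD <- -786.34   # fit.0:  6 parameters
df_diff   <- 10 - 6

# Likelihood ratio statistic and its chi-squared tail probability.
chisq <- 2 * (lnL_full - lnL_noSDD)
p     <- pchisq(chisq, df = df_diff, lower.tail = FALSE)

# AIC = -2 lnL + 2k for a model with k free parameters.
aic <- function(lnL, k) -2 * lnL + 2 * k

c(ChiSq = chisq, p = p,
  AIC_full = aic(lnL_full, 10), AIC_noSDD = aic(lnL_noSDD, 6))
```

The statistic comes out at 24.74 on four degrees of freedom, with a tail probability of roughly 5.7e-05 and AIC values close to 1568 and 1585.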
We can expand the model further to allow interactions between the two traits in speciation and extinction; is a combination of mating system and sociality associated with elevated speciation or extinction? Specifying depth=c(2, 2, 0) introduces the terms “lambda.MS” and “mu.MS” (see equations 4.3 and 4.4) to model non-additive effects of these traits on speciation and extinction, and again leaves character transitions to occur independently for the two traits.
> lik.2 <- make.musse.multitrait(tree, dat, depth=c(2, 2, 0))
> fit.2 <- find.mle(lik.2, p.2)
> anova(fit.2, noInteraction=fit.1)
              Df   lnLik    AIC   ChiSq Pr(>|Chi|)
full          12 -773.73 1571.5
noInteraction 10 -773.97 1568.0 0.49143     0.7821
This time the improvement is not significant, implying that there is no evidence for an interaction between these traits on speciation and extinction rates.
To test the significance of each trait (solitariness and monogamy) in a maximum likelihood framework, we could fit models where the main effect of each trait was set to zero and compare these against the model fit.1 using a likelihood ratio test. This approach is explored in Appendix C.3. Alternatively, we might run an MCMC analysis and examine the posterior distributions of the lambdaM and lambdaS values:
> samples <- mcmc(lik.1, p.1, nsteps=10000, w=0.5, prior=prior)
The prior distribution used here is exponential with respect to the underlying rates in the model (e.g., λi,j, not λAB: see equation (4.3) and Appendix C.3), but any prior function may be specified by the user (see the main diversitree tutorial). The “slice sampling” MCMC algorithm (Neal, 2003) is used by default and is fairly insensitive to tuning parameters. In particular, specifying a value for the width of the proposal step (w) that is too large or too small just increases the mean number of function evaluations per step, rather than affecting the rate of mixing of the chain.
The marginal distributions of both the monogamy and solitariness main effects on speciation rates are negative over the bulk of their distribution (figure 4.3). In contrast with treating the traits separately using BiSSE (figure 4.3a), the 95% credibility intervals for both traits do not include zero when they are analysed simultaneously (figure 4.3b). Therefore these results support the conclusions of Magnuson-Ford and Otto (2012) that both monogamy and solitariness are associated with decreased speciation rates in primates. Surprisingly, simultaneously accounting for both traits increased our confidence levels, suggesting that incorporating additional traits can reduce noise caused by shifts in diversification due to other traits.
More comprehensive examples are included in a tutorial document on the diversitree website, http://www.zoology.ubc.ca/prog/diversitree, as well as within the online help for the package.
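The check described above, whether a 95% credibility interval excludes zero, can be sketched in base R as follows. The helper `summarise_effects` is hypothetical, and the input is a synthetic stand-in for posterior samples (it is not output from the primate analysis); I assume only that the samples are available as a data frame with one column per parameter.

```r
# Hypothetical helper: equal-tailed credibility interval for each column of
# a data frame of posterior samples, flagging intervals that exclude zero.
summarise_effects <- function(samples, burnin = 500, level = 0.95) {
  # Drop the burn-in steps (safe even when burnin = 0).
  kept  <- samples[seq_len(nrow(samples)) > burnin, , drop = FALSE]
  alpha <- (1 - level) / 2
  ci <- t(sapply(kept, quantile, probs = c(alpha, 1 - alpha),
                 names = FALSE))
  data.frame(lower = ci[, 1], upper = ci[, 2],
             excludes_zero = ci[, 1] > 0 | ci[, 2] < 0,
             row.names = names(kept))
}

# Synthetic draws for illustration only (NOT results from the thesis).
set.seed(1)
fake <- data.frame(lambdaM = rnorm(10000, mean = -0.10, sd = 0.03),
                   lambdaS = rnorm(10000, mean =  0.00, sd = 0.03))
res <- summarise_effects(fake)
print(res)
```

With real output, the columns holding the step index and posterior probability would need to be dropped before calling the helper.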
4.7 Closing comments
The diversitree package implements several methods for jointly modelling character evolution and speciation. The package is open source and designed to be fairly straightforward to extend. In particular, any model that can be expressed by moving down a tree (post-order traversal, or “pruning”; Felsenstein, 1981) can be implemented using only a modest number of lines of R code. To facilitate the development of related methods, there is a “Writing diversitree extensions” manual available from the diversitree website. Stable versions of diversitree are available on CRAN (the Comprehensive R Archive Network) and from the website above. Development can be followed or joined on github (http://github.com/richfitz/diversitree). I hope that the package will enable users to test a wide variety of macroevolutionary questions.
However, I will close with a caution. All included methods are correlative only (Maddison et al., 2007; Losos, 2011); they can merely show a statistical association between traits and speciation or extinction rates and cannot prove that the trait affects speciation or extinction. Any unconsidered trait that is correlated with the target trait could be causal (Maddison et al., 2007, and figure 4.2). Alternatively, the associations may be spurious, perhaps driven by departures from the assumed model of cladogenesis or character evolution. There is currently no way of testing absolute goodness-of-fit with any method, and all conclusions should be recognised as being conditional on a particular model, and on that model being appropriate.
[Figure 4.3: two panels, a) Independent (BiSSE), b) Simultaneous (MuSSE multitrait). Legend: lambda (+monogamy), lambda (+solitary). x-axis: Trait effect on speciation (−0.3 to 0.1); y-axis: Posterior probability density.]
Figure 4.3: Posterior probability distributions for the effects of monogamy (dark grey) and solitariness (light grey) on speciation rate. Shaded areas and bars indicate the 95% credibility intervals for each parameter. In the top panel, BiSSE was run on each character independently. In the bottom panel, musse.multitrait fit the effects of both traits simultaneously. In both cases, the MCMC chain was run for 10,000 steps, and the first 500 points were dropped as burn-in.
Chapter 5
Survival of the Littlest? Species Selection, Cope’s Rule, and Mammal Body Size
5.1 Introduction
Cope’s rule states that body size tends to increase over evolutionary time (Cope, 1887, 1896)³. While palaeontological evidence appears to support this pattern in mammals (Alroy, 1998; Valkenburgh et al., 2004; Raia et al., 2012), most species are relatively small; the modal body size of mammals is approximately 100 g and the distribution is skewed to the right (figure 5.1; Jones et al., 2009). This relative overabundance of small species in mammals (and other groups) has invited explanations including an optimal body size (Brown and Maurer, 1989), the effect of the perceived size of the environment (Morse et al., 1985), and decreased rates of diversification of larger species (Stanley, 1973; Brown, 1995). This last hypothesis is the focus of this chapter: that species selection for small body size opposes a general tendency for phyletic size increase, a combination of processes that we will refer to as the “Cope–Stanley hypothesis”.
As phylogenies contain information about the timing and pattern of both morphological and taxonomic diversification, they have been widely used to test for size-dependent speciation. However, analyses of individual mammal clades have generally found mixed results; while some have shown the expected negative relationship between body size and diversification rates, others recovered no pattern or even a positive relationship (Gardezi and da Silva, 1999; Gittleman and Purvis, 1998; Isaac et al., 2003; Paradis, 2005; FitzJohn, 2010). Different groups have been studied with different methods, making generalisation to all species difficult. Recently, Clauset and Erwin (2008) argued that the distribution of body sizes across all mammals is consistent with the Cope–Stanley hypothesis, and Etienne et al. (2012a) argued that such a relationship holds across multiple animal phyla.
Here, we test the Cope–Stanley hypothesis across all mammal species using an explicitly phylogenetic Bayesian approach. We modelled speciation rates as linear functions of log female body mass, which we assumed evolves according to a Brownian motion process, using QuaSSE (Quantitative State Speciation and Extinction; FitzJohn, 2010). The possibility that body size tends to increase (or decrease) within lineages was incorporated by allowing for directional mass change, in addition to the diffusion present in the Brownian motion model. To avoid over-parameterisation, we held extinction rates constant across body sizes. This combination of processes allows us to simultaneously test for an among-lineage disadvantage of large size with respect to diversification, and for a within-lineage tendency for body size to increase (Cope’s rule). We selected 10 major clades that include 98% of known extant mammal species (figure 5.2, table 5.1) from a recent supertree of mammals (Bininda-Emonds et al., 2007; Fritz et al., 2009) or from clade-specific phylogenies (Hernández Fernández and Vrba, 2005; Vos and Mooers, 2006; Steeman et al., 2009), and obtained mammal body mass estimates from the PanTHERIA database (Jones et al., 2009). Markov chain Monte Carlo (MCMC) was used to sample from the posterior distribution of the model (see section 5.3).
[Figure 5.1: distribution of mammal body masses, annotated with arrows labelled “Cope's rule” and “Species selection”. x-axis: Body mass (g; log scale, 10^0 to 10^8).]
Figure 5.1: The distribution of mammal body masses. The Cope–Stanley hypothesis states that species mass tends to increase over time (Cope’s rule), balanced by species selection against large species. Data from PanTHERIA (Jones et al., 2009).
³ Cope never coined the phrase “Cope’s rule”, but these two papers are commonly cited. Some argue that the rule is implicit in Cope’s writing (e.g., Stanley, 1973; Benton, 2002), though others dispute even this (Polly, 1998). The phrase was apparently coined by Rensch (1948), and has been widely used since.
5.2 Results and discussion
Across the clades, we found no consistent relationship between speciation rates and body size. In four clades (Afrotheria, Cetacea, Eulipotyphla, and Rodentia) we found that speciation rates were negatively correlated with body size, consistent with species selection against large body size. However, four clades showed the opposite pattern
Table 5.1: The 10 mammal clades used to test the Cope–Stanley hypothesis. Species were excluded either because of a lack of data on mass or because they were not present in the phylogeny. However, they are accounted for in the analysis by assuming that such species were randomly omitted.
Name          Taxonomic level  Species  Included  Excluded
Afrotheria    Superorder            79        63        16
Carnivora     Order                286       250        36
Cetacea       Order                 89        87         2
Chiroptera    Order               1116       924       192
Eulipotyphla  Order                452       206       246
Lagomorpha    Order                 92        74        18
Marsupials    Infraclass           331       270        61
Primates      Order                233       233         0
Rodentia      Order               2277      1442       835
Ruminantia    Suborder             197       196         1
Figure 5.2: Mammal supertree showing the 10 focal clades. Each terminal branch represents a family. The grey branches indicate groups that were not included in the analysis. Clockwise from the right, included clades are Marsupials, Afrotheria, Eulipotyphla, Chiroptera, Carnivora, Ruminantia, Cetacea, Primates, Lagomorpha, Rodentia.
(Chiroptera, Lagomorpha, Marsupials, and Primates), and two clades (Carnivora and Ruminantia) showed no significant pattern (figure 5.3). Consistent with Cope’s rule, there was a general tendency for body size to increase within lineages (7/10 clades had a positive mean trend parameter). Nevertheless, this positive trend was only significant in Cetacea, and Lagomorpha had a significant negative trend (figure 5.3). These results argue against the generality of the Cope–Stanley hypothesis and suggest that body size is not a consistent decelerator of speciation rates. Indeed, across clades, being larger was just as likely to accelerate diversification as decelerate it.
That a model can be fit to data does not prove that the model is correct. There is considerable variation among clades in both the slope and mean of the speciation rate/body size relationship, which argues against a single common model (figure 5.3). However, similar rate heterogeneity may exist within each clade. In particular, it may be that evolutionary transitions in some unknown trait along certain branches of the tree of life have altered diversification rates and led to a spurious correlation between body size and diversification. For example, the dolphin family (Delphinidae) is a relatively recent but rapid radiation containing 40% of cetacean species (34 of 84 species), including 70% of cetacean species shorter than 4 m in length. Consequently, we might falsely attribute variation in diversification rate within the Cetacea to body size when these differences are in fact due to some other shared feature of the Delphinidae.
To determine if the observed associations between body size and diversification have been driven by major shifts in diversification caused by other factors, we divided each clade into regions that appeared to have different diversification rates using MEDUSA (Alfaro et al., 2009); we then allowed speciation rate/body mass relationships to vary among these partitions.
When analysed in this way, only two partitions showed the expected negative relationship between body size and speciation rate (paraphyletic basal groups in Afrotheria and Cetacea; figure 5.4), while three clades showed a significant positive relationship (Dasyuridae and Macropodidae within Marsupials, and a clade that included Nataloidea and Noctilionoidea within Chiroptera). The evidence
[Figure 5.3: three panels, a), b), c). x-axis: Body mass (g; log scale, 10^0 to 10^8); y-axis: Speciation rate (0.00 to 0.30).]
Figure 5.3: The inferred relationship between body size and speciation rate for 10 mammal clades. The thick lines represent the mean relationship, and the lighter envelopes the 95% credibility interval around these, estimated using MCMC. Arrows indicate the mean direction of the trend in body size evolution when significantly different from zero (right arrows indicate an increase, left arrows indicate a decrease). Points indicate the median body size within each clade and the mean estimated speciation rate at that body size. The three panels separate cases where the slope was a) significantly negative (Afrotheria, Cetacea, Eulipotyphla, and Rodentia), b) not significantly different from zero (Carnivora and Ruminantia), and c) significantly positive (Chiroptera, Lagomorpha, Marsupials, and Primates).
for directional change toward larger body size remained significant in Cetacea, and there was evidence for a negative body size trend within Marsupials.
Our results suggest that there is no single overarching pattern of size-dependent speciation in mammals. We found significant relationships between body size and speciation rates in most clades (figure 5.3); the problem is not that we cannot detect associations, but that the nature of the associations that we detected differs among groups. In contrast to detecting differences in speciation rates, detecting directional trends in quantitative characters is notoriously difficult (it is impossible under a simple Brownian motion model without fossil data), and the power of QuaSSE to detect such trends is low (FitzJohn, 2010). Thus, the lack of positive trends here does not preclude Cope’s rule operating, especially given the considerable palaeontological support (e.g., Alroy, 1998; Valkenburgh et al., 2004; Raia et al., 2012).
Our results support the idea that clade-specific differences in speciation rate are more important than body-size related differences in determining mammal diversity. It is possible that major shifts in body size have occurred along the branches separating the clades in our analyses, so that analysing the clades separately reduces the power to detect any association with diversification. That is, the clades that are speciating more rapidly also tend to contain smaller species (such as Delphinidae within Cetacea). This would be consistent with purely taxonomic analyses showing that the most diverse clades (e.g., Chiroptera and Rodentia) tend to be relatively smaller in size (though perhaps not the smallest; Dial and Marzluff, 1988; Gardezi and da Silva, 1999). However, we found no clear relationship between speciation rate at the median body size and median body size, either across our 10 major clades (figure 5.3), or across the partitions identified by MEDUSA (figures 5.5 and 5.6).
For example, the group with the highest speciation rate within the Chiroptera, Rhinolophus, has a median body size that is similar to the median over the whole order.
Most of the phylogenetic information used by QuaSSE comes from the recent branches, as about half the branches in a tree lead to single extant species. Significant relationships between speciation rate and body size can be driven by recent nodes that lead to species concentrated at one end of the size distribution. As such, even if a fundamentally different model of cladogenesis is closer to the true model (e.g., the niche-filling model of Rabosky, 2009a), our approach would interpret size-specific differences in turnover as a speciation-body size relationship. The lack of a consistent pattern here indicates that a single general rule relating diversification and body size in mammals is unlikely to hold.
Explaining the great disparity in clade diversity remains a major thrust of evolutionary investigation. Therian mammals outnumber Monotremes by 1000 to 1, and over 40% of mammals are rodents; it is natural to suspect that traits within these clades account for some of these differences. Here, we have tested the long-standing hypothesis that variation in speciation rate driven by body size has led to the excess abundance of small species relative to large species. We found little support for this hypothesis, but instead infer a more idiosyncratic relationship between body size and diversification, with some clades exhibiting a positive and some a negative relationship.
[Figure 5.4: one bar per MEDUSA-derived partition. y-axis: Slope of speciation rate/body mass (−0.10 to 0.10).]
Figure 5.4: Slopes of speciation rate/body mass relationships within MEDUSA-derived partitions (see table 5.2 and Appendix D for definitions). Each bar represents the 95% credibility interval for the slope of the speciation rate/body size relationship within a partition, and arrows above the bars indicate the direction of the trend parameter when significantly different from zero. For contrast, the single-slope credibility interval is displayed behind the bars. Dark shading represents cases where the slope was significantly different from zero; that most bars have light shading and are as likely to be negative as positive indicates little support for the Cope–Stanley hypothesis. For Chiroptera and Rodentia, only partitions that were recovered in at least 10% of trees are displayed.
[Figure 5.5: scatter plot with labelled points for Some Canidae (Carnivora), Rhinolophus (Chiroptera), Cercopithecidae (Primates), Microgale (Afrotheria), and Marmotini (Rodentia). x-axis: Median partition body mass (g; log scale, 10^1 to 10^6); y-axis: Mean speciation rate (0.0 to 0.6).]
Figure 5.5: Relationship between speciation rate and body size within mammals, across partitions with different diversification rates identified by MEDUSA. Each point is the estimated speciation rate at the median body size, integrated over the posterior distribution of the model (see figure 5.3). The dashed line indicates a linear regression through these points, excluding the five partitions with the highest mean speciation rate (r² = 0.006, p = 0.7); the regression was also not significant with these points included. For Chiroptera and Rodentia, only partitions that were recovered in at least 10% of trees are displayed.
5.3 Methods
To investigate the relationship between body size and speciation rate we used the QuaSSE (Quantitative State Speciation and Extinction) method (FitzJohn, 2010) as implemented in the R package diversitree (Chapter 4) to fit size-dependent birth-death models of speciation and extinction. Specifically, we modelled speciation rate as a linear function of body size, disallowing negative speciation rates (these were truncated to zero). A negative slope of the speciation rate–body size relationship captures the hypothesised long-term disadvantage of large body size on speciation rates, while requiring only two parameters. We modelled change in body size over time using Brownian motion with both a diffusion and a trend term, where the latter captures directional changes within a lineage (e.g., due to selection or mutational pressure). Thus, this method allows for the
possibility of detecting conflicts between individual selection (e.g., favouring a trend toward larger body size) and species selection (e.g., favouring higher diversification among small mammals).
For body size estimates, we used the PanTHERIA database (Jones et al., 2009), expressed as log female mass in grams (for brevity, we will refer to this simply as “mass”). Direct estimates were available for 3,542 species, with another 392 determined by linear regression from body length estimates (see Jones et al., 2009, for details). These were supplemented by estimates of cetacean length (unpubl. dat., Nick Pyensen). We converted cetacean log-lengths to log-weight estimates by predicting from linear regression among cetacean species only (figure 5.7).
We focused on 10 major clades of mammals (figure 5.2, table 5.1). This omitted 111 species (2% of all species) in 9 additional clades that we felt were too small to be statistically informative, the largest of which were the superorder Xenarthra (31 species) and the order Scandentia (20 species). For most clades, the tree was pruned from a recent mammal supertree (Bininda-Emonds et al., 2007; Fritz et al., 2009). For three clades we used trees constructed specifically for that clade: Cetacea (Steeman et al., 2009), Primates (Vos and Mooers, 2006), and Ruminantia (Hernández Fernández and Vrba, 2005). The mammal, Primate, and Ruminantia supertrees contained polytomies; before use with QuaSSE these were randomly resolved, and replaced with branch lengths simulated under a birth-death model (see Kuhn et al., 2011, for details). This produced a family of plausible trees, from which we sampled 100 trees to account for uncertainty in polytomy resolution (note that this includes only a fraction of possible phylogenetic uncertainty).
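Returning to the speciation model described at the start of this section, the truncated linear relationship between speciation rate and mass can be written as a one-line function. This is a minimal sketch under stated assumptions, not the QuaSSE implementation: the helper name `speciation_linear`, the intercept and slope values, and the choice of natural logarithms are all illustrative.

```r
# Hypothetical helper: speciation rate as a linear function of log body
# mass, truncated at zero so that rates are never negative.
speciation_linear <- function(log_mass, intercept, slope) {
  pmax(0, intercept + slope * log_mass)
}

# Illustrative masses of 10 g, 1 kg, 100 kg, and 10 tonnes (natural log of
# grams; the base of the logarithm is an assumption here).
log_mass <- log(c(10, 1e3, 1e5, 1e7))

# A negative slope gives lower speciation rates for larger species; the
# largest mass falls where the line would be negative, so the rate is 0.
rates <- speciation_linear(log_mass, intercept = 0.3, slope = -0.02)
print(rates)
```

The truncation matters only at the extremes of the mass range, but without it a steep negative slope would imply impossible negative rates for the largest species.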
The polytomy resolution was performed without reference to body size, which sometimes created unrealistically short branches separating sister species that were very different in size, causing numerical instability in the integrations and preventing likelihood calculation. To avoid this, we dropped one species of any sister species pair that were “impossibly disparate” given their estimated divergence date. We arbitrarily defined this
as a log probability of less than −10 (approximately 0.00004) under a Brownian motion model of character evolution with the maximum likelihood rate estimate (estimated ignoring speciation and extinction). In practice, this removed 10–30 of the 3,934 included species across the 10 clades, the exact taxonomic composition of which varied among the trees. For Cetacea, we sampled 100 trees from a distribution of time-calibrated trees obtained from a Bayesian tree search (Steeman et al., 2009), which more fully accounts for phylogenetic uncertainty in all branches.
We used two different approaches to analyse the data: “single-slope” fits (one speciation rate/body size relationship per clade), and “multiple-slope” fits where different regions of each clade were allowed to have different relationships. For the single-slope analyses, we fit QuaSSE models to 10 major clades of mammals (table 5.1). Each model has five parameters: the slope and intercept of the speciation rate–body mass relationship, extinction rate, diffusion parameter, and trend parameter. Not all species were represented on the phylogeny, and some species lacked body mass data (table 5.1). We corrected for these missing data in the likelihood analysis by assuming that species missing from a clade were randomly excluded with respect to both phylogeny and mass (FitzJohn, 2010). This approach treats all the clades as fully independent data points, and no information was shared among them during the inference process.
To account for major shifts in diversification across the tree of mammals, we fit “multiple-slope” models where the speciation rate/body mass relationship was allowed to vary among partitions in a clade. We first used MEDUSA (Alfaro et al., 2009) to identify regions within the 10 major clades exhibiting significant shifts in diversification rates.
To avoid over-fitting, and because extinction rate estimates are sensitive to the exact distribution of branch lengths, we constrained the MEDUSA algorithm to use a single extinction rate over the entire clade, fixed at the maximum likelihood estimate from a constant rate birth-death model for that clade. Each partition adds two parameters: a location and a new speciation rate. To compensate for the very large number of comparisons made when fitting MEDUSA models (considering potential shifts
at every branch in each phylogeny), we used Bonferroni corrections. This correction increased the required log-likelihood improvement from 3.0 (assuming a χ² distribution with two degrees of freedom) to between 7.1 (Afrotheria, 62 nodes) and 10.0 (Rodentia, 1,425 nodes).

The smallest clades were rarely (Afrotheria) or never (Lagomorpha) partitioned (table 5.2). Most other clades were usually split into two or more partitions. The largest orders, Chiroptera and Rodentia, were split into many partitions that (in contrast to other orders) varied widely in presence among individual trees. This appears to reflect taxonomic uncertainty in the backbones of these clades. Many of the regions that MEDUSA fit as having diversification rates that differed from the rest of the tree were fairly recent, with most species included in a basal paraphyletic partition.

We used Markov chain Monte Carlo (MCMC) to sample from the posterior distribution for our models, using slice sampling (Neal, 2003) for the parameter updates. For the single-slope analysis, this model has five parameters: slope and intercept of the speciation rate function, extinction rate, Brownian motion trait diffusion coefficient, and a Brownian motion drift coefficient describing the rate of directional trait change (“trend”). For the multiple-slope analysis with n partitions, there are n slope and intercept parameters, in addition to a common extinction rate, diffusion coefficient, and trend coefficient. We had no strong prior beliefs about most parameters, so we used unbounded uniform priors for all model parameters. However, we used a uniform prior on the root state that spanned the range of known mammal sizes: 1.3 g (Batodonoides vanhouteni: Bloch et al. 1998) to 190 tonnes (Balaenoptera musculus). All chains were started from the maximum likelihood point, estimated using the subplex algorithm (Rowan, 1990).
We ran the chains for 3,000 steps (single-slope analysis) or 6,000 steps (multiple-slope analysis), discarding the first 500 steps as burn-in. While this is a relatively small number of steps, the estimated effective sample size was over 1,000 for most parameters (Plummer et al., 2006), and the total computational time for the analyses here exceeded 50 CPU years. We then pooled these sampled parameters over the 100 trees to produce posterior distributions representing 250,000 or 550,000 MCMC steps that include both parameter and polytomy uncertainty (phylogenetic uncertainty in the case of the Cetacea).
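The Bonferroni-corrected thresholds quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a 5% significance level and that each candidate shift location counts as one comparison, and the function name `loglik_threshold` is hypothetical rather than part of MEDUSA. Under those assumptions it reproduces the uncorrected value of 3.0 and the value of 7.1 quoted for Afrotheria.

```python
import math

def loglik_threshold(n_tests, alpha=0.05):
    """Log-likelihood improvement needed for significance at level
    alpha / n_tests under a chi-squared test with two degrees of freedom.

    With two degrees of freedom the chi-squared survival function is
    exp(-x / 2), so the critical value of 2 * delta_lnL is
    -2 * log(alpha / n_tests); the required delta_lnL is half of that.
    """
    return -math.log(alpha / n_tests)

# Uncorrected test (a single comparison).
print(round(loglik_threshold(1), 1))    # 3.0
# Bonferroni correction over 62 candidate shift locations.
print(round(loglik_threshold(62), 1))   # 7.1
```

Because the threshold grows only with the logarithm of the number of comparisons, even very large clades require a fairly modest additional improvement in log-likelihood.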
Table 5.2: Properties of partitions recovered by MEDUSA, indicating the number of trees for which that partition was significant, and the median log-likelihood improvement (∆ likelihood). The exact species composition varied among trees, and the name is indicative of composition only; see Appendix D.

Clade/partition name      Number of trees    ∆ likelihood (range)
Afrotheria
  Not partitioned         97
  Microgale               3                  7.9 (7.4, 9.3)
Carnivora
  Some Canidae            100                16.3 (14.1, 19.7)
Cetacea
  Not partitioned         64
  Delphinidae             36                 8.7 (7.5, 15.2)
Chiroptera
  Some Chiroptera         99                 30.1 (25.1, 35.1)
  Myotis                  3                  13.0 (10.0, 31.3)
  Nataloidea              41                 13.6 (10.0, 15.0)
  Pteropodinae            13                 10.5 (10.0, 18.3)
  Rhinolophus             100                48.3 (41.4, 60.5)
  Vampyressa              10                 17.5 (12.7, 18.9)
  Other                   2
Eulipotyphla
  Crocidura               100                42.6 (36.6, 48.4)
  Soricidae               21                 9.6 (9.0, 11.4)
Lagomorpha
  Not partitioned         100
Marsupials
  Dasyuridae              100                10.3 (9.5, 11.5)
  Macropodidae            100                11.9 (10.6, 13.5)
Primates
  Cercopithecidae         100                11.6 (10.8, 12.4)
Rodentia
  Arvicolinae             2                  13.2 (10.8, 16.7)
  Ctenomyidae             26                 11.9 (10.8, 14.3)
  Marmotini               100                22.0 (20.2, 24.3)
  Murinae                 5                  15.1 (10.8, 23.5)
  Peromyscus              17                 12.4 (10.9, 15.5)
  Rattus                  11                 15.6 (12.2, 21.0)
  Sciurinae               99                 16.4 (11.1, 20.4)
  Sigmodontinae           100                46.1 (39.0, 56.3)
  Other                   9
Ruminantia
  Not partitioned         4
  Most Ruminantia         96                 12.9 (10.7, 14.3)
[Figure 5.6: nine panels (Afrotheria, Carnivora, Cetacea, Chiroptera, Eulipotyphla, Marsupials, Primates, Rodentia, Ruminantia), each plotting speciation rate (0.0 to 0.6) against body mass (g; log scale, 10^0 to 10^8) for the partitions within that clade.]

Figure 5.6: Speciation rate inferences for partitions. In each major group, QuaSSE was performed separately for the partitions that exhibited significantly different diversification rates, according to MEDUSA. No such partitions were inferred for Lagomorpha, so only the remaining 9 clades are shown. Each envelope represents the 95% credibility interval for the speciation rate/body size relationship within a partition. Thick lines indicate the mean relationship. For Chiroptera and Rodentia, only partitions that were recovered in at least 10% of trees are displayed.
[Figure 5.7: scatter plot of weight (g; log scale) against length (m; log scale) for Cetacea, with points coloured by family: Balaenidae, Balaenopteridae, Physeteridae, Ziphiidae, Neobalaenidae, Monodontidae, Delphinidae, Iniidae, Phocoenidae, Platanistidae.]

Figure 5.7: Regression of body mass against length for Cetacea (r² = 0.95). Different colours indicate different families (sorted by decreasing mean mass in the legend). Because we perform a linear transformation between log-length and log-mass, the rank ordering of size estimates is unchanged. This transform rescales the parameter estimates for more direct comparison with the other clades.
Chapter 6

Conclusion
I believe that one of the reasons that the concept of species selection has gained favour among biologists is the evidence for trait-based differences in diversification that has accumulated over the last 25 years (Coyne and Orr, 2004; Jablonski, 2008b; Rabosky and McCune, 2009). In my thesis, I have developed and applied a number of methods for detecting species selection using phylogenies. The methods presented here improve on earlier approaches by allowing exploration of both the magnitude of species selection and the within-lineage processes that might oppose it.

In chapter 2, I developed a method for detecting the association between a binary trait and rate of speciation or extinction when a phylogeny is incompletely resolved. This built upon the BiSSE (Binary State Speciation and Extinction) method of Maddison et al. (2007). I then used this method to investigate the association between speciation rate and sexual dimorphism in shorebirds, finding that strong sexual dimorphism was associated with elevated rates of speciation, but may quickly revert to monomorphism.

In chapter 3, I developed a method for inferring the association between a continuous trait and speciation or extinction rates, using a diffusion model for character evolution (QuaSSE: Quantitative State Speciation and Extinction). State-dependent diversification under such a model has been long discussed (e.g., Slatkin, 1981), but fitting this model to data has been only approximate prior to the development of this method. I used QuaSSE to investigate the association between body size and speciation and extinction in Primates, finding mixed evidence for any size-specific speciation or extinction.

In chapter 4, I described an R package, “diversitree”, that I developed to facilitate phylogenetic analyses. This package contains and documents the methods above, and
also methods that ignore trait evolution when studying diversification (as in Nee et al., 1994b) or model character evolution independently of speciation and extinction (e.g., Pagel, 1994). While this is a small chapter, developing diversitree has been a large component of my thesis. The package has been widely adopted by the comparative phylogenetic community; it has been used in 34 published studies since May 2009 and has over 100 users (based on email contact). In this chapter I also introduced “MuSSE”, a multi-state extension of BiSSE, and discussed how this can be applied to combinations of traits to simultaneously account for the effects of multiple traits on speciation or extinction using an approach akin to a general linear model regression. Importantly, this allows for quantifying the association between one trait and diversification while accounting for other possible traits. I reanalysed an investigation of social and mating structure in Primates originally discussed by Magnuson-Ford and Otto (2012). Surprisingly, when taking into account sociality, the effect of mating system (monogamy vs. polygamy) on speciation became more apparent.

In chapter 5, I carried out an analysis of body size evolution in mammals, aiming to address the long-standing question of why there are so many small mammal species (and so few large) by testing the hypothesis that small mammals tend to diversify more rapidly than large species. I found that there was no consistent relationship between body size and speciation rate, with some clades having a negative relationship and others having a positive relationship. I found limited evidence for Cope’s rule (the tendency for species to get larger over time), which was only well supported in the Cetacea. Furthermore, I found that within clades there was often large variation in rates of speciation that was not directly related to body size (figure 5.6).
Here, I conclude with some general comments about the approaches used in my thesis and their limitations. I will consider some lines of evidence challenging the sort of methods described in my thesis, focusing on inconsistencies between inference from phylogenies and inference from the fossil record. I consider modifications to phylogenetic methods that have been suggested to address their shortcomings. Finally, I finish with some thoughts on the future directions of the field.
6.1 Some issues with these methods
Model-based inference is only accurate to the extent that the model reasonably captures the key features of the (unknown) true process. Some simplification of reality is needed, as models cannot capture every aspect of a process while remaining tractable. The dominant paradigm in comparative phylogenetics is “null model testing”; for analyses of speciation and extinction the constant rate birth-death process is most commonly used as a null model. We prefer the more complicated model over the null model (perhaps adding time- or state-dependence) if the complex model fits significantly better. However, this approach gives only a relative measure of support; even if the more complex model fits better, we have no strong evidence that it is actually a good fit to the data, rather than merely “less bad” than the simple model.

How badly do trait-dependent diversification models fit biological data? The models that I have considered in my thesis combine the birth-death model of speciation and extinction with simple Markov models of character evolution. There are a number of highly improbable assumptions and consequences of these models, including (i) exponential growth (or decline) in the number of species, (ii) no interaction among species during diversification or character evolution, (iii) independence of character change and speciation events from the time of the last such event, and (iv) homogeneity of rates over all species in the tree, over all time. While both the character evolution and diversification models have weaknesses, most attention has focused on the shortcomings of the birth-death model.
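Consequence (i) above, exponential growth in the expected number of species, can be illustrated with a small simulation. This is a minimal sketch using the standard Gillespie algorithm, not code from diversitree; the function name `simulate_richness` and the rates are illustrative assumptions. It compares the average simulated richness with the expectation exp((λ − μ)t) for a constant-rate birth-death process started from one lineage.

```python
import math
import random

def simulate_richness(lam, mu, t_max, rng):
    """Number of species alive at t_max, starting from one lineage,
    under a constant-rate birth-death process (Gillespie algorithm)."""
    n, t = 1, 0.0
    while n > 0:
        # Waiting time to the next event is exponential with rate n * (lam + mu).
        t += rng.expovariate(n * (lam + mu))
        if t > t_max:
            break
        # The event is a speciation with probability lam / (lam + mu),
        # otherwise an extinction.
        if rng.random() < lam / (lam + mu):
            n += 1
        else:
            n -= 1
    return n

rng = random.Random(1)
lam, mu, t_max = 0.3, 0.1, 10.0   # illustrative rates, not estimates from data
reps = 4000
mean_n = sum(simulate_richness(lam, mu, t_max, rng) for _ in range(reps)) / reps
expected = math.exp((lam - mu) * t_max)
print(mean_n, expected)
```

The average includes replicates that went extinct, which is why it tracks the unconditional expectation rather than the richness of surviving clades.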
6.1.1 The birth-death model, extinction, & the fossil record

The most commonly cited line of evidence that the birth-death model is a poor fit to biological phylogenies is that the maximum-likelihood estimate of extinction rate is often zero. With extinction, there should be an upturn in the log number of lineages in a reconstructed phylogeny over time (figure 1.1), which would lead to positive values of the “gamma statistic” of Pybus and Harvey (2000). However, in many molecular phylogenies the gamma statistic is negative, indicating a decreasing slope in the log number of lineages over time, leading to extinction rate estimates of zero (reviews in McPeek, 2008; Phillimore and Price, 2008). This absence of extinction conflicts with even the most cursory comparison with the fossil record and with the motivation for the initial development of birth-death methods over the pure birth methods that preceded them (Nee et al., 1994a; Kubo and Iwasa, 1995).

This problem has motivated extensive work into alternatives to the constant rate birth-death process for modelling diversification. These alternatives fall into two broad (but overlapping) classes. One class posits that rates of speciation or diversification have decreased over time. Consequently, methods for fitting time-varying speciation and extinction rates have proliferated recently (e.g., Rabosky, 2006; Rabosky and Lovette, 2008; Morlon et al., 2010; Paradis, 2010; Rabosky and Glor, 2010; Stadler, 2011). However, there are a number of undesirable consequences of this interpretation. For trees where the lineage through time plot appears to be saturating (negative gamma), not only will extinction rates still be estimated to be zero at the present, but contemporary speciation rates will also be estimated to be very low (e.g., Paradis, 2010; Rabosky and Glor, 2010).
This is difficult to reconcile with the view that both speciation and extinction are ongoing and can be rapid (e.g., Seehausen, 2006; Taylor et al., 2006; Hendry et al., 2007; Grant and Grant, 2009). Also, given the general prevalence of apparent slowdowns, this interpretation implies that diversification is slowing down simultaneously in almost every clade (Rabosky, 2009b). Perhaps we are at a peculiar moment in history where diversification is simultaneously slowing down in all groups, but this seems unlikely (though there
is palaeontological evidence for declining origination rates over time over broad taxonomic scales: e.g., Gilinsky and Bambach, 1987; Sepkoski, 1998).

The second major class of explanations postulates that species occupy niches, which may become filled over time (e.g., McPeek, 2008; Phillimore and Price, 2008; Rabosky, 2009a,b; Morlon et al., 2010, 2011; Etienne et al., 2012b). After an initial period of diversification, speciation and extinction rates balance to maintain the size of a clade at a characteristic carrying capacity of diversity. In its simplest form, this would imply logistic growth, rather than the exponential growth of the constant rate birth-death process. While approaches for detecting diversity-dependent diversification from phylogenies are new, the idea itself has a long history in ecological and palaeontological thought (e.g., MacArthur, 1969; Raup et al., 1973; Levinton, 1979; Walker and Valentine, 1984).

Proponents claim that modelling this process brings inferences from phylogeny closer to the fossil record (Rabosky, 2009a; Etienne et al., 2012b). Based on fossil evidence, clades have often been thought to reach a stable level of diversity early on in their evolutionary history, followed by periods with no consistent trend in diversity over time (e.g., Gould et al., 1977; Alroy, 2008; Alroy et al., 2008). For example, the fossil record supports equal origination and extinction rates at the family level in marine organisms (Gilinsky, 1994). Similarly, while diversity in tropical plants has generally increased over the last 65 million years, this increase has not been exponential, and estimated speciation rates are similar to extinction rates (Jaramillo et al., 2006). Consistent with the predictions of diversity-dependent diversification, speciation rates have been inferred to increase following mass extinctions (Sepkoski, 1997) and to decrease when diversity is high (Alroy, 1996).
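The contrast between exponential and logistic growth drawn above can be made concrete with a deterministic sketch. The code assumes one simple form of diversity dependence, a speciation rate that declines linearly with diversity, λ(N) = λ0(1 − N/K), with constant extinction; this is only one of several possible formulations, and the function name and parameter values are illustrative assumptions.

```python
import math

def diversity_trajectory(lam0, mu, k, t_max, dt=0.01):
    """Euler integration of dN/dt = (lam(N) - mu) * N, where the
    speciation rate declines linearly with diversity:
    lam(N) = lam0 * (1 - N / k)."""
    n = 1.0
    for _ in range(int(round(t_max / dt))):
        lam = max(lam0 * (1.0 - n / k), 0.0)
        n += (lam - mu) * n * dt
    return n

lam0, mu, k = 0.5, 0.1, 100.0          # illustrative values only
equilibrium = k * (1.0 - mu / lam0)    # diversity at which lam(N) equals mu
n_end = diversity_trajectory(lam0, mu, k, t_max=100.0)

# After 20 time units the diversity-dependent clade is close to its
# equilibrium, while constant rates would give exponential growth.
n_logistic_20 = diversity_trajectory(lam0, mu, k, t_max=20.0)
n_exponential_20 = math.exp((lam0 - mu) * 20.0)
print(equilibrium, n_end, n_logistic_20, n_exponential_20)
```

In this formulation the clade settles where speciation and extinction balance, below the nominal capacity K whenever extinction is nonzero.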
However, not all groups have fossil records that show strong signs of diversity dependence. While speciation rates may decline with diversity, this effect has been shown to be weaker in terrestrial systems than marine (Benton, 1997; Eble, 1999). In fact, insect groups show signs of a steady increase in diversity over time (Labandeira and Sepkoski, 1993; Mayhew, 2007). At the broadest scales, some have argued that fossil data are consistent with a steady increase in diversity over hundreds of millions of years (Benton, 1995; Jablonski, 2008a; Kalmar and Currie, 2010). Furthermore, whether or not a saturating pattern of diversity is seen depends on the taxonomic level used; orders have stronger evidence for a carrying capacity than lower taxonomic units (Benton, 1997). Any density-dependent effects on speciation or extinction should most strongly affect species that are most closely related, most similar, and with overlapping ranges. Contrary to this, Alroy (1996) found that while overall diversity of North American mammals has not strongly increased over the last 25 Myr, speciation and extinction rates within genera are not tightly coupled.

It is also difficult to see that the major interactions that constrain diversity should necessarily occur within a clade. Particularly for clades that are not very diverse, why should the presence of a species in one continent affect the propensity of a species in another continent to speciate (Wiens, 2011)? Alternatively, speciation in one clade may be affected by distantly related clades, such as cospeciation in parasites of insects (Forbes et al., 2009), mites of gophers (Huelsenbeck et al., 2000), and wasp pollinators of figs (Weiblen and Bush, 2002). At a broader scale, the hypothesised codiversification of angiosperms and insects (e.g., Pellmyr, 1992; Farrell, 1998; Grimaldi, 1999) suggests that a single-clade view is rather too narrow.

In a phylogenetic context, extensive periods of turnover with constant diversity (i.e., balanced speciation and extinction rates) should result in a proliferation of branches near the present, as in a coalescent tree, causing an upturn in a lineage through time plot, the lack of which was part of the original motivation for creating these sorts of models (Etienne et al., 2012a)!
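The upturn near the present can be illustrated from the standard closed-form results for the constant-rate birth-death process. The sketch below is a hedged illustration rather than the thesis code: the function names are hypothetical, and because the closed form used here requires unequal rates it takes λ > μ rather than the exactly balanced case discussed above. It computes the expected number of lineages with surviving descendants, and shows that the slope of the log lineage-through-time curve rises from about λ − μ deep in the tree to about λ at the present, whereas with μ = 0 the slope stays at λ throughout.

```python
import math

def survival_prob(lam, mu, tau):
    """Probability that a lineage leaves at least one descendant after a
    further time tau under constant-rate birth-death (lam > mu >= 0)."""
    r = lam - mu
    return r / (lam - mu * math.exp(-r * tau))

def expected_reconstructed(lam, mu, t, t_total):
    """Expected number of lineages at time t (measured from the root) that
    have descendants at the present (t_total), starting from one lineage
    and conditional on the clade surviving to the present."""
    r = lam - mu
    return (math.exp(r * t) * survival_prob(lam, mu, t_total - t)
            / survival_prob(lam, mu, t_total))

def log_slope(lam, mu, t, t_total, h=1e-4):
    """Slope of the log lineage-through-time curve at t (central difference)."""
    hi = math.log(expected_reconstructed(lam, mu, t + h, t_total))
    lo = math.log(expected_reconstructed(lam, mu, t - h, t_total))
    return (hi - lo) / (2 * h)

lam, mu, t_total = 0.3, 0.2, 30.0      # illustrative rates only
slope_early = log_slope(lam, mu, 1.0, t_total)
slope_late = log_slope(lam, mu, t_total - 0.01, t_total)
slope_pure_birth = log_slope(lam, 0.0, t_total - 0.01, t_total)
print(slope_early, slope_late, slope_pure_birth)
```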
If it turns out that the first class of explanation (time varying rates) is a reasonable approximation of the true process, then it is straightforward to adapt the preceding chapters of my thesis to allow for this. Speciation and extinction rates would vary both in time and according to the value of a trait. Indeed, this has been done already (Rabosky and Glor, 2010, Chapter 4, J. L. Cantalapiedra et al., in prep.). However, including the parameters required to model temporal variation in diversification may leave little power
to identify trait-dependent diversification.

In contrast, if diversity-dependence really is a better model of diversification than the birth-death model, this will be difficult to combine with the approaches I have developed in my thesis, especially as there appear to be multiple distinct ways of modelling such dynamics. This view also raises deeper questions about what state-dependent diversification or species selection means in this context, as the interactions necessarily become frequency dependent. Most postulated species-selection scenarios correspond to simple directional (or possibly stabilising) selection; with diversity-dependent diversification, different traits may confer different carrying capacities, rates of increase, or rates of character evolution, all of which may vary with the number and identity of other species.
6.1.2 Alternative explanations for poor fit

Returning to the apparent slowdowns in diversification observed in phylogenies, there are additional explanations that have received less attention. In a birth-death model, “birth” can happen at any instant before the present. Indeed, we expect a large number of births very close to the present, especially with nonzero extinction rates (figure 1.1). However, we are reluctant to call two species separate until they are morphologically, ecologically, or reproductively distinct (Coyne and Orr, 2004). Recognising species therefore typically requires the passage of significant amounts of time since sharing a common ancestor, so that for many groups distinct species are at least hundreds of thousands or millions of years old. In essence, there may be a lag period between speciation (i.e., the point at which two populations share only trivial amounts of gene flow) and our recognition of species. This taxonomic lag will hide recent speciation events, essentially collapsing the last million years into unresolved clades that we recognise as a single species (Weir and Schluter, 2007; Purvis, 2008; Purvis et al., 2009; Rosenblum et al., 2012). Biased sampling of taxa towards including more distinct species (i.e., deeper nodes) would also lead to apparent slowdowns in diversification in pure birth trees, even when most taxa are included (Cusimano and Renner, 2010; Brock et al., 2011).
The number of missing species may be substantial; cryptic diversity may more than double the number of known species (e.g., Hebert et al., 2004a; Pfenninger and Schwenk, 2007; Funk et al., 2012). Even DNA divergence-based estimates likely underestimate the true number of cryptic species, given that DNA divergence of the level often considered still requires substantial time to develop (e.g., 2% in Hebert et al., 2004b). Low mutation rates, small sample sizes, and incomplete lineage sorting will all hamper identification of lineages that will never again exchange genes. In addition to cryptic species, new species are being discovered even in well-studied groups. It seems far more likely that we are missing species that are closely related to known species than missing entire deep splits, at least among vertebrates. For example, while 341 mammal species were discovered between 1992 and 2006, most were closely related to known species with only 18 new genera created (Reeder et al., 2007). This pattern of missing taxa will generate apparent slowdowns in diversification.

An additional area of disconnect between the models and the data is caused by the quality of the phylogenies that we use. In most comparative phylogenetic approaches, we take the tree as given or, at best, compute a statistic over a distribution of trees. Compared to the extensive work on developing and comparing alternative models of cladogenesis, there has been relatively little work considering the impact of tree estimation error on estimates from any of these models (but see Revell et al., 2005; Cusimano and Renner, 2010). Ages of recent splits may be systematically overestimated, which would create apparent slowdowns (Pulquério and Nichols, 2007; Purvis, 2008), as would a discordance between gene-trees and species-trees (Burbrink and Pyron, 2011).
Estimates of speciation and extinction rates are sensitive to branch length, particularly near the tips, and differences between fossil date estimates, calibration, and methods of reconstruction can have pronounced effects on node ages (Pulquério and Nichols, 2007; Warnock et al., 2012). While many methods for creating time-calibrated phylogenies attempt to take into account some biologically realistic patterns of deviations from a molecular clock, we are still learning the ways that the clock can be violated (e.g., Smith and Donoghue, 2008; Schwartz and Mueller, 2010; Mayrose and Otto, 2011). Together, these issues raise the question of whether our tree inferences are good enough to accurately detect the true nature of deviations from a birth-death process.
6.1.3 What about character evolution?

The criticisms above focus on the birth-death model, but the models of character evolution used in my thesis are just as simple, and likely just as flawed. However, there does not seem to be the same groundswell of dislike against our current models of character evolution, to the point where we do not even know how badly they fit. We do know that trait-dependent diversification can bias estimates of character transition rates (Maddison, 2006). In addition, there is a general mistrust of ancestral state reconstruction methods (e.g., Omland, 1999; Losos, 2011), especially for continuous traits (Oakley and Cunningham, 2000; Webster and Purvis, 2002). However, the weaknesses in these models have not stimulated the same proliferation of methods as has occurred for birth-death models. For example, for binary data on small trees there is often insufficient power to distinguish between a model where rates of 0 → 1 and 1 → 0 transitions differ, and one where these rates are equal (the Jukes–Cantor model). This led Mooers and Schluter (1999) to suggest that a two rate model was rarely justified. In contrast, estimation of zero extinction rates (which means that the pure birth model is statistically indistinguishable from the birth-death model) has led to development of alternative models. I can offer no explanation for why two such similar issues have led to such dramatically different degrees of scepticism.
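The two binary-trait models compared above differ only in whether the forward and backward rates are constrained to be equal. As a minimal sketch (the function name and rate values are illustrative assumptions, not diversitree's interface), the transition probabilities of a two-state continuous-time Markov chain have a simple closed form:

```python
import math

def transition_probs(q01, q10, t):
    """Transition probability matrix P(t) for a two-state continuous-time
    Markov chain with rates q01 (0 -> 1) and q10 (1 -> 0).
    Entry [i][j] is the probability of ending in state j after time t,
    given a start in state i."""
    total = q01 + q10
    decay = math.exp(-total * t)
    p01 = q01 / total * (1.0 - decay)
    p10 = q10 / total * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]

# Equal rates (one-parameter model): the matrix is symmetric.
print(transition_probs(0.2, 0.2, 1.5))
# Unequal rates (two-parameter model): long branches approach the
# stationary frequencies q10 / (q01 + q10) and q01 / (q01 + q10).
print(transition_probs(0.3, 0.1, 1000.0))
```

On short branches the two models give similar probabilities unless the rates differ substantially, which is consistent with the low power on small trees noted above.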
6.1.4 Model adequacy

Are simple models useful in any way? Can we learn anything from phylogenies about speciation, extinction, or character evolution that we didn’t already know? Our models trade off realism in order to remain tractable:
“Brownian motion is a poor model, and so is Ornstein-Uhlenbeck, but just as democracy is the worst method of organising a society ‘except for all the others’, so these two models are all we’ve really got that is tractable. Critics will be admitted to the event, but only if they carry with them another tractable model.”
Joe Felsenstein (R-sig-phylo mailing list, 7 April 2008)
All models are incorrect. All we hope is that they capture enough “truth” to tell us something about a complex system. While I recognise that our models of cladogenesis and character evolution are a gross simplification of reality, it is not clear to me that there currently is a much better alternative to the birth-death model, with all options lacking in one way or other. The main focus in this thesis is relating species traits to patterns of species diversity in order to detect the signature of species selection. It is quite possible that even if the model of cladogenesis is totally incorrect, and the rates are meaningless in an absolute sense, there may still be valuable information in the estimated parameters. For example, if the “true model” involves some form of niche-filling, and if species with some particular trait have a higher rate of turnover or a higher carrying capacity, then it is likely that a method like BiSSE or QuaSSE would still correctly associate the trait with increased “speciation” (see also Wiens, 2011). How often this will happen, and how often this fortuitous association will be missed, will depend on the true model of cladogenesis and deserves future study.

Actual patterns of diversification in both species and traits are far more complicated than any of the models discussed in my thesis or elsewhere. Fossil diversity within groups has risen and fallen, with complex temporal patterns. For example, large North American carnivores have experienced several bouts of wholesale replacement (Valkenburgh et al., 2004), rather than an increase or steady pattern of diversity. Fossil data suggest Cetacean diversity was up to five times higher than today just 10 million years ago (Quental and Marshall, 2010). This seems equally at odds with models positing a steady increase in diversity and with models positing a carrying capacity at a low taxonomic level. Clearly, more work is needed both to improve our methods and to better characterise how well they fit our data.
6.2 Future directions
Here, I briefly outline some areas for possible future development and synthesis that would improve our ability to infer past processes using phylogenies.
6.2.1 Better diversification models

Recently, there have been attempts to reconcile the models of cladogenesis with more realistic models of speciation and species concepts. Incorporating “protracted speciation”, where the probability of speciation of a lineage depends on the time since the previous speciation event, may capture biological aspects of the speciation process that the birth-death model ignores (Rosindell et al., 2010; Etienne and Rosindell, 2012). Accounting for biased sampling can avoid misinference of speciation and extinction rates (Höhna et al., 2011; Cusimano et al., 2012). Better still, detecting such biased sampling and correcting for it automatically would increase the robustness of our inferences. Similarly, different species concepts imply different interpretations of “birth” and “death”, which in turn can lead to very different inferences about patterns of diversification from phylogenies (Ezard et al., 2012). Very little research has been done along these lines, but this should bring our models considerably closer to matching biology. However, determining what the “correct” concept is will be challenging (and most likely, contentious). All these developments are recent, and it is not yet clear how they will affect choice among competing models of diversification.
6.2.2 Better trait evolution models

While development of alternative models of trait evolution appears to lag models of diversification, development of new approaches has recently accelerated. Methods to identify
topological shifts in the rate of trait evolution are now available (O’Meara et al., 2006; Eastman et al., 2011; Slater et al., 2012), though only for continuous traits⁴. Felsenstein (2012) recently developed a method that allows for a “threshold” model of discrete trait evolution, in which an observed binary trait represents the sign of an unobserved continuous trait. This relaxes the assumption of constant rates of trait change by allowing traits to change more rapidly if they have recently changed, and might be expected whenever traits are parts of co-adapted trait complexes. Detecting the association between such a trait and speciation and extinction would be challenging (both statistically and computationally) but could represent a more realistic model than those I have presented. Models that allow for speciational shifts in character states (e.g., Bokma, 2008b; Magnuson-Ford and Otto, 2012; Goldberg and Igić, in press) are still in early stages of development but hold promise for testing hypotheses about the role of ecological speciation and punctuated equilibrium (but see Rabosky, 2012). It is possible to extend QuaSSE to allow for punctuated trait changes, but doing so will be computationally challenging. Similarly, methods that allow for more realistic modelling of species ranges, constraints, or simultaneous character evolution in multiple groups would significantly broaden the fairly meagre toolbox of trait models currently available.

Probably the most useful development with respect to improving models of trait evolution would be the creation of goodness-of-fit statistics. The gamma statistic (Pybus et al., 2002, see above) has been instrumental in motivating research into alternative diversification models, and a similar statistic for traits could do the same for trait evolution.
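The threshold model mentioned above can be sketched with a short simulation along a single lineage. This is an illustration of the general idea only, not Felsenstein's method: the unobserved liability is assumed to follow a discretised Brownian motion, the observed binary trait is its sign, and the function names, step size, and seed are arbitrary assumptions. Because the liability sits close to zero just after a change, further changes tend to follow quickly, so waiting times between changes are mostly short with occasional long runs.

```python
import random

def simulate_threshold_trait(n_steps, sigma, dt, rng, start=0.0):
    """Simulate a liability evolving by Brownian motion and return the
    observed binary trait (1 if liability > 0, else 0) at each step."""
    liability = start
    states = []
    for _ in range(n_steps):
        liability += rng.gauss(0.0, sigma * dt ** 0.5)
        states.append(1 if liability > 0 else 0)
    return states

def waiting_times(states):
    """Numbers of steps between successive changes in the binary trait."""
    change_points = [i for i in range(1, len(states)) if states[i] != states[i - 1]]
    return [b - a for a, b in zip(change_points, change_points[1:])]

rng = random.Random(42)
states = simulate_threshold_trait(n_steps=20000, sigma=1.0, dt=0.01, rng=rng)
waits = sorted(waiting_times(states))
if waits:
    # Compare the typical (median) and average waiting time between changes.
    print(len(waits), waits[len(waits) // 2], sum(waits) / len(waits))
```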
6.2.3 Combining phylogenetics & the fossil record

The fossil record is already the main source of comparison for phylogenetic approaches to studying speciation, extinction, and trait evolution. However, most studies treat fossils and phylogenies as two separate sources of information: one generates a hypothesis, and
⁴ This is straightforward to implement for discrete traits for given shift positions in diversitree, though such a method’s statistical properties are unknown. Implementing a search over possible shifts, as in Eastman et al. (2011), is less trivial.
the other tests it. Closer integration of phylogenetic and fossil analyses would allow better inferences than either alone. For example, incorporating fossil data can improve ancestral state estimation (Finarelli and Flynn, 2006). Recent efforts in a diversification context include Simpson et al. (2011), who used extinction data from the fossil record when inferring rates of speciation and extinction from a phylogeny of reef corals, and “Fossil MEDUSA” (J. Brown et al., in prep.), which uses records of past diversity to relax assumptions of both temporal and topological homogeneity in rates.

While there is scope for improving inference by combining phylogenetic approaches with fossil data, this integration will introduce challenges that will have to be solved. Using fossils to date nodes in a phylogeny is probably the most common way that fossils are used to complement purely phylogenetic analyses, and this application illustrates some potential problems. Fossils belong to branches of unknown length that connect to unknown places within a phylogeny, and adding them directly to phylogenies is not trivial (Felsenstein, 2002; Forey et al., 2004). Even under ideal circumstances, with a complete fossil record and clock-like molecular evolution, discrepancies are expected between molecular and fossil dates, with fossil-based node ages biased towards the present as fossils indicate only minimum clade age (Brown et al., 2008; Lukoschek et al., 2012). The species concepts used by palaeontologists are necessarily morphological, and analyses are often performed at the level of morphological fossil “genera” or “families” (e.g., Benton, 1995; Mayhew, 2007; Jablonski, 2008a; Liow et al., 2008). More than a semantic issue, different species concepts can entirely change interpretations about patterns of diversity both from the fossil record (Benton, 1997; Alroy, 2000; Forey et al., 2004) and from phylogenies (Ezard et al., 2012).
Taxonomic misidentication can cause “pseudo- extinction”, where taxa disappear from the fossil record due to morphological change, rather than extinction (Alroy, óþþÉ). Such pseudo-extinctions are paired with pseudo- speciation events, leading to ination of both estimates of speciation and extinction rates, and the apparent correlation between these rates. Multiple specimens from a single species may be recorded as dierent species or even dierent genera (Forey et al., óþþ¦).
However, many of these issues can probably be addressed through modelling, and I believe there is scope to improve our analyses considerably when using all the available evidence simultaneously.
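The effect of paired pseudo-extinction and pseudo-speciation events on rate estimates can be illustrated with a small calculation. The sketch below is not part of the analyses in this thesis; it assumes, purely for illustration, that pseudo-events occur at a constant per-lineage rate that simply adds to both the true speciation and true extinction rates. The function names, the rate ranges, and the number of clades are all hypothetical. It is written in Python rather than R for compactness.

```python
import random


def apparent_rates(speciation, extinction, pseudo_rate):
    """Apparent per-lineage rates when every pseudo-extinction is paired
    with a pseudo-speciation (assumed additive, constant-rate model)."""
    return speciation + pseudo_rate, extinction + pseudo_rate


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_clades(n_clades=2000, seed=1):
    """Draw independent true rates and a clade-specific pseudo-event rate
    (all ranges hypothetical); return true and apparent rate lists."""
    rng = random.Random(seed)
    true_spec, true_ext, app_spec, app_ext = [], [], [], []
    for _ in range(n_clades):
        lam = rng.uniform(0.1, 0.3)   # true speciation rate
        mu = rng.uniform(0.0, 0.1)    # true extinction rate, independent of lam
        psi = rng.uniform(0.0, 0.4)   # rate of pseudo-extinction/speciation pairs
        a_lam, a_mu = apparent_rates(lam, mu, psi)
        true_spec.append(lam)
        true_ext.append(mu)
        app_spec.append(a_lam)
        app_ext.append(a_mu)
    return true_spec, true_ext, app_spec, app_ext


if __name__ == "__main__":
    t_spec, t_ext, a_spec, a_ext = simulate_clades()
    print("correlation of true rates:    ", round(pearson(t_spec, t_ext), 3))
    print("correlation of apparent rates:", round(pearson(a_spec, a_ext), 3))
```

Under this assumed model the net diversification rate (speciation minus extinction) is unchanged, while both individual rates are inflated, and variation among clades in the pseudo-event rate alone is enough to make otherwise independent speciation and extinction rates appear positively correlated.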
6.3 Conclusion
Much of the value in comparative phylogenetics is in formalising hypotheses about the processes that might have shaped diversity. Simple approaches such as sister-clade comparisons allow us to formally test hypotheses about the associations of traits with higher levels of diversity in a coarse manner (e.g., Mitter et al., 1988; Barraclough et al., 1998; Vamosi and Vamosi, 2005). I feel that the strength of the methods presented in my thesis is that they allow a more fine-grained test of associations and can lead to a more pluralistic view of how species have diversified and how traits may have become distributed (e.g., Goldberg et al., 2005; Maddison, 2006; Clauset and Erwin, 2008; Schwander and Crespi, 2009). I hope that in combination with other types of data, the approaches discussed in my thesis will be a useful source of future hypotheses.
Bibliography
Alfaro M.E., Santini F., Brock C., Alamillo H., Dornburg A., Rabosky D.L., Carnevale G., and Harmon L.J. 2009. Nine exceptional radiations plus high turnover explain species diversity in jawed vertebrates. Proceedings of the National Academy of Sciences, USA 106:13410–13414.
Allen L.J.S. 2003. An Introduction to Stochastic Processes with Applications to Biology. Pearson Prentice Hall, Upper Saddle River, N.J.
Alroy J. 1996. Constant extinction, constrained diversification, and uncoordinated stasis in North American mammals. Palaeogeography, Palaeoclimatology and Palaeoecology 127:285–311.
Alroy J. 1998. Cope's rule and the dynamics of body mass evolution in North American fossil mammals. Science 280:731–734.
Alroy J. 2000. Successive approximations of diversity curves: ten more years in the library. Geology 28:1023–1026.
Alroy J. 2008. Dynamics of origination and extinction in the marine fossil record. Proceedings of the National Academy of Sciences, USA 105:11536–11542.
Alroy J. 2009. Speciation and extinction in the fossil record of North American mammals. In Speciation and Patterns of Diversity (R. Butlin, J. Bridle, and D. Schluter, eds.), pages 301–323, Cambridge University Press.
Alroy J., Aberhan M., Bottjer D.J., Foote M., Fürsich F.T., Harries P.J., Hendy A.J.W., Holland S.M., Ivany L.C., Kiessling W., Kosnik M.A., Marshall C.R., McGowan A.J., Miller A.I., Olszewski T.D., Patzkowsky M.E., Peters S.E., Villier L., Wagner P.J., Bonuso N., Borkow P.S., Brenneis B., Clapham M.E., Fall L.M., Ferguson C.A., Hanson V.L., Krug A.Z., Layou K.M., Leckey E.H., Nürnberg S., Powers C.M., Sessa J.A., Simpson C., Tomasovych A., and Visaggi C.C. 2008. Phanerozoic trends in the global diversity of marine invertebrates. Science 321:97–100.
Anacker B.L., Whittall J.B., Goldberg E.E., and Harrison S.P. 2010. Origins and consequences of serpentine endemism in the California flora. Evolution 65:365–376.
Barraclough T.G., Harvey P.H., and Nee S. 1995. Sexual selection and taxonomic diversity in passerine birds. Proceedings of the Royal Society of London Series B 259:211–215.
Barraclough T.G., Nee S., and Harvey P.H. 1998. Sister-group analysis in identifying correlates of diversification. Evolutionary Ecology 12:751–754.
Benton M.J. 1995. Diversification and extinction in the fossil record. Science 268:52–58.
Benton M.J. 1997. Models for the diversification of life. Trends in Ecology and Evolution 12:490–495.
Benton M.J. 2002. Cope's rule. In Encyclopedia of Evolution (M. Pagel, ed.), pages 185–186, Oxford University Press, Oxford.
Bininda-Emonds O.R.P., Cardillo M., Jones K.E., MacPhee R.D.E., Beck R.M.D., Grenyer R., Price S.A., Vos R.A., Gittleman J.L., and Purvis A. 2007. The delayed rise of present-day mammals. Nature 446:507–512.
Bloch J.I., Rose K.D., and Gingerich P.D. 1998. New species of Batodonoides (Lipotyphla, Geolabididae) from the early Eocene of Wyoming: smallest known mammal? Journal of Mammalogy 79:804–827.
Bokma F. 2008a. Bayesian estimation of speciation and extinction probabilities from (in)complete phylogenies. Evolution 62:2441–2445.
Bokma F. 2008b. Detection of "punctuated equilibrium" by Bayesian estimation of speciation and extinction rates, ancestral character states, and rates of anagenetic and cladogenetic evolution on a molecular phylogeny. Evolution 62:2718–2726.
Bokma F. 2009. Problems detecting density-dependent diversification on phylogenies. Proceedings of the Royal Society of London Series B 276:993–994.
Bollback J.P. 2006. SIMMAP: Stochastic character mapping of discrete traits on phylogenies. BMC Bioinformatics 7:88.
Brock C.D., Harmon L.J., and Alfaro M.E. 2011. Testing for temporal variation in diversification rates when sampling is incomplete and nonrandom. Systematic Biology 60:410–419.
Brown J.H. 1995. Macroecology. University of Chicago Press, Chicago.
Brown J.H. and Maurer B.A. 1989. Macroecology: the division of food and space among species on continents. Science 243:1145–1150.
Brown J.W., Rest J.S., García-Moreno J., Sorenson M.D., and Mindell D.P. 2008. Strong mitochondrial DNA support for a Cretaceous origin of modern avian lineages. BMC Biology 6:6.
Burbrink F.T. and Pyron R.A. 2011. The impact of gene-tree/species-tree discordance on diversification-rate estimation. Evolution 65:1851–1861.
Burbrink F.T., Ruane S., and Pyron R.A. 2012. When are adaptive radiations replicated in areas? Ecological opportunity and unexceptional diversification in West Indian dipsadine snakes (Colubridae: Alsophiini). Journal of Biogeography 39:465–475.
Cardillo M. 1999. Latitude and rates of diversification in birds and butterflies. Proceedings of the Royal Society of London Series B 266:1221–1225.
Cardillo M., Mace G.M., Jones K.E., Bielby J., Bininda-Emonds O.R.P., Sechrest W., Orme C.D.L., and Purvis A. 2005. Multiple causes of high extinction risk in large mammals. Science 309:1239–1241.
Charlesworth B. 1981. Macroevolution: pattern and process (book review). Biological Journal of the Linnean Society 16:169–172.
Charlesworth B., Lande R., and Slatkin M. 1982. A neo-Darwinian commentary on macroevolution. Evolution 36:474–498.
Churchill G.A. 2000. Inferring ancestral character states, vol. 32 of Evolutionary Biology, chap. 6, pages 117–134. Kluwer Academic, New York.
Claramunt S., Derryberry E.P., Remsen Jr J.V., and Brumfield R.T. 2012. High dispersal ability inhibits speciation in a continental radiation of passerine birds. Proceedings of the Royal Society of London, Series B Biological Sciences 279:1567–1574.
Clauset A. and Erwin D.H. 2008. The evolution and distribution of species body size. Science 321:399–401.
Cohen S.D. and Hindmarsh A.C. 1996. CVODE, a stiff/nonstiff ODE solver in C. Computers in Physics 10:138–143.
Cope E. 1887. The origin of the fittest. Appleton, New York.
Cope E.D. 1896. The primary factors of organic evolution. Open Court Publishing Company, Chicago.
Coyne J.A. and Orr H.A. 2004. Speciation. Sinauer Associates, Massachusetts.
Cusimano N. and Renner S.S. 2010. Slowdowns in diversification rates from real phylogenies may not be real. Systematic Biology 59:458–464.
Cusimano N., Stadler T., and Renner S.S. 2012. A new method for handling missing species in diversification analysis applicable to randomly or non-randomly sampled phylogenies. Systematic Biology in press.
Damuth J. and Heisler I.L. 1988. Alternative formulations of multilevel selection. Biology and Philosophy 3:407–430.
Darwin C. 1871. The Descent of Man, and Selection in Relation to Sex. Murray, London.
Davies T.J., Barraclough T.G., Chase M.W., Soltis P.S., and Soltis D.E. 2004. Darwin's abominable mystery: insights from a supertree of the angiosperms. Proceedings of the National Academy of Sciences, USA 101:1904–1909.
Dial K.P. and Marzluff J.M. 1988. Are the smallest organisms the most diverse? Ecology 69:1620–1624.
Drummond A.J. and Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evolutionary Biology 7:214.
Duda T.F. Jr. and Palumbi S.R. 1999. Developmental shifts and species selection in gastropods. Proceedings of the National Academy of Sciences, USA 96:10272–10277.
Eastman J.M., Alfaro M.E., Joyce P., Hipp A.L., and Harmon L.J. 2011. A novel comparative method for identifying shifts in the rate of character evolution on trees. Evolution 65:3578–3589.
Eastman J.M. and Storfer A. 2011. Correlations of life-history and distributional-range variation with salamander diversification rates: Evidence for species selection. Systematic Biology 60:503–518.
Eble G.J. 1999. Originations: land and sea compared. Geobios 2.
Eldredge N. and Gould S.J. 1972. Punctuated equilibria: an alternative to phyletic gradualism. In Models in Paleobiology (T.J.M. Schopf, ed.), pages 82–115, Freeman, Cooper and Company, San Francisco.
Erwin D.H. 2010. Macroevolution is more than repeated rounds of microevolution. Evolution and Development 2:78–84.
Estes S. and Arnold S.J. 2009. Resolving the paradox of stasis: Models with stabilizing selection explain evolutionary divergence on all timescales. The American Naturalist 169:227–244.
Etienne R.S., de Visser S.N., Janzen T., Olsen J.L., Olff H., and Rosindell J. 2012a. Can clade age alone explain the relationship between body size and diversity? Interface Focus 2:170–179.
Etienne R.S., Haegeman B., Stadler T., Aze T., Pearson P., Purvis A., and Phillimore A.B. 2012b. Diversity-dependence brings molecular phylogenies closer to agreement with the fossil record. Proceedings of the Royal Society of London Series B in press.
Etienne R.S. and Rosindell J. 2012. Prolonging the past counteracts the pull of the present: protracted speciation can explain observed slowdowns in diversification. Systematic Biology 61:204–213.
Ezard T., Pearson P., Aze T., and Purvis A. 2012. The meaning of birth and death (in macroevolutionary birth-death models). Biology Letters 8:139–142.
Farrell B.D. 1998. "Inordinate fondness" explained: why are there so many beetles? Science 281:555–559.
Felsenstein J. 1973. Maximum-likelihood estimation of evolutionary trees from continuous characters. American Journal of Human Genetics 25:471–492.
Felsenstein J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution 17:368–376.
Felsenstein J. 1985. Phylogenies and the comparative method. The American Naturalist 125:1–15.
Felsenstein J. 1988. Phylogenies and quantitative characters. Annual Review of Ecology and Systematics 19:445–471.
Felsenstein J. 2002. Quantitative characters, phylogenies, and morphometrics. In Morphology, Shape, and Phylogenetics (N. MacLeod, ed.), Systematics Association Special Volume Series 64, Taylor and Francis, London.
Felsenstein J. 2012. A comparative method for both discrete and continuous characters using the threshold model. The American Naturalist 179:145–156.
Figuerola J. 1999. A comparative study on the evolution of reversed size dimorphism in monogamous waders. Biological Journal of the Linnean Society 67:1–18.
Finarelli J.A. and Flynn J.J. 2006. Ancestral state reconstruction of body size in the Caniformia (Carnivora, Mammalia): The effects of incorporating data from the fossil record. Systematic Biology 55:301–313.
Fisher R.A. 1958. The genetical theory of natural selection. Dover, New York.
FitzJohn R.G. 2010. Quantitative traits and diversification. Systematic Biology 59:619–633.
FitzJohn R.G., Maddison W.P., and Otto S.P. 2009. Estimating trait-dependent speciation and extinction rates from incompletely resolved phylogenies. Systematic Biology 58:595–611.
Forbes A.A., Powell T.H.Q., Stelinski L.L., Smith J.J., and Feder J.L. 2009. Sequential sympatric speciation across trophic levels. Science 323:776–779.
Forey P.L., Fortey R.A., Kenrick P., and Smith A.B. 2004. Taxonomy and fossils: a critical appraisal. Philosophical Transactions of the Royal Society B: Biological Sciences 359:639–653.
Freckleton R.P., Phillimore A.B., and Pagel M. 2008. Relating traits to diversification: a simple test. The American Naturalist 172:102–115.
Frigo M. and Johnson S.G. 2005. The design and implementation of FFTW3. Proceedings of the IEEE 93:216–231.
Fritz S.A., Bininda-Emonds O.R.P., and Purvis A. 2009. Geographical variation in predictors of mammalian extinction risk: big is bad, but only in the tropics. Ecology Letters 12:538–549.
Funk W.C., Caminer M., and Ron S.R. 2012. High levels of cryptic species diversity uncovered in Amazonian frogs. Proceedings of the Royal Society of London Series B in press.
Gage M.J.G., Parker G.A., Nylin S., and Wiklund C. 2002. Sexual selection and speciation in mammals, butterflies and spiders. Proceedings of the Royal Society of London Series B 269:2309–2316.
Gardezi T. and da Silva J. 1999. Diversity in relation to body size in mammals: a comparative study. The American Naturalist 153:110–123.
Gavrilets S. 2000. Rapid evolution of reproductive barriers driven by sexual conflict. Nature 403:886–889.
Gelman A., Carlin J.B., Stern H.S., and Rubin D.B. 1995. Bayesian Data Analysis. Chapman & Hall, London.
Gilinsky N.L. 1994. Volatility and the Phanerozoic decline of background extinction intensity. Paleobiology 20:445–458.
Gilinsky N.L. and Bambach R.K. 1987. Asymmetrical patterns of origination and extinction in higher taxa. Paleobiology 13:427–445.
Gittleman J. and Purvis A. 1998. Body size and species-richness in carnivores and primates. Proceedings of the Royal Society of London Series B 265:113–119.
Goldberg E.E. and Igić B. 2008. On phylogenetic tests of irreversible evolution. Evolution 62:2727–2741.
Goldberg E.E. and Igić B. in press. Tempo and mode in plant breeding system evolution. Evolution.
Goldberg E.E., Kohn J.R., Lande R., Robertson K.A., Smith S.A., and Igić B. 2010. Species selection maintains self-incompatibility. Science 330:493–495.
Goldberg E.E., Lancaster L.T., and Ree R.H. 2011. Phylogenetic inference of reciprocal effects between geographic range evolution and diversification. Systematic Biology 60:451–465.
Goldberg E.E., Roy K., Lande R., and Jablonski D. 2005. Diversity, endemism, and age distributions in macroevolutionary sources and sinks. The American Naturalist 165:623–633.
Gómez J.M. and Verdú M. 2012. Mutualism with plants drives primate diversification. Systematic Biology in press.
Gould S.J. and Eldredge N. 1977. Punctuated equilibrium: the tempo and mode of evolution reconsidered. Paleobiology 3:115–151.
Gould S.J., Raup D.M., Sepkoski Jr. J.J., Schopf T.J.M., and Simberloff D.S. 1977. The shape of evolution: A comparison of real and random clades. Paleobiology 3:23–40.
Grant P.R. and Grant B.R. 2002. Unpredictable evolution in a 30-year study of Darwin's finches. Science 296:707–711.
Grant P.R. and Grant B.R. 2009. The secondary contact phase of allopatric speciation in Darwin's finches. Proceedings of the National Academy of Sciences, USA 106:20141–20148.
Grantham T.A. 1995. Hierarchical approaches to macroevolution: recent work on species selection and the "effect hypothesis". Annual Review of Ecology and Systematics 26:301–321.
Green P.J. 1995. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82:711–732.
Grimaldi D. 1999. The co-radiations of pollinating insects and angiosperms in the Cretaceous. Annals of the Missouri Botanical Garden 86:373–406.
Grimshaw A.J. 2001. The Adventures of Punctuated Equilibria. Ph.D. thesis, Deakin University.
Hackett S.J., Kimball R.T., Reddy S., Bowie R.C.K., Braun E.L., Braun M.J., Chojnowski J.L., Cox W.A., Han K.L., Harshman J., Huddleston C.J., Marks B.D., Miglia K.J., Moore W.S., Sheldon F.H., Steadman D.W., Witt C.C., and Yuri T. 2008. A phylogenomic study of birds reveals their evolutionary history. Science 320:1763–1768.
Hansen T.F. and Martins E.P. 1996. Translating between microevolutionary process and macroevolutionary patterns: the correlation structure of interspecific data. Evolution 50:1404–1414.
Harcourt A.H., Copperto S.A., and Parks S.A. 2002. Rarity, specialization and extinction in primates. Journal of Biogeography 29:445–456.
Harmon L.J., Weir J.T., Brock C.D., Glor R.E., and Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24:129–131.
Harvey P.H., May R.M., and Nee S. 1994. Phylogenies without fossils. Evolution 48:523–529.
Hebert P.D.N., Penton E.H., Burns J.M., Janzen D.H., and Hallwachs W. 2004a. Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proceedings of the National Academy of Sciences, USA 101:14812–14817.
Hebert P.D.N., Stoeckle M.Y., Zemlak T.S., and Francis C.M. 2004b. Identification of birds through DNA barcodes. PLoS Biology 2:e312.
Heilbuth J.C. 2000. Lower species richness in dioecious clades. The American Naturalist 156:221–241.
Hendry A.P., Nosil P., and Rieseberg L.H. 2007. The speed of ecological speciation. Functional Ecology 21:455–464.
Hernández Fernández M. and Vrba E.S. 2005. A complete estimate of the phylogenetic relationships in Ruminantia: a dated species-level supertree of the extant ruminants. Biological Reviews 80:269–302.
Herrera C.M. 1992. Historical effects and sorting processes as explanations for contemporary ecological patterns: Character syndromes in Mediterranean woody plants. The American Naturalist 140:421–446.
Herron M.D., Castoe T.A., and Parkinson C.L. 2004. Sciurid phylogeny and the paraphyly of Holarctic ground squirrels (Spermophilus). Molecular Phylogenetics and Evolution 31:1015–1030.
Hindmarsh A.C., Brown P.N., Grant K.E., Lee S.L., Serban R., Shumaker D.E., and Woodward C.S. 2005. SUNDIALS: Suite of nonlinear and differential/algebraic equation solvers. ACM Transactions on Mathematical Software 31:363–396.
Hodges S.A. and Arnold M.L. 1995. Spurring plant diversification: are floral nectar spurs a key innovation? Proceedings of the Royal Society of London, Series B 262:343–348.
Höhna S., Stadler T., Ronquist F., and Britton T. 2011. Inferring speciation and extinction rates under different sampling schemes. Molecular Biology and Evolution 28:2577–2589.
Huelsenbeck J.P., Larget B., Miller R.E., and Ronquist F. 2002. Potential applications and pitfalls of Bayesian inference of phylogeny. Systematic Biology 51:673–688.
Huelsenbeck J.P., Rannala B., and Larget B. 2000. A Bayesian framework for the analysis of cospeciation. Evolution 54:352–364.
Isaac N.J.B., Agapow P.M., Harvey P.H., and Purvis A. 2003. Phylogenetically nested comparisons for testing correlates of species richness: a simulation study of continuous variables. Evolution 57:18–26.
Jablonski D. 1997. Body-size evolution in Cretaceous molluscs and the status of Cope's rule. Nature 385:250–252.
Jablonski D. 2000. Micro- and macroevolution: scale and hierarchy in evolutionary biology and paleobiology. Paleobiology 26:15–52.
Jablonski D. 2008a. Biotic interactions and macroevolution: extensions and mismatches across scales and levels. Evolution 62:715–739.
Jablonski D. 2008b. Species selection: theory and data. Annual Review of Ecology, Evolution, and Systematics 39:501–524.
Jaramillo C., Rueda M.J., and Mora G. 2006. Cenozoic plant diversity in the Neotropics. Science 311:1893–1896.
Johnson M.T.J., FitzJohn R.G., Smith S.D., Rausher M.D., and Otto S.P. 2011. Loss of sexual recombination and segregation is associated with increased diversification in evening primroses. Evolution 65:3230–3240.
Jones K.E., Bielby J., Cardillo M., Fritz S.A., O'Dell J., Orme C.D.L., Safi K., Sechrest W., Boakes E.H., Carbone C., Connolly C., Cutts M.J., Foster J.K., Grenyer R., Habib M., Plaster C.A., Price S.A., Rigby E.A., Rist J., Teacher A., Bininda-Emonds O.R.P., Gittleman J.L., Mace G.M., and Purvis A. 2009. PanTHERIA: A species-level database of life-history, ecology and geography of extant and recently extinct mammals. Ecology 90:2648.
Kalmar A. and Currie D.J. 2010. The completeness of the continental fossil record and its impact on patterns of diversification. Paleobiology 36:51–60.
Karlin S. and Taylor H.M. 1981. A Second Course in Stochastic Processes. Academic Press, London.
Kass R.E. and Raftery A.E. 1995. Bayes factors. Journal of the American Statistical Association 90:773–795.
Kubo T. and Iwasa Y. 1995. Inferring the rates of branching and extinction from molecular phylogenies. Evolution 49:694–704.
Kuhn T.S., Mooers A.O., and Thomas G.H. 2011. A simple polytomy resolver for dated phylogenies. Methods in Ecology and Evolution 2:427–436.
Labandeira C.C. and Sepkoski Jr. J.J. 1993. Insect diversity in the fossil record. Science 261:310–315.
Lande R. 1980. Microevolution in relation to macroevolution. Paleobiology 6:233–238.
Leigh E.G. Jr. 1977. How does selection reconcile individual advantage with the good of the group? Proceedings of the National Academy of Sciences, USA 74:4542–4546.
Leisch F. 2002. Sweave: Dynamic generation of statistical reports using literate data analysis. In Compstat 2002 - Proceedings in Computational Statistics (W. Härdle and B. Rönz, eds.), pages 575–580, Physica Verlag, Heidelberg, ISBN 3-7908-1517-9.
Levinton J.S. 1979. A theory of diversity equilibrium and morphological evolution. Science 204:335–336.
Lewis P.O. 2001. A likelihood approach to estimating phylogeny from discrete morphological character data. Systematic Biology 50:913–925.
Lewontin R. 1970. The units of selection. Annual Review of Ecology and Systematics 1:1–18.
Lieberman B.S. and Vrba E.S. 2005. Stephen Jay Gould on species selection: 30 years of insight. Paleobiology 31:113–121.
Liow L.H., Fortelius M., Bingham E., Lintulaakso K., Mannila H., Flynn L., and Stenseth N.C. 2008. Higher origination and extinction rates in larger mammals. Proceedings of the National Academy of Sciences, USA 105:6097–6102.
Lislevand T., Figuerola J., and Székely T. 2007. Avian body sizes in relation to fecundity, mating system, display behavior, and resource sharing. Ecology 88:1605.
Lloyd E.A. and Gould S.J. 1993. Species selection on variability. Proceedings of the National Academy of Sciences, USA 90:595–599.
Losos J. 2011. Seeing the forest for the trees: The limitations of phylogenies in comparative biology. The American Naturalist 177:709–727.
Lukoschek V., Keogh J.S., and Avise J.C. 2012. Evaluating fossil calibrations for dating phylogenies in light of rates of molecular evolution: A comparison of three approaches. Systematic Biology 61:22–43.
Lutzoni F., Pagel M., and Reeb V. 2001. Major fungal lineages are derived from lichen symbiotic ancestors. Nature 411:937–940.
Lyell C. 1832. Principles of Geology, vol. 2. John Murray, London.
Lynch V.J. 2009. Live-birth in vipers (Viperidae) is a key innovation and adaptation to global cooling during the Cenozoic. Evolution 63:2457–2465.
MacArthur R.H. 1969. Patterns of communities in the tropics. Biological Journal of the Linnean Society 1:19–30.
MacKay D.J.C. 2003. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, New York.
Maddison D.R., Swofford D.L., and Maddison W.P. 1997. NEXUS: an extensible file format for systematic information. Systematic Biology 46:590–621.
Maddison W.P. 2006. Confounding asymmetries in evolutionary diversification and character change. Evolution 60:1743–1746.
Maddison W.P. and Maddison D.R. 2006. Mesquite: a modular system for evolutionary analysis. Version 1.1. http://www.mesquiteproject.org.
Maddison W.P. and Maddison D.R. 2008. Mesquite: A modular system for evolutionary analysis. Version 2.5. http://www.mesquiteproject.org.
Maddison W.P., Midford P.E., and Otto S.P. 2007. Estimating a binary character's effect on speciation and extinction. Systematic Biology 56:701–710.
Magnuson-Ford K. and Otto S.P. 2012. Linking the investigations of character evolution and species diversification. The American Naturalist.
Mayhew P.J. 2007. Why are there so many insect species? Perspectives from fossils and phylogenies. Biological Reviews 82:425–454.
Maynard Smith J. 1983. Models of evolution. Proceedings of the Royal Society of London Series B 219:315–325.
Maynard Smith J. 1989. The causes of extinction. Philosophical Transactions of the Royal Society B: Biological Sciences 325:241–252.
Mayrose I. and Otto S.P. 2011. A likelihood method for detecting trait-dependent shifts in the rate of molecular evolution. Molecular Biology and Evolution 28:759–770.
McGlone M.S., Duncan R.P., and Heenan P.B. 2001. Endemism, species selection and the origin and distribution of the vascular plant flora of New Zealand. Journal of Biogeography 28:199–216.
McPeek M.A. 2008. The ecological dynamics of clade diversification and community assembly. The American Naturalist 172:E270–E284.
Mitra S., Landel H., and Pruett-Jones S. 1996. Species richness covaries with mating system in birds. The Auk 113:544–551.
Mitter C.B., Farrell B., and Wiegmann B. 1988. The phylogenetic study of adaptive zones: has phytophagy promoted insect diversification? The American Naturalist 132:107–128.
Moler C. and Loan C.V. 2003. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Review 45:3–49.
Mooers A.Ø. and Schluter D. 1999. Reconstructing ancestor states with maximum likelihood: support for one- and two-rate models. Systematic Biology 48:623–633.
Moore M.J., Bell C.D., Soltis P.S., and Soltis D.E. 2007. Using plastid genome-scale data to resolve enigmatic relationships among basal angiosperms. Proceedings of the National Academy of Sciences, USA 104:19363–19368.
Morlon H., Parsons T., and Plotkin J. 2011. Reconciling molecular phylogenies with the fossil record. Proceedings of the National Academy of Sciences, USA 108:16327–16332.
Morlon H., Potts M.D., and Plotkin J. 2010. Inferring the dynamics of diversification: a coalescent approach. PLoS Biology 8:e1000493.
Morrow E.H. and Pitcher T.E. 2003. Sexual selection and the risk of extinction in birds. Proceedings of the Royal Society of London Series B 270:1793–1799.
Morrow E.H., Pitcher T.E., and Arnqvist G. 2003. No evidence that sexual selection is an 'engine of speciation' in birds. Ecology Letters 6:228–234.
Morse D.R., Lawton J.H., Dodson M.M., and Williamson M.H. 1985. Fractal dimension of vegetation and the distribution of arthropod body lengths. Nature 314:731–733.
Mossel E. and Steel M. 2005. How much can evolved characters tell us about the tree that generated them? In Mathematics of Evolution and Phylogeny (O. Gascuel, ed.), pages 384–412, Oxford University Press.
Moyle R.G., Filardi C.E., Smith C.E., and Diamond J. 2009. Explosive Pleistocene diversification and hemispheric expansion of a "great speciator". Proceedings of the National Academy of Sciences, USA 106:1863–1868.
Neal R.M. 2003. Slice sampling. Annals of Statistics 31:705–767.
Nee S. 2001. Inferring speciation rates from phylogenies. Evolution 55:661–668.
Nee S. 2006. Birth-death models in macroevolution. Annual Review of Ecology and Systematics 37:1–17.
Nee S., Holmes E.C., May R.M., and Harvey P.H. 1994a. Extinction rates can be estimated from molecular phylogenies. Philosophical Transactions of the Royal Society B: Biological Sciences 344:77–82.
Nee S., May R.M., and Harvey P.H. 1994b. The reconstructed evolutionary process. Philosophical Transactions of the Royal Society B: Biological Sciences 344:305–311.
Oakley T.H. and Cunningham C.W. 2000. Independent contrasts succeed where ancestor reconstruction fails in a known bacteriophage phylogeny. Evolution 54.
Okasha S. 2006. Evolution and the Levels of Selection. Oxford University Press, New York.
O'Meara B.C., Ané C., Sanderson M.J., and Wainwright P.C. 2006. Testing for different rates of continuous trait evolution using likelihood. Evolution 60:922–933.
Omland K.E. 1999. The assumptions and challenges of ancestral state reconstructions. Systematic Biology 48:604–611.
Owens I.P.F., Bennett P.M., and Harvey P.H. 1999. Species richness among birds: body size, life history, sexual selection or ecology. Proceedings of the Royal Society of London Series B 266:933–939.
Pagel M. 1994. Detecting correlated evolution on phylogenies: a general method for the comparative analysis of discrete characters. Proceedings of the Royal Society of London Series B 255:37–45.
Pagel M. 1997. Inferring evolutionary processes from phylogenies. Zoologica Scripta 26:331–348.
Pagel M. and Meade A. 2006. Bayesian analysis of correlated evolution of discrete characters by reversible-jump Markov chain Monte Carlo. The American Naturalist 167:808–825.
Paradis E. 2005. Statistical analysis of diversification with species traits. Evolution 59:1–12.
Paradis E. 2008. Asymmetries in phylogenetic diversification and character change can be untangled. Evolution 62:241–247.
Paradis E. 2010. Time-dependent speciation and extinction from phylogenies: a least squares approach. Evolution 65:661–672.
Paradis E., Claude J., and Strimmer K. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20:289–290.
Parker G.A. and Partridge L. 1998. Sexual conflict and speciation. Philosophical Transactions of the Royal Society B: Biological Sciences 353:261–274.
Pellmyr O. 1992. Evolution of insect pollination and Angiosperm diversification. Trends in Ecology and Evolution 7:46–49.
Pfenninger M. and Schwenk K. 2007. Cryptic animal species are homogeneously distributed among taxa and biogeographical regions. BMC Evolutionary Biology 7:121.
Phillimore A.B. and Price T.D. 2008. Density-dependent cladogenesis in birds. PLoS Biology 6:483–489.
Phillimore A.B., Freckleton R.P., Orme C.D.L., and Owens I.P.F. 2006. Ecology predicts large-scale patterns of phylogenetic diversification in birds. The American Naturalist 168:220–229.
Pillon Y., Munzinger J., Amir H., and Lebrun M. 2010. Ultramafic soils and species sorting in the flora of New Caledonia. Journal of Ecology 98:1108–1116.
Plummer M., Best N., Cowles K., and Vines K. 2006. CODA: Convergence diagnosis and output analysis for MCMC. R News 6:7–11.
Polly P.D. 1998. Cope's rule. Science 282:50–51.
Pulquério M.J.F. and Nichols R.A. 2007. Dates from the molecular clock: how wrong can we be? Trends in Ecology and Evolution 180–184.
Purvis A. 2008. Phylogenetic approaches to the study of extinction. Annual Review of Ecology and Systematics 39:301–319.
Purvis A., Orme C.D.L., Toomey N.H., and Pearson P.N. 2009. Temporal patterns in diversification rates. In Speciation and Patterns of Diversity (R. Butlin, J. Bridle, and D. Schluter, eds.), pages 278–300, Cambridge University Press.
Pybus O.G. and Harvey P.H. 2000. Testing macro-evolutionary models using incomplete molecular phylogenies. Proceedings of the Royal Society of London Series B 267:2267–2272.
Pybus O.G., Rambaut A., Holmes E.C., and Harvey P.H. 2002. New inferences from tree shape: numbers of missing taxa and population growth curves. Systematic Biology 51:881–888.
Quental T.B. and Marshall C.R. 2010. Diversity dynamics: Molecular phylogenies need the fossil record. Trends in Ecology and Evolution 25:434–441.
R Development Core Team. 2012. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0.
Rabosky D.L. 2006. Likelihood methods for detecting temporal shifts in diversification rates. Evolution 60:1152–1164.
Rabosky D.L. 2009a. Ecological limits and diversification rate: alternative paradigms to explain the variation in species richness among clades and regions. Ecology Letters 12:735–743.
Rabosky D.L. 2009b. Heritability of extinction rates links diversification patterns in molecular phylogenies and fossils. Systematic Biology 58:629–640.
Rabosky D.L. 2012. Positive correlation between diversification rates and phenotypic evolvability can mimic punctuated equilibrium on molecular phylogenies. Evolution in press.
Rabosky D.L., Donnellan S.C., Talaba A.L., and Lovette I.J. 2007. Exceptional among-lineage variation in diversification rates during the radiation of Australia's most diverse vertebrate clade. Proceedings of the Royal Society of London Series B 274:2915–2923.
Rabosky D.L. and Glor R.E. 2010. Equilibrium speciation dynamics in a model adaptive radiation of island lizards. Proceedings of the National Academy of Sciences, USA 51:22178–22183.
Rabosky D.L. and Lovette I.J. 2008. Explosive evolutionary radiations: decreasing speciation or increasing extinction through time? Evolution 62:1866–1875.
Rabosky D.L. and McCune A.R. 2009. Reinventing species selection with molecular phylogenies. Trends in Ecology and Evolution 25:68–74.
Raia P., Carotenuto F., Passaro F., Fulgione D., and Fortelius M. 2012. Ecological specialization in fossil mammals explains Cope's rule. The American Naturalist 179:328–337.
Raup D.M., Gould S.J., Schopf T.J.M., and Simberloff D.S. 1973. Stochastic models of phylogeny and the evolution of diversity. Journal of Geology 81:525–542.
Read A.F. and Nee S. 1995. Inference from binary comparative data. Journal of Theoretical Biology 173:99–108.
Redding D.W., DeWolff C., and Mooers A.Ø. 2010. Evolutionary distinctiveness, threat status and ecological oddity in primates. Conservation Biology 24:1052–1058.
Ree R.H. 2005. Detecting the historical signature of key innovations using stochastic models of character evolution and cladogenesis. Evolution 59:257–265.
Reeder D.M., Helgen K.M., and Wilson D.E. 2007. Global trends and biases in new mammal species discoveries. Occasional Papers, Museum of Texas Tech University 269:1–35.
Rensch B. 1948. Histological changes correlated with evolutionary changes of body size. Evolution 2:218–230.
Revell L.J., Harmon L.J., and Glor R.E. 2005. Underparameterized model of sequence evolution leads to bias in the estimation of diversification rates from molecular phylogenies. Systematic Biology 54:973–983.
Rice S.H. 1995. A genetical theory of species selection. Journal of Theoretical Biology 177:237–245.
Ricklefs R.E. 2007. Estimating diversification rates from phylogenetic information. Trends in Ecology and Evolution 22:601–610.
Rosenblum E.B., Sarver B.A.J., Brown J.W., Des Roches S., Hardwick K.M., Hether T.D., Eastman J.M., Pennell M.W., and Harmon L.J. 2012. Goldilocks meets Santa Rosalia: An ephemeral speciation model explains patterns of diversification across time scales. Evolutionary Biology in press.
Rosenzweig M.L. 1996. Colonial birds probably do speciate faster. Evolutionary Ecology 681–683.
Rosindell J., Cornell S.J., Hubbell S., and Etienne R.S. 2010. Protracted speciation revitalizes the neutral theory of biodiversity. Ecology Letters 13:716–727.
Rowan T. 1990. Functional Stability Analysis of Numerical Algorithms. Ph.D. thesis, University of Texas at Austin.
Sallan L.C., Kammer T.W., Ausich W.I., and Cook L.A. 2011. Persistent predator–prey dynamics revealed by mass extinction. Proceedings of the National Academy of Sciences, USA 108:8335–8338.
Schluter D. 2009. Evidence for ecological speciation and its alternative. Science 323:737–741.
Schluter D., Price T., Mooers A.Ø., and Ludwig D. 1997. Likelihood of ancestor states in adaptive radiation. Evolution 51:1699–1711.
Schwander T. and Crespi B. 2009. Twigs on the tree of life? Neutral and selective models for integrating macroevolutionary patterns with microevolutionary processes in the analysis of asexuality. Molecular Ecology 18:28–42.
Schwartz R.S. and Mueller R.L. 2010. Branch length estimation and divergence dating: estimates of error in Bayesian and maximum likelihood frameworks. BMC Evolutionary Biology 10:5.
Seehausen O. 2006. Conservation: Losing biodiversity by reverse speciation. Current Biology 16:R334–R337.
Sepkoski Jr. J.J. 1997. Biodiversity: past, present, and future. Journal of Paleontology 71:533–539.
Sepkoski Jr. J.J. 1998. Rates of speciation in the fossil record. Philosophical Transactions of the Royal Society B: Biological Sciences 353:315–326.
Sidje R.B. 1998. Expokit: A software package for computing matrix exponentials. ACM Transactions on Mathematical Software 24:130–156.
Siepielski A.M., DiBattista J.D., and Carlson S.M. 2009. It’s about time: the temporal dynamics of phenotypic selection in the wild. Ecology Letters 12:1261–1276.
Simpson C. 2010. Species selection and driven mechanisms jointly generate a large-scale morphological trend in monobathrid crinoids. Paleobiology 36:481–496.
Simpson C., Kiessling W., Mewis H., Baron-Szabo R.C., and Müller J. 2011. Evolutionary diversification of reef corals: a comparison of the molecular and fossil records. Evolution 65:3274–3284.
Slater G.J., Harmon L.J., Wegmann D., Joyce P., Revell L.J., and Alfaro M.E. 2012. Fitting models of continuous trait evolution to incompletely sampled comparative data using Approximate Bayesian Computation. Evolution 66:752–762.
Slatkin M. 1981. A diffusion model of species selection. Paleobiology 7:421–425.
Smith S. and Donoghue M. 2008. Rates of molecular evolution are linked to life history in flowering plants. Science 322:86–89.
Soetaert K., Petzoldt T., and Setzer R.W. 2010. Solving differential equations in R: Package deSolve. Journal of Statistical Software 33:1–25.
Stadler T. 2011. Mammalian phylogeny reveals recent diversification rate shifts. Proceedings of the National Academy of Sciences, USA 108:6187–6192.
Stanley S.M. 1973. An explanation for Cope’s rule. Evolution 27:1–26.
Stanley S.M. 1975a. Clades versus clones in evolution: why we have sex. Science 190:382–383.
Stanley S.M. 1975b. A theory of evolution above the species level. Proceedings of the National Academy of Sciences, USA 72:646–650.
Steeman M.E., Hebsgaard M.B., Fordyce R.E., Ho S.Y.W., Rabosky D.L., Nielsen R., Rahbek C., Glenner H., Sørensen M.V., and Willerslev E. 2009. Radiation of extant cetaceans driven by restructuring of the oceans. Systematic Biology 58:573–585.
Taylor E.B., Boughman J.W., Groenboom M., Sniatynski M., Schluter D., and Gow J.L. 2006. Speciation in reverse: morphological and genetic evidence of the collapse of a three-spined stickleback (Gasterosteus aculeatus) species pair. Molecular Ecology 343–355.
Thomas G.H., Wills M.A., and Székely T. 2004. A supertree approach to shorebird phylogeny. BMC Evolutionary Biology 4:28.
Valkenburgh B.V., Wang X., and Damuth J. 2004. Cope’s rule, hypercarnivory, and extinction in North American canids. Science 306:101–104.
Vamosi S.M. and Vamosi J.C. 2005. Endless tests: guidelines for analysing non-nested sister-group comparisons. Evolutionary Ecology Research 7:567–579.
Van Valen L.M. 1975. Group selection, sex, and fossils. Evolution 29:87–94.
Vos R.A. and Mooers A.Ø. 2006. A new dated supertree of the Primates. Chapter 5. In Inferring large phylogenies: the big tree problem. (R.A. Vos) Ph.D. thesis, Simon Fraser University.
Vrba E.S. and Gould J.S. 1986. The hierarchical expansion of sorting and selection: sorting and selection cannot be equated. Paleobiology 12:217–228.
Walker T.D. and Valentine J.W. 1984. Equilibrium models of evolutionary species diversity and the number of empty niches. Evolution 124:887–899.
Warnock R.C.M., Yang Z., and Donoghue P.C.J. 2012. Exploring uncertainty in the calibration of the molecular clock. Biology Letters 8:156–159.
Webster A.J. and Purvis A. 2002. Testing the accuracy of methods for reconstructing continuous characters. Proceedings of the Royal Society of London Series B 269:143–149.
Weiblen G.D. and Bush G.L. 2002. Speciation in fig pollinators and parasites. Molecular Ecology 11:1573–1578.
Weir J.T. and Schluter D. 2007. The latitudinal gradient in recent speciation and extinction rates of birds and mammals. Science 315:1574–1576.
Wiens J.J. 2011. The causes of species richness patterns across space, time, and clades and the role of “ecological limits”. Quarterly Review of Biology 86:75–96.
Williams G.C. 1966. Adaptation and natural selection. Princeton University Press, New Jersey.
Wilson A.W., Binder M., and Hibbett D.S. 2011. Effects of gasteroid fruiting body morphology on diversification rates in three independent clades of fungi estimated using binary state speciation and extinction analysis. Evolution 65.
Wilson E.O. 1959. Adaptive shift and dispersal in a tropical ant fauna. Evolution 13:112–144.
Wilson E.O. 1961. The nature of the taxon cycle in the Melanesian ant fauna. The American Naturalist 95:169–193.
Winger B.M., Lovette I.J., and Winker D.W. 2012. Ancestry and evolution of seasonal migration in the Parulidae. Proceedings of the Royal Society of London Series B 279:610–618.
Yule G.U. 1925. A mathematical theory of evolution, based on the conclusions of Dr. JC Willis, FRS. Philosophical Transactions of the Royal Society B: Biological Sciences 21:21–87.
Appendix A
Supplementary Information to Chapter 2

A.1 Root-state calculations
Under BiSSE, at the root of a tree, $R$, we have the two probabilities $D_{R0}$ and $D_{R1}$, corresponding to the possible character states at the root. The overall likelihood must sum over the probabilities that the root was in each state. In Schluter et al. (1997), the $D$s at the root were weighted evenly, which assumes that the lineage arose out of a group with 50% of taxa in each state, therefore potentially being out of equilibrium with the inferred model of evolution. The model parameters provide some knowledge, however. For example, if transitions away from character state 0 are much more frequent than the reverse ($q_{01} > q_{10}$) and if speciation/extinction rates do not depend strongly on the character state, then we would expect that the system is more likely to be in state 1 than in state 0 at any point in time, including at the root. In Maddison et al. (2007), the information provided by the model was used to weight the $D$s at the root by the equilibrium frequencies for the character states given by the model (following Maddison and Maddison 2006). This implicitly assumes that a sufficient amount of time has passed prior to the root, so that the root state can be assumed to be a random draw from an equilibrium distribution. However, this assumption does not account for cases where the traits are novel or have yet to reach equilibrium. Here, we treat the root state as a nuisance parameter (Gelman et al., 1995) and use an alternative root assignment that weights each root state according to its probability of giving rise to the extant data, given the model parameters and the tree. This probability is given by the likelihood given that the root is in state $i$ divided by the sum of the likelihoods over both root states, $D_{Ri}/(D_{R0} + D_{R1})$. The overall likelihood is then:

$$D_R = D_{R0}\,\frac{D_{R0}}{D_{R0} + D_{R1}} + D_{R1}\,\frac{D_{R1}}{D_{R0} + D_{R1}} \tag{A.1}$$
As a test case, consider a tree that consists of a single branch with an infinitesimally short branch length where the single extant taxon is in state 0. Assuming that none of the transition parameters is very large, then $D_{R0}$ is nearly one and $D_{R1}$ is nearly zero. Assigning the root state according to the probability of the root leading to the data, as we do in equation (A.1), we infer that there is nearly a 100% probability that the root was in state 0, and the overall probability of the data given the model is nearly one. Assigning the root state uniformly, we would be assuming that there is a 50% probability that the root state was 1, even though we know this to be impossible (more precisely, it has an infinitesimally small probability of being true given the infinitesimally short branch and the fact that the single extant taxon is in state 0). Similarly, assigning the root state according to the equilibrium distribution would give some non-zero probability to the root being in state 1, unless $q_{01} = 0$. Furthermore, if there is directional change in the character, assigning the root to the equilibrium distribution incorrectly forces the character to take on the value that it will have in the long-term future, not what it is likely to have been in the past. For example, if all organisms started in state 0 and are evolving into state 1 ($q_{01} \gg q_{10}$, with all else equal), the equilibrium distribution method will incorrectly assign the root to state 1, even if most extant species are still in state 0. By contrast, assigning the root states according to their relative likelihoods of explaining the data will assign a high probability to the root having been in state 0 when most species are in state 0. Equation (A.1) has the further advantage that the only quantities needed for its calculation are $D_{R0}$ and $D_{R1}$, which are already known once BiSSE has traversed the tree.
Goldberg and Igić (2008) have recently explored the effect of root state on BiSSE calculations and found that it can have a strong effect on conclusions, especially where character change is unidirectional. Our approach (equation (A.1)) can approximate the ancestral root state and result in reasonable character change rate estimates for the situations described above. In practice, sensitivity to the root state is easily detected by comparing $D_{R0}$ and $D_{R1}$.
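The three root treatments discussed in this section can be compared side by side. The following is a minimal Python sketch, not the code used for the analyses; the function name `root_likelihood` and its arguments are hypothetical, and the equilibrium frequencies are passed in by hand rather than derived from the model.

```python
def root_likelihood(d_r0, d_r1, method="conditional", equilibrium=None):
    """Combine the root probabilities D_R0 and D_R1 into a single likelihood.

    method="flat":        weight both root states by 0.5 (even weighting)
    method="equilibrium": weight by user-supplied equilibrium frequencies
    method="conditional": weight state i by D_Ri / (D_R0 + D_R1), as in (A.1)
    """
    if method == "flat":
        w0, w1 = 0.5, 0.5
    elif method == "equilibrium":
        if equilibrium is None:
            raise ValueError("equilibrium=(p0, p1) is required")
        w0, w1 = equilibrium
    elif method == "conditional":
        total = d_r0 + d_r1
        w0, w1 = d_r0 / total, d_r1 / total
    else:
        raise ValueError("unknown method: %r" % (method,))
    return w0 * d_r0 + w1 * d_r1


# Test case from the text: a very short branch leading to a tip in state 0,
# so that D_R0 is nearly one and D_R1 is nearly zero.
d_r0, d_r1 = 0.999, 0.001
for method in ("flat", "conditional"):
    print(method, root_likelihood(d_r0, d_r1, method))
```

With these inputs the even weighting returns a likelihood near one half, while the conditional weighting returns a value near one, matching the argument made for the single-branch test case.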
A.2 Character-independent model
When the speciation and extinction rates do not depend on a character, our likelihood calculations reduce to existing models of character-independent evolution. Analytical solutions for these models are known, removing the need to use numerical approaches to calculating the likelihoods.
A.2.1 Skeletal trees

With character-independent speciation rates, the skeletal-tree likelihoods reduce to the method of Nee et al. (1994b). The character-independent analogues of equation (2.1) are
$$\frac{dD_N}{dt} = -(\lambda + \mu)D_N(t) + 2\lambda E(t)D_N(t) \tag{A.2a}$$

$$\frac{dE}{dt} = \mu - (\lambda + \mu)E(t) + \lambda E(t)^2 \tag{A.2b}$$
(Maddison et al., 2007), where $\lambda$ and $\mu$ are the character-independent speciation and extinction rates. If a fraction, $f$, of all species are sampled in the phylogeny, then the initial conditions are $E(0) = 1 - f$ and $D(0) = f$. It is possible to derive an analytical solution to equations (A.2) describing changes along a single branch. Using the initial condition $E(0) = 1 - f$, the solution to equation (A.2b) for the extinction probability is
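$$E(t) = 1 - \frac{f(\lambda - \mu)}{f\lambda - e^{-(\lambda-\mu)t}\,(\mu - \lambda(1 - f))} \quad \text{if } \lambda \neq \mu \tag{A.3a}$$

$$E(t) = \frac{1 - f + f\lambda t}{1 + f\lambda t} \quad \text{if } \lambda = \mu \tag{A.3b}$$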
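The closed form for $E(t)$ can be checked against direct numerical integration of equation (A.2b). The sketch below is illustrative only: the function names are hypothetical, and a fixed-step fourth-order Runge-Kutta scheme stands in for a production ODE solver.

```python
import math


def extinction_prob(t, lam, mu, f=1.0):
    """Closed-form E(t) with sampling fraction f and E(0) = 1 - f."""
    if lam == mu:
        return (1.0 - f + f * lam * t) / (1.0 + f * lam * t)
    r = lam - mu
    return 1.0 - f * r / (f * lam - math.exp(-r * t) * (mu - lam * (1.0 - f)))


def extinction_prob_rk4(t, lam, mu, f=1.0, steps=2000):
    """Integrate dE/dt = mu - (lam + mu) E + lam E^2 from E(0) = 1 - f."""
    def rhs(e):
        return mu - (lam + mu) * e + lam * e * e

    e = 1.0 - f
    h = t / steps
    for _ in range(steps):
        k1 = rhs(e)
        k2 = rhs(e + 0.5 * h * k1)
        k3 = rhs(e + 0.5 * h * k2)
        k4 = rhs(e + h * k3)
        e += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return e


for lam, mu, f in ((0.3, 0.1, 0.7), (0.2, 0.2, 0.5)):
    print(lam, mu, f,
          extinction_prob(5.0, lam, mu, f),
          extinction_prob_rk4(5.0, lam, mu, f))
```

The two columns printed for each parameter set should agree closely, for both the $\lambda \neq \mu$ and $\lambda = \mu$ branches.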
Substituting equation (A.3) into equation (A.2a) and solving for $D_N(t)$ gives

$$D_N(t) = e^{-(\lambda-\mu)(t - t_N)}\,\frac{\left(f\lambda - e^{-(\lambda-\mu)t_N}(\mu - \lambda(1 - f))\right)^2}{\left(f\lambda - e^{-(\lambda-\mu)t}(\mu - \lambda(1 - f))\right)^2}\,D_N(t_N) \quad \text{if } \lambda \neq \mu \tag{A.4a}$$

$$D_N(t) = \frac{(1 + f\lambda t_N)^2}{(1 + f\lambda t)^2}\,D_N(t_N) \quad \text{if } \lambda = \mu, \tag{A.4b}$$

where $t_N$ represents the time depth (since the present) of node $N$. Equations (A.3) and (A.4) reduce to equations (9) and (10) in Maddison et al. (2007) if sampling is complete ($f = 1$). These equations can be used as in Maddison et al. (2007) to compute the likelihood for the entire phylogeny:
$$D_R(t_R) = \left[\prod_{k=1}^{2n} e^{-(\lambda-\mu)(t_{k,b} - t_{k,t})}\,\frac{\left(f\lambda - e^{-(\lambda-\mu)t_{k,t}}(\mu - \lambda(1 - f))\right)^2}{\left(f\lambda - e^{-(\lambda-\mu)t_{k,b}}(\mu - \lambda(1 - f))\right)^2}\right]\lambda^n \quad \text{if } \lambda \neq \mu \tag{A.5a}$$

$$D_R(t_R) = \left[\prod_{k=1}^{2n} \frac{(1 + f\lambda t_{k,t})^2}{(1 + f\lambda t_{k,b})^2}\right]\lambda^n \quad \text{if } \lambda = \mu \tag{A.5b}$$

where $t_{k,b}$ and $t_{k,t}$ are the times at the base and tip of the $k$th branch, respectively, and the product is taken over all $2n$ branches for a tree containing $n$ nodes. Equation (A.5) is consistent with the results from Nee et al. (1994b) for the character-independent case. Nee et al. (1994b) does not explicitly give equations for the probability of a sampled phylogeny, given a speciation rate, extinction rate and sampling probability, so we state them here. Using the notation of Nee et al. (1994b), the probability of the data is

$$L = (N - 1)!\, f^{N-2}\lambda^{N-2} \prod_{i=3}^{N} P_s(t_i, T)\,(1 - u_s(x_2))^2 \prod_{i=3}^{N}(1 - u_s(x_i)) \tag{A.6}$$

where $N$ is the number of tips in the phylogeny, $x_i$ is the time between the present and the node that splits the phylogeny into $i$ branches (so that $x_2$ is the distance to the root), $P_s(t_i, T)$ is the probability that a lineage originating at time $t_i$ leaves at least one descendant at the present, time $T$ (given by Nee et al.’s (1994) equation (34)), and $u_s(t)$ is

$$u_s(t) = 1 - \frac{1 - a}{f e^{rt} - a + 1 - f} \tag{A.7}$$

where $a = \mu/\lambda$ and $r = \lambda - \mu$. Equation (A.7) can be derived from equations (29) and (33) in Nee et al. (1994b). After some algebra, equation (A.5a) can be shown to be equal to equation (A.6), after conditioning on the existence of a root node and two surviving lineages (see Maddison et al. 2007 for similar calculations with $f = 1$).
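The per-branch factor that appears inside the product in equation (A.5) is the ratio $D_N(t)/D_N(t_N)$ from equation (A.4). As a rough check, the sketch below compares that ratio with a numerical solution of the coupled equations (A.2). The function names are hypothetical and the fixed-step Runge-Kutta integrator is only a stand-in for a real ODE solver.

```python
import math


def branch_factor(t_tip, t_base, lam, mu, f=1.0):
    """Ratio D(t_base) / D(t_tip) along one branch, from equation (A.4)."""
    if lam == mu:
        return (1.0 + f * lam * t_tip) ** 2 / (1.0 + f * lam * t_base) ** 2
    r = lam - mu

    def u(t):
        return f * lam - math.exp(-r * t) * (mu - lam * (1.0 - f))

    return math.exp(-r * (t_base - t_tip)) * u(t_tip) ** 2 / u(t_base) ** 2


def integrate_d(t_end, lam, mu, f=1.0, steps=4000):
    """RK4 solution of equations (A.2), started from D(0) = 1, E(0) = 1 - f.

    The D equation is linear in D, so ratios of D do not depend on D(0).
    """
    def rhs(d, e):
        return (-(lam + mu) * d + 2.0 * lam * e * d,
                mu - (lam + mu) * e + lam * e * e)

    d, e = 1.0, 1.0 - f
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(d, e)
        k2 = rhs(d + 0.5 * h * k1[0], e + 0.5 * h * k1[1])
        k3 = rhs(d + 0.5 * h * k2[0], e + 0.5 * h * k2[1])
        k4 = rhs(d + h * k3[0], e + h * k3[1])
        d += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        e += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return d


lam, mu, f = 0.3, 0.1, 0.7
numeric = integrate_d(4.0, lam, mu, f) / integrate_d(1.0, lam, mu, f)
print(branch_factor(1.0, 4.0, lam, mu, f), numeric)
```

Multiplying such factors over all branches, together with the $\lambda^n$ term, gives the tree-wide quantity in equation (A.5).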
A.2.2 Terminally unresolved trees

To calculate the likelihood for terminally unresolved trees, we must first calculate the probability of unresolved clades given the speciation and extinction rates. We could re-derive Q and x, ignoring character state changes, which now do not affect diversification, and use equation (2.3) to compute the probability of the clade. However, without transitions, this can be viewed as a single birth-death process, for which an analytical solution is available. The probability of $k$ lineages arising and surviving to the present from a single ancestor over a period of time $t$ is given by (Nee et al., 1994b):
$$P(k, t) = \frac{\lambda - \mu}{\lambda - \mu e^{-(\lambda-\mu)t}}\,(1 - u(t))\,u(t)^{k-1} \quad k > 0,\ \lambda \neq \mu \tag{A.8a}$$

$$P(k, t) = \frac{(t\lambda)^{k-1}}{(1 + t\lambda)^{k+1}} \quad k > 0,\ \lambda = \mu \tag{A.8b}$$

where $u(t)$ is $u_s(t)$ from equation (A.7) with $f = 1$. If an unresolved clade has $n$ species and originated at time $t_N$, then $P(n, t_N)$ can be used for $D_N(t_N)$ in equation (9) of Maddison et al. (2007). This approach has been used by Rabosky et al. (2007) to estimate speciation and extinction rates from a terminally unresolved lizard phylogeny.
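A quick sanity check on equation (A.8) is that the probabilities over all clade sizes $k > 0$ sum to the probability that the lineage survives at all. The following Python sketch assumes complete sampling ($f = 1$); the function names are hypothetical.

```python
import math


def survival_prob(t, lam, mu):
    """Probability that a single lineage has at least one descendant after time t."""
    if lam == mu:
        return 1.0 / (1.0 + lam * t)
    r = lam - mu
    return r / (lam - mu * math.exp(-r * t))


def clade_size_prob(k, t, lam, mu):
    """P(k, t) from equation (A.8): exactly k > 0 surviving species after time t."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    if lam == mu:
        # (t lam)^(k-1) / (1 + t lam)^(k+1), written to avoid overflow for large k
        ratio = lam * t / (1.0 + lam * t)
        return ratio ** (k - 1) / (1.0 + lam * t) ** 2
    r = lam - mu
    a = mu / lam
    # u(t): equation (A.7) evaluated at f = 1
    u = (math.exp(r * t) - 1.0) / (math.exp(r * t) - a)
    return survival_prob(t, lam, mu) * (1.0 - u) * u ** (k - 1)


for lam, mu in ((0.4, 0.1), (0.3, 0.3)):
    total = sum(clade_size_prob(k, 3.0, lam, mu) for k in range(1, 5001))
    print(lam, mu, total, survival_prob(3.0, lam, mu))
```

For an unresolved clade of `n` species that originated at time `t_n`, `clade_size_prob(n, t_n, lam, mu)` is the value that would be supplied as the starting condition for that clade.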
Appendix B
Supplementary Information to Chapter 3

B.1 Single character derivation
Here, I derive equation (3.3): the partial differential equation that describes changes in the probability of extinction of a lineage in state $x$ over time $t$, $E(x, t)$, under QuaSSE. This derivation parallels both that of BiSSE (Maddison et al., 2007) and the Kolmogorov backward differential equation (Allen, 2003). Starting with equation (3.2), subtracting $E(x, t)$ from both sides and dropping remaining terms that are of order $(\Delta t)^2$ gives

$$\begin{aligned}
E(x, t + \Delta t) - E(x, t) ={}& \mu(x)\Delta t + \lambda(x)\Delta t\left[\int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t)E(z, t)\,dz\right]^2 \\
&- (\lambda(x) + \mu(x))\Delta t\int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t)E(z, t)\,dz \\
&+ \int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t)E(z, t)\,dz - E(x, t) + O(\Delta t^2).
\end{aligned} \tag{B.1}$$

Next, note that because $g$ is a probability distribution function it must integrate to 1 over all possible future character states, $\int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t)\,dz = 1$, so that

$$E(x, t) = \int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t)E(x, t)\,dz. \tag{B.2}$$
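Replacing the final $E(x, t)$ term in equation (B.1) with equation (B.2) and dividing both sides by $\Delta t$ gives

$$\begin{aligned}
\frac{E(x, t + \Delta t) - E(x, t)}{\Delta t} ={}& \mu(x) + \lambda(x)\left[\int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t)E(z, t)\,dz\right]^2 \\
&- (\lambda(x) + \mu(x))\int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t)E(z, t)\,dz \\
&+ \frac{1}{\Delta t}\int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t)\left(E(z, t) - E(x, t)\right)dz + O(\Delta t).
\end{aligned} \tag{B.3}$$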
We then take the limit $\Delta t \to 0$. In this limit, the left hand side becomes the partial derivative $\partial E(x, t)/\partial t$. Because wild jumps are not allowed and we are considering an infinitesimally small time period, the term $E(z, t)$ can be expanded as a Taylor series in $z$ around the point $z = x$:

$$E(z, t) = E(x, t) + (z - x)\frac{\partial E(x, t)}{\partial x} + \frac{(z - x)^2}{2}\frac{\partial^2 E(x, t)}{\partial x^2} + O((z - x)^3) \tag{B.4}$$
Using this expansion, the third term on the right hand side of equation (B.3) can then be written

$$\lim_{\Delta t \to 0}(\lambda(x) + \mu(x))\int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t)\left[E(x, t) + (z - x)\frac{\partial E(x, t)}{\partial x} + \frac{(z - x)^2}{2}\frac{\partial^2 E(x, t)}{\partial x^2} + O((z - x)^3)\right]dz.$$

In the limit $\Delta t \to 0$, transitions any distance away from $x$ become increasingly unlikely. That is,

$$\lim_{\Delta t \to 0} g(z, t \mid x, t + \Delta t) = \delta(z - x),$$

where $\delta(x)$ is the Dirac delta function, concentrating all probability density on the point $z = x$. This means that

$$\lim_{\Delta t \to 0}\int_{-\infty}^{\infty}(z - x)^k g(z, t \mid x, t + \Delta t)\,dz = 0, \quad k > 0$$
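and the third term of equation (B.3) can be rewritten

$$\lim_{\Delta t \to 0}(\mu(x) + \lambda(x))\int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t)E(z, t)\,dz = (\mu(x) + \lambda(x))E(x, t). \tag{B.5}$$

The same logic applied to the second term of equation (B.3) gives

$$\lim_{\Delta t \to 0}\lambda(x)\left[\int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t)E(z, t)\,dz\right]^2 = \lambda(x)E(x, t)^2. \tag{B.6}$$

After substituting in the Taylor expansion from equation (B.4), the fourth term of equation (B.3) becomes

$$\lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty} g(z, t \mid x, t + \Delta t)\left[(z - x)\frac{\partial E(x, t)}{\partial x} + \frac{(z - x)^2}{2}\frac{\partial^2 E(x, t)}{\partial x^2} + O((z - x)^3)\right]dz.$$

This can be simplified using the diffusion conditions from equation (3.1) to give

$$\phi(x, t)\frac{\partial E(x, t)}{\partial x} + \frac{\sigma^2(x, t)}{2}\frac{\partial^2 E(x, t)}{\partial x^2}. \tag{B.7}$$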
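To see why only the first- and second-derivative terms survive the $1/\Delta t$ scaling, it can help to work through one concrete transition density. The sketch below assumes, purely for illustration, that over a short interval the displacement $z - x$ is normally distributed with mean $\phi\,\Delta t$ and variance $\sigma^2\Delta t$ (Brownian motion with drift). This Gaussian kernel and the function name are assumptions of the example, not part of the derivation. The moments used are the standard raw moments of a normal distribution.

```python
def scaled_moments(phi, sigma2, dt):
    """Return E[(z - x)^k] / dt for k = 1, 2, 3, 4 under a Gaussian step.

    The step has mean m = phi * dt and variance v = sigma2 * dt.
    """
    m = phi * dt
    v = sigma2 * dt
    raw = [
        m,                                    # first raw moment
        v + m ** 2,                           # second raw moment
        m ** 3 + 3.0 * m * v,                 # third raw moment
        m ** 4 + 6.0 * m ** 2 * v + 3.0 * v ** 2,  # fourth raw moment
    ]
    return [r / dt for r in raw]


for dt in (1e-1, 1e-3, 1e-5):
    print(dt, scaled_moments(0.5, 2.0, dt))
```

As `dt` shrinks, the first scaled moment stays at `phi`, the second approaches `sigma2`, and the third and fourth shrink toward zero, which is the pattern that leaves only the two terms shown in equation (B.7).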
Substituting equations (B.5), (B.6), and (B.7) into equation (B.3) gives the partial differential equation (3.3).

B.2 Multivariate character derivation
I start with the two-character case, from which the extension to an arbitrary number of characters immediately follows. Suppose we have two characters, $x$ and $y$. Let $g(a, b, t \mid x, y, t + \Delta t)$ be the probability density that the character changes from $x, y$ at time $t + \Delta t$ to state $a, b$ at time $t$, where $t$ is closer to the present than $t + \Delta t$ ($0 < t < t + \Delta t$). The functions $E$ and $D_N$ are now functions of both character variables; $E(x, y, t)$ and $D_N(x, y, t)$.
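Continuing as for the single character case, $E(x, y, t + \Delta t)$ can be written

$$\begin{aligned}
E(x, y, t + \Delta t) ={}& \mu(x, y)\Delta t \\
&+ (1 - \mu(x, y)\Delta t)\lambda(x, y)\Delta t \times \left[\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(a, b, t \mid x, y, t + \Delta t)E(a, b, t)\,da\,db\right]^2 \\
&+ (1 - \mu(x, y)\Delta t)(1 - \lambda(x, y)\Delta t) \times \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(a, b, t \mid x, y, t + \Delta t)E(a, b, t)\,da\,db \\
&+ O(\Delta t^2).
\end{aligned} \tag{B.8}$$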
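Dropping terms of $O(\Delta t^2)$, subtracting $E(x, y, t)$ from both sides, dividing by $\Delta t$, and rearranging using the fact that

$$E(x, y, t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(a, b, t \mid x, y, t + \Delta t)E(x, y, t)\,da\,db,$$

we have

$$\begin{aligned}
\frac{E(x, y, t + \Delta t) - E(x, y, t)}{\Delta t} ={}& \mu(x, y) \\
&+ \lambda(x, y)\left[\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(a, b, t \mid x, y, t + \Delta t)E(a, b, t)\,da\,db\right]^2 \\
&- (\mu(x, y) + \lambda(x, y))\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(a, b, t \mid x, y, t + \Delta t)E(a, b, t)\,da\,db \\
&+ \frac{1}{\Delta t}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(a, b, t \mid x, y, t + \Delta t)\left(E(a, b, t) - E(x, y, t)\right)da\,db \\
&+ O(\Delta t).
\end{aligned} \tag{B.9}$$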
To simplify equation (B.9), we need an expression for $E(a, b, t)$ in terms of the original location $(x, y)$. This can be expanded as a Taylor series in two variables around the point $(a = x, b = y)$:

$$\begin{aligned}
E(a, b, t) ={}& E(x, y, t) \\
&+ (a - x)\frac{\partial E}{\partial x} + \frac{(a - x)^2}{2}\frac{\partial^2 E}{\partial x^2} \\
&+ (b - y)\frac{\partial E}{\partial y} + \frac{(b - y)^2}{2}\frac{\partial^2 E}{\partial y^2} \\
&+ (a - x)(b - y)\frac{\partial^2 E}{\partial x\,\partial y} + O((\Delta x, \Delta y)^3)
\end{aligned} \tag{B.10}$$

where $O((\Delta x, \Delta y)^3)$ includes the terms $O((a - x)^3)$, $O((b - y)^3)$, $O((a - x)^2(b - y))$, and $O((a - x)(b - y)^2)$. We take the limit $\Delta t \to 0$ for equation (B.9) and consider each term in sequence. Using the same logic as the one character case,
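$$\lim_{\Delta t \to 0} g(a, b, t \mid x, y, t + \Delta t) = \delta(a - x)\delta(b - y),$$

so

$$\lim_{\Delta t \to 0}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(a - x)^{k_x}(b - y)^{k_y} g(a, b, t \mid x, y, t + \Delta t)\,da\,db = 0, \quad \text{for } k_x, k_y > 0.$$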
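With this, the second and third terms of equation (B.9) can be written

$$\lim_{\Delta t \to 0}\lambda(x, y)\left[\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(a, b, t \mid x, y, t + \Delta t)E(a, b, t)\,da\,db\right]^2 = \lambda(x, y)E(x, y, t)^2 \tag{B.11}$$

and

$$\lim_{\Delta t \to 0}(\mu(x, y) + \lambda(x, y))\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(a, b, t \mid x, y, t + \Delta t)E(a, b, t)\,da\,db = (\mu(x, y) + \lambda(x, y))E(x, y, t) \tag{B.12}$$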
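After substituting the Taylor expansion (B.10) into the fourth term of equation (B.9) and taking the limit $\Delta t \to 0$, we have

$$\begin{aligned}
\lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(a, b, t \mid x, y, t + \Delta t) \times \Bigg[& (a - x)\frac{\partial E}{\partial x} + \frac{(a - x)^2}{2}\frac{\partial^2 E}{\partial x^2} + (b - y)\frac{\partial E}{\partial y} + \frac{(b - y)^2}{2}\frac{\partial^2 E}{\partial y^2} \\
&+ (a - x)(b - y)\frac{\partial^2 E}{\partial x\,\partial y} + O((\Delta x, \Delta y)^3)\Bigg]\,da\,db.
\end{aligned}$$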
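A similar set of assumptions to those in equation (3.1) can be made. Consider first the derivatives involving just one character. Rearranging the first of these gives:

$$\lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(a, b, t \mid x, y, t + \Delta t)(a - x)\frac{\partial E}{\partial x}\,da\,db = \frac{\partial E}{\partial x}\lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty}(a - x)\int_{-\infty}^{\infty} g(a, b, t \mid x, y, t + \Delta t)\,db\,da.$$

Define

$$\int_{-\infty}^{\infty} g(a, b, t \mid x, y, t + \Delta t)\,db = g(a, t \mid x, t + \Delta t),$$

so that integrating over all transitions in $y$ gives the transition probability density function for $x$. The equation above then becomes

$$\frac{\partial E}{\partial x}\lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty}(a - x)g(a, t \mid x, t + \Delta t)\,da$$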
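where the term within the limit is identical to equation (3.1a). With similar manipulation for the other terms, the diffusion conditions become

$$\phi_x(x, y, t) = \lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(a - x)g(a, b, t \mid x, y, t + \Delta t)\,da\,db \tag{B.13a}$$

$$\phi_y(x, y, t) = \lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(b - y)g(a, b, t \mid x, y, t + \Delta t)\,da\,db \tag{B.13b}$$

$$\sigma_{x,x}(x, y, t) = \lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(a - x)^2 g(a, b, t \mid x, y, t + \Delta t)\,da\,db \tag{B.13c}$$

$$\sigma_{y,y}(x, y, t) = \lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(b - y)^2 g(a, b, t \mid x, y, t + \Delta t)\,da\,db \tag{B.13d}$$

$$\sigma_{x,y}(x, y, t) = \lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(a - x)(b - y)g(a, b, t \mid x, y, t + \Delta t)\,da\,db \tag{B.13e}$$

$$0 = \lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(a - x)^{k_x}(b - y)^{k_y} g(a, b, t \mid x, y, t + \Delta t)\,da\,db, \tag{B.13f}$$

where condition (B.13f) holds for $k_x + k_y > 2$. The term $\sigma_{x,y}(x, y, t)$ represents the instantaneous covariance between $x$ and $y$. Substituting equations (B.11), (B.12), and (B.13) into equation (B.9) and applying the limit to the left-hand side gives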
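$$\begin{aligned}
\frac{\partial E(x, y, t)}{\partial t} ={}& \mu(x, y) + \lambda(x, y)E(x, y, t)^2 - (\lambda(x, y) + \mu(x, y))E(x, y, t) \\
&+ \phi_x(x, y, t)\frac{\partial E(x, y, t)}{\partial x} + \phi_y(x, y, t)\frac{\partial E(x, y, t)}{\partial y} \\
&+ \frac{\sigma_{x,x}(x, y, t)}{2}\frac{\partial^2 E(x, y, t)}{\partial x^2} + \frac{\sigma_{y,y}(x, y, t)}{2}\frac{\partial^2 E(x, y, t)}{\partial y^2} \\
&+ \sigma_{x,y}(x, y, t)\frac{\partial^2 E(x, y, t)}{\partial x\,\partial y}
\end{aligned} \tag{B.14}$$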
and a similar process leads to the partial differential equation

$$\begin{aligned}
\frac{\partial D_N(x, y, t)}{\partial t} ={}& 2\lambda(x, y)D_N(x, y, t)E(x, y, t) - (\lambda(x, y) + \mu(x, y))D_N(x, y, t) \\
&+ \phi_x(x, y, t)\frac{\partial D_N(x, y, t)}{\partial x} + \phi_y(x, y, t)\frac{\partial D_N(x, y, t)}{\partial y} \\
&+ \frac{\sigma_{x,x}(x, y, t)}{2}\frac{\partial^2 D_N(x, y, t)}{\partial x^2} + \frac{\sigma_{y,y}(x, y, t)}{2}\frac{\partial^2 D_N(x, y, t)}{\partial y^2} \\
&+ \sigma_{x,y}(x, y, t)\frac{\partial^2 D_N(x, y, t)}{\partial x\,\partial y}
\end{aligned} \tag{B.15}$$

This can be extended to an arbitrary number of characters to give equations (3.9) and (3.10). Similar logic can also be used to derive equations when speciation, extinction and character transition functions also depend on the state of an additional binary character; let $\lambda_i(x)$ and $\mu_i(x)$ denote the speciation and extinction function in a continuous character state $x$ while in the binary state $i$, where $i = 0$ or $1$, $\phi_i(x, t)$ and $\sigma_i^2(x, t)$ be the directional and diffusion functions, while in state $i$, and $q_{ij}(x)$ be the rate of transition from binary state $i$ to $j$, which may depend on the continuous trait. The variables become $E_i(x, t)$ and $D_{iN}(x, t)$, which are the probability of extinction or of the lineage leading to node $N$ (respectively) for the lineage in binary state $i$ and continuous state $x$ at time $t$. Following logic similar to above and in Maddison et al. (2007) yields the equations

$$\begin{aligned}
\frac{\partial E_i(x, t)}{\partial t} ={}& \mu_i(x) + \lambda_i(x)E_i(x, t)^2 - (\lambda_i(x) + \mu_i(x) + q_{ij}(x))E_i(x, t) \\
&+ q_{ij}(x)E_j(x, t) + \phi_i(x, t)\frac{\partial E_i(x, t)}{\partial x} + \frac{\sigma_i^2(x, t)}{2}\frac{\partial^2 E_i(x, t)}{\partial x^2}
\end{aligned} \tag{B.16a}$$

$$\begin{aligned}
\frac{\partial D_{iN}(x, t)}{\partial t} ={}& 2\lambda_i(x)D_{iN}(x, t)E_i(x, t) - (\lambda_i(x) + \mu_i(x) + q_{ij}(x))D_{iN}(x, t) \\
&+ q_{ij}(x)D_{jN}(x, t) + \phi_i(x, t)\frac{\partial D_{iN}(x, t)}{\partial x} + \frac{\sigma_i^2(x, t)}{2}\frac{\partial^2 D_{iN}(x, t)}{\partial x^2}.
\end{aligned} \tag{B.16b}$$
Appendix C
Supplementary Information to Chapter 4

C.1 Tuning diversitree
Many of the models included in diversitree are computationally expensive. For example, all the state-dependent speciation and extinction models (abbreviated xxSSE) involve numerically solving systems of nonlinear differential equations for every branch in a tree (of which there are 2n − 2 for a tree with n species). This can lead to long computation times, but it is possible to tune the performance of diversitree by changing how these calculations are performed. To control the way calculations are carried out, every “make” function takes a control argument. When specified, this is a list of one or more tag/value pairs, such as
control=list(tag1=value1, tag2=value2, ...)
Below, for each class of models I list the possible tags and the values that they may take. As diversitree handles discrete and continuous traits in quite different ways, I describe the tuning options separately.

C.1.1 Discrete traits
backend This switches between different algorithms for solving the system of differential equations, and takes values "deSolve" (the default), "cvodes", and "CVODES" (quotes are required). For example, passing the argument
control=list(backend="CVODES")
to a make function will use the "CVODES" backend. Keys never require quotes, while values do unless logical or numeric.

The "deSolve" backend uses the LSODA algorithm from the R package deSolve (Soetaert et al., 2010) to solve the system of differential equations. This is a great general purpose ODE solver and is available on all R platforms. However, while the calculations for each branch are done entirely using compiled code, the calculations at nodes and substantial amounts of book-keeping are done in R code, which can be a bottleneck. See safe below for the other disadvantage of this backend.

The "cvodes" backend uses the CVODES algorithm (Cohen and Hindmarsh, 1996) from the sundials library of solvers (Hindmarsh et al., 2005)⁵. Like the deSolve backend, only the integration is done in compiled code, with node calculations and book-keeping done in R. The "cvodes" backend is about 20% slower than "deSolve" for BiSSE, and is not available on Windows.

The "CVODES" backend also uses the CVODES algorithm. However, all calculations are carried out in compiled code. This option is not available for all model types. In particular, ClaSSE, BiSSE-ness, split models, and time-dependent models are not yet implemented. This backend is about 5 times faster than "deSolve" for BiSSE, but is also not available on Windows.
safe When using the "deSolve" backend, diversitree uses non-exported compiled functions within the deSolve package⁶. deSolve’s function definitions may change between package versions, and calling them after such a change can cause R to crash. To avoid this, if the installed deSolve version is not known to work, diversitree will fall back on a “safe” version, in which only exported deSolve functions are used. This is about 5 times slower than safe=FALSE for BiSSE⁷. This approach can be forced by specifying safe=TRUE, though there are few cases where this is desired. This option has no effect for the "cvodes" and "CVODES" backends.

⁵ This actually uses the CVODE algorithm, not CVODES, as sensitivity calculations are not yet supported.
⁶ Specifically, we use the compiled LSODA wrapper function call_lsoda directly, rather than the R function lsoda.
unsafe This is the opposite of safe. When TRUE, even if the deSolve version is not known to work, the calculations will proceed anyway. This can crash R, or potentially produce incorrect results (though the latter is unlikely and can be checked by confirming with safe=TRUE). However, this option may be useful if the diversitree version lags behind the installed deSolve version. This is especially true for Windows users, for whom compilation of diversitree is often tricky. This option has no effect for the "cvodes" and "CVODES" backends.
tol This controls the degree of accuracy of the integration of each branch. By default a value of 1e-8 is used ($10^{-8}$). Decreasing this value increases the accuracy of the calculations, but increases the required running time. Informally, the LSODA algorithm estimates errors, $e$, for each variable $y$ and attempts to keep these errors smaller than tol × $y$ + tol. The CVODES algorithm uses a similar error target, $\sum_{i=1}^{n}\frac{1}{n}\left(e_i/(\mathrm{tol} \times y_i + \mathrm{tol})\right)^2$, where $e_i$ is the estimated error for the $i$th variable⁸. If the requested accuracy is not possible, the calculations will fail with an error. Values below sqrt(.Machine$double.eps), which is usually around 1e-8, are optimistic. Note that because of error propagation, the error for the entire likelihood calculation will be substantially higher.
eps At the end of each branch, diversitree checks the value of all “data” variables (in BiSSE-type models, these are the D variables). If any of these are smaller than eps, diversitree splits the branch in two and runs the calculations on each half, repeating this until the desired accuracy is reached. This is useful on very long branches where speciation and extinction rates are similar, as negative D values can be produced. The default is eps=0, which enforces positive values. Specifying eps=-Inf will disable this check. In theory, small positive numbers may help in some difficult-to-fit models (again, with speciation rates close to extinction rates). However, in models with rapid character evolution, variables may never take values much above zero, which means that this criterion may never be satisfied for a given eps > 0, and the calculations will fail.

⁷ The slowdown comes from looking up the memory address of the derivative function every time it is used. In diversitree, we create a wrapper that remembers the address as it will not change during an R session.
⁸ The form tol × y + tol in the above expressions arises because the algorithm actually uses rtol × y + atol where rtol is a relative tolerance and atol is an absolute tolerance. However, these are not separately tunable in the current diversitree version.
As an example, using a BiSSE likelihood function,
make.bisse(tree, states, control=list(backend="CVODES", tol=1e-5, eps=-Inf))
specifies that the "CVODES" backend will be used, that the error tolerance is increased to 0.00001, and that positive D values are no longer enforced. This will run faster than the default options, but the answers will be less accurate. The elements of the control argument are checked to make sure that the values make sense, but misspellings of element names are not checked. For example, specifying control=list(back="CVODES") will still use the default deSolve LSODA solver for the calculations, with no warning given.
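The branch-splitting behaviour controlled by eps can be illustrated with a toy problem. The Python sketch below is not diversitree's implementation: `euler_branch` is a deliberately crude one-step integrator for $dD/dt = -\text{rate} \times D$ that goes negative on long branches, and `integrate_with_split` and `max_depth` are hypothetical names introduced only to show the recursion.

```python
def euler_branch(d0, rate, length):
    """Toy one-step integrator for dD/dt = -rate * D over one branch.

    A single Euler step overshoots to a negative value when rate * length > 1,
    mimicking the negative D values that can arise on long branches.
    """
    return d0 * (1.0 - rate * length)


def integrate_with_split(d0, rate, length, eps=0.0, max_depth=20):
    """Integrate one branch; if the result is below eps, split it in two and retry."""
    d1 = euler_branch(d0, rate, length)
    if d1 >= eps:
        return d1
    if max_depth == 0:
        raise RuntimeError("could not satisfy eps by splitting the branch")
    half = length / 2.0
    mid = integrate_with_split(d0, rate, half, eps, max_depth - 1)
    return integrate_with_split(mid, rate, half, eps, max_depth - 1)


print(euler_branch(1.0, 1.0, 3.0))                             # unchecked result
print(integrate_with_split(1.0, 1.0, 3.0))                     # eps = 0
print(integrate_with_split(1.0, 1.0, 3.0, eps=float("-inf")))  # check disabled
```

With eps=0 the branch is halved until every piece yields a non-negative value. With the check disabled the negative value is returned unchanged. With a positive eps that the decaying variable can never reach, the recursion runs out of depth and fails, which is the failure mode described for models with rapid character evolution.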
Mk2 and Mkn. By default, likelihoods under these models are computed differently to the xxSSE models. Because there are no E variables to compute, it is feasible to generate a matrix $P_{ij}(\Delta t)$ that describes the probability of a transition from state $i$ to state $j$ over time $\Delta t$ simultaneously for all branches. This is not possible in BiSSE and other xxSSE models as the starting and ending times matter (rather than simply the elapsed time) because of the influence of $E(t)$. By default, diversitree computes this matrix for all branch lengths in the tree and then uses this matrix to quickly compute the likelihood. Almost all of these calculations are in compiled code and should be fairly quick.
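For a two-state model the transition matrix depends only on the elapsed time and has a well-known closed form, so it can be tabulated once per distinct branch length. The Python sketch below illustrates that idea under stated assumptions; the function names and the example branch lengths are hypothetical, and this is not the compiled code that diversitree uses.

```python
import math


def mk2_transition_matrix(q01, q10, t):
    """Closed-form P(t) for a two-state continuous-time Markov chain.

    q01 is the rate of change from state 0 to 1, q10 the reverse rate.
    Returns [[P00, P01], [P10, P11]].
    """
    s = q01 + q10
    if s == 0.0:
        return [[1.0, 0.0], [0.0, 1.0]]
    decay = math.exp(-s * t)
    p01 = q01 / s * (1.0 - decay)
    p10 = q10 / s * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def matmul2(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Tabulate P once per distinct branch length (example lengths are made up).
branch_lengths = [0.3, 0.7, 0.3, 1.0]
p_by_length = {t: mk2_transition_matrix(0.2, 0.05, t) for t in set(branch_lengths)}
print(p_by_length[1.0])
```

Because only elapsed time matters, `matmul2(P(0.3), P(0.7))` agrees with `P(1.0)`, which is the property that fails for xxSSE models once $E(t)$ enters the equations.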
For mk2, the $P_{ij}(\Delta t)$ matrix can be computed exactly, and all of the above control options are ignored. For mkn, the calculation of $P_{ij}(\Delta t)$ still requires numerical integration, and the options backend, safe, unsafe, and tol in the previous section are available. However, neither backend="CVODES" nor eps makes sense in this case; the former will cause an error and the latter will be silently ignored. As the number of states increases, computing the transition probability matrix becomes expensive. This is particularly the case for make.mkn.multitrait where the number of possible states grows exponentially in the number of traits; $n$ binary traits require $2^n$ possible states, which is $2^{2n}$ elements in $P_{ij}(\Delta t)$. In such cases, there is a control option that changes the algorithm:
method Specifying method="ode" uses a branch-by-branch approach that avoids computing the transition probability matrix. When this is specified, the algorithm used is exactly the same as for BiSSE-type models, and all the control parameters in the previous section are available. The default is method="exp" (the name here derives from the approach used by “ape” to carry out the same calculations through matrix exponentiation; see Moler and Loan, 2003).

C.1.2 Continuous traits

QuaSSE — The options for tuning QuaSSE are not shared with any other model. Unfortunately, no matter what control parameters are specified, QuaSSE will be fairly slow compared with other models in diversitree.
method One of "fftC" or "fftR" to switch between C (relatively fast) and R (extremely slow) back-ends for the integration. Both use non-adaptive fast Fourier transform (FFT) based convolutions to solve the partial differential equations (see FitzJohn, 2010). Specifying "fftC" uses the “Fastest Fourier Transform in the West” library (Frigo and Johnson, 2005), while specifying "fftR" uses R’s built-in fft function. Specifying "mol" will use an experimental method-of-lines approach.
dt.max Maximum time step to use for the integration. By default, this will be set to 1/1000 of the tree depth. Smaller values will slow down calculations, but improve accuracy.
nx The number of bins into which the character space is divided (the default is 1024). Larger values will be slower and more accurate. For the "fftC" integration method, this must be an integer power of 2 (512, 2048, etc.).
r Scaling factor that multiplies nx for a “high resolution” section at the tips of the tree (the default is 4, giving a high resolution character space divided into 4,096 bins). This helps improve accuracy and makes it possible to allow for narrow initial probability distributions, which flatten out as time progresses towards the root. Larger values will be slower and more accurate. For the "fftC" integration method, this must be a small power of 2 (e.g., 2, 4, 8), so that nx*r is also a power of 2.
tc This specifies when in the tree to switch from the high resolution section specified by r to the lower resolution. Zero corresponds to the present, with larger numbers moving towards the root. By default, this happens at 10% of the tree depth (so the default is 0.1 * max(branching.times(tree))). Smaller values will be faster, but less accurate; tc must be specified in time units.
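The defaults and power-of-two constraints described above can be collected in one place. The following base-R sketch is illustrative only: quasse.control.sketch and is.power.of.two are hypothetical helpers, not diversitree functions, and the tree depth is passed in directly rather than computed from a tree.

```r
# Hypothetical helper: TRUE if n is a positive whole number that is a power of 2.
is.power.of.two <- function(n) {
  n >= 1 && n == round(n) &&
    bitwAnd(as.integer(n), as.integer(n) - 1L) == 0L
}

# Hypothetical helper: build a control list using the defaults described
# in the text, checking the constraints that apply to method="fftC".
quasse.control.sketch <- function(tree.depth, nx = 1024, r = 4) {
  stopifnot(is.power.of.two(nx), is.power.of.two(nx * r))
  list(dt.max = tree.depth / 1000,  # 1/1000 of the tree depth
       nx     = nx,                 # bins in the low resolution section
       r      = r,                  # multiplier for the high resolution section
       tc     = 0.1 * tree.depth)   # switch resolution at 10% of tree depth
}

ctrl <- quasse.control.sketch(tree.depth = 50)
n.bins.high <- ctrl$nx * ctrl$r
```

With the defaults, the high resolution section has nx*r = 4,096 bins; a value such as nx=1000 is rejected because it is not a power of 2.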
Brownian motion and Ornstein-Uhlenbeck — These models take two possible control parameters:
method This switches between two different algorithms for computing likelihoods. The default, method="vcv", uses a variance-covariance matrix approach, as implemented in the geiger and ape packages (Paradis et al., 2004; Harmon et al., 2008). This is very fast for small trees. For large trees (hundreds of species), the matrix calculations make computing the likelihood very expensive. In this case, specifying method="pruning" uses an algorithm more like that used in BiSSE and QuaSSE likelihood calculations, where each branch is treated separately. This algorithm is similar to that described in Felsenstein (1973); however, I have not seen this version described elsewhere and so describe it in section C.2.
backend When method="pruning" is selected, this controls which of two methods of calculation is used; the default, backend="R", computes the likelihood entirely within R code, which should be fairly robust. Specifying backend="C" uses the same algorithm, but entirely implemented in C, which will be faster, but is less extensively tested.

C.1.3 “Split” models

All models that allow for different regions of the tree to have different rates (such as make.bisse.split and make.quasse.split) have one additional control parameter:

caching.branches When TRUE, this will try to minimise recalculating the likelihood for regions of the tree where parameters have not changed. Every function evaluation, the values at the beginning and end of each branch, plus the parameters, are stored. If in the next evaluation both the parameters and initial conditions are unchanged, the previous base values are returned. Otherwise the branch is calculated as usual. This is useful during MCMC updates, or with many ML search algorithms, where only a fraction of parameters are changed at once. In particular, the default MCMC algorithm (slice sampling; Neal, 2003) updates each parameter separately, so without this set most calculation time is wasted. For computationally intensive models, such as QuaSSE, this can make MCMC analyses with split models take the same time to take one step (updating all parameters) as non-split models, despite the increase in the dimensionality of parameter space. For less intensive models, such as BiSSE, the overhead of the caching process can erase some of the time savings.
C.2 A faster algorithm for BM & OU likelihood calculations
Here, I describe an algorithm for computing likelihoods under Brownian motion and Ornstein-Uhlenbeck that can be considerably faster than conventional approaches. The algorithm here uses the pruning algorithm of Felsenstein (1981), and is closely related to the algorithm presented in Felsenstein (1973). It differs mostly in presentation, using the same approach as the xxSSE models. The conventional algorithm, as implemented in fitContinuous in the geiger package (Harmon et al., 2008) and the REML algorithm in ace in the ape package (Paradis et al., 2004), uses the phylogenetic variance-covariance matrix to compute the likelihood of a rate of diffusion by estimating the probability of the observed trait data under a multivariate-normal distribution. In what follows, I will refer to this as the “VCV” algorithm due to the central role of the phylogenetic variance-covariance matrix.

C.2.1 Brownian motion
Using the notation of QuaSSE (FitzJohn, 2010), let $D_N(x, t)$ be the probability of the observed trait distribution for some lineage $N$, given that it is in state $x$ at time $t$. Unlike QuaSSE, this does not account for the distribution of branching times, which are assumed to be unaffected by the trait state. Assuming that $D$ will always be Gaussian, it can be characterised by a vector of three elements: a mean $\mu_N$, variance $V_N$, and a normalising factor $1/z_N$, such that

\[ D_N(x, t) = \frac{1}{z_N} \frac{1}{\sqrt{2\pi V_N}} \exp\left(-\frac{(x - \mu_N)^2}{2 V_N}\right). \tag{C.1} \]
For example, the initial distribution at a tip, $D_N(x, 0)$, will have $\mu_N$ as the mean extant trait value. If the trait value is known without error, then $V_N = 0$ and $D_N(x, 0)$ is a delta function at $\mu_N$. Alternatively, $V_N$ can be set greater than zero to capture measurement error or uncertainty in the mean $\mu_N$. The normalising factor, $1/z_N$, will be 1 so that $\int D_N(x, 0)\,dx = 1$.

Consider a branch that has a tip at time $t$ and a base at time $t + \Delta t$ (further back in time than $t$), so that the branch has length $\Delta t$. Brownian motion has a normal transition probability density function; given a rate of diffusion of $\sigma^2$, the probability density of state $y$ at a branch tip (time $t$) given a state of $x$ at the branch base (time $t + \Delta t$) is

\[ g(y, t \mid x, t + \Delta t) = \frac{1}{\sqrt{2\pi \Delta t \sigma^2}} \exp\left(-\frac{(x - y)^2}{2 \Delta t \sigma^2}\right). \tag{C.2} \]
Given $D_N(x, t)$ at the tip of the branch, we can compute $D_N(x, t + \Delta t)$ at the base of that branch as

\[ D_N(x, t + \Delta t) = \int_{-\infty}^{\infty} g(y, t \mid x, t + \Delta t)\, D_N(y, t)\, dy, \tag{C.3} \]

which is the probability that we move from $x$ to $y$ multiplied by the probability of the data at $y$, integrated over possible values of $y$ (this is the convolution of $D_N$ and $g$). This has the solution

\[ D_N(x, t + \Delta t) = \frac{1}{z_N} \frac{1}{\sqrt{2\pi (V_N + \sigma^2 \Delta t)}} \exp\left(-\frac{(x - \mu_N)^2}{2 (V_N + \sigma^2 \Delta t)}\right), \tag{C.4} \]
which is a Gaussian with a mean $\mu_N$, variance $V_N + \sigma^2 \Delta t$, and normalising factor $1/z_N$. Note that only the variance is changed by this calculation, and it is always increased.

At the node $N'$ that is the parent of the lineages leading to nodes $N$ and $M$, we have two Gaussian functions with means $\mu_N$ and $\mu_M$, variances $V_N$ and $V_M$, and normalising factors $1/z_N$ and $1/z_M$. Both daughter lineages share the same state as their parent immediately following speciation. Therefore, at the node the probability density of the data given that we are in some state $x$ is

\[ D_{N'}(x, t) = D_N(x, t)\, D_M(x, t), \tag{C.5} \]

or

\[ D_{N'}(x, t) = \frac{1}{z_N z_M} \frac{\exp\left(-\frac{(\mu_N - \mu_M)^2}{2 (V_N + V_M)}\right)}{\sqrt{2\pi (V_N + V_M)}} \frac{1}{\sqrt{2\pi \frac{V_N V_M}{V_N + V_M}}} \exp\left(-\frac{\left(x - \frac{\mu_N V_M + \mu_M V_N}{V_N + V_M}\right)^2}{2 \frac{V_N V_M}{V_N + V_M}}\right), \tag{C.6} \]
which is Gaussian with mean $\frac{\mu_N V_M + \mu_M V_N}{V_N + V_M}$, variance $\frac{V_N V_M}{V_N + V_M}$, and normalising factor

\[ \frac{1}{z_N z_M} \times \frac{\exp\left(-\frac{(\mu_N - \mu_M)^2}{2 (V_N + V_M)}\right)}{\sqrt{2\pi (V_N + V_M)}}. \]

See also equations (7)–(15) in Felsenstein (1973), which involve similar terms.

With these two operations (moving down a branch and combining at nodes) we can move to the root of the tree, where we have a distribution $D_R(x, t_R)$, which is Gaussian with mean $\mu_R$, variance $V_R$, and normalising factor $1/z_R$. Diversitree implements four possible root treatments. First, following Pagel (1994) we can specify a flat prior on the root state, computing the likelihood as
\[ \int_{-\infty}^{\infty} D_R(x, t_R)\, dx = \frac{1}{z_R}. \tag{C.7} \]
Second, following FitzJohn (2010), we can weight the $D$ function by the probability of observing the data:

\[ \int_{-\infty}^{\infty} D_R(x, t_R) \frac{D_R(x, t_R)}{\int_{-\infty}^{\infty} D_R(y, t_R)\, dy}\, dx = \frac{1}{z_R} \frac{1}{2 \sqrt{\pi V_R}}, \tag{C.8} \]

where the second fraction on the right hand side comes from integrating the product of two Gaussians. Third, we can evaluate $D_R(x, t_R)$ at its ML value. This is $x = \mu_R$, giving

\[ \max_x \left[ D_R(x, t_R) \right] = \frac{1}{z_R} \frac{1}{\sqrt{2\pi V_R}}. \tag{C.9} \]
Finally, the user can supply a character state at the root, $x_R$, at which $D_R(x, t_R)$ will be evaluated. At present this is assumed to be known without error, but in principle a distribution could be used here.
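The branch and node operations can be sketched in a few lines of base R. This is an illustrative re-implementation on a hard-coded three-species tree, ((A:1,B:1):1,C:2), not the code used in diversitree; the function names, tip values, and rate are all made up for the example. It also computes the multivariate-normal likelihood directly from the variance-covariance matrix, with the root state at its generalised least squares estimate, so the two approaches can be compared.

```r
# A Gaussian D function is stored as (mu, V, lz), where lz = log(1/z).
tip <- function(x, V = 0) list(mu = x, V = V, lz = 0)

# Move from the tip to the base of a branch (equation C.4):
# only the variance changes.
branch <- function(d, len, s2) {
  d$V <- d$V + s2 * len
  d
}

# Combine two daughter lineages at a node (equation C.6).
node <- function(dN, dM) {
  Vs <- dN$V + dM$V
  list(mu = (dN$mu * dM$V + dM$mu * dN$V) / Vs,
       V  = dN$V * dM$V / Vs,
       lz = dN$lz + dM$lz - (dN$mu - dM$mu)^2 / (2 * Vs) -
            0.5 * log(2 * pi * Vs))
}

s2 <- 0.5                      # diffusion rate (arbitrary)
x  <- c(A = 1, B = 2, C = 4)   # tip trait values (arbitrary)

AB   <- node(branch(tip(x[["A"]]), 1, s2), branch(tip(x[["B"]]), 1, s2))
root <- node(branch(AB, 1, s2), branch(tip(x[["C"]]), 2, s2))

lnL.flat <- root$lz                                # equation C.7
lnL.ml   <- root$lz - 0.5 * log(2 * pi * root$V)   # equation C.9

# Direct calculation from the phylogenetic variance-covariance matrix.
C      <- matrix(c(2, 1, 0,
                   1, 2, 0,
                   0, 0, 2), 3, 3)
Sigma  <- s2 * C
Sinv   <- solve(Sigma)
mu.hat <- sum(Sinv %*% x) / sum(Sinv)              # GLS root estimate
res    <- x - mu.hat
lnL.vcv <- -0.5 * (drop(t(res) %*% Sinv %*% res) +
                   log(det(Sigma)) + length(x) * log(2 * pi))
```

Working on the log scale (lz rather than 1/z) avoids underflow on large trees. Tips with V = 0 are only handled correctly here because every branch has positive length, so variances are positive by the time two lineages are combined.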
When the ML approach is used (equation C.9), this algorithm produces identical likelihoods to the VCV algorithm. I have proven this statement for a three-species tree and confirmed it numerically for some larger trees.

C.2.2 Ornstein-Uhlenbeck

For an Ornstein-Uhlenbeck process (OU), the above algorithm can again be used after altering the along-branch calculations (i.e., equations C.2 and C.4). In addition to the diffusion parameter, $\sigma^2$, let $\theta$ be the “optimum” character state, towards which traits are pulled, and let $\alpha$ be the strength of the restoring force bringing traits back to $\theta$. The transition probability density function for OU (the probability density of state $y$ at time $t$ given we were in state $x$ at time $t + \Delta t$) is normal with mean $e^{-\alpha \Delta t}(x - \theta) + \theta$ and variance $\frac{\sigma^2}{2\alpha}(1 - e^{-2\alpha \Delta t})$, such that

\[ g(y, t \mid x, t + \Delta t) = \frac{1}{\sqrt{\pi \frac{\sigma^2}{\alpha} (1 - e^{-2\alpha \Delta t})}} \exp\left(-\frac{\left(e^{-\alpha \Delta t}(x - \theta) + \theta - y\right)^2}{\frac{\sigma^2}{\alpha} (1 - e^{-2\alpha \Delta t})}\right) \tag{C.10} \]
(Karlin and Taylor, 1981). The solution to the convolution (C.3) using equation (C.10) as the transition probability density function gives a Gaussian with mean $e^{\alpha \Delta t}(\mu_N - \theta) + \theta$, variance $\frac{(e^{2\alpha \Delta t} - 1)\sigma^2}{2\alpha} + e^{2\alpha \Delta t} V_N$, and normalising factor $e^{\alpha \Delta t}/z_N$.

The same treatment at nodes applies (equation C.6), and the root calculations in equations (C.7)–(C.9) can be used. However, the “flat prior” root treatment (equation C.7) appears to give likelihoods that are not directly comparable to BM in a likelihood ratio test, with a hugely inflated type I error.

C.2.3 Performance

The above calculations can often be much faster than the VCV-based calculations. I generated random Yule trees using diversitree’s tree.yule function with a speciation rate of 0.1 up to 16, 32, 64, ..., 4,096 species. For each tree, I simulated traits under Brownian motion using diversitree’s sim.character function with a diffusion parameter, $\sigma^2$, of 0.1. Note that both the rate of speciation and the diffusion rate of character state evolution are arbitrary, as rescaling the edge lengths or trait distributions is equivalent to changing the parameters. To measure the time taken to compute likelihoods, for each tree I then computed the likelihood of the true diffusion parameter repeatedly until the total evaluation time exceeded 0.5 s (this ranged from a single evaluation to several thousand evaluations). I repeated this on 10 different simulated trees and traits.

I tested three algorithms. First, I used the VCV algorithm that is based on the code in the geiger package. I modified this algorithm slightly to avoid inverting the phylogenetic covariance matrix and other matrix calculations for every likelihood calculation in the case where “measurement error” is assumed to be zero. This is the default algorithm used by make.bm, corresponding to passing in control=list(method="vcv"). Second, I used the pruning algorithm above, entirely coded in R. This is the algorithm selected by passing control=list(method="pruning", backend="R") to make.bm. Third, I used the pruning algorithm, coded in C. This is selected by specifying backend="C" rather than "R" in the control list above. This third algorithm should be most directly comparable to the VCV algorithm in terms of speed, as both involve primarily compiled code. To avoid the optimisation I made in the VCV algorithm, and to measure performance when measurement error is included, I repeated the above, but included very small measurement errors ($10^{-7}$). Timings were carried out on a 2008 Mac Pro (2.8 GHz Intel Xeon processor).

Where measurement error is zero, the VCV algorithm outperformed the pruning/R algorithm for all but the largest trees (4,096 species). However, the required computational time of the VCV grew very quickly at this point (figure C.1, solid lines). Including “measurement errors” (initial $V_N > 0$), the VCV algorithm requires significant extra time to run. As a result, pruning/R (which was unaffected by the addition of errors) outperformed the VCV beyond around 128 species (figure C.1, dashed lines). The pruning/C algorithm was the fastest in all cases, and was only marginally affected by the addition of measurement errors. For very large trees (4,096 species), the difference in running times was very large; comparing VCV with pruning/C, there was a 700-fold difference without measurement errors and over 150,000-fold difference including measurement errors.

The required running time in the number of species $n$, for the three algorithms at 4,096 species, grows as approximately $O(n^3)$ and $O(n^{2.3})$ for VCV with and without measurement errors respectively, $O(n)$ for pruning/R, and $O(n^{0.9})$ for pruning/C. The sub-linear growth of the pruning/C algorithm is due to the impact of fixed computational costs⁹, and should approach $O(n)$ for larger trees. These performance timings do not include the time to compute and invert the phylogenetic variance-covariance matrix for the VCV algorithm, which would currently represent a large fraction of the time in carrying out an ML search. For example, on a 4,096 species tree, creating a likelihood function (including creating and inverting the phylogenetic covariance matrix, and computing the determinant of the inverse) takes 124 s with the VCV algorithm compared with 0.23 s with the pruning algorithm.

The tree sizes used here are large, and the performance differences on modest sized trees should be fairly small (figure C.1). However, trees larger than the largest size used here have increasingly become available (e.g., Bininda-Emonds et al., 2007; Smith and Donoghue, 2008), which exceed the ability of the VCV algorithm to run. Extending this pruning algorithm approach to relax the model may also be easier than within the conventional VCV framework. The code for the timing tests is available on the diversitree github site¹⁰.

⁹ Simply checking that the diffusion parameter is non-negative takes 12% of the calculation time for a 256 species tree.
¹⁰ http://github.com/richfitz/diversitree/tree/pub/bm
[Figure C.1: log-log plot of elapsed time (s) against number of taxa (16 to 4,096) for the vcv, pruning/R, and pruning/C algorithms.]

Figure C.1: Mean running times for each likelihood function evaluation under the three algorithms. Solid lines assume that species traits are known without error, while dashed lines include “measurement error”. Note the logarithmic scale of both axes.
C.3 MuSSE & multitrait diversification in primates

This is a full version of the analysis of social structure and mating system in primates. However, it is not a reference and the reader is directed to the diversitree help pages for further information (in particular the help page ?make.musse.multitrait). This section is a “Sweave” document: a mix of R and LaTeX code (see Leisch, 2002). It is possible to recompile the document, which runs the R code and regenerates this output. For instructions on how to do this, please see the instructions at the top of the primates.Rnw source file, available on the diversitree github site¹¹. To run the examples, or recompile this file, you will need two data files: “primates-100.nex” containing the phylogenetic trees and “primates-social.csv” containing the trait data. These are also available on the diversitree github site¹². I assume that these files are in the directory “data”.
> library(diversitree)
First, load the distribution of trees. These were generated by Tyler Kuhn, using the approach in Kuhn et al. (2011). Here, we will use just the first tree to demonstrate the methods.

> trees <- read.nexus("data/primates-100.nex")
> tree <- trees[[1]]

Next, load the species trait data; the first line below creates a data.frame with the species names as row names. The two data columns are “M” and “S”, which are TRUE when the species is monogamous and social, respectively. Alternatively, these could be integer values with “0” and “1” being equivalent to FALSE and TRUE, respectively (I will refer to states 0 and 1 below).

> dat <- read.csv("data/primates-social.csv", row.names=1)
> head(dat)
                                M     S
Allenopithecus_nigroviridis    NA FALSE
Allocebus_trichotis          TRUE  TRUE
Alouatta_belzebul              NA FALSE
Alouatta_caraya                NA FALSE
Alouatta_coibensis          FALSE FALSE
Alouatta_fusca                 NA FALSE

¹¹ https://github.com/richfitz/diversitree/tree/master/pub/example/
¹² https://github.com/richfitz/diversitree/tree/master/pub/example/data, or clone the diversitree github repository and run the analysis from the diversitree/pub/example directory.

Note that some of the species lack state information. The distribution of traits on the phylogeny is shown in figure C.2.

Start by creating a simple model, in which the speciation and extinction rates do not depend on the character state, and the two traits have forward (0 → 1) and backward (1 → 0) transition rates that do not depend on the state of the other trait.
> lik.0 <- make.musse.multitrait(tree, dat, depth=0)
All diversitree likelihood functions take as their first argument a vector of parameters. To get the vector of names for the parameters, use the argnames function:

> argnames(lik.0)
[1] "lambda0" "mu0"     "qM01.0"  "qM10.0"  "qS01.0"  "qS10.0"
This shows the six parameters: the speciation rate (lambda0), extinction rate (mu0), and four transition rates (e.g., qM01.0 is the rate of transition of the breeding system from non-monogamous to monogamous). The “0” after all parameter names indicates that these are intercepts.

[Figure C.2: circular primate phylogeny with tips grouped by genus. Legend: Monogamous: no, yes; Solitary: no, yes.]

> cols <- list(M=c("#a6cee3", "#1f78b4"), S=c("#fdbf6f", "#ff7f00"))
> genus <- sub("_.+$", "", tree$tip.label) # extract genus name
> trait.plot(tree, dat, cols, lab=c("Monogamous", "Solitary"), str=c("no", "yes"), genus)

Figure C.2: Primate phylogeny, showing states of the two traits considered here: monogamy (blue, inner circle) and solitariness (orange, outer circle). Species are grouped by genus (or groups of genera in the case of polyphyletic groupings).

To perform a maximum likelihood (ML) analysis, we search for the parameter vector with the highest likelihood. To do this, diversitree uses a numerical optimisation routine (by default, subplex; Rowan, 1990); most algorithms start at some point in parameter space and work “uphill”, finding parameters that improve the likelihood until no further improvement is possible. To start this search, we therefore need a starting parameter vector from which the ML point is reachable. It is not possible in general to prove that a point will lead to the ML point, or that the best point found really is the ML point. However, starting from state-independent speciation and extinction rates, combined with reasonable guesses for the character state transition rates, appears to converge with reasonable success:
> p.0 <- c(starting.point.bd(tree), rep(.1, 4))
> names(p.0) <- argnames(lik.0)

It is good practice to confirm that this point has finite likelihood:

> lik.0(p.0)
[1] -863.5594
We can then carry out the ML search, with find.mle:
> fit.0 <- find.mle(lik.0, p.0)
This returns an object of class “fit.mle”, which has a lnLik element with the log-likelihood at the ML point, and a par element with the ML parameter vector. This object is described in more detail in the help page ?find.mle. As with other model fits, the coef function can access parameters:

> fit.0$lnLik
[1] -786.3427
> round(coef(fit.0), 4)
lambda0     mu0  qM01.0  qM10.0  qS01.0  qS10.0
 0.1912  0.1110  0.0251  0.0259  0.0009  0.0163

Now, we expand this model to allow state-dependent diversification. To make a likelihood function that includes “main effects” of the two traits for speciation and extinction, but leaves the character state change parameters the same, we use depth=c(1, 1, 0).

> lik.1 <- make.musse.multitrait(tree, dat, depth=c(1, 1, 0))
> argnames(lik.1)
 [1] "lambda0" "lambdaM" "lambdaS" "mu0"     "muM"     "muS"
 [7] "qM01.0"  "qM10.0"  "qS01.0"  "qS10.0"
(Specifying depth=c(1,1,1), or equivalently depth=1, would also introduce main effects for the transition rates, which would then mean we would have to determine why lik.1 might fit better than lik.0: the additional degrees of freedom from state dependent diversification or from correlated character evolution.)

To find the ML point for this model, we again need a starting parameter vector. The model lik.0 is a special case of model lik.1, so it makes sense to start from the ML point found in fit.0. To do this, expand the parameter vector of the model in lik.0 to include main effects, but set them equal to zero:

> p.1 <- rep(0, length(argnames(lik.1)))
> names(p.1) <- argnames(lik.1)
> p.1[names(coef(fit.0))] <- coef(fit.0)
> round(p.1, 4)
lambda0 lambdaM lambdaS     mu0     muM     muS  qM01.0  qM10.0
 0.1912  0.0000  0.0000  0.1110  0.0000  0.0000  0.0251  0.0259
 qS01.0  qS10.0
 0.0009  0.0163

(compare this parameter vector to coef(fit.0), above). The parameter vector p.1 must have the same likelihood as the previous ML fit:

> lik.0(coef(fit.0))
[1] -786.3427
> lik.1(p.1)
[1] -786.3427

Find the ML point for the model that includes main effects:
> fit.1 <- find.mle(lik.1, p.1)
This model fits substantially better than the state-independent model, with a likelihood improvement of 12.4:

> fit.1$lnLik - fit.0$lnLik
[1] 12.36941

With a difference of 4 parameters, we can compare twice this value to a $\chi^2_4$ distribution and see that the improvement is statistically significant.

> 1 - pchisq(2*(fit.1$lnLik - fit.0$lnLik), 4)
[1] 5.677339e-05

The anova function does these likelihood ratio tests automatically, also reporting AIC values:

> anova(fit.1, noSDD=fit.0)
      Df   lnLik    AIC  ChiSq Pr(>|Chi|)
full  10 -773.97 1568.0
noSDD  6 -786.34 1584.7 24.739  5.677e-05
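The quantities in this table can be reproduced by hand from the reported log-likelihoods and parameter counts. The following base-R sketch uses only the rounded values printed above, so it agrees with the table to the precision shown; it does not require the fitted objects.

```r
# Log-likelihoods and numbers of free parameters, as reported above.
lnL  <- c(full = -773.97, noSDD = -786.34)
npar <- c(full = 10, noSDD = 6)

# AIC = 2k - 2 lnL
aic <- 2 * npar - 2 * lnL

# Likelihood ratio test: twice the log-likelihood difference, compared
# against a chi-squared distribution with df = difference in parameters.
chisq <- 2 * 12.36941
df    <- unname(npar["full"] - npar["noSDD"])
p     <- pchisq(chisq, df, lower.tail = FALSE)
```

Using lower.tail=FALSE is equivalent to 1 - pchisq(...) here, but keeps more precision when p-values are very small.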
We can expand the model further to include interaction terms; is a combination of mating system and sociality associated with elevated speciation or extinction?
> lik.2 <- make.musse.multitrait(tree, dat, depth=c(2, 2, 0))
> argnames(lik.2)
 [1] "lambda0"  "lambdaM"  "lambdaS"  "lambdaMS" "mu0"      "muM"
 [7] "muS"      "muMS"     "qM01.0"   "qM10.0"   "qS01.0"   "qS10.0"

Now fit the model, starting from the ML point found in fit.1, expanded to include interaction terms (set to zero) for both speciation and extinction rates:

> p.2 <- rep(0, length(argnames(lik.2)))
> names(p.2) <- argnames(lik.2)
> p.2[names(coef(fit.1))] <- coef(fit.1)
> fit.2 <- find.mle(lik.2, p.2)

Comparing this against the no-interaction fit, there is no significant improvement in model fit:

> anova(fit.2, noInteraction=fit.1)
              Df   lnLik    AIC   ChiSq Pr(>|Chi|)
full          12 -773.73 1571.5
noInteraction 10 -773.97 1568.0 0.49143     0.7821

The improvement in fit between fit.1 and fit.0 could be due to either of the monogamy or sociality traits (or both). We can determine the contribution of each by constructing models that omit main effects of each trait. The constrain function can be used to simplify models by removing parameters. Parameters can either be set to be equal to a different parameter or to a constant. For example, to remove the main effect of the breeding system trait on speciation from the lik.1 model, we can enter:
> lik.1M <- constrain(lik.1, lambdaM ~ 0)
Notice that lambdaM is no longer present in the argnames of this likelihood function:
> argnames(lik.1M)
[1] "lambda0" "lambdaS" "mu0"     "muM"     "muS"     "qM01.0"
[7] "qM10.0"  "qS01.0"  "qS10.0"
Similarly, for sociality:
> lik.1S <- constrain(lik.1, lambdaS ~ 0)
These models can then be fit as before (adjusting the starting point to account for the reduction in the number of parameters):

> fit.1M <- find.mle(lik.1M, p.1[argnames(lik.1M)])
> fit.1S <- find.mle(lik.1S, p.1[argnames(lik.1S)])

There is a significant drop in likelihood when the breeding system (monogamy/non-monogamy) speciation main effect is dropped from the model, but the social/nonsocial trait association is marginally non-significant (both comparisons here are made against fit.1):

> anova(fit.1, noM=fit.1M, noS=fit.1S)
     Df   lnLik    AIC  ChiSq Pr(>|Chi|)
full 10 -773.97 1568.0
noM   9 -778.24 1574.5 8.5236   0.003506
noS   9 -775.52 1569.0 3.1031   0.078145

Alternatively, we might use Markov chain Monte Carlo (MCMC) to sample from the posterior distribution of the lambdaM and lambdaS values. This procedure requires a prior probability distribution for the parameters. Here, I use a prior on the actual rates in the model, not the multitrait parametrisation (partly because it is easier to constrain the former to being positive, as negative speciation and extinction rates are not allowed, and partly because I do not have a strong prior belief about the form of the main effect parameters). Given a multi-trait parametrisation, the underlying rate parameters can be found by passing in the argument pars.only=TRUE to the likelihood function, which returns the underlying parameters without computing the likelihood:

> round(coef(fit.1), 3)
lambda0 lambdaM lambdaS     mu0     muM     muS  qM01.0  qM10.0
  0.224  -0.095  -0.076   0.061  -0.061   0.000   0.020   0.030
 qS01.0  qS10.0
  0.000   0.017
> round(lik.1(coef(fit.1), pars.only=TRUE), 3)
lambda00 lambda10 lambda01 lambda11     mu00     mu10     mu01
   0.224    0.128    0.148    0.052    0.061    0.000    0.061
    mu11   q00.10   q00.01   q00.11   q10.00   q10.01   q10.11
   0.000    0.020    0.000    0.000    0.030    0.000    0.000
  q01.00   q01.10   q01.11   q11.00   q11.10   q11.01
   0.017    0.000    0.020    0.000    0.017    0.030
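The output above is consistent with an additive translation: each state's rate is the intercept plus the main effect of every trait that is in state 1. The following base-R sketch reproduces the speciation and extinction rates from the rounded coefficients; it is an illustration of the pattern in the output, not diversitree's internal code, and small differences in the third decimal place arise from rounding.

```r
# Rounded coefficients from the multitrait parametrisation, as printed above.
coefs <- c(lambda0 = 0.224, lambdaM = -0.095, lambdaS = -0.076,
           mu0 = 0.061, muM = -0.061, muS = 0.000)

# The four combined states, in the order 00, 10, 01, 11
# (first digit: M, second digit: S).
states <- expand.grid(M = 0:1, S = 0:1)

# Intercept plus main effects (no interaction terms in this model).
lambda <- coefs[["lambda0"]] + coefs[["lambdaM"]] * states$M +
          coefs[["lambdaS"]] * states$S
mu     <- coefs[["mu0"]] + coefs[["muM"]] * states$M +
          coefs[["muS"]] * states$S

names(lambda) <- paste0("lambda", states$M, states$S)
names(mu)     <- paste0("mu", states$M, states$S)
```

This makes clear why a prior on the underlying rates is convenient: a main effect may be negative, but each resulting rate must not be.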
To specify an exponential prior with a mean set to twice the state-independent diversification rate:

> r <- p.0[[1]] - p.0[[2]]
> prior1 <- make.prior.exponential(1/(2*r))

Using the translation above, we can make a function that takes as arguments the multitrait parametrisation and returns the prior probability density:
> prior <- function(pars) prior1(lik.1(pars, pars.only=TRUE))
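To see what such a prior does, here is a minimal base-R sketch of an exponential prior applied independently to every rate. It assumes the prior is evaluated as a summed log density, which is the usual convention for MCMC but is an assumption here; make.prior.exp.sketch is a hypothetical stand-in, and the value of r is made up.

```r
# Hypothetical stand-in for an exponential prior: returns a function
# giving the summed log prior density of a vector of rates.
make.prior.exp.sketch <- function(rate) {
  function(pars) sum(dexp(pars, rate, log = TRUE))
}

r <- 0.08                                  # made-up diversification rate
rate <- 1 / (2 * r)                        # so the prior mean is 2 * r
prior.sketch <- make.prior.exp.sketch(rate)

lp.ok  <- prior.sketch(c(0.1, 0.2))        # finite log density
lp.neg <- prior.sketch(c(0.1, -0.2))       # a negative rate has zero density
prior.mean <- 1 / rate
```

Because any negative rate has zero prior density, placing the prior on the underlying rates automatically excludes combinations of intercepts and main effects that would imply negative speciation or extinction rates.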
Running the MCMC for 10,000 steps (this takes several hours, and by default will print the parameters visited on each step and their posterior probability):

> samples <- mcmc(lik.1, p.1, nsteps=10000, w=0.5, prior=prior)

The posterior distributions for the main effects of the speciation rates (lambdaM and lambdaS) are concentrated below zero, with the 95% credibility interval below zero (figure 4.3). Therefore, we can conclude both traits have a negative effect on speciation.
For completeness, I will show how a similar analysis would proceed if we treat each trait separately with BiSSE. First, we will need two named state vectors — one for each trait:
> st.m <- dat$M
> names(st.m) <- rownames(dat)
> st.s <- dat$S
> names(st.s) <- rownames(dat)

Then we build likelihood functions (using make.bisse).

> lik.m <- make.bisse(tree, st.m)
> lik.s <- make.bisse(tree, st.s)

We can then run MCMC chains from a “sensible” starting point (see the help page ?starting.point.bisse):
> p <- starting.point.bisse(tree)
Running the chains:
> samples.m <- mcmc(lik.m, p, nsteps=10000, w=.5, prior=prior1)
> samples.s <- mcmc(lik.s, p, nsteps=10000, w=.5, prior=prior1)

For a single trait, the difference in the speciation rates (i.e., $\lambda_1 - \lambda_0$) is mathematically equivalent to the main effect of that trait. Monogamy is significantly associated with decreased speciation rates (the 95% credibility interval of $\lambda_M$ is below zero; figure 4.3). However, the effect of solitariness is no longer significant.

Similarly, we can run an ML analysis. First, fit the full six parameter model for each trait:

> fit.m <- find.mle(lik.m, p)
> fit.s <- find.mle(lik.s, p)

Then fit constrained models where $\lambda_1$ is set equal to $\lambda_0$ for both traits:

> lik.m.eqL <- constrain(lik.m, lambda1 ~ lambda0)
> fit.m.eqL <- find.mle(lik.m.eqL, p[argnames(lik.m.eqL)])
> lik.s.eqL <- constrain(lik.s, lambda1 ~ lambda0)
> fit.s.eqL <- find.mle(lik.s.eqL, p[argnames(lik.s.eqL)])

Comparing the “with state-dependent speciation” model against the simpler model that lacks state-dependent speciation with a likelihood ratio test for the monogamy trait:

> anova(fit.m, equal.lambda=fit.m.eqL)
             Df   lnLik    AIC  ChiSq Pr(>|Chi|)
full          6 -755.62 1523.2
equal.lambda  5 -758.74 1527.5 6.2242     0.0126

and the solitariness trait

> anova(fit.s, equal.lambda=fit.s.eqL)
             Df   lnLik    AIC  ChiSq Pr(>|Chi|)
full          6 -710.90 1433.8
equal.lambda  5 -711.59 1433.2 1.3821     0.2397

confirms the previous BiSSE result: there is evidence of state-dependent speciation for monogamy, but not for solitariness.

Note that it is not possible to directly compare the multitrait MuSSE model with the BiSSE model, because these models explain different data; MuSSE accounts for the observed distribution of multiple traits, BiSSE accounts only for one trait. Likelihood ratio tests are only valid where models are nested and where the data are identical between both models. Attempting a non-nested comparison will generate an error. Instead, to perform such comparisons, the user must generate constrained models using MuSSE, which account for all of the trait data but allow only one of the traits to affect diversification (as performed using lik.1M and lik.1S above).
Appendix D
Supplementary Information to Chapter 5

D.1 Partition compositions

Species composition of different tree partitions identified with MEDUSA. Full species lists of each partition for each tree will be made available on Dryad.
Afrotheria Afrotheria was only partitioned three times, into the same two groups:
Basal Afrotheria — containing most of the superorder.
Microgale — a genus of tenrecs. Only 9 or 10 of the species in this genus were included in the group.
Carnivora Carnivora was always split into two groups
Basal Carnivora — most species in the order.
Some Canidae — a group of 4–8 recently diverged genera within Canidae, always including Canis and Lycalopex. Other genera sometimes included are Atelocynus, Cerdocyon, Chrysocyon, Cuon, Lycaon, and Speothos.
Cetacea
Cetacea was partitioned into two groups in 36 trees:
Basal Cetacea: a group containing the baleen whales (suborder Mysticeti, with families Balaenidae, Neobalaenidae, and Balaenopteridae), and most families of toothed whales.
Delphinidae: most of the Delphinidae (dolphins). This group never included Orcinus orca and in three cases also did not include Feresa, Globicephala, Grampus, Orcaella, Peponocephala, and Pseudorca (trees 61, Õ, ä).
Chiroptera
Partitioned in all trees into a variable number of groups:
Basal Chiroptera (100 trees): basal group that includes most species.
Rhinolophus (100 trees): horseshoe bats. This group includes only some species in this genus.
Some Chiroptera (99 trees): this is a complicated group of fairly small families. Always includes the families Craseonycteridae, Emballonuridae, Hipposideridae, Megadermatidae, Nycteridae, Rhinolophidae, and Rhinopomatidae.
Myotis (ì trees): mouse-eared bat and relatives. Up to 36 genera may be included in this group, but usually only two (Cistugo is always present in addition to Myotis).
Nataloidea (41 trees): funnel-eared bats and relatives. This superfamily includes the families Furipteridae, Myzopodidae, Natalidae, and Thyropteridae.
Pteropodinae (13 trees): this clade approximately corresponds to Pteropodinae, Rousettinae (missing Eidolon), and Epomophorinae.
Vampyressa (10 trees): in one tree this is a group of Õ genera, corresponding approximately to Stenodermatinae. For the other trees, this group is extended out to include most genera in Phyllostomidae.
Eulipotyphla
Always splits into two or three groups:
Basal Eulipotyphla: a basal group that always includes the families Erinaceidae (Erinaceomorpha), Solenodontidae, and Talpidae (Soricomorpha).
Crocidura: white-toothed shrews. A group within the Soricidae that always includes Crocidura, and up to 9 other genera, mostly in the Crocidurinae.
Soricidae (21 trees): shrews. Where present, this group includes all Soricidae not included in the Crocidura group.
Lagomorpha
Never partitioned by Medusa.
Marsupials
The Marsupials include 7 orders:
Didelphimorphia
Paucituberculata
Peramelemorphia
Notoryctemorphia
Dasyuromorphia
Microbiotheria
Diprotodontia
This always splits into three groups:
Basal Marsupials: most of the Marsupials, that is, everything except for the two families below.
Dasyuridae: “marsupial mice”, within Dasyuromorphia. The entire family except for Lagostrophus (Banded Hare-wallaby) is always included.
Macropodidae: kangaroos, wallabies, etc., within Diprotodontia. The entire family is always included.
Primates
Always splits into two groups:
Basal Primates: most of the primates, that is, everything except the family below.
Cercopithecidae: “Old World monkeys”. The entire family is always included.
Rodentia
Rodentia can be divided into several suborders:
Sciuromorpha
Hystricomorpha
Castorimorpha
Myomorpha
Anomaluromorpha
Most of the diversity is in Myomorpha (which contains Muridae and Cricetidae), so we treat this separately from the other suborders for simplicity. The only group that spans both the background group of suborders (Anomaluromorpha, Castorimorpha, Hystricomorpha, and Sciuromorpha) and Myomorpha is:
Basal Rodentia (100 trees): this collects everything that is not in one of the groups below and spans the root of the tree. It primarily contains non-Myomorpha species, with suborders Anomaluromorpha (Anomaluridae and Pedetidae) and Castorimorpha (Castoridae, Geomyidae, and Heteromyidae) always entirely included. Other families that are always in this group include Dipodidae (Myomorpha), most families in Hystricomorpha, and Aplodontiidae (Sciuromorpha).
Non-Myomorpha groups:
Marmotini (100 trees): most ground squirrels, within Sciuridae. This is a group of three genera of ground squirrels: Cynomys, Marmota, and Spermophilus. This broadly matches the corresponding group in Herron et al. (2004), with the exclusion of Ammospermophilus, which is the sister to the three genera here.
Sciurinae (99 trees): flying squirrels and tree squirrels, within Sciuridae. With the exception of one tree, the same set of 16 genera is included (Aeromys, Eoglaucomys, Eupetaurus, Glaucomys, Hylopetes, Iomys, Microsciurus, Petaurillus, Petaurista, Petinomys, Pteromys, Pteromyscus, Rheithrosciurus, Sciurus, Syntheosciurus, Tamiasciurus). Tree 72 also includes Ratufa and Sundasciurus, which are the sister groups to this subfamily.
Ctenomyidae (26 trees): tuco-tucos, within Ctenomyidae (Hystricomorpha). This is the only genus in the family, and only part of the genus is included in this group.
Myomorpha groups:
Sigmodontinae (100 trees): New World rats and mice, within Cricetidae. This either includes the entire subfamily (64 genera: 15 trees) or most of the subfamily (as few as 42 genera; only 11 trees have fewer than 57 genera).
Murinae (¢ trees): this is a large group that varies in composition across different trees. It varies from approximately the Murinae (within family Muridae; only ¢ trees have this small a span) to the entire Myomorpha group except for Dipodidae.
Arvicolinae (ó trees): voles, lemmings, and muskrats, within Cricetidae. The entire subfamily is not always present, with the group ranging from 10–21 species, but usually 21 (see tree 35 for the smallest collection of genera). Most commonly (10 trees) the tribe Lemmini (lemmings: Synaptomys, Lemmus, Myopus) is missing.
Peromyscus (17 trees): deer mice, within Cricetidae/Sigmodontinae. This sometimes did not cover the full genus, but always covered most of it.
Rattus (11 trees): rats and related genera, within Muridae/Murinae. This group varies from 2–14 genera in a fairly continuous manner.
Ruminantia
Ruminantia was partitioned in all but four trees. When partitioned, this was always into two groups:
Basal Ruminantia: Antilocapridae, Giraffidae, and Tragulidae. These three small families span the root of the tree.
Most Ruminantia: Moschidae, Cervidae, and Bovidae. These families make up the vast majority of the diversity in Ruminantia.