Probabilistic Models of Geographic Range Evolution
Will Freyman
IB200, Spring 2016

FIGURE 3.
Image: Richard H Ree and Stephen A Smith. Maximum likelihood inference of geographic range evolution by dispersal, local extinction, and cladogenesis. Systematic Biology, 57(1):4–14, 2008.
Downloaded from http://sysbio.oxfordjournals.org/ at University of California School Law (Boalt Hall) on April 12, 2016

Biogeographic histories on a phylogeny:

FIGURE 1. An example of a tree with M = 4 species. A) Nodes on the tree are labeled such that the tips of the tree have the labels 1, 2, ..., M whereas the interior nodes of the tree are labeled M+1, M+2, ..., 2M. Note that in this article we also consider the "stem" branch of the tree, which connects the root node (node 7) and its immediate common ancestor (node 8). B–D) Several possible biogeographic histories (comprising 6, 6, and 12 events, respectively) that can explain the observed species ranges.
Image: Michael J Landis, Nicholas J Matzke, Brian R Moore, and John P Huelsenbeck. Bayesian analysis of biogeography when the number of areas is large. Systematic Biology, page syt040, 2013.

... chain (Ree et al. 2005). A continuous-time Markov chain is fully described by a matrix containing the instantaneous rates of change between all pairs of states (geographic ranges, in this case). This instantaneous-rate matrix, $Q$, has off-diagonal elements that are all $\geq 0$ and negative diagonal elements that are specified such that each row of the matrix sums to 0. The elements of $Q$ are parameterized by functions of $\theta$, the parameter vector, according to some dispersal model, $\mathcal{M}$. The probability of a biogeographic history is obtained using the information on the position of colonization/extinction events on the tree and information from the instantaneous-rate matrix. Consider, for example, a case in which the process starts with a geographic range of 001 at one end of a branch, with a subsequent colonization of area one at time $t_1$ (i.e., changes from 001 $\to$ 101), and then remains in the geographic range 101 until the end of the branch at time $t_2$. The probability of this history is

$$
\underbrace{-q_{001,001}\, e^{-(-q_{001,001}\, t_1)}}_{\text{Waiting time for colonization}}
\times
\underbrace{\frac{q_{001,101}}{-q_{001,001}}}_{\text{Probability of colonization event}}
\times
\underbrace{e^{-(-q_{101,101}\,(t_2 - t_1))}}_{\text{Probability of no further events}}
$$

There are an infinite number of biogeographic histories that can explain the observed geographic ranges. When calculating the probability of the observed geographic ranges at the tips of the phylogenetic tree, it is unreasonable to condition on a specific history of biogeographic change. After all, the past history of biogeographic change is not observable. Instead, the usual approach is to marginalize over all possible histories of biogeographic change that could give rise to the observed geographic ranges. The standard way to do this is to assume that events of colonization or local extinction occur according to a continuous-time Markov chain (Ree et al. 2005). Marginalizing over histories of biogeographic change is accomplished using two procedures. First, exponentiation of the instantaneous-rate matrix, $Q$, gives the probability density of all possible biogeographic changes along a branch

$$
p(y \to z;\, t, Q) = \left[ e^{Qt} \right]_{yz},
$$

where $y$ is the ancestral geographic range, $z$ is the current geographic range, and $t$ is the duration of the branch on the tree. The geographic-range transition probabilities obtained in this way marginalize over all possible biogeographic histories along a single branch, but do not account for the possible combinations of geographic ranges that can occur at internal nodes of the phylogeny. The Felsenstein (1981) pruning algorithm is typically used to marginalize over the different combinations of "states" (ancestral geographic ranges) at the interior nodes of the tree. Taken together, matrix exponentiation and the pruning algorithm comprise the conventional approach for calculating the probability of observing the geographic ranges at the tips of the tree while accounting for all of the possible ways those observations could have been generated under the model.

The dimensions of the instantaneous-rate matrix, $Q$, however, are $n(\mathcal{S}) \times n(\mathcal{S})$, where $n(\mathcal{S}) = 2^N - 1$, so the size of $Q$ grows exponentially with respect to the number of geographic areas, $N$. Furthermore, computing the matrix exponential is of complexity $\mathcal{O}(n(\mathcal{S})^3)$ (Golub and Loan 1983). Thus, for values of $N \geq 20$, the number of computations required to exponentiate the rate matrix is quite large and computing the transition probabilities in this manner is intractable (Ree and Sanmartín 2009).

Statistical phylogenetic models encounter an analogous problem when modeling nucleotide evolution. As Felsenstein (1981) suggests, one might assume that each nucleotide site evolves under mutual independence to keep the state space small and amenable to matrix exponentiation. For biogeographic inference, however, the assumption of mutual independence would imply (implausibly) that the correlative effects between areas, such as geographic distance, are irrelevant to dispersal processes, which renders this assumption suitable only ...
The Dispersal-Extinction-Cladogenesis (DEC) model:

Syst. Biol. 57(1):4–14, 2008
Copyright © Society of Systematic Biologists
ISSN: 1063-5157 print / 1076-836X online
DOI: 10.1080/10635150701883881

Maximum Likelihood Inference of Geographic Range Evolution by Dispersal, Local Extinction, and Cladogenesis

RICHARD H. REE¹ AND STEPHEN A. SMITH²
¹Department of Botany, Field Museum of Natural History, 1400 South Lake Shore Drive, Chicago, Illinois 60605, USA; E-mail: rree@fieldmuseum.org
²Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut 06520, USA

Abstract. In historical biogeography, model-based inference methods for reconstructing the evolution of geographic ranges on phylogenetic trees are poorly developed relative to the diversity of analogous methods available for inferring character evolution. We attempt to rectify this deficiency by constructing a dispersal-extinction-cladogenesis (DEC) model for geographic range evolution that specifies instantaneous transition rates between discrete states (ranges) along phylogenetic branches and apply it to estimating likelihoods of ancestral states (range inheritance scenarios) at cladogenesis events. Unlike an earlier version of this approach, the present model allows for an analytical solution to probabilities of range transitions as a function of time, enabling free parameters in the model, rates of dispersal, and local extinction to be estimated by maximum likelihood. Simulation results indicate that accurate parameter estimates may be difficult to obtain in practice but also show that ancestral range inheritance scenarios nevertheless can be correctly recovered with high success if rates of range evolution are low relative to the rate of cladogenesis. We apply the DEC model to a previously published, exemplary case study of island biogeography involving Hawaiian endemic angiosperms in Psychotria (Rubiaceae), showing how the DEC model can be iteratively refined from inspecting inferences of range evolution and also how geological constraints involving times of island origin may be imposed on the likelihood function. The DEC model is sufficiently similar to character models that it might serve as a gateway through which many existing comparative methods for characters could be imported into the realm of historical biogeography; moreover, it might also inspire the conceptual expansion of character models toward inclusion of evolutionary change as directly coincident, either as cause or consequence, with cladogenesis events. The DEC model is thus an incremental advance that highlights considerable potential in the nascent field of model-based historical biogeographic inference. [Ancestral state reconstruction; dispersal; extinction; Hawai‘i; historical biogeography; Psychotria; speciation; vicariance.]

Inferring the evolution of geographic ranges of species and clades in a phylogenetic context is a major focus of historical biogeography. To this end, many questions about the past may be of interest, such as: Where were ancestors distributed? What was the tempo and direction of dispersal? How important was range expansion to lineage diversification? The breadth of potential inquiry is wide, but to date it has not been matched by an equally diverse set of methods

In character models, transitions between states are typically assumed to occur stochastically according to a Markov process, with the probability $P_{ij}(t)$ of ancestor-descendant change from state $i$ to state $j$ on a phylogenetic branch of length $t$ being a function of the model's parameter values for instantaneous transition rates. The matrix of transition probabilities for all pairs of states is generally obtained by the equation $P(t) = e^{Qt}$, where $Q$