Validating a new analytical framework for including fossils in biogeographic analyses with a dated phylogeny for the plant order Gentianales

Ruud Scharn

Degree project for Master of Science (120 credits)
Biodiversity and Systematics 60 hec
Department of Biological and Environmental Sciences
University of Gothenburg
June 2014
Supervisors: Alexandre Antonelli, Cajsa Lisa Anderson

Abstract

The question why biodiversity differs among regions is central in biology. Based on the idea that current species distributions combined with their evolutionary history hold information on their geographic origin, several analytical methods have been developed using species relationships (phylogenies) to model past ancestral ranges. Nevertheless, reconstructing dispersals of species that did not leave any descendants in their ancestral areas has been challenging. Fossils can inform on past species occurrences, but placing them in a phylogeny is often difficult, which frequently precludes their inclusion in biogeographic analyses. In this study I test for the first time a novel approach to include fossils in biogeographic analyses, calculating relative dispersal events through time. I ask whether current disjunctions between the Neo- and Paleotropics are better explained i) by trans-Atlantic dispersals (which should occur randomly over time) or ii) by migrations across the boreotropic route, a largely interconnected belt of tropical forest covering southern North America and Laurasia during the Eocene (which should leave a clear temporal signature of increased dispersals during that period). I use the plant order Gentianales as a test case, for which I build the largest phylogenetic tree to date (containing 430 tips and 3 genes). I calibrate the tree in a Bayesian framework with c. 40 fossils, many of which are used for the first time in molecular dating analyses.
I take topological and dating uncertainties into account, and estimate the impact of biogeographic methodology and assumptions on the analyses. The dispersal patterns obtained for the Gentianales strongly indicate that dispersals among the world’s tropics did not take place at a continuous rate: instead, there was a pronounced increase in dispersals around the late Paleocene–early Eocene. Although several potential pitfalls could influence these results, including the relatively arbitrary use of a maximum age constraint for the order, my study indicates that the boreotropical route may have played a pivotal role in producing current inter-tropical disjunctions. My results further suggest that calculating relative (rather than absolute) dispersal events through time may be a more adequate way of tracking and quantifying biogeographic history.

Introduction

The world’s biodiversity is not evenly distributed. This observation contributed to Darwin’s (1859) formulation of the theory behind The origin of species (Ronquist and Sanmartín 2011). Although the observation is easily made, the question of why biodiversity differs from one area to another has been, and still is, central in biology. There are three main biological processes that govern the diversity of an area: 1) speciation, the formation of new species out of existing lineages; 2) extinction, the death of the last individual of a species; and 3) migration, the movement of species either into or out of the area. Based on the idea that current species distributions combined with their evolutionary history hold information on the spatial distribution of their ancestors, several analytical methods have been developed using species relationships (phylogenies) to model past ancestral ranges (for an overview of currently available methods see Ree and Sanmartín 2009, Ronquist and Sanmartín 2011).
These methods allow us to address questions such as when and under which circumstances lineages migrated into new areas to attain their present-day distribution.

Undetected migration

Despite recent methodological developments, there are biological pitfalls in ancestral state reconstructions, since not all migration events leave traces in present-day distributions. We know this because the fossil record often reveals taxa in areas that do not contain any living relatives. This effect can be partly explained by the fact that the eco-physical attributes of areas change over time (e.g. climate, geographic setting, drainage systems) and that species tend to inherit and track their ancestral ecology at a macro-ecological scale (Crisp et al. 2009). Changes in climate and other ecological attributes may therefore drive the movement of species, or even major parts of the local biotas, between areas (Raven and Axelrod 1974), even without causing species extinctions. Modern parametric models for biogeographic reconstruction such as the Dispersal-Extinction-Cladogenesis model (DEC) (Ree et al. 2005, Ree and Smith 2008) can in part account for past changes in geo-physical settings (e.g. connectivity between continents and islands) through the ability to include time-dependent constraints on dispersal. The user’s knowledge of past ecology and geography can hence be used to down-weight the probability of successful dispersal between areas at times when dispersal is deemed less likely. However, whilst this method can include temporal information, the constraints themselves are problematic: although we may have rough ideas about past rates of dispersal, precise values are usually defined arbitrarily. Furthermore, if an area is not occupied by any of the taxa in the phylogeny, that area cannot be included in the analysis.

Including fossils in biogeographic analysis

A further development would be to include fossil data directly in biogeographic analyses such as the DEC approach.
Since fossils at least partly represent the past ranges of lineages, they provide appropriate constraints on ancestral area reconstructions, and they also add a temporal aspect. There are however several potential pitfalls when using fossils in such a way. Fossil identification and taxonomic assignment are challenging, since fossils often represent only a fraction of the organism and preservation quality is often low. This reduces the number of preserved synapomorphies and other reliable characters available for confidently placing a fossil on a phylogeny. The quality of identifications therefore varies. A fossil assignment founded on the results of morphology-based cladistic analyses is generally regarded as more trustworthy than one based on morphological similarities (Parham et al. 2012, Ronquist et al. 2012), but it is dependent on the number of preserved characters. The result is that fossils are often placed at higher taxonomic levels (closer to the root of the phylogeny) than they should be. Moreover, fossil ages are often based on the age of the geological assembly in which they are found, and are thus often given as ranges rather than specific ages. Finally, because of their rarity, the absence of fossils from an area can usually not be considered proof that the lineage was absent. In order to use fossils effectively, the following should be taken into account: 1) uncertainty in the taxonomic placement of the fossil; 2) uncertainty in the phylogenetic reconstruction; 3) uncertainty in the molecular dating analyses; 4) uncertainty in the geographic range of the fossil; and 5) uncertainty in the fossil’s absolute age.

Current methods

Several solutions have been proposed to include fossils in biogeographic analyses.
Here I will briefly discuss the two that are based on Maximum Likelihood (ML) parametric models (models that define the likelihoods of alternative scenarios, given a set of probability distributions and their corresponding parameter values; Ree and Sanmartín 2009), as they can account for the effect of time. In other words, events are more likely to happen along longer branches in a phylogeny than along shorter ones (Ree and Sanmartín 2009).

The first method is to include fossils in the initial phylogeny as terminal taxa before running the biogeographic analysis (see Ronquist et al. 2012, Wood et al. 2013). The biogeographic analysis (for example DEC) is then run over a set of trees from a Bayesian posterior distribution and summarized on a maximum clade credibility tree, thus accounting for phylogenetic and dating uncertainty. This approach is therefore preferable, but it requires a vast morphological dataset for both fossil and extant taxa. This is often impractical when studying a large number of taxa, since paleontological data (including high resolution images) are often not publicly available, and the time required for properly coding and analyzing morphological matrices can be substantial.

An alternative method is found in the AReA package. The method is unpublished, but a manuscript describing the use of fossil constraints is available online (Moore et al. 2008), and the package has been used in several papers (Nylander et al. 2008, Xiang and Thomas 2008). The package uses fossil information to constrain the lineage with which the fossil is affiliated to the area of the fossil’s occurrence, at the age of the fossil. If the entire range is thought to be known (at the level of the operational units used in the biogeographic analysis), the analysis will enforce presence in only these areas during the time of the fossil’s occurrence. If the range is only partially known, only presence (not absence) is enforced.
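
The difference between the two kinds of fossil constraint described above can be illustrated with a minimal Python sketch. It is not the AReA implementation: the function names, the representation of ranges as sets of areas, and the example area names are all hypothetical and chosen only for illustration. The sketch enumerates the candidate ranges of a lineage and filters them according to whether the fossil's range is assumed to be fully or only partially known.

```python
from itertools import combinations


def all_ranges(areas):
    """Enumerate every non-empty candidate range (a set of areas)."""
    areas = sorted(areas)
    return [
        frozenset(combo)
        for size in range(1, len(areas) + 1)
        for combo in combinations(areas, size)
    ]


def ranges_allowed_by_fossil(areas, fossil_areas, range_fully_known):
    """Return the candidate ranges compatible with a fossil constraint.

    Hypothetical helper, not part of any published package.
    - range_fully_known=True: the lineage must occupy exactly the fossil's
      areas (presence and absence are both enforced).
    - range_fully_known=False: the lineage must occupy at least the fossil's
      areas (only presence is enforced).
    """
    fossil = frozenset(fossil_areas)
    if not fossil or not fossil <= frozenset(areas):
        raise ValueError("fossil areas must be a non-empty subset of the areas")
    if range_fully_known:
        return [r for r in all_ranges(areas) if r == fossil]
    return [r for r in all_ranges(areas) if fossil <= r]


if __name__ == "__main__":
    # Example operational areas (assumed for illustration only).
    areas = ["Africa", "Asia", "NorthAmerica", "SouthAmerica"]
    partial = ranges_allowed_by_fossil(areas, ["NorthAmerica"], False)
    full = ranges_allowed_by_fossil(areas, ["NorthAmerica"], True)
    print(len(all_ranges(areas)), len(partial), len(full))
```

With four operational areas there are 15 candidate ranges. A fossil from one area with a partially known range leaves the 8 ranges that contain that area, whereas a fully known range leaves a single permitted range. In an actual analysis such a filter would apply only to the constrained lineage at the age of the fossil.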