Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings, Volume 14, Article 4, 2014. Portland, Oregon, USA. (OSGEO Journal Volume 14)

Adding Phylogenies to QGIS and Lifemapper for Evolutionary Studies of Species Diversity

by Jeffery A. Cavner, Aimee M. Stewart, Charles J. Grady, James H. Beach
University of Kansas (USA)

Recommended Citation: Cavner, Jeffery A.; Stewart, Aimee M.; Grady, Charles J.; and Beach, James H. (2014) "Adding Phylogenies to QGIS and Lifemapper for Evolutionary Studies of Species Diversity," Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings: Vol. 14, Article 4.
DOI: https://doi.org/10.7275/R5T72FN2
Available at: https://scholarworks.umass.edu/foss4g/vol14/iss1/4

Abstract

Phylogenetic data from the "Tree of Life" have explicit spatial and temporal components when paired with species distribution and ecological data for testing contributions to biological community assembly at different geographic scales of species interaction. Important questions in biology, such as the degree of niche suitability and whether the history of a community's assembly in an area affects how closely its species are phylogenetically related, can be answered using several different spatially filtered measures of phylogenetic diversity. Phylogenetic analyses that support the description of ecological processes are usually carried out in a handful of software libraries that are narrowly focused on a single set of tasks. Very few applications scale to large datasets, and most do not have an explicit spatial component without relying on external visualization packages. This prompted us to explore bringing phylogenetic data into an open-source GIS environment. The Lifemapper Macroecology/Range & Diversity QGIS plug-in is a custom plug-in which we use to calculate and map biodiversity indices that describe range-diversity relationships derived from large multi-species datasets. We describe extensions to that plug-in which expand the Lifemapper set of ecological tools to link phylogenies to spatially derived "diversity field" statistics that describe the phylogenetic composition of natural communities.

Keywords: QGIS, WPS, Distributed Computing, Biogeography, Range and Diversity, Lifemapper, Macroecology, Phylogenetics.

1. Background

Community phylogenetics, which focuses on how species relatedness and species traits are associated with how evolution extends into ecological processes and spatial patterns, and biogeography or meta-community ecology, which focuses largely on the spatial regulation of species distributions, should assay the spatial variation of phylogenies by mapping phylogenetic community values across space and time at different scales using advances in GIS techniques. One such approach would be to bring phylogenetic data into a GIS environment. We have begun to develop such an approach as an addition to the Lifemapper project (www.lifemapper.org) in a Lifemapper Range & Diversity (LmRAD) QGIS plug-in (Cavner et al. 2014) that provides phylogenetic visualization and analysis tools for spatially linked range-diversity relationships derived from presence-absence matrices (PAMs). We also developed the tool hoping to expand it to include historical biogeography meta-community analyses and community assembly analyses focused on phylogenetic-diversity area relationships, where analysis across geographic scale leads to some of the most important questions in biodiversity.

The LmRAD QGIS plug-in creates, maps and analyzes presence-absence matrices, or PAMs, one of the core data structures for macroecological research. It links the resulting data to phylogenetic and spatial views of a set of range-diversity statistics derived from the PAM. The PAM, or incidence matrix, is a 2-dimensional Boolean matrix constructed from a spatially defined grid of regular polygons, where the presence or absence of each of hundreds or thousands of species is recorded for each cell. One axis of the matrix represents species and the orthogonal axis represents geographic localities described by the regular polygons. Each geographic site is coded for the presence (1) or absence (0) of each species. The PAM summarizes the two fundamental units of biogeography: the distributional range of a species (both its position and its size, where range size simply equals the total along the species axis across sites) and the species diversity of sites, that is, the number of different species in each site as summarized by the site axis totals.

Several mathematical and biological relationships obtain across the PAM that link spatially derived statistics with species-based statistics. Of interest for phylogenetic relationships are the species-based statistics calculated from the PAM that measure the "diversity field" of a species (Arita et al. 2008). The diversity field is the set of diversity values of sites in which a species occurs. For example, the diversity field volume, i.e. the summation of those species diversity values within a species' range, divided by the range size of the species, gives the average species diversity within the range of that species. We represent that volume as a proportion of the total number of species in the study area. Including the total area of the study area allows us to illustrate the proportion of the sites in which two species co-occur. The average association of a species with all of the species in the study area allows us to illustrate that there is an inverse relationship between the proportional range of a species and the difference between the mean proportional diversity within its range and the average proportional diversity in the study area (Arita et al. 2008). The mathematical reciprocal of the average proportional diversity of the study area is a well-studied measure of species turnover called Whittaker's beta diversity. It is a measure of the ratio between the overall diversity of the study area and the average local diversity (Arita et al. 2008). There are closely associated beta measures for several different types of diversity. Different approaches to species diversity, such as phylogenetic diversity (the degree of relatedness of species in a community based on their evolutionary history), abundance, and ecosystem function measures of diversity, can all be decomposed into measures of local and regional diversity ratios that are highly dependent on scale.

Analyzing the diversity field within the range of a species is equivalent to studying its covariance with all the species in a study, i.e. the degree of association of species within their ranges. We plot this association in QGIS through the plug-in in a "range-diversity" plot. Curves on the plot for species follow a line defined by the inverse relationship between the range of a species and the difference between the two diversity statistics. When the species are plotted in this way, species with equal degrees of association with one another arrange themselves along lines of isocovariance. The Lifemapper plug-in allows the user to "brush" data points along those curves in the interactive range-diversity plot, which selects the individual species in the linked data space for the phylogenetic tree. In this way the spatially derived statistics for diversity from the PAM can be compared to the degree of phylogenetic relatedness within species communities.

... as Open Geospatial Consortium (OGC) Web Processing Services (WPS) (Open Geospatial Consortium, Inc. 2007b) so that larger distributed computing environments can be brought to bear on large datasets. The Lifemapper web services are organized as two modules, LmSDM and LmRAD. The LmSDM module uses RESTful and OGC specifications to build species distribution models based on the predicted niche for a species using climate and species occurrence data. The LmRAD (Range and Diversity) module is a multi-species platform for PAM-based range and diversity calculations. Both modules can be accessed through the plug-in, and outputs from LmSDM can be piped into LmRAD as species inputs to PAMs. This paper focuses on the range and diversity capabilities of the plug-in, on how the spatial component of phylogenetic data recently added to the plug-in can be used with the biodiversity indices calculated from the PAM, and on areas where phylogenetic data can be used to explore other types of diversity measures for species communities. We begin by outlining use cases and the common threads that connect them, and how we have begun to address them, with a focus on new interface capabilities for phylogenetic data and linked data spaces. Next we describe how the Lifemapper plug-in and its supporting web services were designed to take advantage of a client-server architecture in order to use geographic processing standards on large datasets. This is followed by a comparison of related software with a focus on phylogenetic algorithms and scripts with a spatial component. We end by discussing findings and future directions for the Lifemapper plug-in.

2. Use Cases and Capabilities

2.1 Range and Diversity Plots and Maps with Phylogenetic Trees

Phylogeny-based ecology is a growing field. Its practice, both at small scales and at larger biogeographic scales (it goes under several names: phylogeography, ecophylogenetics, or phylogenetic community ecology), shares two obvious constraints for incorporating phylogenetic data into ecology re-
