Assembling an Illustrated Family-Level Tree of Life for Exploration in Mobile Devices
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/2021.08.04.454988; this version posted August 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. Assembling an illustrated family-level tree of life for exploration in mobile devices Andrés A. Del Risco, Diego A. Chacón, Lucia Ángel, David A. García Corresponding Author: Andrés A. Del Risco. E-mail: [email protected] ABSTRACT Since the concept of the tree of life was introduced by Darwin about a century and a half ago, a considerable fraction of the scientific community has focused its efforts on its reconstruction, with remarkable progress during the last two decades with the advent of DNA sequences. However, the assemblage of a comprehensive tree of life for its exploration has been a difficult task to achieve due to two main obstacles: i) information is scattered into a plethora of individual sources and ii) practical visualization tools for exceptionally large trees are lacking. To overcome both challenges, we aimed to synthetize a family-level tree of life by compiling over 1400 published phylogenetic studies, ensuring that the source trees represent the best phylogenetic hypotheses to date based on a set of objective criteria. Moreover, we dated the synthetic tree by employing over 550 secondary-calibration points, using publicly available sequences for more than 5000 taxa, and by incorporating age ranges from the fossil record for over 2800 taxa. Additionally, we developed a mobile app (Tree of Life) for smartphones in order to facilitate the visualization and interactive exploration of the resulting tree. Interactive features include an easy exploration by zooming and panning gestures of touch screens, collapsing branches, visualizing specific clades as subtrees, a search engine, a timescale to determine extinction and divergence dates, and quick links to Wikipedia. Small illustrations of organisms are displayed at the tips of the branches, to better visualize the morphological diversity of life on earth. Our assembled Tree of Life currently includes over 7000 taxonomic families (about half of the total family-level diversity) and its content will be gradually expanded through regular updates to cover all life on earth at family-level. Keywords: Tree of Life, Phylogeny, Visualization, Mobile App, Exploration, Evolution INTRODUCTION one of the most challenging goals in the biological sciences, prompting the The fact that all life on Earth is emergence of phylogenetics as a connected through common ancestry fundamental subdiscipline of across a single genealogical tree has evolutionary biology. fascinated scientists for over one and a half century (Darwin, 1859). Since then, Even though the scientific reconstructing the tree of life has become community has made substantial 1 bioRxiv preprint doi: https://doi.org/10.1101/2021.08.04.454988; this version posted August 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. advances towards this goal, especially files, which are acquired from either during the last two decades following the public data repositories or by direct advent of molecular systematics (Pennisi submission by authors. Unfortunately, 2013), two major obstacles still stand tall less than a fifth of the studies provide in the way of providing a unified and such files (Drew 2013), meaning that most comprehensive tree of life to browse of the potentially valuable phylogenetic through the evolutionary relatedness of information that is presented in the form all life on Earth. Firstly, phylogenetic of figures in published articles are yet to reconstructions are usually focused on be incorporated into its synthetic tree. very small fractions of the whole tree of Secondly, the OTOL synthetic tree does life, so the available fragments of not include the temporal component information are widely scattered across (time-proportional branch lengths) of the thousands of scientific publications which Tree of Life, so this information must be in turn are buried in the often paywalled consulted elsewhere (e.j., Kumar et al. primary literature. Secondly, due to the 2017). Finally, the OTOL project currently high number of terminal branches, having aims to reconstruct the evolutionary a fully assembled tree of life would relationships among living taxa, so extinct require effective and practical ways to taxa known from the vast fossil record has navigate and visualize a phylogenetic yet to be incorporated into the synthesis. diagram of such large dimensions. On the other hand, while many Some important efforts have been programs that allow users to visualize and undertaken to tackle both obstacles navigate tree files have been available for (although separately) with varying over a decade (Page 2011), only two degrees of success. Early projects aiming recently developed web-based tools have to reconstruct the tree of life involved the attained the goal of providing a complete compilation of phylogenies of specific visualization and exploration of large groups of taxa into sectional, hyperlinked phylogenies: Onezoom (Rosindell & webpages (Maddison et al. 2007) or books Harmon 2012) and LifeMap (de Vienne (Lecointre & Le Guyader 2006; Hedges & 2016). Both allow users to smoothly Kumar 2009), but these were eventually explore the tree of life by employing superseded by the most comprehensive zooming features, and by being also initiative to synthesize phylogenetic and available in the form of mobile apps, they taxonomic information into a single take advantage of touch screens for a source tree: The Open Tree of Life more intuitive exploration. These tools (hereafter, OTOL: Hinchliff et al. 2015). were solely developed for visualization, Although the establishment of the OTOL and hence depend on external datasets to project represents a huge step in explore a given phylogeny, relying on the overcoming the first obstacle by the synthetic tree compiled by the OTOL or extensive compilation of phylogenies, it is the NCBI taxonomic tree. The main not devoid of shortcomings. Firstly, the limitation of these approaches is the construction of its constantly updating visual formats in which trees are synthetic tree mainly depends upon presented; even though they are published phylogenies in the form of tree aesthetically appealing, they greatly 2 bioRxiv preprint doi: https://doi.org/10.1101/2021.08.04.454988; this version posted August 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. deviate from the conventional diagrams Figure 1 and is thoroughly explained in the in which phylogenies are presented sections below. (cladograms, phylograms, timetrees, etc.) and thus hinder the proper inference of the topological relationships depicted in Taxonomic Input the tree. Additionally, another consequence of these distinctive To assemble the family-level tree of life, visualization formats is their inability to the first step was establishing incorporate a temporal dimension in the comprehensive taxonomic lists for large form of branch lengths, which is necessary groups of organisms based on clade- to determine extinction and divergence specific, published taxonomic backbones dates. covering at least family-ranked taxa. We employed the comprehensive family-level With the aim of overcoming both classification of living organisms of obstacles while avoiding some of the Ruggiero (2014) as a starting point and as mentioned pitfalls of previous projects, the default taxonomic backbone for we introduce the Tree of Life App: a novel groups without specialized proposals. tool to visualize and interactively explore Although different taxonomic proposals a thoroughly compiled, time-calibrated or treatments may exist for a given group, family level-tree of life through a mobile we chose what we consider to be the best app. We aim to establish this tool as a taxonomic backbone based on their reference resource for scientific congruence with current phylogenetic consultation, educational purposes, and hypotheses (i.e., recognition of mostly even casual exploration for the general monophyletic families). Each family-level public. Tree of Life App is publicly taxonomic backbone is then cited in and available in the Google Play Store and displayed in the secondary information iPhone AppStore, and it is constantly window of its corresponding supra- being updated as new phylogenetic familial taxon. For any given group, a studies are published. taxonomic family list may or may not be a composite of more than one source. Such is the case for groups that have a rich METHODS fossil record, as are most major vertebrate and arthropod clades, among The general procedure for the assemblage others. In these cases, taxonomic lists are of the family-level tree of life consisted usually composed of a source for extant mainly in a tree grafting approach, in families, and another one for extinct which higher-level phylogenies are pasted families (which are often excluded in the into lower-level phylogenies. The whole former). For most