Web-Based Computational Tools for Studying Plant Biodiversity
Total Page:16
File Type:pdf, Size:1020Kb
Louisiana State University LSU Digital Commons LSU Doctoral Dissertations Graduate School 2014 Web-based Computational Tools for Studying Plant Biodiversity Timothy Mark Jones Louisiana State University and Agricultural and Mechanical College, [email protected] Follow this and additional works at: https://digitalcommons.lsu.edu/gradschool_dissertations Recommended Citation Jones, Timothy Mark, "Web-based Computational Tools for Studying Plant Biodiversity" (2014). LSU Doctoral Dissertations. 3055. https://digitalcommons.lsu.edu/gradschool_dissertations/3055 This Dissertation is brought to you for free and open access by the Graduate School at LSU Digital Commons. It has been accepted for inclusion in LSU Doctoral Dissertations by an authorized graduate school editor of LSU Digital Commons. For more information, please [email protected]. WEB-BASED COMPUTATIONAL TOOLS FOR STUDYING PLANT BIODIVERSITY A Dissertation Submitted to the Graduate Faculty of the Louisiana State University and Agricultural and Mechanical College in partial fulfillment of the requirements for the degree of Doctor of Philosophy in The Department of Biological Sciences by Timothy Mark Jones B.S., Cleveland State University, 2006 May 2015 ACKNOWLEDGMENTS This dissertation is over a decade in the making and is the result of many peoples work and ideas. First, I am forever indebted to my graduate advisor, Dr. Lowell E. Urbatsch, for giving me the opportunity to work at a true idea factory. Lowell provided guidance, plus the freedom to develop new computational tools, in addition to years of challenging conversations, classes, workshops, and field trips. Lowell primarily led along a conversational path of, “How would you make it better?” I would also like to thank my undergraduate advisor, Dr. George Wilder, for providing the framework for this document. His incredibly detailed lectures and exams are the seeds for all future works presented. And thank you to LSU for its support and the state-of-the-art herbarium. My committee members: Dr. Michal Brylinksi, Dr. Jeremy Brown, Dr. Prosanta Chakrabarty, and Dr. Bryan Carstens (now at Ohio State University) have all educated and enlightened me with their classes, presentations, and interactions. They have always provided crucial guidance about research, publishing, and graduate school in general. Michal really solidified my larval reasoning concerning computational processes as a necessity in all biological fields of study. I would also like to sincerely thank Dr. Bill Wischusen, Ann Jolissaint, and Jane Reiland, for the teaching advice and time that they would freely give out of their already busy days. Plus, I would like to thank Priscilla Milligan, for always taking the time to point me in the right direction. Many LSU Herbarium personnel were instrumental in the making of this work. Dr. Yalma Vargas, aided in development of literally everything; all the good/bad/ugly ideas. As well as Chris Reid, I am truly grateful for every trip and conversation that we have had concerning our dissertations, projects, and bodies of work. Jennifer Kluse, also graciously provided feedback with development about all things, while Bruce Messick enabled a healthy IT development environment for these works. ii Other individuals that facilitated chapter two are Dr. Mary Barkworth, Dr. Jim Bissell and Dr. Tony Reznicek. Tony provided early access to published materials that were obscure and/or new, thereby enabling early development of the alpha Carex Interactive Identification key (CIIK), while Jim granted access to rare ecosystems holding those hard-to-find species that permitted early testing. Mary made all things possible with innumerable conversations, software, travel, housing, and great meals. She introduced me into the professional botanical world. Computationally, Dr. Michael Dallwitz showed great patience with early versioning of CIIK character matrices, as well as taking the time to provide revisionary edits that informed, while he improved the data clarity. Dr. Kevin Thiele helped to improve this informational clarity through discussions and body of work. And thanks to Dr. Terrence Walters for permitting me access to a United States Department of Agriculture course for state-of-the-art identification software years ago. That class was instrumental in my development. The third chapter on grass pollination was enabled by contributions from Dr. Sophie Warny, Dr. Vaughn Bryant, Dr. Victoria Bayless and Dr. George Jones. Their combined expertise greatly accelerated the publication of the grass/bee chapter. The last data chapter on plant biodiversity usage of Google Analytics is the result of work by many hundreds of individuals but centers around David Baxter, Ben Legler, Ed Gilbert, Dr. Chuck Miller, Dr. Gregor Hagedorn, and Dr. Rod Page. Without their support and data sets, this paper would have been an impossibility. And special appreciation is genuinely felt for all the field companions that have provided the their intellect, insight, and expert-level knowledge over the years: Ramon Vargas, Valery Yanak, Jamie Byrd, Dr. Suneeti Jog, Pedro Lake, Fredrick Schleibner, Dr. David Kriska, Diane Massey, Dr. Kal Ivanov, George Skalsky, Dr. Alistair McTaggert, Dr. Tony Kusbach, Dr. Forrest Dillemuth, Dr. Kyle Harms, and Dr. Maggie Hanes-Koopman. iii Finally, thank you to my folks, Tamara Jordan & Roderick Jordan, without your faith, I could not have made this far. And to my family, Juliana Polonski, Ted Polonski, and Victor Potapenko - thank you for allowing this to happen. I greatly appreciate all the time and effort by everyone. It is only due to this collective community, the backbone and broad shoulders of co-workers, friends and family that has allowed these works presented here. iv TABLE OF CONTENTS ACKNOWLEDGMENTS ...................................................................................................................... ii ABSTRACT ........................................................................................................................................ vi CHAPTER 1. PROBLEMS OF SCALE ACROSS LEVELS ........................................................................ 1 CHAPTER 2. A VISUAL IDENTIFICATION KEY UTILIZING BOTH ANALYTIC AND GESTALT APPROACHES TO IDENTIFICATION OF CARICES PRESENT IN NORTH AMERICA .............................. 5 CHAPTER 3. ENTOMOPHILOUS POLLINATION OF GRAMINOIDS .................................................. 17 CHAPTER 4. ACCESS OF PLANT BIODIVERSITY DATA REVEALED BY GOOGLE ANALYTICS ............. 26 CONCLUSION ................................................................................................................................. 50 REFERENCES .................................................................................................................................. 51 APPENDICES .................................................................................................................................. 56 VITA ............................................................................................................................................... 77 v ABSTRACT Large scale plant biodiversity bioinformatics projects are now making taxonomic datasets available at a frenetic pace via the World Wide Web (WWW). While these new resources provide the fundamental textual and visual backbone of expert level knowledge, their information structure often impedes the development of derivative works for identification. But when this information is rearranged from a traditional format, questions can be asked of the data that were previously thought to be unanswerable. The difficulty in transforming this ‘big-data’ is manifold. How to deliver it rapidly to researchers across the world while providing visualizations of data that encompass these large data sets. Interactive Visual Identification Keys (VIK) are introduced here to help manage this magnitude of image data, using both analytic and gestalt methods, (Chapter 2) here via the Carex Interactive Visual Identification Key (CIVIK). Through matrix preparation utilizing ontological methods only, and brute force data-mining, Flora of North America is leveraged to develop and provide a novel identification system for the largest vascular plant genus of North America, Carex. The third chapter focuses on pollination syndromes found within the graminoids, or the grasses and sedges of which Carex is a member. The graminoid pollination syndrome is known as anemophily, or wind pollination. During preparation of CIVIK it was noted repeatedly while taking the photos required for its generation, that small solitary bees and flies will often visit graminoids to collect pollen during anthesis. Yet, traditional botanical literature often neglects to mention this fact, or it is described as being inadvertent or mistaken. This chapter presents solid evidence that even common honey bees, Apis mellifera, will exclusively visit a common turf grass to collect pollen. Then, Chapter 4 examines and analyzes these plant biodiversity websites for use. Are they being used? With what technology? Are trends present to be considered for future development? With answers to these questions, curators of museum quality data, in conjunction with web developers, may be able to provide a richer user experience in a shorter amount of time. vi CHAPTER 1. PROBLEMS OF SCALE ACROSS LEVELS Though the following three data chapters may seem disparate in nature, there is one recurring theme, that being the issue concerning scales of information.