On the Feasibility to Engage Heterogeneous Communities in Data Gathering, Sharing and Enrichment
Department of Life Sciences and Chemistry

On the feasibility to engage heterogeneous communities in data gathering, sharing and enrichment

by Julia Schnetzer, MSc.

A thesis submitted in partial fulfillment of requirements for the degree of DOCTOR OF PHILOSOPHY in Marine Microbiology

Approved Thesis Committee:
Prof. Dr. Frank Oliver Glöckner (chair), Max Planck Institute for Marine Microbiology, Jacobs University Bremen
Prof. Dr. Matthias Ullrich, Jacobs University Bremen
Dr. Julia A. Busch, Carl von Ossietzky University Oldenburg
Dr. Renzo Kottmann, Max Planck Institute for Marine Microbiology

Date of Defense: 08.12.2015

Unlike the original submitted thesis, this digital version contains the publisher's version of the articles.

Statutory Declaration (on Authorship of a Dissertation)

I, Julia Schnetzer, hereby declare that I have written this PhD thesis independently, unless clearly stated otherwise. I have used only the sources, the data and the support that I have clearly mentioned. This PhD thesis has not been submitted for conferral of degree elsewhere. I confirm that no rights of third parties will be infringed by the publication of this thesis.

Signature, Date

...by the mean nature of a world that will not stand still long enough for them to see it clear as a whole.
What lured Hemingway to Ketchum? Hunter S. Thompson (1937-2005)

Thesis Abstract

Marine microbes play critical roles in the well-being of planet Earth and all its inhabitants. They influence not only chemical cycles and the marine food chain, but also the whole atmosphere and climate of our planet. However, the field of marine microbiology is still in its infancy and there is much more waiting to be explored. Here, I present a new approach to investigate global marine microbial diversity and function on a single day of the year, the 21st of June 2014/2015: the Ocean Sampling Day (OSD).
The collection of a simultaneous, global dataset required marine researchers worldwide to be connected. The aim was not only to create a snapshot of marine microbial diversity fixed in time, but also to raise awareness amongst the general public of the important role these tiny organisms play in our daily lives. Therefore, professional scientists as well as the non-scientific public were invited to join the corresponding citizen science project, MyOSD. They supported OSD by providing oceanographic measurements and even microbial samples. Data collected by citizen scientists were validated and show that citizen science can contribute valuable data to marine research. A special focus was set on additional environmental measurements such as water temperature. These contextual data are important for the interpretation of microbial diversity in any given sample; however, it is still not common practice in marine microbial research to measure or report them, and OSD aims to make scientists more aware of this problem. Extracting contextual data after a dataset or article has been published is onerous work. Hence, I present two new tools to extract environmental information and geographic locations from scientific literature. The text mining tool, ENVIRONMENTS, automatically annotates scientific text with terms from the Environmental Ontology (EnvO). The PubMap application utilizes the power of the crowd to enable the creation of a manually curated database of georeferenced scientific publications. Overall, this thesis shows that enabling collaboration within the scientific community as well as with the non-scientific public leads to achievements not only in the gathering of new datasets, but also in the enhancement of present and historic scientific literature.

Table of Contents

Chapter 1 Introduction
1.1 Marine Microbes: The Grey Eminences of the Ocean
1.2 Uncovering the Secrets of Marine Microbes
1.3 Context is Everything
1.4 Exploring the Microbial Ocean
1.5 Microbes Gaining Importance in Citizen Science
1.6 Working Together to Refurbish Scientific Data
1.7 Research aims
Chapter 2 Results and Discussion
2.1 Overview
2.2 The Ocean Sampling Day Consortium
2.3 Between Ignorance and Concern - Interdisciplinary Approaches to Raising Awareness on Marine Environments
2.4 MyOSD 2014: Evaluating Oceanographic Measurements Contributed by Citizen Scientists in Support of Ocean Sampling Day
2.5 Understanding Marine Microbes, the Driving Engines of the Ocean
2.6 MyOSD 2015: Marine Microbiology Meets Citizen Science
2.7 ENVIRONMENTS and EOL: Identification of Environment Ontology Terms in Text and the Annotation of the Encyclopedia of Life
2.8 PubMap: A Crowdsourcing Application for Georeferenced Annotation of Scientific Publications
Chapter 3 Summary and Conclusion
3.1 The Ocean Sampling Day: Bringing Marine Researchers Closer Together
3.2 MyOSD: Engaging the Public in Marine Microbiology
3.3 Refurbishing Data for the Scientific Community
Chapter 4 Outlook
Additional Scientific Publications
Appendix
Acknowledgements
Bibliography

Table of Figures

Figure 1: The development of sequencing technologies over the past 30 years.
Figure 2: Growth of DNA sequencing capacity
Figure 3: Google Trends search on citizen science.
Chapter 1 Introduction

1.1 Marine Microbes: The Grey Eminences of the Ocean

The ocean, which covers 70% of Earth's surface, represents the largest habitat for living organisms. These organisms are mainly marine microbes, which include not only prokaryotes such as bacteria and archaea but also viruses and microscopic eukaryotes such as single-celled algae, protists or fungi (Fuhrman 2009). As the name microbe implies, they share a common feature: their size lies in the microscopic range. For example, the smallest marine microorganisms include viruses of the Parvoviridae family, with a diameter of only 20 nm (Munn 2011), and the archaeon Thermodiscus, which is only 200 nm in diameter (Schulz and Jørgensen 2001). The microbial world also has its "giants", such as the bacterium Thiomargarita namibiensis, whose diameter of about 750 µm makes it visible even to the naked eye (Schulz 1999). Regardless of their size, microbes are very diverse: they can live in aerobic as well as anaerobic conditions and use various organic as well as inorganic chemicals, such as hydrogen sulfide, as energy sources (Madigan and Brock 2012). Microbes are able to inhabit every known niche and can even be found in extreme environments such as hydrothermal vents in the deep ocean (Jannasch and Mottl 1985) or hypersaline waters such as the Dead Sea (Pundak and Eisenberg 1981). Despite their small size, marine microbes are not only the most abundant entities in the ocean (a total of 1.2 × 10^29 microbial cells is estimated in the open ocean alone) but also comprise about 90% of the total marine biomass (Whitman et al. 1998). It is therefore no surprise that they play major roles in the biochemical cycles of marine as well as terrestrial ecosystems. In the upper 200 meters of the ocean, photosynthetic microbes (phytoplankton) are responsible