The Global Biodiversity Information Facility (GBIF) Is a Worldwide
This PDF file contains all parts of GBIF Training Manual 1: Digitisation of Natural History Collections Data.

This book contains many hyperlinks. Most are to websites (if you are using the document offline, you will not be able to follow these). Some of the links are internal to the document (particularly in the Introduction).

To go to a different place in the document:
• Mouse over a link and click to jump to the part of the book that is of interest, or
• Click on the Bookmark in the bookmarks sidebar.

To return to your previous place, either:
• Click on its Bookmark in the bookmarks sidebar, or
• Hit Alt + left arrow (or right click and select “previous view”).

GBIF Training Manual 1: Digitisation of Natural History Collections Data

Published by: Global Biodiversity Information Facility, http://www.gbif.org

© 2008, Global Biodiversity Information Facility

Digitisation of Natural History Collections Data

ISBN: 87-92020-07-0

Permission to copy and/or distribute all or part of the information contained herein is granted, provided that such copies carry due attribution to the Global Biodiversity Information Facility (GBIF).

Recommended citation: Global Biodiversity Information Facility. 2008. GBIF Training Manual 1: Digitisation of Natural History Collections Data, version 1.0. Copenhagen: Global Biodiversity Information Facility.

While the editor, authors and the publisher have attempted to make this book as accurate and as thorough as possible, the information contained herein is provided on an "As Is" basis, and without any warranties with respect to its accuracy or completeness. The editor, authors and the publisher shall have no liability to any person or entity for any loss or damage caused by using the information provided in this book.
Table of Contents

Covers and copyright page
Introduction (Meredith A. Lane)
Chapter 1: Uses of Digitised Collections Data (Arthur D. Chapman)
Chapter 2: Initiating a Collection Digitisation Project (Christopher K. Frazier, John Wall and Sharon Grant)
Chapter 3: Data Quality (Arthur D. Chapman)
Chapter 4: Data Cleaning (Arthur D. Chapman)
Chapter 5: Georeferencing (Arthur D. Chapman and John Wieczorek, eds.)
Chapter 6: Generalising Sensitive Data (Arthur D. Chapman and Oliver Grafton)
Glossary and Acronym Expansion (GBIF Secretariat staff)
Appendix: GBIF Data Portal Tutorial (Donald Hobern and Meredith A. Lane): booklet, online (English), online (French; French translation by GBIF-fr)

Introduction

The Global Biodiversity Information Facility (GBIF) is a worldwide network that makes primary, scientific, biodiversity data (documented species occurrence data) from many sources openly available via the Internet.
It does this by building an information infrastructure that interconnects hundreds of databases, and by promoting the digitisation and sharing of data that are not currently available via the Internet, such as those associated with specimens in natural history museums. This promotion of digitisation is approached in a number of ways:
• Seed money awards to stimulate digitisation projects;
• The development (with partners) of community-accepted standards for data and metadata, as well as software tools that enable interconnectivity and interoperability;
• Workshops for training in digitisation and data-sharing; and
• Guides such as this Training Manual and its components.

GBIF’s hope is to help collections and database personnel around the world share best practices in the tasks and operations required in building a web-based, global “natural history collection and herbarium” that can be accessed at any time, from anywhere, by anyone via the Internet.

GBIF is based upon primary scientific data (data that were recorded directly from nature) and upon a robust and comprehensive taxonomic system. These kinds of data can be used and reused in different analyses without diminishing their value. However, in this digital age, the use of biodiversity data is limited by the paucity of records that are in digital form; most data are recorded only on paper in ink. For this reason, GBIF places a strong emphasis on the digitisation of natural history and other biological collections, as well as taxonomic names data and concepts. In addition, GBIF will provide tools that will, to unprecedented levels, enhance the quality of these data and describe their fitness for various uses.

GBIF comprises its Participants, their Nodes, data providers around the world, and a coordinating Secretariat that works with partner organisations of many types to accomplish the goals of all.
It does this by:
• Supporting and promoting the view that sharing biodiversity data, with clear rules and with full respect for the rights of the providers, has clear advantages for both users and providers;
• Reaching out to data providers and potential users of the data, providing them with opportunities to increase their capacity to share and utilise biodiversity data;
• Encouraging and facilitating the digitisation of data, including historical specimens, their label texts and associated materials, as well as observational data, so that these can be added to the digital store of available data;
• Encouraging and facilitating the digital capture, documentation and georeferencing of newly gathered specimens and observational records; and
• Building an information architecture that offers web services to users and data providers, and makes biodiversity databases interoperable among themselves and across levels of biological organisation, as well as with digital literature.

The founding GBIF Memorandum of Understanding laid out the principles of GBIF, to which it still adheres. These include that GBIF will:
• be shared and distributed, while encouraging co-operation and coherence;
• be global in scale, though implemented nationally, regionally and locally;
• be accessible by individuals anywhere in the world, offering potential benefits to all, while being funded primarily by those that have the greatest financial capabilities;
• promote standards and software tools designed to facilitate their adaptation into multiple languages, character sets and computer encodings;
• disseminate technological capacity by making widely available scientific and technical information; and
• make biodiversity data universally available, while fully acknowledging the contribution made by those gathering and furnishing these data.
GBIF’s adherence to these principles is specifically intended to achieve benefits both for the users and providers of primary species occurrence data, and to make GBIF a global public good.

Benefits of GBIF

To users: GBIF is a Global Public Good

Public goods, as generally understood, have two important characteristics: (1) they are freely available to all, and (2) they are not diminished by use. By this definition alone, GBIF is a public good: (1) GBIF’s fundamental principle is freely shared, accessible data, and (2) GBIF-mediated data can be used and reused by anyone; indeed, the very use of the data can often improve their quality. There is no single major “buyer” for GBIF-mediated data. The data can be used by researchers to generate new knowledge, by non-governmental agencies of one sort or another, or by governments for decision-making (among many other uses; see Chapter 1 of this Manual).

A paper by Arzberger et al. (2004) has as its core principle that publicly funded research data should be openly available to the maximum extent possible, because these data are a public good produced in the public interest. Good examples are databanks such as GenBank, PDB and FlyBase. They are supported by public funds, and are used for free by basic researchers to generate new knowledge as well as by the private sector to generate profits. Similarly, GBIF is a databank (though, unlike these others, a distributed one) for the species level of biodiversity, serving up data from many sources that were generated in large part using public funds.

Benefits to data providers

Frazier and colleagues (see Chapter 2 of this Manual) provide a number of reasons that holders of species-level biodiversity data should share those data, among which are:
• wider dissemination, thereby raising the profile of the institution;
• facilitating research through reducing transcription time and enabling novel combinations of species data