
Biodiversity information and cybertaxonomy:

International initiatives to inventory the earth's biodiversity (GBIF, Synthesys, Zoobank, EDIT, EoL, SpeciesBase,...)

Anne-Sophie ARCHAMBEAU, GBIF France, MNHN Géologie CP48, 43 rue Buffon, 75005 Paris. Phone: +33 (0)1 40 79 80 65. Mail: [email protected]

http://www.e-taxonomy.eu


Communication officer at GBIF France Anne-Sophie Archambeau

Taxonomy Summer School, 1-15 September 2008

Biodiversity Information:
● the science that promotes access to, and the sharing and use of, data and knowledge concerning biological diversity.

Taxonomy:
● naming, describing and classifying organisms
● including all the plants, animals and microorganisms of the world
● taxonomists have named about 1.8 million species, yet the total number of species is unknown and probably lies between 5 and 30 million.

=> The aim is to combine this knowledge to inventory the earth's biodiversity and make it accessible on the web.
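As a rough worked example of the figures quoted above: with about 1.8 million named species and an estimated total between 5 and 30 million, the share of species already described can be bracketed. The sketch below only restates that arithmetic; the function and variable names are illustrative.

```python
# Figures quoted in the text: about 1.8 million named species,
# estimated total between 5 and 30 million species.
NAMED_MILLIONS = 1.8
TOTAL_LOW_MILLIONS = 5.0
TOTAL_HIGH_MILLIONS = 30.0


def described_share(named: float, total: float) -> float:
    """Return the fraction of species already named, for an assumed total."""
    return named / total


# A small total gives the most optimistic share, a large total the least.
best_case = described_share(NAMED_MILLIONS, TOTAL_LOW_MILLIONS)
worst_case = described_share(NAMED_MILLIONS, TOTAL_HIGH_MILLIONS)

print(f"Share of species described: between {worst_case:.0%} and {best_case:.0%}")
```

Under these estimates, somewhere between roughly 6% and 36% of the earth's species have been named so far.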

Creation and management of systematic information systems:

● From the field to the web: information systems are needed at every stage
● From knowledge to the representation of this knowledge: it is difficult to transfer information from the human brain to the computer's processor
● From scientists to a wide audience: the web effect (same data, different views)

=> Development of international standards and projects to find better ways to answer these needs

Chronology

● Mid-1960s: first computerization of a collection
● Late 1960s: algorithms for creating identification keys
● 1973: first congress on Computer Assisted Indexing
● 1980: DELTA (DEscription Language for TAxonomy)
● 1982: first congress on systematic databases
● 1983: XPER
● 1985: TDWG (Taxonomic Databases Working Group), attached as an IUBS commission in 1988

Source: adapted from N. Bailly, 2004

www.tdwg.org

TDWG: Taxonomic Databases Working Group, now called Biodiversity Information Standards

● Aim: to establish international collaboration among biological database projects and to facilitate data exchange (launched in 1985)
● Biodiversity Information Standards (TDWG) focuses on developing standards for the exchange of biological/biodiversity data:
  • Develop, adopt and promote standards and guidelines for the recording and exchange of data about organisms
  • Promote the use of standards through the most appropriate and effective means
  • Act as a forum for discussion through meetings and publications


TDWG Working Groups

● Biological Descriptions Interest Group
● Geospatial Interest Group
● Imaging Interest Group
● Invasive Species Interest Group
● Literature Interest Group
● Natural Collections Descriptions Interest Group
● Observation and Specimen Records
● Access to Biological Collections Data
● DarwinCore Task Group (DwC)
● Process Interest Group
● Taxonomic Names and Concepts Interest Group
● TDWG Architecture Group
● Globally Unique Identifiers
● TAPIR Task Group
● TDWG Infrastructure Project

TDWG prior standards:

● Authors of plant names
● Botanico-periodicum-huntianum
● Botanico-periodicum-huntianum/supplementum
● Economic botany data collection standard
● Floristic regions of the world
● Information standards and protocols for interchange of data
● Index Herbariorum, Part 1: The herbaria of the world
● International transfer format for botanic garden plant records
● Plant names in botanical databases
● Plant occurrence and status scheme
● Taxonomic literature ed. 2 and its supplements
● User's guide to the DELTA system
● World geographical scheme for recording plant distributions
● XDF: A language for the definition and exchange of biological data sets

TDWG current and draft standards:

● TDWG Current (2005) Standards:
  • Access to Biological Collection Data - version 2.06
  • Structured Descriptive Data
  • Taxonomic Concept Transfer Schema
● TDWG Draft Standards:
  • TDWG Standards Documentation Specification
  • TDWG Life Sciences Identifiers Applicability Statement

Chronology

● 1988: FishBase Initiative
● 1988: PANDORA
● 1992: Earth Summit in Rio (prepared by UNEP since 1988)
  • Establishment of the Convention on Biological Diversity
  • Clearing House Mechanism
  • 1996-2000: GTI (Global Taxonomy Initiative)
● 1993: Systematics Agenda 2000
● 1993-1996: CDEFD (A Common Datastructure for European Floristic Databases)
● 1994: Species 2000

http://www.fishbase.org

FishBase

FishBase is a global information system with all you ever wanted to know about fishes, and one of the first and most complete taxonomic databases (since 1988).

FishBase is a relational database with information catering to different professionals such as research scientists, fisheries managers and zoologists.

FishBase on the web contains practically all fish species known to science: 30,300 species, 266,200 common names, 46,000 pictures, 41,600 references, 1,560 collaborators and 23 million hits per month.

FishBase was developed at the WorldFish Center in collaboration with the Food and Agriculture Organization of the United Nations (FAO) and many other partners, and with support from the European Commission (EC). Since 2001 FishBase has been supported by a consortium of seven research institutions. FishBase is linked to most of the international biodiversity initiatives.

www.cbd.int

United Nations Convention on Biological Diversity (CBD)

● Objectives
  • Biodiversity conservation
  • Sustainable use of its components
  • Fair and equitable sharing of benefits arising from the use of genetic resources
● Organisation
  • COP: Conference of the Parties
  • SBSTTA: Subsidiary Body on Scientific, Technical and Technological Advice
● Since 1993


Convention on Biological Diversity (CBD)

● CHM: Clearing House Mechanism (www.cbd.int/chm)
● GTI: Global Taxonomy Initiative (www.cbd.int/gti): confronting the taxonomic impediment to biodiversity conservation, launched in 1996
  • Remove the taxonomic impediment
  • Reduce the shortage of taxonomists
  • Reduce the impact these deficiencies have on our ability to conserve, use and share the benefits of our biological diversity

Systematics Agenda 2000 (DIVERSITAS)

An initiative launched in 1993 by biological systematists in the USA, which proposes an intensive international programme over a 25-year period:
● to discover, describe and inventory global species diversity
● to synthesize this information into phylogenies and predictive classifications
● to develop an appropriate information system to handle the resulting information and to disseminate knowledge through databases

Related to DIVERSITAS: an international, non-governmental umbrella programme addressing the complex scientific questions posed by the loss of and change in global biodiversity. It was launched in 1991 by the United Nations Educational, Scientific and Cultural Organization (UNESCO), the Scientific Committee on Problems of the Environment (SCOPE) and the International Union of Biological Sciences (IUBS).

Chronology

● 1994: European Science Foundation Systematics Network
● 1994: Tree of Life
● 1996: OECD Megascience Forum Working Group on Biological Informatics
  • 1999-2001: implementation of GBIF (Global Biodiversity Information Facility)
● 1997: OBIS (Ocean Biogeographic Information System)
● 1997-1999: BioCISE (Biological Collection Information Service in Europe)

http://tolweb.org/

ToL: Tree of Life

● Objectives: to compile information about biodiversity and the evolutionary relationships of all organisms (phylogeny):
  • To present information about every species and significant group of organisms on Earth, living and extinct.
  • To present a modern scientific view of the evolutionary tree that unites all organisms on Earth. ToL pages are linked to one another hierarchically, in the form of the evolutionary tree of life. Starting with the root of all life on Earth and moving out along diverging branches to individual species, the structure of the ToL project illustrates the genetic connections between all living things.
  • To aid learning about and appreciation of biological diversity and the evolutionary Tree of Life.
  • To share information with other databases and analytical tools, and to phylogenetically link information from other databases.

http://www.iobis.org/
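The hierarchical linking of ToL pages, from the root of all life out to individual species, can be pictured as a simple tree structure. The following is a minimal sketch under that assumption; the `TreeNode` class, the `path_to` function and the tiny sample tree are hypothetical and heavily simplified, and do not reflect how the ToL project itself stores its pages.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TreeNode:
    """One page in a hypothetical tree: a group of organisms or a species."""
    name: str
    children: List["TreeNode"] = field(default_factory=list)

    def add(self, child: "TreeNode") -> "TreeNode":
        """Attach a child page and return it, so branches can be chained."""
        self.children.append(child)
        return child


def path_to(node: TreeNode, target: str) -> Optional[List[str]]:
    """Depth-first search returning the branch from `node` down to `target`."""
    if node.name == target:
        return [node.name]
    for child in node.children:
        sub_path = path_to(child, target)
        if sub_path is not None:
            return [node.name] + sub_path
    return None


# A tiny, heavily simplified excerpt, for illustration only.
root = TreeNode("Life on Earth")
eukaryotes = root.add(TreeNode("Eukaryotes"))
animals = eukaryotes.add(TreeNode("Animals"))
animals.add(TreeNode("Homo sapiens"))
root.add(TreeNode("Bacteria"))

branch = path_to(root, "Homo sapiens")
print(" > ".join(branch))
```

Walking from the root to a species reproduces the "diverging branches" navigation described for the ToL pages.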

OBIS: Ocean Biogeographic Information System

● Objectives: to make marine biogeographic data, from all over the world, freely available over the World Wide Web.
● OBIS provides:
  • taxonomically and geographically resolved data on marine life and the ocean environment;
  • interoperability with similar databases;
  • software tools for data exploration and analysis.
● 14 million records of 78,000 species from 251 databases
● OBIS working groups: Taxon Names Working Group, Visualisation Tools Working Group, Habitat Classification Working Group, Fishery Data Working Group, Discovery Metadata Working Group.

Chronology

● 1998-2004: Access to infrastructures
● 1998-2000: ERMS (European Register of Marine Species)
● 1999: CoML (Census of Marine Life)
● 1999: first time that biodiversity as a domain was funded by the European Research Framework Programme (5th PCRDT, FP5)
● 2000: FaEu (Fauna Europaea), EMP (Euro+Med PlantBase), ENHSIN (European Natural History Specimen Information Network)

European programmes for taxonomic reference lists:

The goal of these projects is to create a validated checklist of all the world's species in each domain (plants, animals, fungi and microbes):
● a list of valid species,
● a list of validated common names,
● the established synonymy.
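To make the three components of such a checklist concrete (valid names, common names, synonymy), here is a minimal sketch of how a synonym could be resolved to its valid name. The data layout, the function names and the two sample entries are assumptions chosen for illustration; they are not taken from any of the projects named here.

```python
from typing import Dict, List, Optional

# Hypothetical checklist: each valid (accepted) name maps to its synonyms
# and common names. Sample entries are for illustration only.
CHECKLIST: Dict[str, Dict[str, List[str]]] = {
    "Puma concolor": {
        "synonyms": ["Felis concolor"],
        "common_names": ["cougar"],
    },
    "Bellis perennis": {
        "synonyms": [],
        "common_names": ["daisy"],
    },
}


def build_index(checklist: Dict[str, Dict[str, List[str]]]) -> Dict[str, str]:
    """Map every known scientific name (valid or synonym) to its valid name."""
    index: Dict[str, str] = {}
    for valid_name, entry in checklist.items():
        index[valid_name.lower()] = valid_name
        for synonym in entry["synonyms"]:
            index[synonym.lower()] = valid_name
    return index


NAME_INDEX = build_index(CHECKLIST)


def resolve(name: str) -> Optional[str]:
    """Return the valid name for `name`, or None if it is not in the checklist."""
    return NAME_INDEX.get(name.strip().lower())


print(resolve("Felis concolor"))
```

Looking names up through a single index means that records filed under an old synonym and records filed under the valid name end up attached to the same species.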

These projects were funded by the European Commission for a period of four years (1 March 2000 - 1 March 2004) within the Fifth Framework Programme (FP5):

● Europaea + Mediflore -> Euro+Med PlantBase: www.emplantbase.org
● ERMS (European Register of Marine Species): www.marbef.org/data/erms.php
● Fauna Europaea (FaEu): www.faunaeur.org
● Species 2000 Europa, the European Catalogue of Life Project: www.sp2000europa.org/

www.sp2000.org

Species 2000

Species 2000 is a "federation" of database organisations working closely with users, taxonomists and sponsoring agencies.

The website provides a validated checklist of all the world's species (plants, animals, fungi and microbes). This is being achieved by bringing together an array of global species databases covering each of the major groups of organisms.

www.itis.gov

Integrated Taxonomic Information System

ITIS provides authoritative taxonomic information on plants, animals, fungi, and microbes of North America and the world.

=> In 2001, Species 2000 & ITIS decided to work together to create the Catalogue of Life (CoL).

The Catalogue of Life provides the taxonomic backbone to GBIF and the Encyclopedia of Life (EOL).

http://www.catalogueoflife.org/

Catalogue of Life (CoL)

● Objective: a comprehensive catalogue of all known species of organisms on Earth by the year 2011.
● The Catalogue is published as two products:
  • Annual Checklist: a fixed edition that can be cited and used as a common catalogue for comparative purposes by many organisations. It is available on CD-ROM and on the website. The seventh edition of the Annual Checklist contains 1,008,965 species.
  • Dynamic Checklist: a virtual catalogue operated on the Internet and available both to users and as an electronic web service. The Dynamic Checklist harvests taxonomic sectors and associated strands of hierarchical classification dynamically from the source databases across the Internet.

Chronology

● 2000-2002: All Species Foundation
● March 2001: GBIF established
● 2001: BioCASE (A Biodiversity Collection Access Service for Europe), FP5
● 2002: Earth Summit in Johannesburg
● 2003: ENBI (European Network for Biodiversity Information), FP5
● 2003: EuroCat (Species 2000 Europe)
● 2003: 6th PCRDT / FP6

www.biocase.org

Biological Collection Access Services

BioCASE: a transnational network of biological collections of all kinds.

Objectives: widespread unified access to distributed and heterogeneous European collection and observational databases, using open-source, system-independent software and open data standards and protocols.

BioCASE provides BCI, the Biodiversity Collections Index, online (www.biodiversitycollectionsindex.org/static/index.html).

www.gbif.org

GBIF: Global Biodiversity Information Facility

Main objectives and achievements in an international framework

GBIF's mission:

... make the world’s scientific biodiversity data freely and universally available via the Internet.

GBIF's objectives are:

to establish a distributed information infrastructure that serves primary biodiversity data
● with an initial focus on species- and specimen-level data,
● with links to ecosystem, ecological, molecular and genetic levels.

GBIF’s work is fully in line with the Convention on Biological Diversity (CBD) -> GTI, GSPC, Protected Areas, Invasives, 2010 target and CHM

GBIF history:

● June 1999: OECD Science Ministers (CSTP) endorsed GBIF and recommended that it be established as an independent organization
● 1 March 2001: GBIF established
  • 19 countries and organisations as founding members
● June 2001: Copenhagen selected to host the GBIF Secretariat
● February 2002: GBIF Secretariat operational
● February 2004: GBIF data portal online
● July 2008:
  • more than 140 million records are online
  • 81 countries and organisations

GBIF’s focus: biodiversity data

GBIF's place among international organizations:

Existing responsibilities of other groups

● GBIF avoids duplication of effort
● Partnerships are vital to GBIF
● Works to maximize benefits to all
● Can contribute to science, policy and applications

What primary data exists?

● 1-3 billion physical specimens in museums
  • Label data to be digitised
● 250-400 million digital data records off-line
  • Museums, observation networks, natural resource surveys, etc.
● More than 140 million records are online today through GBIF
  • Using agreed standards and formats

Benefits of mobilizing data:

● Recognition of the importance of collections and higher visibility of the institution or research project as useful to society
  • Global dissemination of the data
  • The source of the data gets credit when it is used
● Better management (supported by scientific data) of biological resources
● Public understanding of the contributions of scientists to society, and of biodiversity itself

GBIF Operational Areas:

● Network and Nodes Implementation: providing best practices and models for Participants to build and run their GBIF nodes
● Data Access and Database Interoperability (DADI): developing standards for linking biodiversity databases, and serving linked data through a common data portal
● Digitization (DIGIT): digitising primary biodiversity data
● Electronic Catalog of Names of Known Organisms (ECAT): developing a list of the scientific and common names of all 1.8 million known species
● Outreach and Capacity Building (OCB): helping countries and organizations to share and use biodiversity data

Everything GBIF does is in partnership with others

[Figure: GBIF topics and programmes, and the content area responsibilities of GBIF. The diagram places the GBIF work programmes (DADI, DIGIT, ECAT, OCB) and the network of GBIF national nodes alongside international programmes and the CBD items (2010 Indicators, GTI, GSPC, CHM). Programmes shown include TDWG, ITIS, Tree of Life, Catalogue of Life (Species 2000-ITIS), All Species, CDEFD, BioCISE, ENHSIN, OBIS/CoML, ERMS, BioCASE, EuroCat, Synthesys, FaEu, EMP, FishBase and ENBI (European Network for Biodiversity Information).]

GBIF will enable synergism among existing investments that is not possible at present.

GBIF’s achievements to date:

● Data Portal (http://data.gbif.org): more than 140 million biodiversity records available.
● Capacity Building: training sessions (English, French, Spanish). Putting data to use: data modelling workshops.
● Mentoring Program: to support building up nodes and foster N-N, S-S and N-S collaboration.
● Demo Projects: develop prototypes, proofs of concept and applications.
● Active pursuit of data repatriation

[Figure: biodiversity and information about it are unevenly distributed. The map contrasts biodiversity hotspots with the holders of large amounts of biodiversity data.]

What kind of information can be found in GBIF?

● Data on species occurrences (based on specimens or observations)
  • What?
  • When?
  • Where?
  • By whom?
● Names
  • Scientific names and classification
  • Common names
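The what/when/where/by whom questions map naturally onto the fields of an occurrence record. The sketch below shows one possible shape for such a record; the `Occurrence` class, its field names and the sample values are hypothetical and are not the schema used by the GBIF data portal.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Occurrence:
    """A hypothetical occurrence record: one specimen or one observation."""
    scientific_name: str            # What?
    event_date: Optional[date]      # When?
    country: str                    # Where?
    latitude: Optional[float]       # Where? (optional coordinates)
    longitude: Optional[float]
    recorded_by: str                # By whom?
    basis_of_record: str            # "specimen" or "observation"
    common_name: Optional[str] = None

    def summary(self) -> str:
        """One-line description answering what, where, when and by whom."""
        when = self.event_date.isoformat() if self.event_date else "unknown date"
        return " | ".join(
            [self.scientific_name, self.country, when,
             self.recorded_by, self.basis_of_record]
        )


# Invented sample values, for illustration only.
record = Occurrence(
    scientific_name="Puma concolor",
    event_date=date(2008, 9, 1),
    country="Example country",
    latitude=None,
    longitude=None,
    recorded_by="A. Collector",
    basis_of_record="specimen",
    common_name="cougar",
)

print(record.summary())
```

Keeping the date and coordinates optional reflects the fact that older specimen labels often lack one or the other.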

Use of the data portal: different ways to find the information

1. Explore directly
2. Search by words
3. Advanced search

Which species are present in a country?

1. Choose the country's initial letter
2. Choose the country in the list

Be careful:
• The list of species may not be complete
• GBIF needs more content

Actions: this box gives the available options

Summary of results

● Obtain a list of species based on the network information
● Find the datasets which contain information on that country
● Download the information in diverse formats for different uses

Which data providers give information on a country?

Choose the datasets you want to see with the checkboxes
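The two questions asked of the portal in these slides (which species are recorded in a country, and which datasets hold information on it) amount to grouping occurrence records by country. The sketch below illustrates that grouping on a handful of invented records; it is an assumption about the underlying logic, not a description of the portal's implementation, and all names and values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Invented sample records: (scientific name, country, dataset).
Record = Tuple[str, str, str]
RECORDS: List[Record] = [
    ("Puma concolor", "Country A", "Museum dataset 1"),
    ("Bellis perennis", "Country B", "Observation network 1"),
    ("Puma concolor", "Country A", "Observation network 1"),
    ("Bellis perennis", "Country A", "Museum dataset 1"),
]


def species_by_country(records: Iterable[Record]) -> Dict[str, List[str]]:
    """Return, for each country, the sorted list of species recorded there."""
    grouped = defaultdict(set)
    for name, country, _dataset in records:
        grouped[country].add(name)
    return {country: sorted(names) for country, names in grouped.items()}


def datasets_by_country(records: Iterable[Record]) -> Dict[str, List[str]]:
    """Return, for each country, the sorted list of datasets holding records."""
    grouped = defaultdict(set)
    for _name, country, dataset in records:
        grouped[country].add(dataset)
    return {country: sorted(names) for country, names in grouped.items()}


print(species_by_country(RECORDS)["Country A"])
print(datasets_by_country(RECORDS)["Country A"])
```

A species missing from such a list only means that no shared record exists for it yet, which is why the slides warn that the list may not be complete.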