The eNETwild Project
eNETwild Wildlife: collecting and sharing data on wildlife populations, transmitting animal and human disease agents
Standards for data collection on wildlife distribution and abundance
Guillaume Body, 17/01/2017, Parma
Plan
1. What is a data standard
2. Why do we need them
3. On what do they apply
4. Existing data standards
5. Current framework
1. Data standard: definition
1. What is a data standard?
“Standards” are documented agreements containing technical specifications or other precise criteria to be used consistently as rules, guidelines, or definitions of characteristics to ensure that materials, products, processes, and services are fit for their purpose
The challenge remains for any community of practice to develop community based vocabularies and content standards through identifying the important features and their properties within a particular domain and express these using GML application schemas
http://www.eubon.eu/getatt.php?filename=EU%20BON_D2.2_Data%20sharing%20tools_13350.pdf
https://www.iso.org/standards.html
http://tdwg.org/
http://geobon.org/essential-biodiversity-variables/guidance/standards-overview/
Example of a standard, from INSPIRE: “Species distribution”
It contains:
- List of variables
- Their relations
- List of accepted values
- Format of values
[Figure: UML class diagram of INSPIRE corresponding to the Species Distribution (from Figure 7 in INSPIRE Data specification)]
Example of a standard, from INSPIRE: “PopulationSizeType”

CountingMethodValue [1]
counted: The units defined by the countUnitValues have been counted.
estimated: The units defined by the countUnitValues have been estimated.
calculated: The units defined by the countUnitValues have been calculated using a modelling technique.

GeneralCountingUnitValue [2]
colonies: individual organisms of the same species living closely together, usually for mutual benefit
individuals: single, genetically distinct member of a population
juvenile: not sexually mature individual
larvae: a distinct juvenile form many animals undergo before metamorphosis into adults
pairs: mated pairs
shoal: A cluster of internal self-coordinated moving individuals, e.g. a fish flock.
shoots: Shoots are counted when it is not possible to distinguish individuals, e.g. due to clonal growth.
tufts: groups of plants of a single species growing so closely together that it is impossible to distinguish single individuals without destroying the occurrence

[1] http://inspire.ec.europa.eu/codelist/CountingMethodValue
[2] http://inspire.ec.europa.eu/codelist/GeneralCountingUnitValue
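A minimal sketch of what "list of accepted values" means in practice: a record is checked against the two code lists quoted above. The code-list values are those shown on the slide; the record layout and the field names (`countingMethod`, `countingUnit`, `count`) are assumptions made for illustration, not the INSPIRE schema itself.

```python
# Code-list values as listed on the slide (not the complete INSPIRE lists).
COUNTING_METHODS = {"counted", "estimated", "calculated"}
COUNTING_UNITS = {
    "colonies", "individuals", "juvenile", "larvae",
    "pairs", "shoal", "shoots", "tufts",
}


def validate_population_size(record: dict) -> list[str]:
    """Return the list of problems found; an empty list means the record conforms."""
    problems = []
    method = record.get("countingMethod")  # hypothetical field name
    unit = record.get("countingUnit")      # hypothetical field name
    count = record.get("count")            # hypothetical field name
    if method not in COUNTING_METHODS:
        problems.append(f"countingMethod not in code list: {method!r}")
    if unit not in COUNTING_UNITS:
        problems.append(f"countingUnit not in code list: {unit!r}")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(count, int) or isinstance(count, bool) or count < 0:
        problems.append(f"count must be a non-negative integer: {count!r}")
    return problems


ok = validate_population_size(
    {"countingMethod": "estimated", "countingUnit": "pairs", "count": 12}
)
bad = validate_population_size(
    {"countingMethod": "guessed", "countingUnit": "herd", "count": -1}
)
```

Here `ok` is empty, while `bad` reports one problem per field: a free-text value such as "guessed" cannot be interpreted by anyone else, which is the point of a controlled vocabulary.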
Development of standards for sampling-event data
2. Data standard: the need
2. Why do we need them?
To be able to use the data! A data standard is the common language.
E.g.: Wolf summary data by commune, as stored in two different files

ETAT_DDT_biennale2017_DDT
DPT  INSEE_  NOM_COMMUNE          ZONE
01   01187   HOTONNE              Avérée
01   01292   Le PETIT-ABERGEMENT  Avérée
01   01273   NEUVILLE-SUR-AIN     Avérée
01   01176   Le GRAND-ABERGEMENT  Non-avérée
04   04006   ALLOS                Avérée
04   04020   BARLES               Avérée
04   04023   BAYONS               Avérée

File zonage2004
N°Dpt  N°INSEE  Nom commune  Statut exercice 2017
04     04005    ALLONS       Régulière
04     04006    ALLOS        Régulière
04     04007    ANGLES       Régulière
04     04008    ANNOT        Régulière
04     04013    AUBIGNOSC    Occasionnelle

(Avérée = confirmed, Non-avérée = not confirmed, Régulière = regular, Occasionnelle = occasional; "Statut exercice 2017" = status for the 2017 period.)
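The two wolf files above describe the same kind of information with different column names and different status vocabularies. The sketch below renames the columns to one common layout; the target names (`department_code`, `commune_code`, `commune_name`, `status`) are hypothetical, and `ETAT_DDT_biennale2017_DDT` is assumed to be the name of the first file.

```python
# Original column names (left) mapped to a hypothetical common schema (right).
COLUMN_MAPS = {
    "ETAT_DDT_biennale2017_DDT": {
        "DPT": "department_code",
        "INSEE_": "commune_code",
        "NOM_COMMUNE": "commune_name",
        "ZONE": "status",
    },
    "zonage2004": {
        "N°Dpt": "department_code",
        "N°INSEE": "commune_code",
        "Nom commune": "commune_name",
        "Statut exercice 2017": "status",
    },
}


def to_common_schema(source: str, row: dict) -> dict:
    """Rename the columns of one row and remember which file it came from."""
    mapping = COLUMN_MAPS[source]
    out = {target: row[original] for original, target in mapping.items()}
    out["source"] = source
    return out


a = to_common_schema(
    "ETAT_DDT_biennale2017_DDT",
    {"DPT": "04", "INSEE_": "04006", "NOM_COMMUNE": "ALLOS", "ZONE": "Avérée"},
)
b = to_common_schema(
    "zonage2004",
    {"N°Dpt": "04", "N°INSEE": "04006", "Nom commune": "ALLOS",
     "Statut exercice 2017": "Régulière"},
)
same_commune = a["commune_code"] == b["commune_code"]
same_vocabulary = a["status"] == b["status"]
```

Renaming columns is the easy part. For ALLOS (04006) the two files can now be joined on `commune_code`, but the status values ("Avérée" versus "Régulière") still come from two different vocabularies, and only an agreed code list can make them comparable.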
E.g.: “oiseaux de passage ONCFS-FNC-FDC” bird survey

Code carte IGN  Annee  Nom espece    Code espece  Somme contact auditif  X         Y        XL93      YL93
0316            1999   GRIVE DRAINE  GD           0                      68837,58  2415701  120489,1  6853408
0317            1999   GRIVE DRAINE  GD           0                      70842,47  2397795
0415            1999   GRIVE DRAINE  GD           0                      100160,1  2432499  151922,7  6869944
0416            1999   GRIVE DRAINE  GD           3                      98639,5   2413531  150251,5  6851002
0417            1999   GRIVE DRAINE  GD           3                      97163,03  2395435  148631,5  6832930
0418            1999   GRIVE DRAINE  GD           3                      95477,88  2376366  146795,2  6813888
0419            1999   GRIVE DRAINE  GD           1                      93813,82  2355890  144968,8  6793441

(Columns: IGN map sheet code, year, species name, species code, sum of auditory contacts, coordinates. GRIVE DRAINE = Mistle Thrush.)
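The bird survey table above shows three things a standard has to pin down: the map code has a leading zero, numbers use a decimal comma, and one row has no XL93/YL93 values. A minimal parsing sketch follows; the semicolon-delimited input and the English key names are assumptions made for illustration, not the survey's actual file format.

```python
from typing import Optional

# Hypothetical English keys for the nine columns of the survey table.
COLUMNS = [
    "map_code", "year", "species_name", "species_code",
    "auditory_contacts", "x", "y", "x_l93", "y_l93",
]


def parse_decimal(text: str) -> Optional[float]:
    """Parse a number written with a decimal comma; empty text gives None."""
    text = text.strip()
    if not text:
        return None
    return float(text.replace(",", "."))


def parse_row(line: str) -> dict:
    """Parse one semicolon-delimited line (the delimiter is an assumption)."""
    values = line.split(";")
    values += [""] * (len(COLUMNS) - len(values))  # pad missing trailing cells
    raw = dict(zip(COLUMNS, values))
    return {
        "map_code": raw["map_code"],  # kept as text so the leading zero survives
        "year": int(raw["year"]),
        "species_name": raw["species_name"],
        "species_code": raw["species_code"],
        "auditory_contacts": int(raw["auditory_contacts"]),
        "x": parse_decimal(raw["x"]),
        "y": parse_decimal(raw["y"]),
        "x_l93": parse_decimal(raw["x_l93"]),
        "y_l93": parse_decimal(raw["y_l93"]),
    }


complete = parse_row("0316;1999;GRIVE DRAINE;GD;0;68837,58;2415701;120489,1;6853408")
incomplete = parse_row("0317;1999;GRIVE DRAINE;GD;0;70842,47;2397795;;")
```

The missing coordinates of sheet 0317 come out as `None` rather than as 0 or an empty string, so the gap stays visible to whoever processes the data next.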
We need data standards to:
- store data: same list of variables, same vocabulary
- compare data: same meaning, same unit
- keep the integrity of data: known reference lists
- process data: complete list of variables
We also need metadata standards!
Metadata = data about data: a description of the context of the dataset
- Species reference list
- Administrative reference list
- Geographic projection
- Investigator
- Methodology
- Sampling effort
We need metadata standards to:
- track data: who did the job, when, for what purpose, how to access the data, how to cite them
- understand data: which kind of data, which methodology
- find data: period, area and taxonomic group covered by the dataset
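As a sketch of how structured metadata makes a dataset findable, the record below carries the items listed above (who, how, citation, access, period, area, taxonomic group), and a small function answers "does this dataset cover this taxon, date and place?". The class, its field names and every example value are invented for illustration; they do not follow EML or any other existing metadata standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    title: str
    investigator: str
    methodology: str
    citation: str
    access_url: str
    start: date
    end: date
    bbox: tuple  # (min_lon, min_lat, max_lon, max_lat), in decimal degrees
    taxa: frozenset


def covers(meta: DatasetMetadata, taxon: str, on_date: date,
           lon: float, lat: float) -> bool:
    """True if the dataset covers the taxon, the date and the location."""
    min_lon, min_lat, max_lon, max_lat = meta.bbox
    return (
        taxon in meta.taxa
        and meta.start <= on_date <= meta.end
        and min_lon <= lon <= max_lon
        and min_lat <= lat <= max_lat
    )


# Entirely made-up example values.
example = DatasetMetadata(
    title="Example road-count dataset",
    investigator="Example investigator",
    methodology="road count",
    citation="Example citation",
    access_url="https://example.org/dataset",
    start=date(2010, 1, 1),
    end=date(2015, 12, 31),
    bbox=(5.0, 43.0, 8.0, 46.0),
    taxa=frozenset({"Canis lupus"}),
)

found = covers(example, "Canis lupus", date(2012, 6, 1), 6.5, 44.2)
not_found = covers(example, "Canis lupus", date(2018, 6, 1), 6.5, 44.2)
```

The search works only because period, area and taxon are separate typed fields; the same information written in a free-text description could not be queried this way.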
If you need something, it must be in the data/metadata standard.
Anything that is not in the standard does not exist.
Anything that is only in free text does not exist.
The eNETwild project
Case A: A validated protocol exists for the species: the data standard, including the description of the protocol, is established; the literature review then starts and fills in the database, which can be used for modelling.
Case B: The data standard is only developed once a protocol is validated for a species. The literature review and the filling of the database only start after the format is defined.
Case C: A protocol cannot be validated to evaluate the parameter (e.g. abundance): a standard that records the different unvalidated protocols may be developed, and the database is not filled, as the data could not be used.
[Figure: eNETwild project workflow, Year 1 to Year 6. Legible labels: project management; database census; continuous method development to assess abundance; data format standard; metadata standard; networking and stakeholder analysis; strategic plan; development of harmonized protocols for data collection; data collection; validation and quality assessment by professionals; integration into the wildlife database; preliminary models and maps to identify where data are absent or scarce; predictive modelling of wildlife populations and diseases; ad-hoc scientific and technical advice to EFSA; maps and charts.]
FAIR principles:
- Findable: role of the metadata
- Accessible: role of the database or metadata
- Interoperable: role of the data standard
- Reusable: role of the data sharing agreement
What data standards do not guarantee:
- Reliability of the methodology
- Fit between the data and their intended use
3. Data standard: application
3. What do they apply to?
Raw/primary data
- Occurrence
- Distance sampling
- Hunt kill
- CMR (capture-mark-recapture)
- Road count
Processed data
- Distribution
- Abundance index
- Hunting bag
- Density
Metadata
4. Data standard: what already exists
4. Existing data standards
- Darwin Core + Darwin Core Event extension
- ABCD: Access to Biological Collections Data
- INSPIRE directive
- EML: Ecological Metadata Language
Darwin Core and ABCD impose few constraints, so the way each dataset uses them needs to be explained.
Raw/primary data
- Occurrence: Darwin Core / ABCD
- Distance sampling: none
- CMR: none
- Road count: Darwin Core Event?
Processed data
- Distribution: INSPIRE
- Abundance index: none
- Density: none
Metadata: EML / ABCD
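The mapping above can be written down as a small lookup table, which makes the gaps explicit. This is only a restatement of the slide in Python; the dictionary and function names are hypothetical, and "Darwin Core Event" for road counts is marked as uncertain on the slide itself.

```python
# Data type -> existing standards, as listed on the slide.
EXISTING_STANDARDS = {
    "occurrence": ["Darwin Core", "ABCD"],
    "distance sampling": [],
    "CMR": [],
    "road count": ["Darwin Core Event"],  # followed by "?" on the slide
    "distribution": ["INSPIRE"],
    "abundance index": [],
    "density": [],
    "metadata": ["EML", "ABCD"],
}


def without_standard(mapping: dict) -> list:
    """Data types for which the slide lists no existing standard."""
    return [data_type for data_type, standards in mapping.items() if not standards]


gaps = without_standard(EXISTING_STANDARDS)
```

`gaps` lists distance sampling, CMR, abundance index and density, the data types for which a standard still has to be developed.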
5. Data standard: current framework
5. Current framework for developing standards
[Figure: The network of biodiversity informatics organisations.]
The network visualisation was created using NodeXL (Version 1.0.1.229) and was laid out with the Harel–Koren Fast Multiscale algorithm and then adjusted manually to remove overlaps. The colours represent clusters identified using the Girvan–Newman algorithm.
Part of: Bingham H, Doudin M, Weatherdon L, Despot-Belmonte K, Wetzel F, Groom Q, Lewis E, Regan E, Appeltans W, Güntsch A, Mergen P, Agosti D, Penev L, Hoffmann A, Saarenmaa H, Geller G, Kim K, Kim H, Archambeau A, Häuser C, Schmeller D, Geijzendorffer I, García Camacho A, Guerra C, Robertson T, Runnel V, Valland N, Martin C (2017) The Biodiversity Informatics Landscape: Elements, Connections and Opportunities. Research Ideas and Outcomes 3: e14059. https://doi.org/10.3897/rio.3.e14059
GEO BON Species population objectives 2017-2020
[Figure: levels of abstraction, as listed on the slide]
- Scenarios
- Indicators
- Essential Biodiversity Variables (first level of abstraction)
- Observations
Source: GEO BON implementation plan 2017-2020
Essential Biodiversity Variables
http://geobon.org/essential-biodiversity-variables/classes/

EBV class: EBV candidates
- Genetic composition: Co-ancestry; Allelic diversity; Population genetic differentiation; Breed and variety diversity
- Species populations: Species distribution; Population abundance; Population structure by age/size class
- Species traits: Phenology; Body mass; Natal dispersion distance; Migratory behavior; Demographic traits; Physiological traits
- Community composition: Taxonomic diversity; Species interactions
- Ecosystem function: Net primary productivity; Secondary productivity; Nutrient retention; Disturbance regime
- Ecosystem structure: Habitat structure; Ecosystem extent and fragmentation; Ecosystem composition by functional type
The GEO Handbook on Biodiversity Observation Network (2016)
Hugo et al., Chapter 11: Global infrastructure for Biodiversity data and services
End
Thanks for your attention