The Enetwild Project

eNETwild Wildlife: collecting and sharing data on wildlife populations, transmitting animal and human disease agents

Standards for data collection on wildlife distribution and abundance

Guillaume Body, 17/01/2017, Parma

Plan

1. What is a data standard
2. Why do we need them
3. What do they apply to
4. Existing data standards
5. Current framework

1. Data standard: definition

1. What is a data standard?

“Standards” are documented agreements containing technical specifications or other precise criteria to be used consistently as rules, guidelines, or definitions of characteristics to ensure that materials, products, processes, and services are fit for their purpose

The challenge remains for any community of practice to develop community based vocabularies and content standards through identifying the important features and their properties within a particular domain and express these using GML application schemas

http://www.eubon.eu/getatt.php?filename=EU%20BON_D2.2_Data%20sharing%20tools_13350.pdf
https://www.iso.org/standards.html
http://tdwg.org/
http://geobon.org/essential-biodiversity-variables/guidance/standards-overview/

Example of a standard: INSPIRE “Species distribution”

It contains:
- a list of variables
- the relations between them
- the list of accepted values
- the format of the values

UML class diagram of the INSPIRE Species Distribution theme (from Figure 7 in the INSPIRE data specification)
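The four ingredients listed above (variables, relations, accepted values, formats) can be written down in a machine-checkable form. The sketch below is only an illustration: the field names (species_name, observation_date, count, counting_method), the formats and the helper validate_record are hypothetical and are not taken from the INSPIRE specification. Only the three counting-method values come from the INSPIRE code list.

```python
import re

# Hypothetical mini-standard: each variable has a format and, optionally,
# a closed list of accepted values. Field names are illustrative only.
STANDARD = {
    "species_name": {"format": r"^[A-Z][a-z]+ [a-z]+$"},     # "Genus species"
    "observation_date": {"format": r"^\d{4}-\d{2}-\d{2}$"},   # YYYY-MM-DD
    "count": {"format": r"^\d+$"},                            # non-negative integer
    "counting_method": {
        "format": r"^[a-z]+$",
        "accepted": {"counted", "estimated", "calculated"},
    },
}

# A relation between variables: a count must come with its counting method.
REQUIRES = {"count": "counting_method"}


def validate_record(record):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for name, value in record.items():
        rule = STANDARD.get(name)
        if rule is None:
            problems.append(f"{name}: not a variable of the standard")
            continue
        if not re.match(rule["format"], value):
            problems.append(f"{name}: bad format {value!r}")
        elif "accepted" in rule and value not in rule["accepted"]:
            problems.append(f"{name}: value {value!r} not in accepted list")
    for name, needed in REQUIRES.items():
        if name in record and needed not in record:
            problems.append(f"{name}: requires {needed}")
    return problems


good = {"species_name": "Canis lupus", "observation_date": "2017-01-17",
        "count": "3", "counting_method": "estimated"}
bad = {"species_name": "wolf", "count": "3", "observer": "GB"}

print(validate_record(good))
for problem in validate_record(bad):
    print(problem)
```

The second record fails on three counts: a vernacular name where a binomial is expected, a variable that the standard does not define, and a count without its counting method.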

Example of a standard: INSPIRE “PopulationSizeType”

CountingMethodValue [1]
  counted      The units defined by the countUnitValues have been counted.
  estimated    The units defined by the countUnitValues have been estimated.
  calculated   The units defined by the countUnitValues have been calculated using a modelling technique.

GeneralCountingUnitValue [2]
  colonies     individual organisms of the same species living closely together, usually for mutual benefit
  individuals  single, genetically distinct member of a population
  juvenile     not sexually mature individual
  larvae       a distinct juvenile form many animals undergo before metamorphosis into adults
  pairs        mated pairs
  shoal        A cluster of internal self-coordinated moving individuals, e.g. a fish flock.
  shoots       Shoots are counted when it is not possible to distinguish individuals, e.g. due to clonal growth.
  tufts        groups of plants of a single species growing so closely together that it is impossible to distinguish single individuals without destroying the occurrence

[1] http://inspire.ec.europa.eu/codelist/CountingMethodValue
[2] http://inspire.ec.europa.eu/codelist/GeneralCountingUnitValue
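A closed code list such as the two above maps naturally onto an enumeration: any value outside the list is rejected when the record is read. This is a minimal sketch, assuming the values of the two INSPIRE code lists shown above; the class names and the helper parse_population_size are hypothetical and are not part of INSPIRE.

```python
from enum import Enum


class CountingMethod(Enum):
    COUNTED = "counted"
    ESTIMATED = "estimated"
    CALCULATED = "calculated"


class CountingUnit(Enum):
    COLONIES = "colonies"
    INDIVIDUALS = "individuals"
    JUVENILE = "juvenile"
    LARVAE = "larvae"
    PAIRS = "pairs"
    SHOAL = "shoal"
    SHOOTS = "shoots"
    TUFTS = "tufts"


def parse_population_size(count, unit, method):
    """Hypothetical helper: reject any value outside the two code lists."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return {
        "count": count,
        "unit": CountingUnit(unit),        # raises ValueError if not in the list
        "method": CountingMethod(method),  # raises ValueError if not in the list
    }


size = parse_population_size(12, "pairs", "counted")
print(size["unit"].value, size["method"].value)

try:
    parse_population_size(12, "packs", "counted")
except ValueError as err:
    print("rejected:", err)
```

A unit such as "packs" may be meaningful to the person who recorded it, but because it is not in the code list it cannot be compared with other datasets and is refused.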

Development of standards for sampling-event data

2. Data standard: the need

2. Why do we need them?

To be able to use the data! A standard is the common language.

E.g.: wolf data summarised at commune level, from two files

File "ETAT_DDT_biennale2017_DDT" (Avérée = confirmed, Non-avérée = not confirmed)

DPT  INSEE_  NOM_COMMUNE          ZONE
01   01187   HOTONNE              Avérée
01   01292   Le PETIT-ABERGEMENT  Avérée
01   01273   NEUVILLE-SUR-AIN     Avérée
01   01176   Le GRAND-ABERGEMENT  Non-avérée
04   04006   ALLOS                Avérée
04   04020   BARLES               Avérée
04   04023   BAYONS               Avérée

File "Fichier zonage2004" (Régulière = regular, Occasionnelle = occasional)

N°Dpt  N°INSEE  Nom commune  Statut exercice 2017
04     04005    ALLONS       Régulière
04     04006    ALLOS        Régulière
04     04007    ANGLES       Régulière
04     04008    ANNOT        Régulière
04     04013    AUBIGNOSC    Occasionnelle
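The two wolf files above describe the same kind of information with different column names and different vocabularies. The sketch below uses a few rows copied from them to show that renaming the columns is the easy part, while the status vocabularies stay incomparable without an agreed code list. The target names (department, insee_code, commune, status) and the helper harmonize are hypothetical.

```python
# Rows copied from the two files shown above (subset).
ddt_rows = [
    {"DPT": "04", "INSEE_": "04006", "NOM_COMMUNE": "ALLOS", "ZONE": "Avérée"},
    {"DPT": "04", "INSEE_": "04020", "NOM_COMMUNE": "BARLES", "ZONE": "Avérée"},
]
zonage_rows = [
    {"N°Dpt": "04", "N°INSEE": "04005", "Nom commune": "ALLONS",
     "Statut exercice 2017": "Régulière"},
    {"N°Dpt": "04", "N°INSEE": "04006", "Nom commune": "ALLOS",
     "Statut exercice 2017": "Régulière"},
]

# Hypothetical common variable names for both files.
RENAME_DDT = {"DPT": "department", "INSEE_": "insee_code",
              "NOM_COMMUNE": "commune", "ZONE": "status"}
RENAME_ZONAGE = {"N°Dpt": "department", "N°INSEE": "insee_code",
                 "Nom commune": "commune", "Statut exercice 2017": "status"}


def harmonize(rows, rename, source):
    """Rename the columns of each row and remember which file it came from."""
    return [{**{rename[key]: value for key, value in row.items()},
             "source": source}
            for row in rows]


merged = (harmonize(ddt_rows, RENAME_DDT, "ddt")
          + harmonize(zonage_rows, RENAME_ZONAGE, "zonage"))

# Group the status values by commune code. INSEE codes stay text so that
# the leading zero of "04006" is not lost.
by_commune = {}
for row in merged:
    by_commune.setdefault(row["insee_code"], {})[row["source"]] = row["status"]

# Communes present in both files with status values that do not match.
conflicts = {code: statuses for code, statuses in by_commune.items()
             if len(set(statuses.values())) > 1}
print(conflicts)
```

ALLOS (04006) comes out as "Avérée" in one file and "Régulière" in the other. Whether these mean the same thing cannot be decided by code; it needs a shared vocabulary, which is what the data standard provides.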

E.g.: « oiseaux de passage ONCFS-FNC-FDC » bird survey (GRIVE DRAINE = mistle thrush; Somme contact auditif = sum of auditory contacts)

Code carte IGN  Annee  Nom espece    Code espece  Somme contact auditif  X         Y        XL93      YL93
0316            1999   GRIVE DRAINE  GD           0                      68837,58  2415701  120489,1  6853408
0317            1999   GRIVE DRAINE  GD           0                      70842,47  2397795
0415            1999   GRIVE DRAINE  GD           0                      100160,1  2432499  151922,7  6869944
0416            1999   GRIVE DRAINE  GD           3                      98639,5   2413531  150251,5  6851002
0417            1999   GRIVE DRAINE  GD           3                      97163,03  2395435  148631,5  6832930
0418            1999   GRIVE DRAINE  GD           3                      95477,88  2376366  146795,2  6813888
0419            1999   GRIVE DRAINE  GD           1                      93813,82  2355890  144968,8  6793441
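The bird survey extract above shows three typical obstacles to reuse: map codes with a leading zero, numbers written with a decimal comma, and a row with missing XL93/YL93 coordinates. The sketch below parses two of the rows. The semicolon-separated layout, the English field names and the helpers parse_decimal and parse_row are assumptions made for the illustration; the slide does not show the actual file format.

```python
# English field names are hypothetical stand-ins for the original headers.
FIELDS = ["map_code", "year", "species_name", "species_code",
          "contacts", "x", "y", "x_l93", "y_l93"]

# Two rows from the extract above, written as semicolon-separated text
# (assumed layout).
raw_rows = [
    "0316;1999;GRIVE DRAINE;GD;0;68837,58;2415701;120489,1;6853408",
    "0317;1999;GRIVE DRAINE;GD;0;70842,47;2397795;;",
]


def parse_decimal(text):
    """Parse a number written with a decimal comma; empty text gives None."""
    text = text.strip()
    return float(text.replace(",", ".")) if text else None


def parse_row(line):
    values = dict(zip(FIELDS, line.split(";")))
    return {
        "map_code": values["map_code"],        # kept as text: "0316" != 316
        "year": int(values["year"]),
        "species_code": values["species_code"],
        "contacts": int(values["contacts"]),
        "x": parse_decimal(values["x"]),
        "y": parse_decimal(values["y"]),
        "x_l93": parse_decimal(values["x_l93"]),
        "y_l93": parse_decimal(values["y_l93"]),
    }


records = [parse_row(line) for line in raw_rows]
incomplete = [rec["map_code"] for rec in records
              if rec["x_l93"] is None or rec["y_l93"] is None]
print(records[0]["x"], incomplete)
```

Each of these choices (text or number, comma or point, how a missing value is written) is exactly what a data standard fixes once, so that every user does not have to guess it again.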

We need data standards to:

- store data: same list of variables, same vocabulary
- compare data: same meaning, same unit
- keep the integrity of data: known reference lists
- process data: complete list of variables

We also need metadata standards!

Metadata = data about data: a description of the context of the dataset

- species reference list
- administrative reference list
- geographic projection
- investigator
- methodology
- sampling effort

We need metadata standards to:

- track data: who did the job, when, for what, how to access them, how to cite them
- understand data: which kind of data, which methodology
- find data: period covered by the dataset, area covered by the dataset, taxonomic group covered by the dataset
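The "find data" use of metadata can be sketched as a search over a small catalogue of dataset descriptions, filtering on period, area and taxonomic group. Everything in this example is hypothetical: the DatasetMetadata fields, the helper find_datasets, and the two catalogue entries, whose values are placeholders invented for the illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatasetMetadata:
    title: str
    investigator: str       # who did the job
    methodology: str        # which methodology
    taxonomic_group: str    # taxonomic group covered
    start: date             # period covered
    end: date
    bbox: tuple             # area covered: (min_lon, min_lat, max_lon, max_lat)
    citation: str           # how to cite the dataset


def find_datasets(catalog, taxon, day, lon, lat):
    """Return the datasets covering the given taxon, date and location."""
    hits = []
    for meta in catalog:
        min_lon, min_lat, max_lon, max_lat = meta.bbox
        if (meta.taxonomic_group == taxon
                and meta.start <= day <= meta.end
                and min_lon <= lon <= max_lon
                and min_lat <= lat <= max_lat):
            hits.append(meta)
    return hits


# Placeholder entries, not real datasets.
catalog = [
    DatasetMetadata("Dataset A", "Investigator 1", "road count", "Mammalia",
                    date(2010, 1, 1), date(2015, 12, 31),
                    (0.0, 40.0, 10.0, 50.0), "placeholder citation A"),
    DatasetMetadata("Dataset B", "Investigator 2", "point count", "Aves",
                    date(1999, 1, 1), date(1999, 12, 31),
                    (-5.0, 42.0, 8.0, 51.0), "placeholder citation B"),
]

found = find_datasets(catalog, "Aves", date(1999, 6, 1), lon=2.0, lat=48.0)
print([meta.title for meta in found])
```

The search only works because every dataset describes its period, area and taxonomic group in the same fields and formats, which is the role of the metadata standard.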

If you need something, it must be in the data/metadata standard.

Everything that is not in the standard does not exist.

Everything that is only in free text does not exist.

The eNETwild project

Case A: A validated protocol exists for the species. The data standard, including the description of the protocol, is established; the literature review then starts and fills in the database that can be used for modelling.

Case B: The data standard is developed only once a protocol is validated for a species. The literature review and the filling of the database start only after the format is defined.

Case C: A protocol cannot be validated to evaluate the parameter (e.g. abundance). A standard that records the different unvalidated protocols may be developed, and the database is not filled, as the data could not be used.

[Project workflow diagram, Year 1 to Year 6. Labels recoverable from the slide include: project management; continuous method development to assess abundance; database census; data format standard; metadata standard; strategic plan development; preliminary models and maps to identify where we have no or scarce data; data collection; validation and quality assessment by professionals; integration into the wildlife database; predictive modelling of wildlife populations and diseases; prepare maps and charts.]

FAIR principles:

- Findable: role of metadata
- Accessible: role of the database or metadata
- Interoperable: role of the data standard
- Reusable: role of the data sharing agreement

What data standards do not do:

- guarantee the reliability of the methodology
- guarantee that the data are suitable for their intended use

3. Data standard: application

3. What do they apply to?

Raw/primary data:
- occurrence
- distance sampling
- hunt kill
- CMR (capture-mark-recapture)
- road count

Processed data:
- distribution
- abundance index
- hunting bag
- density

Metadata

4. Data standard: existing standards

4. Existing data standards

Darwin Core + Darwin Core Event extension

ABCD: Access to Biological Collections Data

INSPIRE directive

EML: Ecological Metadata Language

Darwin Core and ABCD are minimally restrictive, so the way they are used in a project needs to be documented.
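To illustrate what "minimally restrictive" means in practice, the sketch below writes one occurrence record using Darwin Core term names and then checks it against a project-level list of mandatory terms. The term names are Darwin Core terms, but Darwin Core itself does not impose the PROJECT_REQUIRED set: that set, the helper missing_terms and all the record values are assumptions invented for the example.

```python
import csv
import io

# Darwin Core term names; the values are invented for illustration.
record = {
    "occurrenceID": "example:occ:0001",
    "basisOfRecord": "HumanObservation",
    "scientificName": "Turdus viscivorus",
    "eventDate": "1999-05-01",
    "decimalLatitude": "48.0",
    "decimalLongitude": "2.0",
    "geodeticDatum": "EPSG:4326",
    "individualCount": "3",
    "samplingProtocol": "point count",
}

# Hypothetical project rule: terms that must be filled in for the data to
# be usable. This is the kind of decision a project has to document.
PROJECT_REQUIRED = {"occurrenceID", "scientificName", "eventDate",
                    "samplingProtocol"}


def missing_terms(rec):
    """Return the required terms that are absent or empty in a record."""
    filled = {term for term, value in rec.items() if value}
    return sorted(PROJECT_REQUIRED - filled)


# Serialise the record as a one-row CSV table with Darwin Core headers.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(record))
writer.writeheader()
writer.writerow(record)
print(buffer.getvalue())

print(missing_terms(record))
print(missing_terms({"scientificName": "Turdus viscivorus"}))
```

A record holding only a species name is still expressed in valid Darwin Core terms, yet it is of little use for abundance work. The additional rules are what a project has to write down on top of the standard.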

Raw/primary data
  Occurrence         Darwin Core / ABCD
  Distance sampling  none
  CMR                none
  Road count         Darwin Core Event?

Processed data
  Distribution       INSPIRE
  Abundance index    none
  Density            none

Metadata             EML / ABCD

5. Data standard: current framework

5. Current framework for developing standards

The network of biodiversity informatics organisations.

The network visualisation was created using NodeXL (Version 1.0.1.229) and was laid out with the Harel–Koren Fast Multiscale algorithm and then adjusted manually to remove overlaps. The colours represent clusters identified using the Girvan–Newman algorithm.

Part of: Bingham H, Doudin M, Weatherdon L, Despot-Belmonte K, Wetzel F, Groom Q, Lewis E, Regan E, Appeltans W, Güntsch A, Mergen P, Agosti D, Penev L, Hoffmann A, Saarenmaa H, Geller G, Kim K, Kim H, Archambeau A, Häuser C, Schmeller D, Geijzendorffer I, García Camacho A, Guerra C, Robertson T, Runnel V, Valland N, Martin C (2017) The Biodiversity Informatics Landscape: Elements, Connections and Opportunities. Research Ideas and Outcomes 3: e14059. https://doi.org/10.3897/rio.3.e14059

GEO BON Species population objectives 2017-2020

[Diagram, levels from top to bottom:]

Scenarios
Indicators
Essential Biodiversity Variables (first level of abstraction)
Observations

Source: GEO BON implementation plan 2017-2020

Essential Biodiversity Variables
http://geobon.org/essential-biodiversity-variables/classes/

EBV class               EBV candidate
Genetic composition     Co-ancestry
                        Allelic diversity
                        Population genetic differentiation
                        Breed and variety diversity
Species populations     Species distribution
                        Population abundance
                        Population structure by age/size class
Species traits          Phenology
                        Body mass
                        Natal dispersion distance
                        Migratory behavior
                        Demographic traits
                        Physiological traits
Community composition   Taxonomic diversity
                        Species interactions
Ecosystem function      Net primary productivity
                        Secondary productivity
                        Nutrient retention
                        Disturbance regime
Ecosystem structure     Habitat structure
                        Ecosystem extent and fragmentation
                        Ecosystem composition by functional type

Hugo et al. (2016) Chapter 11: Global infrastructure for biodiversity data and services. In: The GEO Handbook on Biodiversity Observation Networks

End

Thank you for your attention