The SeaDataNet pan-European infrastructure for marine and ocean data management

Michèle Fichaut, Helge Sagen, Serge Scory and the SeaDataNet Consortium

Marine data workshop, Helsinki, 21 November 2019
[email protected] – www.seadatanet.org

What is SeaDataNet?

A pan-European infrastructure set up and operated for managing marine and ocean data, in cooperation with the NODCs and data focal points of 34 countries bordering the European seas.

• 1990s: Metadata directories, Medar/MedAtlas
• 2002-2005: Sea-Search (FP5)
• 2006-2011: SeaDataNet (FP6)
• 2011-2015: SeaDataNet II (FP7)
• 2016-2020: SeaDataCloud (H2020)

A legal entity: SeaDataNet AISBL, for the sustainability of the Consortium

SeaDataNet portal

With access to services (http://www.seadatanet.org):
• Standards & common vocabularies
• Software tools both for data centres and users
• Data and metadata
• Products


SeaDataNet metadata directories

• EDMO: Organisations
• EDMERP: Projects
• CSR: Cruises
• EDIOS: Observing systems
• EDMED: Data sets
• CDI: Data

SeaDataCloud - Cooperation with EUDAT

• A consortium of 20 High Performance Computing (HPC) centres that also offer storage resources
• 5 EUDAT members are partners of SeaDataCloud

Present SeaDataNet architecture

Proposed upgraded architecture with data replication, advanced services and VRE in the cloud


CDI catalogue: discovery and access to data

Service for discovery and unified data access: cdi.seadatanet.org/search
• Search data and download selection
• EUDAT Cloud
• More than 110 data centres connected
• European data sources: ≈ 730 source laboratories

Areas

Trajectories

Vertical profiles or time series

• Data from 1800 to 2019
• 2.3 million CDIs for physical, chemical, biological and geoscience data
• 87 % of unrestricted or SDN license data

SeaDataNet products

[Figure: the QC-Loop. Labels: CENTRAL CDI; Data harvesting; File and parameter aggregation; QC analysis; Analysis of data anomalies; SeaDataNet Quality Checks Strategy (QCS); Aggregated datasets and climatologies; Improvement of the data quality]


SeaDataNet standards
• Metadata formats for all catalogues
– ISO19115 and ISO19139
• SeaDataNet data transport formats
– ASCII (Ocean Data View and MedAtlas)
– NetCDF (CF compliant)
→ Relying on controlled vocabularies governed by NERC-BODC (UK) and used in many other international or national initiatives


SeaDataNet standards


SeaDataNet software tools (1)
• Tools for the data centres and data managers
– To be connected to the infrastructure and to be able to duplicate data in the cloud: Replication Manager
– To follow the data downloading by users: MySeaDataCloud
– To generate the metadata at the SDN standards: MIKADO
– To generate the data files at the SeaDataNet standards: NEMO
– To check the compliance of the data files: OCTOPUS
– To quality check the data: ODV

SeaDataNet software tools (2)
• Tools for the users and data scientists
– All catalogue search interfaces
– To visualise, plot and analyse data: ODV
– To interpolate data: DIVA
– To publish your data using Sensor Web standards
– To publish marine data (and get a DOI): SEANOE
– To work on datasets in the cloud environment: SDN Virtual Research Environment (VRE)
• Prototype of the VRE available and used by the regional product leaders of the SDC project

SDN: Arctic T,S-dataset

Data Holding centre: 19 unique string values found.

Values | Counts
Alfred-Wegener-Institute for Polar- and Marine Research (1368) | 145
All-Russia Research Institute of Hydrometeorological Information - World Data Centre (RIHMI-WDC) National Oceanographic Data Centre (NODC) (681) | 20209
British Oceanographic Data Centre (43) | 8402
Department of Safety and Quality of Milk and Fish Products (Max Rubner) (2303) | 46
Federal Maritime and Hydrographic Agency (1850) | 12
Federal Research Centre for Fisheries (Hamburg) (990) | 19
Finnish Meteorological Institute (1725) | 10
IFREMER / IDM / SISMER - Scientific Information Systems for the SEA (486) | 27762
Institute of Fisheries Ecology - Cuxhaven (VTI-CUX) (1576) | 8
Institute of Marine Research - Norwegian Marine Data Centre (NMD) (612) | 51855
International Council for the Exploration of the Sea (ICES) (730) | 8494
Marine Hydrophysical Institute (727) | 7
Marine Institute (396) | 33253
Marine Research Institute (583) | 7547
NIOZ Royal Netherlands Institute for Sea Research (630) | 253
PANGAEA - Data Publisher for Earth & Environmental Science (3234) | 156
Shom (540) | 3952
Swedish Meteorological and Hydrological Institute (545) | 166
Thünen-Institute of Sea Fisheries (TI-SF) (1570) | 113

Arctic SDC V1 complete T,S-dataset
NonRestricted dataset 1900-2014: 731286 stations
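As a quick cross-check of the data holding centre listing above, the per-centre counts can be tallied in a few lines. This is a minimal sketch: only the counts come from the slide, while the short dictionary keys are abbreviations introduced here for illustration and are not official identifiers.

```python
# Counts per data holding centre, copied from the listing above.
# The short keys are illustrative abbreviations, not official identifiers.
counts = {
    "AWI": 145,
    "RIHMI-WDC": 20209,
    "BODC": 8402,
    "Max Rubner": 46,
    "Federal Maritime and Hydrographic Agency": 12,
    "Federal Research Centre for Fisheries": 19,
    "FMI": 10,
    "IFREMER/SISMER": 27762,
    "VTI-CUX": 8,
    "IMR-NMD": 51855,
    "ICES": 8494,
    "Marine Hydrophysical Institute": 7,
    "Marine Institute": 33253,
    "Marine Research Institute": 7547,
    "NIOZ": 253,
    "PANGAEA": 156,
    "Shom": 3952,
    "SMHI": 166,
    "TI-SF": 113,
}

total = sum(counts.values())
# Three largest contributors, sorted by descending count.
top3 = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:3]

print(f"{len(counts)} centres, {total} listed entries")
for name, n in top3:
    print(f"{name}: {n} ({100 * n / total:.1f} % of the listed counts)")
```

The dictionary holds 19 entries, matching the "19 unique string values" reported for the Data Holding centre field.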

SDC V1 complete T,S-dataset
Restricted dataset: 373 stations

NonRestricted T,S-dataset 1995-2004: 349890 ferrybox stations, 43607 other data

Comparing SDN & WOD
• 1975-1984
– SDN NonRestricted T,S dataset: 9429 stations
– WOD18 OSDS, PFLS dataset: 108260 stations
but
• 2005-2014
– SDN NonRestricted T,S dataset: 290583 stations
– WOD18 OSDS, PFLS dataset: 54870 stations
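The contrast between the two decades is easier to read as a ratio. The following is a minimal sketch that uses only the station counts quoted above; the variable names are illustrative.

```python
# Station counts quoted on the slide for each decade.
stations = {
    "1975-1984": {"SDN": 9429, "WOD18": 108260},
    "2005-2014": {"SDN": 290583, "WOD18": 54870},
}

ratios = {}
for period, n in stations.items():
    # A ratio above 1 means the SDN set holds more stations than the WOD18 set.
    ratios[period] = n["SDN"] / n["WOD18"]
    print(f"{period}: SDN/WOD18 = {ratios[period]:.2f}")
```

For 1975-1984 the WOD18 set holds roughly eleven times more stations than the SDN set, whereas for 2005-2014 the SDN set holds roughly five times more.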

Overall architecture for the SDC VRE

Background field 1955-2014 (figure panels: Summer, Autumn)

Climatology temperature (figure panels: 1955-1964, 1965-1974)


FAIRness of SeaDataNet data (1)
Challenge: make SDN data, metadata and related services more FAIR, both for machines and people
• Enriching CDI metadata by data centres and their data originators
– Adding more information on QA-QC activities
– Adding extra information about data collection, in particular instruments, using SDN vocabulary
– Including, where applicable, links to projects (EDMERP), cruises (CSR), data collections (EDMED)
– Including, where applicable, links to ‘standard data processing methods’ like laboratory tests, using the Ocean Best Practices repository of IODE (https://www.oceanbestpractices.net/)

FAIRness of SeaDataNet data (2)
• Applying Linked Data principles to all services by their managers
– Publishing SDN directories as SPARQL services and as RDF resources following existing models
– Using Schema.org for search engines
• Harmonising the URLs of the SDN services (GUI and SPARQL) by their managers:
– EDMO, EDMED and EDMERP
– EDIOS and CDI
– CSR


FAIRness of SeaDataNet data (3)
• Ensuring SDN data file format conformance by data centres
– Use SeaDataNet tools for preparing SeaDataNet data files (ODV, NetCDF (CF))
– Follow examples of SeaDataNet data files for specific data types
– Make sure that all declared parameters have one or more values
– Make sure that the correct primary variables in ODV are defined and filled with values, considering the specific data types
– Check the syntax and semantics of ODV and NetCDF files, using SDN OCTOPUS software
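The rule that every declared parameter should have one or more values can be illustrated with a small check. This is a sketch under stated assumptions, not the OCTOPUS implementation: it assumes a simplified tab-separated table with `//` comment lines and a single header row rather than the full SeaDataNet ODV format, and the function name `empty_columns` and the sample columns are hypothetical.

```python
import csv
import io


def empty_columns(text: str) -> list[str]:
    """Return the declared column names that have no value in any row."""
    # Drop blank lines and comment lines (assumed here to start with "//").
    lines = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("//")]
    reader = csv.reader(io.StringIO("\n".join(lines)), delimiter="\t")
    header = next(reader)
    has_value = [False] * len(header)
    for row in reader:
        for i, cell in enumerate(row[: len(header)]):
            if cell.strip():
                has_value[i] = True
    return [name for name, ok in zip(header, has_value) if not ok]


# Hypothetical sample, for illustration only: Salinity is declared but never filled.
sample = (
    "// hypothetical sample, for illustration only\n"
    "Station\tDepth [m]\tTemperature [degC]\tSalinity [psu]\n"
    "S1\t0\t4.2\t\n"
    "S1\t10\t4.0\t\n"
)
print(empty_columns(sample))
```

A non-empty result flags declared parameters that should either be filled or removed from the file before it is submitted.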

General context and cooperation
Figure labels:
• ≈ 730 European data originators
• > 110 data centres
• NODCs; HOs; GEOs; BIOs; ICES
• Total collection
• Aggregated collection
• Bathymetry, Physics, Chemistry, Geology, Biology
• Thematic portals
• Regional subsets
• Black Sea portal, Caspian portal, Geo-Seas portal
• GEOSS portal, IODE ODP portal, Copernicus MEMS
• Data discovery and access

IQUOD workshop, Brest, 31 October 2019
