Note – Exploring the Dutch data landscape

To: NPOS steering committee
From: Melle de Vries
Subject: Scoping paper on national data landscape study (project E)
Version management: v. 1.1 (approved by Steering Committee on 14 October 2019)

Status: This document has been prepared by the core team (Melle de Vries, KNAW; Maurice Bouwhuis, SURF; Ruben Kok, DTL; Pieter Schipper, NWO) and examines the initial findings (phase 1 of the project) and also provides a view of the intended results of the project upon delivery in May/June 2020.

Table of contents

Management summary
Chapter 1 Introduction
Chapter 2 Definition
Chapter 3 Initial study results
Chapter 4 Comparison with other countries
Chapter 5 Developments at European level
Chapter 6 Organisation of the national data landscape
Chapter 7 Contours of final result
Chapter 8 Reference framework under development
Chapter 9 Panel meetings
Appendix: Sources consulted
Appendix: Alignment, policies and regulations in the Netherlands
Appendix: Overview of data repositories in the Netherlands


Management summary

This scoping paper is the first interim report of the NPOS project ‘Exploration and optimisation of the national data landscape’ (project E), launched on 6 May 2019. The purpose of this paper is to define the project, based on an initial study of developments, and to create a common starting point for further interpretation of the data landscape in consultation with relevant stakeholders.

Definitions

The following definitions apply in the project. This project concerns data from publicly funded research but is not limited to publicly funded service providers. It concerns (possibly international) organisations that are active in the Netherlands. It relates to research data management and not IT support. The FAIR principles are the starting point, but this study is mainly about the organisations that operate in this area, in conjunction with the optimisation of services. This study does not primarily cover the funding, governance or quality of service, even though these aspects may be touched on in the recommendations in the final report.

Preliminary findings

The Netherlands is not the only country in the process of exploring its data landscape. The European Open Science Cloud (EOSC) has also created a need for insight into the data landscapes of the Member States. Key words in reports so far have been fragmentation and overlap, and therefore a need for consolidation and coordination.

Some common bottlenecks:
• Fragmentation among service and facility providers, and confusion about the distinction between data, digitalisation and IT, which means that researchers do not know where they should be heading.
• Inconsistency between the research project funding and the ambition of sustainable accessibility of the results of research.
• Uncertainty about roles, competences and responsibilities.
• Recognition and rewarding of research data management still need to be shaped.

On 28 October and 22 November 2019, the core team will organise two panel meetings to share the first findings of the study of the current situation with various stakeholders and also to generate input for further elaboration of what the desired future should be.


Organisation of the national data landscape

When mapping the Dutch data landscape, we have so far considered the following perspectives: (a) nature of the organisation; (b) types of services; (c) target groups; and (d) scale of services.

The intended final result of this project is an advisory report/action plan (mid-2020). This will in any case include the following:
• A ‘picture’ of the current data landscape. Which organisations are active in the Netherlands when it comes to reusing research data?
• How are the services spread across the scientific disciplines/domains?
• At what level are the organisations active? Institution, consortium or (inter)national?
• What regulations exist in relation to research data and which consultative structures are in place with regard to the national research data infrastructure?
• What are good practices?
• Where are the bottlenecks and areas for improvement?

A separate focus is the interface between this NPOS study and the NWO implementation plan for the digitalisation of science, especially as it relates to (a) the thematic digital competence centres (DCCs) and (b) the governing role of the federated network of local DCCs and thematic DCCs.

It should be noted that this scoping paper advocates separate (and stronger) attention to the support of research data management in research, while the focus of the NWO digitalisation plan (and of the DCCs) also extends to IT (including high performance computing and software development). This distinction will be drawn more sharply, with clear definitions, in the further elaboration.


Chapter 1 Introduction

This scoping paper, ‘Exploring the Dutch Data Landscape’, is the first interim report on the NPOS project ‘Exploration and optimisation of the national data landscape’ (project E).

The objectives of this paper are:
• To complete step 1 of the project outline approved by the NPOS Steering Committee: “Drawing up a synopsis of relevant national and international developments, bottlenecks and consequences for the national data landscape”.
• To define the project and create a common starting point for the discussions with relevant stakeholders (step 2 in the project). The interim report may be shared with the discussion partners.
• To outline the first contours of the intended final result of this project (advisory report/action plan).
• To account to the NPOS Steering Committee and request suggestions for the follow-up of this project.

The focus of the project is on mapping the organisations that are active in the Netherlands in supporting the reuse of research data. This is not a ‘blueprint’, but rather a ‘picture’. The core team is aware that it is a snapshot of a developing landscape and will highlight developments where possible.

Purpose of the project (taken from the project outline)

Collecting good practices and exploring and developing the necessary improvements in the national ‘data landscape’ in order:
a) to create better boundary conditions for the NPOS ambition of the optimum reuse of research data;
b) to boost cooperation between data-intensive scientific fields and the resulting societal force for innovation; and
c) to prepare participation in the European Open Science Cloud (EOSC) at national level.

Observations: • It is important to consider the various aspects (the needs of researchers, guidelines and standards, such as FAIR, effective data management and stewardship, interoperability at IT and data level, access to data and the legal and ethical aspects of that access, funding, infrastructures, training and support, governance and certification) in a coherent way so as to create guiding frameworks that can effectively optimise the entire chain (from conducting research to technical facilities). • Due to the diversity of research cultures in and between the different disciplines, a 'one size fits all' approach will not succeed. At the same time, we aim to maximise synergies in data infrastructure in order also to stimulate multidisciplinary research. This requires a careful balance.


Chapter 2 Definition

The following definitions are used in the project:
• This project concerns data from publicly funded research but is not limited to publicly funded service providers.
• It concerns research data management (the provision of research data, metadata and the software required for this purpose) from the generation or collection of the data to its sustainable availability, and not the IT support of the research (nor the development or configuration of the software).
• It is about organisations that are active in the Netherlands, while of course also taking into account cross-border cooperation in research¹ and international service providers.
• The FAIR principles are the starting point, but this exploration is mainly about the organisations that operate in this area, in conjunction with the optimisation of services. The question of which data researchers must store and for how long is the subject of the relevant KNAW Advisory Committee (which aims to come up with a recommendation on the storage and availability of data for research in the various disciplines by mid-2020).

Organisation within research institutions

In this exploration of the national data landscape, the participation of (institutional) data stewards, data scientists and software engineers in research projects is taken into account only for the overall picture (to a certain extent). This also applies to the local, first-line support that research institutions have created for this purpose. The knowledge institutions are expected to set up a form of first-line support within their own organisation (in line with the duty of data management in the scientific integrity code of conduct). The way this is set up is outside the scope of this exploration. However, a good practice may be included in the final report for the purpose of illustration.

Funding, governance and quality

This exploration also does not primarily cover the funding, governance or quality of service, even though these aspects may be touched on in the recommendations in the final report.

¹ If Dutch researchers cooperate with foreign researchers in an international project then, according to the Dutch Code of Conduct for Scientific Integrity, the standard still applies that the research findings and the research data are made as publicly available as possible after the project has ended. Nothing is said about who is to make the data available and where that should be done.


Chapter 3 Initial study results

What is the researcher's perspective?

In 2016/2017, CWTS (Centre for Science and Technology Studies in Leiden) examined this together with Elsevier and concluded the following (CWTS 2017, 39):

1. Data-sharing practices depend on the field: there is no general approach. General policy initiatives towards open data might benefit from encouraging bottom-up solutions in fields where open data is already an integral part of the research design.
2. Although data sharing seems to have a global benefit, cultural and national factors pose a significant challenge to a one-size-fits-all approach.
3. Freeing up data for reuse and sharing depends on accommodation or coordination of disciplinary, cultural, and local differences with respect to data privacy and licensing.
4. The role of funders and publishers in mandating data practices is limited compared to the role of researchers themselves. Open data mandates would benefit from better alignment with researcher incentive and evaluation structures (i.e., linked to the academic reputation).
5. In both intensive and restricted data-sharing fields, training and support facilities for open data-sharing practices need to be provided.
6. Data journals are still a relatively small-scale phenomenon, but their popularity is growing rapidly.
7. Data is not always considered as a public good, but as something to pay for. This perception could be a threat to open data.
8. Where open data management is occurring, it is often perceived as a burden, and not as a responsibility.

This picture is still partly valid but is dated: it does not yet take FAIR data into account.

NCOS inventory

The National Open Science Coordinator, Karel Luyben, met with some nationally operating organisations (DANS, DTL, GO-FAIR, 4TU.ResearchData, SURF and Netherlands eScience Centre) in early 2019 for an initial exploration of the national data landscape. The purpose of the discussions was to find a common approach to data services in the Dutch data services infrastructure.

Some conclusions
• There is a need for national direction in the data landscape, together with policies on data management and data reuse and with data processing protocols.
• The national direction at generic level should be combined with a bottom-up approach at discipline or domain level and also in the institutions (local data desks, sharing expertise in the areas of IT, library and data stewards).
• The exchange of knowledge and good practices and the training of, for example, data stewards sit between these levels and could be addressed more jointly (under national direction).


The remainder of this chapter provides a brief outline of the results of phase 1 of the project: Drafting a synopsis of relevant national and international developments, bottlenecks and consequences for the national data landscape. In general, it can be said that the Netherlands is leading the way in ideas when it comes to the organisation of research data management, but not when it comes to implementation. Developments in other countries are now progressing swiftly and in the Netherlands there is still some improvement required in the latter (implementation).

All universities in the Netherlands have on their websites information on how research data management works, referring to the current policy document, available data services within and outside the university, and a more or less extensive offer of support and training for their own researchers.

The coordination role is often fulfilled by the university library, sometimes in cooperation with the local IT service. At faculty level, the tasks are often fulfilled in a faculty-specific way, including the availability of data stewards.

National developments
• At ZonMw, grant recipients have since 2013 had to draw up a data management plan showing how data are shared, when the data are made available to third parties and how the data are made accessible. This plan is then submitted to ZonMw for approval.
• Since the end of 2016, NWO has had a data management policy that applies to all NWO instruments. In concrete terms, this means that the data management section must be completed for each research application. Once a grant has been approved, a data management plan must also be submitted.
• Since the end of 2018, a new Code of Conduct for Scientific Integrity has been in force, which includes a duty of care for data management.
• From 2019, the Government Agreement stimulates the national digital science infrastructure (including recommendations from the Permanent Committee for Large-Scale Scientific Infrastructure, Apers sub-committee) with the promotion/formation of Digital Competence Centres (DCCs) and thematic DCCs. NWO will present a spending plan (total of €20M + co-financing) in November 2019.

NWO Scientific Infrastructure Spending Plan

An important current development is the stimulation of the digitalisation of science with resources from the current government agreement. At the beginning of 2019, the ICT sub-committee of the Permanent Committee for Large-Scale Scientific Infrastructure (PC-GWI) issued recommendations on the use of the €20M included in the government agreement. The Lower House has now been informed about this. Part of this advice is about supporting open science. This includes Digital Competence Centres (DCCs) at universities (and other research institutions) and thematic DCCs (such as DANS, DTL/Health-RI and 4TU.ResearchData) in a federated infrastructure. A DCC integrates data (stewardship), software and computing expertise for a single institution. SURF plays a key role in the technical facilitation and coordination of this system. This federated system must fit well within the EOSC. NWO will present an elaborated plan, following coordination with the field, in November 2019.


In a local DCC, the IT department, library and research support (and, where possible, the local Open Science programme) work together and provide a first point of contact for all questions from local researchers in the “digital area” (not just data). Data facilities are also made available within the local DCCs. The DCCs also ensure that the local data stewards are brought together. The interpretation may vary from one knowledge institution to another.

The tasks of the thematic DCCs and also of SURF as a director have still to be developed further.

In any case, in view of the above, it seems logical to focus on a (future) federated infrastructure in which there is a place for:
• Institutional services, within a (cluster of) knowledge institution(s)
• Domain-related services aimed at discipline-dependent support (also known as ‘vertical’)
• National services, domain and institution-transcending (generic or ‘horizontal’)

A key challenge in exploring and optimising the national data landscape is to promote interoperability or the interplay between these three levels, and certainly to ensure the optimum connection of these levels with international infrastructures.

• Sector descriptions for the natural sciences (beta), technology and SSH lead to sectoral plans, with SSH, in particular, also investing in digital infrastructural facilities.
• The common digital infrastructure for the ‘health’ domain in life sciences (human-centric) will be developed further from within the Health-RI programme. Health-RI bundles and provides access to the activities of existing parties and research initiatives in a national portal, thereby providing an overview and organising a ‘shared voice’ approach to making agreements, developing policies and involving funders.
• KNAW is working on recommendations on the storage and availability of data for research (delivery mid-2020).
• NWO is currently reviewing its own data management policy in relation to what was developed in the European context (Science Europe) in 2018 (delivery January 2020; see separate framework below).

NWO Data Policy

Since 2016, NWO has used its data management policy not only to ensure the proper storage of data during and after the research projects it funds, but also to ensure that data is stored as FAIR as possible. The starting point is therefore: open where possible, closed where necessary. This policy is in line with that of the ERC and many other research funders. The main tool NWO uses for this is the data management plan: a document in which the researcher indicates how data from the research project will be stored and shared. A data management plan must be drawn up and approved for each NWO-funded research project. This has contributed to the professionalisation of data management in science.

NWO will revise several aspects of its data management policy at the beginning of 2020. One new development is that, within Science Europe, the association of European research funders, a set of core requirements has been established for matters that should be laid down in each data management plan. NWO has contributed to the establishment of these “core requirements” and is drawing up a new format for the data management plan on this basis. In line with the wishes of knowledge institutions, which are increasingly committed to data management policies and are building the necessary infrastructure for them, NWO will also accept their own formats for data management plans, after approval. This reduces the administrative burden on researchers.

In the future, researchers will have to indicate in their research application how they intend to store their data in a FAIR way and what barriers to doing so they expect. Deviations from the standard in their field, in positive or negative terms, may be noted by referees and the assessment committee. The data management section in the research application will be copied into the data management plan, so that the assessor of the plan is aware of the ambitions regarding FAIR data that the researcher has formulated for the project. In the near future, NWO will require a plan to be reviewed by a data steward or a similarly knowledgeable institutional officer before it is submitted to NWO. This is expected to improve the quality of the data management plans and to help researchers find their way to appropriate infrastructure and services. Approval of the data management plan remains a prerequisite for starting a project, and the assessment of data management plans remains with NWO.

• VSNU has taken the initiative regarding investments and cooperation between the universities in the field of digitalisation. It is not yet known whether this initiative will touch on the current exploration of the national data landscape.

The various ministries also have initiatives on the collection and availability of data that may be important for scientific research.
• The Ministry of Economic Affairs and Climate (EZK) has recently initiated the development of a national data sharing coalition², based on an InnoPay report³. This approach aims to create a framework of guidelines/system of arrangements that allows coalitions in a wide range of sectors to share data more easily under clear conditions. FAIR data are a building block (via GO FAIR). With the support of EZK, the GO FAIR Foundation is drawing up a certification scheme for FAIR services.
• The Ministry of Health, Welfare and Sport (VWS) is working on a health care information system based on “letting data work for health”. Standardisation is a keyword: unity of language and technology. FAIR data is also an input here⁴. This is very relevant to the Health-RI programme: multiple use of data in care in relation to quality evaluation, research and innovation.
• The Ministry of the Interior and Kingdom Relations (BZK) is designing a landscape of Information Houses (under construction), which is relevant to the social sciences, the natural sciences and health research.

² https://www.nederlanddigitaal.nl/initiatieven/d/datadeelcoalitie-voor-bedrijven/nieuws/2019/05/17/vorming-datadeelcoalitie-gestart-met-ondertekening-intentieverklaring
³ https://www.innopay.com/en/events/presentatie-resultaten-en-vervolg-onderzoek-naar-cross-sectoraal-datadelen
⁴ https://www.rijksoverheid.nl/documenten/kamerstukken/2018/11/15/kamerbrief-over-data-laten-werken-voor-gezondheid


It is not known whether this involves coordination at the interdepartmental level and/or between departments and research organisations.

International developments
• Development of the European Open Science Cloud (EOSC) and further development of ERICs (such as BBMRI, CESSDA, CLARIN, DARIAH-EU) and domain infrastructures (funded by ESFRI, such as EHRI and ELIXIR). “Effort should be made to overcome the existing fragmentation of the research data landscape and achieve economies of scale. In this process, it is important that existing services and capacities that are valued by particular research domains are not lost, and that research communities new to FAIR are given the opportunities to develop the tools they need.” (EC 2018b, 27)
• Parallel developments in other countries to prepare the national infrastructure to connect to the EOSC. See also chapter 4.
• Several projects from or focusing on the EOSC, such as EOSC-pillar⁵ (which also makes a comparison between countries) and EOSC synergy⁶. See also chapter 5.
• Commercial publishers wishing to expand their market share in terms of data services in the domain of scientific research.
• Other private parties that (wish to) play a role in the scientific infrastructure (such as Amazon and Google) with data services.
• Partnerships, such as CODATA, CoreTrustSeal, GO-FAIR (led from the Netherlands), Research Data Alliance and World Data System, that provide development and exchange of knowledge and also certification in the field of research data management.

Commercial publishers

It is not yet clear what the place of the commercial publishers in this area will be. In the field of publications, the relationship between research organisations and publishers is under a certain amount of pressure. This seems to extend to the field of research data management, as several publishers offer, or are going to offer, services in this area, sometimes in combination with cooperation platforms.

The Scholarly Publishing and Academic Resources Coalition (SPARC) has recently issued a landscape study and warns of potentially unfavourable effects. “This report was commissioned in response to the growing trend of commercial acquisition of critical infrastructure in our institutions.” (SPARC 2019)

⁵ The project aims to propose the initiatives for the national coordination of data infrastructures and service recently started in many Member States as one of the founding pillars for the development and long-term sustainability of the EOSC. EOSC-Pillar starts with an initial group of neighbouring countries who are active in open science, to define and set a model to harmonise and interfederate the initiatives.
⁶ EOSC synergy will start in Q4 2019. The project contributes to the European Open Science Cloud and focuses on building and improving EOSC capacity. In practice, it means an increase in computing power and data storage for scientists, and more data sets and tools becoming available for scientific research.


In the context of the increasing costs of data infrastructures, the OECD recommended looking for ways to save costs in order to be able to manage digital research results in the future. The following examples are cited (OECD 2017, 13):
• By encouraging or funding the establishment of lead organisations for open research data at the national level, and encouraging those organisations to collaborate globally.
• By encouraging or funding collaboration and federation. Not all research data repositories need to perform specialised curation and preservation tasks. Similarly, not all institutions or organisations need to create individual repositories. Collaboration and federation can help to manage and reduce costs.

Bottlenecks

Some of the many bottlenecks mentioned in the reports examined:
• Fragmentation among service and facility providers, and confusion about the distinction between data, digitalisation and IT, as a result of which researchers do not know where they should be heading.
• Inconsistency between the project funding of research and the ambition of sustainable accessibility of the research results.
• Uncertainty about roles, competences and responsibilities.
• Recognition and rewarding of research data management still need to be shaped.

This has yet to be confirmed for the Dutch situation in the continuation of this project. The panel meetings, among others, should help here (see chapter 9).

“In the absence of national investments, individual research groups will build their own solutions, which are fragmented, non-interoperable, less cost efficient and therefore much less powerful. Also, the lack of national direction may result in unused capacity in different places.” (Wyatt et al. 2017, 27)

Consequences
• Possible overlap in services and facilities and therefore inefficient use of resources
• Potential blank spots for disciplines where service and infrastructure are still insufficient
• Opportunities for breakthrough research may be missed due to lack of standardisation and opportunities for exchange and cooperation
• Insufficient alignment with international developments (including EOSC, at infrastructural level, but also regarding content/strategy)


Chapter 4 Comparison with other countries

It is not possible, within the scope of this project, to make a thorough comparison with other countries, partly because of the differences in the way science is funded and organised. However, efforts are being made to learn from similar initiatives in the countries around us.

Some insights from published reports:
• In 2018, Belgium (Flanders) conducted a study into the improvement of research data management and the role of the Flemish government in this regard. This study also looked into what can be learned from several other countries, including Ireland and the Netherlands. There is a strong call for “ambition, leadership and central coordination” (Fikkers et al. 2018, 22).
• An exploration of the data landscape was also carried out in the UK in 2017/2018. The recommendations there also call for active coordination and cooperation. There is a need for “coordination mechanisms and incentives to promote cooperation across a federated national ecosystem of provision, with links also to international services and initiatives. In short, there is a need for a strategy – and a governance structure – to build on strengths, remedy weaknesses, and fill gaps, including community-led initiatives.” (ORDT 2017, 52) A major recommendation is that UKRI [British research funders] “should establish for itself a co-ordinating role – while taking full account of the critical importance of active leadership from other stakeholders including research organisations, funders, specialist service providers, publishers, learned societies, and senior representatives of the research community – in overseeing the development of ORD policies, infrastructure and services.” (ORDT 2018, 32)
• In Denmark, the Ministry of Science and Higher Education carried out an analysis in 2018. “The analysis shows that many of the elements needed to realise FAIR data already exist, but they are fragmented and dispersed. Coordination and collaboration are crucial for developing a common approach and understanding of the FAIR data principles. (…) A FAIR data solution should build on local solutions with a cohesive national superstructure – not a ‘one-size-fits-all’ solution, but a solution that can grow based on local, academic and research environments.” (Oxford 2018, 5)
• In Germany, it was decided at the end of 2018 to establish a national data infrastructure. The federal government and the Länder have made available €90M per year for the period 2019-2028 to set up a national research data network; this is explicitly not intended for hardware, etc. “The aim of the national research data infrastructure (NFDI) is to systematically manage scientific and research data, provide long-term data storage, backup and accessibility, and network the data both nationally and internationally. The NFDI will bring multiple stakeholders together in a coordinated network of consortia tasked with providing science-driven data services to research communities.” (www.dfg.de)
• In Sweden, the Swedish Research Council launched a national roadmap in this field in 2019. This roadmap also describes the national landscape. Core principles are coordination and cooperation and clarification of roles and responsibilities. It also highlights the importance of national coordination with a view to liaising with the EOSC. “Based on the observation that the current fragmentation is becoming increasingly challenging and is probably not cost efficient, the panel recommends that now is the time to consider an encompassing national e-infrastructure coordination and even organizational mergers of e-infrastructures.” (SRC 2019, 30)


• The Nordic countries work together in the Nordic e-Infrastructure Collaboration (NeIC) and aim to address the challenge of Open Science in that context. “The Nordic countries are particularly well suited for collaboration among each other due to social and cultural similarities. Also, as the countries are individually small, unifying efforts in science and technology to realise common undertakings will generally result in a better end-product and greater impact in the international arena. Finally, a Nordic-wide collaboration reduces the risks of duplication of effort and therefore promotes a more cost-efficient R&D segment within the Nordics. (…) Development of the concept of open science is still in its infancy and it will require significant effort and funding to fully realise the potential of aligning research practices in modern science with the capabilities offered by semantic metadata modelling, linked data and knowledge graphs. To get there, it is necessary to build the essential infrastructures to support this vision.” (Jaunsen 2018, 7, 37)

Chapter 5 Developments at European level

Just before the summer of 2019, the EOSC Executive Board set up a 'Landscape' Working Group (term until the end of 2020). The participant from the Netherlands is Ronald Stolk (University of Groningen). The task of the Landscape Working Group is as follows (Jones & Abramatic 2019, 15-16):
• Deliver the mapping of EOSC-relevant national infrastructures and the current level of spending on research data infrastructures.
• Take stock of federation constraints and opportunities at the various architectural levels, arising from national and regional structures and initiatives.
• Propose mechanisms and best practices that will facilitate convergence and alignment between European, national and regional structures and initiatives.
• Conduct an analysis of the Member States’ level of preparedness to provide financial resources and support for political stability and infrastructural planning to EOSC.

This NPOS project will conduct the (national) exploration in close coordination with the EOSC exploration of data landscapes to enable mutual learning. In addition, some European projects have already provided a broad summary of the Dutch situation (see footnotes 7 and 8).

The e-Infrastructure Reflection Group concludes in its report on the national focal points in the EOSC that “the (organisation of the) national e-Infrastructure landscape varies considerably between the countries” (E-IRG 2019, 21) and against that background makes the following recommendations: “Members states and associated countries should continue to increase the level of coordination between and consolidation of the various national players on e-Infrastructure provisioning.” (E-IRG 2019, 22)

Some other initiatives/projects to follow from within this project:
• FREYA. This three-year project started in December 2017. It aims to build a persistent identifier (PID) infrastructure as a core element of open science. One of the project objectives is to improve the visibility of research data by building on existing PID systems, such as Crossref, DataCite, ORCID and identifiers.org. DANS is a partner in the project from the Netherlands.
• EOSC Synergy. This three-year project will start in autumn 2019 and aims at extending EOSC coordination at national level in the member countries: Germany, the Netherlands, Poland, Portugal, Slovakia, Spain, the Czech Republic, . DANS is a partner in the project from the Netherlands.
• EOSC Pillar. This three-year project started in July 2019 and aims to coordinate and harmonise the national efforts of Belgium, Germany, France, Italy and Austria in aligning with and implementing the EOSC.
• EOSC Nordic. This three-year project started in September 2019 and aims to facilitate the EOSC initiatives from the Nordic and Baltic States and to achieve synergies at policy and service level.
• NI4OS. This project aims to develop and build national open science initiatives jointly in the countries of South-East Europe.

7 http://www.eosc-synergy.eu/europe/netherlands/
8 https://www.openaire.eu/item/netherlands

Chapter 6 Organisation of the national data landscape

When mapping the organisations, this project takes into account four aspects:
• Nature of the organisation (public/private, structural/temporary, consortium/independent)
• What services does the organisation provide?
• For which domains/scientific areas?
• On what scale (research performing organisation, consortium, national, international)?

Initial overview of services that are offered to research organisations, research groups and researchers in the national data landscape (to be further divided into generic and domain-specific services):
• Provision of IT infrastructure (access to/use of)
• Provision of storage services (short-term)
• Provision of archiving services (long-term)
• Support in certification
• Development of knowledge, standards, ontologies and guidelines
• Provision of outreach, consultancy, education and training
• Coordination (policy content, financial, technical, etc.)
  o definition of policies, funding terms and conditions, and standards
  o support for alignment and cooperation within a domain
  o connection to international initiatives – as a focal point, both within a domain and between domains

In discussions on the development of the research data infrastructure, both in the Netherlands and in the European context, reference is often made to a federated infrastructure. We also need to consider this in the exploration of the national data landscape. In any case, this means:
• proper demarcation between the ‘hard’ infrastructure (IT or e-infra) and the data infrastructure;
• proper demarcation between what research institutions do themselves (first-line) and where they can be helped with data services (second-line);
• a clear division between national direction or economies of scale (through generic services) and domain- or discipline-related services;
• linking of supply (services) and demand (data production and use) in clusters.

Chapter 7 Contours of final result

The intended final result of this project is an advisory report/action plan. This shall in any case include the following elements:
• A ‘picture’ of the current data landscape. Which organisations are active in the Netherlands when it comes to the reuse of research data?
• How are these organisations spread across services (which typology do we use) and across scientific disciplines/domains (which classification do we use)? This may lead to a national service catalogue.
• At what level are the organisations active? At the level of one (or more) institution(s), of an infrastructure/consortium or at national level?
• What national regulations exist in relation to research data (such as code of conduct, database law, privacy, etc.) and what consultative structures exist with regard to the national infrastructure for research data? See the introduction in the appendix on alignment, policies and regulations in the Netherlands.
• What are good practices?
• Where are the bottlenecks and areas for improvement?

The ‘snapshot’ builds on a further elaboration of the figure below, which gives a first, non-exhaustive list of organisations known to play a role in the national data landscape, drawn up by analogy with an inventory from the UK (ORDT 2018).

The above figure highlights in particular (a) the type of organisations. In any case, this requires additional snapshots as regards (b) the types of services provided and (c) the fields of science serviced.

Chapter 8 Reference framework under development

When taking a ‘snapshot’ of the national data landscape and listing good practices, bottlenecks and blank spots, it is important to have a reference framework.

A reference framework should be based on the ambitions formulated or made at national level in terms of research data management and in particular the role that the Netherlands wants to play in the rapidly developing international context.

In the coalition agreement (2017), the Cabinet stated that ‘open science’ and ‘open access’ are to be the norm in scientific research. In the 2019 science letter, the Minister of Education, Culture and Science (OCW) explains the additional investment in digital infrastructure: “With the €20 million investment, I wish (…) to invest in the data infrastructure for open science in the field. For example, for the reuse of research data, one of the key points for open science, it is necessary to strengthen the digital infrastructure.” This investment is made in the context of the ambition that “The Netherlands wants to continue to be part of the world leaders in science. This requires cooperation at national and international level, between scientific and social parties and with industry. A strong Dutch system with good research facilities improves the position of our researchers in working with top scientists from other countries on global challenges.”

The figure below is a first sketch of ideas for coordination and consolidation at national level, in an international context. It is motivated by efficiency considerations and by the fact that the EOSC requires national coordination.

Explanation:
• There is a grant system (top), an international context (left) and individual institutional responsibility in accordance with the Code of Conduct (right), and in between there is a range of services relating to the reuse of research data.
• (i)DCC refers to the (inter-university or thematic) Digital Competence Centres.
• Naturally, there is also a lot of cooperation and exchange that is not reflected in this figure. This figure focuses on the connecting position of the national data competence centre.

Further elaboration of tasks for a new, yet to be formed, national data competence centre:
• Alignment and coordination:
  o Policy, expertise, regulation, standards
  o Unique, acknowledged consultative structure
  o National focal point for EOSC, Research Data Alliance, CODATA, CoreTrustSeal, GO-FAIR, WDS, etc.
• Development and exchange of knowledge
  o In (funded) projects
• Consolidation:
  o National portal for research information (NARCIS.nl)
  o National Service Catalogue with Shared Services (paid)
    § Training, workshops
    § Advice and support
    § Repository services

The existing services offered at national level (including current funding) should be given a place in the new national data competence centre.

Benefits of a new national data competence centre:
• Efficiency: economies of scale
• One clear point of contact
• No reinventing the wheel every time
• Cross-fertilisation, connection
• Independence

Chapter 9 Panel meetings

On 28 October and 22 November 2019, the project scheduled two panel meetings to gain insights on the current national data landscape and generate ideas on how this landscape can be further optimised towards the desired future. We extended a broad invitation (researchers and supporters, representatives of institutions and infrastructures) and welcomed a diverse range of people from various fields of science and data services (20-30 participants per meeting).

Content and working method

For the meetings, four sub-themes have been defined with several relevant questions for each theme. All participants, in varying compositions, successively discussed all the themes on the basis of a schedule.

Sub-themes and questions:
• Types of Services (Data Services)
  o Which types of services can we distinguish?
  o What services are currently sufficiently available?
  o What services are additionally required?
  o What are the bottlenecks in terms of services?
• Generic versus discipline/domain-specific
  o Which services are independent of the scientific field?
  o Which services are very discipline or domain-specific?
  o Which scientific fields are well organised in terms of research data management and what factors are responsible for this success?
  o Which scientific fields are less well organised and what are the specific bottlenecks involved?
• Institutional versus national versus international
  o What services should be available within an institution?
  o Which services should we provide at national level?
  o What are good examples of internationally delivered services?
  o What are the bottlenecks in terms of coherence between what is organised at institutional, national and international levels?
• Alignment, coordination or direction
  o Which consultative structures are there in terms of research data management?
  o What strengths should we definitely maintain in this and what might be more efficient?
  o In which areas is more harmonisation, coordination or direction needed in the Netherlands?
  o In which areas should there be more policy in a European (or international) context?

The participants were invited by e-mail in mid-September. The invitation explicitly explained that we are on the verge of a potentially significant change in the national data landscape, intended to prepare the Netherlands for affiliation with the EOSC. Those who signed up received this scoping paper one week before the meeting.

Appendix: Sources consulted

• CWTS (2017). Open data. The researcher perspective.
• EC (2018a). Prompting an EOSC in practice. Final report and recommendations of the Commission 2nd High Level Expert Group [2017-2018] on the European Open Science Cloud (EOSC).
• EC (2018b). Turning FAIR into reality. Final Report and Action Plan from the European Commission Expert Group on FAIR Data.
• E-IRG (2019). National Nodes – Getting organized; how far are we? Implementing e-Infrastructure Commons and the European Open Science Cloud.
• Fikkers, Derek Jan; Dujso, Elma; Vooren, Robert van der (2018). Flemish Open Science Board. De governance structuur voor Open Data en Research Data Management in Vlaanderen. Technopolis.
• Jaunsen, Anders O. (2018). The state of Open Science in the Nordic countries.
• Jones, Sarah; Abramatic, Jean-François (2019). European Open Science Cloud (EOSC) – Strategic Implementation Plan.
• KNAW, NFU, NWO, TO2, VH, VSNU (2018). Nederlandse gedragscode wetenschappelijke integriteit.
• KNAW (2019). Instellingsbesluit commissie Opslag en beschikbaarheid van data voor onderzoek.
• LERU (2018). Open Science and its role in universities: A roadmap for cultural change.
• NFU (2019a). Data4Lifesciences programma. https://data4lifesciences.nl
• NFU (2019b). Onderzoek en innovatie met en voor de gezonde regio.
• NFU (2019c). Kwaliteitsborging mensgebonden onderzoek.
• NWO (2016). Nationale Roadmap Grootschalige Wetenschappelijke Infrastructuur.
• NWO (2018). Integrale aanpak voor digitalisering in wetenschap (advies voor meerjarige bestedingsplan n.a.v. middelen in het regeerakkoord).
• OCW (2019). Nieuwsgierig en betrokken. De waarde van wetenschap – wetenschapsbrief.
• OECD (2017). Business Models for Sustainable Research Data Repositories.
• Open Research Data Taskforce (2017). Research Data Infrastructures in the UK. Landscape report.
• Open Research Data Taskforce (2018). Realising the potential. Final report.
• Oxford Research (2018). Preliminary analysis: Introduction of FAIR data in Denmark.
• SPARC (2019). Landscape Analysis. The Changing Academic Publishing Industry – Implications for Academic Institutions.
• Swedish Research Council (2019). An outlook for the national roadmap for e-infrastructures for research.
• VSNU (2018). Een nieuw fundament: beeld van de bètasector.
• VSNU (2018). Een nieuw fundament: beeld van de technieksector.
• VSNU (2018). Samen sterker. Beeld van het SSH domein.
• Wyatt, Sally e.a. (2017). Topwetenschap vereist topinfrastructuur. Adviesrapport nationale digitale infrastructuur voor wetenschappelijk onderzoek.

Appendix: Alignment, policies and regulations in the Netherlands

The following points will be covered in this appendix:
• Legislation
• CLA for universities
• Code of Conduct regarding Scientific Integrity
• Standard Evaluation Protocol
• Research funding by NWO, ZonMw and SGF
• National Plan on Open Science
• VH and VSNU
• UKB and SHB
• Research data management national coordination point
• Research data management control group

Legislation

• Copyright: The Copyright Act governs copyright in the Netherlands. Data in a database may be partially or fully copyright protected. For example, texts or photographs that are subject to copyright may be included in a database. Copyright may also apply to a scientific article, a film or a book. Anyone who writes, draws, photographs, composes or otherwise produces something automatically holds the copyright in that 'work' or 'intellectual creation', which lasts until seventy years after the creator's death. In order to be copyrighted, the 'work' must be in an original form and must bear a personal stamp.
• Database law: The database law is the law governing the intellectual property of databases within the European Union. It gives limited protection to the creator of a generic database where the creator cannot invoke copyright.
• Privacy law: Since 25 May 2018, the same privacy legislation applies throughout the European Union: the General Data Protection Regulation (GDPR). The GDPR is directly applicable in the Netherlands. Where the GDPR leaves room for national choices in its implementation, these have been laid down in the GDPR Implementing Act (UAVG).

VSNU Collective Labour Agreement

Chapter 1, section 3 Intellectual property rights

Article 1.20 General
1. The employee is obliged to comply with provisions reasonably laid down by the employer with regard to patent rights, database rights, plant breeder’s rights, design rights, trademark rights and copyright, with due observance of the legal provisions.
2. The employer may impose more detailed rules with regard to the provisions referred to in Articles 1.21 and 1.22.

Article 1.21 Obligation to report
1. An employee who, during or otherwise coinciding with the performance of his duties, creates a possibly patentable invention or, by means of plant selection work, isolates a new variety for which plant breeder’s rights may be obtained, is obliged to report this in writing to the employer and must submit sufficient data to enable the employer to assess the nature of the invention or variety.
2. The obligation referred to in paragraph 1 arises the moment the employee is reasonably able to conclude that there is a question of such an invention or such a variety. In any event, the employee shall be considered to have been able to reach such a conclusion the moment the invention is completed or the variety has been isolated.
3. The provisions in this article apply by analogy as far as possible if the employee creates work that is protected by copyright, if and insofar the employer has not determined otherwise.

Article 1.22 Transfer and retention of rights
1. Without prejudice to the provisions in Section 12 of the State Patents Act, Bulletin of Acts & Decrees 1995, 51, Section 31 of the Seeds and Planting Materials Act, Bulletin of Acts & Decrees 1966, 455 and Section 7 of the Copyright Act, Bulletin of Acts & Decrees 1912, 308, the employee, if and insofar he is entitled to other than moral rights to the invention, the variety or the work, for which the obligation to report in Article 1.21 exists, shall transfer these rights to the employer in whole or in part if so requested, in order to enable it to make use of them in the context of fulfilling its statutory duties within a term to be established later.
2. As soon as the term referred to in paragraph 1 has expired without the employer actually having made use of the rights that were transferred to it, the employee is entitled to reclaim them. If the employee subsequently decides in favour of exploitation, the second sentence of paragraph 3 applies by analogy.
3. Except in cases contrary to the substantial interests of the university, the employee is entitled not to comply with the request as referred to in paragraph 1. In that case, the employer may decide that the costs it has invested are at the employee’s expense, including salary, the costs of the facilities made available to the employee, insofar as they are directly related to the creation of the rights the employee now wishes to keep for himself, plus the interest accrued. The term ‘substantial interests of the university’ shall be interpreted to include interests arising from agreements entered into with third parties by or on behalf of the employer.

Code of Conduct regarding Scientific Integrity

In 2018, the Dutch Code of Conduct regarding Scientific Integrity was revised. The preamble identifies the dynamics in the domain of science. It highlights, inter alia, the increasing importance of the use and management of data and developments in open science. The code itself sets standards for good research practice and defines the duties of care of institutions. The standards and obligations relating to research data are reproduced below.

Standards of good research practice
• 11 – As far as possible, make research findings and research data public subsequent to completion of the research. If this is not possible, establish valid reasons for their non-disclosure.
• 23 – Describe the data collected for and/or used in your research honestly, scrupulously and as transparently as possible.
• 24 – Manage the collected data carefully and store both the raw and processed versions for a period appropriate for the discipline and methodology at issue.
• 25 – Contribute, where appropriate, towards making data findable, accessible, interoperable and reusable in accordance with the FAIR principles.
• 29 – Do justice to everyone who contributed to the research and to obtaining and/or processing the data.

The institution's duties of care
• 11 – Provide a research infrastructure in which good data management is the rule and is facilitated.
• 12 – Ensure that, as far as possible, data, software codes, protocols, research material and corresponding metadata can be stored permanently.
• 13 – Ensure that all data, software codes and research materials, published or unpublished, are managed and securely stored for the period appropriate to the discipline(s) and methodology concerned.
• 14 – Ensure that, in accordance with the FAIR principles, data is open and accessible to the extent possible and remains confidential to the extent necessary.
• 15 – Ensure that it is clear how data, software codes and research material can be accessed.

This implies that research institutions themselves are responsible for providing a data infrastructure, the sustainable storage of research data and compliance with the FAIR principles, even if this is (partially) outsourced.

Standard Evaluation Protocol

KNAW, NWO and VSNU have an agreement on the quality assessment of research in the Netherlands. The current protocol (term 2015-2021) is being evaluated with a view to a new version from 2021. In accordance with the protocol, the evaluation committees also consider, in the context of integrity, how the research unit deals with data and data management. To this end, the research unit must indicate in its self-evaluation how research data is handled (“how the unit deals with and stores raw and processed data”). Under the output indicators, the research unit can also indicate how many and which data sets were produced during the evaluation period and to what extent they have been used.

Research funding

In the Netherlands, NWO and ZonMw are the national public funders of scientific research. In addition, a significant amount of medical research is funded by the Health Funds (SGF).
• NWO. In addition to publications, research data arising from projects funded by NWO should be made available as openly as possible and for re-use. The guiding principle is therefore "open if it can be, protected if it needs to be". Considerations such as privacy, public safety, ethical restrictions, property rights and commercial interests may be reasons to deviate from this rule. In order to make data open, it must be as findable, accessible, interoperable and reusable (FAIR) as possible. To achieve this, NWO works with a data management policy that applies to all NWO instruments from 1 October 2016. In concrete terms, this means that the data management section must be completed for each research application. Following the approval of a grant, a data management plan must also be submitted.
• ZonMw aims at improving the scientific and social impact of research output, including research data. To gain impact from research data, one must be able to reuse them for verification of research findings, or for future research. To this end, ZonMw requires researchers to perform research data management and stewardship (RDM), and to share their data to contribute to future, innovative research. ZonMw's procedures for RDM aim at creating FAIR data and high-quality research projects.
• SGF. Health funds are committed to preventing and curing (chronic) illnesses as well as providing good care for patients. They are an important funder of scientific research, with the aim of contributing to the prevention, treatment and curing of illnesses and improving quality of life. The SGF considers that the knowledge and results of this research should be available to everyone free of charge, both to its own members, who wish to have access to the knowledge developed and want to know where investments will lead, and to scientists and professionals, who can use this knowledge to conduct further research and accelerate its development into usable applications. This is also known as ‘Open Science’. The SGF endorses the ambitions of the Open Science movement.

National Plan on Open Science

On 9 February 2017, ten national knowledge partners presented the National Plan on Open Science. The purpose of this plan is to achieve the national transition to open science. One of the ambitions is to promote the optimum (re-)use of research data. The Plan states the following: “Open science aims for researchers to reuse the research data and services of others, where possible, and to make their own data as available as possible. To this end, such data must first be stored and described for the purpose of accessibility for reuse and reproducibility of tests. In order to be able to store research data in this way, including whilst research is being conducted, both technical and policy preconditions must be met. A large number of these preconditions are expected to be discipline-specific. How other parties – such as publishers – can play a facilitation role in this regard, needs to be further elaborated.”

VH and VSNU

The Association of Universities of Applied Sciences (VH) and the Association of Universities in the Netherlands (VSNU) may still have their own (official) working groups with representatives of the various institutions dealing with (aspects of) research data management.

UKB and SHB

The consortium of university libraries and the National Library of the Netherlands (UKB) supports and accelerates scientific progress by sharing, concentrating and linking mutual expertise in (inter)national networks. UKB has a Research Data Working Group. The purpose of this working group is to exchange knowledge within UKB and to transfer that knowledge to the university research community.

The 2017-2020 ambitions include: “We have a common approach to the development of services in the area of research data management. There is a national network (LCRDM) and effective and efficient support is provided at local level. Researchers know where to store their research data during and after their research so it is findable, accessible, interoperable and reusable (FAIR) and where they can address questions about ownership and legal aspects of research data management. In addition, we develop innovative courses on ‘open science and data management’ for young researchers. (…) We support the development of data-intensive research through ‘digital scholarship centres’. Such centres provide a range of facilities, such as specialist tools, methods and techniques for collecting, processing and visualising data.”

Only the services of the libraries of the Consortium of Universities of Applied Sciences (SHB) are listed on the consortium's website:
• Support and advice in archiving and making available (raw) research data through the University of Applied Sciences' data repository for reference and (re)use of the data.
• Support and advice in setting up data management plans.
• Referral to, for example, DANS, 3TU or international repositories for very large files, special formats and/or long-term storage.

Research Data Management National Coordination Point

The Research Data Management National Coordination Point (LCRDM) is a national network of experts in research data management (RDM). The LCRDM is an initiative of SURF, commissioned by the VSNU, and facilitates the link between policy and solution. Within the LCRDM, experts work together to put RDM topics on the agenda that are too large for one institution and require a common national approach.

Research Data Management Control Group

The chief information officers (CIOs) of the universities and UMCs have, together with SURF, established an informal consultation body.

Appendix: Overview of data repositories in the Netherlands

Source: https://www.re3data.org/search?query=&countries%5B%5D=NLD&sort=name. This website was consulted on 5 July 2019 and at that time had 55 results for the Netherlands. In the list, the repositories with CoreTrustSeal (or the Data Seal of Approval) are highlighted in green. The four columns on the right show for which domains the repository is available.
• HSS – Humanities and Social Sciences
• LS – Life Sciences
• NS – Natural Sciences
• ES – Engineering Sciences
Note that almost half of the repositories below do not have unique identifiers, which means that the conditions for F of FAIR are not met.

HSS LS NS ES 4TU.Centre for Data-archief voor de technische wetenschappen X X X Research Data AlgaeBase AlgaeBase is a database of information on algae that includes X terrestrial, marine and freshwater organisms. Amsterdam Cohort The Amsterdam cohort study on HIV infection and AIDS among X Studies homosexual men, expanded to include drug users. CancerData.org The CancerData site is an effort of the Medical Informatics and x Knowledge Engineering team of Maastro Clinic, Maastricht. CARIBIC CARIBIC is a scientific project to study and monitor important X chemical and physical processes in the Earth´s atmosphere. CLAPOP CLAPOP is the portal of the Dutch CLARIN community. X X CLARIN INT Resources that are relevant to the lexicological study of the X Portal Dutch language and on resources relevant for research in and development of language and speech technology. CLARIN-ERIC CLARIN has a focus on language resources (data and tools). It X X is being implemented and improved at leading institutions in a large and growing number of European countries data.enanomapper A substance database for nanomaterial safety information X X DataverseNL Online storage, sharing and registration of research data, during X X X X the research period and up to ten years after its completion. DHS Data Access The DNB Household Survey (DHS) supplies longitudinal data to X the international academic community, with a focus on the psychological and economic aspects of financial behavior. Donders The repository of the Donders Institute for Brain, Cognition and X Repository Behaviour at the Radboud University. eartH2Observe EartH2Observe brings together the findings from European FP X projects. It will integrate available global earth observations. EASY Online archiving system with access to thousands of datasets in X X X the humanities, the social sciences and other disciplines. 
EDGAR The Emissions Database for Global Atmospheric Research X provides independent estimates of the global anthropogenic emissions and emission trends. EIDA EIDA is a distributed data centre established to (a) securely X archive seismic waveform data, and (b) provide transparent access to the archives by geosciences research communities. eLaborate eLaborate is an online work environment in which scholars can X upload scans, transcribe and annotate text, and publish the results as on online text edition which is freely available.

Repository | Description | Domains marked (HSS, LS, NS, ES)
European Climate Assessment & Dataset project | Presented is information on changes in weather and climate extremes, as well as the daily dataset needed to monitor and analyse these extremes. | X
Huygens ING | Huygens ING intends to open up old and inaccessible sources, and to understand them better. Huygens ING aims to publish digital sources and data responsibly and with care. | X
ICOS Carbon Portal | Data portal of the Integrated Carbon Observation System. It provides observational data on the state of the carbon cycle in Europe and the world. | X
ICTWSS database | The ICTWSS database covers four key elements of modern political economies: trade unionism, wage setting, state intervention and social pacts. | X
IISH Dataverse | The IISH Dataverse contains micro-, meso-, and macro-level datasets on social and economic history. | X
ISRIC - World Soil Information | ISRIC has a mission to serve the international community with information about the world’s soil resources to help address major global issues. | X X
ISRIC Soil Metadata Catalogue | data.isric.org is the central location for searching and downloading soil databases/layers from around the world. | X X
KNMI Climate Explorer | The KNMI Climate Explorer is a web application to analyse climate data statistically. | X
KNMI Data Centre | The KNMI Data Centre (KDC) provides access to weather, climate and seismological datasets of KNMI. | X
Land Portal | The Land Portal collects metadata from statistical datasets relating to land, peer-reviewed articles and other research reports, national laws and policies, grey literature, but also news, blogs and organization profiles. | X X X
Leiden Open Variation Database | The LOVD portal provides LOVD software and access to a list of worldwide LOVD applications through the Locus Specific Database list and the List of Public LOVD installations. | X
LISS Panel | The LISS panel (Longitudinal Internet Studies for the Social sciences) consists of 4500 households, comprising 7000 individuals. | X
Longitudinal Aging Study Amsterdam | LASA focuses on physical, emotional, cognitive and social functioning in late life, the connections between these aspects, and the changes that occur in the course of time. | X
Maddison Project | Maddison's work contains the Project Dataset with estimates of GDP per capita for all countries in the world between 1820 and 2010 in a format amenable to analysis in R. | X
Meertens Institute Collections | The focus is on resources relevant for the study of function, meaning and coherence of cultural expressions and resources relevant for the study of language variation within (Dutch) | X
MycoBank | MycoBank is a service to the mycological and scientific society by documenting mycological nomenclatural novelties (new names and combinations) and associated data. | X
Open Rotterdam Glaucoma Imaging Data Sets | To accommodate a wider scope of ophthalmic data, we launched our new Rotterdam Ophthalmic Data Repository. Note: this portal has a successor in RODR (see below). | X
OpenML | OpenML is an open ecosystem for machine learning. OpenML is a platform to share detailed experimental results with the community at large and organize them for future reuse. | X X
Penn World Table 9.0 | PWT version 9.0 is a database with information on relative levels of income, output, input and productivity, covering 182 countries between 1950 and 2014. | X
PROFILES Registry | Patient Reported Outcomes Following Initial treatment and Long term Evaluation of Survivorship is a registry for the study of the impact of cancer and its treatment. | X

Repository | Description | Domains marked (HSS, LS, NS, ES)
Pseudobase | Since the first discovery of RNA pseudoknots, more and more pseudoknots have been found. However, not all of those pseudoknot data are easy to trace. | X
Rotterdam Ophthalmic Data Repository | The Rotterdam Ophthalmic Data Repository contains data sets related to ophthalmology that the Rotterdam Ophthalmic Institute has made freely available for researchers worldwide. | X
SeaDataNet | SeaDataNet is a standardized system for managing the large and diverse data sets collected by the oceanographic fleets and the automatic observation systems. | X
Sound and Vision | Sound and Vision has one of the largest audiovisual archives in Europe. The institute manages over 70 percent of the Dutch audiovisual heritage. | X
SACA&D | The Southeast Asian Climate Assessment & Dataset focuses on the digitization and use of high-resolution historical climate data from Indonesia and other Southeast Asian countries. | X
STITCH 4.0 | The database explores the interactions of chemicals and proteins. | X
SURF Data Repository | The SURF Data Repository allows researchers to store, annotate and publish research datasets of any size to ensure long-term preservation and availability of their data. | X X X X
SHARE | The Survey of Health, Ageing and Retirement in Europe is a multidisciplinary and cross-national panel database of micro data on health, socio-economic status and social networks. | X X
ISO Data Archive | The Infrared Space Observatory (ISO) is designed to provide detailed infrared properties of selected Galactic and extragalactic sources. | X
The Language Archive | The Language Archive stores a lot of unique material from a large variety of languages worldwide, which is recorded and analyzed by researchers from different linguistic disciplines. | X
Tilburg University Dataverse | TiU Dataverse is the central online repository for research data at Tilburg University. | X
TRAILS | TRAILS is a prospective cohort study of young people from the northern part of the Netherlands, with information that spans the total period from preadolescence up to young adulthood. | X X
TreeBASE | TreeBASE is a repository of phylogenetic information, specifically user-submitted phylogenetic trees and the data used to generate them. | X
UvA / AUAS figshare | The University of Amsterdam and the Amsterdam University of Applied Sciences cooperate to connect academic research with the insights and experiences from professional practice. | X X X X
World Christian Database | The World Christian Database provides comprehensive statistical information on world religions, Christian denominations, and people groups. | X
World Religion Database | The World Religion Database contains detailed statistics on religious affiliation for every country of the world. | X
WorldClim – Global Climate Data | WorldClim is a set of global climate layers (climate grids) with a spatial resolution of about 1 square kilometer. The data can be used for mapping and spatial modeling in a GIS or with other computer programs. | X
YODA | Yoda publishes research data on behalf of researchers that are affiliated with Utrecht University, its research institutes and consortia where it acts as a coordinating body. | X X X X
Total | | HSS 27, LS 24, NS 22, ES 8
