Land and Nutrition Data in the Land Portal and the Global Nutrition Report: a Gap Exploration Report

GODAN ACTION LEARNING PAPER

Valeria Pesce, Global Forum on Agricultural Research (GFAR)
Lisette Mey, Land Portal
Pauline L’Hénaff, Open Data Institute (ODI)
Carlos Tejo-Alonso, Land Portal

30 June 2018

Executive summary

GODAN Action supports data users, producers and intermediaries to effectively engage with open data and maximise its potential for impact in the agriculture and nutrition sectors. In particular, we work to strengthen capacity, to promote common standards and best practice, and to improve how we measure impact.

This gap analysis report is the third in a series which has examined gaps in data standards. The first version of the report examined gaps in agriculture and food data (Pesce, Kayumbi, Tennison, Mey, and Zervas: 2016).

A second version (Pesce, Tennison, Dodds and Zervas: 2017), in line with the 2017 project focus on weather data and related use cases, examined the situation in the area of data standards for weather data (and closely related geospatial data), and particularly focused on weather data for use in farm management services.

This third version focuses on data standardisation gaps in specific use cases of aggregation of land data and nutrition data around indicators: the Land Portal and the Global Nutrition Report.

The report starts with a review of the relevant types of data for these use cases, then illustrates similarities between the two projects and similar standardisation gaps, and then moves to more specific challenges for the two individual projects.

The Land Portal (LP) gathers information from a broad range of land-related data and information providers. It is organised and visualised in ways that are intuitive and usable for researchers, private sector actors and policy makers at global and local levels. The information provided can strengthen research, advocacy, and policy-making efforts by enabling a better understanding of land governance issues affecting various countries and regions.

The Global Nutrition Report (GNR) is a comprehensive narrative on global and country-level nutrition. GNR produces the Report annually and aggregates a wealth of nutrition and nutrition-related data from a wide range of sources. This data underpins the report itself as well as being used to produce a range of supplementary materials, including country, regional, and sub-regional profiles and data visualisation tools.

The main conclusions drawn from the report are that:

• The two use cases present many similarities. They both aggregate data from secondary sources, already partly normalised by global agencies; they both aggregate data around specific indicators; they aggregate from datasets with a similar structure (indicator, country, year, value).

• The identified gaps in data standardisation are very similar. The names of countries and regions in data sources are not standardised or they are standardised according to different conventions; the names of the variables do not follow any convention; indicators are represented by strings and may change over the years (both their names and the measurement methods).

With reference to our data standard assessment criteria, the few standards used by the data sources (country naming conventions, value ranges, units of measurement) in both cases are not open and not very usable. The situation is different when it comes to the way the two projects re-publish the data: the GNR normalises values around some conventions, while the LP re-publishes everything according to Linked Data principles, and uses published vocabularies.
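To make the country-naming gap described in this summary more concrete, the following is a minimal Python sketch, not taken from either project. It assumes observations shaped as (indicator, country, year, value) and maps inconsistent country names to ISO 3166-1 alpha-3 codes before merging two sources. The alias table, the indicator name, the sample rows and the helper names (`Observation`, `to_iso3`, `normalise`) are all hypothetical and purely illustrative.

```python
from typing import List, NamedTuple, Optional, Tuple


class Observation(NamedTuple):
    """One row in the (indicator, country, year, value) structure."""
    indicator: str
    country: str
    year: int
    value: float


# Illustrative alias table (an assumption for this sketch, not a real
# project resource): lower-cased country names -> ISO 3166-1 alpha-3 codes.
COUNTRY_ALIASES = {
    "bolivia": "BOL",
    "bolivia (plurinational state of)": "BOL",
    "tanzania": "TZA",
    "united republic of tanzania": "TZA",
    "viet nam": "VNM",
    "vietnam": "VNM",
}


def to_iso3(name: str) -> Optional[str]:
    """Return the ISO alpha-3 code for a country name, or None if unknown."""
    return COUNTRY_ALIASES.get(name.strip().lower())


def normalise(rows: List[Observation]) -> Tuple[List[Observation], List[Observation]]:
    """Split rows into those with a recognised country and those without."""
    normalised, unmatched = [], []
    for row in rows:
        code = to_iso3(row.country)
        if code is None:
            unmatched.append(row)  # keep for manual review, do not guess
        else:
            normalised.append(row._replace(country=code))
    return normalised, unmatched


# Made-up sample rows from two hypothetical sources using different conventions.
source_a = [Observation("hypothetical_indicator", "Bolivia (Plurinational State of)", 2015, 12.5)]
source_b = [
    Observation("hypothetical_indicator", "Bolivia", 2016, 11.0),
    Observation("hypothetical_indicator", "Atlantis", 2016, 1.0),
]

merged, unmatched = normalise(source_a + source_b)
```

In this sketch, rows whose country name is not in the table are set aside rather than guessed, since a silent mismatch would distort any aggregation around an indicator.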
Contents

1 Introduction
1.1 Relevant types of data
2 Similarities between the two use cases: types of data sources
2.1 Common standardisation issues
3 Land data use case: the Land Portal
3.1 Land Portal specific standardisation gaps
3.1.1 Data sources
3.1.2 Re-published data
4 Nutrition data use case: the Global Nutrition Report (GNR)
4.1 GNR specific standardisation gaps
4.1.1 Data sources
4.1.2 Re-published data
5 Experts interviewed
6 Conclusions
References

1 Introduction

This third version of the data standards gap analysis focuses on specific use cases of aggregation of land data and nutrition data: the Land Portal and the Global Nutrition Report.

The Land Portal¹ gathers information from a broad range of land-related data and information providers and is organised and visualised in ways that are intuitive and usable for researchers, private sector actors and policy makers at global and local levels. The information provided can strengthen research, advocacy, and policy-making efforts by enabling a better understanding of land governance issues affecting various countries and regions.

The Global Nutrition Report² is a comprehensive narrative on global and country-level nutrition. GNR produces the Report annually and aggregates a wealth of nutrition and nutrition-related data from a wide range of sources. This data underpins the report itself as well as being used to produce a range of supplementary materials, including country, regional, and sub-regional profiles and data visualisation tools.

Before writing this report, we conducted a survey of data standards for land data and nutrition data. This report will show that very few of the standards surveyed are relevant to the Land Portal or the Global Nutrition Report. This is because our survey considered primary land data and nutrition data, which normally comes in typical statistical formats (from tabular to SDMX) and uses the data dictionaries, code lists and the classification schemes agreed upon by authoritative agencies in that field. All data standards relevant for this type of data are analysed in our survey report (forthcoming).

In this report, on the other hand, we are focusing on the two use cases, which are data aggregators aiming to visualise and narrate the current status of relevant socio-economic indicators in the world. They therefore collect data from secondary sources built by global agencies responsible for those indicators, where the primary national/regional data have already been normalised and consolidated around specific indicators.

This report illustrates the similarities between the two use cases in terms of structure of the data sources and related data standardisation gaps.

1.1 Definitions

In this document, we use the same terminology as in our previous reports, which is explained in more detail in ‘A Map of Agri-food Data Standards’ (Pesce, Tennison, Mey, Jonquet, Toulet, Aubin, and Zervas: 2016).

We often use the terms “data standards” and “vocabularies” interchangeably, to indicate any specification (from models to templates/schemas to data dictionaries to code lists to thesauri) that normalises the way an entity is described or categorised. This corresponds to the general definitions of the W3C⁴ for vocabularies: “vocabularies define the concepts and relationships used to describe and represent an area of concern”.³

In other cases, when it is important to clarify the type of a specific data standard/vocabulary or which type of standard would be appropriate to improve the interoperability of some datasets, we indicate the more specific type of standard, either referring to a broader group of standards if any type in that group applies, or referring to the specific type of standard if we want to be more specific. In these cases we refer to the groupings defined by the W3C and to the specific types defined by the Dublin Core list of KOS⁵:

• Metadata element sets or element sets (or “description vocabularies”) “define classes and attributes used to describe entities of interest”. Specific types of description vocabularies are: metadata schemas (more specifically, XML schemas, JSON schemas, RDF schemas), models (including UML models), templates, ontologies and more.

• Value vocabularies “define resources (such as instances of topics, art styles, or authors) that are used as values for elements in metadata records. [...] A value vocabulary thus represents a controlled list of allowed values for an element. Examples include: thesauri, code lists, term lists, classification schemes, subject heading lists, taxonomies, authority files, digital gazetteers, concept schemes, and other types of knowledge organisation systems”.

The second type of vocabulary is very relevant for the use cases analysed in this document. Therefore, the term value vocabulary, or more specific terms like code lists or classifications, are often used in the document, according to the definitions above and to the more detailed definitions in ‘A Map of Agri-food Data Standards’ and in the published Map of agri-food data standards.⁶

1 http://landportal.org
2 http://globalnutritionreport.org/
3 https://www.w3.org/standards/semanticweb/ontology
4 https://www.w3.org/2005/Incubator/lld/XGR-lld-vocabdataset-20111025/#Introduction:_Scope_and_Definitions
5 http://wiki.dublincore.org/index.php/NKOS_Vocabularies#KOS_Types_Vocabulary
6 See definitions at http://vest.agrisemantics.org/about/structure

2 Similarities between the two use cases: types of data sources

The thematic topics on which the project focuses in the global
