A Survey of Data Standards for Land and Nutrition Data
GODAN ACTION LEARNING PAPER

Valeria Pesce, Global Forum on Agricultural Research and Innovation (GFAR)
Lisette Mey, Land Portal
Pauline L’Hénaff, Open Data Institute (ODI)
Carlos Tejo-Alonso, Land Portal

30 April 2018

Executive summary

GODAN Action supports data users, producers and intermediaries to effectively engage with open data and maximise its potential for impact in the agriculture and nutrition sectors. In particular, we work to strengthen capacity, to promote common standards and best practice, and to improve how we measure impact.

This report is a short accompanying document to the GODAN Action Map of Data Standards (http://vest.agrisemantics.org), which now includes two new sections dedicated to data standards relevant for the two thematic topics on which the GODAN Action project focuses in its second year: land data and nutrition data.

Links to the two new sections are on the homepage of the Map (http://vest.agrisemantics.org). The direct link to the land data section is http://vest.agrisemantics.org/by-theme/7705/7705, while the direct link to the (mal)nutrition data section is http://vest.agrisemantics.org/by-theme/7706/7706.

We refer to our original map of standards report (Pesce V, Kayumbi GW, Tennison J et al.; 2016) for the full description of the map, the organisation of the content (metadata and classifications) and the initial coverage. What we describe in detail here is the process followed to identify relevant data standards for the two new thematic topics.

As we did for weather data in the first year of the project, we framed our survey of data standards around:

• the types of data commonly used for land and nutrition;
• the standardisation practices that are currently in use for these types of data (from statistical data formats to code lists and classification schemes);
• the corresponding authoritative bodies to which all experts look, as well as important projects and initiatives.

The result of this work is the identification of the most relevant data standards for these two topics (listed in chapter 2) and their inclusion in the Map (which can be found at the links above).

An initial overview of the data standards identified shows that the data formats most widely used for these types of data are the typical statistical formats (from tabular to SDMX). Indeed, malnutrition, land tenure and land use are socio-economic dimensions, mainly measured through statistics and surveys and elaborated through projections.

On the other hand, the truly topic-specific data standards are the data dictionaries, code lists and classification schemes used for each specific type of data. This explains why the majority of the data standards identified for these two thematic topics are value vocabularies, like code lists and classification schemes. However, beyond a few formalised classifications, most of the standardisation work in these areas is done through recommendations and guidelines issued by authoritative bodies.

A more in-depth analysis of the data standards and standardisation gaps in the areas of land data and nutrition data is provided in our full gap exploration report (Pesce et al.; 2018).

Contents

1 Coverage of data standards for land data and nutrition data
1.1 Methodology for identification and inclusion of data standards
1.2 General types of data relevant for both land data and nutrition data
1.2.1 Statistical data
1.2.2 Geospatial/geopolitical data
1.3 Land data
1.4 Nutrition data
1.5 Experts interviewed
1.6 Conclusions
2 List of data standards added to the map
2.1 List of statistical and global indicators data standards
2.2 List of data standards relevant for the thematic topic - land
2.3 List of data standards relevant for the thematic topic - nutrition
References

1 Coverage of data standards for land data and nutrition data

1.1 Methodology for identification and inclusion of data standards

We followed the same methodology as for weather data standards (Pesce, Tennison, Dodds and Zervas; 2018) and based our identification and selection of relevant data standards around:

• the types of data commonly used for land and nutrition;
• the standardisation practices that are currently in use for these types of data (from statistical data formats to code lists and classification schemes);
• the corresponding authoritative bodies to which all experts look, as well as important projects and initiatives;
• interviews with experts on the points above.

An important aspect of identifying the relevant types of data is determining which types of users are going to need land data and nutrition data, and therefore which specific types of data are needed for their purposes.

The identification of data types in the following chapters is based on the fact, on which the interviewed experts (see chapter 1.5) agree, that the primary users of land and nutrition data are policy makers. These data are also needed by infomediaries, but even then the final objective of using these data is to influence policy makers.

1.2 General types of data relevant for both land data and nutrition data

Malnutrition, land tenure and land use are socio-economic dimensions, mainly measured through statistics and surveys and elaborated through projections.

So the data formats adopted are normally the typical statistical formats (from tabular to SDMX), while the truly topic-specific data standards are the data dictionaries, code lists and classification schemes used (see chapters 1.3 and 1.4).

In addition, all these types of datasets have some geospatial dimension, although in most cases it is more geopolitical and limited to area codes or country codes.

Land cover data, the only type of data under these themes that doesn’t have a socio-economic dimension, is the exception. It is geospatial data with observational features describing the physical nature of the land cover, so it uses the typical data formats and conventions of geospatial and observation data (as seen in our weather data gap analysis report), plus a few classification schemes of land cover types (see chapter 1.3).

Therefore, both statistical and geospatial data standards are relevant for the two thematic topics.

1.2.1 Statistical data

National statistics provide many of the data necessary to contextualise land data and nutrition data: population, age, gender, occupation… Many international organisations and initiatives then aggregate these data to provide local, regional and global overviews. The main actors for this type of data are national governments, followed by the regional and global agencies that have a mandate to collect these data (like the EU or the UN agencies).

However, official government statistics don’t collect all the nutrition and land data that are needed to create useful information systems and products, such as consumption data, details about land use and perceived land tenure security. For these types of data, ad-hoc surveys are a key instrument.

Such surveys are normally run by national governments or by international organisations in coordination with national governments (see chapters 1.3 and 1.4 for the role of international organisations like FAO or WHO in coordinating and aggregating surveys), but also by dedicated initiatives and projects. In many cases, given the complexities of the methodology and of the execution (in order to reach maximum coverage), these surveys are outsourced to specialised agencies.

The reference body for socio-economic statistics and surveys, especially for global aggregation of data and for guidelines and standards, is the UN Statistics Division.¹ For household surveys in particular, the UN working group called the Inter-secretariat Working Group on Household Surveys² aims at fostering coordination and harmonisation of household survey activities.

Another important initiative in this area is the International Household Survey Network (IHSN). The mission of the IHSN is to improve the availability, accessibility, and quality of survey data within developing countries, and to encourage the analysis and use of this data by national and international development decision makers, the research community, and other stakeholders. One of their objectives is “Availability of standards, tools, and guidelines that would allow data producers to document, disseminate, and preserve microdata according to international standards and best practices”.³

All these bodies more or less recommend the same data standards:

• For metadata:
  • The XML schema of the Data Documentation Initiative (DDI)⁴ for the metadata, developed specifically for the documentation and […]

[…] which are the inputs and outputs in the design and production of statistics.

These are general standards for the metadata and the model, but the structure of the surveys then depends, of course, on the data to be collected, and several institutions (UN, World Bank, USDA) produce guidelines for specific types of data (land use, food consumption…).

Indeed, most of the standardisation work is done through recommendations and guidelines, partly because the standardisation of the methodology is more important than the standardisation of the variable names for further reuse; variable names need to be accompanied by specifications like methods, formulas etc. Examples are the UN “Principles and Recommendations for Population and Housing Censuses”⁹ or the USDA “Current Population Survey Food Security Supplement December 2016 Microdata File User Notes”.¹⁰

In reality, most surveys are conducted without using standards for the metadata, while for the data they use tabular formats and some conventions and guidelines on variables and, more rarely, on variable names. Quite often a lot of harmonisation work is done by the global agencies (UN, World Bank) that aggregate the data and make them available again in a more standardised way, e.g.