<<

AGROVOC Semantic data interoperability on food and agriculture

Food and Agriculture Organization of the United Nations
Rome, 2021

Required citation: FAO. 2021. AGROVOC – Semantic data interoperability on food and agriculture. Rome. https://doi.org/10.4060/cb2838en

The designations employed and the presentation of material in this information product do not imply the expression of any opinion whatsoever on the part of the Food and Agriculture Organization of the United Nations (FAO) concerning the legal or development status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. The mention of specific companies or products of manufacturers, whether or not these have been patented, does not imply that these have been endorsed or recommended by FAO in preference to others of a similar nature that are not mentioned. The views expressed in this information product are those of the author(s) and do not necessarily reflect the views or policies of FAO.

ISBN 978-92-5-133831-5
© FAO, 2021

Some rights reserved. This work is made available under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 IGO licence (CC BY-NC-SA 3.0 IGO; https://creativecommons.org/licenses/by-nc-sa/3.0/igo/legalcode). Under the terms of this licence, this work may be copied, redistributed and adapted for non-commercial purposes, provided that the work is appropriately cited. In any use of this work, there should be no suggestion that FAO endorses any specific organization, products or services. The use of the FAO logo is not permitted. If the work is adapted, then it must be licensed under the same or equivalent Creative Commons licence. If a translation of this work is created, it must include the following disclaimer along with the required citation: “This translation was not created by the Food and Agriculture Organization of the United Nations (FAO). FAO is not responsible for the content or accuracy of this translation. The original [Language] edition shall be the authoritative edition.”

Disputes arising under the licence that cannot be settled amicably will be resolved by mediation and arbitration as described in Article 8 of the licence except as otherwise provided herein. The applicable mediation rules will be the mediation rules of the World Intellectual Property Organization http://www.wipo.int/amc/en/mediation/rules and any arbitration will be conducted in accordance with the Arbitration Rules of the United Nations Commission on International Trade Law (UNCITRAL).

Third-party materials. Users wishing to reuse material from this work that is attributed to a third party, such as tables, figures or images, are responsible for determining whether permission is needed for that reuse and for obtaining permission from the copyright holder. The risk of claims resulting from infringement of any third-party-owned component in the work rests solely with the user.

Sales, rights and licensing. FAO information products are available on the FAO website (www.fao.org/publications) and can be purchased through [email protected]. Requests for commercial use should be submitted via: www.fao.org/contact-us/licence-request. Queries regarding rights and licensing should be submitted to: [email protected].

Contents

Acronyms and abbreviations
Executive summary
1 Data sharing and interoperability
1.1 Vocabularies
1.2 Vocabularies and the FAIR principles
1.3 Vocabularies and Linked Data
1.4 Resource Description Framework (RDF)
2 Knowledge Organization Systems (KOS)
2.1 Sharing versus creating a new KOS
2.2 Using, integrating and merging a KOS
3 AGROVOC
3.1 The AGROVOC concept model
3.2 AGROVOC VoID
3.3 Copyright and licensing
4 Accessing AGROVOC
4.1 Browsing and searching in Skosmos
4.2 The Skosmos API
4.3 Web-based SPARQL interface
4.4 SOAP web services
5 Contributing to AGROVOC
5.1 Community-based content curation
5.2 Benefits of joining AGROVOC
5.3 The AGROVOC editorial guidelines
5.3.1 The Agrontology
5.3.2 Alignments
5.4 Subvocabularies
5.5 Suggesting new terms
6 VocBench: editorial workflow
7 VocBench: navigating AGROVOC
7.1 Browsing and searching in VocBench
7.2 Searching through a SPARQL query in VocBench
8 VocBench: curating AGROVOC
8.1 Adding a new term
8.1.1 Adding a preferred term
8.1.2 Adding an alternative term
8.2 Editing an existing term
8.3 Adding a new concept
8.4 Adding other properties
8.5 Adding a relationship between concepts
8.6 Mapping to an external concept (alignments)
9 VocBench: curating a subvocabulary
9.1 Associating concepts with a new scheme
9.2 Creating scheme-specific hierarchical properties
9.3 Implementing the scheme-specific hierarchy
9.3.1 Positioning a concept in the scheme-specific hierarchy
10 VocBench: importing and exporting
10.1 SPARQL and Sheet2RDF to export and import data
10.1.1 Importing translations
Glossary
Bibliography
Annex 1 List of AGROVOC editorial institutions 2020
Annex 2 AGROVOC 25 top concepts

Abbreviations and acronyms

AGRIS: International System for Agricultural Science and Technology
BT: Broader Term
CSV: Comma-separated values
DOI: Digital Object Identifier
FAIR: Findable, Accessible, Interoperable and Reusable
FAO: Food and Agriculture Organization of the United Nations
FAOTERM: FAO Terminology
FTP: File Transfer Protocol
GEMET: General Multilingual Environmental Thesaurus
HTTP: Hypertext Transfer Protocol
ISO: International Organization for Standardization
JSON: JavaScript Object Notation
KOS: Knowledge Organization System
LOD: Linked Open Data
NKOS AP: Networked Knowledge Organization Systems Dublin Core Application Profile
NT: Narrower Term
OKFN: Open Knowledge Foundation
OWL: Web Ontology Language
RDF: Resource Description Framework
RT: Related Term
SKOS: Simple Knowledge Organization System
SKOS-XL: Simple Knowledge Organization System eXtension for Labels
SQL: Structured Query Language
URI: Uniform Resource Identifier
URL: Uniform Resource Locator
VoID: Vocabulary of Interlinked Datasets
XML: eXtensible Markup Language
W3C: World Wide Web Consortium


Executive summary

Since the early 1980s, the Food and Agriculture Organization of the United Nations (FAO) has coordinated AGROVOC, a valuable tool for classifying data homogeneously, facilitating interoperability and reuse.

AGROVOC is a multilingual and controlled vocabulary designed to cover concepts and terminology under FAO’s areas of interest. It is the largest Linked Open Data set about agriculture available for public use, and its greatest impact comes from improving the accessibility and visibility of data across domains and languages.

AGROVOC provides a way to organize knowledge for subsequent data retrieval and consists of a structured collection of concepts, terms, definitions and relationships. Concepts represent anything in food and agriculture, such as maize, hunger, aquaculture, value chains or forestry; these concepts are used to unambiguously identify resources, allowing standardized indexing processes to make searching more efficient. Each AGROVOC concept also has terms used to express it in various languages, known as lexicalizations. Today, AGROVOC consists of more than 38 000 concepts and 800 000 terms in up to 40 languages.

Over recent years, AGROVOC has evolved to become a valued information resource worldwide, with more than 30 million accesses a year. This book aims to increase awareness of the use of AGROVOC to enhance the accessibility and visibility of information and data, as well as to inform about the latest technical developments, recommended standards and various ways to engage with AGROVOC. This publication is especially targeted at individuals and institutions who are interested in controlled vocabularies and SKOS, who may also wish to use AGROVOC or improve its usage, and those who may wish to contribute to AGROVOC, either through the AGROVOC editorial community or as part of a community of experts.


1 Data sharing and interoperability

In agriculture, data exchange is essential for research and innovation as well as for business, including for market prices, infrastructure, and weather information.

Data exchange is also needed for policy and regulations, such as tracking of food products and use of pesticides. Key factors that have changed the nature of data sharing in recent decades are the advent of computers and digital data, the Internet and, more recently, the cloud and big data technologies, which have multiplied the potential of data processing power. Over the last decade, machines have become the primary data consumers and the actual intermediaries in the data-sharing process; therefore, data must be machine-readable in order to be shared.

The Semantic Web aims to make online information machine-readable and it is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C). The Semantic Web allows for the connection of information that can be easily read by machines including computers, artificially-intelligent bots, virtual assistants, mobile phones, fitness trackers, smart TVs, or other devices commonly used to access information. By combining different sets of data, the Semantic Web makes it easier to offer people the information they seek at the time they need it, including information related to precision agriculture, remote sensing, supply chains, and public health challenges.

While there has never been as much data and information available as now, it is not always simple to find the right information: it is distributed, fragmented, and often compartmentalized. Today, online availability does not imply accessibility. Metadata therefore plays a crucial role in making data findable; while data are the actual pieces of information like numbers, dates or literal values, metadata contains information about the data.

Metadata is “structured data about anything that can be named, such as Web pages, books, journal articles, images, songs, products, processes, people (and their activities), research data, concepts, and services” (DCMI: Metadata Basics, n.d.).

1.1 Vocabularies

A vocabulary is a data model comprised of classes, properties and relationships, which can be used for describing data and metadata. It also defines agreed values, ideally in different languages, and exposes the machine-readable definition as a Uniform Resource Identifier (URI), which helps humans and machines to interpret the values. Vocabularies that have this function are known as “value vocabularies” or Knowledge Organization Systems (KOS), like thesauri, taxonomies, or classifications. Vocabularies help overcome the issue of unclear concepts, which are an obstacle for interoperability across datasets.

Given the long tradition of knowledge organization endeavours, there are many types of vocabularies, although distinctions are not always clear-cut and there is no formal classification. Nevertheless, vocabularies can be grouped into two main types:

1. Description vocabularies: metadata schemas and ontologies. Description vocabularies define classes and attributes used to describe entities of interest. For example, Dublin Core is a vocabulary for describing documents (or any information resource), while Darwin Core is a vocabulary for the description of germplasm specimens. On the Web, metadata element sets and data models are generally expressed as Resource Description Framework (RDF) schemas, see Section 1.4, or Web Ontology Language (OWL). OWL is a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things and is designed for use by applications to process information (W3C, 2004).

There is no authoritative list of description vocabularies, but the most commonly used are:

schema (or metadata element set): any set of metadata elements, like Extensible Markup Language (XML) schemas, RDF schemas or less formalised set of descriptors;

application profile: a schema which consists of metadata elements drawn from one or more namespaces (a set of commonly used properties), combined together and optimised for a particular local application;

messaging standard: a standard that describes how to format syntactically (and sometimes semantically) a machine-transmitted message usually describing some event- or time-related information;

variable name convention: a set of prescribed variable names for certain types of observations;

ontology: a way of showing the properties of a subject area and how they are related, by defining a set of concepts and categories that represent the subject.

2. Value vocabularies or KOS: thesauri, code lists, classifications, authority lists, etc. Value vocabularies are collections of things or concepts, typically with an identifier, definitions and possibly basic relationships between them. For example, AGROVOC is a controlled vocabulary covering all FAO’s areas of interest; GeoNames is an authoritative database of geographic entities.

On the Web, KOS are generally expressed using Simple Knowledge Organization System (SKOS). SKOS is a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is part of the Semantic Web family of standards, and its main objective is to enable easy publication and use of such vocabularies as linked data.

A taxonomy of value vocabulary types has been partially constructed by the Networked Knowledge Organization Systems Application Profile (NKOS AP) (Zeng and Zumer, 2015) and the KOS Types Vocabulary (BARTOC, 2019), which gives a sense of the great variety of KOS, see below:

categorisation scheme: loosely formed grouping scheme;

classification scheme: schedule of concepts and pre-coordinated combinations of concepts, arranged by classification;

dictionary: reference source containing words usually alphabetically arranged along with information about their forms, pronunciations, functions, etymologies, meanings, and syntactical and idiomatic uses;

gazetteer: geospatial dictionary of named and typed places;

glossary: collection of textual glosses or of specialised terms with their meanings;

list: a limited set of terms arranged as a simple alphabetical list or in some other logically evident way; containing no relationships of any kind;

name authority list or authority file: controlled vocabulary for use in naming particular entities consistently;

ontology: formal model that allows knowledge to be represented for a specific domain; an ontology describes the types of things that exist (classes), the relationships between them (properties) and the logical ways those classes and properties can be used together (axioms);

semantic network: set of terms representing concepts, modelled as the nodes in a network of variable relationship types;

subject heading scheme: structured vocabulary comprising terms available for subject indexing, plus rules for combining them into pre-coordinated strings of terms where necessary;

synonym ring: set of synonymous or almost synonymous terms, any of which can be used to refer to a particular concept;

taxonomy: scheme of categories and subcategories that can be used to sort and otherwise organize items of knowledge or information;

terminology: set of designations belonging to one special language;

thesaurus: controlled and structured vocabulary in which concepts are represented by terms, organized so that relationships between concepts are made explicit, and preferred terms are accompanied by lead-in entries for synonyms or quasi-synonyms.

Ontologies appear in both lists: they are both a modelling vocabulary and a value vocabulary. As stated on Wikipedia, “most ontologies describe individuals (instances), classes (concepts), attributes, and relations” (Wikipedia, 2020a).

1.2 Vocabularies and the FAIR principles

In 2016, a group of stakeholders from academia, industry, publishers, and funding agencies established the FAIR Principles (Wilkinson et al., 2016) with the objective of defining how to enable more effective data sharing. The FAIR principles state that data must be (F)indable, (A)ccessible, (I)nteroperable and (R)eusable. This is to support knowledge discovery and innovation, data and knowledge integration, and promote sharing and reuse of data, which focuses more on data-intensive research and sharing data across the data value chain. The FAIR principles help data and metadata to be machine readable, supporting new discoveries through the harvest and analysis of multiple datasets (FORCE11, 2016). The FAIR principles are enablers in terms of discovery and semantic interoperability.

A description of each of the FAIR principles is provided below:

Findable. It is recommended to assign a globally unique and persistent identifier, e.g. a Digital Object Identifier (DOI), to describe the data with rich metadata, and to make sure the data are findable through disciplinary discovery portals. This is an essential step and, without it, achieving the other FAIR principles is more difficult:

F1. (Meta)data are assigned a globally unique and persistent identifier;

F2. Data are described with rich metadata, defined by the principle Reusable 1 (R1), see below;

F3. Metadata clearly and explicitly include the identifier of the data they describe;

F4. (Meta)data are registered or indexed in a searchable resource.

Accessible. Once data is found, it is necessary to know how to access it, and it should be possible to authenticate and authorise, if required. It is recommended that data and metadata should be retrievable in a variety of formats that are sensible to humans and machines using persistent identifiers.

A1. (Meta)data are retrievable by their identifier using a standardised communications protocol, such as Hypertext Transfer Protocol (HTTP) or File Transfer Protocol (FTP):

– A1.1 The protocol is open, free, and universally implementable;

– A1.2 The protocol allows for an authentication and authorisation procedure, where necessary.

A2. Metadata are accessible, even when the data are no longer available.

Interoperable. The description of metadata elements should follow community guidelines that use open, well-defined vocabularies.

I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.

I2. (Meta)data use vocabularies that follow FAIR principles. This is the key principle for this guide and for AGROVOC. The controlled vocabulary used to describe datasets needs to be documented and resolvable using globally unique and persistent identifiers. This documentation needs to be easily findable and accessible by anyone who uses the dataset (FORCE11, 2016).

I3. (Meta)data include qualified references to other (meta)data.

Reusable. This is the ultimate principle: the data should maintain its initial richness. The description of essential, recommended, and optional metadata elements should be machine-processable and verifiable. In addition, the use of data should be easy and data should be citable to sustain data sharing and to recognize the value of data.

R1. (Meta)data are richly described with a plurality of accurate and relevant attributes:

R1.1. (Meta)data are released with a clear and accessible data usage license;

R1.2. (Meta)data are associated with detailed provenance;

R1.3. (Meta)data meet domain-relevant community standards.

Many stakeholders, especially research funders, are rapidly adopting the FAIR principles, for example through the European Commission Guidelines on FAIR Data Management in the Horizon 2020 programme (European Commission, 2016).

1.3 Vocabularies and Linked Data

The concept of open has been the cornerstone for all initiatives on knowledge and data sharing for decades. Any type of resource (document, image, dataset) can be considered as open; the rules are few and fairly easy to apply. The open data framework is not strictly formalized and cannot be attributed to any one initiative, person or institution. The first formal definition of open was created by the Open Knowledge Foundation (OKFN) in 2005 and was applied to knowledge in general: “Knowledge is open if anyone is free to access, use, modify, and share it – subject, at most, to measures that preserve provenance and openness” (OKFN, 2020a). This definition was also applied by OKFN to open data: “Open data is data that can be freely used, shared and built-on by anyone, anywhere, for any purpose” (OKFN, 2020b).

The first real technical framework for open data was designed by Sir Tim Berners-Lee (henceforth TBL). The technical framework he designed for the web of data is the Linked Open Data (LOD) or simply Linked Data good practice, which was formalized in 2006 (Berners-Lee, 2006). The core of the Linked Data approach consists of technical guidelines to make data fully linked.

TBL published the five stars of open data deployment scheme, see Table 1, “in order to encourage people -- especially government data owners -- along the road to good linked data” (Berners-Lee, 2012). The five star scheme illustrates the continuum in data publishing that leads to the final steps of fully linked open data (the LOD framework is only a subset of it, the last two stars).

TBL’s five stars of open data are still a reference framework for anyone working on open data. They have been generally interpreted as cumulative, in that each additional star presumes the data meets the criteria of the previous step(s), which has always put enormous weight on the first step, an open license. Without an open license, implementing the other four stars would not result in open data. All the other stars relate to data interoperability, while the first star is all about openness for reuse. TBL’s five stars of open data is a framework for openness; hence, they are also called “the five stars of openness”.

To be more precise, distinguishing between linked data and linked open data, TBL said: “Linked Data does not of course in general have to be open -- there is a lot of important use of linked data internally, and for personal and group-wide data. You can have 5-star Linked Data without it being open. However, if it claims to be Linked Open Data then it does have to be open to get any star at all.” (Berners-Lee, 2006).

This framework has also been criticized for being rigid in terms of the strict open license definition and technical approach to RDF in the fourth and fifth stars. However, this does not mean that it has been superseded; it is still the reference framework for high data interoperability and for a loosely coupled, bottom-up open web of data.

The last two stars of TBL’s five stars, using HTTP URIs and linking URIs, are the ones that guarantee the highest level of interoperability. The core four principles for linked data are (Berners-Lee, 2006):

Use URIs as names for things;

Use HTTP URIs so that people can look up those names;

When someone looks up a URI, provide useful information, using the standards;

Include links to other URIs, so that the user can discover more things.

A URI is a string of characters that unambiguously identifies a particular resource. These resources or things can be any sort of tangible or abstract entity, from people to books, countries, events or activities.

Table 1. Tim Berners-Lee’s five stars of open data deployment scheme.

★      Make your stuff available on the web (whatever format) under an open licence
★★     Make it available as structured data (e.g. Excel instead of image scan of a table)
★★★    Use a non-proprietary format (e.g. CSV instead of Excel)
★★★★   Use URIs to identify things, so that people can point at your stuff
★★★★★  Link your data to other people’s data to provide context

Source: (Berners-Lee, 2012)
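The cumulative reading of the five stars described in this section can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the `Dataset` class, its field names and the `star_rating` function are hypothetical and do not come from Berners-Lee's note or from any FAO tool. The one rule it encodes is that a star counts only if every earlier criterion is also met, so a dataset without an open licence gets no stars at all.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Publication traits of a dataset (hypothetical field names)."""
    open_licence: bool      # star 1: on the web under an open licence
    structured: bool        # star 2: available as structured data
    non_proprietary: bool   # star 3: non-proprietary format such as CSV
    uses_uris: bool         # star 4: URIs identify things
    links_out: bool         # star 5: links to other people's data


def star_rating(ds: Dataset) -> int:
    """Count stars cumulatively: stop at the first unmet criterion."""
    criteria = [ds.open_licence, ds.structured, ds.non_proprietary,
                ds.uses_uris, ds.links_out]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


open_csv = Dataset(True, True, True, False, False)
closed_linked_data = Dataset(False, True, True, True, True)
print(star_rating(open_csv))            # open CSV file
print(star_rating(closed_linked_data))  # linked data without open licence
```

Under this reading the openly licensed CSV file earns three stars, while the fully linked but non-open dataset earns none, which matches the distinction between Linked Data and Linked Open Data drawn above.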

1.4 Resource Description Framework (RDF)

The fundamental step ahead towards semantic interoperability was the introduction of the RDF, which did not define a new file format or data serialization syntax, but provided a completely new conceptual model. RDF is a rigorous “abstract syntax” based on short statements of three elements (subject, predicate, object), called triples. The RDF model uses URIs to refer to anything, and ensures any data serialization follows a very rigorous pattern, without prescribing a specific syntax.

RDF key characteristics:

A model that uses URIs to refer to anything, so that reference to defined things and concepts is preferred to literal values whenever possible. This means that machines can deal better with URIs than with literals; machines can fetch additional information and meaning from the URI.

A rigorous grammar that ensures all data are serialized using the same pattern (node-arc, triples), which means that machines do not have to deal with structural ambiguities.

A core RDF schema means that machines can rely on one consistent meta-model for all data models.

A model that is based on the definition and reuse of RDF vocabularies defining any needed semantics and the declaration of namespaces to refer to things as well as metadata properties from those vocabularies. This means that machines can fetch the definitions of semantics for data and metadata.

Compared to basic RDF, adopting Linked Data implies a few additional requirements:

While RDF allows for the use of any type of URI, Linked Data recommends HTTP, so web-based URIs, or URLs. URLs should be “resolvable”: this means that a web browser and a machine should be able to retrieve data from that URL.

A web browser and a machine should find useful data at that URL, again in a standard format and preferably RDF.

The RDF should link as much as possible to other URIs: so instead of using a literal or an internal identifier to refer to something, use an already published URI where that something is defined. Besides helping with semantic interoperability, this facilitates the discovery of new linked data.

As clarified by TBL, Linked Data can be open or not. While it is not easy to monitor the growth of non-open Linked Data, as it is not openly accessible, it is relatively easy to discover Linked Open Data, given that it is published under public URIs and is findable by machines.
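To make the triple model of this section more tangible, the following Python sketch stores two statements as (subject, predicate, object) tuples and writes them in N-Triples, one of several standard RDF serializations. It is a minimal illustration, not a full RDF library: the `URI`, `Literal` and `Triple` classes and the `example.org` and `example.com` URIs are assumptions made for this example, and the literal escaping covers only backslashes and double quotes. The SKOS property URIs are the real ones.

```python
from typing import NamedTuple, Union


class URI(NamedTuple):
    value: str


class Literal(NamedTuple):
    value: str
    lang: str = ""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


def term(node: Union[URI, Literal]) -> str:
    """Render one node in N-Triples notation (partial escaping only)."""
    if isinstance(node, URI):
        return f"<{node.value}>"
    escaped = node.value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"' + (f"@{node.lang}" if node.lang else "")


def to_ntriples(t: Triple) -> str:
    """Every statement follows the same subject-predicate-object pattern."""
    return f"{term(t.subject)} {term(t.predicate)} {term(t.obj)} ."


SKOS = "http://www.w3.org/2004/02/skos/core#"
maize = URI("http://example.org/concept/maize")        # hypothetical URI
external = URI("http://example.com/vocabulary/corn")   # hypothetical URI

graph = [
    # A literal value: a label in English.
    Triple(maize, URI(SKOS + "prefLabel"), Literal("maize", "en")),
    # A link to a URI published elsewhere, as Linked Data recommends.
    Triple(maize, URI(SKOS + "exactMatch"), external),
]
for t in graph:
    print(to_ntriples(t))
```

The second triple shows the Linked Data preference for pointing at an already published URI instead of repeating a literal or an internal identifier.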

2 Knowledge Organization Systems (KOS)

KOS are created in order to organize information and promote knowledge management (Zeng, 2008).

The typical components in a KOS include concepts and properties (like a code or notation), labels and alternative labels in one or more languages, definitions, and different types of notes (change/history notes, editorial notes). Additionally, a KOS contains different types of relations between concepts. The most common type of KOS is the organized hierarchical lists of concepts with definitions and often-reciprocal relations, like thesauri, classification schemes, or subject headings and topic trees.

AGROVOC is a KOS and a thesaurus. According to the International Organization for Standardization (ISO) 25964-1, Information and documentation: Thesauri and interoperability with other vocabularies (2011), a thesaurus should include the following items (note: only those used by AGROVOC are listed):

Concept. Each concept is represented by one preferred term per language, and by any number of non-preferred terms. The notes and broader/narrower/related term relationships apply to the concept as a whole, rather than to its preferred term. A unique identifier may be assigned to each concept. In some systems, the concept is identified only by its preferred term or by the identifier of its preferred term.

Hierarchical relationship. The hierarchical relationship should be established between a pair of concepts when the scope of one of them falls completely within the scope of the other. It should be based on degrees or levels of superordination and subordination, where the superordinate concept represents a class or whole, and subordinate concepts refer to its members or parts. The following tags should be used, reciprocally:

BT (i.e. broader term), written as a prefix to the superordinate term;

NT (i.e. narrower term), written as a prefix to the subordinate term.

Associative relationship. The associative relationship covers associations between pairs of concepts that are not related hierarchically, but are semantically or conceptually associated to such an extent that the link between them needs to be made explicit in the thesaurus on the grounds that it may suggest additional or alternative terms for use in indexing or retrieval. The relationship is indicated by the tag “RT” (related term) and it should be applied reciprocally.

Note. Editorial notes are useful for entries such as “Review this term after the company merger completes” or “Check spelling with expert A”. Notes such as this, and several of the attributes, are more useful for housekeeping than for user consultation.

Definition. A full definition is not usually required All notions related to thesauri, like the traditional to clarify the way in which a preferred term should notions of RT, BT and NT have a translation into be used. However, if a definition is required for SKOS properties. The SKOS model also another reason, a separate note field should be accommodates other types of KOS (classifications, established for the definitions so that they do not taxonomies, subject headings, lists), which are become confused with any scope notes. The normally simpler than thesauri. source of each definition should be recorded alongside the definition itself. In the case of The basic notions in SKOS are: concepts, their labels AGROVOC, definitions are required. and relations.

The most common uses of KOS are:

Data organization and indexing. In data cataloguing, some metadata fields can only take values from a selected KOS or a controlled list, a practice called authority control.

Search and browse. Data that has been indexed using values from the KOS can be easily browsed and searched.

Aggregation and combination of data from different sources. Values from KOS are also used by data aggregators to integrate data from different sources that use a common KOS. This is possible by aggregating data that has been “tagged” with the same value from the same KOS. In this way, applications can create browsing interfaces across several catalogues, federated searches or new aggregated datasets.

Automatic indexing. Controlled lists or authority vocabularies can be used by applications for automatic indexing (or auto-tagging), drawing on KOS content (like synonyms, or relationships between terms in thesauri or ontologies) to improve the quality of the automatic indexing.

In 2009, the W3C published and recommended SKOS as a vocabulary to model KOS. It defines the elements needed to describe a KOS: classes, which include concepts, concept schemes (the KOS itself) and collections (for subsets of the KOS); and properties. SKOS was created considering the ISO 2788 (later ISO 25964) and BS8723-2 standards for thesauri.

Concepts. A concept could also be considered as the set of all terms used to express the concept in various languages. In SKOS, concepts are formalized as skos:Concept, identified by dereferenceable URIs (= URL). Concepts are represented by terms.

Terms or labels. Terms or labels are used to name a concept. For example, maize, maïs and 玉米 are all terms used to refer to the same concept in English, French and Chinese respectively, e.g. skos:prefLabel (@en, @fr, @zh).

Relations. In SKOS, hierarchical relations between concepts are expressed by the predicates skos:broader and skos:narrower, which correspond to the classical thesaurus relations broader/narrower (BT/NT).

Table 2 shows the correspondence between KOS and SKOS terminologies.

SKOS is normally enough to model a KOS of simple to medium complexity, but more advanced needs may require additional modelling. Several extensions of SKOS already exist that were developed for particular needs. For example, SKOS eXtension for Labels (SKOS-XL) provides support for describing and linking lexical entities instead of just using plain literals for labels. This extension introduces a skosxl:Label class that allows labels to be treated as first-order RDF resources. Each instance of this class shall first be attached to a single RDF literal via the skosxl:literalForm property (W3C, 2009b).
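To make the difference between plain SKOS labels and SKOS-XL labels concrete, the following minimal Python sketch prints both forms in Turtle syntax for the maize example. The `ex:` namespace and the identifiers `ex:c_maize` and `ex:xl_en_maize` are hypothetical placeholders, not AGROVOC URIs.

```python
# Sketch: a concept labelled with plain SKOS literals versus a SKOS-XL label resource.
# The ex: namespace and all ex: identifiers are hypothetical placeholders.

PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .\n"
    "@prefix ex: <http://example.org/kos/> .\n"
)


def plain_skos(concept, labels):
    """Return Turtle in which each preferred label is a language-tagged literal."""
    lines = [f"{concept} a skos:Concept ;"]
    pairs = [f'    skos:prefLabel "{text}"@{lang}' for lang, text in labels.items()]
    lines.append(" ;\n".join(pairs) + " .")
    return "\n".join(lines)


def skosxl_label(concept, label_id, lang, text):
    """Return Turtle in which the label is a resource carrying a literal form."""
    return (
        f"{concept} skosxl:prefLabel {label_id} .\n"
        f"{label_id} a skosxl:Label ;\n"
        f'    skosxl:literalForm "{text}"@{lang} .'
    )


maize_labels = {"en": "maize", "fr": "maïs", "zh": "玉米"}
plain = plain_skos("ex:c_maize", maize_labels)
xl = skosxl_label("ex:c_maize", "ex:xl_en_maize", "en", "maize")

print(PREFIXES)
print(plain)
print(xl)
```

Because the SKOS-XL label is a resource in its own right, further statements (for example an editorial note or a creation date) could be attached to `ex:xl_en_maize`, which is not possible with a plain literal.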

For example, the concept “Sustainable Development Goals” is labelled using both the official name and the acronym. The two labels can be represented in the following way:

    Sustainable Development Goals (official name)
    SDGs (acronym)

2.1 Sharing versus creating a new KOS

As each system or community has its own needs, using existing URIs from existing KOS is not always an option. Even communities that work in domains that are very much related or overlap often prefer their own very specific vocabulary.

In some cases, it makes perfect sense to keep vocabularies independent and to simply link them. This may be either when the scope is very different, or the communities managing the vocabularies have completely different views on how concepts should [...] relations. In other cases, there may be good reasons for joining forces and maintaining subsets of concepts together, primarily to avoid duplication of vocabularies covering the same domain.

Another reason for combining efforts on vocabularies [...] organizing them requires considerable effort, which [...] experts. For different KOS, which cover a certain percentage of common concepts, the same effort is repeated, potentially involving the same experts. However, the results are often different, which results in different definitions, translations, and organizations of knowledge. This is a risk. In some cases, these differences are essential and deserve different vocabularies but, in other cases, communities might develop similar vocabularies because they do not want to use what already exists.

From a technical point of view, the problem with the proliferation of vocabularies re-defining the same concepts is that data curators who want to use a certain concept have to choose between different KOS.

Table 2. Correlation between KOS and SKOS terminology

COMMON KOS                                    SKOS
Classification, thesaurus, controlled list    skos:ConceptScheme
Term                                          skos:Concept
Name / label                                  skos:prefLabel (@en, @fr)
Alternative name                              skos:altLabel (@en, @fr)
Code / acronym                                skos:notation
Note / explanatory note                       skos:note / scopeNote…
Definition                                    skos:definition (@en, @fr)
Parent / Broader Term (BT)                    skos:broader
Child / Narrower Term (NT)                    skos:narrower
Related Term (RT)                             skos:related
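The correspondence in Table 2 can be read as a simple mapping. The Python sketch below converts a thesaurus-style record into SKOS property/value pairs; the record layout, the field names and the sample values are assumptions made for illustration only, and only the SKOS terms come from the table.

```python
# Sketch: Table 2 expressed as a lookup from common KOS field names to SKOS properties.
# The field names of the input record are hypothetical; the SKOS terms come from Table 2.

KOS_TO_SKOS = {
    "label": "skos:prefLabel",
    "alternative_name": "skos:altLabel",
    "code": "skos:notation",
    "note": "skos:note",
    "definition": "skos:definition",
    "BT": "skos:broader",
    "NT": "skos:narrower",
    "RT": "skos:related",
}


def to_skos(record):
    """Translate a flat thesaurus record into (SKOS property, value) pairs."""
    pairs = []
    for field, value in record.items():
        if field not in KOS_TO_SKOS:
            raise KeyError(f"no SKOS equivalent listed for field {field!r}")
        values = value if isinstance(value, list) else [value]
        pairs.extend((KOS_TO_SKOS[field], v) for v in values)
    return pairs


# Illustrative record (values invented for the example).
record = {"label": "maize", "alternative_name": ["corn"], "BT": "cereals"}
pairs = to_skos(record)
for prop, value in pairs:
    print(prop, value)
```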

This means that data curators either create redundant datasets linking the same thing to many URIs defining the same concept, or they choose one URI from one KOS. They may refer to the most authoritative KOS, if there is one, relying on the fact that this URI will link to other KOS. This can work if vocabularies are all linked, but building crosswalks and federated searches looking up several KOS is not smooth. Besides, links themselves should be maintained and updated as the target vocabularies evolve, which is not always the case.

From a management point of view, selecting, defining, translating and organizing KOS is a significant effort, which requires consultation, users’ feedback, and language experts.

In the SKOS model, the key entity is the concept, which can have any number of language-specific representations. This means that, ideally, one URI should identify the same concept, with labels in different languages. For vocabularies created for a specific language, the same assessment criteria for a potential reuse of or integration in an existing vocabulary apply as for any new vocabulary. Ideally, creating a language version of an existing vocabulary is preferable to creating a new vocabulary, provided the structure and content of the existing vocabulary is similar enough or adjustments can be made to make them more similar. Vocabularies developed specifically for one language can also represent a particular type of URI redundancy and duplication of efforts.

2.2 Using, integrating and merging a KOS

Vocabulary sharing and use can take different forms, in which two or more KOS:

• may only share a core set of common concepts (a core subset), while maintaining the rest of the concepts in their separate KOS. In this case, there would not be one KOS with subsets but a number of KOS each with one part in common and one independent;
• may share all the concepts in one bigger KOS, where each KOS is a subset that can be curated individually; or
• are completely merged and become one.

There are a few techniques and RDF modelling solutions for these different scenarios, such as:

• copying or importing a concept scheme into another one (for a complete merge);
• curating different collections in one concept scheme;
• integrating different overlapping concept schemes in a broader one (using the same concept URIs); or
• creating a new meta-concept scheme that contains only the common overlapping concepts with new concept URIs, and links to concepts in the participating concept schemes.

In scenarios where different KOS are in one concept scheme, special techniques can be used to leave a certain autonomy for communities to manage their own hierarchies and definitions (such as additional properties for scheme-wise hierarchies).

Maintenance of a KOS is always a demanding activity: “controlled vocabularies are living entities needing: new material added, outdated material removed, changes made” (Gazan, undated). For any KOS, a long-term maintenance plan is needed with institutional support and resources, a designated curator, monitoring of use and users’ input (search logs, term submissions), and an editorial workflow for changes. In shared-vocabulary scenarios, the effort of maintaining the KOS is shared and therefore lighter on each curator. However, the design and maintenance of the collaboration platform, the workflows and the editorial guidelines might take more time because of the need to cater for the requirements of different communities.

Vocabulary reuse and extension is still an area of experimentation; there is no fixed set of established rules.

3 AGROVOC

Since the early 1980s, the Food and Agriculture Organization (FAO) of the United Nations has promoted greater knowledge sharing and access among its member countries through the publication of AGROVOC.

Coordinated by FAO, AGROVOC is maintained by an international editorial community, with over 25 organizations and expert communities volunteering as focal points for specific languages and/or specific domains, see Annex 1.

The AGROVOC thesaurus was first published in English, Spanish and French as a printed resource for indexing and searching publications on agriculture, including forestry, animal husbandry, aquatic sciences, fisheries, aquaculture and human nutrition. Primary users were the FAO library and AGRIS [agris.fao.org], the International System for Agricultural Science and Technology coordinated by FAO.

In 2000, AGROVOC became digital. In 2004, as RDF was becoming popular and the first version of OWL was published, AGROVOC was formalized as an OWL ontology, with a set of semantic relations between concepts. However, the OWL model was too heavy for a vocabulary that needed little more than the typical relationships of a thesaurus. Today AGROVOC is a full web-oriented resource and uses the SKOS model.

More precise domain-specific relations between concepts, represented by a number of subproperties of skos:related, were added to the AGROVOC model, like affects/isAffectedBy; hasPathogen/isPathogenOf; hasProduct/productOf; hasTaxonomicRank/isTaxonomicRankOf. These additional relations required the creation of a new small ontology, called the Agrontology (AIMS, 2020), which was customized to support AGROVOC, especially to accommodate legacy (old) non-hierarchical relations that were defined when it was converted to an OWL ontology.

Subsequently, FAO noted increased demand from the AGROVOC community for a more networked and distributed management of the thesaurus. A 2010 survey summarized in the publication “Linked Data for fighting global hunger: experiences in setting standards for agricultural information management” highlighted the need for reusing, extending and adapting AGROVOC (Baker and Keizer, 2010). Since then, AGROVOC has been strengthening its role as a “hub” of food and agriculture vocabularies. It is the starting point of new specialized vocabularies, and allows for the building of AGROVOC subvocabularies with the collaboration of communities of experts.

For the past decade, the technical infrastructure for AGROVOC has been managed by FAO in collaboration with the Artificial Intelligence Research group at the University of Rome Tor Vergata (Rome, Italy). AGROVOC content editing is carried out using VocBench.

VocBench is an advanced web environment for the collaborative maintenance of thesauri, ontologies, code lists and authority resources, and provides features such as history, validation, a publication workflow, and multi-user management with role-based access control.

3.1 The AGROVOC concept model

In 2009, AGROVOC included 20 000 concepts and 580 000 terms in 19 different languages. In December 2020, AGROVOC contained up to 38 000 concepts and over 800 000 terms in 40 languages. FAO is predominantly responsible for the six FAO languages (English, French, Spanish, Arabic, Chinese and Russian), while responsibility for other languages lies with the institutions who curate content in the language. With regard to preferred terms, Chinese, Czech, English, French, German, Italian, Japanese, Portuguese, Spanish and Turkish have the widest coverage. Not all concepts have a translation in all languages; the coverage varies for different languages.

In 2020, AGROVOC was available in up to 40 languages (each with at least 150 terms), including Arabic, Burmese, Catalan, Chinese, Czech, Danish, Dutch, English, Estonian, Finnish, French, Georgian, German, Greek, Hindi, Hungarian, Italian, Japanese, Khmer, Korean, Lao, Latin, Malay, Norwegian Bokmål, Norwegian Nynorsk, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Spanish, Swahili, Swedish, Telugu, Thai, Turkish, Ukrainian, and Vietnamese.

A concept may cover any subject: an animal, a plant, a geographical region, a chemical element, a technique, etc. Operationally, a concept is the set of terms used in any language to describe the same idea, and it is identified by a dereferenceable URI. A URI is called dereferenceable when an HTTP client can look it up using the HTTP protocol and retrieve a description of the resource (W3C, 2013). For example, the URI for the AGROVOC concept for “weathering” is http://aims.fao.org/aos/agrovoc/c_8343. The terms or labels are the concept’s names in different languages; for example, weathering, выветривание and Verwitterung are all labels for the same concept in English, Russian and German respectively.

AGROVOC concepts are hierarchically organized under 25 generalized top concepts, see Annex 2. These top concepts act as roots for the different hierarchies below. Concepts under each root represent “types of” the top concept. For instance, under “activities”, there are types of activities, such as breeding, production or seed treatment. From the second level, the hierarchies change depending on the type of top concept.
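Dereferencing can be illustrated with a short Python sketch that prepares an HTTP GET request for the “weathering” URI given above. The `Accept: text/turtle` header assumes that the server supports content negotiation for RDF formats, which is not stated in this section; the function names are illustrative, and the request is only sent if `fetch()` is called.

```python
# Sketch: preparing an HTTP request that dereferences an AGROVOC concept URI.
# Assumption: the server honours the Accept header (content negotiation).
from urllib.request import Request, urlopen

WEATHERING_URI = "http://aims.fao.org/aos/agrovoc/c_8343"


def build_request(uri, media_type="text/turtle"):
    """Return a GET request asking for a description of the resource."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


def fetch(uri, timeout=10):
    """Send the request and return the response body as text (needs network access)."""
    with urlopen(build_request(uri), timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_request(WEATHERING_URI)
print(request.get_method(), request.full_url, request.get_header("Accept"))
```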

Figure 1. An example of the AGROVOC concept model.

Source: FAO, 2020

AGROVOC uses the full SKOS model (except for the use of skos:notation):

• skos:broader and skos:narrower properties represent the hierarchical relations, see Figures 1 and 2;
• skos:exactMatch or skos:closeMatch properties represent semantic matches with external concepts;
• skos:inScheme links the concept to the concept scheme to which it belongs; and
• skos:topConceptOf identifies concepts as top concepts of a scheme.

For documentation properties such as skos:note (and sub-properties like skos:definition, skos:scopeNote, skos:editorialNote etc.), SKOS does not prescribe a specific type of value. They can either have literal values or a link to separate RDF resources with the literal value. AGROVOC follows the latter approach.

Labels are treated in the same way as concepts, thus additional information can be attached to labels. However, this approach requires an additional vocabulary since it is not supported by SKOS. In SKOS-XL, labels can be enriched with properties, e.g. to assign dates of creation and modification and editorial notes to labels, or relations between individual labels.

AGROVOC supported the representation of subsets of concepts, or subvocabularies, originally to integrate some classification schemes used in other systems, like FAO Terminology (FAOTERM) and the AGRIS classification scheme. In recent years, AGROVOC has developed institutional collaborations with organizations and communities of experts interested in either integrating their KOS in AGROVOC or extending AGROVOC for their subdomain. This required some additional modelling that went beyond the standard SKOS and SKOS-XL models. SKOS allows for the implementation of subvocabularies through two mechanisms:

Figure 2. An example of the concept “rice” in AGROVOC

Source: FAO, 2020

• Concept schemes: concepts can belong to different concept schemes.
• Collections: a concept scheme can have different collections, which are linked to concepts through skos:member. A skos:Collection is a collection of resources (skos:Concept or skos:Collection), usually with a flat hierarchy. Collections are not associated with a scheme. At present they are not used by AGROVOC.

While SKOS solves the grouping of concepts under a subvocabulary, their organization is a challenge. Concepts in the different collections or concept schemes maintain the hierarchical relations that they have in the main scheme, which may not correspond with the desired concept organization of the subvocabulary. In order to solve this issue, AGROVOC has adopted two different solutions:

1. By using specific properties in the Agrontology. This approach was used for flat subvocabularies of concepts and for flat subvocabularies of terms.

• Subvocabularies of concepts. To attach a concept to a subvocabulary, the Agrontology provides the isPartOfSubvocabulary property. The following subvocabularies of concepts are currently available: chemicals, geographical country level, geographical above country level, and geographical below country level. This type of subvocabulary could be supported by a skos:Collection, but the isPartOfSubvocabulary relation was inherited from the OWL version of AGROVOC.

• Subvocabularies of terms include labels of a specific type. For instance, an organism concept may have two labels: a common name and a taxonomic name. In some cases, users may want to extract just the list of taxonomic names. By the use of SKOS-XL, labels are entities with properties: to attach the term to a term subvocabulary, the Agrontology provides the hasTermType property. The following subvocabularies of terms are available: Acronym, common name for animals, common name for bacteria, common name for fungi, common name for plants, common name for viruses, taxonomic terms for animals, taxonomic terms for bacteria, taxonomic terms for fungi, taxonomic terms for plants, taxonomic terms for viruses.

2. Multiple concept schemes with scheme-specific hierarchy. AGROVOC has adopted this option for domain-specific subvocabularies that require a tailored organization of concepts. The objective is to support multiple concept schemes while, at the same time, allowing for scheme-specific hierarchical relations through the definition of new scheme-specific sub-properties of skos:broader and skos:narrower.

3.2 AGROVOC VoID

In addition to being a SKOS concept scheme, AGROVOC is also a dataset; more precisely, it is a linked dataset. The most common vocabulary used to describe linked datasets is the Vocabulary of Interlinked Datasets (VoID) (W3C, 2011), which is used as part of the AGROVOC RDF model. The AGROVOC VoID file is located at http://aims.fao.org/aos/agrovoc/void.ttl

The AGROVOC VoID description includes core metadata properties from the Dublin Core Terms vocabulary that describe the resource. Additionally, the VoID metadata defines the RDF dataset using the VoID vocabulary, extracting some statistical information about the triples, identifying subsets and the numbers of lexicalizations in each language, and grouping all mappings to other vocabularies under linksets, or sets of links/mappings.

3.3 Copyright and licensing

AGROVOC in the six FAO official languages (English, Russian, French, Spanish, Arabic and Chinese) is licensed under the International Creative Commons Attribution License (CC-BY IGO 3.0).

4 Accessing AGROVOC

AGROVOC can be accessed in different ways depending on the user’s needs.

If a user wants to browse, search, or look at the structure, the web browsing interface is the best option. However, if the user wants to access the content, extract parts, or embed it in applications, then the best choice is to download it or to use one of the programmable interfaces, such as an Application Programming Interface (API). In addition, AGROVOC also provides services for machine use. Typical scenarios include:

a) in a tool

• to use AGROVOC as a controlled vocabulary to select the appropriate AGROVOC concepts for indexing resources;
• to use a subset of AGROVOC for a specialized catalogue, for instance extracting only concepts that belong to a subvocabulary;
• to represent a hierarchical tree in a dropdown to facilitate the selection of concepts.

b) in a search engine

In a search engine, the AGROVOC concepts that will be shown are those that have been used in the local database or in an index for faster searches. A search engine might not need to access all of AGROVOC. Unless all the necessary additional information has been stored locally at the time of indexing, AGROVOC might in some instances need to be used in real time, such as to show:

• more information about a selected concept, e.g. a definition or relations with other concepts;
• matching concepts from other vocabularies to load additional results from other catalogues, or to facilitate the switch to other catalogues;
• parent and children concepts to broaden or narrow the search.

c) in a federated catalogue

Federated catalogues are a collection of catalogues that are joined together in a standardised method. A federated search runs across all catalogues by matching concepts in other vocabularies, through either URIs or labels depending on the catalogues’ functionalities.

4.1 Browsing and searching in Skosmos

For simple web-based browsing, AGROVOC uses Skosmos, an open source web-based SKOS browser and publishing tool. Skosmos offers search and browse functionalities, an alphabetical and thematic index, structured concept display, visualized concept hierarchy and a multilingual user interface. The version of AGROVOC loaded in Skosmos is always the most recent.

Browsing the AGROVOC vocabulary is quite intuitive, see Figure 3. In the left panel, it is possible to browse the vocabulary alphabetically or hierarchically. The alphabetical tab shows all terms, including alternative/non-preferred labels. Terms that are not preferred labels are not clickable and an arrow links them to the corresponding preferred label. The hierarchy tab shows only preferred labels.

When a user clicks on a concept in the left panel, the right panel shows its properties. Most of the properties are SKOS properties, although in some cases with different names: preferred label in English (preferred term), broader concepts, narrower concepts, alternative labels, labels in other languages, URI, close matches and exact matches. Other properties come from the Agrontology, e.g. Has taxonomic rank, Has disease, Is produced by, etc.

Skosmos allows users to navigate AGROVOC in 40 available languages (any language in which at least 150 concepts have preferred terms, see Figure 4). The language coverage is not the same for all languages. Languages like Vietnamese or Swedish have very few translated labels as, at present, there is no national curating institution. Where a concept has no label in the selected language, the corresponding preferred label in English is displayed, see Figure 5. This allows users to still browse the whole vocabulary without missing concepts and maintains the hierarchy.

Figure 3. AGROVOC in the Skosmos browsing interface.

Source: FAO, 2020

The search is dependent on the language. The search criterion is “starts with”, so the search will find all terms that start with the search term, both preferred and non-preferred terms. Skosmos searches only in concepts having a label starting with the input value. A search for “regis” will display, for example, “registration” and “registered designation of origin”, but not “deed registration”. To search for all terms that contain a specific text, it is necessary to use wildcards (*) at the beginning and/or at the end of the search string.

When typing a search term, the system displays a number of search results in a dropdown list. To go directly to the corresponding page for a specific term, it is necessary to click one result, see Figure 6. As in the browsing interface, search result terms that are not preferred terms will link to the corresponding preferred term.

When the search is submitted without selecting any of the results in the dropdown (for instance when one wants to see more details about the results), all matching terms appear with further detail and a link to the term page, see Figure 7.
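The matching behaviour described above (“starts with” by default, wildcards for “contains”) can be emulated with a few lines of Python. This is an illustration of the rule, not the Skosmos implementation; the function name and the case-insensitive comparison are assumptions.

```python
# Sketch: emulating the described label matching with shell-style wildcards.
# Not Skosmos code; case-insensitive matching is an assumption.
from fnmatch import fnmatchcase


def matches(label, query):
    """Without a wildcard the query is treated as a prefix; '*' matches any text."""
    pattern = query if "*" in query else query + "*"
    return fnmatchcase(label.lower(), pattern.lower())


labels = ["registration", "registered designation of origin", "deed registration"]

prefix_hits = [label for label in labels if matches(label, "regis")]
contains_hits = [label for label in labels if matches(label, "*regis*")]

print(prefix_hits)
print(contains_hits)
```

With the three labels from the example, the plain query returns only the two terms that begin with “regis”, while the wildcard query also returns “deed registration”.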

Figure 4. Selecting the language in which to browse the vocabulary in Skosmos.
Figure 5. Navigating AGROVOC in Arabic: untranslated terms are displayed in English.

Sources: FAO, 2020 (Figures 4 and 5)

4.2 The Skosmos API

Skosmos provides a REST API to access all data. As with any API, the Skosmos API methods can be called from within applications and the response can be reused in the application code.

A REST API is an Application Programming Interface (API) that follows the “Representational state transfer” (REST) architecture (Wikipedia, 2020), which simply means that:
a) clients request data through the API methods “in such a way that every packet of information transferred can be understood in isolation, without context information from previous packets in the session” (Wikipedia, 2020);
b) for performing Create, Read, Update and Delete (CRUD) operations, clients use HTTP verbs such as GET, POST, PUT, PATCH and DELETE;
c) the API exposes a uniform interface, more precisely it always returns “resources” through standardized URIs (e.g. http://www.example.org/rest/resources/countries would return data on all countries, while http://www.example.org/rest/resources/countries/KEN would return data on Kenya); and
d) the API methods return data in lightweight formats (mostly JSON or XML).
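As a minimal sketch of how a client might call such an API, the Python code below builds the URL for the /{vocid}/children method on the AGROVOC Skosmos endpoint given in Section 4.2. The function names are illustrative, and the structure of the JSON response is not described in this document, so the sketch simply returns the decoded document without interpreting it.

```python
# Sketch: calling the /{vocid}/children method of the AGROVOC Skosmos REST API.
# The base URL and method pattern come from Section 4.2; the layout of the JSON
# response is not assumed here.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://agrovoc.uniroma2.it/agrovoc/rest/v1"
VOCID = "agrovoc"  # always "agrovoc", whichever concept scheme is targeted


def children_url(concept_uri):
    """Build the request URL; urlencode percent-encodes the concept URI."""
    return f"{BASE_URL}/{VOCID}/children?{urlencode({'uri': concept_uri})}"


def get_children(concept_uri, timeout=10):
    """Send the GET request and decode the JSON body (needs network access)."""
    with urlopen(children_url(concept_uri), timeout=timeout) as response:
        return json.load(response)


url = children_url("http://aims.fao.org/aos/agrovoc/c_8343")
print(url)
```

Percent-encoding the `uri` value keeps the slashes and colon of the concept URI from being confused with the structure of the request URL itself.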

Figure 6. Search results dropdown menu.
Figure 7. Search results in Skosmos.

Sources: FAO, 2020 (Figures 6 and 7)

As the Skosmos API conforms to the REST API style, it can be used like any other REST interface. AGROVOC has its own Skosmos REST API. Using this REST API, it is possible to look up terms, filter subsets of terms or get all narrower or broader concepts.

For the official documentation for the Skosmos API, see Bibliography. All API methods illustrated there can be used on the AGROVOC Skosmos API endpoint by adding http://agrovoc.uniroma2.it/agrovoc/rest/v1 in front of the method pattern; for instance, for the method /{vocid}/children, use http://agrovoc.uniroma2.it/agrovoc/rest/v1/agrovoc/children?uri=uri_of_concept

An important note is that the first parameter to provide in the URL is the vocabulary ID, which does not correspond to a concept scheme, but to a “dataset” loaded in Skosmos. In the case of AGROVOC, the dataset is the full AGROVOC dataset, including all concept schemes. The first parameter to be used when calling the API is always “agrovoc”, even when obtaining data from another concept scheme.

4.3 Web-based SPARQL interface

SPARQL is an RDF (semantic) query language for databases which is able to retrieve and manipulate data stored in RDF. A SPARQL endpoint is a conformant SPARQL protocol service which enables users (human or other) to query a knowledge base via the SPARQL language. The SPARQL endpoint can be used as a web interface (executing queries in the browser) and as an API endpoint, which users can call directly from their applications in any programming language. SPARQL is the recommended way to access RDF through an API, as the main parameter of the API is a SPARQL query, and the SPARQL language is a query language designed for RDF which can exploit all RDF features.

A SPARQL query is composed of a first part where all the prefixes for the various relevant vocabularies are set and a second part with the actual query. Prefixes are not mandatory but they help to speed up the writing of queries.
The prefixes needed for the most common queries to a SKOS vocabulary are:

Figure 8. Public AGROVOC SPARQL endpoint

Source: FAO, 2020

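PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>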
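A query sent to a SPARQL endpoint over HTTP consists of the prefix declarations followed by the query itself, passed URL-encoded in the `query` parameter. The Python sketch below builds such a request URL. The endpoint address is a placeholder, since the address of the public AGROVOC endpoint is not given in this section, and the sample query (including its use of skos:prefLabel) is an illustrative assumption rather than one of the published sample queries.

```python
# Sketch: building a SPARQL protocol GET URL with a URL-encoded "query" parameter.
# ENDPOINT is a placeholder, not the real AGROVOC endpoint address.
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "https://example.org/sparql"

PREFIXES = """\
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
"""

# Illustrative query: English preferred labels of concepts narrower than "weathering".
QUERY = PREFIXES + """\
SELECT ?concept ?label WHERE {
  ?concept skos:broader <http://aims.fao.org/aos/agrovoc/c_8343> ;
           skos:prefLabel ?label .
  FILTER (lang(?label) = "en")
}
"""


def sparql_url(endpoint, query):
    """Append the URL-encoded query to the endpoint after a question mark."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = sparql_url(ENDPOINT, QUERY)
# Decoding the parameter again shows that the encoding is lossless.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

Further parameter/value pairs, if an endpoint accepts any, would be added to the dictionary passed to `urlencode`, which joins them with “&”.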

The web-based AGROVOC SPARQL interface is another way to access the thesaurus. SPARQL requires specific technical knowledge. The public SPARQL endpoint for AGROVOC, see Figure 8, includes sample queries for exploration.

All SPARQL endpoints expose the same API interface: the base API URL. In the case of AGROVOC, when using the API interface (so inside another tool and not directly in a browser), the AGROVOC SPARQL endpoint is followed by a question mark and a few parameter/value pairs (primarily, the query parameter, see below) separated by “&”. Being part of the URL, the value of the “query” parameter needs to be URL-encoded.

To learn more about how to use SPARQL, see Bibliography.

4.4 SOAP web services

The AGROVOC SOAP web services are the oldest form of API that was made available to access the thesaurus. They are maintained as legacy web services and are still used, but they should be considered deprecated. It is highly recommended to use either the Skosmos API or the SPARQL endpoint to access the data.

The available web services are listed below. The names are quite intuitive and are provided here to give an idea of what can be done using these web services. For detailed information on the parameters and the return values, it is possible to refer either to the machine-readable WSDL file or to the web-based description of all services available on the AGROVOC website.

getConceptByKeyword
getConceptByKeyword2
searchByModeLangScopeXML
simpleSearchByMode2
getConceptInfoByTermcode
getConceptInfoByURI
getDefinitions
getAllLabelsByTermcode2
getTermByLanguage
getURIByTermAndLangXML
getFullAuthority
getConceptByURI
getConceptByRelationshipValue
getlatestUpdates
getTermcodeByTermAndLangXML
getTermExpansion
getReleaseDate
getWebServicesVersion

5 Contributing to AGROVOC

AGROVOC is one of the broadest and most authoritative vocabularies for food and agriculture.

There are a number of equally authoritative thesauri, mainly general purpose or neighbouring domain thesauri that also include agricultural topics. These include EuroVoc, the Standard-Thesaurus Wirtschaft (STW) Thesaurus for Economics, the General Multilingual Environmental Thesaurus (GEMET), as well as a few agriculture-specific thesauri managed by other important organizations, such as the US National Agricultural Library Thesaurus (NALT), the CABI Thesaurus, and the Chinese Agricultural Thesaurus.

Relative to the landscape of smaller and subdomain-specific agricultural KOS, given its size and authoritativeness, AGROVOC is a reference system, whether as a starting point to create new vocabularies, a resource to link to, or a hub to host other vocabularies. There are many smaller and subdomain-specific vocabularies with an existing or potential overlap with AGROVOC. Since AGROVOC’s scope is generic enough to accommodate terms from subdomains, it can easily become a semantic hub hosting subdomain vocabularies as sub-schemes.

5.1 Community-based content curation

AGROVOC’s language versions have always been curated in collaboration with institutions. These institutions either support FAO for a specific language, or take full responsibility for an entire language version.

The most recent strategic and architectural changes to AGROVOC have been the hosting of subdomain vocabularies and the multi-scheme multi-hierarchy model. Since 2019, the curation and community dimension of AGROVOC has been strengthened, with the option for expert communities to curate technical subvocabularies in their domain. Language version and subvocabulary curators are part of a community of editors that receives support from AGROVOC and meets regularly to discuss challenges and improvements.

The AGROVOC approach to collaborative and decentralized vocabulary maintenance was highlighted in the NISO document TR-06-2017 “Issues in Vocabulary Management” in the “Vocabulary Preservation” section: “In the future, we can imagine a broadly-distributed ecosystem for vocabulary creation, maintenance, and use based on a commonly agreed URI infrastructure, built to support distribution of terms to consumers based on their explicit preferences. The Food and Agriculture Organization (FAO) implements such a model for AGROVOC and it is instructive to review its features” (NISO, 2017).

The approach expands institutional collaborations beyond languages to the curation and extension of the content itself, based on the specific competencies and mandates of different organizations and communities. This approach entailed some strategic, institutional and procedural configurations:

AGROVOC is viewed as a community-curated expandable concept scheme and as a source for more specialized vocabularies.

FAO primarily bears the responsibility for English, French, Spanish, Arabic, Chinese and Russian, facilitates the technical maintenance of AGROVOC, including its publication as a Linked Open Data resource, and coordinates all editorial activities. Around 25 institutions are currently curating different language versions and technical subvocabularies of AGROVOC.

Clarification of institutional responsibilities is important: while the overall responsibility of AGROVOC stays with FAO, different institutions are responsible for various language versions and domains.

In most cases, institutions agree to curate a language version or a subscheme and they become part of the editorial community. There is no fixed legal procedure for establishing a collaboration. Some institutions request a formal agreement like a letter of intent or a Memorandum of Understanding.

Participating in the AGROVOC editorial community is voluntary.

To decide whether a new scheme should be created within AGROVOC, the topic of the new scheme should be relevant to AGROVOC and a curator should be designated for the new scheme.

While the overall responsibility for managing AGROVOC remains with FAO, different institutions and teams are responsible for the different language versions and technical domains. Curating a language version of AGROVOC is how most institutions contribute to AGROVOC, generally with a long-term commitment. Other institutions and expert communities contribute in specific technical areas.

The community of editors representing institutions is essential to AGROVOC and has been strengthened over time since AGROVOC began. Regarding languages, for example, ICARDA is co-responsible for the Arabic version with FAO, EMBRAPA curates Portuguese, the Turkish Department of Training, Extension and Publications, Ministry of Food, Agriculture and Livestock curates Turkish, while Kuratorium für Technik und Bauwesen in der Landwirtschaft e. V. (KTBL) oversees German content. An institution may also contribute translations in its language in one specific technical area, such as forestry or engineering.

5.2 Benefits of joining AGROVOC

Using the work of others and maintaining common parts of vocabularies collaboratively helps to build a better ecosystem of vocabularies and brings economies of scale; this is also important when it comes to sharing knowledge. Much of the world’s research is not published in English, and may not be easily discoverable. When a multilingual controlled vocabulary like AGROVOC is being used to tag or index resources, a user or a machine can search in Thai for “research policies” and find resources with this topic in Arabic, Hungarian or Spanish, for example. Adding new translated terms is a contribution to making research and datasets more discoverable, and to making national research more visible and accessible.
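The cross-language search described in section 5.2 works because resources are tagged with language-independent concept URIs rather than with words. The following is a minimal sketch of that idea, not an AGROVOC service: it reuses the labels this guide quotes for concept c_2cfe62a, while the catalogue records and the function names are hypothetical.

```python
from collections import defaultdict

AGROVOC_NS = "http://aims.fao.org/aos/agrovoc/"

# Labels quoted in this guide for concept c_2cfe62a.
LABELS = {
    AGROVOC_NS + "c_2cfe62a": {
        "en": "from farm to fork",
        "fr": "de la ferme à la table",
        "nb": "fra jord til bord",
    },
}

# Hypothetical catalogue records, each indexed with concept URIs.
RECORDS = [
    {"title": "Rapport sur les circuits courts", "lang": "fr",
     "subjects": [AGROVOC_NS + "c_2cfe62a"]},
    {"title": "Food chain policy brief", "lang": "en",
     "subjects": [AGROVOC_NS + "c_2cfe62a"]},
    {"title": "Unrelated record", "lang": "en", "subjects": []},
]


def build_label_index(labels):
    """Map every label, in every language, to the concept URIs it names."""
    index = defaultdict(set)
    for uri, by_lang in labels.items():
        for label in by_lang.values():
            index[label.casefold()].add(uri)
    return index


def search(term, labels, records):
    """Return titles of records tagged with any concept named by `term`."""
    uris = build_label_index(labels).get(term.casefold(), set())
    return [r["title"] for r in records if uris.intersection(r["subjects"])]
```

In this sketch, a search for the Norwegian label finds the French and English records, because all three labels resolve to the same URI.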

5.3 The AGROVOC editorial guidelines

For many years, the most substantial work carried out by FAO was based on internal guidelines, openly published in 2008 and 2015 but not regularly updated. With a shift to distributed management of AGROVOC by working with editors worldwide, clear, concise and agreed guidelines are needed to coordinate all the efforts and to guarantee consistency and coherence in the selection of concepts and terms. The guidelines also apply to AGROVOC subvocabularies.

The editorial guidelines provide instructions on how to choose the preferred and non-preferred terms, and which form and capitalization should be used. The guidelines provide detailed instructions for different cases, especially regarding term form, style and language. The guidelines are broadly in line with the International Organization for Standardization (ISO) and National Information Standards Organization (NISO) standards for thesauri and the International Federation of Library Associations and Institutions (IFLA) Guidelines for Multilingual Thesauri (IFLA, 2009).

Figure 9. Adding new terms and concepts to AGROVOC.

Source: FAO, 2020

To suggest and create concepts and terms, it is recommended to check whether the concept or term in question already exists in AGROVOC, by searching either through Skosmos or VocBench. If a term or a concept is missing, there are some steps to take into consideration, see Figure 9.

Before suggesting a concept for addition, it is recommended to check that it is unique and relevant to the scope of AGROVOC, avoiding the following:

- duplicates of existing concepts;
- trademarked names, e.g. brand names, commercial names;
- names of plant varieties;
- individuals;
- programmes/initiatives of limited duration;
- individual publication titles, e.g. “The Eatwell Guide”; and
- concepts not within scope of AGROVOC, e.g. “intensive care”, “mortgage holiday”.

Concepts might exist which do not follow the current guidelines, because they were created when other rules existed, leading to inconsistencies. In the past, some antonyms were included as non-preferred terms for the same concept, because they dealt with the same topic, for example c_2636 “erectness”* with non-preferred term “prostrate plants”. This practice is no longer valid: antonyms should be created as separate concepts.

A concept has only one preferred term in any given language. All the alternative terms to name a concept in any given language are called non-preferred terms, see Figure 10. The more commonly-used term is the preferred term, but the editor can check with other reputable authorities.

If a concept exists in AGROVOC, but not as a term in a specific language, a term that is commonly used in the target language is recommended over an artificial or literal translation:

- it is recommended to only use a term from the source language if it is used in the target language, e.g. “Farmer Field School”@de;
- or to use a similar term with the same meaning (e.g. :c_2cfe62a “from farm to fork”@en, “de la ferme à la table”@fr, “fra jord til bord”@nb).

5.3.1 The Agrontology

The Agrontology is an ontology which complements AGROVOC, providing domain specific properties for enriching the description of concepts. It is a specific vocabulary of non-hierarchical relations developed for AGROVOC, which are grouped under skos:related. The Agrontology is not conceived as an exhaustive and coherent ontology for reuse by others, but rather as a support ontology to AGROVOC, especially to accommodate non-hierarchical relations.

Figure 10. Preferred and non-preferred terms.

Source: FAO, 2020

5.3.2 Alignments

AGROVOC is aligned to more than 20 other datasets, which allows for crosswalks between data that use AGROVOC and data that use other KOS linked to it. SKOS has a few properties for linking concepts to external URIs: exactMatch, closeMatch, narrowMatch, broadMatch, relatedMatch. It is critical to identify concepts in other vocabularies that can be matched with AGROVOC and to use their URIs as values of these properties, as appropriate.

5.4 Subvocabularies

Subvocabularies in AGROVOC are achieved by creating a scheme containing a subset of AGROVOC concepts. There are two main possible scenarios regarding the creation of a scheme:

1) An institution or community of experts with expertise in and a mandate for a specific subdomain relevant to AGROVOC proposes creation of a new scheme in AGROVOC. They commit to selecting and organizing existing relevant AGROVOC concepts and suggesting new concepts. In this case, the institution or community of experts uses AGROVOC as a foundation to create either a subset or an extension that covers their domain.

2) An institution or community of experts with expertise in and a mandate for a specific subdomain relevant to AGROVOC already has a vocabulary and proposes to integrate it into AGROVOC, reusing existing AGROVOC concepts, when possible, and adding missing ones by implementing the hierarchy of their original vocabulary. To date, this has been the most common scenario.

The basic criteria for accepting a subvocabulary include:

- The domain should be relevant to AGROVOC, and its scope should be well defined. It is important to note that concepts in the new scheme will also belong to AGROVOC.
- The institution that takes responsibility for a subvocabulary should have recognized expertise and a mandate to cover the related topic.
- A curator should be designated for the new scheme.
- The institution should agree to the common AGROVOC editorial rules.

Currently, there are three subvocabularies in AGROVOC (December 2020):

1) LandVoc. The LandVoc Thesaurus is a set of 300 concepts about land governance created and maintained by the Land Portal Foundation as a distinct concept scheme within AGROVOC. The vocabulary was started independently and then merged into AGROVOC in 2017. Most concepts were already in AGROVOC, but they were scattered throughout the AGROVOC hierarchy. For instance, the concept “land conflicts” is embedded three levels under the top concept “phenomena” and “land conflict resolution” is embedded six levels under “activities”. LandVoc reuses these concepts and re-structures them into a hierarchy designed for people working on land tenure, land management and land governance. In addition, LandVoc has added new land governance concepts to AGROVOC.

2) ASFA. The ASFA Thesaurus is an indexing and searching tool. It contains the subject descriptors used to index the content of the Aquatic Sciences and Fisheries Abstracts (ASFA) Bibliographic Database. It covers the world’s literature on the science, technology, management, and conservation of marine, brackish water, and freshwater resources and environments, including their socio-economic and legal aspects. A number of concepts from the thesaurus had already been mapped to AGROVOC while the two thesauri were maintained independently. Since 2019, aquatic sciences and fisheries concepts have been added to AGROVOC. ASFA has been integrated in AGROVOC as an independent subvocabulary, managed in VocBench, and sharing concepts with AGROVOC and other vocabularies.

3) FAOLEX. FAOLEX is a database of national legislation, policies and bilateral agreements on food, agriculture and natural resources management. It currently contains legal and policy documents drawn from more than 200 countries, territories and regional economic integration organizations in over 40 languages.

Although FAO facilitates the technical maintenance and publication of the whole of AGROVOC, the infrastructure is flexible. While institutions that do not have the necessary technical infrastructure can rely completely on AGROVOC services like Skosmos and the SPARQL endpoint, other institutions can decide to use their own browsing interface or their own triple store. ASFA and LandVoc are examples of two different approaches:

Figure 11. The same concept (“line fishing”, URI http://aims.fao.org/aos/agrovoc/c_4349) in the AGROVOC Skosmos (left) and in the ASFA Skosmos (right).

Source: FAO, 2020

1) ASFA uses all the features of the AGROVOC infrastructure, and its browsing interface is provided by the AGROVOC Skosmos platform. The ASFA Skosmos interface displays only the ASFA scheme, and only shows the ASFA hierarchic relationships, see Figure 11.

2) LandVoc is used in the Land Portal website for indexing and browsing content. The Land Portal administrators curate the LandVoc content in the AGROVOC VocBench, which is regularly downloaded in RDF. The Land Portal also uses its own SPARQL endpoint and browsing interface.

Sub-vocabularies or schemes can be downloaded independently. The institution that curates a scheme may provide a dedicated download. Otherwise, the scheme can be downloaded using SPARQL queries. As seen with language versions, users can download a CSV version using a SELECT query with the values they need, see Figure 13, or an RDF version by executing a DESCRIBE or a CONSTRUCT query.

The SELECT query in Figure 12, run on the AGROVOC SPARQL endpoint, retrieves a tabular result set with columns for concept URI, preferred label in English and Spanish, and their definitions. It is restricted to concepts belonging to the LandVoc scheme.

The DESCRIBE query in Figure 14 allows the download of a full RDF version of the scheme. DESCRIBE asks for all the triples in which the values of the selected variables are included.

5.5 Suggesting new terms

The recommended procedure is to suggest new terms directly in the editing platform VocBench, where validators will review the suggestion. New concepts can also be suggested by non-editors by sending an email to [email protected].

Figure 12. Sample SELECT query on the AGROVOC SPARQL endpoint.

Source: FAO, 2020
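The query in Figure 12 is only available as an image. The sketch below shows how a comparable scheme-restricted SELECT query could be assembled and sent; it is not the query from the figure. It assumes that plain skos:prefLabel values are available on the endpoint and that definitions are separate resources whose text is held in rdf:value. The function names are hypothetical, and both the scheme URI and the endpoint URL must be supplied by the caller.

```python
import re
import urllib.parse
import urllib.request

PREFIXES = (
    "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
)


def build_scheme_select(scheme_uri, languages):
    """Build a SELECT query: one label and one definition column per language."""
    if re.search(r"[<>\s]", scheme_uri):
        raise ValueError(f"unexpected character in scheme URI: {scheme_uri!r}")
    for lang in languages:
        if not re.fullmatch(r"[a-z]{2,3}", lang):
            raise ValueError(f"unexpected language tag: {lang!r}")
    variables = ["?concept"]
    optionals = []
    for lang in languages:
        variables += [f"?label_{lang}", f"?definition_{lang}"]
        optionals.append(
            f"  OPTIONAL {{ ?concept skos:prefLabel ?label_{lang} . "
            f'FILTER(lang(?label_{lang}) = "{lang}") }}'
        )
        # Assumption: a definition is a resource carrying its text in rdf:value.
        optionals.append(
            f"  OPTIONAL {{ ?concept skos:definition ?d_{lang} . "
            f"?d_{lang} rdf:value ?definition_{lang} . "
            f'FILTER(lang(?definition_{lang}) = "{lang}") }}'
        )
    body = "\n".join(optionals)
    return (
        f"{PREFIXES}SELECT {' '.join(variables)} WHERE {{\n"
        f"  ?concept a skos:Concept ; skos:inScheme <{scheme_uri}> .\n"
        f"{body}\n}}"
    )


def fetch_csv(endpoint_url, query):
    """Send the query over the SPARQL protocol and ask for CSV results."""
    url = endpoint_url + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(url, headers={"Accept": "text/csv"})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `build_scheme_select(scheme_uri, ["en", "es"])` yields the five-column layout described for Figure 12; the CSV returned by `fetch_csv` corresponds to the kind of export shown in Figure 13.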

Generally, it is enough to provide the new concept label in English, with any available translations. However, a definition with source is required. For a new concept, it would be useful to also suggest the concept’s position in the hierarchy (the broader concept), alignments with concepts in other thesauri, and any other non-hierarchical relationships. The more context given, the easier it is to review.

If one or more concepts or terms are suggested by email, it is recommended to ask [email protected] for the simple template, see Figure 15.

Figure 13. CSV file of the LandVoc export.

Source: FAO, 2020

Figure 14. Sample DESCRIBE query on the AGROVOC SPARQL endpoint.

Source: FAO, 2020
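As the DESCRIBE query in Figure 14 is likewise only shown as an image, the following sketch illustrates what DESCRIBE and CONSTRUCT queries for downloading one scheme in RDF might look like. It is an assumption about their general shape, not a transcription of the figure, and the function names are hypothetical.

```python
import re

SKOS_PREFIX = "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"


def _checked(scheme_uri):
    """Reject characters that cannot appear inside a SPARQL IRI reference."""
    if re.search(r"[<>\s]", scheme_uri):
        raise ValueError(f"unexpected character in scheme URI: {scheme_uri!r}")
    return scheme_uri


def build_scheme_describe(scheme_uri):
    """DESCRIBE every concept in the scheme; the server decides which triples to return."""
    return (
        f"{SKOS_PREFIX}DESCRIBE ?concept WHERE {{\n"
        f"  ?concept skos:inScheme <{_checked(scheme_uri)}> .\n}}"
    )


def build_scheme_construct(scheme_uri):
    """CONSTRUCT exactly the outgoing triples of every concept in the scheme."""
    return (
        f"{SKOS_PREFIX}CONSTRUCT {{ ?concept ?p ?o }} WHERE {{\n"
        f"  ?concept skos:inScheme <{_checked(scheme_uri)}> ;\n"
        f"           ?p ?o .\n}}"
    )
```

The practical difference is that the result of DESCRIBE depends on the triple store, whereas CONSTRUCT returns precisely the triple template given in the query.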

Figure 15. Template for suggesting new concepts.

Source: FAO, 2020

6 VocBench: editorial workflow

VocBench is an essential instrument for AGROVOC editing.

In this chapter, and in the following chapters, information about VocBench is always presented from the AGROVOC point of view: only the VocBench features relevant to editing the AGROVOC thesaurus are presented and discussed.

VocBench is a free and open-source advanced environment for creating KOS, which supports the collaborative development of these resources by embracing Semantic Web standards such as OWL and SKOS (SKOS-XL). Besides editing and browsing capabilities, VocBench features several advanced functionalities for supporting the publication workflow (e.g. history, validation and versioning), “extract, transform, and load” (ETL) processes (e.g. importing from spreadsheets), alignment (manual, semi-automatic and automatic) or user discussions.

VocBench 3 is funded by Action 1.1 of the Interoperability solutions for public administrations, businesses and citizens (ISA) Programme of the European Commission, which aims to provide “interoperability solutions for public administrations, businesses and citizens”.

VocBench 3, which is presented in this guide, was released to the public in September 2017 under a Berkeley Software Distribution (BSD) 3-clause license. Since then, an average of two or three releases each year demonstrates the constant evolution of the system, characterized by the introduction of new features, improvements of existing ones and bug fixes.

In the AGROVOC VocBench platform, there are a variety of roles that include:

Lexicographer – has permissions to add labels and notes (except validating them), which can be limited to certain languages.

Mapper – can create alignments with external concepts.

Ontologist – can edit classes and properties.

Thesaurus editor – has the permissions of a lexicographer, and can also manage concepts (add, remove, update) and execute SPARQL queries.

Project manager – has full permissions.

Validator – can validate draft edits, which are all the changes done to AGROVOC by editors, e.g. new translations and concepts, updating a concept or a term, etc.

Administrator – responsible for the entire VocBench deployment and configuration.

Additionally, different types of editing rights may be assigned by language.

In VocBench, a project is a specific instance of a given thesaurus or ontology, and it is the administrator who assigns user roles based on the combination of which operations should be possible on which parts of a project. Operations include create, read, update, delete and validate.

There are two projects containing AGROVOC data: agrovoc-core, which is the live editing environment, and agrovoc-test, which is a sandbox for learning how to use VocBench and explore the AGROVOC backend. It is possible to request an account for agrovoc-test.

Most AGROVOC editors will have one or both of the lexicographer and thesaurus editor roles. There are access restrictions related to schemes: all concepts and subvocabularies belong to AGROVOC, but AGROVOC editors can only edit within their assigned schemes.

In VocBench, the workflow has two steps:

- the editor (thesaurus editor and/or lexicographer) adds or modifies something, which is automatically considered as proposed; and
- the validator accepts or rejects the proposed action.

All AGROVOC editors are experts, working in centres of recognized expertise in the area of agriculture, forestry, fisheries or related sciences. The validators, besides being very knowledgeable in their specific area of expertise, are also well acquainted with the AGROVOC structure and editorial guidelines. Validation is an especially important step in ensuring the coherence and quality of AGROVOC: it is critical for avoiding the entry of concepts that already exist in another form, and for ensuring adherence to AGROVOC structural logic and editorial guidelines.

Editors can access the “History” tab, which displays all accepted actions. They can also see their own pending submissions under the “Validation” tab and they can reject their own proposed actions. Editors can also browse VocBench and see the status of any resource. Some history and validation data are also saved to the AGROVOC RDF, such as date of concept or term creation and date of last update while, in general, changes performed in between creation and last update are only stored in VocBench.
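The propose-then-validate workflow described in this chapter can be pictured as a small state machine. The sketch below is a conceptual model only; it does not use or reproduce any VocBench API, and the class and method names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PROPOSED = "proposed"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class Action:
    author: str
    description: str
    status: Status = Status.PROPOSED  # every edit starts as a proposal


class Project:
    def __init__(self, validators):
        self.validators = set(validators)
        self.actions = []

    def propose(self, author, description):
        """An editor's change is recorded and automatically marked as proposed."""
        action = Action(author, description)
        self.actions.append(action)
        return action

    def accept(self, user, action):
        """Only a validator may accept a pending action."""
        self._require_pending(action)
        if user not in self.validators:
            raise PermissionError("only validators can accept actions")
        action.status = Status.ACCEPTED

    def reject(self, user, action):
        """A validator, or the author of the proposal, may reject it."""
        self._require_pending(action)
        if user not in self.validators and user != action.author:
            raise PermissionError("only validators or the author can reject")
        action.status = Status.REJECTED

    def history(self):
        """Accepted actions, as shown in the History tab."""
        return [a for a in self.actions if a.status is Status.ACCEPTED]

    def pending(self, author):
        """An editor's own pending submissions, as in the Validation tab."""
        return [a for a in self.actions
                if a.author == author and a.status is Status.PROPOSED]

    @staticmethod
    def _require_pending(action):
        if action.status is not Status.PROPOSED:
            raise ValueError("action is no longer pending")
```

The model keeps rejected actions in the list for simplicity; how VocBench stores them is not covered here.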

7 VocBench: navigating AGROVOC

7.1 Browsing and searching in VocBench

Most VocBench operations are performed under the “Data” tab where users and editors will find all the concepts and concept schemes accessible to them, as well as the vocabulary elements (classes, properties, data types). Since AGROVOC hosts multiple schemes, it is important to ensure that the user is browsing the right scheme. It is recommended to first click Scheme and select AGROVOC http://aims.fao.org/aos/agrovoc, see Figure 16. If no scheme is selected, the concept browsing view will show the combined hierarchies of all schemes, which can make navigation confusing.

Once a scheme is selected, it is possible to start browsing the concepts, see Figure 17. Under “Concepts”, there are the top 25 concepts of AGROVOC, in all available languages; by clicking on the triangles on the left, the hierarchy will be expanded.

Navigating with all languages active can be confusing and difficult. It is possible to configure the system to display only specific languages by clicking on the user icon in the top right corner of the VocBench interface and selecting “Preferences”.

In the Languages panel, it is possible to select which languages are used to display the concept labels, and in which order. In the example, English and Spanish have been selected, see Figures 18 and 19.

Figure 16. AGROVOC schemes listed under the Data > Scheme tab.

Source: FAO, 2020

Figure 17. The AGROVOC top concepts in the VocBench Concept tab.

Source: FAO, 2020

It is also possible to configure the default editing language under the “Editing” tab. After changing the preferences, the Concept view will be more manageable, see Figure 20.

The AGROVOC hierarchy can be browsed on the left. When one selects a concept, properties will be displayed in the panel on the right, see Figure 21. VocBench uses the RDF property names in the right panel preceded by the prefix that identifies the vocabulary to which the property belongs.

Figure 18. The VocBench Preferences menu item.

Source: FAO, 2020

Figure 19. Configuring languages used in rendering concepts.

Source: FAO, 2020

Figure 20. The Concepts view using only two languages.

Source: FAO, 2020

Figure 21. The concept tree panel (left) and the property panel (right) in VocBench.

Source: FAO, 2020

At the bottom of the left panel, there is a search box, see Figure 22, which provides options to configure search settings, such as starts with, contains, exact or fuzzy. It is also possible to choose to search in certain literal values or in URIs, or in specific languages. This search is not case sensitive (so searching for “Events” will return “events”).

If the search gives only one result, VocBench will automatically open the result expanding the concept tree in the left panel. If there is more than one result, VocBench will display all results in a popup so the user can select the desired concept. VocBench has a useful “Restrict search to language” setting in the Search settings popup, see Figure 23.

When searching in a language, geographic variants (locales) are not considered by default. The “Include locales” checkbox can be ticked if desired.

7.2 Searching through a SPARQL query in VocBench

In addition to the public SPARQL endpoint, VocBench has a local interface available for queries which contains the current content of AGROVOC. The user needs to be logged in to VocBench to perform such queries, in contrast to the public SPARQL endpoint which can be accessed by anyone.

It is possible to extract the concepts from SPARQL queries in a simple CSV file, which helps to review, edit and add new labels where needed. The translated concepts can be imported at a later point by the administrator. Editors have access to run and save SPARQL queries in VocBench, see Figure 24.

Depending on what editors need or prefer for their work, it is possible to run queries or request exports such as:

Figure 22. VocBench search functionality.

Source: FAO, 2020

Figure 23. The “Restrict search to language” setting in the VocBench search settings popup.

Source: FAO, 2020

- all concepts and their labels in English (as source language) and in a specific language;
- all concepts that exist in English but not in a specific language; and
- all concepts created after a certain date.

The next step is to add the specific query text. The sample query in Figure 25 retrieves, for each concept, the preferred terms in English, French and Spanish and the non-preferred terms in the same three languages.

The results of SPARQL queries can be exported in formats such as Excel, JavaScript Object Notation (JSON), CSV, Tab Separated Values (TSV) and Open Document Spreadsheet (ODS).

Editors with access to multiple schemes can filter a query by scheme (for example, only ASFA, by adding a ?concept skos:inScheme triple pattern with the URI of that scheme). However, for queries in the public SPARQL endpoint, only the AGROVOC scheme is present (the LandVoc scheme is also available but with no custom hierarchy).

Figure 24. Using SPARQL inside VocBench.

Source: FAO, 2020

Figure 25. Sample SPARQL query in VocBench

Source: FAO, 2020
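One of the exports listed in section 7.2, all concepts that exist in English but not in a specific language, maps naturally onto a FILTER NOT EXISTS pattern. The sketch below builds such a query over SKOS-XL labels; it is an illustration rather than the query from Figure 25, the function name is hypothetical, and the optional scheme URI has to be supplied by the editor.

```python
import re

PREFIXES = (
    "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
    "PREFIX skosxl: <http://www.w3.org/2008/05/skos-xl#>\n"
)


def _check_lang(lang):
    """Accept only simple language tags such as 'en' or 'pt-BR'."""
    if not re.fullmatch(r"[a-z]{2,3}(-[A-Za-z0-9]+)*", lang):
        raise ValueError(f"unexpected language tag: {lang!r}")
    return lang


def build_missing_label_query(target_lang, source_lang="en", scheme_uri=None):
    """List concepts with a preferred label in source_lang but none in target_lang."""
    source = _check_lang(source_lang)
    target = _check_lang(target_lang)
    scheme_line = ""
    if scheme_uri is not None:
        if re.search(r"[<>\s]", scheme_uri):
            raise ValueError(f"unexpected character in scheme URI: {scheme_uri!r}")
        scheme_line = f"  ?concept skos:inScheme <{scheme_uri}> .\n"
    return (
        f"{PREFIXES}"
        "SELECT ?concept ?source_label WHERE {\n"
        "  ?concept a skos:Concept ;\n"
        "           skosxl:prefLabel/skosxl:literalForm ?source_label .\n"
        f"{scheme_line}"
        f'  FILTER(lang(?source_label) = "{source}")\n'
        "  FILTER NOT EXISTS {\n"
        "    ?concept skosxl:prefLabel/skosxl:literalForm ?target_label .\n"
        f'    FILTER(lang(?target_label) = "{target}")\n'
        "  }\n"
        "}"
    )
```

The comparison on `lang(...)` is exact, so locale variants of the target language are not counted as translations in this sketch.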

8 VocBench: curating AGROVOC

8.1 Adding a new term

To add a term in a specific language for an existing concept, editors should always search in AGROVOC to find the concept to which the translation should be attached. Additionally, they have to check that the term does not already exist in that particular language.

8.1.1 Adding a preferred term

It is possible to suggest a preferred label for a specific language if it is not present.

When a concept is selected, all its associated information is displayed in a dedicated tab, in the right part of the screen, see Figure 19. Inside the viewer, all the preferred and alternative labels of the desired concept are shown in the “Lexicalizations” section.

An editor should click on the “+” button beside skosxl:prefLabel and insert the new term in the popup, see Figure 26. It is only possible to add a term in a language for which the editor has been granted permission.

The new suggested label is now shown in green italic, pending validation, see Figure 27.

Note: editors can set their main language as the default editing language in the preferences. This way, the editors’ language is already pre-selected in the “Add skosxl:prefLabel” popup and they do not have to switch in most of the current editing tasks.

Only one preferred label for each language is allowed. So if editors add a new preferred label in a language for which there is already a preferred label, VocBench will automatically transform the old preferred label into an alternative label. The new one will be the preferred label, which will be marked as a proposal until validated.

8.1.2 Adding an alternative term

An editor may want to suggest an alternative label in a specific language for an already existing concept. For example, the editor may want to add “industrial farming”, which is not yet present in AGROVOC, but they find “intensive farming” as the preferred term for the concept :c_3906. Even though this concept already has “industrial agriculture” among its alternative labels, the editor decides to suggest “industrial farming” as well.

Selecting the concept “intensive farming” in the left panel, the editor can see all its properties in the right panel. Among the properties, there is skosxl:altLabel which indicates non-preferred terms. The editor needs to click on the plus (+) sign on the right to add a new alternative label, see Figure 28.

The editor can add the new term in the popup, see Figure 29. The new term will appear as a draft until validated.

Editors can access the validation tab and see what actions are pending or done. If needed, they can reject a suggested action (changing the data as it was before editing) while it is still pending validation.
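The rule from section 8.1.1, that each language has exactly one preferred label and that a newly added preferred label demotes the old one to an alternative label, can be sketched as a small data structure. This is an illustrative model using the “intensive farming” example, not VocBench code; the class and method names are hypothetical and the validation step is left out.

```python
from dataclasses import dataclass, field

AGROVOC_NS = "http://aims.fao.org/aos/agrovoc/"


@dataclass
class Concept:
    uri: str
    pref_labels: dict = field(default_factory=dict)  # language -> single label
    alt_labels: dict = field(default_factory=dict)   # language -> list of labels

    def add_pref_label(self, lang, label):
        """Set the preferred label; an existing one becomes an alternative label."""
        old = self.pref_labels.get(lang)
        if old is not None and old != label:
            self.alt_labels.setdefault(lang, []).append(old)
        # Choice made in this sketch: a promoted label is no longer listed
        # as an alternative label as well.
        alternatives = self.alt_labels.get(lang, [])
        if label in alternatives:
            alternatives.remove(label)
        self.pref_labels[lang] = label

    def add_alt_label(self, lang, label):
        """Add an alternative label unless the concept already has that term."""
        if (self.pref_labels.get(lang) == label
                or label in self.alt_labels.get(lang, [])):
            raise ValueError("term already exists for this concept and language")
        self.alt_labels.setdefault(lang, []).append(label)


concept = Concept(
    uri=AGROVOC_NS + "c_3906",
    pref_labels={"en": "intensive farming"},
    alt_labels={"en": ["industrial agriculture"]},
)
concept.add_alt_label("en", "industrial farming")
```

After the last call, the concept keeps “intensive farming” as its English preferred label and carries both “industrial agriculture” and “industrial farming” as alternative labels.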

8.2 Editing an existing term

Modification of existing labels is possible for an editor with appropriate language permissions. There are three general use cases:

- edit literal content, e.g. correcting the spelling in a language, or changing from upper to lower case;
- swap altLabel and prefLabel; and
- delete a label, e.g. to fix a previous error.

To edit a published label, for example to correct a spelling mistake, the editor needs to click on the small arrow pointing down on the far right, and choose “Edit literal content”, see Figure 30.

To swap prefLabel and altLabel, the editor should ask [email protected] for help. To delete a published label, the editor should click on the small arrow pointing down on the far right, and choose Delete. This action then needs to be validated.

Figure 26. Adding a preferred label in Spanish.

Source: FAO, 2020

Figure 27. New suggested label in Spanish highlighted in green italic.

Source: FAO, 2020

Figure 28. Adding a new alternative term: step 1.

Source: FAO, 2020

Figure 29. Adding a new alternative term: step 2.

Source: FAO, 2020

Figure 30. Editing a label.

Source: FAO, 2020

8.3 Adding a new concept

Before adding a new concept, editors should:

- be familiar with the AGROVOC structure;
- know the AGROVOC editorial guidelines well;
- remember that the new concept will be part of AGROVOC (really important for editors managing any subvocabulary in AGROVOC);
- have a definition ready, with source; and
- search AGROVOC to make sure the concept is not already present in AGROVOC as a synonym or in a different language.

This last step is essential. Editors should consider that a concept may have different names or labels and check that the label (preferred or alternative) that they want to add is not already part of an existing concept. They should also decide whether what they want to add is just a term or an entirely new concept.

The VocBench search functionality searches all labels by default, so editors can use it to search for the term they have in mind and for synonyms. Depending on the search results, a completely new concept may be required, or a new alternative label may be added to an existing concept. The editor may also opt to do nothing because the term is already a label of an existing concept in the relevant language. Here is an example for when an editor wants to add “farmers’ rights”:

Figure 31. List of AGROVOC search results for “farmers’ rights” in VocBench.

Source: FAO, 2020

Figure 32. Step 1: searching for the broader concept before adding a new concept.

Source: FAO, 2020

Figure 33. Step 2: creating a new narrower concept.

Source: FAO, 2020

Figure 34. Step 3: adding the label for a new concept.

Source: FAO, 2020

Figure 35. Newly created concept.

Source: FAO, 2020

First, the editor should have a definition ready, with a source: “Farmers’ Rights refer to rights arising from the past, present and future contributions of farmers in conserving, improving, and making available plant genetic resources, particularly those in the centres of origin/diversity.” The source of the definition is the International Treaty on Plant Genetic Resources for Food and Agriculture, 2001.

The editor should then search in AGROVOC to make sure that this concept does not already exist, which can be done in Skosmos or VocBench, see Figure 31. However, it is important to note that VocBench (agrovoc-core) will always have the newest data (even if changes are not yet validated).

To suggest “farmers’ rights” as a new concept, the editor should go to the Concept tab, where there are four buttons: Create concept, Create narrower concept, Delete concept and Deprecate. Note that if the editor does not have all the permissions, certain buttons will not be active.

Concepts should never be deleted in AGROVOC. AGROVOC URIs are persistent, which means that the URI is permanently assigned to a particular resource, is stable and does not change or vanish over time. Instead, a concept can be deprecated (for example, if it is a duplicate). Deprecated concepts are visible, as the URI may be in use, but cannot be edited. The reason for deprecation should be annotated in a change note. It is important that the editor does not deprecate a concept without talking to the administrator.

The editor needs to select where to place the new concept as there is a specific logic to where concepts sit in the hierarchy. Annex 2 outlines the top concepts of AGROVOC, which can also be helpful when thinking about where something might fit in the AGROVOC hierarchy. For example, “farmers’ rights” might sit at the same level in the AGROVOC hierarchy as “breeders’ rights”: entities > legal rights > breeders’ rights.

The editor should search for the broader concept (“legal rights”) in VocBench, see Figure 32. With “legal rights” selected (it will be highlighted in blue), the editor should click on the second button under Concept (in tab Data > Concept). This button allows creation of a new narrower concept, see Figure 33.

Clicking on this button will open a popup, where the new concept (e.g. farmers’ rights) can be suggested, see Figure 34. It is also possible to add a concept in another language. However, editors can only add an English label if they have editing rights for that language.

The system will automatically create the new concept and generate its URI local name (the final part of the URI, after the AGROVOC namespace), forming it by concatenating c_ and an eight-digit automatically-generated code.

Now “farmers’ rights” appears as a draft new concept, awaiting review, see Figure 35.

The new concept will be marked as a draft until approved by the validator. The new concept may be accepted, rejected, or left pending if the editor is asked to provide additional information. Definitions are mandatory for all new concepts to aid translations, see next section. It is important to remember that changes to concepts will be inherited throughout AGROVOC and all schemes that include that concept.

8.4 Adding other properties

When a concept is selected, the editor can edit its properties if permissions are granted.

Editing procedures are slightly different depending on the type of value used for the property an editor wants to edit: literal values or resources. Some examples are provided for how to edit such properties in VocBench.

For definitions, AGROVOC uses the skos:definition property, which is a sub-property of skos:note, which groups all the documentation properties. All documentation properties can be added or edited under the “Notes” section in the right panel of the Concept view. If a documentation property has not yet been inserted for a concept, the editor will find the section empty and will need to add a new type of note.

In the right panel, in the Notes section, the editor should click on the orange icon to add a new note, see Figure 36.

In the popup, the editor should select the type of documentation note. Definitions are under skos:note. In this case, skos:definition should be selected, see Figure 37.

There are two options for definitions, depending on where to find the definition: “Definition with a URL” or “Definition with a description”, see Figure 38. For definitions with descriptions, the source is described in text, not as a URL.

Editors are strongly encouraged to provide a definition in English any time they add a new concept to AGROVOC, independently of their preferred language. This practice is meant to allow future expansion of AGROVOC, i.e. translation to other languages.

Definitions can be provided in any language, but only one definition per language. Once added, the definition is displayed in the right panel, see Figure 39.

Once the definition is added, the skos:definition property appears in the right panel. An editor

Figure 36. Adding a definition: step 1.. Figure 39. Display of definition.

Source: FAO, 2020 Source: FAO, 2020

Figure 37. Adding a definition: step 2. Figure 40. Editing a definition.

Source: FAO, 2020

Source: FAO, 2020 Figure 41. Example of relationship between concepts in Skosmos. Figure 38. Adding a definition: step 3.

Source: FAO, 2020 Source: FAO, 2020 AGROVOC Semantic data interoperability on food and agriculture 40

can add other definitions in other languages by 8.5 Adding a relationship clicking on the corresponding plus sign on the right, instead of adding a new property by clicking between concepts on the yellow icon. AGROVOC uses the SKOS relation skos:related To edit or delete a definition, it is necessary to corresponding to the classical thesaurus RT. double-click on the hyperlinked definition text and a AGROVOC also allows for relations from the definition resource will open. The editor will be able Agrontology between concepts. For example, to see that the definition is a separate entity, with its looking at “goat cheese” in Skosmos, this includes own URI and metadata. In this window, the editor the relationship isMadeFrom “goat milk”, see can edit or delete the definition using the dropdown Figure 41. options on the right, see Figure 40. To add non-hierarchical relations, the editor needs to open “goat cheese” in VocBench, then Other Properties, where the specific type of relationship, for example isMadeFrom will be selected in order to link the two concepts together, see Figure 42.
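The non-hierarchical relations described in section 8.5 can be pictured as plain subject, predicate, object statements. The following is a minimal sketch, not how VocBench stores data: concepts are represented by their English labels instead of URIs, the prefix in `agrontology:isMadeFrom` is an assumed abbreviation, and the pairing of “goat cheese” with “goats” is a hypothetical example. It relies only on the fact that skos:related is symmetric in SKOS.

```python
# Predicate names are written as prefixed strings for readability.
SKOS_RELATED = "skos:related"
IS_MADE_FROM = "agrontology:isMadeFrom"  # prefix label is an assumption


def add_relation(triples, subject, predicate, obj):
    """Record a non-hierarchical relation between two concepts.

    skos:related is symmetric in SKOS, so the reverse statement is stored too.
    """
    triples.add((subject, predicate, obj))
    if predicate == SKOS_RELATED:
        triples.add((obj, predicate, subject))


def relations_of(triples, concept):
    """Return the (predicate, object) pairs where `concept` is the subject."""
    return sorted((p, o) for s, p, o in triples if s == concept)


triples = set()
add_relation(triples, "goat cheese", IS_MADE_FROM, "goat milk")
add_relation(triples, "goat cheese", SKOS_RELATED, "goats")  # hypothetical pairing

print(relations_of(triples, "goat cheese"))
print(relations_of(triples, "goats"))
```

In this sketch a directed relation such as isMadeFrom is stored once, while skos:related is visible from both concepts.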

Figure 42. Viewing a relationship between concepts in VocBench.
Source: FAO, 2020

8.6 Mapping to an external concept (alignments)

To align the new concept “farmers’ rights” with the corresponding concept “farmers’ rights” in NALT, the editor should select the source concept “farmers’ rights” from the AGROVOC concept tree, see Figure 43. In the right panel, the editor should click on the dropdown arrow on the right and select “Align with external resource”.

In the alignment popup, see Figure 44, the editor should choose the appropriate property for the type of mapping. In this case, the concept “farmers’ rights” in NALT is the same concept as in AGROVOC, so it is possible to set an exact match relation. It is helpful to look at context and definitions, where available, to make sure the match is exact.

The Skosmos browser of the Basel Register of Thesauri, Ontologies & Classifications (BARTOC, 2020) is a very useful resource for looking up alignments; it is important to be sure that the link is to a concept URI, not a URL. It is also highly recommended to align AGROVOC with other KOS that are already aligned to AGROVOC (so that the alignment is bi-directional).

Figure 43. Linking to an external concept: step 1.
Source: FAO, 2020

Figure 44. Linking to an external concept: step 2.
Source: FAO, 2020
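The recommendation that alignments be bi-directional can be checked mechanically once the exact-match links of both vocabularies are available. The sketch below is an illustration only: the function name is hypothetical, the links are passed in as plain pairs, and the example.org URIs are placeholders, not real AGROVOC or NALT identifiers.

```python
def one_way_alignments(agrovoc_matches, external_matches):
    """List exact matches asserted by AGROVOC that the other KOS does not assert back.

    agrovoc_matches:  set of (agrovoc_uri, external_uri) pairs
    external_matches: set of (external_uri, agrovoc_uri) pairs
    """
    # Flip the external pairs so both sets use the same (agrovoc, external) order.
    reverse = {(target, source) for source, target in external_matches}
    return sorted(pair for pair in agrovoc_matches if pair not in reverse)


# Placeholder URIs, for illustration only.
agrovoc_matches = {
    ("http://example.org/agrovoc/farmers-rights", "http://example.org/nalt/farmers-rights"),
    ("http://example.org/agrovoc/goat-milk", "http://example.org/nalt/goat-milk"),
}
external_matches = {
    ("http://example.org/nalt/goat-milk", "http://example.org/agrovoc/goat-milk"),
}

for agrovoc_uri, external_uri in one_way_alignments(agrovoc_matches, external_matches):
    print("not yet bi-directional:", agrovoc_uri, "->", external_uri)
```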

9 VocBench: curating a subvocabulary

By using the multischeme hierarchy feature, a controlled vocabulary can be viewed flexibly and edited with its customized relations, or exported with a generic SKOS hierarchy of broader and narrower relations, without changing the hierarchy of AGROVOC itself.

This function enables potential collaborations with specialized communities: their vocabulary can be hosted by AGROVOC while maintaining the possibility for separate hierarchy, exports and display.

VocBench supports the use of other hierarchical relation properties in addition to the standard skos:broader. Even custom relation properties can be used, as long as they are sub-properties of skos:broader. Since these hierarchical properties are not automatically assigned to a scheme, the editor needs to decide which hierarchy to see by selecting which hierarchical property to use in the concept tree. In AGROVOC, each scheme has its own hierarchical property, allowing a different hierarchy view for each scheme. The editor can switch the view by selecting a scheme, see Figure 45, and by choosing the corresponding hierarchical property, see Figure 46.
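Because every scheme-specific hierarchical property is a sub-property of skos:broader, a scheme hierarchy can be rewritten as generic broader and narrower statements for export. The sketch below illustrates that idea under stated assumptions: `:landvoc-broader` is the example name used later in this chapter, `:faolex-broader` is a hypothetical name, concepts are written as labels, and the sample hierarchy is invented. It does not describe the actual VocBench export.

```python
SKOS_BROADER = "skos:broader"
SKOS_NARROWER = "skos:narrower"

# Declared sub-properties of skos:broader (names are illustrative).
SUB_PROPERTY_OF = {
    ":landvoc-broader": SKOS_BROADER,
    ":faolex-broader": SKOS_BROADER,  # hypothetical name
}


def export_generic_hierarchy(triples, scheme_property):
    """Rewrite one scheme's hierarchy with generic skos:broader / skos:narrower."""
    if SUB_PROPERTY_OF.get(scheme_property) != SKOS_BROADER:
        raise ValueError(f"{scheme_property} is not a sub-property of skos:broader")
    generic = set()
    for child, predicate, parent in triples:
        if predicate == scheme_property:
            generic.add((child, SKOS_BROADER, parent))
            generic.add((parent, SKOS_NARROWER, child))
    return generic


# Invented sample data: the same concept has a different parent in each hierarchy.
triples = {
    ("land tenure", SKOS_BROADER, "tenure"),
    ("land tenure", ":landvoc-broader", "land governance"),
}

print(sorted(export_generic_hierarchy(triples, ":landvoc-broader")))
```

The input statements are left untouched, so the core AGROVOC hierarchy is unaffected by the export.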

Figure 45. Changing scheme view in VocBench.
Source: FAO, 2020

Figure 46. Changing hierarchical property for scheme view in VocBench.
Source: FAO, 2020

All the editorial rules, workflow steps and basic operations in VocBench described in Chapter 8 are also valid for sub-vocabularies. This chapter will only illustrate some operations in VocBench that are specific to scheme curation.

When viewing a concept in VocBench, it is possible to see which scheme(s) it belongs to, and what the broader concepts are, see Figure 47.

In order to manage a scheme, editors need to have the “thesaurus-editor” role and permissions on their own scheme. There is an ontologist role if the editor needs to set up the scheme-specific hierarchic properties, but this is handled by the administrator.

Figure 47. Example of how VocBench displays differences in scheme hierarchy for a concept.

Source: FAO, 2020

9.1 Associating concepts with a new scheme

Whether a scheme curator is reusing an existing AGROVOC concept or has created a new concept, it is necessary to associate it with the desired scheme. For instance, the scheme curator wants to reuse the AGROVOC “law of the sea” concept in the FAOLEX scheme. Since that concept is not in the scheme yet, the scheme curator has to select the AGROVOC scheme under the Scheme tab, then under the Concept tab, select the concept “law of the sea”, see Figure 48. For the moment, the concept is associated only with the AGROVOC and ASFA schemes: by clicking on the plus sign next to the skos:inScheme header in the right panel, it is possible to also associate the concept with the FAOLEX scheme, see Figure 49.

In the next step, the FAOLEX scheme is selected by clicking on OK. The concept now also belongs to the FAOLEX scheme.

If the new concept is not a top concept in a scheme and it does not have a hierarchical relation with other concepts up to the top concepts, editors will not see it when they select their scheme. If it is a top concept, they will have to select it again under the AGROVOC scheme and add a value to the skos:topConceptOf property, setting it to their scheme. If it is not a top concept, it will be displayed in the tree only after it is positioned in the hierarchy. Before positioning the concept in a hierarchy, editors need to have the scheme-specific hierarchic property in place.
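The visibility rule described above (a concept appears in a scheme tree only if it is a top concept of that scheme or is linked upwards to one) can be written as a small reachability check. This is a sketch of the rule as stated in the text, not of VocBench internals; the function name, the dictionary-based data model and the “law” top concept are assumptions made for illustration.

```python
def visible_in_scheme(concept, scheme, in_scheme, top_concepts, broader):
    """Return True if `concept` would show up in the concept tree of `scheme`.

    in_scheme:    concept -> set of schemes (skos:inScheme)
    top_concepts: scheme -> set of its top concepts (skos:topConceptOf)
    broader:      concept -> set of broader concepts under the scheme's
                  hierarchical property
    """
    if scheme not in in_scheme.get(concept, set()):
        return False
    tops = top_concepts.get(scheme, set())
    stack, seen = [concept], set()
    while stack:
        current = stack.pop()
        if current in tops:
            return True
        if current in seen:
            continue  # guard against cycles in the hierarchy
        seen.add(current)
        stack.extend(broader.get(current, ()))
    return False


in_scheme = {
    "law of the sea": {"AGROVOC", "ASFA", "FAOLEX"},
    "law": {"AGROVOC", "FAOLEX"},  # hypothetical FAOLEX top concept
}
top_concepts = {"FAOLEX": {"law"}}

# In the scheme, but not yet positioned: not visible in the FAOLEX tree.
print(visible_in_scheme("law of the sea", "FAOLEX", in_scheme, top_concepts, {}))
# Positioned under a top concept: visible.
faolex_broader = {"law of the sea": {"law"}}
print(visible_in_scheme("law of the sea", "FAOLEX", in_scheme, top_concepts, faolex_broader))
```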

Figure 48. Associating a concept with a new scheme.
Source: FAO, 2020

Figure 49. Selecting the scheme.
Source: FAO, 2020

Figure 50. Creating a subProperty of skos:broader in VocBench.
Source: FAO, 2020

9.2 Creating scheme-specific hierarchical properties

In order to see how to set up scheme-specific hierarchies, it is necessary to understand the RDF model that implements alternative hierarchies. All properties used in VocBench can be found under the Data > Property tab. Editors with permissions as an ontologist or thesaurus editor will see the Create subProperty button enabled. If they want to create a sub-property of skos:broader, they will need to select skos:broader in the property list, see Figure 50, and then click on the Create subProperty button.

A name for the property can then be provided in the popup, see Figure 51. The property can now be used.

Once a new scheme-specific hierarchical property has been created, one more step is needed in order to view the new hierarchy. The new property must be selected as the hierarchic property in the Concept tab of VocBench. To do this, it is necessary to click on the Settings icon, see Figure 52.

Next to the “Broader” header, editors should click on the plus sign to add another broader property, see Figure 53.

Editors should select the new scheme-specific broader property, see Figure 54.

Figure 51. Naming the new subProperty.
Source: FAO, 2020

Figure 52. Settings icon in the Concept tab.
Source: FAO, 2020

Figure 53. Add a broader subProperty to the concept tree settings.
Source: FAO, 2020

Figure 54. Select the preferred broader property for hierarchical display.
Source: FAO, 2020

Figure 55. Setting the subProperty as default broader property.
Source: FAO, 2020

Figure 56. Adding a new broader concept.
Source: FAO, 2020

Figure 57. Changing the default broader property.
Source: FAO, 2020

Figure 58. Selecting the scheme-specific broader property.
Source: FAO, 2020

Figure 59. Selecting the correct scheme in the broader selection popup.
Source: FAO, 2020

In the settings popup, editors should select the new property before clicking on OK, so that only that property will be used to display the hierarchy, see Figure 55.

After that, the Concept tab will display only the hierarchy according to the specific hierarchical property. The same steps can be used to change to another hierarchical view.

9.3 Implementing the scheme-specific hierarchy

Editors can implement the scheme-specific hierarchy by using the scheme-specific broader property instead of the core skos:broader property. The hierarchy uses the specific hierarchical property, which is logically associated to that scheme.

Depending on whether the concept is new or not, editors will need to take an additional step. If the concept is new, it should be first added to the AGROVOC scheme, at a specific place in its hierarchy, using the standard skos:broader property. Once the concept is in the AGROVOC hierarchy, the scheme-specific hierarchy needs to be set next.

Editors should choose to either set the scheme-broader property (e.g. :landvoc-broader) to the same hierarchy as in AGROVOC, or to select another broader concept of their choice. Both concepts need to be in the desired other scheme, otherwise the hierarchy will not be visible when selecting the scheme containing the subvocabulary.

9.3.1 Positioning a concept in the scheme-specific hierarchy

To add a broader term, editors should follow the normal process by clicking on “Add broader”, see Figure 56.

In the next popup, editors should change the broader property in the top right corner by clicking on it to open a popup where they can select the scheme-specific broader property, see Figures 57 and 58.

Once this property is selected, they need to choose the broader concept. Since the broader term has to belong to the same scheme, they need to make sure that in the concept selection popup, the correct scheme is selected for browsing, see Figure 59.

The popup will then display the concepts in a specific scheme, e.g. FAOLEX, and the editors can choose the broader concept. Once the scheme-specific property is filled instead of the default skos:broader, this hierarchic relation will be set only for the selected scheme.
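The two rules of section 9.3 (the scheme-specific property is used instead of skos:broader, and both concepts must belong to the scheme) can be illustrated with a small data model. The following is a hedged sketch: the helper names, the labels used as concept identifiers and the sample concepts are assumptions, and only the property name `:landvoc-broader` comes from the text.

```python
from collections import defaultdict


def add_scheme_broader(triples, in_scheme, child, parent, scheme, prop):
    """Set a scheme-specific broader relation, refusing concepts outside the scheme."""
    for concept in (child, parent):
        if scheme not in in_scheme.get(concept, set()):
            raise ValueError(f"{concept!r} is not in scheme {scheme!r}")
    triples.add((child, prop, parent))


def tree_for_property(triples, prop):
    """Group concepts under their parents, looking only at one hierarchical property."""
    tree = defaultdict(list)
    for child, predicate, parent in triples:
        if predicate == prop:
            tree[parent].append(child)
    return {parent: sorted(children) for parent, children in tree.items()}


# Invented sample data.
in_scheme = {
    "land tenure": {"AGROVOC", "LandVoc"},
    "land governance": {"AGROVOC", "LandVoc"},
    "tenure": {"AGROVOC"},
}
triples = {("land tenure", "skos:broader", "tenure")}  # position in AGROVOC

add_scheme_broader(triples, in_scheme, "land tenure", "land governance",
                   "LandVoc", ":landvoc-broader")

print(tree_for_property(triples, "skos:broader"))      # AGROVOC view
print(tree_for_property(triples, ":landvoc-broader"))  # scheme-specific view
```

Selecting a different property in `tree_for_property` mirrors switching the hierarchical property in the concept tree settings: the statements stay the same and only the view changes.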

10 VocBench: importing and exporting

Functionalities for importing and exporting are used when editors want to import a number of concepts already created or edited with a different tool, e.g. in Excel or in another RDF tool, instead of adding or editing one concept at a time.

VocBench can import from RDF and CSV files. However, using these functionalities requires more than basic permissions and some technical knowledge.

Editors can export data from VocBench in two different ways:

1) RDF. VocBench can export the content of a project in all possible RDF formats. The export functionality is available from the top right “Global Data Management” menu.

2) SPARQL. After executing a query in the SPARQL interface, results can be downloaded in different formats, including CSV. This is useful when a number of concepts need to be edited, and then reimported.

To export data with SPARQL, it is possible to use queries in the public SPARQL interface.

10.1 Using SPARQL and Sheet2RDF to export and import data

Using the Sheet2RDF tool, content can be imported in VocBench from spreadsheet files following a specific template. This is useful when the work may be simpler to do in a spreadsheet rather than in VocBench, including when:

- an editor wants to add a large amount of translations to existing AGROVOC concepts;
- an editor works with subject matter experts, who would be willing to help contribute new concepts, or review a limited set of technical terminology in their area;
- a language needs bulk edits (like changing case) for a large number of concepts; and
- a non-editor partner wants to suggest new terms.

In all these cases, the missing terms for concepts in a specific language or set of topics can be extracted from AGROVOC using a SPARQL query. Then the new terms can be created in a spreadsheet, which can be imported into AGROVOC.
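As an illustration of extracting missing terms with SPARQL, the sketch below builds a query that lists concepts with an English preferred label but no preferred label in a target language, and then builds the request URL without sending it. It assumes that labels are modelled with SKOS-XL (skosxl:prefLabel and skosxl:literalForm); the endpoint address is the one listed for the AGROVOC SPARQL service in the bibliography and may have changed; the function names and the result limit are hypothetical.

```python
import re
from string import Template
from urllib.parse import urlencode

# Address as listed in the bibliography; treat it as an assumption.
ENDPOINT = "http://agrovoc.uniroma2.it/sparql"

QUERY_TEMPLATE = Template("""\
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX skosxl: <http://www.w3.org/2008/05/skos-xl#>
SELECT ?concept ?enLabel WHERE {
  ?concept a skos:Concept ;
           skosxl:prefLabel/skosxl:literalForm ?enLabel .
  FILTER(lang(?enLabel) = "en")
  FILTER NOT EXISTS {
    ?concept skosxl:prefLabel/skosxl:literalForm ?target .
    FILTER(lang(?target) = "$lang")
  }
}
LIMIT $limit
""")


def missing_label_query(lang, limit=100):
    """Build a query for concepts lacking a preferred label in `lang`."""
    if not re.fullmatch(r"[a-z]{2,3}(-[A-Za-z0-9]+)*", lang):
        raise ValueError(f"unexpected language tag: {lang!r}")
    return QUERY_TEMPLATE.substitute(lang=lang, limit=int(limit))


def request_url(query):
    """Encode the query as a GET request, as defined by the SPARQL protocol."""
    return ENDPOINT + "?" + urlencode({"query": query})


query = missing_label_query("zh")
print(query)
print(request_url(query))  # send with an "Accept: text/csv" header to get CSV
```

The CSV returned by such a request could then serve as the starting point for the spreadsheet in which translators fill in the missing terms.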

Only the project manager and administrator can work with Sheet2RDF; the editors are not allowed to bulk import data from CSV. It is best to discuss import requests with the administrator first to make sure that the file format is correct and that all the required information will be filled in; for example, the AGROVOC URI must be included together with language labels.

The recommended procedure is to suggest the new terms directly in VocBench, where validators will review the suggestion. New concepts can also be suggested by non-editors by sending an email to [email protected]. Generally, it is enough to provide the new concept label in English, with any available translations. A definition with source is required. For a new concept, it would also be useful to suggest the position in the hierarchy (the broader concept), alignments with concepts in other thesauri, and any other non-hierarchical relationships.

10.1.1 Importing translations

To import translations, it is important to make sure that the final file follows a predefined template which can be requested from the AGROVOC administrator, see Figure 60.

The final file should have only one preferred label per concept in the new language. Multiple alternative labels in one language are allowed, but these need to be in separate columns. It is also important that existing translations are not re-imported, so if the first type of download is used with existing translations, the final file should exclude the previously existing translations and only include the newly-translated terms. Once the file with translations or edits is ready, it can be imported in VocBench by the project manager using the VocBench Sheet2RDF tool.

To illustrate the process, a simple file will be used, which was extracted from a downloaded Chinese translation file with translated terms. The skosxl:prefLabel or skosxl:altLabel properties are used to add new labels, like skosxl:prefLabel@zh. When sending the file to AGROVOC, it is important to clearly indicate which labels are skosxl:prefLabel and which are skosxl:altLabel. Then the project manager will import the file saved as CSV using the VocBench Sheet2RDF tool, see Figures 61 and 62.
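Before a translation file is sent to the administrator, the rules above can be checked with a short script. The sketch below is only an illustration: the real template must be requested from the AGROVOC administrator, so the column layout used here (URI in the first column, one `skosxl:prefLabel@<lang>` column, any number of `skosxl:altLabel@<lang>` columns) is an assumption, as are the function name, the placeholder URIs and the placeholder labels.

```python
import csv
import io


def check_translation_file(csv_text, lang, already_translated):
    """Return a list of problems found in a translation spreadsheet saved as CSV."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header, data = rows[0], rows[1:]
    pref_header = f"skosxl:prefLabel@{lang}"
    pref_columns = [i for i, name in enumerate(header) if name.strip() == pref_header]
    if len(pref_columns) != 1:
        return [f"expected exactly one {pref_header} column, found {len(pref_columns)}"]
    pref_col = pref_columns[0]

    problems, seen = [], set()
    for line_no, row in enumerate(data, start=2):
        row = row + [""] * (len(header) - len(row))  # pad short rows
        uri, label = row[0].strip(), row[pref_col].strip()
        if not uri:
            problems.append(f"line {line_no}: missing AGROVOC URI")
            continue
        if label and uri in seen:
            problems.append(f"line {line_no}: second preferred label for {uri}")
        if label and uri in already_translated:
            problems.append(f"line {line_no}: {uri} already has a preferred label in {lang}")
        if label:
            seen.add(uri)
    return problems


# Placeholder URIs and labels, for illustration only.
SAMPLE = """uri,skosxl:prefLabel@zh,skosxl:altLabel@zh,skosxl:altLabel@zh
http://example.org/c_1,标签一,别名一,别名二
http://example.org/c_1,标签二,,
,标签三,,
http://example.org/c_2,标签四,,
"""

for problem in check_translation_file(SAMPLE, "zh", {"http://example.org/c_2"}):
    print(problem)
```

The set of already translated URIs would typically come from the same SPARQL export that was used to prepare the spreadsheet.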

Figure 60. Example of template for importing terms from an Excel file.
Source: FAO, 2020

Figure 61. Opening the Sheet2RDF tool in VocBench.
Source: FAO, 2020

Figure 62. CSV file loaded into VocBench.
Source: FAO, 2020

Figure 63. Configuring the subject header in Sheet2RDF.
Source: FAO, 2020

As shown in Figure 62, the skos:prefLabel@zh header is green, which means that VocBench recognizes the property. The next step is to indicate which header represents the subject by clicking on “Subject mapping”.

In the next screen, the administrator should select the AGROVOC URI header as the subject and Default converter under Converter and then click OK, see Figure 63.

Everything is now set for the import. The administrator should click on the “Generate Pearl” button first (arrow to the right of “Subject mapping”), then on the “Generate triples” button (first icon in the Pearl window). Moving the mouse pointer over a button will show the tooltip associated with it. The list of the triples that are going to be imported will be displayed, see Figure 64.

Finally, to import the translations, the administrator should click on the Add triples button in the top right corner of the bottom panel, i.e. the button to the left of “Export as”.

A message will indicate: “The generated triples have been added.” In the validation tab of VocBench, a single operation for Sheet2RDF/addTriples will now be pending review and action by the validator. The imported terms pending validation, such as “全地形车”, will be visible in VocBench, and spot-checks are recommended.
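To give an idea of what the generated triples look like, the sketch below turns spreadsheet rows into SKOS-XL statements: the concept points to a label resource through skosxl:prefLabel, and the label resource carries the text through skosxl:literalForm. This is not the PEARL script that Sheet2RDF generates; the column names, the placeholder concept URI and the way label URIs are minted here are assumptions made for illustration.

```python
import csv
import io


def rows_to_triples(csv_text, lang):
    """Preview the SKOS-XL triples a translation file would produce."""
    reader = csv.DictReader(io.StringIO(csv_text))
    column = f"skosxl:prefLabel@{lang}"
    triples = []
    for number, row in enumerate(reader, start=1):
        concept = row["uri"]
        # Hypothetical way of minting a URI for the label resource.
        label_node = f"{concept}#xl_{lang}_{number}"
        triples.append((concept, "skosxl:prefLabel", label_node))
        triples.append((label_node, "rdf:type", "skosxl:Label"))
        triples.append((label_node, "skosxl:literalForm", f'"{row[column]}"@{lang}'))
    return triples


# Placeholder URI; the label is the example term shown in the text.
SAMPLE = "uri,skosxl:prefLabel@zh\nhttp://example.org/c_1,全地形车\n"

for triple in rows_to_triples(SAMPLE, "zh"):
    print(triple)
```

Each spreadsheet row yields three statements in this sketch, which is why the triple preview is longer than the file that was loaded.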

Figure 64. Generating and executing script to create triples in Sheet2RDF.

Source: FAO, 2020

Glossary

Agrontology. Specific vocabulary of non-hierarchical relations developed for AGROVOC, grouped under skos:related.

Concept. Concepts may cover any subject: an animal, plant, geographical region, chemical element, technique, etc. Operationally, a concept is a set of terms used in any language to describe the same idea.
“intensive farming”*@en
“intensive agriculture”@en
“Explotación agrícola intensiva”@es
“agriculture intensive”@fr

Sibling concept. Concepts that have the same parent concept. Looking at “wood products” and “non-wood products”, these are sibling concepts, i.e. on the same level in the hierarchy with the shared parent concept “forest products”.

Hierarchical relations between concepts. Concepts are organized hierarchically by means of the relations skos:broader (BT) and its inverse skos:narrower (NT). The relation can be generic, between a category and its members.
“birds” skos:narrower “parrots”, where the biological order “parrots” is one of the members of the class “birds”
Another hierarchical relation is between the whole and its parts.
“blood vessels”* skos:narrower “blood veins”, “arteries”
In some cases, the relation is instantial, i.e. refers to a particular type of thing.
“mountain ranges”* skos:broader “”, “Apennines”

Non-preferred term. All the alternative terms to name a concept in any given language are called non-preferred terms.
“agrosilvicultural systems”@en, “farm forestry”@en are all non-preferred terms in English which are used for concept :c_207 (preferred term “agroforestry”*).

Parent concept. In the hierarchical structure, the more general concept is the parent concept. It is the object of the relation skos:broader.
“animals”@en, “animales”@es is the parent concept of “aquatic animals”@en, “animales acuáticos”@es

Preferred term. For each concept in each language, one term is preferred, representing a single concept. The decision about which term should be preferred usually depends on its domain and its accepted conventions.
“agroforestry”*@en is the preferred term in English for the concept :c_207.
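The glossary entries on hierarchical relations and sibling concepts can be tied together in a few lines: skos:narrower is the inverse of skos:broader, and siblings are the concepts that share a parent. The sketch below uses the labels from the examples above as stand-ins for concept URIs; the function names are hypothetical.

```python
def narrower_index(broader):
    """Invert a child -> parents mapping (skos:broader) into parent -> children."""
    narrower = {}
    for child, parents in broader.items():
        for parent in parents:
            narrower.setdefault(parent, set()).add(child)
    return narrower


def siblings(concept, broader):
    """Return the concepts that share at least one parent with `concept`."""
    narrower = narrower_index(broader)
    result = set()
    for parent in broader.get(concept, ()):
        result |= narrower[parent]
    result.discard(concept)
    return sorted(result)


broader = {
    "wood products": {"forest products"},
    "non-wood products": {"forest products"},
    "parrots": {"birds"},
}

print(narrower_index(broader)["birds"])
print(siblings("wood products", broader))
```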

Simple Knowledge Organization System (SKOS). SKOS is a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is part of the Semantic Web family of standards built upon RDF and RDFS, and its main objective is to enable easy publication and use of such vocabularies as linked data. Source: https://en.wikipedia.org/wiki/Simple_Knowledge_Organization_System

SKOS concept scheme. A SKOS concept scheme is an aggregation of one or more SKOS concepts. Source: https://www.w3.org/TR/skos-reference

Term. A term is a word or set of words used to name a concept in any given language.
“ذرة صفراء”@ar
“zrno kukuřice”@cs
“maize”@en
“Maíz”@es
“Mais”@it
“kukorica”@hu
“Jagung”@ms
“Kukurydza (ziarno)”@pl
“milho”@pt
“porumb”@ro
“kukurica siata”@sk
“玉米”@zh

URI - Uniform Resource Identifier. A URI is a string of characters used to identify a name or a resource on the Internet. The most common form of URI is the Web page address, which is a particular form or subset of URI called a Uniform Resource Locator (URL). In SKOS, concepts are formalized as skos:Concept and identified by dereferenceable URIs.
http://aims.fao.org/aos/agrovoc/c_12332 is the URI of the concept “maize”*@en, “corn (maize)”@en, “maïs”@fr, …

Bibliography

Apache Jena. 2020. SPARQL Tutorial [online]. [Cited 15 September 2020]. https://jena.apache.org/tutorials/sparql.html
Baker, T. & Keizer, J. 2010. Linked Data for fighting global hunger: experiences in setting standards for agricultural information management. In D. Wood, ed. Linking Enterprise Data, pp. 177–201. Springer. (also available at http://eprints.rclis.org/21107/).
BARTOC. 2019. KOS Types Vocabulary [online]. [Cited 15 September 2020]. https://bartoc.org/en/node/1665
BARTOC. 2020. BARTOC Skosmos Browser [online]. [Cited 15 September 2020]. http://bartoc-skosmos.unibas.ch/en/
Berners-Lee, T. 2006. Linked Data. In: W3C [online]. [Cited 7 October 2020]. https://www.w3.org/DesignIssues/LinkedData.html
Berners-Lee, T. 2012. 5-star Open Data [online]. [Cited 7 October 2020]. http://5stardata.info/en/
Creative Commons. 2020. Attribution 3.0 IGO (CC BY 3.0 IGO) [online]. [Cited 15 September 2020]. https://creativecommons.org/licenses/by/3.0/igo/
DCMI. 2020. Metadata Basics [online]. [Cited 15 September 2020]. https://dublincore.org/resources/metadata-basics/
European Commission. 2016. Guidelines on FAIR Data Management in Horizon 2020. [Cited 7 October 2020]. https://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
European Environment Agency. 2020. European Environment Agency SPARQL endpoint [online]. [Cited 15 September 2020]. https://data.europa.eu/euodp/en/data/dataset/european-environment-agency-sparql-endpoint
FAOLEX. 2020. FAOLEX Database [online]. [Cited 15 September 2020]. http://www.fao.org/faolex/en/
Finto. 2020. Skosmos API [online]. [Cited 15 September 2020]. http://api.finto.fi/doc/#/
FORCE11. 2014. Guiding Principles for Findable, Accessible, Interoperable and Re-usable Data Publishing version b1.0. In: FORCE11 [online]. [Cited 7 October 2020]. https://www.force11.org/fairprinciples
FORCE11. 2016. The FAIR Data Principles. In: FORCE11 [online]. [Cited 7 October 2020]. https://www.force11.org/group/fairgroup/fairprinciples
Gazan, R. undated. Controlled Vocabulary & Thesaurus Design: Instructor’s Manual. Library of Congress. https://www.loc.gov/catworkshop/courses/thesaurus/pdf/cont-vocab-thes-instr-manual.pdf
GO FAIR. 2020. I2: (Meta)data use vocabularies that follow the FAIR principles [online]. [Cited 15 September 2020]. https://www.go-fair.org/fair-principles/i2-metadata-use-vocabularies-follow-fair-principles/
International Organization for Standardization. 2011. ISO 25964, Thesauri and interoperability with other vocabularies [online]. Geneva. [Cited 20 October 2020]. http://www.niso.org/schemas/iso25964/#standard
Land Portal. 2020. Land Portal [online]. [Cited 15 September 2020]. https://landportal.org/sparql
Metadata 2020. 2020. Metadata 2020 Personas. In: Metadata 2020 [online]. [Cited 15 September 2020]. http://www.metadata2020.org/resources/metadata-personas/
Mons, B. 2020. Invest 5% of research funds in ensuring data are reusable. Nature, 578(7796): 491–491. https://doi.org/10.1038/d41586-020-00505-7
National Information Standards Organization (NISO). 2017. NISO TR-06-2017, Issues in Vocabulary Management: A Technical Report of the National Information Standards Organization. (also available at https://www.niso.org/publications/tr-06-2017-issues-vocabulary-management).
Open Knowledge Foundation. 2020a. Open Definition 2.1. In: Open Definition [online]. [Cited 7 October 2020]. https://opendefinition.org/od/2.1/en/
Open Knowledge Foundation. 2020b. What is Open Data? [online]. [Cited 7 October 2020]. http://opendatahandbook.org/guide/en/what-is-open-data/
Riley, J. 2017. Understanding metadata: What is metadata, and what is it for? NISO. http://www.niso.org/publications/understanding-metadata-riley
Skosmos. 2020. Skosmos [online]. [Cited 15 September 2020]. http://skosmos.org
Università degli Studi di Roma ‘Tor Vergata’. 2020. VocBench [online]. [Cited 15 September 2020]. http://vocbench.uniroma2.it/
W3C. 2004. OWL Web Ontology Language [online]. [Cited 15 September 2020]. https://www.w3.org/TR/owl-features/
W3C. 2008. SPARQL Query Language for RDF [online]. [Cited 15 September 2020]. https://www.w3.org/TR/rdf-sparql-query/
W3C. 2009a. SKOS Simple Knowledge Organization System eXtension for Labels (SKOS-XL) Namespace Document - HTML Variant [online]. [Cited 7 October 2020]. https://www.w3.org/TR/skos-reference/skos-xl.html
W3C. 2009b. SKOS Simple Knowledge Organization System Primer [online]. [Cited 16 October 2020]. https://www.w3.org/TR/skos-reference/skos-xl.html
W3C. 2011. Describing Linked Datasets with the VoID Vocabulary [online]. [Cited 15 September 2020]. https://www.w3.org/TR/void/
W3C. 2013a. Linked Data Glossary [online]. [Cited 15 October 2020]. https://www.w3.org/TR/skos-primer/
W3C. 2013b. SPARQL 1.1 Overview [online]. [Cited 15 September 2020]. https://www.w3.org/TR/sparql11-overview/
W3C. 2016. W3C Invites Implementations of its Data on the Web Best Practices | W3C News [online]. [Cited 7 October 2020]. https://www.w3.org/blog/news/archives/5760
Wikipedia. 2020. Stateless protocol. [Cited 25 October 2020]. https://en.wikipedia.org/w/index.php?title=Stateless_protocol&oldid=978370476
Wilkinson, M.D., Dumontier, M., Aalbersberg, Ij.J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L.B., Bourne, P.E., Bouwman, J., Brookes, A.J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C.T., Finkers, R., Gonzalez-Beltran, A., Gray, A.J.G., Groth, P., Goble, C., Grethe, J.S., Heringa, J., ’t Hoen, P.A.C., Hooft, R., Kuhn, T., Kok, R., Kok, J., Lusher, S.J., Martone, M.E., Mons, A., Packer, A.L., Persson, B., Rocca-Serra, P., Roos, M., van Schaik, R., Sansone, S.-A., Schultes, E., Sengstag, T., Slater, T., Strawn, G., Swertz, M.A., Thompson, M., van der Lei, J., van Mulligen, E., Velterop, J., Waagmeester, A., Wittenburg, P., Wolstencroft, K., Zhao, J. & Mons, B. 2016. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3(1): 160018. https://doi.org/10.1038/sdata.2016.18
Working Group on Guidelines for Multilingual Thesauri. 2009. Guidelines for Multilingual Thesauri. IFLA Professional Reports No. 115. The Hague (Netherlands), IFLA, International Federation of Library Associations and Institutions. (also available at https://www.ifla.org/files/assets/hq/publications/professional-report/115.pdf).
Zeng, M. & Žumer, M. 2019. Networked Knowledge Organization Systems Dublin Core Application Profile (NKOS AP) [online]. [Cited 15 September 2020]. https://nkos.slis.kent.edu/nkos-ap.html
Zeng, M.L. 2008. Knowledge Organization Systems (KOS). Knowledge Organization, 35(2–3): 160–182. https://doi.org/10.5771/0943-7444-2008-2-3-160
Zeng, M.L. undated. Metadata development overview [online]. [Cited 15 September 2020]. https://marciazeng.slis.kent.edu/metadatabasics/overview.htm

Further AGROVOC online resources

AGROVOC. 2020. Agrontology [online]. [Cited 15 September 2020]. http://www.fao.org/agrovoc/agrontology
AGROVOC. 2020. AGROVOC [online]. [Cited 2 December 2020]. http://www.fao.org/agrovoc/
AGROVOC. 2020. AGROVOC VoID [online]. [Cited 15 September 2020]. http://aims.fao.org/aos/agrovoc/void.ttl
AGROVOC. 2020. Data services [online]. [Cited 2 December 2020]. http://www.fao.org/agrovoc/machine-use
AGROVOC. 2020. Linked data [online]. [Cited 2 December 2020]. http://www.fao.org/agrovoc/linked-data
AGROVOC. 2020. Multischeme and multihierarchy management [online]. [Cited 15 September 2020]. http://www.fao.org/agrovoc/multischeme-and-multihierarchy-management
AGROVOC. 2020. Skosmos browser [online]. [Cited 1 December 2020]. http://www.fao.org/agrovoc/search
AGROVOC. 2020. SPARQL service [online]. [Cited 15 September 2020]. http://agrovoc.uniroma2.it/sparql
AGROVOC. 2020. The AGROVOC Editorial Community: Report on developments and achievements 2017–2020 [online]. [Cited 1 December 2020]. In press
AGROVOC. 2020. The AGROVOC editorial guidelines [online]. [Cited 1 December 2020]. In press
AGROVOC. 2020. The editorial community [online]. [Cited 15 September 2020]. http://www.fao.org/agrovoc/agrovoc-community-editors
AGROVOC. 2020. The linked data concept hub for food and agriculture [online]. [Cited 1 December 2020]. http://www.fao.org/publications/card/en/c/CB1200EN
ASFA. 2020. Skosmos browser [online]. [Cited 1 December 2020]. http://agrovoc.uniroma2.it/skosmosAsfa/asfa/en/
FAO. 2020. Evolution of AGROVOC since 2017 [video]. http://www.fao.org/agrovoc/webinars/agrovocwebinar-evolution-agrovoc-2017
FAO. 2020. Showcase from national editors and expert community partners [video]. http://www.fao.org/agrovoc/news/3rd-agrovoc-editorial-communityworkshop
FAO. 2020. Strengthening AGROVOC through engagement with expert communities [video]. http://www.fao.org/agrovoc/news/webinar-strengthening-agrovocthrough-engagement-expert-communities
Land Portal. 2020. LandVoc: the Linked Land Governance Thesaurus [online]. [Cited 1 December 2020]. https://www.landvoc.org/
Land Portal. 2020. LandVoc Concepts [online]. [Cited 1 December 2020]. https://landportal.org/voc/landvoc/concept
Subirats, I. & Zeng, M. 2020. Linked Open Data Enabled Bibliographical Data (LODE-BD) 3.0: A practical guide on how to select appropriate encoding strategies for producing Linked Open Data Enabled Bibliographical Data [online]. [Cited 1 December 2020]. In press
Subirats, I., Kolshus, K., Turbati, A., S., Mietzsch, E., Martini, D., & Zeng, M. 2020. AGROVOC: The Linked Data Concept Hub for Food and Agriculture, accepted at the Computers and Electronics in Agriculture [online]. [Cited 1 December 2020]. In press

Annex 1. List of AGROVOC editorial institutions 2020

Agroinstitut Nitra, Slovak Republic
Aquatic Sciences and Fisheries Abstracts (ASFA)
Belarus Agricultural Library, National Academy of Sciences, Belarus
Biblioteca Storica Nazionale dell’Agricoltura, Italy
BonaRes Data Centre and Leibniz Centre for Agricultural Landscape Research (ZALF), Germany
Central Scientific Agricultural Library, Russian Federation
Centre de coopération internationale en recherche agronomique pour le développement (CIRAD)
Chinese Academy of Agricultural Sciences, China
Czech General Institute of Agricultural Economics and Information, Czech Republic
Ministry of Agriculture and Forestry, Department of Training and Publication, Turkey
Empresa Brasileira de Pesquisa Agropecuária (Embrapa), Brazil
GAK Education, Research and Innovation Nonprofit Co, Szent Istvan University, Hungary
Monitoring, Evaluation and Learning Team (MEL), International Center for Agricultural Research in the Dry Areas (ICARDA)
Institute Techinformi of the Georgian Technical University, Georgia
Iranian Fisheries Science Research Institute, Iran
Kuratorium für Technik und Bauwesen in der Landwirtschaft e. V. (KTBL) and Leibniz Informationszentrum Lebenswissenschaften
Land Portal Foundation (LPF)
Matica Srpska Library, Serbia
National Agriculture and Forestry Research Institute, Lao People’s Democratic Republic
Norwegian University of Life Sciences (NMBU), Norway
The Republican Scientific Agricultural Library of the State Agrarian University of Moldova, The Republic of Moldova
Ukrainian Institute of Scientific and Technical Expertise and Information, Ukraine
Food and Agriculture Organization of the United Nations (FAO)
  - AGRIS, the International System for Agricultural Science and Technology
  - FAOLEX
  - Technologies and Practices for Small Agricultural Producers (TECA)
Thai National AGRIS Centre, Kasetsart University, Thailand

Annex 2. AGROVOC 25 top concepts

Activities: This contains activities that are conducted along the food supply chain, like “breeding”, “feeding”, “surveying”, “cleaning”, “transport”. Included here are also higher-level management activities like “accounting” and “planning”, activities and nutritional topics like “weight reduction” as well as activities that are more loosely related to agriculture and food or rural areas like “cartography”, “computer programming” or “recreation”.

Entities: Entities are broadly defined as “something which is distinct and separate from something else.” These include narrower concepts like “agencies”, “labels”, “networks”, and “policies”.

Events: Events in this context are defined as something taking place at a certain point in time and involving the participation of people, so the tree includes concepts like “exhibitions” and “training courses”.

Factors: In agricultural research and publications, the term “factors” is frequently used in a number of rather common word combinations. These common combinations are reflected in the narrower concepts to be found here, e.g. “abiotic factors”, “biotic factors”, “environmental factors” or “production factors”.

Features: This relates to the feature concept from geosciences and genetics and contains narrower concepts such as “genomic features”, “physiographic features” and “soil morphological features”.

Groups: Groups are defined as “a number of individual items or people brought together.” Narrower concepts like “engineers”, “librarians” but also societal groups like “consumers” and “interest groups” can be found here.

Location: A location is “a point or extent in space” and thus holds concepts like “climatic zones”, “maritime zones”, “protected areas” and “urban areas”.

Measure: While a measure can also denote an action taken, in this context it is clearly defined as something that can be observed and involves a measurement: “Number or quantity that records a directly observable value or performance. All measures have a unit attached to them: inch, centimetre, dollar, litre, etc.” Examples of narrower concepts are: “altitude”, “breeding value”, “humidity”, “price indices”, and “soil water potential”.

Methods: Methods describe ways of doing things, either in agricultural research or in production but also in everyday life. They are like recipes; notably, “cooking methods” is a narrower concept of the methods top concept. Other examples include “autoclaving”, “irrigation methods”, “sampling”, “statistical methods” and “survey methods”.

Objects: Objects in this context include human-made, tangible things like “equipment” and “furniture”.

Organisms: The organisms tree is one of the largest subtrees in AGROVOC and contains the taxonomic trees of organisms relevant to agriculture under concepts like “Eukaryota” and “Prokaryotae”, as well as common organism classes like “plants” and “animals”, but also roles that an organism can hold like “hosts”, “pests” or “predators”. Concepts for organisms that live in a certain habitat like “aquatic organisms” or “soil organisms” are also available.

Phenomena: In scientific usage, a phenomenon is any event that is observable, however common it might be, even if it requires the use of instrumentation to observe, record, or compile data concerning it. In natural sciences, a phenomenon is an observable happening or event. This tree contains concepts like “deficiencies”, “economic phenomena”, “hazards”, “population dynamics” and “trends”.

Processes: A process is a set of interrelated or interacting activities which transforms inputs into outputs. Examples of narrower concepts of processes include: “anthropogenic changes”, “biological processes”, “evolution”, “inhibition”, “physiological processes” and “synthesis”.

Products: In the context of AGROVOC, these concepts are mostly confined to products and product classes originating from agricultural supply chains, like “animal products”, “feeds”, “foods” or “oil products”. Raw materials or product properties are also represented by concepts such as “resins”, “forest products”, “biodegradable products” and “sustainable products”.

Properties: A property is a characteristic or quality that can be owned or possessed, which serves to define or describe its possessor. This tree contains numerous narrower concepts of differing granularity, e.g. “age”, “colour-fastness”, “periodicity”, “soil properties”, “toxicity” and “wind direction”.

Resources: Resources are things that are used during a production process or that are required to cover human needs in everyday life. Concepts like “economic resources”, “inputs” and “raw materials” would refer to the former category. The latter category is covered by more abstract resources like “cultural heritage” or “natural resources”.

Site: Sites contain narrower concepts that serve to describe locations and facilities that are set up by humans for a certain purpose like “hospitals”, “laboratories”, “meteorological stations”, “restaurants” and “timber yards”.

Stages: Stages has only a few narrower concepts: “developmental stages” and “life cycle”. The former concept, however, is highly branched, containing plant and animal development stages like “embryo stage”, “reproductive stage” etc.

State: A state is any condition that a physical substance or organism can be in. Some narrower concepts are: “anoxia”, “colloidal state”, “employment”, “physical states”, and “sleep”.

Strategies: Strategies describe options for action and include communication, rural development and training strategies, as well as “approaches”.

Subjects: Subjects are disciplines of study or topics relevant to agriculture and nutrition and include “cartography”, “humanities” and “sciences”.

Substances: Substances is a broad subtree providing hierarchies for chemical substances according to physical properties like “ceramics”, “explosives”, “oils” or “solutes” but also according to their role or function like “attractants”, “culture media”, “drugs” or “soil amendments”, and their source or place of origin like “exudates”, “filter cakes” or “sediment”.

Systems: The systems top concept contains a wide range of concepts for systems of human action, interaction and thought (“economic systems”, “political systems”, “value systems”), production and supply (“distribution systems”, “drinking water systems”, “agroforestry systems”), technological systems (“information systems”, “photovoltaic systems”, “surveillance systems”), as well as systematic and organizational approaches from science (“knowledge organization system”, “terminology”).

Technology: This includes concepts for technological developments and inventions that are applied in modern agricultural and food systems: “biotechnology”, “food technology”, “information and communication technologies”, “seed technology”, “wood technology” and so on.

Time: This contains concepts that describe timespans with a certain function, e.g. “free time”, “seasons”, “times of the day” and “working hours”. Timespans relevant to agricultural production are mostly aggregated in the “timing” concept.
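
For readers who handle the thesaurus programmatically, the grouping described in this annex can be pictured as a mapping from each top concept to its narrower concepts. The following is a minimal Python sketch under that assumption. It holds only a handful of labels quoted in this annex, and the names TOP_CONCEPTS and top_concepts_of are hypothetical illustrations, not part of any AGROVOC tooling or data model.

```python
# Illustrative subset only: a few top concepts with narrower concepts
# quoted in Annex 2. Names and structure are assumptions for this sketch.
TOP_CONCEPTS = {
    "activities": ["breeding", "feeding", "cartography"],
    "subjects": ["cartography", "humanities", "sciences"],
    "methods": ["cooking methods", "sampling", "statistical methods"],
    "time": ["seasons", "working hours"],
}


def top_concepts_of(label):
    """Return, sorted by name, every top concept that lists `label`."""
    return sorted(
        top for top, narrower in TOP_CONCEPTS.items() if label in narrower
    )


# "cartography" is mentioned under both Activities and Subjects in the annex,
# so a lookup should allow a label to sit under more than one top concept.
print(top_concepts_of("cartography"))  # ['activities', 'subjects']
print(top_concepts_of("sampling"))     # ['methods']
```

A list is returned rather than a single value because, as the “cartography” example suggests, a concept may be reachable from more than one top concept.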

ISBN 978-92-5-133831-5
