Sissvoc: a Linked Data API for Access to SKOS Vocabularies

SISSVoc: A Linked Data API for access to SKOS vocabularies Editor(s): Krzysztof Janowicz, University of Santa Barbara, California Solicited review(s): Werner Kuhn, University of California, Santa Barbara, USA; Anusuriya Devaraju, Agrosphere Institute, For- schungszentrum Jülich, Germany; Antoine Isaac, Vrije Universiteit Amsterdam, The Netherlands; Alejandro Llaves, Facultad de Informática Boadilla del Monte, Madrid, Spain Simon J D Coxa*, Jonathan Yua and Terry Rankineb a CSIRO Land and Water, PO Box 56, Highett, Vic. 3190 Australia b CSIRO Mineral Resources, PO Box 1130, Bentley WA, 6102 Australia {simon.cox|jonathan.yu|terry.rankine}@csiro.au Abstract. The Spatial Information Services Stack Vocabulary Service (SISSVoc) is a Linked Data API for accessing published vocabularies. SISSVoc provides a RESTful interface via a set of URI patterns that are aligned with SKOS. These provide a standard web interface for any vocabulary which uses SKOS classes and properties. The SISSVoc implementation provides web pages for human users, and machine-readable resources for client applications (in RDF, JSON, and XML). SIS- SVoc is implemented using a Linked Data API façade over a SPARQL endpoint. This approach streamlines the configuration of content negotiation, styling, query construction and dispatching. SISSVoc is being used in a number of projects, mainly in the environmental sciences, where controlled vocabularies are used to support cross-domain and interdisciplinary interopera- bility. SISSVoc simplifies access to vocabularies for end users, and provides a web API to support vocabulary applications. Keywords: Vocabulary, SKOS, API, Linked data 1. Introduction bases and spreadsheets, text documents, page and image formats. Some of the most fundamental vo- Controlled vocabularies are a key element of many cabularies are made available on the web by their classification systems. They are typically published official custodian only as browser pages or PDFs for by specific organisations, domains, or communities download (e.g SI units of measure1, geologic time- of practice. The web has encouraged and enabled scale2). consolidation of vocabulary use, such that common The emergence of Semantic Web technologies has vocabularies are now more likely to be maintained provided some powerful tools for formalizing defini- and published at a community level than only within tions, vocabularies, and ontologies, in forms that also an agency or project team, thus improving interoper- support reasoning and inferencing. In this context, ability of scientific datasets. Examples include chem- the Simple Knowledge Organization System (SKOS) ical entities [13,21], bio-medical terminology [38,49], [1,26] was designed to allow easy formalization of environmental science topic or subject headings existing multilingual vocabularies that have flat or [15,27,31] and geological classifications (see compi- hierarchical structures, to smooth the transition to- lation at [30]). Vocabularies such as EuroVoc [33] wards the richer logic-based tools from ontology and the International Chronostratigraphic Chart [6] modelling. have well-defined governance and authority, i.e. the SKOS provides a standard vocabulary for repre- Publications Office of the European Union, and the senting thesauri, classifications, taxonomies and con- International Commission for Stratigraphy, respec- trolled vocabularies, using RDF. SKOS has a simple tively. model with few key constructs, focusing on labeling While many vocabularies are openly available on the web, they are formalized and published in a va- 1 http://www.bipm.org/en/si/base_units/ , riety of generally incompatible ways, including data- http://www.bipm.org/en/si/si_brochure/ 2 http://stratigraphy.org/index.php/ics-chart-timescale and basic hierarchies. While it lacks the expressivity Linked Data approaches, such as Semantic Technol- and rigour of languages such as OWL, its simplicity ogies for Archaeological Resources (STAR) Project’s allows a broad range of vocabularies and classifiers semantic terminology services3, Library of Congress to be ported from a diverse set of formats to RDF, Authorities and Vocabularies service 4 , and the promoting ease of sharing and cross-linking between Coastal and Marine Spatial Planning Vocabularies vocabularies. Many existing vocabularies have been (CMSPV) SKOS API5 [45]. However, each service ported to SKOS [25] including large vocabularies has a different interface to access the content. Tech- such as AGROVOC [34] and the Library of Congress nologies such as Pubby [10], D2R server [3] and Ep- Subject Headers (LCSH) [44]. SKOS is now one of imorphics Linked Data API Implementation (ELDA) the most commonly used vocabularies for structured [17] are available for publishing RDF resources as data on the web [25]. Linked Data, but there is no standard pattern for ac- Vocabularies are likely to be adopted and shared if cess to SKOS vocabulary resources. The fundamental they are made available easily. Nevertheless, despite issue is that RESTful approaches rely only on URIs, successes in the use of SKOS for encoding vocabu- HTTP, and content-types [18,37], yet SKOS is not laries, current standards provide only low-level inter- recognised as a ‘content-type’ in this context. faces to vocabulary data. For example, many vocabu- In this paper, we describe a standard interface laries are published as an RDF document for down- called SISSVoc through which SKOS vocabularies load. However, if the vocabulary is large then the can be provided to web users. SISSVoc provides a download will be commensurately large, and if the level of abstraction for the end users corresponding user only wants to retrieve a single vocabulary term to the SKOS content model, supporting access to or select a few terms, this option requires processing vocabularies without specific knowledge of the un- on the client side. Alternatively, access to vocabular- derlying technologies and semantic web languages ies is often provided at a SPARQL endpoint. used, such as SPARQL endpoints and queries, SKOS SPARQL [20,32] is the generic RDF query language. and RDF. A human interface in the form of web pag- While this is powerful, it is a low-level language sim- es and forms is provided when HTML is requested. ilar to the relational database query language SQL SISSVoc also allows for machine-to-machine use, so and normally is only used by database administrators. that data providers can use HTTP links to vocabular- Some SKOS vocabularies are published via other ies, data applications can be configured with standard HTTP interfaces. However, each implementation terminology, and data clients can retrieve definitions uses different protocols and supports a varied set of or verify the existence of items claimed to be in par- features e.g. content-negotiation provided by the ticular vocabularies. GEMET [16] REST interface, and NERC Data SISSVoc is a key component of the Spatial Infor- Grid’s Vocabulary Server [23,29] SOAP interface. In mation Services Stack (SISS) developed by CSIRO some cases, one or both of human-readable formats through the AuScope project [47]. SISSVoc v1 and and machine-readable formats is not available. Thus, v3 have been briefly introduced previously [8,19]. In discovery and access across vocabulary endpoints this paper, we present the SISSVoc v3 design in de- becomes challenging and ad-hoc. tail and describe the current implementation. We There is a clear opportunity here, to design an API point to its use in some environmental domains, and to match the SKOS vocabulary, taking advantage of some client applications built on SISSVoc. We eval- the fact that much modern vocabulary content is uate SISSVoc in terms of the URI design, and com- structured using SKOS classes and predicates. This pare it with some other products with similar scope. API can then be used as the basis for various higher level vocabulary applications. Linked Data has been proposed as a means of publishing and interlinking structured data on the web. Linked Data proposes the use of RDF for describing structured data and allows relationships and links between resources to be defined [4]. This allows both human-readable and machine-readable con- 3 tent/interaction to access data resources and their http://hypermedia.research.southwales.ac.uk/ descriptive metadata using existing web technologies resources/terminology/ 4 simply by dereferencing HTTP URIs. A number of http://id.loc.gov/search/ 5 SKOS vocabulary services are available that utilise http://tw.rpi.edu/web/project/CMSPV/KeyConcepts 2. Different vocabulary interfaces for different RESTful web services [37] and Linked Data [2,4] users principles. Standard operations are thus defined as a set of URI patterns. The patterns use the SKOS vo- Using standard semantic web technologies, a vocabulary, so as to facilitate discovery and access to cabulary can be usefully published through at least resources formalized using SKOS. four distinct interfaces: 1. The complete vocabulary formalized in RDF, 3. Design and implementation formatted using one of the standard RDF seriali- zations (RDF/XML, Turtle) and bundled as a In this section we first provide a detailed descrip- single document (file), delivered from the "On- tion of the SISSVoc API, then summarize an imple- tology URI". This is for users and services who mentation of SISSVoc based on configuration of a wish to harvest the whole vocabulary in one Linked Data API implementation, and a typical de- transaction, for local processing. For

Sissvoc: a Linked Data API for Access to SKOS Vocabularies

A Comparative Evaluation of Geospatial Semantic Web Frameworks for Cultural Heritage

V a Lida T in G R D F Da

Design and Analysis of a Query Processor for Brick

Diplomová Práce Prostředky Sémantického Webu V Uživatelském

SWAD-Europe Deliverable 3.11: Developer Workshop Report 4 - Workshop on Semantic Web Storage and Retrieval

Performance of RDF Library of Java, C# and Python on Large RDF Models

Inscribing the Blockchain with Digital Signatures of Signed RDF Graphs. a Worked Example of a Thought Experiment

Wiki S Podporou Zpracování Sémantiky

Introduction to Linked Open Data Github.Com/Cmh2166/Swib18lodintro Simeon Warner, Cornell University Christina Harlow, Stanford University

Semantic Web and Python Concepts to Application Development

Design of an Effective Ontology and Query Processor Enabling Portable Building Applications

RDF in the JRS Server