Linked Data Patterns (PDF)

Total Page:16

File Type:pdf, Size:1020Kb

Linked Data Patterns (PDF) Linked Data Patterns A pattern catalogue for modelling, publishing, and consuming Linked Data Leigh Dodds Ian Davis Linked Data Patterns: A pattern catalogue for modelling, publishing, and consuming Linked Data Leigh Dodds Ian Davis Publication date 2012-05-31 Abstract This book lives at http://patterns.dataincubator.org. Check that website for the latest version. The book is also available as both a PDF [http://patterns.dataincubator.org/book/linked-data-patterns.pdf] and EPUB [http://patterns.dataincubator.org/book/linked-data-patterns.epub] download. This work is licenced under the Creative Commons Attribution 2.0 UK: England & Wales License. To view a copy of this licence, visit http://creativecommons.org/licenses/by/2.0/uk/. Thanks to members of the Linked Data mailing list [http://lists.w3.org/Archives/Public/public-lod/] for their feedback and input, and Sean Hannan [http://twitter.com/MrDys/] for contributing some CSS to style the online book. Table of Contents 1. Introduction ................................................................................................................... 1 Overview .................................................................................................................. 1 2. Identifier Patterns ............................................................................................................ 4 Hierarchical URIs ....................................................................................................... 4 Literal Keys .............................................................................................................. 5 Natural Keys ............................................................................................................. 6 Patterned URIs ........................................................................................................... 6 Proxy URIs ............................................................................................................... 7 Rebased URI ............................................................................................................. 8 Shared Keys .............................................................................................................. 9 URL Slug ................................................................................................................ 10 3. Modelling Patterns ........................................................................................................ 12 Custom Datatype ...................................................................................................... 13 Index Resources ....................................................................................................... 14 Label Everything ...................................................................................................... 15 Link Not Label ......................................................................................................... 16 Multi-Lingual Literal ................................................................................................. 18 N-Ary Relation ......................................................................................................... 19 Ordered List ............................................................................................................ 20 Ordering Relation ..................................................................................................... 21 Preferred Label ......................................................................................................... 22 Qualified Relation ..................................................................................................... 23 Reified Statement ..................................................................................................... 25 Repeated Property ..................................................................................................... 26 Topic Relation .......................................................................................................... 27 Typed Literal ........................................................................................................... 28 4. Publishing Patterns ........................................................................................................ 30 Annotation ............................................................................................................... 30 Autodiscovery .......................................................................................................... 31 Dataset Autodiscovery ............................................................................................... 32 Document Type ........................................................................................................ 33 Edit Trail ................................................................................................................ 34 Embedded Metadata .................................................................................................. 35 Equivalence Links ..................................................................................................... 36 Link Base ................................................................................................................ 37 Materialize Inferences ................................................................................................ 38 Primary Topic Autodiscovery ...................................................................................... 39 Progressive Enrichment .............................................................................................. 40 See Also ................................................................................................................. 41 Unpublish ................................................................................................................ 41 5. Data Management Patterns .............................................................................................. 44 Graph Annotation ..................................................................................................... 44 Graph Per Aspect ..................................................................................................... 47 Graph Per Resource .................................................................................................. 49 Graph Per Source ..................................................................................................... 51 Named Graph ........................................................................................................... 53 Union Graph ............................................................................................................ 54 6. Application Patterns ....................................................................................................... 57 Assertion Query ........................................................................................................ 57 Blackboard .............................................................................................................. 58 Bounded Description ................................................................................................. 59 iii Linked Data Patterns Composite Descriptions ............................................................................................. 60 Follow Your Nose .................................................................................................... 61 Missing Isn't Broken ................................................................................................. 62 Named Query ........................................................................................................... 64 Parallel Loading ....................................................................................................... 65 Parallel Retrieval ...................................................................................................... 66 Parameterised Query ................................................................................................. 66 Resource Caching ..................................................................................................... 68 Schema Annotation ................................................................................................... 68 Smushing ................................................................................................................ 69 Transformation Query ................................................................................................ 72 URI Resolver ........................................................................................................... 73 iv Chapter 1. Introduction Abstract There are many ways to help spread the adoption of a technology, and to share skills and experience amongst a community of practitioners. Different approaches work well for communicating different kinds of knowledge. And we all individually have a preferred means of acquiring new skills, or getting help with a specific problem. Reference manuals, tutorials, recipes, best practice guides and experience reports all have their role. As do training courses, mentoring, pair programming and code reviews. This book attempts to add to the steadily growing canon of reference documentation relating to Linked Data. Linked Data is a means of publishing "web-native" data using standards like HTTP, URIs and RDF. The book adopts a tried and
Recommended publications
  • Provenance and Annotations for Linked Data
    Proc. Int’l Conf. on Dublin Core and Metadata Applications 2013 Provenance and Annotations for Linked Data Kai Eckert University of Mannheim, Germany [email protected] Abstract Provenance tracking for Linked Data requires the identification of Linked Data resources. Annotating Linked Data on the level of single statements requires the identification of these statements. The concept of a Provenance Context is introduced as the basis for a consistent data model for Linked Data that incorporates current best-practices and creates identity for every published Linked Dataset. A comparison of this model with the Dublin Core Abstract Model is provided to gain further understanding, how Linked Data affects the traditional view on metadata and to what extent our approach could help to mediate. Finally, a linking mechanism based on RDF reification is developed to annotate single statements within a Provenance Context. Keywords: Provenance; Annotations; RDF; Linked Data; DCAM; DM2E; 1. Introduction This paper addresses two challenges faced by many Linked Data applications: How to provide, access, and use provenance information about the data; and how to enable data annotations, i.e., further statements about the data, subsets of the data, or even single statements. Both challenges are related as both require the existence of identifiers for the data. We use the Linked Data infrastructure that is currently developed in the DM2E project as an example with typical use- cases and resulting requirements. 1.1. Foundations Linked Data, the publication of data on the Web, enables easy access to data and supports the reuse of data. The Hypertext Transfer Protocol (HTTP) is used to access a Uniform Resource Identifier (URI) and to retrieve data about the resource.
    [Show full text]
  • Semantics Developer's Guide
    MarkLogic Server Semantic Graph Developer’s Guide 2 MarkLogic 10 May, 2019 Last Revised: 10.0-8, October, 2021 Copyright © 2021 MarkLogic Corporation. All rights reserved. MarkLogic Server MarkLogic 10—May, 2019 Semantic Graph Developer’s Guide—Page 2 MarkLogic Server Table of Contents Table of Contents Semantic Graph Developer’s Guide 1.0 Introduction to Semantic Graphs in MarkLogic ..........................................11 1.1 Terminology ..........................................................................................................12 1.2 Linked Open Data .................................................................................................13 1.3 RDF Implementation in MarkLogic .....................................................................14 1.3.1 Using RDF in MarkLogic .........................................................................15 1.3.1.1 Storing RDF Triples in MarkLogic ...........................................17 1.3.1.2 Querying Triples .......................................................................18 1.3.2 RDF Data Model .......................................................................................20 1.3.3 Blank Node Identifiers ..............................................................................21 1.3.4 RDF Datatypes ..........................................................................................21 1.3.5 IRIs and Prefixes .......................................................................................22 1.3.5.1 IRIs ............................................................................................22
    [Show full text]
  • Dynamic Provenance for SPARQL Updates Using Named Graphs
    Dynamic Provenance for SPARQL Updates using Named Graphs Harry Halpin James Cheney World Wide Web Consortium University of Edinburgh [email protected] [email protected] Abstract nance for collections or other structured artifacts. The workflow community has largely focused on static prove- The (Semantic) Web currently does not have an offi- nance for atomic artifacts, whereas much of the work on cial or de facto standard way exhibit provenance infor- provenance in databases has focused on static or dynamic mation. While some provenance models and annota- provenance for collections (e.g. tuples in relations). tion techniques originally developed with databases or The workflow community has largely focused on workflows in mind transfer readily to RDF, RDFS and declaratively describing causality or derivation steps of SPARQL, these techniques do not readily adapt to de- processes to aid repeatability for scientific experiments, scribing changes in dynamic RDF datasets over time. and these requirements have been a key motivation for Named graphs have been introduced to RDF motivated the Open Provenance Model (OPM) [12, 11], a vocab- as a way of grouping triples in order to facilitate anno- ulary and data model for describing processes including tation, provenance and other descriptive metadata. Al- (but certainly not limited to) runs of scientific workflows. though their semantics is not yet officially part of RDF, While important in its own right, and likely to form the there appears to be a consensus based on their syntax basis for a W3C standard interchange format for prove- and semantics in SPARQL queries. Meanwhile, up- nance, OPM does not address the semantics of the prove- dates are being introduced as part of the next version nance represented in it — one could in principle use it ei- of SPARQL.
    [Show full text]
  • Name That Graph Or the Need to Provide a Model and Syntax Extension to Specify the Provenance of RDF Graphs Fabien Gandon, Olivier Corby
    Name That Graph or the need to provide a model and syntax extension to specify the provenance of RDF graphs Fabien Gandon, Olivier Corby To cite this version: Fabien Gandon, Olivier Corby. Name That Graph or the need to provide a model and syntax extension to specify the provenance of RDF graphs. W3C Workshop - RDF Next Steps, W3C, Jun 2010, Palo Alto, United States. hal-01170906 HAL Id: hal-01170906 https://hal.inria.fr/hal-01170906 Submitted on 2 Jul 2015 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Name That Graph or the need to provide a model and syntax extension to specify the provenance of RDF graphs. Fabien Gandon and Olivier Corby, INRIA, W3C Member ABSTRACT When querying or reasoning on metadata from the semantic web, the source of this metadata as well as a number of other characteristics (date, trust, etc.) can be of great importance. While the SPARQL query language provides a keyword to match patterns against named graphs, the RDF data model focuses on expressing triples. In many cases it is interesting to augment these RDF triples with the notion of a provenance for each triple (or set of triples), typically an IRI specifying their real (e.g.
    [Show full text]
  • Efficient and Flexible Access to Datasets on the Semantic
    Semantic Sitemaps: Efficient and Flexible Access to Datasets on the Semantic Web Richard Cyganiak, Holger Stenzhorn, Renaud Delbru, Stefan Decker, and Giovanni Tummarello Digital Enterprise Research Institute (DERI), National University Ireland, Galway Abstract. Increasing amounts of RDF data are available on the Web for consumption by Semantic Web browsers and indexing by Semantic Web search engines. Current Semantic Web publishing practices, however, do not directly support efficient discovery and high-performance retrieval by clients and search engines. We propose an extension to the Sitemaps protocol which provides a simple and effective solution: Data publishers create Semantic Sitemaps to announce and describe their data so that clients can choose the most appropriate access method. We show how this protocol enables an extended notion of authoritative information across different access methods. Keywords: Sitemaps, Datasets, RDF Publishing, Crawling, Web, Search, Linked Data, SPARQL. 1 Introduction Data on the Semantic Web can be made available and consumed in many dif- ferent ways. For example, an online database might be published as one single RDF dump. Alternatively, the recently proposed Linked Data paradigm is based on using resolvable URIs as identifiers to offer access to individual descriptions of resources within a database by simply resolving the address of the resources itself [1]. Other databases might offer access to its data via a SPARQL endpoint that allows clients to submit queries using the SPARQL RDF query language and protocol. If several of these options are offered simultaneously for the same database, the choice of access method can have significant effects on the amount of networking and computing resources consumed on the client and the server side.
    [Show full text]
  • Named Graphs, Provenance and Trust
    Named Graphs, Provenance and Trust Jeremy J. Carroll1, Christian Bizer2, Patrick Hayes3, and Patrick Stickler4 1 Hewlett-Packard Labs,Bristol, UK 2 Freie Universitat¨ Berlin, Germany 3 IHMC, Florida, USA 4 Nokia, Tampere, Finland Abstract. The Semantic Web consists of many RDF graphs named by URIs. This paper extends the syntax and semantics of RDF to cover such collections of Named Graphs. This enables RDF statements that describe graphs, which is ben- eficial in many Semantic Web application areas. As a case study, we explore the application area of Semantic Web publishing: Named Graphs allow publishers to communicate assertional intent, and to sign their graphs; information consumers can evaluate specific graphs using task-specific trust policies, and act on informa- tion from those named graphs that they accept. Graphs are trusted depending on: their content; information about the graph; and the task the user is performing. The extension of RDF to Named Graphs provides a formally defined framework which could be used as a foundation for the Semantic Web trust layer. 1 Introduction A simplified view of the Semantic Web is a collection of web retrievable RDF doc- uments, each containing an RDF graph. The RDF Recommendation [17, 25, 4, 9], ex- plains the meaning of any one graph, and how to merge a set of graphs into one, but does not provide suitable mechanisms for talking about graphs or relations between graphs. The ability to express metainformation about graphs is required in many application areas where RDF is used for: Data syndication systems need to keep track of provenance information, and prove- nance chains.
    [Show full text]
  • Rule-Based Programming of User Agents for Linked Data
    Rule-based Programming of User Agents for Linked Data Tobias Käfer Andreas Harth Karlsruhe Institute of Technology (KIT) Karlsruhe Institute of Technology (KIT) Institute AIFB Institute AIFB Karlsruhe, Germany Karlsruhe, Germany [email protected] [email protected] ABSTRACT with read-write capabilities (HTTP PUT, POST, and DELETE oper- While current Semantic Web languages and technologies are well- ations) for the interface to writeable resources, most systems that suited for accessing and integrating static data, methods and tech- interact with LDP servers are still programmed in imperative code. nologies for the handling of dynamic aspects – required in many We would like to specify the behaviour of user agents in a language modern web environments – are largely missing. We propose to use grounded in mathematical logic. Abstract State Machines (ASMs) as the formal basis for dealing with A high-level language for the specification of the behaviour changes in Linked Data, which is the combination of the Resource of user agents would allow expressing executable specifications, Description Framework (RDF) with the Hypertext Transfer Proto- which are shorter and cheaper to create than imperative code, and col (HTTP). We provide a synthesis of ASMs and Linked Data and which are easier to validate (e. g. with subject matter experts) and show how the combination aligns with the relevant specifications to analyse (e. g. using formal methods). The language could also such as the Request/Response communication in HTTP, the guide- serve as execution substrate for agent specifications created using lines for updating resource state in the Linked Data Platform (LDP) planning approaches from Artificial Intelligence.
    [Show full text]
  • Resource-Level Versioning in Administrative Geography RDF Data
    Resource-Level Versioning in Administrative Geography RDF Data Alex Lohfink, Duncan McPhee University of Glamorgan, faculty of Advanced Technology, Dept. of Computing and Maths, Pontypridd, CF37 1DL Tel. +44443 482950 (alohfink, dmcphee)@glam.ac.uk ABSTRACT The Resource Description Framework (RDF) is a base technology of the semantic web, providing a web infrastructure that links distributed resources with semantically meaningful relationships at the data level. An issue with RDF data is that resources and their properties are subject, as with all data, to evolution through change, and this can lead to linked resources and their properties being either removed or outdated. In this paper we describe a simple model to address such an issue in administrative geography RDF data using RDF containers. The model version-enables such data at the resource level without disturbing the inherent simplicity of the RDF graph, and this translates to the querying of data, which retrieves version-specific properties by matching simple graph-patterns. Further, both information and non-information resources can be versioned by utilizing the model, and its inherent simplicity means that it could be implemented easily by data consumers and publishers. The model is evaluated using some UK administrative geography RDF datasets. Keywords: Key words: administrative geography; semantic web; versioning; RDF; linked data Page 1 1. Introduction as single entities. The model does not The publication of data in Linked Data interfere with the inherent simplicity of the (RDF) form has increased significantly graph-based structure imposed by the RDF recently and in the UK this process has been data model, and this simplicity is evident accelerated through the Government‟s when formulating queries about versioned „Making Public Data Public‟ initiative, resources, where standard SPARQL (W3C which is encouraging Government Agencies 2008) queries can be used to retrieve to publish data in linked form.
    [Show full text]
  • Speech Acts Meet Tagging Alexandre Monnin, Freddy Limpens, Fabien Gandon, David Laniado
    Speech acts meet tagging Alexandre Monnin, Freddy Limpens, Fabien Gandon, David Laniado To cite this version: Alexandre Monnin, Freddy Limpens, Fabien Gandon, David Laniado. Speech acts meet tagging. 5th AIS SigPrag International Pragmatic Web Conference Track (ICPW 2010), Conference Track of I-Semantics, 2010, Sep 2010, Graz, Austria. 10.1145/1839707.1839746. hal-01170885 HAL Id: hal-01170885 https://hal.inria.fr/hal-01170885 Submitted on 2 Jul 2015 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Speech acts meet tagging: NiceTag ontology Alexandre Monnin Freddy Limpens, David Laniado Philosophies contemporaines, ExeCO, Fabien Gandon DEI, Politecnico di Milano Univ. Paris I Panthéon-Sorbonne Edelweiss, INRIA Sophia-Antipolis Piazza L. da vinci, 32 17, rue de la Sorbonne 2004, route des Lucioles BP 93 20133 Milano 75005, Paris 06902 Sophia Antipolis Cedex [email protected] [email protected] [email protected] paris1.fr [email protected] ABSTRACT related to tagging. Though seemingly unchallenged and despite it Current tag models do not fully take into account the rich and being implemented in many systems, a close look at each diverse nature of tags.
    [Show full text]
  • Versioned Queries Over RDF Archives: All You Need Is SPARQL?
    Versioned Queries over RDF Archives: All You Need is SPARQL? Ignacio Cuevas and Aidan Hogan Department of Computer Science, University of Chile & IMFD Chile Abstract. We explore solutions for representing archives of versioned RDF data using the SPARQL standard and off-the-shelf engines. We consider six representations of RDF archives based on named graphs, and describe how input queries can be automatically rewritten to return solutions for a particular version, or solutions that change between ver- sions. We evaluate these alternatives over an archive of 8 weekly versions of Wikidata and 146 queries using Virtuoso as the SPARQL engine. 1 Introduction A key aspect of the Web is its dynamic nature, where documents are frequently updated, deleted and added. Likewise when we speak of the Semantic Web, it is important to consider that sources may be dynamic and RDF datasets are sub- ject to change [11]. It is in this context that various works have looked at version- ing in the context of RDF/SPARQL [25,23,7,8,12], with recent works proposing RDF archives [5,3,2,6,20] that manage RDF graphs and their historical changes, allowing for querying across different versions of the graph. Within these works, a variety of specialised indexing techniques [3,2,20], query languages [23] and benchmarks [13,6] have been proposed, developed and evaluated. While these represent important advances, many such works propose custom SPARQL ex- tensions, indexes, engines, etc., creating a barrier for adoption. In fact, versioned queries as proposed in the literature [5] can be supported using off-the-shelf SPARQL engines with years of development, optimisation, and deployment.
    [Show full text]
  • Named Graphs, Provenance and Trust
    Named Graphs, Provenance and Trust Jeremy J. Carroll Christian Bizer Hewlett-Packard Labs Freie Universitat¨ Berlin Bristol, UK Germany [email protected] [email protected] Pat Hayes Patrick Stickler IHMC, Florida, USA Nokia, Finland [email protected] [email protected] ABSTRACT privacy preferences to graphs in order to restrict the usage of The Semantic Web consists of many RDF graphs nameable by published information [18, 34]. URIs. This paper extends the syntax and semantics of RDF to cover Access control A triple store may wish to allow fine-grain access such Named Graphs. This enables RDF statements that describe control, which appears as metadata concerning the graphs in graphs, which is beneficial in many Semantic Web application ar- the store [28]. eas. As a case study, we explore the application area of Semantic Signing RDF graphs As discussed in [13], it is necessary to keep Web publishing: Named Graphs allow publishers to communicate the graph that has been signed distinct from the signature, assertional intent, and to sign their graphs; information consumers and other metadata concerning the signing, which may be can evaluate specific graphs using task-specific trust policies, and kept in a second graph. act on information from those Named Graphs that they accept. Stating propositional attitudes such as modalities and beliefs [27]. Graphs are trusted depending on: their content; information about Scoping assertions and logic where logical relationships between the graph; and the task the user is performing. The extension of graphs have to be captured [6, 25, 31]. RDF to Named Graphs provides a formally defined framework to Ontology versioning and evolution OWL [19] provides various be a foundation for the Semantic Web trust layer.
    [Show full text]
  • On Explicit Provenance Management in RDF/S Graphs
    On Explicit Provenance Management in RDF/S Graphs P. Pediaditis G. Flouris I. Fundulaki V. Christophides ICS-FORTH ICS-FORTH ICS-FORTH ICS-FORTH University of Crete Greece Greece University of Crete Greece [email protected] [email protected] Greece [email protected] [email protected] Abstract An RDF triple, (subject; property; object), asserts the fact that subject is associated with object through The notion of RDF Named Graphs has been proposed property. A collection of (data and schema) triples in order to assign provenance information to data de- forms an RDF/S graph whose nodes represent either in- scribed using RDF triples. In this paper, we argue that formation resources universally identified by a Univer- named graphs alone cannot capture provenance infor- sal Resource Identifier (URI) or literals, while edges are mation in the presence of RDFS reasoning and updates. properties, eventually defined by one or more associated In order to address this problem, we introduce the no- RDFS vocabularies (comparable to relational and XML tion of RDF/S Graphsets: a graphset is associated with schemas). In addition, RDF Named Graphs have been a set of RDF named graphs and contain the triples that proposed in [5, 28] to capture explicit provenance in- are jointly owned by the named graphs that constitute formation by allowing users to refer to specific parts of the graphset. We formalize the notions of RDF named RDF/S graphs in order to decide “how credible is”, or graphs and RDF/S graphsets and propose query and up- “how evolves” a piece of information.
    [Show full text]