Project Nº: FP7-318338
Project Acronym: Optique
Project Title: Scalable End-user Access to Big Data
Instrument: Integrated Project
Scheme: Information & Communication Technologies

Deliverable D4.2 Techniques for Ontology and Mapping Bootstrapping

Due date of deliverable: (T0+24)
Actual submission date: October 31, 2014

Start date of the project: 1st November 2012
Duration: 48 months
Lead contractor for this deliverable: UOXF
Dissemination level: PU – Public

Final version

Executive Summary: Techniques for Ontology and Mapping Bootstrapping

This document summarises deliverable D4.2 of project FP7-318338 (Optique), an Integrated Project supported by the 7th Framework Programme of the EC. Full information on this project, including the contents of this deliverable, is available online at http://www.optique-project.eu/. More specifically, the present deliverable describes the designed and implemented Techniques for Ontology and Mapping (O&M) Bootstrapping corresponding to Task T4.1. Optique's O&M bootstrapping module supports system installation over relational databases. It offers several scenarios for installing the platform that combine (i) bootstrapping of ontologies and mappings from relational schemas, and (ii) importing of existing ontologies into the platform via alignment or layering. The O&M bootstrapper is tightly integrated with other Optique components, which facilitates installation with mapping editing and ontology approximation. Moreover, the O&M bootstrapper can encode in mappings the information needed for provenance of query answers. We implemented the bootstrapper, integrated it in the Optique platform, and extensively evaluated it on several database schemas, including the ones provided by Statoil and Siemens. We are currently working on several challenging research directions that are tightly related to T4.1: bootstrapping of complex mappings and benchmarking. We have presented our results at a number of international venues and to Statoil and Siemens users.

List of Authors

Ernesto Jiménez-Ruiz (UOXF), Evgeny Kharlamov (UOXF), Dmitriy Zheleznyakov (UOXF), Ian Horrocks (UOXF), Domenico Fabio Savo (UNIROMA1), Valerio Santarelli (UNIROMA1), Jose Mora (UNIROMA1), Riccardo Rosati (UNIROMA1), Marco Console (UNIROMA1), Evgenij Thorstensen (UiO), Dag Hovland (UiO), Martin Giese (UiO), Leif Harald Karlsen (UiO), Daniel Lupp (UiO), Martin Georg Skjæveland (UiO), Johannes Trame (FOP), Christoph Pinkel (FOP), Thomas Hubauer (SIEMENS), Mikhail Roshchin (SIEMENS)


Internal Reviewers: Martin Rezk (FUB), Özgür L. Özçep (TUHH)

Contents

1 Introduction

2 Related Work

3 Preliminaries
  3.1 Relational database
  3.2 Ontologies and the Web Ontology Language (OWL)
    3.2.1 The OWL 2 QL profile
  3.3 Ontology-to-Schema (OBDA) Mappings
    3.3.1 Direct Mapping Specification and R2RML language
  3.4 Ontology-to-Ontology Alignments
    3.4.1 Representation of Ontology Alignments
    3.4.2 Semantic Consequences of the Integration
  3.5 Provenance
  3.6 Preliminaries for ontology approximation
    3.6.1 Basic Definitions

4 Installation Scenarios

5 Bootstrapping Techniques
  5.1 Running example
  5.2 Bootstrapping of Mappings
    5.2.1 Layering an Existing Ontology
  5.3 Bootstrapping of Ontologies
    5.3.1 Adding hierarchy to the ontology classes
    5.3.2 Dealing with multiple properties with the same name
    5.3.3 Annotation Schema for the Query Formulation Interface
  5.4 Enhancing Bootstrapping with External Ontology
    5.4.1 Ontology Alignment
    5.4.2 Alignment Repair
  5.5 Ontology Approximation
    5.5.1 Global semantic approximation
    5.5.2 K-approximation
    5.5.3 Approximation in OWL 2 QL
    5.5.4 Computing the entailment set in OWL 2 QL
  5.6 Provenance in Bootstrapped Mappings
    5.6.1 Provenance model
    5.6.2 Provenance at URI level
    5.6.3 Provenance at triple level
    5.6.4 Provenance at graph level

6 Post-Bootstrapping Analysis
  6.1 Semi-Automatic Ontology Layering
  6.2 Validation of the bootstrapped ontology and mappings
  6.3 Guidelines for the manual construction of an OBDA specification

7 Integration with the Optique Platform
  7.1 RDB Schema Selection and Bootstrapping
    7.1.1 Ontology Alignment
    7.1.2 Ontology approximation
    7.1.3 Ontology and Mapping Storage
    7.1.4 Integrated O&M bootstrapper

8 Evaluation
  8.1 Installing Optique Platform at Statoil
    8.1.1 The database
    8.1.2 Experiments
  8.2 Installing Optique Platform at Siemens
    8.2.1 Siemens Schemata and Ontologies
    8.2.2 Coverage of Query Terms by the Ontologies

9 Ongoing Work
  9.1 Benchmark for Ontology Alignment
  9.2 Benchmark for Ontology and Mapping Bootstrapping
  9.3 Bootstrapping of Complex Mappings
    9.3.1 Basic Definitions
    9.3.2 Finding classes based on joins
    9.3.3 Finding classes based on clusters of attributes

Bibliography

Glossary

A R2RML direct mapping cases

B OM 2013: IncMap

C Initial guidelines for OBDA specification

D LogMap: OM 2013 paper

E LogMap: OM 2014 paper

F ISWC 2014: Conservativity in Ontology Alignments

G ISWC 2014: Repair in Ontology Alignments

H ISWC 2014: Ontology Approximation

I DL 2013: Ontology Approximation

J Empirical Evaluation of the Ontology Approximation Module

K Optique demo: ISWC 2013 paper

L Optique demo: 2014 (submitted)

M ISWC 2014: Ontology Alignment for Query Answering

Chapter 1

Introduction

Building an ontology and connecting it to the data sources via mappings is a costly process, especially for large and complex databases. To aid this process, tools that can extract a preliminary ontology and mappings from the source schema play a critical role. In order to ease the production of initial versions of the ontology and mappings, the Optique platform includes an O&M bootstrapping component that takes a set of database schemata as input, and returns an ontology and a set of mappings that connect the terms occurring in the ontology to the schema elements. The purpose of this document is to describe the designed and implemented Techniques for Ontology and Mapping (O&M) Bootstrapping corresponding to Task T4.1 of WP4.

Challenges of the Work package. Within Optique, WP4 deals with the problems related to the management of the OBDA specification. The specification of an Ontology-Based Data Access (OBDA) system [14, 59] is a triple ⟨풪, 풮, ℳ⟩, where 풪 is an ontology, providing a conceptual specification of the domain of interest, 풮 is an intensional specification (schema) of a set of data sources, and ℳ is a set of mapping assertions, i.e., expressions that specify the relationship between the ontology and the data sources by means of queries over the ontology that are put in correspondence with queries over the data sources. OBDA systems crucially depend on the existence of suitable ontologies and mappings. Developing them from scratch in a Big Data scenario is likely to be expensive; thus, practical OBDA systems should support the (semi-)automatic creation of an initial ontology and set of mappings [35, 44].

Challenges of the Task. Task T4.1 has the goal of developing techniques and methodologies to facilitate the rapid deployment of the platform in new applications and application domains. Concretely, this will lead to the development of the desired initial ontology and mappings.

Summary of Task Results. The Optique system, within its O&M Management system (see the architecture in Figure 1.1), includes an O&M bootstrapping module that supports system installation over relational databases. The O&M bootstrapper offers several scenarios for installing the platform that combine (i) bootstrapping of ontologies and mappings from relational schemas, and (ii) importing of existing ontologies into the platform via alignment or layering. The O&M bootstrapper is tightly integrated with other Optique components, which facilitates installation with mapping editing and ontology approximation. Moreover, the O&M bootstrapper can encode in mappings the information needed for the provenance of query answers. We implemented the bootstrapper, integrated it in the Optique platform, and extensively evaluated it on several database schemas, including the ones provided by Statoil and Siemens. We are currently working on several challenging research directions that are tightly related to T4.1: bootstrapping of complex mappings and benchmarking. We have presented our results at a number of international venues and to Statoil and Siemens users.

List of Achievements in Year 2.


[Figure 1.1 shows the Ontology and Mapping Management component integrated via the Information Workbench: a presentation layer with an O&M management interface and an ontology editing interface (Protégé); the O&M Manager's processing components, including the O&M bootstrapper, the O&M matching and alignment system, O&M revision, editing and analysis modules, and visualisation and transformation engines; external reasoning engines accessed through the OWL API and Sesame API; and a shared triple store in the internal data layer holding the ontology, mappings, configuration, queries, answers, and history.]

Figure 1.1: Ontology and Mapping Management component of the Optique OBDA system

• O&M components of Year 1 implementation were improved and extended:

– ontology alignment was improved with safety verification,
– W3C test cases for the O&M bootstrapper were implemented,
– the O&M bootstrapper was tightly integrated with the mapping editor,
– the installation GUI and wizards were improved,

• extensive experiments and evaluation of the O&M bootstrapper were conducted on relational schemas from the use cases and other schemas,

• new techniques for bootstrapping of direct mappings, called layering, were developed, implemented, and integrated into the Optique platform,

• preliminary techniques for bootstrapping of complex mappings, i.e., mappings involving the select, project, and join relational operators, were developed,

• preliminary techniques for embedding provenance in bootstrapped mappings were developed,

• work on benchmarking O&M bootstrapping approaches has been started.


Compliance with T4.1 Task Description and Relationship with Other Tasks and Work packages. The results of both Optique years in T4.1 match the goal of this task: the development of techniques to facilitate the rapid deployment of the platform in new applications and application domains, in particular the development of an initial ontology and mappings. Techniques and software components developed within T4.1 are tightly connected with other tasks within WP4 and other work packages. In particular, the ontology approximation module developed within T4.2, as well as the mapping editor, are integral parts of the bootstrapping workflows. The implementation and evaluation required in T4.4 were conducted in accordance with the task. The alignment component of the bootstrapper makes it possible to integrate into the platform domain ontologies that are specifically developed for the purpose of query formulation, and thus provides a bridge between T4.1 and WP3. The layering component facilitates the query-driven ontology construction of WP3: one can extract semantic terms from users' queries, layer them onto the schema underlying the platform installation, and then integrate the terms into the platform via ontology alignment.

Structure of the Deliverable. Chapter 2 presents related work and motivates our decisions in the development of O&M bootstrapping techniques. Chapter 3 gives the preliminaries and notation that we use in the deliverable. In Chapter 4 we present the main scenarios for platform installation. Chapter 5 presents our O&M bootstrapping techniques, which in particular support the above installation scenarios. More precisely, we start with a running example in Section 5.1, present mapping bootstrapping techniques in Section 5.2 and ontology bootstrapping techniques in Section 5.3, show how bootstrapped ontologies can be enhanced with external ones in Section 5.4, how to approximate ontologies in Section 5.5, and how to integrate provenance in bootstrapped mappings in Section 5.6. Chapter 6 discusses how to analyse ontologies and mappings after they have been bootstrapped. Chapter 7 discusses how the installation module is integrated in the platform. Chapter 8 explains how we conducted the evaluation of the module. Finally, Chapter 9 reports on our ongoing work on benchmarking and bootstrapping of complex mappings.

Chapter 2

Related Work

In the literature one can find a broad range of approaches to bootstrap an ontology and mappings from a relational database schema; the interested reader may consult the surveys [67, 72]. These approaches can be classified with respect to different aspects such as: (i) level of automation (i.e., manual, semi-automatic, automatic), (ii) type of mappings (i.e., complex or direct mappings), (iii) language of the bootstrapped mappings and ontology, (iv) reuse of external vocabularies (e.g., domain ontologies or thesauri), and (v) purpose (e.g., OBDA, constraint validation, database integration, ontology learning).

RDFS and OWL 2 have been the most common languages for the bootstrapped ontology, although some systems have also used DLR-Lite based languages (e.g., [50]) and extensions based on SWRL (e.g., [31, 48]). Many systems like D2RQ [10], SquirrelRDF [1], Ontop [62] and Mastro [18] used their own native language to define mappings before the R2RML specification [2] was completed. The approaches in [49, 6, 50, 9, 63, 66, 60] present different solutions to create (automatic) direct mappings between the ontology and the relational database. Some of them (e.g., [50, 66, 60]), like our approach, closely follow the W3C "Direct Mapping of Relational Data to RDF" recommendation [3]; only a few already focus on R2RML mappings (e.g., [66, 60]). Systems like IncMap [58] address a more challenging setting where a domain ontology is directly mapped to the relational database schema in a semi-automatic process. The approaches in [17, 16] use ontology learning techniques to exploit the data and discover interesting patterns that can be included to enrich the ontology. Finally, systems like Automapper [31], Relational.OWL [25] and ROSEX [23] complement the bootstrapped ontology with links to domain ontologies. The approach described in [63] also uses background knowledge to infer new hierarchical relationships.

Although we could have reused one of the off-the-shelf bootstrappers, we advocated a custom implementation. As described above, most existing approaches focus on mapping and ontology languages different from what we require for our OBDA solution (i.e., W3C standards). At the same time, our solution allows for more fine-grained control over the different bootstrapping components and workflows in Optique, e.g., (i) communication with the Optique triple store to access the database schema and store the bootstrapped ontology and mappings; (ii) communication with the Optique APIs; (iii) use of ontology aligners to link the bootstrapped ontology with state-of-the-art domain ontologies; (iv) use of ontology approximators to ensure that the bootstrapped ontology is within the OWL 2 QL profile; (v) visualization and editing of the mappings after the bootstrapping. In addition, our bootstrapper provides three different installation scenarios (see Chapter 4), which involve more or less manual intervention depending on the query requirements (i.e., the ontology may need to be extended and/or more sophisticated R2RML mappings may be required).

Chapter 3

Preliminaries

In this section we give some preliminaries about the resources, specifications and techniques we use in the deliverable.

3.1 Relational database

In this section we present some basic definitions related to relational databases which will be used in the following sections. A relational database is composed of several schemas, and each schema of several tables. We treat foreign key constraints as belonging to the attributes of a table, in the sense that they are preserved under projections and joins.

Definition 3.1.1 (Relation or Table) A relation or table 푅 is a set of attributes attr(푅) = {푎1, . . . , 푎푛} together with a set of tuples over attr(푅). Each attribute 푎 has a type type(푎) specifying the values it permits.

Definition 3.1.2 (Projection) Let 푅 be a table, and 퐴 = {푎1, . . . , 푎푚} a subset of attr(푅). The projection of 푅 onto 퐴 is the table 휋퐴(푅) = {푡|퐴 | 푡 ∈ 푅}, with set of attributes attr(휋퐴(푅)) = 퐴.

Definition 3.1.3 (Primary key) Let 푅 be a table. A primary key pk(푅) for 푅 is a declared subset of attributes 퐴 ⊆ attr(푅) which uniquely identifies each tuple in 푅.

Definition 3.1.4 (Superkeys) Let 푅 be a table. A set of attributes 퐴 ⊆ attr(푅) is a superkey for 푅 if for every pair of distinct tuples 푡, 푡′ ∈ 푅, we have 푡|퐴 ≠ 푡′|퐴.

Definition 3.1.5 (Joins on foreign keys) We say that a table 푇1 references another table 푇2 whenever 푇1 contains a foreign key that references 푇2. We write fk(푇1, 푇2) for this attribute of 푇1, and refs(푇) for the set of tables that reference 푇. If 푇1 references 푇2, we write 푇1 ⋈fk 푇2 for the equijoin of 푇1 and 푇2 on the foreign key. Likewise, we write 푇1 ⋉fk 푇2 for the left semijoin on the foreign key, that is, the set of tuples 휋attr(푇1)(푇1 ⋈fk 푇2). We also write rojfk(푇1, 푇2) for the right outer join of 푇1 and 푇2 on the foreign key.

Definition 3.1.6 (Many to Many Tables) In our setting, a relation 푅 is considered a many-to-many table if attr(푅) = {푎1, 푎2} = pk(푅), and 푎1 and 푎2 are foreign keys.

3.2 Ontologies and the Web Ontology Language (OWL)

An ontology is usually referred to as a ‘conceptual model’ of (some aspect of) the world. It introduces the vocabulary of classes and properties that describe various aspects of the modelled domain. It also provides an explicit specification of the intended meaning of the vocabulary by describing the relationships between different vocabulary terms.

11 Optique Deliverable D4.2 Techniques for Ontology and Mapping Bootstrapping

The Web Ontology Language (OWL), and its revision OWL 2 [20], are well known languages for ontology modeling in the Semantic Web. OWL ontologies are already being used in domains as diverse as bio-medicine, geology, agriculture, defence, and the energy industry. The formal underpinning of OWL 2 is provided by description logics (DLs) [7]—knowledge representation formalisms with well-understood formal properties. In this section, we very briefly summarise the basics of DLs, and refer the interested reader to [7, 20, 37] for further information. DLs allow ontology developers to describe a domain of interest in terms of individuals, atomic concepts (usually called classes in OWL), and roles (also called properties). DLs also allow for concept descriptions (i.e., complex concepts) to be composed of atomic concepts and roles by providing a set of concept constructors. The DLs underlying OWL provide for intersection (⊓), union (⊔) and complement (¬), as well as enumerated classes (called oneOf in OWL) and restricted forms of existential (∃), universal (∀) and cardinality restrictions (≥, ≤, =) involving an atomic role 푅 or its inverse 푅⁻. A DL ontology 풪 consists of a set of axioms. In the DLs underlying OWL it is possible to assert that a concept (or concept description) 퐶 is subsumed by (is a sub-concept of) 퐷 (written 퐶 ⊑ 퐷), or is exactly equivalent to 퐷 (written 퐶 ≡ 퐷). It is also possible to assert subsumption of and equivalence between roles, as well as to establish special constraints on roles (e.g., that a role should be interpreted as a transitive or as a functional relation). DLs are equipped with a formal semantics, which enables the development of reasoning algorithms for answering complex queries about the domain. DLs, in fact, can be seen as decidable subsets of first-order logic, with individuals being equivalent to constants, concepts to unary predicates, and roles to binary predicates. As in the case of a first-order knowledge base, an interpretation ℐ is a model of an ontology 풪 (written ℐ ⊧ 풪) if ℐ satisfies all the axioms in 풪; 풪 entails an axiom 훼 (respectively an ontology 풪′), written 풪 ⊧ 훼 (풪 ⊧ 풪′), if ℐ ⊧ 훼 (respectively ℐ ⊧ 풪′) for every model ℐ of 풪. Finally, 풪 and 풪′ are logically equivalent (written 풪 ≡ 풪′) if 풪 ⊧ 풪′ and 풪′ ⊧ 풪.

3.2.1 The OWL 2 QL profile
The OWL 2 QL1 profile is tailored for OBDA applications that aim at performing reasoning on top of very large volumes of data. One of the main motivations of this profile is to query data in a relational database using an ontology. Queries formulated over the ontology vocabulary are then translated into queries on the database using reasoning. The expressivity of OWL 2 QL is restricted so that the ontology language underlying OWL 2 QL enjoys the first-order (FO) rewritability property [15]. OWL 2 QL supports the following ontology axioms:

• subclass axioms

• class expression equivalence

• class expression disjointness

• inverse object properties

• property inclusion (not involving property chains)

• property equivalence

• property domain

• property range

• disjoint properties

• symmetric properties

• reflexive properties

1 http://www.w3.org/TR/owl2-profiles

12 Optique Deliverable D4.2 Techniques for Ontology and Mapping Bootstrapping

• irreflexive properties

• asymmetric properties

• assertions other than individual equality assertions and negative property assertions

Class expressions in subclass axioms and equivalence axioms are constrained as follows (an illustration follows the list):

• Subclass expression: an atomic class, an existential quantification where the filler is limited to the Top class, or an existential quantification to a data range.

• Superclass expression: an atomic class, an intersection, an existential quantification to a class, or an existential quantification to a data range.
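To make these constraints concrete, here is a small illustration (our own, using class and property names from the running example of Chapter 5):

\[
\begin{array}{ll}
\mathit{DevelopmentWellbore} \sqsubseteq \mathit{Wellbore} & \text{allowed} \\
\mathit{Wellbore} \sqsubseteq \exists \mathit{locatedIn}.\mathit{Field} & \text{allowed: qualified existential as superclass} \\
\exists \mathit{locatedIn}.\top \sqsubseteq \mathit{Wellbore} & \text{allowed: unqualified existential as subclass} \\
\exists \mathit{locatedIn}.\mathit{Field} \sqsubseteq \mathit{Wellbore} & \text{not in OWL 2 QL: qualified existential as subclass}
\end{array}
\]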

3.3 Ontology-to-Schema (OBDA) Mappings

Via ontology-to-schema mappings (or OBDA mappings, or mappings for short) one can declaratively define how ontological terms are related to terms occurring in the relational schema. For example, mappings can be seen as view definitions of the following form that declare how to populate classes with objects—in OWL, objects are represented with Uniform Resource Identifiers (URIs)—and how to populate properties with object-object and object-value pairs:

Class(푓표(푥)) ← SQL(푥),
Property(푓표(푥), 푓표(푦)) ← SQL(푥, 푦),
Property(푓표(푥), 푓푣(푦)) ← SQL(푥, 푦),

where SQL(푥) and SQL(푥, 푦) are SQL queries with respectively one and two output variables,2 and 푓표, 푓푣 are functions that 'cast' values returned by SQL into respectively objects, i.e., URIs, and values.3 Classes are populated with URIs 푓표(푥) computed from the values 푥 returned by SQL(푥). Properties can relate two objects, e.g., by stating that Bob knows John, or assign a value to an object, e.g., by stating that Bob's age is 25; they are respectively populated with pairs of objects 푓표(푥), 푓표(푦) or pairs of an object 푓표(푥) and a value 푓푣(푦) computed from the values 푥 and 푦 returned by the SQL query. Given a database and a set of mappings over it, one can execute the SQL queries in the mapping definitions and populate the classes and properties of the mappings, thus creating a set of ontological facts. This process is usually referred to as virtual materialisation of the ontological facts defined by the mappings.

3.3.1 Direct Mapping Specification and R2RML language
A mapping is direct if it relates a table to a concept or an attribute to a property. In the O&M bootstrapper we have closely followed the W3C recommendation A Direct Mapping of Relational Data to RDF,4 which specifies a direct RDF graph representation of a relational database. For example, the table People(id (PK), name, age, knows (FK)) with two rows (1, Bob, 25, 2) and (2, John, 30, 1) would be translated into the following triples:

<http://base_uri/People/1> rdf:type <http://base_uri/People> .
<http://base_uri/People/1> <http://base_uri/name> "Bob" .
<http://base_uri/People/1> <http://base_uri/age> 25 .
<http://base_uri/People/1> <http://base_uri/knows> <http://base_uri/People/2> .

2 Note that mappings may involve SQL queries with more than two variables.
3 These functions should ensure coherent generation of URIs that respects primary and foreign keys.
4 http://www.w3.org/TR/rdb-direct-mapping/


<http://base_uri/People/2> rdf:type <http://base_uri/People> .
<http://base_uri/People/2> <http://base_uri/name> "John" .
<http://base_uri/People/2> <http://base_uri/age> 30 .
<http://base_uri/People/2> <http://base_uri/knows> <http://base_uri/People/1> .

Direct mappings are particular cases of the mappings that can be expressed in the R2RML standard.5 R2RML is a W3C recommendation for expressing customized and expressive mappings from relational databases to RDF datasets. Intuitively, an R2RML mapping allows mapping any valid SQL query or view (i.e., logical table) to a target vocabulary (e.g., ontology entities). R2RML mappings are RDF graphs and are typically stored in Turtle syntax. R2RML mappings are composed of a set of triples maps. A triples map specifies a rule for translating each row of a logical table to zero or more RDF triples. These rules are composed of a subject map (subjects are often IRIs generated from the primary key columns) and zero or more predicate-object maps. For example, TriplesMap1 given below represents the direct R2RML mapping that translates the table given above into RDF data.

:TriplesMap1 a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "People" ] ;
    rr:subjectMap [
        rr:template "http://base_uri/People/{id}" ;
        rr:class <http://base_uri/People>
    ] ;
    rr:predicateObjectMap [
        rr:predicate <http://base_uri/name> ;
        rr:objectMap [ rr:column "name" ]
    ] ;
    rr:predicateObjectMap [
        rr:predicate <http://base_uri/age> ;
        rr:objectMap [ rr:column "age" ]
    ] ;
    rr:predicateObjectMap [
        rr:predicate <http://base_uri/knows> ;
        rr:objectMap [ rr:template "http://base_uri/People/{knows}" ]
    ] .

On the other hand, TriplesMap2 represents a more complex mapping where the logical table is defined by a SQL query.

:TriplesMap2 a rr:TriplesMap ;
    rr:logicalTable [ rr:sqlQuery "SELECT id FROM People WHERE age < 26" ] ;
    rr:subjectMap [
        rr:template "http://base_uri/People/{id}" ;
        rr:class <http://base_uri/YoungPeople>
    ] .

3.4 Ontology-to-Ontology Alignments

In this section, we present the formal representation of ontology alignments, the semantic consequences of integration, and the notions of alignment coherence and conservativity principle violations.

5 http://www.w3.org/TR/r2rml/

14 Optique Deliverable D4.2 Techniques for Ontology and Mapping Bootstrapping

3.4.1 Representation of Ontology Alignments

Given two ontologies 풪1 and 풪2 as input, an ontology alignment system computes a set of correspondences (or ontology alignments) between the vocabularies of these ontologies. Ontology alignments are conceptualised as 5-tuples of the form ⟨푖푑, 푒1, 푒2, 푛, 휌⟩, with 푖푑 a unique identifier, 푒1, 푒2 entities in the vocabulary or signature of the relevant input ontologies (i.e., 푒1 ∈ Sig(풪1) and 푒2 ∈ Sig(풪2)), 푛 a confidence measure between 0 and 1, and 휌 a relation between 푒1 and 푒2, typically subsumption (i.e., 푒1 is more specific than 푒2), equivalence (i.e., 푒1 and 푒2 are synonyms) or disjointness (i.e., no individual can be an instance of both 푒1 and 푒2) [29]. RDF Alignment [24] is the main format used in the OAEI (Ontology Alignment Evaluation Initiative) campaign to represent mappings containing the aforementioned elements. Additionally, mappings are also represented as OWL 2 subclass, equivalence, and disjointness axioms [20]; mapping identifiers (푖푑) and confidence values (푛) are then represented as axiom annotations. Such a representation enables the reuse of the extensive range of OWL 2 reasoning infrastructure that is currently available. Note that alternative formal semantics for ontology mappings have been proposed in the literature (e.g., [12]).
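For concreteness, the following Turtle sketch shows how a single equivalence correspondence between a bootstrapped class and an NPD ontology class could be represented as an annotated OWL 2 axiom; the ex: annotation properties for the identifier and confidence are our illustrative names, not a vocabulary fixed by the deliverable:

@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix db:    <http://base_uri/> .
@prefix npdv2: <http://sws.ifi.uio.no/vocab/npd-v2#> .
@prefix ex:    <http://example.org/alignment#> .

# the correspondence itself, as an OWL 2 equivalence axiom
db:Field owl:equivalentClass npdv2:Field .

# reified axiom carrying the mapping identifier and confidence value
[] a owl:Axiom ;
   owl:annotatedSource   db:Field ;
   owl:annotatedProperty owl:equivalentClass ;
   owl:annotatedTarget   npdv2:Field ;
   ex:mappingId   "m42" ;
   ex:confidence  "0.93"^^xsd:decimal .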

3.4.2 Semantic Consequences of the Integration

The ontology resulting from the integration of two ontologies 풪1 and 풪2 via a set of alignments 풜 may entail axioms that do not follow from 풪1, 풪2, or 풜 alone. In [39], three principles were proposed to minimize the number of potentially unintended consequences, namely: (i) the consistency principle: the alignment should not lead to unsatisfiable classes in the integrated ontology; (ii) the locality principle: the alignment should link entities that have similar neighbourhoods; (iii) the conservativity principle: the alignment should not introduce new semantic relationships between concepts from one of the input ontologies. In this deliverable, we focus on the consistency and conservativity principles.

Consistency principle

The consistency principle requires that the vocabulary in 풪풰 = 풪1 ∪ 풪2 ∪ 풜 be satisfiable, assuming the union of the input ontologies 풪1 ∪ 풪2 (without the alignments 풜) does not contain unsatisfiable concepts. Thus 풪풰 should not lead to any axiom of the form 퐴 ⊑ ⊥, for any 퐴 ∈ Σ = Sig(풪1 ∪ 풪2).

Definition 3.4.1 (Alignment incoherence) A set of alignments 풜 is incoherent with respect to 풪1 and 풪2 if there exists a class 퐴 in the signature of 풪1 ∪ 풪2 such that 풪1 ∪ 풪2 ⊭ 퐴 ⊑ ⊥ and 풪1 ∪ 풪2 ∪ 풜 ⊧ 퐴 ⊑ ⊥.

An incoherent set of alignments 풜 can be fixed by removing mappings from 풜. This process is referred to as alignment repair (or repair for short).

Definition 3.4.2 (Alignment Repair) Let 풜 be an incoherent set of alignments w.r.t. 풪1 and 풪2. A set of alignments ℛ ⊆ 풜 is a repair for 풜 w.r.t. 풪1 and 풪2 iff 풜 ∖ ℛ is coherent w.r.t. 풪1 and 풪2.

A trivial repair is ℛ = 풜, since an empty set of alignments is trivially coherent (according to Definition 3.4.1). Nevertheless, the objective is to remove as few alignments as possible. Minimal repairs are typically referred to in the literature as mapping diagnoses [52] — a term coined by Reiter [61] and introduced to the field of ontology debugging in [65].

Conservativity Principle

The conservativity principle (general notion) states that the integrated ontology 풪풰 = 풪1 ∪ 풪2 ∪ 풜 should not induce any change in the concept hierarchies of the input ontologies 풪1 and 풪2. Note that we assume that the alignments 풜 are coherent with respect to 풪1 and 풪2.

15 Optique Deliverable D4.2 Techniques for Ontology and Mapping Bootstrapping

Definition 3.4.3 (Conservativity principle violations) A set of alignments 풜 violates the conservativity principle if 풪1 ∪ 풪2 ∪ 풜 ⊧ 퐴 ⊑ 퐵 with 퐴 and 퐵 two entities in the signature of one of the input ontologies 풪푖, and 풪푖 ⊭ 퐴 ⊑ 퐵.

A lighter variant of the conservativity principle was proposed in [69] (see Appendix F). This variant requires that the integrated ontology 풪풰 does not introduce new subsumption relationships between concepts from one of the input ontologies, unless they were already involved in a subsumption relationship or they shared a common descendant. The previous definition should then be extended with the conditions: (i) 풪푖 ⊭ 퐵 ⊑ 퐴, and (ii) there is no 퐶 in the signature of 풪푖 s.t. 풪푖 ⊧ 퐶 ⊑ 퐴 and 풪푖 ⊧ 퐶 ⊑ 퐵. This variant of the conservativity principle follows the assumption of disjointness proposed in [64]. That is, if two atomic concepts 퐴, 퐵 from one of the input ontologies are not involved in a subsumption relationship nor share a common subconcept (excluding ⊥), they can be considered disjoint. Hence, the conservativity principle can be reduced to the consistency principle if the input ontologies are extended with sufficient disjointness axioms without causing logical conflicts.
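As a small worked illustration of this reduction (the concept names are ours): let 풪1 contain two concepts 퐴 and 퐵 that are not in a subsumption relationship and share no common subconcept, and let the alignments relate both to a concept 퐵′ of 풪2:

\[
\mathcal{O}_1 = \{\, A \sqsubseteq C,\; B \sqsubseteq C \,\}, \qquad
\mathcal{A} = \{\, A \sqsubseteq B',\; B' \sqsubseteq B \,\} \ \text{with}\ B' \in \mathrm{Sig}(\mathcal{O}_2).
\]

Then 풪1 ∪ 풪2 ∪ 풜 ⊧ 퐴 ⊑ 퐵 although 풪1 ⊭ 퐴 ⊑ 퐵: a conservativity violation. Under the assumption of disjointness we extend 풪1 with 퐴 ⊓ 퐵 ⊑ ⊥, and the same integrated ontology now entails 퐴 ⊑ ⊥, i.e., the violation surfaces as an incoherence in the sense of Definition 3.4.1.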

3.5 Provenance

Provenance has traditionally been applied in the context of art or digital libraries as a means to document the history of art objects or the life cycles of digital objects [32]. In our case, we are concerned with provenance in OBDA systems for the main purpose of keeping the provided data traceable. OBDA systems provide access to data stored in data sources, usually databases, by generating new data that is part of an ontological model, usually assertions in an ABox, either materialized or virtual. By keeping the provenance relation between the newly generated data and the data used for its generation (in the data source), we can keep the data traceable. The traceability of the data provides access to the characteristics of the data that depend on the process used to obtain or derive this data, up to its origin and context.

PROV-O PROV-O is an OWL 2 ontology for the PROV Data Model [8], which allows modelling provenance information in a semantically and automatically processable way. We refer to URIs in the PROV-O namespace with the prefix "prov:". At its core, PROV-O models entities that are generated by some activity and attributed to some agent. Among the agents we can find software agents, such as OBDA systems. Additionally, PROV-O allows modelling the "derivation" of some entities from other entities and the authorship of derived entities. For instance, when data is extracted from a database to derive individuals of an ontology, we can specify that an individual was derived from (prov:wasDerivedFrom) the database or a part of it (e.g., a table) and that this individual was generated by (prov:wasGeneratedBy) the OBDA system or a part of it (e.g., a mapping assertion).
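The following Turtle sketch (our illustration; all URIs except the prov: terms are hypothetical) shows how these two properties could record the origin of an individual produced from the Wellbore table by a bootstrapped mapping:

@prefix prov: <http://www.w3.org/ns/prov#> .

# the derived ontology individual ...
<http://base_uri/Wellbore/w123>
    # ... came from (a part of) the data source: the Wellbore table
    prov:wasDerivedFrom <http://base_uri/source/Wellbore> ;
    # ... and was generated by the execution of one mapping assertion
    prov:wasGeneratedBy <http://base_uri/activity/TriplesMapWellbore> .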

Provenance and OBDA mappings Provenance in OBDA mappings consists in maintaining the traceability provided by provenance meta-information throughout the OBDA process. This means providing meta-information about the provenance of the data produced by the OBDA system (e.g., URIs, triples). This provenance meta-information is related to the data sources that the OBDA system gives access to and abstracts over. For example, a particular triple may have been produced by a particular mapping assertion; this assertion would query a particular database, which could imply further considerations like trust or quality (including monetary costs). All these considerations depend on the provenance of the information that the OBDA system produces. For clarity, in addition to provenance in OBDA mappings we can consider two related problems: provenance of OBDA mappings and OBDA mappings for provenance data. Provenance of OBDA mappings models the meta-information related to the provenance of a particular resource, in this case the OBDA mappings themselves; it is a particular application of provenance models to a particular type of resource. OBDA mappings for provenance data are mappings that provide access, from an ontological view, to a particular type of data, in this case provenance data.

This type of OBDA mapping is a particular kind of mapping that gives access to a particular type of information stored in the database, in this case provenance information, but such mappings are not qualitatively different from regular OBDA mappings.

3.6 Preliminaries for ontology approximation

Ontologies provide a conceptualization of a domain of interest which can be used for different objectives, such as providing a formal description of the domain of interest for documentation purposes, or providing a mechanism for reasoning upon the domain. For instance, they are the core element of the OBDA paradigm [14, 59], in which the ontology is utilized as a conceptual view, allowing user access to the underlying data sources. When the aim is to use an ontology as a formal description of the domain of interest, expressive languages prove useful. If instead the goal is to use the ontology for reasoning tasks which require low computational complexity, the high expressivity of the language used to model the ontology may be a hindrance. In this scenario, the approximation of ontologies expressed in very expressive languages by ontologies expressed in languages which keep the computational complexity of the reasoning tasks low is pivotal. This motivates the study of the approximation of an ontology for OBDA applications, and thus the study of approaches for approximating ontologies in very expressive languages with ontologies in languages, such as OWL 2 QL, that are characterized by low reasoning complexity and are therefore suitable for query answering purposes. Several approaches have recently dealt with the problem of approximating ontologies. These can roughly be partitioned into two types: syntactic and semantic. In the former, only the syntactic form of the axioms of the original ontology is considered; axioms which do not comply with the syntax of the target ontology language are disregarded [74, 75]. This approach can generally be performed quickly and with simple algorithms. However, it does not in general guarantee soundness, i.e., that only correct entailments are inferred, or completeness, i.e., that all entailments of the original ontology that are also expressible in the target language are preserved [56]. In the latter, the object of the approximation is the set of entailments of the original ontology, and the goal is to preserve as much as possible of these entailments by means of an ontology in the target language, guaranteeing soundness of the result. On the other hand, this approach often requires performing complex reasoning tasks over the ontology, possibly making it significantly slower. For these reasons, the semantic approach to ontology approximation poses a more interesting but more complex challenge.

3.6.1 Basic Definitions
Let Σ be a signature of symbols for individual (object and value) constants and predicates, i.e., concepts, value-domains, attributes, and roles. Let Φ be the set of all OWL 2 axioms over Σ. With a slight abuse of notation, we say that an ontology over Σ is a finite subset of Φ, and that a language ℒ over Σ is a set of ontologies over Σ. We call ℒ-ontology any ontology 풪 such that 풪 ∈ ℒ. Moreover, we denote by Φℒ the set of axioms ⋃풪∈ℒ 풪. We call a language ℒ closed if ℒ = 2^Φℒ. It is easy to see that while OWL 2 QL is a closed language, OWL 2 is not. Indeed, OWL 2 imposes syntactic restrictions that concern the simultaneous presence of multiple axioms in the ontology (for instance, there exist restrictions on the usage of role names appearing in role inclusions in the presence of the role chaining constructor). In what follows, we denote by 푀표푑(풪) the set of models of 풪. Moreover, given two ontologies 풪 and 풪′, we say that 풪 and 풪′ are logically equivalent if 푀표푑(풪) = 푀표푑(풪′).

Chapter 4

Installation Scenarios

As mentioned in Chapter 1, the ontology and mappings are the backbone of an OBDA system. Consequently, the problem of obtaining a suitable ontology and mappings has to be addressed in any implementation of the OBDA approach. Optique's O&M module provides semi-automatic support for the creation of the ontology and mappings by means of different installation scenarios, which are schematically depicted in Figure 4.1.

[Figure 4.1 sketches the four installation scenarios: (a) bootstrapping the ontology and mappings from the database; (b) aligning the bootstrapped ontology with an imported one; (c) linking an imported ontology directly to the database via layering; (d) manual extensions of the ontology and mappings.]

Figure 4.1: Installation scenarios

For example, the installation process usually starts with bootstrapping, i.e., automatic extraction of an ontology and mappings from the database (see Sections 5.2 and 5.3 for details), as depicted in Figure 4.1(a). Due to the nature of the automatic bootstrapping process, the bootstrapped ontology closely reflects the structure of the underlying database, and thus it may not be ideal for query formulation support. To overcome this issue, the O&M bootstrapper allows importing pre-existing external domain ontologies, whose vocabulary is preferred by the domain experts, and 'connecting' them to the bootstrapped one via ontology alignment (see Section 5.4 for details), as depicted in Figure 4.1(b). Another possible installation scenario is to layer a pre-existing domain ontology directly over the database (see Section 6.1 for details), i.e., to 'connect' it to the database schema with semi-automatically generated mappings (Figure 4.1(c)).

Both importing scenarios can address the case when there are several good domain ontologies available, which can serve as entry points for users with potentially different needs. Finally, one can manually edit and extend both the ontology and the mappings (see Section 6.3 for details), as depicted in Figure 4.1(d). The Optique platform supports ontologies expressible in the OWL 2 QL profile of the OWL 2 ontology language, which was specifically designed for efficient data access. Imported or layered ontologies that go beyond the expressivity of OWL 2 QL are approximated using the technique described in Section 5.5, which makes it possible to project rich input OWL 2 ontologies into the OWL 2 QL profile.

Chapter 5

Bootstrapping Techniques

This chapter describes the techniques designed for the O&M bootstrapper. The O&M bootstrapper takes a set of database schemata as input and returns an OWL ontology and a set of (direct) R2RML mappings. Intuitively, the mappings relate tables and attributes in the schema to classes and properties in the ontology.

5.1 Running example

Figure 5.1 shows a fragment of a database schema based on the domain of the Optique project. The schema consists of seven tables that store information about wellbores. For example, a Wellbore is given a name, has a content (e.g., GAS or OIL), belongs to a Well, has one Operator, and is located in a Field. Additionally, the schema also stores information about types of wellbores. The examples in the subsequent sections are based on this schema.

Figure 5.1: Fragment of a relational database schema
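Since the figure itself is not reproduced here, the following SQL sketch (our reconstruction; the table and column names follow the prose and the mapping examples below, while the types, the NOT NULL declarations other than those mentioned in Section 5.3, and the omitted wellbore-type table are assumptions) gives a possible reading of the schema fragment:

CREATE TABLE Field (
  id   INTEGER PRIMARY KEY,
  name VARCHAR(100)
);

CREATE TABLE Operator (
  id   INTEGER PRIMARY KEY,
  name VARCHAR(100)
);

-- many-to-many table: its primary key consists of two foreign keys
CREATE TABLE Field_Operator (
  fieldId    INTEGER REFERENCES Field(id),
  operatorId INTEGER REFERENCES Operator(id),
  PRIMARY KEY (fieldId, operatorId)
);

CREATE TABLE Well (
  id   INTEGER PRIMARY KEY,
  name VARCHAR(100)
);

CREATE TABLE Wellbore (
  id          INTEGER PRIMARY KEY,
  name        VARCHAR(100) NOT NULL,
  content     VARCHAR(20),              -- e.g. GAS or OIL
  belongsTo   INTEGER REFERENCES Well(id),
  hasOperator INTEGER NOT NULL REFERENCES Operator(id),
  locatedIn   INTEGER REFERENCES Field(id)
);

-- the primary key is also a foreign key to Wellbore (cf. Section 5.3.1)
CREATE TABLE DevelopmentWellbore (
  id INTEGER PRIMARY KEY REFERENCES Wellbore(id)
);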


5.2 Bootstrapping of Mappings

The bootstrapping module follows the W3C direct mapping specification, introduced in Section 3.3.1, in order to be compliant with the W3C standards. The W3C direct mapping specification provides a set of cases and guidelines1 for generating R2RML mappings, for example, how to deal with missing primary keys, many-to-many tables, composite keys, etc. Appendix A summarizes the cases that we have considered in the mapping bootstrapper. Note that some types of mappings have not been considered since they require (directly or indirectly) knowledge about the data (i.e., use of complex logical tables) or they have been left for the extended bootstrapper (i.e., null treatment). Some other types of mappings are considered optional, like the generation of blank nodes when the primary key is missing. For example, consider the running example in Figure 5.1. The following three triples maps are associated with the many-to-many table Field_Operator. Note that many-to-many tables are special cases within the direct mapping directives, which do not imply a direct mapping from a table to an ontology class.

:TriplesMap1 a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "Field" ] ;
    rr:subjectMap [
        rr:template "http://base_uri/Field/{id}" ;
        rr:class <http://base_uri/Field>
    ] ;
    rr:predicateObjectMap [
        rr:predicate <http://base_uri/name> ;
        rr:objectMap [ rr:column "name" ]
    ] .

:TriplesMap2 a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "Operator" ] ;
    rr:subjectMap [
        rr:template "http://base_uri/Operator/{id}" ;
        rr:class <http://base_uri/Operator>
    ] ;
    rr:predicateObjectMap [
        rr:predicate <http://base_uri/name> ;
        rr:objectMap [ rr:column "name" ]
    ] .

:TriplesMap3 a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "Field_Operator" ] ;
    rr:subjectMap [
        rr:template "http://base_uri/Field/{fieldId}" ;
        rr:class <http://base_uri/Field>
    ] ;
    rr:predicateObjectMap [
        rr:predicate <http://base_uri/hasOperator> ;
        rr:objectMap [ rr:template "http://base_uri/Operator/{operatorId}" ]
    ] .

Note that our ontology and mapping bootstrapping techniques minimise the effect of the so-called impedance mismatch problem [59], which is caused by the fact that databases store data values (e.g., strings, integers, etc.) while the ontology contains objects uniquely identified by URIs. We address the problem at the level of the object-generating functions discussed in Section 3.3. Our function respects primary and foreign keys and ensures that all generated objects are unique and that the same object is generated when required.

1 http://www.w3.org/TR/rdb2rdf-test-cases/

The R2RML language provides mechanisms (i.e., templates) to implement the object-generating functions. For example, the basic rule to generate the URIs of the subjects in a triples map is the following:

http://base_uri/Table_name/{pk1}/{pk2}/.../{pkn}

If the table does not contain a primary key, the template URI is created using all the columns in the table:

http://base_uri/Table_name/{col1}/{col2}/.../{coln}

Note that many-to-many tables are special cases where no new subject URIs are defined. When dealing with foreign keys, the URI of the object in the triples map associated with the referencing table should be generated according to the URIs of the subjects of the referenced table, e.g.:

http://base_uri/Referenced_Table/{fk_col1}/{fk_col2}/.../{fk_coln}

In the example given above, the subjects and objects of TriplesMap3 use the same URIs as the subjects in TriplesMap1 and TriplesMap2, respectively.
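As an illustration of these template rules, a hypothetical table WellboreInterval with the composite primary key (wellboreId, intervalNo) would receive a subject map such as the following (the table, column, and class names are ours):

:TriplesMapX a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "WellboreInterval" ] ;
    rr:subjectMap [
        # one template slot per primary-key column
        rr:template "http://base_uri/WellboreInterval/{wellboreId}/{intervalNo}" ;
        rr:class <http://base_uri/WellboreInterval>
    ] .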

5.2.1 Layering an Existing Ontology
The W3C direct mapping specification does not introduce specific restrictions on the ontology used. The R2RML mappings, for example, only need to reference the vocabulary (i.e., entity URIs) of an existing ontology. Thus, the referenced ontology entities can come from a pre-existing external ontology and/or from the results of the (ontology) bootstrapping (see Section 5.3). For example, consider the running example in Figure 5.1. The table 퐹푖푒푙푑 could be mapped both to the bootstrapped ontology entity http://base_uri/Field and to the entity http://sws.ifi.uio.no/vocab/npd-v2#Field from the Norwegian Petroleum Directorate (NPD) ontology [68], as in the following mapping.

:TriplesMap4 a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "Field" ] ;
    rr:subjectMap [
        rr:template "http://base_uri/Field/{id}" ;
        rr:class <http://base_uri/Field> , <http://sws.ifi.uio.no/vocab/npd-v2#Field>
    ] .

The (automatic) layering of a pre-existing ontology (i.e., 'connecting' it to the database schema with direct mappings) requires the discovery of lexical correspondences between table names and ontology class names, and between attribute names and ontology property names. Section 6.1 presents a semi-automatic process to validate the discovered mappings. This (installation) scenario can address the case when there are several good ontologies available, which can serve as entry points to the data for users with potentially different needs.

5.3 Bootstrapping of Ontologies

The W3C direct mapping specification does not introduce specific restrictions on the axioms of the bootstrapped ontology. As discussed in Chapter 2, there are several solutions in the literature to infer axioms from the database schema constraints. Intuitively, a basic bootstrapper would do the following: (i) translate each relation or table 푅 into an OWL class; (ii) translate each attribute 푎 not involved in a foreign key into an OWL datatype property; and (iii) translate each foreign key into an OWL object property. Next we present our solution.

• Vocabulary extraction


– Naming conventions. As in the R2RML mappings, the use of a proper naming convention is very important. Table and attribute names from a relational database schema are translated into URIs, which are the unique identifiers for OWL entities (i.e., classes, data properties, object properties and instances). uri(푅) and uri(푎) denote the URI of a relation 푅 and an attribute 푎, respectively. The URIs uri(푅) are created using a base URI (i.e., http://base_uri/) and the name of the table (e.g., http://base_uri/Field, uri:Field or field for short). The URIs uri(푎) are created using the base URI, the name of the table of the attribute 푎, and the name of the attribute 푎.

– OWL classes. Each relation or table 푅 (apart from many-to-many tables) is translated into an OWL class. For example, for the schema depicted in Figure 5.1, the bootstrapper would create 6 classes.

– OWL data properties. Each attribute not involved in a foreign key is translated into an OWL data property. For example, the attribute 푛푎푚푒 in the 퐹푖푒푙푑 table in Figure 5.1 is represented with a data property with URI field:name.

– OWL object properties. Each foreign key is translated into an OWL object property. For example, the foreign key attribute 푙표푐푎푡푒푑퐼푛 in the relation 푊푒푙푙푏표푟푒 is represented with the object property with URI wellbore:locatedIn.

• Propagation of constraints

– Domain axioms. Each data and object property is associated with an OWL property domain axiom. For example, the property field:name, representing the attribute 푛푎푚푒 in the table 퐹푖푒푙푑 (see Figure 5.1), is given the class uri:Field as domain. Thus the axiom DataPropertyDomain(field:name uri:Field) is added to the bootstrapped ontology.

– Data range axioms. The type of an attribute, type(푎), is represented as an OWL data property range axiom. For example, type(푛푎푚푒) in the table 퐹푖푒푙푑 (see Figure 5.1) is represented as the ontology axiom DataPropertyRange(field:name xsd:string).

– Object range axioms. Foreign keys link one table to another, so the range of an object property is the destination table of the source foreign key. For example, the foreign key 푙표푐푎푡푒푑퐼푛 in the relation 푊푒푙푙푏표푟푒 (see Figure 5.1) points to the relation 퐹푖푒푙푑, and hence the OWL object property range axiom ObjectPropertyRange(wellbore:locatedIn uri:Field) is added to the bootstrapped ontology.

– Existential restrictions. Non-nullable attributes 푎 are translated into OWL existential restrictions. For example, in our running example, the 푊푒푙푙푏표푟푒 attributes 푛푎푚푒 and ℎ푎푠푂푝푒푟푎푡표푟 are declared as not nullable. Hence the bootstrapped ontology is extended with the following OWL axioms: uri:Wellbore ⊑ ∃wellbore:name.xsd:String and uri:Wellbore ⊑ ∃wellbore:hasOperator.uri:Operator.

– Object properties. Two new OWL object properties are created. One represents the rela- tionship between the referenced table 푇1 to the referenced table 푇2, while the second property represents the inversere relationship. – Domain and range axioms. The new object properties are associated a domain and a range, which are the corresponding OWL classes of the referenced tables 푇1 and 푇2 in the many-to-many relation, respectively.

23 Optique Deliverable D4.2 Techniques for Ontology and Mapping Bootstrapping

– Inverse property axioms. The two new properties are explicitly declared as inverses of each other.

For example, in the running example depicted in Figure 5.1, the many-to-many table Field_Operator leads to the creation of the object properties ℎ푎푠푂푝푒푟푎푡표푟 and 푖푠푂푝푒푟푎푡표푟푂푓, one the inverse of the other. The domain and range of ℎ푎푠푂푝푒푟푎푡표푟 are 퐹푖푒푙푑 and 푂푝푒푟푎푡표푟, respectively, while 푖푠푂푝푒푟푎푡표푟푂푓 has 푂푝푒푟푎푡표푟 as domain and 퐹푖푒푙푑 as range.
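Putting the above rules together, the following Turtle sketch shows the kind of axioms the bootstrapper would emit for the Field/Operator fragment of the running example (the prefix choices and the exact form of the attribute-property URIs are illustrative):

@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix :     <http://base_uri/> .

:Field    a owl:Class .
:Operator a owl:Class .

# data property from the non-foreign-key attribute Field.name
<http://base_uri/Field/name> a owl:DatatypeProperty ;
    rdfs:domain :Field ;
    rdfs:range  xsd:string .

# object properties derived from the many-to-many table Field_Operator
:hasOperator a owl:ObjectProperty ;
    rdfs:domain :Field ;
    rdfs:range  :Operator ;
    owl:inverseOf :isOperatorOf .

:isOperatorOf a owl:ObjectProperty ;
    rdfs:domain :Operator ;
    rdfs:range  :Field .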

5.3.1 Adding hierarchy to the ontology classes
The O&M bootstrapper performs a basic discovery of potential subclass relationships between the classes in the ontology. This automatic discovery may require a manual assessment to validate the new subclass relationships (a Turtle sketch of suggested axioms follows the list). For every pair of tables 푅 and 푇:

(i) if pk(푅) = fk(푅, 푇) then the OWL class axiom SubClassOf (uri(R) uri(T)) is suggested. For instance, in the example schema given in Figure 5.1, the 퐷푒푣푒푙표푝푚푒푛푡푊푒푙푙푏표푟푒 table's primary key is also a foreign key pointing to the 푊푒푙푙푏표푟푒 table. In such cases it turns out to be an effective heuristic to add a class inclusion axiom between the corresponding classes (i.e., DevelopmentWellbore ⊑ Wellbore).

(ii) if attr(푅) ⊂ attr(푇 ) then the OWL class axiom SubClassOf (uri(T) uri(R)) is suggested.

(iii) if attr(푅)∩attr(푇 ) is a superkey for both 푅 and 푇 then we suggest a new class uri(푆푅푇 ) and the axioms SubClassOf (uri(R) uri(SRT)) and SubClassOf (uri(T) uri(SRT)) to be added to the ontology.
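For illustration, the suggestions produced by heuristics (i) and (iii) could look as follows in Turtle (the fresh superclass name :WellboreOrWell for heuristic (iii) is hypothetical, as is the assumption that Wellbore and Well share a superkey):

@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix :     <http://base_uri/> .

# heuristic (i): pk(DevelopmentWellbore) is a foreign key to Wellbore
:DevelopmentWellbore rdfs:subClassOf :Wellbore .

# heuristic (iii): two tables whose shared attributes form a superkey
# of both get a fresh common superclass suggested for each of them
:WellboreOrWell a owl:Class .
:Wellbore rdfs:subClassOf :WellboreOrWell .
:Well     rdfs:subClassOf :WellboreOrWell .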

5.3.2 Dealing with multiple properties with the same name
The O&M bootstrapper associates a unique URI to each attribute. For example, consider the schema given in Figure 5.1: the attribute 푛푎푚푒 in the relation 퐹푖푒푙푑 and the attribute with the same name in the relation 푂푝푒푟푎푡표푟 get different URIs in the bootstrapped ontology. In our experiments we observed that in some cases the introduction of different URIs for the same attribute names was highly redundant. There are several candidate solutions to avoid such redundancy:

Compact attribute names If the O&M bootstrapper works under the compact attribute names regime, then compatible attributes (i.e., those with the same type and name) are given the same URI. Hence, in the ontology, the domain of such an attribute will be a union of classes. In the case of foreign keys, the range of the derived object properties in the ontology will also be a union of classes. Note that disjunction is outside the OWL 2 QL profile. Alternatively, instead of adding multiple domains and ranges, we can create existential restrictions attached to each domain class. For example, for the attribute 푛푎푚푒 in the relation 푊푒푙푙푏표푟푒 we could add the axiom uri:Wellbore ⊑ ∃uri:name.xsd:String. This solution can be problematic if the attribute is nullable. In Section 5.3.3 a complementary solution using annotation properties is also proposed.

Adding a super property A simpler solution involves the addition of a super-property that groups compatible properties. For example, the properties 푤푒푙푙:푛푎푚푒, 푤푒푙푙푏표푟푒:푛푎푚푒, 푓푖푒푙푑:푛푎푚푒 and 표푝푒푟푎푡표푟:푛푎푚푒 are grouped under the new property 푢푟푖:푛푎푚푒.
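In Turtle, the super-property solution amounts to a handful of rdfs:subPropertyOf assertions (the URI forms are illustrative):

@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix :     <http://base_uri/> .

# the new shared super-property uri:name
:name a owl:DatatypeProperty .

<http://base_uri/Well/name>     rdfs:subPropertyOf :name .
<http://base_uri/Wellbore/name> rdfs:subPropertyOf :name .
<http://base_uri/Field/name>    rdfs:subPropertyOf :name .
<http://base_uri/Operator/name> rdfs:subPropertyOf :name .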

5.3.3 Annotation Schema for the Query Formulation Interface
The Optique Visual Query Formulation Interface (OptiqueVQS) is driven by the information available in the (bootstrapped) ontology. We observed in our user study of OptiqueVQS [71] that a purely axiom-driven query interface suffers from important practical limitations; e.g., it does not allow users to set specific data values in queries, such as company/operator names. To address this issue we have enriched the ontology with annotations.2 For example, we precomputed values that are frequently used, rarely changed, and from relatively small domains; this includes names of companies and oilfields, geolocations, temporal information, and ranges of numerical values, e.g., the min/max possible depth of wellbores. OptiqueVQS has also been extended with a module that automatically customizes the query interface by displaying data values as pre-populated dropdown lists and range sliders. In addition, as mentioned in Section 5.3.2, multiple property domains and ranges (i.e., a disjunction of classes as domain or range) are not permitted in OWL 2 QL. Thus an OWL 2 QL approximator such as the one presented in Section 5.5 will approximate or remove these axioms. However, although this information may not have a crucial impact on the rewriting process, it does have an important role in OptiqueVQS, as do the lists of values and numerical ranges in an OWL data property range. Hence, in order to keep this non-OWL 2 QL information, we have added annotations to the ontology about the multiple domains and ranges. The annotations have been defined using the following annotation schema based on OWL 2 annotation axioms3 (a short usage sketch follows the list):

2 Note that lists of values and numerical ranges in an OWL data property range fall outside OWL 2 QL, and thus they should be encoded as non-logical axioms.

• http://eu.optique.ontology/annotations#geoLocation: this annotation property is used to annotate classes carrying geo-location information, such as fields or wellbores.

• http://eu.optique.ontology/annotations#temporal: this annotation property is used to annotate classes with temporal information such as events or measurements.

• http://eu.optique.ontology/annotations#data_values: this annotation property annotates data properties with specific data range values such as company names or field names.

• http://eu.optique.ontology/annotations#range_class: this annotation property annotates object properties with class ranges. For example, the property uri:hasOperator is annotated with Operator as range class.

• http://eu.optique.ontology/annotations#domain_class: this property annotates properties with domains. For example, the property uri:name is annotated with Operator and Field as domain classes. Additionally, this annotation property will also annotate annotation axioms using range_class or data_values; that is, the annotation axiom for data_values representing the list of company names is annotated with Company as domain class.

• http://eu.optique.ontology/annotations#min_values: this annotation property indicates the minimum value in a range of numerical values for a data property.

• http://eu.optique.ontology/annotations#max_values: this annotation property indicates the maximum value in a range of numerical values for a data property.

• http://eu.optique.ontology/annotations#hidden: this annotation property indicates whether a data property should be hidden from visualization in the query formulation interface.
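As an illustration, here is a small rdflib sketch that records the domain classes of uri:name with the domain_class annotation property; the base namespace http://base_uri/ follows the running example, and the exact property URIs are assumptions:

    from rdflib import Graph, Namespace
    from rdflib.namespace import OWL, RDF

    ANN = Namespace("http://eu.optique.ontology/annotations#")
    BASE = Namespace("http://base_uri/")   # hypothetical bootstrapped namespace

    g = Graph()
    g.bind("ann", ANN)

    # Declare the annotation property and record the (non-OWL 2 QL) fact that
    # uri:name has both Operator and Field as domain classes
    g.add((ANN.domain_class, RDF.type, OWL.AnnotationProperty))
    g.add((BASE.name, ANN.domain_class, BASE.Operator))
    g.add((BASE.name, ANN.domain_class, BASE.Field))

    print(g.serialize(format="turtle"))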

5.4 Enhancing Bootstrapping with External Ontology

The bootstrapped ontology, although enriched with logical axioms as described in Section 5.3, will usually still be too close to the source schema. It is, however, increasingly often the case that a high-quality ontology of (parts of) the domain already exists that captures the domain experts' vocabulary better than the directly mapped ontology. When such a high-quality ontology is available, the bootstrapping component allows importing it and using it in the bootstrapping process (i.e., aligning the directly mapped and the imported ontologies). Special care needs to be taken to avoid introducing unwanted consequences: for instance, the bootstrapper will avoid adding alignment axioms that would lead to inconsistencies, or to faulty consequences like Well ⊑ WellBore or Well ⊑ ⊥ that are not supported by the domain ontology. Note that these unintended consequences may hinder the usefulness of the generated ontology alignments, since they may affect the quality of the results when performing OBDA queries over the vocabulary of the aligned ontology.⁴


Algorithm 1: Algorithm to detect and solve conservativity principle violations

1  Input: 풪1, 풪2: input ontologies; 풜: (coherent) input mappings
2  Output: 풜′: output mappings; ℛ≈: approximate repair; disj: number of disjointness rules
3  ⟨풪1′, 풪2′⟩ := ModuleExtractor(풪1, 풪2, 풜)
4  ⟨풫1, 풫2⟩ := PropositionalEncoding(풪1′, 풪2′)
5  SI1 := StructuralIndex(풪1′);  SI2 := StructuralIndex(풪2′)
6  SIU := StructuralIndex(풪1′ ∪ 풪2′ ∪ 풜)
7  ⟨풫1^d, disj1⟩ := DisjointAxiomsExtension(풫1, SI1, SIU)
8  ⟨풫2^d, disj2⟩ := DisjointAxiomsExtension(풫2, SI2, SIU)
9  ⟨풜′, ℛ≈⟩ := MappingRepair(풫1^d, 풫2^d, 풜)    ◇ see Algorithm 2 in [30] (Appendix G)
10 return ⟨풜′, ℛ≈, disj1 + disj2⟩


5.4.1 Ontology Alignment

Domain ontologies may be complex, describing hundreds or even thousands of classes; as a result, computing alignments between the directly mapped and the imported ontologies may be infeasible without suitable tool support. Thus, the alignment process in the O&M module relies on the LogMap system [38, 40]. LogMap is a highly scalable ontology matching system that discovers ontology-to-ontology alignments, e.g., class inclusions, between the vocabularies of the input ontologies. LogMap can efficiently match semantically rich ontologies containing hundreds of thousands of classes, and it is one of the few tools that integrates reasoning techniques to minimise the number of unintended logical errors. For use cases requiring very accurate alignments, LogMap also supports user interaction during the alignment process in order to keep the number of wrong correspondences to a minimum. This is an important step in the validation of the bootstrapped ontology (see Section 6.2). LogMap obtained very good results in the last Ontology Alignment Evaluation Initiative⁵ (OAEI) campaigns and was among the top 3 of the 23 participating systems. The OAEI is an annual international campaign for the systematic evaluation of ontology alignment systems [5]. Appendices D and E provide an overview of the techniques implemented in LogMap and the results obtained in the OAEI 2013 and 2014 campaigns.

5.4.2 Alignment Repair

LogMap also implements automatic repair techniques [69, 30] (see Appendices F and G) to avoid unintended logical consequences, such as the ones described above, after the alignment. As introduced in Section 3.4.2, these logical errors are characterised as violations of the consistency and conservativity principles. In this section we focus on the implemented methods to detect and repair violations of the conservativity principle.

As introduced in Section 3.4.2, following our notion of conservativity, we have reduced the problem of detecting and solving conservativity principle violations to an alignment (incoherence) repair problem. Algorithm 1 shows the pseudocode of the implemented method. The algorithm accepts as input two OWL 2 ontologies, 풪1 and 풪2, and an alignment 풜⁶ which is coherent w.r.t. 풪1 and 풪2. The problem size is reduced by extracting two locality-based modules [22] (풪1′ and 풪2′) using the entities involved in the alignment 풜 as seed signatures (line 3, Algorithm 1). The output is the number disj of disjointness rules added during the process, the repaired alignment 풜′, and an approximate repair ℛ≈ s.t. 풜′ = 풜 ∖ ℛ≈. The repair ℛ≈ aims at solving most of the basic violations of 풜 w.r.t. 풪1 and 풪2. We next describe the techniques used in each step.

⁴Section 3 of the paper in Appendix F [69] presents an example of the potential negative impact of undesired consequences.
⁵http://oaei.ontologymatching.org/
⁶Note that 풜 may be the result of a prior alignment (incoherence) repair process.


Propositional Horn Encoding. The modules 풪1′ and 풪2′ are encoded as Horn propositional theories, 풫1 and 풫2 (line 4 in Algorithm 1). This encoding includes rules of the form A1 ∧ ... ∧ An → B. For example, the concept hierarchy provided by an OWL 2 reasoner (e.g., [55, 42]) is encoded as A → B rules, while the explicit disjointness relationships between concepts are represented as Ai ∧ Aj → ⊥. Note that the input alignment 풜 can already be seen as a set of propositional implications.

Structural Index. The concept hierarchies provided by an OWL 2 reasoner (excluding ⊥) and the explicit disjointness axioms of the modules 풪1′ and 풪2′ are efficiently indexed using an interval labelling schema [4] (line 5 in Algorithm 1). This structural index allows us to answer many entailment queries over the concept hierarchy as an index lookup operation, i.e., without the need of an OWL 2 reasoner: for example, to check whether two propositional variables (i.e., ontology classes) are disjoint (areDisj(SI, A, B)), whether they are in a sub/super-class relationship (inSubSupRel(SI, A, B)), or whether they share a descendant (shareDesc(SI, A, B)).

Disjointness Axioms Extension. In order to reduce the conservativity problem to an alignment incoherence repair problem, following the notion of assumption of disjointness, we need to automatically add sufficient disjointness axioms to each module 풪i′. However, additional disjointness axioms δ may lead to unsatisfiable classes in 풪i′ ∪ δ. To avoid extensive use of a costly OWL 2 reasoner, our method exploits the propositional encoding and structural indexing of the input ontologies: checking for unsatisfiabilities introduced by candidate disjointness axioms in 풪i′ ∪ δ is restricted to the Horn propositional case. We have implemented an algorithm that extends the propositional theories 풫1 and 풫2 with disjointness rules of the form A ∧ B → ⊥ (lines 7-8 in Algorithm 1). This algorithm guarantees that, for every propositional variable A in the extended propositional theory 풫i^d (with i ∈ {1, 2}), the theory 풫i^d ∪ {true → A} is satisfiable. Note that this does not necessarily hold when considering the OWL 2 ontology modules 풪1′ and 풪2′, as discussed above.

Repair. Line 9 of Algorithm 1 uses the mapping (incoherence) repair algorithm presented in [38] for the extended Horn propositional theories 풫1^d and 풫2^d and the input mappings 풜. The mapping repair process exploits the Dowling-Gallier algorithm for propositional Horn satisfiability [28] and checks, for every propositional variable A ∈ 풫1^d ∪ 풫2^d, the satisfiability of the propositional theory 풫A = 풫1^d ∪ 풫2^d ∪ 풜 ∪ {true → A}. Satisfiability of 풫A is checked in worst-case linear time in the size of 풫A, and the number of Dowling-Gallier calls is also linear in the number of propositional variables in 풫1^d ∪ 풫2^d. The algorithm records the conflicting mappings involved in each unsatisfiability, which are considered in the subsequent repair process: the unsatisfiability is fixed by removing some of the identified mappings, using mapping confidence as a differentiating factor in case of multiple options.

Algorithm 1 outputs the number disj of added disjointness rules, a set of mappings 풜′, and an (approximate) repair ℛ≈ such that 풜′ = 풜 ∖ ℛ≈. 풜′ is coherent w.r.t. 풫1^d and 풫2^d. Furthermore, the propositional theory 풫1 ∪ 풫2 ∪ 풜′ does not contain any conservativity principle violation w.r.t. 풫1 and 풫2.
However, our incomplete encoding cannot guarantee that 풪1′ ∪ 풪2′ ∪ 풜′ does not contain violations w.r.t. 풪1′ and 풪2′. Nonetheless, our evaluation suggests that the number of violations remaining after repair is typically small (see Section 5 in [69], Appendix F).
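To make the satisfiability test at the heart of the repair loop concrete, here is a minimal Python sketch: a simple unit-propagation loop over Horn rules (the actual implementation uses the linear-time Dowling-Gallier algorithm; the rule and variable names are illustrative only).

    def horn_satisfiable(rules, assumed_true):
        """rules: iterable of (body, head) pairs, where body is a frozenset of
        propositional variables and head is a variable, or None to encode a
        disjointness rule A ∧ B → ⊥. Returns False iff ⊥ is derivable."""
        true = set(assumed_true)
        changed = True
        while changed:
            changed = False
            for body, head in rules:
                if body <= true:
                    if head is None:
                        return False  # a disjointness rule fired: unsatisfiable
                    if head not in true:
                        true.add(head)
                        changed = True
        return True

    # P_A = P1^d ∪ P2^d ∪ A ∪ {true → A}: test satisfiability for variable "A"
    rules = [(frozenset({"A"}), "B"),        # A → B         (hierarchy)
             (frozenset({"A"}), "C"),        # A → C         (mapping)
             (frozenset({"B", "C"}), None)]  # B ∧ C → ⊥     (disjointness)
    print(horn_satisfiable(rules, {"A"}))    # False: "A" is unsatisfiable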

5.5 Ontology Approximation

The Optique platform supports ontologies expressible in the OWL 2 QL profile of the OWL 2 ontology language, which was specifically designed for efficient data access. Imported or layered OWL 2 ontologies that cannot be captured in OWL 2 QL are automatically approximated into OWL 2 QL. In this section we first provide a general, parametric, and semantically well-founded definition of the maximal sound approximation of a DL ontology. Our semantic definition captures and generalizes previous approaches to ontology approximation [13, 19, 51, 56]. In particular, our approach builds on the preliminary work presented in [19], which proposed a similar, although non-parameterized, notion of maximal sound approximation.


Then we present an algorithm for computing maximal sound approximations according to the above parametric semantics when the source ontology language is OWL 2 and the target ontology language is OWL 2 QL. In particular, we focus on the local semantic approximation (LSA) and the global semantic approximation (GSA) of a source ontology. These two notions of approximation correspond to the cases when the parameter of our semantics is, respectively, minimal and maximal. Informally, the LSA of an ontology is obtained by considering (and reasoning over) one axiom α of the source ontology at a time, so this technique tries to approximate α independently of the rest of the source ontology. On the contrary, the GSA tries to approximate the source ontology by considering all its axioms (and reasoning over them) at the same time. As a consequence, the GSA is potentially able to approximate “better” than the LSA, while the LSA is in principle computationally less expensive than the GSA. Notably, in the case of OWL 2 QL, the result of approximating an ontology according to the GSA is logically equivalent to the one obtained according to the notion of approximation given in [56], which has been shown to be very well-suited for query answering purposes. An empirical evaluation of our methods can be found in Appendix J.

5.5.1 Global semantic approximation

In what follows, we illustrate our notion of the approximation, in a target language ℒT, of an ontology 풪S in a language ℒS. Typically, when discussing approximation, one of the desirable properties is soundness: roughly speaking, when the object of approximation is a set of models, this property requires that the set of models of the approximation is a superset of the models of the original ontology. Another coveted characteristic of the computed ontology is that it be the “best” approximation of the original ontology; in other words, the distance between the original ontology and the ontology resulting from its approximation should be minimal. On the basis of these observations, the following definition of the approximation in a target language ℒT of a satisfiable ℒS-ontology is very natural.

Definition 5.5.1 Let 풪S be a satisfiable ℒS-ontology, and let Σ풪S be the set of predicate and constant symbols occurring in 풪S. An ℒT-ontology 풪T over Σ풪S is a global semantic approximation (GSA) in ℒT of 풪S if both the following statements hold:

(i) Mod(풪S) ⊆ Mod(풪T);

(ii) there is no ℒT-ontology 풪′ over Σ풪S such that Mod(풪S) ⊆ Mod(풪′) ⊂ Mod(풪T).

We denote by globalApx(풪S, ℒT) the set of all the GSAs in ℒT of 풪S.

In the above definition, statement (i) imposes the soundness of the approximation, while statement (ii) imposes the condition of “closeness” in the choice of the approximation.

We observe that an ℒT-ontology which is the GSA in ℒT of 풪S may not exist. This is the case when, for each ℒT-ontology 풪T′ satisfying statement (i) of Definition 5.5.1, there always exists an ℒT-ontology 풪T′′ which satisfies statement (i), but for which we have that Mod(풪S) ⊆ Mod(풪T′′) ⊂ Mod(풪T′). As shown in the paper in Appendix H, a sufficient condition for the existence of the GSA in a language ℒT of an ontology 풪S is that the set of non-equivalent axioms in Axioms(ℒT) that one can generate over Σ is finite. In the paper it is also shown that if ℒT is a closed language, then for each 풪′ and 풪′′ belonging to globalApx(풪S, ℒT), we have that 풪′ and 풪′′ are logically equivalent. In other words, if the target language is closed, then, up to logical equivalence, the GSA is unique.

The notion of entailment set [56], which we introduce below, allows us to provide more constructive conditions, equivalent to those in Definition 5.5.1.

Definition 5.5.2 Let Σ풪 be the set of predicate and constant symbols occurring in 풪, and let ℒ′ be a language. The entailment set of 풪 with respect to ℒ′, denoted ES(풪, ℒ′), is the set of axioms from Axioms(ℒ′) that only contain predicates and constant symbols from Σ풪 and that are entailed by 풪.


In other words, an axiom α belongs to the entailment set of an ontology 풪 with respect to a language ℒ′ if α is an axiom in Axioms(ℒ′) built over the signature of 풪 and for each interpretation ℐ ∈ Mod(풪) we have that ℐ ⊧ α. Clearly, given an ontology 풪 and a language ℒ′, the entailment set of 풪 with respect to ℒ′ is unique.

We have that, given a satisfiable ℒS-ontology 풪S and a satisfiable ℒT-ontology 풪T, the following conditions hold:

(a) Mod(풪S) ⊆ Mod(풪T) if and only if ES(풪T, ℒT) ⊆ ES(풪S, ℒT);

(b) there is no ℒT-ontology 풪′ such that Mod(풪S) ⊆ Mod(풪′) ⊂ Mod(풪T) if and only if there is no ℒT-ontology 풪′′ such that ES(풪T, ℒT) ⊂ ES(풪′′, ℒT) ⊆ ES(풪S, ℒT).

Therefore, every ontology 풪T which is a GSA in ℒT of an ontology 풪S is also an approximation in ℒT of 풪S according to the paper in Appendix I, and, for some languages, this also corresponds to the approximation in [56].

5.5.2 K-approximation

The computation of a GSA can be a very challenging task, even when approximating into tractable fragments of OWL 2 [54], because it is necessary to reason over the ontology as a whole. Consequently, we introduce a new notion of approximation in which we do not reason over the entire ontology but only over portions of it. At the basis of this new notion, which we call k-approximation, is the idea of obtaining an approximation of the original ontology by computing the global semantic approximation of each set of k axioms of the original ontology in isolation. Below we give a formal definition of the k-approximation. In what follows, given an ontology 풪 and a positive integer k such that k ≤ |풪|, we denote by subset_k(풪) the set of all sets of k axioms of 풪.

Definition 5.5.3 Let 풪S be a satisfiable ℒS-ontology and let Σ풪S be the set of predicate and constant symbols occurring in 풪S. Let 풰k = {풪i^j | 풪i^j ∈ globalApx(풪i, ℒT), with 풪i ∈ subset_k(풪S)}. An ℒT-ontology 풪T over Σ풪S is a k-approximation in ℒT of 풪S if both the following statements hold:

• ⋂_{풪i^j ∈ 풰k} Mod(풪i^j) ⊆ Mod(풪T);

• there is no ℒT-ontology 풪′ over Σ풪S such that ⋂_{풪i^j ∈ 풰k} Mod(풪i^j) ⊆ Mod(풪′) ⊂ Mod(풪T).

Once again, using the notion of entailment set, we can give a constructive condition for the k-approximation. Indeed, given a satisfiable ℒS-ontology 풪S and a satisfiable ℒT-ontology 풪T, both over Σ풪S, we have that 풪T is a k-approximation in ℒT of 풪S if and only if:

(i) ES(풪T, ℒT) ⊆ ES(⋃_{풪i ∈ subset_k(풪S)} ES(풪i, ℒT), ℒT);

(ii) there is no ℒT-ontology 풪′ over Σ풪S such that ES(풪T, ℒT) ⊂ ES(풪′, ℒT) ⊆ ES(⋃_{풪i ∈ subset_k(풪S)} ES(풪i, ℒT), ℒT).

Note that if k = |풪S|, the k-approximation actually coincides with the GSA. At the other end of the spectrum we have the case k = 1: here we treat each axiom α of the original ontology in isolation, i.e., we consider ontologies formed by a single axiom α. We refer to this approximation as the local semantic approximation (LSA).

Example 5.5.1 Consider the following OWL 2 ontology 풪.

풪 = { A ⊑ B ⊔ C,  B ⊑ D,  A ⊑ ∃R.D,  B ⊓ C ⊑ F,  C ⊑ D,  ∃R.D ⊑ E }.


The following ontology is a GSA in OWL 2 QL of 풪.

풪GSA = { A ⊑ D,  B ⊑ D,  A ⊑ ∃R,  A ⊑ ∃R.D,  A ⊑ E,  C ⊑ D,  D ⊑ F }.

Indeed, it is possible to show that each axiom entailed by 풪GSA is also entailed by 풪, and that it is impossible to build an OWL 2 QL ontology 풪′ such that ES(풪GSA, OWL 2 QL) ⊂ ES(풪′, OWL 2 QL) ⊆ ES(풪, OWL 2 QL). Computing the LSA in OWL 2 QL of 풪, i.e., its 1-approximation in OWL 2 QL, we obtain the following ontology:

풪LSA = { B ⊑ D,  A ⊑ ∃R,  C ⊑ D,  A ⊑ ∃R.D }.

The example shows that Mod(풪) ⊂ Mod(풪GSA) ⊂ Mod(풪LSA), which means that the ontology 풪GSA approximates 풪 better than 풪LSA. This expected result is a consequence of the fact that reasoning over each single axiom of 풪 in isolation does not allow for the extraction of all the OWL 2 QL consequences of 풪.

5.5.3 Approximation in OWL 2 QL

The computation of the approximation as defined above can be a very challenging task even when approximating into lower-complexity description logics such as OWL 2 QL or the languages of the DL-Lite family. For an in-depth investigation of the problem of approximating ontologies in languages belonging to this family of description logics, we refer the reader to the paper in Appendix I. Here we focus on the problem of approximating OWL 2 ontologies with OWL 2 QL ontologies.

In this scenario, some of the issues mentioned in the paper in Appendix I no longer present themselves. Specifically, the result of the approximation in OWL 2 QL of an OWL 2 ontology is unique, in the sense that if two ontologies 풪′ and 풪′′ are both approximations in OWL 2 QL of the same OWL 2 ontology 풪, then they are logically equivalent. This is a consequence of the fact that OWL 2 QL is a closed language. Moreover, since the set of non-equivalent OWL 2 QL axioms that one can generate over a signature Σ is finite, the k-approximation in OWL 2 QL of an OWL 2 ontology always exists.

As stated in Theorem 3 of the paper in Appendix H, the k-approximation in OWL 2 QL of an OWL 2 ontology 풪S coincides with the set ⋃_{풪i ∈ subset_k(풪S)} ES(풪i, OWL 2 QL), i.e., the union of the entailment sets with respect to OWL 2 QL of each subset of cardinality k of 풪S.

Algorithm 2: computeKApx(풪, k)

Input: a satisfiable OWL 2 ontology 풪, a positive integer k such that k ≤ |풪|
Output: an OWL 2 QL ontology 풪Apx
begin
    풪Apx ← ∅
    foreach ontology 풪i ∈ subset_k(풪) do
        풪Apx ← 풪Apx ∪ ES(풪i, OWL 2 QL)
    return 풪Apx
end

Notably, we observe that for k = |풪S| the k-approximation 풪T in OWL 2 QL of 풪S coincides with its entailment set in OWL 2 QL. This means that 풪T is also the approximation in OWL 2 QL of 풪S according to the notion of approximation presented in [56]. Therefore, all the properties that hold for the semantics in [56] also hold for the GSA. In particular, the evaluation of a conjunctive query q without non-distinguished variables over 풪S coincides with the evaluation of q over 풪T (Theorem 5 in [56]).


The paper in Appendix H gives Algorithm 2 for computing the k-approximation in OWL 2 QL of an ℒS-ontology 풪S. The computeKApx algorithm first computes every subset of size k of the original ontology 풪S. Then it computes the result of the k-approximation in OWL 2 QL of the input ontology as the union of the entailment sets with respect to OWL 2 QL of each such subset. Clearly, the key step of the algorithm is the computation of the entailment set.
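A compact Python rendering of Algorithm 2; here entailment_set is an assumed oracle that wraps an OWL 2 reasoner and returns ES(·, OWL 2 QL):

    from itertools import combinations

    def compute_k_apx(axioms, k, entailment_set):
        """Sketch of computeKApx: union of the OWL 2 QL entailment sets of
        all k-element subsets of the ontology. `entailment_set` is an
        assumed wrapper around an OWL 2 reasoner."""
        apx = set()
        for subset in combinations(axioms, k):  # enumerates subset_k(O)
            apx |= entailment_set(frozenset(subset))
        return apx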

5.5.4 Computing the entailment set in OWL 2 QL

The computation of ES(풪, OWL 2 QL) is in general very costly, as also highlighted in [13] and [56], since it requires the invocation of reasoning services over an OWL 2 ontology 풪 for checking, for every candidate assertion α, whether 풪 ⊧ α. This task is performed by invoking an OWL 2 oracle, which can be implemented by an OWL 2 reasoner.

A naive algorithm for computing the entailment set with respect to OWL 2 QL can easily be obtained from the one given in [56] for DL-Lite languages. We can summarize it as follows. Let 풪 be an ontology and let Σ풪 be the set of predicate and constant symbols occurring in 풪. The algorithm first computes the set Γ of axioms in Axioms(OWL 2 QL) which can be built over Σ풪, and then, for each axiom α ∈ Γ such that 풪 ⊧ α, adds α to ES(풪, OWL 2 QL). In practice, to check whether 풪 ⊧ α one can use an OWL 2 reasoner. Since each invocation of the OWL 2 reasoner is N2ExpTime, the computation of the entailment set can be very costly [13].

A more efficient technique is the one presented in the paper in Appendix I. The key idea of this technique is to limit the number of invocations of the OWL 2 reasoner by exploiting knowledge acquired through a preliminary exploration of the ontology; this optimization is presented in Section 5 of that paper. To understand the basic idea behind it, consider, for example, an ontology 풪 that entails the inclusions A1 ⊑ A2 and P1 ⊑ P2, where A1 and A2 are concepts and P1 and P2 are roles. Exploiting these inclusions, we can deduce the hierarchical structure of the general concepts that can be built over these four predicates. For instance, we know that ∃P2.A2 ⊑ ∃P2, that ∃P2.A1 ⊑ ∃P2.A2, that ∃P1.A1 ⊑ ∃P2.A1, and so on. We begin by invoking the OWL 2 oracle, asking for the children of the general concepts which are highest in the hierarchy; so we first compute the subsumees of ∃P2 through the OWL 2 reasoner. If there are none, we avoid invoking the oracle for the subsumees of ∃P2.A2, and so on.
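The following Python sketch captures this pruning idea (all function names are illustrative assumptions, not the actual implementation): refinements of a general concept are queried only if the more general concept has subsumees at all.

    def subsumees_with_pruning(onto, concept, refinements, reasoner_subsumees):
        """Collect the subsumees of `concept` and of its refinements (e.g.
        first ∃P2, then ∃P2.A2, ∃P2.A1, ...), pruning the refinements when
        `concept` itself has no subsumees: any X ⊑ ∃P2.A2 would also be a
        subsumee of the more general ∃P2."""
        found = reasoner_subsumees(onto, concept)   # one OWL 2 oracle call
        if not found:
            return {}                               # prune: skip refinements
        result = {concept: found}
        for refined in refinements(concept):
            result.update(subsumees_with_pruning(
                onto, refined, refinements, reasoner_subsumees))
        return result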

5.6 Provenance in Bootstrapped Mappings

Data provenance has been identified as one of the requirements in the Statoil use case (see Deliverable D9.1), since some of the queries require the comparison of data coming from different databases (see Deliverable D9.2). The O&M bootstrapper can automatically attach provenance information to the direct mappings at no extra cost. This information can then be exploited to answer queries involving provenance (e.g., comparing results from different databases).

In this section we present three different levels of granularity at which we can provide provenance annotations. For a particular application, the expert responsible for modeling the domain can choose any of these levels of granularity, or a combination of them, depending on the competency questions [73] that need to be addressed by the OBDA system and the capabilities of the underlying system. We have also extended PROV-O [8] with additional predicates for convenience in the OBDA domain; we refer to these predicates with the “obdaprov:” prefix.

5.6.1 Provenance model

The PROV-O model focuses on three main concepts: prov:Entity, prov:Agent and prov:Activity. The examples in the following sections focus on the relations between entities. Agents and activities could be useful in some particular cases, but in general the agent will be the specific OBDA system (a prov:SoftwareAgent) and the activity will be OBDA query answering or triple materialization, a form of creation (prov:Create). Normally, in an OBDA scenario the main concern is the relation between entities, i.e., between the original information that is accessed and the information that is derived from it (prov:wasDerivedFrom). Therefore our extension of the PROV-O model focuses on entities. In particular, we extend PROV-O with the elements that are particular to an OBDA setting: OBDADatabase, OBDATable, OBDAColumn and OBDARow are subclasses both of prov:Entity and of the corresponding elements of a regular database. These entities relate to each other, making it possible to locate a particular row or column in a table, and a table in a database. We only need to state the specific location of some :OBDAEntity by specifying a set of properties such as hasDatabase, hasTable or hasColumn.

5.6.2 Provenance at URI level

URIs identify the smallest fragments of information produced by OBDA systems. To provide provenance information we can simply state that the provenance of some URI corresponds to some OBDAEntity. A mapping adding this kind of assertion would look like this:

:TriplesMap_prov1
    a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "Field" ] ;
    rr:subjectMap [
        rr:template "http://base_uri/Field/{id}" ;
        rr:class prov:Entity ] ;
    rr:predicateObjectMap [
        rr:predicate prov:wasDerivedFrom ;
        rr:objectMap [ rr:template "http://base_uri/OBDAURI/URI_field_{id}" ] ] .

With this mapping we add a new entity that is the source of the generated URIs, and we can generate all the provenance information for this entity with a new mapping:

:TriplesMap_prov2
    a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "Field" ] ;
    rr:subjectMap [
        rr:template "http://base_uri/OBDAURI/URI_field_{id}" ;
        rr:class prov:Entity ] ;
    rr:predicateObjectMap [
        rr:predicate obdaprov:hasDatabase ;
        rr:objectMap [ rr:template "Database_Running_Example" ] ] ;
    rr:predicateObjectMap [
        rr:predicate obdaprov:hasTable ;
        rr:objectMap [ rr:template "Field" ] ] ;
    rr:predicateObjectMap [
        rr:predicate obdaprov:hasColumn ;
        rr:objectMap [ rr:template "id" ] ] .

It is worth pointing out that several mappings can be responsible for the generation of the same URIs. This is especially important if a join has to be performed between several data sources. Such URIs are derived from several different entities, and their provenance will correspond to the union of the sources responsible for their generation. Additionally, note that rr:template "http://base_uri/OBDAURI/URI_field_{id}" in :TriplesMap_prov1 could have been replaced by a rr:parentTriplesMap reference to :TriplesMap_prov2.


5.6.3 Provenance at triple level

To include provenance information in a mapping assertion, we can include this information as constants in the head of the mapping. This can be done either manually or automatically, if there is some automatic process for the generation of the mappings. Using reification to keep track of this meta-information, we obtain the following result for the first rr:predicateObjectMap of :TriplesMap1 in Section 5.2:

:TriplesMap_prov3
    a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "Field" ] ;
    rr:subjectMap [
        rr:template "http://base_uri/statement/{id}" ;
        rr:class rdf:Statement ;
        rr:class prov:Entity ] ;
    rr:predicateObjectMap [
        rr:predicate rdf:subject ;
        rr:objectMap [ rr:template "http://base_uri/Field/{id}" ] ] ;
    rr:predicateObjectMap [
        rr:predicate rdf:predicate ;
        rr:object ] ;  # the constant (the mapped property of :TriplesMap1) is missing in the source
    rr:predicateObjectMap [
        rr:predicate rdf:object ;
        rr:objectMap [ rr:column "name" ] ] ;
    rr:predicateObjectMap [
        rr:predicate obdaprov:wasDerivedWithOBDAMapping ;
        rr:object :TriplesMap1 ] ;
    rr:predicateObjectMap [
        rr:predicate prov:wasDerivedFrom ;
        rr:objectMap [ rr:template "http://base_uri/OBDAEntity/URI_field_name_{id}" ] ] .

A new mapping (analogous to :TriplesMap_prov2 in the previous section) would be needed to specify the characteristics of "http://base_uri/OBDAEntity/URI_field_name_{id}", i.e., its database, table and column. All sorts of provenance information can be added to the triple reified in the previous map, for example the prov:Activity that generates it (:ourOBDAActivity) or the system (:ourOBDASystem) that, as a prov:SoftwareAgent, performs this activity. This depends on the competency questions to be addressed in each particular use case. As a limitation, these mappings generate additional triples due to reification.

5.6.4 Provenance at graph level

For many applications the triple granularity will not be needed. If a set of triples shares some details about their provenance (e.g., the source database is the same), then a more efficient solution is to annotate the provenance of all of them together in an RDF named graph. This can easily be done by modifying the original mappings, specifying a graph containing the triples that share some provenance detail, and creating additional mappings to state the provenance of that graph. The statements about the provenance can be included in the same graph, allowing for a closed recursion. If graphs refer to larger entities (e.g., data sources), their specification will most probably be more stable. Adding a new graph with details about provenance may be a routine operation that can be performed manually (or semi-automatically) when adding a new data source to the system. Therefore, the information about the provenance of the graph can be stated statically, without requiring additional mappings, for example:


<http://base_uri/OBDAEntity/table_Field> a prov:Entity ;
    prov:wasDerivedFrom "http://base_uri/OBDAEntity/Database_Running_Example" .

As in previous examples, the properties of "http://base_uri/OBDAEntity/table_Field" would be defined in a new mapping specifying the original database and any other details of interest.

Chapter 6

Post-Bootstrapping Analysis

The previous chapter described automatic techniques to bootstrap an ontology and a set of mappings given a database schema as input. Additionally, the bootstrapped ontology can be automatically aligned to a domain ontology and/or the bootstrapped mappings can be linked to a domain ontology. Finally, the aligned ontology can be approximated if it is outside the OWL 2 QL profile.

The next three sections present complementary (semi-automatic) efforts within Optique to extend and validate the bootstrapped ontology and mappings. These efforts will be further developed within Optique's O&M analysis module. Section 6.1 presents a semi-automatic method to create a set of mappings given an ontology and a database schema as inputs. Section 6.2 presents how the bootstrapped ontology and mappings can be validated and/or edited. Finally, Section 6.3 provides a methodology for the manual construction of the ontology and the mappings.

6.1 Semi-Automatic Ontology Layering

In addition to the bootstrapping module, the O&M component also contains an Ontology Layering Module, which layers an input ontology over an input DB schema, resulting in a set of direct mappings between the ontology and the schema. Essentially, a user constructs mappings based on automatically generated candidate suggestions, thus operating semi-automatically. Ontology layering relies on the IncMap system [58] (see Appendix B), which is provided as a special module inside the platform. Internally, IncMap represents both the ontology and the schema uniformly, using a structure-preserving meta-graph called IncGraph; thus, correspondences can be calculated and visualized directly between the two data models. The module then computes ranked correspondences between elements of the graphs using lexical and structural similarities, based on the Similarity Flooding algorithm of Melnik et al. [53] (a toy sketch of this propagation step follows the paragraph), and converts the correspondences into direct mappings between the ontology and the schema. Human verification is provided through the semi-automatic nature of the layering process, where each suggested mapping needs to be accepted by an expert user before it becomes effective. IncMap operates incrementally, i.e., it uses human input to rerank correspondences after each round of interaction. Verifications performed by a user (i.e., verified mappings) can thus be used to improve the quality of subsequent suggestions in the structural neighborhood. As a major difference to bootstrapping and alignment, layering can map user-specified fragments of DB schemata to user-specified fragments of ontologies.
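The propagation at the heart of this similarity computation can be illustrated with a toy Python sketch (not IncMap's actual implementation; the graph and score structures are assumptions): each pair of nodes passes its similarity score on to pairs of neighbouring nodes, and scores are re-normalised after every round.

    def flood_similarity(edges1, edges2, init_sim, iterations=10):
        """edges1, edges2: adjacency dicts (node -> successors) of the two
        IncGraph-style graphs; init_sim: dict (node1, node2) -> lexical score.
        A toy variant of the Similarity Flooding fixpoint computation [53]."""
        sim = dict(init_sim)
        for _ in range(iterations):
            nxt = dict(sim)
            for (a, b), score in sim.items():
                # (a, b) supports (a2, b2) whenever a -> a2 and b -> b2
                for a2 in edges1.get(a, ()):
                    for b2 in edges2.get(b, ()):
                        nxt[(a2, b2)] = nxt.get((a2, b2), 0.0) + score
            top = max(nxt.values()) or 1.0   # re-normalise to [0, 1]
            sim = {pair: s / top for pair, s in nxt.items()}
        return sim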

6.2 Validation of the bootstrapped ontology and mappings

The axioms computed by the ontology bootstrapper (including the ontology-to-ontology alignments) are presented to the user for verification, i.e., the user can edit, accept, or discard candidate axioms.¹ For use cases requiring very accurate alignments, our ontology alignment system LogMap also supports user interaction during the alignment process in order to keep the number of wrong correspondences to a minimum. Additionally, the R2RML Mapping Editor of the Optique platform is tailored towards W3C R2RML mappings, of which direct mappings are a special case, and was evaluated with encouraging results [57]. Manual editing of R2RML mappings is a time-consuming and error-prone process. To alleviate this, the R2RML editor provides

¹Note that, at the time being, the platform has limited support for in-line ontology editing and visualization. However, ontologies from the platform can easily be exported to mature, third-party tools such as Protégé (http://protege.stanford.edu/). Additionally, the platform also provides functionality to import ontologies (see Deliverable D2.4 for details).

an intuitive mapping visualisation, semi-automatic suggestions of mapping corrections, and step-by-step wizards for writing complex (non-direct) mappings.

6.3 Guidelines for the manual construction of an OBDA specification

“Guessing” an ontology from a database schema is no easy task, since the database modelling step that produces a database schema from (explicit or implicit) knowledge of a domain is typically lossy. Automatic bootstrapping amounts to making a best effort at reverse engineering this step and is necessarily imperfect. However, depending on the quality of the data source schemata, the results often provide a very good starting point for later manual optimisations that can be applied as required. Hence, in Optique we have defined a methodology for the development and maintenance of an OBDA specification based on the functionalities offered by the Optique system. Towards this goal, we have provided, as a starting point, a description of some initial guidelines for “manually” building an OBDA specification. This is already a significant contribution, since the literature in the field does not actually present any methodology for the specification of a full-fledged OBDA system. Moreover, our previous experience in the field, as well as our interactions with the project use cases, clearly showed that such a methodology is actually far from trivial. In particular, while there exist several approaches and methodologies for ontology development and ontology engineering (e.g., [26], [33], [11], [34]), research in data integration and data exchange has only very partially dealt with the issue of mapping development and maintenance (e.g., [46], [27], [36], [47]). Therefore, a real methodology for OBDA specification is still missing, that is, an approach that brings together ontology development, data source analysis, and mapping development. The document reporting these initial guidelines is included as Appendix C. In the coming years, we intend to turn the above initial guidelines into a true methodology for OBDA specification, facing the issue of combining ontology development, data source analysis, and mapping development in a uniform and comprehensive approach, as well as taking into account the functionalities that will be provided by the Optique platform.

Chapter 7

Integration with the Optique Platform

The Optique O&M component is equipped with an O&M bootstrapper which integrates the techniques described in Chapter 5. The O&M bootstrapper is fully compliant with the Optique platform APIs: the ontology API, the relational metadata API, and the R2RML mapping management API; see Deliverables D2.3 (First Prototype of the Core Platform) and D2.4 (Second Prototype of the Core Platform) for details about these APIs. The O&M bootstrapper retrieves the required metadata from the platform's shared metadata repository using the Relational Metadata API. It also integrates the ontology matching system described in Section 5.4 and an ontology approximation module to transform the resulting ontology if it is outside the desired OWL 2 profile (see Section 5.5). The O&M bootstrapper uses the Ontology API and the Mapping Management API to store the resulting assets (i.e., the ontology and mappings) back to the platform.

A demo paper describing the first version of the Optique system, including the O&M bootstrapper, was accepted at the 2013 edition of the International Semantic Web Conference [43] (see Appendix K). Furthermore, we recently submitted a demo paper to an international conference (see Appendix L). The O&M bootstrapper is accessible from Optique's demo web page, http://fact-pages.fluidops.net/; demonstration videos are also available at https://www.youtube.com/user/optiqueproject. The next sections summarize the different steps of the O&M bootstrapper.

Figure 7.1: O&M bootstrapper: selection of the RDB Schema


7.1 RDB Schema Selection and Bootstrapping

Once the platform has been successfully connected to a relational database (RDB) (see Section 3.2.1 of Deliverable D2.4), the O&M bootstrapper can be used. Figure 7.1 shows its first step: the available RDB schemas are listed and, upon selection, the metadata of the desired RDB schema is visualized. Once the “next” button is pressed, the O&M bootstrapper creates the bootstrapped mappings and ontology, following the specification and semantics described in Sections 5.2 and 5.3. Figure 7.2 shows the second step of the O&M bootstrapper, where an overview of the bootstrapped (direct mapping) ontology is visualized. Apart from ontology classes, the interface also allows the (optional) visualization of object properties.

Figure 7.2: O&M bootstrapper: bootstrapped (direct mapping) ontology visualization

7.1.1 Ontology Alignment

As described in Section 5.4, the O&M bootstrapper integrates the ontology alignment system LogMap. LogMap identifies correspondences, or alignments, between the bootstrapped ontology and a given domain ontology. Figure 7.3 shows the alignment of the bootstrapped ontology (green nodes) with a domain ontology (blue nodes); the dark-red links represent the alignments extracted by LogMap.

7.1.2 Ontology approximation

The O&M bootstrapper uses the ontology approximation module described in Section 5.5 to approximate the resulting ontology if it is outside the OWL 2 QL profile (i.e., the language currently supported by the Optique OBDA system). Figure 7.4 shows the O&M bootstrapper interface for the ontology approximation. The implemented approximation module, besides computing the approximation in OWL 2 QL of the ontology, also returns a summary log specifying how the module handled each axiom of the original ontology. After the approximation, each axiom of the original ontology is categorized in the report into one of three categories, based on the type of approximation it has undergone:

• Unapproximable axioms: all the semantics of the axiom is lost, which means that given an OWL 2 axiom α of the original ontology, the entailment set of α in OWL 2 QL is empty.


Figure 7.3: O&M bootstrapper: alignment of the DMO ontology with a SOTA ontology

• Equivalently rewritten axioms: all the semantics of the axiom is preserved, which means that given an OWL 2 axiom α of the original ontology, it is rewritten by the approximation module as a set of OWL 2 QL axioms 풮 ⊆ ES(α, OWL 2 QL) such that 풮 ⊧ α.

• Approximated axioms: only a portion of the semantics of the axiom is preserved, which means that given an OWL 2 axiom α of the original ontology, it is rewritten by the approximation module as a set of OWL 2 QL axioms 풮 ⊆ ES(α, OWL 2 QL) such that 풮 ⊭ α.

Figure 7.4: O&M bootstrapper: OWL 2 QL approximation of the ontology


7.1.3 Ontology and Mapping Storage

As a last and optional step, the bootstrapped ontologies and mappings can be stored within the platform (see Figure 7.5).

Figure 7.5: O&M bootstrapper: Ontology and mapping storage

7.1.4 Integrated O&M bootstrapper

In addition to the step-by-step bootstrapping described in the previous sections, the O&M bootstrapper also includes a one-step integrated bootstrapping interface (see Figure 7.6).

Figure 7.6: Widget for integrated bootstrapping.

Chapter 8

Evaluation

This chapter presents the evaluation conducted in the Statoil and Siemens scenarios.

8.1 Installing Optique Platform at Statoil

For reasons of confidentiality and consistency of presentation, these results are presented in Deliverable D9.2. A short summary follows:

8.1.1 The database

The experiments focused on parts of the Exploration and Production Data Store (EPDS), a corporate data store for subsurface data at Statoil. It has a complex schema, with ca. 1500 tables and as many views. The complexity of the schema and the lack of documentation make it impossible for anyone but experts on the schema to write correct SQL queries against the database; examples of why this is impossible are included in D9.2. We have also not seen evidence of anyone writing manual queries against the database: users rely on an existing catalogue of predefined queries and tweak these as necessary, and advanced operations are performed on the extracted data, not by query combination or modification.

8.1.2 Experiments

We bootstrapped an ontology and mappings from the relevant parts of EPDS. The ontology contains 3274 classes, 3620 object properties, and 42308 datatype properties. The mappings comprise 3069 subjectMap-s and 43376 predicateObjectMap-s, and all TripleMap-s select source data from exactly one database table. In addition to the bootstrapped ontology, the Optique project has also produced two ontologies covering overlapping domains: the Subsurface Exploration ontology and the NPD FactPages ontology. The bootstrapped and developed ontologies have been aligned using the techniques described in Chapter 5. Furthermore, the Subsurface Exploration ontology has axioms outside OWL 2 QL, so the approximation techniques from Section 5.5 were applied to it. The results of alignment and approximation are included in D9.2. The coverage of the terms in the query catalogue by the different ontologies was estimated using syntactic matching; for example, 15% of the terms in the query catalogue occur as classes in the bootstrapped ontology. The results are summarized in Figure 8.1. The experiments we conducted show that about half of the classes in the query catalogue are present in the alignment of the domain and bootstrapped ontologies, and for most of these classes there are mappings to EPDS. Moreover, the majority of the properties are also present in the alignment, and most of them have mappings to EPDS.

8.2 Installing Optique Platform at Siemens

In this section we present our experience in running the O&M bootstrapping module in the Siemens use case. We start with a short description of the database and the domain ontologies (see Deliverable D8.2 and [45] for more details), and we finally present a preliminary coverage analysis of the bootstrapped ontology over the query terms.


[Figure 8.1 (pie charts): (a) coverage of query-catalogue classes (Total: 113) and properties (Total: 21) by the Bootstrapped, Domain, and Aligned ontologies; (b) overlap between bootstrapped and domain ontology terms occurring in the catalogue (Total: 79 classes, 18 properties).]

Figure 8.1: Inner pie charts show coverage by lexical confidence: in [0.9, 1.0], in [0.8, 0.9), in [0.6, 0.8). Outer pie charts represent the quality of the terms with a coverage above 0.6: true positive, semi-true positive, false positive. Figure (a) displays the coverage of terms from the query catalogue with terms from ontologies; Figure (b) shows the overlap between terms from bootstrapped and imported ontologies that (with confidence > 0.6) occur in the query catalogue. Details in D9.2.

Ontology       Logical axioms   Classes   Object prop.   Datatype prop.
Bootstrapped   75               7         4              24
Diagnostic     107              31        11             7
Turbine        90               37        7              1

Table 8.1: Siemens ontology metrics

8.2.1 Siemens Schemata and Ontologies

The data in the Siemens use case is stored in several databases with different schemata. Although the schemata are not especially large, the size of the data is in the order of hundreds of terabytes: e.g., there is about 15 GB of data associated with a single turbine, and the data currently grows at an average rate of 30 GB per day [45]. As for the Statoil use case, we bootstrapped an ontology and mappings from one of the Siemens database schemata (as described in Sections 5.2 and 5.3). In addition to the bootstrapped ontology, the Siemens use case also involves two ontologies that have been developed for it: the Diagnostic and Turbine ontologies. The bootstrapped and developed ontologies have been aligned using the techniques described in Section 5.4. Details of the number of classes, properties, and axioms of these ontologies are given in Table 8.1.

8.2.2 Coverage of Query Terms by the Ontologies

We extracted the terms from the current query catalogue, which contains 27 query patterns. The terms have been split into query classes and query properties; in total, we identified 20 query classes and 6 query properties. Figure 8.2 shows the coverage of the query terms by the ontologies introduced above: the upper three charts show the coverage of classes by, respectively (left to right), the bootstrapped, developed, and aligned ontologies, and the lower three show the coverage of properties.

Regarding the coverage of classes, the bootstrapped ontology covers 75% of the 20 classes occurring in the catalogue. Moreover, 50% of the query classes are matched to the bootstrapped ontology with a high lexical confidence, i.e., higher than 0.9. The coverage of query classes by the developed ontologies is lower than by the bootstrapped ontology: it is 30%. Regarding the coverage of properties, the bootstrapped ontology covers half of the properties occurring in the query catalogue, while the coverage by the developed ontologies is lower. As in the Statoil use case, the bootstrapped ontology covers the query terms better than the developed ontologies. This can be interpreted as an indication that the information needs behind the queries are semantically better reflected in the database schema than in the developed ontologies. The coverage of the aligned ontology for properties from the query catalogue is the same as for the bootstrapped ontology; that is, all the properties covered by the domain ontology are also covered by the bootstrapped one, while the coverage of query classes reaches 90%.

We also performed a manual assessment of every match for both query classes and properties. The results of our assessment are also shown in the outer circles of Figure 8.2. For example, the manual assessment of class coverage by the bootstrapped ontology gave 20% true positives (i.e., correct matches) and 55% semi-true positives (partially correct matches).


Figure 8.2: Coverage of terms from the query catalogue with terms from ontologies. Inner pie charts show coverage by lexical confidence: in [0.9, 1.0], in [0.8, 0.9), in [0.6, 0.8). Outer pie charts represent the quality of the terms with a coverage above 0.6: true positive, semi-true positive.

Chapter 9

Ongoing Work

9.1 Benchmark for Ontology Alignment

Figure 9.1 shows an OBDA scenario where the first ontology provides the vocabulary to formulate the queries (QF-Ontology) and the second is linked to the data and is not visible to the users (DB-Ontology). Such an OBDA scenario arises in real-world use cases such as Optique (see Section 5.4). Integration via ontology alignment is required since only the vocabulary of the DB-Ontology is connected to the data.


Figure 9.1: Ontology Alignment in an OBDA Scenario

The traditional benchmarks of the Ontology Alignment Evaluation Initiative (OAEI) [5, 21] evaluate ontology matching systems w.r.t. scalability, multi-lingual support, instance matching, reuse of background knowledge, etc. Systems' effectiveness is, however, only assessed by means of classical information retrieval metrics (i.e., precision, recall and f-measure) w.r.t. a manually curated reference alignment provided by the organisers. Query answering over aligned ontologies has not been addressed by any evaluation initiative so far. In the OAEI 2014 evaluation campaign we have introduced the novel benchmark Ontology Alignment for Query Answering¹ (OA4QA), which aims at evaluating ontology matching systems w.r.t. the ability of the generated alignments to enable answering a set of queries in an OBDA scenario where several ontologies exist [70] (see Appendix M for details).

9.2 Benchmark for Ontology and Mapping Bootstrapping

As presented in Section 2, there exist a number of approaches for the automatic bootstrapping of an ontology and RDB2RDF mappings from a relational database schema. However, no benchmarks to compare those approaches exist to date. In addition, slightly different assumptions about the available input and expected output make existing systems difficult to compare on the same basis. This is also reflected by the fact that none of the aforementioned approaches has run comprehensive experiments directly comparing itself with any of the others. An ongoing activity within Optique is the creation of a comprehensive benchmark for measuring the quality of automatically generated RDB2RDF mappings. Our benchmark is based on realistic relational schemata and ontologies and assumes a whole set of different integration scenarios. To this end we also systematically analyze the characteristic available input information and the requirements of the different scenarios that call for the automatic or semi-automatic generation of RDB2RDF mappings.

1http://www.cs.ox.ac.uk/isg/projects/Optique/oaei/oa4qa/


9.3 Bootstrapping of Complex Mappings

In this section we discuss a novel technique, currently under development, for bootstrapping complex, i.e., non-direct, mappings. We now describe two approaches to such bootstrapping.

• Whenever it is suitable, we will consider the database schema as a directed graph with a node for each table, and an edge from table A to table B whenever A contains a foreign key referencing B. The basic idea of this approach is to translate every table into one or more classes. The table itself is considered a superclass; to find its subclasses, one approach is to take joins of this table with its parent nodes. If these joins yield sufficiently different subsets of the tuples in the table, then they define subclasses, where "sufficiently different" means a frequency cutoff of some kind. The reason for taking joins with parent nodes, rather than child nodes, is that a foreign key enforces a one-to-many relation.

• We can also look for sets of attributes in a table that have repeating values (see [41] for details). For example, if we have a table "Person" with an attribute "type", then particular (repeating) values of this attribute, e.g., "student" or "employee", can give rise to subclasses. A possible cutoff here is whether only few values occur: it may be undesirable to create too many subclasses.

Before going into more details, we provide some basic definitions.

9.3.1 Basic Definitions

First, we recall the basic notions about relational tables, introduced in Section 3.1, that we are going to use in this chapter. For ease of exposition, we assume that there is at most one foreign key between any two tables. The following definitions declare a desired property of a procedure for bootstrapping complex mappings.

Definition 9.3.1 A bootstrapping procedure is an algorithm that, given a schema 풮 and a database instance over 풮, produces an ontology 푂 and a set of mappings from 풮 to 푂.

Let 풮 be a schema, O an ontology, M a set of mappings from 풮 to O, and D a database instance over 풮. We say that the OBDA system (풮, M, O) is consistent for D if (D, 풮, M, O) ⊭ ⊥, where ⊥ stands for falsehood. Note that D, 풮, M, and O are all treated here as first-order formulae.

Definition 9.3.2 (Update-safety) Let 풮 be a schema, 푂 an ontology, and 푀 a set of mappings from 풮 to 푂. We say that 푀 and 푂 are update-safe with respect to 풮 if the OBDA system (풮, 푀, 푂) is consistent for every database instance 퐷 over 풮. A bootstrapping procedure is update-safe if it yields an update-safe set of mappings and ontology for every database instance of every schema.

In other words, an update-safe bootstrapping procedure does not add any constraints to the ontology that are not already enforced by the database schema. In particular, this means that the bootstrapped ontology will not “break” under updates to the data in the database. Note that this property is similar to the properties of information preservation and monotonicity identified by Sequeda et al. in [66] for direct mappings; however, it is weaker than either of them. Observe, however, that in general this property is rather restrictive: as databases frequently do not declare all the constraints that they satisfy, bootstrapping beyond this property can be desirable.

9.3.2 Finding classes based on joins

Let T be a table, and {S1, ..., Sn} a set of tables that reference T. The basic idea is to create a subclass of T's base class for each Si with (T ⋉fk Si) ⊂ T; Algorithm 3 implements this idea (a Python sketch of it follows the algorithm). This basic idea is, however, complicated by the following. In relational databases, many-to-many relations between entities are usually represented by a separate table that contains only foreign keys, sometimes with an auto-incrementing integer primary key. Therefore, we prefer to map such tables to object properties rather than classes. Thus, we need to develop means to effectively track and disregard such cases, which we leave for future work.


Algorithm 3: LJSubclasses
INPUT : a table 푇, numbers 훼 and 훽
OUTPUT: tables ResultTables
1  ResultTables := ∅
2  foreach 푇′ ∈ refs(푇) do
3      푆 ← 푇 ⋉fk 푇′
4      if 훼 < |푆| < 훽 then
5          ResultTables.add(푆)
6      end
7  end
8  return ResultTables
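For illustration, the following Python sketch realises the core loop of Algorithm 3 against a live database. It assumes an SQLite connection and that the foreign-key metadata (refs(푇)) has already been extracted from the schema catalogue; it also only computes the size of each semijoin with a plain SQL subquery rather than materialising it, which suffices for the 훼/훽 test.

import sqlite3

def lj_subclasses(conn, table, referencing, alpha, beta):
    """Sketch of Algorithm 3 (LJSubclasses): for each table S that references
    `table` via a foreign key, compute the size of the semijoin T ⋉fk S and
    keep S as a subclass candidate if the size lies strictly between alpha
    and beta.

    referencing -- list of (ref_table, fk_column, pk_column) triples; in a
                   real system these would come from the schema catalogue.
    """
    candidates = []
    for ref_table, fk_col, pk_col in referencing:
        # Semijoin size: rows of `table` that occur in ref_table via the FK.
        # (String-built SQL is acceptable here only because this is a sketch.)
        query = (f"SELECT COUNT(*) FROM {table} "
                 f"WHERE {pk_col} IN (SELECT {fk_col} FROM {ref_table})")
        size = conn.execute(query).fetchone()[0]
        if alpha < size < beta:
            candidates.append((table, ref_table, size))
    return candidates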

9.3.3 Finding classes based on clusters of attributes

Let 푇 be a table, 퐴 the set of attributes in 푇, and 푉푎 the set of distinct values of each attribute 푎 ∈ 퐴. We want to determine which rows of 푇 form subclasses based on attribute similarity. To this end, we visualize the rows as points in space in the following natural manner. Each attribute 푎 ∈ 퐴 constitutes one dimension, and we associate to each value 푣 ∈ 푉푎 a distinct integer. Thus, a row 푟 in 푇 is mapped to a point in the |퐴|-dimensional space whose components correspond to the values of 푟.

Example 9.3.1 Given the following table and assuming we number the attribute values in the order in which they occur, rows 1 and 3 would be mapped to (1, 1, 1, 1, 1) and (3, 3, 1, 2, 3), respectively.

id  name   type     gender  phonenumber
1   alice  student  female  1234
2   bob    postdoc  male    2345
3   chris  student  male    3456
4   dora   prof     female  4567

In order to determine which tuples are similar, we need to introduce a metric that measures distance in some manner. The standard Euclidean metric would not be prudent, since it makes no intuitive sense for the tuples (1, 2, 3) and (2, 3, 4) to have different distances to the origin. Intuitively, we want to count the number of values in which two points differ. This results in a sort of Hamming distance, which we define as

푑(푥, 푦) = Σ푎∈퐴 휒(푥푎, 푦푎),

where 휒(푎, 푏) = 1 if 푎 ≠ 푏 and 휒(푎, 푎) = 0. This is a fairly simple metric, which does not take into account the number of values that occur in an attribute column, i.e., it “punishes” differences in all columns equally (e.g., ‘name’ is given the same priority as ‘gender’ in Example 9.3.1). A way to circumvent this is to define a weighted variant of the above metric as follows:

푑푤(푥, 푦) = Σ푎∈퐴 (1/|푉푎|) 휒(푥푎, 푦푎).

Of course, the weighting could be adjusted, depending on how much one wants to “punish” different values.

Example 9.3.2 Given the table from Example 9.3.1 and the metrics defined above, rows 1 and 3 have the following distances:

푑(푟1, 푟3) = 4,    푑푤(푟1, 푟3) = 1/4 + 1/4 + 0 + 1/2 + 1/4 = 5/4.

On the other hand, rows 1 and 4 have distances

푑(푟1, 푟4) = 4,    푑푤(푟1, 푟4) = 1/4 + 1/4 + 1/3 + 0 + 1/4 = 13/12.
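The following Python sketch reproduces these numbers. The integer encoding of the rows and the tuple of |푉푎| sizes are read off the example table, and exact fractions are used so that the output matches the values above.

from fractions import Fraction

# Rows of the example table, encoded as integer points: the values of each
# attribute are numbered in order of occurrence (id, name, type, gender, phone).
rows = {1: (1, 1, 1, 1, 1), 2: (2, 2, 2, 2, 2), 3: (3, 3, 1, 2, 3), 4: (4, 4, 3, 1, 4)}
# Number of distinct values |V_a| per attribute column in the example table.
v_sizes = (4, 4, 3, 2, 4)

def d(x, y):
    """Plain Hamming-style distance: the number of attributes where x and y differ."""
    return sum(1 for xa, ya in zip(x, y) if xa != ya)

def d_w(x, y):
    """Weighted variant: each differing attribute a contributes 1/|V_a|."""
    return sum(Fraction(1, v) for xa, ya, v in zip(x, y, v_sizes) if xa != ya)

print(d(rows[1], rows[3]), d_w(rows[1], rows[3]))  # 4 5/4
print(d(rows[1], rows[4]), d_w(rows[1], rows[4]))  # 4 13/12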


We still need to see whether the metric 푑푤 is worth using, but it will be straightforward to benchmark this once we have a working implementation. In order to determine subclasses of a table, we visualize the rows as points in space as discussed above. Using a suitable clustering algorithm, one can then determine which points “belong together”, i.e., which rows should belong to the same subclass. Concerning the clustering algorithm, there are several options.

Hierarchical clustering
• Divisive: begin by considering all points as one cluster, and recursively divide each cluster into smaller clusters until we reach a point where we want to stop or each point is its own cluster. This results in a tree-like hierarchy. Drawback: complex, and it does not solve the problem directly, since we still need a “flat” clustering algorithm in each step, i.e., we need to decide how to split before we can continue with the recursion.
• Agglomerative: begin with each point as its own cluster, merge two clusters if they are each other's nearest neighbor, and repeat until there is only one cluster. Drawback: fairly memory intensive, although no flat clustering algorithm is needed. It is likely that this approach will not be suitable for us, since we would have to begin with too many clusters.

K-medoids clustering
• Chooses 푘 points that represent clusters and iteratively adds points to the cluster to which they are nearest. Drawback: we cannot use some standard techniques such as K-means, since we do not use the Euclidean distance.
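As a concrete starting point, the following Python sketch clusters the encoded rows of Example 9.3.1 with an off-the-shelf agglomerative method from SciPy, using the weighted metric 푑푤 as a custom distance. The linkage method and the cut threshold are illustrative assumptions that would need to be tuned.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

# Points: the integer-encoded rows from Example 9.3.1.
points = np.array([(1, 1, 1, 1, 1), (2, 2, 2, 2, 2), (3, 3, 1, 2, 3), (4, 4, 3, 1, 4)])
v_sizes = np.array([4, 4, 3, 2, 4])  # |V_a| per attribute column

def weighted_hamming(x, y):
    # d_w from above: differing attributes weighted by 1/|V_a|.
    return float(((x != y) / v_sizes).sum())

# Agglomerative clustering over the pairwise weighted-Hamming distances;
# average linkage avoids committing to a single representative per cluster.
condensed = pdist(points, metric=weighted_hamming)
tree = linkage(condensed, method='average')

# Cut the dendrogram at a distance threshold to obtain candidate subclasses;
# with t=1.2 the example yields two groups, {rows 1, 4} and {rows 2, 3}.
labels = fcluster(tree, t=1.2, criterion='distance')
print(labels)  # rows with the same label form one subclass candidate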

As the next step, we will decide which approach to clustering works best in our framework.

Bibliography

[1] http://jena.sourceforge.net/SquirrelRDF.
[2] http://www.w3.org/TR/r2rml/.
[3] http://www.w3.org/TR/rdb-direct-mapping/.
[4] Rakesh Agrawal, Alexander Borgida, and H. V. Jagadish. Efficient Management of Transitive Relationships in Large Data and Knowledge Bases. In ACM SIGMOD Conf. on Manag. of Data, pages 253–262, 1989.
[5] J.L. Aguirre, Kai Eckert, Jérôme Euzenat, Alfio Ferrara, Willem Robert van Hage, Laura Hollink, Christian Meilicke, Andriy Nikolov, Dominique Ritze, François Scharffe, Pavel Shvaiko, Ondrej Sváb-Zamazal, Cássia Trojahn, E. Jiménez-Ruiz, Bernardo Cuenca Grau, and Benjamin Zapilko. Results of the Ontology Alignment Evaluation Initiative 2012. In Ontology Matching Workshop, 2012.
[6] Irina Astrova. Rules for mapping SQL relational databases to OWL ontologies. In MTSR, pages 415–424, 2007.
[7] Franz Baader, Diego Calvanese, Deborah L. McGuinness, Daniele Nardi, and Peter F. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2003.
[8] Khalid Belhajjame, James Cheney, David Corsar, Daniel Garijo, Stian Soiland-Reyes, Stephan Zednik, Jun Zhao, Timothy Lebo, Satya Sahoo, and Deborah McGuinness. PROV-O: The PROV Ontology. Technical report, World Wide Web Consortium, 2012.
[9] Alexandre Bertails and Eric Prud'hommeaux. Interpreting Relational Databases in the RDF Domain. In K-CAP, pages 129–136, 2011.
[10] C. Bizer and A. Seaborne. D2RQ – Treating non-RDF Databases as Virtual RDF Graphs. In ISWC, 2004.
[11] Alexander Borgida, Vinay K. Chaudhri, Paolo Giorgini, and Eric S. K. Yu, editors. Conceptual Modeling: Foundations and Applications – Essays in Honor of John Mylopoulos, volume 5600 of Lecture Notes in Computer Science. Springer, 2009.
[12] Alexander Borgida and Luciano Serafini. Distributed Description Logics: Assimilating Information from Peer Sources. J. Data Sem., 1:153–184, 2003.
[13] Elena Botoeva, Diego Calvanese, and Mariano Rodriguez-Muro. Expressive approximations in DL-Lite ontologies. In Proceedings of the 14th International Conference on Artificial Intelligence: Methodology, Systems, Applications (AIMSA 2010), pages 21–31, 2010.
[14] Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, Antonella Poggi, Mariano Rodriguez-Muro, Riccardo Rosati, Marco Ruzzi, and Domenico Fabio Savo. The MASTRO system for ontology-based data access. Semantic Web Journal, 2(1):43–53, 2011.
[15] Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and Riccardo Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. Journal of Automated Reasoning, 39(3):385–429, 2007.
[16] F. Cerbah and N. Lammari. Perspectives in Ontology Learning, chapter Ontology Learning from Databases: Some Efficient Methods to Discover Semantic Patterns in Data, pages 1–30. AKA/IOS Press, 2012.
[17] Farid Cerbah. Mining the content of relational databases to learn ontologies with deeper taxonomies. In Web Intelligence, pages 553–557, 2008.
[18] Cristina Civili, Marco Console, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, Lorenzo Lepore, Riccardo Mancini, Antonella Poggi, Riccardo Rosati, Marco Ruzzi, Valerio Santarelli, and Domenico Fabio Savo. Mastro studio: Managing ontology-based data access applications. PVLDB, 6(12):1314–1317, 2013.


[19] Marco Console, Valerio Santarelli, and Domenico Fabio Savo. Efficient approximation in DL-Lite of OWL 2 ontologies. In Proc. of the 26th Int. Workshop on Description Logic (DL), volume 1014 of CEUR Electronic Workshop Proceedings, http://ceur-ws.org/, pages 132–143, 2013.
[20] B. Cuenca Grau, I. Horrocks, B. Motik, B. Parsia, P. F. Patel-Schneider, and U. Sattler. OWL 2: The next step for OWL. J. Web Sem., 6(4):309–322, 2008.
[21] Bernardo Cuenca Grau, Zlatan Dragisic, Kai Eckert, Jérôme Euzenat, Alfio Ferrara, Roger Granada, Valentina Ivanova, Ernesto Jiménez-Ruiz, Andreas Oskar Kempf, Patrick Lambrix, Andriy Nikolov, Heiko Paulheim, Dominique Ritze, François Scharffe, Pavel Shvaiko, Cássia Trojahn dos Santos, and Ondrej Zamazal. Results of the ontology alignment evaluation initiative 2013. In Proceedings of the 8th International Workshop on Ontology Matching co-located with the 12th International Semantic Web Conference (ISWC 2013), Sydney, Australia, October 21, 2013, pages 61–100, 2013.
[22] Bernardo Cuenca Grau, Ian Horrocks, Yevgeny Kazakov, and Ulrike Sattler. Modular reuse of ontologies: Theory and practice. J. Artif. Intell. Res., 31:273–318, 2008.
[23] Carlo Curino, Giorgio Orsi, Emanuele Panigati, and Letizia Tanca. Accessing and documenting relational databases through OWL ontologies. In FQAS, pages 431–442, 2009.
[24] Jérôme David, Jérôme Euzenat, François Scharffe, and Cássia Trojahn. The Alignment API 4.0. J. Sem. Web, 2(1):3–10, 2011.
[25] Cristian Pérez de Laborda and Stefan Conrad. Database to Semantic Web Mapping Using RDF Query Languages. In ER, pages 241–254, 2006.
[26] María del Carmen Suárez-Figueroa, Asunción Gómez-Pérez, and Mariano Fernández-López. The NeOn methodology for ontology engineering. In María del Carmen Suárez-Figueroa, Asunción Gómez-Pérez, Enrico Motta, and Aldo Gangemi, editors, Ontology Engineering in a Networked World, pages 9–34. Springer, 2012.
[27] AnHai Doan, Alon Y. Halevy, and Zachary G. Ives. Principles of Data Integration. Morgan Kaufmann, 2012.
[28] William F. Dowling and Jean H. Gallier. Linear-Time Algorithms for Testing the Satisfiability of Propositional Horn Formulae. J. Log. Prog., 1(3):267–284, 1984.
[29] Jérôme Euzenat. Semantic Precision and Recall for Ontology Alignment Evaluation. In Int'l Joint Conf. on Artif. Intell. (IJCAI), pages 348–353, 2007.
[30] Daniel Faria, Ernesto Jiménez-Ruiz, Catia Pesquita, Emanuel Santos, and Francisco M. Couto. Towards annotating potential incoherences in BioPortal mappings. In International Semantic Web Conference, 2014.
[31] Matthew Fisher, Mike Dean, and Greg Joiner. Use of OWL and SWRL for semantic relational database translation. In OWLED, 2008.
[32] PREMIS Working Group et al. Data dictionary for preservation metadata: final report of the PREMIS Working Group. OCLC, 2005.
[33] Nicola Guarino. The ontological level: Revisiting 30 years of knowledge representation. In Borgida et al. [11], pages 52–67.
[34] Nicola Guarino and Christopher A. Welty. An overview of OntoClean. In Steffen Staab and Rudi Studer, editors, Handbook on Ontologies, International Handbooks on Information Systems, pages 151–172. Springer, 2004.
[35] Peter Haase et al. Optique System: Towards Ontology and Mapping Management in OBDA Solutions. In Workshop on Debugging Ontologies and Ontology Mappings (WoDOOM), 2013.
[36] Alon Y. Halevy, Anand Rajaraman, and Joann J. Ordille. Data integration: The teenage years. In Umeshwar Dayal, Kyu-Young Whang, David B. Lomet, Gustavo Alonso, Guy M. Lohman, Martin L. Kersten, Sang Kyun Cha, and Young-Kuk Kim, editors, VLDB, pages 9–16. ACM, 2006.
[37] Ian Horrocks, Peter F. Patel-Schneider, and Frank van Harmelen. From 풮ℋℐ풬 and RDF to OWL: the making of a web ontology language. Journal of Web Semantics, 1(1):7–26, 2003.
[38] Ernesto Jiménez-Ruiz and Bernardo Cuenca Grau. LogMap: Logic-based and Scalable Ontology Matching. In Int'l Sem. Web Conf. (ISWC), pages 273–288, 2011.
[39] Ernesto Jiménez-Ruiz, Bernardo Cuenca Grau, Ian Horrocks, and Rafael Berlanga. Logic-based Assessment of the Compatibility of UMLS Ontology Sources. J. Biomed. Semant., 2(Suppl 1):S2, 2011.


[40] Ernesto Jiménez-Ruiz, Bernardo Cuenca Grau, Yujiao Zhou, and Ian Horrocks. Large-scale Interactive Ontology Matching: Algorithms and Implementation. In Eur. Conf. on Artif. Intell. (ECAI), pages 444–449, 2012.
[41] Sokratis Karkalas and Nigel J. Martin. Automatic semantic object discovery and mapping from non-normalised relational database systems. In Advances in Information Systems, First International Conference (ADVIS), pages 92–107, 2000.
[42] Yevgeny Kazakov, Markus Krötzsch, and František Simančík. Concurrent Classification of EL Ontologies. In Int'l Sem. Web Conf. (ISWC), pages 305–320, 2011.
[43] E. Kharlamov, M. Giese, E. Jiménez-Ruiz, M. G. Skjæveland, A. Soylu, D. Zheleznyakov, T. Bagosi, M. Console, P. Haase, I. Horrocks, S. Marciuska, C. Pinkel, M. Rodriguez-Muro, M. Ruzzi, V. Santarelli, D. F. Savo, K. Sengupta, M. Schmidt, E. Thorstensen, J. Trame, and A. Waaler. Optique 1.0: Semantic Access to Big Data: The Case of Norwegian Petroleum Directorate's FactPages. In International Semantic Web Conference (ISWC), Demo track, 2013.
[44] Evgeny Kharlamov, Ernesto Jiménez-Ruiz, Dmitriy Zheleznyakov, et al. Optique: Towards OBDA Systems for Industry. In Eur. Sem. Web Conf. (ESWC) Satellite Events, pages 125–140, 2013.
[45] Evgeny Kharlamov, Nina Solomakhina, Özgür Özçep, Dmitriy Zheleznyakov, Thomas Hubauer, Steffen Lamparter, Mikhail Roshchin, and Ahmet Soylu. How semantic technologies can enhance data access at Siemens Energy. In Proc. International Semantic Web Conference, 2014.
[46] Phokion G. Kolaitis, Maurizio Lenzerini, and Nicole Schweikardt, editors. Data Exchange, Integration, and Streams, volume 5 of Dagstuhl Follow-Ups. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2013.
[47] Maurizio Lenzerini. Data integration: A theoretical perspective. In Lucian Popa, Serge Abiteboul, and Phokion G. Kolaitis, editors, PODS, pages 233–246. ACM, 2002.
[48] Dmitry V. Levshin. Mapping relational databases to the semantic web with original meaning. Int. J. Software and Informatics, 4(1):23–37, 2010.
[49] Man Li, Xiao-Yong Du, and Shan Wang. Learning ontology from relational database. In Proceedings of International Conference on Machine Learning and Cybernetics, 2005.
[50] Lina Lubyte and Sergio Tessaris. Automatic extraction of ontologies wrapping relational data sources. In DEXA, pages 128–142, 2009.
[51] Carsten Lutz, Inanç Seylan, and Frank Wolter. An automata-theoretic approach to uniform interpolation and approximation in the description logic ℰℒ. In Proc. of the 13th Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR). AAAI Press, 2012.
[52] C. Meilicke. Alignment Incoherence in Ontology Matching. PhD thesis, University of Mannheim, 2011.
[53] Sergey Melnik, Hector Garcia-Molina, and Erhard Rahm. Similarity flooding: A versatile graph matching algorithm and its application to schema matching. In ICDE, pages 117–128, 2002.
[54] Boris Motik, Bernardo Cuenca Grau, Ian Horrocks, Zhe Wu, Achille Fokoue, and Carsten Lutz. OWL 2 Web Ontology Language – Profiles (2nd edition). W3C Recommendation, World Wide Web Consortium, December 2012. Available at http://www.w3.org/TR/owl2-profiles/.
[55] Boris Motik, Rob Shearer, and Ian Horrocks. Hypertableau Reasoning for Description Logics. J. Artif. Intell. Res. (JAIR), 36:165–228, 2009.
[56] Jeff Z. Pan and Edward Thomas. Approximating OWL-DL ontologies. In Proc. of the 22nd AAAI Conf. on Artificial Intelligence (AAAI), pages 1434–1439, 2007.
[57] C. Pinkel, C. Binnig, P. Haase, C. Martin, K. Sengupta, and J. Trame. How to Best Find a Partner? An Evaluation of Editing Approaches to Construct R2RML Mappings. In ESWC, 2014.
[58] C. Pinkel, C. Binnig, E. Kharlamov, and P. Haase. IncMap: Pay as You Go Matching of Relational Schemata to OWL Ontologies. In Ontology Matching workshop, 2013.
[59] Antonella Poggi, Domenico Lembo, Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, and Riccardo Rosati. Linking data to ontologies. J. Data Semantics, 10:133–173, 2008.
[60] Freddy Priyatna, Óscar Corcho, and Juan Sequeda. Formalisation and experiences of R2RML-based SPARQL to SQL query translation using Morph. In WWW, pages 479–490, 2014.


[61] Raymond Reiter. A Theory of Diagnosis from First Principles. Artif. Intell., 32(1):57–95, 1987.
[62] Mariano Rodriguez-Muro, Roman Kontchakov, and Michael Zakharyaschev. Ontop at work. In OWLED, 2013.
[63] Heru Agus Santoso, Su-Cheng Haw, and Ziyad Abdul-Mehdi. Ontology extraction from relational database: Concept hierarchy as background knowledge. Knowl.-Based Syst., 24(3):457–464, 2011.
[64] Stefan Schlobach. Debugging and Semantic Clarification by Pinpointing. In Eur. Sem. Web Conf. (ESWC), pages 226–240. Springer, 2005.
[65] Stefan Schlobach and Ronald Cornet. Non-standard Reasoning Services for the Debugging of Description Logic Terminologies. In Int'l Joint Conf. on Artif. Intell. (IJCAI), pages 355–362, 2003.
[66] Juan Sequeda, Marcelo Arenas, and Daniel P. Miranker. On directly mapping relational databases to RDF and OWL. In WWW, pages 649–658, 2012.
[67] Juan Sequeda, Syed Hamid Tirmizi, Óscar Corcho, and Daniel P. Miranker. Survey of Directly Mapping SQL Databases to the Semantic Web. Knowledge Eng. Review, 26(4):445–486, 2011.
[68] Martin G. Skjæveland, Espen H. Lian, and Ian Horrocks. Publishing the Norwegian Petroleum Directorate's FactPages as Semantic Web Data. In ISWC, pages 162–177, 2013.
[69] A. Solimando, Ernesto Jiménez-Ruiz, and G. Guerrini. Detecting and Correcting Conservativity Principle Violations in Ontology-to-Ontology Mappings. In International Semantic Web Conference, 2014.
[70] Alessandro Solimando, Ernesto Jiménez-Ruiz, and Christoph Pinkel. Evaluating Ontology Alignment Systems in Query Answering Tasks. In Poster paper at Int'l Sem. Web Conf. (ISWC), 2014.
[71] Ahmet Soylu, Martin Giese, Ernesto Jiménez-Ruiz, Guillermo Vega-Gorgojo, and Ian Horrocks. Experiencing OptiqueVQS: A multi-paradigm and ontology-based visual query system for end users. In Under Review, 2014.
[72] Dimitrios-Emmanuel Spanos, Periklis Stavrou, and Nikolas Mitrou. Bringing Relational Databases into the Semantic Web: A Survey. Semantic Web, 3(2):169–209, 2012.
[73] Mari Carmen Suárez-Figueroa, Asunción Gómez-Pérez, and Boris Villazón-Terrazas. How to write and use the ontology requirements specification document. In On the Move to Meaningful Internet Systems: OTM 2009, number 5871 in Lecture Notes in Computer Science, pages 966–982. Springer Berlin Heidelberg, January 2009.
[74] Tuvshintur Tserendorj, Sebastian Rudolph, Markus Krötzsch, and Pascal Hitzler. Approximate OWL-reasoning with Screech. In Proc. of RR 2008, pages 165–180. Springer, 2008.
[75] Holger Wache, Perry Groot, and Heiner Stuckenschmidt. Scalable instance retrieval for the semantic web by approximation. In Proc. of WISE 2005, pages 245–254. Springer, 2005.

Glossary

CQA    Consistent Query Answering
GSA    Global Semantic Approximation
IWB    Information Workbench
LSA    Local Semantic Approximation
NCS    Norwegian Continental Shelf
NPD    Norwegian Petroleum Directorate
O&M    Ontology and Mapping
OAEI   Ontology Alignment Evaluation Initiative
OBDA   Ontology-Based Data Access
OWL    Web Ontology Language
QF     Query Formulation
QFI    Query Formulation Interface
R2RML  RDB to RDF Mapping Language
RDB    Relational Database
RDF    Resource Description Framework
REST   Representational State Transfer
URI    Uniform Resource Identifier
VM     Virtual Machine
W3C    World Wide Web Consortium
WP     Work Package

Appendix A

R2RML direct mapping cases

Table A.1 summarizes the cases that we have considered in the mapping bootstrapper. Note that some types of mappings have not been considered because they require (directly or indirectly) knowledge about the data (e.g., the use of complex logical tables) or have been left for the extended bootstrapper (e.g., null treatment). Some other types of mappings, like the generation of blank nodes when the primary key is missing, are considered optional.

Table A.1: R2RML direct mapping cases (http://www.w3.org/TR/rdb2rdf-test-cases/)

Identifier | Example DB | Mapping | Considered?

R2RMLTC0000 | One table, a column, zero rows, no primary key | Direct mapping of an empty table | Yes

Missing primary key:
R2RMLTC0001a | One table, one column, one row, no primary key | One column mapping, subject URI generation by using rr:template | Yes (default)
R2RMLTC0001b | One table, one column, one row, no primary key | One column mapping, generation of a BlankNode subject by using rr:termType | Yes (opt.)
R2RMLTC0002a | One table, two columns, one row, no primary key | Two columns mapping, generation of a subject URI by the concatenation of two column values | Yes (default)
R2RMLTC0002b | One table, two columns, one row, no primary key | Two columns mapping, generation of a BlankNode subject by using rr:template and rr:termType | Yes (opt.)
R2RMLTC0002d | One table, two columns, one row, no primary key | Two columns mapping, generation of a BlankNode subject by using a SQL query that concatenates two columns | No
R2RMLTC0003a | One table, three columns, one row, no primary key | Three columns mapping, undefined SQL version identifier | No
R2RMLTC0003b | One table, three columns, one row, no primary key | Three columns mapping, concatenation of columns by using a rr:sqlQuery to produce a literal | No


R2RMLTC0003c | One table, three columns, one row, no primary key | Three columns mapping, by using a rr:template to produce a literal | No
R2RMLTC0005 | One table, three columns, three rows, two duplicate tuples, no primary key | Generation of BlankNodes from duplicate tuples | No

Generation of subject:
R2RMLTC0006a | One table, one primary key, one column, one row | Long form of R2RML by using rr:constant in rr:subjectMap, rr:predicateMap, rr:objectMap and rr:graphMap | No
R2RMLTC0007c | One table, one primary key, two columns, one row | One column mapping, using rr:class | Yes (default)
R2RMLTC0007d | One table, one primary key, two columns, one row | One column mapping, specifying an rr:predicateObjectMap with rdf:type | Yes (opt.)
R2RMLTC0007e | One table, one primary key, two columns, one row | One column mapping, using rr:graphMap and rr:class | No
R2RMLTC0007f | One table, one primary key, two columns, one row | One column mapping, using rr:graphMap and specifying an rr:predicateObjectMap with rdf:type | No

Composite primary key:
R2RMLTC0008a | One table, a composite primary key, three columns, one row | Generation of direct graph from a table with composite primary key | Yes (default)
R2RMLTC0008b | One table, a composite primary key, three columns, one row | Generation of triples referencing object map | Yes (opt.?)
R2RMLTC0008c | One table, a composite primary key, three columns, one row | Generation of triples by using multiple predicateMaps within a rr:predicateObjectMap | Yes (opt.)

Foreign key management:
R2RMLTC0009a | Two tables, a primary key, a foreign key | Generation of triples from foreign key relations | Yes (default)
R2RMLTC0009b | Two tables, a primary key, a foreign key | Generation of triples to multiple graphs | No
R2RMLTC0009c | Two tables, a primary key, a foreign key | Unnamed column in a logical table | No
R2RMLTC0009d | Two tables, a primary key, a foreign key | Named column in a logical table | No


DGraphTC0021 | Two tables, two primary keys, a foreign key, references all nulls | Generation of triples for two tables, two primary keys, a foreign key, references all nulls | No
DGraphTC0022 | Two tables, a primary key, a foreign key, references no primary keys | Generation of triples from two tables, a primary key, a foreign key, references no primary keys | Yes
DGraphTC0023 | Two tables, two primary keys, two foreign keys, references to a key other than primary key | Generation of triples for two tables, two primary keys, two foreign keys, references to a key other than primary key | Yes
DGraphTC0024 | Two tables, two primary keys, a foreign key to a row with some NULLs in the key | Generation of triples from two tables, two primary keys, a foreign key to a row with some NULLs in the key | No
DGraphTC0025 | Three tables, three primary keys, three foreign keys | Generation of triples from three tables, three primary keys, three foreign keys | Yes

Many-to-Many tables:
R2RMLTC0011b | Database with many-to-many relations | M to M relation, by using an additional TriplesMap | Yes (default)
R2RMLTC0011a | Database with many-to-many relations | M to M relation, by using a SQL query | No

IRI value in columns:
R2RMLTC0014d | 3 tables, one primary key, one foreign key | Test the translation of database type codes to IRIs | Yes (optional)
R2RMLTC0019a | One table, one primary key, three columns, three rows | Generation of triples by using IRI value in columns | No
R2RMLTC0019b | One table, one primary key, three columns, three rows | Generation of triples by using IRI value in columns, with data error | No
R2RMLTC0020a | One table, one column, five rows | Generation of triples by using IRI value in columns | No
R2RMLTC0020b | One table, one column, five rows | Generation of triples by using IRI value in columns, with data errors | No

Datatype management:
R2RMLTC0016a | One table, one primary key, ten columns, three rows with SQL datatypes | Table with datatypes: string and integer | Yes
R2RMLTC0016b | One table, one primary key, ten columns, three rows with SQL datatypes | Table with datatypes: real and float | Yes


R2RMLTC0016c | One table, one primary key, ten columns, three rows with SQL datatypes | Table with datatypes: date and timestamp | Yes
R2RMLTC0016d | One table, one primary key, ten columns, three rows with SQL datatypes | Table with datatypes, boolean conversions | Yes
R2RMLTC0016e | One table, one primary key, ten columns, three rows with SQL datatypes | Table with datatypes, binary column | No
R2RMLTC0018a | One table, one primary key, two columns, three rows | Generation of triples by using CHAR datatype column | Yes

Language management:
R2RMLTC0015a | One table, three columns, one composite primary key, three rows, two languages | Generation of language tags from a table with language information | No
R2RMLTC0015b | One table, three columns, one composite primary key, three rows, two languages | Generation of language tags from a table with language information, and a term map with invalid rr:language value | No

Special characters:
R2RMLTC0010a | One table, a primary key, three columns, three rows | Template with table column with special chars | Yes
R2RMLTC0010b | One table, a primary key, three columns, three rows | Template with table columns with special chars | Yes
R2RMLTC0010c | One table, a primary key, three columns, three rows | Template with table columns with special chars and backslashes | No
DGraphTC0017 | I18N No Special Chars | I18N No Special Chars | No

rr:inverseExpression:
R2RMLTC0014a | 3 tables, one primary key, one foreign key | Subjectmap with rr:inverseExpression | No
R2RMLTC0014b | 3 tables, one primary key, one foreign key | Triplesmaps with rr:inverseExpression and rr:joinCondition | No
R2RMLTC0014c | 3 tables, one primary key, one foreign key | Triplesmaps with rr:inverseExpression, rr:joinCondition, and referencing object maps | No

Appendix B

OM 2013: IncMap

This appendix reports the paper:

− Christoph Pinkel, Carsten Binnig, Evgeny Kharlamov, Peter Haase. IncMap: Pay as you go Matching of Relational Schemata to OWL Ontologies. In Proceedings of the International Ontology Matching Workshop, 2013.

IncMap: Pay as you go Matching of Relational Schemata to OWL Ontologies

Christoph Pinkel1, Carsten Binnig2, Evgeny Kharlamov3, and Peter Haase1

1 fluid Operations AG, D-69190 Walldorf, Germany, 2 University of Mannheim, D-68131 Mannheim, Germany, 3 University of Oxford, Oxford, UK

Abstract. Ontology Based Data Access (OBDA) enables access to relational data with a complex structure through ontologies as conceptual domain models. A key component of an OBDA system are mappings between the schematic elements in the ontology and their correspondences in the relational schema. Today, in existing OBDA systems these mappings typically need to be compiled by hand, which is a complex and labor intensive task. In this paper we address the problem of creating such mappings and present IncMap, a system that supports a semi-automatic approach for matching relational schemata and ontologies. Our approach is based on a novel matching technique that represents the schematic elements of an ontology and a relational schema in a unified way. IncMap is designed to work in a query-driven, pay as you go fashion and leverages partial, user-verified mappings to improve subsequent mapping suggestions. This effectively reduces the overall effort compared to compiling a mapping in one step. Moreover, IncMap can incorporate knowledge from user queries to enhance suggestion quality.

1 Introduction

Effective understanding of complex data is a crucial task for enterprises to support decision making and retain competitiveness on the market. This task is not trivial, especially since data volume and complexity keep growing fast in the light of Big Data [1]. While there are many techniques and tools for scalable data analytics today, little is known on how to find the right data. Today, enterprise information systems of large companies store petabytes of data distributed across multiple – typically relational – databases, each with hundreds or sometimes even thousands of tables (e.g., [2]). For example, an installation of an SAP ERP system comes with tens of thousands of tables [3]. Due to the complexity of the data, a typical scenario for data analysis today involves a domain expert who formulates an analytical request and an IT expert who has to understand the request, find the data relevant to it, and then translate the request into an executable query. In large enterprises this process may iterate several times between the domain and IT experts, depending on the complexity of the data and other factors, and may take up to several weeks.

Ontology-based data access (OBDA) [4] is an approach that has recently emerged to provide semantic access to complex structured relational data. The core elements of an OBDA system are an ontology, describing the application domain, and a set of declarative mappings, relating the ontological schema elements (e.g., names of classes and properties) with the relational schema elements (e.g., names of tables and attributes) of the underlying data sources. Using the ontology and the mappings, domain experts can access the data directly by formulating queries in terms defined in the ontology that reflect their vocabulary and conceptualization. Using query rewriting techniques, the end-user queries are then translated into queries over the underlying data sources. Today, most approaches for ontology-based data access focus on the definition of mapping languages and the efficient translation of high-level user queries over an ontology into executable queries over relational data [4,5]. These approaches assume that a declarative mapping of the schema elements of the ontology to the relational elements is already given. So far, in real-world systems [6,7] that follow the ontology-based data access principle, the mappings have to be created manually. The costs for the manual creation of mappings constitute a significant entry barrier for applying OBDA in practice.

To overcome this limitation we propose a novel semi-automatic schema matching approach and a system called IncMap to support the creation of mappings directly from relational schemata to ontologies. We focus on finding one-to-one (direct) correspondences of ontological and relational schema elements, while we also work on extensions for finding more complex correspondences. In order to compute mapping suggestions IncMap uses a relational schema, an OWL ontology, a set of user conjunctive queries over the ontology, and user feedback as basic input. The matching approach of IncMap is inspired by the Similarity Flooding algorithm of Melnik et al. [8], which works well for schemata that follow the same modeling principles (e.g., same level of granularity).
However, applying the Similarity Flooding algorithm naively for matching schema elements of a relational schema to an OWL ontology results in rather poor quality of the suggested correspondences, as we show in our experiments. A major reason is the impedance mismatch between ontologies and relational schemata: while ontologies typically model high-level semantic information, relational schemata describe the syntactical structure on a very low level of granularity.

The contributions of the paper are the following:
– In Section 3, we propose a novel graph structure called IncGraph to represent schema elements from both ontologies and relational schemata in a unified way. Therefore, we devise algorithms to convert an ontology as well as a relational schema into their unified IncGraph representation. We also briefly discuss techniques to further improve IncGraph.
– In Section 4, we present our matching algorithm that we use for matching IncGraphs. Its most prominent feature is that IncMap can produce the mapping incrementally, query by query. While the original Similarity Flooding algorithm generates correspondences for all schema elements, IncMap supports a pay as you go matching strategy. For each query we produce only the required mappings. IncMap leverages the structure of mappings from previous queries to improve suggestion quality. This effectively reduces the total effort for the user to verify mapping suggestions.

– Section 5 presents an experimental evaluation using different (real-world) relational schemata and ontologies. We see that even in the basic version of IncMap, the effort for creating a mapping is up to 20% less than using the Similarity Flooding algorithm in a naive way. In addition, the incremental version of IncMap can reduce the total effort by another 50%–70%.

2 Background

In this section we briefly introduce ontologies [9], relational schemata, and the Similarity Flooding algorithm [8].

Ontologies. An ontology O specifies a conceptualization of a domain in terms of classes and properties and consists of a set of axioms. Without further explanation, ontologies in this paper are OWL ontologies, and we will use the following OWL constructs: object and data properties P, and domains Domain(P) and ranges Range(P) of properties. We denote with Class(O) and Property(O) the sets of class and property names, respectively, occurring in the ontology O. For a given ontology O, with C ∈ Domain(P) we denote the fact that one can derive from O that the class name C is a domain of the property P. Also, C′ ∈ Range(P) denotes the fact that C′ is a range of P and that this is derivable from O.

Relational Schemata. A relational schema R defines a set of relations (tables) T, where each table defines a set of columns c. We also assume that a schema contains foreign keys k that define references between tables.

Similarity Flooding Algorithm. The Similarity Flooding algorithm matches a given schema S with a schema S′. In the first step, directed labeled graphs G(S) and G(S′) are constructed from S and S′, where the nodes represent the schema elements and the edges with labels define relationships between the schema elements. No exact procedure to construct the graphs from the schemata is given in [8]; thus, the Similarity Flooding algorithm is open for any graph construction process. The second step of the algorithm is to merge G(S) and G(S′) into one graph, a so-called pairwise connectivity graph PCG. Intuitively, each node of the PCG is a pair of nodes and represents a potential match between schema elements of S and S′. Then, the PCG is enriched with inverse edges and edge weights (propagation coefficients), where the value of the weights is based on the number of outgoing edges with the same label from a given node. This graph is called the induced propagation graph IPG. The final step of the algorithm is a fix-point computation to propagate initial similarities by using the structural dependencies represented by the propagation coefficients. The fix-point computation terminates based either on threshold values or on the number of iterations. The result is a ranked list of suggested mappings. We refer to [8] for further details.
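To make the fix-point step concrete, here is a minimal Python sketch of the propagation. The graph encoding, the update rule σ′ = normalize(σ + propagation), and the normalization by the maximum score are simplifying assumptions made for this sketch; [8] describes several exact variants.

def similarity_flooding(edges, init, iterations=10):
    """Minimal sketch of the Similarity Flooding fix-point computation.

    edges -- list of (source, target, weight) over IPG nodes, i.e. pairs of
             schema elements; the weights are the propagation coefficients.
    init  -- dict mapping each IPG node to its initial (lexical) similarity.
    """
    sigma = dict(init)
    for _ in range(iterations):
        nxt = dict(sigma)
        for src, tgt, w in edges:
            nxt[tgt] += w * sigma[src]       # propagate similarity along edges
        top = max(nxt.values()) or 1.0       # normalize by the maximum score
        sigma = {n: s / top for n, s in nxt.items()}
    return sigma                             # ranked per-pair match scores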

Algorithm 1: IncGraph for constructing graphs from ontologies
INPUT : OWL ontology O
OUTPUT: Graph G = (V, LblV, E, LblE)
1  Let G = (V, LblV, E, LblE), V = {n⊤}, LblV = {(n⊤, ⊤)}, E = ∅, LblE = ∅;
2  foreach C ∈ Class(O) do V := V ∪ {nC} and LblV(nC) := C
3  foreach P ∈ Property(O) do
4      V := V ∪ {nP} and LblV(nP) := P; Let C ∈ Domain(P);
5      if P is an object property then
6          E := E ∪ {(nC, nP)} and LblE((nC, nP)) := ‘ref’;
7          Let C′ ∈ Range(P);
8          E := E ∪ {(nP, nC′)} and LblE((nP, nC′)) := ‘ref’;
9      else if P is a data property then
10         E := E ∪ {(nC, nP)} and LblE((nC, nP)) := ‘value’
11 return G.

Algorithm 2: IncGraph for constructing graphs from relational schemata
INPUT : Relational Schema R
OUTPUT: Graph G = (V, LblV, E, LblE)
1  Let V = ∅, LblV = ∅, E = ∅, LblE = ∅;
2  foreach table T in R do
3      V := V ∪ {nT} and LblV(nT) := T;
4      foreach column c in T do
5          V := V ∪ {nc} and LblV(nc) := c;
6          E := E ∪ {(nT, nc)} and LblE((nT, nc)) := ‘value’
7          if c has a foreign key k to some table T′ then
8              V := V ∪ {nk} and LblV(nk) := k;
9              E := E ∪ {(nT, nk)} and LblE((nT, nk)) := ‘ref’
10             E := E ∪ {(nk, nT′)} and LblE((nk, nT′)) := ‘ref’
11 return G.

11 return . G

3 The IncGraph Model

In this section, we describe the IncGraph model used by IncMap to represent schema elements of an OWL ontology O and a relational schema R in a unified way. An IncGraph model is defined as a directed labeled graph G = (V, LblV, E, LblE). It can be used as input by the original Similarity Flooding algorithm (Section 2) or IncMap. V represents a set of vertices, E a set of directed edges, LblV a set of labels for vertices (i.e., one label for each vertex), and LblE a set of labels for edges (i.e., one label for each edge). A label l ∈ LblV represents the name of a schema element, whereas a label l ∈ LblE is either “ref”, representing a so-called ref-edge, or “value”, representing a so-called val-edge.

3.1 IncGraph Construction

The goal of the basic construction procedures is to incorporate explicit schema information from O and R into the IncGraph model. Incorporating implicit schema information is discussed in the next section. Algorithm 1 creates an IncGraph model G for a given ontology O. The algorithm constructs a vertex nC for each class name C ∈ Class(O) and a vertex nP for each property name P ∈ Property(O), using the names of these ontology elements as labels in LblV.

Directed edges in the IncGraph model are created for each domain and range definition in O. The labels in LblE are either “ref”, in case of an object property, or “value”, in case of a data property. For a domain definition in O, the direction of the edge in G is from the node nC representing the domain of P to the node nP representing the property P. For a range definition, the direction of the edge in G is from the node nP representing the object property to the node nC′ representing the range of P (i.e., another class). If an object property in O has no range (respectively, domain) definition, then a directed labeled edge to a node n⊤ is added to explicitly model the most general range (respectively, domain), i.e., a top-level concept like Thing.

Algorithm 2 creates an IncGraph model G for a given relational schema R. The algorithm constructs a vertex nT for each table and a vertex nc for each column, using the names of these schema elements as labels in LblV. Directed edges with the label “value” are created from a node nT representing a table to a node nc representing a column of that table. For columns with a foreign key k, an additional node nk is created. Moreover, two directed edges with the label “ref” are added, which represent a path from node nT to a node nT′ representing the referenced table via node nk.

[Fig. 1. IncGraph Construction Example: an ontology O and a relational schema R describing Directors and Movies, together with the resulting models IncGraph(O) and IncGraph(R).]

Figure 1 shows the result of applying these two algorithms to the ontology O and the relational schema R in this figure. Both O and R describe the same entities, Directors and Movies, using different schema elements. The resulting IncGraph models of O and R represent the schema structure in a unified way.

3.2 IncGraph Annotations

IncGraph is designed to represent both relational schemata and ontologies in a structurally similar fashion, because matching approaches such as ours work best when the graph representations on both the source and target side are as similar as possible. However, even in IncGraph, structural differences remain due to the impedance mismatch and the different design patterns used in ontologies and relational schemata, respectively. We address this issue by supporting annotations in IncGraph. Annotations basically are additional ref-edges in either the source or target model that can be designed to bridge structural gaps for different design patterns or levels of granularity. For instance, shortcut edges in the relational IncGraph model could represent a multi-hop join over a chain of relationship relations. Annotations can be constructed by plug-ins during IncGraph construction. We plan to evaluate the opportunities of different kinds of annotations in future work.
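As an illustration of Algorithm 2, the following Python sketch builds an IncGraph from a toy relational schema. The dictionary-based schema catalogue and the node encoding are assumptions made for the example, not the representation used inside IncMap.

def incgraph_from_schema(tables):
    """Sketch of Algorithm 2: build an IncGraph from a relational schema.

    tables -- dict: table name -> list of (column, fk_target_or_None);
              a stand-in for a real schema catalogue.
    Returns (vertices, labels, edges) with edges labelled 'ref' or 'value'.
    """
    vertices, labels, edges = set(), {}, []
    for t, columns in tables.items():
        n_t = ('table', t); vertices.add(n_t); labels[n_t] = t
        for c, fk_target in columns:
            n_c = ('column', t, c); vertices.add(n_c); labels[n_c] = c
            edges.append((n_t, n_c, 'value'))            # table --value--> column
            if fk_target is not None:                    # foreign key k: T -> T'
                n_k = ('fk', t, c); vertices.add(n_k); labels[n_k] = c
                edges.append((n_t, n_k, 'ref'))          # T --ref--> k
                edges.append((n_k, ('table', fk_target), 'ref'))  # k --ref--> T'
    return vertices, labels, edges

# Toy schema in the spirit of Fig. 1: a Movie table referencing Director.
schema = {'Director': [('id', None), ('name', None)],
          'Movie': [('id', None), ('title', None), ('directed_by', 'Director')]}
print(len(incgraph_from_schema(schema)[2]))  # 7 edges: 5 'value' + 2 'ref'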

4 The IncMap System

In this section, we present our matching approach and system called IncMap. IncMap takes a source and a target IncGraph as input, i.e., the IncGraphs produced for a relational schema and for an ontology as described in Section 3.

4.1 Overview of IncMap

In its basic version, IncMap applies the original Similarity Flooding algorithm (with minor adaptions) and thus creates initial mapping suggestions for the IncGraph of an ontology O and a relational schema R. In its extended version, IncMap activates inactive ref-edges before executing the Similarity Flooding algorithm to achieve better mapping suggestions. Another extension is the incremental version of IncMap. In this version the initial mapping suggestions are re-ranked by IncMap in a semi-automatic approach by including user feedback. Re-ranking works iteratively in a query-driven fashion, thus increasing the quality of the suggested mappings. In each iteration, IncMap applies a version of the Similarity Flooding algorithm (as described before); in addition, between iterations user feedback is incorporated. The idea of user feedback is that the user confirms those mapping suggestions of the previous iteration which are required to answer a given user query over the ontology O. Confirmed suggestions are used as input for the next iteration to produce better suggestions for follow-up queries. This is in contrast to many other existing approaches (including the original Similarity Flooding algorithm) that return a mapping for the complete source and target schema only once. IncMap is designed as a framework and provides different knobs to control which extensions to use and, within each extension, which concrete variants to choose (e.g., to select a concrete strategy for activating inactive edges). The goal of this section is to present IncMap with all its variants and to show their benefits for different real-world data sets in our experimental evaluation in Section 5. A major avenue of future work is to apply optimization algorithms that find the best configuration of IncMap for a given ontology O and schema R automatically by searching the configuration space based on the knobs presented before.

4.2 Basic Matching in IncMap

As already mentioned, in the basic version of IncMap we simply apply the Similarity Flooding algorithm to the two IncGraphs produced for a relational schema R and for an ontology O, similar to the process described in Section 2. As a first step, IncMap generates the PCG (i.e., a combined graph which pairs similar nodes of both input IncGraphs) using an initial lexical matching, which supports interchangeable matchers as one knob for configuration. One difference is the handling of inactive ref-edges in the input IncGraphs. For inactive ref-edges, which are not handled in the original Similarity Flooding, we apply the following rule when building the PCG: if an edge in the PCG refers to at least one inactive ref-edge in one of the IncGraph models, it also becomes inactive in the PCG. In addition, other than in the original Similarity Flooding approach, where propagation coefficients for the IPG are ultimately determined during graph construction, our propagation coefficients can be recalculated several times when the graph changes with the activation and deactivation of edges. Also, propagation coefficients in IncMap are modular and can be changed. In particular, a new weighting formula supported by IncMap considers the similarity scores on both ends of an edge in the IPG. The intuition behind this is that a higher score indicates better chances of the match being correct. Thus, an edge between two matches with relatively high scores is more relevant for the structure than an edge between one isolated well-scored match and another with a poor score. For calculating the weight w(e) of a directed edge e = (n1, n2) from n1 to n2 in the IPG, where l is the label of the edge, we currently use two alternatives:
– Original Weight as in [8]: w(e) = 1/out_l, where out_l is the number of edges connected to node n1 with the same label l;
– Normalized Similarity Product: w(e) = (score(n1) ∗ score(n2))/out_l.

4.3 Extended IncMap: Iterative User Feedback

Query-driven incremental mapping allows us to leverage necessary user feedback after each iteration to improve the quality of mapping suggestions in subsequent iterations. One of the reasons why we have chosen Similarity Flooding as a basis for IncMap is the fact that user feedback can be integrated by adapting the initial match scores in an IPG before the fix-point computation starts. Though the possibility of an incremental approach has already been mentioned in the Similarity Flooding paper [8], it has so far not been implemented and evaluated. Also, while it is simple to see where user feedback could be incorporated in the IPG, it is far less trivial to decide which feedback should be employed and how exactly it should be integrated in the graph. In this paper we focus on leveraging only the most important kind of user feedback, i.e., the previous confirmation and rejection of suggested mappings. We have devised three alternative methods for adding this kind of feedback into the graph. First, as a confirmed match corresponds to a certain score of 1.0, while a rejected match corresponds to a score of 0.0, we could simply re-run the fix-point computation with adjusted initial scores of confirmed and/or rejected matches. We consequently name this first method Initializer. However, there is a clear risk that the influence of such a simple initialization on the resulting mapping is too small, as scores tend to change rapidly during the first steps of the fix-point computation.
To tackle this potential problem, our second method guarantees maximum influence of feedback throughout the fix-point computation. Instead of just initializing a confirmed or rejected match with its final score once, we repeat the initialization at the end of each step of the fix-point computation, after normalization.

5 Experimental Evaluation

The main goal of IncMap is to reduce the human effort for constructing mappings between existing relational database schemata and ontologies. Mapping suggestions are intended to be used only after they have been validated by a user. Thus, there are two relevant evaluation measures. The first is the percentage of the reference mappings that can be represented by IncMap; we specify this percentage for all reference mappings when introducing them. Certain complex mappings (e.g., mappings performing data transformations) cannot be represented by IncMap, but such complex mappings are rare in all real-world reference mappings we used in this paper. The second and most important measure is the amount of work that a user needs to invest to transform a set of mapping suggestions into the correct (intended) mappings. As the latter is the most crucial aspect, we evaluate our approach by measuring the work time required to transform our suggestions into the existing reference mappings.

5.1 Relational Schemata and Ontologies

To show the general viability of our approach, we evaluate IncMap in two scenarios with fairly different schematic properties. In addition to showing the key benefits of the approach under different conditions, this also demonstrates how the impact of modular parameters varies for different scenarios.

IMDB and Movie Ontology. As a first scenario, we evaluate a mapping from the schema of the well known movie database IMDB (http://www.imdb.com) to the Movie Ontology [10]. With 27 foreign keys connecting 21 tables in the relational schema and 27 explicitly modeled object properties of 21 classes in the ontology, this scenario is average in size and structural complexity. The reference mappings we use to derive correspondences for this scenario (https://babbage.inf.unibz.it/trac/obdapublic/wiki/Example MovieOntology) have been made available by the -ontop- team [11]. A set of example queries is provided together with these reference mappings. We use these to construct annotations for user queries as well as to structure our incremental, query-by-query experiments. We extract a total of 73 potential correspondences from this mapping, 65 of which can be represented by IncMap as mapping suggestions. This corresponds to 89% of the mappings that could be represented in IncMap.

MusicBrainz and Music Ontology. The second scenario is a mapping from the MusicBrainz database (http://musicbrainz.org/doc/MusicBrainz Database) to the Music Ontology [12]. The relational schema contains 271 foreign keys connecting 149 tables, while the ontology contains 169 explicitly modeled object properties and 100 classes, making the scenario both larger and more densely connected than the previous one. Here we use R2RML reference mappings that have been developed in the EUCLID project (http://euclid-project.eu). As no example queries were provided with the mapping in this case, we use example queries provided by the Music Ontology for user query annotations and to structure the incremental experiment runs. For these reference mappings, two out of 48 correspondences cannot be represented as mapping suggestions by IncMap, as they require data transformations. This corresponds to 95.8% of the mappings that could be represented in IncMap.

5.2 Work Time Cost Model

We evaluate our algorithms w.r.t. reducing work time (human effort). As the user feedback process always needs to transform the mapping suggestions generated by IncMap into the correct mappings (i.e., to achieve a precision and recall of 100%), the involved effort is the one distinctive quality measure. To this end, we have devised a simple and straightforward work time cost model as follows: we assume that users validate mappings one by one, either accepting or rejecting them. We further assume that each validation, on average, takes a user the same amount of time t_validate. The cost of finding the correct correspondence for any concept is then the rank of the correct mapping suggestion in the ranked list of mapping suggestions for the concept, times t_validate. As IncMap is interactive by design and would propose to the user one mapping suggestion after another, this model closely corresponds to end-user reality. We are aware that this process represents a simplification of mapping reality, where users may compile some of the mappings by other means for various reasons. Nevertheless, this happens in the same way for any suggestion system and therefore does not impact the validity of our model for the purpose of comparison.

5.3 Experimental Evaluation

Experiment 1 – Naive vs. IncGraph. In our first experiment we compare the effort required to correct the mapping suggestions when the schema and ontology are represented naively, or as IncGraphs.

[Fig. 2. Experimental Evaluation. (a) Naive vs. IncGraph: effort [actions] on IMDB and on the Music Ontology scenario for initial and final mappings, comparing naive Similarity Flooding with IncGraph under three initial matchers (Random, LS Similarity, Inverse LS Dist.). (b) Incremental Runs: effort [actions] on both scenarios for Non-Incremental, Initializer, Self-Confidence Nodes, and Influence Nodes, under the Norm. Sim. Product and Original Weights propagation coefficients.]

Additionally, we vary the lexical matcher used for the initial mapping between randomly assigned scores (minimal baseline), Levenshtein similarity, and inverse Levenshtein distance. Figure 2(a) shows that IncGraph in all cases works better than the naive approach. As IncMap reliably improves the mapping for all configurations, this also underlines the ability of IncMap to operate in a stable manner with different initial matchers.

Experiment 2 – Incremental Mapping Generation. Finally, we evaluated the best previous configurations incrementally, i.e., leveraging partial mappings. Figure 2(b) illustrates the effects on the total effort. We show the total effort for all three incremental methods, for different propagation coefficients. Most significantly, incremental evaluation reduces the overall effort by up to 50%–70%. More specifically, Self-Confidence Nodes and Influence Nodes work much better than the naive Initializer approach.

6 Related Work

Many existing mapping systems rely on two-step mapping procedures: they employ lexical similarity of terms together with structural similarity of the structures ([13,14,15], or [16,17] for surveys). Very few of them rely on variations of Similarity Flooding to perform the latter task. However, to the best of our knowledge, all of these approaches focus on ontology-to-ontology rather than relational schema-to-ontology mappings. RiMOM [18] performs multi-strategy mapping discovery between ontologies and performs mappings using a variant of the Similarity Flooding algorithm, but it relies on structural similarities of ontologies derived from sub-class and sub-property relationships, rather than the connectivity of classes via properties, as we do in order to get a better alignment of relational schemata and ontologies. In YAM++ [19] the authors used Similarity Flooding and exploited both sub-class and sub-property relationships, as well as domains and ranges of ontologies, but they did so in a naive way which, as our experimental results showed, does not give good results for relational schema-to-ontology mappings. Moreover, they use Similarity Flooding to obtain new mappings on top of the ones obtained via linguistic similarities, while we do not derive new mappings but refine the ranking over the linguistically derived ones. There are works on semi-automatic discovery of relational schema-to-ontology mappings, but they use approaches different from ours. For example, [20] transforms relational schemata and ontologies into directed labeled graphs and reuses COMA [21] for essentially syntactic graph matching. Ronto [22] uses a combination of syntactic strategies to discover mappings by distinguishing the types of entities in relational schemata. The authors of [23] exploit the structure of ontologies and relational schemata by calculating confidence measures between virtual documents corresponding to them via the TF/IDF model. All these approaches neither incorporate implicit schema information nor support incremental mapping construction in a pay as you go fashion as IncMap does. Finally, [24] describes an approach to derive complex correspondences for a relational schema-to-ontology mapping using simple correspondences as input. This work is orthogonal to the approach presented in this paper.

7 Conclusions and Outlook

We presented IncMap, a novel semi-automatic matching approach for generating relational schema-to-ontology mappings. Our approach is based on a novel unified graph model called IncGraph for ontologies and relational schemata. IncMap implements a semi-automatic matching approach to derive mappings from IncGraphs using both lexical and structural similarities between ontologies and relational schemata. In order to find structural similarities, IncMap exploits both explicit and implicit schema information. Moreover, IncMap can incorporate user queries and user feedback in an incremental way, thus enabling mapping generation in a pay as you go fashion. Our experiments with IncMap on different real-world relational schemata and ontologies showed that the effort for creating a mapping with IncMap is up to 20% less than using the Similarity Flooding algorithm in a naive way. The incremental version of IncMap reduces the total effort of mapping creation by another 50%–70%. As future work we plan to follow three lines: (1) add more implicit schema information (annotations) to the IncGraphs, (2) support more complex mappings in IncMap, and (3) devise a search strategy over the configuration space to auto-tune IncMap.

8 Acknowledgements

This work was supported by the Seventh Framework Program (FP7) of the European Commission under Grant Agreement 318338, the Optique project.

References

1. Beyer, M.A., Lapkin, A., Gall, N., Feinberg, D., Sribar, V.T.: ‘Big Data’ is Only the Beginning of Extreme Information Management. Gartner rep. G00211490 (2011)
2. Crompton, J.: Keynote talk at the W3C Workshop on Sem. Web in Oil & Gas Industry (2008) http://www.w3.org/2008/12/ogws-slides/Crompton.pdf
3. SAP HANA Help: http://help.sap.com/hana/html/sql export.html (2013)
4. Poggi, A., Lembo, D., Calvanese, D., Giacomo, G.D., Lenzerini, M., Rosati, R.: Linking Data to Ontologies. J. Data Semantics 10 (2008) 133–173
5. Kontchakov, R., Lutz, C., Toman, D., Wolter, F., Zakharyaschev, M.: The Combined Approach to Ontology-Based Data Access. In: IJCAI. (2011) 2656–2661
6. Hepp, M., Wechselberger, A.: OntoNaviERP: Ontology-Supported Navigation in ERP Software Documentation. In: International Semantic Web Conference. (2008)
7. Blunschi, L., Jossen, C., Kossmann, D., Mori, M., Stockinger, K.: SODA: Generating SQL for Business Users. PVLDB 5(10) (2012) 932–943
8. Melnik, S., Garcia-Molina, H., Rahm, E.: Similarity Flooding: A Versatile Graph Matching Algorithm and its Application to Schema Matching. In: ICDE, IEEE Computer Society (2002)
9. Motik, B., Patel-Schneider, P.F., Parsia, B.: OWL 2 Web Ontology Language Structural Specification and Functional-Style Syntax (2012) W3C Rec.
10. Bouza, A.: MO – The Movie Ontology, http://www.movieontology.org (2010)
11. Rodriguez-Muro, M., Calvanese, D.: High Performance Query Answering over DL-Lite Ontologies. In: KR. (2012)
12. Raimond, Y., Giasson, F., (eds): Music Ontology, www.musicontology.com (2012)
13. Jiménez-Ruiz, E., Grau, B.C.: LogMap: Logic-Based and Scalable Ontology Matching. In: International Semantic Web Conference (1). (2011) 273–288
14. Lambrix, P., Tan, H.: SAMBO – A system for aligning and merging biomedical ontologies. J. Web Sem. 4(3) (2006) 196–206
15. Fagin, R., Haas, L.M., Hernández, M.A., Miller, R.J., Popa, L., Velegrakis, Y.: Clio: Schema Mapping Creation and Data Exchange. In: Conceptual Modeling: Foundations and Applications. (2009) 198–236
16. Shvaiko, P., Euzenat, J.: Ontology Matching: State of the Art and Future Challenges. IEEE Trans. Knowl. Data Eng. 25(1) (2013) 158–176
17. Rahm, E., Bernstein, P.A.: A Survey of Approaches to Automatic Schema Matching. In: VLDB J. (2001) 334–350
18. Li, J., Tang, J., Li, Y., Luo, Q.: RiMOM: A Dynamic Multistrategy Ontology Alignment Framework. IEEE Trans. Knowl. Data Eng. (2009) 1218–1232
19. Ngo, D., Bellahsene, Z.: YAM++: A Multi-strategy Based Approach for Ontology Matching Task. In: EKAW. (2012) 421–425
20. Dragut, E.C., Lawrence, R.: Composing Mappings Between Schemas Using a Reference Ontology. In: CoopIS/DOA/ODBASE (1). (2004) 783–800
21. Do, H.H., Rahm, E.: COMA – A System for Flexible Combination of Schema Matching Approaches. In: VLDB. (2002) 610–621
22. Papapanagiotou, P., Katsiouli, P., Tsetsos, V., Anagnostopoulos, C., Hadjiefthymiades, S.: Ronto: Relational to Ontology Schema Matching. In: AIS SIGSEMIS BULLETIN. (2006) 32–34
23. Hu, W., Qu, Y.: Discovering Simple Mappings Between Relational Database Schemas and Ontologies. In: ISWC/ASWC. (2007) 225–238
24. An, Y., Borgida, A., Mylopoulos, J.: Inferring Complex Semantic Mappings Between Relational Tables and Ontologies from Simple Correspondences. In: OTM Conferences (2). (2005)

Appendix C

Initial guidelines for OBDA specification

This appendix reports the paper:

− Maurizio Lenzerini, Riccardo Rosati, Valerio Santarelli, Domenico Fabio Savo. Towards a methodology for OBDA specification. Optique internal technical report, 2013.

Towards a methodology for OBDA specification

Maurizio Lenzerini, Riccardo Rosati, Valerio Santarelli, Domenico Fabio Savo

Dipartimento di Ingegneria Informatica, Automatica e Gestionale Antonio Ruberti, Sapienza Università di Roma
lenzerini,rosati,santarelli,[email protected]

Introduction

This document aims at providing some initial guidelines towards the definition of a methodology for the specification of an ontology-based data access (OBDA) system. Dealing with complex real-world scenarios, such as those of businesses or enterprises, even of small or medium scale, leads to facing several critical issues that are general enough to be taken into account in the OBDA approach regardless of the organization, and that can be summarized as:

(i) Knowledge gathering over the domain of interest;
(ii) Ontology development;
(iii) Data source analysis;
(iv) Mapping development.

Activities (i) and (iii) are requirement gathering activities, necessary to allow the ontology developer to acquire sufficient knowledge of the domain and of the data sources to develop, respectively, the ontology that models the domain of interest, and the mappings that link the ontology to the data sources. Both these activities require collaboration on the part of enterprise experts. Specifically, activity (i) requires the collaboration of domain experts, while activity (iii) requires the collaboration of IT experts, who have sufficient understanding of the data sources. During the acquisition of the necessary information for the development of the ontology, it is crucial that the description of the domain be as independent as possible from the actual source database. The ontology is in fact supposed to model the domain of interest, and not the way the data sources that are used in the enterprise model such a domain. Once the development of the ontology is considered to be sufficiently mature, the data source analysis phase can begin, which leads to the definition of the mappings. This activity can nevertheless lead to the acquisition of new knowledge of the domain, and to the consequent refinement of the ontology. The definition of the mappings, therefore, takes as input both the analysis of the data sources and the ontology. The result of this activity will be a set of mapping assertions. The process which we have described is depicted in Figure 1. In the following, we provide a detailed description of all the above-mentioned activities.

[Fig. 1. The OBDA specification process.]

Knowledge gathering over the domain of interest

When the OBDA approach is used in real-life projects, the first problem that needs to be addressed concerns communication with the domain expert. As for conceptual modeling in general, developing an ontology requires exchanging knowledge with people who are generally not experts in logical formalisms, but who have a deep understanding of the domain of interest. Exchanging this kind of knowledge requires the adoption of one or more techniques allowing the ontology developer to acquire the knowledge necessary for modeling the domain of interest at hand.

Interviews. Interviews can be conducted by the ontology developer following different approaches. The first, and most common, approach is the unstructured interview. In this scenario, the ontology developer and the knowledge expert freely discuss one or more aspects of the domain. This type of interview is typically conducted in the early stages of knowledge acquisition, when the ontology developer lacks sufficient knowledge to formulate a questionnaire. Instead, a semi-structured or structured interview consists of a series of pre-established questions, which the knowledge expert is provided with beforehand. In a semi-structured interview, the answers to these questions can lead to a discussion, during which the ontology developer focuses on some key aspects. In a structured interview, no follow-up discussion is expected. These two types of interviews are generally conducted in later stages of the project, when the ontology developer has a more mature understanding of the domain.

Commentary. This technique, instead of a discussion between the ontology developer and the domain expert, calls for the developer to observe the knowledge expert carrying out a typical task. The expert is required to provide a running commentary of his thought process and of his activities as he completes the task. In order to avoid cognitive overload of the expert, who must both carry out his task and provide useful insight, it is common to have a second domain expert comment on the actions of the first. This technique is quite useful because it allows the ontology designer to have a first-hand look at the actual behavior of the domain expert and of the tools at his disposal, instead of just listening to a recollection of his actions once a task is completed.

Teach Back. Once the ontology developer has acquired sufficient knowledge of a portion of the domain, he explains his understanding of that portion of the domain to the expert, who comments on it. This technique is very useful to assess the level of comprehension of the developer, and to shed light on areas of confusion or misunderstanding. For this activity, it is very important that the knowledge expert and the ontology developer find a communication tool with which both are comfortable. A technique that we have found to be very effective is the use of graphical formalisms to represent the ontology. In fact, a graphical representation is typically accessible to non-experts of logical and ontology formalisms, and can also allow the ontology developer to capture the main modeling features of the OWL 2 language. This choice proves to be effective both in terms of improving communication between the parties and in later phases of ontology refinement, allowing the developer to comfortably define and analyze the ontology.

Documentation Analysis. This technique calls for the inspection by the ontology developer of existing documentation of the business processes and the information systems that are used by the enterprise. This analysis allows the ontology developer to extract information regarding the domain of interest, both with respect to what data is used in these processes and how it is managed by the information systems. Examples of aspects on which in-depth information can be acquired through the inspection of such documentation are: business entities, entity attributes, rules, and functionalities. This information allows the developer to carve a more precise outline of the domain that must be captured by the ontology.

Ontology development

There exist several approaches and methodologies for ontology engineering (e.g., [2], [4], [1], [5]). Based on the above results, in the following we briefly sketch some initial guidelines for this task. We remark, however, that we do not intend to propose any new approach in this direction: rather, our long-term goal is to create a methodology for OBDA specification that is independent of the particular methodology chosen for ontology development and maintenance. The development of the ontology takes place during the requirement gathering phase. Initially, the acquired knowledge of the domain is inspected top-down, identifying the central concepts and the relations between them. Then, iteratively, with the collaboration of the domain expert, the ontology is refined. The knowledge gathered from the inspection of the data sources is used only in the final stages of the design phase, in order to refine the ontology. As mentioned previously, in order to represent the ontology through a formalism which can both be comfortably understood by the domain expert and be sufficiently expressive for the ontology designer, one can use a graphical language, which can be enriched with auxiliary documentation regarding the design choices that were made. In case the ontology developer chooses to adopt this solution, the ontology must then be translated into a set of processable logical axioms. In other words, it must be written in some formal language, such as OWL 2. If an automated tool is available, it can be used to efficiently carry out this task. Otherwise, it must be done manually, with the aid of an ontology editor, such as Protégé. The choice of the modeling language that will be adopted depends on the purpose for which the ontology is being developed. In fact, the use of very expressive languages such as the full OWL 2 language is very useful if the aim is to utilize the ontology as a formal description of the domain of interest. The expressivity of such languages allows the ontology designer to obtain a precise formalization of the domain. If instead the goal is to use the ontology for reasoning tasks, the high expressivity of the language used to model the ontology may be a hindrance. In particular, when wishing to access large quantities of data through the ontology, as in OBDA, the computational cost of very expressive languages such as OWL 2 is prohibitive. In these cases, it is necessary to resort to less expressive languages, thus accepting less complete representations of the domain of interest. One possible solution to this issue is to allow the ontology designer the use of expressive languages to define ontologies that model the domain in great detail for the purpose of documentation and of other tasks that do not require strong computational effort, while adopting, for all those reasoning tasks in which particular computational properties are required, such as OBDA, descriptions of the domain of interest obtained through less expressive languages. The transition from an ontology formalized in an expressive language to one formalized in a less expressive language can be performed through automated approximation procedures. During this phase, the execution of intensional reasoning tasks such as ontology classification is very important, in order to verify the quality and correctness of the choices made during ontology design.
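To make the approximation step concrete, consider a small illustrative example (the axioms below are invented for exposition and are not taken from any Optique ontology). The axiom

$\mathit{Employee} \sqsubseteq \mathit{Manager} \sqcup \mathit{Staff}$

uses disjunction on its right-hand side and is therefore not expressible in OWL 2 QL. If the ontology also contains $\mathit{Manager} \sqsubseteq \mathit{Person}$ and $\mathit{Staff} \sqsubseteq \mathit{Person}$, a sound approximation can retain the entailed consequence

$\mathit{Employee} \sqsubseteq \mathit{Person}$

which is expressible in the profile. The original disjunction is lost, which is exactly the sense in which the approximated ontology is a less complete representation of the domain.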

Data source analysis

The first, crucial step of this phase is to identify the data sources which store the information regarding the domain that is represented by the developed ontology. Once these data sources have been identified, it is necessary to understand where the information is stored and how it is possible to extract it. This task is very complex, due to the fact that the data sources are often tailored towards being used by software applications, and that the gap between the database structure and the ontology representation of the domain can be quite large. For this reason, it is very important in this phase to obtain the collaboration of IT experts who have sufficient expertise of the data sources. Other methods to acquire knowledge of the databases can be the inspection of documentation, and the analysis of the automated procedures that are used by the enterprises to extract information. Empirical evidence shows that, in order to gather sufficient information for defining the mappings, it is not sufficient to work solely with the database schema: it is necessary to have access to the data. Finally, it is quite common that this data source analysis phase leads to acquiring new information that must be used to refine the ontology.

Mapping development

Mapping development has been studied in the fields of data integration and data exchange (see, e.g., [7], [3], [6], [8]). However, the issue of defining a comprehensive methodology for mapping development and mapping maintenance is still to be fully addressed. When sufficient information about the data sources has been gathered, and one is confident that the development of the ontology is sufficiently stable, the mapping development phase can begin. Currently, this task is performed completely manually, and involves both the ontology designer and the IT expert. To successfully carry out this task, it is important to have access to the data sources, and in particular, to be able to query the database, in order to verify the mapping assertions that are being defined. Our experience teaches us that in this phase, automated systems for the verification of the syntactic correctness of the mappings are very useful. Furthermore, reasoning services over the OBDA system, such as the consistency checking service, can highlight issues that can be indicators of errors made during mapping specification.
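The paper does not prescribe a specific tool for this verification. As a minimal sketch of the idea (assuming, purely for illustration, a source database reachable through Python's sqlite3 module; the mapping ids and schema below are hypothetical), one can at least detect syntactically broken source queries by asking the database to compile each mapping's SQL without executing it:

    import sqlite3

    def check_mapping_sql(conn, mapping_queries):
        # mapping_queries maps a mapping-assertion id to its source SQL query.
        # EXPLAIN forces SQLite to parse and compile the statement, so syntax
        # errors and unknown tables/columns surface without fetching any data.
        errors = {}
        for mapping_id, sql in mapping_queries.items():
            try:
                conn.execute("EXPLAIN " + sql)
            except sqlite3.Error as exc:
                errors[mapping_id] = str(exc)
        return errors

    # Usage against a throwaway in-memory database:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (id INTEGER, name TEXT)")
    print(check_mapping_sql(conn, {
        "m1": "SELECT id, name FROM employee",   # compiles fine
        "m2": "SELECT id, nam FROM employee",    # reported: no such column
    }))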

References

1. A. Borgida, V. K. Chaudhri, P. Giorgini, and E. S. K. Yu, editors. Conceptual Modeling: Foundations and Applications - Essays in Honor of John Mylopoulos, volume 5600 of Lecture Notes in Computer Science. Springer, 2009.
2. M. del Carmen Suárez-Figueroa, A. Gómez-Pérez, and M. Fernández-López. The NeOn methodology for ontology engineering. In M. del Carmen Suárez-Figueroa, A. Gómez-Pérez, E. Motta, and A. Gangemi, editors, Ontology Engineering in a Networked World, pages 9–34. Springer, 2012.
3. A. Doan, A. Y. Halevy, and Z. G. Ives. Principles of Data Integration. Morgan Kaufmann, 2012.
4. N. Guarino. The ontological level: Revisiting 30 years of knowledge representation. In Borgida et al. [1], pages 52–67.
5. N. Guarino and C. A. Welty. An overview of OntoClean. In S. Staab and R. Studer, editors, Handbook on Ontologies, International Handbooks on Information Systems, pages 151–172. Springer, 2004.
6. A. Y. Halevy, A. Rajaraman, and J. J. Ordille. Data integration: The teenage years. In U. Dayal, K.-Y. Whang, D. B. Lomet, G. Alonso, G. M. Lohman, M. L. Kersten, S. K. Cha, and Y.-K. Kim, editors, VLDB, pages 9–16. ACM, 2006.
7. P. G. Kolaitis, M. Lenzerini, and N. Schweikardt, editors. Data Exchange, Integration, and Streams, volume 5 of Dagstuhl Follow-Ups. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2013.
8. M. Lenzerini. Data integration: A theoretical perspective. In L. Popa, S. Abiteboul, and P. G. Kolaitis, editors, PODS, pages 233–246. ACM, 2002.

Appendix D

LogMap: OM 2013 paper

This appendix reports the paper:

− Ernesto Jiménez-Ruiz, Bernardo Cuenca Grau, and Ian Horrocks. LogMap and LogMapLt Results for OAEI 2013. OM 2013.

LogMap and LogMapLt results for OAEI 2013

Ernesto Jiménez-Ruiz, Bernardo Cuenca Grau, Ian Horrocks

Department of Computer Science, University of Oxford, Oxford, UK

Abstract. We present the results obtained in the OAEI 2013 campaign by our ontology matching system LogMap and its “lightweight” variant called LogMapLt. The LogMap project started in January 2011 with the objective of developing a scalable and logic-based ontology matching system. This is our fourth participation in the OAEI and the experience has so far been very positive.

1 Presentation of the system

LogMap [11,12] is a highly scalable ontology matching system with built-in reasoning and inconsistency repair capabilities. LogMap also supports (real-time) user interaction during the matching process, which is essential for use cases requiring very accurate mappings. LogMap is one of the few ontology matching systems that (1) can efficiently match semantically rich ontologies containing tens (and even hundreds) of thousands of classes, (2) incorporate sophisticated reasoning and repair techniques to minimise the number of logical inconsistencies, and (3) provide support for user intervention during the matching process. LogMap is also available as a “lightweight” variant called LogMapLt, which essentially only applies (efficient) string matching techniques.

LogMap relies on the following elements, which are key to its favourable scalability behaviour (see [11,12] for details).

Lexical indexation. An inverted index is used to store the lexical information contained in the input ontologies. This index is the key to efficiently computing an initial set of mappings of manageable size. Similar indexes have been successfully used in information retrieval and search engine technologies [4].

Logic-based module extraction. The practical feasibility of unsatisfiability detection and repair critically depends on the size of the input ontologies. To reduce the size of the problem, we exploit ontology modularisation techniques. Ontology modules with well-understood semantic properties can be efficiently computed and are typically much smaller than the input ontology (e.g. [7]).

Propositional Horn reasoning. The relevant modules in the input ontologies together with (a subset of) the candidate mappings are encoded in LogMap using a Horn propositional representation. Furthermore, LogMap implements the classic Dowling-Gallier algorithm for propositional Horn satisfiability [8, 10] (a sketch of this kind of propagation is given after these elements). Such an encoding, although incomplete, allows LogMap to detect unsatisfiable classes soundly and efficiently.

Axiom tracking and greedy repair. LogMap extends Dowling-Gallier's algorithm to track all mappings that may be involved in the unsatisfiability of a class. This extension is key to implementing a highly scalable repair algorithm.

Semantic indexation. The Horn propositional representation of the ontology modules and the mappings are efficiently indexed using an interval labelling schema [1] — an optimised data structure for storing directed acyclic graphs (DAGs) that significantly reduces the cost of answering taxonomic queries [6,17]. In particular, this semantic index allows us to answer many entailment queries over the input ontologies and the mappings computed thus far as an index lookup operation, and hence without the need for reasoning. The semantic index complements the use of the propositional encoding to detect and repair unsatisfiable classes.
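LogMap's actual implementation is not listed in the paper. As a rough Python sketch of the propagation at the core of the cited Dowling-Gallier procedure (the clause encoding and names below are ours, not LogMap's), linear-time Horn satisfiability can be checked as follows:

    from collections import deque

    def horn_satisfiable(clauses):
        # Each clause is a pair (body, head): body is a set of atoms and head
        # is an atom, or None to encode a goal clause 'body -> false'.
        watch = {}        # atom -> indices of clauses whose body contains it
        remaining = []    # per clause: number of body atoms not yet derived
        queue = deque()
        for idx, (body, head) in enumerate(clauses):
            remaining.append(len(body))
            if not body:
                queue.append(idx)             # facts fire immediately
            for atom in body:
                watch.setdefault(atom, []).append(idx)
        derived = set()
        while queue:
            idx = queue.popleft()
            head = clauses[idx][1]
            if head is None:
                return False, derived         # derived 'false': unsatisfiable
            if head in derived:
                continue
            derived.add(head)
            for dep in watch.get(head, ()):   # wake up dependent clauses
                remaining[dep] -= 1
                if remaining[dep] == 0:
                    queue.append(dep)
        return True, derived

    # Example: A holds, A -> B, B -> false; the clause set is unsatisfiable.
    print(horn_satisfiable([(set(), "A"), ({"A"}, "B"), ({"B"}, None)]))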

1.1 Adaptations made for the 2013 evaluation

The new version of LogMap also integrates MORe [2,3] as its OWL 2 reasoner. MORe is a modular reasoner which combines a fully-fledged (and slower) reasoner with a profile-specific (and more efficient) reasoner. LogMap's algorithm described in [11–13] has also been adapted to meet the requirements of the new interactive matching track, which uses an Oracle as expert user. LogMap aims at making a reduced number of calls to the Oracle, i.e., it asks only about those borderline mappings that cannot be clearly included or excluded with automatic heuristics. For each call to the Oracle, LogMap applies conflict- and ambiguity-based heuristics (see [12] for details) to reduce the remaining number of calls (i.e. mappings). Additionally, the interactive algorithm described in [12] has been slightly extended to include object and data properties in the process.
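The thresholds LogMap actually uses are not given here; the following Python sketch (names and threshold values are illustrative only, not LogMap's) shows the general shape of such a strategy, where only borderline candidates cost a call to the Oracle:

    def select_oracle_questions(candidates, accept_thr=0.85, reject_thr=0.35):
        # candidates: iterable of (mapping, confidence) pairs, confidence in [0, 1].
        # Clearly high or clearly low confidence mappings are decided
        # automatically; only the borderline ones are forwarded to the Oracle.
        accepted, rejected, ask_oracle = [], [], []
        for mapping, confidence in candidates:
            if confidence >= accept_thr:
                accepted.append(mapping)
            elif confidence < reject_thr:
                rejected.append(mapping)
            else:
                ask_oracle.append(mapping)
        return accepted, rejected, ask_oracle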

1.2 Link to the system and parameters file

LogMap is open-source and released under the GNU Lesser General Public License 3.0 (http://www.gnu.org/licenses/). The latest components and source code are available from LogMap's Google code page: http://code.google.com/p/logmap-matcher/. LogMap distributions can be easily customized through a configuration file containing the matching parameters. LogMap, including support for interactive ontology matching, can also be used directly through an AJAX-based Web interface: http://csu6325.cs.ox.ac.uk/. This interface has been very well received by the community, with more than 900 requests processed so far coming from a broad range of users.

1.3 Modular support for mapping repair

Only very few systems participating in the OAEI 2013 competition implement repair techniques. As a result, existing matching systems (even those that typically achieve very high precision scores) compute mappings that in many cases lead to a large number of unsatisfiable classes.

We believe that these systems could significantly improve their output if they were to implement repair techniques similar to those available in LogMap. Therefore, with the goal of providing a useful service to the community, we have made LogMap's ontology repair module (LogMap-Repair) available as a self-contained software component that can be seamlessly integrated in most existing ontology matching systems [14].

Table 1: Results for Benchmark track.
               biblio 2012          biblioc
System        P     R     F        P     R     F
LogMap      1.00  0.47  0.64     0.73  0.42  0.53
LogMapLt    0.95  0.50  0.66     0.43  0.50  0.46

Table 2: Results for Anatomy track.
System        P      R      F     Time (s)
LogMap      0.918  0.846  0.881     13
LogMapLt    0.962  0.728  0.829      7

2 Results

In this section, we present a summary of the results obtained by LogMap and LogMapLt in the OAEI 2013 campaign. Please refer to http://oaei.ontologymatching.org/2013/results/index.html for complete results.

2.1 Benchmark track

Ontologies in this track have been synthetically generated. The goal of this track is to evaluate the matching systems in scenarios where the input ontologies lack important information (e.g., classes contain no meaningful URIs or labels) [9]. Table 1 summarises the average results obtained by LogMap and LogMapLt. Note that the computation of candidate mappings in LogMap and LogMapLt heavily relies on the similarities between the vocabularies of the input ontologies; hence, there is a direct negative impact in the cases where the labels are replaced by random strings.
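Throughout these tables, P and R denote precision and recall against the reference alignment, and the F-measure is their harmonic mean:

$F = \frac{2 \cdot P \cdot R}{P + R}$

For instance, LogMap's biblio 2012 row gives $F = 2 \cdot 1.00 \cdot 0.47 / 1.47 \approx 0.64$.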

2.2 Anatomy track

This track involves the matching of the Adult Mouse Anatomy ontology (2,744 classes) and a fragment of the NCI ontology describing human anatomy (3,304 classes). The reference alignment has been manually curated [19], and it contains a significant number of non-trivial mappings. Table 2 summarises the results obtained by LogMap and LogMapLt. LogMap ranked 3rd among the systems not using specialised background knowledge. Regarding mapping coherence, only two tools (including LogMap) generated coherent alignments. The evaluation was run on a server with 3.46 GHz (6 cores) and 8GB RAM.

Table 3: Results for Conference track.
               RA1 reference        RA2 reference
System        P     R     F        P     R     F     Time (s)
LogMap      0.80  0.59  0.68     0.76  0.54  0.63      24
LogMapLt    0.73  0.50  0.59     0.68  0.45  0.54      21

Table 4: Results for Library track.
System        P      R      F     Time (s)
LogMap      0.777  0.645  0.705     99
LogMapLt    0.646  0.771  0.703     20

2.3 Conference track

The Conference track uses a collection of 16 ontologies from the domain of academic conferences [18]. These ontologies have been created manually by different people and are of very small size (between 14 and 140 entities). The track uses two reference alignments, RA1 and RA2. RA1 contains manually curated mappings between 21 ontology pairs, while RA2 also contains composed mappings based on the alignments in RA1. Table 3 summarises the average results obtained by LogMap and LogMapLt. The last column represents the total runtime for generating all 21 alignments. Tests were run on a laptop with an Intel Core i5 2.67GHz and 8GB RAM. LogMap ranked 3rd and produced coherent alignments.

2.4 Multifarm track

This track is based on the translation of the OntoFarm collection of ontologies into 9 different languages [16]. Both LogMap and LogMapLt, as expected, obtained poor results since they do not implement specific multilingual techniques.

2.5 Library track

The library track involves the matching of the STW thesaurus (6,575 classes) and the TheSoz thesaurus (8,376 classes). Both of these thesauri provide vocabulary for the economic and social sciences. Table 4 summarises the results obtained by LogMap and LogMapLt. The track was run on a computer with a 2.4GHz processor (2 cores) and 7GB RAM. LogMap ranked 5th in this track.

2.6 Interactive matching track

The interactive track is based on the conference track and it uses the RA1 reference alignment as Oracle. Table 5 summarizes the results obtained by LogMap with and without the interactive mode activated. LogMap with interactivity (LogMap-Int) improved both the average Precision and Recall wrt LogMap with the interactive mode deactivated, and it only performed 91 calls to the Oracle over the 21 matching tasks (i.e. fewer than 5 questions per ontology pair). Note that, although LogMap-Int ranked 1st in the interactive matching track, it could not outperform the best tool in the conference track, which obtained an F-measure of 0.74 (wrt the RA1 reference alignment). Nevertheless, there is still room for improvement and we aim at implementing more sophisticated matching and interactive techniques.

Table 5: Results for Interactive track.
                 RA1 reference
System        P     R     F     Calls   Time (s)
LogMap      0.80  0.59  0.68      0        24
LogMap-Int  0.90  0.64  0.73     91        27

Table 6: Summary results for the Large BioMed track.
System      Total Time (s)    P      R      F     Inc. Degree
LogMap-BK        2,391      0.904  0.700  0.785     0.013%
LogMap           2,485      0.910  0.689  0.780     0.015%
LogMapLt           371      0.874  0.517  0.598     34.1%

2.7 Large BioMed track

This track consists of finding alignments between the Foundational Model of Anatomy (FMA), SNOMED CT, and the National Cancer Institute Thesaurus (NCI). These ontologies are semantically rich and contain tens of thousands of classes. The UMLS Metathesaurus [5] has been selected as the basis for the track reference alignments. In this track LogMap has been evaluated with two variants: LogMap and LogMap-BK. LogMap-BK uses normalisations and spelling variants from the general (biomedical) purpose UMLS Lexicon (http://www.nlm.nih.gov/pubs/factsheets/umlslex.html), while LogMap has this feature deactivated. Table 6 summarises the results obtained by LogMap and LogMapLt. The table shows the total time in seconds to complete all tasks in the track and averages for Precision, Recall, F-measure and Incoherence degree. The track was run on a server with 16 CPUs, allocating 15GB RAM. Regarding mapping coherence, only two tools (including LogMap and its variant LogMap-BK) generated almost coherent alignments. LogMap-BK ranked 3rd among the systems not using specialised background knowledge and 1st among the systems computing almost coherent alignments. LogMapLt was the fastest to complete all tasks.

Table 7: Results for Instance matching track.
                   RDFT
System        P      R      F
LogMap      0.922  0.746  0.812

2.8 Instance matching

This year only LogMap participated in the Instance Matching track. The dataset was based on the DBpedia ontology (http://dbpedia.org/) and included controlled transformations in the data (i.e. value and structure transformations). Table 7 summarises the average results obtained by LogMap. The results are quite promising considering that LogMap does not implement sophisticated instance matching techniques. Furthermore, LogMap outperformed one of the participating tools specialised in instance matching.

Adaptations to the original dataset. The original provided dataset was preprocessed in order to be properly interpreted by the OWL API and to avoid inconsistencies when reasoning. Next we summarise the performed changes:

– Added import of DBpedia: The dataset (ABOX) is based on DBpedia; however, the DBpedia ontology was not included as TBOX. Hence the OWL API was interpreting the instance entities of the dataset as “annotations” and not as “OWL named individuals”. Furthermore, by adding the DBpedia TBOX to the datasets, an OWL 2 reasoner could be used to infer the corresponding class type for each instance.
– Minor changes to DBpedia: The integration of the provided dataset (ABOX) and DBpedia (TBOX) resulted in an inconsistent knowledge base. The inconsistencies were due to some data property assertion axioms pointing to the incorrect datatype, and a functional datatype property which was used in two or more data property assertion axioms with the same subject. To avoid these inconsistencies DBpedia was slightly modified by removing the range and the functionality of the corresponding data properties (see the sketch below).
– Added additional object properties: The dataset also references the object properties “curriculum”, “places” and “label”, which are not included in the DBpedia ontology. Hence, these properties have been explicitly declared as OWL object properties.
– Removal of invalid characters: the dataset also included some characters that could not be processed by the OWL API and Protégé (e.g. \u).
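The paper does not show how these TBOX edits were scripted. As a minimal sketch of the “minor changes to DBpedia” step (using Python's rdflib; the property IRI and file names below are placeholders, not the actual ones from the dataset):

    from rdflib import Graph, URIRef
    from rdflib.namespace import OWL, RDF, RDFS

    # Hypothetical offending data property; the real ones are those that
    # caused the datatype and functionality clashes described above.
    prop = URIRef("http://dbpedia.org/ontology/someDataProperty")

    tbox = Graph()
    tbox.parse("dbpedia.owl", format="xml")   # assumed local copy of the TBOX

    # Drop the rdfs:range of the property (removes the datatype constraint)...
    tbox.remove((prop, RDFS.range, None))
    # ...and drop its typing as owl:FunctionalProperty (removes functionality).
    tbox.remove((prop, RDF.type, OWL.FunctionalProperty))

    tbox.serialize("dbpedia-fixed.owl", format="xml")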

3 General comments and conclusions

3.1 Comments on the results

LogMap, apart from the Benchmark and Multifarm tracks, for which it does not implement specific techniques, has been one of the top systems in the OAEI 2013. Furthermore, it has also been one of the few systems implementing repair techniques and providing (almost) coherent mappings in all tracks. LogMap's main weakness lies in the fact that the computation of candidate mappings is based on the similarities between the vocabularies of the input ontologies; hence, there is a direct negative impact in the cases where the ontologies are lexically disparate or do not provide enough lexical information (e.g. Benchmark and Multifarm).

3.2 Discussions on the way to improve the proposed system

LogMap is now a stable and mature system that has been made available to the community. There are, however, many exciting possibilities for future work. For example, we aim at exploiting background knowledge to be competitive in the Multifarm track and to improve the performance in the other tracks.

3.3 Comments on the OAEI test cases

The number and quality of the OAEI tracks is growing year by year. However, there is always room for improvement.

Comments on the OAEI instance matching track. I consider the 2012 IIMB Instance Matching track more challenging, from the logical point of view, than the current task. The IIMB dataset included a TBOX, and the controlled transformations also involved changes to the instance class types. Thus the application of logic-based techniques had an important impact, since lexically similar instances belonging to two disjoint class types should not be matched.

Comments on the OAEI interactive matching track. The new interactive track has been a very important step forward in the OAEI; however, larger and more challenging tasks should be included, for example matching tasks (e.g. anatomy and largebio) where the number of questions to the expert user or Oracle may be critical. Furthermore, it is quite unlikely that the expert user will be perfect; thus, the interactive matching track should also consider the evaluation of several Oracles with different error rates, such as the evaluation performed in [12].

Comments on the OAEI largebio track. One of the objectives of the largebio track is the creation of a “silver standard” reference alignment by harmonising the output of the different participating systems. In the next OAEI campaign it would be very interesting to actively use this “silver standard” in the construction of the track's reference alignment.

3.4 Comments on the OAEI 2013 measures

Although mapping coherence is a measure already used in the OAEI, we consider that it is not yet given the required weight in the evaluation. Thus, developers focus on creating matching systems that maximize the F-measure but disregard the impact of the generated output in terms of logical errors. As a result, even highly precise mappings lead to a large number of unsatisfiable classes. Thus, we encourage ontology matching system developers to develop their own repair techniques or to use state-of-the-art techniques such as Alcomo [15] and LogMap-Repair (see Section 1.3), which have been shown to work well in practice [14].

Acknowledgements

This work was supported by the Seventh Framework Program (FP7) of the European Commission under Grant Agreement 318338, “Optique”, the Royal Society, and the EPSRC projects Score!, ExODA and MaSI3.

References

1. Agrawal, R., Borgida, A., Jagadish, H.V.: Efficient management of transitive relationships in large data and knowledge bases. In: ACM SIGMOD Conf. on Management of Data. pp. 253–262 (1989)
2. Armas Romero, A., Cuenca Grau, B., Horrocks, I.: MORe: Modular Combination of OWL Reasoners for Ontology Classification. In: Int'l Sem. Web Conf. (ISWC). pp. 1–16 (2012)
3. Armas Romero, A., Cuenca Grau, B., Horrocks, I., Jiménez-Ruiz, E.: MORe: a Modular OWL Reasoner for Ontology Classification. In: OWL Reasoning Evaluation (ORE) (2013)
4. Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval. ACM Press / Addison-Wesley (1999)
5. Bodenreider, O.: The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Research 32, 267–270 (2004)
6. Christophides, V., Plexousakis, D., Scholl, M., Tourtounis, S.: On labeling schemes for the Semantic Web. In: Int'l World Wide Web (WWW) Conf. pp. 544–555 (2003)
7. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Modular reuse of ontologies: Theory and practice. J. Artif. Intell. Res. 31, 273–318 (2008)
8. Dowling, W.F., Gallier, J.H.: Linear-time algorithms for testing the satisfiability of propositional Horn formulae. J. Log. Prog. 1(3), 267–284 (1984)
9. Euzenat, J., Rosoiu, M.E., dos Santos, C.T.: Ontology matching benchmarks: Generation, stability, and discriminability. J. Web Sem. 21, 30–48 (2013)
10. Gallo, G., Urbani, G.: Algorithms for testing the satisfiability of propositional formulae. J. Log. Prog. 7(1), 45–61 (1989)
11. Jiménez-Ruiz, E., Cuenca Grau, B.: LogMap: Logic-based and Scalable Ontology Matching. In: Int'l Sem. Web Conf. (ISWC). pp. 273–288 (2011)
12. Jiménez-Ruiz, E., Cuenca Grau, B., Zhou, Y., Horrocks, I.: Large-scale interactive ontology matching: Algorithms and implementation. In: European Conf. on Artif. Intell. (ECAI). pp. 444–449 (2012)
13. Jiménez-Ruiz, E., Grau, B.C., Horrocks, I.: LogMap and LogMapLt results for OAEI 2012. In: Proceedings of the 7th International Workshop on Ontology Matching (2012)
14. Jiménez-Ruiz, E., Meilicke, C., Cuenca Grau, B., Horrocks, I.: Evaluating mapping repair systems with large biomedical ontologies. In: 26th Description Logics Workshop (2013)
15. Meilicke, C.: Alignment Incoherence in Ontology Matching. Ph.D. thesis, University of Mannheim (2011)
16. Meilicke, C., Castro, R.G., Freitas, F., van Hage, W.R., Montiel-Ponsoda, E., de Azevedo, R.R., Stuckenschmidt, H., Šváb-Zamazal, O., Svátek, V., Tamilin, A., Trojahn, C., Wang, S.: MultiFarm: a benchmark for multilingual ontology matching. J. Web Sem. (2012)
17. Nebot, V., Berlanga, R.: Efficient retrieval of ontology fragments using an interval labeling scheme. Inf. Sci. 179(24), 4151–4173 (2009)
18. Šváb, O., Svátek, V., Berka, P., Rak, D., Tomášek, P.: OntoFarm: towards an experimental collection of parallel ontologies. In: Int'l Sem. Web Conf. (ISWC). Poster Session (2005)
19. Zhang, S., Mork, P., Bodenreider, O.: Lessons learned from aligning two representations of anatomy. In: Conf. on Principles of Knowledge Representation and Reasoning (KR) (2004)

Appendix E

LogMap: OM 2014 paper

This appendix reports the paper:

− Ernesto Jiménez-Ruiz, Bernardo Cuenca Grau, Weiguo Xia, Alessandro Solimando, Xi Chen, Valerie Cross, Yuan Gong, Shuo Zhang, Anurekha Chennai-Thiagarajan. LogMap family results for OAEI 2014. OM 2014.

LogMap family results for OAEI 2014
(This research was financed by the Optique project with the grant agreement FP7-318338.)

E. Jiménez-Ruiz1, B. Cuenca Grau1, W. Xia2, A. Solimando3, X. Chen2, V. Cross2, Y. Gong1, S. Zhang1, and A. Chennai-Thiagarajan2

1 Department of Computer Science, University of Oxford, Oxford, UK
2 Computer Science and Software Engineering, Miami University, Oxford, OH, United States
3 Dipartimento di Informatica, Università di Genova, Italy

Abstract. We present the results obtained in the OAEI 2014 campaign by our ontology matching system LogMap and its variants: LogMap-C, LogMap-Bio and LogMapLt. The LogMap project started in January 2011 with the objective of developing a scalable and logic-based ontology matching system. This is our fifth participation in the OAEI and the experience has so far been very positive.

1 Presentation of the system

Ontology matching systems typically rely on lexical and structural heuristics, and the integration of the input ontologies and the mappings may lead to many undesired logical consequences. In [13] three principles were proposed to minimize the number of potentially unintended consequences, namely: (i) the consistency principle, the mappings should not lead to unsatisfiable classes in the integrated ontology; (ii) the locality principle, the mappings should link entities that have similar neighbourhoods; (iii) the conservativity principle, the mappings should not introduce alterations in the classification of the input ontologies. Violations of these principles may hinder the usefulness of ontology mappings. The practical effect of these violations, however, is clearly evident when ontology alignments are involved in complex tasks such as query answering [17]. LogMap [12, 14] is a highly scalable ontology matching system that implements the consistency and locality principles. LogMap also supports (real-time) user interaction during the matching process, which is essential for use cases requiring very accurate mappings. LogMap is one of the few ontology matching systems that (i) can efficiently match semantically rich ontologies containing tens (and even hundreds) of thousands of classes, (ii) incorporate sophisticated reasoning and repair techniques to minimise the number of logical inconsistencies, and (iii) provide support for user intervention during the matching process. LogMap relies on the following elements, which are key to its favourable scalability behaviour (see [12, 14] for details).

Lexical indexation. An inverted index is used to store the lexical information contained in the input ontologies. This index is the key to efficiently computing an initial set of mappings of manageable size. Similar indexes have been successfully used in information retrieval and search engine technologies [2].

Logic-based module extraction. The practical feasibility of unsatisfiability detection and repair critically depends on the size of the input ontologies. To reduce the size of the problem, we exploit ontology modularisation techniques. Ontology modules with well-understood semantic properties can be efficiently computed and are typically much smaller than the input ontology (e.g. [6]).

Propositional Horn reasoning. The relevant modules in the input ontologies together with (a subset of) the candidate mappings are encoded in LogMap using a Horn propositional representation. Furthermore, LogMap implements the classic Dowling-Gallier algorithm for propositional Horn satisfiability [7]. Such an encoding, although incomplete, allows LogMap to detect unsatisfiable classes soundly and efficiently.

Axiom tracking and greedy repair. LogMap extends Dowling-Gallier's algorithm to track all mappings that may be involved in the unsatisfiability of a class. This extension is key to implementing a highly scalable repair algorithm.

Semantic indexation. The Horn propositional representation of the ontology modules and the mappings are efficiently indexed using an interval labelling schema [1] — an optimised data structure for storing directed acyclic graphs (DAGs) that significantly reduces the cost of answering taxonomic queries [5, 19]. In particular, this semantic index allows us to answer many entailment queries over the input ontologies and the mappings computed thus far as an index lookup operation, and hence without the need for reasoning. The semantic index complements the use of the propositional encoding to detect and repair unsatisfiable classes.

1.1 Adaptations made for the 2014 evaluation

In the OAEI 2014 campaign we have participated with 3 additional variants:

LogMapLt is a “lightweight” variant of LogMap, which essentially only applies (efficient) string matching techniques.

LogMap-C is a variant of LogMap which, in addition to the consistency and locality principles, also implements the conservativity principle (see details in [21, 20]). The repair algorithm is more aggressive than in LogMap, thus we expect highly precise mappings but with a significant decrease in recall.

LogMap-Bio includes an extension to use BioPortal [10, 11] as a (dynamic) provider of mediating ontologies instead of relying on a few preselected ontologies [4]. In the OAEI 2014, LogMap-Bio uses the top-5 mediating ontologies given by the algorithm presented in [4]. Note that LogMap-Bio only participates in the biomedical tracks; in the other tracks the results are expected to be the same as for LogMap.

LogMap’s algorithm described in [12, 14] has also been adapted with the following new functionalities:

(i) Multilingual support. We have implemented a multilingual module based on Google Translate (currently we use the unofficial API available at https://code.google.com/p/google-api-translate-java/) to participate in the Multifarm track. Additionally, in order to split Chinese words, we rely on the ICTCLAS library (https://code.google.com/p/ictclas4j/) developed by the Institute of Computing Technology of the Chinese Academy of Sciences.

(ii) Extended repair algorithm. We have extended the Horn propositional projection of the input ontologies to involve data and object properties in the repair process [24]. LogMap's repair module is now more complete and it is also able to repair (object and data) property mappings. (The OAEI 2014 coherence results do not exhibit these improvements, since only the conference track ontologies involve mappings among properties and LogMap 2013 was already coherent there. The extension does have an impact, however, when repairing other mapping sets, as shown in [24].)

(iii) Extended interactive support. The interactive algorithm described in [14] has been slightly extended to include object and data properties in the process. Note that this extension was already included in the OAEI 2013 campaign.

1.2 Link to the system and parameters file

LogMap is open-source and released under the GNU Lesser General Public License 3.0 (http://www.gnu.org/licenses/). The latest components and source code are available from LogMap's Google code page: http://code.google.com/p/logmap-matcher/. LogMap distributions can be easily customized through a configuration file containing the matching parameters. LogMap, including support for interactive ontology matching, can also be used directly through an AJAX-based Web interface: http://csu6325.cs.ox.ac.uk/. This interface has been very well received by the community, with more than 1,500 requests processed so far coming from a broad range of users.

1.3 Modular support for mapping repair

Only very few systems participating in the OAEI competition implement repair techniques. As a result, existing matching systems (even those that typically achieve very high precision scores) compute mappings that in many cases lead to a large number of unsatisfiable classes. We believe that these systems could significantly improve their output if they were to implement repair techniques similar to those available in LogMap. Therefore, with the goal of providing a useful service to the community, we have made LogMap's ontology repair module (LogMap-Repair) available as a self-contained software component that can be seamlessly integrated in most existing ontology matching systems [15, 9].

2 Results

In this section, we present a summary of the results obtained by the LogMap family in the OAEI 2014 campaign. Please refer to http://oaei.ontologymatching.org/2014/results/index.html for complete results.

Table 1: Results for Benchmark track.
               biblio               cose                dog
System        P     F     R        P     F     R       P     F     R
LogMap      0.40  0.40  0.40     0.38  0.41  0.45    0.96  0.15  0.08
LogMap-C    0.42  0.41  0.40     0.39  0.41  0.43    0.98  0.15  0.08
LogMapLt    0.43  0.46  0.50     0.37  0.43  0.50    0.86  0.71  0.61

Table 2: Results for Anatomy track.
System          P      F      R     Time (s)
LogMap-Bio    0.888  0.897  0.906     535
LogMap        0.918  0.881  0.846      12
LogMap-C      0.975  0.802  0.682      22
LogMapLt      0.962  0.829  0.728       5

2.1 Benchmark track

Ontologies in this track have been synthetically generated. The goal of this track is to evaluate the matching systems in scenarios where the input ontologies lack important information (e.g., classes contain no meaningful URIs or labels) [8]. Table 1 summarises the average results obtained by LogMap and its variants. Note that the computation of candidate mappings in LogMap (and its variants) heavily relies on the similarities between the vocabularies of the input ontologies; hence, there is a direct negative impact in the cases where the labels are replaced by random strings. Surprisingly, LogMapLt obtained the best results in the dog test case.

2.2 Anatomy track

This track involves the matching of the Adult Mouse Anatomy ontology (2,744 classes) and a fragment of the NCI ontology describing human anatomy (3,304 classes). The reference alignment has been manually curated [25], and it contains a significant number of non-trivial mappings. Table 2 summarises the results obtained by the LogMap family. LogMap-Bio ranked 2nd in the track. The use of BioPortal as mediating ontology provider yielded a significant improvement in recall. LogMap-Bio's runtime is close to 10 minutes, since the discovery of the mediating ontologies is performed on-the-fly [4]. Regarding mapping coherence, only two tools (apart from LogMap, LogMap-C and LogMap-Bio) generated coherent alignments. The evaluation was run on a server with 3.46 GHz (6 cores) and 8GB RAM.

2.3 Conference track

The Conference track uses a collection of 16 ontologies from the domain of academic conferences [23]. These ontologies have been created manually by different people and are of very small size (between 14 and 140 entities). The track uses two reference alignments, RA1 and RA2. RA1 contains manually curated mappings between 21 ontology pairs, while RA2 also contains composed mappings based on the alignments in RA1. Table 3 summarises the average results obtained by the LogMap family over all 21 alignments. Tests were run on a laptop with an Intel Core i5 2.67GHz and 8GB RAM. LogMap ranked 2nd and LogMap-C ranked 3rd. They both produced coherent alignments.

Table 3: Results for Conference track.
               RA1 reference        RA2 reference
System        P     F     R        P     F     R
LogMap      0.80  0.68  0.59     0.76  0.63  0.54
LogMap-C    0.82  0.67  0.57     0.78  0.62  0.52
LogMapLt    0.73  0.59  0.50     0.68  0.54  0.45

Table 4: Results for Multifarm track.
             Different ontologies    Same ontologies
System        P     F     R           P     F     R
LogMap      0.80  0.40  0.28        0.94  0.41  0.27

2.4 Multifarm track

This track is based on the translation of the OntoFarm collection of ontologies into 9 different languages [18]. In the OAEI 2014, only LogMap, AML and XMap implemented specific multilingual techniques. Table 4 summarises the results. LogMap achieved very competitive results in terms of precision. Regarding recall, however, there is still room for improvement. In the near future we plan to extend the multilingual module with more sophisticated translation techniques.

2.5 Library track

The library track involves the matching of the STW thesaurus (6,575 classes) and the TheSoz thesaurus (8,376 classes). Both of these thesauri provide vocabulary for the economic and social sciences. Table 5 summarises the results obtained by the LogMap family. The track was run on a computer with a 2.4GHz processor (2 cores) and 7GB RAM. LogMap ranked 2nd in this track. The results for LogMap* are obtained with a version of the input OWL ontologies using SKOS labels (i.e. skos:altLabel and skos:prefLabel).

Table 5: Results for Library track.
System        P      F      R     Time (s)
LogMap*     0.743  0.711  0.681     223
LogMap      0.775  0.705  0.648      74
LogMapLt    0.644  0.703  0.771       9
LogMap-C    0.484  0.342  0.264      22

2.6 Interactive matching track

The interactive track is based on the conference track and it uses the RA1 reference alignment as Oracle. Table 6 summarizes the results obtained by LogMap with the interactive mode activated. LogMap with interactivity improved both the average Precision and Recall wrt LogMap with the interactive mode deactivated (see Section 2.3). LogMap performed, on average, 3.91 calls to the Oracle along the 21 matching tasks. LogMap ranked 2nd in the interactive matching track, but it was the system performing the fewest calls to the Oracle.

Table 6: Results for Interactive track.
                RA1 reference
System        P     F     R    Avg. Calls   Time (s)
LogMap      0.88  0.73  0.64       4           27

Table 7: Summary results for the Large BioMed track.
System        Total Time (s)    P      F      R     Inc. Degree
LogMap             1,751      0.890  0.792  0.719     0.013%
LogMap-Bio         8,634      0.843  0.784  0.744      0.8%
LogMap-C           6,331      0.907  0.688  0.559     0.013%
LogMapLt             317      0.868  0.613  0.532     34.0%

2.7 Large BioMed track

This track consists of finding alignments between the Foundational Model of Anatomy (FMA), SNOMED CT, and the National Cancer Institute Thesaurus (NCI). These ontologies are semantically rich and contain tens of thousands of classes. The UMLS Metathesaurus [3] has been selected as the basis for the track reference alignments. Table 7 summarises the results obtained by the LogMap family. The table shows the total time in seconds to complete all tasks in the track and averages for Precision, Recall, F-measure and Incoherence degree. The track was run on an Ubuntu laptop with an Intel Core i7-4600U CPU @ 2.10GHz x 4, allocating 15GB of RAM. Only AML and the LogMap variants (excluding LogMapLt) generated almost coherent alignments. LogMap ranked 2nd in the track, while LogMap-C and LogMap-Bio obtained the best average Precision and the second best average Recall, respectively. LogMapLt was the fastest to complete all tasks.

Table 8: Results for OA4QA track.
                        RA1 reference        RAR1 reference
System      Queries     P      F      R       P      F      R
LogMap       18/18    0.750  0.741  0.750   0.729  0.728  0.750
LogMap-C     18/18    0.722  0.704  0.694   0.722  0.703  0.694
LogMapLt     11/18    0.409  0.379  0.423   0.351  0.348  0.402

Table 9: Results for Instance matching track.
                  Identity
System        P      F      R
LogMap      0.603  0.099  0.054
LogMap-C    0.642  0.078  0.042

2.8 OA4QA track

The Ontology Alignment for Query Answering (OA4QA) track [22] does not follow the classical ontology alignment evaluation with respect to a set of reference alignments. Precision and recall are calculated with respect to the ability of the generated alignments to answer a set of queries in an ontology-based data access scenario where several ontologies exist. Given a query and an ontology pair, a model (or reference) answer set is computed using the corresponding reference alignment for the ontology pair. Precision and recall are then calculated with respect to these model answer sets. In the OAEI 2014 the ontologies and reference alignment (RA1) are based on the conference track. RAR1 is a repaired version of RA1, different from RA2 in the conference track. Table 8 summarises the (average) results for the LogMap family. LogMap and LogMap-C ranked 1st and 2nd in the track, although the number of queries is still not large enough to provide representative values for Precision and Recall. However, the most interesting result is the number of queries a system is able to answer when the computed alignment is incoherent. For example, LogMapLt, since it does not implement mapping repair techniques, is only able to answer 11 of the queries, which damages the obtained precision and recall.
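As a minimal sketch of this evaluation scheme (the encoding of answers as sets is ours, purely for illustration), the per-query precision and recall against the model answer set can be computed as follows:

    def query_precision_recall(returned, reference):
        # returned: answers produced using the system's alignment;
        # reference: the model answer set computed with the reference alignment.
        returned, reference = set(returned), set(reference)
        true_positives = len(returned & reference)
        precision = true_positives / len(returned) if returned else 0.0
        recall = true_positives / len(reference) if reference else 0.0
        return precision, recall

    # Example: two correct answers, one spurious, one missed.
    print(query_precision_recall({"a1", "a2", "a3"}, {"a1", "a2", "a4"}))
    # -> (0.666..., 0.666...)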

2.9 Instance matching track

The results of LogMap (and LogMap-C) were not as good as in previous years. Note that LogMap does not implement specialised instance matching techniques. Nevertheless, LogMap outperformed two of the participating tools specialised in instance matching. Table 9 summarises the results obtained by LogMap and LogMap-C.

3 General comments and conclusions

3.1 Comments on the results

LogMap, apart from the Benchmark and Instance Matching tracks, for which it does not implement specific techniques, has been one of the top systems in the OAEI 2014. Furthermore, it has also been one of the few systems implementing repair techniques and providing (almost) coherent mappings in all tracks. LogMap's main weakness lies in the fact that the computation of candidate mappings is based on the similarities between the vocabularies of the input ontologies; hence, there is a direct negative impact in the cases where the ontologies are lexically disparate or do not provide enough lexical information (e.g. Benchmark and Instance Matching).

3.2 Discussions on the way to improve the proposed system

LogMap is now a stable and mature system that has been made available to the community. There are, however, many exciting possibilities for future work. For example, we aim at improving the multilingual features and the current use of external resources like BioPortal. Furthermore, we are applying LogMap in practice in the domain of the oil and gas industry within the FP7 project Optique8 [16], which presents a very challenging scenario.

3.3 Comments on the OAEI test cases

The number and quality of the OAEI tracks is growing year by year. However, there is always room for improvement:

Comments on the OA4QA track. The new OA4QA track has successfully shown the negative impact of an incoherent alignment in query answering tasks. However, the number of queries is still too small to provide representative values for the F-measure. More queries and more challenging ontologies will make the track more attractive.

Comments on the OAEI interactive matching track. The interactive track has been a very important step forward in the OAEI; however, larger and more challenging tasks should be included, for example matching tasks (e.g. anatomy and largebio) where the number of questions to the expert user or Oracle may be critical. Furthermore, it is quite unlikely that the expert user will be perfect; thus, the interactive matching track should also consider the evaluation of several Oracles with different error rates, such as the evaluation performed in [14].

Comments on the OAEI largebio track. One of the objectives of the largebio track is the creation of a "silver standard" reference alignment by harmonising the output of the different participating systems. In the next OAEI campaign it would be very interesting to actively use this "silver standard" in the construction of the track's reference alignment. This will help to improve the completeness of the reference alignment.

References

1. Agrawal, R., Borgida, A., Jagadish, H.V.: Efficient management of transitive relationships in large data and knowledge bases. In: ACM SIGMOD Conf. on Management of Data. pp. 253–262 (1989)

8 http://www.optique-project.eu/

2. Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval. ACM Press / Addison-Wesley (1999)
3. Bodenreider, O.: The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Research 32, 267–270 (2004)
4. Chen, X., Xia, W., Jiménez-Ruiz, E., Cross, V.: Extending an ontology alignment system with BioPortal: a preliminary analysis. In: Poster at Int'l Sem. Web Conf. (ISWC) (2014)
5. Christophides, V., Plexousakis, D., Scholl, M., Tourtounis, S.: On labeling schemes for the Semantic Web. In: Int'l World Wide Web (WWW) Conf. pp. 544–555 (2003)
6. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Modular reuse of ontologies: Theory and practice. J. Artif. Intell. Res. 31, 273–318 (2008)
7. Dowling, W.F., Gallier, J.H.: Linear-time algorithms for testing the satisfiability of propositional Horn formulae. J. Log. Prog. 1(3), 267–284 (1984)
8. Euzenat, J., Rosoiu, M.E., dos Santos, C.T.: Ontology matching benchmarks: Generation, stability, and discriminability. J. Web Sem. 21, 30–48 (2013)
9. Faria, D., Jiménez-Ruiz, E., Pesquita, C., Santos, E., Couto, F.M.: Towards annotating potential incoherences in BioPortal mappings. In: 13th Int'l Sem. Web Conf. (ISWC) (2014)
10. Fridman Noy, N., Shah, N.H., Whetzel, P.L., Dai, B., et al.: BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research 37, 170–173 (2009)
11. Ghazvinian, A., Noy, N.F., Jonquet, C., Shah, N.H., Musen, M.A.: What four million mappings can tell you about two hundred ontologies. In: Int'l Sem. Web Conf. (ISWC) (2009)
12. Jiménez-Ruiz, E., Cuenca Grau, B.: LogMap: Logic-based and Scalable Ontology Matching. In: Int'l Sem. Web Conf. (ISWC). pp. 273–288 (2011)
13. Jiménez-Ruiz, E., Cuenca Grau, B., Horrocks, I., Berlanga, R.: Logic-based assessment of the compatibility of UMLS ontology sources. J. Biomed. Sem. 2 (2011)
14. Jiménez-Ruiz, E., Cuenca Grau, B., Zhou, Y., Horrocks, I.: Large-scale interactive ontology matching: Algorithms and implementation. In: Europ. Conf. on Artif. Intell. (ECAI) (2012)
15. Jiménez-Ruiz, E., Meilicke, C., Cuenca Grau, B., Horrocks, I.: Evaluating mapping repair systems with large biomedical ontologies. In: 26th Description Logics Workshop (2013)
16. Kharlamov, E., Jiménez-Ruiz, E., Zheleznyakov, D., et al.: Optique: Towards OBDA Systems for Industry. In: Eur. Sem. Web Conf. (ESWC) Satellite Events. pp. 125–140 (2013)
17. Meilicke, C.: Alignment Incoherence in Ontology Matching. Ph.D. thesis, University of Mannheim (2011)
18. Meilicke, C., Castro, R.G., Freitas, F., van Hage, W.R., Montiel-Ponsoda, E., de Azevedo, R.R., Stuckenschmidt, H., Šváb-Zamazal, O., Svátek, V., Tamilin, A., Trojahn, C., Wang, S.: MultiFarm: a benchmark for multilingual ontology matching. J. Web Sem. (2012)
19. Nebot, V., Berlanga, R.: Efficient retrieval of ontology fragments using an interval labeling scheme. Inf. Sci. 179(24), 4151–4173 (2009)
20. Solimando, A., Jiménez-Ruiz, E., Guerrini, G.: Detecting and correcting conservativity principle violations in ontology-to-ontology mappings. In: Int'l Sem. Web Conf. (ISWC) (2014)
21. Solimando, A., Jiménez-Ruiz, E., Guerrini, G.: A multi-strategy approach for detecting and correcting conservativity principle violations in ontology alignments. In: Proc. of the 11th International Workshop on OWL: Experiences and Directions (OWLED). pp. 13–24 (2014)
22. Solimando, A., Jiménez-Ruiz, E., Pinkel, C.: Evaluating Ontology Alignment Systems in Query Answering Tasks. In: Poster at Int'l Sem. Web Conf. (ISWC) (2014)
23. Šváb, O., Svátek, V., Berka, P., Rak, D., Tomášek, P.: OntoFarm: towards an experimental collection of parallel ontologies. In: Int'l Sem. Web Conf. (ISWC). Poster Session (2005)
24. Zhang, S., Jiménez-Ruiz, E., Cuenca Grau, B.: Inconsistency Repair in Ontology Matching. MSc thesis, University of Oxford (2014), http://www.cs.ox.ac.uk/isg/projects/LogMap/papers/Master_thesis_Shuo_Zhang.pdf
25. Zhang, S., Mork, P., Bodenreider, O.: Lessons learned from aligning two representations of anatomy. In: Conf. on Principles of Knowledge Representation and Reasoning (KR) (2004)

Appendix F

ISWC 2014: Conservativity in Ontology Alignments

This appendix reports the paper:

− Alessandro Solimando, Ernesto Jiménez-Ruiz, Giovanna Guerrini. Detecting and Correcting Conservativity Principle Violations in Ontology-to-Ontology Mappings. In Proceedings of the International Semantic Web Conference 2014

Detecting and Correcting Conservativity Principle Violations in Ontology-to-Ontology Mappings

Alessandro Solimando1, Ernesto Jiménez-Ruiz2, and Giovanna Guerrini1

1 Dipartimento di Informatica, Università di Genova, Italy
2 Department of Computer Science, University of Oxford, UK

Abstract. In order to enable interoperability between ontology-based systems, ontology matching techniques have been proposed. However, when the generated mappings suffer from logical flaws, their usefulness may be diminished. In this paper we present an approximate method to detect and correct violations of the so-called conservativity principle, where novel subsumption entailments between named concepts in one of the input ontologies are considered as unwanted. We show that this is indeed the case in our application domain based on the EU Optique project. Additionally, our extensive evaluation conducted with both the Optique use case and the data sets from the Ontology Alignment Evaluation Initiative (OAEI) suggests that our method is both useful and feasible in practice.

1 Introduction

Ontologies play a key role in the development of the Semantic Web and are being used in many diverse application domains, ranging from biomedicine to the energy industry. An application domain may have been modeled with different points of view and purposes. This situation usually leads to the development of different ontologies that intuitively overlap, but use different naming and modeling conventions. In particular, this is the case we are facing in the EU Optique project.3 Optique aims at facilitating scalable end-user access to big data in the oil and gas industry. The project is focused around two demanding use cases provided by Siemens and Statoil. Optique advocates an Ontology Based Data Access (OBDA) approach [24] so that end-users formulate queries using the vocabulary of a domain ontology instead of composing queries directly against the database. Ontology-based queries (e.g., SPARQL) are then automatically rewritten to SQL and executed over the database.

In Optique two independently developed ontologies co-exist. The first ontology has been directly bootstrapped from one of the relational databases in Optique and it is linked to the database via direct ontology-to-database mappings;4 the second ontology is a domain ontology based on the Norwegian Petroleum Directorate (NPD) FactPages5 [41] and it is currently preferred by Optique end-users to feed the visual query formulation interface6 [42]. This setting requires the "query formulation" ontology to be linked to the relational database. In Optique we follow two approaches that will complement each other: (i) creation of ontology-to-database mappings between the query formulation ontology and the database; (ii) creation of ontology-to-ontology mappings between the bootstrapped ontology and the query formulation ontology. In this paper we only deal with ontology-to-ontology mappings (or mappings for short). The creation, analysis and evolution of ontology-to-database mappings are also key research topics within Optique; however, they fall out of the scope of this paper.

3 http://www.optique-project.eu/
4 http://www.w3.org/TR/rdb-direct-mapping/
5 http://factpages.npd.no/factpages/
6 The query formulation interface has been evaluated with end-users at Statoil.

The problem of (semi-)automatically computing mappings between independently developed ontologies is usually referred to as the ontology matching problem. A number of sophisticated ontology matching systems have been developed in the last years [11, 40]. Ontology matching systems, however, rely on lexical and structural heuristics, and the integration of the input ontologies and the mappings may lead to many undesired logical consequences. In [19] three principles were proposed to minimize the number of potentially unintended consequences, namely: (i) consistency principle, the mappings should not lead to unsatisfiable classes in the integrated ontology; (ii) locality principle, the mappings should link entities that have similar neighbourhoods; (iii) conservativity principle, the mappings should not introduce new semantic relationships between concepts from one of the input ontologies.

The occurrence of these violations is frequent, even in the reference mapping sets of the Ontology Alignment Evaluation Initiative7 (OAEI). Also manually curated alignments, such as the UMLS Metathesaurus [3] (UMLS), a comprehensive effort for integrating biomedical knowledge bases, suffer from these violations.
Violations of these principles may hinder the usefulness of ontology mappings. In particular, in the Optique scenario, violations of the consistency or conservativity principles will directly affect the quality of the query results, since queries will be rewritten according to the ontology axioms, the ontology-to-ontology mappings and the ontology-to-database mappings. These principles have been actively investigated in the last years (e.g., [31, 30, 15, 19, 17, 29, 37]).

In this paper we focus on the conservativity principle and we explore a variant of violation of this principle which we consider appropriate for the application domain in Optique. Furthermore, this variant of the conservativity principle allows us to reduce the problem to a consistency principle problem. We have implemented a method which relies on the projection of the input ontologies to Horn propositional logic. This projection allows us to be efficient in both the reduction to the consistency principle and the subsequent repair process. Our evaluation suggests that our method is feasible even with the largest test cases of the OAEI campaign.

The remainder of the paper is organised as follows. Section 2 summarises the basic concepts and definitions we rely on along the paper. In Section 3 we introduce our motivating scenario based on Optique. Section 4 describes our method. In Section 5 we present the conducted evaluation. A comparison with relevant related work is provided in Section 6. Finally, Section 7 gives some conclusions and future work lines.

2 Preliminaries

In this section, we present the formal representation of ontology mappings and the notions of semantic difference, mapping coherence and conservativity principle violation.

7 http://oaei.ontologymatching.org/

2.1 Representation of Ontology Mappings

Mappings are conceptualised as 5-tuples of the form ⟨id, e1, e2, n, ρ⟩, with id a unique identifier, e1, e2 entities in the vocabulary or signature of the relevant input ontologies (i.e., e1 ∈ Sig(O1) and e2 ∈ Sig(O2)), n a confidence measure between 0 and 1, and ρ a relation between e1 and e2, typically subsumption, equivalence or disjointness [10]. RDF Alignment [8] is the main format used in the OAEI campaign to represent mappings containing the aforementioned elements. Additionally, mappings are also represented as OWL 2 subclass, equivalence, and disjointness axioms [6]; mapping identifiers (id) and confidence values (n) are then represented as axiom annotations. Such a representation enables the reuse of the extensive range of OWL 2 reasoning infrastructure that is currently available. Note that alternative formal semantics for ontology mappings have been proposed in the literature (e.g., [4]).
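For concreteness, a mapping 5-tuple can be kept in a small data structure; the sketch below is illustrative only (the class and field names are ours, not part of the RDF Alignment format or any OWL 2 API).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mapping:
    """An ontology mapping <id, e1, e2, n, rho> as defined in Section 2.1."""
    id: str       # unique identifier
    e1: str       # entity from Sig(O1)
    e2: str       # entity from Sig(O2)
    n: float      # confidence in [0, 1]
    rho: str      # relation: one of "<=", ">=", "=", "disjoint"

# Mapping m2 from Table 2: O1:WellBore equivalent to O2:Borehole, confidence 0.7
m2 = Mapping("m2", "O1:WellBore", "O2:Borehole", 0.7, "=")
```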

2.2 Semantic Consequences of the Integration

The ontology resulting from the integration of two ontologies O1 and O2 via a set of mappings M may entail axioms that do not follow from O1, O2, or M alone. These new semantic consequences can be captured by the notion of deductive difference [25]. Intuitively, the deductive difference between O and O′ w.r.t. a signature Σ (i.e., a set of entities) is the set of entailments constructed over Σ that do not hold in O, but do hold in O′. The notion of deductive difference, however, has several drawbacks in practice. First, there is no algorithm for computing the deductive difference in expressive DLs [25]. Second, the number of entailments in the difference can be infinite.

Definition 1 (Approximation of the Deductive Difference). Let A, B be atomic concepts (including ⊤, ⊥), Σ be a signature, and O, O′ be two OWL 2 ontologies. We define the approximation of the Σ-deductive difference between O and O′ (denoted diff≈_Σ(O, O′)) as the set of axioms of the form A ⊑ B satisfying: (i) A, B ∈ Σ, (ii) O ⊭ A ⊑ B, and (iii) O′ ⊨ A ⊑ B.

In order to avoid the drawbacks of the deductive difference, in this paper we rely on the approximation given in Definition 1. This approximation only requires comparing the classification hierarchies of O and O′ provided by an OWL 2 reasoner, and it has successfully been used in the past in the context of ontology integration [18].
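Since Definition 1 only compares classification hierarchies, it can be computed with plain set operations once each ontology's entailed atomic subsumptions have been extracted from a reasoner. A minimal sketch, assuming the entailments are given as sets of pairs:

```python
def diff_approx(sigma, entailed_o, entailed_o_prime):
    """Approximate Sigma-deductive difference (Definition 1):
    subsumptions A <= B over Sigma entailed by O' but not by O.
    entailed_o / entailed_o_prime: sets of (A, B) pairs with the
    respective ontology entailing A <= B."""
    return {(a, b) for (a, b) in entailed_o_prime
            if a in sigma and b in sigma and (a, b) not in entailed_o}

# Toy hierarchies: O' (the integrated ontology) entails one extra subsumption.
o = {("ExplorationWellBore", "WellBore")}
o_prime = o | {("Explor_borehole", "Exploration_well")}
print(diff_approx({"Explor_borehole", "Exploration_well", "WellBore"}, o, o_prime))
```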

2.3 Mapping Coherence and Mapping Repair

The consistency principle requires that the vocabulary in OU = O1 ∪ O2 ∪ M be satisfiable, assuming the union of the input ontologies O1 ∪ O2 (without the mappings M) does not contain unsatisfiable concepts. Thus diff≈_Σ(O1 ∪ O2, OU) should not contain any axiom of the form A ⊑ ⊥, for any A ∈ Σ = Sig(O1 ∪ O2).

Definition 2 (Mapping Incoherence). A set of mappings M is incoherent with respect to O1 and O2 if there exists a class A, in the signature of O1 ∪ O2, such that O1 ∪ O2 ⊭ A ⊑ ⊥ and O1 ∪ O2 ∪ M ⊨ A ⊑ ⊥.

An incoherent set of mappings M can be fixed by removing mappings from M. This process is referred to as mapping repair (or repair for short).

Definition 3 (Mapping Repair). Let M be an incoherent set of mappings w.r.t. O1 and O2. A set of mappings R ⊆ M is a mapping repair for M w.r.t. O1 and O2 iff M \ R is coherent w.r.t. O1 and O2.

A trivial repair is R = M, since an empty set of mappings is trivially coherent (according to Definition 2). Nevertheless, the objective is to remove as few mappings as possible. Minimal (mapping) repairs are typically referred to in the literature as mapping diagnoses [29], a term coined by Reiter [36] and introduced to the field of ontology debugging in [39]. A repair or diagnosis can be computed by extracting the justifications for the unsatisfiable concepts (e.g., [38, 22, 43]), and selecting a hitting set of mappings to be removed, following a minimality criterion (e.g., the number of removed mappings). However, justification-based technologies do not scale when the number of unsatisfiabilities is large (a typical scenario in mapping repair problems [16]). To address this scalability issue, mapping repair systems usually compute an approximate repair using incomplete reasoning techniques (e.g., [17, 29, 37]). An approximate repair R≈ does not guarantee that M \ R≈ is coherent, but it will (in general) significantly reduce the number of unsatisfiabilities caused by the original set of mappings M.
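The hitting-set view of repair admits a simple greedy approximation: given the conflict sets (justifications restricted to mappings), repeatedly discard the lowest-confidence mapping of each unresolved conflict. The sketch below is our own illustration of this idea, not the actual repair procedure of any of the cited systems:

```python
def greedy_repair(conflicts, confidence):
    """Approximate mapping repair: hit every conflict set by removing
    its lowest-confidence mapping. conflicts: list of sets of mapping ids;
    confidence: dict from mapping id to confidence value."""
    repair = set()
    for conflict in conflicts:
        if not (conflict & repair):                    # conflict not yet resolved
            repair.add(min(conflict, key=lambda m: confidence[m]))
    return repair

# Two hypothetical conflicts among mappings m3, m4 and m7:
conflicts = [{"m3", "m4"}, {"m4", "m7"}]
confidence = {"m3": 0.6, "m4": 0.8, "m7": 0.7}
print(greedy_repair(conflicts, confidence))  # {'m3', 'm7'}
```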

2.4 Conservativity Principle

The conservativity principle (general notion) states that the integrated ontology OU = O1 ∪ O2 ∪ M should not induce any change in the concept hierarchies of the input ontologies O1 and O2. That is, the sets diff≈_Σ1(O1, OU) and diff≈_Σ2(O2, OU) must be empty for the signatures Σ1 = Sig(O1) and Σ2 = Sig(O2), respectively.

In [19] a lighter variant of the conservativity principle was proposed. This variant requires that the mappings M alone should not introduce new subsumption relationships between concepts from one of the input ontologies. That is, the set diff≈_Σ1(O1, O1 ∪ M) (resp. diff≈_Σ2(O2, O2 ∪ M)) must be empty for Σ1 = Sig(O1) (resp. Σ2 = Sig(O2)).

In this paper we propose a different variant of the conservativity principle where we require that the integrated ontology OU does not introduce new subsumption relationships between concepts from one of the input ontologies, unless they were already involved in a subsumption relationship or they shared a common descendant. Note that we assume that the mappings M are coherent with respect to O1 and O2.

Definition 4 (Conservativity Principle Violations). Let A, B, C be atomic concepts (not including ⊤, ⊥), let O be one of the input ontologies, let Sig(O) be its signature, and let OU be the integrated ontology. We define the set of conservativity principle violations of OU w.r.t. O (denoted consViol(O, OU)) as the set of axioms of the form A ⊑ B satisfying: (i) A, B, C ∈ Sig(O), (ii) A ⊑ B ∈ diff≈_Sig(O)(O, OU), (iii) O ⊭ B ⊑ A, and (iv) there is no C s.t. O ⊨ C ⊑ A and O ⊨ C ⊑ B.

This variant of the conservativity principle follows the assumption of disjointness proposed in [38]. That is, if two atomic concepts A, B from one of the input ontologies are not involved in a subsumption relationship nor share a common subconcept (excluding ⊥) they can be considered as disjoint. Hence, the conservativity principle can be reduced to the consistency principle if the input ontologies are extended with sufficient disjointness axioms. This reduction will allow us to reuse the available infrastructure and techniques for mapping repair.
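Definition 4 can be checked directly over classification hierarchies. The sketch below builds on the diff_approx idea above and again assumes precomputed sets of entailed subsumption pairs; the helper names are ours:

```python
def cons_viol(sig, entailed_o, entailed_ou):
    """Conservativity violations (Definition 4): new subsumptions A <= B in the
    integrated ontology where O neither relates A and B by subsumption
    nor gives them a common named descendant."""
    def common_descendant(a, b):
        return any((c, a) in entailed_o and (c, b) in entailed_o for c in sig)
    new_subs = {(a, b) for (a, b) in entailed_ou
                if a in sig and b in sig and (a, b) not in entailed_o}
    return {(a, b) for (a, b) in new_subs
            if (b, a) not in entailed_o and not common_descendant(a, b)}

# A violation analogous to sigma_3 in Table 3 below:
sig = {"Field_operator", "Field_owner", "Owner"}
o2 = {("Field_owner", "Owner")}
ou = o2 | {("Field_operator", "Field_owner")}
print(cons_viol(sig, o2, ou))  # {('Field_operator', 'Field_owner')}
```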

Table 1. Fragments of the ontologies used in Optique.

Ontology O1                                   Ontology O2
α1  WellBore ⊑ ∃belongsTo.Well                β1  Exploration_well ⊑ Well
α2  WellBore ⊑ ∃hasOperator.Operator          β2  Explor_borehole ⊑ Borehole
α3  WellBore ⊑ ∃locatedIn.Field               β3  Appraisal_exp_borehole ⊑ Explor_borehole
α4  AppraisalWellBore ⊑ WellBore              β4  Appraisal_well ⊑ Well
α5  ExplorationWellBore ⊑ WellBore            β5  Field ⊑ ∃hasFieldOperator.Field_operator
α6  Operator ⊑ Owner                          β6  Field_operator ⊓ Owner ⊑ Field_owner
α7  Operator ⊑ Company                        β7  Company ⊑ Field_operator
α8  Field ⊑ ∃hasOperator.Company              β8  Field_owner ⊑ Owner
α9  Field ⊑ ∃hasOwner.Owner                   β9  Borehole ⊑ Continuant ⊔ Occurrent

Table 2. Ontology mappings M for the vocabulary in O1 and O2.

id    e1                        e2                           n     ρ
m1    O1:Well                   O2:Well                      0.9   ≡
m2    O1:WellBore               O2:Borehole                  0.7   ≡
m3    O1:ExplorationWellBore    O2:Exploration_well          0.6   ⊑
m4    O1:ExplorationWellBore    O2:Explor_borehole           0.8   ≡
m5    O1:AppraisalWellBore      O2:Appraisal_exp_borehole    0.7   ≡
m6    O1:Field                  O2:Field                     0.9   ≡
m7    O1:Operator               O2:Field_operator            0.7   ⊒
m8    O1:Company                O2:Company                   0.9   ≡
m9    O1:hasOperator            O2:hasFieldOperator          0.6   ≡
m10   O1:Owner                  O2:Owner                     0.9   ≡

3 Conservativity Principle Violations in Practice

In this section, we show the problems caused by violations of the conservativity principle when integrating ontologies via mappings in a real-world scenario. To this end, we consider as motivating example a use case based on the Optique application domain.

Table 1 shows fragments of two ontologies in the context of the oil and gas industry. The ontology O1 has been directly bootstrapped from a relational database in Optique, and it is linked to the data via direct ontology-to-database mappings. The ontology O2, instead, is a domain ontology, based on the NPD FactPages, preferred by Optique end-users to feed the visual query formulation interface.8

The integration via ontology matching of O1 and O2 is required since the vocabulary in O2 is used to formulate queries, but only the vocabulary of O1 is connected to the database.9 Consider the set of mappings M in Table 2 between O1 and O2, generated by an off-the-shelf ontology alignment system. As described in Section 2.1, mappings are represented as 5-tuples; for example the mapping m2 suggests an equivalence relationship between the entities O1:WellBore and O2:Borehole, with confidence 0.7.

The integrated ontology OU = O1 ∪ O2 ∪ M, however, violates the conservativity principle, according to Definition 4, and introduces undesired subsumption relationships (see Table 3). Note that the entailments σ4 and σ5 are not included in our variant of conservativity violation, since O1:Company and O1:Operator (resp. O2:Field_operator and O2:Company) are involved in a subsumption relationship in O1 (resp. O2).

8 In Optique we use OWL 2 QL ontologies for query rewriting, while the query formulation may be based on much richer OWL 2 ontologies. The axioms that fall outside the OWL 2 QL profile are either approximated or not considered for the rewriting.
9 As mentioned in Section 1, in this paper we only focus on ontology-to-ontology mappings.

Table 3. Example of conservativity principle violations.

σ     Entailment                                         follows from:      Violation?
σ1    O2:Explor_borehole ⊑ O2:Exploration_well           m3, m4             YES
σ2    O1:AppraisalWellBore ⊑ O1:ExplorationWellBore      β3, m4, m5         YES
σ3    O2:Field_operator ⊑ O2:Field_owner                 α6, β6, m7, m10    YES
σ4    O1:Company ≡ O1:Operator                           α7, β7, m7, m8     NO (*)
σ5    O2:Field_operator ≡ O2:Company                     α7, β7, m7, m8     NO (*)
σ6    O1:Company ⊑ O1:Owner                              σ4, α6             YES
σ7    O2:Company ⊑ O2:Field_owner                        σ3, σ5             YES

However, these entailments lead to other violations that are included in our variant (σ6 and σ7), and may also be considered as violations. These conservativity principle violations may hinder the usefulness of the generated ontology mappings, since they may affect the quality of the results when performing OBDA queries over the vocabulary of O2.

Example 1. Consider the conjunctive query CQ(x) ← O2:Well(x). The query asks for wells and has been formulated in Optique's query formulation interface, using the vocabulary of O2. The query is rewritten, according to the ontology axioms and mappings β1, β4, m1, m3, m4 in OU = O1 ∪ O2 ∪ M, into the following union of conjunctive queries: UCQ(x) ← O2:Well(x) ∪ O1:Well(x) ∪ O2:Exploration_well(x) ∪ O2:Appraisal_well(x) ∪ O1:ExplorationWellBore(x) ∪ O2:Explor_borehole(x). Since only the vocabulary of O1 is linked to the data, the union of conjunctive queries can be simplified to UCQ(x) ← O1:Well(x) ∪ O1:ExplorationWellBore(x), which will clearly lead to undesired results. The original query was only asking for wells, while the rewritten query will also return data about exploration wellbores.
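The rewriting step in Example 1 is essentially a closure over the subsumption and mapping edges that reach the queried class. A minimal sketch of this expansion follows (our simplification for illustration; real OBDA rewriting, as in [24], also handles existential axioms and produces SQL):

```python
from collections import deque

def expand_atom(cls, sub_edges):
    """Expand a query atom over class `cls` into the set of classes whose
    instances must be returned: all classes reachable backwards along the
    subsumption/mapping edges (B -> A whenever B <= A or B = A)."""
    result, queue = {cls}, deque([cls])
    while queue:
        a = queue.popleft()
        for b, sup in sub_edges:
            if sup == a and b not in result:
                result.add(b)
                queue.append(b)
    return result

# Subsumptions/mappings relevant to Example 1 (beta1, beta4, m1, m3, m4):
edges = [("O2:Exploration_well", "O2:Well"), ("O2:Appraisal_well", "O2:Well"),
         ("O1:Well", "O2:Well"), ("O1:ExplorationWellBore", "O2:Exploration_well"),
         ("O2:Explor_borehole", "O1:ExplorationWellBore")]
print(expand_atom("O2:Well", edges))   # the six atoms of UCQ in Example 1
```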

We have shown that the quality of the mappings in terms of conservativity principle violations will directly affect the quality of the query results. Therefore, the detection and repair of these violations arise as an important quality assessment step in Optique.

4 Methods

We have reduced the problem of detecting and solving conservativity principle violations, following our notion of conservativity (see Section 2), to a mapping (incoherence) repair problem. Currently, our method relies on the indexing and reasoning techniques implemented in LogMap, an ontology matching and mapping repair system [17, 20, 21].

Algorithm 1 shows the pseudocode of the implemented method. The algorithm accepts as input two OWL 2 ontologies, O1 and O2, and a set of mappings M which are coherent10 with respect to O1 and O2. Additionally, an optimised variant to add disjointness axioms can be selected. The algorithm outputs the number of disjointness axioms added during the process (disj), a set of mappings M′, and an (approximate) repair R≈ such that M′ = M \ R≈. The (approximate) repair R≈ aims at solving most of the conservativity principle violations of M with respect to O1 and O2. We next describe the techniques used in each step.

10 Note that M may be the result of a prior mapping (incoherence) repair process.

Algorithm 1 Algorithm to detect and solve conservativity principle violations

Input: O1, O2: input ontologies; M: (coherent) input mappings; Optimization: Boolean value
Output: M′: output mappings; R≈: approximate repair; disj: number of disjointness rules
1:  ⟨O1′, O2′⟩ := ModuleExtractor(O1, O2, M)
2:  ⟨P1, P2⟩ := PropositionalEncoding(O1′, O2′)
3:  SI1 := StructuralIndex(O1′)
4:  SI2 := StructuralIndex(O2′)
5:  if (Optimization = true) then
6:      SIU := StructuralIndex(O1′ ∪ O2′ ∪ M)
7:      ⟨P1d, disj1⟩ := DisjointAxiomsExtensionOptimized(P1, SI1, SIU)    ▷ See Algorithm 3
8:      ⟨P2d, disj2⟩ := DisjointAxiomsExtensionOptimized(P2, SI2, SIU)
9:  else
10:     ⟨P1d, disj1⟩ := DisjointAxiomsExtensionBasic(P1, SI1)    ▷ See Algorithm 2
11:     ⟨P2d, disj2⟩ := DisjointAxiomsExtensionBasic(P2, SI2)
12: end if
13: ⟨M′, R≈⟩ := MappingRepair(P1d, P2d, M)    ▷ See Algorithm 2 in [21]
14: disj := disj1 + disj2
15: return ⟨M′, R≈, disj⟩

Module Extraction. In order to reduce the size of the problem, our method extracts two locality-based modules [7], one for each input ontology, using the entities involved in the mappings M as seed signatures for the module extractor (step 1 in Algorithm 1). These modules preserve the semantics for the given entities, can be efficiently computed, and are typically much smaller than the original ontologies.
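The module extraction step can be roughly approximated with a signature-reachability pass: starting from the seed signature, keep axioms whose left-hand-side vocabulary is already reachable. This is only a coarse sketch of the idea; actual locality-based extraction [7] uses syntactic ⊥/⊤-locality tests rather than plain reachability:

```python
def reachability_module(axioms, seed):
    """Approximate a module: keep axioms (lhs_signature, rhs_signature)
    whose left-hand-side vocabulary intersects the growing signature."""
    sig, module, changed = set(seed), [], True
    while changed:
        changed = False
        for ax in axioms:
            lhs, rhs = ax
            if ax not in module and lhs & sig:
                module.append(ax)
                sig |= lhs | rhs        # the module's vocabulary grows
                changed = True
    return module

# alpha_4 and alpha_5 from Table 1 as (lhs, rhs) signature pairs:
axioms = [({"AppraisalWellBore"}, {"WellBore"}),
          ({"ExplorationWellBore"}, {"WellBore"})]
print(reachability_module(axioms, {"ExplorationWellBore"}))
```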

Propositional Horn Encoding. The modules O1′ and O2′ are encoded as Horn propositional theories, P1 and P2 (step 2 in Algorithm 1). This encoding includes rules of the form A1 ∧ . . . ∧ An → B. For example, the concept hierarchy provided by an OWL 2 reasoner (e.g., [32, 23]) will be encoded as A → B rules, while the explicit disjointness relationships between concepts will be represented as Ai ∧ Aj → false. Note that the input mappings M can already be seen as propositional implications. This encoding is key to the mapping repair process.
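A Horn rule can be kept as a (body, head) pair, which is also the shape consumed by the satisfiability check described under "Mapping Repair" below. A minimal sketch of the encoding for the axioms of Example 2 below (the data layout is ours):

```python
# A Horn rule is (body, head): a frozenset of propositional variables
# implying one variable; unsatisfiability is modelled by the head "false".

def subclass_rule(sub: str, sup: str):
    return (frozenset([sub]), sup)            # A -> B  for  A <= B

def disjointness_rule(a: str, b: str):
    return (frozenset([a, b]), "false")       # A and B -> false

# beta_6 (Field_operator and Owner <= Field_owner) and the equivalence
# mapping m2, split into two implications, as in Example 2:
rules = [
    (frozenset(["Field_operator", "Owner"]), "Field_owner"),
    subclass_rule("O1:WellBore", "O2:Borehole"),
    subclass_rule("O2:Borehole", "O1:WellBore"),
]
```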

Example 2. Consider the ontologies and mappings in Tables 1 and 2. The axiom β6 is encoded as Field_operator ∧ Owner → Field_owner, while the mapping m2 is translated into the rules O1:WellBore → O2:Borehole and O2:Borehole → O1:WellBore.

Structural Index. The concept hierarchies provided by an OWL 2 reasoner (excluding ⊥) and the explicit disjointness axioms of the modules O1′ and O2′ are efficiently indexed using an interval labelling schema [1] (steps 3 and 4 in Algorithm 1). This structural index exploits an optimised data structure for storing directed acyclic graphs (DAGs), and allows us to answer many entailment queries over the concept hierarchy as an index lookup operation, and hence without the need of an OWL 2 reasoner. This kind of index has been shown to significantly reduce the cost of answering taxonomic queries [5, 33] and disjointness relationship queries [17, 20].

Disjointness Axioms Extension. In order to reduce the conservativity problem to a mapping incoherence repair problem following the notion of assumption of disjointness, we need to automatically add sufficient disjointness axioms into each module Oi′. However, the insertion of additional disjointness axioms δ may lead to unsatisfiable classes in Oi′ ∪ δ.

Algorithm 2 Basic disjointness axioms extension

Input: P: propositional theory; SI: structural index
Output: Pd: extended propositional theory; disj: number of disjointness rules
1: disj := 0
2: Pd := P
3: for each pair ⟨A, B⟩ ∈ OrderedVariablePairs(P) do
4:     if not (areDisj(SI, A, B) or inSubSupRel(SI, A, B) or shareDesc(SI, A, B)) then
5:         Pd := Pd ∪ {A ∧ B → false}
6:         SI := updateIndex(SI, A ⊓ B ⊑ ⊥)
7:         disj := disj + 1
8:     end if
9: end for
10: return ⟨Pd, disj⟩
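The core loop of Algorithm 2 only needs the three lookups against the structural index. The sketch below replaces the interval-labelling index with plain ancestor sets, which answer the sub/super-class and common-descendant queries; the areDisj check and the index update of step 6 are omitted for brevity. It is an illustration of the loop, not LogMap's index:

```python
def basic_disjointness_extension(variables, ancestors):
    """Add A and B -> false for every pair neither in a sub/super-class
    relation nor sharing a common descendant. ancestors: dict mapping a
    variable to its (reflexive) set of ancestors, playing the index role."""
    def in_sub_sup(a, b):
        return b in ancestors[a] or a in ancestors[b]
    def share_desc(a, b):
        return any(a in ancestors[c] and b in ancestors[c] for c in variables)
    disj_rules = set()
    for i, a in enumerate(variables):
        for b in variables[i + 1:]:
            if not in_sub_sup(a, b) and not share_desc(a, b):
                disj_rules.add(frozenset([a, b]))   # encodes A ∧ B → false
    return disj_rules

vars_ = ["Well", "Exploration_well", "Borehole"]
anc = {"Well": {"Well"}, "Exploration_well": {"Exploration_well", "Well"},
       "Borehole": {"Borehole"}}
print(basic_disjointness_extension(vars_, anc))
# {frozenset({'Well', 'Borehole'}), frozenset({'Exploration_well', 'Borehole'})}
```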

Example 3. Consider the axiom β9 from Table 1. Following the assumption of disjointness, a very naïve algorithm would add disjointness axioms between Borehole, Continuant and Occurrent, which would make Borehole unsatisfiable.

In order to detect whether each candidate disjointness axiom leads to an unsatisfiability, a non-naïve algorithm would need to make extensive use of an OWL 2 reasoner. In large ontologies, however, such extensive use of the reasoner may be prohibitive. To address this issue, our method exploits the propositional encoding and structural index of the input ontologies. Thus, checking whether Oi′ ∪ δ contains unsatisfiable classes is restricted to the Horn propositional case.

We have implemented two algorithms to extend the propositional theories P1 and P2 with disjointness rules of the form A ∧ B → ⊥ (see steps 5-12 in Algorithm 1). These algorithms guarantee that, for every propositional variable A in the extended propositional theory Pid (with i ∈ {1, 2}), the theory Pid ∪ {true → A} is satisfiable. Note that this does not necessarily hold if the disjointness axioms are added to the OWL 2 ontology modules O1′ and O2′, as discussed above.

Algorithm 2 presents a (basic) algorithm to add as many disjointness rules as possible, for every pair of propositional variables A, B in the propositional theory P given as input. In order to minimize the number of necessary disjointness rules, the variables in P are ordered in pairs following a top-down approach. The algorithm exploits the structural index SI to check whether two propositional variables (i.e., classes in the input ontologies) are disjoint (areDisj(SI, A, B)), keep a sub/super-class relationship (inSubSupRel(SI, A, B)), or share a common descendant (shareDesc(SI, A, B)) (step 4 in Algorithm 2). Note that the structural index is also updated to take into account the new disjointness rules (step 6 in Algorithm 2).

The addition of disjointness rules in Algorithm 2, however, may be prohibitive for large ontologies (see Section 5). Intuitively, in order to reduce the number of disjointness axioms, one should only focus on the cases where a conservativity principle violation occurs in the integrated ontology OU = O1′ ∪ O2′ ∪ M with respect to one of the ontology modules Oi′ (with i ∈ {1, 2}); i.e., adding a disjointness axiom between each pair of classes A, B ∈ Oi′ such that A ⊑ B ∈ consViol(Oi′, OU), as in Definition 4. Algorithm 3 implements this idea for the Horn propositional case and extensively exploits the structural indexing to identify the conservativity principle violations (step 3 in Algorithm 3). Note that this algorithm also requires computing the structural index of the integrated ontology, and thus its classification with an OWL 2 reasoner (step 6 in Algorithm 1). The classification of the integrated ontology is known to be typically much more costly than the classification of the input ontologies individually [16]. However, this was not a bottleneck in our experiments, as shown in Section 5.

Algorithm 3 Optimised disjointness axioms extension

Input: P: propositional theory; SI: structural index; SIU: structural index of the union ontology
Output: Pd: extended propositional theory; disj: number of disjointness rules
1: disj := 0
2: Pd := P
3: for A → B ∈ ConservativityViolations(SI, SIU) do
4:     if not (areDisj(SI, A, B)) then
5:         Pd := Pd ∪ {A ∧ B → false}
6:         SI := updateIndex(SI, A ⊓ B ⊑ ⊥)
7:         disj := disj + 1
8:     end if
9: end for
10: return ⟨Pd, disj⟩

Mapping Repair. Step 13 of Algorithm 1 uses the mapping (incoherence) repair algorithm presented in [17, 21] for the extended Horn propositional theories P1d and P2d and the input mappings M.
The mapping repair process exploits the Dowling-Gallier algorithm for propositional Horn satisfiability [9] and checks, for every propositional variable A ∈ P1d ∪ P2d, the satisfiability of the propositional theory PA = P1d ∪ P2d ∪ M ∪ {true → A}. Satisfiability of PA is checked in worst-case linear time in the size of PA, and the number of Dowling-Gallier calls is also linear in the number of propositional variables in P1d ∪ P2d. In case of unsatisfiability, the algorithm also allows us to record the conflicting mappings involved in the unsatisfiability, which will be considered in the subsequent repair process. The unsatisfiability will be fixed by removing some of the identified mappings. In case of multiple options, the mapping confidence will be used as a differentiating factor.11
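The Dowling-Gallier procedure [9] propagates truth from asserted facts through Horn rules using per-rule counters, which yields the linear-time satisfiability test used above. A compact sketch over the (body, head) rule representation introduced earlier (our own implementation, not LogMap's):

```python
from collections import deque, defaultdict

def horn_satisfiable(rules, facts):
    """Dowling-Gallier style forward propagation for Horn clauses.
    rules: list of (frozenset(body), head); facts: variables asserted true.
    Returns False iff 'false' becomes derivable."""
    remaining = {i: len(body) for i, (body, _) in enumerate(rules)}
    watch = defaultdict(list)                 # variable -> rules whose body uses it
    for i, (body, _) in enumerate(rules):
        for v in body:
            watch[v].append(i)
    true_vars = set(facts)
    queue = deque(true_vars)
    for i, (body, head) in enumerate(rules):  # rules with empty bodies fire now
        if not body and head not in true_vars:
            true_vars.add(head)
            queue.append(head)
    while queue:
        v = queue.popleft()
        if v == "false":
            return False                      # the theory is unsatisfiable
        for i in watch[v]:
            remaining[i] -= 1
            if remaining[i] == 0:             # whole body satisfied: fire the rule
                head = rules[i][1]
                if head not in true_vars:
                    true_vars.add(head)
                    queue.append(head)
    return True

# A miniature of Example 4 below: m3, m4 plus the disjointness rule psi make
# O1:ExplorationWellBore unsatisfiable.
rules = [(frozenset(["O1:ExplorationWellBore"]), "O2:Exploration_well"),  # m3
         (frozenset(["O1:ExplorationWellBore"]), "O2:Explor_borehole"),   # m4
         (frozenset(["O2:Explor_borehole"]), "O2:Borehole"),              # beta2
         (frozenset(["O2:Exploration_well"]), "O2:Well"),                 # beta1
         (frozenset(["O2:Well", "O2:Borehole"]), "false")]                # psi
print(horn_satisfiable(rules, ["O1:ExplorationWellBore"]))  # False
```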

Example 4. Consider the propositional encodings P1 and P2 of the axioms of Table 1 and the mappings M in Table 2, seen as propositional rules. P1d and P2d have been created by adding disjointness rules to P1 and P2, according to Algorithm 2 or 3. For example, P2d includes the rule ψ = O2:Well ∧ O2:Borehole → false. The mapping repair algorithm identifies the propositional theory P1d ∪ P2d ∪ M ∪ {true → O1:ExplorationWellBore} as unsatisfiable. This is due to the combination of the mappings m3 and m4, the propositional projection of the axioms β1 and β2, and the rule ψ. The mapping repair algorithm also identifies m3 and m4 as the cause of the unsatisfiability, and discards m3, since its confidence is smaller than that of m4 (see Table 2).

Algorithm 1 gives as output the number of disjointness rules added during the process (disj), a set of mappings M′, and an (approximate) repair R≈ such that M′ = M \ R≈. M′ is coherent with respect to P1d and P2d (according to the propositional case of Definition 2). Furthermore, the propositional theory P1 ∪ P2 ∪ M′ does not contain any conservativity principle violation with respect to P1 and P2 (according to the propositional case of Definition 4). However, our encoding is incomplete, and we cannot guarantee that O1′ ∪ O2′ ∪ M′ does not contain conservativity principle violations with respect to O1′ and O2′. Nonetheless, our evaluation suggests that the number of remaining violations after repair is typically small (see Section 5).

11 In scenarios where the confidence of the mapping is missing (e.g., in reference or manually created mapping sets) or unreliable, our mapping repair technique computes fresh confidence values based on the locality principle [19].

Algorithm 4 Conducted evaluation over the Optique and OAEI data sets

Input: O1, O2: input ontologies; M: reference mappings for O1 and O2
1: OU := O1 ∪ O2 ∪ M
2: Store the size of Sig(O1) (I), Sig(O2) (II) and M (III)
3: Compute the number of conservativity principle violations (our variant as in Definition 4):
   consViol := |consViol(O1, OU)| + |consViol(O2, OU)| (IV)
4: Compute the number of conservativity principle violations (general notion as in Section 2.4):
   diff≈ := |diff≈_Sig(O1)(O1, OU)| + |diff≈_Sig(O2)(O2, OU)| (V)
5: Compute two repairs R≈ using Algorithm 1 for O1, O2, M, with Optimization set to false (see Table 5) and true (see Table 6)
6: Store the number of added disjointness rules disj (VI and XII), the size of the repair |R≈| (VII and XIII), the time to compute the disjointness rules td (VIII and XIV), and the time to compute the mapping repair tr (IX and XV)
7: OU := O1 ∪ O2 ∪ (M \ R≈)
8: Compute the number of remaining conservativity principle violations (our variant):
   consViol := |consViol(O1, OU)| + |consViol(O2, OU)| (X and XVI)
9: Compute the number of remaining conservativity principle violations (general notion):
   diff≈ := |diff≈_Sig(O1)(O1, OU)| + |diff≈_Sig(O2)(O2, OU)| (XI and XVII)

5 Evaluation

In this section we evaluate the feasibility of using our method to detect and correct conservativity principle violations in practice. To this end we have conducted the evaluation in Algorithm 4 (the Roman numerals refer to stored measurements) over the Optique use case and the ontologies and reference mapping sets of the OAEI 2013 campaign:12

i Optique: this use case is based on the NPD ontology and a bootstrapped ontology (BootsOnto) from one of the Optique databases. The mappings between these ontologies were semi-automatically created using the ontology matcher LogMap [20]. Although the NPD ontology is small with respect to the size of the bootstrapped ontology, its vocabulary covers a large portion of the current query catalog in Optique.
ii LargeBio: this dataset includes the biomedical ontologies FMA, NCI and (a fragment of) SNOMED, and reference mappings based on the UMLS [3].
iii Anatomy: this dataset involves the Adult Mouse Anatomy (MO) ontology and a fragment of the NCI ontology (NCIAnat), describing human anatomy. The reference alignment has been manually curated [48].
iv Library: this OAEI dataset includes the real-world thesauri STW and TheSoz from the social sciences. The reference mappings have been manually validated.
v Conference: this dataset uses a collection of 16 ontologies from the domain of academic conferences [46]. Currently, there are 21 manually created mapping sets among 7 of the ontologies.

12 Note that the reference mappings of the OAEI 2013 campaign are coherent with respect to the test case ontologies [13]. More information about the used ontology versions can be found at http://oaei.ontologymatching.org/2013/

Table 4. Test cases and violations with the original reference mappings. BootsOnto contains around 3,000 concepts and a large number of properties.

                                 Problem size                          Original Violations
Dataset      O1∼O2               |Sig(O1)|   |Sig(O2)|   |M|           consViol     diff≈
                                 (I)         (II)        (III)         (IV)         (V)
Optique      NPD∼BootsOnto       757         40,671      102           214          220
LargeBio     SNOMED∼NCI          122,519     66,914      36,405        >525,515     >546,181
             FMA∼SNOMED          79,042      122,519     17,212        125,232      127,668
             FMA∼NCI             79,042      66,914      5,821         19,740       19,799
Anatomy      MO∼NCIAnat          2,747       3,306       3,032         1,321        1,335
Library      STW∼TheSoz          6,575       8,376       6,322         42,045       42,872
Conference   cmt∼confof          89          75          32            11           11
             conference∼edas     124         154         34            8            8
             conference∼iasted   124         182         28            9            9
             confof∼ekaw         75          107         40            6            6
             edas∼iasted         154         182         38            7            7

Table 5. Results of our basic method to detect and solve conservativity principle violations.

                                 Solution size            Times                     Remaining Violations
Dataset      O1∼O2               #disj       |R≈|         td(s)        tr(s)        consViol     diff≈
                                 (VI)        (VII)        (VIII)       (IX)         (X)          (XI)
Optique      NPD∼BootsOnto       4,716,685   49           9,840        121          0            0
LargeBio     SNOMED∼NCI          –           –            –            –            –            –
             FMA∼SNOMED          1,106,259   8,234        35,817       1,127        0            121
             FMA∼NCI             347,801     2,176        2,471        38           103          112
Anatomy      MO∼NCIAnat          1,331,374   461          397          56           0            3
Library      STW∼TheSoz          591,115     2,969        4,126        416          0            24
Conference   cmt∼confof          50          6            0.01         0.01         0            0
             conference∼edas     774         6            0.03         0.01         0            0
             conference∼iasted   2,189       4            0.06         0.02         0            0
             confof∼ekaw         296         6            0.02         0.01         0            0
             edas∼iasted         1,210       4            0.06         0.02         1            1

Table 4 shows the size of the evaluated ontologies and mappings (I, II and III). For the Conference dataset we have selected only 5 pairs of ontologies, for which the reference mappings lead to more than five conservativity principle violations. Note that we count equivalence mappings as two subsumption mappings, and hence M represents subsumption mappings. Table 4 also shows the conservativity principle violations for the reference mappings (IV and V). For LargeBio and Library the number is especially large using both our variant and the general notion of the conservativity principle.13

Tables 5 and 6 show the results obtained by our method using both the basic and optimised algorithms to add disjointness axioms.14

13 In the SNOMED-NCI case no OWL 2 reasoner could succeed in classifying the integrated ontology via mappings [16], so we used the OWL 2 EL reasoner ELK [23] to provide a lower bound on the number of conservativity principle violations.
14 The computation times of Steps 1-4 in Algorithm 1 were negligible with respect to the repair and disjointness addition times (tr and td) and thus are not included in the result tables.

Table 6. Results of our optimised method to detect and solve conservativity principle violations.

                                 Solution size            Times                     Remaining Violations
Dataset      O1∼O2               #disj       |R≈|         td(s)        tr(s)        consViol     diff≈
                                 (XII)       (XIII)       (XIV)        (XV)         (XVI)        (XVII)
Optique      NPD∼BootsOnto       214         41           2.54         0.17         0            0
LargeBio     SNOMED∼NCI          525,515     15,957       275          3,755        >411         >1,624
             FMA∼SNOMED          125,232     8,342        30           251          0            131
             FMA∼NCI             19,740      2,175        34           6.18         103          112
Anatomy      MO∼NCIAnat          1,321       491          1.39         0.53         0            3
Library      STW∼TheSoz          42,045      3,058        4.93         41           0            40
Conference   cmt∼confof          11          6            0.05         0.01         0            0
             conference∼edas     8           6            0.07         0.01         0            0
             conference∼iasted   9           1            0.22         0.01         0            0
             confof∼ekaw         6           5            0.04         0.01         0            0
             edas∼iasted         7           4            0.21         0.02         1            1

We have run the experiments on a desktop computer with an AMD Fusion A6-3670K CPU and allocating 12 GB of RAM. The obtained results are summarized as follows:

i The number of added disjointness rules disj (VI), as expected, is very large in the basic algorithm and the required time prohibitive (VIII) when SNOMED is involved (it did not finish for SNOMED-NCI). This is clearly solved by our optimised algorithm, which considerably reduces the number of necessary disjointness rules (XII) and requires only 275 seconds to compute them in the SNOMED-NCI case (XIV).
ii The computed repairs R≈ (VII and XIII) using both the basic and optimised algorithms are of comparable size. This suggests that the large number of disjointness rules added by the basic algorithm does not have a negative impact (in terms of aggressiveness) on the repair process.
iii Repair times tr (IX and XV) are small and do not represent a bottleneck in spite of the large number of added disjointness rules.
iv The conservativity principle violations using both algorithms and considering our variant (X and XVI) are completely removed in the Optique, Anatomy and Library cases, and almost completely removed in the Conference and LargeBio datasets.
v The number of missed violations is only slightly higher when considering the general notion of the conservativity principle (XI and XVII), which suggests that our (approximate) variant is also suitable in practice. Furthermore, in several test cases these violations are also almost completely removed.
vi The computed repairs R≈, using both algorithms (VII and XIII), are rather aggressive and can remove from 16% (Anatomy) up to 48% (Optique) of the mappings. In the Optique use case, however, we follow a "better safe than sorry" approach and prefer to remove as many violations as possible, rather than preserving potentially conflicting mapping sets.

In summary, the results suggest that our method to repair conservativity principle violations is suitable for Optique, and it is feasible in practice, even when considering the largest datasets of the OAEI.

6 Related Work

The conservativity principle problem, although indirectly, has been actively studied in the literature. For example, the assumption of disjointness was originally introduced by Schlobach [38] to enhance the repair of ontologies that were underspecified in terms of disjointness axioms. In [30], a similar assumption is followed in the context of repairing ontology mappings, where the authors restricted the number of disjointness axioms by using learning techniques [45]. These techniques, however, typically require a manually created training set. In [12] the authors present an interactive system to guide the expert user in the manual enrichment of the ontologies with disjointness axioms. In this paper, as in [45, 30, 12], we have also focused on the addition of a small set of disjointness axioms, since adding all possible disjointness axioms may be unfeasible for large ontologies. However, our method does not require manual intervention. Furthermore, to address the scalability problem when dealing with large ontologies and mapping sets, our method relies on the propositional projection of the input ontologies.

Ontology matching systems have also dealt with the conservativity principle in order to improve the precision (with respect to a reference mapping set) of the computed mappings. For example, systems such as ASMOV [15], Lily [47] and YAM++ [34] have implemented different heuristics and patterns to avoid violations of the conservativity principle. Another relevant approach has been presented in [2], where a set of sanity checks and best practices is proposed for computing ontology mappings. In this paper we present an elegant way to detect and solve conservativity principle violations by reducing the problem to a consistency principle violation problem. Concretely, we have reused and adapted the infrastructure provided by LogMap [17, 20]. However, other mapping repair systems, such as Alcomo [29] or AML [37], could be considered. Note that, to the best of our knowledge, these mapping repair systems have only focused on solving violations of the consistency principle.

The work presented in [26, 14, 27] deserves special attention since it proposes an approach opposite to ours. The authors consider violations of the conservativity principle as false positives, based on the potential incompleteness of the input ontologies. Hence, the correction strategy does not aim at removing mappings but at inserting subsumption axioms into the input ontologies to enrich their concept hierarchies. The authors of [35] also suggest that removing mappings may not be the best solution in a mapping repair process, and that fixing the input ontologies may be an alternative.

Currently, in the Optique use case, we consider that the input ontologies are not modifiable. The query formulation ontology is based on the NPD ontology, which includes knowledge already agreed by the community, while the bootstrapped ontology is directly linked to the information represented in the database. Nevertheless, future extensions in Optique may consider appropriate the extension of the input ontologies.

7 Conclusions and Future Work

In this paper we have presented an approximate and fully-automatic method to detect and correct conservativity principle violations in practice. We have characterised the conservativity principle problem, following the assumption of disjointness, as a consistency principle problem. We have also presented an elegant and scalable way to detect and repair violations in the Horn propositional case. Thus, our method is incomplete and may fail to detect and repair all violations. However, the conducted evaluation suggests that our method produces competitive results in practice. In the near future we plan to consider extensions of the current projection to Horn propositional logic while keeping the nice scalability properties of the current method.

The implemented method follows a "better safe than sorry" approach, which we currently consider suitable for the Optique project since we do not want ontology-to-ontology mappings to lead to unexpected results for the OBDA queries, as motivated in Section 3. Hence, we currently delegate complex relationships between ontology entities and the database to the (hand-crafted) schema-to-ontology mappings, which will also play an important role in Optique. Nevertheless, we do not discard exploring in the future alternative methods to detect and repair conservativity principle violations. In particular, we plan to study the potential application of approaches based on graph theory, in order to extend the detection and repair of conservativity principle violations. Strongly connected components of a graph representation of the subsumption relation between named concepts (as defined in [29]), for instance, may be used to capture violations between pairs of concepts already involved in a subsumption relationship.

Additionally, we will also consider exploring the use of learning techniques for the addition of disjointness axioms [45], and involving domain experts in the assessment/addition of such disjointness axioms [18, 12]. This manual assessment may also be used to consider violations as false positives, as proposed in [26, 14, 27], and suggest them as candidate extensions of the input ontologies.

We consider that the proposed method also has potential in scenarios other than Optique. For instance, the authors in [28] apply ontology matching in a multi-agent system scenario in order to allow the exchange and extension of ontology-based action plans among agents. In such a context, violations of the conservativity principle should be taken into account, and highly critical tasks should not be performed if violations are detected. In [44], the authors present an ontology-based data integration (OBDI) system, which integrates ontology mapping and query reformulation techniques. As in Optique, mappings violating the conservativity principle may compromise the quality of the query results in the proposed OBDI system.

Finally, we have short-term plans for deployment at the Optique industry partners Statoil and Siemens. The techniques described in this paper have already been integrated within the "ontology and mapping management module" (see [24] for details about the Optique architecture).

Acknowledgements

This work was supported by the EU FP7 IP project Optique (no. 318338), the MIUR project CINA (Compositionality, Interaction, Negotiation, Autonomicity for the future ICT society) and the EPSRC project Score!. We also thank the invaluable help provided by Bernardo Cuenca and Ian Horrocks. Finally, we are also very grateful for the support of the Optique colleagues who facilitated our understanding of the domain, especially: Dag Hovland, Evgeny Kharlamov, Dmitry Zheleznyakov and Martin G. Skjæveland.

References

1. Agrawal, R., Borgida, A., Jagadish, H.V.: Efficient Management of Transitive Relationships in Large Data and Knowledge Bases. In: ACM SIGMOD Conf. on Manag. of Data (1989)
2. Beisswanger, E., Hahn, U., et al.: Towards valid and reusable reference alignments – ten basic quality checks for ontology alignments and their application to three different reference data sets. J. Biomed. Semant. 3(Suppl 1), S4 (2012)
3. Bodenreider, O.: The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Research 32, 267–270 (2004)
4. Borgida, A., Serafini, L.: Distributed Description Logics: Assimilating Information from Peer Sources. J. Data Sem. 1, 153–184 (2003)
5. Christophides, V., Plexousakis, D., Scholl, M., Tourtounis, S.: On Labeling Schemes for the Semantic Web. In: Int'l World Wide Web Conf. (WWW). pp. 544–555 (2003)
6. Cuenca Grau, B., Horrocks, I., Motik, B., Parsia, B., Patel-Schneider, P.F., Sattler, U.: OWL 2: The next step for OWL. J. Web Sem. 6(4), 309–322 (2008)
7. Cuenca Grau, B., Horrocks, I., Kazakov, Y., Sattler, U.: Modular Reuse of Ontologies: Theory and Practice. J. Artif. Intell. Res. 31, 273–318 (2008)
8. David, J., Euzenat, J., Scharffe, F., Trojahn, C.: The Alignment API 4.0. J. Sem. Web 2(1), 3–10 (2011)
9. Dowling, W.F., Gallier, J.H.: Linear-Time Algorithms for Testing the Satisfiability of Propositional Horn Formulae. J. Log. Prog. 1(3), 267–284 (1984)
10. Euzenat, J.: Semantic Precision and Recall for Ontology Alignment Evaluation. In: Int'l Joint Conf. on Artif. Intell. (IJCAI). pp. 348–353 (2007)
11. Euzenat, J., Meilicke, C., Stuckenschmidt, H., Shvaiko, P., Trojahn, C.: Ontology Alignment Evaluation Initiative: Six Years of Experience. J. Data Sem. 15, 158–192 (2011)
12. Ferré, S., Rudolph, S.: Advocatus Diaboli – Exploratory Enrichment of Ontologies with Negative Constraints. In: Int'l Conf. on Knowl. Eng. (EKAW). pp. 42–56 (2012)
13. Grau, B.C., Dragisic, Z., Eckert, K., et al.: Results of the Ontology Alignment Evaluation Initiative 2013. In: Ontology Matching (OM) (2013)
14. Ivanova, V., Lambrix, P.: A Unified Approach for Aligning Taxonomies and Debugging Taxonomies and their Alignments. In: Eur. Sem. Web Conf. (ESWC), pp. 1–15. Springer (2013)
15. Jean-Mary, Y.R., Shironoshita, E.P., Kabuka, M.R.: Ontology Matching With Semantic Verification. J. Web Sem. 7(3), 235–251 (2009)
16. Jiménez-Ruiz, E., Cuenca Grau, B., Horrocks, I.: On the feasibility of using OWL 2 DL reasoners for ontology matching problems. In: OWL Reasoner Evaluation Workshop (2012)
17. Jiménez-Ruiz, E., Cuenca Grau, B.: LogMap: Logic-based and Scalable Ontology Matching. In: Int'l Sem. Web Conf. (ISWC). pp. 273–288 (2011)
18. Jiménez-Ruiz, E., Cuenca Grau, B., Horrocks, I., Berlanga, R.: Ontology integration using mappings: Towards getting the right logical consequences. In: Eur. Sem. Web Conf. (2009)
19. Jiménez-Ruiz, E., Cuenca Grau, B., Horrocks, I., Berlanga, R.: Logic-based Assessment of the Compatibility of UMLS Ontology Sources. J. Biomed. Semant. 2(Suppl 1), S2 (2011)
20. Jiménez-Ruiz, E., Cuenca Grau, B., Zhou, Y., Horrocks, I.: Large-scale Interactive Ontology Matching: Algorithms and Implementation. In: Eur. Conf. on Artif. Intell. (ECAI) (2012)
21. Jiménez-Ruiz, E., Meilicke, C., Grau, B.C., Horrocks, I.: Evaluating Mapping Repair Systems with Large Biomedical Ontologies. In: Description Logics. pp. 246–257 (2013)
22. Kalyanpur, A., Parsia, B., Horridge, M., Sirin, E.: Finding all justifications of OWL DL entailments. In: Int'l Sem. Web Conf. (ISWC). pp. 267–280 (2007)
23. Kazakov, Y., Krötzsch, M., Simancik, F.: Concurrent Classification of EL Ontologies. In: Int'l Sem. Web Conf. (ISWC). pp. 305–320 (2011)
24. Kharlamov, E., Jiménez-Ruiz, E., Zheleznyakov, D., et al.: Optique: Towards OBDA Systems for Industry. In: Eur. Sem. Web Conf. (ESWC) Satellite Events. pp. 125–140 (2013)
25. Konev, B., Walther, D., Wolter, F.: The Logical Difference Problem for Description Logic Terminologies. In: Int'l Joint Conf. on Automated Reasoning (IJCAR). pp. 259–274 (2008)
26. Lambrix, P., Dragisic, Z., Ivanova, V.: Get My Pizza Right: Repairing Missing Is-a Relations in ALC Ontologies. In: Semantic Technology, pp. 17–32. Springer (2013)
27. Lambrix, P., Liu, Q.: Debugging the Missing Is-a Structure Within Taxonomies Networked by Partial Reference Alignments. Data Knowl. Eng. (DKE) 86, 179–205 (2013)
28. Mascardi, V., Ancona, D., Barbieri, M., Bordini, R.H., Ricci, A.: CooL-AgentSpeak: Endowing AgentSpeak-DL Agents with Plan Exchange and Ontology Services. Web Intelligence and Agent Systems 12(1), 83–107 (2014)
29. Meilicke, C.: Alignment Incoherence in Ontology Matching. Ph.D. thesis, University of Mannheim (2011)
30. Meilicke, C., Völker, J., Stuckenschmidt, H.: Learning Disjointness for Debugging Mappings between Lightweight Ontologies. In: Int'l Conf. on Knowl. Eng. (EKAW). pp. 93–108 (2008)
31. Melnik, S., Garcia-Molina, H., Rahm, E.: Similarity Flooding: A Versatile Graph Matching Algorithm and Its Application to Schema Matching. In: IEEE Int'l Conf. on Data Eng. (2002)
32. Motik, B., Shearer, R., Horrocks, I.: Hypertableau Reasoning for Description Logics. J. Artif. Intell. Res. (JAIR) 36, 165–228 (2009)
33. Nebot, V., Berlanga, R.: Efficient Retrieval of Ontology Fragments Using an Interval Labeling Scheme. Inf. Sci. 179(24), 4151–4173 (2009)
34. Ngo, D., Bellahsene, Z.: YAM++: A Multi-strategy Based Approach for Ontology Matching Task. In: Int'l Conf. on Knowl. Eng. (EKAW). pp. 421–425 (2012)
35. Pesquita, C., Faria, D., Santos, E., Couto, F.M.: To repair or not to repair: reconciling correctness and coherence in ontology reference alignments. In: Ontology Matching (OM) (2013)
36. Reiter, R.: A Theory of Diagnosis from First Principles. Artif. Intell. 32(1) (1987)
37. Santos, E., Faria, D., Pesquita, C., Couto, F.: Ontology Alignment Repair Through Modularization and Confidence-based Heuristics. arXiv:1307.5322 preprint (2013)
38. Schlobach, S.: Debugging and Semantic Clarification by Pinpointing. In: Eur. Sem. Web Conf. (ESWC), pp. 226–240. Springer (2005)
39. Schlobach, S., Cornet, R.: Non-standard Reasoning Services for the Debugging of Description Logic Terminologies. In: Int'l Joint Conf. on Artif. Intell. (IJCAI). pp. 355–362 (2003)
40. Shvaiko, P., Euzenat, J.: Ontology Matching: State of the Art and Future Challenges. IEEE Transactions on Knowl. and Data Eng. (TKDE) (2012)
41. Skjæveland, M.G., Lian, E.H., Horrocks, I.: Publishing the Norwegian Petroleum Directorate's FactPages as Semantic Web Data. In: Int'l Sem. Web Conf. (ISWC) (2013)
42. Soylu, A., et al.: A Preliminary Approach on Ontology-Based Visual Query Formulation for Big Data. In: 7th Research Conf. on Metadata and Semantics Research (MTSR) (2013)
43. Suntisrivaraporn, B., Qi, G., Ji, Q., Haase, P.: A Modularization-Based Approach to Finding All Justifications for OWL DL Entailments. In: Asian Sem. Web Conf. (ASWC) (2008)
44. Tian, A., Sequeda, J., Miranker, D.P.: QODI: Query as Context in Automatic Data Integration. In: Int'l Sem. Web Conf. (ISWC). pp. 624–639 (2013)
45. Völker, J., Vrandecic, D., Sure, Y., Hotho, A.: Learning Disjointness. In: Eur. Sem. Web Conf. (ESWC). pp. 175–189 (2007)
46. Šváb, O., Svátek, V., Berka, P., Rak, D., Tomášek, P.: OntoFarm: Towards an Experimental Collection of Parallel Ontologies. In: Int'l Sem. Web Conf. (ISWC). Poster Session (2005)
47. Wang, P., Xu, B.: Debugging Ontology Mappings: A Static Approach. Computing and Informatics 27(1), 21–36 (2012)
48. Zhang, S., Mork, P., Bodenreider, O.: Lessons Learned from Aligning two Representations of Anatomy. In: Int'l Conf. on Principles of Knowl. Repr. and Reasoning (KR) (2004)

Appendix G

ISWC 2014: Repair in Ontology Alignments

This appendix reports the paper:

− Daniel Faria, Ernesto Jiménez-Ruiz, Catia Pesquita, Emanuel Santos, and Francisco M. Couto. Towards annotating potential incoherences in BioPortal mappings. In Proceedings of the International Semantic Web Conference 2014

Towards annotating potential incoherences in BioPortal mappings

Daniel Faria1, Ernesto Jiménez-Ruiz2, Catia Pesquita1,3, Emanuel Santos3, and Francisco M. Couto1,3

1 LASIGE, Faculdade de Ciências, Universidade de Lisboa, Portugal
2 Department of Computer Science, University of Oxford, UK
3 Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, Portugal

Abstract. BioPortal is a repository for biomedical ontologies that also includes mappings between them from various sources. Considered as a whole, these mappings may cause logical errors, due to incompatibilities between the ontologies or even erroneous mappings. We have performed an automatic evaluation of BioPortal mappings between 19 ontology pairs using the mapping repair systems of LogMap and AgreementMakerLight. We found logical errors in 11 of these pairs, which on average involved 22% of the mappings between each pair. Furthermore, we conducted a manual evaluation of the repair results to identify the actual sources of error, verifying that erroneous mappings were behind over 60% of the repairs. Given the results of our analysis, we believe that annotating BioPortal mappings with information about their logical conflicts with other mappings would improve their usability for semantic web applications and facilitate the identification of erroneous mappings. In future work, we aim to collaborate with BioPortal developers in extending BioPortal with these annotations.

1 Motivation

OWL ontologies are extensively used in biomedical information systems. Prominent examples of biomedical ontologies are the Gene Ontology [1], the National Cancer Institute Thesaurus (NCIT) [14] and the Foundational Model of Anatomy (FMA) [32]. Despite some community efforts to ensure a coordinated development of biomedical ontologies [38], many ontologies are developed independently by different groups of experts and, as a result, they often cover the same or related subjects, but follow different modeling principles and use different entity naming schemes. Thus, to integrate data among applications, it is crucial to establish correspondences (called mappings) between the entities of the ontologies they use.

In the last ten years, the semantic web and bioinformatics research communities have extensively investigated the problem of (semi-)automatically computing correspondences between independently developed ontologies, which is usually referred to as the ontology matching problem. Resulting from this effort are the growing number of ontology matching systems in development [8,7,37] and the large mapping repositories that have been created (e.g., [2,10]).

One such repository, BioPortal [10,33] (https://bioportal.bioontology.org/), is a coordinated community effort which currently provides access to more than 370 biomedical ontologies and over 12 million mappings between them. While not all BioPortal ontologies were originally OWL ontologies (e.g., some were developed in OBO format, see http://www.geneontology.org/GO.format.obo-1_4.shtml), many have been (or can be) converted to OWL [15]. Mappings in BioPortal are either generated automatically by a sophisticated lexical matcher [13] or added manually by domain experts through the Web interface or the REST APIs [29].

OWL ontologies have well-defined semantics [4], and thus the integration of independently developed ontologies via a set of mappings (i.e., an alignment) may lead to logical errors such as unsatisfiabilities [25]. BioPortal, however, explicitly supports the idea that alternative (i.e., created for different purposes) mapping sets may co-exist and that they could potentially contradict each other [29].

While it is true that many logical errors in alignments are caused by incompatibilities between the ontologies they map [19,31], some may be caused by erroneous mappings. Furthermore, logical soundness may be critical to some semantic web applications that integrate two or more ontologies [31]. For these reasons, we consider that it would be advantageous to enrich BioPortal mappings with annotations about potential logical conflicts with other mappings. This would improve the usability of BioPortal mappings for semantic web applications and domain users, and facilitate the identification of erroneous mappings and potential errors in the ontologies themselves.

In this paper we quantify the logical errors in the BioPortal mappings among several ontologies by applying mapping repair algorithms. Furthermore, we manually analyze a subset of the identified conflicting mappings in order to qualify the causes of the errors. Our goal is to show the importance of identifying (and annotating) logical conflicts in BioPortal and the role ontology mapping repair algorithms may play in that task.
The rest of the paper is organized as follows: Section 2 describes how mappings are represented in BioPortal; Section 3 introduces the concept of mapping repair and presents the repair algorithms used in this study; Section 4 details the automatic and manual evaluations we conducted and presents and discusses their results; and finally, Section 5 presents some conclusions and future work lines.

2 Mappings in BioPortal

Mappings are treated as first-class citizens in BioPortal [29,12], which enables the querying, upload, and retrieval of all mappings between all of its ontologies. A survey conducted in 2009 revealed that 33% of BioPortal ontologies had 50% of their entities mapped to other ontologies [12], which indicates that BioPortal ontologies are highly interconnected. The number of mappings in BioPortal has grown quickly in recent years, from 30,000 mappings between 20 ontologies in 2008 [29] to 9 million mappings between 302 ontologies in 2012 [33]. At the time of writing this paper, there were approximately 13 million mappings between 373 ontologies.

Mappings in BioPortal are represented as 4-tuples of the form ⟨e1, e2, Rel, Ann⟩, where e1, e2 are the URIs of two entities from the vocabulary of two BioPortal ontologies, Rel is the semantic relationship between them, and Ann is a set of annotations or metadata associated to the mapping. The relation Rel can be of one of the following types (see http://www.bioontology.org/wiki/index.php/BioPortal_Mappings): skos:exactMatch, skos:relatedMatch, skos:closeMatch, skos:narrowMatch, skos:broadMatch. Ann includes, among other details, important provenance information about the mapping, such as: origin (e.g., user-defined or the alignment system employed), application context, creator, and creation date.

According to the BioPortal authors [33], the statistics about mapping origin are the following: (i) 64.96% of the mappings were created by the lexical matcher LOOM [13]; (ii) 32.48% of the mappings had UMLS [2] as origin; (iii) 2.41% represented mappings between entities with the same URI; (iv) 0.02% came from Xref OBO mappings; (v) finally, 0.13% of the mappings were submitted by users. Mappings between entities with the same URI are labeled skos:exactMatch by BioPortal, LOOM and UMLS mappings are labeled skos:closeMatch, and Xref OBO mappings are labeled skos:relatedMatch. User-submitted mappings can be labeled with any of the relation types listed above. BioPortal mappings can be retrieved via its REST API, and it is straightforward to identify the entities involved in a mapping, its origin, and the source ontologies (see http://data.bioontology.org/documentation#Mapping).

In this paper, we have focused only on skos:closeMatch mappings, which account for the large majority of BioPortal mappings. We represented these mappings as OWL 2 equivalence axioms, since that is typically the semantic relation they convey (the tag skos:closeMatch is used to link concepts that can be used interchangeably, at least in a given context). Mapping annotations Ann have (optionally) been represented as OWL 2 axiom annotations. This representation of mappings enables the reuse of the extensive range of OWL 2 reasoning infrastructure that is currently available. Note that alternative formal semantics for ontology mappings have been proposed in the literature (e.g., [3,6,28]).
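To make the 4-tuple representation concrete, the following is a minimal Python sketch of how such mappings could be held in memory and filtered down to the OWL 2 equivalence axioms used in this paper. The class and function names are hypothetical (they are not part of any BioPortal or OWL library), and the axioms are rendered as plain tuples rather than OWL API objects.

from dataclasses import dataclass

@dataclass(frozen=True)
class BioPortalMapping:
    """Hypothetical in-memory form of the 4-tuple <e1, e2, Rel, Ann>."""
    e1: str          # URI of the first entity
    e2: str          # URI of the second entity
    rel: str         # e.g. "skos:closeMatch"
    ann: tuple = ()  # provenance annotations (origin, creator, date, ...)

def as_owl2_axioms(mappings):
    """Keep only skos:closeMatch mappings and read each one as an OWL 2
    equivalence axiom, as done in the paper; axioms are returned here as
    plain triples rather than reasoner-specific objects."""
    return {("EquivalentClasses", m.e1, m.e2)
            for m in mappings if m.rel == "skos:closeMatch"}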

3 Mapping repair

The ontology resulting from the integration of O1 and O2 via a set of mappings M may entail axioms that do not follow from O1, O2, or M alone. These new semantic consequences can be captured using the notion of deductive difference [24,18], and can be divided into desired and undesired entailments. Undesired entailments are typically introduced due to erroneous mappings in M. However, even if all mappings in M are correct, undesired entailments may occur due to conflicting descriptions of the overlapping entities in O1 and O2. Undesired entailments can be divided into two groups: entailments causing unsatisfiable classes, which can be easily detected using (automatic) logical reasoning, and entailments not causing unsatisfiable classes, which require domain knowledge to decide whether they are indeed undesired. In this paper we only focus on the first group of undesired entailments.

A set of mappings M that leads to unsatisfiable classes in O1 ∪ O2 ∪ M is referred to as incoherent w.r.t. O1 and O2 [26].

Definition 1 (Mapping Incoherence). A set of mappings M is incoherent with respect to O1 and O2 if there exists a class A in the signature of O1 ∪ O2 such that O1 ∪ O2 ⊭ A ⊑ ⊥ and O1 ∪ O2 ∪ M ⊨ A ⊑ ⊥.

An incoherent set of mappings M can be fixed by removing mappings from M. This process is referred to as mapping repair (or repair for short).

Definition 2 (Mapping Repair). Let M be an incoherent set of mappings w.r.t. O1 and O2. A set of mappings R ⊆ M is a mapping repair for M w.r.t. O1 and O2 if M \ R is coherent w.r.t. O1 and O2.
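Definitions 1 and 2 can be operationalized directly, given a reasoner. The following Python sketch assumes a caller-supplied oracle unsat_classes that maps a set of axioms to the names of its unsatisfiable classes (e.g., backed by a DL reasoner); ontologies and mapping sets are modeled as plain sets of axioms, so this is an illustration of the definitions rather than any system's actual code.

def is_incoherent(o1, o2, mappings, unsat_classes):
    """Definition 1: M is incoherent w.r.t. O1 and O2 iff merging in M makes
    some class unsatisfiable that was satisfiable in O1 ∪ O2 alone.
    `unsat_classes` is an assumed reasoner callback: axioms -> set of
    unsatisfiable class names."""
    baseline = unsat_classes(o1 | o2)
    merged = unsat_classes(o1 | o2 | mappings)
    return not merged <= baseline

def is_repair(o1, o2, mappings, candidate, unsat_classes):
    """Definition 2: R ⊆ M is a mapping repair iff M \\ R is coherent."""
    return (candidate <= mappings
            and not is_incoherent(o1, o2, mappings - candidate, unsat_classes))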

A trivial repair is R = M, since an empty set of mappings is obviously coherent. Nevertheless, the objective is to remove as few mappings as possible. Minimal (mapping) repairs are typically referred to in the literature as mapping diagnoses [25].

In the literature there are different approaches to compute a repair or diagnosis for an incoherent set of mappings. Early approaches were based on Distributed Description Logics (DDL) (e.g., [27,28,30]). The work presented in [30] deserves special mention, as it reports on a preliminary coherence evaluation of BioPortal mappings using DDL (to the best of our knowledge, however, no automatic repair was conducted). The authors emphasized the efficiency problems of the coherence checking task, due to the reasoning complexity of DDL, and suggested the use of approximate techniques in the future.

Alternatively, if mappings are represented as OWL 2 axioms, mapping repairs can also be computed using the state-of-the-art approaches for debugging and repairing inconsistencies in OWL 2 ontologies, which rely on the extraction of justifications for the unsatisfiable classes (e.g., [36,22,39,18]). However, justification-based technologies do not scale when the number of unsatisfiabilities is large (a typical scenario in mapping repair problems [16]).

To address this scalability issue, mapping repair systems usually compute an approximate repair using incomplete reasoning techniques (e.g., [17,25,9]). An approximate repair R≈ does not guarantee that M \ R≈ is coherent, but it will (in general) significantly reduce the number of unsatisfiabilities caused by the original mappings M. Indeed, approximate repair techniques have been successfully applied to audit the UMLS Metathesaurus [19,17].

In this paper, we have applied the approximate mapping repair techniques implemented in LogMap [17,20,21] and AgreementMakerLight (AML) [9,35] to the BioPortal mappings. As described in Section 2, we have represented the BioPortal mappings as OWL 2 equivalence axioms. Note that, although both LogMap and AML were originally implemented as ontology matching systems, they can also operate as stand-alone mapping repair systems. From this point onwards, we will refer to LogMap's and AML's repair modules as LogMap-Repair and AML-Repair, respectively.

Algorithm 1 AML-Repair algorithm
Input: O1, O2: input ontologies; M: input mappings
Output: M′: output mappings; R≈: approximate mapping repair; CS: identified conflicting sets; M_CS: mappings involved in conflicting sets
 1: M′ := M
 2: R≈ := ∅
 3: ⟨O1′, O2′, Checkset⟩ := BuildCoreFragments(O1, O2, M′)
 4: CS := ConflictSets(O1′, O2′, M′, Checkset)
 5: M_CS := MappingsInConflictSets(CS)
 6: CS′ := CS
 7: while |CS′| > 0 do
 8:     w := SelectMappingToRemove(CS′)
 9:     CS′ := RemoveMapping(CS′, w)
10:     M′ := M′ \ {w}
11:     R≈ := R≈ ∪ {w}
12: end while
13: return ⟨M′, R≈, CS, M_CS⟩

3.1 Mapping repair using AML-Repair

The pseudocode of the algorithm implemented by AML-Repair is shown in Algorithm 1. The algorithm is divided into three main tasks:

1. the computation of the core fragments (see [34]) (step 3);
2. the search for all (minimal) conflicting sets of mappings CS, i.e., mappings that lead to an incoherence (step 4);
3. the resolution of incoherences using a heuristic to minimize the set of mappings removed from every conflicting set (steps 8 to 11).

The algorithm outputs a set of repaired mappings M′, an approximate mapping repair R≈, the conflicting sets of mappings CS, and the set M_CS of all mappings involved in at least one conflicting set.

The AML-Repair implementation is based on a modularization of the input ontologies, called core fragments, that contains only the classes and relations necessary to detect all existing incoherences [34]. This modularization is computed by the BuildCoreFragments method (step 3 of Algorithm 1), which also computes a minimal set of classes (the Checkset) that need to be checked for incoherences. AML-Repair determines subsumption relations between atomic classes syntactically (i.e., without using an OWL 2 reasoner), and it also considers disjointness axioms between atomic classes. Unlike LogMap-Repair, equivalence mappings are considered indivisible units and are never split into two subsumption mappings. Thus, an input mapping is either removed or kept in the alignment during the repair procedure.

The ConflictSets method (step 4) returns all mapping sets that will lead to an incoherence by doing a full depth-first search in the core-fragment structure for each class in the Checkset. This way, AML-Repair determines all minimal sets of mappings, called

conflicting sets CS, which cause the incoherences. Since conflicting sets are minimal, a conflicting set is resolved if at least one of its mappings is removed. The algorithm also keeps the set M_CS containing all mappings involved in a conflicting set (step 5).

AML-Repair aims to minimize the number of removed mappings by determining a minimal set of mappings that intersects all conflict sets. Given that computing this set is NP-complete, AML-Repair uses an efficient heuristic procedure that consists of iteratively removing the mappings that belong to the highest number of conflicting sets (as identified in step 8 of Algorithm 1) and, in case of a tie, those that have the lowest confidence values. This strategy typically produces near-optimal results.
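The tie-breaking heuristic of steps 8–11 amounts to a greedy approximate minimum hitting set over the minimal conflict sets. A compact Python rendering, assuming conflict sets are given as collections of mapping identifiers and `confidence` is a hypothetical dictionary assigning each mapping its confidence value:

def greedy_repair(conflict_sets, confidence):
    """Greedy approximate minimum hitting set over minimal conflict sets:
    repeatedly remove the mapping occurring in the most unresolved conflict
    sets, breaking ties by lowest confidence."""
    unresolved = [set(cs) for cs in conflict_sets]
    repair = set()
    while unresolved:
        counts = {}
        for cs in unresolved:
            for m in cs:
                counts[m] = counts.get(m, 0) + 1
        # highest occurrence count first; ties broken by lowest confidence
        worst = max(counts, key=lambda m: (counts[m], -confidence[m]))
        repair.add(worst)
        unresolved = [cs for cs in unresolved if worst not in cs]
    return repair

Each removed mapping resolves every conflict set it hits, mirroring the fact that a minimal conflict set is resolved by removing any one of its mappings.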

3.2 Mapping repair using LogMap-Repair

Algorithm 2 shows the pseudocode of the algorithm implemented by LogMap-Repair. Steps 1–3 initialise the output variables. LogMap-Repair encodes the input ontologies O1 and O2 as Horn propositional theories P1 and P2 (step 4) and exploits this encoding to subsequently detect unsatisfiable classes in an efficient and sound way during the repair process. The theory P1 (resp. P2) consists of the following Horn rules:

– a rule A → B for all distinct classes A, B such that A is subsumed by B in O1 (resp. in O2); subsumption relations can be determined using either an OWL 2 reasoner, or syntactically (in an incomplete way);
– rules Ai ∧ Aj → false (1 ≤ i < j ≤ n) for each disjointness axiom of the form DisjointClasses(A1, ..., An);
– a rule A1 ∧ ... ∧ An → B for each subclass or equivalence axiom having the intersection of A1, ..., An as subclass expression and B as superclass.

Algorithm 2 LogMap-Repair algorithm based on Horn propositional reasoning
Input: O1, O2: input ontologies; M: input mappings
Output: M′: output mappings; R≈: approximate mapping repair; CG: conflicting groups; M_CG: average number of mappings in conflicting groups
 1: M′ := M
 2: R≈ := ∅
 3: CG := ∅
 4: ⟨P1, P2⟩ := PropEncoding(O1, O2)
 5: for each C ∈ OrderedVariables(P1 ∪ P2) do
 6:     P_C := P1 ∪ P2 ∪ M′ ∪ {true → C}
 7:     ⟨sat, M⊥⟩ := DowlingGallier(P_C)
 8:     if sat = false then
 9:         CG := CG ∪ {M⊥}
10:         Rep := ∅
11:         rep_size := 1
12:         repeat
13:             for each subset R_C of M⊥ of size rep_size do
14:                 sat := DowlingGallier(P_C \ R_C)
15:                 if sat = true then Rep := Rep ∪ {R_C}
16:             end for
17:             rep_size := rep_size + 1
18:         until Rep ≠ ∅
19:         R_C := element of Rep with minimum aggregated confidence
20:         M′ := M′ \ R_C
21:         R≈ := R≈ ∪ R_C
22:     end if
23: end for
24: M_CG := AverageMappingsInConflictGroups(CG)
25: return ⟨M′, R≈, CG, M_CG⟩

In step 5, the propositional variables in P1 (resp. in P2) are ordered such that a variable C in P1 (resp. in P2) comes before D whenever D is subsumed by C in O1 (resp. in O2). This is a well-known repair strategy: subclasses of an unsatisfiable class are also unsatisfiable, and hence before repairing an unsatisfiable class one first needs to repair its superclasses. Satisfiability of a propositional variable C is determined by checking satisfiability of the propositional theory P_C (step 6) consisting of (i) the rule true → C; (ii) the propositional representations P1 and P2; and (iii) the current set of output mappings M′ (seen as propositional implications). Note that LogMap-Repair splits equivalence mappings into two equivalent subsumption mappings.

LogMap-Repair implements the classical Dowling-Gallier algorithm for propositional Horn satisfiability [5,11]; a runnable sketch of this propagation loop is given at the end of this subsection. LogMap-Repair's implementation of the Dowling-Gallier algorithm also records all mappings potentially involved in an unsatisfiability. Thus, a call to DowlingGallier returns a satisfiability value sat and, optionally, the (overestimated) group M⊥ of conflicting mappings (see steps 7 and 14). For statistical purposes, the set CG keeps all conflicting groups for the identified unsatisfiable classes (step 9). An unsatisfiable class C is repaired by discarding conflicting mappings for C (steps 10 to 21): subsets R_C of M⊥ of increasing size are identified until a repair is found (steps 12–18). Note that LogMap-Repair does not compute a diagnosis for the unsatisfiable class C, but rather the repairs of smallest size. If several repairs of a given size exist, the one with the lowest aggregated confidence is selected, according to the confidence values assigned to the mappings (step 19). Steps 20 and 21 update the output mappings M′ and the approximate mapping repair R≈ by subtracting and adding R_C, respectively. Finally, step 24 calculates the average number of mappings in the identified conflicting groups CG.

Algorithm 2 ensures that P1 ∪ P2 ∪ M′ ∪ {true → C} is satisfiable for each C occurring in P1 ∪ P2. The propositional encoding of O1 and O2 is, however, incomplete, and hence the algorithm does not ensure satisfiability of each class in O1 ∪ O2 ∪ M′. Nevertheless, the number of unsatisfiable classes remaining after computing an approximate repair R≈ is typically small.
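The Dowling-Gallier procedure at the heart of steps 7 and 14 is a linear-time unit-propagation loop over Horn clauses. The following is a self-contained Python sketch of that loop (the clause encoding is an assumption of this sketch, and LogMap's additional bookkeeping of the conflicting mappings M⊥ is omitted):

from collections import defaultdict, deque

def dowling_gallier(clauses):
    """Linear-time satisfiability of propositional Horn clauses.
    Each clause is a pair (body, head): body is a frozenset of atoms and
    head is an atom, or None to encode 'false' (e.g. disjointness rules
    Ai ∧ Aj -> false). Facts such as true -> C have an empty body."""
    remaining = [len(body) for body, _ in clauses]  # unsatisfied body atoms
    watching = defaultdict(list)                    # atom -> clause indices
    for i, (body, _) in enumerate(clauses):
        for atom in body:
            watching[atom].append(i)
    true_atoms = set()
    queue = deque(i for i, (body, _) in enumerate(clauses) if not body)
    while queue:
        head = clauses[queue.popleft()][1]
        if head is None:
            return False, true_atoms   # derived false: unsatisfiable
        if head in true_atoms:
            continue
        true_atoms.add(head)
        for j in watching[head]:
            remaining[j] -= 1
            if remaining[j] == 0:
                queue.append(j)
    return True, true_atoms

Testing a class C then amounts to calling dowling_gallier on the Horn encodings of O1, O2 and M′ plus the fact true → C, exactly as in step 7 of Algorithm 2.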

4 Evaluation

In order to evaluate the coherence of BioPortal mappings, we manually selected 19 ontology pairs from BioPortal such that (i) each pair had at least 500 mappings listed in BioPortal, (ii) at least one of the ontologies in the pair contained disjointness clauses between their classes, and (iii) the domain of both ontologies was biomedical. The purpose of the first two criteria is to exclude ontology pairs that are uninteresting from an (automatic) mapping repair perspective, whereas the third criterion ensures that we are able to manually evaluate the repair results, as they lie within our domain of expertise. This selection was not exhaustive, as our goal was merely to select a substantial and representative set of ontology pairs. The 15 ontologies comprising these 19 pairs are listed in Table 1.

Table 1: Ontologies comprising the 19 ontology pairs selected.
Ontology                                         | Acronym | # Classes | Source
Bone Dysplasia Ontology                          | BDO     | 13,817    | BioPortal
Cell Culture Ontology                            | CCONT   | 14,663    | BioPortal
Experimental Factor Ontology                     | EFO     | 14,499    | BioPortal
Human Developmental Anatomy Ontology, timed ver. | EHDA    | 8,340     | OBO Foundry
Cardiac Electrophysiology Ontology               | EP      | 81,957    | BioPortal
Foundational Model of Anatomy                    | FMA     | 83,280    | BioPortal
Mouse Adult Gross Anatomy Ontology               | MA      | 3,205     | OBO Foundry
NCI Thesaurus                                    | NCIT    | 105,347   | BioPortal
Online Mendelian Inheritance in Man              | OMIM    | 76,721    | BioPortal
Sleep Domain Ontology                            | SDO     | 1,382     | BioPortal
SNP Ontology                                     | SNP     | 2,206     | BioPortal
Sequence Types and Features Ontology             | SO      | 2,021     | BioPortal
Teleost Anatomy Ontology                         | TAO     | 3,372     | OBO Foundry
Uber Anatomy Ontology                            | UBERON  | 15,773    | OBO Foundry
Zebrafish Anatomy and Development Ontology       | ZFA     | 2,955     | OBO Foundry

We retrieved the latest OWL version of each ontology from BioPortal, except for the ontologies that were only available in OBO format. Because AML is currently not set up to handle ontologies in the OBO format, we retrieved the latter from the OBO Foundry (http://www.obofoundry.org/) [38] in OWL format (making sure the versions matched those in BioPortal).

We implemented a script that, given a pair of ontologies, uses BioPortal's REST API to retrieve all mappings between those ontologies (a sketch of such a retrieval loop is given after Algorithm 3 below). We focused only on skos:closeMatch mappings, and we represented them as OWL 2 equivalence axioms. We did not consider skos:exactMatch mappings, since they represent correspondences between entities with the same URI, which in OWL ontologies are considered equivalent (even though the equivalence between them is not explicitly defined). We also excluded a few mappings that had a null source or involved only one entity.

The mappings between the 19 selected ontology pairs are listed in Table 2. We verified that the number of retrieved mappings did not match the number of mappings listed on the BioPortal website, and that sometimes the discrepancy was large (i.e., not accounted for by the small fraction of mappings we excluded). BioPortal developers confirmed that there is indeed an inconsistency between the metrics and the available mappings. Furthermore, in several cases some of the retrieved mappings pointed to classes that were not found in the ontologies (possibly obsolete classes), so the actual mappings between the ontologies were fewer than those retrieved. Additionally, in some cases fewer mappings were found when using the Jena API to read the ontologies (used by AML) than when using the OWL API (used by LogMap). The difference between the two is shown in parentheses in Table 2.

Table 2: BioPortal mappings for the selected ontology pairs.
Ontology Pair | Listed Mappings | Retrieved Mappings | Actual Mappings | Unsat. Classes
BDO-NCIT      | 1,637           | 1,636              | 1,636           | 34,341
CCONT-NCIT    | 2,815           | 2,813              | 2,097 (-19)     | 50,304
EFO-NCIT      | 3,289           | 3,287              | 2,507           | 60,347
EHDA-FMA      | 3,731           | 2,496              | 2,496           | 0
EP-FMA        | 79,497          | 78,489             | 78,489          | 210
EP-NCIT       | 2,468           | 2,465              | 2,465 (-1)      | 14,687 (-1)
MA-FMA        | 5,491           | 961                | 961             | 850
OMIM-NCIT     | 5,198           | 5,198              | 5,178           | 70,172
SDO-EP        | 662             | 135                | 135             | 44
SDO-FMA       | 593             | 529                | 529 (-1)        | 0
SNPO-SO       | 2,168           | 2,150              | 2,028 (-1)      | 0
UBERON-FMA    | 2,233           | 1,932              | 1,932           | 4,753
ZFA-CCONT     | 532             | 437                | 333             | 0
ZFA-EFO       | 773             | 538                | 427             | 913
ZFA-EHDA      | 2,595           | 1,809              | 1,809           | 0
ZFA-FMA       | 1,240           | 265                | 265             | 0
ZFA-MA        | 1,639           | 129                | 129             | 0
ZFA-TAO       | 1,737           | 1,524              | 1,521           | 0
ZFA-UBERON    | 817             | 724                | 724             | 104

Algorithm 3 Automatic repair evaluation of BioPortal mappings
Input: O1, O2: two BioPortal ontologies; M: the set of BioPortal mappings between them
1: Compute all conflict sets of mappings CS, the total number of mappings involved in conflicts M_CS, and the approximate repair R≈ using the AML-Repair system (see Algorithm 1)
2: Get the unsatisfiable classes of O1 ∪ O2 ∪ M \ R≈ using the ELK reasoner
3: Compute the conflicting mapping groups CG per unsatisfiability, the average number of mappings per conflict group M_CG, and the approximate repair R≈ using the LogMap-Repair system (see Algorithm 2)
4: Get the unsatisfiable classes of O1 ∪ O2 ∪ M \ R≈ using the ELK reasoner
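For reference, a hedged Python sketch of such a retrieval script follows. The endpoint shape is based on the public API documentation cited in Section 2; the query parameters ('ontologies', 'apikey', 'page') and the response fields ('collection', 'nextPage') are assumptions rather than details taken from the paper.

import requests

API = "http://data.bioontology.org/mappings"   # endpoint from the API docs

def fetch_mappings(ont1, ont2, api_key):
    """Page through the REST API's mapping listing for an ontology pair.
    Field names are assumptions based on the public API documentation."""
    page, results = 1, []
    while page:
        resp = requests.get(API, params={"ontologies": f"{ont1},{ont2}",
                                         "apikey": api_key, "page": page})
        resp.raise_for_status()
        data = resp.json()
        results.extend(data.get("collection", []))
        page = data.get("nextPage")            # None on the last page
    return results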
Finally, we computed the satisfiability of each alignment with the OWL 2 EL reasoner ELK [23], finding several unsatisfiable classes in 11 of the alignments. We opted for ELK for the sake of efficiency, given the size of some of the ontologies. ELK is incomplete, and thus the identified unsatisfiabilities represent a lower bound of the actual number of such logical errors.

4.1 Automatic Repair Evaluation

For each of the 11 ontology pairs that had incoherent mapping sets (as detected by ELK and listed in Table 2), we conducted the evaluation detailed in Algorithm 3. The results we obtained are shown in Table 3. Note that LogMap-Repair splits equivalence mappings into two subsumption mappings, so the value of |R≈| is not directly comparable with AML-Repair's (the latter should be doubled to compare it with the former).

Table 3: Automatic mapping repair using AML-Repair and LogMap-Repair.
               |        |       AML-Repair                   |     LogMap-Repair
Ontology Pairs | M      | |CS|  | |M_CS|      | |R≈| | Unsat.| |CG| | M_CG | |R≈| | Unsat.
BDO-NCIT       | 1,636  | 1,649 | 1,374 (84%) | 53   | 0     | 125 | 3.2  | 154  | 0
CCONT-NCIT     | 2,097  | 1,197 | 1,136 (55%) | 55   | 3,630 | 125 | 2.7  | 119  | 75
EFO-NCIT       | 2,507  | 1,731 | 1,541 (61%) | 143  | 3,687 | 311 | 4.3  | 353  | 73
EP-FMA         | 78,489 | 348   | 109 (0.1%)  | 16   | 0     | 168 | 11.0 | 168  | 0
EP-NCIT        | 2,465  | 363   | 307 (12%)   | 69   | 253   | 136 | 3.8  | 180  | 0
MA-FMA         | 961    | 21    | 22 (2%)     | 1    | 0     | 1   | 2.0  | 2    | 0
OMIM-NCIT      | 5,178  | 1,800 | 1,078 (21%) | 154  | 0     | 396 | 10.2 | 536  | 0
SDO-EP         | 135    | 3     | 3 (2%)      | 1    | 0     | 1   | 3.0  | 3    | 0
UBERON-FMA     | 1,932  | 486   | 121 (6%)    | 19   | 25    | 70  | 6.3  | 85   | 25
ZFA-EFO        | 427    | 7     | 11 (3%)     | 5    | 0     | 10  | 2.6  | 10   | 0
ZFA-UBERON     | 724    | 0     | 0 (0%)      | 0    | 104   | 0   | 0    | 0    | 104
Average        | 8,777  | 691   | 518 (22%)   | 94   | 700   | 122 | 5    | 146  | 25

M: number of BioPortal mappings; |CS|: number of conflict sets; |M_CS|: number of distinct mappings in conflict sets; |CG|: number of conflict groups; M_CG: average number of mappings per conflict group; |R≈|: repair size in number of equivalence mappings (AML) or number of subsumption mappings (LogMap); Unsat.: unsatisfiable classes after repair.

The incoherence of the repaired mapping sets has been significantly reduced, and in many cases completely removed. The one exception was the ZFA-UBERON case, as neither AML-Repair nor LogMap-Repair could detect and repair any of the unsatisfiabilities in this alignment. Furthermore, the computed (approximate) repairs were not aggressive, as they removed at most 5.7% (AML-Repair) and 7% (LogMap-Repair) of the mappings (in the EFO-NCIT case).

In addition to producing a repair, AML-Repair also identifies the number of conflicting mapping sets CS and the total number of mappings M_CS that are involved in at least one conflict. For example, in the BDO-NCIT case, AML-Repair identifies 1,649 conflicting sets, which involve 84% of the mappings M in this alignment. Given that these mappings were leading to 34,341 unsatisfiable classes (see Table 2), the fact that only 53 equivalence mappings were removed indicates that (at least) some of these were causing several unsatisfiabilities, likely because they were in conflict with multiple other mappings.

LogMap-Repair, on the other hand, identifies groups CG of potentially conflicting mappings (which contain one or more CS) involved in each unsatisfiability, and the average number M_CG of mappings in each conflicting group. CG represents a lower bound of the total number of groups, since LogMap-Repair repairs on-the-fly and removing one mapping may solve multiple unsatisfiabilities. For example, in the BDO-NCIT case, LogMap-Repair only identifies 125 conflicting groups, with an average of 3.2 mappings per group. Note that solving the 125 unsatisfiabilities corresponding to the conflict groups is sufficient to repair the original 34,341 unsatisfiabilities, which again suggests that a few mappings in the conflict groups were causing many of these errors.

4.2 Manual Analysis

To complement the automatic repair evaluation and investigate the causes behind the incoherences identified therein, we manually analyzed the mappings removed by AML-Repair and LogMap-Repair (up to a maximum of 100 mappings per ontology pair, and, in the case of LogMap-Repair, only the cases where the subsumption mappings were removed in both directions). For each removed mapping, we assessed whether it was correct or erroneous (within the context of the ontologies). We deemed a mapping to be erroneous if it falls into one of the following categories:

1. At least one of the entities it maps is obsolete/retired, as in the mapping BDO#HP_0001596 (Alopecia) ⇔ NCIT#C2865 (Alopecia), where the latter class is retired in NCIT.
2. The entities it maps are not directly related, as in the mapping BDO#PATO_0001901 (Back) ⇔ NCIT#C13062 (Back), where the former class stands for the directional qualifier and the latter stands for the body part.
3. The entities it maps are related, but the relationship between them should not be modeled as skos:closeMatch, as in the mapping BDO#G0000064 (CREBP) ⇔ NCIT#C17803 (CREB-Binding Protein), which maps entities that are related (the gene and corresponding protein) but semantically distinct. Moreover, this mapping conflicts with the correct (protein-protein) mapping BDO#P000022 (CREB-Binding Protein) ⇔ NCIT#C17803 (CREB-Binding Protein).

Additionally, when the removed mapping was deemed to be correct, we analyzed the conflict sets CS in which the removed mapping was present (computed by AML-Repair, see Algorithm 3) and assessed whether the mappings in conflict with it were correct or erroneous.
For the purpose of our evaluation, the main issue is not whether the repair algorithms remove erroneous mappings, but rather whether any of the unsatisfiabilities in which a removed mapping is involved are caused by erroneous mappings. Thus, if either the removed mapping or at least one of its conflicting mappings was erroneous, we attributed the cause of removal to a mapping error. If the mapping itself and all of its conflicting mappings were correct, we considered the cause of removal to be an incompatibility between the ontologies.

The results of our manual analysis are summarized in Table 4. In total, over 40% of the mappings removed by both repair systems were indeed erroneous. Furthermore, errors in the mappings were the cause of removal of over 60% of the mappings. We found that category 1 errors (i.e., mappings including obsolete/retired classes) were relatively common in all alignments that included NCIT. Furthermore, there were two common category 3 error patterns in these alignments: gene-protein matches and human-mouse matches. The former pattern consists of a mapping between a gene and its corresponding protein or vice-versa, as exemplified above. The latter pattern consists of a mapping between a (Human) Health/Anatomy class and a corresponding NCIT Mouse class (from the Mouse Pathologic Diagnoses or Mouse Anatomy Concepts sections), which naturally conflicts with the mapping to the main (Human) NCIT sections. One example of this pattern is the mapping BDO#HP_0010786 (Urinary Tract Neoplasm) ⇔ NCIT#C25806 (Mouse Urinary Tract Neoplasm).

Table 4: Manual evaluation of the repaired BioPortal mappings.
               |          AML-Repair              |         LogMap-Repair
Ontology Pairs | Analyzed | Erroneous | Err. Cause | Analyzed | Erroneous | Err. Cause
BDO-NCIT       | 53       | 55%       | 83%        | 52       | 62%       | 83%
CCONT-NCIT     | 55       | 33%       | 62%        | 68       | 43%       | 59%
EFO-NCIT       | 100      | 53%       | 91%        | 100      | 54%       | 93%
EP-FMA         | 16       | 0%        | 0%         | 84       | 0%        | 0%
EP-NCIT        | 69       | 43%       | 71%        | 78       | 60%       | 73%
MA-FMA         | 1        | 100%      | 100%       | 1        | 100%      | 100%
OMIM-NCIT      | 100      | 48%       | 71%        | 100      | 49%       | 76%
SDO-EP         | 1        | 100%      | 100%       | 0        | N/A       | N/A
UBERON-FMA     | 19       | 0%        | 0%         | 20       | 0%        | 0%
ZFA-EFO        | 5        | 60%       | 100%       | 4        | 75%       | 100%
Total          | 419      | 44%       | 71%        | 507      | 42%       | 62%

Another pattern of this category, which occurred in the OMIM-NCIT alignment, consists of a mapping between a disease/symptom and a corresponding adverse event, such as OMIM#MTHU023845 (Neck Pain) ⇔ NCIT#C56135 (Neck Pain Adverse Event).

Regarding category 2 errors, there were few patterns other than number mismatches, such as in the mapping BDO#G0000133 (TBX4) ⇔ NCIT#C101638 (TBX3 Gene). While the cause of these mismatches was often clear, as in the 'back' example above, some defy reason, as in the case of EFO#CHEBI_15366 (Acetic Acid) ⇔ NCIT#C37392 (C58 Mouse).

As for incompatibilities between the ontologies, one of the most interesting cases is the EP-FMA alignment, which is actually an alignment between the OBO version of FMA (which is imported by EP) and the BioPortal version of FMA. Indeed, it was surprising to find that the alignment is incoherent, given that all mappings are true equivalences. It turns out that there are a few structural differences between the two versions of the ontology which cause the incoherences, as some entities are modeled as 'Material Anatomical Entity' in the OBO version and as 'Immaterial Anatomical Entity' in the BioPortal version (with the latter appearing to be more correct in most cases). The same type of structural difference is also behind the incoherences in the UBERON-FMA alignment. Also interesting is the OMIM-NCIT alignment, as OMIM models diseases as subclasses of the anatomical structures where they occur, whereas NCIT models diseases as disjoint from anatomical structures, making it impossible to obtain a coherent alignment between the ontologies where both diseases and anatomical structures are mapped.
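Category 1 errors are the one class of problems detectable without any repair reasoning at all. A minimal Python sketch, assuming the set of obsolete/retired URIs has already been harvested from the ontologies (e.g., from deprecation annotations; how that set is built is left open) and reusing the hypothetical mapping record from the sketch in Section 2:

def category1_mappings(mappings, obsolete_uris):
    """Flag mappings whose endpoints include an obsolete/retired class
    (error category 1)."""
    return [m for m in mappings
            if m.e1 in obsolete_uris or m.e2 in obsolete_uris]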

4.3 Discussion

The results of our study reveal that many sets of BioPortal mappings lead to logical incoherences when taken as a whole, and that many of these incoherences involve erroneous mappings. Thus, adding annotations to BioPortal mappings about potential logical incompatibilities with other mappings would not only improve their usability for semantic web applications, which require logical integrity, but also contribute to identifying and discarding erroneous mappings.

Our study also demonstrates that approximate repair algorithms such as AML-Repair and LogMap-Repair can effectively identify most of the logical conflicts in BioPortal mappings, as well as the mappings that cause them. Furthermore, unlike complete repair algorithms such as those based on DDL [30], AML-Repair and LogMap-Repair are feasible in practice (repair times in AML-Repair ranged from 10 seconds to 10 minutes, whereas in LogMap-Repair they ranged from 3 to 92 seconds, on a quad-core computer with 8 GB of allocated RAM). Thus these algorithms could play a pivotal role in the task of identifying and annotating conflicts between BioPortal mappings. Furthermore, in addition to the annotation of existing mappings, AML-Repair and LogMap-Repair could be employed to screen newly submitted mappings, so that those leading to logical conflicts can be reviewed before being integrated into BioPortal. This could effectively preclude the addition of some erroneous mappings, and would enable the immediate annotation of the mappings accepted for integration.

Regarding the categories of erroneous mappings found, category 1 errors (i.e., mappings that include retired/obsolete entities) are straightforward to identify and handle automatically, even without the use of repair algorithms. These mappings should not be maintained as 'active' mappings in BioPortal, given that retired/obsolete entities are no longer considered an active part of the ontologies. However, it makes sense to keep track of such mappings, so the best solution would be to annotate the mappings themselves as obsolete. Category 2 errors (i.e., mappings between unrelated classes) should definitely be excluded from BioPortal, whereas category 3 errors (i.e., mappings between classes that are related but not closely) can be addressed either by exclusion or by changing the semantic relationship. For instance, a gene and its corresponding protein could be considered skos:relatedMatch rather than skos:closeMatch, although ideally a more descriptive mapping relation should be used. However, if both ontologies describe the gene and the protein, at least one ontology describes the relation between them, and BioPortal includes both the gene-gene and the protein-protein mappings, then maintaining a gene-protein mapping is semantically redundant.

Although finding category 2 and 3 errors typically requires human intervention, the use of mapping repair algorithms is critical to facilitate their detection. Note that not all erroneous mappings necessarily lead to logical conflicts, particularly when the ontologies lack disjointness definitions. Nevertheless, addressing conflict-causing errors will surely be a significant improvement, and the common error patterns thus identified can be employed to search for (non-conflict-causing) errors, even in ontologies that lack disjointness restrictions. The identification of logical conflicts caused by inherent incompatibilities between the ontologies is also critical to understanding the limits of interoperability.
For instance, integrating OMIM and NCIT requires excluding either mappings between anatomical entities or mappings between diseases (depending on the intended application), or ultimately relaxing the disjointness restrictions in NCIT. Additionally, such incompatibilities may point to modeling errors in the ontologies, as in the EP-FMA case, and enable their correction.

5 Conclusions

BioPortal fulfills a critical need of the biomedical community by promoting integration and interoperability between the numerous biomedical ontologies with overlapping domains. However, the different scopes of these ontologies often lead to incompatible views of a given domain, placing restrictions on interoperability. Maintaining conflicting mappings may best serve the needs of the community, as a wider mapping coverage will satisfy more users and enable more applications. Nevertheless, if BioPortal mappings are to be usable on a large scale, and particularly by automatic applications, then identifying those that lead to logical errors is paramount.

Another issue that affects BioPortal is mapping errors, which are inevitable at this scale, particularly when most mappings are produced by automated ontology matching techniques. Finding and correcting these errors is a daunting task, but one of the utmost importance, as they are likely to propagate if used to draw inferences. While not all errors cause logical conflicts, many do, and as our evaluation illustrates, identifying these enables the discovery of error patterns that can be applied to identify further errors. Identifying logical conflicts in BioPortal mappings thus serves the dual purpose of improving usability and facilitating error detection.

Given that using complete reasoners for this task is unfeasible, due to the scale of BioPortal, approximate mapping repair systems such as AML-Repair and LogMap-Repair appear to be the ideal solution. Indeed, our study has shown that these systems are both effective and efficient in tackling large sets of mappings, and they will be even more efficient considering that the goal is only to identify conflicts rather than to repair them.

Our proposal is that BioPortal mappings be enriched with annotations about the other mappings they conflict with, a solution which fits into BioPortal's community-driven and multiple-perspective approach. While distinguishing mapping errors from incompatibilities will require manual analysis, this is a task that could be carried out gradually by the community once mappings are annotated with logical conflicts.

There is, however, one type of error that can be addressed immediately: mappings that include obsolete/retired entities. Despite the fact that there are different representations of these entities among BioPortal ontologies, identifying them (and their mappings) should be straightforward to do automatically. We propose that such mappings be annotated as obsolete, which would enable BioPortal and its users to keep track of them while allowing their automatic exclusion by applications.

Our next step will be to contact BioPortal developers and collaborate with them in the process of finding and annotating mappings with information about logical conflicts, by applying our repair algorithms to the whole of BioPortal.

Acknowledgements

This work was supported by the EU FP7 IP project Optique (no. 318338), the EPSRC project Score!, the Portuguese FCT through the SOMER project (PTDC/EIA-EIA/119119/2010), and the LASIGE Strategic Project (PEst-OE/EEI/UI0408/2014). We would like to thank Bernardo Cuenca Grau and Ian Horrocks for their invaluable help in the development of LogMap-Repair, and Isabel F. Cruz for her invaluable help in the development of AML-Repair.

References

1. Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., et al.: Gene Ontology: tool for the unification of biology. Nature Genetics 25(1), 25–29 (2000)
2. Bodenreider, O.: The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Research 32, 267–270 (2004)
3. Borgida, A., Serafini, L.: Distributed Description Logics: assimilating information from peer sources. J. Data Sem. 1, 153–184 (2003)
4. Cuenca Grau, B., Horrocks, I., Motik, B., Parsia, B., Patel-Schneider, P.F., Sattler, U.: OWL 2: the next step for OWL. J. Web Sem. 6(4), 309–322 (2008)
5. Dowling, W.F., Gallier, J.H.: Linear-time algorithms for testing the satisfiability of propositional Horn formulae. J. Log. Prog. 1(3), 267–284 (1984)
6. Euzenat, J.: Semantic precision and recall for ontology alignment evaluation. In: Int'l Joint Conf. on Artif. Intell. (IJCAI). pp. 348–353 (2007)
7. Euzenat, J., Meilicke, C., Stuckenschmidt, H., Shvaiko, P., Trojahn, C.: Ontology Alignment Evaluation Initiative: six years of experience. J. Data Sem. 15, 158–192 (2011)
8. Euzenat, J., Shvaiko, P.: Ontology Matching. Springer (2007)
9. Faria, D., Pesquita, C., Santos, E., Palmonari, M., Cruz, I.F., Couto, F.M.: The AgreementMakerLight ontology matching system. In: OTM Conferences. pp. 527–541 (2013)
10. Fridman Noy, N., Shah, N.H., Whetzel, P.L., Dai, B., Dorf, M., Griffith, N., Jonquet, C., Rubin, D.L., Storey, M.A.D., Chute, C.G., Musen, M.A.: BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research 37(Web-Server-Issue) (2009)
11. Gallo, G., Urbani, G.: Algorithms for testing the satisfiability of propositional formulae. J. Log. Prog. 7(1), 45–61 (1989)
12. Ghazvinian, A., Noy, N.F., Jonquet, C., Shah, N.H., Musen, M.A.: What four million mappings can tell you about two hundred ontologies. In: Int'l Sem. Web Conf. (ISWC) (2009)
13. Ghazvinian, A., Noy, N.F., Musen, M.A.: Creating mappings for ontologies in biomedicine: simple methods work. In: AMIA Annual Symposium (AMIA) (2009)
14. Golbeck, J., Fragoso, G., Hartel, F.W., Hendler, J.A., Oberthaler, J., Parsia, B.: The National Cancer Institute's Thésaurus and Ontology. J. Web Sem. 1(1), 75–80 (2003)
15. Golbreich, C., Horridge, M., Horrocks, I., Motik, B., Shearer, R.: OBO and OWL: leveraging Semantic Web technologies for the life sciences. In: Int'l Sem. Web Conf. (2007)
16. Jiménez-Ruiz, E., Cuenca Grau, B., Horrocks, I.: On the feasibility of using OWL 2 DL reasoners for ontology matching problems. In: OWL Reasoner Evaluation Workshop (2012)
17. Jiménez-Ruiz, E., Cuenca Grau, B.: LogMap: logic-based and scalable ontology matching. In: Int'l Sem. Web Conf. (ISWC). pp. 273–288 (2011)
18. Jiménez-Ruiz, E., Cuenca Grau, B., Horrocks, I., Berlanga, R.: Ontology integration using mappings: towards getting the right logical consequences. In: Eur. Sem. Web Conf. (2009)
19. Jiménez-Ruiz, E., Cuenca Grau, B., Horrocks, I., Berlanga, R.: Logic-based assessment of the compatibility of UMLS ontology sources. J. Biomed. Semant. 2(Suppl 1), S2 (2011)
20. Jiménez-Ruiz, E., Cuenca Grau, B., Zhou, Y., Horrocks, I.: Large-scale interactive ontology matching: algorithms and implementation. In: Eur. Conf. on Artif. Intell. (ECAI) (2012)
21. Jiménez-Ruiz, E., Meilicke, C., Cuenca Grau, B., Horrocks, I.: Evaluating mapping repair systems with large biomedical ontologies. In: Description Logics. pp. 246–257 (2013)
22. Kalyanpur, A., Parsia, B., Horridge, M., Sirin, E.: Finding all justifications of OWL DL entailments. In: Int'l Sem. Web Conf. (ISWC). pp. 267–280. Springer (2007)
23. Kazakov, Y., Krötzsch, M., Simančík, F.: Concurrent classification of EL ontologies. In: Int'l Sem. Web Conf. (ISWC). pp. 305–320 (2011)
24. Konev, B., Walther, D., Wolter, F.: The logical difference problem for Description Logic terminologies. In: Int'l Joint Conf. on Automated Reasoning (IJCAR). pp. 259–274 (2008)
25. Meilicke, C.: Alignment Incoherence in Ontology Matching. Ph.D. thesis, University of Mannheim (2011)
26. Meilicke, C., Stuckenschmidt, H.: Incoherence as a basis for measuring the quality of ontology mappings. In: Ontology Matching Workshop (2008)
27. Meilicke, C., Stuckenschmidt, H., Tamilin, A.: Repairing ontology mappings. In: AAAI Conf. on Artif. Intell. pp. 1408–1413 (2007)
28. Meilicke, C., Stuckenschmidt, H., Tamilin, A.: Reasoning support for mapping revision. J. Log. Comput. 19(5) (2009)
29. Noy, N.F., Griffith, N., Musen, M.A.: Collecting community-based mappings in an ontology repository. In: Int'l Sem. Web Conf. (ISWC). pp. 371–386 (2008)
30. Pathak, J., Chute, C.G.: Debugging mappings between biomedical ontologies: preliminary results from the NCBO BioPortal mapping repository. In: Int'l Conf. on Biomedical Ontology (ICBO) (2009)
31. Pesquita, C., Faria, D., Santos, E., Couto, F.M.: To repair or not to repair: reconciling correctness and coherence in ontology reference alignments. In: Ontology Matching (OM) (2013)
32. Rosse, C., Mejino Jr., J.: A reference ontology for biomedical informatics: the Foundational Model of Anatomy. J. Biomed. Informatics 36(6), 478–500 (2003)
33. Salvadores, M., Alexander, P.R., Musen, M.A., Noy, N.F.: BioPortal as a dataset of linked biomedical ontologies and terminologies in RDF. Semantic Web 4(3), 277–284 (2013)
34. Santos, E., Faria, D., Pesquita, C., Couto, F.: Ontology alignment repair through modularization and confidence-based heuristics. arXiv:1307.5322 preprint (2013)
35. Santos, E., Faria, D., Pesquita, C., Couto, F.M.: Ontology alignment repair through modularization and confidence-based heuristics. CoRR abs/1307.5322 (2013)
36. Schlobach, S.: Debugging and semantic clarification by pinpointing. In: The Semantic Web: Research and Applications, pp. 226–240. Springer (2005)
37. Shvaiko, P., Euzenat, J.: Ontology matching: state of the art and future challenges. IEEE Trans. Knowledge and Data Eng. (2012)
38. Smith, B., Ashburner, M., Rosse, C., Bard, J., Bug, W., Ceusters, W., Goldberg, L.J., Eilbeck, K., Ireland, A., Mungall, C.J., Leontis, N., Rocca-Serra, P., Ruttenberg, A., Sansone, S.A., Scheuermann, R.H., Shah, N., Whetzel, P.L., Lewis, S.: The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat. Biotech. 25(11) (2007)
39. Suntisrivaraporn, B., Qi, G., Ji, Q., Haase, P.: A modularization-based approach to finding all justifications for OWL DL entailments. In: Asian Sem. Web Conf. (ASWC) (2008)

Appendix H

ISWC 2014: Ontology Approximation

This appendix reports the paper:

− Marco Console, Jose Mora, Riccardo Rosati, Valerio Santarelli, and Domenico Fabio Savo. Effective computation of maximal sound approximations of Description Logic ontologies. In Proceedings of the International Semantic Web Conference 2014

Effective computation of maximal sound approximations of Description Logic ontologies

Marco Console, Jose Mora, Riccardo Rosati, Valerio Santarelli, Domenico Fabio Savo

Dipartimento di Ingegneria Informatica, Automatica e Gestionale "Antonio Ruberti", Sapienza Università di Roma
[email protected]

Abstract. We study the problem of approximating Description Logic (DL) ontologies specified in a source language L_S in terms of a less expressive target language L_T. This problem is becoming very relevant in practice: e.g., approximation is often needed in ontology-based data access systems, which are only able to deal with ontology languages of limited expressiveness. We first provide a general, parametric, and semantically well-founded definition of maximal sound approximation of a DL ontology. Then, we present an algorithm that is able to effectively compute two different notions of maximal sound approximation according to the above parametric semantics when the source ontology language is OWL 2 and the target ontology language is OWL 2 QL. Finally, we test the above algorithm by computing the two OWL 2 QL approximations of a large set of existing OWL 2 ontologies. The experimental results allow us both to evaluate the effectiveness of the proposed notions of approximation and to compare the two different notions of approximation in real cases.

1 Introduction

Description Logic (DL) ontologies are the core element of ontology-based data access (OBDA) [15], in which the ontology is utilized as a conceptual view, allowing user access to the underlying data sources. In OBDA, as well as in all current applications of ontologies requiring automated reasoning, a trade-off must be reached between the expressiveness of the ontology specification language and the complexity of reasoning in such a language. More precisely, most of the current research and development in OBDA focuses on languages for which reasoning, and in particular query answering, is not only tractable (in data complexity) but also first-order rewritable [2,5]. This imposes significant limitations on the set of constructs and axioms allowed in the ontology language.

The limited expressiveness of the current ontology languages adopted in OBDA provides a strong motivation for studying the approximation of ontologies formulated in (very) expressive languages by ontologies in low-complexity languages such as OWL 2 QL. This motivation is not only theoretical but also practical, given the current availability of OBDA systems and the increasing interest in applying the OBDA approach in the real world [1,6,7,16]: for instance, ontology approximation is currently one of the main issues in the generation of ontologies for OBDA within the use cases of the Optique EU project (http://optique-project.eu).

Several approaches have recently dealt with the problem of approximating Description Logic ontologies. These can roughly be partitioned into two types: syntactic and semantic. In the former, only the syntactic form of the axioms of the original ontology is considered; thus, those axioms which do not comply with the syntax of the target ontology language are disregarded [17,18]. This approach can generally be performed quickly and through simple algorithms. However, it does not, in general, guarantee soundness, i.e., that only correct entailments are inferred, or completeness, i.e., that all entailments of the original ontology that are also expressible in the target language are preserved [14]. In the latter, the object of the approximation are the entailments of the original ontology, and the goal is to preserve as much as possible of these entailments by means of an ontology in the target language, guaranteeing soundness of the result. On the other hand, this approach often necessitates performing complex reasoning tasks over the ontology, and may thus be significantly slower. For these reasons, the semantic approach to ontology approximation poses a more interesting but more complex challenge.

In this paper, we study the problem of approximating DL ontologies specified in a source language L_S in terms of a less expressive target language L_T. We deal with this problem by first providing a general, parametric, and semantically well-founded definition of maximal sound approximation of a DL ontology. Our semantic definition captures and generalizes previous approaches to ontology approximation [4,8,11,14]. In particular, our approach builds on the preliminary work presented in [8], which proposed a similar, although non-parameterized, notion of maximal sound approximation. Then, we present an algorithm that is able to effectively compute two different notions of maximal sound approximation according to the above parametric semantics, when the source ontology language is OWL 2 and the target ontology language is OWL 2 QL.
In particular, we focus on the local semantic approximation (LSA) and the global semantic approximation (GSA) of a source ontology. These two notions of approximation correspond to the cases when the parameter of our semantics is set, respectively, to its minimum and to its maximum. Informally, the LSA of an ontology is obtained by considering (and reasoning over) one axiom α of the source ontology at a time, so this technique tries to approximate α independently of the rest of the source ontology. On the contrary, the GSA tries to approximate the source ontology by considering all its axioms (and reasoning over such axioms) at the same time. As a consequence, the GSA is potentially able to "approximate better" than the LSA, while the LSA appears in principle computationally less expensive than the GSA (a schematic rendering of this difference is sketched at the end of this section). Notably, in the case of OWL 2 QL, the GSA corresponds to the notion of approximation given in [14], which has been shown to be very well-suited for query answering purposes.

Finally, we experiment with the above algorithm by computing the LSA and the GSA in OWL 2 QL of a large set of existing OWL 2 ontologies. The experimental results allow us both to evaluate the effectiveness of the proposed notions of approximation and to compare the two different notions of approximation in real cases. In particular, the main properties pointed out by our experimental results are the following:


1. the computation of the LSA is usually less expensive than the computation of the GSA of a given source ontology;
2. in many cases, both the LSA and the GSA of an ontology are very good approximations of the ontology, in the sense that the approximated ontologies actually entail a large percentage of the axioms of the source ontology;
3. in many cases, the LSA and the GSA coincide; this and the previous property imply that the computationally less expensive LSA is usually already able to compute a high-quality sound approximation of the source ontology.

The paper is structured in the following way. First, in Section 2 we recall DL ontology languages, in particular OWL 2 and OWL 2 QL. Then, in Section 3 we present our formal, parameterized notion of semantic sound approximation of an ontology, and illustrate some general properties of such a notion. In Section 4 we present the techniques for computing the GSA and the LSA of OWL 2 ontologies in OWL 2 QL. Finally, we present an experimental evaluation of the above techniques in Section 5, and draw some conclusions in Section 6.
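Schematically, the relationship between the two notions can be captured by the following Python sketch, where `approximate` stands for an assumed oracle returning a sound OWL 2 QL approximation of a set of OWL 2 axioms; this is a rendering of the definitions, not of the actual implementation.

def lsa(ontology, approximate):
    """Local semantic approximation: approximate each axiom of the source
    ontology in isolation and take the union of the results."""
    result = set()
    for axiom in ontology:
        result |= approximate({axiom})
    return result

def gsa(ontology, approximate):
    """Global semantic approximation: reason over all axioms at once.
    Potentially a tighter approximation than lsa(), but the single oracle
    call over the whole ontology is more expensive."""
    return approximate(set(ontology))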

2 Preliminaries

Description Logics (DLs) [3] are logics that allow one to represent the domain of interest in terms of concepts, denoting sets of objects, value-domains, denoting sets of values, attributes, denoting binary relations between objects and values, and roles, denoting binary relations over objects. In this paper we consider the DL SROIQ [10], which is the logic underpinning OWL 2, as the "maximal" DL considered in this paper.

Let Σ be a signature of symbols for individual (object and value) constants and predicates, i.e., concepts, value-domains, attributes, and roles. Let Φ be the set of all SROIQ axioms over Σ. An ontology over Σ is a finite subset of Φ. A DL language over Σ (or simply language) L is a set of ontologies over Σ. We call an L-ontology any ontology O such that O ∈ L. Moreover, we denote by Φ_L the set of axioms ∪_{O∈L} O. We call a language L closed if L = 2^{Φ_L}. As we will see in the following, there exist both closed and non-closed DL languages among the standard ones.

The semantics of an ontology is given in terms of first-order (FOL) interpretations (cf. [3]). We denote with Mod(O) the set of models of O, i.e., the set of FOL interpretations that satisfy all the axioms in O (we recall that every SROIQ axiom corresponds to a first-order sentence). As usual, an ontology O is said to be satisfiable if it admits at least one model, and O is said to entail a FOL sentence α, denoted O ⊨ α, if α^I = true for all I ∈ Mod(O). Moreover, given two ontologies O and O′, we say that O and O′ are logically equivalent if Mod(O) = Mod(O′).

In this work we will mainly focus on two specific languages, which are OWL 2, the official ontology language of the World Wide Web Consortium (W3C) [9], and one of its profiles, OWL 2 QL [12]. Due to space limitations, here we do not provide a complete description of OWL 2, and refer the reader to the official W3C specification [13].
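The closed-language condition L = 2^{Φ_L} can be made concrete on toy finite languages. The following Python check is only an illustration of the definition (real DL languages are infinite sets of ontologies, so this is not a practical test):

from itertools import chain, combinations

def is_closed(language):
    """Check L = 2^{Φ_L} for a toy finite language given as a set of
    frozensets of axioms: every subset of the axioms occurring in L must
    itself be an ontology of L."""
    phi = frozenset().union(*language) if language else frozenset()
    subsets = chain.from_iterable(
        combinations(sorted(phi), r) for r in range(len(phi) + 1))
    return all(frozenset(s) in language for s in subsets)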

We now present the syntax of OWL 2 QL. We use the German notation for describing OWL 2 QL constructs and axioms, and refer the reader to [12] for the OWL functional-style syntax. Expressions in OWL 2 QL are formed according to the following syntax:

$$\begin{array}{rcl@{\qquad}rcl}
B & \longrightarrow & A \mid \exists Q \mid \delta_F(U) \mid \top_C \mid \bot_C & E & \longrightarrow & \rho(U) \\
C & \longrightarrow & B \mid \neg B \mid \exists Q.A & F & \longrightarrow & \top_D \mid T_1 \mid \cdots \mid T_n \\
Q & \longrightarrow & P \mid P^- \mid \top_P \mid \bot_P & V & \longrightarrow & U \mid \top_A \mid \bot_A \\
R & \longrightarrow & Q \mid \neg Q & W & \longrightarrow & V \mid \neg V
\end{array}$$

where: $A$, $P$, and $U$ are symbols denoting respectively an atomic concept, an atomic role, and an atomic attribute; $P^-$ denotes the inverse of $P$; $\exists Q$, also called unqualified existential role, denotes the set of objects related to some object by the role $Q$; $\delta_F(U)$ denotes the qualified domain of $U$ with respect to a value-domain $F$, i.e., the set of objects that $U$ relates to some value in $F$; $\rho(U)$ denotes the range of $U$, i.e., the set of values related to objects by $U$; $T_1, \ldots, T_n$ denote $n$ unbounded value-domains (i.e., datatypes); the concept $\exists Q.A$, or qualified existential role, denotes the qualified domain of $Q$ with respect to $A$, i.e., the set of objects that $Q$ relates to some instance of $A$. $\top_C$, $\top_P$, $\top_A$, and $\top_D$ denote, respectively, the universal concept, role, attribute, and value-domain, while $\bot_C$, $\bot_P$, and $\bot_A$ denote, respectively, the empty concept, role, and attribute. An OWL 2 QL ontology $\mathcal{O}$ is a finite set of axioms of the form:

$$B \sqsubseteq C \qquad Q \sqsubseteq R \qquad U \sqsubseteq V \qquad E \sqsubseteq F \qquad \mathit{ref}(P) \qquad \mathit{irref}(P) \qquad A(a) \qquad P(a,b) \qquad U(a,v)$$

From left to right, the first four axioms above denote subsumptions between concepts, roles, attributes, and value-domains, respectively. The fifth and sixth axioms denote reflexivity and irreflexivity on roles. The last three axioms denote membership of an individual in a concept, membership of a pair of individuals in a role, and membership of a pair constituted by an individual and a value in an attribute.

From the above definition, it immediately follows that OWL 2 QL is a closed language. On the other hand, we recall that OWL 2 is not a closed language. This is due to the fact that OWL 2 imposes syntactic restrictions that concern the simultaneous presence of multiple axioms in the ontology (for instance, there exist restrictions on the usage of role names appearing in role inclusions in the presence of the role chaining constructor).
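To make the shape of these expressions concrete, the following is a minimal illustrative sketch (ours, not from the paper) of how the concept-and-axiom grammar above could be encoded as algebraic datatypes in Python; it covers only concepts, roles, and concept inclusions, omitting attributes, value-domains, and the top/bottom constructs, and all class names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass(frozen=True)
class Named:                 # an atomic concept A, role P, or attribute U
    name: str

@dataclass(frozen=True)
class Inverse:               # P^-
    role: Named

Role = Union[Named, Inverse]            # Q ::= P | P^-

@dataclass(frozen=True)
class Exists:                # ∃Q when filler is None, ∃Q.A otherwise
    role: Role
    filler: Optional[Named] = None

@dataclass(frozen=True)
class Neg:                   # ¬B, admissible only on right-hand sides
    concept: Union[Named, Exists]

@dataclass(frozen=True)
class SubClassOf:            # B ⊑ C
    sub: Union[Named, Exists]
    sup: Union[Named, Exists, Neg]

# Person ⊑ ∃hasParent.Person, a legal OWL 2 QL axiom in this encoding
axiom = SubClassOf(Named("Person"),
                   Exists(Named("hasParent"), Named("Person")))
```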

3 Approximation

In this section, we illustrate our notion of approximation in a target language $\mathcal{L}_T$ of an ontology $\mathcal{O}_S$ in a language $\mathcal{L}_S$.

Typically, when discussing approximation, one of the desirable properties is soundness. Roughly speaking, when the object of approximation is a set of models, this property requires that the set of models of the approximation is a superset of those of the original ontology. Another desirable characteristic of the computed ontology is that it be the "best" approximation of the original ontology; in other words, the distance between the original ontology and the ontology resulting from its approximation should be kept minimal. On the basis of these observations, the following definition of approximation in a target language $\mathcal{L}_T$ of a satisfiable $\mathcal{L}_S$-ontology is very natural.

Definition 1. Let $\mathcal{O}_S$ be a satisfiable $\mathcal{L}_S$-ontology, and let $\Sigma_{\mathcal{O}_S}$ be the set of predicate and constant symbols occurring in $\mathcal{O}_S$. An $\mathcal{L}_T$-ontology $\mathcal{O}_T$ over $\Sigma_{\mathcal{O}_S}$ is a global semantic approximation (GSA) in $\mathcal{L}_T$ of $\mathcal{O}_S$ if both the following statements hold:
(i) $Mod(\mathcal{O}_S) \subseteq Mod(\mathcal{O}_T)$;
(ii) there is no $\mathcal{L}_T$-ontology $\mathcal{O}'$ over $\Sigma_{\mathcal{O}_S}$ such that $Mod(\mathcal{O}_S) \subseteq Mod(\mathcal{O}') \subset Mod(\mathcal{O}_T)$.
We denote with $globalApx(\mathcal{O}_S, \mathcal{L}_T)$ the set of all the GSAs in $\mathcal{L}_T$ of $\mathcal{O}_S$.

In the above definition, statement (i) imposes the soundness of the approximation, while statement (ii) imposes the condition of "closeness" in the choice of the approximation.

We observe that an $\mathcal{L}_T$-ontology which is the GSA in $\mathcal{L}_T$ of $\mathcal{O}_S$ may not exist. This is the case when, for each $\mathcal{L}_T$-ontology $\mathcal{O}'_T$ satisfying statement (i) of Definition 1, there always exists an $\mathcal{L}_T$-ontology $\mathcal{O}''_T$ which satisfies statement (i), but for which we have that $Mod(\mathcal{O}_S) \subseteq Mod(\mathcal{O}''_T) \subset Mod(\mathcal{O}'_T)$.

The following lemma provides a sufficient condition for the existence of the GSA in a language $\mathcal{L}_T$ of an ontology $\mathcal{O}_S$.

Lemma 1. Given a language $\mathcal{L}_T$ and a finite signature $\Sigma$, if the set of non-equivalent axioms in $\Phi_{\mathcal{L}_T}$ that one can generate over $\Sigma$ is finite, then for any $\mathcal{L}_S$-ontology $\mathcal{O}_S$, $globalApx(\mathcal{O}_S, \mathcal{L}_T) \neq \emptyset$.

In cases where GSAs exist, i.e., $globalApx(\mathcal{O}_S, \mathcal{L}_T) \neq \emptyset$, given two ontologies $\mathcal{O}'$ and $\mathcal{O}''$ in $globalApx(\mathcal{O}_S, \mathcal{L}_T)$, they may be either logically equivalent or not. Non-equivalence can only arise when the language in which the original ontology is approximated is not closed. We have the following lemma.

Lemma 2. Let $\mathcal{L}_T$ be a closed language, and let $\mathcal{O}_S$ be an ontology. For each $\mathcal{O}'$ and $\mathcal{O}''$ belonging to $globalApx(\mathcal{O}_S, \mathcal{L}_T)$, we have that $\mathcal{O}'$ and $\mathcal{O}''$ are logically equivalent.

Proof. Towards a contradiction, suppose that $Mod(\mathcal{O}') \neq Mod(\mathcal{O}'')$. From this, and from Definition 1, we have that $Mod(\mathcal{O}') \not\subset Mod(\mathcal{O}'')$ and $Mod(\mathcal{O}'') \not\subset Mod(\mathcal{O}')$. Hence, there exist axioms $\alpha$ and $\beta$ in $\Phi_{\mathcal{L}_T}$ such that $\mathcal{O}' \models \alpha$ and $\mathcal{O}'' \not\models \alpha$, and $\mathcal{O}'' \models \beta$ and $\mathcal{O}' \not\models \beta$. Since both $\mathcal{O}'$ and $\mathcal{O}''$ are sound approximations of $\mathcal{O}_S$, $\mathcal{O}_S \models \{\alpha, \beta\}$. Because $\mathcal{L}_T$ is closed, the ontology $\mathcal{O}'_\beta = \mathcal{O}' \cup \{\beta\}$ is an $\mathcal{L}_T$-ontology. From the above considerations it directly follows that $Mod(\mathcal{O}_S) \subseteq Mod(\mathcal{O}'_\beta) \subset Mod(\mathcal{O}')$. This means that $\mathcal{O}'$ does not satisfy condition (ii) of Definition 1, and therefore $\mathcal{O}' \notin globalApx(\mathcal{O}_S, \mathcal{L}_T)$, which is a contradiction. The same conclusion can be reached analogously for $\mathcal{O}''$.

In other words, if the target language is closed, Lemma 2 guarantees that, up to logical equivalence, the GSA is unique.

Definition 1 is non-constructive, in the sense that it does not provide any hint as to how to compute the approximation in $\mathcal{L}_T$ of an ontology $\mathcal{O}_S$. The following theorem suggests more constructive conditions, equivalent to those in Definition 1, but first we need to introduce the notion of entailment set [14] of a satisfiable ontology with respect to a language.

Definition 2. Let $\Sigma_{\mathcal{O}}$ be the set of predicate and constant symbols occurring in $\mathcal{O}$, and let $\mathcal{L}'$ be a language. The entailment set of $\mathcal{O}$ with respect to $\mathcal{L}'$, denoted as $ES(\mathcal{O}, \mathcal{L}')$, is the set of axioms from $\Phi_{\mathcal{L}'}$ that only contain predicates and constant symbols from $\Sigma_{\mathcal{O}}$ and that are entailed by $\mathcal{O}$.

In other words, we say that an axiom $\alpha$ belongs to the entailment set of an ontology $\mathcal{O}$ with respect to a language $\mathcal{L}'$ if $\alpha$ is an axiom in $\Phi_{\mathcal{L}'}$ built over the signature of $\mathcal{O}$ and for each interpretation $\mathcal{I} \in Mod(\mathcal{O})$ we have that $\mathcal{I} \models \alpha$. Clearly, given an ontology $\mathcal{O}$ and a language $\mathcal{L}'$, the entailment set of $\mathcal{O}$ with respect to $\mathcal{L}'$ is unique.

Theorem 1. Let $\mathcal{O}_S$ be a satisfiable $\mathcal{L}_S$-ontology and let $\mathcal{O}_T$ be a satisfiable $\mathcal{L}_T$-ontology. We have that:
(a) $Mod(\mathcal{O}_S) \subseteq Mod(\mathcal{O}_T)$ if and only if $ES(\mathcal{O}_T, \mathcal{L}_T) \subseteq ES(\mathcal{O}_S, \mathcal{L}_T)$;
(b) there is no $\mathcal{L}_T$-ontology $\mathcal{O}'$ such that $Mod(\mathcal{O}_S) \subseteq Mod(\mathcal{O}') \subset Mod(\mathcal{O}_T)$ if and only if there is no $\mathcal{L}_T$-ontology $\mathcal{O}''$ such that $ES(\mathcal{O}_T, \mathcal{L}_T) \subset ES(\mathcal{O}'', \mathcal{L}_T) \subseteq ES(\mathcal{O}_S, \mathcal{L}_T)$.

Proof. We start by focusing on the first statement. ($\Leftarrow$) Suppose, by way of contradiction, that $ES(\mathcal{O}_T, \mathcal{L}_T) \subseteq ES(\mathcal{O}_S, \mathcal{L}_T)$ and that $Mod(\mathcal{O}_S) \not\subseteq Mod(\mathcal{O}_T)$. This means that there exists at least one interpretation that is a model for $\mathcal{O}_S$ but not for $\mathcal{O}_T$. Therefore there exists an axiom $\alpha \in \mathcal{O}_T$ such that $\mathcal{O}_S \not\models \alpha$. Since $\mathcal{O}_T$ is an ontology in $\mathcal{L}_T$, $\alpha$ is an axiom in $\Phi_{\mathcal{L}_T}$. It follows that $\alpha \in ES(\mathcal{O}_T, \mathcal{L}_T)$ and that $\alpha \notin ES(\mathcal{O}_S, \mathcal{L}_T)$, which leads to a contradiction.
($\Rightarrow$) Towards a contradiction, suppose that $Mod(\mathcal{O}_S) \subseteq Mod(\mathcal{O}_T)$, but $ES(\mathcal{O}_T, \mathcal{L}_T) \not\subseteq ES(\mathcal{O}_S, \mathcal{L}_T)$. This means that there exists at least one axiom $\alpha \in ES(\mathcal{O}_T, \mathcal{L}_T)$ such that $\alpha \notin ES(\mathcal{O}_S, \mathcal{L}_T)$. It follows that $\mathcal{O}_T \models \alpha$ while $\mathcal{O}_S \not\models \alpha$, which immediately implies that $Mod(\mathcal{O}_S) \not\subseteq Mod(\mathcal{O}_T)$. Hence we have a contradiction.

Now we prove the second statement. ($\Leftarrow$) By contradiction, suppose that there exists an $\mathcal{L}_T$-ontology $\mathcal{O}'$ such that $Mod(\mathcal{O}_S) \subseteq Mod(\mathcal{O}') \subset Mod(\mathcal{O}_T)$, and that there does not exist any $\mathcal{L}_T$-ontology $\mathcal{O}''$ such that $ES(\mathcal{O}_T, \mathcal{L}_T) \subset ES(\mathcal{O}'', \mathcal{L}_T) \subseteq ES(\mathcal{O}_S, \mathcal{L}_T)$. From what was shown before, we have that $Mod(\mathcal{O}_S) \subseteq Mod(\mathcal{O}') \subseteq Mod(\mathcal{O}_T)$ implies that $ES(\mathcal{O}_T, \mathcal{L}_T) \subseteq ES(\mathcal{O}', \mathcal{L}_T) \subseteq ES(\mathcal{O}_S, \mathcal{L}_T)$. Moreover, since both $\mathcal{O}'$ and $\mathcal{O}_T$ are $\mathcal{L}_T$ ontologies, $Mod(\mathcal{O}') \subset Mod(\mathcal{O}_T)$ implies that $ES(\mathcal{O}_T, \mathcal{L}_T) \neq ES(\mathcal{O}', \mathcal{L}_T)$. Hence, $ES(\mathcal{O}_T, \mathcal{L}_T) \subset ES(\mathcal{O}', \mathcal{L}_T) \subseteq ES(\mathcal{O}_S, \mathcal{L}_T)$, which contradicts the hypothesis. ($\Rightarrow$) Suppose, by way of contradiction, that there exists an $\mathcal{L}_T$-ontology $\mathcal{O}''$ such that $ES(\mathcal{O}_T, \mathcal{L}_T) \subset ES(\mathcal{O}'', \mathcal{L}_T) \subseteq ES(\mathcal{O}_S, \mathcal{L}_T)$ and there is no $\mathcal{L}_T$-ontology $\mathcal{O}'$ such that $Mod(\mathcal{O}_S) \subseteq Mod(\mathcal{O}') \subset Mod(\mathcal{O}_T)$. From property (a) we have that $Mod(\mathcal{O}_S) \subseteq Mod(\mathcal{O}'') \subseteq Mod(\mathcal{O}_T)$. Since both $\mathcal{O}''$ and $\mathcal{O}_T$ are $\mathcal{L}_T$ ontologies, $ES(\mathcal{O}_T, \mathcal{L}_T) \subset ES(\mathcal{O}'', \mathcal{L}_T)$ implies that $Mod(\mathcal{O}'') \neq Mod(\mathcal{O}_T)$, which directly leads to a contradiction.
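When $\mathcal{L}_T$ is closed over a finite signature, entailment sets are finite and Theorem 1 turns the model-theoretic conditions of Definition 1 into set-inclusion tests. The following schematic Python sketch (ours, not the authors') makes this concrete; `entailment_set` and `candidate_ontologies` are hypothetical oracles for $ES(\cdot, \mathcal{L}_T)$ and for the finitely many $\mathcal{L}_T$-ontologies over the signature of the source ontology.

```python
def is_gsa(o_t, o_s, entailment_set, candidate_ontologies):
    """Decide Definition 1 via Theorem 1 when entailment sets are finite.

    entailment_set(O) must return ES(O, L_T) as a frozenset of axioms;
    candidate_ontologies yields the finitely many L_T-ontologies over
    the signature of o_s (both parameters are hypothetical oracles).
    """
    es_t, es_s = entailment_set(o_t), entailment_set(o_s)
    if not es_t <= es_s:                 # (a): Mod(O_S) ⊆ Mod(O_T)
        return False
    # (b): no O'' with ES(O_T, L_T) ⊂ ES(O'', L_T) ⊆ ES(O_S, L_T)
    return not any(es_t < entailment_set(o) <= es_s
                   for o in candidate_ontologies)
```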

From Theorem 1 it follows that every ontology $\mathcal{O}_T$ which is a GSA in $\mathcal{L}_T$ of an ontology $\mathcal{O}_S$ is also an approximation in $\mathcal{L}_T$ of $\mathcal{O}_S$ according to [8], and, as we shall show in the following section, for some languages this corresponds to the approximation in [14].

As discussed in [8], the computation of a GSA can be a very challenging task even when approximating into tractable fragments of OWL 2 [12]. This means that even though a GSA is one that best preserves the semantics of the original ontology, it currently suffers from a significant practical setback: the outcome of the computation of the approximation is tightly linked to the capabilities of the currently available reasoners for $\mathcal{L}_S$-ontologies. This may lead, in practice, to the impossibility of computing GSAs of very large or complex ontologies when the source language is very expressive.

We observe that the critical point behind these practical difficulties in computing a GSA of an ontology is that, in current implementations, any reasoner for $\mathcal{L}_S$ must reason over the ontology as a whole. From this observation arises the idea for a new notion of approximation in which we do not reason over the entire ontology but only over portions of it. At the basis of this new notion, which we call k-approximation, is the idea of obtaining an approximation of the original ontology by computing the global semantic approximation of each set of $k$ axioms of the original ontology in isolation. Below we give a formal definition of the k-approximation.

In what follows, given an ontology $\mathcal{O}$ and a positive integer $k$ such that $k \leq |\mathcal{O}|$, we denote with $\mathit{subset}_k(\mathcal{O})$ the set of all the sets of cardinality $k$ of axioms of $\mathcal{O}$.

Definition 3. Let $\mathcal{O}_S$ be a satisfiable $\mathcal{L}_S$-ontology and let $\Sigma_{\mathcal{O}_S}$ be the set of predicate and constant symbols occurring in $\mathcal{O}_S$. Let $\mathcal{U}_k = \{\mathcal{O}_i^j \mid \mathcal{O}_i^j \in globalApx(\mathcal{O}_i, \mathcal{L}_T)$, such that $\mathcal{O}_i \in \mathit{subset}_k(\mathcal{O}_S)\}$. An $\mathcal{L}_T$-ontology $\mathcal{O}_T$ over $\Sigma_{\mathcal{O}_S}$ is a k-approximation in $\mathcal{L}_T$ of $\mathcal{O}_S$ if both the following statements hold:
– $\bigcap_{\mathcal{O}_i^j \in \mathcal{U}_k} Mod(\mathcal{O}_i^j) \subseteq Mod(\mathcal{O}_T)$;
– there is no $\mathcal{L}_T$-ontology $\mathcal{O}'$ over $\Sigma_{\mathcal{O}_S}$ such that $\bigcap_{\mathcal{O}_i^j \in \mathcal{U}_k} Mod(\mathcal{O}_i^j) \subseteq Mod(\mathcal{O}') \subset Mod(\mathcal{O}_T)$.

The following theorem follows from Theorem 1 and provides a constructive condition for the k-approximation.

Theorem 2. Let $\mathcal{O}_S$ be a satisfiable $\mathcal{L}_S$-ontology and let $\Sigma_{\mathcal{O}_S}$ be the set of predicate and constant symbols occurring in $\mathcal{O}_S$. An $\mathcal{L}_T$-ontology $\mathcal{O}_T$ over $\Sigma_{\mathcal{O}_S}$ is a k-approximation in $\mathcal{L}_T$ of $\mathcal{O}_S$ if and only if:

(i) $ES(\mathcal{O}_T, \mathcal{L}_T) \subseteq ES(\bigcup_{\mathcal{O}_i \in \mathit{subset}_k(\mathcal{O}_S)} ES(\mathcal{O}_i, \mathcal{L}_T), \mathcal{L}_T)$;
(ii) there is no $\mathcal{L}_T$-ontology $\mathcal{O}'$ over $\Sigma_{\mathcal{O}_S}$ such that $ES(\mathcal{O}_T, \mathcal{L}_T) \subset ES(\mathcal{O}', \mathcal{L}_T) \subseteq ES(\bigcup_{\mathcal{O}_i \in \mathit{subset}_k(\mathcal{O}_S)} ES(\mathcal{O}_i, \mathcal{L}_T), \mathcal{L}_T)$.

Proof (sketch). The proof can be easily adapted from the proof of Theorem 1 by observing that, in order to prove the theorem, one has to show that:
(a) $\bigcap_{\mathcal{O}_i^j \in \mathcal{U}_k} Mod(\mathcal{O}_i^j) \subseteq Mod(\mathcal{O}_T)$ if and only if $ES(\mathcal{O}_T, \mathcal{L}_T) \subseteq ES(\bigcup_{\mathcal{O}_i \in \mathit{subset}_k(\mathcal{O}_S)} ES(\mathcal{O}_i, \mathcal{L}_T), \mathcal{L}_T)$;
(b) there is no $\mathcal{L}_T$-ontology $\mathcal{O}'$ over $\Sigma_{\mathcal{O}_S}$ such that $\bigcap_{\mathcal{O}_i^j \in \mathcal{U}_k} Mod(\mathcal{O}_i^j) \subseteq Mod(\mathcal{O}') \subset Mod(\mathcal{O}_T)$ if and only if there is no $\mathcal{L}_T$-ontology $\mathcal{O}''$ over $\Sigma_{\mathcal{O}_S}$ such that $ES(\mathcal{O}_T, \mathcal{L}_T) \subset ES(\mathcal{O}'', \mathcal{L}_T) \subseteq ES(\bigcup_{\mathcal{O}_i \in \mathit{subset}_k(\mathcal{O}_S)} ES(\mathcal{O}_i, \mathcal{L}_T), \mathcal{L}_T)$.

We note that in the case in which $k = |\mathcal{O}_S|$, the k-approximation actually coincides with the GSA. At the other end of the spectrum we have the case in which $k = 1$. Here we are treating each axiom $\alpha$ in the original ontology in isolation, i.e., we are considering ontologies formed by a single axiom $\alpha$. We refer to this approximation as the local semantic approximation (LSA).

We conclude this section with an example highlighting the difference between the GSA and the LSA.

Example 1. Consider the following OWL 2 ontology $\mathcal{O}$:
$$\mathcal{O} = \{\, A \sqsubseteq B \sqcup C,\ \ B \sqcap C \sqsubseteq F,\ \ B \sqsubseteq D,\ \ C \sqsubseteq D,\ \ A \sqsubseteq \exists R.D,\ \ \exists R.D \sqsubseteq E \,\}$$
The following ontology is a GSA in OWL 2 QL of $\mathcal{O}$:
$$\mathcal{O}_{GSA} = \{\, A \sqsubseteq D,\ \ A \sqsubseteq E,\ \ B \sqsubseteq D,\ \ C \sqsubseteq D,\ \ A \sqsubseteq \exists R,\ \ A \sqsubseteq \exists R.D \,\}$$
Indeed, it is possible to show that, according to Theorem 1, each axiom entailed by $\mathcal{O}_{GSA}$ is also entailed by $\mathcal{O}$, and that it is impossible to build an OWL 2 QL ontology $\mathcal{O}'$ such that $ES(\mathcal{O}_{GSA}, \text{OWL 2 QL}) \subset ES(\mathcal{O}', \text{OWL 2 QL}) \subseteq ES(\mathcal{O}, \text{OWL 2 QL})$.

Computing the LSA in OWL 2 QL of $\mathcal{O}$, i.e., its 1-approximation in OWL 2 QL, we obtain the following ontology.

$$\mathcal{O}_{LSA} = \{\, B \sqsubseteq D,\ \ C \sqsubseteq D,\ \ A \sqsubseteq \exists R,\ \ A \sqsubseteq \exists R.D \,\}$$
It is easy to see that $Mod(\mathcal{O}) \subset Mod(\mathcal{O}_{GSA}) \subset Mod(\mathcal{O}_{LSA})$, which means that the ontology $\mathcal{O}_{GSA}$ approximates $\mathcal{O}$ better than $\mathcal{O}_{LSA}$. This expected result is a consequence of the fact that reasoning over each single axiom of $\mathcal{O}$ in isolation does not allow for the extraction of all the OWL 2 QL consequences of $\mathcal{O}$. Moreover, from Lemma 2, it follows that every $\mathcal{O}' \in globalApx(\mathcal{O}, \text{OWL 2 QL})$ is logically equivalent to $\mathcal{O}_{GSA}$. $\sqcap\sqcup$

4 Approximation in OWL 2 QL

In this section we deal with the problem of approximating ontologies in OWL 2 with ontologies in OWL 2 QL. Based on the characteristics of the OWL 2 QL language, we can give the following theorem.

Algorithm 1: $computeKApx(\mathcal{O}, k)$
Input: a satisfiable OWL 2 ontology $\mathcal{O}$, a positive integer $k$ such that $k \leq |\mathcal{O}|$
Output: an OWL 2 QL ontology $\mathcal{O}_{Apx}$
begin
  $\mathcal{O}_{Apx} \leftarrow \emptyset$;
  foreach ontology $\mathcal{O}_i \in \mathit{subset}_k(\mathcal{O})$ do
    $\mathcal{O}_{Apx} \leftarrow \mathcal{O}_{Apx} \cup ES(\mathcal{O}_i, \text{OWL 2 QL})$;
  return $\mathcal{O}_{Apx}$;
end

Theorem 3. Let $\mathcal{O}_S$ be a satisfiable OWL 2 ontology. Then the OWL 2 QL ontology $\bigcup_{\mathcal{O}_i \in \mathit{subset}_k(\mathcal{O}_S)} ES(\mathcal{O}_i, \text{OWL 2 QL})$ is the k-approximation in OWL 2 QL of $\mathcal{O}_S$.

Proof (sketch). To prove the claim, we observe that Lemma 1 holds for OWL 2 QL ontologies, and this guarantees that for every OWL 2 ontology $\mathcal{O}_S$ there exists at least one OWL 2 QL ontology which is its GSA, i.e., $globalApx(\mathcal{O}_S, \text{OWL 2 QL}) \neq \emptyset$. Moreover, since OWL 2 QL is closed, by Lemma 2 all ontologies in $globalApx(\mathcal{O}_S, \text{OWL 2 QL})$ are pairwise logically equivalent. Another consequence of the fact that OWL 2 QL is closed is that, whichever language the original ontology $\mathcal{O}_S$ is expressed in, $ES(\mathcal{O}_S, \text{OWL 2 QL})$ is an OWL 2 QL ontology. Furthermore, given a set of OWL 2 QL ontologies, the union of these ontologies is still an OWL 2 QL ontology. From these observations, it is easy to see that, given an OWL 2 ontology $\mathcal{O}_S$ and an integer $k \leq |\mathcal{O}_S|$, the set $\bigcup_{\mathcal{O}_i \in \mathit{subset}_k(\mathcal{O}_S)} ES(\mathcal{O}_i, \text{OWL 2 QL})$ satisfies conditions (i) and (ii) of Theorem 2. Hence we have the claim.

Notably, we observe that for $k = |\mathcal{O}_S|$ the k-approximation $\mathcal{O}_T$ in OWL 2 QL of $\mathcal{O}_S$ is unique and coincides with its entailment set in OWL 2 QL. This means that $\mathcal{O}_T$ is also the approximation in OWL 2 QL of $\mathcal{O}_S$ according to the notion of approximation presented in [14]. Therefore, all the properties that hold for the semantics in [14] also hold for the GSA. In particular, the evaluation of a conjunctive query $q$ without non-distinguished variables over $\mathcal{O}_T$ coincides with the evaluation of $q$ over $\mathcal{O}_S$ (Theorem 5 in [14]).

From Theorem 3, one can easily come up with Algorithm 1 for computing the k-approximation of an $\mathcal{L}_S$-ontology $\mathcal{O}_S$ in OWL 2 QL. The algorithm first computes every subset of size $k$ of the original ontology $\mathcal{O}_S$. Then it computes the ontology which is the result of the k-approximation in OWL 2 QL of the input ontology as the union of the entailment sets with respect to OWL 2 QL of each such subset.

A naive algorithm for computing the entailment set with respect to OWL 2 QL can be easily obtained from the one given in [14] for DL-Lite languages. We can summarize it as follows. Let $\mathcal{O}$ be an ontology and let $\Sigma_{\mathcal{O}}$ be the set of predicate and constant symbols occurring in $\mathcal{O}$. The algorithm first computes the set $\Gamma$ of axioms in $\Phi_{\text{OWL 2 QL}}$ which can be built over $\Sigma_{\mathcal{O}}$, and then, for each axiom $\alpha \in \Gamma$ such that $\mathcal{O} \models \alpha$, adds $\alpha$ to the set $ES(\mathcal{O}, \text{OWL 2 QL})$. In practice, to check whether $\mathcal{O} \models \alpha$ one can use an OWL 2 reasoner.
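As an illustration only (not the authors' implementation), the two steps just described — the naive entailment-set computation from [14] and Algorithm 1 itself — can be sketched in Python as follows; `axioms_over_signature` (the enumeration of $\Gamma$) and `is_entailed` (one OWL 2 reasoner call) are hypothetical hooks.

```python
from itertools import combinations

def entailment_set_naive(ontology, axioms_over_signature, is_entailed):
    """Naive ES(O, OWL 2 QL): enumerate every OWL 2 QL axiom buildable
    over the signature of O and keep those the reasoner says O entails.
    Both helper parameters are hypothetical stand-ins."""
    return frozenset(alpha for alpha in axioms_over_signature(ontology)
                     if is_entailed(ontology, alpha))

def compute_k_apx(ontology, k, entailment_set):
    """Algorithm 1 (computeKApx): union of the OWL 2 QL entailment sets
    of every k-axiom subset of the ontology (subset_k via combinations)."""
    apx = set()
    for subset in combinations(ontology, k):
        apx |= entailment_set(frozenset(subset))
    return apx
```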

Since each invocation of the OWL 2 reasoner is N2EXPTIME, the computation of the entailment set can be very costly [4]. A more efficient technique for its computation is given in [8], where the idea is to limit the number of invocations of the OWL 2 reasoner by exploiting the knowledge acquired through a preliminary exploration of the ontology. To understand the basic idea behind this technique, consider, for example, an ontology $\mathcal{O}$ that entails the inclusions $A_1 \sqsubseteq A_2$ and $P_1 \sqsubseteq P_2$, where $A_1$ and $A_2$ are concepts and $P_1$ and $P_2$ are roles. Exploiting these inclusions, we can deduce the hierarchical structure of the general concepts that can be built on these four predicates. For instance, we know that $\exists P_2.A_2 \sqsubseteq \exists P_2$, that $\exists P_2.A_1 \sqsubseteq \exists P_2.A_2$, that $\exists P_1.A_1 \sqsubseteq \exists P_2.A_1$, and so on. To obtain the entailed inclusion axioms, we begin by invoking the OWL 2 reasoner, asking for the children of the general concepts which are in the highest position in the hierarchy. So we first compute the subsumees of $\exists P_2$ through the OWL 2 reasoner. If there are none, we avoid invoking the reasoner to ask for the subsumees of $\exists P_2.A_2$, and so on. Regarding the entailed disjointness axioms, we follow the same approach but starting from the lowest positions in the hierarchy; a sketch of this pruning strategy is given after Theorem 4 below.

The following theorem establishes correctness and termination of algorithm computeKApx.

Theorem 4. Let $\mathcal{O}_S$ be a satisfiable OWL 2 ontology. $computeKApx(\mathcal{O}_S, k)$ terminates and computes the k-approximation in OWL 2 QL of $\mathcal{O}_S$.

Proof (sketch). Termination of $computeKApx(\mathcal{O}_S, k)$ directly follows from the fact that $\mathcal{O}_S$ is a finite set of axioms and that, for each $\mathcal{O}_i \in \mathit{subset}_k(\mathcal{O}_S)$, $ES(\mathcal{O}_i, \text{OWL 2 QL})$ can be computed in finite time. The correctness of the algorithm directly follows from Theorem 3.
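The reasoner-call-saving strategy of [8] described before Theorem 4 can be sketched as follows (our Python sketch, under stated assumptions): `subsumees` is assumed to wrap a single reasoner invocation, and `refine` is assumed to produce the syntactically more specific general concepts (e.g., from $\exists P_2$ to $\exists P_2.A_2$); both hooks are hypothetical.

```python
def entailed_inclusions(top_concepts, subsumees, refine):
    """Hierarchy-guided search for entailed inclusions, in the spirit of [8]:
    start from the most general concept expressions and descend; if a concept
    has no subsumees, every refinement of it (being subsumed by it) can have
    none either, so the corresponding reasoner calls are skipped."""
    found = set()
    stack = list(top_concepts)            # e.g. the unqualified ∃P2, ∃P2⁻, ...
    while stack:
        concept = stack.pop()
        subs = subsumees(concept)         # one OWL 2 reasoner invocation
        if not subs:
            continue                      # prune the whole refinement subtree
        found.update((sub, concept) for sub in subs)
        stack.extend(refine(concept))     # e.g. ∃P2 → ∃P2.A2, ∃P2.A1, ...
    return found
```

The pruning is sound because a refinement of a concept $C$ is itself subsumed by $C$, so any subsumee of the refinement would also be a subsumee of $C$.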

5 Experiments

In this section we present the experimental tests that we have performed for the approximation of a suite of OWL 2 ontologies into OWL 2 QL through the two notions of approximation we have introduced earlier.

We notice that by choosing a value for $k$ different from $|\mathcal{O}_S|$, the computation of the entailment set becomes easier. However, observing Algorithm 1, the number of times that this step must be repeated can grow very quickly. In fact, the number of sets of axioms in $\mathit{subset}_k(\mathcal{O}_S)$ is equal to the binomial coefficient $\binom{|\mathcal{O}_S|}{k}$, and therefore for large ontologies this number can easily become enormous; in practice, this can be a critical obstacle to the computation of the k-approximation. For this reason, in these experiments we have focused on comparing the GSA (k-approximation with $k = |\mathcal{O}_S|$) to the LSA (k-approximation with $k = 1$), and we reserve the study of efficient techniques for k-approximation with $1 < k < |\mathcal{O}_S|$ for future work. Furthermore, to provide a standard baseline against which to compare the results of the GSA and the LSA, we have compared both our approaches with a syntactic sound approximation approach, consisting in first normalizing the axioms in the ontology and then eliminating the ones that are not syntactically compliant with OWL 2 QL. We will refer to this approach as "SYNT".
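For reference, the SYNT baseline amounts to a purely syntactic filter. A minimal sketch (ours), assuming hypothetical `normalize` and `is_owl2ql_axiom` helpers:

```python
def synt_approximation(ontology, normalize, is_owl2ql_axiom):
    """SYNT baseline: normalize the axioms, then keep exactly those whose
    syntactic shape already belongs to OWL 2 QL; no reasoning is performed."""
    return {alpha for alpha in normalize(ontology) if is_owl2ql_axiom(alpha)}
```

Because no reasoning is involved, SYNT misses every OWL 2 QL consequence that is entailed but not syntactically present in the normalized ontology, which is exactly the gap that the GSA and the LSA aim to close.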

| Ontology | Expressiveness | Axioms | Concepts | Roles | Attributes | OWL 2 QL Axioms |
|---|---|---|---|---|---|---|
| Homology | ALC | 83 | 66 | 0 | 0 | 83 |
| Cabro | ALCHIQ | 100 | 59 | 13 | 0 | 99 |
| Basic vertebrate anatomy | SHIF | 108 | 43 | 14 | 0 | 101 |
| Fungal anatomy | ALEI+ | 115 | 90 | 5 | 0 | 113 |
| Pmr | ALU | 163 | 137 | 16 | 0 | 159 |
| Ma | ALE+ | 168 | 166 | 8 | 0 | 167 |
| General formal Ontology | SHIQ | 212 | 45 | 41 | 0 | 167 |
| Cog analysis | SHIF(D) | 224 | 92 | 37 | 9 | 213 |
| Time event | ALCROIQ(D) | 229 | 104 | 28 | 7 | 170 |
| Spatial | ALEHI+ | 246 | 136 | 49 | 0 | 155 |
| Translational medicine | ALCRIF(D) | 314 | 225 | 18 | 6 | 298 |
| Biopax | SHIN(D) | 391 | 69 | 55 | 41 | 240 |
| Vertebrate skeletal anatomy | ALER+ | 455 | 314 | 26 | 0 | 434 |
| Image | S | 548 | 624 | 2 | 0 | 524 |
| Protein | ALCF(D) | 691 | 45 | 50 | 133 | 490 |
| Pizza | SHOIN | 712 | 100 | 8 | 0 | 660 |
| Ontology of physics for biology | ALCHIQ(D) | 954 | 679 | 33 | 3 | 847 |
| Plant trait | ALE+ | 1463 | 1317 | 4 | 0 | 1461 |
| Dolce | SHOIN(D) | 1667 | 209 | 313 | 4 | 1445 |
| Ont. of athletic events | ALEH | 1737 | 1441 | 15 | 1 | 1722 |
| Neomark | ALCHQ(D) | 1755 | 55 | 105 | 488 | 842 |
| Pato | SH | 1979 | 2378 | 36 | 0 | 1779 |
| Protein Modification | ALE+ | 1986 | 1338 | 8 | 0 | 1982 |
| Po anatomy | ALE+ | 2128 | 1294 | 11 | 0 | 2064 |
| Lipid | ALCHIN | 2375 | 716 | 46 | 0 | 2076 |
| Plant | S | 2615 | 1644 | 16 | 0 | 2534 |
| Mosquito anatomy | ALE+ | 2733 | 1864 | 5 | 0 | 2732 |
| Idomal namespace | ALER+ | 3467 | 2597 | 24 | 0 | 3462 |
| Cognitive atlas | ALC | 4100 | 1701 | 6 | 0 | 3999 |
| Genomic | ALCQ | 4322 | 2265 | 2 | 0 | 3224 |
| Mosquito insecticide resistance | ALE+ | 4413 | 4362 | 21 | 0 | 4409 |
| Galen-A | ALEHIF+ | 4979 | 2748 | 413 | 0 | 3506 |
| Ni gene | SH | 8835 | 4835 | 8 | 0 | 8834 |
| Fyp | SH | 15105 | 4751 | 69 | 0 | 12924 |
| Fly anatomy | SH | 20356 | 8064 | 72 | 0 | 20353 |
| EL-Galen | ALEH+ | 36547 | 23136 | 950 | 0 | 25138 |
| Galen full | ALEHIF+ | 37696 | 23141 | 950 | 0 | 25613 |
| Pr reasoned | S | 46929 | 35470 | 14 | 0 | 40031 |
| Snomed fragment for FMA | ALER | 73543 | 52635 | 52 | 0 | 35004 |
| Gene | SH | 73942 | 39160 | 12 | 0 | 73940 |
| FMA OBO | ALE+ | 119560 | 75139 | 2 | 0 | 119558 |

Table 1: Characteristics of the ontologies used in the GSA and LSA tests.

The suite of ontologies used during testing contains 41 ontologies and was assembled from the Bioportal ontology repository2. The ontologies that compose this suite were selected to test the scalability of our approaches both to larger ontologies and to ontologies formulated in more expressive languages. In Table 1 we present the most relevant metrics of these ontologies.

All tests were performed on a DELL Latitude E6320 notebook with an Intel Core i7-2640M 2.8 GHz CPU and 4 GB of RAM, running the Microsoft Windows 7 Premium operating system, and Java 1.6 with 2 GB of heap space. The timeout was set at eight hours, and execution was aborted if the maximum available memory was exhausted. The tool used in the experiments and the suite of ontologies are available at http://diag.uniroma1.it/~mora/ontology_approximation/iswc2014/.

As mentioned in Section 4, the computation of the entailment set involves the use of an external OWL 2 reasoner. Therefore, the performance and the results of the computed approximations are greatly affected by the choice of the reasoner. For our tests, we have used the Pellet3 OWL 2 reasoner (version 2.3.0.6).

In Table 2 we present the results of the evaluation. An analysis of these results leads to the following observations. Firstly, we were able to compute the GSA for 26 out of the 41 tested ontologies. For the remaining fifteen, this was not possible, either due to the size of the ontology, in terms of the number of its axioms, e.g., the FMA 2.0 or Gene ontologies, which have more than one hundred thousand and seventy thousand axioms, respectively, or due to its high expressivity, e.g., the Dolce ontology or the General formal ontology. The LSA approach is instead always feasible; it is quicker than the GSA approach for all but one of the tested ontologies, and it is overall very fast: no ontology took more than 250 seconds to approximate with the LSA.

Secondly, it is interesting to compare the quality of the approximation that one can obtain through the LSA with that obtained through the GSA. This comparison answers the question of whether the ontology obtained through the LSA (the "LSA ontology") is able to capture a significant portion of the one obtained through the GSA (the "GSA ontology"). Our tests confirm that this is the case: out of the 26 ontologies for which we were able to compute the GSA, in only four cases does the LSA ontology entail less than 60 percent of the axioms of the GSA ontology, while in twenty cases it entails more than 90 percent of them. The average percentage of axioms in the original ontologies entailed by the GSA ontologies is roughly 80 percent, and of the axioms of the GSA ontologies entailed by the LSA ontologies roughly 87 percent.

Furthermore, the LSA provides a good approximation even for those ontologies for which the GSA is not computable. In fact, Table 3 shows the percentage of axioms of the original ontology that are entailed by the LSA ontology. Out of the twelve ontologies for which we were able to obtain this value (the remaining three ontologies caused an "out of memory" error), only in three cases was it less than 60 percent, while in four cases it was higher than 80 percent. These results are particularly interesting with respect to those ontologies for which the GSA approach is not feasible due to their

2 http://bioportal.bioontology.org/
3 http://clarkparsia.com/pellet/

| Ontology | GSA axioms | GSA entails original (%) | LSA axioms | LSA entails GSA (%) | SYNT axioms | SYNT entails GSA (%) | SYNT entails LSA (%) | GSA time (s) | LSA time (s) |
|---|---|---|---|---|---|---|---|---|---|
| Homology | 83 | 100 | 83 | 100 | 83 | 100 | 100 | 1 | 4 |
| Cabro | 233 | 96 | 121 | 100 | 100 | 100 | 100 | 4 | 2 |
| Basic vertebrate anatomy | 192 | 93 | 141 | 97 | 71 | 56 | 67 | 3 | 3 |
| Fungal anatomy | 318 | 98 | 140 | 69 | 113 | 69 | 100 | 2 | 2 |
| Pmr | 162 | 97 | 159 | 98 | 159 | 98 | 100 | 2 | 2 |
| Ma | 411 | 99 | 240 | 95 | 167 | 96 | 100 | 4 | 4 |
| General formal ontology | – | – | 286 | – | 177 | – | 100 | – | 6 |
| Cog analysis | 104407 | 75 | 474 | 46 | 215 | 1 | 82 | 36 | 7 |
| Time event | 93769 | 71 | 662 | 99 | 196 | 1 | 58 | 45 | 11 |
| Spatial | 510 | 63 | 371 | 86 | 155 | 42 | 52 | 9 | 4 |
| Translational medicine | 4089 | 86 | 505 | 99 | 275 | 30 | 64 | 19 | 7 |
| Biopax | 2182057 | – | 3217 | – | 251 | – | 81 | – | 11 |
| Vertebrate skeletal anatomy | 9488 | 95 | 581 | 92 | 434 | 57 | 99 | 27 | 5 |
| Image | 1016 | 95 | 596 | 98 | 571 | 98 | 100 | 178 | 5 |
| Protein | – | – | 10789 | – | 475 | – | 88 | – | 20 |
| Pizza | 2587 | 91 | 755 | 92 | 678 | 92 | 99 | 7 | 4 |
| Ont. of physics for biology | 1789821 | – | 1505 | – | 1241 | – | 100 | – | 7 |
| Plant trait | 2370 | 99 | 1496 | 99 | 1461 | 100 | 100 | 10 | 9 |
| Dolce | – | – | 2959 | – | 1555 | – | 100 | – | 8 |
| Ontology of athletic events | 5073 | 99 | 2392 | 99 | 1731 | 92 | 100 | 42 | 9 |
| Neomark | – | – | 39807 | – | 1723 | – | 63 | – | 50 |
| Pato | 4066 | 89 | 2209 | 100 | 1976 | 78 | 99 | 99 | 18 |
| Protein Modification | 2195 | 99 | 2001 | 100 | 1982 | 100 | 100 | 12 | 19 |
| Po anatomy | 11486 | 96 | 2783 | 77 | 2078 | 78 | 100 | 455 | 18 |
| Lipid | 14659 | 87 | 3165 | 97 | 2759 | 89 | 97 | 47 | 10 |
| Plant | 18375 | 96 | 3512 | 80 | 2574 | 81 | 100 | 929 | 15 |
| Mosquito anatomy | 21303 | 99 | 4277 | 43 | 2732 | 44 | 100 | 214 | 16 |
| Idomal namespace | 67417 | 99 | 4259 | 98 | 3461 | 59 | 100 | 496 | 16 |
| Cognitive atlas | 7449 | 97 | 5324 | 100 | 1364 | 26 | 30 | 162 | 17 |
| Genomic | – | – | 86735 | – | 85037 | – | 98 | – | 54 |
| Mosquito insecticide res. | 6794 | 99 | 4502 | 100 | 4409 | 100 | 100 | 86 | 14 |
| Galen-A | – | – | 8568 | – | 4971 | – | 90 | – | 26 |
| Ni gene | 46148 | 99 | 10415 | 90 | 8834 | 91 | 100 | 472 | 32 |
| Fyp | – | – | 19675 | – | 11800 | – | 82 | – | 43 |
| Fly anatomy | 460849 | 99 | 28436 | 67 | 20346 | 67 | 100 | 25499 | 45 |
| EL-Galen | – | – | 70272 | – | 43804 | – | 89 | – | 59 |
| Galen full | – | – | 72172 | – | 44279 | – | 89 | – | 61 |
| Pr reasoned | – | – | 56085 | – | 47662 | – | 100 | – | 93 |
| Snomed fragment for FMA | – | – | 140629 | – | 101860 | – | 76 | – | 250 |
| Gene | – | – | 86292 | – | 73940 | – | 100 | – | 178 |
| FMA OBO | – | – | 143306 | – | 119558 | – | 100 | – | 113 |

Table 2: Results of the GSA, LSA, and SYNT. The values represent, from left to right: the number of axioms in the ontology obtained through the GSA; the percentage of axioms of the original ontology that are entailed by the GSA ontology; the number of axioms in the ontology obtained through the LSA; the percentage of axioms of the GSA ontology that are entailed by the LSA ontology; the number of axioms in the ontology obtained through SYNT; the percentage of axioms of the GSA ontology that are entailed by the SYNT ontology; the percentage of axioms of the LSA ontology that are entailed by the SYNT ontology; and finally the GSA time and the LSA time (both in seconds).

| Ontology | Original axioms | LSA axioms | LSA entails original (%) | LSA time (s) |
|---|---|---|---|---|
| General formal ontology | 212 | 264 | 67 | 6 |
| Biopax | 391 | 3204 | 53 | 11 |
| Protein | 691 | 10720 | 47 | 20 |
| Ontology of physics for biology | 954 | 1074 | 75 | 7 |
| Dolce | 1667 | 2914 | 78 | 8 |
| Neomark | 1755 | 38966 | 46 | 50 |
| Genomic | 4322 | 9844 | 65 | 54 |
| Galen-A | 4979 | 8568 | 70 | 26 |
| Fyp | 15105 | 19672 | 85 | 43 |
| EL-Galen | 36547 | 70272 | – | 59 |
| Galen full | 37696 | 72172 | – | 61 |
| Pr reasoned | 46929 | 55391 | 83 | 93 |
| SNOMED fragment for FMA | 73543 | 140629 | – | 250 |
| Gene | 73942 | 86289 | 99 | 178 |
| FMA OBO | 119560 | 143306 | 99 | 113 |

Table 3: LSA results for ontologies for which the GSA is not computable.

complexity, as is the case, for example, for the Dolce ontology, for Galen-A, and for the Ontology of physics for biology. Indeed, even though these ontologies are expressed in highly expressive DL languages, the structure of the axioms that compose them is such that reasoning on each of them in isolation does not lead to much worse approximation results than reasoning on the ontology as a whole: for the nine smallest ontologies in Table 3, for which the GSA fails not because of the size of the ontology, the average percentage is 68.6.

Finally, both the GSA and the LSA compare favorably against the syntactic sound approximation approach. In fact, the average percentages of axioms in the LSA and GSA ontologies that are entailed by the ontologies obtained through the SYNT approach are respectively roughly 90 percent and 72 percent. While the latter result is to be expected, the former is quite significant, even more so when one considers that the LSA is very fast. Indeed, a "gain" of 10 percent of axiom entailments by the LSA with respect to SYNT in the case of large ontologies such as Galen and Snomed translates to tens of thousands of preserved axioms in very little computation time.

In conclusion, the results gathered from these tests corroborate the usefulness of both the global semantic approximation and the local semantic approximation approaches. The former provides a maximal sound approximation in the target language of the original ontology, and is in practice computable in a reasonable amount of time for the majority of the tested ontologies. The latter instead represents a less optimal but still very effective solution for those ontologies for which the GSA approach goes beyond the capabilities of the currently available ontology reasoners.

6 Conclusions

In this paper we have addressed the problem of ontology approximation in Description Logics and OWL, presenting (i) a parameterized semantics for computing sound approximations of ontologies, (ii) algorithms for the computation of approximations (the GSA and the LSA) of OWL 2 ontologies in OWL 2 QL, and (iii) an extensive experimental evaluation of the above techniques, which empirically proves the validity of our approach.

The present work can be extended in several ways. First, while we have focused on sound approximations, it would be interesting to also consider complete approximations of ontologies. Also, we would like to study the development of techniques for k-approximations different from the GSA and the LSA, i.e., for $k$ such that $1 < k < |\mathcal{O}_S|$, as well as to analyze the possibility of integrating ontology module extraction techniques into our approach. Then, this work has not addressed the case when there are differences in the semantic assumptions between the source and the target ontology languages. For instance, differently from OWL 2 and its profiles, some DLs (e.g., DL-Lite_A [15]) adopt the Unique Name Assumption (UNA). This makes our approach not directly applicable, for instance, if we consider OWL 2 as the source language and DL-Lite_A as the target language. The reason is that the UNA implies some axioms (inequalities between individuals) that can be expressed in OWL 2 but cannot be expressed in DL-Lite_A. We aim at extending our approach to deal with the presence of such semantic discrepancies in the ontology languages. Finally, we are very interested in generalizing our approach to a full-fledged ontology-based data access scenario [15], in which data sources are connected through declarative mappings to the ontology. In that context, it might be interesting to use both the ontology and the mappings in the target OBDA specification to approximate a given ontology in the source OBDA specification.

Acknowledgments. This research has been partially supported by the EU under FP7 project Optique (grant n. FP7-318338).

References

1. Natalia Antonioli, Francesco Castanò, Cristina Civili, Spartaco Coletta, Stefano Grossi, Domenico Lembo, Maurizio Lenzerini, Antonella Poggi, Domenico Fabio Savo, and Emanuela Virardi. Ontology-based data access: the experience at the Italian Department of Treasury. In Proc. of the Industrial Track of CAiSE 2013, volume 1017 of CEUR, ceur-ws.org, pages 9–16, 2013.
2. Alessandro Artale, Diego Calvanese, Roman Kontchakov, and Michael Zakharyaschev. The DL-Lite family and relations. J. of Artificial Intelligence Research, 36:1–69, 2009.
3. Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter F. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation and Applications. Cambridge University Press, 2003.
4. Elena Botoeva, Diego Calvanese, and Mariano Rodriguez-Muro. Expressive approximations in DL-Lite ontologies. In Proc. of AIMSA 2010, pages 21–31, 2010.
5. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and Riccardo Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. J. of Automated Reasoning, 39(3):385–429, 2007.

6. Diego Calvanese, Martin Giese, Peter Haase, Ian Horrocks, Thomas Hubauer, Yannis E. Ioannidis, Ernesto Jiménez-Ruiz, Evgeny Kharlamov, Herald Kllapi, Johan W. Klüwer, Manolis Koubarakis, Steffen Lamparter, Ralf Möller, Christian Neuenstadt, T. Nordtveit, Özgür L. Özçep, Mariano Rodriguez-Muro, Mikhail Roshchin, Domenico Fabio Savo, Michael Schmidt, Ahmet Soylu, Arild Waaler, and Dmitriy Zheleznyakov. Optique: OBDA Solution for Big Data. In ESWC (Satellite Events), volume 7955 of Lecture Notes in Computer Science, pages 293–295. Springer, 2013.
7. Cristina Civili, Marco Console, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, Lorenzo Lepore, Riccardo Mancini, Antonella Poggi, Riccardo Rosati, Marco Ruzzi, Valerio Santarelli, and Domenico Fabio Savo. MASTRO STUDIO: Managing ontology-based data access applications. PVLDB, 6:1314–1317, 2013.
8. Marco Console, Valerio Santarelli, and Domenico Fabio Savo. Efficient approximation in DL-Lite of OWL 2 ontologies. In Proc. of DL 2013, volume 1014 of CEUR Workshop Proceedings, pages 132–143, 2013.
9. Pascal Hitzler, Markus Krötzsch, Bijan Parsia, Peter F. Patel-Schneider, and Sebastian Rudolph. OWL 2 Web Ontology Language: Primer. W3C Recommendation, 2012. Available at http://www.w3.org/TR/2012/REC-owl2-primer-20121211/.
10. Ian Horrocks, Oliver Kutz, and Ulrike Sattler. The even more irresistible $\mathcal{SROIQ}$. In Proc. of KR 2006, pages 57–67, 2006.
11. Carsten Lutz, İnanç Seylan, and Frank Wolter. An Automata-Theoretic Approach to Uniform Interpolation and Approximation in the Description Logic EL. In Proc. of KR 2012. AAAI Press, 2012.
12. Boris Motik, Bernardo Cuenca Grau, Ian Horrocks, Zhe Wu, Achille Fokoue, and Carsten Lutz. OWL 2 Web Ontology Language: Profiles (2nd edition). W3C Recommendation, 2012. Available at http://www.w3.org/TR/owl2-profiles/.
13. Boris Motik, Bijan Parsia, and Peter F. Patel-Schneider. OWL 2 Web Ontology Language Structural Specification and Functional-Style Syntax. W3C Recommendation, 2012. Available at http://www.w3.org/TR/2012/REC-owl2-syntax-20121211/.
14. Jeff Z. Pan and Edward Thomas. Approximating OWL-DL ontologies. In Proc. of AAAI 2007, pages 1434–1439, 2007.
15. Antonella Poggi, Domenico Lembo, Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, and Riccardo Rosati. Linking data to ontologies. J. on Data Semantics, X:133–173, 2008.
16. Mariano Rodriguez-Muro, Roman Kontchakov, and Michael Zakharyaschev. Ontology-based data access: Ontop of databases. In Proc. of ISWC 2013, volume 8218 of LNCS, pages 558–573. Springer, 2013.
17. Tuvshintur Tserendorj, Sebastian Rudolph, Markus Krötzsch, and Pascal Hitzler. Approximate OWL-reasoning with Screech. In Proc. of RR 2008, pages 165–180. Springer, 2008.
18. Holger Wache, Perry Groot, and Heiner Stuckenschmidt. Scalable instance retrieval for the semantic web by approximation. In Proc. of WISE 2005, pages 245–254. Springer, 2005.

Appendix I

DL 2013: Ontology Approximation

This appendix reports the paper:

− Marco Console, Valerio Santarelli, Domenico Fabio Savo. Efficient Approximation in DL-Lite of OWL 2 Ontologies. Description Logics 2013.

Efficient approximation in DL-Lite of OWL 2 ontologies

Marco Console, Valerio Santarelli, Domenico Fabio Savo

Dipartimento di Ingegneria Informatica Automatica e Gestionale "Antonio Ruberti" SAPIENZA Università di Roma [email protected]

Abstract. Ontologies, as a conceptualization of a domain of interest, can be used for different objectives, such as providing a formal description of the domain of interest for documentation purposes, or providing a mechanism for reasoning upon the domain. For instance, they are the core element of the Ontology-Based Data Access paradigm, in which the ontology is utilized as a conceptual view allowing user access to the underlying data sources. With the aim to use an ontology as a formal description of the domain of interest, the use of expressive languages proves to be useful. If instead the goal is to use the ontology for reasoning tasks which require low computational complexity, the high expressivity of the language used to model the ontology may be a hindrance. In this scenario, the approximation of ontologies expressed in very expressive languages through ontologies expressed in languages which keep the computational complexity of the reasoning tasks low is pivotal. In this work we present our notion of ontology approximation and an algorithm for computing the approximation of OWL 2 ontologies by means of DL-Lite TBoxes. Moreover, we provide optimization techniques for this computation, and discuss the results of the implementation of these techniques.

1 Introduction

Ontologies, as a shared conceptualization of a domain of interest, are commonly recognized as powerful tools in support of the information systems of organizations [6,4,10]. On one hand, they can be used for providing a formal description of the domain of interest, and thus represent valuable documentation for an organization. On the other hand, they provide a mechanism for reasoning upon the domain with different objectives. For instance, they are the core element of the Ontology-Based Data Access (OBDA) [13,4] paradigm, in which the ontology is utilized as a conceptual view of the underlying data sources, granting the user the ability to retrieve information without specific knowledge of how this information is organized and where it is stored.

With the aim to use an ontology as a formal description of the domain of interest, the use of expressive languages, such as the OWL 2 Web Ontology Language [9], proves to be useful. The expressivity of such languages allows the ontology designer to obtain a precise formalization of the domain. If instead the goal is to use the ontology for reasoning tasks, the high expressivity of the language used to model the ontology may be a hindrance. In particular, when wishing to access large quantities of data through the ontology, as in OBDA, the computational cost of very expressive languages such as OWL

2 is prohibitive. In these cases, it is necessary to resort to less expressive languages, thus resigning oneself to less complete representations of the domain of interest.

One possible solution to this issue is to allow the ontology designer the use of expressive languages to define ontologies that model the domain in great detail for the purpose of documentation and of other tasks that do not require strong computational effort, while adopting, for all those reasoning tasks in which particular computational properties are required, such as OBDA, descriptions of the domain of interest obtained through less expressive languages.

Following this approach, the notion of approximation becomes pivotal. The goal of approximation is, given an ontology $\mathcal{O}$ in a language $\mathcal{L}$, to compute an ontology $\mathcal{O}'$ in a target language $\mathcal{L}'$ in which as much as possible of the semantics of $\mathcal{O}$ is preserved. In general, various approaches can be adopted towards this goal [17,14,12,18,3]. Among these, the approaches in which we are most interested are those which aim to obtain the so-called semantic approximation [14,3,12]. Here, as opposed to those approaches which aim at a syntactic approximation [17,18], the ontology which is the approximation of the original one is computed from the semantics of the original ontology, and not only from its syntactic representation.

In this paper, we focus on the semantic approximation of an ontology for OBDA applications. Thus, we study approaches for approximating ontologies in very expressive languages with ontologies in languages that, characterized by low reasoning complexity, are suitable for query answering purposes. The most significant works in which this problem is studied are [12] and [3], in which the proposed approaches can be traced back to the work of Selman and Kautz [14].

These approaches, as is ours, are based on the notion of entailment set. Given an ontology $\mathcal{O}$ in a language $\mathcal{L}$, the entailment set of $\mathcal{O}$ in a target language $\mathcal{L}'$ is the set of all the assertions expressible in $\mathcal{L}'$ over $\Sigma_{\mathcal{O}}$ that are entailed by $\mathcal{O}$. The idea behind our approach is that an approximation in $\mathcal{L}'$ of $\mathcal{O}$ is an ontology $\mathcal{O}'$ whose entailment set minimally differs from the entailment set of $\mathcal{O}$ in $\mathcal{L}'$.

Since OWL 2 is the W3C standard language for expressing ontologies, it is often used as the expressive language for formulating ontologies describing the domain of interest. On the other hand, in the context of OBDA, one naturally focuses on the logics of the DL-Lite family [5]. This is a family of DLs specifically designed to keep all reasoning tasks polynomially tractable in the size of the data, and it is thus suitable for OBDA. For this reason, in this work we study the problem of approximating OWL 2 ontologies with ontologies in DL-Lite. To this aim we provide an algorithm for the computation of these approximations, and an optimized technique for the computation of the entailment set of an OWL 2 ontology in DL-Lite, which can be used efficiently in practice.

The rest of the paper is organized as follows. In Section 2, we provide some useful notions for the paper. In Section 3 we study the problem of ontology approximation, and introduce our notion of approximation. In Section 4 we focus on the approximation in DL-Lite of OWL 2 ontologies. In Section 5 we describe our technique for optimizing the computation of the entailment set of an OWL 2 ontology in DL-Lite, and present the results of our experimentation. Finally, in Section 6 we conclude the paper.

2 Preliminaries

Description Logics (DLs) [2] are logics that allow one to represent the domain of interest in terms of concepts, denoting sets of objects, value-domains, denoting sets of values, attributes, denoting binary relations between objects and values, and roles, denoting binary relations over objects.

Let $\Sigma$ be a signature of symbols for individual (object and value) constants and atomic elements, i.e., concepts, value-domains, attributes, and roles. If $\mathcal{L}$ is a DL, then an ontology $\mathcal{O}$ in $\mathcal{L}$ (or $\mathcal{L}$ ontology) over $\Sigma$ is the set $\mathcal{T} \cup \mathcal{A}$, where $\mathcal{T}$, the TBox, is a finite set of intensional assertions over $\Sigma$ expressed in $\mathcal{L}$, and $\mathcal{A}$, the ABox, is a finite set of instance assertions, i.e., assertions on individuals, over $\Sigma$. Different DLs allow for different kinds of TBox and/or ABox assertions, and for different manners in which these can be combined to obtain TBoxes and ABoxes in the specific DL.

The semantics of a DL ontology is given in terms of first-order (FOL) interpretations (cf. [2]). We denote with $Mod(\mathcal{O})$ the set of models of $\mathcal{O}$, i.e., the set of FOL interpretations that satisfy all the assertions in $\mathcal{T}$ and $\mathcal{A}$, where the definition of satisfaction depends on the kind of expressions and assertions in the specific DL language in which $\mathcal{O}$ is specified. As usual, an ontology $\mathcal{O}$ is said to be satisfiable if it admits at least one model, and $\mathcal{O}$ is said to entail a First-Order Logic (FOL) sentence $\phi$, denoted $\mathcal{O} \models \phi$, if $\phi^{\mathcal{I}} = \mathit{true}$ for all $\mathcal{I} \in Mod(\mathcal{O})$. Moreover, given two ontologies $\mathcal{O}$ and $\mathcal{O}'$, we say that $\mathcal{O}$ and $\mathcal{O}'$ are logically equivalent if $Mod(\mathcal{O}) = Mod(\mathcal{O}')$.

In this work we mainly focus on two specific languages, which are OWL 2, the official ontology language of the World Wide Web Consortium (W3C) [9], and DL-Lite_A, a member of the DL-Lite family [5], which is a family of tractable DLs particularly suited for dealing with ontologies with very large ABoxes, and is at the basis of OWL 2 QL, one of the profiles of OWL 2.

The Web Ontology Language (version 2), or simply OWL 2, is an ontology language for the Semantic Web with formally defined meaning. OWL 2 provides for describing the domain of interest in terms of concepts, roles, attributes, individuals, and data values [9]. Due to space limitations, here we do not provide a complete description of OWL 2, but refer the reader to the official W3C specification [11].

We now present the syntax of the DL DL-Lite_A. As for the semantics, we apply the general definitions given above, and refer the reader to [13] for the precise description of the semantics of a DL-Lite_A ontology. In what follows we adopt the following notation: $A$, $P$, and $U$ are symbols in $\Sigma$ denoting respectively an atomic concept, an atomic role, and an atomic attribute; $T_1, \ldots, T_n$ are $n$ pairwise disjoint unbounded value-domains; $B$ denotes a basic concept; $C$ a general concept; $Q$ a basic role; $R$ a general role; $V$ a general attribute; $E$ a basic value-domain; and $F$ a value-domain expression. Expressions in DL-Lite_A are formed according to the following syntax:

$$\begin{array}{rcl@{\qquad}rcl}
B & \longrightarrow & A \mid \exists Q \mid \delta(U) & Q & \longrightarrow & P \mid P^- \\
C & \longrightarrow & B \mid \neg B \mid \exists Q.C \mid \delta_F(U) & R & \longrightarrow & Q \mid \neg Q \\
E & \longrightarrow & \rho(U) & V & \longrightarrow & U \mid \neg U \\
F & \longrightarrow & T_1 \mid \cdots \mid T_n & & &
\end{array}$$

where: $P^-$ denotes the inverse of $P$; $\exists Q$, or unqualified existential restriction, denotes the objects related to some object by the role $Q$; $\neg$ denotes negation; $\delta(U)$ denotes the domain of $U$, i.e., the set of objects that $U$ relates to values; and $\rho(U)$ denotes the range of $U$, i.e., the set of values related to objects by $U$. The concept $\exists Q.C$, or qualified existential restriction, denotes the qualified domain of $Q$ with respect to $C$, i.e., the set of objects that $Q$ relates to some instance of $C$. Similarly, $\delta_F(U)$ denotes the qualified domain of $U$ with respect to a value-domain $F$, i.e., the set of objects that $U$ relates to some value in $F$. We refer to the qualified existential restriction expression $\exists Q_1.\cdots.\exists Q_n.C$, where $C$ is not a qualified existential restriction, as an existential role chain of depth $n$.

A DL-Lite_A TBox assertion is an assertion of the form:
$$B \sqsubseteq C \qquad Q \sqsubseteq R \qquad E \sqsubseteq F \qquad U \sqsubseteq V \qquad (\mathit{funct}\ Q) \qquad (\mathit{funct}\ U)$$
From left to right, the first four assertions denote inclusions between concepts, roles, value-domains, and attributes, respectively. The last two assertions denote functionality on roles and on attributes.

A DL-Lite_A TBox is a finite set of assertions of the form above, where suitable limitations in the combination of such assertions are imposed. Given a TBox $\mathcal{T}$ and a role $P$ (resp. an attribute $U$), we say that $P$ (resp. $U$) is primitive in $\mathcal{T}$ if $P$ does not appear in $\mathcal{T}$ positively in the right-hand side of any role (resp. attribute) inclusion assertion, nor in any qualified existential restriction. In a DL-Lite_A TBox, a role $P$ (resp. attribute $U$) that is not primitive in $\mathcal{T}$ cannot appear, either directly or inversely, in a functionality assertion. A DL-Lite_A ABox is a finite set of assertions of the form $A(a)$, $P(a,b)$, and $U(a,v)$, where $A$, $P$, and $U$ are as above, $a$ and $b$ are object constants, and $v$ is a value constant.

3 Approximation of DL ontologies

In this section, we present our notion of approximation of an ontology expressed in a language $\mathcal{L}$ in a target language $\mathcal{L}'$, and then we provide a comparison between our notion and others proposed in the literature.

Ontology approximation. In what follows, $\mathcal{O}$ is a satisfiable $\mathcal{L}$ ontology, and $\mathcal{L}'$ a language not necessarily different from $\mathcal{L}$.

First of all, we need to introduce the notion of entailment set [12] of a satisfiable ontology with respect to a language.

Definition 1. Let $\mathcal{O}$ be a satisfiable ontology expressed in a language $\mathcal{L}$ over a signature $\Sigma$, and let $\mathcal{L}'$ be a language, not necessarily different from $\mathcal{L}$. The entailment set of $\mathcal{O}$ with respect to $\mathcal{L}'$, denoted as $ES(\mathcal{O}, \mathcal{L}')$, is the set which contains all $\mathcal{L}'$ axioms over $\Sigma$ that are entailed by $\mathcal{O}$.

In other words, we say that an axiom $\alpha$ belongs to the entailment set of an ontology $\mathcal{O}$ with respect to a language $\mathcal{L}'$ if $\alpha$ is an $\mathcal{L}'$ axiom built over the alphabet of $\mathcal{O}$ and for each interpretation $\mathcal{I} \in Mod(\mathcal{O})$ we have that $\mathcal{I} \models \alpha$. Clearly, given an ontology $\mathcal{O}$ and a language $\mathcal{L}'$, the entailment set of $\mathcal{O}$ with respect to $\mathcal{L}'$ is unique.

With the notion of entailment set in place, we can define the notion of approximation in $\mathcal{L}'$ of $\mathcal{O}$.

Definition 2. Let $\mathcal{O}$ be an ontology over a signature $\Sigma$. A satisfiable $\mathcal{L}'$ ontology $\mathcal{O}'$ over $\Sigma$ is an approximation in $\mathcal{L}'$ of $\mathcal{O}$ if both the following statements hold:
(i) $ES(\mathcal{O}', \mathcal{L}') \subseteq ES(\mathcal{O}, \mathcal{L}')$;
(ii) there is no satisfiable $\mathcal{L}'$ ontology $\mathcal{O}''$ such that $ES(\mathcal{O}', \mathcal{L}') \subset ES(\mathcal{O}'', \mathcal{L}') \subseteq ES(\mathcal{O}, \mathcal{L}')$.

In other words, the above definition states that a satisfiable ontology $\mathcal{O}'$ is an approximation in $\mathcal{L}'$ of $\mathcal{O}$ if $\mathcal{O}'$ is an ontology in $\mathcal{L}'$, and there is no satisfiable ontology $\mathcal{O}''$ in $\mathcal{L}'$ whose entailment set in $\mathcal{L}'$ is "nearer" to the entailment set of $\mathcal{O}$ in $\mathcal{L}'$ than the entailment set in $\mathcal{L}'$ of $\mathcal{O}'$, where the distance here is measured in terms of set inclusion.

Given an ontology $\mathcal{O}$, it may be the case that the entailment set in a language $\mathcal{L}'$ of $\mathcal{O}$ is infinite. If so, it may happen that the approximation in $\mathcal{L}'$, in accordance with Definition 2, does not exist. For this reason, we follow the approach proposed in [3], where safeness restrictions on the target language are imposed in order to guarantee the finiteness of the entailment set. For example, in DL-Lite_A, which is the language we focus on in the following sections, the infiniteness of the entailment set arises from the possibility of inferring infinitely long existential chains. In this case, the safeness restriction requires allowing, in the TBox, only existential chains of bounded length. Therefore, in what follows we denote versions of DL-Lite_A in which such restriction is enforced as DL-Lite_A^(k), where $k$ is the bounded length of the existential chains.

Another characteristic of DL-Lite_A, shared with other languages such as $\mathcal{EL}^{++}$ [1], is that syntactic restrictions are imposed on the manner in which assertions can be combined. In this case, there may exist more than one ontology which is an approximation in $\mathcal{L}'$ of $\mathcal{O}$. In what follows we denote with $Apx_{MAX}(\mathcal{O}, \mathcal{L}')$ the set of $\mathcal{L}'$ ontologies which are approximations in $\mathcal{L}'$ of $\mathcal{O}$ in accordance with Definition 2.

Example 1. Consider the OWL 2 ontology $\mathcal{O} = \{\, A \sqsubseteq \exists R.B,\ A \sqsubseteq \forall R.B,\ (\mathit{funct}\ R) \,\}$. It is easy to see that, due to the syntactic restriction in DL-Lite_A which imposes that a non-primitive role cannot be functional, we have that, up to logical equivalence, the set $Apx_{MAX}(\mathcal{O}, \text{DL-Lite}_A^{(1)}) = \{\{A \sqsubseteq \exists R.B\},\ \{A \sqsubseteq \exists R,\ (\mathit{funct}\ R)\}\}$. $\sqcap\sqcup$

Comparison with related work. The problem of approximating ontologies for OBDA applications has recently been faced in [12] and [3].

In [12], the authors define the approximation in $\mathcal{L}'$ of a satisfiable ontology $\mathcal{O}$ as the set of $\mathcal{L}'$ axioms coinciding with the entailment set of $\mathcal{O}$ with respect to $\mathcal{L}'$.

Definition 3. Let $\mathcal{O}$ be a satisfiable ontology expressed in a language $\mathcal{L}$. The approximation in $\mathcal{L}'$ of $\mathcal{O}$ is $Apx_{ES}(\mathcal{O}, \mathcal{L}') = ES(\mathcal{O}, \mathcal{L}')$.

Therefore, in accordance with the above definition, the approximation in $\mathcal{L}'$ of the ontology $\mathcal{O}$ is a set of $\mathcal{L}'$ axioms, but not necessarily a valid ontology expressed in $\mathcal{L}'$.

In [3], the authors provide a more sophisticated notion of approximation, in which it is required that the approximation in $\mathcal{L}'$ of $\mathcal{O}$ be an ontology expressed in $\mathcal{L}'$.

Definition 4. Let $\mathcal{O}$ be a satisfiable ontology expressed in a language $\mathcal{L}$ over a signature $\Sigma$. A satisfiable $\mathcal{L}'$ ontology $\mathcal{O}'$ over $\Sigma$ is an approximation in $\mathcal{L}'$ of $\mathcal{O}$ if both the following statements hold:

(i) $ES(\mathcal{O}', \mathcal{L}') \subseteq ES(\mathcal{O}, \mathcal{L}')$;
(ii) for each $\alpha \in ES(\mathcal{O}, \mathcal{L}')$ such that $\mathcal{O}' \cup \{\alpha\}$ is an $\mathcal{L}'$ ontology, then $\alpha \in ES(\mathcal{O}', \mathcal{L}')$.

In other words, an $\mathcal{L}'$ ontology $\mathcal{O}'$ is an approximation in $\mathcal{L}'$ of $\mathcal{O}$ if, among all the possible maximal subsets of $ES(\mathcal{O}, \mathcal{L}')$ which are $\mathcal{L}'$ ontologies, there is one which is logically equivalent to $\mathcal{O}'$. It is not unexpected that, as with our approach, there may exist more than one ontology which is an approximation in $\mathcal{L}'$ of $\mathcal{O}$. We denote with $Apx_{SC}(\mathcal{O}, \mathcal{L}')$ the set of $\mathcal{L}'$ ontologies which are all approximations in $\mathcal{L}'$ of $\mathcal{O}$ in accordance with Definition 4.

Differently from Definition 3, Definition 4 guarantees that an approximation is always an ontology expressed in the target language $\mathcal{L}'$.

Similarly to our notion of approximation, if the entailment set is infinite, it may happen that there is no ontology expressed in the target language which is an approximation. Hence, in order to guarantee the existence of an approximation, in this case as well one must impose safeness restrictions on the target language that guarantee the finiteness of the entailment set.

We now compare our approach to those in [3] and [12] by means of some examples.

Example 2. Consider the following OWL 2 ontology: $\mathcal{O} = \{\, \exists R^- \sqsubseteq B,\ A \sqsubseteq \exists R.B,\ (\mathit{funct}\ R) \,\}$. In accordance with Definitions 3, 4, and 2, we have:

$Apx_{ES}(\mathcal{O}, \text{DL-Lite}_A^{(1)}) = \{\, \exists R^- \sqsubseteq B,\ A \sqsubseteq \exists R,\ A \sqsubseteq \exists R.B,\ A \sqsubseteq \exists R.\exists R^-,\ \exists R \sqsubseteq \exists R.\exists R^-,\ \exists R^- \sqsubseteq \exists R^-.\exists R,\ (\mathit{funct}\ R) \,\}$

$Apx_{SC}(\mathcal{O}, \text{DL-Lite}_A^{(1)}) = \{\, \mathcal{O}'_{sc} = \{ \exists R^- \sqsubseteq B,\ A \sqsubseteq \exists R,\ A \sqsubseteq \exists R.B,\ A \sqsubseteq \exists R.\exists R^-,\ \exists R \sqsubseteq \exists R.\exists R^-,\ \exists R^- \sqsubseteq \exists R^-.\exists R \},\ \mathcal{O}''_{sc} = \{ \exists R^- \sqsubseteq B,\ A \sqsubseteq \exists R,\ (\mathit{funct}\ R) \} \,\}$

$Apx_{MAX}(\mathcal{O}, \text{DL-Lite}_A^{(1)}) = \{\, \mathcal{O}_{max} = \{ \exists R^- \sqsubseteq B,\ A \sqsubseteq \exists R,\ (\mathit{funct}\ R) \} \,\}$ $\sqcap\sqcup$

Due to the syntactic restrictions enforced in DL-Lite_A^(1), ontology $\mathcal{O}$ of Example 2 is not a DL-Lite_A^(1) ontology. However, it is logically equivalent to the DL-Lite_A^(1) ontology $\{ \exists R^- \sqsubseteq B,\ A \sqsubseteq \exists R,\ (\mathit{funct}\ R) \}$.

Regarding $Apx_{ES}(\mathcal{O}, \text{DL-Lite}_A^{(1)})$, we only observe that it is not a valid DL-Lite_A^(1) ontology. Up to logical equivalence, we can see that the set $Apx_{SC}(\mathcal{O}, \text{DL-Lite}_A^{(1)})$ contains, along with ontology $\mathcal{O}''_{sc}$, which is logically equivalent to $\mathcal{O}$, also the unexpected ontology $\mathcal{O}'_{sc}$, for which we have $ES(\mathcal{O}'_{sc}, \text{DL-Lite}_A^{(1)}) \subset ES(\mathcal{O}''_{sc}, \text{DL-Lite}_A^{(1)})$. Finally, according to Definition 2, up to logical equivalence, the only approximation is a DL-Lite_A^(1) ontology that is logically equivalent to $\mathcal{O}$.

Other significant differences between these three approaches arise when the goal is to compute the approximation of an ontology $\mathcal{O}$ expressed in a language $\mathcal{L}$ in which the UNA is not adopted into a target language $\mathcal{L}'$ in which the UNA is adopted. In the example below, we highlight the different behavior of the three approaches in this circumstance.

Example 3. We recall that in OWL 2 the UNA is not adopted. This means, for instance, that given the following two ontologies $\mathcal{O} = \{ A \sqsubseteq \{o_1\},\ B \sqsubseteq \{o_2\} \}$ and $\mathcal{O}' = \{ A \sqsubseteq \{o_1\},\ B \sqsubseteq \{o_2\},\ o_1 \neq o_2 \}$, we have that $Mod(\mathcal{O}) \neq Mod(\mathcal{O}')$. In fact, while $\mathcal{O}$ does not entail $A \sqsubseteq \neg B$, $\mathcal{O}'$ does. Differently, if we adopt the UNA in OWL 2, then we have that $Mod(\mathcal{O}) = Mod(\mathcal{O}')$, and both entail the assertion $A \sqsubseteq \neg B$.

In what follows, let us denote with OWL_UNA the version of OWL 2 where the UNA is adopted. From the observations above, it follows that $\{ A \sqsubseteq \{o_1\},\ B \sqsubseteq \{o_2\} \} \subseteq ES(\mathcal{O}, \text{OWL}_{UNA})$, and that $A \sqsubseteq \neg B \notin ES(\mathcal{O}, \text{OWL}_{UNA})$. In accordance with Definitions 3, 4, and 2, we have that, up to logical equivalence, the approximations in OWL_UNA of $\mathcal{O}$ are:

– Apx_ES(O, OWL_UNA) = ES(O, OWL_UNA). In this case, Apx_ES(O, OWL_UNA) is a valid OWL_UNA ontology. Let O_ES be such an ontology. As shown above, since in OWL_UNA the UNA is adopted, O_ES ⊨ A ⊑ ¬B. This means that ES(O_ES, OWL_UNA) ⊄ ES(O, OWL_UNA).
– Apx_SC(O, OWL_UNA) = ∅. Indeed, since ES(O, OWL_UNA) is a valid OWL_UNA ontology, any OWL_UNA ontology O1 such that ES(O1, OWL_UNA) ⊂ ES(O, OWL_UNA) does not satisfy condition (ii) of Definition 4. Moreover, since every OWL_UNA ontology O2 that satisfies condition (ii) entails both A ⊑ {o1} and B ⊑ {o2}, it also entails A ⊑ ¬B. Hence ES(O2, OWL_UNA) ⊄ ES(O, OWL_UNA), and so condition (i) of Definition 4 is not satisfied.
– Apx_MAX(O, OWL_UNA) = {O'_max = {A ⊑ {o1}}, O''_max = {B ⊑ {o2}}}. Indeed, it is easy to verify that both O'_max and O''_max satisfy both conditions (i) and (ii) of Definition 2, and that there is no other ontology O''' in Apx_MAX(O, OWL_UNA) logically equivalent to either O'_max or O''_max. ⊓⊔

As shown in the previous example, given an ontology O and a target language L', our approach, like the one in [3] but differently from the one given in [12], guarantees that an approximation in L' of O is an L' ontology, and that for each L' ontology O' that is an approximation in L' of O, we have that ES(O', L') ⊆ ES(O, L'). Finally, it can be shown that our approach always preserves at least as many inferences as those obtained by adopting the approach given in [3].

Theorem 1. For each O_sc ∈ Apx_SC(O, L'), there is an O_max ∈ Apx_MAX(O, L') such that ES(O_sc, L') ⊆ ES(O_max, L'). Furthermore, for each O_max ∈ Apx_MAX(O, L'), there is no O_sc ∈ Apx_SC(O, L') such that ES(O_max, L') ⊂ ES(O_sc, L').

4 Approximation in DL-Lite_A of OWL 2 ontologies

In this section, we study the problem of computing the approximation in DL-Lite_A of a satisfiable OWL 2 ontology O. More specifically, we aim to approximate O with DL-Lite_A TBox assertions. Therefore, in what follows we assume that Apx_MAX(O, DL-Lite_A) is a set of DL-Lite_A TBoxes and that ES(O, DL-Lite_A) is a set of DL-Lite_A TBox assertions. To guarantee the finiteness of the entailment set, we refer to versions of DL-Lite_A allowing only for TBox assertions with existential chains of bounded length k.

Given a set S of DL-Lite_A^(k) assertions and a functionality assertion φ over a role R (resp. an attribute U), we denote with clashes(φ, S) the set of all assertions in S involving R (resp. U) that, together with φ, violate the syntactic restriction imposed on DL-Lite_A^(k) TBoxes. Hence, clashes(φ, S) is a set of role (resp. attribute) inclusion assertions and of assertions with a qualified existential role (resp. attribute) on the right-hand side.
Let O be an OWL 2 ontology, and let F be the set containing all the functionality assertions φ in ES(O, DL-Lite_A^(k)) for which clashes(φ, ES(O, DL-Lite_A^(k))) ≠ ∅. If F ≠ ∅, then ES(O, DL-Lite_A^(k)) is not a valid DL-Lite_A^(k) TBox, and there are at most 2^|F| TBoxes T_i which are valid DL-Lite_A^(k) TBoxes and minimally differ from ES(O, DL-Lite_A^(k)). Let MaxSub_ES(ES(O, DL-Lite_A^(k))) be the set of such DL-Lite_A^(k) TBoxes. One can compute the TBoxes in MaxSub_ES(ES(O, DL-Lite_A^(k))) by retracting from ES(O, DL-Lite_A^(k)), for each φ ∈ F, either φ itself or the assertions in clashes(φ, ES(O, DL-Lite_A^(k))), in order to resolve the violations of the syntactic restriction.
The lemma below guarantees that a TBox in MaxSub_ES(ES(O, DL-Lite_A^(k))) is a candidate for being one of the TBoxes in Apx_MAX(O, DL-Lite_A^(k)).

Lemma 1. Let O be a satisfiable OWL 2 ontology and let T be a DL-Lite_A^(k) TBox. Then ES(T, DL-Lite_A^(k)) ⊆ ES(O, DL-Lite_A^(k)) if and only if T ⊆ ES(O, DL-Lite_A^(k)).

In other words, the above lemma guarantees that the first condition in Definition 2 is satisfied by every DL-Lite_A^(k) TBox in MaxSub_ES(ES(O, DL-Lite_A^(k))), and that, in computing all the TBoxes in Apx_MAX(O, DL-Lite_A^(k)), one can consider only the assertions in ES(O, DL-Lite_A^(k)). Moreover, from the monotonicity of DLs it directly follows that for every TBox T in Apx_MAX(O, DL-Lite_A^(k)) there is a TBox in MaxSub_ES(ES(O, DL-Lite_A^(k))) which is logically equivalent to T.
However, in order for a TBox T_i in MaxSub_ES(ES(O, DL-Lite_A^(k))) to belong to Apx_MAX(O, DL-Lite_A^(k)), it must also satisfy the second condition of Definition 2, i.e., there must be no other DL-Lite_A^(k) TBox T' ⊆ ES(O, DL-Lite_A^(k)) such that ES(T_i, DL-Lite_A^(k)) ⊂ ES(T', DL-Lite_A^(k)) ⊆ ES(O, DL-Lite_A^(k)).
To explain why a TBox in MaxSub_ES(ES(O, DL-Lite_A^(k))) does not necessarily also satisfy the second condition in Definition 2, we refer to the ontology O = {∃R− ⊑ B, A ⊑ ∃R.B, (funct R)} of Example 2. It is easy to see that ES(O, DL-Lite_A^(1)) = {∃R− ⊑ B, A ⊑ ∃R.B, (funct R), ∃R ⊑ ∃R.∃R−, ∃R ⊑ ∃R.B, A ⊑ ∃R.∃R−, ∃R− ⊑ ∃R−.∃R, A ⊑ ∃R}, and that clashes((funct R), ES(O, DL-Lite_A^(1))) = {A ⊑ ∃R.B, ∃R ⊑ ∃R.∃R−, ∃R ⊑ ∃R.B, A ⊑ ∃R.∃R−, ∃R− ⊑ ∃R−.∃R}. Hence, by following the procedure described above, we obtain the following two TBoxes in MaxSub_ES(ES(O, DL-Lite_A^(1))): T1 = {∃R− ⊑ B, A ⊑ ∃R.B, ∃R ⊑ ∃R.∃R−, ∃R ⊑ ∃R.B, A ⊑ ∃R.∃R−, ∃R− ⊑ ∃R−.∃R, A ⊑ ∃R}, and T2 = {∃R− ⊑ B, (funct R), A ⊑ ∃R}. However, it is clear that only T2 ∈ Apx_MAX(O, DL-Lite_A^(1)). In fact, we have that ES(T1, DL-Lite_A^(1)) ⊂ ES(T2, DL-Lite_A^(1)).
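The enumeration of MaxSub_ES(ES(O, DL-Lite_A^(k))) described above (retract, for each clashing functionality assertion, either the assertion itself or its clash set) can be sketched in Python as follows. This is a minimal illustrative sketch, not the authors' implementation: it assumes assertions are hashable values and that the clash sets have been precomputed; any non-maximal candidates it produces are filtered out later by isApx.

from itertools import product

def max_subsets_es(es, clashing_functs, clashes_of):
    # es: the entailment set, as a set of assertion objects.
    # clashing_functs: the set F of functionality assertions with clashes.
    # clashes_of: a precomputed mapping phi -> clashes(phi, es).
    fs = list(clashing_functs)
    candidates = set()
    # Each clash is resolved in one of two ways, hence at most 2^|F| TBoxes.
    for choice in product((True, False), repeat=len(fs)):
        tbox = set(es)
        for keep_funct, phi in zip(choice, fs):
            if keep_funct:
                tbox -= clashes_of[phi]   # keep phi, drop the clashing assertions
            else:
                tbox.discard(phi)         # drop phi, keep the clashing assertions
        candidates.add(frozenset(tbox))
    return candidates                     # syntactic duplicates are merged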

We provide the algorithm isApx which, given a TBox T and an ontology O, returns true if T ∈ Apx_MAX(O, DL-Lite_A^(k)), and false otherwise.

Algorithm 1: isApx(T, O)
Input: a DL-Lite_A^(k) TBox T, a satisfiable OWL 2 ontology O
Output: true or false
begin
  E ← ES(T, DL-Lite_A^(k));
  S ← ES(O, DL-Lite_A^(k)) \ E;
  foreach α ∈ S
    if T ∪ {α} is a DL-Lite_A^(k) TBox then return false;
  foreach functionality assertion φ ∈ E
    E ← E \ clashes(φ, E);
  foreach functionality assertion φ ∈ S
    if ES(E \ clashes(φ, E), DL-Lite_A^(k)) = ES(T, DL-Lite_A^(k)) then return false;
  return true;
end

The algorithm proceeds as follows. Given a satisfiable OWL 2 ontology O and a DL-Lite_A^(k) TBox T, it first computes their entailment sets in DL-Lite_A^(k), and then computes the set S containing all the assertions in the entailment set of O that are not in the entailment set of T. Then, for every assertion α in S, it attempts to add α to T without violating any syntactic restriction. If such an assertion exists, then T is not an approximation in DL-Lite_A^(k) of O. If not, the algorithm continues, fixing E as the entailment set of T in DL-Lite_A^(k), and resolving every violation of the syntactic restriction in E by removing clashes(φ, E) from E for every functionality assertion φ in E. This operation guarantees that E is a DL-Lite_A^(k) TBox. Then the algorithm checks, for every functionality assertion φ in S, whether the set obtained from E by removing clashes(φ, E) is logically equivalent to T. If this is the case, then it is possible to build a DL-Lite_A^(k) TBox T'' by adding φ to the set of assertions obtained in this fashion; T'' is a DL-Lite_A^(k) TBox that proves that T does not satisfy the second condition of Definition 2. Otherwise, the algorithm terminates by returning true. The theorem below establishes termination and correctness of Algorithm 1.
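A runnable counterpart of Algorithm 1 might look as follows. This is a sketch under explicit assumptions, not the authors' code: es (entailment set computation), is_valid_tbox (the syntactic restriction check), functs (selection of functionality assertions), and clashes are all assumed to be available as oracles over hashable assertion objects.

def is_apx(tbox, onto, es, is_valid_tbox, functs, clashes):
    e = set(es(tbox))
    s = set(es(onto)) - e
    # Condition (i) of Definition 2: no missing entailed assertion may be
    # addable to T without breaking the syntactic restriction.
    for alpha in s:
        if is_valid_tbox(frozenset(tbox) | {alpha}):
            return False
    # Turn E into a valid DL-Lite_A^(k) TBox by resolving functionality clashes.
    for phi in list(functs(e)):
        e -= clashes(phi, e)
    # Condition (ii): it must not be possible to add a missing functionality
    # assertion to a set logically equivalent to T.
    for phi in functs(s):
        if set(es(frozenset(e - clashes(phi, e)))) == set(es(frozenset(tbox))):
            return False
    return True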

Theorem 2. Let O be a satisfiable OWL 2 ontology, and let T be a DL-Lite_A^(k) TBox. isApx(T, O) terminates, and returns true if and only if T ∈ Apx_MAX(O, DL-Lite_A^(k)).

Given a satisfiable OWL 2 ontology O, Lemma 1 and Theorem 2 suggest Algorithm 2 for computing, up to logical equivalence, the set Apx_MAX(O, DL-Lite_A^(k)). The following theorem establishes termination and correctness of Algorithm 2.

Theorem 3. Let O be a satisfiable OWL 2 ontology. Then computeApx(O) terminates and computes, up to logical equivalence, Apx_MAX(O, DL-Lite_A^(k)).

As expected, Algorithm 2 does not return a single TBox, but a set of TBoxes. For application purposes, the approximation to be used must be chosen from this set. A pragmatic approach is to choose one non-deterministically; alternatively, one could leave this choice to the end user, according to the use they intend to make of it. A more interesting direction could be to identify a unique TBox by applying some preference criteria to the set returned by Algorithm 2.

Algorithm 2: computeApx(O)
Input: a satisfiable OWL 2 ontology O
Output: a set of DL-Lite_A^(k) TBoxes
begin
  S ← MaxSub_ES(ES(O, DL-Lite_A^(k)));
  foreach T_i ∈ S
    if isApx(T_i, O) = false then S ← S \ {T_i};
  return S;
end
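Composed with the two earlier sketches, Algorithm 2 amounts to a filter over the candidate TBoxes. Again a hypothetical sketch, reusing max_subsets_es and is_apx and the same assumed oracles:

def compute_apx(onto, es, is_valid_tbox, functs, clashes):
    entailment_set = frozenset(es(onto))
    # F: functionality assertions in the entailment set that have clashes.
    f = [phi for phi in functs(entailment_set) if clashes(phi, entailment_set)]
    candidates = max_subsets_es(
        entailment_set, f,
        {phi: frozenset(clashes(phi, entailment_set)) for phi in f})
    # Keep only the candidates that satisfy both conditions of Definition 2.
    return {t for t in candidates
            if is_apx(t, onto, es, is_valid_tbox, functs, clashes)}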

5 Efficient entailment set computation in DL-Lite

In the previous sections we have provided an algorithm for computing the set Apx_MAX(O, DL-Lite_A^(k)) of TBoxes for an OWL 2 ontology O. This computation is clearly intractable. Indeed, it requires computing the set ES(O, DL-Lite_A^(k)); moreover, due to the syntactic restriction enforced in DL-Lite_A^(k), in the worst case the cardinality of the set Apx_MAX(O, DL-Lite_A^(k)) is exponential in the number of functionality assertions in ES(O, DL-Lite_A^(k)). In particular, the computation of ES(O, DL-Lite_A^(k)) is in general very costly, as highlighted also in [3] and [12], since it requires the invocation of reasoning services over an OWL 2 ontology O. This task is performed by invoking an OWL 2 oracle, which can be implemented by an OWL 2 reasoner.
A naive algorithm for computing ES(O, DL-Lite_A^(k)) is the one described in [12], in which one first computes the set Γ of DL-Lite_A^(k) TBox assertions which can be built over the signature Σ, and then, for each assertion α ∈ Γ such that O ⊨ α, adds α to ES(O, DL-Lite_A^(k)).
We now show how to optimize the computation of ES(O, DL-Lite_A^(k)) through a technique which, in practice, drastically reduces the calls to the OWL 2 oracle. In the computation of ES(O, DL-Lite_A^(k)), a large portion of the invocations of the OWL 2 oracle involve assertions in which a general concept ∃R1.⋯∃Rn.C involving a non-trivial existential role chain occurs. Empirical observation has brought to light the fact that this kind of general concept very often does not subsume any concept in O. Hence, all the invocations of the OWL 2 oracle involving these childless general concepts are useless. Therefore, at the base of our strategy is the identification of all these childless general concepts ∃R1.⋯∃Rn.C without invoking the OWL 2 oracle.
We use the function subsumed(S1, O), where S1 is a general concept (resp. general role, general attribute), which returns the set of concepts (resp. roles, attributes) S2 such that O ⊨ S2 ⊑ S1. This function is efficiently performed by the most commonly used OWL 2 reasoners, such as Pellet [15], Racer [8], FaCT++ [16], and HermiT [7]. Our technique calls, as the first step, for the computation of the classification of basic concepts, roles, and attributes, which is encoded into a directed graph in which the nodes represent the predicates of the ontology and the edges the inclusion assertions.
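This first step can be pictured as follows: a hypothetical sketch (not the implementation evaluated below) that builds the classification graph from the reasoner's subsumed() calls; the representations of predicates and of the oracle are assumptions.

def classification_graph(predicates, subsumed, onto):
    # One node per predicate; an edge p -> child records the entailed
    # inclusion child ⊑ p, obtained from the reasoner's classification.
    edges = {p: set() for p in predicates}
    for p in predicates:
        for child in subsumed(p, onto):  # one oracle call per predicate
            edges[p].add(child)
    return edges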

Ontology    DL fragment   #O.A. (k=1/2/3)            #N.A. (k=1/2/3)                Total time, ms (k=1/2/3)
Mouse       ALCI          6059 / 12611 / 19163       11018 / 40362 / 157738         8426 / 9955 / 12173
Pathway     EL            10191 / 11999 / 11999      14294 / 52374 / 204694         11975 / 16498 / 17553
Cognitive   SHIF(D)       56883 / 178381 / 474145    48006 / 541350 / 6461478       348892 / 1812511 / 6832865
Mammalian   AL+           7551 / 7551 / 7551         112898 / 413922 / 1618018      322527 / 350853 / 350853
Spatial     ALEHI         51065 / 82735 / 150195     47143 / 4541815 / 445019671    27827 / 52742 / 132807

Table 1: Evaluation of the optimized algorithm for computing ES(O, DL-Lite_A^(k)).

After this initial step, the remaining invocations, which we work to minimize, are those needed for computing the entailed inclusion assertions involving general concepts ∃R1.⋯∃Rn.C, and the entailed disjointnesses. Regarding the former, we exploit the graph encoding of the concept, role, and attribute classification to issue these subsumption checks in an order that follows the hierarchy of these general concepts, so as to skip those checks whose outcome is already determined. Consider, for example, an ontology O that entails the inclusions A1 ⊑ A2 and P1 ⊑ P2, where A1 and A2 are concepts and P1 and P2 are roles. Exploiting these inclusions, we can deduce the hierarchical structure of the general concepts that can be built on these four predicates. For instance, we know that ∃P2.A2 ⊑ ∃P2, that ∃P2.A1 ⊑ ∃P2.A2, that ∃P1.A1 ⊑ ∃P2.A1, and so on. We begin by invoking the OWL 2 oracle, asking for the children of the general concepts which are in the highest position in the hierarchy. So, first we call subsumed(∃P2, O). If subsumed(∃P2, O) = ∅, we avoid invoking the oracle for subsumed(∃P2.A2, O), and so on. Regarding the entailed disjointnesses, we follow the same procedure, but beginning from the lowest positions in the hierarchy. The algorithm concludes by asking the OWL 2 oracle for all functionality assertions entailed by O.
In Table 1 we present a sample of the evaluation tests for this strategy, performed through a Java-based implementation of this technique. The table reports the number of invocations of the OWL 2 oracle performed with our optimization (#O.A.) and without it (#N.A.) for computing the entailment set of the ontologies in DL-Lite_A^(k), with 1 ≤ k ≤ 3. It also reports, for each ontology, the total computation time of ES(O, DL-Lite_A^(k)).
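The traversal just described can be sketched as follows (hypothetical code, not the Java implementation used in the evaluation): starting from the most general concepts, an oracle call is made only when the parent is known to subsume something, and a visited set guards against re-exploring shared descendants.

def entailed_inclusions(roots, children_of, subsumed):
    # roots: the most general general-concepts; children_of(c) enumerates
    # the general concepts directly below c in the deduced hierarchy;
    # subsumed(c) is one OWL 2 oracle invocation.
    inclusions = []
    stack = list(roots)
    visited = set()
    while stack:
        concept = stack.pop()
        if concept in visited:
            continue
        visited.add(concept)
        subs = subsumed(concept)           # one oracle invocation
        if not subs:
            continue                       # childless: prune the whole subtree
        inclusions.extend((s, concept) for s in subs)
        stack.extend(children_of(concept)) # descend only where needed
    return inclusions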

6 Conclusion

In this paper we address the problem of ontology approximation. We illustrate our approach to this problem and provide a comparison with other approaches proposed in the literature. We investigate in depth the approximation of OWL 2 ontologies with DL-Lite TBoxes for OBDA purposes, and provide an algorithm for its computation. Finally, we present a technique for the optimization of the core procedure of this computation, whose effectiveness we have shown through empirical evaluation. As future work, we plan to improve the performance of computing the approximation in DL-Lite of OWL 2 ontologies by adopting more sophisticated techniques. Moreover, we aim to study reasonable solutions for addressing the problem of multiple approximations of an ontology, in particular for those settings in which the approximation is used in OBDA.
Acknowledgments. This research has been partially supported by the EU under FP7 project Optique (grant n. FP7-318338), and by the EU under FP7-ICT project ACSI (grant no. 257593).

References

1. Franz Baader, Sebastian Brandt, and Carsten Lutz. Pushing the EL envelope further. In Proc. of OWLED 2008 DC, 2008.
2. Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter F. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation and Applications. Cambridge University Press, 2003.
3. Elena Botoeva, Diego Calvanese, and Mariano Rodriguez-Muro. Expressive approximations in DL-Lite ontologies. In Proc. of AIMSA 2010, pages 21–31, 2010.
4. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, Antonella Poggi, Mariano Rodriguez-Muro, Riccardo Rosati, Marco Ruzzi, and Domenico Fabio Savo. The Mastro system for ontology-based data access. Semantic Web J., 2(1):43–53, 2011.
5. Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, and Riccardo Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. J. of Automated Reasoning, 39(3):385–429, 2007.
6. Balakrishnan Chandrasekaran, John R. Josephson, and V. Richard Benjamins. What are ontologies, and why do we need them? Intelligent Systems and Their Applications, IEEE, 14(1):20–26, 1999.
7. Birte Glimm, Ian Horrocks, Boris Motik, Rob Shearer, and Giorgos Stoilos. A novel approach to ontology classification. J. of Web Semantics, 10(1), 2011.
8. Volker Haarslev and Ralf Möller. RACER system description. In Proc. of IJCAR 2001, volume 2083 of LNAI, pages 701–705. Springer, 2001.
9. Pascal Hitzler, Markus Krötzsch, Bijan Parsia, Peter F. Patel-Schneider, and Sebastian Rudolph, editors. OWL 2 Web Ontology Language: Primer. W3C Recommendation, 11 December 2012. Available at http://www.w3.org/TR/2012/REC-owl2-primer-20121211/.
10. Maurizio Lenzerini. Ontology-based data management. In Proc. of CIKM 2011, pages 5–6, 2011.
11. Boris Motik, Bijan Parsia, and Peter F. Patel-Schneider, editors. OWL 2 Web Ontology Language Structural Specification and Functional-Style Syntax. W3C Recommendation, 11 December 2012. Available at http://www.w3.org/TR/2012/REC-owl2-syntax-20121211/.
12. Jeff Z. Pan and Edward Thomas. Approximating OWL-DL ontologies. In Proc. of AAAI 2007, page 1434, 2007.
13. Antonella Poggi, Domenico Lembo, Diego Calvanese, Giuseppe De Giacomo, Maurizio Lenzerini, and Riccardo Rosati. Linking data to ontologies. J. on Data Semantics, X:133–173, 2008.
14. Bart Selman and Henry Kautz. Knowledge compilation and theory approximation. J. of the ACM, 43(2):193–224, 1996.
15. Evren Sirin, Bijan Parsia, Bernardo Cuenca Grau, Aditya Kalyanpur, and Yarden Katz. Pellet: a practical OWL-DL reasoner. Web Semantics: Science, Services and Agents on the World Wide Web, 5(2):51–53, 2007.
16. Dmitry Tsarkov and Ian Horrocks. FaCT++ description logic reasoner: System description. In Proc. of IJCAR 2006, pages 292–297, 2006.
17. Tuvshintur Tserendorj, Sebastian Rudolph, Markus Krötzsch, and Pascal Hitzler. Approximate OWL-reasoning with Screech. In Proc. of RR 2008, pages 165–180. Springer, 2008.
18. Holger Wache, Perry Groot, and Heiner Stuckenschmidt. Scalable instance retrieval for the semantic web by approximation. In Proc. of WISE 2005, pages 245–254. Springer, 2005.

Appendix J

Empirical Evaluation of the Ontology Approximation Module

In what follows, we present the experimental tests that have been performed to evaluate the efficiency of the optimization technique presented in the paper in Appendix I for the computation of the entailment set of an OWL 2 ontology in OWL 2 QL. The purpose of these tests is to compare the number of invocations of the OWL 2 reasoner that are executed when adopting the optimized approach presented in the paper and when adopting the naive approach [56]. The suite of ontologies used during testing contains more than twenty ontologies and was assembled from the Bioportal ontology repository1. The ontologies that compose this suite were selected to test the scalability of this approach both to larger ontologies and to ontologies formulated in more expressive languages. In Table J.1 we present the most relevant metrics of these ontologies. All tests were performed on a DELL Latitude E6320 notebook with an Intel Core i7-2640M 2.8 GHz CPU and 4 GB of RAM, running the Microsoft Windows 7 Premium operating system and Java 1.6 with 2 GB of heap space. The timeout was set at two hours, and execution was aborted if the maximum available memory was exhausted. In Table J.2 we present the results of the evaluation conducted using the Pellet reasoner; total computation times are in milliseconds. The results of the experimentation show that this technique is successful in reducing the number of invocations of the OWL 2 oracle for all tested ontologies.
The paper in Appendix H presents an empirical evaluation of the approximation approach presented in Section 5.5. By choosing a value for k different from |O_S|, the computation of the entailment set becomes easier. However, observing Algorithm 2, the number of times that this step must be repeated can grow very quickly: the number of sets of axioms in subset_k(O_S) is equal to the binomial coefficient of |O_S| over k, and therefore for large ontologies this number can easily become enormous, which in practice can be a critical obstacle to the computation of the k-approximation. For this reason, the experiments presented in the paper focus on comparing the GSA (the k-approximation with k = |O_S|) to the LSA (the k-approximation with k = 1). Furthermore, a syntactic sound approximation approach (SYNT) is included in the evaluation, in order to provide a standard baseline against which to compare the results of the GSA and LSA. This syntactic approximation consists in first normalizing the axioms in the ontology and then eliminating the ones that are not syntactically compliant with OWL 2 QL.
The suite of ontologies used during these tests was likewise assembled from the Bioportal ontology repository, and its ontologies were selected to test the scalability of our approaches both to larger ontologies and to ontologies formulated in more expressive languages. The implementation of the GSA and LSA used in these experiments adopts the optimization for the computation of the entailment set presented in the previous section. For the complete results of the experiments we refer the reader to Table 2 of the paper in Appendix H. An analysis of these results leads to the following observations. First of all, one can observe that it was possible to compute the GSA only for 26 out of the 41 tested ontologies. The LSA, instead, was always computed, was quicker than the GSA for all but one of the tested ontologies, and was overall very fast.
Secondly, it is interesting to compare the quality of the approximation that one can obtain through the LSA with that obtained through the GSA. This comparison answers the question of whether the ontology obtained through the LSA (the "LSA ontology") is able to capture a significant portion of the one obtained through the GSA (the "GSA ontology").

1 http://bioportal.bioontology.org/


Ontology                 DL fragment   Concepts  Roles  Attr.  Positive incl.  Negative incl.  Max depth  Max siblings
Vertebrate Anat.         SHIF          43        14     0      58              46              9          9
Protein                  ALCF(D)       45        50     133    455             69              6          8
Amino Acid               ALCF(D)       46        5      1      261             199             3          9
Biopax                   ALCF(D)       69        55     41     322             13              5          15
Patient Record           ALCHI         89        37     1      132             38              12         7
Cog Analysis             SHIF(D)       92        37     9      174             2               11         27
Diagnostic               ALCOF(D)      96        4      5      121             121             6          9
Pizza                    SHOIN         100       8      0      291             398             6          23
Time Event               ALCOIQ(D)     104       28     7      191             14              10         35
Spatial                  ALEHI         136       49     0      230             0               2          32
Translational Medicine   ALCIF(D)      225       18     6      259             23              12         14
Anatomical Ent.          EL            250       13     0      368             0               11         13
Physical Exer.           ALCHIF(D)     634       18     9      670             5               9          103
Pediatric                AL            886       0      3      893             0               9          50
Mouse Brain              ALCI          913       2      0      915             2525            10         28
Pathway                  EL            1186      2      0      1469            0               9          27
Cognitive Atlas          ALC           1701      6      0      4010            0               7          41
Mosquito Anat.           EL            1864      5      0      2732            0               13         103
Mouse Anatomy            EL+           3020      2      0      3827            0               17         128
Fly Tax.                 AL            6599      0      0      6587            0               35         207
Worm Anat.               EL            7201      12     0      12342           0               11         1025
Mammalian Phen.          AL+           9403      2      0      11095           0               15         40
Galen                    ALEHIF        23141     950    0      36964           0               25         1492
Gene                     SH            39160     12     0      73937           3               18         726
FMA 3.1                  ALCOIN(D)     83284     122    63     86941           0               21         219

Table J.1: Metrics of the ontologies selected for evaluation of the optimization technique for the computation of an entailment set. The Max depth and Max siblings fields indicate, respectively, the largest depth of a concept hierarchy in the ontology and the maximum number of children of any one concept.

Our tests in fact confirm that this is the case: out of the 26 ontologies for which we were able to compute the GSA, in only four cases does the LSA ontology entail less than 60 percent of the axioms of the GSA ontology, while in twenty cases it entails more than 90 percent of them. The average percentage of the axioms in the original ontologies entailed by the GSA ontologies is roughly 80 percent, and that of the axioms of the GSA ontologies entailed by the LSA ontologies is roughly 87 percent.
Furthermore, the LSA provides a good approximation even for those ontologies for which the GSA is not computable. In fact, Table 3 of the paper in Appendix H shows the percentage of axioms of the original ontology that are entailed by the LSA ontology. Out of the twelve ontologies for which we were able to obtain this value (the remaining three ontologies caused an "out of memory" error), only in three cases was it less than 60 percent, while in four cases it was higher than 80 percent. These results are particularly interesting with respect to those ontologies for which the GSA approach is not feasible due to their complexity, as is the case, for example, for the Dolce ontology, for Galen-A, and for the Ontology of physics for biology. Indeed, even though these ontologies are expressed in highly expressive DL languages, the structure of the axioms that compose them is such that reasoning on each of them in isolation does not lead to much worse approximation results than reasoning on the ontology as a whole: for the nine smallest ontologies in Table 3 of the paper in Appendix H, for which the GSA fails not because of the size of the ontology, the average percentage is 68.6.
Finally, both the GSA and LSA compare favorably against the syntactic sound approximation approach. In fact, the average percentages of axioms in the LSA and GSA ontologies that are entailed by the ontologies obtained through the SYNT approach are, respectively, roughly 90 percent and 72 percent. While the latter result is to be expected, the former is quite significant, even more so when one considers that the LSA is very fast. Indeed, a "gain" of 10 percent of axiom entailments by the LSA with respect to the SYNT in the case of large ontologies such as Galen and Snomed translates to tens of thousands of preserved axioms in very little computation time.
In conclusion, the results gathered from these tests corroborate the usefulness of both the global semantic approximation and the local semantic approximation approaches. The former provides a maximal sound approximation in the target language of the original ontology, and is in practice computable in a reasonable amount of time for the majority of the tested ontologies. The latter represents a less optimal but still very effective solution for those ontologies for which the GSA approach goes beyond the capabilities of the currently available ontology reasoners.


Ontology                 #O.A.I.   #N.A.I.    Total time (ms)
Vertebrate Anat.         1531      4358       1570
Protein                  34043     57461      19262
Amino Acid               730       1406       2126
Biopax                   37069     49788      13187
Patient Record           17029     25190      5838
Cog Analysis             10945     26886      5566
Diagnostic               1802      2223       1778
Pizza                    2309      4232       9199
Time Event               8766      19589      11020
Spatial                  51065     77143      27827
Translational Medicine   10407     20436      15795
Anatomical Ent.          11946     15547      5275
Physical Exer.           24645     51757      14922
Pediatric                2495      14293      2999
Mouse Brain              6059      11018      8426
Pathway                  10191     14294      11975
Cognitive Atlas          56883     88006      348892
Mosquito Anat.           63712     95011      4066457
Mouse Anatomy            39614     46302      2989335
Fly Taxonomy             18233     26396      582876
Worm Anat.               508805    875784     4565350
Mammalian Phen.          75551     112898     322527
Galen                    ——        95262614   timeout
Gene                     ——        2037652    timeout
FMA 3.1                  ——        41127815   timeout

Table J.2: Evaluation of the optimization algorithm for the computation of ES(O, OWL 2 QL) with the Pellet OWL 2 reasoner. #O.A.I. = number of Pellet reasoner invocations by the optimized algorithm; #N.A.I. = number of Pellet reasoner invocations by the non-optimized algorithm. The total time of the ES(O, OWL 2 QL) computation is expressed in milliseconds.

Appendix K

Optique demo: ISWC 2013 paper

This appendix reports the paper:

− E. Kharlamov, M. Giese, E. Jiménez-Ruiz, M. G. Skjæveland, A. Soylu, D. Zheleznyakov, T. Bagosi, M. Console, P. Haase, I. Horrocks, S. Marciuska, C. Pinkel, M. Rodriguez-Muro, M. Ruzzi, V. Santarelli, D. F. Savo, K. Sengupta, M. Schmidt, E. Thorstensen, J. Trame, and A. Waaler. Optique 1.0: Semantic Access to Big Data: The Case of Norwegian Petroleum Directorate’s FactPages. ISWC Demos 2013.

Optique 1.0: Semantic Access to Big Data⋆
The Case of Norwegian Petroleum Directorate's FactPages

E. Kharlamov1,⋆⋆, M. Giese2, E. Jiménez-Ruiz1, M. G. Skjæveland2, A. Soylu2, D. Zheleznyakov1, T. Bagosi3, M. Console5, P. Haase4, I. Horrocks1, S. Marciuska3, C. Pinkel4, M. Rodriguez-Muro3, M. Ruzzi5, V. Santarelli5, D. F. Savo5, K. Sengupta4, M. Schmidt4, E. Thorstensen2, J. Trame4, and A. Waaler2

1 University of Oxford, UK; 2 University of Oslo, Norway; 3 Free University of Bozen-Bolzano, Italy; 4 fluid Operations AG, Germany; 5 Sapienza Università di Roma, Italy

Abstract. The Optique project aims at developing an end-to-end system for semantic data access to Big Data in industries such as Statoil ASA and Siemens AG. In our demonstration we present the first version of the Optique system, customised for the Norwegian Petroleum Directorate's FactPages, a publicly available dataset relevant for engineers at Statoil ASA. The system provides different options, including visual ones, to formulate queries over ontologies and to display query answers. Optique 1.0 offers installation wizards that allow one to extract ontologies from relational schemata, extract and define mappings connecting ontologies and schemata, and align and approximate ontologies. Moreover, the system offers highly optimised techniques for query answering.

1 Introduction

Accessing the relevant data in Big Data scenarios is increasingly difficult both for end-users and IT-experts, due to the volume, variety, velocity, and complexity dimensions of Big Data. This brings a high cost overhead in data access for large enterprises. For instance, in the oil and gas industry, engineers spend 30–70% of their time gathering and assessing the quality of data. The Optique project1 [1, 2] advocates a next generation of the well-known Ontology-Based Data Access (OBDA) approach to address the data access problem. The project aims at solutions that reduce the cost of data access dramatically. In our demonstration we present the first version of the Optique system, which we customised for the Norwegian Petroleum Directorate's (NPD) FactPages.2
OBDA systems address the data access problem by presenting a general ontology-based and end-user oriented query interface over heterogeneous data sources. The core elements in a classical OBDA system are an ontology, describing the application domain, and a set of mappings, relating the ontological terms with the schemata of the underlying data sources. End-users formulate queries using the ontological terms and thus are not required to understand the structure of the data sources. These queries are then automatically translated, using the ontology and mappings, into executable code over the data sources. State-of-the-art OBDA systems, however, have shown, among others, the following limitations:

– The usability of OBDA systems is hampered by the need to use a formal query language. Even if users know the ontological vocabulary, they may find it difficult to formulate queries with several concepts and relationships.
– The prerequisites of OBDA, i.e., the ontology and mappings, are in practice expensive to obtain. Additionally, they are not static artefacts and should evolve according to the end-users' new information requirements.
– The efficiency of the translation process and of the execution of the queries is usually not sufficiently addressed in OBDA systems.

The first version of the Optique system, i.e., Optique 1.0, aims at partially overcoming the above limitations. Demonstration videos are available at the following address: http://www.cs.ox.ac.uk/isg/projects/Optique/demos/iswc2013/.

⋆ The research was supported by the FP7 grant Optique (n. 318338). ⋆⋆ Corresponding author: [email protected]
1 http://www.optique-project.eu/ 2 http://factpages.npd.no

Fig. 1. Left: General architecture of the Optique 1.0 system; Right: installation process

2 System Overview

A general three-layer architecture of the Optique system is depicted in Figure 1 (Left). The current version of the system offers two main functionalities: to query/visualise data and to install/maintain the ontology and the mappings. At the backend, the system also offers an efficient query processing mechanism. Optique 1.0 allows users to pose queries via a visual query formulation (VQF) interface, a SPARQL editor, or from a query catalog. VQF exploits reasoning in order to show both explicit and implicit domain knowledge to guide the formulation of the query. Queries are executed by the Query Answering module, which is based on the Ontop system.3 Ontop provides functionalities for rewriting SPARQL queries using the system's ontology and mappings, syntactic and semantic query optimisation, and query unfolding; thus, high efficiency of query answering is guaranteed. Rewritten and unfolded queries are in SQL and are executed over the NPD FactPages data, which is stored in a relational database. The query answers are converted into triples in order to conform to the format of the system's ontology, temporarily stored in the system's triple store, and displayed to the user in a tabular way or on maps (using OpenStreetMap).
The installation and maintenance of the ontology and the mappings is done via the Ontology and Mapping Management component. Currently, this component includes two installation wizards: basic and advanced. In Figure 1 (Right) we depict the workflows of the wizards. The basic wizard exploits the relational database metadata and automatically extracts an initial version of the ontology and direct mappings4 to the ontology entities. The advanced wizard, unlike the basic one, requires user intervention and an ontology vocabulary as input in order to (manually) create and edit R2RML mappings.5 Both the basic and advanced wizards provide functionalities to align the bootstrapped ontology with a state-of-the-art domain ontology and to approximate the resulting ontology if it is outside the desired OWL 2 QL profile.6 Alignment is performed using the ontology matching system LogMap,7 which has been shown to work well in practice and also includes mapping repair facilities.
Optique 1.0 is built on top of the Information Workbench8 (IWB), a generic platform for semantic data management. The IWB provides a shared triple store for managing the assets of Optique 1.0, such as ontologies, mappings, query logs, (excerpts of) query answers, database metadata, etc. The IWB also provides generic interfaces and APIs for semantic data management, e.g., ontology processing APIs. In addition to these backend data management capabilities, the IWB provides a flexible user interface which follows a semantic wiki approach, based on a rich, extensible pool of widgets for visualisation, interaction, mashup, and collaboration.
Finally, Optique 1.0 is customised for the NPD FactPages, a public, freely available dataset created to regulate and oversee the petroleum activities on the Norwegian Continental Shelf (NCS); it contains information collected from a wide range of activities on the NCS, e.g., operating companies, fields, discoveries, facilities, pipelines, and seismic surveys (both historic and current data). Its data has been converted and published as semantic web data [3], of which parts have been fed into the Optique 1.0 system.
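To make the idea of the bootstrapped direct mappings concrete, the following hypothetical sketch (not the platform's code) shows how a direct mapping in the spirit of the W3C specification turns the rows of a table into triples; the base URI and naming scheme are illustrative assumptions.

def direct_mapping_triples(table, primary_key, rows, base="http://example.org/"):
    # Each row becomes a URI-identified individual of a class named after the
    # table; each column value becomes a data property triple.
    cls = base + table
    for row in rows:
        subject = f"{base}{table}/{row[primary_key]}"  # row URI from its key
        yield (subject, "rdf:type", cls)
        for column, value in row.items():
            yield (subject, f"{base}{table}#{column}", value)

# Example: list(direct_mapping_triples("field", "id", [{"id": 42, "name": "Troll"}]))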

3 http://ontop.inf.unibz.it/
4 http://www.w3.org/TR/rdb-direct-mapping/
5 http://www.w3.org/2001/sw/rdb2rdf/r2rml/
6 http://www.w3.org/TR/owl2-profiles/
7 http://code.google.com/p/logmap-matcher/
8 http://www.fluidops.com/information-workbench/

Fig. 2. Optique 1.0 System, visual query formulation component

3 Demonstration Details

During the demonstration we will describe the NPD FactPages and present functionalities of the Optique 1.0 system, with the focus on the following aspects: query formulation and execution, and system installation. These aspects will be illustrated on the NPD FactPages data. For the query formulation we will stress our visual query formulation tool that currently supports construction of tree-shaped conjunctive SPARQL queries. The demonstrated queries will be from the oil industry domain. An example query is: “Find all fields that are operated by ’Statoil Petroleum AS’ and which have a facility that produces oil”; it can be seen in the screenshot of the VQF in Figure 2. We will run queries and present results both in tables and maps, e.g., the location of “Fields” and “Oil facilities” will be displayed on maps. Regarding the system’s installation, we will present both basic and advanced wizards and guide through their steps, that is, loading metadata, extraction of an ontology and mappings, alignment with the domain ontology, and approximation of the integrated ontology. We will also show how to edit extracted direct mappings and define new R2RML mappings.

References

1. M. Giese et al. "Scalable End-user Access to Big Data". In: Big Data Computing. Ed. by R. Akerkar. Chapman and Hall/CRC, 2013.
2. E. Kharlamov et al. "Optique: Towards OBDA Systems for Industry". In: ESWC post-proceedings volume: Best Workshop Papers. 2013.
3. M. G. Skjæveland, E. H. Lian, and I. Horrocks. "Publishing the Norwegian Petroleum Directorate's FactPages as Semantic Web Data". In: The Semantic Web – ISWC 2013. Ed. by H. Alani et al. Vol. 8219. LNCS. 2013.

Appendix L

Optique demo: 2014 (submitted)

This appendix reports the paper:

− E. Kharlamov, M. Giese, P. Haase, E. Jiménez-Ruiz, C. Pinkel, M. G. Skjæveland, A. Soylu, J. Trame, D. Zheleznyakov, C. Binnig, E. Bjørge, I. Horrocks, and A. Waaler. Towards Ontology Based Data Access for Statoil. (Submitted).

Towards Ontology Based Data Access for Statoil

E. Kharlamov1, M. Giese2, P. Haase3, E. Jiménez-Ruiz1, C. Pinkel3, M. G. Skjæveland2, A. Soylu2, J. Trame3, D. Zheleznyakov1, C. Binnig4, E. Bjørge5, I. Horrocks1, A. Waaler2
1 University of Oxford; 2 University of Oslo; 3 fluid Operations AG; 4 DHBW Mannheim; 5 Statoil ASA

Abstract—Ontology Based Data Access (OBDA) is a prominent approach to provide end users with high-level access to data via an ontology that is 'connected' to the data via mappings. State-of-the-art OBDA systems, however, suffer from limitations restricting their applicability in industry. Existing solutions focus on separate critical components of OBDA systems, while, to the best of our knowledge, there is no end-to-end OBDA solution allowing one to deploy an OBDA system in an enterprise from scratch and to effectively maintain and use it. In particular, the development of the necessary prerequisites to deploy an OBDA system, i.e., ontologies and mappings, as well as end-user oriented query interfaces, is poorly addressed. The Optique platform provides an integrated end-to-end OBDA solution that addresses a number of practical challenges, including the ones above. During the demonstration the attendees can try the platform with preconfigured scenarios from the petroleum industry and music domain, and try its end-to-end functionality: from deployment to query answering.

I. INTRODUCTION

The growth of available information sources in enterprises requires new efficient methods for data access by domain experts, whose ability to understand and analyse data is at the core of making business decisions. Currently in Statoil1 and other data intensive companies there is a domination of centralised approaches, where an IT-expert translates information requests of domain experts into Extract-Transform-Load (ETL) processes to first integrate the data and then apply predefined analytical reporting tools [8]. At the same time, in many scenarios an interactive data exploration, where domain experts want to access and analyse available data sources directly, without involving IT-experts, is of high importance, and centralised approaches are too heavy-weight and inflexible to do the job [7, 14]. In the Optique project [14] we aim at developing a direct data access solution that, on the one hand, would provide such access to Statoil's Exploration and Production Data Store (EPDS) and, on the other hand, would be generic enough to be applied in similar industries.
Challenges in providing domain experts with direct data access include (i) the complexity of schemata, which can contain hundreds and thousands of tables, e.g., EPDS has 3,000 tables with about 37,000 columns, and (ii) the conceptual mismatch between the language and structures that the domain experts use to describe the data, and the way the data is described and structured by database schema languages. Indeed, schemata are often integrated from autonomously evolving systems (yielding schema complexity) that have been adapted over years to the purpose of the applications they underlie (EPDS was created 15 years ago) and not to the purpose of being intuitive for domain experts (yielding the conceptual mismatch). Further important practical challenges in providing direct data access are (iii) formal query languages, e.g., to access EPDS domain experts should be proficient in SQL, and (iv) data incompleteness. Regarding the latter challenge, a considerable amount of Statoil exploration data are results of measurements taken during exploration activities by different teams, and they are often incomplete and fragmented. Thus, a SQL encoding of a given information need over such data is inevitably complex; it may involve multiple data sources and tables referring to conceptually the same data stored in different places, e.g., queries over EPDS often contain thousands of words and have 50–200 joins.
Ontology Based Data Access (OBDA) [17] is a prominent, so-called virtual, approach to direct data access for end users, i.e., it provides an integration and access layer on top of databases while the data stays in the original stores. In OBDA, users and data are mediated by an ontology, a semantically rich conceptual model,2 and users formulate their information needs as queries over the ontology. These queries are enriched via logical reasoning over the ontology, translated into SQL over the underlying databases with the help of mappings (declarative specifications describing the relationship between the ontological vocabulary, e.g., 'wellbore' or 'oilfield', and the elements of the database schema), and finally executed over these databases automatically, without IT-expert intervention.
OBDA naturally addresses three of the four challenges above: in contrast to a DB schema, an ontology describes a domain of interest rather than the structure of a DB, and does so on a high level of abstraction, in terms that are clear to domain experts, thus avoiding the complexity of DB schemata of Challenge (i) and the conceptual mismatch of Challenge (ii). Challenge (iv) can be addressed with the help of both mappings and ontologies. Indeed, an ontology is written in a logic-based formal language, e.g., the W3C Web Ontology Language (OWL 2), as a set of OWL 2 axioms, and admits logical reasoning that allows one to enrich ontological queries and address the incompleteness;3 mappings, in turn, can relate one ontological term, e.g., 'wellbore', to many different parts of a DB.
State-of-the-art OBDA systems, despite success stories in various scenarios, e.g., [5], have a number of important limitations. To the best of our knowledge, there is no end-to-end OBDA solution providing both IT-experts and end users with the necessary tools to deploy an OBDA system in an enterprise from scratch, and to effectively maintain and use it. The existing solutions, e.g., [4, 5, 18, 19], focus on separate critical components of OBDA systems, and do not offer sufficient support for the end-user oriented query formulation discussed in Challenge (iii), nor for the deployment and maintenance of OBDA systems, which is in practice expensive.

1 The Norwegian oil and gas company, http://www.statoil.com/. 2 Ontologies are common in many areas, e.g., medicine, Semantic Web [11]. 3 Many efficient off-the-shelf reasoning tools are available, e.g., [9, 19].

Fig. 2: Semi-automatic deployment approach

Fig. 1: General architecture of the Optique system

In the Optique project we have been developing an end-to-end OBDA system that satisfies Statoil requirements and addresses a number of practical challenges, including Challenges (i)–(iv) above. Our solution integrates in a unified platform a number of existing and novel components and allows one to deploy, maintain, and use the Optique platform in enterprises. Among the novel components, there is a deployment and maintenance module allowing one to extract ontologies and mappings from relational databases in a semi-automatic fashion, to integrate preexisting ontologies in an existing OBDA deployment instance, and to edit mappings. We also developed a component for query formulation support that relies on novel techniques of projecting ontologies on graphs, as well as components to visualise and browse query answers. We evaluated the platform with Statoil over EPDS and with other real-world databases, with encouraging results [22].
In this demonstration we show the platform's end-to-end functionality with the focus on its novel components. We demonstrate the platform on two preconfigured scenarios and allow the attendees to deploy it over either of the two databases underlying the scenarios, improve the deployment using the mapping editor, query the resulting deployment, and see and browse query answers on maps, in tables, etc. Our first demo scenario is inspired by Statoil data:4 we demonstrate the system on the NPD FactPages database [20], publicly available data about petroleum activities on the Norwegian Continental Shelf that overlaps with EPDS. Since understanding NPD FactPages requires basic knowledge about the petroleum domain, we also demonstrate the platform over a large open music encyclopedia, MusicBrainz [1]. The demo video illustrating the functionalities of our system and the demo system itself are available in [2].5

II. OVERVIEW OF THE OPTIQUE PLATFORM

The general three-layer architecture of the Optique platform is illustrated in Figure 1. To deploy the platform over a relational DB, one can use its tools to extract ontologies and mappings from the DB, incorporate external ontologies, and edit and author mappings of the resulting OBDA instance. After the system is deployed, the underlying DB can be queried using our visual query formulation tool that allows one to compose queries by navigation over the system's ontology. Visually formulated queries are translated into SPARQL and sent to the query transformer for processing: query enrichment using the ontology and further unfolding with the mappings into SQL queries. We rely on the Ontop [19] query transformer, which is an integral component of the Optique platform. SQL queries are executed over the data sources underlying the system by the DBMSs of the sources. We offer a number of templates and widgets such as tables, timelines, maps, charts, etc., depending on the data modalities, to visualise and browse the resulting query answers (see two screenshots of the platform's answer visualisation in the bottom of Figure 3). The integration of the Optique platform is based on the Information Workbench [10], a generic and extensible platform for semantic data management, providing the platform with many base components, including interfaces and APIs, as well as a triple store for managing ontologies, mappings, query logs, (excerpts of) query answers, DB metadata, etc.

III. DEPLOYMENT AND MAINTENANCE

The Optique platform provides semi-automatic support for deployment, which is schematically depicted in Figure 2. The platform supports different deployment scenarios. For example, one can start with bootstrapping, i.e., a semi-automatic extraction of an ontology and mappings from the database. Then, one can import a pre-existing ontology and 'connect' it to the bootstrapped one via alignment with our LogMap ontology alignment system [12], which we have been developing during the last four years. This scenario can be applied, e.g., when the database schema or some of its fragments have a good correspondence with the domain of interest, or when the available pre-existing ontology has a limited coverage of the domain of interest. Another possible scenario is to layer a pre-existing ontology directly over the database, i.e., to 'connect' it to the database schema with semi-automatically generated mappings. This scenario can address the case when there are several good ontologies available and they can serve as entry points to data for users with potentially different needs. The Optique platform supports ontologies expressible in the OWL 2 QL profile of the OWL 2 ontology language, which was specifically designed for efficient data access. Imported or layered OWL 2 ontologies that cannot be captured in OWL 2 QL are automatically approximated in OWL 2 QL using the technique of [6]. For mapping maintenance the platform offers a novel mapping editor. We now discuss Optique's modules in detail.
Mapping Bootstrapping Module automatically extracts so-called direct mappings by relying on and extending the W3C specification, i.e., it extracts an ontological vocabulary from a relational schema, and it extracts mappings relating this vocabulary to the schema via SQL queries. The vocabulary consists of one class for each table that is not many-to-many, one property for each attribute and each many-to-many table, and a special property for each foreign key (FK). The bootstrapped mappings are similar to view definitions, but serve a different purpose and are technically more involved. In particular, mappings address a so-called impedance mismatch problem: ontologies are object oriented and objects are identified by URIs, while relational DBs contain tuples of values. We implemented several strategies to generate URIs for tuples that rely on heuristics as well as on DB constraints; e.g., primary and foreign keys guarantee coherent URI generation for tuples from different tables. Figure 3 contains a screenshot of one of our bootstrapping wizards.
Ontology Bootstrapping Module enriches the bootstrapped vocabulary with OWL 2 axioms extracted from databases and implements a number of novel ontology bootstrapping techniques that are both schema driven (i.e., transforming explicit and implicit database constraints into ontological axioms) and data driven. For example, we turn FKs into OWL 2 axioms of domain and range restrictions on properties. Here we rely on FKs that are explicit in schemata and on candidate FKs that we derive by checking containment between attribute values in different tables. Computation of implicit FKs is motivated by our observation that in EPDS some FKs are not specified. We proposed several techniques to induce OWL 2 class and property hierarchies, as well as disjointness axioms, over bootstrapped vocabularies. For example, by checking that all attribute names of a table T1 occur in the attributes of a table T2, we create a candidate OWL 2 class inclusion axiom saying that the class C[T1] corresponding to T1 is a subclass of C[T2]. We verify these candidate axioms by looking at the data instances and return a ranked list of axioms. By looking at the common set of attributes A between similar tables T1 and T2 (we have several notions of similarity), we induce a candidate class C[A] corresponding to A and axioms stating that C[T1] and C[T2] are subclasses of C[A]; the user should then assign a meaningful name to C[A]. By looking at tables T1 and T2 with similar structure but non-overlapping tuples, we induce candidate disjointness axioms between C[T1] and C[T2]. Each primary key that is not null and each unique attribute is represented as a functional property axiom. We developed a number of other techniques that we do not present here due to space limits. After the bootstrapper computes the set of all candidate axioms, it checks the set for logical consistency using the ontology reasoner HermiT [9], repairs the ontology if it is inconsistent, and presents the remaining candidate axioms to the users for verification, i.e., the user can edit, accept, or discard candidate axioms.
Ontology Importing Module allows one to incorporate an existing ontology O1 in the system by aligning it with the ontology O2 already used by the system, e.g., with the bootstrapped one. Alignment introduces subclass and equivalence axioms between O1's and O2's classes and properties. Based on our experience with bootstrapping and importing for EPDS, we extended LogMap so that it guarantees that the resulting aligned ontology does not violate the so-called conservativity principle wrt the vocabulary of O2, i.e., it does not add (potentially) undesired inclusions among O2's classes [21].
Ontology Layering Module offers layering of an input ontology over an input DB schema, resulting in a set of direct mappings between the ontology and the schema, by relying on the IncMap system [16] that we developed for the Optique platform. The module represents the ontology and schema as graphs 'preserving' their structure, computes ranked correspondences between elements of the graphs using lexical and structural similarities as in the Similarity Flooding algorithm of Melnik et al., converts the correspondences into direct mappings between the ontology and schema, and finally offers the mappings to the user for verification. Differently from bootstrapping and importing, layering can map user-specified fragments of DB schemata to user-specified fragments of ontologies.
R2RML Mapping Editor of the Optique platform is tailored towards W3C R2RML mappings, of which direct mappings are a special case, and was evaluated with encouraging results [15]. It provides an intuitive mapping visualisation, semi-automatic suggestions of mapping corrections, and step-by-step wizards for writing complex (non-direct) mappings.

IV. QUERY FORMULATION

The visual query formulation component of the platform, OptiqueVQS [23], allows one to compose conjunctive tree-shaped queries by navigation over ontologies, has a widget-based architecture, and exploits multiple representation and interaction paradigms for query composition. Ontologies are object oriented, and data conforming to an ontology is a set of statements of the form 'wellbore(uri123)', 'locatedIn(uri123,uri456)', and 'oilfield(uri456)', which are instantiations of classes and properties with (URIs representing) objects. These data can be seen as a data-graph where nodes correspond to objects and are labeled with classes, while edges correspond to properties and are labeled with property names. Data-graphs are often enriched with extra nodes and edges to encode class and property hierarchies; thus, they can partially include information from ontological axioms. Furthermore, it is common to design query formulation interfaces over an ontology by visualising (relevant fragments of) its data-graph, so that the query formulation process boils down to navigation through the data-graph. For OBDA, where the ontological data is virtual and the user has access only to the ontological axioms, data-graph driven query interfaces are not appropriate. Thus, we developed novel techniques to 'project' axioms, rather than data, into a graph structure, an axiom-graph: an OWL 2 axiom is projected into a set of nodes and edges relating them, where nodes correspond to classes and edges to properties [3]. Projecting axioms to graphs is not a trivial task, since axioms are first-order logic formulae and do not have an immediate correspondence to graphs. In OptiqueVQS query construction is iterative, i.e., users construct queries step-by-step, and it boils down to navigation over an axiom-graph. At the moment the system supports axiom-graphs encoding those types of axioms which can be bootstrapped by the deployment module, including class hierarchies and domain and range restrictions. E.g., consider two axioms saying that the classes 'wellbore' and 'oilfield' are respectively a domain and a range of a property 'locatedIn'; then the corresponding axiom-graph contains two nodes, labeled respectively with 'wellbore' and 'oilfield', and one edge connecting these nodes, labeled with 'locatedIn'. In Figure 3 there is a screenshot of OptiqueVQS, where in the upper part there is a query constructed by the user and in the lower-left part there is a fragment of the axiom-graph relevant to the constructed query. An important feature of OptiqueVQS is that it does not require storing the axiom-graph: during each query construction session we compute (using logical reasoning with HermiT) the relevant fragments of the axiom-graph on-the-fly and present them to the user. As we observed in our user studies [22], a purely axiom driven query interface suffers from important practical limitations, e.g., it does not allow users to set specific data values in queries, e.g., company names. To address this issue we enrich axiom-graphs with data annotations which we precompute, i.e., materialise, from the DBs underlying a given OBDA deployment instance by 'executing' relevant mappings. E.g., for EPDS we precomputed values that are frequently used, rarely changed, and from relatively small domains; this includes names of companies and oilfields, geolocations, and ranges of numerical values, e.g., min/max possible depth of wellbores. Data values are visualised using sliders, drop boxes, etc.; see the lower-right part of the interface in Figure 3.

4 Due to privacy we cannot demo our solution on Statoil's corporate data. 5 A very preliminary version of the Optique platform was presented in [13].
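As a toy illustration of the attribute-containment heuristic described for the Ontology Bootstrapping Module above, the following hypothetical sketch (not the platform's code) emits candidate class inclusion axioms from table signatures; real bootstrapping would additionally rank and verify the candidates against the data.

def candidate_subclass_axioms(tables):
    # tables: a mapping from table names to sets of attribute names.
    # If every attribute of T1 also occurs among the attributes of T2,
    # emit the candidate axiom C[T1] subClassOf C[T2].
    candidates = []
    for t1, attrs1 in tables.items():
        for t2, attrs2 in tables.items():
            if t1 != t2 and attrs1 <= attrs2:
                candidates.append((f"C[{t1}]", "subClassOf", f"C[{t2}]"))
    return candidates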

Fig. 3: Screenshots of the Optique platform: the bootstrapping wizard over the NPD FactPages schema (with options such as 'Suggest new super classes?' and 'Compact ontology?', where columns with the same name are represented with a unique property), the visual query interface, and query answers in table and map views.

V. DEMONSTRATION SCENARIO

We will demonstrate the Optique platform end-to-end, i.e., from deployment to query answering, over two databases. Moreover, for these databases we prepared OBDA deployments with fine-tuned ontologies and mappings, and the attendees of the demo will be able to formulate queries over these deployments, load queries from query catalogs, execute queries, and browse query answers on maps and in tables. We next describe the demonstration scenarios in detail.
Demonstration on NPD FactPages. One deployment of the Optique platform is made over the NPD FactPages [20], an important public dataset heavily used in the oil and gas industry. This DB has 70 tables, 276 different attributes, 96 foreign keys, and about 50 MB of mostly aggregated data, e.g., seismic surveys. The choice of this demo database was motivated by its importance for the oil and gas industry and by our work with Statoil within the Optique project. This deployment was tested by Statoil engineers, who gave us positive feedback. Usage of this deployment requires basic knowledge of geophysics.

Demonstration on MusicBrainz Database. The other deployment of the Optique platform is made over the MusicBrainz database [1], an open music encyclopaedia that contains information about roughly 830,000 artists, 1.2 million releases, and 13.2 million recordings. This domain does not require any special knowledge, so it is easy for anyone to use. This deployment was tested by students, and the results were encouraging, e.g., students were able to accomplish query formulation tasks using OptiqueVQS.

End-to-End Demonstration. Besides querying the two pre-configured deployments, the demo attendees will be able to deploy the Optique platform over both the NPD FactPages and MusicBrainz databases and then query their own deployments. Specifically, one will be able to bootstrap an ontology and mappings from these databases, either (i) in a 'simple' mode, suitable for inexperienced users, with bootstrapping performed in an automatic regime using default parameters, or (ii) in a step-by-step mode that allows a user to tune the deployment parameters, e.g., to discover implicit database constraints and propagate them to the bootstrapped ontology. Then, one will be able to import pre-existing ontologies from our ontology catalogue. This can be performed in two ways: (i) after the bootstrapping, in which case the imported and bootstrapped ontologies will be aligned, or (ii) using our ontology layering component, thus skipping the bootstrapping step. Moreover, the attendees of the demo will be able to manually edit mappings with our mapping editor. Finally, they will be able to query the resulting deployments and browse query answers.

VI. REFERENCES

[1] URL: http://musicbrainz.org/statistics.
[2] URL: http://fact-pages.fluidops.net/resource/demoICDE.
[3] M. Arenas et al. Faceted Search over Ontology-Enhanced RDF Data. In: CIKM. 2014.
[4] C. Bizer and A. Seaborne. D2RQ: Treating Non-RDF Databases as Virtual RDF Graphs. In: ISWC. 2004.
[5] D. Calvanese et al. The MASTRO System for Ontology-Based Data Access. In: Semantic Web 2.1 (2011).
[6] M. Console et al. Efficient Approximation in DL-Lite of OWL 2 Ontologies. In: DL. 2013.
[7] J. Crompton. Keynote talk at the W3C Workshop on Semantic Web in Oil & Gas Industry. http://www.w3.org/2008/12/ogws-slides/Crompton.pdf. 2008.
[8] A. Doan et al. Principles of Data Integration. Morgan Kaufmann. 2012.
[9] B. Glimm et al. Optimising Ontology Classification. In: ISWC. 2010.
[10] P. Haase et al. The Information Workbench as a Self-Service Platform for Linked Data Applications. In: WWW. 2012.
[11] I. Horrocks. What Are Ontologies Good For? In: Evolution of Semantic Systems. Springer, 2013.
[12] E. Jiménez-Ruiz et al. Large-Scale Interactive Ontology Matching: Algorithms and Implementation. In: ECAI. 2012.
[13] E. Kharlamov et al. Optique 1.0: Semantic Access to Big Data: The Case of Norwegian Petroleum Directorate's FactPages. In: ISWC (Posters & Demos). 2013.
[14] E. Kharlamov et al. Optique: Towards OBDA Systems for Industry. In: ESWC (Satellite Events). 2013.
[15] C. Pinkel et al. How to Best Find a Partner? An Evaluation of Editing Approaches to Construct R2RML Mappings. In: ESWC. 2014.
[16] C. Pinkel et al. IncMap: Pay as You Go Matching of Relational Schemata to OWL Ontologies. In: OM. 2013.
[17] A. Poggi et al. Linking Data to Ontologies. In: J. Data Semantics 10 (2008).
[18] F. Priyatna et al. Formalisation and Experiences of R2RML-based SPARQL to SQL Query Translation Using Morph. In: WWW. 2014.
[19] M. Rodriguez-Muro et al. Ontology-Based Data Access: Ontop of Databases. In: ISWC. 2013.
[20] M. G. Skjæveland et al. Publishing the NPD FactPages as Semantic Web Data. In: ISWC. 2013.
[21] A. Solimando et al. Detecting and Correcting Conservativity Principle Violations in Ontology-to-Ontology Mappings. In: ISWC. 2014.
[22] A. Soylu et al. Experiencing OptiqueVQS: A Multi-paradigm and Ontology-based Visual Query System for End Users. Under review.
[23] A. Soylu et al. OptiqueVQS: Towards an Ontology-Based Visual Query System for Big Data. In: MEDES. 2013.

Appendix M

ISWC 2014: Ontology Alignment for Query Answering

This appendix reports the paper:

− Alessandro Solimando, Ernesto Jiménez-Ruiz, Christoph Pinkel. Evaluating Ontology Alignment Systems in Query Answering Tasks. In: Proceedings of the International Semantic Web Conference 2014 (poster paper)

Evaluating Ontology Alignment Systems in Query Answering Tasks

Alessandro Solimando¹, Ernesto Jiménez-Ruiz², and Christoph Pinkel³

¹ Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi, Università di Genova, Italy
² Department of Computer Science, University of Oxford, UK
³ fluid Operations AG, Walldorf, Germany

Abstract. Ontology matching has received increasing attention and gained importance in more recent applications such as ontology-based data access (OBDA). However, query answering over aligned ontologies has not been addressed by any evaluation initiative so far. A novel Ontology Alignment Evaluation Initiative (OAEI) track, Ontology Alignment for Query Answering (OA4QA), introduced in the 2014 evaluation campaign, aims at bridging this gap in the practical evaluation of matching systems w.r.t. this key usage.

1 Introduction

Ontologies play a key role in the development of the Semantic Web and are being used in many application domains, such as biomedicine and the energy industry. An application domain may have been modeled from different points of view and for different purposes. This situation usually leads to the development of different ontologies that intuitively overlap but use different naming and modeling conventions.

The problem of (semi-)automatically computing mappings between independently developed ontologies is usually referred to as the ontology matching problem. A number of sophisticated ontology matching systems have been developed in the last years [5]. Ontology matching systems, however, rely on lexical and structural heuristics, and the integration of the input ontologies and the mappings may lead to many undesired logical consequences. In [1], three principles were proposed to minimize the number of potentially unintended consequences, namely: (i) the consistency principle: the mappings should not lead to unsatisfiable classes in the integrated ontology; (ii) the locality principle: the mappings should link entities that have similar neighbourhoods; (iii) the conservativity principle: the mappings should not introduce alterations in the classification of the input ontologies. The occurrence of these violations is frequent, even in the reference mapping sets of the Ontology Alignment Evaluation Initiative (OAEI)⁴ [6].

Violations of these principles may hinder the usefulness of ontology mappings. The practical effect of these violations, however, is clearly evident when ontology alignments are involved in complex tasks such as query answering [4].

4 http://oaei.ontologymatching.org/

Fig. 1. Ontology Alignment in an OBDA Scenario (the QF-Ontology provides the vocabulary to formulate queries; the DB-Ontology is linked to the data and feeds the query evaluation engine)

The traditional tracks of the OAEI evaluate ontology matching systems w.r.t. scalability, multi-lingual support, instance matching, reuse of background knowledge, etc. Systems' effectiveness is, however, only assessed by means of classical information retrieval metrics (i.e., precision, recall and f-measure) w.r.t. a manually curated reference alignment provided by the organisers. The new OA4QA track⁵ evaluates those same metrics, but w.r.t. the ability of the generated alignments to enable answering a set of queries in an OBDA scenario where several ontologies exist. Figure 1 shows an OBDA scenario in which the first ontology provides the vocabulary to formulate the queries (QF-Ontology) and the second is linked to the data and is not visible to the users (DB-Ontology). Such an OBDA scenario is present in real-world use cases (e.g., the Optique project⁶ [2, 6]). The integration via ontology alignment is required since only the vocabulary of the DB-Ontology is connected to the data. The OA4QA track will also be key for investigating the effects of logical violations affecting the computed alignments, and for evaluating the effectiveness of the repair strategies employed by the matchers.

5 http://www.cs.ox.ac.uk/isg/projects/Optique/oaei/oa4qa/
6 http://www.optique-project.eu/

2 Ontology Alignment for Query Answering

This section describes the considered dataset and its extensions (Section 2.1), the query processing engine (Section 2.2), and the evaluation metrics (Section 2.3).

2.1 Dataset

The set of ontologies coincides with that of the conference track,⁷ in order to facilitate the understanding of the queries and of the query results. The dataset is, however, extended with synthetic ABoxes extracted from the DBLP dataset.⁸ Given a query q expressed using the vocabulary of an ontology O_1, another ontology O_2 enriched with synthetic data is chosen. Finally, the query is executed over the aligned ontology O_1 ∪ M ∪ O_2, where M is an alignment between O_1 and O_2. Referring to Figure 1, O_1 plays the role of the QF-Ontology, while O_2 plays that of the DB-Ontology.

7 http://oaei.ontologymatching.org/2014/conference/index.html
8 http://dblp.uni-trier.de/xml/

2.2 Query Evaluation Engine

The evaluation engine considered is an extension of the OWL 2 reasoner HermiT, known as OWL-BGP⁹ [3]. OWL-BGP is able to process SPARQL queries in the SPARQL-OWL fragment under the OWL 2 Direct Semantics entailment regime.¹⁰ The queries employed in the OA4QA track are standard conjunctive queries, which are fully supported by the more expressive SPARQL-OWL fragment. SPARQL-OWL, for instance, also supports queries where variables occur within complex class expressions or bind to class or property names.

9 https://code.google.com/p/owl-bgp/
10 http://www.w3.org/TR/2010/WD-sparql11-entailment-20100126/#id45013

2.3 Evaluation Metrics and Gold Standard

As already discussed in Section 1, the evaluation metrics used for the OA4QA track are the classic information retrieval ones (i.e., precision, recall and f-measure), but computed on the result sets of the query evaluation. In order to compute the gold standard for the query results, the publicly available reference alignment ra1 has been manually revised. The aforementioned metrics are then evaluated, for each alignment computed by the different matching tools, against both ra1 and a manually repaired version of ra1 that is free from conservativity and consistency violations. Three categories of queries will be considered in OA4QA: (i) basic queries, (ii) queries involving violations, and (iii) advanced queries involving nontrivial mappings.
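Spelled out, in our own notation rather than the track's: for a query q, let ans_M(q) be the result set obtained with a computed alignment M, and ans_gold(q) the gold-standard result set obtained with the (repaired) ra1; then

\[
P \;=\; \frac{|\mathrm{ans}_{M}(q) \cap \mathrm{ans}_{gold}(q)|}{|\mathrm{ans}_{M}(q)|},
\qquad
R \;=\; \frac{|\mathrm{ans}_{M}(q) \cap \mathrm{ans}_{gold}(q)|}{|\mathrm{ans}_{gold}(q)|},
\qquad
F \;=\; \frac{2PR}{P+R}.
\]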

2.4 Impact of the Mappings on the Query Results

As an illustrative example, consider the aligned ontology O_U computed using confof and ekaw as input ontologies (O_confof and O_ekaw, respectively) and the ra1 reference alignment between them. O_U entails ekaw:Student ⊑ ekaw:ConfParticipant, while O_ekaw does not, and therefore this represents a conservativity principle violation. Clearly, the result set for the query q(x) ← ekaw:ConfParticipant(x) will erroneously contain any student not actually participating in the conference. The explanation for this entailment in O_U is given below, where Axioms (1) and (3) are mappings from the reference alignment:

confof:Scholar ≡ ekaw:Student   (1)
confof:Scholar ⊑ confof:Participant   (2)
confof:Participant ≡ ekaw:ConfParticipant   (3)

The softening of Axiom (3) into confof:Participant ⊒ ekaw:ConfParticipant represents a possible repair for the aforementioned violation.
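The essence of this conservativity check can be illustrated with a minimal sketch over atomic subsumptions. This is an editorial simplification: axioms are plain (sub, super) pairs with equivalences already expanded in both directions, and a transitive closure stands in for the DL reasoning that a real system would perform with a reasoner such as HermiT.

def entailed_subsumptions(axioms):
    """Transitive closure of atomic subsumptions given as (sub, super) pairs."""
    succ = {}
    for sub, sup in axioms:
        succ.setdefault(sub, set()).add(sup)
    closure = set()
    for start in succ:
        seen, stack = set(), [start]
        while stack:
            c = stack.pop()
            for d in succ.get(c, ()):
                if d not in seen:
                    seen.add(d)
                    stack.append(d)
        closure.update((start, d) for d in seen)
    return closure

def conservativity_violations(o2_axioms, union_axioms, o2_vocab):
    """Subsumptions among O2 classes entailed by O1 ∪ M ∪ O2 but not by O2."""
    new = entailed_subsumptions(union_axioms) - entailed_subsumptions(o2_axioms)
    return {(a, b) for a, b in new
            if a in o2_vocab and b in o2_vocab and a != b}

# The confof/ekaw example: Axiom (2) plus the expanded equivalences (1) and
# (3) yield the violating entailment ekaw:Student ⊑ ekaw:ConfParticipant.
confof = [("confof:Scholar", "confof:Participant")]
mappings = [("confof:Scholar", "ekaw:Student"),
            ("ekaw:Student", "confof:Scholar"),
            ("confof:Participant", "ekaw:ConfParticipant"),
            ("ekaw:ConfParticipant", "confof:Participant")]
ekaw_vocab = {"ekaw:Student", "ekaw:ConfParticipant"}
print(conservativity_violations([], confof + mappings, ekaw_vocab))
# -> {('ekaw:Student', 'ekaw:ConfParticipant')}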

3 Preliminary Evaluation

In Table 1,¹¹ a preliminary evaluation using the alignments of the OAEI 2013 participants and the following queries is shown: (i) q1(x) ← ekaw:Author(x), over the ontology pair ⟨cmt, ekaw⟩; (ii) q2(x) ← ekaw:ConfParticipant(x), over ⟨confof, ekaw⟩, involving the violation described in Section 2.4; and (iii) q3(x) ← confof:Reception(x) ∪ confof:Banquet(x) ∪ confof:Trip(x), over ⟨confof, edas⟩.

                            Reference Alignment            Repaired Alignment
Category     Query   #M     #q(x)  Prec.  Rec.  F-meas.    #q(x)  Prec.  Rec.  F-meas.
Basic        q1      5      98     1      1     1          98     1      1     1
Violations   q2      4      53     0.8    1     0.83       38     0.57   1     0.68
Advanced     q3      7      -      -      -     -          182    1      0.5   0.67

Table 1. Preliminary query answering results for the OAEI 2013 alignments

The evaluation¹² shows the negative effect on precision of the logical flaws affecting the computed alignments (q2) and a lowering of recall due to missing mappings (q3). For q3, the results w.r.t. the reference alignment (ra1) are missing due to the unsatisfiability of the aligned ontology O_confof ∪ ra1 ∪ O_edas.

11 #q(x) refers to the cardinality of the result set.

4 Conclusions and Future Work

We have presented the novel OAEI track addressing query answering over pairs of ontologies aligned by a set of ontology-to-ontology mappings. The preliminary evaluation clearly exposed the main limits of the traditional evaluation with respect to logical violations of the alignments. As future work, we plan to cover increasingly complex queries and ontologies, including the ones in the Optique use case [6]. We also plan to consider more complex scenarios involving a single QF-Ontology aligned with several DB-Ontologies.

Acknowledgements. This work was supported by the EU FP7 IP project Optique (no. 318338), the MIUR project CINA (Compositionality, Interaction, Negotiation, Autonomicity for the future ICT society) and the EPSRC project Score!.

References

1. Jiménez-Ruiz, E., Cuenca Grau, B., Horrocks, I., Berlanga, R.: Logic-based Assessment of the Compatibility of UMLS Ontology Sources. J. Biomed. Semant. (2011)
2. Kharlamov, E., et al.: Optique 1.0: Semantic Access to Big Data: The Case of Norwegian Petroleum Directorate's FactPages. ISWC (Posters & Demos) (2013)
3. Kollia, I., Glimm, B., Horrocks, I.: SPARQL Query Answering over OWL Ontologies. In: The Semantic Web: Research and Applications, pp. 382-396. Springer (2011)
4. Meilicke, C.: Alignment Incoherence in Ontology Matching. Ph.D. thesis, University of Mannheim (2011)
5. Shvaiko, P., Euzenat, J.: Ontology Matching: State of the Art and Future Challenges. IEEE Transactions on Knowl. and Data Eng. (TKDE) (2012)
6. Solimando, A., Jiménez-Ruiz, E., Guerrini, G.: Detecting and Correcting Conservativity Principle Violations in Ontology-to-Ontology Mappings. In: International Semantic Web Conference (2014)

12 Out of the 26 alignments of OAEI 2013, only the ones counted in column #M were able to produce a result (the others failed either because of logical problems or because of an empty result set due to missing mappings). Reported precision/recall values are averaged values.