D4.2.1 State-Of-The-Art Survey on Ontology Merging and Aligning V1
Total Page:16
File Type:pdf, Size:1020Kb
EU-IST Integrated Project (IP) IST-2003-506826 SEKT SEKT: Semantically Enabled Knowledge Technologies D4.2.1 State-of-the-art survey on Ontology Merging and Aligning V1 Jos de Bruijn (DERI) Francisco Mart´ın-Recuerda (DERI) Dimitar Manov (SIRMA) Marc Ehrig (UKARL) Abstract. EU-IST Integrated Project (IP) IST-2003-506826 SEKT Deliverable D4.2.1 (WP4) This deliverable contains a comprehensive state-of-the-art survey on Ontology Merging and Aligning methods, tools and specification languages. Keyword list: Ontology Mediation, state-of-the-art survey, ontology merging, ontology aligning, ontology mapping WP4: Ontology Mediation Nature of the Deliverable: Report Dissemination level: PU Contractual date of delivery: 2004-06-30 Actual date of delivery: 2004-07-22 Copyright °c 2004 Digital Enterprise Research Institute, University of Innsbruck SEKT Consortium This document is part of a research project partially funded by the IST Programme of the Commission of the European Communities as project number IST-2003-506826. British Telecommunications plc. Empolis GmbH Orion 5/12, Adastral Park Europaallee 10 Ipswich IP5 3RE 67657 Kaiserslautern UK Germany Tel: +44 1473 609583, Fax: +44 1473 609832 Tel: +49 631 303 5540, Fax: +49 631 303 5507 Contactperson: John Davies Contactperson: Ralph Traphoner¨ E-mail: [email protected] E-mail: [email protected] Jozef Stefan Institute University of Karlsruhe, Institute AIFB Jamova 39 Englerstr. 28 1000 Ljubljana D-76128 Karlsruhe Slovenia Germany Tel: +386 1 4773 778, Fax: +386 1 4251 038 Tel: +49 721 608 6592, Fax: +49 721 608 6580 Contactperson: Marko Grobelnik Contactperson: York Sure E-mail: [email protected] E-mail: [email protected] University of Sheffield University of Innsbruck Department of Computer Science Institute of Computer Science Regent Court, 211 Portobello St. Techikerstraße 13 Sheffield S1 4DP 6020 Innsbruck UK Austria Tel: +44 114 222 1891, Fax: +44 114 222 1810 Tel: +43 512 507 6475, Fax: +43 512 507 9872 Contactperson: Hamish Cunningham Contactperson: Jos de Bruijn E-mail: [email protected] E-mail: [email protected] Intelligent Software Components S.A. Kea-pro GmbH Pedro de Valdivia, 10 Tal 28006 Madrid 6464 Springen Spain Switzerland Tel: +34 913 349 797, Fax: +49 34 913 349 799 Tel: +41 41 879 00, Fax: 41 41 879 00 13 Contactperson: Richard Benjamins Contactperson: Tom Bosser¨ E-mail: [email protected] E-mail: [email protected] Ontoprise GmbH Sirma AI EOOD (Ltd.) Amalienbadstr. 36 135 Tsarigradsko Shose 76227 Karlsruhe Sofia 1784 Germany Bulgaria Tel: +49 721 50980912, Fax: +49 721 50980911 Tel: +359 2 9768, Fax: +359 2 9768 311 Contactperson: Hans-Peter Schnurr Contactperson: Atanas Kiryakov E-mail: [email protected] E-mail: [email protected] Vrije Universiteit Amsterdam (VUA) Universitat Autonoma de Barcelona Department of Computer Sciences Edifici B, Campus de la UAB De Boelelaan 1081a 08193 Bellaterra (Cerdanyola del Valles)` 1081 HV Amsterdam Barcelona The Netherlands Spain Tel: +31 20 444 7731, Fax: +31 84 221 4294 Tel: +34 93 581 22 35, Fax: +34 93 581 29 88 Contactperson: Frank van Harmelen Contactperson: Pompeu Casanovas Romeu E-mail: [email protected] E-mail: pompeu.casanovasquab.es Changes Version Date Author Changes 0.1 2004-01-03 Jos creation 0.2 2004-02-08 Jos update to SEKT style 0.3 2004-04-07 Jos updates the evaluation framework 0.4 2004-06-03 Jos updates evaluation framework and adding scenarios and use cases 0.5 2004-06-09 Jos updates several descriptions of approaches 0.6 2004-06-17 Jos major update introduction and refinement of the de- scriptions of approaches 0.7 2004-06-25 Jos incorporating contributions from Sirma and UKARL 0.9 2004-06-28 Jos Version for QA 1.0 2004-07-19 Jos incorporated reviewer comments and proof-reading Executive Summary This report provides a state-of-the-art survey of Ontology Merging and Aligning methods, tools and techniques. We provide a framework for the evaluation of these approaches, as well as a catego- rization of approaches for ontology merging and aligning. We categorize the approaches in methods and tools, integration systems and specific techniques. We compare the approaches according to the evaluation framework and to a set of generic use cases for ontology mediation in order to evaluate the applicability of the approach to the ontology mediation problem on the Semantic Web. Contents 1 Introduction 3 1.1 Terminology ................................. 5 1.2 The Ontology Mapping Process ...................... 8 1.3 Ontology Mismatches ............................ 10 1.3.1 Ontology-level Mismatches .................... 10 1.3.2 Language-level mismatches .................... 12 1.4 One-to-one integration vs. Global integration . 13 1.5 Wrappers and Mediators .......................... 14 2 Motivational Use Cases 16 2.1 Generic Use Cases ............................. 16 2.1.1 Use Cases for Instance Mediation . 17 2.1.2 Ontology Merging ......................... 20 2.1.3 Creating Ontology Mappings .................... 21 3 The Evaluation Framework 22 4 The Survey 25 4.1 Methods and Tools ............................. 26 4.1.1 MAFRA .............................. 26 4.1.2 RDFT ................................ 34 4.1.3 PROMPT .............................. 37 4.1.4 GLUE ................................ 44 4.1.5 Semantic Matching ......................... 46 4.1.6 OntoMap .............................. 50 4.1.7 RDFDiff .............................. 54 4.2 Integrated Systems ............................. 57 4.2.1 InfoSleuth .............................. 57 4.2.2 ONION ............................... 61 4.2.3 OBSERVER ............................ 65 4.2.4 MOMIS ............................... 70 4.3 Specific Techniques ............................. 74 4.3.1 QOM Quick Ontology Mapping . 75 1 CONTENTS 2 5 Comparison of the Methods 77 5.1 Ontology Languages ............................ 77 5.2 Mapping Language ............................. 79 5.3 Mapping Patterns .............................. 81 5.4 Automation Support ............................. 81 5.5 Applicability to Use Cases ......................... 81 5.6 Implementation ............................... 83 5.7 Experiences ................................. 83 6 Conclusions 86 Bibliography 89 Chapter 1 Introduction This report presents a state-of-the-art survey on ontology mapping and merging methods and tools with an emphasis on ontology mapping and inter-operability on the Semantic Web. The issue of inter-operability between information systems has already existed for many years. With the recent advent of the Semantic Web [BLHL01], the issues have only increased, because of the abundance, heterogeneity and independence of the various data sources. Traditional data integration systems focus on inter-operability between data sources and applications within enterprises. Within enterprises a certain coherence between data sources can be expected, although data integration within enterprises still faces many challenges which remain to be resolved. We focus on information integration on the Semantic Web. This means that we not only take into account data integration within organizations1, but also explicitly address integration across organizational boundaries. Between organizations, even more hetero- geneity between the data sources can be expected. Fortunately, ontologies [Fen03], the backbone of the Semantic Web, can help us with a part of the integration problem. Because ontologies are explicit and formal specifications of knowledge, they help in disambiguating data and can help in finding correspondences between data sources because of the explicit specification of the knowledge in an ontol- ogy. On the Semantic Web, data is annotated using ontologies. Concepts (also called classes)2 in ontologies give meaning to data on the Web. Because ontologies are shared specifications, the same ontologies can be used for the annotation of multiple data sources, not only limited to Web pages, but also collections of xml documents, relational databases, 1Arguably, the intranet of an organization is an isolated part of the Web and could be part of the Semantic Web in the near future 2We use the words concept and class interchangeably in this document. 3 CHAPTER 1. INTRODUCTION 4 etc. This already enables a certain degree of inter-operation between these data sources, because of their shared terminology3. However, it cannot be expected that all individuals and organizations on the Semantic Web will ever agree on using one common terminol- ogy or ontology. It can be expected that many different ontologies will appear and, in order to enable inter-operation, mediation is required between these ontologies. As was argued in [VC98, Usc00], it is very hard to create standard ontologies. In fact, even inside organizations the standardization of a terminology is not feasible, because of the lengthy process of standardization and because the use of a big standard impedes changes in the organization (any change would require consensus among a large group of people, which is hard to achieve). Across organizations this problem becomes more severe, because the group of people which need to reach consensus is much bigger and conflicts of interest are more likely to occur. Therefore, it is likely that there will be many different heterogeneous ontologies on the Semantic Web and in order to enable inter- operability between applications on the Semantic Web, mediation is required between different representations (ontologies) of knowledge in the same domain. This report