Ontology Alignment Using Biologically-Inspired Optimisation Algorithms

Dissertation approved by the Faculty of Economics of the Karlsruhe Institute of Technology (KIT) for the academic degree of Doctor of Economics (Dr. rer. pol.)

by Dipl.-Inform. Jürgen Bock

Date of oral examination: 12 July 2012
Referee: Prof. Dr. Rudi Studer
Co-referee: Prof. Dr. Hartmut Schmeck

Karlsruhe 2012

To my parents.

Abstract

Ontologies describe real-world entities in terms of axioms, i.e. statements about them, and have become an established instrument for formally modelling and representing knowledge. The diversity of available ontologies results in a heterogeneous landscape in which ontologies can overlap in their content. Such an overlap arises when different ontology designers model the same or a similar domain, or when ontologies focus on different aspects of a domain. If overlapping ontologies are to be used in a semantic application, sophisticated methods are required to overcome this heterogeneity. Identifying the overlap of ontologies is the task of the discipline of ontology alignment.

An alignment between two ontologies denotes a set of correspondences between ontological entities. In this thesis, the ontology alignment problem is treated as an optimisation problem. Optimality is defined in terms of an objective function that evaluates candidate alignments according to ontology modelling- and domain-specific criteria, such as significance and similarity of entity identifiers, or logical implications of an alignment. This optimisation problem is solved using biologically-inspired optimisation techniques, demonstrated by way of example with a novel Evolutionary Algorithm and an adapted Discrete Particle Swarm Optimisation algorithm.
The Evolutionary Algorithm implements concepts from Evolutionary Programming and Extremal Optimisation and operates on a newly developed data structure for representing alignments. The Discrete Particle Swarm Optimisation algorithm extends an existing algorithm for a structurally similar problem.

The presented approach is the first to systematically apply biologically-inspired optimisation algorithms to the problem of ontology alignment. These algorithms have several advantages, which address relevant issues of the alignment problem. First, the inherent parallelisability of biologically-inspired optimisation techniques enables the exploitation of distributed computing environments, such as cloud infrastructures. This improves the scalability of the alignment task. Second, biologically-inspired optimisation algorithms are metaheuristics, which are largely independent of the objective function. Thus, arbitrary alignment quality criteria can be encoded, reflecting the characteristics of the ontologies. This makes the approach flexible regarding the nature of the ontologies. Third, candidate alignments are assessed as a whole during the optimisation process. This allows for consideration of global alignment quality criteria that go beyond the traditional pairwise computation of entity similarities. Finally, the iterative nature of biologically-inspired optimisation techniques yields anytime behaviour, i.e. the algorithm can be interrupted at any time and the best alignment found so far can be obtained.

The presented algorithms were implemented in the form of two software prototypes, a generic ontology alignment API, and an evaluation library for flexibly building objective functions. The prototypes were evaluated using established ontology alignment benchmarks, among other experiments.
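The idea of treating alignment as optimisation with anytime behaviour can be illustrated with a minimal sketch. This is not the Evolutionary Algorithm or the Discrete Particle Swarm Optimisation algorithm of this thesis: it is a plain mutation-and-selection loop, and the objective function (mean string similarity of entity labels), the entity lists, and all function names are assumptions chosen for illustration only.

```python
import random
from difflib import SequenceMatcher


def label_similarity(a, b):
    """Hypothetical objective term: string similarity of two labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fitness(alignment):
    """Score a candidate alignment as a whole (mean pair similarity)."""
    if not alignment:
        return 0.0
    return sum(label_similarity(a, b) for a, b in alignment) / len(alignment)


def random_alignment(source, target, rng):
    """Build a random one-to-one alignment between two entity lists."""
    shuffled = target[:]
    rng.shuffle(shuffled)
    return list(zip(source, shuffled))


def mutate(alignment, rng):
    """Swap the target entities of two randomly chosen correspondences."""
    child = alignment[:]
    if len(child) < 2:
        return child
    i, j = rng.sample(range(len(child)), 2)
    (a1, b1), (a2, b2) = child[i], child[j]
    child[i], child[j] = (a1, b2), (a2, b1)
    return child


def evolve(source, target, generations=200, population_size=10, seed=0):
    """Yield the best alignment found so far after every generation."""
    rng = random.Random(seed)
    population = [random_alignment(source, target, rng)
                  for _ in range(population_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        offspring = [mutate(individual, rng) for individual in population]
        # Keep the fittest individuals among parents and offspring.
        population = sorted(population + offspring,
                            key=fitness, reverse=True)[:population_size]
        if fitness(population[0]) > fitness(best):
            best = population[0]
        yield best


# Toy entity labels (assumed, equal-sized lists for simplicity).
source = ["Author", "Paper", "Conference"]
target = ["conference", "author", "paper"]

# The caller may stop consuming the generator at any point and keep
# the last yielded alignment, which is the anytime behaviour.
history = list(evolve(source, target))
best = history[-1]
print(best, fitness(best))
```

Because `evolve` only calls `fitness`, a different quality criterion can be substituted without touching the search loop, which mirrors the independence of metaheuristics from the objective function described above.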
It was shown that biologically-inspired optimisation techniques are applicable to the ontology alignment problem and can compute alignments of good quality, depending on the configuration of the objective function, while at the same time being scalable through high parallel efficiency.

Acknowledgements

This work would not have been possible without the support and encouragement of a number of people. My biggest thanks go to my supervisor Prof. Dr Rudi Studer, who provided an excellent research environment that also allowed somewhat unconventional approaches to be explored. I would also like to thank Prof. Dr Hartmut Schmeck for acting as second reviewer and for giving valuable feedback. Many thanks also to Prof. Dr Stefan Nickel for being the examiner and to Prof. Dr Bruno Neibecker for chairing the examination committee.

For continuous motivation and expert advice throughout the process of developing this thesis I want to thank Dr Stephan Grimm and PD Dr Sebastian Rudolph. Additionally, I'd like to thank PD Dr Catherina Burghart for keeping the (sometimes) distracting project work away from me, particularly during the last months of writing up this thesis.

I wouldn't even have started this research without two people I met during my studies at Griffith University in Brisbane, Australia: Dr Andrew Lewis and Jan Hettenhausen. It was Andrew who first brought the research area of biologically-inspired optimisation to my attention. With Jan I had a memorable brainstorming session about how to tackle the ontology alignment problem by particle swarm optimisation, which eventually resulted in the first MapPSO prototype in 2008.

For fruitful discussions about biologically-inspired optimisation techniques and their application I'd like to express my gratitude to Prof. Dr Marcus Randall and PD Dr Sanaz Mostaghim. For support with bringing the approach into "the cloud" my thanks go to Alex Lenk and Carsten Dänschel.
Many thanks go to Florian Berghoff, Michael Mutter, Carsten Dänschel, Peng Liu, and Matthias Stumpp for implementation support, fruitful discussions, and ideas for improving and optimising the prototypes. I would also like to thank the anonymous reviewers of submitted papers for their valuable feedback and comments.

Finally, I want to express my gratitude to my parents Martin and Brigitte for their endless support and belief in me, and to Beáta Őri for her love and support.

Karlsruhe, July 2012
Jürgen Bock

Contents

Abstract
Acknowledgements
Contents
List of Tables
List of Figures
List of Algorithms

1 Introduction
  1.1 Motivation
  1.2 Overview
2 Foundations
  2.1 Ontologies
  2.2 Ontology Alignment
    2.2.1 Alignment Formalism
    2.2.2 Ontology Alignment Problem
  2.3 Biologically-inspired Optimisation Methods
    2.3.1 Evolutionary Computation
    2.3.2 Computational Swarm Intelligence
3 Related Work
  3.1 Ontology Alignment
    3.1.1 Matrix-based Approaches
    3.1.2 Constraint-based Approaches
  3.2 Applications of Biologically-inspired Optimisation Methods
    3.2.1 Applications in Ontology Alignment
    3.2.2 Applications in Other Semantic Technologies
    3.2.3 Applications in Structurally Related Problem Domains
  3.3 Discussion
4 Evaluation Metrics for Ontology Alignment
  4.1 Evaluation Metrics
    4.1.1 Local Correspondence Evaluation
    4.1.2 Contextual Correspondence Evaluation
    4.1.3 Alignment Level Evaluation
  4.2 Similarity Aggregation
    4.2.1 Maximum Aggregation
    4.2.2 Weighted Average Aggregation
    4.2.3 Ordered Weighted Average Aggregation
5 Ontology Alignment using Biologically-Inspired Optimisation Techniques
  5.1 Objective Function
  5.2 Solution Representation
    5.2.1 Correspondence Set Representation
    5.2.2 Correspondence Permutation Representation
  5.3 Iterative Convergence
    5.3.1 Mutation and Selection in an Evolutionary Algorithm
    5.3.2 Particle Movement in Swarm Optimisation
  5.4 Discussion
    5.4.1 Mutation vs. Crossover
    5.4.2 Evolutionary Algorithm vs. Particle Swarm Optimisation
6 Implementation
  6.1 KADMOS API
    6.1.1 Core Representation API
    6.1.2 Alignment Algorithm API
    6.1.3 Cloud Adapter API
  6.2 HARMONIA Commons
  6.3 MapEVO
  6.4 MapPSO
  6.5 Deployment
    6.5.1 Application Programming Interface
    6.5.2 Web Service
    6.5.3 Cloud Infrastructure
    6.5.4 SEALS Tool Package
  6.6 Development
7 Evaluation
  7.1 Alignment Algorithm Performance Metrics
  7.2 Ontology Alignment Evaluation Initiative
    7.2.1 Benchmarks Track
    7.2.2 Directory Track
    7.2.3 Anatomy Track
    7.2.4 Conference Track

