Linköping Studies in Science and Technology. Thesis No. 1644

Licentiate Thesis

Integration of Ontology Alignment and Ontology Debugging for Taxonomy Networks

by

Valentina Ivanova

Department of Computer and Information Science
Linköping University
SE-581 83 Linköping, Sweden

Linköping 2014

This is a Swedish Licentiate's Thesis

Swedish postgraduate education leads to a Doctor's degree and/or a Licentiate's degree. A Doctor's degree comprises 240 ECTS credits (4 years of full-time studies). A Licentiate's degree comprises 120 ECTS credits.

Copyright © 2014 Valentina Ivanova

ISBN 978-91-7519-417-2
ISSN 0280–7971
Printed by LiU Tryck 2014

URL: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-102953

Abstract

Semantically-enabled applications, such as ontology-based search and data integration, take into account the semantics of the input data in their algorithms. Such applications often use ontologies, which model the application domains in question, as well as alignments, which provide information about the relationships between the terms in the different ontologies.

The quality and reliability of the results of such applications depend directly on the correctness and completeness of the ontologies and alignments they utilize. Traditionally, ontology debugging discovers defects in ontologies and alignments and provides means for improving their correctness and completeness, while ontology alignment establishes the relationships between the terms in the different ontologies, thus addressing the completeness of alignments.

This thesis focuses on the integration of ontology alignment and ontology debugging for taxonomy networks, which are formed by taxonomies, the most widely used kind of ontologies, connected through alignments.

The contributions of this thesis include the following. To the best of our knowledge, we have developed the first approach and framework that integrate ontology alignment and debugging, and allow debugging of modelling defects both in the structure of the taxonomies and in their alignments. As debugging modelling defects requires domain knowledge, we have developed algorithms that employ the domain knowledge intrinsic to the network to detect and repair modelling defects.

Further, a system has been implemented and several experiments with real-world ontologies have been performed in order to demonstrate the advantages of our integrated ontology alignment and debugging approach. For instance, in one of the experiments with the well-known ontologies and alignment from the Anatomy track in the Ontology Alignment Evaluation Initiative 2010, 203 modelling defects (concerning incomplete and incorrect information) were discovered and repaired.

This work has been supported by the Swedish National Graduate School in Computer Science (CUGS), the Swedish e-Science Research Center (SeRC) and Vetenskapsrådet (VR).


Acknowledgements

When life brought me to Sweden I had never imagined the wonderful possibilities I would discover. They did not come for free, though. The path through the research world is thorny, going up and down, turning at the most unpredictable moments. I believe I have managed to put those turns to my advantage, and now I welcome the next challenge.

I am sincerely thankful to my supervisor Professor Patrick Lambrix, who has introduced me to the challenging area of ontologies. While working under his supervision I have improved my calm judgement of circumstances and, in general, my analytical skills. He provided an encouraging and relaxed work environment and guided me during all stages of this work. Thank you, Patrick!

I am especially grateful to Professor Nahid Shahmehri, my second supervisor, who is the main reason for me being at this university. She is the one who first believed in my research talent and kindly advised me.

I am also thankful to Associate Professor Lena Strömbäck and David Byers, who made me believe I possess the strength to take on this adventure. They introduced me to the wonderful world of research.

The time here would not have been that enjoyable without my colleagues, who make the work environment so friendly. I also thank the people at the IDA administrative department, and especially Anne, for their timely and always kind assistance in various administrative issues. I say thank you to Brittany Shahmehri for proofreading this thesis and providing valuable remarks.

I am greatly thankful to my family and friends for their unquestioning support and encouragement. Their belief in the successful end of this adventure has always been driving me forward.

This work would not have been possible without my life partner Pavel. He shares the sunny and stormy weather with me. Thank you, Pavel, for your love and for being here!

Valentina Ivanova
January 2014
Linköping, Sweden


Contents

1 Introduction 1
   1.1 Semantic Web ...... 1
   1.2 Ontologies ...... 3
       1.2.1 Ontology alignment ...... 4
       1.2.2 Ontology debugging ...... 4
       1.2.3 Ontology networks ...... 5
       1.2.4 Benefits from the integration of ontology alignment and ontology debugging ...... 5
   1.3 Problem formulation ...... 6
   1.4 Contributions ...... 7
   1.5 Thesis structure ...... 8
   1.6 List of publications ...... 9
       1.6.1 Thesis based on ...... 9
       1.6.2 Related publications ...... 9
       1.6.3 Other publications ...... 10

2 Background 11
   2.1 Ontologies ...... 11
       2.1.1 Components ...... 12
       2.1.2 Classification ...... 15
       2.1.3 Applications ...... 17
   2.2 Ontology alignment ...... 17
   2.3 Ontology debugging ...... 20
       2.3.1 Classification of defects ...... 21
   2.4 Definitions ...... 23
       2.4.1 Ontologies and ontology networks ...... 23
       2.4.2 Knowledge bases ...... 23

3 Framework and Algorithms 25
   3.1 Framework and workflow ...... 26
   3.2 Methods in the framework ...... 28
       3.2.1 Detect missing and wrong is-a relations and mappings ...... 28
       3.2.2 Repair missing and wrong is-a relations and mappings ...... 31
   3.3 Algorithms in the debugging component ...... 35


       3.3.1 Detect and validate candidate missing is-a relations and mappings ...... 35
       3.3.2 Repair missing and wrong is-a relations and mappings ...... 38
   3.4 Algorithms in the alignment component ...... 43
       3.4.1 Detect and validate candidate missing mappings ...... 43
       3.4.2 Repair missing and wrong mappings ...... 44
   3.5 Interactions between the alignment component and the debugging component ...... 45

4 Implemented System 47
   4.1 Detect and validate candidate missing is-a relations and mappings ...... 48
       4.1.1 Detect and validate candidate missing is-a relations ...... 48
       4.1.2 Detect and validate candidate missing mappings ...... 49
   4.2 Repair missing and wrong is-a relations and mappings ...... 51
       4.2.1 Repair wrong is-a relations and mappings ...... 51
       4.2.2 Repair missing is-a relations and mappings ...... 52

5 Experiments and Discussions 55
   5.1 Ontology debugging ...... 55
       5.1.1 OAEI Anatomy 2010 ...... 55
   5.2 Integration of ontology debugging and ontology alignment ...... 60
       5.2.1 OAEI Anatomy 2011 ...... 60
       5.2.2 OAEI Benchmark 2010 ...... 64
       5.2.3 ToxOntology-MeSH use case ...... 70
   5.3 Discussion ...... 76

6 Related work 79
   6.1 Ontology debugging ...... 79
       6.1.1 Debugging modelling defects ...... 79
       6.1.2 Debugging semantic defects ...... 82
   6.2 Ontology alignment ...... 86
   6.3 Integration of ontology alignment and ontology debugging ...... 88

7 Conclusions and Future Work 91
   7.1 Conclusions ...... 91
       7.1.1 Debugging of ontologies and alignments ...... 92
       7.1.2 Benefits from the integration of ontology alignment and ontology debugging ...... 92
       7.1.3 Implemented system ...... 93
   7.2 Future work ...... 93
       7.2.1 Extending the system ...... 94
       7.2.2 Long-term future work ...... 95

List of Figures

2.1 (Part of an) Ontology network ...... 13
2.2 Part of the is-a hierarchy in the Wine ontology ...... 14
2.3 Part of the Wine ontology ...... 15
2.4 A general alignment framework ...... 18
2.5 An unsatisfiable concept in the Pizza ontology ...... 22

3.1 Workflow ...... 27
3.2 Initialization for detection ...... 35
3.3 Initialization for repairing ...... 38
3.4 Algorithm for generating repairing actions for wrong is-a relations and mappings ...... 39
3.5 Algorithm for generating repairing actions for missing is-a relations and mappings ...... 41

4.1 Generating and validating CMIs ...... 49
4.2 Aligning ...... 50
4.3 Repairing wrong is-a relations ...... 51
4.4 Repairing missing is-a relations ...... 53


List of Tables

5.1 Ontology debugging: OAEI Anatomy 2010—ontologies and alignment ...... 56
5.2 Ontology debugging: OAEI Anatomy 2010—final result ...... 56
5.3 Ontology debugging: OAEI Anatomy 2010—recommendations ...... 57
5.4 Ontology debugging: OAEI Anatomy 2010—first iteration results ...... 58
5.5 Ontology alignment and debugging: OAEI Anatomy 2011—Run I results—debugging of the alignment ...... 61
5.6 Ontology alignment and debugging: OAEI Anatomy 2011—Run I results—debugging of the ontologies ...... 62
5.7 Ontology alignment and debugging: OAEI Benchmark 2010—ontologies and alignments ...... 64
5.8 Ontology alignment and debugging: OAEI Benchmark 2010—Run I—final result ...... 65
5.9 Ontology alignment and debugging: OAEI Benchmark 2010—Run II—final result ...... 67
5.10 Ontology alignment and debugging: OAEI Benchmark 2010—comparison between Run I and Run II ...... 68
5.11 Ontology alignment and debugging: ToxOntology-MeSH—validation of mapping suggestions—initial alignment ...... 71
5.12 Ontology alignment and debugging: ToxOntology-MeSH—changes in the alignment (equivalence mapping (≡), ToxOntology term is-a MeSH term (→), MeSH term is-a ToxOntology term (←), related terms (R), wrong mapping (W), removed (rem)) ...... 73
5.13 Ontology alignment and debugging: ToxOntology-MeSH—changes in the structure of ToxOntology ...... 74


Chapter 1

Introduction

1.1 Semantic Web

The Web today provides an immense variety of structured, semi-structured and, most often, completely unstructured information sources—web pages, documents, figures, etc.—interconnected through an enormous number of links. Every minute different agents—both human and artificial—try to make sense out of the data, integrating different data sources in order to fulfill private and professional requirements.

In order to explore and employ the available data, the agents should be able to understand the message it conveys and formulate meaningful queries. Extracting the meaning, however, is a task that can only be performed by a human agent. Currently, computers only visualize and store the data without "understanding" the knowledge it conveys. The machines can do nothing to extract the semantics—they only "see" strings of symbols where people see words, phrases and sentences. Searching with search engines, until recently, was mainly based on string matching without considering the semantics of the input.

Making information machine-understandable is a key problem nowadays—for example, explaining to the computer what "rock" is. Terms should be considered in their context since it sometimes occurs that the same term is used to represent different concepts—for instance, rock as in rock music and rock as a geological concept. With time, the meanings of the terms change and new meanings for existing terms appear—for instance, mouse as a small mammal and mouse as a pointing device. Thus, in order to understand the intended meaning, the agents have to utilize matching definitions for the terms they use.

Information sources represent various domains, points of view and intended applications. They often overlap. For the purpose of different applications, for instance, data integration and agent communication, it is often necessary to know the relationship between the data available from separate sources or between different versions of the same source.

In order to figure out these relationships the agents must understand the meaning the data conveys. The many information sources at the agents' disposal are often in different states—they may cover a topic area partially or may not be up to date—thus providing incomplete information for the area. Combining data from different sources, which have been developed to serve different applications, may lead to an inconsistent representation of an area. As a consequence the agents may use incomplete, inconsistent and erroneous data as input for their algorithms.

These problems have catalyzed the evolution of the Web towards the Semantic Web, where machines can "understand" and process data without human interaction. As a result the vision of the Semantic Web is coming into reality—just months ago Google introduced the Google Knowledge Graph, enabling semantic search capabilities for its search engine. The rapid development of semantic technologies increasingly influences all aspects of our lives—with life sciences being one of the first domains to adopt the concept of ontologies and to benefit from their knowledge representation capabilities. Many large ontologies, such as SNOMED CT [11], Gene Ontology [15], MeSH [6], etc., have already been developed in this domain.

The concept of the Semantic Web encompasses a set of technologies that enable computers to "understand" the data they store. It is an extension of the Web, not its replacement. This vision was first introduced by Tim Berners-Lee, James Hendler and Ora Lassila in 2001 in a publication [21] in Scientific American. Through several examples the publication illustrates a world where intelligent agents explore the Web and collect and integrate relevant information from diverse data sources in order to fulfill complicated tasks without human guidance. By contrast, today machines can perform only simple tasks precisely specified in advance. Since they do not "understand" the meaning of the data they collect, they cannot combine the output of multiple tasks in a single functional output and draw conclusions (humans have to do that).

To illustrate the concept of the Semantic Web, consider the example of a sophisticated task, such as planning and scheduling a trip to a conference. The trip encompasses different aspects, such as:

• the traveler's daily schedule—available in the traveler's calendar—listing various appointments;
• flight schedules—the selected flights should fit the conference and personal schedule and should be compatible with different personal preferences and restrictions—transfer times on intermediate stops (compatible with the size of the airport/time for transfer), possession of a membership card for a particular airline, avoiding countries with transit visa requirements, etc.;
• hotel accommodation—it should be at a reasonable distance from the conference location, recommended by the conference organizers, with available rooms for the conference period, avoiding neighbourhoods with high crime rates, etc.;


• transport between the airport, the hotel and the conference venue—possible delays and transfer times should be considered, etc.;
• entertainment/sightseeing during free time—finding cultural/sport/other activities that do not conflict with the conference schedule;
• food—finding high-rated restaurants meeting personal dietary requirements;
• etc.

The traveler can take all details into account, search and then integrate relevant information from different data sources to schedule the trip. However, this is still not the case for machines—each of the items in the list requires at least one search in various search engines, where the inputs and outputs of the different searches are more or less connected. First, such an agent should locate the sources containing relevant information for the current task—plane ticket providers, hotels, restaurant guides, etc. The sources often have overlapping content and may contain outdated data; moreover, sources appear and disappear. Then the data relevant for the current task should be retrieved. However, data coming from heterogeneous data sources have different formats and discrepancies in meaning that hinder the filtering of relevant data. Finally, the relevant information should be integrated in order to provide a complete trip and conference schedule. The key issue in all steps is interpreting every piece of data—something machines still cannot do autonomously.

1.2 Ontologies

How can the Semantic Web help a machine to autonomously schedule a trip? The bullets in the list above are related to different data sources or agents providing the desired data. If an intelligent agent is doing the work on our behalf, it should be able to communicate with other agents regarding the data they possess, or it should be able to query data sources with relevant queries. To fulfill these tasks the agents should have a shared understanding of the terms they use.

In this context ontologies are considered the "silver bullet" for the Semantic Web. They provide mutual understanding of a domain, defining concepts, relations between concepts and rules for creating new concepts. For instance, the different aspects of the trip can be represented as different domain ontologies—an accommodation ontology, a restaurant ontology, a transport ontology, etc.—or as a single travel ontology that includes all these concepts. Thus, ontologies enable the communication between the agents by providing a common understanding of the domain in question. Applications, such as agent communication, that employ semantic technologies, in this case ontologies, are called semantically-enabled applications.


Ontologies are usually represented in ontology languages, such as OWL and RDF. These languages often contain statements that can be used for logical inference, for instance in description logic (DL) systems, i.e., new knowledge (not explicitly recorded) can be inferred from the knowledge already stored.

1.2.1 Ontology alignment

It often happens, however, that agents employ different ontologies in the same domain, as these are developed by different organizations according to their needs and points of view. Similarly, the data sources could be annotated, i.e., their constructs could be labeled with terms from different, but similar, ontologies. Thus, in order to communicate with each other and to formulate relevant queries, the agents need to know how the concepts in the different ontologies are related. This is studied in the area of ontology alignment, which employs different techniques in order to find related concepts in different ontologies. A set of relations representing related concepts in two different ontologies is called an alignment. A single relation in the alignment is called a mapping. The alignments are usually created by ontology developers with or without the assistance of ontology alignment systems.

1.2.2 Ontology debugging

Furthermore, many ontologies are domain specific and are developed by domain experts who frequently lack proficiency in knowledge representation. For instance, it is very common that people who are not experts in knowledge representation confuse equivalence, is-a and part-of relations (e.g., [27]). Another common issue appears as ontologies grow in size: intended and unintended entailments become difficult to follow. As a consequence, in large ontologies, but also in smaller ones, there are usually defects—incorrect (wrong), incomplete (missing) and contradictory (inconsistent) information. The same issues are also relevant to the development of alignments. Using ontologies and alignments with defects in semantically-enabled applications, such as agent communication or ontology-based search and data integration, may lead to incorrect conclusions, while valid conclusions may be missed. Discovering and resolving defects in the ontologies and their alignments are the subjects of the ontology debugging area.

The following example highlights the influence of defects, in this case the incomplete/incorrect results of an ontology-based search. The familiar string search only retrieves documents which contain the term(s) we are searching for. In comparison, an ontology-based search retrieves not only documents containing the term(s) in question but also documents containing relevant (often more specific) terms, by exploring the structure of an ontology. Thus, the ontology-based search provides more relevant results. In the example here the MeSH thesaurus [6] is an ontology that is used for querying PubMed [10].


According to the domain knowledge, the Scleritis concept in MeSH is a sub-concept of the Scleral Diseases concept and it is included during a search for Scleral Diseases (1363 articles are retrieved). However, if the relation between Scleritis and Scleral Diseases were missing, only 613 articles would be retrieved, i.e., 55% of the results would be missed. If the relation were wrong (i.e., the relation between Scleritis and Scleral Diseases does not hold in reality but exists in MeSH), incorrect results would be acquired.

There are different types of defects in ontologies [48]. Syntactic defects, such as wrong or missing tags, can be discovered and resolved by (XML) parsers. Semantic defects introduce contradictory information in the ontologies. They can be found by software programs called reasoners, for instance, DL reasoners. Modelling defects require domain knowledge to detect and resolve. For instance, missing and wrong structures in ontologies and their alignments are modelling defects. (A wrong structure could also be a semantic defect.) The example above demonstrates missing and wrong subsumption relations in the structure of an ontology and their consequences for semantically-enabled applications.
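As a small illustration of the kind of query expansion an ontology-based search performs, the sketch below (not the thesis implementation; the tiny hierarchy and helper names are assumptions for illustration only) expands a query term with all of its sub-concepts from an is-a hierarchy, using the MeSH example above.

```python
# Illustrative sketch: query expansion over an is-a hierarchy, as an
# ontology-based search would do. Hierarchy and helpers are assumed examples.

is_a = {
    "Scleritis": ["Scleral Diseases"],   # the asserted relation from the MeSH example
}

def sub_concepts(concept, is_a):
    """Return all concepts that are directly or transitively below `concept`."""
    subs = set()
    for child, parents in is_a.items():
        if concept in parents:
            subs.add(child)
            subs |= sub_concepts(child, is_a)
    return subs

def expand_query(term, is_a):
    """An ontology-based search queries for the term and all of its sub-concepts."""
    return {term} | sub_concepts(term, is_a)

print(expand_query("Scleral Diseases", is_a))
# {'Scleral Diseases', 'Scleritis'} -- if the is-a relation (Scleritis, Scleral Diseases)
# were missing, only documents for the literal query term would be found.
```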

1.2.3 Ontology networks

Ontologies connected through their alignments can be seen as a network—an ontology network. The network itself provides more knowledge for the domain than a single ontology or a pair of ontologies connected through an alignment, since each ontology represents a different level of detail, reflecting the view and the interests of its developers and intended applications. This knowledge intrinsic to the network is a source of valuable domain information and provides a powerful mechanism for automatic defect detection. It can be used for debugging modelling defects in single ontologies and in pairs of ontologies and their alignments.

1.2.4 Benefits from the integration of ontology alignment and ontology debugging

This thesis focuses on debugging of modelling defects in the context of an ontology network. The algorithms presented rely heavily on the knowledge intrinsic to the network as a source of domain knowledge. However, it can sometimes occur that the network cannot be created due to the absence of alignments between the ontologies. In this case ontology alignment systems can be used to provide alignments.

In the context of an integration of ontology alignment and debugging, ontology alignment can be seen as a special kind of debugging of missing relationships between concepts in different ontologies, where alignment algorithms are employed to discover missing relationships. Both correct and incorrect relations obtained during the alignment process could then be used for further debugging and alignment of the ontologies.

In short, ontology alignment provides or extends (already available) alignments which are further necessary for ontology debugging.

Furthermore, some alignment algorithms, like those based on the structure of the ontology, depend on the correctness and completeness of the aligned ontologies. Ontology alignment preprocessing strategies also take advantage of knowledge of the structure of the alignments, if available. Debugging of modelling defects improves the structures of ontologies and their associated alignments. Another advantage is that the repairing algorithms used for ontology debugging can be adapted for the purposes of ontology alignment. This would provide alternatives to the process of creating alignments by simply adding the missing mappings, as is done in many pure ontology alignment systems.

Thus, integration of ontology alignment and debugging would provide additional benefits for both areas and would significantly improve the quality of both the ontologies and their alignments.

1.3 Problem formulation

The discussion above highlights the issues caused by defects in the ontologies and alignments and their consequences for the results of semantically-enabled applications. The quality and reliability of the results of such applications are directly dependent on the quality and reliability of the ontologies and alignments they employ. A key step towards achieving high-quality ontologies and alignments is discovering and resolving various defects. Modelling defects are particularly severe since domain knowledge is required for their debugging. This thesis considers taxonomies, as they are the most widely used kind of ontologies, connected through their alignments in taxonomy networks. It addresses two questions:

• How to debug modelling defects, such as missing and wrong structure in taxonomies as well as their alignments, in the context of a taxonomy network?
  Since debugging usually consists of two phases, a detection phase and a repairing phase, this question encompasses two more precise questions:
  – How to detect modelling defects without external knowledge? Recognizing defects is the first step in their debugging;
  – How to repair modelling defects? After the defects are detected, they should be repaired. A trivial approach is to add or remove the missing or wrong structure. However, other approaches may contribute to a more complete representation of the domain in question and thus could be preferred by domain experts as more beneficial.


In the process of exploring different possibilities for detecting modelling defects, the area of ontology alignment came to our attention. Furthermore, we have found promising hints that the integration of ontology alignment and debugging will provide benefits for both areas. We have studied these expectations in the context of the following question:

• What are the benefits from the integration of ontology alignment and debugging for

  – ontology alignment?
  – ontology debugging?

1.4 Contributions

The main contribution of this thesis can be summarized in the following sentence: this is the first approach, to the best of our knowledge, that integrates ontology alignment and ontology debugging and allows debugging of modelling defects both in the structure of the ontologies and in their alignments. Below, the contributions are listed in connection with the research questions.

How to debug modelling defects, such as missing and wrong structure in taxonomies as well as their alignments, in the context of a taxonomy network?

• We have developed a unified approach for debugging modelling defects, such as missing and wrong structure, in taxonomies and their alignments without external knowledge. A previous work, described in [67], considers debugging missing and wrong subsumption relations in taxonomies in the context of taxonomy networks. In this thesis we have extended the approach and framework, developing algorithms for debugging missing and wrong subsumption and equivalence mappings between taxonomies, employing the knowledge intrinsic to the taxonomy network;
• We have extended the system described in [67], implementing the algorithms for debugging missing and wrong subsumption and equivalence mappings;
• We have performed experiments with existing real-world ontologies using the extended system.

What are the benefits from the integration of ontology alignment and debugging?

• We have developed a framework for the integration of ontology alignment and ontology debugging. Both areas take advantage of the integration—alignment algorithms are used to create a taxonomy network, or extend an existing one, where the knowledge intrinsic to the network is used for detecting and repairing modelling defects in the taxonomies and their alignments.


The debugging process improves the structure of the taxonomies and their alignments, which is important for some ontology alignment strategies. Further, in the integrated framework, alignment can be seen as a special kind of debugging, and debugging using the knowledge intrinsic to the network can be seen as a special alignment algorithm;
• We have, further, extended the system to integrate ontology alignment algorithms. After the integration of ontology alignment and debugging, two components can be distinguished in our system—a debugging component and an alignment component. The system can be used as an integrated ontology alignment and debugging system, or each of the components can be used independently as a separate system;
• We have performed experiments with existing real-world ontologies using our integrated ontology alignment and debugging system. These experiments demonstrate the benefits from the integration of ontology alignment and debugging.

1.5 Thesis structure

The thesis is structured as follows: Chapter 2 gives background on ontologies and provides more details on ontology alignment and ontology debugging. At the end of that chapter several definitions relevant to the subsequent presentation are given. Chapter 3 introduces our integrated framework with its two components—the debugging component and the alignment component—along with their algorithms and workflow. Chapter 4 presents our integrated ontology alignment and debugging system, which is based on the framework discussed in Chapter 3. The experiments performed with the system and a discussion of their results are shown in Chapter 5. Recent issues in the fields of ontology alignment and debugging are discussed in Chapter 6. Chapter 7 provides concluding remarks and directions for future work.


1.6 List of publications

1.6.1 Thesis based on

Journal article
• Lambrix P, Ivanova V, A unified approach for debugging is-a structure and mappings in networked taxonomies, Journal of Biomedical Semantics, 4:10, 2013.

Conference articles
• Ivanova V, Lambrix P, A Unified Approach for Aligning Taxonomies and Debugging Taxonomies and Their Alignments, 10th Extended Semantic Web Conference—ESWC 2013, LNCS 7882, pages 1–15, Montpellier, France, 2013.
• Ivanova V, Lambrix P, A System for Aligning Taxonomies and Debugging Taxonomies and Their Alignments, 10th Extended Semantic Web Conference Satellite Events—ESWC 2013, pages 152–156, Montpellier, France, 2013. Demo.

Workshop articles
• Ivanova V, Laurila Bergman J, Hammerling U, Lambrix P, Debugging Taxonomies and their Alignments: the ToxOntology-MeSH Use Case, 1st International Workshop on Debugging Ontologies and Ontology Mappings—WoDOOM 2012, pages 25–36, Galway, Ireland, 2012.
• Ivanova V, Lambrix P, A System for Debugging Taxonomies and their Alignments, 1st International Workshop on Debugging Ontologies and Ontology Mappings—WoDOOM 2012, pages 37–42, Galway, Ireland, 2012. Demo.

Video journal publication
• Ivanova V, Lambrix P, A System for Aligning Taxonomies and Debugging Taxonomies and Their Alignments, Video Journal of Semantic Data Management Abstracts, volume 2, 2013.

1.6.2 Related publications

Book chapter
• Lambrix P, Ivanova V, Dragisic Z, Contributions of LiU/ADIT to Debugging Ontologies and Ontology Mappings, in Lambrix (ed), Advances in Secure and Networked Information Systems—The ADIT Perspective, pages 109–120, LiU Tryck / LiU Electronic Press, 2012.


Conference article
• Lambrix P, Dragisic Z, Ivanova V, Get My Pizza Right: Repairing Missing is-a Relations in ALC Ontologies, 2nd Joint International Semantic Technology Conference—JIST 2012, LNCS 7774, pages 17–32, Nara, Japan, 2012.

Workshop articles
• Lambrix P, Wei-Kleiner F, Dragisic Z, Ivanova V, Repairing missing is-a structure in ontologies is an abductive reasoning problem, 2nd International Workshop on Debugging Ontologies and Ontology Mappings—WoDOOM 2013, CEUR Workshop Proceedings volume 999, pages 33–44, Montpellier, France, 2013.
• Cuenca Grau B, Dragisic Z, Eckert K, Euzenat J, Ferrara A, Granada R, Ivanova V, Jiménez-Ruiz E, Kempf A O, Lambrix P, Nikolov A, Paulheim H, Ritze D, Scharffe F, Shvaiko P, Trojahn C, Zamazal O, Results of the Ontology Alignment Evaluation Initiative 2013, 8th International Workshop on Ontology Matching—OM 2013, CEUR Workshop Proceedings volume 1111, pages 61–100, Sydney, Australia, 2013.

1.6.3 Other publications

Journal article
• Strömbäck L, Ivanova V, Hall D, Using Statistical Information for Efficient Design and Evaluation of Hybrid XML Storage, International Journal On Advances in Software, 4:3–4, pages 389–400, 2012.

Conference articles
• Ivanova V, Strömbäck L, Creating Infrastructure for Tool-Independent Querying and Exploration of Scientific Workflows, 7th IEEE International Conference on eScience, pages 287–294, Stockholm, Sweden, 2011.
• Strömbäck L, Ivanova V, Hall D, Exploring Statistical Information for Applications-Specific Design and Evaluation of Hybrid XML storage, 3rd International Conference on Advances in Databases, Knowledge, and Data Applications—DBKDA 2011, pages 108–113, St. Maarten, The Netherlands Antilles, 2011. Best paper award.

Chapter 2

Background

This chapter provides background on the areas relevant to this work. They are presented with the help of several examples.

Section 2.1 discusses the term ontology, presenting several definitions from the scientific literature. It then lists the components of ontologies and shows several applications of ontologies in areas other than the Semantic Web. Sections 2.2 and 2.3 give an overview of the areas of ontology alignment and debugging. Formal definitions relevant to the subsequent presentation of this work are given in Section 2.4.

2.1 Ontologies

The term ontology originates from philosophy, where it denotes a branch dealing with the questions of being and existence. In the 1980s the term was borrowed by the Artificial Intelligence community. There are different definitions for ontologies available in the scientific literature and some of the most popular are:

• An ontology defines the basic terms and relations comprising the vocabulary of a topic area as well as the rules for combining terms and relations to define extensions to the vocabulary [71];
• An ontology is an explicit specification of a conceptualization [38];
• An ontology is a hierarchically structured set of terms for describing a domain that can be used as a skeletal foundation for a knowledge base [86];
• An ontology provides the means for describing explicitly the conceptualization behind the knowledge represented in a knowledge base [20];
• An ontology is a formal, explicit specification of a shared conceptualization [85].

All definitions share the view that ontologies explicitly describe a topic area. They model the world around us (or someone's view of the world), explicitly defining the meaning of its concepts, the existing relationships between them (for instance, part-of, is-kind-of, is-located-in, is-not) and rules for creating new concepts.

The last definition supplies an additional important feature of ontologies, i.e., that they provide a shared understanding of the area in question.

Ontologies vary in their components and consequently in complexity and knowledge representation capabilities. Figure 2.1 illustrates a real-world example from the Anatomy track at the Ontology Alignment Evaluation Initiative (OAEI) 2011 [8], which will be further used throughout the thesis. Two parts of ontologies are shown—on the left is a piece of the Adult Mouse Anatomy Dictionary (AMA) [1], which models the anatomy of an adult mouse, and on the right is a piece of the NCI Thesaurus anatomy (NCI-A) [7], which models the human anatomy. Figures 2.2 and 2.3 show parts of the Wine ontology [13]. It specifies terms and relations in the wine and food domains and provides information about the type of wine suitable for a particular food.

2.1.1 Components

There are different views on the components of ontologies. According to [53], the components of ontologies, from a knowledge representation point of view, are as listed below. The authors of [29] define a similar set of components, which they call a minimal set of components.

• concepts (also known as classes) represent a group of entities in a domain. All rectangles in Figures 2.1 and 2.2 and the rectangles with circles in front of the labels in Figure 2.3 depict concepts in the ontologies;
• instances (also known as individuals) represent the actual entities. However, they are often not represented in ontologies. The instances in the ontology in Figure 2.3 are depicted with rectangles with rhombuses in front of the labels;
• relations (also known as roles, properties, slots) represent different relationships between the entities in a domain, such as part-of, is-kind-of, is-located-in, is-not, etc. The concepts in an ontology connected through is-a relations form the is-a hierarchy in the ontology. Analogously, the part-of hierarchy in the ontology consists of all concepts connected through part-of relations. Is-a relations (known also as is-kind-of, subclass or subsumption relations) are the most often used in ontologies since they represent a common relationship that occurs in many domains. An is-a relation shows that one set of entities is a subset of another set of entities. For instance, the relation limb bone is-a bone in Figure 2.1 shows that a limb bone is a kind of bone. The directed solid edges in Figure 2.1 represent the is-a structures in the ontologies. The edges in Figure 2.2 illustrate the subclass (is-a) relations in the Wine ontology. Other relations depict different dependencies between the entities—the dashed edges in Figure 2.3 illustrate two relations—locatedIn between the concepts Wine and Region, and hasMaker between the concepts Wine and Winery;


Figure 2.1: (Part of an) Ontology network.


Figure 2.2: Part of the is-a hierarchy in the Wine ontology.


Figure 2.3: Part of the Wine ontology.

• axioms represent facts that are always true in the area described by the ontology and are not represented by the other components. They are used to provide a consistent representation of the domain. For instance (examples from the Wine ontology; a possible description logic rendering is sketched after this list):
  – domain restrictions (adjacentRegion has values from Region);
  – cardinality restrictions (VintageYear can have at most one value);
  – disjointness restrictions (Fruit is-not Meat).
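To make the last three examples concrete, here is one possible rendering of such axioms in description logic notation. This is an illustration only; the exact axiom forms and property names (e.g., hasVintageYear) are assumptions rather than quotations from the Wine ontology.

```latex
% Illustrative DL renderings of the three kinds of axioms above (assumed forms):
\begin{align*}
&\top \sqsubseteq \forall\, \mathit{adjacentRegion}.\mathit{Region}
    && \text{(adjacentRegion has values from Region)}\\
&\top \sqsubseteq\ {\le} 1\, \mathit{hasVintageYear}
    && \text{(at most one value)}\\
&\mathit{Fruit} \sqcap \mathit{Meat} \sqsubseteq \bot
    && \text{(disjointness)}
\end{align*}
```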

2.1.2 Classification

Ontologies can be classified according to various criteria. Several one-dimensional classifications (utilizing only a single criterion) are shown in [78] in the context of a discussion regarding the usage of ontologies in software engineering and technology. Most of them consider how general the represented concepts are and the scope of the application of the ontologies—general, domain, task, application, etc. concepts/scopes. One of the classifications, given by [66] in a discussion regarding desirable and required features for ontology languages, considers the complexity of the relationships that can be depicted in the domain in question. This classification, referred to as "richness of the internal structure", and the classification in [90], referred to as "subject of conceptualization", are used as a foundation for the two-dimensional classification developed in [36].

Depending on the "richness of the internal structure", i.e., the knowledge representation capabilities of an ontology, [36] defines eight categories of ontologies, ranging from informally specified ontologies to ontologies precisely specified by formal languages. These eight categories can be further compacted to the four presented in [89] and [39] and listed here:

• glossaries and data dictionaries contain concepts with or without their definitions in a natural language;
• thesauri and taxonomies introduce, together with the concepts and their definitions, synonyms and relations such as narrower and broader;
• ontologies represented by metadata, XML schemas, data models. These models additionally provide properties and value restrictions. This category includes the so-called strict is-a relations, which correspond to the is-a relations in our work;
• ontologies represented by logical languages. The ontologies represented by formal languages hold the most expressive knowledge representation capabilities.

Another categorization method, given in [53], takes into account the components and the information represented by them and arrives at a similar classification:

• controlled vocabularies contain only concepts;
• taxonomies contain concepts connected in a hierarchy through is-a relations (these is-a relations correspond to the so-called strict is-a relations above);
• thesauri contain concepts and a set of predefined relations, e.g., WordNet [69], MeSH [6];
• ontologies represented by data models, for instance, EER and UML, include restricted forms of axioms, properties and cardinality constraints together with the concepts and relations. (This category corresponds to the metadata, XML schemas, data models category above.);
• ontologies represented by logics, e.g., description logics, are the most expressive kind of ontologies. They employ formal languages with their own syntax, semantics and inference mechanism along with the concepts, relations and axioms. Description logics vary in their expressivity. (This category corresponds to the logical languages category above.)

Both classifications encompass the whole range of ontologies regarding their knowledge representation capabilities—from the so-called lightweight to the heavyweight ontologies. The advantage of the former group is their simplicity, at the price of reduced expressivity and high ambiguity. The advantage of the ontologies in the latter group is their powerful expressivity and inference mechanism, at the price of complex development.


2.1.3 Applications

Ontologies have a wide range of applications in the Semantic Web:

• provide mutual understanding of a domain, enabling knowledge sharing and reuse, and facilitating autonomous communication between different intelligent agents, as discussed in Tim Berners-Lee, James Hendler and Ora Lassila's publication [21];
• serve as a repository of information [89];
• provide a query model for information sources, explicitly structuring the domain knowledge [91], [70];
• data integration of heterogeneous information sources [91], [54], [70].

Ontologies are a key technology for the Semantic Web and are intensively employed in other areas as well:

• Artificial Intelligence—knowledge representation and reasoning;
• Software Engineering—in [25] two applications of ontologies in this area are discussed—sharing terminology and knowledge, and filtering knowledge in the process of definition of models and metamodels; [40] discusses ontologies in the context of the Software Engineering life-cycle;
• Systems Engineering—ontologies are used for the purposes of reusability, reliability and specification, as pointed out in [88];
• Bioinformatics and Systems Biology—specification, ontology-based search, data integration and exchange, as discussed in [53] and [64];
• E-commerce—such as GoodRelations [4].

2.2 Ontology alignment

In the fields pioneering ontology development, such as the life sciences, a number of ontologies have already been created by different organizations, representing their needs and views of the domain. It may happen that data sets are annotated with terms from different but overlapping ontologies, which is an obstacle for their integration. The communication between intelligent agents using different ontologies is hindered as well.

A solution to these issues demands knowledge about the relationships between the concepts in the different ontologies. This is the field of research of the continuously growing ontology alignment community. The increased interest in the topic has led to the organization of an annual evaluation initiative—the Ontology Alignment Evaluation Initiative [8]—where developers and researchers can evaluate their tools and algorithms in various tracks.

A set of relations showing the relationships between concepts in two different ontologies is called an alignment. Each relation in the set is called a mapping. We call the concepts that participate in mappings mapped concepts. Each mapped concept can participate in multiple mappings and alignments.

In our work we consider equivalence and subsumption mappings. The equivalence mappings connect two concepts which represent the same set of entities. The subsumption mappings are relations between two concepts, where one of the concepts represents a set of entities that is a subset of the other concept. Ontology alignment systems are used to facilitate the development of alignments.

The ontologies in Figure 2.1 are connected through an alignment, depicted with the dashed edges. It consists of 10 equivalence mappings. One of these mappings represents the fact that the concept bone in the first ontology is equivalent to the concept bone in the second ontology. The same applies for the concept nasal bone in the first ontology and the concept nasal bone in the second, and so on. As these four concepts appear in mappings, they are mapped concepts. An example of a subsumption mapping would be (AMA:maxilla, NCI-A:irregular bone) (not shown in Figure 2.1, but derivable through NCI-A:maxilla)—AMA:maxilla is subsumed by NCI-A:irregular bone and, accordingly, NCI-A:irregular bone subsumes AMA:maxilla.

A set of ontologies connected through their alignments forms a network—an ontology network.


Figure 2.4: A general alignment framework.

Ontology alignment framework. With the increasing number of ontologies, their concepts and relations, the demand for automated or semi-automated ontology alignment systems grows stronger.

Figure 2.4 shows a general semi-automated ontology alignment framework presented by Patrick Lambrix and Qiang Liu in 2009 in [58]. Many ontology alignment systems conform to it. The input to the system consists of two ontologies and the output is an alignment. The alignment process presented in the framework goes through two phases. In Phase I the system generates possible mappings that are presented to the user for manual validation in Phase II. Phase I usually includes three steps:

The preprocessing step includes preliminary data processing, for instance, partitioning of the input ontologies or removing modifiers, such as definite and indefinite noun modifiers. [58] presents strategies for using a partial alignment (PA) in this and the following steps.

Running matchers to compute similarity values between pairs of concepts in the different ontologies. The similarity values represent an estimate that two concepts are connected. The matchers employ various strategies, as described in [63] and listed below:

• linguistic strategies explore the linguistic similarity of the concept and relation labels. For instance, the labels are represented as sets of consecutive characters and then the similarity values between the concepts are calculated based on these sets. Another strategy counts the number of insertions, deletions and modifications needed in order to make one of the labels identical to the other;
• structure-based strategies rely heavily on the structure of the ontologies. They are based on the heuristic that, given two ontologies and their alignment, if two regions in the different hierarchies are between pairs of concepts with high similarity values, then there could be matching concepts between both regions;
• constraint-based strategies consider the data types and cardinalities of the concepts and properties. They are usually used to provide supplementary information, not as primary matchers;
• instance-based strategies assign similarity values based on the shared instances between the concepts in the different ontologies. The instances can be acquired from curated scientific resources (for instance, PubMed [10] in the life sciences);
• strategies based on auxiliary sources use domain knowledge available from external sources, such as WordNet [69] and UMLS [14], to find additional information for the concepts (synonyms) and the relationships between them.

Combining and filtering the similarity values obtained from the different matchers—most often the similarity values are combined using a weighted-sum approach in which each matcher is given a weight and the final similarity value is the weighted sum of the similarity values divided by the sum of the weights of the matchers. Another approach uses the maximal similarity value obtained from the matchers.


Furthermore, those pairs of concepts with similarity values equal to or higher than a given threshold are retained in order to obtain the mapping suggestions. Another filtering strategy, presented in [26], uses two thresholds—the pairs with values equal to or above the higher threshold are directly retained as mapping suggestions, while those between the two thresholds are filtered with respect to the structure of the ontology and the pairs with similarity values above the higher threshold.

In Phase II the mapping suggestions are presented for validation to the user, who can accept or reject them. The accepted suggestions become part of the final alignment. Both the accepted and the rejected mapping suggestions are further used in the alignment process to avoid unnecessary computations and validations. A conflict checker may be used to detect possible conflicts.

The alignment algorithms are evaluated mainly according to their precision, recall and f-measure. The precision measure reflects the ratio between the correct pairs and all pairs of concepts in the newly created alignment. The recall measure reflects the ratio between the correct pairs that have actually been retrieved and all pairs that should be retrieved by the alignment algorithms (those known to be correct according to, for instance, a reference alignment). The f-measure combines precision and recall.
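As a minimal illustration of Phase I as described above, the sketch below shows one simple linguistic matcher (character trigram overlap), weighted-sum combination, single-threshold filtering, and the evaluation measures against a reference alignment. The specific matcher, weights and threshold are illustrative assumptions, not the strategies of any particular system discussed in this thesis.

```python
# Illustrative sketch of the alignment steps described above (assumed, simplified).

def trigrams(label, n=3):
    """Represent a label as a set of consecutive character n-grams."""
    label = label.lower()
    return {label[i:i + n] for i in range(max(len(label) - n + 1, 1))}

def linguistic_similarity(label1, label2):
    """A simple linguistic matcher: Jaccard overlap of character trigrams."""
    g1, g2 = trigrams(label1), trigrams(label2)
    return len(g1 & g2) / len(g1 | g2)

def combine(similarity_values, weights):
    """Weighted-sum combination of the values produced by several matchers."""
    return sum(w * s for w, s in zip(weights, similarity_values)) / sum(weights)

def mapping_suggestions(similarities, threshold):
    """Single-threshold filtering: keep concept pairs at or above the threshold."""
    return {pair for pair, value in similarities.items() if value >= threshold}

def precision_recall_f(suggested, reference):
    """Evaluation of a computed alignment against a reference alignment."""
    correct = suggested & reference
    precision = len(correct) / len(suggested) if suggested else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

# Toy usage with two concept pairs and a single matcher (weight 1.0):
pairs = [("nasal bone", "nasal bone"), ("maxilla", "irregular bone")]
similarities = {p: combine([linguistic_similarity(*p)], [1.0]) for p in pairs}
suggestions = mapping_suggestions(similarities, threshold=0.6)
print(precision_recall_f(suggestions, reference={("nasal bone", "nasal bone")}))
```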

2.3 Ontology debugging

Developing ontologies and alignments is not a trivial task. As ontologies grow in size and complexity, the intended and unintended entailments become difficult to follow. As mentioned above, ontologies are usually developed by domain experts who often are not experts in knowledge representation and may not have experience with the capabilities of the knowledge representation languages (good/bad practices). The same issues apply to developing alignments. Concept discrepancies between the different ontologies, for instance, using one term for different real-world entities, are also sources of defects during the alignment. The experiment in Section 5.2.3 presents such an example: during the alignment, the domain expert marked the metabolism concepts in both ontologies as equivalent; however, during the subsequent debugging process it was discovered that they are not equivalent. As a consequence, the ontologies, alignments and integrated ontology network may be incorrect, incomplete or inconsistent. Using them in semantically-enabled applications may lead to the entailment of incorrect conclusions, or valid conclusions may be missed.

Recall the example from Subsection 1.2.2 regarding missing/wrong subsumption relations in the MeSH hierarchy. It clearly shows how substantial the influence of such defects on semantically-enabled applications may be.

Another example demonstrates the way communication can be disrupted between two intelligent agents using two different ontologies in the medical domain.

For the same group of eye-related illnesses, one of the ontologies uses the concept Eye Diseases, while the other uses the concept Eye Disorders. If a mapping between these two concepts is not available, the two agents will not be able to share data (understand each other) regarding these concepts. If the mapping were wrong, they would exchange incorrect information.

To achieve highly reliable results from semantically-enabled applications, it is necessary to have both high-quality ontologies and high-quality alignments. Debugging of the ontologies and alignments is a key step towards eliminating defects in them, which is essential for obtaining high-quality results in semantically-enabled applications. The ontology debugging area deals with discovering and resolving defects in the structure of the ontologies and their alignments. To highlight the growing importance of the field, the International Workshop on Debugging Ontologies and Ontology Mappings (WoDOOM) was founded in 2012.

2.3.1 Classification of defects

The defects differ [48] in nature and, consequently, in the complexity of their detection and repair.

• syntactic defects, such as an incorrect format or a missing tag, are trivial to find and resolve using parsers;
• semantic defects have their origin in unintended inferences (the example in Figure 2.5 illustrates semantic defects in the Pizza ontology [12]; a short derivation is sketched after this list):
  – unsatisfiable concepts are concepts that cannot have any instances. Figure 2.5 shows an unsatisfiable concept, CheeseyVegetableTopping. It is defined as a CheeseTopping and as a VegetableTopping at the same time, where CheeseTopping and VegetableTopping are disjoint concepts. Nothing can be a CheeseTopping and a VegetableTopping at the same time, i.e., CheeseyVegetableTopping will not have any instances and it is an unsatisfiable concept;
  – incoherent ontologies are ontologies that contain unsatisfiable concepts. The Pizza ontology contains at least one unsatisfiable concept (CheeseyVegetableTopping), i.e., it is an incoherent ontology;
  – inconsistent ontologies contain inconsistencies, for example, an instance that belongs to an empty set. In this example, if CheeseyVegetableTopping had instances the ontology would be inconsistent.
  The semantic defects can be found using reasoners, which are software programs that are able to derive logical consequences from a given set of asserted axioms—Pellet [9], Jena [2], FaCT++ [3], HermiT [5], etc.


Figure 2.5: An unsatisfiable concept in the Pizza ontology.

• modelling defects, such as missing and wrong relations, require domain knowledge to detect and resolve. With very few exceptions, there is a lack of system support for debugging such defects. The examples at the beginning of this section show modelling defects—missing and wrong is-a relations and mappings. The missing is-a relations in Figure 2.1 are (nasal bone, bone), (maxilla, bone), (lacrimal bone, bone) and (jaw, bone) in the left ontology (AMA), and (metatarsal bone, foot bone) and (tarsal bone, foot bone) in the right ontology (NCI-A). The wrong is-a relations are (upper jaw, jaw) and (lower jaw, jaw) in the right ontology.
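The following short derivation, in standard description logic notation, spells out why the semantic-defect example above (CheeseyVegetableTopping) is unsatisfiable. The axiom forms are a simplified rendering of the Pizza ontology's definitions, not quotations from it.

```latex
% Simplified rendering of why CheeseyVegetableTopping is unsatisfiable:
\begin{align*}
&\mathit{CheeseyVegetableTopping} \sqsubseteq \mathit{CheeseTopping} \sqcap \mathit{VegetableTopping}\\
&\mathit{CheeseTopping} \sqcap \mathit{VegetableTopping} \sqsubseteq \bot
    && \text{(disjointness)}\\
&\Rightarrow\ \mathit{CheeseyVegetableTopping} \sqsubseteq \bot
    && \text{(no instances possible)}
\end{align*}
```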


2.4 Definitions

This section presents several formal definitions that will be used throughout the thesis.

2.4.1 Ontologies and ontology networks

The focus of our work is on taxonomies, which are the most widely used kind of ontologies. 'Taxonomy' and 'ontology' are used interchangeably in the next chapters. The taxonomies consist of named concepts and subsumption (is-a) relations between the concepts. The following definition applies.

Definition 1 A taxonomy O is represented by a tuple (C, I) where C is its set of named concepts and I ⊆ C × C is a set of asserted is-a relations, representing the is-a structure of the ontology.

The ontologies are connected into a network through alignments. We currently consider equivalence mappings (≡) and is-a mappings (subsumed-by (→) and subsumes (←)).

Definition 2 An alignment between ontologies Oi and Oj is represented by a set Mij of pairs representing the mappings, such that for concepts ci ∈ Oi and cj ∈ Oj: ci → cj is represented by (ci, cj); ci ← cj is represented¹ by (cj, ci); and ci ≡ cj is represented by both (ci, cj) and (cj, ci).

Definition 3 A taxonomy network N is a tuple (O, M) with O = {Ok : k = 1, . . . , n} the set of the ontologies in the network and M = {Mij : i, j = 1, . . . , n; i < j} the set of alignments between the ontologies in the network.

Definition 4 Let N = (O, M) be an ontology network, with O = {Ok}k=1..n, M = {Mij}i,j=1..n;i<j and Ok = (Ck, Ik). The induced ontology ON of the network N is the ontology (CN, IN) with CN = ∪k=1..n Ck and IN = (∪k=1..n Ik) ∪ (∪i,j=1..n;i<j Mij).
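As an illustration only (not part of the formal definitions and not the RepOSE implementation), Definitions 1-4 can be encoded with simple data structures; all Python names below are chosen for this sketch:

from dataclasses import dataclass

@dataclass
class Taxonomy:                     # Definition 1: O = (C, I)
    concepts: set                   # C: named concepts
    isa: set                        # I ⊆ C × C: asserted is-a relations, (a, b) meaning a → b

@dataclass
class TaxonomyNetwork:              # Definition 3: N = (O, M)
    ontologies: dict                # k -> Taxonomy
    alignments: dict                # (i, j) with i < j -> set of pairs (a, b) meaning a → b
                                    # (an equivalence mapping is stored as both (a, b) and (b, a))

    def induced_ontology(self):     # Definition 4: union of concepts, is-a relations and mappings
        concepts = set().union(*(o.concepts for o in self.ontologies.values()))
        isa = set().union(*(o.isa for o in self.ontologies.values()))
        for mapping in self.alignments.values():
            isa |= mapping
        return Taxonomy(concepts, isa)

# Toy usage: two small taxonomies and one alignment with an equivalence mapping
ama = Taxonomy({"nasal bone", "viscerocranium bone", "bone"},
               {("nasal bone", "viscerocranium bone")})
ncia = Taxonomy({"Nasal Bone", "Bone"}, {("Nasal Bone", "Bone")})
net = TaxonomyNetwork({1: ama, 2: ncia},
                      {(1, 2): {("nasal bone", "Nasal Bone"), ("Nasal Bone", "nasal bone")}})
print(len(net.induced_ontology().isa))   # 4 is-a edges in the induced ontology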

2.4.2 Knowledge bases

In the algorithms we use the notion of knowledge base (KB). The notion that we define here is a restricted2 variant of the notion as defined in description logics [16].

1 Observe that for every Mij there is a corresponding Mji such that Mij = Mji. Therefore, in the remainder of this thesis we will only consider the Mij where i < j.
2 We use only concept names and no roles. The axioms in the TBox are of the form A ⊑ B or A ≐ C, and the ABox is empty.


Definition 5 Let C be a set of named concepts. A knowledge base is then a set of axioms of the form A → B with A ∈ C and B ∈ C. A model of the knowledge base satisfies all axioms of the knowledge base.

In the algorithms we initialize KBs with an ontology. This means that for an ontology O = (C, I) we create a KB such that (A, B) ∈ I iff A → B is an axiom in the KB.

For the KBs, we assume that they are able to do deductive logical inference. Furthermore, we need the following reasoning services. For a given statement the KB should be able to answer whether the statement is entailed by the KB.3 If a statement is entailed by the KB, it should be able to return the derivation paths (explanations) for that statement. The derivation paths, also called justifications, are used to show how a given statement is entailed. For a given named concept, the KB should return the super-concepts and the sub-concepts.

The KBs can be implemented in several ways. For instance, any description logic system could be used. In our setting, where we deal with taxonomies, we have used an efficient graph-based implementation. We have represented the ontologies using graphs where the nodes are concepts and the directed edges represent the is-a relations. The entailment of statements of the form a → b can be checked by transitively following edges starting at a. If b is reached, then the statement is entailed, otherwise not. If a → b is entailed, then the derivation paths are all the different paths obtained by following directed edges that start at a and end at b. The super-concepts of a are all the concepts that can be reached by following directed edges starting at a. The sub-concepts of a are all the concepts for which there is a path of directed edges starting at the concept and ending in a.

3 In our setting, entailment by ontology can be reformulated as entailment by KB.
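A minimal sketch of such a graph-based KB in plain Python follows; it is illustrative only, and the class and method names are not those of the actual implementation:

from collections import defaultdict

class GraphKB:
    def __init__(self, axioms):                    # axioms: iterable of (a, b) meaning a → b
        self.succ, self.pred = defaultdict(set), defaultdict(set)
        for a, b in axioms:
            self.succ[a].add(b)
            self.pred[b].add(a)

    def _reachable(self, start, edges):            # transitive closure along directed edges
        seen, stack = set(), [start]
        while stack:
            for nxt in edges[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def super_concepts(self, a):                   # concepts reachable from a
        return self._reachable(a, self.succ)

    def sub_concepts(self, a):                     # concepts from which a is reachable
        return self._reachable(a, self.pred)

    def entails(self, a, b):                       # is a → b entailed?
        return a == b or b in self.super_concepts(a)

    def derivation_paths(self, a, b, path=None):   # all simple paths from a to b (explanations)
        path = (path or []) + [a]
        if a == b:
            yield path
            return
        for nxt in self.succ[a]:
            if nxt not in path:                    # avoid cycles
                yield from self.derivation_paths(nxt, b, path)

kb = GraphKB([("nasal bone", "viscerocranium bone"), ("viscerocranium bone", "bone")])
print(kb.entails("nasal bone", "bone"))                    # True
print(list(kb.derivation_paths("nasal bone", "bone")))     # one derivation path of length 3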

Chapter 3

Framework and Algorithms

This chapter presents our integrated ontology alignment and debugging framework with its two components—a debugging component and an alignment component. It is an extension of the framework in [67], which can be seen as the debugging component in this work. The extended framework introduces algorithms for debugging modelling defects in alignments and for integrating ontology alignment and debugging of ontology networks. This is the first framework, to the best of our knowledge, that integrates ontology alignment and debugging in a unified approach. The interactions between them provide advantages for both areas.

This chapter is organized as follows: Section 3.1 gives an overview of the framework and introduces the three phases in its workflow—the detection, validation and repairing phases. The first part of Section 3.2—Subsection 3.2.1—introduces two methods for detecting possible modelling defects in ontologies and their alignments. The second part—Subsection 3.2.2—explains the motivation for a set of requirements enforced during the repairing process and introduces four heuristics, initially defined in [61], in order to facilitate the repairing. The methods described in Section 3.2 are then applied and improved in the debugging and alignment components. Section 3.3 presents the algorithms for discovering and resolving wrong and missing is-a relations and mappings in the debugging component. Section 3.4 presents the algorithms in the alignment component, where the detection phase utilizes ontology alignment algorithms. The final section (3.5) illustrates the advantages of the interactions between the two components.


3.1 Framework and workflow

Our framework consists of two major components—a debugging component and an alignment component. They can be used completely independently, thus acting as two different systems, or in close interaction where each of the components benefits from the interaction. The alignment component detects and repairs missing and wrong mappings between ontologies using alignment algorithms, while the debugging component additionally detects and repairs missing and wrong is-a structure in ontologies employing the knowledge intrinsic to the network. Although we describe the two components separately, in our framework ontology alignment can be seen as a special kind of debugging.

The workflow in both components consists of three phases during which wrong and missing is-a relations/mappings are detected, validated and repaired in a semi-automatic manner by a domain expert (Figure 3.1). In Phase 1 possible modelling defects in ontologies and their alignments are detected. The debugging component detects possible defects for a selected ontology. Possible defects for a selected pair of ontologies can be detected by both components—when the debugging component is used, an initial alignment between the two ontologies is needed as well. In Phase 2 the user validates the detected defects (possibly based on recommendations from the system) and categorizes each of them as a missing is-a relation/mapping or a wrong is-a relation/mapping. The algorithms for detecting possible modelling defects and the validation procedure are explained in Subsection 3.3.1 for the debugging component and in Subsection 3.4.1 for the alignment component.

A naive way of repairing defects would be to compute all possible repairing actions1 for the network with respect to the validated missing is-a relations and mappings for all the ontologies in the network (following the definition in Subsection 3.2.2). This is in practice infeasible as it involves all the ontologies and alignments and all the missing and wrong is-a relations and mappings in the network. It is also hard for domain experts to choose between large sets of repairing actions for all the ontologies and alignments. Moreover, functional visualization of such large sets may be complicated, if not impossible. Therefore, in our approach, we repair ontologies and alignments one at a time (Phase 3).

During Phase 3 the validated missing and wrong is-a relations and mappings from the debugging component and the validated missing and (some of) the wrong mappings from the alignment component are repaired in similar ways. For the selected ontology (for repairing is-a relations) or for the selected alignment and its pair of ontologies (for repairing mappings), a user can choose to repair the missing or the wrong is-a relations/mappings (Phases 3.1-3.4). Although the algorithms for repairing

1 Is-a relations and/or mappings to add and/or remove in order to repair the validated defects.


[Figure 3.1 (workflow diagram): the user chooses an ontology or a pair of ontologies; Phase 1 detects candidate missing is-a relations and mappings, Phase 2 validates them into missing/wrong is-a relations and mappings, and Phases 3.1-3.4 generate repairing actions, rank the wrong/missing is-a relations and mappings, recommend repairing actions and execute the chosen repairing actions.]

Figure 3.1: Workflow.

are different for missing and wrong is-a relations/mappings, the repairing goes through the phases of generation of repairing actions, the ranking of is-a relations/mappings, the recommendation of repairing actions and, finally, the execution of repairing actions. In Phase 3.1 repairing actions are generated. For missing is-a relations and mappings these are is-a relations or mappings to add, while for wrong is-a relations and mappings these are is-a relations or mappings to remove. In general, there will be many is-a relations/mappings that need to be repaired and some of them may be easier to start with, such as the ones with fewer repairing actions. We therefore rank them with respect to the number of possible repairing actions (Phase 3.2). After this, the user can select an is-a relation/mapping to repair and choose among possible repairing actions. To facilitate this process, we use algorithms to recommend repairing actions (Phase 3.3). Once the user decides on repairing actions, the chosen repairing actions are then removed (for wrong is-a relations/mappings) from or added (for missing is-a relations/mappings) to the relevant ontologies and alignments and the consequences are computed (Phase 3.4). For instance, by repairing one is-a relation/mapping some other missing or wrong is-a relations/mappings may also be repaired or their repairing actions may change. Furthermore, new modelling defects may be found.

Descriptions of our algorithms in the two components for Phases 3.1-3.4 are found in Subsections 3.3.2 and 3.4.2. The first two phases in the alignment component can be considered an instantiation of the general alignment framework presented in Subsection 2.2. The detection phase in the alignment component follows directly after Phase 1 in the general framework, applying ontology alignment algorithms. The validation phase in the alignment component corresponds to Phase 2 in the general framework. The third phase in the alignment component

can be seen as an extension of the alignment framework. While in the alignment framework the validation finalizes the alignment process, adding the correct mappings to the final alignment, in the alignment component we introduce a third phase where more possibilities for repairing missing and wrong mappings are presented to the domain expert.

We note that at any time during the debugging/alignment workflow, the user can switch between different ontologies, start earlier phases, or switch between the repairing of wrong is-a relations, the repairing of missing is-a relations, the repairing of wrong mappings and the repairing of missing mappings. The user can switch between the phases in the debugging and the alignment component as well. We also note that the repairing of defects often leads to the discovery of new defects, i.e., to additional debugging opportunities. Thus, several iterations are usually needed to complete the debugging/alignment process. The process ends when no more missing or wrong is-a relations and mappings are detected or need to be repaired.

In the following sections we describe the components and their interactions, and present the algorithms we have developed for the different components and phases.

3.2 Methods in the framework

This section presents methods and notions further implemented in the detection and repairing phases in both components. Subsection 3.2.1 presents two methods and related definitions for detecting modelling defects. Subsection 3.2.2 introduces the notion of structural repair used during the repairing process and lists four heuristics used to facilitate the repairing.

3.2.1 Detect missing and wrong is-a relations and mappings

Two methods for discovering wrong and missing is-a relations and mappings are presented below. In the first method, given an ontology network, the domain knowledge represented by the network is utilized to detect is-a relations and mappings that can be deduced in the network (missing is-a relations and mappings). However, the ontology network may contain incorrect information and some of the detected missing is-a relations and mappings could be derived due to wrong is-a relations and mappings. Thus, the output of the method should be validated by a domain expert as missing structure (should be in the ontologies/alignments) or wrong structure (should not be in the ontologies/alignments). The method is presented together with examples, and related definitions are introduced during its presentation. The second method employs different matchers for discovering modelling defects in alignments and its output (mapping suggestions) should be validated by a domain expert as well.


The possible defects in the structure of the ontologies, generated by detection methods prior to the validation, are called candidate missing is-a relations (CMIs). The possible defects in the alignments, generated by detection methods prior to the validation, are called candidate missing mappings (CMMs). The set of CMIs in the network is denoted as CMI and the set of CMMs in the network is denoted as CMM. Prior to repairing, the CMIs and CMMs should be validated by, e.g., a domain expert. During the validation the CMIs are divided into two sets—wrong and missing is-a relations, respectively denoted as WI and MI. Similarly, the CMMs are divided into two sets as well—wrong and missing mappings, respectively denoted as WM and MM. MI, WI, MM, WM are not dependent on the origin of the CMIs and CMMs. After validation the relations in these sets are repaired.

Using knowledge intrinsic to an ontology network

Given an ontology network, the set of candidate missing is-a relations logically derivable from the ontology network (CMILD) consists of is-a relations between two concepts of an ontology which can be inferred using logical derivation from the induced ontology of the network, but not from the ontology alone. Similarly, given an ontology network, the set of candidate missing mappings logically derivable from the ontology network (CMMLD) consists of mappings between concepts in two ontologies which can be inferred using logical derivation from the induced ontology of the network, but not from the two ontologies and their alignment alone.

Definition 6 Let N = (O, M) be an ontology network, with O = {Ok}k=1..n, M = {Mij}i,j=1..n;i<j, Ok = (Ck, Ik), and ON the induced ontology of the network. Then:

(1) ∀k ∈ 1..n : CMILDk = {(a, b) ∈ Ck × Ck | ON ⊨ a → b ∧ Ok ⊭ a → b} is the set of candidate missing is-a relations for Ok logically derivable from the network.

(2) ∀i, j ∈ 1..n, i < j : CMMLDij = {(a, b) ∈ (Ci × Cj) ∪ (Cj × Ci) | ON ⊨ a → b ∧ (Ci ∪ Cj, Ii ∪ Ij ∪ Mij) ⊭ a → b} is the set of candidate missing mappings for (Oi, Oj, Mij) logically derivable from the network.

(3) CMILD = ∪k=1..n CMILDk is the set of candidate missing is-a relations logically derivable from the network.

(4) CMMLD = ∪i,j=1..n;i<j CMMLDij is the set of candidate missing mappings logically derivable from the network.

Thus, CMILD ⊆ CMI and CMMLD ⊆ CMM. As was mentioned, the structure of the ontologies and the mappings may contain wrong is-a relations and some of the CMILD and CMMLD may be logically derived due to wrong is-a relations and mappings. Therefore, we need to validate the CMILD and sort each of them into one of the two sets WI or MI. In this case we have that MI ⊇ ∪k=1..n MIk with MIk the


set of missing is-a relations in Ok, and WI ⊇ ∪k=1..n WIk with WIk the set of wrong is-a relations in Ok. Similarly, the CMMLD should be validated and sorted into one of the two sets WM or MM. In this case we have that MM ⊇ ∪i,j=1..n;i<j MMij with MMij the set of missing mappings for (Oi, Oj, Mij), and WM ⊇ ∪i,j=1..n;i<j WMij with WMij the set of wrong mappings for (Oi, Oj, Mij).
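A brute-force sketch of Definition 6, reusing the TaxonomyNetwork and GraphKB classes from the earlier sketches, is given below; it is illustrative only, and Subsection 3.3.1 describes how the actual detection avoids checking all pairs of concepts:

def cmi_ld(network, k):
    # CMIs for ontology k: derivable from the induced ontology of the network, not from Ok alone
    kb_net = GraphKB(network.induced_ontology().isa)
    o = network.ontologies[k]
    kb_o = GraphKB(o.isa)
    return {(a, b) for a in o.concepts for b in o.concepts
            if a != b and kb_net.entails(a, b) and not kb_o.entails(a, b)}

def cmm_ld(network, i, j):
    # CMMs for (Oi, Oj, Mij): derivable from the network, not from the two ontologies and their alignment
    kb_net = GraphKB(network.induced_ontology().isa)
    oi, oj = network.ontologies[i], network.ontologies[j]
    kb_ij = GraphKB(oi.isa | oj.isa | network.alignments.get((i, j), set()))
    pairs = [(a, b) for a in oi.concepts for b in oj.concepts] + \
            [(a, b) for a in oj.concepts for b in oi.concepts]
    return {(a, b) for (a, b) in pairs if kb_net.entails(a, b) and not kb_ij.entails(a, b)}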

Using ontology alignment algorithms

While generating CMMs using the knowledge logically derivable from the network can be considered a special kind of ontology alignment, other alignment algorithms can be employed to detect CMMs. Since this method employs alignment algorithms, it can only be used to detect the set of candidate missing mappings from alignment algorithms (CMMAlignment).

Definition 7 Let N = (O, M) be an ontology network, with O = {Ok}k=1..n and M = {Mij}i,j=1..n;i<j, and let AA be a set of alignment algorithms. Then:

(1) ∀i, j ∈ 1..n, i < j : CMMAlignmentij is the set of candidate missing mappings from alignment algorithms for (Oi, Oj, Mij, AA).

2 From OAEI 2010 Anatomy.


(2) CMMAlignment = ∪i,j=1..n;i<j CMMAlignmentij is the set of candidate missing mappings from alignment algorithms for the network.

Thus, CMMAlignment ⊆ CMM. Analogously to the CMMLD, CMMAlignment is presented to a domain expert for validation. As a result of the validation the members of CMMAlignment are sorted into one of the two sets, MM and WM, as shown above.

In the previous detection method the CMILD and the CMMLD are based on actually existing relations/mappings in the network and all of them will be repaired later. The detection using ontology alignment algorithms, however, does not employ existing knowledge, i.e., the CMMAlignment are not based on existing relations/mappings in the network. This leads to the following consequences during the repairing: all mappings in MM will be repaired, but this is not the case for those in WM, where only the mappings logically derivable from the network will be repaired. The rest will not be repaired since they are not based on existing relations/mappings in the network.

This method is particularly important when there is no network, i.e., no alignments between the ontologies. In such a case it is used to create an initial network, enabling the detection of CMIs and CMMs with the detection algorithm that employs the knowledge intrinsic to the network.

3.2.2 Repair missing and wrong is-a relations and mappings

Once missing and wrong is-a relations and mappings have been obtained, we need to repair them. We note that the theory for repairing does not require that the missing and wrong is-a relations and mappings are determined using the detection techniques described above. They may have been generated using external knowledge and then validated by a domain expert, or they may have been provided directly by a domain expert. The methods for repairing do not depend on and cannot distinguish the origin of the wrong and missing is-a relations/mappings.

We first present the notion of structural repair, used to formalize a set of requirements enforced during the repairing of the defects. Then four heuristics, initially defined in [61] for missing is-a relations, are introduced with their extended definitions. They filter the possible repairing actions in order to assist the domain expert during the repairing process.

Structural repair

For each ontology in the network, we want to repair its is-a structure in such a way that (i) the missing is-a relations can be logically derived from their repaired host ontologies and (ii) the wrong is-a relations can no longer be logically derived from the repaired ontology network. In addition, for

each pair of ontologies, we want to repair its mappings in such a way that (iii) the missing mappings can be logically derived from the repaired host ontologies of their mapped concepts and the repaired alignment between the host ontologies of the mapped concepts and (iv) the wrong mappings can no longer be logically derived from the repaired ontology network. To satisfy requirement (i), we need to add a set of is-a relations to the host ontology. To satisfy requirement (iii), we need to add a set of is-a relations to the host ontologies of the mapped concepts and/or mappings to the alignment between the host ontologies of the mapped concepts. To satisfy requirements (ii) and (iv), a set of asserted is-a relations and/or mappings should be removed from the ontology network. The notion of structural repair formalizes this.

Definition 8 Let N = (O, M) be an ontology network, with O = {Ok}k=1..n, M = {Mij}i,j=1..n;i<j and Ok = (Ck, Ik), and let MI, WI, MM and WM be sets of missing is-a relations, wrong is-a relations, missing mappings and wrong mappings for N, respectively. A structural repair for N with respect to (MI, WI, MM, WM) is a set of is-a relations and mappings to add to, and to remove from, the ontologies and alignments in N such that conditions (1)-(6) described below hold.

The definition states that (1) the added is-a relations and mappings cannot at the same time be removed, (2) the removed mappings come from the original alignments and the removed is-a relations come from the original asserted is-a relations in the ontologies, (3) the added mappings were not in the original alignments and the added is-a relations were not original is-a relations in the ontologies, (4) every missing is-a relation is logically derivable from its repaired host ontology, (5) every missing mapping is logically derivable from the repaired host ontologies of the mapped concepts and their repaired alignment, and (6) no wrong mapping, wrong is-a relation, removed mapping or removed is-a relation is logically derivable from the repaired network. The is-a relations and mappings contained in a structural repair are called repairing actions.
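As a rough illustration of requirements (4)-(6), the following sketch (reusing the GraphKB and TaxonomyNetwork classes from the earlier sketches) checks a proposed set of added and removed is-a relations/mappings against the entire repaired network; it simplifies conditions (4) and (5), which in the definition refer to the repaired host ontologies and alignments rather than to the whole network, and it omits conditions (1)-(3):

def satisfies_repair_requirements(network, added, removed, MI, WI, MM, WM):
    # Repaired network: original is-a relations and mappings, plus added, minus removed
    base = network.induced_ontology().isa
    repaired = GraphKB((base | added) - removed)
    # (4), (5) simplified: every missing is-a relation/mapping is derivable after the repair
    missing_ok = all(repaired.entails(a, b) for (a, b) in MI | MM)
    # (6): no wrong or removed is-a relation/mapping is derivable after the repair
    wrong_gone = not any(repaired.entails(a, b) for (a, b) in WI | WM | removed)
    return missing_ok and wrong_gone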


Preferences

As explained in [61] regarding missing is-a relations, there can be many structural repairs and not all of them are equally useful or interesting for a domain expert. For instance, four structural repairs for the set of missing is-a relations M = {(nasal bone, bone), (maxilla, bone)} in the first ontology in Figure 2.1 are presented in the list below:

• S1 = {(nasal bone, bone), (maxilla, bone)}—the missing is-a relations are repaired by adding them;

• S2 = {(nasal bone, bone), (maxilla, bone), (jaw, bone)}—the missing is-a relations are repaired by adding them and one more is-a relation unrelated to the missing relations;

• S3 = {(viscerocranium bone, bone)}—adding this is-a relation will make the missing is-a relations logically derivable since nasal bone → viscerocranium bone and maxilla → viscerocranium bone. It is also correct according to the domain and, moreover, it will repair (lacrimal bone, bone), which is also a missing is-a relation;

• S4 = {(viscerocranium bone, bone), (maxilla, bone)}—the same as the previous set plus one of the missing is-a relations. However, in the presence of (viscerocranium bone, bone) in the taxonomy, adding (maxilla, bone) will introduce redundancy, since (maxilla, bone) will become logically derivable through maxilla → viscerocranium bone → bone.

Many other structural repairs can be created. Four heuristics have been developed in [61] in order to assist the domain expert during the repairing process. They aim to reduce the number of structural repairs presented to the domain expert without excluding relevant repairing actions from them. We illustrate them with examples and present extended definitions here.

Definition 9 Pref1 Let S1 and S2 be structural repairs for the ontology O with respect to (MI, WI, MM, WM), then S1 is axiom-preferred to S2 (notation S1 ≻A S2) iff S1 ⊆ S2.

The first heuristic states that we want to use only repairing actions that contribute to the repairing. It corresponds to the notion of Subset Minimality given in [65]. For instance, consider the missing is-a relations (nasal bone, bone) and (maxilla, bone) in the first ontology in Figure 2.1. Two possible structural repairs are S1 = {(nasal bone, bone), (maxilla, bone)} and S2 = {(nasal bone, bone), (maxilla, bone), (jaw, bone)}. According to this preference, to repair the missing is-a relations, we should choose S1 over S2 since using (jaw, bone) in addition will not contribute to the repairing of the missing (nasal bone, bone) and (maxilla, bone). As another example, consider the structural repairs S3 = {(viscerocranium bone, bone)} and S4 = {(viscerocranium bone, bone), (maxilla, bone)}. In this case S3 ≻A S4 since (viscerocranium bone, bone) alone will repair both missing is-a relations and

adding (maxilla, bone) will introduce redundancy in the taxonomy and will not contribute to the repairing.

Definition 10 Pref2 We say that (x1, y1) is more informative than (x2, y2) iff x2 → x1 and y1 → y2. Let S1 and S2 be structural repairs for the ontology O with respect to (MI, WI, MM, WM). Then S1 is information-preferred to S2 (notation S1 ≻I S2) iff ∃ (x1, y1) ∈ S1, (x2, y2) ∈ S2: (x1, y1) is more informative than (x2, y2).

Therefore, adding or removing more informative repairing actions adds or removes more knowledge than less informative repairing actions. According to this preference we want to repair with repairing actions that are as informative as possible. It is a special case of More Informative, as defined in [65]—adding more informative repairing actions for missing is-a relations to the set of asserted axioms in a taxonomy will always entail the missing is-a relations. As an example, consider again the missing is-a relation (nasal bone, bone) in Figure 2.1. Knowing that nasal bone → viscerocranium bone, according to the definition of more informative, we know that (viscerocranium bone, bone) is more informative than (nasal bone, bone). As viscerocranium bone actually is a sub-concept of bone according to the domain, a domain expert would prefer to use the more informative repairing action for the given missing is-a relation.3

Definition 11 Pref3 Let S1 and S2 be structural repairs for the ontology O = (C, I) with respect to (MI, WI, MM, WM). Then S1 is strict-hierarchy-preferred to S2 (notation S1 ≻SH S2) iff ∃ A, B ∈ C: (C, I) ⊨ A → B and (C, I) ⊭ B → A and (C, I ∪ S1) ⊭ B → A and (C, I ∪ S2) ⊨ B → A.

The third heuristic prefers not to introduce equivalence relations between concepts where in the original ontology there is only an is-a relation. For instance, consider the missing is-a relation (metatarsal bone, foot bone) in the second ontology in Figure 2.1. Two possible structural repairs are {(metatarsal bone, foot bone)} and {(bone of the lower extremity, foot bone)}. Adding the latter will introduce an equivalence relation between bone of the lower extremity and foot bone, which is not desirable with respect to this preference. Additionally, such an equivalence is often not correct according to the domain.

Pref4 Finally, the single relation heuristic assumes that it is more likely that the ontology developers have failed to add single is-a relations rather than a chain of is-a relations. For instance, consider again the missing is-a relation (nasal bone, bone). It is more likely that the developers have failed to add it, rather than missing a chain of relations, for example, nasal bone → x1 → x2 → ... → xn → bone.

3 We also note that using (viscerocranium bone, bone) as a repairing action would also immediately repair the missing is-a relations (maxilla, bone) and (lacrimal bone, bone).
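The "more informative" test (Pref2) and the axiom preference (Pref1) can be sketched as follows, assuming the GraphKB class from the earlier sketches; this is an illustration, not the thesis implementation:

def more_informative(kb, pair1, pair2):
    # Definition 10: (x1, y1) is more informative than (x2, y2) iff x2 → x1 and y1 → y2
    (x1, y1), (x2, y2) = pair1, pair2
    return kb.entails(x2, x1) and kb.entails(y1, y2)

def axiom_preferred(repair1, repair2):
    # Definition 9 (Pref1): S1 is axiom-preferred to S2 iff S1 ⊆ S2
    return set(repair1) <= set(repair2)

kb = GraphKB([("nasal bone", "viscerocranium bone"), ("maxilla", "viscerocranium bone")])
print(more_informative(kb, ("viscerocranium bone", "bone"), ("nasal bone", "bone")))  # True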


1. Initialize KBN with ontology network N;
2. For k := 1 .. n: initialize KBk with ontology Ok;
3. For i := 1 .. n-1:
     for j := i+1 .. n:
       initialize KBij with ontologies Oi and Oj;
       for every mapping (m, n) ∈ Mij: add the axiom m → n to KBij;

Figure 3.2: Initialization for detection.

3.3 Algorithms in the debugging component

Subsection 3.3.1 presents our algorithms for detecting (Phase 1) and validating (Phase 2) wrong and missing is-a relations and mappings employing knowledge intrinsic to the ontology network. The detection algorithm follows the definition of CMILD and CMMLD given in Subsection 3.2.1 and introduces an improvement of the method. Subsection 3.3.2 presents the process of repairing missing and wrong is-a relations/mappings (Phase 3), including our algorithms that calculate the structural repairs. The input for the debugging component is a taxonomy network, i.e., a set of taxonomies and their alignments. The output is the set of repaired taxonomies and alignments.

3.3.1 Detect and validate candidate missing is-a relations and mappings

The detection phase (Phase 1) starts with the initialization of a KB for the ontology network (KBN), KBs for each ontology (KBk) and KBs for each pair of ontologies and their alignment (KBij). The algorithm for the initialization of the different KBs is shown in Figure 3.2. Then CMIs and CMMs that are logically derivable from the network could be found by directly applying the definition of CMILD and CMMLD given in Subsection 3.2.1—using a brute-force method that checks each pair of concepts in the network. For each pair of concepts within the same ontology, we check whether an is-a relation between the pair can be logically derived from the KB of the network, but not from the KB of the ontology; if so, it is a CMI. Similarly, for each pair of concepts belonging to two different ontologies, we check whether an is-a relation between the pair can be logically derived from the KB of the network, but not from the KB of the two ontologies and their alignment; if so, it is a CMM.

However, for large ontologies or ontology networks, this is infeasible. Moreover, some of these CMIs and CMMs are redundant in the sense that they can be repaired by the repairing actions of other CMIs and CMMs. Therefore, instead of checking all pairs of concepts in the network we define a subset of the set of all pairs of concepts in the network that we will consider

for generating CMIs and CMMs logically derivable from the network. This subset will initially consist of all pairs of mapped concepts4 and we explain this choice below.

In the restricted setting where we assume that all existing is-a relations in the ontologies and all existing mappings in the alignments are correct (and thus the debugging problem does not need to consider wrong is-a relations and mappings), it can be shown that all CMIs and CMMs logically derivable from the network5 will be repaired when we repair the CMIs and CMMs between mapped concepts.

Proposition. Let N = (O, M) be an ontology network with O = {Ok}k=1..n the set of the ontologies in the network and M = {Mij}i,j=1..n;i<j the set of the alignments between the ontologies in the network, and assume that all asserted is-a relations and all mappings in N are correct. Assume further that all CMIs and CMMs between pairs of mapped concepts that are logically derivable from the network have been repaired. Then (i) all CMIs logically derivable from the network are repaired, and (ii) all CMMs logically derivable from the network are repaired.

4 In the worst-case scenario the number of mapped concept pairs is equal to the total number of concept pairs. In practice, the use of mapped concepts may significantly reduce the search space, e.g., when some ontologies are smaller than other ontologies in the network or when not all concepts participate in mappings. For instance, in the experiment in Section 5.1.1 the search space is reduced by almost 90%.
5 In this setting all CMIs logically derivable from the network are also missing is-a relations, and all CMMs logically derivable from the network are also missing mappings.


x → z → y′ → y → b. Since a → b is not inferable from Oi, the relation x → y cannot be inferred from Oi either. This means that (x, y) is also a CMI logically derivable from the network in Oi, and the repairing of (x, y) also repairs (a, b). This proves statement (i). A similar proof can be given for statement (ii). ♣

The proposition guarantees that for the part of the network for which the is-a structure and mappings are correct, we find all CMIs and CMMs logically derivable from the network when using the set of all pairs of mapped concepts. In addition, we may generate CMIs and CMMs that were logically derived using incorrect information. Thus, the CMIs and CMMs may later be validated as missing (those that are correct) or wrong (those that are incorrect). As our debugging approach is iterative, after repairing, larger and larger parts of the network will contain only correct is-a structure and mappings. When, finally, the entire network contains only correct is-a structure and mappings, the proposition guarantees that all defects that can be found using the knowledge intrinsic to the network have been found using our approach.

In the network in Figure 2.1 the CMIs are (nasal bone, bone), (maxilla, bone), (lacrimal bone, bone), (jaw, bone), (upper jaw, jaw) and (lower jaw, jaw) in the left ontology (AMA), and (metatarsal bone, foot bone) and (tarsal bone, foot bone) in the right ontology (NCI-A). Since the network contains only two ontologies and their alignment, CMMs cannot be detected in this example. In order to detect CMMs with this method at least three ontologies and two alignments are needed.

After the CMIs and CMMs have been generated, redundant ones are removed. The remaining CMIs and CMMs are then presented to a domain expert for validation (Phase 2). We use the recommendation algorithm for validation from [67]. As is-a and part-of are often confused, the user can ask for a recommendation based on existing part-of relations in the ontology or in external domain knowledge (WordNet). If a part-of relation exists between the concepts of a CMI, it is likely a wrong is-a relation. Similarly, the existence of is-a relations in external domain knowledge (WordNet and UMLS6) may indicate that a CMI is indeed a missing is-a relation. In the network in Figure 2.1, (upper jaw, jaw) and (lower jaw, jaw) are validated as wrong since an upper/lower jaw is part-of (not is-a) a jaw. The rest are validated as correct.

As noted before, every CMI or CMM that is generated using this approach also presents an opportunity for debugging. If a CMI or CMM that is logically derivable from the network is validated as correct, then information

6 It is well-known that UMLS contains semantic and modelling defects (e.g., [52, 33]). Therefore, we only use the external resources in the recommendation of the validation of CMIs (and in Section 3.3.2 in the recommendation of repairing actions), but not in the generation. The validation (and in Section 3.3.2 the choice of repairing actions) is always the domain expert's responsibility and the recommendations should only be considered as an aid.


1. For k := 1 .. n:
     for every missing is-a relation (a, b) ∈ MIk:
       add the axiom a → b to KBN;
       add the axiom a → b to KBk;
       for i := 1 .. k-1: add the axiom a → b to KBik;
       for i := k+1 .. n: add the axiom a → b to KBki;
2. For i := 1 .. n-1:
     for j := i+1 .. n:
       for every missing mapping (m, n) ∈ MMij:
         add the axiom m → n to KBN;
         add the axiom m → n to KBij;
3. MI := MI; WI := WI; MM := MM; WM := WM;
4. RI+ := ∅; RI− := ∅; RM+ := ∅; RM− := ∅;
5. CMI := ∅; CMM := ∅;

Figure 3.3: Initialization for repairing.

is missing and is-a relations or mappings need to be added; otherwise, some existing information is incorrect and is-a relations or mappings need to be removed. After repairing, new CMIs and CMMs may be logically derived from the network.

3.3.2 Repair missing and wrong is-a relations and mappings

In Phase 3 the missing and wrong is-a relations and mappings are repaired. The repairing process is different for the missing and the wrong is-a relations/mappings but contains the same subphases of generation of structural repairs (Phase 3.1), ranking (Phase 3.2), recommendation (Phase 3.3) and execution (Phase 3.4) of repairing actions.

Initialization of the repairing phase

In our algorithm (Figure 3.3), at the start of the repairing phase we add all missing is-a relations and mappings to the relevant KBs (steps 1 and 2). Since these are validated as correct, this is extra knowledge that should be used in the repairing process. Adding the missing is-a relations and mappings essentially means that we have repaired them using the least informative repairing actions (see the definition of more informative in Section 3.2.2). In this subsection we try to improve on this and find more informative repairing actions.

We also initialize global variables for the current sets of missing (MI) and wrong (WI) is-a relations, the current sets of missing (MM) and wrong


1. Compute AllJust(w, r, Oe)
   where Oe = (Ce, Ie) such that Ce = ∪k=1..n Ck and
   Ie = ((∪k=1..n Ik) ∪ (∪i,j=1..n;i<j Mij) ∪ MI ∪ MM ∪ RI+ ∪ RM+) \ (RI− ∪ RM−);

Figure 3.4: Algorithm for generating repairing actions for wrong is-a relations and mappings.

(WM) mappings in step 3, the added (RI+ for is-a relations and RM+ for mappings) and removed (RI− for is-a relations and RM− for mappings) repairing actions in step 4, and the current sets of candidate missing is-a relations (CMI) and candidate missing mappings (CMM) in step 5.

Repair wrong is-a relations and mappings

Figure 3.4 shows the algorithm for generating repairing actions (Phase 3.1) for a wrong is-a relation or mapping. This algorithm is run for all elements logically derivable from the network in WI and WM. It computes all justifications for the wrong is-a relation or mapping in the current ontology network. The current network is the original network where the repairs up to now have been taken into account (i.e., all missing is-a relations have been repaired by adding them, and additionally some have been repaired using a more informative repairing action in RI+; missing mappings have been repaired by adding them or by repairing actions in RM+; and some wrong is-a relations and mappings have already been repaired by removing is-a relations and mappings in RI− and RM−, respectively). A justification for a wrong is-a relation or mapping can be seen as an explanation for how this is-a relation or mapping is logically derivable from the network.

Definition 12 (similar definition as in [45]). Given an ontology O = (C, I) and (a, b) ∈ C × C an is-a relation logically derivable from O, then I′ ⊆ I is a justification for (a, b) in O, denoted by Just(I′, a, b, O), iff (i) (C, I′) ⊨ a → b; and (ii) there is no I″ ⊊ I′ such that (C, I″) ⊨ a → b. We use AllJust(a, b, O) to denote the set of all justifications for (a, b) in O.

The algorithm to compute justifications initializes a KB taking into account the repairing actions up to now. To compute the justifications for a → b in our graph-based implementation, all the different paths obtained by following directed edges that start at a and end at b are collected. Among these the minimal ones (w.r.t. ⊆) are retained. The wrong is-a relation or mapping can then be repaired by removing at least one element in every justification. However, missing is-a relations, missing mappings, and added repairing actions (is-a relations in ontologies

and mappings) cannot be removed. Using this algorithm, structural repairs are generated that include only contributing repairing actions (preference ≻A in Section 3.2.2).

In the network in Figure 2.1, (upper jaw, jaw) in the left ontology (AMA) is validated as incorrect. Its justification is AMA:upper jaw ≡ NCI-A:Upper Jaw → NCI-A:Jaw ≡ AMA:jaw. To repair it, (Upper Jaw, Jaw) should be removed from NCI-A (on the right).

In Phase 3.2 the wrong is-a relations and mappings are ranked with respect to the number of possible repairing actions. Those with fewer repairing actions are ranked higher.

We have also used the recommendation algorithm in [67] (Phase 3.3) that computes hitting sets for all the justifications of the wrong is-a relations and mappings under repair. Each hitting set contains a minimal set of is-a relations and mappings that must be removed to repair a wrong is-a relation/mapping (formal definition and algorithm in [76]). The recommendation algorithm then assigns a priority to each possible repairing action based on how often it occurs in the hitting sets and its importance in already repaired is-a relations and mappings. In the example7 in Figure 4.3 the highest priority is given to the mapping (Brain White Matter, brain grey matter), as this is the only way to repair more than one wrong is-a relation at the same time. (Both (cerebellum white matter, brain grey matter) and (cerebral white matter, brain grey matter) would be repaired.)

Once the user decides on repairing actions, the chosen repairing actions are removed from the relevant ontologies and alignments and a number of updates need to be done (Phase 3.4). First, the wrong is-a relation (or mapping) is removed from WI (or WM). The chosen repairing actions that are is-a relations in an ontology are added to RI− and the repairing actions that are mappings are added to RM−. Some other wrong is-a relations or mappings may also have been repaired by repairing the current wrong is-a relation or mapping (update WI and WM). Also, some repaired missing is-a relations and mappings may end up missing again (update MI and MM). Additionally, new CMIs and CMMs logically derivable from the network may appear (update CMI and CMM—and after validation update CMI, MI, WI, CMM, MM and WM). In other cases the possible repairing actions for wrong and missing is-a relations and mappings may change (update the justifications and the sets of possible repairing actions for missing is-a relations and mappings). We also need to update the knowledge bases.
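Under the graph-based KB assumption introduced earlier, the computation of justifications and a simple frequency count over them (a stand-in for illustration, not the actual hitting-set-based ranking of [67] and [76]) can be sketched as follows:

from collections import Counter
from itertools import product

def justifications(kb, a, b):
    # Each justification is the set of asserted edges on one minimal derivation path for a → b
    paths = [frozenset(zip(p, p[1:])) for p in kb.derivation_paths(a, b)]
    return [j for j in paths if not any(other < j for other in paths)]   # keep ⊆-minimal ones

def candidate_repairs(kb, a, b, protected=frozenset()):
    # Hitting sets over the justifications: choose one removable edge per justification;
    # `protected` would hold missing is-a relations/mappings and already added repairing actions
    just = justifications(kb, a, b)
    choices = [[e for e in j if e not in protected] for j in just]
    hitting = {frozenset(combo) for combo in product(*choices)}
    frequency = Counter(e for j in just for e in j if e not in protected)
    return hitting, frequency        # how often an edge occurs across justifications drives ranking

kb = GraphKB([("upper jaw", "Upper Jaw"), ("Upper Jaw", "Jaw"), ("Jaw", "jaw")])
print(candidate_repairs(kb, "upper jaw", "jaw"))   # each edge on the single derivation path is a candidate removal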

Repair missing is-a relations and mappings

It was shown in [55] that repairing missing is-a relations (and mappings) can be seen as a generalized TBox abduction problem. Figure 3.5 shows our solution for the computation of repairing actions for a missing is-a relation or mapping (Phase 3.1). The algorithm, an extension of the algorithm

7 From OAEI 2010 Anatomy.


Repair missing is-a relation (a, b) with a ∈ Ok and b ∈ Ok:
   choose an element from GenerateRepairingActions(a, b, KBk);

Repair missing mapping (a, b) with a ∈ Oi and b ∈ Oj:
   choose an element from GenerateRepairingActions(a, b, KBij);

GenerateRepairingActions(a, b, KB):
1. Source(a, b) := super-concepts(a) − super-concepts(b) in KB;
2. Target(a, b) := sub-concepts(b) − sub-concepts(a) in KB;
3. Repair(a, b) := Source(a, b) × Target(a, b);
4. For each (s, t) ∈ Source(a, b) × Target(a, b):
     if (s, t) ∈ WI ∪ WM ∪ RI− ∪ RM− then
       remove (s, t) from Repair(a, b);
     else if ∃ (u, v) ∈ WI ∪ WM ∪ RI− ∪ RM− : (s, t) is more informative than (u, v) in KB
       and u → s and t → v are logically derivable from only is-a relations and/or mappings
       that have been validated to be correct then
       remove (s, t) from Repair(a, b);
5. Return Repair(a, b);

Figure 3.5: Algorithm for generating repairing actions for missing is-a relations and mappings.

in [61], takes into consideration that all missing is-a relations and missing mappings will be repaired (using the least informative repairing action), but it does not take into account the consequences of the actual (possibly more informative) repairing actions that will be performed for other missing is-a relations and other missing mappings.

The main component of the algorithm (GenerateRepairingActions) takes a missing is-a relation or mapping (a, b) as input together with a knowledge base. For a missing is-a relation this is the knowledge base corresponding to the host ontology of the missing is-a relation; for a missing mapping this is the knowledge base corresponding to the host ontologies of the mapped concepts in the missing mapping and their alignment. In this component, for a missing is-a relation or mapping we compute the more general concepts of the first concept a (Source) and the more specific concepts of the second concept b (Target) in the knowledge base. So as not to introduce non-validated equivalence relations where in the original ontologies and alignments there are only is-a relations, we remove the super-concepts of the second concept (b) from Source, and the sub-concepts of the first concept (a) from Target. Adding an element from Source × Target (Repair(a, b)) to the knowledge base makes the missing is-a relation or mapping logically derivable.


However, some elements in Source × Target may conflict with already known wrong is-a relations or mappings. Therefore, in Repair, we take the wrong is-a relations and mappings and the former repairing actions for wrong is-a relations and mappings into account. The missing is-a relation or mapping can then be repaired using an element in Repair. We note that for missing is-a relations, the elements in Repair are is-a relations in the host ontology of the missing is-a relation. For missing mappings, the elements in Repair can be mappings as well as is-a relations in each of the host ontologies of the mapped concepts of the missing mapping. Using this algorithm, structural repairs are generated that include only contributing repairing actions, and repairing actions of the form (a, t) or (s, b) for a missing is-a relation or mapping (a, b) do not introduce non-validated equivalence relations (see Pref1 and Pref3 in Subsection 3.2.2). Furthermore, the solutions follow the single relation heuristic (Pref4).

In the network in Figure 2.1 (nasal bone, bone) is validated as correct. The Source set for it contains {nasal bone, viscerocranium bone} and the Target set contains {bone, limb bone, forelimb bone, hindlimb bone, foot bone, metatarsal bone, tarsal bone, jaw, maxilla, lacrimal bone}, i.e., Repair contains 2 × 10 = 20 possible repairing actions. Each of the repairing actions, when added to the first ontology, would make the missing is-a relation logically derivable from it. In this example a domain expert would select the more informative repairing action (viscerocranium bone, bone). As a consequence, (lacrimal bone, bone) and (maxilla, bone) will become logically derivable (i.e., will be repaired as well).

As another example, for the missing is-a relation (lower respiratory system cartilage, cartilage) in AMA (experiment in Section 5.1.1 and Figure 4.4) a Source set of 2 elements and a Target set of 21 elements are generated, and this results in 42 possible repairing actions. Each of the repairing actions, when added to AMA, would make the missing is-a relation logically derivable from AMA. In this example a domain expert would select the more informative repairing action (respiratory system cartilage, cartilage).

Similarly to the repairing of wrong is-a relations/mappings, in Phase 3.2 we rank the is-a relations/mappings that need to be repaired with respect to the number of possible repairing actions. In Phase 3.3 a recommendation algorithm (as defined in [61] and [60]) computes for a missing is-a relation (a, b) the most informative repairing actions from Source(a, b) × Target(a, b) that are supported by domain knowledge (WordNet and UMLS).

When the selected repairing action is in Repair(a, b), the repairing action is executed, and a number of updates need to be done (Phase 3.4). First, the missing is-a relation (or mapping) is removed from MI (or MM) and the chosen repairing action is added to RI+ or RM+ depending on whether it is an is-a relation within an ontology or a mapping. In addition, new CMIs and CMMs logically derivable from the network may appear. Some other missing is-a relations or mappings may also have been repaired by repairing

the current missing is-a relation or mapping (as in the case of (lacrimal bone, bone) and (maxilla, bone) described above). Some repaired wrong is-a relations and mappings may also become logically derivable again. In other cases the possible repairing actions for wrong and missing is-a relations and mappings may change. We also need to update the knowledge bases.
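A sketch of the Source × Target construction of Figure 3.5, under the GraphKB assumption from the earlier sketches, is given below; the conflict filtering of step 4 of the figure is omitted for brevity:

def generate_repairing_actions(kb, a, b):
    # Here super-/sub-concepts are taken to include the concept itself, as in the worked example
    source = (kb.super_concepts(a) | {a}) - kb.super_concepts(b)   # more general than a, excluding super-concepts of b
    target = (kb.sub_concepts(b) | {b}) - kb.sub_concepts(a)       # more specific than b, excluding sub-concepts of a
    return {(s, t) for s in source for t in target}

kb = GraphKB([("nasal bone", "viscerocranium bone"), ("maxilla", "viscerocranium bone"),
              ("jaw", "bone")])
print(generate_repairing_actions(kb, "nasal bone", "bone"))
# includes the more informative repairing action ("viscerocranium bone", "bone")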

3.4 Algorithms in the alignment component

Subsection 3.4.1 presents the algorithms for detection (Phase 1) and validation (Phase 2) in the alignment component. Only CMMs are detected in this component since the detection is based on alignment algorithms. The repairing phase (Phase 3) for the missing mappings in this component is the same—containing the same algorithms—as the repairing phase for missing is-a relations and mappings in the other component. The process of repairing the wrong mappings is different—only those logically derivable from the network are repaired. As the others are not based on existing relations/mappings in the network, they are not repaired. The input for the alignment component consists of two taxonomies. The output is an alignment.

3.4.1 Detect and validate candidate missing mappings

As explained in Subsection 3.2.1, in ontology alignment, mapping suggestions are generated that are essentially CMMs. In Phase 1 in the alignment component we have currently used the linguistic matchers and the matchers based on auxiliary information (WordNet-based and UMLS-based) from the SAMBO system [62]. The matcher n-gram computes a similarity based on 3-grams. The matcher TermBasic uses a combination of n-gram, edit distance and an algorithm that compares the lists of words of which the terms are composed. The matcher TermWN extends TermBasic by using WordNet for looking up is-a relations. The matcher UMLSM uses the domain knowledge in UMLS to obtain similarity values. The results of the matchers are combined using a weighted-sum approach in which each matcher is given a weight and the final similarity value between a pair of concepts is the weighted sum of the similarity values divided by the sum of the weights of the used matchers. In addition, we use a single threshold for filtering. A pair of concepts is a mapping suggestion if the similarity value is equal to or higher than a given threshold value.

We note that in the alignment component the search space is not restricted to the mapped concepts only—similarity values are calculated for all pairs of concepts. KBs are initialized, in the same way as in the debugging component, for the taxonomy network and the pairs of taxonomies and their alignments. We also note that no initial alignment is needed for this component. Therefore, if alignments do not exist in the network (at all

or between specific ontologies) this component may be used before starting debugging. The CMMAlignment (mapping suggestions) are presented to a domain expert for validation (Phase 2), which is performed in the same way as in the debugging component. The domain expert can use the recommendation algorithms during the validation as well. The CMMAlignment are partitioned into two sets—wrong mappings (WM) and missing mappings (MM). As mentioned, the wrong mappings in WM which are logically derivable from the network will be repaired. The others are not based on existing relations/mappings in the network and thus they will not be repaired. However, we store them in order to avoid recomputations, to reduce the number of repairing actions, and for conflict checking/prevention. The missing mappings are repaired by adding mappings or is-a relations to the pair of ontologies and their alignment. The concepts in the missing mappings are added to the set of mapped concepts (if they are not already there), and they will be used the next time CMMs/CMIs are logically derived in the debugging component.
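The weighted-sum combination and single-threshold filtering described above can be sketched as follows; the trigram matcher here is a simple stand-in, not the SAMBO implementation:

def combine(similarities, weights):
    # weighted sum of the matcher similarity values divided by the sum of the weights
    return sum(w * s for s, w in zip(similarities, weights)) / sum(weights)

def mapping_suggestions(concepts1, concepts2, matchers, weights, threshold):
    suggestions = set()
    for c1 in concepts1:
        for c2 in concepts2:
            sim = combine([m(c1, c2) for m in matchers], weights)
            if sim >= threshold:                  # single-threshold filtering
                suggestions.add((c1, c2))
    return suggestions

def trigram_sim(a, b, n=3):                       # toy n-gram matcher: trigram set overlap
    ga = {a[i:i + n] for i in range(max(len(a) - n + 1, 1))}
    gb = {b[i:i + n] for i in range(max(len(b) - n + 1, 1))}
    return len(ga & gb) / max(len(ga | gb), 1)

print(mapping_suggestions({"nasal bone"}, {"Nasal Bone"},
                          [lambda a, b: trigram_sim(a.lower(), b.lower())], [1.0], 0.6))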

3.4.2 Repair missing and wrong mappings

Phase 3 in the alignment component uses the same algorithms as presented in Subsection 3.3.2. In the beginning the relevant KBs and sets are initialized, as shown in Figure 3.3.

Repair wrong mappings

The repairing actions for the wrong mappings that can be logically derived from the network are computed and the justifications are presented (Phase 3.1) to a domain expert. The repairing actions are ranked in Phase 3.2 and recommendations, based on the hitting sets, are generated in Phase 3.3. In Phase 3.4 the KBs and the relevant sets are updated according to the repairing actions selected by the domain expert.

Repair missing mappings

Initially, the missing mappings are added to the KBs in the same way as in the debugging component and then we try to repair them using more informative repairing actions. To repair a missing mapping, Source and Target sets are generated using the same algorithms as in the debugging component (Phase 3.1) and the repairing process continues with the same actions described for the debugging workflow (Phase 3.2 and Phase 3.3). In Phase 3.4 the repairing actions are executed analogously to those in the debugging component and their consequences are computed. Additionally, the concepts in the repairing actions are added to the set of mapped concepts (if not already there).


3.5 Interactions between the alignment component and the debugging component

The main difference between the components is in the detection phase, and this is where they complement each other. The integration of ontology alignment and ontology debugging provides additional methods for both areas. Ontology alignment can be seen as a special kind of debugging providing detection methods for modelling defects. The alignment component generates CMMs that are validated in the same way as in the debugging component. The CMMs that are validated as correct are often missing mappings that are not found by the debugging component. These may lead to new mapped concepts that are used in the debugging component. The CMMs that are validated as wrong are used to avoid unnecessary recomputations and validations.

It is also the case that the detection of missing mappings using the knowledge intrinsic to the ontology network can be seen as an alignment algorithm. In general, ontology debugging repairs the structure of the ontologies and alignments, which provides better input for the alignment algorithms. For instance, the performance of structure-based matchers (e.g., [62]) and partial-alignment-based preprocessing and filtering methods [58] depends heavily on the correctness and completeness of the is-a structure. Also, the debugging of the alignments raises their quality.

The interaction between the components produces even greater benefits when alignments do not exist at all in the network (i.e., there is no network since the ontologies are not connected). In this case, debugging of the ontologies based on knowledge that is logically derivable from the network would not be possible. However, the alignment component can be used initially to create the necessary alignments (i.e., to create the network), thus providing opportunities for debugging the ontologies, and at the same time improving/debugging the newly created alignments. This means, in practice, that our debugging approach can be used for any two ontologies in a particular domain, regardless of whether an alignment between them is available.

The different phases in and between the components can also be interleaved. This allows for an iterative and modular approach, where, for instance, some parts of the ontologies can be fully debugged and aligned before proceeding to other parts.


Chapter 4

Implemented System

This chapter presents our system RepOSE, an extension of [67]. It is based on the framework presented in Chapter 3. The extended system can be seen from three points of view—as an ontology debugging system where ontology alignment algorithms are used for detecting modelling defects, as an ontology alignment system where various possibilities for adding mappings to the final alignment are presented, and as an integrated ontology alignment and debugging system with the aforementioned advantages. Following the framework, the system has two components—the debugging component and the alignment component.

The user loads the ontologies and alignments (when available) into RepOSE. The input for the alignment component consists of two taxonomies, while a taxonomy network is required in order to run the debugging process. The output from the debugging component is the set of repaired ontologies and alignments. The output from the alignment component is an alignment. The user can detect/validate/repair defects in only one ontology or one pair of ontologies and their alignment at a time, for the reasons discussed in Chapter 3.

One way to divide the interface components in the system is into components handling is-a relations (CMIs, wrong and missing is-a relations) and corresponding components managing mappings (CMMs, wrong and missing mappings). The debugging component utilizes all interface components, since it deals with both is-a relations and mappings. The alignment component shares the interfaces related to mappings with the debugging component, since the alignment component is only concerned with mappings.

There is no predetermined order in running the components of the framework. However, if the network is not available the alignment component should be run before the debugging process to create the necessary alignments. If alignments are available, running the alignment component first may lead to extending them and thus providing additional debugging opportunities. Running the debugging component first repairs the is-a structure of the ontologies and alignments. The repaired ontologies and alignments can

be further used in the structure-based alignment algorithms.

Furthermore, the different phases—detection, validation and repairing—in and between the alignment and debugging components can be interleaved. However, currently, the user has to start with a detection phase, regardless of whether it takes place in the debugging component or in the alignment component and whether it detects CMIs or CMMs. Although the framework allows externally generated CMIs/CMMs, the system does not yet support external input.

4.1 Detect and validate candidate missing is-a relations and mappings

Subsection 4.1.1 illustrates the user interface for detecting and validating CMIs used by the debugging component. Subsection 4.1.2 presents the interface for detecting and validating CMMs shared by both framework components.

4.1.1 Detect and validate candidate missing is-a relations

The user can use the tab ‘Step 1: Generate and Validate Candidate Missing is-a Relations’ (Figure 4.1) and choose an ontology for which the CMIs are computed. The Generate Candidate Missing is-a Relations button runs the detection algorithm. The user can validate all or some of the CMIs as well as switch to another ontology or another tab. Showing all CMIs at once would lead to information overload and difficult visualization. Showing them one at a time has the disadvantage that the interactions with other is-a relations are not disclosed. Therefore, as a trade-off, we show the CMIs in groups where for each member of the group at least one of the concepts subsumes or is subsumed by a concept of another member in the group. The Show Ontology button shows the whole ontology, CMIs, CMMs and the repairing actions when needed. The CMIs are presented as a directed graph where the nodes represent concepts and the edges represent is-a relations. The grey edges are existing asserted is-a relations, the blue edges are CMIs, and the orange edge (only one at a time) denotes the currently selected CMI. When a CMI is selected, its justification in the ontology network is shown as an extra aid for the user. For instance, in Figure 4.1 (palatine bone, bone) is selected and its justifications are shown in the justifications panel. Concepts in different ontologies are presented with different background colors. The brown edges denote mappings existing in the initial alignments. Initially, CMIs are shown using edges labeled with ‘?’ (as in Figure 4.1 for (acetabulum, joint)), which the user can toggle to ‘W’ for wrong relations and ‘M’ for missing relations.


Figure 4.1: Generating and validating CMIs.

We used the recommendation algorithm from [67], described in Subsection 3.3.1, in order to facilitate the validation process. If an is-a relation is likely to be wrong according to the recommendation algorithm, the ‘?’ label is replaced by a ‘W?’ label, as for (upper jaw, jaw); if it is likely to be correct, the ‘?’ label is replaced by an ‘M?’ label, as for (elbow joint, joint). When a user decides to finalize the validation of a group of CMIs by pressing the Validate button, RepOSE checks for contradictions in the current validation as well as with previous decisions; if contradictions are found, the current validation is not allowed and a message window is shown to the user.
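The grouping of CMIs described above can be sketched as follows. This is a minimal illustration, not the RepOSE implementation: each CMI is a pair of concepts, subsumes(x, y) is assumed to answer whether x is a (derivable) superconcept of y, and two CMIs end up in the same group whenever a concept of one subsumes or is subsumed by a concept of the other (computed here with a simple union-find).

import java.util.*;
import java.util.function.BiPredicate;

final class CmiGrouper {
    // Each CMI is a String[] {subConcept, superConcept}.
    static List<List<String[]>> group(List<String[]> cmis, BiPredicate<String, String> subsumes) {
        int n = cmis.size();
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;

        // Link two CMIs if any concept of one subsumes or is subsumed by a concept of the other.
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                if (related(cmis.get(i), cmis.get(j), subsumes))
                    union(parent, i, j);

        Map<Integer, List<String[]>> groups = new LinkedHashMap<>();
        for (int i = 0; i < n; i++)
            groups.computeIfAbsent(find(parent, i), k -> new ArrayList<>()).add(cmis.get(i));
        return new ArrayList<>(groups.values());
    }

    private static boolean related(String[] a, String[] b, BiPredicate<String, String> subsumes) {
        for (String x : a)
            for (String y : b)
                if (subsumes.test(x, y) || subsumes.test(y, x)) return true;
        return false;
    }

    private static int find(int[] p, int i) { while (p[i] != i) { p[i] = p[p[i]]; i = p[i]; } return i; }
    private static void union(int[] p, int i, int j) { p[find(p, i)] = find(p, j); }
}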

4.1.2 Detect and validate candidate missing mappings

A similar tab, ‘Step 2: Generate and Validate Candidate Missing Mappings’, is used to generate and validate CMMs. First, the user chooses the pair of ontologies for which the detection is run. Then the user can select one of the two detection methods—using knowledge intrinsic to the network, i.e., the debugging component, or using alignment algorithms, i.e., the alignment component.


Figure 4.2: Aligning.

The Generate Candidate Missing Mappings button runs the detection algorithm, which uses the knowledge intrinsic to the network. The Configure and Run Alignment Algorithms button opens a configuration window (Figure 4.2) where the user can select the matchers, their weights and the threshold for the computation of the mapping suggestions. Clicking on the Run button starts the alignment process. The similarity values for all pairs of concepts belonging to the selected ontologies are computed, combined and filtered, and the resulting mapping suggestions are shown to the user for validation. The validation process continues in a manner similar to the process for the CMIs, regardless of the origin of the CMMs. During the validation a label on the edge shows the origin of the CMMs—logically derived from the network, computed by the alignment component, or both. The CMMs computed only by the alignment algorithms do not have justifications since they were not logically derived. The rest of the process is as described above. When no alignments exist or are available, this tab should be used first, in combination with the alignment algorithms, to create the necessary alignments.
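A minimal sketch of the compute–combine–filter step described above is given below. It assumes a weighted-average combination of the matchers’ similarity values followed by single-threshold filtering; the interface and class names are illustrative and are not taken from the system.

import java.util.*;

final class SuggestionComputer {
    interface Matcher { double similarity(String conceptA, String conceptB); }

    static List<String[]> suggest(List<String> ontologyA, List<String> ontologyB,
                                  Map<Matcher, Double> matchersWithWeights, double threshold) {
        double weightSum = 0.0;
        for (double w : matchersWithWeights.values()) weightSum += w;

        List<String[]> suggestions = new ArrayList<>();
        for (String a : ontologyA) {
            for (String b : ontologyB) {
                double combined = 0.0;
                for (Map.Entry<Matcher, Double> e : matchersWithWeights.entrySet())
                    combined += e.getValue() * e.getKey().similarity(a, b);
                combined /= weightSum;            // weighted average of the matchers
                if (combined >= threshold)        // single-threshold filtering
                    suggestions.add(new String[] { a, b });
            }
        }
        return suggestions;
    }
}

For example, with the configuration used in Run I of the OAEI Anatomy 2011 experiment in Section 5.2 (two matchers with weight 1 each and threshold 0.5), a pair of concepts becomes a mapping suggestion when the average of the two similarity values is at least 0.5.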


Figure 4.3: Repairing wrong is-a relations.

4.2 Repair missing and wrong is-a relations and mappings

After the detection and validation phases the CMIs and CMMs are divided into wrong and missing is-a relations and mappings. Subsection 4.2.1 presents the user interface for repairing wrong is-a relations and mappings while Subsection 4.2.2 presents the user interface for repairing missing is-a relations and mappings.

4.2.1 Repair wrong is-a relations and mappings

Figure 4.3 shows the RepOSE tab (‘Step 3: Repair Wrong is-a Relations’) for repairing wrong is-a relations. Clicking on the Generate Repairing Actions button results in the computation of repairing actions for each wrong is-a relation of the ontology under repair. The algorithm for these computations is presented in Subsection 3.3.2. The wrong is-a relations are then ranked in ascending order according to the number of possible repairing actions and shown in a drop-down list. Then, the user can select a wrong is-a relation and repair it using an interactive display. The user can choose to repair all wrong is-a relations in groups or one by one. The display shows a directed graph representing the

justifications. The nodes represent concepts. As mentioned before, concepts in different ontologies are presented with different background colors. The concepts in the is-a relation under repair are shown in red. The edges represent is-a relations in the justifications. These is-a relations may be existing asserted is-a relations (shown in grey), existing asserted mappings (shown in brown), unrepaired missing is-a relations/mappings (shown in blue) and the added repairing actions for the repaired missing is-a relations/mappings (shown in black). For the wrong is-a relations under repair, the user can choose, by clicking, multiple existing asserted is-a relations and mappings on the display as repairing actions and click the Repair button. RepOSE ensures that only existing asserted is-a relations and mappings are selectable, and when the user finalizes the repair decision, RepOSE ensures that the wrong is-a relations under repair and every selected is-a relation and mapping will not be logically derivable from the ontology network after the repairing. Additionally, all consequences of the repair are computed (such as changes in the repairing actions of other is-a relations and mappings and changes in the lists of wrong and missing is-a relations and mappings). In Figure 4.3 the user has chosen to repair several wrong is-a relations at the same time, i.e., (brain grey matter, white matter), (cerebellum white matter, brain grey matter), and (cerebral white matter, brain grey matter). In this example1 we can repair these wrong is-a relations by removing the mappings between brain grey matter and Brain White Matter. We note that, when removing these mappings, all these wrong is-a relations will be repaired at the same time. During the repairing, the user can choose to use the recommendation feature, described in Subsection 3.3.2, by enabling the Show Recommendation check box. In the example in Figure 4.3 the highest priority (indicated by pink labels marked ‘Pn’, where n reflects the priority ranking) is given to the mapping (Brain White Matter, brain grey matter), as this is the only way to repair more than one wrong is-a relation at the same time. (Both (cerebellum white matter, brain grey matter) and (cerebral white matter, brain grey matter) would be repaired.) Upon the selection of a repairing action, the recommendations are recalculated and the labels are updated. As long as there are labels, more repairing actions need to be chosen. A similar tab (‘Step 4: Repair Wrong Mappings’) is used for repairing wrong mappings.
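The safety condition enforced after repairing a wrong is-a relation can be approximated with a simple reachability test, as in the sketch below. It is only an illustration (not the RepOSE algorithm): asserted is-a relations and mappings are treated as directed edges (an equivalence mapping contributing an edge in both directions), and a relation counts as derivable if its second concept is reachable from its first; after the selected edges are removed, neither the wrong relation nor any removed edge may remain derivable.

import java.util.*;

final class DerivabilityCheck {
    // edges: concept -> directly asserted superconcepts
    // (asserted is-a relations and mappings merged into one edge map)
    static boolean isDerivable(Map<String, Set<String>> edges, String sub, String sup) {
        Deque<String> stack = new ArrayDeque<>();
        Set<String> visited = new HashSet<>();
        stack.push(sub);
        while (!stack.isEmpty()) {
            String c = stack.pop();
            if (c.equals(sup)) return true;        // found a derivation path
            if (!visited.add(c)) continue;          // already explored this concept
            for (String parent : edges.getOrDefault(c, Collections.emptySet()))
                stack.push(parent);
        }
        return false;
    }
}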

4.2.2 Repair missing is-a relations and mappings

Figure 4.4 shows the RepOSE tab (‘Step 5: Repair Missing is-a Relations’) for repairing missing is-a relations. Clicking on the Generate Repairing Actions button results in the computation of repairing actions for the missing is-a relations of the ontology under repair.

1 From OAEI 2010 Anatomy.


Figure 4.4: Repairing missing is-a relations.

The algorithm for these computations is presented in Subsection 3.3.2. The repairing actions are shown to the user as Source and Target sets (instead of Repair) for easy visualization. Once the Source and Target sets are computed, the missing is-a relations are ranked with respect to the number of possible repairing actions. The first missing is-a relation in the list has the fewest possible repairing actions, and may therefore be a good starting point. When the user chooses a missing is-a relation, its Source and Target sets are displayed on the left and right, respectively, within the Repairing Actions panel (Figure 4.4). Both have zoom control and can be opened in a separate window. Similarly to the displays for wrong is-a relations and mappings, concepts in the missing is-a relations are highlighted in red, existing asserted is-a relations are shown in grey, unrepaired missing is-a relations in blue and added repairing actions for the missing is-a relations in black. For instance, Figure 4.4 shows the Source and Target sets for the missing is-a relation (lower respiratory tract cartilage, cartilage), which contain 2 and 21 concepts, respectively. The Target panel also shows the unrepaired missing is-a relation (nasal septum, nasal cartilage). The Justifications of current relation panel is a read-only panel that displays the justifications of the current missing is-a relation as an extra aid. For the selected missing is-a relation, the user can also ask for

recommended repairing actions by clicking the Recommend button. In general, the system presents a list of recommendations. By selecting an element in the list, the concepts in the recommended repairing action are identified by round boxes in the panels. For instance, for the case in Figure 4.4, the recommendation algorithm proposes to add (respiratory system cartilage, cartilage). Using the recommendation algorithm we recommend structural repairs that use repairing actions that are as informative as possible (pref2 in Subsection 3.2.2). The user can repair the missing is-a relation by selecting a concept in the Source panel and a concept in the Target panel and clicking on the Repair button. When the selected repairing action is not in Repair(a, b), the repairing is not allowed and a message window is shown to the user. Additionally, all consequences of a chosen repair are computed (such as changes in the repairing actions of other is-a relations and mappings and changes in the lists of wrong and missing is-a relations and mappings). The tab ‘Step 6: Repair Missing Mappings’ is used for repairing missing mappings. The main differences between this tab and the one for repairing missing is-a relations are that we deal with two ontologies and their alignment, and that the repairing actions can be is-a relations within an ontology as well as mappings. The missing mappings found by the alignment component, but not by the debugging component, do not have justifications.
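Why a repairing action from Source(a, b) × Target(a, b) repairs the missing is-a relation (a, b) can be illustrated with the sketch below. It is a simplification of the definitions in Chapter 3: any x with a is-a x already derivable and any y with y is-a b already derivable gives a candidate action (x, y), since adding x is-a y then makes a is-a b derivable; the additional filtering in Repair(a, b) that avoids introducing incorrect relations is omitted here.

import java.util.*;

final class RepairSets {
    // a itself plus everything reachable upwards from a (candidate source concepts)
    static Set<String> source(Map<String, Set<String>> superEdges, String a) {
        return closure(superEdges, a);
    }

    // b itself plus everything reachable downwards from b (candidate target concepts)
    static Set<String> target(Map<String, Set<String>> subEdges, String b) {
        return closure(subEdges, b);
    }

    // A chosen repairing action (x, y) must come from Source x Target.
    static boolean isAllowedRepair(Set<String> source, Set<String> target, String x, String y) {
        return source.contains(x) && target.contains(y);
    }

    private static Set<String> closure(Map<String, Set<String>> edges, String start) {
        Set<String> result = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            String c = stack.pop();
            if (result.add(c))
                for (String next : edges.getOrDefault(c, Collections.emptySet()))
                    stack.push(next);
        }
        return result;
    }
}

Choosing a more general x and/or a more specific y gives a more informative repairing action, as in the (respiratory system cartilage, cartilage) recommendation above: the added relation then entails the original missing is-a relation and possibly others.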

Chapter 5

Experiments and Discussions

Several experiments were performed with our implemented system. This chapter presents them together with our experiences and reflections on their results. Using the experiments we not only demonstrate the benefits of our unified approach for ontology alignment and debugging, but we also show the essential need for a system during this process. Without a dedicated system, reliable alignment and debugging is tedious if not infeasible, especially for large ontologies. Section 5.1 presents in detail one experiment focused only on debugging of an ontology network. Section 5.2 presents three experiments exploring the advantages of the integration of ontology alignment and debugging. Each experiment is followed by a subsection that discusses it and its results. A general discussion in Section 5.3 summarizes the experiments and provides general reflections on the approach and the system.

5.1 Ontology debugging

The experiment presented in the next subsection employs only the detection algorithms in the debugging component, i.e., only knowledge intrinsic to the network.

5.1.1 OAEI Anatomy 2010

Experiment setup

In this experiment a domain expert ran a complete debugging session on a network consisting of the two ontologies and the alignment from the Anatomy track in OAEI 2010—the Adult Mouse Anatomy Dictionary (AMA), the NCI Thesaurus anatomy (NCI-A) and the partial reference alignment


            concepts   asserted          asserted               asserted
                       is-a relations    equivalence mappings   is-a mappings
AMA         2744       1807              -                      -
NCI-A       3304       3761              -                      -
Alignment   -          -                 986                    1

Table 5.1: Ontology debugging: OAEI Anatomy 2010—ontologies and alignment.

            candidate missing:   missing   wrong   added:               removed:
            all/non-redundant                      is-a relations/      is-a relations/
                                                   more informative     mappings
AMA         200/123              102       21      85/22                13/-
NCI-A       127/80               61        19      57/8                 12/-
Alignment   -                    -         -       -                    -/12

Table 5.2: Ontology debugging: OAEI Anatomy 2010—final result.

(PRA). These ontologies as well as the alignment were developed by domain experts. For the 2010 version of OAEI, AMA contains 2,744 concepts and 1,807 asserted is-a relations, while NCI-A contains 3,304 concepts and 3,761 asserted is-a relations. The alignment contains 986 equivalence and 1 subsumption mapping between AMA and NCI-A. This information is summarized in Table 5.1. The experiment was performed on an Intel Core i7-950 Processor 3.07GHz with 6 GB DDR2 memory under the Windows 7 Ultimate operating system and Java 1.7 compiler. The domain expert completed debugging this network within 2 days. Since the system provided nearly immediate response in most cases, much of this time was spent making decisions for validation and repairing (essentially looking up and analyzing information to make decisions) and on interactions with RepOSE.

Results

Table 5.2 summarizes the results of the detection and repairing of defects in the is-a structures of the ontologies and the mappings. The system detected 200 CMIs1 in AMA of which 123 were non-redundant. Of these non-redundant CMIs 102 were validated as missing is-a relations and 21 were validated as wrong is-a relations. For NCI-A 127 CMIs, of which 80 non-redundant, were detected. Of these non-redundant CMIs 61 were validated as missing is-a relations and 19 were validated as wrong is-a relations. To repair these defects 85 is-a relations were added to AMA and 57 to NCI-A, 13 is-a relations were removed from AMA and 12 from NCI-A, and

1 As was explained earlier in Subsection 3.3.1, in order to detect CMMs with the debugging component at least three ontologies and two alignments are needed. Since the network in this example contains only two ontologies and their alignment, CMMs cannot be detected.


         CMI missing:    CMI wrong:      repair missing:
         accept/reject   accept/reject   accept/reject
AMA      81/8            7/13            69/16
NCI-A    27/2            6/2             43/14

Table 5.3: Ontology debugging: OAEI Anatomy 2010—recommendations.

12 mappings were removed from the alignment. In 22 cases in AMA and 8 cases in NCI-A a missing is-a relation was repaired using a more informative repairing action, thereby adding new knowledge to the network.

The ranking and recommendations seemed useful. Table 5.3 summarizes the recommendation results. Regarding CMIs, 81 and 27 recommendations that the relation should be validated as a missing is-a relation were accepted for AMA and NCI-A, respectively, while 8 and 2 were rejected. When the system recommended that a CMI should be validated as a wrong is-a relation, the recommendation was accepted in 7 out of 20 cases for AMA and 6 out of 8 cases for NCI-A. The recommendations regarding repairing missing is-a relations were accepted in 69 out of 85 cases for AMA and 43 out of 57 cases for NCI-A. We note that the system may not always give a recommendation. This is the case, for instance, when there is no information about the is-a relation under consideration in the external sources.

In the remainder of this subsection we discuss the experimental session, the results, and our experience with the system in more detail.

Detecting and validating candidate missing is-a relations for the first time. After loading AMA, NCI-A and the alignment, it took less than 30 seconds to detect all CMIs for each of the ontologies. As a result, RepOSE found 192 CMIs in AMA and 122 in NCI-A. Among these CMIs, 115 in AMA and 75 in NCI-A are displayed in 24 groups and 18 groups, respectively, for validation, while the remaining 77 in AMA and 47 in NCI-A are redundant and thus ignored. With the help of the recommendations, the domain expert identified 20 wrong is-a relations and 95 missing is-a relations in AMA. For NCI-A the domain expert identified 17 wrong and 58 missing is-a relations. These results are summarized in Table 5.4. As for the recommendation, the use of asserted part-of relations in ontologies together with WordNet recommended 20 possible wrong is-a relations in AMA and 8 in NCI-A, of which 7 in AMA and 6 in NCI-A were accepted as decisions. WordNet and UMLS recommended 84 possible missing is-a relations in AMA and 29 in NCI-A, of which 77 in AMA and 27 in NCI-A were accepted as decisions.

Repairing wrong is-a relations for the first time. After the validation phase, the domain expert continued with the repairing of wrong is-a relations. In this experiment, for the 20 wrong is-a relations in AMA and 17 in NCI-A, each wrong is-a relation has only one justification, consisting of two or more mappings and one or more asserted is-a relations in the other ontology. Therefore, the repairing is done by removing the involved asserted is-a relations and/or mappings (Table 5.4). For example, for the wrong is-a


            candidate missing:   missing   wrong   repair wrong:   repair missing:
            all/non-redundant                      removed         self/more informative/other
AMA         192/115              95        20      12              59/19/17
NCI-A       122/75               58        17      11              49/5/4
Alignment   -                    -         -       11              -

Table 5.4: Ontology debugging: OAEI Anatomy 2010—first iteration results.

relation (Ascending Colon, Colon) in NCI-A (which actually is a part-of relation), its justification contains two equivalence mappings (between Ascending Colon and ascending colon, and between Colon and colon) and an asserted is-a relation (ascending colon, colon) in AMA. The repairing was done by removing (ascending colon, colon) from AMA. As shown before in Figure 4.3 in Subsection 4.2.1, the wrong is-a relation (brain grey matter, white matter) in AMA was repaired by removing the mappings between Brain White Matter and brain grey matter. We note that 11 mappings were removed, 8 of them as a result of wrong is-a relations in AMA and 3 as a result of the debugging of NCI-A. Additionally, several wrong is-a relations were repaired by repairing other wrong is-a relations.

Repairing missing is-a relations in AMA and NCI-A for the first time. As the next step, the domain expert proceeded with the repairing of missing is-a relations in AMA. At this point there were 95 missing is-a relations to repair, and it took less than 10 seconds to generate the repairing actions for them. Almost all Source and Target sets were small enough to allow a good visualization. For 59 missing is-a relations, the domain expert used the missing is-a relation itself as the repairing action (i.e., the least informative repairing action). For 19 missing is-a relations, the domain expert used more informative repairing actions, which also repaired 17 other missing is-a relations. These results are summarized in the last column of Table 5.4. The recommendation algorithm was used in 78 cases. In 63 of them the selected repairing action was among the recommended repairing actions and in 9 of them the recommendation algorithm suggested more informative repairing actions. The domain expert then continued with the repairing of missing is-a relations in NCI-A. Out of the 58 missing is-a relations to be repaired, 49 missing is-a relations were repaired using themselves as the repairing actions, 5 were repaired using more informative repairing actions, and 4 were repaired by the repairing of others (Table 5.4). For example, for the repairing of the missing is-a relation (Epiglottic Cartilage, Laryngeal Connective Tissue) in NCI-A, the domain expert used the more informative repairing action (Laryngeal Cartilage, Laryngeal Connective Tissue), where Laryngeal Cartilage is a super-concept of Epiglottic Cartilage in NCI-A. This repairing also repaired


3 other missing is-a relations, i.e., (Cricoid Cartilage, Laryngeal Connective Tissue), (Arytenoid Cartilage, Laryngeal Connective Tissue) and (Thyroid Cartilage, Laryngeal Connective Tissue), where Cricoid Cartilage, Arytenoid Cartilage and Thyroid Cartilage are sub-concepts of Laryngeal Cartilage in NCI-A. The recommendation algorithm was used in 54 cases. In 42 of them the selected repairing action was among the recommended repairing actions and in 3 of them the recommendation algorithm suggested more informative repairing actions. The subsequent debugging process. The repairing of the wrong and the missing is-a relations in both ontologies resulted in 6 non-redundant new CMIs in AMA and 4 in NCI-A. In each ontology 1 of those was validated as wrong and the others as missing. 2 of the 5 missing is-a relations in AMA were repaired by themselves and 3 using more informative repairing actions. The wrong is-a relation was repaired by removing an is-a relation in NCI-A. The 3 missing is-a relations in NCI-A were repaired by using more informative repairing actions. The wrong is-a relation was repaired by removing a mapping from the alignment. The repairing of these newly found relations led to two more CMIs in AMA, which were validated as correct and repaired by themselves, and one CMI in NCI-A, which was validated as wrong and repaired by removing an is-a relation in AMA. At this point there were no more CMIs to validate, and no more wrong or missing is-a relations to repair.

Discussion

Apart from showing the need for ontology debugging, this experiment highlights the benefits of our system during the detection phase, where manual detection is out of the question. Even if we assume that all asserted is-a relations and mappings in the network (more than 6000) can be checked manually in order to find the wrong ones, this is simply infeasible for the missing is-a relations and mappings (where n(n−1)/2 pairs2 should be checked in order to find all missing is-a relations and mappings in the network). Our approach took around 30 seconds and explores the domain knowledge intrinsic to the network. To reduce the number of CMIs for validation and to show them in their context, the CMIs are presented to the domain expert in groups where the redundant ones are excluded. Furthermore, our system provides support for the domain expert during the repairing phase—calculating and presenting the justifications of the wrong is-a relations and calculating and presenting the possible repairing actions for the missing is-a relations. A trivial way to repair a missing is-a relation is to add it to the ontology (i.e., the least informative repairing action). However, our system calculates all possible repairing actions for a missing is-a relation and thus provides the domain expert with the possibility of adding different repairing actions (i.e., more informative as explained

2 n is the number of concepts in the ontology network.

in Subsection 3.2.2). We observe that during this experiment for 19 missing is-a relations in AMA and 5 in NCI-A, the domain expert has used repairing actions that are more informative than the missing is-a relation itself. This means that for each of these the domain expert has added knowledge that was not intrinsic to (i.e., logically derivable from) the network. Thus, the knowledge represented by the ontologies and the network has increased. Our system also calculates the consequences of user actions and keeps track of them. If the user actions contradict themselves or previous user actions, a warning message describing the contradiction appears.
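To make the scale of the manual task concrete, using the concept counts from Table 5.1: n = 2744 + 3304 = 6048 concepts in the network, so n(n−1)/2 = (6048 · 6047)/2 = 18,286,128 pairs—roughly 18 million pairs that would have to be inspected to find all missing is-a relations and mappings by hand, compared to the roughly 6,000 asserted is-a relations and mappings mentioned above.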

5.2 Integration of ontology debugging and ontology alignment

This section presents three experiments showing the benefits of integrating ontology alignment and debugging. Each experiment consists of several smaller experiments (called runs in the text) focusing on different aspects of the integration. Each experiment is presented with its setup, a detailed description of the different runs, an explanation of each of the iterations in the runs and a follow-up discussion. Both components are used in all experiments presented in this section. Subsections 5.2.1 and 5.2.2 present experiments with the ontologies from the Anatomy track in OAEI 2011 and the Benchmark track in OAEI 2010, respectively. Subsection 5.2.3 presents a use case carried out together with the Swedish National Food Agency3. In this collaboration we applied our approach to the ontology they have developed—ToxOntology—and MeSH [6].

5.2.1 OAEI Anatomy 2011

Experiment setup

This experiment consists of three runs where each run is a complete experiment on its own and demonstrates different use cases of our system. As input for Runs I and II we used the two ontologies from the Anatomy track of OAEI 2011—AMA contains 2,737 concepts and 1,807 asserted is-a relations, and NCI-A contains 3,298 concepts and 3,761 asserted is-a relations. The input for the last run contained the reference alignment (1516 equivalence mappings between AMA and NCI-A) along with the two ontologies. The reference alignment was used indirectly as external knowledge during the validation phase in the first two runs. The runs were performed on an Intel Core i7-2620M Processor 2.7GHz with 4 GB memory under the Windows 7 Professional operating system and Java 1.7 compiler.

3 Livsmedelsverket—slv.se


            candidate   missing:   wrong:    repair missing:        repair missing:
            missing     ≡/←,→      ≡/←,→     ≡/←/→/derivable/       is-a relations
            mappings                         more informative
Alignment   1384        1286/39    59/39     1286/21/8/5/5          -
AMA         -           -          -         -                      3
NCI-A       -           -          -         -                      2

Table 5.5: Ontology alignment and debugging: OAEI Anatomy 2011—Run I results—debugging of the alignment.

Run I

The first run demonstrates a complete debugging and alignment session where the input is a set comprised of the two ontologies. Since a network did not exist, we first employed the alignment component—after loading the ontologies, mapping suggestions were computed using the matchers TermWN and UMLSM, with weight 1 for both and threshold 0.5. This resulted in 1384 mapping suggestions. The 1233 mapping suggestions that are also in the reference alignment were validated as missing equivalence mappings (although, as we will see, there are defects in the reference alignment) and repaired by adding them to the alignment. The others were validated manually and resulted in missing mappings (53 equivalence and 39 is-a) and wrong mappings (59 equivalence and 39 is-a). These missing mappings were repaired by adding 53 equivalence and 29 is-a mappings (5 of them more informative is-a mappings) and 5 is-a relations (3 to AMA and 2 to NCI-A). 5 of these missing mappings were repaired by repairing others. Among the wrong mappings there were 3 that were logically derivable in the network. These were repaired by removing 2 is-a relations from NCI-A. Table 5.5 summarizes the results. This sequence of actions can be considered a procedure for debugging missing mappings. The generated alignment was then used in the debugging of the network created by the ontologies and the alignment. Two iterations with the debugging component were performed, since the repairing of wrong and missing is-a relations in the first iteration led to the detection of new CMIs which had to be validated and repaired. Over 90% of the CMIs for both ontologies were detected during the first iteration; the detection of CMIs took less than 30 seconds per ontology. Table 5.6 summarizes the results. In total the system detected 263 non-redundant (410 in total) CMIs for AMA and 183 non-redundant (355 in total) CMIs for NCI-A. The non-redundant CMIs were displayed in groups, 45 groups for AMA and 31 for NCI-A. Among the 263 non-redundant CMIs in AMA 224 were validated as correct and 39 as wrong. In NCI-A 166 were validated as correct and 17 as wrong. The 39 wrong is-a relations in AMA were repaired by removing 30 is-a relations from NCI-A, and 8 equivalence and 1 is-a mapping from the alignment. The 17 wrong is-a relations in NCI-A were repaired by removing


            candidate missing:   missing   wrong   repair wrong:   repair missing:
            all/non-redundant                      removed         self/more informative/other
AMA         410/263              224       39      30              144/57/23
NCI-A       355/183              166       17      17              127/13/26
Alignment   -                    -         -       8 ≡ and 1 →     -

Table 5.6: Ontology alignment and debugging: OAEI Anatomy 2011—Run I results—debugging of the ontologies.

17 is-a relations in AMA. The missing is-a relations in AMA were repaired by adding 201 is-a relations—in 144 cases the missing is-a relation itself and in 57 cases a more informative is-a relation. 23 of the 224 missing is-a relations became logically derivable after repairing some of the others. To repair the missing is-a relations in NCI-A, 140 is-a relations were added—in 127 cases the missing is-a relation itself and in 13 cases a more informative is-a relation. 26 of the 166 missing is-a relations were repaired as a consequence of repairing other is-a relations. We observe that for 57 missing is-a relations in AMA and 13 in NCI-A the repairing actions are more informative than the missing is-a relation itself. This means that for each of these, knowledge that was not logically derivable from the network before was added to it. Thus, the knowledge represented by the ontologies and the network has increased.

Run II

For this run the alignment process was carried out twice and at the end the alignments were compared. This run used the same matchers, weights and threshold as in Run I. During both runs of the alignment process the CMMs (mapping suggestions) were computed and validated in the same manner; up to this point the results for Run II are the same as the respective results in Run I and they can be seen in the first three columns in Table 5.5. The difference between the two runs is in the repairing phase. When the alignment process was carried out for the first time, the missing mappings were repaired by directly adding them to the final alignment without benefiting from the repairing algorithms, in the same way most alignment systems do. The final alignment contained 1286 equivalence and 39 is-a mappings4. During the repairing phase, when the alignment process was carried out for the second time, the debugging component was used to provide alternative repairing actions to those available in the initial set of mapping suggestions. The results can be seen in the last two columns in Table 5.5.

4 Five of these are repaired in the second run by adding is-a relations in the ontologies.


The final alignment then contained 1286 equivalence mappings from the mapping suggestions, 24 is-a mappings from the mapping suggestions and 5 more informative is-a mappings, thus adding knowledge to the network. 5 further mapping suggestions were repaired by adding is-a relations (3 in AMA and 2 in NCI-A), thus adding more knowledge to each of the ontologies. 5 more mapping suggestions became logically derivable from the network as a result of the repairing actions for other CMMs.

Run III

In this run the detection phase with the debugging component was carried out twice and the detected CMIs were compared between the runs. The input for the first run was the set of the two ontologies and their alignment from the Anatomy track in OAEI 2011. The network was loaded in the system and the CMIs were detected. 496 CMIs were detected for AMA, of which 280 were non-redundant. For NCI-A 365 CMIs were detected, of which 193 were non-redundant. The same input was used in the second run. However, the alignment algorithms were used to extend the set with mappings prior to generating the CMIs. The set-up for the aligning was the same as in Run I and the mapping suggestions were computed, validated and repaired in the same way as well. Then CMIs were generated—638 CMIs were detected for AMA, of which 357 were non-redundant, and 460 CMIs for NCI-A, of which 234 were non-redundant. In total 145 new CMIs were detected for AMA—120 were validated as missing and 25 as wrong5. For NCI-A 103 new CMIs were detected—53 were validated as missing and 50 as wrong.

Discussion

Run I shows the usefulness of the system through a complete session where an alignment was generated and many defects in the ontologies were repaired. Some of the repairs added new knowledge. As a side effect, we have shown that the ontologies that are used by the OAEI contain over 200 and 150 missing is-a relations, respectively, and 39 and 17 wrong is-a relations, respectively. We have also shown that the alignment is not complete and contains incorrect information. We also note that our system allows validation and allows a domain expert to distinguish between equivalence and is-a mappings. Most ontology alignment systems do not support this. Run II shows the advantages for ontology alignment when a debugging component is added. The debugging component allowed more informative mappings to be added and reduced redundancy in the alignment, as well as debugging the ontologies leading to further reduced redundancy in the

5 The sum of the newly generated CMIs and those in the first run is not equal to the number of CMIs in the second run because some of the CMIs generated in the first run are logically derivable in the second run.


Ontologies and   concepts   asserted          asserted               asserted
Alignments                  is-a relations    equivalence mappings   is-a mappings
Ontologies:
  101            36         25                -                      -
  301            15         16                -                      -
  302            13         11                -                      -
  303            56         47                -                      -
  304            39         31                -                      -
Alignments:
  101 - 301      -          -                 14                     8
  101 - 302      -          -                 11                     12
  101 - 303      -          -                 16                     2
  101 - 304      -          -                 28                     2

Table 5.7: Ontology alignment and debugging: OAEI Benchmark 2010—ontologies and alignments.

alignment. New knowledge was added that had not been found when only aligning. In general, this results in higher quality alignments and ontologies. Run III shows that the debugging process can take advantage of the alignment component even when an alignment is available. The alignment algorithms can provide additional mapping suggestions thus extending the alignment. More mappings between two ontologies means higher coverage and possibly more defects detected and repaired. In the experiment more than 100 CMIs (of which many were correct) were detected for each ontology using the extended set of mappings. We also note that the initial alignment contained many mappings (1516). In the case that an alignment contains fewer mappings the benefit to the debugging process will be even more significant.

5.2.2 OAEI Benchmark 2010

Experiment setup

This subsection presents an experiment that consists of two parts (runs) performed on a taxonomy network from the Benchmark track in the Ontology Alignment Evaluation Initiative 2010. As in the previous subsection, each run can be considered an experiment on its own. Details regarding the network are available in Table 5.7. The network consists of 5 small ontologies connected in a star layout through four sets of mappings, i.e., alignments do not exist between all pairs of ontologies. The five ontologies are called 101, 301, 302, 303 and 304. They contain 36, 15, 13, 56 and 39 concepts and 25, 16, 11, 47 and 31 asserted is-a relations, respectively. Alignments are only available between 101-301, 101-302, 101-303 and 101-304 and contain 22, 23, 18 and 30 mappings, respectively. The experiment was performed on an Intel Core i7-2620M Processor 2.70GHz with 4 GB memory, running


Ontologies     candidate missing:   missing/    wrong/      added:             removed:
and            all/non-redundant    derivable   repaired    is-a relations/    is-a relations/
Alignments                          after       by others   mappings/          mappings
                                                            more informative
Ontologies:
  101          7/7                  2/-         5/-         2/-/-              1/-
  301          1/1                  1/-         -/-         1/-/-              -/-
  302          1/1                  1/-         -/-         1/-/-              -/-
  303          1/1                  1/-         -/-         1/-/-              -/-
  304          8/7                  6/-         1/-         6/-/3              5/-
Alignments:
  101 - 301    -/-                  -/-         -/-         -/-/-              -/-
  101 - 302    -/-                  -/-         -/-         -/-/-              -/5
  101 - 303    -/-                  -/-         -/-         -/-/-              -/1
  101 - 304    1/1                  1/-         -/-         -/1/-              -/3
  301 - 302    60/28                25/4        3/1         -/21/-             -/-
  301 - 303    57/38                38/11       -/-         -/27/-             -/-
  301 - 304    71/37                36/10       1/-         -/26/1             -/-
  302 - 303    61/28                25/4        3/3         -/21/-             -/-
  302 - 304    78/28                26/5        2/1         -/21/1             -/-
  303 - 304    74/40                39/13       1/-         -/26/1             -/-

Table 5.8: Ontology alignment and debugging: OAEI Benchmark 2010—Run I—final result.

the Windows 7 Professional operating system and Java 1.7 compiler. Each experiment took around two and a half hours. Run I presents a complete debugging session and it is compared with Run II, which presents a session that combines ontology alignment and debugging. Both runs contain five iterations that are described in detail. Their results are compared and discussed at the end of the subsection.

Run I

This run demonstrates a complete debugging session on the network. Five iterations were needed to complete the session—three for detection, validation and repairing of the CMIs and two for the CMMs. This subsection presents the iterations one by one. The summarized results from all iterations in Run I are at the beginning of the Discussion subsection. Most of the CMIs and CMMs were detected during the first detection (the first iteration during the experiment for the CMIs and the third for the CMMs). Table 5.8 presents the final results from this experiment. CMIs were detected, validated and repaired for each ontology during the first iteration. Their repairing actions led to the detection of a few more CMIs during the second iteration. They were validated and repaired as

well. During the two iterations 15 non-redundant CMIs (16 in total) were detected. 9 of the non-redundant CMIs were validated as missing and the remaining 6 as wrong is-a relations. The wrong is-a relations were repaired by removing 4 mappings and 4 is-a relations from the network. The missing is-a relations were repaired by adding is-a relations to the respective ontologies. In 2 cases the added is-a relations were more informative than the missing is-a relations under repair. CMMs were detected only with the debugging component during the third iteration. The system derived CMMs for all pairs of ontologies for which alignments were not available. No CMMs were detected for the available alignments since one of the ontologies participated in all alignments, and for detecting CMMs from the network at least one alignment where this ontology does not participate was required. 198 non-redundant CMMs were detected and 189 of them were validated as correct. The other 9 were validated as wrong and repaired by removing 4 existing mappings and 2 is-a relations in total. Some of the wrong mappings were repaired by the repairing actions for other wrong mappings. These are shown in the fourth column in Table 5.8, under the heading ‘repaired by others’. The missing mappings were added to the corresponding alignments in 142 cases. In 47 cases the missing mappings became logically derivable after other missing mappings were repaired (the ‘derivable after’ label in the third column in Table 5.8). In 3 cases the added mappings were more informative than the missing mappings themselves. After all CMMs were repaired the system detected two more CMIs (fourth iteration), both validated as correct. One of them was repaired by adding it to the corresponding ontology and the other was repaired by a more informative repairing action. Then CMMs were generated and validated again (fifth iteration), which resulted in 1 correct and 1 wrong CMM. The correct one was repaired by adding it and the wrong one was repaired by removing a mapping. The debugging session ended at that point since no more CMIs and CMMs were detected and those previously detected had already been repaired.

Run II

In this run CMMs were detected initially with the alignment component and then with the debugging component. The final results are summarized in Table 5.9. Five iterations were performed in this experiment as well. Since the alignment component is only involved in the CMM detection, the first two iterations for detecting and repairing CMIs were the same as in Run I. This subsection presents the iterations one by one. The summarized results from all iterations in Run II are at the beginning of the Discussion subsection. In the third iteration CMMs were detected not with the debugging component but with the alignment component instead. We used the TermWN matcher with threshold 0.5 and weight 1. Mapping suggestions were


Ontologies     candidate missing:   missing/           wrong/      added:             removed:
and            all/non-redundant    derivable before/  repaired    is-a relations/    is-a relations/
Alignments                          derivable after    by others   mappings/          mappings
                                                                   more informative
Ontologies:
  101          6/6                  1/-/-              5/-         1/-/-              -/-
  301          1/1                  1/-/-              -/-         1/-/-              -/-
  302          1/1                  1/-/-              -/-         1/-/-              -/-
  303          1/1                  1/-/-              -/-         1/-/-              -/-
  304          7/6                  5/-/-              1/-         5/-/2              4/-
Alignments:
  101 - 301    16/-                 2/2/0              14/-        -/-/-              -/-
  101 - 302    17/-                 2/1/0              15/-        -/1/-              -/5
  101 - 303    34/-                 4/2/0              30/-        -/2/-              -/2
  101 - 304    43/-                 8/4/0              35/-        -/4/-              -/3
  301 - 302    33/-                 22/0/0             11/-        -/22/-             -/-
  301 - 303    45/-                 31/0/3             14/-        -/28/-             -/-
  301 - 304    50/-                 30/0/2             20/-        -/28/-             -/-
  302 - 303    44/-                 27/0/3             17/3        -/24/1             -/-
  302 - 304    49/-                 28/0/3             21/1        -/25/-             -/-
  303 - 304    84/-                 47/0/9             37/-        -/38/1             -/-

Table 5.9: Ontology alignment and debugging: OAEI Benchmark 2010—Run II—final result.

generated for all pairs of ontologies. Every mapping suggestion is presented to the user as two is-a relations with opposite directions. All mapping suggestions were shown to the user although some of them were logically derivable from other suggestions in the network, i.e., they were redundant. The redundant CMMs were shown as well since their derivation paths contain non-validated (possibly wrong) is-a relations/mappings, which could be removed later during the repairing phase (if validated as wrong), and thus the logically derivable ones would no longer be derivable. During this experiment CMMs for the pairs of ontologies for which the alignments were available in advance were found as well. Most of them (93 out of 107) were validated as wrong. 14 were validated as correct. The mapping suggestions validated as wrong were only stored for future validations and were not repaired since they did not actually exist in the network (if they are logically derivable from the network they should be repaired as well). 5 out of the 14 validated as correct were repaired by adding them while the remaining 9 were already logically derivable from the pair of ontologies under repair and its alignment (the ‘derivable before’ label in the third column in the table). 4 of the 5 repaired were not found in the previous experiment. 276 mapping suggestions were calculated for the pairs of ontologies for which alignments were not available in advance. 165 were


Ontologies       candidate missing:   missing/           wrong/      added:             removed:
and              all/non-redundant    derivable before/  repaired    is-a relations/    is-a relations/
Alignments                            derivable after    by others   mappings/          mappings
                                                                     more informative
Ontologies:
  Experiment I   18/17                11/-/-             6/-         11/-/3             6/-
  Experiment II  16/15                9/-/-              6/-         9/-/2              4/-
Alignments:
  Experiment I   402/200              190/-/47           10/5        -/143/3            -/9
  Experiment II  415/-                201/9/20           214/4       -/172/2            -/10

Table 5.10: Ontology alignment and debugging: OAEI Benchmark 2010—comparison between Run I and Run II.

validated as correct, 148 were repaired by adding them and 1 by adding a more informative repairing action, while 16 became logically derivable after repairing the others (the ‘derivable after’ label in the third column in the table). 111 were validated as wrong. 23 of the 149 repaired were not found in the previous experiment. The next (fourth) iteration performed with the debugging component led to the detection of 31 CMMs in total for almost all pairs of ontologies. 22 were validated as correct and the remaining 9 as wrong. 5 mappings were removed to repair the wrong ones since these actually existed in the network. In 17 cases the missing ones were repaired by adding them, in 1 case by adding a more informative mapping and in 4 cases the mappings became logically derivable after repairing other missing mappings. One last (fifth) iteration to detect CMMs from the network was done, which resulted in 1 CMM validated as wrong (1 mapping was removed to repair it). No more CMIs and CMMs were found at that point and all those previously detected had already been repaired.

Discussion

Here we compare and discuss the results from both runs. Their final results are summarized in the next two paragraphs and Table 5.10. Run I shows a complete debugging session with the network using only the debugging component. 17 non-redundant CMIs and 200 CMMs were detected (18 and 402 in total, including redundant ones, respectively). 11 CMIs and 190 CMMs were validated as correct. They were repaired by adding 11 is-a relations (3 of them more informative) and 143 mappings (3 of them more informative). In 47 cases the missing mappings became logically derivable from the network after repairing others and thus they were not repaired, since repairing them would lead to redundancies. The wrong CMIs (6) and CMMs (10) were repaired by removing 6 is-a relations and 9 mappings. Sometimes the

repairing actions for a wrong is-a relation/mapping include more than one is-a relation/mapping. 5 wrong mappings were repaired while repairing others. In Run II the alignment component was used prior to the debugging component. During this run 15 non-redundant CMIs were detected (16 in total), 9 validated as correct and 6 as wrong. The correct CMIs were repaired by adding them in 7 cases and by adding more informative repairing actions in 2 cases. 415 CMMs, including redundant ones, were calculated from both components in total and presented to the user. 201 were validated as correct (9 of them were logically derivable from the pairs of ontologies and their alignments as well) and 214 were validated as wrong. To repair the CMMs validated as correct, 172 missing mappings were added (2 more informative) and 20 became logically derivable from the network after adding the others. Most of the mappings validated as wrong came from the alignment component and did not actually exist in the network, and thus they were not repaired. The others were repaired by removing 4 is-a relations and 10 mappings. Sometimes the repairing actions for a wrong is-a relation/mapping include more than one is-a relation/mapping. 4 wrong mappings were repaired while repairing others. As mentioned above, in Run II all CMMs, including those that were redundant, were shown to the user, i.e., the number of CMMs for validation was doubled. In Run I, when only the debugging component was used, only the non-redundant CMMs were shown to the user. The redundant CMMs in Run I are logically derivable if those shown to the user are validated as correct. If they are validated as wrong, those that were redundant will no longer be redundant, but they will still be logically derivable from the network the next time the detection is run. In the case when the alignment component was used the redundant ones are not logically derivable and thus they will not be derived if the user validates the others as wrong (the alignment algorithms should be run again in order to show them). During Run II the alignment algorithms were run only at the beginning to create/extend the initial alignments. Since our alignment algorithms currently do not employ any structure-based strategies, running them again would not lead to discovering new mapping suggestions. If such strategies were employed the alignment process could benefit from the repaired structure of the ontologies and possibly generate new mapping suggestions. On the other hand, the debugging component could be run as long as it detects new CMIs and CMMs. The high number of wrong mappings in Run II can be explained by the selected alignment algorithm and threshold. In this run the threshold was 0.5 in order to get more mapping suggestions. A direct comparison of the results does not show a considerable advantage from the interaction between the two components (presented in Run II)—almost the same number of CMIs (11 versus 9) and CMMs (190 versus 201) were detected. However, the missing mappings in Run II were repaired by adding 172 mappings while in Run I only 143 mappings were added, i.e.,

more mappings were added in the second run. 29 mappings became logically derivable in Run II and 47 in Run I after repairing the missing mappings. 27 mappings that were not found and were not logically derivable in Run I were added in Run II. The concepts in these 27 mappings were added to the set of mapped concepts (if not already there) and were later used when CMMs were detected from the network. It should be noted that the number of removed mappings and is-a relations is different in each experiment. In Run II one additional mapping was removed. This happened because after aligning ontologies 304 and 303 a mapping between another pair of ontologies became logically derivable (it was not derivable from the network before aligning these ontologies). The derivable mapping was validated as wrong, which led to the removal of the additional mapping. During Run I two more is-a relations were removed. Since we first detected and repaired CMMs with the alignment component, the mapping causing their removal was not found in the second experiment because it became logically derivable (from the pair of ontologies and its alignment) after the alignment process. It should be noted that if the detection phases in each of the two components were run one after the other (prior to any repairing) these is-a relations would be found and removed in the second experiment as well. The removal of the two is-a relations described in the previous paragraph led to discovering two more CMIs. This is the reason for the difference in the number of CMIs in Run I and II.

5.2.3 ToxOntology-MeSH use case

This experiment was conducted in collaboration with the Swedish National Food Agency (SNFA). An alignment between an ontology created by the SNFA—ToxOntology—and an already curated index, in this case MeSH, was deemed necessary. In this context our integrated ontology alignment and debugging framework was very suitable for their needs—an initial alignment was created by the alignment component and then the ontology and the alignment were further refined through debugging. Since our integrated system was not fully implemented at that time, the work was done with two systems—the old version of RepOSE (as debugging component) and a version of the SAMBO system (for creating the alignment) that was later integrated in RepOSE. Both systems require input in RDF or OWL format; however, MeSH is not available in either of these. Thus, the first step in our work was to translate MeSH into OWL. The size and the setting of the experiment provide us with the possibility of comparing the repairing process carried out with our system RepOSE with the repairing process carried out manually by domain experts. We performed two runs—Run I and Run II. In the first run we used the validated alignment obtained from SAMBO as input. In order to observe the repairing


similarity      suggestions   equivalence/            related   wrong
value                         ToxOntology isa MeSH/
                              MeSH isa ToxOntology
≥ 0.8           41            29/2/2                  1         7
≥ 0.5, < 0.8    419           9/18/31                 42        319
≥ 0.4, < 0.5    906           2/21/14                 83        786
≥ 0.35, < 0.4   146           1/2/2                   117       24

Table 5.11: Ontology alignment and debugging: ToxOntology-MeSH—validation of mapping suggestions—initial alignment.

process in RepOSE, during the second run we used a non-validated alignment as input.

Experiment setup

ToxOntology is an OWL2 ontology encompassing 263 concepts and 266 asserted is-a relations. ToxOntology is the result of a merge of classification systems covering concepts within toxicology used by ACToR [47] and an implementation of the OpenTox API [42]. The merge was further refined and expanded manually by toxicology experts at the SNFA, the end-users of ToxOntology. The overall design principle can be summarized as follows: it is broad enough to cover almost any aspect of interest in the field, but small enough to be used as an interactive tool in users’ daily search for toxicology information. MeSH [6] consists of sets of terms naming descriptors in a 12-level hierarchical structure. The 2011 version of MeSH contains 26,142 descriptors. As MeSH contains many descriptors not related to the domain of toxicology, we used parts of the Diseases [C], Analytical, Diagnostic and Therapeutic Techniques and Equipment [E] and Phenomena and Processes [G] branches of MeSH. The resulting ontology contained 9,878 concepts and 15,786 asserted is-a relations. A Java program was written to parse the XML file (using the SAX parser), filter the selected elements and create the OWL file (using Jena 2.1). We note that the MeSH hierarchy is not based on subsumption relations only, thus interpreting all structural relations as is-a relations may lead to unintended results.
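The translation step can be sketched as follows. This is a minimal illustration under assumptions, not the program used in the experiment: MeSH descriptor XML is read with SAX, descriptors are kept only if one of their tree numbers falls under the C, E or G branches, and the dot-separated tree numbers are used to derive is-a relations (only one parent per concept is kept here, for brevity). The element names (DescriptorRecord, DescriptorName/String, TreeNumber) follow the MeSH descriptor XML format; writing the result as OWL (done with Jena 2.1 in the original program) is only indicated by a comment.

import java.util.*;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class MeshFilter extends DefaultHandler {
    private final Map<String, String> nameByTree = new HashMap<>(); // tree number -> descriptor name
    private final StringBuilder text = new StringBuilder();
    private final List<String> currentTrees = new ArrayList<>();
    private String currentName;
    private boolean inName;

    @Override public void startElement(String uri, String local, String qName, Attributes atts) {
        text.setLength(0);
        if (qName.equals("DescriptorName")) inName = true;
        if (qName.equals("DescriptorRecord")) { currentName = null; currentTrees.clear(); }
    }

    @Override public void characters(char[] ch, int start, int len) { text.append(ch, start, len); }

    @Override public void endElement(String uri, String local, String qName) {
        if (qName.equals("String") && inName) { currentName = text.toString().trim(); inName = false; }
        if (qName.equals("TreeNumber")) currentTrees.add(text.toString().trim());
        if (qName.equals("DescriptorRecord"))
            for (String t : currentTrees)
                if (t.startsWith("C") || t.startsWith("E") || t.startsWith("G")) // keep selected branches
                    nameByTree.put(t, currentName);
    }

    // A tree number such as C08.127 is a child of C08: if the parent tree number
    // was also kept, its descriptor becomes the is-a parent of this descriptor.
    public Map<String, String> isaRelations() {
        Map<String, String> childToParent = new HashMap<>();
        for (String t : nameByTree.keySet()) {
            int dot = t.lastIndexOf('.');
            if (dot > 0 && nameByTree.containsKey(t.substring(0, dot)))
                childToParent.put(nameByTree.get(t), nameByTree.get(t.substring(0, dot)));
        }
        return childToParent;
    }

    public static void main(String[] args) throws Exception {
        MeshFilter handler = new MeshFilter();
        SAXParserFactory.newInstance().newSAXParser().parse(args[0], handler);
        // The concepts and is-a relations would then be written to an OWL file,
        // e.g. with Jena, as described in the text above.
        System.out.println(handler.isaRelations().size() + " is-a relations extracted");
    }
}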

Results

Aligning ToxOntology and MeSH. Our first step was to create an initial alignment between ToxOntology and MeSH. In order to create the alignment we used SAMBO (e.g., [62], [87], [58]), an ontology alignment system based on the framework described in Subsection 2.2. It implements different strategies for preprocessing, matching, combining and filtering. Due to a preference for a high-quality alignment that was as complete as possible, preprocessing to reduce the search space was excluded from the

procedure. We used different types of matchers—TermBasic (linguistic approach), TermWN (approach using WordNet [69]), UMLSM (approach using domain knowledge—UMLS [14]) and NaiveBayes (instance-based approach using scientific literature)—and as a combination strategy we used the maximum-based strategy. We generated the similarity values for all pairs of terms. We used single threshold filtering with threshold 0.35 as the filtering strategy. These choices would lead to a high recall, although there would be many mapping suggestions to validate. During the validation phase the domain expert classified the mapping suggestions into: equivalence mapping, is-a mapping (ToxOntology term is-a MeSH term or MeSH term is-a ToxOntology term), related terms mapping and wrong mapping. The mapping suggestions were shown to the domain expert in different steps based on their similarity values. The results are summarized in Table 5.11. The validated alignment consists of 41 equivalence mappings, 43 is-a mappings between a ToxOntology term and a MeSH term, 49 is-a mappings between a MeSH term and a ToxOntology term and 243 related terms mappings. There is also information about 1,136 wrong mappings. The steps described above are similar to the detection and validation phases in the alignment component. The difference is in the repairing phase—in SAMBO the validated correct mapping suggestions are directly added to the final alignment, while in our framework different options for repairing them are presented to the domain experts.

Run I—Debugging using validated alignment

The debugging process started after the alignment was created. It was not considered feasible to identify defects manually. Therefore, we used the detection mechanisms of RepOSE. RepOSE computed CMIs, which were then validated by domain experts. As there were initially only 29 CMIs, we decided to repair the ontologies and their alignment independently in two ways. First, the CMIs and their justifications were given to the domain experts, who manually repaired the ontologies and their alignment. Second, the repairing mechanisms of RepOSE were used. The changes in the alignment and in ToxOntology resulting from the debugging sessions are summarized in Table 5.12, column ‘original/final alignment’6, and Table 5.13, column ‘final’, respectively. There are also 5 missing is-a relations for MeSH. In the remainder of this subsection we describe the detection and repairing in more detail and compare the manual repairing with the repairing using RepOSE. Detection using RepOSE. As input to RepOSE we used ToxOntology and MeSH. We additionally used the validated part of the alignment created by SAMBO, which contains the 41 equivalence mappings, the 43

6The final alignment contains changes from the two debugging sessions and is the one that is now used.


ToxOntology | MeSH | original/final alignment | final alignment: manual/RepOSE
metabolism | metabolism | ≡/→ | →/rem ←
photosensitisation | photosensitivity disorders | ≡/R | R/rem ←, →
phototoxicity | dermatitis phototoxic | ≡/R | R/rem ←, →
inhalation | administration inhalation | ≡/W | W/rem ←, →
urticaria | urticaria pigmentosa | ←/W | W/rem ←
autoimmunity | diabetes mellitus type 1 | ←/R | R/rem ←
autoimmunity | hepatitis autoimmune | ←/R | R/rem ←
autoimmunity | thyroiditis autoimmune | ←/R | R/rem ←
gastrointestinal metabolism | carbohydrate metabolism | ←/W | W/rem ←
gastrointestinal metabolism | lipid metabolism | ←/W | W/rem ←
cirrhosis | fibrosis | ≡/R | R/rem ←, →
cirrhosis | liver cirrhosis | ←/≡ | ≡/-
metabolism | biotransformation | ←/≡ | ≡/-
metabolism | carbohydrate metabolism | ←/W | W/-
metabolism | lipid metabolism | ←/W | W/-
hepatic porphyria | porphyrias | ≡/→ | W/rem ←
hepatic porphyria | drug induced liver injury | →/R | -/rem →

Table 5.12: Ontology alignment and debugging: ToxOntology-MeSH—changes in the alignment (equivalence mapping (≡), ToxOntology term is-a MeSH term (→), MeSH term is-a ToxOntology term (←), related terms (R), wrong mapping (W), removed (rem)).

RepOSE generated 12 non-redundant CMIs for ToxOntology (34 in total), of which 9 were validated by the domain experts as missing and 3 as wrong. For MeSH, RepOSE generated 17 non-redundant CMIs (32 in total; 2 of the 17 relations together represented one equivalence relation), of which 5 were validated as missing and the rest as wrong.

Manual repair. The domain experts focused on the repair of ToxOntology and the alignment. Regarding the 9 missing is-a relations in ToxOntology, all were added to the ontology. Furthermore, another is-a relation, asthma → respiratory toxicity, was added in addition to asthma → hypersensitivity, by analogy with the already existing urticaria → dermal toxicity and the added urticaria → hypersensitivity. This is summarized in Table 5.13 (column 'manual'). The domain experts also removed two asserted is-a relations (asthma → immunotoxicity and subcutaneous absorption → absorption) for reasons of redundancy. These is-a relations are valid and remain logically derivable in ToxOntology.

7The related term mappings cannot be used in logical derivation related to the is-a structure of the ontologies and are therefore not included in the alignment used in RepOSE.


added is-a relations | final | manual | RepOSE
absorption → physicochemical parameter | Yes | Yes | Yes
hydrolysis → metabolism | Yes | Yes | Yes
toxic epidermal necrolysis → hypersensitivity | Yes | Yes | Yes
urticaria → hypersensitivity | Yes | Yes | Yes
asthma → hypersensitivity | Yes | Yes | Yes
asthma → respiratory toxicity | Yes | Yes | No
allergic contact dermatitis → hypersensitivity | Yes | Yes | Yes
subcutaneous absorption → dermal absorption | Yes | Yes | Yes
oxidation → metabolism | Yes | Yes | Yes
oxidation → physicochemical parameter | Yes | Yes | Yes

Table 5.13: Ontology alignment and debugging: ToxOntology-MeSH—changes in the structure of ToxOntology.

The wrong is-a relations for MeSH and ToxOntology were all repaired by removing mappings in the alignment (Table 5.12, column 'final alignment manual/RepOSE'). In 5 cases a mapping was changed from equivalence or is-a into related. In one of the cases (concerning cirrhosis in ToxOntology and fibrosis and liver cirrhosis in MeSH) a further study also led to the change of cirrhosis ← liver cirrhosis into cirrhosis ≡ liver cirrhosis. The wrong is-a relations involving metabolism in ToxOntology prompted a deeper study of the use of this term in ToxOntology and in MeSH. The domain experts concluded that the ToxOntology term metabolism is equivalent to the MeSH term biotransformation and a sub-concept of the MeSH term metabolism. This observation led to a repair of the mappings related to metabolism. Furthermore, some mappings were changed from an equivalence or is-a mapping to a wrong mapping.8 In these cases (e.g., between urticaria in ToxOntology and urticaria pigmentosa in MeSH) the terms were syntactically similar and had initially been validated incorrectly during the alignment phase.

Repairing using RepOSE. For the 3 wrong is-a relations for ToxOntology and the 12 wrong is-a relations for MeSH, the justifications were shown to the domain experts. The justifications for a wrong is-a relation contained at least 2 mappings and 0 or 1 is-a relations in the other ontology. In each of these cases the justification contained at least one mapping that the domain expert validated as wrong or related, and the wrong is-a relations were repaired by removing these mappings (see Table 5.12, column 'final alignment manual/RepOSE', except the last row). In some cases repairing one wrong is-a relation also repaired others (e.g., removing the mapping hepatic porphyria ← porphyrias repairs two wrong is-a relations in MeSH: porphyrias → porphyrias hepatic and porphyrias → drug induced liver injury).

8So the domain experts changed their original validation based on the reasoning support provided by RepOSE.
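The repair logic used above can be stated compactly: a wrong is-a relation stops being derivable once every one of its justifications loses at least one element (an is-a relation or a mapping). The sketch below illustrates this check; it is schematic rather than RepOSE code, and the justifications are simply represented as sets of strings.

import java.util.Collections;
import java.util.List;
import java.util.Set;

public class WrongIsaRepairCheck {
    /** A wrong is-a relation is repaired when each of its justifications
     *  contains at least one removed is-a relation or mapping. */
    static boolean isRepaired(List<Set<String>> justifications, Set<String> removed) {
        return justifications.stream()
                .allMatch(justification -> !Collections.disjoint(justification, removed));
    }

    public static void main(String[] args) {
        // Two (illustrative) justifications for the same wrong is-a relation;
        // removing one mapping that occurs in both repairs the relation in one step.
        List<Set<String>> justifications = List.of(
                Set.of("mapping m1", "mapping m2"),
                Set.of("mapping m1", "is-a r1"));
        System.out.println(isRepaired(justifications, Set.of("mapping m1"))); // true
        System.out.println(isRepaired(justifications, Set.of("is-a r1")));    // false
    }
}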

For the 9 missing is-a relations in ToxOntology and the 5 missing is-a relations in MeSH, possible repairing actions (using Source and Target sets) were generated. For most of these missing is-a relations the Source and Target sets were small, although for some there were too many elements in the sets to allow good visualization. For all these missing is-a relations, repairing them consisted of adding the missing is-a relations themselves (Table 5.13, column 'RepOSE'). In all but three cases this is what RepOSE recommended based on external knowledge from WordNet and UMLS. In 3 cases the system recommended adding additional is-a relations that were not considered correct by the domain experts (and thus either wrong or based on the external domain knowledge taking a different view of the domain). After this repairing, we detected one new CMI in MeSH. This was validated as a wrong is-a relation and resulted in the removal of one more mapping (see Table 5.12, column 'final alignment manual/RepOSE', last row).
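For completeness, the sketch below illustrates the reading of Source and Target sets assumed here: adding s is-a t repairs a missing is-a relation (a, b) whenever a is-a s and t is-a b are already derivable, so Source(a) collects the derivable superconcepts of a and Target(b) the derivable subconcepts of b, and choosing s above a or t below b adds more knowledge than adding (a, b) itself. This is a schematic sketch, not RepOSE code; the toy graph is hypothetical and the additional restrictions the real algorithm applies (e.g., to avoid introducing new equivalences or wrong relations) are ignored.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SourceTargetSets {
    /** Asserted is-a relations: concept -> its asserted superconcepts (hypothetical toy data). */
    static final Map<String, Set<String>> ISA = new HashMap<>();

    /** All concepts x such that c is-a x is derivable (reflexive, transitive closure upwards). */
    static Set<String> superConcepts(String c) {
        Set<String> result = new HashSet<>();
        collect(c, result);
        return result;
    }

    private static void collect(String c, Set<String> acc) {
        if (acc.add(c))
            for (String parent : ISA.getOrDefault(c, Set.of())) collect(parent, acc);
    }

    /** All concepts y such that y is-a c is derivable. */
    static Set<String> subConcepts(String c) {
        Set<String> result = new HashSet<>();
        for (String x : ISA.keySet())
            if (superConcepts(x).contains(c)) result.add(x);
        result.add(c);
        return result;
    }

    public static void main(String[] args) {
        ISA.put("a", Set.of("s"));   // a is-a s
        ISA.put("t", Set.of("b"));   // t is-a b
        // Missing is-a relation (a, b): any pair from Source(a) x Target(b) is a repairing action.
        Set<String> source = superConcepts("a");   // {a, s}
        Set<String> target = subConcepts("b");     // {b, t}
        System.out.println("Repairing actions for missing 'a is-a b': " + source + " x " + target);
        // Adding, e.g., "s is-a t" makes "a is-a b" derivable and adds more
        // knowledge than adding "a is-a b" directly.
    }
}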

Run II—Debugging using non-validated alignment

In Run I the validated alignment was used as input. As a domain expert had validated the mappings, they could be considered of high quality, although we showed that defects in the mappings were still detected. In this subsection we perform an experiment with a non-validated alignment; we use the 41 mapping suggestions with a similarity value of at least 0.8 and initially treat them as equivalence mappings.9 Using RepOSE (in 2 iterations), 16 non-redundant CMIs (27 in total) were computed for ToxOntology, of which 6 had also been computed in the debugging session in Run I. For MeSH, 6 non-redundant CMIs (10 in total) were computed, of which 2 had also been computed earlier. As expected, the newly computed CMIs were all validated as wrong is-a relations and their computation was a result of wrong mappings. During the repairing, 5 of the 7 wrong mappings were removed, and 2 initial mappings were changed into is-a mappings.

Discussion

As the set of CMIs in Run I was relatively small, it was possible for the domain experts to perform a manual repair. They could focus on the pieces of ToxOntology that were related to the missing and wrong is-a relations. This allowed us to compare the results of manual repair with those of repairs done using RepOSE. Regarding the changes in the alignment, for 11 term pairs the mapping was removed or changed in both approaches. For 2 term pairs the manual approach changed an is-a relation into an equivalence, and for 2 other term pairs an is-a relation was changed into a wrong relation. These changes were not logically derivable and could not be found by RepOSE.

9From the validation we know that these actually contain 29 equivalence mappings, 2 is-a mappings between a ToxOntology term and a MeSH term, 2 is-a mappings between a MeSH term and a ToxOntology term, 1 related term mapping and 7 wrong mappings.

For 3 of these term pairs the change came after the domain experts realized (using the justifications of the CMIs) that metabolism in MeSH has a different meaning than metabolism in ToxOntology. For 1 term pair (second to last row in Table 5.12) the equivalence mapping was changed into wrong by the domain experts, while using RepOSE it was changed into an is-a relation. In the final alignment the RepOSE result was used. Additionally, using RepOSE one more wrong mapping was detected and repaired through a second round of detection; it was not found in the manual approach.

Regarding the addition of is-a relations to ToxOntology, the domain experts added one more is-a relation in the manual approach than in the approach using RepOSE. It could not be logically derived that asthma → respiratory toxicity was missing, but it was added by the domain experts in connection with the repairing of another missing is-a relation.

In some cases, when using RepOSE, the justification for a missing is-a relation was removed after a wrong is-a relation was repaired by removing a mapping. For instance, after removing metabolism (ToxOntology) ← metabolism (MeSH), there was no more justification for the missing is-a relation hydrolysis → metabolism. However, an advantage of RepOSE is that once a relation is validated as missing, RepOSE requires that it be repaired, and thus this knowledge will be added even if there is no justification.

Another advantage of RepOSE is that, for repairing a wrong is-a relation, it allows the removal of multiple is-a relations and mappings in the justification, even though it may be sufficient to remove one. This was used, for instance, in the repair of the wrong is-a relation phototoxicity → photosensitisation in ToxOntology, where photosensitisation ≡ photosensitivity disorders and phototoxicity ≡ dermatitis phototoxic were removed. Furthermore, the repairing of one defect can lead to other defects being repaired. For instance, the removal of these two mappings also repaired the wrong is-a relation photosensitivity disorders → dermatitis phototoxic in MeSH. In general, RepOSE facilitates the computation and understanding of the consequences of repairing actions.

Comparing Runs I and II, we confirm that RepOSE can be helpful in the validation of non-validated alignments—a domain expert will be able to detect and remove wrong mappings that lead to the logical derivation of wrong is-a relations, but wrong mappings that do not lead to the logical derivation of wrong is-a relations may not be found.

5.3 Discussion

The discussion here proceeds in two main directions—highlighting the benefits of our integrated ontology debugging and alignment approach on the one hand, and showing that a dedicated system supporting ontology alignment and debugging is essential on the other. All three experiments in Subsection 5.2 clearly demonstrate the advantages of the integration of ontology alignment and debugging.

Our integrated approach improves the quality of the alignments by providing different alternatives for repairing them. It also leads to discovering more possible modelling defects in ontologies and alignments by extending the set of alignments in the network. We note that the experiments presented in this chapter do not completely explore the benefits of the integration. Since structure-based matchers, preprocessing and filtering strategies were not employed in the alignment component, we were not able to explore the benefits of the repaired structure of the ontologies and alignments for the alignment process.

Our integrated approach is universal and can be applied to any two ontologies. To detect modelling defects in ontologies the network is an important source of domain knowledge, while defects in alignments can be detected by both alignment algorithms and intrinsic knowledge. The detected defects can also be repaired by different, sometimes more informative, repairing actions. The number of detected defects and their repairing actions depend on the correctness and completeness of the structure of the input ontologies and alignments. For instance, in the ToxOntology-MeSH use case only mappings were removed to repair wrong is-a relations. This indicates that the ontology developers modelled the is-a structure well. Such an outcome is not, however, guaranteed. For instance, in the experiment outlined in Subsection 5.1.1 and [56], involving debugging the two ontologies (AMA and NCI-A) and their alignment from the Anatomy track in OAEI 2010, 14 is-a relations were removed from AMA and 11 from NCI-A, as well as 5 mappings. Furthermore, in ToxOntology all missing is-a relations were repaired by adding the relations themselves. In the experiment in Subsection 5.1.1, in 27 cases in AMA and 11 cases in NCI-A a missing is-a relation was repaired using a more informative repairing action, thereby adding new knowledge that was not logically derivable from the ontologies and their alignment. More informative repairing actions are also used in the two experiments in Subsections 5.2.1 and 5.2.2.

Generally, detecting defects in ontologies without the support of a dedicated system is cumbersome and unreliable. In all cases outlined in this chapter RepOSE clearly provided the necessary support. The visualization of the justifications for possible defects was very helpful, as was the graphical display of the possible defects within their contexts in the ontologies being addressed. During the entire debugging and alignment processes, and not only during the detection phase, the system provides proper visualization, thus assisting the user in understanding the defects and the available options for repairing actions. During the repairing phase the system generates different repairing actions for the defects, thus providing an opportunity to add more knowledge to the ontologies and alignments. Moreover, RepOSE stores information about all changes made and their consequences, as well as the remaining defects that need to be repaired. It also prevents contradictory information from being added to the ontology network.


An identified limitation of RepOSE is that adding and removing is-a relations and mappings that do not appear in its computations can be a demanding undertaking. Currently, these changes need to be made directly in the ontology files, but it would be useful to allow a user to do this via the system. For instance, in the ToxOntology-MeSH use case, it would have been useful to add asthma → respiratory toxicity via RepOSE.

Although the system has good responsiveness, the number of user-system interactions, such as validations and repairs, is very high for large ontologies and alignments. Thus, approaches to lower this number are desirable. For instance, it was observed that in many cases there is just one possible repairing action for a defect. Thus, instead of showing the defect and the repairing action to the user, the system could execute it automatically.

Our system provides contextual visualization and most of the time the display is easily readable and not cluttered with objects. In some cases, however, there are too many objects to allow good visualization. We have implemented a number of techniques to manage such cases, for instance, visualizing the defects and their repairing actions in groups, zooming in and out, and the option to open the current display in a separate larger window that can be resized to fit the screen dimensions. However, in order to further facilitate the understanding of the presented information, other grouping heuristics and visualization techniques should be explored.

Chapter 6

Related work

This chapter discusses related work in the areas of ontology alignment and debugging and compares other approaches with ours. At the end, an overview is given of the approaches that can, to some extent, be considered related to the integration of ontology alignment and debugging.

6.1 Ontology debugging

This section focuses on two of the three types of defects identified in [48]—semantic and modelling defects in ontologies and ontology networks. Syntactic defects are not considered since they are not signs of misinterpretation of a domain but are rather caused by mistyping. They can be found and resolved using parsers.

6.1.1 Debugging modelling defects

The approach for debugging modelling defects presented in this thesis is an extension of [61], [60] and [59]. The problem of repairing missing is-a relations in a single taxonomy was initially discussed in [61], with the assumption that the is-a structure of the taxonomy is correct. The authors in [61] present two algorithms for computing repairing actions for missing is-a relations. The first, which is similar to the one presented in this thesis, only computes solutions for a single missing is-a relation. The second extends it by taking into account the influence of the repairing actions of other missing is-a relations during the computation of the Source and Target sets. The results of the experimental evaluation of the extended algorithm show that such influences are not negligible and that in some cases the repairing actions for different missing is-a relations influence each other. The work in [59] is a continuation of [61] in which the authors consider wrong is-a relations in the structure of the taxonomies. In contrast to [61], where the focus is on a single taxonomy, the context in [59] is a taxonomy network, with the assumption that the mappings in the network are correct.

Missing is-a relations in the context of a taxonomy network with correct mappings are discussed in [60]. This thesis takes the approach even further, considering wrong and missing subsumption and equivalence mappings in a taxonomy network and employing ontology alignment algorithms as an additional method for detecting missing mappings.

The work presented in [61], [60] and [59], as well as this thesis, only considers missing and wrong is-a relations in taxonomies, which are a simple kind of ontology from a knowledge representation point of view. The work in [55] extends the scope to deal with repairing missing is-a relations in the structure of ALC ontologies, which can be represented using acyclic terminologies. The problem of repairing missing is-a relations is formulated as a generalized version of the TBox abduction problem. In [65] the authors define properties for the ontologies, the set of missing is-a relations, the domain expert and preferences for the solutions to the problem in [55]. Finally, in [93], complexity results for the existence, relevance and necessity decision problems for the generalized TBox abduction problem for EL++ ontologies are presented.

Debugging in general has two phases—discovering defects and resolving them. Often a validation phase performed prior to the repairing phase is also present. The detection of modelling defects, especially missing structure, is not trivial since it requires, among other competencies, domain knowledge. Manual inspection is, of course, possible, but apart from being error-prone it is tedious and even infeasible for very large ontologies. In our approach we utilize the knowledge intrinsic to an ontology network and, additionally, ontology alignment algorithms for detecting missing mappings. Other approaches, such as those in the area of ontology learning, are available as well; they allow the automatic creation of ontologies from large sets of texts.

An insight into the state of the art in the ontology learning area is provided in [24]. It contains three parts focusing on methods, evaluation and learning methodology. This field takes advantage of methods developed in already established areas, such as knowledge acquisition and natural language processing. The methods presented in this book can be seen as methods supporting those presented in this thesis for the detection of modelling defects.

The work in [41] is of particular interest since it deals with discovering missing is-a relations from large text corpora. The authors describe a method for automatic acquisition of hyponyms, which are lexical relations of the kind something is a (kind-of) something. Beneficial features of hyponyms include easy recognition, high frequency of occurrence and relevance across domains. Hyponyms can also be employed for the purpose of identifying instances of concepts, as in the example hyponym(author, Shakespeare). An important side effect is the discovery of pairs, such as (broken bone, injury), which are not common dictionary entries. Ontology learning approaches can be used as detection methods on their own or as additional external information for suggesting recommendations during the processes of detection, validation and repairing.

Querying external sources, such as WordNet, to determine existing relations between concepts can be used in both cases as well.

Based on their experience, the authors of [19] propose a set of ten requirements that should be fulfilled as a basic step towards a valid and reusable reference alignment. They can be seen as patterns for debugging. The first half of the requirements covers mainly technical and versioning issues in the ontologies and alignments. The second part is focused on the content and completeness of the alignments, taking into account subsumption and equivalence relations from structural and linguistic points of view. Closest to the approach presented in this thesis are two requirements that deal with structural completeness and resemble part of our detection phase. According to one of them, for instance, if there are equivalence mappings between a particular concept from one of the ontologies and more than one concept in the other, the concepts in the second ontology should be connected through equivalence relations. In the other requirement, a given equivalence mapping is checked to see whether the subclasses of one of the concepts are connected through subsumption mappings with the superclasses of the other, and vice versa. Another pair of requirements can be seen as alignment algorithms where (part of) the labels (or local names) are compared. These requirements can help with the identification of missing and wrong mappings, and also of missing and wrong subsumption relations in the ontologies. The approach was tested on the OAEI Anatomy 2010 dataset and the results were incorporated in the dataset used in the OAEI Anatomy 2011. We have compared the results of our approach (the experiment in Subsection 5.1.1) with the results in [19] regarding wrong mappings. Of the 25 wrong mappings identified by [19], our approach can identify 21 using the full reference alignment. Our approach also identified 8 additional wrong mappings.

Another approach for detecting defects, [28], assumes that the ontology specification does not change over time and explores the modifications during its evolution at the axiom level. Having different versions of an ontology, the authors propose to compare them in order to identify suspicious editing patterns, such as consecutive additions and removals of the same axioms in the different versions. This approach can be employed to detect both semantic and modelling defects; however, its limitation is that several versions of the ontology in question must be available.

The closest approach to the one in this thesis regarding the detection of missing is-a relations is discussed in [17], where the authors describe a method for identifying nonalignments (essentially missing subsumptions) between Open Biomedical Ontologies (OBOs). The nonalignments discussed in [17] can be seen as the CMIs and CMMs in this thesis. The nonalignments are detected based on properties, while in our work we only use subsumption and equivalence relations. Similarly to the framework in this thesis, three phases can be distinguished in [17]—a phase for detecting nonalignments, an examination which resembles our validation phase, and a repairing phase.

During the examination, the nonalignments that should be aligned are separated out (they are called discrepancies). The authors suggest two approaches for rectifying the discrepancies—either adding the missing subsumptions or removing the existing subsumptions. The nonalignments that are not discrepancies are indicators of inconsistencies in the ontologies. They are resolved by upward propagation of the corresponding concepts to the superclass levels. When a discrepancy is repaired by adding the missing subsumption, other possibilities for repairing which would make it derivable are not considered. This approach does not consider nonalignments based on incorrect information in the ontologies or alignments. In both approaches the search space during the detection phase is reduced—in our approach we only employ mapped concepts, and in [17] only pairs of assertions with an already existing subsumption relation are checked.

Borrowing the concept of design patterns from software engineering, the authors of [30] apply patterns and antipatterns for the purpose of debugging semantic and modelling defects. The (anti)patterns are based on description logic constructions that do not necessarily exist in taxonomies, for instance, existential and universal quantifiers. The patterns do not aim to change the semantics of the ontologies; they are guidelines for restructuring the ontologies in order to make them more understandable for the developers. The antipatterns represent common errors, originating in the misuse and misunderstanding of logical constructions, in ontologies developed by domain experts. The authors also suggest actions for resolving the antipatterns that go beyond simply removing axioms or exchanging a class with one of its superclasses. In order to avoid changes in the intended meaning of the ontologies during debugging, the actions for resolving the antipatterns should be validated by a domain expert.

6.1.2 Debugging semantic defects

More work is available in the field of debugging semantic defects. The detection of semantic defects is usually done by a reasoner, and the focus is on computing diagnoses and repairing actions.

One of the works that does not use a reasoner for the detection of semantic defects is [77]. It deals with the detection of one of the antipatterns in [30], Onlyness Is Loneliness, in OWL ontologies. The authors propose an approach where candidates of the antipattern are identified without a reasoner, since in large ontologies with many complex axioms and defects reasoners do not scale well. The detection process goes through two steps. The first is applying transformation rules in a predetermined order, simulating inference in order to avoid the use of a reasoner; these rules do not remove original axioms, they only add new ones. The second step is to execute one or more SPARQL queries in their ontology pattern detection tool—PatOMat.

The query returns the candidates of the Onlyness Is Loneliness antipattern. The authors suggest that the same approach, with suitable transformation rules, can also be applied for the other antipatterns in [30].

Computing diagnoses and repairing actions is the focus in [80] and [81], where a method for repairing the axioms in an incoherent TBox is proposed. It identifies a minimal set of axioms that should be removed from the TBox in order to make it coherent. The method creates minimal subsets of the TBox, called MUPSs (Minimal Unsatisfiability-Preserving sub-TBoxes), in which the unsatisfiability of each unsatisfiable concept is preserved. Then, based on the MUPSs, MIPSs (Minimal Incoherence-Preserving sub-TBoxes) are created—they are the smallest subsets of the TBox that make it incoherent. A set of axioms occurring in several MIPSs is called a core. The more MIPSs a core occurs in, the higher the probability that it is the cause of the incoherence and should be removed from the TBox. This is similar to our single relation heuristic.

Another well-known work in the field of debugging semantic defects is [51], where the authors focus on debugging unsatisfiable concepts. For the purposes of ontology debugging, two techniques from the software testing area are utilized—the glass box and the black box. The authors have developed and integrated methods and algorithms for them in their ontology editor Swoop. The glass box approach relies on extra information from the reasoner, extended with additional data structures, to identify the causes of unsatisfiable concepts. Two forms of this technique were presented—presentation of the root cause of the contradiction (clash) and computation of relevant sets of axioms responsible for the clash (sets of support). The sets of support are computed for each unsatisfiable concept, and when minimally determined they coincide with the Minimal Unsatisfiability-Preserving sub-TBoxes (MUPSs) presented in [80] and [81]. A common set of support represents the repairing actions for all unsatisfiable concepts; however, it is not easy to obtain such a set with the glass box approach since it may not scale for many unsatisfiable concepts. To resolve this problem the authors explore the black box technique, where the reasoner is only used as an oracle for answering queries in order to determine dependencies between concepts. The unsatisfiable concepts are divided into root concepts (with unsatisfiable concept definitions) and derived concepts (which depend on the unsatisfiability of other concepts).

The authors of [51] continued their work on explaining the causes of unsatisfiable concepts and developing strategies to rectify them in [50]. One of the focuses in [50] is providing precise explanations of the causes of unsatisfiability by identifying the smallest parts of the axioms responsible for them, thus aiding the users' understanding. Similar to the idea of arity in [80] and to the single relation heuristic in our work, the authors propose a simple ranking criterion based on the frequency of occurrence of axioms across the MUPSs. Other ranking strategies, based on the impact and frequency of usage of the axioms across the ontology, user-driven test cases and provenance information, were presented as well.

The solutions for the unsatisfiable concepts are generated using a modified version of Reiter's hitting set tree algorithm [76], extended to take into account the ranking of the axioms. The users can choose between three granularity levels during the repairing—repairing a single unsatisfiable concept, all root concepts, or all unsatisfiable concepts. Similarly, our tool has two modes for repairing wrong is-a relations and mappings—repairing them one by one or repairing all of them at once (in a single taxonomy). The authors also propose methods for rewriting axioms, instead of removing them, for known modelling pitfalls.

Providing an explanation for a given entailment is key to understanding why a concept is unsatisfiable and how it can be repaired. The authors of [51] apply their glass box and black box techniques to find all justifications for an entailment in [49]. Their definition of a justification is similar to the one in our work. They have developed two methods for finding a single justification, one based on each of the two techniques. Using the black box technique, the axioms from the initial ontology are copied one by one to a new ontology until the given concept becomes unsatisfiable there. Then the new ontology is pruned in order to exclude those axioms that are not part of the justification. The other method is based on an extension of the glass box technique from [51] and applies the same pruning method at the end. Having a single justification and benefiting from the duality of the hitting set trees from [76], the algorithm in [49] computes all justifications—instead of computing minimal hitting sets from the tree, the algorithm uses a single justification as the root of the tree and at each step creates new branches, reusing the methods for computing a single justification. Known hitting set tree optimization techniques are utilized to reduce the number of calls to the algorithm that computes a single justification.

Another work that addresses the understanding of entailments is [73], in which the authors discuss the steps to generate easily understandable explanations of OWL inferences in English. They have developed a probabilistic model for estimating the understandability of a justification composed of multiple inferences, based on a measure of the understandability of a single inference. The measure of understandability of a single inference is called the Facility Index and its development is described in [72]. The Facility Index is the result of an empirical study of a set of deduction rules collected from a corpus of around 500 ontologies. For every entailment with multiple inferences a proof tree is built, where the entailment is the root of the tree and the deduction rules are its leaves. Then the understandability of the entailment is estimated by multiplying the Facility Indexes for the deduction rules in the tree. For entailments with multiple proof trees this method can be used to rank them according to their understandability.

Understanding justifications of multiple entailments is the focus in [18]. The authors have observed that sets of justifications are often similar, containing axioms with similar, sometimes even identical, structure, which differ only in class names, properties and relations.

The length of the justifications also varies. In these cases the same type of reasoning is required from a human user. This observation can be used to help the user in the process of understanding the justifications and thus significantly reduce the number of justifications to grasp. The notion of structural similarity is called justification isomorphism and appeared in the authors' previous work. In this paper they define three types of isomorphism—strict isomorphism (the same number and type of axioms in the two justifications), subexpression-isomorphism (different concept expressions requiring the same reasoning but the same number of axioms) and lemma-isomorphism (the same type but a different number of axioms). An experiment, performed with the ontologies from NCBO BioPortal, shows that isomorphism can reduce the number of justifications that must be understood by 90% and that most of the justifications are strictly isomorphic, i.e., they use the same number and type of axioms.

In [84] the authors generalize the ontology debugging problem, introducing weighted ontologies where weights are assigned to the axioms. The problem is transformed into an optimization problem of computing subontologies with the maximum sum of weights. The axioms that are not part of the maximum sum are then removed. The approach is promising for very large ontologies with a large number of inconsistencies; however, it is not clear how the weights should be assigned.

Semantic defects in ontology networks are also an area of interest. However, all of these approaches consider the ontologies correct and only debug the alignments. By comparison, our approach considers defects in both the ontologies and the alignments. In [92] the authors detect four patterns of frequently occurring defects in mappings and propose repairing methods that are either automatic or user-driven. They focus on equivalence and subsumption mappings and define four types of defects: redundant mappings, imprecise mappings, inconsistent mappings and abnormal mappings.

The authors of [68] propose a completely automatic method for debugging ontology mappings, detecting and repairing inconsistencies caused by erroneous mappings. The method deals with equivalence and subsumption mappings and relies on the assumption that the mappings model semantic relationships without causing inconsistencies. Distributed description logics is used to formalize the problem—the domain knowledge is represented by a distributed ontology (similar to the induced ontology in this thesis) and the mappings are represented as a set of bridge rules. For diagnosis they rely on Reiter's classic definition from [76]. An inconsistency is resolved by removing a bridge rule; thus the method for selecting the rule is important. Instead of applying the classical hitting set tree algorithm, the authors propose a simple heuristic that selects the rule to remove by its confidence value or, if the confidence value is not available, by the WordNet distance between the concepts in it. This resembles the ranking approach in [50]. At the end the authors discuss the problem of incorrect mappings that do not cause inconsistencies and propose the notion of instable mappings to deal with it.

They suggest that a mapping which makes a previously non-existing subsumption relation in a single ontology derivable may indicate an inconsistency. The idea of instable mappings is quite similar to our approach for detecting CMIs; however, we interpret such a situation differently—as a possible missing is-a relation.

In [75] a conflict-based operator for mapping revision is proposed that considers subsumption and equivalence mappings. The operator is based on the notion of "conflict sets", which are the minimal sets of mappings causing logical contradictions between the ontologies. It is defined by two postulates adapted from belief-based revision theory.

The authors of [74] discuss the relationships between inconsistency and incoherency in ontologies and categorize the reasons for inconsistency into three groups—inconsistency due to terminology axioms, inconsistency due to assertional axioms and inconsistency caused by both terminology and assertional axioms. They propose a general integrated approach for dealing with inconsistency and incoherency in ontology evolution and give several suggestions for how the different phases of the approach can be instantiated, by revisiting several concrete approaches.

The authors of [43] implement their algorithms in the RaDON system. They propose an efficient relevance-directed algorithm for computing MUPSs in subontologies, adapted from [49] and based on Reiter's hitting set trees [76]. The user can choose to compute one, some or all MUPSs and hitting sets for an unsatisfiable concept. Another element of the system's functionality is reasoning in an inconsistent setting based on four-valued semantics.

In [82], reasoning with multiple ontologies connected through directional mappings is presented. Distributed description logics is used to formalize the knowledge in the ontologies and their alignments (the alignments are represented with sets of bridge rules; in this work only subsumption and equivalence mappings are considered). In this setting knowledge propagation only occurs in one direction (the directionality property) and an inconsistency in one of the ontologies does not lead to inconsistency in the whole distributed ontology (the localized inconsistency property).

6.2 Ontology alignment

After years of substantial research effort in the field of ontology alignment, the authors of [83] seek new promising directions for its future development. After observing that progress in the field is slowing down, they give an overview of the state of the art and identify eight challenges for the alignment community. These challenges are united around the issue of scalability, both in terms of matcher strategies and evaluation, as well as user involvement and supporting infrastructure.

With two contradictory tendencies in place—the increasing size of the matching task (demanding scalability techniques) and the broadening range of the applications performing it (including devices with limited resources)—the efficiency of matching techniques in terms of both computational time and memory consumption is becoming more and more important. Possible solutions to this problem include parallelization, distribution, modularization, etc. The increasing size of the alignment task also demands large-scale matching evaluation, which is not possible without automatic methods for developing high-quality reference alignments. In the context of evaluation, more accurate (in addition to precision and recall) as well as application-specific evaluation measures are required. There are different matchers available, but none of them is considerably better than the others for a specific application. As a result, a combination of matchers is usually used in order to obtain more reliable results. Those combinations could be tailored to application areas, dataset features, or both, which is why strategies for matcher selection, combination and tuning are highly desirable. Some matchers utilize background knowledge during the alignment process, for instance, curated resources such as WordNet and UMLS. In the future other resources, including resources that are not curated, such as linked open data, can be utilized as well.

The scalability problem should be addressed in the area of user interaction as well. Given an alignment, the end user, who is not necessarily an ontology alignment expert, should be able to understand it and how it was obtained in order to better utilize and edit it. Analogously to the justifications in the area of ontology debugging, easily comprehensible yet clear and precise explanations of matching results are needed. User involvement is crucial for the success of each task and ontology alignment is not an exception. Increasing the number of tools providing user interfaces and supporting various user interactions will foster user engagement in the process. Higher quality alignments will be the product of better user interfaces with good scalability features, rather than of more accurate matchers [22]. User involvement in the process can be encouraged through social and collaborative matching as well. The manual curation of large alignments is a demanding task for a single user. It can be eased by involving several users who can discuss problematic mappings together. Such collaborative effort will demand metadata standards and proper alignment management frameworks providing infrastructure and support during all phases of the process—storage, version control, etc.

Comparing our system with these challenges, we have already made initial steps towards addressing three of them. Many matchers have been proposed1, and most systems use similar combination and filtering strategies as in this thesis2. However, there are still not many alignment systems that explore background knowledge. Since the alignment algorithms in our system are reused from the SAMBO system [62], we have already been addressing the challenge of matching with background knowledge.

1e.g., many papers at http://ontologymatching.org/
2For an overview we refer to [83].

We employ external, curated resources with well-known structure and reliability—WordNet and UMLS. Our system is one of the few supporting user validation of the mappings, the others being SAMBO [62], COGZ [34] for PROMPT, and COMA++ [31]. RepOSE also has a unique feature: it provides different options for repairing the missing mappings, rather than just directly adding the mapping suggestions. The whole repairing phase is supported by a user interface. Moreover, it provides debugging of the alignment during the process of its development. These features can be considered steps in the direction of user involvement and explanation of the matching results. It was mentioned in [83] that very few systems support mappings other than equivalence—RepOSE is among them, as it supports subsumption in addition to equivalence mappings.

6.3 Integration of ontology alignment and ontology debugging

There are a few systems that could be considered to integrate ontology alignment and debugging to some extent. They are usually focused on ontology alignment and perform ontology debugging (considering semantic defects) only as a means of providing coherent alignments. In contrast, our system, RepOSE, is an integrated ontology alignment and debugging system. It can be used as such or as a separate alignment or debugging system. Moreover, RepOSE allows debugging of both the structure of the ontologies and the alignments, while most of the other systems assume that the ontologies are correct and only debug the alignments. Generally, debugging of modelling defects, such as missing is-a structure, requires domain knowledge. A unique feature of our system is that it detects missing is-a relations without external domain knowledge.

One of the first works to make a connection between ontology alignment and debugging is [23]. Its authors compare two approaches for aligning earlier versions of AMA and the NCI Thesaurus—manual and lexical. The manual and lexical alignments were used to create a final alignment, and a structural validation was performed in order to remove pairs of concepts without structural similarity from it. The structural validation was performed employing the pairs of concepts with lexical similarity, called anchors. The relations in which the anchors participate are examined, and the existence of at least one common hierarchical relation among the concepts in the anchors across the ontologies is taken as positive structural evidence. This approach can be used for the detection of CMIs.

Based on their experience, the authors of [46] identify several requirements for an ontology alignment system (partially covering different aspects of the challenges from [83]). In their view, such requirements are interactivity (user interactions during the alignment process instead of post-curation of the alignments), scalability (both in terms of the size of the ontologies and in terms of user interactions) and reasoning-based error diagnosis (detecting and repairing unsatisfiable concepts).

They present LogMap 2, an ontology alignment system that implements scalable reasoning and diagnosis algorithms. The ontologies and mappings are encoded in a Horn propositional representation, which allows scalable detection and repair of unsatisfiable concepts performed on modules extracted from the ontologies. More details about the detection and repair of logical contradictions can be found in [44], which describes LogMap, implementing the Dowling-Gallier algorithm [32] for Horn propositional satisfiability. Comparing RepOSE and LogMap 2, both deal with subsumption mappings. However, LogMap 2 only simulates user interactions, while RepOSE has a fully functional user interface.

Evaluation of the coherence of the alignments generated by the systems in the Ontology Alignment Evaluation Initiative has recently started—in the 2011 campaign in the Anatomy track and in the 2011.5 campaign in the Large Biomedical Ontologies track. It shows that in most cases the generated alignments are incoherent. The authors of [79] have found that the incoherences in the alignments generated by the systems in the Large Biomedical Ontologies track are most often caused by disjointness restrictions between concepts. They propose a method for detecting incoherence (caused only by disjointness restrictions) in ontologies employing ontology modularization techniques. Their method creates core fragments (modules) that contain the concepts and relations from the two ontologies and their alignment needed for resolving all conflicts caused by the disjointness restrictions. They have also developed a repairing method and a heuristic (similar to our single relation heuristic) that minimizes the incoherence in the final alignment and the number of mappings removed from the initial alignment. Their system AML is among the best performing systems in terms of runtime in the Anatomy track in the OAEI 2013 [37]. In the 2013 campaign, AML-bk, an extension of AML that uses background knowledge, achieved the best result in the Anatomy track in terms of f-measure.


Chapter 7

Conclusions and Future Work

This chapter concludes the thesis and presents several possible directions for long-term future work and improvements of the work presented so far.

7.1 Conclusions

The vision of the Semantic Web is coming into reality and ontologies play a key role in it. They model the world around us by defining the semantics of entities and their relationships. Ontologies provide mutual understanding of a domain and facilitate applications such as agent communication and data integration. Now, in the era of Big Data, the demand for data integration will grow even stronger and the task will become more complicated. Other areas take advantage of ontologies as well. Many ontologies in various domains have already been developed and more will be developed in the near future. Often several overlapping ontologies are employed in order to fulfill a specific task, for instance, the integration of several data sources annotated with different ontologies. Thus, an understanding of the relationships between the concepts in the different ontologies is essential.

The development of ontologies and alignments is not a trivial task, for various reasons—domain experts are not proficient in knowledge representation, the intended and unintended entailments become more difficult to follow with the increasing size and complexity of the ontologies, there are concept discrepancies, etc. As a consequence, defects in the structure of the ontologies and their alignments may be introduced.

In this context, debugging of ontologies and their alignments is a key step towards obtaining highly reliable results from the wide range of applications employing ontologies. Debugging aims at detecting and repairing different types of defects. Modelling defects are among the most complex to detect and resolve since they require domain knowledge.

While for syntactic and semantic defects there is tool support, such support, with few exceptions, is missing for modelling defects.

The manual detection of modelling defects, if possible at all, is impractical, especially in ontologies with many concepts and complex relations. Thus, automatic detection methods for modelling defects are highly desirable. Once detected, the defects should be repaired. A modelling defect such as wrong structure should be repaired by removing or modifying it. Regarding missing structure, the obvious solution is to directly add the missing information. However, it was observed that other repairing actions exist that add more knowledge to the ontologies and alignments. Since domain experts might prefer actions of this type, methods are required that can provide such nontrivial repairing actions.

7.1.1 Debugging of ontologies and alignments

The focus of this work is on taxonomies, since they are the most widely used kind of ontology and, in general, the structure of ontologies is often based on subsumption relations between their concepts. We considered modelling defects, such as missing and wrong is-a relations in taxonomies and missing and wrong mappings in alignments, which require domain knowledge to detect and repair. The taxonomies themselves, connected through alignments into a taxonomy network, can provide the necessary domain knowledge. We have presented algorithms for debugging modelling defects in alignments employing the knowledge intrinsic to the network. However, alignments are not always available, and when they do not exist the network cannot be created. In order to create alignments and, consequently, a network, we utilize ontology alignment algorithms.

We extended the framework in [67] with algorithms for debugging modelling defects in alignments and integrated ontology alignment and ontology debugging. The framework has two components—a debugging component and an alignment component. In each component, the workflow consists of phases for the detection, validation and repairing of modelling defects in the ontologies and the corresponding alignments. Using only the debugging component we were able to detect a significant number of wrong and missing is-a relations in the ontologies from the Anatomy track in the OAEI 2010 (details in Subsection 5.1.1).

7.1.2 Benefits from the integration of ontology alignment and ontology debugging

The integration of ontology alignment and debugging led to the exploration of their interactions. Ontology alignment can be seen as a special kind of debugging of missing mappings, and ontology debugging using the knowledge intrinsic to the network can be seen as a special, structure-based alignment algorithm.

Exploring the integration of ontology alignment and debugging, we found that it provides advantages for both and raises the quality of the ontologies and alignments. Since our debugging approach is based on the knowledge intrinsic to an ontology network, the existence of such a network is required. Using the ontology alignment algorithms we are able to create alignments, and consequently a network, between any number of ontologies. Even if a network already exists, the alignment algorithms can be applied to extend the set of available alignments and thus provide more information for the debugging of modelling defects (as shown in Run III in Subsection 5.2.1). These observations are relevant to our debugging approach, which relies heavily on the knowledge intrinsic to the network. However, ontology alignment algorithms, in general, can be applied in cases when domain knowledge is required.

As was pointed out, the repairing phase in our debugging approach provides different options for repairing modelling defects in addition to directly adding the missing structure and removing the wrong structure. The alignment component in our framework follows the general alignment framework, as described in Subsection 2.2, and extends it with a repairing phase. Furthermore, the debugging repairs the structure of the ontologies and alignments and provides higher quality input for the structure-based alignment algorithms and the preprocessing and filtering strategies.

7.1.3 Implemented system

We extended the system in [67], implementing algorithms for detecting and repairing modelling defects in alignments and integrating ontology alignment and ontology debugging. We also performed several experiments and analyzed their results.

During the experiments it was observed that our implemented system clearly provided the necessary support through the phases of detection, validation and repairing. The possible defects and their repairing actions are visualized in their context during the validation and repairing phases, helping the user to understand them and their causes and providing repair options that add as much new knowledge as possible to the network. The system was responsive to user actions at any given moment during the experiments. It also keeps track of the whole process—it stores the defects, computes the consequences of the repairing actions and prevents the use of contradictory repairing actions.

7.2 Future work

In this section we outline our ideas for improving the system and lay out long-term future work.


7.2.1 Extending the system

Reflecting on the experiments and their results, several directions for improvements to the system were identified. They are focused on extending the functionality of the system and reducing user-system interactions.

As was noted earlier, our repairing approach does not depend on the origin of the defects, i.e., whether they are detected by the system or provided by external sources. Thus, supporting external input would allow our repairing methods to resolve defects detected by methods other than those presented in this thesis. During the experiments, it was noticed that adding/removing is-a relations or mappings that do not appear as defects or in their justifications is not possible. This functionality could be helpful in cases such as the one described in Subsection 5.2.3, and it could be achieved by integrating a simple ontology editor.

Currently, our method for detecting modelling defects by employing the knowledge intrinsic to the network considers the subsumption relations between the concepts in one or more taxonomies. One immediate step is to extend this to other relations (for instance, is-located-in, is-part-of) in a single ontology, combined with equivalence and subsumption relations between the ontologies. For instance, let us assume there are two geographic ontologies (o1 and o2) and one of them contains the relation Stockholm is-located-in Sweden, which is missing in the other ontology. The alignment between them contains two mappings—o1:Stockholm ≡ o2:Stockholm and o1:Sweden ≡ o2:Sweden. Thus, adapting our approach, we can infer Stockholm is-located-in Sweden in the second ontology, i.e., detect a candidate missing is-located-in relation (a sketch of this inference is given at the end of this subsection). A similar idea is presented in [17] in the context of ontology enrichment, where its authors use properties between the ontologies in order to identify nonalignments (essentially missing subsumptions) in the ontologies.

Furthermore, the set of alignment algorithms in the system can be extended by implementing structure-based matchers, partial-alignment filtering and preprocessing strategies.

When the input ontologies contained thousands of concepts and many defects were detected, the system maintained good responsiveness. However, the number of interactions between the user and the system was high. For instance, during the repairing phase some of the defects had only one repairing action. Instead of showing it to the user, the system could add it automatically, thus reducing the number of user-system interactions. Another direction is to reduce the interactions during the validation phase—this will lead to fewer CMIs and CMMs to validate and fewer missing and wrong is-a relations and mappings to repair. Reducing the number of CMMs can be achieved by utilizing the approach for computing minimal mappings between lightweight ontologies (whose structure is based on subsumption relations) presented in [35]. In their paper the authors propose an efficient algorithm for computing the minimal alignment and observe that such an alignment is unique and always exists.
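The is-located-in example above can be sketched as follows. This is a schematic illustration of the idea (asserted relations only, without derivation), not RepOSE code, and the representations of the ontologies and the alignment are hypothetical.

import java.util.Map;
import java.util.Set;

public class CandidateMissingRelations {
    // asserted is-located-in relations in the two ontologies (toy data)
    static final Map<String, Set<String>> O1 = Map.of("o1:Stockholm", Set.of("o1:Sweden"));
    static final Map<String, Set<String>> O2 = Map.of();      // the relation is missing in o2

    // equivalence mappings from o1 concepts to o2 concepts
    static final Map<String, String> ALIGNMENT = Map.of(
            "o1:Stockholm", "o2:Stockholm",
            "o1:Sweden", "o2:Sweden");

    public static void main(String[] args) {
        // For every asserted relation a -> b in o1 whose endpoints are both mapped,
        // the translated relation is a candidate missing relation for o2 unless o2
        // already asserts it (a real implementation would also check derivability).
        for (Map.Entry<String, Set<String>> entry : O1.entrySet())
            for (String b : entry.getValue()) {
                String a2 = ALIGNMENT.get(entry.getKey());
                String b2 = ALIGNMENT.get(b);
                if (a2 != null && b2 != null && !O2.getOrDefault(a2, Set.of()).contains(b2))
                    System.out.println("candidate missing relation: " + a2 + " is-located-in " + b2);
            }
    }
}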


7.2.2 Long-term future work

Three directions for long-term future work were identified: improving the scalability of the approach, developing new visualization techniques for large data sets, and extending the presented approach to ontologies represented in more expressive languages. The subsections below discuss each direction in more detail.

Improving the scalability of the approach

During the ToxOntology-MeSH use case, presented in Subsection 5.2.3, our implemented system had good responsiveness even with 10 000 concepts and more than 15 000 asserted is-a relations and mappings. The same holds for the experiments with the Anatomy track ontologies from the OAEI. In all experiments, the detection of the defects and the computation of their repairing actions each took approximately 30 seconds. However, the system required between 4 and 6 GB of memory. This limits the usage of our approach and system to medium-size ontologies (several thousand concepts) and prevents its application to ontologies such as SNOMED (approximately 400 000 classes). Thus, a close inspection of the algorithms is necessary in order to reduce memory consumption.

The ontology alignment algorithms in our system are another area that needs attention in the context of scalability, since they currently run for hours with high memory consumption. For comparison, the best performing ontology alignment systems that participated in OAEI 2012 run in less than a minute with less than 3 GB of memory for the same input. Thus, for a scalable, competitive system both the run time and the memory consumption should be reduced. Two directions can be explored to achieve this: optimizing the existing algorithms or developing new approaches. For instance, one option is to develop or reuse heuristics and (structure-based) preprocessing strategies in order to reduce the number of pairs of concepts for which similarity values are calculated, since, currently, the alignment algorithms compute similarity values for all pairs of concepts between the ontologies; a minimal sketch of such a filter is given at the end of this subsection. We could also investigate in more detail the usage of the mapping suggestions validated as wrong in all phases of the debugging and alignment components.

Another possibility for addressing the scalability of the system is to introduce session-based alignment and debugging, similar to the methods used in [57]. The session-based framework described in that paper addresses almost all of the challenges discussed in [83]. It presents three types of sessions that can be interrupted in order to provide partial results and can then be resumed: computation, validation and recommendation sessions. During a computation session mapping suggestions are generated, which are accepted or rejected during a validation session. A recommendation session is used to recommend combinations of alignment algorithms for future computation sessions. Adapting the session-based approach together with enhanced algorithms will improve scalability and user interaction not only during alignment but also during debugging, where scalability is also an issue.
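As a hedged illustration of such a preprocessing strategy (a simple token-based blocking heuristic, not necessarily the one that would be used in the system; names are hypothetical), the sketch below generates candidate concept pairs only when their labels share a word, instead of computing similarity values for all pairs.

    from collections import defaultdict

    def candidate_pairs(labels1, labels2):
        """Simple token-based blocking: only pair concepts whose labels share a word.

        labels1, labels2: dicts mapping concept ids to label strings.
        Returns a set of (concept1, concept2) pairs to pass to the similarity
        computation, usually far fewer than all |O1| * |O2| pairs.
        """
        index = defaultdict(set)                 # token -> concepts in ontology 2
        for c2, label in labels2.items():
            for token in label.lower().split():
                index[token].add(c2)

        pairs = set()
        for c1, label in labels1.items():
            for token in label.lower().split():
                for c2 in index[token]:
                    pairs.add((c1, c2))
        return pairs

    o1 = {"o1:1": "nasal bone", "o1:2": "blood vessel"}
    o2 = {"o2:7": "bone of nose", "o2:9": "heart"}
    print(candidate_pairs(o1, o2))   # {('o1:1', 'o2:7')}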

Visualization techniques for large data sets

Data visualization is another issue, especially when large data structures are involved. The visualization techniques employed in software systems have an important influence on how users perceive the presented data and on the ease of use of the system.

It was shown that our system provides contextual visualization, facilitating the understanding of the defects and their repairing actions. Using our grouping techniques, the visualized sets were in most cases small enough not to clutter the display. In some cases, however, there were too many objects on the display, which hindered the perception of the visualized information. This observation does not even consider what happens when the entire ontology network is visualized at once; adequate visualization of 10 000 concepts (as in some of the experiments described in Chapter 5) together with their asserted is-a relations is currently not possible with our system. In the above cases we consider only the subclass relations; in ontologies with additional predefined relations there will be even more, and more diverse, relations between the concepts to visualize.

These observations call for further improvement of the available visualization techniques or for the development of new ones. Moreover, improving the scalability of the approach will allow its application to large ontologies, which makes comprehensive visualization all the more important.

Ontologies in more expressive languages

The work presented in this thesis is in the context of taxonomies, the simplest kind of ontologies from a knowledge representation point of view. The components of taxonomies are named concepts and is-a relations. Limited to these two components, only simple relations in a domain can be expressed; for instance, recall the earlier example, maxilla is-a bone. However, other relations, such as bone is-not-a blood vessel and Stockholm is-located-in Sweden, cannot be expressed with taxonomies. Thus, extending the scope of this work to ontologies represented in more expressive languages is highly desirable in order to represent more complex relationships in the domain of interest.

A step in this direction is to look at the debugging of is-a relations in ontologies represented in more expressive languages and to investigate the limitations and possible extensions of the current approach in this setting. Regarding the detection phase, the knowledge intrinsic to an ontology network can be employed using techniques similar to those described in this thesis. Other approaches, such as those discussed in Subsection 6.1.1, can be utilized as well. Some of the works described in Subsection 6.1.2 discuss repairing of wrong is-a relations in the context of ontologies represented in more expressive languages.

When it comes to the algorithm for repairing a single missing is-a relation, in the context of taxonomies all possible solutions can be found; they consist of the is-a relations between the sub-concepts and super-concepts of the concepts in the missing is-a relation (see the sketch below). However, in the extended setting our repairing algorithm may not be able to find all solutions. More expressive languages allow complex concept definitions including different logical connectives and quantifiers, and thus a missing is-a relation can be repaired by adding an is-a relation that is not in the hierarchy of the concepts in the missing is-a relation.

Some work has already been done in this area. In [55] the problem of repairing missing is-a relations is formulated as a generalized version of the TBox abduction problem. In [65] we define properties for the ontologies, the set of missing is-a relations, the domain expert and preferences for the solutions of the problem in [55]. Also, in [93], complexity results for the existence, relevance and necessity decision problems for the generalized TBox abduction problem for EL++ ontologies are presented.
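For illustration, the sketch below (Python; hypothetical representation, not the thesis implementation) enumerates the candidate repairing actions for a single missing is-a relation (a, b) in a taxonomy: adding any is-a relation from a super-concept of a (or a itself) to a sub-concept of b (or b itself) makes a is-a b derivable.

    # Sketch: enumerate candidate repairing actions for a missing is-a relation (a, b)
    # in a taxonomy. A taxonomy is represented as a set of asserted is-a pairs.

    def super_concepts(taxonomy, c):
        result, frontier = {c}, {c}
        while frontier:
            frontier = {y for (x, y) in taxonomy if x in frontier} - result
            result |= frontier
        return result

    def sub_concepts(taxonomy, c):
        result, frontier = {c}, {c}
        while frontier:
            frontier = {x for (x, y) in taxonomy if y in frontier} - result
            result |= frontier
        return result

    def repairing_actions(taxonomy, missing):
        """All single is-a relations (x, y) whose addition makes `missing` derivable."""
        a, b = missing
        return {(x, y) for x in super_concepts(taxonomy, a)
                       for y in sub_concepts(taxonomy, b)}

    taxonomy = {("maxilla", "facial_bone"), ("flat_bone", "bone")}
    print(repairing_actions(taxonomy, ("maxilla", "bone")))
    # {('maxilla', 'bone'), ('maxilla', 'flat_bone'),
    #  ('facial_bone', 'bone'), ('facial_bone', 'flat_bone')}

Which of these logically possible actions is also correct and preferred from a modelling point of view is exactly where the domain knowledge and the domain expert come in.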


Bibliography

[1] Adult Mouse Anatomy. http://www.informatics.jax.org/searches/AMA_form.shtml. Accessed: 2013-10-01.
[2] Apache Jena project. http://jena.apache.org/. Accessed: 2013-08-26.

[3] FaCT++. http://owl.man.ac.uk/factplusplus/. Accessed: 2013-08-26.

[4] GoodRelations. http://www.heppnetz.de/projects/goodrelations/. Accessed: 2013-08-26.
[5] HermiT OWL Reasoner. http://hermit-reasoner.com. Accessed: 2013-08-26.

[6] MeSH: Medical Subject Headings. www.nlm.nih.gov/mesh/. Accessed: 2013-08-26.

[7] NCI-A. http://ncit.nci.nih.gov/ncitbrowser/. Accessed: 2013-10-01.

[8] Ontology Alignment Evaluation Initiative. http://oaei.ontologymatching.org. Accessed: 2013-08-26.
[9] Pellet OWL 2 Reasoner. http://clarkparsia.com/pellet. Accessed: 2013-08-26.

[10] PubMed. www.ncbi.nlm.nih.gov/pubmed/. Accessed: 2013-08-26.
[11] SNOMED-CT. http://www.ihtsdo.org/snomed-ct/. Accessed: 2013-08-26.

[12] The Pizza ontology. http://owl.cs.manchester.ac.uk/co-ode-files/ontologies/pizza.owl. Accessed: 2013-08-26.
[13] The Wine ontology. w3.org/TR/owl-guide/wine.rdf. Accessed: 2013-08-26.


[14] Unified Medical Language System. http://www.nlm.nih.gov/research/umls/. Accessed: 2013-08-26.
[15] M Ashburner, C A Ball, J A Blake, D Botstein, H Butler, J M Cherry, A P Davis, K Dolinski, S S Dwight, J T Eppig, M A Harris, D P Hill, L Issel-Tarver, A Kasarskis, S Lewis, J C Matese, J E Richardson, M Ringwald, G M Rubin, and G Sherlock. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature genetics, 25(1):25–29, 2000.
[16] F Baader, D Calvanese, D L McGuinness, D Nardi, and P F Patel-Schneider, editors. The description logic handbook: theory, implementation, and applications. 2003.
[17] M Bada and L Hunter. Identification of OBO Nonalignments and Its Implications for OBO Enrichment. Bioinformatics (Oxford, England), 24(12):1448–1455, 2008.
[18] S Bail, B Parsia, and U Sattler. Declutter Your Justifications: Determining Similarity Between OWL Explanations. In Proceedings of the 1st International Workshop on Debugging Ontologies and Ontology Mappings (WoDOOM 2012), volume 79 of LECP, pages 13–24, 2012.
[19] E Beisswanger and U Hahn. Towards valid and reusable reference alignments – ten basic quality checks for ontology alignments and their application to three different reference data sets. Journal of Biomedical Semantics, 3(Suppl 1), 2012.
[20] A Bernaras, I Laresgoiti, and J Corera. Building and Reusing Ontologies for Electrical Network Applications. In Proceedings of the 12th European Conference on Artificial Intelligence (ECAI 1996), pages 298–302, 1996.
[21] T Berners-Lee, J Hendler, and O Lassila. The Semantic Web. Scientific American, 284(5):34–43, 2001.
[22] P A Bernstein and S Melnik. Model Management 2.0: Manipulating Richer Mappings. In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, SIGMOD 2007, pages 1–12, 2007.
[23] O Bodenreider, T Hayamizu, M Ringwald, S D Coronado, and S Zhang. Of mice and men: aligning mouse and human anatomies. In Proceedings of the American Medical Informatics Association (AMIA) Annual Symposium, pages 61–65, 2005.
[24] P Buitelaar, P Cimiano, and B Magnini, editors. Ontology Learning from Text: Methods, Evaluation and Applications, volume 123 of Frontiers in Artificial Intelligence and Applications Series. July 2005.


[25] C Calero, F Ruiz, and M Piattini, editors. Ontologies for Software Engineering and Software Technology. 2006.
[26] B Chen, H Tan, and P Lambrix. Structure-Based Filtering for Ontology Alignment. In 15th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises, WETICE 2006, pages 364–369, 2006.
[27] C Conroy, R Brennan, D O'Sullivan, and D Lewis. User Evaluation Study of a Tagging Approach to Semantic Mapping. In The Semantic Web: Research and Applications, volume 5554 of LNCS, pages 623–637. 2009.
[28] M Copeland, R S Gonçalves, B Parsia, U Sattler, and R Stevens. Finding fault: detecting issues in a versioned ontology. In Proceedings of the 2nd International Workshop on Debugging Ontologies and Ontology Mappings (WoDOOM 2013), volume 999 of CEUR Workshop Proceedings, pages 9–20, 2013.
[29] O Corcho, M Fernández-López, and A Gómez-Pérez. Ontological Engineering: Principles, Methods, Tools and Languages. In Ontologies for Software Engineering and Software Technology, pages 1–48. 2006.
[30] O Corcho, C Roussey, L M V Blazquez, and I Perez. Pattern-based OWL Ontology Debugging Guidelines. In Proceedings of the Workshop on Ontology Patterns (WOP 2009), volume 516 of CEUR Workshop Proceedings, 2009.
[31] H Do and E Rahm. Matching large schemas: Approaches and evaluation. Information Systems, 32(6):857–885, 2007.
[32] W F Dowling and J H Gallier. Linear-time algorithms for testing the satisfiability of propositional horn formulae. The Journal of Logic Programming, 1(3):267–284, 1984.
[33] H Erdogan, O Bodenreider, and E Erdem. Finding Semantic Inconsistencies in UMLS Using Answer Set Programming. In Proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI 2010), pages 1927–1928, 2010.
[34] S M Falconer and M Storey. A Cognitive Support Framework for Ontology Mapping. In The Semantic Web, volume 4825 of LNCS, pages 114–127. 2007.
[35] F Giunchiglia, V Maltese, and A Autayeu. Computing minimal mappings between lightweight ontologies. International Journal on Digital Libraries, 12(4):179–193, 2012.
[36] A Gómez-Pérez, M Fernández-López, and O Corcho. Ontological Engineering. 2004.


[37] B Cuenca Grau, Z Dragisic, K Eckert, J Euzenat, A Ferrara, R Granada, V Ivanova, E Jiménez-Ruiz, A O Kempf, P Lambrix, A Nikolov, H Paulheim, D Ritze, F Scharffe, P Shvaiko, C Trojahn, and O Zamazal. Results of the Ontology Alignment Evaluation Initiative 2013. In Proceedings of the 8th International Workshop on Ontology Matching (OM 2013), volume 1111 of CEUR Workshop Proceedings, pages 61–100, 2013.
[38] T R Gruber. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition, 5(2):199–220, 1993.
[39] N Guarino, D Oberle, and S Staab. What Is an Ontology? In Handbook on Ontologies, International Handbooks on Information Systems, pages 1–17. Second edition, 2009.
[40] H Happel and S Seedorf. Applications of Ontologies in Software Engineering. In 2nd International Workshop on Semantic Web Enabled Software Engineering (SWESE 2006), 2006.
[41] M A Hearst. Automatic Acquisition of Hyponyms from Large Text Corpora. In Proceedings of the 14th Conference on Computational Linguistics, volume 2 of COLING 1992, pages 539–545, 1992.
[42] N Jeliazkova and V Jeliazkov. AMBIT RESTful web services: an implementation of the OpenTox application programming interface. Journal of Cheminformatics, 3(1):1–18, 2011.
[43] Q Ji, P Haase, G Qi, P Hitzler, and S Stadtmüller. RaDON—Repair and Diagnosis in Ontology Networks. In Proceedings of the 6th European Semantic Web Conference (ESWC 2009), volume 5554 of LNCS, pages 863–867, 2009.
[44] E Jiménez-Ruiz and B Cuenca Grau. LogMap: Logic-Based and Scalable Ontology Matching. In International Semantic Web Conference (ISWC 2011), volume 7031 of LNCS, pages 273–288, 2011.
[45] E Jiménez-Ruiz, B Cuenca Grau, I Horrocks, and R Berlanga. Ontology Integration Using Mappings: Towards Getting the Right Logical Consequences. In Proceedings of the 6th European Semantic Web Conference (ESWC 2009), volume 5554 of LNCS, pages 173–187, 2009.
[46] E Jiménez-Ruiz, B Cuenca Grau, Y Zhou, and I Horrocks. Large-scale Interactive Ontology Matching: Algorithms and Implementation. In Proceedings of the 20th European Conference on Artificial Intelligence (ECAI 2012), pages 444–449, 2012.
[47] R Judson, A Richard, D Dix, K Houck, F Elloumi, M Martin, T Cathey, T R Transue, R Spencer, and M Wolf. ACToR—Aggregated Computational Toxicology Resource. Toxicology and Applied Pharmacology, 233(1):7–13, 2008.


[48] A Kalyanpur. Debugging and Repair of OWL Ontologies. PhD thesis, 2006.
[49] A Kalyanpur, B Parsia, M Horridge, and E Sirin. Finding All Justifications of OWL DL Entailments. In Proceedings of the 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference, ISWC 2007/ASWC 2007, pages 267–280, 2007.
[50] A Kalyanpur, B Parsia, E Sirin, and B Cuenca Grau. Repairing Unsatisfiable Concepts in OWL Ontologies. In Proceedings of the 3rd European Conference on The Semantic Web: Research and Applications, ESWC 2006, pages 170–184, 2006.
[51] A Kalyanpur, B Parsia, E Sirin, and J Hendler. Debugging Unsatisfiable Classes in OWL Ontologies. Web Semantics: Science, Services and Agents on the World Wide Web, 3(4), 2005.
[52] A Kumar and B Smith. The Unified Medical Language System and the Gene Ontology: Some Critical Reflections. In Proceedings of the 26th German Conference on Artificial Intelligence, volume 2821 of LNAI, pages 135–148, 2003.
[53] P Lambrix. Ontologies in Bioinformatics and Systems Biology. In Artificial Intelligence Methods And Tools For Systems Biology, volume 5 of Computational Biology, pages 129–145. 2004.
[54] P Lambrix. Towards a semantic Web for bioinformatics using ontology-based annotation. In 14th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprise 2005, pages 3–7, 2005.
[55] P Lambrix, Z Dragisic, and V Ivanova. Get My Pizza Right: Repairing Missing is-a Relations in ALC Ontologies. In The 2nd Joint International Semantic Technology Conference (JIST 2012), volume 7774 of LNCS, pages 17–32. 2012.
[56] P Lambrix and V Ivanova. A unified approach for debugging is-a structure and mappings in networked taxonomies. Journal of Biomedical Semantics, 4(1), 2013.
[57] P Lambrix and R Kaliyaperumal. A Session-Based Approach for Aligning Large Ontologies. In Proceedings of the 10th European Semantic Web Conference (ESWC 2013), volume 7882 of LNCS, pages 46–60. 2013.
[58] P Lambrix and Q Liu. Using partial reference alignments to align ontologies. In Proceedings of the 6th European Semantic Web Conference (ESWC 2009), volume 5554 of LNCS, pages 188–202, 2009.


[59] P Lambrix and Q Liu. Debugging Is-a Structure in Networked Taxonomies. In Proceedings of the 4th International Workshop on Semantic Web Applications and Tools for the Life Sciences, SWAT4LS 2011, pages 58–65, 2012.
[60] P Lambrix and Q Liu. Debugging the missing is-a structure within taxonomies networked by partial reference alignments. Data & Knowledge Engineering, 86(0):179–205, 2013.
[61] P Lambrix, Q Liu, and H Tan. Repairing the Missing is-a Structure of Ontologies. In Proceedings of the 4th Asian Semantic Web Conference (ASWC 2009), volume 5926 of LNCS, pages 76–90, 2009.
[62] P Lambrix and H Tan. SAMBO - A system for aligning and merging biomedical ontologies. Journal of Web Semantics, 4(3):196–206, 2006.
[63] P Lambrix and H Tan. Ontology Alignment and Merging. In Anatomy Ontologies for Bioinformatics, volume 6 of Computational Biology, pages 133–149. 2008.
[64] P Lambrix, H Tan, V Jakoniene, and L Strömbäck. Biological Ontologies. In Semantic Web: Revolutionizing Knowledge Discovery in Life Sciences, pages 85–99. 2007.
[65] P Lambrix, F Wei-Kleiner, Z Dragisic, and V Ivanova. Repairing missing is-a structure in ontologies is an abductive reasoning problem. In Proceedings of the 2nd International Workshop on Debugging Ontologies and Ontology Mappings (WoDOOM 2013), volume 999 of CEUR Workshop Proceedings, pages 33–44, 2013.
[66] O Lassila and D L McGuinness. The Role of Frame-Based Representation on the Semantic Web. Technical report, 2001.
[67] Q Liu and P Lambrix. A System for Debugging Missing Is-a Structure in Networked Ontologies. In Data Integration in the Life Sciences, volume 6254 of LNCS, pages 50–57. 2010.
[68] C Meilicke, H Stuckenschmidt, and A Tamilin. Repairing Ontology Mappings. In Proceedings of the 22nd National Conference on Artificial Intelligence, volume 2 of AAAI 2007, pages 1408–1413, 2007.
[69] G A Miller. WordNet: a lexical database for English. Communications of the ACM, 38(11):39–41, 1995.
[70] S Mukherjea, B Bamba, and P Kankar. Information retrieval and knowledge discovery utilizing a biomedical Semantic Web. IEEE Transactions on Knowledge and Data Engineering, 17:1099–1110, 2005.


[71] R Neches, R Fikes, T Finin, T Gruber, R Patil, T Senator, and W P Swartout. Enabling technology for knowledge sharing. AI Magazine, 12(3):36–56, 1991.
[72] T A T Nguyen, R Power, P Piwek, and S Williams. Measuring the understandability of deduction rules for OWL. In Proceedings of the 1st International Workshop on Debugging Ontologies and Ontology Mappings (WoDOOM 2012), volume 79 of LECP, pages 1–12, 2012.
[73] T A T Nguyen, R Power, P Piwek, and S Williams. Predicting the Understandability of OWL Inferences. In Proceedings of the 10th European Semantic Web Conference (ESWC 2013), volume 7882 of LNCS, pages 109–123. 2013.
[74] G Qi and A Harth. Reasoning with Networked Ontologies. In Ontology Engineering in a Networked World, pages 363–380. 2012.
[75] G Qi, Q Ji, and P Haase. A Conflict-Based Operator for Mapping Revision. In Proceedings of the 8th International Semantic Web Conference (ISWC 2009), volume 5823 of LNCS, pages 521–536, 2009.
[76] R Reiter. A Theory of Diagnosis from First Principles. Artificial Intelligence, 32(1):57–95, 1987.
[77] C Roussey and O Zamazal. Antipattern detection: how to debug an ontology without a reasoner. In Proceedings of the 2nd International Workshop on Debugging Ontologies and Ontology Mappings (WoDOOM 2013), volume 999 of CEUR Workshop Proceedings, pages 45–56, 2013.
[78] F Ruiz and J R Hilera. Using Ontologies in Software Engineering and Technology. In Ontologies for Software Engineering and Software Technology, pages 49–102. 2006.
[79] E Santos, D Faria, C Pesquita, and F M Couto. Ontology alignment repair through modularization and confidence-based heuristics. CoRR, abs/1307.5322, 2013.
[80] S Schlobach. Debugging and Semantic Clarification by Pinpointing. In Proceedings of the 2nd European Conference on The Semantic Web: Research and Applications (ESWC 2005), volume 3532 of LNCS, pages 27–44, 2005.
[81] S Schlobach and R Cornet. Non-standard Reasoning Services for the Debugging of Description Logic Terminologies. In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI 2003), pages 355–360, 2003.


[82] L Serafini, A Borgida, and A Tamilin. Aspects of Distributed and Modular Ontology Reasoning. In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI 2005), pages 570–575, 2005.
[83] P Shvaiko and J Euzenat. Ontology Matching: State of the Art and Future Challenges. IEEE Transactions on Knowledge and Data Engineering, 25(1):158–176, 2013.
[84] H Stuckenschmidt. Debugging weighted ontologies. In Proceedings of the 2nd International Workshop on Debugging Ontologies and Ontology Mappings (WoDOOM 2013), volume 999 of CEUR Workshop Proceedings, pages 1–8, 2013.
[85] R Studer, V R Benjamins, and D Fensel. Knowledge Engineering: Principles and Methods. Data & Knowledge Engineering, 25(1–2):161–197, 1998.
[86] B Swartout, R Patil, K Knight, and T Russ. Toward Distributed Use of Large-Scale Ontologies. In Ontological Engineering, AAAI-97 Spring Symposium Series, pages 138–148, 1997.
[87] H Tan, V Jakoniene, P Lambrix, J Aberg, and N Shahmehri. Alignment of Biomedical Ontologies Using Life Science Literature. In Knowledge Discovery in Life Science Literature, volume 3886 of LNCS, pages 1–17. 2006.
[88] M Uschold and M Gruninger. Ontologies: Principles, methods and applications. Knowledge Engineering Review, 11:93–136, 1996.
[89] M Uschold and M Gruninger. Ontologies and Semantics for Seamless Connectivity. SIGMOD Record, 33(4):58–64, 2004.
[90] G van Heijst, A T Schreiber, and B J Wielinga. Using Explicit Ontologies in KBS Development. International Journal Human-Computer Studies, 46(2–3):183–292, 1997.
[91] H Wache, T Vögele, U Visser, H Stuckenschmidt, G Schuster, H Neumann, and S Hübner. Ontology-based integration of information—a survey of existing approaches. In Proceedings of the International Joint Conference on Artificial Intelligence-01 Workshop: Ontologies and Information Sharing, pages 108–117, 2001.
[92] P Wang and B Xu. Debugging Ontology Mappings: A Static Approach. Computing and Informatics, 27(1):21–36, 2008.
[93] F Wei-Kleiner, Z Dragisic, and P Lambrix. Abduction Framework for Repairing Incomplete EL Ontologies: Complexity Results and Algorithms. Under review.


No 1468 Qiang Liu: Dealing with Missing Mappings and Structure in a Network of Ontologies, 2011. No 1469 Ruxandra Pop: Mapping Concurrent Applications to Multiprocessor Systems with Multithreaded Processors and Network on Chip-Based Interconnections, 2011. No 1476 Per-Magnus Olsson: Positioning Algorithms for Surveillance Using Unmanned Aerial Vehicles, 2011. No 1481 Anna Vapen: Contributions to Web Authentication for Untrusted Computers, 2011. No 1485 Loove Broms: Sustainable Interactions: Studies in the Design of Energy Awareness Artefacts, 2011. FiF-a 101 Johan Blomkvist: Conceptualising Prototypes in Service Design, 2011. No 1490 Håkan Warnquist: Computer-Assisted Troubleshooting for Efficient Off-board Diagnosis, 2011. No 1503 Jakob Rosén: Predictable Real-Time Applications on Multiprocessor Systems-on-Chip, 2011. No 1504 Usman Dastgeer: Skeleton Programming for Heterogeneous GPU-based Systems, 2011. No 1506 David Landén: Complex Task Allocation for Delegation: From Theory to Practice, 2011. No 1507 Kristian Stavåker: Contributions to Parallel Simulation of Equation-Based Models on Graphics Processing Units, 2011. No 1509 Mariusz Wzorek: Selected Aspects of Navigation and Path Planning in Unmanned Aircraft Systems, 2011. No 1510 Piotr Rudol: Increasing Autonomy of Unmanned Aircraft Systems Through the Use of Imaging Sensors, 2011. No 1513 Anders Carstensen: The Evolution of the Connector View Concept: Enterprise Models for Interoperability Solutions in the Extended Enterprise, 2011. No 1523 Jody Foo: Computational Terminology: Exploring Bilingual and Monolingual Term Extraction, 2012. No 1550 Anders Fröberg: Models and Tools for Distributed User Interface Development, 2012. No 1558 Dimitar Nikolov: Optimizing Fault Tolerance for Real-Time Systems, 2012. No 1582 Dennis Andersson: Mission Experience: How to Model and Capture it to Enable Vicarious Learning, 2013. No 1586 Massimiliano Raciti: Anomaly Detection and its Adaptation: Studies on Cyber-physical Systems, 2013. No 1588 Banafsheh Khademhosseinieh: Towards an Approach for Efficiency Evaluation of Enterprise Modeling Methods, 2013. No 1589 Amy Rankin: Resilience in High Risk Work: Analysing Adaptive Performance, 2013. No 1592 Martin Sjölund: Tools for Understanding, Debugging, and Simulation Performance Improvement of Equation- Based Models, 2013. No 1606 Karl Hammar: Towards an Ontology Design Pattern Quality Model, 2013. No 1624 Maria Vasilevskaya: Designing Security-enhanced Embedded Systems: Bridging Two Islands of Expertise, 2013. No 1627 Ekhiotz Vergara: Exploiting Energy Awareness in Mobile Communication, 2013. No 1644 Valentina Ivanova: Integration of Ontology Alignment and Ontology Debugging for Taxonomy Networks, 2014.