THE JUSTIFICATORY STRUCTURE OF OWL ONTOLOGIES

A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Engineering and Physical Sciences

2013

By Samantha Patricia Bail

School of Computer Science

Contents

Abstract 9

Declaration 10

Copyright 11

Acknowledgements 12

1 Introduction 13
1.1 Errors in OWL ontologies ...... 16
1.2 Justification based debugging support ...... 17
1.2.1 Understanding justifications ...... 18
1.2.2 Justificatory structure ...... 20
1.2.3 Beyond debugging ...... 22
1.3 Research objectives ...... 23
1.4 Contributions ...... 24
1.5 Thesis structure ...... 25

2 Background and related work 28
2.1 Description logic knowledge bases ...... 29
2.1.1 DL syntax and semantics ...... 29
2.1.2 Standard reasoning services ...... 32
2.1.3 The Web Ontology Language OWL ...... 36
2.2 Errors in OWL ontologies ...... 39
2.2.1 Logical errors ...... 40
2.2.2 Non-logical errors ...... 43
2.2.3 Debugging ontologies ...... 44
2.3 Justifications for entailments of ontologies ...... 45
2.3.1 Justification based repair ...... 48
2.3.2 Computing justifications ...... 50
2.3.3 Understanding individual justifications ...... 53
2.3.4 Understanding multiple justifications ...... 60
2.4 Alternative approaches to debugging ...... 61

2.4.1 Proofs ...... 62
2.4.2 Ontology revision ...... 62
2.4.3 Direct computation of diagnoses ...... 63
2.4.4 OntoClean ...... 64
2.4.5 Ontology comprehension ...... 64
2.5 Summary and conclusions ...... 65

3 Defining finite entailment sets 67
3.1 Design decisions for finite entailment sets ...... 68
3.1.1 Tautologies ...... 69
3.1.2 Asserted and inferred axioms ...... 70
3.1.3 Transitivity ...... 71
3.1.4 Equivalent classes ...... 74
3.1.5 Strict and non-strict subsumptions ...... 75
3.1.6 Equivalence to top and bottom ...... 76
3.1.7 Axiom and expression types ...... 77
3.1.8 Dealing with ontology imports ...... 79
3.2 A notation for finite entailment sets ...... 81
3.2.1 Introducing the notation ...... 81
3.2.2 Axioms and expressions ...... 82
3.2.3 Wanted and unwanted entailments ...... 82
3.2.4 Sample entailment sets ...... 83
3.3 Entailments in OWL applications ...... 85
3.3.1 Inferred ontology generation in the OWL API ...... 86
3.3.2 Presenting entailments to end-users ...... 86
3.3.3 Ontology publishing ...... 87
3.3.4 Metrics and analytical applications ...... 87
3.3.5 Imported and native entailments in BioPortal ...... 88
3.4 Summary and conclusions ...... 89

4 The justificatory structure of OWL ontologies 91
4.1 Categories of justifications and entailments ...... 92
4.1.1 Self-justifications and self-supporting entailments ...... 92
4.1.2 Atomic subsumption chains ...... 93
4.1.3 Complex justifications ...... 94
4.1.4 Categorising entailments and ontologies ...... 95

4.2 Representing justifications as j-graphs ...... 96
4.2.1 J-graph definition ...... 96
4.2.2 J-graph generation ...... 98
4.3 Justificatory structure ...... 99
4.3.1 Axiom properties ...... 100
4.3.2 Properties of justifications ...... 103
4.3.3 Relations between justifications ...... 104
4.4 Summary and conclusions ...... 107

5 Justification isomorphism 109
5.1 Isomorphism ...... 111
5.2 Subexpression-isomorphism ...... 112
5.2.1 Representing equivalence classes ...... 117
5.3 Lemma-isomorphism ...... 119
5.3.1 Restrictions on lemmatisations ...... 121
5.3.2 Lemmatisations and obvious steps ...... 122
5.3.3 Non-transitivity ...... 124
5.4 Equivalence and superfluity ...... 125
5.5 Implementing an isomorphism checker ...... 126
5.5.1 Algorithm and implementation ...... 127
5.5.2 Optimisations ...... 128
5.5.3 Limitations due to syntactical differences ...... 130
5.5.4 Extending the j-graph ...... 132
5.6 Summary and conclusions ...... 133

6 Coping strategies 135
6.1 Debugging problems ...... 136
6.1.1 Defining debugging problems ...... 137
6.1.2 Justification encounters ...... 138
6.2 Measuring effort ...... 140
6.2.1 The complexity of individual justifications ...... 140
6.2.2 A model for user effort ...... 143
6.3 Coping strategies ...... 144
6.3.1 Characterising justification sets ...... 145
6.3.2 Justification overlap ...... 146
6.3.3 Isomorphism relations ...... 149

6.3.4 Combining isomorphism and overlap ...... 151
6.4 Summary and conclusions ...... 152

7 A survey of justificatory structure 153
7.1 The BioPortal corpus ...... 153
7.1.1 Properties of the corpus ...... 154
7.1.2 Justification corpus preparation ...... 154
7.2 Results of the BioPortal survey ...... 158
7.2.1 Entailment types ...... 158
7.2.2 Occurrence of multiple justifications ...... 159
7.2.3 Justification overlap ...... 162
7.2.4 Justification isomorphism ...... 168
7.3 Discussion ...... 177
7.3.1 Justification types and frequency ...... 177
7.3.2 Overlaps ...... 178
7.3.3 Isomorphism ...... 180
7.3.4 Limitations ...... 181
7.4 Summary and conclusions ...... 182

8 Conclusions 184
8.1 Summary of contributions ...... 184
8.1.1 Design decisions for finite entailment sets ...... 184
8.1.2 Justificatory structure and justification isomorphism ...... 185
8.1.3 Reducing user effort ...... 186
8.1.4 Experimental results ...... 186
8.2 Significance of results ...... 187
8.3 Future directions ...... 189

A Ontologies in the test corpus 213

Word Count: 54,391

List of Tables

2.1 ALC constructors and semantics...... 32

3.1 Entailment set properties and keys ...... 82
3.2 Ontologies and imported entailments in the NCBO BioPortal ...... 88

7.1 Overview of the data in sets Ssa and Su ...... 156
7.2 Overview of OWL 2 profiles ...... 157
7.3 Overview of the basic ontology metrics in the corpus ...... 158

7.4 Entailment types in sets Ssa and Su...... 159

7.5 Root and derived justifications in Ss and Su ...... 165
7.6 Mean times (in seconds) per ontology for isomorphism detection ...... 169
7.7 Template frequency and coverage across the corpus ...... 173
7.8 Most frequent templates for lemma-isomorphism across the corpus ...... 176

7.9 Comparison of reductions in Ss and Ssl...... 177

List of Figures

1.1 Screenshot of multiple justifications in the Protégé 4 ontology editor ...... 15

2.1 A screenshot of the Explanation tab in Protégé 4 ...... 47
2.2 A screenshot of the Repair tool in Swoop ...... 49
2.3 Three justification patterns for C1 ⊑ C2 ...... 59

3.1 Class graphs representing the transitive reducts of O and O′ ...... 73
3.2 Asserted and inferred class graphs of a toy ontology ...... 84
3.3 Screenshot of the ‘Selected Entailments’ tab in Protégé 4 ...... 87

4.1 A decision tree for categorising entailments ...... 95
4.2 An example of a j-graph for justifications and entailments ...... 97
4.3 J-graph illustrating axiom frequency, impact, semantic relevance ...... 100

5.1 Three justifications which are s-isomorphic via transitivity ...... 115
5.2 Parse tree of a justification ...... 127
5.3 An extended j-graph containing four template nodes ...... 132

6.1 Different representations of a justification for A1 ⊑ A6 ...... 141

7.1 The justification corpus preparation workflow ...... 155
7.2 Frequency of multiple complex justifications in the corpus ...... 161
7.3 Axiom frequency, impact, and semantic relevance ...... 164
7.4 Overlap frequency with and without outlier ontologies ...... 166
7.5 Comparison of reduction caused by isomorphism types ...... 171
7.6 Template frequencies for strict and lemma-isomorphism ...... 174

List of key terms

active axiom, 102
activity, 102
alleviation factor a, 142
atomic subsumption chain, 92
bridging axiom, 105
cognitive complexity, 57
debugging problem, 135
diagnosis, 50
effort score c, 140
entailment set ε_K, 34
fine-grained justifications, 52
frequency, 99
graph component, 103
hitting set, 50
impact, 100
  negative, 101
  positive, 101
inferential power, 106
isomorphism
  lemma-, 120
  strict, 110
  subexpression-, 112
j-graph, 95
justification J, 44
justification axioms JustAx(ε_O), 95
justification graph, 95
justification set Justs(η), 44
justificatory redundancy, 102
justificatory structure, 98
laconic justification, 52
masking
  cross, 54
  external, 54
  internal, 54
  shared cores, 55
minimal conflict set, 50
overlap, 104
preferred template Θ_p, 117
proofs, justification based, 56
repair, 47
  minimal, 47
root and derived, 60
self-justification, 91
self-supporting entailment, 92
semantic relevance, 101
summarising lemmatisation, 120
surface pattern, 103
template Θ, 112
unwanted entailments ε_O^−, 82
wanted entailments ε_O^+, 82

Abstract

The Web Ontology Language OWL is based on the highly expressive description logic SROIQ, which allows OWL ontology users to employ out-of-the-box reasoners to compute information that is not only explicitly asserted, but entailed by the ontology. Explanation facilities for entailments of OWL ontologies form an essential part of ontology development tools, as they support users in detecting and repairing errors in potentially large and highly complex ontologies, thus helping to ensure ontology quality.

Justifications, minimal subsets of an ontology that are sufficient for an entailment to hold, are currently the prevalent form of explanation in OWL ontology development tools. They have been found to significantly reduce the time and effort required to debug erroneous entailments. A large number of entailments, however, have not only one but many justifications, which can make it considerably more challenging for a user to find a suitable repair for the entailment.

In this thesis, we investigate the relationships between multiple justifications for both single and multiple entailments, with the goal of exploiting this justificatory structure in order to devise new coping strategies for multiple justifications. We describe various aspects of the justificatory structure of OWL ontologies, such as shared axiom cores and structural similarities. We introduce a model for measuring user effort in the debugging process and propose debugging strategies that exploit the justificatory structure in order to reduce user effort. Finally, an analysis of a large corpus of ontologies from the biomedical domain reveals that OWL ontologies used in practice frequently exhibit a rich justificatory structure.

Date of submission: 26/04/2013

Declaration

No portion of the work referred to in this thesis has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.

Copyright

i. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and she has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

ii. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

iii. The ownership of certain Copyright, patents, designs, trade marks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

iv. Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=487), in any relevant Thesis restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and in The University’s policy on presentation of Theses.

Acknowledgements

I would like to thank my supervisors Bijan Parsia and Uli Sattler for their support, enthusiasm, and patience with me throughout this PhD, and the DL-istas in the IMG and BHIG for their ideas and the many interesting discussions in our office, in particular Colin Puleston who kindly took on the task of saving my broken code. I would also like to thank Sebastian Rudolph for contributing a crucial counter-example, Linda Macauly for the great practice viva, and, of course, my examiners, Stefan Schlobach and Caroline Jay. And finally, I want to thank my family who have been so supportive over the course of my seemingly never-ending studies.

Chapter 1

Introduction

Since its standardisation by the W3C in 2004, the Web Ontology Language OWL [W3C09] has become a widely used language for representing ontological knowledge across a wide spectrum of domains, ranging from chemistry to bio-health informatics and medical data. In its latest revision, OWL 2 [CHM+08], an OWL ontology corresponds to a set of axioms in the highly expressive description logic SROIQ(D) [HKS06] that make statements about the classes, properties, and individuals in an ontology, as well as the relations between them. OWL ontology development often resembles traditional software development: from the first design to a release candidate, axioms and entities are frequently added, modified, and removed, with a vast array of tools supporting the ontology engineers throughout the development process.

Explanation tools provide support to ontology developers attempting to understand why an ontology entails a certain—potentially incorrect—piece of information. Explanations fulfil several purposes: first, an explanation can help an ontology user understand why a certain entailment holds, which may lead to a better overall understanding of an ontology. Second, explanations are crucial in the ontology debugging process, as they point the user towards those axioms in the ontology which are responsible for the error.

Justification based explanation support [SC03, KPSH05, Hor11a] for OWL ontologies is the most prevalent form of explanation in current OWL development tools, with editors such as Protégé 4 and Swoop1 providing justification based debugging tools. A justification J for an entailment η of an ontology O is a minimal subset of O which is sufficient for η to hold. Justifications have two major benefits: the minimality of a justification prevents the tedious process of searching through the ontology in order to find the responsible axioms, while also allowing users to focus on a (usually) small subset of the ontology. Furthermore, since a justification simply consists of axioms occurring in O, ontology developers do not have to learn any additional mechanisms or notations in order to understand the explanation.

1https://code.google.com/p/swoop/

Due to the nature of OWL ontologies, it is possible for any given entailment to have not only one, but multiple justifications; worse even, we know that an entailment can potentially have exponentially many justifications [BPS07a]. While we have found that, on average, the number of multiple justifications is well below this exponential threshold [BHPS11], even small numbers of justifications for an entailment can pose a significant challenge for ontology developers wanting to understand all the reasons why an entailment holds. More importantly, in order to remove an unwanted entailment, we need to break all its justifications; a task which can often be tedious and lead to the loss of wanted entailments due to the removal of more axioms than necessary.

There exist a number of techniques which aim to support users’ understanding of single justifications for single entailments: fine-grained justifications [KPC06, LPSV06, HPS08] reduce the axioms in justifications to their relevant parts and remove superfluous information; justification based proofs [HP10] provide additional, intermediate proof steps to explain subsets of justifications; and natural language proofs [NPPW12a] attempt to improve understandability of an explanation by abstracting from the logical formalism. Furthermore, there have been several approaches to investigating the cognitive complexity of justifications [HBPS11a, NPPW12a], with the aim of gaining insights into what causes justifications to be easy or difficult for human users to understand.

On the other hand, the issue of understanding and repairing multiple justifications for single or multiple entailments has only received little attention; in OWL editors, multiple justifications are generally simply displayed as a long, scrollable list of axiom sets, as shown in Figure 1.1. In order to reduce the effort involved in repairing multiple justifications, Kalyanpur et al. [KPSH05] introduced root and derived justifications, where a root justification is a strict subset of a derived justification. Thus, breaking a root justification also breaks all justifications that are derived from it. Kalyanpur [KPSC06] and Lam [Lam07] also introduced axiom rankings which rank axioms based on their occurrence in multiple justifications, based on the idea that removing or rewriting a single axiom that occurs in multiple justifications causes less effort while also leading to the loss of fewer wanted entailments. Both root and derived justifications and axiom rankings make use of the relations between multiple justifications in the form of subset relationships and shared axioms, respectively.

Figure 1.1: Screenshot of multiple justifications in the Protégé 4 ontology editor.

This gives rise to the question whether there exist other relations between multiple justifications which can be exploited in order to reduce user effort when attempting to understand or repair multiple justifications for single and multiple entailments.

The core research objective of this thesis is to identify aspects of the justificatory structure of OWL ontologies, that is, the set of relations and metrics of justifications for a given set of entailments of an OWL ontology. These relations can take the form of justification overlaps of varying degrees and structural similarity between justifications, while we are also interested in metrics such as the number of justifications for an entailment or the frequency with which an axiom occurs in multiple justifications. Based on these relations, we then propose techniques which can reduce the user effort required to understand and repair multiple justifications. Finally, we investigate the occurrence of these relations in a large set of OWL ontologies in order to determine whether and to which extent the proposed measures can be applied to OWL ontologies used in practice.
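The repair requirement just described (an unwanted entailment disappears only once every one of its justifications has been broken) can be made concrete with a small sketch. The following Python fragment is purely illustrative and is not an algorithm from this thesis: it models entailment as mere reachability over atomic subsumption axioms, uses made-up axiom names, and enumerates justifications by brute force, which would be hopeless for realistic ontologies.

from itertools import combinations

# Toy ontology: atomic subsumptions only. Entailment of "A SubClassOf B" is
# modelled as reachability, which is far weaker than OWL/SROIQ entailment.
axioms = {
    ("Cat", "Carnivore"), ("Carnivore", "Animal"),
    ("Cat", "Pet"), ("Pet", "Animal"),          # hypothetical second path
}

def entails(axiom_set, sub, sup):
    # Does axiom_set entail 'sub SubClassOf sup'? (simple graph reachability)
    seen, todo = {sub}, [sub]
    while todo:
        node = todo.pop()
        for (a, b) in axiom_set:
            if a == node and b not in seen:
                seen.add(b)
                todo.append(b)
    return sup in seen

def all_justifications(axiom_set, sub, sup):
    # Brute force: collect every minimal subset that still entails the axiom.
    justs = []
    for size in range(1, len(axiom_set) + 1):
        for subset in combinations(sorted(axiom_set), size):
            candidate = set(subset)
            if entails(candidate, sub, sup) and not any(j < candidate for j in justs):
                justs.append(candidate)
    return justs

justs = all_justifications(axioms, "Cat", "Animal")
print(len(justs))                                        # 2, one per path
# Removing an axiom from only one justification leaves the entailment intact;
# a repair has to hit every justification.
print(entails(axioms - {("Cat", "Carnivore")}, "Cat", "Animal"))                      # True
print(entails(axioms - {("Cat", "Carnivore"), ("Pet", "Animal")}, "Cat", "Animal"))   # False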

1.1 Errors in OWL ontologies

Due to its foundations in description logics [BCM+03] and the availability of highly optimised description logic reasoners, OWL ontology modellers and users can apply automated reasoning techniques to elicit not only explicitly asserted knowledge in the ontology, but also implicit information which is entailed by the ontology. This makes it possible to model complex hierarchical relations without the need to make explicit every single relation between sub- and super-classes. For example, an ontology which contains the axioms Cat ⊑ Carnivore (‘every Cat is a Carnivore’) and Carnivore ⊑ Animal (‘every Carnivore is an Animal’) also entails the statement Cat ⊑ Animal, without having to explicitly state it.

Some OWL ontologies, such as the National Cancer Institute (NCI) Thesaurus ontology [dCHS+04], undergo a highly dynamic iterative development process, with a team of developers working to produce monthly updates to the ontology. As ontology re-use and sharing on the web is highly encouraged, existing OWL ontologies may be imported, or merged into newly built ones. OWL editing tools, such as the Protégé 4 editor,2 allow ontology developers to create large and complex ontologies; we often find OWL ontologies that contain thousands of classes and axioms which are highly interlinked across the ontology [WPH06, HPS11]. All these processes can lead to various errors in the resulting ontology.

2http://protege.stanford.edu/

The type of error we focus on in this thesis is the occurrence of unwanted entailments. These may either assume the form of unsatisfiable classes and ontology inconsistency, which are generally considered to be errors that should be eliminated from an ontology, or the presence of an entailment which is considered to be incorrect with respect to the domain knowledge modelled in the ontology. Similar to the debugging process in software development, debugging an OWL ontology involves finding those parts of the ontology that cause the problem, then modifying or removing them in order to rectify the error.

Due to the large size and often complex structure of OWL ontologies, finding and repairing these errors without appropriate tool support can be a daunting and error-prone task: simply removing those parts of the ontology which are suspected to cause the error may result in a significant loss of relevant information, but is not guaranteed to sufficiently repair the ontology. This is where explanation comes into play: on an abstract level, an explanation for an entailment of an ontology is a statement, or a set of statements, which traces the source of the error and explains to the user what information in the ontology leads to the entailment and how it does this.

Explanation has a long history in research on intelligent systems (e.g. [Cla81, BS84a]). It has been considered an important component of expert systems, such as MYCIN [BS84b], since the early 1980s, not just for the purpose of debugging a system, but also for tasks such as learning about the information modelled in the system, and convincing human experts of the system’s correctness. Because of its role in these tasks, explanation is thought to enhance both the correctness and the acceptance of an intelligent system. Russell and Norvig [RN03] illustrate this point with an example of the role explanation facilities play in user-facing expert systems:

‘A leading expert on lymph-node pathology describes a fiendishly difficult case to the expert system, and examines the system’s diagnosis. He scoffs at the system’s response. Only slightly worried, the creators of the system suggest he ask the computer for an explanation of the diagnosis. The machine points out the major factors influencing its decision and explains the subtle interaction of several of the symptoms in this case. The expert admits his error, eventually.’

1.2 Justification based debugging support

Explanation can assume various forms, such as a trace of rules in rule-based expert systems [BS84b], or proofs in logic-based systems [BFH99, McG96]. In OWL tools, explanation generally takes the form of a pinpoint of the set—or sets—of axioms that cause the entailment to hold, which we commonly denote as justifications. The following example illustrates the idea of justification based debugging support:

Example 1.1.

(1) Cat ⊑ Carnivore
(2) Carnivore ≡ Animal ⊓ ∀eats.Animal
(3) Plant ⊑ ¬Animal
(4) Grass ⊑ Plant
(5) PetOwner ≡ Human ⊓ ∃hasPet.Animal
(6) Cat ⊑ ∃eats.Mouse
(7) SickCat ≡ Cat ⊓ ∃eats.Grass
(8) ∃eats.⊤ ⊑ Animal

The ontology O comprising the axioms in the above example makes statements about the entities in the domain of animals and their eating habits.3 Several other statements follow logically from O, for example Cat ⊑ Animal (‘everything that is a Cat is an Animal’), and the unsatisfiability of the class SickCat, which is expressed as SickCat ⊑ ⊥ where ⊥ represents the ‘bottom’ concept, or Nothing in OWL lingo. In ontology development tools, such unsatisfiable classes are usually highlighted to make the user aware of them, for example by rendering them in red colour, or arranging them as a subclass of Nothing.

3For the purpose of this example, we assume the statement ∃eats.⊤ ⊑ Animal to be correct.

Unsatisfiable classes are commonly regarded as errors, as they cannot have any instances. An ontology developer would want to repair the error by re-writing or removing some of the axioms responsible for causing it. Without a debugging tool, the user has to browse the ontology, find the responsible axioms, and apply modifications they consider appropriate. Providing a justification for the entailment, however, allows the user to focus directly on the responsible axioms, thus clearly reducing the effort involved in finding a suitable repair—and preventing the user from going down the ‘wrong path’ or giving up their search for the ‘needle in the haystack’.

In our example ontology, there exists exactly one justification for the entailment SickCat ⊑ ⊥, namely the set consisting of the axioms Cat ⊑ Carnivore, Carnivore ≡ Animal ⊓ ∀eats.Animal, Grass ⊑ Plant, SickCat ≡ Cat ⊓ ∃eats.Grass, and Plant ⊑ ¬Animal. The contradiction stems from the fact that Cat is asserted to be an Animal which only eats other Animals, but SickCat eats Grass, which is asserted to be a Plant, thus not an Animal. There are different approaches to repairing this error, such as weakening the restriction that a Carnivore eats only Animals; however, as we can already see from this small example, finding an appropriate repair without weakening or sacrificing too much information is a non-trivial task and requires the user to understand the relations between the axioms in the justification.
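Such a justification can also be found mechanically. The sketch below shows only the basic idea behind black-box justification extraction (the ‘contraction’ step discussed in Chapter 2): remove any axiom whose removal preserves the entailment, and what remains is a minimal sufficient subset. Everything in it is a simplification for illustration; in particular, the stand-in entailment test is plain reachability over atomic subsumptions and cannot capture the SickCat example with its universal restrictions and disjointness, and a real implementation would call a DL reasoner instead.

def entails(axiom_set, sub, sup):
    # Stand-in for a reasoner call: atomic-subsumption reachability only.
    seen, todo = {sub}, [sub]
    while todo:
        node = todo.pop()
        for (a, b) in axiom_set:
            if a == node and b not in seen:
                seen.add(b)
                todo.append(b)
    return sup in seen

def one_justification(ontology, sub, sup):
    # Contraction: drop every axiom whose removal preserves the entailment.
    # By monotonicity of entailment, the remaining set is a justification.
    working = set(ontology)
    for axiom in list(working):
        if entails(working - {axiom}, sub, sup):
            working.remove(axiom)
    return working

# Hypothetical toy ontology; only the first two axioms are responsible.
ontology = {("Cat", "Carnivore"), ("Carnivore", "Animal"),
            ("Grass", "Plant"), ("Plant", "Vegetation")}
print(one_justification(ontology, "Cat", "Animal"))
# {('Cat', 'Carnivore'), ('Carnivore', 'Animal')}

Computing all justifications of an entailment requires more machinery on top of such a single-justification procedure (for example a hitting-set style search); approaches to this are reviewed in Chapter 2.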

1.2.1 Understanding justifications

Repairing an entailment becomes more challenging as the number of justifications for an entailment grows. Given the entailment Cat ⊑ Animal from the example ontology above, we can find two justifications: the set consisting of the two axioms Cat ⊑ Carnivore and Carnivore ≡ Animal ⊓ ∀eats.Animal, and the second, perhaps less obvious, justification containing Cat ⊑ ∃eats.Mouse and ∃eats.⊤ ⊑ Animal. A user who wants to understand why Cat ⊑ Animal follows from the ontology will be presented with both justifications; if this entailment was unwanted, the user would have to understand and break both justifications in order to remove it from the ontology.

Assume the user is faced not only with a small toy ontology like the one above, but with a large and complex ontology, containing several thousand classes, properties, individuals, and axioms. The number of justifications for a single entailment can grow exponentially in the number of axioms in the ontology, with several thousand found in some ontologies used in practice [BHPS11]. And still, if the user wants to ensure that the entailment no longer holds in the ontology, every single justification has to be broken. While it is possible for a user to inspect and modify every justification in isolation, this approach is time-consuming and error-prone. What we need is a strategy that helps users cope with multiple justifications and supports them in finding an appropriate repair.

The question of how users interact with justifications once they have been computed has been receiving some attention over the past few years [KPC06, LPSV06, HPS10b, HBPS11b]. Depending on the respective task, whether the explanation is requested to fix an error or to aid understanding of an entailment, this interaction can assume different shapes and objectives. For the purpose of debugging, users may regard a minimal repair as crucial, that is, removing an error while losing as little other information as possible. This may be achieved by modifying the given justifications to make them easier for users to understand, or, when dealing with multiple justifications, by identifying axioms which occur across several justifications.

In case of the former, fine-grained justifications [LPSV06, KPC06, HPS08] are variants of justifications which are as weak as possible and do not contain any superfluous expressions that could possibly distract a user from the actual cause of the entailment. Such fine-grained justifications have been defined more precisely as laconic and precise justifications [HPS08], with suitable and efficient algorithms to compute them for entailments of OWL ontologies. In the above example, the justification {Cat ⊑ Carnivore, Carnivore ≡ Animal ⊓ ∀eats.Animal} for the entailment Cat ⊑ Animal can be re-written into its laconic version {Cat ⊑ Carnivore, Carnivore ⊑ Animal}, which reduces the second axiom to its relevant parts. While fine-grained justifications might make it easier to understand a single justification, they do not support understanding multiple justifications for an entailment.

Dealing with multiple justifications, either for individual or for multiple entailments, has been somewhat facilitated by the introduction of root and derived justifications [PSK05] which assist in the repair of multiple unsatisfiable classes: a root justification is a subset of a derived justification; breaking the root justification also breaks all those justifications which are derived from it. As an example, assume we add the axiom SickCatOwner ≡ PetOwner ⊓ ∃hasPet.SickCat to the ontology. The class SickCatOwner will then be unsatisfiable, but its unsatisfiability depends entirely on the unsatisfiability of the class SickCat. Thus, SickCatOwner is a derived unsatisfiable class, whereas SickCat is the root cause. Root and derived justifications, as used in the ontology editor Swoop, have been successfully shown to reduce user effort when debugging multiple entailments [KPSH05].
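Identifying which members of a set of justifications are roots is a plain subset check, as the short sketch below illustrates; the justification contents are hypothetical placeholders written as axiom labels, not output from any particular tool.

def split_roots_and_derived(justifications):
    # A justification is derived if some other justification is a strict
    # subset of it; breaking that root also breaks the derived justification.
    roots, derived = [], []
    for j in justifications:
        (derived if any(other < j for other in justifications) else roots).append(j)
    return roots, derived

j_sickcat = {"SickCat ≡ Cat ⊓ ∃eats.Grass", "Cat ⊑ Carnivore",
             "Carnivore ≡ Animal ⊓ ∀eats.Animal", "Grass ⊑ Plant", "Plant ⊑ ¬Animal"}
j_sickcatowner = j_sickcat | {"SickCatOwner ≡ PetOwner ⊓ ∃hasPet.SickCat"}

roots, derived = split_roots_and_derived([j_sickcat, j_sickcatowner])
print(len(roots), len(derived))   # 1 1 -- repairing the root also repairs the derived one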

1.2.2 Justificatory structure

The debugging tool in the Swoop editor by Kalyanpur [KPS+06] was the first attempt to make use of the relations between justifications in order to support the user in finding a suitable repair for multiple justifications. In addition to root and derived information, the repair tool also presents metrics for the axioms in the justifications, such as the axiom frequency (the number of justifications an axiom occurs in), impact (the number of entailments an axiom affects), and usage (of terms occurring in the axiom). Based on these metrics, the tool computes a rank for each axiom, with the lowest ranked axioms being suggested for removal or modification. This provides users with important guidance on where to start repairing a set of justifications.

This approach makes use of the fact that there exist structural relations between justifications, such as shared axioms of various extents, ranging from single axioms to subset relationships, as in the case of root and derived justifications. In other words, given a set of entailments of an ontology and their justifications, we can identify the justificatory structure of the ontology, that is, the set of features and relations of its justifications.

Despite the potential usefulness for improving debugging support for multiple justifications and the insights we can gain into the relationships between entailments of OWL ontologies, their justifications, and the axioms they contain, there has been no further exploration of the justificatory structure of OWL ontologies. Therefore, one of the main goals of this thesis is to identify the different aspects of justificatory structure of ontologies, to investigate potential applications of these relations in the debugging process, and to establish how prevalent these structural aspects are in OWL ontologies used in practice.

Shared axiom relationships, such as root and derived justifications, are only one aspect of justificatory structure. Another interesting phenomenon we often find in OWL justifications is the similarity between justifications, as we can see in the well-known Pizza tutorial4 ontology: take, for example, the entailment Fiorentina ⊑ InterestingPizza, which has over 200 justifications. The following example shows three of those justifications; Fiorentina is abbreviated to Fi, hasTopping to hasTop, InterestingPizza to IP, and ⊤ is the description logic notation for the top concept Thing which stands for ‘any element in the domain’.

Example 1.2.

(1)  Fi ⊑ NamedPizza             | Fi ⊑ NamedPizza             |
(2)  NamedPizza ⊑ Pizza          | NamedPizza ⊑ Pizza          | domain(hasTop, Pizza)
(3)  Fi ⊑ ∃hasTop.Tomato         | Fi ⊑ ∃hasTop.Tomato         | Fi ⊑ ∃hasTop.Tomato
(4)  Fi ⊑ ∃hasTop.Olive          | Fi ⊑ ∃hasTop.Mozzarella     | Fi ⊑ ∃hasTop.Garlic
(5)  Fi ⊑ ∃hasTop.Spinach        | Fi ⊑ ∃hasTop.Olive          | Fi ⊑ ∃hasTop.Spinach
(6)  Spinach ⊑ ¬Tomato           | Mozzarella ⊑ ¬Tomato        | Spinach ⊑ ¬Tomato
(7)  Spinach ⊑ ¬Olive            | Mozzarella ⊑ ¬Olive         | Spinach ⊑ ¬Garlic
(8)  Olive ⊑ ¬Tomato             | Olive ⊑ ¬Tomato             | Garlic ⊑ ¬Tomato
(9)  IP ≡ Pizza ⊓ ≥3 hasTop.⊤    | IP ≡ Pizza ⊓ ≥3 hasTop.⊤    | IP ≡ Pizza ⊓ ≥3 hasTop.⊤

While these justifications may look complicated at first, they can all be easily summarised as follows:

• Axioms 1-2 (2-3 in the last justification): Fiorentina is a Pizza.

• Axioms 3-5: Fiorentina has three kinds of toppings.

• Axioms 6-8: These three toppings are pairwise disjoint, that is, no element in the domain can be a member of several of the listed toppings at the same time.

• Axiom 9: Anything that is a Pizza and has at least three toppings is an InterestingPizza.

Despite the fact that the justifications contain different axioms, we can immediately see that they are very similar and that it suffices to understand the reasoning in the abstract explanation in order to understand the concrete examples. Since Fiorentina is defined to have six different toppings, the justifications for Fiorentina ⊑ InterestingPizza are all combinations of three toppings, variations on the reasons why these toppings are pairwise disjoint, and variations of the reasons why Fiorentina is a Pizza.

This adds another goal to our investigation of justificatory structure: to determine a set of equivalence relations that allow us to identify such structural similarities, and to investigate how prevalent these similarities are in OWL ontologies—or whether the Pizza ontology is just a unique case. The presence of structural similarity provides us with a starting point for lifting justifications from their material form to an abstract explanation template, which could drastically reduce the cost of understanding the logical structure of each individual justification. This, in turn, reduces the overall effort a user has to apply in order to debug an entailment with multiple structurally similar justifications.

4http://owl.cs.manchester.ac.uk/tutorials/protegeowltutorial/
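The kind of structural similarity seen in the Pizza justifications can be tested for mechanically: two justifications are similar in this sense if one can be renamed onto the other. The sketch below is a rough illustration of that idea only; the nested-tuple encoding of axioms is an assumption made for the example, the renaming here maps entity names injectively without distinguishing class and property names, and the actual isomorphism relations used in this thesis (Chapter 5) are defined and implemented rather more carefully.

def match(t1, t2, renaming):
    # Try to extend 'renaming' so that renaming t1 yields exactly t2.
    if isinstance(t1, str) != isinstance(t2, str):
        return None
    if isinstance(t1, str):
        if t1 in renaming:
            return renaming if renaming[t1] == t2 else None
        if t2 in renaming.values():
            return None                       # keep the renaming injective
        return {**renaming, t1: t2}
    if t1[0] != t2[0] or len(t1) != len(t2):  # constructors must agree
        return None
    for s1, s2 in zip(t1[1:], t2[1:]):
        renaming = match(s1, s2, renaming)
        if renaming is None:
            return None
    return renaming

def isomorphic(j1, j2, renaming=None):
    # Is there an injective renaming of entity names mapping j1 onto j2?
    renaming = {} if renaming is None else renaming
    if len(j1) != len(j2):
        return False
    if not j1:
        return True
    for i, candidate in enumerate(j2):
        extended = match(j1[0], candidate, renaming)
        if extended is not None and isomorphic(j1[1:], j2[:i] + j2[i + 1:], extended):
            return True
    return False

# Two Pizza-style justifications, differing only in the toppings involved.
j1 = [("subclass", "Fi", ("some", "hasTop", "Tomato")),
      ("subclass", "Fi", ("some", "hasTop", "Olive")),
      ("subclass", "Olive", ("not", "Tomato"))]
j2 = [("subclass", "Fi", ("some", "hasTop", "Garlic")),
      ("subclass", "Fi", ("some", "hasTop", "Spinach")),
      ("subclass", "Garlic", ("not", "Spinach"))]
print(isomorphic(j1, j2))   # True -- both instantiate the same abstract template

Grouping a set of justifications by such a relation is what yields the abstract templates discussed above, so that only one representative per group needs to be understood in detail.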

1.2.3 Beyond debugging

While we put strong emphasis on the debugging aspect of justifications, we may also consider their suitability for other purposes in the ontology development process. In the same way that explanation in expert systems serves multiple purposes, we can use justifications in OWL ontologies for tasks beyond debugging, such as ontology comprehension and exploration, ontology learning, and generating ontology metrics.

Ontology comprehension [Kee07, BSP09] describes the process of a person understanding what knowledge is modelled in an ontology and how it is modelled; this is of particular relevance to users attempting to use or integrate an existing ontology they are not familiar with. We believe that examining a set of entailments and their justifications can help users gain insights into the relations between the pieces of information contained in the ontology, thus enhancing their understanding of the ontology. Going back to the Pizza example above, we can see that by inspecting only a small number of justifications and their abstract template, we already have some understanding of the basic reasoning behind all classes in the ontology which are entailed to be an InterestingPizza.

Another potentially useful application of justifications is ontology metrics: ontology editors commonly display a set of metrics of an ontology, such as the number and types of axioms, the number of entailments, and the logical expressivity based on the constructors used in the ontology. These ontology metrics allow users to infer information about the size and complexity of the ontology, its richness of modelling (that is, the level of detail used to describe a class), and even its general quality [TAM+05, AB06]. Justification based metrics, such as the number of justifications per entailment, the relations between them, and their structural similarities, add another layer to the existing suite of ontology metrics. These metrics can be used to compare the quality, uniformity, and ‘interestingness’ of ontologies based on their modelling. For example, we may consider an ontology which contains multiple justifications for its entailments to be more ‘richly’ modelled than an ontology which only contains single justifications.
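The justification based metrics mentioned here are cheap to derive once the justifications themselves are available. The fragment below, using made-up justification sets and axiom labels, computes two of them: the number of justifications per entailment and the frequency with which each axiom occurs across all justifications.

from collections import Counter
from statistics import mean

# Hypothetical map from entailments to their justifications (sets of axiom labels).
justifications = {
    "Cat ⊑ Animal": [{"ax1", "ax2"}, {"ax3", "ax4"}],
    "SickCat ⊑ ⊥":  [{"ax1", "ax2", "ax5", "ax6", "ax7"}],
}

# Number of justifications per entailment, and the mean across the ontology.
per_entailment = {e: len(js) for e, js in justifications.items()}
print(per_entailment, mean(per_entailment.values()))

# Axiom frequency: in how many justifications does each axiom occur?
frequency = Counter(ax for js in justifications.values() for j in js for ax in j)
print(frequency.most_common(3))   # e.g. ax1 and ax2 each occur in two justifications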

1.3 Research objectives

In summary, the main goals of this thesis are to explore the notion of ‘justificatory structure’ of OWL ontologies, to investigate the landscape of justificatory structure in ontologies used in practice, and to suggest approaches to using the justificatory structure in order to reduce user effort in the ontology debugging process.

First, we aim to gain a clearer understanding of the landscape of justifications in OWL ontologies. Our goal is to determine how prevalent multiple justifications are in OWL ontologies and what shapes (size, axiom types) these justifications commonly assume. We want to identify a set of relations between justifications which go beyond root and derived unsatisfiable classes, taking into account both shared axioms and shared axiom sets between justifications.

This includes structural similarities between justifications: we want to find a (reasonable) set of equivalence relations that allow us to capture similarities between justifications and group them based on their abstract explanation templates. Taking these relations, we want to know whether they commonly occur in OWL ontologies found ‘in the wild’ and how prevalent they are in order to see how effectively they can be used to summarise multiple justifications.

Second, we look at justificatory structure in order to address the problem of coping with multiple justifications in a debugging and repair scenario. We first need to pin down the basic notion of effort required to solve a debugging task involving multiple justifications before we can identify ways of exploiting the justificatory structure in order to reduce the effort ontology users have to spend when debugging one or multiple entailments.

1.4 Contributions

• We provide a set of design decisions for representing finite sets of entailments of OWL ontologies and discuss the benefits and drawbacks of the various options. This has practical implications for the display of selected entailments in OWL editors and for the use of entailments as a measure of the information content in ontologies. It also informs the generation and analysis of justifications for finite entailment sets in the context of this thesis.

• We introduce the notion of the justificatory structure of OWL ontologies which allows us to characterise structural relations between justifications for both single and multiple entailments. This includes the characterisation of various aspects of justificatory structure, such as justification properties, axiom properties, and axiom overlap between justifications.

• We introduce two relations between justifications, subexpression-isomorphism and lemma-isomorphism, which group justifications into sets of structurally similar ones. This grouping then allows us to represent a set of similar justifications by an abstract justification template, thus reducing the effort involved in understanding multiple justifications to the effort required to understand the abstract template.

• We pin down the notion of a debugging problem and present a model of measuring effort for successfully solving debugging problems using justifications. We then propose strategies to exploit the justificatory structure of a set of justifications in order to reduce user effort when dealing with multiple justifications.

• Finally, we present the results of a survey of the justificatory structure of OWL ontologies from the NCBO BioPortal. We find that a large number of OWL ontologies in the test corpus exhibit a very rich justificatory structure, with frequently occurring multiple justifications, high extents of overlaps, and high-frequency axioms. Using the newly introduced equivalence relations, we find that the logical diversity of OWL justifications is significantly lower than their numbers may indicate, with justification sets being reduced to only 10% of their original size.

1.5 Thesis structure

In this section we give a brief overview of the structure and content of this thesis. We also list the corresponding publications in peer-reviewed workshops, conferences, and journals, where applicable.

Chapter 2, ‘Background and related work’, introduces the basic notions of description logic ontologies and the Web Ontology Language OWL. It categorises the different types of errors users face in OWL ontologies and introduces the notion of justifications as a debugging tool for OWL ontologies. It then presents a detailed overview of the landscape of research into justifications, focussing on the issues of justification computation and justification understanding.

In Chapter 3, ‘Defining finite entailment sets’, we will discuss the issue of choosing and representing a finite entailment set of an OWL ontology, which lays the foundations for the investigation of justifications in the following chapters. However, the chapter may also be considered as a self-standing contribution which may be of relevance to future implementations in OWL development tools. Parts of Chapter 3 have been published in:

[BPS11] Samantha Bail, Bijan Parsia, and Ulrike Sattler. Extracting finite sets of entailments from OWL ontologies. In Proceedings of the 24th International Workshop on Description Logics (DL-11), 2011.

Chapter 4, ‘The justificatory structure of OWL ontologies’, constitutes one of the three core chapters in this thesis. We introduce the notion of justificatory structure and define several aspects of structure using a graph-based representation of the justifications and axioms occurring in justifications for a given set of entailments. The work presented in Chapter 4 is based on previous publications:

[BPS10b] Samantha Bail, Bijan Parsia, and Ulrike Sattler. The justificatory structure of OWL ontologies. In Proceedings of the 7th International Workshop on OWL: Experiences and Directions (OWLED-10), 2010.

[BHPS11] Samantha Bail, Matthew Horridge, Bijan Parsia, and Ulrike Sattler. The justificatory structure of the NCBO Bioportal ontologies. In Proceedings of the 10th International Semantic Web Conference (ISWC-11), 2011.

Chapter 5, ‘Justification isomorphism’, continues the investigation of justificatory structure by introducing another relation, namely the structural similarity between justifications. We first give several examples of structurally similar justifications which motivate different notions of justification isomorphism, discuss the importance of this similarity relation being an equivalence relation, and present an algorithm for determining whether two justifications are isomorphic. The work presented in Chapter 5 is based on the following publications:

[BPS12a] Samantha Bail, Bijan Parsia, and Ulrike Sattler. Diversity of reason: Equivalence relations over description logic explanations. In Proceedings of the 25th International Workshop on Description Logics (DL-12), 2012.

[BPS12b] Samantha Bail, Bijan Parsia, and Ulrike Sattler. Declutter Your Justifications: Determining Similarity Between OWL Explanations. In Proceedings of the First International Workshop on Debugging Ontologies and Ontology Mappings (WoDOOM-12), 2012.

In Chapter 6, ‘Coping strategies’, we discuss how the different aspects of justificatory structure that were introduced in Chapters 4 and 5 can be used to reduce effort for users attempting to understand or repair multiple justifications. We define the notion of a debugging problem and introduce an alleviation factor which can be used to measure the reduction in effort caused by a debugging strategy. Some of the work in Chapter 6 has been motivated by our previous research into how OWL users read and understand justifications, and what strategies they use to help them cope with difficult justifications:

[HBPS11a] Matthew Horridge, Samantha Bail, Bijan Parsia, and Ulrike Sattler. The cognitive complexity of OWL justifications. In Proceedings of the 24th International Workshop on Description Logics (DL-11), 2011.

[HBPS11b] Matthew Horridge, Samantha Bail, Bijan Parsia, and Ulrike Sattler. The cognitive complexity of OWL justifications. In Proceedings of the 10th International Semantic Web Conference (ISWC-11), 2011.

[HBPS13] Matthew Horridge, Samantha Bail, Bijan Parsia, and Ulrike Sattler. Toward cognitive support for OWL justifications. Accepted for publication in: Knowledge-Based Systems, 2013.

Chapter 7, ‘A survey of justificatory structure’, presents an evaluation of the measures introduced in the previous chapters by investigating the properties of a large corpus of justifications extracted from OWL ontologies from the biomedical domain. Our findings confirm that OWL ontologies used in practice often contain justifications with highly interesting structural relations.

Finally, Chapter 8 concludes the thesis by summarising the main contributions and discussing the relevance of the findings presented here. We also lay out the directions for future research, which includes several open questions as well as possible applications.

Chapter 2

Background and related work

Knowledge representation (KR) is an area of artificial intelligence research which deals with representing knowledge using symbols, thus allowing the application of automated reasoning techniques to infer new knowledge from given knowledge [BL04]. KR forms the basis of knowledge-based systems, ‘intelligent’ systems which make use of a knowledge base (KB) that contains told facts about some domain, as well as procedures to reason over these facts and infer implicit knowledge. While there exists a wide range of approaches to implementing knowledge-based systems, such as logic programming [BG94], frames [Min74], and semantic networks [Sow87], in this thesis we are dealing with description logics (DLs) [BCM+03] as the underlying knowledge representation formalism for knowledge bases. DLs are a family of logics based on a guarded fragment of first-order logic which is more expressive than propositional logic, while still being decidable [BCM+03].

This chapter introduces the basic concepts of description logics, which underpin the Web Ontology Language OWL [HPSv03, CHM+08]. It outlines the syntax and semantics of description logics and fixes relevant notions such as axioms and entailments of an OWL ontology. It also discusses the landscape of logical and non-logical errors occurring in OWL ontologies, which motivates the need for tailored debugging support. It then introduces justifications as an explanation service for entailments of OWL ontologies, and reviews the literature dealing with justification based explanation. This covers approaches to computing single and multiple justifications, as well as the issues of understanding justifications, justification based repair of errors, and coping with multiple justifications, which is the main focus of this thesis.


2.1 Description logic knowledge bases

In the first part of this chapter, we will give an introduction to the basic concepts of description logics, such as the description logic syntax and semantics, and the standard reasoning services used with description logic knowledge bases. We will then discuss the relationship between description logics and the Web Ontology Language OWL, and give a brief overview of the different applications of OWL.

2.1.1 DL syntax and semantics

DL syntax

The main building blocks of DLs are atomic concepts, roles, and individuals. With respect to their relationship with first-order logic (FOL), concepts correspond to unary predicates in FOL, roles to binary predicates, and individuals to constants. These entities are used to create more expressive concept and role expressions with the help of constructors, whereby the available constructors depend on the expressivity of the respective DL. As a convention, we will be using the upper-case letters A and B for atomic concepts and C, D, ... for possibly complex concepts; the lower-case letters r, s, ... for role names, and the lower-case letters a, b, ... for individuals. The basic description logic ALC (‘Attribute Logic with Complement’) [SS91], which many other more expressive DLs build upon, allows concept expressions as defined by the following grammar:

C, D ::= ⊤ | ⊥ | A | ¬C | C ⊓ D | C ⊔ D | ∃r.C | ∀r.C

Given a concept expression C, the modal depth of the expression is the maximum nesting depth of role restrictions (∃ and ∀) in C. The length of an expression is the number of occurrences of concept, role, and individual names, as well as constructors. For example, the expression A ⊓ ∀r.(∃s.(B ⊓ C)) has a modal depth of two and a length of nine.
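Assuming a simple nested-tuple encoding of concept expressions (an illustration only, not a notation used in this thesis), both measures can be computed by a straightforward recursion; the example reproduces the expression A ⊓ ∀r.(∃s.(B ⊓ C)) from above.

# Concepts as nested tuples: ("atom", A), ("not", C), ("and", C, D),
# ("or", C, D), ("exists", r, C), ("forall", r, C).

def modal_depth(c):
    # Nesting depth of the role restrictions; the worked example has depth two.
    op = c[0]
    if op == "atom":
        return 0
    if op in ("exists", "forall"):
        return 1 + modal_depth(c[2])
    return max(modal_depth(arg) for arg in c[1:])       # not / and / or

def length(c):
    # Occurrences of concept names, role names and constructors.
    op = c[0]
    if op == "atom":
        return 1
    if op == "not":
        return 1 + length(c[1])
    if op in ("exists", "forall"):
        return 2 + length(c[2])                         # constructor plus its role name
    return 1 + length(c[1]) + length(c[2])              # and / or

expr = ("and", ("atom", "A"),
        ("forall", "r", ("exists", "s", ("and", ("atom", "B"), ("atom", "C")))))
print(modal_depth(expr), length(expr))                  # 2 9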

Axioms

A description logic knowledge base K is generally regarded as a finite set of axioms which are asserted in the KB. Axioms are sentences that make statements about the domain knowledge modelled in the KB. The axioms in a KB are classified into the sets of TBox, RBox and ABox axioms which are denoted as T, R, and A, respectively: K = ⟨T, R, A⟩. The signature sig(K) of a knowledge base K is the set of all concept, role, and individual names occurring in K.

A TBox axiom α is either a subsumption (C ⊑ D) or equivalence (C ≡ D) between two (possibly complex) concepts C and D in a knowledge base. A subsumption axiom expresses that C is a sub-concept of D; that is, every instance of C is also an instance of D. We can say that C ‘is-a’ D. Equivalence axioms state that two concepts C and D are equivalent, which corresponds to (and is a shorter notation for) a bi-directional subsumption C ⊑ D and D ⊑ C.

Subsumption and equivalence are also possible between roles, which are described by RBox axioms:1 r ⊑ s specifies that r is a sub-role of s, which means that every two individuals that have an r-relationship also have an s-relationship between them. r ≡ s expresses that the two roles are equivalent. Further, we can specify axioms containing role chains r ◦ s ⊑ t, which states that if an individual a has an r-successor b, and b has an s-successor c, then it also holds that a has a t-relationship with c.

1Note that the use of RBox axioms as described here already goes beyond the basic description logic ALC.

ABox axioms make statements about the relations between individuals and concepts, and between individuals and roles: C(a) expresses that the individual a is an instance of the concept C, and r(a, b) specifies that there exists an r-relationship between the individuals a and b.

As an example, we use a small knowledge base K = ⟨T, R, A⟩ (with an empty RBox R) that describes the eating habits of animals on an abstract level:

Example 2.1.

T = {Cat ⊑ Carnivore, Carnivore ⊑ Animal ⊓ ∀eats.Animal, Plant ⊑ ¬Animal, Grass ⊑ Plant,
     PetOwner ≡ Human ⊓ ∃hasPet.Animal, ∃eats.⊤ ⊑ Animal}
A = {Cat(Molly), Human(Alice), hasPet(Alice, Molly)}

The TBox of this example KB consists of axioms that make statements about the concepts in the domain: cats are carnivores, a carnivore is an animal which only eats animals, plants are disjoint with animals (i.e. no one thing can be a plant and an animal at the same time), grass is a plant, a pet owner is a human who has an animal as a pet, and everything that eats something is an animal. The ABox contains statements about individuals (Molly is a Cat, Alice is a Human), and the relationship between them (Molly is Alice’s pet).

A TBox is called acyclic if it does not contain any axioms or chains of axioms where an entity occurs in both the right- and the left-hand side of a subsumption or equivalence. That is, an axiom of the type C ⊑ ∃r.C would cause a TBox to be cyclic. Axioms containing only atomic concepts on the left-hand side are called definitions, and an acyclic TBox containing only definitions where all concepts on the left-hand side have unique names is called a definitorial TBox. General TBoxes may contain general concept inclusion (GCI) axioms, which allow complex concept expressions on both the right- and the left-hand side, such as ∃eats.⊤ ⊑ Animal.
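For a TBox in which every axiom has an atomic concept on its left-hand side, acyclicity amounts to the absence of a cycle in the graph that links each defined name to the concept names used on its right-hand side. The sketch below assumes exactly that simplified setting, with the right-hand side signatures given directly as sets of names; it is an illustration, not a general procedure for arbitrary TBoxes with GCIs.

def is_acyclic(tbox):
    # tbox: dict mapping a concept name to the set of concept names occurring
    # on the right-hand side of its definition(s).
    visiting, done = set(), set()

    def depends_on_itself(name):
        if name in done:
            return False
        if name in visiting:
            return True                      # the name is reached from itself
        visiting.add(name)
        cyclic = any(depends_on_itself(dep) for dep in tbox.get(name, set()))
        visiting.discard(name)
        done.add(name)
        return cyclic

    return not any(depends_on_itself(name) for name in tbox)

acyclic_tbox = {"Cat": {"Carnivore"}, "Carnivore": {"Animal"}}
cyclic_tbox = {"C": {"C"}}                   # e.g. from an axiom like C ⊑ ∃r.C
print(is_acyclic(acyclic_tbox), is_acyclic(cyclic_tbox))    # True False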

Naming conventions

The name of a description logic is generally comprised of mnemonics representing the available constructors and axiom types in the respective logic: the letters N and Q stand for unqualified (≥ n r, ≤ n r) and qualified number restrictions (≥ n r.C, ≤ n r.C), respectively (for n ∈ N), F represents the functionality of roles (≤ 1 r), H stands for role hierarchies (r ⊑ s), I is the role inverse r⁻, and R stands for complex role inclusions of the type r ◦ s ⊑ r. The letter O denotes the presence of nominals, which allows the use of individuals in the place of concepts in TBox axioms: C ⊑ {a}.

Some notable description logics, besides the aforementioned ALC, include S [HST99] which corresponds to ALC plus role transitivity, the less expressive EL [Bra04] which allows existential quantifiers and intersection, EL++ [BBL05] which corresponds to EL plus complex role inclusions and nominals, and the highly expressive logics SHOIN(D) [HPSv03] and SROIQ(D) [HKS06] which underpin OWL and OWL 2, respectively. In the context of OWL, the suffix (D) indicates the use of XML Schema2 datatypes.

2http://www.w3.org/TR/xmlschema11-2/

Model-theoretic semantics

The semantics of description logics is model-theoretic and given by interpretations. An interpretation I is a tuple ⟨Δ^I, ·^I⟩, where Δ^I is the interpretation domain, that is, a non-empty set of elements, and ·^I the interpretation function. The interpretation function maps concept names A in the knowledge base to sets A^I ⊆ Δ^I, role names to sets r^I ⊆ Δ^I × Δ^I, and individuals to elements in Δ^I. We call the set C^I the extension of the concept C in I. The interpretation function for concepts, roles, and individuals in ALC is defined in Table 2.1.

Table 2.1: ALC constructors and semantics.

Constructor                          Syntax    Semantics
Top concept                          ⊤         Δ^I
Bottom concept                       ⊥         ∅
Concept negation                     ¬C        Δ^I \ C^I
Concept intersection (conjunction)   C ⊓ D     C^I ∩ D^I
Concept union (disjunction)          C ⊔ D     C^I ∪ D^I
Existential restriction              ∃r.C      {x | ∃y. ⟨x, y⟩ ∈ r^I ∧ y ∈ C^I}
Universal restriction                ∀r.C      {x | ∀y. ⟨x, y⟩ ∈ r^I → y ∈ C^I}

An interpretation I is a model for an axiom α if it satisfies α; we write I |= α if C^I ⊆ D^I in I for a subsumption α = C ⊑ D, and if C^I = D^I in I for an equivalence axiom α = C ≡ D. A model of a knowledge base K is an interpretation I in which all axioms in K are satisfied: I |= K. A tautology is an axiom which is always satisfied; for example, the axiom A ⊑ A is a tautology, as A^I ⊆ A^I holds in all models I of K.
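The semantics in Table 2.1 translates directly into a small evaluator over one finite interpretation. The concepts-as-tuples encoding and the concrete interpretation below are assumptions made for the example, and evaluating a single interpretation is of course far weaker than reasoning over all models of a knowledge base.

# One finite interpretation: a domain plus concept and role extensions.
DOMAIN = {"molly", "tweety", "daisy"}
CONCEPTS = {"Cat": {"molly"}, "Bird": {"tweety"}, "Plant": {"daisy"}}
ROLES = {"eats": {("molly", "tweety")}}

def extension(c):
    # Evaluates the constructor semantics of Table 2.1 in the interpretation above.
    op = c[0]
    if op == "top":
        return set(DOMAIN)
    if op == "bottom":
        return set()
    if op == "atom":
        return set(CONCEPTS.get(c[1], set()))
    if op == "not":
        return DOMAIN - extension(c[1])
    if op == "and":
        return extension(c[1]) & extension(c[2])
    if op == "or":
        return extension(c[1]) | extension(c[2])
    successors = lambda x: {y for (a, y) in ROLES.get(c[1], set()) if a == x}
    filler = extension(c[2])
    if op == "exists":
        return {x for x in DOMAIN if successors(x) & filler}
    if op == "forall":
        return {x for x in DOMAIN if successors(x) <= filler}

print(extension(("exists", "eats", ("atom", "Bird"))))   # {'molly'}
print(extension(("forall", "eats", ("atom", "Bird"))))   # all three elements: tweety and
                                                         # daisy have no eats-successors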

2.1.2 Standard reasoning services

The ability to convey information without having to explicitly state it is one of the main advantages of logic-based knowledge bases. A description logic reasoner is a piece of software which implements a decision procedure for the standard reasoning problems:

Consistency  Given a knowledge base K, determine whether there exists an interpretation I for K such that I |= K. If there exists such an interpretation, return ‘true’ (the KB is consistent), otherwise return ‘false’ (the KB is inconsistent). All other reasoning services can be reduced to consistency checking in logics that support conjunction and negation.

Satisfiability  A concept C is satisfiable if there is some model of K in which C is not empty: C^I ≠ ∅ in some I that is a model of K. C is unsatisfiable, denoted as K |= C ⊑ ⊥, if it is mapped to the empty set in all models I of the knowledge base.

Subsumption  One of the main reasoning tasks in DLs is subsumption between concept expressions, i.e. checking whether the extension of a (potentially complex) concept C in K is a subset of the extension of another concept D (C^I ⊆ D^I). The subsumption between two concepts where C is necessarily interpreted as a subset of D is denoted as K |= C ⊑ D.

Equivalence  Two concepts C and D are equivalent in the KB (C ≡ D) if it holds that all elements of C^I are in the set D^I and vice versa, i.e. if C^I = D^I, in all models I of K. Equivalence is a special case of subsumption.

Instantiation  Instantiation (or instance checking) checks which individuals are instances of a particular concept: given an individual a, a concept C, and a KB K, return ‘true’ if a^I ∈ C^I in all models I of K.

There exist a number of sound, complete, and terminating algorithms for providing the above reasoning services, which are implemented by description logic reasoners.

Tableaux [DL96, Hor97, HST00, BS00] procedures attempt to build a finite tree-like model of the concepts in the knowledge base using tableau rules. The input is first translated into negation normal form (NNF) and then decomposed according to the respective tableau rules. The nodes and edges in the tree are labelled with the decomposed concepts and roles, respectively. A clash in the tree occurs when the algorithm attempts to label a node in the tree with contradictory concepts, such as an atomic concept A and its negation ¬A. When no more rules are applicable to the concept fragments, or when a clash occurs, the algorithm terminates. If some clash-free tree (i.e. some model) can be found for a concept C, the concept is satisfiable. Likewise, if no clash-free tree can be found, the concept has no models, i.e. it is unsatisfiable. The types and usage of constructors in a description logic affect the types of rules used in a tableau algorithm, which in turn can increase the computational complexity of the reasoning algorithm. For instance, as some of the tableau rules are non-deterministic (i.e. involve choice), the algorithm has to perform backtracking if such a rule has been applied.
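The core of the procedure (translate to NNF, decompose, branch on disjunctions, stop on a clash) can be sketched for the role-free fragment of ALC. The sketch below is an illustration under that restriction only: the ∃- and ∀-rules, TBox axioms and the blocking techniques a real reasoner needs are all omitted, and the tuple encoding of concepts is assumed for the example.

def nnf(c):
    # Push negation inwards so that it only occurs in front of concept names.
    op = c[0]
    if op == "atom":
        return c
    if op in ("and", "or"):
        return (op, nnf(c[1]), nnf(c[2]))
    inner = c[1]                                   # op == "not"
    if inner[0] == "atom":
        return c
    if inner[0] == "not":
        return nnf(inner[1])
    flipped = "or" if inner[0] == "and" else "and"
    return (flipped, nnf(("not", inner[1])), nnf(("not", inner[2])))

def satisfiable(label):
    # 'label' is the set of NNF concepts attached to a single tableau node.
    atoms = {c[1] for c in label if c[0] == "atom"}
    negated = {c[1][1] for c in label if c[0] == "not"}
    if atoms & negated:
        return False                               # clash: A and ¬A on the node
    for c in label:
        if c[0] == "and" and not {c[1], c[2]} <= label:
            return satisfiable(label | {c[1], c[2]})          # ⊓-rule
    for c in label:
        if c[0] == "or" and c[1] not in label and c[2] not in label:
            return (satisfiable(label | {c[1]}) or            # ⊔-rule: branch and
                    satisfiable(label | {c[2]}))              # backtrack on failure
    return True                                    # complete and clash-free

A, B = ("atom", "A"), ("atom", "B")
print(satisfiable(frozenset({nnf(("and", A, ("not", A)))})))              # False
print(satisfiable(frozenset({nnf(("or", ("and", A, B), ("not", A)))})))   # True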

Entailments

An entailment η of a knowledge base K is an axiom that follows logically from K. K entails η (K |= η) if η is true in all models I of K. The entailment relation is monotonic in DLs, which means that entailments are preserved when further axioms are added to the KB. Every axiom which is asserted in K is an entailment of the knowledge base, while other entailments represent implicit knowledge in the KB. For example, the pet KB above explicitly asserts that Cat v Carnivore and Carnivore v Animal u ∀eats.Animal. From these statements we can conclude that every Cat is an Animal: K |= Cat v Animal. Entailments are not restricted to any specific axiom type; for example, the axioms Cat v Carnivore and Carnivore v Animal u ∀eats.Animal also entail the statement Cat v ∀eats.Animal. The set of all axioms which are valid in a description logic L and entailed by a KB K is called the deductive closure of K:

Definition 2.1 (Deductive closure). The deductive closure K∗L of a knowledge base K is the set of L-axioms entailed by K, i.e. K∗L = {α ∈ L | K |= α}. When clear from the context, the subscript L is dropped.

Even KBs based on weakly expressive DLs have infinitely many entailments (i.e. the deductive closure is infinite). The simplest example is the empty knowledge base K over a non-empty signature sig(K) = {A} in the logic ALC. The deductive closure of K is then the infinite set of all axioms that can be formed using A and the constructors available in ALC:

K∗ = {A v A, A u A v A, A t A v A, A v A u >,...}

The term ‘entailment set’ generally refers to the set of all entailments of a knowledge base K, which corresponds to the deductive closure of K and is potentially infinite. However, the focus when constructing and analysing a knowledge base often lies on the concept hierarchy of the KB, which is represented by its entailed atomic subsumption axioms; thus, the term ‘entailment set’ is also frequently used to denote a specific set of entailments of a knowledge base, such as its entailed atomic subsumptions. In the context of this thesis, we simply speak of an entailment set to denote a (potentially infinite) set of entailments of a knowledge base K, which may also be restricted by some criteria such as ‘relevance’ of the axioms to the user:

Definition 2.2 (Entailment set). An entailment set εK of a knowledge base K is a set of axioms {αi | 1 ≤ i ≤ n} ⊆ K∗.

An in-depth discussion of the issue of specifying useful finite entailment sets follows in Chapter 3.
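As an illustration of Definition 2.2, the following sketch computes one commonly used finite entailment set, the entailed atomic subsumptions over a given set of class names, using an entailment oracle. The oracle, the class names, and the toy hierarchy are illustrative assumptions, not part of any existing tool.

from itertools import permutations

def atomic_subsumption_entailment_set(class_names, entails):
    """Collect the entailed atomic subsumptions A v B (A != B) over a
    given set of class names, using an oracle entails(sub, sup) -> bool."""
    return {(a, b) for a, b in permutations(class_names, 2) if entails(a, b)}

# Toy oracle over a fixed class hierarchy, standing in for a reasoner.
toy_hierarchy = {("Cat", "Carnivore"), ("Carnivore", "Animal"), ("Cat", "Animal")}

def toy_entails(sub, sup):
    return (sub, sup) in toy_hierarchy

if __name__ == "__main__":
    print(atomic_subsumption_entailment_set(
        ["Cat", "Carnivore", "Animal"], toy_entails))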

Incoherence

A concept C is unsatisfiable in a knowledge base K if there is no model I of K in which CI is non-empty. This means that the concept cannot have any instances. Unsatisfiability is caused by contradictory statements, such as C v D u ¬D. A KB that contains some unsatisfiable named concept is called incoherent. Continuing with the previous example KB, we can induce incoherence in K′ = ⟨T′, A⟩ by adding the last TBox axiom, which causes the concept SickCat to be unsatisfiable:

Example 2.2.

T′ = {Cat v Carnivore, Carnivore v Animal u ∀eats.Animal,
      Plant v ¬Animal, Grass v Plant,
      PetOwner ≡ Human u ∃hasPet.Animal, ∃eats.> v Animal,
      SickCat ≡ Cat u ∃eats.Grass}
A = {Cat(Molly), Human(Alice), hasPet(Alice, Molly)}

The conflict is caused by the TBox axioms in K′ which entail that Cats are Carnivores, and thus eat only Animals, but that instances of SickCat eat Grass, a concept which is a subconcept of Plant and thus known to be disjoint with Animal. As, according to our knowledge base, there cannot exist a Carnivore who also eats Grass, the concept SickCat cannot have any elements in any model I of K′: SickCatI = ∅. Therefore, the concept SickCat is said to be unsatisfiable, and the knowledge base K′ is incoherent.

Inconsistency

While incoherence itself does not cause any problems for the reasoner (as the unsatisfiable concept simply corresponds to the empty set in all models), instantiation of an unsatisfiable concept causes the KB to be contradictory. If the axiom SickCat(Molly) is added to the KB in which SickCat is unsatisfiable, it is not possible for the KB to have any model in which this statement is satisfied. Such a knowledge base, which has no models, is called inconsistent, denoted as K |= > v ⊥. An inconsistent KB has the property that it entails every axiom that can be expressed in the respective logic L; since it is contradictory, it cannot possibly make any meaningful statements about the knowledge it models. Different approaches to reasoning with inconsistent knowledge bases have been explored, such as paraconsistent reasoning [MHL07], or the selection of a consistent subset based on a relevance function [HvHT05].

2.1.3 The Web Ontology Language OWL

From description logic to OWL

While ‘ontology’ is a term borrowed from philosophy, in computer science it describes a software artefact representing information about the entities in a domain, and the relationships between them [Gru93]. In the remainder of this thesis, we will use the term ontology (denoted by the letter O) to refer to a description logic knowledge base which is represented in some machine-processable format, such as OWL.

OWL is a successor of the web ontology language DAML+OIL [Hor02], a description logic based ontology language with an RDF/XML syntax which evolved from merging DAML-ONT (a language developed by the DARPA Agent Markup Language programme) and OIL (Ontology Inference Layer, developed by the European On-To-Knowledge project). The first version of OWL, which is based on the expressive description logic SHOIN(D) and was described as a ‘revision’ of DAML+OIL, became an official W3C recommendation in February 2004.3 OWL may also be regarded as a more expressive successor to RDF (Resource Description Framework) [W3C04b], a language to describe relationships between entities using subject-predicate-object style triples, and RDFS (RDF Schema) [W3C04a], an extension of the RDF vocabulary which introduces ‘meaningful’ predicates such as Class, subClassOf, and domain.

OWL 2, the successor of OWL, was made a W3C recommendation in 2009 [W3C09]. It comprises two species of different expressivities, namely OWL 2 DL and OWL 2 Full. The underlying formalism of OWL 2 DL is the description logic

3http://www.w3.org/TR/owl-features/

SROIQ(D) [HKS06]. This highly expressive DL adds a range of constructors and axiom types to those of ALC, such as complex role inclusions (represented by the letter R), nominals (O), inverse roles (I), qualified number restrictions (Q), and datatypes (D). While OWL 2 DL has the familiar description logic semantics (Direct Semantics) described above, OWL 2 Full [W3C12] has an RDF-based semantics, which is a superset of the OWL 2 Direct Semantics. There exist several syntaxes for OWL [Hor10], such as the human-oriented Manchester Syntax [HDG+06], various RDF formats, and an OWL/XML serialisation. The axiom Carnivore v Animal u ∀eats.Animal from the above example is written as follows in OWL Manchester Syntax (keywords are set in italics):

Class: Carnivore
    SubClassOf: Animal and eats only Animal

In addition to the different syntaxes, OWL also introduces corresponding terms for the entities used in ontologies: concepts are referred to as classes, roles become properties, the top concept > is referred to as Thing, and the bottom concept ⊥ as Nothing. For the remainder of this thesis, we will switch to using OWL terminology to refer to classes, properties, (sub-)expressions, and ontologies. Further, in the context of this thesis we will simply regard OWL 2 DL as a syntactic variant of SROIQ, using the name OWL in place of OWL 2 DL unless otherwise stated.

Usage of OWL

OWL was designed with the aim of providing an expressive machine-processable ontology language for use in various applications and domains. There exists a wide array of tools and libraries for creating and editing OWL ontologies, such as Protégé 4,4 TopBraid Composer,5 Swoop,6 and the OWL API,7 a Java library that gives access to a large number of ontology editing tasks and reasoner interfaces. In addition, there are a number of highly optimised OWL reasoners, such as Pellet [SPC+07], HermiT [SMH08], and FaCT++ [TH06], as well as reasoners which are tuned towards specific subsets of OWL, such as ELK [KKS11], a reasoner for EL++ ontologies. This broad set of OWL tools allows users to choose a suitable development environment for their respective application requirements, and improved tool support is regarded as one of the main reasons for the growing use of OWL ontologies as a knowledge representation mechanism [HPS08].

4http://protege.stanford.edu/
5http://www.topquadrant.com/products/TB_Composer.html
6https://code.google.com/p/swoop/
7http://owlapi.sourceforge.net/

OWL 2 profiles There exist three named ‘profiles’ for OWL 2, syntactic subsets of OWL 2 DL that are tailored towards different applications and trade expressivity of the language for efficient reasoning:

The OWL 2 EL profile is a tractable fragment of OWL 2 which is based on the description logic EL++ [BBL05]. OWL 2 EL omits some of the more expressive OWL 2 constructors in favour of efficient (polynomial time) reasoning, which makes it attractive for use in ontologies that ‘contain very large numbers of properties and/or classes’ [owl09].

OWL 2 QL (Query Language), which is based on the DL-Lite family of description logics [ACKZ09], has been defined for use in applications which focus on query answering over large amounts of instance data. Queries executed against OWL 2 QL ontologies can be rewritten into SQL queries, which allows instance data to be stored in a standard relational database, thus significantly improving query performance. OWL 2 QL is used for ontologies in Ontology-Based Data Access (OBDA) systems, which combine OWL 2 QL reasoning with efficient querying over relational databases [CGL+11, RMC12].

Reasoning systems for ontologies in the OWL 2 RL (Rule Language) profile can be implemented using rule-based reasoning engines. Similar to OWL 2 QL, the profile restricts the use of constructors to certain positions in axioms, e.g. universal restrictions and negation are only allowed on the RHS of axioms. While OWL 2 RL allows the use of a wider range of expressive constructors than the other two OWL 2 profiles, these positional restrictions still yield efficient reasoning performance.

Semantic Web The idea of a Semantic Web [BLHL01] marks a move away from a human-oriented ‘web of documents’ towards a machine-oriented ‘web of data’. By enriching web pages with semantic content we can express relationships between entities on the web, thus allowing understanding of those relationships by automated processes rather than direct human processing. This ‘meaningful’ interlinking of web content is thought to improve searching for and integrating information on the web.

As a ‘source of shared, precisely defined terms’ [Hor02], ontologies play a key role in the Semantic Web, with the majority of ontologies found on the web using OWL or RDF(S) [Car07]. Examples of prominent OWL ontologies on the web include the BBC Programmes and Wildlife ontologies,8 and the set of geographical ontologies created by Ordnance Survey.9 However, while OWL has been designed specifically as an ontology language for the Semantic Web, its perceived complexity and unpredictable reasoning performance still prohibit wider uptake by web application developers, for whom ‘a little semantics’ [Hen07] in the form of RDF may often seem sufficient.

Medical informatics and life sciences OWL ontologies are frequently used for the encoding of biological and medical knowledge as part of larger information systems. Examples of intensively used and maintained medical OWL ontologies include the SNOMED CT ontology,10 which contains medical terminology used in electronic health records, the OWL version of the International Classification of Diseases (ICD-10) catalogue11 released by the World Health Organization (WHO), and the National Cancer Institute (NCI) thesaurus,12 which covers knowledge related to cancer, such as anatomy, findings, drugs, and genes. There exists a vast array of biological ontologies of varying size and expressivity, such as the Gene Ontology (GO), which aims to ‘standardize the representation of gene and gene product attributes’, the Experimental Factor Ontology (EFO), which models variables in biological experiments, and various ontologies describing the anatomy of species. Finally, the NCBO BioPortal [NSW+09] is a prominent curated collection of biomedical ontologies created by a number of research groups, which has received some attention in recent years as a popular test corpus for OWL ontology research [BHPS11, HPS11, HPS12, MMIS12].

8http://www.bbc.co.uk/ontologies/
9http://www.ordnancesurvey.co.uk/oswebsite/ontology/
10http://www.ihtsdo.org/snomed-ct/
11http://www.who.int/classifications/icd/en/
12http://ncit.nci.nih.gov/

2.2 Errors in OWL ontologies

OWL is a highly expressive ontology language which offers users a wide range of constructors for modelling domain knowledge at a high level of precision.

The downside of this expressivity is that constructors can be misinterpreted by users, and combinations of otherwise correct axioms may lead to undesired side-effects. Beyond side-effects caused by human users, common engineering tasks such as translation of an ontology into OWL from some other formalism, automated generation of an ontology from some input source, or integration of an existing ontology, may also introduce errors into an ontology. We can distinguish between two different types of errors: logical errors, which require the use of a reasoner in order to be detected, and non-logical errors, which include modelling inconsistencies, reasoner performance issues, and annotation problems. In this section, we will give an overview of the different errors users may encounter in the OWL ontology development process, and describe how these errors are commonly detected and repaired.

2.2.1 Logical errors

Unsatisfiable and tautological classes

A common logical error is the unsatisfiability of a named class A in an ontology O, which implies that A cannot have any instances. While the unsatisfiability of a class may be intentional (e.g. for testing purposes as suggested by the Protégé 4 OWL tutorial [Hor11b], to explicitly prohibit a certain definition for a class, or to introduce a new name for ⊥), it mostly indicates a modelling error caused by contradictory statements. Unsatisfiable classes are easily detected by a reasoner and are commonly highlighted as errors in OWL editing tools, for example by rendering them in red, or by arranging them as subclasses of ⊥ (Nothing). Likewise, we call a class B in an ontology O tautological if O entails that B ≡ >, that is, every class in the domain is entailed to be a subclass of B, and every individual is entailed to be an instance of B. As with unsatisfiable classes, a tautological class may be introduced on purpose to achieve that very effect. However, we know that tautological classes often occur accidentally, without ontology users noticing or understanding why something is entailed to be equivalent to > [HPS09]. This phenomenon may also cause unwanted side effects, such as the unsatisfiability of a class that is asserted to be disjoint with the tautological class, as in the Movie ontology example [HPS09] which we will describe in Section 2.3.3. While tautological classes can also be detected by a reasoner, they are generally not treated as errors in OWL tools.

Inconsistency

By instantiating an unsatisfiable class, i.e. specifying that an individual is an instance of an unsatisfiable class, an ontology becomes inconsistent, or contradictory. Inconsistency can also be caused by a TBox which contains axioms that prohibit any class from having a non-empty extension, for example an axiom of the type A t ¬A v B u ¬B. Inconsistency is a severe logical error as it causes the ontology to entail every axiom, which makes it impossible (under standard semantics) to infer meaningful information from the ontology; this renders the ontology useless. In order to prevent the inconsistency of an ontology, it is therefore highly desirable to repair unsatisfiable classes as soon as they arise in the development process.

Wrong entailments

Some logical errors, such as incoherence and inconsistency, can be detected using a reasoner, as their detection is simply reduced to entailment checking. Factual errors (wrong and unwanted entailments), however, while requiring a reasoner to elicit them, can only be spotted by a user with the appropriate domain knowledge. There are various causes of wrong entailments, such as ambiguities in the domain, the difficulty of modelling certain aspects and having to ‘compromise’, or simply statements which may be correct in one context, but incorrect if the ontology is used in some other context. Furthermore, wrong entailments can also stem from the incorrect use of OWL constructors. We have a good understanding of the common errors that users make when constructing OWL ontologies [RDH+04, KPSC06, RCVB09]; examples include confusing ‘and’ and ‘or’ and the incorrect use of subsumptions and equivalences. Given a catalogue of such antipatterns, an OWL tool can point out axioms that are potentially erroneous and provide further explanation of the semantics to a user. Kalyanpur et al. [KPSC06] use a set of common modelling errors in a heuristics-based repair tool in the Swoop ontology editor which suggests the removal or modification of axioms based on, amongst other aspects, the likelihood of them being incorrect.

Non-entailments

Missing entailments, or non-entailments, are frequently occurring errors in OWL ontologies which are particularly hard to detect. When constructing an ontology, developers expect certain entailments, such as an obvious subsumption between two classes, to follow from the axioms they add; if an expected (and desired) entailment does not follow from the ontology, this may be considered an error. Roussey et al. [RCVB09] found that ontology developers frequently add axioms to an ontology that either do not have the desired effect, or no effect at all. In these cases, ontology diff tools (e.g. [FMV10, GPS11b, KŠK11]) can help users spot whether modifications to an ontology had the desired effect; beyond the detection of ineffectual modifications, however, dealing with non-entailments is a non-trivial task. A structured approach to eliciting missing entailments and adding them to the ontology is ontology completion [BGSS07], which is based on methods from the area of Formal Concept Analysis (FCA) [GSW05]. An ontology completion tool presents users with a series of potential entailments, e.g. a subsumption between two classes, asking them to accept or reject the presented entailment. In the case of an accepted entailment, an axiom is added to the ontology to explicitly state the relationship; otherwise, the tool asks the user to produce a counter-example to ensure that the axiom is not entailed. If we look at non-entailments from a different angle, the absence of an expected negative entailment may also be considered an error. OWL is based on the Open World Assumption, which means that anything that is not explicitly forbidden is treated as a ‘don’t know’ rather than a ‘no’. For instance, in the above example ontology we did not explicitly state that Human and Cat are disjoint classes; that is, at a later point, we could add the statement Human(Molly) to the existing Cat(Molly) without causing a logical error in the ontology. While this error then falls into the category of unwanted entailments, it may be useful to prevent such problems before they even occur. Similar to ontology completion, semantic clarification [Sch05a] aims to prevent unwanted entailments by strengthening the ontology with disjointness axioms where appropriate.

2.2.2 Non-logical errors

Structural irregularities

There exist a number of ontology design patterns [Gan05, AAKS08] which provide solutions for situations that are difficult to model in OWL, such as representing n-ary relations. But even without such formal patterns, ontologies often exhibit syntactic regularities due to habits or training of ontology engineers [MMIS12]. Such regularities include ‘good practice’ strategies such as adding domain and range axioms for every newly introduced object property. When a specific design pattern or design style is chosen for an ontology, it should be adhered to across the ontology in order to maintain a uniform modelling style. Therefore, deviating from a given structure and existing regularities may be considered an error in the ontology. Ontology pattern inspection tools, such as the RIO Protégé 4 plugin [MIS12], can assist ontology developers in detecting regularities and deviations in OWL ontologies.

Annotation errors

Annotations in OWL ontologies are frequently used to convey information about an entity and its relations beyond the logical content of the ontology. One such example is the NCI Thesaurus, in which, on average across a set of 86 successive versions of the ontology, over 84% of the axioms are annotations [GPS11a]. It is possible for an annotation to be erroneous, for example by containing false or missing information about the entity it annotates. Other than manual inspection (or specifically tailored text processing of the annotation strings), there is no way to detect an incorrect annotation.

Performance problems

The performance of reasoners for tasks such as classification and query answering over OWL ontologies is one of the main focus points of OWL research. As with ‘regular’ software, the general user acceptance of OWL and its usefulness in a production environment depend, amongst other factors, on its efficiency. However, OWL ontologies often turn out to be rather unstable in terms of their performance; even minor changes can, seemingly at random, turn classification times in the range of seconds into impractically long ones. Tools which detect performance hot spots [LS08, GPS12b] can assist ontology developers in identifying ontology subsets which cause a steep increase in classification times. Based upon the hot spot information provided by such a tool, the ontology developer can then remove or rewrite the hot spot axioms in order to improve the reasoner performance.

2.2.3 Debugging ontologies

In the context of OWL ontology engineering, the debugging stage involves the process of detecting an error in an ontology (e.g. an unsatisfiable class, or an incorrect entailment), finding the source of the error, i.e. the information stated in the ontology which causes the error, and finally, repairing the ontology by modifying or removing (some of) the problematic information. Losing as little correct information as possible in the repair stage is a key aspect of successful debugging.

In the case of logical errors, the repair process can be performed manually, using a reasoner as a tool to classify the ontology, then searching the class hierarchy for unwanted entailments (potentially aided by a suitable visualisation in an ontology editor, such as Protégé 4, which arranges all unsatisfiable classes under the class Nothing), tracing the source of the error by inspecting related statements in the ontology, and then modifying those statements that are considered to be incorrect.

Due to the potential size and complexity of OWL ontologies, this manual approach quickly leads to a tedious process of searching through the entire ontology. Moreover, it may lead to non-optimal repairs, where more information than required is removed or modified for the purpose of fixing the entailment. Anecdotes of ontology developers ‘ripping out’ parts of the ontology in order to remove an entailment show that this approach is far from ideal, and indicate a clear need for additional debugging support.

One of the first approaches to explaining subsumption in description logic ontologies was introduced by McGuinness [McG96] and Borgida [MB95], using proof-style explanations for entailments of the CLASSIC knowledge representation system [PSMB+91]. This proof-based approach was later extended by Borgida [BFH99] to ontologies in the description logic ALC. One of the earliest implementations of debugging facilities in an OWL ontology editor is the explanation component MEX in the OntoTrack editor [LN04, LvHN05]. MEX presents natural language proof-style explanations which are based on a tableaux trace as proposed by Borgida et al. [BFH99], later extended from ALC to cover nearly the complete OWL Lite sub-language of OWL 1 [LvHN05]. In line with these proof-based approaches, Deng et al. proposed a resolution based explanation framework [DHS05] which was restricted to unsatisfiability and inconsistency of ALC ontologies.

However, it was not until the introduction of MUPS (Minimal Unsatisfiability Preserving Sub-TBoxes) [SC03], later named justifications, that explanation support for description logic ontologies became a key aspect of OWL ontology research, spawning a substantial body of work and several ontology debugging tools.

2.3 Justifications for entailments of ontologies

Justifications are the most prominent form of debugging support for entailments of OWL ontologies. They are based on work by Schlobach and Cornet [SC03, SHCvH07], who developed a strategy for pinpointing the causes of unsatisfiable classes in the medical ontology DICE. A justification J for an entailment η of an ontology O is a minimal subset of O which is sufficient for η to hold:

Definition 2.3 (Justification). J is a justification for O |= η if J ⊆ O, J |= η and, for all J′ ⊂ J, it holds that J′ ⊭ η.
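Definition 2.3 can be checked directly: test whether J is a subset of O, whether J entails η, and whether removing any single axiom breaks the entailment (which, by monotonicity, suffices to establish minimality). The sketch below assumes an abstract entails(axioms, eta) oracle and is meant to illustrate the definition rather than how justifications are computed in practice.

def is_justification(j, eta, ontology, entails):
    """Check Definition 2.3: J is a justification for O |= eta iff
    J is a subset of O, J entails eta, and no proper subset of J does.
    By monotonicity it suffices to test the subsets J minus one axiom."""
    j = frozenset(j)
    if not j.issubset(ontology):
        return False
    if not entails(j, eta):
        return False
    return all(not entails(j - {axiom}, eta) for axiom in j)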

A justification (J, η) is defined with respect to a single entailment η; in order to describe the set of all justifications for a single entailment, or the justifications for all entailments in an entailment set εO, we introduce the notion of justification sets:

Definition 2.4 (Justification set). Given an ontology O and an entailment η, the justification set Justs(η) is the set of all justifications {J1 ... Jn}, Ji ⊆ O, for η. Likewise, the justification set Justs(εO) of an entailment set εO is the set of all justifications Ji for all ηk ∈ εO.

While Schlobach and Cornet focused on MUPS and Minimal Incoherence-Preserving Sub-TBoxes (MIPS, minimal entailing subsets for some cause of incoherence in O) in unfoldable ALC TBoxes, the concept of explanations based on minimal entailing subsets is applicable to arbitrary entailments. Kalyanpur et al. [KPSH05] extended the idea of MUPS to unsatisfiable classes in general OWL ontologies, presenting users with the sets of support (which were later named justifications, a term borrowed from belief revision [Gär88, Neb90]) for an entailment. Beyond its application to OWL ontology debugging, computing minimal unsatisfiable cores is also a technique for generating proofs in the SAT community [LMS04].

The minimality of J implies that removing any one of its axioms breaks the entailment η. For any entailment η of O, η itself is a justification if it is asserted in O, and there can be multiple justifications (exponentially many in the worst case [BPS07a]) for a single entailment. Recall our example from the animal domain, which entails that the class SickCat is unsatisfiable:

Example 2.3.

T′ = {Cat v Carnivore, Carnivore v Animal u ∀eats.Animal,
      Plant v ¬Animal, Grass v Plant,
      PetOwner ≡ Human u ∃hasPet.Animal, ∃eats.> v Animal,
      SickCat ≡ Cat u ∃eats.Grass}
A = {Cat(Molly), Human(Alice), hasPet(Alice, Molly)}

The following axiom set is a justification for K′ |= SickCat v ⊥:

J = {Cat v Carnivore, Carnivore v Animal u ∀eats.Animal,
     SickCat ≡ Cat u ∃eats.Grass, Grass v Plant,
     Plant v ¬Animal} |= SickCat v ⊥

In addition to limiting the number of axioms a user has to analyse when presented with a justification, OWL ontology editors generally order and indent the axioms in justifications as proposed by Kalyanpur [Kal06], which significantly increases readability. Figure 2.1 shows the above justification as displayed in Protégé 4 with ordered and indented axioms.

Figure 2.1: A screenshot of the Explanation tab in Protégé 4.

The concept of justifications for description logic ontologies has been widely studied over the past decade, with particular focus on developing efficient algorithms for the computation of all justifications for a given entailment, e.g. [Sch05b, Sun08, DQJ09]. There also exist a number of debugging approaches that are closely related to justifications, such as the computation of maximally satisfiable terminologies (MSSs) [MLBP06, MLP06], which is based on a tableau-like procedure, and the pinpointing formula approach proposed by Baader and Peñaloza [BPS07b, BP08b, JQH08, PS10]. Rather than computing the set of all justifications for an entailment directly, this approach produces a pinpointing formula, a compact representation of the set of all justifications [BPS07b, BP08b, PS10]. A pinpointing formula is a monotone Boolean formula in which each propositional variable represents an axiom in the ontology. The set of justifications for an entailment can then be derived from the pinpointing formula, as the justifications correspond to its minimal satisfying valuations. An example of a pinpointing formula is given in [BPS07b]:

Example 2.4.

(α1) A v ∃r.A

(α2) A v Y

(α3) ∃r.Y v B

(α4) Y v B

The four axioms labelled α1 to α4 entail A v B. We can find two justifications for this entailment: J1 = {α2, α4} and J2 = {α1, α2, α3}. If we treat the axiom labels as propositional variables, the two justifications correspond to the minimal valuations that make the pinpointing formula α2 ∧ (α4 ∨ (α1 ∧ α3)) true.
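The correspondence between a pinpointing formula and the justifications it encodes can be illustrated with a small sketch: the formula is represented as a predicate over sets of axiom labels, and the justifications are obtained as its minimal satisfying sets. The brute-force enumeration is for illustration only and would not scale beyond toy inputs.

from itertools import chain, combinations

def minimal_satisfying_sets(variables, formula):
    """Enumerate the subset-minimal sets of variables that make a
    monotone Boolean formula true; for a pinpointing formula these
    correspond to the justifications."""
    variables = list(variables)
    subsets = chain.from_iterable(
        combinations(variables, r) for r in range(len(variables) + 1))
    satisfying = [frozenset(s) for s in subsets if formula(frozenset(s))]
    return [s for s in satisfying if not any(t < s for t in satisfying)]

# The pinpointing formula from Example 2.4: a2 and (a4 or (a1 and a3)).
def example_formula(true_vars):
    return "a2" in true_vars and ("a4" in true_vars or
                                  {"a1", "a3"} <= true_vars)

if __name__ == "__main__":
    for j in minimal_satisfying_sets(["a1", "a2", "a3", "a4"], example_formula):
        print(sorted(j))   # prints ['a2', 'a4'] and ['a1', 'a2', 'a3']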

2.3.1 Justification based repair

While justifications offer ontology users a focused view on the subset(s) of the ontology which are relevant to an entailment, they do not provide much guidance as to which steps a user has to take in order to repair the erroneous entailment. Rather, the user is expected to analyse the axioms in the justification and identify a set of axioms R which can be either removed or modified in order to repair the entailment. A repair over the set of justifications for an entailment η is defined as follows [Hor11a]:

Definition 2.5 (Repair). Given O |= η, the set of axioms R is a repair for η in O if R ⊂ O, O \ R ⊭ η, and there is no R′ ⊂ R such that O \ R′ ⊭ η.

More informally, a repair is a hitting set R over the set of justifications for η in O, i.e. for each justification J for η in O there is at least one axiom in J which is contained in R. A repair corresponds to the notion of a diagnosis as used in the context of model based diagnosis, which we will discuss in more detail in Section 2.3.2. As there are multiple possible repairs for a given set of justifications, we may want to find a minimal repair, that is, a repair R such that there is no repair R′ with fewer axioms than R, as this implies removing or modifying the smallest possible number of axioms.

The quality of a repair does not only depend on the number of axioms to be removed or modified, but also on the amount of useful information in the ontology which is lost through such modifications. Thus, a suitable repair for an entailment may not be cardinality-minimal, but rather depend on the frequency (power, arity) and impact [KPSC06] of the axioms in the justifications. The impact [KPSC06] of an axiom in a justification is the number of entailments (from some clearly defined finite entailment set, such as the set of entailed atomic subsumptions) that are lost when removing the axiom from the ontology. Given a set of justifications for an entailment (or set of entailments), the frequency of an axiom is the number of justifications it occurs in, that is, the number of justifications that can be broken by removing the axiom.

Kalyanpur et al. [KPSC06] approached the problem of finding a suitable repair for unsatisfiable classes by introducing a ranking on the axioms in justifications. Axioms are ranked according to their frequency, their impact, manually specified test cases provided by a user (an extension of the default impact which ensures that specific entailments are preserved), provenance information about the axiom (author, source reliability, time added or modified), and the usage of the terms in the axiom signature across the ontology. The repair tool in the Swoop [KPS+06] editor allows users to specify the weights of the various ranking features, then recommends the preservation or removal of high- or low-ranked axioms, respectively. A screenshot of Swoop’s debugging panel is shown in Figure 2.2.

Figure 2.2: A screenshot of the Repair tool in Swoop.

While this kind of elaborate debugging support sounds promising, there have not been any in-depth user studies to confirm whether and to what extent ontology developers use the tool and how they benefit from it. A similar strategy for guiding users towards a suitable repair using an axiom ‘confidence value’ was proposed by Lam [Lam07]. The author identified a set of heuristics which users would find relevant when repairing unsatisfiable classes, such as class usage, axiom size, and structural properties of the classes occurring in the incorrect entailments, which help identify the expressions most likely to be faulty.
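The notions of repair and axiom frequency can be made concrete with a small brute-force sketch: repairs are the subset-minimal hitting sets over the justification set, and an axiom’s frequency is the number of justifications it occurs in. Impact is omitted, as it would additionally require an entailment oracle and a fixed entailment set; the enumeration below is exponential and purely illustrative.

from collections import Counter
from itertools import combinations

def minimal_repairs(justifications):
    """All subset-minimal hitting sets over a set of justifications
    (Definition 2.5), computed by brute force."""
    justs = [frozenset(j) for j in justifications]
    axioms = sorted(set().union(*justs))
    candidates = [frozenset(c)
                  for r in range(len(axioms) + 1)
                  for c in combinations(axioms, r)]
    hitting = [c for c in candidates if all(c & j for j in justs)]
    return [h for h in hitting if not any(g < h for g in hitting)]

def axiom_frequency(justifications):
    """Number of justifications each axiom occurs in."""
    return Counter(axiom for j in justifications for axiom in j)

if __name__ == "__main__":
    justs = [{"ax1", "ax2"}, {"ax2", "ax3"}, {"ax1", "ax3", "ax4"}]
    print(minimal_repairs(justs))
    print(axiom_frequency(justs))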

2.3.2 Computing justifications

There exist various approaches to directly computing one or all justifications for a given entailment, with ‘find all’ algorithms generally using a ‘find one’ algorithm as a subroutine. We categorise these algorithms into glass-box and black-box approaches [KPSH05]: glass-box techniques generally rely on information provided by a DL reasoner, which requires the modification of reasoner internals in order to utilise it for justification computation. By contrast, black-box techniques only use a reasoner as an oracle to perform entailment checks.

Glass-box techniques

Glass-box justification finding techniques generally rely on the modification of a description logic reasoner to keep track of the axioms required for an entailment to hold, such as those axioms causing a clash in a tableaux procedure. More precisely, as the problem of explaining arbitrary entailments, e.g. subsumption between classes, can be reduced to a consistency check, this technique can be applied to find justifications for both unsatisfiable classes and arbitrary entailments. The tracking strategy applied in glass-box approaches using tableaux reasoners is known as tracing [KPSH05], which is based on an algorithm first proposed by Baader and Hollunder [BH95]. Beyond tableaux-based reasoners, approaches extending the EL++ subsumption algorithm [BPS07a] and automata-based approaches [BP08a, BP10] also fall under the label of ‘glass-box’ justification generation. A fair number of approaches to explaining entailments of description logic KBs use glass-box techniques to present users with reasoner traces in the form of axiom sets, natural language, or some form of visualisation [SC03, LH05, Kwo05, MLBP06].

Black-box techniques

Expand-contract approach A simple technique for finding one justification for an entailment η of an ontology O is the expand-contract approach [KPHS07]: in the expansion phase, axioms from the given ontology O are incrementally added to an empty ontology O′, performing an entailment check after each addition until it is found that O′ entails η. In order to generate a minimal justification from O′, the contraction phase then removes superfluous axioms from O′, performing an entailment check after each removal. This approach is known as a black-box technique, as it only requires an out-of-the-box reasoner to perform the entailment checks. Optimisations of this algorithm focus on reducing the number of expensive entailment checks by employing a divide-and-conquer [FS05, SFJ08] or a sliding-window technique [Kal06] in the contraction phase.
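A minimal sketch of the expand-contract scheme is given below, assuming an entails(axioms, eta) oracle; the optimisations mentioned above (divide-and-conquer, sliding window, modules) are omitted, and selection_order stands in for a hypothetical heuristic ordering of the axioms.

def find_one_justification(ontology, eta, entails, selection_order=None):
    """Black-box expand-contract sketch: grow a subset of O until it
    entails eta, then shrink it to a minimal entailing subset."""
    axioms = list(selection_order or ontology)
    if not entails(set(axioms), eta):
        return None  # O does not entail eta at all
    # Expansion phase: add axioms until eta is entailed.
    working = set()
    for axiom in axioms:
        working.add(axiom)
        if entails(working, eta):
            break
    # Contraction phase: drop axioms whose removal preserves eta.
    for axiom in list(working):
        if entails(working - {axiom}, eta):
            working.remove(axiom)
    return frozenset(working)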

Hitting set tree algorithm In order to find not just one but all justifications for an entailment, Schlobach [Sch05b] first proposed the use of Reiter’s hitting set tree (HST) algorithm [Rei87, GSW89] for the computation of MUPS and MIPS. The algorithm originates from the field of model based diagnosis, which describes the process of finding diagnoses for faults in a system composed of components. A minimal conflict set is a minimal set of such components which causes a system fault, and a diagnosis is a minimal hitting set across the set of minimal conflict sets: given a set C of minimal conflict sets, a hitting set for C is a set H of components, taken from the sets in C, such that H ∩ S ≠ ∅ for each S ∈ C. Minimal conflict sets and diagnoses correspond to our notions of justifications and minimal repairs, respectively.

Reiter’s algorithm constructs a hitting set tree in order to find all minimal hitting sets over a given set of minimal conflict sets in a diagnosis problem. The nodes in the HST are labelled with minimal conflict sets, and the edges are labelled with components (axioms). For the purpose of finding justifications, the HST algorithm is initialised by computing a single justification using any glass-box or black-box ‘find one’ algorithm, which suffices to generate the tree for all justifications, as shown in [KPHS07]. Optimisations such as early path termination and justification reuse [KPHS07] help ensure that the algorithm terminates in practical time. Horridge [Hor11a] showed that it is possible to compute all justifications for direct atomic subsumptions from over 90% of the ontologies in a diverse test corpus; for the remaining ontologies, however, the algorithm did not terminate in the given time due to the large size of the constructed HST, or due to a timeout on entailment checks.
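The hitting set tree construction can be sketched as follows, assuming a ‘find one’ subroutine such as the expand-contract sketch above; early path termination and justification reuse are omitted for brevity, so closed paths are hitting sets but not necessarily minimal repairs.

def all_justifications(ontology, eta, entails, find_one):
    """Simplified hitting set tree search for all justifications of
    O |= eta. find_one(axioms, eta, entails) returns a single
    justification with respect to the given axiom set, or None."""
    ontology = frozenset(ontology)
    justifications = set()
    hitting_sets = set()   # closed paths; the minimal ones are repairs
    explored = set()
    stack = [frozenset()]  # each entry is a path of removed axioms
    while stack:
        path = stack.pop()
        if path in explored:
            continue
        explored.add(path)
        just = find_one(ontology - path, eta, entails)
        if just is None:
            hitting_sets.add(path)
            continue
        justifications.add(frozenset(just))
        for axiom in just:
            stack.append(path | {axiom})
    return justifications, hitting_sets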

Modularisation

A module M of an ontology O is a subset of O which contains all axioms that are relevant for some seed signature Σ ⊆ sig(O). There exist various types of modules and algorithms for the efficient computation of a module for a given seed signature; see, for example, [CPSK06, CHKS07, SSZ09]. It has been shown that syntactic locality based modules are depleting, which means that a module generated for the signature of an entailment contains all justifications for this entailment [CHKS07]. Further, for any given seed signature, there exists a unique and minimal locality based module [CHKS08].

When used in the justification computation process, a syntactic locality based module M is computed for the signature of the entailment, and the justification computation deals only with M rather than the full ontology O [Sun08, DQJ09]. Typically, the number of axioms in a module is small compared to the whole ontology [Sun08], which reduces the computational load both in the expansion and contraction phases and in the individual entailment checks. A study by Suntisrivaporn et al. [SQJH08] using randomly selected entailments from three fairly large and complex OWL ontologies (Galen, NCI, and the Gene Ontology) found that the average size of syntactic locality based modules was only between 0.05% and 1.6% of the whole ontology. Consequently, the justification computation performance was drastically improved, by around two orders of magnitude, making it possible to compute justifications even for entailments on which the algorithm using the full ontology timed out. As the overhead of module computation is negligible compared to the overall justification computation time, and due to the significant improvements obtained through modularisation, it is now considered a standard optimisation in justification computation approaches for OWL ontologies.
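The effect of restricting justification search to a relevant subset can be conveyed with a deliberately naive sketch. The function below is not syntactic locality based module extraction; it merely collects all axioms whose signature is (transitively) connected to a seed signature, and signature_of is an assumed helper that returns the set of terms used in an axiom.

def signature_reachable_subset(ontology, seed_signature, signature_of):
    """Naive over-approximation of a 'relevant' subset: starting from the
    seed signature, repeatedly add every axiom that mentions a collected
    term, together with that axiom's signature, until a fixpoint is
    reached. This is NOT syntactic locality; it only illustrates the idea
    of working with a subset instead of the whole ontology."""
    terms = set(seed_signature)
    subset = set()
    changed = True
    while changed:
        changed = False
        for axiom in ontology:
            if axiom not in subset and signature_of(axiom) & terms:
                subset.add(axiom)
                terms |= signature_of(axiom)
                changed = True
    return subset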

Performance

Generally, glass-box techniques for finding single justifications are considered to be more efficient than black-box techniques, as the justifications are generated almost ‘automatically’ as a by-product of the classification process. Indeed, in an extensive analysis of various justification computation algorithms, Horridge [Hor11a] found that using a glass-box ‘find one’ algorithm as a subroutine can improve performance by an order of magnitude for some ontologies. However, given the heavy modifications required to integrate tracing with a reasoner, and the fact that currently only the Pellet reasoner supports tracing, we may consider black-box techniques a more accessible alternative to glass-box approaches.

Regarding the feasibility of computing justifications for OWL ontologies used in practice, Horridge [Hor11a] found that it is highly likely that we can find all justifications for an entailment in reasonable time. His study of 72 ontologies from the NCBO BioPortal showed that for the majority of ontologies in the set (all but 7), all justifications for 99% of the entailed direct atomic subsumptions could be computed in less than 10 seconds.

2.3.3 Understanding individual justifications

While the main focus of justification research has been on the performance of justification finding algorithms, in recent years the issue of understanding justifications has been receiving increased attention. We know that justifications can significantly reduce user effort by allowing users to focus on a small, relevant subset of an ontology; and yet, justifications do not always help all users in understanding the actual reason why an entailment holds and in finding a suitable repair.

Fine-grained justifications

By definition, a justification is a minimal entailing subset of an ontology; that is, a justification contains axioms as they are asserted in the ontology. This means that the axioms in a justification can contain superfluous parts, i.e. subexpressions on the LHS or RHS of an axiom which do not contribute to the entailment. While such superfluous parts can distract users from the actual cause of an entailment, thus making the justification more difficult to understand, removing axioms with superfluous parts in the repair process also means a loss of possibly valuable information.

In order to cope with the issue of superfluity, the notion of fine-grained justifications was first introduced by Kalyanpur et al. [KPC06]. This approach is based on the idea of rewriting justification axioms in a normalised form, then splitting the axioms across intersections and only preserving those axioms which are required for the entailment in question to hold. The strategy was implemented in the Swoop editor, using strike-out and colour highlighting to indicate superfluous expressions in axioms, a simple yet visually effective technique. Lam et al. [LPSV06] directly integrated the idea of fine-grained justifications into a repair tool, computing the impact of rewriting each axiom based on the modification of its subexpressions.

Building on these first attempts, Horridge et al. [HPS08] proposed a definition of laconic justifications: a laconic justification is a justification which does not contain any superfluous parts, with every subexpression being as weak as possible [HPS08]. This implies that every subexpression in a laconic justification is relevant to the entailment. The authors [HPS08] also describe a method to compute the preferred laconic versions of a justification, which results in a unique correspondence between a justification and its laconic variants. In short, the process

• removes any subexpressions from axioms which are not relevant for the entailment to hold.

• derives a single subsumption axiom from an equivalence where possible.

• substitutes class names with > where possible.

• weakens number restrictions to the smallest number possible.

Note that the term ‘justification’ is used to refer to laconic justifications by slight abuse of notation; it is clear that, given the transformation steps, the resulting laconic justification may no longer be a subset of the original ontology O. The following example illustrates several aspects of non-laconicity in a justification:

Example 2.5.

J = {A v B u ≥ 2 r.C, ∃r.C v D} |= A v D

First, we can reduce the intersection in the first axiom to the single conjunct ≥ 2 r.C without affecting the entailment. Second, this expression can be weakened to ≥ 1 r.C. Third, the class name C in both axioms can be substituted with >. This results in the laconic justification {A v ≥ 1 r.>, ∃r.> v D}.

A survey [Hor11a] of 72 OWL ontologies from the NCBO BioPortal revealed that non-laconic justifications are prevalent across OWL ontologies; indeed, the majority of the surveyed ontologies (69 out of 72) contained at least some non-laconic justification for an atomic subsumption, with 29 ontologies containing over 50% non-laconic justifications.
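The part-stripping idea behind fine-grained and laconic justifications can be sketched at an abstract level: treat each axiom as a set of separable parts and greedily drop parts as long as the entailment survives. The parts_of and rebuild helpers and the entails oracle are assumptions, and the sketch deliberately ignores the normalisation and weakening steps performed by the algorithm of Horridge et al.

def strip_superfluous_parts(justification, eta, parts_of, rebuild, entails):
    """Greedily remove separable parts from each axiom in a justification
    while the entailment eta is preserved. This illustrates the idea of
    fine-grained justifications, not the algorithm of Horridge et al."""
    current = {axiom: set(parts_of(axiom)) for axiom in justification}

    def materialise(axiom_parts):
        # Rebuild axioms from their remaining parts; drop emptied axioms.
        return {rebuild(axiom, parts)
                for axiom, parts in axiom_parts.items() if parts}

    for axiom in list(current):
        for part in list(current[axiom]):
            trial = {a: set(p) for a, p in current.items()}
            trial[axiom].discard(part)
            if entails(materialise(trial), eta):
                current = trial  # the part was superfluous, keep it removed
    return materialise(current)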

Masking

Justification masking [HPS08, HPS10a] occurs when the actual number of reasons why an entailment holds differs from the number of justifications that can be found for the entailment. There exist different types of masking, which are mainly caused by axioms containing superfluous expressions, i.e. being non-laconic. The examples given in [HPS10a] focus on unsatisfiable classes; it can easily be shown, however, that masking also occurs for arbitrary entailments.

Internal masking describes a situation where a justification contains more than one reason why an entailment holds, which means that there exist multiple laconic versions of the justification.

Example 2.6.

J = {A v B u C, B t C v D} |= A v D

Example 2.6 shows a justification J for A v D which contains two reasons for the entailment. The laconic versions of J are:

J1 = {A v B,B v D}

J2 = {A v C,C v D}

External masking involves other axioms from the ontology which do not occur in any justification for the entailment.

Example 2.7.

O = {A v C u D, C v D} |= A v D

In the ontology shown in Example 2.7 there exists only one justification for the entailment A v D, namely the first axiom A v C u D. It is easy to see, however, that the two axioms in O combined provide another reason for the entailment via the laconic justification {A v C, C v D}; a justification comprising both asserted axioms, though, would violate the minimality condition for justifications.

Cross masking is similar to external masking, with the difference that the axioms involved are from other justifications for the same entailment rather than non-justification axioms.

Example 2.8.

J1 = {A v C u D}

J2 = {A t C v D}

In addition to the two justifications J1 and J2 for the entailment A v D, there exists a third reason why the entailment holds, which comprises parts of both justifications: {A v C, C v D} |= A v D. Again, this third reason cannot be a justification, as the corresponding set of asserted axioms is non-minimal.

Shared cores masking occurs when two different justifications capture the same reason why an entailment holds, i.e. they only differ in their superfluous parts. This implies that the actual number of explanations for an entailment is lower than it appears.

Example 2.9.

J1 = {A v D}

J2 = {A v D u ∃r.C}

An ontology which contains the two axioms in J1 and J2 has two distinct justifications for the entailment A v D. However, the actual reasons why the entailment holds are identical, as the laconic version of J2 is identical to J1.

Masking may cause problems for users attempting to understand or repair an entailment. When repairing a justification where internal, external, or cross masking occurs, a user might only notice and modify one of the reasons, expecting the entailment to no longer hold after the modification. That modification, however, may lead to another justification for the entailment, an effect which may be rather surprising. Worse still, the justification may interact with several other axioms to create multiple justifications, a situation which has been found to occur in the NCI Thesaurus [Hor11a]. Furthermore, when analysing justifications for the purpose of gathering ontology metrics, justification masking may lead to over- or under-counting the number of actual justifications in an ontology.

It is clear that ontology debugging tools need to support users in cases where masking occurs through non-laconicity. Dealing with superfluity in an explanation tool, however, makes it necessary to balance two opposing requirements: first, we want the user to be presented with the most concise explanation as to why an entailment holds, and avoid clutter and distraction caused by superfluous parts; second, in order to facilitate understanding and repair, the explanations should directly relate to the asserted axioms in the ontology. Besides the strike-out method implemented in Swoop’s repair tool, there have been no successful approaches to OWL ontology debugging that meet both requirements.

Justification based proofs

In addition to superfluity, there exist other reasons why a justification can be difficult to understand. Justifications may contain structural patterns and constructors whose implications are unfamiliar or non-obvious to the user. A popular example from the Movie ontology [HP09] illustrates how even a very small justification can be hard or impossible to understand, even for OWL experts [HP09, HPS10b]:

Example 2.10.

J = {Person v ¬Movie,
     RRated v CatMovie,
     CatMovie v Movie,
     RRated ≡ ∃hasScript.ThrillerScript t ∀hasViolenceLevel.High,
     domain(hasViolenceLevel, Movie)}
|= Person v ⊥

Example 2.10 shows a justification for the unsatisfiability of the class Person in the Movie ontology. Axioms 2 to 5 in the justification entail that Movie ≡ >, which implies that Person v Movie, which, in turn, contradicts the first axiom stating the disjointness of the classes Person and Movie. When presented with this justification, subjects tend to give up or question the correctness of the justification [HPS10b].

Justification based proofs seek to address the problem of understanding such justifications by providing users with a more detailed explanation which includes not only the justification axioms, but also intermediate entailments such as Movie ≡ > in the previous example. Such intermediate entailments, arising from subsets of a justification J, are known as lemmas.

In a similar approach to justification based proofs, Nguyen et al. [NPPW12b] generate natural language proofs based on justifications by identifying frequent sub-patterns, or rules, in justifications which lead to such intermediate entailments, then translating the resulting proof trees into natural language. They extracted patterns, restricted to laconic axioms and a maximum size of four axioms, from 500 OWL ontologies and identified the 57 most frequent patterns, which are then used as the basis for natural language proofs.

Cognitive complexity

To date, there have been few attempts at investigating the understandability of justifications for OWL users. While there exist a number of user studies evaluating OWL debugging tools [KPSH05, Lam07], they focus on measuring the time and success rates of the debugging tools and do not offer any further insights into how users interact with the explanations.

The first major study on the cognitive complexity of justifications was carried out by Horridge et al. [HBPS11b] for the purpose of evaluating a complexity model for justifications which was constructed based on an initial exploratory study [HBPS11a]. The study involved 14 students from an MSc class who were presented with a set of items, each consisting of a set of axioms alongside a single axiom (the candidate entailment), and were asked to answer whether the candidate entailment followed logically from the axiom set or not. It was found that the complexity predicted by the model coincided partially with the error rate in this task, with some anomalies in the error rates caused by 1) superfluity in a justification which the model did not account for, and 2) a ‘flaw’ in the experiment protocol which meant test subjects could answer the question correctly, but (presumably) for the wrong reason.

Building on Horridge’s work on justification based proofs, Nguyen et al. [NPPW12a] present an investigation of the difficulty users have with understanding various sub-patterns that occur in OWL justifications. The authors conducted a study on a ‘mechanical turk’ type web platform [BKG11] where test subjects were presented with a natural language reasoning problem which was directly based on one of 51 (out of the 57 identified) justification sub-patterns. Each problem was answered by around 50 people, with correct answers ranging from 100% for the easiest rule to only 2% for the most difficult one. While natural language based explanation is potentially highly effective for OWL novices, the resulting proofs are often very large and might themselves be difficult to navigate and understand.

(a) Pattern 1: {C1 v C3, C3 v C4, C4 v C5, C5 v C6, C6 v C2}
(b) Pattern 2: {C1 v ∃p1.(C3 u C4), C3 u C4 v C5, ∃p1.C5 v C6 u C7, C6 u C7 v C8, C8 v C2}
(c) Pattern 3: {C1 v C3, C3 v C4, C1 v ∃p1.C5, C5 v C6, C4 u (∃p1.C6) ≡ C2}

Figure 2.3: Three justification patterns for C1 v C2.

Reasoning patterns

In order to explore whether and how OWL users read and understand ‘reasoning patterns’, we performed a pilot study [HBPS13] with two test subjects who had several years’ experience of working with OWL ontologies and description logics. The experiment setup was identical to the complexity model study by Horridge et al. [HBPS11b] described above; however, the seven justifications were specifically selected to contain one of the three patterns we identified when manually surveying a large corpus of justifications from ontologies in the OWL 2 EL profile. In brief, these patterns were:

1. Atomic subsumption chains (Figure 2.3a): the exploratory study presented in [HBPS11a] indicates that users frequently skip the middle part of such a chain.

2. Propositional use of complex expressions (Figure 2.3b): complex expressions in justification axioms that can be substituted with atomic class names without affecting the entailment relation. For example, in the justification shown in Figure 2.3b, all occurrences of the conjunctions C3 u C4 and C6 u C7 can be substituted with newly introduced variables x and y, respectively, while the entailment still holds.

3. Comb pattern (Figure 2.3c): taking the last axiom in the justification, a user has to ‘fill in’ each conjunct (C4 and ∃p1.C6 in the last axiom in Figure 2.3c) one by one by identifying the corresponding set of axioms in the remainder of the justification, which requires merely mechanical effort rather than a semantic interpretation of the axioms and expressions.

In the study, the eye tracker analysis and think-aloud protocol showed that both test subjects would generally consider atomic subsumption chains to be easy and jump from the first to the last axiom in the chain. Regarding the ‘comb’ pattern, both participants completed a large and seemingly complex justification (19 axioms) fairly quickly by applying a ‘filling in’ strategy as predicted. Even though the participants did not seem to recognise the possibility of abstracting from complex expressions, when asked whether highlighting possible abstractions would be useful, both participants agreed, with one stating that it would be ‘tremendously helpful’.

While this pilot study is obviously very preliminary, it does give us some insight into possible strategies OWL users may apply when working with justifications. In particular, we gathered that they draw intermediate conclusions (lemmas) from justification subsets, that they recognise patterns in justifications, and that they identify justifications that have a similar pattern. These insights play an important role in our definition of justification isomorphism, that is, grouping justifications based on their structural similarity, which we will discuss in detail in Chapter 5.

2.3.4 Understanding multiple justifications

In many ontologies used in practice, we find that there exist multiple justifications for a single entailment. While the possible number of justifications per entailment is exponential in the number of axioms in the ontology [BPS07a], the average number of justifications found in OWL ontologies is comparatively small [BHPS11]. However, even for justification counts well below the exponential threshold, we are often faced with several dozen to several hundred justifications for a single entailment. When attempting to repair an ontology, such numbers clearly rule out manual repair by a user; and even with a justification based repair tool which displays all justifications as a list, there is hardly any chance of producing a minimal repair. Ji et al. [JQH08] propose a strategy for reducing the number of justifications a user is confronted with by only computing the justifications that are most relevant for an entailment. The approach, which was implemented as part of a plugin to the NeOn Toolkit13 ontology editor, introduces a selection function based on the amount of signature a justification shares with the entailment. While this relevance-based approach may be helpful when users require only a few justifications for understanding an entailment, it does not solve the problem of helping a user repair all justifications to break an unwanted entailment.

13http://www.neon-toolkit.org/

Root and derived justifications

Root and derived justifications provide an additional form of explanation support which aims to assist users in the simultaneous repair of multiple entailments. The idea of unsatisfiable classes depending on the unsatisfiability of another class was originally mentioned in [Sch05a] and explicitly introduced as root and derived unsatisfiable classes [KPSH05]. While the initial focus was on entailed unsatisfiable classes, the concept can be easily extended to arbitrary entailments, as shown in [MMV10]: given a set of entailments εO and their justifications Justs(εO), a derived justification JD for an entailment ηD is a justification which is a superset of some other justification (J, η). Repairing J first (e.g. by removing or modifying an axiom) will also break JD. Note that there may be additional reasons for ηD to hold, which are not covered by this process. A root justification is a justification JR in Justs(εO) which is not a superset of any other justification in Justs(εO).

Definition 2.6 (Root and derived justifications). Let εO be a set of entailments of interest, and J a justification for an entailment η in εO. J is called a root justification if there exists no justification J′ for an entailment in εO such that J′ ⊂ J; else, J is called a derived justification.
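Definition 2.6 lends itself to a direct implementation. The following Java sketch (an illustration only, not the tooling used in this thesis) classifies justifications, each modelled simply as a set of axiom strings, into root and derived by pairwise strict-subset checks.

```java
import java.util.*;

// A sketch of root/derived classification per Definition 2.6.
// Justifications are modelled as sets of axiom strings; in a real
// implementation they would be sets of OWLAxiom objects.
public class RootDerivedClassifier {

    public static Map<Set<String>, String> classify(List<Set<String>> justifications) {
        Map<Set<String>, String> result = new LinkedHashMap<>();
        for (Set<String> j : justifications) {
            boolean derived = false;
            for (Set<String> other : justifications) {
                // j is derived if it is a strict superset of some other justification
                if (other != j && j.containsAll(other) && j.size() > other.size()) {
                    derived = true;
                    break;
                }
            }
            result.put(j, derived ? "derived" : "root");
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> j1 = new HashSet<>(Arrays.asList("A SubClassOf B"));
        Set<String> j2 = new HashSet<>(Arrays.asList("A SubClassOf B", "B SubClassOf C"));
        System.out.println(classify(Arrays.asList(j1, j2)));
        // j1 is a root justification, j2 is derived (a superset of j1)
    }
}
```

For n justifications this amounts to a quadratic number of subset tests, which is unproblematic for the justification counts typically encountered in practice.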

Kalyanpur et al. [KPSH05] carried out the first user study for explanation support in OWL, evaluating the debugging facilities of the ontology editor Swoop. The 12 study subjects were randomly divided into four groups and were given the task of repairing unsatisfiable classes in three OWL ontologies. Three of the groups used one of three types of debugging support (clash information from the reasoner together with sets of support, root and derived unsatisfiable classes, or both), while the fourth group received no debugging support at all. Perhaps unsurprisingly, the study found that the group using the clash information, sets of support (i.e. justifications), and root and derived information in combination performed best (i.e. fastest) in this task.

2.4 Alternative approaches to debugging

Besides justifications (and approaches closely related to the concept of justifications), there exist a number of alternative approaches to ontology debugging. While these may not be as prevalent as justifications in OWL ontology tools, they represent some relevant solutions to the problem of understanding and debugging formal ontologies, such as proofs, natural language techniques, and semi-automated debugging strategies. Under this label, we also gather other approaches to visualising and understanding ontologies, which, while not explicitly tailored towards debugging, may be helpful tools for understanding and repairing entailments.

2.4.1 Proofs

Formal proofs are considered to be the most prevalent alternative form of explanation for logic-based knowledge bases. One of the first approaches to explaining entailments in the CLASSIC system using proof-like structures was presented by McGuinness and Borgida [MB95]. The system omits ‘obvious’ intermediate steps and provides further filtering strategies in order to generate short and simple explanations. Borgida et al. [BFH99] first introduced a proof-based explanation system for knowledge bases in the description logic ALC. The system generates sequent calculus [Fit83] style proofs using an extension of a tableaux reasoning algorithm, which are then enriched to create natural language explanations. Aiming to ‘provide the simplest explanation possible’, the authors also introduce a relevance function which simplifies the proofs by omitting ‘irrelevant’ parts. The consensus of these proof-based approaches seems to be that good proofs ought to be ‘as simple and short as possible’, yet there have been no formal investigations into how ontology developers interact with such proof-based explanations.

While proof presentation in other logics has been well studied (e.g. [FH88]), surprisingly few authors are concerned with the cognitive aspects of proof understanding. There exist formal definitions of obvious proof steps [Dav81, Rud87]; however, these are merely formulated with the goal of creating short proofs and do not investigate other sources of complexity for human readers. By comparison, Lingenfelder [Lin89] also considers the skill level of the proof reader to be crucial in order to determine what can be considered ‘trivial’ in a proof, and emphasises the need for a user model in order to provide suitable proofs.

2.4.2 Ontology revision

Ontology revision as described by Nikitina et al. [NRG12] follows a semi-automated approach to ontology repair, with a focus on factually incorrect statements rather than logical errors. In the ontology revision process, a domain expert inspects the set of ontology axioms, then decides whether each axiom is correct (should be accepted) or incorrect (the axiom is rejected). Each decision thereby has consequences for other axioms, as they can be either automatically accepted (if they follow logically from the accepted axioms) or rejected (if they violate the already accepted axioms). The ontology revision system determines the impact a decision has on the remainder of the axioms (using a ranking function), and presents high impact items first in order to minimise the number of decisions a user has to make. Conceptually, this approach is straightforward and easily understandable for a user, as the cognitive effort is reduced to a simple yes/no decision, and the tool attempts to minimise the number of decisions that need to be made. In order to debug unwanted entailments, e.g. unsatisfiable classes, the set of unwanted consequences can be initialised with those erroneous axioms. The accept/decline decisions are then made in order to remove those axioms which lead to the unwanted entailments.

2.4.3 Direct computation of diagnoses

The direct diagnoses computation approach [FS05, DS08, SFRF12, RSFF12] is directly related to justifications, but rather than computing the set of justifications for an entailment, which is then repaired by removing or modifying a minimal hitting set of those justifications, the diagnoses (i.e. minimal hitting sets) are computed directly. The direct computation of diagnoses using the hitting set tree algorithm was first proposed by Friedrich and Shchekotykhin [FS05]. The authors argue that debugging large numbers of conflicts and diagnoses by computing justifications poses a computational challenge, while direct computation of diagnoses provides a more efficient solution.

Both the ontology revision approach by Nikitina et al. and the direct computation of diagnoses show some advantages over computing justifications in terms of computational performance. However, the semi-automated repair presented in [NRG12] means that the user has no direct control over which axioms to remove or modify in order to repair the unwanted entailments, while the direct computation of diagnoses aims exclusively at finding a minimal repair, regardless of the factual correctness or incorrectness of the axioms involved in the diagnoses. Furthermore, neither of the approaches supports understanding why those entailments hold, as the users are not presented with the actual reasons, but only see a sequence of axioms or a set of diagnoses, respectively. While both approaches aim to reduce the total number of steps taken in the debugging process, they might still require a user to go through a long and tedious revision process when repairing a large number of errors.

2.4.4 OntoClean

OntoClean [GW02, GW04] provides a framework for making modelling decisions in the ontology development process, with a strong focus on correct subsumption relationships. This is motivated by the misuse of subsumptions to express relations other than is-a relationships, for example part/whole relationships. OntoClean helps users to validate taxonomies by identifying metaproperties of each class, such as rigidity, identity, unity, and dependency. The framework then specifies constraints on subsumptions between classes with different properties; for example, if a class A is anti-rigid, i.e. an instance of A can cease to be a member of the class, then every subclass of A must also be anti-rigid. By applying OntoClean guidelines when building taxonomies, ontology developers can prevent subsumption relationships which may later lead to modelling inconsistencies.

2.4.5 Ontology comprehension

Ontology users often require support in understanding an ontology, or parts of it, for example when integrating an existing ontology into a project, which requires the ontology developer to familiarise themselves with the structure of the adopted ontology. While not specifically aimed at ontology debugging, such comprehension and visualisation tools can help users to get a better grasp of the relationships between entities in an ontology, which may provide support in the debugging process. Ontology visualisation tools, such as the OWLViz plugin14 in Protégé 4, the CropCircles tool [WP06] and the, admittedly rather exotic, music score notation of Barzdins and Barinskis [BB07], offer ontology users support when exploring an ontology in order to gain an overview of the relationships between its classes. Visualisation techniques can also be used for the purpose of model exploration [BSP09], which can provide support in understanding non-entailment. Users may want to understand, for instance, why a class is not entailed to be a subclass of some other—a fact that cannot be explained by justifications, as it is not possible to point out a particular subset of the ontology that does not entail something. The idea underlying model exploration is that seeing (subsets of) the models of an ontology helps users understand the relations between its classes, which, in turn, may support them in understanding why an entailment holds or does not hold in the ontology.

14http://www.co-ode.org/downloads/owlviz/

2.5 Summary and conclusions

In this chapter, we have laid out the foundations for the work presented in this thesis. We introduced the basic concepts of description logic knowledge bases, such as their syntax, semantics, and standard reasoning services, and introduced the Web Ontology Language OWL. We discussed the scope of logical and non-logical errors which can occur in OWL ontologies and introduced justifications as the currently dominant form of debugging support for entailments of OWL ontologies. This was followed by a closer look at the strategies for computing justifications, as well as some of the applications of justifications for repairing errors in OWL ontologies. Finally, we gave an overview of debugging strategies for description logic knowledge bases which are not directly related to justifications or only make indirect use of them, finding that the majority of approaches do not consider cognitive aspects of how users interact with explanations.

While multiple justifications for entailments of OWL ontologies have been acknowledged as a phenomenon to be found in OWL ontologies, the main focus of research thus far has been the efficiency of computing multiple justifications. In contrast, the issue of how users can find a suitable repair once those justifications have been computed has been largely neglected. There exist some approaches to making individual justifications easier to understand, such as laconic justifications and justification based proofs, but the problem of debugging and repair in the presence of multiple justifications has not been explicitly addressed in prior research, except for some work looking at root and derived unsatisfiable classes. Root and derived unsatisfiable classes provide a first level of additional support for a particular relationship between multiple justifications for multiple entailments; yet, there have been no investigations into whether root/derived relationships are indeed a common feature of OWL justifications, what steps can be taken when this feature is not present, and how users can cope with multiple justifications for single entailments. Besides, while root and derived unsatisfiable classes indicate that there exist structural relationships between justifications, there have not been any investigations into other such relationships and whether they can be exploited for debugging.

In summary, there does not seem to be a good understanding of how prevalent multiple justifications are in ontologies used in practice, nor are there sufficiently sophisticated coping strategies to help ontology engineers deal with multiple justifications efficiently. These insights motivate the research presented in this thesis, as there is an obvious need for further investigation of the occurrence of multiple justifications, the relations between them, and techniques for reducing both mechanical (the number of steps taken to achieve a goal) as well as mental (the cognitive complexity of the information presented to a user) user effort in the presence of multiple justifications.

Chapter 3

Defining finite entailment sets

The justificatory structure of an OWL ontology O describes the relationships between the justifications for some set of entailments of O. However, if we want to compute the set of justifications for ‘the entailments’ of an ontology, we immediately encounter a problem: recall that the set of entailments of an ontology O—the deductive closure which we have discussed in Section 2.1.2—is the set of all axioms α such that O |= α. This set is infinite for OWL ontologies, and, other than in the trivial case of the empty ontology over a non-empty signature, it does not seem practical to compute and analyse justifications for an infinite set of entailments.

Thus, we need to restrict the set of entailments we analyse to some finite subset of the deductive closure of O. Classification of an ontology, that is, computing all entailed subsumptions and equivalences containing only named classes, is a standard reasoning task, and the resulting class hierarchy (in the form of either the atomic subsumption axioms or both subsumption and equivalence axioms) of the ontology is a commonly used entailment set in OWL applications. However, even if we restrict our attention to the class hierarchy of an ontology, we find that there is no standard way of representing this set of axioms. Worse still, misleading nomenclature in ontology tools, such as the Protégé 4 editor and the OWL API, as well as anecdotal evidence, show that there exist common misconceptions about entailments: it is often assumed that only non-trivial information is contained in the set of entailments, and that tautologies such as A v A are not ‘real’ entailments. On the other hand, however, some OWL tools do include tautologies of the type A v > for certain named classes. And finally, the term ‘entailments’ is frequently used to refer to information which is only inferred but not asserted in the ontology, leading to asserted axioms not being considered entailments. This means that, despite having a well-defined finite set of entailments at hand (the set of all axioms α of type A v B and, if required, A ≡ B for A, B ∈ sig(O), possibly A, B ∈ sig(O) ∪ {⊥, >}, such that O |= α), we still obtain different numbers of entailments for the same ontology depending on the representation chosen by the application we are dealing with. And while it may not be necessary to define a single canonical representation of the class hierarchy—after all, axioms such as A v > can be useful for some applications while they might be superfluous in others—we need to at least make obvious the different design decisions to be made when choosing a representation for this entailment set.

In what follows, we will discuss various design decisions for representing finite entailment sets of consistent OWL ontologies, which take into consideration relationships such as direct and indirect subsumptions, tautologies, dealing with unsatisfiable and universal classes, and ontology imports, while paying attention to the issue of counting entailments. The design decisions presented here largely focus on the class hierarchy of an ontology, but are applicable to arbitrary entailments, and we will also discuss how we can define finite entailment sets beyond the simple class hierarchy of an ontology. Finally, we introduce a shorthand notation which allows us to conveniently refer to a specific representation of an entailment set, and outline some examples of how entailment sets are used in OWL applications. While this chapter informs the experiments on justifications presented in the remainder of this thesis, it may also be regarded as a self-standing discussion with relevance to applications such as ontology analysis, the OWL API, and ontology editing tools.

3.1 Design decisions for finite entailment sets

In this section, we will describe a number of design decisions to make when choosing a representation for a finite entailment set of an OWL ontology. In other words, given a finite set of entailments, such as the axioms representing the class hierarchy, we suggest a set of filtering steps which result in an axiom set that is logically equivalent, but may differ in the number of axioms it contains. To clarify the distinction between a finite entailment set and its representation, consider a simple example: given an ontology O = {A v B, B v C}, we are interested in the set of ‘entailed atomic subsumptions’ of O. Intuitively, we would choose the finite entailment set εO = {A v B, B v C, A v C}, as this contains all ‘relevant’ atomic subsumptions between named classes. However, we may also choose to represent εO by an equivalent, more compact set of axioms εO′ = {A v B, B v C}, which is logically equivalent to εO, but contains fewer axioms. On the other hand, we can also represent εO as the equivalent εO″ = {A v B, B v C, A v C, C v >}, as is commonly done by the OWL API. It is clear to see that all choices lead to equivalent axiom sets, however containing different numbers of axioms.

Only the choice of entailment types has an effect on the actual size of a finite entailment set: given two ontologies O and O′ such that O ⊂ O′, we expect the number of axioms in a finite entailment set of the larger ontology O′ to be the same or more than that of O. That is, our notion of finite entailment sets must be stable under the extension of an ontology O; an essential criterion when using entailment counts for the comparison of two ontologies. However, as we will see in the following sections, some design decisions for representations of entailment sets appear to violate the stability of the entailment set relation by leading to smaller (representations of) entailment sets when adding axioms to an ontology. Where applicable, we will point out the impact of a decision on the stability of entailment counts.

3.1.1 Tautologies

A tautology is an axiom such as A v A, A v >, and ⊥ v A for a named class A ∈ sig(O) which is vacuously entailed by O. Tautologies do not contain any relevant information and are therefore frequently omitted from the axiom set representing the class hierarchy of an ontology. However, in some OWL tools, such as the OWL API and Protégé 4, named classes that do not have a direct named subsumer are generally included in entailment sets as subclasses of >, possibly in order to ensure that all named classes in the ontology are present in the entailment set in some way. Note that this strategy means that only classes which do not have a named subsumer occur in axioms of the type A v >. This can be problematic if we use the entailment count to compare the logical strength of two ontologies: take, for instance, two ontologies over the signature {A, B, C} where O = ∅ and O′ = {A v B}. If we include axioms of the above type in the entailment set, we get the two entailment sets

εO = {A v >, B v >, C v >}

εO′ = {A v B, B v >, C v >}.

Both entailment sets have the same number of entailments; however, εO′ arguably contains at least some non-trivial information (namely the subsumption A v B). While this decision does not affect the monotonicity of the entailment count, it shows that, if we include tautologies in an entailment set, the size of the set may not reflect the amount of relevant information it contains. Since there is no obvious benefit to including most tautologies in a finite entailment set, we will generally consider finite entailment sets to be free of tautologies in the remainder of this chapter. If required, we may only include axioms of the type A v > for a named class A which does not have any other named subsumer.
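This policy is easy to state operationally. The sketch below is a simplified illustration (classes and axioms are plain strings rather than OWL API objects): it builds a tautology-free representation of the entailed atomic subsumptions and, if requested, adds A v > only for classes without any other named subsumer.

```java
import java.util.*;

// Sketch: build a tautology-free representation of the atomic subsumption
// entailment set, optionally adding A SubClassOf Thing only for classes
// that have no other named subsumer (mirroring the policy described above).
public class TautologyFilter {

    public static List<String> represent(Map<String, Set<String>> superclasses,
                                          boolean topForOrphans) {
        List<String> axioms = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : superclasses.entrySet()) {
            String cls = e.getKey();
            Set<String> supers = new TreeSet<>(e.getValue());
            supers.remove(cls);         // drop A SubClassOf A
            supers.remove("owl:Thing"); // drop A SubClassOf Thing for now
            for (String sup : supers) {
                axioms.add(cls + " SubClassOf " + sup);
            }
            if (supers.isEmpty() && topForOrphans) {
                axioms.add(cls + " SubClassOf owl:Thing");
            }
        }
        return axioms;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> entailed = new LinkedHashMap<>();
        entailed.put("A", new HashSet<>(Arrays.asList("B", "owl:Thing")));
        entailed.put("B", new HashSet<>(Collections.singletonList("owl:Thing")));
        entailed.put("C", new HashSet<>(Collections.singletonList("owl:Thing")));
        System.out.println(represent(entailed, true));
        // [A SubClassOf B, B SubClassOf owl:Thing, C SubClassOf owl:Thing]
    }
}
```

Applied to the signature {A, B, C} with O′ = {A v B}, this reproduces the representation εO′ shown above.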

3.1.2 Asserted and inferred axioms

Every axiom that is asserted in an ontology O is trivially entailed by the ontology and therefore part of the set of entailments of O. However, the main idea behind logic-based ontologies is the application of automated reasoning to make implicit knowledge visible; thus, the set of inferred, but not asserted, axioms may be of much higher relevance to a user. In particular when counting entailments of an ontology, including asserted axioms in the count may not add any useful information if the number of asserted axioms of the selected type is already known. Most OWL tools (e.g. Protégé 4 and the OWL API) take this into account when representing the class hierarchy of an ontology and return only entailments which are inferred, but not asserted.

While excluding asserted axioms from an entailment set is a reasonable choice in user-focused applications, this may lead to a loss of information in analytical applications, such as justification analysis. An entailment that is asserted in an ontology may also hold for other, more complex reasons, i.e. it can have justifications other than the asserted axiom itself. This may be purely accidental and a side-effect of other axioms in the ontology, or intentional as a result of ontology developers ‘adding’ inferred entailments into the ontology. Making entailments explicit has two advantages: first, it may improve reasoner performance when classifying the ontology, as the reasoner will have to perform fewer subsumption checks, and second, the information will be visible even when there is no reasoner available, for example in a web-based ontology browser—and even if these are not real benefits, users may believe that they are, which encourages them to take these steps.

When counting entailments, we need to be aware of the effects of excluding asserted entailments from the entailment set. Take, for example, an ontology O with n asserted atomic subsumptions and k ≥ 1 inferred (but not asserted) atomic subsumptions. If we only count entailments which are inferred but not asserted, the entailment count for O will be k. Now, if we add all k inferred atomic subsumption axioms to O, we obtain O′ such that O ⊂ O′. Then the chosen representation εO′ of the entailments of O′ will contain no (inferred but not asserted) axioms, that is, εO′ ⊂ εO despite O ⊂ O′. This is obviously counter-intuitive, as we generally expect the number of entailments to grow monotonically as we add axioms to an ontology.

A more concrete example shows how the distinction between inferred and asserted entailments makes the entailment count dependent on the syntactical variations of equivalent axioms: take O = {A v B u C} and O′ = {A v B, A v C}. O has two entailed atomic subsumptions (A v B, A v C) which are inferred and not asserted, whereas O′, despite having the same entailed atomic subsumptions, has no entailments according to our specification. Of course, in both examples the ‘missing’ entailments are only a deception, as the two ontologies are logically equivalent, which means they do not actually differ in terms of their information content; however, if we use the number of inferred (and not asserted) entailments as an ontology metric to compare the two ontologies, we can easily see how this situation can quickly lead to counting errors.

3.1.3 Transitivity

We generally treat the class hierarchy of an ontology as a directed graph, with nodes representing sets of equivalent classes, and edges representing subsumption relationships. A direct subsumption between a class A and a class B is represented by an edge between the nodes labelled with A and B, respectively; this corresponds to a path of size one. An indirect subsumption is a path between two nodes of size greater than one. The class graph can be either based on the asserted ontology, i.e. the class graph is constructed based on the axioms in the ontology, without the use of a reasoner, or the inferred ontology, which means that the class graph is built according to the subsumptions and equivalences returned by a reasoner. That is, a directed edge between the nodes labelled A and B is added to the class graph of O if A v B is asserted in (inferred by) O, and a class name C is added to the label of a node labelled {D} if C ≡ D is asserted in (inferred by) O. Note that in Protégé 4, the class hierarchy display is generated using a very simple structural reasoner which, in addition to implementing transitivity of the subsumption relationship, simply splits conjunctions on the RHS of subsumption axioms and in equivalence axioms.

The set of all entailed atomic subsumptions, direct and indirect, then corresponds to the transitive closure of the inferred class graph, while the transitive reduct [AGU72] represents the set of entailed direct subsumptions. The transitive reduct of a directed acyclic graph G is a graph G′ such that G′ has a directed path between two nodes u, v whenever there is a path (u, v) in G, and there is no graph with fewer edges than G′ which fulfils that condition. The abstract concept of a transitive reduct can be represented as the Hasse diagram1 of an ontology’s class graph.

There are several scenarios in which we may want to include or exclude indirect subsumptions in a representation of the class hierarchy. When presenting a set of entailed axioms to a user, we may assume that the user understands the principle of transitivity and only requires a small entailment set with relevant information; in such a case, it seems reasonable to choose a compact representation of the entailment set by excluding indirect subsumptions. On the other hand, if a user wants to be presented with all the information entailed by an ontology, it might be preferable to include indirect subsumptions. For the purpose of computing and analysing justifications of an ontology, for example, we may want to include indirect subsumptions in the entailment set, as they may have relevant justifications which would otherwise be neglected.

With regard to counting the number of entailments of an ontology, care needs to be taken when dealing with the transitive reduct of the inferred class graph. Example 3.1 shows how removing axioms from an ontology results in an increase in the number of entailments.

1See, for example, [JGSV06] for an example of diagrammatic representations of an OWL ontology.

(a) Transitive reduct of O: edges X1 → A, X2 → A, X3 → A, and A → B. (b) Transitive reduct of O′: edges Xi → A and Xi → B for i = 1, 2, 3.

Figure 3.1: Class graphs representing the transitive reducts of O and O′.

Example 3.1.

O = {X1 v A, X2 v A, X3 v A,
X1 v B, X2 v B, X3 v B, A v B}

Given the above ontology, the number of entailed direct atomic subsumptions, based on the transitive reduct of the inferred class graph (shown in Figure 3.1a), is four, as the indirect subsumptions between Xi and B are excluded from the set: εO = {X1 v A, X2 v A, X3 v A, A v B}. Removing the axiom A v B from the ontology creates the class graph for the now modified ontology O′ shown in Figure 3.1b; as we can see, the transitive reduct of the graph has more edges than the graph for O. Accordingly, the set of entailments based on the transitive reduct now contains six entailed axioms: εO′ = {X1 v A, X2 v A, X3 v A, X1 v B, X2 v B, X3 v B}. Thus, the removal of an axiom has led to a seemingly larger entailment set in this particular representation. Interestingly, the InferredSubClassAxiomGenerator implemented by the OWL API exhibits exactly this behaviour; and while this non-monotonicity of entailment counts may be technically correct, it is certainly confusing and counter-intuitive to a user who has no way of specifying and understanding the exact settings of an entailment generator.
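The transitive reduct used in Example 3.1 can be computed directly from the transitive closure of an acyclic class graph: an edge u → v is dropped whenever v is already reachable through another superclass of u. A minimal sketch (node names as strings, no handling of equivalence cycles) follows.

```java
import java.util.*;

// Sketch: transitive reduct of an acyclic class graph given as adjacency sets
// over the transitive closure (i.e. closure.get(u) contains every entailed
// strict superclass of u). An edge u -> v is redundant if some other
// superclass w of u also has v as a superclass.
public class TransitiveReduct {

    public static Map<String, Set<String>> reduce(Map<String, Set<String>> closure) {
        Map<String, Set<String>> reduct = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : closure.entrySet()) {
            String u = e.getKey();
            Set<String> direct = new TreeSet<>(e.getValue());
            for (String v : e.getValue()) {
                for (String w : e.getValue()) {
                    if (!w.equals(v)
                            && closure.getOrDefault(w, Collections.emptySet()).contains(v)) {
                        direct.remove(v); // v is reachable via w, so u -> v is indirect
                        break;
                    }
                }
            }
            reduct.put(u, direct);
        }
        return reduct;
    }

    public static void main(String[] args) {
        // The ontology O of Example 3.1: Xi SubClassOf A, Xi SubClassOf B, A SubClassOf B.
        Map<String, Set<String>> closure = new LinkedHashMap<>();
        for (String x : Arrays.asList("X1", "X2", "X3")) {
            closure.put(x, new HashSet<>(Arrays.asList("A", "B")));
        }
        closure.put("A", new HashSet<>(Collections.singletonList("B")));
        System.out.println(reduce(closure)); // four direct subsumptions, as in Figure 3.1a
    }
}
```

Running this on the closure of O yields the four direct subsumptions of Figure 3.1a; removing A v B from the input leaves all six Xi edges in place, reproducing the counter-intuitive growth discussed above.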

3.1.4 Equivalent classes

While we commonly only deal with entailed atomic subsumptions representing the class hierarchy, we might also want to include information about the equivalence between named classes in the ontology in an entailment set. However, dealing with equivalence axioms as entailments is not as straightforward as it may seem, as choosing how to represent equivalent classes as well as subsumptions between equivalent classes and other classes as a set of axioms requires us to make a number of design decisions.

Representing equivalence as axioms Assume some ontology O entails the equivalence of the classes A, B, and C. In the class graph of O this can be easily expressed by a node which is labelled with the three class names. In OWL applications, however, users are generally presented with a list of entailed axioms rather than a graph visualisation. We now have to choose how to represent the equivalence between these three classes in a set of equivalence axioms: 1. As an n-ary EquivalentClasses(A, B, C) axiom, which is possible in OWL. 2. As an exhaustive set of binary equivalence axioms in order to correspond to DL notation which only allows binary equivalence axioms: {A ≡ B,A ≡ C,B ≡ C}. 3. As a minimal set of binary equivalence axioms. This requires the use of a function pairwise(n) to return a set of pairwise axioms which suffices to represent the relation, such as {A ≡ B,B ≡ C}. This method is imple- mented in the OWL API. Again, the decision how to represent equivalent classes depends on the ap- plication of the entailment set. In an OWL context, an n-ary EquivalentClasses axiom is certainly the most user-friendly way of representing a set of equivalent classes. When analysing justifications, an exhaustive set of binary equivalence axioms will capture all justifications, while a set of representative axioms may be suitable in a user-facing application which only supports binary equivalence axioms.

Subsumptions between equivalent classes Representing the subsumption relationships in a class graph between nodes labelled with multiple equivalent classes poses yet another challenge: given a set of equivalent classes A, B, and C which are all subsumed by a common superclass D, how can we represent this 3.1. DESIGN DECISIONS FOR FINITE ENTAILMENT SETS 75 relation in a set of axioms? Yet again, there are multiple approaches, which depend on whether an entail- ment set explicitly contains equivalence classes axioms, or whether the equivalence between (sets of) classes may be represented otherwise: 1. Every class in the set of equivalent classes creates a new subsumption axiom A v D, B v D, C v D. If the chosen entailment set does not include equivalence axioms, this approach may be suitable, as it explicitly lists the subsumption relations between all classes and their superclasses.2 The disadvantage of this strategy is a fast growth of subsumption axioms if both the subclass and the superclass node contain multiple equivalent classes. That is, for two nodes labelled with i and j equivalent classes, respectively, the number of axioms resulting from this approach is i ∗ j. 2. When including equivalence axioms in the entailment set, it may suffice to select a representative from each of the nodes representing the sub- and superclasses, respectively. For this purpose, we introduce a new function, rep(n) which selects a class name from a node n of equivalent classes based on some user-defined criteria. The knowledge of the equivalence relations and the subsumption between two of the classes will then be sufficient in- formation for a user to infer that the other subsumptions follow. 3. Another, less straightforward approach, would be to attempt to represent all equivalent classes in a node by one newly generated class name, for example by concatenating the names of the subclasses and presenting the user an axiom of the type A, B, C v D. This is obviously not standard OWL notation and might be confusing to a user, yet, it conveniently captures both the equivalence as well as the subsumption relations in a single axiom.

3.1.5 Strict and non-strict subsumptions

Another issue to consider when computing finite entailment sets is the notion of representing strict vs. non-strict subsumptions. A strict subsumption is an axiom A v B such that O |= A v B and O ⊭ B v A, i.e. the classes are not equivalent, whereas a non-strict subsumption is an axiom A′ v B′ or B′ v A′ where O |= A′ ≡ B′.

2Excluding the equivalence axioms in this situation may lead a user to have an incorrect ‘mental model’ of the ontology graph in which the three subclasses are not equivalent.

By default, OWL reasoners (when accessed via the OWL API) exclude non-strict subsumptions when asked for all superclasses of a named class. This may cause confusion in cases where a user knows that class A should be subsumed by B, but is not aware of reasons for the equivalence of the two classes; the subsumption simply appears to be missing from the entailment set. Including non-strict subsumptions therefore seems reasonable when a user requests only subsumption axioms and no equivalence axioms in an entailment set. On the other hand, when including equivalence axioms in an entailment set, non-strict subsumptions should in turn be excluded from the set in order to avoid duplicating information in the resulting set.

When counting entailments, we are faced with yet another situation in which the number of entailments does not grow monotonically in the size of the ontology. Take the two ontologies O = {A v B} and O′ = {A v B, B v A}. O′ clearly contains more axioms and is logically stronger than O; however, if we count only strict subsumptions in the entailment set (as is the default in the OWL API), the number of entailments of O′ is less than that of O. As in the previous examples, the actual number of entailments of O′ as given by the entailment relation |= is greater than that of O, however, our entailment sets appear to violate the monotonicity of |=.

3.1.6 Equivalence to top and bottom

Recall that an unsatisfiable class in an ontology is a class which is equivalent to ⊥, i.e. it is mapped to the empty set in all interpretations. This also means that any unsatisfiable class is a subclass of any other class in the ontology: if O |= A ≡ ⊥ for a named class A, it holds that A v B for all named classes B ∈ sig(O). In ontology editors and in conversational use, however, unsatisfiable classes are frequently denoted as subclasses of bottom, and no subsumptions between the unsatisfiable class and named classes are included in entailment sets. For example, the OWL API’s InferredSubClassAxiomGenerator only returns atomic subsumptions involving satisfiable classes.

Similarly, a class which is entailed to be equivalent to > (we may call it a tautological class) is commonly denoted as a superclass of >, and is entailed to be a superclass of any other named class in the ontology. Interestingly, there is no symmetry between the treatment of unsatisfiable and tautological classes, as entailments of the type B v A for a tautological class A ≡ > and a named class B are commonly displayed in ontology editors and returned by the OWL API.

While these naming conventions are obviously not incorrect, they only show one direction of the equivalence relationship between the classes, which may be misleading to users; furthermore, if we exclude non-strict subsumptions from an entailment set in a principled manner, as discussed in the previous section, we should also exclude axioms of the type A v ⊥ and > v A, as these are non-strict subsumptions. On the other hand, it is clear to see that for both > and ⊥ the subsumption in one direction (⊥ v A, A v >) is trivial, as it always holds true, and we are really only interested in the other direction of the equivalence. This is why it seems reasonable to treat equivalences with > and ⊥ separately from named classes and generally display them as subsumption axioms.

With respect to the different treatment of unsatisfiable and tautological classes, it is clear to see that a tautological class can introduce more subtle errors in the ontology, while subsumptions involving unsatisfiable classes do not contain any relevant information. Take, for instance, the Movie ontology example which we have shown in Section 2.3.3 in the previous chapter. In this ontology, the class Person is entailed to be unsatisfiable. This is caused by the fact that the class Movie is tautological (i.e. ‘everything is a Movie’), but asserted to be disjoint with Person. Understanding that Movie is a superclass of Person is crucial to understanding the cause of the error. This example shows how entailments of the type > ≡ A add important information to an entailment set and should therefore always be included in an entailment set.

3.1.7 Axiom and expression types

Thus far we have focused on the design decisions which affect the representation of the class hierarchy of an ontology. However, it is clear to see that an application might require a finite entailment set which contains axiom and expression types beyond atomic subsumptions and equivalences. For example, it can be useful to know how many named classes are subsumed by an expression such as ∃r.B for an arbitrary named property r and class B, as this can justify the introduction of a named class for the anonymous expression. This means that we need some way of restricting the infinite set of entailments of an OWL ontology to a sensible finite set, which strikes a balance between containing large amounts of irrelevant information and omitting relevant axioms. 78 CHAPTER 3. DEFINING FINITE ENTAILMENT SETS

Prime implicates [Qui52, Qui59, Jac92, Bie09] fulfil the requirement of providing a finite representation of the set of entailments of a formula. Initially defined for propositional formulae, the notion of prime implicates was extended by Bienvenu [Bie07] to concept expressions in the description logic ALC. First, literals L, clausal concepts Cl, and cubal concepts Cb are defined as follows:

L  ::= > | ⊥ | A | ¬A | ∀r.Cl | ∃r.Cb
Cl ::= L | Cl t Cl
Cb ::= L | Cb u Cb

A clausal concept Cl is then a prime implicate of a concept expression C if

1. |= C v Cl, and

2. for any clausal concept Cl′ such that |= C v Cl′ and |= Cl′ v Cl, it holds that |= Cl v Cl′.

Prime implicates are the ‘logically strongest clausal consequences’ [Bie09] of a formula that do not contain any redundancies; removing any literal from a prime implicate would cause the clause to no longer be entailed. Prime implicates of this type do not correspond to our notion of entailments as sets of axioms, but rather describe subsumptions between expressions. We can obtain a finite set of axioms, for example, by generating the set of class assertions x : Cl for all x : C, or the set of subsumption axioms X v Cl for all X v C, for x, X ∈ sig(O). As an example [Bie07], take the following expression:

A u (B t C) u ∃r.> u ∀r.(B u (A t C)) u ∀r.(B t D)

The prime implicates of this expression are the expressions A, B t C, ∃r.(B u A) t ∃r.(B u C), ∀r.B, and ∀r.(A t C). While prime implicates provide an elegant way of defining a ‘natural’ finite set of entailments, the concept has only been extended to concept expressions in ALC and does not cover the full spectrum of constructors and axioms available in OWL. We therefore look at simpler, purely syntactical strategies for generating finite entailment sets.

Similar to restricting the finite entailment set to entailed atomic subsumptions, we can specify any other axiom type as well as expression types we want to consider by providing an entailment pattern. In order to generate entailed axioms of different types containing only named classes, the OWL API provides an InferredAxiomGenerator interface whose implementations provide access to generators for all OWL axiom types, such as InferredDisjointAxiomGenerator and InferredSubDataPropertyAxiomGenerator.

Beyond atomic entailments, we may also want to include entailments involving complex expressions, such as existential or universal quantifiers. Entailments containing complex expressions are currently not supported ‘out-of-the-box’ by tools such as the OWL API; however, it is certainly possible to programmatically generate axioms containing complex expressions over the signature of an ontology O, check whether these are entailed by O, and if so, include them in the entailment set. In order to guarantee that the entailment set generated in this way is indeed finite, we need to impose restrictions on the types and nesting depths of expressions. For example, we may only include axioms which do not contain any nested expressions, such as A v ∃r.B, A v ∀r.B, A v B u C, A v B t C, A v ¬B. Finally, once these complex entailments have been specified, the design decisions we have discussed thus far can be applied in order to choose a suitable representation of the set.
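As an illustration of this strategy, the following sketch enumerates candidate axioms of the single pattern A v ∃r.B over an ontology's signature and retains the entailed ones. It assumes OWL API 3.x style interfaces and a loaded OWLReasoner that supports entailment checking; other un-nested patterns would be generated analogously.

```java
import java.util.*;
import org.semanticweb.owlapi.model.*;
import org.semanticweb.owlapi.reasoner.OWLReasoner;

// Sketch: generate entailed axioms of the form A SubClassOf (r some B) over the
// signature of an ontology, i.e. one of the un-nested patterns mentioned above.
// Assumes an OWLReasoner that supports isEntailed() for SubClassOf axioms.
public class ComplexEntailmentGenerator {

    public static Set<OWLSubClassOfAxiom> existentialSubsumptions(OWLOntology ontology,
                                                                  OWLReasoner reasoner,
                                                                  OWLDataFactory factory) {
        Set<OWLSubClassOfAxiom> entailed = new HashSet<>();
        Set<OWLClass> classes = ontology.getClassesInSignature();
        Set<OWLObjectProperty> properties = ontology.getObjectPropertiesInSignature();
        for (OWLClass a : classes) {
            for (OWLObjectProperty r : properties) {
                for (OWLClass b : classes) {
                    // Candidate axiom A SubClassOf (r some B), no nesting
                    OWLSubClassOfAxiom candidate = factory.getOWLSubClassOfAxiom(
                            a, factory.getOWLObjectSomeValuesFrom(r, b));
                    if (reasoner.isEntailed(candidate)) {
                        entailed.add(candidate);
                    }
                }
            }
        }
        return entailed;
    }
}
```

Note that the number of candidates is quadratic in the number of classes and linear in the number of properties, so such brute-force enumeration is only feasible for small signatures or needs to be restricted further.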

3.1.8 Dealing with ontology imports

Another issue that needs to be dealt with when extracting and counting entailments from an ontology is its import structure. An OWL ontology O that imports another OWL ontology O′ can have different kinds of entailments: those that hold in O \ O′ (native entailments), those that are entirely from the imported ontology, i.e. they hold in O′ \ O (imported entailments), and those that are neither native nor imported, and hold in O ∪ O′ (mixed entailments).

When performing analytical tasks on a corpus of ontologies, disregarding the issue of imported entailments may in fact lead to significant distortions: imagine a scenario where each ontology Oi in a test corpus imports another ontology O′, for instance an upper-level ontology. A finite entailment set of each Oi would also include all imported entailments of O′, which could skew any analysis of the corpus towards the entailments of O′. We encountered this situation when analysing a snapshot of the NCBO BioPortal3 corpus, in which a number of ontologies had exactly the same set of entailments, as they were all importing the Basic Formal Ontology4 (BFO).

3http://bioportal.bioontology.org/
4http://www.ifomis.org/bfo

In order to resolve this issue, we propose a classification of entailment types based on the origin of an entailment, which is determined by the set of its justifications. We use the following naming conventions:5

• Oroot denotes the ontology document we are analysing, e.g. the .owl file that has been loaded into an ontology editor.

• O is the import closure of Oroot, i.e. the ontology resulting from the transitive closure of the imports of Oroot.

• O′ denotes an ontology in the import closure of Oroot that is not the root ontology itself.

Intuitively, a native entailment originates entirely from the root ontology, that is, all its justifications contain only axioms from the root ontology Oroot. An imported entailment originates entirely from axioms that are not in the root ontology, whereas a mixed entailment covers the remaining scenarios:

Definition 3.1. A justification J for an entailment η is a

1. native justification in O if J ⊆ Oroot,

2. imported justification in O if J ∩ Oroot = ∅,

3. mixed justification otherwise.

An entailment η is called

1. native if it has only native justifications,

2. imported if it has only imported justifications,

3. mixed otherwise.

We can see straight away that there are several constellations of mixed entailments having native, imported, and mixed justifications. While such a fine-grained distinction may not be relevant for analytical purposes (it seems more reasonable to simply include all entailments of a specified type from an ontology O and its imports closure), we list them here for completeness:

1. Purely mixed: An entailment which has only mixed justifications.

2. Partially mixed: An entailment which has
   a) native and imported justifications.
   b) mixed and native justifications.
   c) mixed and imported justifications.
   d) mixed, native, and imported justifications.
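The classification in Definition 3.1 only needs the justifications of an entailment and the axioms of the root ontology; the following sketch (again over plain axiom strings rather than OWLAxiom objects) spells this out.

```java
import java.util.*;

// Sketch: classify an entailment as native, imported, or mixed from its set of
// justifications and the axioms of the root ontology (Definition 3.1).
public class ImportClassifier {

    public static String classifyJustification(Set<String> justification, Set<String> rootAxioms) {
        if (rootAxioms.containsAll(justification)) {
            return "native";
        }
        if (Collections.disjoint(justification, rootAxioms)) {
            return "imported";
        }
        return "mixed";
    }

    public static String classifyEntailment(Collection<Set<String>> justifications,
                                            Set<String> rootAxioms) {
        Set<String> kinds = new HashSet<>();
        for (Set<String> j : justifications) {
            kinds.add(classifyJustification(j, rootAxioms));
        }
        if (kinds.equals(Collections.singleton("native"))) return "native";
        if (kinds.equals(Collections.singleton("imported"))) return "imported";
        return "mixed";
    }

    public static void main(String[] args) {
        Set<String> root = new HashSet<>(Arrays.asList("ax1", "ax2"));
        Set<String> j1 = new HashSet<>(Collections.singletonList("ax1")); // native
        Set<String> j2 = new HashSet<>(Collections.singletonList("ax3")); // imported
        System.out.println(classifyEntailment(Arrays.asList(j1, j2), root)); // mixed
    }
}
```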

5As used in http://www.w3.org/TR/owl2-syntax/#Imports

3.2 A notation for finite entailment sets

Based on the design decisions we have discussed in the previous section, we will now introduce a shorthand notation for the various aspects of finite entailment sets. The notation will allow us to conveniently refer to specific entailment sets, for example, in OWL tools, the OWL API, and in experiment descriptions. Further, we will also introduce notations for wanted and unwanted entailment sets to denote partitions of a given finite entailment set. The section is concluded by several examples of entailment sets to demonstrate how the notation we have introduced can be applied in practice.

3.2.1 Introducing the notation

Table 3.1 lists the keys and polarities (+ or −) we assign to the individual design decisions, and the defaults. Note the following remarks:

1. The keys for the properties in the top section of the table are constructed to always assume the negative polarity key as the default where applicable, e.g. T−. The negative polarity is given simply for completeness reasons and to specify the default; if clear from the context, we choose to simply drop the key to assume the default option.

2. Remark 1 implies that when including a certain type of entailment, the + sign indicating positive polarity of the key is not necessarily required; e.g. we could simply write T to indicate the presence of A v > type axioms. However, we choose to always include the polarities in order to make the decision explicit and avoid ambiguity.

3. The exception with regard to polarity is the inclusion of native entailments by default, indicated by the positive key In+.

4. Keys can be grouped, e.g. the set of atomic subsumptions including asserted axioms and indirect subsumptions can be referred to as ‘the (AD)+ set of axioms A v B for all named, satisfiable classes A, B in O’.

5. The decision whether to include unsatisfiable and tautological classes as either subsumption or equivalence axioms is omitted from the table, as it is part of the description of the axiom and expression types in an entailment set specification.

6. The keys Er and Nr require the specification of a function pairwise() and rep(), respectively.

Table 3.1: Entailment set properties and keys

Design decision                              Keys          Default
Include / exclude axioms of type A v >       T+ / T−       T−
Include / exclude asserted axioms            A+ / A−       A−
Include / exclude indirect subsumptions      D+ / D−       D−
Include / exclude non-strict subsumptions    S+ / S−       S−

Representation of equivalent classes
  N-ary equivalent classes axioms            En            X
  Exhaustive binary equivalence axioms       Ee            –
  Select pairwise representatives            Er            –

Subsumptions between equivalent classes
  Exhaustive subsumption axioms              Ne            X
  Select representative classes              Nr            –
  Introduce new name                         Nn            –

Import types
  Include / exclude native entailments       In+ / In−     In+
  Include / exclude imported entailments     Ii+ / Ii−     Ii−
  Include / exclude mixed entailments        Im+ / Im−     Im−
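To make the role of the keys concrete in code, one might collect them in a simple specification object; the field names below are illustrative only and do not correspond to any existing API.

```java
import java.util.*;

// Sketch: an explicit specification object mirroring the keys of Table 3.1,
// so that an entailment set description such as (AD)+ can be stated in code.
public class EntailmentSetSpec {
    boolean includeTopSubsumptions = false;        // T+ / T-
    boolean includeAsserted = false;               // A+ / A-
    boolean includeIndirect = false;               // D+ / D-
    boolean includeNonStrict = false;              // S+ / S-
    String equivalenceRepresentation = "n-ary";    // En, Ee, or Er
    String equivalentSubsumptions = "exhaustive";  // Ne, Nr, or Nn
    boolean includeNative = true;                  // In+ / In-
    boolean includeImported = false;               // Ii+ / Ii-
    boolean includeMixed = false;                  // Im+ / Im-

    // The (AD)+ set of atomic subsumptions: asserted axioms and indirect
    // subsumptions included, everything else left at its default.
    public static EntailmentSetSpec adPlus() {
        EntailmentSetSpec spec = new EntailmentSetSpec();
        spec.includeAsserted = true;
        spec.includeIndirect = true;
        return spec;
    }

    public static void main(String[] args) {
        EntailmentSetSpec spec = adPlus();
        System.out.println("A+ = " + spec.includeAsserted + ", D+ = " + spec.includeIndirect);
    }
}
```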

3.2.2 Axioms and expressions

Since the range of possible OWL axiom types and expressions to include in the entailment set is very broad, these will not be represented by a key, but described by listing the axiom pattern or expression pattern instead. For example, the set of entailed atomic subsumptions can be described as ‘axioms of the type A v B for all named, satisfiable classes A, B in O’. Likewise, we refer to the set of entailed unsatisfiabilities as ‘axioms of the type A v ⊥ for all named classes A in O’. This can be easily extended to complex expressions, such as ‘axioms of the type A v ∃r.B for all named, satisfiable classes A, B and named properties r in O’.

3.2.3 Wanted and unwanted entailments

In the context of debugging an ontology we usually speak of modifying or removing axioms in order to remove unwanted entailments from the ontology. One of the main concerns here is to find a modification, i.e. a repair, such that all unwanted entailments are removed, but at the same time as few wanted entailments as possible are lost from the ontology. In the most extreme case, a user could simply remove all axioms from an ontology, which would obviously have the desired effect of removing the unwanted entailments, but at the same time this would also remove any relevant information from the ontology. Borrowing some notation from existing work on revising knowledge bases [SFRF12, NRG12], we introduce names for the sets of wanted and unwanted entailments of an ontology. Given a finite entailment set εO = εO+ ∪ εO−:

• εO+ is the set of wanted entailments

• εO− is the set of unwanted entailments.

The decision as to which entailments are unwanted lies with the user; a common choice for εO− is the set of entailed subsumption axioms of the type A v ⊥ for named classes A of an ontology. Given these two sets, we can make a clear distinction between modifications which lead to a positive effect (the removal of entailments in εO−) and those with a negative effect (the loss of entailments in εO+). Finding a balance between such positive and negative effects is key in the ontology debugging process; a detailed discussion of this issue will follow in Section 4.3.1.
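Under this notation, the quality of a candidate repair can be summarised by two counts: how many entailments in εO− it removes and how many in εO+ it loses. A minimal sketch, assuming the finite entailment set of the repaired ontology has already been recomputed (e.g. by a reasoner), follows.

```java
import java.util.*;

// Sketch: measure the effect of a repair by comparing the finite entailment set
// after the modification against the wanted / unwanted partition.
public class RepairEffect {

    public static String score(Set<String> entailmentsAfter,
                               Set<String> wanted,
                               Set<String> unwanted) {
        Set<String> removedUnwanted = new HashSet<>(unwanted);
        removedUnwanted.removeAll(entailmentsAfter);   // positive effect
        Set<String> lostWanted = new HashSet<>(wanted);
        lostWanted.removeAll(entailmentsAfter);        // negative effect
        return removedUnwanted.size() + " unwanted entailments removed, "
                + lostWanted.size() + " wanted entailments lost";
    }

    public static void main(String[] args) {
        Set<String> wanted = new HashSet<>(Arrays.asList("A SubClassOf B", "B SubClassOf C"));
        Set<String> unwanted = new HashSet<>(Collections.singletonList("D SubClassOf owl:Nothing"));
        // Entailment set of the repaired ontology:
        Set<String> after = new HashSet<>(Collections.singletonList("A SubClassOf B"));
        System.out.println(score(after, wanted, unwanted));
        // 1 unwanted entailments removed, 1 wanted entailments lost
    }
}
```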

3.2.4 Sample entailment sets

The following examples demonstrate the different criteria and their effects on the size and types of axioms of finite entailment sets for a small ontology.

Example 3.2 (Toy ontology O).

NorthAmericanCougar v Cougar        Mammal v Animal
Cougar ≡ MountainLion               Puma ≡ Cougar
Puma v Cat                          Cat v Mammal

The asserted and inferred class graphs of this ontology are shown in Figures 3.2a and 3.2b, respectively.

(a) Asserted class graph. (b) Inferred class graph.

Figure 3.2: Asserted and inferred class graphs of Example 3.2.

Transitive closure. The (AD)+(EN)e entailment set containing axioms of the type A v B and A ≡ B for satisfiable classes A, B in sig(O) makes explicit the subsumption and equivalence relationships between every single class in the ontology. The set shown in Example 3.3 is the largest finite entailment set to be extracted from the class graph which does not contain any tautologies. The alternative variant D+(EN)e of this set, excluding asserted subsumption axioms, simply discards the six axioms that occur in the original ontology, yielding a set of 12 axioms.

Example 3.3 ((AD)+(EN)e entailment set of O, 18 axioms).

Cougar ≡ MountainLion               Cougar ≡ Puma
MountainLion ≡ Puma                 Puma v Cat
Puma v Mammal                       Puma v Animal
MountainLion v Cat                  MountainLion v Mammal
MountainLion v Animal               Cougar v Cat
Cougar v Mammal                     Cougar v Animal
NorthAmericanCougar v Puma          NorthAmericanCougar v MountainLion
NorthAmericanCougar v Cougar        NorthAmericanCougar v Cat
NorthAmericanCougar v Mammal        NorthAmericanCougar v Animal

Due to the fast growth in size, such a set of entailments may not be suitable for user-facing applications. For the purpose of computing justifications, however, this set seems most appropriate, as it guarantees that all subsumption relationships are captured.

Transitive reduct. The A+(EN)r entailment set based on the transitive reduct of the class graph uses representative elements from each node as well as a pairwise representation of equivalent classes to produce a minimal image of the class hierarchy. That is, given this minimal set of axioms, we can still correctly draw the asserted class graph of the ontology and not miss out any relevant subsumptions or equivalences.

Example 3.4 (A+(EN)r entailment set, six axioms).

NorthAmericanCougar v Cougar        Cougar ≡ MountainLion
MountainLion ≡ Puma                 Puma v Cat
Cat v Mammal                        Mammal v Animal

This entailment set contains all the information that would be necessary for a user to understand (or be able to infer) the relationships in the ontology. The class Puma was selected to represent the node labelled with the equivalent classes {Cougar, Puma, MountainLion}, as the resulting axiom Puma v Cat is part of the asserted ontology. That is, in this case, the function rep(n) is used to return a class name from the node labelled with the equivalent classes, giving preference to classes that would allow us to generate a subsumption axiom which is already asserted in O. Defining rep(n) to select either of the two other classes would have been appropriate, too, but would have resulted in a larger entailment set, namely the axioms shown in Example 3.4 plus either Cougar v Cat or MountainLion v Cat.
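The preference described here for rep(n), namely returning a class whose subsumption by the node's superclass is already asserted, can be written down directly; the sketch below is illustrative only, with axioms as plain strings.

```java
import java.util.*;

// Sketch: select a representative from a node of equivalent classes, preferring
// a class X for which "X SubClassOf superClass" is already asserted in the ontology.
public class RepresentativeSelector {

    public static String rep(List<String> node, String superClass, Set<String> assertedAxioms) {
        for (String cls : node) {
            if (assertedAxioms.contains(cls + " SubClassOf " + superClass)) {
                return cls;
            }
        }
        return node.get(0); // no asserted subsumption available: fall back to any member
    }

    public static void main(String[] args) {
        Set<String> asserted = new HashSet<>(Arrays.asList(
                "NorthAmericanCougar SubClassOf Cougar", "Puma SubClassOf Cat"));
        List<String> node = Arrays.asList("Cougar", "Puma", "MountainLion");
        System.out.println(rep(node, "Cat", asserted)); // Puma
    }
}
```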

3.3 Entailments in OWL applications

A number of OWL applications provide methods for generating ‘the inferred’ version of an ontology, or offer views of ‘selected entailments’. From working with these tools, we have found that the notion of entailments does not have a consistent interpretation across different applications. In this section, we briefly outline some examples of the use of entailments in OWL tools as well as analytical applications, and show how they would benefit from a flexible and transparent definition of entailment sets.

3.3.1 Inferred ontology generation in the OWL API

The OWL API provides the Java class InferredOntologyGenerator, which allows users to ‘fill’ a new ontology with the desired type of entailments, such as inferred atomic SubClassOf axioms. By default, this method only retrieves the direct named superclasses of a named class when using the OWL API’s InferredSubClassAxiomGenerator; none of the properties discussed in the above sections can be specified when generating entailments. By extending the InferredAxiomGenerator to accept custom configurations, we can simply take into account the additional design decisions described above. We have implemented such a custom entailment generator on top of the OWL API which uses standard configuration files to determine the exact set of axioms to be generated.6
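For reference, a bare-bones use of the standard generator looks roughly as follows. This is a sketch assuming OWL API 3.x (the signature of fillOntology changed in later versions) and an arbitrary OWLReasonerFactory; it does not show the configurable generator described above.

```java
import java.util.*;
import org.semanticweb.owlapi.model.*;
import org.semanticweb.owlapi.reasoner.*;
import org.semanticweb.owlapi.util.*;

// Sketch: fill a fresh ontology with inferred atomic subsumptions using the
// OWL API's InferredOntologyGenerator (OWL API 3.x style calls assumed).
public class InferredHierarchy {

    public static OWLOntology inferredSubsumptions(OWLOntology ontology,
                                                   OWLReasonerFactory reasonerFactory)
            throws OWLOntologyCreationException {
        OWLOntologyManager manager = ontology.getOWLOntologyManager();
        OWLReasoner reasoner = reasonerFactory.createReasoner(ontology);

        List<InferredAxiomGenerator<? extends OWLAxiom>> generators = new ArrayList<>();
        generators.add(new InferredSubClassAxiomGenerator());

        OWLOntology inferred = manager.createOntology();
        new InferredOntologyGenerator(reasoner, generators).fillOntology(manager, inferred);
        return inferred;
    }
}
```

Because the standard generator only asks the reasoner for direct named superclasses, the result corresponds to one specific representation of the class hierarchy; a configurable generator would vary this according to the keys of Table 3.1.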

3.3.2 Presenting entailments to end-users

The ontology editor Protégé 4 comes with a ‘selected entailments’ tab which shows a list of atomic SubClassOf, SubPropertyOf, and Type (class assertion) axioms. The tool also offers the option to ‘Save inferred axioms as ontology’, which saves asserted and inferred axioms as a new OWL ontology. Similarly, TopBraid Composer offers to ‘Save [the] inference graph’ as a new file. Neither of the editors offers any further explanation of how these entailments were extracted in the classification process. It may even seem surprising to the end-user that some tautological axioms, such as A v >, are displayed in Protégé 4’s ‘selected entailments’ panel (shown in Figure 3.3) for some classes, while others are missing. The current settings panel for the entailment tabs provides checkboxes for the desired axiom types, which are then generated by the corresponding InferredAxiomGenerator implementations in the OWL API. This could be simply extended by integrating settings options for the design decisions described above, thus offering a more transparent and flexible view of such ‘selected entailments’.

6http://owl.cs.manchester.ac.uk/entailments

Figure 3.3: Screenshot of the ‘Selected Entailments’ tab in Protégé 4

3.3.3 Ontology publishing

Ontologies that are available on the web may be published as ‘compiled’ versions, which include the ontology and its entailments of some description. The OWL version of the National Cancer Institute (NCI) thesaurus, for example, ‘includes inferred relationships’.7 There is, however, no definition of what is regarded as an inferred relationship, how these relationships are determined, and what the selection criteria are. This may leave users wondering what kinds of information they are dealing with, and what implications this has for their understanding of the ontology.

3.3.4 Metrics and analytical applications

Analytical applications that consider the number and types of entailments in order to infer ontology metrics depend on clearly defined finite entailment sets as the basis of their measurements, i.e. what exactly is measured. The entailment set used to generate metrics must be well defined and independent of a particular implementation or individual modifications of results provided by the OWL API. From anecdotal evidence we know that developers of analytical tools frequently modify the results of the OWL API’s InferredAxiomGenerator for use in their

7http://evs.nci.nih.gov/ftp1/NCI_Thesaurus/ReadMe.txt

application, for example by removing entailments of the type A v >. One such application is the use of entailment and justification metrics for the purpose of identifying ontology design styles [MSR11], which ensures transparent and consistent measurements across different applications by clearly specifying the type of entailment set to be extracted. Another example is the counting of entailments which are lost in the debugging process of the OWL SafeMode tool [Sch11] in order to evaluate the quality of a repair step, which requires a stable way of counting entailments that is not sensitive to the irregularities discussed above. Finally, some approaches to ontology diffs, that is, determining and quantifying the semantic and syntactic differences between two ontologies (e.g. two versions of an ontology), make heavy use of finite entailment sets. In these approaches the difference between two ontologies is measured based on the (countable) differences in their (finite) entailment sets. Gonçalves et al. [GPS11b, GPS12a] present several approximations to diffs which are based on increasingly complex types of entailment sets, such as entailed atomic subsumptions, subsumptions between complex class expressions already found in the ontologies, and a grammar diff which generates new complex expressions to check for entailment.

3.3.5 Imported and native entailments in BioPortal

In a survey of 42 OWL and OBO ontologies from the NCBO BioPortal [BHPS11], we found that imported entailments (in the (AInim)+ set of atomic subsumptions of type A v B for satisfiable classes A, B) did in fact have a visible effect on the number of entailments found in those ontologies. In this corpus, the upper-level ontology Basic Formal Ontology (BFO) [GSG04] contributed significantly to the skewing of entailment counts. Table 3.2 shows an overview of the ontologies in the survey.

Table 3.2: Ontologies and imported entailments in the NCBO BioPortal.

                  Importing BFO              Other imports
Ontologies    Total   Imported   Mixed     Total   Split     No imports
    42          7         3        4         7       5           28

Seven ontologies in the sample corpus imported BFO, of which three had no native entailments at all, but contained 70 imported entailments which all originated from BFO. A further four ontologies had 70 imported entailments from BFO, plus additional entailments which were either native or imported from ontologies other than BFO. Additionally, a further seven ontologies had imported entailments from ontologies other than BFO; for five of these, however, the imports could be attributed to the ontology being intentionally split up over several files with similar file names. Examples of these included the Chemical Information ontology [KDLD08], which had one native entailment, 72 imported entailments from an ontology titled ‘cheminf-external’, and three entailments from ‘cheminf-core’, or the Semanticscience Integrated Ontology (SIO),8 which had imported entailments from an ontology called sio-core.owl. Finally, 28 ontologies did not have any imported entailments at all, which could be either due to them having no imports, the imported ontology having no entailments that matched our criteria, or missing imports, which were ignored in the pre-processing stage when downloading the ontologies from BioPortal. The example of the BioPortal ontologies shows how an analysis or comparison of ontologies based on the number of entailments and justifications needs to pay attention to their import structure; in the case of the BFO imports, the three ontologies which had only entailments from BFO would have appeared to have exactly the same number of entailments and justifications, and thus might have been wrongfully classified as similar in terms of their expressivity and inferential power.

3.4 Summary and conclusions

In this chapter, we presented a discussion of the issues surrounding finite entailment sets for OWL ontologies. While the classification and computation of inferences is a standard reasoning task in the ontology engineering process, there exist ambiguities in ontology tools as to what constitutes the ‘set of entailments’ of an ontology. Sections 3.1.1 through 3.1.8 highlighted various design decisions to be made in order to clearly specify a non-ambiguous representation of a finite entailment set of an OWL ontology: how to deal with tautologies, whether to include asserted axioms, how to deal with transitive and non-strict subsumptions, and how to make the transition from nodes and edges in an abstract class graph to concrete sets of axioms which are presented to a user or used in an OWL application. Regarding the issue of counting the number of entailments of an ontology, we identified the following pitfalls which may lead to a non-monotonically growing entailment count, i.e. the apparent loss of entailments when adding axioms to an ontology:

8http://code.google.com/p/semanticscience/wiki/SIO

• Excluding asserted entailments.

• Excluding indirect subsumptions.

• Excluding non-strict subsumptions.

Furthermore, we have shown that including tautologies such as A v > may cause two ontologies to have identical entailment counts, despite one ontology having fewer relevant entailments in terms of their information content. In Section 3.1.8, we discussed the problem of imported entailments and how these can skew ontology metrics based on entailments, which was demonstrated using examples from a corpus of biomedical ontologies. The examples found in this chapter illustrate the wide scope of different entailment sets that can be specified depending on the application, ranging from minimal representations which are suitable for user-facing applications, to exhaustive, yet finite, sets which may be used for analytical purposes. The outcome of the work presented in this chapter is a deeper insight into what we mean by the ‘set of entailments’ of an ontology, alongside a set of properties which can be used to precisely define a finite entailment set that is fit for a specific purpose. The main benefit of this set of properties is that it is defined from an application perspective, which makes the effects of individual choices, such as whether to include indirect subsumptions, easier to understand for a user. Finally, we have made available a prototypical entailment generator on top of the OWL API, which allows the straightforward specification of finite entailment sets using a custom configuration file. The understanding and tools gained here lay the foundations for our discussion of the relations between justifications for entailment sets, which will be presented in the following two chapters.

Chapter 4

The justificatory structure of OWL ontologies

In current ontology explanation tools, justifications are generally represented as individual entities in relation to a single entailment; recall the list arrangement of justifications in the Protégé 4 editor, as shown in the screenshot in Figure 1.1. Previous surveys of OWL ontologies have shown that the majority of ontologies used in practice contain some entailment which has more than one justification, with many ontologies generating a few dozen and up to several hundred justifications for a single entailment [BPS10b, Hor11a, BHPS11]. Furthermore, we know that justifications for single or multiple entailments are not disjoint subsets of an ontology, but often overlap to a certain extent, sharing one or more axioms between them. This is relevant for both understanding and repairing the entailments of such justifications: as we have discussed in Section 2.3.1, shared axioms may1 lead to a smaller hitting set, and therefore to a smaller repair for a set of entailments. Beyond their suitability for small repairs, shared axioms may indicate common lemmas [HPS09], i.e. intermediate entailments, which can assist users in understanding (parts of) multiple justifications at the same time rather than treating each one independently, thus reducing the effort required to deal with multiple justifications. Beyond the known metrics of axiom frequency and impact which we introduced in Section 2.3.1, several other properties of justifications, such as the number of justifications for an entailment, the size of justifications in an ontology, to what extent the ontology contains entailments which are entailed only by themselves, and what proportion of the ontology participates in justifications, all give us an insight into structural relationships in the ontology beyond standard metrics; we can say that the justifications make the implicit relationships between

1It is important to point out that a shared axiom does not necessarily have to be an incorrect axiom; therefore, it is not guaranteed to be part of the hitting set that leads to a repair.

the entities and axioms of an ontology explicit. In this chapter, we will introduce the notion of justificatory structure of an OWL ontology. First, we present a categorisation of entailments based on the types of justifications they have in an ontology. We then lay out the motivation for analysing the relations between justifications in an ontology and define various structural aspects which may be of interest for ontology development and analysis. We introduce a graph-based representation of the justifications and entailments in an ontology and describe the generation and implementation of such a j-graph. Along with Chapter 5, the work presented in this chapter constitutes the core of our research into the justificatory structure of OWL ontologies.

4.1 Categories of justifications and entailments

Following on from our discussion of different representations of entailment sets in Chapter 3, we will now take a closer look at the different types and scenarios of justifications we encounter when analysing a finite entailment set of an ontology. While it is certainly the case that all justifications are essentially equal, in that they are minimal entailing subsets of an ontology, we may treat them differently depending on the axioms they contain, and depending on their relation to the entailment in question.

4.1.1 Self-justifications and self-supporting entailments

Any justification which is simply the entailed axiom itself is classified as a self-justification:

Definition 4.1 (Self-justification). A justification J for an entailment η is a self-justification if J = {η}.

We distinguish between entailments which have a self-justification in addition to other, more complex, justifications, and entailments which have only a self-justification and no other justifications. Referring back to our discussion of finite entailment sets in Chapter 3, these entailments are commonly referred to as ‘asserted, but not inferred’; however, in order to avoid ambiguity caused by the word inferred (as an asserted axiom can also be inferred from an ontology), we refer to them as self-supporting entailments:

Definition 4.2 (Self-supporting entailment). An entailment η is self-supporting if Justs(η) = {{η}}.

There are different reasons why an entailment might have a self-justification in addition to other justifications:

1. The entailment was not asserted in the ontology to begin with, but explicitly added after it was found to be inferred. This could be in order to improve reasoner performance, to make the entailment visible to the developer without using a reasoner (e.g. if the classification time is very high), or to make the inferences visible to end-users in an ontology browser interface which does not support reasoning.

2. The ontology modeller does not use a reasoner during the engineering process, thus is not aware that the subsumption is already entailed by the ontology, which means they might explicitly assert information which already exists implicitly.

3. The additional justifications are a side-effect of other axioms that were added to the ontology without the aim of causing the entailment, or with the aim of causing this and other entailments.

Without additional information, such as axiom annotations or change logs of an ontology, it is not possible to tell the intentions of an ontology developer when adding an axiom which causes additional justifications or self-justifications. Moreover, developers may not even be aware of the full impact that a modification has on the ontology. We therefore treat self-justifications and additional justifications on a purely logical level, disregarding the reasons for those multiple justifications.

4.1.2 Atomic subsumption chains

An atomic subsumption chain justification in an ontology O is a set of axioms of the type A1 v A2, A2 v A3, ..., An−1 v An for an entailment A1 v An where Ai is a named class in sig(O). It is clear to see that such an atomic subsumption chain justification of two or more axioms always represents an indirect subsumption, which we have discussed in Chapter 3. Take, for example, the following ontology:

Example 4.1.

O = {A v B,B v C,A v C}

The A+D− entailment set of atomic subsumptions is the set {A v B,B v C}, whereas the (AD)+ entailment set is simply O itself. While each entailment in (AD)+ has a self-justification, the entailment A v C has an additional atomic subsumption chain justification J = {A v B,B v C}. Despite J being slightly less trivial than a self-justification, it is still simply based on the transitivity of subsumption. This implies that the interactions between axioms in the ontology (with respect to the class hierarchy) are trivial enough to be inferred by a structural reasoner. Note that it is certainly possible for an entailment to have multiple atomic subsumption chain justifications. For example, adding the axioms {A v E,E v C} to the above example would add another atomic subsumption chain justification for the entailment A v C. In our survey of the justificatory structure of OWL ontologies, which we will present in Chapter 7, we find that many ontologies which only contain self-justifications for direct subsumptions often have only atomic subsumption chain justifications for indirect subsumptions; that is, the information content of the ontology does not reach beyond its asserted class graph.
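The following Java sketch illustrates how a justification, given here as a set of atomic subsumption pairs over plain strings, could be tested for being an atomic subsumption chain for a given entailment. It assumes the input is a minimal justification (as justifications are by definition), and all names are illustrative.

import java.util.*;

// Sketch: test whether a justification of atomic SubClassOf pairs is an atomic
// subsumption chain for the entailment sub v sup.
public class SubsumptionChainCheck {

    static boolean isAtomicSubsumptionChain(List<String[]> justification, String sub, String sup) {
        Map<String, String> next = new HashMap<>();
        for (String[] axiom : justification) {
            if (next.put(axiom[0], axiom[1]) != null) {
                return false; // two axioms with the same left-hand side cannot form a single chain
            }
        }
        String current = sub;
        for (int i = 0; i < justification.size(); i++) {
            current = next.get(current); // follow the chain, consuming one axiom per step
            if (current == null) {
                return false;
            }
        }
        return current.equals(sup);
    }

    public static void main(String[] args) {
        List<String[]> j = Arrays.asList(new String[]{"A", "B"}, new String[]{"B", "C"});
        System.out.println(isAtomicSubsumptionChain(j, "A", "C")); // true
    }
}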

4.1.3 Complex justifications

Finally, we consider all justifications that are neither self-justifications nor atomic subsumption chains to be complex justifications. It is obvious that the level of complexity can vary strongly between justifications, depending on their size and the constructors used in their axioms. However, for the purpose of categorising justifications, such a coarse distinction between self-justifications, atomic subsumption chains, and other justifications suffices. It is clear to see that entailments can have both trivial and complex justifications. Assume we add further axioms to the above example ontology O to obtain O′ = O ∪ {A v ∃r.D, ∃r.D v C}. The entailment A v C will then have another, complex, justification J = {A v ∃r.D, ∃r.D v C} in addition to its self-justification and atomic subsumption chain justification.

(Figure content: a three-level decision tree asking whether Justs(η) contains a self-justification, an atomic subsumption chain, and a complex justification; its leaves are labelled x and T1 to T7.)

Figure 4.1: A decision tree for categorising entailments.

4.1.4 Categorising entailments and ontologies

If we consider the categorisation of justifications from the point of view of their corresponding entailments, we can also arrange entailments and ontologies into a hierarchy based on their justifications. This will be of use when analysing the justificatory structure of an ontology, as it allows us to decide which entailments and justifications to include in specific metrics. For example, when counting the numbers of justifications per entailment, we may want to treat trivial and non-trivial justifications separately in order to ensure that we are comparing equally relevant justifications. Figure 4.1 shows a decision tree for categorising an entailment based on its justifications. Given an entailment, the decision tree accepts a simple yes/no answer for the presence or absence of each type of justification, leading the entailment to be sorted into one of seven categories. We have labelled these categories with types T1 through T7 based on the number obtained if we regard the respective path in the decision tree as a (reversed) binary number. This results in odd numbers for entailments which have self-justifications, and even numbers for those entailments that have only non-self-justifications. This categorisation process can be applied to ontologies in order to rank them based on the justification types they contain (given some finite entailment set). This is of use when analysing the justificatory structure of an ontology, as it allows us to separate relevant ontologies (those that do contain entailments with complex justifications) from irrelevant ones (those that only contain trivial entailments).
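Reading the three answers as bits of such a reversed binary number, the categorisation can be sketched in Java as follows; the method is illustrative and not part of our implementation.

// Sketch of the T1-T7 categorisation: the presence of each justification type is read
// as one bit of the (reversed) binary number described above. Category 0 corresponds
// to the 'x' leaf, i.e. an entailment with no justification of any of the three types.
public class EntailmentCategory {

    static int category(boolean hasSelfJustification, boolean hasAtomicSubsumptionChain,
                        boolean hasComplexJustification) {
        int t = 0;
        if (hasSelfJustification)      t += 1; // odd categories have a self-justification
        if (hasAtomicSubsumptionChain) t += 2;
        if (hasComplexJustification)   t += 4;
        return t; // 1..7 correspond to T1..T7
    }

    public static void main(String[] args) {
        System.out.println("T" + category(true, true, false));  // T3: self-justification and chain
        System.out.println("T" + category(false, false, true)); // T4: complex justification only
    }
}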

4.2 Representing justifications as j-graphs

In order to talk about the justificatory structure of an ontology, we require some way of representing structural aspects in an accessible way. The set of entailments, their justifications, and the axioms occurring in these justifications can be easily represented as a graph, which allows us to describe aspects of justificatory structure using standard graph metrics such as the in- and out-degrees of nodes. In this section, we first define the terminology for different aspects of entailment and justification sets, which then allows us to define the justification graph for a given set of entailments.

4.2.1 J-graph definition

Before defining j-graphs, we need to introduce the building blocks of these graph representations of justifications. Recall Definition 2.4 in which we defined Justs(εO) as the set of all justifications for the entailments in a particular entailment set εO of an ontology O. Furthermore, we define the set of all axioms occurring in all justifications for a particular entailment set:

Definition 4.3 (Justification axioms).

JustAx(εO) = {α | there is a J ∈ Justs(εO) s.t. α ∈ J }

Based on our definitions of justification sets and justification axioms, we can now define the justification graph of an ontology O with respect to an entailment set εO: a justification graph (j-graph) G(εO) is a directed graph whose set of nodes is the union of the set of axioms εO which are entailed by O and the set JustAx(εO) of axioms that participate in justifications for these entailments, together with the set of all justifications Justs(εO). The graph does not contain axioms in O which are neither in the entailment set, nor in JustAx(εO), as these would not have any incoming or outgoing edges, thus not adding any information content. Furthermore, recall that in the context of this thesis we focus on justifications for finite sets of entailments. Therefore, the justification graph of an ontology O is defined with respect to a finite entailment set of O:

Definition 4.4 (Justification graph). Given a finite set of entailments εO, their justifications Justs(εO), and justification axioms JustAx(εO), a justification graph is a graph G(εO) = (εO ∪ Justs(εO) ∪ JustAx(εO), E1 ∪ E2) where


Figure 4.2: An example of a j-graph for justifications and entailments.

• E1 = {(u, v) ∈ (εO ∪ JustAx(εO)) × Justs(εO) | u ∈ v} and

• E2 = {(v, w) ∈ Justs(εO) × εO | v ∈ Justs(O, w)}.

We can say that the edges (u, v) in E1 represent the relation ‘axiom u occurs in justification v’, and the edges (v, w) in E2 represent the relation ‘justification v has entailment w’, where u and w may be the same node. Figure 4.2 shows a j-graph of the small sample entailment and justification set in Example 4.2. As we can see in the example graph on nodes a1, a10, a11 and a12, an axiom node in the graph with an in-degree ≥ 1 corresponds to an entailment in εO. An axiom node with an out-degree ≥ 1 corresponds to an axiom in JustAx(εO); this node type is represented by nodes a1 through a9 in the example graph.

Example 4.2.

j1 = {a1} |= a1
j2 = {a2, a3, a4} |= a10
j3 = {a4, a5} |= a11
j4 = {a6, a7, a8} |= a12
j5 = {a7, a8, a9} |= a12
j6 = {a11} |= a11

A self-justification corresponds to a cycle between an axiom node and a justification node (which, in turn, only has this one incoming edge), as justifications j1 and j6 show. A self-supporting entailment (a1 in the example graph) is represented by an axiom node which has no other incoming edges in addition to the cycle. Note that, while some j-graphs may be tripartite (see, for example, the subgraph containing the axiom nodes a6 through a9), an axiom node can be both in the set JustAx(εO) and in εO, i.e. have an outgoing edge to a justification node and an incoming edge from a justification node, as we can see from both axioms a1 and a11 in the example graph. Thus, a j-graph may be but is not guaranteed to be tripartite. There exists a unique j-graph for every set of entailments, as the set of justifications for an entailment set is unique in an ontology, and the edges in the j-graph follow from these unambiguous relations. Recall that the number of justifications for an entailment can potentially be exponential in the number of axioms in O. This affects the completeness of a j-graph for a given finite entailment set, as it may not be practical to compute all justifications for all entailments in the set. Despite the general feasibility of computing justifications, as shown in [Hor11a], computing all justifications for a large finite entailment set can still require more time than an ontology user may consider practical. This means that we either need to sample a subset of entailments, or only generate a certain number of justifications for each entailment in the given entailment set, or both. While this omission of information is obviously problematic for users attempting to find a repair for all justifications for a given entailment set, the only solution to such computational problems is incremental repair. Therefore, we only consider the j-graph for an entailment set to represent the particular set of entailments and justifications that could be processed, even if these are not all that exist in the ontology.

4.2.2 J-graph generation

Construction Generating a j-graph to represent the justificatory structure of an OWL ontology is a fairly straightforward process. First, all axioms in the union of the ontology and the selected entailment set εO are labelled with a unique identifier ai. We then compute the justifications for the entailments in εO; the justifications are also labelled with unique identifiers jk. The axiom labelling can also be performed on an existing set of justifications, which has no effect on the general structure of the process. The j-graph is then constructed as follows: generate a node for each axiom in JustAx(εO) and εO (each justification in Justs(εO)) and label it with the respective identifier. For each justification node labelled jk, generate an edge (ai, jk) for each axiom ai in the justification labelled jk; this will create the edges representing the relation ‘axiom is used in justification’. For each entailment node ai with a justification jk, generate an edge (jk, ai) which represents the relation ‘justification for entailment’. This construction requires only a single pass over all given justifications, axioms, and entailments.

Implementation The above algorithm has been implemented using the OWL API version 3.2.4, and the JGraphT2 version 0.8.3 graph library for representing the j-graph. Construction of the graph on a set of existing justifications and entailments is performed quickly, with the average time to construct a graph being less than ten seconds in a set of ontologies ranging from two to approximately 11,000 justifications.
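For illustration, the construction can be sketched in Java without a graph library, using plain adjacency maps in place of JGraphT; the labels and method names are ours and simplified with respect to the actual implementation.

import java.util.*;

// Minimal j-graph sketch: E1 holds the edges 'axiom occurs in justification',
// E2 the edges 'justification entails entailment'. Nodes are identified by the
// labels ai and jk.
public class JGraph {

    final Map<String, Set<String>> e1 = new HashMap<>(); // axiom label -> justification labels
    final Map<String, Set<String>> e2 = new HashMap<>(); // justification label -> entailment labels

    /** axiomsOf: jk -> axiom labels in the justification; entailmentsOf: jk -> entailment labels. */
    static JGraph build(Map<String, Set<String>> axiomsOf, Map<String, Set<String>> entailmentsOf) {
        JGraph g = new JGraph();
        for (Map.Entry<String, Set<String>> entry : axiomsOf.entrySet()) {
            String jk = entry.getKey();
            for (String ai : entry.getValue()) {
                g.e1.computeIfAbsent(ai, k -> new HashSet<>()).add(jk); // edge (ai, jk)
            }
            g.e2.computeIfAbsent(jk, k -> new HashSet<>()).addAll(entailmentsOf.get(jk)); // edges (jk, w)
        }
        return g;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> axioms = new HashMap<>();
        axioms.put("j2", new HashSet<>(Arrays.asList("a2", "a3", "a4")));
        axioms.put("j3", new HashSet<>(Arrays.asList("a4", "a5")));
        Map<String, Set<String>> entailments = new HashMap<>();
        entailments.put("j2", Collections.singleton("a10"));
        entailments.put("j3", Collections.singleton("a11"));
        JGraph g = build(axioms, entailments);
        System.out.println(g.e1.get("a4")); // a4 occurs in j2 and j3, i.e. it has frequency two
    }
}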

4.3 Justificatory structure

In summary, the justificatory structure of an ontology O is the set of relations between and metrics of justifications for a specific finite entailment set of O. Using the j-graph to describe relations between entailments, their justifications, and the axioms in the justifications, we can now discuss the various elements of justificatory structure. In the first instance, the justificatory structure provides additional ontology metrics; beyond these metrics, insights into the structure can support user understanding when coping with multiple justifications and/or multiple entailments. Further, in Chapter 6 we will introduce debugging techniques that are directly based on the interaction with the j-graph for a user-defined entailment set. Note that in this section we refer to ‘the entailments’ and ‘the justifications’ of an ontology, which requires a more precise specification. In what follows, we assume these entailments to be the (AD)+ set of entailments which includes

2http://jgrapht.org/


Figure 4.3: J-graph illustrating axiom frequency, impact, and semantic relevance. Axiom a0 has a frequency of four, impact of five, and semantic relevance of three.

all asserted and inferred native, direct and indirect (non-tautological) atomic subsumptions of the type A v B for satisfiable classes A, B which are entailed by O, as this provides a clearly defined and natural starting point for exploring the properties of an ontology. For ease of reading, we will use the terms justification, axiom, and entailment interchangeably with justification node, axiom node, and entailment node in the remainder of this chapter.

4.3.1 Axiom properties

The axioms occurring in justifications for a given set of entailments can have different properties, which we denote as axiom frequency, impact, and semantic relevance; the former two were already introduced in Section 2.3.1. These three terms describe similar, but slightly distinct, measures of the effect an axiom in JustAx(εO) has on a set of justifications and entailments. Figure 4.3 shows an example j-graph which exhibits different values of frequency, impact, and semantic relevance for the axiom node a0. Blank nodes labelled with ‘. . . ’ indicate the existence of some additional axiom(s) occurring in a justification, or some additional justification(s) for an entailment.

Frequency

The frequency, also referred to as power or arity, of an axiom is the number of justifications (for a given entailment set εO) the axiom occurs in. Removing an axiom with frequency n from the ontology will break n justifications for the entailments in εO. Given that there can exist multiple justifications for an entailment, the frequency of the axioms in a set of justifications does not necessarily tell us how a removal would affect the actual entailments of those justifications. In a j-graph, the frequency of an axiom node corresponds to its out-degree; an axiom with a frequency greater than one corresponds directly to a justification overlap of size one between the justifications the axiom occurs in:

Definition 4.5 (Frequency). freq(u) = |{v | (u, v) ∈ E1}| for u ∈ JustAx(εO).

The axiom node a0 in Figure 4.3 has edges to four justifications (j1, j2, j3, j4); thus, a0 has a frequency of four.

Impact

The impact of an axiom is, in some sense, an extension of the axiom frequency.

It refers to the number of entailments in εO that are directly affected by the axiom, as the axiom occurs in some justification for the entailments. Removing an axiom with an impact of m from the ontology may break m entailments in εO, assuming there are no other justifications for those m entailments. Axiom frequency and impact coincide if each justification that the axiom occurs in has exactly one (distinct) entailment, and the impact is less than the axiom frequency if the axiom occurs in multiple justifications for the same entailment. In a j-graph, the impact of an axiom node u is defined as the number of entailment nodes w that have an incoming edge from a justification node v such that (u, v) ∈ E1 and (v, w) ∈ E2:

Definition 4.6 (Impact). impact(u) = |{w | ∃v s.t. (u, v) ∈ E1, (v, w) ∈ E2}| for u ∈ JustAx(εO).

The axiom node a0 in the j-graph in Figure 4.3 has an impact of five, as it occurs in the justifications (j1, j2, j3, j4) for all five distinct entailments in the graph. In the context of repairing a given set of entailments, we will need to consider the impact of axioms both with regard to the set of unwanted entailments as well as the set of wanted entailments: removing a high-impact axiom from an ontology may break a large number, or all, unwanted entailments (e.g. unsatisfiable classes), but it may also lead to the removal of relevant information. Referring to

our definitions of wanted and unwanted entailment sets in Section 3.2.3, we can distinguish between the positive and negative impact of an axiom:

Definition 4.7 (Positive and negative impact).

impact⁻(α) = {ηi ∈ ε⁺O | α ∈ JustAx({ηi})}
impact⁺(α) = {ηi ∈ ε⁻O | α ∈ JustAx({ηi})}

In words, the negative impact of an axiom α is the number of wanted entailments that are entailed by the justifications α occurs in, and the positive impact is the number of unwanted entailments that are entailed by the justifications α occurs in. Note that the plus and minus signs for positive/negative impact and wanted/unwanted entailment sets are swapped, i.e. impact⁻ refers to ε⁺O, and vice versa.

Semantic relevance

Finally, the semantic relevance of an axiom, as defined by Kalyanpur [Kal06], is the number of entailments that are dependent on the axiom. Semantic relevance is closely related to the impact of an axiom, taking into account only those entailments that have no additional justifications (or only justifications that contain the axiom). In the context of debugging a set of entailments, the semantic relevance of an axiom is the most meaningful of the three axiom measures, as it gives a clear indication of the effect axiom removal will have on a set of entailments. Based on the definition given by Kalyanpur [Kal06], the semantic relevance of an axiom node u labelled with an axiom α in a j-graph is calculated as follows:

Definition 4.8 (Semantic relevance).

SR(u) = |{w | ∃v s.t. (u, v) ∈ E1, (v, w) ∈ E2 ∧ ¬∃v′ s.t. (v′, w) ∈ E2 ∧ (u, v′) ∉ E1}|

for u ∈ JustAx(εO).

Axiom a0 in the j-graph in Figure 4.3 has a semantic relevance of three, as it occurs in justifications (j1, j2, j3) for exactly three entailments (a1, a2, a3) which do not have any additional justifications that do not contain a0.
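The three measures can be read directly off the E1 and E2 adjacency maps of a j-graph. The following Java sketch uses the same simplified string-labelled representation as the construction sketch in Section 4.2.2 and is illustrative rather than taken from our implementation.

import java.util.*;

// Frequency, impact, and semantic relevance over a j-graph given as adjacency maps:
// e1 maps an axiom to the justifications it occurs in, e2 maps a justification to
// the entailments it entails.
public class AxiomMeasures {

    static int frequency(String axiom, Map<String, Set<String>> e1) {
        return e1.getOrDefault(axiom, Collections.<String>emptySet()).size(); // out-degree of the axiom node
    }

    static int impact(String axiom, Map<String, Set<String>> e1, Map<String, Set<String>> e2) {
        Set<String> entailments = new HashSet<>();
        for (String j : e1.getOrDefault(axiom, Collections.<String>emptySet())) {
            entailments.addAll(e2.getOrDefault(j, Collections.<String>emptySet()));
        }
        return entailments.size(); // distinct entailments with a justification containing the axiom
    }

    static int semanticRelevance(String axiom, Map<String, Set<String>> e1, Map<String, Set<String>> e2) {
        Set<String> justsWithAxiom = e1.getOrDefault(axiom, Collections.<String>emptySet());
        Set<String> candidates = new HashSet<>();
        for (String j : justsWithAxiom) {
            candidates.addAll(e2.getOrDefault(j, Collections.<String>emptySet()));
        }
        int relevant = 0;
        for (String entailment : candidates) {
            boolean hasJustificationWithoutAxiom = false;
            for (Map.Entry<String, Set<String>> entry : e2.entrySet()) {
                if (entry.getValue().contains(entailment) && !justsWithAxiom.contains(entry.getKey())) {
                    hasJustificationWithoutAxiom = true;
                    break;
                }
            }
            if (!hasJustificationWithoutAxiom) {
                relevant++; // every justification for this entailment contains the axiom
            }
        }
        return relevant;
    }
}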

Activity

We can define the activity of an ontology O with respect to a finite entailment set εO of O. The activity corresponds to the total number of axioms that occur in justifications which are not self-justifications, that is, the size of the subset of the ontology which actively participates in entailment:

Definition 4.9 (Activity).

activity(O, εO) = (Σi |Ji|) / |O|, where Ji ∈ Justs(εO) and Ji is not a self-justification.

Likewise, an active axiom is an axiom which occurs in a justification that is not a self-justification:

Definition 4.10 (Active axiom). An axiom α ∈ O is active if α ∈ J where J ∈ Justs(εO) and J is not a self-justification.
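A corresponding sketch for activity and active axioms is given below; the set-of-strings representation is again illustrative, and whether repeated occurrences of an axiom across justifications are counted once (active axioms) or per justification (activity) follows our reading of Definitions 4.9 and 4.10.

import java.util.*;

// Sketch of Definitions 4.9 and 4.10: justifications as sets of axiom labels, paired
// index-wise with the entailment they justify.
public class ActivityMetric {

    static boolean isSelfJustification(Set<String> justification, String entailment) {
        return justification.size() == 1 && justification.contains(entailment);
    }

    /** Sums the sizes of all non-self-justifications and relates them to the ontology size. */
    static double activity(List<Set<String>> justifications, List<String> entailments, int ontologySize) {
        int total = 0;
        for (int i = 0; i < justifications.size(); i++) {
            if (!isSelfJustification(justifications.get(i), entailments.get(i))) {
                total += justifications.get(i).size();
            }
        }
        return (double) total / ontologySize;
    }

    /** An axiom is active if it occurs in at least one non-self-justification. */
    static Set<String> activeAxioms(List<Set<String>> justifications, List<String> entailments) {
        Set<String> active = new HashSet<>();
        for (int i = 0; i < justifications.size(); i++) {
            if (!isSelfJustification(justifications.get(i), entailments.get(i))) {
                active.addAll(justifications.get(i));
            }
        }
        return active;
    }
}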

4.3.2 Properties of justifications

Justificatory redundancy

Following on from the discussion of self-justifications and additional justifications, we can regard the number of justifications per entailment as an indicator of justificatory redundancy in the ontology; it demonstrates ‘how often the same piece of information is expressed in different ways’. Justificatory redundancy is not to be confused with logical redundancy, as this would imply that it is possible to remove a set of axioms from the ontology without breaking any entailments. In the j-graph, an axiom node with an in-degree > 1 has redundant justifications; thus the in-degree (average, median, or maximum) of axiom nodes in a j-graph is a metric for determining the level of justificatory redundancy in an ontology.

Justification size

The size of a justification is the number of axioms it contains; this corresponds to the in-degree of a justification node in the j-graph. Justification size gives us information about the use of entities in an ontology in two ways: first, large justifications can be caused by signatures which spread across multiple axioms, while small justifications indicate that the entities in the ontology are less connected. On the other hand, justifications with small numbers of axioms can also be caused by long axioms, that is, axioms containing many subexpressions, whereas justifications with many axioms may be caused by shorter axioms.

4.3.3 Relations between justifications

Structural regularities

Two types of patterns can be identified in the context of the justificatory structure, namely structural isomorphism between justifications and graph surface patterns. Justification isomorphism [HBPS11b] describes a situation where the axioms within a set of justifications share an identical structure; for example, the two justifications J1 = {A v ∃r.B, ∃r.B v C} |= A v C and J2 = {X v ∃s.Y, ∃s.Y v Z} |= X v Z are isomorphic, as they have the same structure and only differ in the class and property names used. Justification isomorphism will be discussed in great detail in Chapter 5; for now we will focus on structural properties of the j-graph and treat justification axioms as ‘black boxes’. A surface pattern is a structural similarity between sets of nodes and edges in the j-graph, such as subgraphs which match based on their node types and the numbers of ingoing and outgoing edges. Surface patterns in the j-graph reveal modelling similarities in the ontology, regardless of whether the axioms and expressions occurring in the pattern also interact in a similar way. Highlighting a pattern of this type may support user understanding of the modelling in the ontology, while it may also be an indicator for isomorphic justifications. In the example j-graph in Figure 4.2, the justifications j4 and j5 both have a similar structure (same in- and out-degrees, two shared axioms), which may be an indicator for structurally isomorphic justifications.

Graph components

A graph component is a maximal set of nodes and edges which forms a connected subgraph; the number of components of a j-graph provides a measure for the disjointness of justifications in the ontology. The disjointness of justifications is thought to affect the justification computation process, which makes use of Reiter’s hitting set tree (HST) algorithm [Rei87] for diagnosis, as discussed in Section 2.3.2. In the HST, the nodes are labelled with justifications and the paths constitute hitting sets across the justifications in the tree, i.e. minimal repairs for the justifications. Optimisations for the HST algorithm are mainly based on closing a branch in the tree if the path to it is labelled with a superset of an existing path, which is not possible if the justifications are disjoint. This is thought to lead to a rapid growth of the HST and have significant negative effects on the performance of computing all justifications for an entailment.

Arbitrary justification overlap

We have already touched upon the subject of justification overlap in the discussion of axiom frequency, as an axiom with a frequency greater than one simply corresponds to a single-axiom overlap between the justifications that the axiom occurs in. The concept of analysing frequently occurring axiom groups in justifications for the purpose of finding a suitable repair was proposed by Schlobach in the early stages of justification based explanation research [Sch05a]. Due to a lack of an efficient implementation, however, the experimental analysis of such maximal arity cores was restricted to single axioms, and has since not received any significant attention. In the j-graph, arbitrary overlap between n justifications corresponds to an intersection of the incoming neighbours Nin of the justification nodes with a cardinality greater than one. An intersection of size one corresponds to a single axiom with a frequency of n; thus, we are mainly interested in those overlaps which contain more than just one axiom. Note that when we talk about overlaps between justifications there is no unique overlap that a justification axiom occurs in. Instead, an axiom can occur in several overlaps of varying sizes between different justification sets. We can say that an overlapping set has two dimensions: its size, which is the number of axioms in the overlap, and its frequency, which has the same meaning as the frequency of a single axiom, that is, the number of justifications the overlapping axiom set occurs in.
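The following Java sketch enumerates pairwise overlaps and their frequencies; it only considers overlaps arising from pairs of justifications (larger groups would require enumerating further intersections), and the set-of-strings representation is illustrative.

import java.util.*;

// Sketch: enumerate pairwise overlaps between justifications and report, for each
// overlapping axiom set of size > 1, its frequency, i.e. the number of justifications
// that contain the entire set.
public class JustificationOverlaps {

    static Map<Set<String>, Integer> pairwiseOverlaps(List<Set<String>> justifications) {
        Set<Set<String>> overlaps = new HashSet<>();
        for (int i = 0; i < justifications.size(); i++) {
            for (int k = i + 1; k < justifications.size(); k++) {
                Set<String> overlap = new HashSet<>(justifications.get(i));
                overlap.retainAll(justifications.get(k));
                if (overlap.size() > 1) { // single shared axioms are already captured by axiom frequency
                    overlaps.add(overlap);
                }
            }
        }
        Map<Set<String>, Integer> frequency = new HashMap<>();
        for (Set<String> overlap : overlaps) {
            int count = 0;
            for (Set<String> justification : justifications) {
                if (justification.containsAll(overlap)) {
                    count++;
                }
            }
            frequency.put(overlap, count); // size = overlap.size(), frequency = count
        }
        return frequency;
    }

    public static void main(String[] args) {
        List<Set<String>> justs = Arrays.asList(
                new HashSet<>(Arrays.asList("a6", "a7", "a8")),
                new HashSet<>(Arrays.asList("a7", "a8", "a9")));
        System.out.println(pairwiseOverlaps(justs)); // e.g. {[a7, a8]=2}
    }
}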

Root and derived justifications

A special case of justification overlap is that of root and derived justifications, an extension of Kalyanpur’s root and derived unsatisfiable classes [KPSH05], which we introduced in Section 2.3.4. A justification J′ is derived from a justification J if J (the root justification) is a strict subset of J′. Due to the minimality of justifications, root and derived relations can only occur between justifications for multiple entailments. In the j-graph, a root and derived relationship between two justification nodes is defined as a subset relation between the sets of incoming neighbours of the two nodes (where Nin(v) denotes the set of nodes that have an outgoing edge to the node v):

Definition 4.11 (Root and derived justifications). A justification v is derived from a justification v′ if Nin(v′) ⊂ Nin(v), where Nin(v) = {ui | (ui, v) ∈ E1}. A justification which is not derived from any other justification is a root justification.

Note that root and derived justifications were initially introduced as root and derived unsatisfiable classes. However, as the concept holds for arbitrary entailments, we can speak of root and derived entailments in the same way: a root unsatisfiable class is a class which has only root justifications, and likewise, an arbitrary entailment is a root entailment if it has only root justifications.

Example 4.3.

O = {A v ∃r.B, ∃r.B v C,B v D,C u ∃r.D v E}

In Example 4.3, the set comprising the first two axioms is a justification J1 for A v C, while the set of all four axioms is a justification J2 for A v E. Here, J1 is a root justification, and J2 is derived from J1. The two entailments are clearly related via the two axioms which constitute the root justification; we can say that J1 is the root of the entailments, while the remaining two axioms act as a bridge between them. Generally speaking, given a root justification J and a justification J′ which is derived from J, we call the set of axioms J′ \ J the bridging axioms between the derived and the root justification. When confronted with a situation such as the justifications in Example 4.3, addressing the derived justification first may lead a user to modify or remove the bridging axioms before the root axioms, thus only affecting the derived entailment. This would require the user to repair the root justification in another step, which would double the effort involved to repair both entailments. This example shows that when attempting to repair multiple entailments, addressing root justifications before derived justifications generally requires the inspection of fewer justifications, thus reducing user effort.
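A sketch of detecting root and derived justifications by pairwise strict-subset tests is shown below, using the same illustrative set-of-strings representation as the sketches above.

import java.util.*;

// Sketch of Definition 4.11: a justification is derived if some other justification
// (one of its roots) is a strict subset of it; justifications with no such root are
// root justifications.
public class RootDerived {

    /** Maps each justification to the set of its roots; an empty set marks a root justification. */
    static Map<Set<String>, Set<Set<String>>> rootsOf(List<Set<String>> justifications) {
        Map<Set<String>, Set<Set<String>>> roots = new HashMap<>();
        for (Set<String> candidateDerived : justifications) {
            Set<Set<String>> rootSet = new HashSet<>();
            for (Set<String> candidateRoot : justifications) {
                boolean strictSubset = candidateDerived.containsAll(candidateRoot)
                        && candidateRoot.size() < candidateDerived.size();
                if (strictSubset) {
                    rootSet.add(candidateRoot); // the bridging axioms are candidateDerived minus candidateRoot
                }
            }
            roots.put(candidateDerived, rootSet);
        }
        return roots;
    }

    public static void main(String[] args) {
        Set<String> j1 = new HashSet<>(Arrays.asList("ax1", "ax2"));               // the root justification of Example 4.3
        Set<String> j2 = new HashSet<>(Arrays.asList("ax1", "ax2", "ax3", "ax4")); // the derived justification
        System.out.println(rootsOf(Arrays.asList(j1, j2)).get(j2).contains(j1));   // true: J2 is derived from J1
    }
}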

Equality and inferential power

Equality is another special case of justification overlap, where the same subset J of axioms in an ontology O is a minimal entailing set for several different entailments. We refer to the number of entailments for which J is a justification as the inferential power of J, answering the question ‘how much can be expressed with how little?’. It is clear to see that for non-laconic justifications the inferential power depends largely on the size and complexity of the axioms in the justifications; we can easily construct an axiom with high inferential power by creating a large conjunction of subexpressions, where only some of the subexpressions are relevant for a given entailment. On the other hand, this is not the case for laconic justifications, as these will not contain any superfluous information with respect to an entailment. For justification equality, the j-graph representation has clear benefits over a representation of multiple justifications as lists of axiom sets. Several justifications for multiple entailments which are equivalent are simply represented by a single justification node; the inferential power of a justification corresponds to its out-degree in the j-graph:

Definition 4.12 (Inferential power). ip(v) = |{w | (v, w) ∈ E2}| for v ∈ Justs(εO).

4.4 Summary and conclusions

This chapter introduced the notion of the justificatory structure of OWL ontologies. In Section 4.1, we presented a strategy for categorising justifications and entailments based on whether they were considered ‘trivial’ or ‘complex’. We then defined a graph-based representation of the justifications and justification axioms for a set of entailments in Section 4.2, and gave a brief overview of the implementation of a generator for such graphs. This was followed by an introduction of various metrics related to the justificatory structure of an ontology for a given finite entailment set, such as the numbers and sizes of justifications, or the activity of an ontology, that is, the proportion of ontology axioms that occur in justifications for an entailment set, which give us insight into the implicit structure of an ontology, thus extending the standard metrics used to describe OWL ontologies. The second main focus of Section 4.3 was the categorisation of different types of justification overlap: justifications for both single and multiple entailments may share some axioms, i.e. the justifications overlap to a certain extent. Justification overlaps are key in finding good repairs for a set of justifications, as they may lead to a smaller repair (i.e. hitting set) over the justifications. Furthermore, overlapping justification subsets may also support users in understanding multiple justifications by generating lemmas, that is, intermediate entailments, which summarise a recurring set of axioms. In the case of multiple entailments, relations such as root and derived and justification equivalence even seem crucial to reducing user effort for justification understanding and repair, as they highlight the actual (number of) reasons for the entailments to hold. An in-depth discussion of how justificatory structure can be exploited in the debugging process will follow in Chapter 6; for now, we will continue our exploration of justificatory structure by focussing on one particular structural property, namely equivalence relations over justifications in the form of different isomorphisms.

Chapter 5

Justification isomorphism

In the previous chapter, we introduced the justificatory structure of ontologies and treated the axioms as anonymous ‘building blocks’ of justifications, without paying consideration to the subexpressions occurring in them. In this chapter, we will go one step further and take into account the logical form of the axioms occurring in justifications. Given the high expressivity of modern ontology languages, such as OWL, there is the possibility for great diversity in the logical content of ontologies. The fact that many naturally occurring entailments have multiple justifications indicates that ontologies often overdetermine their consequences. However, the multiplicity of justifications might be mostly due to diverse material, not formal, grounds for an entailment. That is, the logical form of these multiple reasons could be less diverse than their numbers suggest. Such logical similarity indicates that it is possible to group justifications into sets of structurally similar ones by the means of an equivalence relation. A well-known syntactical equivalence relation in OWL is structural equivalence. The OWL Structural Specification1 states the condition for two OWL objects (named classes, properties, or instances, complex expressions, or OWL axioms) to be structurally equivalent: in short, it defines the objects to be equivalent if they contain the same names and constructors, regardless of ordering and repetition (in an unordered association). The OWL API implements this notion of structural equivalence by default. A second equivalence relation is justification isomorphism2 which was first introduced in a study of the cognitive complexity of justifications [HBPS11b]: two justifications are isomorphic if they are structurally identical,3 i.e. the axioms

1http://www.w3.org/TR/owl2-syntax
2Note that, in the spirit of consistent naming, we will use the term ‘isomorphism’ for the newly introduced equivalence relations despite their not being isomorphisms (i.e. bijective mappings) in the true sense.
3Modulo structural equivalence.

contain the same types of class expressions in the same positions and only differ in the class, property and instance names. The following example shows two justifications J1 and J2 which we consider to be isomorphic:

Example 5.1.

J1 = {A v B u ∃r.C, B u ∃r.C v D} |= A v D

J2 = {E v B u ∃s.F, B u ∃s.F v D} |= E v D

It is straightforward to see that in the above justifications the class A in J1 can be mapped to E in J2, the property r to s, and the class C to F in order to obtain identical justifications. While justification isomorphism helps to eliminate the effects of diverging class, property, and instance names, we can also identify types of justifications which may be considered to be very similar despite their use of different constructors:

Example 5.2.

J1 = {A v B u C,B u C v D} |= A v D

J2 = {A v ∃r.C, ∃r.C v D} |= A v D

In this example, the semantics of the complex class expressions B u C in J1 and ∃r.C in J2 are not relevant for the respective entailment; their occurrences in the justifications and their entailments can be replaced by freshly generated class names without affecting the entailment relation. Such a substitution, in turn, would make the two justifications isomorphic. Likewise, justifications of different lengths may be considered similar if their general structure of reasoning is identical:

Example 5.3.

J1 = {A v B,B v C} |= A v C

J2 = {A v B,B v C,C v D} |= A v D

These two justifications clearly require the same form of reasoning from a user, which means we may consider them to be structurally similar in some way.

Strict isomorphism only applies to justifications which contain the same number of axioms; it does not cover situations like the ones shown above. However, for the purpose of structuring sets of justifications and analysing the logical diversity of a corpus of justifications, capturing those kinds of similarities illustrated in the above examples would be highly desirable.

In this chapter, we introduce two new relations, subexpression-isomorphism ≈s and lemma-isomorphism ≈ℓ, between justifications based on the subexpressions of their axioms and axiom subsets, respectively, and show how these relations can be used to determine the logical diversity of a set of justifications. We introduce the notion of a justification template, an abstract justification which acts as a representative for a set of structurally isomorphic justifications. Finally, we outline an algorithm to detect whether two justifications are isomorphic and discuss its practical implementation as part of the framework for analysing the justificatory structure of OWL ontologies.

5.1 Isomorphism

Justification isomorphism was first introduced as a way to reduce sampling bias in a user study to validate a model for the cognitive complexity of OWL justifications [HBPS11b], which we discussed in Section 2.3.3. When studying the cognitive complexity of justifications, we can assume that ontology users will perceive isomorphic justifications as equally difficult to understand, taking into account some variation caused by the complexity of class names and domain knowledge. The ontology corpus from which the justifications in the complexity study [HBPS11b] were sampled was reduced from 64,800 to 11,600 non-isomorphic justifications, thus eliminating the risk of sampling bias caused by presenting users with a series of structurally identical justifications. Recall that a justification J is always defined with respect to a particular entailment η; thus, we will use the notation (J, η) to refer to a justification-entailment pair. Justification isomorphism is defined as follows:

Definition 5.1 (Justification isomorphism). Two justifications (J1, η1), (J2, η2) are isomorphic ((J1, η1) ≈i (J2, η2)) if there exists a bijective renaming φ which maps class, property, and instance names in J1 and η1 to class, property, and instance names in J2 and η2, respectively, such that φ(J1) = J2 and φ(η1) = η2.

Remarks.

1. The relation ≈i is symmetric, reflexive and transitive, from which it follows that ≈i is an equivalence relation and thus partitions a set of justifications. Proofs are omitted as these properties are straightforward to see.

2. Justification isomorphism preserves the numbers and types of axioms and subexpressions in the justifications:

(a) If J1 ≈i J2 then |J1| = |J2|.

(b) Since the mapping φ is bijective, we also have J1 ≈i J2 implies that |sig(J1)| = |sig(J2)|.

(c) The sets of logical constructs used in J1 and J2 coincide.
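To make the definition concrete, the following Java sketch checks isomorphism on a toy term representation of axioms by backtracking over candidate bijective renamings. It treats constructor arguments as ordered and therefore does not account for structural equivalence of unordered constructors, and the representation and names are illustrative rather than those of our implementation.

import java.util.*;

// Sketch of an isomorphism check (Definition 5.1): an entity leaf carries a name, a
// constructor node carries an operator and an ordered list of children.
public class IsomorphismChecker {

    static final class Term {
        final String op;          // entity name for leaves, constructor name otherwise
        final boolean entity;
        final List<Term> children;
        Term(String op, boolean entity, List<Term> children) {
            this.op = op; this.entity = entity; this.children = children;
        }
        static Term leaf(String name) { return new Term(name, true, Collections.<Term>emptyList()); }
        static Term node(String op, Term... children) { return new Term(op, false, Arrays.asList(children)); }
    }

    static boolean isomorphic(List<Term> j1, Term eta1, List<Term> j2, Term eta2) {
        if (j1.size() != j2.size()) return false;
        Map<String, String> phi = new HashMap<>();     // renaming from J1 names to J2 names
        Map<String, String> inverse = new HashMap<>(); // ensures the renaming is injective
        return matchTerm(eta1, eta2, phi, inverse)
                && matchAxioms(j1, j2, new boolean[j2.size()], 0, phi, inverse);
    }

    // Backtracking over pairings of the axioms of J1 with distinct axioms of J2.
    private static boolean matchAxioms(List<Term> j1, List<Term> j2, boolean[] used, int index,
                                       Map<String, String> phi, Map<String, String> inverse) {
        if (index == j1.size()) return true;
        for (int k = 0; k < j2.size(); k++) {
            if (used[k]) continue;
            Map<String, String> phiCopy = new HashMap<>(phi);
            Map<String, String> invCopy = new HashMap<>(inverse);
            if (matchTerm(j1.get(index), j2.get(k), phiCopy, invCopy)) {
                used[k] = true;
                if (matchAxioms(j1, j2, used, index + 1, phiCopy, invCopy)) return true;
                used[k] = false;
            }
        }
        return false;
    }

    // Structural match that extends the injective renaming at entity leaves.
    private static boolean matchTerm(Term t1, Term t2, Map<String, String> phi, Map<String, String> inverse) {
        if (t1.entity != t2.entity) return false;
        if (t1.entity) {
            String mapped = phi.get(t1.op);
            if (mapped != null) return mapped.equals(t2.op);
            if (inverse.containsKey(t2.op)) return false; // would break injectivity
            phi.put(t1.op, t2.op);
            inverse.put(t2.op, t1.op);
            return true;
        }
        if (!t1.op.equals(t2.op) || t1.children.size() != t2.children.size()) return false;
        for (int i = 0; i < t1.children.size(); i++) {
            if (!matchTerm(t1.children.get(i), t2.children.get(i), phi, inverse)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Example 5.1: J1 = {A v B u some r.C, B u some r.C v D} |= A v D
        Term bAndRc = Term.node("and", Term.leaf("B"), Term.node("some", Term.leaf("r"), Term.leaf("C")));
        List<Term> j1 = Arrays.asList(Term.node("subClassOf", Term.leaf("A"), bAndRc),
                                      Term.node("subClassOf", bAndRc, Term.leaf("D")));
        Term eta1 = Term.node("subClassOf", Term.leaf("A"), Term.leaf("D"));
        //              J2 = {E v B u some s.F, B u some s.F v D} |= E v D
        Term bAndSf = Term.node("and", Term.leaf("B"), Term.node("some", Term.leaf("s"), Term.leaf("F")));
        List<Term> j2 = Arrays.asList(Term.node("subClassOf", Term.leaf("E"), bAndSf),
                                      Term.node("subClassOf", bAndSf, Term.leaf("D")));
        Term eta2 = Term.node("subClassOf", Term.leaf("E"), Term.leaf("D"));
        System.out.println(isomorphic(j1, eta1, j2, eta2)); // true
    }
}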

5.2 Subexpression-isomorphism

From the above definition of isomorphism it follows that only justifications which have the same number and types of axioms and subexpressions can be isomorphic. It is easy to see, however, that justifications can have a similar structure despite their use of different class expressions, as demonstrated in Example 5.2. Furthermore, obfuscating complex expressions which can be substituted by propositional variables reduces superfluous clutter in a justification, thus improving the readability of a justification akin to laconic justifications. Furthermore, from the exploratory study we discussed in Section 2.3.3 we have some indication that 1) OWL users abstract from complex expressions and treat them as ‘black boxes’ where possible, and 2) that they would find it helpful if such a possible abstraction was pointed out to them by a tool. These insights motivate a notion of subexpression-isomorphism, an equivalence relation which allows not only the mapping of class names, but also that of complex subexpressions. The idea of finding similarities between concepts in description logics has been widely explored in the work on unification and matching, e.g. [BN98, BKBM99, BM09], for the purpose of detecting redundant class descriptions in ontologies. Given two class expressions C and D, a unification problem C ≡? D asks whether there exists a substitution σ, that is, a mapping from a set of named classes in C into the set of class expressions in D, such that σ(C) ≡ σ(D). The aim of unification is to find a suitable substitution σ which maps an atomic class in C to a (possibly non-atomic) class in D such that the two classes are made equivalent. While the basic idea behind subexpression-isomorphism is based on unification and matching, the concepts are not directly applicable to the given problem of matching justifications. In our case, the inputs are of a different shape to the matching problem: the goal is to unify two sets of axioms and the corresponding entailments, rather than matching a single class expression (a class pattern) that contains variables to another class expression. That is, we attempt to identify pairs of possibly complex class expressions that, when replaced with names, result in isomorphic justifications. This means that the substitution mechanism as used in unification is not applicable. We therefore introduce a justification template Θ, which functions as the unifying justification for the isomorphic justifications, and two substitutions φ1, φ2 for use in subexpression-isomorphism. We will use freshly introduced entity names xi for the named classes, properties, and instances in a template Θ. In order for subexpression-isomorphism to be transitive, the justifications will have to fulfil certain side-conditions, which will be used in the proof of Proposition 5.2:

S1 For any C in the range of φ1 (φ2), C is not equivalent to >.

S2 Any C in the range of φ1 (φ2) is satisfiable in a single element of some interpretation, that is, there is some I such that |C^I| = 1.

S3 For any C1, C2 in the range of φ1 (φ2, respectively), it must hold that sig(C1) ∩ sig(C2) = ∅ and sig(Ci) ∩ sig(Θ) = ∅, that is, the expressions in the domain and the range of the mappings must be pairwise signature disjoint.

Definition 5.2 (Subexpression-isomorphism). Two justifications (J1, η1), (J2, η2) are s-isomorphic ((J1, η1) ≈s (J2, η2)) if there exists a justification template (Θ, η) and two injective substitutions φ1, φ2 satisfying S1 to S3, such that

1. Θ |= η

2. φ1(Θ) = J1 and φ2(Θ) = J2

3. φ1(η) = η1 and φ2(η) = η2.

The mappings φ1 and φ2 map class, property, and instance names in the template (Θ, η) to subexpressions of (J1, η1) and (J2, η2), respectively.

Example 5.4.

J1 = {A v ∃r.B, ∃r.B v C} |= A v C

J2 = {A v B,B v C} |= A v C

Θ = {x1 v x2, x2 v x3} |= x1 v x3

φ1 = {x1 ↦ A, x2 ↦ ∃r.B, x3 ↦ C}

φ2 = {x1 ↦ A, x2 ↦ B, x3 ↦ C}

Example 5.4 demonstrates how two justifications J1 and J2 are subexpression-isomorphic via a template Θ and two mappings φ1 and φ2 which use three newly introduced variables x1 through x3. Note that in order to be s-isomorphic, the justifications may differ in the number of subexpressions. As with strict isomorphism, however, they must have the same number of axioms: J1 ≈s J2 → |J1| = |J2|.

Proposition 5.1. S-isomorphism is a more general case of strict isomorphism: J1 ≈i J2 → J1 ≈s J2.

Proof. Let (J1, η1) ≈i (J2, η2) via a mapping φ. Then we can consider a template Θ to correspond to J1, the mapping φ1 is the identity relation id which maps J1 to itself, and φ2 corresponds to the mapping φ used for strict isomorphism. Thus if J1 ≈i J2 then J1 ≈s J2.

Proposition 5.2. The relation ≈s is reflexive, transitive and symmetric for description logics without nominals up to SRIQ.

Proof. Reflexivity and symmetry of the relation are straightforward to see; therefore we will only prove the transitivity of subexpression-isomorphism for logics not containing nominals. Let (Ja, ηa) ≈s (Jb, ηb) via φ1^ab, φ2^ab, (Θab, ηab) and (Jb, ηb) ≈s (Jc, ηc) via φ1^bc, φ2^bc, (Θbc, ηbc), respectively. We aim to show that given this case it is always possible to construct two mappings φ1^ac, φ2^ac and a template (Θac, ηac), such that (Ja, ηa) ≈s (Jc, ηc) via φ1^ac, φ2^ac and (Θac, ηac), as illustrated in Figure 5.1. In what follows, we will first describe the construction of (Θac, ηac) and φ1^ac, φ2^ac. We will then show that the entailment relation Θac |= ηac holds by describing the steps to extend any model Iac for Θac to a model of ηac by constructing a model Ia of Ja (or Ic of Jc, respectively).

We will then show that the entailment relation Θac |= ηac holds by describing the steps to extend any model Iac of Θac to a model of ηac, by constructing a model Ia of Ja (or Ic of Jc, respectively).

Figure 5.1: Graph representing the relations between three justifications which are subexpression-isomorphic via transitivity.

First, in order to construct Θac and ηac, take Jb, ηb and label each position p in their parse trees with variables (we use standard notation: J|p is the subtree/subexpression of J at position p, and J[s → t] is the tree obtained by replacing, in J, the subtree at position s with t):

• (x, ab) if φ2^ab(x) = Jb|p and Θab|p = x,
• (x, bc) if φ1^bc(x) = Jb|p and Θbc|p = x,

and the analogue for ηb; that is, we mark nodes in the parse trees of (Jb, ηb) with variables from (Θab, ηab) and (Θbc, ηbc) where applicable. Note that a node in the parse trees may be labelled with two such variable labels (x, ab), (x′, bc) if φ2^ab(x) = φ1^bc(x′) = Jb|p.

Now (Θac, ηac) is obtained from this labelled tree by

• removing all subtrees below nodes labelled with variables (x, . . .),
• turning the variables x in these labels (x, . . .) into leaf nodes (giving precedence to (x, ab) in the presence of two labels), and
• serialising the result into a justification and entailment template (Θac, ηac).

We then construct the mappings φ1^ac and φ2^ac: if a variable x in (Θac, ηac) originates from a labelling (x, ab) in Jb at position p, then

• φ1^ac(x) = φ1^ab(x), and
• φ2^ac(x) = Jb|p[tj → φ2^bc(yj)] for tj the positions in Jb|p that are labelled with (yj, bc),

and the analogue for variables originating from labellings (x, bc), and for ηb.

Finally, we need to ensure that the resulting Θac indeed entails ηac. We prove that every model Iac of Θac is a model of ηac by first showing that, for Iac a model of Θac, we can extend Iac to a model Ia of Ja (or the analogue for Ic, Jc). Given a model Iac = (Δ^Iac, ·^Iac) of Θac, we construct Ia such that for every x 7→ C in φ1^ac it holds that C^Ia = x^Iac.

Recall that the expressions in Ja have to fulfil the side conditions S1–S3 given above in order for this construction to be possible. The construction steps for Ia are as follows:

• Take a model Iac of Θac.

• For each x 7→ C in φ1^ac, consider the extension x^Iac = {a1, a2, . . .}, where |x^Iac| = k (possibly infinite), and take a model I of C with |C^I| = 1 to create k ‘copies’ of I. This means we obtain a copy Ii of I for each ai ∈ x^Iac.

• For each ai ∈ x^Iac, take Ii (the i-th copy of I) and replace the single element zi in C^Ii with ai; in particular, replace (zi, b) ∈ r^Ii with (ai, b).

• Finally, unite Iac and the models Ii created in this way by taking their disjoint union [Lut02, p 191], that is, the union of the domains and the union of the respective extensions in the models, to obtain a model Ia of Ja.

With the above steps we have shown that, given any model Iac |= Θac, it is possible to construct a model Ia such that Ia |= Ja. Since Ja |= ηa, it also holds that Ia |= ηa. By construction we know that, for every variable x in Θac or ηac, it holds that (φ1^ac(x))^Ia = x^Iac; since ηa = φ1^ac(ηac), it follows that Iac |= ηac. This means that every model Iac of Θac is a model of ηac, so the entailment relation Θac |= ηac holds. We have therefore shown that we can always construct a template (Θac, ηac) for the justifications Ja and Jc, and thus Ja ≈s Jc.

In order to demonstrate the non-transitivity of subexpression-isomorphism in the presence of nominals, we consider a simple counter-example:

O = {C v {a}, C v {b}, {a} u {b} u ≤ 1 r.C v D}, which entails η = C v D. The only justification for η is the set of all axioms in O. Now modify the ontology to obtain

Oa = {C v A, C v {b}, A u {b} u ≤ 1 r.C v D}

Ob = {C v {a}, C v B, {a} u B u ≤ 1 r.C v D}.

Both Oa and Ob entail η, and we have that O ≈s Oa and O ≈s Ob. However, if we substitute the nominals {a} and {b} with atomic concepts X and Y, respectively, in the template

Θ = {C v X, C v Y, X u Y u ≤ 1 r.C v D}

we find that the conjunct ≤ 1 r.C is no longer satisfied by all instances of C (and thus of X and Y), and Θ does not entail C v D. This shows that the relation ≈s is not transitive in the presence of nominals.

Finally, note that the substitutions φ1 and φ2 are injective, but not surjective. As we have shown in the proof of Proposition 5.2, injectivity of the mappings is a condition for the transitivity of subexpression-isomorphism. However, the mappings are not surjective, as the set of subexpressions in the justifications J1 and J2 can be of higher cardinality than the set of class names in Θ (unless the justifications themselves contain no complex expressions). This is illustrated by the following example of non-surjective mappings:

Example 5.5.

J1 = {A v B,B v C}

J2 = {A v B u ∃r.C, B u ∃r.C v D}

Θ = {x1 v x2, x2 v x3}

φ1 = {x1 7→ A, x2 7→ B, x3 7→ C}

φ2 = {x1 7→ A, x2 7→ B u ∃r.C, x3 7→ D}

The set of all subexpressions of J2 is {A, B, ∃r.C, B u ∃r.C, D}. The substitution φ2 maps variables only to a subset of these subexpressions; no variable is mapped to the ‘smaller’ expressions B and ∃r.C. This shows that a substitution φ can be non-surjective. Despite the mappings being non-surjective, we require each subexpression either to have a corresponding variable in Θ which maps to it directly, or to occur as part of a larger subexpression which has a corresponding variable in Θ. This guarantees that the isomorphic justifications have the same number of subexpressions which correspond to variables in Θ.

5.2.1 Representing equivalence classes

Transitivity of the isomorphism relations we introduce in this chapter is of particular interest to us because, along with symmetry and reflexivity, it ensures that the relations are equivalence relations. That is, they partition a given set of justifications into equivalence classes, i.e. sets of structurally isomorphic justifications, which can then be represented by a single representative justification. Such a representative could either be a concrete justification that is randomly selected from the equivalence class, or an abstract ‘justification’ (by slight abuse of the term) which uses new names in the place of expressions and concept names.

For example, the template Θ = {x1 v x2, x2 v x3} |= x1 v x3 in Example 5.4 functions as an abstract representative for the equivalence class to which the subexpression-isomorphic justifications J1 and J2 belong. Likewise, the strictly isomorphic justifications J1 and J2 in Example 5.1 can be represented by an abstract justification Θ = {x1 v x2 u ∃p1.x3, x2 u ∃p1.x3 v x4} |= x1 v x4. In the remainder of this thesis, we will generally use the term template to refer to such a representative justification (concrete or abstract) for an equivalence class. In Section 6.3.3, we will discuss in more detail how justification templates can be used to reduce the user effort required to understand a set of structurally similar justifications.
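The grouping of a corpus into such equivalence classes can be sketched in a few lines of Python: each justification is compared against one representative per class, which is sound precisely because the relations are reflexive, symmetric, and transitive. The isomorphic parameter stands in for any of the checkers discussed in Section 5.5; the toy size-based test at the end is only there to make the sketch runnable.

```python
# Sketch: partition justifications into equivalence classes, assuming an
# isomorphism test is available (here passed in as a callable).

def partition(justifications, isomorphic):
    classes = []  # list of lists; the first element of each list is the representative
    for j in justifications:
        for cls in classes:
            if isomorphic(cls[0], j):   # compare against the representative only
                cls.append(j)
                break
        else:
            classes.append([j])         # j starts a new equivalence class
    return classes

# Example with a toy 'isomorphism' that only compares justification size:
corpus = [frozenset({"A ⊑ B", "B ⊑ C"}),
          frozenset({"D ⊑ ∃r.E", "∃r.E ⊑ F"}),
          frozenset({"X ⊑ Y"})]
same_size = lambda j1, j2: len(j1) == len(j2)
print([len(c) for c in partition(corpus, same_size)])  # [2, 1]
```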

Preferred templates

Given a pair of s-isomorphic justifications, the template Θ which we use to represent the equivalence class and the substitutions φ1, φ2 are not necessarily unique; there can be multiple possible substitutions, as demonstrated in Example 5.6. The substitutions differ in their level of granularity, that is, in how large a subexpression a variable in Θ is mapped to. Keeping in mind our eventual goal of helping users cope with justifications, we introduce preferred templates, which reduce the amount of unnecessary detail in a justification template as far as possible. This means that a preferred template Θp contains the smallest possible expressions, ideally only atomic variable names. In turn, a substitution φi maps a variable in a preferred Θp to the largest possible subexpression in its corresponding justification Ji. Assuming that Θ, Θp are valid templates for a pair of justifications (J1, η1), (J2, η2), and that sig(Θ) ∪ sig(η) is the set of all variables occurring in a template (Θ, η), we can define preferred templates as follows:

Definition 5.3 (Preferred template). A template (Θp, ηp) is a preferred template if there is no template (Θ, η) such that (sig(Θ) ∪ sig(η)) ⊂ (sig(Θp) ∪ sig(ηp)).

We illustrate the concept of preferred templates with the following example:

Example 5.6.

J1 = {A v B u C u D,B u C u D v E}

J2 = {A v B u ∃r.C, B u ∃r.C v D}

Θ = {x1 v x2 u x3, x2 u x3 v x4}

φ1 = {x1 7→ A, x2 7→ B, x3 7→ C u D, x4 7→ E}

φ2 = {x1 7→ A, x2 7→ B, x3 7→ ∃r.C, x4 7→ D}

In the above example, the template Θ contains a conjunction x2 u x3, which means that only parts of the conjunctions in the justifications J1 and J2 are substituted by variables. It is straightforward to see, however, that there exists an alternative template Θp and substitutions φ′1, φ′2 which substitute larger expressions in the two justifications:

Example 5.7.

Θp = {x1 v x2, x2 v x3}
φ′1 = {x1 7→ A, x2 7→ B u C u D, x3 7→ E}
φ′2 = {x1 7→ A, x2 7→ B u ∃r.C, x3 7→ D}

The template Θp in Example 5.7 has a smaller signature size than Θ; it is therefore the preferred template for this pair of justifications. Note that we simply use newly introduced variable names for the justification templates here; in Section 6.3.3 in the next chapter we will give an overview of possible naming conventions for preferred templates which may lend additional support in understanding the template and the justifications it represents.
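A minimal sketch of the selection step for preferred templates, approximating Definition 5.3 by signature size, as in the comparison of Examples 5.6 and 5.7 above; the string encoding and the helper names are our own assumptions.

```python
# Sketch: pick a preferred template, i.e. a candidate with the smallest
# signature. Templates are sets of axiom strings here and the variable
# collection is a simple regular expression over names x1, x2, ...

import re

def signature(template, entailment):
    text = " ".join(sorted(template)) + " " + entailment
    return set(re.findall(r"x\d+", text))

def preferred(candidates):
    """candidates: iterable of (template, entailment) pairs."""
    return min(candidates, key=lambda c: len(signature(*c)))

theta   = ({"x1 ⊑ x2 ⊓ x3", "x2 ⊓ x3 ⊑ x4"}, "x1 ⊑ x4")   # Example 5.6
theta_p = ({"x1 ⊑ x2", "x2 ⊑ x3"}, "x1 ⊑ x3")              # Example 5.7
best = preferred([theta, theta_p])
print(sorted(signature(*best)))  # ['x1', 'x2', 'x3']
```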

5.3 Lemma-isomorphism

While s-isomorphism covers a number of justifications that can be regarded as equivalent because they require the same type of reasoning to reach the entailment, it only applies to justifications which have the same number of axioms. S-isomorphism does not take into account cases where the justifications differ only marginally in some subset, but where the general reasoning may be regarded as similar nonetheless. We therefore introduce the notion of lemma-isomorphism, which extends subexpression-isomorphism with the substitution of subsets of justifications through intermediate entailments, so-called lemmas, which we introduced with the Movie ontology example in Section 2.3.3. The general motivation behind lemma-isomorphism is demonstrated by the following example:

Example 5.8.

J1 = {A v B,B v C} |= A v C

J2 = {A v B,B v C,C v D} |= A v D

It is straightforward to see that both J1 and J2 require the same type of reasoning from a human user. As the justifications only differ in the length of the atomic subsumption chains that lead to the entailment, we can certainly consider them to be similar with respect to some similarity measure. However, the two justifications are not considered isomorphic with respect to the definitions for strict isomorphism or subexpression-isomorphism. We therefore introduce a new type of isomorphism which takes into account the fact that subsets of justifications can be replaced with intermediate entailments which follow from them. A lemma λ for a justification J is an entailment of a subset S of J . In order to define lemma-isomorphism, we need to introduce two more concepts: tidy axiom sets, and justification lemmatisations, that is, justifications which are enriched with lemmas, as defined by Horridge [Hor11a]. Tidy axiom sets prevent meaningless lemmatisations by ensuring that the set S that generates a lemma is consistent and does not entail synonyms for > or ⊥. They are defined as follows [Hor11a, p 252]:

Definition 5.4 (Tidy axiom sets). A set of axioms S is tidy if S ⊭ > v ⊥, S ⊭ A v ⊥, and S ⊭ > v A for all A ∈ sig(S).

The following definition of summarising lemmatisations is based on the definition of lemmatisations given by Horridge [Hor11a, p 253], albeit with some modifications: first, Horridge’s definition considers the cognitive complexity of a lemmatisation in order to ensure that a lemmatisation is easier to understand than the justification it is based on. As the complexity of a lemmatisation is not relevant for lemma-isomorphism, we drop this condition. Second, the original definition is too weak for use in lemma-isomorphism, as we will show with an example below; thus, we introduce a new condition for the justification subsets S which ensures that the lemmas used for the lemmatisation are summarising. And finally, we simplify the definition to omit some unneeded details:

Definition 5.5 (Summarising lemmatisation). Let (J, η) be a justification, let S = {Si} be a set of tidy subsets of J, and let ΛS = {λi} be a set of lemmas such that Si |= λi for each i. The set J^ΛS = (J \ ⋃S) ∪ ΛS is called a summarising lemmatisation of J if

1. J^ΛS is a justification for η,

2. the subsets Si are pairwise disjoint.

If clear from the context, a summarising lemmatisation J^ΛS may also be called a lemmatised justification.

Proposition 5.3. Given an Si ∈ S that is substituted with a lemma λi in a summarising lemmatisation J^ΛS, there is no set S′ ⊂ Si such that S′ |= λi.

Proof. This proposition follows from the minimality condition for justifications: given a justification (J, η), we remove a subset Si ⊆ J and replace it with a lemma λi such that (J \ Si) ∪ {λi} is a justification for η. If there exists a subset S′ ⊂ Si such that S′ |= λi, then it also holds that (J \ Si) ∪ S′ |= η. Therefore, the axioms in Si \ S′ are redundant with respect to the entailment η, which violates the minimality condition for the justification J.

Given the definitions for lemmatisations, we can now define lemma-isomorphism as an extension to subexpression-isomorphism:

Definition 5.6 (Lemma-isomorphism). Two justifications (J1, η1), (J2, η2) are ℓ-isomorphic ((J1, η1) ≈ℓ (J2, η2)) if there exist summarising lemmatisations J1^ΛS1, J2^ΛS2 which are s-isomorphic: J1^ΛS1 ≈s J2^ΛS2.

5.3.1 Restrictions on lemmatisations

Lemma-isomorphism using arbitrary summarising lemmatisations as defined above has two undesirable properties: first, the lemmatised justifications might differ strongly from the original justifications; in the most extreme case, the lemmatisation of a justification can be the entailment itself. We therefore have to introduce an additional constraint that ensures that a lemmatisation is understandable to a user. And second, we cannot guarantee that lemma-isomorphism using arbitrary lemmas is in fact transitive, which means that lemma-isomorphism would lose the desirable property of being an equivalence relation. In what follows, we will first introduce the notion of obvious proof steps and suggest one particular type of lemmatisation, maximal atomic subsumption chains, which we use to demonstrate lemma-isomorphism in the remainder of this thesis. We will then illustrate the issue of non-transitivity of lemma-isomorphism using an example, and briefly discuss how to further restrict lemmatisations in order to preserve transitivity.

5.3.2 Lemmatisations and obvious steps

The notion of obvious logical inferences [Dav81] describes how proof steps which are obvious to a reader can be omitted without losing information that may be important for understanding the proof. We adapt this notion of obvious steps in order to allow only lemmatisations which are easily understandable for a human reader, that is, lemmas which follow from a subset of a justification in a straightforward fashion.

While research on human understanding of description logic is fairly limited, the work by Horridge et al. [HBPS11b] and Nguyen et al. [NPPW12a], which we introduced in Section 2.3.3, provides us with two different models for determining the cognitive complexity of certain reasoning patterns found in justifications. Either of these complexity models could function as the basis for determining the obviousness of a lemmatisation, if we assume that the entailment of a reasoning pattern which is ‘easy’ to understand corresponds to an obvious lemmatisation.

However, while the rankings determined by Nguyen et al. [NPPW12a] seem to cover the most frequently occurring justification subset patterns, they do not include atomic subsumption chains of the type S = {A0 v A1, A1 v A2, . . . , An−1 v An}, which are one of the most prevalent and most easily recognisable patterns, as found by Horridge [HBPS11b] and in our exploratory study of reasoning patterns (Section 2.3.3). For the purpose of demonstrating the effects of lemma-isomorphism in this thesis, we therefore focus on atomic subsumption chains.

Atomic subsumption chains

Given an atomic subsumption chain S = {A0 v A1, A1 v A2, . . . , An−1 v An} in a justification, often only the relation between the subclass A0 in the first axiom and the superclass An in the last axiom is relevant for understanding how the chain contributes to the justification. That is, the subsumption chain produces a summarising lemma A0 v An which is the only piece of information required for the justification. The step from the subclass to the final superclass was found to be obvious for OWL users who were presented with justifications containing ordered and indented subsumption chains of this type [HBPS11b]. Therefore, a suitable lemmatisation of a justification containing an atomic subsumption chain is the justification obtained by substituting the chain with its conclusion in the form of a single axiom A0 v An.

Maximal atomic subsumption chains  Recall that we require the lemmatisations used for lemma-isomorphism to be summarising, that is, the subsumption chain can be substituted by a single axiom and the entailment of the justification still holds. We call an atomic subsumption chain S = A0 v . . . v An maximal if there is no longer chain S′ = A0 v . . . v An+k for k ≥ 1. It is possible for a set of axioms to form an apparently maximal subsumption chain in which only a subset of the chain can be substituted by a summarising lemma, as shown in the following example:

Example 5.9.

J ={A v ∃r.C, A v B,B v C,C v D,C u ∃r.D v E} |= A v E

The axiom set S′ = {A v B, B v C, C v D} in Example 5.9 is a maximal atomic subsumption chain in J which entails A v D. It is easy to see, however, that the entailment does not hold in the lemmatisation (J \ S′) ∪ {A v D}, as the important lemmas A v C and C v D would be lost; that is, the lemmatisation based on S′ is not summarising. We can only substitute the non-maximal chain S = {A v B, B v C} with its entailment A v C in order to obtain a summarising lemmatisation.

While substituting non-maximal chains is not problematic in itself, there exists a practical reason for restricting lemma-isomorphism to maximal chains: in an implementation of the atomic subsumption chain lemmatisation, detecting non-maximal chains would require searching through all subchains of the maximal atomic subsumption chain in order to find a suitable, substitutable chain. The search space of this operation, and thus the number of lemmatisation attempts, would simply be too large to be practical in an implementation. Therefore, our current implementation only attempts to substitute maximal chains, which may lead to false negative results when attempting to compare two justifications which are only lemma-isomorphic through lemmatisations using non-maximal chains.
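The following Python sketch illustrates the chain-substitution step on a toy encoding in which atomic subsumptions are (sub, sup) pairs and complex axioms are opaque strings: it follows each maximal chain to its end and replaces it with the summarising lemma A0 v An. It deliberately omits the check that the result is still a justification for η (condition 1 of Definition 5.5), which would require a reasoner, and it assumes each class has at most one atomic superclass in the set; the function name is ours, not that of the actual implementation.

```python
# Sketch: replace maximal atomic subsumption chains with their summarising
# lemma. Atomic subsumptions are (sub, sup) pairs; complex axioms are left
# untouched. Assumes each class has at most one atomic superclass in the set.

def lemmatise_chains(axioms):
    atomic = [ax for ax in axioms if isinstance(ax, tuple)]
    succ = dict(atomic)                               # sub -> sup
    starts = set(succ) - {sup for _, sup in atomic}   # no incoming atomic edge
    result = set(axioms) - set(atomic)
    for start in starts:
        end, length = start, 0
        while end in succ:                   # follow the chain as far as possible
            end = succ[end]
            length += 1
        if length >= 2:
            result.add((start, end))         # summarising lemma A0 ⊑ An
        else:
            result.add((start, succ[start])) # a single atomic axiom, kept as-is
    return result

# J = {A ⊑ B, B ⊑ C, A ⊑ ∃r.D, C ⊓ ∃r.D ⊑ E} |= A ⊑ E:
# the maximal chain A ⊑ B ⊑ C is replaced with the lemma A ⊑ C.
j = {("A", "B"), ("B", "C"), "A ⊑ ∃r.D", "C ⊓ ∃r.D ⊑ E"}
print(lemmatise_chains(j) == {("A", "C"), "A ⊑ ∃r.D", "C ⊓ ∃r.D ⊑ E"})  # True
```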

5.3.3 Non-transitivity

Allowing arbitrary summarising lemmatisations for lemma-isomorphism can lead to non-transitivity of the relation, as is shown in the following example:

Example 5.10.

Ja |= X v Y:
  X v (A0 u A2) u (∀r.A0 u ∀s.A4)
  (A1 u A5) u (∀r.A3 u ∀s.A5) v Y
  A0 v A1
  A2 v A5

Jb |= X v Y:
  X v (B0 u B2) u (∀r.B0 u ∀s.B4)
  (B1 u B5) u (∀r.B3 u ∀s.B5) v Y
  B0 v B1
  B1 v B2
  B2 v B3
  B3 v B4
  B4 v B5

Jc |= X v Y:
  X v (C0 u C2) u (∀r.C0 u ∀s.C4)
  (C1 u C5) u (∀r.C3 u ∀s.C5) v Y
  C0 v C3
  C4 v C5

In the above example, the three axiom sets are justifications for X v Y. It holds that Ja ≈ℓ Jb and Jb ≈ℓ Jc via lemmatisations Jb^λ1 and Jb^λ2, respectively. While the set of atomic subsumption axioms in Jb appears to be a single continuous atomic subsumption chain, it cannot be substituted with a single lemma without breaking the entailment. In Jb^λ1 the atomic subsumption chain B0 . . . B5 is substituted with the two lemmas λ11 = B0 v B1, λ12 = B2 v B5, which makes the lemmatisation strictly isomorphic to Ja. The analogue holds for Jb^λ2, which is strictly isomorphic to Jc via the two lemmas λ21 = B0 v B3, λ22 = B4 v B5.

However, Ja and Jc are clearly not s-isomorphic; that is, the relation ≈ℓ is not transitive in this example.

While we do not have a complete proof for the transitivity of ℓ-isomorphism under certain side conditions, as we have for s-isomorphism, we believe that the reason for the non-transitivity lies in the presence of internal masking in the justification Jb: the justification contains two distinct reasons why the entailment X v Y holds, and each of the lemmatisations Jb^λ1 and Jb^λ2 ‘removes’ one of these reasons. As a result, the two lemmatisations contain different reasoning patterns and are therefore isomorphic to the two justifications Ja and Jc, respectively, which have entirely different reasoning patterns from each other. Clearly, these two distinct justifications should not be considered to be ‘similar’ to each other, and in turn, Jb should not be considered isomorphic to either of them. Thus, disallowing justifications with internal masking would prevent these kinds of situations. However, whether this is the only restriction we have to apply in order to preserve the transitivity of lemma-isomorphism remains an open question.

For the time being, we treat all results of the lemma-isomorphism implementation as approximations: some sets of justifications may be judged non-isomorphic when they are in fact isomorphic. Hence, we may potentially over-estimate the logical diversity of a corpus of justifications and under-estimate the effects of lemma-isomorphism, which we consider acceptable for the purpose of demonstrating the concept of lemma-isomorphism in this thesis.

5.4 Equivalence and superfluity

For the purpose of analysing the diversity of reasons why an entailment holds in an ontology, we need to take into account another aspect of justifications, namely the superfluousness of subexpressions in an axiom. Since justifications are sets of axioms as they are asserted in the ontology, it is possible for such an axiom to contain parts which are not relevant with respect to the entailment. This can have several possible effects: 1) a justification can contain multiple reasons for an entailment to hold (internal masking, as mentioned in the previous section); 2) two justifications can have exactly the same reason why an entailment holds and only differ in their superfluous parts, which results in fewer logical reasons than material justifications; 3) a justification can have parts which interact with other axioms in the ontology to form entirely new justifications. These phenomena are also known as facets of justification masking [HPS10a], which was introduced in Section 2.3.3. In order to take into account the effects of masking in justifications and to eliminate potentially distracting superfluous expressions, Horridge et al. [HPS08] defined laconic justifications, which we also introduced in Section 2.3.3.

We already discussed the effect of internal masking on the transitivity of lemma-isomorphism in the previous section; however, there are other ways in which superfluity and justification masking affect our isomorphism relations. For example, it is easy to see that superfluity can lead to justifications being considered non-isomorphic, despite them being identical ‘in principle’ and requiring the same reasoning strategy. Take the following two justifications:

J1 ={A v B u C,B v D} |= A v D

J2 ={A v B,B v D} |= A v D

Both justifications only require atomic subsumption chain reasoning; the conjunct C in J1 is superfluous as it does not affect the entailment. However, the justifications would not be considered isomorphic with respect to any of our relations. Note that the superfluity itself is not the culprit here, since a similarly superfluous conjunct in J2 would cause the justifications to be (strictly or subexpression-) isomorphic nonetheless. It is the superfluity (or lack thereof) in different positions which causes the structural similarity of the justifications to be concealed.

We know that non-laconic justifications are highly prevalent in OWL ontologies, as shown by a study of 72 BioPortal ontologies in which over 82% of the justifications were found to contain superfluous expressions [Hor11a]. This indicates that, in order to obtain a more accurate picture of the logical diversity of a set of justifications, we may choose to identify isomorphic justifications based on their laconic versions. Note that, due to internal masking, a justification can have multiple laconic versions. This means that a necessary condition for two justifications to be considered isomorphic is that all their laconic versions have to be isomorphic.

5.5 Implementing an isomorphism checker

Having introduced the basic notions of isomorphism in the previous sections, we will now discuss our approach to implementing an isomorphism checker which covers all three types of isomorphism. We outline the algorithm for detecting isomorphism, which is based on a comparison between the parse trees of the justifications to be matched, and discuss some of the basic optimisations that can be applied to improve the performance of the isomorphism checker. Finally, we describe how the j-graph introduced in Chapter 4 can be extended to include justification templates for structurally isomorphic justifications.

Figure 5.2: Parse tree of J = {A v ∃r.B, B v C, ∃r.C v D} |= A v D.

5.5.1 Algorithm and implementation

The general idea behind the implementation of a program to check whether two justifications are isomorphic is the comparison of the parse trees of the two justifications. Each axiom in a justification is parsed into its parse tree, where the nodes are labelled with either entity names (properties or classes), OWL constructors, or integers (in the case of cardinality restrictions). The parse tree of a justification simply contains each axiom parse tree as a child of a blank root node. To simplify the algorithm, the respective entailments are treated separately from the justification parse trees, as this allows us to treat every axiom subtree in the justification parse tree equally. An example of a justification parse tree (without the respective parse tree of the entailment) is shown in Figure 5.2.
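As an illustration of this representation, the sketch below builds the parse trees of the three axioms from Figure 5.2 as nested (label, children) pairs and hangs them under a blank root. The encoding and helper names are a simplification for illustration; the actual checker operates on OWL API objects.

```python
# Sketch of the parse-tree representation used by the checker: each node is a
# (label, children) pair, entity names are leaves, and the justification tree
# has a blank root with one axiom tree per child.

def node(label, *children):
    return (label, list(children))

def leaf(name):
    return (name, [])

# J = {A ⊑ ∃r.B, B ⊑ C, ∃r.C ⊑ D}, cf. Figure 5.2
axiom1 = node("SubClassOf", leaf("A"),
              node("ObjectSomeValuesFrom", leaf("r"), leaf("B")))
axiom2 = node("SubClassOf", leaf("B"), leaf("C"))
axiom3 = node("SubClassOf",
              node("ObjectSomeValuesFrom", leaf("r"), leaf("C")), leaf("D"))
justification_tree = node("", axiom1, axiom2, axiom3)   # blank root node

def size(tree):
    label, children = tree
    return 1 + sum(size(child) for child in children)

print(size(justification_tree))  # 14: the blank root plus 13 labelled nodes
```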

Given the parse trees of two justifications (J1, η1), (J2, η2), the algorithm answers ‘true’ if the justifications are isomorphic with regard to one of the given definitions for isomorphism, and ‘false’ otherwise. While there exist algorithms for tree-isomorphism (e.g. [ST99]), we chose to implement a naive matching algorithm, as this allows us to extend the algorithm from strict to subexpression-isomorphism in a straightforward fashion.

For strict isomorphism, the algorithm answers ‘true’ if it can construct a mapping φ (that is, a set of pairs (n1, n2)) which maps nodes n1 in the parse trees of J1 and η1 to nodes n2 in J2 and η2 such that φ(J1) = J2 and φ(η1) = η2. For subexpression-isomorphism, we deviate from the given definition and attempt to find only a single mapping φ which maps subtrees in J1 and η1 to subtrees in J2 and η2, respectively. Finally, as lemma-isomorphism is based on subexpression-isomorphism, we first lemmatise the justifications, then apply the algorithm for detecting subexpression-isomorphism to the lemmatised justifications.

An outline of the algorithm for strict isomorphism and its subroutines is given in Algorithms 5.1 and 5.2, whereby childCount(), removeChild(), and getChildAt() can be interpreted as standard operations on a tree structure.

The algorithm traverses the parse trees of J1 and J2 (denoted by e1 and e2 in the pseudocode) depth-first, left to right, and attempts to construct a set Mappings of candidate mappings. It checks at each node whether the nodes at this position in the parse trees are compatible in the context of the current mapping φ. Note that the parse trees of the entailments are not considered here; they only come into play in the final call to isCorrectMapping().

The subroutine compatible() returns ‘true’ if a) the nodes are both labelled with the same constructor (sameOperator() holds) and have the same number of children, or b) the nodes are leaf nodes and respect the existing mapping φ, that is, the exact pair (t1, t2) either exists already in φ or neither t1 nor t2 occurs in φ. Otherwise, compatible() returns ‘false’. If a match is found (i.e. the nodes are compatible), then we add the pair to the existing mapping and continue with the children and the siblings of t1. Finally, if a set of candidate mappings is found, the substitutions are applied to the justifications and entailments and an entailment check is performed in order to verify whether the substitutions preserve the entailment. If such a mapping is found, the algorithm returns ‘true’.

Algorithm 5.1 Isomorphic(e1, e2)
  t1 ← getParsetree(e1)
  t2 ← getParsetree(e2)
  φ ← ∅
  Mappings ← match(t1, t2, φ)
  for φ ∈ Mappings do
    if isCorrectMapping(φ, t1, t2) then
      return true
    end if
  end for
  return false

5.5.2 Optimisations

There are several optimisations to the implementation of the generic tree comparison algorithm which can be used to reduce the number of comparisons, thus improving performance. The optimisations can be categorised into a) filtering prior to starting the tree comparison, and b) early termination of the comparison algorithm. Some of these optimisations depend on the respective isomorphism type that is being checked for.

Algorithm 5.2 Subroutines to Algorithm 5.1
  function Match(t1, t2, φ)
    if (t1, t2) ∈ φ then
      return matchChildren(t1, t2, φ)
    end if
    if compatible(t1, t2, φ) then
      φ ← φ ∪ {(t1, t2)}
      return matchChildren(t1, t2, φ)
    end if
    return ∅
  end function

  function MatchChildren(t1, t2, φ)
    if childCount(t1) = 0 then
      return {φ}
    end if
    Mappings ← ∅
    FirstChild ← getChildAt(t1, 0)
    ReducedT1 ← removeChild(t1, FirstChild)
    for Child ∈ children(t2) do
      ReducedT2 ← removeChild(t2, Child)
      for φ′ ∈ match(FirstChild, Child, φ) do
        Mappings ← Mappings ∪ matchChildren(ReducedT1, ReducedT2, φ′)
      end for
    end for
    return Mappings
  end function

  function Compatible(t1, t2, φ)
    if childCount(t1) ≠ childCount(t2) then
      return false
    else if sameOperator(t1, t2) then
      return true
    else if respectsMapping(t1, t2, φ) then
      return true
    end if
    return false
  end function

Strict isomorphism optimisations  Before starting the tree comparison, we can filter out non-matching justifications based on

1. the number of axioms in the justifications,
2. the number of different axiom types,
3. the signature size of each justification,
4. the number of different constructor types used in the axioms.

We only perform a full tree comparison if these values match for both justifications.
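A sketch of this pre-filter on a toy axiom encoding (axiom type, entity names, constructors used); the expensive tree comparison is attempted only if the four values coincide. The record format is an assumption made for illustration only.

```python
# Sketch: cheap filter applied before the full tree comparison for strict
# isomorphism. Axioms are toy records (axiom_type, entities, constructors).

def filter_key(justification):
    entities, constructors = set(), set()
    for _, ents, cons in justification:
        entities |= set(ents)
        constructors |= set(cons)
    return (len(justification),                                  # 1. number of axioms
            len({ax_type for ax_type, _, _ in justification}),   # 2. axiom types
            len(entities),                                       # 3. signature size
            len(constructors))                                   # 4. constructor types

def may_be_isomorphic(j1, j2):
    return filter_key(j1) == filter_key(j2)   # full tree comparison only if True

# J1 = {A ⊑ ∃r.B, ∃r.B ⊑ C}  vs.  J2 = {D ⊑ ∃s.E, ∃s.E ⊑ F}
j1 = [("SubClassOf", {"A", "r", "B"}, {"ObjectSomeValuesFrom"}),
      ("SubClassOf", {"r", "B", "C"}, {"ObjectSomeValuesFrom"})]
j2 = [("SubClassOf", {"D", "s", "E"}, {"ObjectSomeValuesFrom"}),
      ("SubClassOf", {"s", "E", "F"}, {"ObjectSomeValuesFrom"})]
print(may_be_isomorphic(j1, j2))  # True: the expensive check is still required
```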

Subexpression- and lemma-isomorphism optimisations  As subexpression-isomorphism allows mappings between complex expressions, the types and numbers of constructors can differ between the two justifications. Thus, we can only apply optimisations 1) and 2), which compare the number and types of axioms. As lemma-isomorphism reduces to subexpression-isomorphism after the lemmatisation stage, we can apply the same optimisations.

General optimisations  In addition to the specific optimisations described above, the most obvious cheap check is a check for equality of the justifications before calling the tree algorithm. Further, we can apply an ordering on the axioms in the (lemmatised) justification when generating the parse tree. As only axioms of the same type can potentially be considered isomorphic, ordering the children in the justification tree based on their axiom type prevents the algorithm from matching incompatible axiom types in the first place, thus gaining a small advantage over a random comparison between axiom parse trees.

5.5.3 Limitations due to syntactical differences

Many constructors and axiom types in OWL have an abstract, frame-like syntax, which can be used interchangeably with their symbolic description logic syntax. For example, one of the most common variants can be found for domain axioms, since domain(r, C) can also be written as ∃r.> v C. Beyond notational variants, we also need to take into account the equivalence of expressions. For example, the expression ¬∃r.A is equivalent to ∀r.(¬A), and we can certainly imagine ontology developers unknowingly using both notations in an ontology.

Given our current equivalence relations, justifications containing equivalent axioms with alternative syntaxes or equivalent expressions would not be considered isomorphic. The question arises whether we can (or should) consider those alternative notations to be isomorphic.

On the one hand, the semantics of the alternative notations are identical; for the purpose of analysing the logical diversity of a corpus of ontologies, the material form of an axiom is clearly not relevant. By introducing a normalisation step prior to computing the equivalence classes on a corpus of justifications, we can convert each such axiom into our preferred notation and partially eradicate the issue of seemingly diverse justifications.

On the other hand, however, we may argue that, from a user perspective, there exists a clear difference between the syntactical variants. In description logic syntax, a justification of type {A v ∃r.B, ∃r.> v C} |= A v C can be solved by a human user through simple pattern matching, as the RHS of the first axiom closely resembles the LHS of the second axiom. While it does require the user to understand that ∃r.B is subsumed by ∃r.>, the similarity between the expressions can support understanding of the relation between the axioms. Comparing this to the more abstract notation {A v ∃r.B, domain(r, C)} |= A v C, the user first needs to interpret the domain axiom correctly, which in turn seems to have little connection to the first axiom. This can potentially complicate the process of understanding the justification, making the abstract syntax quite different from the symbolic notation. A counter-argument for this is the fact that pointing out the equivalence of two different-looking expressions may help the user in understanding unfamiliar notation; this, however, is more of a general representational issue.

Thirdly, taking into account the experience or preferences of a user, a normalisation towards the user's preferred notation may in fact be beneficial for supporting their understanding of justifications. Furthermore, in OWL editors such as Protégé 4, users generally only encounter the abstract syntax, with no obvious facilities to construct subsumption axioms with a complex LHS. This supports the argument for a normalisation of syntactical variants to match the preferred syntax of a user.

We conclude that, while syntactical variants may differ in their complexity for

human users, the normalisation of such equivalent notations can give us a more homogeneous picture of the logical diversity of a corpus of justifications.

5.5.4 Extending the j-graph

Extending the j-graph we have introduced in Chapter 4 to take into account isomorphism relations between sets of justifications can be done in several different ways: first, we could substitute the isomorphic justifications with an entirely new node labelled with the respective template, preserving the incoming and outgoing edges of the justification nodes. Second, we can label each justification node with its template. And third, we can create a new template node for each template, with outgoing edges to the corresponding isomorphic justifications.

Figure 5.3: An extended j-graph containing four template nodes t1 through t4.

Since it is desirable to preserve the relationship between the justification axioms, justifications, and their entailments, a complete substitution as suggested by the first approach seems unsuitable. Adding an additional template label to each justification node may be a straightforward solution, but it does not give us direct access to metrics such as the number of templates in the j-graph, or the number of justifications for each template. We therefore choose to represent isomorphic justifications by a newly introduced template node; this allows us to conveniently count the number of templates, which corresponds to the number of nodes labelled with a template, and identify the number of justifications for each template, which is simply the out-degree of a template node. In addition, this approach lets us represent template nodes for each isomorphism type in the graph, without running the risk of adding too many labels to a single node.

Figure 5.3 shows the example j-graph from Figure 4.2 extended with an additional four template nodes for one type of isomorphism. From left to right, the grey shaded nodes represent templates for justifications j1 and j6 (t1), j2 (t2), j3 (t3), and j4 and j5 (t4), respectively. The one piece of information we are missing is the relation between isomorphic axioms in a set of justifications. It would certainly be possible to further extend the graph with additional axiom template nodes for each justification template, with outgoing edges to the justification template node; however, in order to avoid cluttering the graph, we omit this information for now.
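The chosen approach can be sketched on a minimal adjacency-dictionary graph: one template node is added per equivalence class, with outgoing edges to its isomorphic justifications, so the number of templates and the per-template justification counts fall out of the node set and the out-degrees. Node names follow Figure 5.3, but the data structure is our own illustration.

```python
# Sketch: extend a j-graph (adjacency dict: node -> set of successor nodes)
# with one template node per equivalence class of isomorphic justifications.

def add_template_nodes(graph, equivalence_classes):
    for i, justs in enumerate(equivalence_classes, start=1):
        template_node = f"t{i}"
        graph[template_node] = set(justs)   # edges t_i -> isomorphic justifications
    return graph

# Justification nodes of the example graph (axiom nodes omitted for brevity)
jgraph = {f"j{i}": set() for i in range(1, 7)}
classes = [["j1", "j6"], ["j2"], ["j3"], ["j4", "j5"]]   # as in Figure 5.3
add_template_nodes(jgraph, classes)

templates = [n for n in jgraph if n.startswith("t")]
print(len(templates))                                   # 4 templates
print({t: len(jgraph[t]) for t in sorted(templates)})   # justifications per template
```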

5.6 Summary and conclusions

Motivated by the apparent structural similarity of distinct justifications, we introduced two new equivalence relations that extend strict justification isomorphism: subexpression-isomorphism, as discussed in Section 5.2, covers justifications which contain complex expressions that could be substituted by variable names, that is, where the semantics of the expression is not relevant to the entailment. Lemma-isomorphism, which we introduced in Section 5.3, goes one step further and allows us to consider justifications to be isomorphic if there exist lemmatisations of the justifications which are subexpression-isomorphic. We presented definitions for these new relations and a proof for the (non-obvious) transitivity of subexpression-isomorphism in description logics without nominals. In Section 5.2.1, we also discussed the notion of justification templates, which represent a set of structurally isomorphic justifications, and defined how to determine a preferred template in the presence of multiple possible templates. Finally, we gave an overview of an algorithm to detect isomorphism between two justifications and discussed possible optimisations for an implementation.

The work presented in this chapter is an important step on the way to gaining a better understanding of how we can define and detect similarity between justifications in OWL ontologies. This is relevant in two respects: first, the types and numbers of justification templates for a finite entailment set of an ontology provide us with a metric to measure the logical diversity (and, conversely, the logical uniformity) of an ontology, which adds another aspect to our suite of metrics of the justificatory structure of OWL ontologies. And second, if confronted with multiple justifications, these relations allow us to point out structural similarities to the user, thus potentially reducing the effort required to understand all justifications; further, subexpression- and lemma-isomorphism reduce the general diversity of expressions and axioms in a justification, which has an effect similar to ‘laconicising’ a justification in order to remove superfluous parts. A detailed discussion of how we can exploit isomorphism in the debugging process follows in the next chapter.

Chapter 6

Coping strategies

In the previous chapters we introduced the basic elements of the justificatory structure of an OWL ontology, such as overlap between justifications and justification isomorphism. In this chapter, we will characterise different scenarios of how OWL ontology users encounter justifications, and how justificatory structure plays a role in those encounters. There are several possible scenarios in which ontology users could use justifications:

• Information: understanding why an entailment holds in an ontology, e.g. for learning purposes or to verify the correctness of the ontology.

• Debugging: modifying an ontology such that one or several erroneous entailments are no longer entailed.

• Analysis: obtaining metrics about the justificatory structure of an ontology, e.g. for comparing two ontologies.

• Ontology comprehension: understanding the structure of an ontology, e.g. for learning purposes or to be able to integrate or modify the ontology.

The term ontology comprehension is frequently used in a rather broad sense to refer to the user understanding entities and their (intended) relations in an ontology in order to be able to use them correctly [GWS07, BSP09]; however, ontology comprehension is neither a clearly defined task, nor is understanding easy to measure; see, for example, [HBPS11b] for a discussion of the challenges faced when designing user studies on justification understanding. In contrast, generating justification-based metrics does have a clearly defined outcome (the correct metrics), but it does not require any additional user support beyond the suitable representation of those metrics. Using justifications for information purposes is very similar to debugging; however, rather than all justifications, it often requires only a small number of justifications (or even just one) to provide the required information to the user. Thus, we focus on the case of debugging erroneous entailments, which is a clearly defined task with a measurable goal.


In the context of ontology debugging we now ask how the justificatory structure of an ontology can be exploited to improve the debugging and repair process for OWL ontology developers. While we commonly speak of the debugging process, it is clear that this is not a single task, but rather that there are many different facets to the act of detecting errors in an ontology and modifying it to remove these errors. Thus, the measures taken to improve debugging support depend on the task, e.g. whether the user wants to repair one or multiple entailments, and the occurrence of single or multiple justifications. The central focus point of this thesis has been the occurrence of multiple justifications in OWL ontologies, and how we can ‘reduce user effort’ in a situation where a user encounters multiple justifications.

In this chapter, we will define the notion of a successful debugging solution, which then allows us to pin down what exactly we mean by the effort a user has to apply in the debugging process, and how we can measure this effort. We will then discuss approaches to improving debugging support using the aspects of justificatory structure we have introduced in the previous chapters. The main idea behind these techniques is to either reduce the number of items (axioms, justifications) a user has to inspect, or to make the justifications they do encounter easier to understand. The goal of this chapter is to gain a better understanding of the various situations in which users encounter justifications, and to provide an analysis of the impact potential interventions have on a user's effort when dealing with justifications. Additionally, it lays the foundations for implementing the suggested techniques in an OWL debugging tool and for measuring the effect these techniques have on ontology users.

6.1 Debugging problems

In this section, we discuss the notion of a debugging problem and the various scenarios in which users encounter justifications in the debugging process. These scenarios include both the number of entailments and justifications in the given task, as well as the relations between the justifications, if applicable. We will then define what counts as a successful solution to a debugging problem, based on a minimum loss solution measure for debugging a set of entailments.

6.1.1 Defining debugging problems

Our definition of a debugging problem is based on the commonly used informal notion of ontology debugging: given an ontology and some erroneous entailments, rewrite or remove axioms in the ontology such that the errors are removed. First, recall that in this context we focus on finite entailment sets, that is, we only consider a finite entailment set εO of an ontology O of a specific type, such as the set of entailed direct atomic subsumptions involving satisfiable and unsatisfiable classes. As we discussed in Chapter 3, we can partition εO into the set of unwanted entailments εO^− and the remainder, the set of wanted entailments εO^+.

In order to allow us to talk about the effort required to solve a debugging problem, we separate the informal notion of debugging into its two aspects: the debugging problem and the debugging task. Generally speaking, the debugging problem consists of an ontology O and a set εO^− of unwanted entailments which the user considers to be erroneous. The debugging task is to find a solution to the debugging problem. A solution is simply a modification mod such that applying mod to O yields a new ontology mod(O) ⊆ O which entails none of the axioms in εO^−. It is easy to see, however, that accepting any such solution can lead to undesirable effects: in the most extreme case, simply removing all axioms from the ontology would be a solution for a debugging problem.

We therefore place an additional constraint on the acceptable solutions to a debugging problem: a modification mod must not only remove all errors, but also try to preserve the correct entailments in εO^+; that is, an acceptable solution is a minimum loss solution. Since it may not be possible to preserve all entailments in εO^+ (since, for example, some wanted entailments might entail unwanted ones), we loosen this restriction to preserving a maximal subset ε^* of εO^+. A debugging problem then considers not only the ontology and the set of unwanted entailments, but also includes the set of correct entailments εO^+, and the debugging task is extended to finding a minimum loss solution.

In summary, a debugging problem D = ⟨O, εO^+, εO^−⟩ consists of an ontology O and two sets of entailments εO^+ and εO^−, such that O |= α for all α ∈ εO^+ ∪ εO^−. The debugging task is to find a minimum loss solution to the given debugging problem:

Definition 6.1. A solution to a debugging problem D = ⟨O, εO^+, εO^−⟩ is a modification mod such that mod(O) ⊆ O and mod(O) ⊭ α for all α ∈ εO^−. A minimum loss solution to a debugging problem D is a modification mod such that

• mod is a solution to D,
• mod(O) |= α for all α ∈ ε^*, where ε^* ⊆ εO^+,
• and there is no solution mod′ such that mod′(O) |= α′ for all α′ ∈ εO′, where ε^* ⊂ εO′ ⊆ εO^+.

Note that in general we may consider a modification mod to be a series of manipulation steps including axiom or subexpression rewritings, removals, or additions. However, for the purpose of defining debugging effort in this chapter, we restrict modifications to the removal of axioms. This is in line with the concept of finding minimal repairs (i.e. finding a minimal hitting set across a set of justifications and removing it) and allows us to fix the subset relation between the original ontology O and the repaired ontology O′, which would not be possible if we allowed additions and rewritings. Fixing this subset relation is necessary to allow us to preserve not only the entailments in εO^+, but also any other entailments in the deductive closure of O, without explicitly specifying them in εO^+. Otherwise, rewriting O to yield εO^+ would always be considered the ideal solution, but this could lead to an ontology that is vastly different from the original O, which is obviously undesirable.

Finally, we can pin down the notion of a successful solution of a debugging problem: if the user can find a minimum loss modification mod, we consider the debugging problem to be solved successfully. A debugging problem is not solved if, after applying a modification, the modified ontology still entails some of the entailments in εO^−, and it is solved unsuccessfully if the applied modification is not a minimum loss modification, that is, more entailments were removed from εO^+ than necessary. Of course, there are other aspects to a successful and good repair strategy, such as the cognitive effort and time required to find a specific solution, which we have not yet considered in this definition.
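To make Definition 6.1 concrete, the sketch below enumerates removal-based modifications of a toy ontology of atomic subsumptions and keeps those that remove every unwanted entailment while preserving a maximal set of wanted ones. The entails function is a stand-in for a reasoner and only handles reachability over atomic subsumptions; all names and the brute-force enumeration are illustrative assumptions, not the approach taken by any repair tool discussed here.

```python
# Sketch: removal-based debugging on a toy ontology of atomic subsumptions.

from itertools import combinations

def entails(axioms, goal):
    """Toy 'reasoner': A ⊑ B is entailed iff B is reachable from A."""
    sub, sup = goal
    reached, frontier = {sub}, [sub]
    while frontier:
        current = frontier.pop()
        for a, b in axioms:
            if a == current and b not in reached:
                reached.add(b)
                frontier.append(b)
    return sup in reached

def minimum_loss_solutions(ontology, wanted, unwanted):
    """All removal sets that are solutions and keep a maximal set of wanted
    entailments (Definition 6.1), found by brute-force enumeration."""
    candidates = []
    for k in range(len(ontology) + 1):
        for removal in combinations(ontology, k):
            remaining = ontology - set(removal)
            if any(entails(remaining, e) for e in unwanted):
                continue                      # not a solution
            kept = {e for e in wanted if entails(remaining, e)}
            candidates.append((set(removal), kept))
    best = max(len(kept) for _, kept in candidates)
    return [c for c in candidates if len(c[1]) == best]

O = {("A", "B"), ("B", "C"), ("A", "D")}
wanted = {("A", "C")}          # ε+: A ⊑ C should be preserved
unwanted = {("A", "D")}        # ε−: A ⊑ D should no longer be entailed
for removal, kept in minimum_loss_solutions(O, wanted, unwanted):
    print(sorted(removal), sorted(kept))     # [('A', 'D')] [('A', 'C')]
```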

6.1.2 Justification encounters

Now that we have defined what we mean by a debugging problem, we proceed with an analysis of the different scenarios in which single and multiple justifications can occur in debugging problems. This includes a description of (simple) reading strategies users apply in order to find a suitable modification for a given scenario.

1. Single entailment

(a) Single justification: There exists only a single justification for the entailment.
Strategy: The entailment is repaired by removing one erroneous axiom (or several, if the error is caused by a set of incorrect statements) in the justification.

(b) Multiple justifications: There exist j distinct justifications for the entailment, some of which may have common axioms.
Strategy: Inspect and remove an axiom from every justification individually in a linear fashion. The user might attempt to find common axioms in order to obtain a smaller repair, for example by jumping back and forth between the justifications and memorising which axioms are shared.

2. Multiple entailments

(a) Single justification: Each of the k entailments has exactly one justification.
Strategy: Consider each entailment and its justification in isolation. If the tool re-computes the entailments after an axiom removal, a single modification might affect other justifications, thus reducing the total number of justifications to inspect. Otherwise, the user has to inspect every justification once and remove each incorrect axiom.

(b) Multiple justifications: Each of the k entailments has ji justifications.
Strategy: Inspect and modify every justification for each entailment individually. Again, the user might attempt to find common axioms between the justifications for a single entailment by jumping (scrolling) back and forth between the justifications. Current explanation tools (e.g. in Protégé 4), however, do not support switching between the justifications for different entailments.

In summary, the number of justifications a user encounters using a straightforward linear reading strategy is determined as follows: given a set of k entailments where the i-th entailment has ji justifications, the number of justifications requiring inspection is ∑1≤i≤k ji.

(For legibility reasons, we will drop the limit 1 ≤ i ≤ k from the sum in the remainder of this chapter and generally assume that i ranges over the number of entailments.)

6.2 Measuring effort

Our main goal is the reduction of user effort in the debugging process when confronted with justifications for single or multiple entailments. While we have some intuitions about the notion of user effort—the time taken to find a repair, the cognitive difficulties in finding a repair—we have not yet pinned down what exactly we mean when we talk about the effort required to solve a debugging problem, and how we can measure such effort. In what follows, we outline the different facets of user effort, and define a simple model for measuring effort for a given debugging problem.

6.2.1 The complexity of individual justifications

The cognitive complexity of OWL justifications has gained some attention in recent years, with some investigations into typical errors users make [RCVB09, CRVBP09] and user studies to test directly which types of justifications users find easy or difficult to deal with [HPS09, HBPS11a, NPPW12a]. For example, the study presented by Nguyen et al. [NPPW12a], which we briefly discussed in Section 2.3.3, ranks frequently occurring axiom subsets in justifications according to the number of study participants who correctly interpreted the natural language representation of the justification. However, the authors do not further investigate what properties cause these justifications to be difficult to understand for the study subjects.

In contrast, the complexity model constructed by Horridge et al. [HPS09] is based on various justification features and can be applied to any OWL justification, resulting in a complexity score which ranges from 0 (‘very easy’) to 2000 (‘very hard’) and beyond for ‘naturally occurring’ justifications. The model components are based on features, such as the different axiom types and class constructors found in the justification, and certain phenomena, for example whether the justification entails > ≡ A for some class A in its signature.

These approaches to determining the cognitive complexity of a justification are based on the idea that we can identify certain features of justifications, such as a specific reasoning pattern or metrics, which are easy or difficult for people to understand. Given such a set of features, we can assign a justification a complexity score which gives some indication of the difficulty a user may have in trying to understand the justification.

(a) Unordered justification: A3 v A4, A5 v A6, A2 v A3, A4 v A5, A1 v A2
(b) Ordered and indented justification: A1 v A2, A2 v A3, A3 v A4, A4 v A5, A5 v A6

Figure 6.1: Different representations of a justification for A1 v A6.

Representation of justifications  We have to bear in mind, however, that the difficulty of understanding a justification does not only depend on its intrinsic complexity (i.e. its complexity score), but also on the experience level of the user, as well as on its presentation. A user familiar with certain phenomena may deliberately search for such a phenomenon when trying to understand a justification, whereas a novice may not have this kind of coping strategy at their disposal. Regarding the presentation of a justification, the usual visual aids (e.g. in Protégé 4) when displaying OWL justifications on a screen are:

• Ordering [Kal06]: in simplified terms, given two axioms α and α′, if α contains entities on the RHS which occur in the LHS of α′, then α is placed before α′ (assuming the axiom list is read left-to-right or top-to-bottom). All other axioms are placed at the end. Overall, the justification axioms are ordered to contain the LHS of the entailment at the beginning, and the RHS of the entailment at the end.

• Indentation: assuming that α′ follows after α in a list of axioms which is ordered according to the above principles, α′ is indented. Axioms which do not have any predecessors that share expressions are not indented.

• Syntax highlighting: keywords in the OWL Manchester syntax are highlighted in colour to stand out from the entity names.

There have been no direct investigations into the effects of these visual aids on understanding OWL justifications. However, in the exploratory study which formed the basis for the complexity model, Horridge [Hor11a] found that users often skip steps in atomic subsumption chains, a technique which is greatly supported by ordering and indentation. As an example, consider an atomic subsumption chain as shown in Figures 6.1a and 6.1b. If the chain is unordered and not indented, the user will have to search through the chain, jumping back and forth between the axioms, and memorise which concept they are looking for. While not complex, such a justification could still

be considered challenging, where the challenge lies in the general difficulty of navigating in a non-linear fashion (see, for example, [JL80] for a discussion of the effects of sentence ordering on spatial reasoning). In contrast, if the chain is ordered and indented accordingly, a user—especially if familiar with this type of representation—will be able to recognise the chain structure and spot the entailment at a glance. The lack of ordering and indentation does not change the reasoning required to understand the justification, but it makes the justification as it is presented to the user more difficult to understand. Furthermore, since syntax highlighting (as well as line indentation) has long been a default feature of programming environments, it seems an obvious step to apply it to human-readable OWL syntax. In summary, we can reasonably assume that visual aids such as ordering, indentation, and syntax highlighting improve the way users navigate through axioms, thus reducing the difficulty a user has in understanding a justification.
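To make the ordering idea concrete, the following is a minimal sketch of a greedy chain ordering for atomic subsumption chains. It is not Protégé's or Kalyanpur's actual implementation: the Subsumption record and the order method are our own illustrative names, axioms are reduced to simple (LHS, RHS) pairs of class names, and the chain is built starting from the LHS of the entailment, which reproduces the layout of Figure 6.1b.

    import java.util.*;

    // A minimal sketch (not Protégé's implementation) of the ordering heuristic
    // described above: an axiom is placed before another if its RHS occurs on
    // the other axiom's LHS. Axioms are modelled as (lhs, rhs) pairs of class
    // names, which only suffices for atomic subsumption chains.
    public class JustificationOrdering {

        record Subsumption(String lhs, String rhs) {
            @Override public String toString() { return lhs + " SubClassOf " + rhs; }
        }

        /** Greedily builds the chain, starting from the LHS of the entailment. */
        static List<Subsumption> order(Set<Subsumption> justification, String entailmentLhs) {
            List<Subsumption> ordered = new ArrayList<>();
            Set<Subsumption> remaining = new HashSet<>(justification);
            String current = entailmentLhs;
            boolean progress = true;
            while (progress) {
                progress = false;
                for (Subsumption ax : remaining) {
                    if (ax.lhs().equals(current)) {  // RHS of the previous axiom occurs here
                        ordered.add(ax);
                        remaining.remove(ax);
                        current = ax.rhs();
                        progress = true;
                        break;
                    }
                }
            }
            ordered.addAll(remaining);               // all other axioms are placed at the end
            return ordered;
        }

        public static void main(String[] args) {
            Set<Subsumption> just = Set.of(
                new Subsumption("A3", "A4"), new Subsumption("A1", "A2"),
                new Subsumption("A5", "A6"), new Subsumption("A2", "A3"),
                new Subsumption("A4", "A5"));
            // Orders the unordered justification of Figure 6.1a into the chain of Figure 6.1b.
            order(just, "A1").forEach(System.out::println);
        }
    }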

From complexity score to effort We have not yet answered the question of how to measure the effort a user has to apply in order to successfully understand and debug an entailment using a single justification. As there is no practical2 way to measure the cognitive effort required by a person to solve a reasoning task, we have to base our measurements on more tangible metrics. Studies on human reasoning performance (e.g. [NBH+06]) commonly infer the complexity of a task from two values: the frequency of test subjects coming to an erroneous solution, and the time required to come to a solution. Thus, we can assign an effort (or complexity) score c to each justification which is provided by some complexity model and corresponds to two factors:

1. The likelihood of the user finding an unsuccessful solution or not solving the debugging problem at all. An indicator for this is the number of incorrect modifications a user makes before finding a correct solution.

2. The time required to find a successful solution.

2This is not taking into account the possibility of measuring brain activity during problem solving using an fMRI procedure. See, for example, the prediction of brain activities using the ACT-R system: http://act-r.psy.cmu.edu/actrnews/index.php?id=34.

6.2.2 A model for user effort

Following on from the complexity of single justifications, it is straightforward to see that multiple justifications generally increase the effort a user has to put into debugging one or multiple entailments—if the justifications are considered in isolation, as we will assume in this effort model for dealing with multiple justifications. Recall our discussion of the number of justifications a user has to inspect in different scenarios of justification and entailment encounters (Section 6.1.2), which range from exactly one to the sum of all justifications for all entailments in a set of multiple entailments. Now assume that each justification we encounter has a complexity score c which is provided by some complexity model and the times and error rates associated with the score. Not taking into account effects caused by learning (i.e. inspecting multiple justifications which contain a recurring pattern), fatigue, or representational issues (e.g. scrolling down a long list) when inspecting multiple justifications, we expect the total effort to be the sum of the individual

justification effort measures. For example, if justification J1 requires the user two minutes to find a correct modification, and J2 requires four minutes, we expect the overall debugging process to require six minutes. Likewise, if a user is likely to solve J1 at the first attempt and requires three attempts at modifying J2, the overall number of incorrect modifications will be four. Based on these assumptions, we define our model for measuring user effort for a given debugging problem as a straightforward sum of the individual complexity scores for the justifications encountered: given a set of k entailments with ji justifications each, the overall complexity is Σ cij for 1 ≤ i ≤ k, 1 ≤ j ≤ ji, where cij denotes the complexity score of a justification Jij for an entailment ηi. This is obviously only a worst case approximation, as we assume that a) the user deals with each justification individually, and b) the justifications are disjoint, that is, a modification of one justification does not affect any other justification. However, since we do not yet have a sufficiently good understanding of how users interact with justifications and what strategies they commonly apply, this approximation will have to suffice for our purposes.
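Written out, the base model is simply the double sum over all justifications encountered (a restatement of the definition above in standard notation, not an extension of the model):

    E \;=\; \sum_{i=1}^{k} \sum_{j=1}^{j_i} c_{ij}

where c_{ij} is the complexity score of justification J_{ij} for entailment \eta_i. In the worked example above, E = 2 + 4 = 6 minutes for the time measure, and E = 1 + 3 = 4 for the number of modification attempts.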

Reducing user effort Given this effort model, we can now also define the goal of ‘reducing user effort’: to exploit the justificatory structure of an ontology in order to reduce the overall time and error rates compared to those given by our effort model for encountering multiple justifications. This reduction may occur in two ways:

1. Introduce an alleviation factor aij < 1.0 that represents the reduction in complexity cij of a justification Jij caused by some additional support provided to the user. The reduced complexity of Jij is then cij ∗ aij (see the restated formula below).

2. Lower the number of justifications a user has to inspect. The reduction of the number of justifications can be represented by an alleviation factor of aij = 0 for a justification Jij that does not have to be ‘touched’ in the debugging process.

Note that for the purpose of defining the effort model, we assume that a user has to understand the justification in order to find a suitable repair; that is, the user has to inspect every axiom in the justification and apply mental reasoning strategies in order to understand how the axioms lead to the entailment. However, depending on the debugging strategy applied, the user may not have to inspect every axiom in order to find a suitable repair. Minimising the number of axioms encountered is a strategy mostly applied by semi-automated repair tools, such as the ontology revision tool proposed by Nikitina et al. [NRG12] (see the discussion of the revision approach in Section 2.4.2), in which users are presented with a series of axioms and have to make a simple accept/reject decision for each axiom. Reducing the number of axioms encountered is also one of the key aspects of the axiom ranking and the root and derived mechanism used in Kalyanpur's repair tool [Kal06]: by suggesting low-ranked axioms in (root) justifications for removal, the tool prevents the user from manually inspecting every axiom in a justification or set of justifications. In our effort model, a reduction in the number of axioms to inspect in a justification can also be represented by the alleviation factor aij, since fewer axioms to inspect can simply be regarded as a reduction in the overall complexity of the justification.
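With the alleviation factors just introduced, the base model of the previous section becomes (again only a restatement of the definitions above):

    E' \;=\; \sum_{i=1}^{k} \sum_{j=1}^{j_i} a_{ij} \, c_{ij}, \qquad 0 \le a_{ij} \le 1

with a_{ij} = 1.0 corresponding to a justification that receives no additional support, a_{ij} < 1.0 to a justification whose complexity is reduced by some intervention, and a_{ij} = 0 to a justification that does not have to be inspected at all.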

6.3 Coping strategies

Having defined how to measure a reduction in the user effort required to debug a set of justifications, we will now discuss a number of strategies which exploit the justificatory structure of an ontology in order to reduce the effort required by a user to debug one or multiple entailments. We will first outline the scenarios in which justifications can share axioms or be isomorphic, before moving on to suggestions of how the structural properties of a justification set can reduce user effort.

6.3.1 Characterising justification sets

In our analysis of justification encounters and debugging effort in the previous sections of this chapter we treated justifications as entirely independent entities, paying no attention to the structural relations between them. But we already know that there exists a variety of relations between them, such as overlap and structural isomorphism, which we discussed in detail in Chapters 4 and 5. Before proceeding with a discussion of potential interventions to mitigate the effects of multiple justifications, we will give an overview of the structural characteristics of justification sets. This will allow us to clearly identify those situations in which our proposed interventions will help, and those in which users need to fall back onto other strategies. Given a set of multiple justifications, the set can have the following properties:

1. All justifications are disjoint and structurally different.

(a) If the justifications are individually easy, there is no additional support required. While potentially tedious, users may eventually arrive at a suitable solution to the debugging task by dealing with the justifications one by one.

(b) In the worst case, the justifications are individually hard and there are no structural relations to take advantage of, such that the justifications need to be tackled individually. Here, aids for understanding individual justifications, such as justification based proofs or natural language explanations which we have introduced in Section 2.3.3, may help reduce the complexity of the individual justifications.

(c) If some justifications are easy and others hard, ranking them based on some complexity model and presenting the easiest justifications first may decrease the likelihood of the user giving up immediately, while also taking advantage of learning effects over time. Further, complex justifications will benefit from the additional support mentioned in (b).

2. Some of the justifications overlap.

(a) If all justifications overlap in some axioms and no two are disjoint, focus on the most frequent overlaps, e.g. by identifying an error in the shared axiom set, or by generating relevant lemmas to support users

in understanding the overlapping subsets. This technique is discussed in Section 6.3.2.

(b) If there are some overlapping and some independent justifications, focus first on those which overlap, then continue as outlined in 1.

3. Some justifications are isomorphic.

(a) Group isomorphic justifications in order to support the user in understanding the justification template first before tackling the individual justifications. The effect of isomorphic justifications is discussed in Section 6.3.3.

(b) If there exist isomorphic justifications which also overlap in some axioms, apply both grouping and lemmatisation of the shared axiom set. This is further discussed in Section 6.3.4.

(c) Treat independent justifications individually as outlined in 1.

6.3.2 Justification overlap

Single-axiom overlap Justification overlap containing only single axioms, as discussed in Section 4.3.1, has been one of the key aspects of reducing debugging effort since the early days of justification based explanation [SC03, Kal06]. Shared axioms are essential to finding minimal repairs, that is, minimal hitting sets across a set of justifications. The ontology editor Swoop uses the frequency of an axiom to compute a rank for the axiom, with high frequency axioms being recommended for removal, while the explanation tab in Protégé 4 displays next to each axiom the number of justifications the axiom occurs in; screenshots of these two tools displaying axiom metrics were shown in Figures 2.2 and 1.1. Single-axiom overlap can reduce user effort in several ways: first, indicating the frequency of an axiom may provide some hints towards a potentially erroneous axiom. If this is truly the case and an axiom occurring in Σ ji justifications turns out to be erroneous, the effort for successfully repairing the unwanted entailment is reduced to the effort required to understand and modify or remove this one axiom. Second, if the high-frequency axiom itself is considered to be correct, it may still cause an error by interacting with other, incorrect axioms in the justification set. In this case, the number of justifications that need to be inspected remains at Σ ji; however, the number of axioms that the user encounters is lowered. This means that pointing out the shared axiom indirectly reduces the effort required to understand each one of the justifications, as the user is already ‘familiar’ with the shared axiom and only needs to understand how it interacts with the remaining axioms in each justification. Hence, we can introduce an alleviation factor a which indicates the reduction in effort required to understand the individual justifications. The overall effort is then reduced from Σ cij to Σ cij ∗ aij for aij < 1.0. And third, the high-frequency axiom may only occur as a ‘bridging’ axiom which does not play a role in the actual conflict that leads to the unwanted entailment. This is the case, for example, in root and derived justifications for unsatisfiable classes, where the cause of the unsatisfiability of the classes lies within the root justification, whereas the remaining axioms in a derived justification simply ‘bridge’ the relationship between the classes, as discussed in Section 4.3.3. If the shared axiom is a bridging axiom, the reduction of effort is similar to the previous case: the effort required to understand each justification is reduced by a factor aij due to the reduction of the number of axioms the user encounters.
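As an illustration of the axiom frequency metric discussed above, the following minimal sketch counts, for each axiom, the number of justifications it occurs in and ranks the axioms accordingly. The class and method names are ours, and this is not the ranking code of Swoop or Protégé 4; only the OWLAxiom type from the OWL API is assumed.

    import org.semanticweb.owlapi.model.OWLAxiom;
    import java.util.*;

    // A minimal sketch of frequency-based axiom ranking over a set of
    // justifications (each a set of OWLAxioms).
    public class AxiomFrequency {

        /** Counts, for each axiom, the number of justifications it occurs in. */
        public static Map<OWLAxiom, Integer> frequencies(Collection<Set<OWLAxiom>> justifications) {
            Map<OWLAxiom, Integer> freq = new HashMap<>();
            for (Set<OWLAxiom> just : justifications) {
                for (OWLAxiom ax : just) {
                    freq.merge(ax, 1, Integer::sum);
                }
            }
            return freq;
        }

        /** Returns the axioms ranked by descending frequency (candidates for removal). */
        public static List<OWLAxiom> rankByFrequency(Collection<Set<OWLAxiom>> justifications) {
            Map<OWLAxiom, Integer> freq = frequencies(justifications);
            List<OWLAxiom> ranked = new ArrayList<>(freq.keySet());
            ranked.sort((a, b) -> Integer.compare(freq.get(b), freq.get(a)));
            return ranked;
        }
    }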

Justification equality The most striking effect of exploiting justificatory structure on the effort required for debugging a set of entailments is the case of justifications which are simply the same set of axioms; a justification property which we introduced as justificatory redundancy in Section 4.3.2. Breaking a single justification instantly repairs all entailments that depend on it and have no additional justifications. Considering the user effort based on our simple effort model, given k entailments to debug with a single justification each, a user would have to inspect k justifications to debug all entailments. This is where the model is rather inaccurate, as we can assume that typical user behaviour would be to inspect one justification, apply a modification to it, then move on to the next justification, rather than inspecting all justifications first before applying a modification to one justification. If, after applying a modification, a reasoner is used to classify the ontology, the user would then immediately notice the repair effect on the other entailments. On the other hand, we can imagine that a user actively tries to find a minimal repair, thus inspecting all justifications first before applying any modification. In this case, the effort model still applies. Regardless of the original effort to start with, it is straightforward to see that pointing out the equality between multiple justifications reduces the number of justifications to inspect from Σ ji to one in the case where the entailments do not have any additional justifications. That is, the total effort required for the debugging task will be based on the complexity cij of that one justification, plus that of any additional justifications.

Root and derived justifications We already discussed root and derived justifications in detail in Section 2.3.4. Even if the justifications are not strictly equal, the existence of subset relationships can significantly reduce user effort in the debugging process. A straightforward application of subset relationships in a list-based justification representation is the ordering of justifications to present users with the root justifications first. Using a j-graph representation for a set of entailments and justifications, for example, the root justifications could simply be highlighted (for example by rendering them in colour) to signify to the user that these justifications need to be dealt with first. Regarding the reduction in user effort caused by root and derived justifications, consider a set of k entailments with a total of j justifications. Assume the set of justifications is partitioned into a set of root justifications Justsr and a set of derived justifications Justsd, that is, |Justsr| + |Justsd| = j. In the presence of root and derived justifications, the number of justifications a user has to repair in order to find a suitable repair for all entailments corresponds to the number of root justifications |Justsr|. Given that |Justsd| > 0, the number of justifications to inspect will always be lower than the initial number given by our model, with the reduction in effort depending on the ratio between root and derived justifications. As an effect, we can simply introduce an alleviation factor of aij = 0 for all derived justifications Jij.
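A minimal sketch of how a set of justifications could be split into root and derived justifications via proper subset checks is given below. It is an illustration of the subset relationship only, not Kalyanpur's root/derived algorithm, and the class and method names are hypothetical.

    import org.semanticweb.owlapi.model.OWLAxiom;
    import java.util.*;

    // A minimal sketch of partitioning justifications into root and derived
    // justifications using proper subset checks.
    public class RootDerived {

        /** A justification J counts as derived here if some other justification
         *  in the set is a proper subset of J; otherwise it is a root justification. */
        public static boolean isDerived(Set<OWLAxiom> j, Collection<Set<OWLAxiom>> all) {
            for (Set<OWLAxiom> other : all) {
                if (!other.equals(j) && j.containsAll(other)) {
                    return true;
                }
            }
            return false;
        }

        /** Splits the justification set into roots (index 0) and derived (index 1). */
        public static List<List<Set<OWLAxiom>>> partition(Collection<Set<OWLAxiom>> all) {
            List<Set<OWLAxiom>> roots = new ArrayList<>();
            List<Set<OWLAxiom>> derived = new ArrayList<>();
            for (Set<OWLAxiom> j : all) {
                (isDerived(j, all) ? derived : roots).add(j);
            }
            return List.of(roots, derived);
        }
    }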

Arbitrary overlap While the effects of justification equivalence and root and derived justifications are immediate, arbitrary justification overlap has a more indirect impact on debugging effort. As we have shown in Section 4.3.3, arbitrary overlap can generate a relevant lemma, that is, an intermediate entailment of a subset of a justification. Such a common lemma which occurs in multiple justifications can support a user in understanding multiple justifications by prompting chunking, an effect commonly described in cognitive science [Mil56, GLC+01]. The restricted working memory of humans is known to be a limiting factor for a person's ability to process information. Cognitive science research widely agrees that the number of items a human can hold in their working memory at any time is fairly small, with figures commonly ranging from four items [HBMB05] to ‘the magical number seven’ [Mil56].3 The process of chunking, however, helps mitigate this limitation by grouping multiple items into one single chunk, which corresponds to a single item held in memory. Thus, chunking practically increases the number of items a human can deal with at the same time. Lemmatisation of an overlapping justification subset corresponds to such chunking, as it groups a number of items—the individual axioms—into a chunk of which only the lemma is relevant to the user. This reduces the overall complexity of the justification set in two ways: first, the lemmatisation makes each individual justification easier to understand. And second, the complexity of understanding each justification is further reduced due to the lemma reoccurring in the justification set. Thus, while the number of justifications to inspect remains the same, the complexity of each justification is reduced by a factor aij.

6.3.3 Isomorphism relations

We consider the isomorphism relations introduced in Chapter 5 to be equivalence relations, that is, they partition a set of justifications into subsets of structurally similar justifications. This effect can be used to provide improved debugging support by grouping justifications into their equivalence classes and presenting the user with the abstract template for each group, as shown in Section 5.2.1. Given a set of justifications for a single or multiple entailments, we first partition the justifications into the sets of isomorphic justifications according to some notion of isomorphism. These subsets are then arranged to show the user the template Θ to give an abstract explanation of the entire set of structurally similar justifications. Similar to the exploitation of arbitrary overlap, such a grouping technique does not directly reduce the number of justifications to be inspected, but it reduces the overall complexity of the justification set the user is dealing with: by understanding the template first, we expect the user to spend less time and have less difficulty when repairing each individual justification. As we cannot apply a repair to a template (since it does not correspond directly to any material axioms in the ontology), the user will still have to repair each justification; the effort for

3Note that Miller's ‘magical number’ paper has been frequently mis-used to generalise to entirely unrelated problems such as text comprehension. See, for example, Edward Tufte's archive page [Tuf] on misinterpretations of Miller's paper.

this repair then depends on the overlap relations between the justifications. However, understanding the template first will reduce the overall effort from Σ cij to Σ cij ∗ aij, where aij < 1.0 for the justifications Jij covered by the template.
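The grouping step itself is straightforward once an isomorphism check is available. The following minimal sketch partitions a list of justifications into equivalence classes using an isomorphism predicate that is passed in; the predicate stands in for the checker described in Chapter 5, and the class and method names are ours.

    import org.semanticweb.owlapi.model.OWLAxiom;
    import java.util.*;
    import java.util.function.BiPredicate;

    // A minimal sketch of grouping justifications into equivalence classes
    // under some isomorphism relation.
    public class IsomorphismGrouping {

        public static List<List<Set<OWLAxiom>>> group(
                List<Set<OWLAxiom>> justifications,
                BiPredicate<Set<OWLAxiom>, Set<OWLAxiom>> isomorphic) {
            List<List<Set<OWLAxiom>>> groups = new ArrayList<>();
            for (Set<OWLAxiom> just : justifications) {
                boolean placed = false;
                for (List<Set<OWLAxiom>> group : groups) {
                    // Comparing against one representative suffices because the
                    // isomorphism relation is assumed to be an equivalence relation.
                    if (isomorphic.test(group.get(0), just)) {
                        group.add(just);
                        placed = true;
                        break;
                    }
                }
                if (!placed) {
                    groups.add(new ArrayList<>(List.of(just)));
                }
            }
            return groups;
        }
    }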

Entity naming in justification templates In the examples for subexpression-isomorphism in Chapter 5, we used freshly introduced variable names for the abstractions from class, property, and individual names in a template. While this is straightforward to understand, there may be alternative ways of presenting these abstractions to a user which may be more suitable for understanding the similarity between the justifications. Given a set of classes (properties) in strictly isomorphic justifications which are mapped to a newly introduced variable x in a template Θ, the following strategies4 can be used for naming x:

1. If the classes (properties) have a common named superclass Cs (superproperty ps), use this to represent x; this superclass is also known as the least common subsumer [BKM99] (which we restrict to a named class in our case). Alternatively, as the least common subsumer may be too general, use a good common subsumer [BST07], i.e. a superclass which is not the least common subsumer, but the most representative and informative for a user. A minimal sketch of this strategy in OWL API code is given below.

2. If there is no common superclass or superproperty, we can attempt to generate a new entity name by simply listing all entity names, e.g. as a comma-separated list.

In the case of subexpression- and lemma-isomorphism, the approaches to naming abstract entities in Θ are less straightforward. For subexpression-isomorphism, we can attempt to find a common subsumer of the complex subexpressions that are mapped to a variable x. The situation is similar for lemma-isomorphic justifications: assume a set of justifications is lemma-isomorphic, that is, there exist lemmatisations of the justifications that are subexpression-isomorphic. Even in this case we can attempt to find common subsumers for the entities occurring in the lemmas to represent the entities in Θ. In either case, if there is no such common named subsumer, the naming has to default to using freshly generated variable names.

4The naming problem we are facing here is similar to that presented in Section 3.1.4 where we looked at suitable representations for subsumption relationships between equivalent class nodes in a class graph.
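A minimal sketch of the first naming strategy, assuming an OWL API 3.x-style reasoner interface: it intersects the inferred named superclasses of the classes mapped to a template variable, which yields the candidate common named subsumers but does not by itself select the least common subsumer of [BKM99] among them.

    import org.semanticweb.owlapi.model.OWLClass;
    import org.semanticweb.owlapi.reasoner.OWLReasoner;
    import java.util.*;

    // A minimal sketch of finding named classes that subsume all classes mapped
    // to a template variable x. Candidate selection (e.g. preferring a 'good'
    // common subsumer) is left open.
    public class TemplateVariableNaming {

        /** Returns the named classes that subsume every class mapped to x,
         *  or an empty set if no common named subsumer exists. */
        public static Set<OWLClass> commonNamedSubsumers(Set<OWLClass> mappedClasses,
                                                         OWLReasoner reasoner) {
            Set<OWLClass> common = null;
            for (OWLClass cls : mappedClasses) {
                // Superclasses of cls, including indirect ones (direct = false).
                Set<OWLClass> supers =
                    new HashSet<>(reasoner.getSuperClasses(cls, false).getFlattened());
                supers.add(cls); // a class trivially subsumes itself
                common = (common == null) ? supers : intersect(common, supers);
            }
            return common == null ? Collections.emptySet() : common;
        }

        private static Set<OWLClass> intersect(Set<OWLClass> a, Set<OWLClass> b) {
            Set<OWLClass> result = new HashSet<>(a);
            result.retainAll(b);
            return result;
        }
    }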

Levels of abstraction In Section 5.2.1 we established the notion of a preferred template Θp which is the smallest possible abstract representation of a set of isomorphic justifications. The type of isomorphism hereby determines the level of abstraction of the preferred template: strict isomorphism results in a template which is structurally identical to the original justifications, whereas subexpression- and lemma-isomorphism abstract from the actual structure of the justification axioms, resulting in a more high-level view. Note that there is no strict ordering of abstraction levels between subexpression-isomorphism and lemma-isomorphism: it is certainly possible for two justifications to be nearly strictly isomorphic with the exception of a subset S (such as a short atomic subsumption chain) in one of the justifications, such that the justifications are ‘only’ lemma-isomorphic. The template Θp will then, again, be structurally identical to the justification not containing S. In contrast, subexpression-isomorphism (on which lemma-isomorphism is based) may unite justifications that use a range of widely differing constructors; thus, the level of abstraction depends entirely on the numbers and types of transformations for each justification set, rather than the type of isomorphism. In line with the idea behind laconic justifications—presenting users with as little unnecessary detail as possible—it seems reasonable to use only preferred templates in a user-facing application. However, which level of abstraction is suitable to best support a user in understanding and repairing a justification is unclear: a high-level, abstract representation of the justification (in some sense an ‘explanation of the justification’) may help the user grasp the basic reasoning behind the justification, which can be useful in getting them ‘on the right path’. On the other hand, when it comes to modifying the justification in order to remove the entailment, a detailed understanding of the expressions and axioms may be of more use, as this will help in finding an appropriate modification. This was indeed one of the main concerns in the definition of preferred laconic justifications: to contain no superfluous parts, while being as close to the original justification as possible [HPS08]. However, what level of granularity for justification templates is appropriate in a debugging context remains an open question.

6.3.4 Combining isomorphism and overlap

Finally, we can also combine isomorphism and justification overlap: it is possible for a set of justifications to be structurally isomorphic, i.e. differ in most of their entity names, but have a common axiom set which results in a relevant lemma. In this case, the user can be presented with the justification template Θ of the isomorphic justifications, whereby the common axioms are represented by their concrete lemma rather than their abstract template. Combining isomorphism and overlap to create lemmatised templates reduces the effort required to understand the set of justifications in two ways: first, the overall effort to understand all justifications is reduced due to the abstract template. Second, the complexity of the template is then further reduced by lemmatising it. That is, as previously, the overall effort is reduced from Σ cij to Σ cij ∗ aij for aij < 1.0. However, due to the presence of a lemma, we expect the factor aij to be smaller compared to isomorphism without overlap.

6.4 Summary and conclusions

In this chapter, we have proposed different approaches to exploiting justificatory structure in order to reduce user effort in the ontology debugging process. In Section 6.1 we first defined a debugging problem as the problem of finding a minimum loss modification which can be applied to an ontology in order to remove all unwanted entailments, while retaining as much wanted information as possible. Using this definition of a debugging problem, we introduced a model (Section 6.2.2) for measuring the effort required to solve a debugging problem, which is based on the number of justifications and axioms that need to be inspected in order to find a suitable modification. By introducing an alleviation factor aij to be applied to the complexity of an individual justification, we can then quantify the reduction in effort achieved by individual coping techniques. In Section 6.3, we then presented a number of coping strategies that make use of different aspects of justificatory structure, and outlined how these strategies would affect the effort required to repair single or multiple entailments. While we do not have any experimental results on the effectiveness of the proposed measures, these suggestions lay out the different paths that can be taken in order to improve justification based debugging support by exploiting justificatory structure, while also proposing a way to measure the success of such strategies. While we have set the foundations for structure-based debugging support, it is clear that the application of such interventions in ontology tools will require further research into the way OWL developers read and understand justifications.

Chapter 7

A survey of justificatory structure

Following on from the introduction of various aspects of the justificatory structure of OWL ontologies and interventions for justification encounters in the previous chapters, this chapter presents a survey of the justificatory structure of a set of ontologies from the NCBO BioPortal, a curated collection of over 300 ontologies from the biomedical domain. Given the proposed structure-based coping strategies, we now want to know which structural aspects indeed occur in ontologies used in practice, and to what extent. While we have some knowledge about the occurrence of multiple justifications and some of the structural phenomena found in OWL justifications, e.g. [KPHS07, Lam07, Sun09, HBPS11b], to date there has been no thorough investigation of the complex relationships between justifications in a large, independently motivated ontology corpus. This chapter presents an investigation of the applicability of the theoretical concepts and interventions we have proposed in the previous chapters. The survey presented here indicates the significance of both the suggested sources of difficulty and the proposed structure-based interventions. In our case, we claim that the significance of a phenomenon or intervention corresponds to its prevalence in OWL ontologies used in practice: given some OWL ontology, how likely is it for an ontology developer to encounter a source of difficulty, and how likely is it that structure-based coping strategies exist that can be exploited to reduce this difficulty?

7.1 The BioPortal corpus

In this section, we will describe the properties of the test corpus used in our survey, and outline the justification generation process. The entailment extraction and justification generation comprise several filtering steps, which are motivated by the different types of entailments and justifications we have presented in Chapters


3 and 4: if we simply treated all justifications and entailments equally, regardless of their origin (native, mixed, or imported) or complexity (self-justification, atomic subsumption chain, or complex), we would run the risk of over- or understating features of the justificatory structure of the ontologies in our corpus. We therefore filter out entailments—and ontologies—which would skew the justification analysis towards irrelevant types of justifications.

7.1.1 Properties of the corpus

The purpose of this survey is to analyse the justificatory structure of a set of OWL ontologies which is representative of the range of ontologies used in practice. This would exclude tutorial and toy ontologies, such as the well-known Pizza and Koala ontologies, which were specifically built for the purpose of demonstrating certain OWL features, or might be very small and inexpressive. In order to prevent bias through hand-picking ‘suitable’ ontologies, our aim was to use an independently motivated corpus of ontologies. Thus, the choice was between a random sample of web ontologies (e.g. obtained from a web crawl) and an existing ontology repository. As the NCBO BioPortal ontology repository contains a large number of actively used and well-studied ontologies which cover a broad spectrum of size and expressivity, it seemed a suitable choice for an extensive survey of justificatory structure. BioPortal is a web-based ontology repository which provides over 300 ontologies published by research groups from the biomedical domain. It also includes the full set of OBO Foundry1 ontologies, which use a flat-file format that can be translated into OWL 2; therefore, the OBO ontologies were included in the corpus. The activity of the repository varies throughout the ontologies: while some are updated (if only sporadically) and offer several prior versions, others have not been modified since their first upload. The repository has seen some visible growth in recent years, with the number of downloadable OWL and OBO ontologies increasing from 218 in March 2011 [BHPS11] to 256 in May 2012.

7.1.2 Justification corpus preparation

Ontology corpus At the time of downloading (May 2012), 322 OWL and OBO files were listed in BioPortal, of which 256 could be downloaded via the BioPortal REST interface and successfully parsed by the OWL API.

1http://www.obofoundry.org/

Figure 7.1: The justification corpus preparation workflow. The stages shown are: BioPortal download (256 OWL ontologies); entailment extraction (209 ontologies processed); self-supporting entailment pruning (197 ontologies); justification computation (190 ontologies processed); imported entailments pruning (187 ontologies); and j-graph generation for complex justifications (78 ontologies).

The main reasons for download failures of the remaining ontologies were 403 (‘forbidden’) server errors, parser problems caused by actual errors in the ontology file, and the ontology not being available at the given URL. Each downloadable ontology was merged with its imports closure and serialised as an OWL/XML file. Imported axioms were annotated with their source ontology URI, while missing imports were ignored. A diagram of the full corpus preparation workflow is shown in Figure 7.1.

Stage 1: entailment extraction Due to the small number of ‘naturally occurring’ unsatisfiable classes in published OWL ontologies, it seems reasonable to extend our analysis from obvious ‘errors’, that is, unsatisfiable classes, to include atomic subsumptions which represent the class hierarchy of an ontology. We computed entailment sets for the 256 downloaded ontologies including entailments of the following two types: 1) the (ADInim)+ entailment set of atomic subsumptions of type A ⊑ B for named, satisfiable classes A and B of each ontology, including self-supporting entailments, and excluding tautologies such as A ⊑ ⊤, as well as 2) the (ADInim)+ set of atomic subsumptions of the type A ⊑ ⊥ for unsatisfiable classes A in the ontology. The entailments were generated as follows (these steps are sketched in OWL API code below):

1. Perform a consistency check on the ontology.

2. Classify: precomputeInferences(InferenceType.CLASS_HIERARCHY).

3. Extract entailed subsumption axioms using InferredAxiomGenerator.

For practical reasons, the processing time for the entailment generation was limited to a reasoner timeout (for consistency checking and classification) of 20 minutes.

Table 7.1: Overview of the data in sets Ssa and Su.

                Ssa       Su
Ontologies      187       10
Justifications  307,422   10,352
Entailments     110,333   419

The overall timeout per ontology for each experiment in Stages 1 to 3 was set to 90 minutes. The reasoners used were JFact version 0.9 and Pellet version 2.3.0. All experiments were run on a Mac Mini with a 2.7 GHz Intel Core i7 processor, with 16 GB RAM assigned to the JVM. In this stage, a total of 209 ontologies could be processed successfully using either Pellet or JFact, with the remaining 47 ontologies suffering from various errors such as inconsistency (7 ontologies), classification timeouts on both reasoners (9 ontologies), as well as UnsupportedFeatureException and OutOfMemory errors (13 ontologies) in the reasoners. 18 of these ontologies were discarded due to them containing no logical axioms, which was most likely caused by errors in the OWL/XML serialisation stage. For the remaining 209 ontologies, a total of 8,351,061 entailments were computed.
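The three extraction steps listed above can be sketched in OWL API code roughly as follows. This is a simplified illustration using OWL API 3.x-style signatures (as used at the time of the survey): timeouts, error handling, and the filtering of tautologies are omitted, and the concrete InferredSubClassAxiomGenerator is assumed as the generator for atomic subsumptions.

    import org.semanticweb.owlapi.apibinding.OWLManager;
    import org.semanticweb.owlapi.model.*;
    import org.semanticweb.owlapi.reasoner.*;
    import org.semanticweb.owlapi.util.InferredSubClassAxiomGenerator;
    import java.io.File;
    import java.util.Set;

    // A minimal sketch of Stage 1 (entailment extraction) following the three
    // steps above. The reasoner factory (Pellet or JFact) is assumed to be
    // available on the classpath.
    public class EntailmentExtraction {

        public static Set<OWLSubClassOfAxiom> extract(File ontologyFile,
                                                      OWLReasonerFactory reasonerFactory)
                throws OWLOntologyCreationException {
            OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
            OWLOntology ontology = manager.loadOntologyFromOntologyDocument(ontologyFile);
            OWLReasoner reasoner = reasonerFactory.createReasoner(ontology);

            // Step 1: consistency check.
            if (!reasoner.isConsistent()) {
                throw new IllegalStateException("Ontology is inconsistent");
            }
            // Step 2: classification.
            reasoner.precomputeInferences(InferenceType.CLASS_HIERARCHY);

            // Step 3: extract entailed atomic subsumptions (OWL API 3.x signature).
            return new InferredSubClassAxiomGenerator().createAxioms(manager, reasoner);
        }
    }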

Stage 2: self-supporting entailments removal In order to limit the justification generation and analysis to an interesting set of entailments, we pruned all self-supporting entailments (type T1, according to the classification introduced in Chapter 4) from the over 8 million entailments generated in Stage 1. This was simply done by removing the asserted axiom from the ontology, then performing an entailment check to see whether the axiom was still entailed, i.e. to see whether there were other non-self justifications. At this stage, 204 ontologies could be processed entirely without reasoner timeouts, whereas some of the ontologies contributing the most entailments timed out. Of the remaining 3,837,219 entailments, roughly one tenth (465,364) was found to have only self-justifications, and a total of 7 ontologies that contained only self-supporting entailments were discarded from the set. This resulted in 197 ontologies that contained 3,371,855 entailments that had some non-self justification.
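The Stage 2 test itself, removing the asserted axiom and re-checking entailment, could be sketched as follows. The class and method names are ours, and the sketch creates a fresh reasoner after each change rather than using the more efficient incremental setup one would use in practice.

    import org.semanticweb.owlapi.model.*;
    import org.semanticweb.owlapi.reasoner.OWLReasoner;
    import org.semanticweb.owlapi.reasoner.OWLReasonerFactory;

    // A minimal sketch of the Stage 2 check: an entailed axiom has only a
    // self-justification if it is asserted and no longer entailed once the
    // asserted copy is removed.
    public class SelfSupportCheck {

        public static boolean isSelfSupportingOnly(OWLOntology ontology,
                                                   OWLAxiom entailedAxiom,
                                                   OWLReasonerFactory factory) {
            if (!ontology.containsAxiom(entailedAxiom)) {
                return false; // not asserted, so it cannot be a self-justification
            }
            OWLOntologyManager manager = ontology.getOWLOntologyManager();
            manager.removeAxiom(ontology, entailedAxiom);
            try {
                OWLReasoner reasoner = factory.createReasoner(ontology);
                // Still entailed without the asserted copy => some non-self justification exists.
                return !reasoner.isEntailed(entailedAxiom);
            } finally {
                manager.addAxiom(ontology, entailedAxiom); // restore the ontology
            }
        }
    }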

Stage 3: justification generation and filtering In the next stage, the justifications were generated for the remaining entailments.

Table 7.2: Overview of OWL 2 profiles.

Profile       Ontologies
OWL 2 Full    15
OWL 2 DL      162
OWL 2 EL      92
OWL 2 QL      53
OWL 2 RL      40

The high runtime of the justification generation process made it necessary to limit the number of justifications generated to a reasonably large sample. Thus, the justification generation was restricted to a random sample of 1,000 entailments per ontology and 500 justifications per entailment, which, based on previous studies [BHPS11], was expected to yield a good balance between efficiency and size of the data set. The timeouts were set to 5 minutes per justification and 90 minutes per ontology. In this stage, justifications could be generated for 115,675 entailments from 190 ontologies. 84 of these ontologies contained more than 1,000 entailments, in which case we generated the justifications for a random sample of 1,000 entailments. These data were then filtered to discard imported justifications, resulting in the removal of 4,923 purely imported entailments and 3 ontologies which contained only imported entailments. The final entailment set consisted of 317,774 native and mixed justifications for 110,752 entailments from 187 ontologies, including 419 unsatisfiable classes from 10 ontologies. For the purpose of analysing the corpus, it was split into the sets of all justifications for entailed atomic subsumptions involving satisfiable classes (Ssa) and those involving unsatisfiable classes (Su). Table 7.1 shows an overview of the numbers of entailments, justifications, and ontologies in the two sets.

Ontology properties The 187 ontologies in the corpus span a broad spectrum of sizes and expressivities, ranging from ontologies with only 4 classes and 5 logical axioms to as many as 12,195 classes and 79,180 axioms. Some of the basic metrics for the ontologies in the corpus are shown in Tables 7.2 and 7.3. A complete list of the relevant ontologies (i.e. those containing some complex justifications) including their metrics can be found in Appendix A. Regarding their expressivity, 162 of the 187 ontologies are OWL 2 DL ontologies, of which 92 are in the OWL 2 EL profile, 53 in OWL 2 QL, and 40 in OWL 2 RL.

Table 7.3: Overview of the basic ontology metrics in the corpus.

                        Mean    Median   Min   Max
Classes                 2,206   395      5     38,640
Object properties       22      6.5      0     431
Data properties         10      0        0     488
Individuals             160     0        0     7,559
Logical axioms          4,857   810.5    19    79,180
Entailments (sampled)   609     710.5    1     1,000

The remaining 15 ontologies were OWL 2 Full. Note that the three OWL 2 profiles are not exclusive, that is, an ontology can fall into several profiles. The description logic complexity in the corpus ranges from EL++ (the 92 ontologies in the OWL 2 EL profile) to full OWL 2 DL expressivity (SROIQ, 5 ontologies), with 14 ontologies being in a highly expressive description logic (SHQ or SRQ including either inverse properties I or nominals O).

7.2 Results of the BioPortal survey

In this section we will present the results of several experiments carried out on the BioPortal ontology corpus. The survey covers the most relevant aspects of justificatory structure we have introduced in the previous chapters: types and numbers of justifications, overlap (equality, root and derived, arbitrary overlap, axiom frequency) between justifications, and justification isomorphism. A discussion of the results follows in the next section.

7.2.1 Entailment types

In order to determine the distribution of entailment types in the corpus, the set of justifications was partitioned into self-justifications, atomic subsumption chains, and complex justifications, and their corresponding entailments. Table 7.4 shows the distribution of entailment types over the 110,333 entailments in Ssa and the

419 entailments in Su. Note that a small number of entailments appears to be of type T1 (as defined in Chapter 4) despite the previous removal of self-supporting entailments; this is caused by the fact that we could not compute additional justifications for these entailments (e.g. due to timeouts).

Table 7.4: Entailment types in sets Ssa and Su.

Type   Description                                   Ssa       Su
T1     self-justifications only                      23        0
T2     atomic subsumption chains only                90,464    0
T3     self-justifications and atomic subs. chains   74        0
T4     complex justifications only                   4,985     419
T5     self-justifications and complex               1,043     0
T6     atomic subsumption chains and complex         13,558    0
T7     all justification types                       186       0
       total entailments                             110,333   419

We can clearly see that the majority of entailments in Ssa (82%) have only atomic subsumption chain justifications. This also affects the majority of ontologies in the corpus: 109 out of the 187 ontologies for which we generated justifications in Ssa contained no complex justifications at all, but only atomic subsumption chain justifications and small numbers of self-justifications. While ranging in size from 5 to over 77,000 axioms, most of these ontologies are only of weak expressivity: 79 (72.4%) of the 109 ontologies that contain only trivial entailments and justifications are in the description logic EL++, 3 are in SHI, SHIF, and SRIQ, respectively, and the remaining 27 ontologies are in variations of AL. On the other hand, 19,772 (17.9%) of the entailments in Ssa have at least one complex justification.

All entailments involving unsatisfiable classes in Su have only complex justifications. Presumably this is because self-justifications or atomic subsumption chain justifications for an unsatisfiable class would require an axiom of type A ⊑ ⊥ to be asserted in the ontology, which is unlikely.

Note that due to the small number of entailments and ontologies in Su and the domination of one ontology (322 of the 419 entailments are contributed by the

Animal Natural History ontology), the results for Su in this section are mainly given for completeness where appropriate.

7.2.2 Occurrence of multiple justifications

One of the main issues we have set out to explore in this thesis is the occurrence of multiple justifications in OWL ontologies: given an entailment of an ontology, how likely is it that this entailment has several justifications, and what are the chances of encountering an entailment with a high number of justifications? We generated the j-graphs for the complex justifications of the entailments of types

T4 through T7 in sets Ssa and Su. The average time taken to compute each j-graph using the existing justifications was less than 10 seconds. This resulted in j-graphs for a total of 145,689 complex justifications for 19,772 entailments from 78 ontologies in set Ssa, which is an average of 253 entailments and 1,868 justifications per ontology. In the remainder of this section, we will use Ss to refer to the set of complex justifications in Ssa. Note that the justification count indicates the number of justification nodes in the j-graph, which takes into account the fact that some justifications have multiple entailments.

The average number of (complex) justifications per entailment in Ss is surprisingly high, with 7.8 justifications (standard deviation σ = 26.1, median m = 3) and a maximum of 500 generated justifications for 17 entailments in 4 ontologies. Figure 7.2 shows the frequency of multiple justifications for the entailments and ontologies in Ss.2 Approximately one third (30.0%) of the entailments in the corpus have exactly one justification, whereas 18.1% have exactly 2 justifications; half of the entailments (52%) have 3 or more justifications, and a significant proportion (15.7%) of the entailments reach 10 or more justifications. The ontology frequency plot in Figure 7.2 shows that these numbers are not only caused by single ontologies which happen to have unusually large numbers of justifications. 69 out of the 78 ontologies (88.5%) in the set contain entailments with 2 or more justifications, 58 ontologies have entailments with 3 or more justifications, and 19 ontologies contain some entailment with 10 or more justifications. Interestingly, 8 ontologies in the set contain only multiple justifications for their entailments.

In Su, the average number of justifications per entailment is 24.1 (σ = 12.8, m = 32), with a maximum of 55 justifications for entailments in the Quantitative Imaging Biomarker ontology. 7 of the 10 ontologies contain only entailments with exactly 1 justification, whereas the remaining 3 ontologies contain only en- tailments with multiple justifications. While we might expect to see a correlation between the size of an ontology and the number of complex justifications it generates, we found that there are no obvious indicators for the occurrence of multiple justifications in an ontology.

2Note that the ‘long tail’ of the plot has been cut off for presentation purposes.

Figure 7.2: Frequency of multiple complex justifications in the corpus.

The Spearman rank coefficient3 ρ = 0.09 (p = 0.43) indicates no correlation between the number of logical axioms in an ontology in set Ss and the number of justifications per entailment. Due to the small number of ontologies in Su, the correlation analysis was only performed for Ss.

Ontology activity and axiom power The 145,689 complex justifications in Ss contain a total of 22,371 axioms, with an average size of 7.9 axioms per justification (σ = 4.3, m = 7). On average, one fifth of the axioms in an ontology (22.2%, σ = 20.4%, m = 15.1%) are active in complex justifications for their entailed atomic subsumptions, with some ontologies having as many as 75.5% of their axioms participating in justifications. Note that the small number of axioms compared to the number of justifications indicates a high amount of shared axioms in the corpus; we will focus on the matter of axiom overlap below. Intuitively, we would expect to see an obvious correlation between the number of entailments in an ontology and its activity (the proportion of active axioms), since more entailments would imply more justifications and axioms occurring in them; this is somewhat confirmed by ρ = 0.49 (Pearson’s correlation coefficient) indicating a weak linear correlation (p < 0.001).

3A coefficient ρ of +1 (-1) indicates a strong positive (negative) correlation between two variables, whereas a ρ of 0 indicates no correlation.

Graph components In order to determine the overall connectedness of justifications, we consider the number of connected components in each j-graph. On average, each graph contains 4.6 (σ = 9.8, m = 2) connected components, but almost half of the graphs in Ss (37 out of 78) consist of exactly one component. The largest number of components can be found in the International Classification for Nursing Practice ontology, which has 1,063 complex justifications for 762 entailments that are split up over 68 components. As we have mentioned in previous chapters, the justification finding algorithm uses the Hitting Set Tree algorithm, which depends on optimisations such as justification reuse. This implies that largely disjoint justification sets may have a negative impact on the performance of the ‘find all’ algorithm. However, there seems to be no correlation between the number of components in a j-graph and the average time required to find a justification (ρ = 0.13, p = 0.25), or the number of components and the number of calls to the ‘find one’ subroutine (ρ = 0.06, p = 0.58).4

7.2.3 Justification overlap

Inferential power of justifications While multiple justifications per entailment occur very frequently in the corpus, the number of justifications with multiple entailments is comparatively low, with only 709 justifications in Ss having

tiple entailments is comparatively low, with only 709 justifications in Ss having

out-degrees ≥ 2 in the j-graph. On average, a justification in Ss has 1.1 entail- ments (σ = 2.2, m = 1), with a maximum of 120 entailments for 16 fairly small justifications (containing 5 to 7 axioms) in the SNP ontology. Across the corpus, only 17 out of the 78 ontologies contain justifications with multiple entailments. There is no correlation between the size of a justification and its number of entailments (ρ = -0.07, p < 0.001). In fact, the justifications with higher inferential power tend to be smaller: the average size of justifications that have only one entailment is 7.9, with some of the largest justifications (20 and more axioms) having only one entailment, whereas the average size of a justification with multiple entailments is 4.2 axioms.

Axiom frequency, impact, and semantic relevance Regarding the frequency, impact, and relevance of the axioms in the corpus, we want to find out

4Note, however, the high p-values, which indicate that these findings are not statistically significant.

how frequently axioms occur in multiple justifications, and how many entailments these axioms affect. On average, an axiom occurs in 51.3 justifications (σ = 242.1, m = 5) of the 1,868 justifications per ontology in set Ss, with some key axioms in several ontologies occurring in thousands of justifications. The impact values are close to the frequency, with an average of 53.2 entailments (σ = 254.1, m = 5) per axiom (out of 253 entailments per ontology). Regarding their semantic relevance, almost half (48.4%) of the axioms in Ss do not have any dependent entailments, whereas 51.6% have some entailments which could be broken by removing the single axiom. Interestingly, for nearly all of these axioms (9,328 out of 11,552) all the entailments they affect are dependent on them; in other words, high-frequency axioms tend to occur in all justifications for an entailment. Figures 7.3a, 7.3b, and 7.3c show the distributions of axiom frequency, impact, and semantic relevance values across the justification axioms in Ss, where each data point represents the number of axioms (y-axis) that have a certain value (x-axis). Notably, all the ontologies in the corpus (even those that contain only small numbers of justifications) contain some axiom which occurs in multiple justifications, and 23 ontologies contain some axiom which occurs in over 50% of the justifications in the ontology. On average, the one axiom with the highest frequency in an ontology occurs in 36.3% of all justifications of an ontology. An example of such a high-frequency axiom is the simple atomic subsumption Disease ⊑ Disposition, which is used in 3,119 (96.5%) of the 3,232 justifications in the NIF Dysfunction ontology and affects 423 entailments. Another example is the domain axiom domain(measuredBy, MentalConcept) in the Cognitive Atlas ontology, which occurs in 92.5% of the 830 justifications for 126 entailments.

Root and derived justifications Root and derived relationships occur frequently across the corpus: 73.4% of the justifications in Ss are derived, and 7.4% of the justifications are root justifications which have derived justifications. The remaining 19.2% are root justifications that do not have any justifications that are derived from them, that is, they are either additional independent justifications for derived entailments, or justifications for entirely independent root entailments. Table 7.5 gives an overview of the root and derived relationships in the corpus; root justifications that are subsets of derived justifications are denoted by Justsrsub, root justifications that are not subsets are denoted by Justsr,

(a) Distribution of axiom frequency values.

(b) Distribution of axiom impact values.

(c) Distribution of semantic relevance values.

Figure 7.3: Distributions of axiom frequency, impact, and semantic relevance.

Table 7.5: Root and derived justifications in Ss and Su, in number of justifications and proportion of the total.

              Justsrsub           Justsd              Justsr
Set   Total     Count     %       Count      %        Count    %
Ss    145,689   10,723    7.4%    107,002    73.4%    27,964   19.2%
Su    10,352    115       1.1%    10,184     98.4%    53       0.5%

and derived justifications are denoted by Justsd. Looking at the repair impact of root justifications, we find that a justification in Justsrsub has, on average, 17 justifications which are derived from it (σ = 92.8, m = 3), and 5.4 entailments (σ = 19.5, m = 2) out of the 253 entailments per ontology that are entailed by these derived justifications. In other words, while a root justification may have a fairly large number of justifications (17) that are derived from it, the number of entailments that can be repaired by fixing a single root justification is comparatively low (5.4). The subset relationship between root and derived justifications is also visible in the size of justifications: on average, a root justification contains 4.7 axioms, whereas a derived justification has a size of 8.4 axioms. 7.6% of the root justifications are indeed single axiom justifications which occur in a large number of derived justifications. However, while we may expect a small justification to be more likely to be contained in derived justifications, there is only a weak correlation between the size of a root justification and the number of justifications that are derived from it (ρ = −0.26, p < 0.001).

70 of the 78 ontologies in Ss contain some derived justifications. 6 of the 8 remaining ontologies only contain very small numbers of justifications, which do not lend themselves to subset relationships, whereas the Cancer Research and Management ontology and the Gene Ontology Extension contain no derived justifications despite having 348 and 6,421 justifications, respectively.

In the set Su, the effect of root and derived justifications is even more pronounced: 98.4% of the justifications are derived from only 1.1% of the justifications, and the remaining 0.5% are root justifications which have no derived justifications. On average, a root justification has 88.6 derived justifications (σ = 208.6, m = 2) which have 48.8 entailments (σ = 109.3, m = 2). Note how this stands in contrast to the rather small number of entailments (5.4) that depend

(a) Overlap frequency for all ontologies in Ss. (b) Overlap frequency excluding Amino Acid and Basic Vertebrate Anatomy.

Figure 7.4: Overlap frequency with and without outlier ontologies. Bubble size indicates frequency.

on root justifications in Ss. 5 of the 10 ontologies in Su contain root and derived unsatisfiable classes, whereas the remaining 5 ontologies contain only very few unsatisfiable classes (up to 4).

Arbitrary overlap In order to determine the numbers and sizes of justification overlaps with more than a single axiom, we applied the Formal Concept Analysis

(FCA) [GSW05] ‘next concept’ algorithm to the j-graphs in sets Ss and Su. This algorithm (often simply referred to as ‘Ganter's algorithm’) provides an efficient means for computing the largest shared axiom sets between justifications. The ToscanaJ FCA framework5 includes a straightforward implementation of Ganter's algorithm, which we used for overlap detection. Using the FCA algorithm, the justifications correspond to the objects in a context, and the axioms correspond to the attributes. Due to performance issues, the experiment was restricted to a random sample of at most 5,000 edges per graph.
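For illustration, the object/attribute framing can be mimicked with a naive computation of pairwise shared axiom sets and their frequencies, as sketched below. This is not Ganter's NextClosure algorithm or the ToscanaJ implementation used in the survey, and it scales far worse; it merely illustrates the idea of treating justifications as objects and axioms as attributes.

    import org.semanticweb.owlapi.model.OWLAxiom;
    import java.util.*;

    // A naive illustration of overlap detection: for every pair of justifications
    // we compute their shared axiom set and count how many justifications in the
    // set contain that intersection.
    public class OverlapDetection {

        /** Maps each pairwise shared axiom set of size >= 2 to the number of
         *  justifications that contain it. */
        public static Map<Set<OWLAxiom>, Integer> sharedCores(List<Set<OWLAxiom>> justifications) {
            Map<Set<OWLAxiom>, Integer> cores = new HashMap<>();
            for (int i = 0; i < justifications.size(); i++) {
                for (int k = i + 1; k < justifications.size(); k++) {
                    Set<OWLAxiom> overlap = new HashSet<>(justifications.get(i));
                    overlap.retainAll(justifications.get(k));
                    if (overlap.size() >= 2 && !cores.containsKey(overlap)) {
                        int frequency = 0;
                        for (Set<OWLAxiom> just : justifications) {
                            if (just.containsAll(overlap)) {
                                frequency++;
                            }
                        }
                        cores.put(overlap, frequency);
                    }
                }
            }
            return cores;
        }
    }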

Across the ontologies in Ss, overlaps between justifications occur frequently, with an average size of 10.8 shared axioms (σ = 5.7, m = 11) and an average frequency of 11.9 justifications (σ = 11.8, m = 9) that the shared axiom set occurs in, i.e. on average, 11.9 justifications share the same axiom set. Figures 7.4a and 7.4b show the frequency of overlaps of different size and frequency across the corpus, including and excluding the two ontologies which contribute the largest numbers of overlaps. For presentation purposes, the values on the y-axis (i.e. the

5http://toscanaj.sourceforge.net/

number of justifications an overlap of a certain size occurs in) were binned in 10th percentile steps, leading to all overlaps with a frequency greater than 24 falling into the top-most bin. The plot shows that the majority of overlaps (i.e. the largest bubbles) have a size of around 5 axioms and occur in up to 5 justifications. However, there are also overlapping cores of around 15 to 20 axioms which occur fairly frequently in up to 18 justifications; as we can see if we compare 7.4a to 7.4b, these large, high-frequency overlaps are almost exclusively contributed by the Amino Acid and Basic Vertebrate Anatomy ontologies. If we exclude these two ontologies from the set, the overlaps are reduced to an average of 5.5 axioms (σ = 2.5, m = 5) and a frequency of 7.8 justifications (σ = 14.3, m = 4).

Nearly all ontologies in Ss (75 out of 78) contain at least some overlapping justification subset, with two ontologies standing out as extreme outliers. The Amino Acid ontology contributes over half of the overlaps found in the corpus, which is rather surprising, as with 112 entailments, 477 axioms, and a description logic expressivity of ALCF, this ontology could be considered fairly small and inexpressive. However, if we take a closer look at the Amino Acid ontology, we find that it contains an axiom of the type AminoAcid ≡ A t C t D ... t Y with 20 named classes in the disjunction. This axiom, in turn, leads to a large number of justifications (2,652) containing subsumption axioms for each operand in the disjunction, plus a small number of 'bridging' axioms that cause the entailment to hold. For 1,782 of the 2,562 justifications, this axiom pulls in a large core of up to 29 other axioms, thus leading to the large numbers of overlaps observed in the ontology. A similar effect can be seen in the Basic Vertebrate Anatomy ontology, another fairly small ontology (99 classes, 386 axioms, SHIF) which, however, has a high number of object properties (74 compared to an average of 22.2). The ontology contains a number of axioms describing the relations between its object properties as part-whole relationships, such as has subdivision v has determinate part and is subdivision of ≡ has subdivision−. Again, these property axioms pull in other 'bridging' axioms, leading to cores of up to 17 axioms occurring in large numbers of justifications.

Both of these outlier ontologies have high degrees of overlap between multiple justifications for single entailments, but only small numbers of overlaps between justifications for several different entailments. This agrees with the general trend in the corpus, where on average those justifications which do share an overlap only have between 1 and 2 entailments (mean = 1.9, m = 1, σ = 5.1). In other words, overlap tends to occur mainly between justifications for the same entailment.

7.2.4 Justification isomorphism

In order to determine the frequency of isomorphic justifications in the corpus, we analysed the complex justifications for the entailments in the sets Ss and Su in three different experiments:

Exp. 1: justifications for a single entailment.
Exp. 2: justifications for all entailments within an ontology.
Exp. 3: justifications for all entailments of all ontologies across the corpus.

The isomorphism statistics were generated for all three types of isomorphism, that is, strict, subexpression-, and lemma-isomorphism. Due to performance issues, justifications containing more than 10 axioms or conjunctions/disjunctions with more than 5 operands were excluded from this part of the study. This excluded some dominant justifications (such as those containing the large disjunction axiom in the Amino Acid ontology mentioned above), but still resulted in a set of 141,560 justifications for 19,097 entailments in Ss, which is an average of 7.4 justifications per entailment (compared to 7.8 before these exclusions). Note that this also means that the total numbers of justifications given in this section for some of the ontologies will differ from the numbers of complex justifications listed in the overview table in Appendix A.

The mean times required to determine isomorphism between justifications for each of the three types (including parsing the justifications and existing templates from file) are listed in Table 7.6. Note that the experiments for the three isomorphism types were run in parallel, which could potentially cause a drop or fluctuations in performance (as the 'Within ontology' times for strict versus subexpression-isomorphism show). Furthermore, the checks within ontologies and across the corpus each reuse the templates found in the previous stage, which drastically reduces the number of comparisons that have to be carried out.
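To illustrate what the strict isomorphism check has to establish, the following sketch decides by brute force whether two small justifications can be made equal by an injective renaming of their entity names. It is only an illustration under simplifying assumptions (axioms modelled as nested tuples, no distinction between class and property names, exhaustive search over all renamings), not the optimised, template-reusing implementation evaluated in these experiments.

```python
from itertools import permutations

def entities(justification):
    """Collect all entity names in a justification; axioms are modelled as
    nested tuples whose first element is a constructor tag, for example
    ('SubClassOf', 'C1', ('some', 'p1', 'C4'))."""
    names = set()
    def walk(term):
        if isinstance(term, tuple):
            for sub in term[1:]:
                walk(sub)
        else:
            names.add(term)
    for axiom in justification:
        walk(axiom)
    return names

def rename(term, mapping):
    """Apply a renaming of entity names to an axiom term."""
    if isinstance(term, tuple):
        return (term[0],) + tuple(rename(sub, mapping) for sub in term[1:])
    return mapping[term]

def strictly_isomorphic(j1, j2):
    """Brute-force check: is there a bijection between the entity names of
    j1 and j2 under which the two axiom sets coincide? (The actual definition
    also requires classes to map to classes and properties to properties;
    this sketch ignores entity types and is exponential in the number of
    names, so it is only feasible for very small justifications.)"""
    e1, e2 = sorted(entities(j1)), sorted(entities(j2))
    if len(j1) != len(j2) or len(e1) != len(e2):
        return False
    return any({rename(a, dict(zip(e1, perm))) for a in j1} == set(j2)
               for perm in permutations(e2))

# Two justifications instantiating the same shape (cf. template Θ1 below):
jA = {('SubClassOf', 'A', 'B'),
      ('SubClassOf', 'B', ('some', 'p', 'D')),
      ('Domain', 'p', 'C')}
jB = {('SubClassOf', 'X', 'Y'),
      ('SubClassOf', 'Y', ('some', 'q', 'W')),
      ('Domain', 'q', 'Z')}
print(strictly_isomorphic(jA, jB))  # True
```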

Exp. 1: Individual entailments

Table 7.6: Mean times (in seconds) per ontology for isomorphism detection.

                                   Iso      S-iso    L-iso
  Exp. 1: Individual entailments   22.8     48.5     100.8
  Exp. 2: Within ontology          103.8    99.2     192.3
  Exp. 3: Across corpus            24.6     42.3     97.7

Strict isomorphism Due to restrictions on the types of justifications used in the isomorphism experiments, the number of justifications for the entailments in this subset of Ss is marginally lower than in the j-graph analysis, with an average of 7.4 justifications per entailment. Strict isomorphism reduces this number to an average of 4.9 templates per entailment (σ = 9.5, m = 2), which is a reduction of 33.7% compared to the full corpus. On average, a template covers 1.5 justifications (σ = 2.3, m = 1), with some ontologies containing entailments with large numbers of isomorphic justifications. One such example is the Orphanet Ontology of Rare Diseases, whose dominating templates are of the type

Θ1 = {C1 v C2, C2 v ∃p1.C4, domain(p1, C3)} |= C1 v C3

with atomic subsumption chains of arbitrary size in place of the first subsumption axiom, and some variations that include subproperty axioms. For several entailments in the ontology that have 220 justifications each, two templates of this type cover the majority of these justifications (110 and 105 justifications, respectively). From personal contact with the Orphanet developers we have learned that this OWL ontology is in fact generated automatically from an existing medical database, which explains the frequent occurrences of uniform justifications.
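To illustrate the shape of Θ1, a hypothetical instantiation (the class and property names are invented for illustration and do not stem from the Orphanet ontology) is {Genetic_Disorder v Rare_Disease, Rare_Disease v ∃has_symptom.Symptom, domain(has_symptom, Clinical_Entity)} |= Genetic_Disorder v Clinical_Entity, obtained by substituting Genetic_Disorder, Rare_Disease, has_symptom, Symptom, and Clinical_Entity for C1, C2, p1, C4, and C3, respectively. Any justification obtained by a different injective substitution of names is strictly isomorphic to this one and is therefore covered by the same template.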

Subexpression-isomorphism Subexpression-isomorphism across the justifications of individual entailments only affects a very small number of entailments in the corpus. The average number of templates per entailment, compared to the full corpus, remains the same when rounded,6 at 4.9 templates per entailment (σ = 9.5, m = 2). Compared to strict isomorphism, the number of templates is reduced by only 0.3%. This small reduction does not affect the average number of justifications per template, which remains the same at 1.5 justifications. Only 166 entailments (0.9% of the total corpus) from 15 of the 78 ontologies are affected by subexpression-isomorphism; the remaining 63 ontologies contain the same number of templates as for strict isomorphism.

6 The precise number of templates is 4.915 for strict and 4.903 for subexpression-isomorphism.

For those ontologies that are affected, the decrease in templates compared to strict isomorphism is an average of 19.8%, reaching up to 50% for 9 entailments in the Bleeding History Phenotype ontology.

Lemma-isomorphism While subexpression-isomorphism does not have a strong effect on individual entailments, lemma-isomorphism shows a more visible reduction in template numbers.7 On average, the justifications are reduced to 4.7 templates per entailment (σ = 8.8, m = 2), which is a reduction of 4.1% compared to both strict and subexpression-isomorphism. A total of 1,492 entailments (7.8% of the total corpus) from 43 ontologies are affected by lemma-isomorphism, with an average reduction of 30.3% compared to strict isomorphism for those entailments. The strongest effects can be seen in the Fission Yeast Phenotype ontology, where the justifications for several entailments only differ in the length of their atomic subsumption chains and thus are each reduced to a single template of the type

Θ2 = {C1 v ... v Cn, Cn ≡ C2 u ...} |= C1 v C2.
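As a small illustration with hypothetical class names, the two justifications {B1 v B2, B2 v B3, B3 ≡ A u X} and {B1 v B2, B2 v B3, B3 v B4, B4 ≡ A u X} (where X stands for some further conjunct) both entail B1 v A but are not strictly isomorphic, since their atomic subsumption chains differ in length. Replacing each maximal chain with its single lemma (B1 v B3 and B1 v B4, respectively) turns both into instances of Θ2, so the two justifications are lemma-isomorphic.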

Exp. 2: Isomorphism within ontologies

Across the justifications for all entailments of an ontology, the reductions caused by the three equivalence relations are more clearly visible than for individual entailments. Figure 7.5 shows an overview of the effects of the different isomorphism types for the 25 ontologies containing the largest numbers of justifications in the corpus. Each cluster represents an ontology, with the ontologies ordered by the number of justifications they contain. Each bar in a cluster represents the reduction compared to the previous equivalence relation (that is, we compare strict isomorphism with the full set of justifications, s-isomorphism with strict isomorphism, and l-isomorphism with s-isomorphism). We can see that the effects of the relations differ strongly across the ontologies, with strict isomorphism generally having the strongest impact, and subexpression-isomorphism having the lowest impact. Overall, only three ontologies show no effect of any of the three types of isomorphism; two of these contain only a single justification (which means there is no reduction possible), while one ontology (OBOE SBC) contains 5 distinct justifications.

7 Note that Ss and Su do not contain any justifications that consist entirely of atomic subsumption chains. That is, all atomic subsumption chains that are affected by l-isomorphism will be strict subsets of the complex justifications in the corpus.

[Figure 7.5: Comparison of reduction caused by isomorphism types. Each cluster represents an ontology; ontologies are ordered by number of justifications.]

Strict isomorphism Compared to the results for individual entailments, the effects of the equivalence relations are much more significant if we consider the justifications for all entailments in an ontology. On average, an ontology in Ss contains 1,814.9 justifications; these are reduced to 436.1 templates through strict isomorphism, which is an average reduction of 73.1%. 4 ontologies with small numbers of justifications (fewer than 10) are not affected by strict isomorphism, while the remaining 74 ontologies show reductions of up to 99.6%. Even more strikingly, 28 of the ontologies are reduced to less than 10% of their justification corpus; with an average of 3,092 justifications per ontology in this group, these reductions reveal significant numbers of structurally similar justifications. The templates cover an average of 8.2 justifications each (σ = 22.4, m = 2), with some of the highest coverage occurring in the Orphanet ontology. The 3,948 justifications in this ontology can be reduced to only 14 templates, which are all variations (featuring atomic subsumption chains of varying length) of the template Θ2 we have shown above.

Subexpression-isomorphism Subexpression-isomorphism reduces the justifications in the corpus by an average of 74% per ontology, which is only a small change (0.9%) compared to strict isomorphism. The number of justifications covered by a single template is slightly increased to 8.8 (σ = 23.5, m = 2) justifications per template.

Again, the majority of ontologies (44 of 78) are not affected by subexpression-isomorphism, whereas 5 ontologies show reductions of between 20% and 38.3% compared to strict isomorphism. This includes the Bleeding History Phenotype ontology (1,158 justifications, 60 isomorphism templates, 37 s-isomorphism templates), which contains a number of justifications of the type

{C1 v ∃p1.(C2 t C3), domain(p1, C4)} |= C1 v C4

which is subexpression-isomorphic to justifications which contain a named class in place of the disjunction C2 t C3 in the subsumption axiom, thus matching template Θ1. Closer inspection of the Lipid ontology (3,258 justifications as shown in Figure 7.5, 1,762 isomorphism templates, 1,471 s-isomorphism templates) reveals that a large number of justifications in the ontology consist of a single equivalence axiom of the form C1 ≡ C2 u x, with the entailment being C1 v C2. The remainder of the conjunction, represented by x, consists of a number of complex expressions of varying length and nesting depth. While s-isomorphism captures these types of justifications (since the remainders x can all be matched against each other), the actual reason for their similarity lies in their identical cores C1 v C2, with the remainder x being a superfluous part; that is, the justifications exhibit shared-core masking.
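As a hypothetical illustration of this effect, consider the justification {Lipid_A ≡ Fatty_Acid u ∃has_part.Carbon_Chain} for the entailment Lipid_A v Fatty_Acid: the conjunct ∃has_part.Carbon_Chain is superfluous, and the laconic version of the justification is simply {Lipid_A v Fatty_Acid}, a self-justification. Two such justifications whose superfluous remainders are different complex expressions are s-isomorphic, since each remainder can be matched by a variable, but the similarity detected in this way stems from the shared core rather than from any genuinely propositional use of the remainders.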

Lemma-isomorphism As for single entailments, lemma-isomorphism has a more significant effect on the justification corpus than subexpression-isomorphism when applied across each ontology. L-isomorphism reduces the justifications in an ontology by an average of 78.2%, which is 4.2% higher than the reduction caused by subexpression-isomorphism. A template covers an average of 12 justifications (σ = 38, m = 3) in each ontology, which is a visible increase from the 8.8 justifications covered by subexpression-isomorphism. 13 ontologies (with an average of only 32.1 justifications) show no reaction to l-isomorphism, whereas 12 ontologies containing large numbers of justifications (mean = 1,555.8) see a reduction of 41.7% and more, up to 82.1%, compared to subexpression-isomorphism.

Table 7.7: Template frequency and coverage across the corpus.

                                     Template coverage
  Type      Count      % of Ss       Mean    Median    Min    Max
  all       141,560    100%          -       -         -      -
  strict    12,527     8.8%          11.3    2         1      2,072
  subex     10,952     7.8%          12.9    2         1      2,128
  lemma     5,487      3.1%          25.8    3         1      7,490

Exp. 3: Isomorphism across the corpus

In the final stage of our analysis of isomorphism in the BioPortal corpus, we will look at the templates spanning all justifications for all entailments across the ontologies in the corpus. While we have seen that isomorphic justifications occur frequently within an ontology, we now analyse the structural similarity of justifications across multiple ontologies. Table 7.7 gives an overview of the numbers of templates in Ss for the three isomorphism types against the full justification corpus, as well as the coverage (number of justifications) per template.

Strict isomorphism The 141,560 justifications in Ss are reduced to only 12,527 templates, which is a reduction of 91.2%. On average, 11.3 justifications (σ = 54.3, m = 2) share the same template, with the 8 most frequent templates covering over 1,000 justifications each. The most frequent templates (by numbers of justifications covered) are all variations of the template Θ1, that is, a combination of a domain axiom and a subsumption axiom with an existential restriction on the RHS, including some additional subsumption axioms, an equivalence axiom in place of a subsumption, or variations of the filler.

Over 40% of the templates (41.3%) in the corpus cover exactly one justification in a single ontology. We might expect larger templates to be more 'specific' to a particular ontology, thus covering fewer justifications overall, and conversely, a smaller template might be more 'generic', thus covering more justifications. However, the size (number of axioms) of a template seems to be no indicator of the number of justifications it covers: the average size (7.3 axioms) of a template covering multiple justifications is only marginally lower than that of the templates covering a single justification (7.5 axioms).

[Figure 7.6: Template frequencies for strict and lemma-isomorphism. (a) Template frequency for strict isomorphism. (b) Template frequency for l-isomorphism.]

If we look at the spread of templates across the ontologies in the corpus, we find that only 8.7% of the templates cover justifications in multiple ontologies, with a maximum spread of 26 ontologies for two templates which are, again, variations of Θ1. Figure 7.6a shows the frequency distribution of the justification templates across the justifications and ontologies in Ss. The y-position of a data point indicates the number of justifications a template covers, and the bubble size indicates the number of ontologies a template occurs in. We can see that a small number of templates covers large numbers of justifications, with a steep drop to 10 and fewer justifications per template around the 2,000 mark. If we take a closer look at the numbers, we find that the majority of justifications in the corpus (81.2%) are covered by the 2,000 most frequent templates (out of 12,527), and a third (33.1%) of the justifications are covered by the 100 most frequent templates.
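The coverage figures quoted here are cumulative statistics over the template frequency distribution; the sketch below (using made-up counts rather than corpus data) shows one way to compute how many of the most frequent templates are needed to cover a given share of the justifications.

```python
def templates_needed(template_counts, share):
    """Return how many of the most frequent templates are needed to cover
    at least `share` (a value between 0 and 1) of all justifications.
    template_counts: one justification count per template."""
    total = sum(template_counts)
    covered = 0
    for i, count in enumerate(sorted(template_counts, reverse=True), start=1):
        covered += count
        if covered >= share * total:
            return i
    return len(template_counts)

# Made-up distribution: a few dominant templates and a long tail of singletons.
counts = [500, 300, 120, 50] + [1] * 30
print(templates_needed(counts, 0.9))  # -> 3: the 3 largest templates cover 90% of 1,000 justifications
```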

Subexpression-isomorphism The effects of s-isomorphism across the corpus are only marginal compared to strict isomorphism. The justifications are reduced to 10,952 templates, which is an overall reduction of 92.2% and a 1% difference compared to strict isomorphism. The number of justifications covered by a single template increases slightly, to an average of 12.9 justifications (σ = 59, m = 2) per template.

The most frequent template (by numbers of justifications) is again Θ2, which covers 2,128 justifications (1.5% of the total set) in 26 ontologies. The template that occurs in the largest number of ontologies, on the other hand, can be found in 28 of the 78 ontologies. This template is a single equivalence axiom which we have already seen in the Lipid ontology:

Θ3 = {C1 ≡ C2 u x} |= C1 v C2

The superfluous part x matches a number of operands such as atomic classes and existential restrictions. Interestingly, while this template occurs in the highest number of ontologies, it only covers 573 justifications across the corpus.

Lemma-isomorphism Across all justifications in the corpus, l-isomorphism has a clearly visible impact. The 141,560 justifications are reduced to only 5,487 templates, which is less than half the number of templates for strict isomorphism, and an overall reduction of 96.9%. Figure 7.6b shows the frequency of templates for lemma-isomorphism.

On average, a template covers 25.8 justifications (σ = 208.5, m = 3); however, the large standard deviation shows that the distribution of justifications per template has shifted towards a few very frequent templates, whereas there is still a 'long tail' of 1,878 templates which match only a single justification. If we consider the distribution of justifications per template over the quartiles of the corpus, 25% of the justifications in Ss can be covered by the 8 most frequent templates, 50% by the 44 most frequent templates, and 75% by the 277 (out of 5,487) most frequent templates. The most frequent templates, by number of justifications they cover, are all subtle variations of a template containing only two or three axioms. Some of these templates, alongside their frequencies (the number of justifications the template covers and the number of ontologies it occurs in), are listed in Table 7.8.

Note that any subsumption axiom in the templates Θ4 through Θ7 corresponds to an atomic subsumption chain of arbitrary length. If we look at the number of ontologies a template occurs in, the most frequent templates are Θ3 and Θ5, both of which can be found in 28 of the 78 ontologies in the corpus. However, as with strict and subexpression-isomorphism, only a fraction of the templates (8.5%) occur in multiple ontologies, while the majority of templates (5,018 out of 5,487) can only be found in a single ontology.

Table 7.8: Most frequent templates for lemma-isomorphism across the corpus. #J = number of justifications, #O = number of ontologies.

  ID    Template                                                #J       #O
  Θ4    {C1 v C2, C3 ≡ C2 t C4, C3 v C5} |= C1 v C5             7,490    27
  Θ5    {C1 v C2, C5 ≡ C2 t C3} |= C1 v C5                      6,425    28
  Θ6    {C1 v C2, C3 ≡ C2 t C4 t C6, C3 v C5} |= C1 v C5        6,135    26
  Θ7    {C1 v C2, C5 ≡ C2 t C3 t C4} |= C1 v C5                 4,206    25

Isomorphism and superfluity

As we have already seen in our discussion of subexpression-isomorphism, a justification can potentially be non-isomorphic only due to it containing superfluous subexpressions that do not contribute to the entailment. In order to determine the extent to which superfluous parts affect isomorphism, we computed the laconic versions of the justifications in Ss, which resulted in a corpus Ssl containing 47,667 laconic justifications, of which 31.6% were now atomic subsumption chain justifications or even self-justifications. The number of laconic justifications is significantly lower than in Ss for two reasons: first, due to the high runtime of the laconic justification generation mechanism, some justifications could not be computed; hence, due to timeouts, the number of entailments in Ssl is only 18,300 (compared to 19,097 in Ss). More striking, however, is the effect of removing superfluous expressions on the equality between justifications: multiple regular justifications for individual entailments frequently resulted in the same laconic justification, which drastically reduces the number of justifications despite the difference in the number of entailments being only small. Table 7.9 shows a comparison of the cross-corpus reductions for laconic and regular justifications in Ss and Ssl, respectively.

Across Ssl, strict isomorphism reduces the justifications to 3,653 templates, which is a reduction to 7.6% (compared with 8.8% for the regular justifications in Ss). 8 out of the 10 most frequent templates (by the number of justifications they cover) are atomic subsumption chains of between 1 and 8 axioms. Interestingly, this also indicates that, despite the apparent complexity of many justifications in the corpus (recall that we only computed the laconic versions of complex justifications), the actual reasoning behind those justifications comes down to simple atomic subsumption.

Table 7.9: Comparison of reductions in Ss and Ssl.

              Ss (regular)             Ssl (laconic)
  Type        Count       % of Ss      Count      % of Ssl
  all         141,560     100%         46,667     100%
  strict      12,527      8.8%         3,653      7.8%
  subex       10,952      7.8%         2,936      6.2%
  lemma       5,487       3.1%         1,789      3.8%

Perhaps unsurprisingly, subexpression-isomorphism only has a small effect on Ssl. The laconic justifications are reduced to 2,936 templates, which is a reduction to 6.2% of the corpus. Just as with strict isomorphism, this is only a marginally stronger reduction compared to the corpus of regular justifications (7.8%).

Finally, lemma-isomorphism reduces the laconic justifications in Ssl to only 1,789 templates, which is an overall reduction to 3.8% of the set of laconic justifications. In terms of the overall proportion this is a smaller reduction than for lemma-isomorphism on regular justifications (3.1%). However, since the set Ssl is only roughly a third of the size of Ss, this smaller relative reduction is not too surprising. As we could already expect based on the prevalent templates we found for strict isomorphism, the most frequent template is a single atomic subsumption axiom, which covers atomic subsumption chain justifications of arbitrary length. This template covers the 15,066 justifications (31.6%) in the corpus we have already mentioned above, and can be found in 61 of the 78 ontologies.

7.3 Discussion

Having presented the major results of our survey of the BioPortal ontologies in the previous section, we will now discuss the significance of the results and the conclusions we can draw from them with respect to the application of structure-based coping strategies for the purpose of ontology debugging.

7.3.1 Justification types and frequency

One of the findings that stands out is the prevalence of atomic subsumption chain justifications: 82% of the entailments in the set of atomic subsumptions and 109 out of 187 ontologies were found to have only atomic subsumption chain justifications, while another 12% have atomic subsumption chain justifications in addition to complex justifications. We have seen that ontologies in the weakly expressive OWL 2 EL profile are more likely to contain only trivial entailments. However, DL expressivity alone is not an indicator of complexity, as a number of OWL 2 EL ontologies also produce complex justifications, while some highly expressive ontologies contain no complex justifications.

One explanation for this high number of trivial justifications is the selection of indirect atomic subsumptions in the entailment set: as we have shown in Section 4.1.3, any ontology which is trivial enough to contain only self-justifications for direct atomic subsumptions will only contain atomic subsumption chain justifications for indirect subsumptions. This means that, in general, a large number of the justifications a user is likely to encounter will only be trivial self-justifications or atomic subsumption chain justifications. However, we found that approximately one third of the entailments in the corpus have multiple complex justifications, with an average of nearly eight justifications per entailment and maxima of up to several hundred (and possibly more).

These findings indicate that, while multiple non-trivial justifications are not highly prevalent across the corpus, users have a one in three chance of facing fairly high numbers of non-trivial justifications. As we know from previous research [HBPS11b], even single justifications can be very hard or impossible for users to understand; combined with our insights into the occurrence of multiple justifications, we can conclude that, for the purpose of improving ontology debugging, it is worth focusing on debugging techniques for both individual and multiple justifications.

7.3.2 Overlaps

We found a surprisingly high number of high-frequency axioms across the ontologies in the corpus, with all ontologies containing some axiom which occurs in multiple justifications. While the high average frequency of over 50 justifications per axiom (out of an average of 1,868 justifications per ontology) is caused by a few high-frequency axioms, the median frequency of five justifications shows that shared axioms between justifications are still a frequent occurrence. Regarding justificatory redundancy, however, we have found that the majority of justifications only have a single entailment; thus, this justification feature is unlikely to affect the overall complexity score of a set of justifications.

Root and derived justifications, on the other hand, are prevalent across the corpus. We have seen that almost three quarters (73.4%) of the justifications in Ss are derived from only a small set of root justifications. The number of entailments that depend on a root justification in Ss is comparatively small (5.4 entailments per root justification), whereas the number of unsatisfiable classes that depend on a root justification in Su is significantly higher (48.8 entailments). This confirms, in some sense,8 our existing knowledge of root and derived unsatisfiable classes, which often cites the example of the Tambis ontology, in which only three root unsatisfiable classes were the cause of 144 derived unsatisfiable classes [KPSC06]. On the other hand, the finding also indicates that root and derived justifications have less of an impact on entailments involving satisfiable classes.

Regarding the reduction in user effort caused by root and derived justifications which we discussed in Section 6.3.2, these results indicate that, given a set of justifications for multiple entailments, we can (on average) introduce an alleviation factor aij = 0 for around three quarters of the justifications, thus reducing the total number of justifications to inspect to only one quarter of the original set.

Our analysis of arbitrary justification overlaps has shown that overlaps with a size and frequency of at least two (axioms and justifications, respectively) are frequent occurrences in OWL justifications. The majority (96.1%) of ontologies in the test corpus samples contained some overlap between complex justifications, with an average overlap size of 5.5 axioms and an average overlap frequency of 7.8 justifications, not taking into account the two ontologies which contribute a large number of large, high-frequency overlaps.

In summary, the frequent occurrence of single-axiom overlaps, root and derived relationships, and arbitrary overlaps in the corpus indicates that overlap-based debugging techniques, such as highlighting and lemmatising overlaps, will be applicable to a large number of justifications and ontologies. However, we have also seen that the overlaps are mainly restricted to single entailments. Considering our goal of using such shared cores for debugging purposes, this means that any overlap-based techniques would be mainly applicable for repairing multiple justifications of single entailments.

8 Taking into account the very small size of Su and the bias towards one ontology.
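To spell out the arithmetic behind the alleviation-factor estimate above, the sketch below computes how many justifications a user effectively has to inspect when derived justifications are discounted. It is a deliberately simplified reading of the effort model from Chapter 6: every justification is given unit base effort, and the numbers in the example are illustrative rather than taken from the corpus.

```python
def effective_inspections(n_justifications, derived_fraction, alleviation=0.0):
    """Effective number of justifications to inspect if each derived
    justification only contributes `alleviation` (between 0 and 1) of the
    effort of inspecting a root justification, which counts as 1."""
    n_derived = n_justifications * derived_fraction
    n_root = n_justifications - n_derived
    return n_root + alleviation * n_derived

# With roughly 73.4% of justifications derived and an alleviation factor of 0,
# only about a quarter of the original inspection effort remains:
print(effective_inspections(1000, 0.734, alleviation=0.0))  # -> 266.0
```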

7.3.3 Isomorphism

Across all three experiments (within entailments, within ontologies, and across the corpus) we have found that strict isomorphism clearly accounts for the largest share of the reduction among the isomorphism relations. This shows that a large number of (complex) OWL justifications are structurally identical, with many ontologies containing up to 99% structurally identical justifications which can be represented by only a handful of justification templates.

Subexpression-isomorphism, on the other hand, has only a small impact on the landscape of justification templates. If we look at the analysis of laconic justifications, only 717 laconic justifications (1.4% of the justifications in Ssl) contain constructs (complex expressions that can be substituted by variables) that are affected by subexpression-isomorphism. Note that, since the axioms contain no superfluous parts, any differences between justifications which may have been caused by superfluous expressions no longer occur in this set. This implies that, while subexpression-isomorphism may be helpful in some situations, it is generally not applicable to most justifications. Finally, the finding also indicates that subexpressions are not used propositionally in otherwise structurally identical justifications. If we consider the application of isomorphism-based debugging techniques in OWL tools, the high cost of detecting subexpression-isomorphism (up to two times slower than strict isomorphism) may not be worth paying due to its rather weak effect. However, while the propositional use of expressions may not be relevant for structurally similar justifications, it is still possible that subexpressions are used in a propositional way in independent justifications. A straightforward implementation of an 'abstraction checker' that tests whether subexpressions can be substituted by variables would reveal whether such a technique has wider applicability than subexpression-isomorphism.

Lemma-isomorphism using atomic subsumption chain lemmatisations, on the other hand, has a more visible effect on justifications. This is particularly significant in our cross-ontology analysis, where lemma-isomorphism reduces the number of templates to only half of those found for strict isomorphism. In particular, we found that 75% of the 141,560 justifications in the corpus can be covered by only 277 of the most frequent templates, which is a rather stunning result. This clearly shows that justifications which differ in size often only differ in the length of the atomic subsumption chains they contain, but are otherwise strictly isomorphic.

For application in OWL debugging tools, we can conclude that both strict isomorphism and lemma-isomorphism seem promising as the basis for the debugging techniques we have proposed in the previous chapter, while subexpression-isomorphism is only applicable in a small number of ontologies. While we cannot infer any specific instantiations of the alleviation factor aij from these results, we have reason to assume that the effort required to understand several structurally isomorphic justifications after inspecting their justification template is very small. That is, given an arbitrary set of entailments and justifications from an ontology, for around three quarters of the justifications (based on the average reduction of 78.2% caused by lemma-isomorphism) the alleviation factor aij will be close to zero, that is, the overall effort will be reduced to a quarter of its original score.

7.3.4 Limitations

Threats to internal validity Due to the high runtime of the various tasks in the analysis process, it was not possible to obtain complete datasets for all ontologies in practical time. We therefore had to restrict the analysis to random samples and impose timeouts at several stages, which, while still yielding statistically meaningful sample sizes, may have caused us to miss some relevant data from ontologies and entailments which were discarded in the process.

First, during the entailment generation process, several ontologies could not be classified in the given time of 20 minutes using the Pellet or JFact reasoners, while for some it was not possible to generate the full set of entailments of the selected type. In the justification generation stage, the large numbers of entailments for some ontologies and the performance of the Hitting Set Tree algorithm used in the implementation of the justification generator made it necessary to take a random sample of 1,000 entailments per ontology and generate a maximum of 500 justifications per entailment. The entailment sampling affected 85 of the 187 ontologies that were processed in the justification generation stage. Further, the overlap analysis was restricted to a sample of 5,000 edges per graph, as for some ontologies the concept lattice generation did not terminate within 24 hours due to the large numbers of connections in the graph.

These sampling strategies affect some of the conclusions we can draw with respect to the justificatory structure of some of the ontologies in the corpus. Take, for example, the Bone Dysplasia ontology, a large and expressive SHIF ontology (44,683 axioms) in the corpus, for which we generated over 100,000 entailments.

Due to the sampling process, the analysis covered only 800 entailments whose justifications contained 882 axioms. It is clear that we cannot make any conclusive statements about certain metrics, such as the activity of this ontology, based on the small sample obtained, since the activity is based on the total number of justification axioms.

Finally, as we have discussed in Chapter 5, we have not yet shown that lemma-isomorphism (restricted to maximal atomic subsumption chains) is in fact transitive. This means that we might potentially over-estimate the logical diversity of a corpus and under-estimate the effects of lemma-isomorphism. However, we can simply treat the results as a lower bound on the reductions caused by l-isomorphism without running the risk of reporting results that are 'too good'.

Threats to external validity While the BioPortal corpus does contain a number of interesting (in terms of size and complexity) ontologies, it may not be representative of most OWL ontologies used in practice. Due to their intended application in research and their being made available in a curated portal, it is possible that these ontologies are larger and crafted more 'carefully' (i.e. use more complex constructors) than most OWL ontologies found on the web. Indeed, a random sample of ontologies obtained from a web crawl [GMPS13] was found to contain a significantly smaller number of axioms (m = 57) than the BioPortal ontologies. On the other hand, the BioPortal ontologies have been developed by several different individual authors and teams, they span a fairly diverse range of topics (e.g. software engineering, anatomy, biology, genetics), and they differ vastly in both size and expressivity. Thus, we believe that the statistics obtained from the BioPortal corpus allow us to draw conclusions regarding general trends in the justificatory structure of OWL ontologies that were (mainly) authored manually for use in realistic applications.

7.4 Summary and conclusions

In this chapter, we have presented a survey of the landscape of justificatory structure in OWL ontologies. Using an initial set of over 300 ontologies from the NCBO BioPortal, which was reduced to 78 ontologies containing complex justifications, we analysed the frequency and extent of various features of justificatory structure.

We have found surprisingly high degrees of overlap between the justifications for single entailments, with an average overlap size of around 11 axioms and an average frequency of around 12 justifications per overlap. Likewise, we have seen that root and derived relationships occur frequently across the corpus, but mainly affect justifications for individual entailments.

Strict justification isomorphism was shown to be prevalent throughout the corpus, as a large number of strictly isomorphic justifications cause a reduction of the justification corpus to less than 10% of its original size. In contrast, subexpression-isomorphism has the least effect on the justifications, with only marginal differences between strict and subexpression-isomorphism, whereas lemma-isomorphism captures isomorphic justifications in the majority of ontologies in the corpus.

These results are highly significant, as they demonstrate that OWL ontologies used in practice can, and do, indeed have a very rich justificatory structure, with frequent and surprisingly large overlaps, high-frequency axioms, and structural similarity. On the other hand, we have also seen that some of the proposed interventions may not be applicable in some cases; for example, the number of justifications that have multiple entailments is fairly low and affects only a fraction of the ontologies in the corpus, while overlap between justifications occurs mainly for single, but not multiple, entailments.

Chapter 8

Conclusions

In this final chapter, we will summarise the work and results presented in this thesis. We will discuss the main contribution of this thesis and the significance of the results and insights gained, while also highlighting some open issues. Finally, we will give an outlook on future directions for further research into the logical and cognitive aspects of justifications for entailments of OWL ontologies.

8.1 Summary of contributions

In summary, this thesis introduced the notion of justificatory structure, provided definitions for the different aspects of structure, proposed strategies for exploiting the justificatory structure of an ontology in order to provide improved debugging support, and analysed a set of ontologies from the bio-medical domain to determine the prevalence and extent of structural phenomena in OWL ontologies used in practice.

8.1.1 Design decisions for finite entailment sets

We first discussed the issue of limiting the infinite set of entailments of an ontology to a finite entailment set and proposed several design decisions to be made in order to arrive at a sensible representation of a finite entailment set. These design decisions were largely motivated by the issue of counting entailments, e.g. for the purpose of comparing the inferential power of two ontologies, which requires the number of entailments to grow monotonically when adding axioms to the ontology. We found that certain design decisions cause the number of entailments to grow non-monotonically: not including asserted axioms or indirect subsumptions in a finite entailment set, for instance, can lead to fewer entailments in the set if we add axioms to an ontology. We also provided a practical definition for distinguishing imported, mixed, and native entailments based on their justifications.
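As a toy illustration of the non-monotonicity issue (assuming a finite entailment set that excludes asserted axioms), the ontology {A v B, B v C} contributes the non-asserted entailment A v C; if we then add the axiom A v C to the ontology, that entailment becomes asserted and drops out of the set, so the entailment count shrinks even though the ontology has grown.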


Finally, we provided a shorthand notation for referring to a specific representation of a finite entailment set and demonstrated the effect of different design decisions with toy examples and real-life applications of finite entailment sets of OWL ontologies.

8.1.2 Justificatory structure and justification isomorphism

In Chapters 4 and 5, we introduced the notion of justificatory structure using a graph representation of the justifications, justification axioms, and entailments for a given set of entailments and justifications in an ontology. Structural aspects include graph metrics such as the in- and out-degrees of justification and axiom nodes, graph components, and overlap of varying degrees. Overlap between justifications, that is, shared axiom sets, is of particular interest in the context of ontology debugging, as it potentially leads to smaller repairs, while also helping the understanding of multiple justifications through lemmas.

We then moved on to discuss the issue of structural similarities between justifications and extended the existing notion of justification isomorphism to two new equivalence relations, subexpression-isomorphism ≈s and lemma-isomorphism ≈ℓ. Under (strict) isomorphism, we consider two justifications to be equivalent if they use the same constructors and axiom types and only differ in the class, property, and individual names they use. Subexpression-isomorphism extends this notion to cover justifications which differ in the expressions they use if those expressions can be substituted by freshly generated variable names; that is, if their complex subexpressions are used in a propositional way. Lemma-isomorphism applies the principle of subexpression-isomorphism to a lemmatised justification, that is, a justification in which a subset S was replaced with an entailment λ of S. The two new notions of isomorphism were designed to be 'upwards-compatible', i.e. (strictly) isomorphic justifications are s-isomorphic, and s-isomorphic justifications are considered to be l-isomorphic. We showed that subexpression-isomorphism is indeed an equivalence relation under certain (non-obvious) side conditions, providing a proof of its transitivity.

8.1.3 Reducing user effort

In Chapter 6, we looked at the problem of debugging erroneous entailments in OWL ontologies, providing a definition of 'debugging problems' and of how to determine whether a modification successfully solves a debugging problem. We constructed a simple model for measuring the effort involved in solving a debugging problem using justification based explanation tools. This model also allows us to measure whether a debugging strategy reduces the effort involved in debugging a set of entailments, by introducing an alleviation factor a. We proposed several strategies for exploiting the justificatory structure of an ontology in order to reduce the user effort when faced with multiple justifications. These strategies include presenting the user with high-frequency axioms, enriching justifications with common lemmas originating from overlaps, and presenting the user with an abstract justification template of isomorphic justifications.

8.1.4 Experimental results

Finally, we presented a survey of OWL and OBO ontologies from the NCBO BioPortal. We found that the majority of justifications for direct and indirect atomic subsumptions between satisfiable classes can be classified as 'atomic subsumption chain' justifications, whereas the number of 'complex' justifications found across the corpus is comparatively low. For those entailments that do have complex justifications, the number of justifications per entailment was found to be surprisingly high, with an average of 8 justifications per entailment and 70% of the entailments in the corpus having multiple complex justifications. Shared axioms and axiom sets were found to occur frequently across the corpus: over 73% of the justifications in the corpus were derived from some other justification (which may be partially due to the inclusion of indirect subsumptions in the entailment set). Arbitrary overlap between justifications for single entailments is highly prevalent in the corpus, with the majority of overlaps having a size of around 5 axioms and occurring in around 5 justifications.

Our analysis of justification isomorphism in the BioPortal corpus revealed that isomorphism between justifications for individual entailments occurs fairly frequently: strict isomorphism applied to justifications for individual entailments causes an average reduction from justifications to templates of 33%, whereas subexpression-isomorphism has only marginal effects in some ontologies.

However, if we consider the logical diversity of all justifications for all entailments in the corpus, strict isomorphism and lemma-isomorphism have a significant impact: the 141,560 justifications in the corpus are effectively reduced to just over 12,500 templates (a reduction of 91.2%) by strict isomorphism, and lemma-isomorphism finally reduces this number to only 5,487 templates. More strikingly, we have found that 75% of the justifications in the corpus can be covered by only 277 of the most frequent templates for lemma-isomorphism. Across the set of laconic versions of the justifications in the corpus, the relative effects of the three isomorphism types are roughly the same; however, the final number of templates for lemma-isomorphism is considerably lower, at only 1,789 templates. This shows that the logical diversity of justifications is far lower than that of their material manifestations, as removing all superfluous parts in a corpus of 141,560 regular (complex) justifications reduces it to just over 1% of its original size.

8.2 Significance of results

Against the backdrop of justification based debugging support, this thesis has advanced the state of knowledge of the relations between justifications in OWL ontologies. It highlighted possible new strategies for generating OWL ontology metrics and for providing improved debugging support, which lays the foundations for future approaches to building user-friendly and efficient OWL ontology tools for both ontology analysis and ontology development.

First, the design decisions we provided for generating and representing finite entailment sets draw on the multitude of modifications, 'hacks', and misunderstandings we have encountered in OWL tools and analytical applications. Previously, there was no clear account of the various factors that have an impact on the size of entailment sets, and no convenient way of referring to a certain type of entailment set, which we believe we have rectified with the design decisions and shorthand notation provided in this thesis.

This has been the first in-depth investigation of the justificatory structure of OWL ontologies. In our survey of the BioPortal ontologies, we have found that, while a large number of entailments only have trivial atomic subsumption chain justifications, a significant number of entailments indeed have multiple complex justifications. This finding shows that improved debugging support for dealing with multiple justifications is clearly necessary, as users have a high chance of encountering an entailment that has multiple non-trivial justifications.

While root and derived justifications and axiom frequency have been (somewhat implicitly) used in justification based debugging tools, there has been no extension of these relations to cover arbitrary overlap. There have been some previous experiments regarding root and derived relationships between justifications [KPSH05, MMV10]; however, the survey presented in this thesis is the first large-scale experiment investigating the occurrence of both root and derived justifications and arbitrary justification overlap in OWL ontologies. We have found that all types of overlap occur frequently, which provides us with important knowledge of the structural relations between justifications and informs future ontology debugging tools, which can make use of these structural aspects.

Thus far, justification isomorphism has only been mentioned in the context of sampling justifications for a user study [HBPS11b]. The results presented in our survey of justification isomorphism in the BioPortal corpus confirm that a) isomorphic justifications can be determined in practical time even using a naive implementation, and b) a large number of justifications are structurally isomorphic. This shows that template-based debugging support is both feasible for and widely applicable to justifications found in practice.

However, we have also found that the newly introduced relations, subexpression-isomorphism and lemma-isomorphism, do not have as big an impact on the BioPortal justifications as we would have hoped. S-isomorphism in particular only affects a fraction of the justifications in the corpus, whereas lemma-isomorphism occurs in most ontologies, but only in small numbers. This implies that, for most justifications, strict isomorphism may already be sufficient for finding a common template. On the other hand, the small effect of s-isomorphism also shows us that there exists real logical diversity in the modelling of ontologies, and that complex subexpressions are generally not used in a propositional way. Furthermore, while we could successfully prove the transitivity of s-isomorphism, the transitivity of l-isomorphism and the selection of suitable lemmatisations remain open questions. In order to make l-isomorphism useful in OWL applications, we need to further explore potential lemmatisations which are guaranteed to preserve the transitivity of l-isomorphism.

With the exception of root and derived justifications and a focus on minimal repairs, most of the existing justification based debugging techniques do not consider the issue of multiple justifications for repair and treat justifications as isolated entities. By suggesting structure-based coping strategies for multiple justifications we have made a step towards improved debugging support which is targeted at multiple justifications. The introduction of a model for measuring and quantifying the success of a coping strategy in particular paves the way for principled empirical research into the effects of different justification based debugging techniques.

8.3 Future directions

While we have covered a broad range of topics in this thesis, it is clear that there is plenty of room for future work. In this section we will outline potential extensions of this work, which cover the theoretical foundations of justificatory structure, further experiments, as well as applications of the strategies proposed in this thesis.

Lemma-isomorphism

We have made some progress towards defining lemma-isomorphism and finding suitable lemmatisations; however, some open questions remain. First of all, we have restricted the lemmatisations to (maximal) atomic subsumption chains in order to demonstrate the concept of l-isomorphism, which has already had some visible effects. On the other hand, if we think back to the original motivation for l-isomorphism, the Pizza ontology in which there exist several similar reasons for why some Pizza is a subclass of InterestingPizza, we can see that atomic subsumption chain lemmatisations do not cover the justifications we can find there. Thus, extending the set of lemmatisations to be used in l-isomorphism based on their obviousness for OWL users seems to be an important next step, in particular since we have shown that over three quarters of the justifications in the test corpus could be covered by a strikingly small number of templates for lemma-isomorphism. Adding only a small number of additional lemmatisations may already be sufficient to cover the vast majority of justification shapes found in OWL ontologies.

However, since we have seen that preserving the transitivity of our isomorphisms is rather non-trivial, this will also require a thorough investigation of the conditions lemmatisations have to meet in order for l-isomorphism to be transitive.

Dealing with masking and superfluity

While we discussed issues such as non-laconic justifications and justification masking where applicable, our introduction to justificatory structure and the survey of the BioPortal ontologies did not fully explore the effects of masking and superfluity. We know that masking and superfluity are frequent occurrences in OWL ontologies [Hor11a] and that superfluous parts may cause users difficulties in understanding justifications [HBPS11b]. In our analysis of isomorphism we also found that a large number of justifications are considered to be non-isomorphic due to them containing superfluous parts, and that laconicising justifications significantly reduces the overall diversity of justifications. A more in-depth comparison between the justificatory structure of non-laconic and laconic justification sets will help us gain a better understanding of how these phenomena affect the justificatory structure of an ontology, and to what extent this has an impact on the debugging process.

Effects on justification computation

Another open question that arises from the work presented in this thesis is the effect of justificatory structure on justification computation, and, in a wider sense, reasoning performance. While we took a brief glance at the effects of graph components on the performance of justification computation, we omitted an in-depth investigation of the relations between structure, hitting set tree size, calls to a 'find one' subroutine, and computation times. This will require further experiments and artificial generation of justifications to test the effects of various degrees of justification overlap, activity, and graph structure in isolation. Further insights into the relation between justificatory structure and justification computation would potentially allow us to generate guidelines, ontology design patterns, or even automated tool features to yield easily computable justifications. Such guidelines might either be in line with existing ontology design patterns, or potentially require users to sacrifice some aspects of 'good modelling' for the sake of easy justification computation; thus, an investigation of the interplay between existing ontology design patterns and justificatory structure is another aspect of justificatory structure worth exploring.

Further, since justifications are, in some sense, responsible for the subsumption relationships in an ontology and therefore its class hierarchy, we may also ask: how does the justificatory structure of an ontology affect the classification performance of an OWL reasoner? We have made some steps in this direction with JustBench [BPS10a], a justification based micro-benchmarking tool for OWL reasoners; however, in JustBench we only used justification-entailment pairs in order to measure the performance of entailment checking on naturally arising ontology subsets, which did not involve any analysis of justificatory structure.

Visualisation and user interaction

Possibly one of the most intriguing directions for future work is the exploration of user interaction mechanisms to exploit the strategies presented in this thesis for implementation in OWL tools. We have made a first attempt at outlining interaction mechanisms for multiple justifications in Chapter 6. However, the design and implementation of a useful OWL debugging tool will require significantly more research into the cognitive aspects of how users read and understand OWL constructs and axioms, starting with the fundamental question of whether OWL users build mental models [JL80] of the information they digest, or whether they apply rules and symbol manipulation in order to understand reasoning processes, and to what extent this behaviour depends on the user profile and their specific task. While we have gained some insights in recent years into how people process OWL and what causes them difficulties in understanding, we are far from having a clear picture of the cognitive processes associated with interacting with OWL ontologies.

Once we have a better understanding of these cognitive processes, as well as the demands and requirements of OWL users with different backgrounds, we need to identify a suitable approach to visualising and interacting with OWL ontologies and justifications. There have been various visualisation tools for (particular aspects of) OWL ontologies, such as OWLViz1 and CropCircles [WP06] which visualise the class hierarchy of an OWL ontology, SuperModel [BSP09] which displays segments of a model of the ontology, and DeMost [DKP+11], which represents the atomic decomposition of an ontology as a graph of axioms and axiom sets. However, none of these approaches (with the exception, perhaps, of CropCircles,

1 http://www.co-ode.org/downloads/owlviz/

which uses a non-standard rendering of nested circles to represent subsumption relationships) seems to have grown out of a focussed investigation of the suitability of different interaction mechanisms for the task at hand; we may argue that graph-based approaches were chosen because they are simply the most straightforward way of representing the types of relations occurring in an OWL ontology. Given the progress made in interactive data visualisation tools, such investigations would be both highly interesting and worthwhile, as improved tools for OWL ontology building and debugging may well contribute to the wider acceptance and propagation of OWL.

Applications of justifications

Finally, another topic we have only touched upon in this thesis is the application of justifications beyond debugging and repair. One of these areas is ontology comprehension and learning, that is, OWL users wanting to familiarise themselves with an unknown ontology, or OWL novices attempting to understand the implications of reasoning in OWL ontologies. We have seen that many aspects of justificatory structure, such as justification overlap and isomorphism, reveal deeper insights into the interactions between axioms in an ontology, while also providing information akin to ontology patterns. We can imagine using aspects of justificatory structure to teach and learn ‘reasoning patterns’ in OWL ontologies, which may support OWL novices in building ontologies. Further, the justificatory structure of an OWL ontology, such as the number of justifications, the number and sizes of overlaps, and the structural similarities between justifications, provides us with implicit ontology metrics. The number of justifications per entailment has already been used as an ontology metric to determine the justification richness of an ontology [MSR11], and we have shown how two seemingly similar ontologies with approximately the same number of classes and axioms can differ vastly in their justificatory structure [BHPS11]. Integrating justification based metrics into OWL tools will make these differences visible to OWL developers, allowing them to compare and rank ontologies based on an extensive set of both explicit and implicit metrics.
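As a simple illustration of such implicit metrics, the sketch below computes the mean number of justifications per entailment and the mean pairwise overlap between the justifications of an entailment. The map-based input format is an assumption for illustration only and is not tied to any particular tool.

// A sketch of two justification based metrics. The corpus maps each
// entailment to the set of its justifications (each a set of axioms).
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public final class JustificationMetrics {

    /** Mean number of justifications per entailment ("justification richness"). */
    public static <A> double meanJustificationsPerEntailment(Map<A, Set<Set<A>>> corpus) {
        if (corpus.isEmpty()) {
            return 0.0;
        }
        long total = 0;
        for (Set<Set<A>> justifications : corpus.values()) {
            total += justifications.size();
        }
        return (double) total / corpus.size();
    }

    /** Mean size of the pairwise overlaps between the justifications of one entailment. */
    public static <A> double meanPairwiseOverlap(Set<Set<A>> justifications) {
        List<Set<A>> list = new ArrayList<>(justifications);
        long pairs = 0;
        long shared = 0;
        for (int i = 0; i < list.size(); i++) {
            for (int j = i + 1; j < list.size(); j++) {
                Set<A> intersection = new HashSet<>(list.get(i));
                intersection.retainAll(list.get(j));
                shared += intersection.size();
                pairs++;
            }
        }
        return pairs == 0 ? 0.0 : (double) shared / pairs;
    }
}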

Bibliography

[AAKS08] Mikel Egaña Aranguren, Erick Antezana, Martin Kuiper, and Robert Stevens. Ontology design patterns for bio-ontologies: a case study on the cell cycle ontology. BMC Bioinformatics, 9(S-5), 2008. (Cited on page 43.)

[AB06] Harith Alani and Christopher Brewster. Metrics for ranking ontologies. In Proceedings of the 4th International Workshop on Evaluation of Ontologies for the Web (EON-06), pages 24–30, 2006. (Cited on page 23.)

[ACKZ09] Alessandro Artale, Diego Calvanese, Roman Kontchakov, and Michael Zakharyaschev. The DL-Lite family and relations. Journal of Artificial Intelligence Research, 36:1–69, 2009. (Cited on page 38.)

[AGU72] Alfred V. Aho, Michael R. Garey, and Jeffrey D. Ullman. The transitive reduction of a directed graph. SIAM Journal on Computing, 1(2):131–137, 1972. (Cited on page 72.)

[BB07] Martins Barinskis and Guntis Barzdins. Satisfiability model visualization plugin for deep consistency checking of OWL ontologies. In Proceedings of the 3rd International Workshop on OWL: Experiences and Directions (OWLED-07), 2007. (Cited on page 64.)

[BBL05] Franz Baader, Sebastian Brandt, and Carsten Lutz. Pushing the EL envelope. In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI-05), pages 364–369, 2005. (Cited on pages 31 and 38.)

[BCM+03] Franz Baader, Diego Calvanese, Deborah L. McGuinness, Daniele Nardi, and Peter F. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2003. (Cited on pages 16 and 28.)


[BFH99] Alex Borgida, Eric Franconi, and . Explaining ALC subsumption. In Proceedings of the 12th International Workshop on Description Logics (DL-99), 1999. (Cited on pages 17, 44, 45, and 62.)

[BG94] Chitta Baral and Michael Gelfond. Logic programming and knowledge representation. The Journal of Logic Programming, 19–20, Supplement 1:73–148, 1994. (Cited on page 28.)

[BGSS07] Franz Baader, Bernhard Ganter, Baris Sertkaya, and Ulrike Sattler. Completing description logic knowledge bases using formal concept analysis. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI-07), pages 230–235, 2007. (Cited on page 42.)

[BH95] Franz Baader and Bernhard Hollunder. Embedding defaults into terminological knowledge representation formalisms. Journal of Automated Reasoning, 14(1):149–180, 1995. (Cited on page 50.)

[BHPS11] Samantha Bail, Matthew Horridge, Bijan Parsia, and Ulrike Sattler. The justificatory structure of the NCBO BioPortal ontologies. In Proceedings of the 10th International Semantic Web Conference (ISWC-11), 2011. (Cited on pages 14, 19, 25, 39, 60, 88, 91, 154, 157, and 192.)

[Bie07] Meghyn Bienvenu. Consequence finding in ALC. In Proceedings of the 20th International Workshop on Description Logics (DL-07), 2007. (Cited on page 78.)

[Bie09] Meghyn Bienvenu. Prime implicates and prime implicants: From propositional to modal logic. Journal of Artificial Intelligence Research, 36:71–128, 2009. (Cited on page 78.)

[BKBM99] Franz Baader, Ralf Küsters, Alex Borgida, and Deborah L. McGuinness. Matching in description logics. Journal of Logic and Computation, 9(3):411–447, 1999. (Cited on page 112.)

[BKG11] Michael Buhrmester, Tracy Kwang, and Samuel D Gosling. Amazon's mechanical turk: A new source of inexpensive, yet high-quality, data? Perspectives on Psychological Science, 6(1):3–5, 2011. (Cited on page 58.)

[BKM99] Franz Baader, Ralf Küsters, and Ralf Molitor. Computing least common subsumers in description logics with existential restrictions. In Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI-99), volume 16, pages 96–103, 1999. (Cited on page 150.)

[BL04] Ronald J. Brachman and Hector Levesque. Knowledge Representation and Reasoning. The Morgan Kaufmann Series in Artificial Intelligence. Elsevier Science, 2004. (Cited on page 28.)

[BLHL01] Tim Berners-Lee, James Hendler, and Ora Lassila. The Semantic Web. Scientific American, 284(5):34–43, 2001. (Cited on page 38.)

[BM09] Franz Baader and Barbara Morawska. Unification in the description logic EL. In Proceedings of the 20th International Conference on Rewriting Techniques and Applications (RTA-09), pages 350–364, 2009. (Cited on page 112.)

[BN98] Franz Baader and Paliath Narendran. Unification of concept terms in description logics. In Proceedings of the 13th European Conference on Artificial Intelligence (ECAI-98), pages 331–335, 1998. (Cited on page 112.)

[BP08a] Franz Baader and Rafael Peñaloza. Automata-based axiom pinpointing. In Proceedings of the 4th International Joint Conference on Automated Reasoning (IJCAR-08), pages 226–241, 2008. (Cited on page 50.)

[BP08b] Franz Baader and Rafael Peñaloza. Axiom pinpointing in general tableaux. Journal of Logic and Computation, 2008. (Cited on page 47.)

[BP10] Franz Baader and Rafael Peñaloza. Automata-based axiom pinpointing. Journal of Automated Reasoning, 2010. (Cited on page 50.)

[BPS07a] Franz Baader, Rafael Peñaloza, and Boontawee Suntisrivaraporn. Pinpointing in the description logic EL+. In Proceedings of the 22nd AAAI Conference on Artificial Intelligence (AAAI-07), pages 52–67, 2007. (Cited on pages 14, 46, 50, and 60.)

[BPS07b] Franz Baader, Rafael Peñaloza, and Boontawee Suntisrivaraporn. Pinpointing in the description logic EL+. In Proceedings of the 30th Annual German Conference on Artificial Intelligence (KI-07), pages 52–67, 2007. (Cited on page 47.)

[BPS10a] Samantha Bail, Bijan Parsia, and Ulrike Sattler. JustBench: A framework for OWL benchmarking. In Proceedings of the 9th International Semantic Web Conference (ISWC-10), 2010. (Cited on page 191.)

[BPS10b] Samantha Bail, Bijan Parsia, and Ulrike Sattler. The justificatory structure of OWL ontologies. In Proceedings of the 7th International Workshop on OWL: Experiences and Directions (OWLED-10), 2010. (Cited on pages 25 and 91.)

[BPS11] Samantha Bail, Bijan Parsia, and Ulrike Sattler. Extracting finite sets of entailments from OWL ontologies. In Proceedings of the 24th International Workshop on Description Logics (DL-11), 2011. (Cited on page 25.)

[BPS12a] Samantha Bail, Bijan Parsia, and Ulrike Sattler. Declutter your justifications: Determining similarity between OWL explanations. In Proceedings of the First International Workshop on Debugging Ontologies and Ontology Mappings (WoDOOM-12), 2012. (Cited on page 26.)

[BPS12b] Samantha Bail, Bijan Parsia, and Ulrike Sattler. Diversity of reason: Equivalence relations over description logic explanations. In Proceedings of the 25th International Workshop on Description Logics (DL-12), 2012. (Cited on page 26.)

[Bra04] Sebastian Brandt. Polynomial time reasoning in a description logic with existential restrictions, GCI axioms, and—what else? In Proceedings of the 16th European Conference on Artificial Intelligence (ECAI-04), pages 298–302, 2004. (Cited on page 31.)

[BS84a] Bruce G. Buchanan and Edward H. Shortliffe. Explanation as a topic of AI research. In Bruce G. Buchanan and Edward H. Shortliffe, editors, Rule-Based Expert Systems. Addison-Wesley, 1984. (Cited on page 17.)

[BS84b] Bruce G. Buchanan and Edward H. Shortliffe, editors. Rule-Based Expert Systems. Addison-Wesley, 1984. (Cited on page 17.)

[BS00] Franz Baader and Ulrike Sattler. Tableau algorithms for description logics. In Proceedings of the International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (TABLEAUX- 00), pages 1–18, 2000. (Cited on page 33.)

[BSP09] Johannes Bauer, Ulrike Sattler, and Bijan Parsia. Explaining by example: Model exploration for ontology comprehension. In Proceedings of the 22nd International Workshop on Description Logics (DL-09), 2009. (Cited on pages 22, 64, 135, and 191.)

[BST07] Franz Baader, Baris Sertkaya, and Anni-Yasmin Turhan. Computing the least common subsumer w.r.t. a background terminology. Journal of Applied Logic, 5(3):392–420, 2007. (Cited on page 150.)

[Car07] Jorge Cardoso. The semantic web vision: Where are we? IEEE Intelligent Systems, 22(5):84–88, 2007. (Cited on page 39.)

[CGL+11] Diego Calvanese, Giuseppe De Giacomo, Domenico Lembo, Maurizio Lenzerini, Antonella Poggi, Mariano Rodriguez-Muro, Riccardo Rosati, Marco Ruzzi, and Domenico Fabio Savo. The MASTRO system for ontology-based data access. Semantic Web Journal, 2(1):43–53, 2011. (Cited on page 38.)

[CHKS07] Bernardo Cuenca Grau, Ian Horrocks, Yevgeny Kazakov, and Ulrike Sattler. Just the right amount: Extracting modules from ontologies. In Proceedings of the 14th International World Wide Web Conference (WWW-05), 2007. (Cited on pages 51 and 52.)

[CHKS08] Bernardo Cuenca Grau, Ian Horrocks, Yevgeny Kazakov, and Ulrike Sattler. Modular reuse of ontologies: Theory and practice. Journal of Artificial Intelligence Research, 31, 2008. (Cited on page 52.)

[CHM+08] Bernardo Cuenca Grau, Ian Horrocks, Boris Motik, Bijan Parsia, Peter F. Patel-Schneider, and Ulrike Sattler. OWL 2: The next step for OWL. Journal of Web Semantics, 2008. (Cited on pages 13 and 28.)

[Cla81] William J. Clancey. The epistemology of a rule-based expert system: a framework for explanation. Technical report, Stanford University, Stanford, CA, USA, 1981. (Cited on page 17.)

[CPSK06] Bernardo Cuenca Grau, Bijan Parsia, Evren Sirin, and Aditya Kalyanpur. Modularity and web ontologies. In Proceedings of the 10th International Conference on the Principles of Knowledge Representation and Reasoning (KR-06), 2006. (Cited on page 51.)

[CRVBP09] Oscar Corcho, Catherine Roussey, Luis Manuel Vilches-Blázquez, and Iván Pérez. Pattern-based OWL ontology debugging guidelines. In Proc. of WOP-09, 2009. (Cited on page 140.)

[Dav81] Martin Davis. Obvious logical inferences. In Proceedings of the 7th International Joint Conference on Artificial Intelligence (IJCAI-81), pages 530–531, 1981. (Cited on pages 62 and 122.)

[dCHS+04] Sherri de Coronado, Margaret W. Haber, Nicholas Sioutos, Mark S. Tuttle, and Lawrence W. Wright. NCI Thesaurus: Using science-based terminology to integrate cancer research results. Studies in Health Technology and Informatics, 107(1), 2004. (Cited on page 16.)

[DHS05] Xi Deng, Volker Haarslev, and Nematollaah Shiri. A framework for explaining reasoning in description logics. In Working Notes of the AAAI Fall Symposium on Explanation-aware Computing, 2005. (Cited on page 45.)

[DKP+11] Chiara Del Vescovo, Pavel Klinov, Bijan Parsia, Ulrike Sattler, and Thomas Schneider. DeMoSt: a tool for exploring the decomposition and the modular structure of OWL ontologies. In Proceedings of the 10th International Semantic Web Conference (ISWC-11), 2011. (Cited on page 191.)

[DL96] Giuseppe De Giacomo and Maurizio Lenzerini. Tbox and Abox reasoning in expressive description logics. In Proceedings of the 5th International Conference on the Principles of Knowledge Representation and Reasoning (KR-96), pages 316–327, 1996. (Cited on page 33.)

[DQJ09] Jianfeng Du, Guilin Qi, and Qiu Ji. Goal-directed module extraction for explaining OWL DL entailments. In Proceedings of the 8th International Semantic Web Conference (ISWC-09), pages 163–179, 2009. (Cited on pages 46 and 52.)

[DS08] Jianfeng Du and Yi-Dong Shen. Computing minimum cost diagnoses to repair populated DL-based ontologies. In Proceedings of the 17th International World Wide Web Conference (WWW-08), 2008. (Cited on page 63.)

[FH88] Amy Felty and Greg Hager. Explaining modal logic proofs. In Proceedings of the IEEE 1988 International Conference on Systems, Man, and Cybernetics, 1988. (Cited on page 62.)

[Fit83] Melvin Fitting. Proof Methods for Modal and Intuitionistic Logics. Springer, 1983. (Cited on page 62.)

[FMV10] Enrico Franconi, Thomas Meyer, and Ivan Varzinczak. Semantic diff as the basis for knowledge base versioning. In Proc. of NMR-10, 2010. (Cited on page 42.)

[FS05] Gerhard Friedrich and Kostyantyn Shchekotykhin. A general diagnosis method for ontologies. In Proceedings of the 4th International Semantic Web Conference (ISWC-05), pages 232–246, 2005. (Cited on pages 51 and 63.)

[Gan05] Aldo Gangemi. Ontology design patterns for semantic web content. In Proceedings of the 4th International Semantic Web Conference (ISWC-05), pages 262–276, 2005. (Cited on page 43.)

[Gär88] Peter Gärdenfors. Knowledge in Flux: Modeling the Dynamics of Epistemic States. MIT Press, 1988. (Cited on page 46.)

[GLC+01] Fernand Gobet, Peter C.R. Lane, Steve Croker, Peter C-H. Cheng, Gary Jones, Iain Oliver, and Julian M. Pine. Chunking mechanisms in human learning. Trends in Cognitive Sciences, 5(6):236–243, 2001. (Cited on page 148.)

[GMPS13] Rafael S. Gonçalves, Nicolas Matentzoglu, Bijan Parsia, and Uli Sattler. The empirical robustness of description logic classification. Submitted to the International Joint Conference on Artificial Intelligence (IJCAI), 2013. (Cited on page 182.)

[GPS11a] Rafael S. Gonçalves, Bijan Parsia, and Ulrike Sattler. Analysing the evolution of the NCI thesaurus. In Proceedings of the 24th IEEE International Symposium on Computer-Based Medical Systems (CBMS-11), 2011. (Cited on page 43.)

[GPS11b] Rafael S. Gonçalves, Bijan Parsia, and Ulrike Sattler. Categorising logical differences between OWL ontologies. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management (CIKM-11), 2011. (Cited on pages 42 and 88.)

[GPS12a] Rafael S. Gonçalves, Bijan Parsia, and Ulrike Sattler. Concept-based semantic difference in expressive description logics. In Proceedings of the 25th International Workshop on Description Logics (DL-12), 2012. (Cited on page 88.)

[GPS12b] Rafael S. Gonçalves, Bijan Parsia, and Ulrike Sattler. Performance heterogeneity and approximate reasoning in description logic ontologies. In Proceedings of the 11th International Semantic Web Conference (ISWC-12), 2012. (Cited on page 44.)

[Gru93] Thomas Gruber. Towards principles for the design of ontologies used for knowledge sharing. In Formal Ontology in Conceptual Analysis and Knowledge Representation, 1993. (Cited on page 36.)

[GSG04] Pierre Grenon, Barry Smith, and Louis Goldberg. Biodynamic ontology: applying the bfo in the biomedical domain. In D. M. Pisanelli, editor, Ontologies in Medicine. IOS Press, 2004. (Cited on page 88.)

[GSW89] Russell Greiner, Barbara A. Smith, and Ralph W. Wilkerson. A correction to the algorithm in Reiter's theory of diagnosis. Artificial Intelligence, 41(1):79–88, 1989. (Cited on page 51.)

[GSW05] Bernhard Ganter, Gerd Stumme, and Rudolf Wille. Formal Concept Analysis: Foundations and Applications. Springer, 2005. (Cited on pages 42 and 166.)

[GW02] Nicola Guarino and Christopher A. Welty. Evaluating ontological decisions with ontoclean. Commun. ACM, 45(2):61–65, 2002. (Cited on page 64.)

[GW04] Nicola Guarino and Christopher A. Welty. An overview of ontoclean. In Handbook on Ontologies, pages 151–172. Springer, 2004. (Cited on page 64.)

[GWS07] Andrew Gibson, Katy Wolstencroft, and Robert Stevens. Promotion of ontological comprehension: Exposing terms and metadata with web 2.0. In Proceedings of the Workshop on Social and Collaborative Construction of Structured Knowledge (CKC 2007), 2007. (Cited on page 135.)

[HBMB05] Graeme S. Halford, Rosemary Baker, Julie E. McCredden, and John D. Bain. How many variables can humans process? Psychological Science, 16(1):70–76, 2005. (Cited on page 148.)

[HBPS11a] Matthew Horridge, Samantha Bail, Bijan Parsia, and Ulrike Sattler. The cognitive complexity of OWL justifications. In Proceedings of the 24th International Workshop on Description Logics (DL-11), 2011. (Cited on pages 14, 26, 58, 59, and 140.)

[HBPS11b] Matthew Horridge, Samantha Bail, Bijan Parsia, and Ulrike Sattler. The cognitive complexity of OWL justifications. In Proceedings of the 10th International Semantic Web Conference (ISWC-11), 2011. (Cited on pages 19, 26, 58, 59, 104, 109, 111, 122, 123, 135, 153, 178, 188, and 190.)

[HBPS13] Matthew Horridge, Samantha Bail, Bijan Parsia, and Ulrike Sattler. Toward cognitive support for OWL justifications. Knowledge-Based Systems (accepted), 2013. (Cited on pages 26 and 59.)

[HDG+06] Matthew Horridge, Nick Drummond, John Goodwin, Alan L. Rector, Robert Stevens, and Hai Wang. The Manchester OWL syntax. In Proceedings of the 2nd International Workshop on OWL: Experiences and Directions (OWLED-06), 2006. (Cited on page 37.)

[Hen07] James Hendler. The dark side of the semantic web. IEEE Intelligent Systems, 22(1):2–3, 2007. (Cited on page 39.)

[HKS06] Ian Horrocks, Oliver Kutz, and Ulrike Sattler. The even more irresistible SROIQ. In Proceedings of the 10th International Conference on the Principles of Knowledge Representation and Reasoning (KR-06), 2006. (Cited on pages 13, 31, and 37.)

[Hor97] Ian Horrocks. Optimising Tableaux Decision Procedures for Description Logics. PhD thesis, University of Manchester, 1997. (Cited on page 33.)

[Hor02] Ian Horrocks. Daml+oil: a description logic for the semantic web. IEEE Data Engineering Bulletin, 25(1):4–9, 2002. (Cited on pages 36 and 39.)

[Hor10] Matthew Horridge. OWL syntaxes. http://ontogenesis.knowledgeblog.org/88 [Online. last-accessed: 2012-06-19 16:13:48], 2010. (Cited on page 37.)

[Hor11a] Matthew Horridge. Justification Based Explanation in Ontologies. PhD thesis, The University of Manchester, 2011. (Cited on pages 13, 48, 51, 52, 54, 56, 91, 98, 120, 126, 141, and 190.)

[Hor11b] Matthew Horridge. A practical guide to building OWL ontologies using Protégé 4 and CO-ODE tools, edition 1.3. http://owl.cs.manchester.ac.uk/tutorials/protegeowltutorial/, 2011. (Cited on page 40.)

[HP09] Matthew Horridge and Bijan Parsia. From justifications to proofs for entailments in OWL. In Proceedings of the 6th International Workshop on OWL: Experiences and Directions (OWLED-09), 2009. (Cited on page 57.)

[HP10] Matthew Horridge and Bijan Parsia. From justifications towards proofs for ontology engineering. In Proceedings of the 12th International Conference on the Principles of Knowledge Representation and Reasoning (KR-10), 2010. (Cited on page 14.)

[HPS08] Matthew Horridge, Bijan Parsia, and Ulrike Sattler. Laconic and precise justifications in OWL. In Proceedings of the 7th International Semantic Web Conference (ISWC-08), pages 323–338, 2008. (Cited on pages 14, 19, 38, 53, 54, 125, and 151.)

[HPS09] Matthew Horridge, Bijan Parsia, and Ulrike Sattler. Lemmas for justifications in OWL. In Proceedings of the 22nd International Workshop on Description Logics (DL-09), 2009. (Cited on pages 40, 91, and 140.)

[HPS10a] Matthew Horridge, Bijan Parsia, and Ulrike Sattler. Justification masking in OWL. In Proceedings of the 23rd International Workshop on Description Logics (DL-10), 2010. (Cited on pages 54, 55, and 125.)

[HPS10b] Matthew Horridge, Bijan Parsia, and Ulrike Sattler. Justification oriented proofs in OWL. In Proceedings of the 9th International Semantic Web Conference (ISWC-10), pages 354–369, 2010. (Cited on pages 19 and 57.)

[HPS11] Matthew Horridge, Bijan Parsia, and Ulrike Sattler. The state of biomedical ontologies. In Proceedings of the 2011 ISMB Bio-Ontologies SIG, 2011. (Cited on pages 16 and 39.)

[HPS12] Matthew Horridge, Bijan Parsia, and Ulrike Sattler. Extracting justifications from BioPortal ontologies. In Proceedings of the 11th International Semantic Web Conference (ISWC-12), 2012. (Cited on page 39.)

[HPSv03] Ian Horrocks, Peter F. Patel-Schneider, and Frank van Harmelen. From SHIQ and RDF to OWL: The making of a web ontology language. Journal of Web Semantics, 1(1):7–26, 2003. (Cited on pages 28 and 31.)

[HST99] Ian Horrocks, Ulrike Sattler, and Stephan Tobies. Practical reasoning for expressive description logics. In Proceedings of the 6th International Conference on Logic for Programming and Automated Reasoning (LPAR-99), pages 161–180, 1999. (Cited on page 31.)

[HST00] Ian Horrocks, Ulrike Sattler, and Stephan Tobies. Practical reasoning for very expressive description logics. Logic Journal of the IGPL, 8(3):239–264, 2000. (Cited on page 33.)

[HvT05] Zhisheng Huang, Frank van Harmelen, and Annette Teije. Reasoning with inconsistent ontologies. In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI-05), 2005. (Cited on page 36.)

[Jac92] Peter Jackson. Computing prime implicates. In Proceedings of the 1992 ACM annual conference on Communications, pages 65–72, 1992. (Cited on page 78.)

[JGSV06] Cliff Joslyn, Damian Gessler, Stefan Schmidt, and Karin Verspoor. Distributed representations of bio-ontologies for semantic web services. In Bio-Ontologies, 2006. (Cited on page 72.)

[JL80] Philip N. Johnson-Laird. Mental models in cognitive science. Cognitive Science, 4(1):71–115, 1980. (Cited on pages 142 and 191.)

[JQH08] Qiu Ji, Guilin Qi, and Peter Haase. A relevance-based algorithm for finding justifications of DL entailments. Technical report, University of Karlsruhe, 2008. (Cited on pages 47 and 60.)

[Kal06] Aditya Kalyanpur. Debugging and Repair of OWL Ontologies. PhD thesis, University of Maryland at College Park, 2006. (Cited on pages 46, 51, 102, 141, 144, and 146.)

[KDLD08] Mykola Konyk, Alexander De Leon, and Michel Dumontier. Chemical knowledge for the semantic web. In Data Integration in the Life Sciences, pages 169–176. Springer Berlin / Heidelberg, 2008. (Cited on page 89.)

[Kee07] C. Maria Keet. Enhancing comprehension of ontologies and conceptual models through abstractions. Proceedings of the 10th Conference of the Italian Association for Artificial Intelligence (AI*IA-07), pages 813–821, 2007. (Cited on page 22.)

[KKS11] Yevgeny Kazakov, Markus Krötzsch, and Frantisek Simancik. Unchain my EL reasoner. In Proceedings of the 24th International Workshop on Description Logics (DL-11), 2011. (Cited on page 37.)

[KPC06] Aditya Kalyanpur, Bijan Parsia, and Bernardo Cuenca Grau. Beyond asserted axioms: Fine-grain justifications for OWL-DL entailments. In Proceedings of the 19th International Workshop on Description Logics (DL-06), 2006. (Cited on pages 14, 19, and 53.)

[KPHS07] Aditya Kalyanpur, Bijan Parsia, Matthew Horridge, and Evren Sirin. Finding all justifications of OWL DL entailments. In Proceedings of the 6th International Semantic Web Conference (ISWC/ASWC-07), pages 267–280, 2007. (Cited on pages 50, 51, and 153.)

[KPS+06] Aditya Kalyanpur, Bijan Parsia, Evren Sirin, Bernardo Cuenca Grau, and James Hendler. Swoop: A web ontology editing browser. Journal of Web Semantics, 4(2):144–153, 2006. (Cited on pages 20 and 49.)

[KPSC06] Aditya Kalyanpur, Bijan Parsia, Evren Sirin, and Bernardo Cuenca Grau. Repairing unsatisfiable concepts in OWL ontologies. In Proceedings of the 3rd European Semantic Web Conference (ESWC-06), pages 170–184, 2006. (Cited on pages 14, 41, 48, and 179.)

[KPSH05] Aditya Kalyanpur, Bijan Parsia, Evren Sirin, and James Hendler. Debugging unsatisfiable classes in OWL ontologies. Journal of Web Semantics, 3(4):268–293, 2005. (Cited on pages 13, 14, 20, 46, 50, 58, 61, 105, and 188.)

[KŠK11] Petr Křemen, Marek Šmíd, and Zdeněk Kouba. OWLDiff: A practical tool for comparison and merge of OWL ontologies. In Proceedings of the 22nd International Conference on Database and Expert Systems Applications (DEXA-11), 2011. (Cited on page 42.)

[Kwo05] Francis Kwong. Practical approach to explaining ALC subsumption. MPhil thesis, The University of Manchester, 2005. (Cited on page 50.)

[Lam07] Joey Sik Chun Lam. Methods for Resolving Unsatisfiable Ontologies. PhD thesis, University of Aberdeen, 2007. (Cited on pages 14, 49, 58, and 153.)

[LH05] Thorsten Liebig and Michael Halfmann. A tableau-based explainer for DL subsumption. In Proceedings of the 14th International Conference on Automated Reasoning with Analytic Tableaux and Related Methods (TABLEAUX-05), pages 323–327, 2005. (Cited on page 50.)

[Lin89] Christoph Lingenfelder. Structuring computer generated proofs. In Proceedings of the 11th International Joint Conference on Artificial Intelligence (IJCAI-89), pages 378–383, 1989. (Cited on page 62.)

[LMS04] Inês Lynce and João Marques-Silva. On computing minimum unsatisfiable cores. In Int. Symposium on Theory and Applications of Satisfiability Testing, pages 305–310, 2004. (Cited on page 46.)

[LN04] Thorsten Liebig and Olaf Noppens. OntoTrack: Combining browsing and editing with reasoning and explaining for OWL lite ontologies. In Proceedings of the 3rd International Semantic Web Conference (ISWC-04), pages 244–257, 2004. (Cited on page 44.)

[LPSV06] Joey Sik Chun Lam, Jeff Z. Pan, Derek Sleeman, and Wamberto Vasconcelos. A fine-grained approach to resolving unsatisfiable ontologies. In Proceedings of the 5th IEEE/WIC/ACM International Conference on Web Intelligence (WI-06). Springer, 2006. (Cited on pages 14, 19, and 53.)

[LS08] Harris Lin and Evren Sirin. Pellint - a performance lint tool for Pellet. In Proceedings of the 5th International Workshop on OWL: Experiences and Directions (OWLED-08EU), 2008. (Cited on page 44.)

[Lut02] Carsten Lutz. The Complexity of Description Logics with Concrete Domains. PhD thesis, RWTH Aachen, 2002. (Cited on page 116.)

[LvHN05] Thorsten Liebig, Friedrich von Henke, and Olaf Noppens. Explanation support for OWL authoring. In International Symposium on Explanation-aware Computing (ExaCt 2005), pages 86–93, 2005. (Cited on pages 44 and 45.)

[MB95] Deborah L. McGuinness and Alex Borgida. Explaining subsumption in description logics. In Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95), pages 816–821, 1995. (Cited on pages 44 and 62.)

[McG96] Deborah L. McGuinness. Explaining reasoning in description logics. PhD thesis, Rutgers University, 1996. (Cited on pages 17 and 44.)

[MHL07] Yue Ma, Pascal Hitzler, and Zuoquan Lin. Algorithms for paraconsistent reasoning with OWL. In Proceedings of the 4th European Semantic Web Conference (ESWC-07), pages 399–413, 2007. (Cited on page 36.)

[Mil56] George A. Miller. The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review, 63(2):81–97, 1956. (Cited on pages 148 and 149.)

[Min74] Marvin Minsky. A framework for representing knowledge. MIT-AI Laboratory Memo 306, 1974. (Cited on page 28.)

[MIS12] Eleni Mikroyannidi, Luigi Iannone, and Robert Stevens. Rio: The regularities inspector for ontologies plugin for Protégé-4. In ICBO, 2012. (Cited on page 43.)

[MLBP06] Thomas Meyer, Kevin Lee, Richard Booth, and Jeff Z. Pan. Finding maximally satisfiable terminologies for the description logic ALC. In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI-06), pages 269–274, 2006. (Cited on pages 47 and 50.)

[MLP06] Thomas Meyer, Kevin Lee, and Jeff Pan. Computing maximally satisfiable terminologies for the description logic ALC with cyclic definitions. 2006. (Cited on page 47.)

[MMIS12] Eleni Mikroyannidi, Nor Azlinayati Abdul Manaf, Luigi Iannone, and Robert Stevens. Analysing syntactic regularities in ontologies. In Proceedings of the 9th International Workshop on OWL: Experiences and Directions (OWLED-12), 2012. (Cited on pages 39 and 43.)

[MMV10] Thomas Meyer, Kodylan Moodley, and Ivan Varzinczak. First steps in the computation of root justifications. In Proc. of ARCOE-10, 2010. (Cited on pages 61 and 188.)

[MSR11] Eleni Mikroyannidi, Robert Stevens, and Alan L. Rector. Identifying ontology design styles with metrics. In 7th International Workshop on Semantic Web Enabled Software Engineering (SWESE), 2011. (Cited on pages 88 and 192.)

[NBH+06] Stephen E. Newstead, Peter Brandon, Simon J. Handley, Ian Dennis, and Jonathan St. B. T. Evans. Predicting the difficulty of complex logical reasoning problems. Psychology Press, 12:62–90, 2006. (Cited on page 142.)

[Neb90] Bernhard Nebel. Reasoning and Revision in Hybrid Representation Systems, volume 422 of Lecture Notes in Artificial Intelligence. Springer-Verlag, 1990. (Cited on page 46.)

[NPPW12a] Tu Anh T. Nguyen, Richard Power, Paul Piwek, and Sandra Williams. Measuring the understandability of deduction rules for OWL. In Proc. of WoDOOM-12, 2012. (Cited on pages 14, 58, 122, and 140.)

[NPPW12b] Tu Anh T. Nguyen, Richard Power, Paul Piwek, and Sandra Williams. Planning accessible explanations for entailments in OWL ontologies. In Proc. of INLG, pages 110–114, 2012. (Cited on page 57.)

[NRG12] Nadeschda Nikitina, Sebastian Rudolph, and Birte Glimm. Interactive ontology revision. Journal of Web Semantics, 12:118–130, 2012. (Cited on pages 62, 63, 83, and 144.)

[NSW+09] Natalya F. Noy, Nigam H. Shah, Patricia L. Whetzel, Benjamin Dai, Michael Dorf, Nicholas Griffith, Clement Jonquet, Daniel L. Rubin, Margaret-Anne Storey, Christopher G. Chute, and Mark A. Musen. Bioportal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Research, 37:W170–W173, 2009. (Cited on page 39.)

[owl09] OWL 2 profiles. http://www.w3.org/TR/owl2-profiles/, 2009. (Cited on page 38.)

[PS10] Rafael Peñaloza and Baris Sertkaya. Complexity of axiom pinpointing in the DL-Lite family of description logics. In Proceedings of the 19th European Conference on Artificial Intelligence (ECAI-10), 2010. (Cited on page 47.)

[PSK05] Bijan Parsia, Evren Sirin, and Aditya Kalyanpur. Debugging OWL ontologies. In Proceedings of the 14th International World Wide Web Conference (WWW-05), pages 633–640, 2005. (Cited on page 20.)

[PSMB+91] Peter F. Patel-Schneider, Deborah L. McGuinness, Ronald J. Brachman, Lori Alperin Resnick, and Alexander Borgida. The CLASSIC knowledge representation system: Guiding principles and implementation rationale. SIGART Bulletin, 2(3):108–113, 1991. (Cited on page 44.)

[Qui52] Willard Van Orman Quine. The problem of simplifying truth functions. The American Mathematical Monthly, 59(8):521–531, 1952. (Cited on page 78.)

[Qui59] Willard Van Orman Quine. On cores and prime implicants of truth functions. The American Mathematical Monthly, 66(9):755–760, 1959. (Cited on page 78.)

[RCVB09] Catherine Roussey, Oscar Corcho, and Luis Manuel Vilches-Blázquez. A catalogue of OWL ontology antipatterns. In Proceedings of the 5th International Conference on Knowledge Capture (K-CAP-09), pages 205–206, 2009. (Cited on pages 41, 42, and 140.)

[RDH+04] Alan L. Rector, Nick Drummond, Matthew Horridge, Jeremy Rogers, Holger Knublauch, Robert Stevens, Hai Wang, and Chris Wroe. OWL pizzas: Practical experience of teaching OWL-DL: Common errors & common patterns. pages 63–81, 2004. (Cited on page 41.)

[Rei87] Raymond Reiter. A theory of diagnosis from first principles. Artificial Intelligence, 32(1):57–95, 1987. (Cited on pages 51 and 104.)

[RMC12] Mariano Rodriguez-Muro and Diego Calvanese. Quest, an OWL 2 QL reasoner for ontology-based data access. In Proceedings of the 9th International Workshop on OWL: Experiences and Directions (OWLED-12), 2012. (Cited on page 38.)

[RN03] Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Pearson Education, 2003. (Cited on page 17.)

[RSFF12] Patrick Rodler, Kostyantyn M. Shchekotykhin, Philipp Fleiss, and Gerhard Friedrich. RIO: Minimizing user interaction in ontology debugging. CoRR, abs/1209.3734, 2012. (Cited on page 63.)

[Rud87] Piotr Rudnicki. Obvious inferences. Journal of Automated Reasoning, 3(4):383–393, 1987. (Cited on page 62.)

[SC03] Stefan Schlobach and Ronald Cornet. Non-standard reasoning services for the debugging of description logic terminologies. In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI-03), pages 355–362, 2003. (Cited on pages 13, 45, 50, and 146.)

[Sch05a] Stefan Schlobach. Debugging and semantic clarification by pinpointing. In Proceedings of the 2nd European Semantic Web Conference (ESWC-05), pages 226–240, 2005. (Cited on pages 42, 61, and 105.)

[Sch05b] Stefan Schlobach. Diagnosing terminologies. In Proceedings of the 20th National Conference on Artificial Intelligence (AAAI-05), pages 670–675, 2005. (Cited on pages 46 and 51.)

[Sch11] Thomas Scharrenbach. End-User Assisted Ontology Evolution in Uncertain Domains. PhD thesis, University of Zurich, Faculty of Economics, 2011. (Cited on page 88.)

[SFJ08] Kostyantyn Shchekotykhin, Gerhard Friedrich, and Dietmar Jannach. On computing minimal conflicts for ontology debugging. In Proceedings of the Workshop on Model-Based Systems (MBS-08), 2008. (Cited on page 51.)

[SFRF12] Kostyantyn M. Shchekotykhin, Philipp Fleiss, Patrick Rodler, and Gerhard Friedrich. Direct computation of diagnoses for ontology debugging. CoRR, abs/1209.0997, 2012. (Cited on pages 63 and 83.)

[SHCvH07] Stefan Schlobach, Zhisheng Huang, Ronald Cornet, and Frank van Harmelen. Debugging incoherent terminologies. Journal of Automated Reasoning, 39(3):317–349, 2007. (Cited on page 45.)

[SMH08] Rob Shearer, Boris Motik, and Ian Horrocks. HermiT: A highly-efficient OWL reasoner. In Proceedings of the 5th International Workshop on OWL: Experiences and Directions (OWLED-08EU), 2008. (Cited on page 37.)

[Sow87] John F. Sowa. Semantic networks. In S. C. Shapiro, editor, Encyclopedia of Artificial Intelligence 2. Journal of Web Semantics, 1987. (Cited on page 28.)

[SPC+07] Evren Sirin, Bijan Parsia, Bernardo Cuenca Grau, Aditya Kalyanpur, and Yarden Katz. Pellet: A practical OWL-DL reasoner. Journal of Web Semantics, 5(2):51–53, 2007. (Cited on page 37.)

[SQJH08] Boontawee Suntisrivaraporn, Guilin Qi, Qiu Ji, and Peter Haase. A modularization-based approach to finding all justifications for OWL DL entailments. In Proceedings of the 3rd Asian Semantic Web Conference (ASWC-08), volume 5367, pages 1–15, 2008. (Cited on page 52.)

[SS91] Manfred Schmidt-Schauß and Gert Smolka. Attributive concept descriptions with complements. Artificial Intelligence, 48(1):1–26, 1991. (Cited on page 29.)

[SSZ09] Ulrike Sattler, Thomas Schneider, and Michael Zakharyaschev. Which kind of module should I extract? In Proceedings of the 22nd International Workshop on Description Logics (DL-09), 2009. (Cited on page 51.)

[ST99] Ron Shamir and Dekel Tsur. Faster subtree isomorphism. Journal of Algorithms, 33(2):267–280, 1999. (Cited on page 127.)

[Sun08] Boontawee Suntisrivaraporn. Module extraction and incremental classification: A pragmatic approach for EL+ ontologies. In Proceedings of the 5th European Semantic Web Conference (ESWC-08), pages 230–244, 2008. (Cited on pages 46 and 52.)

[Sun09] Boontawee Suntisrivaraporn. Polynomial-Time Reasoning Support for Design and Maintenance of Large-Scale Biomedical Ontologies. PhD thesis, Technische Universität Dresden, 2009. (Cited on page 153.)

[TAM+05] Samir Tartir, I. Budak Arpinar, Michael Moore, Amith P. Sheth, and Boanerges Aleman-Meza. OntoQA: Metric-based ontology quality analysis. In Proc. of KADASH-05, 2005. (Cited on page 23.)

[TH06] Dmitry Tsarkov and Ian Horrocks. FaCT++ description logic reasoner: System description. In Proceedings of the 3rd International Joint Conference on Automated Reasoning (IJCAR-06), 2006. (Cited on page 37.)

[Tuf] Edward Tufte. The magical number seven, plus or minus two: Not relevant for design. http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0000U6. (Cited on page 149.)

[W3C04a] W3C. RDF vocabulary description language 1.0: RDF schema. http://www.w3.org/TR/rdf-schema/, 2004. (Cited on page 36.)

[W3C04b] W3C. Resource description framework (RDF): Concepts and abstract syntax. http://www.w3.org/TR/rdf-concepts/, 2004. (Cited on page 36.)

[W3C09] W3C. OWL 2 web ontology language. http://www.w3.org/TR/owl2-overview/, 2009. (Cited on pages 13 and 36.)

[W3C12] W3C. OWL 2 web ontology language RDF-based semantics (second edition). http://www.w3.org/TR/owl2-rdf-based-semantics/, 2012. (Cited on page 37.)

[WP06] Taowei David Wang and Bijan Parsia. CropCircles: Topology sensitive visualization of OWL class hierarchies. In Proceedings of the 5th International Semantic Web Conference (ISWC-06), pages 695–708, 2006. (Cited on pages 64 and 191.)

[WPH06] Taowei David Wang, Bijan Parsia, and James Hendler. A survey of the web ontology landscape. In Proceedings of the 5th International Semantic Web Conference (ISWC-06), pages 682–694, 2006. (Cited on page 16.)

Appendix A

Ontologies in the test corpus

Table A.1: ONT = ontology name, EN = # entailments, CJ = # complex justifications, UC = # unsat. classes, DL/EL/QL/RL = OWL 2 profile, CL = # classes, OP = # object properties, DP = # data properties, IN = # individuals, DT = # datatypes, AX = # logical axioms, DL = DL expressivity.