Automated Semantic Forgetting for Expressive Description Logics

AUTOMATED SEMANTIC FORGETTING FOR EXPRESSIVE DESCRIPTION LOGICS A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Science and Engineering 2018 Yizheng Zhao School of Computer Science Contents Abstract 6 Declaration 8 Copyright Statement 9 Acknowledgements 10 1 Introduction 11 1.1 Applications of Forgetting . 12 1.2 Challenges and Contributions . 13 1.3 Overview of the Thesis . 15 1.4 Published Results . 16 2 Basics of Description Logics 18 2.1 The Basic Description Logic ALC ..................... 18 2.2 Extensions of the Basic ALC ........................ 23 2.3 Relationships with Other Logics . 27 3 Basics of Forgetting 31 3.1 Forgetting in Classical Logics . 31 3.2 Second-Order Quantifier Elimination . 33 3.3 Forgetting in Modal Logics . 38 3.4 Forgetting in Description Logics . 40 4 Concept Forgetting for ALCOI 44 4.1 The Description Logic ALCOI ...................... 44 2 4.2 Generalised Ackermann’s Lemma . 47 4.3 The Normalisation . 54 4.4 The Calculus – AckC ............................ 57 4.5 The Forgetting Method . 76 4.6 Examples . 87 5 Role Forgetting for ALCOIH(O, u) 91 5.1 The Description Logic ALCOIH(O, u).................. 92 5.2 Obstacles to Role Forgetting . 96 5.3 The Normalisation . 99 5.4 The Calculus – AckR ............................ 104 5.5 The Forgetting Method . 119 5.6 Examples . 131 6 Implementation and Evaluation 134 6.1 The Implementation – Fame ........................ 135 6.2 The Corpus . 140 6.3 Forgetting Concept Symbols . 142 6.4 Forgetting Role Symbols . 148 6.5 Forgetting Concept and Role Symbols . 153 6.6 Comparison of Fame with Lethe ..................... 155 7 Conclusions and Future Directions 156 7.1 Conclusions . 156 7.2 Future Directions . 159 Bibliography 162 Word Count: 54639 3 List of Tables 4.1 Frequency counts of Σ-symbols . 84 4.2 Frequency counts of Σ-symbols in Example 4.6.1 . 88 5.1 Interpretations of negative TBox premises . 110 5.2 Interpretations of negative RBox premises . 110 6.1 Statistics of ontologies selected from BioPortal . 141 6.2 Results of forgetting 10% of concept symbols in the signature . 144 6.3 Results of forgetting 40% of concept symbols in the signature . 145 6.4 Results of forgetting 70% of concept symbols in the signature . 146 6.5 Results of forgetting 100 concept symbols in the signature . 148 6.6 Results of forgetting 10% of role symbols in the signature . 149 6.7 Results of forgetting 40% of role symbols in the signature . 150 6.8 Results of forgetting 70% of role symbols in the signature . 151 6.9 Results of forgetting 20 role symbols in the signature . 152 6.10 Results of forgetting 10% of concept symbols and 10% of role symbols . 153 6.11 Results of forgetting 40% of concept symbols and 10% of role symbols . 154 6.12 Results of forgetting 70% of concept symbols and 70% of role symbols . 154 4 List of Figures 4.1 Transformations of concepts into negation normal form . 55 4.2 Transformation of concepts into clausal normal form . 56 C 4.3 The Ackermann rules for eliminating A ∈ NC from a set of clauses . 59 4.4 The SurfacingC rules for transforming A-clauses into A-reduced form . 62 4.5 The SkolemisationC rules for transforming A-clauses into A-reduced form 66 C 4.6 The Purify rules for eliminating A ∈ NC from a set of clauses . 69 4.7 The PurifyC rules in the sense of the AckermannC rules . 70 4.8 The simplification rules in AckC ..................... 72 4.9 The main phases in concept forgetting process . 77 4.10 Transformation of TBox axioms and ABox assertions into TBox clauses 77 4.11 Transformation of TBox clauses into TBox axioms and ABox assertions 79 5.1 Transformation of TBox clauses into normal form . 101 5.2 The rewriteR rules for transforming r-clauses into r-reduced form . 106 R 5.3 The Ackermann rule for eliminating r ∈ NR from a set of clauses . 109 R 5.4 The Purify rules for eliminating r ∈ NR from a set of clauses . 115 5.5 The PurifyR rules in the sense of the AckermannR rule . 116 5.6 The main phases in role forgetting process . 120 5.7 Transformation of RBox axioms into RBox clauses . 121 5.8 The SkolemisationO rules for transforming A-clauses into A-reduced form123 5.9 Transformation of RBox clauses into RBox axioms . 124 6.1 The framework of Fame .......................... 136 5 The University of Manchester Yizheng Zhao Doctor of Philosophy Automated Semantic Forgetting for Expressive Description Logics March 21, 2018 Ontologies, exploiting Description Logics (DLs) as the representational underpin- ning, provide a logic-based data model for knowledge processing thereby supporting intelligent reasoning of domain knowledge for various applications, most evidently for modern biomedical, life science and text mining applications. However, with their growing utilisation, not only has the number of available ontologies increased consid- erably, but they are also blowing up in size and becoming more complex to manage. Moreover, capturing domain knowledge in the form of ontologies is labour-intensive work which is expensive from an implementation perspective. There is a strong de- mand for techniques and automated tools for creating restricted views of ontologies while preserving complete information up to the restricted views. Forgetting is a non-standard reasoning technique which provides such a service by eliminating concept and role symbols from ontologies in a way such that all logical consequences are preserved up to the remaining signature. It has turned out to be very useful in ontology-based knowledge processing, as it allows users to focus on spe- cific parts of (usually very large) ontologies for easy reuse, or to zoom in on (usually very complex) ontologies for in-depth analysis. Other uses of forgetting are information hiding, explanation generation, abduction, ontology debugging and repair, and computing logical differences between ontology versions. Despite its notable usefulness as described above, forgetting, on the other hand, is an inherently difficult problem — it is much harder than standard reasoning (satisfi- ability testing) — and very few logics are known to be complete for forgetting, there has been insufficient research on the topic and few forgetting tools are available. This thesis investigates practical methods for semantic forgetting in expressive description logics not considered before. In particular, we present a practical method for forgetting concept symbols from ontologies expressible in the description logic ALCOI, i.e., the basic ALC extended with nominals and inverse roles. Being based on a generalisation of a monotonicity property called Ackermann’s Lemma, the method is the first and only approach to concept forgetting in description logics with nominals. We also present a practical method for forgetting role symbols from ontologies expressible in the description logic ALCOIH(O, u), i.e., ALCOI extended with role hierarchies, the universal role and role conjunction. The universal role and role conjunction enrich our target language, making it expressive enough to represent the forgetting solution which otherwise would have been lost. Being based on a non-trivial generalisation of Ackermann’s Lemma, the method is the first and only approach so far that provides support for role forgetting in description logics with nominals. Both methods are goal-oriented and incremental. They are terminating, and are sound in the sense that the forgetting solutions are equivalent to the original ontologies up to (the interpretations of) the symbols that have been forgotten, possibly with (the interpretations of) the symbols that have been introduced. These two methods can be used as a unifying method for forgetting both concept and role symbols from ontologies expressible in the description logic ALCOIH(O, u). The method has been 6 implemented in Java using the OWL API and the prototypical implementation, called Fame, has been evaluated on a corpus of real-world ontologies (in order to verify its practicality). Performance results have shown that Fame was successful (i.e., elimi- nated all specified concept and role symbols) in most of the test cases, and in most of these cases the elimination was done within a very short period of time. 7 Declaration No portion of the work referred to in the thesis has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning. 8 Copyright Statement i. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes. ii. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made. iii. The ownership of certain Copyright, patents, designs, trade marks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions. iv. Further information on the conditions under which disclosure, publication and com- mercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=487), in any relevant Thesis restriction declarations deposited in the University Library, The Univer- sity Library’s regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and in The University’s Policy on Presentation of Theses.

Automated Semantic Forgetting for Expressive Description Logics

Metadata for Semantic and Social Applications

Schema.Org As a Description Logic

From the Semantic Web to Social Machines

Ontologies and Description Logics

Knowledge Representation in Bicategories of Relations

THE DATA COMPLEXITY of DESCRIPTION LOGIC ONTOLOGIES in Recent Years, the Use of Ontologies to Access Instance Data Has Become In

A New Generation Digital Content Service for Cultural Heritage Institutions

Complexity of the Description Logic ALCM

Description Logics

Description Logics—Basics, Applications, and More

Logic-Based Technologies for Intelligent Systems: State of the Art and Perspectives

Enterprise Data Classification Using Semantic Web Technologies