
Dissertation approved by the Fakultät für Wirtschaftswissenschaften of the Universität Fridericiana zu Karlsruhe for the award of the academic degree of Doktor der Wirtschaftswissenschaften (Dr. rer. pol.).

Reasoning in Description Logics using Resolution and Deductive Databases

M.Sc. Boris Motik

Date of the oral examination (Tag der mündlichen Prüfung): 9 January 2006
Referent: Prof. Dr. Rudi Studer, Universität Karlsruhe (TH)
Korreferentin 1: Dr. habil. , University of Manchester
Korreferent 2: Prof. Dr. Karl-Heinz Waldmann, Universität Karlsruhe (TH)

Karlsruhe, 2006

Abstract

Description logics (DLs) are knowledge representation formalisms with well-understood model-theoretic and computational properties. The DL SHIQ(D) provides the logical underpinning for the OWL family of languages. Metadata management applications, such as those arising in the Semantic Web, often require reasoning with large quantities of assertional information; however, existing algorithms for DL reasoning have been optimized mainly for efficient terminological reasoning. Although techniques for terminological reasoning can be used for assertional reasoning as well, they often do not exhibit good performance in practice.

Deductive databases are knowledge representation systems specifically designed to efficiently reason with large data sets. To see if deductive database techniques can be used to optimize assertional reasoning in DLs, we studied the relationship between the two formalisms. Our main result is a novel algorithm that reduces a SHIQ(D) knowledge base KB to a disjunctive datalog program DD(KB) such that KB and DD(KB) entail the same set of ground facts. In this way, we allow DL reasoning to be performed in DD(KB) using known, optimized algorithms.

The reduction algorithm is based on several novel results. In particular, we developed a resolution-based algorithm for deciding satisfiability of SHIQ knowledge bases. Furthermore, to enable representation of concrete data, such as strings or integers, we developed a general approach for reasoning with concrete domains in the framework of resolution, and we use it to extend our algorithms to SHIQ(D). For unary coding of numbers, these algorithms run in exponential time, and are thus worst-case optimal.

These results allowed us to derive tighter data complexity bounds. Namely, assuming that the size of the assertional knowledge dominates the size of the terminological knowledge, the reduction algorithm runs in nondeterministic polynomial time; furthermore, if disjunctions are not used, it runs in deterministic polynomial time.

Finally, we extended these algorithms in several ways. First, we showed that so-called DL-safe rules can be combined with disjunctive programs obtained by the reduction to increase the expressivity of the logic, without affecting decidability. Second, we derived an algorithm for answering conjunctive queries. Third, we extended the algorithms to support metamodeling.

To estimate the practicability of our algorithms, we implemented KAON2—a new DL reasoning system. Our experiments show a performance increase in query answering over existing DL systems of one or more orders of magnitude.

Contents

I Foundations 1

1 Introduction 3

2 Preliminary Definitions 9
2.1 Multi-Sorted First-Order Logic 9
2.2 Relations and Orderings 13
2.3 Rewrite Systems 14
2.4 Ordered Resolution 14
2.5 Basic Superposition 16
2.6 Splitting 23
2.7 Disjunctive Datalog 24

3 Introduction to Description Logics 27
3.1 The Description Logic SHIQ 27
3.1.1 Example 33
3.2 Description Logics with Concrete Domains 35
3.2.1 Concrete Domains 36
3.2.2 The Description Logic SHIQ(D) 37
3.2.3 Example 39

II From Description Logics to Disjunctive Datalog 41

4 Reduction Algorithm at a Glance 43
4.1 The Main Difficulty in Reducing DLs to Datalog 43
4.2 The General Idea 44
4.3 Translating KB into Clauses 45
4.4 Deciding Satisfiability of Ξ(KB) by R 47
4.5 Translating ALC to Disjunctive Datalog 50
4.6 Examples 52
4.7 Extending the Algorithms to SHIQ(D) 55
4.8 Discussion 56
4.8.1 Independence of the Reduction and the Query 56


4.8.2 Minimal vs. Arbitrary Models 56
4.8.3 Complexity 57
4.8.4 Descriptive vs. Minimal-Model Semantics 58
4.8.5 Unique Name Assumption 59
4.8.6 The Size of DD(KB) 60
4.8.7 The Benefits of Reducing DLs to Disjunctive Datalog 61

5 Deciding SHIQ by Basic Superposition 63
5.1 Decision Procedure Overview 63
5.2 Eliminating Transitivity 65
5.3 Deciding ALCHIQ− 68
5.3.1 Preprocessing 68
5.3.2 Parameters for Basic Superposition 70
5.3.3 Closure of ALCHIQ−-Closures under Inferences by BSDL 71
5.3.4 Termination and Complexity Analysis 77
5.4 Removing the Restriction to Very Simple Roles 79
5.4.1 Transformation by Decomposition 80
5.4.2 Deciding ALCHIQ by Decomposition 85
5.4.3 Safe Role Expressions 87
5.5 Example 91
5.6 Related Work 96

6 Reasoning with a Concrete Domain 99
6.1 Resolution with a Concrete Domain 100
6.1.1 Preliminaries 100
6.1.2 d-Satisfiability 101
6.1.3 Concrete Domain Resolution with Ground Clauses 103
6.1.4 Most General Partitioning Unifiers 107
6.1.5 Concrete Domain Resolution with General Clauses 108
6.1.6 Deleting D-Tautologies 110
6.1.7 Combining Concrete Domains with Other Resolution Calculi 112
6.2 Deciding SHIQ(D) 113
6.2.1 Closures with Concrete Predicates 114
6.2.2 Closure of ALCHIQ(D)-Closures under Inferences 115
6.2.3 Termination and Complexity Analysis 116
6.3 Example 118
6.4 Related Work 120

7 Reducing Description Logics to Disjunctive Datalog 123
7.1 Overview 123
7.2 Eliminating Function Symbols 124
7.3 Removing Irrelevant Clauses 130
7.4 Reduction to Disjunctive Datalog 131

7.5 Equality Reasoning in DD(KB) 132
7.6 Answering Queries in DD(KB) 134
7.7 Example 137
7.8 Related Work 145

8 Data Complexity of Reasoning 147
8.1 Data Complexity of Satisfiability 148
8.2 A Horn Fragment of SHIQ(D) 150
8.3 Discussion 156
8.4 Related Work 157

III Extensions 159

9 Integrating Description Logics with Rules 161
9.1 Reasons for Undecidability of SHIQ(D) with Rules 162
9.2 Combining Description Logics and Rules 164
9.3 DL-Safety Restriction 165
9.4 Expressivity of DL-Safe Rules 166
9.5 Query Answering for DL-Safe Rules 168
9.6 Related Work 170

10 Answering Conjunctive Queries 173
10.1 Definition of Conjunctive Queries 173
10.2 Answering Conjunctive Queries 175
10.3 Deciding Conjunctive Query Containment 180
10.4 Related Work 181

11 The Semantics of Metamodeling 183
11.1 Undecidability of Metamodeling in OWL-Full 184
11.2 Extending DLs with Decidable Metamodeling 188
11.2.1 Metamodeling Semantics for ALCHIQ(D) 188
11.2.2 Deciding ν-Satisfiability 191
11.2.3 Metamodeling and Transitivity 195
11.3 Expressivity of Metamodeling 197
11.4 Related Work 198

IV Practical Considerations 201

12 Implementation of KAON2 203
12.1 KAON2 Architecture 203
12.2 Ontology Clausification 204
12.2.1 Reusing Replacement Predicates 205

12.2.2 Optional Positions 205
12.2.3 Handling Functional Roles 207
12.2.4 Discussion 207
12.3 The Theorem Prover for BS 208
12.3.1 Loop 209
12.3.2 Representing ALCHIQ-Closures 209
12.3.3 Inference Rules 211
12.3.4 Redundancy Elimination Rules 211
12.3.5 Optimizing Number Restrictions 214
12.3.6 Tuning the Calculus Parameters 214
12.3.7 Choosing the Given Closure 215
12.3.8 Indexing Terms and Closures 215
12.4 Disjunctive Datalog Engine 217
12.4.1 Magic Sets 218
12.4.2 Bottom-Up Saturation 219

13 Performance Evaluation 221
13.1 Test Setting 221
13.2 Test Ontologies 223
13.3 Querying Large ABoxes 225
13.3.1 VICODI 225
13.3.2 SEMINTEC 226
13.3.3 LUBM 226
13.3.4 Wine 228
13.4 TBox Reasoning 228

14 Conclusion 231

List of Figures

9.1 Two Similar Models 163

11.1 Grid Structure in a Model of KB D 187
11.2 π- and ν-models of the Example Knowledge Base 191

12.1 KAON2 Architecture 204

13.1 VICODI Ontology Test Results 225
13.2 SEMINTEC Ontology Test Results 226
13.3 LUBM Ontology Test Results 227
13.4 Wine Ontology Test Results 228
13.5 TBox Test Results 229

List of Tables

3.1 Semantics of SHIQ by Mapping to FOL 30
3.2 Direct Model-Theoretic Semantics of SHIQ 31
3.3 Example Terminology Used in ACME’s Catalog 34
3.4 Semantics of SHIQ(D) by Mapping to FOL 39
3.5 Direct Model-Theoretic Semantics of SHIQ(D) 40

4.1 Clause Types after Preprocessing 47
4.2 Types of ALC-Clauses 49
4.3 Possible Inferences by RDL on ALC-Clauses 50

5.1 Closure Types after Preprocessing 70
5.2 Types of ALCHIQ−-Closures 72
5.3 Semantics of Role Expressions 88

6.1 Closures after Preprocessing Stemming from Concrete Datatypes 114
6.2 Types of ALCHIQ(D)-Closures 115

8.1 Definitions of pl+ and pl− 152

9.1 Example Knowledge Base 162
9.2 Example with DL-Safe Rules 167

11.1 Semantics of ALC-Full 185
11.2 Two Semantics for SHIQ(D) with Metamodeling 190
11.3 Semantics of Metamodeling by Mapping into First-Order Logic 192

13.1 Statistics of Test Ontologies 223

Part I

Foundations


Chapter 1

Introduction

Description Logics (DLs) are a family of knowledge representation formalisms that allow representation of domain knowledge and reasoning with it in a formally well-understood way. The first description logic, KL-ONE [24], was proposed in order to address the deficiencies of semantic networks [111] and frame-based knowledge representation systems [94]. KL-ONE is based on a model-theoretic semantics, which provides a formal foundation for the vague and imprecise semantics of earlier systems. Although there are DLs that do not fall into this category, most DLs are fragments of first-order logic.

A description logic terminology (also called a TBox) describes concepts (that is, unary predicates representing sets of individuals) and roles (that is, binary predicates representing links between individuals). Concepts can be atomic, which means that they are denoted by name, or complex, built using constructors that specify necessary and sufficient conditions for membership. Apart from a terminology, a description logic knowledge base usually has an assertional component (also called an ABox), which specifies the membership of individuals or pairs of individuals in concepts and roles, respectively.

Historically, the fundamental reasoning problem in description logics is to determine whether one concept subsumes another, which is the case if the extension of the former necessarily includes the extension of the latter concept. Apart from subsumption, other reasoning problems, such as checking satisfiability of a knowledge base or retrieving concept individuals, are also important in many applications.

Soon after KL-ONE was introduced, the subsumption problem for KL-ONE concepts was found to be undecidable [130]. From this point on, a large body of description logic research focused on investigating the fundamental trade-offs between expressivity and computational complexity. This line of research culminated in a detailed taxonomy of complexity and undecidability results for various DLs; an overview can be found in [4, Chapter 5].

In parallel, an important goal of description logic research was to develop practical reasoning algorithms and to implement them in practical knowledge representation and reasoning systems. Initially, reasoning algorithms were based on structural subsumption. Roughly speaking, such algorithms transform each concept description to a certain normal form; the structures of normal forms are then compared to decide concept subsumption. After initial experiments with systems based on structural subsumption, such as CLASSIC [22] and LOOM [91], it became evident that sound and complete structural subsumption algorithms are possible only for inexpressive logics.

As a reaction to these deficiencies, tableau algorithms were proposed in [131] as an alternative for description logic reasoning. A tableau algorithm demonstrates satisfiability of a knowledge base by trying to build a model.1 If such a model can be built, the knowledge base is evidently satisfiable, and if it cannot, the knowledge base is unsatisfiable. Most other reasoning problems can be reduced to satisfiability checking. Tableau algorithms have been built for very expressive logics, so that they are nowadays considered the state of the art for DL reasoning.

1For some logics, tableau algorithms actually build finite abstractions of possibly infinite models.

SHIQ [73] is a very expressive description logic, which, apart from the usual Boolean operations on concepts and existential and universal quantification on roles, supports advanced features, such as inverse and transitive roles, role hierarchies, and number restrictions. SHIQ(D) is an extension of SHIQ with datatypes—a simplified variant of concrete domains [5]—which allow reasoning with concrete data, such as strings or integers. A tableau algorithm for SHIQ was presented in [73, 74], and it can be easily extended with datatypes in the same way as was done for the related logic SHOQ(D) [70]. SHIQ(D) is important, since it provides the basis of OWL-DL [106]—a W3C recommendation language for ontology representation in the Semantic Web. Namely, OWL-DL is a notational variant of the SHOIN (D) description logic, which differs from SHIQ(D) mainly by supporting nominals—singleton concepts containing only the specified individual.

Reasoning in SHIQ is ExpTime-complete [144]. Moreover, tableau algorithms for SHIQ run in 2NExpTime. Because of their high worst-case complexity, effective optimization techniques are essential to make tableau algorithms usable in practice. Numerous optimization techniques were presented in [66], along with practical evidence of their usefulness. These techniques were implemented in the SHIQ reasoner FaCT [67], allowing the latter to be successfully applied to practical problems. Another state-of-the-art reasoner for SHIQ(D) is Racer [59], distinguished from FaCT mainly by supporting assertional knowledge. The reasoner Pellet [105] was implemented recently with the goal of faithfully realizing all the intricacies of the OWL-DL standard.

Description logics were successfully applied to numerous problems, such as information integration [4, Chapter 16] [85, 19, 51], software engineering [4, Chapter 11], and conceptual modeling [4, Chapter 10] [31]. The performance of reasoning algorithms was found to be quite adequate for applications mainly requiring terminological reasoning. However, new applications, such as metadata management in the Semantic Web, require efficient query answering over large ABoxes. So far, attempts have been made to answer queries by a reduction to ABox consistency checking, which can be performed using tableau algorithms. From a theoretical point of view, this approach is quite elegant, but from a practical point of view, it has a significant drawback: as the number of ABox individuals increases, the performance becomes quite poor.

We believe that there are two main reasons why tableau algorithms scale poorly to ABox reasoning. First, tableau algorithms treat all individuals separately: to answer a query, a tableau check is needed for each individual to see whether it is an answer to the query. Second, only a small subset of ABox information is usually needed to compute the query answer. These deficiencies have already been acknowledged by the research community, and certain optimization techniques for instance retrieval have been developed [60, 61]. However, the performance of query answering is still not satisfactory in practice.

In parallel to description logic research, many techniques were developed to optimize query answering in deductive databases—a family of knowledge representation formalisms that extend the relational model with deductive features [1, 42]. For example, the first deficiency outlined in the previous paragraph is addressed by managing individuals in sets [1]. This opens the door to various optimization techniques, such as the join-order optimization. Consider the query worksAt(P, I), hasName(I, ‘FZI’). It is reasonable to evaluate hasName(I, ‘FZI’) first, and then join the result with the tuples in the worksAt relation: the second conjunct contains a constant, so evaluating it should return a small number of tuples. Join-order optimizations are usually based on database statistics and are very effective in practice. The second deficiency can be addressed by identifying the subset of the ABox that is relevant to the query, and then running the reasoning algorithm only on this subset. Magic sets transformation [18] is the primary technique developed to achieve this goal, and it has been used mainly in the context of Horn deductive databases to optimize evaluation of recursive queries. Roughly speaking, the query is modified to ensure that a set of relevant facts is derived during query evaluation; the original query is then evaluated only within this set. The magic sets transformation for disjunctive programs has been presented in [55, 34], along with empirical evidence of its usefulness.

Since techniques for reasoning in deductive databases are now mature, it is natural to investigate whether they can be used to improve query answering over large ABoxes. To facilitate that, we studied the relationship between DLs and disjunctive datalog, with the goal of deriving an algorithm for reducing a SHIQ(D) knowledge base to a disjunctive datalog program [42] that entails the same set of ground facts as the original knowledge base. Thus, ABox reasoning is reduced to query answering in disjunctive datalog, which allows reusing existing techniques and optimizations for query answering in deductive databases. The reduction algorithm is based on several novel results, which are interesting in their own right. Next, we overview our contributions:

• In Chapter 5 we present a decision procedure for checking satisfiability of SHIQ knowledge bases based on basic superposition [14, 96]—a clausal refutation calculus optimized for theorem proving with equality. Parameterized by a suitable term ordering and a selection function, basic superposition decides only a slightly weaker logic SHIQ−, in which number restrictions are allowed only on roles not having subroles. For full SHIQ, saturation by basic superposition does not necessarily terminate. To remedy that, we introduce a decomposition rule, which transforms certain clauses into simpler ones, thus ensuring termination. We show that decomposition is a very general rule that can be used with any calculus compatible with the standard notion of redundancy [13]. This decision procedure runs in worst-case exponential time, provided that numbers are coded in unary. Unary coding of numbers is standard in description logic algorithms, and, to the best of our knowledge, it is used in all existing reasoning systems. Hence, our algorithms are worst-case optimal under common assumptions.

• Until now, reasoning with concrete domains was predominantly studied in the context of tableau algorithms. Since our algorithms are based on a clausal calculus, existing approaches are not directly applicable to our setting. Therefore, in Chapter 6 we present a general approach for reasoning with a concrete domain in the framework of resolution. Our approach is applicable to any calculus whose completeness proof is based on the model generation method [13], so it can be combined with basic superposition. We apply this approach to the algorithm from Chapter 5 to obtain a procedure for deciding satisfiability of SHIQ(D) knowledge bases. We show that, assuming a bound on the arity of concrete predicates and an exponential bound on the oracle for concrete domain reasoning, adding datatypes does not increase the complexity of reasoning.

• In Chapter 7 we present an algorithm for reducing a SHIQ(D) knowledge base to a disjunctive datalog program. Roughly speaking, the algorithms from Chapter 5 and Chapter 6 are first used to compute all nonground consequences of a knowledge base, which are then transformed in a way that allows simulating all remaining ground inferences by basic superposition in disjunctive datalog.

• Based on the algorithm from Chapter 7, in Chapter 8 we analyze the data complexity of reasoning in SHIQ(D)—that is, the complexity measured only in the size of the ABox, while assuming that the TBox is fixed in size. In applications where the size of the ABox is much larger than the size of the TBox, data complexity provides a better estimate of the practical applicability of an algorithm. Surprisingly, the data complexity of satisfiability checking in SHIQ(D) turns out to be NP-complete, which is better than the ExpTime combined complexity (assuming NP ⊂ ExpTime). Moreover, we identify the Horn-SHIQ(D) fragment of SHIQ(D), which does not provide for modeling disjunctive knowledge, but exhibits polynomial data complexity. This provides theoretical justification for hoping that efficient reasoning with large ABoxes is possible in practice.

• In Chapter 9 we consider a hybrid knowledge representation system consisting of SHIQ(D) extended with rules. The integration of rules and description logics is achieved by allowing concepts and roles to occur as unary and binary predicates, respectively, in the atoms of the rule head or body. To achieve decidability, the rules are required to be DL-safe: each variable in the rule must occur in a body atom whose predicate is neither a concept nor a role. Intuitively, this makes query answering decidable, since it ensures that rules are applicable only to individuals explicitly occurring in the knowledge base. We show that query answering in such a logic can be performed simply by appending DL-safe rules to the disjunctive datalog program obtained by the reduction.

• In Chapter 10 we extend our algorithms to handle answering and checking subsumption of conjunctive queries [32] over SHIQ(D) knowledge bases. It is widely believed that conjunctive queries provide a formal foundation for the vast majority of commonly used database queries, so they lend themselves naturally as an expressive query language for description logics.

• In Chapter 11 we consider the problems of extending description logics with metamodeling—a style of modeling that allows concepts to be treated as individuals and vice versa. We show that extending the basic description logic ALC with metamodeling in the same way as was done in the Semantic Web language OWL-Full [106] leads to undecidability of basic reasoning problems. Therefore, we propose an alternative approach based on HiLog [33]—a logic that aims to simulate second-order reasoning in a first-order framework. We show that, under some minor restrictions, our algorithms can easily be extended to provide a decision procedure for SHIQ(D) extended with metamodeling.

• To estimate the applicability of our algorithms in practice, we implemented a new DL reasoner KAON2. In Chapter 12 we describe the system architecture, as well as several optimizations required to obtain a system offering competitive performance of reasoning.

• In Chapter 13 we present an evaluation of the performance of KAON2. For answering queries over large ABoxes, our system exhibits performance improvements over Pellet and RACER of one or more orders of magnitude. For TBox reasoning, our system does not match the performance of tableau-based systems; however, it is still capable of solving certain nontrivial problems.

Many of our results were published previously: the resolution decision procedure and the reduction to disjunctive datalog for SHIQ− were published in [151]; the decomposition rule and the algorithm for answering conjunctive queries over SHIQ knowledge bases were published in [152]; the algorithms for reasoning with a concrete domain were published in [150]; reasoning with DL-safe rules was published in [155] and [156]; the results on data complexity were published in [153]; and the results related to metamodeling were published in [154].

Chapter 2

Preliminary Definitions

In this chapter we introduce all necessary terminology and recapitulate relevant definitions and results. This chapter is not intended to be of a tutorial nature; please consult the references for a more detailed presentation.

2.1 Multi-Sorted First-Order Logic

We recapitulate standard definitions of first-order logic ([46] is a good textbook) extended with multi-sorted signatures.

A multi-sorted first-order signature Σ is a 4-tuple (P, F, V, S), where P is a finite set of predicate symbols, F a finite set of general function symbols, V a countable set of variables, and S a finite set of sorts. Each predicate and general function symbol is associated with a nonnegative arity n. General function symbols of zero arity are called constants; all other general function symbols are called simply function symbols.1 Each n-ary general function symbol f ∈ F is associated with a sort signature r1 × ... × rn → r, and each n-ary predicate symbol P ∈ P is associated with a sort signature r1 × ... × rn, for ri ∈ S. The sort of each variable is determined by the function sort : V → S.

1Many authors do not distinguish constants from function symbols, because this makes the presentation of first-order logic simpler. However, for our results presented in subsequent chapters, this distinction is essential.

The set of terms T (Σ) and the extension of the function sort to terms are defined as follows: T (Σ) is the smallest set such that (i) V ⊆ T (Σ), and (ii) if f ∈ F has the signature r1 × ... × rn → r and ti ∈ T (Σ) with sort(ti) = ri for 1 ≤ i ≤ n, then t = f(t1, . . . , tn) ∈ T (Σ) with sort(t) = r. The set of atoms A(Σ) is the smallest set such that, if P ∈ P has the signature r1 × ... × rn and ti ∈ T (Σ) with sort(ti) = ri for 1 ≤ i ≤ n, then P (t1, . . . , tn) ∈ A(Σ). Terms (atoms) not containing variables are called ground terms (atoms).

A position p is a finite sequence of integers and is usually written as i1.i2 . . . in. The empty position is denoted with ε. If a position p1 is a proper prefix of a position p2, then p1 is above p2, and p2 is below p1. A subterm of t at position p, written t|p, is defined inductively as t|ε = t and, if t = f(t1, . . . , tn), then t|i.p = ti|p. A replacement of a subterm of t at position p with the term s, written t[s]p, is defined inductively as t[s]ε = s and, if t = f(t1, . . . , tn), then t[s]i.p = f(t1, . . . , ti[s]p, . . . , tn).

The set of formulae L(Σ) defined over the signature Σ is the smallest set such that ⊤ and ⊥ are in L(Σ), A(Σ) ⊆ L(Σ), and, if ϕ, ϕ1, ϕ2 ∈ L(Σ) and x ∈ V, then ¬ϕ, ϕ1 ∧ ϕ2, ϕ1 ∨ ϕ2, ∃x : ϕ, and ∀x : ϕ are in L(Σ). As usual, ϕ1 → ϕ2 is an abbreviation for ¬ϕ1 ∨ ϕ2, ϕ1 ← ϕ2 is an abbreviation for ϕ1 ∨ ¬ϕ2, and ϕ1 ↔ ϕ2 is an abbreviation for (ϕ1 → ϕ2) ∧ (ϕ1 ← ϕ2). A variable x in a formula ϕ is free if it does not occur under the scope of a quantifier. If ϕ does not have free variables, it is closed. The notion of a subformula of ϕ at position p, written ϕ|p, is defined inductively as ϕ|ε = ϕ; (ϕ1 ◦ ϕ2)|i.p = ϕi|p for ◦ ∈ {∧, ∨, ←, →, ↔} and i ∈ {1, 2}; and ϕ|1.p = ψ|p for ϕ = ¬ψ, ϕ = ∀x : ψ, or ϕ = ∃x : ψ. A replacement of the subformula ϕ|p in a formula ϕ with a formula ψ is denoted with ϕ[ψ]p, and is defined in the obvious way. The polarity of the subformula ϕ|p at position p in a formula ϕ, written pol(ϕ, p), is defined as follows: pol(ϕ, ε) = 1; pol(¬ϕ, 1.p) = −pol(ϕ, p); pol(ϕ1 ◦ ϕ2, i.p) = pol(ϕi, p) for ◦ ∈ {∧, ∨} and i ∈ {1, 2}; pol(ϕ, 1.p) = pol(ψ, p) for ϕ = ∃x : ψ or ϕ = ∀x : ψ; pol(ϕ, 1.p) = −pol(ϕ1, p) and pol(ϕ, 2.p) = pol(ϕ2, p) for ϕ = ϕ1 → ϕ2; and, finally, pol(ϕ1 ↔ ϕ2, i.p) = 0 for i ∈ {1, 2}.

A substitution σ is a function from V into T (Σ) such that σ(x) ≠ x only for a finite number of variables x and, if σ(x) = t, then sort(x) = sort(t). We often write a substitution σ as a finite set of mappings {x1 ↦ t1, . . . , xn ↦ tn}. The empty substitution (also known as the identity substitution), denoted with {}, is the substitution σ such that xσ = x for each variable x. The result of applying a substitution σ to a term t, written tσ, is defined recursively as follows: xσ = σ(x) and f(t1, . . . , tn)σ = f(t1σ, . . . , tnσ). For a substitution σ and a variable x, the substitution σx is defined as follows:

yσx = yσ, if y ≠ x;
yσx = y, if y = x.

An application of a substitution σ to a formula ϕ, written ϕσ, is defined as follows: P (t1, . . . , tn)σ = P (t1σ, . . . , tnσ); (ϕ1 ◦ ϕ2)σ = ϕ1σ ◦ ϕ2σ for ◦ ∈ {∧, ∨, ←, →, ↔}; (¬ϕ)σ = ¬(ϕσ); (∀x : ϕ)σ = ∀x : (ϕσx); and (∃x : ϕ)σ = ∃x : (ϕσx). A composition of substitutions τ and σ, written στ, is defined as xστ = (xσ)τ. A substitution σ is called a variable renaming if it contains only mappings of the form x ↦ y. A substitution σ is equivalent to θ up to variable renaming if there is a variable renaming η such that θ = ση; in such a case, θ is also equivalent to σ up to variable renaming [7]. A substitution σ is more general than a substitution θ if there is a substitution η such that θ = ση.

A substitution σ is a unifier of terms s and t if sσ = tσ. A unifier σ of s and t is called a most general unifier if it is more general than any other unifier of s and t. The notion of unifiers extends to atoms in the obvious way. If a most general unifier σ of s and t exists, it is unique up to variable renaming [7], so we write σ = MGU(s, t).
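As an illustration of how a most general unifier can be computed, the following Python sketch implements naive syntactic unification. It is purely illustrative and is not part of the formal development; the term representation (tuples for compound terms, strings for variables) and all names in it are assumptions of this sketch.

```python
# Minimal syntactic unification sketch (illustrative only).
# A variable is a string, e.g. "x"; a compound term is a tuple
# (functor, arg1, ..., argn); constants are 0-ary tuples such as ("c",).

def is_var(t):
    return isinstance(t, str)

def substitute(t, sigma):
    """Apply the substitution sigma (a dict) to the term t."""
    if is_var(t):
        return substitute(sigma[t], sigma) if t in sigma else t
    return (t[0],) + tuple(substitute(a, sigma) for a in t[1:])

def occurs(x, t, sigma):
    """Occurs check: does variable x occur in t under sigma?"""
    t = substitute(t, sigma)
    if is_var(t):
        return x == t
    return any(occurs(x, a, sigma) for a in t[1:])

def unify(s, t, sigma=None):
    """Return a most general unifier of s and t as a dict, or None if none exists."""
    sigma = dict(sigma or {})
    s, t = substitute(s, sigma), substitute(t, sigma)
    if is_var(s):
        if s == t:
            return sigma
        if occurs(s, t, sigma):
            return None                      # occurs check fails
        sigma[s] = t
        return sigma
    if is_var(t):
        return unify(t, s, sigma)
    if s[0] != t[0] or len(s) != len(t):
        return None                          # clash of functors or arities
    for a, b in zip(s[1:], t[1:]):
        sigma = unify(a, b, sigma)
        if sigma is None:
            return None
    return sigma

# Example: the MGU of f(x, g(y)) and f(c, g(h(z))) is {x -> c, y -> h(z)}.
print(unify(("f", "x", ("g", "y")), ("f", ("c",), ("g", ("h", "z")))))
```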

The semantics of multi-sorted first-order logic is defined as follows. An interpretation is a pair I = (D, ·I) where (i) D is a function assigning to each sort s ∈ S an interpretation domain Ds such that, if ri, rj ∈ S and ri ≠ rj, then Dri ∩ Drj = ∅; and (ii) ·I is a function assigning to each predicate symbol A with a signature r1 × ... × rn an interpretation relation AI ⊆ Dr1 × ... × Drn, and to each general function symbol f with a signature r1 × ... × rn → r an interpretation function f I : Dr1 × ... × Drn → Dr.

A variable assignment is a function B assigning to each variable x ∈ V a value from Dsort(x). An x-variant of B, denoted with Bx, is a variable assignment assigning the same values as B to all variables, except possibly to the variable x. The value of a term t ∈ T (Σ) under I and B, written tI,B, is defined as follows: if t = x, then tI,B = B(x), and, if t = f(t1, . . . , tn), then tI,B = f I(t1I,B, . . . , tnI,B). The truth value of a formula ϕ under I and B, written ϕI,B, is defined as follows: [⊤]I,B = true; [⊥]I,B = false; [P (t1, . . . , tn)]I,B = true if and only if (t1I,B, . . . , tnI,B) ∈ P I; [¬ϕ]I,B = true if and only if ϕI,B = false; [ϕ1 ∧ ϕ2]I,B = true if and only if ϕ1I,B = true and ϕ2I,B = true; [ϕ1 ∨ ϕ2]I,B = true if and only if ϕ1I,B = true or ϕ2I,B = true; [∃x : ϕ]I,B = true if and only if ϕI,Bx = true for some Bx; and [∀x : ϕ]I,B = true if and only if ϕI,Bx = true for all Bx. If ϕ is closed, then ϕI,B does not depend on B, so we simply write ϕI.

For a closed formula ϕ, an interpretation I is a model of ϕ, written I |= ϕ, if ϕI = true. A closed formula ϕ is valid, written |= ϕ, if I |= ϕ for all interpretations I; such formulae are also called tautologies. Furthermore, ϕ is satisfiable if I |= ϕ for at least one interpretation I, and ϕ is unsatisfiable if no interpretation I exists such that I |= ϕ. A closed formula ϕ1 entails a formula ϕ2, written ϕ1 |= ϕ2, if I |= ϕ2 for each interpretation I such that I |= ϕ1. It is well known that ϕ1 |= ϕ2 if and only if ϕ1 ∧ ¬ϕ2 is unsatisfiable. Formulae ϕ1 and ϕ2 are equisatisfiable if ϕ1 is satisfiable if and only if ϕ2 is satisfiable; ϕ1 and ϕ2 are equivalent if the formula ϕ1 ↔ ϕ2 is valid.

We often assume that a first-order signature Σ contains equality; that is, for each sort r ∈ S, there is a predicate ≈r with a sort signature r × r. If the sort is clear from the context, we do not state it explicitly, and simply write ≈. An atom ≈(s, t) is usually written as s ≈ t, and a negated atom ¬≈(s, t) is usually written as s ≉ t. Models, (un)satisfiability, and entailment w.r.t. an equational theory are defined as usual, by considering only such models I where all ≈rI are equality relations. The latter is the case if (α, β) ∈ ≈rI if and only if α = β, for each α, β ∈ Dr.

Let ϕ be a closed first-order formula and Λ a set of positions in ϕ. Then DefΛ(ϕ) is the definitional normal form of ϕ with respect to Λ and is defined inductively as follows, where p is maximal in Λ ∪ {p} with respect to the prefix ordering on positions, Q is a new predicate not occurring in ϕ, the variables x1, . . . , xn are the free variables of ϕ|p, and ◦ is → if pol(ϕ, p) = 1, ← if pol(ϕ, p) = −1, and ↔ if pol(ϕ, p) = 0:

Def∅(ϕ) = ϕ
DefΛ∪{p}(ϕ) = DefΛ(ϕ[Q(x1, . . . , xn)]p) ∧ ∀x1, . . . , xn : Q(x1, . . . , xn) ◦ ϕ|p

It is well known [108, 8, 99] that, for any Λ, the formulae ϕ and DefΛ(ϕ) are equisatisfiable, and that DefΛ(ϕ) can be computed in polynomial time.
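For example (the predicates A, R, and Q are chosen only for this illustration), let ϕ = ∀x : (A(x) → ∃y : R(x, y)) and Λ = {p}, where p is the position of the subformula ∃y : R(x, y). Since pol(ϕ, p) = 1 and x is the only free variable of ϕ|p, we obtain

DefΛ(ϕ) = ∀x : (A(x) → Q(x)) ∧ ∀x : (Q(x) → ∃y : R(x, y)),

which is equisatisfiable with ϕ.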

Let ϕ be a formula and p a position in ϕ such that either pol(ϕ, p) = 1 and ϕ|p = ∃x : ψ, or pol(ϕ, p) = −1 and ϕ|p = ∀x : ψ, where x, x1, . . . , xn are exactly the free variables of ψ. Then ϕ[ψ{x ↦ f(x1, . . . , xn)}]p, where f is a new general Skolem function symbol not occurring in ϕ, is a formula obtained by skolemization of ϕ at position p. With sk(ϕ) we denote the formula obtained from ϕ by iterative skolemization at all positions where this is possible. Usually, we assume that sk(ϕ) is computed by outer skolemization, by skolemizing a position p before any position below it. The result of skolemization is unique up to renaming of Skolem function symbols. Formulae ϕ and sk(ϕ) are equisatisfiable [46].

The subset of ground terms of T (Σ) is called the Herbrand universe HU of Σ. Let HU r be the subset of HU containing exactly those ground terms t such that sort(t) = r. A Herbrand interpretation I is an interpretation such that (i) Dr = HU r; (ii) general function symbols are interpreted by themselves—that is, for each f ∈ F and ti ∈ HU, we have f I(t1, . . . , tn) = f(t1, . . . , tn); and (iii) ≈rI are reflexive, symmetric, transitive, and satisfy the usual equality replacement axioms [46]. The Herbrand base HB of Σ is the set of all ground atoms built over the Herbrand universe of Σ. A Herbrand interpretation can equivalently be considered a subset of the Herbrand base. A formula ϕ is satisfiable if and only if sk(ϕ) is satisfiable in a Herbrand interpretation [46].

A multiset M over a set N is a function M : N → N0, where N0 is the set of all nonnegative integers. A multiset M is finite if M(x) ≠ 0 for a finite number of x; in the remaining sections, we consider only finite multisets. M is empty, written M = ∅, if M(x) = 0 for all x ∈ N. The cardinality of M is defined as |M| = Σx∈N M(x). For two multisets M1 and M2, M1 ⊆ M2 if M1(x) ≤ M2(x) for each x ∈ N, and M1 = M2 if M1 ⊆ M2 and M2 ⊆ M1. The union of multisets M1 and M2 is defined as (M1 ∪ M2)(x) = M1(x) + M2(x), the intersection as (M1 ∩ M2)(x) = min(M1(x), M2(x)), and the difference as (M1 \ M2)(x) = max(0, M1(x) − M2(x)).

A literal is an atom A or a negated atom ¬A. For a literal L, we define L̄ = A if L = ¬A, and L̄ = ¬A if L = A; that is, L̄ is the complement of L. A clause is a multiset of literals and is usually written as C = L1 ∨ ... ∨ Ln. For n = 1, C is called a unit clause; for n = 0, C is the empty clause, which is written as □. A clause C is semantically equivalent to ∀x : C, where x is the set of the free variables of C. Satisfiability of clauses is usually considered in a Herbrand interpretation I as follows: for a ground clause CG, I |= CG if a literal Ai ∈ CG exists such that Ai ∈ I, or else a literal ¬Aj ∈ CG exists such that Aj ∉ I; for a nonground clause C, I |= C if and only if I |= CG for each ground instance CG of C. For a first-order formula ϕ, Cls(ϕ) is the set of clauses obtained by clausifying ϕ—that is, by transforming sk(ϕ) into conjunctive normal form by exhaustive application of well-known logical equivalences. The formula ϕ is satisfiable if and only if Cls(ϕ) is satisfiable in a Herbrand model [46]. A variable x in a clause C is safe if it occurs in a negative literal of C; moreover, C is safe if all its variables are safe.

Unless otherwise noted, we denote atoms by letters A and B, clauses by C and D, literals by L, predicates by P, R, S, T, and U, constants by a, b, c, and d, variables by x, y, and z, and terms by s, t, u, v, and w.

2.2 Relations and Orderings

For a set of objects D, a binary relation R on D is a subset of D × D. The inverse of a relation R is defined as R− = {(y, x) | (x, y) ∈ R}. A relation R is (i) reflexive if x ∈ D implies (x, x) ∈ R; (ii) irreflexive if x ∈ D implies (x, x) ∉ R; (iii) symmetric if R− ⊆ R; (iv) asymmetric if (x, y) ∈ R implies (y, x) ∉ R; (v) antisymmetric if (x, y) ∈ R and (y, x) ∈ R imply x = y; (vi) transitive if (x, y) ∈ R and (y, z) ∈ R imply (x, z) ∈ R; and (vii) total if, for each x, y ∈ D, at least one of (x, y) ∈ R, (y, x) ∈ R, or x = y holds. A ◦-closure (where ◦ is a combination of the relation properties), written R◦, is the smallest relation on D such that R ⊆ R◦ and ◦ is satisfied for R◦. The transitive closure of R is usually written as R+, and the reflexive–transitive closure of R is usually written as R∗.

A relation R is well-founded if there is no infinite sequence (α0, α1), (α1, α2), . . . of pairs in R. An object α ∈ D is in normal form w.r.t. R if R does not contain a pair (α, β) for any β; we also say that α is irreducible w.r.t. R. Reducible is the opposite of irreducible. An object β is a normal form of α w.r.t. R if β is in normal form w.r.t. R and (α, β) ∈ R∗. For a general relation R, an object can have none, one, or more normal forms.

A partial ordering ⪰ over D is a relation on D that is reflexive, antisymmetric, and transitive. A strict ordering ≻ over D is a relation on D that is irreflexive and transitive. A strict ordering ≻ on D can be extended to a strict ordering ≻mul on finite multisets over D, called the multiset extension of ≻, as follows: M ≻mul N if (i) M ≠ N, and (ii) if N(x) > M(x) for some x, then there is some y ≻ x such that M(y) > N(y). If ≻ is total, then ≻mul is total as well.

For ≻i orderings on sets Di, 1 ≤ i ≤ n, a lexicographic combination of ≻i, denoted with ≻lex, is an ordering on D = D1 × ... × Dn that is defined as follows: (a1, . . . , an) ≻lex (b1, . . . , bn) if and only if an index i exists, 1 ≤ i ≤ n, such that aj = bj for j < i and ai ≻i bi.

A term ordering is an ordering where D is the set of terms T (Σ) for some multi-sorted first-order signature Σ. A term ordering ≻ is stable under substitutions if s ≻ t implies sσ ≻ tσ for all terms s and t, and all substitutions σ; it is stable under contexts if s ≻ t implies u[s]p ≻ u[t]p for all terms s, t, and u, and all positions p; it satisfies the subterm property if u[s]p ≻ s for all terms u and s, and all positions p ≠ ε. A rewrite ordering is an ordering stable under contexts and stable under substitutions; a reduction ordering is a well-founded rewrite ordering; and a simplification ordering is a reduction ordering with a subterm property.

The lexicographic path ordering (LPO) [38, 6] is a term ordering induced by a well-founded strict ordering > over general function symbols (the latter is also called a precedence). Each LPO has the subterm property; furthermore, if > is total, the LPO induced by > is total on ground terms. It is defined as follows:

s ≻lpo t if

1. t is a variable occurring as a proper subterm of s, or

2. s = f(s1, . . . , sm), t = g(t1, . . . , tn), and at least one of the following holds:

(a) f > g and, for all i with 1 ≤ i ≤ n, we have s ≻lpo ti, or

(b) f = g and, for some j, we have (s1, . . . , sj−1) = (t1, . . . , tj−1), sj ≻lpo tj, and s ≻lpo tk for all k with j < k ≤ n, or

(c) sj ⪰lpo t for some j with 1 ≤ j ≤ m.
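The definition above translates almost directly into a recursive comparison procedure. The following Python sketch is purely illustrative (it is not code from KAON2, whose implementation is discussed in Chapter 12); terms are tuples, variables are strings, and the map prec playing the role of the precedence > is an assumption of this sketch.

```python
# Illustrative sketch of the lexicographic path ordering (LPO).
# A variable is a string; a compound term is a tuple (functor, arg1, ..., argn);
# prec maps function symbols to integers and acts as the precedence >.

def is_var(t):
    return isinstance(t, str)

def proper_subterms(s):
    """Yield all proper subterms of a compound term s."""
    if is_var(s):
        return
    for arg in s[1:]:
        yield arg
        yield from proper_subterms(arg)

def lpo_gt(s, t, prec):
    """Return True if s is greater than t in the LPO induced by prec."""
    if is_var(s):
        return False                                     # a variable is never greater
    if is_var(t):
        return t in proper_subterms(s)                   # case 1
    f, g = s[0], t[0]
    # case 2(c): some argument of s is greater than or equal to t
    if any(si == t or lpo_gt(si, t, prec) for si in s[1:]):
        return True
    # case 2(a): f > g and s dominates every argument of t
    if prec[f] > prec[g]:
        return all(lpo_gt(s, ti, prec) for ti in t[1:])
    # case 2(b): equal head symbols, compare arguments lexicographically
    if f == g and len(s) == len(t):
        for idx, (si, ti) in enumerate(zip(s[1:], t[1:])):
            if si == ti:
                continue
            rest = t[idx + 2:]                           # arguments after the first difference
            return lpo_gt(si, ti, prec) and all(lpo_gt(s, tk, prec) for tk in rest)
    return False

# Example: with precedence f > g > a, we have f(x) >_lpo g(x) and g(a) >_lpo a.
prec = {"f": 3, "g": 2, "a": 1}
print(lpo_gt(("f", "x"), ("g", "x"), prec))   # True
print(lpo_gt(("g", ("a",)), ("a",), prec))    # True
```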

2.3 Rewrite Systems

An excellent textbook introduction to rewrite systems can be found in [6], and an overview of the major results can be found in [38]. A rewrite system R is a set of rewrite rules s ⇒ t, where s and t are terms. A rewrite relation induced by R, denoted with ⇒R, is the smallest relation such that s ⇒ t ∈ R implies u[sσ]p ⇒R u[tσ]p for all terms s, t, and u, all substitutions σ, and all positions p. For two terms s and t, we write s ⇓R t if there is a term u such that s ⇒∗R u and t ⇒∗R u, where ⇒∗R is the reflexive–transitive closure of ⇒R. A rewrite system R is confluent if ⇓R and ⇔∗R coincide, where ⇔∗R is the symmetric–reflexive–transitive closure of ⇒R. For a confluent, well-founded rewrite system R, each element α has a unique normal form w.r.t. ⇒R, which we denote with nfR(α).

For a confluent, well-founded rewrite system R consisting of ground rewrite rules only, R∗ is the smallest set of ground equalities s ≈ t such that, for all ground terms s and t, if nfR(s) = nfR(t), then s ≈ t ∈ R∗.
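For example (an illustrative system, not one used in later chapters), let the signature contain the constants a, b, and c and a unary function symbol f, and let R = {b ⇒ a, c ⇒ a}. Then R is well-founded and confluent, and nfR(a) = nfR(b) = nfR(c) = a, so R∗ contains s ≈ t for all s, t ∈ {a, b, c}; in particular, b ≈ c ∈ R∗ even though neither b ⇒ c nor c ⇒ b is a rule of R. Furthermore, f(b) ⇒R f(a), since rewrite rules may be applied at arbitrary positions.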

2.4 Ordered Resolution

Ordered resolution [13] is one of the most widely used calculi for theorem proving in first-order logic. The rules of the calculus are parameterized with an admissible ordering on literals and a selection function.

An ordering ≻ on literals is admissible if (i) it is well-founded, stable under substitutions, and total on ground literals; (ii) ¬A ≻ A for all ground atoms A; and (iii) B ≻ A implies B ≻ ¬A for all atoms A and B. A literal L is (strictly) maximal with respect to a clause C if there is no literal L′ ∈ C such that L′ ≻ L (L′ ⪰ L). A literal L ∈ C is (strictly) maximal in C if and only if L is (strictly) maximal with respect to C \ L.

By taking its multiset extension, each ordering on literals can be extended to an ordering on clauses, which we ambiguously denote with ≻ as well. Because the literal ordering is total and well-founded on ground literals, the clause ordering is total and well-founded on ground clauses.

A selection function S assigns to each clause C a possibly empty subset of negative literals of C; the literals in S(C) are said to be selected. No other restrictions are imposed on the selection function.

With R we denote the ordered resolution calculus, consisting of the following inference rules, where the clauses C ∨ A ∨ B and D ∨ ¬B are called the main premises, C ∨ A is called the side premise, and Cσ ∨ Aσ and Cσ ∨ Dσ are called conclusions (as usual in resolution theorem proving, we make a technical assumption that the premises do not have variables in common):

Positive factoring:
    C ∨ A ∨ B
    -------------
    Cσ ∨ Aσ
where (i) σ = MGU(A, B), (ii) Aσ is strictly maximal with respect to Cσ ∨ Bσ, and no literal is selected in Cσ ∨ Aσ ∨ Bσ.

Ordered resolution:
    C ∨ A    D ∨ ¬B
    -------------------
    Cσ ∨ Dσ
where (i) σ = MGU(A, B), (ii) Aσ is strictly maximal with respect to Cσ, and no literal is selected in Cσ ∨ Aσ, (iii) ¬Bσ is either selected in Dσ ∨ ¬Bσ, or it is maximal with respect to Dσ and no literal is selected in Dσ ∨ ¬Bσ.

It is important to distinguish an inference rule from an inference. An inference rule can be understood as a template that specifies actions to be applied to any premises. An inference is an application of an inference rule to concrete premises. An inference ξ′ is an instance of an inference ξ if ξ′ is obtained by applying a substitution σ to all premises and the conclusion of ξ; the inference ξ′ is also written as ξσ. An inference is ground if all its clauses are ground.

Ordered resolution is compatible with powerful redundancy elimination techniques, which allow deleting certain clauses during the theorem proving process without loss of completeness. A ground clause C is redundant in a set of ground clauses N if there are clauses Di ∈ N, 1 ≤ i ≤ n, such that C ≻ Di for all i, and D1, . . . , Dn |= C. A ground inference ξ of R with premises C1 and C2, and a conclusion C, is redundant in a set of ground clauses N if there are clauses Di ∈ N, 1 ≤ i ≤ n, such that max(C1, C2) ≻ Di and C1, C2, D1, . . . , Dn |= C, where max(C1, C2) is the larger clause of C1 and C2 w.r.t. ≻. A nonground clause C (inference ξ) is redundant in a nonground set of clauses N if each ground instance of C (ξ) is redundant in the set of ground instances of N. A set of clauses N is saturated by R up to redundancy if each inference by R from premises in N is redundant in N.

Ordered resolution is sound and complete: if a set of clauses N is saturated up to redundancy by R, then N is satisfiable if and only if it does not contain the empty clause.

If a clause C is a tautology, then C is redundant in any set of clauses N. A sound and complete tautology check would itself require theorem proving, and would therefore be difficult to realize. Therefore, in practice one usually only checks for syntactic tautologies, which are clauses containing a pair of literals A and ¬A.
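To illustrate the ordered resolution rule, consider the clauses Q(x) ∨ P (x) and ¬P (f(y)) ∨ R(y) (the predicates and terms are chosen only for this illustration). Taking C = Q(x), A = P (x), D = R(y), and ¬B = ¬P (f(y)), we obtain σ = MGU(P (x), P (f(y))) = {x ↦ f(y)}, and the conclusion is Q(f(y)) ∨ R(y), provided that the ordering and the selection function are such that P (f(y)) is strictly maximal with respect to Q(f(y)) with nothing selected in the side premise, and ¬P (f(y)) is either selected or maximal with respect to R(y).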

A clause C subsumes a clause D if there is a substitution σ such that Cσ ⊆ D and |C| < |D|. If a clause C is subsumed by a clause from a set of clauses N, then C is redundant in N. A theorem proving derivation by R from a set of clauses N is a sequence of sets of clauses N = N0,N1,... such that, for each i > 0, either (i) Ni+1 = Ni ∪ {C} where C is the conclusion of an inference by R from premises in Ni, or (i) Ni+1 = Ni \{C} where C is redundant in Ni.A refutation for N is a derivation from N such that some S T Nj contains the empty clause. A derivation is fair with limit N∞ = j k≥j Nk if each clause C that can be deduced from nonredundant premises in N∞ is contained in some set Nj. In [13] it was shown that under the standard notion of redundancy, each inference from premises in N∞ is redundant in N∞. Hence, unsatisfiability of a set of clauses N can be demonstrated by a fair derivation from N. If N is unsatisfiable, then we shall eventually derive the empty clause; if N is satisfiable, then the limit of the derivation N∞ does not contain the empty clause.

2.5 Basic Superposition

In order to deal with first-order theories containing equality, ordered resolution was extended in [120] to paramodulation—a calculus with explicit rules for equality reasoning. A refinement of paramodulation, known as superposition, was presented in [9], where ordering restrictions are used to avoid certain unnecessary inferences. Further optimizations of paramodulation and superposition were presented in [14]. These optimizations are very general, but a simplified version of the calculus, called basic superposition, was presented in [10, 12]. A closely related calculus, based on an inference model with constrained clauses, was presented in [96].

The idea of basic superposition is to render superposition inferences into terms introduced by previous unification steps redundant. In practice, this technique has been shown essential for solving some particularly difficult problems in first-order logic with equality [92]. Furthermore, basic superposition shows that superposition into arguments of Skolem function symbols is not necessary for completeness. Namely, any Skolem function symbol f occurs in the initial clause set with variable arguments, so, in any term f(t), if t is not a variable, it was introduced by a previous unification step.

It is common practice in equational theorem proving to consider logical theories containing only the equality predicate. This simplifies the theoretical treatment without loss of generality. Literals P (t1, . . . , tn), where P is not the equality predicate, are encoded as P (t1, . . . , tn) ≈ T, where T is a new propositional symbol. Thus, predicate symbols actually become general function symbols. It is well known that this transformation preserves satisfiability. To avoid considering terms where predicate symbols occur as proper subterms, one usually employs a multi-sorted framework, where all predicate symbols and the symbol T are of one sort, which is different from the sort of general function symbols and variables. We consider P (t1, . . . , tn) to be a syntactic shortcut for P (t1, . . . , tn) ≈ T. To avoid ambiguity, we use the following terminology: first-order terms (general function symbols) obtained by the encoding are called E-terms (E-general function symbols), predicate symbols are ≈ and the E-general function symbols corresponding to predicate symbols before encoding, whereas constants (function symbols) are E-general function symbols corresponding to constants (function symbols) before encoding. For example, the literal P (c, f(x)) is a shortcut for P (c, f(x)) ≈ T; furthermore, P (c, f(x)) is an E-term containing E-general function symbols P , c, and f; however, P is a predicate symbol, f is a function symbol, c is a constant, and only c, f(x), and x are terms. Furthermore, it is common to assume that the predicate ≈ has built-in symmetry: a literal s ≈ t should also be interpreted as t ≈ s (the same holds for negative equality literals as well).

The inference rules of basic superposition are formulated by breaking a clause into two parts: (i) the skeleton clause C and (ii) the substitution σ representing the cumulative effects of previous unifications. These two components together are called a closure, which is written as C · σ and is logically equivalent to the clause Cσ. A closure C · σ can, for convenience, equivalently be represented as Cσ, where the terms occurring at variable positions of C are marked2 by [ ]. Any position at or below a marked position is called a substitution position. Note that all variables of Cσ occur at substitution positions, so we do not mark them for readability purposes. The following closure is logically equivalent to the clause P (f(y)) ∨ g(b) ≈ b. On the left-hand side, the closure is represented by a skeleton and a substitution explicitly, whereas, on the right-hand side, it is represented by marking the positions of variables in the skeleton.

(2.1) (P (x) ∨ z ≈ b) · {x ↦ f(y), z ↦ g(b)} ≡ P ([f(y)]) ∨ [g(b)] ≈ b

A closure C · σ is ground if Cσ is ground. To technically simplify the presentation, we consider each closure to be in the standard form, which is the case if (i) the substitution σ does not contain trivial mappings of the form x ↦ y, and (ii) all variables from dom(σ) occur in C. A closure C · σ can be brought into the standard form in the following way: if x ↦ t is a mapping in σ that violates the conditions of the standard form, then let σ′ be σ \ {x ↦ t}, and replace C · σ with C{x ↦ t} · σ′{x ↦ t}.

A closure (Cσ1) · σ2 is a retraction of a closure C · σ if σ = σ1σ2. Intuitively, a retraction is obtained by moving some marked positions lower in the closure. For example, the following is a retraction of the closure (2.1):

(2.2) (P (x) ∨ g(z) ≈ b) · {x ↦ f(y), z ↦ b} ≡ P ([f(y)]) ∨ g([b]) ≈ b

Parameters for Basic Superposition. Basic superposition is parameterized with a selection function S, which is defined exactly as for ordered resolution. However, whereas ordered resolution is parameterized with an ordering on literals, basic superposition is parameterized with an ordering ≻ on E-terms. Such an ordering is admissible for basic superposition if it is a reduction ordering total on ground terms and T is the smallest element.

An ordering ≻ can be extended to an ordering on literals (ambiguously denoted with ≻ as well) by identifying each positive literal s ≈ t with a multiset {{s}, {t}} and each negative literal s ≉ t with a multiset {{s, t}}, and by comparing these multisets using a two-fold multiset extension (≻mul)mul of ≻. The literal ordering obtained in such a way is total on ground literals. The literal L · σ is (strictly) maximal with respect to a closure C · σ if there is no literal L′ ∈ C such that L′σ ≻ Lσ (L′σ ⪰ Lσ) (observe that this definition does not assume that L ∈ C). Similarly, for a closure C · σ and a literal L ∈ C, the literal L · σ is (strictly) maximal in C · σ if and only if it is (strictly) maximal with respect to (C \ L) · σ.

2In [14], terms at marked positions are enclosed in a frame. We decided to use a different notation, because framing introduced problems with text layout. Our notation should not be confused with the notation for modalities in multi-modal logics.

Inference Rules. In the rules of basic superposition, we make the technical assumption that all premises are variable disjoint, and that they are expressed using the same substitution. A literal L · θ is (strictly) eligible for superposition in a closure (C ∨ L) · θ if there are no selected literals in (C ∨ L) · θ and L · θ is (strictly) maximal with respect to C · θ. A literal L · θ is eligible for resolution in a closure (C ∨ L) · θ if it is selected in (C ∨ L) · θ, or there are no selected literals in (C ∨ L) · θ and L · θ is maximal with respect to C · θ. The basic superposition calculus, BS for short, consists of the following rules:

Positive superposition:
    (C ∨ s ≈ t) · ρ    (D ∨ w ≈ v) · ρ
    ----------------------------------
    (C ∨ D ∨ w[t]p ≈ v) · θ
where (i) σ = MGU(sρ, wρ|p) and θ = ρσ, (ii) tθ ⋡ sθ and vθ ⋡ wθ, (iii) (s ≈ t) · θ is strictly eligible for superposition in (C ∨ s ≈ t) · θ, (iv) (w ≈ v) · θ is strictly eligible for superposition in (D ∨ w ≈ v) · θ, (v) sθ ≈ tθ ⋡ wθ ≈ vθ, and (vi) w|p is not a variable.

Negative superposition:
    (C ∨ s ≈ t) · ρ    (D ∨ w ≉ v) · ρ
    ----------------------------------
    (C ∨ D ∨ w[t]p ≉ v) · θ
where (i) σ = MGU(sρ, wρ|p) and θ = ρσ, (ii) tθ ⋡ sθ and vθ ⋡ wθ, (iii) (s ≈ t) · θ is strictly eligible for superposition in (C ∨ s ≈ t) · θ, (iv) (w ≉ v) · θ is eligible for resolution in (D ∨ w ≉ v) · θ, and (v) w|p is not a variable.

Reflexivity resolution:
    (C ∨ s ≉ t) · ρ
    ---------------
    C · θ
where (i) σ = MGU(sρ, tρ) and θ = ρσ, and (ii) (s ≉ t) · θ is eligible for resolution in (C ∨ s ≉ t) · θ.

Equality factoring:
    (C ∨ s ≈ t ∨ s′ ≈ t′) · ρ
    --------------------------
    (C ∨ t ≉ t′ ∨ s′ ≈ t′) · θ
where (i) σ = MGU(sρ, s′ρ) and θ = ρσ, (ii) tθ ⋡ sθ and t′θ ⋡ s′θ, and (iii) (s ≈ t) · θ is eligible for superposition in (C ∨ s ≈ t ∨ s′ ≈ t′) · θ.

Ordered hyperresolution:
    E1 . . . En    N
    ------------------------
    (C1 ∨ ... ∨ Cn ∨ D) · θ
where (i) each Ei is of the form (Ci ∨ Ai) · ρ, for 1 ≤ i ≤ n, (ii) N is of the form (D ∨ ¬B1 ∨ ... ∨ ¬Bn) · ρ, (iii) σ is the most general substitution such that Aiθ = Biθ for 1 ≤ i ≤ n and θ = ρσ, (iv) each Ai · θ is strictly eligible for superposition in Ei, and (v) either all ¬Bi · θ are selected, or nothing is selected, n = 1, and ¬B1 · θ is maximal w.r.t. D · θ.

In an inference by ordered hyperresolution, the closures Ei are called the electrons or the side premises, and the closure N is called the nucleus or the main premise. BS was presented in [14, 96] without the hyperresolution rule. However, as noted in [9], hyperresolution is analogous to a macro: it combines the effects of n negative superpositions of (Ai ≈ T) · ρ from Ei into (Bi ≉ T) · ρ of N, each resulting in (T ≉ T) · θ, which is immediately eliminated by reflexivity resolution. Furthermore, note that a positive superposition of a main premise into a positive literal (B ≈ T) · ρ results in a tautology (T ≈ T) · θ, which can be deleted. Hence, ordered hyperresolution captures all inferences involving several premises and literals with predicates other than ≈. One might also consider ordered factoring, which combines equality factoring on (C ∨ A ≈ T ∨ B ≈ T) · ρ with reflexivity resolution. We decided not to do this to keep the presentation simpler.

Basic superposition is a sound and complete refutation calculus: for N a set of closures saturated up to redundancy, N is unsatisfiable if and only if it contains the empty closure.
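To illustrate positive superposition, assume an LPO induced by the precedence P > f > g > b > T (all symbols are chosen only for this illustration), and consider the unit closures f(x) ≈ g(x) and P (f(b)), the latter being a shortcut for P (f(b)) ≈ T, both with empty substitutions. Superposing from f(x) ≈ g(x) into the subterm f(b) of P (f(b)) yields σ = MGU(f(x), f(b)) = {x ↦ b}, and one can check that the ordering and eligibility conditions of the rule are satisfied under this precedence. The conclusion is the closure P (g(x)) · {x ↦ b}, written P (g([b])). The term b introduced by unification occurs at a substitution position, so subsequent superposition inferences into it are not performed, which reflects the basic restriction that superposition into terms introduced by unification is unnecessary.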

Completeness of Basic Superposition. We now briefly overview the completeness proof of basic superposition. We base our presentation on the proof by Nieuwenhuis and Rubio from [96, 97], which is compatible with the one from [14].

The literal ordering is extended to closures by a multiset extension, where closures are treated as multisets of literals. We denote such an ordering on closures by ≻ as well. Because the literal ordering is total on ground literals, the closure ordering is total on ground closures.

Let C · σ be a closure and τ a ground substitution. The set of succedent-top-left variables of C · σ w.r.t. τ, written stlvars(C · σ, τ), is the set of all variables x occurring in a literal x ≈ s ∈ C such that xστ ≻ sστ. Let R be a ground and convergent rewrite system and τ a ground substitution. A variable x occurring in the skeleton C of a closure C · σ is variable irreducible w.r.t. R if (i) xστ is irreducible by R, or (ii) x ∈ stlvars(C · σ, τ) and, for all x ≈ s ∈ C, xστ is irreducible by those rules l ⇒ r from R for which xστ ≈ sστ ≻ l ≈ r. A ground instance C · στ is variable irreducible w.r.t. R if all variables x from C are variable irreducible w.r.t. R. Let irredR(C · σ) be the set of all variable irreducible ground instances of C · σ w.r.t. R. For a set of closures N, let irredR(N) be the set of all variable irreducible ground instances of closures in N w.r.t. R. Finally, let irredR≺D(N) be the subset of closures of irredR(N) smaller than a ground closure D (w.r.t. the ordering ≺ on closures).

Let ξ be a BS inference with premises D1 · σ and D2 · σ, and a conclusion C · ρ; R a rewrite system; and τ a ground substitution such that ξτ is a ground instance of ξ. Then, ξτ is variable irreducible w.r.t. R if all of D1 · στ, D2 · στ, and C · ρτ are variable irreducible w.r.t. R.

The notion of redundancy for BS is defined as follows. A closure C · σ is redundant in N if, for all rewrite systems R and all ground substitutions τ such that C · στ is variable irreducible w.r.t. R, we have R ∪ irredR≺C·στ(N) |= C · στ. An inference ξ with premises D1 · σ and D2 · σ, and a conclusion C · ρ is redundant in N if, for all rewrite systems R and all ground substitutions τ such that ξτ is a variable irreducible ground instance of ξ w.r.t. R, we have R ∪ irredR≺D(N) |= C · ρτ, for D = max(D1 · στ, D2 · στ). The set of closures N is saturated up to redundancy by BS if all inferences from premises in N are redundant in N.

A set of closures N is well-constrained if irredR(N) ∪ R |= N for any rewrite system R. If, for all C · ρ ∈ N, ρ is the empty substitution, then N is well-constrained: any variable reducible position of a ground instance of C · ρ can be reduced with rules from R to a closure in irredR(N). Furthermore, if N′ is obtained from a well-constrained set N by a sound inference rule, then N′ is also well-constrained.

Let N be the set of closures obtained by saturating a well-constrained set N0 up to redundancy by BS. Then, N is satisfiable if it does not contain the empty closure. Namely, using a variant of the model building technique [14, 96], one can generate a ground convergent rewrite system RN, which uniquely defines the Herbrand interpretation RN∗ such that RN∗ |= irredRN(N). Finally, since N0 is well-constrained, N is well-constrained as well. Since RN ⊆ RN∗, it follows that RN∗ |= N. Hence, N is satisfiable, and so is N0.

Redundancy Elimination. Based on the general redundancy notion for basic su- perposition, several effective redundancy elimination rules were presented in [14]. They allow deleting certain closures or replacing them with simpler ones in a derivation, with- out jeopardizing completeness. Next, we overview the most important redundancy elimination rules from [14]. A closure C ·σ is reduced modulo substitution η relative to a closure D ·θ if, for each rewrite systems R and each ground substitution τ, C · σητ is variable irreducible w.r.t. R whenever D · θτ is variable irreducible w.r.t. R. Checking this condition is difficult, since one needs to consider all ground substitutions and all rewrite systems; however, approximate checks suitable for practice are known. One such check is based on the notion of η-domination: for two terms s · σ and t · θ, we say that s is η-dominated by t, written s · σ vη t · θ, if and only if (i) sση = tθ, and (ii) whenever some variable x from σ occurs in s at position p, then p is in t at or below a position of a variable. For example, let s · σ = f(g(x), [g(y)]) and t · θ = f([g(c)] , [g(h(z))]). For a substitution η = {x 7→ c, y 7→ h(z)}, obviously sση = tθ. Also, each marked position from s · σ can be overlaid at or inside a marked position of t · θ, so s · σ vη t · θ. 2.5 Basic Superposition 21

This notion can be extended to literals as follows: (s ≈ t) · σ vη (w ≈ v) · θ if and only if s · σ vη w · θ and t · σ vη v · θ, or s · σ vη v · θ and t · σ vη w · θ. The definition is analogous for negative literals. Furthermore, a positive literal does not η-dominate a negative literal and vice versa. The extension to closures is performed as follows: C · σ vη D · θ if and only if, for each literal L1 · σ from C · σ, there exists a distinct literal L2 · θ from D · θ such that L1 · σ vη L2 · θ. Note that D · θ is allowed to have more literals than C · σ. Now if C · σ vη D · θ, then C · σ is reduced relative to D · θ modulo η. For some η, 0 0 0 it can happen that L ση = Lθ holds, but L · σ vη L · θ does not. Then, L · σ can be made reduced relative to L · θ by retracting those positions in L · σ that do not overlay into a substitution position of L0. Such a transformation enables an application of a simplification or deletion rule, while retracting as little information in L0 ·σ as possible. A closure C · σ is a basic subsumer of D · θ if there is a substitution η such that Cση ⊆ Dθ and C · σ is reduced relative to D · θ modulo η. Additionally, if C · σ has fewer literals than D · θ, then D · θ can be deleted. A closure (C ∨ A ∨ B) · σ can be replaced with (C ∨ A) · σ if A · σ v{} B · σ; this rule is called duplicate literal deletion. A closure C · σ can be deleted if Cσ is a tautology; this rule is called tautology deletion. Testing whether Cσ is a tautology itself requires theorem proving, so a semantic check is practically unfeasible. However, the following simple syntactic checks are effective in practice: C · σ is a syntactic tautology if it contains a pair of literals (s ≈ t) · σ and (s0 6≈ t0) · σ such that sσ = s0σ and tσ = t0σ, or a literal of the form (s ≈ t) · σ such that sσ = tσ. A closure (C ∨ x 6≈ s) · σ with xσ sσ is called a basic tautology and can be safely deleted. For example, if f(x) g(x), then the closure [f(x)] 6≈ g(x) is a basic tautology. Note that f(x) 6≈ g(x) is not a basic tautology, since f(x) does not occur at a substitution position. All presented redundancy elimination rules are decidable. In fact, duplicate literal deletion and tautology deletion can be performed in polynomial time. The subsump- tion check is NP-complete in the number of literals [53], and η-domination can be checked in polynomial time. The complexity of basic tautology deletion is determined by the complexity of checking ordering constraints. Finally, terms s and t can be compared by a lexicographic path ordering in time O(|s| · |t|) [86, 138].

Examples. We now give several examples of BS inferences. We first consider a resolution inference, with an assumption that the parameters of BS make the literal R(x, f(x)) maximal in the first, and the literal ¬R(x, y) selected in the second premise. The E-terms that participate in an inference are denoted like this . The closure on the left-hand side is the side premise, whereas the closure on the right-hand side is the main premise. To apply the inference rule, we separate the variables in premises, compute the most general unifier σ = MGU(R(x, f(x)),R(x0, y)) = {x0 7→ x, y 7→ f(x)}, and apply it to the premises. By doing so, the literals on which the resolution takes place become 22 2. Preliminary Definitions identical to R(x, f(x)), which allows the resolution to be performed. However, note that we actually apply σ to the substitution part of the premises, so the substitution part effectively accumulates the terms introduced by unification. After resolution, the obtained closure is not in the standard form, since the substitution contains a trivial mapping x0 7→ x; we bring the closure into standard form by applying the mapping to the skeleton and the substitution.

C(x) ∨ R(x, f(x)) · {} ¬D(x) ∨ ¬R(x, y) ∨ E(y) · {} ⇓ ⇓ C(x) ∨ R(x, f(x)) · {} ¬D(x0) ∨ ¬R(x0, y) ∨ E(y) · {} ⇓ ⇓ C(x) ∨ R(x, f(x)) · {} ¬D(x0) ∨ ¬R(x0, y) ∨ E(y) ·{x0 7→ x, y 7→ f(x)}

C(x) ∨ ¬D(x0) ∨ E(y) ·{x0 7→ x, y 7→ f(x)} ⇓ C(x) ∨ ¬D(x) ∨ E(y) ·{y 7→ f(x)}

The previous inference is written using the notation that explicitly distinguishes the skeleton from the substitution part of a closure. Next, we show the same inference written using the convenient notation, where terms occurring at positions of skeleton variables are marked. Note that, in the second step, the term f(x) in the literal E([f(x)]) is introduced by unification and is therefore marked.

C(x) ∨ R(x, f(x)) ¬D(x) ∨ ¬R(x, y) ∨ E(y) ⇓ ⇓ C(x) ∨ R(x, f(x)) ¬D(x0) ∨ ¬R(x0, y) ∨ E(y) ⇓ ⇓ C(x) ∨ R(x, f(x)) ¬D(x) ∨ ¬R(x, [f(x)]) ∨ E([f(x)])

C(x) ∨ ¬D(x) ∨ E([f(x)])

Next, we give an example of a positive superposition inference. We assume that no literal in either premise is selected, that literals y1 ≈ y2 ·{y1 7→ f(x), y2 7→ g(x)} and C(f(y)) ·{y 7→ f(x)} are maximal, and that f(x) g(x). Superposition is performed from y1 into f(y). To apply the inference rule, we separate the variables in the premises, compute the most general unifier σ = MGU(f(x0), f(f(x))) = {x0 7→ f(x)}, and apply it to the premises. We then perform superposition, after which we remove from the substitution all mappings of variables that do not occur in the skeleton. Observe that the literal C(f(y)) ·{y 7→ f(x)} is equivalent to C(f(f(x))), so one might attempt to perform superposition into inner f(x). However, this is not allowed: the skeleton contains the variable y at the position of inner f(x), and, by superposition conditions, the term into which superposition is performed should not be a variable. 2.6 Splitting 23

C(x) ∨ y1 ≈ y2 ·{y1 7→ f(x), y2 7→ g(x)} ¬D(x) ∨ C( f(y) ) ·{y 7→ f(x)} ⇓ ⇓

0 0 0 C(x ) ∨ y1 ≈ y2 ·{y1 7→ f(x ), y2 7→ g(x )} ¬D(x) ∨ C( f(y) ) ·{y 7→ f(x)} ⇓ ⇓

0 0 C(x ) ∨ y1 ≈ y2 ·{x 7→ f(x), y1 7→ f(f(x)), y2 7→ g(f(x))} ¬D(x) ∨ C( f(y) ) ·{y 7→ f(x)}

0 0 C(x ) ∨ ¬D(x) ∨ C(y2) ·{x 7→ f(x), y1 7→ f(f(x)), y2 7→ g(f(x)), y 7→ f(x)} ⇓ 0 0 C(x ) ∨ ¬D(x) ∨ C(y2) ·{x 7→ f(x), y2 7→ g(f(x))}

We now present the same inference using the convenient notation. That superpo- sition into positions of skeleton variables is not allowed now means that superposition into terms which are under a marker is not allowed. For example, since the inner f(x) in E(f([f(x)])) is marked, superposition into it is not allowed.

C(x) ∨ [f(x)] ≈ [g(x)] ¬D(x) ∨ C( f([f(x)]) ) ⇓ ⇓ C(x0) ∨ [f(x0)] ≈ [g(x0)] ¬D(x) ∨ C( f([f(x)]) ) ⇓ ⇓ C([f(x)]) ∨ [f(f(x))] ≈ [g(f(x))] ¬D(x) ∨ C( f([f(x)]) )

C([f(x)]) ∨ ¬D(x) ∨ C([g(f(x))])

We finish with an example of closure subsumption. Consider C1 = C(x) ∨ D(f(x)) and C2 = C([g(y)]) ∨ D([f(g(y))]) ∨ E(h(y)). It is easy to see that C1 subsumes C2 by substitution η = {x 7→ g(y)}:(i) C1η = C([g(y)]) ∨ D(f([g(y)])), so, by disregarding markers, C1η ⊆ C2, and (ii) each marked subterm from C1η can be overlaid into a marked subterm in C2. However, for C3 = C([g(y)]) ∨ D(f(g(y))) ∨ E(h(y)), we can see that C1 does not subsume C3 by η: the marked subterm [g(y)] from D(f([g(y)])) cannot be overlaid into a marked term in C3, since in the latter closure the term g(y) in literal D(f(g(y))) occurs unmarked. Actually, C1 does not subsume C3 under any substitution.

2.6 Splitting

In some proofs in the following chapters we use an additional splitting inference rule, which is borrowed from the semantic tableau calculus. If a closure consists of two parts not sharing common variables, one can separately assume that either part is true. If unsatisfiability is proved in both cases, the initial closure set is evidently unsatisfiable. Hence, splitting performs an explicit case analysis. 24 2. Preliminary Definitions

N ∪ {C ∨ D} Splitting: N ∪ {C} | N ∪ {D} where (i) N is a set of closures, (ii) C and D do not have variables in common. Splitting changes the nature of resolution significantly: a derivation is now not unique, but is computed nondeterministically, and is called a branch. A set of closures N is satisfiable if a branch exists that is saturated up to redundancy and does not contain the empty closure.

2.7 Disjunctive Datalog

The following presentation of the and the semantics of disjunctive datalog is based on [42, 55]. Let Σ be a first-order signature such that (i) F(Σ) contains only constants, and (ii) ≈ ∈ P(Σ) is a special equality predicate with the arity of two. A disjunctive datalog program with equality P is a finite set of rules of the form

A1 ∨ ... ∨ An ← B1, ..., Bm where n ≥ 0, m ≥ 0, and Ai and Bi are atoms defined over Σ. Furthermore, each rule must be safe; that is, each variable occurring in a head literal must occur in a body literal as well. For a rule r, the set of atoms head(r) = {Ai | 1 ≤ i ≤ n} is called the rule head, whereas the set of atoms body(r) = {Bi | 1 ≤ i ≤ m} is called the rule body. A rule with an empty body is called a fact. Typical definitions of a disjunctive datalog program, such as [42, 55], allow negated atoms in the body. This is usually nonmonotonic, and is thus different from negation in first-order logic. Our algorithms from the following chapters produce only positive disjunctive datalog programs, so we omit nonmonotonic negation from the definitions. Disjunctive datalog programs without negation-as-failure are often called positive programs. The ground instance of P over the Herbrand universe of P , written ground(P, HU ), is the set of ground rules obtained by replacing all variables in each rule of P with constants from HU in all possible ways. The Herbrand base HB of P is the set of all ground atoms defined over predicates from P(Σ). An interpretation M of P is a subset of HB. An interpretation M is a model of P if the following conditions are satisfied: (i) body(r) ⊆ M implies head(r) ∩ M 6= ∅, for each rule r ∈ ground(P, HU ); and (ii) all atoms from M with the ≈ predicate yield a congruence relation—that is, a relation that is reflexive, symmetric, transitive, and R(a1, . . . , ai, . . . , an) ∈ M and ai ≈ bi ∈ M imply R(a1, . . . , bi, . . . , an) ∈ M, for each predicate symbol R ∈ P(Σ). A model M of P is minimal if no subset of M is a model of P . The semantics of P is defined as the set of all minimal models of P , denoted by MM(P ). Finally, the notion of query answering is defined as follows. A ground literal A is a cautious answer of P , written P |=c A, if A ∈ M for all M ∈ MM(P ); A is a brave answer of P , written P |=b A, if A ∈ M for at least one M ∈ MM(P ). First-order entailment coincides with cautious entailment for positive ground atoms on positive programs. 2.7 Disjunctive Datalog 25

P P The size of a rule r is defined as |r| = 1 + 1≤i≤n |Ai| + 1≤j≤m |Bj|, where the size of atoms Ai and Bj is defined as |S(t1, . . . , tn)| = 1 + n: predicates and terms are encoded with one symbol, and the leading 1 in the definition of |r| accounts for the implication symbol separating the head from the body. The size of a program P , written |P |, is the sum of the sizes of all its rules. 26 2. Preliminary Definitions Chapter 3

Introduction to Description Logics

In this chapter, we present a formal definition of the syntax and the semantics of description logics, as well as of the interesting inference problems. We introduce the basic description logic SHIQ in Section 3.1, and extend it to SHIQ(D) in Section 3.2 by adding datatypes. We also give examples of SHIQ and SHIQ(D) knowledge bases, and of interesting conclusions that can be drawn from them. We use these examples in the latter chapters to demonstrate our reasoning algorithms.

3.1 The Description Logic SHIQ

The syntax of the description logic SHIQ [73] is defined as follows.

Definition 3.1.1. For a set of abstract role names NRa , the set of SHIQ abstract − − − roles is NRa ∪ {R | R ∈ NRa }. Let Inv(R) = R and Inv(R ) = R for R ∈ NRa .A

SHIQ RBox KB R over NRa is a finite set of transitivity axioms Trans(R) and abstract role inclusion axioms R v S such that R v S ∈ KB R implies Inv(R) v Inv(S) ∈ KB R, and Trans(R) ∈ KB R implies Trans(Inv(R)) ∈ KB R. Let v∗ be the reflexive–transitive closure of v. A role R is transitive if there is a role S such that Trans(S) ∈ R with S v∗ R and R v∗ S; R is simple if there is no role S such that S v∗ R and S is transitive; and R is complex if it is not simple.

Let NC be a set of atomic concepts. The set of SHIQ concepts over NC and NRa is defined inductively as the minimal set for which the following holds: > and ⊥ are SHIQ concepts; each atomic concept A ∈ NC is a SHIQ concept; and, if C and D are SHIQ concepts, R is an abstract role, S is an abstract simple role, and n is an integer, then ¬C, C u D, C t D, ∃R.C, ∀R.C, ≤ n S.C, and ≥ n S.C are also SHIQ concepts. Concepts that are not in NC are called complex. Possibly negated atomic concepts are called literal concepts. A concept C is a subconcept of a concept D if C syntactically occurs in D.

27 28 3. Introduction to Description Logics

A SHIQ TBox KB T over NC and KB R is a finite set of concept inclusion axioms C v D or concept equivalence axioms C ≡ D, where C and D are SHIQ concepts.

Let NIa be a set of abstract individuals.A SHIQ ABox KB A is a set of concept and abstract role membership axioms C(a), R(a, b), ¬S(a, b), and (in)equality axioms a ≈ b and a 6≈ b, where C is a SHIQ concept, R is an abstract role, S is an abstract simple role, and a and b are abstract individuals. An ABox is extensionally reduced if all ABox axioms contain only literal concepts. A SHIQ knowledge base KB is a triple (KB R, KB T , KB A), where KB R is an

RBox, KB T is a TBox, KB A is an ABox, and where the sets NRa , NC , and NIa are mutually disjoint.

Definition 3.1.1 differs from typical definitions in two aspects. First, OWL-DL lacks the unique name assumption (UNA), so we do not incorporate UNA into the definition of SHIQ, but allow the user to axiomatize it by including an inequality ai 6≈ aj for each pair of distinct abstract individuals [4, page 60]. Second, usual definitions do not provide for ABox axioms involving negative roles. We allow such assertions, because they allow checking entailment of ground role facts. Third, it would be possible to allow complex roles in negative role membership axioms. However, our approach for dealing with transitivity from Section 5.2 cannot handle such axioms, so we adopt this weaker definition.

Definition 3.1.2. The semantics of a SHIQ knowledge base KB is given by the map- ping π that transforms KB axioms into a first-order formula, as shown in Table 3.1. An atomic concept is mapped into a unary predicate, an abstract role is mapped into a binary predicate, and an abstract individual is mapped into a constant. The basic inference problem for SHIQ is checking satisfiability of KB—that is, determining whether a first-order model of π(KB) exists. Other interesting inference problems can be reduced to satisfiability as follows, where ι is a new abstract individual not occurring in the knowledge base:

• Concept satisfiability: A concept C is satisfiable with respect to KB if and only if there exists a model of KB in which the interpretation of C is not empty. This is the case if and only if KB ∪ {C(ι)} is satisfiable.

• Subsumption: A concept C is subsumed by a concept D with respect to KB, written KB |= C v D, if and only if π(KB) |= π(C v D). This is the case if and only if KB ∪ {(C u ¬D)(ι)} is unsatisfiable.

• Concept equivalence: A concept C is equivalent to a concept D with respect to KB, written KB |= C ≡ D, if and only if C subsumes D with respect to KB and vice versa.

• Instance checking: An individual a is an instance of a concept C with respect to KB, written KB |= C(a), if and only if π(KB) |= π(C(a)). This is the case if and only if KB ∪ {¬C(a)} is unsatisfiable. 3.1 The Description Logic SHIQ 29

• Role checking: A simple abstract role S relates abstract individuals a and b with respect to KB, written KB |= R(a, b), if and only if π(KB) |= π(S(a, b)). This is the case if and only if KB ∪ {¬S(a, b)} is unsatisfiable.

The semantics of description logics is usually given by a direct model-theoretic semantics. An interpretation I = (4I , ·I ) consists of a nonempty domain set 4I and an interpretation function ·I that assigns an element aI ∈ 4I to each individual a, a set AI ⊆ 4I to each atomic concept A, and a relation RI ⊆ 4I × 4I to each role R. The semantics of complex concepts and axioms is given in Table 3.2, where C and D are concepts, R and S are roles, a and b are individuals, and ]N is the number of elements in a set N. The direct model-theoretic semantics and the semantics by translation into first-order logic coincide, as first shown by Borgida [21]. For a role R and an object x ∈ 4I , an object y ∈ 4I is called an R-successor of x in I if (x, y) ∈ RI . Often, we need a way to compare the expressivity of different description logics. The following definition provides means for that by a simple syntactic comparison.

Definition 3.1.3. Let L, L1, and L2 be three description logics.

• The logic L is a fragment of L1 if each axiom of L is also an axiom of L1.

• The logic L is between L1 and L2 if L1 is a fragment of L, and L is a fragment of L2.

We now define several fragments of SHIQ, which we use in the following chapters.

Definition 3.1.4. For a knowledge base KB, a role R is called very simple if no role S − exists such that S v R ∈ KB R. The description logic SHIQ is a fragment of SHIQ, where only very simple roles are allowed to occur in number restrictions ≤ n R.C and ≥ n R.C. ALCHIQ (ALCHIQ−) is a fragment of SHIQ (SHIQ−) that does not allow transitivity axioms in RBoxes. ALC is the fragment of ALCHIQ that does not provide for role inclusion axioms, inverse roles, and number restrictions.

Observe that, if KB is not extensionally reduced, it can be easily transformed into an extensionally reduced knowledge base: for each axiom C(a) where C is not a literal concept, one can introduce a new atomic concept AC , add the axiom AC v C to the TBox, and replace C(a) with AC (a). Such a transformation is obviously polynomial in the number of individuals, so, without loss of generality, it is safe to assume that a knowledge base is extensionally reduced. Note that, by Definition 3.1.1, the relation v∗ can be cyclic in general. In [144] it was shown that we can reduce each SHIQ knowledge base KB with a cyclic role hierarchy to a SHIQ knowledge base KB 0 with an acyclic role hierarchy using the following algorithm. First, we compute the set of maximal, strongly connected com- ponents (or maximal cycles) of the role inclusion relation v of KB. For each strongly connected component Γ, we select one representative role, denoted with role(Γ), such 30 3. Introduction to Description Logics

Table 3.1: Semantics of SHIQ by Mapping to FOL

Mapping Concepts to FOL πy(>,X) = > πy(⊥,X) = ⊥ πy(A, X) = A(X) πy(¬C,X) = ¬πy(C,X) πy(C u D,X) = πy(C,X) ∧ πy(D,X) πy(C t D,X) = πy(C,X) ∨ πy(D,X) πy(∀R.C, X) = ∀y : R(X, y) → πx(C, y) πy(∃R.C, X) = ∃y : R(X, y) ∧ πx(C, y) Vn+1 Wn+1 n+1 πy(≤ n R.C, X) = ∀y1, . . . , yn+1 : i=1 [R(X, yi) ∧ πx(C, yi)] → i=1 j=i+1 yi ≈ yj Vn Vn n πy(≥ n R.C, X) = ∃y1, . . . , yn : i=1[R(X, yi) ∧ πx(C, yi)] ∧ i=1 j=i+1 yi 6≈ yj Mapping Axioms to FOL π(C v D) = ∀x : πy(C, x) → πy(D, x) π(C ≡ D) = ∀x : πy(C, x) ↔ πy(D, x) π(R v S) = ∀x, y : R(x, y) → S(x, y) π(Trans(R)) = ∀x, y, z : R(x, y) ∧ R(y, z) → R(x, z) π(C(a)) = πy(C, a) π(R(a, b)) = R(a, b) π(¬S(a, b)) = ¬S(a, b) π(a ◦ b) = a ◦ b for ◦ ∈ {≈, 6≈} Mapping KB to FOL π(R) = ∀x, y : R(x, y) ↔ R−(y, x) V V π(KB R) = π(α) ∧ π(R) α∈KB R R∈NRa π(KB ) = V π(α) T α∈KB T π(KB ) = V π(α) A α∈KB A π(KB) = π(KB R) ∧ π(KB T ) ∧ π(KB A) Notes: (i): X is a meta-variable and is substituted by the actual term; (ii): πx is obtained from πy by simultaneously substituting in the definition all y(i) with x(i), πy with πx, and vice versa. 3.1 The Description Logic SHIQ 31

Table 3.2: Direct Model-Theoretic Semantics of SHIQ

Interpreting Concepts >I = 4I ⊥I = ∅ (¬C)I = 4I \ CI (C u D)I = CI ∩ DI (C t D)I = CI ∪ DI (∀R.C)I = {x | ∀y :(x, y) ∈ RI → y ∈ CI } (∃R.C)I = {x | ∃y :(x, y) ∈ RI ∧ y ∈ CI } (≤ n R.C)I = {x | ]{y | (x, y) ∈ RI ∧ y ∈ CI } ≤ n} (≥ n R.C)I = {x | ]{y | (x, y) ∈ RI ∧ y ∈ CI } ≥ n} Semantics of Axioms C v DCI ⊆ DI C ≡ DCI = DI R v SRI ⊆ SI Trans(R)(RI )+ ⊆ RI C(a) aI ∈ CI R(a, b)(aI , bI ) ∈ RI ¬S(a, b)(aI , bI ) ∈/ SI a ≈ b aI = bI a 6≈ b aI 6= bI

that, if R ∈ Γ and Inv(R) ∈ Γ0 (where Γ0 is a strongly connected component pos- sibly different from Γ) and role(Γ) = R, then role(Γ0) = Inv(R). Since we assume that R v S ∈ KB R implies Inv(R) v Inv(S) ∈ KB R, we have that, if R,S ∈ Γ and Inv(R) ∈ Γ0, then Inv(S) ∈ Γ0, so the definition of role(Γ) is correct. Next, we form 0 0 the new TBox KB T and ABox KB A by replacing, in all axioms of KB A and KB T , each role R with role(Γ), where Γ is the maximal, strongly connected component that 0 R belongs to. Finally, we construct the new RBox KB R as follows. For each pair of strongly connected components Γ 6= Γ0, if there are roles R ∈ Γ and R0 ∈ Γ0 with 0 0 0 R v R , we add the axiom role(Γ) v role(Γ ) to KB R. Next, for each strongly con- 0 nected component Γ, we add the axiom Inv(role(Γ)) v role(Γ) to KB R if there is a role R ∈ Γ such that Inv(R) ∈ Γ. Since the strongly connected components of v can be computed in time quadratic in the number of roles, this reduction can be performed in polynomial time. Hence, we can assume without loss of generality that v∗ is acyclic. A concept C is in negation-normal form if all in C occur in front of atomic concepts only. A concept C can be transformed in time linear in the size of C into an equivalent concept in negation-normal form, written NNF(C), by exhaustively applying the following rewrite rules to subconcepts of C: 32 3. Introduction to Description Logics

¬> ⊥ ¬⊥ > ¬(C1 u C2) ¬C1 t ¬C2 ¬(C1 t C2) ¬C1 u ¬C2 ¬(∃R.C) ∀R.¬C ¬(∀R.C) ∃R.¬C ¬(≥ (n + 1) R.C) ≤ n R.C ¬(≤ n R.C) ≥ (n + 1) R.C ¬(≥ 0 R.C) ⊥ With |KB| we denote the size of the knowledge base assuming unary coding of numbers, which is computed recursively in the following way, for C and D concepts, A an atomic concept, and R and S roles:

|>| = 1 |⊥| = 1 |A| = 1 |¬C| = 1 + |C| |C t D| = |C| + |D| + 1 |C u D| = |C| + |D| + 1 |∃R.C| = 2 + |C| |∀R.C| = 2 + |C| |≥ n R.C| = n + 2 + |C| |≤ n R.C| = n + 2 + |C| |C v D| = |C| + |D| + 1 |C ≡ D| = |C| + |D| + 1 |R v S| = 3 |Trans(R)| = 2 |C(a)| = |C| + 1 |R(a, b)| = 3 |KB | = P |α| |KB | = P |α| R α∈KB R T α∈KB T |KB | = P |α| |KB| = |KB | + |KB | + |KB | A α∈KB A R T A Intuitively, |KB| is the number of symbols needed to encode KB on the input tape of a Turing machine. We use a single symbol for each atomic concept, role, and individual. The n in the definition of the length of concepts ≥ n R.C and ≤ n R.C stems from the assumption on number coding: a number n can be encoded in unary with n bits. The notion of positions is extended to SHIQ concepts and axioms in the obvious way:

• α| = α for α a concept or an axiom;

• (¬D)|1.p = D|p;

• (D1 ◦ D2)|i.p = Di|p for ◦ ∈ {u, t, v, ≡} and i ∈ {1, 2};

• For C = ./ R.D with ./ ∈ {∃, ∀}, C|1 = R, and C|2.p = D|p;

• For C = ./ n R.D with ./ ∈ {≤, ≥}, C|1 = n, C|2 = R, and C|3.p = D|p;

• Trans(R)|1 = R;

• For α = C(a), α|1.p = C|p, and α|2 = a;

• For α = R(a, b), α|1 = R, α|2 = a, and α|3 = b;

• (¬S(a, b))|1.p = S(a, b)|p;

• (a1 ◦ a2)|i = ai for ◦ ∈ {≈, 6≈} and i ∈ {1, 2}. 3.1 The Description Logic SHIQ 33

For α a SHIQ concept or axiom and p a position in α, replacing α|p with β is denoted with α[β]p, and is defined in the obvious way (we assume that the result is a syntactically correct term). Furthermore, if α|p is a concept, then the polarity of p is defined to agree with the polarity of the corresponding position in translation of α into first-order logic, and is defined as follows:

pol(C, ) = 1 pol(¬C, 1.p) = −pol(C, p) pol(C1 u C2, i.p) = pol(Ci, p) for i ∈ {1, 2} pol(∃R.C, 2.p) = pol(C, p) pol(C1 t C2, i.p) = pol(Ci, p) for i ∈ {1, 2} pol(∀R.C, 2.p) = pol(C, p) pol(≤ n R.C, 3.p) = −pol(C, p) pol(≥ n R.C, 3.p) = pol(C, p) pol(C1 v C2, 1.p) = −pol(C1, p) pol(C1 v C2, 2.p) = pol(C2, p) pol(C1 ≡ C2, i.p) = 0 for i ∈ {1, 2} pol(C(a), 1.p) = pol(C, p)

3.1.1 Example In this subsection, we give an example of a SHIQ knowledge base and show some inferences that can be drawn from it. DLs were often used for solving configuration problems [93]. Configurations are usually consist of many different parts, catalogs of parts that configurations are assembled from are usually hierarchical, and valid configurations can, to a large degree, be described using logical assertions. Hence, we embed our example in a configuration scenario. Let ACME be a computer manufacturing company. As all modern suppliers of computer equipment, ACME allows its customers to assemble a computer configura- tion that best suits their needs. However, assembling a configuration is a nontrivial task, due to interdependencies among different components. For example, the choice of the motherboard usually constrains the types of compatible memory chips, disk controllers, and other components. To aid its customers in assembling a valid config- uration, ACME creates a formal ontology of catalog parts. This ontology can be used in an application that guides the users in choosing their components, thus ensuring that chosen components work together correctly in an assembled computer. To represent objects from the ACME catalog and the relationships between them, we use the concepts and roles from Table 3.3. We now describe some relations that hold for all computers in all configurations. First, we would like to constrain the role hasAdpt to point only to instances of the Adpt concept; we often say that the range of the hasAdpt role is the Adpt concept. Similarly, the range of has3DAcc is 3DAcc. These constraints can be specified using these axioms:

(3.1) > v ∀ hasAdpt.Adpt (3.2) > v ∀ has3DAcc.3DAcc

Next, we specify taxonomical relationships that hold between items in the catalog. Axiom (3.3) states that things that are adapters and 3D accelerators are adapters with built-in 3D accelerators. Furthermore, (3.4) and (3.5) state that adapters and 3D 34 3. Introduction to Description Logics

Concepts PC all personal computers GrWS all graphics workstations GaPC all computers mainly used for playing computer games Adpt all video adapters in the catalog 3DAcc all 3D video accelerators Adpt3DAcc all video adapters with built-in 3D accelerators VD all video devices PCI all devices built for the PCI bus Roles hasAdpt computer has a video adapter has3DAcc computer has a 3D accelerator hasVD computer has a video device contains some object contains other objects

Table 3.3: Example Terminology Used in ACME’s Catalog accelerators are video devices, respectively. Finally, (3.6) states that graphic worksta- tions are PCs, and (3.7) states that gaming PCs are PCs and graphics workstations. (Namely, to play modern computer games, one usually needs more graphics power than for many business applications of .)

(3.3) Adpt u 3DAcc v Adpt3DAcc (3.4) Adpt v VD (3.5) 3DAcc v VD (3.6) GrWS v PC (3.7) GaPC v GrWS u PC

Next, we specify relationships among roles. Axiom (3.8) states that “having a video adapter” is a kind of “having a video device,” and (3.9) states that “having a 3D accelerator” is also a kind of “having a video device.” Finally, “having a video device” is a kind of containment relationship, as specified by (3.10); the latter relationship is transitive by (3.11).

(3.8) hasAdpt v hasVD (3.9) has3DAcc v hasVD (3.10) hasVD v contains Trans(contains)(3.11)

Finally, we specify what permitted configurations look like. A personal computer usually requires at least one video adapter, as stated by (3.12). Intuitively, one might expect the right-hand side of the axiom to be ∃ hasAdpt.Adpt instead of ∃ hasAdpt.>. 3.2 Description Logics with Concrete Domains 35

However, since the range of hasAdpt is Adpt by (3.1), the axiom (3.12) is sufficient. Next, (3.13) states that graphics workstations require a 3D accelerator. Axiom (3.14) states that gaming PCs, being destined for the lower market segment, should have at most one video device. Finally, (3.15) ensures that all parts of a PCI-based computer are really built for the PCI bus.

(3.12) PC v ∃ hasAdpt.> (3.13) GrWS v ∃ has3DAcc.> (3.14) GaPC v ≤ 1 hasVD.VD (3.15) PCI v ∀ contains.PCI

With KB 1 we denote a SHIQ knowledge base that contains exactly the axioms (3.1)–(3.15).

We now show how KB 1 can be used to solve configuration tasks. Let us now assume that a customer wants to buy a gaming PC, and that an ACME sales representative is trying to help the customer to decide what kind of video or 3D accelerator should be added to the computer. He represents the choices of the customer using the following assertion:

GaPC (pc1 )(3.16)

Axiom (3.16) gives a name, pc1 , to the computer that the customer wants to buy, and states that it is a gaming PC; let KB 2 be a SHIQ ABox containing only the axiom (3.16). The knowledge about the configuration and the user’s choices can now be used to guide the construction of a valid configuration as follows. Namely, it is easy to see that the following holds:

KB 1 ∪ KB 2 |= ∃hasVD.Adpt3DAcc(pc1 )(3.17)

In other words, the video device that the customer needs must be a video adapter with a built-in 3D accelerator. Namely, since pc1 is a gaming PC by (3.16), it has a video adapter by (3.12) and a 3D accelerator by (3.13). However, all gaming PCs are allowed to have only one video device by (3.14), so the video adapter and the 3D accelerator must be one and the same. Hence, the video device must be the video adaptor and the 3D accelerator.

3.2 Description Logics with Concrete Domains

Practical applications of description logics often need to represent concrete proper- ties, such as height, name, or age, which assume values from a fixed domain, such as integers or strings. This requirement led to the extension of description logics with concrete domains [5]. Informally, a concrete domain provides a set of predicates with 36 3. Introduction to Description Logics a predefined interpretation. If a decision procedure for checking satisfiability of finite conjunctions over concrete domain predicates exists, many DLs can be combined with a concrete domain while retaining decidability. Unfortunately, in [89] it was shown that a logic with general inclusion axioms and concrete domains is undecidable. Therefore, in [70, 101, 62] several restrictions of the general approach were investigated; a survey of the main results was presented in [88]. The cumulative results of this research have influenced the design of OWL-DL [106], which supports datatypes—a basic form of concrete domains.

3.2.1 Concrete Domains

With x we denote a vector of variables x1, . . . , xn, and for a function δ, with δ(x) we denote the application of δ to each element xi of x—that is, a vector δ(x1), . . . , δ(xn). A concrete domain is defined as follows:

Definition 3.2.1. A concrete domain D is a pair (4D, ΦD), where 4D is a set, called the domain of D, and ΦD is a finite set of concrete predicate names. Each d ∈ ΦD D n is associated with an arity n and an extension d ⊆ 4D. A concrete domain D is admissible if the following conditions hold:

• ΦD is closed under negation; that is, for each d ∈ ΦD, there exists d ∈ ΦD with D n D d = 4D \ d ;

• ΦD contains a unary predicate >D interpreted as 4D;

• ΦD contains a binary predicate ≈D interpreted as {(x, y) | x = y};

• ΦD contains a unary predicate =α for each α ∈ 4D, interpreted as {α}; Vn • D-satisfiability of finite conjunctions of the form i=1 di(xi)—that is, checking D if an assignment δ of variables to elements of 4D exists such that δ(xi) ∈ di for each 1 ≤ i ≤ n—is decidable.

The definition of admissibility from [5] does not require the existence of ≈D pred- icate. However, the algorithms we develop in the subsequent chapters depend on this predicate, so we extend the notion of admissibility appropriately. Furthermore, the usual definition does not require the existence of predicates =α. We introduce this restriction to allow accessing concrete domain individuals explicitly. Thus, we can state that the age of Peter is 15 using the axiom hasAge(peter, 15); this is equivalent to hasAge(peter, a15) ∧ =15(a15), where =15 is a concrete domain predicate with the interpretation {15}. Extending first-order logic with a concrete domain is significantly simplified if the interpretation of concrete objects is separated from the interpretation of other objects. Hence, we assume that the set of sorts S of a signature Σ contains the sort c for the concrete objects, which is different from other abstract objects. When there is a need to distinguish the sorts syntactically, we denote the variables (general function symbols) of sort c with xc (f c). 3.2 Description Logics with Concrete Domains 37

Definition 3.2.2. A signature Σ = (P, F, V, S) is compatible with an admissible concrete domain D if it satisfies the following conditions:

• c ∈ S is a concrete sort;

• ΦD ⊆ P;

• The signature of each predicate from ΦD is c × ... × c;

• For each general function symbol f with a signature r1 × ... × rn → r, we have ri 6= c for each i.

We assume that the concrete predicate ≈D is used in formulae instead of ≈c for expressing equality between concrete terms. An interpretation I over the D-signature Σ I D is a D-interpretation if Dc = 4D, and, for each concrete predicate d ∈ ΦD, d = d . The usual notions of models, validity, satisfiability, and entailment are generalized to D-models, D-validity, D-satisfiability, and D-entailment (written |=D) as usual. When ambiguity does not arise, we do not stress D for satisfiability, equisatisfia- bility, entailment, and so on. Next, we discuss the restrictions of Definition 3.2.2. First, the concrete domain does not specify the interpretation for general func- tion symbols of sort c. Hence, to simplify the treatment, we prohibit nesting of con- crete terms into other (concrete or nonconcrete) terms. Hence, for a term t, we have sort(t|p) = c only if p = . Second, the equality and inequality between concrete terms are expressed only using the concrete predicates ≈D and 6≈D, respectively. Note that both ≈D and 6≈D c c are concrete predicates without any special status; also, a literal s 6≈D t is a positive c c literal with a concrete predicate 6≈D. If a literal s ◦ t with ◦ ∈ {≈D, 6≈D} is to be represented as an E-term (for example, in order to apply basic superposition), it c c is encoded as a positive literal (s ◦ t ) ≈ T. Therefore, ≈D and 6≈D are opaque to equational calculi, such as basic superposition, and they are handled only by the extensions we present in Subsection 6.1.1. Third, practical applications require several different concrete domains at once. An approach for combining two or more concrete domains into one concrete domain has been presented in [5]. Therefore, without loss of generality we can consider only one concrete domain at the time.

3.2.2 The Description Logic SHIQ(D) We now present the formal definition of the description logic SHIQ(D), which is obtained by combining SHIQ with concrete datatypes. The syntax of SHIQ from Definition 3.1.1 is extended as follows: 38 3. Introduction to Description Logics

Definition 3.2.3. Let NRc be the set of concrete roles. Additionally to SHIQ RBox axioms, a SHIQ(D) RBox KB R can contain a finite number concrete role inclusion axioms T v U.1 Let D be an admissible concrete domain. In addition to SHIQ concepts, the set of SHIQ(D) concepts contains ∃T1,...,Tm.d, ∀T1,...,Tm.d, ≤ n T , and ≥ n T , for T(i) concrete roles, d an m-ary concrete domain predicate, and n an integer. A SHIQ(D) TBox KB T is defined analogously to Definition 3.1.1.

Let NIc be a set of concrete individuals. Additionally to SHIQ ABox axioms, a SHIQ(D) ABox KB A can contain a finite number of concrete role membership axioms (¬)T (a, bc), and (in)equality axioms ac ≈ bc and ac 6≈ bc, where T is a concrete role, a is an abstract individual, and ac and bc are concrete individuals.

A SHIQ(D) knowledge base is defined as in Definition 3.1.1, where the sets NRa ,

NRc , NC , NIa , and NIc are mutually disjoint. The semantics of SHIQ from Definition 3.1.2 is extended to SHIQ(D) as follows: Definition 3.2.4. The semantics of a SHIQ(D) knowledge base KB is given by extending the mapping π from Definition 3.1.2 to translate KB into a multi-sorted first-order formula, as shown in Table 3.4. We assume the existence of a separate sort a for abstract objects, which is different from c. Atomic concept predicates have the signature a, abstract roles have the signature a × a, concrete roles have the signature a × c, and n-ary concrete domain predicates have the signature c × ... × c. The basic inference problem for SHIQ(D) is checking satisfiability of KB—that is, determining whether a D-model of π(KB) exists. The other inference problems are defined analogously to Definition 3.1.2. The direct model-theoretic semantics is easily extended to SHIQ(D), by extend- I c I ing the interpretation function · to assign an element (a ) ∈ 4D to each concrete c I I individual a , and a relation T ⊆ 4 × 4D to each concrete role T . The semantics of new concepts and axioms is given in Table 3.5, where T(i) and U are concrete roles and d is a concrete predicate. SHIQ(D) concepts can be rewritten into negation-normal form as follows:

¬(∃T1,...,Tn.d) ∀T1,...,Tn.d ¬(∀T1,...,Tn.d) ∃T1,...,Tn.d ¬(≥ (n + 1) T ) ≤ n T ¬(≤ n T ) ≥ (n + 1) T ¬(≥ 0 T ) ⊥ The size of a SHIQ(D) knowledge base KB is obtained by extending the definition from Section 3.1 to handle the new constructs in the following way:

|∃T1,...,Tm.d| = 2 + m |≥ n T | = n + 2 |∀T1,...,Tm.d| = 2 + m |≤ n T | = n + 2 The notion of positions is extended to the new SHIQ(D) constructs as follows:

1Inverse concrete roles do not make sense semantically, so we do not distinguish between con- crete roles and concrete role names. Furthermore, transitive concrete roles also do not make sense semantically, so they are not allowed. Note that this makes all concrete roles simple. 3.2 Description Logics with Concrete Domains 39

Table 3.4: Semantics of SHIQ(D) by Mapping to FOL

Mapping Concepts to FOL c c Vm c c c πy(∀T1,...,Tm.d, X) = ∀y1, . . . , ym : i=1 Ti(X, yi ) → d(y1, . . . , ym) c c Vm c c c πy(∃T1,...,Tm.d, X) = ∃y1, . . . , ym : i=1 Ti(X, yi ) ∧ d(y1, . . . , ym) c c Vn+1 c Wn+1 n+1 c c πy(≤ n T , X) = ∀y1, . . . , yn+1 : i=1 T (X, yi ) → i=1 j=i+1 yi ≈D yj c c Vn c Vn n c c πy(≥ n T , X) = ∃y1, . . . , yn : i=1 T (X, yi ) ∧ i=1 j=i+1 yi 6≈D yj Mapping Axioms to FOL π(T v U) = ∀x, yc : T (x, yc) → U(x, yc) π(T (a, bc)) = T (a, bc) π(¬T (a, bc)) = ¬T (a, bc) c c c c π(a ◦ b ) = a ◦D b for ◦ ∈ {≈, 6≈}

• For C = ./ T1,...,Tm.d with ./ ∈ {∃, ∀}, C|i = Ti for 1 ≤ i ≤ m, and C|m+1 = d;

• For C = ./ n T with ./ ∈ {≤, ≥}, C|1 = n and C|2 = T .

Since the new constructs do not contain nested concepts, the notion of polarity carries over from SHIQ to SHIQ(D) without change.

3.2.3 Example To demonstrate how concrete domains can be applied in practice, we extend the con- figuration example from Subsection 3.1.1 with a concrete domain N, where the domain is the set of all nonnegative integers, and the predicates have the form ≥100 and are interpreted in the obvious way. An important aspect of any item catalog is the item’s price, which we represent using the role price. In a more realistic scenario, one would probably represent a price with the pair specifying the amount and the currency; however, for simplicity, we assume that all prices are represented in euros. An item has at most one price, so we declare price to be functional by (3.18).

(3.18) > v ≤ 1 price

We now represent additional classes of items in our catalog. For example, using (3.19) we state that high-end PCs are either graphics workstations, or PCs with a price higher than 2000 EUR.

HighEnd v GrWS t (PC u ∃price.≥2000)(3.19)

We denote with KB 3 the knowledge base KB 1 from Subsection 3.1.1 extended with axioms (3.18) and (3.19). 40 3. Introduction to Description Logics

Table 3.5: Direct Model-Theoretic Semantics of SHIQ(D)

Interpreting Concepts I I I (∀T1,...,Tm.d) = {x | ∀y1, . . . , ym :(x, y1) ∈ T1 ∧ ... ∧ (x, ym) ∈ Tm → D (y1, . . . , ym) ∈ d } I I I (∃T1,...,Tm.d) = {x | ∃y1, . . . , ym :(x, y1) ∈ T1 ∧ ... ∧ (x, ym) ∈ Tm∧ D (y1, . . . , ym) ∈ d } (≤ n T )I = {x | ]{y | (x, y) ∈ T I } ≤ n} (≥ n T )I = {x | ]{y | (x, y) ∈ T I } ≥ n} Semantics of Axioms T v UT I ⊆ U I T (a, bc)(aI , (bc)I ) ∈ T I ¬T (a, bc)(aI , (bc)I ) ∈/ T I ac ≈ bc (ac)I = (bc)I ac 6≈ bc (ac)I 6= (bc)I

Let us now assume that a customer wants to buy a high-end PC with a price of 1500 EUR at most. These choices are represented by axioms (3.20) and (3.21); let KB 4 be the knowledge base containing them.

HighEnd(pc2 )(3.20)

∃price.≤1500(pc2 )(3.21)

Based on the user’s choices, the configuration system can conclude that pc2 must be a graphics workstation:

KB 3 ∪ KB 4 |= GrWS(pc2 )(3.22)

Namely, pc2 is declared to be a high-end PC with a price lower than 1500 EUR. Since each PC has only one price, the price cannot be also higher than 2000 EUR; formally stated, the conjunction of concrete domain predicates ≥2000(x) ∧ ≤1500(x) is unsatisfiable. Hence, if axioms (3.19) and (3.19) are to be satisfied, pc2 must be a graphics workstation. Part II

From Description Logics to Disjunctive Datalog

41

Chapter 4

Reduction Algorithm at a Glance

The algorithm for reducing a SHIQ(D) knowledge base to a disjunctive datalog pro- gram is fairly complex and technical. This is so because SHIQ(D) encompasses different features, each of which raises special concerns for resolution-based theorem proving. For example, handling number restrictions requires equality reasoning, which in turn requires a complex calculus, such as basic superposition. Similarly, a new approach for reasoning with concrete domains is required to handle datatypes. To make our work easier to follow, in this chapter we present a simpler version of all algorithms, scaled down to the basic description logic ALC. Even though it is quite simple, ALC is an ExpTime-complete logic [144], with features common to most of the more complex DLs, such as existential and universal quantification, and general concept inclusion axioms. The reduction algorithm for ALC conveys all the basic ideas, without overloading the presentation with details. We hope that this will help the reader to understand the intuition behind the algorithms presented in the following chapters. This chapter is of tutorial character, and as such we tried to make it self-contained (apart from the general definitions that we presented in Chapter 2). However, it is not strictly necessary for understanding the other chapters and may be skipped. To make other chapters self-contained as well, some definitions and clarifications are repeated in the latter chapters.

4.1 The Main Difficulty in Reducing DLs to Datalog

For an ALC knowledge base KB, our goal is to derive a disjunctive datalog program DD(KB) such that KB |= α if and only if DD(KB) |= α for α of the form A(a) or R(a, b). In other words, KB and DD(KB) should entail the same set of positive ground facts. We can thus use DD(KB) instead of KB for query answering, and in doing so we can apply all optimization techniques known in deductive databases.

43 44 4. Reduction Algorithm at a Glance

As shown by Definition 3.1.2 and [21], there is a close correspondence between description logics and first-order logic. Consider the following knowledge base:

(4.1) KB = {A v ∃R.A, ∃R.∃R.A v B,A(a)}

A na¨ıve attempt to reduce KB into disjunctive datalog is to translate KB into first- order logic as π(KB), skolemize it, translate it into conjunctive normal form, and rewrite the obtained set of clauses into rules. On KB, such an approach produces the following logic program LP(KB):

R(x, f(x)) ← A(x)(4.2) A(f(x)) ← A(x)(4.3) B(x) ← R(x, y),R(y, z),A(z)(4.4) A(a)(4.5)

Clearly, KB and LP(KB) entail the same set of ground facts. However, LP(KB) contains a function symbol in a recursive rule (4.3). This raises the issue of how to answer queries in LP(KB). Namely, well-known query evaluation techniques, such as bottom-up saturation, will not terminate on LP(KB): using bottom-up saturation, we shall derive A(f(a)), R(a, f(a)), A(f(f(a))), R(f(a), f(f(a))), B(a), and so on. Obviously, such an algorithm will continue deriving ever deeper facts, and will therefore never terminate. Note that we need all previously derived facts to derive B(a) from LP(KB), and that we do not know a priori when all relevant ground facts have been derived, so that we might stop the saturation. This problem could be solved by employing an appropriate cycle detection mech- anism. The authors use such an approach in [49] to derive a hyperresolution decision procedure for a variant of the guarded fragment. However, using specialized algorithms for evaluating queries in LP(KB) takes us further away from our original goal of apply- ing deductive database optimization techniques to description logics. In a way, such an algorithm could be understood as an alternative syntactic notation for the tableau calculus, for which it is still unclear how to apply known optimization techniques, such as magic sets. To avoid potential problems with termination, our goal is to derive a true disjunc- tive datalog program DD(KB) without function symbols. For such a program, queries can be evaluated using any standard technique; furthermore, all existing optimization techniques known in deductive databases can be applied directly. Hence, the main problem that we deal with is the elimination of function symbols from LP(KB).

4.2 The General Idea

To obtain the reduction of an ALC knowledge base KB to a disjunctive datalog pro- gram entailing the same set of ground facts, we start off with a slightly simpler task of deriving a program DD(KB) that is satisfiable if and only if KB is satisfiable. The 4.3 Translating KB into Clauses 45 intuitive principle used to ensure the equisatisfiability of KB and DD(KB) is relatively simple. Let us assume that unsatisfiability of KB can be demonstrated by a refutation in some sound and complete calculus C. If it is possible to simulate inferences of C on KB using a datalog program DD(KB), a refutation in KB by C can be reduced to a refutation in DD(KB). Conversely, if DD(KB) is unsatisfiable, there is a refutation in DD(KB). If it is possible to simulate inferences in DD(KB) by the calculus C on KB, then a refutation in DD(KB) can be reduced to a refutation in KB. To summarize, if inferences can be simulated in both directions, DD(KB) and KB are equisatisfiable. To obtain a sound, complete, and terminating algorithm from the high-level idea outlined in the previous paragraph, it is necessary to select an appropriate calculus C, capable of effectively deciding satisfiability of KB. Positive disjunctive datalog is strongly related to clausal first-order logic. Intuitively, simulating inferences in disjunctive datalog is easier if C is a clausal refutation calculus. Therefore, we first derive a decision procedure for ALC based on the ordered resolution calculus R (for a definition of R, see Section 2.4). In particular, we show in Section 4.3 how to translate KB into an equisatisfiable set of clauses Ξ(KB), and in Section 4.4 that exhaustive application of the inference rules of R on Ξ(KB) eventually terminates. Since R is sound and complete, termination implies that R decides satisfiability of Ξ(KB), and therefore of KB as well. Based on such a procedure, we derive in Section 4.5 the desired reduction of KB to a disjunctive datalog program. It turns out that, for ALC, simulating inferences of R in disjunctive datalog and vice versa is quite straightforward. After presenting several simple examples in Section 4.6, in Section 4.7 we summarize the issues that need to be dealt with in order to extend the algorithms to SHIQ(D). Finally, in Section 4.8 we discuss some interesting aspects of our algorithms.

4.3 Translating KB into Clauses

The first step in deciding satisfiability of an ALC knowledge base KB by R is to trans- form KB into an equisatisfiable set of clauses Ξ(KB). A straightforward way of doing so is to compute the formula π(KB) (where π is the operator from Definition 3.1.2), and then to apply the clausification operator Cls to it (that is, to skolemize π(KB) and to transform the result into conjunctive normal form using well-known identities, as explained in Section 2.1). However, such an algorithm has two important drawbacks. First, the size of Cls(π(KB)) could be exponential in the size of π(KB), due to nesting of u and t connectives. Second, since KB is an ALC knowledge base, the formula π(KB) is of a particular syntactic structure, which we exploit to derive the decision procedure. Direct clausification of π(KB) would destroy this structure, and thus make it difficult, if not impossible, to obtain a decision procedure for ALC by R. In order to avoid the exponential blowup and to preserve the structure of formulae, we apply the structural transformation, introduced in [108, 99, 8]. Next, we present an alternative, but equivalent definition. 46 4. Reduction Algorithm at a Glance

Definition 4.3.1. Let C be an ALC concept in negation-normal form. A position F p 6=  in C is eligible for replacement if C|p is of form Li, d Li, ∀R.L, or ∃R.L, where L(i) are all literal concepts. The definitorial form of C, written Def(C), is defined recursively as follows:

 {C} if C is a literal concept Def(C) = {¬Q t C|p} ∪ Def(C[Q]p) if p is eligible for replacement in C

For example, Def(∃R.(∀S.¬A)) = {∃R.Q, ¬Q t ∀S.¬A}. Note that ¬Q t ∀S.¬A can be interpreted as Q v ∀S.¬A, which allows us to use Q as a new name for ∀S.¬A in ∃R.(∀S.¬A). Furthermore, as shown by the following lemma, this transformation does not affect satisfiability: > v ∃R.(∀S.¬A) and {> v ∃R.Q, > v ¬Q t ∀S.¬A} are equisatisfiable. Lemma 4.3.2. For an ALC concept C in negation-normal form, the axiom > v C is satisfiable if and only if the set of axioms {> v Di | Di ∈ Def(C)} is satisfiable. Proof. The proof is by induction on the number of recursive invocations of Def. The induction base is trivial, so we consider the induction step, which replaces the sub- concept C|p in C with Q, resulting in D = C[Q]p. For the (⇒) direction, let I be a model satisfying > v C. Obviously, an interpretation I0, obtained by extending I I0 I with Q = D , satisfies > v D and > v ¬Q t C|p. For the (⇐) direction, let I be a model of > v D and > v ¬Q t C|p. The following property can be shown by an easy I I induction on the structure of D: for each position q in D,(D|q) ⊆ (C|q) . But then, for q = , we have DI ⊆ CI . Since 4I ⊆ DI , we have that I is a model of > v C.

The set of clauses Ξ(KB), encoding an ALC knowledge base KB in first-order logic, is defined as follows: Definition 4.3.3. We extend the clausification operator Cls from Section 2.1 to ALC concepts as follows: [ Cls(C) = Cls(∀x : πy(D, x)) D∈Def(NNF(C))

For an extensionally reduced ALC knowledge base KB, Ξ(KB) is the smallest set of clauses satisfying the following conditions:

• For each ABox axiom α in KB, Cls(π(α)) ⊆ Ξ(KB);

• For each TBox axiom C v D in KB, Cls(¬C t D) ⊆ Ξ(KB);

• For each TBox axiom C ≡ D in KB, Cls(¬C t D) ⊆ Ξ(KB) and Cls(¬D t C) ⊆ Ξ(KB).

If KB is not extensionally reduced, then Ξ(KB) = Ξ(KB 0), where KB 0 is an exten- sionally reduced knowledge base obtained from KB as explained in Section 3.1. 4.4 Deciding Satisfiability of Ξ(KB) by R 47

Table 4.1: Clause Types after Preprocessing W 1 (¬)Ci(x) ∨ R(x, f(x)) W 2 (¬)Ci(x) ∨ (¬)D(f(x)) W 3 (¬)Ci(x) 4 W(¬)C(x) ∨ ¬R(x, y) ∨ (¬)D(y) 5 (¬)C(a) 6 (¬)R(a, b)

We now show that clausification does not affect the semantics of a knowledge base, and that it produces clauses of a certain syntactic structure:

Lemma 4.3.4. Let KB be an ALC knowledge base. Then, the following claims hold:

• KB is satisfiable if and only if Ξ(KB) is satisfiable.

• Ξ(KB) can be computed in time polynomial in |KB|.

• Each clause in Ξ(KB) is of one of the types given in Table 4.1.

Proof. Equisatisfiability of KB and Ξ(KB) is an easy consequence of Lemma 4.3.2. Furthermore, the number of recursive invocations of Def and the number of new con- cepts Q are linear in the number of subconcepts of C. Hence, |Def(C)| is linear in |C|, so |Ξ(KB)| is polynomial in |KB|. Finally, observe that Def(C) can contain only F concepts of the form D or ¬Q t D, where D is of the form Li, d Li, ∀R.L, or ∃R.L, and all L(i) are literal concepts. Hence, for D = ∃R.L, Cls(D) contains clauses of type F 1 and 2; for D = ∀R.L a clause of type 4; and for D = Li or D = d Li clauses of type 3. Clauses of type 5 and 6 are obtained by clausifying ABox axioms.

4.4 Deciding Satisfiability of Ξ(KB) by R

Since R is a sound and complete calculus, we can use it to check satisfiability of Ξ(KB). To obtain a decision procedure, we just need to ensure that each saturation of Ξ(KB) by R terminates. Joyner outlined the basic principle to achieve this in [80]: we simply ensure that the number of clauses that can be derived in a saturation is bounded. There are two main reasons why deriving an infinite number of clauses from a set of clauses N might be possible. First, we might keep deriving clauses with deeper and deeper terms. Consider N1 = {C(a), ¬C(x) ∨ C(f(x))}. If we select ¬C(x) in the second clause and apply resolution, we derive C(f(a)), C(f(f(a))), and so forth. Second, we might keep deriving clauses containing more and more variables. Consider N2 = {¬R(x, y) ∨ ¬R(y, z) ∨ R(x, z),C(x) ∨ R(x, y) ∨ D(y),E(x) ∨ R(x, y) ∨ F (y)}. If we select ¬R(x, y) and ¬R(y, z) in the first clause and apply resolution twice, we 48 4. Reduction Algorithm at a Glance obtain C(x) ∨ D(y) ∨ E(y) ∨ F (z) ∨ R(x, z), which contains more variables than the side premises involved in the inference. It is easy to see that further inferences with this clause will additionally increase the number of variables. The resolution inferences performed on certain clauses are determined by the para- meters of the calculus—namely, the literal ordering and the selection function. Hence, by choosing these parameters appropriately, we can often restrict the resolution infer- ences in a way that allows us to establish a bound on the term depth and on the number of variables. Consider again the set of clauses N1: instead of selecting ¬C(x), if we ensure that C(f(x)) ¬C(x), then the second clause can participate in an inference only on literal C(f(x)). Furthermore, C(f(x)) and C(a) do not unify, so no inference of R is applicable to N1. Hence, N1 is saturated, and, since it does not contain the empty clause, it is satisfiable. In the following definition, we choose the parameters for R that achieve such an effect on clauses from Ξ(KB).

Definition 4.4.1. Let RDL denote the calculus R parameterized as follows:

• The literal ordering is an admissible ordering such that R(x, f(x)) ≻ ¬C(x) and D(f(x)) ≻ ¬C(x), for all function symbols f, and predicates R, C, and D.

• The selection function selects every negative binary literal in each clause.
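To make the two ordering constraints of Definition 4.4.1 concrete, the following sketch checks them under a lexicographic path ordering with the precedence discussed in the next paragraph. The term encoding and the helper names (lpo_gt, occurs) are ours, introduced only for illustration; negative literals are represented by treating '~' as a unary function symbol.

    # Illustrative sketch: a term is either a variable (a Python string) or a
    # pair (symbol, tuple-of-arguments); '~' plays the role of ¬.

    def occurs(var, term):
        """Check whether the variable 'var' occurs in 'term'."""
        if isinstance(term, str):
            return term == var
        _, args = term
        return any(occurs(var, a) for a in args)

    def lpo_gt(s, t, prec):
        """True if s is greater than t in the LPO induced by the precedence
        'prec' (a dict mapping symbols to ranks; larger rank = larger symbol)."""
        if isinstance(t, str):                    # t is a variable
            return not isinstance(s, str) and occurs(t, s)
        if isinstance(s, str):                    # s is a variable, t is not
            return False
        f, sargs = s
        g, targs = t
        # (1) some argument of s is >= t
        if any(si == t or lpo_gt(si, t, prec) for si in sargs):
            return True
        # (2) f > g in the precedence and s > every argument of t
        if prec[f] > prec[g] and all(lpo_gt(s, tj, prec) for tj in targs):
            return True
        # (3) f = g, s > every argument of t, arguments compared lexicographically
        if f == g and all(lpo_gt(s, tj, prec) for tj in targs):
            for si, ti in zip(sargs, targs):
                if si != ti:
                    return lpo_gt(si, ti, prec)
        return False

    # Precedence f > R, C, D > '~', as required below.
    prec = {'f': 4, 'R': 3, 'C': 3, 'D': 3, '~': 1}
    neg_C_x = ('~', (('C', ('x',)),))
    assert lpo_gt(('R', ('x', ('f', ('x',)))), neg_C_x, prec)   # R(x, f(x)) > ~C(x)
    assert lpo_gt(('D', (('f', ('x',)),)), neg_C_x, prec)       # D(f(x))    > ~C(x)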

An ordering satisfying Definition 4.4.1 can be obtained by instantiating an appropriate lexicographic path ordering. Technically speaking, negative literals are not terms, but for the purposes of comparing literals by an LPO, we consider ¬ to be a unary general function symbol. To make an LPO compatible with RDL, we choose any precedence > over predicate and general function symbols such that f > P > ¬, for each function symbol f and each predicate symbol P. Namely, LPOs have the subterm property, so ¬A ≻ A; furthermore, since ¬ is the smallest symbol in >, we cannot have ¬A ≻ B ≻ A. Hence, such an LPO is admissible for R. Moreover, by the definition of LPOs from Section 2.2, it is easy to check that D(f(x)) ≻ f(x) ≻ ¬C(x), and R(x, f(x)) ≻ f(x) ≻ ¬C(x).

To obtain a decision procedure for ALC, we next identify the type of clauses that can be derived in a saturation of Ξ(KB) by RDL. We generalize the clauses from Table 4.1 to ALC-clauses, presented in Table 4.2. For a term t, with P(t) we denote a possibly empty disjunction of the form (¬)P1(t) ∨ ... ∨ (¬)Pn(t). With P(f(x)) we denote a possibly empty disjunction of the form P1(f1(x)) ∨ ... ∨ Pm(fm(x)). Note that this definition allows each Pi(fi(x)) to contain positive and negative literals. We now prove the following central lemma:

Lemma 4.4.2. Each RDL inference, when applied to ALC-clauses, produces an ALC- clause.

Proof. The lemma can easily be proved by considering all possible RDL inferences on all types of ALC-clauses, which are summarized in Table 4.3. For the sake of brevity, we omit inferences in which participating literals are complemented.

Table 4.2: Types of ALC-Clauses

1   P(x) ∨ R(x, f(x))
2   P1(x) ∨ P2(f(x))
3   P1(x) ∨ ¬R(x, y) ∨ P2(y)
4   P(a)
5   (¬)R(a, b)

The notation (n + m = k) next to each inference specifies that the inference premises are of types n and m, and the conclusion is of type k. Observe that, due to the requirement on the literal ordering ≻, a literal of the form (¬)A(x) occurring in a clause C can participate in an inference only if C does not contain a literal of the form (¬)B(f(x)) or R(x, f(x)). Furthermore, a ground literal A(a) does not unify with a literal A(f(x)), and R(a, b) does not unify with R(x, f(x)). Hence, ground clauses can participate only in inferences with clauses not containing terms of the form f(x).

As shown by the following lemma, for a finite knowledge base KB, the number of ALC-clauses is finite. In fact, the bound on the number of derivable clauses can be used to estimate the complexity of the algorithm.

Lemma 4.4.3. For an ALC knowledge base KB, the longest ALC-clause over the signature of Ξ(KB) is polynomial in |KB|, and the number of such clauses different up to variable renaming is exponential in |KB|.

Proof. Let c be the number of unary predicates, f the number of unary function symbols, and i the number of constants in the signature of Ξ(KB). Then, c is linear in |KB|, since each concept introduced in Def(C) corresponds to one nonliteral subconcept of C. Similarly, f is linear in |KB|, since each function symbol is introduced by skolemizing one concept of the form ∃R.C. Finally, i is trivially linear in |KB|. Consider now the maximal ALC-clause Cl2 of type 2. Such a clause contains a possibly negated literal A(x) for each unary predicate A, and a possibly negated literal A(f(x)) for each pair of unary predicate and function symbols, yielding at most ℓ = 2c + 2cf literals, which is polynomial in |KB|. Each ALC-clause of type 2 is a subset of Cl2, so there are 2^ℓ such clauses; that is, the number of clauses is exponential in |KB|. For other ALC-clause types, the bounds on the length and on the number of clauses can be derived in an analogous way.

We now state the main result of this section:

Theorem 4.4.4. For an ALC knowledge base KB, saturating Ξ(KB) by RDL decides satisfiability of KB and runs in time exponential in |KB|.

Proof. By Lemma 4.4.3, the number of clauses derivable by RDL from Ξ(KB) is exponential in |KB|. Each inference can be performed in time polynomial in the size of

Table 4.3: Possible Inferences by RDL on ALC-Clauses

P1(x) ∨ P2(f(x)) ∨ ¬A(g(x))     A(x) ∨ P3(x)
--------------------------------------------------  (2+2=2)
P1(x) ∨ P2(f(x)) ∨ P3(g(x))

P1(x) ∨ ¬A(x)     A(x) ∨ P2(x)
--------------------------------------------------  (2+2=2)
P1(x) ∨ P2(x)

P1(x) ∨ P2(f(x)) ∨ ¬A(g(x))     A(g(x)) ∨ P3(h(x)) ∨ P4(x)
--------------------------------------------------  (2+2=2)
P1(x) ∨ P2(f(x)) ∨ P3(h(x)) ∨ P4(x)

P(x) ∨ R(x, f(x))     P1(x) ∨ ¬R(x, y) ∨ P2(y)
--------------------------------------------------  (1+3=2)
P(x) ∨ P1(x) ∨ P2(f(x))

P1(a) ∨ ¬A(b)     A(x) ∨ P2(x)
--------------------------------------------------  (4+2=4)
P1(a) ∨ P2(b)

P1(a) ∨ ¬A(b)     A(b) ∨ P2(c)
--------------------------------------------------  (4+4=4)
P1(a) ∨ P2(c)

R(a, b)     P1(x) ∨ ¬R(x, y) ∨ P2(y)
--------------------------------------------------  (5+3=4)
P1(a) ∨ P2(b)

R(a, b)     ¬R(a, b)
--------------------------------------------------  (5+5=2)
□ (the empty clause)

clauses. Subsumption checking is NP-complete in the size of clauses [52], so it can be performed in exponential time. Hence, the saturation terminates after performing at most an exponential number of steps. Since RDL is sound and complete, it decides satisfiability of Ξ(KB), and by Lemma 4.3.4 of KB as well, in time that is exponential in |KB|.

4.5 Translating ALC to Disjunctive Datalog

Given a decision procedure for checking satisfiability of an ALC knowledge base KB, it is now easy to obtain the desired reduction to disjunctive datalog. From Table 4.3, we see that (i) a ground clause cannot participate in an inference with a nonground clause containing a function symbol, and (ii) as soon as one premise in an inference by RDL is ground, the conclusion is ground as well. Hence, we can perform all inferences among nonground clauses first, after which we can simply delete all nonground clauses containing function symbols. The remaining clause set consists of clauses without function symbols, which can easily be translated into a disjunctive datalog program, by moving positive literals into rule heads and negative literals into rule bodies. A minor problem arises if the resulting rules contain unsafe variables—that is, variables occurring only in the head, but not in the body. We deal with such clauses by a simple trick: we introduce a new predicate HU and add an assertion HU(a) for each individual a; next, we append HU(x) to the body of each rule in which x is an unsafe variable. We now present this algorithm formally.

Definition 4.5.1. Let KB be an extensionally reduced ALC knowledge base. Then, Γ(KB T) is the set of clauses obtained by saturating Ξ(KB T) by RDL, and then deleting all clauses containing function symbols. Furthermore, the operator λ maps clauses to clauses as follows:

λ(C) = C ∪ {¬HU (x) | for each unsafe variable x in C }

For a set of clauses N, λ(N) is the set of clauses obtained by applying λ to each element of N. The function-free version of KB is defined as follows:

FF(KB) = λ(Γ(KB T )) ∪ Ξ(KB A) ∪ {HU (a) | for each individual a occurring in KB}

Finally, a disjunctive datalog program DD(KB) is the set of rules obtained by moving, in each clause from FF(KB), all positive literals into the rule head, and all negative literals into the rule body. If KB is not extensionally reduced, then DD(KB) = DD(KB′), where KB′ is an extensionally reduced knowledge base obtained from KB as explained in Section 3.1.

We now state the properties of DD(KB):

Theorem 4.5.2. Let KB be an ALC knowledge base. Then, the following claims hold:

1. KB is unsatisfiable if and only if DD(KB) is unsatisfiable.

2. KB |= α if and only if DD(KB) |=c α, where α is of the form A(a) or R(a, b), and A is an atomic concept.

3. KB |= C(a) for a nonliteral concept C if and only if, for Q a new atomic concept, DD(KB ∪ {C ⊑ Q}) |=c Q(a).

4. The number of literals in each rule in DD(KB) is at most polynomial, the number of rules in DD(KB) is at most exponential, and DD(KB) can be computed in time exponential in |KB|.

Proof. (Claim 1.) Table 4.3 shows that each inference with at least one ground premise always produces a ground conclusion. Hence, in saturating Ξ(KB) by RDL, it is possible to perform all inferences among nonground clauses first. Furthermore, Table 4.3 also shows that ground clauses can participate in inferences only with clauses not containing function symbols. Hence, after performing all inferences among nonground clauses of Ξ(KB), one can delete all clauses containing terms of the form f(x).

By Definition 4.3.3, Ξ(KB T) is exactly the set of nonground clauses of Ξ(KB), so Γ(KB T) is exactly the set of clauses obtained by saturating the nonground part of Ξ(KB) and deleting the clauses containing function symbols. Furthermore, it is easy to see that Γ = Γ(KB T) ∪ Ξ(KB A) is satisfiable if and only if FF(KB) is satisfiable. Namely, both Γ and FF(KB) are function-free sets of clauses, whose sets of ground instances differ only on clauses containing unsafe variables. However, for a clause C ∈ Γ containing an unsafe variable x and a ground substitution τ = {x 7→ a}, FF(KB) contains clauses C ∨ ¬HU(x) and HU(a), which together imply Cτ. Hence, the ground instances of FF(KB) imply all ground instances of Γ, so, if FF(KB) is satisfiable, Γ is satisfiable as well. Furthermore, since HU is a new predicate not occurring in Γ, each model of Γ can easily be extended to a model of FF(KB). Therefore, Γ and FF(KB) are equisatisfiable. Finally, DD(KB) is a positive datalog program whose rules are syntactic variants of the clauses from FF(KB). Hence, DD(KB) is satisfiable if and only if FF(KB) is satisfiable.

(Claim 2.) Simply observe that KB |= α if and only if KB ∪ {¬α} is unsatisfiable. The latter is the case if and only if DD(KB ∪ {← α}) = DD(KB) ∪ {← α} is unsatisfiable, which is the case if and only if DD(KB) |=c α.

(Claim 3.) Follows in the same manner as Claim 2.

(Claim 4.) Follows immediately from Lemma 4.4.3.
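The translation of Definition 4.5.1 is purely syntactic. The following minimal sketch (our own clause encoding and helper names, not KAON2's API) illustrates the λ operator and the HU facts on function-free clauses: a clause is a list of literals, a literal is a tuple (sign, predicate, args), and variables are strings starting with an upper-case letter.

    def vars_of(lit):
        """Variables occurring in a literal (sign, predicate, args)."""
        return {a for a in lit[2] if a[:1].isupper()}

    def clause_to_rule(clause):
        """Split a function-free clause into (head, body) and append HU(x)
        to the body for every unsafe variable x (Definition 4.5.1)."""
        head = [l for l in clause if l[0] == '+']
        body = [l[1:] for l in clause if l[0] == '-']          # drop the sign
        head_vars = set().union(*[vars_of(l) for l in head]) if head else set()
        body_vars = {a for (_, args) in body for a in args if a[:1].isupper()}
        for x in sorted(head_vars - body_vars):                # unsafe variables
            body.append(('HU', (x,)))
        return [l[1:] for l in head], body

    def hu_facts(individuals):
        """The facts HU(a) added for every individual a occurring in KB."""
        return [([('HU', (a,))], []) for a in individuals]

    # The positive unit clause C(X) becomes the rule C(X) <- HU(X):
    print(clause_to_rule([('+', 'C', ('X',))]))
    # ([('C', ('X',))], [('HU', ('X',))])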

4.6 Examples

We now present several rather simple examples that point out important properties of the reduction algorithm.

Readers familiar with more common approaches to DL reasoning might ask themselves how role successors are encoded in datalog programs. As we show next, role successors are not at all encoded in the datalog program. Let KB 1 = {C ⊑ ∃R.D}, so Ξ(KB 1) = {¬C(x) ∨ R(x, f(x)), ¬C(x) ∨ D(f(x))}. Now Ξ(KB 1) is already saturated by RDL, so after removing all clauses containing function symbols, we get DD(KB 1) = ∅. This may seem quite confusing: KB 1 implies the existence of at least one R-successor for each member of C, whereas DD(KB 1) does not reflect that. However, Theorem 4.5.2 is not invalidated. Namely, the individuals introduced by the existential quantifier in KB 1 are unnamed, so they cannot be used in queries. Hence, for an arbitrary extensionally reduced ABox KB A, KB 1 ∪ KB A does not imply any new facts of the form A(a) or R(a, b). Also, note that the models of DD(KB 1) are completely unrelated to the models of KB 1. Intuitively, this is so because the reduction algorithm is based on a proof-theoretic correspondence between refutations in Ξ(KB) and DD(KB). The models of KB and DD(KB) coincide only on positive ground facts, whereas, for unnamed individuals, the models are completely unrelated.

In order to be able to draw additional consequences from KB 1, we need to extend it with additional statements about C, D, or R. Let KB 2 = KB 1 ∪ {D ⊑ ⊥}, so

Ξ(KB 2) = Ξ(KB 1) ∪ {¬D(x)}. Saturation of Ξ(KB 2) produces one additional clause ¬C(x), so we get DD(KB 2) = {← C(x), ← D(x)}. This shows that the reduction is not modular: for arbitrary knowledge bases KB 1 and KB 2, DD(KB 1) ∪ DD(KB 2) is not necessarily equal to DD(KB 1 ∪ KB 2).

The key step in the reduction is the saturation of Ξ(KB T) by RDL. It computes all relevant nonground consequences of Ξ(KB T), which ensures that subsequent removal of clauses containing function symbols does not change the set of drawn ground consequences. One can think of RDL as producing shortcut clauses that derive facts about objects without explicitly expanding the successors of each object. This is demonstrated by the following knowledge base:

KB 3 = {A ⊑ ∃R.B, B ⊑ C, ∃R.C ⊑ D}                                (4.6)

Now Ξ(KB 3) consists of the following clauses:

¬A(x) ∨ R(x, f(x))                                                 (4.7)
¬A(x) ∨ B(f(x))                                                    (4.8)
¬B(x) ∨ C(x)                                                       (4.9)
D(x) ∨ ¬R(x, y) ∨ ¬C(y)                                            (4.10)

Assuming literals are ordered as D(x) ≻ C(x) ≻ B(x) ≻ A(x), by saturating Ξ(KB 3) we obtain the following clauses (the notation R(xx; yy) means that a clause is derived by resolving clauses xx and yy):

¬A(x) ∨ D(x) ∨ ¬C(f(x))        R(4.7;4.10)                        (4.11)
¬A(x) ∨ D(x) ∨ ¬B(f(x))        R(4.11;4.9)                        (4.12)
¬A(x) ∨ D(x)                   R(4.12;4.8)                        (4.13)

By eliminating clauses containing function symbols, we get DD(KB 3) as follows:

C(x) ← B(x)                                                        (4.14)
D(x) ← R(x, y), C(y)                                               (4.15)
D(x) ← A(x)                                                        (4.16)

It is instructive to consider the role of each rule in DD(KB 3). Whereas the axiom B ⊑ C in KB 3 is applicable to all individuals in a model, the rule (4.14) is applicable only to named individuals. The relationship between ∃R.C ⊑ D and (4.15) is analogous. However, to compensate for the fact that (4.14) and (4.15) derive consequences only about named individuals, DD(KB 3) contains the rule (4.16), which is produced by the saturation of Ξ(KB T) by RDL. This rule can be thought of as a shortcut: instead of introducing for each x in A an R-successor y in B, propagating y to C, and then concluding that x is in D, the rule (4.16) derives that all members of A are members of D in one step, and thus ensures that DD(KB 3) and KB 3 entail the same set of ground facts.
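To see the rules of DD(KB 3) in action, the following self-contained sketch evaluates them bottom-up to a fixpoint on a small sample ABox of our own choosing, {A(a), R(a, b), B(b)}; the data, the encoding, and the helper names are illustrative only. The shortcut rule (4.16) derives D(a) directly, without ever materializing the unnamed R-successor postulated by KB 3.

    from itertools import product

    # Rules as (head, body); atoms are (predicate, args); upper-case strings are variables.
    rules = [
        (('C', ('X',)), [('B', ('X',))]),                       # (4.14)
        (('D', ('X',)), [('R', ('X', 'Y')), ('C', ('Y',))]),    # (4.15)
        (('D', ('X',)), [('A', ('X',))]),                       # (4.16)
    ]
    facts = {('A', ('a',)), ('R', ('a', 'b')), ('B', ('b',))}

    def matches(body, facts):
        """Enumerate substitutions that make every body atom a known fact."""
        consts = {c for (_, args) in facts for c in args}
        variables = sorted({a for (_, args) in body for a in args if a[:1].isupper()})
        for values in product(sorted(consts), repeat=len(variables)):
            sub = dict(zip(variables, values))
            ground = {(p, tuple(sub.get(a, a) for a in args)) for (p, args) in body}
            if ground <= facts:
                yield sub

    changed = True
    while changed:                                   # naive fixpoint iteration
        changed = False
        for head, body in rules:
            for sub in matches(body, facts):
                fact = (head[0], tuple(sub.get(a, a) for a in head[1]))
                if fact not in facts:
                    facts.add(fact)
                    changed = True

    print(sorted(facts))   # now also contains ('C', ('b',)) and ('D', ('a',))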

Finally, we take KB 4 to be the knowledge base introduced in Section 4.1. Note that the concept ∃R.∃R.A contains a nonatomic subconcept ∃R.A, to which we apply the structural transformation, and replace it by a new atomic concept Q.¹ Hence, the set of clauses Ξ(KB 4) consists of the following clauses:

¬A(x) ∨ R(x, f(x))                                                 (4.17)
¬A(x) ∨ A(f(x))                                                    (4.18)
Q(x) ∨ ¬R(x, y) ∨ ¬A(y)                                            (4.19)
B(x) ∨ ¬R(x, y) ∨ ¬Q(y)                                            (4.20)

Assuming literals are ordered as Q(x) ≻ B(x) ≻ A(x), by saturating Ξ(KB 4) using RDL we obtain the following new clauses:

¬A(x) ∨ Q(x) ∨ ¬A(f(x))        R(4.17;4.19)                        (4.21)
¬A(x) ∨ Q(x)                   R(4.21;4.18)                        (4.22)
¬A(x) ∨ B(x) ∨ ¬Q(f(x))        R(4.17;4.20)                        (4.23)
¬A(x) ∨ B(x) ∨ ¬A(f(x))        R(4.23;4.22)                        (4.24)
¬A(x) ∨ B(x)                   R(4.24;4.18)                        (4.25)

After eliminating clauses containing function symbols, we obtain DD(KB) as the following set of rules:

Q(x) ← R(x, y), A(y)                                               (4.26)
B(x) ← R(x, y), Q(y)                                               (4.27)
Q(x) ← A(x)                                                        (4.28)
B(x) ← A(x)                                                        (4.29)

The intuitive meaning behind these rules is somewhat obscured because we introduced a new predicate Q. However, we are not interested in ground consequences related to Q, and, since Q is a new predicate, it cannot occur in any ABox one might consider in conjunction with KB 4. Hence, we can perform all possible resolution inferences with the Q predicate in advance (that is, we can apply rule unfolding to Q), which yields the following datalog program:

B(x) ← A(x)                                                        (4.30)
B(x) ← R(x, y), A(y)                                               (4.31)
B(x) ← R(x, y), R(y, z), A(z)                                      (4.32)

Now the intuitive meaning of these rules can be explained as follows. The rule (4.30) takes into account that each named individual that is in A has a chain of at least two unnamed R-successors. The rule (4.31) takes into account that a named individual may be explicitly linked through R to another individual in A that then has at least one unnamed R-successor. Finally, the rule (4.32) takes into account that a named individual may be explicitly linked through an R-chain of length two to a named individual that is in A. Only R-chains of length two are considered, because KB 4 contains the concept ∃R.∃R.A, which effectively checks for such R-chains.

¹The rules (4.2)–(4.4) do not contain Q because they are obtained by direct clausification, without applying the structural transformation.
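The unfolding of Q mentioned above is a purely syntactic operation on rules. The following sketch (rule encoding and helper names of our own) eliminates Q from the rules (4.26)–(4.29) and reproduces (4.30)–(4.32), up to variable names.

    import itertools

    def rename_apart(head, body, suffix):
        """Rename all variables of a rule by appending 'suffix' (variables start upper-case)."""
        def r(atom):
            pred, args = atom
            return (pred, tuple(a + suffix if a[:1].isupper() else a for a in args))
        return r(head), [r(a) for a in body]

    def unfold(rules, aux):
        """Eliminate the auxiliary unary predicate 'aux': every body occurrence of
        aux(t) is replaced by the (renamed-apart) body of each rule defining aux."""
        defs = [(h, b) for (h, b) in rules if h[0] == aux]
        keep = [(h, b) for (h, b) in rules if h[0] != aux]
        fresh = itertools.count(1)
        result = []
        for head, body in keep:
            expansions = [[]]
            for atom in body:
                if atom[0] != aux:
                    expansions = [e + [atom] for e in expansions]
                    continue
                replacements = []
                for qh, qb in defs:
                    qh, qb = rename_apart(qh, qb, str(next(fresh)))
                    sub = {qh[1][0]: atom[1][0]}           # map the head variable to t
                    replacements.append([(p, tuple(sub.get(a, a) for a in args))
                                         for (p, args) in qb])
                expansions = [e + rep for e in expansions for rep in replacements]
            result += [(head, e) for e in expansions]
        return result

    rules_kb4 = [
        (('Q', ('X',)), [('R', ('X', 'Y')), ('A', ('Y',))]),   # (4.26)
        (('B', ('X',)), [('R', ('X', 'Y')), ('Q', ('Y',))]),   # (4.27)
        (('Q', ('X',)), [('A', ('X',))]),                      # (4.28)
        (('B', ('X',)), [('A', ('X',))]),                      # (4.29)
    ]
    for rule in unfold(rules_kb4, 'Q'):
        print(rule)   # the rules (4.30)-(4.32), up to variable names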

4.7 Extending the Algorithms to SHIQ(D)

The algorithms for reducing a SHIQ(D) knowledge base to a disjunctive datalog program, presented in the subsequent chapters, are based on the same ideas as the algorithms for ALC. Next, we outline the issues that must be addressed to extend the basic algorithms:

• The most important difference is that SHIQ provides for number restrictions, which we translate into first-order logic with equality. This requires using basic superposition [14, 96] instead of ordered resolution. The decision procedure for SHIQ by basic superposition is derived along the same principles as for ALC; however, due to more complex clause types and a more complex calculus, establishing closure under inferences is significantly more involved.

• As we discuss in Section 5.4, we were able to derive a decision procedure by basic superposition only for a slightly weaker logic SHIQ−. Namely, even though basic superposition is a fairly optimized calculus, it does not restrict the inferences enough to obtain a decision procedure for full SHIQ. The problems are caused by an interaction of inverse properties, number restrictions, and role hierarchies. To solve these problems, we extend basic superposition with a new decomposition inference rule. We show that such a calculus is sound and complete, and that it decides SHIQ.

• The reduction into disjunctive datalog is also more involved. Namely, due to number restrictions, the ground clauses derivable by basic superposition can contain literals of the form A(f(a)). It is now not possible to simply throw away all clauses containing function symbols after saturation. Instead, we encode a ground functional term f(a) using a fresh constant af.

• To be able to handle datatypes, in Chapter 6 we first derive a general approach for reasoning with concrete domains in the framework of resolution. Then, we apply this approach to derive a decision procedure, and, consequently, the reduction algorithm for SHIQ(D).

• SHIQ provides for transitivity axioms, which, in their clausal form, raise a number of issues for resolution decision procedures [82]. Hence, in Section 5.2 we show how to eliminate transitivity axioms from knowledge bases without affecting satisfiability, using a transformation similar to the ones from [144] and [129].

• A slight technical complication with clausifying SHIQ axioms arises because, even though a concept D = ≤ n R.A is in negation-normal form, the concept A occurs in D under negative polarity. Hence, the structural transformation for SHIQ is modified to take this into account.

4.8 Discussion

In this section we discuss some aspects of our algorithms that may seem surprising at first glance.

4.8.1 Independence of the Reduction and the Query

Theorem 4.5.2 shows that the program DD(KB) is independent of the query, as long as the query is a positive atomic concept or a role.² Hence, DD(KB) can be computed once, and can be used to answer any number of atomic queries. On the contrary, if the query involves a nonatomic concept C (even if C is a negated atomic concept), then query answering can be reduced to entailment of positive ground facts, by introducing a new name Q, and by adding the axiom C ⊑ Q to the TBox. Obviously, DD(KB ∪ {C ⊑ Q}) may depend on the query concept C. Intuitively, the reduction algorithm effectively derives all nonground facts following from the terminological part of a knowledge base. Moreover, a complex concept C, even if used only in the query, introduces terminological knowledge. Therefore, the reduction is independent of atomic query concepts, but depends on nonatomic ones.

We point out that, if DD(KB) is interpreted under standard first-order semantics, then KB and DD(KB) also entail the same set of negative ground facts. However, disjunctive datalog programs are usually interpreted under minimal model semantics. As we discuss in the following subsection, minimal models do affect the entailment of negative ground facts; therefore, we restrict Theorem 4.5.2 to positive facts only.

4.8.2 Minimal vs. Arbitrary Models

Disjunctive datalog programs are usually interpreted under minimal model semantics: P |=c α means that α is true in all minimal models of a program P, where minimality is defined w.r.t. set inclusion. In this way, disjunctive datalog implements a kind of closed-world semantics. On the contrary, description logics assume the standard first-order semantics: KB |= α means that α is true in all models of KB. Thus, description logics implement open-world semantics. It may come as a surprise that a logic whose semantics is defined for arbitrary models can be embedded in a logic that considers only minimal models. The answer is found in two important observations: (i) our reduction produces only positive datalog programs (that is, programs without negation-as-failure), and (ii) the distinction between first-order and minimal-model semantics is not relevant for certain types of questions.

²Note that the algorithm for full SHIQ allows only simple roles in queries.

Assume that P is a positive datalog program and that α is a positive ground atom. Then, P |= α if and only if P |=c α. Namely, if α is true in each model of P, it is true in each minimal model of P as well, and vice versa. Therefore, for entailment of positive ground atoms, it is not important whether the semantics of P is defined w.r.t. minimal or w.r.t. general first-order models.

Assume now that α is a negative ground atom. Then, the difference between minimal model semantics and first-order semantics is relevant, as shown by the following example. For α = ¬A(b) and P = {A(a)}, it is clear that P ⊭ α. Namely, ¬A(b) is not explicitly derivable from the facts in P: M1 = {A(a), A(b)} is a first-order model of P, and α is false in M1. However, P has exactly one minimal model M2 = {A(a)}, and ¬A(b) is obviously true in M2, so P |=c α.

The choice of semantics also affects subsumption. For α = ∀x : [C(x) → D(x)] and P = {C(a), D(a)}, we have P ⊭ α: M1 = {C(a), D(a), C(b)} is a first-order model of P, in which α is false. On the contrary, the only minimal model of P is M2 = {C(a), D(a)}, and α is true in M2, so P |=c α. The distinction between minimal and general first-order models fundamentally changes the computational properties of concept subsumption: equivalence of general programs under minimal model semantics is undecidable [136] whereas, under first-order semantics, it is decidable and can be reduced to satisfiability checking using standard transformations.

To summarize, the difference between first-order and minimal model semantics is not relevant for answering positive queries in positive datalog programs; however, it is relevant for queries that involve negation or for concept subsumption. Theorem 4.5.2 reflects this: it does not say that minimal models of DD(KB) are somehow related to KB; it simply says that all positive ground facts entailed by KB are contained in each minimal model of DD(KB) and vice versa. It would therefore be incorrect to check whether KB |= ¬A(a) by checking if DD(KB) |=c ¬A(a). To solve this task using our algorithms, we must reduce it to entailment of positive ground facts: for NotA a new concept, KB |= ¬A(a) if and only if DD(KB ∪ {¬A ⊑ NotA}) |=c NotA(a). Similarly, it would be incorrect to check whether KB |= C ⊑ D by checking whether DD(KB) |=c ∀x : [C(x) → D(x)]. To solve the task using our algorithms, we again must reduce it to entailment of positive ground facts. If C and D are atomic concepts, KB |= C ⊑ D if and only if DD(KB) ∪ {C(ι)} |=c D(ι), where ι is a new individual not occurring in KB.
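The difference between the two semantics can be observed directly on the example P = {A(a)}. The following small experiment (the encoding is ours, for illustration only) enumerates all Herbrand interpretations over the base {A(a), A(b)} and contrasts first-order with minimal models:

    from itertools import chain, combinations

    base = [('A', 'a'), ('A', 'b')]
    program_facts = {('A', 'a')}

    def interpretations(base):
        """All subsets of the Herbrand base."""
        return [set(s) for s in chain.from_iterable(
            combinations(base, r) for r in range(len(base) + 1))]

    models = [i for i in interpretations(base) if program_facts <= i]
    minimal = [i for i in models if not any(j < i for j in models)]

    # P |= ~A(b) fails: some first-order model contains A(b) ...
    print(any(('A', 'b') in i for i in models))        # True
    # ... but P |=_c ~A(b) holds: no minimal model contains A(b).
    print(any(('A', 'b') in i for i in minimal))       # False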

4.8.3 Complexity

The complexity of cautious query answering in a nonground positive disjunctive datalog program P is co-NExpTime [42], due to the following reasons. Let c be the number of constants, v the maximal number of variables per rule, p the number of predicates, and a the maximal arity of a predicate in P; in general, all of these parameters are linear in |P|. Since P is assumed to be a positive program, query answering in P can be reduced to checking unsatisfiability of P in the usual way. To decide satisfiability of P, we compute the ground program P^G = ground(P) by replacing, in each rule, at most v variables with one of the c constants. This gives c^v combinations, so |P^G| is exponential in |P|. Furthermore, the number of ground atoms in P^G is also exponential in |P|, as it is bounded by p · c^a: each of the c constants can occur at each of the a positions in a ground atom in P^G. Now checking satisfiability of P^G can be performed by guessing an interpretation I for P^G, and then checking whether I is a model of P^G. The first step requires choosing a truth value for each ground atom of P^G; this can obviously be performed in time exponential in |P|. The second step can be performed by checking the truth value of each rule of P^G; since the length of the rules is linear in |P|, but the number of rules is exponential in |P|, this step can also be performed in time exponential in |P|. Since the algorithm is nondeterministic, these two steps together give NExpTime complexity of satisfiability checking, and co-NExpTime complexity of unsatisfiability checking and query answering.

Note that the results in [42] actually give co-NExpTime^NP as the complexity of query answering. This is because the authors consider a more general case of disjunctive datalog programs with negation-as-failure under stable model semantics. In such a case, it does not suffice to find an arbitrary model; one must additionally check if the model is minimal, which can be performed using an NP oracle. For positive datalog programs without negation-as-failure this is not needed: if we find a model, we know that a minimal model exists as well.

These results suggest that reducing DLs to disjunctive datalog might increase the complexity of reasoning. We show, however, that the previous calculation overestimates the complexity in the case of DD(KB). Let P^G_DL = ground(DD(KB)). Now the number of variables in a rule from DD(KB) is at most two, so |P^G_DL| is still exponential in |KB|. Furthermore, the arity of literals in DD(KB) is at most two, so the number of ground literals in P^G_DL is polynomial in |KB|. Hence, an interpretation I_DL for P^G_DL can be guessed in time polynomial in |KB|, and all such interpretations can be examined in time exponential in |KB|. Checking whether I_DL is a model of P^G_DL can be performed in time exponential in |KB|. These two steps combined give us the ExpTime complexity for satisfiability checking, and also for unsatisfiability checking and query answering.

To summarize, even though |DD(KB)| is exponential in |KB|, query answering can be performed in time exponential in |KB| because (i) the length of the rules in DD(KB) is polynomial in |KB|, and (ii) the arity of the literals is bounded.
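The grounding step underlying this argument is easy to make explicit. The following sketch (function and variable names are ours) shows the c^v blowup, and why the blowup stays manageable when, as in DD(KB), every rule has at most two variables:

    from itertools import product

    def ground_rule(rule_vars, constants):
        """All substitutions for one rule: |constants| ** len(rule_vars) of them."""
        return [dict(zip(rule_vars, vals))
                for vals in product(constants, repeat=len(rule_vars))]

    constants = ['a', 'b', 'c']
    # A DD(KB) rule has at most two variables, e.g. D(x) <- R(x, y), C(y) ...
    print(len(ground_rule(['x', 'y'], constants)))                  # 9  = 3**2
    # ... whereas a rule with v variables over c constants yields c**v instances.
    print(len(ground_rule(['x1', 'x2', 'x3', 'x4'], constants)))    # 81 = 3**4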

4.8.4 Descriptive vs. Minimal-Model Semantics

Our reduction preserves the so-called descriptive semantics. Namely, Nebel has shown that knowledge bases containing terminological cycles are not definitorial [95]: for a fixed partial interpretation of atomic concepts, several interpretations of nonatomic concepts may exist. In such a case, it might be reasonable to designate a particular interpretation as the intended one, with least and greatest fixpoint models being the obvious candidates. However, Nebel argues that it is not clear which interpretation best matches the intuition, as choosing either of the fixpoint models has its drawbacks.

Consequently, most description logic systems implement the descriptive semantics, which coincides with that of Definition 3.1.2. By Theorem 4.5.2, our decision procedure implements exactly the descriptive semantics. Namely, DD(KB) entails exactly those ground facts that are derivable using the resolution decision procedure, and the latter implements the descriptive semantics.

4.8.5 Unique Name Assumption

The semantics of ALC and SHIQ(D) does not require different symbols to be interpreted as different objects, whereas the standard semantics of disjunctive datalog does require it. Therefore, it may seem surprising that a logic without unique name assumption (UNA) can be embedded into a logic that strictly requires it.

To understand why our algorithms are correct, note that UNA can be used to derive new consequences from a theory only if the theory employs equality. For example, ALC provides neither number restrictions, nor explicit individual equality statements, so it actually does not require equality reasoning. Given a knowledge base KB, to enforce UNA we can append an axiom ai ≉ aj for each ai ≠ aj. However, since KB does not contain the equality predicate, these inequality axioms do not participate in any inference with ALC-clauses, so they are not needed in the first place. A model-theoretic explanation is given by the following claim, which can easily be proved for logics without any form of equality:³ for each model I, where distinct constants are interpreted by the same objects, there is a model I′ where all constants are interpreted by distinct objects. To summarize, in the case of a logic such as ALC, we simply do not care whether either KB or DD(KB) employs UNA, as this does not change the entailed set of facts.

The situation changes slightly for logics such as SHIQ, which require a form of equality reasoning. Consider KB = {⊤ ⊑ ≤ 1 R, R(a, b), R(a, c)}. Without UNA, KB is satisfiable, and KB |= b ≈ c. Namely, the first axiom requires R to be functional, so b and c must be interpreted as the same object. On the contrary, with UNA, KB is unsatisfiable. It is easy to see that DD(KB) consists of the rules (4.33)–(4.35). Note that DD(KB) is now a disjunctive datalog program with equality, which is quite different from the type of equality usually considered in disjunctive datalog.

y1 ≈ y2 ← R(x, y1), R(x, y2)                                       (4.33)
R(a, b)                                                            (4.34)
R(a, c)                                                            (4.35)

In [42], the authors consider disjunctive datalog in which equality atoms can occur in rule bodies under negation-as-failure. Such programs are interpreted under UNA, so an inequality atom ¬(x ≈ y) actually checks whether x and y are bound to syntactically different individuals.

³Note that number restrictions typically require equality.

In our case, however, we allow equality to occur not only in the rule bodies, but also in the rule heads. This means we can actually derive two individuals to be equal. Such inferences are not directly supported in disjunctive datalog; however, they can be simulated using the well-known encoding from [46]. The main idea is to treat ≈ as an ordinary predicate, for which the properties of equality are explicitly axiomatized. Hence, for a disjunctive datalog program P with equality, we compute the program P≈ that states that ≈ is reflexive, symmetric, and transitive, and contains the following rule for each distinct predicate R from P and a position i in R:

(4.36) R(x1, . . . , yi, . . . , xn) ← R(x1, . . . , xi, . . . , xn), xi ≈ yi

The program P ∪ P≈ is now a disjunctive datalog program in which ≈ is just another predicate, so we are free to interpret P ∪ P≈ with or without UNA, just as we discussed previously. Furthermore, in each Herbrand model I≈ of P ∪ P≈, the rules in P≈ ensure that ≈ is interpreted as a congruence relation, so we can always transform I≈ into an interpretation I by replacing each individual a with the equivalence class of ≈ to which a belongs. The resulting interpretation I is a model of P [46]. Observe that the encoding sketched in the previous paragraph is polynomial in the size of P, so it does not increase the worst-case complexity of reasoning. However, it is known that the replacement rules do introduce problems in practice. Therefore, in Section 7.5 we present a slight optimization of this technique, which exploits the ideas of basic superposition to reduce the number of required replacement rules.

To summarize, DD(KB) can freely be interpreted under UNA, as long as we keep the following issues in mind. If DD(KB) does not contain equality, we do not care about UNA, as we cannot tell the difference. Otherwise, DD(KB) can be interpreted under UNA if we treat ≈ as a normal predicate whose properties are explicitly axiomatized. Finally, note that DD(KB) does not contain negation-as-failure. Nonmonotonic reasoning without UNA is an open problem; however, it is not relevant in our case.
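The axiomatization P≈ is generated mechanically from the program signature. The following sketch (helper names are ours; reusing the HU predicate to keep reflexivity safe is our choice, not prescribed by the text) emits the reflexivity, symmetry, and transitivity rules together with one replacement rule (4.36) per predicate position:

    def equality_axioms(signature):
        """signature: dict {predicate: arity}; returns the rules of P_eq as strings."""
        rules = [
            'eq(X, X) :- HU(X).',                  # reflexivity over the named universe
            'eq(Y, X) :- eq(X, Y).',               # symmetry
            'eq(X, Z) :- eq(X, Y), eq(Y, Z).',     # transitivity
        ]
        for pred, arity in sorted(signature.items()):
            for i in range(arity):                 # one replacement rule per position i
                args = ['X%d' % j for j in range(arity)]
                head = list(args)
                head[i] = 'Y'
                rules.append('%s(%s) :- %s(%s), eq(%s, Y).'
                             % (pred, ', '.join(head), pred, ', '.join(args), args[i]))
        return rules

    for r in equality_axioms({'R': 2, 'A': 1}):
        print(r)
    # e.g.  R(Y, X1) :- R(X0, X1), eq(X0, Y).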

4.8.6 The Size of DD(KB)

By Theorem 4.5.2, the number of rules in DD(KB) can be exponential in |KB|, which may seem discouraging in practice. We address this issue in two ways. On the theoretical side, in Chapter 8 we observe that this blowup is only in the size of the TBox of KB. Hence, a sufficiently large ABox may actually dominate the size of the rules. This observation leads to novel data complexity results: in general, checking satisfiability of KB is NP-complete in |KB A|; furthermore, we identify a new fragment of SHIQ(D) for which satisfiability checking is even polynomial in |KB A|. On the practical side, in Section 7.3 we present an important optimization that allows us to remove many rules from DD(KB). Intuitively, the saturation of Ξ(KB) introduces clauses that are entailed by other clauses. Such clauses are needed to derive all relevant clauses without function symbols. However, after saturation, the rules in DD(KB) that are entailed by other rules in DD(KB) can freely be removed. Our experiments show that this drastically reduces the size of DD(KB).

4.8.7 The Benefits of Reducing DLs to Disjunctive Datalog

We see two main benefits of our approach. On the theoretical side, our reduction makes it easier to derive novel data complexity results that we present in Chapter 8. Namely, the distinction between combined and data complexity has traditionally been studied in the context of (deductive) databases. Furthermore, we believe that our work sheds new light on the relationship between description logics and deductive databases, pointing out similarities and differences between the two knowledge representation formalisms. For example, as we extend description logics with rules in Chapter 9, we show that the rules can simply be appended to the result of the reduction; a result that is conceptually very easy to grasp.

On the practical side, the reduction to disjunctive datalog allows applying the magic sets transformation [18, 55, 34] and other optimization techniques, which are independent of the query answering algorithm, and can be reused as-is. Furthermore, in Section 7.6 we present an alternative technique for answering queries in datalog programs. If the program is not disjunctive, this technique falls back to the standard fixpoint saturation, which has been efficiently implemented in practice. Our algorithms thus follow the principle of graceful degradation: one pays a performance penalty only for features that one actually uses.

Chapter 5

Deciding SHIQ by Basic Superposition

In this chapter we present a resolution-based decision procedure for SHIQ, which we then use in Chapter 7 to derive an algorithm for reducing a SHIQ knowledge base to a disjunctive datalog program.

Decision procedures based on clausal calculi have been derived for numerous fragments of first-order logic (an overview is given in Section 5.6). However, SHIQ cannot be directly embedded into any of these fragments. An exception is the two-variable fragment with counting quantifiers; however, our attempts to derive the reduction to disjunctive datalog using the decision procedure from [110] were unsuccessful. Hence, we designed a new, worst-case optimal decision procedure for SHIQ, which itself has several novel aspects.

It is well known that the combination of inverse roles and counting quantifiers is difficult to handle algorithmically. On the model-theoretic side, such a combination of constructs easily causes a logic to lose the finite-model property; on the proof-theoretic side, it makes the tableau decision procedures more complex, by requiring sophisticated pair-wise blocking techniques to ensure termination [73]. This combination of constructs makes a resolution-based decision procedure more complex as well: contrary to most existing decision procedures for related logics (such as the ones presented in [47, 129]), to decide SHIQ it is necessary to consider clauses containing terms of depth two. To block certain inferences on such clauses, a calculus for equality stronger than superposition [9] is needed, so we use basic superposition [14, 96].

5.1 Decision Procedure Overview

The fundamental principles for deciding a first-order fragment L by resolution are known from [80]. First, one selects a sound and complete resolution calculus C. Second, one identifies a set of clauses N_L such that N_L is finite for a finite signature, and such that clausifying each formula ϕ ∈ L produces only clauses from N_L. Third, one demonstrates that N_L is closed under C; that is, one shows that applying an inference of C to clauses from N_L produces a clause in N_L. This is sufficient to obtain a refutation decision procedure for L, since, in the worst case, C will derive all clauses of N_L.

A minor problem in deciding satisfiability of a SHIQ knowledge base KB is caused by transitivity axioms, as they produce clauses without covering literals—literals containing all variables of a clause [76]. As shown in [82], termination of resolution on such clauses is very difficult to achieve. To address this, we show in Section 5.2 how to eliminate transitivity axioms by polynomially encoding a SHIQ knowledge base KB into an equisatisfiable ALCHIQ knowledge base Ω(KB). After this initial transformation step, we consider only ALCHIQ knowledge bases.

A significantly more complex problem is that basic superposition alone decides only the ALCHIQ− fragment of ALCHIQ, in which number restrictions are allowed only on roles not having subroles. Namely, the combination of role hierarchies, inverse roles, and number restrictions may cause a saturation by basic superposition to produce terms of ever increasing depth, thus preventing termination. We address this problem in two stages.

In Section 5.3 we present a procedure for deciding satisfiability of an ALCHIQ− knowledge base KB. We start by preprocessing KB into a set of closures Ξ(KB) as explained in Subsection 5.3.1. It is not difficult to see that Ξ(KB) contains closures only of a syntactic form as in Table 5.1. We then saturate Ξ(KB) under BSDL with eager application of redundancy elimination rules, where BSDL is the BS calculus parameterized according to Definition 5.3.3. We denote the saturated set of closures by Sat(Ξ(KB)). Since BSDL is sound and complete [14], Sat(Ξ(KB)) contains the empty closure if and only if Ξ(KB) is unsatisfiable. To obtain a decision procedure, we show that saturation always terminates. This is done in a proof-theoretic way along the lines outlined at the beginning of this section:

• We generalize the types of closures from Table 5.1 to ALCHIQ−-closures, which are presented in Table 5.2. In Lemma 5.3.4 we show that each closure occurring in Ξ(KB) is an ALCHIQ−-closure.

• In Lemma 5.3.6 we show that, in any BSDL-derivation starting from Ξ(KB), each inference produces either an ALCHIQ−-closure, or a closure that is redundant (and can therefore be deleted).

• In Lemma 5.3.9 we show that, for a finite knowledge base, the set of possible ALCHIQ−-closures occurring in any BSDL-derivation is finite.

• Termination is a simple consequence of these two lemmata: in the worst case, all possible ALCHIQ−-closures are derived in a saturation, after which all further inferences are redundant. The bound on the number of ALCHIQ−-closures yields the complexity of the algorithm, as demonstrated in Theorem 5.3.10.

To handle ALCHIQ, in Subsection 5.4.1 we extend the basic superposition calculus with a new decomposition inference rule that transforms certain closures with undesirable terms into several simpler closures. We show that decomposition is sound and complete. Furthermore, in Subsection 5.4.2 we show that it guarantees the termination of basic superposition for ALCHIQ. It turns out that decomposition is quite a general and versatile rule: in Subsection 5.4.3 we use it to decide a slightly stronger logic ALCHIQb that allows certain safe role expressions.

5.2 Eliminating Transitivity Axioms

In this section, we show how to eliminate transitivity axioms from a SHIQ knowledge base KB by transforming it polynomially into an equisatisfiable ALCHIQ knowledge base Ω(KB). Since Ω(KB) is satisfiable if and only if KB is, in the remaining sections we restrict our attention to ALCHIQ knowledge bases without loss of generality. For a SHIQ− knowledge base KB, Ω(KB) is an ALCHIQ− knowledge base, so we do not consider very simple roles explicitly in the definition of Ω. The transformation presented here is similar to the algorithm for transforming SHIQ concepts to ALCIQb concepts presented in [144] (ALCIQb does not provide for a role hierarchy, but allows certain types of Boolean operations on roles). A similar transformation for encoding formulae of the modal logic K4 (that is, the propositional modal logic with a transitive modality) into formulae of the modal logic K (that is, the propositional modal logic with an unrestricted modality) was presented in [129].

Definition 5.2.1. For a SHIQ knowledge base KB, let clos(KB) denote the concept closure of KB, defined as the smallest set of concepts satisfying the following conditions:

• If C ⊑ D ∈ KB T, then NNF(¬C ⊔ D) ∈ clos(KB);

• If C ≡ D ∈ KB T, then NNF(¬C ⊔ D) ∈ clos(KB) and NNF(¬D ⊔ C) ∈ clos(KB);

• If C(a) ∈ KB A, then NNF(C) ∈ clos(KB);

• If C ∈ clos(KB) and D is a subconcept of C, then D ∈ clos(KB);

• If ≤ n R.C ∈ clos(KB), then NNF(¬C) ∈ clos(KB);

• If ∀R.C ∈ clos(KB), S ⊑* R, and Trans(S) ∈ KB R, then ∀S.C ∈ clos(KB).

Note that all concepts in clos(KB) are in NNF. We now define the operator Ω:

Definition 5.2.2. For a SHIQ knowledge base KB, Ω(KB) is an ALCHIQ knowledge base constructed as follows:

• Ω(KB)R is obtained from KB R by removing all axioms Trans(R);

• Ω(KB)T is obtained by adding to KB T the axiom ∀R.C ⊑ ∀S.(∀S.C), for each concept ∀R.C ∈ clos(KB) and role S such that S ⊑* R and Trans(S) ∈ KB R;

• Ω(KB)A = KB A.
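The TBox part of this definition amounts to a simple enumeration. The following sketch (the input encoding and helper names are ours) assumes that clos(KB), the reflexive-transitive subrole relation ⊑*, and the set of transitive roles have already been computed, and emits the axioms ∀R.C ⊑ ∀S.(∀S.C) as pairs (left-hand side, right-hand side):

    def omega_tbox(universal_concepts, subrole_star, transitive):
        """universal_concepts: list of pairs (R, C) for each concept forall R.C in clos(KB);
        subrole_star: set of pairs (S, R) with S sub* R; transitive: set of role names."""
        axioms = []
        for (R, C) in universal_concepts:
            for (S, R2) in subrole_star:
                if R2 == R and S in transitive:
                    axioms.append(('forall %s.%s' % (R, C),
                                   'forall %s.(forall %s.%s)' % (S, S, C)))
        return axioms

    # KB = { C subsumed-by forall R.D,  S sub R,  Trans(S) }; clos(KB) then also
    # contains forall S.D, which yields the second axiom.
    print(omega_tbox([('R', 'D'), ('S', 'D')],
                     {('R', 'R'), ('S', 'S'), ('S', 'R')}, {'S'}))
    # [('forall R.D', 'forall S.(forall S.D)'), ('forall S.D', 'forall S.(forall S.D)')]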

The encoding is polynomial in |KB|: the number of concepts in clos(KB) generated by a concept C is bounded by 2 · |C| · |NR|, and, for each concept from clos(KB), we generate at most |NR| axioms in Ω(KB)T. Next, we show that it does not affect satisfiability.

Theorem 5.2.3. KB is satisfiable if and only if Ω(KB) is satisfiable.

Proof. (⇒) Assume that I is a model of KB, but not of Ω(KB); that is, Ω(KB) contains an axiom that is not satisfied in I. Since Ω(KB)R ⊆ KB R and KB T ⊆ Ω(KB)T, such an axiom must have been added in the second point of Definition 5.2.2. Hence, there is a domain element α such that α ∈ (∀R.C)^I, but α ∉ (∀S.(∀S.C))^I. There are two possibilities:

• There is no domain element β for which (α, β) ∈ S^I. Then α ∈ (∀S.X)^I, regardless of X. Hence, α ∈ (∀S.(∀S.C))^I holds, which is a contradiction.

• There is a domain element β such that (α, β) ∈ S^I. There are two possibilities:

  – If no domain element γ exists such that (β, γ) ∈ S^I, then β ∈ (∀S.C)^I.

  – If there is γ such that (β, γ) ∈ S^I, by transitivity of S we have (α, γ) ∈ S^I. Since S^I ⊆ R^I and α ∈ (∀R.C)^I, we have γ ∈ C^I. Since this holds for any γ, we have β ∈ (∀S.C)^I.

  Either way, for any β, we have β ∈ (∀S.C)^I, so α ∈ (∀S.(∀S.C))^I, which is a contradiction.

(⇐) As explained in Section 3.1, without loss of generality we can consider only knowledge bases with acyclic RBoxes. Let I be a model of Ω(KB), and I′ an interpretation constructed from I as follows:

• Δ^I′ = Δ^I;

• For each individual a, a^I′ = a^I;

• For each atomic concept A ∈ clos(KB), A^I′ = A^I;

• If Trans(R) ∈ KB R, then R^I′ = (R^I)^+;

• If Trans(R) ∉ KB R, then R^I′ = R^I ∪ ⋃_{S ⊑* R, S ≠ R} S^I′.

Since we assume that KB R is acyclic, the induction is well-founded. By construction, I′ satisfies all transitivity axioms in KB R. Furthermore, I′ satisfies each inclusion axiom in KB R: if R is not transitive, this is obvious from the construction; otherwise, this follows because A^+ ∪ B^+ ⊆ (A ∪ B)^+ for all binary relations A and B. For each role R, we have R^I ⊆ R^I′; furthermore, if R is simple, then R^I′ = R^I.

For concepts C and D from clos(KB), let C ◁ D if and only if C or NNF(¬C) occurs in D. We now show by induction on ◁ that, for each D ∈ clos(KB), D^I ⊆ D^I′. The relation ◁ is obviously acyclic, so the induction is well-founded. For the base case, where D is an atomic concept A or a negation of an atomic concept ¬A, the claim follows immediately from the definition of I′. For the induction step, we examine possible forms that D might have.

• For D = C1 ⊓ C2, assume that α ∈ (C1 ⊓ C2)^I for some α. Then, α ∈ C1^I and α ∈ C2^I. By the induction hypothesis, α ∈ C1^I′ and α ∈ C2^I′, so α ∈ (C1 ⊓ C2)^I′.

• For D = C1 ⊔ C2, assume that α ∈ (C1 ⊔ C2)^I for some α. If α ∈ C1^I, by the induction hypothesis, we have α ∈ C1^I′; if α ∈ C2^I, by the induction hypothesis, we have α ∈ C2^I′. Either way, α ∈ (C1 ⊔ C2)^I′.

• For D = ∃R.C, assume that α ∈ (∃R.C)^I for some α. Then, β exists such that (α, β) ∈ R^I and β ∈ C^I, so, by the induction hypothesis, β ∈ C^I′. Since R^I ⊆ R^I′, we have (α, β) ∈ R^I′, so α ∈ (∃R.C)^I′.

• For D = ∀R.C, assume that α ∈ (∀R.C)^I for some α. If there is no object β such that (α, β) ∈ R^I′, then α ∈ (∀R.C)^I′. Otherwise, assume that such β exists. There are two possibilities:

  – (α, β) ∈ R^I. Then, β ∈ C^I, so, by the induction hypothesis, β ∈ C^I′.

  – (α, β) ∉ R^I. Then, a role S ⊑* R with Trans(S) ∈ KB R exists, as well as a path (α, γ1) ∈ S^I, (γ1, γ2) ∈ S^I, ..., (γn, β) ∈ S^I with n ≥ 0. But then, ∀R.C ⊑ ∀S.(∀S.C) ∈ Ω(KB)T, so α ∈ (∀S.(∀S.C))^I and γ1 ∈ (∀S.C)^I. Furthermore, ∀S.C ⊑ ∀S.(∀S.C) ∈ Ω(KB)T, so γi ∈ (∀S.C)^I for 0 ≤ i ≤ n. For i = n, we get that β ∈ C^I, so, by the induction hypothesis, β ∈ C^I′.

  For any β, in both cases we have that β ∈ C^I′, so we conclude that α ∈ (∀R.C)^I′.

• For D = ≥ n R.C, assume that α ∈ (≥ n R.C)^I. Then, there are at least n distinct domain elements βi such that (α, βi) ∈ R^I and βi ∈ C^I. By the induction hypothesis, we have βi ∈ C^I′, and, because R^I ⊆ R^I′, we have α ∈ (≥ n R.C)^I′.

• For D = ≤ n R.C, because R is simple, we have R^I = R^I′. Let E = NNF(¬C). Assume now that α ∈ (≤ n R.C)^I, but α ∉ (≤ n R.C)^I′. Then, β exists such that (α, β) ∈ R^I, β ∉ C^I, β ∈ C^I′, β ∈ E^I, and β ∉ E^I′. However, since E ∈ clos(KB), by the induction hypothesis we have β ∈ E^I′, which is a contradiction. Hence, α ∈ (≤ n R.C)^I′.

It is now obvious that each ABox axiom of the form C(a) from KB is satisfied in I′: since I is a model of Ω(KB), we have a^I ∈ C^I, but since C^I ⊆ C^I′, we have a^I′ ∈ C^I′ as well. Also, any ABox axiom of the form R(a, b) from KB is satisfied in I′: since I is a model of Ω(KB), we have (a^I, b^I) ∈ R^I, but since R^I ⊆ R^I′, we have (a^I′, b^I′) ∈ R^I′. For an ABox axiom of the form ¬S(a, b), S must be a simple role, so S^I = S^I′, and (a^I, b^I) ∉ S^I implies (a^I′, b^I′) ∉ S^I′. Finally, any TBox axiom of the form C ⊑ D from

KB is satisfied in I′: since I is a model of Ω(KB), we have that Δ^I ⊆ (¬C ⊔ D)^I, but since (¬C ⊔ D)^I ⊆ (¬C ⊔ D)^I′, we have Δ^I′ ⊆ (¬C ⊔ D)^I′. Similar arguments hold for any TBox axiom of the form C ≡ D.

Note that, for α of the form (¬)A(a) with A an atomic concept, or of the form (¬)S(a, b) with S a simple role, Ω(KB ∪ {α}) = Ω(KB) ∪ {α}, so KB |= α if and only if Ω(KB) |= α. Hence, transitivity axioms can be eliminated once, and the resulting knowledge base can be used for query answering. Unfortunately, the models of KB and Ω(KB) may differ in the interpretation of complex roles. Therefore, Ω(KB) cannot be used to answer ground queries where S is a complex role. This is why, in Definition 3.1.1, we restrict negative role membership axioms to simple roles only.

5.3 Deciding ALCHIQ−

In this section, we present an algorithm for deciding satisfiability of an ALCHIQ− knowledge base KB by basic superposition.

5.3.1 Preprocessing

To decide satisfiability of KB, we must transform it into a clausal form. As discussed in Section 4.3, a straightforward transformation of π(KB) into conjunctive normal form might exponentially increase the formula size and could destroy the structure of the formula. Therefore, before clausification, we apply to KB the structural transformation [108, 99, 8], also known as renaming.

Definition 5.3.1. Let C be a concept and Λ a function assigning to C the set of positions p ≠ ε such that C|p is not a literal concept and, for all positions q below p, C|q is a literal concept. Then, Def(C) is defined recursively as follows:

• If Λ(C) = ∅, then Def(C) = {C};

• If Λ(C) ≠ ∅, then choose p ∈ Λ(C) and let Def(C) be as follows, where Q is a new globally unique atomic concept:

Def(C) = {¬Q ⊔ C|p} ∪ Def(C[Q]p)    if pol(C, p) = 1
Def(C) = {Q ⊔ ¬C|p} ∪ Def(C[Q]p)    if pol(C, p) = −1

The operator Cls for clausifying a first-order formula, defined in Section 2.1, is extended to concepts as follows:

Cls(C) = ⋃_{D ∈ Def(C)} Cls(∀x : πy(D, x))

For an extensionally reduced ALCHIQ knowledge base KB, Ξ(KB) is the smallest set of closures satisfying the following conditions:

• For each abstract role name R ∈ NRa, Cls(π(R)) ⊆ Ξ(KB);

• For each RBox or ABox axiom α in KB, Cls(π(α)) ⊆ Ξ(KB);

• For each TBox axiom C ⊑ D in KB, Cls(¬C ⊔ D) ⊆ Ξ(KB);

• For each TBox axiom C ≡ D in KB, Cls(¬C ⊔ D) ⊆ Ξ(KB) and Cls(¬D ⊔ C) ⊆ Ξ(KB).

If KB is not extensionally reduced, then Ξ(KB) = Ξ(KB′), where KB′ is an extensionally reduced knowledge base obtained from KB as explained in Section 3.1. The following lemma summarizes the properties of Ξ(KB):

Lemma 5.3.2. For KB an ALCHIQ− knowledge base, the following claims hold:

1. KB is satisfiable if and only if Ξ(KB) is satisfiable.

2. Ξ(KB) can be computed in time polynomial in |KB| for unary coding of numbers in input.

3. Each closure from Ξ(KB) is of one of the types given in Table 5.1.

Proof. (Claim 1.) It is easy to see that ψC = ⋀_{D ∈ Def(C)} ∀x : πy(D, x) is actually the definitional normal form of ϕ = ∀x : πy(C, x) with respect to the set of positions of all nonatomic subformulae of ϕ. By the definitions of π and Cls, and because transformation into definitional normal form does not affect satisfiability [108, 99, 8], KB and Ξ(KB) are equisatisfiable.

(Claim 2.) The inductive step of Def(C) is applied at most once for each subconcept of C, so the number of new concepts Q introduced by Def is linear in |C|, and Def(C) can be computed in polynomial time. For each D ∈ Def(C), the number of function symbols f introduced by skolemizing ∀x : πy(D, x) is bounded by the maximal number occurring in a number restriction in D. For unary coding of numbers, f is linear in |D|, so Cls(D) can be computed in time polynomial in |D|.

(Claim 3.) Observe that Def(C) contains only concepts of the form D ⊔ ⨆ Li, where D is a SHIQ concept containing only literal subconcepts and Li are literal concepts. By the definition of π from Table 3.1, closures of type 1 axiomatize inverse properties; closures of type 2 correspond to role inclusion axioms; closures of types 3 and 4 are obtained for D = ∃R.C; closures of types 3, 4, and 5 are obtained for D = ≥ n R.C; closures of type 6 are obtained for D = ⨆ Li or D = ⨅ Li; closures of type 7 are obtained for D = ∀R.C or D = ≤ n R.C; finally, closures of types 8–11 correspond to ABox axioms.
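For intuition, the following much-simplified sketch (our own concept encoding; restricted to ALC concepts in NNF and to positive polarity, whereas Definition 5.3.1 handles full SHIQ with polarities) performs the renaming step on the concept ∃R.∃R.A, reproducing the auxiliary name Q introduced for KB 4 in Section 4.6:

    import itertools
    counter = itertools.count(1)

    def is_literal(c):
        """A literal concept: an atomic name or a negated atomic name."""
        return isinstance(c, str) or (c[0] == 'not' and isinstance(c[1], str))

    def rename(c, definitions):
        """Replace non-literal fillers of quantifiers by fresh names Q_i and
        record the definitions (not Q_i) or D, i.e. Q_i implies D."""
        if isinstance(c, str) or c[0] == 'not':            # NNF assumed
            return c
        if c[0] in ('and', 'or'):
            return (c[0],) + tuple(rename(d, definitions) for d in c[1:])
        if c[0] in ('exists', 'forall'):
            quant, role, filler = c
            if is_literal(filler):
                return (quant, role, filler)
            q = 'Q%d' % next(counter)
            definitions.append(('or', ('not', q), rename(filler, definitions)))
            return (quant, role, q)

    defs = []
    top = rename(('exists', 'R', ('exists', 'R', 'A')), defs)   # exists R.(exists R.A)
    print(top)    # ('exists', 'R', 'Q1')
    print(defs)   # [('or', ('not', 'Q1'), ('exists', 'R', 'A'))]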

Using binary coding, a number n can be represented using ⌈log₂ n⌉ bits, so the number of function symbols introduced by skolemization is exponential in the size of the input. Hence, for binary coding of numbers, translation into first-order logic incurs an exponential blowup.

Table 5.1: Closure Types after Preprocessing

1   ¬R(x, y) ∨ Inv(R)(y, x)
2   ¬R(x, y) ∨ S(x, y)
3   ⋁(¬)Ci(x) ∨ R(x, f(x))
4   ⋁(¬)Ci(x) ∨ (¬)D(f(x))
5   ⋁(¬)Ci(x) ∨ fi(x) ≉ fj(x)
6   ⋁(¬)Ci(x)
7   ⋁(¬)Ci(x) ∨ ⋁_{i=1}^{n} ¬R(x, yi) ∨ ⋁_{i=1}^{n} D(yi) ∨ ⋁_{i=1}^{n} ⋁_{j=i+1}^{n} yi ≈ yj
8   (¬)C(a)
9   (¬)R(a, b)
10  a ≈ b
11  a ≉ b

5.3.2 Parameters for Basic Superposition

To understand the following definition, it is necessary to keep in mind that literals P(t1, . . . , tn) are encoded as P(t1, . . . , tn) ≈ T, as discussed in Section 2.5. Due to this encoding, predicate symbols become E-function symbols, and atoms become E-terms.

Definition 5.3.3. Let BSDL denote the BS calculus parameterized as follows:

• The E-term ordering is a lexicographic path ordering induced by a total precedence > over function, constant, and predicate symbols such that f > c > P > T, for each function symbol f, constant symbol c, and predicate symbol P.

• The selection function selects every negative binary literal in each closure C · σ.

We show in Subsection 5.3.3 that, in applying BSDL to Ξ(KB), we need to compare only E-terms occurring in closures of types 3–6 and 8 from Table 5.2. These closures contain at most one variable, so any LPO is total on E-terms from such closures. This allows us to use a simpler extension of the E-term ordering to literals than the one obtained by the two-fold multiset extension as explained in Section 2.5. We assign to each literal L = s ◦ t with ◦ ∈ {≈, ≉} the triple cL = (max(s, t), pL, min(s, t)), where (i) max(s, t) is the larger of the two E-terms; (ii) pL = 1 if ◦ is ≉; (iii) pL = 0 if ◦ is ≈; and (iv) min(s, t) is the smaller of the two E-terms. Then, L1 ≻ L2 if and only if cL1 ≻ cL2, where the cLi are compared lexicographically. The E-term ordering is used to compare the first and the third position of cL whereas, for the second position, we take 1 ≻ 0. It is easy to see that this definition is equivalent to the one based on the two-fold multiset extension.

Ordering and selection constraints in BS are checked a posteriori—that is, after computing the unifier. This is more general, because some E-terms may be comparable only after unification. For example, s = f(x) and t = y are not comparable under any

LPO. For σ = {x 7→ a, y 7→ g(f(a))}, however, we have tσ ≻ sσ. The drawback of a posteriori checking of ordering constraints is that the unifier is often computed in vain, just to determine that constraints are not satisfied. However, LPOs are total on E-terms from closures 3–6 and 8, so we can check ordering and selection constraints a priori—that is, before computing the unifier. Namely, if s and t are two E-terms to be compared, they are either both ground or both have the same, single free variable, so they are always comparable by an LPO. Also, if s ≻ t, then sσ ≻ tσ for each substitution σ, because LPOs are stable under substitutions.

5.3.3 Closure of ALCHIQ−-Closures under Inferences by BSDL

We now generalize the types of closures from Table 5.1 to ALCHIQ−-closures, presented in Table 5.2. For a term t, with P(t) we denote a possibly empty disjunction of the form (¬)P1(t) ∨ ... ∨ (¬)Pn(t). With P(f(x)) we denote a possibly empty disjunction of the form P1(f1(x)) ∨ ... ∨ Pm(fm(x)). Note that this definition allows each Pi(fi(x)) to contain positive and negative literals. With ⟨t⟩ we denote that the term t may, but need not, be marked. In all closure types, the disjuncts P(t) or P(f(x)) may be empty. To distinguish the closures of types 5 and 6, we assume that each closure of type 6 always contains at least one term of the form f(g(x)). Finally, with ≈/≉ we denote a positive or a negative equality literal.

Lemma 5.3.4. Each closure from Ξ(KB) is of a type from Table 5.2. Furthermore, for each function symbol f occurring in Ξ(KB), there is exactly one closure of type 3 containing f(x) unmarked; this closure is called the Rf -generator, the disjunction of unary literals in this closure is denoted with Pf (x) and is called the f-support, and R is called the designated role for f and is denoted with role(f).

Proof. The first claim follows trivially from Lemma 5.3.2. For the second claim, observe that each closure of type 3 is generated by skolemizing an existentially quantified subformula. Since each skolemization introduces a fresh function symbol, this symbol is associated with exactly one closure of type 3.

Next, we prove a lemma that shows which literals can be maximal in closures of types 3, 4, 5, 6, and 8 under the ordering and the selection function of BSDL.

Lemma 5.3.5. The maximal literal of an ALCHIQ−-closure that participates in an inference by BSDL satisfies the following conditions:

• In a closure of type 3, the literal R(x, hf(x)i) is always maximal.

• In a closure of type 4, the literal R([f(x)] , x) is always maximal.

• In a closure of type 5, a literal (¬)P (x) can be maximal only if the closure does not contain a term f(x).

• In a closure of type 6, only a literal containing a term f([g(x)]) can be maximal.

Table 5.2: Types of ALCHIQ−-Closures

1  ¬R(x, y) ∨ Inv(R)(y, x)
2  ¬R(x, y) ∨ S(x, y)
3  Pf (x) ∨ R(x, hf(x)i)
4  Pf (x) ∨ R([f(x)] , x)
5  P1(x) ∨ P2(hf(x)i) ∨ ⋁ hfi(x)i ≈/≉ hfj(x)i
6  P1(x) ∨ P2([g(x)]) ∨ P3(hf([g(x)])i) ∨ ⋁ htii ≈/≉ htji
   where ti and tj are either of the form f([g(x)]) or of the form x
7  P1(x) ∨ ⋁_{i=1..n} ¬R(x, yi) ∨ ⋁_{i=1..n} P2(yi) ∨ ⋁_{i=1..n} ⋁_{j=i+1..n} yi ≈ yj
8  R(hai , hbi) ∨ P(hti) ∨ ⋁ htii ≈/≉ htji
   where t, ti, and tj are either a constant b or a term fi([a])

Conditions:
(i)   In any term f(t), the inner term t occurs marked.
(ii)  In all positive equality literals with at least one function symbol, both sides are marked.
(iii) Any closure that contains a term f(t) contains Pf (t) as well.
(iv)  In a literal [fi(t)] ≈ [fj(t)], role(fi) = role(fj).
(v)   In a literal [f(g(x))] ≈ x, role(f) = Inv(role(g)).
(vi)  For each [fi(a)] ≈ [b] in a closure C, a witness closure of C exists that is of the form R(hai , hbi) ∨ D, role(fi) = R, D does not contain function symbols or negative binary literals, and D is contained in C.

• In a closure of type 8, a literal (¬)R(a, b), (¬)P(a), a ≈ b, or a ≉ b can be maximal only if the closure does not contain a function symbol.

Proof. For each term t, function symbol f, and predicate symbol P, since f > P by Definition 5.3.3 and f(t) ≻ t, we have f(t) ≻ P(t). Hence, R(x, f(x)) ≻ f(x) ≻ P(x), so R(x, f(x)) is always maximal in a closure of type 3; the claim for a closure of type 4 follows analogously. Furthermore, P′′(f(g(x))) ≻ f(g(x)) ≻ P′(g(x)) ≻ g(x) ≻ P(x), thus implying the claims for closures of types 5 and 6. Finally, for each function symbol f, constants a, b, and c, unary predicate symbol P, and binary predicate symbol R, by Definition 5.3.3 we have f(a) ≻ P(b), f(a) ≻ R(b, c), and f(a) ≻ b, thus implying the claim for a closure of type 8.

We now prove the core result that our decision procedure is based upon.

Lemma 5.3.6. Let Ξ(KB) = N0, . . . , Ni ∪ {C} be a BSDL-derivation, where C is the conclusion derived from premises in Ni. Then, C is either an ALCHIQ−-closure, or it is redundant in Ni.

Proof. We prove the lemma by induction on the derivation length. By Lemma 5.3.4, N0 contains only ALCHIQ−-closures, so the induction base holds. For the induction step, we examine all possible applications of inference rules of BSDL to closures in Ni. We first show that the conclusion has the structure of an ALCHIQ−-closure, and later prove that conditions (i)–(vi) hold as well.

Inferences with closures of types 1 and 2. Since negative binary literals are always selected, and superposition into variables is not necessary, closures of types 1 and 2 can participate only as main premises in resolution inferences with closures of types 3, 4, and 8. Obviously, the unifier binds the variables x and y to corresponding terms in the positive premise, and the result is of type 3, 4, or 8.

Inferences between closures of types 5, 6, and 8. Consider any inference between closures of types 5 or 6 with free variables x and x0, respectively. Since the term g(x) in some f([g(x)]) is always marked, terms can be unified only at their root positions. The following pairs of terms from premises are unifiable:

• For x and x0, f(x) and f(x0), or f([g(x)]) and f([g(x0)]), the unifier has the form σ = {x 7→ x0}, and the conclusion is a closure of type 5 or 6. Note that superposition from f(g(x)) ≈ x into f(g(x0)) ≉ x0 results in x0 ≉ x0, which is eagerly eliminated by reflexivity resolution.

• For x and g(x0), or f(x) and f([g(x0)]), the unifier is σ = {x 7→ g(x0)}, and the conclusion is a closure of type 5 or 6.

• For x and f(g(x0)), the unifier is σ = {x 7→ f(g(x0))}, and the conclusion is a closure of type 5 or 6.

Inferences between closures of type 6 and 8 are not possible, because a term of the form f(g(x)) does not unify with a term of the form a or f(a). For closures of types 5 and 8, the unifier is either σ = {x 7→ a} or σ = {x 7→ f(a)}, and the conclusion is of type 8. Inferences between closures of type 8 have an empty unifier, so the conclusion is trivially of type 8.

Inferences with a closure of type 7. Since negative binary literals are always selected, a closure of type 7 can participate only in a hyperresolution inference as the main premise C. The side premises can have the maximal literals of the form R(xi, hfi(xi)i), R([gi(xi)] , xi), or R(hai , hbii). These combinations are possible:

• Assume that there are two (or more) side premises with the maximal literal of the form R([gi(xi)] , xi). Without loss of generality, these premises can be assigned indices 1 and 2. Since g1(x1) and g2(x2) are unified with x, we have g1 = g2. Moreover, σ contains mappings y1 7→ x1, y2 7→ x1, and x2 7→ x1. Since C contains a literal yi ≈ yj for each pair of indices i and j, the conclusion contains x1 ≈ x1, so it is a tautology.

• Assume that the first side premise has the maximal literal R([g(x0)] , x0). Since g(x0) does not unify with a constant, no side premise can be of type 8. The unifier σ is of the form {x 7→ g(x0), xi 7→ g(x0), y1 7→ x0, yi 7→ fi(g(x0)) for 2 ≤ i ≤ n}. If n = 1, the conclusion is of type 5; otherwise, it is of type 6.

• Assume that all side premises have the maximal literals R(xi, hfi(xi)i). The unifier σ is of the form {xi 7→ x, yi 7→ fi(x)}, so the conclusion is of type 5.

• Assume that some side premises have the maximal literal R(xi, hfi(xi)i), for 1 ≤ i ≤ k, and R(hai , hbii) for k < i ≤ n (all ground premises must have the same first argument in the maximal literal, because all these arguments should unify with x). The unifier σ contains mappings of the form x 7→ a, xi 7→ a, yi 7→ fi(a) for 1 ≤ i ≤ k, and yi 7→ bi for k + 1 ≤ i ≤ n, so the conclusion is of type 8.

Superposition into a closure of type 3. Consider superposition into a closure of type 3 with a free variable x0. By Lemma 5.3.5, (w ≈ v) · ρ can only be the literal R(x0, f(x0)), with R being the designated role for f. There are these possibilities:

• (C∨s ≈ t)·ρ is a closure of type 5, 6, or 8 with (s ≈ t)·ρ of the form [f(u)] ≈ [g(u)]. The unifier is σ = {x0 7→ u}, so the conclusion is S = Pf ([u])∨R([u] , [g(u)])∨C·ρ, where Pg([u]) ⊆ C · ρ. By Condition (iv), the Rg-generator P g(y) ∨ R(y, g(y)) exists, and it subsumes S by the substitution {y 7→ u}.

• (C ∨ s ≈ t) · ρ is a closure of type 6 with (s ≈ t) · ρ of the form [f(g(x))] ≈ x. The unifier is σ = {x0 7→ g(x)}, so the conclusion is S = Pf ([g(x)])∨R([g(x)] , x)∨C·ρ, where Pg(x) ⊆ C · ρ. By Condition (v), the closure Pg(y) ∨ R([g(y)] , y) exists, and it subsumes S by the substitution {y 7→ x}.

• (C ∨ s ≈ t) · ρ is a closure of type 8 with (s ≈ t) · ρ of the form [f(a)] ≈ [b]. The unifier is σ = {x0 7→ a}, so the conclusion is S = Pf ([a]) ∨ R([a] , [b]) ∨ C · ρ. By Condition (vi), a witness of the form R(hai , hbi) ∨ D, where D ⊆ C · ρ, exists, and it subsumes S by the empty substitution.

In all three cases, the subsuming closure might be subsumed by some other closure; however, subsumption is transitive, so this other closure then subsumes the inference conclusion. Hence, a superposition into a closure of type 3 is always redundant.

Equality factoring. Ordering constraints allow us to optimize the application of equality factoring. Only closures of types 5, 6, or 8 are candidates for equality factoring. The premise has the form (C ∨ s ≈ t ∨ s0 ≈ t0) · ρ, where sρ ≈ tρ is maximal with respect to Cρ ∨ s0ρ ≈ t0ρ, tρ ⋡ sρ, and t0ρ ⋡ s0ρ. The unifier σ is always empty. If we assume that simplification by duplicate literal elimination is applied eagerly, we safely conclude that (s ≈ t) · ρ is strictly maximal, so t0 and t cannot be T. Hence, the terms sρ, tρ, s0ρ, and t0ρ are either all ground, or all contain the same free variable x. The ordering is total on such terms, so the ordering constraints are equivalent to tρ ≺ sρ and t0ρ ≺ s0ρ. Because sρ ≈ tρ is strictly maximal, we have tρ ≻ t0ρ.

Consider now the case where all equalities involved in the inference are marked, so s, t, s0, and t0 are variables. This is the case for all closures of types 5 and 6, and for some closures of type 8. The conclusion has the form (C ∨ t ≉ t0 ∨ s0 ≈ t0) · ρ, where t is a variable and tρ ≻ t0ρ; such a conclusion is a basic tautology, which is redundant and can be deleted.

Hence, provided that duplicate literal elimination is applied eagerly, equality factoring is redundant for all closures, apart from closures of type 8, where it can be applied to equalities of the form hai ≈ hbi in which at least one term is not marked. Depending on the marking, equality factoring either yields a basic tautology, which is redundant, or a closure of type 8. Note that a closure of type 8 can contain disjunctions of the form R([a] , b) ∨ R(a, [b]), to which duplicate literal removal does not apply directly due to incompatible markers. However, we assume that in such a case the markers are eagerly retracted. Thus, such a disjunction is converted into R(a, b) ∨ R(a, b), which is then collapsed into R(a, b).

Reflexivity resolution. Reflexivity resolution can only be applied to a closure of type 5, 6, or 8 with the empty unifier, so it produces a closure of type 5, 6, or 8. Since the unifier is always empty, the conclusion always subsumes the premise, so this inference rule should be applied eagerly.

Conditions. We now show that conditions from Table 5.2 hold for each nonredundant inference conclusion.

Condition (i): If a closure C satisfies Condition (i), applying a unifier σ to it only instantiates variables, so Cσ satisfies (i) as well. Since no inference removes markers, Condition (i) holds for any conclusion.

Condition (ii): All positive equality literals with at least one function symbol are generated by hyperresolution with a closure of type 7, so all terms in positive equalities in the conclusion are marked. Since no inference removes markers from the roots of the terms occurring in equalities, Condition (ii) holds for any conclusion.

Condition (iii): If a closure C contains f([t]) and satisfies Condition (iii), by Lemma 5.3.5, literals from Pf ([t]) cannot participate in an inference. Furthermore, a variable is instantiated simultaneously in f([t]) and Pf ([t]). The terms containing function symbols occurring in an inference conclusion always stem from one of the inference premises, so Condition (iii) holds for any conclusion.

Conditions (iv) and (v): All equality literals containing function symbols are generated by hyperresolution with a main premise C of type 7. Since the role R occurring in C is very simple, a closure of type 3 or 4 containing R cannot be resolved with a closure of type 2. Hence, for all side premises of the form Pf (x) ∨ R(x, hf(x)i), we have role(f) = R, and, for a side premise of the form Pg(x) ∨ R([g(x)] , x), we have role(g) = Inv(R). Hence, Conditions (iv) and (v) are satisfied for each conclusion of a hyperresolution inference. Furthermore, by Condition (ii), superposition into positive equality literals containing function symbols is not possible, and, in each literal [fi(x)] ≈ [fj(x)], the variable x is instantiated simultaneously. Hence, Conditions (iv) and (v) hold for any conclusion of any inference.

Condition (vi): All literals of the form [f(a)] ≈ [b] are generated by hyperresolution involving a side premise E1 of type 8 with the maximal literal R(hai , hbi), and a side premise E2 of the form Pf (x) ∨ R(x, hf(x)i). Since R occurring in such C is very simple, a closure of type 8 cannot be resolved with a closure of type 2, so role(f) = R. Since the literal R(hai , hbi) is maximal in E1, by Lemma 5.3.5, no literal from E1 contains a function symbol, and E2 does not contain a negative binary literal. Hence, the conclusion satisfies Condition (vi). Assume now that Condition (vi) holds for some closure C, where D is a witness of [f(a)] ≈ [b]. Since each literal from D neither contains a function symbol nor is a negative binary literal, by Lemma 5.3.5, no literal from C occurring in D can participate in an inference, so all literals are present in the conclusion. Finally, both sides of all equality literals containing function symbols are marked, so each equality literal containing function symbols derived in an inference must always occur in one of the premises. Hence, Condition (vi) holds for any conclusion.

The following corollary follows from the proof of Lemma 5.3.6:

Corollary 5.3.7. If a closure of type 8 participates in a BSDL inference in a derivation from Lemma 5.3.6, the unifier σ contains only ground mappings of the form x 7→ a and x 7→ f(b), and the conclusion is a closure of type 8. Furthermore, a closure of type 8 cannot participate in an inference with a closure of type 4 or 6.

The following corollary is useful for optimizing the algorithm in Chapter 7.

Corollary 5.3.8. Let KB be an ALCHIQ− knowledge base, containing neither at-most number restrictions occurring under positive polarity, nor at-least number restrictions occurring under negative polarity. Then, in a saturation of Ξ(KB) by BSDL, closures of type 8 do not contain function symbols. Furthermore, a closure of type 8 can participate in an inference only with closures not containing function symbols.

Proof. Let KB be as specified in the corollary, and let (*) denote the following property: for each closure C of type 7 from Ξ(KB), C contains exactly one literal ¬R(x, y) and it does not contain equality literals. We now prove the first claim by induction on the derivation length. The base case is obvious, since each ABox closure in Ξ(KB) = N0 is of a type from Table 5.1, and it does not contain a function symbol. For the induction step, we consider all inferences generating a closure of type 8 in Ni+1. A closure with an equality literal containing a function symbol might be generated only by a hyperresolution with a closure of type 7 containing equality literals; however, this is not possible by (*). A literal (¬)C([g(a)]) might be derived by hyperresolving a closure of type 7 with a side premise of type 3 and of type 8, but this is again not possible by (*). The only remaining possibility to derive a literal (¬)C([g(a)]) is by superposition from [f(a)] ≈ [g(a)] into (¬)C(f(x)); however, this is not possible, since Ni does not contain equality literals with function symbols.

For the second claim, note that, since a closure of type 8 does not contain a function symbol, it can participate in an inference only with a literal not containing a function symbol; since such a literal must be maximal, it can occur only in a closure not containing a function symbol.

5.3.4 Termination and Complexity Analysis

We now show that the number of ALCHIQ−-closures is finite for a finite signature. This, in combination with Lemma 5.3.6 and the soundness and completeness of BSDL, shows that saturation by BSDL with eager application of redundancy elimination rules decides satisfiability of ALCHIQ− knowledge bases.

Lemma 5.3.9. Let Ni be any closure set encountered in a derivation from Lemma 5.3.6. If C is a closure in Ni, then the number of literals in C is at most polynomial in |KB|, for unary coding of numbers in input. Furthermore, |Ni| is at most exponential in |KB|, for unary coding of numbers in input.

Proof. By Lemma 5.3.6, Ni can contain only ALCHIQ−-closures. Since redundant closures are eliminated eagerly, Ni cannot contain closures with duplicate literals or closures identical up to variable renaming. Let r denote the number of role predicates, a the number of atomic concept predicates, c the number of constants, and f the number of function symbols occurring in the signature of Ξ(KB). By definition of |KB|, r and c are obviously linear in |KB|. Furthermore, a is also linear in |KB|, since the number of new atomic concept predicates introduced during preprocessing is bounded by the number of subconcepts of each concept, which is linear in |KB|. The number f is bounded by the sum of all numbers n in ≥ n R.C and ≤ n R.C, plus one for each ∃R.C and ∀R.C occurring in KB. Since numbers are coded in unary, f is linear in |KB|. Let n denote the maximal number occurring in any number restriction; for unary coding of numbers, n is linear in |KB|.

Consider now the maximal number of literals in a closure of type 5. The maximal number of literals for P1(x) is 2a (the factor 2 allows each atomic concept predicate to occur positively or negatively), for P2(hf(x)i) it is 2a · 2f (f is multiplied by 2 since each term may or may not be marked), for equalities it is f² (both terms are always marked), and for inequalities it is 4f² (the factor 4 allows each side of the equality to be marked or not). Hence, the maximal number of literals is 2a + 4af + f² + 4f². For a closure of type 6, the maximal number of literals is 2a + 2a + 4af + (f² + f) + (4f² + 2f): possible choices for g do not contribute to the closure length, and the expressions in parentheses take into account that each term in an equality or an inequality can be fi(g(x)) or x. For a closure of type 8, the maximal number of literals is bounded by 2r · 2c · 2c + 2a · 2c + 2a · 2f · c + 2 · (4c² + c · f² + cf · 2c): the factor 2 in front of the parenthesis takes into account that equalities and inequalities can have the same form, and the expression in the parenthesis counts all possible forms these literals can have. The maximal number of literals of closures of types 1 and 2 is obviously 2, and for closures of types 3 and 4 it is a + 1. For a closure of type 7, the number of variables yi is bounded by n: each such closure in Ξ(KB) contains at most n variables, and no inference increases the number of variables. The maximal number of literals in a closure of type 7 is n + 4c + n², since the choice for R does not contribute to the closure length. Hence, the maximal number of literals in any closure in Ni is polynomial in |KB|, for unary coding of numbers.

The maximal number of closures of types 1–6 and 8 in Ni is now easily obtained as follows: if Cℓ is the closure with the maximal number of literals ℓ for some closure type, then there are 2^ℓ subsets of literals of Cℓ. To obtain the total number of closures, one must multiply 2^ℓ with the number of closure-wide choices. For closures of type 6, the function symbol g can be chosen in f ways. For closures of types 3 and 4, one can choose R and f in rf ways. For closures of type 1, one can choose R in r ways. For closures of type 2, one can choose R and S in r² ways. Since all these factors are polynomial in |KB| for unary coding of numbers, we obtain an exponential bound on the number of closures of types 1–6 and 8. Finally, no inference derives a new closure of type 7, so Ni contains only those closures of type 7 that are contained in Ξ(KB).
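As a small arithmetic check of the bounds just derived (the helper names are illustrative, not from the thesis), the following Python snippet evaluates the literal-count expressions for given signature sizes; for instance, with a = f = 10 the type-5 bound 2a + 4af + f² + 4f² evaluates to 920.

# Literal-count bounds from the proof of Lemma 5.3.9; a, r, c, f, n are the
# signature sizes defined above (all polynomial in |KB| for unary coding of numbers).

def max_literals_type5(a, f):
    return 2*a + 4*a*f + f**2 + 4*f**2

def max_literals_type6(a, f):
    return 2*a + 2*a + 4*a*f + (f**2 + f) + (4*f**2 + 2*f)

def max_literals_type7(n, c):
    return n + 4*c + n**2

def max_literals_type8(a, r, c, f):
    return 2*r*2*c*2*c + 2*a*2*c + 2*a*2*f*c + 2*(4*c**2 + c*f**2 + c*f*2*c)

if __name__ == "__main__":
    print(max_literals_type5(a=10, f=10))   # 920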
Theorem 5.3.10. For an ALCHIQ− knowledge base KB, saturation of Ξ(KB) by BSDL with eager application of redundancy elimination rules decides satisfiability of KB, and runs in time exponential in |KB|, for unary coding of numbers in input.

Proof. Translation of KB into Ξ(KB) can be performed in time polynomial in |KB| by Lemma 5.3.2, and Ξ(KB) contains only ALCHIQ−-closures by Lemma 5.3.4. Let ℓ denote the maximal number of closures occurring in the closure set in a derivation as specified in Lemma 5.3.6, and let l denote the maximal number of literals in a closure. By Lemma 5.3.9, ℓ is exponential, and l is polynomial in |KB|, for unary coding of numbers. Since all terms are bounded in size, they can be compared in constant time, and a maximal term in a closure can be selected in time linear in l. A subsumption algorithm, running in time exponential in l, was presented in [53]. Furthermore, the subsumption check is performed at most for each pair of closures, so it takes at most exponential time. For closures other than of type 7, each of the four inference rules can potentially be applied to any closure pair, so, because exactly one literal of each closure is eligible for inference, we get at most 4ℓ² inferences. For a hyperresolution inference with a closure of type 7, n side premises can be chosen in ℓ^n ways. Hence, the number of applications of inference rules of BSDL is bounded by 4ℓ² + ℓ^n, which is exponential in |KB|, for unary coding of numbers in input. Now it is obvious that Ξ(KB) is saturated after performing at most an exponential number of steps, after which the saturation terminates. Since BSDL is sound and complete with eager application of redundancy elimination rules, the claim of the theorem follows.

In the proof of Theorem 5.3.10, we assumed an exponential algorithm for checking subsumption. In practice, it is known that modern theorem provers spend up to 90 percent of their time in subsumption checking, so efficient subsumption checking is crucial for practical applicability of resolution theorem proving.

Fortunately, subsumption checks for ALCHIQ−-closures can be performed in polynomial time as follows. In [53], it was shown that subsumption between closures having at most one variable can be checked in polynomial time. This algorithm can be easily extended with an additional η-reducibility test, yielding a polynomial algorithm for checking subsumption of closures of types 3, 4, 5, 6, and 8. Checking whether a closure of type 1 or 2 subsumes some other closure can be performed by matching the negative literal first, and then checking whether the positive literal matches as well, which can be performed in quadratic time.

Let C and C0 be two closures of type 7. For C to subsume C0, the only possibility is that σ maps x to the variable x of C0 and each variable yi of C to some variable yj of C0. Hence, C subsumes C0 only if the number of variables yi in C is not larger than in C0, both closures contain the same role R, and the predicates occurring in P1(x) and P2(y) of C are a subset of the predicates occurring in C0, respectively. These checks can obviously be performed in polynomial time. Finally, a closure C of type 5 can subsume a closure C0 of type 7 only if C subsumes P1(x) or P2(yi). These checks can be performed in polynomial time, because C, P1(x), and P2(yi) contain only one variable.
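A minimal sketch of the polynomial necessary test for closures of type 7, assuming a summary representation of such closures (the class and field names are ours, not the thesis's):

from dataclasses import dataclass

@dataclass(frozen=True)
class Type7Closure:
    role: str               # the role R in the literals ¬R(x, y_i)
    num_y: int              # number of variables y_i
    p1_preds: frozenset     # unary predicates occurring in P1(x)
    p2_preds: frozenset     # unary predicates occurring in P2(y_i)

def may_subsume_type7(c: Type7Closure, c_prime: Type7Closure) -> bool:
    """Necessary conditions (checked in polynomial time) for c to subsume c_prime."""
    return (c.role == c_prime.role
            and c.num_y <= c_prime.num_y
            and c.p1_preds <= c_prime.p1_preds
            and c.p2_preds <= c_prime.p2_preds)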

5.4 Removing the Restriction to Very Simple Roles

In this section, we remove the restriction to very simple roles from the algorithm from Section 5.3, and thus obtain an algorithm for deciding satisfiability of an ALCHIQ knowledge base KB. Our main problem is that, during saturation of Ξ(KB) by BSDL, we can obtain closures whose structure corresponds to Table 5.2, but for which conditions (iii)–(vi) do not hold; we call such closures ALCHIQ-closures. For example, let KB be a knowledge base containing axioms (5.1)–(5.9). On the right-hand side, we show the translation of KB into closures Ξ(KB):

R v T           ¬R(x, y) ∨ T(x, y)                               (5.1)
S v T           ¬S(x, y) ∨ T(x, y)                               (5.2)
C v ∃R.>        ¬C(x) ∨ R(x, f(x))                               (5.3)
> v ∃S−.>       S−(x, g(x))                                      (5.4)
> v ≤ 1 T       ¬T(x, y1) ∨ ¬T(x, y2) ∨ y1 ≈ y2                  (5.5)
∃S.> v D        ¬S(x, y) ∨ D(x)                                  (5.6)
∃R.> v ¬D       ¬R(x, y) ∨ ¬D(x)                                 (5.7)
> v C           C(x)                                             (5.8)
                ¬S−(x, y) ∨ S(y, x)                              (5.9)

Consider a saturation of Ξ(KB) by BSDL (the notation R(xx; yy) means that a closure is derived by resolving closures xx and yy):

S([g(x)] , x)                          R(5.4;5.9)                (5.10)
¬C(x) ∨ T(x, [f(x)])                   R(5.1;5.3)                (5.11)
T([g(x)] , x)                          R(5.2;5.10)               (5.12)
¬C([g(x)]) ∨ [f(g(x))] ≈ x             R(5.5;5.11;5.12)          (5.13)

For the closure (5.13), condition (v) from Table 5.2 is not satisfied. Namely, we have role(f) = R ≠ Inv(role(g)) = Inv(S−) = S; this is because a number restriction was stated in (5.5) on a role that is not very simple. Now (5.13) can be superposed into (5.3), resulting in (5.14):

¬C([g(x)]) ∨ R([g(x)] , x)                                       (5.14)

Since condition (v) is not satisfied for (5.13), we cannot assume that a closure subsuming (5.14) exists, as we did in the proof of Lemma 5.3.6. Hence, we must keep (5.14), which is obviously not an ALCHIQ-closure. This might cause termination problems since, in general, (5.14) might be resolved with some closure of type 6 of the form C([g(h(x))]), producing a closure of the form R([g(h(x))] , [h(x)]). The term depth in a binary literal is now two; resolving such a closure with a closure of type 7 can yield closures with even deeper terms. Hence, Lemma 5.3.9, stating that the number of closures that can be derived is finite, does not hold any more, so saturation does not necessarily terminate. We did not find a way to refine the ordering or the selection function that would solve this problem.

Furthermore, (5.14) is necessary for completeness. Namely, KB is unsatisfiable, and the empty closure is derived by the following derivation, which involves (5.14):

D([g(x)])                              R(5.6;5.10)               (5.15)
¬D([g(x)]) ∨ ¬C([g(x)])                R(5.7;5.14)               (5.16)
¬C([g(x)])                             R(5.15;5.16)              (5.17)
□                                      R(5.8;5.17)               (5.18)

5.4.1 Transformation by Decomposition

To solve the termination problems for ALCHIQ, in this subsection we introduce decomposition—a transformation that can be applied to the result of certain BS inferences. This transformation is generally applicable, and is not limited to description logic reasoning. We show that decomposition can be combined with basic superposition, but in a similar way one can show that it can be combined with any clausal calculus compatible with the standard notion of redundancy [13].

In the following, for x a vector of distinct variables x1, . . . , xn and t a vector of (not necessarily distinct) terms t1, . . . , tn, let {x 7→ t} denote the substitution {x1 7→ t1, . . . , xn 7→ tn}, and let Q([t]) denote Q([t1] , . . . , [tn]).

Definition 5.4.1. Let C · ρ be a closure and N a set of closures. A decomposition of C · ρ w.r.t. N is a pair of closures C1 · ρ ∨ Q([t]) and C2 · θ ∨ ¬Q(x), where t is a vector of n terms, x is a vector of n distinct variables, n ≥ 0, satisfying these conditions:

• C = C1 ∪ C2;

• ρ = θ{x 7→ t};

• x is exactly the set of free variables of C2θ;

• If C2 · θ ∨ ¬Q′(x) ∈ N, then Q = Q′; otherwise, Q is a new predicate not occurring in N.

The closure C2 · θ is called the fixed part, the closure C1 · ρ is called the variable part, and the predicate Q is called the definition predicate. An application of decomposition is written as follows:

    C · ρ   ⟹   C1 · ρ ∨ Q([t])
                C2 · θ ∨ ¬Q(x)

Let ξ be a BS inference with a most general unifier σ, a side premise Ds · η, and a main premise Dm · η; furthermore, let Lm · η be the literal from Dm · η on which the inference takes place. The conclusion of ξ is eligible for decomposition if, for each ground substitution τ such that ξτ satisfies the ordering constraints of BS, we have ¬Q(t)τ ≺ Lmηστ. With BS+ we denote the BS calculus where decomposition can be applied to conclusions of eligible inferences.
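A minimal sketch of such a decomposition step, under the assumption that literals and terms are represented as plain strings and that the caller supplies the split C = C1 ∪ C2 together with the extracted terms t (all names are illustrative, not from the thesis):

class Decomposer:
    def __init__(self):
        self._definitions = {}   # fixed part (frozenset of literals) -> definition predicate
        self._counter = 0

    def decompose(self, c1_literals, c2_literals, t_terms):
        """Return (variable part, fixed part) for the split C = C1 ∪ C2.

        c2_literals must already be written over the fresh variables x1,...,xn,
        and t_terms are the terms that x1,...,xn stand for in C.
        """
        key = frozenset(c2_literals)
        q = self._definitions.get(key)
        if q is None:                          # introduce a new definition predicate
            q = "Q%d" % self._counter
            self._counter += 1
            self._definitions[key] = q
        xs = ", ".join("x%d" % (i + 1) for i in range(len(t_terms)))
        ts = ", ".join(t_terms)
        variable_part = list(c1_literals) + ["%s(%s)" % (q, ts)]
        fixed_part = list(c2_literals) + ["~%s(%s)" % (q, xs)]
        return variable_part, fixed_part

# Example corresponding to the closure C([g(x)]) ∨ R([g(x)], [h(g(x))]) discussed below:
# Decomposer().decompose(["C([g(x)])"], ["R(x1, [h(x1)])"], ["[g(x)]"]) yields
# (['C([g(x)])', 'Q0([g(x)])'], ['R(x1, [h(x1)])', '~Q0(x1)']).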

As an example, consider superposition from a closure [f(g(x))] ≈ [h(g(x))] into a closure C(x) ∨ R(x, f(x)), resulting in a closure C([g(x)]) ∨ R([g(x)] , [h(g(x))]). This is obviously not an ALCHIQ-closure, so performing further inferences with it might prevent a saturation from terminating. However, this closure can be decomposed into C([g(x)])∨QR,f ([g(x)]) and ¬QR,f (x)∨R(x, [h(x)]), which are both ALCHIQ-closures, so they do not cause termination problems. The inference is eligible for decomposition if we ensure that ¬QR,f ([g(x)]) ≺ R(g(x), f(g(x))) = Lmησ; this can be done by using an LPO with a precedence in which R > QR,f . The following lemma demonstrates that decomposition is a sound inference rule.

Lemma 5.4.2. Let N0, . . . , Ni be a BS+-derivation, and let I0 be a model of N0. For i > 1, let Ii be an interpretation such that, if the inference deriving Ni from Ni−1 involves a decomposition step introducing a new definition predicate Q, then Ii = Ii−1 ∪ {Q(s) | s is a vector of ground terms such that C2θ{x 7→ s} is true in Ii−1}; otherwise Ii = Ii−1. Then, Ii is a model of Ni.

Proof. The proof is by induction on the length of the derivation N0, . . . , Ni. The base case is trivial, since I0 is a model of N0 by the assumption. For the induction step, we assume that Ni−1 has a model Ii−1 satisfying the conditions of the lemma, and consider possible inferences deriving Ni. For inferences without a decomposition step, the claim is easy. Consider a positive superposition inference from (s ≈ t ∨ C) · ρ into (w ≈ v ∨ D) · ρ with a unifier σ, resulting in (C ∨ D ∨ w[t]p ≈ v) · θ, where θ = ρσ, and let τ be a ground substitution. If (s ≈ t) · θτ is false in Ii−1, then C · θτ is true in Ii−1; if (w ≈ v) · θτ is false in Ii−1, then D · θτ is true in Ii−1; and if both (s ≈ t) · θτ and (w ≈ v) · θτ are true in Ii−1, then (w[t]p ≈ v) · θτ is true in Ii−1. The cases for other inference rules are analogous.

If the inference deriving Ni from Ni−1 involves a decomposition step, then two cases are possible. If the predicate Q is new, then Ii−1 is extended to Ii by adding those ground literals Q(s) for which C2θ{x 7→ s} is true in Ii−1. Hence, for each ground substitution τ, in C2 · θτ ∨ ¬Q(x)τ, either C2 · θτ or ¬Q(x)τ is true in Ii. Furthermore, if C · ρτ is true in Ii−1, then C1 · ρτ ∨ Q([t])τ is obviously true in Ii. If the predicate Q is not new, then Ii = Ii−1. Then, by the induction hypothesis, Q(s) is true if and only if C2θ{x 7→ s} is true, and C1 · ρτ ∨ Q([t])τ is true in Ii as in the previous case.

We next show that decomposition is compatible with the standard notion of redundancy. This is the key step in showing completeness of BS+.

Lemma 5.4.3. Let ξ be a BS inference applied to premises from a closure set N, resulting in a closure C · ρ. If ξ is eligible for decomposition of C · ρ into C1 · ρ ∨ Q([t]) and C2 · θ ∨ ¬Q(x), and the two latter closures are both redundant in N, then ξ is redundant in N as well.

Proof. Let ξ be an inference on a literal Lm · η from a main premise Dm · η and a side premise Ds · η, with a most general unifier σ, resulting in a closure C · ρ. Furthermore, let R be a rewrite system and τ a ground substitution such that ξτ satisfies the ordering constraints of BS, and it is a variable irreducible ground instance of ξ w.r.t. R. Finally, let E1 = (C1 · ρ ∨ Q([t]))τ and E2 = (C2 · θ ∨ ¬Q(x)){x 7→ t}τ. Note that D = max(Dm · ηστ, Ds · ηστ) = Dm · ηστ.

By the ordering constraints of BS inference rules, Dsηστ ≺ Lmηστ. Furthermore, superposition inferences are allowed only from the maximal side of an equality, so the inference always produces a literal L′ηστ ≺ Lmηστ. Finally, because ξ is eligible for decomposition, ¬Q(t)τ ≺ Lmηστ. Thus, if a literal Lηστ ≻ Lmηστ occurs n times in E1 ∪ E2, it also occurs n times in D. In other words, both E1 and E2 contain at most those literals larger than Lmηστ that also occur in D. All other literals in E1 or E2 are smaller than Lmηστ. Since Lmηστ ∈ D, we conclude that E1 ≺ D and E2 ≺ D.

The vector of terms t is “extracted” from the substitution part of C · ρ. Hence, if a term t occurs in E1 and E2 at a substitution position, then t occurs in C · ρτ also at a substitution position. Therefore, if ξτ is variable irreducible w.r.t. R, so is C · ρτ, and so are E1 and E2.

To summarize, for all rewrite systems R and all ground substitutions τ such that ξτ is a variable irreducible ground instance of ξ w.r.t. R, the closures E1 and E2 are smaller than D, and are variable irreducible w.r.t. R. The closure C1 · ρ ∨ Q([t]) is redundant in N by assumption, so R ∪ irredR(N)^{≺E1} |= E1 but, since E1 ≺ D, we have R ∪ irredR(N)^{≺D} |= E1. Similarly, R ∪ irredR(N)^{≺D} |= E2. Since {E1, E2} |= C · ρτ, we have R ∪ irredR(N)^{≺D} |= C · ρτ, so the claim of the lemma holds.

Soundness and compatibility with the standard notion of redundancy imply that BS+ is a sound and complete calculus, as shown by Theorem 5.4.4. Note that, to obtain the saturated set N, we can use any fair saturation strategy [13]. Furthermore, the decomposition rule can be applied an infinite number of times in a saturation, and it is even allowed to introduce an infinite number of definition predicates. In the latter case, we just need to ensure we use a well-founded E-term ordering.

Theorem 5.4.4. For N0 a set of closures of the form C · {}, let N be a set of closures obtained by saturating N0 under BS+. Then, N0 is satisfiable if and only if N does not contain the empty closure.

Proof. The (⇒) direction follows immediately from Lemma 5.4.2. For the (⇐) direction, assume that N is saturated under BS+. Then, by Lemma 5.4.3, N is saturated under BS as well. Using the model generation method [14, 96], we can build a rewrite system R such that R∗ |= irredR(N). Unlike for basic superposition without decomposition, the set of closures N does not need to be well-constrained, so we cannot immediately assume that R∗ is a model of N. However, we can conclude that R∗ |= irredR(N0): consider a closure C ∈ N0 and its variable irreducible ground instance Cτ. If C ∈ N, then R∗ is obviously a model of Cτ. Furthermore, C ∉ N only if it is redundant in N; then, for any τ, there are ground closures Diτ ∈ irredR(N) such that D1τ, . . . , Dnτ |= Cτ. Since R∗ |= Diτ by assumption, we have R∗ |= Cτ as well. Now consider a closure C ∈ N0 and its (not necessarily variable irreducible) ground instance Cη. Let η′ be a substitution obtained from η by replacing each mapping x 7→ t with x 7→ nfR(t). Since the substitution part of C is empty, Cη′ ∈ irredR(N0), so, because R∗ |= Cη′, we have R∗ |= Cη. Hence, R∗ |= N0, and by Lemma 5.4.2, there is a model of N.

Discussion. We discuss the intuition behind Theorem 5.4.4. Decomposition is essentially the structural transformation applied in the course of the theorem proving process. Since the formulae obtained by the structural transformation are equisatisfiable with the original formula, the application of the structural transformation does not affect the soundness of the calculus.

A potential problem might be that decomposition somehow interferes with markers of basic superposition. This does not occur because, for any rewrite system R, decomposing C · ρ into C1 · ρ ∨ Q([t]) and C2 · θ ∨ ¬Q(x) actually decomposes any variable irreducible ground instance of the premise into corresponding variable irreducible ground instances of the conclusions. In this way, we do not lose any variable irreducible ground instance used in detecting a potential inconsistency of the closure set. Note that, for an arbitrary rewrite system R, closures C1 · ρ ∨ Q([t]) and C2 · θ ∨ ¬Q(x) can have variable irreducible ground instances that do not correspond to a variable irreducible ground instance of C · ρ. However, these variable irreducible ground instances do not cause problems, since decomposition is sound.

Another potential problem might arise if a closure C · ρ is derived and decomposed into C1 · ρ ∨ Q([t]) and C2 · θ ∨ ¬Q(x) an infinite number of times. This might happen if the ordering constraints on predicates require the fixed and the variable parts to be resolved on Q([t]) and ¬Q(x): obviously, the theorem proving process would be stuck in an infinite loop. This is avoided by requiring an inference to be eligible for decomposition, which makes decomposition compatible with the standard notion of redundancy. Hence, the fixed and the variable parts together make the original inference redundant, so the inference does not need to be repeated in a derivation.

Notion of Eligibility. We briefly discuss the way the eligibility condition was defined in Definition 5.4.1. It essentially ensures that the closures obtained by decomposition of the conclusion of an inference ξ are smaller than the main premise of ξ. Consider the following example BS inference ξ, followed by decomposition (the unifier is σ = {x0 7→ x}):

Premises:          ¬A(x) ∨ B(x) ∨ C(y) ∨ D(y)        A(x0) ∨ E(x0)
Conclusion:        B(x) ∨ E(x) ∨ C(y) ∨ D(y)
Decomposed into:   B(x) ∨ C(y) ∨ Q(x, y)              ¬Q(x, y) ∨ E(x) ∨ D(y)

To determine if ξ is eligible for decomposition, we have a problem: ¬A(x) is the main literal on which the inference takes place, and it is not comparable with ¬Q(x, y) (since Q(x, y) contains an additional variable y). The remedy is to consider each ground substitution τ such that ξτ satisfies the ordering constraints of BS. For comparing terms, we shall assume an LPO as in Definition 5.3.3. Provided that ¬A(x) is not selected, ¬A(x)στ ≻ (B(x) ∨ C(y))στ must hold, implying that xστ ≻ yστ. If we ensure that A > Q in the LPO precedence, then ¬A(x)στ ≻ ¬Q(x, y)στ, so the eligibility condition is satisfied.

Conversely, assume that ¬A(x) is selected. Then, ¬A(x)στ is not necessarily larger than (B(x) ∨ C(y))στ. Therefore, we cannot conclude that xστ ≻ yστ, nor that ¬A(x)στ ≻ ¬Q(x, y)στ. Indeed, it is possible to take τ = {x 7→ a, y 7→ f(a)}; now A(x)στ ≺ Q(x, y)στ, regardless of the relation between Q and A in the LPO precedence. Hence, the eligibility condition is not satisfied.

By defining the eligibility condition based on the ground instances of closures, we achieve a higher level of generality. However, in most cases decomposition is applied to closures of simpler syntactic structure, for which we next derive simpler and sufficient eligibility tests.

Proposition 5.4.5. Let ξ be a BS inference as in Definition 5.4.1. If ¬Q(t) ≺ Lmησ, then ξ is eligible for decomposition.

Proof. Since ≺ is stable under substitutions, we have ¬Q(t)τ ≺ Lmηστ for each τ.

Proposition 5.4.6. Let ξ be a BS inference as in Definition 5.4.1. If the side premise Ds · η contains a literal L · η such that ¬Q(t) ≺ Lησ, then ξ is eligible for decomposition.

Proof. For ξ a BS inference and τ a ground substitution as in Definition 5.4.1, by the ordering conditions of BS, we have Lsηστ ≺ Lmηστ, where Ls · η is the maximal literal of the side premise. Since ¬Q(t) ≺ Lησ by assumption, and ≺ is stable under substitutions, we also have that ¬Q(t)τ ≺ Lηστ ≺ Lsηστ ≺ Lmηστ.

Combining Decomposition with Other Calculi. For any other sound clausal calculus, Lemma 5.4.2 applies identically. Furthermore, for any calculus compatible with the standard notion of redundancy [13], Lemma 5.4.3 can be proved in a similar manner with minor modifications.

5.4.2 Deciding ALCHIQ by Decomposition

In this subsection, we extend the decision procedure from Section 5.3 with the decomposition rule from Subsection 5.4.1, and thus obtain a decision procedure for checking satisfiability of ALCHIQ knowledge bases.

Definition 5.4.7. Let BSDL+ be the BSDL calculus where conclusions are decomposed according to the following table whenever possible, for an arbitrary term t:

    D · ρ ∨ R([t] , [f(t)])   ⟹   D · ρ ∨ QR,f ([t])
                                   ¬QR,f (x) ∨ R(x, [f(x)])

    D · ρ ∨ R([f(x)] , x)     ⟹   D · ρ ∨ QInv(R),f (x)
                                   ¬QInv(R),f (x) ∨ R([f(x)] , x)

The precedence of an LPO is f > c > P > QS,f > T, for each function symbol f, constant symbol c, nondefinition predicate P, and definition predicate QS,f.

By Definition 5.4.1, for a (possibly inverse) role S and a function symbol f, the definition predicate QS,f is unique. Furthermore, a strict application of Definition 5.4.1 would require introducing a distinct definition predicate Q′R,f for R([f(x)] , x). However, by the operator π for translating KB into first-order logic, R([f(x)] , x) and Inv(R)(x, [f(x)]) are logically equivalent. Therefore, QInv(R),f can be used as the definition predicate for R([f(x)] , x) instead of Q′R,f, thus avoiding the need to introduce an additional predicate in the second form of decomposition in Definition 5.4.7. This optimization is not essential for the correctness of our results; however, it is good practice to keep the number of predicate symbols minimal.

Because decomposition ensures that all non-ALCHIQ-closures derived in a saturation are replaced with ALCHIQ-closures, BSDL+ decides satisfiability of ALCHIQ knowledge bases. Furthermore, since the definition predicate QR,f is unique for a pair of role and function symbols R and f, at most a polynomial number of definition predicates is introduced during saturation, so the result of Theorem 5.4.4 applies. Since the number of ALCHIQ-closures is finite according to Lemma 5.3.9, a saturation by BSDL+ terminates.

Theorem 5.4.8. For an ALCHIQ knowledge base KB, saturation of Ξ(KB) by BSDL+ decides satisfiability of KB, and runs in time exponential in |KB|, for unary coding of numbers in input.

Proof. We first show that inferences of BSDL+, when applied to ALCHIQ-closures, always produce ALCHIQ-closures. The proof of Lemma 5.3.6 applies even if conditions (iii)–(vi) from Table 5.2 do not hold; the only exception is a superposition into a generator closure Pf (x) ∨ R(x, f(x)). For the latter, there are these three possibilities, depending on the structure of the premise that superposition is performed from:

• Superposition from a closure of type 5, 6, or 8 of the form [f(t)] ≈ [g(t)] ∨ D · ρ, where t is either a variable x0, a term g(x0) or a constant a, results in a closure of the form Pf ([t]) ∨ R([t] , [g(t)]) ∨ D · ρ, which is decomposed into a closure of type 3 and a closure of type 5, 6, or 8.

• Superposition from a closure of type 6 of the form [f(g(x0))] ≈ x0 ∨ D · ρ results in a closure of the form Pf ([g(x0)]) ∨ R([g(x0)] , x0) ∨ D · ρ. This closure is decomposed into a closure of type 4, and a closure of type 5 or 6. Since R([g(x0)] , x0) and Inv(R)(x0, [g(x0)]) are logically equivalent due to the translation operator π, the predicate QInv(R),g can be used as the definition predicate for R([g(x0)] , x0).

• Superposition from a closure of type 8 of the form [f(a)] ≈ [b] ∨ D · ρ results in a closure of the form Pf (a) ∨ R([a] , [b]) ∨ D · ρ, which is of type 8.

Hence, decomposition ensures that the conclusion of each inference of BSDL+ is an ALCHIQ-closure. Note that R(t, f(t)) = Lmησ is the maximal literal of the main premise after applying the most general unifier σ and, since R > QR,g in the precedence of the LPO of BSDL+, ¬QR,g(t) ≺ R(t, f(t)). Hence, by Proposition 5.4.5, the first two superposition inferences from the previous list are eligible for decomposition. Also, note that decomposition can be performed after resolving a closure of type 3 or 4 with a closure of type 2 (eligibility is ensured also by Proposition 5.4.5), but this is not essential to obtain a decision procedure.

To show the claim of this theorem, let r be the number of roles, and f the number of function symbols occurring in Ξ(KB); as in Lemma 5.3.9, both r and f are linear in |KB| for unary coding of numbers. The number of definition predicates QR,f introduced by decomposition is bounded by r · f, which is quadratic in |KB|, so the number of different predicates is polynomial in |KB|. Hence, Lemma 5.3.9 applies in this case as well, so the maximal set of closures derived in a saturation is at most exponential in |KB| for unary coding of numbers. After deriving this set, all inferences of BSDL+ are redundant and the saturation terminates. Since BSDL+ is a sound and complete calculus by Theorem 5.4.4, it decides satisfiability of Ξ(KB) and, by Lemma 5.3.2, of KB as well, in time that is exponential in |KB|.

Although Theorem 5.4.8 supersedes Theorem 5.3.10, in practice it is useful to know that superposition into an R-generator is not necessary, provided that there is no role S such that R v S. In this way a practical implementation can be optimized not to perform inferences whose conclusions are immediately subsumed.

Note that Definition 5.4.7 applies decomposition eagerly. For example, a resolution of a closure of type 3 or 4 with a closure of type 1 or 2 produces a closure of type 3; similarly, a superposition from [f(x)] ≈ [g(x)] ∨ C(x) into D(x) ∨ R(x, f(x)) produces D(x) ∨ C(x) ∨ R(x, [g(x)]), which is also of type 3. By Definition 5.4.7, all these conclusions will be decomposed, even though they are ALCHIQ-closures. In such cases, decomposition could be made optional: it may be performed, but it is not strictly necessary to obtain termination.

For most knowledge bases, not all possible predicates QR,f are introduced in a saturation of Ξ(KB). Rather, only predicates from the set gen(KB), defined in the following definition, can be introduced by decomposition. This set is used in the algorithm for reducing an ALCHIQ knowledge base KB to a disjunctive datalog program from Chapter 7. Namely, the algorithm presented there requires all inferences among nonground clauses to be performed before any ground inferences, so, in Section 7.2, we append gen(KB) to Ξ(KB) in advance, before saturation.

Definition 5.4.9. For an ALCHIQ knowledge base KB, gen(KB) is the set of all closures of the form ¬QR,f (x) ∨ R(x, [f(x)]), where R ≠ role(f) and a role S exists such that the following conditions are satisfied:

• S occurs in an at-least number restriction under positive, or in an at-most number restriction under negative polarity in KB;

• R v∗ S or Inv(R) v∗ S;

• role(f) v∗ S or Inv(role(f)) v∗ S.
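The set gen(KB) can be computed directly from the role hierarchy. A minimal sketch, assuming helper functions for role(f), role inverses, and the reflexive-transitive subsumption test v∗ (all names are illustrative, not from the thesis):

def gen_pairs(roles, functions, role_of, inv, subsumed_by, restriction_roles):
    """Return the pairs (R, f) for which gen(KB) contains the closure
    ~Q_{R,f}(x) v R(x, [f(x)]), following the conditions of Definition 5.4.9.

    restriction_roles is the set of roles S occurring in the number restrictions
    named in the first condition; subsumed_by(R, S) decides R v* S."""
    pairs = set()
    for f in functions:
        for r in roles:
            if r == role_of(f):
                continue                      # the definition requires R different from role(f)
            for s in restriction_roles:
                if ((subsumed_by(r, s) or subsumed_by(inv(r), s))
                        and (subsumed_by(role_of(f), s) or subsumed_by(inv(role_of(f)), s))):
                    pairs.add((r, f))
                    break
    return pairs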

Lemma 5.4.10. In a saturation of Ξ(KB) by BSDL+, the decomposition rule introduces only definition closures from the set gen(KB).

Proof. A definition predicate QR,f is introduced in a saturation by decomposing the conclusion of superposition from a literal [g(f(x))] ≈ x with role(g) ≠ Inv(role(f)), or from a literal [g(t)] ≈ [f(t)] with role(g) ≠ role(f), into Pg(x) ∨ R(x, g(x)). Literals of the form [g(f(x))] ≈ x or [g(t)] ≈ [f(t)] are generated only by a hyperresolution with a closure of type 7, obtained by translating an at-least restriction on some role S into clausal form. Such a hyperresolution inference is possible only if there are closures of the form P1(x) ∨ S(x, hf(x)i) or P1(x) ∨ Inv(S)([f(x)] , x), and P2(x) ∨ S(x, hg(x)i) or P2(x) ∨ Inv(S)([g(x)] , x). Finally, the closures of the latter form can be derived only if R v∗ S or Inv(R) v∗ S, and role(f) v∗ S or Inv(role(f)) v∗ S.

5.4.3 Safe Role Expressions

A prominent limitation of ALCHIQ is a rather restricted form of axioms on roles. These limitations can be partially overcome by allowing safe Boolean role expressions to occur in TBox and ABox axioms. The resulting logic is called ALCHIQb, and can be viewed as the union of ALCHIQ and ALCIQb [144]. Roughly speaking, safe role expressions are built by combining simpler role expressions using union, intersection, and relativized negation of roles: relativized statements such as ∀x, y : isParentOf (x, y) → isMotherOf (x, y) ∨ isFatherOf (x, y) are allowed; however, fully negated statements such as ∀x, y : ¬isMotherOf (x, y) → isFatherOf (x, y) are not allowed. The safety restriction is needed for the algorithm to remain in ExpTime; namely, reasoning with unsafe role expressions is NExpTime-complete [90].

Definition 5.4.11. A role expression is a finite expression built over the set of abstract roles using the connectives t, u, and ¬ in the usual way. Let safe+ and safe− be functions defined on the set of all role expressions as follows, where R is an abstract role and E(i) are role expressions:

    safe+(R) = true                                    safe−(R) = false
    safe+(¬E) = safe−(E)                               safe−(¬E) = safe+(E)
    safe+(E1 t . . . t En) = safe+(E1) ∧ . . . ∧ safe+(En)
    safe−(E1 t . . . t En) = safe−(E1) ∨ . . . ∨ safe−(En)
    safe+(E1 u . . . u En) = safe+(E1) ∨ . . . ∨ safe+(En)
    safe−(E1 u . . . u En) = safe−(E1) ∧ . . . ∧ safe−(En)

A role expression E is safe if safe+(E) = true. The description logic ALCHIQb is obtained from ALCHIQ by allowing concepts ∃E.C, ∀E.C, ≥ n E.C and ≤ n E.C, inclusion axioms E v F, and ABox axioms E(a, b), where E is a safe role expression, and F is any role expression. The semantics of ALCHIQb is obtained by extending the translation operator π as specified in Table 5.3, where E(i) are role expressions.

A similar logic ALCIQb was considered in [144], where it was required that each disjunct in the disjunctive normal form of the role expression contains at least one nonnegated conjunct. These two definitions coincide; we use Definition 5.4.11 because transformation into disjunctive normal form can introduce exponential blowup, which we avoid using structural transformation.

To decide satisfiability of an ALCHIQb knowledge base KB, we extend the preprocessing of KB to transform each role expression into negation-normal form, and to introduce a new name for each nonatomic role expression. The set of closures obtained by preprocessing we also denote with Ξ(KB). From [99] we know that Ξ(KB) can be computed in polynomial time.

Table 5.3: Semantics of Role Expressions

π(R, X, Y ) = R(X, Y )
π(¬E, X, Y ) = ¬π(E, X, Y )
π(E1 u . . . u En, X, Y ) = π(E1, X, Y ) ∧ . . . ∧ π(En, X, Y )
π(E1 t . . . t En, X, Y ) = π(E1, X, Y ) ∨ . . . ∨ π(En, X, Y )
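A minimal sketch of the recursive safety test from Definition 5.4.11, assuming role expressions are represented as nested tuples (the representation and function names are illustrative, not from the thesis):

# Role expressions are either an atomic role name (a string) or nested tuples
# of the form ("neg", e), ("union", e1, ..., en), or ("inter", e1, ..., en).

def safe_pos(e):
    """safe+ as given above: true iff e may occur where a safe expression is required."""
    if isinstance(e, str):                      # atomic role: safe+(R) = true
        return True
    op = e[0]
    if op == "neg":
        return safe_neg(e[1])
    if op == "union":                           # every disjunct must be safe+
        return all(safe_pos(arg) for arg in e[1:])
    if op == "inter":                           # some conjunct must be safe+
        return any(safe_pos(arg) for arg in e[1:])
    raise ValueError("unknown connective: %r" % op)

def safe_neg(e):
    """safe− as given above."""
    if isinstance(e, str):                      # safe−(R) = false
        return False
    op = e[0]
    if op == "neg":
        return safe_pos(e[1])
    if op == "union":
        return any(safe_neg(arg) for arg in e[1:])
    if op == "inter":
        return all(safe_neg(arg) for arg in e[1:])
    raise ValueError("unknown connective: %r" % op)

# For example, safe_pos("isParentOf") is True, while
# safe_pos(("neg", "isMotherOf")) is False, matching the examples above.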

Theorem 5.4.12. For an ALCHIQb knowledge base KB, saturation of Ξ(KB) by BSDL+ decides satisfiability of KB in time that is exponential in |KB|, for unary coding of numbers in input.

Proof. In addition to ALCHIQ closures, Ξ(KB) can contain closures obtained by translating role expressions. For a role expression E occurring in concepts ∃E.C and ≥ n E.C under positive polarity, or in concepts ∀E.C and ≤ n E.C under negative polarity, structural transformation introduces a formula of the following form, where Q is a predicate:

(5.19) ∀x, y : Q(x, y) → π(E, x, y)

For a role expression E occurring in concepts ∃E.C and ≥ n E.C under negative polarity, or in concepts ∀E.C and ≤ n E.C under positive polarity, the structural transformation introduces a formula of the following form, where Q is a predicate:

(5.20) ∀x, y : π(E, x, y) → Q(x, y)

Finally, for an inclusion axiom E v F, the structural transformation introduces one formula of the form (5.19) and one of the form (5.20). Regardless of whether E is safe in (5.19) or not, the structural transformation and clausification of (5.19) produce closures containing the literal ¬Q(x, y). Furthermore, E is safe in (5.20), so each disjunct in the disjunctive normal form of E contains at least one positive conjunct. Hence, each clause in the conjunctive normal form of ¬π(E, x, y) contains at least one literal of the form ¬S(x, y), so the structural transformation and clausification of (5.20) also produce a closure with at least one such literal. Hence, all closures produced by role expressions have the form (5.21), where n > 0 and m ≥ 0:

(5.21) ¬R1(x, y) ∨ ... ∨ ¬Rn(x, y) ∨ S1(x, y) ∨ ... ∨ Sm(x, y)

Such a closure always contains at least one negative literal that is selected, so the closure can participate only in hyperresolution on all negative literals. If one of the side premises is a closure of type 8 with a maximal literal R(hai , hbi), no side premise can have a maximal literal of the form R(x, hf(x)i) or R([f(x)] , x) (because f(x) does not unify with a constant). Hence, the resolvent is a ground closure of type 8. Furthermore, if one side premise has a maximal literal of the form R(x, hf(x)i), then no side premise can have a maximal literal of the form R0([g(x0)] , x0), as this would require unifying x with g(x0) and f(x) with x0 simultaneously, which is not possible due to the occurs-check in unification [7]. If all side premises have a maximal literal of the form Ri(xi, hf(xi)i), the resolvent has the form (5.22), for S(s, t) = S1(s, t) ∨ ... ∨ Sm(s, t):

(5.22) P(x) ∨ S(x, [f(x)])

This closure can be decomposed into (5.23)–(5.25):

P(x) ∨ QS1,f (x) ∨ . . . ∨ QSm,f (x)        (5.23)
¬QS1,f (x) ∨ S1(x, [f(x)])                  (5.24)
   ...
¬QSm,f (x) ∨ Sm(x, [f(x)])                  (5.25)

Finally, if all side premises have a maximal literal of the form Ri([f(xi)] , xi), the resolvent has the form (5.26):

(5.26) P(x) ∨ S([f(x)] , x)

This closure can further be decomposed into (5.27)–(5.29). Since Si([f(x)] , x) and Inv(Si)(x, [f(x)]) are logically equivalent due to the translation operator π, the predicate QInv(Si),f can be used as the definition predicate for Si([f(x)] , x).

P(x) ∨ QInv(S1),f (x) ∨ . . . ∨ QInv(Sm),f (x)        (5.27)
¬QInv(S1),f (x) ∨ S1([f(x)] , x)                      (5.28)
   ...
¬QInv(Sm),f (x) ∨ Sm([f(x)] , x)                      (5.29)

Hence, we ensure that the conclusions of all inferences are ALCHIQ-closures, so Lemma 5.3.6 applies for ALCHIQb as well. Furthermore, the number of literals in a closure of type (5.21) is linear in the size of the role expression, so the claim of this theorem holds in the same way as for Theorem 5.4.8.

Note that role safety is important for Theorem 5.4.12 because it ensures that closures of type (5.21) always contain a negative literal, which is then selected. If this were not the case, a closure of the form R(x, y) might participate in resolution with a closure P1(x) ∨ ¬R(x, y) ∨ P2(y), producing a closure P1(x) ∨ P2(y). This closure does not match any closure from Table 5.2, which complicates the decision procedure. Since P1(x) and P2(y) do not have variables in common, a possible solution is to don't-know nondeterministically assume that either disjunct is true, and thus reduce the problematic closure to an ALCHIQ closure. This increases the complexity of the algorithm from ExpTime to NExpTime as required: in [90] it was shown that reasoning in a description logic with unsafe role expressions is NExpTime-complete. Unfortunately, the presented transformation is not applicable to all problematic closures, so we leave solving the general problem to future research.

5.5 Example

In Subsection 3.1.1, we introduced two knowledge bases KB 1 and KB 2, and showed, by a model-theoretic argument, that KB 1 ∪ KB 2 |= ∃hasVD.Adpt3DAcc(pc1 ). In this section, we show that this entailment holds using a proof-theoretic argument.

Our algorithm is refutational—that is, it is capable only of detecting a contradiction. Since ¬(∃hasVD.Adpt3DAcc(pc1 )) ≡ ∀hasVD.¬Adpt3DAcc(pc1 ), we set KB 0 to be defined as follows, and show it to be unsatisfiable:

(5.30) KB 0 = KB 1 ∪ KB 2 ∪ {∀hasVD.¬Adpt3DAcc(pc1 )}

Eliminating Transitivity Axioms. The first step in checking satisfiability of KB 0 is to eliminate transitivity axioms by computing Ω(KB 0). We start with computing the concept closure of KB 0. For the sake of brevity, we omit concepts that can be simplified using standard identities.

clos(KB 0) = {
  ∀ hasAdpt.Adpt, Adpt,                                            (from 3.1)
  ∀ has3DAcc.3DAcc, 3DAcc,                                         (from 3.2)
  ¬Adpt t ¬3DAcc t Adpt3DAcc, ¬Adpt, ¬3DAcc, Adpt3DAcc,            (from 3.3)
  ¬Adpt t VD, VD,                                                  (from 3.4)
  ¬3DAcc t VD,                                                     (from 3.5)
  ¬GrWS t PC, ¬GrWS, PC,                                           (from 3.6)
  ¬GaPC t (GrWS u PC), ¬GaPC, GrWS u PC, GrWS,                     (from 3.7)
  ¬PC t ∃ hasAdpt.>, ¬PC, ∃ hasAdpt.>,                             (from 3.12)
  ¬GrWS t ∃ has3DAcc.>, ∃ has3DAcc.>,                              (from 3.13)
  ¬GaPC t ≤ 1 hasVD.VD, ≤ 1 hasVD.VD, ¬VD,                         (from 3.14)
  ¬PCI t ∀ contains.PCI, ¬PCI, ∀ contains.PCI, PCI,                (from 3.15)
  GaPC,                                                            (from 3.16)
  ∀hasVD.¬Adpt3DAcc, ¬Adpt3DAcc                                    (from 5.30)
}

We next select all concepts of the form ∀R.C from clos(KB 0), where R is transitive or has a transitive subrole; in our example, the only such concept is ∀ contains.PCI . Finally, to construct Ω(KB 0), we remove from KB 0 the transitivity axiom (3.11) and add the axiom (5.31):

(5.31) ∀ contains.PCI v ∀ contains.∀ contains.PCI

Translation into Closures. To check satisfiability of Ω(KB 0) using basic superposition, the knowledge base must be extensionally reduced; that is, ABox axioms should only contain literal concepts. Hence, we replace (5.30) by (5.32) and (5.33):

(5.32) Q1 v ∀hasVD.¬Adpt3DAcc

Q1(pc1 )    (5.33)

We now apply the clausification algorithm from Definition 5.3.1. The following closures correspond to TBox and RBox axioms of Ω(KB 0) other than (5.31):

(5.34) ¬hasAdpt(x, y) ∨ Adpt(y)                                                   (3.1)
(5.35) ¬has3DAcc(x, y) ∨ 3DAcc(y)                                                 (3.2)
(5.36) ¬Adpt(x) ∨ ¬3DAcc(x) ∨ Adpt3DAcc(x)                                        (3.3)
(5.37) ¬Adpt(x) ∨ VD(x)                                                           (3.4)
(5.38) ¬3DAcc(x) ∨ VD(x)                                                          (3.5)
(5.39) ¬GrWS(x) ∨ PC (x)                                                          (3.6)
(5.40) ¬GaPC (x) ∨ GrWS(x)                                                        (3.7)
(5.41) ¬GaPC (x) ∨ PC (x)                                                         (3.7)
(5.42) ¬hasAdpt(x, y) ∨ hasVD(x, y)                                               (3.8)
(5.43) ¬has3DAcc(x, y) ∨ hasVD(x, y)                                              (3.9)
(5.44) ¬hasVD(x, y) ∨ contains(x, y)                                              (3.10)
(5.45) ¬PC (x) ∨ hasAdpt(x, f(x))                                                 (3.12)
(5.46) ¬GrWS(x) ∨ has3DAcc(x, g(x))                                               (3.13)
(5.47) ¬GaPC (x) ∨ ¬hasVD(x, y1) ∨ ¬hasVD(x, y2) ∨ y1 ≈ y2 ∨ ¬VD(y1) ∨ ¬VD(y2)   (3.14)
(5.48) ¬PCI (x) ∨ ¬contains(x, y) ∨ PCI (y)                                       (3.15)
(5.49) ¬Q1(x) ∨ ¬hasVD(x, y) ∨ ¬Adpt3DAcc(y)                                      (5.32)

Axiom (5.31) contains nonliteral subconcepts, so it must be transformed using structural transformation. Hence, we compute Def(C) for C as follows:

C = ¬(∀contains.PCI ) ⊔ ∀contains.∀contains.PCI

We start by setting Λ(C) = {1.1, 2.2}. Namely, C|1.1 = ∀ contains.PCI , and each subconcept of C|1.1 is atomic; the situation is similar for C|2.2 = ∀contains.PCI . Next, we introduce new names Q2 and Q3 for C|1.1 and C|2.2, respectively. Furthermore, pol(C, 1.1) = −1, but pol(C, 2.2) = 1, so, after skipping some intermediate steps, we obtain the following set Def(C):

Def(C) = {Q2 ⊔ ¬(∀contains.PCI )} ∪ {¬Q3 ⊔ ∀contains.PCI } ∪ Def(¬Q2 ⊔ ∀contains.Q3)

We now recursively compute Def(D) for D = ¬Q2 ⊔ ∀contains.Q3. The subconcepts of D are all literal except ∀contains.Q3, but this is the only nonatomic subconcept in a disjunction, so renaming it is not strictly necessary; thus, Def(D) = {D}. By clausifying Def(C), we obtain the following closures:

(5.50) Q2(x) ∨ contains(x, h(x))                  (5.31)
(5.51) Q2(x) ∨ ¬PCI (h(x))                        (5.31)
(5.52) ¬Q3(x) ∨ ¬contains(x, y) ∨ PCI (y)         (5.31)
(5.53) ¬Q2(x) ∨ ¬contains(x, y) ∨ Q3(y)           (5.31)

Finally, we append the following ABox closures:

(5.54) GaPC (pc1 )                                (3.16)
(5.55) Q1(pc1 )                                   (5.33)

Term Ordering. To obtain the lexicographic ordering for BS_DL^+, we use the precedence > in which all function symbols are larger than all constants, all constants are larger than all predicate symbols, and symbols of the same type are compared alphabetically. An exception is made for the definition predicates starting with the letter Q, which, because of Definition 5.4.7, must be smallest. Furthermore, we select all negative binary literals. The E-terms that participate in inferences because they are either selected or maximal in the E-term ordering are highlighted.
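The precedence just described can be captured by a simple comparison on tagged symbols. The following small sketch is only an illustration (not the KAON2 implementation); it assumes each symbol is given as a (kind, name) pair and reads "compared alphabetically" as "alphabetically later names are larger".

```python
# Sketch of the symbol precedence described above.
# kind is one of "function", "constant", "predicate".
KIND_RANK = {"function": 3, "constant": 2, "predicate": 1}

def precedence_key(symbol):
    kind, name = symbol
    # Definition predicates introduced by the structural transformation
    # (names starting with Q) must be smallest in the precedence.
    if kind == "predicate" and name.startswith("Q"):
        return (0, name)
    return (KIND_RANK[kind], name)

def precedence_greater(sym1, sym2):
    return precedence_key(sym1) > precedence_key(sym2)

# f > pc1 > GaPC > Q1 under this precedence:
assert precedence_greater(("function", "f"), ("constant", "pc1"))
assert precedence_greater(("constant", "pc1"), ("predicate", "GaPC"))
assert precedence_greater(("predicate", "GaPC"), ("predicate", "Q1"))
```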

Saturation by BS_DL^+. We now saturate the set of closures by basic superposition. Since this set is fairly large, we need a strategy for applying BS_DL^+ inferences that ensures that we do not miss an inference. To that purpose, we use the DISCOUNT loop algorithm [133]. The algorithm is presented formally in Subsection 12.3.1; here, we give a brief overview in order to make this section self-contained. The algorithm maintains two sets of closures: the set U of unprocessed closures and the set W of worked-off closures. Initially, U = Ξ(Ω(KB′)) and W = ∅. In each iteration of the algorithm, a given closure C is (don’t-care nondeterministically) selected in U. Then, all possible BS_DL^+ inferences with C and premises from W are computed; we denote the set of consequences with S. Next, C is removed from U and added to W. Finally, all closures from S that are nonredundant w.r.t. W are added to U. The algorithm terminates when U becomes empty. Note that the algorithm has an important property: after each iteration, all possible inferences among closures in W have been performed. Therefore, when the algorithm terminates, W is the saturated set of closures.
An iteration of the algorithm is presented using the following notation. C is the given closure, and is numbered in the set W with (xx.xx). Closures S1, ..., Sn are the conclusions of all inferences involving C and closures from W. A label R(yy.yy) means that S1 is derived by resolving C with the closure (yy.yy) from W; a label S(zz.zz) means that Sn is derived by superposition of C and the closure (zz.zz) from W. Thus, at any given instant, the set W consists of the closures marked with ⇒. At the start, the set U consists of the closures (5.34)–(5.55); as the algorithm proceeds, it also contains closures derived in a previous iteration, but not yet moved to W.

⇒ (xx.xx)  C
R(yy.yy)   S1
...
S(zz.zz)   Sn
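The DISCOUNT loop itself is easy to state as a program. The following is a simplified Python sketch, not the actual KAON2 code; infer and is_redundant are placeholders for the BS_DL^+ inference computation and redundancy test.

```python
def discount_loop(initial_closures, infer, is_redundant):
    """Simplified sketch of the DISCOUNT (given-closure) loop described above.

    initial_closures:            the clausified input, here Xi(Omega(KB')).
    infer(c, worked_off):        conclusions of all inferences involving the given
                                 closure c, with further premises from worked_off.
    is_redundant(c, worked_off): redundancy test w.r.t. worked_off.

    The empty closure is modelled as an empty frozenset.
    """
    unprocessed = set(initial_closures)     # the set U
    worked_off = set()                      # the set W

    while unprocessed:
        given = unprocessed.pop()           # don't-care nondeterministic choice
        consequences = infer(given, worked_off)
        worked_off.add(given)
        for c in consequences:
            if c == frozenset():            # empty closure derived: unsatisfiable
                return worked_off | {c}
            if not is_redundant(c, worked_off):
                unprocessed.add(c)
    return worked_off                       # saturated set: input is satisfiable
```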

We now show the inferences of BS_DL^+ on Ξ(Ω(KB′)). To make the presentation shorter, we perform only those decomposition inferences that are strictly necessary.

⇒ (5.56) ¬PC (x) ∨ hasAdpt(x, f(x))

⇒ (5.57) ¬GrWS(x) ∨ has3DAcc(x, g(x))

⇒ (5.58) ¬hasAdpt(x, y) ∨ hasVD(x, y)
R(5.56) ¬PC (x) ∨ hasVD(x, [f(x)])

⇒ (5.59) ¬has3DAcc(x, y) ∨ hasVD(x, y)
R(5.57) ¬GrWS(x) ∨ hasVD(x, [g(x)])

⇒ (5.60) ¬PC (x) ∨ hasVD(x, [f(x)])

⇒ (5.61) ¬GrWS(x) ∨ hasVD(x, [g(x)])

⇒ (5.62) ¬hasAdpt(x, y) ∨ Adpt(y)
R(5.56) ¬PC (x) ∨ Adpt([f(x)])

⇒ (5.63) ¬has3DAcc(x, y) ∨ 3DAcc(y)
R(5.57) ¬GrWS(x) ∨ 3DAcc([g(x)])

⇒ (5.64) ¬PC (x) ∨ Adpt([f(x)])

⇒ (5.65) ¬GrWS(x) ∨ 3DAcc([g(x)])

⇒ (5.66) ¬Adpt(x) ∨ VD(x)

⇒ (5.67) ¬3DAcc(x) ∨ VD(x)

⇒ (5.68) ¬GaPC (x) ∨ ¬hasVD(x, y1) ∨ ¬hasVD(x, y2) ∨ y1 ≈ y2 ∨ ¬VD(y1) ∨ ¬VD(y2)
R(5.60;5.61) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬VD([g(x)]) ∨ ¬VD([f(x)])

⇒ (5.69) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬VD([g(x)]) ∨ ¬VD([f(x)])
R(5.66) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬Adpt([g(x)]) ∨ ¬VD([f(x)])
R(5.67) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬3DAcc([g(x)]) ∨ ¬VD([f(x)])

⇒ (5.70) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬3DAcc([g(x)]) ∨ ¬VD([f(x)])
R(5.65) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬VD([f(x)])

⇒ (5.71) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬VD([f(x)])
S(5.57) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ has3DAcc(x, [f(x)]) ∨ ¬VD([f(x)])

The conclusion of the last inference is not an ALCHIQ-closure, so we must decompose it. We introduce a new predicate Q4, replace the conclusion with (5.72) and (5.73), and continue the saturation.

⇒ (5.72) ¬Q4(x) ∨ has3DAcc(x, [f(x)])

R(5.63) ¬Q4(x) ∨ 3DAcc([f(x)])
R(5.59) ¬Q4(x) ∨ hasVD(x, [f(x)])

⇒ (5.73) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ Q4(x) ∨ ¬VD([f(x)])

R(5.66) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ Q4(x) ∨ ¬Adpt([f(x)])
R(5.67) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ Q4(x) ∨ ¬3DAcc([f(x)])

⇒ (5.74) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ Q4(x) ∨ ¬Adpt([f(x)])

R(5.64) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ Q4(x)

⇒ (5.75) ¬Q4(x) ∨ 3DAcc([f(x)])

⇒ (5.76) ¬Q4(x) ∨ hasVD(x, [f(x)])

R(5.68;5.61) ¬GaPC (x) ∨ ¬Q4(x) ∨ ¬PC (x) ∨ [g(x)] ≈ [f(x)] ∨ ¬VD([g(x)]) ∨ ¬VD([f(x)])

⇒ (5.77) ¬Q1(x) ∨ ¬hasVD(x, y) ∨ ¬Adpt3DAcc(y)

R(5.60) ¬Q1(x) ∨ ¬PC (x) ∨ ¬Adpt3DAcc([f(x)])
R(5.61) ¬Q1(x) ∨ ¬GrWS(x) ∨ ¬Adpt3DAcc([g(x)])
R(5.76) ¬Q1(x) ∨ ¬Q4(x) ∨ ¬Adpt3DAcc([f(x)])

⇒ (5.78) ¬Q1(x) ∨ ¬Q4(x) ∨ ¬Adpt3DAcc([f(x)])

⇒ (5.79) ¬Adpt(x) ∨ ¬3DAcc(x) ∨ Adpt3DAcc(x)

R(5.78) ¬Adpt([f(x)]) ∨ ¬3DAcc([f(x)]) ∨ ¬Q1(x) ∨ ¬Q4(x)

⇒ (5.80) ¬Adpt([f(x)]) ∨ ¬3DAcc([f(x)]) ∨ ¬Q1(x) ∨ ¬Q4(x)

R(5.64) ¬PC (x) ∨ ¬3DAcc([f(x)]) ∨ ¬Q1(x) ∨ ¬Q4(x)

⇒ (5.81) ¬PC (x) ∨ ¬3DAcc([f(x)]) ∨ ¬Q1(x) ∨ ¬Q4(x)

R(5.75) ¬PC (x) ∨ ¬Q1(x) ∨ ¬Q4(x)

⇒ (5.82) ¬PC (x) ∨ ¬Q1(x) ∨ ¬Q4(x)

⇒ (5.83) ¬GaPC (x) ∨ ¬PC (x) ∨ ¬GrWS(x) ∨ Q4(x)

⇒ (5.84) ¬GaPC (x) ∨ PC (x)

R(5.82) ¬GaPC (x) ∨ ¬Q1(x) ∨ ¬Q4(x)
R(5.83) ¬GaPC (x) ∨ ¬GrWS(x) ∨ Q4(x)

⇒ (5.85) ¬GaPC (x) ∨ ¬GrWS(x) ∨ Q4(x)

⇒ (5.86) ¬GaPC (x) ∨ GrWS(x)

R(5.85) ¬GaPC (x) ∨ Q4(x)

⇒ (5.87) ¬GaPC (x) ∨ Q4(x)

⇒ (5.88) ¬GaPC (x) ∨ ¬Q1(x) ∨ ¬Q4(x)

⇒ (5.89) GaPC (pc1 )

R(5.87) Q4([pc1 ])
R(5.88) ¬Q1([pc1 ]) ∨ ¬Q4([pc1 ])

⇒ (5.90) Q4([pc1 ])

⇒ (5.91) ¬Q1([pc1 ]) ∨¬Q4([pc1 ])

R(5.90) ¬Q1([pc1 ])

⇒ (5.92) ¬Q1([pc1 ])

⇒ (5.93) Q1(pc1 )
R(5.92) □

⇒ (5.94) □

In step (5.94), we added the empty closure to W. This makes all other closures redundant, so the saturation terminates. Moreover, Ξ(Ω(KB′)) is unsatisfiable, and so is KB′, so KB 1 ∪ KB 2 |= ∃hasVD.Adpt3DAcc(pc1 ).

5.6 Related Work

Decision procedures for various logics were investigated in the field of automated theorem proving from its early days. Three such procedures were already implemented by Wang in 1960: a procedure capable of deciding validity in propositional logic, a procedure for deriving theorems in propositional logic, and a procedure for deciding validity in the so-called AE-fragment of first-order logic, consisting of first-order formulae with a quantifier prefix of the form ∀x1 ... ∀xm∃y1 ... ∃yn.

At the beginning of the sixties, Robinson introduced the resolution principle [121] for first-order logic. Due to its simplicity, resolution soon gained popularity. Namely, a resolution-based theorem prover must implement only one inference rule, thus eliminating the complexity involved in choosing the rule to apply next. Soon after the initial work by Robinson, various refinements of resolution were developed, such as hyperresolution [122], ordered resolution [115], paramodulation [120], or lock resolution [23], to name just a few. The common goal of all of these refinements is to reduce the number of consequences generated in the theorem proving process without losing completeness. A good overview of resolution and related refinements is given in the classical textbook by Chang and Lee [26].

Soon after the introduction of the resolution principle and its refinements, attempts were made to use them to obtain efficient decision procedures for various classes of first-order logic. The first such procedure was presented by Kallick [81] for the class of formulae with the quantification prefix ∀x1∀x2∃y. This decision procedure is based on a refinement of resolution that is incomplete for first-order logic, and is therefore difficult to extend.

In [80], Joyner established the basic principles for deriving resolution-based decision procedures. He observed that, if clauses derivable in a saturation by a resolution refinement have a bounded term depth and clause length, then saturation necessarily terminates. By choosing appropriate refinements, he presented decision procedures for the Ackermann class (where the formulae are restricted to the quantification prefix ∃*∀∃*), the Monadic class (where only unary predicates are allowed), and the Maslov class (where formulae are restricted to the quantification prefix ∃*∀*∃*, and the formula under the quantifiers is a conjunction of binary disjunctions). In the years to follow, the approach by Joyner was applied to numerous other decidable classes, such as the E⁺ class [142], the PVD class [83], and the PVD_=^g class [107], to name just a few. An overview of these results is given in [45].

Decidability of description logics in the resolution framework has been studied extensively in [98, 77, 76]. There, the description logic ALB is embedded in the DL* clausal class, which is then decided using the resolution framework by Bachmair and Ganzinger [13]. The main advantage of using this framework lies in its effective redundancy elimination methods, which were shown to be essential for the practical applicability of resolution calculi. ALB is a very expressive logic and allows for unsafe role expressions, but does not provide for counting quantifiers.

In [48], a decision procedure for the modal logic with a single transitive modality K4 was presented. To deal with transitivity, the algorithm is based on the ordered chaining calculus [11]. This calculus consists of inference rules aimed at optimizing theorem proving with chains of binary roles.
Unfortunately, our attempts to decide SHIQ using ordered chaining proved unsuccessful, mainly due to certain negative chaining inferences that produced undesirable equality literals. Therefore, we adopted the approach for eliminating transitivity presented in Section 5.2.

The guarded fragment was introduced in [3] to explain and generalize the good properties of modal and description logics, such as decidability. A decision procedure by resolution with a nonliftable ordering was presented in [36]; it was later modified to handle the (loosely) guarded fragment with equality in [47] by basing the algorithm on superposition [9]. Since the basic description logic ALC is actually a syntactic variant of the multi-modal logic K_m [127], it can be embedded into the guarded fragment and decided by [47]. Using the approach from [129], certain extensions of ALC, such as role transitivity, can be encoded into ALC knowledge bases, so the algorithm from [47] can decide these extensions as well. However, SHIQ is not a fragment of the (loosely) guarded fragment because of the counting quantifiers: equality is available in the guarded fragment, but each pair of free variables of a guarded formula must occur together in a guard atom. In fact, in [65] it was shown that the guarded fragment has the finite-model property, which is known not to hold for SHIQ [4, Chapter 2], thus suggesting that other mechanisms are necessary for handling the latter logic.

SHIQ can easily be embedded into the two-variable fragment of first-order logic with counting quantifiers C2. This fragment was shown to be decidable in [54, 100], and a decision procedure based on a combination of resolution and integer programming was presented in [110]. However, deciding satisfiability of C2 is NExpTime-complete [100], whereas SHIQ is an ExpTime-complete logic [144]. Thus, the decision procedure from [110] introduces an unnecessary overhead for SHIQ. Furthermore, we do not see how to use this procedure to derive the desired reduction to disjunctive datalog.

The decomposition rule is closely related to the structural transformation [108, 99, 8]. However, the structural transformation is usually applied as a preprocessing step and not during the theorem proving process. In [37] and [118], splitting by propositional symbols was proposed, which allows splitting variable-disjoint subsets of a clause and connecting them by a propositional symbol. Finally, a separation rule was used to decide fluted logic in [128]. It was shown that resolution remains complete if the separation rule is applied a finite number of times during saturation. In contrast to these related approaches, our rule can decompose complex terms. Moreover, we demonstrate compatibility of the decomposition rule with the standard redundancy notion. Finally, extending basic superposition with decomposition is not trivial, due to the nonstandard approach to lifting employed by basic superposition.

Chapter 6

Reasoning with a Concrete Domain

As argued in [5], representing concrete data is essential for practical knowledge representation systems. However, the existing algorithms for reasoning with concrete domains from [5, 70, 62, 87] mainly operate in tableau and automata frameworks, which nondeterministically check satisfiability of conjunctions of ground atoms. In contrast, the algorithm from Chapter 5 is based on resolution, and it works with conjunctions of nonground disjunctions. Hence, extending the results from Chapter 5 with concrete domain reasoning is not trivial. In order to support the description logic SHIQ(D), in this chapter we present two new results.
First, in Section 6.1 we present a general approach for combining concrete domain reasoning with clausal calculi. The approach consists of two steps. A c-factoring preprocessing transformation is applied first to bring a clause set into a suitable form. Next, we introduce a concrete domain resolution inference rule, for which we show soundness and completeness when combined with a clausal calculus whose completeness proof is based on the model generation method. Note that this is the standard technique for proving completeness of many state-of-the-art calculi, such as ordered resolution [13], basic superposition [14], or ordered chaining [11], so concrete domain resolution is applicable to a wide range of calculi.
Second, in Section 6.2 we apply these techniques to the algorithm from Chapter 5, and thus obtain a procedure for deciding satisfiability of SHIQ(D) knowledge bases. Assuming a bound on the arity of concrete predicates and an exponential procedure for checking D-satisfiability of conjunctions of concrete predicates, extending the logic does not increase the complexity of reasoning—that is, it remains in ExpTime.
The results from this chapter were published in [150]. Due to space limitations, there we did not consider the distinction between D- and d-satisfiability, as discussed in Subsection 6.1.2 (for SHIQ(D) without ABox assertions of the form ¬T(a, b^c), these two notions coincide).


6.1 Resolution with a Concrete Domain

In this section we develop a general algorithm for reasoning in first-order logic extended with an admissible concrete domain.

6.1.1 Preliminaries

The key step in computing the set of clauses Cls(ϕ) from a first-order formula ϕ is to skolemize existential quantifiers, as explained in Section 2.1. Similarly, to check D-satisfiability of ϕ, skolemization can be applied as well, since, as shown by the following lemma, it does not affect D-satisfiability:

Lemma 6.1.1. A formula ϕ is D-satisfiable if and only if sk(ϕ) is D-satisfiable.

Proof. The proof is identical to the case of ordinary satisfiability presented in [46]: given a D-model I of ϕ, one can build a D-model I′ of sk(ϕ) by extending I with appropriate interpretations of the new general function symbols. Conversely, the new general function symbols ensure the existence of the existentially implied individuals.

Hence, to check D-satisfiability of a formula ϕ, we first compute a D-equisatisfiable set of clauses N = Cls(ϕ). Furthermore, since the concrete domain D is admissible, we replace each literal ¬d(t) with d̄(t), where d̄ is the concrete predicate complementary to d. Thus, we assume without loss of generality that concrete domain predicates occur only in positive literals.
Checking satisfiability of a set of clauses N in an arbitrary model is problematic, because one would have to consider arbitrary domains. In the case of first-order logic without concrete domains, this can conveniently be avoided by considering only Herbrand models. For the case of first-order logic extended with a concrete domain, we introduce the notion of Herbrand D-interpretations, as follows:

Definition 6.1.2. For a function δ : HU_c → ∆_D and a ground formula ϕ, δ(ϕ) is the ground formula obtained by replacing each concrete term t^c with δ(t^c). The extension of δ to sets of formulae is defined by applying δ to each set member.
A Herbrand D-interpretation over a signature Σ is a pair (I, δ), where I is a classical Herbrand interpretation over Σ and δ : HU_c → ∆_D is an assignment of concrete individuals to the concrete terms of HU_c. A Herbrand D-interpretation is well-formed if the following two conditions are true:

• si ≈ ti ∈ I for 1 ≤ i ≤ n imply δ(f^c(s1, ..., sn)) = δ(f^c(t1, ..., tn)), for each general function symbol f^c;

• δ(t) ∈ d^D for each concrete literal d(t) ∈ I.

A ground clause C^G is D-satisfied in a Herbrand D-interpretation (I, δ), written (I, δ) |=_D C^G, if (I, δ) is well-formed and δ(C^G) is true in δ(I). A nonground clause C is D-satisfied in (I, δ) if all its ground instances are D-satisfied in (I, δ). The notions of D-models, D-satisfiability, and so on extend to Herbrand D-interpretations as usual.

There is a slight problem with Herbrand D-models as defined in the previous definition. Namely, Definition 6.1.2 does not ensure that the number of constants in the Herbrand universe matches the number of objects in ∆_D. Consider the set of clauses N = {=1(x)} with the concrete domain D such that ∆_D = {1, 2}. Obviously, N is D-unsatisfiable (since 2 is not equal to 1); however, the Herbrand D-interpretation defined by I = {=1(a)} and δ(a) = 1 is a Herbrand D-model of N. This discrepancy arises because I does not contain a distinct constant for each element of ∆_D. To ensure that considering only Herbrand models does not affect satisfiability, we append to N the (possibly infinite) set of clauses ℵ^D = {=α(aα) | α ∈ ∆_D}, where the aα are new constants (note that the notion of admissibility from Definition 3.2.1 ensures that the predicates =α exist). A set of clauses N that contains ℵ^D is said to be D-extended. We now show that, for D-extended clause sets, we can consider D-satisfiability only in Herbrand D-models.

Lemma 6.1.3. A set of D-extended clauses N is D-satisfiable if and only if a well-formed Herbrand D-model of N exists.

Proof. (⇐) Let (I, δ) be a Herbrand D-model of N. We construct a model I′ by setting the domain D_r = HU_r for each sort r ≠ c, D_c = ∆_D, t^{I′} = t for each term t of a sort other than c, (f^c)^{I′}(t1, ..., tn) = δ(f^c(t1, ..., tn)), and (t1^{I′}, ..., tn^{I′}) ∈ P^{I′} if and only if P(t1, ..., tn) ∈ I for a predicate P. Because N is D-extended, each object from ∆_D is represented as a constant in HU_c, so I′ is a D-model of N.
(⇒) Let N be D-satisfiable in a D-model I. Let I′ be the interpretation containing those ground atoms A from the Herbrand base of N that are true in I. Furthermore, for each ground concrete term t^c ∈ HU_c, let δ(t^c) = (t^c)^I. In the same way as for first-order logic without concrete domains in [46], I′ is a classical Herbrand model of N, and (I′, δ) is a D-model of N.

6.1.2 d-Satisfiability

The task of finding a Herbrand D-model (I, δ) of a set of clauses N essentially consists of two subtasks. The first one is to ensure that a Herbrand model I exists; this can be done using resolution in the standard way. The second one is to ensure that an appropriate assignment δ exists. To solve the latter problem, we first consider the simpler task of finding a δ that only satisfies the concrete predicates.

Definition 6.1.4. A ground clause C^G is d-satisfied in a Herbrand D-interpretation (I, δ), written (I, δ) |=_d C^G, if and only if (I, δ) is well-formed and C^G is classically satisfied in I (that is, it is satisfied without taking special care of concrete individuals). A nonground clause C is d-satisfied in (I, δ) if all ground instances of C are d-satisfied in (I, δ). Other notions are defined analogously.

Note that d-satisfiability is a strictly weaker notion than D-satisfiability: if N is D-satisfiable, it is obviously d-satisfiable, but the converse does not hold. For example, consider a set of clauses N = {R(a, b^c), ¬R(a, c^c), d(b^c, c^c)} and a concrete domain D such that d^D = {(1, 1)}. Obviously, I = {R(a, b^c), d(b^c, c^c)} is a classical Herbrand model of N; furthermore, for a function δ such that δ(b^c) = δ(c^c) = 1, we have (δ(b^c), δ(c^c)) ∈ d^D, so N is d-satisfiable. However, N is not D-satisfiable: δ(I) = {R(a, 1), d(1, 1)}, but then δ(R(a, c^c)) is true in δ(I), so the clause δ(¬R(a, c^c)) is not true in δ(I). Intuitively, d-satisfiability ensures consistency of literals containing concrete domain predicates, but not of literals with ordinary predicates containing both concrete and abstract terms. The relationship between d- and D-satisfiability can be captured using c-factors, introduced below. Please keep in mind that, as discussed in Subsection 3.2.1, s^c 6≈D t^c is a positive literal with a concrete predicate 6≈D.

Definition 6.1.5. Let C be a clause containing a literal ¬A. The c-factor of ¬A w.r.t. C is the disjunction

¬A[x1^c]p1 ... [xn^c]pn ∨ x1^c 6≈D A|p1 ∨ ... ∨ xn^c 6≈D A|pn

where the xi^c are globally new variables, and p1, ..., pn are exactly those positions in A such that sort(A|pi) = c and at least one condition from the following list holds:

• A|pi is not a variable;

• A|pi is a variable occurring in some negative literal in C \ {¬A};

• A|pi is a variable occurring in ¬A at some position other than pi.

A clause C′ is a c-factor of C if no literal from C′ has a c-factor w.r.t. C′, and if a sequence of clauses C = C0, ..., Cn = C′ exists such that, for i ≥ 1, Ci is obtained from Ci−1 by replacing a literal Li ∈ Ci−1 with its c-factor w.r.t. Ci−1. A c-factor of a set of clauses N is obtained from N by replacing each clause C ∈ N with an (arbitrary) c-factor of C. A clause (clause set) is said to be c-factored if it is equal to one of its c-factors.

Note that a clause can have several c-factors, depending on the order in which the literals are c-factored. For example, the clause ¬R(x, y^c) ∨ ¬S(x, y^c) ∨ T(x, y^c) has these two c-factors:

¬R(x, z^c) ∨ z^c 6≈D y^c ∨ ¬S(x, y^c) ∨ T(x, y^c)
¬R(x, y^c) ∨ ¬S(x, z^c) ∨ z^c 6≈D y^c ∨ T(x, y^c)

In the remaining sections, we can pick any c-factor of N; that is, an arbitrary c-factor of C can be used to replace a clause C ∈ N. Note that, for a c-factored clause C and for each literal ¬A ∈ C, if sort(A|p) = c, then A|p is a variable that can occur in C \ {¬A} only in a positive literal. As shown by the following lemma, c-factoring modifies the set of clauses such that d-satisfiability ensures D-satisfiability:

Lemma 6.1.6. Let N be a D-extended set of clauses, and N′ a c-factor of N. Then, the following claims hold:

• N is D-satisfiable if and only if N′ is D-satisfiable.
• N′ is d-satisfiable if and only if it is D-satisfiable.

Proof. For the first claim, observe that the literal A is equivalent to the following formula:

∃x1^c, ..., xn^c : A[x1^c]p1 ... [xn^c]pn ∧ x1^c ≈D A|p1 ∧ ... ∧ xn^c ≈D A|pn

Moreover, the c-factor of ¬A is obtained from the previous formula by applying the de Morgan laws, so ¬A and its c-factor are equivalent. Therefore, N and N′ are equivalent, so they are D-equisatisfiable.
For the second claim, the (⇐) direction follows trivially from the definitions of D- and d-satisfiability. For the (⇒) direction, let I be a classical Herbrand model of N′, and δ an assignment satisfying Definition 6.1.4. If (I, δ) is not a D-model of N′, then there is a clause C ∈ N′ with a ground instance Cτ that is true in I, but for which δ(Cτ) is false in δ(I). For each positive literal A ∈ C, if Aτ ∈ I, then δ(Aτ) is true in δ(I), so C must be of the form D ∨ ¬A1 ∨ ... ∨ ¬An, where Dτ is false in I and, for all i, Aiτ ∉ I but δ(Aiτ) is true in δ(I). The latter is the case only if there are A′i ∈ I such that δ(A′i) = δ(Aiτ). Let pij be the positions in Ai such that sort(Ai|pij) = c. Since each Ai is c-factored w.r.t. C, then, for all j, Ai|pij is a variable xij^c occurring in C \ {¬Ai} only in positive literals. Let σ be a ground substitution that is identical to τ except for the mappings xij^c ↦ A′i|pij. Obviously, A′i = Aiσ. Since I is a model of N′, Dσ ∨ ¬A1σ ∨ ... ∨ ¬Anσ is true in I; this is possible only if Dσ is true in I. Since Dτ is false in I, a literal L′ ∈ D must exist such that L′σ is true in I, but L′τ is false in I. Since, for each i, all variables of sort c from ¬Ai occur only in positive literals of C, the truth value of all negative literals in Dσ and Dτ is identical, so L′ must be a positive literal. However, since δ(L′σ) is true in δ(I) and δ(L′σ) = δ(L′τ), we have that δ(L′τ) is true in δ(I), which contradicts the assumption that δ(Cτ) is false in δ(I).

Intuitively, c-factoring allows us to move the concrete terms from literals without concrete predicates into concrete inequalities. In Subsection 6.1.3, we present a procedure for checking d-satisfiability of a set of ground clauses, which we lift to nonground clauses in Subsections 6.1.4 and 6.1.5. Combined with c-factoring, this yields a refutation procedure for D-satisfiability.
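To make the c-factoring transformation concrete, here is a small Python sketch, given only for illustration; the clause representation (lists of literals with polarity, predicate, and argument terms carrying explicit sort tags) and the handling of only top-level argument occurrences are simplifying assumptions. Running it on ¬R(x, y^c) ∨ ¬S(x, y^c) ∨ T(x, y^c) reproduces the first of the two c-factors shown above, up to the name of the fresh variable.

```python
import itertools

# Terms are tuples ("var", name, sort) or ("fn", name, sort, subterms);
# literals are tuples (positive, predicate, args).
def is_var(t): return t[0] == "var"
def sort_of(t): return t[2]

_fresh = itertools.count(1)

def c_factor_literal(neg_lit, clause):
    """c-factor of a single negative literal w.r.t. the clause containing it."""
    _, pred, args = neg_lit
    other_negatives = [l for l in clause if l is not neg_lit and not l[0]]
    new_args, inequalities = list(args), []
    for p, t in enumerate(args):
        if sort_of(t) != "c":
            continue
        in_other_negative = is_var(t) and any(t in a for (_, _, a) in other_negatives)
        repeated_here = is_var(t) and args.count(t) > 1
        if (not is_var(t)) or in_other_negative or repeated_here:
            x = ("var", f"z{next(_fresh)}", "c")          # globally new variable
            new_args[p] = x
            inequalities.append((True, "6≈D", (x, t)))    # x 6≈D A|p
    return [(False, pred, tuple(new_args))] + inequalities

def c_factor_clause(clause):
    """Replace negative literals by their c-factors until none qualifies."""
    clause = list(clause)
    changed = True
    while changed:
        changed = False
        for i, lit in enumerate(clause):
            if lit[0]:
                continue
            factored = c_factor_literal(lit, clause)
            if len(factored) > 1:                         # some position was factored
                clause = clause[:i] + factored + clause[i + 1:]
                changed = True
                break
    return clause

# The example from the text: ¬R(x, y^c) ∨ ¬S(x, y^c) ∨ T(x, y^c).
x, yc = ("var", "x", "a"), ("var", "y", "c")
print(c_factor_clause([(False, "R", (x, yc)), (False, "S", (x, yc)), (True, "T", (x, yc))]))
```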

6.1.3 Concrete Domain Resolution with Ground Clauses

We now develop the ground concrete domain resolution calculus, G^D for short, for checking d-satisfiability of a set of clauses, where D is an admissible concrete domain. In order to keep the presentation from becoming too technical, we add the concrete domain resolution rule only to the ordered resolution calculus [13] (see Section 2.4), and argue later that the rule can be combined with other calculi as well. As for ordinary resolution, G^D is parameterized with an admissible ordering ≻ on literals, which is extended to clauses by a multiset extension (see Section 2.4).

Definition 6.1.7. Let S = {di(ti)} be a set of literals, where ti is a vector of terms ti1, ..., tik. Then, Ŝ is the conjunction C = ⋀ di(xi), obtained from S by replacing each occurrence of a term with the same variable such that different terms are replaced with distinct variables. For two conjunctions C1 and C2, we write C1 ≡ C2 if they are equivalent up to variable renaming.
A set S = {di(ti)} of positive concrete literals is a D-constraint if Ŝ is not D-satisfiable. A D-constraint S is minimal if Ŝ′ is D-satisfiable for each S′ ⊊ S; S is connected if it cannot be decomposed into two disjoint nonempty subsets S1 and S2 not sharing a common term (S1 and S2 do not share a common term if, for all di(ti) ∈ S1 and dj(tj) ∈ S2, we have ti ∩ tj = ∅).

Lemma 6.1.8. Each minimal D-constraint S is connected.

Proof. Assume that S is a D-constraint, but that it is not connected. Hence, S can be decomposed into subsets S1 and S2 not sharing a common term. Since S is a minimal D-constraint, Ŝ1 and Ŝ2 are D-satisfiable. However, since Ŝ1 and Ŝ2 do not have a common variable, Ŝ1 ∧ Ŝ2 = Ŝ is D-satisfiable as well, which is a contradiction.

The ground concrete domain resolution calculus G^D consists of the following inference rules:

Positive factoring:
    C ∨ A ∨ ... ∨ A
    ─────────────────
    C ∨ A

    where (i) A is strictly maximal with respect to C.

Ordered resolution:
    C ∨ A        D ∨ ¬A
    ─────────────────────
    C ∨ D

    where (i) A is strictly maximal with respect to C, (ii) ¬A is maximal with respect to D.

Concrete domain resolution:
    C1 ∨ d1(t1)    ...    Cn ∨ dn(tn)
    ───────────────────────────────────
    C1 ∨ ... ∨ Cn

    where (i) di(ti) are strictly maximal with respect to Ci, (ii) S = {di(ti)} is a minimal D-constraint.

In G^D, the clauses C ∨ A ∨ ... ∨ A and D ∨ ¬A are called the main premises, whereas the clauses C ∨ A and Ci ∨ di(ti) are called the side premises. Note that, under this definition, the concrete domain resolution rule does not have a main premise.
It is well known that effective redundancy elimination criteria are necessary for theorem proving to be applicable in practice. A powerful standard notion of redundancy was introduced in [13]. We adapt this notion slightly to take into account that the concrete domain resolution rule does not have a main premise.
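Before turning to redundancy, note that condition (ii) of the concrete domain resolution rule has a direct operational reading: given an oracle for D-satisfiability of conjunctions of concrete literals, a set of literals is a minimal D-constraint exactly if its abstraction Ŝ is unsatisfiable while the abstraction of every proper subset is satisfiable. The following Python sketch spells this out; it is only an illustration under these assumptions, and the oracle shown is a toy placeholder.

```python
from itertools import combinations

def abstraction(literals):
    """Compute S-hat: replace each distinct term by a distinct variable, and
    every occurrence of the same term by the same variable (Definition 6.1.7)."""
    var_of, abstract = {}, []
    for pred, terms in literals:
        args = []
        for t in terms:
            if t not in var_of:
                var_of[t] = f"x{len(var_of) + 1}"
            args.append(var_of[t])
        abstract.append((pred, tuple(args)))
    return abstract

def is_d_constraint(literals, d_satisfiable):
    """S is a D-constraint iff its abstraction is not D-satisfiable."""
    return not d_satisfiable(abstraction(literals))

def is_minimal_d_constraint(literals, d_satisfiable):
    """Minimal: S is a D-constraint, but no proper nonempty subset is."""
    if not is_d_constraint(literals, d_satisfiable):
        return False
    return all(
        not is_d_constraint(list(subset), d_satisfiable)
        for k in range(1, len(literals))
        for subset in combinations(literals, k)
    )

# Toy oracle: a conjunction is unsatisfiable only if it contains a literal x < x.
def toy_oracle(conjunction):
    return all(not (pred == "<" and args[0] == args[1]) for pred, args in conjunction)

S = [("<", ("f(a)", "f(a)"))]                    # abstraction is x1 < x1
print(is_minimal_d_constraint(S, toy_oracle))    # True
```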

Definition 6.1.9. Let N be a set of ground clauses. A ground clause C is redundant in N if clauses Di ∈ N exist such that C ≻ Di and D1, ..., Dm |= C. A ground

inference ξ with side premises Ci and a conclusion D is redundant in N if clauses Di ∈ N exist such that C1, ..., Cn, D1, ..., Dm |= D; if ξ has a main premise C, then C ≻ Di is required in addition.
We now prove the soundness and completeness of G^D under the standard notion of redundancy.

Lemma 6.1.10 (Soundness). Let N be a set of ground clauses, I a d-model of N, and N′ = N ∪ {C}, where C is the conclusion of an inference by G^D with premises from N. Then, I is a d-model of N′.

Proof. For an inference by positive factoring or ordered resolution, the claim is trivial and is shown in the same way as in [13]. Let C be obtained by the concrete domain resolution rule, with S as in the rule definition. Since S is a D-constraint, a d-model (I, δ) of N can exist only if there is a literal di(ti) ∈ S such that di(ti) ∉ I. Since Ci ∨ di(ti) is by assumption true in I, some literal from Ci is true in I. Since Ci ⊆ C, we have that C is true in I as well.

Lemma 6.1.11 (Completeness). Let N be a set of ground clauses such that each inference by G^D from premises in N is redundant in N. If N does not contain the empty clause, then N is d-satisfiable.

Proof. We extend the model building method from [13] to handle the concrete domain resolution rule. For a set of ground clauses N, we define an interpretation I by induction on the clause ordering as follows: for a clause C, we set I_C = ⋃_{C ≻ D} ε_D, where ε_D = {A} if (i) D ∈ N, (ii) D is of the form D′ ∨ A such that A is strictly maximal with respect to D′, and (iii) D is false in I_D; otherwise, ε_D = ∅. Let I = ⋃_{D ∈ N} ε_D. A clause D such that ε_D = {A} is called productive, and it is said to produce the atom A in I. Before proving the lemma, we show the following three properties.
Invariant (*): if C is false in I_D ∪ ε_D for C ⪯ D, then C is false in I. Since C is false in I_D ∪ ε_D, all negative literals from C are false in I_D ∪ ε_D, and, since I_D ∪ ε_D ⊆ I, all negative literals from C are false in I as well. Furthermore, since ¬A ≻ A and there is no literal between ¬A and A, any atom produced by a clause D′ ≻ D is larger than any literal occurring in C, so no clause greater than C can produce an atom that would make C true.
Invariant (**): if C is true in I_C ∪ ε_C, then C is true in I. If C is true in I_C ∪ ε_C because some positive literal is true in I_C ∪ ε_C, since I_C ∪ ε_C ⊆ I, the clause C is true in I as well. Otherwise, C is true in I_C ∪ ε_C because some negative literal ¬A is true in I_C ∪ ε_C. Since ¬A ≻ A and there is no literal between ¬A and A, any atom produced by a clause D ≻ C is larger than any literal occurring in C. Hence, no such clause D can produce A, so ¬A is true in I as well. We often use this invariant in its contrapositive form: if C is false in I, then it is false in I_C ∪ ε_C as well.
Property (***): if an inference ξ with a main premise C, side premises Ci, and a conclusion D is redundant in N, then there are clauses Di ∈ N that are not redundant in N such that D1, ..., Dm, C1, ..., Cn |= D and C ≻ Di. If ξ is redundant, by

definition of the standard notion of redundancy, there are clauses D′j ∈ N such that D′1, ..., D′m, C1, ..., Cn |= D and C ≻ D′j. Now if some D′j is redundant, then there are clauses D″k such that D″1, ..., D″m′ |= D′j and D′j ≻ D″k. Since ≻ is well-founded, this process can be continued recursively until we obtain the set of smallest nonredundant clauses for which (***) obviously holds. This property holds analogously if ξ does not have a main premise.

We now prove the claim of Lemma 6.1.11 by contradiction: let us assume that all inferences by G^D from premises in N are redundant in N, but that no δ exists such that (I, δ) is a d-model of N. There can be two causes for that:

• There is a clause C ∈ N that is false in I; such a clause is called a counterexample for I. Since ≻ is well-founded and total, we can assume without loss of generality that C is the smallest counterexample. C is obviously not productive, since all productive clauses are true in I. C can be nonproductive and false in I if it has one of the following two forms:

– C = C′ ∨ A ∨ ... ∨ A. Then, C is not productive since A is not strictly maximal in C. Since C is false in I, by (**) it is false in I_C ∪ ε_C. Hence, C′ is false in I_C ∪ ε_C, and, since C′ ≺ C, by (*) C′ is false in I. Since N is saturated, the inference by positive factoring resulting in D = C′ ∨ A is redundant in N. D is obviously false in I. By the fact that N is saturated and by (***), clauses Di ∈ N exist such that C ≻ Di and D1, ..., Dn |= D. Since C is the smallest counterexample, all Di are true in I, but then D is true in I as well, which is a contradiction.
– C = C′ ∨ ¬A. Since C is false in I, we have A ∈ I, where A is produced by a smaller clause D = D′ ∨ A. Similarly as in the previous case, C′ is false in I. Since D is productive, D is false in I_D, and, since A is strictly maximal w.r.t. D′, D′ is false in I_D ∪ ε_D as well. Since D′ ≺ D, by (*) D′ is false in I. Since N is saturated, the inference by ordered resolution resulting in E = C′ ∨ D′ is redundant in N. E is obviously false in I. Because N is saturated and by (***), clauses Di ∈ N exist such that C ≻ Di and D1, ..., Dn, D′ ∨ A |= E. Since C is the smallest counterexample, all Di are true in I, and, since D′ ∨ A is productive, it is also true in I. Hence, E is true in I, which is a contradiction.

• All clauses from N are true in I, but I contains a set of concrete domain literals S = {di(ti)} such that Ŝ is D-unsatisfiable. We can assume without loss of generality that S is minimal. The literals from S must have been produced

by clauses Ei = Ci ∨ di(ti) ∈ N, where Ci is false in I_{Ei} ∪ ε_{Ei}. For each i, we conclude that Ci is false in I as in the case of ordered resolution. Since N is saturated, the inference by concrete domain resolution from the Ei resulting in D = C1 ∨ ... ∨ Cn is redundant in N. Obviously, D is false in I. Since the

inference is redundant, by property (***), nonredundant clauses Di ∈ N exist such that D1,...,Dm,C1 ∨d1(t1),...,Cn ∨dn(tn) |= D. All Di and Ci ∨di(ti) are by assumption true in I, implying that D is true in I, which is a contradiction.

Hence, an assignment δ must exist such that (I, δ) is a d-model of N (because we do not consider theories with equality, the first condition of Definition 6.1.2 related to ≈ is trivially satisfied), so N is d-satisfiable.

A fair derivation by G^D from a set of clauses N with a limit N∞ is defined as usual (see Section 2.4). By Lemmata 6.1.10 and 6.1.11, we get the following result:

Theorem 6.1.12. The set of ground clauses N is d-unsatisfiable if and only if the limit N∞ of a fair derivation by G^D contains the empty clause.

6.1.4 Most General Partitioning Unifiers

Lifting the concrete domain resolution rule to general clauses is not trivial, because unification can only partly guide the rule application. Consider, for example, the set of clauses N = {d1(x1, y1), d2(x2, y2)}. N has ground instances d1(a1, b1) and d2(a2, b2), which do not share a common term. Hence, a conjunction consisting of these literals is not connected, so, by the contrapositive of Lemma 6.1.8, it cannot be minimal, and the conditions of the concrete domain resolution rule are not satisfied. However, the clauses d1(a, b1) and d2(a, b2) are also ground instances of N, and they do share a common term. Hence, a conjunction of these literals is connected, so the conditions of the concrete domain resolution rule could be satisfied. This is so because it is possible to unify x1 and x2 from d1(x1, y1) and d2(x2, y2). Another possibility is to unify x2 and y1. Even for a set consisting of a single clause, such as M = {d(x, y)}, it is possible to obtain a D-constraint d(x, x) by unifying x and y. These examples show that, to check all potential D-constraints at the ground level, one has to consider all possible substitutions that produce a minimal D-constraint at the nonground level. We formalize this idea in the following definition.

Definition 6.1.13. Let S = {di(ti)} be a set of positive concrete domain literals. A substitution σ is a partitioning unifier of S if the set Sσ is connected. Furthermore, σ is a most general partitioning unifier if, for any partitioning unifier θ such that Ŝσ ≡ Ŝθ, a substitution η exists such that θ = ση. With MGPU(S) we denote the set of all most general partitioning unifiers of S, unique up to variable renaming.

If no terms from literals in S are unifiable, a most general partitioning unifier of S does not exist. Furthermore, the previous example shows that S can have several most general partitioning unifiers. However, for a given conjunction of concrete literals Ŝσ, all most general partitioning unifiers are identical up to variable renaming, as demonstrated next.
Let S = {di(ti)} be a set of positive concrete domain literals, and C a conjunction over literals in S, obtained by replacing the terms of each di(ti) with arbitrary variables

(it is not necessary to use different variables in C for different terms from S, but the same term in S should always be replaced with the same variable in C). For such a C, let m be the number of distinct variables in C; for each such variable xi, 1 ≤ i ≤ m, let T_xi be a list of all terms occurring in a literal in S at a position corresponding to an occurrence of xi in C; finally, let n = max |T_xi|. With S_C we denote the set of terms tj = f(s_x1^j, ..., s_xm^j), where s_xi^j is the j-th term of T_xi if j ≤ |T_xi|, and a fresh variable if j > |T_xi|, for 1 ≤ j ≤ n. Most general partitioning unifiers of S and most general unifiers of S_C are closely related, as shown by the following lemma.

Lemma 6.1.14. Let θ be a partitioning unifier of a set of concrete domain literals S = {di(ti)}, and let C = Ŝθ. Then, the substitution σ = MGU(S_C) is the most general partitioning unifier of S such that Ŝσ ≡ C, and it is unique up to variable renaming.

Proof. To obtain the variable xi in C, θ must be such that s_xi^1 θ = ... = s_xi^n θ = T_xi θ.

Furthermore, xi ≠ xj for i ≠ j, so T_xi θ ≠ T_xj θ. Because of the first property, θ is obviously a unifier of S_C, so σ = MGU(S_C) exists, and it is unique up to variable renaming [7]. Furthermore, σ is the most general unifier of S_C, so a substitution η exists such that θ = ση. Hence, it is impossible that T_xi σ = T_xj σ and T_xi ση ≠ T_xj ση. Thus, s_xi^1 σ = ... = s_xi^n σ = T_xi σ and T_xi σ ≠ T_xj σ for i ≠ j, so Ŝσ ≡ C.

Lemma 6.1.15. For a set of concrete domain literals S = {di(ti)}, MGPU(S) contains exactly all MGU(S_C), for each connected conjunction C over literals of S.

Proof. For σ ∈ MGPU(S), C = Ŝσ is obviously a connected conjunction over literals of S satisfying the conditions of Lemma 6.1.14, so σ is equivalent to MGU(S_C) up to variable renaming. Conversely, let C be a connected conjunction over literals of S. Now σ = MGU(S_C) is a partitioning unifier of S. Let C′ = Ŝσ. Observe that C ≡ C′ does not necessarily hold: it is possible that T_xi σ = T_xj σ for i ≠ j. However, C′ is a connected conjunction over concrete domain literals satisfying the conditions of Lemma 6.1.14, so MGU(S_{C′}) exists, and it is a most general partitioning unifier of S.

Hence, Lemma 6.1.15 gives a brute-force algorithm for computing MGPU(S): one should systematically enumerate all connected conjunctions C over literals in S and compute MGU(S_C). The main performance drawback of this algorithm is that all such conjunctions C must be enumerated. For n literals in S of maximal arity m, the number of possible assignments of variables in C is bounded by (nm)^{nm}, which is obviously exponential. However, for certain logics, it is possible to construct a specialized, but much more efficient algorithm. In Section 6.2, we present such an algorithm applicable to SHIQ(D).
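A deliberately naive realization of this brute-force enumeration is sketched below in Python. It is only an illustration: the term representation and helper functions are assumptions, duplicates up to variable renaming are not removed, and the enumeration proceeds over partitions of the terms occurring in S, unifying the terms within each block and keeping those unifiers σ for which Sσ is connected, in the spirit of Lemma 6.1.15.

```python
from itertools import chain

# Terms: variables are strings ("X1", "Y"); constants and compound terms are
# tuples whose first element is the function symbol, e.g. ("a",) or ("f", "X").
def is_variable(t): return isinstance(t, str)

def substitute(t, sub):
    if is_variable(t):
        return substitute(sub[t], sub) if t in sub else t
    return (t[0],) + tuple(substitute(a, sub) for a in t[1:])

def occurs(v, t, sub):
    t = substitute(t, sub)
    return v == t if is_variable(t) else any(occurs(v, a, sub) for a in t[1:])

def unify(s, t, sub):
    """Standard syntactic unification; returns an extended substitution or None."""
    s, t = substitute(s, sub), substitute(t, sub)
    if s == t:
        return sub
    if is_variable(s):
        return None if occurs(s, t, sub) else {**sub, s: t}
    if is_variable(t):
        return unify(t, s, sub)
    if s[0] != t[0] or len(s) != len(t):
        return None
    for a, b in zip(s[1:], t[1:]):
        sub = unify(a, b, sub)
        if sub is None:
            return None
    return sub

def partitions(items):
    """All set partitions of a list (exponentially many, as noted above)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield p + [[first]]

def connected(literals, sub):
    """Sσ is connected if its literals form one component under shared terms."""
    inst = [tuple(substitute(t, sub) for t in args) for _, args in literals]
    comp = list(range(len(inst)))
    def find(i):
        while comp[i] != i:
            i = comp[i]
        return i
    for i in range(len(inst)):
        for j in range(i + 1, len(inst)):
            if set(inst[i]) & set(inst[j]):
                comp[find(i)] = find(j)
    return len({find(i) for i in range(len(inst))}) <= 1

def brute_force_mgpu(literals):
    """Enumerate partitions of the terms of S, unify within each block, and keep
    the unifiers for which Sσ is connected (cf. Lemma 6.1.15)."""
    terms = list(dict.fromkeys(chain.from_iterable(args for _, args in literals)))
    unifiers = []
    for partition in partitions(terms):
        sub = {}
        for block in partition:
            for t in block[1:]:
                sub = unify(block[0], t, sub)
                if sub is None:
                    break
            if sub is None:
                break
        if sub is not None and connected(literals, sub):
            unifiers.append(sub)
    return unifiers

# The example from the text: S = {d1(x1, y1), d2(x2, y2)}.
print(brute_force_mgpu([("d1", ("X1", "Y1")), ("d2", ("X2", "Y2"))]))
```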

6.1.5 Concrete Domain Resolution with General Clauses

As usual in a resolution setting, we assume that no two premises of an inference rule have any variables in common. The concrete domain resolution calculus, R^D for short, consists of the following rules:

Positive factoring:
    C ∨ A ∨ B
    ───────────
    Cσ ∨ Aσ

    where (i) σ = MGU(A, B), (ii) Aσ is strictly maximal with respect to Cσ.

Ordered resolution:
    C ∨ A        D ∨ ¬B
    ─────────────────────
    Cσ ∨ Dσ

    where (i) σ = MGU(A, B), (ii) Aσ is strictly maximal with respect to Cσ, (iii) ¬Bσ is maximal with respect to Dσ.

Concrete domain resolution:
    C1 ∨ d1(t1)    ...    Cn ∨ dn(tn)
    ───────────────────────────────────
    C1σ ∨ ... ∨ Cnσ

    where (i) the clauses Ci ∨ di(ti) are not necessarily unique, (ii) for the set S = {di(ti)}, σ ∈ MGPU(S), (iii) di(ti)σ are strictly maximal with respect to Ciσ, (iv) Sσ is a minimal D-constraint.

We briefly comment on Constraint (i) of the concrete domain resolution rule. Consider a set of clauses N = {d(f^c(x), b^c), f^c(a) 6≈D f^c(c)} and d^D = {(1, 2)}. Then, S = {d(f^c(a), b^c), d(f^c(c), b^c), f^c(a) 6≈D f^c(c)} is a set of instances of N such that the conjunction Ŝ is D-unsatisfiable. Also, for each proper subset S′ ⊂ S, the conjunction Ŝ′ is D-satisfiable. Hence, unsatisfiability is detected only if several copies of d(f^c(x), b^c) are considered simultaneously in the concrete domain resolution rule. Without any assumptions on the nature of the concrete predicates, there is no upper bound on the number of copies that should be considered simultaneously. This property of the calculus obviously leads to undecidability in the general case. To obtain a decision procedure for SHIQ(D), in Section 6.2 we show that the number of copies that must be considered is bounded.
To prove the completeness of R^D, we now show that, for each ground derivation, there is a corresponding nonground derivation.

Lemma 6.1.16 (Lifting). Let N be a set of clauses and N^G the set of ground instances of N. For each ground inference ξ^G by G^D applicable to premises Ci^G ∈ N^G, there is an inference ξ by R^D applicable to premises Ci ∈ N such that ξ^G is an instance of ξ.

Proof. Let Ci^G be ground premises from N^G participating in ξ^G, resulting in a ground clause D^G. The ground inference ξ^G of G^D can be simulated by a corresponding nonground inference ξ where, for each ground premise Ci^G, we take the corresponding nonground premise Ci. Since all Ci are variable disjoint, there is a ground substitution τ such that Ci^G = Ciτ. Let us denote with D the result of ξ on the Ci. We now show that ξ is an inference of R^D.
Let ξ^G be an inference by positive factoring on ground literals Ai^G of C^G. Substitution τ is obviously a unifier for the corresponding nonground literals Ai of C. Since any unifier is an instance of the most general unifier σ of the Ai, a substitution η exists such that τ = ση. Furthermore, if Ai^G is strictly maximal with respect to C^G, since ≻

is a reduction ordering, the corresponding literal Ai is strictly maximal with respect to Cσ, so D^G = Dη. Hence, ξ is an inference of R^D. Similar reasoning applies in the case of ordered resolution.
Let ξ^G be an inference by concrete domain resolution, and let S = {di(ti)} be the set of corresponding nonground literals. Since Sτ is a minimal D-constraint, by Lemma 6.1.8, Sτ is connected, so τ is obviously a partitioning unifier of S. Then, by Lemma 6.1.14, a most general partitioning unifier σ exists such that τ = ση for some η and Ŝσ ≡ Ŝτ. Obviously, if Sτ is a minimal D-constraint, so is Sσ. Furthermore, if the literals from Sτ are strictly maximal with respect to Ci^G, since ≻ is a reduction ordering, the corresponding literals from Sσ are strictly maximal with respect to Ciσ, so D^G = Dη. Hence, ξ is an inference of R^D.

The notion of redundancy is lifted to the nonground case as usual [13]: a clause C (an inference ξ) is redundant in a set of clauses N if all ground instances of C (of ξ) are redundant in N^G. This is enough for soundness and completeness of R^D.

Theorem 6.1.17. A set of clauses N is d-unsatisfiable if and only if the limit N∞ of a fair derivation by R^D contains the empty clause.

Proof. A set N is d-unsatisfiable if and only if the set of its ground instances N^G is d-unsatisfiable. By Theorem 6.1.12, N^G is d-unsatisfiable if and only if there is a ground derivation N^G = N0^G, N1^G, ..., N∞^G, where the limit N∞^G contains the empty clause. By Lemma 6.1.16 and the definition of redundancy for nonground clauses, for each ground derivation, a nonground derivation N = N0, N1, ..., N∞ exists, in which each Ni^G is a subset of the set of ground instances of Ni. Hence, N∞^G contains the empty clause if and only if N∞ contains the empty clause, so the claim follows.

D-satisfiability of a set of clauses N can now be checked by the following steps:

• Extend N with ℵ^D;

• Compute N′—the c-factor of N ∪ ℵ^D;

• Check d-satisfiability of N′ by R^D.

By Lemma 6.1.6, N′ is d-satisfiable if and only if N ∪ ℵ^D is D-satisfiable, which is the case if and only if N is D-satisfiable by Lemma 6.1.3. A practical problem in applying R^D to N′ is that ℵ^D can be infinite. In general, this obviously leads to undecidability. However, we show in Section 6.2 that the clauses from ℵ^D are not needed to decide SHIQ(D).

6.1.6 Deleting D-Tautologies

In ordinary resolution, clauses that are true in any model can be removed without jeopardizing the completeness of the calculus. Removing such clauses is crucial for the practical applicability of resolution, because this drastically reduces the search space. To achieve the same effect for R^D, we now define the notion of D-tautologies—clauses that are true in any D-model due to properties of the concrete domain. We show that D-tautologies can be deleted eagerly in a saturation process while preserving completeness.

Definition 6.1.18. A literal d(t) is a D-tautology if d(x), the conjunction obtained from d(t) as in Definition 6.1.7, is satisfied for every assignment δ of the variables x to elements of ∆_D. A clause is a D-tautology if it contains a D-tautology literal.

For example, the clauses C ∨ f(x) ≈D f(x) and C ∨ ⊤D(x) are D-tautologies. Similarly, if ∆_D is the set of nonnegative integers, then the clause C ∨ ≥0(x) is also a D-tautology. We now show that such clauses are redundant and can be deleted in a saturation process.

Lemma 6.1.19. D-tautologies can be eagerly deleted in a saturation by R^D without losing completeness.

Proof. Let S = {di(ti)} be a minimal D-constraint. Obviously, S cannot contain a D-tautology literal d(t), since S \ {d(t)} would be an even smaller D-constraint. Hence, D-tautologies do not participate in inferences by the concrete domain resolution rule.
Let N be a set of ground clauses saturated under G^D not containing the empty clause. Furthermore, let I be a model obtained by applying the model building method from Lemma 6.1.11 to all clauses from N that are not D-tautologies, and let δ be an assignment of elements from ∆_D to elements of HU_c (such an assignment exists by Lemma 6.1.11, because N is saturated and does not contain the empty clause). Let I′ be the model obtained by adding d(t) to I for each D-tautology literal d(t) of a ground clause C ∈ N. Since all concrete domain literals occur positively in clauses in N, adding concrete literals to I cannot make any clause in N false. For each such literal, we have δ(t) ∈ d^D, so (I′, δ) is a Herbrand d-model of N. Effectively, a D-tautology C generates in I′ only D-tautology concrete literals that cannot participate in a concrete domain resolution rule; thus C can be deleted from N.
For the nonground calculus R^D, simply observe that, if a nonground literal d(t) is a D-tautology, then, for all ground substitutions τ, the literal d(t)τ is a D-tautology as well, so the lifting argument from Lemma 6.1.16 holds without change.

Note that a clause containing a complementary pair of concrete literals is not a D-tautology and should not be deleted. Consider the following d-unsatisfiable set of clauses (recall that ≈D and 6≈D are just special concrete domain predicates):

(6.1) a ≈D b ∨ a ≈D c

(6.2) b ≈D c

(6.3) a 6≈D b ∨ a 6≈D c

Let a > b > c > 6≈D > ≈D; the maximal literal in a clause is highlighted. Now d-unsatisfiability can be demonstrated as follows (the notation D(xx; yy) means that a clause is derived by concrete domain resolution from the clauses xx and yy):

(6.4) a ≈D c ∨ a 6≈D c    D(6.1;6.3)

(6.5) a ≈D c    D(6.1;6.2;6.4)

(6.6) a 6≈D c    D(6.2;6.3;6.5)
(6.7) □    D(6.5;6.6)

The clause (6.4) is derived because {a ≈D b, a 6≈D b} is a D-constraint; the clause (6.5) is derived because {a ≈D b, b ≈D c, a 6≈D c} is a D-constraint; the clause (6.6) is derived because {b ≈D c, a 6≈D b, a ≈D c} is a D-constraint; and the clause (6.7) is derived because {a ≈D c, a 6≈D c} is a D-constraint. Note that (6.4) contains a complementary pair of concrete literals, but it should not be deleted, as the empty clause cannot be derived without it. Intuitively, this clause is needed to produce the literal a 6≈D c in the model, which is then used in further applications of concrete domain resolution to detect inconsistency.

6.1.7 Combining Concrete Domains with Other Resolution Calculi

The proof of Lemma 6.1.6 is model-theoretic and does not depend on a calculus. Hence, c-factoring can be applied as a preprocessing step, regardless of the calculus. In order to make the presentation less technical, we considered the concrete domain resolution rule only in combination with ordered resolution. However, from the proof of Lemma 6.1.11, one may see that the concrete domain resolution rule is largely independent of the actual calculus, and can be combined with other calculi whose completeness proof is based on the model generation method [13].
To apply the concrete domain resolution rule to other calculi, the premises of the concrete domain resolution rule must be those clauses that have at least one productive ground instance. For ordered resolution with selection, this means that premises should not contain selected literals (since such clauses do not have productive ground instances). The usual arguments for the calculus under consideration show that, if N∞ does not contain the empty clause, one can generate an interpretation I using the model generation method such that all clauses from N∞ are true in I. Furthermore, the argument from the second part of Lemma 6.1.11 shows independently that, if all inferences by the concrete domain resolution rule are redundant in N∞, then there is an assignment δ such that (I, δ) is a d-model of N.
We denote with BS^D the extension of the basic superposition calculus with the concrete domain resolution rule. We stress that ≈D is a concrete predicate like any other. Hence, due to the encoding of atoms as E-terms (see Section 2.5), a literal s ◦ t with ◦ ∈ {≈D, 6≈D} is actually a shortcut for a positive literal (s ◦ t) ≈ T. Therefore, superposition and reflexivity resolution are not applicable to such literals; rather, they participate only in inferences by concrete domain resolution.

Theorem 6.1.20. Let N∞ be the set of closures obtained by saturating a set of closures N by BS^D. Then N is d-satisfiable if and only if N∞ does not contain the empty closure.

Proof. The inference rules of BS are sound, and the soundness of the concrete domain resolution rule follows exactly as in Lemma 6.1.10. For completeness, let us assume that N∞ does not contain the empty closure. By [14, 96], it is possible to generate a rewrite system R_N, which uniquely defines the Herbrand interpretation R_N^* such that R_N^* |= irred_{R_N}(N). Furthermore, by exactly the same argument as in Lemma 6.1.11, a function δ assigning concrete individuals to the concrete terms from irred_{R_N}(N) exists such that, for each d(t) ∈ R_N^*, we have δ(t) ∈ d^D.

However, (I, δ) is not necessarily a well-formed Herbrand d-model of irred_{R_N}(N), because it need not satisfy the first condition of Definition 6.1.2: it may be that we have si ≈ ti ∈ R_N^* for 1 ≤ i ≤ n, but, for some general function symbol f^c, we have δ(f^c(s1, ..., sn)) ≠ δ(f^c(t1, ..., tn)). However, consider an assignment δ′ defined as δ′(f^c(s1, ..., sn)) = δ(f^c(nf_{R_N}(s1), ..., nf_{R_N}(sn))). Obviously, δ′ satisfies the first condition of Definition 6.1.2. Furthermore, by the definition of R_N^*, if d(t) ∈ R_N^*, then d(nf_{R_N}(t)) ∈ R_N^*. Because δ fulfills the second condition of Definition 6.1.2, δ′ fulfills it as well. Hence, (R_N^*, δ′) is a d-model of irred_{R_N}(N). Finally, by the standard lifting argument for basic superposition, (R_N^*, δ′) is a d-model of N.
The proof of Theorem 6.1.20 ensures by a model-theoretic argument that the properties of the model related to equality (the first condition of Definition 6.1.2) are satisfied. To develop an intuition behind this result, we present an alternative, proof-theoretic argument. The first condition of Definition 6.1.2 is obviously fulfilled if, for each general function symbol f^c of arity n, we add the following closure:

(6.8) x1 6≈ y1 ∨ ... ∨ xn 6≈ yn ∨ f^c(x1, ..., xn) ≈D f^c(y1, ..., yn)

If we select all literals xi 6≈ yi, such a closure can only participate in a reflexivity resolution inference, producing a closure of the following form:

(6.9) f^c(x1, ..., xn) ≈D f^c(x1, ..., xn)

Such a closure is obviously a D-tautology, so by Lemma 6.1.19 it can be removed from the closure set. Effectively, the first condition of Definition 6.1.2 is always trivially satisfied; however, it is needed in Definition 6.1.2 to ensure that replacing subterms of concrete terms with equal terms has the usual semantics.

6.2 Deciding SHIQ(D)

In this section, we combine the concrete domain resolution rule from Section 6.1 with the algorithm from Chapter 5 to obtain a decision procedure for checking satisfiability of SHIQ(D) knowledge bases.

Table 6.1: Closures after Preprocessing Stemming from Concrete Datatypes

12  ¬T(x, y^c) ∨ U(x, y^c)
13  ⋁(¬)Ci(x) ∨ T(x, f^c(x))
14  ⋁(¬)Ci(x) ∨ d(f1^c(x), ..., fm^c(x))
15  ⋁(¬)Ci(x) ∨ fi^c(x) 6≈D fj^c(x)
16  ⋁(¬)Ci(x) ∨ ⋁_{i=1}^{n} ¬Ti(x, yi^c) ∨ d(y1^c, ..., yn^c)
17  ⋁(¬)Ci(x) ∨ ⋁_{i=1}^{n} ¬T(x, yi^c) ∨ ⋁_{i=1}^{n} ⋁_{j=i+1}^{n} yi^c ≈D yj^c
18  T(a, b^c)
19  ¬T(a, y^c) ∨ y^c 6≈D b^c
20  a^c ≈D b^c
21  a^c 6≈D b^c

The operator Ω from Section 5.2 can be used to eliminate transitivity axioms from a SHIQ(D) knowledge base KB by encoding it into an equisatisfiable knowledge base Ω(KB): concrete roles cannot be transitive, so Theorem 5.2.3 applies without changes. Hence, without loss of generality, in the rest of this section we consider ALCHIQ(D) knowledge bases only.
With BS_DL^{D,+} we denote the BS_DL^+ calculus extended with the concrete domain resolution rule, parameterized as specified in Definition 5.3.3. To obtain a decision procedure for ALCHIQ(D), we show that the results of Lemma 5.3.6 remain valid when ALCHIQ(D)-closures are saturated under BS_DL^{D,+}. In particular, we show that the application of the concrete domain resolution rule does not cause generating terms of arbitrary depth. Furthermore, we show that the maximal length of a D-constraint to be considered is polynomial in |KB|, assuming a limit on the arity of concrete predicates, so the complexity of reasoning does not increase.

6.2.1 Closures with Concrete Predicates

With Ξ(KB) we denote the set of closures obtained from KB by structural transformation (see Definition 5.3.1) and c-factoring. By definition of π from Table 3.1 and Table 3.4, it is easy to see that Ξ(KB) can contain only closures with the structure given in Table 5.1, with all variables, function symbols, and predicate arguments being of the sort a, and, additionally, closures with the structure given in Table 6.1. Note that closures of type 21 are obtained by applying c-factoring to closures of the form ¬T(a, b^c). We now generalize the closures from Table 6.1 to include closure types produced in a saturation of Ξ(KB) by BS_DL^{D,+}: ALCHIQ(D)-closures are of the form given in Table 5.2 and Table 6.2. By definition of Ξ, it is obvious that Lemma 5.3.2 and Lemma 5.3.4 hold for ALCHIQ(D)-closures as well.

6.2.2 Closure of ALCHIQ(D)-Closures under Inferences

We now extend Lemma 5.3.6 to handle ALCHIQ(D)-closures.

Lemma 6.2.1. Let Ξ(KB) = N0, ..., Ni ∪ {C} be a BS_DL^{D,+}-derivation, where C is the conclusion derived from premises in Ni. Then, C is either an ALCHIQ(D)-closure, or it is redundant in Ni.

Proof. The sorts of the abstract and concrete domain predicates are disjoint; furthermore, in closures of type 11, literals with concrete predicates or concrete equalities are always maximal. Hence, an inference between closures of types 1–7 and 9–13 is not possible. Also, a closure of type 8 can participate in an inference with a closure of type 1–7 only on abstract roles and concepts, and in an inference with a closure of type 9–13 only on concrete roles and concrete literals. Hence, the proof of Lemma 5.3.6 remains valid for closures of types 1–8. Furthermore, decomposition can be applied analogously as in Theorem 5.4.8.

Since equality among concrete terms is actually a concrete predicate, it does not participate in superposition inferences; rather, it participates only in concrete domain resolution. Hence, there are no superposition inferences into closures of type 10, so there are no termination problems as in Lemma 5.3.6. Finally, in a closure of type 8 containing a literal L = ¬T(⟨a⟩, y^c), the literal L is selected. Hence, such a closure can only participate in superposition into the first argument of L, or in resolution with a closure of type 3 or 8. In both cases, the result is a closure of type 8.

The most important difference to the proof of Lemma 5.3.6 lies in the application of the concrete domain resolution rule, with n side premises of the form Ci ∨ di(⟨ti⟩). Let xi be the free variables of the side premises of type 11, and let S = {di(ti)}. For any most general partitioning unifier σ of S, Sσ must be connected, so there are two possibilities:

• If all side premises are of type 11, σ is of the form {x2 ↦ x1, ..., xn ↦ x1}. The result is obviously a closure of type 11.

Table 6.2: Types of ALCHIQ(D)-Closures

8   in addition to literals from Table 5.2, a closure can additionally contain
    T(⟨a⟩, ⟨b^c⟩) ∨ d(⟨t_1^c⟩, ..., ⟨t_n^c⟩) ∨ ⋁ ⟨t_i^c⟩ ≈D/≉D ⟨t_j^c⟩ ∨ ⋁ ¬T(⟨a⟩, y^c) ∨ y^c ≉D b^c
    where t_i^c and t_j^c are either a constant b^c, or a term f_i^c([a])
9   ¬T(x, y^c) ∨ U(x, y^c)
10  P_{f^c}(x) ∨ T(x, ⟨f^c(x)⟩)
11  P(x) ∨ d(⟨f_1^c(x)⟩, ..., ⟨f_n^c(x)⟩) ∨ ⋁ ⟨f_i^c(x)⟩ ≈D/≉D f_j^c(x)
12  P(x) ∨ ⋁_{i=1}^n ¬T(x, y_i^c) ∨ ⋁_{i=1}^n ⋁_{j=i+1}^n y_i^c ≈D y_j^c
13  P(x) ∨ ⋁_{i=1}^n ¬Ti(x, y_i^c) ∨ d(y_1^c, ..., y_n^c)

• Assume there is a side premise of type 8, and that Sσ is connected. If Sσ contained a variable xi, then the literal di(f_i^c(xi)) would not contain a ground term, so Sσ would not be connected. Hence, all literals from Sσ are ground, and σ is of the form {x1 ↦ c1, ..., xn ↦ cn}. The result is a closure of type 8.

Hence, all nonredundant inferences of BS_DL^{D,+} applied to ALCHIQ(D)-closures produce an ALCHIQ(D)-closure.

By inspecting the proof of Lemma 6.2.1, one can easily extend Corollary 5.3.7 to ALCHIQ(D)-closures:

Corollary 6.2.2. If a closure of type 8 participates in a BS_DL^{D,+} inference in a derivation from Lemma 6.2.1, the unifier σ contains only ground mappings, and the conclusion is a closure of type 8. Furthermore, a closure of type 8 cannot participate in an inference with a closure of type 4 or 6.

The proof of Lemma 6.2.1 also shows that closures from ℵ^D cannot participate in inferences by concrete domain resolution. Namely, in nonground inferences, an individual a_α does not unify with f^c(x), and in ground inferences, an individual a_α does not occur in any ABox clause. Hence, it is not necessary to ensure that Ξ(KB) is D-extended.

Corollary 6.2.3. In any BS_DL^{D,+} derivation from Ξ(KB) ∪ ℵ^D, the closures from ℵ^D do not participate in any inferences.

6.2.3 Termination and Complexity Analysis

Establishing termination and determining the complexity is slightly more difficult. Let m denote the maximal arity of a concrete domain predicate, d the number of concrete domain predicates, and f the number of function symbols. By skolemizing ∃T1, ..., Tm.d, we introduce m function symbols, so f is linear in |KB| for unary coding of numbers, as in Lemma 5.3.9. In addition to literals from ALCHIQ-closures, ALCHIQ(D)-closures can contain literals of the form d(⟨f1(x)⟩, ..., ⟨fm(x)⟩). The maximal nonground closure contains all such literals, of which there can be d(2f)^m many (the factor 2 takes into account that each term can be marked or not); moreover, there are 2^{d(2f)^m} different combinations of concrete domain literals. This presents us with a problem: in general, m is linear in |KB|, so the number of closures is doubly exponential, thus invalidating Lemma 5.3.9.

A possible solution to this problem is to assume a bound on the arity of concrete predicates. This is justifiable from a practical point of view: it is hard to imagine a practically useful concrete domain with predicates of unbounded arity. Thus, m becomes a constant, and does not depend on |KB|. The maximal length of a closure is now polynomial in |KB|, and the number of closures is exponential in |KB| in the same way as in Lemma 5.3.9. The following lemma provides the last step necessary to determine the complexity of the algorithm.

Lemma 6.2.4. The maximal number of side premises participating in a concrete domain resolution rule in a BS_DL^{D,+} derivation from Lemma 6.2.1 is at most polynomial in |KB|, assuming a bound on the arity of concrete predicates.

Proof. Let S = {di(ti)} be the multiset of maximal concrete domain literals of n side premises Ci ∨ di(ti), and let σ be a most general partitioning unifier of S. Obviously, Sσ should not contain repeated literals di(ti)σ; otherwise, Sσ^c contains repeated conjuncts and is not minimal. Hence, the longest set Sσ is the one in which each di(ti)σ is distinct.

Let m be the maximal arity of a concrete domain predicate, f the number of function symbols, d the number of concrete domain predicates, and c the number of constants in the signature of Ξ(KB). Then, there are at most ℓ_ng = d·f^m distinct nonground concrete domain literals, and at most ℓ_g = d·(c + cf)^m distinct ground concrete domain literals. Assuming a bound on m and for unary coding of numbers, both ℓ_ng and ℓ_g are polynomial in |KB|.

If all side premises are of type 11, the maximal literals are not ground. Since Sσ should be connected, the only possible form that σ can take is {x2 ↦ x1, ..., xn ↦ x1}. Hence, Sσ contains only one variable x1, so the maximal number of distinct literals in Sσ is ℓ_ng. Thus, the maximal number of nonground side premises n to be considered is bounded by ℓ_ng, and all side premises are unique up to variable renaming.

If there is a side premise of type 8, at least one maximal literal is ground. Since Sσ should be connected, σ can be of the form {x1 ↦ c1, ..., xn ↦ cn}. All literals in Sσ are ground, so the maximal number of distinct literals in Sσ is ℓ_g. Thus, the maximal number of side premises n (by counting each possible copy of a premise separately) to be considered is bounded by ℓ_g.

Theorem 6.2.5. Let KB be an ALCHIQ(D) knowledge base, defined over an admissible concrete domain D for which D-satisfiability of finite conjunctions over Φ_D can be decided in deterministic exponential time. Then, saturation of Ξ(KB) by BS_DL^{D,+} with eager application of redundancy elimination rules decides satisfiability of KB, and runs in time exponential in |KB|, for unary coding of numbers and assuming a bound on the arity of concrete predicates.

Proof. As already explained, a polynomial bound on the length, and an exponential bound on the number, of ALCHIQ(D)-closures follows in the same way as in Lemma 5.3.9 and Theorem 5.4.8. By substituting Lemma 6.2.1 for Lemma 5.3.6, the proof of Theorem 5.3.10 also holds in the case of ALCHIQ(D) knowledge bases for all inferences apart from the concrete domain resolution rule. The only remaining problem is to show that, in applying the concrete domain resolution rule, the number of satisfiability checks of conjunctions over D is exponential in |KB|, and that the length of each conjunction is polynomial in |KB|.

To apply the concrete domain resolution rule to a set of closures N, a subset N' ⊆ N must be selected, for which maximal concrete domain literals are unique up to variable renaming. By Lemma 6.2.4, ℓ = |N'| is polynomial in |KB|, and N' contains at most ℓ variables. If N' does not contain a closure of type 8, there is exactly one substitution σ that unifies all ℓ variables. If there is at least one closure of type 8, then each one of the ℓ variables can be assigned to one of c constants, producing c^ℓ combinations, which is exponential in |KB|. Closures in N' are chosen from the maximal set of all closures, which is exponential in |KB|, and since |N'| is polynomial in |KB|, the number of different sets N' is exponential in |KB|. Hence, the maximal number of D-constraints that should be examined by the concrete domain resolution rule is exponential in |KB|, where the length of each D-constraint is polynomial in |KB|. Under the assumption that the satisfiability of each D-constraint can be checked in deterministic exponential time, all inferences by the concrete domain resolution rule can be performed in exponential time, so the claim of the theorem follows.

The proof of Lemma 6.2.4 demonstrates how to implement the concrete domain resolution rule in practice. There are two cases:

• For concrete domain resolution without a side premise of type 8, the most general partitioning unifier is of the form σ = {x2 ↦ x1, ..., xn ↦ x1}. Hence, we should consider only closures with maximal literals unique up to variable renaming, without several copies of one and the same closure.

• For concrete domain resolution where at least one side premise is of type 8, we first choose a connected set of closures Δ_0^g of type 8. Let Δ_0^c be the set of constants, and Δ_0^f the set of function symbols, occurring in a ground term f(c) in a closure from Δ_0^g. We then select the set of closures Δ_0^ng of type 11 containing a function symbol from Δ_0^f. To obtain a connected constraint, a most general partitioning unifier σ of Δ_0^g ∪ Δ_0^ng contains mappings of the form xi ↦ c, for c ∈ Δ_0^c. To make all literals in the constraint unique, a closure from Δ_0^ng can be used at most |Δ_0^c| times. For each such unifier σ, let Δ_1^g = Δ_0^g ∪ Δ_0^ng σ. It is possible that, for Δ_1^f the set of function symbols occurring in a ground term f(c) in Δ_1^g, we have Δ_0^f ⊊ Δ_1^f; in this case we repeat the process iteratively. The process terminates, since the set of closures is finite, and yields a maximal possible set of connected concrete literals. A minimal connected constraint is then a subset of this maximal literal set; a sketch of this iteration is given below.
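The following Python fragment is a minimal, self-contained sketch of this iteration; it is not the KAON2 implementation, and the data representation is an assumption made for the example. A ground concrete literal is encoded as a tuple (predicate, term, ...), where a term is a constant string or a pair (f, constant); the maximal literal of a type 11 closure over its single variable x is abbreviated as (predicate, f1, ..., fn), standing for predicate(f1(x), ..., fn(x)).

def constants_of(lit):
    # constants occurring in a ground concrete literal
    return {t[1] if isinstance(t, tuple) else t for t in lit[1:]}

def func_symbols_of(lit):
    # function symbols occurring in ground terms f(c) of a literal
    return {t[0] for t in lit[1:] if isinstance(t, tuple)}

def maximal_connected(ground_lits, nonground_lits):
    """Iteratively grow the set of ground concrete literals connected to the
    seed literals (from type 8 closures) by instantiating type 11 literals."""
    ground = set(ground_lits)
    consts = set().union(*(constants_of(l) for l in ground)) if ground else set()
    processed = set()
    while True:
        funcs = set().union(*(func_symbols_of(l) for l in ground)) if ground else set()
        new = funcs - processed
        if not new:
            return ground                     # fixpoint: maximal connected literal set
        for lit in nonground_lits:
            if set(lit[1:]) & new:            # shares a newly discovered function symbol
                for c in consts:              # mapping x -> c for every known constant
                    ground.add((lit[0],) + tuple((f, c) for f in lit[1:]))
        processed |= new

# Toy run: seed d(f(a)); the type 11 literal e(f(x), g(x)) is pulled in with x -> a.
print(maximal_connected({("d", ("f", "a"))}, [("e", "f", "g")]))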

6.3 Example

To better understand how to apply the concrete domain resolution rule in practice, we apply BS_DL^{D,+} to the example from Subsection 3.2.3, and show proof-theoretically that KB3 ∪ KB4 |=D GrWS(pc2). This is the case if and only if KB' is unsatisfiable, where KB' is defined as follows:

(6.10)    KB' = KB3 ∪ KB4 ∪ {¬GrWS(pc2)}

Translation into Closures. Note that, in Subsection 3.2.3, the knowledge base KB3 is defined as KB1 from Subsection 3.1.1, extended with axioms (3.18)–(3.19). The translation of KB1 into closures has been presented in Section 5.5, and it consists of closures (5.34)–(5.53); we now translate the axioms (3.18)–(3.19) and KB4.

The axiom (3.19) contains a nonatomic subconcept PC ⊓ ∃price.≥2000, so we introduce a new concept Q4 and replace (3.19) with (6.11) and (6.12). Furthermore, (3.21) contains a nonliteral concept ∃price.≤1500, so we introduce a new concept Q5 and replace (3.21) with (6.13) and (6.14).

(6.11)    HighEnd ⊑ GrWS ⊔ Q4

(6.12)    Q4 ⊑ PC ⊓ ∃price.≥2000

(6.13)    Q5 ⊑ ∃price.≤1500

(6.14)    Q5(pc2)

The clausification algorithm can now be applied directly, yielding the following closures:

(6.15)    ¬price(x, y1) ∨ ¬price(x, y2) ∨ y1 ≈D y2    (3.18)
(6.16)    ¬HighEnd(x) ∨ GrWS(x) ∨ Q4(x)    (6.11)
(6.17)    ¬Q4(x) ∨ PC(x)    (6.12)
(6.18)    ¬Q4(x) ∨ price(x, h(x))    (6.12)
(6.19)    ¬Q4(x) ∨ ≥2000(h(x))    (6.12)
(6.20)    ¬Q5(x) ∨ price(x, i(x))    (6.13)
(6.21)    ¬Q5(x) ∨ ≤1500(i(x))    (6.13)
(6.22)    Q5(pc2)    (6.14)
(6.23)    HighEnd(pc2)    (3.20)
(6.24)    ¬GrWS(pc2)    (6.10)

Saturation by BS_DL^{D,+}. The E-term ordering, the selection function, and the notation are the same as in Section 5.5. D(...) denotes the application of the concrete domain resolution rule. We now show the inference steps in saturating Ξ(Ω(KB')) by BS_DL^{D,+}.

⇒ (6.25) ¬Q4(x) ∨ ≥2000(h(x))

⇒ (6.26) ¬Q5(x) ∨ ≤1500(i(x))

⇒ (6.27) ¬Q4(x) ∨ price(x, h(x))

⇒ (6.28) ¬Q5(x) ∨ price(x, i(x))

⇒ (6.29) ¬price(x, y1) ∨ ¬price(x, y2) ∨ y1 ≈D y2

R(6.27;6.28) ¬Q4(x) ∨ ¬Q5(x) ∨ [h(x)] ≈D [i(x)]

⇒ (6.30) ¬Q4(x) ∨ ¬Q5(x) ∨ [h(x)] ≈D [i(x)]

D(6.25;6.26) ¬Q4(x) ∨ ¬Q5(x)

We briefly explain the last inference. The maximal literals of closures (6.25), (6.26), and (6.30) are S = {≥2000(h(x)), ≤1500(i(x1)), h(x2) ≈D i(x2)} (since the premises of each inference rule are assumed to be variable-disjoint, the variables in closures (6.25) and (6.26) are renamed appropriately). The substitution σ = {x1 ↦ x, x2 ↦ x} is the only MGPU of S, and Sσ = {≥2000(h(x)), ≤1500(i(x)), h(x) ≈D i(x)}. The conjunction Sσ^c = ≥2000(y) ∧ ≤1500(z) ∧ y ≈D z is obtained from Sσ by replacing the term h(x) with y, and the term i(x) with z. Obviously, Sσ^c is D-unsatisfiable, so we apply the concrete domain resolution rule. We now continue the saturation.

⇒ (6.31) HighEnd(pc2 )

⇒ (6.32) ¬HighEnd(x) ∨ GrWS(x) ∨ Q4(x)

R(6.31) GrWS([pc2 ]) ∨ Q4([pc2 ])

⇒ (6.33) GrWS([pc2 ]) ∨ Q4([pc2 ])

⇒ (6.34) ¬GrWS(pc2 )

R(6.33) Q4([pc2 ])

⇒ (6.35) ¬Q4(x) ∨ ¬Q5(x)

⇒ (6.36) Q5(pc2 )

R(6.35) ¬Q4([pc2 ])

⇒ (6.37) Q4([pc2 ])

⇒ (6.38) ¬Q4([pc2 ])

R(6.37)

⇒ (6.39) 

In (6.39), the empty closure is added to W, so Ξ(Ω(KB')) is d-unsatisfiable. Also, Ξ(Ω(KB')) is c-factored (it does not contain a negative literal containing a concrete term), so it is D-unsatisfiable as well, thus implying KB3 ∪ KB4 |=D GrWS(pc2).
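The D-unsatisfiability check used in the inference D(6.25;6.26) can be pictured as simple bound propagation over the integers. The following sketch illustrates only this special case, namely conjunctions of ≥n, ≤n, and ≈D atoms; it is not the concrete domain reasoner assumed by the calculus, and the constraint encoding is an ad-hoc assumption of the example.

def satisfiable(constraints):
    """Check a conjunction of ('geq', v, n), ('leq', v, n), and ('eq', u, v)
    atoms for satisfiability over the integers by propagating bounds."""
    lower, upper, eqs = {}, {}, []
    for kind, *args in constraints:
        if kind == "geq":
            v, n = args; lower[v] = max(lower.get(v, n), n)
        elif kind == "leq":
            v, n = args; upper[v] = min(upper.get(v, n), n)
        else:
            eqs.append(tuple(args))
    changed = True
    while changed:                                  # propagate bounds across equalities
        changed = False
        for u, v in eqs:
            for a, b in ((u, v), (v, u)):
                if a in lower and lower[a] > lower.get(b, float("-inf")):
                    lower[b] = lower[a]; changed = True
                if a in upper and upper[a] < upper.get(b, float("inf")):
                    upper[b] = upper[a]; changed = True
    return all(lower.get(v, float("-inf")) <= upper.get(v, float("inf"))
               for v in set(lower) | set(upper))

# >=2000(y) and <=1500(z) and y = z is unsatisfiable, so this prints False.
print(satisfiable([("geq", "y", 2000), ("leq", "z", 1500), ("eq", "y", "z")]))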

6.4 Related Work

The need to represent and reason with concrete data in description logic systems was recognized early on. Early systems, such as MESON [40] and CLASSIC [22], provided such features, mostly in an ad-hoc manner by means of built-in predicates.

The first rigorous treatment of reasoning with concrete data in description logics was presented by Baader and Hanschke in [5]. The authors introduce the notion of a concrete domain and consider reasoning in ALC(D), a logic allowing feature chains (chains of functional roles) to occur in existential and universal restrictions. The authors show that adding a concrete domain does not increase the complexity of checking concept satisfiability, that is, it remains in PSpace. Sound and complete reasoning can be performed by extending the standard tableau algorithm for ALC with an additional branch closure check, which detects potential unsatisfiability of concrete domain constraints on a branch. Furthermore, if transitive closure of roles is allowed, the authors show that checking concept satisfiability becomes undecidable. The approach of Baader and Hanschke was implemented in the TAXON system [63].

Contrary to modern description logic systems, the approach presented in [5] does not allow TBoxes. In [89] it was shown that allowing cyclic TBoxes makes reasoning with a concrete domain undecidable in general. More importantly, undecidability holds for all numeric concrete domains, which are probably the most practically relevant ones. Still, in [87] it was shown that reasoning with a temporal concrete domain is decidable even with cyclic TBoxes, thus providing for an expressive description logic with features for temporal modeling.

It turned out that even without cyclic TBoxes, extending the logic is difficult. In [89] it was shown that extending ALC(D) with either acyclic TBoxes, inverse roles, or a role-forming concrete domain constructor makes reasoning NExpTime-hard. The main reason for this is the presence of the feature chain concept constructor, which can be used to force equality of objects at the end of two distinct chains of features in a model. Hence, feature chains with concrete domains destroy the tree-model property, which was identified in [147] as the main explanation for the good properties of modal and description logics.

Since obtaining expressive logics with concrete domains turned out to be difficult, another path to extending the logic was taken in [62], by prohibiting feature chains. In this way, concrete domain reasoning is restricted only to immediate successors of an abstract individual, so the tree-model property remains preserved. A decision procedure for an expressive ALCNHR+ logic extended with concrete domains without feature chains was presented in [62], obtained by extending the tableau algorithm with concrete domain constraint checking. In [70], the term datatypes was introduced for a concrete domain without feature chains, and a tableau decision procedure for SHOQ(D) (a logic providing nominals and datatypes) was presented. This approach was extended to allow for n-ary concrete roles in [101]. The results of this research significantly influenced the development of the Semantic Web ontology language OWL [106], which also supports datatypes.

Apart from fundamental issues, such as decidability and complexity of reasoning, other issues were considered to make concrete domains practically applicable. In practice, usually several different datatypes are needed. For example, an application might use the string datatype to represent a person's name, and the integer datatype to represent a person's age. It is natural to model each datatype as a separate concrete domain, and then to integrate several concrete domains for reasoning purposes.
An approach for integrating concrete domains was already presented in [5], and was further extended to datatype groups in [103], which simplify the integration process by taking care of the extensions of negative concrete literals.

Until now, all existing approaches consider reasoning with a concrete domain in tableau and automata frameworks. In the resolution setting, many Prolog-like systems, such as XSB,¹ provide built-in predicates to handle datatypes. However, to the best of our knowledge, none of these approaches is complete even for the basic logic ALC(D). Built-in predicates are only capable of handling explicit datatype constants, and cannot cope with individuals introduced by existential quantification. Hence, ours is the first approach that we are aware of that supports reasoning with a concrete domain in the resolution setting and is complete w.r.t. the semantics from [5].

The concrete domain resolution calculus can be viewed as an instance of theory resolution [139]. In fact, it is similar to ordered theory resolution [15], in which inferences with theory literals are restricted to maximal literals only. It has been argued in [15] that tautology deletion is not compatible with ordered theory resolution. However, the concrete domain resolution calculus is fully compatible with the standard notion of redundancy, so all existing deletion and simplification techniques can be freely used. Furthermore, in our work we explicitly address the issues specific to concrete domains, such as the necessity to consider several copies of a clause in an inference by the concrete domain resolution rule.

¹ http://xsb.sourceforge.net/

Chapter 7

Reducing Description Logics to Disjunctive Datalog

Based on the algorithms presented in Chapters 5 and 6, we now develop an algorithm for reducing a SHIQ(D) knowledge base KB to a disjunctive datalog program DD(KB). The program DD(KB) entails the same set of ground facts as KB, so it can be used to answer queries in KB. Furthermore, we also present an algorithm for answering queries in DD(KB) that runs in exponential time, and thus show that our approach does not increase the complexity of reasoning.

For the algorithms in this chapter, the distinction between the skeleton and the substitution part of a closure is not important. Hence, instead of "closure," we use the more common term "clause."

7.1 Overview

Our reduction algorithm is based on the observation that, while saturating Ξ(KB) using BS_DL^{D,+}, after performing all inferences with nonground clauses, all remaining inferences involve a clause of type 8 containing terms whose depth is at most one. Intuitively, we simulate these terms with fresh constants, and thus allow transforming each BS_DL^{D,+} refutation from Ξ(KB) into a refutation in DD(KB). The reduction algorithm consists of the following steps:

• Transitivity axioms are eliminated using the transformation from Section 5.2. Hence, we consider only ALCHIQ(D) knowledge bases in the remaining sections.

• The TBox and the RBox clauses of Ξ(KB) are saturated together with the clauses from gen(KB) (see Definition 5.4.9) using BS_DL^{D,+}, thus performing all inferences with nonground clauses. As shown in Lemma 7.2.1, certain clauses can be removed from the saturated set, as they cannot participate in further inferences.

• If the saturated set does not contain the empty clause, function symbols are eliminated from the clauses as explained in Section 7.2. Lemma 7.2.4 demonstrates that this transformation does not affect satisfiability. Intuitively, if ABox clauses are added to the saturated set and saturation is continued, all remaining inferences involve a clause of type 8 and produce a clause of type 8, and each such inference can be simulated in the function-free clause set.

• In order to reduce the size of the datalog program, some irrelevant clauses are removed as explained in Section 7.3. Lemma 7.3.2 demonstrates that this transformation also does not affect satisfiability.

• Transformation of KB into DD(KB) is explained in Section 7.4 and is straightforward: it suffices to transform each clause into the equivalent sequent form. Theorem 7.4.2 summarizes the properties of the resulting datalog program. As discussed in Subsection 4.8.2, such a program can be interpreted under minimal or standard first-order semantics, as long as it is used only for answering positive ground queries.

7.2 Eliminating Function Symbols

Let KB be an extensionally reduced ALCHIQ(D) knowledge base; furthermore, let Γ_TRg = Ξ(KB_T ∪ KB_R) ∪ gen(KB). By the definitions of Ξ and gen, the set Γ_TRg does not contain clauses of type 8. Furthermore, let Γ = Sat_R(Γ_TRg) ∪ Ξ(KB_A), where Sat_R(Γ_TRg) is the relevant set of saturated clauses, that is, the clauses of types 1, 2, 3, 5, 7, and 9–13 obtained by saturating Γ_TRg by BS_DL^{D,+} with eager application of redundancy elimination rules. Intuitively, Sat(Γ_TRg) contains all nonredundant clauses derivable by BS_DL^{D,+} from the TBox and the RBox, so each further inference in the saturation of Γ involves a clause of type 8. Such a clause cannot participate in an inference with a clause of type 4 or 6, so we can safely delete the latter clauses, and consider only the Sat_R(Γ_TRg) subset. Adding gen(KB) is necessary to compute all nonground consequences of clauses possibly introduced by decomposition.

Lemma 7.2.1. KB is D-unsatisfiable if and only if Γ is D-unsatisfiable, for an extensionally reduced ALCHIQ(D) knowledge base KB.

Proof. Let Γ' = Ξ(KB) ∪ gen(KB). Because it does not contain clauses of type 8, Γ' is c-factored. Since all clauses from gen(KB) contain a new predicate Q_{R,f}, any interpretation of Ξ(KB) can be extended to an interpretation of Γ' by adjusting the interpretation of Q_{R,f} as needed. Hence, KB is D-equisatisfiable with Ξ(KB) by Lemma 5.3.2; Ξ(KB) is D-equisatisfiable with Γ'; and, by Lemma 6.1.6 and Theorem 6.1.20, Γ' is D-unsatisfiable if and only if the set of clauses derived by saturating Γ' with BS_DL^{D,+} contains the empty clause. Since the order in which the inferences are performed in a derivation Γ' = N0, N1, ... can be chosen don't-care nondeterministically, we perform all nonredundant inferences among clauses from Γ_TRg first. Let Ni = Sat(Γ_TRg) ∪ Ξ(KB_A) be the resulting intermediate set of clauses.

If Ni contains the empty clause, Γ contains it as well (the empty clause is of type 5), and the claim of the lemma follows. Otherwise, we continue the saturation of Ni. In such a derivation, each Nj, j > i, is obtained from Nj−1 by an inference involving at least one clause not in Ni. By induction on the derivation length, one can show that Nj \ Ni contains only clauses of type 8: namely, all nonredundant inferences between clauses of type other than 8 have been performed in Ni, and, by Corollary 6.2.2, each inference involving a clause of type 8 produces a clause of type 8. A conclusion of an inference can be decomposed into a clause of type 8 and a clause of type 3; however, the resulting clause of type 3 has the form ¬Q_{R,f}(x) ∨ R(x, [f(x)]) and is, by Lemma 5.4.10, contained in gen(KB), so only a clause of type 8 is added to Nj. Furthermore, by Corollary 6.2.2, a clause of type 4 or 6 can never participate in an inference with a clause of type 8. Hence, such a clause cannot be used to derive a clause in Nj \ Ni, j > i, so Ni can safely be replaced by Γ. Any set of clauses Nj, j > i, that can be obtained by saturation from Γ', can be obtained by saturation from Γ as well, modulo clauses of type 4 and 6. Hence, the saturation of Γ' by BS_DL^{D,+} derives the empty clause if and only if the saturation of Γ by BS_DL^{D,+} derives the empty clause, so the claim of the lemma follows.

If KB does not use number restrictions, further optimizations are possible. By Corollary 5.3.8, clauses of types 3 or 5 containing a function symbol cannot participate in an inference with a clause of type 8. Hence, such clauses can also be eliminated from the saturated set; that is, they need not be included in Sat_R(Γ_TRg). Observe that this does not necessarily hold for clauses of type 10 and 11; namely, if there is a clause of type 13 with more than one literal ¬Ti(xi, y_i^c), then such a clause might derive a ground concrete literal containing a function symbol.

We now show how to eliminate function symbols from clauses in Γ. Intuitively, the idea is to replace each ground term f(a) with a new constant, denoted with a_f. For each function symbol f we introduce a new predicate symbol S_f that contains, for each abstract constant a, a tuple of the form S_f(a, a_f). Thus, S_f contains the f-successor of each constant. A clause C is transformed by replacing each occurrence of a term f(x) with a new variable x_f, and appending the literal ¬S_f(x, x_f). The Herbrand universe of the transformed clause set is finite and can be enumerated in a finite predicate HU; this predicate is used to bind unsafe variables in clauses. This transformation is formalized in the following definition:

Definition 7.2.2. The operator λ maps terms to terms as follows:

• λ(a) = a;

• λ(f(a)) = a_f, for a_f a new globally unique¹ constant with sort(a_f) = sort(f(a));

• λ(x) = x;

• λ(f(x)) = x_f, for x_f a new globally unique variable with sort(x_f) = sort(f(x)).

¹ Globally unique means that, for given f and a, the constant a_f is uniquely defined.

We extend λ to ALCHIQ(D)-clauses such that, for an ALCHIQ(D)-clause C, λ(C) gives a function-free clause as follows:

1. Each term t in the clause is replaced with λ(t).

2. For each variable x_f introduced in step 1, the literal ¬S_f(x, x_f) is appended to the clause.

3. If, after steps 1 and 2, a variable x occurs in a positive, but not in a negative literal, the literal ¬HU(x) is appended to the clause.

For a position p in a clause C, let λ(p) denote the corresponding position in λ(C). For a substitution σ, let λ(σ) denote the substitution obtained from σ by replacing each assignment x ↦ t with x ↦ λ(t). With λ⁻ we denote the inverse of λ (that is, λ⁻(λ(α)) = α for any term, clause, position, or substitution α).² For KB an extensionally reduced ALCHIQ(D) knowledge base, the function-free version of Ξ(KB), denoted with FF(KB), is defined as follows:

FF(KB) = FF_λ(KB) ∪ FF_Succ(KB) ∪ FF_HU(KB) ∪ Ξ(KB_A)
FF_λ(KB) = {λ(C) | C ∈ Sat_R(Γ_TRg)}
FF_Succ(KB) = {S_f(a, a_f) | for each a and f from Ξ(KB)}
FF_HU(KB) = {HU(a) | for each a from Ξ(KB)} ∪ {HU(a_f) | for each a and f from Ξ(KB)}

Note that, if a variable occurs in a positive, but not in a negative literal in a clause from Sat_R(Γ_TRg), such a variable must be of the abstract sort. Therefore, it suffices to enumerate in HU only the abstract part of the Herbrand universe.
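To make the transformation more concrete, the following Python fragment sketches λ on a toy clause representation; it is only an illustration of Definition 7.2.2, not the KAON2 data structures, and the variable-naming convention used by is_variable is an assumption of the example.

def lam_term(t):
    # f(a) -> a_f and f(x) -> x_f; constants and variables are left unchanged
    return f"{t[1]}_{t[0]}" if isinstance(t, tuple) else t

def is_variable(name):
    # assumed convention: variables are named x..., y...; everything else is a constant
    return name.split("_")[0][0] in ("x", "y")

def lam_clause(clause):
    """Apply lambda to a clause given as a list of (positive?, predicate, args) literals."""
    new_lits, succ_lits = [], set()
    for sign, pred, args in clause:
        new_args = []
        for t in args:
            if isinstance(t, tuple) and is_variable(t[1]):
                f, x = t
                succ_lits.add((False, f"S_{f}", (x, f"{x}_{f}")))   # step 2: add -S_f(x, x_f)
            new_args.append(lam_term(t))
        new_lits.append((sign, pred, tuple(new_args)))
    new_lits.extend(sorted(succ_lits))
    pos = {t for s, _, a in new_lits if s for t in a if is_variable(t)}
    neg = {t for s, _, a in new_lits if not s for t in a if is_variable(t)}
    new_lits.extend((False, "HU", (v,)) for v in pos - neg)         # step 3: guard unsafe variables
    return new_lits

# Example: -Q4(x) v price(x, h(x)) becomes -Q4(x) v price(x, x_h) v -S_h(x, x_h).
print(lam_clause([(False, "Q4", ("x",)), (True, "price", ("x", ("h", "x")))]))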

We now prove a lemma about the syntactic structure of clauses in FF(KB).

Lemma 7.2.3. If Ξ(KB_A) is c-factored, then FF(KB) is c-factored as well. Furthermore, nonground negative equality literals occur in clauses of FF(KB) only in disjunctions of the form x_f ≉ x_g ∨ ¬S_f(x, x_f) ∨ ¬S_g(x, x_g).

Proof. For the first claim, observe that each clause C of type 9, 12, or 13 is c-factored, so λ(C) is c-factored as well. Furthermore, for a clause C of type 10 or 11, a term of sort c occurs only in positive literals of C, so such terms can occur in λ(C) only in negative literals of the form ¬S_f(x, x_f^c). Moreover, each variable x_f^c occurs in exactly one such literal, so λ(C) is c-factored. For the second claim, observe that λ(C) can contain nonground equalities only if C is a clause of type 5. Since such clauses contain only negative equality literals of the form f(x) ≉ g(x), λ(C) contains x_f ≉ x_g ∨ ¬S_f(x, x_f) ∨ ¬S_g(x, x_g).

We now show that eliminating function symbols does not affect satisfiability, that is, that KB and FF(KB) are D-equisatisfiable.

² Note that λ is injective, but not surjective, so to make the definition of λ⁻ correct, we assume that λ⁻ is applicable only to the range of λ.

Lemma 7.2.4. KB is D-unsatisfiable if and only if FF(KB) is D-unsatisfiable, for an extensionally reduced ALCHIQ(D) knowledge base KB.

Proof. Since KB and Γ are D-equisatisfiable by Lemma 7.2.1, the claim of this lemma can be demonstrated by showing that Γ and FF(KB) are D-equisatisfiable. Furthermore, since Γ and FF(KB) are c-factored, by Lemma 6.1.6 it suffices to show that they are d-equisatisfiable.

(⇐) If FF(KB) is d-unsatisfiable, then, because hyperresolution with superposition, concrete domain resolution, and splitting is sound and complete [9], a derivation of the empty clause from FF(KB) exists. We show that such a derivation can be transformed into a derivation of the empty clause from Γ by the sound inference rules of hyperresolution, paramodulation, instantiation, splitting, and concrete domain resolution.

Let B be a branch FF(KB) = N0, ..., Nn of a derivation from FF(KB) by hyperresolution with superposition, concrete domain resolution, and eager splitting, where all negative literals containing a predicate other than equality are selected. We now show by induction on n that a corresponding branch B' in a derivation from Γ by sound inference steps and a set of clauses N'_m on B' exist such that the following property (*) holds: if C is a clause in Nn not of the form S_f(u, v) or HU(u), then N'_m contains the counterpart clause of C, equal to λ⁻(C). The induction base n = 0 is obvious, since FF(KB) and Γ contain only one branch on which, other than S_f(u, v) or HU(u), all ground clauses are ABox clauses. Now assume that property (*) holds for some n and consider all possible inferences from premises in Nn deriving a clause C in Nn+1 = Nn ∪ {C}.

• For a superposition inference, observe that the clause that superposition is performed from must be ground; namely, each nonground clause is safe, so it contains negative literals, which are selected. Furthermore, because of splitting, all ground clauses are unit clauses, so superposition can be performed only from a clause of the form s ≈ t. Consider now superposition into a ground unit clause L. If L = HU(s), superposition is redundant, since HU is instantiated for each constant occurring in FF(KB), so the conclusion is already contained on the branch. If L = S_f(s, u) or L = S_f(u, s), then the proposition obviously holds. Otherwise, the clauses s ≈ t and L are derived in at most n steps on B, so, by the induction hypothesis, counterpart clauses λ⁻(s ≈ t) and λ⁻(L) are derivable on B'. Thus, superposition can be performed on these clauses on B' to derive the required counterpart clause. Identical arguments hold for a superposition into ¬T(s, y^c) ∨ y^c ≉D b^c. Because nonground clauses of other types contain only variables, superposition into such clauses is not possible.

• Reflexivity resolution can be applied to a ground clause u ≉ u on B. By the induction hypothesis, λ⁻(u ≉ u) is then derivable on B', so reflexivity resolution can be applied on B' to derive the required counterpart clause. Reflexivity resolution is not applicable to nonground clauses, since negative literals containing the equality predicate are not selected.

• Equality factoring is not applicable to a clause on B, since all positive clauses on B are ground unit clauses.

• Consider a hyperresolution with a main premise C, side premises E1, ..., Ek, and a unifier σ, resulting in a hyperresolvent H. The side premises Ei are not allowed to contain selected literals, so they are ground unit clauses. Furthermore, the clause C is safe, and, by Lemma 7.2.3, for each literal of the form x_f ≉ x_g, C contains the literals ¬S_f(x, x_f) and ¬S_g(x, x_g), which are selected. Hence, all variables in C are bound, so H is a ground clause. Let σ' be the substitution obtained from σ by including a mapping x ↦ λ⁻(xσ) for each variable x ∈ dom(σ) not of the form x_f. Let us now perform an instantiation step C' = λ⁻(C)σ' on B'. Obviously, λ⁻(Cσ) and C' can differ only at a position p in C at which a variable of the form x_f occurs. Let p' = λ⁻(p). The term in λ⁻(C) at p' is of the form f(x), so with p'_x we denote the position of the inner x in f(x). In the hyperresolution inference generating H, the variable x_f is instantiated by resolving ¬S_f(x, x_f) with some ground literal S_f(u, v). Hence, Cσ contains at p the term v, whereas C' contains at p' the term f(u), and λ⁻(v) ≠ f(u). We show that all such discrepancies can be eliminated with sound inferences on B'. The literal S_f(u, v) is obtained on B from some S_f(a, a_f) by n or fewer superposition inference steps. Let Δ1 (Δ2) denote the sequence of ground unit equalities applied to the first (second) argument of S_f(a, a_f). All si ≈ ti from Δ1 and Δ2 are derivable on B in n steps or fewer, so the corresponding equalities λ⁻(si ≈ ti) are derivable on B' by the induction hypothesis; we denote these sequences with Δ'1 and Δ'2. We now perform superposition with the equalities from Δ'1 into C' at p'_x in the reverse order. After this, p'_x contains the constant a, and p' contains the term f(a). Hence, we can apply superposition with the equalities from Δ'2 at p' in the original order. After this is done, each position p' contains the term λ⁻(v). Let C'' denote the result of removing the discrepancies at all positions; obviously, C'' = λ⁻(Cσ). All side premises Ei are derivable in n steps or fewer on B, so, if Ei is not of the form S_f(u, v) or HU(u), λ⁻(Ei) is derivable on B'. We hyperresolve these side premises with C'' to obtain H'. Obviously, H' = λ⁻(H), so the counterpart clause is derivable on B'.

• Since nonground clauses contain selected literals, and all concrete domain literals are positive, concrete domain resolution can be applied only to a set of positive ground clauses Ci. By the induction hypothesis, all λ⁻(Ci) are derivable on B'. Hence, concrete domain resolution can be applied on B' in the same way as on B, so the counterpart clause is derivable on B'.

• Since all ground clauses on B are unit clauses, and no clause in FF(KB) contains a positive literal with the S_f or HU predicates, a ground nonunit clause C generated by an inference on B cannot contain S_f(u, v) and HU(a) literals. Hence, if C is of length k and causes B to be split into k sub-branches, then λ⁻(C) is of length k and B' can be split into k sub-branches, each of them satisfying (*).

Hence, if there is a derivation of the empty clause on all branches from FF(KB), then there is a derivation of the empty clause on all branches from Γ as well.

(⇒) If Γ is d-unsatisfiable, since BS_DL^{D,+} is sound and complete, a derivation of the empty clause from Γ exists. We show that such a derivation can be reduced to a derivation of the empty clause in FF(KB) by sound inference rules.

Let B' be a derivation Γ = N'0, ..., N'n by BS_DL^{D,+}. We show by induction on n that there exists a corresponding derivation B of the form FF(KB) = N0, ..., Nm by sound inference steps such that the following property (**) holds: if C' is a clause in N'n, then Nm contains the counterpart clause C = λ(C'). The induction base n = 0 is trivial. Let us now assume that (**) holds for some n, and let us consider possible inferences deriving N'n+1 = N'n ∪ {C'}, where the clause C' is derived from premises P'i ∈ N'n, 1 ≤ i ≤ k. By the induction hypothesis, we know that there is a derivation B from FF(KB) with a clause set Nm containing the counterpart clauses of each P'i, denoted with Pi, 1 ≤ i ≤ k. Let σ' be the unifier of the inference. By Corollary 6.2.2, σ' is ground and contains only assignments of the form xi ↦ a or xi ↦ f(a). Let σ = λ(σ'). Since all Pi are derivable by the induction hypothesis, we can instantiate each Pi into Piσ. Obviously, apart from the literals involving S_f and HU, the only difference between Piσ and λ(P'iσ') may be that the latter contains a term f(a) at position p, whereas the former contains x_f at λ(p). But then Piσ contains a literal ¬S_f(a, x_f), which can be resolved with S_f(a, a_f) to produce a_f at λ(p). All such differences can be removed iteratively, and the remaining ground literals involving HU can be resolved away. Hence, each λ(P'iσ') is derivable from premises in Nm.

Observe that in all literals of the form f(a), the inner term is marked. Hence, superposition inferences are possible only on the outer position of such terms, which correspond via λ to a_f. Therefore, regardless of the inference type, C = λ(C') can be derived from the λ(P'iσ') by the same inference on the corresponding literals.

The result of a superposition inference in B' may be a clause C' containing a literal R([a], [f(a)]), which is decomposed into a clause C'1 of type 8 and a clause C'2 of type 3. However, since gen(KB) ⊆ Γ, we have C'2 ∈ Γ, so the conclusion C' should only be replaced with the conclusion C'1. The decomposition inference rule can obviously be applied on B as well to produce a counterpart clause C1 = λ(C'1). Since λ(C'2) ∈ Nm, this inference is sound by Lemma 5.4.2, so property (**) holds.

Now it is obvious that, if there is a derivation of the empty clause from Γ, then there is a derivation of the empty clause from FF(KB) as well.

Lemma 7.2.4 also implies that KB |= α if and only if FF(KB) |= α, where α is of the form (¬)A(a) or (¬)R(a, b), for A an atomic concept and R a simple role. The proof also reveals that, in checking D-satisfiability of FF(KB), it is not necessary to perform a superposition inference into a literal of the form HU(a).

It is interesting to consider the role of the new constants a_f in FF(KB). As discussed in Section 4.6, our transformation does not preserve the semantics of KB: the models of FF(KB) and KB are, in general, not related, and they coincide only on positive ground facts. Therefore, a constant a_f is intuitively best understood as a proof-theoretic means used to simulate the term f(a) in a refutation by basic superposition. We do not provide any results about a deeper correspondence between a_f and unnamed individuals in a model of KB.

7.3 Removing Irrelevant Clauses

The saturation of Γ_TRg derives new clauses that enable the reduction to FF(KB). However, the same process introduces many clauses that are not necessary. Consider, for example, the knowledge base KB = {A ⊑ C, C ⊑ B}. If the precedence of the predicate symbols is C > B > A, the saturation process derives ¬A(x) ∨ B(x); however, this clause is entailed by the premises, and it can be deleted. Hence, we present an optimization by which we reduce the number of rules in the resulting disjunctive datalog program.

Definition 7.3.1. For N ⊆ FF(KB), let C ∈ N be a clause such that λ⁻(C) is derived in the saturation of Γ_TRg from premises Pi, 1 ≤ i ≤ k, by an inference with a substitution σ. Then, C is irrelevant w.r.t. N if the following conditions hold:

• λ⁻(C) is not derived by the decomposition rule;

• For each premise Pi, λ(Pi) ∈ N;

• Each variable occurring in λ(Piσ) occurs in C.

Relevant is the opposite of irrelevant. Let C1, C2, ..., Cn be a sequence of clauses from FF(KB) such that λ⁻(Cn), ..., λ⁻(C2), λ⁻(C1) is the order in which the clauses are derived in the saturation of Γ_TRg. Let FF(KB) = N0, N1, ..., Nn be a sequence of clause sets such that Ni = Ni−1 if Ci is relevant w.r.t. Ni−1, and Ni = Ni−1 \ {Ci} if Ci is irrelevant w.r.t. Ni−1, for 1 ≤ i ≤ n. Then, FF_R(KB) = Nn is called the relevant subset of FF(KB).

Removing irrelevant clauses preserves satisfiability, as demonstrated by the following lemma.

Lemma 7.3.2. FF_R(KB) is D-unsatisfiable if and only if FF(KB) is D-unsatisfiable.

Proof. Let N be a (not necessarily proper) subset of FF(KB). Furthermore, let C ∈ N be an irrelevant clause w.r.t. N, where λ⁻(C) is derived in the saturation of Γ_TRg from premises Pi by an inference ξ with a substitution σ. Finally, let λ(Pi) ∈ N, 1 ≤ i ≤ k. We now show the following property (*): N is D-unsatisfiable if and only if N \ {C} is D-unsatisfiable. The (⇐) direction is trivial, since N \ {C} ⊂ N. For the (⇒) direction, by Herbrand's theorem, N is D-unsatisfiable if and only if some finite set M of ground instances of N is D-unsatisfiable. For such M, we construct the set of ground clauses M' in the following way:

• For each D ∈ M such that D is not a ground instance of C, let D ∈ M';

• For each D ∈ M such that D is a ground instance of C with substitution τ, let λ(Pi)λ(σ)τ ∈ M', 1 ≤ i ≤ k.

Let τ be a ground substitution such that D = Cτ. Clauses Pi can be of type 1, 2, 3, 5, 7, or 9–13, so σ can contain only mappings of the form x ↦ x', x ↦ f(x'), or yi ↦ f(x'). The sets of variables in λ(Piσ) and λ(Pi)λ(σ) obviously coincide. Since C is irrelevant, each variable from each λ(Piσ) occurs in C as well, so τ instantiates all variables in all λ(Pi)λ(σ). Therefore, each λ(Pi)λ(σ)τ is a ground instance of a clause λ(Pi) in N \ {C}. Furthermore, λ(Pi)λ(σ)τ ⊆ λ(Piσ)τ. If the inclusion is strict, then the latter clause contains literals of the form ¬S_f(a, b) that do not occur in the former one, because σ instantiates a variable from Pi to a term f(x') originating from some premise Pj. But then, λ(Pj) contains the literal ¬S_f(x', x'_f), so λ(Pj)λ(σ)τ contains ¬S_f(a, b). Therefore, all λ(Pi)λ(σ)τ can participate in a ground inference corresponding to ξ, so D can be derived from M'. Hence, if M is D-unsatisfiable, then M' is D-unsatisfiable as well. Since M' is a finite D-unsatisfiable set of ground instances of N \ {C}, the latter is D-unsatisfiable by Herbrand's theorem, so property (*) holds.

Consider the sequence of clause sets FF(KB) = N0, N1, ..., Nn = FF_R(KB) from Definition 7.3.1. For each Ni = Ni−1 \ {Ci}, i ≥ 1, the preconditions of property (*) are fulfilled, so by (*), Ni is D-satisfiable if and only if Ni−1 is D-satisfiable. The claim of the lemma now follows by a straightforward induction on i.

7.4 Reduction to Disjunctive Datalog

Reduction of an ALCHIQ(D) knowledge base KB to a disjunctive datalog program is now easy: for each clause from FF_R(KB), simply move all positive literals into the rule head, and all negative literals into the rule body.

Definition 7.4.1. For an extensionally reduced ALCHIQ(D) knowledge base KB, DD(KB) is the disjunctive datalog program that contains the rule

A1 ∨ ... ∨ An ← B1, ..., Bm

for each clause A1 ∨ ... ∨ An ∨ ¬B1 ∨ ... ∨ ¬Bm from FF_R(KB). If KB is not extensionally reduced, then DD(KB) = DD(KB'), where KB' is an extensionally reduced knowledge base obtained from KB as explained in Section 3.1.
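As a small illustration of this final step, the following fragment turns a function-free clause, in the same toy representation as the earlier sketches, into a rule string; the output syntax is illustrative rather than that of any particular datalog engine.

def clause_to_rule(clause):
    """Positive literals form the rule head, negative literals the body."""
    atom = lambda lit: f"{lit[1]}({', '.join(lit[2])})"
    head = [atom(l) for l in clause if l[0]]
    body = [atom(l) for l in clause if not l[0]]
    head_str = " v ".join(head)                     # empty head encodes a constraint
    return f"{head_str} :- {', '.join(body)}." if body else f"{head_str}."

# -HighEnd(x) v GrWS(x) v Q4(x) becomes "GrWS(x) v Q4(x) :- HighEnd(x)."
print(clause_to_rule([(False, "HighEnd", ("x",)), (True, "GrWS", ("x",)), (True, "Q4", ("x",))]))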

Theorem 7.4.2. Let KB be an ALCHIQ(D) knowledge base, defined over a concrete domain D such that D-satisfiability of finite conjunctions over Φ_D can be decided in deterministic exponential time. Then, the following claims hold:

1. KB is D-unsatisfiable if and only if DD(KB) is D-unsatisfiable.

2. KB |=D α if and only if DD(KB) |=c α, where α is of the form A(a) or R(a, b), and A is an atomic concept.

3. KB |=D C(a) for a nonatomic concept C if and only if, for Q a new atomic concept, DD(KB ∪ {C ⊑ Q}) |=c Q(a).

4. The number of literals in each rule in DD(KB) is at most polynomial, the number of rules in DD(KB) is at most exponential, and DD(KB) can be computed in exponential time in |KB|, assuming a bound on the arity of the concrete domain predicates and for unary coding of numbers in input.

Proof. The first claim follows directly from Lemma 7.3.2. The second claim follows from the first one, since DD(KB ∪ {¬α}) = DD(KB) ∪ {¬α} is D-unsatisfiable if and only if DD(KB) |=c α. Furthermore, KB |=D C(a) if and only if KB ∪ {¬C(a)} is D-unsatisfiable, which is the case if and only if KB ∪ {¬Q(a), C ⊑ Q} is D-unsatisfiable. Now, because Q is atomic, the third claim follows from the second one.

By Lemma 5.3.9, for each clause C ∈ Sat(Γ_TRg), the number of literals in C is at most polynomial in |KB|, and |Sat(Γ_TRg)| is at most exponential in |KB|. It is easy to see that the application of λ to C can be performed in time polynomial in the number of terms and literals in C. The number of constants a_f added to DD(KB) is equal to c · f, where c is the number of constants and f the number of function symbols in the signature of Ξ(KB). If numbers are coded in unary, both c and f are polynomial in |KB|, so the number of constants a_f is also polynomial in |KB|. By Theorem 6.2.5, Sat(Γ_TRg) can be computed in time that is at most exponential in |KB|.

We point out that basic superposition is crucial for the correctness of the reduction. Namely, in basic superposition, superposition into Skolem function symbols is redundant, which allows encoding ground terms of the form f(a) using new constants.

7.5 Equality Reasoning in DD(KB)

The program DD(KB) can contain the equality predicate in the rule heads, which is not allowed in usual definitions of disjunctive datalog, such as [42]. Hence, to answer queries over DD(KB), we apply a well-known transformation from [46], which allows us to treat ≈ as an ordinary predicate. The effects of this transformation were discussed in more detail in Subsection 4.8.5.

For a disjunctive datalog program P, with P_≈ we denote the program consisting of the rules (7.1)–(7.4), where (7.1) is instantiated for each individual a occurring in P, and (7.4) is instantiated for each predicate symbol R occurring in P. (Note that it is not necessary to instantiate (7.4) for R = ≈, since such a rule logically follows from symmetry and transitivity.) From [46, 26], it is well known that, for ϕ a first-order formula, P |= ϕ (where |= denotes the usual first-order entailment w.r.t. an equational theory) if and only if P ∪ P_≈ |=f ϕ (where |=f denotes first-order entailment where equality is not given any special meaning). In other words, appending P_≈ to P allows us to treat ≈ as an ordinary predicate, and to use any existing reasoning technique for disjunctive datalog without equality.

(7.1)    a ≈ a
(7.2)    x ≈ y ← y ≈ x
(7.3)    x ≈ z ← x ≈ y, y ≈ z

(7.4)    R(x1, ..., yi, ..., xn) ← R(x1, ..., xi, ..., xn), xi ≈ yi
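To illustrate the size and shape of this axiomatization, the following fragment enumerates P_≈ for a small signature with abstract positions only (as discussed below, concrete positions are excluded); the rule-string syntax and the signature are ad-hoc assumptions of the example.

def equality_axioms(constants, predicates):
    """predicates: predicate name -> arity; all positions are assumed abstract."""
    rules = [f"{a} ~ {a}." for a in constants]                       # (7.1)
    rules += ["X ~ Y :- Y ~ X.",                                     # (7.2)
              "X ~ Z :- X ~ Y, Y ~ Z."]                              # (7.3)
    for pred, arity in predicates.items():                           # (7.4): one rule
        for i in range(arity):                                       #   per position
            args = [f"X{j}" for j in range(arity)]
            new_args = args[:i] + [f"Y{i}"] + args[i + 1:]
            rules.append(f"{pred}({', '.join(new_args)}) :- "
                         f"{pred}({', '.join(args)}), X{i} ~ Y{i}.")
    return rules

for rule in equality_axioms(["a", "b"], {"R": 2, "GrWS": 1}):
    print(rule)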

The results in [46, 26] consider only first-order logic without concrete domains. However, by Definition 3.2.2, only concrete equality ≈D is allowed for concrete terms. Since ≈D is treated like any other concrete predicate, by instantiating (7.4) in P_≈ only for positions of predicates that are of a sort other than c, the encoding is sound and complete even if P contains concrete predicates. Observe also that (7.4) is instantiated at most once for each predicate R and each position i in R. Since the number of predicates and the number of positions are both linear in |P|, this does not increase the complexity of reasoning.

The presented approach to handling equality may prove inefficient in practice, since the rules from P_≈ can be used to derive the same conclusion multiple times. Therefore, we present an optimization that allows us to omit certain instances of (7.4).

Theorem 7.5.1. For a disjunctive datalog program P, let P_≈^F be the subset of P_≈ where (7.4) is instantiated only for those predicates R and positions i for which R occurs either in a head or a body atom R(t) of some rule in P, and t contains a constant at position i. Then, the following claims hold:

1. P ∪ P_≈ is satisfiable if and only if P ∪ P_≈^F is satisfiable.

2. P ∪ P_≈ |=f Q(a) if and only if P ∪ P_≈^F ∪ P_≈^Q |=f Q(a) for a ground atom Q(a), where P_≈^Q is obtained by instantiating (7.4) for all positions in Q.

Proof. (Claim 1.) Since P_≈^F ⊆ P_≈, the (⇒) direction is trivial. For the (⇐) direction, let I be a Herbrand model of P ∪ P_≈^F. Because P_≈^F contains the axioms (7.1), (7.2), and (7.3), the relation ≈ is an equivalence relation. Let us choose a distinct object for each equivalence class of ≈, and let ã denote the object chosen for the equivalence class that a belongs to. For α a substitution, atom, literal, or rule, with α̃ we denote the result of replacing each a in α with ã. Finally, let I≈ be the smallest Herbrand interpretation such that I ⊆ I≈ and R(ã1, ..., ãn) ∈ I implies R(a1, ..., an) ∈ I≈ for all distinct constants ai. We now show that I≈ is a model of P ∪ P_≈.

The interpretation I≈ obviously satisfies all rules from P_≈. Let C be a rule from P (for the purpose of this proof, we consider a rule to be equivalent to a clause) and τ a substitution such that Cτ is ground. Because I |=f P ∪ P_≈^F, we have I |=f Cτ̃. Let L ∈ C be a literal such that I |=f Lτ̃. Because the rule (7.4) is instantiated in P_≈^F for the positions at which L contains constants, I |=f L̃τ̃. By definition, I and I≈ coincide on ground atoms such that A = Ã; hence, I≈ |=f L̃τ̃. Finally, I≈ is an interpretation in which ≈ is a congruence relation, so I≈ |=f Lτ, and therefore I≈ |=f Cτ. Because this holds for all substitutions τ and rules C, we have I≈ |=f P.

(Claim 2.) Observe that P ∪ P_≈ |=f Q(a) if and only if P ∪ P_≈ ∪ {¬Q(a)} = P' is unsatisfiable. Since Q occurs in ¬Q(a) in a negative literal, we instantiate (7.4) for Q, which yields the set of rules P_≈^Q. By the first claim of the theorem, P' is unsatisfiable if and only if P ∪ P_≈^F ∪ P_≈^Q ∪ {¬Q(a)} is unsatisfiable, which is the case if and only if P ∪ P_≈^F ∪ P_≈^Q |=f Q(a).

Usually, in (disjunctive) datalog one distinguishes extensional predicates, which occur only in facts and rule bodies, from intensional predicates, which occur in the heads of rules with a nonempty body. If the rules of P do not contain constants, Theorem 7.5.1 states that, to compute the set of positive ground consequences of P w.r.t. an equational theory, it suffices to instantiate (7.4) only for the extensional predicates and the query predicate.

Another optimization is applicable to an atom A of the form x ≈ t occurring in the body of a rule r ∈ P. Namely, for a first-order formula ϕ, a variable x, and a term t that does not contain x as a proper subterm, the formulae ∀x : (x ≈ t → ϕ) and ϕ{x ↦ t} are equisatisfiable [99]. Hence, we can delete A from the body of r and replace all occurrences of x with t; if the resulting rule contains an unsafe variable y, we append the literal HU(y). (As explained in Section 7.2, HU contains all constants occurring in P.) Furthermore, a literal of the form a ≈ a can simply be deleted from the body of r, since such a literal is always true. This optimization has an important side effect: it allows us to omit all instances of (7.1) from P_≈^F. Namely, the reflexivity axioms a ≈ a can only participate in an inference with a body literal x ≈ y or a ≈ a; however, the literals of the latter form are eagerly eliminated as explained previously.
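The following fragment sketches this body-equality elimination on a toy rule representation; the atom encoding, the convention that variables start with "X", and the HU guard are assumptions of the example rather than the actual implementation.

def is_var(term):
    return term.startswith("X")

def eliminate_body_equalities(head, body):
    """head, body: lists of (predicate, args) atoms; '~' encodes the equality predicate."""
    body = list(body)
    while True:
        eq = next((a for a in body
                   if a[0] == "~" and is_var(a[1][0]) and a[1][0] != a[1][1]), None)
        if eq is None:
            break
        x, t = eq[1]                                   # drop X ~ t and substitute t for X
        subst = lambda args: tuple(t if u == x else u for u in args)
        head = [(p, subst(a)) for p, a in head]
        body = [(p, subst(a)) for p, a in body if (p, a) != eq]
    body = [a for a in body if not (a[0] == "~" and a[1][0] == a[1][1])]   # drop a ~ a
    bound = {u for _, args in body for u in args if is_var(u)}
    for _, args in head:                               # guard unsafe head variables with HU
        for u in args:
            if is_var(u) and u not in bound:
                body.append(("HU", (u,)))
                bound.add(u)
    return head, body

# H(X1) :- R(X1, X2), X2 ~ b   becomes   H(X1) :- R(X1, b).
print(eliminate_body_equalities([("H", ("X1",))],
                                [("R", ("X1", "X2")), ("~", ("X2", "b"))]))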

7.6 Answering Queries in DD(KB)

Currently, the state-of-the-art technique for reasoning in disjunctive datalog is intelligent grounding [43]. It is based on model building, which is performed by generating a ground instantiation of the program rules, generating candidate models, and eliminating those candidates that do not satisfy the ground rules. In order to avoid generating the entire grounding of the program, carefully designed heuristics are applied to generate a subset of the ground rules that has exactly the same set of stable models as the original program. This technique has been successfully implemented in the disjunctive datalog engine DLV [43].

Model building is of central interest in many applications of disjunctive datalog. For example, disjunctive datalog has been successfully applied to planning problems, where plans are decoded from models. Query answering is easily reduced to model building: A is not a certain answer if and only if there is a model not containing A. In our view, the main drawback of query answering by model building is that such an approach provides answering of ground queries only; nonground queries are typically answered by considering all ground instances of the query. Furthermore, the algorithm computes the grounded program, which can be large even for small nonground programs. Note that the models of DD(KB) are of no interest, as they do not reflect the structure of the models of KB (see Section 4.6 for an example).

Hence, we propose to answer nonground queries in DD(KB) by ordered hyperresolution, which may be viewed as an extension of the fixpoint computation of plain datalog. This algorithm computes the set of all answers to a nonground query in one pass, and does not consider each ground instance separately. Furthermore, the algorithm does not compute the program grounding, but works with the nonground program directly. Finally, the algorithm exhibits optimal worst-case complexity. A similar technique was presented in [25]; however, the algorithm presented there does not specify whether application of redundancy elimination techniques is allowed during hyperresolution. Both techniques are actually extensions of the answer literal technique, first proposed in [56].

Roughly speaking, we saturate DD(KB) ∪ DD(KB)_≈ by R^D, with all negative literals selected, under an ordering in which all ground query literals are smallest. Additionally, we require that the query predicate does not occur in the body of any rule. Under these assumptions, the saturated set of clauses contains the cautious consequences w.r.t. the query predicate as unit clauses. Because the ordering is total, in each ground disjunction there is exactly one maximal literal, so the semi-naive bottom-up saturation [112] or various join order optimizations can easily be adapted to the disjunctive case.
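For intuition, the following fragment implements the plain, non-disjunctive and equality-free special case of this bottom-up saturation as a semi-naive fixpoint; it is a didactic sketch under an ad-hoc fact representation, not the disjunctive procedure defined below.

from itertools import product

def match(atom, fact, subst):
    """Extend subst so that atom (with uppercase variables) matches the ground fact."""
    if atom[0] != fact[0] or len(atom[1]) != len(fact[1]):
        return None
    s = dict(subst)
    for a, f in zip(atom[1], fact[1]):
        if a[0].isupper():                         # variable
            if s.setdefault(a, f) != f:
                return None
        elif a != f:                               # mismatching constant
            return None
    return s

def seminaive(rules, facts):
    """rules: list of (head_atom, [body_atoms]); atoms are (predicate, (terms, ...))."""
    known, delta = set(facts), set(facts)
    while delta:
        new = set()
        for head, body in rules:
            for i in range(len(body)):             # at least one premise from delta
                for choice in product(*[delta if j == i else known
                                        for j in range(len(body))]):
                    subst = {}
                    for atom, fact in zip(body, choice):
                        subst = match(atom, fact, subst)
                        if subst is None:
                            break
                    if subst is not None:
                        new.add((head[0], tuple(subst.get(t, t) for t in head[1])))
        delta = new - known
        known |= delta
    return known

# Toy run: C(X) :- A(X), B(X) over facts A(a), B(a) derives C(a).
print(seminaive([(("C", ("X",)), [("A", ("X",)), ("B", ("X",))])],
                {("A", ("a",)), ("B", ("a",))}))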

Definition 7.6.1. For a predicate symbol Q, let R_Q^D denote the R^D calculus parameterized as follows:

• All negative literals are selected.

• All ground atoms of the form Q(a) are smallest in the ordering ≻.

A simple ordering compatible with R_Q^D can be obtained in the following way. Let ≻P and ≻C be orderings on predicate and constant symbols, respectively, such that the query predicate Q is the smallest element in ≻P. Now A(a1, . . . , an) ≻Q B(b1, . . . , bm) if and only if A ≻P B, or A = B and an index i exists such that aj = bj for j < i and ai ≻C bi. It is easy to see that the ordering ≻Q is compatible with Definition 7.6.1. Since in R^D all negative literals are selected, the definition of ≻Q on negative literals is not important.
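The ordering ≻Q just described can be sketched in Python as follows, assuming that ≻P and ≻C are given simply as lists of predicate and constant symbols ordered from smallest to largest and that ground atoms are (predicate, argument-tuple) pairs; the encoding is only illustrative.

```python
# Sketch of the ordering ≻_Q on ground atoms: compare predicates first
# (with the query predicate Q smallest), then compare the argument tuples
# lexicographically using the ordering on constants.

def make_atom_key(predicate_order, constant_order):
    """predicate_order / constant_order: symbols listed from smallest to
    largest; the query predicate must come first in predicate_order."""
    pred_rank = {p: i for i, p in enumerate(predicate_order)}
    const_rank = {c: i for i, c in enumerate(constant_order)}

    def key(atom):
        pred, args = atom
        return (pred_rank[pred], tuple(const_rank[a] for a in args))
    return key

key = make_atom_key(["Q", "A", "B"], ["a", "b"])
assert key(("Q", ("b",))) < key(("A", ("a",)))        # Q-atoms are smallest
assert key(("A", ("a", "b"))) < key(("A", ("b", "a")))
```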

Lemma 7.6.2. Let P be a positive satisfiable disjunctive datalog program, Q a predicate not occurring in the body of any rule in P, and Q(a) a ground literal not containing concrete terms. Then P |=c Q(a) if and only if Q(a) ∈ N, where N is the set of clauses obtained by saturating a c-factor of P under R_Q^D up to redundancy.

Proof. Let P′ be a c-factor of P. Since R_Q^D is sound and complete, P |=c Q(a) if and only if the set of clauses N′, obtained by saturating P′ ∪ {¬Q(a)} by R_Q^D up to redundancy, contains the empty clause. Note that, since all clauses in P′ are safe and all negative literals are selected, all hyperresolvents are positive ground clauses. Since P is satisfiable, N′ contains the empty clause if and only if a hyperresolution step with ¬Q(a) is performed in the saturation by R_Q^D. Since the literals containing Q are smallest in the ordering of R_Q^D, a positive literal Q(a) can be maximal only in a clause C = Q(a) ∨ D, where D contains only literals with the Q predicate. Since ¬Q(a) is the only clause in which Q occurs negatively, if D is not empty, no literal from D can be eliminated by a subsequent hyperresolution inference. Hence, the empty clause can be derived from such a C if and only if D is empty, but then Q(a) ∈ N.

Assuming that Q does not occur in the body of any rule in P does not reduce the generality of the approach, as one can always add a new rule of the form A_Q(x) ← Q(x) to satisfy the lemma conditions. Also, note that redundancy elimination techniques can freely be applied in the saturation. We now consider the complexity of answering queries in DD(KB) by R_Q^D.

Theorem 7.6.3. Let KB be an ALCHIQ(D) knowledge base, defined over a concrete domain D such that D-satisfiability of finite conjunctions over ΦD can be decided in deterministic exponential time. Then, the set of all α such that DD(KB) |=c α, for α of the form C(a) or R(a, b) with R an abstract role, can be computed by saturating a c-factor of DD(KB) ∪ DD(KB)≈ with R_Q^D in time exponential in |KB|, assuming a bound on the arity of concrete predicates and unary coding of numbers in input.

Proof. Let P be the c-factor of DD(KB) ∪ DD(KB)≈. Since the number of predicates in DD(KB) is linear in |KB|, the number of rules in DD(KB)≈ is linear in |KB|. In a way similar to the proof of Lemma 5.3.9, it is easy to see that the maximal length of each ground clause obtained in the saturation of P by R_Q^D is polynomial in |KB|, so the number of ground clauses is exponential in |KB|. Furthermore, in each application of the hyperresolution inference to some rule r, one selects a ground clause for each body literal of r. Since the length of the rules is polynomial in |KB|, there are exponentially many selections of ground clauses, giving rise to exponentially many hyperresolution inferences. Hence, the saturation of P can be performed in time exponential in |KB| and, by Lemma 7.6.2, it computes all certain consequences of P regarding the query predicate, so the claim of the theorem follows.
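To convey the flavor of this saturation, here is a Python sketch for the ground case only: clauses are frozensets of ground atoms, rules are (head, body) pairs of ground atoms, and no ordering or redundancy elimination is used, so this is a simplification of R_Q^D rather than the calculus itself. Reading off certain answers as unit clauses relies on the preconditions of Lemma 7.6.2 (the program is satisfiable and the query predicate does not occur in rule bodies).

```python
from itertools import product

# Ground hyperresolution for a positive disjunctive datalog program:
# derived clauses are frozensets of ground atoms (read as disjunctions);
# facts are positive ground clauses, each given as a list of atoms.

def hyperresolution_closure(rules, facts):
    derived = {frozenset(clause) for clause in facts}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            # for every body atom, collect the derived clauses containing it
            candidates = [[c for c in derived if b in c] for b in body]
            if any(not cs for cs in candidates):
                continue
            for choice in product(*candidates):
                resolvent = set(head)
                for b, c in zip(body, choice):
                    resolvent |= c - {b}          # keep the side literals
                resolvent = frozenset(resolvent)
                if resolvent not in derived:
                    derived.add(resolvent)
                    changed = True
    return derived

def certain_answers(rules, facts, query_pred):
    # cautious consequences for the query predicate appear as unit clauses
    # in the saturated set (cf. Lemma 7.6.2)
    return {next(iter(c)) for c in hyperresolution_closure(rules, facts)
            if len(c) == 1 and next(iter(c))[0] == query_pred}
```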

We briefly discuss the restriction of Theorem 7.6.3 that R is an abstract role. Consider the knowledge base KB containing only the axiom ∃T.=5(a). Obviously, KB |= T(a, 5), where, by convention from Subsection 3.2.1, T(a, 5) is a shortcut for T(a, a5) ∧ =5(a5). The disjunctive datalog program DD(KB) consists of these rules:

(7.5) T (x, xf ) ← Q1(x), Sf (x, xf )
(7.6) =5(xf ) ← Q1(x), Sf (x, xf )
(7.7) Q1(a)
(7.8) Sf (a, af )

The translation into disjunctive datalog does not change the set of entailed facts: DD(KB) |=c T(a, 5) as well. We demonstrate the latter proof-theoretically, by showing that P = DD(KB) ∪ {¬T(a, 5)} is unsatisfiable. To translate ¬T(a, 5) into clauses, we apply c-factoring and obtain (7.9):

(7.9) x ≉D 5 ← T (a, x)

We now saturate P under R_Q^D. Observe that (7.13) is derived by concrete domain resolution, since the set S = {af ≉D 5, =5(af )} is a D-constraint.

(7.10) T (a, af )    R(7.5;7.7;7.8)
(7.11) af ≉D 5    R(7.9;7.10)
(7.12) =5(af )    R(7.6;7.7;7.8)
(7.13) □    D(7.11;7.12)

However, DD(KB) does not contain the constant 5; this constant was added in the refutation only by assuming the negation of the goal T(a, 5). Hence, the fact T(a, 5) cannot be derived by saturating DD(KB) by R_Q^D. To summarize, we can freely use DD(KB) to answer queries by refutation, but we cannot use R_Q^D to answer queries regarding concrete roles or concrete literals.

7.7 Example

We now show how to translate a subset of KB 1 ∪ KB 2 ∪ KB 3 ∪ KB 4 into disjunctive datalog (KB 1 and KB 2 are given in Subsection 3.1.1, while KB 3 and KB 4 are given in Subsection 3.2.3). Namely, the axioms involving the role contains (in particular the transitivity axiom) generate many closures that participate in many inferences, but do not contribute significantly to understanding the algorithm. Hence, to make the example easier to follow, let KB 5 contain the axioms (3.1)–(3.9), (3.12)–(3.14), (3.16), (3.18)–(3.21), and the following axiom (7.14):

(7.14) ∃hasVD.Adpt3DAcc ⊑ Q1

The axiom (7.14) introduces a new name for the concept ∃hasVD.Adpt3DAcc. Hence, to show that KB 5 |= ∃hasVD.Adpt3DAcc(pc1 ), we check whether KB 5 |= Q1(pc1 ).

Translation into Closures. Translating KB 5 into closures is explained in Subsections 5.5 and 6.3. We recapitulate all closures from Ξ(KB 5), but we renumber the predicates in Ξ(KB 5) introduced by structural transformation to use consecutive indices. We also reorder the closures to make the order of applying inferences easier to follow. The following closures correspond to axioms from KB 1 and KB 2:

(7.15) ¬PC (x) ∨ hasAdpt(x, f(x))
(7.16) ¬GrWS(x) ∨ has3DAcc(x, g(x))
(7.17) ¬hasAdpt(x, y) ∨ hasVD(x, y)
(7.18) ¬has3DAcc(x, y) ∨ hasVD(x, y)
(7.19) ¬hasAdpt(x, y) ∨ Adpt(y)
(7.20) ¬has3DAcc(x, y) ∨ 3DAcc(y)

(7.21) ¬GaPC (x) ∨ ¬hasVD(x, y1) ∨ ¬hasVD(x, y2) ∨ y1 ≈ y2 ∨ ¬VD(y1) ∨ ¬VD(y2)
(7.22) ¬Adpt(x) ∨ VD(x)
(7.23) ¬3DAcc(x) ∨ VD(x)
(7.24) ¬Adpt(x) ∨ ¬3DAcc(x) ∨ Adpt3DAcc(x)
(7.25) ¬GrWS(x) ∨ PC (x)
(7.26) ¬GaPC (x) ∨ PC (x)
(7.27) ¬GaPC (x) ∨ GrWS(x)

(7.28) GaPC (pc1 )

The following closure corresponds to the axiom (7.14):

(7.29) Q1(x) ∨ ¬hasVD(x, y) ∨ ¬Adpt3DAcc(y)

The following closures correspond to axioms from KB 3 and KB 4:

(7.30) ¬Q2(x) ∨ price(x, h(x))
(7.31) ¬Q2(x) ∨ ≥2000(h(x))
(7.32) ¬Q3(x) ∨ price(x, i(x))
(7.33) ¬Q3(x) ∨ ≤1500(i(x))
(7.34) ¬price(x, y1) ∨ ¬price(x, y2) ∨ y1 ≈D y2
(7.35) ¬HighEnd(x) ∨ GrWS(x) ∨ Q2(x)
(7.36) ¬Q2(x) ∨ PC (x)
(7.37) Q3(pc2 )
(7.38) HighEnd(pc2 )

Finally, we introduce the closures from gen(KB 5). To make the notation more compact, we use Q4 for QhasAdpt,g and Q5 for Qhas3DAcc,f .

(7.39) ¬Q4(x) ∨ hasAdpt(x, [g(x)])
(7.40) ¬Q5(x) ∨ has3DAcc(x, [f(x)])

Saturation of Nonground Closures by BS_DL^{D,+}. We now saturate the nonground closures of Ξ(KB 5) ∪ gen(KB 5) by BS_DL^{D,+}. We perform all decomposition inferences, since in this particular example this reduces the total number of inferences; we use Q6 for QhasVD,f and Q7 for QhasVD,g. Conclusions that are tautologies are denoted with †.

⇒ (7.41) ¬PC (x) ∨ hasAdpt(x, f(x))

⇒ (7.42) ¬Q4(x) ∨ hasAdpt(x, [g(x)])

⇒ (7.43) ¬GrWS(x) ∨ has3DAcc(x, g(x))

⇒ (7.44) ¬Q5(x) ∨ has3DAcc(x, [f(x)])

⇒ (7.45) ¬hasAdpt(x, y) ∨ hasVD(x, y)

R(7.41) ¬Q6(x) ∨ hasVD(x, [f(x)])
R(7.41) ¬PC (x) ∨ Q6(x)
R(7.42) ¬Q7(x) ∨ hasVD(x, [g(x)])
R(7.42) ¬Q4(x) ∨ Q7(x)

⇒ (7.46) ¬has3DAcc(x, y) ∨ hasVD(x, y)

R(7.43) ¬GrWS(x) ∨ Q7(x)
R(7.44) ¬Q5(x) ∨ Q6(x)

⇒ (7.47) ¬hasAdpt(x, y) ∨ Adpt(y)
R(7.41) ¬PC (x) ∨ Adpt([f(x)])
R(7.42) ¬Q4(x) ∨ Adpt([g(x)])

⇒ (7.48) ¬has3DAcc(x, y) ∨ 3DAcc(y)
R(7.43) ¬GrWS(x) ∨ 3DAcc([g(x)])
R(7.44) ¬Q5(x) ∨ 3DAcc([f(x)])

⇒ (7.49) ¬Q6(x) ∨ hasVD(x, [f(x)])

⇒ (7.50) ¬Q7(x) ∨ hasVD(x, [g(x)])

⇒ (7.51) Q1(x) ∨ ¬hasVD(x, y) ∨ ¬Adpt3DAcc(y)

R(7.49) Q1(x) ∨ ¬Q6(x) ∨ ¬Adpt3DAcc([f(x)])
R(7.50) Q1(x) ∨ ¬Q7(x) ∨ ¬Adpt3DAcc([g(x)])

⇒ (7.52) ¬GaPC (x) ∨ ¬hasVD(x, y1) ∨ ¬hasVD(x, y2) ∨ y1 ≈ y2 ∨ ¬VD(y1) ∨ ¬VD(y2)

R(7.49;7.50) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬VD([g(x)]) ∨ ¬VD([f(x)])

⇒ (7.53) ¬PC (x) ∨ Adpt([f(x)])

⇒ (7.54) ¬Q4(x) ∨ Adpt([g(x)])

⇒ (7.55) ¬GrWS(x) ∨ 3DAcc([g(x)])

⇒ (7.56) ¬Q5(x) ∨ 3DAcc([f(x)])

⇒ (7.57) ¬Adpt(x) ∨ VD(x)

⇒ (7.58) ¬3DAcc(x) ∨ VD(x)

⇒ (7.59) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬VD([g(x)]) ∨ ¬VD([f(x)])

R(7.57) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬Adpt([g(x)]) ∨ ¬VD([f(x)])
R(7.58) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬3DAcc([g(x)]) ∨ ¬VD([f(x)])

⇒ (7.60) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬Adpt([g(x)]) ∨ ¬VD([f(x)])

R(7.54) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬Q4(x) ∨ ¬VD([f(x)])

⇒ (7.61) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬3DAcc([g(x)]) ∨ ¬VD([f(x)])

R(7.55) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬GrWS(x) ∨ ¬VD([f(x)])

⇒ (7.62) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬Q4(x) ∨ ¬VD([f(x)])

S(7.43) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬Q4(x) ∨ ¬VD([f(x)])

⇒ (7.63) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ [g(x)] ≈ [f(x)] ∨ ¬GrWS(x) ∨ ¬VD([f(x)])

S(7.43) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬GrWS(x) ∨ ¬VD([f(x)])

We have performed all inferences involving closures of KB 1 containing binary literals; next we perform inferences with unary literals containing a function symbol.

⇒ (7.64) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬Q4(x) ∨ ¬VD([f(x)])

R(7.57) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬Q4(x) ∨ ¬Adpt([f(x)])
R(7.58) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬Q4(x) ∨ ¬3DAcc([f(x)])

⇒ (7.65) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬GrWS(x) ∨ ¬VD([f(x)])

R(7.57) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬GrWS(x) ∨ ¬Adpt([f(x)])
R(7.58) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬GrWS(x) ∨ ¬3DAcc([f(x)])

⇒ (7.66) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬Q4(x) ∨ ¬Adpt([f(x)])

R(7.53) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬Q4(x) ∨ ¬PC (x)

⇒ (7.67) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬Q4(x) ∨ ¬3DAcc([f(x)])

R(7.56)† ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬Q4(x) ∨ ¬Q5(x)

⇒ (7.68) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬GrWS(x) ∨ ¬Adpt([f(x)])

R(7.53) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬GrWS(x) ∨ ¬PC (x)

⇒ (7.69) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬GrWS(x) ∨ ¬3DAcc([f(x)])

R(7.56)† ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬GrWS(x) ∨ ¬Q5(x)

⇒ (7.70) Q1(x) ∨ ¬Q6(x) ∨ ¬Adpt3DAcc([f(x)])

⇒ (7.71) Q1(x) ∨ ¬Q7(x) ∨ ¬Adpt3DAcc([g(x)])

⇒ (7.72) ¬Adpt(x) ∨ ¬3DAcc(x) ∨ Adpt3DAcc(x)
R(7.70) ¬Adpt([f(x)]) ∨ ¬3DAcc([f(x)]) ∨ Q1(x) ∨ ¬Q6(x)
R(7.71) ¬Adpt([g(x)]) ∨ ¬3DAcc([g(x)]) ∨ Q1(x) ∨ ¬Q7(x)

⇒ (7.73) ¬Adpt([f(x)]) ∨ ¬3DAcc([f(x)]) ∨ Q1(x) ∨ ¬Q6(x)

R(7.53) ¬PC (x) ∨ ¬3DAcc([f(x)]) ∨ Q1(x) ∨ ¬Q6(x)

⇒ (7.74) ¬Adpt([g(x)]) ∨ ¬3DAcc([g(x)]) ∨ Q1(x) ∨ ¬Q7(x)

R(7.54) ¬Q4(x) ∨ ¬3DAcc([g(x)]) ∨ Q1(x) ∨ ¬Q7(x)

⇒ (7.75) ¬PC (x) ∨ ¬3DAcc([f(x)]) ∨ Q1(x) ∨ ¬Q6(x)

R(7.56) ¬PC (x) ∨ ¬Q5(x) ∨ Q1(x) ∨ ¬Q6(x)

⇒ (7.76) ¬Q4(x) ∨ ¬3DAcc([g(x)]) ∨ Q1(x) ∨ ¬Q7(x)

R(7.55) ¬Q4(x) ∨ ¬GrWS(x) ∨ Q1(x) ∨ ¬Q7(x)

Next we perform the inferences with closures from KB 3 containing function symbols:

⇒ (7.77) ¬Q2(x) ∨ price(x, h(x))

⇒ (7.78) ¬Q3(x) ∨ price(x, i(x))

⇒ (7.79) ¬price(x, y1) ∨ ¬price(x, y2) ∨ y1 ≈D y2

R(7.77;7.78) ¬Q2(x) ∨ ¬Q3(x) ∨ [h(x)] ≈D [i(x)]

⇒ (7.80) ¬Q2(x) ∨ ≥2000(h(x))

⇒ (7.81) ¬Q3(x) ∨ ≤1500(i(x))

⇒ (7.82) ¬Q2(x) ∨ ¬Q3(x) ∨ [h(x)] ≈D [i(x)]

D(7.80;7.81) ¬Q2(x) ∨ ¬Q3(x)

We recapitulate in the following table the closures from the unused set at this point. Obviously, all closures produced by the remaining inferences do not contain function symbols; such closures are irrelevant in the sense of Definition 7.3.1 and, by Lemma 7.3.2, are not needed in DD(KB 5). Therefore, we do not show the remaining inferences.

(7.83) ¬GrWS(x) ∨ PC (x)
(7.84) ¬GaPC (x) ∨ PC (x)
(7.85) ¬GaPC (x) ∨ GrWS(x)

(7.86) ¬PC (x) ∨ Q6(x)

(7.87) ¬Q4(x) ∨ Q7(x)

(7.88) ¬GrWS(x) ∨ Q7(x)

(7.89) ¬Q5(x) ∨ Q6(x)

(7.90) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬Q4(x) ∨ ¬PC (x)

(7.91) ¬GaPC (x) ∨ ¬Q6(x) ∨ ¬Q7(x) ∨ Q5(x) ∨ ¬GrWS(x) ∨ ¬PC (x)

(7.92) ¬PC (x) ∨ ¬Q5(x) ∨ Q1(x) ∨ ¬Q6(x)

(7.93) ¬Q4(x) ∨ ¬GrWS(x) ∨ Q1(x) ∨ ¬Q7(x)

(7.94) ¬HighEnd(x) ∨ GrWS(x) ∨ Q2(x)

(7.95) ¬Q2(x) ∨ PC (x)

(7.96) ¬Q2(x) ∨ ¬Q3(x)

Conversion to Disjunctive Datalog. The saturated set of closures is converted into a disjunctive datalog program by eliminating function symbols (see Section 7.2), eliminating irrelevant clauses (see Section 7.3), and translating the obtained clauses to the rule form (see Section 7.4). To understand how to eliminate function symbols, consider the closure (7.41). It contains the term f(x). This term is replaced with a new variable xf and a literal ¬Sf (x, xf ) is appended. Then, the positive literals are moved to the rule head, and the negative literals to the rule body, yielding the rule (7.97). The following rules are obtained by applying the process:

(7.97) hasAdpt(x, xf ) ← PC (x), Sf (x, xf )    (7.41)
(7.98) hasAdpt(x, xg) ← Q4(x), Sg(x, xg)    (7.42)
(7.99) has3DAcc(x, xg) ← GrWS(x), Sg(x, xg)    (7.43)
(7.100) has3DAcc(x, xf ) ← Q5(x), Sf (x, xf )    (7.44)
(7.101) hasVD(x, y) ← hasAdpt(x, y)    (7.45)
(7.102) hasVD(x, y) ← has3DAcc(x, y)    (7.46)
(7.103) Adpt(y) ← hasAdpt(x, y)    (7.47)
(7.104) 3DAcc(y) ← has3DAcc(x, y)    (7.48)
(7.105) hasVD(x, xf ) ← Q6(x), Sf (x, xf )    (7.49)
(7.106) hasVD(x, xg) ← Q7(x), Sg(x, xg)    (7.50)
(7.107) Q1(x) ← hasVD(x, y), Adpt3DAcc(y)    (7.51)
(7.108) y1 ≈ y2 ← GaPC (x), hasVD(x, y1), hasVD(x, y2), VD(y1), VD(y2)    (7.52)
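The elimination of function symbols just applied to (7.41) can be sketched in Python as follows; the closure encoding (signed atoms whose arguments are either names or pairs (f, x) standing for f(x)) is only illustrative, and the term markers [·] are ignored.

```python
# Sketch of the translation of a closure into a datalog rule: each term
# f(x) is replaced by a fresh variable x_f, a body atom S_f(x, x_f) is
# added, positive literals form the head and negative literals the body.

def closure_to_rule(closure):
    """closure: list of (positive?, predicate, args); an argument is a
    string or a pair (f, x) standing for the term f(x)."""
    extra_body, seen = [], {}

    def flatten(arg):
        if isinstance(arg, tuple):               # a functional term f(x)
            f, x = arg
            if (f, x) not in seen:
                seen[(f, x)] = "%s_%s" % (x, f)  # the fresh variable x_f
                extra_body.append(("S_" + f, (x, seen[(f, x)])))
            return seen[(f, x)]
        return arg

    head, body = [], []
    for positive, pred, args in closure:
        atom = (pred, tuple(flatten(a) for a in args))
        (head if positive else body).append(atom)
    return head, body + extra_body

# Closure (7.41): ¬PC(x) ∨ hasAdpt(x, f(x))  becomes rule (7.97):
print(closure_to_rule([(False, "PC", ("x",)),
                       (True, "hasAdpt", ("x", ("f", "x")))]))
# ([('hasAdpt', ('x', 'x_f'))], [('PC', ('x',)), ('S_f', ('x', 'x_f'))])
```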

Consider the closure (7.53); to understand why this closure is irrelevant, we next show the BS_DL^{D,+} inference deriving the closure.

¬PC (x) ∨ hasAdpt(x, f(x))        ¬hasAdpt(x1, y1) ∨ Adpt(y1)
¬PC (x) ∨ Adpt([f(x)])

The most general unifier used in the inference is σ = {x1 ↦ x, y1 ↦ f(x)}. The result of applying σ and λ to the clauses is as follows:

¬PC (x) ∨ hasAdpt(x, f(x))        ¬hasAdpt(x, [f(x)]) ∨ Adpt([f(x)])
¬PC (x) ∨ Adpt([f(x)])

⇓λ

¬Sf (x, xf ) ∨ ¬PC (x) ∨ hasAdpt(x, xf ) ¬hasAdpt(x, xf ) ∨ Adpt(xf ) ∨ ¬Sf (x, xf )

¬PC (x) ∨ Adpt(xf ) ∨ ¬Sf (x, xf )

Now all variables in the inference premises occur in the inference conclusion. Also, all premises are members of the set of saturated closures. It is now clear that (7.53) is irrelevant, since it fulfills the conditions of Definition 7.3.1. In a similar way one can see that all closures (7.54)–(7.56) are irrelevant. Intuitively, a closure C derived by an inference ξ from premises Pi is irrelevant if λ(C) is obtained from λ(Pi) by the inference λ(ξ); in such a case, the premises λ(Pi) imply the conclusion λ(C). In the previous example this is obviously the case, so by removing (7.53) we perform “unresolution.” We contrast this with the relevant closure (7.93). The inference deriving the closure is as follows:

¬Q4(x) ∨ ¬3DAcc([g(x)]) ∨ Q1(x) ∨ ¬Q7(x) ¬GrWS(x1) ∨ 3DAcc([g(x1)])

¬Q4(x) ∨ ¬GrWS(x) ∨ Q1(x) ∨ ¬Q7(x)

The most general unifier is σ = {x1 ↦ x}; we now show the effects of applying σ and λ to the inference:

¬Q4(x) ∨ ¬3DAcc([g(x)]) ∨ Q1(x) ∨ ¬Q7(x) ¬GrWS(x) ∨ 3DAcc([g(x)])

¬Q4(x) ∨ ¬GrWS(x) ∨ Q1(x) ∨ ¬Q7(x)

⇓λ

¬Sg(x, xg) ∨ ¬Q4(x) ∨ ¬3DAcc(xg) ∨ Q1(x) ∨ ¬Q7(x) ¬GrWS(x) ∨ 3DAcc(xg) ∨ ¬Sg(x, xg)

¬Q4(x) ∨ ¬GrWS(x) ∨ Q1(x) ∨ ¬Q7(x)

In this case, λ(C) is not obtained by resolving λ(P1) and λ(P2), since λ(C) does not contain ¬Sg(x, xg). The inference deriving (7.93) eliminates the function symbol g from the premises. Such inferences are crucial in the reduction, since they ensure that the obtained disjunctive datalog program entails the same set of ground facts as the original knowledge base. Note that, although irrelevant closures are removed in the translation to disjunctive datalog, they still have to be derived during saturation. Namely, the irrelevant closure (7.53) is used (indirectly) in the saturation to derive the relevant closure (7.93). We now translate the remaining (relevant) closures into disjunctive datalog rules.

(7.109) VD(x) ← Adpt(x)    (7.57)
(7.110) VD(x) ← 3DAcc(x)    (7.58)
(7.111) Q5(x) ← GaPC (x), Q6(x), Q7(x), Q4(x), VD(xf ), Sf (x, xf )    (7.64)
(7.112) Q5(x) ← GaPC (x), Q6(x), Q7(x), GrWS(x), VD(xf ), Sf (x, xf )    (7.65)
(7.113) Adpt3DAcc(x) ← Adpt(x), 3DAcc(x)    (7.72)
(7.114) price(x, xh) ← Q2(x), Sh(x, xh)    (7.77)
(7.115) price(x, xi) ← Q3(x), Si(x, xi)    (7.78)
(7.116) y1 ≈D y2 ← price(x, y1), price(x, y2)    (7.79)
(7.117) ≥2000(xh) ← Q2(x), Sh(x, xh)    (7.80)
(7.118) ≤1500(xi) ← Q3(x), Si(x, xi)    (7.81)
(7.119) PC (x) ← GrWS(x)    (7.83)
(7.120) PC (x) ← GaPC (x)    (7.84)
(7.121) GrWS(x) ← GaPC (x)    (7.85)
(7.122) Q6(x) ← PC (x)    (7.86)
(7.123) Q7(x) ← Q4(x)    (7.87)
(7.124) Q7(x) ← GrWS(x)    (7.88)
(7.125) Q6(x) ← Q5(x)    (7.89)
(7.126) Q5(x) ← GaPC (x), Q6(x), Q7(x), Q4(x), PC (x)    (7.90)
(7.127) Q5(x) ← GaPC (x), Q6(x), Q7(x), GrWS(x), PC (x)    (7.91)
(7.128) Q1(x) ← PC (x), Q5(x), Q6(x)    (7.92)
(7.129) Q1(x) ← Q4(x), GrWS(x), Q7(x)    (7.93)
(7.130) GrWS(x) ∨ Q2(x) ← HighEnd(x)    (7.94)
(7.131) PC (x) ← Q2(x)    (7.95)
(7.132) ← Q2(x), Q3(x)    (7.96)

We finally append the ABox clauses:

(7.133) GaPC (pc1 )    (7.28)
(7.134) Q3(pc2 )    (7.37)
(7.135) HighEnd(pc2 )    (7.38)
(7.136) Sf (pc1 , pc1f )
(7.137) Sf (pc2 , pc2f )
(7.138) Sg(pc1 , pc1g )
(7.139) Sg(pc2 , pc2g )
(7.140) Sh(pc1 , pc1h )
(7.141) Sh(pc2 , pc2h )
(7.142) Si(pc1 , pc1i )
(7.143) Si(pc2 , pc2i )

We now take DD(KB 5) to be the program consisting of rules (7.97)–(7.143). In this program, of all the predicates Q∗, the predicate Q1 is introduced for query answering and the predicate Q3 occurs in the ABox. The predicates Q2, Q4, Q5, Q6, and Q7 are introduced by decomposition, and are not of interest in query answering. Therefore, to reduce the size of the program, one could apply the standard rule-unfolding strategy; we refrain from doing so here for the sake of brevity.

Answering Queries in DD(KB 5). The program DD(KB 5) can be used to answer queries. We show that KB 5 |= ∃hasVD.Adpt3DAcc(pc1 ), see Subsection 3.1.1, by demonstrating that DD(KB 5) |=c Q1(pc1 ). Since DD(KB 5) is large, we do not show all inferences in the bottom-up saturation, but only those leading to the goal.

(7.144) PC (pc1 ) R(7.120;7.133)

(7.145) GrWS(pc1 ) R(7.121;7.133)

(7.146) Q6(pc1 ) R(7.122;7.144)

(7.147) Q7(pc1 ) R(7.124;7.145)

(7.148) Q5(pc1 ) R(7.127;7.133;7.146;7.147;7.145;7.144)

(7.149) Q1(pc1 ) R(7.128;7.144;7.148;7.146)
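The derivation above can be replayed with a naive datalog fixpoint. Since every rule involved uses a single variable, the following Python sketch treats all predicates as unary; the encoding of rules as (head, body) pairs is, of course, only illustrative.

```python
# Naive fixpoint over the rules used in deriving Q1(pc1):
# (7.120)-(7.122), (7.124), (7.127), (7.128), and the fact (7.133).

rules = [
    ("PC",   ["GaPC"]),                            # (7.120)
    ("GrWS", ["GaPC"]),                            # (7.121)
    ("Q6",   ["PC"]),                              # (7.122)
    ("Q7",   ["GrWS"]),                            # (7.124)
    ("Q5",   ["GaPC", "Q6", "Q7", "GrWS", "PC"]),  # (7.127)
    ("Q1",   ["PC", "Q5", "Q6"]),                  # (7.128)
]
facts = {("GaPC", "pc1")}                          # (7.133)

changed = True
while changed:
    changed = False
    constants = {c for (_, c) in facts}
    for head, body in rules:
        for c in constants:
            if all((b, c) in facts for b in body) and (head, c) not in facts:
                facts.add((head, c))
                changed = True

# DD(KB_5) |=c Q1(pc1), and hence KB_5 |= ∃hasVD.Adpt3DAcc(pc1)
assert ("Q1", "pc1") in facts
```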

In a similar way, we show that KB 5 |= GrWS(pc2 ), see Subsection 3.2.3. The query GrWS(x) cannot be answered directly, since the predicate GrWS occurs in the body of several rules, so we add a new rule (7.150). We use an ordering as in Section 7.6, with Q8 being the smallest predicate.

(7.150) Q8(x) ← GrWS(x)

(7.151) GrWS(pc2 ) ∨ Q2(pc2 ) R(7.130;7.135)

(7.152) Q8(pc2 ) ∨ Q2(pc2 ) R(7.150;7.151)

(7.153) Q8(pc2 ) R(7.132;7.152;7.134)

7.8 Related Work

Our work was largely motivated by [57], where the authors investigated a decidable intersection of description logics and logic programming. In particular, the authors identify the description logic constructs that can be encoded and executed using existing rule engines. Thus, the description logic component allows existential quantifiers to occur only under negative polarity, and universal quantifiers only under positive polarity. The authors present an operator for translating a description logic knowledge base into a logic program. This approach was later extended in [148] to support a more expressive description logic, provided that the rule engine supports advanced features. However, our approach is a significant extension, since we handle full SHIQ(D).

In [64], it was shown how to convert SHIQ∗ knowledge bases into conceptual logic programs (CLPs). CLPs generalize the good properties of description logics to the framework of answer set programming. Apart from the usual constructs, SHIQ∗ supports the transitive closure of roles. Although the presented transformation preserves the semantics of the knowledge base, the obtained answer set program is not safe. Hence, its grounding is infinite, so the program cannot be evaluated using existing answer set solvers; the problem of decidable reasoning for CLPs is addressed by an automata-based technique. On the contrary, our transformation produces a safe program with a finite grounding. Hence, issues related to decidability of reasoning are handled by the transformation, and not by the query answering algorithm: the disjunctive program obtained by our approach can be evaluated using any technique for reasoning in disjunctive datalog programs.

Another approach for reducing description logic knowledge bases to answer set programming was presented in [2]. To deal with existential quantification, this approach uses function symbols. Thus, the Herbrand universe of the programs obtained by the reduction is infinite, so existing answer set solvers cannot be used for reasoning. In fact, decidability is not considered at all.

The query answering algorithm from Section 7.6 was inspired by [25], where the authors suggested that the fixpoint computation of the disjunctive semantics can be optimized by introducing an appropriate literal ordering. Our approach extends the one from [25] by showing that redundancy elimination techniques can be applied in the fixpoint computation. Both algorithms are based on the idea of using answer literals to compute answers, which was first proposed in [56].

As discussed in Section 7.6, the state-of-the-art technique for reasoning in disjunctive datalog is intelligent grounding [43]. It is actually a nondeterministic model building technique. Ground query answering problems are reduced to model building, and nonground query answering is solved by grounding the query and checking each ground answer separately. Our query answering algorithm from Section 7.6 is not based on model building: it does not require grounding the disjunctive program, it computes all answers to nonground queries in one pass, and it does not require nondeterministic guessing. However, it requires exponential space in the worst case, whereas intelligent grounding requires only polynomial space.

Chapter 8

Data Complexity of Reasoning

Algorithms from Chapters 5, 6, and 7 run in worst-case exponential time in |KB|, assuming unary coding of numbers, a bound on the arity of concrete predicates, and an exponential oracle for reasoning with a concrete domain. In [144], SHIQ was shown to be ExpTime-complete for any coding of numbers, so our algorithms are (almost) worst-case optimal. However, our complexity results, as well as the results from [144], actually address the combined complexity of reasoning, which is measured in the size of the entire knowledge base KB, including the RBox, the TBox, and the ABox.

ExpTime-completeness is a rather discouraging result, since |KB| can be large in practice. However, drawing a parallel with traditional database applications, |KB| often depends mainly on the size of the ABox, while the sizes of the TBox and the RBox are usually negligible. Namely, in many applications of description logics, the TBox and the RBox are used as the database schema, whereas the ABox is used as the database instance. For such applications, a much better performance estimate is given by data complexity—the complexity measured only in |KB A|, under the assumption that |KB R ∪ KB T | is bounded by a constant. In practice, data complexity provides a good performance estimate if |KB R| + |KB T | = |KB TR| ≪ |KB A|.

To fully separate the terminological from the assertional knowledge, we assume in this chapter that KB is extensionally reduced; that is, the ABox axioms of KB contain only (negations of) atomic concepts. Namely, complex concepts in ABox assertions actually specify terminological knowledge, so, for extensionally reduced knowledge bases, |KB A| is the measure of the raw input data processed by the algorithm.

To the best of our knowledge, the data complexity of testing knowledge base satisfiability even for the basic logic ALC with general TBoxes was previously unknown. Based on the results from Chapter 7, in Section 8.1 we show that satisfiability checking is NP-complete in the size of the ABox for any logic between ALC and SHIQ(D) (regardless of how numbers are encoded), and that instance and role checking—the basic query answering problems for description logics—are co-NP-complete.

To obtain a formalism with an even better data complexity, in Section 8.2 we define Horn-SHIQ(D)—a logic related to SHIQ(D) analogously as Horn logic is related to

full first-order logic. Namely, Horn-SHIQ(D) provides existential and universal quantifiers, but does not provide for modeling disjunctive knowledge. This restriction allows us to show that the basic reasoning problems for Horn-SHIQ(D) are P-complete in |KB A|. Hence, along the traditional lines of knowledge representation research, the capability of representing disjunctive information is traded for polynomial data complexity. Since disjunctive reasoning is not needed in many applications, Horn-SHIQ(D) is a very appealing logic: it is still very expressive, and there is theoretical evidence that it can be implemented efficiently in practice. Furthermore, Horn-SHIQ(D) extends DL-lite [27], a logic aiming to capture most constructs of formalisms for conceptual modeling, such as Entity–Relationship (ER) diagrams or the Unified Modeling Language (UML), while providing polynomial algorithms for satisfiability checking and conjunctive query answering (assuming an unbounded knowledge base, but a bound on the query size). Horn-SHIQ(D) additionally provides concrete predicates, qualified existential quantification, conditional functionality, and role inclusions, while providing a reasoning algorithm polynomial in the size of data.

To develop an intuition and to provide a more detailed account of our results, in Section 8.3 we compare them with similar results for datalog and its variants [35].

8.1 Data Complexity of Satisfiability

Our results from Chapter 7 show almost immediately that checking satisfiability of a SHIQ(D) knowledge base KB is NP-complete in the size of data.

Lemma 8.1.1 (Membership). For an extensionally reduced SHIQ(D) knowledge base KB, the data complexity of checking D-satisfiability of KB is in NP, assuming a bound on the arity of concrete predicates and that D-satisfiability of finite conjunctions over ΦD can be decided in polynomial time.

Proof. Let c be the number of constants, f the number of function symbols in the signature of Ξ(KB), and s the number of facts in Ξ(KB). By Definition 7.2.2, the number of constants in DD(KB) is bounded by ℓ1 = c + cf (cf accounts for constants of the form af ), and the number of facts in DD(KB) is bounded by ℓ2 = s + c + 2cf (c accounts for facts of the form HU(a), one cf accounts for the facts of the form Sf (a, af ), and the other cf accounts for the facts of the form HU(af )). Since we assume that |KB TR| is bounded by a constant, f is bounded by a constant (regardless of how numbers are encoded), so both ℓ1 and ℓ2 are linear in |KB A|.

Hence, |DD(KB)| is exponential in |KB| only because the number of rules in DD(KB) is bounded by the number of closures obtained by saturating the set of closures ΓTRg = Ξ(KB T ∪ KB R) ∪ gen(KB) by BS_DL^{D,+}. Since ΓTRg does not contain ABox closures, by Lemma 5.3.9, the number of closures in Sat(ΓTRg) is exponential in |KB TR|. Since we assume that the latter is bounded by a constant, both the number of rules in DD(KB) and their length are bounded by constants, so |DD(KB)| is linear in |KB A| and can be computed from KB in linear time. Now the claim of the lemma follows from the well-known fact that checking satisfiability of a disjunctive datalog program is NP-complete [35]. A minor difference is that DD(KB) can contain concrete predicates, so for the sake of completeness we prove this result as well.

The program DD(KB) can be c-factored in time linear in |KB A|, so it suffices to show that the result is d-satisfiable. Let n be the number of rules introduced by c-factoring; obviously, n is linear in |KB A|. Assuming that DD(KB) contains r rules with at most v variables per rule, the number of literals in a ground instantiation ground(DD(KB)) is bounded by r · ℓ1^v + n · ℓ1 + ℓ2 (in each rule, each variable can be replaced in ℓ1 possible ways). Assuming r and v are bounded by constants, |ground(DD(KB))| is polynomial in |KB A|. Now d-satisfiability of ground(DD(KB)) can be checked by nondeterministically generating an interpretation I, and then checking whether it is a d-model. Checking whether I is a model can be performed in polynomial time. To additionally check whether I is a d-model, it suffices to check that the conjunction of all literals of I with a concrete predicate is D-satisfiable, which can be done in polynomial time by assumption. Since DD(KB) and KB are D-equisatisfiable by Theorem 7.4.2, the claim of the lemma follows.

Schaerf showed in [126, Lemma 4.2.7] that the problem of instance checking in the very simple logic ALE is co-NP-hard in the size of data. The proof employs a reduction from 2-2-Satisfiability—the problem of checking satisfiability of CNF formulae in which each clause contains exactly two positive and two negative literals—which is known to be NP-complete. This reduction produces an extensionally reduced knowledge base, so it immediately provides a lower bound for the data complexity of satisfiability checking.
For the sake of completeness, we give a simple, alternative hardness proof.

Lemma 8.1.2 (Hardness). Checking satisfiability of an ALC knowledge base KB is NP-hard in |KB A|.

Proof. The proof is by a reduction from 3-Graph Coloring: a graph G is said to be 3-colorable if and only if it is possible to assign a single color from the set {Red, Green, Blue} to each of the graph's vertices such that no two adjacent vertices have the same color. Deciding whether G is 3-colorable is NP-complete [104]. For a graph G, we construct the knowledge base KB G, whose ABox contains the assertions edge(e, f) and edge(f, e) for each edge ⟨e, f⟩ in G, and whose TBox contains the following axioms:

(8.1) ⊤ ⊑ Red ⊔ Green ⊔ Blue
(8.2) Red ⊓ Green ⊑ ⊥
(8.3) Green ⊓ Blue ⊑ ⊥
(8.4) Red ⊓ Blue ⊑ ⊥
(8.5) Red ⊑ ∀edge.¬Red
(8.6) Green ⊑ ∀edge.¬Green
(8.7) Blue ⊑ ∀edge.¬Blue

Obviously, G is 3-colorable if and only if KB G is satisfiable. Namely, (8.1) states that each individual (that is, each vertex) in a model is assigned at least one color; (8.2)–(8.4) specify that the colors are pairwise disjoint, so each vertex is assigned at most one color; and (8.5)–(8.7) specify the conditions of 3-colorability. Hence, each model of KB G yields an assignment of colors to the vertices of G; conversely, each 3-coloring of G defines a model of KB G. Since the size of the TBox of KB G is constant, and the size of the ABox of KB G is twice the number of edges of G, the claim of the lemma follows.
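The reduction itself is easily mechanized; the following Python sketch, which assumes the graph is given as a list of edges, merely emits the fixed TBox together with the ABox assertions described above.

```python
# Sketch of the reduction from 3-Graph Coloring: the TBox is fixed, and
# the ABox grows linearly with the number of edges of the input graph G.

TBOX = [
    "⊤ ⊑ Red ⊔ Green ⊔ Blue",        # (8.1) every vertex gets some color
    "Red ⊓ Green ⊑ ⊥",               # (8.2)
    "Green ⊓ Blue ⊑ ⊥",              # (8.3)
    "Red ⊓ Blue ⊑ ⊥",                # (8.4) colors are pairwise disjoint
    "Red ⊑ ∀edge.¬Red",              # (8.5)
    "Green ⊑ ∀edge.¬Green",          # (8.6)
    "Blue ⊑ ∀edge.¬Blue",            # (8.7) adjacent vertices differ
]

def abox_of(edges):
    """edges: iterable of pairs (e, f) of vertex names."""
    abox = []
    for e, f in edges:
        abox.append("edge(%s, %s)" % (e, f))
        abox.append("edge(%s, %s)" % (f, e))
    return abox

# A triangle is 3-colorable, so the resulting KB_G is satisfiable.
print(TBOX + abox_of([("v1", "v2"), ("v2", "v3"), ("v1", "v3")]))
```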

We now state the main result of this section.

Theorem 8.1.3. Let KB be an extensionally reduced knowledge base expressed in any logic between ALC and SHIQ(D). Assuming a polynomial oracle for reasoning with concrete domains and a bound on the arity of concrete domain predicates, the following claims hold:

• Checking KB (D-)satisfiability is data complete for NP.

• Checking KB (D-)unsatisfiability is data complete for co-NP.

• Checking whether KB |=(D) α, for α of the form (¬)C(a) with |C| bounded or of the form (¬)R(a, b), is also data complete for co-NP.

Proof. The first claim is a direct consequence of lemmata 8.1.1 and 8.1.2. Checking (D-)unsatisfiability is a complementary problem to checking (D-)satisfiability, so the second claim follows from the first one. Furthermore, both instance checking (assuming |C| is bounded) and role checking can be reduced to (D-)unsatisfiability in constant time, so the third claim follows from the second.

We finish this section with a remark that, in an extensionally reduced knowledge base, ABox assertions do not contain number restrictions, so |KB A| does not depend on the used number coding. Hence, the results of Theorem 8.1.3 also do not depend on the coding of numbers.

8.2 A Horn Fragment of SHIQ(D)

Horn logic is a well-known fragment of first-order logic in which formulae are restricted to clauses containing at most one positive literal. Because of this restriction, a Horn clause can be understood as a rule in which a conjunction of several body literals implies one head literal. The main limitation of Horn logic is the inability to represent disjunctive information; its main benefit is the existence of practical refutation procedures, such as SLD-resolution [109]. Furthermore, if function symbols are not used, query answering in Horn logic is data complete for P [145, 78, 35], thus making function-free Horn logic appealing for practical usage.

Following this idea, in this section we identify the Horn fragment of SHIQ(D) that provides similar properties. In Horn-SHIQ(D), the capability of representing disjunctive information is traded for P-complete data complexity of reasoning. Roughly speaking, only TBox axioms of the form ⊓ Ci ⊑ D are allowed, where Ci is A, ∃R.A, or ≥ n R.A, and D is A, ⊥, ∃R.A, ∀R.A, ≥ n R.A, or ≤ 1 R; additionally, Horn-SHIQ(D) allows role inclusion axioms and, in certain situations, transitivity axioms, as discussed after Definition 8.2.1. Whereas such a definition succinctly demonstrates the expressivity of the fragment, it is too restrictive in general. For example, the axiom A1 ⊔ A2 ⊑ ¬B is not of this restricted form, but it is equivalent to the Horn axioms A1 ⊓ B ⊑ ⊥ and A2 ⊓ B ⊑ ⊥. Similarly, the non-Horn-shaped axiom A ⊑ ∃R.(∃R.B) can be transformed into Horn axioms A ⊑ ∃R.Q and Q ⊑ ∃R.B by introducing a new name Q for the subconcept ∃R.B. To avoid dependency on such obvious syntactic transformations, we use the following, rather technical definition of Horn-SHIQ(D):

Definition 8.2.1. Let pl+ and pl− be two mutually recursive functions assigning a nonnegative integer to a SHIQ(D) concept D, defined inductively as in Table 8.1, where sgn(0) = 0, and sgn(n) = 1 for n > 0. For a concept D and a position p of a subconcept in D, let pl(D, p) be defined as follows:1

pl(D, p) = pl+(D|p) if pol(D, p) = 1,   and   pl(D, p) = pl−(D|p) if pol(D, p) = −1.

A SHIQ(D) concept C is a Horn concept if pl(C, p) ≤ 1 for each position p of a subconcept in C (including the empty position ε). An extensionally reduced ALCHIQ(D) knowledge base KB is Horn if, for each axiom C ⊑ D ∈ KB T , the concept ¬C ⊔ D is Horn. An extensionally reduced SHIQ(D) knowledge base KB is Horn if Ω(KB) is Horn.

Observe that, for a concept C without nonatomic subconcepts, pl+(C) yields the maximal number of positive literals in clauses obtained by clausifying ∀x : πy(C, x). To clausify a concept C containing a nonatomic subconcept at position p, one should consider whether C|p occurs in C under positive or negative polarity. For example, in the concept ¬(¬A ⊓ ¬B), the concepts A and B actually occur positively, and ⊓ actually acts as ⊔. Hence, pl+(C|p) (respectively pl−(C|p)) counts the number of positive literals used to clausify C|p, provided that C|p occurs in C under positive (respectively negative) polarity. The function sgn(·) takes into account that C|p is replaced in C by the structural transformation with only one atomic concept, even if clausification of C|p produces more than one positive literal. For example, to clausify C = ∀R.(D1 ⊔ D2), the structural transformation replaces D1 ⊔ D2 with a new atomic concept Q, yielding C′ = ∀R.Q; clausifying C′ now produces a clause with only one positive literal. A concept C is Horn if the maximal number of positive literals obtained by clausifying subconcepts of C is at most one.

1The abbreviation pl stems from “[the number of] positive literals.”

Table 8.1: Definitions of pl+ and pl−

D                     pl+(D)                               pl−(D)
⊤                     0                                    0
⊥                     0                                    0
A                     1                                    0
¬C                    pl−(C)                               pl+(C)
C1 ⊓ … ⊓ Cn           max_{1≤i≤n} sgn(pl+(Ci))             Σ_{1≤i≤n} sgn(pl−(Ci))
C1 ⊔ … ⊔ Cn           Σ_{1≤i≤n} sgn(pl+(Ci))               max_{1≤i≤n} pl−(Ci)
∃R.C                  1                                    sgn(pl−(C))
∀R.C                  sgn(pl+(C))                          1
≥ n R.C               1                                    (n−1)n/2 + n · sgn(pl−(C))
≤ n R.C               n(n+1)/2 + (n+1) · sgn(pl−(C))       1
∃T1, …, Tm.d          1                                    1
∀T1, …, Tm.d          1                                    1
≥ n T                 1                                    (n−1)n/2
≤ n T                 n(n+1)/2                             1
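The two functions can be read off Table 8.1 directly; the following Python sketch covers the abstract part of the language (the concrete-domain rows are omitted) and encodes concepts as nested tuples, an encoding that is ours and purely illustrative.

```python
# Sketch of pl+ and pl- from Table 8.1: a concept is ("top",), ("bot",),
# ("atom", A), ("not", C), ("and", C1, ..., Cn), ("or", C1, ..., Cn),
# ("exists", R, C), ("forall", R, C), ("atleast", n, R, C), or
# ("atmost", n, R, C).

def sgn(n):
    return 0 if n == 0 else 1

def pl_pos(c):
    tag = c[0]
    if tag in ("top", "bot"):
        return 0
    if tag == "atom":
        return 1
    if tag == "not":
        return pl_neg(c[1])
    if tag == "and":
        return max(sgn(pl_pos(ci)) for ci in c[1:])
    if tag == "or":
        return sum(sgn(pl_pos(ci)) for ci in c[1:])
    if tag in ("exists", "atleast"):
        return 1
    if tag == "forall":
        return sgn(pl_pos(c[2]))
    if tag == "atmost":
        n = c[1]
        return n * (n + 1) // 2 + (n + 1) * sgn(pl_neg(c[3]))
    raise ValueError(tag)

def pl_neg(c):
    tag = c[0]
    if tag in ("top", "bot", "atom"):
        return 0
    if tag == "not":
        return pl_pos(c[1])
    if tag == "and":
        return sum(sgn(pl_neg(ci)) for ci in c[1:])
    if tag == "or":
        return max(pl_neg(ci) for ci in c[1:])
    if tag == "exists":
        return sgn(pl_neg(c[2]))
    if tag in ("forall", "atmost"):
        return 1
    if tag == "atleast":
        n = c[1]
        return n * (n - 1) // 2 + n * sgn(pl_neg(c[3]))
    raise ValueError(tag)

print(pl_pos(("atmost", 1, "R", ("atom", "A"))))           # 1
print(pl_pos(("atmost", 2, "R", ("not", ("atom", "A")))))  # 6
```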

Recall that, although transitivity axioms are translated by π into Horn clauses, the algorithm from Section 5.2 replaces them by axioms of the form ∀R.C ⊑ ∀S.(∀S.C). Now pl+(∃R.¬C ⊔ ∀S.(∀S.C)) = 1 + pl+(C), so if pl+(C) > 0, Ω(KB) is not a Horn knowledge base. Hence, the presence of transitivity axioms can make KB non-Horn.

If a concept C has a nonatomic subconcept at position p, special care has to be taken when introducing a new name α for C|p. For example, consider the Horn concept C = ∀R.D1 ⊔ ∀R.¬D2. To apply the structural transformation to C, one might replace ∀R.D1 and ∀R.¬D2 with new atomic concepts Q1 and Q2, yielding ¬Q1 ⊔ ∀R.D1, ¬Q2 ⊔ ∀R.¬D2, and Q1 ⊔ Q2. The problem is that the concept C is Horn, but Q1 ⊔ Q2 is not. The previous example shows that a straightforward application of the structural transformation can destroy Hornness. To remedy this, we modify the structural transformation to replace each C|p with an α chosen such that clausifying α and C|p requires the same number of positive literals. In the previous example, this would mean that ∀R.D1 should be replaced with Q1, but ∀R.¬D2 should be replaced with ¬Q2, yielding the concepts ¬Q1 ⊔ ∀R.D1, Q2 ⊔ ∀R.¬D2, and Q1 ⊔ ¬Q2, which are all Horn. We formalize these considerations in the following definition. Intuitively, to produce a clausal form of ∀x : πy(C, x), each nonatomic subconcept at a position p in C is replaced with α or ¬α, depending on the polarity of C|p in C, where α requires the same number of positive literals as C|p.

Definition 8.2.2. A Horn-compatible structural transformation is identical to the one from Definition 5.3.1, with the following difference in the definition of Def(C):

• If Λ(C) = ∅, then Def(C) = {C};

• Otherwise, choose p ∈ Λ(C) and let Def(C) be as follows, where α = Q if pl(C, p) > 0, and α = ¬Q if pl(C, p) = 0, for Q a new concept and ¬(¬Q) = Q:

Def(C) = {¬α ⊔ C|p} ∪ Def(C[α]p)        if p ∈ Λ(C) and pol(C, p) = 1
Def(C) = {¬α ⊔ (¬C)|p} ∪ Def(C[¬α]p)    if p ∈ Λ(C) and pol(C, p) = −1

In the rest of this section, we assume that Ξ(KB) is computed using the Horn-compatible structural transformation. It is obvious that Lemma 5.3.2 holds in this case as well; that is, KB and Ξ(KB) are D-equisatisfiable. The only difference to Definition 5.3.1 is that a concept at a position p in C can be replaced with ¬Q instead of with Q, depending on pol(C, p) and pl(C, p). This obviously does not affect the correctness of the transformation. We now show that applying the Horn-compatible structural transformation to Horn concepts produces Horn clauses.

Lemma 8.2.3. For a Horn-SHIQ(D) knowledge base KB, each closure from Ξ(KB) contains at most one positive literal.

Proof. We first show the following property (*): for a Horn concept C, all concepts in Def(C) are Horn concepts. The proof is by induction on the application of the operator Def. The induction base for Λ(C) = ∅ is obvious. Consider an application of Def(C), where C is a Horn concept and p is a position in C such that C|p is not a literal concept, and for each position q below p, C|q is a literal concept. Observe that, in all cases, we have pl+(α) = pl(C, p) and pl+(¬α) = 1 − pl(C, p). Depending on pol(C, p), we consider two cases:

• For pol(C, p) = 1, we have pl+(¬α ⊔ C|p) = pl+(¬α) + pl+(C|p) = pl+(¬α) + pl(C, p) = 1. Furthermore, since pl(C, p) = pl(C[α]p, p), C[α]p is a Horn concept.

• For pol(C, p) = −1, we have pl+(¬α ⊔ (¬C)|p) = pl+(¬α) + pl−(C|p) = pl+(¬α) + pl(C, p) = 1. Furthermore, since pl(C, p) = pl(C[¬α]p, p), C[¬α]p is a Horn concept.

Hence, each application of the operator Def decomposes a Horn concept C into two simpler Horn concepts, so (*) holds. Furthermore, each immediate subconcept of C|p or (¬C)|p is a literal concept. For D ∈ Def(C), by definition of π it is easy to see that pl+(D) gives the maximal number of positive literals occurring in a closure from Cls(∀x : πy(D, x)). Thus, if C is a Horn concept, all closures from Cls(C) have at most one positive literal. All closures obtained by translating RBox and ABox axioms of Ω(KB) obviously contain at most one positive literal, so the claim of the lemma follows.

We now show that BS_DL^{D,+}, when applied to Horn premises, produces only Horn conclusions.

Lemma 8.2.4. If all premises of an inference by BS_DL^{D,+} contain at most one positive literal, the inference conclusions contain at most one positive literal as well.

Proof. In ordered hyperresolution and positive or negative superposition, each side premise participates in an inference on the positive literal, which does not occur in the conclusion. Hence, the number of positive literals in the conclusion is equal to the number of positive literals in the main premise. Furthermore, reflexivity resolution only reduces the number of negative literals in a closure, and equality factoring is never applicable to a closure with only one positive literal. All side premises of a concrete domain resolution inference participate in the inference on the positive literals, so the conclusion contains only negative literals. Finally, a closure participating in a decomposition inference contains the single positive literal R(t, f(t)), so both resulting closures have exactly one positive literal.

The following corollary is a direct consequence of Lemmas 8.2.3 and 8.2.4:

Corollary 8.2.5. If KB is a Horn ALCHIQ(D) knowledge base, then DD(KB) is a Horn datalog program.

We are now ready to show the main result of this section.

Lemma 8.2.6 (Membership). For an extensionally reduced Horn-SHIQ(D) knowledge base KB, the data complexity of checking D-satisfiability of KB is in P, assuming a bound on the arity of concrete predicates and that D-satisfiability of finite conjunctions over ΦD can be decided in polynomial time.

Proof. Since KB is D-satisfiable if and only if DD(KB) is D-satisfiable, the data complexity of checking D-satisfiability of KB is bounded by the data complexity of checking D-satisfiability of the Horn program DD(KB). The program DD(KB) can be c-factored in time linear in |KB A|, so it suffices to show that the result is d-satisfiable. Satisfiability of datalog programs without concrete predicates can be decided using bottom-up fixpoint saturation in polynomial time [145, 78, 35]. If DD(KB) contains concrete predicates, d-satisfiability can be decided by first saturating DD(KB) (which can also be performed in time polynomial in the number of facts), followed by checking the D-satisfiability of the derived facts with concrete literals (which can be performed in polynomial time by the assumption). As in Lemma 8.1.1, |DD(KB)| is polynomial in |KB A|, so the claim of this lemma follows.

Lemma 8.2.7 (Hardness). For a Horn ALC knowledge base KB, instance checking w.r.t. KB is P-hard in |KB A|.

Proof. The proof is by a reduction from the well-known Boolean circuit value problem [104]. A Boolean circuit C is a graph (G, δ, E) defined as follows:

• G = {γ1, ..., γn} is the set of nodes, also called gates;

• δ : G → {T, F, ∧, ∨, ¬} is a function assigning a label to each gate;

• E ⊆ G × G is the set of edges such that (i) E is acyclic—that is, i < j for each (γi, γj) ∈ E; and (ii) the in-degree of gates labeled with T or F is zero, of gates labeled with ¬ is one, and of gates labeled with ∧ or ∨ is two.

A valuation µ : G → {T,F } over gates of C is defined inductively as follows (since E is acyclic, the induction is well-founded):

• If δ(γ) ∈ {T,F }, then µ(γ) = δ(γ);

• If δ(γ) = ¬ and (γ1, γ) ∈ E, then µ(γ) = T if and only if µ(γ1) = F ;

• If δ(γ) = ∧,(γ1, γ) ∈ E, and (γ2, γ) ∈ E, then µ(γ) = T if and only if µ(γ1) = T and µ(γ2) = T ;

• If δ(γ) = ∨,(γ1, γ) ∈ E, and (γ2, γ) ∈ E, then µ(γ) = T if and only if µ(γ1) = T or µ(γ2) = T .

The value of C is defined as µ(C) = µ(γn). For an arbitrary circuit C, checking if µ(C) = T is P-complete [104].

For a circuit C, we construct the knowledge base KB C in which each gate γi corresponds to the individual γi. We convert the graph structure of C into ABox assertions as follows:

• If δ(γ) = ¬ and (γ1, γ) ∈ E, we add the ABox assertion not(γ, γ1);

• If δ(γ) = ∧,(γ1, γ) ∈ E, and (γ2, γ) ∈ E, we add the ABox assertions and 1(γ, γ1) and and 2(γ, γ2);

• If δ(γ) = ∨, (γ1, γ) ∈ E, and (γ2, γ) ∈ E, we add the ABox assertions or 1(γ, γ1) and or 2(γ, γ2);

• If δ(γ) = T , we add the ABox assertion T (γ);

• If δ(γ) = F , we add the ABox assertion F (γ).

Furthermore, the TBox of KB C contains the following axioms:

∃not.T ⊑ F
∃not.F ⊑ T
T ⊓ F ⊑ ⊥
∃and 1.T ⊓ ∃and 2.T ⊑ T
∃and 1.T ⊓ ∃and 2.F ⊑ F
∃and 1.F ⊓ ∃and 2.T ⊑ F
∃and 1.F ⊓ ∃and 2.F ⊑ F
∃or 1.T ⊓ ∃or 2.T ⊑ T
∃or 1.T ⊓ ∃or 2.F ⊑ T
∃or 1.F ⊓ ∃or 2.T ⊑ T
∃or 1.F ⊓ ∃or 2.F ⊑ F

It is now easy to see that µ(C) = T if and only if KB C |= T (γn). Namely, the TBox axioms ensure that concept membership is propagated through the circuit according to the standard semantics of propositional connectives. Hence, for each gate γ,2 µ(γ) = T if and only if KB C |= T (γ), and µ(γ) = F if and only if KB C |= F (γ). Now the claim of the lemma follows because the size of the TBox of KB C is constant, the size of the ABox of KB C is linear in the size of C, and KB C is a Horn knowledge base.
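The construction of KB C is easily mechanized; the following Python sketch assumes a circuit is given as a list of gates that refer to earlier gates by index, emits the ABox assertions described above, and also evaluates µ directly for comparison.

```python
# Sketch of the reduction from the circuit value problem: the ABox encodes
# the wiring of the circuit; the TBox is the fixed one listed above.
# A gate is ("T",), ("F",), ("not", i), ("and", i, j), or ("or", i, j).

def abox_of(circuit):
    abox = []
    for k, gate in enumerate(circuit):
        g = "g%d" % k
        if gate[0] in ("T", "F"):
            abox.append("%s(%s)" % (gate[0], g))
        elif gate[0] == "not":
            abox.append("not(%s, g%d)" % (g, gate[1]))
        else:                                    # "and" / "or"
            abox.append("%s_1(%s, g%d)" % (gate[0], g, gate[1]))
            abox.append("%s_2(%s, g%d)" % (gate[0], g, gate[2]))
    return abox

def value(circuit):
    """Direct evaluation of µ, for comparison with KB_C |= T(γ_n)."""
    mu = []
    for gate in circuit:
        if gate[0] in ("T", "F"):
            mu.append(gate[0] == "T")
        elif gate[0] == "not":
            mu.append(not mu[gate[1]])
        elif gate[0] == "and":
            mu.append(mu[gate[1]] and mu[gate[2]])
        else:
            mu.append(mu[gate[1]] or mu[gate[2]])
    return mu[-1]

circuit = [("T",), ("F",), ("and", 0, 1), ("not", 2)]    # ¬(T ∧ F) = T
print(value(circuit), abox_of(circuit))
```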

The following theorem is an immediate consequence of these two lemmata.

Theorem 8.2.8. Let KB be an extensionally reduced Horn knowledge base expressed in any logic between ALC and SHIQ(D). Assuming a polynomial oracle for reasoning with a concrete domain and a bound on the arity of concrete domain predicates, the problems of checking KB (D-)(un)satisfiability, and checking if KB |=(D) α are data complete for P, for α of the form (¬)R(a, b) or of the form (¬)C(a) with C a Horn concept of bounded size.

8.3 Discussion

To better understand the results from the previous two sections, we contrast them with the well-known results for (disjunctive) datalog [35]. Since datalog has been successfully applied in practice, this analysis gives interesting insights into the practical applicability of description logics.

Note that the complexities of the datalog variants and of the corresponding SHIQ fragments seem to coincide. Without disjunctions, a SHIQ knowledge base and a datalog program always have at most one model, which can be computed in polynomial time. With disjunctions, several models are possible, and this must be dealt with using reasoning by case. Intuitively, one needs to guess a model, which increases the data complexity to NP.

The key difference between datalog and description logics is revealed by considering the effects that various parameters have on the complexity. For a datalog program P and a ground atom α, checking whether P |= α can be performed in time O(|P|^v), where v is the maximal number of distinct variables in a rule of P [146]. Namely, the problem can be solved by grounding P—that is, by replacing, in each rule of P, all variables with individuals from P in all possible ways. The size of the grounding is bounded by |P|^v, and propositional Horn logic is P-complete, which implies the O(|P|^v) complexity. In general, v is linear in |P|, so the size of the grounding is exponential; thus, the combined complexity of datalog coincides with the combined complexity of SHIQ. However, in practical applications v is usually small, so it makes sense to assume it is bounded; then, datalog actually exhibits polynomial behavior.

An intuitive analogous limitation for DLs might be to restrict the size of concepts in axioms. However, this is not an adequate limitation.
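The grounding step itself is straightforward; the following Python sketch, for plain (non-disjunctive) datalog with rules encoded as (head, body) pairs of (predicate, argument-tuple) atoms and variables written as upper-case strings, makes the |P|^v bound tangible.

```python
from itertools import product

# Sketch of datalog grounding: a rule with v distinct variables yields at
# most |constants|**v ground instances, which underlies the O(|P|^v) bound.

def ground_program(rules, constants):
    ground_rules = []
    for head, body in rules:
        variables = sorted({a for (_, args) in [head] + body for a in args
                            if a[:1].isupper()})
        for values in product(constants, repeat=len(variables)):
            theta = dict(zip(variables, values))
            inst = lambda atom: (atom[0], tuple(theta.get(a, a) for a in atom[1]))
            ground_rules.append((inst(head), [inst(b) for b in body]))
    return ground_rules

rules = [(("path", ("X", "Z")), [("edge", ("X", "Y")), ("path", ("Y", "Z"))])]
print(len(ground_program(rules, ["a", "b", "c"])))   # 3 variables: 3**3 = 27
```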

Namely, DLs are closely related to the two-variable fragment of first-order logic [21]: ALC concepts correspond to first-order formulae with only two variables, regardless of nesting. By limiting the numbers occurring in number restrictions to n, the number of variables in SHIQ axioms is limited to n + 2. Axioms with complex concepts can be polynomially reduced to axioms with atomic concepts using the structural transformation.

We summarize our observations as follows: assuming a bound on the axiom length, but not on the number of axioms, satisfiability checking in datalog is (nondeterministically) polynomial, but in DLs it is exponential. This is so because DLs such as ALC provide the existential quantifier and general inclusion axioms, which can be used to succinctly encode models with paths of exponential length. The saturation step eliminates the function symbols, but it also incurs an exponential blowup in the program size to account for such paths. Hence, although the combined complexity of both datalog and DLs is exponential, the reasons for this are different.

In [4, Chapter 5], two sources of complexity in DLs have been identified: OR-branching, caused by the existence of numerous possible models, and AND-branching, caused by the existence of paths within a model. Our results show that OR-branching is not as difficult as AND-branching: the former increases the complexity to NP, whereas the latter increases the complexity to ExpTime.

2 Note that the structure of KB C ensures that each gate is a member of either the T or the F concept, even though the TBox does not contain an axiom ⊤ ⊑ T ⊔ F. Furthermore, including such an axiom into KB C would invalidate the proof, because this would make KB C a non-Horn knowledge base.

8.4 Related Work

Data complexity of reasoning in description logics has so far been rarely considered. The only work we are aware of is [126, 125], where the complexity bounds of instance checking for certain description logics were considered. In particular, it was shown that the combined complexity of instance checking for languages with a polynomial subsumption algorithm, such as AL and ALN, is also polynomial. For ALE it was shown that instance checking is co-NP-hard in the size of data; however, Π2^p is given as the upper bound. Furthermore, it was shown that the combined complexity of instance checking in ALE is PSpace-complete. Finally, for ALR, it was shown that the combined complexity of instance checking is NP-complete. It was pointed out that the additional source of complexity in ALE over ALR arises from qualified existential quantification.

Whereas the hardness proof for ALE provides a lower bound for the data complexity of satisfiability, the upper bound does not follow from [126], since none of the logics considered is as expressive as ALC or SHIQ(D). In particular, no considered logic allows for general inclusion axioms, which are known to have a significant impact on the complexity. Furthermore, the complexity of reasoning in [126] is measured in the size of an ABox that is allowed to contain complex concept expressions. Our work addresses a logic with general inclusion axioms, but restricts the ABox to contain only literal concepts. Hence, our results measure the complexity in the number of simple facts, without any terminological knowledge in the ABox.

Part III

Extensions


Chapter 9

Integrating Description Logics with Rules

Although SHIQ(D) is very expressive, it is a decidable fragment of first-order logic, and thus cannot express arbitrary axioms. In fact, it can express only axioms of a certain tree-structure [57]. Decidable rule-based formalisms, such as function-free Horn rules,1 do not have this restriction, but lack some of the expressive power of SHIQ(D), since they are restricted to universal quantification and, in their basic form, provide no negation. To overcome the limitations of both approaches, description logics were extended with rules [69], but this extension is undecidable [69]. Intuitively, the undecidability arises because adding rules to SHIQ(D) causes the loss of any form of tree model property [147]. In a logic with such a property, every satisfiable knowledge base has a model of a certain tree-shaped form, so, to decide satisfiability, it suffices to search only for such a model. For most DLs, it is possible to ensure termination of such a search. It is natural to ask what kind of rules can be added to SHIQ(D) while preserving decidability. This follows a classic line of research in knowledge representation of investigating the trade-off between expressivity and complexity, and providing different formalisms with varying expressive power and complexity. This not only provides for better understanding of the causes for the undecidability of the full combination, but also enables a more detailed analysis of the complexity and, ultimately, the design of specialized reasoning procedures. Applications that do not require the expressive power of the full combination can use such procedures, and rely on the known upper time and space bounds required to return a correct answer. Finally, in the last decade, it turned out that these specialized decision procedures are amenable to optimizations, thus achieving surprisingly good performance in practice even for logics with high worst-case complexity [4, Chapter 9]. In this chapter, we present a decidable combination of SHIQ(D) with rules, where decidability is achieved by requiring the rules to be DL-safe: concepts (roles) are

1Throughout this chapter, we use “rules” and “clauses” as synonyms, following [69].


Table 9.1: Example Knowledge Base

Person(Peter)                              Peter is a person.
Person ⊑ ∃father.Person                    Each person has a father who is a person.
∃father.(∃father.Person) ⊑ Grandchild      Things having a father of a father who is a person are grandchildren.

allowed to occur in both rule bodies and heads as unary (binary) predicates in atoms, but each variable in a rule is required to occur in a body literal whose predicate is neither a concept nor a role. Intuitively, this restriction makes the logic decidable because the rules are applicable only to individuals explicitly introduced in the ABox. Our formalism is a generalization of existing hybrid approaches presented in [84, 39], and a special case of approaches presented in [123, 69]. We discuss the expressive power and the limitations of our approach in a nontrivial example. Moreover, we show that query answering with DL-safe rules can be performed by simply appending the rules to a program obtained by the reduction from Chapter 7.
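As a small illustration of this restriction, the following Python sketch checks the DL-safety condition on a single rule over the vocabulary of Table 9.1; the atom encoding and the auxiliary predicate O, used only to show how a rule can be made DL-safe, are ours.

```python
# Sketch of the DL-safety check: every variable of the rule must occur in
# a body atom whose predicate is neither a concept nor a role of KB.

def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def dl_safe(head, body, dl_predicates):
    rule_vars = {a for (_, args) in head + body for a in args if is_var(a)}
    covered = {a for (pred, args) in body if pred not in dl_predicates
               for a in args if is_var(a)}
    return rule_vars <= covered

dl_preds = {"Person", "father", "Grandchild", "≈"}

# Grandchild(X) ← father(X, Y), father(Y, Z), Person(Z)   -- not DL-safe
print(dl_safe([("Grandchild", ("X",))],
              [("father", ("X", "Y")), ("father", ("Y", "Z")), ("Person", ("Z",))],
              dl_preds))                                   # False

# adding O(X), O(Y), O(Z) for a non-DL predicate O makes the rule DL-safe
print(dl_safe([("Grandchild", ("X",))],
              [("father", ("X", "Y")), ("father", ("Y", "Z")), ("Person", ("Z",)),
               ("O", ("X",)), ("O", ("Y",)), ("O", ("Z",))],
              dl_preds))                                   # True
```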

9.1 Reasons for Undecidability of SHIQ(D) with Rules

An approach for integrating an OWL-DL knowledge base KB with a datalog program P was presented in [69]. The integration is achieved by assuming NC ∪ NRa ∪ NRc ⊆ NP, and by interpreting the resulting hybrid knowledge base under standard first-order semantics (that is, the rules in P are interpreted as clauses). In other words, concepts and roles can be used in rules as unary and binary predicates, respectively. It was shown that checking satisfiability of such a hybrid knowledge base is undecidable; as a consequence, subsumption and query answering w.r.t. hybrid knowledge bases are also undecidable. Investigating the proofs from [69] and [84] more closely, we note that the undecidability is caused by the interaction between some very basic features of description logics and rules. In this section, we try to give an intuitive explanation of this result and its consequences.

Consider the simple knowledge base KB from Table 9.1. This knowledge base implies the existence of an infinite chain of fathers: since Peter must have a father, there is some x1 who is a Person. In turn, x1 must have some father x2, who must be a Person, and so on. An infinite model with such a chain is shown in Figure 9.1, upper part (a). Observe that Peter is a grandchild, since he has a father of a father, who is a person. Let us now check whether KB |= Grandchild(Jane); this is the case if and only if KB ∪ {¬Grandchild(Jane)} is unsatisfiable. We can perform this task by trying to build a model; if we fail, then we conclude that KB ∪ {¬Grandchild(Jane)} is unsatisfiable. However, we have a problem: starting from Peter, a naïve approach will expand the chain of Peter's fathers indefinitely, and will therefore not terminate.

This very simple example intuitively shows that we have to be careful if we want to ensure termination of a model building algorithm. For many DLs, termination can be ensured without losing completeness because we can restrict our attention to certain "nice" models. For numerous DLs, we can consider only tree models—that is, models in which the underlying relational structure forms a tree [147]. This is so because every satisfiable knowledge base has such a tree model (to be precise, for certain logics we consider tree-like abstractions of possibly nontree models, but this distinction is not relevant here). Even if such a tree model is infinite, we can wind this infinite tree model into a finite one. In our example, since KB does not require each father in the chain to be distinct (that is, there is no axiom requiring the role father to be acyclic), the model in Figure 9.1, lower part (b), is the result of winding an infinite tree into a "nice," finite model. Due to their regular structure, such windings of tree models can be easily constructed in an automated way. To understand why every satisfiable SHIQ(D) knowledge base has a tree model [73], consider the mapping π from Tables 3.1 and 3.4 more closely (we ignore some technicalities caused by transitive roles): in all formulae obtained by transforming the result of π into prenex normal form, variables are connected by roles only in a tree-like manner, as shown in the following example:

∃S.(∃R.C ⊓ ∃R.D) ⊑ Q
⇒ ∀x : {[∃y : S(x, y) ∧ (∃x : R(y, x) ∧ C(x)) ∧ (∃x : R(y, x) ∧ D(x))] → Q(x)}
⇒ ∀x, x1, x2, x3 : {S(x, x1) ∧ R(x1, x2) ∧ C(x2) ∧ R(x1, x3) ∧ D(x3) → Q(x)}

Let us contrast these observations with the kind of reasoning required for function-free Horn rules. In such rules, all variables are universally quantified; that is, there are no existentially quantified variables in rule consequents. Hence, we never have to infer the existence of objects not enumerated in the Herbrand universe. Thus, reasoning algorithms can consider only individuals that are explicitly introduced and are given a name in the knowledge base. Reasoning can be performed by grounding the rules—that is, by replacing the variables in the rules with all individuals from the knowledge base in all possible ways. Through grounding, first-order reasoning becomes propositional, since a ground rule is essentially equivalent to a propositional clause. For a finite program, the number of ground rules is also finite, and satisfiability of a set of propositional clauses is decidable. Hence, non-tree-like rules are allowed to enforce arbitrary, but finite, nontree models, and not only "nice" models.
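To make the grounding step concrete, the following minimal sketch (not from the thesis; the rule encoding, constants, and helper names are illustrative) grounds a function-free Horn rule over the constants of a small ABox. The rule corresponds to the third axiom of Table 9.1, read as Grandchild(x) ← father(x, y), father(y, z), Person(z); lowercase terms denote variables.

```python
from itertools import product

# An atom is (predicate, args); a rule is (head_atom, [body_atoms]).
rule = (("Grandchild", ("x",)),
        [("father", ("x", "y")), ("father", ("y", "z")), ("Person", ("z",))])
constants = ["Peter", "Paul"]   # the named individuals of a small ABox

def is_var(term):
    return term.islower()       # variables are lowercase, constants are capitalized

def instantiate(atom, subst):
    pred, args = atom
    return (pred, tuple(subst.get(a, a) for a in args))

def ground_rule(rule, constants):
    """Replace the rule's variables by named individuals in all possible ways."""
    head, body = rule
    variables = sorted({a for _, args in [head] + body for a in args if is_var(a)})
    for values in product(constants, repeat=len(variables)):
        subst = dict(zip(variables, values))
        # Each ground rule corresponds to the propositional clause head ∨ ¬body1 ∨ ... ∨ ¬bodym.
        yield (instantiate(head, subst), [instantiate(b, subst) for b in body])

for ground in ground_rule(rule, constants):
    print(ground)   # 2^3 = 8 propositional instances
```

Since only named individuals are ever substituted, no new objects are invented, which is exactly why this style of reasoning terminates.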

Figure 9.1: Two Similar Models

Now let us see what happens if we extend SHIQ(D) with function-free Horn rules. Then, we combine a logic that can be decided by restricting our attention to “nice” models (but with possibly infinitely many individuals whose existence is implied by a knowledge base) with a logic that can be decided by restricting our attention to known individuals (but with arbitrary relations between them). Unsurprisingly, this and similar combinations are undecidable [84, 69].

9.2 Combining Description Logics and Rules

We now formalize the interface between SHIQ(D) and rules. Definition 9.2.1 (DL-Rules). Let KB be a SHIQ(D) knowledge base. For s and t constants or variables, a DL-atom is an atom of the form A(s), where A is an atomic concept in KB, of the form R(s, t), where R is a simple (abstract or concrete) role in KB, or of the form s ≈ t. A non-DL-atom is an atom with a predicate other than ≈, an atomic concept in KB, or a role in KB. A (disjunctive) DL-rule is a (disjunctive) rule with DL- and non-DL-atoms in the head and the body. A (disjunctive) DL-program P is a set of (disjunctive) DL-rules. A combined knowledge base is a pair (KB, P). We define the translation operator π on rules to interpret them as first-order clauses (x is the vector of all free variables of the rule):

π(A1 ∨ ... ∨ An ← B1, ..., Bm) = ∀x : {A1 ∨ ... ∨ An ∨ ¬B1 ∨ ... ∨ ¬Bm}

For a DL-program P, we define π(P) = ⋀_{r∈P} π(r), and, for a combined knowledge base (KB, P), we define π((KB, P)) = π(KB) ∧ π(P). The main inferences for combined knowledge bases are defined as follows:
• Satisfiability checking: (KB, P) is satisfiable if and only if π((KB, P)) is D-satisfiable.
• Query answering: A ground atom α is an answer of (KB, P), denoted with (KB, P) |= α, if and only if π((KB, P)) |=D π(α).
A few remarks regarding Definition 9.2.1 are in order.

Minimal vs. First-order Models. Rules are usually interpreted under minimal model semantics. In contrast, the semantics of DL-rules is classical, in the sense that it considers arbitrary models. An in-depth comparison between the two types of semantics was presented in Subsection 4.8.2.

Transitive Roles. It is straightforward to extend Definition 9.2.1 to allow DL-atoms to contain complex roles. However, our algorithms deal with transitivity axioms by encoding a SHIQ(D) knowledge base KB into a D-equisatisfiable ALCHIQ(D) knowledge base Ω(KB). As mentioned in Section 5.2, this transformation does not preserve entailment of ground complex role atoms. Therefore, we prohibit the usage of complex roles in rules (such roles can still be used in KB).

(In)equality Literals. Without loss of generality, we allow only positive equality literals in DL-rules. Namely, DL-rules can be disjunctive and are interpreted under standard first-order semantics, so a negative equality literal in the body can be moved as a positive literal into the head. For example, the following rule defines Commuter as a person who lives and works at different places:

Commuter(x) ← livesAt(x, y), worksAt(x, y), x ≉ y

By standard properties of first-order logic, this rule is equivalent to the following one:

Commuter(x) ∨ x ≈ y ← livesAt(x, y), worksAt(x, y)

By allowing only positive equality literals in rules, we make Definition 9.2.1 consistent with the definition of disjunctive datalog from Section 2.7, which allows only positive atoms in rule bodies. Note also that positive equality literals in the rule body do not increase the expressivity of the formalism. Namely, for a first-order formula ϕ, a variable x, and a term t that does not contain x as a proper subterm, the formulae ∀x : (x ≈ t → ϕ) and ϕ{x ↦ t} are equisatisfiable [99]. However, positive equality literals in the rule heads do increase the expressivity, as they allow deriving two objects to be equal. In order not to make Definition 9.2.1 too technical, we do not distinguish these two cases.
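As a concrete illustration of this rewriting (a sketch of ours, not an implementation from the thesis; the atom encoding is made up), the following snippet moves negative equality literals from a rule body into the head, turning the Commuter rule above into its equality-free variant.

```python
# A disjunctive rule is (head_atoms, body_atoms); an atom is (predicate, args).
def shift_inequalities(rule):
    head, body = rule
    new_head, new_body = list(head), []
    for pred, args in body:
        if pred == "neq":                  # a body literal s ≉ t ...
            new_head.append(("eq", args))  # ... becomes the head literal s ≈ t
        else:
            new_body.append((pred, args))
    return (new_head, new_body)

commuter = ([("Commuter", ("x",))],
            [("livesAt", ("x", "y")), ("worksAt", ("x", "y")), ("neq", ("x", "y"))])
print(shift_inequalities(commuter))
# -> ([('Commuter', ('x',)), ('eq', ('x', 'y'))],
#     [('livesAt', ('x', 'y')), ('worksAt', ('x', 'y'))])
```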

Relationship with Existing Formalisms. Definition 9.2.1 yields a formalism com- patible with the ones from [69, 84, 123]. The main differences from [69] are that DL- rules can additionally contain non-DL-atoms; furthermore, DL-atoms cannot contain inequality, and they can contain only simple roles and atomic concepts. The latter is a technical assumption and is not really a restriction: for a complex concept C, one can always introduce a new atomic concept AC , add the axiom AC ≡ C to the TBox, and use AC in the rule. This transformation is obviously linear in the size of P .

Decidability. Since the formalism is compatible with [69], reasoning with combined knowledge bases is undecidable. To achieve decidability, we introduce the notion of DL-safety in the following section.

9.3 DL-Safety Restriction

A possible way to obtain a decidable logic is to require DL-rules to be DL-safe.

Definition 9.3.1 (DL-Safe Rules). A (disjunctive) DL-rule r is DL-safe if each variable occurring in r also occurs in a non-DL-atom in the body of r. A (disjunctive) DL-program P is DL-safe if all its rules are DL-safe.

DL-safety is similar to safety in datalog. In a safe rule, each variable occurs in a positive atom in the body, and can therefore be bound only to constants explicitly present in the database. Similarly, DL-safety ensures that each variable is bound only to individuals explicitly introduced in the ABox. For example, if Person, livesAt, and worksAt are concepts and roles from KB, the following rule is not DL-safe, because both x and y occur in DL-atoms, but not in an atom with a predicate outside of KB:

Homeworker(x) ← Person(x), livesAt(x, y), worksAt(x, y)

The previous rule can be made DL-safe by adding special non-DL-atoms O(x) and O(y) to the body of the rule, and by adding a fact O(a) for each individual a occurring in KB and P. Thus, the previous rule is transformed as follows:

Homeworker(x) ← Person(x), livesAt(x, y), worksAt(x, y), O(x), O(y)
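The transformation just described is mechanical, as the following sketch shows (illustrative only, not the thesis implementation; the atom encoding and helper names are ours). Given a rule and the concept and role names of KB, it adds O(v) for every variable not yet covered by a non-DL body atom and emits one fact O(a) per named individual.

```python
# A rule is (head_atom, body_atoms); an atom is (predicate, args); variables are lowercase.
def make_dl_safe(rule, dl_predicates, named_individuals):
    head, body = rule
    def variables(atoms):
        return {a for _, args in atoms for a in args if a.islower()}
    # Variables already occurring in a non-DL body atom need no additional O literal.
    covered = variables([atom for atom in body if atom[0] not in dl_predicates])
    missing = sorted(variables([head] + body) - covered)
    safe_body = body + [("O", (v,)) for v in missing]
    o_facts = [("O", (a,)) for a in named_individuals]   # one fact O(a) per ABox individual
    return (head, safe_body), o_facts

rule = (("Homeworker", ("x",)),
        [("Person", ("x",)), ("livesAt", ("x", "y")), ("worksAt", ("x", "y"))])
print(make_dl_safe(rule, {"Person", "livesAt", "worksAt"}, ["Peter"]))
```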

9.4 Expressivity of DL-Safe Rules

To achieve decidability, we do not restrict the component languages; rather, we combine full SHIQ(D) with function-free Horn rules, and thus extend both formalisms. However, DL-safety restricts the interaction between the component languages to involve only individuals explicitly introduced in the ABox. To illustrate the expressive power of DL-safe rules, consider the combined knowledge base (KB, P), containing DL axioms and rules from Table 9.2. We use a rule to define the only non-DL-predicate BadChild as a grandchild who hates some of his siblings (or himself). Note that this rule involves relations forming a triangle between two siblings and a parent, and thus cannot be expressed in SHIQ(D); furthermore, the rule is not DL-safe. Now consider the first group of ABox facts. Since Cain is a Person, as shown in Section 9.1, one can infer that Cain is a Grandchild. Since Cain and Abel are children of Adam, and Cain hates Abel, we derive that Cain is a BadChild. Similarly, since Romulus and Remus are persons, they have a father. Due to the second rule, the father of Romulus and Remus must be the same. Now Romulus hates Remus, so Romulus is a BadChild as well. We are able to derive this without knowing exactly who the father of Romulus and Remus is.2 Consider now the DL-safe rule defining BadChild′: since the father of Cain and Abel is known by name—that is, Adam is in the ABox—the literal O(y) from the rule for BadChild′ can be matched to O(Adam), allowing us to conclude that Cain is a BadChild′. In contrast, the father of Romulus and Remus is not known in the ABox. Hence, in the DL-safe version of the second rule, O(x) and O(y) cannot be matched to the father's name, so the rule does not derive that the fathers of Romulus and Remus are the same. Similarly, in the rule defining BadChild′, the literal O(y) cannot be matched to the father's name, so we cannot derive that Romulus is a BadChild′. This may seem confusing. However, DL-safe rules do have a natural reading: just append the phrase "where the identity of all objects is known" to the intuitive meaning

2Actually, the father of Romulus and Remus is the god Mars, but we assume that this is not known to the modeler of KB. Still, the father's existence is certain, which is reflected by KB.

Table 9.2: Example with DL-Safe Rules

Person ⊑ ∃father.Person   Each person has a father who is a person.
∃father.(∃father.Person) ⊑ Grandchild   Things having a father of a father who is a person are grandchildren.
father ⊑ parent   Fatherhood is a kind of parenthood.
BadChild(x) ← Grandchild(x), parent(x, y), parent(z, y), hates(x, z)   A bad child is a grandchild who hates one of his siblings.
BadChild′(x) ← Grandchild(x), parent(x, y), parent(z, y), hates(x, z), O(x), O(y), O(z)   DL-safe version of a bad child.

Person(Cain)   Cain is a person.
father(Cain, Adam)   Cain's father is Adam.
father(Abel, Adam)   Abel's father is Adam.
hates(Cain, Abel)   Cain hates Abel.

Person(Romulus)   Romulus is a person.
Person(Remus)   Remus is a person.
x ≈ y ← father(Romulus, x), father(Remus, y)   Romulus and Remus have the same father.
x ≈ y ← father(Romulus, x), father(Remus, y), O(x), O(y)   DL-safe version.
hates(Romulus, Remus)   Romulus hates Remus.

Child(x) ← GoodChild(x), O(x)   Good children are children.
Child(x) ← BadChild′(x), O(x)   Bad children are children.
(GoodChild ⊔ BadChild′)(Oedipus)   Oedipus is a good or a bad child.

O(α) for each explicitly named individual α   Enumeration of all ABox individuals.

of the rule. For example, the rule defining BadChild′ should be read as "A BadChild′ is a known grandchild whose parent is known, and who hates one of his known siblings." Combining description logics with DL-safe rules increases the expressivity of both components. Namely, a SHIQ(D) knowledge base cannot imply that Cain is a BadChild′, because the rule expressing a triangular relationship cannot be expressed in SHIQ(D). Similarly, function-free Horn rules cannot imply this either: we know that Cain has a grandfather because Cain is a person, but we do not know who he is. Hence, we need the existential quantifier to infer the existence of ancestors, and then to infer that Cain is a Grandchild. Note that it is incorrect to compute all consequences of the DL component first, and then to apply the rules to these consequences. Consider the KB part about Oedipus: he is a GoodChild or a BadChild′, but we do not know exactly what is true. Either way, one of the rules derives that Oedipus is a Child, so (KB, P) |= Child(Oedipus). This would not be derived by applying the rules for Child to the consequences of KB, since KB ⊭ GoodChild(Oedipus) and KB ⊭ BadChild′(Oedipus).
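The case analysis behind this entailment can be checked mechanically. The following toy snippet (our illustration, not the thesis algorithm; the O atoms are omitted because Oedipus is a named individual, so they are trivially satisfied) grounds the two Child rules to Oedipus, enumerates all models of the resulting propositional clauses, and verifies that Child holds in every one of them.

```python
from itertools import product

# Ground clauses over the atoms GoodChild(Oedipus), BadChild'(Oedipus), Child(Oedipus):
#   GoodChild ∨ BadChild'     (the disjunctive ABox fact)
#   Child ∨ ¬GoodChild        (from Child(x) ← GoodChild(x), O(x))
#   Child ∨ ¬BadChild'        (from Child(x) ← BadChild'(x), O(x))
atoms = ["GoodChild", "BadChild'", "Child"]
clauses = [{"GoodChild": True, "BadChild'": True},
           {"Child": True, "GoodChild": False},
           {"Child": True, "BadChild'": False}]

def satisfied(assignment, clause):
    return any(assignment[atom] == polarity for atom, polarity in clause.items())

models = []
for values in product([False, True], repeat=len(atoms)):
    assignment = dict(zip(atoms, values))
    if all(satisfied(assignment, clause) for clause in clauses):
        models.append(assignment)

print(all(model["Child"] for model in models))   # True: Child(Oedipus) holds in every model
```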

9.5 Query Answering for DL-Safe Rules

We now show that reasoning in (KB, P) can be performed by simply appending P to DD(KB). To do that, we extend several lemmata used in the proof of Theorem 7.4.2. For a SHIQ(D) knowledge base KB, the first step in query answering is to eliminate transitivity axioms by encoding KB into a D-equisatisfiable ALCHIQ(D) knowledge base Ω(KB), as explained in Section 5.2. Observe that Definition 9.3.1 allows only simple roles to occur in DL-atoms, and that, as discussed in Section 5.2, this encoding preserves entailment of such atoms. Hence, for P a DL-safe program, (KB, P) and (Ω(KB), P) are equisatisfiable, so in the rest of this section we assume without loss of generality that KB is an ALCHIQ(D) knowledge base. Let P^c be a c-factor of π(P). Without loss of generality, we can assume that P^c is computed by first applying c-factoring to all negative DL-atoms, and then to non-DL-atoms in a rule. Thus, for a clause C from P^c, negative DL-atoms containing a concrete role occur in C only in disjunctions of the form ¬T(x, z^c) ∨ z^c ≉D y^c, where y^c occurs in a non-DL-atom because of DL-safety. Furthermore, in all positive DL-atoms T(x, y^c) of C, y^c also occurs in a negative non-DL-atom because of DL-safety. By Lemma 6.1.6, D-satisfiability of Ξ(KB) ∪ P coincides with d-satisfiability of Ξ(KB) ∪ P^c, and the latter can be decided by BS^{D,+}_{DL}. To obtain a decision procedure, we extend the selection function of BS^{D,+}_{DL} as follows: if a closure contains negative non-DL-atoms, then all such atoms, and nothing else, are selected; otherwise, if there are no negative non-DL-atoms, the selection function is as in Definition 5.3.3—that is, it selects all negative binary literals. We define the extended ALCHIQ(D)-closures to include the closure types from Table 5.2 without conditions (iii)–(vi), the closure types from Table 6.2, and the closures corresponding to c-factors of DL-safe rules. Furthermore, closures of type 8 are allowed to contain positive ground non-DL-atoms and negative non-DL-atoms whose arguments are either variables or constants.

Lemma 9.5.1. For an ALCHIQ(D) knowledge base KB, saturation by BS^{D,+}_{DL} decides d-satisfiability of Ξ(KB) ∪ P^c.

Proof. All closures from Ξ(KB) ∪ P^c are obviously extended ALCHIQ(D)-closures. To prove the lemma, it is sufficient to show the following property (*): in a BS^{D,+}_{DL} derivation Ξ(KB) ∪ P^c = N0, ..., Ni ∪ {C}, the closure C, derived from premises in Ni, is either an extended ALCHIQ(D)-closure, or it is redundant in Ni. All ground non-DL-atoms in Ξ(KB) ∪ P^c contain constants. Hence, a superposition into a ground non-DL-atom is possible only from a literal ⟨a⟩ ≈ ⟨b⟩, and in the superposition conclusion all non-DL-atoms contain constants. Consider an inference with a rule r. Since r is DL-safe, it can participate only in a hyperresolution inference with side premises of type 8 on non-DL-literals. Furthermore, r is safe, so, since all ground non-DL-atoms contain constants, hyperresolution binds all variables in a rule to constants. The only exceptions are variables z^c in literals of the form ¬T(x, z^c) with T a concrete role. However, due to c-factoring and DL-safety, such literals occur

in rules always as part of a disjunction ¬T(x, z^c) ∨ z^c ≉D y^c, where x and y^c occur in non-DL-atoms, so in the hyperresolution conclusion these literals have the form ¬T(a, z^c) ∨ z^c ≉D b^c. Obviously, the conclusion is a closure of type 8. Furthermore, closures of type 8 can participate in BS^{D,+}_{DL} inferences with other closures in exactly the same way as in Lemma 5.3.6, Lemma 6.2.1, and Theorem 5.4.8, so the property (*) holds. Since BS^{D,+}_{DL} is a sound and complete calculus for checking d-satisfiability, the claim of the lemma follows.

The next step is to show that rules can simply be appended to the function-free version of KB.

Lemma 9.5.2. (KB, P) is unsatisfiable if and only if FF(KB) ∪ P^c is d-unsatisfiable.

Proof. By Theorem 6.1.20, d-satisfiability of FF(KB) ∪ P^c can be decided by a BS^{D,+}_{DL} saturation, in which we can choose to perform all nonground inferences before all ground inferences. Since all non-DL-atoms in rules from P^c are selected and each rule by the DL-safety requirement must contain at least one such atom, the rules cannot participate in an inference with nonground closures from Ξ(KB) ∪ P^c. Hence, as in Lemma 7.2.1, Ξ(KB) ∪ P^c is d-satisfiable if and only if Γ = Sat_R(Γ_TRg) ∪ Ξ(KB_A) ∪ P^c is d-satisfiable. It is now straightforward to extend Lemma 7.2.4 to show that Γ is d-unsatisfiable if and only if FF(KB) ∪ P^c is d-unsatisfiable. For both directions of the proof, a hyperresolution with a rule r in one closure set can directly be simulated by a hyperresolution with r in the other closure set.

We now state the main result of this section:

Theorem 9.5.3. Let KB be an ALCHIQ(D) knowledge base, and P a DL-safe disjunctive datalog program. Then, (KB, P) is unsatisfiable if and only if DD(KB) ∪ P is D-unsatisfiable. Furthermore, (KB, P) |= α if and only if DD(KB) ∪ P |=c α, where α is a ground DL- or non-DL-atom.

Proof. Because FF(KB) ∪ P is D-satisfiable if and only if FF(KB) ∪ P^c is d-satisfiable, the first claim follows from Lemma 9.5.2. For the second claim, observe that (KB, P) |= α if and only if (KB ∪ {¬α}, P) is unsatisfiable. The latter is the case if and only if FF(KB ∪ {¬α}) ∪ P = FF(KB) ∪ P ∪ {¬α} is D-unsatisfiable, which is the case if and only if DD(KB) ∪ P |=c α. Query answering in DD(KB) ∪ P can be performed using the algorithm from Section 7.6, for which we now determine the complexity. As explained in Section 7.5, P≈ is the axiomatization of the equality semantics for predicates occurring in a program P.

Theorem 9.5.4. Let KB be an ALCHIQ(D) knowledge base, defined over a concrete domain D such that D-satisfiability of finite conjunctions over ΦD can be decided in deterministic exponential time. Also, let P be a DL-safe program, Γ′ = DD(KB) ∪ P, and Γ = Γ′ ∪ Γ′≈. Finally, let numbers be coded in unary, and the arity of concrete predicates be bounded. Then, computing all answers to a nonground query in (KB, P) can be done by saturating a c-factor of Γ by R^D_Q in time exponential in |KB| + |P| assuming a bound on the arity of predicates in P, and in time doubly exponential in |KB| + |P| otherwise.

Proof. The number of clauses in Γ′≈ is polynomial in |KB| + |P|. As done in Lemma 5.3.9, to determine the complexity of the algorithm, we compute the maximal number of clauses derivable in a saturation by R^D_Q. Let p denote the number of non-DL-predicates, r the maximal arity of a predicate, c the number of constants occurring in (KB, P), and b the maximal number of literals in a body of a rule from P. Under the theorem assumptions, p, c, and b are linear in |KB| + |P|. If r is not bounded, it is also linear in |KB| + |P|. The number of non-DL-literals occurring in a maximal ground clause is bounded by ℓ1 = 2pc^r (the factor 2 allows each literal to occur positively or negatively). If the predicate arity is bounded, then ℓ1 is polynomial, and if the predicate arity is unbounded, it is exponential in |KB| + |P|. By Lemma 5.3.9, we know that the maximal number of DL-atoms in a clause, denoted with ℓ2, is polynomial in |KB| assuming a bound on the arity of concrete domain predicates and for unary coding of numbers. Since each ground clause can contain an arbitrary subset of these literals, the maximal number of ground clauses derived by R^D_Q is bounded by k = 2^(ℓ1+ℓ2), which is exponential for bounded arity, and doubly exponential for unbounded arity of predicates in P. Each rule from Γ can participate in a hyperresolution inference with ground clauses in k^b ways, which is exponential in |P|. Furthermore, by Theorem 7.4.2 the number of rules t in Γ is exponential in |KB| + |P|. Hence, the number of hyperresolution inference steps is bounded by t·k^b = t · 2^(b·(ℓ1+ℓ2)). For bounded arity of predicates in P, this number is exponential, and doubly exponential otherwise. Hence, in the same way as in Lemma 6.2.5, the number of concrete domain inference steps is exponential for bounded arity of predicates in P, and doubly exponential otherwise, thus implying the claim of the theorem.

Note that most applications require predicates of small arity. Hence, the assumption that the arity of predicates is bounded is realistic in practice, thus giving a worst-case optimal algorithm.

9.6 Related Work

AL-log [39] is a hybrid formalism that combines an ALC knowledge base with datalog rules, where the latter can be constrained with unary atoms having ALC concepts as predicates in rule bodies. Query answering in AL-log is decided by a variant of constrained resolution combined with a tableau algorithm for ALC. The combined algorithm is shown to run in single nondeterministic exponential time. Because atoms with concept predicates can occur only as constraints in rule bodies, the rules are applicable only to explicitly named objects. Our restriction to DL-safe rules has the same effect. However, our approach is more general in the following ways: (i) it supports a more expressive description logic, (ii) it allows using both concepts and roles in DL-atoms, and (iii) DL-atoms can be used in rule heads as well. Furthermore, we present a query answering algorithm as an extension of deductive database techniques running in deterministic exponential time. A comprehensive study of combining datalog rules with description logics was presented in [84]. The logic considered is ALCNR, which, although less expressive than SHIQ, provides constructors characteristic of most DL languages. The results of the study can be summarized as follows: (i) answering conjunctive queries over ALCNR knowledge bases is decidable, (ii) query answering in the extension of ALCNR with nonrecursive datalog rules, where both concepts and roles can occur in rule bodies, is also decidable, because it can be reduced to computing a union of conjunctive query answers, (iii) if rules are recursive, query answering becomes undecidable, (iv) decidability can be regained by disallowing certain combinations of constructors in the logic, and (v) decidability can be regained by requiring rules to be role-safe, where at least one variable from each role literal must occur in a non-DL-atom. As in AL-log, query answering is decided using constrained resolution and a modified version of the tableau calculus. Apart from treating a more expressive logic, in our approach all variables in a rule must occur in at least one non-DL-atom, but concepts and roles are allowed to occur in rule heads. Hence, when compared to the variant (v), our approach is slightly less general in some, and slightly more general in other aspects. The Semantic Web Rule Language (SWRL) [69] combines OWL-DL with rules in which concept and role predicates are allowed to occur in the head and in the body without any restrictions. Hence, apart from technicalities such as allowing concept expressions to occur in rules, SWRL is compatible with DL-rules. As mentioned before, this combination is undecidable but, as pointed out by the authors, (incomplete) reasoning in such a logic can be performed using general first-order theorem provers. By restricting DL-safe rules to contain only DL-atoms and the special predicate O, we obtain a proper subset of SWRL, in which expressivity is traded for decidability. Also, we provide an optimal query answering algorithm for a fragment of SWRL. An approach for combining rules and description logics under a nonmonotonic semantics was presented in [123]. To achieve decidability, the author also employs DL-safety. Furthermore, rules are allowed to contain non-DL-atoms under negation-as-failure, which is interpreted under stable-model semantics. Finally, the unique name assumption was adopted in [123], but this was later dropped in [124].
Hence, for programs without negation-as-failure and for answering positive ground queries, this approach coincides with ours. Also, in [123], a reasoning algorithm was presented, which don't-know nondeterministically reduces satisfiability and query answering w.r.t. combined knowledge bases to satisfiability of a DL knowledge base. Due to a huge amount of guessing, such an algorithm is likely to be unsuitable for practice; in contrast, we present a deterministic reasoning algorithm that can be combined with existing optimization techniques of disjunctive deductive databases.

Another approach for combining answer set programming with description logics was presented in [44]. The interaction between the subsystems is enabled by exchanging only unit ground consequences between the two components. The set of derivable facts is obtained by fixpoint computation. In this approach, the two systems are not tightly integrated, since the interaction between the systems is performed only through the exchange of unit consequences, resulting in a semantics that is not compatible with the first-order semantics. In the example in Section 9.4, this approach cannot derive that Oedipus is a child from the fact that Oedipus is a GoodChild or a BadChild′. The approaches from [57] and [148] for reducing certain DL fragments to logic programming can easily be extended with rules, by simply appending the rules to the result of the transformation. However, the DL considered there does not support existential quantifiers, negation, or disjunction under positive polarity, so it is significantly less expressive than Horn-SHIQ(D). Hence, our approach is a proper extension.

Chapter 10
Answering Conjunctive Queries

Conjunctive queries were introduced in [32] as a formalism capable of expressing the class of selection–projection–join–renaming relational queries [1]. The vast majority of relational queries used in practice falls into this fragment, so a great deal of database research has been devoted to devising efficient algorithms for query answering and deciding query containment. Because conjunctive queries have been found useful in diverse practical applications, it is natural to use them as an expressive formalism for querying description logic knowledge bases. Algorithms for answering conjunctive queries over DL knowledge bases expressed in various DL variants were presented in [143, 28, 72]. A common approach used in these algorithms is to reduce the problem of query answering to standard DL reasoning problems, as this enables reusing existing DL reasoning systems and services. It is well known that complex encodings usually yield relatively poor performance in practice. Hence, in this chapter, we extend the algorithms from Chapters 5 and 6 to obtain an algorithm for answering conjunctive queries that works directly with the query and the knowledge base, without any encoding. In this way, we obtain an algorithm suitable for practical application. Furthermore, to the best of our knowledge, this is the first attempt to realize conjunctive query answering over description logic knowledge bases in the framework of resolution.

10.1 Definition of Conjunctive Queries

We first define conjunctive queries over ALCHIQ(D) knowledge bases.

Definition 10.1.1. Let KB be an ALCHIQ(D) knowledge base, and let x1, . . . , xn and y1, . . . , ym be sets of distinguished and nondistinguished variables, denoted with x and y, respectively. A conjunctive query over KB, denoted with Q(x, y), is a finite conjunction of DL-atoms of the form (¬)A(s) or R(s, t), for A an atomic concept, R an abstract role, and s and t individuals from KB or variables from x or y. The inference problems for conjunctive queries are defined as follows:


• Query answering: An answer of a query Q(x, y) w.r.t. KB is an assignment θ of individuals to distinguished variables such that π(KB) |=D ∃y : Q(xθ, y).

• Query containment: A query Q2(x, y2) is contained in a query Q1(x, y1) w.r.t. KB if π(KB) |=D ∀x :[∃y2 : Q2(x, y2) → ∃y1 : Q1(x, y1)].

A few remarks regarding Definition 10.1.1 are in order.

Negative Atoms. Negative concept atoms are usually not allowed to occur in con- junctive queries. However, including such atoms in Definition 10.1.1 makes the follow- ing presentation simpler, and does not change the formalism in any significant way.

Abstract Roles. The restriction to abstract roles in queries has been introduced in order not to require c-factoring of the query, which introduces problems for our algorithms. Note that concrete roles can still be used in the knowledge base.

Transitive Roles. It is straightforward to extend Definition 10.1.1 to SHIQ(D) knowledge bases; however, handling transitivity in queries seems to be quite difficult in the framework developed in previous chapters. Namely, our algorithms deal with tran- sitivity axioms by encoding a SHIQ(D) knowledge base KB into a D-equisatisfiable ALCHIQ(D) knowledge base Ω(KB). As mentioned in Section 5.2, this transfor- mation does not preserve entailment of ground complex role atoms. Therefore, we prohibit the usage of complex roles in conjunctive queries (note that such roles can still be used in KB). Under this assumption, π(KB) |=D ∃y : Q(xθ, y) if and only if π(Ω(KB)) |=D ∃y : Q(xθ, y), which allows us to assume in the rest of this chapter that KB is an ALCHIQ(D) knowledge base.

(In)equality DL-Atoms. Contrary to Definition 9.2.1, which defines DL-atoms for DL-rules, Definition 10.1.1 does not allow the (in)equality predicate to occur in DL-atoms of conjunctive queries. Namely, positive equality literals do not increase the expressivity of conjunctive queries: for a first-order formula ϕ, a variable x, and a term t that does not contain x as a proper subterm, the formulae ∃x : (ϕ ∧ x ≈ t) and ϕ{x ↦ t} are equisatisfiable [99]. Furthermore, based on unpublished results from [30], allowing negative equality literals to occur in a conjunctive query in all likelihood makes query answering undecidable.

Conjunctive Queries vs. DL-safe Rules. Querying DL knowledge bases might be performed using DL-safe rules, presented in Chapter 9; however, DL-safe rules are less expressive than conjunctive queries. To understand the practical implications of using either formalism, consider KB = {∃hasParent.Person(Peter)}, and a query retrieving "all objects having a parent." Note that KB states that Peter has a parent, but does not reveal his identity (that is, the name of the parent is not in the ABox).

Our question corresponds to the conjunctive query Q(x, y) = ∃y : hasParent(x, y). Obviously, KB |= ∃y : hasParent(Peter, y), so Peter is an answer to Q(x, y) in KB. Our question also corresponds to the DL-safe rule Q(x) ← hasParent(x, y), O(y); the literal O(y) is required in the rule body to ensure DL-safety. For P a program containing the rule, (KB, P) ⊭ Q(Peter): namely, the name of the parent is not known, and is therefore not contained in the predicate O. DL-safety requires y to be bound to a named individual, which is the same as making y a distinguished variable in the conjunctive query. Hence, querying by DL-safe rules is the same as using conjunctive queries without nondistinguished variables. In certain circumstances, it is possible to use DL-safe rules for querying without having to completely give up on nondistinguished variables. In [75], the authors present the query roll-up technique, by means of which certain tree-like queries with nondistinguished variables can be transformed to queries without nondistinguished variables. In the previous example, this amounts to adding an axiom ∃R.C ⊑ D to KB, and then answering the DL-safe query Q(x) ← D(x), O(x). As discussed in [75], query roll-up is easy for conjunctive queries where nondistinguished variables do not occur in cycles of the query graph (as defined in the following section). Hence, conjunctive queries are more expressive than DL-safe rules only if they contain cycles involving nondistinguished variables; furthermore, handling such cycles is the main difficulty that the algorithm presented in this chapter must address.

10.2 Answering Conjunctive Queries

Let KB be an ALCHIQ(D) knowledge base. For a conjunctive query Q(x, y), the assignment θ such that xθ = a is an answer to the query w.r.t. KB if and only if the set of closures Γ′ = Ξ(KB) ∪ {¬Q(a, y)} is unsatisfiable, where ¬Q(a, y) is a closure obtained by negating each conjunct of Q(a, y). In the following, we show how to decide satisfiability of Γ′ by basic superposition. For that purpose, we first define the notion of a query graph:

Definition 10.2.1. A query graph for a conjunctive query Q(a, y) is a directed graph with the following structure:
• Each variable y ∈ y is associated with a unique node;
• Each occurrence of a constant in Q(a, y) is associated with a unique node (that is, different occurrences of the same constant are associated with distinct nodes);
• For each literal (¬)A(s) ∈ Q(a, y), the node s is labeled with (¬)A;
• For each literal R(s, t) ∈ Q(a, y), the graph contains a directed arc labeled with R, pointing from s to t.

Each conjunctive query defines a distinct query graph, so we do not make an explicit distinction between the two. Hence, by saying that a query is connected, tree-like, or acyclic, we refer to the properties of the query graph.
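For intuition, the following small sketch (ours, not part of the thesis; the atom encoding is made up) builds the query graph of Definition 10.2.1 for a query of the form Q(a, y). Constants are capitalized and every occurrence of a constant receives its own node, while each nondistinguished variable is shared; standard graph algorithms can then be run on this structure to test the connectivity and acyclicity properties mentioned above.

```python
from collections import defaultdict

# An atom is (predicate, args); lowercase terms are nondistinguished variables.
def is_var(term):
    return term.islower()

def node(term, occurrence):
    # variables share one node; each occurrence of a constant gets a fresh node
    return term if is_var(term) else f"{term}@{occurrence}"

def query_graph(query):
    labels, arcs = defaultdict(list), []
    for i, (pred, args) in enumerate(query):
        if len(args) == 1:                       # (¬)A(s): label the node of s
            labels[node(args[0], (i, 0))].append(pred)
        else:                                    # R(s, t): directed arc from s to t
            s, t = args
            arcs.append((node(s, (i, 0)), pred, node(t, (i, 1))))
    return dict(labels), arcs

# Q(Peter, y) = hasParent(Peter, y) ∧ Person(y)
print(query_graph([("hasParent", ("Peter", "y")), ("Person", ("y",))]))
# -> ({'y': ['Person']}, [('Peter@(0, 0)', 'hasParent', 'y')])
```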

Without loss of generality, we can assume that a query graph is weakly connected; that is, for each two nodes s and t in the graph, there is a path either from s to t, or from t to s. Namely, assume that a query Q(a, y) can be split into n weakly connected, mutually disjoint subqueries Q1(a1, y1), ..., Qn(an, yn). Obviously, π(KB) |=D ⋀_{1≤i≤n} ∃yi : Qi(ai, yi) if and only if π(KB) |=D ∃yi : Qi(ai, yi) for all 1 ≤ i ≤ n. Splitting Q(a, y) into Qi(ai, yi) can be performed in time that is polynomial in |Q(a, y)|, so this assumption does not increase the complexity of reasoning. A slight problem arises if ¬Q(a, y) contains constants that do not occur at substitution positions. Consider the following closures:

a1 ≈ a1′ ∨ a2 ≈ a2′    (10.1)
¬Q1(a1, y1)    (10.2)
¬Q2(a2, y2)    (10.3)

Assuming that ai ∈ ai and ai′ ∈ ai for i ∈ {1, 2}, one can perform superposition from (10.1) into (10.2), and then into (10.3), which produces the following closure:

¬Q1(a1′, y1) ∨ ¬Q2(a2′, y2)    (10.4)

Now (10.4) contains more variables than either of the premises it was derived from, and therefore causes termination problems. We deal with this problem by applying the structural transformation to ¬Q(a, y). In particular, we replace Γ′ with Γ defined as follows, where, for each a ∈ a, Oa is a new predicate unique for a, xa is a new variable unique for a, and xa is the vector of variables obtained from a by replacing each a with xa:

Γ = Ξ(KB) ∪ {¬Q(xa, y) ∨ ⋁_{a∈a} ¬Oa(xa)} ∪ ⋃_{a∈a} {Oa(a)}

The sets of closures Γ′ and Γ are obviously equisatisfiable. In the rest of this section, we write ¬Oa(xa) for ⋁_{a∈a} ¬Oa(xa). This transformation solves the problem in the following way. By Definition 10.2.2, the literals ¬Oa(xa) are selected, so a closure (10.5) can participate only in resolution with closures of the form (10.6) to produce a closure of the form (10.7):

¬Q(xa, y) ∨ ⋁_{a∈a} ¬Oa(xa)    (10.5)
Oa(a)    (10.6)
¬Q([a], y)    (10.7)

In (10.7), all constants [a] are marked, which prevents superposition inferences from (10.1) into (10.7). We now define the calculus for checking satisfiability of Γ:

Definition 10.2.2. With BS^{D,+}_{CQ} we denote the BS^D calculus, extended with decomposition, and parameterized as follows:

• Inference conclusions, whenever possible, are decomposed according to the following table, where s is a term of the form f1(. . . fm(u) . . .) with u a variable or a constant, and ti are terms of the form fi,1(. . . fi,m(x) . . .) for m ≥ 1:

D · ρ ∨ R([s], [f(s)]) is decomposed into D · ρ ∨ QR,f([s]) and ¬QR,f(x) ∨ R(x, [f(x)])

D · ρ ∨ R([f(s)], [s]) is decomposed into D · ρ ∨ QInv(R),f([s]) and ¬QInv(R),f(x) ∨ R([f(x)], x)

(¬)A1([t1]) ∨ ... ∨ (¬)An([tn]) is decomposed into Q(¬)A1,t1(x) ∨ ... ∨ Q(¬)An,tn(x) and ¬Q(¬)Ai,ti(x) ∨ (¬)Ai([ti]), for 1 ≤ i ≤ n

• The precedence for LPO is f > c > P > QR,f > T for each function symbol f, constant c, nondefinition predicate P , and definition predicate QR,f .

• If a closure C contains a literal ¬Oa(xa), then all such literals are selected; otherwise, all negative binary literals are selected.

We extend the ALCHIQ(D)-closures to closures obtained in a saturation of Γ by BS^{D,+}_{CQ}:

Definition 10.2.3. The class of CQ-closures w.r.t. a conjunctive query Q(a, y) over an ALCHIQ(D) knowledge base KB is obtained as a generalization of closures from Tables 5.2 and 6.2 with the following changes:

• Conditions (iii)–(vi) are dropped.

• Closure types 5 and 6 are replaced with a new type 5′, which contains each closure C satisfying all of the following conditions:

1. C contains only equality or unary literals;
2. C contains only one variable x;
3. The depth of a term in C is bounded by the number of literals in Q(a, y);
4. If C contains a term of the form f(t), then all terms of the same depth in C are of the form g(t), and all terms of smaller depth are (not necessarily proper) subterms of t;
5. Only the outermost position of a term in C can be unmarked; that is, a term containing a function symbol is either of the form [f(t)] or of the form f([t]);
6. Equality and inequality literals in C can have the form [f(t)] ◦ [g(t)] or [f(g(t))] ◦ [t] for ◦ ∈ {≈, ≉}.

• Closure type 8 is modified to allow unary and (in)equality literals to contain unary terms whose depth is bounded by the number of literals in Q(a, y); only outermost positions in a term can be unmarked; and all (in)equality literals are of the form [f(a)] ◦ [b], [f(t)] ◦ [g(t)], [f(g(t))] ◦ [t], or ⟨a⟩ ◦ ⟨b⟩, for ◦ ∈ {≈, ≉} and t a ground term.

• A new query closure type contains closures of the form ¬Q([a] , y) ∨ C8, where Q([a] , y) is weakly connected, it contains at least one binary literal, and C8 is a possibly empty closure of type 8, which is empty if a is empty.

• A new initial closure type contains closures of the form ¬Oa(xa) ∨ ¬Q(xa, y).

We now show that saturation of Γ by BS^{D,+}_{CQ} terminates for each Γ, thus yielding a decision procedure for conjunctive query answering:

Theorem 10.2.4. For a conjunctive query Q(a, y) over an ALCHIQ(D) knowledge base KB, saturation of Γ by BS^{D,+}_{CQ} decides satisfiability of Γ in time doubly exponential in |KB| + |Q(a, y)|, assuming a bound on the arity of the concrete domain predicates, and for unary coding of numbers in input.

Proof. As in Lemma 5.3.5, Condition 5 of Definition 10.2.3 ensures that, in each CQ-closure of type 5′, the maximal literal contains the deepest term of a closure. This allows us to show the following property (*): application of a BS^{D,+}_{CQ} inference to CQ-closures results in one or more CQ-closures. Namely, hyperresolution with a closure of type 7 is possible only with closures of types 3, 4, and 8, and results in a closure of type 5′ or 8. Furthermore, each conclusion of a superposition inference into a generator closure is decomposed into a generator and a closure of type 5′; eligibility of the inference for decomposition follows by the same argument as in Theorem 5.4.8. Note that, since R([f(t)], [t]) and Inv(R)([t], [f(t)]) are logically equivalent, the predicate QInv(R),f can be used as the definition predicate for R([f(x)], x). Finally, since only the outermost position of any term can be unmarked, two terms t1 and t2 can be unified only at the outer positions. Since both terms are unary, they can be unified only if one of them is of the form f1(. . . fi(x) . . .), and the other one is of the form f1(. . . fi(. . . fn(x′) . . .) . . .), so the unifier is of the form x ↦ fi+1(. . . fn(x′) . . .). Therefore, the maximal term depth in the conclusion is n, and the conclusion is a CQ-closure. Consider an inference with an initial closure of the form ¬Oa(xa) ∨ ¬Q(xa, y). For each variable xa, such a closure contains a literal ¬Oa(xa), all of which are selected. Hence, the only possible inference is hyperresolution on all ¬Oa(xa) with ground premises of the form Oa(a) ∨ C, which yields a query closure of the form ¬Q([a], y) ∨ C8. A more complex case is an inference with a query closure ¬Q([a], y) ∨ C8. All constants in such a closure are marked, so superposition into it is not possible. Furthermore, since the closure always contains at least one binary literal, it can only participate as the main premise in hyperresolution on all binary literals, with a unifier

σ, and with side premises Ei being of type 3, 4, or 8. For side premises of types 3 and 4, with xi we denote the free variable of the i-th side premise. Assume that at least one side premise is of type 8, or that Q([a], y) contains a constant term. Then, some term in Q([a], y)σ is a ground term α. Let Ei be the side premise matched to the binary literal containing α and some other term β. If Ei is ground, then βσ is obviously a ground term. On the contrary, if Ei is nonground, since the maximal literal of Ei is of the form R(xi, ⟨f(xi)⟩) or R([f(xi)], xi), Eiσ is ground, so βσ is again a ground term. Since Q([a], y) is weakly connected, constants are propagated to the entire query, so Q([a], y)σ is ground. Since all terms containing function symbols in side premises are of the form fi(xi), and they are of depth one, the maximal depth of a term after unification is bounded by the maximal length of a path in Q([a], y), which is bounded by the number of binary literals in Q([a], y). Hence, the conclusion is a CQ-closure of type 8. Assume that no side premise is of type 8, and that Q([a], y) does not contain a constant; then, C8 is empty, and we write simply Q(y). Since Q(y) is weakly connected, similarly as in the previous case, we conclude that the unifier σ contains mappings of the form xi ↦ si and yi ↦ ti, where si and ti are terms of the form fi,1(. . . fi,m(x) . . .), and m is bounded by the maximal length of a path in Q(y). Hence, the hyperresolution conclusion C contains only unary literals of the form (¬)A([ti]), where ti is of the form fi,1(. . . fi,m(x) . . .) and m is bounded by the number of binary literals in Q(y). Since the conclusion does not have equality literals, it satisfies Conditions 1, 2, 3, 5, and 6 of the CQ-closure type 5′; however, the terms ti need not satisfy Condition 4. However, the closure is decomposed by BS^{D,+}_{CQ} into several CQ-closures. Since in the LPO we ensure that A > Q(¬)Ai,ti, we also have that ¬Q(¬)Ai,ti(x) ≺ (¬)A([ti]); furthermore, (¬)A([ti]) originates from some side premise Ej, so the inference is eligible for decomposition by Proposition 5.4.6. This covers all BS^{D,+}_{CQ} inferences on CQ-closures, so the property (*) holds.

Let p and f be the number of predicates and function symbols occurring in Ξ(KB); both are linear in |KB| for unary coding of numbers. The number p′ of predicates QS,f, introduced by decomposition after superposition into a generator, is quadratic in |KB|. For n the number of literals in Q(a, y), the number of terms of the form f1(. . . fi(x) . . .) is bounded by ℓ = f + f^2 + ... + f^n, which is exponential in |KB| + |Q(a, y)|. Consider a hyperresolution inference with a query closure. The conclusion of such an inference is decomposed only if the conclusion is nonground, which is possible only if all premises are of type 3 or 4. Therefore, the hyperresolution conclusion can only contain predicates from Ξ(KB) and definition predicates QS,f. The number of such predicates is bounded by 2(p + p′) (the factor 2 takes into account that unary literals can occur positively or negatively), so the number of predicates Q(¬)A,t introduced by decomposition after resolution with a query closure is bounded by ℘ = 2(p + p′)ℓ, which is exponential in |KB| + |Q(a, y)|. Furthermore, the number of literals in the longest closure of type 5′ is bounded by 2(℘ + p + p′)ℓ, which is exponential in |KB| + |Q(a, y)|. Similarly, the maximal number of ground terms containing function symbols is c · ℓ, where c is the number of constants in Ξ(KB), which gives an exponential bound on the length of the maximal closure of type 8. In all cases, a maximal closure is at most exponential in length, so the number of possible CQ-closures is doubly exponential in |KB| + |Q(a, y)|. For Γ as assumed by the theorem, all closures in Γ are CQ-closures: closures from Ξ(KB) are ALCHIQ-closures, and ¬Oa(xa) ∨ ¬Q(xa, y) is either an initial or a query closure (depending on whether xa is empty), or it is of type 5′, 7, or 8. By property (*), in any derivation Γ = N0, ..., Ni, each set Ni contains only CQ-closures. Since the number of closures in each Ni is at most doubly exponential in |KB| + |Q(a, y)|, saturation by BS^{D,+}_{CQ} terminates in doubly exponential time. Since all decomposition inferences are eligible for decomposition, by Theorem 5.4.4 BS^{D,+}_{CQ} is sound and complete, so it decides satisfiability of Γ.

Note that, by assuming a bound on |Q(x, y)|, the query answering algorithm becomes exponential. Namely, the depth of the terms of the form f1(. . . fi(x) . . .) is then polynomial in |KB|, so ℓ from the proof of Theorem 10.2.4 becomes polynomial.

10.3 Deciding Conjunctive Query Containment

We now apply the query answering algorithm to decide query containment.

Theorem 10.3.1. Let Q1(x, y1) and Q2(x, y2) be two conjunctive queries with the same set of distinguished variables, defined over an ALCHIQ(D) knowledge base KB. The query Q2(x, y2) is contained in the query Q1(x, y1) w.r.t. KB if and only if a is an answer to Q1(x, y1) over KB ∪ {Q2(a, b)}, where a and b are sets of new distinct individuals, not occurring in Q1(x, y1), Q2(x, y2), and KB.

Proof. The claim can easily be established by transforming the definition of query containment using the following well-known equivalences:

π(KB) |=D ∀x : [∃y2 : Q2(x, y2) → ∃y1 : Q1(x, y1)]
⇔ π(KB) ∪ {¬∀x : [¬∃y2 : Q2(x, y2) ∨ ∃y1 : Q1(x, y1)]} is D-unsatisfiable
⇔ π(KB) ∪ {∃x, y2 : Q2(x, y2) ∧ ∀y1 : ¬Q1(x, y1)} is D-unsatisfiable
⇔ π(KB) ∪ {Q2(a, b), ∀y1 : ¬Q1(a, y1)} is D-unsatisfiable
⇔ π(KB) ∪ {Q2(a, b)} |=D ∃y1 : Q1(a, y1)
⇔ a is an answer to Q1(x, y1) over KB ∪ {Q2(a, b)}

The constants a and b are introduced by skolemizing ∃x, y2. The following corollary follows immediately from Theorem 10.3.1:

Corollary 10.3.2. Let Q1(x, y1) and Q2(x, y2) be conjunctive queries defined over an ALCHIQ(D) knowledge base KB. Then, deciding if Q2(x, y2) is contained in a query Q1(x, y1) w.r.t. KB by saturation with BS^{D,+}_{CQ} runs in time that is doubly exponential in |KB| + |Q1(x, y1)| + |Q2(x, y2)|, assuming a bound on the arity of the concrete domain predicates and for unary coding of numbers in input.
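To make the reduction behind Theorem 10.3.1 tangible, here is a small sketch (ours, not from the thesis; the query encoding and constant-naming scheme are made up) that "freezes" Q2 by Skolemizing its variables with fresh constants. The resulting ground atoms Q2(a, b) are added to KB, and containment then reduces to checking whether the tuple a is an answer to Q1 over the extended knowledge base.

```python
from itertools import count

# An atom is (predicate, args); lowercase terms are variables.
def freeze(query, distinguished):
    """Skolemize all variables of the query with fresh constants c0, c1, ..."""
    fresh, skolems = count(), {}
    def skolem(term):
        if term not in skolems:
            skolems[term] = f"c{next(fresh)}"
        return skolems[term]
    ground = [(pred, tuple(skolem(a) if a.islower() else a for a in args))
              for pred, args in query]
    answer = tuple(skolems[x] for x in distinguished)
    return ground, answer   # atoms to add to KB, and the candidate answer tuple a

# Q2(x, y2) = worksAt(x, y2) ∧ Company(y2), with x the only distinguished variable
q2 = [("worksAt", ("x", "y2")), ("Company", ("y2",))]
print(freeze(q2, ["x"]))
# -> ([('worksAt', ('c0', 'c1')), ('Company', ('c1',))], ('c0',))
```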

10.4 Related Work

Conjunctive queries were introduced in [32] as a query language for the relational model, capable of expressing a large number of practically relevant queries. The problem of deciding conjunctive query containment was shown to be NP-complete, using an algorithm based on deciding whether a homomorphism—an embedding of nondistinguished variables—between the subsumed and the subsuming query exists. Furthermore, the authors identify the close relationship between query answering and query containment. In practice, query equivalence is used extensively for query optimization in relational databases: a complex query is usually transformed into an equivalent but simpler query, which can be executed more efficiently [1]. In [135], conjunctive queries were extended to recursive queries, for which deciding equivalence was shown to be undecidable. However, note that this is so only if the minimal fixpoint semantics is assumed for the queries. Such a semantics is common in logic programming; however, it is principally different from the first-order semantics. Hence, the results from [135] do not apply in our case. Answering conjunctive queries and conjunctive query containment over description logic knowledge bases was studied in the CARIN system [84]. The description logic considered is ALCNR, and is significantly less expressive than SHIQ(D). The decision procedure for query answering is based on constrained resolution, which combines SLD-resolution backward chaining with tableau reasoning. In [28], the authors study the containment of conjunctive queries over constraints, expressed in the description logic DLRreg. This logic is distinguished by allowing for n-ary relations and regular expressions over projections of relations. The technique used to handle query containment is based on a reduction to satisfiability of CPDLg (propositional dynamic logic with converse and graded modalities) programs. A similar technique was used to derive a procedure for rewriting queries over description logics using views in [29]. However, as pointed out in [50], the presented algorithm is incorrect in the presence of transitive roles. In [72], a procedure for deciding containment of conjunctive queries over constraints, based on the reduction to SHIQ, was presented. In this way the authors obtain a practical procedure. Namely, they argue that the approach from [28] is not practical, since a reasoner for CPDLg does not exist. In [75], conjunctive queries have been proposed as the query language for the Semantic Web. The approach presented there is restricted only to tree-like queries, possibly containing constants. However, the approach was generalized later in [143] to an algorithm for answering conjunctive queries over SHf knowledge bases, and is thus the first algorithm for answering conjunctive queries over a logic with transitive roles. The algorithm is based on a reduction of query answering to standard reasoning problems in SHIQ. Contrary to the approach from [143], our approach supports inverse roles, but it allows only simple roles to occur in the queries. To the best of our knowledge, this is the first algorithm for query answering and query containment based on resolution.

Chapter 11
The Semantics of Metamodeling

A common practice in DL modeling is to divide a model into an intensional and an extensional part. The intensional part is analogous to a database schema, and it describes the general structure and the regularities of the world; the extensional part is analogous to a database instance, and it describes a particular state of the world. To better understand this duality, consider the following example, originally pre- sented in [149]; a similar example can be found in [132]. A natural way to represent kinship between animal species is to organize them in a hierarchy of concepts. For example, the concept Bird represents the set of all birds, and the concept Eagle is a subconcept of Bird. The subconcept relationship states that all eagles are birds. This is an example of intensional knowledge, as it is concerned with defining the general notions of birds and eagles. Knowledge about concrete animals is extensional; for ex- ample, we may state that the individual Harry is an instance of the concept Eagle. Now the intensional knowledge implies that Harry is a Bird as well. However, one might also make statements about individual species, such as “Eagles are listed in the IUCN Red List1 of endangered species.” Note an important distinction: we do not say that each individual eagle is listed in the Red List, but that the eagle species as a whole is. Hence, we may introduce a concept RedListSpecies, but this raises some concerns about the proper relationship between RedListSpecies and Eagle. Making the former a superconcept of the latter is incorrect, as it would imply that Harry is a RedListSpecies—clearly an undesirable conclusion. It is better to say that Eagle is type of RedListSpecies. Thus, RedListSpecies acts as a metaconcept for Eagle. Modeling with metaconcepts we call metamodeling. Metamodeling can be used to build concise models if we axiomatize the properties of metaconcepts. For example, by stating that “It is not allowed to hunt the individuals of species listed in the Red List,” we formalize the logical properties of the metaconcept RedListSpecies, which allows us to deduce that “It is not allowed to hunt Harry.” The examples, such as the one given previously, are often dismissed with an argu- ment that “eagle as a species” and “eagle as a set of all individual eagles” are not the

1http://www.redlist.org/

183 184 11. The Semantics of Metamodeling one and the same thing, and should therefore not be referred to using the same symbol. Whereas an in-depth philosophical investigation might provide a more definitive an- swer, we believe that the notion of an “eagle” in most people’s minds invokes a notion of a “mighty bird of prey.” The interpretation of this notion as a concept or as an individual is of secondary concern, and is often context-dependent, so using different symbols for the same intuitive notion makes the model unnecessarily complex. Metamodeling is supported in OWL-Full [106], the most expressive language of the OWL family. However, its semantics is controversial, mainly because it is nonstandard, which makes realizing practical reasoning systems difficult [68]. Therefore, OWL-DL was conceived as a “well-behaved” subset of OWL-Full by imposing the following re- strictions: (i) logical and metalogical symbols are strictly separated, (ii) the symbols used as concepts, roles, and individuals are strictly separated, and (iii) restrictions required to yield a decidable logic, such as using only simple roles in number restric- tions [73], are enforced. These restrictions make OWL-DL a syntactical variant of the SHOIN (D) description logic, which is decidable. This is desirable since, to practi- cally implement reasoners for expressive logics, advanced optimization techniques are essential, and these are much easier to develop for decidable logics [4, Chapter 9]. Since it does not enforce (iii), OWL-Full is trivially undecidable. To obtain a decidable logic supporting metamodeling, it is natural to ask whether OWL-DL, ex- tended with metamodeling in the style of OWL-Full, remains decidable. However, in Section 11.1 we show that even the basic description logic ALC becomes undecidable if restrictions (i) and (ii) are not enforced. We analyze this result, and show that it is actually due to (i)—that is, to free mixing of logical and metalogical symbols. In a way, metamodeling in OWL-Full goes beyond its original purpose, and allows the user to tamper with the semantics of the modeling primitives themselves. In Section 11.2 we address this issue and present two alternative semantics: a contextual or π-semantics, which is essentially first-order, and a HiLog or ν-semantics, which is based on HiLog [33]—a logic that provides a second- order flavor to first-order logic. We show that, under some technical assumptions, both semantics can be combined with SHIQ(D), yielding a decidable fragment of OWL-Full. We show that this does not increase the complexity of reasoning, so our results may provide a basis for practical implementation. Finally, in Section 11.3 we discuss the added expressivity of metamodeling on a concrete example.

11.1 Undecidability of Metamodeling in OWL-Full

In this section we show that the style of metamodeling adopted in OWL-Full leads to undecidability, even if combined with a very simple description logic ALC. We start by presenting the definition of the ALC-Full syntax and semantics. We base the semantics on the one of OWL-Full [106]; however, the latter is quite complex, so for readability we abstract from numerous technical details. We assume rdf:, rdfs: and owl: to be the RDF, RDFS, and OWL namespace prefixes, respectively [106].

Table 11.1: Semantics of ALC-Full

1. ⟨s, p, o⟩ ∈ KB implies (s^I, o^I) ∈ EXT^I(p^I)
2. CEXT^I(owl:Thing^I) = Δ^I
3. CEXT^I(owl:Nothing^I) = ∅
4. (x, y) ∈ EXT^I(rdfs:subClassOf^I) implies CEXT^I(x) ⊆ CEXT^I(y)
5. (x, y) ∈ EXT^I(owl:sameAs^I) implies x = y
6. (x, y) ∈ EXT^I(owl:differentFrom^I) implies x ≠ y
7. (x, y) ∈ EXT^I(owl:complementOf^I) implies CEXT^I(x) = Δ^I \ CEXT^I(y)
8. (x, y) ∈ EXT^I(owl:unionOf1^I) and (x, z) ∈ EXT^I(owl:unionOf2^I) imply CEXT^I(x) = CEXT^I(y) ∪ CEXT^I(z)
9. (x, y) ∈ EXT^I(owl:intersectionOf1^I) and (x, z) ∈ EXT^I(owl:intersectionOf2^I) imply CEXT^I(x) = CEXT^I(y) ∩ CEXT^I(z)
10. (x, y) ∈ EXT^I(owl:someValuesFrom^I) and (x, p) ∈ EXT^I(owl:onProperty^I) imply CEXT^I(x) = {w | ∃z : (w, z) ∈ EXT^I(p) ∧ z ∈ CEXT^I(y)}
11. (x, y) ∈ EXT^I(owl:allValuesFrom^I) and (x, p) ∈ EXT^I(owl:onProperty^I) imply CEXT^I(x) = {w | ∀z : (w, z) ∈ EXT^I(p) → z ∈ CEXT^I(y)}

Definition 11.1.1. Let V be the vocabulary set consisting of these symbols: owl:Thing, owl:Nothing, rdf:type, rdfs:subClassOf, owl:sameAs, owl:differentFrom, owl:complementOf, owl:unionOf1, owl:unionOf2, owl:intersectionOf1, owl:intersectionOf2, owl:someValuesFrom, owl:allValuesFrom, owl:onProperty. Let N be the set of names such that V ⊆ N. An ALC-Full knowledge base KB is a finite set of triples of the form ⟨s, p, o⟩, where s, p, o ∈ N. An interpretation I is a triple (Δ^I, ·^I, EXT^I), where Δ^I is a nonempty domain set, ·^I : N → Δ^I is a name interpretation function, and EXT^I : Δ^I → 2^(Δ^I × Δ^I) is an extension function. Let CEXT^I : Δ^I → 2^(Δ^I) be the concept extension function defined as CEXT^I(x) = {y | (y, x) ∈ EXT^I(rdf:type^I)}. An interpretation I is a model of KB if it satisfies all conditions from Table 11.1. KB is satisfiable if and only if a model of KB exists.

ALC-Full differs from OWL-Full in that (i) it does not provide for concrete predicates; (ii) it does not include the axiomatic triples that define the semantics for built-in resources, such as rdf:type; and (iii) it allows only binary union and intersection. These distinctions are not relevant for our undecidability proof. In the rest of this section, we use ⟨a ⊔ b, p, o⟩ as a syntactic shortcut for the triples ⟨x, p, o⟩, ⟨x, owl:unionOf1, a⟩, and ⟨x, owl:unionOf2, b⟩, where x is a new name. We use similar shortcuts for ⟨s, p, a ⊔ b⟩ and for ⊓.
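The following is a minimal sketch, in Java, of how such a shortcut can be expanded mechanically; the class and method names are hypothetical and only illustrate the expansion just described.

import java.util.Arrays;
import java.util.List;

// Hypothetical helper: expands the pseudo-triple <a ⊔ b, p, o> into the three
// ALC-Full triples obtained by introducing a fresh name x for the union of a and b.
public final class UnionShortcut {
    private static int fresh = 0;

    // Each triple is rendered as a string "<s, p, o>" for readability.
    public static List<String> expand(String a, String b, String p, String o) {
        String x = "x" + (++fresh);   // new name denoting a ⊔ b
        return Arrays.asList(
            "<" + x + ", " + p + ", " + o + ">",
            "<" + x + ", owl:unionOf1, " + a + ">",
            "<" + x + ", owl:unionOf2, " + b + ">");
    }
}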

We show now that checking satisfiability of an ALC-Full knowledge base KB is undecidable by a reduction from the Domino Tiling problem [20]. A domino system is a triple D = (D, H, V), where D = {D1, ..., Dn} is a finite set of domino types, H ⊆ D × D is the horizontal compatibility relation, and V ⊆ D × D is the vertical compatibility relation. A D-tiling of an infinite grid is a function t : N × N → D such that t(0, 0) = D0 and, for all i, j ∈ N, (t(i, j), t(i, j+1)) ∈ H and (t(i, j), t(i+1, j)) ∈ V. For a domino system D, determining whether a D-tiling exists is undecidable [20]. For a domino system D, let KB_D be the ALC-Full knowledge base containing the axioms (11.1)–(11.11). Lemma 11.1.2 shows that KB_D exactly encodes the domino tiling problem.

(11.1) ⟨Di ⊓ Dj, rdfs:subClassOf, owl:Nothing⟩ for 1 ≤ i < j ≤ n

(11.2) ⟨GRID, rdfs:subClassOf, D1 ⊔ ... ⊔ Dn⟩

(11.3) ⟨NotGRID, owl:complementOf, GRID⟩

(11.4) ⟨Di, rdfs:subClassOf, αi⟩, ⟨αi, owl:onProperty, owl:allValuesFrom⟩, ⟨αi, owl:allValuesFrom, NotGRID ⊔ ⨆_{(Di,d)∈H} d⟩ for 1 ≤ i ≤ n

(11.5) ⟨Di, rdfs:subClassOf, βi⟩, ⟨βi, owl:onProperty, rdf:type⟩, ⟨βi, owl:allValuesFrom, NotGRID ⊔ ⨆_{(Di,d)∈V} d⟩ for 1 ≤ i ≤ n

(11.6) ⟨GRID, owl:someValuesFrom, GRID⟩

(11.7) ⟨GRID, owl:onProperty, owl:allValuesFrom⟩

(11.8) ⟨GRID, owl:onProperty, rdf:type⟩

(11.9) ⟨GRID, rdfs:subClassOf, owl:allValuesFrom⟩

(11.10) ⟨rdf:type, owl:sameAs, owl:onProperty⟩

(11.11) ⟨a0,0, rdf:type, GRID ⊓ D0⟩

Lemma 11.1.2. A D-tiling exists if and only if KB_D is satisfiable.

Proof. (⇒) For a D-tiling t, let I be the interpretation depicted in Figure 11.1, with CEXT^I(GRID^I) = {ai,j} and CEXT^I(Dk^I) = {ai,j | t(i, j) = Dk}, for i, j ≥ 0 and 1 ≤ k ≤ n. The triples (11.3)–(11.5) encode the compatibility relations of D (including NotGRID in (11.4) and (11.5) ensures that compatibility is enforced only among instances of GRID). It is easy to see that I is a model of KB_D.

(⇐) Let I be a model of KB_D. An excerpt of I is shown in Figure 11.1, in which a triple ⟨s, p, o⟩ is represented as an arc pointing from the node s to the node o, whereas p is encoded by the line type according to the legend. To refer easily to arcs, we assign them labels si, ti, hi, and vi (these do not correspond to p). For example, the arc s1 represents the triple ⟨a0,0, rdf:type, owl:allValuesFrom⟩. Due to (11.10), rdf:type and owl:onProperty are synonyms, so s1 also represents the triple ⟨a0,0, owl:onProperty, owl:allValuesFrom⟩. By an abuse of notation, we do not distinguish between the symbols and their interpretations. Due to (11.11), a0,0 is linked by t1 to GRID.

Figure 11.1: Grid Structure in a Model of KB_D

Due to (11.6), (11.7), and (11.8), a0,0 is linked to a0,1 and a1,0 through h1 and v1, respectively, and a0,1 and a1,0 are in the concept extension of GRID by t2 and t3, respectively. Due to (11.6) and (11.7), a1,0 is linked by h2 to a1,1, and by t4 to GRID. Finally, by (11.9), all ai,j are in the concept extension of owl:allValuesFrom; that is, all ai,j have an sl arc to it.

Consider now the arcs at the node a1,0. The arc s3 can, due to (11.10), be read as ⟨a1,0, owl:onProperty, owl:allValuesFrom⟩. By applying Item 11 of Table 11.1 for x = a1,0 and y = a1,1, we see that, if w is in the concept extension of a1,0 and it is connected via p = owl:allValuesFrom to some z, then z must be in the concept extension of a1,1. By setting w = a0,0 due to v1 and z = a0,1 due to h1, we get that a0,1 is in the concept extension of a1,1. Hence, a0,1 is connected to a1,1 by v2, so a0,0, a0,1, a1,0, and a1,1 are arranged in a two-dimensional grid, which continues indefinitely due to (11.6)–(11.8).

A node ai,j in I is allowed to have multiple owl:allValuesFrom and rdf:type successors, and all ai,j need not be distinct, so I need not be a two-dimensional grid. However, a two-dimensional grid can easily be extracted from I: one can choose any owl:allValuesFrom successor ai,j+1 and any rdf:type successor ai+1,j of ai,j, as well as any owl:allValuesFrom successor ai+1,j+1 of ai+1,j. Regardless of the choices, ai,j+1 is always connected to ai+1,j+1 by rdf:type, so ai,j, ai,j+1, ai+1,j, and ai+1,j+1 are connected in a grid-like manner. Hence, I contains a two-dimensional infinite grid in which the owl:allValuesFrom arcs are horizontal, and the rdf:type arcs are vertical. The triples (11.1)–(11.5) ensure that each grid node is assigned a single domino type that corresponds to the compatibility relations H and V of D, so a D-tiling can easily be constructed from I.

We briefly comment on an important issue in the proof of Lemma 11.1.2. Namely, the main difficulty in the reduction is to ensure that each member of the GRID concept has an owl:onProperty arc to the owl:allValuesFrom node, so that we can apply Item 11 of Table 11.1. In the case of OWL-Full, this might be done using nominals, by adding to the GRID concept an existential restriction on owl:onProperty to the nominal owl:allValuesFrom. However, to obtain a more general result, we refrain from using nominals, and employ the following trick instead. By (11.9) we say that GRID is a subclass of owl:allValuesFrom, which ensures that every member of GRID has an rdf:type arc to the owl:allValuesFrom node. Furthermore, by (11.10) we say that rdf:type and owl:onProperty are synonyms. Now these two axioms are sufficient to obtain the necessary owl:onProperty arc pointing to the owl:allValuesFrom node for each member of the GRID concept. Undecidability of ALC-Full follows as an immediate consequence of Lemma 11.1.2 and the known undecidability result for the domino tiling problem [20]:

Theorem 11.1.3. Checking satisfiability of an ALC-Full knowledge base KB is undecidable.

11.2 Extending DLs with Decidable Metamodeling

The proof of Lemma 11.1.2 reveals the causes for the undecidability of metamodeling in OWL-Full. Namely, this logic not only allows treating concepts as individuals, but it also allows mixing logical and metalogical symbols, thus exposing its modeling primitives as individuals. We exploited this in the axioms (11.6)–(11.8) of KB_D, by stating an existential restriction on the owl:allValuesFrom and rdf:type symbols, thus affecting their semantics. One would easily agree that tampering with the semantics of the ontology language is hardly desirable in practice, so in this section we present two alternative semantics for metamodeling. Due to certain technical problems, we add metamodeling to ALCHIQ(D) first, and consider transitive roles subsequently.

11.2.1 Metamodeling Semantics for ALCHIQ(D) To allow for metamodeling, we extend the syntax of description logics (Definitions

3.1.1 and 3.2.3) such that NC, NIa, NRa, and NRc are not necessarily disjoint, and are contained in a set of names N, for which n ∈ N implies Inv(n) ∈ N. Hence, the same symbol can be used to denote a concept, an abstract individual, an abstract role, and a concrete role. This causes problems for axioms of the form A ⊑ B: from the axiom alone, it is not clear whether the inclusion refers to the concept, to the abstract role, or to the concrete role interpretations of A and B. To distinguish these three types of inclusions, we write A ⊑_a B for abstract role inclusion, A ⊑_c B for concrete role inclusion, and A ⊑_n B for concept inclusion. To simplify the presentation, we do not consider axioms of the form A ≡ B, as they are simply shortcuts for A ⊑_n B and B ⊑_n A.

To avoid technical complications, we assume that the set of names N, the set of concrete predicates ΦD, and the set of concrete individuals NIc are pairwise disjoint. We believe that this is not a practically important restriction: we cannot think of an example in which treating concrete predicates as individuals, or mixing abstract and concrete individuals, might be required. This also allows us to syntactically distinguish complex concepts: because A ∈ N and d ∈ ΦD, it is clear that ∃R.A denotes existential quantification over the abstract domain, whereas ∃T.d denotes existential quantification over the concrete domain. Similarly, this allows us to syntactically distinguish abstract and concrete ABox axioms, such as S(a, b) and T(a, bc).

For the thus modified syntax, we now define the contextual semantics of metamodeling, whose name reflects the fact that a symbol in the knowledge base is interpreted as a concept, a role, or an individual according to the syntactic context in which it occurs. We use this semantics mainly for comparison with the HiLog semantics, which we introduce later.

Definition 11.2.1 (Contextual Semantics). Let KB be an ALCHIQ(D) knowledge base over an admissible concrete domain D. A π-interpretation I = (Δ^I, ·^I, C^I_n, R^I_a, R^I_c) is a 5-tuple where Δ^I is a domain set, ·^I : N → Δ^I is a symbol interpretation function, C^I_n : N → 2^(Δ^I) is an atomic concept extension function, R^I_a : N → 2^(Δ^I × Δ^I) is an abstract role extension function, and R^I_c : N → 2^(Δ^I × Δ_D) is a concrete role extension function. The function C^I_n is extended to concepts as specified in Table 11.2, upper left section, where symbols are interpreted contextually—that is, depending on their syntactic position. A π-interpretation I is a π-model of KB if it satisfies all conditions from Table 11.2, lower left section. The notions of π-satisfiability, π-unsatisfiability, and π-entailment (written |=π) are defined as usual.

The contextual semantics is essentially equivalent to the contextual semantics from [33], and to the standard first-order semantics. Namely, in a first-order formula, the role of a symbol can be inferred from the place at which the symbol occurs in the formula, so the sets of constant, function, and predicate symbols do not need to be pairwise disjoint. Hence, KB is π-satisfiable if and only if π(KB) is (first-order) satisfiable, which can be decided, for example, using the algorithms from Chapters 5 and 6. Note that, as mentioned in Section 2.5, in basic superposition we encode literals as E-terms. For a correct encoding under the contextual semantics, we additionally use a separate sort for the E-general function symbols obtained by encoding predicate symbols.
We now define the HiLog semantics for metamodeling, which follows the ideas of OWL-Full more closely.

Definition 11.2.2 (HiLog Semantics). Let KB be an ALCHIQ(D) knowledge base over an admissible concrete domain D. A ν-interpretation I = (Δ^I, ·^I, C^I_n, R^I_a, R^I_c) is a 5-tuple where Δ^I is a domain set, ·^I : N → Δ^I is a symbol interpretation function, C^I_n : Δ^I → 2^(Δ^I) is an atomic concept extension function, R^I_a : Δ^I → 2^(Δ^I × Δ^I) is an abstract role extension function, and R^I_c : Δ^I → 2^(Δ^I × Δ_D) is a concrete role extension function. The extension of the function C^I_n to concepts, and the interpretation of axioms, are specified in Table 11.2, right section. The notions of ν-satisfiability, ν-unsatisfiability, and ν-entailment (written |=ν) are defined as usual.

We briefly discuss the essential difference between these two semantics. Consider the knowledge base consisting only of one axiom a(a), where the symbol a is used as an individual and as a concept.

Table 11.2: Two Semantics for SHIQ(D) with Metamodeling

π-semantics

Extending C^I_n to concepts:
A: C^I_n(A) ⊆ Δ^I
¬D: Δ^I \ C^I_n(D)
D1 ⊓ D2: C^I_n(D1) ∩ C^I_n(D2)
D1 ⊔ D2: C^I_n(D1) ∪ C^I_n(D2)
∃S.D: {x | ∃y : (x, y) ∈ R^I_a(S) ∧ y ∈ C^I_n(D)}
∀S.D: {x | ∀y : (x, y) ∈ R^I_a(S) → y ∈ C^I_n(D)}
≤ n S.D: {x | #{y | (x, y) ∈ R^I_a(S) ∧ y ∈ C^I_n(D)} ≤ n}
≥ n S.D: {x | #{y | (x, y) ∈ R^I_a(S) ∧ y ∈ C^I_n(D)} ≥ n}
∀T1,...,Tm.d: {x | ∀y1,...,ym : ⋀_i (x, yi) ∈ R^I_c(Ti) → (y1,...,ym) ∈ d^D}
∃T1,...,Tm.d: {x | ∃y1,...,ym : ⋀_i (x, yi) ∈ R^I_c(Ti) ∧ (y1,...,ym) ∈ d^D}
≤ n T: {x | #{y | (x, y) ∈ R^I_c(T)} ≤ n}
≥ n T: {x | #{y | (x, y) ∈ R^I_c(T)} ≥ n}

Interpretation of axioms:
(R^I_a(S))⁻ = R^I_a(Inv(S))
S ⊑_a T: R^I_a(S) ⊆ R^I_a(T)
S ⊑_c T: R^I_c(S) ⊆ R^I_c(T)
D1 ⊑_n D2: C^I_n(D1) ⊆ C^I_n(D2)
Trans(S): (R^I_a(S))⁺ ⊆ R^I_a(S)
D(a): a^I ∈ C^I_n(D)
S(a, b): (a^I, b^I) ∈ R^I_a(S)
¬S(a, b): (a^I, b^I) ∉ R^I_a(S)
T(a, b^c): (a^I, (b^c)^I) ∈ R^I_c(T)
¬T(a, b^c): (a^I, (b^c)^I) ∉ R^I_c(T)
a^(c) ≈ b^(c): (a^(c))^I = (b^(c))^I
a^(c) ≉ b^(c): (a^(c))^I ≠ (b^(c))^I

ν-semantics

C^I_n and the interpretation of axioms are obtained from the ones for the π-semantics by applying the following changes:
C^I_n(A) becomes C^I_n(A^I)
R^I_a(S) becomes R^I_a(S^I)
R^I_a(Inv(S)) becomes R^I_a(Inv(S)^I)
R^I_c(T) becomes R^I_c(T^I)

Note: #N is the number of elements in N; S⁺ is the transitive closure of S; and S⁻ is the inverse relation of S.

A π-model of such a knowledge base is depicted on the left-hand side of Figure 11.2: both the individual interpretation ·^I and the concept interpretation C^I_n are assigned directly to the symbol a. On the contrary, a ν-model of the knowledge base is depicted on the right-hand side of Figure 11.2. There, the individual interpretation ·^I assigns the domain individual x to the symbol a; however, the concept interpretation is not assigned to a, but to the domain individual x. We discuss the practical consequences of such a definition on entailment in Section 11.3.

Since our algorithms are based on resolution calculi, we present an alternative but equivalent definition of ν-models by translation into first-order logic. This translation follows the principles of transforming HiLog formulae into first-order formulae: it reifies concept, abstract role, and concrete role symbols into constants, and represents the functions C^I_n, R^I_a, and R^I_c explicitly by the predicates isa, arole, and crole, respectively.

Figure 11.2: π- and ν-models of the Example Knowledge Base
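Spelled out concretely for this one-axiom knowledge base (a direct instantiation of Definitions 11.2.1 and 11.2.2), both models can take Δ^I = {x} and a^I = x. In the π-model, the concept extension is attached to the symbol itself, so x ∈ C^I_n(a); in the ν-model, it is attached to the domain individual denoted by a, so x ∈ C^I_n(a^I) = C^I_n(x).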

Definition 11.2.3. For an ALCHIQ(D) knowledge base KB, let ν(KB) be the translation of KB into first-order formulae, as specified in Table 11.3, where isa is a binary predicate with signature a × a, arole is a ternary predicate with signature a × a × a, and crole is a ternary predicate with signature a × a × c.
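As an illustration of this mapping (the role livesIn and the concept Habitat are introduced here purely for the example and do not occur elsewhere in this thesis), consider the following axioms and their translations according to Table 11.3:

ν(Eagle(Harry)) = isa(Eagle, Harry)
ν(Eagle ⊑_n Bird) = ∀x : isa(Eagle, x) → isa(Bird, x)
ν(Eagle ⊑_n ∃livesIn.Habitat) = ∀x : isa(Eagle, x) → ∃y : arole(livesIn, x, y) ∧ isa(Habitat, y)

The concept and role names occur as constants in the translated formulae, which is precisely what allows the same symbol to be used both as a predicate and as an individual under the ν-semantics.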

Lemma 11.2.4. For an ALCHIQ(D) knowledge base KB, KB is ν-satisfiable if and only if a first-order model of ν(KB) exists.

Proof. For the (⇒) direction, let Iν be a ν-model of KB. We construct a first-order interpretation I for ν(KB) by setting Δ^I = Δ^{Iν}, a^I = a^{Iν}, (x, y) ∈ isa^I if y ∈ C^{Iν}_n(x), (x, y, z) ∈ arole^I if (y, z) ∈ R^{Iν}_a(x), and (x, y, z) ∈ crole^I if (y, z) ∈ R^{Iν}_c(x). By induction on the structure of formulae in ν(KB), one can easily show that I is a model of ν(KB). The (⇐) direction is similar.

In [106], it was stressed that the semantics of OWL-Full and OWL-DL coincide, provided that resources defining modeling primitives are not used in the knowledge base axioms, and that the sets of concept, role, and individual symbols are disjoint. Similarly, it is possible to show that the HiLog semantics coincides with the OWL-Full semantics if resources defining modeling primitives are not used in knowledge base axioms. The formal proof for this is simple, but it is lengthy due to the very complex semantics of OWL-Full, so we omit it for the sake of brevity.

11.2.2 Deciding ν-Satisfiability We now show that ν-satisfiability can be decided by extending the algorithms from Chapters 5 and 6, without increasing the complexity of reasoning. It is difficult to ensure termination of direct saturation of closures obtained by structural transformation of ν(KB), since such closures contain unmarked constants corresponding to names of concepts and roles. Consider, for example, the following set of closures obtained during saturation:

Table 11.3: Semantics of Metamodeling by Mapping into First-Order Logic

Mapping Concepts to FOL:
νy(A, X) = isa(A, X)
νy(¬D, X) = ¬νy(D, X)
νy(D1 ⊓ D2, X) = νy(D1, X) ∧ νy(D2, X)
νy(D1 ⊔ D2, X) = νy(D1, X) ∨ νy(D2, X)
νy(∀R.D, X) = ∀y : arole(R, X, y) → νx(D, y)
νy(∃R.D, X) = ∃y : arole(R, X, y) ∧ νx(D, y)
νy(≤ n R.D, X) = ∀y1,...,yn+1 : ⋀_{i=1..n+1} [arole(R, X, yi) ∧ νx(D, yi)] → ⋁_{i=1..n+1} ⋁_{j=i+1..n+1} yi ≈ yj
νy(≥ n R.D, X) = ∃y1,...,yn : ⋀_{i=1..n} [arole(R, X, yi) ∧ νx(D, yi)] ∧ ⋀_{i=1..n} ⋀_{j=i+1..n} yi ≉ yj
νy(∀T1,...,Tm.d, X) = ∀y1^c,...,ym^c : ⋀_{i=1..m} crole(Ti, X, yi^c) → d(y1^c,...,ym^c)
νy(∃T1,...,Tm.d, X) = ∃y1^c,...,ym^c : ⋀_{i=1..m} crole(Ti, X, yi^c) ∧ d(y1^c,...,ym^c)
νy(≤ n T, X) = ∀y1^c,...,yn+1^c : ⋀_{i=1..n+1} crole(T, X, yi^c) → ⋁_{i=1..n+1} ⋁_{j=i+1..n+1} yi^c ≈_D yj^c
νy(≥ n T, X) = ∃y1^c,...,yn^c : ⋀_{i=1..n} crole(T, X, yi^c) ∧ ⋀_{i=1..n} ⋀_{j=i+1..n} yi^c ≉_D yj^c

Mapping Axioms to FOL:
ν(R ⊑_a S) = ∀x, y : arole(R, x, y) → arole(S, x, y)
ν(T ⊑_c U) = ∀x, y : crole(T, x, y) → crole(U, x, y)
ν(D1 ⊑_n D2) = ∀x : νy(D1, x) → νy(D2, x)
ν((¬)D(a)) = (¬)νy(D, a)
ν((¬)R(a, b)) = (¬)arole(R, a, b)
ν((¬)T(a, b^c)) = (¬)crole(T, a, b^c)
ν(a ◦ b) = a ◦ b for ◦ ∈ {≈, ≉}
ν(a^c ◦ b^c) = a^c ◦_D b^c for ◦ ∈ {≈, ≉}

Mapping KB to FOL:
ν(R) = ∀x, y : arole(R, x, y) ↔ arole(R⁻, y, x)
ν(KB_R) = ⋀_{α ∈ KB_R} ν(α) ∧ ⋀_{R ∈ N} ν(R)
ν(KB_T) = ⋀_{α ∈ KB_T} ν(α)
ν(KB_A) = ⋀_{α ∈ KB_A} ν(α)
ν(KB) = ν(KB_R) ∧ ν(KB_T) ∧ ν(KB_A)

Notes: (i) X is a meta-variable and is substituted by the actual term; (ii) νx is obtained from νy by simultaneously substituting in the definition all y(i) with x(i), νy with νx, and vice versa.

(11.12) isa(C, x)
(11.13) isa(D, x)
(11.14) C ≈ C′ ∨ D ≈ D′

Superposition of (11.14) into (11.12) and (11.13) yields (11.15). The problem with this closure is that it contains two variables. Further inferences with it might yield a closure with even more variables, so we cannot establish a bound on closure length.

(11.15) isa(C′, x) ∨ isa(D′, x′)

We solve this problem by modifying the preprocessing step. A closure such as (11.12) is decomposed into the closures ¬OC(z) ∨ isa(z, x) and OC(C). Furthermore, the literal ¬OC(z) is selected, so the first resolution inference produces isa([C], x). The constant C now occurs at a substitution position, so superposition into it is not necessary. Moreover, we decompose a ground closure OP(⟨Q⟩) ∨ C·ρ into ¬qP,Q ∨ OP(Q) and qP,Q ∨ C·ρ to prevent closures of types other than 8 from containing arbitrary ground literals.
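To make the effect of this preprocessing concrete, here is the step applied to (11.12): the closure is replaced by ¬OC(z) ∨ isa(z, x) and OC(C); resolving the two on the selected literal ¬OC(z) yields isa([C], x), in which the constant C occurs at a substitution position. Superposition from (11.14) into the marked constant [C] therefore need not be performed, so problematic closures such as (11.15) are avoided.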

Definition 11.2.5. For an ALCHIQ(D)-closure C, let Clsν be defined as follows:

Clsν(C) = ⋃_{D ∈ Def(C)} Cls(∀x : νy(D, x))

Let ζ(C) be the closure obtained from C by replacing all literals according to the following table, where the predicate OP is globally unique for the predicate symbol P , zP is a new variable globally unique for P , and u and v are arbitrary terms:

isa(P, u) becomes isa(zP, u) ∨ ¬OP(zP)
arole(P, u, v) becomes arole(zP, u, v) ∨ ¬OP(zP)
crole(P, u, v^c) becomes crole(zP, u, v^c) ∨ ¬OP(zP)

The operator ζ is extended to a set of closures by applying it to each member of the set. For an extensionally reduced ALCHIQ(D) knowledge base KB, Ξν(KB) is the smallest set of closures satisfying the following conditions:

• For each name R ∈ N, ζ(Cls(ν(R))) ⊆ Ξν(KB);

• For each RBox or ABox axiom α in KB, ζ(Cls(ν(α))) ⊆ Ξν(KB);

• For each TBox axiom C vn D in KB, ζ(Clsν(¬C t D)) ⊆ Ξν(KB);

• For each predicate OP introduced by ζ, OP (P ) ∈ Ξν(KB).

If KB is not extensionally reduced, then Ξν(KB) = Ξν(KB′), where KB′ is an extensionally reduced knowledge base obtained from KB as explained in Section 3.1.

It is easy to see that Ξν(KB) is satisfiable if and only if KB is ν-satisfiable. Lemma 11.2.6. Let KB be an ALCHIQ(D) knowledge base. Then KB is ν-satisfiable if and only if Ξν(KB) is satisfiable. Furthermore, Ξν(KB) can be computed in time polynomial in |KB|, for unary coding of numbers in input.

Proof. By Lemma 11.2.4, KB is ν-satisfiable if and only if ν(KB) is satisfiable. Ξν(KB) is obtained from ν(KB) by applying structural transformation, which is known not to affect satisfiability, and by applying decomposition to literals isa(P, u), arole(P, u, v), and crole(P, u, v^c), which does not affect satisfiability by Theorem 5.4.4.

Definition 11.2.7. Let BSν^{D,+} be the BS^D calculus parameterized as follows:

• The E-term ordering is a lexicographic path ordering induced by a total precedence > over function, constant, and predicate symbols such that, for any function symbol f, constant symbol c, predicate symbol P, and propositional symbol qP,Q, we have f > c > P > qP,Q > T.

• If a closure contains a literal of the form ¬OC (z), then all such literals are selected; otherwise, all negative arole and crole literals are selected.

• Inference conclusions, whenever possible, are decomposed according to the following table, for an arbitrary term t:

D·ρ ∨ arole([R], [t], [f(t)]) is decomposed into D·ρ ∨ QR,f([t]) and ¬QR,f(x) ∨ arole([R], x, [f(x)])

D·ρ ∨ arole([R], [f(x)], x) is decomposed into D·ρ ∨ QInv(R),f(x) and ¬QInv(R),f(x) ∨ arole([R], [f(x)], x)

OP(⟨Q⟩) ∨ C·ρ is decomposed into C·ρ ∨ qP,Q and ¬qP,Q ∨ OP(Q)

We define ν-ALCHIQ(D)-closures to be of the same form as ALCHIQ(D)-closures from Tables 5.1 and 6.1 with the following differences:

• Conditions (iii)–(vi) are dropped;

• Atoms of the form C(t) take the form isa([C] , t);

• Atoms of the form R(u, v) take the form arole([R] , u, v);

• Atoms of the form T(u, v^c) take the form crole([T], u, v^c);

• All closures can contain a disjunction of the form q = ⋁ (¬)qP,Q.

Note that closures in Ξν(KB) are technically not ν-ALCHIQ(D)-closures: instead of literals of the form (¬)isa([C] , x), they contain (¬)isa(z, x) ∨ ¬OC (z). We call such closures initial closures.

Lemma 11.2.8. Let Ξν(KB) = N0, ..., Ni ∪ {C} be a BSν^{D,+}-derivation, where C is the conclusion derived from premises in Ni. Then, C is either a ν-ALCHIQ(D)-closure, or it is redundant in Ni.

Proof. Due to decomposition, it is obvious that the following property (*) holds: positive literals of the form OP(Q) occur only in unit closures, or in closures of the form OP(Q) ∨ ¬qP,Q. Now consider an inference with an initial closure C. Since C contains literals of the form ¬OP(z), and all such literals are selected, C can participate only in a hyperresolution inference on all ¬OP(zi). Because of (*), the inference instantiates all variables zi, and the conclusion is a ν-ALCHIQ(D)-closure. Due to the E-term ordering of BSν^{D,+}, only a literal with a term of the maximum depth can be maximal in a ν-ALCHIQ(D)-closure. Also, the E-term ordering ensures that a propositional letter qP,Q is not maximal if a closure contains some other literal. Finally, appropriate inferences are eligible for decomposition in the same way as in Theorem 5.4.8. Hence, the claim of this lemma can be shown in the same way as in the proofs of Lemma 5.3.6, Theorem 5.4.8, and Lemma 6.2.1.

Theorem 11.2.9. Let KB be an ALCHIQ(D) knowledge base, defined over an admissible concrete domain D, for which D-satisfiability of finite conjunctions over ΦD can be decided in deterministic exponential time. Then, saturation of Ξν(KB) by BSν^{D,+} with eager application of redundancy elimination rules decides ν-satisfiability of KB, and runs in time exponential in |KB|, for unary coding of numbers and assuming a bound on the arity of concrete predicates.

Proof. The number of propositional letters qP,Q is quadratic in |KB|, and the number of predicates OP is linear in |KB|. Therefore, it is possible to obtain an exponential bound on the number of closures derived in the same way as in Lemma 5.3.9, Theorem 5.4.8, and Theorem 6.2.5. Since decomposition is sound and complete by Theorem 5.4.4, the claim of this theorem follows.

Note that the ν-semantics reifies concepts and roles; however, it is more like the π-semantics, and less like the OWL-Full semantics, in the way it handles modeling primitives. In the ν- and π-semantics, the modeling primitives are expressed as formulae, so they cannot be tampered with. On the contrary, OWL-Full reifies the modeling primitives as well, so their semantics can be affected by statements in the knowledge base.

11.2.3 Metamodeling and Transitivity Since the algorithm for checking ν-satisfiability of an ALCHIQ(D) knowledge base KB does not differ essentially from the algorithm for checking π-satisfiability of KB, one might intuitively expect that allowing transitivity axioms in KB does not cause problems. However, consider the following knowledge base KB:

(11.16) ⊤ ⊑_n ≥ 3 R
(11.17) R ≈ S
(11.18) Trans(S)

Note that KB is a SHIQ knowledge base: the role R is simple, since it passes the syntactic criterion specified in Definition 3.1.1 (it is neither transitive, nor does it have a transitive subrole). However, in any ν-interpretation I, the axiom (11.17) ensures that R^I = S^I = α for some domain individual α. Furthermore, due to (11.18), R^I_a(α) is transitive. Effectively, in (11.16) a transitive role R is used in a number restriction, even though it is syntactically a simple role. Since equality of roles might be nontrivially entailed by KB, identifying simple roles would itself require theorem proving. Therefore, it is difficult, if not impossible, to define simple roles under the ν-semantics. In [73], it was shown that transitive roles in number restrictions lead to undecidability, so we get the following result:

Proposition 11.2.10. Checking ν-satisfiability of a SHIQ knowledge base KB is undecidable.

Decidability can be regained by using the unique role assumption. Intuitively, this assumption requires two distinct role symbols to be interpreted as distinct domain individuals. In such a case, simple roles can be defined and checked as usual.

Definition 11.2.11 (Unique Role Assumption). A SHIQ(D) knowledge base KB employs the unique role assumption if it contains an axiom R ≉ S, for each two distinct symbols R and S occurring as roles in KB.

It is easy to see that transitivity axioms can be eliminated from knowledge bases employing the unique role assumption using the transformation from Section 5.2, so we thus obtain an algorithm for checking ν-satisfiability of such knowledge bases.

Theorem 11.2.12. Let KB be a SHIQ(D) knowledge base employing the unique role assumption, or containing neither explicit equality statements nor number restrictions. Then, KB is ν-satisfiable if and only if Ω(KB) is ν-satisfiable.

Proof. The (⇒) direction of the proof of Theorem 5.2.3 obviously holds in this case as well. For the (⇐) direction, by the assumptions on KB, if Ω(KB) is ν-satisfiable, then it has a ν-model such that R^{Iν} ≠ S^{Iν}, for each two distinct symbols R and S occurring as roles in KB. Hence, it is possible to construct a ν-model of KB in the same way as in the proof of the (⇐) direction of Theorem 5.2.3.

11.3 Expressivity of Metamodeling

In this section we investigate in which way metamodeling increases the expressivity of the logic. Here we consider only the logical aspects—that is, what kinds of new consequences can be drawn. Our discussion in this section does not consider nonlogical aspects that may be relevant in practice (such as increased search flexibility). The following results are similar to the ones for HiLog [33]. It is easy to see that ν-satisfiability is a strictly stronger notion than π-satisfiability. Consider the following knowledge base KB:2

(11.19) Eagle(Harry)
(11.20) ¬Aquila(Harry)
(11.21) Eagle ≈ Aquila

Under the contextual semantics, the interpretations of the symbols Eagle and Aquila as concepts and as individuals are completely independent, so KB is π-satisfiable. However, KB is ν-unsatisfiable: in each ν-interpretation, Eagle^I = Aquila^I = α, so it cannot be that Harry^I ∈ C^I_n(Eagle^I) and Harry^I ∉ C^I_n(Aquila^I). For the other direction, we have the following lemma:

Lemma 11.3.1. Each ν-satisfiable ALCHIQ(D) knowledge base is π-satisfiable.

Proof. Let Iν be a ν-model of an ALCHIQ(D) knowledge base KB. We construct from Iν a π-interpretation Iπ as follows: Δ^{Iπ} = Δ^{Iν}, n^{Iπ} = n^{Iν}, C^{Iπ}_n(n) = C^{Iν}_n(n^{Iν}), R^{Iπ}_a(n) = R^{Iν}_a(n^{Iν}), and R^{Iπ}_c(n) = R^{Iν}_c(n^{Iν}), for each n ∈ N. By a simple induction on the concept structure, it can be shown that, for each concept X, C^{Iπ}_n(X) = C^{Iν}_n(X), so Iπ is a π-model of KB.

Furthermore, for a knowledge base with the unique name assumption or without equality (either explicit or implicit, introduced through number restrictions), π-satisfiability and ν-satisfiability coincide:

Lemma 11.3.2. Let KB be an ALCHIQ(D) knowledge base such that it employs the unique name assumption, or it contains neither explicit equality statements nor number restrictions. Then, KB is π-satisfiable if and only if it is ν-satisfiable.

Proof. The (⇐) direction follows from Lemma 11.3.1. For the (⇒) direction, let KB be π-satisfiable in some model Iπ. Due to the lemma assumptions, we can assume without loss of generality that ni ≠ nj implies ni^{Iπ} ≠ nj^{Iπ}, for all ni, nj ∈ N (in the first case, this is because KB employs the unique name assumption, and in the second case this is because KB does not employ equality). Given such a π-model Iπ, we construct a ν-interpretation Iν by setting Δ^{Iν} = Δ^{Iπ}, n^{Iν} = n^{Iπ}, C^{Iν}_n(n^{Iν}) = C^{Iπ}_n(n), R^{Iν}_a(n^{Iν}) = R^{Iπ}_a(n), and R^{Iν}_c(n^{Iν}) = R^{Iπ}_c(n), where n ∈ N. Furthermore, for all x ∈ Δ^{Iν} such that there is no n ∈ N with x = n^{Iν}, let

2Aquila is the Latin name for eagle.

C^{Iν}_n(x) = R^{Iν}_a(x) = R^{Iν}_c(x) = ∅. Since we can assume that different names from N are interpreted as different elements from Δ^{Iν}, such a construction defines exactly one value of C^{Iν}_n(x), R^{Iν}_a(x), and R^{Iν}_c(x) for each x ∈ Δ^{Iν}, so Iν is correctly defined. By a simple induction on the concept structure, it can be shown that C^{Iν}_n(X) = C^{Iπ}_n(X) for each concept X, so Iν is a ν-model of KB.

To summarize, the ν-semantics derives new consequences only if two symbols can be derived to be equal: for example, from (11.19) and (11.21) it is possible to derive Aquila(Harry). Furthermore, under the unique name assumption (which is often used in practice), the ν-semantics does not produce any new consequences. This seems to suggest that the benefits of the ν-semantics do not outweigh its drawbacks—that it is nonstandard, and that it introduces problems for transitive roles. Moreover, the π-semantics might be sufficient for many practical applications.

However, the ν-semantics unlocks its full potential when combined with a language more expressive than OWL. For example, by combining the ν-semantics with the Semantic Web Rule Language (SWRL) [69], one can explicitly axiomatize the semantics of metaclasses. Returning to the example from the beginning of this chapter, by (11.22) we state that Eagle is a RedListSpecies, and by a SWRL rule (11.23) we state that instances of species listed in the Red List are not allowed to be hunted. Note that in the atom x(y) we use the variable x at the predicate position. Under the ν-semantics this is equivalent to isa(x, y), but under the π-semantics this would not be possible without leaving the confines of first-order logic. Now from (11.19), (11.22), and (11.23), we can infer CannotHunt(Harry); hence, RedListSpecies semantically acts as a metaconcept of the Eagle concept.

(11.22) RedListSpecies(Eagle)
(11.23) RedListSpecies(x) ∧ x(y) → CannotHunt(y)
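To spell out the inference just mentioned: under the translation from Definition 11.2.3, and reading the rule atom x(y) as isa(x, y), the relevant axioms become isa(Eagle, Harry) from (11.19), isa(RedListSpecies, Eagle) from (11.22), and isa(RedListSpecies, x) ∧ isa(x, y) → isa(CannotHunt, y) from (11.23). Instantiating the rule with x = Eagle and y = Harry yields isa(CannotHunt, Harry), that is, CannotHunt(Harry).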

To summarize, from the logical perspective ν-semantics alone does not make much of a difference, and π-semantics may be sufficient for numerous applications. However, ν-semantics provides a sound foundation for metamodeling, which, when combined with a more expressive logic, such as SWRL, allows us to precisely axiomatize the interaction between classes and metaclasses. It is easy to see that, if ν-semantics is combined with DL-safe rules from Chapter 9, it yields a decidable formalism. Thus, we believe ν-semantics to be very relevant for future extensions of OWL.

11.4 Related Work

The definition of ν-satisfiability given in Section 11.2 is inspired by HiLog [33], a logic in which general terms are allowed to occur in place of function and predicate symbols in formulae. The semantics is defined by interpreting each individual as a member of the interpretation domain, and by assigning a functional and a relational interpretation to domain objects. The authors show that HiLog can be considered "syntactic sugar," since each HiLog formula can be encoded into an equisatisfiable first-order formula; the definition of the ν operator in Table 11.3 closely resembles this encoding. Finally, the authors show that a satisfiable first-order formula without equality is also satisfiable under HiLog semantics.

In [102], the RDFS Model Theory was criticized for allowing infinitely many metalayers. The authors argue that such a semantics is inadequate for the Semantic Web since (i) it does not provide adequate support for inferencing; (ii) it allows defining classes that contain themselves, which may potentially lead to paradoxes; and (iii) by adding classes, one necessarily introduces objects in the interpretation universe. The authors propose RDFS-FA, a stratified four-level approach, consisting of the metalanguage layer, the language layer, the ontology layer, and the instance layer. In [68], similar arguments were used to criticize the semantics of OWL-Full. We follow the principles of RDFS-FA by strictly separating the modeling primitives from the ontology and the instance layers; however, to provide for metamodeling, our definition of the ν-semantics merges the ontology and the instance layers. Furthermore, we show that the consequences of (iii) actually match the intuition behind metamodeling.

In [149], the authors point out the usefulness of metamodeling in many application domains. They propose separation of modeling layers, which are connected using spanning instances. However, the authors do not consider the logical consequences of their approach. By means of an example very similar to ours, in [132] the author argues for the compelling need of metamodeling in practical knowledge representation systems.

Part IV

Practical Considerations


Chapter 12

Implementation of KAON2

Whereas the techniques presented in previous chapters provide new theoretical insights into the relationship between description logics and disjunctive datalog, our primary motivation for developing them was to obtain a practical alternative to tableau calculi, capable of handling knowledge bases with reasonably large ABoxes. To validate the practicability of our algorithms, we implemented them in a new DL reasoning system KAON2.1 In this chapter, we describe the implementation of the system, and discuss numerous practical aspects. Namely, the implemented algorithms are of high computational complexity, so it is unrealistic to expect that one can directly cast them into code and obtain an efficient implementation. To avoid unnecessary overhead, we developed several important optimizations. In the following section we present the architecture of the system, followed by a detailed discussion of important system components.

12.1 KAON2 Architecture

KAON2 fully implements the satisfiability checking algorithm from Chapter 5, the reduction algorithm from Chapter 7, and the DL-safe rules from Chapter 9. Hence, the supported logic is SHIQ extended with DL-safe rules. Datatypes are not supported yet because it is currently not clear how to combine concrete domain resolution with optimization techniques, such as magic sets. Furthermore, answering conjunctive queries and metamodeling are not supported because it is currently unclear how to cast the presented algorithms into the setting of disjunctive datalog.

KAON2 was implemented completely in Java 1.5. Figure 12.1 describes its technical architecture. The Ontology API provides ontology manipulation tasks, such as adding and retrieving ontology axioms. The API fully supports OWL and the Semantic Web Rule Language (SWRL) at the syntactic level. Several similar APIs already exist, such as OWL API [17] or Jena.2 However, to obtain an efficient system, we had to exert complete control over the internals of the API, which we could not have achieved by reusing an existing implementation.

1http://kaon2.semanticweb.org/ 2http://jena.sourceforge.net/


Figure 12.1: KAON2 Architecture

Ontologies can be saved in files, using either OWL RDF3 or OWL XML4 syntax. Alternatively, ABox assertions can be stored in a relational database (RDBMS): by mapping ontology entities to database tables, KAON2 will query the database on the fly during reasoning.

The Reasoning API allows invoking various reasoning tasks, and retrieving their results. KAON2 can be used as a dynamically loaded library, or as a stand-alone server. In the latter case, client applications can access KAON2 through either the Remote Method Invocation (RMI) protocol or the DL Implementors Group (DIG) API [16]. Thus, the system can be used with the Protégé5 and OilEd6 ontology editors.

The Reasoning Engine is the central component of KAON2, consisting of three subcomponents. The Ontology Clausification subcomponent is responsible for translating a SHIQ knowledge base KB into a set of first-order clauses Ξ(KB). In Section 12.2 we present several optimizations of the basic clausification algorithm. The Theorem Prover for BS implements the basic superposition calculus [14], which is used in the reduction algorithm. The design of this component is discussed in more detail in Section 12.3. Finally, the Disjunctive Datalog Engine is used for answering queries in the program obtained by the reduction. The design of this component is discussed in more detail in Section 12.4.

12.2 Ontology Clausification

The saturation of TBox and ABox clauses by basic superposition is a critical step of the reduction algorithm, because it can introduce an exponential blowup. Each clause can potentially participate in an inference that generates many conclusions, so it is important to keep the size of the input clause set minimal. To achieve this, we present several optimizations of the clausification algorithm from Definition 5.3.1; they are also applicable to the clausification algorithm from Definition 8.2.2.

3http://www.w3.org/TR/owl-semantics/ 4http://www.w3.org/TR/owl-xmlsyntax/ 5http://protege.stanford.edu/ 6http://oiled.man.ac.uk/

12.2.1 Reusing Replacement Predicates A concept often occurs in many other concepts in KB, which allows us to reuse the replacement concepts:

Definition 12.2.1. For Q^D and Q_D new concepts unique for D, the optimized clausification is obtained by modifying Definition 5.3.1 to choose the concept Q as follows:

Q = Q^D if pol(C, p) = 1 and C|p = D
Q = Q_D if pol(C, p) = −1 and C|p = D

For example, consider the knowledge base KB, consisting of the axioms (12.1)–(12.3) shown on the left-hand side:

(12.1) A ⊑ ∃S.∃R.D    A ⊑ ∃S.Q^{∃R.D}, Q^{∃R.D} ⊑ ∃R.D
(12.2) B ⊑ ∃T.∃R.D    B ⊑ ∃T.Q^{∃R.D}
(12.3) ∃U.∃R.D ⊑ C    ∃U.Q_{∃R.D} ⊑ C, ∃R.D ⊑ Q_{∃R.D}

The concept ∃R.D occurs in KB twice under positive, and once under negative polarity. Hence, we replace ∃R.D in (12.1) and (12.2) with Q^{∃R.D}, and in (12.3) with Q_{∃R.D}, yielding the axioms shown in (12.1)–(12.3) on the right-hand side. Note that Q^{∃R.D} should not be used to replace ∃R.D in (12.3): the negation-normal form of (12.3) is ∀U.∀R.¬D ⊔ C, and it does not contain ∃R.D. Hence, the subconcept ∀R.¬D must be replaced by a concept different from Q^{∃R.D}.

Lemma 12.2.2. KB and Ξ(KB) are equisatisfiable, even if the optimized structural transformation is applied to an ALCHIQ(D) knowledge base KB.

Proof. The (⇐) direction is trivial, and the (⇒) direction can be proved by choosing the interpretations of Q^D and Q_D as done for decomposition in Subsection 5.4.1.
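The reuse of replacement concepts amounts to caching definitional names by subconcept and polarity. The following is a minimal sketch in Java; the class, its method, and the textual encoding of concepts are hypothetical and only illustrate the bookkeeping, not the actual KAON2 code.

import java.util.HashMap;
import java.util.Map;

public final class ReplacementConcepts {
    private final Map<String, String> names = new HashMap<>();

    // 'concept' is a textual rendering of the renamed subconcept, e.g. "∃R.D";
    // 'polarity' is +1 or -1, the polarity of the renamed position.
    public String replacementFor(String concept, int polarity) {
        String key = polarity + "|" + concept;
        String name = names.get(key);
        if (name == null) {
            // Positive occurrences share one name, negative occurrences another.
            name = (polarity > 0 ? "Qpos_" : "Qneg_") + concept;
            names.put(key, name);
        }
        return name;
    }

    public static void main(String[] args) {
        ReplacementConcepts cache = new ReplacementConcepts();
        String q1 = cache.replacementFor("∃R.D", 1);   // renaming in (12.1)
        String q2 = cache.replacementFor("∃R.D", 1);   // reused for (12.2)
        String q3 = cache.replacementFor("∃R.D", -1);  // distinct name for (12.3)
        System.out.println(q1.equals(q2));  // true
        System.out.println(q1.equals(q3));  // false
    }
}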

12.2.2 Optional Positions Certain nonliteral concepts need not be replaced with an atomic concept in the clausification process. Consider clausification of an axiom A ⊑ ∃R.B: by Definition 5.3.1, Cls(¬A ⊔ ∃R.B) = {¬A ⊔ Q1, ¬Q1 ⊔ ∃R.B}; that is, the concept ∃R.B is replaced by Q1. However, clausifying ¬A ⊔ ∃R.B produces clauses of the form from Table 5.2 even without replacing the nonliteral subconcept ∃R.B. This allows for the following optimization:

Definition 12.2.3. In a concept C = D ⊔ ⨆ Li, where D is a possibly complex concept and all Li are literal concepts, the positions of D and the Li are optional for renaming.

Lemma 12.2.4. Each clause in Ξ(KB) is of a type from Table 5.2, even if Def(C) is computed by omitting in Λ(C) the positions that are optional for renaming.

Proof. By the definition of π from Table 3.1, clausifying a concept C = D ⊔ ⨆ Li produces clauses from Table 5.2.

Renaming optional positions can sometimes be beneficial. Consider the knowledge base KB, consisting of the axioms shown in (12.4)–(12.6) on the left-hand side:

(12.4) A ⊑ ∃R.C    ¬A(x) ∨ R(x, f(x)), ¬A(x) ∨ C(f(x))
(12.5) B ⊑ ∃R.C    ¬B(x) ∨ R(x, g(x)), ¬B(x) ∨ C(g(x))
(12.6) ⊤ ⊑ ≤ 1 R    ¬R(x, y1) ∨ ¬R(x, y2) ∨ y1 ≈ y2

In ¬A ⊔ ∃R.C and ¬B ⊔ ∃R.C, the positions of ∃R.C are optional for renaming, so KB can be clausified without introducing new predicates. This yields the clauses shown in (12.4)–(12.6) on the right-hand side. The concept ∃R.C is now skolemized twice, yielding the clauses ¬A(x) ∨ R(x, f(x)) and ¬B(x) ∨ R(x, g(x)) as candidates for an inference with (12.6). Additional axioms of the form Ai ⊑ ∃R.C would produce additional clauses of the form ¬Ai(x) ∨ R(x, fi(x)), which could participate in an inference with (12.6), thus significantly increasing the search space. The search space can be reduced by renaming ∃R.C, although it occurs at optional positions in the axioms. The axioms and the clauses obtained by applying the structural transformation to KB are shown in (12.7)–(12.10):

(12.7) A ⊑ Q    ¬A(x) ∨ Q(x)
(12.8) B ⊑ Q    ¬B(x) ∨ Q(x)
(12.9) Q ⊑ ∃R.C    ¬Q(x) ∨ R(x, f(x)), ¬Q(x) ∨ C(f(x))
(12.10) ⊤ ⊑ ≤ 1 R    ¬R(x, y1) ∨ ¬R(x, y2) ∨ y1 ≈ y2

This set contains one clause ¬Q(x) ∨ R(x, f(x)) that can participate in an inference with (12.10); also, any additional axiom Ai ⊑ ∃R.C produces only a clause ¬Ai(x) ∨ Q(x). Note that renaming ∃R.C pays off because the concept ∃R.C occurs in KB twice. Hence, in Definition 5.3.1, Λ(C) should be the set of all positions p ≠ ε in C such that (i) C|p is not a literal concept and, for all positions q below p, C|q is a literal concept; and (ii) p is either not optional for renaming, or C|p occurs in KB more than once.

12.2.3 Handling Functional Roles Consider the knowledge base KB, containing the axioms shown in (12.11)–(12.13) on the left-hand side:

(12.11) ⊤ ⊑ ≤ 1 R    ¬R(x, y1) ∨ ¬R(x, y2) ∨ y1 ≈ y2
(12.12) A ⊑ ∃R.C    ¬A(x) ∨ R(x, f(x)), ¬A(x) ∨ C(f(x))
(12.13) B ⊑ ∃R.D    ¬B(x) ∨ R(x, f(x)), ¬B(x) ∨ D(f(x))

The role R is functional by (12.11), which means that an object in a model can have at most one R-successor. This allows using the same function symbol f in skolemizing ∃R.C and ∃R.D, yielding the clauses (12.11)–(12.13) shown on the right-hand side. The main benefit of such clausification is that resolving ¬A(x) ∨ R(x, f(x)) and ¬B(x) ∨ R(x, f(x)) with (12.11) produces a clause with a literal f(x) ≈ f(x); such a clause is a tautology (and can be deleted). Without reusing function symbols, clausifying (12.13) would produce ¬B(x) ∨ R(x, g(x)), which, resolved with ¬A(x) ∨ R(x, f(x)) and (12.11), produces a clause containing a literal f(x) ≈ g(x). Such a clause is not a tautology, and should be used in further inferences. This optimization is also very important for the reduction to disjunctive datalog. Namely, for each function symbol f occurring in Ξ(KB), the program DD(KB) contains a new predicate Sf. By reducing the number of function symbols, we reduce the number of datalog predicates.

Definition 12.2.5. Clausification with optimized skolemization is obtained from Definition 5.3.1 such that, if ⊤ ⊑ ≤ 1 R ∈ KB, then the existential quantifiers in ∃R.C and ≥ n R.C are skolemized using a function symbol fR unique for R.

Lemma 12.2.6. KB and Ξ(KB) are equisatisfiable, even if clausification with optimized skolemization is used.

Proof. If R is functional, then an object x can have at most one R-successor in a model I, so this successor can safely be represented by the single functional term fR(x).
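The corresponding bookkeeping is small: a sketch in Java (hypothetical names, not the actual KAON2 code) that hands out Skolem function symbols during clausification could look as follows, assuming the set of roles R with ⊤ ⊑ ≤ 1 R ∈ KB has already been computed.

import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public final class SkolemAllocator {
    private final Set<String> functionalRoles;                 // roles R with ⊤ ⊑ ≤ 1 R in KB
    private final Map<String, String> perRoleSymbol = new HashMap<>();
    private int counter = 0;

    public SkolemAllocator(Set<String> functionalRoles) {
        this.functionalRoles = functionalRoles;
    }

    // Returns the Skolem function symbol used to skolemize ∃R.C or ≥ n R.C.
    public String skolemFor(String role) {
        if (functionalRoles.contains(role)) {
            // One symbol per functional role, shared by all restrictions on it.
            return perRoleSymbol.computeIfAbsent(role, r -> "f_" + r);
        }
        return "f" + (++counter);   // fresh symbol for each other occurrence
    }
}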

12.2.4 Discussion The techniques presented in this chapter are all geared towards reducing the number of clauses in Ξ(KB). It is instructive to compare the effects that such optimizations would have in a tableau calculus such as [73]. Tableau calculi try to construct a model of a knowledge base by applying tableau expansion rules if some tableau constraint is not satisfied. Consider, for example, a knowledge base KB consisting of the following axioms: 208 12. Implementation of KAON2

(12.14) C(a)
(12.15) R(a, b)
(12.16) D(b)
(12.17) C ⊑ ∃R.D

The axiom (12.17), when applied to (12.14), yields a constraint ∃R.D(a), which does not need to be further expanded: a already has an R-successor b that is a member of D, so there is no need to introduce a new R-successor for a. The situation is quite different in resolution calculi. The clausification algorithm translates (12.17) into these two clauses:

(12.18) ¬C(x) ∨ R(x, f(x))
(12.19) ¬C(x) ∨ D(f(x))

In resolution, skolemization plays a role that is analogous to expanding existential concepts in tableau calculi: in clauses (12.18) and (12.19), the term f(x) plays the role of the existentially implied R-successor for x. Thus, skolemization expands all existential quantifiers before the theorem proving process starts, regardless of whether it is actually necessary to do so in order to build a model. Another important difference between resolution and tableau is that clauses obtained by skolemization are nonground; that is, they encode successor information for many individuals. For example, (12.18) and (12.19) encode successor information for any individual x that is a member of C. When resolution inferences are applied to such nonground clauses, they act on the set of all those successors. In contrast, a tableau calculus expands each successor separately, and repeats an inference step for each of them. The optimizations from Subsections 12.2.1, 12.2.2, and 12.2.3 can be understood as an attempt to reduce the number of existential individuals introduced in advance.

12.3 The Theorem Prover for BS

As discussed in [116], implementing an efficient theorem prover is a very complex task. Apart from excellent software engineering skills, it also requires knowledge of numerous techniques in order to implement various modules of the system. However, our goal is not to provide a general first-order theorem prover applicable to any first-order problem. Rather, we need a theorem prover that can handle only ALCHIQ-closures. As shown in Table 5.2, such closures are of a rather limited form. Hence, general implementation techniques can be radically simplified, thus yielding a simpler, leaner, and more efficient implementation. 12.3 The Theorem Prover for BS 209

12.3.1 Inference Loop The main design element of a resolution-based theorem prover is the inference loop, which is responsible for saturating a set of closures under the given set of inference and redundancy elimination rules. In KAON2, we use a simplified variant of the DISCOUNT loop, introduced first in the equational theorem prover E [133]. The pseudo-code of the inference loop is presented in Algorithm 12.1.

The inference loop is parameterized with a set of inference rules C (explained in more detail in Subsection 12.3.3) and a set of redundancy elimination rules D (explained in more detail in Subsection 12.3.4), and is given a set of closures N to be saturated. To ensure that the saturation is fair—that is, that each nonredundant inference is eventually performed—the closures are partitioned into two sets. The set U is the set of unprocessed closures—that is, closures that have not taken part in any inference. The set W is the set of worked-off closures—that is, closures for which all inferences with other members from W have been performed. In the beginning, U contains the set of closures N to be saturated and W is empty.

In each step of the saturation, a given closure g is don't-care nondeterministically chosen and removed from U. We describe the algorithm for choosing the given closure, encapsulated in the function PickGivenClosure, in more detail in Subsection 12.3.7. Next, redundancy elimination rules are applied sequentially to g, transforming it into a potentially simpler closure. We use the convention that, if a closure is made redundant by a simplification rule, then SimplifyClosure returns null. We discuss the redundancy elimination rules of KAON2 in Subsection 12.3.4. If the simplified closure is not redundant and it is not equal to the empty closure, then backward subsumption is performed: using RetrieveSubsumedClosures, all closures from W that are properly subsumed by g are retrieved and removed from W. We describe the index structures used to optimize the retrieval of subsumed closures in Subsection 12.3.8. Finally, all inferences between g and the closures from W are performed. All conclusions that are not syntactic tautologies are added to the set of unprocessed closures U, after which the entire process is repeated. If the given closure is the empty closure, then N is unsatisfiable. Otherwise, the process terminates if U becomes empty; at this point, W is the saturated set.

12.3.2 Representing ALCHIQ-Closures ALCHIQ-closures of types 3, 4, 5, 6, and 8 have at most one variable, and any LPO is total on such terms and literals. This allows us to represent closures as totally ordered arrays without repeated literals. This has two main benefits. First, the maximal literal is the first literal, so the literal on which an inference should take place can be located in constant time. Second, the conclusions of inferences can be computed using merge sort in time that is linear in the size of the premises. Moreover, the restricted form of ALCHIQ-closures allows us to use a polynomial subsumption checking algorithm.
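A minimal sketch in Java of the merge step just mentioned is shown below; the class and the string representation of literals are hypothetical, and a real implementation would compare literals with respect to the lexicographic path ordering rather than lexicographically.

import java.util.ArrayList;
import java.util.List;

public final class LiteralMerge {
    // Assembles the literal array of a conclusion from two ordered premises by a
    // single merge pass, dropping repeated literals, in time linear in |c| + |d|.
    static List<String> merge(List<String> c, List<String> d) {
        List<String> result = new ArrayList<>(c.size() + d.size());
        int i = 0, j = 0;
        while (i < c.size() || j < d.size()) {
            String next;
            if (j >= d.size() || (i < c.size() && c.get(i).compareTo(d.get(j)) <= 0)) {
                next = c.get(i++);
            } else {
                next = d.get(j++);
            }
            // Skip duplicates so that the conclusion contains no repeated literals.
            if (result.isEmpty() || !result.get(result.size() - 1).equals(next)) {
                result.add(next);
            }
        }
        return result;
    }
}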

Algorithm 12.1 The Inference Loop of KAON2
Input:
  C: the set of inference rules
  D: the set of redundancy elimination rules
  N: the set of closures to be saturated
Output: the set of closures W obtained by saturating N by C and D
Local variables:
  W: the set of worked-off closures
  U: the set of unprocessed closures

1: W ←− ∅
2: U ←− N
3: while U ≠ ∅ do
4:   g ←− PickGivenClosure(U)
5:   U ←− U \ {g}
6:   for each redundancy elimination rule r ∈ D do
7:     g ←− SimplifyClosure(g, r, W)
8:   end for
9:   if g ≠ null then
10:    if g = □ then
11:      return {□}    ▷ N is unsatisfiable
12:    end if
13:    S ←− RetrieveSubsumedClosures(g, W)
14:    W ←− (W \ S) ∪ {g}
15:    T ←− ∅
16:    for each inference rule r ∈ C do
17:      T ←− T ∪ ComputeConclusions(r, g, W)
18:    end for
19:    for each conclusion t ∈ T do
20:      if IsSyntacticTautology(t) = false then
21:        U ←− U ∪ {t}
22:      end if
23:    end for
24:  end if
25: end while
26: return W    ▷ W is the saturated set

The general subsumption problem is known to be NP-complete [52], and experience has shown that theorem provers spend up to 90 percent of their running time in subsumption checking. Therefore, improving the efficiency of subsumption is expected to have a significant effect on the performance. The subsumption checking procedure for closures of types 3, 4, 5, 6, and 8 is shown in Algorithm 12.2. To determine if C subsumes D, the algorithm essentially scans the literals of D (lines 1–14), trying to match them with the first literal of C (line 2). If a substitution σ exists such that C1σ = Di, then, because C1 contains all variables of C, the substitution σ binds all variables in C. Hence, in lines 5–8 the algorithm simply checks whether, for each literal Ci, one can find a literal Dj such that Ciσ = Dj and each marked position of Ciσ can be overlaid on a substitution position in Dj. Since the closures are totally ordered, the literals can be matched sequentially. To estimate the complexity, note that the outer loop (lines 1–14) is executed at most |D| times, whereas the inner loop (lines 5–8) is executed at most |C| + |D| times. Hence, the algorithm runs in time O(|D| · (|C| + |D|)), which is obviously polynomial. Additionally, we store with each closure a flag specifying its type, as this enables further optimizations. For example, a closure of type 3 or 4 cannot subsume a closure of any other type. Similarly, closures of types 1, 2, or 7 can subsume only closures of the same type. Hence, a quick check whether the closure types are compatible can be used to quickly identify subsumption tests that are bound to fail.

12.3.3 Inference Rules The theorem prover of KAON2 encodes the literals with a predicate other than equality as E-terms (see Section 2.5), allowing a more or less straightforward implementation of the inference rules of the BS calculus. The implementation can be somewhat simplified by observing that closures of types 1, 2, and 7 can only participate in hyperresolution inferences, in which unification can be performed in constant time. Hence, the theorem prover handles closures of types 1, 2, and 7 by a separate hyperresolution inference rule, whereas inferences among other closures are handled using positive and negative superposition. Furthermore, negative superposition inferences can result in closures containing literals of the form T ≉ T and x ≉ x. For such closures, reflexivity resolution produces a closure that always subsumes the superposition conclusion, and is therefore applied eagerly to each conclusion.

12.3.4 Redundancy Elimination Rules The simplification rules are applied to the given closure in the order explained in the following paragraphs. It may come as a surprise that KAON2 does not implement so-called rewriting simplification inferences. Namely, given a unit closure s ≈ t and a closure A ∨ C, under certain conditions, A ∨ C can be replaced with A[tσ]p ∨ C, where σ is a substitution such that A|p = sσ. In general first-order theorem provers, such as E [133] or Vampire [119], such inferences are essential for solving hard problems. However, most DL problems do not contain unit equalities at all.

Algorithm 12.2 Closure Subsumption Algorithm in KAON2
Input: C: the potential subsumer; D: the potentially subsumed closure
Output: a substitution σ such that Cσ ⊆ D, or null if no such substitution exists
Note: Ci is the i-th literal of a closure C, and |C| is the number of literals in C.

1: for i = 1 ... |D| do
2:    σ ← Match(C1, Di)
3:    if σ ≠ null then
4:       j ← 0
5:       for k = i ... |D| and j < |C| do
6:          if Cj+1σ = Dk and MarkersCompatible(Cj+1σ, Dk) then
7:             j ← j + 1
8:          end if
9:       end for
10:      if j = |C| then
11:         return σ
12:      end if
13:   end if
14: end for
15: return null   ▷ no subsumption

16: function Match(s, t)
17:    if s is a variable x then
18:       return {x ↦ t}
19:    else if s = f(s1, ..., sn) and t = f(t1, ..., tn) then
20:       σ ← {}
21:       for i = 1 ... n do
22:          θ ← Match(siσ, ti)
23:          if θ = null then
24:             return null
25:          else
26:             σ ← σθ
27:          end if
28:       end for
29:       return σ
30:    else
31:       return null
32:    end if
33: end function

34: function MarkersCompatible(s, t)
35:    true if t|p is at a substitution position whenever s|p is marked, for each position p
36: end function

in ALCHIQ-closures of type 5, 6, or 8, where they are usually accompanied by at least one literal with a predicate other than ≈. Hence, rewriting is not applicable to the vast majority of DL problems, and we therefore do not include it in our implementation.

Semantic Tautology Deletion. As mentioned in Section 2.4, checking whether a closure is a tautology requires theorem proving, so simple approximate syntactic checks are used in practice. However, a feasible semantic check was presented in [133]: a closure is a tautology if it contains a disjunction of the form s1 ≉ t1 ∨ ... ∨ sn ≉ tn ∨ s ≈ t such that s1σ ≈ t1σ, ..., snσ ≈ tnσ |= sσ ≈ tσ, where σ is a substitution mapping all variables in the closure to distinct new constants. Entailment in a ground equality theory is decidable, so this check is effective; implementing it in general, however, may introduce a significant amount of overhead. Nevertheless, the syntactic structure of ALCHIQ-closures makes the check easy to implement in certain situations. Namely, it is easy to see that the conditions of semantic tautology deletion are satisfied for any closure of type 5 or 6 that contains a disjunction of the following form:

f1(t) ≉ f2(t) ∨ f2(t) ≉ f3(t) ∨ ... ∨ fn−1(t) ≉ fn(t) ∨ f1(t) ≈ fn(t)

Namely, for any substitution σ, it is easy to see that the following condition holds:

f1(tσ) ≈ f2(tσ) ∧ f2(tσ) ≈ f3(tσ) ∧ ... ∧ fn−1(tσ) ≈ fn(tσ) |= f1(tσ) ≈ fn(tσ)

Hence, semantic tautology deletion can be applied to a closure C of type 5 or 6 as follows. We first construct an unoriented graph GC whose nodes are the function symbols from C, and where nodes fi and fj are connected if C contains a literal fi(t) ≉ fj(t). Next, we compute the transitive closure GC+ of GC. Finally, we check if C contains a literal fi(t) ≈ fj(t) such that fi and fj are connected by an edge in GC+. This check can be performed in time that is polynomial in |C|, and, if it succeeds, C is deleted.
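For illustration, since GC is unoriented, connectivity coincides with its transitive closure, so the check can be realized with a simple union-find structure over the function symbols of C. The sketch below is an outline only (the pair-based literal representation is an assumption, not KAON2's internal data structure); it assumes the closure has already been recognized as being of type 5 or 6.

import java.util.*;

// Outline of the semantic tautology check: function symbols f_i, f_j that occur together
// in a literal f_i(t) ≉ f_j(t) are merged into one component; the closure is a tautology
// if some positive literal f_i(t) ≈ f_j(t) connects two symbols of the same component.
final class SemanticTautologyCheck {
    private final Map<String, String> parent = new HashMap<>();

    private String find(String f) {
        parent.putIfAbsent(f, f);
        String p = parent.get(f);
        if (!p.equals(f)) { p = find(p); parent.put(f, p); }  // path compression
        return p;
    }

    private void union(String f, String g) { parent.put(find(f), find(g)); }

    /** negEq: pairs (f_i, f_j) from literals f_i(t) ≉ f_j(t);
        posEq: pairs (f_i, f_j) from literals f_i(t) ≈ f_j(t). */
    boolean isTautology(List<String[]> negEq, List<String[]> posEq) {
        for (String[] e : negEq) union(e[0], e[1]);
        for (String[] e : posEq)
            if (find(e[0]).equals(find(e[1]))) return true;   // the closure can be deleted
        return false;
    }
}

The check runs in time nearly linear in the number of equality literals of C, comfortably within the polynomial bound stated above.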

Forward Subsumption. A closure C is deleted if the set of worked-off closures W contains a closure D that subsumes C. We discuss the index structures used to optimize subsumption checking in more detail in Subsection 12.3.8.

Contextual Literal Cutting. This subsumption-based simplification rule [134] allows certain literals to be eliminated from a closure. Namely, a closure C ∨ L can be replaced with C, provided that C ∨ ¬L is subsumed by the set of worked-off closures W. Hence, contextual literal cutting can be implemented by complementing each literal in turn, and then checking whether the resulting closure is subsumed in W. The correctness of the rule can intuitively be explained as follows: let D be a closure from W that subsumes C ∨ ¬L through a substitution σ—that is, Dσ ⊆ C ∨ ¬L. Furthermore, because D is in W, we have W |= Dσ. If ¬L does not occur in Dσ, then Dσ ⊆ C, so C follows from W directly. Otherwise, we can resolve C ∨ L with Dσ on L; because Dσ \ {¬L} ⊆ C, we obtain the closure C, which obviously subsumes C ∨ L. Hence, contextual literal cutting can be understood as a macro combining the effects of the mentioned inferences.

12.3.5 Optimizing Number Restrictions

It is not difficult to see that resolution-based techniques are not well suited for reasoning with number restrictions. Namely, a concept of the form ≤ n R.C is translated into a closure of type 7 containing n(n+1)/2 literals of the form yi ≈ yj. If such a closure is hyperresolved with m R-generators of type 3, each hyperresolution produces closures containing many literals of the form [fi(x)] ≈ [fj(x)]. Furthermore, such a literal is produced by each inference with a pair of side premises containing fi and fj, thus resulting in many similar closures. Because each literal fi(x) ≈ fj(x) is a potential source of a superposition inference, number restrictions dramatically increase the number of superposition inferences in a saturation.
The number of superposition inferences can be somewhat reduced by applying decomposition and replacing literals of the form [fi(t)] ≈ [fj(t)] with Qfi,fj([t]). Consider the closures (12.20)–(12.21), obtained by resolving Ci(x) ∨ R(x, fi(x)), 1 ≤ i ≤ 4, with ¬R(x, y1) ∨ ¬R(x, y2) ∨ ¬R(x, y3) ∨ y1 ≈ y2 ∨ y2 ≈ y3 ∨ y1 ≈ y3:

C ∨ [f1(x)] ≈ [f2(x)] ∨ [f1(x)] ≈ [f3(x)] ∨ [f2(x)] ≈ [f3(x)]   (12.20)

D ∨ [f1(x)] ≈ [f2(x)] ∨ [f1(x)] ≈ [f4(x)] ∨ [f2(x)] ≈ [f4(x)]   (12.21)

Using decomposition, (12.20)–(12.21) can be transformed into (12.22)–(12.28):

C ∨ Qf1,f2(x) ∨ Qf1,f3(x) ∨ Qf2,f3(x)   (12.22)

D ∨ Qf1,f2(x) ∨ Qf1,f4(x) ∨ Qf2,f4(x)   (12.23)

¬Qf1,f2(x) ∨ [f1(x)] ≈ [f2(x)]   (12.24)

¬Qf1,f3(x) ∨ [f1(x)] ≈ [f3(x)]   (12.25)

¬Qf1,f4(x) ∨ [f1(x)] ≈ [f4(x)]   (12.26)

¬Qf2,f3(x) ∨ [f2(x)] ≈ [f3(x)]   (12.27)

¬Qf2,f4(x) ∨ [f2(x)] ≈ [f4(x)]   (12.28)

Now there is exactly one closure containing a literal of the form [fi(x)] ≈ [fj(x)], so superposition from such a literal is performed only once, and not for each closure where [fi(x)] ≈ [fj(x)] occurs. A disadvantage of this transformation is that it does not allow us to apply semantic tautology deletion (see Subsection 12.3.4), since literals fi(x) ≈ fj(x) occur only in isolation. This can be remedied as follows: for each pair of function symbols fi and fj, we additionally introduce a closure Qfi,fj(x) ∨ [fi(x)] ≉ [fj(x)], which makes Qfi,fj(x) equivalent to [fi(x)] ≈ [fj(x)]. Now semantic tautology deletion can be applied as described previously, with the difference that, instead of [fi(x)] ≈ [fj(x)], the literals Qfi,fj([t]) are used in building the graph GC.

12.3.6 Tuning the Calculus Parameters

The main role of the parameters of BSDL is to guarantee termination of the calculus. However, Definition 5.3.3 provides some room for fine-tuning the parameters.

More concretely, we can choose the precedence of predicate symbols > based on the frequency of occurrences of symbols in a knowledge base. Practical experiments have shown that in many cases, it is beneficial to use a precedence based on decreasing frequency of symbols—that is, a precedence where the most-frequent predicate is the smallest in the precedence. Furthermore, we can tweak the selection function. It is known that selecting and hyperresolving multiple negative literals in practice often leads to better performance. Intuitively, such inferences are efficient because they do not derive many intermediary consequences. In the case of BSDL, we can use a selection function that selects negative literals containing the deepest terms in a closure. Lemma 5.3.6 still holds in such a case, because Lemma 5.3.5 is still satisfied.
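As an illustration of the frequency-based precedence, one can simply count predicate occurrences in the clausified knowledge base and order the symbols so that more frequent predicates come out smaller. The sketch below uses hypothetical names and covers only the precedence heuristic, not the selection function:

import java.util.*;

// Builds a precedence in which the most frequent predicate is the smallest symbol:
// predicates are sorted by descending occurrence count, so position 0 is smallest.
final class FrequencyPrecedence {
    static List<String> fromCounts(Map<String, Integer> occurrences) {
        List<String> precedence = new ArrayList<>(occurrences.keySet());
        precedence.sort(Comparator.comparingInt((String p) -> occurrences.get(p)).reversed());
        // precedence.indexOf(p) < precedence.indexOf(q) is read as p < q in the ordering
        return precedence;
    }
}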

12.3.7 Choosing the Given Closure

General theorem provers are geared towards solving hard problems in first-order logic. However, first-order logic is semidecidable: given a finite amount of time, we can hope only to detect unsatisfiability. For satisfiable problems, no guarantee of finding a solution exists. Therefore, most existing theorem provers are geared towards detecting unsatisfiability. It is widely believed that many hard first-order problems have relatively short proofs. Hence, the performance of first-order reasoning is expected to depend a lot on choosing a good proof strategy, which is mainly determined by the algorithm for choosing the given closure (line 4 of Algorithm 12.1). Therefore, numerous different heuristics for choosing the given closure have been explored.
Unlike most general first-order provers, the theorem prover in KAON2 is applied mostly to satisfiable problems. Namely, ontologies are typically satisfiable (if this is not so, they usually contain an error). Furthermore, in computing a reduction to disjunctive datalog, the saturation of the TBox axioms is computed. Finally, our experience and that of other authors of DL reasoning systems shows that most problems encountered while computing a subsumption hierarchy are satisfiable. Hence, the strategy for choosing the given closure is less important in our case, because the theorem prover in most cases performs all inferences anyway. Therefore, we adopt a simple strategy of always choosing a closure from U that has the smallest number of literals, and of resolving ties in an arbitrary manner. Intuitively, a short closure is more likely to subsume a closure in W.
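This strategy amounts to keeping U in a priority queue ordered by the number of literals; PickGivenClosure (line 4 of Algorithm 12.1) then simply polls the queue. The following minimal sketch assumes a hypothetical Closure type whose literal count is exposed through the supplied comparator:

import java.util.Comparator;
import java.util.PriorityQueue;

// U as a priority queue: the given closure is always one with the fewest literals;
// ties are broken arbitrarily by the underlying queue.
final class GivenClosureQueue<C> {
    private final PriorityQueue<C> unprocessed;

    GivenClosureQueue(Comparator<C> byLiteralCount) {
        this.unprocessed = new PriorityQueue<>(byLiteralCount);
    }

    void add(C closure) { unprocessed.add(closure); }
    C pickGivenClosure() { return unprocessed.poll(); }   // smallest closure first
    boolean isEmpty() { return unprocessed.isEmpty(); }
}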

12.3.8 Indexing Terms and Closures

As discussed in [116], a critical aspect of an implementation of a first-order theorem prover is to provide appropriate clause indexes—auxiliary data structures used to efficiently identify closures used at particular steps in the saturation. Unification indexes are used to retrieve potential candidate closures that can participate in an inference, whereas subsumption indexes are used to retrieve closures that subsume or that are subsumed by a certain closure.

Many different indexing techniques have been developed; for an extensive overview, please refer to [79]. A prominent indexing technique is that of partially adaptive code trees [117], pioneered in the theorem prover Vampire [119]. Although this technique is known to give excellent performance, the restricted structure of ALCHIQ-closures allows us to employ simpler, but still efficient, indexing techniques.

Unification Indexes. The problem of unification indexing can be formalized as follows: given a query term s, a unification index identifies the subset M of a set of terms N such that each term t ∈ M unifies with s. In practice, one usually also associates with each term t a closure in which t occurs.
Consider now the types of BSDL inferences applicable to ALCHIQ-closures. The candidates for (hyper)resolution with a closure of type 1, 2, or 7 are closures of types 3, 4, or 8, and their suitability is determined uniquely by the role symbol. Hence, indexing can be performed simply by hashing the closures on the role symbol.
More complex are superposition inferences between closures of types 5 and 6, superposition into generator closures, and superposition into unary literals of closures of type 8. These closures can be indexed using a simplified variant of the well-known path indexing technique [140, 113]. Namely, inferences with these closures take place only on unary E-terms, which contain at most one variable. This allows for the following simple indexing strategy. Each unary E-term f1(f2(...)) can be represented as a string of function symbols f1.f2..., with variables encoded using a special symbol ∗. To build an index over a set of terms, we use the well-known trie data structure over these strings (for more information on tries, please refer to [137]). Given a term s, locating all candidate terms in the trie can be performed by traversing the trie using the string corresponding to s. The trie traversal algorithm is standard, with the minor difference that the symbol ∗ can be matched with zero or more arbitrary symbols.
Moreover, observe that, for closures as described in the previous paragraph, a superposition inference can be performed either at the outermost position p = ε (for example, on a literal A(f(x))), at the position p = 1 (for example, at the position of f(x) in a literal A(f(x))), or at the position p = 2 (for example, at the position of f(x) in a literal R(x, f(x))); at all other positions, the terms are always marked. Hence, the closures that are possible targets of superposition inferences are indexed using three indexes Iε, I1, and I2. Furthermore, superposition can be performed only from the outermost position of a positive equality, so the closures that are possible sources of superposition inferences are indexed using a single index F. A closure C is then indexed as follows. First, the literal L eligible for inference is determined. Then, if L is a positive literal of the form s ≈ t, the term s is added to F. Finally, all positions p ∈ {ε, 1, 2} are examined. If L|p is not at a substitution position, and either p ≠ ε or L is not of the form A ≈ T, then the term L|p is added to Ip. The case p = ε with L of the form A ≈ T is treated differently because performing superposition into such an A always results in a tautology T ≈ T; such inferences are prevented by simply not entering A into Iε.
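To make the flattened-term strategy concrete, the following sketch—an illustration only, whose class and method names are assumptions and need not match KAON2's data structures—stores keys such as f(g(x)), encoded as the symbol list ["f","g","*"], in a trie and retrieves all stored entries compatible with a query key, treating ∗ on either side as matching zero or more symbols:

import java.util.*;

// Simplified path index over unary E-terms: f(g(x)) is flattened to ["f","g","*"],
// a ground term f(g(a)) to ["f","g","a"].  Retrieval returns every stored entry whose
// key is compatible with the query; "*" in the query or in a stored key matches zero
// or more symbols on the other side.
final class PathIndex<T> {
    private static final class Node<T> {
        final Map<String, Node<T>> children = new HashMap<>();
        final List<T> entries = new ArrayList<>();
    }
    private final Node<T> root = new Node<>();

    void insert(List<String> key, T entry) {
        Node<T> n = root;
        for (String sym : key) n = n.children.computeIfAbsent(sym, k -> new Node<>());
        n.entries.add(entry);
    }

    List<T> retrieve(List<String> query) {
        Set<T> result = new LinkedHashSet<>();        // avoids duplicates from overlapping paths
        retrieve(root, query, 0, result);
        return new ArrayList<>(result);
    }

    private void retrieve(Node<T> n, List<String> q, int i, Set<T> out) {
        Node<T> star = n.children.get("*");
        if (star != null)                              // a stored "*" absorbs 0..k query symbols
            for (int j = i; j <= q.size(); j++) retrieve(star, q, j, out);
        if (i == q.size()) { out.addAll(n.entries); return; }
        if (q.get(i).equals("*")) {
            retrieve(n, q, i + 1, out);                // "*" in the query matches zero symbols...
            for (Map.Entry<String, Node<T>> e : n.children.entrySet())
                if (!e.getKey().equals("*"))
                    retrieve(e.getValue(), q, i, out); // ...or one more symbol, "*" stays active
        } else {
            Node<T> exact = n.children.get(q.get(i));
            if (exact != null) retrieve(exact, q, i + 1, out);
        }
    }
}

The candidates returned this way still need to be unified with the query term; the index only prunes terms that cannot possibly unify.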

Subsumption Indexes. Subsumption indexes are used for two related problems. Forward subsumption is the problem of determining whether a set of closures N contains a closure that subsumes some closure C, whereas backward subsumption is the problem of determining the subset M of a set of closures N such that C subsumes each closure in M.
The main problem in subsumption indexing is that, if Cσ ⊆ D for some substitution σ, the literals in D matched to the literals of Cσ need not be consecutive. For example, let C = A2(x) ∨ A5(x) and let D = A1(x) ∨ A2(x) ∨ A3(x) ∨ A4(x) ∨ A5(x) ∨ A6(x). It is clear that C subsumes D; however, the closure D contains the disjunction A3(x) ∨ A4(x) between the literals A2(x) and A5(x). Because C can in general be matched to an arbitrary subset of D, and there are exponentially many such subsets, it is difficult to design an efficient indexing structure.
In KAON2, we apply an approximate technique, inspired by feature vector indexing from [134]. Our algorithm works as follows. To index a closure C, we build a string C1.C2...Cn of the predicate names occurring in C, order it by the LPO precedence >, and insert it into a trie. For forward subsumption, given a closure D, we build a string ∗.D1.∗.D2.∗...∗.Dn.∗ of the predicate names occurring in D, order it by the LPO precedence >, and use it to search the trie. As a result, we obtain the set of closures whose predicates all occur in D; hence, the retrieved closures are potential subsumers of D. For each such closure, we check subsumption using Algorithm 12.2. For backward subsumption, given a closure D, we build a string D1.D2...Dn of the predicate names occurring in D and order it by the LPO precedence >. We then search the trie using this string, but we treat each trie entry as ∗.C1.∗.C2.∗...∗.Cn.∗. This again gives us a set of potential subsumption candidates, which is then filtered using Algorithm 12.2.
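The key strings for this index are straightforward to construct; the sketch below (illustrative names only, assuming the predicate names are compared by the LPO precedence) builds the insertion key for a closure and the wildcard-interleaved query key used for forward subsumption. The resulting keys can be stored in and retrieved from the same kind of trie as sketched for the unification index above.

import java.util.*;

// Builds the trie keys of the approximate subsumption index: a closure is inserted
// under its predicate names sorted by the precedence; a forward-subsumption query
// interleaves "*" so that the retrieved closures are exactly those whose predicates
// all occur among the predicates of D.
final class SubsumptionKeys {
    static List<String> insertionKey(Collection<String> predicatesOfC, Comparator<String> lpo) {
        List<String> key = new ArrayList<>(predicatesOfC);
        key.sort(lpo);
        return key;                                   // C1.C2. ... .Cn
    }

    static List<String> forwardQueryKey(Collection<String> predicatesOfD, Comparator<String> lpo) {
        List<String> sorted = new ArrayList<>(predicatesOfD);
        sorted.sort(lpo);
        List<String> key = new ArrayList<>();
        key.add("*");                                 // *.D1.*.D2. ... .*.Dn.*
        for (String p : sorted) { key.add(p); key.add("*"); }
        return key;
    }
}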

12.4 Disjunctive Datalog Engine

Several disjunctive datalog engines, such as DLV [41] or Smodels [141], have been implemented and successfully applied to practical problems. However, we decided to implement a new datalog engine for KAON2 for the following three reasons.
First, interfacing with existing engines is currently difficult. Available systems were implemented in different programming languages. Furthermore, the primary mode of interaction with them is through the command line. Efforts to enable access to existing systems by means of an established API are in progress, but such an API was not available at the time of writing. Hence, implementing our own system seemed preferable to having to struggle with low-level implementation issues.
Second, available disjunctive datalog engines are designed to provide practical support for nonmonotonic reasoning under stable models. In contrast, the disjunctive programs obtained by our algorithms are positive, so they do not require nonmonotonic reasoning. Hence, it seemed reasonable to expect that a more specialized implementation would provide better performance.

Third, nonground query answering in existing systems is typically solved by reducing the problem to ground query answering, and then checking the correctness of each answer using model building. We wanted to investigate whether nonground query answering can be better realized using direct methods, without grounding.
In the rest of this section, we describe the important aspects of the implementation of our disjunctive datalog engine.

12.4.1 Magic Sets

After eliminating the equality predicate using the algorithm from Section 7.5, the magic sets transformation, presented in [34], is applied to P ∪ P≈. A detailed presentation of the transformation is beyond the scope of this subsection; we just explain the intuition behind the technique.
Magic sets were motivated by a significant advantage of top-down over bottom-up query evaluation techniques. Namely, in a top-down algorithm, the query guides the computation. Consider the following program P:

T(x, y) ← R(x, y)   (12.29)
T(x, z) ← R(x, y), T(y, z)   (12.30)

A bottom-up algorithm for answering a query T(a, x) in P first computes all facts that follow from P, and then selects the tuples from T containing a in the first position. This is inefficient, because the algorithm computes the entire transitive closure of R. In contrast, starting from the query, a top-down technique first retrieves only those tuples from R that contain a at the first position and, for each such tuple, recursively evaluates the query T(y, z). Hence, the bindings for y, determined by querying R(a, y), guide the query answering process; this is also called binding propagation. A top-down engine therefore does not compute the entire relation T, but only the part receiving bindings from the query and the data.
Magic sets [18] simulate top-down binding propagation in bottom-up query evaluation. A program P is modified by introducing magic predicates, whose role is to keep track of the bindings that would be propagated during a top-down evaluation of a query Q in P. The magic predicates are then used to constrain the rules of P so that evaluating them performs only those inferences that are relevant to the query. Note that the magic program depends on the query Q; that is, for different queries, the magic program must be computed from scratch.
To the best of our knowledge, the first generalization of the magic sets technique to disjunctive programs was presented in [55]. Apart from the theoretical results, the author provides empirical evidence for the benefits of the technique. The technique was further refined in [34]. Because of the simpler structure of the transformed programs, we decided to use the latter variant in KAON2.
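To illustrate the effect on the program above—the exact output of the variant from [34] may differ in details—the magic program for the query T(a, x), with the first argument of T bound and the second free, looks roughly as follows:

magicT(a)
T(x, y) ← magicT(x), R(x, y)
T(x, z) ← magicT(x), R(x, y), T(y, z)
magicT(y) ← magicT(x), R(x, y)

Bottom-up evaluation of this program derives magicT-facts only for constants reachable from a via R, so only the portion of T relevant to the query is computed.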

In the presence of equality, the magic sets transformation becomes less effective. Consider, for example, the following equality-free program P:

C(x) ← D(x, y), E(y, z)   (12.31)
R(x, y) ← S(x, y)   (12.32)

To answer the query C(x) in P, only the extensions of C, D, and E are relevant; the extensions of R and S need not be considered. Moreover, magic sets are very effective at identifying the relevant parts of D and E. Now let P′ be the program P extended with (12.33), stating that R is functional:

y1 ≈ y2 ← R(x, y1), R(x, y2)   (12.33)

Answering the query C(x) in P′ is now more involved because (12.33) can potentially make any two objects equal. If C(a) is an answer in P, then in P′ the rule (12.33) should be invoked to check whether any other object is equal to a. Thus, the extensions of R and S are now relevant for the query.

12.4.2 Bottom-Up Saturation

After applying the magic sets, the query Q is answered in P ∪ P≈ using the algorithm from Definition 7.6.1. To avoid redundant computation of facts derived by rules, we apply the general seminaïve strategy [112]. A detailed presentation of the algorithm is beyond the scope of this subsection; we just explain the intuition behind it.
The algorithm is an optimization of bottom-up saturation, developed to prevent recomputing the same conclusions. Consider a datalog program P containing only the following rule:

C(y) ← R(x, y), C(x)   (12.34)

Moreover, let F0 contain the facts C(a0) and R(ai−1, ai) for 0 < i ≤ n. A query C(x) can be answered in P and F0 using bottom-up saturation: we compute sets of facts F0, F1, ..., Fm, where each Fi is obtained by applying the rules of P to the facts in Fi−1. The saturation terminates when no new facts are derived—that is, when Fm = Fm−1. In our example, the saturation yields Fi = Fi−1 ∪ {C(ai)}. Note that Fi−1 ⊆ Fi; hence, the rules of P that are applicable to facts in Fk for k < i are also applicable to Fi. Thus, each iteration computes only one additional fact, but, to do so, it also recomputes all facts from Fi−1 \ F0. Recomputing facts can be avoided by evaluating, instead of P, a sequence of programs P0, P1, ..., Pm, where each Pi contains the following rule:

∆Ci(y) ← R(x, y), ∆Ci−1(x)

The iteration is initialized by asserting ∆C0(α) for each C(α) ∈ F0, and the result is obtained as C = ⋃ ∆Ci. By evaluating each Pi, only facts derived by Pi−1 are used, and no fact is computed twice.
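A minimal sketch of this delta-based evaluation for the single rule C(y) ← R(x, y), C(x) is given below; it uses plain Java collections and is only an outline of the idea, not the actual KAON2 engine, which additionally handles arbitrary rule bodies and disjunctive heads:

import java.util.*;

// Semi-naive evaluation of C(y) <- R(x,y), C(x): in each round the rule body is joined
// only against the facts derived in the previous round (the delta), so no fact is
// recomputed.
final class SemiNaiveTC {
    /** edges: R-facts as x -> set of y; seeds: the initial extension of C. */
    static Set<String> evaluate(Map<String, Set<String>> edges, Set<String> seeds) {
        Set<String> c = new HashSet<>(seeds);         // accumulated extension of C
        Set<String> delta = new HashSet<>(seeds);     // facts derived in the previous round
        while (!delta.isEmpty()) {
            Set<String> next = new HashSet<>();
            for (String x : delta)                    // join only against the delta
                for (String y : edges.getOrDefault(x, Collections.emptySet()))
                    if (c.add(y)) next.add(y);        // keep only genuinely new facts
            delta = next;
        }
        return c;
    }
}

For the example facts C(a0) and R(ai−1, ai), each round derives exactly one new fact C(ai), and the evaluation performs n joins overall instead of the quadratic amount of work incurred by the naive strategy.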

The seminaïve evaluation algorithm uses this basic idea, while solving several practical issues, such as minimizing the number of extensions ∆Ci used in the computation. Furthermore, it can be combined with our resolution approach for answering queries in a disjunctive program from Section 7.6.

Chapter 13

Performance Evaluation

In this chapter we present a comparison of the performance of KAON2 with that of existing tableau DL reasoning systems, and thus obtain an insight into the practical applicability of our algorithms.
While conducting the experiments, we observed that it is very difficult, if not impossible, to separate the reasoning method from its implementation. Reasoning algorithms in general are quite complex, and to implement them efficiently, a lot of time must be invested in low-level implementation details. Overheads in maintaining various data structures or in memory management can easily dominate the reasoning time. The choice of the implementation language also matters a lot, as its limitations may easily become critical in dealing with large data sets.
Therefore, the results we present in this chapter should not be taken as a definitive measure of the practicability of either algorithm. However, they do show that deductive database techniques can significantly improve the performance of reasoning. This seems to hold even if the TBox is complex, as long as it is relatively small. For TBox reasoning, our results reflect a mixed picture: on certain ontologies the resolution algorithms perform relatively well, whereas, on other ontologies, they are slower than the tableau algorithms. This suggests that resolution-based techniques might be interesting in practice, but that further research is needed to match the robustness of tableau algorithms on TBox reasoning problems.

13.1 Test Setting

We compared KAON2 with RACER and Pellet. We did not consider other DL reasoners because they follow different design choices: FaCT1 [67], FaCT++,2 and DLP3 do not support ABoxes; LOOM [91] is incomplete; and CLASSIC [22] supports only a very inexpressive logic.

1http://www.cs.man.ac.uk/∼horrocks/FaCT/ 2http://owl.man.ac.uk/factplusplus/ 3http://www.bell-labs.com/user/pfps/dlp/


RACER4 [59] was developed at Concordia University and the Hamburg University of Technology, and is written in Common Lisp. We used version 1.8.2, to which we connected using the JRacer library. RACER provides an optimized reasoning mode (so-called nRQL mode 1) that offers significant performance improvements, but which is complete only for certain types of knowledge bases. At the time of writing, RACER did not automatically recognize whether the optimized mode is applicable to a particular knowledge base, so we used RACER in the mode that guarantees completeness (so-called nRQL mode 3). Namely, determining whether the optimizations are applicable is a form of reasoning that, we believe, should be taken into account in a fair comparison.
Pellet5 [105] was developed at the University of Maryland, and it is the first system that fully supports OWL-DL, taking into account all the nuances of the specification. It is implemented in Java, and is freely available with the source code. We used version 1.3 beta.
We asked the authors of each tool for an appropriate sequence of API calls for running the tests. For each reasoning task, we started a fresh instance of the reasoner and loaded the test knowledge base. Then, we measured the time required to execute the task. We made sure that all systems return the same answers.
Many optimizations of tableau algorithms involve caching computation results, so the performance of query answering should increase with each subsequent query. Furthermore, both RACER and Pellet check ABox consistency before answering the first query, which typically takes much longer than computing the query results. Hence, starting a new instance of the reasoner for each query might seem unfair. We justify our approach as follows. First, the effectiveness of caching depends on the type of application: if an ABox changes frequently, caching is not very useful. Second, the usefulness of caches also depends on the degree of similarity between queries. Third, we did not yet consider caching for KAON2; however, materialized views were extensively studied in deductive databases, and were applied in [148] to ontology reasoning. Finally, KAON2 does not perform a separate ABox consistency test because ABox inconsistency is discovered automatically during query evaluation. Hence, we decided to measure only the performance of the actual reasoning algorithm, and to leave a study of possible materialization and caching strategies for future work.
Since the ABox consistency check is a significant source of overhead for tableau systems, we measured the time required to execute it separately. Hence, in our tables, we distinguish the one-time setup time (S) and the query processing time (Q) for Pellet and RACER. The time for computing the datalog program in KAON2 was not significant, so we include it in the query processing time.
All tests were performed on a laptop computer with a 2 GHz Intel processor and 1 GB of RAM, running Windows XP Service Pack 2. For Java-based tools, we used Sun's Java 1.5.0 Update 5. The virtual memory of the Java virtual machine was limited to

4http://www.racer-systems.com/ 5http://www.mindswap.org/2003/pellet/index.shtml

Table 13.1: Statistics of Test Ontologies

KB           C⊑D   C≡D   C⊓D⊑⊥   functional   domain   range   RBox     C(a)    R(a,b)
vicodi 0                                                                16942     36711
vicodi 1                                                                33884     73422
vicodi 2     193     0       0            0       10      10     10    50826    110133
vicodi 3                                                                67768    146844
vicodi 4                                                                84710    183555
semintec 0                                                              17941     47248
semintec 1                                                              35882     94496
semintec 2    55     0     113           16       16      16      6    53823    141744
semintec 3                                                              71764    188992
semintec 4                                                              89705    236240
lubm 1                                                                  18128     49336
lubm 2        36     6       0            0       25      18      9    40508    113463
lubm 3                                                                  58897    166682
lubm 4                                                                  83200    236514
wine 0                                                                    247       246
wine 1                                                                    741       738
wine 2                                                                   1235      1230
wine 3                                                                   1729      1722
wine 4                                                                   2223      2214
wine 5       126    61       1            6        6       9      9     2717      2706
wine 6                                                                   5187      5166
wine 7                                                                  10127     10086
wine 8                                                                  20007     19926
wine 9                                                                  39767     39606
wine 10                                                                 79287     78966
dolce        203    27      42            2      253     253    522        0         0
galen       3237   699       0          133        0       0    287        0         0

(TBox figures are listed once per ontology family; the replicated variants differ only in the ABox columns C(a) and R(a,b).)

800 MB, and each reasoning task was allowed to run for at most 5 minutes. Tests that ran either out of memory or out of time are denoted in tables with a value of 10000.

13.2 Test Ontologies

We based our tests on ontologies available in the Semantic Web community. To obtain sufficiently large test ontologies, we used ABox replication—duplication of ABox axioms with appropriate renaming of individuals. The information about the structure of the ontologies we used is summarized in Table 13.1. All test ontologies are available on the KAON2 Web site.6

6http://kaon2.semanticweb.org/download/test ontologies.zip

An ontology about European history was manually created in the EU-funded VICODI project.7 The TBox is relatively small and simple: it consists of role and concept inclusion axioms, and domain and range specifications; it does not contain disjunctions, existential quantification, or number restrictions. However, the ABox is relatively large, with many interconnected instances. With vicodi 0, we denote the ontology from the project, and with vicodi n the one obtained by replicating n times the ABox of vicodi 0.
An ontology about financial services was created in the SEMINTEC project8 at the University of Poznan. Like VICODI, this ontology is relatively simple: it does not use existential quantifiers or disjunctions; it does, however, contain functionality assertions and disjointness. With semintec 0, we denote the ontology from the project, and with semintec n the one obtained by replicating n times the ABox of semintec 0.
The Lehigh University Benchmark (LUBM)9 was developed by [58] as a benchmark for testing the performance of ontology management and reasoning systems. The ontology describes the organizational structure of universities, and is relatively simple: it does not use disjunctions or number restrictions, but it does use existential quantifiers, so it is in the Horn-ALCHI fragment. Each lubm n is generated automatically by specifying the number n of universities.
The Wine10 ontology contains a classification of wines. It uses nominals, which our algorithms cannot handle. Therefore, we apply a sound but incomplete approximation: we replace each enumerated concept {i1, ..., in} with a new concept O, and add assertions O(ik). The resulting ontology is relatively complex: it contains functionality axioms, disjunctions, and existential quantifiers. With wine 0, we denote the original ontology, and with wine n the one obtained by replicating 2n times the ABox of wine 0.
This approximation of nominals is incomplete for query answering: for completeness, one should further add a clause ¬O(x) ∨ x ≈ i1 ∨ ... ∨ x ≈ in. Furthermore, Pellet fully supports nominals, so one may question whether the Wine ontology is suitable for our tests. Unfortunately, in our search for test data, we could easily find ontologies with a complex TBox but without an ABox, or ontologies with an ABox and only a simple TBox, or a TBox with nominals. The (approximated) Wine ontology was the best ontology we found that contained both a nontrivial TBox and an ABox. We also used this approximated ontology in the tests with Pellet, in order to ensure that all systems are dealing with the same problem.
DOLCE11 is a foundational ontology developed at the Laboratory for Applied Ontology of the Italian National Research Council. It is very complex, and no reasoner currently available can handle it. Therefore, the ontology has been factored into several

7http://www.vicodi.org/ 8http://www.cs.put.poznan.pl/alawrynowicz/semintec.htm 9http://swat.cse.lehigh.edu/projects/lubm/index.htm 10http://www.schemaweb.info/schema/SchemaDetails.aspx?id=62 11http://www.loa-cnr.it/DOLCE.html

Figure 13.1: VICODI Ontology Test Results

modules. We used the DOLCE OWL version 397, up to the Common module (this includes the DOLCE-Lite, ExtDnS, Modal, and Common modules). GALEN12 is a medical terminology ontology developed in the GALEN project [114]. It has a very large and complex TBox, and has traditionally been used as a benchmark for terminological reasoning.

13.3 Querying Large ABoxes

We now show the results of our tests of querying large ABoxes.

13.3.1 VICODI

Because VICODI does not contain existential quantifiers or disjunctions, it can be converted into disjunctive datalog directly, without invoking the reduction algorithm. Hence, reasoning with VICODI requires only an efficient deductive database. From the ontology author we received the following two queries, used in the project:

QV1 (x) ≡ Individual(x)

QV2(x, y, z) ≡ Military-Person(x), hasRole(y, x), related(x, z)

The results in Figure 13.1 show that Pellet and RACER spend the bulk of their time in checking ABox consistency by computing a completion of the ABox. Because the ontology is simple, no branch splits are performed, so the process yields a single completion representing a model. Query answering is then very fast, as it amounts to model lookup. According to the authors of RACER, the gap in performance between Pellet and RACER should be resolved in the next release of RACER.

12We obtained GALEN through private communication with .

Figure 13.2: SEMINTEC Ontology Test Results

13.3.2 SEMINTEC

The SEMINTEC ontology is also very simple; however, it is interesting because it contains functional roles and therefore requires equality reasoning. From the ontology author we obtained the following two queries, used in the project:

QS1 (x) ≡ Person(x)

QS2(x, y, z) ≡ Man(x), isCreditCardOf(y, x), Gold(y), livesIn(x, z), Region(z)

The results of running the queries are shown in Figure 13.2. The SEMINTEC ontology is roughly of the same size as the VICODI ontology; however, the time that KAON2 takes to answer a query on SEMINTEC is one order of magnitude larger than for the VICODI ontology. This is mainly due to equality, which is difficult for deductive databases.

13.3.3 LUBM

LUBM is comparable in size to the VICODI and the SEMINTEC ontologies, but its TBox contains complex concepts. It uses existential quantifiers, so our reduction algorithm must be used to eliminate function symbols. Also, the ontology does not contain disjunctions or equality, so the translation yields an equality-free Horn program. We wanted a mix of simple and complex queries, so we selected the following three queries from the LUBM Web site:

QL1 (x) ≡ Chair(x)

QL2 (x, y) ≡ Chair(x), worksFor(x, y), Department(y), subOrganizationOf (y, http://www.University0.edu)

QL3 (x, y, z) ≡ Student(x), Faculty(y), Course(z), advisor(x, y), takesCourse(x, z), teacherOf (y, z)

As our results from Figure 13.3 show, LUBM does not pose significant problems for KAON2; namely, the translation produces an equality-free Horn program, which

Figure 13.3: LUBM Ontology Test Results

KAON2 evaluates in polynomial time. Although LUBM is roughly of the same size as VICODI, both Pellet and RACER performed better on the latter; namely, Pellet was not able to answer any of the LUBM queries within the given resource constraints, and RACER performed significantly better on VICODI than on LUBM. We were surprised by this result: the ontology is still Horn, so an ABox completion can be computed in advance and used as a cache for query answering.
By analyzing a run of Pellet on lubm 1 in a debugger, we observed that the system performs disjunctive reasoning (that is, it performs branch splits). Further investigation showed that this is due to absorption [66]—a well-known optimization technique used by all tableau reasoners. Namely, an axiom of the form C ⊑ D, where C is a complex concept, increases the amount of don't-know nondeterminism in a tableau because it yields a disjunction ¬C ⊔ D in the label of each node. If possible, such an axiom is transformed into an equivalent definition axiom A ⊑ C′ (where A is an atomic concept), which can be handled in a deterministic way. The LUBM ontology contains several axioms that are equivalent to A ⊑ B ⊓ ∃R.C and B ⊓ ∃R.C ⊑ A. Now the latter axiom contains a complex concept on the left-hand side of ⊑, so it is absorbed into an equivalent axiom B ⊑ A ⊔ ∀R.¬C. Whereas this is a definition axiom, it contains a disjunction on the right-hand side, and thus causes branch splits.

Figure 13.4: Wine Ontology Test Results

13.3.4 Wine

The Wine ontology is a fairly complex ontology, using advanced DL constructors, such as disjunctions and equality. The translation of nominals is incomplete, so we ran only the following query:

QW1(x) ≡ AmericanWine(x)

The results from Figure 13.4 show that the ontology complexity affects the performance: wine 0 is significantly smaller than, say, lubm 1, but the time required to answer the query is roughly the same. The degradation of performance in KAON2 is mainly due to disjunctions. On the theoretical side, disjunctions increase the data complexity of our algorithm from P to NP [153]. On the practical side, the technique for answering queries in disjunctive programs used in KAON2 should be further optimized.

13.4 TBox Reasoning

Although TBox reasoning was not the focus of our work, to better understand the limitations of our algorithms, we also conducted several TBox reasoning tests. In particular, we measured the time required to compute the subsumption hierarchies of Wine, DOLCE, and GALEN. Furthermore, we observed that the transitivity axioms are a considerable source of complexity for KAON2 on DOLCE, so we also performed the tests for a version of DOLCE in which all transitivity axioms were removed.
The results from Figure 13.5 show that the performance of TBox reasoning in KAON2 lags behind the performance of the state-of-the-art tableau reasoners. This should not come as a surprise: in the past decade, many techniques for optimizing

Figure 13.5: TBox Test Results

TBox reasoning in tableau algorithms were developed; these techniques are not directly applicable to the resolution setting. Still, KAON2 can classify DOLCE without transitivity axioms, which is known to be a fairly complex ontology. Hence, we believe that developing additional optimization techniques for resolution algorithms might yield some interesting and practically useful results.
By analyzing the ontologies for which KAON2 was unable to compute the subsumption hierarchy within the given resource limits, we noticed that they all contain many ALCHIQ-closures of types 3 and 7 with the same role symbol, which generate many consequences. This explains why KAON2 is not able to classify the original DOLCE ontology, but why it works well if the transitivity axioms are removed: the transformation used to deal with transitivity introduces axioms that, when clausified, produce many closures of types 3 and 7.

Chapter 14

Conclusion

Motivated primarily by the prospects of reusing well-known deductive database tech- niques to optimize query answering in description logics, we developed several novel algorithms for reasoning with description logics related to SHIQ. Our algorithms can handle the description logic SHIQ(D), which is closely related to the ontology language for the Semantic Web OWL-DL. The logical underpinning of OWL-DL is actually the SHOIN (D) description logic, which differs from SHIQ(D) in that it supports nominals, but does not provide for qualified number restrictions. The first algorithm we presented is an algorithm for checking satisfiability of SHIQ knowledge bases based on basic superposition, a clausal calculus for theorem proving with equality. The novel aspect is that the procedure allows clauses to contain terms of depth two, and relies on basic superposition to block certain undesirable inferences. Furthermore, it relies on subsumption to restrict the term depth, and not just the clause length. Our procedure runs in time exponential in the size of the knowledge base for unary coding of numbers, which makes it worst-case optimal. It is worth noting that the assumption on unary coding of numbers is standard in practical description logic reasoning systems. Basic superposition alone decides only a slightly weaker logic SHIQ−, so to handle SHIQ, we extend basic superposition with the decomposition rule, for which we show soundness and completeness. An important aspect of OWL-DL is that it provides constructs for representing concrete data, such as strings or integers. Generally, such capabilities are integrated into description logics using so-called concrete domains. Until now, reasoning with a concrete domain has been studied predominantly in the context of tableaux and au- tomata calculi. These existing approaches to reasoning with a concrete domain are not directly applicable to clausal calculi, so we devised an algorithm for reasoning with a concrete domain in the resolution framework. This approach is general and is applicable to any clausal calculus whose completeness proof is based on the model gen- eration method. Furthermore, we applied this approach to derive a decision procedure for SHIQ(D). Assuming unary coding of numbers, an upper bound on the arity of concrete predicates, and an exponential decision procedure for checking satisfiability

231 232 14. Conclusion of concrete predicates, adding a concrete domain does not increase the complexity of reasoning; that is, our algorithm still runs in exponential time. We applied the decision procedure for SHIQ(D) to obtain an algorithm for reduc- ing a SHIQ(D) knowledge base to a disjunctive datalog program. The reduction is based on the idea of simulating the inference steps of basic superposition in disjunctive datalog. This allows reusing various optimizations for ABox query answering, such as join order optimizations or the magic sets transformation. Based on this reduction, we showed that checking satisfiability of SHIQ(D) knowl- edge bases is data complete for NP. This is a somewhat surprising result, since, under the assumption that NP ⊂ ExpTime, it is better than the combined complexity of the same problem. To further reduce data complexity, we proposed a Horn fragment of SHIQ(D), in which the capability for modeling disjunctive information is traded for polynomial data complexity. We believe that this fragment is very relevant in practice, since it is still very expressive, but provides hope for a tractable implementation. We extended our algorithms with a simple, but useful feature. Namely, we showed that so-called DL-safe rules can freely be appended to the resulting disjunctive pro- gram. The DL-safe rules are characterized by the restriction that each variable oc- curring in a rule also occurs in a non-DL-atom in the rule body, which limits the applicability of rules to individuals introduced explicitly in the ABox. Since the num- ber of such individuals is finite, adding DL-safe rules to SHIQ(D) does not cause termination problems. We also showed that basic superposition can be used to answer conjunctive queries over SHIQ(D) knowledge bases. Using the well-known reduction of conjunctive query containment to query answering, this algorithm can be used for deciding query con- tainment as well. Conjunctive queries provide an expressive language for querying description logic knowledge bases. We also considered extending SHIQ(D) with metamodeling. We showed that the approach to metamodeling adopted in the Semantic Web standard OWL-Full leads to undecidability. Therefore, we proposed an alternative semantics for metamodeling, for which we give a decision procedure. To show that our algorithms are practically relevant, we implemented them in a new DL reasoning system KAON2, and conducted a performance evaluation. For ABox query answering, our measurements show an increase of one and more orders of magnitude, when compared to the performance of existing tableaux-based systems, such as Pellet or RACER. For TBox reasoning, our system does not match the per- formance of tableaux-based ones; however, it is still able to solve certain nontrivial problems. Hence, we believe that developing additional optimization techniques might yield additional improvements, making resolution match and perhaps even outperform tableau-based reasoners. For our future work, an important theoretical challenge is to extend the algorithms to handle nominals. A tableau decision procedure for SHIQ extended with nominals has been presented recently in [71]; it is interesting to see whether the ideas presented there can be transferred into the framework of resolution. 233

Another theoretical challenge is to provide a decision procedure capable of dealing with transitivity directly, without encoding transitivity axioms. Namely, practical experience has shown that dealing with constructors by an encoding usually results in suboptimal performance. Furthermore, a method for handling transitivity directly would allow using transitive roles in conjunctive queries and DL-safe rules.
Finally, an important challenge is to extend our approach with some form of nonmonotonic reasoning. Namely, recent experience has shown that features such as closed-world and default reasoning are very important in numerous applications of description logics. Furthermore, disjunctive datalog has traditionally been considered a platform for nonmonotonic reasoning. Hence, it is interesting to see whether the nonmonotonic features of disjunctive datalog can be reused to provide some kind of nonmonotonic reasoning in description logics.

References

[1] S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison Wesley, 1995. [2] G. Alsa¸cand C. Baral. Reasoning in description logics using declarative logic programming. Technical report, Arizona State University, Arizona, USA, 2002. http://www.public.asu.edu/∼cbaral/papers/descr-logic-aaai2.pdf. [3] H. Andr´eka, J. van Benthem, and I. N´emeti. Modal Languages and Bounded Fragments of Predicate Logic. Journal of Philosophical Logic, 27(3):217–274, 1998. [4] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation and Applica- tions. Cambridge University Press, January 2003. [5] F. Baader and P. Hanschke. A Scheme for Integrating Concrete Domains into Concept Languages. In J. Mylopoulos and R. Reiter, editors, Proc. of the 12th Int. Joint Conf. on Artificial Intelligence (IJCAI ’91), pages 452–457, Sydney, Australia, August 24–30 1991. Morgan Kaufmann Publishers. [6] F. Baader and T. Nipkow. Term Rewriting and All That. Cambridge University Press, 1998. [7] F. Baader and W. Snyder. Unification Theory. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume I, chapter 8, pages 445–532. Elsevier Science, 2001. [8] M. Baaz, U. Egly, and A. Leitsch. Normal Form Transformations. In A. Robin- son and A. Voronkov, editors, Handbook of Automated Reasoning, volume I, chapter 5, pages 273–333. Elsevier Science, 2001. [9] L. Bachmair and H. Ganzinger. Rewrite-based Equational Theorem Proving with Selection and Simplification. Journal of Logic and Computation, 4(3):217–247, 1994. [10] L. Bachmair and H. Ganzinger. Equational Reasoning in Saturation-Based The- orem Proving. In W. Bibel and P. H. Schmidt, editors, Automated Deduction: A


Basis for Applications, volume I: Foundations, Calculi and Methods, chapter 11, pages 353–397. Kluwer, 1998.

[11] L. Bachmair and H. Ganzinger. Ordered Chaining Calculi for First-Order The- ories of Transitive Relations. Journal of the ACM, 45(6):1007–1049, 1998.

[12] L. Bachmair and H. Ganzinger. Strict Basic Superposition. In Proc. of the 15th Int. Conf. on Automated Deduction (CADE-15), volume 1421 of LNAI, pages 160–174, Lindau, Germany, July 5–10 1998. Springer.

[13] L. Bachmair and H. Ganzinger. Resolution Theorem Proving. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume I, chapter 2, pages 19–99. Elsevier Science, 2001.

[14] L. Bachmair, H. Ganzinger, C. Lynch, and W. Snyder. Basic Paramodulation. Information and Computation, 121(2):172–192, 1995.

[15] P. Baumgartner. An Ordered Theory Resolution Calculus. In A. Voronkov, edi- tor, Proc. of the 3rd Int. Conf. on Logic Programming and Automated Reasoning (LPAR ’92), volume 624 of LNAI, pages 119–130, St. Petersburg, Russia, July 15–20 1992. Springer.

[16] S. Bechhofer, R. M¨oller, and P. Crowther. The DIG Description Logic Interface. In D. Calvanese, G. de Giacomo, and F. Franconi, editors, Proc. of the 2003 Int. Workshop on Description Logics (DL 2003), volume 81 of CEUR Workshop Proceedings, Rome, Italy, September 5–7 2003.

[17] S. Bechhofer, R. Volz, and P. W. Lord. Cooking the Semantic Web with the OWL API. In D. Fensel, K. P. Sycara, and J. Mylopoulos, editors, Proc. of the 2nd Int. Semantic Web Conference (ISWC 2003), volume 2870 of LNCS, pages 659–675, Sanibel Island, FL, USA, October 20–23 2003. Springer.

[18] C. Beeri and R. Ramakrishnan. On the power of magic. In Proc. of the 6th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Sys- tems (PODS ’87), pages 269–283, San Diego, CA, USA, March 23–25 1987. ACM Press.

[19] S. Bergamaschi, S. Castano, M. Vincini, and D. Beneventano. Semantic Inte- gration of Heterogeneous Information Sources. Data & Knowledge Engineering, 36(3):215–249, 2001.

[20] R. Berger. The undecidability of the dominoe problem. Memoirs of the American Mathematical Society, 66:1–72, 1966.

[21] A. Borgida. On the Relative Expressiveness of Description Logics and Predicate Logics. Artificial Intelligence, 82(1–2):353–367, 1996. REFERENCES 237

[22] A. Borgida, R. J. Brachman, D. L. McGuinness, and L. A. Resnick. CLASSIC: a structural data model for objects. ACM SIGMOD Record, 18(2):58–67, 1989.

[23] R. Boyer. Locking: A Restriction of Resolution. PhD thesis, University of Texas at Austin, TX, USA, 1971.

[24] R. J. Brachman and J. G. Schmolze. An Overview of the KL-ONE Knowledge Representation System. Cognitive Science, 9(2):171–216, 1985.

[25] S. Brass and U. W. Lipeck. Generalized Bottom-Up Query Evaluation. In A. Pirotte, C. Delobel, and G. Gottlob, editors, Proc. of the 3rd Int. Conf. on Extending Database Technology (EDBT ’92), volume 580 of LNCS, pages 88–103, Vienna, Austria, March 23–27 1992. Springer.

[26] C.-L. Chang and R. C.-T. Lee. Symbolic Logic and Mechanical Theorem Proving. Academic Press, Inc., 1997.

[27] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati. DL-Lite: Tractable Description Logics for Ontologies. In M. M. Veloso and S. Kambham- pati, editors, Proc. of the 20th National Conf. on Artificial Intelligence (AAAI 2005), pages 602–607, Pittsburgh, PA, USA, July 9–13 2005. AAAI Press.

[28] D. Calvanese, G. De Giacomo, and M. Lenzerini. On the Decidability of Query Containment under Constraints. In Proc. of the 17th ACM SIGACT-SIGMOD- SIGART Symposium on Principles of Database Systems (PODS ’98), pages 149– 158, Seattle, WA, USA, June 1–3 1998. ACM Press.

[29] D. Calvanese, G. De Giacomo, and M. Lenzerini. Answering queries using views over description logics knowledge bases. In Proc. of the 17th National Conf. on Artificial Intelligence (AAAI 2000), pages 386–391, Austin, TX, USA, July 30–August 3 2000. AAAI Press.

[30] D. Calvanese, G. De Giacomo, and M. Lenzerini. Conjunctive Query Contain- ment and Answering under Description Logics Constraints. Submitted to an international journal, 2005.

[31] D. Calvanese, M. Lenzerini, and D. Nardi. Unifying Class-Based Representation Formalisms. Journal of Artificial Intelligence Research (JAIR), 11:199–240, 1999.

[32] A. K. Chandra and P. M. Merlin. Optimal implementation of conjunctive queries in relational data bases. In J. E. Hopcroft, E. P. Friedman, and M. A. Harri- son, editors, Proc. of the 9th annual ACM Symposium on Theory of Computing (STOC ’77), pages 77–90, Boulder, CO, USA, May 2–4 1977. ACM Press.

[33] W. Chen, M. Kifer, and D. S. Warren. A Foundation for Higher-Order Logic Programming. Journal of Logic Programming, 15(3):187–230, 1993. 238 REFERENCES

[34] C. Cumbo, W. Faber, G. Greco, and N. Leone. Enhancing the Magic-Set Method for Disjunctive Datalog Programs. In B. Demoen and V. Lifschitz, editors, Proc. of the 20th Int. Conf. on Logic Programming (ICLP 2004), volume 3132 of LNCS, pages 371–385, Saint-Malo, France, September 6–10 2004. Springer.

[35] E. Dantsin, T. Eiter, G. Gottlob, and A. Voronkov. Complexity and expressive power of logic programming. ACM Computing Surveys, 33(3):374–425, 2001.

[36] H. de Nivelle. A Resolution Decision Procedure for the Guarded Fragment. In C. Kirchner and H. Kirchner, editors, Proc. of the 15th Int. Conf. on Automated Deduction (CADE-15), volume 1421 of LNAI, pages 191–204, Lindau, Germany, July 5–10 1998. Springer.

[37] H. de Nivelle. Splitting Through New Proposition Symbols. In R. Nieuwenhuis and A. Voronkov, editors, Proc. of the 8th Int. Conf. on Logic for Programming, Artificial Intelligence, and Reasoning (LPAR 2001), volume 2250 of LNAI, pages 172–185, Havana, Cuba, December 3–7, 2001. Springer.

[38] N. Dershowitz and D. A. Plaisted. Rewriting. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume I, chapter 9, pages 535–610. Elsevier Science, 2001.

[39] F. M. Donini, M. Lenzerini, D. Nardi, and A. Schaerf. AL-log: Integrating Datalog and Description Logics. Journal of Intelligent Information Systems, 10(3):227–252, 1998.

[40] J. Edelmann and B. Owsnicki. Data Models in Knowledge Representation Sys- tems: A Case Study. In C.-R. Rollinger and W. Horn, editors, Proc. of the 10th German Workshop on Artificial Intelligence (GWAI’86) and the 2nd Aus- trian Symposium on Artificial Intelligence (OGAI’86)¨ , volume 124 of Informatik- Fachberichte, pages 69–74. Springer, Ottenstein/Nieder¨osterreich, September 22– 26 1986.

[41] T. Eiter, W. Faber, N. Leone, and G. Pfeifer. Declarative problem-solving using the DLV system. Logic-Based Artificial Intelligence, pages 79–103, 2000.

[42] T. Eiter, G. Gottlob, and H. Mannila. Disjunctive Datalog. ACM Transactions on Database Systems, 22(3):364–418, 1997.

[43] T. Eiter, N. Leone, C. Mateis, G. Pfeifer, and F. Scarcello. A Deductive System for Non-Monotonic Reasoning. In J. Dix, U. Furbach, and A. Nerode, editors, Proc. of the 4th Int. Conf. on Logic Programming and Non-monotonic Reasoning (LPNMR ’97), volume 1265 of LNAI, pages 364–375, Dagstuhl, Germany, July 28–31 1997. Springer.

[44] T. Eiter, T. Lukasiewicz, R. Schindlauer, and H. Tompits. Combining Answer Set Programming with Description Logics for the Semantic Web. In D. Dubois, REFERENCES 239

C. A. Welty, and M.-A. Williams, editors, Proc. of the 9th Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR 2004), pages 141– 151, Whistler, Canada, June 2–5, 2004 2004. AAAI Press.

[45] C. Fermüller, T. Tammet, N. Zamov, and A. Leitsch. Resolution Methods for the Decision Problem, volume 679 of LNAI. Springer, 1993.

[46] M. Fitting. First-Order Logic and Automated Theorem Proving, 2nd Edition. Texts in . Springer, 1996.

[47] H. Ganzinger and H. de Nivelle. A Superposition Decision Procedure for the Guarded Fragment with Equality. In Proc. of the 14th IEEE Symposium on Logic in Computer Science (LICS ’99), pages 295–305, Trento, Italy, July 2–5 1999. IEEE Computer Society.

[48] H. Ganzinger, U. Hustadt, C. Meyer, and R. A. Schmidt. A Resolution-Based Decision Procedure for Extensions of K4. In M. Zakharyaschev, K. Segerberg, M. de Rijke, and H. Wansing, editors, Proc. of the 2nd Int. Workshop on Ad- vances in Modal Logic (AiML ’98), pages 243–263, Uppsala, Sweden, October 16–18 2000. CSLI Publications.

[49] L. Georgieva, U. Hustadt, and R. A. Schmidt. Hyperresolution for Guarded Formulae. Journal of Symbolic Computation, 36(1–2):163–192, 2003.

[50] B. Glimm and I. Horrocks. Handling Cyclic Conjunctive Queries. In I. Horrocks, U. Sattler, and F. Wolter, editors, Proc. of the 2005 Int. Workshop on Descrip- tion Logics (DL 2005), volume 147 of CEUR Workshop Proceedings, Edinburgh, UK, July 26-28 2005. Poster.

[51] F. Goasdoué and M.-C. Rousset. Answering Queries using Views: A KRDB Perspective for the Semantic Web. ACM Transactions on Internet Technology, 4(3):255–288, 2004.

[52] G. Gottlob and C. G. Fermüller. Removing redundancy from a clause. Artificial Intelligence, 61(2):263–289, 1993.

[53] G. Gottlob and A. Leitsch. On the Efficiency of Subsumption Algorithms. Jour- nal of the ACM, 32(2):280–295, 1985.

[54] E. Gr¨adel, M. Otto, and E. Rosen. Two-Variable Logic with Counting is De- cidable. In Proc. of the 12th IEEE Symposium on Logic in Computer Science (LICS ’97), pages 306–317, Warsaw, Poland, June 29–July 2 1997. IEEE Com- puter Society.

[55] S. Greco. Binding Propagation Techniques for the Optimization of Bound Dis- junctive Queries. IEEE Transactions on Knowledge and Data Engineering, 15(2):368–385, 2003. 240 REFERENCES

[56] C. Green. Theorem proving by resolution as a basis for question-answering systems. In B. Meltzer and D. Michie, editors, Proc. of the 4th Annual Machine Intelligence Workshop, pages 183–208. Edinburgh University Press, 1969.

[57] B. N. Grosof, I. Horrocks, R. Volz, and S. Decker. Description Logic Programs: Combining Logic Programs with Description Logic. In Proc. of the 12th Int. World Wide Web Conference (WWW 2003), pages 48–57, Budapest, Hungary, May 20–24 2003. ACM Press.

[58] Y. Guo, Z. Pan, and J. Heflin. An Evaluation of Knowledge Base Systems for Large OWL Datasets. In S. A. McIlraith, D. Plexousakis, and F. van Harmelen, editors, Proc. of the 3rd Int. Semantic Web Conference (ISWC 2004), volume 3298 of LNCS, pages 274–288, Hiroshima, Japan, November 7–11 2004. Springer.

[59] V. Haarslev and R. Möller. RACER System Description. In R. Goré, A. Leitsch, and T. Nipkow, editors, Proc. of the 1st Int. Joint Conf. on Automated Reasoning (IJCAR 2001), volume 2083 of LNAI, pages 701–706, Siena, Italy, June 18–23 2001. Springer.

[60] V. Haarslev and R. Möller. Optimization Strategies for Instance Retrieval. In I. Horrocks, S. Tessaris, and J. Z. Pan, editors, Proc. of the 2002 Int. Workshop on Description Logics (DL 2002), volume 53 of CEUR Workshop Proceedings, Toulouse, France, April 19–21 2002.

[61] V. Haarslev and R. Möller. Incremental Query Answering for Implementing Document Retrieval Services. In D. Calvanese, G. de Giacomo, and F. Franconi, editors, Proc. of the 2003 Int. Workshop on Description Logics (DL 2003), volume 81 of CEUR Workshop Proceedings, Rome, Italy, September 5–7 2003.

[62] V. Haarslev, R. Möller, and M. Wessel. The Description Logic ALCNHR+ Extended with Concrete Domains: A Practically Motivated Approach. In R. Goré, A. Leitsch, and T. Nipkow, editors, Proc. of the 1st Int. Joint Conf. on Automated Reasoning (IJCAR 2001), volume 2083 of LNAI, pages 29–44, Siena, Italy, June 18–23 2001. Springer.

[63] P. Hanschke, A. Abecker, and D. Drollinger. TAXON: A Concept Language with Concrete Domains. In H. Boley and M. M. Richter, editors, Proc. of the Int. Workshop on Processing Declarative Knowledge (PDK’91), volume 567 of LNAI, pages 411–413, Kaiserslautern, Germany, July 1–3 1991. Springer.

[64] S. Heymans and D. Vermeir. Integrating Semantic Web Reasoning and Answer Set Programming. In M. De Vos and A. Provetti, editors, Proc. of the 2nd Int. Workshop on Answer Set Programming, Advances in Theory and Implementation (ASP’03), volume 78 of CEUR Workshop Proceedings, pages 194–208, Messina, Italy, September 26–28 2003.

[65] I. Hodkinson. Loosely Guarded Fragment of First-Order Logic has the Finite Model Property. Studia Logica, 70(2):205–240, 2002.

[66] I. Horrocks. Optimising Tableaux Decision Procedures for Description Logics. PhD thesis, University of Manchester, UK, 1997.

[67] I. Horrocks. Using an Expressive Description Logic: FaCT or Fiction? In A. G. Cohn, L. Schubert, and S. C. Shapiro, editors, Proc. of the 6th Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR ’98), pages 636–647, Trento, Italy, June 2–5 1998. Morgan Kaufmann Publishers.

[68] I. Horrocks and P. F. Patel-Schneider. Three Theses of Representation in the Semantic Web. In Proc. of the 12th Int. World Wide Web Conference (WWW 2003), pages 39–47, Budapest, Hungary, May 20–24 2003. ACM Press.

[69] I. Horrocks and P. F. Patel-Schneider. A Proposal for an OWL Rules Language. In Proc. of the 13th Int. World Wide Web Conference (WWW 2004), pages 723–731, New York, NY, USA, May 17–22 2004. ACM Press.

[70] I. Horrocks and U. Sattler. Ontology Reasoning in the SHOQ(D) Description Logic. In B. Nebel, editor, Proc. of the 17th Int. Joint Conf. on Artificial Intelligence (IJCAI 2001), pages 199–204, Seattle, WA, USA, August 4–10 2001. Morgan Kaufmann Publishers.

[71] I. Horrocks and U. Sattler. A Tableaux Decision Procedure for SHOIQ. In Proc. of the 19th Int. Joint Conf. on Artificial Intelligence (IJCAI 2005), pages 448–453, Edinburgh, UK, July 30–August 5 2005. Morgan Kaufmann Publishers.

[72] I. Horrocks, U. Sattler, S. Tessaris, and S. Tobies. How to decide Query Containment under Constraints using a Description Logic. In M. Parigot and A. Voronkov, editors, Proc. of the 7th Int. Conf. on Logic for Programming and Automated Reasoning (LPAR 2000), volume 1955 of LNAI, pages 326–343, Reunion Island, France, November 11–12 2000. Springer.

[73] I. Horrocks, U. Sattler, and S. Tobies. Practical Reasoning for Very Expressive Description Logics. Logic Journal of the IGPL, 8(3):239–263, 2000.

[74] I. Horrocks, U. Sattler, and S. Tobies. Reasoning with Individuals for the Description Logic SHIQ. In D. McAllester, editor, Proc. of the 17th Int. Conf. on Automated Deduction (CADE-17), volume 1831 of LNAI, pages 482–496, Pittsburgh, PA, USA, June 17–20 2000. Springer.

[75] I. Horrocks and S. Tessaris. Querying the Semantic Web: a Formal Approach. In I. Horrocks and J. A. Hendler, editors, Proc. of the 1st Int. Semantic Web Conference (ISWC 2002), volume 2342 of LNCS, pages 177–191, Sardinia, Italy, June 9–12 2002. Springer.

[76] U. Hustadt. Resolution-Based Decision Procedures for Subclasses of First-Order Logic. PhD thesis, Universität des Saarlandes, Germany, 1999.

[77] U. Hustadt and R. A. Schmidt. Issues of Decidability for Description Logics in the Framework of Resolution. In R. Caferra and G. Salzer, editors, Selected Papers from Automated Deduction in Classical and Non-Classical Logics, volume 1761 of LNAI, pages 191–205. Springer, 1999.

[78] N. Immerman. Relational queries computable in polynomial time. Information and Control, 68(1–3):86–104, 1986.

[79] I. V. Ramakrishnan, R. Sekar, and A. Voronkov. Term indexing. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume II, chapter 26, pages 1853–1964. Elsevier Science, 2001.

[80] W. H. Joyner Jr. Resolution Strategies as Decision Procedures. Journal of the ACM, 23(3):398–417, 1976.

[81] B. Kallick. A decision procedure based on the resolution method. In A. J. H. Morrell, editor, Proc. of the Int. Federation for Information Processing Congress (IFIP ’68), volume 1 – Mathematics, Software, pages 269–275, Edinburgh, UK, August 5–10 1968. North-Holland.

[82] Y. Kazakov and H. de Nivelle. A Resolution Decision Procedure for the Guarded Fragment with Transitive Guards. In D. Basin and M. Rusinowitch, editors, Proc. of 2nd Int. Joint Conf. on Automated Reasoning (IJCAR 2004), volume 3097 of LNAI, pages 122–136, Cork, Ireland, July 4–8 2004. Springer.

[83] A. Leitsch. Deciding clause classes by semantic clash resolution. Fundamenta Informaticae, 18:163–182, 1993.

[84] A. Y. Levy and M.-C. Rousset. Combining Horn Rules and Description Logics in CARIN. Artificial Intelligence, 104(1–2):165–209, 1998.

[85] A. Y. Levy, D. Srivastava, and T. Kirk. Data Model and Query Evaluation in Global Information Systems. Journal of Intelligent Information Systems, 5(2):121–143, 1995.

[86] B. Löchner. Things to know when implementing LPO. In S. Schulz, G. Sutcliffe, and T. Tammet, editors, The IJCAR 2004 Workshop on Empirically Successful First Order Reasoning (ESFOR), Cork, Ireland, July 4–8 2004.

[87] C. Lutz. The Complexity of Reasoning with Concrete Domains. PhD thesis, Teaching and Research Area for Theoretical Computer Science, RWTH Aachen, Germany, 2002.

[88] C. Lutz. Description Logics with Concrete Domains—A Survey. In P. Balbiani, N.-Y. Suzuki, F. Wolter, and M. Zakharyaschev, editors, Proc. of the 4th Int. Workshop on Advances in Modal Logic (AiML 2002), pages 265–296, Toulouse, France, September 30–October 2 2003. King’s College Publications.

[89] C. Lutz. NExpTime-complete Description Logics with Concrete Domains. ACM Transactions on Computational Logic, 5(4):669–705, 2004.

[90] C. Lutz and U. Sattler. The Complexity of Reasoning with Boolean Modal Logics. In F. Wolter, H. Wansing, M. de Rijke, and M. Zakharyaschev, editors, Proc. of the 3rd Int. Conf. on Advances in Modal Logic (AiML 2000), pages 329–348, Leipzig, Germany, October 4–7 2001. World Scientific.

[91] R. M. MacGregor. Inside the LOOM Description Classifier. SIGART Bulletin, 2(3):88–92, 1991.

[92] W. McCune. Solution of the Robbins Problem. Journal of Automated Reasoning, 19(3):263–276, 1997.

[93] D. L. McGuinness and J. R. Wright. An Industrial-Strength Description-Logics-Based Configurator Platform. IEEE Intelligent Systems, 13(4):69–77, 1998.

[94] M. Minsky. A Framework for Representing Knowledge. In J. Haugeland, editor, Mind Design: Philosophy, Psychology, Artificial Intelligence, pages 95–128. MIT Press, Cambridge, MA, USA, 1981.

[95] B. Nebel. Terminological Cycles: Semantics and Computational Properties. In J. F. Sowa, editor, Principles of Semantic Networks: Explorations in the Representation of Knowledge, pages 331–361. Morgan Kaufmann Publishers, San Mateo, CA, USA, 1991.

[96] R. Nieuwenhuis and A. Rubio. Theorem Proving with Ordering and Equality Constrained Clauses. Journal of Symbolic Computation, 19(4):312–351, 1995.

[97] R. Nieuwenhuis and A. Rubio. Paramodulation-Based Theorem Proving. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume I, chapter 7, pages 371–443. Elsevier Science, 2001.

[98] H. de Nivelle, R. A. Schmidt, and U. Hustadt. Resolution-Based Methods for Modal Logics. Logic Journal of the IGPL, 8(3):265–292, 2000.

[99] A. Nonnengart and C. Weidenbach. Computing Small Clause Normal Forms. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume I, chapter 6, pages 335–367. Elsevier Science, 2001.

[100] L. Pacholski, W. Szwast, and L. Tendera. Complexity Results for First-Order Two-Variable Logic with Counting. SIAM Journal on Computing, 29(4):1083–1117, 2000.

[101] J. Z. Pan and I. Horrocks. Extending Datatype Support in Web Ontology Reasoning. In R. Meersman and Z. Tari, editors, Proc. of the 2002 Int. Conf. on Ontologies, Databases and Applications of SEmantics (ODBASE 2002), volume 2519 of LNCS, pages 1067–1081, Irvine, CA, USA, October 30–November 1 2002. Springer.

[102] J. Z. Pan and I. Horrocks. RDFS(FA) and RDF MT: Two Semantics for RDFS. In D. Fensel, K. P. Sycara, and J. Mylopoulos, editors, Proc. of the 2nd Int. Semantic Web Conference (ISWC 2003), volume 2870 of LNCS, pages 30–46, Sanibel Island, FL, USA, October 20–23 2003. Springer.

[103] J. Z. Pan and I. Horrocks. Web Ontology Reasoning with Datatype Groups. In D. Fensel, K. P. Sycara, and J. Mylopoulos, editors, Proc. of the 2nd Int. Semantic Web Conference (ISWC 2003), volume 2870 of LNCS, pages 47–63, Sanibel Island, FL, USA, October 20–23 2003. Springer.

[104] C. H. Papadimitriou. Computational Complexity. Addison Wesley, 1993.

[105] B. Parsia and E. Sirin. Pellet: An OWL-DL Reasoner. Poster at the 3rd Int. Semantic Web Conference (ISWC 2004), Hiroshima, Japan, November 7–11 2004.

[106] P. F. Patel-Schneider, P. Hayes, and I. Horrocks. OWL Web Ontology Language Semantics and Abstract Syntax. W3C Recommendation, February 10 2004. http://www.w3.org/TR/owl-semantics/.

[107] N. Peltier. On the decidability of the PVD class with equality. Logic Journal of the IGPL, 9(4):601–624, 2001.

[108] D. A. Plaisted and S. Greenbaum. A Structure-Preserving Clause Form Translation. Journal of Symbolic Computation, 2(3):293–304, 1986.

[109] D. Poole, A. Mackworth, and R. Goebel. Computational Intelligence: A Logical Approach. Oxford University Press, New York, NY, USA, 1997.

[110] I. Pratt-Hartmann. Counting Quantifiers and the Stellar Fragment. Technical report, University of Manchester, UK, 2003. Submitted for publication.

[111] M. R. Quillian. Word concepts: A theory and simulation of some basic capabil- ities. Behavioral Science, 12:410–430, 1967.

[112] R. Ramakrishnan, D. Srivastava, and S. Sudarshan. Rule Ordering in Bottom-Up Fixpoint Evaluation of Logic Programs. IEEE Transactions on Knowledge and Data Engineering, 6(4):501–517, 1994.

[113] R. Ramesh, I. V. Ramakrishnan, and D. S. Warren. Automata-Driven Indexing of Prolog Clauses. Journal of Logic Programming, 23(2):151–202, 1995.

[114] A. L. Rector, W. A. Nowlan, and A. Glowinski. Goals for concept representation in the GALEN project. In C. Safran, editor, Proc. of the 17th Annual Symposium on Computer Applications in Medical Care (SCAMC ’93), pages 414–418, Washington DC, USA, November 1–3 1993. McGraw-Hill.

[115] R. Reiter. Two Results on Ordering for Resolution with Merging and Linear Format. Journal of the ACM, 18(4):630–646, 1971.

[116] A. Riazanov. Implementing an Efficient Theorem Prover. PhD thesis, University of Manchester, UK, 2003.

[117] A. Riazanov and A. Voronkov. Partially Adaptive Code Trees. In M. Ojeda-Aciego, I. P. de Guzmán, G. Brewka, and L. Moniz Pereira, editors, Proc. of the European Workshop on Logics in Artificial Intelligence (JELIA 2000), volume 1919 of LNAI, pages 209–223, Malaga, Spain, September 29–October 2 2000. Springer.

[118] A. Riazanov and A. Voronkov. Splitting Without Backtracking. In B. Nebel, editor, Proc. of the 17th Int. Joint Conf. on Artificial Intelligence (IJCAI 2001), pages 611–617, Seattle, WA, USA, August 4–10 2001. Morgan Kaufmann Publishers.

[119] A. Riazanov and A. Voronkov. The design and implementation of VAMPIRE. AI Communications, 15(2–3):91–110, 2002.

[120] G. Robinson and L. Wos. Paramodulation and theorem-proving in first-order theories with equality. In B. Meltzer and D. Michie, editors, Proc. of the 4th Annual Machine Intelligence Workshop, pages 135–158. Edinburgh University Press, 1969.

[121] J. A. Robinson. A Machine-Oriented Logic Based on the Resolution Principle. Journal of the ACM, 12(1):23–41, 1965.

[122] J. A. Robinson. Automatic deduction with hyper-resolution. International Journal of Computer Mathematics, 1(3):227–234, 1965.

[123] R. Rosati. On the decidability and complexity of integrating ontologies and rules. Journal of Web Semantics: Science, Services and Agents on the World Wide Web, 3(1):61–73, 2005.

[124] R. Rosati. Semantic and Computational Advantages of the Safe Integration of Ontologies and Rules. In F. Fages and S. Soliman, editors, Proc. of the 3rd Int. Workshop on Principles and Practice of Semantic Web Reasoning (PPSWR 2005), volume 3703 of LNCS, pages 50–64, Dagstuhl Castle, Germany, September 11–16 2005. Springer.

[125] A. Schaerf. On the Complexity of the Instance Checking Problem in Concept Languages with Existential Quantification. Journal of Intelligent Information Systems, 2(3):265–278, 1993.

[126] A. Schaerf. Query Answering in Concept-Based Knowledge Representation Systems: Algorithms, Complexity, and Semantic Issues. PhD thesis, Dipartimento di Informatica e Sistemistica, Università di Roma “La Sapienza”, Italy, 1994.

[127] K. Schild. A Correspondence Theory for Terminological Logics: Preliminary Report. In J. Mylopoulos and R. Reiter, editors, Proc. of 12th Int. Joint Conf. on Artificial Intelligence (IJCAI ’91), pages 466–471, Sydney, Australia, August 24–30 1991. Morgan Kaufmann Publishers.

[128] R. A. Schmidt and U. Hustadt. A Resolution Decision Procedure for Fluted Logic. In D. McAllester, editor, Proc. of the 17th Int. Conf. on Automated Deduction (CADE-17), volume 1831 of LNAI, pages 433–448, Pittsburgh, PA, USA, June 17–20 2000. Springer.

[129] R. A. Schmidt and U. Hustadt. A Principle for Incorporating Axioms into the First-Order Translation of Modal Formulae. In F. Baader, editor, Proc. of the 19th Int. Conf. on Automated Deduction (CADE-19), volume 2741 of LNAI, pages 412–426, Miami Beach, FL, USA, July 28–August 2 2003. Springer.

[130] M. Schmidt-Schauß. Subsumption in KL-ONE is undecidable. In R. J. Brachman, H. J. Levesque, and R. Reiter, editors, Proc. of the 1st Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR ’89), pages 421–431, Toronto, Canada, May 15–18 1989. Morgan Kaufmann Publishers.

[131] M. Schmidt-Schauß and G. Smolka. Attributive Concept Descriptions with Complements. Artificial Intelligence, 48(1):1–26, 1991.

[132] G. Schreiber. The Web is not well-formed. IEEE Intelligent Systems, 17(2):79–80, 2002. Contribution to the section “Trends & Controversies: Ontologies KISSES in Standardization”, edited by S. Staab.

[133] S. Schulz. E—A Brainiac Theorem Prover. AI Communications, 15(2–3):111–126, 2002.

[134] S. Schulz. Simple and Efficient Clause Subsumption with Feature Vector Indexing. In S. Schulz, G. Sutcliffe, and T. Tammet, editors, The IJCAR 2004 Workshop on Empirically Successful First Order Reasoning (ESFOR), Cork, Ireland, July 4–8 2004.

[135] O. Shmueli. Decidability and expressiveness aspects of logic queries. In Proc. of the 6th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS ’87), pages 237–249, San Diego, CA, USA, March 23–25 1987. ACM Press.

[136] O. Shmueli. Equivalence of DATALOG Queries is Undecidable. Journal of Logic Programming, 15(3):231–241, 1993.

[137] S. S. Skiena. The Algorithm Design Manual. Springer, 1997.

[138] W. Snyder. On the complexity of recursive path orderings. Information Processing Letters, 46(5):257–262, 1993.

[139] M. E. Stickel. Automated Deduction by Theory Resolution. Journal of Automated Reasoning, 1(4):333–355, 1985.

[140] M. E. Stickel. The Path-Indexing Method For Indexing Terms. Technical Report 473, Artificial Intelligence Center, SRI International, Menlo Park, California, October 1989.

[141] T. Syrjänen and I. Niemelä. The Smodels System. In T. Eiter, W. Faber, and M. Truszczynski, editors, Proc. of the 6th Int. Conf. on Logic Programming and Nonmonotonic Reasoning (LPNMR 2001), volume 2173 of LNAI, pages 434–438, Vienna, Austria, September 17–19 2001. Springer.

[142] T. Tammet. Resolution Methods for Decision Problems and Finite-Model Building. PhD thesis, Göteborg University, Sweden, 1992.

[143] S. Tessaris. Questions and answers: reasoning and querying in Description Logic. PhD thesis, University of Manchester, UK, 2001.

[144] S. Tobies. Complexity Results and Practical Algorithms for Logics in Knowledge Representation. PhD thesis, RWTH Aachen, Germany, 2001.

[145] M. Vardi. The Complexity of Relational Query Languages (Extended Abstract). In H. R. Lewis, B. B. Simons, W. A. Burkhard, and L. Landweber, editors, Proc. of the 14th annual ACM Symposium on Theory of Computing (STOC ’82), pages 137–146, San Francisco, CA, USA, May 5–7 1982. ACM Press.

[146] M. Y. Vardi. On the Complexity of Bounded-Variable Queries. In Proc. of the 14th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS ’95), pages 266–276, San Jose, CA, USA, May 22–25 1995. ACM Press.

[147] M. Y. Vardi. Why Is Modal Logic So Robustly Decidable? In N. Immerman and P. Kolaitis, editors, Proc. of a DIMACS Workshop on Descriptive Complexity and Finite Models, volume 31 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 149–184, Princeton University, USA, January 14–17 1996. American Mathematical Society.

[148] R. Volz. Web Ontology Reasoning With Logic Databases. PhD thesis, Universität Fridericiana zu Karlsruhe (TH), Germany, 2004.

[149] C. Welty and D. Ferrucci. What’s in an instance? Technical Report 94–18, RPI Computer Science, 1994.

Relevant Publications

[150] U. Hustadt, B. Motik, and U. Sattler. Reasoning in Description Logics with a Concrete Domain in the Framework of Resolution. In R. López de Mántaras and L. Saitta, editors, Proc. of the 16th European Conf. on Artificial Intelligence (ECAI 2004), pages 353–357, Valencia, Spain, August 22–27 2004. IOS Press.

[151] U. Hustadt, B. Motik, and U. Sattler. Reducing SHIQ− Description Logic to Disjunctive Datalog Programs. In D. Dubois, C. A. Welty, and M.-A. Williams, editors, Proc. of the 9th Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR 2004), pages 152–162, Whistler, Canada, June 2–5 2004. AAAI Press.

[152] U. Hustadt, B. Motik, and U. Sattler. A Decomposition Rule for Decision Procedures by Resolution-based Calculi. In F. Baader and A. Voronkov, editors, Proc. of the 11th Int. Conf. on Logic for Programming Artificial Intelligence and Reasoning (LPAR 2004), volume 3452 of LNAI, pages 21–35, Montevideo, Uruguay, March 14–18 2005. Springer.

[153] U. Hustadt, B. Motik, and U. Sattler. Data Complexity of Reasoning in Very Expressive Description Logics. In Proc. of the 19th Int. Joint Conf. on Artificial Intelligence (IJCAI 2005), pages 466–471, Edinburgh, UK, July 30–August 5 2005. Morgan Kaufmann Publishers.

[154] B. Motik. On the Properties of Metamodeling in OWL. In Y. Gil, E. Motta, V. R. Benjamins, and M. Musen, editors, Proc. of the 4th Int. Semantic Web Conf. (ISWC 2005), volume 3729 of LNCS, pages 548–562, Galway, Ireland, November 6–10 2005. Springer.

[155] B. Motik, U. Sattler, and R. Studer. Query Answering for OWL-DL with Rules. In S. A. McIlraith, D. Plexousakis, and F. van Harmelen, editors, Proc. of the 3rd Int. Semantic Web Conf. (ISWC 2004), volume 3298 of LNCS, pages 549–563, Hiroshima, Japan, November 7–11 2004. Springer.

[156] B. Motik, U. Sattler, and R. Studer. Query Answering for OWL-DL with Rules. Journal of Web Semantics: Science, Services and Agents on the World Wide Web, 3(1):41–60, 2005.
