Inline Evaluation of Hybrid Knowledge Bases

Die approbierte Originalversion dieser Dissertation ist in der Hauptbibliothek der Technischen Universität Wien aufgestellt und zugänglich. http://www.ub.tuwien.ac.at The approved original version of this thesis is available at the main library of the Vienna University of Technology. http://www.ub.tuwien.ac.at/eng Inline Evaluation of Hybrid Knowledge Bases PhD THESIS submitted in partial fulfillment of the requirements of Doctor of Technical Sciences within the Vienna PhD School of Informatics by Guohui Xiao Registration Number 0928566 to the Faculty of Informatics at the Vienna University of Technology Advisor: O. Univ. Prof. Dipl.-Ing. Dr. techn. Thomas Eiter Wien, 02.12.2013 (Signature of Author) (Signature of Advisor) Technische Universität Wien A-1040 Wien Karlsplatz 13 Tel. +43-1-58801-0 www.tuwien.ac.at Declaration of Authorship Guohui Xiao Piazza Domenicani 3, I-39100 Bolzano-Bozen BZ, Italy I hereby declare that I have written this Doctoral Thesis independently, that I have com- pletely specified the utilized sources and resources and that I have definitely marked all parts of the work - including tables, maps and figures - which belong to other works or to the internet, literally or extracted, by referencing the source as borrowed. Bozen, 2 December 2013 (Place, Date) (Signature of Author) i To my parents Fengchong Xiao and Shuhui Yan. 献给我的父亲肖凤冲和母亲闫淑慧 iii Acknowledgements First of all, I want to thank my advisor Prof. Thomas Eiter. It was a great and exciting experience being his student and working with him. As a top researcher, he had a very busy schedule, but he was always devoting his time to his students for discussions, comments, meetings and encouragements. Even when I was abroad, he managed two face-to-face meetings with me during his vocations to finalize this thesis. I am grateful to my reviewers Prof. Diego Calvanese and Prof. Sebastian Ruldoph for the helpful discussions and constructive suggestions. I want to thank the Vienna PhD School of Informatics, which provided me the opportunity for this PhD study. Prof. Hannes Werthner and Prof. Hans Tompits have been doing a great job in organizing and leading the PhD school. The sectaries Clarissa Schmid and Mamen Calatrava Moreno supported me in many aspects during the PhD study from the visa application to the final defense. It was a great experience working in the Knowledge Based System group in the Vienna Uni- versity of Technology. Stijn Heymans guided me to start the research and taught me how to do the research professionally. I had many happy and productive collaborations with Magdalena Ortiz, Mantas Simkus, Thomas Krennwallner, Patrik Schneider, and Cristina Feier, resulting several joint publications. Peter Schüller, Thomas Krennwallner and Christoph Redl developed the great DLVHEX solver and always provided me technical supports. Our secretary Eva Ne- doma and technicatian Matthias Schlögel were always helpful when I needed them. Thanks all the people in the KBS and DBAI groups for the last four years. My gratitude also goes to all who inspired me during my PhD. The lectures from Georg Gottlob, Thomas Eiter, Reinhard Pichler and Axel Polleres gave me deep understandings of the database theory, the complexity theory, and the Semantic Web technologies, which are the foundations of this thesis. In the DLV team, Francesco Ricca, Wolfgang Faber and Nicola Leone developed the DLV system and always answered all my questions related to the system. Worarat Krathu provided the use cases and benchmarks from business domain used in the thesis. Furthermore, I thank my master supervisor Zuoquan Lin who brought me into the field of AI and KR and encouraged me doing the PhD. I thank Guilin Qi, Yue Ma, Xiaowang Zhang and Kedian Mu for their encouragements and the happy collaborations on a variety of research topics. I also want to thank my Chinese friends in Vienna and Munich. They let me feel that I am at home. With them, the life was much more interesting. Finally, I want to thank my family and in particular my parents for their loves and understandings. My special thanks goes to, of course, my girlfriend Linfang Ding. Without her support and love, this thesis would never exist. v This thesis was partially funded by the EU Project OntoRule (IST-2009-231875), Austrian Science Fund (FWF) Project Reasoning in Hybrid Knowledge Bases (P20840), and the Vienna PhD School of Informatics. Abstract The deployment of knowledge representation formalisms to the Web has created the need for hybrid formalisms that combine heterogeneous knowledge bases (KBs). There are many formalisms proposed by the knowledge representation community for modeling knowledge bases, among which two families are of great importance, in particular in the context of semantic Web. One family is Description Logics (DLs) based ontologies, and the other is rule-based logic programming. The challenge of formalisms of combining ontologies and rules has been drawing a lot of attentions in recent years. Among several proposed approaches, loose coupling of rules and ontologies aims at combining respective knowledge bases by means of a clean interfacing semantics, in which roughly speaking inferences are mutually exchanged such that the one KB takes the imported information into account, and exports in turn conclusions to the other KB. This approach is fostered by non-monotonic description logic (dl-) programs, where this exchange is handled by a generalization of the answer set semantics of non-monotonic logic programs. Because of the loose coupling nature of dl-programs, one can build engines for dl-programs on top of legacy reasoners. For instance, the DLVHEX system with dl-plugin, which is a state- of-the-art system for dl-programs, is built on top of the ASP reasoner DLV and the DL reasoner RacerPro. Although this architecture is very elegant, the performance of this implementation is suboptimal. We observe that the overhead of calling external reasoners in the classical approach can be the bottleneck of the performance. The aim of this thesis is to improve the reasoning efficiency over hybrid KBs. We propose a new strategy, called inline evaluation, which compiles the whole hybrid KB into a new KB using only one single formalism. Hence we can use a single reasoner to do the reasoning tasks, and improve the efficiency of hybrid reasoning. In case of dl-programs, we design an abstract framework rewriting dl-programs to Datalog program with negation, by compiling every com- ponents carefully and combine them together into a single program. The reduction is sound and complete when the DL ontologies are “Datalog-rewritable”. We show that many DLs are Datalog-rewritable by introducing concrete rewriting algorithms. Furthermore, we show that inline evaluation can be used in hybrid KBs of other formalisms. To confirm the hypothesis that the inline evaluation is superior to the classical approach of “ASP + external DL reasoner”, we implement the inline evaluation method in the novel DReW system for dl-programs. We conduct extensive evaluations on several benchmark suites and show that DReW system outperforms the classical approach in general, especially for dl-programs of complex structures or with large instances. vii Contents 1 Introduction 1 1.1 Combining Rules and Ontologies . 1 1.2 Contributions . 3 1.3 Evolution of this Work . 4 1.4 Thesis Organization . 4 2 Preliminaries 7 2.1 First-order Logic . 7 2.2 Computational Complexity . 11 2.2.1 Turing Machine . 11 2.2.2 Complexity Classes . 12 2.2.3 Reductions . 13 2.2.4 Oracle Turing Machine and Polynomial Hierarchy . 14 2.3 Description Logics and OWL . 14 2.3.1 Expressive Description Logics . 14 2.3.2 Lightweight Description Logics . 22 2.3.3 Practical Considerations . 23 2.4 Datalog Family and Logic Programming . 27 2.4.1 Syntax . 28 2.4.2 Semantics . 29 2.4.3 Computational Complexities . 32 2.4.4 Practical Considerations . 34 2.5 Description Logics vs Logic Programming . 35 2.5.1 Unique Name Assumption . 35 2.5.2 Open domain vs close domain . 36 2.5.3 Open world vs close world . 36 2.5.4 Strong negation and default negation . 37 2.5.5 Monotonicity . 37 2.5.6 Arities . 37 3 Hybrid Knowledge Bases 39 3.1 Loose Coupling Approaches . 39 3.1.1 DL-Programs . 40 ix 3.1.2 CQ-Programs . 45 3.1.3 F-Logic# KBs . 47 3.1.4 Defeasible Logic Rules on Top of Ontologies . 47 3.2 Tight coupling Approaches . 47 3.2.1 First-order Combinations . 48 3.2.2 + log and its Variants . 50 DL 3.3 Embedding approaches . 50 3.3.1 Hybrid MKNF . 51 3.3.2 Open Answer Set Programming . 51 3.3.3 Datalog± ................................. 51 3.3.4 FO(ID) . 51 3.3.5 Quantified Equilibrium Logic . 52 4 Inline Evaluation of DL-Programs 53 4.1 A Framework for Inline Evaluation . 53 4.2 + and OWL 2 RL . 59 LDL 4.2.1 The Description Logic + ...................... 59 LDL 4.2.2 + is Datalog-rewritable . 65 LDL 4.3 OWL 2 EL . 72 4.4 Horn- ................................... 75 SHIQ 4.4.1 Syntax and Semantics of Horn- . 75 SHIQ 4.4.2 Canonical Models . 76 4.5 Go beyond instance queries . 82 4.6 Discussion . 84 4.6.1 Direct Rewriting vs Reification based Rewriting . 84 4.6.2 Datalog Reasoner vs Proprietary Reasoner . 85 4.6.3 Relations to other Datalog rewritings . 85 4.6.4 Non-Horn Description Logics . 85 4.6.5 Operators and ............................ 86 [− \− 4.6.6 Relaxation of Datalog-rewritability . 86 5 Inline Evaluation of other Hybrid Knowledge Bases 89 5.1 Terminological Default Logics . 89 5.2 DL-safe Conjunctive Queries . 93 5.3 Conjunctive Queries and positive Weakly DL-safe KBs .

Load more