The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

Beyond the Grounding Bottleneck: Datalog Techniques for Inference in Probabilistic Logic Programs

Efthymia Tsamoura (Samsung AI Research), Víctor Gutiérrez-Basulto (Cardiff University), Angelika Kimmig (Cardiff University)

Abstract

State-of-the-art inference approaches in probabilistic logic programming typically start by computing the relevant ground program with respect to the queries of interest, and then use this program for probabilistic inference using knowledge compilation and weighted model counting. We propose an alternative approach that uses efficient Datalog techniques to integrate knowledge compilation with forward reasoning with a non-ground program. This effectively eliminates the grounding bottleneck that so far has prohibited the application of probabilistic logic programming in query answering scenarios over knowledge graphs, while also providing fast approximations on classical benchmarks in the field.

1 Introduction

The significant interest in combining logic and probability for reasoning in uncertain, relational domains has led to a multitude of formalisms, including the family of probabilistic logic programming (PLP) languages based on the distribution semantics (Sato 1995), with languages and systems such as PRISM (Sato 1995), ICL (Poole 2008), ProbLog (De Raedt, Kimmig, and Toivonen 2007; Fierens et al. 2015) and PITA (Riguzzi and Swift 2011). State-of-the-art inference for PLP uses a reduction to weighted model counting (WMC) (Chavira and Darwiche 2008), where the dependency structure of the logic program and the queries is first transformed into a propositional formula in a suitable format that supports efficient WMC. While the details of this transformation differ across approaches, a key part of it is determining the relevant ground program with respect to the queries of interest, i.e., all groundings of rules that contribute to some derivation of a query. This grounding step has received little attention, as its cost is dominated by the cost of constructing the propositional formula in typical PLP benchmarks that operate on biological, social or hyperlink networks, where formulas are complex. However, it has been observed that the grounding step is the bottleneck that often makes it impossible to apply PLP inference in the context of ontology-based data access over probabilistic data (pOBDA) (Schoenfisch and Stuckenschmidt 2017; van Bremen, Dries, and Jung 2019), where determining the relevant grounding explores a large search space, but only small parts of this space contribute to the formulas.

We address this bottleneck, building upon the TcP operator (Vlasselaer et al. 2015), which integrates formula construction into forward reasoning for ground programs and is state-of-the-art for highly cyclic PLP programs. Our key contribution is a program transformation approach that allows us to implement forward inference using an efficient Datalog engine that directly operates on non-ground functor-free programs. We focus on programs without negation for simplicity, though the TcP operator has been studied for general probabilistic logic programs (Bogaerts and Van den Broeck 2015; Riguzzi 2016) as well; the extension to stratified negation following (Vlasselaer et al. 2016) is straightforward.

We further build upon two well-known techniques from the Datalog community, namely semi-naive evaluation (Abiteboul, Hull, and Vianu 1995), which avoids recomputing the same consequences repeatedly during forward reasoning, and the magic sets transformation (Bancilhon et al. 1986; Beeri and Ramakrishnan 1991), which makes forward reasoning query driven. We adapt and extend both techniques to incorporate the formula construction performed by the TcP operator and implement our approach using VLog (Urbani, Jacobs, and Krötzsch 2016; Carral et al. 2019). Our experimental evaluation demonstrates that the resulting vProbLog system enables PLP inference in the pOBDA setting, answering each of the 14 standard queries of the LUBM benchmark (Guo, Pan, and Heflin 2011) over a probabilistic database of 19K facts in a few minutes at most, while most of these are infeasible for the existing ProbLog implementation of TcP. Furthermore, for ten of the queries, vProbLog computes exact answers over 1M facts in seconds. At the same time, on three standard PLP benchmarks (Fierens et al. 2015; Renkens et al. 2014; Vlasselaer et al. 2016) where the bottleneck is formula construction, vProbLog achieves comparable approximations to the existing implementation in less time.

For details on proofs as well as additional background, we refer to the accompanying technical report (Tsamoura, Gutiérrez-Basulto, and Kimmig 2019).

Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

2 Background

We provide some basics on probabilistic logic programming. We use standard notions of propositional logic and logic programming, cf. (Tsamoura, Gutiérrez-Basulto, and Kimmig 2019). We focus on the probabilistic logic programming language ProbLog (De Raedt, Kimmig, and Toivonen 2007; Fierens et al. 2015), and consider only function-free logic programs.

A rule (or definite clause) is a universally quantified expression of the form $h \mathbin{:\!-} b_1, \ldots, b_n$, where $h$ and the $b_i$ are atoms and the comma denotes conjunction. A logic program (or program for short) is a finite set of rules. A ProbLog program $P$ is a triple $(\mathcal{R}, \mathcal{F}, \pi)$, where $\mathcal{R}$ is a program, $\mathcal{F}$ is a finite set of ground facts, and $\pi : \mathcal{F} \to [0,1]$ is a function that labels facts with probabilities, often written using annotated facts $p :: f$ where $p = \pi(f)$. Without loss of generality, we restrict $\mathcal{R}$ to non-fact rules and include "crisp" logical facts $f$ in $\mathcal{F}$ by setting $\pi(f) = 1$. We also refer to a ProbLog program as a probabilistic program. As common in probabilistic logic programming (PLP), we assume that the sets of predicates defined by facts in $\mathcal{F}$ and rules in $\mathcal{R}$, respectively, are disjoint. A ProbLog program specifies a probability distribution over its Herbrand interpretations, also called possible worlds. Every fact $f \in \mathcal{F}$ independently takes value true with probability $\pi(f)$ or false with probability $1 - \pi(f)$.

For the rest of the section we fix a probabilistic program $P = (\mathcal{R}, \mathcal{F}, \pi)$. A total choice $C \subseteq \mathcal{F}$ assigns a truth value to every (ground) fact, and the corresponding logic program $C \cup \mathcal{R}$ has a unique least Herbrand model; the probability of this model is that of $C$. Interpretations that do not correspond to any total choice have probability zero. The probability of a query $q$ is then the sum over all total choices whose program entails $q$:

$$\Pr(q) := \sum_{C \subseteq \mathcal{F} \,:\, C \cup \mathcal{R} \models q}\ \prod_{f \in C} \pi(f) \cdot \prod_{f \in \mathcal{F} \setminus C} (1 - \pi(f)). \qquad (1)$$

As enumerating all total choices entailing the query is infeasible, state-of-the-art ProbLog inference reduces the problem to that of weighted model counting. For a formula $\lambda$ over propositional variables $V$ and a weight function $w(\cdot)$ assigning a real number to every literal for an atom in $V$, the weighted model count is defined as

$$\mathrm{WMC}(\lambda) := \sum_{I \subseteq V \,:\, I \models \lambda}\ \prod_{a \in I} w(a) \cdot \prod_{a \in V \setminus I} w(\neg a). \qquad (2)$$

The reduction assigns $w(f) = \pi(f)$ and $w(\neg f) = 1 - \pi(f)$ for facts $f \in \mathcal{F}$, and $w(a) = w(\neg a) = 1$ for other atoms.

We start by noting that one way to (abstractly) specify such a $\lambda$ is to take a disjunction over the conjunctions of facts in all total choices that entail the query of interest:

$$\lambda_q = \bigvee_{C \subseteq \mathcal{F} \,:\, C \cup \mathcal{R} \models q}\ \bigwedge_{f \in C} f$$

We next extend the immediate consequence operator $T_P$ for classic logic programs to construct parameterized interpretations associating a propositional formula with every atom. Recall that the $T_P$ operator is used to derive new knowledge starting from the facts. Let $P$ be a logic program. For a Herbrand interpretation $I$, the $T_P$ operator returns

$$T_P(I) = \{h \mid h \mathbin{:\!-} b_1, \ldots, b_n \in P \text{ and } \{b_1, \ldots, b_n\} \subseteq I\}$$

The least fixpoint of this operator is the least Herbrand model of $P$, i.e., the least set of atoms $I$ such that $T_P(I) \equiv I$. Let $T_P^k(\emptyset)$ denote the result of $k$ consecutive applications of $T_P$, and $T_P^\infty(\emptyset)$ the least fixpoint interpretation of $T_P$.

Let $HB(P)$ denote the set of all ground atoms that can be constructed from the constants and predicates occurring in a program $P$. A parameterized interpretation $\mathcal{I}$ of a probabilistic program $P$ is a set of tuples $(a, \lambda_a)$ with $a \in HB(P)$ and $\lambda_a$ a propositional formula over $\mathcal{F}$. We say that two parameterized interpretations $\mathcal{I}$ and $\mathcal{J}$ are equivalent, $\mathcal{I} \equiv \mathcal{J}$, if and only if they contain formulas for the same atoms and for all atoms $a$ with $(a, \varphi) \in \mathcal{I}$ and $(a, \psi) \in \mathcal{J}$, $\varphi \equiv \psi$.

Before defining the TcP operator for probabilistic programs, we introduce some notation. For a parameterized interpretation $\mathcal{I}$ of $P$, we define the set $B(\mathcal{I}, P)$ as

$$B(\mathcal{I}, P) = \{(h\theta, \lambda_1 \wedge \ldots \wedge \lambda_n) \mid (h \mathbin{:\!-} b_1, \ldots, b_n) \in P \wedge h\theta \in HB(P) \wedge \forall\, 1 \le i \le n : (b_i\theta, \lambda_i) \in \mathcal{I}\}$$

Intuitively, $B(\mathcal{I}, P)$ contains, for every grounding of a rule in $P$ with head $h\theta$ for which all body atoms have a formula in $\mathcal{I}$, the pair consisting of the atom and the conjunction of these formulas. Note the structural similarity with $T_P(I)$ above: the $\forall i$ condition in the definition of $B(\mathcal{I}, P)$ corresponds to the subset condition there; we include substitutions $\theta$ as our program is non-ground, and we store conjunctions along with the ground head.

Definition 1 (TcP operator). Let $\mathcal{I}$ be a parameterized interpretation of $P$. Then, the TcP operator is

$$Tc_P(\mathcal{I}) = \{(a, \lambda(a, \mathcal{I}, P)) \mid a \in HB(P),\ \lambda(a, \mathcal{I}, P) \not\equiv \bot\}$$

where
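To make the distribution semantics concrete, the following sketch evaluates Equation (1) by brute force on a hypothetical toy program (the fact and atom names `e_ab`, `p_ac`, etc. and all function names are our own, not from the paper): it enumerates total choices and tests entailment via the classical T_P fixpoint.

```python
from itertools import product

# Hypothetical toy ProbLog program: probabilistic edge facts and
# already-ground path rules (head, body) -- not taken from the paper.
facts = {"e_ab": 0.6, "e_bc": 0.7, "e_ac": 0.8}
rules = [
    ("p_ab", ["e_ab"]),
    ("p_bc", ["e_bc"]),
    ("p_ac", ["e_ac"]),
    ("p_ac", ["e_ab", "p_bc"]),
]

def tp_fixpoint(true_facts):
    """Least Herbrand model of C ∪ R via repeated T_P application."""
    model = set(true_facts)
    while True:
        derived = {h for h, body in rules if all(b in model for b in body)}
        if derived <= model:
            return model
        model |= derived

def prob(query):
    """Equation (1): sum over total choices whose program entails the query."""
    names = list(facts)
    total = 0.0
    for bits in product([True, False], repeat=len(names)):
        choice = {n for n, b in zip(names, bits) if b}
        if query in tp_fixpoint(choice):
            p = 1.0
            for n in names:
                p *= facts[n] if n in choice else 1 - facts[n]
            total += p
    return total

# p_ac holds if e_ac does, or if both e_ab and e_bc do:
# 0.8 + 0.2 * (0.6 * 0.7) = 0.884
print(round(prob("p_ac"), 6))
```

This is exactly the enumeration the paper calls infeasible: the loop is exponential in |F|, which motivates the WMC reduction below it in the text.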
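Equation (2) can likewise be checked by enumeration. In this sketch (again using our own hypothetical names, with λ_q written directly in DNF over the fact variables rather than constructed from total choices), the weighted model count under the reduction w(f) = π(f), w(¬f) = 1 − π(f) reproduces the query probability from Equation (1).

```python
from itertools import product

# Weights under the WMC reduction: w(f) = π(f), w(¬f) = 1 - π(f).
weights = {"e_ab": 0.6, "e_bc": 0.7, "e_ac": 0.8}

# A λ for a hypothetical query, written directly as a DNF over facts:
# the query is derivable iff e_ac holds, or both e_ab and e_bc hold.
lam = [{"e_ac"}, {"e_ab", "e_bc"}]  # disjunction of conjunctions

def wmc(dnf, names):
    """Equation (2): sum of weights over all assignments satisfying λ."""
    total = 0.0
    for bits in product([True, False], repeat=len(names)):
        model = {n for n, b in zip(names, bits) if b}
        if any(conj <= model for conj in dnf):  # does the model satisfy λ?
            w = 1.0
            for n in names:
                w *= weights[n] if n in model else 1 - weights[n]
            total += w
    return total

print(round(wmc(lam, list(weights)), 6))  # 0.884
```

Real systems do not enumerate assignments either; they compile λ into a format (such as an OBDD or SDD) on which the weighted count is computed efficiently, which is where knowledge compilation enters.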
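Finally, the idea behind B(I, P) and the TcP operator can be sketched as forward reasoning that carries formulas along. In this minimal illustration (a simplification under our own assumptions: a ground program, formulas kept in DNF as sets of sets of facts, and brute-force evaluation in place of the knowledge compilation used by actual TcP implementations), each derivation step conjoins the body formulas and disjoins the result into the head's formula.

```python
from itertools import product

# Same hypothetical ground program as before; names are not from the paper.
facts = {"e_ab": 0.6, "e_bc": 0.7, "e_ac": 0.8}
rules = [
    ("p_ab", ["e_ab"]),
    ("p_bc", ["e_bc"]),
    ("p_ac", ["e_ac"]),
    ("p_ac", ["e_ab", "p_bc"]),
]

def tcp_fixpoint():
    """Iterate a TcP-style step to a fixpoint of atom -> DNF formula."""
    # Each probabilistic fact labels itself: the pair (f, f).
    interp = {f: frozenset([frozenset([f])]) for f in facts}
    while True:
        new = dict(interp)
        for head, body in rules:
            if all(b in interp for b in body):
                # B(I, P): conjoin one disjunct per body atom (DNF product),
                # then disjoin the result into the head's current formula.
                conjs = {frozenset().union(*combo)
                         for combo in product(*(interp[b] for b in body))}
                new[head] = new.get(head, frozenset()) | conjs
        if new == interp:  # fixpoint: no formula changed
            return interp
        interp = new

def prob_of_formula(dnf):
    """Probability of a DNF over facts, by enumeration (sketch only)."""
    names = list(facts)
    total = 0.0
    for bits in product([True, False], repeat=len(names)):
        choice = {n for n, b in zip(names, bits) if b}
        if any(conj <= choice for conj in dnf):
            p = 1.0
            for n in names:
                p *= facts[n] if n in choice else 1 - facts[n]
            total += p
    return total

interp = tcp_fixpoint()
print(sorted(map(sorted, interp["p_ac"])))   # the two explanations of p_ac
print(round(prob_of_formula(interp["p_ac"]), 6))
```

The fixpoint pairs `p_ac` with the formula e_ac ∨ (e_ab ∧ e_bc), matching the λ_q specification, and its probability agrees with Equation (1); atoms whose formula stays empty (≡ ⊥) simply never enter the interpretation, mirroring the λ ≢ ⊥ condition in Definition 1.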