Learning Logic Programs by Explaining Failures

Rolf Morel, Andrew Cropper
{rolf.morel,andrew.cropper}@cs.ox.ac.uk

Abstract

Scientists form hypotheses and experimentally test them. If a hypothesis fails (is refuted), scientists try to explain the failure to eliminate other hypotheses. We introduce similar explanation techniques for inductive logic programming (ILP). We build on the ILP approach learning from failures. Given a hypothesis represented as a logic program, we test it on examples. If a hypothesis fails, we identify clauses and literals responsible for the failure. By explaining failures, we can eliminate other hypotheses that will provably fail. We introduce a technique for failure explanation based on analysing SLD-trees. We experimentally evaluate failure explanation in the POPPER ILP system. Our results show that explaining failures can drastically reduce learning times.

1 Introduction

The process of forming hypotheses, testing them on data, analysing the results, and forming new hypotheses is the foundation of the scientific method [Popper, 2002]. For instance, imagine that Alice is a chemist trying to synthesise a vial of the compound octiron from substances thaum and slood. To do so, Alice can perform actions, such as fill a vial with a substance (fill(Vial,Sub)) or mix two vials (mix(V1,V2,V3)). One such hypothesis is:

synth(A,B,C) ← fill(V1,A), fill(V1,B), mix(V1,V1,C)

This hypothesis says to synthesise compound C, fill vial V1 with substance A, fill vial V1 with substance B, and then mix vial V1 with itself to form C.

When Alice experimentally tests this hypothesis she finds that it fails. From this failure, Alice deduces that hypotheses that add more actions (i.e. literals) will also fail (C1). Alice can, however, go further and explain the failure as "vial V1 cannot be filled a second time", which allows her to deduce that any hypothesis that includes fill(V1,A) and fill(V1,B) will fail (C2). Clearly, conclusion C2 allows Alice to eliminate more hypotheses than C1, that is, by explaining failures Alice can better form new hypotheses.

Our main contribution is to introduce similar explanation techniques for inductive program synthesis, where the goal is to machine learn computer programs from data [Shapiro, 1983]. We build on the inductive logic programming (ILP) approach learning from failures and its implementation called POPPER [Cropper and Morel, 2021]. POPPER learns logic programs by iteratively generating and testing hypotheses. When a hypothesis fails on training examples, POPPER examines the failure to learn constraints that eliminate hypotheses that will provably fail as well. A limitation of POPPER is that it only derives constraints based on entire hypotheses (as Alice does for C1) and cannot explain why a hypothesis fails (cannot reason as Alice does for C2).

We address this limitation by explaining failures. The idea is to analyse a failed hypothesis to identify sub-programs that also fail. We show that, by identifying failing sub-programs and generating constraints from them, we can eliminate more hypotheses, which can in turn improve learning performance. By the Blumer bound [1987], searching a smaller hypothesis space should result in fewer errors compared to a larger space, assuming a solution is in both spaces.

Our approach builds on algorithmic debugging [Caballero et al., 2017]. We identify sub-programs of hypotheses by analysing paths in SLD-trees. In similar work [Shapiro, 1983; Law, 2018], only entire clauses can make up these sub-programs. By contrast, we can identify literals responsible for a failure within a clause. We extend POPPER with failure explanation and experimentally show that failure explanation can significantly improve learning performance.

Our contributions are:

• We relate logic programs that fail on examples to their failing sub-programs. For wrong answers we identify clauses. For missing answers we additionally identify literals within clauses.
• We show that hypotheses that are specialisations and generalisations of failing sub-programs can be eliminated.
• We prove that hypothesis space pruning based on sub-programs is more effective than pruning without them.
• We introduce an SLD-tree based technique for failure explanation. We introduce POPPERX, which adds the ability to explain failures to the POPPER ILP system.
• We experimentally show that failure explanation can drastically reduce (i) hypothesis space exploration and (ii) learning times.
2 Related Work

Inductive program synthesis systems automatically generate computer programs from specifications, typically input/output examples [Shapiro, 1983]. This topic interests researchers from many areas of machine learning, including Bayesian inference [Silver et al., 2020] and neural networks [Ellis et al., 2018]. We focus on ILP techniques, which induce logic programs [Muggleton, 1991]. In contrast to neural approaches, ILP techniques can generalise from few examples [Cropper et al., 2020]. Moreover, because ILP uses logic programming as a uniform representation for background knowledge (BK), examples, and hypotheses, it can be applied to arbitrary domains without the need for hand-crafted, domain-specific neural architectures. Finally, due to logic's similarity to natural language, ILP learns comprehensible hypotheses.

Many ILP systems [Muggleton, 1995; Blockeel and Raedt, 1998; Srinivasan, 2001; Ahlgren and Yuen, 2013; Inoue et al., 2014; Schüller and Benz, 2018; Law et al., 2020] either cannot or struggle to learn recursive programs. By contrast, POPPERX can learn recursive programs and thus programs that generalise to input sizes it was not trained on. Compared to many modern ILP systems [Law, 2018; Evans and Grefenstette, 2018; Kaminski et al., 2019; Evans et al., 2021], POPPERX supports large and infinite domains, which is important when reasoning about complex data structures, such as lists. Compared to many state-of-the-art systems [Cropper and Muggleton, 2016; Evans and Grefenstette, 2018; Kaminski et al., 2019; Hocquette and Muggleton, 2020; Patsantzis and Muggleton, 2021], POPPERX does not need metarules (program templates) to restrict the hypothesis space.

Algorithmic debugging [Caballero et al., 2017] explains failures in terms of sub-programs. Similarly, in databases, provenance is used to explain query results [Cheney et al., 2009]. In seminal work on logic program synthesis, Shapiro [1983] analysed debugging trees to identify failing clauses. By contrast, our failure analysis reasons about concrete SLD-trees. Both ILASP3 [Law, 2018] and the remarkably similar ProSynth [Raghothaman et al., 2020] induce logic programs by precomputing every possible clause and then using a select-test-and-constrain loop. This precompute step is infeasible for clauses with many literals and restricts their failure explanation to clauses. By contrast, POPPERX does not precompute clauses and can identify clauses and literals within clauses responsible for failure.

POPPER [Cropper and Morel, 2021] learns first-order constraints, which can be likened to conflict-driven clause learning [Silva et al., 2009]. Failure explanation in POPPERX can therefore be viewed as enabling POPPER to detect smaller conflicts, yielding smaller yet more general constraints that prune more effectively.

3 Problem Setting

We now reiterate the LFF problem [Cropper and Morel, 2021] as well as the relation between constraints and failed hypotheses. We then introduce failure explanation in terms of sub-programs. We assume standard logic programming definitions [Lloyd, 2012].

3.1 Learning From Failures

To define the LFF problem, we first define predicate declarations and hypothesis constraints. LFF uses predicate declarations as a form of language bias, defining which predicate symbols may appear in a hypothesis. A predicate declaration is a ground atom of the form head_pred(p,a) or body_pred(p,a) where p is a predicate symbol of arity a. Given a set of predicate declarations D, a definite clause C is declaration consistent when two conditions hold: (i) if p/m is the predicate in the head of C, then head_pred(p,m) is in D, and (ii) for all predicate symbols q/n in the body of C, body_pred(q,n) is in D.

To restrict the hypothesis space, LFF uses hypothesis constraints. Let L be a language that defines hypotheses, i.e. a meta-language. Then a hypothesis constraint is a constraint expressed in L. Let C be a set of hypothesis constraints written in a language L. A set of definite clauses H is consistent with C if, when written in L, H does not violate any constraint in C.

We now define the LFF problem, which is based on the ILP learning from entailment setting [Raedt, 2008]:

Definition 3.1 (LFF input). A LFF input is a tuple (E+, E−, B, D, C) where E+ and E− are sets of ground atoms denoting positive and negative examples respectively; B is a Horn program denoting background knowledge; D is a set of predicate declarations; and C is a set of hypothesis constraints.

A definite program is a hypothesis when it is consistent with both D and C. We denote the set of such hypotheses as H_{D,C}. We define a LFF solution:

Definition 3.2 (LFF solution). Given an input tuple (E+, E−, B, D, C), a hypothesis H ∈ H_{D,C} is a solution when H is complete (∀e ∈ E+, B ∪ H ⊨ e) and consistent (∀e ∈ E−, B ∪ H ⊭ e).

If a hypothesis is not a solution then it is a failure and a failed hypothesis. A hypothesis H is incomplete when ∃e+ ∈ E+, H ∪ B ⊭ e+. A hypothesis H is inconsistent when ∃e− ∈ E−, H ∪ B ⊨ e−. A worked example of LFF is included in Appendix A.
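To make Definition 3.1 concrete, the droplast task from Appendix A can be written out as an LFF input. The sketch below uses plain Prolog facts; the pos/1 and neg/1 wrappers and the concrete clauses chosen for the body predicates are illustrative, not POPPER's actual input format.

% Predicate declarations D (taken from Appendix A).
head_pred(droplast,2).
body_pred(empty,1).
body_pred(head,2).
body_pred(tail,2).
body_pred(cons,3).
body_pred(droplast,2).    % permits recursive hypotheses

% Positive examples E+ and negative examples E-.
pos(droplast([1,2,3],[1,2])).
pos(droplast([1,2],[1])).
neg(droplast([1,2],[])).

% Background knowledge B: one suitable definition per body predicate.
empty([]).
head([H|_], H).
tail([_|T], T).
cons(H, T, [H|T]).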
3.2 Specialisation and Generalisation Constraints

The key idea of LFF is to learn constraints from failed hypotheses. Cropper and Morel [2021] introduce constraints based on subsumption [Plotkin, 1971] and theory subsumption [Midelfart, 1999]. A clause C1 subsumes a clause C2 if and only if there exists a substitution θ such that C1θ ⊆ C2. A clausal theory T1 subsumes a clausal theory T2, denoted T1 ⪯ T2, if and only if ∀C2 ∈ T2, ∃C1 ∈ T1 such that C1 subsumes C2. Subsumption implies entailment, i.e. if T1 ⪯ T2 then T1 ⊨ T2. A clausal theory T1 is a specialisation of a clausal theory T2 if and only if T2 ⪯ T1. A clausal theory T1 is a generalisation of a clausal theory T2 if and only if T1 ⪯ T2.

Hypothesis constraints prune the hypothesis space. Generalisation constraints only prune generalisations of inconsistent hypotheses. Specialisation constraints only prune specialisations of incomplete hypotheses. Generalisation and specialisation constraints are sound in that they do not prune solutions [Cropper and Morel, 2021].
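As a concrete reading of these definitions, the sketch below checks clause and theory subsumption in Prolog, with clauses represented as lists of literals and theories as lists of clauses. The predicate names and the grounding trick via copy_term/2 and numbervars/3 are our own choices; POPPER encodes such constraints declaratively rather than checking them like this.

:- use_module(library(lists)).

% theta_subsumes(+C1, +C2): clause C1 subsumes clause C2, i.e. there is a
% substitution Theta with C1.Theta a subset of C2. Standard check: ground a
% copy of C2, then map every literal of C1 onto a literal of the grounded C2,
% sharing variable bindings across the literals of C1.
theta_subsumes(C1, C2) :-
    \+ \+ ( copy_term(C2, C2Ground),
            numbervars(C2Ground, 0, _),
            all_in(C1, C2Ground) ).

all_in([], _).
all_in([Lit|Lits], C2) :-
    member(Lit, C2),
    all_in(Lits, C2).

% theory_subsumes(+T1, +T2): every clause of T2 is subsumed by a clause of T1,
% so T1 is a generalisation of T2 and T2 is a specialisation of T1.
theory_subsumes(T1, T2) :-
    forall(member(C2, T2),
           ( member(C1, T1), theta_subsumes(C1, C2) )).

% ?- theta_subsumes([droplast(A,B), tail(A,B)],
%                   [droplast(X,Y), tail(X,Y), empty(Y)]).
% true, with Theta = {A/X, B/Y}.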
3.3 Missing and Incorrect Answers

We follow Shapiro [1983] in identifying examples as responsible for the failure of a hypothesis H given background knowledge B. A positive example e+ is a missing answer when B ∪ H ⊭ e+. Similarly, a negative example e− is an incorrect answer when B ∪ H ⊨ e−. We relate missing and incorrect answers to specialisations and generalisations. If H has a missing answer e+, then each specialisation of H has e+ as a missing answer, so the specialisations of H are incomplete and can be eliminated. If H has an incorrect answer e−, then each generalisation of H has e− as an incorrect answer, so the generalisations of H are inconsistent and can be eliminated.

Example 1 (Missing answers and specialisations). Consider the following droplast hypothesis:

H1 = { droplast(A,B) ← empty(A),tail(A,B) }

Both droplast([1,2,3],[1,2]) and droplast([1,2],[1]) are missing answers of H1, so H1 is incomplete and we can prune its specialisations, e.g. programs that add literals to the clause.

Example 2 (Incorrect answers and generalisations). Consider the hypothesis H2:

H2 = { droplast(A,B) ← tail(A,C),tail(C,B)
       droplast(A,B) ← tail(A,B) }

In addition to being incomplete, H2 is inconsistent because of the incorrect answer droplast([1,2],[]), so we can prune the generalisations of H2, e.g. programs with additional clauses.

3.4 Failing Sub-programs

We now extend LFF by explaining failures in terms of failing sub-programs. The idea is to identify sub-programs that cause the failure. Consider the following two examples:

Example 3 (Explain incompleteness). Consider the positive example e+ = droplast([1,2],[1]) and the previously defined hypothesis H1. An explanation for why H1 does not entail e+ is that empty([1,2]) fails. It follows that the program H1′ = { droplast(A,B) ← empty(A) } has e+ as a missing answer and is incomplete, so we can prune all specialisations of it.

Example 4 (Explain inconsistency). Consider the negative example e− = droplast([1,2],[]) and the previously defined hypothesis H2. The first clause of H2 always entails e− regardless of other clauses in the hypothesis. It follows that the program H2′ = { droplast(A,B) ← tail(A,C),tail(C,B) } has e− as an incorrect answer and is inconsistent, so we can prune all generalisations of it.

We now define a sub-program:

Definition 3.3 (Sub-program). The definite program P is a sub-program of the definite program Q if and only if either:
• P is the empty set
• there exists Cp ∈ P and Cq ∈ Q such that Cp ⊆ Cq and P \ {Cp} is a sub-program of Q \ {Cq}

In functional program synthesis, sub-programs (sometimes called partial programs) are typically defined by leaving out nodes in the parse tree of the original program [Feng et al., 2018]. Our definition generalises this idea by allowing for arbitrary ordering of clauses and literals.
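Definition 3.3 translates almost directly into Prolog. The sketch below checks the sub-program relation for programs given as lists of clauses, each clause a list of literals; this representation and the syntactic (==) literal check are our own simplifications for illustration.

:- use_module(library(lists)).

% sub_program(+P, +Q): P is a sub-program of Q (Definition 3.3).
% Programs are lists of clauses; clauses are lists of literals.
sub_program([], _).
sub_program(P, Q) :-
    select(Cp, P, Prest),        % pick a clause Cp of P ...
    select(Cq, Q, Qrest),        % ... and a clause Cq of Q that contains it,
    sub_clause(Cp, Cq),
    sub_program(Prest, Qrest).   % then recurse on the remaining clauses.

% sub_clause(+Cp, +Cq): every literal of Cp occurs (syntactically) in Cq.
sub_clause([], _).
sub_clause([L|Ls], Cq) :-
    member_eq(L, Cq),
    sub_clause(Ls, Cq).

member_eq(X, [Y|_]) :- X == Y.
member_eq(X, [_|T]) :- member_eq(X, T).

% ?- sub_program([[droplast(A,B), empty(A)]],
%                [[droplast(A,B), empty(A), tail(A,B)]]).
% true (with the clauses sharing variables as written):
% H1' from Example 3 is a sub-program of H1.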
We now define the failing sub-programs problem:

Definition 3.4 (Failing sub-programs problem). Given the definite program P and sets of examples E+ and E−, the failing sub-programs problem is to find all sub-programs of P that do not entail an example of E+ or entail an example of E−.

By definition, a failing sub-program is incomplete and/or inconsistent, so, by Section 3.2, we can always prune specialisations and/or generalisations of a failing sub-program.

Remark 1 (Undecidability). The failing sub-programs problem is undecidable in general as deciding entailment can be reduced to it.

We show that sub-programs are effective at pruning:

Theorem 1 (Better pruning). Let H be a definite program that fails and P (≠ H) be a sub-program of H that fails. Specialisation and generalisation constraints for P can always achieve additional pruning versus those only for H.

Proof. Suppose H is a specialisation of P. If P is incomplete, then among the specialisations of P, which are all prunable, is H and its specialisations. If P is inconsistent, P's generalisations do not completely overlap with H's generalisations and specialisations (using that P ≠ H). Hence, pruning P's generalisations prunes programs not pruned by H. The case where H is a generalisation of P is analogous. In the remaining case, where H and P are not related by subsumption, it is immediate that the constraints derived for P prune a distinct part of the hypothesis space.

4 Implementing Failure Explanation

We now describe our failure explanation technique, which identifies sub-programs by identifying both clauses and literals within clauses responsible for failure. Subsequently we summarise the POPPER ILP system before introducing our extension of it: POPPERX.

4.1 SLD-trees and Sub-programs

In algorithmic debugging, missing and incorrect answers help characterise which parts of a debugging tree are wrong [Caballero et al., 2017]. Debugging trees can be seen as generalising SLD-trees, with the latter representing the search for a refutation [Nienhuys-Cheng and Wolf, 1997]. Exploiting their granularity, we analyse SLD-trees to address the failing sub-programs problem, only identifying a subset of them.

A branch in a SLD-tree is a path from the root goal to a leaf. Each goal on a branch has a selected atom, on which resolution is performed to derive child goals. A branch that ends in an empty leaf is called successful, as such a path represents a refutation. Otherwise a branch is failing. Note that selected atoms on a branch identify a subset of the literals of a program.

Let B be a Horn program, H be a hypothesis, and e be an atom. The SLD-tree T for B ∪ H ∪ {¬e}, with ¬e as the root, proves B ∪ H ⊨ e iff T contains a successful branch. Given a branch λ of T, we define the λ-sub-program of H. A literal L of H occurs in λ-sub-program H′ if and only if L occurs as a selected atom in λ or L was used to produce a resolvent that occurs in λ. The former case is for literals in the body of clauses and the latter for head literals. Now consider the SLD-tree T′ for B ∪ H′ ∪ {¬e} with ¬e as root. As all literals necessary for λ occur in B ∪ H′, the branch λ must occur in T′ as well.

Suppose e− is an incorrect answer for hypothesis H. Then the SLD-tree for B ∪ H ∪ {¬e−}, with ¬e− as root, has a successful branch λ. The literals of H necessary for this branch are also present in λ-sub-program H′, hence e− is also an incorrect answer of H′. Now suppose e+ is a missing answer of H. Let T be the SLD-tree for B ∪ H ∪ {¬e+}, with ¬e+ as root, and λ′ be any failing branch of T. The literals of H in λ′ are also present in λ′-sub-program H″. This is however insufficient for concluding that the SLD-tree for H″ has no successful branch. Hence it is not immediate that e+ is a missing answer for H″. In case that H″ is a specialisation of H we can conclude that e+ is a missing answer.
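To see how the selected atoms of a successful branch yield a λ-sub-program, consider the following minimal Prolog meta-interpreter. It assumes the hypothesis is stored clause by clause as hyp_clause(Id, Head, BodyLiterals) facts and that background predicates are listed by bk_predicate/1 and defined as ordinary Prolog; these representations and prove/2 itself are a simplification for illustration, not POPPERX's implementation. Each answer of prove/2 corresponds to one successful branch and returns the hypothesis literals used on it.

:- use_module(library(lists)).

% Hypothesis H2 from Example 4, stored clause by clause.
hyp_clause(c1, droplast(A,B), [tail(A,C), tail(C,B)]).
hyp_clause(c2, droplast(A,B), [tail(A,B)]).
% Background predicates, listed and defined as ordinary Prolog.
bk_predicate(tail/2).
tail([_|T], T).

% prove(+Atom, -Used): prove Atom against the hypothesis plus background
% knowledge; Used collects the hypothesis literals selected on this
% successful branch, i.e. a lambda-sub-program of the hypothesis.
prove(Atom, [head(Id, Atom)|Used]) :-
    hyp_clause(Id, Atom, Body),          % resolve with a hypothesis clause
    prove_body(Body, Id, 1, Used).
prove(Atom, []) :-
    functor(Atom, F, N),
    bk_predicate(F/N),                   % background atom: call Prolog directly
    call(Atom).

prove_body([], _, _, []).
prove_body([Lit|Lits], Id, Pos, Used) :-
    prove(Lit, UsedLit),                 % Lit is the selected atom here
    Pos1 is Pos + 1,
    prove_body(Lits, Id, Pos1, UsedRest),
    append([body(Id, Pos, Lit)|UsedLit], UsedRest, Used).

% ?- prove(droplast([1,2],[]), Used).
% succeeds via clause c1 only, so Used names exactly the literals of the
% inconsistent sub-program H2' from Example 4.

For an incorrect answer e−, any such answer therefore identifies an inconsistent sub-program, mirroring the argument above. Handling missing answers would additionally require inspecting the failing branches of the SLD-tree, which this sketch does not do.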
4.2 POPPER

POPPER tackles the LFF problem (Definition 3.1) using a generate, test, and constrain loop. A logical formula is constructed and maintained whose models correspond to Prolog programs. The first stage is to generate a model and convert it to a program. The program is tested on all positive and negative examples. The number of missing and incorrect answers determines whether specialisations [1] and/or generalisations can be pruned. When a hypothesis fails, new hypothesis constraints (Section 3.2) are added to the formula, which eliminates models and thus prunes the hypothesis space. POPPER then loops back to the generate stage.

Smaller programs prune more effectively, which is partly why POPPER searches for hypotheses by their size (number of literals) [2]. Yet there are many small programs that POPPER does not consider well-formed that achieve significant, sound pruning. Consider the sub-program H1′ = { droplast(A,B) ← empty(A) } from Example 3. POPPER does not generate H1′ as it does not consider it a well-formed hypothesis (as the head variable B does not occur in the body). Yet precisely because this sub-program has so few body literals is why it is so effective at pruning specialisations.

[1] POPPER generates elimination constraints when a hypothesis entails none of the positive examples [Cropper and Morel, 2021].
[2] The other reason is to find optimal solutions, i.e. those with the minimal number of literals.

4.3 POPPERX

We now introduce POPPERX, which extends POPPER with SLD-based failure explanation. Like POPPER, any generated hypothesis H is tested on the examples. However, additionally, for each tested example we obtain the selected atoms on each branch of the example's SLD-tree, which correspond to sub-programs of H. As shown, sub-programs derived from incorrect answers have the same incorrect answers. For each such identified inconsistent sub-program H′ of H we tell the constrain stage to prune generalisations of H′. Sub-programs derived from missing answers are retested, now without obtaining their SLD-trees. If a sub-program H″ of H is incomplete we inform the constrain stage to prune specialisations [3] of H″.

[3] As in POPPER, we prune by elimination constraints if no positive examples are entailed.

Pruning for sub-programs is in addition to the pruning that the constrain stage already does for H. This is important as H's failing sub-programs need not be specialisations/generalisations of H.
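Continuing the earlier prove/2 sketch, the outline below mimics the extra work POPPERX does after testing a hypothesis: every entailed negative example yields an inconsistent λ-sub-program whose generalisations can be pruned, and every non-entailed positive example triggers specialisation pruning. The pos/1 and neg/1 example facts and the printed messages stand in for the real constrain stage, which instead adds constraints to the generate stage's formula.

% test_hypothesis: sketch of POPPERX's test stage on top of prove/2.
test_hypothesis :-
    % Incorrect answers: each successful branch yields a failing sub-program
    % whose generalisations can be pruned.
    forall(( neg(E), prove(E, SubProgram) ),
           format("incorrect answer ~w: prune generalisations of ~w~n",
                  [E, SubProgram])),
    % Missing answers: prune specialisations (sub-programs from failing
    % branches would be retested here, as described in Section 4.3).
    forall(( pos(E), \+ prove(E, _) ),
           format("missing answer ~w: prune specialisations~n", [E])).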
5 Experiments

We claim that failure explanation can improve learning performance. Our experiments therefore aim to answer the questions:

Q1 Can failure explanation prune more programs?

Q2 Can failure explanation reduce learning times?

A positive answer to Q1 does not imply a positive answer for Q2 because of the potential overhead of failure explanation. Identifying sub-programs requires computational effort and the additional constraints could potentially overwhelm a learner. For example, as well as identifying sub-programs, POPPERX needs to derive more constraints, ground them, and have a solver reason over them. These operations are all costly.

To answer Q1 and Q2, we compare POPPERX against POPPER. The addition of failure explanation is the only difference between the systems and in all the experiments the settings for POPPERX and POPPER are identical. We do not compare against other state-of-the-art ILP systems, such as Metagol [Cropper and Muggleton, 2016] and ILASP3 [Law, 2018], because such a comparison cannot help us answer Q1 and Q2. Moreover, POPPER has been shown to outperform these two systems on problems similar to the ones we consider [Cropper and Morel, 2021].

We run the experiments on a 10-core server (at 2.2GHz) with 30 gigabytes of memory (note that POPPER and POPPERX only run on a single CPU). When testing individual examples, we use an evaluation timeout of 33 milliseconds.

5.1 Experiment 1: Robot Planning

The goal of this experiment is to evaluate whether failure explanation can improve performance when progressively increasing the size of the target program. We therefore need a problem where we can vary the program size. We consider a robot strategy learning problem. There is a robot that can move in four directions in a grid world, which we restrict to being a corridor (dimensions 1 × 10). The robot starts in the lower left corner and needs to move to a position to its right. In this experiment, failure explanation should determine that any strategy that moves up, down, or left can never succeed and thus can never appear in a solution.

Settings. An example is an atom f(s1,s2), with start (s1) and end (s2) states. A state is a pair of discrete coordinates (x,y). We provide four dyadic relations as BK: move_right, move_left, move_up, and move_down, which change the state, e.g. move_right((2,2),(3,2)). We allow one clause with up to 10 body literals and 11 variables. We use hypothesis constraints to ensure this clause is forward-chained [Kaminski et al., 2019], which means body literals modify the state one after another.
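A possible Prolog encoding of this background knowledge is sketched below. The state representation as (X,Y) pairs and the relation names follow the description above, but the concrete clauses, the bound checks, and the grid_width/1 and grid_height/1 helpers are our own guesses rather than the definitions used in the experiment.

% States are pairs (X,Y) of discrete coordinates; the corridor used in this
% experiment is 1 x 10, i.e. ten columns and a single row.
grid_width(10).
grid_height(1).

in_grid((X,Y)) :-
    grid_width(W), grid_height(H),
    X >= 0, X < W,
    Y >= 0, Y < H.

% Dyadic BK relations that change the state by one cell.
move_right((X,Y), (X1,Y)) :- X1 is X + 1, in_grid((X1,Y)).
move_left((X,Y),  (X1,Y)) :- X1 is X - 1, in_grid((X1,Y)).
move_up((X,Y),    (X,Y1)) :- Y1 is Y + 1, in_grid((X,Y1)).
move_down((X,Y),  (X,Y1)) :- Y1 is Y - 1, in_grid((X,Y1)).

% A forward-chained five-move solution then has the form:
% f(A,B) :- move_right(A,C1), move_right(C1,C2), move_right(C2,C3),
%           move_right(C3,C4), move_right(C4,B).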

Method. The start state is (0,0) and the end state is (n,0), for n in 1, 2, 3, ..., 10. Each trial has only one (positive) example: f((0,0),(n,0)). We measure learning times and the number of programs generated. We enforce a timeout of 10 minutes per task. We repeat each experiment 10 times and plot the mean and (negligible) standard error.

Results. Figure 1a shows that POPPERX substantially outperforms POPPER in terms of learning time. Whereas POPPERX needs around 80 seconds to find a 10 move solution, POPPER exceeds the 10 minute timeout when looking for a six move solution. The reason for the improved performance is that POPPERX generates far fewer programs, as failure explanation will, for example, prune all programs whose first move is to the left. For instance, to find a five literal solution, POPPER generates 1300 programs, whereas POPPERX only generates 62. When looking for a 10 move solution, POPPERX only generates 1404 programs in a hypothesis space of 1.4 million programs. These results show that, compared to POPPER, POPPERX generates substantially fewer programs and requires less learning time. The results from this experiment strongly suggest that the answer to questions Q1 and Q2 is yes.

[Figure 1 (plots omitted): (a) learning time in seconds and (b) number of generated programs, both plotted against program size, for POPPERX and the POPPER baseline.]
Figure 1: Results of robot planning experiment. The x-axes denote the number of body literals in the solution, i.e. the number of moves required.

5.2 Experiment 2: String Transformations

We now explore whether failure explanation can improve learning performance on real-world string transformation tasks. We use a standard dataset [Lin et al., 2014; Cropper, 2019] formed of 312 tasks, each with 10 input-output pair examples. For instance, task 81 has the following two input-output pairs:

Input                      Output
"Alex","M",41,74,170       M
"Carly","F",32,70,155      F

Settings. As BK, we give each system the monadic predicates is_uppercase, is_empty, is_space, is_letter, is_number and dyadic predicates mk_uppercase, mk_lowercase, skip1, copyskip1, copy1. For each monadic predicate we also provide a predicate that is its negation. We allow up to 3 clauses with 4 body literals and up to 5 variables per clause.

Method. The dataset has 10 positive examples for each problem. We perform cross validation by selecting 10 distinct subsets of 5 examples for each problem, using the other 5 to test. We measure learning times and number of programs generated. We enforce a timeout of 120 seconds per task. We repeat each experiment 10 times, once for each distinct subset, and record means and standard errors.

[Figure 2 (scatter plot omitted): ratio of learning time (y-axis) against ratio of generated programs (x-axis), one point per problem.]
Figure 2: String transformation results. The ratio of the number of programs that POPPERX needs versus POPPER is plotted against the ratio of learning time needed on that problem.

Results. For 52 problems both POPPER and POPPERX find solutions [4]. On 11 tasks POPPER times out, and on 7 of these in all trials. POPPERX finds solutions on these same 11 tasks, with timeouts in some trials on only 6 tasks. As relational solutions are allowed, many solutions are not ideal, e.g. allowing for optionally copying over a character.

[4] Note that these problems are very difficult with many of them not having solutions given only our primitive BK and with the learned program restricted to defining a single predicate. Therefore, absolute performance should be ignored. The important result is the relative performance of the two systems.

Figure 2 plots ratios of generated programs and learning times. Each point represents a single problem. The x-axis is the ratio of programs that POPPERX generates versus the number of programs that POPPER generates. The y-value is the ratio of learning time of POPPERX versus POPPER. These ratios are acquired by dividing means, the mean of POPPERX over that of POPPER.

Looking at x-axis values, of the 52 problems plotted 50 require fewer programs when run with POPPERX. Looking at the y-axis, the learning times of 51 problems are faster on POPPERX. Note that either failure explanation is very effective or its influence is rather limited, which we explore more in the next experiment.

Overall, these results show that, compared to POPPER, POPPERX almost always needs fewer programs and less time to learn programs. This suggests that the answer to questions Q1 and Q2 is yes.

5.3 Experiment 3: Programming Puzzles

This experiment evaluates whether failure explanation can improve performance when learning programs for recursive list problems, which are notoriously difficult for ILP systems. Indeed, other state-of-the-art ILP systems [Law, 2018; Evans and Grefenstette, 2018; Kaminski et al., 2019] struggle to solve these problems. We use the same 10 problems used by [Cropper and Morel, 2021] to show that POPPER drastically outperforms METAGOL [Cropper and Muggleton, 2016] and ALEPH [Srinivasan, 2001]. The 10 tasks include a mix of monadic (e.g. evens and sorted), dyadic (e.g. droplast and finddup), and triadic (dropk) target predicates. Some problems are functional (e.g. last and len) and some are relational (e.g. finddup and member).

Settings. We provide as BK the monadic relations empty, zero, one, even, odd, the dyadic relations element, head, tail, increment, decrement, geq, and the triadic relation cons. We provide simple types and mark the arguments of predicates as either input or output. We allow up to two clauses with five body literals and up to five variables per clause.
Method. We generate 10 positive and 10 negative examples per problem. Each example is randomly generated from lists up to length 50, whose integer elements are sampled from 1 to 100. We test on 1000 positive and 1000 negative randomly sampled examples. We measure overall learning time, number of programs generated, and predictive accuracy. We also measure the time spent in the three distinct stages of POPPER and POPPERX. We repeat each experiment 25 times and record the mean and standard error.

            Number of programs               Learning time (seconds)
Name        POPPER      POPPERX    ratio     POPPER       POPPERX      ratio
len         590 ± 4     60 ± 5     0.10      16 ± 0.2     1 ± 0.1      0.09
dropk       114 ± 0.7   13 ± 1     0.12      1 ± 0.01     0.3 ± 0.02   0.23
finddup     1223 ± 22   644 ± 14   0.53      53 ± 2       20 ± 0.7     0.38
member      57 ± 2      14 ± 0.7   0.24      0.6 ± 0.03   0.2 ± 0.01   0.41
last        232 ± 6     64 ± 6     0.28      3 ± 0.1      2 ± 0.1      0.48
evens       306 ± 2     278 ± 2    0.91      7 ± 0.06     7 ± 0.09     1.00
threesame   18 ± 4      15 ± 3     0.81      0.2 ± 0.04   0.2 ± 0.04   1.00
droplast    161 ± 9     148 ± 10   0.92      11 ± 0.9     11 ± 1       1.02
addhead     32 ± 3      31 ± 2     0.96      0.6 ± 0.04   0.6 ± 0.04   1.05
sorted      708 ± 40    599 ± 26   0.85      29 ± 3       31 ± 2       1.08

Table 1: Results for programming puzzles. Left, the average number of programs generated by each system. Right, the corresponding average time to find a solution. We round values over one to the nearest integer. The error is standard error.

[Figure 3 (bar chart omitted): relative time per stage for each problem, POPPER on the left bar and POPPERX on the right bar.]
Figure 3: Relative time spent in three stages of POPPERX and POPPER. From bottom to top: testing (in red), generating hypotheses (in blue), and imposing constraints (in orange). Times are scaled by the total learning time of POPPER, with POPPER's average time(s) on the left and POPPERX's on the right. Bars are standard error.

Results. Both systems are equally accurate, except on sorted where POPPER scores 98% and POPPERX 99%. Accuracy is 98% on dropk and 99% on both finddup and threesame. All other problems have 100% accuracy.

Table 1 shows the learning times in relation to the number of programs generated. Crucially, it includes the ratio of the mean of POPPERX over the mean of POPPER. On these 10 problems, POPPERX always considers fewer hypotheses than POPPER. Only on three problems is over 90% of the original number of programs considered. On the len problem, POPPERX only needs to consider 10% of the number of hypotheses.

As seen from the ratio columns, the number of generated programs correlates strongly with the learning time (0.96 correlation coefficient). Only on three problems is POPPERX slightly slower than POPPER. Hence POPPER can be negatively impacted by failure explanation, however, when POPPERX is faster, the speed-up can be considerable.

To illustrate how failure explanation can drastically improve pruning, consider the following hypothesis that POPPERX considers in the len problem:

f(A,B):- element(A,D),odd(D),even(D),tail(A,C),element(C,B).

Failure explanation identifies the failing sub-program:

f(A,B):- element(A,D),odd(D),even(D).

As should be hopefully clear, generating constraints from this smaller failing program, which is not a POPPER hypothesis, leads to far more effective pruning.

Figure 3 shows the relative time spent in each stage of POPPERX and POPPER and that any of the stages can dominate the runtime. For addhead, it is hypothesis generation (predominantly spent searching for a model). For finddup, it is constraining (mostly spent grounding constraints). More pertinently, droplast, the only dyadic problem whose output is a list, is dominated by testing.

We can also infer the overhead of failure explanation by analysing SLD-trees from Figure 3. All problems from last to sorted have POPPERX spend more time on testing than POPPER. On both last and sorted, POPPERX incurs considerable testing overhead. Whilst for last this effort translates into more effective pruning constraints, for sorted this is not the case. Abstracting away from the implementation of failure explanation, we see that POPPER outfitted with zero-overhead sub-program identification would have been strictly faster.

Overall, these results strongly suggest that the answer to questions Q1 and Q2 is yes.

6 Conclusions and Limitations

To improve the efficiency of ILP, we have introduced an approach for failure explanation. Our approach, based on SLD-trees, identifies failing sub-programs, including failing literals in clauses. We implemented our idea in POPPERX. Our experiments show that failure explanation can drastically reduce learning times.

Limitations. We have shown that identifying failing sub-programs will lead to more constraints and thus more pruning of the hypothesis space (Theorem 1), which our experiments empirically confirm. We have not, however, quantified the theoretical effectiveness of pruning by sub-programs, nor have we evaluated improvements in predictive accuracy, which are implied by the Blumer bound [Blumer et al., 1987]. Future work should address both of these limitations. Although we have shown that failure explanation can drastically reduce learning times, we can still significantly improve our approach. For instance, reconsider the failing sub-program f(A,B):- element(A,D),odd(D),even(D) from Section 5.3. We should be able to identify that the two literals odd(D) and even(D) can never both hold in the body of a clause, which would allow us to prune more programs. Finally, in future work, we want to explore whether our inherently interpretable failure explanations can aid explainable AI and ultra-strong machine learning [Michie, 1988; Muggleton et al., 2018].

References

[Ahlgren and Yuen, 2013] John Ahlgren and Shiu Yin Yuen. Efficient program synthesis using constraint satisfaction in inductive logic programming. JMLR, 2013.
[Blockeel and Raedt, 1998] Hendrik Blockeel and Luc De Raedt. Top-down induction of first-order logical decision trees. AIJ, 1998.
[Blumer et al., 1987] Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred K. Warmuth. Occam's razor. Inf. Process. Lett., 1987.
[Caballero et al., 2017] Rafael Caballero, Adrián Riesco, and Josep Silva. A survey of algorithmic debugging. ACM Comput. Surv., 2017.
[Cheney et al., 2009] James Cheney, Laura Chiticariu, and Wang Chiew Tan. Provenance in databases: Why, how, and where. Found. Trends Databases, 2009.
[Cropper and Morel, 2021] Andrew Cropper and Rolf Morel. Learning programs by learning from failures. Machine Learning, 2021. To appear.
[Cropper and Muggleton, 2016] Andrew Cropper and Stephen H. Muggleton. Learning higher-order logic programs through abstraction and invention. In IJCAI, 2016.
[Cropper et al., 2020] Andrew Cropper, Sebastijan Dumancic, and Stephen H. Muggleton. Turning 30: New ideas in inductive logic programming. In IJCAI, 2020.
[Cropper, 2019] Andrew Cropper. Playgol: Learning programs through play. In IJCAI, 2019.
[Ellis et al., 2018] Kevin Ellis, Lucas Morales, Mathias Sablé-Meyer, Armando Solar-Lezama, and Josh Tenenbaum. Learning libraries of subroutines for neurally-guided bayesian program induction. In NeurIPS, 2018.
[Evans and Grefenstette, 2018] Richard Evans and Edward Grefenstette. Learning explanatory rules from noisy data. JAIR, 2018.
[Evans et al., 2021] Richard Evans, José Hernández-Orallo, Johannes Welbl, Pushmeet Kohli, and Marek Sergot. Making sense of sensory input. Artificial Intelligence, 2021.
[Feng et al., 2018] Yu Feng, Ruben Martins, Osbert Bastani, and Isil Dillig. Program synthesis using conflict-driven learning. In PLDI, 2018.
[Hocquette and Muggleton, 2020] Céline Hocquette and Stephen H. Muggleton. Complete bottom-up predicate invention in meta-interpretive learning. In IJCAI, 2020.
[Inoue et al., 2014] Katsumi Inoue, Tony Ribeiro, and Chiaki Sakama. Learning from interpretation transition. Machine Learning, 2014.
[Kaminski et al., 2019] Tobias Kaminski, Thomas Eiter, and Katsumi Inoue. Meta-interpretive learning using hex-programs. In IJCAI, 2019.
[Law et al., 2020] Mark Law, Alessandra Russo, Elisa Bertino, Krysia Broda, and Jorge Lobo. Fastlas: scalable inductive logic programming incorporating domain-specific optimisation criteria. In AAAI, 2020.
[Law, 2018] Mark Law. Inductive learning of answer set programs. PhD thesis, Imperial College London, UK, 2018.
[Lin et al., 2014] Dianhuan Lin, Eyal Dechter, Kevin Ellis, Joshua B. Tenenbaum, and Stephen Muggleton. Bias reformulation for one-shot function induction. In ECAI, 2014.
[Lloyd, 2012] John W. Lloyd. Foundations of logic programming. Springer Science & Business Media, 2012.
[Michie, 1988] Donald Michie. Machine learning in the next five years. In EWSL, 1988.
[Midelfart, 1999] Herman Midelfart. A bounded search space of clausal theories. In ILP, 1999.
[Muggleton et al., 2018] S.H. Muggleton, U. Schmid, C. Zeller, A. Tamaddoni-Nezhad, and T. Besold. Ultra-strong machine learning - comprehensibility of programs learned with ILP. Machine Learning, 2018.
[Muggleton, 1991] Stephen Muggleton. Inductive logic programming. New Generation Comput., 1991.
[Muggleton, 1995] Stephen Muggleton. Inverse entailment and Progol. New Generation Comput., 1995.
[Nienhuys-Cheng and Wolf, 1997] Shan-Hwei Nienhuys-Cheng and Ronald de Wolf. Foundations of Inductive Logic Programming. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 1997.
[Patsantzis and Muggleton, 2021] S. Patsantzis and Stephen H. Muggleton. Top program construction and reduction for polynomial time meta-interpretive learning. Machine Learning, 2021.
[Plotkin, 1971] G.D. Plotkin. Automatic Methods of Inductive Inference. PhD thesis, Edinburgh University, August 1971.
[Popper, 2002] K.R. Popper. Conjectures and Refutations: The Growth of Scientific Knowledge. Routledge, 2002.
[Raedt, 2008] Luc De Raedt. Logical and relational learning. Cognitive Technologies. Springer, 2008.
[Raghothaman et al., 2020] Mukund Raghothaman, Jonathan Mendelson, David Zhao, Mayur Naik, and Bernhard Scholz. Provenance-guided synthesis of datalog programs. PACMPL, 2020.
[Schüller and Benz, 2018] Peter Schüller and Mishal Benz. Best-effort inductive logic programming via fine-grained cost-based hypothesis generation. Machine Learning, 2018.
[Shapiro, 1983] Ehud Y. Shapiro. Algorithmic Program Debugging. MIT Press, Cambridge, MA, USA, 1983.
[Silva et al., 2009] João P. Marques Silva, Inês Lynce, and Sharad Malik. Conflict-driven clause learning SAT solvers. In Handbook of Satisfiability, 2009.
[Silver et al., 2020] Tom Silver, Kelsey R. Allen, Alex K. Lew, Leslie Pack Kaelbling, and Josh Tenenbaum. Few-shot bayesian imitation learning with logical program policies. In AAAI, 2020.
[Srinivasan, 2001] A. Srinivasan. The ALEPH manual. 2001.

h1 = { droplast(A,B):- empty(A),tail(A,B). }
h2 = { droplast(A,B):- empty(A),cons(C,D,A),tail(D,B). }
h3 = { droplast(A,B):- tail(A,C),tail(C,B).
       droplast(A,B):- tail(A,B). }
h4 = { droplast(A,B):- empty(A),tail(A,B),head(A,C),head(B,C). }
h5 = { droplast(A,B):- tail(A,C),tail(C,B).
       droplast(A,B):- tail(A,B),tail(B,A). }
h6 = { droplast(A,B):- tail(A,B),empty(B).
       droplast(A,B):- cons(C,D,A),droplast(D,E),cons(C,E,B). }
h7 = { droplast(A,B):- tail(A,C),tail(C,B).
       droplast(A,B):- tail(A,B).
       droplast(A,B):- tail(A,C),droplast(C,B). }

H1 = { h1, h2, h3, h4, h5, h6, h7 }

Figure 4: LFF hypothesis space considered in Example 5.

A Appendix: LFF Example

Example 5. To illustrate LFF, consider learning a droplast/2 program. Suppose our predicate declarations D are head_pred(droplast,2), denoting that we want to learn a droplast/2 relation, and body_pred(empty,1), body_pred(head,2), body_pred(tail,2), and body_pred(cons,3). Suitable definitions for the provided body predicate declarations constitute our background knowledge B. To allow for learning a recursive program, we also supply the predicate declaration body_pred(droplast,2). Let e1+ = droplast([1,2,3],[1,2]), e2+ = droplast([1,2],[1]) and e1− = droplast([1,2],[]). Then E+ = {e1+, e2+} and E− = {e1−} are the positive and negative examples, respectively. Our initial set of hypothesis constraints C only ensures that hypotheses are well-formed, e.g. that each variable that occurs in the head of a rule occurs in the rule's body.

We now consider learning a solution for LFF input (E+, E−, B, D, C), where, for demonstration purposes, we use the simplified hypothesis space H1 ⊆ H_{D,C} of Figure 4. The order the hypotheses are considered in is by their number of literals. Pruning is achieved by adding additional hypothesis constraints. First we learn by a generate-test-and-constrain loop without failure explanation. This first sequence is representative of POPPER's execution:

1. POPPER starts by generating h1. B ∪ h1 fails to entail e1+ and e2+ and correctly does not entail e1−. Hence only specialisations of h1 get pruned, namely h4.

2. POPPER subsequently generates h2. B ∪ h2 fails to entail e1+ and e2+ and is correct on e1−. Hence specialisations of h2 get pruned, of which there are none in H1.

3. POPPER subsequently generates h3. B ∪ h3 does not entail the positive examples, but does entail negative example e1−. Hence specialisations and generalisations of h3 get pruned, meaning only generalisation h7.

4. POPPER subsequently generates h5. B ∪ h5 is correct on none of the examples. Hence specialisations and generalisations of h5 get pruned, of which there are none in H1.

5. POPPER subsequently generates h6. B ∪ h6 is correct on all the examples and hence is returned.

Now consider learning by a generate-test-and-constrain loop with failure explanation. The following execution sequence is representative of POPPERX:

1. POPPERX starts by generating h1. B ∪ h1 fails to entail e1+ and e2+ and correctly does not entail e1−. Failure explanation identifies sub-program h1′ = {droplast(A,B):- empty(A).}. h1′ fails in the same way as h1. Hence specialisations of both h1 and h1′ get pruned, namely h2 and h4.

2. POPPERX subsequently generates h3. B ∪ h3 does not entail the positive examples, but does entail negative example e1−. Failure explanation identifies sub-program h3′ = {droplast(A,B):- tail(A,C),tail(C,B).}. B ∪ h3′ fails in the same way as h3. Hence specialisations and generalisations of h3 and h3′ get pruned, meaning h5 and h7.

3. POPPERX subsequently generates h6. B ∪ h6 is correct on all the examples and hence is returned.

The difference in these two execution sequences is illustrative of how failure explanation can help prune away significant parts of the hypothesis space.