LOGIC PROGRAMMING AND THEOREM PROVING LOGIC PROGRAMMING AND THEOREM PROVING

LOGIC PROGRAMMING A-System: Problem Solving through Abduction Antonis C. Kakas Bert Van Nuffelen and Marc Denecker Dep. of Computer Science Dep. of Computer Science University of Cyprus K.U.Leuven B.O.Box 20537 Celestijnenlaan 200A

CY-1678 Nicosia, Cyprus B-3000 Leuven, Belgium g [email protected] fbertv, marcd @cs.kuleuven.ac.be

Abstract represented in a high-level declarative way with logic pro- gramming rules and classical first-order sentences (integrity This paper presents a new system, called the A- constraints). Our work aims to examine the possibility of de- System, performing abductive reasoning within the veloping ALP into a declarative problem solving framework, framework of Abductive Logic Programming. It is suitable for a variety of AI problems, that is computationally based on a hybrid computational model that imple- viable for problems of practical scale. ments the abductive search in terms of two tightly

A principled implementation of the A-System is developed coupled processes: a reduction process of the high- based on a hybrid computational model that formalizes the level logical representation to a lower-level con- abductive search for a solution in terms of two interleaving straint store and a lower-level constraint solving processes: the logical reduction of the high-level representa- process. A set of initial ”proof of principle” ex- tion of the problem and lower-level constraint solving. The periments demonstrate the versatility of the ap- abductive search is linked tightly to the construction of an as- proach stemming from its declarative representa- sociated constraint store. tion of problems and the good underlying compu- To validate our approach we have carried out a set of ”proof tational behaviour of the system. The approach of- of principle” experiments with an initial implementation of fers a general methodology of declarative problem the system that aim to test (a) the general underlying compu- solving in AI where an incremental and modular re-

tational behaviour of the A-System on different domains and finement of the high-level representation with extra (b) the flexibility of the framework to incorporate additional domain knowledge can improve and scale the com- problem specific knowledge in a modular and computation- putational performance of the framework. ally enhancing way. In particular, some of these experiments are designed to test the extend to which the high-level rep- 1 Introduction resentation of problems can be computationally improved via Over the last two decades it has become clear that abduction extra domain knowledge added to it. The experiments con- can play a central role in addressing a variety of problems sidered include constraint satisfaction problems and standard in Artificial Intelligence. These problems include diagnosis AI Planning Systems Competition problems. [Poole et al., 1987; Console et al., 1996], planning [Missiaen The A-System has been developed as a follow up of two [ ] et al., 1995; Kakas et al., 2000; Shanahan, 2000] knowledge earlier ALP systems, the ACLP system Kakas et al., 2000 [ ] assimilation and belief revision, [Inoue and Sakama, 1995; and SLDNFAC Denecker and Schreye, 1998 , bringing to- Pagnucco, 1996] multi-agent coordination,[Ciampolini et al., gether features from these two systems. The main devel- 2000; Kowalski and Sadri, 1999] and knowledge intensive opment in the design of the A-System over these previous learning [Muggleton, 2000; Mooney, 2000]. systems is the fact that the non-determinism in the abduct- The essential feature of this abductive approach to problem ive computation is now made explicit allowing the possibil- solving is the fact that it allows the application problems to ity of implementing this as a form of parameterized heuristic be formalized directly in their high-level declarative repres- search coupled with a process of deterministic propagation entation. A close link therefore emerges between declarative of the state of computation. The A-System is available from problem solving in AI and the logical reasoning of abduction. www.cs.kuleuven.ac.be/dtai/kt. However, despite this variety of applications for abduction and its potential benefits it has not been easy to develop gen- 2 Declarative Programming with Abduction eral systems for abduction that are computationally effective Declarative Problem Solving with a high-level representation for problems of practical size and complexity. of the problem at hand is closely related to abduction.The

This paper presents a new system, called the A-System, reason for this stems from the fact that in a declarative rep- that supports abductive reasoning within the framework of resentation where one describes the expert knowledge on the Abductive Logic Programming (ALP) [Kakas et al., 1998; problem domain rather than some method for its solution of- Denecker and Kakas, 2000]. In this framework problems are ten, the task of solving the problems consists in filling the

LOGIC PROGRAMMING AND THEOREM PROVING 591 missing information from this representation pertaining to the The constraint predicates are defined, as in CLP, by an un-

goal at hand. Typically, in a logical framework of represent- derlying constraint theory which is independent of any par-

È; A; Á C µ ation the solution consists in finding the extension of some ticular abductive theory ´ . We will assume that the predicate(s) which are incompletely specified in the repres- constraint theory is a finite domain theory that also includes entation. The high-level theory describing the problem is then equality over logical terms. The formal details of this are extended to a new one such that the problem goal is satisfied. beyond the scope of this paper. Computing such extensions of the theory representing our In ALP the abductive problem task is defined as follows.

problem is an abductive task. Indeed, abduction as a problem

È ; A; Ì µ Definition 2.2 Given an abductive theory ´ and

solving method assumes that the general data structure for the Ä some query É, consisting of a conjunction of literals over ,

solution to a problem (or solution carrier) is at the predicate

¡  A an abductive solution (or explanation) for É is a set of level and hence that a solution is described in the same terms ground abducible atoms together with an answer substitution

and level as the problem itself.

È [ ¡  such that is consistent and:

A framework for problem solving with abduction therefore

È [ ¡ j= 8´ ´Éµµ

needs to be expressing enough to allow high-level declarative ¯

È [ ¡ j= ÁC: representations of complex problems with missing or incom- ¯ plete information. At the same time for such a framework to have a practical value it should provide ways of improving This definition is generic in that it defines the notion of an the computational effectiveness of this high-level represent- abductive solution in terms of any given semantics of standard

ation for any particular problem. One such framework that (constraint) Logic Programming (LP). Each particular choice = combines representational expressiveness and computational of semantics defines its own entailment relation j and hence flexibility is that of Abductive Constraint Logic Programming its own notion of what is an abductive solution. In the con- (ACLP) a framework that integrates together Abductive Lo- text of ID-logic, its inductive definition semantics essentially [ gic Programming (ALP) with methods from Constraint Logic coincides with the well-founded model semantics Gelder et ] 1 Programming (CLP). The A-system is developed within this al., 1991 for LP when this is a two valued model .Intherest framework. Let us briefly review the underlying framework of this paper we will be adopting this well-founded model of ALP. semantics for LP for the development of our A-System. A recent way to formalize ALP is to view this as a special A computed abductive solution ¡ therefore gives a ground

case of ID-logic [Denecker, 2000], a logic extending classical definition of the open predicates which in turn through the

[ ¡ logic with inductive definitions. Let a language of predicates, logic program, È , extends this to the defined predicates. As an example of an abductive theory we give below a

Ä, be given consisting of three disjoint types of predicates:

part of the theory representing a planning domain. In this,

È A

(i) D the defined predicates, (ii) the open or abducible

ÚehicÐe <  6= dÖ iÚ e , is the only open predicate, , , are con- predicates and (iii) C the constraint predicates. Then a theory

in ALP is defined as follows. straint predicates and all others are defined predicates. The

Ð ÓcaØiÓÒ ØiÑe

predicates Ú ehicÐ e, and are simple predicates

È; A; Á C µ

Definition 2.1 An abductive theory is a triple ´ defining the finite domain of variables, e.g. using CLP nota-

´ÌÖÙckµ: ÌÖÙckiÒ ½::½¼ where: tion Ú ehicÐ e for a problem where

we have ten trucks. È

¯ is a constraint logic program where only defined pre-

aØ´Ì Ö Ùck ; ÄÓc; Ì µ:

dicates appearing in the head of these rules. ÚehicÐe

ÚehicÐe aØ´Ì Ö Ùck ; ÄÓcµ; iÒiØ A ¯ is a set of ground abducible atoms over open predic-

2

:cÐ iÔÔed aØ´Ì Ö Ùck ; ÄÓc; ¼;̵ :

ates. ÚehicÐe

aØ´Ì Ö Ùck ; ÄÓc; Ì µ: ÚehicÐe

ÁC Ä

¯ is a set of first order formulae over , called integrity

Ú ehicÐ e´Ì Ö Ùck ; ÄÓc; E µ;

constraints. dÖ iÚ e

E<Ì;

ÚehicÐe aØ´Ì Ö Ùck ; ÄÓc; E ; Ì µ: cÐ iÔÔed

The representation of an application problem in an abduct- :

È; A; Á C µ

ive theory ´ splits, in this view of ALP in terms

cÐ iÔÔed Ú ehicÐ e aØ´ÌÖÙck;ÄÓc½;E;̵:

of ID-logic, into two parts. The set of rules in È represents

Ú ehicÐ e´Ì Ö Ùck ; ÄÓc¾;Cµ;

the expert’s strong definitional knowledge of the problem, i.e. dÖ iÚ e

¾ 6= ÄÓc½;

knowledge which fully determines one or a group of predic- ÄÓc

 C; C < Ì : ates, the defined predicates, in terms of other open predicates. E

Each rule represents a case in which the defined predicate in

ÔÖ ec Ú ehicÐ e´Ì Ö Ùck ; ÄÓc; Ì µ:

the head will be true; the program È is an exhaustive enumer-

´ÌÖÙckµ;

ation of the cases. The open or abducible predicates have no Ú ehicÐ e

´ÄÓcµ;

definition. But some general information about them may be Ð ÓcaØiÓÒ

´Ì µ;

available indirectly through the expert’s weaker assertional ØiÑe

ÔeØÖ ÓÐ ´Ì Ö Ùck ; Ì µ:

knowledge of the problem. This is represented by the the- iÒ A

ory ÁC of integrity constraints.Theset enumerates all the Ì Ö c; ÄÓc; Ì :

possible missing ”values” for the open predicates. Abductive 8

Ú ehicÐ e´Ì Ö c; ÄÓc; Ì µ ! ÔÖ ec Ú ehicÐ e´Ì Ö c; ÄÓc; Ì µ: dÖ iÚ e solutions are build from A giving a ground (partial) definition of these open predicates. 1The formal details of this are beyond the scope of this paper.

592 LOGIC PROGRAMMING AND THEOREM PROVING The initial state of the problem is given by a applying a suitable inference rule yielding a new state. The

set of facts in the logic program È including state- main inference rules are given by the following rewrite rules

Ú ehicÐ e aØ´Ì Ö Ùck ; ÄÓcµ Ë

ments of the form iÒiØ ,e.g. (different elements of are separated by ”;”):

iÒiØ Ú ehicÐ e aØ´ØÖ Ùck ½; ciØÝ ½ ½µ:

¯ Դص µ B [Ø] Ô Ô´X µ

i if is a defined predicate and

B [ X ]

i arule.

3 A-System and Abductive Search

Դص µ Ø = × Ô Ô´×µ ¾Ë ¯ if is an open predicate and .

The computation of an abductive solution in the A-System

¯ :A µ A can be seen as a process of reduction of the query, É,toa .

set of abducible hypotheses on the open predicates together

¯ 8X: Ô´Ø µ;É µ F ; :::; F Ô Ò

½ ( a defined predicate)

¼ C

with an associated constraint store Ë of constraints over the

F = 8X ; X : B [Ø; X ];É

i i i where i for each rule

general finite domain constraint theory supported by the par- ¼

Ô´Y µ B [Y;X ] X  X i ticular ALP framework. This is similar to the computation of i ,( ).

CLP with two important differences.

¯ 8X: Դص;É µ 8X: Դص;É; F ; :::; F Ô Ò ½ ( an

First the reduction involves through the hypotheses also a ¼

F = 8X : Ø = ×; É

open predicate) where i for each ÁC

reduction of the integrity constraints . Effectively, when ¼

Ô´× µ Ë X  X a new abductive hypothesis is made the integrity constraints i an abduced atom in ,( ).

need to be re-evaluated to remain true. This satisfaction of

8X: C; É µ:C; C ¯ if is a constraint atom without

the ÁC can result in new constraints for the constraint store universally quantified variables. C

Ë and possibly new goals at the same level as the original

8X: C; É µ C ; 8X: É; C query. The second difference with CLP is the way that the ¯ if is a constraint atom

without universally quantified variables. C

constraint store Ë is used in this reduction. As we will see

8X: :A; É µ A A in the next subsection, this is not merely a passive store of ¯ if does not contain universally constraints to be evaluated at the end of the reduction but it quantified variables. is used actively during the reduction to affect the search. For

example, it enables reductions to be pruned early by setting A successful derivation terminates with a state Ë such that:

C Ë new constraints in provided that this remains satisfiable. 1. Ë contains positive goals only of the form of abducible atoms or constraint atoms,

3.1 Abductive Inference in the A-System

2. negative goals in Ë are denials containing some open A

The logical reduction of a query É in the -System can be

´Øµ atom Ô which has already been selected and resolved

defined as a derivation for É through a rewriting process

´×µ ¾Ë with each abduced atom Ô ,and

between states. We give here in brief the formal details of

´Ë µ Ë this. 3. the constraint store C of is satisfiable.

A computational restriction on the form of the integrity Otherwise, the derivation flounders when universally quan-

´È; A; Á C µ constraints, ÁC, of an abductive theory is imposed tified variables appear in the selected literal in a denial. If the in the A-System. These sentences must be (or should be derivation does not succeed or flounder then it fails. Note that transformed to classically equivalent sentences) of the form:

negation as failure in È is simply re-written (via the 3rd rule)

8:´Ð ;:::;Ð µ Ð

Ò i ½ where each is a literal. Such an integrity to a denial of the positive atom. constraint will also be called a denial or a negative goal

Let Ë be the final state of a successful derivation. Then any

Ð ;:::;Ð Ò and will be denoted by ½ . Also to simplify the

substitution  that assigns a ground term to each free variable

presentation, we assume that each rule in È is of the form

C ´Ë µ

of Ë and which satisfies the constraint store is called a

Ô´X µ B [X; X ]

i where all variables are (implicitly) uni-

solution substitution of Ë . Such a substitution always exists

[X; Y ]

versally quantified, referred to as free variables, and B

´Ë µ since C is satisfiable for a successful derivation.

is a conjunction of literals with free variables X occurring in

´È; A; Á C µ

the head and Y occurring only in the body. Rules are there- Theorem 3.1 Let be an abductive theory s.t.

j= ÁC É Ë

fore in homogeneous form where no non-variable term can È , a query, the final state of a successful de-

É  Ë

´cµ Õ ´Y µ

appear in their head e.g. a rule Ô will be written rivation for and a solution substitution of . Then the

 ´¡´Ë µµ  É

´X µ X = c; Õ ´Y µ equivalently as Ô . pair and is an abductive solution of .

A state Ë of the derivation consists of two types of ele- ments: a set of literals (possibly with free variables), called 3.2 Abductive Search

positive goals, and a set of denials, called negative goals, of The search for an abductive solution in the A-System based

X: F [X; Y ] F [X; Y ]

the form 8 where is a conjunction on the above proof theory, depends very closely on the com-

¡´Ë µ of literals and Y are free variables. denotes the set of puted constraint store. Typically, the abducible hypotheses

abducible atoms in Ë , i.e. positive goal atoms whose predic- made during the computation are non-ground atoms whose

´Ë µ

ate is an open predicate and C denotes the set of positive variables are restricted by the constraints in the associated É constraint atoms in Ë . A derivation for starts with an initial constraint store of the computation. This constraint store

state consisting of the literals of the query É as positive goals plays a central role not only in carrying (together with the

and all the denials in ÁC as negative goals. The rewriting abducibles) the solution but also in controlling the abductive

derivation proceeds by selecting a literal in a goal of Ë and search.

LOGIC PROGRAMMING AND THEOREM PROVING 593 In many respects the computation can be viewed as a pro- have a simple form are different possibilities that we are cur-

cess of constructing a constraint store from the high-level spe- rently exploring here in the implementation of the A-System. cification and query of the abductive theory as any decision After the phase of quasi-consistency the computation ex- that we take, e.g. which abducible to introduce or how to amines the other choice points left suspended, namely posit- satisfy an integrity constraint, extends differently the current ive goals which are non-deterministic, and again it can apply constraint store. Hence we can turn things around and use the a form of heuristic evaluation to choose amongst the differ- constraint store to guide the decisions that we make in this ent possibilities. Once a global decision is taken on all these abductive search. choice points the computation either returns to the top to re- The overall pattern of the abductive computation in the peat the process, when the choice leads to new positive goals,

A-System is as follows: or otherwise, it terminates successfully by labeling the con- 1. Deterministic Propagation of +ve and -ve Goals straint store and thus grounding the abductive solution.

This global approach, of the A-System to choice making 2. Suspend and Evaluate Choices where it tries to cover together as many choice points as pos- (a) Quasi-consistency of -ve Goals sible results in more informed decisions leading to less back- (b) Evaluate Choices on +ve goals tracking and thus to a better computational performance. It

also facilitates the use of heuristics by helping to parameter- G

3. Global set of Choices: new +ve Goals, ÒeÛ ize them and make them available at the top level. G 4. If ÒeÛ non-empty then Return to (1) An important parameter of the computation is the degree 5. Otherwise, Exit and Ground Solution. to which the constraint store is examined for its satisfiabil- ity (and other quality measures) during its construction. One The computation starts with a phase of deterministic possibility is to do a full check of the whole store at each propagation where all of the current goals that have only point where this grows. But this can be costly as a large part one possible rewriting are reduced until no such goals are of this maybe unnecessary. On the other hand an incomplete left in the resulting state of the computation. The purpose check may result in the late (after several choices) recognition of this phase is two-fold: (a) to expose and collect all the of the inconsistency of the constraint store. An intermediate choice points in the current state of the computation and (b) to possibility is to localizing the check to the latest addition to propagate the construction of the constraint store with all new the store, i.e to the variables of the current constraints and constraints that are necessarily imposed under the choices other variables connected to these by constraints already in made so far in the previous iteration steps. The updated con- the store. This induces a form of dynamic partition of the straint store is checked for satisfiability during this phase as constraint store into connected components that reflects, to it grows. This can have a significant effect on the computa- a certain extend, the high-level structure of the problem and tion as it enables us to detect early the ensuing failure of the query at hand and could help minimize the computation re- choices made prior to this before committing to other choices. quired to check the satisfiability of the store. All the choice points exposed by this deterministic phase are suspended and a new process of their evaluation begins. This explicit handling of the non-determinism in the com- 4 Experiments with the A-System putation allows the use of a parameterized form of heuristic A number of different experiments have been carried out to search where a variety of different types of heuristics can be test the underlying general computational behaviour of (an used. There are essentially two types of choices in the compu- initial implementation of) the A-System and how this can be tation. These are (1) a choice of which rule from the program affected by extending the high-level representation of prob- to use in rewriting a positive goal (cf. first and second rewrite lems with additional information. rule) and (2) a choice of which way to satisfy a denial (cf. last The current version of the system is implemented as a three rewrite rules). meta-program on top of Sicstus Prolog 3.8.5 and uses its finite The choice points of the second type are considered in a 3 domain constraint solver. It is developed as an experimenta- phase called quasi-consistency where we try to satisfy all tion tool to test different datastructures and strategies. Cur- the denials in the current state together if possible without rently a form of indexing on the abducibles and constraints is the introduction of new positive goals. Choices result in a used in the datastructures. With respect to the parameters of new possible constraint store, that we can compute by a sub- the abductive search presented in the previous section the cur- sidiary phase of deterministic propagation. The ”quality” of rent version implements only a very basic form of heuristics. this ensuing constraint store under various criteria, e.g. tight- In particular, the methods used for checking the satisfiability ness of its finite domain variables or minimal change in the of the constraint store during its construction are simple. current constraint store, forms a possible heuristic criterion. A first set of experiments consists of problems whose spe- A second criterion concerns the number and complexity of cification can be reduced to a finite domain constraint store in the new positive goals, that a choice would produce. For ex- one phase of reduction. Such problems include standard con- ample, preferring new goals which are deterministic or which straint satisfaction problems, e.g. N-Queens, Graph Coloring introduce new abducibles whose associated denials would and Scheduling problems. The aim here was to confirm that 3This name reflects the fact that at this stage of the computation the execution will be deterministic and to compare the execu- the satisfaction of the integrity constraint denials is in general con- tion time with that of solving the problem directly in CLP to tingent on a set of new goals. see the overhead cost of the reduction of the high-level rep-

594 LOGIC PROGRAMMING AND THEOREM PROVING resentation of the problem. The tables below show a sample actions. of these experiments for the N-Queens problem and a graph High level solutions coloring problem. For the graph coloring experiment a set of Problem Length Time(s) Max time points 4-colorable planar graphs were constructed. The run times 7-0 8/15 6.3 100 are split into the time needed to find an abductive solution 8-0 9/19 7.8 100 and the subsequent time needed to ground this by the CLP 9-0 8/16 34.1 (*) 100 labeling. All experiments reported in this paper are done on a 10-0 7/18 8.5 100 Linux machine (800 Mhz) with 512 MB of memory. 11-0 - - - N-Queens 12-0 9/18 298.4 (*) 100 size abductive solution (s) clp grounding 12-0 9/18 19.9 20 30 0.7 10ms Low level solutions 40 1.9 20ms Problem Length Time 50 4.1 160ms 7-0 21/40 40ms 60 8.9 160ms 8-0 21/47 50ms Graph Coloring 9-0 15/39 40ms size(nodes) abductive solution (s) clp grounding 10-0 18/46 50ms 80 1.9 10ms 12-0 24/45 50ms 100 3.7 20ms Some of the problems took a while to return a solution. Be- 200 21.3 20ms cause the finite domain solver is incomplete, the A-System 400 233.3 50ms may go on with a branch although the constraint store is Although the reduction of the whole specification is done already unsatisfiable (see the marked entries in the above in one step (no backtracking occurs over the abductive solu- tables). We tried different satisfiability checks but none of tion) in both experiments the construction time (abductive them was overall successful. A parameter in such a check solution) increases clearly with the size. So the metainter- is the domain size of the variables (see the last column of preter is some orders slower than the Prolog below, which is the table of the high level plans): by decreasing the maximal used in a CLP-program to set up the constraints. number of time points the satisfiability check fails faster when Another set of experiments was carried out on planning the constraint store is insatisfiable. As consequence some problems. These are non-deterministic problems where the problems could be solved in a reasonable time. Currently, we

abductive computation and search of the A-System can be are investigating how we can improve this satisfiablity check. . We selected from the latest AIPS2000 planning com- Compared to previous experiments with SLDNFAC and petition two domains: blocks world and logistics. The next ACLP, the A-System is more reliable and more robust to table shows the performance of the system on the blocks changes in the order of execution of the query. However be- world domain. For this experiment we used a specification cause the constraint solver is incomplete it still possible to with one action move(X,Y,T), which denotes moving a block have one derivation of A-System failing to find a solution and X onto Y at time T. another one finding it immediately. Another main difference

Problem Planlength Time(s) of the A-System with these previous systems is its separation 20-0 37 1.7 of search and inference. This allows easy experimentation 30-1 57 3.5 with several strategies and different degrees of deterministic 45-0 88 10.2 propagation. 55-1 105 17.2 60-0 114 30.8 5 Related Work and Conclusions The second considered domain is that of logistics. For this There are several other abductive systems for ALP that have domain a specification with functors and the general event recently been developed. Different methods have been used calculus is used. Futhermore the specification is two layered: e.g. bottom up computation [Iwayama and Satoh, 2000], the top layer is a compact high level description of the logist- tabling [Alferes et al., 1999] and rewriting rules with the com- ics domain. The lower level actions (from the original AIPS pletion [Fung and Kowalski, 1997; Kowalski et al., 1998].

strips description) are derived from the high level ones. The The A-System with its two earlier ALP systems, the ACLP

first table below shows the performance of the A-System on system and SLDNFAC, is to our knowledge the first abductive problems where the system has been used to compute only system that has paid particular attention to the computational high-level abstract plans. The length of the plans are given aspects of abduction using constraint solving extensively to by the final time when the goal is achieved (first number in control the abductive search and enhance its computational the column) and the number of abduced actions describing behaviour. the actual plan. A blank entry indicates that the system was On a more abstract level our work is related to Answer Set unable to find a solution within a given time limit. Programming(ASP) [Gelfond and Lifschitz, 1991]. Strong The results for the low level plans are presented in a second connections have been established [Satoh and Iwayama, table. These plans are computed in a separated phase using 1991] between ALP and ASP. At the level of semantics these as input the previous computed high level plans. This step two frameworks are for a large class of theories (ASP ad- expands the high level plans in number of time points and mits only a special type of integrity constraints) equivalent.

LOGIC PROGRAMMING AND THEOREM PROVING 595 At the level of computation our construction of an abductive [Fung and Kowalski, 1997] T.H. Fung and R.A. Kowalski.

solution in the A-System corresponds to the generation of a The iff procedure for abductive logic programming. Lo- model in ASP but there are some significant differences in the gic Programming, 33(2):151–165, 1997. respective computational models. It is therefore instructive [Gelder et al., 1991] A. Van Gelder, K.A. Ross, and J.S. to develop systematic experiments to compare systems from Schlipf. The well-founded semantics for general logic pro- these two approaches to declarative problem solving. grams. Journal of the ACM, 38(3):620–650, 1991.

The abductive search in the A-System can be parametrized in a variety of ways to reflect the type of cooperation with [Gelfond and Lifschitz, 1991] M. Gelfond and V. Lifschitz. constraint solver and the heuristics that are used. An im- Classical negation in logic programs and disjunctive data- portant future development of the system is to structure fur- bases. New Generation Computing, pp. 365-387, 1991. ther this parametric space and to study how more techniques [Inoue and Sakama, 1995] K. Inoue and C. Sakama. Abduct- from constraint programming and advanced heuristic search ive framework for nonmonotonic theory change. In Pro- can be incorporated in order to improve the general computa- ceedings of IJCAI-95, pages 204–210, 1995. tional behaviour of the framework on a variety of problems. [Iwayama and Satoh, 2000] N. Iwayama and K. Satoh. Com- In particular, we can study how recent heuristic methods for puting abduction by using tms with top-down expectation. [ ] planning Bonet and Geffner, 1999 can be generalized to the Logic Programming, 44(1-3):179–206, 2000. abductive computation of the A-System. This general improvement of the underlying computational [Kakas et al., 1998] A.C. Kakas, R.A. Kowalski, and F. Toni. efficiency of the ALP framework, or indeed of any other de- The role of abduction in logic programming. In D. Gabbay, clarative problem solving framework, is clearly limited as we C. Hogger, and J. Robinson, editors, Handbook of Logic are aiming to use the framework on a general variety of prob- in AI and Logic Programming, volume 5, pages 235–324. Oxford University Press, 1998. lems. A complementary line of development of the A-System and its associated ALP framework concerns the study of how [Kakas et al., 2000] A.C. Kakas, A. Michael, and C. Mour- to provide a programming environment where the user has the las. Aclp: Abductive constraint logic programming. facility to incrementally refine her/his high-level representa- Journal of Logic Programming: Special Issue on Abduct- tion of the problem in a modular way. A general methodology ive Logic Programming, 44(1-3):129–177, 2000. for declarative problem solving with abduction is emerging [Kowalski and Sadri, 1999] R.A. Kowalski and F. Sadri. where the ALP modeling environment provides the possibil- From logic programming to multi-agent systems. Annals ity for extra problem specific (declarative or control) know- of Mathemathics and Artificial Intelligence, 1999. ledge to be included that would enhance its computational performance. [Kowalski et al., 1998] R.A. Kowalski, F. Toni, and G. Wet- zel. Executing suspended logic programs. Fundamenta Informaticae, 34(3):203–224, 1998. References [Missiaen et al., 1995] L.R. Missiaen, M. Denecker, and [Alferes et al., 1999] J. J. Alferes, L. M. Pereira, and M. Bruynooghe. CHICA, an abductive planning system T. Swift. Well-founded abduction via tabled dual pro- based on event calculus. Journal of Logic and Computa- grams. In Proceedings of ICLP-99, 1999. tion, 5(5):579–602, 1995. [Bonet and Geffner, 1999] B. Bonet and H. Geffner. Plan- [Mooney, 2000] R.J. Mooney. Integrating abduction and in- ning as heuristic search: New results. In Proc. European duction in machine learning. In Abduction and Induction: Conference on Planning (ECP-99), 1999. essays on their relation and integration, pages 181–191. Kluwer Academic Press, 2000. [Ciampolini et al., 2000] A. Ciampolini, E. Lamma, P. Mello, and P. Torroni. An implementation for abductive [Muggleton, 2000] S. Muggleton. Theory completion in logic agents. In AI*IA’99, LNAI Vol. 1792, 2000. learning. In Inductive Logic Programming, ILP-00, 2000. [Console et al., 1996] L. Console, L. Portinale, and D. The- [Pagnucco, 1996] M. Pagnucco. The role of abductive reas- seider Dupr´e. Using compiled knowledge to guide and oning within the process of belief revision.PhDThesis, focus abductive diagnosis. IEEE Transactions on Know- Dept. of Computer Science, University of Sydney, 1996. ledge and Data Engineering, 8 (5):690–706, 1996. [Poole et al., 1987] D. Poole, R. Goebel, and R. Aleliunas. Theorist: A logical reasoning system for defaults and dia- [Denecker and Kakas, 2000] M. Denecker and A.C. Kakas. gnosis. In The Knowledge Frontier: Essays in the Repres- Abductive Logic Programming. Special issue of JLP, Vol entation of Knowledge, pages 331–352. Springer-Verlag, 44 (1-3), 2000. 1987. [ ] Denecker and Schreye, 1998 M. Denecker and D. De [Satoh and Iwayama, 1991] K. Satoh and N. Iwayama. Com- Schreye. Sldnfa: an abductive procedure for abductive lo- puting abduction by using the tms. In Proc. of ICLP’91, gic programs. Logic Programming, 34(2):111–167, 1998. pages 505–518, 1991. [Denecker, 2000] M. Denecker. Extending classical logic [Shanahan, 2000] M. Shanahan. An abductive event calculus with inductive definitions. In Proceedings of CL-2000, planner. Logic Programming, 44(1-3):207–239, 2000. 703-717, 2000.

596 LOGIC PROGRAMMING AND THEOREM PROVING A Comparative Study of Logic Programs with Preference

Torsten Schaub∗ and Kewen Wang Institut fur¨ Informatik, Universitat¨ Potsdam Postfach 60 15 53, D–14415 Potsdam, Germany torsten,kewen @cs.uni-potsdam.de { }

Abstract perspective, one can view (iii) as an axiomatization of the un- derlying strategy within the object language, while (i) may We are interested in semantical underpinnings for be regarded as a meta-level description of the corresponding existing approaches to preference handling in ex- construction process. One may view (ii) as the most seman- tended logic programming (within the framework tical characterization because it tells us which “models” of of answer set programming). As a starting point, the original program are selected by the respective preference we explore three different approaches that have handling strategy. been recently proposed in the literature. Because We limit (also in view of (iii)) our investigation to ap- these approaches use rather different formal means, proaches to preference handling that remain within NP. This we furnish a series of uniform characterizations excludes approach like the ones in [Rintanen, 1995; Zhang that allow us to gain insights into the relationships and Foo, 1997] that step outside the complexity class of the among these approaches. To be more precise, we underlying reasoning method. This applies also to the ap- provide different characterizations in terms of (i) proach in [Sakama and Inoue, 1996], where preferences on fixpoints, (ii) order preservation, and (iii) transla- literals are investigated. While the approach of [Gelfond tions into standard logic programs. While the two and Son, 1997] remains within NP, it advocates strategies former provide semantics for logic programming that are non-selective. Approaches that can be addressed with preference information, the latter furnishes within this framework include [Baader and Hollunder, 1993; implementation techniques for these approaches. Brewka, 1994] that were originally proposed for default logic.

1 Introduction 2 Definitions and notation Numerous approaches to logic programming with preference We assume a basic familiarity with logic programming un- information have been proposed in the literature. So far, how- der answer set semantics [Gelfond and Lifschitz, 1991]. An ever, there is no systematic account on their structural dif- extended logic program is a finite set of rules of the form ferences, finally leading to solid semantical underpinnings. We address this shortcoming by a comparative study of a dis- L0 L1,...,Lm, not Lm+1,..., not Ln, (1) tinguished class of approaches to preference handling. This ← where n m 0, and each L (0 i n) is a literal, ie. class consists of selective approaches remaining within the i either an≥ atom ≥A or its negation A.≤ The≤ set of all literals is complexity class of extended logic programming (under an- denoted by Lit. Given a rule r as¬ in (1), we let head(r) de- swer sets semantics). These approaches are selective insofar note the head, L0, of r and body(r) the body, L1,...,Lm, as they use preferences to distinguish certain “models” of the { + not Lm+1,..., not Ln , of r. Further, let body (r) = original program. } L1,,...,Lm and body−(r) = Lm+1,...,Ln . A pro- We explore three different approaches that have been re- { } { } cently proposed in the literature, namely the ones in [Brewka gram is called basic if body−(r) = for all its rules. ∅ + and Eiter, 1999; Delgrande et al., 2000; Wang et al., 2000]. We define the reduct of a rule r as r = head(r) + ← Our investigation adopts characterization techniques found in body (r). The reduct, ΠX , of a program Π relative to a set the same literature in order to shed light on the relationships X of literals is defined by among these approaches. This provides us with different X + Π = r r Π and body−(r) X = . characterizations in terms of (i) fixpoints, (ii) order preser- { | ∈ ∩ ∅} vation, and (iii) translations into standard logic programs. A set of literals X is closed under a basic program Π iff for While the two former provide semantics for logic program- any r Π, head(r) X whenever body+(r) X. We say ming with preference information, the latter furnishes im- that X∈is logically closed∈ iff it is either consistent⊆ (ie. it does plementation techniques for these approaches. From another not contain both a literal A and its negation A) or equals ¬ ∗ Affiliated with the School of Computing Science at Simon Lit. The smallest set of literals which is both logically closed Fraser University, Burnaby, Canada. and closed under a basic program Π is denoted by Cn(Π).

LOGIC PROGRAMMING AND THEOREM PROVING 597 Finally, a set X of literals is an answer set of a program Π body−(r) Xi = , or r0 or another rule with the same head X ∩ 6 ∅ iff Cn(Π ) = X. In what follows, we deal with consistent have already applied, viz. head(r0) Xi. ∈ answer sets only. As its original CΠ, the operator (Π,<) is anti-monotonic. X C The set ΓΠ of all generating rules of an answer set X from Accordingly, we may define for any set X Lit, the Π is given by alternating transformation of (Π, <) as ⊆ (X) = A(Π,<) X + (Π,<)( (Π,<)(X)). A fixpoint of (Π,<) is called an alter- ΓΠ = r Π body (r) X and body−(r) X = . C C A { ∈ | ⊆ ∩ ∅} nating fixpoint of (Π, <). Note that (Π,<) is monotonic. X A As van Gelder in [1993], we define CΠ(X) = Cn(Π ). Now, in analogy to Van Gelder [1993], a semantical frame- Note that the operator CΠ is anti-monotonic, which implies work for ordered logic programs in terms of sets of alternating that the operator AΠ(X) = CΠ(CΠ(X)) is monotonic. A fixpoints can be defined. Three different types of semantics 3 fixpoint of AΠ is called an alternating fixpoint for Π. Differ- are investigated in [Wang et al., 2000]: (i) Preferred answer ent semantics are captured by distinguishing different groups sets, viz. alternating fixpoints being also fixpoints of (Π,<). C 4 of fixpoints of AΠ. (ii) Preferred regular extensions, viz. maximal normal al- A(statically) ordered logic program1 is a pair (Π, <), ternating fixpoints of (Π, <). (iii) Preferred well-founded where Π is an extended logic program and < Π Π is model, viz. the least alternating fixpoint of (Π, <). ⊆ × an irreflexive and transitive relation. Given, r1, r2 Π, the We put the prefix ‘W-’ whenever a distinction to other ap- ∈ 2 relation r1 < r2 expresses that r2 has higher priority than r1. proaches is necessary. For illustration, consider the following ordered logic pro- 3 Preferred alternating fixpoints gram (Π2, <) due to [Baader and Hollunder, 1993]:

The notion of answer sets (without preference) is based on r1 : f p, not f r2 < r1 (2) ¬ ← a reduction of extended logic programs to basic programs r2 : w b, not w ← ¬ (without default negation). Such a reduction is inapplicable r3 : f w, not f ← ¬ when addressing conflicts by means of preference informa- r4 : b p ← tion since all conflicts between rules are simultaneously re- r5 : p solved when turning Π into ΠX . Rather conflict resolution ← must be characterized among the original rules in order to Observe that Π2 admits two answer sets: X = p, b, f, w { ¬ } account for blockage between rules. That is, once the neg- and X0 = p, b, f, w . As argued in [Baader and Hollunder, { } ative body body−(r) is eliminated there is no way to detect 1993], X is the unique W-preferred answer set. To see this, observe that whether head(r0) body−(r) holds in case of r < r0. ∈ Such an approach is pursued in [Wang et al., 2000] for X0 = X0 = ∅ 0 ∅ characterizing “preferred” answer sets. Following earlier ap- X1 = p X0 = p { } 1 { } proaches based on default logic [Baader and Hollunder, 1993; X2 = p, b, f X0 = p, b { ¬ } 2 { } Brewka, 1994], this approach is based on the concept of ac- X3 = p, b, f, w X0 = X0 = X0 { ¬ } 3 2 6 tiveness: Let X,Y Lit be two sets of literals in an ordered X4 = X3 = X logic program (Π, <⊆). A rule r in Π is active wrt the pair + Note that w cannot be included into X because r is active (X,Y ), if body (r) X and body−(r) Y = . 30 1 ⊆ ∩ ∅ wrt (X0,X0 ) and r1 is preferred to r2. Definition 1 (Wang et al.,2000) Let (Π, <) be an ordered 2 logic program and let X be a set of literals. We define 4 Compiling order preservation X0 = and for i 0 A translation of ordered logic programs (Π, <) to standard ∅ ≥ Xi+1 = Xi head(r) ones Π0 is developed in [Delgrande et al., 2000]. The specific ∪ { | strategy used there ensures that the resulting program Π0 ad- I . r Π is active wrt (Xi,X) and ∈  mits only those answer sets of the original program Π that are II . there is no rule r0 Π with r < r0 ∈  order preserving: such that  (a) r0 is active wrt (X,Xi) and Definition 2 Let (Π, <) be a statically ordered program and (b) head(r0) Xi  let X be an answer set of Π. 6∈  Then, X is called <-preserving, if there exists an enumer- Then, (Π,<)(X) = Xi if Xi is consistent. X i 0 i 0 ation ri i I of ΓΠ such that for every i, j I we have that: C S ≥ S ≥ h i ∈ ∈ Otherwise, (Π,<)(X) = Lit. + C 0. body (ri) head(rj) j < i ; and The idea is to apply a rule r only if the question of application ⊆ { | } 1. if ri < rj, then j < i; and has been settled for all higher-ranked rules r0. That is, if either + X its prerequisites will never be derivable, viz. body (r0) 2. if ri < r0 and r0 Π Γ , then 6⊆ ∈ \ Π X, or r0 is defeated by what has been derived so far, viz. + (a) body (r0) X or 1Also called prioritized logic program by some authors, as eg. in 6⊆ (b) body−(r0) head(rj) j < i = . [Brewka and Eiter, 1999]. ∩ { | } 6 ∅ 2Some authors, among them [Brewka and Eiter, 1999], attribute 3Originally called prioritized. 4 relation < the inverse meaning. An alternating fixpoint X is normal if X (Π,<)(X). ⊆ C

598 LOGIC PROGRAMMING AND THEOREM PROVING 5 Condition 0 makes the property of groundedness explicit. Definition 3 (Delgrande et al.,2000) Let Π = r1, . . . , rk Although any standard answer set is generated by a grounded be a dynamically ordered logic program over .{ } sequence of rules, we will see in the sequel that this property Then, the logic program (Π) over +Lis defined as T L is weakened when preferences are at issue. Condition 1 stip- (Π) = r Πτ(r) , where τ(r) consists of the following ulates that ri i I is compatible with <, a property invariant T S+ ∈ + rules, for L body (r), L− body−(r), and r0, r00 Π : to all of theh consideredi ∈ approaches. Lastly, Condition 2 is ∈ ∈ ∈ comparable with Condition II in Definition 1; it guarantees a1(r): head(r) ap(nr) ← that rules can never be blocked by lower-ranked rules. a2(r): ap(nr) ok(nr), body(r) + ← + As above, X = p, b, f, w is the only <-preserving b1(r, L ): bl(nr) ok(nr), not L { ¬ } ← answer set of Π2; it can be generated by the grounded se- b2(r, L−): bl(nr) ok(nr),L− quences r5, r4, r1, r2 and r5, r1, r4, r2 both of which sat- ← h i h i c1(r): ok(nr) ok0(nr, nr ), ..., ok0(nr, nr ) isfy conditions 1 and 2. The only grounded sequence gener- ← 1 k c2(r, r0): ok0(nr, nr ) not (nr nr ) ating X0 = p, b, f, w , namely r5, r4, r2, r3 , violates 2b. 0 ← ≺ 0 { } h i c3(r, r0): ok0(nr, nr ) (nr nr ), ap(nr ) The corresponding translation integrates ordering informa- 0 ← ≺ 0 0 c4(r, r0): ok0(nr, nr ) (nr nr ), bl(nr ) tion into the logic program via a special-purpose predicate 0 ← ≺ 0 0 symbol . This allows also for treating ordering information t(r, r0, r00): nr nr00 nr nr0 , nr0 nr00 in a dynamic≺ fashion. A logic program over a propositional ≺ ← ≺ ≺ as(r, r0): (nr0 nr) nr nr0 language is said to be dynamically ordered iff contains ¬ ≺ ← ≺ L L the following pairwise disjoint categories: (i) a set N of terms We write (Π, <) rather than (Π0), whenever Π0 is the dy- serving as names for rules; (ii) a set At of (propositional) namicallyT ordered program capturingT (Π, <). atoms of a program; and (iii) a set At of preference atoms τ(r) ≺ The first four rules of express applicability and block- s t, where s, t N are names. For each such program ing conditions of the original rules. The second group of rules Π,≺ we assume furthermore∈ a bijective function n( ) assigning · encodes the strategy for handling preferences. The first of to each rule r Π a name n(r) N . To simplify nota- c (r) Π ∈ ∈ these rules, 1 , “quantifies” over the rules in . This is tion, we usually write nr instead of n(r) (and we sometimes necessary when dealing with dynamic preferences since pref- abbreviate nri by ni). erences may vary depending on the corresponding answer set.

An atom nr nr0 At amounts to asserting that The three rules c2(r, r0), c3(r, r0), and c4(r, r0) specify the ≺ ∈ ≺ r < r0 holds. A statically ordered program (Π, <) can pairwise dependency of rules in view of the given preference thus be captured by programs containing preference atoms ordering: For any pair of rules r, r0 with nr nr , we de- only among their facts; it is then expressed by the program ≺ 0 rive ok0(nr, nr0 ) whenever nr nr0 fails to hold, or when- Π (nr nr ) r < r0 . ever either ap(n ) or bl(n ) is≺ true. This allows us to derive ∪ { ≺ 0 ← | } r0 r0 Given r < r0, one wants to ensure that r0 is considered be- ok(nr), indicating that r may potentially be applied when- fore r, in the sense that, for a given answer set X, rule r is 0 ever we have for all r0 with nr nr0 that r0 has been applied known to be applied or defeated ahead of r (cf. Condition II or cannot be applied. It is important≺ to note that this is only or 2 above, respectively). This is done by translating rules so one of many strategies for dealing with preferences: differ- that the order of rule application can be explicitly controlled. ent strategies are obtainable by changing the specification of For this purpose, one needs to be able to detect when a rule ok( ) and ok0( , ), as we will see below. has been applied or when a rule is defeated. For a rule r, there As· shown in· [·Delgrande et al., 2000], a set of literals X is a are two cases for it not to be applied: it may be that some lit- <-preserving answer set of Π iff X = Y for some answer + eral in body (r) does not appear in the answer set, or it may set Y of (Π, <). In the sequel, we refer∩L to such answer sets be that a literal in body−(r) is in the answer set. For detecting as being TD-preferred. non-applicability (i.e., blockage), for each rule r in the given program Π, a new, special-purpose atom bl(n ) is introduced. r 5 Synthesis Similarly, a special-purpose atom ap(nr) is introduced to de- tect the case where a rule has been applied. For controlling The last two sections have exposed three rather different ways application of rule r the atom ok(nr) is introduced. Infor- of characterizing preferred answer sets. Despite their differ- mally, one concludes that it is ok to apply a rule just if it is ok ent characterizations, however, it turns out that the two ap- with respect to every <-greater rule; for such a <-greater rule proaches prefer similar answer sets. r0, this will be the case just when r0 is known to be blocked or applied. 5.1 Characterizing D-preference More formally, given a dynamically ordered program Π over , let + be the language obtained from by adding, We start by providing a fixpoint definition for D-preference. L L L for each r, r0 Π, new pairwise distinct propositional atoms For this purpose, we assume a bijective mapping rule( ) from ∈ rule(head(r)) = r · ap(nr), bl(nr), ok(nr), and ok0(nr, nr0 ). Then, the transla- rule heads to rules, that is, ; accordingly, tion maps an ordered program Π over into a standard rule( head(r) r R ) = R. Such mappings can be programT (Π) over + in the following way.L defined{ in a bijective| ∈ way} by distinguishing different occur- T L rences of literals.

5This term is borrowed from the literature on default logic. Definition 4 Let (Π, <) be a statically ordered logic pro-

LOGIC PROGRAMMING AND THEOREM PROVING 599 gram and let X be a set of literals. We define (b) head(ri) head(rj) j < i ; and ∈ { | } 1. if r < r , then j < i; and X0 = and for i 0 i j ∅ ≥ X X = X head(r) 2. if ri < r0 and r0 Π Γ , then i+1 i ∈ \ Π ∪ { | + I . r Π is active wrt (Xi,X) and (a) body (r ) X or ∈  0 II . there is no rule r0 Π with r < r0 6⊆ ∈  (b) body−(r0) head(rj) j < i = or such that  ∩ { | } 6 ∅ (c) head(r0) head(rj) j < i . (a) r0 is active wrt (X,Xi) and ∈ { | } (b) r0 rule(Xi)  The primary difference of this concept of order preservation 6∈  to the original one is clearly the weaker notion of grounded- D X Then, (Π,<)(X) = i 0 Xi if i 0 Xi is consistent. ness. This involves the rules in Γ (via Condition 0b) as well C S ≥ S ≥ Π Otherwise, D (X) = Lit. as those in Π ΓX (via Condition 2c). The rest of the defini- C(Π,<) Π tion is the same\ as in Definition 2. For instance, answer set The difference between this definition and Definition 1 man- W a, b of Π3 is generated by the < -preserving rule sequence ifests itself in IIb. While D-preference requires that a higher- { } r3, r2 . Note that r1 satisfies 2c but neither 2a nor 2b. For a ranked rule has effectively applied, W-preference contents h i itself with the presence of the head of the rule, no matter complement, in (Π30 , <), r10 is dealt with via Condition 0b. whether this was supplied by the rule itself. Interestingly, this weaker notion of groundedness can be easily integrated into the translation given in the last section. This difference is nicely illustrated by program (Π3, <): Definition 6 Given the same prerequisites as in Definition 3. r1 : a not b r2 < r1 (3) W + ← Then, the logic program (Π) over is defined as r2 : b W T L ← (Π) = r Πτ(r) c5(r, r0) r, r0 Π , where r3 : a T S ∈ ∪ { | ∈ } ← c5(r, r0): ok0(nr, nr ) (nr nr ), head(r0) While the only answer set a, b is W-preferred set, there is 0 ← ≺ 0 { } no D-preferred answer set. This is the same with program The purpose of c5(r, r0) is to eliminate rules from the prefer- (Π30 , <) obtained by replacing r1 with r10 : a b. ence handling process once their head has been derived. We have the following result providing three← alternative We have the following result, showing in particular, how characterizations of D-preferred answer sets. W-preference is implementable via off-the-shelf logic pro- Theorem 1 Let (Π, <) be a statically ordered logic program gramming systems. over and let X be a consistent set of literals. Theorem 3 Let (Π, <) be a statically ordered logic program Then,L the following propositions are equivalent. over and let X be a consistent set of literals. Then, the D followingL propositions are equivalent. 1. (Π,<)(X) = X; C 1. (X) = X; 2. X = Y for some answer set Y of (Π, <); C(Π,<) ∩ L T 2. X = Y for some answer set Y of W(Π, <); 3. X is a <-preserving answer set of Π. ∩ L T W While the last result dealt with effective answer sets, the next 3. X is a < -preserving answer set of Π. D one shows that applying operator (Π,<) is equivalent to the In analogy to what we have shown above, we have the fol- C lowing stronger result, opening the avenue for implementing application of van Gelder’s operator CΠ0 to the translated pro- gram (Π, <) . more semantics based on W-preference:. T Theorem 2 Let (Π, <) be a statically ordered logic program Theorem 4 Let (Π, <) be a statically ordered logic program over and let X be a consistent set of literals over . over and let X be a consistent set of literals over . L L L D L Then, we have that (Π,<)(X) = C W (Π,<)(Y ) for Then, we have that (Π,<)(X) = C (Π,<)(Y ) for some C T ∩L C + T ∩ L set of literals Y over + such that X = Y . some set of literals Y over such that X = Y . L ∩ L L ∩ L This result is important because it allows us to use the trans- 6 Brewka and Eiter’s concept of preference lation (Π, <) for implementing further semantics by appeal to the alternatingT fixpoint idea. Another approach to preference was proposed by Brewka and Eiter in [1999]. For brevity, we omit technical details and 5.2 Characterizing W-preference simply say that an answer set is B-preferred; the reader is We start by showing how W-preference can be characterized referred to [Brewka and Eiter, 1999; 2000] for details. in terms of order preservation. This approach differs in two significant ways from the two approaches given above. First, the construction of answer sets Definition 5 Let (Π, <) be a statically ordered program and is separated from verifying whether they respect the given let X be an answer set of Π. preferences. Interestingly, this verification is done on the ba- Then, X is called

600 LOGIC PROGRAMMING AND THEOREM PROVING X the inference process. This is made explicit in [Brewka and 2. if ri < r0 and r0 Π Γ , then ∈ \ Π Eiter, 2000], where the following filtering transformation is + 6 (a) body (r0) X or defined: 6⊆ (b) body−(r0) head(rj) j < i = or ZX (Π) = Π r Π head(r) X, body−(r) X = ∩ { | } 6 ∅ \{ ∈ | ∈ ∩ 6 ∅} (c) head(r0) X. (4) ∈ Then, by definition, an answer set of Π is B-preferred iff it is This definition differs in two ways from its predecessors. a B-preferred answer set of ZX (Π). First, it drops any requirement on groundedness, expressed by The distinguishing example of this approach is given by Condition 0 above. This corresponds to using (X,X) instead program (Π5, <): of (Xi,X) in Definition 7. Hence, groundedness is fully dis- connected from order preservation. In fact, observe that the r1 : b a, not b r3 < r2 < r1 (5) ← ¬ B-preferred answer set a, b of (Π5, <) is associated with r2 : b not b B { } ¬ ← the < -preserving sequence r1, r2 , while the standard an- r3 : a not a h i ← ¬ swer set itself is generated by the grounded sequence r2, r1 . Second, Condition 2c is more relaxed than in Definitionh 5.i Program Π5 has two standard answer sets, a, b and a, b . While the former is B-preferred, neither of{ them} is W{- or¬ D}- That is, any rule r0 whose head is in X (as opposed to Xi) preferred (see below). Also, we note that both answer sets of is taken as “applied”. Apart from this, Condition 2c also in- tegrates the filter-conditions from (4).8 For illustration, con- program (Π2, <) are B-preferred, while only p, b, f, w is W- and D-preferred. { ¬ } sider Example (3) extended by r3 < r2: In order to shed some light on these differences, we start r1 : a not b r3 < r2 < r1 (6) by providing a fixpoint characterization of B-preference: ← r2 : b ← Definition 7 Let (Π, <) be an ordered logic program and let r3 : a X be a set of literals. We define ← While this program has no D- or W-preferred answer set, it has X0 = and for i 0 a B-preferred one: a, b generated by r2, r3 . The critical ∅ ≥ { } h i Xi+1 = Xi head(r) rule r1 is handled by 2c. As a net result, Condition 2 is weaker ∪ { | I . r Π is active wrt (X,X) and than its counterpart in Definition 5. ∈  We have the following results. II . there is no rule r0 Π with r < r0 ∈  such that  Theorem 5 Let (Π, <) be a statically ordered logic program (a) r0 is active wrt (X,Xi) and over and let X be a consistent answer set of Π. L (b) head(r0) Xi  Then, the following propositions are equivalent. 6∈  B B Then, (Π,<)(X) = i 0 Xi if i 0 Xi is consistent. 1. X is -preferred; C S ≥ S ≥ Otherwise, (Π,<)(X) = Lit. 2. B (X) = X; C C(ZX (Π),<) 7 The difference between this definition and its predecessors 3. X = Y for some answer set Y of B(Π, <) manifests itself in Condition I, where activeness is tested wrt (where ∩B Lis defined in [Delgrande et al.T , 2000]); (X,X) instead of (Xi,X) as in Definition 1 and 4. In fact, T in Example (5) it is the (unprovability of the) prerequisite a 4. X is a

LOGIC PROGRAMMING AND THEOREM PROVING 601 arguably a sensible strategy, it leads to the loss of preferred that the investigated approaches yield an increasing number answer sets on programs like of answer sets depending on how tight they connect prefer- ence to groundedness. r1 : a not b r2 < r1 ← An interesting technical result of this paper is given by the r2 : b . ← equivalences between the fixpoint operators and the standard Let us now discuss the differences among the approaches. logic programming operators applied to the correspondingly The difference between D- and W-preference can be directly transformed programs (cf. Theorem 2 and 4). This opens the read off Definition 1 and 4; it manifests itself in Condition IIb avenue for further concepts of preference handling on the ba- and leads to the following relationship. sis of the alternating fixpoint theory and its issuing semantics. Further research includes dynamic preferences and more ef- Theorem 6 Every D-preferred answer set is W-preferred. ficient algorithms for different semantics in a unifying way. Example (3) shows that the converse does not hold. Interestingly, a similar relationship is obtained between W- Acknowledgements. This work was supported by DFG un- and B-preference. In fact, Definition 8 can be interpreted as a weakening of Definition 5 by dropping Condition 0 and der grant FOR 375/1-1, TP C. weakening Condition 2 (via 2c). We thus obtain the following result. References Theorem 7 Every W-preferred answer set is B-preferred. [Baader and Hollunder, 1993] F. Baader and B. Hollunder. How to prefer more specific defaults in terminological de- Example (5) shows that the converse does not hold. fault logic. In Proc. IJCAI’93, p 669–674, 1993. We obtain the following summarizing result by letting [ ] (Π) = X CΠ(X) = X and P (Π, <) = X Brewka and Eiter, 1999 G. Brewka and T. Eiter. Preferred AS(Π) X{is P|-preferred for }P = WAS, D, B. { ∈ answer sets for extended logic programs. Artificial Intelli- AS | } gence, 109(1-2):297–356, 1999. Theorem 8 Let (Π, <) be a statically ordered logic program. Then, we have: [Brewka and Eiter, 2000] G. Brewka and T. Eiter. Prioritiz- ing default logic. In St. Holldobler,¨ ed, Intellectics and (Π, <) (Π, <) (Π, <) (Π) AS D ⊆ AS W ⊆ AS B ⊆ AS Computational Logic. Kluwer, 2000. To appear. In principle, this hierarchy is induced by a decreasing in- [Brewka, 1994] G. Brewka. Adding priorities and specificity teraction between groundedness and preference. While D- to default logic. In L. Pereira and D. Pearce, eds, Proc. preference requires the full compatibility of both concepts, JELIA’94, p 247–260. Springer, 1994. this interaction is already weakened in W-preference, before [Delgrande et al., 2000] J. Delgrande, T. Schaub, and it is fully abandoned in B-preference. This is nicely reflected H. Tompits. Logic programs with compiled preferences. by the evolution of Condition 0 in definitions 2, 5, and 8. In Proc. ECAI 2000, p 392–398. IOS Press, 2000. Notably, groundedness as such is not the ultimate distin- guishing factor, as demonstrated by the fact that prerequisite- [Gelfond and Lifschitz, 1991] M. Gelfond and V. Lifschitz. free programs do not necessarily lead to the same preferred Classical negation in logic programs and deductive answer sets, as witnessed in (3) and (6). Rather it is the databases. New Generation Computing, 9:365–385, 1991. degree of interaction between groundedness and preferences [Gelfond and Son, 1997] M. Gelfond and T. Son. Reasoning that makes the difference. with prioritized defaults. In J. Dix, L. Pereira, and T. Przy- musinski, eds, Workshop on Logic Programming and 8 Conclusion Knowledge Representation, p 164–223. Springer, 1997. The notion of preference seems to be pervasive in logic pro- [Rintanen, 1995] J. Rintanen. On specificity in default logic. gramming when it comes to knowledge representation. This In Proc. IJCAI’95, p 1474–1479. Morgan Kaufmann, is reflected by numerous approaches that aim at enhanc- 1995. ing logic programming with preferences in order to improve [Sakama and Inoue, 1996] C. Sakama and K. Inoue. Rep- knowledge representation capacities. Despite the large vari- resenting priorities in logic programs. In M. Maher, ed, ety of approaches, however, only very little attention has been Proc. JCSLP’96, p 82–96. MIT Press, 1996. paid to their structural differences and sameness, finally lead- ing to solid semantical underpinnings. [van Gelder, 1993] A. van Gelder. The alternating fixpoint This work is a first step towards a systematic account to of logic programs with negation. J. Computer and System logic programming with preferences. We elaborated upon Science, 47:185–120, 1993. three different approaches that were originally defined in [Wang et al., 2000] K. Wang, L. Zhou, and F. Lin. Al- rather heterogenous ways. We obtained three alternative ternating fixpoint theory for logic programs with prior- yet uniform ways of characterizing preferred answer sets (in ity. In Proc. Int’l Conf. Computational Logic, p 164-178. terms of fixpoints, order preservation, and an axiomatic ac- Springer, 2000. count). The underlying uniformity provided us with a deeper [Zhang and Foo, 1997] Y. Zhang and N. Foo. Answer sets understanding of how and which answer sets are preferred in for prioritized logic programs. In J. Maluszynski, ed, Proc. each approach. This has led to a clarification of their rela- ISLP’97, p 69–84. MIT Press, 1997. tionships and subtle differences. In particular, we revealed

602 LOGIC PROGRAMMING AND THEOREM PROVING Reasoning with infinite stable models

Piero A. Bonatti Dip. di Tecnologie dell’Informazione - Universita` di Milano I-26013 Crema, Italy [email protected]

Abstract we introduce a Turing-equivalent class of programs, called finitary programs, that opens the way to effective default and The existing proof-theoretic and software tools for autoepistemic reasoning about infinite domains. The main nonmonotonic reasoning can only handle finite do- theoretical features of finitary programs are the following: mains. In this paper we introduce a class of normal logic programs, called finitary programs, whose Their domain may be infinite. domain may be infinite, and such that credulous Nontheless both credulous and skeptical reasoning are and skeptical entailment under the stable model semi-decidable. Ground queries are decidable.

semantics are computable. Finitary programs— A form of compactness holds. that are characterized by two conditions on their dependency graph—are computationally complete Finitary programs are computationally complete, i.e. (they can simulate arbitrary Turing machines). Fur- each Turing machine can be simulated by some finitary ther results include a compactness theorem and the program. proof that the two conditions defining finitary pro- Finitary programs are defined by two conditions on their grams are, in some sense, “minimal”. The exist- dependency graph, that are minimal, in the sense that if ing methods for automated nonmonotonic reason- any condition were dropped, then semi-decidability and ing are either complete for finitary programs, or can compactness would not be guaranteed anymore. be easily extended to cover them. The paper is organized as follows: after a few technical preliminaries (Section 2), we characterize the subprogram 1 Introduction needed to reason about a given ground formula ¡ (Section 3). Then we introduce finitary programs (Section 4) and study There is a complexity gap between propositional and first- their theoretical properties and expressiveness. In Section 5, order non-monotonic theories. Finite propositional theories the completeness proof for the skeptical resolution calcu- are decidable (as in first-order logic), while the consequences lus introduced in [Bonatti, 1997] (originally formulated for of first-order theories are not recursively enumerable, in gen- function-free programs) is extended to all finitary programs eral. A computationally complete (Turing-equivalent) layer and a generalization thereof (almost finitary programs). Af- is missing between the two classes of theories. For this ter a brief sketch of how existing credulous reasoners can be reason, most of the research on automated reasoning and extended to finitary programs (Section 6), we conclude with a proof-theory for nonmonotonic logics has been focussed on discussion of the results and related work. Some of the proofs finite propositional theories or equivalent formalisms, such are omitted due to space limitations. as function-free logic programs. In this way, the ability of reasoning about recursive data structures and infinite do- mains (such as lists, trees, XML/HTML documents, time, 2 Preliminaries and so on) is completely lost. This is a strong limitation, We assume the reader to be familiar with the classical theory both for standard tasks—such as reasoning about action and of logic programming [Lloyd, 1984]. Normal logic programs

change when time is explicitly represented—and for emerg- (hereafter called simply “programs”) are sets of rules of the

 ¢

ing applications—such as using XML document bases as form ¢¤£¦¥¨§ © ©¥ ( ) such that is a logical atom © 

knowledge bases. and each ¥ ( ) is a literal. As usual by “head” and

¢ ¥ © © ¥  Of course, there may exist interesting, semi-decidable frag- “body” of such a rule we mean and § , respec-

ments of first-order nonmonotonic logics. For example, in tively. A program is positive if it contains no occurrences !

some cases, the second-order circumscription formula can be of . The ground instantiation of a program is denoted !10

expressed as a finite first-order formula. Similar examples are by "¨# $&%('*),+-!/. . The Gelfond-Lifschitz transformation of 2 missing so far for Default Logic and Autoepistemic Logic. program ! w.r.t. an Herbrand interpretation (represented, Normal logic programs under the stable model semantics as usual as a set of ground atoms) is obtained by removing

can be regarded as a fragment of both logics. In this paper from "3#$&%('4)*+5!/. all the rules containing a negative literal

LOGIC PROGRAMMING AND THEOREM PROVING 603

X X

6982 +5Y7+[Z>..\£ +[Z].^© _(+[Z>.

76 such that , and by removing from the remaining

; _(+-Z].\£J`S+[Z].

rules: all negative literals. An interpretation is a stable

a a

; !/< +-b(.\£J +-b(.

model of ! if is the least Herbrand model of . A for-

¡ ¡

c X

+[Z].d£ +[Z].

mula is credulously (resp. skeptically) entailed by ! iff

¡ X

is satisfied by some (resp. each) stable model of ! . +eY7+-b(..

and let  . Here !

The dependency graph of a program is a labelled di- ¡

X X a

TW+-!¨© .f g +5Y7+5bG. .R© +-b(.^© _(+-b(.^©`S+-b(.R© +5b(.4hi©

rected graph whose vertices are the ground atoms of ! ’s lan-

¡

X X

VU+-!¨© .f g1+ +5Y7+-b(. .\£ +5bG.^© _(+5bG. .R©O+-_(+5b(.£J`S+-b(..^© ¢

guage. Moreover, i) there exists an edge from 6 to iff there

a a ¢

is a rule =>8?"3#$&%('4)@+-!/. with in the head and an occur-

+ +5b(.\£j +-b(..Rh

rence of 6 in the body; ii) such edge is labelled “negative”

if 6 occurs in the scope of , and “positive” otherwise. An

¢ 6 =

atom depends positively (resp. negatively) on if there is Note that for all rules =]8k"¨# $C%('4)*+-!/. , if the head of be-

¡

6 ¢

. =

a directed path from to in the dependency graph with an longs to TW+-!¨© then also all the other atoms in belong to

¡ . even (resp. odd) number of negative edges. A program is call- TW+5!¨© . The next proposition follows easily.

consistent if no atom depends negatively on itself. By odd- ¡

. "¨# $&%('*),+-!/. Proposition 3.3 TW+5!¨© is a splitting set for ,

cycle we mean a cycle in the dependency graph with an odd ¡

. D(lnmporq s@tu+-!/. and VU+-!¨© coincides with . number of negative edges. Call-consistency coincides with

the absence of odd-cycles. Every call-consistent program has Now the basic technical results can be proved.

¡ ¡ .

at least one stable model [Dung, 1992]. Lemma 3.4 For all ground formulae , VU+-!¨© has a sta-

s

! ; ;vH

In the context of normal logic programs, a splitting set for ble model ; iff has a stable model such that

¡

s

TW+5!¨© .\w; a program ! [Lifschitz and Turner, 1994] is a set of atoms .

A containing all the atoms occurring in the body of any rule

; !

A Proof. (“If” part) Suppose is a stable model of .

¡ =8

=8B"¨# $C%('4),+5!/. whose head is in . The set of rules

TW+-!¨© . !

A By Proposition 3.3, is a splitting set for and

¡ !

"¨# $&%('*),+-!/. whose head is in —called the “bottom” of

VU+-!¨© .wxD(lnmporq s@tu+-!/.

A . Then, by the splitting theorem F&E3+5!¨© 2G.

w.r.t. —will be denoted by D E3+-!/. . By we denote A

[Lifschitz and Turner, 1994], there exist a stable model 2 of

¡ ¡ 2IH

the following partial evaluation of ! w.r.t. : remove

. L F*lnmporq s@tR+5!wPyVU+-!¨© .R©2S.

VU+-!¨© , and a stable model of ,

¡

"3#$&%('4)*+5!/. ¢¤£J¥ © ©¥ 

from each rule § such that some

A

TW+5!¨© .

such that ;K2¨MzL . By definition, no atom in oc-

¡ ¡

 2

¥ containing an atom of is false in , and remove from

A

s@tR+-!{PnVU+-!¨© .^© 2G. L|H>TW+5!¨© .\w}

curs in F*lnmporq , therefore .



¥ ¡

the remaining rules all the containing a member of . The ¡

.y2 ;~H¤TW+-!¨© . It follows that ;~HwTU+-!¨© , and hence

following is a specialization to normal programs of a result in ¡ . is a stable model of VU+-!¨© .

[Lifschitz and Turner, 1994]. ¡ .

(“Only if” part) Suppose VU+5!¨© has a stable model

A s

Theorem 2.1 (Splitting theorem) Let be a splitting set ; . By definition, all the atoms occurring in an odd-

¡

! ; .

for a normal logic program . An interpretation is a sta- cycle belong to TW+-!¨© . Consequently, the dependency

¡

! ;K¤LIMN2

s€t+5!ƒP4VU+-!¨© .R© 2G.

ble model of iff , where graph of F*l€mo‚q contains no odd-cycles, i.e.

¡

s@tu+-!POVU+-!¨© .R©2S.

F*lnmporq is call-consistent. Then, by a well- D EO+-!/.

1. 2 is a stable model of , and

¡

s€tR+-!{PnVW+5!¨© .^© 2G.

known result in [Dung, 1992], F*l€mo‚q has a FCE¨+5!QP¨D E3+-!/.R©2S.

2. L is a stable model of .

s ;„ LUMƒ;

stable model L . Let . By the splitting theorem, !

the interpretation ; is a stable model of . Moreover, since

¡ ¡

s

.} ;‡HTW+-!¨© .ˆ; 3 Relevant subprograms L†HTW+5!¨© (cf. point 1), .

This section contains the basic technical results needed to in- Theorem 3.5 For all ground formulae ¡ , ¡

vestigate decidability and undecidability issues. These results ¡

VW+5!¨© .

1. ! credulously entails iff does.

"¨# $C%('4),+5!/. ¡

prove that a certain strict subset of suffices to ¡

VW+5!¨© .

decide credulous and skeptical entailment. Intuitively, all is 2. ! skeptically entails iff does. ¡

needed for inference are the rules affecting the behavior of Proof. if ! credulously entails , then there exists a sta-

¡

! ; ‰ 

odd-cycles (which may generate inconsistencies in the form ble model ; of such that . By Lemma 3.4,

¡ ¡

. VU+-!¨© .

of instability) and the definitions of the predicates on which ;jHŠTW+-!¨© is a stable model of . Moreover, since

¡

¡ ¡ .

the given goal depends. by definition TW+5!¨© contains all the atoms occurring in ,

¡ ¡

;‹H TU+-!¨© .

must have the same truth value in ; and ,

¡ ¡

Definition 3.1 [Relevant universe and subprogram] The rel- ¡

¡

.†‰  VU+-!¨© .

and hence ;‹H¤TU+-!¨© . As a consequence ¡ evant universe for a ground formula (w.r.t. program ! ),

¡ credulously entails .

¡ ¡

. ¢

denoted by TU+-!¨© , is the set of all ground atoms such .

Conversely, suppose that VW+5!¨© credulously entails .

¡ ¢

that the dependency graph of ! contains a path from to an

¡

s

VU+-!¨© . Then there exists a stable model ; of such that

atom occurring either in or in some odd-cycle of the graph. ¡

¡

s

‰  ! ; ; . By Lemma 3.4, has a stable model such

The relevant subprogram for a ground formula (w.r.t ¡

¡

s s

.\w; ; ;

that ;ŒH]TU+-!¨© . Then the models and must

¡

VU+-!¨© . program ! ), denoted by , is the set of all rules in

¡ agree on the valuation of (cf. the “only if” part of the proof)

¡

TW+5!¨© .

"¨# $&%('*),+-!/. whose head belongs to .

 !

and hence ;‰ , which means that credulously entails ¡

Example 3.2 Consider the program ! consisting of the . This completes the proof of 1). rules: To prove 2), we demonstrate the equivalent statement:

604 LOGIC PROGRAMMING AND THEOREM PROVING

¡ ¡

VU+5!¨© . ! ! does not skeptically entail iff does not Theorem 4.4 (Compactness) A finitary program has no

skeptically entail ¡ . stable models iff it has a finite unstable kernel.

¡

“ !

This statement is equivalent to: ! credulously entails Proof. Let be any ground atom in the language of . From

¡ ¡

! VU+-!¨© “/.

. iff VU+-!¨© credulously entails , that follows immediately Lemma 3.4, it follows that has no stable models iff from 1). has no stable models. Clearly, VW+5!¨© “/. is downward closed

by definition. Moreover, by Proposition 4.2, VU+5!¨© “/. is fi- VW+5!¨© “/. 4 Finitary programs nite. Therefore ! has no stable models iff is a finite

unstable kernel of ! . Now we introduce a class of programs whose consequences are recursively enumerable, although their domain may be in- 4.2 Decidability and semi-decidability of inference finite. Hereafter we focus on the complexity of inference within the class of finitary programs. This subsection deals with upper Definition 4.1 [Finitary programs] We say a program ! is finitary if the following conditions hold: bounds. We start with the proof that, for all ground goals,

both credulous and skeptical inference are decidable. ! 1. For each node ¢ of the dependency graph of , the set

Theorem 4.5 For all finitary programs ! and ground goals ¢

of all nodes 6 such that depends (either positively or “ “ , both the problem of deciding whether is a credulous

negatively) on 6 is finite. “ consequence of ! and the problem of deciding whether is 2. Only a finite number of nodes of the dependency graph a skeptical consequence of ! are decidable.

of ! occurs in an odd-cycle.

Proof. By Theorem 3.5, “ is a credulous (resp. skeptical) “ For example, most classical programs on recursive data struc- consequence of ! iff is a credulous (resp. skeptical) conse-

member ap- VU+-!¨© “/. tures such as lists and trees (e.g. predicates , quence of VU+-!¨©“/. . Moreover, by Proposition 4.2, pend, reverse) satisfy the first condition. In these pro- is finite, so the set of its stable models can be computed in

grams, the terms occurring in the body of a rule occur also in finite time. It follows that the inference problems for ! and

the head, typically as strict subterms of the head’s arguments. “ are both decidable. This property clearly entails Condition 1. It follows easily that existentially quantified goals are semi- The second condition is satisfied by most of the pro- decidable.

grams used for embedding NP-hard problems into logic pro- “ grams [Cholewinski´ et al., 1995; Eiter and Gottlob, 1993; Theorem 4.6 For all finitary programs ! and all goals ,

Gottlob, 1992]. Such programs can be (re)formulated by us- both the problem of deciding whether •(“ is a credulous con-

X •(“

ing a single odd cycle involving one atom and defined by sequence of ! and the problem of deciding whether is a

X X X X X

skeptical consequence of ! are semi-decidable. £KY4©‚

simple rules such as £K and (if does not Y

occur elsewhere, then can be used as the logical constant Proof. The formula •–“ is credulously (res. skeptically) en- —

false in the rest of the program). tailed by ! iff there exists a grounding substitution such that ! An example of finitary program without odd-cycles is il- “˜— is credulously (res. skeptically) entailed by . The latter

lustrated in Figure 1. It credulously entails a ground goal problem is decidable (by Theorem 4.5), and all grounding —

Ž

+[. 

iff encodes a satisfiable formula. By adding rule for “ can be recursively enumerated, so existential entailment

 

Ž +p’4.^©r

£‘ we obtain another finitary program with can be reduced to a potentially infinite recursive sequence of

Ž +-. one odd-cycle, such that is skeptically entailed iff the decidable tests, that terminates if and only if some “˜— is en-

formula encoded by  is a logical consequence of the one en- tailed. coded by Y . We shall see in the following section that this is a strict bound,

The following proposition follows straightforwardly from i.e. existential entailment can be undecidable (cf. Corol- VU+-!¨© “/. the definitions of TU+-!¨© “/. , and finitary programs. It lary 4.11). will be needed to prove computability results.

4.3 Minimality and expressiveness “ Proposition 4.2 If ! is finitary then, for all ground goals ,

Next we focus on lower bounds to the complexity of inference VU+-!¨© “/. TW+5!¨© “/. and are finite. for the class of finitary programs and relaxations thereof. The 4.1 Compactness next two results show that both of the conditions in Defini- In classical first-order logic, an infinite set of formulae is in- tion 4.1 are necessary for semi-decidability, i.e. Definition 4.1 consistent iff it contains an inconsistent finite subset. A simi- is in some sense minimal. lar property—that in general does not apply to nonmonotonic Proposition 4.7 Credulous and skeptical inference are not logics—is enjoyed by finitary programs, too. semi-decidable for the class of all programs satisfying Con- dition 2 of Definition 4.1.

Definition 4.3 An unstable kernel for a program ! is a sub-

Proof. Note that locally stratified programs trivially satisfy "¨# $&%('*),+-!/. set ” of with the following properties:

Condition 2 of Definition 4.1 while some of them may violate ¢

1. ” is downward closed, that is, for each atom occur- Condition 1. For locally stratified programs, credulous and

” =8j"¨# $C%('4),+5!/. ring in ” , contains all the rules skeptical inference coincide with the well-founded seman- whose head is ¢ . tics [Gelfond and Lifschitz, 1988] and are not semi-decidable

2. ” has no stable models. [Schlipf, 1990], so the proposition immediately follows.

LOGIC PROGRAMMING AND THEOREM PROVING 605

Ž Ž¦« Ž Ž Ž Ž

+-¢(.d£¤£*¥¦£‚§‚¥Ÿ,+-¢4© ¨ ©,©ªr© Ÿ*© .^©S nš +[¢–. +e™š(›*+-œ4© –..d£ +[œ–.R© +[–.

Ž Ž¦« Ž Ž Ž

š +[¢–.d£¤£*¥¦£‚§–¥ Ÿ,+[¢*© ¨ ©,© ª4© Ÿ4© .^© +[¢–. +ež Ÿ*+-œ*©–..d£ +-œ(.

« Ž Ž

£*¥¦£‚§‚¥Ÿ,+-¢4© ¨ ¢@‰ ¬ . +ež Ÿ*+-œ*©–..d£ +-(.

« Ž Ž

£*¥¦£‚§‚¥Ÿ,+-¢4© ¨ ­€‰ ¬ .£¤£*¥ £‚§‚¥ Ÿ,+[¢*© ¬(. +[š‚ž¡,+-œ(. .\£x +-œ–.

Figure 1: A finitary program for SAT

The other lower-bound results are based on positive pro- can be derived from the subprogram !n· and the clauses for Â

grams that decide whether a given Turing machine terminates blank list iff ® terminates using the portion of repre-

§

«

Ë Ì Í

¬ —‚© Ê © ¨pÊ © ^© Ê ‰¦»G— !

using a fixed portion of its tape. sented by . At the same time, · has

·

!

¯ °

Let ® be an arbitrary Turing machine; let and be a stable model iff the above goal cannot be derived from ,

Ä +-»*© ¬(.á—

® ’s set of states and vocabulary, respectively. Recall that the because of the odd-cycle containing . It follows that

“ ±` ©²4©²G³e©` ³5©´¶µ

actions of ® are defined by 5-tuples , where for all consisting of a propositional symbol not occurring

§ §

! ! “ ®

` ² ²G³ ·

and are the current state and symbol, respectively, is in · , skeptically entails iff terminates.

` ³ ´ the symbol to be overwritten on ² , is the next state, and

Theorem 4.9 For each Turing machine ® with initial state

specifies ® ’s head movement.

Ã

 ±² ©² §&© © ²COµ ` and tape with non-blank portion , a pro-

Consider the positive program !· in Figure 2. The reader

!1í “

gram · and a goal can be recursively constructed, such

· !

may easily verify that ¡,+¹¸© ¬4©º*© »(. is entailed by iff there

! !

í í · that · satisfies only Condition 1 in Definition 4.1, and

exists a finite computation of ® , starting from a configura- ®

credulously entails “ iff terminates. ¬4©º*© »

tion with state ` and tape described by , and using only

§

X X

¬4©º*©» ¬

! ˆ! Mzg h

í ·

the finite portion of tape corresponding to . Here is a Proof. Let · , where is a new propositional

§

X

®

! “j !

í ·

list representing a finite portion of the tape on the left of ’s symbol not occurring in · , and let . Clearly

º »

“ ! !

í í ·

head, in reverse order; is the current symbol; is a (non credulously entails iff · has a stable model. Since ®

reversed) finite portion of the tape on the right of ’s head. has a stable model iff ® terminates (cf. the previous proof), · Program ! satisfies Condition 1 of Definition 4.1. To the theorem immediately follows.

prove this, note that the recursive calls to predicate ¡ preserve · The next result is based on a modification of ! that keeps the length ¼ of the portion of tape encoded in the head. There- track of the result of the computation. The modified program

fore, for each ground atom ¡,+5` ©½¹©²4©=C. , the number of atoms !1î

· has one extra argument to return the final state of the ¡,+e`©½¹© ²r© =&. ¡,+5` ³5© ½-³5©²G³5©=&³¾. connected to by a directed path in

tape. The theorem proves that the class of finitary programs

¯y‰¦¿ ‰ °À‰ Á the dependency graph is bounded by ‰ .

is computationally complete by showing how any Turing ma- · The next two results are based on ! and prove that if chine can be simulated by a suitable finitary program. Condition 2 in Definition 4.1 were dropped, then inference

would not be semi-decidable anymore. More precisely, the Theorem 4.10 For each Turing machine ® with initial state

`  ±C²&é² © ^© ² µ 

theorems say that some inferences would be as complex as and tape with non-blank portion § , a (pos- ©,+5¬4©»*©œ–. deciding the termination of an arbitrary Turing machine. !1î itive) finitary program · and a goal can be re- cursively constructed, in such a way that for all grounding

Theorem 4.8 For each Turing machine ® with initial state

— !1î ‰ ?©@+5¬r© »*©œ–.ᗠ® œG—

substitutions , · iff terminates and

`  ±²&é² © © ² µ  and tape with non-blank portion § , a pro-

§ encodes the final tape of the computation.

! “

gram · and a goal can be recursively constructed, such

§ §

! ! ·

that · satisfies only Condition 1 in Definition 4.1, and Corollary 4.11 The problems of deciding whether a finitary

“ ®

skeptically entails iff terminates. program ! credulously/skeptically entails an existentially •(“

§ quantified goal are not decidable. !

Proof. (Sketch) Let · consist of the program defined in Figure 2 plus the rules

5 Resolution calculus

Ä Ž Ž

Æ‚È ¡,+-¬(.R©S§rÆS™Cš(Ç Æ–È ¡,+[»–.R©

+5¬4©»–.d£Å§rÆ ™š(Ç Here the results of the previous sections are applied to the

« Ä

.^© +-¬4© »(.^

¡,+ɸ © ¬4© ʦË(© ¨pʦ̖© © Ê^ÍΉ¦» resolution calculus for skeptical stable model semantics intro- «

Ž duced in [Bonatti, 1997]. In the original paper, the resolution

Æ‚È ¡*+ ¨ .u §‚ÆS™š(Ç

« Ž

Ž calculus was proved complete w.r.t. function-free programs

Æ‚È ¡*+ ¨Ï1‰C¬ .\£Å§rÆS™šGÇ Æ‚È ¡,+-¬(.u §‚ÆS™š(Ç (soundness holds for all programs). In this section, we extend

§ the completeness result to all finitary programs.

D !

where represents the blank symbol. · satisfies Condi-

!· +áïñðJò tion 1 of Definition 4.1 (cf. the argument for ). Condi- Theorem 5.1 Let ! be a finitary program, and let

tion 2 of Definition 4.1 is violated, as there exist infinitely

“/.áó ! ð

ï be a ground skeptical consequence of , where Ä

many odd-cycles, one for each ground atom +[Ðn©і. . Clearly, +5“?‰ and “ are sequences of literals. Then the skeptical goal

for any grounding substitution — , the goal !

ð{. has a successful skeptical derivation from with answer

ÒÔÓÕ¦Ö^× Ø Õ¦Ù¦Ú ÛGÒÔÜÝ Þ4ÓÕ¦Ö × Ø Õ&Ù¦Ú^ÛGÒÔßÝ Þ*ÛSÒáà ÞÉÜGÞRâ^ã&ÞRäåâ^æ&ÞuçuççRÞuâèêéRߦëÝ¹Ýáì ó substitution — more general than .

606 LOGIC PROGRAMMING AND THEOREM PROVING

« «

ºƒ‰¦» .\£ô¡,+¹¸ ³-© ¨õÊS³€‰C¬ ©º*©»–. ±` ©²4©²G³e©` ³-© Ÿ*È öS÷(¡µ

¡,+ɸ©¬4© ʂ© ¨ for all 5-tuples

« «

¡,+ɸ© ¨ ºƒ‰C¬ © ʂ© »(.d£ô¡,+¹¸ ³-©¬4©º*© ¨pÊ ³@‰C» .

for all 5-tuples ±` ©²4©²G³e©` ³-©RÆS¥ øS¡µ

` ² ¡,+ɸ©¬4© ʂ© »–. if there exists no tuple for and .

Figure 2: A program deciding termination of a Turing machine with bounded tape

¡ ¡ ïü“/.áó

Proof. Let ù+áïúðûò . Since is a ground through credulous inference. Only direct approaches such as ¡

skeptical consequence of ! , then is also a skeptical con- the resolution calculus are possible.

¡ ¡

. ý†+-!¨© . sequence of ý†+-!¨© (by Theorem 3.5). Moreover, is finite (by Proposition 4.2). Then, by the original complete- 6 Credulous automated reasoners ness result [Bonatti, 1997], +5“þ‰Gð{.Éó has a ground skeptical

¡ For function-free programs, there exist powerful automated .zÿ‡"¨# $C%('4)*+5!/. derivation from ýU+5!¨© . By standard lift- ing techniques (cf. [Lloyd, 1984]), a corresponding skeptical reasoners based upon stable model generation, such as

SMODELS [Niemela¨ and Simons, 1997] and DLV [Eiter et al.,

ð{. ! — derivation of +-“f‰ from with answer substitution 1997]. Internally, these engines operate on ground programs; more general than ó can easily be obtained. for this reason they embody smart program instantiation rou-

Example 5.2 Consider again the program ! of Example 3.2. tines. Such routines are applied before any other form of rea- This finitary program has no stable model, because of its third soning. It seems not difficult to extend the instantiation rou- rule. Figure 3 shows a successful skeptical derivation for the

X tines to deal with all finitary programs, so that given a ground

skeptical conclusion +[Z]. , with empty answer substitution 1

¡ “

X goal , only the relevant (ground and finite) subprogram

+[Z]. ! (which means that Z{ is a skeptical consequence of ). 2

ý†+-!¨©“/. (or an equivalent subset thereof) is generated. The

The Restricted Split of type II introduces two subgoals, each a

a rest of the engine needs no modification. By Theorem 3.5,

+5bG. with a new hypothesis ( +5b(. and , respectively), ob- soundness and completeness are preserved. In this way, our tained from an atom occurring in some odd-cycle. The Con- results can be used to extend the applicability range of this tradiction rule replaces the left-hand side of a goal with the class of automated reasoners, with no modification to the core negation of some hypothesis. Intuitively, the goal is proved of their reasoning mechanisms.3 by showing that the hypotheses on the right hand side can- not be satisfied. The Failure rule rewrites a goal with one of

its counter-supports. In this example the counter-support of 7 Summary and related work

a a +5bG.

+-b(. is itself (obtained by negating the unique support For the first time, semi-decidable fragments of the stable

a a

+-b(.Rh +5bG. g¦ of ). Resolution can be performed either with a model semantics have been explored in depth, and related program rule or with one of the hypotheses (treated as facts). to the number of atoms involved in odd-cycles. The main Finally, the Success rule removes goals where there is nothing contributions of this paper can be summarized by recalling left to prove. As usual, the empty goal sequence is denoted that finitary programs (Def. 4.1) are computationally com- by ¢ . plete (Theorem 4.10), can deal with function symbols, and With the help of the resolution calculus it can be shown enjoy a compactness result (Theorem 4.4) that guarantees that a class of normal programs larger than finitary programs semi-decidability of inference (Theorem 4.6). Finitary pro- is Turing equivalent. The extended class—called almost fini- grams are characterized by two conditions on their depen- tary—admits the kind of rules typically used to generate in- dency graph that are minimal, in the sense that if any of consistencies, without the restrictions of Definition 4.1. them were dropped, then inference would not always be semi- Definition 5.3 [Almost finitary programs] A normal pro- decidable (Theorems 4.7, 4.8 and 4.9). We proved that the skeptical resolution calculus is complete for finitary and al- gram ! is almost finitary iff it can be partitioned into two

most finitary programs (Theorems 5.1 and 5.4), and sketched

! !7§ !

(disjoint) subsets, !d§ and , such that is finitary, and í

í how other engines for nonmonotonic reasoning can be ex-

61§&© ^© 6 *© 7¢ consists of rules of the form ¢£ tended to deal with finitary programs (Section 6). We are

Note that there is no restriction on ! ’s rule variables. Given a

í currently extending our results to larger classes of programs,

+5¢£J6 © © 6 © 7¢|.¹— ! ¢ —  ground instance § of a rule in , 4

í and to partial stable models. may depend on infinitely many ground atoms 6 (i.e., Condi- The work on finitary programs contributes to provid- tion 1 of Definition 4.1 can be violated). ing nonmonotonic logics with classical proof- and model- Theorem 5.4 The skeptical resolution calculus is complete (in the sense of Theorem 5.1) for all almost finitary programs. 1These engines can only deal with ground queries 2 Corollary 5.5 The set of skeptical consequences of the form The smart instantiation routines remove some non-applicable

rule instances.

•,+ ð ò “/. ð “ ï ï , where and are sequences of literals, is 3 semi-decidable for almost finitary programs. As noted by an anonymous referee, the same idea can be applied to function-free programs, in order to reduce the cost of program Remark 5.6 For almost finitary programs, ground credulous instantiation. inference is not semi-decidable. This means that skeptical in- 4The latter idea and evidence to its feasibility have been sug- ference from almost finitary programs cannot be implemented gested by an anonymous referee.

LOGIC PROGRAMMING AND THEOREM PROVING 607

X +[Z].O‰|.

+ (Initial goal)

X a X a

+[Z].O‰ +5bG. . + +[Z].O‰ +5bG. .

+ Restricted Split of type II

a a X a

+5bG.O‰ +5bG. . + +[Z].O‰ +5bG. .

+e Contradiction rule

a a X a

+5bG.O‰ +5bG. . + +[Z].O‰ +5bG. .

+ Failure rule

a X a

£¢‰ +5bG. . + +[Z].O‰ +5bG. .

+ Resolution with hypothesis

X a

+[Z].O‰ +5bG. .

+ Success rule

a a

+-b(.O‰C +-b(. .

+ Contradiction rule

a a a a

+-b(.O‰C +-b(. . +5bG.\£j +5bG.

+5 Resolution with

a

£¢‰ +5bG. . + Resolution with hypothesis

¢ Success rule

Figure 3: A skeptical derivation theoretic tools and results. Work in a similar direction com- [Eiter and Gottlob, 1993] T. Eiter and G. Gottlob. Complex- prises some pretty standard axiomatizations based on Hilbert- ity results for disjunctive logic programming and applica- style systems [Levesque, 1990]5 and sequent calculi [Olivetti, tions to nonmonotonic logics. In Proc. of ILPS’93. MIT 1992; Bonatti and Olivetti, 1997]. These axiomatizations for Press, 1993. Default and Autoepistemic logic should be extended to first- [Eiter et al., 1997] T. Eiter, N. Leone, C. Mateis, G. Pfeifer, order theories, perhaps adapting the techniques introduced in and F. Scarcello. A deductive system for nonmonotonic this paper. An infinitary proof-theory can be found in [Mil- reasoning. In [Dix et al., 1997], 1997. nikel, 1999]. In [Rosati, 1999], issues related to decidable nomonotonic reasoning in MKNF are investigated. In [Cen- [Gelfond and Lifschitz, 1988] M. Gelfond and V. Lifschitz. zer et al., 1999], sufficient conditions for the existence of re- The stable model semantics for logic programming. In cursively enumerable stable models are identified. They do Proc. of the 5th ICLP, pages 1070–1080. MIT Press, 1988. not ensure that all stable models are r.e., so skeptical and cred- [Gottlob, 1992] G. Gottlob. Complexity results for non- ulous reasoning are not semi-decidable. monotonic logics. Journal of Logic and Computation, 2:397–425, 1992. Acknowledgments The author is grateful to the anonymous referees for their [Levesque, 1990] H. J. Levesque. All I know: a study in deep and careful reviews, and for their precious suggestions. autoepistemic logic. Artificial Intelligence, 42:263–309, 1990. References [Lifschitz and Turner, 1994] V. Lifschitz and H. Turner. Splitting a logic program. In Proc. of ICLP’94, pages 23– [Bonatti and Olivetti, 1997] P. A. Bonatti and N. Olivetti. A 37. MIT Press, 1994. sequent calculus for skeptical default logic. In Proceedings of TABLEAUX’97, number 1227 in LNAI, pages 107–121. [Lifschitz, 1985] V. Lifschitz. Computing circumscription. Springer Verlag, 1997. In Proceedings of IJCAI’85, pages 121–127, 1985. [Bonatti, 1997] P. A. Bonatti. Resolution for skeptical stable [Lloyd, 1984] J. W. Lloyd. Foundations of Logic Program- semantics. Journal of Automated Reasoning. To appear. ming. Springer-Verlag, 1984. Preliminary version in [Dix et al., 1997]. [Milnikel, 1999] R. S. Milnikel. Nonomonotonic logic: a [Cenzer et al., 1999] D. Cenzer, J. B. Remmel, and A. Van- monotonic approach. PhD thesis, Cornell University, derbilt. Locally determined logic programs. In Proc. of 1999. LPNMR’99, number 1730 in LNAI, pages 34–48, Berlin, [Niemela¨ and Simons, 1997] I. Niemela¨ and P. Simons. 1999. Springer Verlag. Smodels - an implementation of the stable model and well- [Cholewinski´ et al., 1995] P. Cholewinski,´ V. Marek, founded semantics for normal LP. In [Dix et al., 1997], A. Mikitiuk, and M. Truszczynski.´ Experimenting with 1997. nonmonotonic reasoning. In Proc. of ICLP’95. MIT Press, [Olivetti, 1992] N. Olivetti. Tableaux and sequent calculus 1995. for minimal entailment. Journal of Automated Reasoning, [Dix et al., 1997] J. Dix, U. Furbach, and A. Nerode, editors. 9:99–139, 1992. Proc. of LPNMR’97, number 1265 in LNAI, Berlin, 1997. [Rosati, 1999] R. Rosati. Towards first-order nonmonotonic Springer Verlag. reasoning. In Proc. of LPNMR’99, number 1730 in LNAI, [Dung, 1992] P. M. Dung. On the relation between stable pages 332–346, Berlin, 1999. Springer Verlag. and well-founded semantics of logic programs. Theoreti- [Schlipf, 1990] J. Schlipf. The expressive power of the logic cal Computer Science, 105:7–25, 1992. programming semantics. In Proc. of PODS’90, 1990. 5This axiomatization of Autoepistemic logic, formulated for first-order theories, is complete only for the propositional fragment.

608 LOGIC PROGRAMMING AND THEOREM PROVING LOGIC PROGRAMMING AND THEOREM PROVING

THEOREM PROVING Splitting Without Backtracking

Alexandre Riazanov and Andrei Voronkov The University of Manchester

Abstract the splitting branches, the procedure backtracks and proceeds with the second branch. In what follows, we call this proce- Integrating the splitting rule into a saturation-based dure explicit case analysis. A simple example of a proposi- theorem prover may be highly beneficial for solv- tional derivation which combines the splitting rule with reso- ing certain classes of first-order problems. The use lution on branches is shown in Figure 1. The current search of splitting in the context of saturation-based theo- state is depicted as a tree. The nodes contain clauses com- rem proving based on explicit case analysis (as im- mon for all the branches below them. The clause or clauses plemented in SPASS) employs backtracking which to which an inference rule has been applied is put in a shaded is difficult to implement as it affects design of the box. For a moment, ignore the labels of the edges, for exam-

whole system. Here we present a “cheap” and effi- & ple 1 . cient technique for implementing splitting that does In the first-order case the splitting rule can be formulated in

not use backtracking. the same way as in the propositional case with an additional %)( restriction: the clauses %2& and do not have common vari- ables. This rule can be combined with inference rules like 1 Introduction resolution and paramodulation. Combined with some resolu- Case analysis in the form of the ¡ -rule is the core of tableau- tion strategies, it gives decision procedures for fragments of based theorem proving methods. If our aim is to refute a set first-order logic (see e.g. [Fermuller¨ et al., 2001]).

of clauses ¢¤£¦¥¨§ ©  , we can reduce this task to refuting The cost of naive backtracking is very high, since the

¢£¥§ ¢£¥  two simpler sets: and . This is justified by the clause set ¢ can contain tens of thousands of clauses, so af- following argument. To show that ¢ is false in all the models ter several splits a lot of memory is needed to store backtrack of the formula §© , one can separately consider two cases: points. Therefore, in practice (the SPASS prover [Weiden-

all the models of § and the ones not necessarily satisfying bach et al., 1999]) explicit case analysis is implemented by

§ but satisfying . The saturation based theorem proving has adding to each clause its split history: the set of splits used adopted the case analysis principle in the form of splitting that to obtain this clause, represented by an array of bits. The can be formulated in the propositional case as the following split history also helps to implement intelligent backtrack- inference rule:   "! ing: by analyzing the split history one can check which splits have actually been used to obtain a contradiction on the cur-

rent branch.Clauses which do not belong to the currently ex-

  #! $ "! ploited branch are put in a special waiting list, and popped

back from the waiting list upon backtracking. There are com-

%'& %)( where ¢ is a set of clauses, and are clauses. Unlike plications caused by simplification by unit equalities belong- most of the other inference rules in saturation-based theo- ing to a branch which can be illustrated by the following ex- rem proving that only modify a set of clauses by adding and ample. removing some clauses, this rule makes two sets of clauses Example 1.1 Consider the split on the left of this picture: out of one. Now, to refute the original set ¢¤£*¥%'&+©,%)(- ,

.

&

one must separately refute the two new sets: ¢.£¦¥¨% and . 5#6 . 34

. 

(

¢,£/¥¨% .

34 5#6

. 587¤9  

In the propositional case, this rule together with unit reso- 587:9

lution gives a complete inference system. Finding a refutation 

57:9



9;6 34 5#6

in this system can be organised, for example, as depth-first 34 587¤9

exploring of branches. After finding a refutation on one of

0

<=?> < @A>

Partially supported by grants from EPSRC and the Faculty of where % and have no common variables, in the <$E Science and Technology, the University of Manchester. The first simplification order used in the proof-search, and BDC is a

author is also partially supported by an ORS award. clause with an occurrence of < . After the split the left branch

LOGIC PROGRAMMING AND THEOREM PROVING 611

 HMG

F

H  G

 HIG



F H



F

G H G

K

K

F F

H  HIG G

HIG

F

H HIG

F

H G

F

 H  L

F

F

N

H

H  HIG J

K K

F K

K

G

F

H HIG

F

HIG

HMG

F

H  G

F

H 



F

H G

F

K

K



H  Q

G

P

K

K

H

F

H HMG

F

K

K

G

F

H HIG H  HIG

HIG

F F

O

HIG

HIG

F

H RG

F

H  G



F

 H 

F



H

 HIG

K

F

K K

H RG

K

T S

F

G G

F

G

F

HIG HIG H HIG H HIG U VWVWV

H HMG

F F

G

F

G O

Figure 1: A derivation with splitting and unit resolution

.

.

34 5W6

BDC <$E BDC <"E

contains two clauses < =X> and , so can be simpli-

 

57:9 >YE

fied and replaced by BDC , but only on the left branch, since H

there is no clause on the right branch. This means that



K

K

BDC <$E

should still be kept on the right branch, so simplification 587:9

by should be implemented according to the right-hand side of the picture. If there are several splits, we introduce a new propositional symbol for every split. Now, instead of dealing with branches, we simply add to each clause on a branch all labels used on that branch. Then the split can be described as the following To achieve efficiency, modern theorem provers maintain inference rule which replaces a clause by two new clauses:

one or more indexes on terms, literals, or clauses. To im-

<$E;_W<=`>I©,% ab¢,£D¥^BDC <$E;_W<=c>©1d_#%e©\]1 gf plement splitting, maintainance and retrieval algorithms on ¢,£/¥^BDC indexes should be changed. Either one should have common indexes for all branches, or else indexes for the current branch It is not hard to argue that such a splitting preserves the un- only. In the former case, index retrieval algorithms should be satisfiability of the set of clauses (we will prove it formally changed to take into attention the splitting history. In the lat- below). The effect of splitting is very similar to the effect of ter case, index maintainance algorithms should be changed. the explicit case analysis, but branches are no more needed,

In either case, a significant amount of work on indexes is re- since we are still dealing with just one set of clauses. The

<$E quired upon backtracking. following simplification of BDC by on the left branch can now be described in at least two possible ways:

In general, backtracking in resolution-based theorem <$EY_#<=c>I©R1d_#%e©,\]1M a

provers requires a nontrivial implementation. In the modern ¢£/¥¨BDC

<$EY_#<=c>I©R1d_#%e©,\]1_jB/C >YEk©R1 -l saturation-based theorem provers for first-order logic, split- ¢£/¥¨BDC

ting with backtracking is implemented only in SPASS [Wei- <$EY_#<=c>I©R1d_#%e©,\]1M a

denbach et al., 1999]. ¢£/¥¨BDC

¢£/¥<=`>I©1_W%*©,\]1_#BDC >YEm©1_#BDC <$En©,\]1 -f

In this report we discuss an alternative to the explicit case The second way corresponds to the simplification of Exam- analysis which can be implemented in resolution-based theo- ple 1.1, but we prefer the first way because it gives simpler rem provers without a significant overhead. We call it split- clauses. ting without backtracking. The idea of splitting without back- We will show that splitting without backtracking can sim- tracking can be formulated as follows. Consider again the ulate, in some sense, explicit case analysis. Moreover, by split of Example 1.1, but let us mark the branches of the using various selection functions one can simulate a parallel

search tree by literals as follows. We take a new proposi- version of case analysis, in which branches are exploited si- 1 tional symbol 1 and mark the left branch with and the right multaneously. Apart from being easy to implement, splitting

branch with \]1 : without backtracking has several other advantages: firstly,

612 LOGIC PROGRAMMING AND THEOREM PROVING we obtain the effect of intelligent backtracking for free, and the nonground version of the ordered resolution rule itself,

secondlyo , some simplification rules can move clauses be- for other rules and calculi see e.g. [Bachmair and Ganzinger,

tween the branches thus avoiding repeated work on different 2001]. Let @ be a simplification order on terms, extended in branches. the usual way on literals. We assume that a selection func- This article is organized as follows. In Section 2 we define tion select in every nonempty clause either a negative literal, some fundamental notions of this paper, for example those or a subset of positive literals, and if any positive literal is

of component and split. We also recall some notions of the selected, then all maximal w.r.t. @ literals are selected too. theory of resolution. In Section 3 we discuss several versions Then ordered resolution is the following rule:

of the splitting rule, namely binary and hyper splitting, with ‹

©% \+Œ©ƒB or without naming. In Section 4 we show how one simulate _

different strategies of splitting by using appropriate literal se- wŽ% ©By| ‹

lection functions. In particular, we show how one can simu- ‹ Œ where  is a most general unifier of and , the atom is

late explicit case analysis by using so-called blocking exten- ‹ \+Œ selected in the clause ©% , the literal is selected in the

sion of selection functions. An efficient algorithm for find- ‹

Bh  clause \+Œ[©B , and there is no literal in greater than ing the maximal split of a clause into components is given in the used simplification ordering. in the full version of this paper. In Section 5 we discuss the clause variance problem: checking if a clause is a variant of another clause. We show that the clause variant problem is 3 Splitting without backtracking polynomial-time equivalent to the graph isomorphism prob- In this section we formulate several versions of splitting with- lem. Finally, in Section 6 we discuss experiments carried out out backtracking and prove their soundness. with VAMPIRE over a large collection of problems. 3.1 Binary splitting 2 Preliminaries

Definition 3.1 (binary splitting rule) Let ¢ be a set of

p q %‘©’B

We assume a fixed signature . By we denote a count- clauses of the signature p£¤q , be a clause with a

r p 1X“Aq

able set of predicate symbols of arity disjoint from . El- nontrivial component % , and be an atom not occur-

& (

q 1 _s1 _f$f$f ements of will be denoted by and used as new ring in ¢¦£*¥%*©B, . Then the binary splitting rule is the names for clause components, or alternatively, as identifiers following inference rule: for branches. We call a clause a set of literals. Sometimes clauses will be ¢£/¥%*©B, ab¢,£D¥¨% ©R1_jB?©\]1M nf (1)

written as disjunction of their literals. We denote by tuvkwsxRy ¢D£ We also say that ¢D£.¥¨%e©1_#BA©,\]1 is obtained from the set of all variables occurring in an expression x (e.g.

¥¨%e©B, by binary splitting. %

clause). For a clause % , the universal closure of , denoted

& &

z]{ f$ff|z]{]}k% { _$f$ff$_j{~} by zM% , is the formula , where are all Theorem 3.2 The binary splitting rule preserves satisfiabil- variables of % . ity.

2.1 Components and splits PROOF. Any model for ¢ £ƒ¥% ©Z1d_jB?©\]1M is also a model %X©¤B for ¢”£*¥%e©ƒB, since is a logical consequence of

Definition 2.1 (component) The clause % is called a com-

B‘©•\]1 –A‡

%`©1 and . In the reverse direction, a model for

% tuvkw;%€yR

ponent of a clause %©[B if is nonempty, –

¢¤£”¥%*©1d_jBX©,\]1M can be obtained from a model for

% q

tuv‚wsBƒy'=?„ and contains no predicate symbols in . The

1 –A‡— =”1˜™\zI% š

¢,£/¥¨%e©ƒB• by redefining so that . %

component % is called minimal if no proper subset of is a

% B component in % ©,B . The component is called trivial if

is a clause of the signature q or empty. It follows from the proof of Theorem 3.2 that the new pred-

icate symbol 1 introduced by an application of the binary

Note that every clause B has at least one component: the splitting rule (see (1)) can be regarded as a new name of the p subset of B consisting of the literals of the signature .

formula \zI% . This implies an optimization of binary split-

Definition 2.2 (split) A split of a clause B is any representa- ting which allows one to reuse the new names. This optimiza-

% %2&‚©ff$f†% ©B ‡ %'&ˆ_$ff$f_#% }

tion of as } , where are different tion can be formalized as follows.

&

‰e=Š % components of % . The split is trivial if and is the

Definition 3.3 (binary splitting with naming) A naming

&

% _f$f$f_#%)} trivial component of B . The split is maximal if

function is any function › from the set of nonempty clauses

are all the minimal components of B . q

of the signature p to the set with the following property:

B ‡

%2& %)(

Note that in the maximal split the subclause only consists ›¦w;%'&yœ=›¦wŽ%)(y if and only if is a variant of .

q

pD£hq %¦©B

of the literals of the signature . Also, it is obvious that the Let ¢ be a set of clauses of the signature and › maximal split of a clause is unique. be a clause with a nontrivial component % . Let also be a naming function. Then the binary splitting rule with naming 2.2 Resolution is the following inference rule: Our technique for implementing splitting works in the context of a saturation procedure for the calculus of ordered resolu- tion. For the purposes of this paper it is enough to recall only ¢,£/¥¨%e©B, a¢,£D¥%*©›¦w;%€y†_#B?©\+›’wŽ%€yW mf (2)

LOGIC PROGRAMMING AND THEOREM PROVING 613 One can also consider a modification of the binary splitting 3.2 Hyper splitting rule rule ž with naming where only the first application of the split- Similar to the binary splitting, we can define hyper splitting, ting introduces the clause %*©›¦w;%€y , and all further applica- in which we split more than one component off a clause. tions of splitting with the same component % simply replace

Definition 3.5 (hyper splitting rule) Let ¢ be a set of BX©,\+›¦w;%€y

% ©B by one clause .

p¬£œq %2&¨©2f$ffª©8% ©œB

Let us give an example that illustrates the difference be- clauses of the signature and } be a clause

& &

_$f$ff$_W%} 1 _$ff$f$_Ž1~}

tween the binary splittings with and without naming. Sup- with nontrivial components % . Let be ¢ predicate symbols in q occurring in neither this clause nor . pose that we have a set of clauses ¢’£ ¥¨%e©ƒB &_#% ©B (

Then the hyper splitting rule is the following inference rule:

& %X©¤B

such that % is a nontrivial component in both and (

%•©B . Using binary splitting, we can perform the following

& ©.f$f$f¨©ƒ%)}€©B, a derivation: ¢,£/¥¨%

(3)

¢,£/¥¨%2&+©1&¨_f$ff$_#% ©1 _jB?©,\]1I&+©/ff$f¨©ƒ\]1 -f

} } }

& (

¢,£/¥¨%e©B _#% ©ƒB a

& & & (

¢,£/¥¨%e©1 _#B ©\]1 _W%e©B a

Let also › be a naming function. Then the hyper splitting

¢,£/¥¨%e©1&¨_#B &+©\]1I&¨_W%e©1](ˆ_jB(œ©,\]1~(ˆ gf rule with naming is the following inference rule:

  ­W­W­† +®¬ !'¯

3

 ³²j´ ´ ® ®m²j´

 °±s ­#­†­  R°±s

Using binary splitting with naming, we can perform a differ-   

² ²

 HM°ƒ±s  ­W­W­" HM°ƒ±s+® !¨­

ent derivation (assuming ›¦wŽ%€yœ=¦1& ). (4)

3

& (

_#%e©B a ¢,£/¥¨%e©B Hyper splitting can be interpreted as repeatedly applied bi- ¢,£/¥¨%e©1&¨_#B &+©\]1I&¨_W%e©B( a nary splitting. One can prove that hyper splitting (with or ¢,£/¥¨%e©1&¨_#B &+©\]1I&¨_#B (©ƒ\]1I&¨ -f without naming) preserves satisfiability in exactly the same Preservation of satisfiability in the case of binary splitting way as for binary splitting. with naming is more difficult to formulate. We can only guar- antee preservation of satisfiability under some natural condi- 4 Literal selection and simulation of explicit tions. case analysis

Theorem 3.4 Let Ÿ be an inference system on sets of clauses So far we only considered splitting as an inference rule. Here with the following properties: every inference preserves the we will consider splitting as part of an inference system. As

set of models, i.e. for every inference ¢ a ¢ ‡ in this system, the inference system we consider binary resolution with su-

¢ ‡ Ÿn‡

¢ and have the same models. Let be the extension of perposition and negative selection. We assume a simplifi- Ÿ

by binary splitting with naming. Then for every derivation cation ordering in which every literal of the signature p is

&

¢¡ a¢¢ a£ff$f Ÿn‡ ¢¡

in such that is a set of clauses of the greater than any atom in the signature q . In order to define

¤R¥¦r ¢M§

signature p and every , the set is satisfiable if and the inference system we have to define a selection function & only if so is ¢I§©¨ . on clauses. We will define two kinds of selection functions: blocking and parallel. Both will be obtained by extending an PROOF. The proof essentially repeats the proof of Theo-

arbitrary selection function on the clauses of the signature p

¢I¡ p rem 3.2. Take any model – of of the signature and

to the clauses of the extended signature p.£q . p£ q

define its extension –A‡ to the signature as follows: de- ›¦wŽ%€yR“¦q fine in –A‡ each new predicate symbol be equiv- 4.1 Blocking extension of a selection function alent to \zI% . Then arguing as in the proof of Theorem 3.2

Definition 4.1 (blocking extension) Let < be any selection ¢I§ one can show that –A‡ is a model of each .

function on the clauses of the signature p . The blocking ex-

In the converse direction, note that every ¢I§ is a logical

tension of < is the selection function on the clauses of the

¢ & ¢ & ¢

§©¨ §

consequence of §©¨ . Indeed, when is obtained from B signature p £¤q defined as follows. (a) Let be a clause

by applying an inference of the original system Ÿ , this holds q

containing a negative literal \]1 of the signature , then the

& ¢I§

by our assumption. When ¢I§ª¨ is obtained from by ap- B

plying binary splitting with naming (2), this is also obvious, maximal one among such literals is selected in B . (b) Let q

be a clause containing a positive literal 1 of the signature

¢ & %*©›¦w;%€y š since §©¨ contains the clause . Note

and no negative literals in this signature. Then this literal is B that this proof can be easily adapted to the modified version selected only when 1 is the maximal literal in . Note that in of splitting. For the modified version in general it is not true this case B solely consists of positive literals of the signature

q .

& %«©:›¦wŽ%€y

any more that ¢I§©¨ contains the clause , but one

% ©›¦wŽ%€y ¢ &

can show that is a logical consequence of §©¨ . It is not difficult to prove that we indeed obtain a selection q The conditions on the inference system Ÿ are natural and function, since all literals of the signature are less than all

not restrictive. For example, the standard proofs of complete- literals of the signature p . Since we will usually assume some

ness in the theory of resolution [Bachmair and Ganzinger, selection function on the clauses of p , we will simply speak 2001] use inference systems with two kinds of rule: addition about the blocking selection function instead of the blocking of a clause implied by other clauses, and removal of a clause extension of the selection function. implied by other (smaller) clauses. It is easy to see that every Now assume that we are using a blocking selection func- such inference system preserves models. tion. Let us also assume that each time we apply the splitting

614 LOGIC PROGRAMMING AND THEOREM PROVING

& &

\]1 ©Œ \]1 rule, the newly introduced atom 1 is greater than any atom clause, for example means that is selected. For

previouslyµ introduced by splitting. Let us show that the re- simplicity, we remove subsumed clauses and pure literals.

sulting inference system simulates, to some extent, explicit After inference 7 of the derivation using explicit case anal-

‹ ©/\+Œ case analysis. ysis, we have a copy of the clause \ which was put Consider a sequence of splits and applications of nonsim- on the right branch, after we performed a split on this clause

plifying inference rules. Let us call the label of a branch the on the left branch. In the derivation using binary splitting, the

‹ ‹

( (

©•\+Œ \ © 1 \]1 ©•\+Œ set of all positive literals that mark edges of this branch. We split of \ into and remains even transform clauses used in a derivation with splitting as fol- when we finish working on the left branch.

lows. To each clause % that appears on a branch we add %[©.¶ the label ¶ of this branch, obtaining the clause . We 4.2 Parallel extension of a selection function can show that each nonsimplifying inference that can be per- Definition 4.2 (parallel extension) Let < be any selection formed on a branch, can also be performed on the transformed function on the clauses of the signature p . The parallel ex-

clauses. For simplicity, we only consider the resolution rule.

< <^‡

‹ ‹

‹ tension of is a selection function on the clauses of the

( ( & & &

\ ©.B

Let ©.B and be two clauses in which

p¤£q B ‹

‹ signature defined as follows. Let be a clause of the

\ ( 

and are selected, be a most general unifier of & and

B ‡#©¶ B ‡ p ¶ ‹

‹ form , where is a clause of the signature and is

& ( &

B 

, and no literal in is greater than  . Then we can

B ‡ <^‡ a clause of the signature q . If is not empty, then selects

apply a resolution inference as follows.

< B ‡

in B ‡n©,¶ exactly the same literals as selects in . When

‹ ‹

B ‡ ¶

& ( (

& is empty, the maximal literal from is selected. ©B \ ©B f

B &†©B(" In other words, the parallel extension always makes the selec-

‹ B ‡

tion in a clause B ignoring the literals of of the signature

& ( & &

¶ ¶

Let and be the labels of the branches of ©•B and ‹

q . (©ƒB(

\ respectively. Then the corresponding transformed ‹

‹ Again, it is not difficult to prove that we indeed obtain

\ (2©/B (2©/¶d( clauses will be &)©DB &)©D¶ & and . By our

a selection function, since all literals of the signature q are

definition of a blocking selection function and the order @ it

‹ ‹

less than all literals of the signature p . We will simply speak

& ( is not hard to argue that and \ will be selected in the about a parallel selection function instead of the parallel ex- transformed clauses and that no literal in B &†I©¶ &" is greater

‹ tension of a selection function. &

than  . Therefore, we can apply the rule ‹

‹ A parallel selection function ignores the splitting history of

& & & ( ( (

©ƒB ©ƒ¶ \ ©B ©ƒ¶

f clause completely, and thus gives the effect of parallel search

& ( & (

©B ¬©¶ ©¶

B on all branches.

& ( ©¶

It is not hard to argue that ¶ is the label of the common ‹

‹ 4.3 Subsumption resolution and splitting

& & ( (

©B \ branch for and ©ƒB . It is interesting to observe what happens to the literals One of the serious advantages of splitting without backtrack- which do not belong to the currently investigated branch. ing is that it can be productively combined with the sim- Suppose, for, simplicity, that we are working with the left- plification rule called subsumption resolution [Bachmair and

Ganzinger, 2001]. Essentially, subsumption resolution is a

& ©f$ff·©'1~} most branch labelled by 1 . Then all clauses which are not on the current branch contain a negative literal of the special case of binary resolution that guarantees that the re- solvent subsumes one of the parents. We say that a clause

signature q , which by our choice of the selection function,

B ( ¹ B & subsumes a clause if there exists a substitution such

will be selected. Take any such clause, for example, \]1~}+©B .

& ( ¹’º«B The only inference rule we can apply to this clause is resolu- that B . It is known that clauses subsumed by other

clauses can be removed from the search space.

1 ©¶ 1 } tion with a clause } , where is greater than any literal Subsumption resolution is the following inference rule on of ¶ . But we can obtain such a clause only when the empty sets of clauses:

clause is obtained on the leftmost branch labelled by 1]}©ƒ¶ ,

‹ ‹ ‹

so \]1~}]© B will be blocked for inferences until a contradiction

& & ( ( & & (

©B _W\ ©B ab¢£/¥ ©B _jB ¢,£/¥

‹ ‹

on the leftmost branch is obtained. As soon as such a contra- ‹

¢,£/¥¨\ &œ©ƒB &_ (©B( ab¢£/¥\ & ©B &^_jB(ˆ

diction is obtained, we can “unblock” the clause \]1~}¸©B by

‹ ‹

¹ &†¹*= ( B?©ƒ¶ resolving it against 1]}€©¶ and obtaining . where there exists a substitution such that and

Thus, we obtain a nearly complete correspondence be- B &†¹iº»B ( . The rule is called subsumption resolution be-

tween search states of the explicit case analysis and the sys- cause of its resemblance to subsumption: this rule is applica-

‹ ‹

& & ( ( ©B tem with the binary splitting without backtracking. It is not ble if and only if ©B subsumes . Thus subsump- an exact correspondence for several reasons. Firstly, simpli- tion algorithms can be modified for finding pairs of clauses to fication rules behave in a slightly different way. Secondly, which subsumption resolution is applicable.

splits done by splitting without backtracking often do not be- Since subsumption resolution is a simplification rule (the

‹ ‹

(©B( (8©B(

long to particular branches, but are “shared” across different clause \ or is replaced by a smaller clause ( branches. To explain this effect, we show in Figure 2 a deriva- B ), it can be applied independently of the selection function tion using binary splitting and a blocking selection function or order on literals. which corresponds to the derivation of Figure 1. We preserve Subsumption resolution combined with splitting can give

the same enumeration of inferences as in that figure. If a lit- the following effect: literals can be moved across different

% © 1 B‘© eral in the new signature q is selected, we put it in front of the branches. For example, consider two clauses and

LOGIC PROGRAMMING AND THEOREM PROVING 615

 HIG

HMG

F

 HIG

F

H G H G

F

 HIG H RG

F F

 

F F

RG H G

F

H RG    

K

F F

Q

F F

H 

K K K

HIG H  HIG

P

 

H H G G

F

L

F F

H RG K

  H MHIG

N

J H 

K K

F

H HMG

H 

F

F



H RG

K K

H MHIG

K

F

F

K

 

K





K

H HIG

HIG

K K

HIG” 

K K

F

H RG K

K

F

 

 HIG

F

F



H G

K

H G

 HIG

F

H 

K

F



H H  RG

F

H MHIG

K

S T

F

H HIG

K K

 

K

H 

U

VWVWV

K

G

F

K K

H  HMG

K

HIG¦ 

K



K

G” 

K K

Figure 2: A derivation using binary splitting and blocking selection

% B \]1 such that subsumes . In explicit case analysis, these Theorem 5.2 The clause variance problem is polynomial-

clauses would belong to two different branches, because % ©1 time equivalent to the graph isomorphism problem. Bi©/\]1 should be on the left branch of splitting on 1 , while The proof is given in the full version of this paper.

on the right branch. If we use subsumption resolution, we More generally, in practice we have to check whether a B can replace B¼©¤\]1 by the clause , which corresponds to given component % is a variant of a component in a set of moving B from the right branch to the top. components ¢ . In VAMPIRE we implement this many-to-one

The effect of subsumption resolution is even stronger when clause variance check using the code tree indexing described

B B

% is a variant of or coincides with . Then we obtain two in [Voronkov, 1995; Riazanov and Voronkov, 2000b]. Code %*©,\]1 clauses % ©1 and , which by subsumption resolution trees are used in the implementation of many-to-one forward followed by subsumption will be replaced by % . This has the subsumption, it is not hard to modify them to implement the effect of noting that % belongs to two different branches, and many-to-one clause variance. putting it on top and can save a lot of resources compared to explicit case analysis. Splitting with backtracking would 6 Experimental results repeat inferences with % on different branches. Let us show that subsumption resolution and splitting can, We implemented several versions splitting with backtracking in some cases, give the effect of condensing a clause. Con- in the new version of VAMPIRE [Riazanov and Voronkov, 2001]. In this section we present experimental results over densing is defined in [Joyner, Jr., 1976]. A clause B‡ is said to

4658 nonunit clause form problems. Of these problems, 2820 B ‡ be obtained from a clause B by condensing if is a proper

problems come from the TPTP library [Sutcliffe and Suttner,

B B ‡ subset of B and subsumes .

1998] and 1838 problems from the experiments in list soft-

À

{¿E { ½ ½

Example 4.3 Let %C¾½ be a clause with variables in and ware reuse described in [Schumann and Fischer, 1997]. Of

À

{ %CŽ½ E

be a sequence of variables disjoint from ½ . Then is a the 4658 problems 3300 contain equality. For problems with

À

{¿E %C¾½{~E^© %Cs½ E

variant of %Cs½ . Consider the clause . This clause equality, in addition to splitting without backtracking we im- {~E

can be condensed into %C¾½ . Binary splitting will replace this plemented the branch rewriting rule which replaces a clause

À

%Cs½{¿E¨©¬1 %CŽ½ E^© \]1f

<¹]E©¤¶ %C >·¹]E©”¶ <¤=Á>8©”¶€‡ clause by two clauses and Application %C by a clause , if a clause

of subsumption resolution to these clauses yields the clause with ¶€‡ºc¶ is present. The effect of this simplification rule {¿E %Cs½ which subsumes each of these clauses. Thus, binary is similar to the effect of simplification by unit equalities on splitting and subsumption resolution in this example give the a splitting branch in an explicit case analysis procedure. To effect of condensing. support our viewpoint that we describe a “cheap” implemen- tation of rewriting, we note that branch rewriting was imple- 5 Complexity of splitting with naming mented in less than 3 hours. We present the results for problems with and without To implement splitting with naming, it is required to check equality separately. For all problems we used the same al- whether a component is a variant of another component. In gorithm based on the limited resource strategy [Riazanov and this section we prove that the complexity of checking whether Voronkov, 2000a] and the same selection function. This algo- one clause is a variant of another one is polynomial-time rithm and selection function were chosen because they exper- equivalent to the graph isomorphism problem. imentally proved to be superior to other algorithms and se- Definition 5.1 (clause variance problem) Clause variance lection functions when splitting was not used. When running

is the following decision problem. An instance of this prob- VAMPIRE with splitting, we used the following parameters: %'&

lem is a pair of clauses w;%'&_#%)(¨y . The answer is “yes” if 1. Hyper splitting with a Parallel selection function (P) vs. ( is a variant of % . binary splitting with no selection.

616 LOGIC PROGRAMMING AND THEOREM PROVING no 166:11220:7227:12 256:6 280:4 300:1 359:3 388:3 We believe that the performance of splitting without back- PNR 94:36 75:15 125:30 130:9 153:9 215:14241:11 tracking can be considerably improved, but optimized imple- NR 71:69 43:6 121:58127:41 147:4 175:3 mentation, more experiments, and insights in the behavior of PN 95:60 111:50104:20180:39198:28 splitting are required. N 98:72 101:52 115:9 143:8 PR 68:45 118:38144:35 P 110:53130:44 References R 37:8 [Bachmair and Ganzinger, 2001] L. Bachmair and H. Ganzinger. PNR NR PN N PR P R Resolution theorem proving. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume I, chapter 2, pages 19–99. Elsevier Science, 2001. Figure 3: Results on problems with equality [Fermuller¨ et al., 2001] C. Fermuller¨ , A. Leitsch, U. Hustadt, and T. Tammet. Resolution decision procedures. In A. Robinson PN 65:3 74:11 76:0 80:1 and A. Voronkov, editors, Handbook of Automated Reasoning, N 14:13 16:2 18:1 volume II, chapter 25, pages 1791–1849. Elsevier Science, 2001. no 19:6 21:5 [ ] P 10:7 Joyner, Jr., 1976 William H. Joyner, Jr. Resolution strategies as decision procedures. Journal of the ACM, 23(3):398–417, July N no P 1976. [Riazanov and Voronkov, 2000a] A. Riazanov and A. Voronkov. Figure 4: Results on problems without equality Limited resource strategy in resolution theorem proving. Preprint CSPP-7, Department of Computer Science, University of Manch- ester, October 2000. 2. Naming (N) vs. no naming. [Riazanov and Voronkov, 2000b] A. Riazanov and A. Voronkov. 3. Branch Rewriting (R) vs. no branch rewriting. Partially adaptive code trees. In M. Ojeda-Aciego, I.P. de Guzman,´ G. Brewka, and L.M. Pereira, editors, Logics in We will use combinations of the three letters P, N, R, to de- Artificial Intelligence. European Workshop, JELIA 2000, volume note different strategies used in the experiments. For exam- 1919 of Lecture Notes in Artificial Intelligence, pages 209–223, ple, PR means that the Parallel version with branch Rewriting Malaga,´ Spain, 2000. Springer Verlag. but no naming was used. We use “no” to denote the strategy [Riazanov and Voronkov, 2001] A. Riazanov and A. Voronkov. that uses no splitting at all, and  the strategy that uses split- Vampire 1.1 (system description). Siena, Italy, June 2001. Ac- ting without P,R, and N, i.e. simple binary splitting with a cepted to IJCAR 2001. blocking selection function. [Schumann and Fischer, 1997] J. Schumann and B. Fischer. We used a Linux-running PC with 256M RAM and a NORA/HAMMR: Making deduction-based software component 400MHz Intel III processor. For all problems we used the retrieval practical. In Proc. Automated Software Engineering time limit of 1 minute. The chosen strategy without splitting (ASE-97), pages 246–254, Lake Tahoe, November 1997. IEEE solves 3091 problems. The total number of problems solved Computer Society Press. with or without splitting is 3189. So with splitting, VAM- [Sutcliffe and Suttner, 1998] G. Sutcliffe and C. Suttner. The TPTP PIRE could solve 98 problems that could not be solved with- problem library — CNF release v. 1.2.1. Journal of Automated out splitting. Of these 98 problems, 77 are without equality Reasoning, 21(2), 1998. and 21 with equality. Figures 3 and 4 show the comparative [Voronkov, 1995] A. Voronkov. The anatomy of Vampire: Imple- behavior of various strategies depending on the settings of the menting bottom-up procedures with code trees. Journal of Auto- parameters P, N, R on problems with and without equality, re- mated Reasoning, 15(2):237–265, 1995.

spectively. The pair ‰d&ƒÃ‰M( in a row labelled by a strategy

[Weidenbach et al., 1999] C. Weidenbach, B. Afshordel, U. Brahm,

& ( ¢

¢ and the column labelled by a strategy means that there C. Cohrs, T. Engel, E. Keen, C. Theobalt, and D. Topic. System

& & ( (

¢ ¢ ‰

were ‰ problems solved by but not solved by and description: SPASS version 1.0.0. In H. Ganzinger, editor, Auto-

( & ¢ problems solved by ¢ but not solved by . For example mated Deduction—CADE-16. 16th International Conference on NR (Naming + branch Rewriting) could solve 71 problems Automated Deduction, volume 1632 of Lecture Notes in Artificial with equality not solved by PN (Parallel selection + Naming), Intelligence, pages 378–382, Trento, Italy, July 1999. while PN could solve 69 problems with equality not solved by NR. The following conclusion can be derived from these re- sults:

1. On problems with equality, VAMPIRE with any version of splitting is on the average weaker than VAMPIRE without splitting, while on problems without equality it is on the average stronger. 2. Each of the following settings: naming, parallel split- ting, and branch rewriting improve the results, by far the strongest one is naming.

LOGIC PROGRAMMING AND THEOREM PROVING 617 UNSEARCHMO: Eliminating Redundant Search Space on Backtracking for Forward Chaining Theorem Proving

Lifeng He Faculty of Information Science and Technology Aichi Prefectural University, Aichi, 480-1198 Japan

Abstract ( ¥ -SATCHMORE). The other method, proposed in [He, 2001] (called I-SATCHMO) is eliminating redundant search- This paper introduces how to eliminate redundant ing space after unnecessary non-Horn clauses have been used search space for forward chaining theorem prov- for forward chaining by intelligent backtracking. ing as much as possible. We consider how to keep As indicated in [He, 2001], there are mainly two problems on minimal useful consequent atom sets for neces- about the approach to select a suitable non-Horn clause for sary branches in a proof tree. In the most cases, forward chaining. One is that it takes much overhead cost an unnecessary non-Horn clause used for forward to decide whether a non-Horn clause is suitable or not. In chaining will be split only once. The increase of general, the stricter the checking is, the more expensive the the search space by invoking unnecessary forward overhead cost is. The other is that any such strategy is only chaining clauses will be nearly linear, not exponen- effective in a limited range and certainly fails in some more tial anymore. In a certain sense, we “unsearch” complicated cases. In fact, a non-Horn clause recognized more than necessary. We explain the principle of as “relevant” by some advanced selecting strategy might be our method, and provide an example to show that practically unnecessary to the refutation being made. In other our approach is powerful for forward chaining the- words, there are always such irrelevant non-Horn clauses that orem proving. can pass any relevancy test. These problems have been illus- trated on some examples in [He, 2001]. The principle of the method of eliminating redundant 1 Introduction searching space by intelligent backtracking is very simple: Automated reasoning is one of the most important topics for If one of the consequent atoms of a forward chaining clause artificial intelligence and computer science. Among others, used for forward chaining is found to be useless to the rea- theorem proving technologies for first-order predicate calcu- soning on backtracking, the use of this clause is known as lus have been attracting much interesting [Robinson, 1965; unnecessary, and the remaining processing over the clause’s Loveland, 1968; Manthey and Bry, 1988]. consequence is immediately abandoned. It only takes a lit-

It is well-known that if ¡£¢ is an unsatisfiable subset of a tle overhead cost to find irrelevant non-Horn clauses and is

¢ ¡ clause set ¡ , it is generally easier to show unsatisfiable compatible with the former strategy introduced above. than to show ¡ unsatisfiable. Obviously, the smaller such a However, marking those consequent atoms contributed to ¡£¢ is, the shorter a proof is. If forward chaining is used for derive antecendents of irrelevant forward chaining clauses as reasoning about non-Horn clauses, it means that we need not useful and mixing up the useful consequent atoms for differ- consider every violated clause, but only those that can help ent branches in a proof tree, I-SATCHMO might make redu- us to find a refutation. It is obvious that invoking unneces- dant search. sary non-Horn clauses will explode the search space expo- This paper introduces a solution for these problems. As nentially. we will see, in our improved method, for each node in the SATCHMO (SATisfiability CHecking by MOdel genera- proof tree, only those indispensable consequent atoms to de- tion) [Manthey and Bry, 1988] is potentially inefficient, since rive the refutation at the node are marked as useful. Thus, our it uses all violated clauses to forward chaining. Addressing method can show a refutation for a given unsatisfiable clause to this problem, two strategies have been developed. One is set without UNSEARCHing MOre than necessary (hence the utilizing an intelligent strategy to only select those relevant- acronym UNSEARCHMO). In other words, our method is like non-Horn clauses for forward chaining, which is pro- one of the most efficient strategies for forward chaining theo- posed in [Ramsay, 1991], refined in [Loveland et al., 1995] rem proving. (SATCHMORE) and further improved in [He et al., 1998]

¤ 2 SATCHMO and I-SATCHMO This work is partially supported by the Japanese Ministry of Ed- ucation, Science, Sports and Culture and the Artificial Intelligence We assume familiarity with the details relating to SATCHMO Research Promotion Foundation of Japan. [Manthey and Bry, 1988], SATCHMORE [Loveland et al.,

618 LOGIC PROGRAMMING AND THEOREM PROVING

CO>PD ¥

] Q R

1995 as well as -SATCHMORE and limit ourselves to Q

R

S T

briefly¦ reviewing the basic material needed for our presen-

V U V U V

tation. U

X X W

In this paper, a problem for checking unsatisfiability is rep- W

Z Z Y

resented by means of positive implicational clauses, each of Y

 ) )   "!

which has of the form §©¨   §¨   ( #

). We refer to the implicit conjunction §©¨   § as the

) ) ) )

antecedent of the clause, with each §$ being an antecedent

¨     atom. The implicit disjunction  is referred to as

Figure 1: The proof tree []\ constructed by SATCHMO

the consequent of the clause, and each &% is a consequent #

atom. A clause with an empty consequent, i.e., (' , is

¨ 

 § ()

called a negative clause and is written as § , 

3. For each a consequent atom $ of the selected violated )

where means false. A clause with an empty antecedent, .

#

$  , clause, create a child node below the current

i.e., *' , is written with consequent part only if it is a  fact (Horn clause), and otherwise the antecedent atom true is node. Taking node $ as new current node, call this pro-

added. Moreover, a given clause set is divided into two sub- cedure recursively in depth-first strategy with the aug- 2F^`_ mented ground consequent atom set 2I^`_ , where =

sets. One is +-, , the Backward Chaining component that is a

2 6

decidable Horn clause set consisting of all negative clauses, E .

B 2

facts and any Horn clauses selected by user. The other is For node , where the ground consequent atom set is E ,

. , , the Forward Chaining component that contains all the if all branches terminate in a leaf node ) , it means that satis-

remaining clauses. fying atom B there will lead to a refutation anyway. In other

.

/

+&,b6c2 +&,b6 ,

If goal (a conjunction of atoms) can be proven to log- words, E cannot be extended as a model of .

¡ ¡10

ically follow the clause set , we denote that with / . Such node is said to be unsatisfiable. When the node is the

.

2 §

+&,H6 ,

On the other hand, if is a ground atom set and a ground root node C+,dD , then is unsatisfiable. Correspond-

§4352 § 2

[ce B atom, indicates that atom is a member of . Be- ing to creating node B in , an atom is asserted into

cause we utilizes forward chaining on non-Horn clauses same the reasoning database. On the other hand, when node B is

as SATCHMO, all of clauses must hold the so-called range- proven to be unsatisfiable, the atom B will be retracted from

restricted property1 to guarantee the soundness for refutation. the database. Moreover, whenever a node is proven unsatis- 2 Suppose that +&, be a decidable Horn clause set and a fiable, the reasoning process will backtrack to its parent node ground atom set, then a conjunction (disjunction) of ground if any.

atoms is satisfiable in +-,762 if all (some) of its members Example 2.1 Consider the following propositional clause set

¡ .

can be derived from +-,8692 , and else unsatisfiable. A ground ,

'@+-,b6 .

+-,86<2 ,

clause ¥:;, is said satisfiable in if is satisfiable

T X S W

+-,gf h)i j)c 1h)i +&,=6>2

or ¥ is unsatisfiable in , and else violated.

W X

¡

k]j)c ml-h)c

It is well-known that a clause set is satisfiable if and Gh)i

¡

.

T Xpo S T T Xpo X W ?

only if we can find a model ? for . If is a model of

,gfnl   l   rqstkuFlv

¡ ¡ T Xpo

, then each clause of is satisfiable in ? . Based on such

l t gF ¡

consideration, to check whether a given clause set '@+,A6 .

, is satisfiable, SATCHMO goes to construct a model for The proof tree generated by SATCHMO is shown in Figure this clause set by trying to satisfy all clauses in ¡ . 1. It is obvious that SATCHMO made some redundant search. In fact, a refutation can be derived from the subset consisting

The reasoning made by SATCHMO to search a model for .

.

T Xpo

, +&, l 4 wF , a given clause set +-,>6 can be graphically illustrated in a of and the only clause .

By the way, for the given clause set in this example, ev- Proof Tree (abbreviated to PT hereafter) described as follows. .

ery violated , clause used by SATCHMO is recognized as

“relevant” by either SATCHMORE or ¥ -SATCHMORE, the C+-,=D For the current node B (initially the root node ), and proof tree generated by SATCHMORE or ¥ -SATCHMORE

the ground consequent atom set 2FE (initially empty): is the same as that shown in Figure 1.

0

) ) B 1. If +&,G6H2IE , create a leaf node below node . The redundant search space can be cut down by intelligent

The process for the branch terminates. backtracking presented in I-SATCHMO. As we have seen, a

0

+-,J692 ) ELK

2. If , select an instantiated ground violated consequent atom B is retracted from the reasoning database

.

§©¨   §>4M¨N  £ ,

clause from . If no such only if the node B in the proof tree has been proven as unsat-

.

+-,H6 ,

clause exists, the process terminates to report that isfiable. Suppose that ¥xy¨    z be the violated

. ,

is satisfiable. clause used for forward chaining at node B in the proof tree,

{¨ z then, node B holds child nodes , , . If some con- 1A clause is range-restricted iff every variable in its consequent also occurs in its antecedent. An important property of such clauses sequent atom z$ is found to be useless to the reasoning at the is that if its antecedent is satisfied by a set of ground atoms, then time being retracted from the database (hence node z$ has

every consequent atom is certainly ground. As indicated by Manthey been proven as unsatisfiable), i.e., it is useless to construct $ and Bry (1988) and Bry and Yahya (2000), any first-order clause set the partial proof tree below node  , the same partial proof

can be transformed into the same “meaning” range-restricted one. tree can be directly constructed below node B , and therefore,

LOGIC PROGRAMMING AND THEOREM PROVING 619

) The algorithm of I-SATCHMO to construct PT for check-

¡

.

Q V } | U

R

Q

R

Q V U

R

'¥+-,A6 ,

Q

V U

R ing whether a given clause set is unsatisfiable

X S W W X

T can be described as follows.

 k l C +-,dD

For the current node B (initially the root node ) and

2

E

W ~ XF~

SF~ the corresponding consequent atom set (initially empty):

ml

0

+&,J6i2 ) )

1. If E , create a leaf node under the current

B Figure 2: The deriving tree for goal ) at node node, which means that node is proven unsatisfiable.

Mark all consequent atoms listed in the successful deriv-

§ ˆ¨‰u‹Œu©

ing node in Bb\ as useful. 0

at this point, we know that node B can be proven as unsat-

+-,J692IE )

2. If K , select an instantiated ground violated

.

 B isfiable just in the same way made at node $ . Since has clause from , . If no such clause exists, the process

been known as unsatisfiable, the remaining process for show- . , terminates to report that +-,b6 is not unsatisfiable.

ing node B unsatisfiable, i.e., the process on the consequent

¥ªn 

3. For the selected violated non-Horn clause ¨ ; ;

   

atoms $€=¨ , , (if any), can be removed at once. On the .

z , mark all consequent atoms listed in the success- ¥‚M¨N   £

other hand, the use of an , clause is nec-

«

© ˆ¨‰‹ Œ

ful deriving node in B‡\ as useful. Then, create ƒ©„@z$„

essary only if each z$ ( ) is useful to constructing ¬ a node z$ below the current node for each such that

the partial tree below node z$ .

ƒ­„š¬H„š  Now we introduce I-SATCHMO how to decide what con- . Taking $ as the new current node, call

sequent atom is useful for the reasoning. Obviously, a con- this procedure recursively in depth-first strategy with the

$

2IEg6JC  D augmented consequent atom set 2I^`_ = . How-

sequent atom asserted can only be used to derive goal ) or

.  ever, if $ has been found to be useless on backtracking,

antecedents of violated , clauses. Therefore,I-SATCHMO

% ¬®°¯J„¡

marks an asserted consequent atom as useful whenever it con- instead of each node  such that , a special

. ±

leaf node ± is created, where means that the process ,

tributed to derive goal ) or an antecedent of a violated 

clause. from there is pruned away. On the other hand, if all $

 z are marked as useful, the use of ¥L"¨ ; ; is The useful consequent atoms to derive goal / can be found

known as necessary. In either case, node B is proven as from the deriving tree of goal / [He, 2001].

unsatisfiable.

¡

¡ .

Definition 2.1 Suppose that a give clause set be 'x+&,†6

. ,

Example 2.3 Suppose that 'x+,†6 be the clause set 2 , , the current consequent atom set established by forward

0 given in in Example 2.1.

/ +&,†6>2 / X

chaining, a goal such that . The deriving tree S

 1 š4)

In the branch C +-,dD in Figure 1, only



+-,>62 B‡\ˆŠ‰u‹Œ of goal / with respect to , denoted by , is the

atom is used in forward chaining and marked as useful. In X

result tree constructed as follows. S

 ‚<t)

the same way, in the branch C +&,dDM , only /

1. The root node is . atom  is used in forward chaining and marked as useful.

~ 

~ Since both and are proven to be unsatisfiable, its par-

¨

§ §Ž   §`   

¨ %

2. Suppose that is a node in X

ent node, i.e., is proven to be unsatisfiable. We backtrack

X X the tree being constructed. X

‘“’•” to node to retract . At that time, we know is not marked Bb¨   B>–wr—

For each +-, clause such that as useful in the reasoning. Therefore, it is useless for the rea-

.

TXpo X W

—™˜'š§©¨v˜ ˜

l  

, where is a most general unifier of soning and the use of the violated , clause

S

¨ ¨ –

§ B ˜I    B ˜I m§Ž ˜I

— and , create a child node selected at node is unnecessary. I-SATCHMO prunes away

~ ~

W

  m§ ˜8  ›  œ

 ¨

% . the process to be made on the remaining consequent atom ‘ž ”

and backtracks to node S at once.  '¡§¢¨v£ £ For each atom Ÿ3H2 such that , where

Now, we know that S is also not marked as useful in the 

is a ground substitution (since is a ground atom), .

TXpo S T

~ ~

,  

reasoning, therefore the violated clause l se-

§ £¤   m§ £    

Ž  ¨

create a child node , % ~

lected at the root node C+-,=D is unnecessary. Accordingly, we

 &%7¨{'¥

%v7¨ , where . stop to process the remaining consequent atom T .

3. Repeat this procedure recursively until an empty leaf or In this way, the proof tree constructed by I-SATCHMO is

a leaf such that all of its atoms hold “ ” is derived. Such . shown in Figure 3. As we see, unnecessary , clauses are

a leaf is called successful deriving leaf. split only once.

0



+-,c62 / B9\Mˆ›‰u‹ Œ Since +-, is decidable and , can be fi-

nally constructed with finding a successful deriving leaf. The 3 UNSEARCHMO +-,†6b2 consequent atoms that contributed to derive / from are those atoms listed in the successful deriving leaf if any. Since I-SATCHMO marks an asserted consequent atom as

useful whenever it contributes to derive thea antecendent of

¡

.

.

'x+,†6 , Example 2.2 Suppose that be the clause set a violated , clause, those consequent atoms contributed to

given in Example 2.1. Consider the leftmost node in Fig- unnecessary violated clauses are also marked as useful.

S X



':C ™D )

ure 1, where 2 . The deriving tree for goal , This problem can be easily solved by improving the mark-

§

ˆ›‰‹ Œ¦

) B9\ , is shown in Figure 2. To derive goal , only is ing work as follows: an asserted consequent atom is marked

useful. as useful only when it contributes to derive goal ) or the

620 LOGIC PROGRAMMING AND THEOREM PROVING

¥³t   

¨ 

CO>PD

Q

Q R

R

¡

S

±

E

V U

B

V U

V U

X

V U

V · U

±

·

Z Y

M¨  £$  

¡ ¡ ¡



_

^p¸ ^ ^

¦ ) ) Figure 5: Finding the useful consequent atom set for parent node from those of its children nodes

Figure 3: The proof tree []\ constructed by I-SATCHMO in

Example 2.3 ’ select atom ² first to process at node or we first select the

CO>PD last non-Horn clause for forward chaining. It is the well-

Q

Q

R

R 

’ known nondetermination problem for theorem proving.

V U V U V

U Now, we consider how to distinguish and manage the use-

² ² o

o ful consequent atoms for different branches in a proof tree.

Z Z Y Z Y Z Y

Y Without the loss of generality, shown as in Figure 5, suppose

X W X W

¨ 

¥ªn  

 

that ; ; be the violated clause selected at node

¡

B _ $

and ^ be the useful consequent atom set relative to

¬ ƒi„@¬„¹

) ) ) ) ) ) )

) for all such that . We consider how to derive the

¡ B

useful consequent atom set E relative to node .

[]\ 

Figure 4: The proof tree constructed by I-SATCHMO for Since our reasoning procedure is depth-first one, $€7¨ is 

the clause set given in Example 3.1 not taken into processing until node $ has been proven to be $

unsatisfiable. Moreover, only when some of  is found to be

‘ ” ƒi„¹¬£„@

useless or all node z$ are proven to be unsatis-

. , antecedent of a violated clause that has been verified as fiable, their parent node B is proven to be unsatisfiable. necessary. As we will know from the algorithm of UNSEARCHMO,

Moreover, without distinguishing the useful consequent when node z$ is proven to be unsatisfiable, the useful conse-

¡

_ $ 

atoms for different branches in proof tree, I-SATCHMO quent atom set ^ for deriving the refutation at node has ¡

might mismark those consequent atoms that are useful for the already been established. The useful consequent atom set E B refutation of unnecessary branches in a proof tree but use- for node B is constructed according to the way how node

less for the necessary branches as “useful”. In such cases, I- is proven as unsatisfiable.

$ 

SATCHMO also possibly makes some redundant search. Let If node B is proven as unsatisfiable in the way that all of

‘ ”

us see the following example. ƒ9„¬]„s are proven as unsatisfiable, then the use of the

¨ 

 

Example 3.1 Consider the following propositional clause set clause ¥º‚ ; ; is found to be necessary. In order to

¡ .

complete the refutation at node B , the clause should be ably , '@+-,b6 .

found to be violated there. Therefore, the useful consequent

X W

Oi³fn 1h)c nGh)i h)c h)i

atoms to derive the antecedent ¥ are useful for our refuta-

’  ’

²

´

T Xpo o o

T Xpo tion. Moreover, in order to complete the refutation made at

³fnl   ‚l   4 gF



o TXpo X W

each node z$ , each useful consequent atom for the refutation

4 wF 4l   £$

made at node £$ , except itself (since it can be obtained by

  z B

The proof tree generated by I-SATCHMO is shown in Fig- splitting ¥³¶¨ ; ; at node ), is also needed. Con- ¡

ure 4. cludingly, in this case, the useful consequent atom set E can o

Consider the leftmost node labeled with , where 2Iµ '

’

’ be derived as follows.

o o

¶ wI C D

, the clause becomes violated and is used



¡ ¡ ¡

to forward chaining. Since both and are useful for de- ¡

«

'C C ¨NDD£6G½½½6

^»¸-¼ ^ ¼ E (1)

riving ) , the use of the clause is found to be necessary to the

¦

’

o ¡

refutation, and are marked as useful. In this way, the pro- «



² o where is the useful consequent atom set for deriving the

cess for the consequent atom (the sibling atom of ) and ’ antecedent ¥ . (the sibling atom of ) are made, respectively.

On the other hand, if node B is proven as unsatisfiable in

 ‘ ”

$ ƒP„¥¬z„¥

Obviously, the process for node in the above example is the way that some of  is found to be useless,

’

  z

unnecessary. Although the consequent atom is useful for then the use of the clause ¥s1M¨ ; ; is found to be un-

¡

o

  3

$ ${K _

each branch from node , it is useless for any branch from necessary for the refutation made on node , i.e., ^ .

² ² $

node . It means that the refutation at node can be made In other words, the refutation found at node  can also be

’

¨ 

¥*¾   

without the consequent atom , that is, that can be made on made at node B without using the clause ; ; .

¡ C +&,dD

the root node in the same way. Therefore, the redundant Thus, the useful consequent atom set E for deriving a refu-



¡

_ ^ process for the consequent atom can be eliminated. tation at node B is exactly the same as , i.e.,

Although the process made at the leftmost node o is also

¡ ¡

' _ ^ unnecessary in a strict meaning, it is unavoidable unless we E (2)

LOGIC PROGRAMMING AND THEOREM PROVING 621

‘ ”

¡ ¡

^•_ ƒb„¬]„ In any case, when E is derived, all of The proof of the correctness of our UNSEARCHMO is

are remov¿ ed from database. not difficult. Since UNSEARCHMO just cuts down some Let us make a comparison of I-SATCHMO’s method and branches in the proof tree constructed by SATCHMO, if

our new strategy. When node B is proven as unsatisfiable SATCHMO can show a given clause set unsatisfiable, then

in the way that all of z$ are proven as unsatisfiable, the two UNSEARCHMO can certainly show the same thing. This

methods are exactly similar, the useful consequent atom set shows the completeness of UNSEARCHMO. ¡

E is obtained according to the formula (1). However, when On the other hand, as we have indicated above, an asserted

B z$

node is proven as unsatisfiable in the way that node consequent atom B is found useless at the time being re- ¡

is found to be useless, the useful consequent atom set E tracted from the reasoning database, its parent node is able to derived by I-SATCHMO is as follows. be proven as unsatisfiable. Therefore, all remaining process

to show its parent node unsatisfiable can be removed without

¡ ¡ ¡

¡ the loss of the soundness. The further details are omitted here

¨ $ÃÂI¨

E°'C ^ C  DD-6<½ ½½ 6JC ^¤_ÁÀ C  DD-6 ^¤_ ¸¼ ¸¼ (3) since the lack of space.

Obviously, although the consequent atoms contained in ¡

¡ It is easy to implement UNSEARCHMO by Prolog. The

C C {¨NDD  C C z$ÃÂI¨NDD

^ ¼ ^•_ÄÀ ¼ ¸ ¸ , , are useful for the

code is omitted here also because of the lack of space.

½ ½½ £$ÃÂI¨

branches beginning from ¨ , , , but useless for the $ refutation made at node  , and therefore is also useless to 4 An Example prove B unsatisfiable.

As we have seen, if a consequent atom, except the last atom Example 4.1 Let ¡ be the following clause set, an extension . of the consequence, of a violated , clause is found to be of the benchmark problem SYN009-1 given in TPTP problem

useful, then its next sibling atom will be taken into process- library [Sutcliffe and Suttner, 2000].

” ‘ÇÆ ”

ing. Therefore, the fewer the useful consequent atoms are, the ‘ÇÆ S

+-,gf È- É È& É h)i

‘ÇÆ ” ‘ÇÆ ”

more efficient a refutation is. It is quite clear that the useful Ê

È É È É h)i

” ‘ÃÆ ”

consequent atom set for any node derived by our new strat- ‘ÇÆ

T

È É m È É j)c

Ë ËN”

egy cannot be reduced any further, i.e., it is a minimal one. ‘ÇË

 h)i

‘“” ‘“ËN”

Since we can keep a minimal useful consequent atom set for ‘“’•”

k k

a refutation, our new strategy is one of the most efficient ap- k

‘ÇÆw” ‘ ” ‘ ”

proaches for forward chaining theorem proving. .

,gf;k k È k É

‘ÃÆ ” ‘ÇÆ ” ‘ÃÆ ”

Ê T

The algorithm of our prover UNSEARCHMO to construct S

 È vÉ  È vÉ  È É

¡

.

‘ÃÆ” ‘ ” ‘ ” ‘ÇÆ ” ‘ÇÆ ” ,

PT for checking whether a given clause set 's+&,A6 is

k k È k É 4 È& É m mÈ vÉ

unsatisfiable can be described as follows. .

The first , clause will be fully split by any of C +-,dD For the current node B (initially the root node ) and

SATCHMO, SATCHMORE and ¥ -SATCHMORE. That ŽÍ

its corresponding consequent atom set 2IE (initially empty):

Ì

0 means more than cases will be considered. As a result, we

+&,G6H2IE ) ) B 1. If , create a leaf node below node , gave up the our tests on SATCHMO, SATCHMORE and ¥ -

which means node B is proven unsatisfiable. The useful

¡ SATCHMORE with a run of three days without termination B consequent atom set for node , E , that consists of on an Intel PentiumIII/650MHZ workstation, respectively.

those consequent atoms listed in the successful deriving Now, we see how I-SATCHMO and UNSEARCHMO

.

§

©

ˆŠ‰u‹Œ B‡\ leaf in , is established. work. Initially, the first , clause is used for forward chain-

0 ing with each substitution such that its antecedent becomes

+-,J692IE )

K

’ ’•”

2. If , select an instantiated ground violated ‘“’

S

.



satisfiable. In this way, a branch C+&,=D­

, ¥>5    

¨ 

’ ” ‘Ç’ ’ ËN” ‘“Ë Ë ËN”

clause from . If no such clause ex- ‘“’

S S S

.

 t½½ ½u

is generated.

+&,H6 ,

‘ÇË Ë ËN”

ists, is satisfiable. .

S

,

At node , the second clause with the substi-

Æ ’ ’ ’ ‘“’¤” ‘“’•” ‘Ç’•”

¨ 

3. For the selected violated non-Horn clause ¥ªn ; ;

' È"' Ét' D k k k 

tution C , , , i.e.,

‘“’ ’ ’•” ‘“’ ’ ’•” ‘Ç’ ’ ’¤”

£$ ¬ ƒ©„ ¬-„

z , create a node for all such that . Tak-

• )

, is first split. For node ,  ing $ as the new current node, call this procedure re-

can be derived and the useful consequent atom set is estab-

‘“’ ’ ’¤” ‘Ç’ ’ ’¤” ‘“’ ’ ’•”

cursively in depth-first strategy with the augmented con- S

D  )

lished as C , . At node , can

$ $

D  sequent atom set 2I^`_<'Å28Es6 C  . However, if

is found to be useless on backtracking, instead of each

CO>PD

Q

 ¬<®š¯„‚

%

Q R

node such that , create a special node R

’

¡

±

B E

± . The useful consequent atom set for node , , is

V U U

established according to the formula (2). On the other ²

o

 ƒH„ª¬©„ª

hand, if every $ for is found to be useful,

Z Y Z Y ¡

the useful consequent atom set E is established accord-

X W



ing to the formula (1). In either case, node B is proven

unsatisfiable.

¡

.

) ) ) ) , Example 3.2 Suppose that 'x+,†6 be the clause set given in in Example 2.1. The proof tree constructed by UN- Figure 6: The proof tree [©\ constructed by UNSEARCHMO SEARCHMO is shown in Figure 6. Redundant search space for the clause set given in Example 3.1 has been eliminated.

622 LOGIC PROGRAMMING AND THEOREM PROVING .

not be derived, and the second , clause with the substi- proving as much as possible. The redundant search branches

Æ ’ ’  ‘“’•” ‘Ç’•” ‘ž ”

' ÈÎ' ɶ' D k k k 

tution C , , , i.e., are immediately pruned away when unnecessary ones are

‘Ç’ ’ ” ‘“’ ’  ” ‘Ç’ ’ ”



; , is split. At node , similar as found on backtracking. At any reasoning point, only those

‘“’ ’ ’•”

)

node , is derived and the useful consequent atom necessary violated clauses are completely split for forward

‘Ç’ ’  ” ‘Ç’ ’  ” ‘“’ ’ ”

S

D 

set is C , . For node , similar as chaining. Repeated search by invoking unnecessary non-

‘“’ ’ ’•”

node  , no refutation can be derived, then the second Horn clauses can be effectively eliminated. The experimen-

Æ ’ '

non-Horn clause will be split with the substitution C , tal result has shown that our method is powerful for forward

’ Ë

ɥ' D ȳ' , , and so on. chaining theorem proving.

The second non-Horn clause will be further split until Since our strategy prunes away unnecessary proof

‘“ËN” ‘“ËN” ‘“ËN” ‘“Ë Ë ËN” ‘“Ë Ë ËN”

k k Ï Ð

k is used for forward branches after unnecessary non-Horn clauses are used for

‘ž Ë Ë ”

chaining at node  . Then, at last a refutation can forward chaining, the violated non-Horn clauses for forward

‘ž Ë ËN”

be derived from each of node  ’s child node simul- chaining can be used in any order decided by any advance

taneously. The useful consequent atom set for the refutation checking strategy. That is, we can incorporate the relevancy

‘“Ë Ë ËN” ‘“Ë Ë ËN” ‘“Ë Ë ËN”

S

C Ð D

at node is and that at node checking proposed in SATCHMORE and the availability test-

‘“Ë Ë ËN” ‘“Ë Ë ËN”

¥

C D

 is just . According to the formula (1), the ing introduced in -SATCHMORE into our strategies to im-

‘ž Ë Ë ”

useful consequent atom set for node  is constructed prove the performance further.

‘“Ë Ë ËN”

S

D as C . It is obvious that our method can be applied to disjunc- If we do not distinguish the useful consequent atoms for tion logic programming (database). The principle of our different branches, as I-SATCHMO does, we will find that all method to eliminate redundant search space in forward chain-

S -facts have been marked as useful. Then, the same process ing based prover can also be applied to reason about other

‘ÇÑ ” Ñ

S S

Ò Ó

made on each -fact mÒ» Ó , where each of , and is logics whenever forward chaining strategy is used. This gives

 Ë

one of ’ , and , would be repeated for the corresponding our method a wide application in automated reasoning field.

‘ÇÑ ” ‘ÇÑ ”

Ê Ê T T ÐÒ» Ó -fact ÐÒ» Ó (also for the corresponding -fact ). As future works, we will use our prover to test those bench- Similar to other existing strategies, no answer can be obtained mark problems provided in TPTP library [Sutcliffe and Sut-

within a reasonable time. tner, 2000], to clear which class of problems will be helped

‘ž Ë ËN”

However, at node  , where the useful consequent by our approach.

‘“Ë Ë ËN” ‘ž Ë Ë ”

S

D 

atom set is C , UNSEARCHMO finds

‘ž” ‘“ËN” ‘ÇËN”

k k

useless for the refutation, i.e., the clause k

‘ž Ë ËN” ‘ž Ë ËN” Ë Ë ”

‘ž References

Ô m

selected at node  ’s parent

Ë ËN”

‘“’ [Bry and Yahya, 2000] Bry, F. and Yahya, A.: Positive

 is not necessary. According to the formula (1),

the useful consequent atom set derived at its sibling node Unit Hyperresolution Tableaux and Their Application

‘“ Ë ËN” ‘ž Ë ËN” ‘ž Ë Ë ”

S to Minimal Model Generation. Journal of Automated

C m D

, i.e., , is abandoned, and

Ë ËN”

‘“’ Reasoning, 25:35-82, 2000.

the useful consequent atom set for node  is still

‘“Ë Ë ËN”

S

D C . [He et al., 1998] He, L., Chao, Y., Simajiri, Y., Seki, H. and

Similar backtracking repeatedly continues to node Itoh, H.: ¥ -SATCHMORE: SATCHMORE with Avail-

‘“Ë Ë ËN”

S

, where the useful consequent atom set established ability Checking. New Generation Computing, 16:55-

‘ÇË Ë Ë ”

S

D

by UNSEARCHMO is still C . The same process 74, 1998.

‘“Ë Ë ËN”

S

made on node will be also made on its sibling [ ]

Ë ËN” ‘“Ë Ë ËN”

‘“Ë He, 2001 He, L.: I-SATCHMO: an Improvement of

Ê T

node and node , and the corresponding

Ë ËN”

‘“Ë SATCHMO. To appear in J. of automated reasoning.

Ê

D

useful consequent atom set are found to be C and

‘“Ë Ë ËN” ‘“ËN” ‘ÇË ” ‘“ËN”

T [Loveland, 1968] Loveland, D.W.: Mechanical Theorem

D k k k

C , respectively. The violated clause

‘“Ë Ë ËN” ‘“Ë Ë ËN” ‘ÇË Ë ËN” ‘“ Ë ËN”

S Ê T

S Proving by Model Elimination. J. of the ACM, 15:236-

  

selected at node is 251, 1968. found to be necessary.

Now, let us see what happens when UNSEARCHMO [Loveland et al., 1995] Loveland, D.W., Reed, D.W. and

‘“Ë Ë ËN” ‘ž Ë ËN” S

S Wilson, D.S.: SATCHMORE: SATCHMO with REl-

backtracks to ’s parent node , where, ac- cording to the formula (1), useful consequent atom set is sur- evancy. Journal of Automated Reasoning, 14:325-351, prisedly found to be empty! no other consequent atom is use- 1995. ful for our refutation! UNSEARCHMO simply backtracks to [Manthey and Bry, 1988] Manthey, R. and Bry, F.:

the root node C+-,=D to conclude that the given clause set is SATCHMO: a theorem prover implemented in unsatisfiable. Prolog. In Proceedings of 9th intl. Conf. on Automated With the help of distinguishing the useful consequent Deduction, 1988. atoms for different branches, UNSEARCHMO only takes [Ramsay, 1991] Ramsay, A.: Generating Relevant Models. 0.002 seconds to find its solution on an Intel Pentiu- Journal of Automated Reasoning, 7:359-368, 1991. mIII/650MHZ workstation. [Robinson, 1965] Robinson, J.A.: A Machine-oriented logic based on the resolution principle. J. of Ass. comput. 5 Conclusion Mach., 12, 23-41, 1965.

In this paper, we have introduced how to eliminate redundant [Sutcliffe and Suttner, 2000] Sutcliffe, G. and Suttner, C.:

Õ Õ search space on backtracking for forward chaining theorem http: ÕÕ www.cs.jcu.edu.au ˜ tptp

LOGIC PROGRAMMING AND THEOREM PROVING 623 Theorem Proving with Structured Theories

Sheila McIlraith∗ and Eyal Amir Department of Computer Science, Gates Building, Wing 2A Stanford University, Stanford, CA 94305-9020, USA {sheila.mcilraith,eyal.amir}@cs.stanford.edu

Abstract convert a graphical representation of the problem into a tree- structured representation, where each node in the tree repre- Motivated by the problem of query answering over sents a tightly-connected subproblem, and the arcs represent multiple structured commonsense theories, we ex- the loose coupling between subproblems. Inference is done ploit graph-based techniques to improve the ef- locally at each node and the necessary information is propa- ficiency of theorem proving for structured theo- gated between nodes to provide a global solution. Inference ries. Theories are organized into subtheories that thus proves to be linear in the tree structure, and often worst- are minimally connected by the literals they share. case exponential within the individual nodes. We present message-passing algorithms that reason We leverage these ideas to perform more efficient sound over these theories using consequence finding, spe- and complete theorem proving over theories in first-order cializing our algorithms for the case of first-order logic (FOL) and propositional logic. In this paper we as- resolution, and for batch and concurrent theorem sume that we are given a first-order or propositional theory proving. We provide an algorithm that restricts the that is partitioned into subtheories that are minimally cou- interaction between subtheories by exploiting the pled, sharing minimal vocabulary. Sometimes this partition- polarity of literals. We attempt to minimize the ing is provided by the user because the problem requires rea- reasoning within each individual partition by ex- soning over multiple KBs. Other times, a partitioning is in- ploiting existing algorithms for focused incremen- duced automatically to improve the efficiency of reasoning. tal and general consequence finding. Finally, we (Some automated techniques for performing this partitioning propose an algorithm that compiles each subtheory are discussed in [3; 2].) This partitioning can be depicted as into one in a reduced sublanguage. We have proven a graph in which each node represents a particular partition the soundness and completeness of all of these al- or subtheory and each arc represents shared vocabulary be- gorithms. tween subtheories. Theorem proving is performed locally in each subtheory, and relevant information propagated to en- 1 Introduction sure sound and complete entailment in the global theory. To maximize the effectiveness of structure-based theorem prov- Theorem provers are becoming increasingly prevalent as ing we must 1) minimize the coupling between nodes of the query-answering machinery for reasoning over single or mul- tree to reduce information being passed, and 2) minimize lo- tiple large commonsense knowledge bases (KBs) [3]. Com- cal inference within each node, while, in both cases, preserv- monsense KBs, as exemplified by Cycorp’s Cyc and the High ing global soundness and completeness. Performance Knowledge Base (HPKB) systems developed by In this paper we present message-passing algorithms that Stanford’s Knowledge Systems Lab and by SRI often com- reason over partitioned theories, minimizing the number of prise tens/hundreds of thousands of logical axioms, embody- messages sent between partitions and the local inference ing loosely coupled content in a variety of different subject within partitions. We first extend the applicability of a domains. Unlike mathematical theories (the original domain message-passing algorithm presented in [3] to a larger class of automated theorem provers), commonsense theories are of- of local reasoning procedures. In Section 3 we modify this ten highly structured and with large signatures, lending them- algorithm to use first-order resolution as the local reasoning selves to graph-based techniques for improving the efficiency procedure. In Section 4 we exploit Lyndon’s Interpolation of reasoning. Theorem to provide an algorithm that reduces the size of the Graph-based algorithms are commonly used as a means communication languages connecting partitions by consider- of exploiting structure to improve the efficiency of reason- ing the polarity of literals. Finally, in Section 5 we attempt ing in Bayes Nets (e.g., [18]), Constraint Satisfaction Prob- to minimize the reasoning within each partition using algo- lems (CSPs) (e.g., [12]) and most recently in logical reason- rithms for focused and incremental consequence finding. We ing (e.g., [11; 3; 25]). In all cases, the basic approach is to also provide an algorithm for compiling partitioned proposi- Knowledge Systems Laboratory (KSL) tional theories into theories in a reduced sublanguage. We

624 LOGIC PROGRAMMING AND THEOREM PROVING have proven the soundness and completeness of all of these the path to the node containing the query. algorithms with respect to reasoning procedures that are com- Recall, consequence finding (as opposed to proof finding) plete for consequence finding in a specified sublanguage. was defined by Lee [19] to be the problem of finding all non- Proofs omitted from this paper can be found at [22]. tautological logical consequences of a theory or sentences that subsume them. A prime implicate generator is a popu- 2 Partition-Based Logical Reasoning lar example of a consequence finder 1. To determine the direction in which messages should be In this section we describe the basic framework adopted in sent in the graph G, step 1 in FMP computes a strict partial this paper. We extend it with new soundness and complete- order over nodes in the graph using the partitioning together ness results that will enable us to minimize local inference. with a query, Q. Following [3], we say that {Ai}in is a partitioning of Definition 2.1 (≺) Given partitioned theory A = S A , a logical theory A if A = Si Ai. Each individual Ai is in i a set of axioms called a partition, L(Ai) is its signature associated graph G = (V; E; l) and query Q ∈ L(Ak), let (the set of non-logical symbols), and L(Ai) is its language dist(i; j) (i; j ∈ V ) be the length of the shortest path between (the set of formulae built with L(Ai)). The partitions may nodes i; j in G. Then i ≺ j iff dist(i; k) < dist(j; k). share literals and axioms. A partitioning of a theory in- duces a graphical representation, G = (V; E; l), which we call the theory’s intersection graph. Each node of the in- PROCEDURE FORWARD-M-P (FMP)({Ai}in, G, Q) tersection graph, i, represents an individual partition, Ai, {Ai}in a partitioning of the theory A, G = (V; E; l) a graph (V = {1; :::; n}), two nodes i; j are linked by an edge if describing the connections between the partitions, Q a query in (k n). L(Ai) and L(Aj ) have a non-logical symbol in common L(Ak) ≤ (E = {(i; j) | L(Ai) ∩ L(Aj ) 6= ∅}), and the edges are 1. Determine ≺ as in Definition 2.1. labeled with the set of symbols that the associated partitions 2. Concurrently, PSfrag replacementsshare (l(i; j) = L(Ai) ∩ L(Aj )). We refer to l(i; j) as the communication language between partitions Ai and Aj . We (a) Perform consequence finding for each of the partitions ensure that the intersection graph is connected by adding a Ai, i ≤ n. minimal number of edges to E with empty labels, l(i; j) = ∅. (b) For every (i; j) ∈ E such that i ≺ j, for every conse- Figure 1 illustrates a propositional theory A in clausal form quence ’ of Aj found (or ’ in Aj ), if ’ ∈ L(l(i; j)), (left-hand side) and its partitioning displayed as an intersec- then add ’ to the set of axioms of Ai. a tion graph (right-hand side). (Figures 1, 2 and 3 first appeared (c) If Q is proven in Ak, return YES. in [3].) a Derive a subsuming formula or initially add ¬Q to Ak and derive inconsistency. A1 (1) :ok pump _ :on pump _water Figure 2: A forward message-passing algorithm. A (2) :man fill _ water (3) :man fill _ :on pump :ok pump _ :on pump _ water (4) man fill _ on pump :man fill _ water Figure 3 illustrates an execution of the FMP algorithm us- :man fill _ :on pump water ing resolution as the consequence finder within a partition. man fill _ on pump A2 :water _ :ok boiler (5) :water_:ok boiler As can be seen from the example, the partitioning reduces the _:on boiler _ steam _:on boiler_steam water _ :steam number of possible inference steps by precluding the direct (6) water _ :steam ok boiler _ :steam (7) ok boiler _ :steam resolution of axioms residing in different partitions. Indeed, on boiler _ :steam (8) on boiler _ :steam :steam _ :coffee _ hot drink [3] showed that partition-based reasoning reduces the search coffee _ teabag steam space significantly, as a function of the size of the communi- :steam _ :teabag _ hot drink A3 cation language between partitions. (9) :steam_:coffee_hot drink (10) coffee _ teabag FMP is sound and complete if we guarantee some proper- (11) :steam_:teabag_hot drink ties of the graph G and the consequence finders used for each partition. The graph G is required to be a tree that is properly Figure 1: Partitioned theory A intersection graph G. labeled for A. Definition 2.2 (Proper Labeling) A tree-structured repre- Figure 2 displays FORWARD-M-P (FMP), a message- sentation, G = (V; E; l), of a partitioned theory A = passing algorithm for partition-based logical reasoning. It {Ai}in is said to have a proper labeling, if for all (i; j) ∈ E takes as input a partitioned theory, A, an associated graph and B1; B2, the two subtheories of A on the two sides of the structure G = (V; E; l), and a query formula Q in L(Ak), edge (i; j) in G, it is true that l(i; j) ⊇ L(B1) ∩ L(B2). and returns YES if the query was entailed by A. The al- gorithm uses procedures that generate consequences (conse- For example, every intersection graph that is a tree is prop- quence finders) as the local reasoning mechanism within each erly labeled. A simple algorithm called BREAK-CYCLES partition or graphical node. It passes a concluded formula to 1Recall, an implicate is a clause entailed by a theory. It is prime an adjacent node if the formula’s signature is in the commu- if it is minimal in some way. Definitions of prime vary including the nication language l of the adjacent node, and that node is on use of subsumption, syntactic minimality, or entailment.

LOGIC PROGRAMMING AND THEOREM PROVING 625 Using FMP to prove hot drink Observe that every reasoner that is complete for L- Part. Resolve Generating consequence finding is also complete for L-generation, for any language L that is closed under subsumption [14]. The A1 (2) ; (4) on pump ∨ water (m1) notion of a consequence finder restricting consequence gen- A (m1); (1) ok pump ∨ water (m2) 1 eration to consequences in a designated sublanguage was A1 (m2); (12) water (m3) discussed by Inoue [17], and further developed by del Val clause water passed from A1 to A2 [14] and others. Most results on the completeness of con- A2 (m3) ; (5) ok boiler ∧ on boiler ⊃ steam (m4) sequence finding exploit resolution-based reasoners, where

A2 (m4) ; (13) ¬on boiler ∨ steam (m5) completeness results for L-consequence finding are gener- ally restricted to a clausal language L. The FMP reasoners A2 (m5) ; (14) steam (m6) in Theorem 2.4 must be complete for Li-generation in arbi- clause steam passed from A2 to A3 trary FOL languages, Li. Corollary 2.6 refines Theorem 2.4 (9) ; steam teabag hot drink (m7) A3 (10) ¬ ∨ ∨ by restricting Ai and Li to propositional clausal languages A3 (m7) ; (11) ¬steam ∨ hot drink (m8) and requiring reasoners to be complete for Li-consequence A3 (m8) ; (m6) hot drink (m9) finding rather than Li-generation. Corollary 2.6 (Soundness and Completeness) Let A be a Figure 3: A proof of hot drink from A in Figure 1 after partitioned theory {Ai}in of propositional clauses, G a tree asserting ok pump (12) in A1 and ok boiler (13), on boiler that is properly labeled with respect to A, and Q ∈ L(Ak), (14) in A2. k ≤ n, a query. Let Li = L(l(i; j)) for j such that (i; j) ∈ E and j ≺ i (there is only one such j), and let {Ri}in be reasoning procedures associated with partitions {Ai}in. If that transforms an intersection graph that is not tree into a every Ri is complete for Li-consequence finding then A |= Q properly labeled tree was presented in [3]. Note that the no- iff FMP({Ai}in, G, Q) outputs YES. tion of proper labeling is equivalent, in this context, to the In Section 5 we provide examples of reasoners that are running intersection property used in Bayes Nets. complete for L-consequence finding and show how to exploit The consequence finders applied to each partition i are re- them to focus reasoning within a partition. quired to be complete for Li-generation for a sublanguage Li ⊂ L(Ai) that depends on the graph G and the query Q. 3 Local Inference Using Resolution Definition 2.3 (Completeness for L-Generation) Let A be In this section, we specialize FMP to resolution-based con- a set of axioms, L ⊆ L(A) a language, and R a consequence sequence finders. Resolution [26] is one of the most widely finder. Let CR;L(A) be the consequences of A generated by used reasoning methods for automated deduction, and more R that are in L. R is complete for L-generation if for all specifically for consequence finding. It requires the input for- ’ ∈ L, if A |= ’, then CR;L(A) |= ’. mula to be in clausal form, i.e., a conjunction of disjunctions of unquantified literals. The resolution rule is complete for Theorem 2.4 (Soundness and Completeness) Let A be a [ ] partitioned theory of arbitrary propositional or consequence finding (e.g., 19; 27 ) and so is linear resolu- {Ai}in [ ] first-order formulae, G a tree that is properly labeled with tion and some of its variants (e.g., 23 ). We present algorithm RESOLUTION-M-P (RES-MP), respect to A, and Q ∈ L(A ), k ≤ n, a query. For all i ≤ n, k that uses resolution (or resolution strategies), in Figure 4. let L = L(l(i; j)) for j such that (i; j) ∈ E and j ≺ i (there i The rest of this section is devoted to explaining four differ- is only one such j), and let {R } be reasoning procedures i in ent implementations for subroutine RES-SEND(’, j, i), used associated with partitions {A } . If every R is complete i in i by this procedure to send appropriate messages between par- for Li-generation then A |= Q iff FMP({Ai}in, G, Q) out- puts YES. titions: the first implementation is for clausal propositional theories; the second is for clausal FOL theories, with associ- This soundness and completeness result improves upon a ated graph G, that is a properly labeled trees and whose labels soundness and completeness result in [3] by allowing conse- include all the function and constant symbols of the language; quence finders that focus on consequences in the communi- the third is also for clausal FOL theories, however it uses un- cation language between partitions. In certain cases, we can skolemization and subsequent Skolemization to generate the restrict consequence finding in FMP even further by using messages to be passed between partitions; the fourth is a re- reasoners that are complete for L-consequence finding. finement of the third for the same class of theories that avoids unskolemization. Definition 2.5 (Completeness for L-Consequence Finding) In the propositional case, subroutine RES-SEND(’, j, i) R Let A be a set of axioms, L ⊆ L(A) a language, and (Implementation 1) simply adds ’ to , as done in FMP. R Ai a consequence finder. is complete for L-consequence If the resolution strategies being employed satisfy the condi- finding iff for every ’ ∈ L that is not a tautology, A |= ’ iff 2 tions of Corollary 2.6, then RES-MP is sound and complete. there exists ∈ L such that A `R and subsumes ’. In the FOL case, implementing RES-SEND requires more care. To illustrate, consider the case where resolution gener- 2For clausal theories, we say that clause subsumes ’ if there ates the clause P (B; x) (B a constant symbol and x a vari- is a substitution such that ⊂ ’. able). It also implicitly proves that ∃b P (b; x). RES-MP

626 LOGIC PROGRAMMING AND THEOREM PROVING PROCEDURE RESOLUTION-M-P(RES-MP)({Ai}in, G, Q) RES-MP (or in FMP), upholding the soundness and com-

{Ai}in a partitioned theory, G = (V; E; l) a graph, Q a query pleteness to that supplied by Theorem 2.4. The subroutine formula in the language of L(Ak) (k ≤ n). RES-SEND(’, j, i) that implements this approach is pre- sented in Figure 5. It replaces ’ with a a set of formulae 1. Determine ≺ as in Definition 2.1. in L(l(i; j)) that follows from ’. It then Skolemizes the re- 2. Add the clausal form of ¬Q to Ak. sulting formulae for inclusion in Ai. 3. Concurrently, PROCEDURE RES-SEND(’, j, i) (Implementation 3) (a) Perform resolution for each of the partitions Ai, i ≤ n. ’ a formula, j; i ≤ n. (b) For every (i; j) ∈ E such that i ≺ j, if partition Aj includes the clause ’ (as input or resolvent) and the 1. Unskolemize ’ into a set of formulae, Φ in L(l(i; j)), treat- predicates of ’ are in l i; j , then perform RES- L( ( )) ing every symbol of L(’) \ l(i; j) as a Skolem symbol. SEND(’, j, i). 2. For every ’2 ∈ Φ, if ’2 is not subsumed by a clause that is (c) If Q is proven in Ak, return YES. in Ai, then add the Skolemized version of ’2 to the set of axioms of Ai. Figure 4: A resolution forward message-passing algorithm. Figure 5: Subroutine RES-SEND using unskolemization. may need to send ∃b P (b; x) from one partition to another, but it cannot send P (B; x) if B is not in the communication Procedure U may generate more than one formula language between partitions (for ground theories there is no for any given clause ’. For example, if ’ = such problem (see [27])). In the first-order case, complete- P (x; f(x); u; g(u)), for l(i; j) = {P }, then we must gener- ness for consequence finding for a clausal first-order logic ate both ∀x∃y∀u∃vP (x; y; u; v) and ∀u∃v∀x∃yP (x; y; u; v) language (e.g., Lee’s result for resolution) does not guaran- (’ entails both quantified formulae, and there is no one quan- tee completeness for L-generation for the corresponding full tified formula that entails both of them). In our case we can FOL language, L. This problem is also reflected in a slightly avoid some of these quantified formulae by replacing the un- different statement of Craig’s interpolation theorem [10] that skolemize and then Skolemize process of RES-SEND (Imple- applies for resolution [27]. mentation 3) with a procedure that produces a set of formulae A simple way of addressing this problem is to add all con- directly (Implementation 4). It is presented in Figure 6. stant and function symbols to the communication language between every connected set of partitions. This has the advan- PROCEDURE RES-SEND(’, j, i) (Implementation 4) tage of preserving soundness and completeness, and is sim- ’ a formula, j; i ≤ n. ple to implement. In this case, subroutine RES-SEND(’, j, i) (Implementation 2) simply adds ’ to Ai, as done in FMP. 1. Set S ← L(’) \ l(i; j).

In large systems that consist of many partitions, the addi- 2. For every term instance, t = f(t1; :::; tk), in ’, if f ∈ S and tion of so many constant and function symbols to each of the t is not a subexpression of another term t0 = f 0(t0 ; :::; t0 ) 1 k0 other partitions has the potential to be computationally inef- of ’ with f 0 ∈ S, then replace t with “x ← t” for some new ficient, leading to many unnecessary and irrelevant deduction variable, x (if k = 0, t is a constant symbol). steps. Arguably, a more compelling way of addressing the 3. Nondeterministicallya, for every pair of marked arguments problems associated with resolution for first-order theories is “x ← ”, “y ← ”, in ’, if ; are unifiable, then unify all to infer the existential formula ∃b P (b; x) from P (B; x), send occurrences of x; y (i.e., unify i; i for all markings x ← this formula to the proper partition and Skolemize it there. i, y ← i). For example, if ’ = P (f(g(B)); x) is the clause that RES- 4. For every marked argument “x ← ” in ’, do SEND gets, replacing it with ∃b P (b; x) eliminates unneces- (a) Collect all marked arguments with the same variable on sary work of the receiving partition. the left-hand side of the “←” sign. Suppose these are The process of conservatively replacing function and con- x ← 1; :::; x ← l. stant symbols by existentially quantified variables is called (b) Let y ; :::; y be all the variables occurring in unskolemization or reverse Skolemization and is discussed in 1 r ; :::; . For every i ≤ l, replace “x ← i” with [ ] [ ] 1 l 5; 9; 8 . 8 presents an algorithm U that is complete for our f(y1; :::; yr) in ’, for a fresh function symbol f (if purposes and generalizes and simplifies an algorithm of [9]. r = 0, f is a fresh constant symbol). (Space precludes inclusion of the algorithm.) 5. Add ’ to Ai. Theorem 3.1 ([8]) Let V be a vocabulary and ’; be for- a mulae such that ∈ L(V ) and ’ ⇒ . There exists F ∈ Nondeterministically select the set of pairs for which to unify L(V ) that is generated by algorithm U such that F ⇒ . all occurrences of x; y. Thus, for every resolution strategy that is complete for - L Figure 6: Subroutine RES-SEND without unskolemization. consequence finding, unskolemizing ’ using procedure U for V = l(i; j) and then Skolemizing the result gives us a com- bined procedure for message generation that is complete for Steps 2–3 of procedure RES-SEND(’, j, i) (Implementa- Lj -generation. This procedure can then be used readily in tion 4) correspond to similar steps in procedure U presented

LOGIC PROGRAMMING AND THEOREM PROVING 627 in [8], simplifying where appropriate for our setup. Our directional resolution [25] and lock resolution [7]. Strategies procedure differs from unskolemizing procedures in step 4, such as set-of-support and semantic resolution can be used where it stops short of replacing the Skolem functions and with somewhat different treatments. constants with new existentially quantified variables. Instead, it replaces them with new functions and constant symbols. 4 Minimizing Node Coupling Using Polarity The nondeterminism of step 3 is used to add all the possible FMP and RES-MP use the communication language to deter- combinations of unified terms, which is required to ensure mine relevant inference steps between formulae in connected completeness. partitions. This section improves the efficiency of FMP and For example, if ’ P f g B ; x and l i; j = ( ( ( )) ) ( ) = RES-MP by exploiting the polarity of predicates in our par- P , then RES-SEND adds P A; x to , for a new { } ( ) Ai titions to further constrain the communication language be- constant symbol, A. If ’ P x; f x ; u; g u , for = ( ( ) ( )) tween partitions. This leads to a reduction in the number l i; j P , then RES-SEND adds P x; h x ; u; h u ( ) = { } ( 1( ) 2( )) of messages that are passed between adjacent partitions, and to , for new function symbols h ; h . Finally, if Ai 1 2 thus a reduction in the search space size of the global reason- ’ P x; f x ; u; f g u , then RES-SEND adds = ( ( ) ( ( ))) ing problem. Our results are predicated on Lyndon’s Interpo- P x; f x ; u; h u and P h u ; h u ; u; h u to , for ( ( ) ( )) ( 1( ) 2( ) 2( )) Ai lation Theorem [20], an extension to Craig’s Theorem [10]. h; h1; h2 new function symbols. Theorem 4.1 (Lyndon’s Interpolation Theorem) Let ; Theorem 3.2 (Soundness & Completeness of RES-MP) be sentences such that ` . Then there exists a sentence Let A be the partitioned theory Sin Ai of propositional such that ` and ` , and that every relation symbol or first-order clauses, G a tree that is properly labeled with that appears positively [negatively] in appears positively respect to A, and Q ∈ L(Ak) k ≤ n, a sentence that is the [negatively] in both and . is referred to as the inter- query. A |= Q iff applying RES-MP({Ai}in, G, Q) (with polant of and . Implementation 4 of RES-SEND) outputs YES. This theorem guarantees that FMP need only send clauses PROOF SKETCH Soundness and completeness of the algo- with literals that may be used in subsequent inference steps. rithm follow from that of FMP, if we show that RES-SEND For example, let {A1; A2} be a partitioned theory, G = (V = (Implementation 4) adds enough sentences (implying com- {1; 2}; E = {(1; 2)}; l) be a graph, and Q ∈ L(A2), be a pleteness) to Ai that are implied by ’ (thus sound) in the query. If FMP concluded P from A1, and P does not show restricted language L(l(i; j)). positively in A2 ⇒ Q (i.e., P does not show negatively in A2 If we add all sentences ’ that are submitted to RES-SEND and does not show positively in Q), then there is no need to to Ai without any translation, then our soundness and com- send the message P from A1 to A2. pleteness result for FMP applies (this is the case where we Procedure POLARIZE (Figure 7) takes as input a parti- add all the constant and function symbols to all l(i; j)). tioned theory, associated tree G = (V; E; l), and a query We use Theorem 3.1 to prove that we add enough sentences Q. It returns a new graph G0 = (V; E; l0) that is mini- to Ai. Let ’2 be a quantified formula that is the result of ap- mal with respect to our interpretation of Lyndon’s Interpo- plying algorithm U to ’. Then, ’2 results from a clause C lation Theorem. The labels of the graph now include pred- generated in step 4 of algorithm U (respectively, Step 3 in icate/propositional symbols with associated polarities (the RES-SEND). In algorithm U, for each variable x, the mark- same symbol may appear both positively and negatively on ings “x ← i” in C are converted to a new variable that is an edge label). All function and object symbols that appeared existentially quantified immediately to the right of the quan- in l also appear in l0 for the respective edges. tification of the variables y1; :::; yr. ’2 is a result of ordering Theorem 4.2 (Soundness and Completeness) Let A be a the quantifiers in a consistent manner to this rule (this process partitioned theory {Ai}in of arbitrary propositional or is done in steps 5–6 of algorithm U). first-order formulae, G a tree that is properly labeled with Step 4 of RES-SEND performs the same kind of replace- 0 respect to A, and Q ∈ L(Ak), k ≤ n, a query. Let G ment that algorithm U performs, but uses new function sym- be the result of running POLARIZE({Ai}i n, G, Q). Let bols instead of new quantified variables. Since each new Li = L(l(i; j)) for j such that (i; j) ∈ E and j ≺ i (there is quantified variable in ’ is to the right of the variables on 2 only one such j), and let {Ri}in be reasoning procedures which it depends, and our new function uses exactly those R 0 associated with partitions {Ai}in. If every i is complete variables as arguments, then Step 4 generates a clause C 0 for Li-generation then A |= ’ iff FMP({Ai}in, G ,Q) out- from C that entails ’2. Thus, the clauses added to Ai by puts YES. RES-SEND entail all the clauses generated by unskolemiz- ing ’ using U. From Theorem 3.1, these clauses entail all the Darwiche [11] proposed a more restricted use of polarity sentences in L(l(i; j)) that are implied by ’. in graph-based algorithms for propositional SAT-search. His To see that the result is still sound, notice that the set of proposal is equivalent to first finding those propositional sym- bols that appear with a unique polarity throughout the theory clauses that we add to Ai has the same consequences as ’ in and then assigning them the appropriate truth value. In con- L(l(i; j)) (i.e., if we add those clauses to Aj we get a conser- trast, our proposed exploitation of polarity is useful for both vative extension of Aj ). propositional and first-order theories, it is more effective in Resolution strategies that can be readily used in RES-MP, constraining inference steps, and is applicable to a broader while preserving completeness, include linear resolution [23], class of message-passing algorithms problems. In particular,

628 LOGIC PROGRAMMING AND THEOREM PROVING PROCEDURE POLARIZE({Ai}in, G, Q) The first strategy is compilation. Figure 8 provides an al-

{Ai}in a partitioning of the theory A, G = (V; E; l) a tree and gorithm, COMPILE({Ai}in, G), that takes as input a parti- Q a query formula in L(Ak) (k ≤ n). tioned theory {Ai}in and associated tree G, that is properly 0 0 labeled, and outputs a compiled partitioned theory {A } . 1. For every i; j ∈ V , set l (i; j) to be the set of object and i in function symbols that appear in l(i; j), if there are any. Each new partition is composed of the logical consequences of partition Ai that are in the language Lcommi , all the com- 2. Rewrite {Ai}in such that all negations appear in front of munication languages associated with A . Prime implicate literals (i.e., in negation normal form). i finders have commonly been used for knowledge compila- 3. Determine ≺ as in Definition 2.1. tion, particularly in propositional cases. SFK-resolution can 4. For all (i; j) ∈ E such that i ≺ j, for every predicate symbol be used as the sound and complete L-consequence finder in P ∈ l(i; j), Step 2 of COMPILE. Knowledge compilation can often create a large theory. (a) Let V1; V2 be the two sets of vertices in V separated by i in G, with j ∈ V1. Each partition produced by COMPILE({Ai}in, G) will be jL(Lcommi )j (b) If [¬]P appears in V1 then, of worst case size of O(2 ) clauses. Since our as- if [¬]P appears in Q or ¬[¬]P appears in Am, for sumption is that partitions are produced to minimize commu- 0 some m ∈ V2, then add [¬]P to l (i; j). nication between partitions, | L(Lcommi ) | should be much 5. Return G0 = (V; E; l0). smaller than | L(Ai) |. As a consequence, we might expect the compiled theory to be smaller than the original theory, Figure 7: Constraining the communication language of though this is not guaranteed. Under the further assumption that the theories in partitions are fairly static, the cost of com- {Ai}i n by exploiting polarity. pilation will be amortized over many queries. We discuss further options for compilation, including the use of partial our method is useful in cases where symbols appear with dif- compilation, in a longer paper. ferent polarities in different partitions.

PROCEDURE COMPILE({Ai}in, G)

5 Minimizing Local Inference {Ai}in a partitioning of the theory A, G = (V; E; l) a tree with proper labeling for . For each partition , For i ; : : : ; n, To maximize the effectiveness of structure-based theorem A Ai = 1 proving, we must minimize local inference within each node 1. Let Lcommi = L(S(i;j)2E l(i; j)) of our tree-structured problem representation, while preserv- 2. Using a sound and complete -consequence finder, ing global soundness and completeness. First-order and L perform Lcomm -consequence finding on each partition Ai, propositional consequence finding algorithms have been de- i 0 veloped that limit deduction steps to those leading to interest- placing the output in a new partition Ai. ing consequences, skipping deduction steps that do not. In the propositional case, the most popular algorithms are Figure 8: A partition-based theory compilation algorithm. certain L-(prime) implicate finders. (See [21] for an excellent survey.) SOL-resolution (skipping ordered linear resolution) [17] and SFK-resolution (skip-filtered, kernel resolution) [14] Proposition 5.1 Let A = Sin Ai be a partitioned theory are two first-order resolution-based L-consequence finders. with associated tree G that is properly labeled for A. Let SFK-resolution is complete for first-order L-consequence Lcommi = L(S(i;j)2E l(i; j))). For all ’ ∈ Li ⊆ Lcommi ⊆ 0 0 finding, reducing to Directional Resolution in the proposi- L(Ai), Ai |= ’ iff Ai |= ’, where {Ai}in are the compiled tional case [13]. In contrast, SOL-resolution is not complete partitions output by COMPILE({Ai}in, G). for first-order L-consequence finding, but is complete for We may use our compiled theories in several different first-order incremental L-consequence finding. Given new in- put Φ, an incremental L-consequence finder finds the conse- strategies for batch-style and concurrent theorem proving, as well as in our previous message-passing algorithms. Figure quences of A ∪ Φ that were not entailed by Φ alone. Defining 9 presents an algorithm for batch-style structure-based the- completeness for incremental L-consequence finding is anal- ogous to Definition 2.5. orem proving. BATCH-MP takes as input a (possibly com- piled) partitioned theory, associated tree G that is properly In the rest of this section, we propose strategies that exploit labeled, and query Q. For each partition in order, it exploits our graphical models and specialized consequence finding al- gorithms to improve the efficiency of reasoning. Following focused L-consequence finding to compute all the relevant consequences of that theory. It passes the conclusions to- the results in previous sections, using SFK-resolution as a rea- wards the partition with the query. This algorithm is very sim- soner within partitions will preserve the soundness and com- [ ] pleteness of the global problem while significantly reducing ilar to the bucket elimination algorithm of 13 . BATCH-MP preserves soundness and completeness of the global problem, the number of inference steps. SFK-resolution can be used while exploiting focused search within each partition. by all of the procedures below. Unless otherwise noted, the algorithms we describe are limited to propositional theories Theorem 5.2 (Soundness and Completeness) Let A be a because first-order consequence finders may fail to terminate, set of clauses in propositional logic. Let {Ri}in be the even for decidable cases of FOL. Li-consequence finders associated with partitions {Ai}in

LOGIC PROGRAMMING AND THEOREM PROVING 629 PROCEDURE BATCH-MP ({Ai}in, G, Q) PROCEDURE CONCURRENT-MP ({Ai}in, G, Q)

{Ai}in a (compiled) partitioning of the theory A, G = (V; E; l) {Ai}in a (compiled) partitioning of the theory A, G = (V; E; l) a properly labeled tree describing the connections between the par- a properly labeled tree describing the connections between the par- titions, Q a query in L(Ak) (k ≤ n). titions, Q a query in L(Ak) (k ≤ n).

1. If {Ai}in, is a compiled theory, replace partition Ak with 1. Determine ≺ as in Definition 2.1. the partition Ak from the uncompiled theory. a 2. Let Li = L(l(i; j)) for j such that (i; j) ∈ E and j ≺ i . 2. Determine ≺ as in Definition 2.1. 3. If {Ai}in, is a compiled theory, then replace partition Ak a 3. Let Li = L(l(i; j)) for j such that (i; j) ∈ E and j ≺ i . with the partition Ak from the uncompiled theory.

4. Following ≺ in a decreasing order, for every (i; j) ∈ E such 4. For every i ≤ n, run the Li-consequence finder on partition a that j ≺ i , Ai until it has exhausted its consequences. Run the Li-consequence finder on Ai until it has exhausted a 5. For every (i; j) ∈ E such that j ≺ i , add the Li-prime its consequences, and add the consequences in Li to Aj . implicates to partition Aj . b 5. If Q is proven in Ak, return YES. 6. Concurrently, aThere is only one such j. (a) For every (i; j) ∈ E such that j ≺ ia, perform incre- b Derive a subsuming formula or initially add ¬Q to Ak and mental Li-consequence finding for each of the partition derive inconsistency. Ai and add the the consequences in Li to Aj . b (b) If Q is proven in Ak, return YES. Figure 9: A batch-style message-passing algorithm. aThere is only one such j. b Derive a subsuming formula or initially add ¬Q to Ak and derive inconsistency. in step 4 of BATCH-MP ({Ai}in,G,Q). If every Ri is com- plete for Li-consequence finding then A |= Q iff applying Figure 10: A concurrent message-passing algorithm. BATCH-MP({Ai}in,G,Q) outputs YES. Our final algorithm, CONCURRENT-MP, (Figure 10), takes as input a (possibly compiled) partitioned theory, as- reason with logical rather than probabilistic theories, where sociated tree G that is properly labeled, and query Q. It notions of structure and independence take on different roles exploits incremental L-consequence finding in the output in reasoning. Our work is most significantly distinguished communication language of each partition to compute the from work on CSPs (e.g., [12]) and more recently, logical rea- relevant incremental consequences of that theory, and then soning (e.g., [11; 25]) in that we reason with explicitly par- passes them towards the partition with the query. Once again, titioned theories using message passing algorithms and our SFK-resolution can be used as the sound and complete L- algorithms apply to FOL as well as propositional theories. consequence generator for the preprocessing (Step 4). In the In the area of FOL theorem proving, our work is related case where the theory is compiled into propositional prime to research on parallel theorem proving (see surveys in [6; implicates, the consequences in Li may simply be picked out 15]) and on combining logical systems (e.g., [24; 4]). Most of the existing consequences in Ai. SOL-resolution can be parallel theorem prover implementations are guided by looka- used as the sound and complete incremental L-consequence head and subgoals to decompose the search space dynami- finder (Step 6a). CONCURRENT-MP preserves soundness cally or allow messages to be sent between different provers and completeness of the global problem in the propositional working in parallel, using heuristics to decide on which mes- case, while exploiting focused search within each partition. sages are relevant to each prover. These approaches typically Theorem 5.3 (Soundness and Completeness) Let A be a look at decompositions into very few sub-problems. In addi- tion, the first approach typically requires complete indepen- set of clauses in propositional logic. Let {Ri}in be the dence of the sub-spaces or the search is repeated on much of Li-consequence finders associated with partitions {Ai}in the space by several reasoners. In the second approach there in step 4 of CONCURRENT-MP({Ai}in,G,Q) and let 0 is no clear methodology for deciding what messages should {R }in be the incremental Li-consequence finders associ- i be sent and from which partition to which. ated with partitions {Ai}in in step 6 of CONCURRENT- MP({Ai}in,G,Q). If every Ri is complete for Li- The work on combining logical systems focuses on combi- R0 nations of signature-disjoint theories (allowing the queries to consequence finding, and every i is complete for incre- mental Li-consequence finding then A |= Q iff applying include symbols from all signatures) and decision procedures CONCURRENT-MP({Ai}in,G,Q) outputs YES. suitable for those theories. All approaches either nondeter- ministically instantiate the (newly created) variables connect- ing the theories or restrict the theories to be convex (disjunc- 6 Related Work tions are intuitionistic) and have information flowing back A number of AI reasoning systems exploit some type of struc- and forth between the theories. In contrast, we focus on ture to improve the efficiency of reasoning. While our ex- the structure of interactions between theories with signatures ploitation of graph-based techniques is similar to that used in that share symbols and the efficiency of reasoning with con- Bayes Nets (e.g., [18]) our work is distinguished in that we sequence finders and theorem provers. We do not have any

630 LOGIC PROGRAMMING AND THEOREM PROVING restrictions on the language besides finiteness. [7] R. S. Boyer. Locking: a restriction of resolution. PhD thesis, Work on formalizing and reasoning with context (see [1] Mathematics Department, University of Texas, Austin, 1971. for a survey) can be related to theorem proving with struc- [8] R. Chadha and D. A. Plaisted. Finding logical consequences tured theories by viewing the contextual theories as interact- using unskolemization. In Proceedings of ISMIS’93, volume ing sets of theories. Unfortunately, to introduce explicit con- 689 of LNAI, pages 255–264. Springer-Verlag, 1993. texts, a language that is more expressive than FOL is needed. [9] P. Cox and T. Pietrzykowski. A complete nonredundant algo- Consequently, a number of researchers have focused on con- rithm for reversed skolemization. Theoretical Computer Sci- text for propositional logic, while much of the reasoning work ence, 28:239–261, 1984. has focused on proof checking (e.g., GETFOL [16]). [10] W. Craig. Linear reasoning. a new form of the Herbrand- Gentzen theorem. J. of Symbolic Logic, 22:250–268, 1957. 7 Summary [11] A. Darwiche. Utilizing knowledge-based semantics in graph- In this paper we exploited graph-based techniques to im- based algorithms. In Proc. AAAI ’96, pages 607–613, 1996. prove the efficiency of theorem proving for structured the- [12] R. Dechter and J. Pearl. Tree Clustering Schemes for Con- ories. Theories were organized into subtheories that were straint Processing. In Proc. AAAI ’88, 1988. minimally connected by the literals they share. We presented [13] R. Dechter and I. Rish. Directional resolution: The Davis- message-passing algorithms that reason over these theories Putnam procedure, revisited. In Proc. KR ’94, pages 134–145. using consequence finding, specializing our algorithms for Morgan Kaufmann, 1994. the case of first-order resolution, and for batch and concurrent [14] A. del Val. A new method for consequence finding and com- theorem proving. We provided an algorithm that restricts the pilation in restricted language. In Proc. AAAI ’99, pages 259– interaction between subtheories by exploiting the polarity of 264. AAAI Press/MIT Press, 1999. literals. We attempted to minimize the reasoning within each [15] J. Denzinger and I. Dahn. Cooperating theorem provers. In individual partition by exploiting existing algorithms for fo- W. Bibel and P. Schmitt, editors, Automated Deduction. A ba- cused incremental and general consequence finding. Finally, sis for applications., volume 2, chapter 14, pages 383–416. we proposed an algorithm that compiles each subtheory into Kluwer, 1998. one in a reduced sublanguage. We have proven the soundness [16] F. Giunchiglia. GETFOL manual - GETFOL version 2.0. and completeness of all of these algorithms. The results pre- Technical Report DIST-TR-92-0010, DIST - University of sented in this paper contribute towards addressing the prob- Genoa, 1994. Available at http://ftp.mrg.dist.unige.it/pub/mrg- lem of reasoning efficiently with large or multiple structured ftp/92-0010.ps.gz. commonsense theories. [17] K. Inoue. Linear resolution for consequence finding. Artificial Intelligence, 56(2-3):301–353, Aug. 1992. Acknowledgements [18] F. V. Jensen, S. L. Lauritzen, and K. G. Olesen. Bayesian updating in recursive graphical models by local computation. We wish to thank the anonymous IJCAI reviewers for their Computational Statistics Quarterly, 4:269–282, 1990. thorough review of this paper, and Alvaro del Val and Pierre [19] R. C.-T. Lee. A Completeness Theorem and a Computer Pro- Marquis for helpful comments on the relationship between gram for Finding Theorems Derivable from Given Axioms. our work and previous work on consequence finding. This PhD thesis, University of California, Berkeley, 1967. research was supported in part by DARPA grant N66001- [20] R. C. Lyndon. An interpolation theorem in the predicate cal- 97-C-8554-P00004, NAVY grant N66001-00-C-8027, and by culus. Pacific Journal of Mathematics, 9(1):129–142, 1959. DARPA grant N66001-00-C-8018 (RKF program). [21] P. Marquis. Consequence finding algorithms. In Algorithms for Defeasible and Uncertain Reasoning, volume 5 of Hand- References book on Deafeasible Reasoning and Uncertainty Management [1] V. Akman and M. Surav. Steps toward formalizing context. AI Systems, pages 41–145. Kluwer, 2000. Magazine, 17(3):55–72, 1996. [22] S. McIlraith and E. Amir. Theorem proving with structured theories (full report). Technical Report KSL-01-04, KSL, [2] E. Amir. Efficient approximation for triangulation of mini- Computer Science Dept., Stanford U., Apr. 2001. mum treewidth. Manuscript submitted for publication. Avail- able at http://www-formal.stanford.edu/eyal/papers/decomp- [23] E. Minicozzi and R. Reiter. A note on linear resolution strate- uai2001.ps, 2001. gies in consequence-finding. Artificial Intelligence, 3:175– 180, 1972. [3] E. Amir and S. McIlraith. Paritition-based logical reasoning. In Proc. KR ’2000, pages 389–400. Morgan Kaufmann, 2000. [24] G. Nelson and D. C. Oppen. Simplification by cooperating decision procedures. ACM Trans. on Programming Languages [4] F. Baader and K. U. Schulz. Combination of constraint solvers and Systems, 1(2):245–257, Oct. 1979. for free and quasi-free structures. Theoretical Computer Sci- ence, 192(1):107–161, 1998. [25] I. Rish and R. Dechter. Resolution versus search: two strate- gies for SAT. Journal of Automated Reasoning, 24(1-2):225– [5] W. W. Bledsoe and A. M. Ballantyne. Unskolemizing. Techni- 275, 2000. cal Report Memo ATP-41, Mathematics Department, Univer- [ ] sity of Texas, Austin, 1978. 26 J. A. Robinson. A machine-oriented logic based on the resolu- tion principle. J. of the ACM, 12(1):23–41, 1965. [6] M. P. Bonacina and J. Hsiang. Parallelization of deduction [27] J. R. Slagle. Interpolation theorems for resolution in lower strategies: an analytical study. Journal of Automated Reason- predicate calculus. J. of the ACM, 17(3):535–542, July 1970. ing, 13:1–33, 1994.

LOGIC PROGRAMMING AND THEOREM PROVING 631 LOGIC PROGRAMMING AND THEOREM PROVING

ANSWER SET PROGRAMMING Experimenting with Heuristics for Answer Set Programming Wolfgang Faber Nicola Leone Inst. f. Informationssysteme 184/3, TU Wien Dep. of Mathematics, Univ. of Calabria A-1040 Wien, Austria 87030 Rende (CS), Italy [email protected] [email protected]

Gerald Pfeifer Inst. f. Informationssysteme 184/2, TU Wien A-1040 Wien, Austria [email protected]

Abstract program) are subset-minimal models which are “grounded” in a precise sense, they are called answer sets [Gelfond and Answer Set Programming (ASP) is a novel pro- Lifschitz, 1991]. The idea of answer set programming is to gramming paradigm, which allows to solve prob- represent a given computational problem by an ASP program lems in a simple and highly declarative way. The whose answer sets correspond to solutions, and then use an language of ASP (function-free disjunctive logic answer set solver to find such a solution [Lifschitz, 1999]. programming) is very expressive, and allows to rep- As an example, consider the well-known problem of 3-

resent even problems of high complexity (every colorability, which is the assignment of three colors to the ¤¦¥¨§ ©  problem in the complexity class ¡£¢ ). nodes of a graph in such a way that adjacent nodes have dif- ferent colors. This problem is known to be NP-complete. As for SAT solvers, the heuristic for the selection Suppose that the nodes and the edges are represented by a set

of the branching literal (i.e., the criterion determin-   of facts with predicates  (unary) and (binary), ing the literal to be assumed true at a given stage respectively. Then, the following ASP program allows us to of the computation) dramatically affects the perfor- determine the admissible ways of coloring the given graph.

mance of an ASP system. While heuristics for SAT   "!$#%!&('*),+-/.102 "!$#%!&('*),+43(.502 6!/#7!$8'*),+-9(.-:<;1!&=>'*)?.

have received a fair deal of research in AI, only lit- &@£A-: >B=98>'*),+DCE.6+- 6!/#7!$8'*),+GFH.6+- 6!/#%!&8'*CI+6FH. tle work in heuristics for ASP has been done so far.

Rule JK above states that every node of the graph is col- In this paper, we extend to the ASP framework ¤ a number of heuristics which have been success- ored red or yellow or green, while J forbids the assignment fully employed in existing systems, and we com- of the same color to any adjacent nodes. The minimality pare them experimentally. To this end, we imple- of answer sets guarantees that every node is assigned only ment such heuristics in the ASP system DLV, and one color. Thus, there is a one-to-one correspondence be-

tween the solutions of the 3-coloring problem and the answer ¤Q we evaluate their respective efficiency on a num- MLON

sets of JKP6J . The graph is 3-colorable if and only if ¤8Q ber of benchmark problems taken from various do- RL N mains. The experiments show interesting results, JKPGJ has some answer set. and evidence a couple of promising heuristic crite- An advantage of answer set programming over SAT-based ria for ASP, which sensibly outperform the heuris- programming is that problems can be encoded more easily in tic of DLV. the ASP language than in propositional CNF formulas, thanks to the nonmonotonic character of disjunction and negation as

failure [Lifschitz, 1999]. Importantly, due to the support for

¡ ¢

1 Introduction variables, every problem in the complexity class ¤ (i.e., in §S©  ) can be directly encoded in an ASP program which Answer set programming (ASP) is a declarative approach to can then be used to solve all problem instances in a uniform programming, which has been recently proposed in the area way [Eiter et al., 1997]. of nonmonotonic reasoning and logic programming. The The high expressiveness of answer set programming comes knowledge representation language of ASP is very expres- at the price of a high computational cost in the worst case, sive: function-free logic programs with classical negation which makes the implementation of efficient ASP systems a where disjunction is allowed in the heads of the rules and non- difficult task. Nevertheless, some efforts have been done in monotonic negation may occur in the bodies of the rules. The this direction, and a number of ASP systems are now avail- intended models of an ASP program (i.e., the semantics of the able. The two best known ASP systems are DLV [Eiter et al.,  This work was supported by FWF (Austrian Science Funds) un- 2000], and Smodels [Niemela,¨ 1999; Simons, 2000], but also der the project Z29-INF and P14781. other systems support ASP to some extent, including CCALC

LOGIC PROGRAMMING AND THEOREM PROVING 635 [McCain and Turner, 1998], DCS [East and Truszczynski,´ Answer Sets

2000T ], QUIP [Egly et al., 2000], and XSB [Rao et al., 1997]. We describe the semantics of consistent answer sets, which The core of an ASP system is model generation, where a has originally been defined in [Gelfond and Lifschitz, 1991].

model of the program is produced, which is then subjected to wyx

an answer set check. For the generation of models, ASP sys- Given a program v , let the Herbrand Universe be the

v z tems employ procedures which are similar to Davis-Putnam set of all constants appearing in and the Herbrand Base x

procedures used in SAT solvers. As for SAT solvers, the be the set of all possible combinations of predicate symbols

wyx k heuristic (branching rule) for the selection of the branching appearing in v with constants of , possibly preceded by .

A set { of literals is said to be consistent if, for every literal

literal (i.e., the criterion determining the literal to be assumed |E} {

true at a given stage of the computation) is fundamentally im- { , its complementary literal is not contained in . ~EJ€‚ƒJ„

portant for the efficiency of a model generation procedure, Given a rule J , denotes the set of rules obtained J

and dramatically affects the overall performance of an ASP by applying all possible substitutions from the variables in

w v

system. While a lot of work has been done in AI developing to elements of x . Similarly, given a program , the ground

~EJ‚€ƒv†„ v ‡2ˆ&‰ ~EJ€‚ƒJ„ new heuristics and comparing alternative heuristics for SAT instantiation of is the set x .

(see, e.g., [Hooker and Vinay, 1995; Li and Anbulagan, 1997; For every program v , we define its answer sets using its

Freeman, 1995]), only little work has been done so far for ground instantiation ~EJ‚€ƒv†„ in two steps, following [Lif- ASP systems. In particular, we are not aware of any previous schitz, 1996]: First we define the answer sets of positive pro- comparison between heuristics for ASP. grams, then we give a reduction of general programs to posi- In this paper, we evaluate different heuristics for ASP sys- tive ones and use this reduction to define answer sets of gen- tems. To this end, we first consider a couple of heuristics eral programs.

which have been very successful in SAT solvers or in non- An interpretation Š is a set of ground literals. A consis-

v v

tent interpretation ŠŒ‹oz x is closed under (where is

disjunctive ASP systems, and we extend them to the frame- } ~EJ€‚*v†„ work of (disjunctive) ASP. We then perform an experimen- a positive program), if, for every J , at least

tation activity to compare the different heuristics. In partic- one literal in the head is true whenever all literals in the body v ular, we implement the heuristics in the ASP system DLV, are true.  is an answer set for if it is minimal w.r.t. set and we compare their respective efficiency on a number of inclusion and closed under v .

benchmark problems taken from various domains. The ex- The reduct or Gelfond-Lifschitz transform of a general Ž‹z x

periments show interesting results, and evidence a couple of ground program v w.r.t. a set is the positive ground

}

v J v promising heuristic criteria for ASP, which sensibly outper- program v2 , obtained from by (i) deleting all rules form the heuristic of DLV. Nevertheless, this paper is not at whose negative body is false w.r.t. X and (ii) deleting the neg-

all a conclusive work on heuristics for ASP; rather, it is a first ative body from the remaining rules. ‘‹’z x

step in this field that will hopefully stimulate further works An answer set of a general program v is a set

 ~EJ‚‚*v†„ on the design and evaluation of heuristics for ASP, which are such that  is an answer set of . strongly needed to build efficient ASP solvers. 3 Answer Sets Computation 2 Answer Set Programming Language In this section, we describe the main steps of the compu- In this section, we provide a formal definition of the syn- tational process performed by ASP systems. We will refer tax and semantics of the answer set programming language particularly to the computational engine of the DLV system, supported by DLV: disjunctive datalog extended with strong which will be used for the experiments, but also other ASP negation. For further background, see [Gelfond and Lifschitz, systems, like Smodels, employ a very similar procedure.

1991; Eiter et al., 2000]. An answer set program v in general contains variables. The first step of a computation of an ASP system eliminates ASP Programs these variables, generating a ground instantiation J‚€ƒv†„

1 J A (disjunctive) rule is a formula of v . The hard part of the computation is then performed on

this ground ASP program generated by the instantiator.

U U\[^]G_a` `cb `&bBh `Bi2j

P P Ped1f8g P Pd1f(g

K X&X&X K X&X$X

KWVYX&X&XZV The heart of the computation is performed by the Model

[ i

U ` `

U Generator, which is sketched in Figure 1. Roughly, the Model

KP P P KP P X&X$X where X$X&X are classical literals (atoms pos-

Generator produces some “candidate” answer sets. The sta- WlRm5P&nolRpqlRm

sibly preceded by the symbol k ) and . The U\[

U bility of each of them is subsequently verified by the function J

disjunction K/VZX&X$XrV is the head of , while the conjunction

`&b `cbBh `Bi ` `cb

` IsAnswerSet(I), which verifies whether the given “candidate”

P P P1d1f(g P P4d1f(g P P

X&X$X K X&X&X K X&X&X

K is the body, the

bch i

` ` ~EJ‚€ƒv†„4“

Š is a minimal model of the program obtained

d1f(g KP P4d1f(g

positive body, and X&X&X the negative body ¥

by applying the GL-transformation w.r.t. Š and outputs the P$stP&utP$stu of J . Comparison operators (like ) are built-in predicates in ASP systems, and may appear in the bodies of model, if so. IsAnswerSet(I) returns True if the computation rules. should be stopped and False otherwise.

A disjunctive datalog program (also called ASP program 1Note that 9&!&”\;5='*•S. is not the full set of all syntactically con- • in this paper) v is a finite set of rules. structible instances of the rules of ; rather, it is a subset (often As usual, an object (atom, rule, etc.) is called ground or much smaller) of it having precisely the same answer sets as • propositional, if it contains no variables. [Faber et al., 1999a].

636 LOGIC PROGRAMMING AND THEOREM PROVING

Function ModelGenerator(I: Interpretation): Boolean; £ shrinks to p if the truth of is assumed and propagated in the

var inconsistency: Boolean; ¯SK­£¦„ interpretation Š . The weight is

begin

b

¥Œ¡

Z±²

U b

I := DetCons(I); b°

¯ ­£¦„ w«e¬ ¥ D£¦„

K K

if I = – then return False; (* inconsistency *) ¯ if no atom is undefined in I then return IsAnswerSet(I); Thus, the weight function K prefers literals introducing a

Select an undefined ground atom — according to a heuristic; higher number of short unsatisfied rules. Intuitively, the in- if ModelGenerator(˜H™›š&—Hœ ) then return True; troduction of a high number of short unsatisfied rules is pre- else return ModelGenerator(˜H™,šB8ž$Ÿ£—Hœ ); ferred because it creates more and stronger constraints on the end; intepretation so that a contradiction can be found earlier [Li

Figure 1: Computation of Answer Sets and Anbulagan, 1997]. We combine the weight of an atom £ £

with the weight of its complement d1f(g¦£ to favour such

¯ D£¦„ ¯ *d1f(g¦£¦„ K The ModelGenerator function is first called with parameter that K and are roughly equal, to avoid that

2 a possible failure leads to a very bad state. To this end, as v Š set to the empty interpretation. If the program has an an-

in SATZ, we define the combined weight comb- ¯SK­£¦„ of an swer set, then the function returns True setting Š to the com-

puted answer set; otherwise it returns False. The Model Gen- atom £ as follows:

¥

erator is similar to the Davis-Putnam procedure employed by j

¯SK­£¦„ ¯SK*d1f(g¦£¦„ m´µ\¶ ¯SK/­£¦„6¶ ¯SKƒd1f8g¦£¦„ comb- ¯SK8D£¦„

SAT solvers. It first calls a function DetCons(), which returns

¡ z · z K

the extension of Š with the literals that can be deterministi- Given two atoms and , heuristic prefers over

¡ ¡¸sS¹ z ¯ º¡«„»s ¯ Dz†„

K K K

cally inferred (or the set of all literals upon inconsistency). ( ) iff comb- comb- . Once a

¹

K £ ·‚K £

This function is similar to a unit propagation procedure em- s -maximum atom is selected, heuristic takes if d1f(g¦£ ployed by SAT solvers, but exploits the peculiarities of ASP ¯SK­£¦„£u¼¯½K*d1f(g¦£¦„ , else. for making further inferences (e.g., it exploits the knowledge

Heuristic §y¾ . The second heuristic we consider is inspired that every answer set is a minimal model). If DetCons does to the branching rule of Smodels – a well known ASP sys- not detect any inconsistency, an atom ¡ is selected accord-

tem [Simons, 2000]. Let ¿ÀªÁ¿ denote the number of atoms

ing to a heuristic criterion and ModelGenerator is called on

Q Q L¢N

L¢N which are either true or false in a (three-valued) intepretation

¡ Š d5f(gE¡ ¡ Š and on . The atom plays the role of a

ª . Then, define

¤ ¥

branching variable of a SAT solver. And indeed, like for SAT j

¯ D£¦„ ¿ /¤€¥cD£¦„$¿ ¡ solvers, the selection of a “good” atom is crucial for the ¤ performance of an ASP system. In the next section, we de- Since ¯ maximizes the size of the resulting intepretation, scribe a number of heuristic criteria for the selection of such it minimizes the atoms which are left undefined. Intuitively,

branching atoms. this minimizes the size of the remaining search space (which

 /¤1¥c­£¦„

is ´Â , where is the number of undefined atoms in ) ¤

[Simons, 2000]. Similar to Smodels, the heuristic · cau- ¤

4 Heuristics ¤

D£¦„ ¯ *d1f(gE£¦„

tiously maximizes the minimum of ¯ and . ¤

Throughout this section, we assume that a ground ASP pro- ¤

¹ ·

More precisely, the preference relationship s of is

¤ Š

gram v and an interpretation have been fixed. Here, we de-

¹

z ¡Ãs

defined as follows. Given two atoms ¡ and ,

¤ ¤ ¤

scribe the heuristic criteria that will be compared in Section 7. ¤

z D¡ „BPÆ· *d1f(gt¡ „6„ÇsŽnqÄÅ ­· ºz†„cP"· ƒd5f(gtz†„6„

if n<ÄÅ D· ;

¤ ¤ ¤

We consider “dynamic heuristics” (the ASP equivalent of UP ¥

sS¹ z nqÄÅ D· D¡ „cP"· ƒd5f(gt¡ „6„

otherwise, ¡ if

¤ ¤ ¤ ¤

heuristics for SAT), that is, branching rules where the heuris- U

ºz›„BP"· ƒd1f8gtz›„G„ n ¤ID· D¡ „BPÆ· *d1f(gt¡ „6„Ès

n<Ä­ ­· and

¤ ¤ ¤

U £

tic value of a literal £ depends on the result of taking true

¹

¤ID· Dz†„cP"· *d1f(gtz†„6„ s £

n . Once a -maximum atom is

¤ /¤1¥cD£2„

and computing its consequences. Given a literal £ ,

£ d1f(g¦£ selected, heuristic · takes either or , depending on

will denote the intepretation resulting from the application of Q

LON the same selection principle. £

DetCons (see previous section) on Š ; w.l.o.g., we as- Remark. It is worthwhile noting that the heuristic of Smod- £

sume that $¤€¥cD£¦„ is consistent, otherwise is automatically els, while following the above intuition, is more advanced ¤ set to false and the heuristic is not evaluated on £ at all. and sophisticated than · . Unfortunately, it is defined for non-disjunctive programs, and centered around properties of Heuristic §©¨ . This is an extension of the branching rule adopted in the system SATZ [Li and Anbulagan, 1997] – one unstratified negation. which is not so important in our frame- of the most efficient SAT solvers – to the framework of ASP. work. We do not see any immediate extension of Smodels’

heuristic to the framework of disjunctive ASP programs. ª

The length of a rule J (w.r.t. an interpretation ), is the

b

U

w«e¬ ¥ D£2„ number of undefined literals occurring in J . Let Heuristic §ÊÉ . Let us consider now the heuristic used in 3

denote the number of unsatisfied rules of length p w.r.t. the DLV system. Even if this is more “naive” than the previ- Š

/¤1¥c­£¦„ , which have a greater length w.r.t. . In other words, ous heuristics, we will benchmark it in order to evaluate the

b

U

¥ D£¦„ w«e¬ is the number of unsatisfied rules whose length impact of changing the branching rule on the test system.

2Observe that the interpretations built during the computation are A peculiar property of answer sets is supportedness: For

Š J

3-valued, that is a literal can be True, False or Undefined (if its value each true atom ¡ of an answer set , there exists a rule of

J Š ¡

has not been set yet) w.r.t. to an interpretation ˜ . the program such that the body of is true w.r.t. and is 

3  J

Recall that a rule is satisfied w.r.t. ® if the body of is false the only true atom in the head of . Since an ASP system

 ® w.r.t. ® or the head of is true w.r.t. . must eventually converge to a supported interpretation, ASP

LOGIC PROGRAMMING AND THEOREM PROVING 637

Ø Ø Ø

[Ø7Ù

systems try to keep the interpretations “as much supported Let Ö be a propositional formula in conjunctive normal

¥Ò×

j&j$j

Î

Ë Ö º K  „ 

V V

Ú Ú Ú Û

as possible” during the intermediate steps of the computa- form (CNF) K where the

j$j&j i

¤ P PG¤ tion. To this end, the DLV system counts the number of Un- are literals over the propositional variables K . supportedTrue (UT) atoms, i.e., atoms which are true in the Ö is satisfiable, iff there exists a consistent conjunction

current interpretation but still miss a supporting rule (further ¥

Š¨¿ Ö Š of literals such that (see e.g. [Papadimitriou,

details on UTs can be found in [Faber et al., 1999b] where 1994] for a complete definition). ]G_ they are called MBTs). For instance, the rule ¥©¤ im- 3SAT is a classical NP-complete problem and can be easily plies that ¤ must be true in every answer set of the program;

represented in our formalism as follows:Ø

but it does not give a “support” for ¤ . Thus, in the DLV sys-

³<Ü Ü

Ä n For every propositional variable ¤ ( ), we add

tem ¤ is taken true to satisfy the rule, and it is added to the Ø

set of UnsupportedTrue; it will be removed from this set once the followingØ rule which ensures that we either assume that

`›]G_Ì Ϥ

variable ¤ or its complement true:

¤ ¤

a supporting rule for will be found (e.g., V is a

Q

¥^N

` Ì

¤ Š ¤ÊP4d1f(g P

supporting rule for in the interpretation ). ÝÞ\0¦;5ÝÞ4ß w«Í2­£¦„

Given a literal £ , let be the number of UT atoms

¤

j$j&j

/¤1¥cD£2„ w«Í D£¦„ w«ÍÏ΁­£¦„

Î

K  Ö V in . Moreover, let and be the num- And for every clause V in we add the constraint

ber of UT atoms occurring, respectively, in the heads of ex-



-: = +ÆßÆß"ßB+ =â$ß

8ž&ŸÇà 8ž$Ÿáà

actly 2 and 3 unsatisfied rules w.r.t. /¤€¥cD£¦„ . The heuristic

¤

Ø Ø

·€Î wEÍ2D£¦„ w«Í D£2„ w«ÍÏ΁D£2„ ܌ä

of DLV considers , and in a ³›Ü

Ø

 ã Ä ¤  ¤ Û where ( ) is Û if is a positive literal , and

prioritized way, to favor atoms yielding interpretations with

¤

Ϥ  d1f(gE¤ Û

Û if is a negative literal .

Î Ð8w«Í fewer w«ÍHÐ(w«Í atoms (which should more likely lead

to a supported model). If all UT counters are equal, then the Hamiltonian Path (HAMPATH) is another classical NP-

U ¥c­£¦„

heuristic considers the total number Ñ of rules which complete problem from the area of graph theory: /¤1¥cD£2„

are satisfied w.r.t. . More precisely, given two atoms ¥

DåÁP6æ2„ å

Given an undirected graph ~ , where is the z

¡ and : æ

set of vertices of ~ and is the set of edges, and a node

}

U

¹/Î

z w«Í2D¡ „£u»w«Í2ºz›„

1. ¡Òs if ; ~

å of this graph, does there exist a path of starting

U ¥

at and passing through each node in å exactly once?

¹/Î

s z w«Í2º¡«„ w«Í2Dz†„

2. otherwise, ¡ if and

¤ ¤

w«Í º¡ „£u»w«Í Dz†„

; Suppose that the graph ~ is specified by using two predi-

¤ ¤

¥ Ì

U 4

(\ƒá„ J º P6盄

sS¹$Ύz wEÍ º¡ „ w«Í ºz›„

3. otherwise, ¡ if and cates and , and the starting node is spec-

U J¥

ified by the predicate ¬$¥ (unary) which contains only a

Î Î

º¡ „£u»w«Í Dz†„ w«Í ;

¥ single tuple. Then, the following program solves the problem

Î Î Î

s z w«Í º¡ „ w«Í Dz†„

4. otherwise, ¡ if and HAMPATH.

U U

¥cD¡ „ZsRÑ ¥cºz†„ Ñ .

% Each node has to be reached.

-:¢;5!&=>'*)?.6+ />Bè Æé\>B='*)?.6ß

·€Î 8ž$Ÿ

A sS¹/Î -maximum atom is selected by the heuristic of />Bè Æé\>B='*)?.-:Wê"ëÅècëÆ'*)?.6ßÒ/>Bè Æé\>B='*)?.Ï-:íìƒ;1îZèëÅé‚'*CI+­)?.6ß

DLV. Unlike the previous heuristics, ·€Î considers only atoms

(instead of literals), and it does not take into account what % Guess whether to take a path or not. ìº;1îZèë-é€'*),+ÅC«.€0¦!$”ëÅîZèë-é‚'*),+-C .e-:</>Bè Æé>c='*)ï.6+-è$ $'*),+ÅC«.6ß

happens when the selected atom £ leads to a failure (i.e.,

% At most one incoming/outgoing arc! /¤1¥cƒd5f(g¦£¦„

is not considered in the heuristic). -:¢ìƒ;1îZèë-é‚'*),+ÅCE.6+-ìƒ;1îZèë-é‚'*),+ÅC2ðB.6+-C2ñHòyC2ð$ß -:¢ìƒ;1îZèë-é‚'*),+ÅCE.6+-ìƒ;1îZèë-é‚'*)qð/+4C .6+-)<ñHò )<ð$ß

Heuristic §ÊÓ . Finally, we have considered a simple “bal- ·1Î anced version” ·1Ô of the heuristic of DLV, where also the Blocksworld (BW) is a classic problem from the planning

complement of an atom is evaluated for the heuristic. Given

¥ ¥

¤ domain, and one of the oldest problems in AI:

wEÍSÕºD¡ „ w«Í2D¡ „¶Ww«Í†*d1f(gS¡ „ w«ÍSÕ D¡ „

an atom ¡ , let ,

¤ ¤

¥

wEÍSÕ º¡«„ w«ÍÏÎ(º¡ „¶

, Î , Given a table and a number of blocks in a (known) initial

¥

U U U

Ô

¥4ÕDD¡ „ Ñ ¥cº¡«„½¶^Ñ ¥cƒd5f(gt¡ „ ·

and Ñ . The heuristic state and a desired goal state, try to reach that goal state Î works precisely as · , but considers the primed counters. by moving one block at a time such that each block is Once the best atom has been selected, it is taken positive or either on top of another block or the table at any given

negative, depending on ·1Î . time step. Figure 2 shows a simple example that can be solved in three 5 Benchmark Programs time steps: First we move block c to the table, then block b To evaluate the different heuristics presented in the previous on top of a, and finally c on top of b. section, we chose a couple of benchmark problems: 3SAT, initial: goal: Blocksworld Planning, Hamiltonian path, and Strategic Com- c c b panies. b a a

3SAT is one of the best researched problems in AI and gen- Figure 2: Simple BW Example erally used for solving many other problems by translating them to 3SAT, solving the 3SAT problem and transforming 4Predicate è& is symmetric, since undirected arcs are bidirec- the solution back to the original domain: tional.

638 LOGIC PROGRAMMING AND THEOREM PROVING Due to space restrictions we refer to [Erdem, 1999; Faber 7 Experimental Results and Conclusion et al., 1999a] for a complete encoding.

The results of our experiments are displayed in the graphs of

¡ ¤ ¢ Strategic Companies (STRATCOMP) finally, is a - Figures 3–6. In each graph, the horizontal axis reports a pa- complete problem, which has been first described in [Cadoli rameter representing the size of the instance. On the vertical et al., 1997]: axis, we report the average running time (expressed in sec- A holding owns Ì companies, each of which produces onds) over the 8 instances of the same size we have run (see some goods. Some of these companies may jointly con- previous section).

trol others. This is modelled by means of predicates Remark. All heuristics have been implemented in a straight-

³ ³

ó

PÆõt´8„ ô õ õt´

J‚ºôZPÆõ ( is produced by and ) and forward way, without optimizations, so the running times re-

³ ä

Ì

P"õt´PÆõ „ õ

Ï¥c­õ«P"õ (company is jointly controlled ported in the graph are meaningful only for comparing the

³ ä

õt´ õ by õ , and ). relative efficiencies of the heuristics. Now, some companies should be sold, under the con- We have allowed a running time of 600 seconds for each straint that all goods can be still produced, and that no problem instance. In the graphs, the line of an heuristic company is sold which would still be controlled by the stops whenever some problem instance was not solved in the holding afterwards. A company is strategic, if it belongs maximum allowed time. The following table displays, for to a strategic set, which is a minimal set of companies each heuristic, the maximum instance-size where the heuris- satisfying these constraints. tic could solve all problem instances in the maximum allowed

The answer sets of the following natural program corre- time.

 @

é éâ é\ú spond one to one to the strategic sets. Checking whether any é

given company õ is strategic is done by brave reasoning: “Is 3SAT 270 270 250 260 ò

there any answer set containing õ ?” HAMPATH 80 40 70 100

ò ò ò

/'ºF ðc.10›ê /'ºFHö$.e-:ï÷\&!$='ƒîy+-F ð/+6FHö/.6ß

ê BW P6 P6 P5 P6

ò ò ñ ò

/'ºFH.-:¢ "!&;5ë"'ºF£+GF ð/+6FHö+GFZø.6+6ê /'ºF ðc.6+4ê /'ºFHö$.6+6ê $'ºFZø.6ß

ê STRATCOMP 1200 1200 700 1200 Î As in [Cadoli et al., 1997] we assume that each product As expected, heuristic · , the “native” heuristic of the DLV is produced by at most two companies and each company is system, which does not combine the heuristic values of com- jointly controlled by at most three companies to allow for an plementary atoms, is the worst in most cases. It does not ter- easier representation. minate on the instance P6 of BW, it could not solve any of the benchmark instances of STRATCOMP (it does not appear at

6 Benchmark Data all in Figure 5), and stopped earlier than the others on 3SAT. · For 3SAT, we have randomly generated 3CNF formulas over Heuristic K , the extension of SATZ heuristic to ASP, be-

 variables using a tool by Selman and Kautz [Selman and haves very well on average. On 3SAT, BW, and STRAT-

Kautz, 1997]. For each size we generated 8 such instances, COMP, ·‚K could solve all benchmark instances we have run. where we kept the ratio between the number of clauses and It is the fastest on BW and one of the two fastest on 3SAT. the number of variables at 4.3, which is near the cross-over It shows a negative behaviour only on HAMPATH. In this point for random 3SAT [Crawford and Auton, 1996]. problem, considering UnsupportedTrue literals seems to be

The instances for HAMPATH were generated using a tool crucial for the efficiency.

Ô Î · by Patrik Simons which has been used to compare Smod- Heuristic · is surprisingly good compared to . It is a els against SAT solvers (cf. [Simons, 2000]), and is avail- simple “balanced version” of heuristic ·1Î (the heuristic val-

able at http://tcs.hut.fi/Software/smodels/misc/ ues of the positive and of the negative literal are combined by

Î ·

hamilton.tar.gz. For each problem size  we generated sum). This simple extension to dramatically improves the Ô 8 instances, always assuming node 1 as the starting node. performance. Indeed, heuristic · solves nearly all instances The blocksworld problems P3 and P4 have been employed we ran (only on 3SAT it stopped a bit earlier than other heuris- in [Erdem, 1999] to compare ASP systems, and can be solved tics). It is the best heuristic on STRATCOMP and, impor- in 8 and 9 steps, respectively. We augmented these by prob- tantly, on HAMPATH, where it beats all other heuristics of a

lems P5 and P6 which require 11 and 12 steps, respectively. relevant factor. ¤

For each of these problems, we generated 8 random permuta- The behaviour of heuristic · , based on the minimization tions of the input. of the undefined atoms, is rather controversial. It behaves

For STRATCOMP, finally, we randomly generated 8 in- very well on 3SAT and BW, but it is extremely bad on HAM-

 

stances for each problem size  , with companies and PATH, where it stops at 40 nodes already and is beaten even

Î · products. Each company ù is controlled by one to five com- by the “naive” heuristic . This confirms that further stud-

panies, where the actual number of companies is uniform ran- ies are needed to find a proper extension of the heuristic of Ì domly chosen. On average there are 1.5 Ï¥ relations per Smodels to the framework of disjunctive ASP.

company. Concluding, we observe that both heuristic ·K and heuristic Ô The benchmark data are available at http://www.dbai. · , significantly improve the efficiency of the native heuristic

tuwien.ac.at/proj/dlv/. All experiments were per- ·€Î of the DLV system. The dramatic improvement obtained ·1Ô formed on an Athlon/750 FreeBSD 4.2 machine with 256MB by the simple change from ·€Î to , confirms even more the of main memory. The binaries were produced with GCC importance of a careful study of branching rules in ASP sys- 2.95.2. tems. This paper is only a first step in this field, we hope that

LOGIC PROGRAMMING AND THEOREM PROVING 639 ü further works will follow, for proposing new heuristics for h1 average running time 200 number of variables ASPû and for better understanding the existing ones, in order to improve the efficiency of ASP systems. ü h2

150 average running time

0 0

number of variables ü 200 h1 ü h3 100 average running time average running time

instance

number of variables

160 ü

average running time ü h2 50 h4 average running time average running time

0 0 1 instance 0 0 120 number of variables

ü 0 h3 180 210 240 270 80 number of variables average running time

instance

average running time ü 40 h4 Figure 6: 3SAT problems, average running times average running time

0 0 1 instance 0 3 4 5 6 [Eiter et al., 2000] T. Eiter, W. Faber, N. Leone, and G. Pfeifer. instance Declarative Problem-Solving Using the DLV System. Logic- Figure 3: Blocksworld problems, average running times Based Artificial Intelligence, pp. 79–103. Kluwer, 2000. [Erdem, 1999] E. Erdem. Applications of Logic Program- ming to Planning: Computational Experiments. Un-

ý http://www.cs.utexas.edu/users/ 50 h1 published draft. esra/papers.html, 1999. average running time

þ number of nodes 40

ý [ ] h2 Faber et al., 1999a W. Faber, N. Leone, C. Mateis, and G. Pfeifer. Using Database Optimization Techniques for Nonmonotonic average running time

0 0 30 number of nodes Reasoning. DDLP’99, pp. 135–139. ý h3 20 [Faber et al., 1999b] W. Faber, N. Leone, and G. Pfeifer. Pushing average running time number of nodes Goal Derivation in DLP Computations. LPNMR’99, pp. 177–

average running time ý 10 h4 191.

average running time 0ÿ 0 number of nodes [Freeman, 1995] J.W. Freeman. Improvements on Propositional 0 40 50 60 70 80 90 100 number of nodes Satisfiability Search Algorithms. PhD thesis, University of Penn- sylvania, 1995. Figure 4: Hamiltonian Path problems, average running times [Gelfond and Lifschitz, 1991] M. Gelfond and V. Lifschitz. Classi- cal Negation in Logic Programs and Disjunctive Databases. New Generation Computing, 9:365–385, 1991. 5.0 ý h1 [Hooker and Vinay, 1995] J.N. Hooker and V. Vinay. Branching 4.5 average running time

number of companies

4.0 rules for satisfiability. Journal of Automated Reasoning, 15:359– ý 3.5 h2 383, 1995.

1.0

ÿ 0.5 average running time

ÿ 0.0 0 3.0 number of companies [Li and Anbulagan, 1997] C.L. Li and Anbulagan. Heuristics based 2.5 ý h4 on unit propagation for satisfiability problems. In IJCAI 1997, pp. 2.0

average running time 366–371.

number of companies 1.5 average running time [ ] 1.0 Lifschitz, 1996 V. Lifschitz. Foundations of logic programming. 0.5 Principles of Knowledge Representation, pp. 69–127. CSLI Pub- 0.0 lications, Stanford, 1996. 700 800 900 1000 1100 1200 number of companies [Lifschitz, 1999] V. Lifschitz. Answer set planning. ICLP’99, pp. Figure 5: Strategic Companies, average running times 23–37. The MIT Press. [McCain and Turner, 1998] N. McCain and H. Turner. Satisfiabil- ity planning with causal theories. KR’98, pp. 212–223. Morgan References Kaufmann Publishers, 1998. [Cadoli et al., 1997] M. Cadoli, T. Eiter, and G. Gottlob. Default [Niemela,¨ 1999] I. Niemela.¨ Logic programming with stable model Logic as a Query Language. IEEE TKDE, 9(3):448–463, 1997. semantics as constraint programming paradigm. Annals of Math- [Crawford and Auton, 1996] James M. Crawford and Larry D. Au- ematics and Artificial Intelligence, 25(3–4):241–273, 1999. ton. Experimental results on the crossover point in random 3sat. [Papadimitriou, 1994] Christos H. Papadimitriou. Computational Artificial Intelligence, 81(1–2):31–57, March 1996. Complexity. Addison-Wesley, 1994. [East and Truszczynski,´ 2000] D. East and M. Truszczynski.´ dcs: [Rao et al., 1997] P. Rao, K.F. Sagonas, T. Swift, D.S. Warren, An implementation of DATALOG with Constraints. NMR’2000. and J. Freire. XSB: A System for Efficiently Computing Well- Founded Semantics. LPNMR’97, pp. 2–17. [Egly et al., 2000] U. Egly, T. Eiter, H. Tompits, and S. Woltran. Solving Advanced Reasoning Tasks using Quantified Boolean [Selman and Kautz, 1997] B. Selman and H. Kautz, 1997. ftp: Formulas. In AAAI’00, pp. 417–422. AAAI Press. //ftp.research.att.com/dist/ai/. [ ] [Eiter et al., 1997] T. Eiter, G. Gottlob, and H. Mannila. Disjunctive Simons, 2000 P. Simons. Extending and Implementing the Stable Datalog. ACM Transactions on Database Systems, 22(3):364– Model Semantics. PhD thesis, Helsinki University of Technology, 418, September 1997. Finland, 2000.

640 LOGIC PROGRAMMING AND THEOREM PROVING Graph Theoretical Characterization and Computation of Answer Sets

Thomas Linke Institut f¨ur Informatik, Universit¨at Potsdam

Postfach 60 15 53, D–14415 Potsdam, Germany, [email protected]

A Ö A f ¾ A

f ½ ½

Abstract ½ , because if contributes to ,then and thus

Ö Ö Ö

e a e cannot be applied. Analogously, blocks wrt answer

We give a graph theoretical characterization of an- A

set ¾ . This observation leads us to a strictly blockage-based swer sets of normal logic programs. We show that approach. More precisely, we represent the block relation be- there is a one-to-one correspondence between an- tween rules as a so-called block graph. Answer sets then are swer sets and a special, non-standard graph coloring characterized as special non-standard graph colorings of block of so-called block graphs of logic programs. This graphs. Each node of the block graph (corresponding to some leads us to an alternative implementation paradigm rule) is colored with one of two colors, representing applica- to compute answer sets, by computing non-standard tion or non-application of the corresponding rule. The block graph colorings. Our approach is rule-based and not graph has quadratic size of the corresponding logic program. atom-based like most of the currently known meth- Since the block graph serves as basic data structure for our ods. We present an implementation for computing implementation, it needs polynomial space. answer sets which works on polynomial space. 2 Background

1 Introduction We deal with normal logic programs which contain the sym- Ö bol ÒÓØ used for negation as failure.Arule, , is any expres- Answer set semantics [Gelfond and Lifschitz, 1991] was es- sion of the form

tablished as an alternative declarative semantics for logic pro-

Ô Õ ;:::;Õ ; ÒÓØ × ;:::;ÒÓØ×

Ò ½ k grams. Originally, it was defined for extended logic pro- ½ (2)

1

Õ ¼  i  Ò × ¼  j  k

[ ] Ô j

grams Gelfond and Lifschitz, 1991 as a generalization of where , i ( )and ( ) are atoms. A rule is a

=k=¼ k=¼ Ö

the stable model semantics [Gelfond and Lifschitz, 1988]. fact if Ò , it is called basic if . For a rule we define

head ´Ö µ=Ô bÓdÝ ´Ö µ=fÕ ;:::;Õ ;ÒÓØ × ;:::;ÒÓØ × g

Ò ½ k Currently there are various applications of answer set pro- and ½ .

[ ·

ÓdÝ ´Ö µ=fÕ ;:::;Õ g

gramming, e.g. Dimopoulos et al., 1997; Liu et al., 1998; b Ò Furthermore, let ½ denote the positive

Niemel¨a, 1999]. Furthermore, there are reasonably efficient

bÓdÝ ´Ö µ=f× ;:::;× g bÓdÝ ´Ö µ k part and ½ the negative part of . implementations available for computing answer sets, e.g. Definitions of the head, the body, the positive and negative smodels [Niemel¨a and Simons, 1997] and dlv [Eiter et al., body of a rule are generalized to sets of rules in the usual way.

1997]. Systems, like DeReS [Cholewi´nski et al., 1996] and ·

ÓdÝ ´Ö µ The elements of b are referred to as the prerequisites

quip [Egly et al., 2000], are also able to compute answer

bÓdÝ ´Ö µ or positive body atoms of Ö . The elements of are

sets, although they were designed to deal with more general ·

bÓdÝ ´Ö µ=; referred to as the negative body atoms of Ö .If , formalisms.

then Ö is said to be prerequisite-free. A set of rules of the Both systems and most of the theoretical results as well deal form (2) is called a (normal) logic program. A program is with answer sets in terms of atoms (or literals). This paper called prerequisite-free if each of its rules is prerequisite-free.

aims at a different point of view, namely characterizing and

È F

We denote the set of all facts of program by È .

computing answer sets in terms of rules. Intuitively, the head ·

Ö head ´Ö µ

Let Ö be a rule. then denotes the rule

Ô Ô Õ ;:::;Õ ; ÒÓØ × ;:::;ÒÓØ ×

½ Ò ½ k

of a rule is in some ·

· ·

È È = fÖ j Ö ¾ È g ÓdÝ ´Ö µ

b . For a logic program let .A

A Õ ;:::;Õ A × ;:::;×

Ò ½ k

answer set if ½ are in and none of is È set of atoms X is closed under a basic program iff for any

in A .Let

Ö ¾ È head ´Ö µ ¾ X bÓdÝ ´Ö µ  X 

 , whenever . The smallest set

a b; ÒÓØ e b d c b

of atoms which is closed under a basic program È is denoted =

È (1)

d e d; ÒÓØ f f a

È µ

by Cn´ .

X

È X

The reduct, È , of a program relative to aset of atoms

Ö Ö Ö Ö

a b c d

be a logic program and let us call the rules , , , ,

X ·

= fÖ j Ö ¾ È bÓdÝ ´Ö µ \ X = ;g:

is defined by È and We

Ö Ö È f

e ,and , respectively. Then has two different answer È

say that a set X of atoms is an answer set of a program iff

A = fd; b; c; a; f g A = fd; b; c; eg ¾

sets ½ and . It is easy to

X X

È µ=X È

Cn ´ . The reduct is often called the Gelfond-

Ö Ö e see that the application of f blocks the application of wrt Lifschitz reduction. Observe, that there are programs, e.g.

1

Ô ÒÓØ Ôg Extended logic programs are logic programs with classical nega- f , that do not possess an answer set. Through- tion. out this paper, we use the term “answer set” instead of “stable

LOGIC PROGRAMMING AND THEOREM PROVING 641

model” since it is the more general one. Observe, that there exists a unique maximal grounded set

¼

È  È È Ë

Asetofrules of the form (2) is grounded iff there exists for each program ,thatis, È is well-defined. This

¼

Ö

hÖ i Ë i ¾ Á

i¾Á

an enumeration i of such that for all we have that definition captures the conditions under which a rule blocks

¼ ½

·

Ö ´Ö ;Öµ ¾ A

ÓdÝ ´Ö µ  head ´fÖ ;:::;Ö gµ: Ë

b another rule (e.g. ). We also gather all ground-

½ i ½

i Forasetofrules and

È Ë

a set of atoms X we define the set of generating rules of wrt edness information in , due to the restriction to rules in the È

X as follows maximal grounded part of logic program . This is important

¼ Ö

because a block relation between two rules Ö and becomes

·

¼

Ê ´Ë; X µ=fÖ ¾ Ë j bÓdÝ ´Ö µ  X; bÓdÝ ´Ö µ \ X = ;g: G (3)

effective only if Ö is groundable through other rules. E.g. for

= fÔ Õ; Õ Ôg

The following result relates groundedness and generating program È the maximal grounded subset rules to answer sets. of rules is empty and therefore È contains no 0-arcs. Fig-

ure 1 shows the block graph of program (1). X

Lemma 2.1 Let È be a logic program and be a set of

¼ ¼

È GÊ ´È; X µ

atoms. Then X is an answer set of iff (i) is

Ö Ö Ö

a b c

·

= ´GÊ ´È; X µ µ

grounded and (ii) X Cn .

¼ ¼

This lemma characterizes answer sets in terms of generating ½

· X

Ê ´È; X µ 6= È È =

rules. Observe, that in general G (take

Ö

Ö Ö

f

e d

a ;b cg X = fag

f and ).

½ ¼

Assume that for each program È we have

Figure 1: Block graph of program (1).

·

¾ È jbÓdÝ ´Ö µj½:

for each rule Ö we have (4) We now define so-called application colorings or a-

G  È

In Section 5, we show how to generalize our approach to both colorings for block graphs. A subset of rules Ö is

Ö ¾ È G

normal logic programs with multiple positive body atoms and a grounded 0-path for if Ö is a 0-path from some fact

Ö also to extended logic programs. Therefore, the assumption to in È .

above is not a real restriction. Definition 3.2 Let È be a logic program s.t. condition (4)

¼ ½

= ´È; A [ A µ We need some graph theoretical terminology. A directed

holds, let È be the corresponding block

È È

G G =´Î; Aµ Î

: È 7! f©; ¨g c

graph is a pair such that is a finite, non- graph and let c be a mapping. Then is

A  Î ¢ Î empty set (vertices) and is a set (arcs). For a

an a-coloring of È iff the following conditions hold for each

G =´Î; Aµ Ú ¾ Î

¾ È

directed graph and a vertex ,wedefine Ö

­ ´Ú µ = fÙ j ´Ù; Ú µ ¾

the set of all predecessors of Ú as

c´Ö µ=©

A½ iff one of the following conditions holds

Ú Ag

¼

. Analogously, the set of all successors of is defined as ¼

­ ´Ö µ 6= ; Ö ¾ ­ ´Ö µ c´Ö µ=©

¼ · a. and for each we have

¼ ¼

´Ú µ=fÙ j ´Ú; Ùµ ¾ Ag Ú Ú G =´Î; Aµ

­ .Apath from to in

¼¼ ¼¼

¾ ­ ´Ö µ c´Ö µ=¨

b. there is some Ö s.t. .

½

¼ ¼

È  Î È = fÚ ;:::;Ú g

ÚÚ ½ Ò

is a finite subset ÚÚ such that ,

¼

A¾ c´Ö µ=¨

= Ú Ú = Ú ´Ú ;Ú µ ¾ A ½  i<Ò

Ú iff both of the following conditions hold

Ò i i·½ ½ , and for each . The arcs

2

¼ ¼

È AÖc× ´È µ=f´Ú ;Ú µ j ½ 

­ G c´G µ=¨ ´Ö µ=;

ÚÚ ÚÚ i i·½ Ö

of a path are defined as a. or it exists grounded 0-path Ö s.t.

¼

¼¼ ¼¼

i<Òg Ú Ú Ú ¾ Î

¾ ­ ´Ö µ c´Ö µ=©

. A path from to for some is called a cycle in b. for each Ö we have . ½

G.

c

Let be an a-coloring of some block graph È . Rules are In order to represent more information in a directed graph,

then intuitively applied wrt some answer set of È if they are

¼ ½

Î; A [ A µ

we need a special kind of labeled graphs. ´ is

A½ Ö

colored ¨. Condition specifies that a rule is colored

¼ ½

[ A

a directed graph whose arcs A are labeled with zero

Ö A½ © (not applied) if and only if is not “grounded” ( a) or

(0-arcs) and with one (1-arcs), respectively. For G we distin- A½ Ö is blocked by some other rule ( b). A rule is colored

guish 0-predecessors (0-successors) from 1-predecessors (1- A¾

¨ (applied) if and only if it is grounded ( a) and it is not

· ·

´Ú µ ­ ´Ú µ ­ ´Ú µ ­ ´Ú µ

successors) denoted by ­ ( )and ( )

¼ ¼ ½ ½

blocked by some other rule (A¾b) . This captures the intuition

¼

Ú ¾ Î È G

for , respectively. A path ÚÚ in is called 0-path which rules apply wrt to some answer set and which do not

¼

¼

AÖc× ´È µ  A

if ÚÚ .Thelength of a cycle in a graph (see Section 1).

¼ ½

Î; A [ A µ

´ is the total number of 1-arcs occurring in the

Ö = Ô ÒÓØ Ô È = fÖ g Ô Let Ô . Then program has

cycle. Additionally, we call a cycle even (odd) if its length is

´È; ;; f´Ö ;Ö µgµ Ô block graph Ô . By Definition 3.2 there is no

even (odd).

a-coloring of È . For this reason, we need both conditions A¾ A½ and . 3 Block Graphs and Application Colorings

Lemma 3.1 Let È be a logic program s.t. condition (4) holds

c A½

We now go on with a formal definition of the conditions under and let be an a-coloring of È . Then condition holds iff

which a rule blocks another rule. condition A¾ does not hold. In general, we do not have the equivalence stated in this Definition 3.1 Let È be a logic program s.t. condition (4)

¼ lemma, because there are examples (see above) where no a-

 È

holds, and let È maximal grounded. The block graph ½

¼ coloring exists. Lemma 3.1 states that a-colorings are well-

=´Î ;A [ A µ È È

È of is a directed graph with vertices È

È defined in the sense that they assign exactly one color to each

Î = È

È and two different kinds of arcs defined as follows

node.

·

¼ ¼ ¼ ¼ ¼

= f´Ö ;Öµ j Ö ;Ö ¾ È head ´Ö µ ¾ bÓdÝ ´Ö µg

A and

È 2

 È c´Ë µ=¨ c´Ë µ=©

Forasetofrules Ë we write or if for

½ ¼ ¼ ¼ ¼

= f´Ö ;Öµ j Ö ;Ö ¾ È head ´Ö µ ¾ bÓdÝ ´Ö µg:

A and

È

¾ Ë c´Ö µ= ¨ c´Ö µ=© each Ö we have or , respectively.

642 LOGIC PROGRAMMING AND THEOREM PROVING Ò

We obtain the main result: If there is no such then chooseÈ fails. This strategy to select c

choice points ensures that nodes ¨ are grounded. Observe

Theorem 3.2 Let È be a logic program s.t. condition (4)

Í 6= ;

that, if chooseÈ fails and then we have to color all

È È

holds and let È be the block graph of .Then has an ©

nodes in Í with , since they cannot be grounded through

A c

answer set iff È has an a-coloring .Furthermore,we c

rules ¨ .

Ê ´È; A µ=fÖ ¾ È j c´Ö µ=¨g: have G

During recursive calls Æ contains the choice point of the

Answer sets can therefore be computed by computing a- former recursion level. The color of nodes Æ has to be prop-

c´fÖ gµ=© c´fÖ ;Ö ;Ö ;Ö ;Ö gµ=¨

d b c a f

colorings, e.g. e and agated with propagate (see Figure 3) for two reasons. First,

È

A

½

Ü Ü ¾f©; ¨g correspond to answer set of program (1). when coloring nodes in Æ with color ( ) it is not

checked whether this is allowed (wrt the actual c). This check 4 Computation of A-colorings

is done by propagate . This means, colorÈ fails only during È For the following description of our algorithm to compute propagation (see Theorem 4.1). Second, propagating already

a-colorings, let È be some logic program s.t. condition (4) colored nodes prunes the search space and thus reduces the

: È 7! f©; ¨g c

holds. Let c be a partial mapping. is repre- necessary number of choices. Since choice points make up

c ;c µ c = fÖ ¾ È j

´ the exponential part of our problem, propagation becomes the

¨ © sented by a pair of (disjoint) sets © s.t.

3

´Ö µ=©g c = fÖ ¾ È j c´Ö µ=¨g c essential part of our approach.

and ¨ . We refer to map-

´c ;c µ

c Currently, we propagate only in arc direction as it is suf-

¨ È ping with the tuple © and vice versa. Assume that

is a global parameter of each presented procedure (indicated ficient for correctness and completeness of the algorithm.

Í Æ Í

through index È ). Let and be sets of nodes s.t. con- Therefore, we have to deal with four propagation cases: if

Ü Ü ¾ f©; ¨g

= È Ò ´c [ c µ

Í a node is colored ( ) then this color has to be ¨

tains the currently uncolored nodes ( © )and

¼

Ö ;Ö ¾ È

Æ contains colored nodes whose color has to be propagated. propagated over 1- and over 0-arcs. Let be nodes

¼ ¼ ½ ¼

Ö ;Öµ ¾ A [ A Ö Figure 2 shows the implementation of the non-deterministic s.t. ´ and assume that is already colored.

Then we have to propagate this color to node Ö . For example,

procedure colorÈ in pseudo code.

¼

´Ö µ=¨ c´Ö µ=©

propagating c over 1-arcs gives . For rea-

´Í; Æ : ; c : µ

procedure colorÈ list partial mapping sons of correctness, we cannot propagate colors without any :

var Ò node ; further tests. We have got the following result.

Æ; cµ

if propagate ´ fails then fail ; È

Theorem 4.1 Let È be a logic program s.t. condition (4)

Í := Í Ò ´c [ c µ ¨

© ;

c :

holds, let È be the corresponding block graph and let

´Í; c; Òµ

if chooseÈ fails then

È 7! f©; ¨g c

be a mapping. If is an a-coloring of È then

c := ´c [ Í; c µ ¨

© ;

¼ ¼ ¼

Ö ¾ È Ö ¾ F c´Ö µ=¨

for each if È then and the following

c Í; cµ if propagate ´ fails then fail else output ; È conditions hold:

else

·

¼

Í := Í ÒfÒg

´Ö µ ¾ ­ c´Ö µ=©

; (A) for each Ö we have

½

¼

c := ´c ;c [fÒgµ

c´Ö µ=¨ ¨

© ; if

´Í; fÒg;cµ

if colorÈ succeeds then exit

·

¼

¾ ­ ´Ö µ c´Ö µ=¨ (B) for each Ö we have

else ½

¼ ¼¼ ¼¼

´Ö µ=© Ö ¾ ­ ´Ö µ:c´Ö µ=©

if c and for each and

½

c := ´c [fÒg;c ÒfÒgµ ¨

© ;

¼¼ ¼¼

[­ Ö ¾ ­ ´Ö µ=; ´Ö µ:c´Ö µ=¨]

Í; fÒg;cµ

´ or there is some

¼ ¼

if colorÈ succeeds then exit else fail ;

·

¼

¾ ­ ´Ö µ c´Ö µ=¨

(C) for each Ö we have ¼

Figure 2: Definition of procedure colorÈ .

¼ ¼¼ ¼¼

´Ö µ=¨ Ö ¾ ­ ´Ö µ:c´Ö µ=©

if c and for each ½

Notice that all presented procedures (except chooseÈ )

·

¼

¾ ­ ´Ö µ c´Ö µ=©

(D) for each Ö we have ¼

return some partial mapping through parameter c or fail.

¼ ¼¼ ¼¼

´Ö µ=© Ö ¾ ­ ´Ö µ:c´Ö µ=©

if c and for each . ¼

chooseÈ returns some node or fails.

c =

When calling colorÈ the first time, we start with According to Theorem 3.2, a rule contributes to some answer

´;;F µ Í = È Ò F Æ = F

È È

, È and . That is, we start with all ¨

set A if it is colored . In case of (A) there is no further con- ¨

facts colored . Basically, colorÈ takes both a partial map- ©

dition, because a node Ö has to be colored if there is some

¼¼ ¼¼ ¼ Í

ping c and a set of uncolored nodes and aims at coloring

Ö ¨ Ö = Ö 1-predecessor Ö of which is colored (take in

these nodes. This is done by choosing some uncolored node Ò Ö

A½b Definition 3.2). In other words, cannot be applied if it

Ò ¾ Í ¨

( ) with chooseÈ andbytryingtocolorit first. In case is blocked by some other applied rule. Intuitively, condition

Ò ©

of failure colorÈ tries to color node with . If this also fails

(B) says that Ö has to be applied if all of its 1-predecessors Ò

colorÈ fails. Therefore, we say that node is used as a choice Ö are colored © ( is not blocked) and one of its 0-predecessors

point. All different a-colorings are obtained by backtracking ·

bÓdÝ ´Ö µ is colored ¨ ( is a consequence of applied rules or

over choice points. Ö

Ö is a fact). Condition (C) states that rule is applied if it is

´Í; c; Òµ Ò Ò ¾ Í

chooseÈ selects some uncolored node ( ) “grounded” through one of its 0-predecessors and if it is not

­ ­ Òµ = ; ´Òµ 6= ;

s.t. ´ and or the following condition ½

¼ blocked by some other rule. The last condition postulates that

·

bÓdÝ ´Ö µ

holds: rule Ö cannot be applied if cannot be derived from

c ¼

¼ other applied rules. Theorem 4.1 implies that a mapping is

Ò ¾ ­ ´Òµ c´Ò µ=¨ CÈ there is some s.t. . ¼ no a-coloring if propagate fails. Figure 3 shows the imple-

3 È

c È = c [ c ¨

Since is not total we do not necessarily have © . mentation of propagate . The purpose of propagate is to

È È

LOGIC PROGRAMMING AND THEOREM PROVING 643

´È Ò C = fc j c

; c : ; ´Æ : È

procedure propagate list partial mapping ) Define È is some output of color

È

¼

F ;F ; ´;;F µµg

Ò :

È È

var node ; È . With this notation, we obtain correctness

Æ 6= ;

while do and completeness of colorÈ .

¼ Æ select Ò from ;

Theorem 4.3 Let È be a logic program s.t. condition (4)

¼

´Ò ¾ c µ

if ¨ then

c : È 7! f©; ¨g

holds, let È be its block graph, let be a

¼

Ò ;cµ

(A) if propA ´ fails then fail ;

C c È

mapping and let È be defined as above. Then is an a-

¼

Ò ;cµ

(C) if propC ´ fails then fail ;

c ¾ C

È È coloring of È iff . else

colorÈ

¼ Let us demonstrate how computes the a-colorings

Ò ;cµ

(B) if propB ´ fails then fail ;

È

¼ of the block graph of program (1) (see Figure 1). We in-

Ò ;cµ

(D) if propD ´ fails then fail .

È

´Í; Æ ; cµ Í = È ÒfÖ g Æ = fÖ g

È d

voke color with d , and

c =´;; fÖ gµ ´Æ; cµ

d .First,propagate is executed. By prop-

Figure 3: Definition of procedure propagate . È

È

c´Ö µ=¨ c´Ö µ=¨ b

agating d with case (C) we get and re-

c´Ö µ=¨ c = ´;; fÖ ;Ö ;Ö gµ

d b c

cursively c .Thisgives .Af-

¼

Í = fÖ ;Ö ;Ö g

c´Ò µ=¨

e f

apply the corresponding propagation cases, e.g. if ter updating uncolored nodes we obtain a .

´Í; c; Òµ Ò È then cases (A) and (C) have to be applied. Now chooseÈ ( variable) is executed. For choose

The four procedures used in propagate can be easily im- there are two possibilities to compute the next choice point

È

CÈ Ö Ö Ò = Ö

e a

plemented. For example, propB is shown in Figure 4. First, s.t. hold, namely a and . Assume .

È

Í Í = fÖ ;Ö g f

After updating we have e and the first

´Í; fÖ g;cµ c = a

colorÈ

¼ recursive call is executed where

; c : µ Ò :

procedure propB ´ node partial mapping

È

´;; fÖ ;Ö ;Ö ;Ö gµ ¨ Ö

b c a a

d . Again color of node has to be prop-

: Ë :

var Ò node ; set of nodes ;

´fÖ g; ´;; fÖ ;Ö ;Ö ;Ö gµµ

a d b c a

· propagate

¼ agated by executing .

È

:= fÒ ¾ ­ ´Ò µ j Òg

Ë condition (B) holds for ;

½

c´Ö µ = ¨

By using propagation case (C) for a we obtain

6= ;

while Ë do

c´Ö µ = ¨

f . This color is recursively propagated using Ë

select Ò from ;

c´Ö µ = © c =

case (A), which gives e . Thisleadsto

Ò ¾ c

if © then fail ;

Í ´fÖ g; fÖ ;Ö ;Ö ;Ö ;Ö gµ

d b c a f

e .Since becomes the empty set,

Ò 6¾ c

if ¨ then c

chooseÈ fails and is the first output. Invoking backtrack-

c := ´c ;c [fÒgµ ¨ © ;

ing means that the last recursive call to colorÈ fails. Then

Ò; cµ

propagate ´ .

È

c = ´fÖ g; fÖ ;Ö ;Ö gµ ´Í; fÖ g;cµ

d b c È a

a and color is executed.

´Ö µ=© c´Ö µ=©

Figure 4: Definition of procedure propB . c

a f

È By using case (D) for we obtain and

c´Ö µ= ¨

thus e with case (B). Hence the second solution is

¼

c =´fÖ ;Ö g; fÖ ;Ö ;Ö ;Ö gµ

Ò a f d b c e it determines the set Ë of all 1-successors of s.t. condi- . Since there is no other choice

tion (B) holds. Finally, it tests whether all nodes in Ë can be point, we have no further solutions.

Ò ¾ Ë

colored ¨. If node is currently uncolored it is col- 5 Generalizations

c´Òµ=©

ored ¨ and its color is propagated. If then propB

È ¨ fails, otherwise Ò is already colored and we go on with the In this section, we discuss generalizations of the presented

next node from Ë . The procedures for the remaining propaga- approach. First, we show how to apply our method to normal

tion cases can be implemented analogously. Whenever some logic programs with multiple positive body literals. Let È be

= Ô

currently uncolored node is colored during propagation, this a normal program without restrictions. For each rule Ö

Õ ;:::;Õ ; ÒÓØ × ;:::;ÒÓØ×

Ò ½ k

color is recursively propagated by calling propagate . ½ we define

È

 

¼

c : È 7! f©; ¨g

Õ ;ÒÓØ× ;:::;ÒÓØ×

For partial mapping we define the set of Ô

½ k

¼

È = [fÕ Õ j ½  i  Òg

Ö i

¼ ¼

¼ (5)

i

A

Õ Õ ;:::;Õ

corresponding answer sets c as

½ Ò

¼ ¼ ¼

;:::;Õ Õ ;Õ È

A = fX j X È

c where are new atoms not appearing in .Fora Ò

is answer set of and ½

Ë

È È = È

c  GÊ ´È; X µ c \ GÊ ´È; X µ=;g:

Æ Ö ©

¨ and program we set . Hence, we have defined

Ö ¾È

a local program transformation which has linear size of the

c A

If is undefined for all nodes then c contains all answer sets original program. Each normal program is transformed into

È c A

of .If is a total mapping then c contains exactly one

ÓdÝ ´Ö µ=; Ö

some program in which b for each rule with

È c

answer set of (if is an a-coloring). With this notation we ·

jbÓdÝ ´Ö µj > ½ È formulate the following result: . That is why we may interpret Æ as some

kind of normal form of È . The following lemma obviously

Theorem 4.2 Let È be a logic program s.t. condition (4) holds:

c´Ö µ

holds, let c and be partial mappings. Then for each A

Lemma 5.1 Let È be a normal logic program and let be a

Ö ¾ ´c [ c µ ´fÖ g;cµ ¨

© we have if propagate succeeds and

È È

setofatoms.ThenA is an answer set of iff there exists an

´Ö µ

c is the actual partial mapping after executing propagate

È

A È A A

Æ

Æ answer set Æ of s.t. and contain exactly the same

A = A :

c

´Ö µ then c atoms out of the set of all atoms occurring in È .

This theorem states that propagate neither discards nor in- It is straightforward to extend the algorithm presented in

È

È Ö ¾ È Æ

troduces answer sets corresponding to some partial mapping Section 4 to normal programs Æ . Let us call all rules

·

jbÓdÝ ´Ö µj > ½ c. Hence, it justifies that only nodes used as choice points lead with AND-nodes and all other rules OR-

to different answer sets. Therefore backtracking is necessary nodes. Observe, that on the one hand, we do not have to mod- È only over choice points (see Figure 2). ify Definition 3.1 of block graphs for programs Æ .Itstaysas

644 LOGIC PROGRAMMING AND THEOREM PROVING

it is. We just distinguish two different kinds of nodes in È . as a tool for theoretical analysis of logic programs. For exam- Æ On the other hand, Definition 3.2 and procedures presented ple, results presented in [Linke and Schaub, 2000] imply that in Section 4 deal only with OR-nodes and consequently we a logic program without odd cycles always has some answer

have to extend it to AND-nodes. For AND-node Ö we know set.

­ ´Ö µ=; Ö

by definition that (see (5)). Therefore, cannot be Clearly, colorÈ is, like smodels, a Davis-Putnam like

½

A½ A¾

blocked and we do not have to consider cases b and b procedure. Once again, the main difference is that colorÈ of Definition 3.2. In order to extend Definition 3.2, we require determines answer sets in terms of generating rules whereas

that the following conditions hold for each AND-node Ö (cor- smodels and dlv construct answer sets in terms of literals. A¾ responding to conditions A½a and a of Definition 3.2 for Using rules instead of atoms has the advantage that we have

OR-nodes): complete knowledge about which rule is responsible for some

¼

¼ atom belonging to an answer set. Atom-based approaches

c´Ö µ=© Ö ¾ ­ ´Ö µ c´Ö µ=© A¿ iff there is some s.t.

¼ additionally have to detect the responsible rule and ensure

¼ ¼

Ö ¾ ­ ´Ö µ c´Ö µ=¨ c´Ö µ=¨ A4 iff for each we have . groundedness, because in general there may be several rules

¼ with the same head. We obtain groundedness of generating

·

­ ´Ö µ=­ ´Ö µ=

According to (5), for AND-node Ö we have ½

½ rules as a by-product of our strategy to select choice points

· ·

j­ ´Ö µj =½ ­ ´Ö µ[­ ´Ö µ

;, and will never contain any AND-

¼ ¼ ¼

with procedure chooseÈ . node. For this reason, we obtain only two new propagation

cases, in which we propagate the color of OR-nodes over 0- 7Conclusion

·

¼

¾ ­ ´Ö µ arcs to AND-nodes. Let Ö be some AND-node and

¼ The main contribution of this paper was the definition of the

¼

¾f©; ¨g Ö

let Ü be the actual color of (OR-node). Then we

È

block graph È of a program . As a theoretical tool, the

have the following new cases block graph seems to be suitable for investigations of many

·

¼ ¼

¾ ­ ´Ö µ c´Ö µ=¨ c´Ö µ=¨

(C’) for each Ö we have if and concepts for logic programs, e.g. answer set semantics, well-

¼

¼¼

¼¼ founded semantics or query-answering. As a first result, we

¾ ­ ´Ö µ:c´Ö µ=¨ for each Ö

¼ have described answer sets as a-colorings (a non-standard

·

¼ ¼

Ö ¾ ­ ´Ö µ c´Ö µ=© c´Ö µ=©

(D’) for each we have if , ¼ kind of graph colorings of È ). This led us to an alternative

which can be easily integrated into propagate . algorithm to compute answer sets by computing a-colorings È [Gelfond and Lifschitz, 1990] show that logic programs which needs polynomial space. with classical negation are equivalent to normal logic pro- Finally, let us give first experimental results to demon- grams when new atoms are introduced. With this technique strate the practical usefulness of our algorithm. We have used [ our approach is also suitable for computing answer sets of two NP-complete problems proposed in Cholewi´nski et al., ] general logic programs (with classical negation). Addition- 1995 : the problem of finding a Hamiltonian path in a graph ally, we may apply techniques presented in [Janhunen et al., (Ham) and the independent set problem (Ind). 2000] to handle disjunctive logic programs. Concerning time, our first prolog implementation (devel- opment time 6 month) is not comparable with state of the art 6 Related Work implementations. However, Theorem 4.2 suggests to compare the number of used choice points, because it reflects how an Directed graphs are often associated with logic programs and algorithm deals with the exponential part of a problem. Un- used in theory and applications. Usually, the nodes of these fortunately, only smodels gives information about its choice graphs are the atoms of the programs, e.g. dependency graphs points. For this reason, we have concentrated on comparing or TMS networks [Doyle, 1979]4. These approaches use rules our approach with smodels. Results are given for finding

to define graphs on atoms whereas we have used atoms to

à à à Ã

8 9 ½¼ define graphs on rules. 7 Other approaches [Dimopoulos and Torres, 1996; Brignoli smodels 4800 86364 1864470 45168575 et al., 1999] are more or less rule-based but have some serious noMoRe 15500 123406 1226934 12539358 drawbacks: they deal only with prerequisite-free programs, because (wrt to answer set semantics) there is some equivalent Table 1: Number of choice points for HAM-problems. prerequisite-free program for each program. Since in general all solutions of different instances of Ham and Ind.Table1 equivalent prerequisite-free programs have exponential size of shows results for some Ham-encodings of complete graphs

the original ones, approaches which rely on this equivalence 5

à Ò

Ò where is the number of nodes . Surprisingly, it turns need exponential space. out that our non-monotonic reasoning system (noMoRe)per- In fact, the block graph is a specialization of graphs de- forms very well on this problem class. That is, with grow- fined on rules in [Papadimitriou and Sideri, 1994; Linke and ing problem size we need less choice points than smodels. Schaub, 2000] for default theories. Whereas in the case of de- This can also be seen in Table 2 which shows the correspond- fault logic the aforementioned graphs are abstractions of the ing time measurements. For finding all Hamiltonian cycles of

essential blocking information, here they contain all informa- Ã

a ½¼ we need less time than the actual smodels version. tion necessary for computing answer sets. Although we focus 6

To be fair, for Ind-problems of graphs CirÒ we need twice on the practical usage of the block graph it may also be used 5In a complete graph each node is connected to each other node.

4 6

Ò fÚ ;:::; Ú g

Ò Ò

Truth maintenance systems can be translated into logic pro- A so-called circle graph Cir has nodes ½ and arcs

A = f´Ú ;Ú µ j ½  i  Òg[f´Ú ;Ú µg

i·½ Ò ½ grams [Brewka, 1991]. i .

LOGIC PROGRAMMING AND THEOREM PROVING 645 the choice points (and much more time) smodels needs, be- [Cholewi´nski et al., 1996] P. Cholewi´nski, V. Marek, and cause we have not yet implemented backward-propagation. M. Truszczy´nski. Default reasoning system DeReS. In However, even with the same number of choice points smod- Proc. KR-1996, pp. 518–528. Morgan Kaufmann Publish- els is faster than noMoRe, because noMoRe uses general ers, 1996. backtracking of prolog, whereas smodels backtracking is [Dimopoulos and Torres, 1996] Y. Dimopoulos and A. Tor- highly specialized for computing answer sets. The same ap- res. Graph theoretical structures in logic programs and plies to dlv. default theories. Theoretical Computer Science, 170:209–

244, 1996.

Ã Ò Ham for Ò Ind for Cir

[Dimopoulos et al., 1997] Y. Dimopoulos, B. Nebel, and

= 8 9 ½¼ 4¼ 5¼ 6¼ Ò J. Koehler. Encoding planning problems in non-monotonic smodels 54 1334 38550 8 219 4052 logic programs. Proc. of the 4th European Conf. on Plan- dlv 4 50 493 13 259 4594 ing, pp. 169–181, Toulouse, France, 1997. Springer Verlag. noMoRe 198 2577 34775 38 640 11586 [Doyle, 1979] J. Doyle. A truth maintenance system. Artifi- Table 2: Time measurements in seconds for HAM-andIND- cial Intelligence, 12:231–272, 1979. problems on a SUN Ultra2 with two 300MHz Sparc proces- [ ] sors. Egly et al., 2000 U. Egly, Th. Eiter, H. Tompits, and St. Woltran. Solving advanced reasoning tasks using quan- For future work, we may also think of different improve- tified boolean formulae. Proc. AAAI-2000, pp. 417–422, ments of our algorithm. First of all, we have to integrate 2000. backward propagation (propagating colors against arc direc- [Eiter et al., 1997] T. Eiter, N. Leone, C. Mateis, G. Pfeifer, tion), since this will definitely improve efficiency by further and F. Scarcello. A deductive system for nonmonotonic reducing the number of choice points. Second, we may try to reasoning. In J. Dix, U. Furbach, and A. Nerode, editors,

pre-color not only facts but also some other nodes, e.g. each LPNMR-1997, pages 363–374. Springer Verlag, 1997.

½

´Ò; Òµ ¾ A © node Ò with has to be colored . The block [ ] graph may also be used for other improvements. For example, Gelfond and Lifschitz, 1988 M. Gelfond and V. Lifschitz. it is possible to replace 0-paths without incoming and outgo- The stable model semantics for logic programming. In ing 1-arcs by only one 0-arc. Finally, we have to investigate Proc. of the Int. Conf. on Logic Programming, 1988. [Gelfond and Lifschitz, 1990] M. Gelfond and V. Lifschitz. different heuristics for procedure chooseÈ to select the next choice point. Logic programs with classical negation. In Proc. of the The approach as presented in this paper has been Int. Conf. on Logic Programming, pp. 579–597, 1990. implemented in ECLiPSe-Prolog [Aggoun et al., 2000]. [Gelfond and Lifschitz, 1991] M. Gelfond and V. Lifschitz. The current prototype is available at http://www.cs.uni- Classical negation in logic programs and deductive potsdam.de/˜linke/nomore. databases. New Generation Computing, 9:365–385, 1991. Acknowledgements [Janhunen et al., 2000] T. Janhunen, I. Niemel¨a, P. Simons, I would like to thank T. Schaub, Ph. Besnard, K. Wang, K. and J. You. Unfolding partiality and disjunctions in sta- Konczak, Ch. Anger and S.M. Model for commenting on pre- ble model semantics. In A.G. Cohn, F. Guinchiglia, and vious versions of this paper. B.Selman, editors, Proc. KR-2000, pp. 411–419, 2000. This work was partially supported by the German Science [Linke and Schaub, 2000] T. Linke and T. Schaub. Alterna- Foundation (DFG) within Project “Nichtmonotone Inferen- tive foundations for Reiter’s default logic. Artificial Intel- zsysteme zur Verarbeitung konfligierender Regeln”. ligence, 124:31–86, 2000. [Liu et al., 1998] X. Liu, C. Ramakrishnan, and S.A. Smolka. References Fully local and efficient evaluation of alternating fixed [Aggoun et al., 2000] A. Aggoun, D. Chan, P. Dufresne and points. Proc. of the 4th Int. Conf. on Tools and Algorithms other. Eclipse User Manual Release 5.0. ECLiPSe is avail- for the Construction Analysis of Systems, pp. 5–19, Lis- able at http://www.icparc.ic.ac.uk/eclipse, 2000. bon, Portugal, 1998. Springer Verlag. [Brewka, 1991] G. Brewka. Nonmonotonic Reasoning: Log- [Niemel¨a and Simons, 1997] I. Niemel¨a and P. Simons. ical Foundations of Commonsense. Cambridge University Smodels: An implementation of the stable model and well- Press, Cambridge, 1991. founded semantics for normal logic programs. In J. Dix, U. Furbach, and A. Nerode, editors, Proc. LPNMR-1997, [Brignoli et al., 1999] G. Brignoli, S. Costantini, pp. 420–429. Springer, 1997. O. D’Antona, and A. Provetti. Characterizing and com- [ ] puting stable models of logic programs: the non-stratified Niemel¨a, 1999 I. Niemel¨a. Logic programming with case. In Proc. of Conf. on Information Technology, pp. stable model semantics as a constraint programming 197–201, 1999. paradigm. Annals of Mathematics and Artificial Intelli- gence, 25(3,4):241–273, 1999. [Cholewi´nski et al., 1995] P.Cholewi´nski, V.Marek, A. Mik- [Papadimitriou and Sideri, 1994] C. Papadimitriou and itiuk, and M. Truszczy´nski. Experimenting with nonmono- M. Sideri. Default theories that always have extensions. tonic reasoning. In Proc. of the Int. Conf. on Logic Pro- Artificial Intelligence, 69:347–357, 1994. gramming, pp. 267–281. MIT Press, 1995.

646 LOGIC PROGRAMMING AND THEOREM PROVING LOGIC PROGRAMMING AND THEOREM PROVING

LOGIC PROGRAMMING A Framework for Declarative Update Specifications in Logic Programs

Thomas Eiter and Michael Fink and Giuliana Sabbatini and Hans Tompits Institut f¨ur Informationssysteme, Abteilung Wissensbasierte Systeme 184/3,

Technische Universit¨at Wien, Favoritenstrasse 9-11, A-1040 Vienna, Austria g E-mail:feiter,michael,giuliana,tompits @kr.tuwien.ac.at

Abstract of ÃB , they are not conceived for handling a yet unknown update, which will arrive as the environment evolves. In fact, Recently, several approaches for updating knowl- these approaches lack the possibility to specify how an agent edge bases represented as logic programs have been should react upon the arrival of such an update. For ex- proposed. In this paper, we present a generic ample, we would like to express that, on arrival of the fact

framework for declarative specifications of update

be×Ø bÙÝ ´×hÓÔ µ ÃB

½ , this should be added to , while best- policies, which is built upon such approaches. It

buy information about other shops is removed from ÃB . extends the LUPS language for update specifica- In this paper, we address this issue and present a declar- tions and incorporates the notion of events into the ative framework for specifying update behavior of an agent. framework. An update policy allows an agent to The agent receives new information in terms of a set of rules flexibly react upon new information, arriving as an

(which is called an event), and adjusts its ÃB in accord to a event, and perform suitable changes of its knowl- given update policy, consisting of statements in a declarative edge base. The framework compiles update poli- language. Our main contributions are summarized as follows: cies to logic programs by means of generic transla- tions, and can be instantiated in terms of different (1) We present a generic framework for specifying update concrete update approaches. It thus provides a flex- behavior, which can be instantiated with different update ap- ible tool for designing adaptive reasoning agents. proaches to logic programs. This is facilitated by a layered approach: At the top level, the update policy is evaluated,

given an event and the agent’s current belief set, to single out ÃB

1 Introduction the update commands Í which need to be performed on . È Updating knowledge bases is an important issue for the real- At the next layer, Í is compiled to a set of rules to be in- ization of intelligent agents, since, in general, an agent is situ- corporated to ÃB ; at the bottom level, the updated knowledge ated in a changing environment and must adjust its knowledge base is represented as a sequence of logic programs, serving base when new information is available. While for classical as input for the underlying update semantics for logic pro- knowledge bases this issue has been well-studied, approaches grams, which determines the new current belief set. to update nonmonotonic knowledge bases, like, e.g., updates (2) We define a declarative language for update poli- of logic programs [Alferes et al., 2000; Eiter et al., 2000; cies, generalizing LUPS by various features. Most impor-

Zhang and Foo, 1998; Inoue and Sakama, 1999] or of default tantly, access to incoming events is facilitated. For example,

´be×Ø bÙÝ ´×hÓÔ µµ [[E : be×Ø bÙÝ ´×hÓÔ µ]] ¾

theories [Williams and Antoniou, 1998], are more recent. retract ½ states that

be×Ø bÙÝ ´×hÓÔ µ be×Ø bÙÝ ´×hÓÔ µ ½ The problem of updating logic programs, on which we fo- if ¾ is told, then is removed

cus here, deals with the incorporation of an update È ,given from the knowledge base. Statements like this may involve

by a rule or a set of rules, into the current knowledge base further conditions on the current belief set, and other com-

È ;:::;È

ÃB mands to be executed (which is not possible in LUPS). The Ò

. Accordingly, sequences ½ of updates lead to

ÃB ;È ;:::; È µ

´ language thus enables the flexible handling of events, such as Ò sequences ½ of logic programs, which are given a declarative semantics. To broaden this approach, simply recording changes in the environment, skipping unin- Alferes et al. [1999a] have proposed the LUPS update lan- teresting updates, or applying default actions. guage, in which updates consist of sets of update commands. (3) We analyze some properties of the framework, using

Such commands permit to specify changes to ÃB in terms the update answer set semantics of Eiter et al. [2000] as a

of adding or removing rules from it. For instance, a typ- representative of similar approaches. In particular, useful

a b ÛheÒ c ÃB

ical command is a××eÖØ , stating that rule properties concerning maintenance are explored, and the

b ÃB c

a should be added to if is currently true in it. Sim- complexity of the framework is determined. Moreover, we

b b ilarly, ÖeØÖacØ expresses that must be eliminated from describe a possible realization of the framework in the agent

ÃB , without any further condition. system IMPACT [Subrahmanian et al., 2000], providing evi- However, a certain limitation of LUPS and the above men- dence that our approach is a viable tool for developing adap- tioned formalisms is that while they handle ad hoc changes tive reasoning agents.

LOGIC PROGRAMMING AND THEOREM PROVING 649

2 Preliminaries

BeдÃË µ i ½

½ We assume the reader familiar with extended logic programs i Belief set at step

(ELPs) [Gelfond and Lifschitz, 1991]. For a rule Ö , we write

ÃB E ::: E E ÃË

i ½ i i

½ Knowledge state

´Ö µ B ´Ö µ Ö

À and to denote the head and body of , respectively.

ÒÓØ :

Furthermore, stands for default negation and for strong Û



Í

iØ A Ä update policy

negation. A is the set of all literals over a set of atoms ,

Ä ÄiØ A

and A is the set of all rules constructible from .

´È ;:::;È µ Ò

An update program, P, is a sequence ½ of

ÃB Í ::: Í Í

i ½ i

½ Executable commands

 ½

ELPs, where Ò . We adopt an abstract view of the seman-

Û

´¡µ

tics of ELPs and update programs, given as a mapping BeÐ , 

ØÖ compilation

Beд µ Ä

which associates with every sequence P aset P A

´ µ

of rules; intuitively, BeÐ P are the consequences of P.Dif-

Beд¡µ

È È ::: È È

½ i ½ i ferent instantiations of are possible, according to var- ¼ Update sequence

ious proposals for update semantics. We only assume that

Û

Beд¡µ

satisfies some elementary properties which any “rea- 

BeÐ update semantics

sonable” semantics satisfies. In particular, we assume that

È  Beд µ

Ò P holds, and that the following property is satis-

BeдÃË µ=Beд´È ;:: :;È µµ i

¼ i

i Belief set at step

¾ Beд µ A ¾ B ´Ö µ Ö ¾ Beд µ

fied: given A P and ,then P

´Ö µ B ´Ö µ ÒfAg¾Beд µ iff À P . We use here the semantics of Eiter et al. [2000],which coincides with the semantics of inheritance programs due to Figure 1: From knowledge state to belief set at step i.

Buccafurri et al. [1999]. The semantics of ELPs È and update

sequences P with variables is defined as usual through their around”) for specifying update policies.

´È µ G ´ µ

ground versions G and P over the Herbrand universe, È

respectively. In what follows, let A, , P, etc. be ground. 3.1 Basic Framework

Ë  ÄiØ

An interpretation is a set A which contains no We start with the formal notions of an event and of the knowl- complementary pair of literals. Ë is a (consistent) answer edge state of an agent.

set of an ELP È iff it is a minimal model of the reduct

Ë

Ä

A

È È ¾

, which results from by deleting all rules whose body Definition 1 An event class is a collection EC  of finite

ÒÓØ Ä Ä ¾ Ë ¾EC contains some default literal with , and by sets of rules. The members E are called events. removing all default literals in the bodies of the remaining

Informally, EC describes the possible events (i.e., sets of

´È µ rules [Gelfond and Lifschitz, 1991].ByAË we denote communicated rules) an agent may witness. For example, the

the collection of all answer sets of È .Therejection set,

¼

A  A

collection F of all sets of facts from a subset of

ej ´Ë; µ Ë

Ê P ,ofP with respect to the interpretation is given Ë

Ò atoms may be an event class. In what follows, we assume

ej ´Ë; µ = Êej ´Ë; µ; Êej ´Ë; µ = ;

by Ê P P where P ,

i Ò

i=½

that an event class EC has been fixed.

Ò > i  ½ Êej ´Ë; µ Ö ¾ È

and, for , P contains every rule i

i

¼ ¼

´Ö µ=:À ´Ö µ B ´Ö µ [ B ´Ö µ  Ë

such that À and ,forsome

ÃË = hÃB ; E ;:::;E i Ò

Definition 2 A knowledge state ½

¼

Ö ¾ È Ò Êej ´Ë; µ j>i Ë

j P with . Then, is an answer set of

j

ÃB ´

Ë consists of an ELP the initial knowledge base) and a

=´È ;:::;È µ Ë È Ò Êej ´Ë; µ

Ò i

P ½ iff is an answer set of P .

E ;:::;E E ¾EC i ¾f½;:::;Òg

i

Ò i

sequence ½ of events , .For

´ µ

We denote the collection of all answer sets of P by AË P .

i  ¼ ÃË = hÃB ; E ;:::;E i ÃË

½ i

, i is the projection of to

= ½ Êej ´Ë; µ = ; Since Ò implies P , the semantics extends the first i events. the answer set semantics. [Eiter et al., 2000] describes a char- acterization of the update semantics in terms of single ELPs. Intuitively, ÃË describes the evolution of the agent’s

knowledge, starting from its initial knowledge base. When

È = fb ÒÓØ a; a g È = f:a ; ½

Example 1 Let ¼ ,

E ÃË

i ½

a new event i occurs, the current knowledge state

c g È = f:c g È ¼

, and ¾ . Then, has the single answer

ÃË = hÃB ; E ;:::;E ;E i

½ i ½ i

changes to i , which requests

Ë = fag Êej ´Ë ;È µ = ; ´È ;È µ

¼ ¼ ¼ ½

set ¼ with ; has answer E

the agent to incorporate the event i into its knowledge base

Ë = f:a; c; bg Êej ´Ë ; ´È ;È µµ = fa g

½ ¼ ½ set ½ with ; and

and adapt its belief set.

´È ;È ;È µ Ë = f:a; :c; bg

½ ¾ ¾

¼ possesses as unique answer

BeдÃË µ

½

The procedure for adapting the belief set i

Êej ´Ë ; ´È ;È ;È µµ = fc ;a g

¼ ½ ¾

set with ¾ . E

on arrival of i is illustrated in Figure 1. Informally, at

BeÐ ´ µ Ö ¾ Ä A The belief set A P is the set of all rules

step i of the knowledge evolution, we are given the be-

Ë ¾ AË ´ µ

such that Ö is true in each P . We shall drop

BeдÃË µ ÃË =

½ i ½ lief set i and the knowledge state

the subscript “ A ” if no ambiguity can arise. With a slight

hÃB ; E ;:::;E i E

i ½ i

½ , together with the new event ,and

Ä Ä ¾ BeÐ ´ µ

abuse of notation, for a literal , we write A P if

BeдÃË µ

we want to compute i in terms of the update pol-

Ä ¾ BeÐ ´ µ

A P .

Í Í

icy . First, a set i of executable commands is deter-

mined from Í . Afterwards, given the previously computed

3 Update Policies

Í ;:::;Í ´ÃB ; Í ;:::;Í µ

i ½ ½ i

sets ½ , the sequence is com- =

We first describe our generic framework for event-based piled by the transformation ØÖ into the update sequence P

´È ;È ;:::;È µ EÈÁ BeдÃË µ Beд µ

¼ ½ i updating, and afterwards the language (“the language . Then, i is given by P .

650 LOGIC PROGRAMMING AND THEOREM PROVING

that the rule is to be ignored. Similarly, the command retract

×ØaØi hcÓÑÑi [if hcÓÒd½i][[[hcÓÒd¾i]]] h ::= ;

forces a rule to be deactivated. The option event states that

c ÒaÑei a××eÖØ[ eÚeÒØ] j ÖeØÖacØ[ eÚeÒØ] j h ::=

an assertion or retraction has only temporary value and is not

eÚeÒØ] j caÒceÐ j igÒÓÖe aÝ×[ aÐÛ ;

supposed to persist by inertia in subsequent steps. The precise

Ö idi hÖÙÐeij hÖ ÚaÖi h ::= ;

meaning of the different update commands will be made clear

ÐiØ idi hÐiØeÖaÐij hÐiØ ÚaÖi h ::= ;

in the next section.

cÓÑÑi hc ÒaÑei´hÖ idiµ h ::= ;

Example 2 Consider a simple agent selecting Web shops in

cÓÒd½i [ÒÓØ] hcÓÑÑij [ÒÓØ] hcÓÑÑi; hcÓÒd½i h ::= ;

search for some specific merchandise. Suppose its knowledge

cÓÒd¾i hkb cÓÒd×ijE : heÚ cÓÒd×ij h ::=

base, ÃB , contains the rules

kb cÓÒd×i E : heÚ cÓÒd×i

h , ;

Ö : ÕÙeÖÝ ´Ë µ ×aÐe ´Ë µ;ÙÔ´Ë µ; ÒÓØ :ÕÙeÖÝ ´Ë µ;

½

kb cÓÒd×i hkb cÓÒdijhkb cÓÒdi hkb cÓÒd×i

h ::= , ;

Ö : ØÖÝ ÕÙeÖÝ ÕÙeÖÝ ´Ë µ;

¾

kb cÓÒdi hÖ idij hÐiØ idi

h ::= ;

Ö : ÒÓØifÝ ÒÓØ ØÖÝ ÕÙeÖÝ ;

¿

eÚ cÓÒd×i heÚ cÓÒdijheÚ cÓÒdi heÚ cÓÒd×i

h ::= , ;

Ö : daØe ´¼µ Ö

¼ ½

eÚ cÓÒdi hÐiØ idij hÖ idi h ::= ; and a fact as an initial time stamp. Here,

expresses that a shop Ë , which has a sale and whose Web site

Ö Ö ¿ is up, is queried by default, and ¾ , serve to detect that no

Table 1: Syntax of an update statement in EÈÁ.

site is queried, which causes ‘ÒÓØifÝ ’ to be true. Assume that

an event, E , might be any consistent set of facts or ground

×aÐe ´×µ daØe ´Øµ ×

3.2 Language EÈÁ :Syntax rules of the form , stating that shop has E a sale on date Ø, such that contains at most one time stamp

The language EÈÁ generalizes the update specification lan-

´¡µ guage LUPS [Alferes et al., 1999a], by allowing update state- daØe .

An update policy Í may be defined as follows. Assume it ments to depend on other update statements in the same EÈÁ

program, and more complex conditions on both the current contains the incorporate-by-default statement ´½µ,aswellas:

aÝ×´×aÐe ´Ë µ daØe ´Ì µµ if a××eÖØ´×aÐe ´Ë µ daØe ´Ì µµ

belief set and the actual event (note that LUPS has no provi- aÐÛ ;

¼ ¼

´×aÐe ´Ë µ daØe ´Ì µµ[[daØe ´Ì µ;Ì 6= Ì ; E : daØe ´Ì µ]]

sion to support external events). These features make it suit- caÒceÐ ;

¼ ¼

´×aÐe ´Ë µ daØe ´Ì µµ[[daØe ´Ì µ;Ì 6= Ì ; E : daØe ´Ì µ]] able for implementing rational reactive agents, capable, e.g., ÖeØÖacØ . of filtering incoming information. Informally, the first statement repeatedly confirms the infor-

The syntax of EÈÁ is given in Table 1. In what follows, we

mation about a future sale, which guarantees that it is ef-  use cÑd to denote update commands and to refer to rules fective on the given date, while the second statement revokes or rule variables. In general, an EÈÁ statement may have the this. The third one removes information about a previously

form µ

ended sale ´assuming the time stamps increase .Further-

cÑd ´ µ if [ÒÓØ] cÑd ´ µ;:::; [ÒÓØ] cÑd ´ µ[[c ; E :c ]]

½ ½ ¾ ¾ Ñ Ñ ½ ¾

more, Í includes also the following statements:

¼ ¼

´daØe ´Ì µµ[[daØe ´Ì µ;Ì 6= Ì ; E : daØe ´Ì µ]]

which states conditional assertion or retraction of a rule ÖeØÖacØ ;

 cÑd ´ µ

½ ½ ½

´×aÐe ´× µµ[[E : ×aÐe ´× µ]]

, expressed by , depending on other commands igÒÓÖe ½

½ ;

[ÒÓØ]cÑd ´ µ; [ÒÓØ]cÑd ´ µ

¾ Ñ Ñ

¾ ..., , and conditioned with

igÒÓÖe ´×aÐe ´× µ daØe ´Ì µµ[[E : ×aÐe ´× µ daØe ´Ì µ]] ½

½ . c

the proviso whether ½ belongs to the current belief set and

´Øµ ÃB

The first statement keeps the time stamp daØe in

c EÈÁ whether ¾ is in the actual event. The basic commands unique, and removes the old value. The other statements sim-

are the same as those in LUPS (for their meaning, cf. [Alferes ×

ply state that sales information about shop ½ is ignored. et al., 1999a]), plus the additional command ignore,which allows to skip unintended updates from the environment, 3.3 Language EÈÁ : Semantics which otherwise would be incorporated into the knowledge

According to the overall structure of the semantics of EÈÁ,

[[¡]] c E :c ¾ base. Each condition in , both of the form ½ and , can

as depicted in Figure 1, at step i, we first determine the

be substituted by a list of such conditions. Note that in LUPS Í

executable command i given the current knowledge state

no conditions on rules and external events can be explicitly

ÃË = hÃB ; E ;:::;E i

½ ½ i ½ i and its associated belief set

expressed, nor dependencies between update commands. We

BeдÃË µ = Beд µ = ´È ;:::;È µ

½ i ½ i ½ ¼ i ½ i P ,whereP . also extend the language by permitting variables for rules and

To this end, we evaluate the update policy Í over the new

literals in the update commands, ranging over the universe of

E Beд µ

i ½ event i and the belief set P .

the current belief set and of the current event (syntactic safety

´Í µ Í A Let G be the grounded version of over the language conditions can be easily checked). By convention, variable underlying the given update sequence and the received events.

names start with capital letters. i

´Í µ i

Then, the set G of reduced update statements at step is EÈÁ Definition 3 An update policy Í is a finite set of state- given by

ments. i

G ´Í µ = f cÑd´µ if C j cÑd´µ if C [[C ]] ¾G´Í µ;

½ ¾

½ where

C = c ;::: ;c ; E : Ö ;:::; Ö ;

¾ ½ ½ Ñ Ð and such that

For instance, the EÈÁ statement

µ c ;: ::;c ¾ Beд Ö ;:::;Ö ¾ E g:

i ½ ½ ½ Ñ i

Ð P and

´Ê µ if ÒÓØ igÒÓÖe ´Ê µ[[E : Ê ]]

a××eÖØ (1)

i

´Í µ

The update statements in G are thus of the form

´ µ if [ÒÓØ] cÑd ´ µ;:::;[ÒÓØ] cÑd ´ µ

means that all rules in the event have to be incorporated into cÑd

½ ¾ ¾ Ñ Ñ ½ .Se- the new knowledge base, except if it is explicitly specified mantically, we interpret them as ordinary logic program rules

LOGIC PROGRAMMING AND THEOREM PROVING 651

cÑd ´ µ [ÒÓØ ]cÑd ´ µ;:::;[ÒÓØ ]cÑd ´ µ _´a××eÖØ eÚeÒØ´Ö µ ¾ EC

i ½

½ ¾ ¾ Ñ Ñ

½ .The

Í i

eÚeÒØ]´Ö µ ¾= ÈC ^ aÐÛaÝ×[

i

G ´Í µ

program ¥ is the collection of all these rules, given ,

i

^ a××eÖØ[ eÚeÒØ]´Ö µ ¾= EC µg together with the following constraints, which exclude con- i . tradictory commands: On the basis of this compilation, we can define the belief

set for a knowledge state ÃË :

a××eÖØ[ eÚeÒØ]´Ê µ; ÖeØÖacØ[ eÚeÒØ]´Ê µ;

aÐÛaÝ×[ eÚeÒØ]´Ê µ; caÒceÐ´Ê µ: Í

Definition 6 Let ÃË and be as in Definition 5, and let

Í ;:::;Í Ò ½ be the corresponding executable commands ob-

tained from Definition 4. Then, the belief set of ÃË is given

ÃË = hÃB ; E ;:::;E i Ò

Definition 4 Let ½ be a knowledge

BeдÃË µ=BeдØÖ ´ÃB ; Í ;:::;Í µµ: Ò

by ½

Í Í

state and an update policy. Then, i is a set of executable

´i  Òµ Í i Example 3 Reconsider Example 2 and suppose the event

update commands at step iff i is an answer set of

Í Í

E = f×aÐ e´× µ;daØe´½µg ÃË = hÃB i

G ´¥ µ ¥ ¼

the grounding of . ½ occurs at . Then,

i i

½

G ´Í µ = f a××eÖØ´×aÐe ´× µµ if ÒÓØ igÒÓÖe´×aÐe ´× µµ ¼

Since update statements do not contain strong negation, ¼ ,

´daØe ´½µµ if ÒÓØ igÒÓÖe´daØe ´½µµ

executable update commands are actually stable models of a××eÖØ ,

Í

ÖeØÖacØ ´daØe ´¼µµg:

´¥ µ

G [Gelfond and Lifschitz, 1988]. Furthermore, since i programs may in general have more than one answer set, or Í The corresponding program ¥ has the single answer set

no answer set at all, and the agent must commit itself to a sin- ½

fa××eÖØ´×aÐe ´× µµ; a××eÖØ´daØe ´½µµ; ÖeØÖacØ ´daØe ´¼µµg

¼ ,

gle set of update commands, we assume a suitable selection

ØÖ ´¡µ ÈC = ÈC Ò ¼

which is compiled, via function ,to ½

ËeÐ ´¡µ Í

function, , returning a particular i if an answer set ex-

fa××eÖØ [eÚeÒØ]´daØe ´¼µµg = ; È = f×aÐe ´× µ ¼

and ½

¼ ¼ ¼ ¼

Í = fa××eÖØ´? µg i

ists, or, otherwise, returning i ,where

ÓÒ´Ö µ; ÓÒ´Ö µ ; daØe ´½µ ÓÒ´Ö µ; ÓÒ´Ö µ ;

½ ½ ¾ ¾ ?

i is a reserved atom. These atoms are used for signaling

:ÓÒ´Ö µ g BeдhÃB ; E iµ ½ ¼ . As easily seen, the belief set

that the update policy encountered inconsistency. They can

Beд´È ;È µµ ×aÐe ´× µ ÕÙeÖÝ ´× µ

½ ¼ ¼

= ¼ contains and .

´¡µ easily be filtered out from BeÐ , if needed, restricting the

outcomes of the update to the original language. 4 Properties

Í ;:::;Í i

Next we compile the executable commands ½ into

´ÃË µ

In this section, we discuss some properties of BeÐ for

´È ;:::;È µ i

an update sequence ¼ , serving as input for the be-

´¡µ

particular update policies, using the definition of BeÐ based

´¡µ lief function BeÐ . This is realized by means of a trans-

on the update answer sets approach of Eiter et al. [2000],as

´¡µ formation ØÖ , which is a generic and adapted version of a

similar mapping introduced by Alferes et al. [1999a].Inwhat explained in Section 2. We stress that the properties given

´¡µ follows, we assume a suitable naming function for rules in the below are also satisfied by similar instantiations of BeÐ , like, e.g., dynamic logic programming [Alferes et al., 2000]. update sequence, enforcing that each rule Ö is associated with

First, we note some basic properties: Ò

a unique name Ö .

Í = ; ÃB

¯ If (called empty policy), then will never be

ÃË = hÃB ; E ; :::; E i Ò

Definition 5 Let ½ be a knowl-

E ;:::;E Ò

updated; the belief set is independent of ½ ,and

i  ¼

edge state and Í an update policy. Then, for ,

BeдÃË µ = BeдÃB µ

thus static. Hence, i , for each

ØÖ ´ÃB ; Í ;:::;Í µ = ´È ;È ;:::;È µ

i ¼ ½ i

½ is inductively de-

=½;:::;Ò

i .

Í ;:::;Í i

fined as follows, where ½ are the executable com-

Í = fa××eÖØ´Ê µ[[E :Ê ]]g mands according to Definition 4: ¯ If (called unconditional as-

sert policy), then all rules contained in the received

i =¼ : È = fÀ ´Ö µ B ´Ö µ;ÓÒ´Ò µ j Ö ¾ ÃB g[ Ö Set ¼

events are directly incorporated into the update se-

ÓÒ´¡µ fÓÒ´Ò µ j Ö ¾ ÃB g

Ö ,where are new atoms.

BeдÃË µ = Beд´ÃB ;E ;:::;E µµ

½ i

quence. Thus, i , ÈC

Furthermore, initialize the sets ¼ of persistent com-

=½;:::;Ò

for each i .

EC ;

mands and ¼ of effective commands to .

¯ Í

If i is empty, then the knowledge is not updated, i.e.,

i  ½: EC ÈC È

i i

i , and are as follows:

È = ; BeдÃË µ=BeдÃË µ

i i ½

i . We thus have .

EC fcÑd´Ö µ j cÑd´Ö µ ¾ Í ^ igÒÓÖe ´Ö µ ¾= Í g

i i

i = ;

¯ Í = fa××eÖØ´? µ g BeдÃË µ=

i i

Similarly, if i ,then

ÈC ÈC [faÐÛaÝ×´Ö µ j aÐÛaÝ×´Ö µ ¾ EC g

i i ½ i

´ÃË µ = BeÐ

½

i .

eÚeÒØ´Ö µ j aÐÛaÝ× eÚeÒØ´Ö µ ¾ EC [faÐÛaÝ×

i

aÐÛaÝ×´Ö µ ¾= EC [ ÈC g

^ Physical removal of rules

i i ½

Ò ´faÐÛaÝ× eÚeÒØ´Ö µ j aÐÛaÝ×´Ö µ ¾ EC g

i An important issue is the growth of the agent’s knowledge

[faÐÛaÝ×[ eÚeÒØ]´Ö µ j caÒceÐ ´Ö µ ¾ EC gµ;

i base, as the modular construction of the update sequence

ØÖ ´¡µ

fÓÒ´Ò µ ;À´Ö µ B ´Ö µ;ÓÒ´Ò µ j

È through transformation causes some rules and facts to

Ö Ö

i =

a××eÖØ[ eÚeÒØ]´Ö µ ¾ EC

i be repeatedly inserted. This is addressed next, where we dis-

eÚeÒØ]´Ö µ ¾ ÈC g _ aÐÛaÝ×[

i cuss the physical removal of rules from the knowledge base.

[fÓÒ´Ò µ j ÖeØÖacØ eÚeÒØ´Ö µ ¾ EC

Ö i ½

=´È ;:::;È µ Ò

Lemma 1 Let P ¼ be an update sequence. For

^ ÖeØÖacØ[ eÚeÒØ]´Ö µ ¾= EC g

i

¼

Ö ¾ È ;Ö ¾ È i < j j

every i with , the following holds: if

[f:ÓÒ´Ò µ j ´ÖeØÖacØ[ eÚeÒØ]´Ö µ ¾ EC

Ö i

¼ ¼ ¼

µ Ö = Ö ´ µ Ö = Ä Ö = :Ä ´ µ Ö =

´i ,or ii and ,or iii

eÚeÒØ]´Ö µ ¾= ÈC µ ^ aÐÛaÝ×[

i

¼¼ ¼¼

Ä Ö ¾ È À ´Ö µ = :Ä

k

_ ´aÐÛaÝ× eÒØ´Ö µ ¾ ÈC eÚ such that no rule with exists,

i ½

k ¾fj ·½;:::;Òg :Ä ¾ B ´Ö µ BeÐ ´ µ=

^ caÒceÐ´Ö µ ¾ EC A

i where , and ,then P

^ a××eÖØ[ BeÐ ´È ;:::;È ;È ÒfÖ g;È ;:::;È µ: eÚeÒØ]´Ö µ ¾= EC µ

i

A ¼ i ½ i i·½ Ò

652 LOGIC PROGRAMMING AND THEOREM PROVING

´ÃË µ ´

The following property holds: BeÐ is as follows entries denote completeness results;

´¡µ

the case of unknown ËeÐ is given at the right of “/”):

BeдÃË µ =

Theorem 1 Let ÃË be a knowledge state and

£

Beд µ =´È ;:::;È µ Ò

P ,whereP ¼ . Furthermore, let P result ÒÍ ÃB fact. assert & strat. stratified general

from P after repeatedly removing rules as in Lemma 1, and

ÆÈ ÆÈ È

È È È ¥

¾

stratified /

=´È ;:::;È µ

ÆÈ ÆÈ È

let P ,where ÆÈ

Ò

¼

È È È ¥

general / ¾

£ £

È = fÀ ´Ö µ B ´Ö µ ÒfÓÒ´Ò µgjÖ ¾ ;ÓÒ´Ò µ ¾ gÒ

Ö Ö P i P

i Similar results hold, e.g., for dynamic logic programming.

g: fÓÒ´Ò µ j ÓÒ´Ò µ ¾ × × P

The results can be intuitively explained as follows. Each

Í È ½ 

i i

´ÃË µ=BeÐ ´ µ

BeÐ and as in Figure 1 can be computed iteratively ( A

Then, A P .

¼

 Ò i Ö ¾ i ), where at step polynomially many problems

Thus, we can purge the knowledge base and remove dupli- Í

Beд´È ;:::;È µµ ¥

i ½

¼ must be solved to construct .From i

cates of rules, as well as all deactivated (retracted) rules. Í

Í = Ëeд¥ µ È i

i and previous results, is easily computed

i

È jÍ j

History Contraction in polynomial time. Since i contains less than rules,

Another relevant issue is the possibility, for some special step i is feasible in polynomial time with an NP oracle.

= ´È ;:::;È µ Ò

case, to contract the agent’s update history, and compute Thus, P ¼ is polynomially computable with

Ö ¾ Beд µ

its belief set at step i merely based on information at step an NP oracle, and P is decided with another oracle

È È

i ½ Í ½

. Let us call a factual assert policy if all assert[ event] call. Updating a stratified ¼ such that only sets of facts ,

È ;::: ¾

and always[ event] statements in Í involve only facts. In may be added preserves polynomial decidability of

¼

Ö ¾ Beд´È ;:::;È µµ

ØÖ ´¡µ ÃË = i ½

this case, the compilation for a knowledge state ¼ ; this explains the polynomial de-

ÆÈ

È

hÃB ; E ;:::;E i = ÃB

È cidability result. In all other cases, -hard problems such

½ Ò

can be simplified thus: (1) ¼ ,

Ä È as computing the lexicographically maximal model of a CNF

and (2) the construction of each i involves facts and

Ä ÓÒ´Ò µ :ÓÒ´Ò µ

: formula are easily reduced to the problem. Ö

instead of Ö and , respectively.

Í

ËeÐ ´¡µ Ëeд¥ If is unknown, each possible result of µ can For such sequences, the following holds: i

be nondeterministically guessed and verified in polynomial

=´È ;:::;È µ

¼ Ò

Lemma 2 Let P be an update sequence such ÆÈ È

time. This leads to coNP =¥ complexity.

¾

È ½  i  Ò BeÐ ´ µ= A

that i contains only facts, for . Then, P

È = È È = È [ ´È Ò BeÐ ´È ;È µ

Ù ½ Ù i·½ Ù ¼ Ù

A ,where , and

½ i·½ i Ò

Ä j:Ä ¾ È gµ

f 5 Implementational Issues ·½ i . We can thus assert the following proposition for history An elegant and straightforward realization of update policies contraction: is possible through IMPACT agent programs. IMPACT [Sub-

rahmanian et al., 2000] is a platform for developing soft- =

Theorem 2 Let Í be a factual assert policy and P ware agents, which allows to build agents on top of legacy

´È ;:::;È µ ÃË Ò

½ be the compiled sequence obtained from by code, i.e., existing software packages, that operates on arbi-

BeÐ ´ÃË µ=

the simplified method described above. Then, A trary data structures. Thus, in accordance with our approach,

BeÐ ´´ÃB ;È µµ È

Ù Ù

A ,where is as in Lemma 2. Ò Ò we can design a generic implementation of our framework,

Simple examples show that Theorem 2 does not hold in without committing ourselves to a particular update seman-

´¡µ general. The investigation of classes of policies for which tics BeÐ .

similar results hold are a subject for further research. Since every update policy Í is semantically reduced to a logic program, the corresponding executable commands Computational Complexity can be computed using well-known logic programming en-

Finally, we briefly address the complexity of reasoning about

DÄÎ DeÊe×

gines like ×ÑÓdeÐ×, ,or . Hence, we may assume Í a knowledge state ÃË . An update policy is called stratified

that a software package, ËÈ , for updating and querying a

EÈÁ cÑd´µ if C [[C ]] ¾Í ¾

iff, for all statements ½ , the asso- ÃB

knowledge base ÃB is available, and that can be ac-

¼

cÑd ´µ C

ciated rules ½ form a stratified logic program,

½ ´µ

cessed through a function beÐ returning the current belief

¼

C C EÈÁ

where results from ½ by replacing the declaration

½

´ÃË µ ËÈ

set BeÐ . Moreover, we assume that has a function ÒÓØ

ÒÓØ by default negation . ´µ

eÚeÒØ , which lists all rules of a current event. Then, an

Í ¥

For stratified Í ,any has at most one answer set. Thus, i

update policy Í can be represented in IMPACT as follows.

´¡µ

the selection function ËeÐ is redundant. Otherwise, the (1) Conditions on the belief set and the event can be mod-

´¡µ

complexity cost of ËeÐ must be taken into account. If

´Ø; beдµµ

eled by IMPACT code call atoms,i.e.atomsiÒ ,

´¡µ

ËeÐ is unknown, we consider all possible return values

iÒ´Ø; beдµµ iÒ´Ø; eÚeÒØ´µµ Ø

ÒÓØ ,and ,where is a constant Í

(i.e., all answer sets of ¥ ) and thus, in a cautious rea-

i

Ê iÒ´Ö; f´µµ Ö

Ö or a variable .InIMPACT, is true if constant

BeдÃË µ = Beд´È ;:::;È µµ Ò

soning mode, all possible ¼

´µ Ê Ö is in the result returned by f ;avariable is bound to all

from Figure 1. Clearly, for update answer sets, deciding

´Ö; f´µµ ÒÓØ iÒ

such that iÒ is true; “ ” is negation.

¼

Ö ¾ Beд´É ;:::;É µµ É

Ñ ¼

¼ is in coNP; it is polynomial, if (2) Update commands can be easily represented as IM-

É ½  i  Ñ

is stratified and each i , , contains only facts. PACT actions. An action is implemented by a body of code

´¡µ

Theorem 3 Let BeÐ be the update answer set seman- in any programming language (e.g., C); its effects are speci-

´¡µ

tics, and ËeÐ polynomial-time computable with an NP fied in terms of add and delete lists (sets of code call atoms).

ÃË = a××eÖØ´Ê µ ÖeØÖacØ ´Ê µ Ê

oracle. Then, given a ground rule Ö and ground Thus, actions like , ,etc.,where is

hÃB ; E ;:::;E i Ö ¾ Ò ½ , the complexity of deciding whether a parameter, are introduced.

LOGIC PROGRAMMING AND THEOREM PROVING 653

(3) EÈÁ statements are represented as IMPACT action rules In concluding, our generic framework, which extends other ¼

¼ approaches to logic program updates, represents a convenient

DÓ´cÑd ´ µµ [:]DÓ ´cÑd ´ µµ;: : : ; [:]DÓ ´cÑd ´ µµ;

½ ¾ Ñ

½ platform for declarative update specifications and could also

caÐ Ð aØÓÑ× ´cÓÒd µ; Óde

c be fruitfully used in several applications. Exploring these is-

Óde caÐ Ð aØÓÑ× ´cÓÒdµ where c is the list of the code call sues is part of our ongoing research. atoms for the conditions on the belief set and the event in Acknowledgements cÓÒd as described above. The semantics of IMPACT agent programs is defined This work was partially supported by the Austrian Science Fund (FWF) under grants P13871-INF and N Z29-INF. through status sets.Areasonable status set Ë is equivalent

to a stable model of a logic program, and prescribes the agent

DÓ´«µ Ë Ë

to perform all actions « where is in . Thus, repre- References Í sents the executable commands i of Figure 1 in accord with [Alferes et al., 1999a] J. Alferes, L. Pereira, H. Przymusin-

Í , and the respective action execution affects the computation , and T. Przymusinski. LUPS - A language for updat-

ØÖ ´¡µ È [ ] of i via . For more details, cf. Eiter et al., 2001 . ing logic programs. In Proc. LPNMR’99, LNAI 1730, pp. 162-176. Springer, 1999. 6 Related Work and Conclusion [Alferes et al., 2000] J. Alferes, J. Leite, L. Pereira, H. Przy- Our approach is similar in spirit to the work in active musinska, and T. Przymusinski. Dynamic updates of non- databases (ADBs), where the dynamics of a database is speci- monotonic knowledge bases. J. Logic Programming, 45(1- fied through event-condition-action (ECA) rules triggered by 3):43-70, 2000. events. However, ADBs have in general no declarative se- [Baral and Lobo, 1996] C. Baral and J. Lobo. Formal char- mantics, and only one rule at a time fires, possibly causing acterization of active databases. In Proc. LID’96,LNCS successive events. In [Baral and Lobo, 1996], a declarative 1154, pp. 175-195. Springer, 1996. characterization of ADBs is given, in terms of a reduction to [Buccafurri et al., 1999] F. Buccafurri, W. Faber, and logic programs, by using situation calculus notation. N. Leone. Disjunctive logic programs with inheritance. Our language for update policies is also related to action In Proc. ICLP’99, pp. 79-93. MIT Press, 1999. languages, which can be compiled to logic programs as well [Eiter et al., 2000] Th. Eiter, M. Fink, G. Sabbatini, and (cf., e.g., [Lifschitz and Turner, 1999]). A change to the H. Tompits. Considerations on updates of logic programs. knowledge base may be considered as an action, where the In Proc. JELIA’00, LNAI 1919, pp. 2-20. Springer, 2000. execution of actions may depend on other actions and condi- [Eiter et al., 2001] Th. Eiter, M. Fink, G. Sabbatini, and tions. However, action languages are tailored for planning H. Tompits. Declarative knowledge updates through and reasoning about actions, rather than reactive behavior agents. In Proc. AISB’01 Symp. on Adaptive Agents and specification; events would have to be emulated. Further- Multi-Agent Systems, York, UK, pp. 79-84. AISB, 2001. more, a state is, essentially, a set of literals rather than a be- [Gelfond and Lifschitz, 1988] M. Gelfond and V. Lifschitz. lief set as we define it. Investigating the relationships of our The stable model semantics for logic programming. In framework to these languages in detail—in particular con- Proc. ICSLP’88, pp. 1070-1080. MIT Press, 1988. cerning embeddings—is an interesting issue for further re- [Gelfond and Lifschitz, 1991] M. Gelfond and V. Lifschitz. search. Classical negation in logic programs and disjunctive data- A development in the area of action languages, with pur- bases. New Generation Computing, 9:365-386, 1991. poses similar to those of EÈÁ, is the policy description lan- [Inoue and Sakama, 1999] K. Inoue and C. Sakama. Updat- guage ÈDÄ [Lobo et al., 1999]. It extends traditional action ing extended logic programs through abduction. In Proc. languages with the notion of event sequences, and serves for LPNMR’99, LNAI 1730, pp. 147-161. Springer, 1999. specifying actions as reactive behavior in response to events. [Lifschitz and Turner, 1999] V. Lifschitz and H. Turner. Re-

A ÈDÄ policy is a collection of ECA rules, interpreted as a presenting transition systems by logic programs. In Proc.

function associating with an event sequence a set of actions. LPNMR’99, LNAI 1730, pp. 92-106. Springer, 1999. EÈÁ

ÈDÄ seems thus to be more expressive than ; possible [Lobo et al., 1999] J. Lobo, R. Bhatia, and S. Naqvi. A pol- ÈDÄ embeddings of EÈÁ into remain to be explored. icy description language. In Proc. AAAI/IAAI’99, pp. 291-

The EÈÁ language could be extended with several features: 298. AAAI Press / MIT Press, 1999.

´Ö µ Ö

(1) Special atoms iÒ telling whether is actually part of [Marek and Truszczy´nski, 1998] W. Marek and M. Trusz-

ÃB ÓÒ´Ò µ (i.e., activated by Ö ), allowing to access the “exten- czy´nski. Revision programming. Theoretical Computer sional” part of ÃB . Science, 190:241-277, 1998.

(2) Rule terms involving literal constants and variables, [Subrahmanian et al., 2000] V.S. Subrahmanian, J. Dix,

À ÙÔ´× µ;B À; B ÙÔ´× µ ½ e.g., “ ½ ”, where are variables and Th. Eiter, P. Bonatti, S. Kraus, F. Ozcan, and R. Ross. Het- is a fixed atom, providing access to the structure of rules. erogeneous Agent Systems. MIT Press, 2000. Combined with (1), commands such as “remove all rules in-

[Williams and Antoniou, 1998] M.-A. Williams and G. An-

ÙÔ´× µ volving ½ ” can thus be conveniently expressed. toniou. A strategy for revising default theory extensions. (3) More expressive conditions on the knowledge base are In Proc. KR’98, pp. 24-35. Morgan Kaufmann, 1998. conceivable, requesting for more complex reasoning tasks, [ ] and possibly taking the temporal evolution into account. E.g., Zhang and Foo, 1998 Y. Zhang and N. Foo. Updating logic

programs. In Proc. ECAI’98, pp. 403-407. 1998.

´aµ a “ ÔÖeÚ ” expressing that was true at the previous stage.

654 LOGIC PROGRAMMING AND THEOREM PROVING Abduction in Logic Programming: A New Definition and an Abductive Procedure Based on Rewriting

Fangzhen Lin Jia-Huai You Department of Computer Science Department of Computing Science Hong Kong University of Science and Technology University of Alberta Clear Water Bay, Kowloon, Hong Kong Edmonton, Alberta, Canada T6G 2H1

Abstract One could argue that these differences are due to the fact that under the answer set semantics, negation is considered to We propose a new definition of abduction in logic

be “negation-as-failure.” If none of the atoms in ¨ appear in programming, and contrast it with that of Kakas

the head of a rule in the logic program , then adding a set

and Mancarella’s. We then introduce a rewriting

©¨ to really means that we are adding the complete

system for answering queries and generating expla- ¥

¨© literal set, © £ , to . This would also nations, and show that it is both sound and complete explain why there is no minimality condition in the definition: under the partial stable model semantics and sound two complete sets of literals are never comparable in terms of and complete under the answer set semantics when the subset relation. the underlying program is so-called odd-loop free. However, while this notion of abductive explanations

We discuss an application of the work to a problem makes sense in theory, it is problematic in practice. For in-

§ §

in reasoning about actions and provide some exper- ¡'&

!"$#%  !() stance, if ¨ , and , then there are two

imental results. abductive explanations for ¡ according to Kakas and Man-

%!*#% ¨ +

carella’s definition: %!" and . In general, if has ¡

1 Abduction in logic programming elements, then there are ,.-0/21 abductive explanations for , a

In general, given a background theory , and an observation number that is too big to manage.

! ¡ ¡ Since in this case is the explanation that we are look-

to explain, an abduction of w.r.t. is a theory ¢ such

¦¥ §

¡ ing for, it is tempting here to say that we should prefer mini- that ¢¤£ . Normally, we want to put some additional mal abductive explanations like what we did for propositional conditions on ¢ , such as that it is consistent with and con- tains only those propositions called abducibles. For instance, logic. As we mentioned above, this does not make sense if we take an abductive explanation to be a complete set of literals in propositional logic, given a background theory , a set ¨ of assumptions or abducibles, and a proposition ¡ , an expla- as implied by the answer set semantics. However, one can

¡ still try to minimize the set of atoms, in this case, preferring

nation © of is commonly defined (see [Reiter and de Kleer, !"$#%

1987], [Poole, 1988], and [Konolige, 1992]) to be a minimal %!" over .

¥ §

¡ However, this minimization strategy is problematic when

£ © £ © set of literals over ¨ such that and is con- sistent. a program contains negation. Consider a situation in which In the context of logic programming, abduction has been a boat can be used to cross a river if it is not leaking or, if it investigated from both proof-theoretic and model-theoretic is leaking, there is a bucket available to scoop the water out perspectives (e.g. [Eshghi and Kowalski, 1989; Kakas and of the boat. This can be axiomatized by the following logic

Mancarella, 1990; Satoh and Iwayama, 1992]). One of the program :

3 & #=9:!.>?A@%BDC0EGF!0HJIK+MLM(

most influential definitions of abduction in logic program- !4+6587:9<;;

& 3

ming is that of Kakas and Mancarella’s generalized stable 3

!4+6587:9<;; #=9:!.>?*EF!JH0IN+ML$OP!J;%QSR HJF%>?( ¨ model semantics [1990]. Given a logic program , a set of atoms standing for abducibles, and a query ¡ , Kakas and Now suppose that we saw someone crossed the river, how

Mancarella define an abductive explanation © to be a sub- do we explain that? Clearly, there are two possible explana-

set of ¨ such that there is an answer set (also called a stable tions: either the boat is not leaking or the person has a bucket

¡ £©

model) of that satisfies . with her. In terms of Kakas and Mancarella’s definition,

3 #?9:!4>$

One can see the following two differences between this there are three abductive explanations for !4+6587:9<;; , ,

3 3

HJF%>$ #=9:!.>?*EF!JH0IN+ML$OP!J;%QSR HJF%>$

definition and the one that we defined above for propositional #=9:!.>?$OP!J;%QSR , and , as-

§

3

¨ #=9:!.>?*EF!JH0IN+ML$OP!J;%QSR HJF%>$ logic: In propositional logic, © is a set of literals, but in logic suming that is the set of

programming, it is just a set of atoms; In propositional logic, abducibles. But only one of them, :#?9:!4>$ , is minimal.

© must be minimal, in terms of the subset ordering relation; On a closer look, we see that in our first example, when we

but there is no such requirement in the case of logic program- say that %!" is a preferred explanation over all the others, we

ming. do not mean the complete set of literals !*T#% , is preferred

LOGIC PROGRAMMING AND THEOREM PROVING 655 over all the others. While we want ! to be part of the expla- as well. Furthermore, it is a minimal explanation as none of

nation, we don’t necessarily want T# because we do not want its element can be deleted for it continue to be an explanation. EF!JH0IN+ML" to apply negation as failure on abducibles, which are assump- Similarly, #=9:!.>?$ is also a minimal explanation.

tions one can make one way or the other. What we want is for If we take a hypothesis to be the conjunction of its ele- ¡

the set %!" itself to be the best explanation for . ments, then we have that in the propositional logic,

 

One way to justify this is that all possible ways of com- 

¥%$ ¥%$ ¥

pleting this set into a complete set of literals, !*T#% and

 "!#  "!'& ( "!')

%!$#% , turn out to correspond to all the abductive expla- nations of ¡ according to Kakas and Mancarella’s defini-

where * is the set of all complete hypotheses that are expla-

1 ¡

tion. The same kind of justification turns out to work for ¡ * ,

nations of , * + the set of all explanations of , and the set ¡ our second example as well: the reason that #=9:!.>$ is not of all minimal explanations of . Therefore the set of mini-

a preferred explanation is that while its completion accord- mal explanations is a succinct representation of the set of all

3

EF!JH0IN+ML$TOP!J;%QSR HJF%>$ ing to negation-as-failure, #=9:!.>?$ is explanations. an explanation, some of its other completions, for example

It is clear from our definition that a complete hypothesis ¥ 3

¡ ¡ HJF%>$ #=9:!.>?*EF!JH0IN+ML$TOP!J;%QSR is not an explanation. This is an explanation of iff ¥© is an abductive explanation of motivates our following definitions. according to Kakas and Mancarella’s definition. This implies that if none of the abducibles occur in the head of any clauses 2 Abduction in logic programming revisited

in , then

 

3

E/.K©0$ 

In this paper, we consider (normal) logic programs which are ¥

-

"! ( "! #

sets of rules of the form &

¡

& 3 3

¡

*

# A( ( ( *# A@%BDC A( ( ( =@B

! where is the set of all abductive explanation of according

§ ¥

1

1 1 -

3

E1.G©0 © £   

3 to Kakas and Mancarella’s definition,

! #£¢ ¢

where , and are atoms of the underlying propositional ¡

¤

¨SG© * , and + is the set of all minimal explanations of . language . Here “ @B

equivalent. However, as we have seen above, the number of ¨ Let be a logic program, a set of propositions standing

¡ abductive explanations can be very large. Enumerating them for abducibles, and a proposition. In the following, without all is impossible even in simple, small domains. In contrast, loss of generality, we shall assume that none of the abducibles

1 the number of minimal explanations are much smaller. More in ¨ occur in the head of a rule in . importantly, just like explanations in propositional logic, they

In the following, by a hypothesis ¥ we mean a consistent

only include “relevant propositions.”

  set of literals over ¨ , i.e. it is not the case that and

But computationally, it may be hard to compute minimal

  ¨ are both in ¥ for some . We say that a hypothesis

explanations from scratch. It is often easier to compute first a

 ¨   ¥ is complete if for each atom  , either or is in , small “cover” of all explanations.

but not both. Notice that a complete hypothesis is really a *

truth-value assignment over the language ¨ . We say that a Definition 2.3 A set of hypotheses is said to be a cover of

¡

¨

¦ ¦ §¥

hypothesis ¥ is an extension of another one if , and a w.r.t. and iff 

complete extension is an extension that is complete. 

¥%$ ¥ 

¥ ( "!'2

Definition 2.1 A complete hypothesis is said to be an ex- ( "!

¡

¨

planation of w.r.t. and iff there is an answer set of

¡

¡

*3

¨¥ ©   ¥ 

£ such that contains and for any , , where is the set of minimal explanations of .

¡ ¥

where ¥ © is the set of atoms in .

¥ 4*

Proposition 2.4 If * is a cover of , then each must ¡ Definition 2.2 A hypothesis is said to be an explanation of ¡ be an explanation of . iff every complete extension of it is an explanation. A hypoth- So a cover is a set of explanations such that any complete

esis ¥ is said to be a minimal explanation if it is an explana- explanation must be an extension of one of the explanations in ¥ ¥ tion, and there is no other explanation ¥ such that . the cover. Once we have a cover, then we can find all minimal

Consider the logic program in the previous section about explanations by propositional reasoning alone. Recall that a

3

!4+6587:9<;; 5

. The following are the complete hypotheses that conjunction of literals ¥ is a prime implicant of a formula

¥ § ¥ §

3

!4+6587:9<;;

5 ¦ ¦ 5 ¦

explain : if ¥ , and there is no other such that and is

¥ ¥

3 a subset of , i.e. is a minimal conjunction of literals that

#=9:!.>?$ EF!JH0IN+ML$TOP!J;%QSR HJF%>$ 4

entails 5 .

3 3

#=9:!.>?$ EF!JH0IN+ML$OP!J;%QSR HJF%>$ 4=#=9:!.>?*EF!0H0IN+ML"*O"!J;%QSR HJF%>$ D( ¡

Proposition 2.5 Let * be a cover of . Then a hypothesis is a

3

¡

#?9:!4>?$OP!J;%QSR HJF%>$

( "! ¥ Now consider . Clearly every complete minimal explanation of iff it is a prime implicant of 6 . extension of this set is an explanation, so it is an explanation In the rest of this paper, we shall propose a rewriting sys- 1

If  occurs in the head of a rule, then we can always intro- tem for generating explanations of a proposition in a logic

  

duce a new proposition, say  , add the rule to , add to program. We shall first define it for logic programs without

   and delete from . abducibles. We will then extend the system to logic programs

656 LOGIC PROGRAMMING AND THEOREM PROVING

§ §

£

L 3  L  (A(A(! L

with abducibles, and show that for any query, the rewriting Context: A rewrite chain L

§

1 -

 L'3< ( ( (  L

system generates an approximation of a cover set in the gen- records a set of literals 5 for proving

-0/21

L . AL A( ( (  L 0 5

eral case, and exactly a cover set when the given logic pro- . We will write 3 and call a context.

-0/ 1 gram has no so-called “odd loops.” We will then discuss ¦ For simplicity, we assume that whenever  is gener-

an application of our system to reasoning about actions and

G5 0 5 ated, it is automatically replaced by . , where is

present some experimental results. the set of literals on the corresponding rewrite chain, and

¦

 is automatically replaced by . 3 Goal Rewrite Systems

Note that for every literal in any derived goal, the rewrite

Given a (ground) program , the Clark completion of , de-

chain leading to it from a literal in the given goal is uniquely

9¡  .G 0

noted 5 , is the following set of equivalences: for ¢

¤ determined. As an example, suppose the completion of a pro-

 ¤

each atom , ¤

3 ¡

¢ ¢¥¤

%! T#  # 2

£ gram is: , . We then have a rewrite

¨

¡ 3

if does not appear as the head of any rule in , 3

¦

 T#     

sequence ! . For the three literals

9¡  .G 0 5 .

¢§¤ in the last goal, we have the following rewrite chains for them

£

¡ 3

Q ( ( ( Q  5 9¡  . 0

!"¤T##  !$ T##  !$ 

otherwise, (with default from ! : , , and .

©¨ ¨ -

negations replaced1 by negative literals), if there are ex-

¢ ¢

& Simplification Rules:

Q ¢ 

actly + rules with as the head. We write

Q Q ¢

for ¢ if is empty.

 5 E Let ¢ ’s be any goal formulas, a context, and a literal.

The idea of goal rewriting is simple. A completed defini-

¢ ¤

¦

Q (A( ( Q  5 9¡  .G 0

tion can be used as a rewrite SR1. %

¢

1 ¨ ¨ -

¨

Q (A(A( Q

rule from left to right: is rewritten to , and ¦

¢

1 ¨ ¨ - 

SR1’. 

  Q (A(A(  Q

to . We call these literal rewriting, and ¨

1 - ¦ the completed definitions program (rewrite) rules. ¦

SR2. %

A goal is a formula which may involve  , and . A goal

¦ ¦

¨



is also referred to as a goal formula. A goal resulted from a SR2’. 

literal rewriting from another goal is called a derived goal. A

.G5 + 0 .G5 £85 + 0 G5 0 5 £85 +

SR3. . if is consistent

1 1 1

goal with negation appearing only in front of atoms is said

¦

to be signed, a term introduced in [Kunen, 1989] for a sim-

.G5 0& .G5 0 5 £5 +

SR4. if + is inconsistent

1

ilar purpose. For convenience, we assume that all goals are 1

¦

.G5 0 E' E  5

signed, which can be achieved easily by simple transforma- SR5. if 

  ¦

tions using the following rules: for any formulas and ,

.G5 0  E 5

SR5’. E if

T

 .)  0& .  0 .)  0

+ , + ,

.  0  

 SR6.

¨ ¨

1( 1( 1(

¨

 .  0  

¨

 0   .  0 .) .)  0 *

+ , , +

SR6’. ,

¨ ¨

1(

Like a formula, a goal may be further transformed to a suit- 1 able form for literal rewriting without changing its semantics. The simplification system is a nondeterministic transfor-

With a mechanism of loop handling, rewriting of a goal  ter- mation system. The primed version of a rule is its symmet-

¦ minates at either meaning that  is proved, or meaning ric case. Most of the rules are about the logical equivalence

that  is not proved. Hence, a goal rewrite system consists of between the two sides of a rule. SR3 merges two contexts if

9¡  .G 0 three types of rewrite rules: (i) program rules from 5 they are consistent, otherwise SR4 makes it a failure to prove. for literal rewriting, (ii) simplification rules to transform and SR5 and SR5’ prevent generating an inconsistent context be-

simplify goals, and (iii) loop rules for handling loops. fore literal E is even proved. 3.1 Simplification rules Note that the proof-theoretic meaning of a goal formula may not be the same as the logical meaning of the formula.

The simplification subsystem is formulated with a mechanism

 !

E.g., the goal formula ! (a tautology in classic logic) ¦

of loop handling in mind, which requires keeping track of ¨

 !

could well lead to an if neither ! nor can be proved.

L  ( ( (  L L  I + ¢ literal sequences 3 where each , , is in

- For goal rewriting that does not involve loops, the system L the goal formula resulted from rewriting ¢ . Two central

/21 described so far is sufficient. mechanisms in formalizing goal rewrite systems are rewrite chains and contexts.

Example 3.1 Let be £

Rewrite Chain: Suppose a literal E is written by its defi-

¢ ¤ ¢ ¢

§ §

&

E E  L @%BDC0!

nition  where or . Then, each literal

& 3,+ &

E ! @%BDCJ#:=@B

E in the derived goal is generated in order to prove .

& ¡.+ &

 E # # @B

This ancestor-descendant relation is denoted E . A

 ( (A( E

sequence E is then called a rewrite chain,

1 -

9¡  .G 0

Then 5 is:

 ©E

abbreviated as E . Notice that it is essential here

1 -

¤ ¤ ¤

+ 3 + ¡



 ! ! .GT#  0 .G#  -0 # 

that any goal be in the form of a signed goal, and that L

¤ ¤ ¤ ¤

¦ ¦ ¦ ¦

¨ ¨

¡ + + + 3

  E ¤ E  - when is in , we have that but not . 

LOGIC PROGRAMMING AND THEOREM PROVING 657

§

& + & + &

+ ¡- @B

The rewrite sequence below is generated by focusing on the . Both - and -

left part of a goal.  can be proved.

-   !  #   ! .  -"* !$#% 0

L  !

3

 - !  T#  ! .  -" !$T#% 0

 .G# 0 .GT# - 0

¨ ¨

¡ 3

§

 .  0 .KT# - 0

& + &

¦

¨ ¨ ¨

%! @%BDCJ# # @%BDCJ#% !  ! ,

3 . Neither nor is

 .  0 .KT# -0

¨ ¨ ¨

3 proved:

 .G 0 .GT# -0

§

¦ ¦

¨ ¨

3

+

 . .K5 0 0 .KT# - 0 5  L$ !"$#*6

 T#  #   !  #  T# 

where !

¨ ¨

3

 . .K5 0 .KT# - 010 . .GT# - 010

¨ ¨ ¨

3

 . .K5 0 T# 0 . .K5 0 -0 . .GT# -0/0

% ¢¡£¡£¤¦¥¨§ © 3.3 Goal rewrite systems and their properties

¦

¨ ¨ ¨

3

. .G5 0 - 0 . .KT# - 010

 To summarize, a rewrite sequence is a sequence of zero or

¨ ¨ ¨

3

 . .K5 0 -0 . .GT# -0/0

  ( (A((   $ 3

more rewrite steps 3 (denoted )

¦

¨ ¨

3

 . .K5 0 0 . .KT# -0/0 

such that 3 is an initial goal (one involving no context), and

¦

¨ ¨

3

 . .GT# -0/0

I H  ¢  ¢

for each  , is obtained from by

¦ ¦

¨ ¨

©

1

3

£

 .KT# -0& .GT# - 0

 ¢

¨ ¨

literal rewriting at a non-loop literal in ; or

£ 

3.2 Loop rules applying a simplification rule to a subformula in ¢ ; or

£

E ¢

After a literal is rewritten, it is possible that at some later applying a loop rule to a loop literal in  .

 E

stage either E or appears again in a goal on the same rewrite

  

We may call a subsequence ¢ a rewrite sequence

 ( ( ( *E

chain. Thus, a loop is a rewrite chain %E such that

§ § -

1 in the understanding that it is part of some rewrite sequence

E  E E

E , or . A loop analysis involves classifying



 3#  ¢    3

- - 1 1 from an initial goal .

all the cases of loops, and for each one, determining the out-

   Definition 3.4 A goal rewrite system is a triple 

come of a rewrite according the underlying semantics. For 

  

the problem at hand, there are only four cases. , where  is the set of all goals, is a set of rewrite

§

9¡  .G 0

rules which consists of program rules from 5 , the

E  ©E

Definition 3.2 Let © be a rewrite chain.

 -

1 simplification rules, and the loop rules; and is the set of

§ §

£

E E E  E ©

If  or , then is called an odd loop. all rewrite sequences.

1 - 1 -

§

£ E

If E , then Goal rewrite systems are like term rewrite systems (see, -

1 e.g., [Dershowitz and Jouannaud, 1990]) everywhere except

E E

– © is called a positive loop if and are both -

1 at terminating steps: a terminating step at a subgoal may de-

 ©E

atoms and each literal on E is also an atom; -

1 pend on the history of rewriting.

E E

– © is called a negative loop if and are both -

1 Two desirable properties of rewrite systems are the proper-

 © E

negative literals and each literal on E is - 1 ties of termination and confluence. Rewrite systems that pos- also negative; sess both of these properties are called canonical systems. A – Otherwise, © is called an even loop. canonical system guarantees that the final result of rewriting

In all the cases above, E is called a loop literal. from any given goal is unique, independent of any order of -

§ rewriting. It therefore allows an implementation to be based

E E E I"¦+

Note that when , and all of ¢ , , have - the same sign, this sign1 is either positive or negative, though on any particular order of rewriting. both types of loops are caused by “positive loops” in the given Since the simplification system is terminating and literal program. These two types of loops must be treated differently rewriting only generates non-repeated rewrite chains, it is according to the semantics. clear that a goal rewrite system is terminating when the given It turns out that we only need two rewrite rules to handle program is finite. A goal is called a normal form if it cannot

all four cases. be rewritten by any rule.



  

Proposition 3.5 Let  be a goal rewrite system.

 © L

Loop Rules: Let L be a rewrite chain.

 -

1 If is finite then every rewrite sequence in is finite. Fur-

¦

 

 3    

LR1. L ther, for any rewrite sequence , if is a normal

§ ¦ §

-

¡

 

  .G5 0 (A(A( .K5 0

L  ©¤L  I  +

if ¢ , for some , is a positive loop or form, then either or for

1 ¨ ¨

- 

an odd loop. some .



L  . AL  ( ( (  L 0

 

  

LR2. Definition 3.6 A goal rewrite system  is conflu-

- 1 -

L  © L  I  +

¢

 

 > + >  > ,

if , for some , is a negative loop or ent iff for any rewrite sequences > and , there

-

1 1

*

>"!'#  >  >"! > $ >"! ,

an even loop. exist and rewrite sequences + and .



 

   Apparently, a loop literal should always be rewritten by a Theorem 3.7 Any goal rewrite system % with a

loop rule. finite is confluent.

§

3,+ 3 & 3

& The goal rewrite systems described here are sound and

# @B

1 complete w.r.t. partial stable models. These results are spe-

and T# is not.

cial cases of those for the rewrite systems for abduction given

¦

3 3 3 + 3 3

     . :#* 0 T#    # in the next section.

658 LOGIC PROGRAMMING AND THEOREM PROVING 4 Rewrite systems for abduction for any goal, the underlying rewrite system will generate an

approximation of a cover for any program , and exactly a ¨

Let be a program and a set of abducibles. An abducible

¢ ¢

cover when has no odd loops.

literals is either an abducible or its negative counterpart  .



 

 *¨S  The rewriting framework that we defined earlier can be ex- Theorem 4.2 Let % be a goal rewrite system. tended for abduction in a straightforward way: the only dif- Suppose ¡ is a proposition and

ference in the extended framework is that we do not apply the ¡

# #



 E .G5 0 E  .G5  0 (A(A(

¨ ¨

1 1*1    1 1

Clark completion to abducibles. That is, once an abducible 1

¡ ¡ ¡ ¡

 

0 E .G5 0 .G5

appears in a goal, it will remain there unless it is eliminated E

1 1

 

¡ ', ©¡ ',

by the simplification rule © or .

¢

is a rewrite sequence such that each E is either or an ab-



.K5 0

Just like a rewrite to is written as , where 5 is the

5 £ £ 5  I ¢

ducible literal, and ¢ is consistent for each .   underlying rewrite chain (cf. Section 4.1), a rewrite to an 1

Then, if has no odd loops then

E1.G5 0 5

abducible literal E will be written as for rewrite chain .



¡ ¡

#

 

  E D D%E   E . =%E

   ¨ 

In the following we shall denote by % the

1 1*1 1

     

¨ rewrite system obtained by the logic program and the set ¡

is a cover of . In general, for arbitrary we have of abducibles. We shall show that it is both sound and com-

[ 

¡ ¡

plete w.r.t. the partial stable models semantics Przymusinski, #



E   ¥ E E   E

¨ ¨

1(   1*1   1  

1990], which is a three-valued generalization of answer set ( "!

semantics. An answer set of is also its partial stable model. ¡

where * is any cover of .

But sometimes may not have any answer sets. For instance,

§

& + ¡'&

  @B

if , then has no answer set, because Consider again the boat example in Section 1. The Clark

3



there is no way to assign a truth value to . However, has a completion of !4+6587:9<;%; is: ¡

partial stable model in which is true and  is undefined. The

3 3

$ .G#?9:!4>  EF!JH(0 .K#?9:!4> EGF!0H OP!J;%QSR HJF%>/0=(

[ ] !4+6587:9<;;

following definition is adopted from You and Yuan, 1995 . ¨

3 ©

Let be a (ground) program. For any set of default

#?9:!4> EF!JH OP!J;%QSR HJF%>

¢ ¢

§ ¥

¦ Since , and are abducibles, so rewrit-

3

.G©0 .@%BDC0! £ ©  !" negations, let  , where is the ing for !4+65879<;; terminates in one step, and produces the

standard propositional derivation relation with each default ¢

¢ following cover: @B

negation @%BDC being treated as a named atom . A par-

3

:#?9:!4>  EGF%!JHM #?9:!4> EF!JH OP!J;%QSR HJF%>$ .(

tial stable model of is defined by a maximal fixpoint

¦ ¦ §

+



© .G©0 ©

of the function that applies twice,  , while

¦ Notice that the second explanation is not minimal. To get



©  G© 0

satisfying . , in the following way: for any atom

£ £ £ ¢¤£ £

£ minimal ones, we have to compute prime implicants of the

 @%BDC  ©  £ ©

,  if , if , and is

£ £

disjunction of explanations in the cover, which are #?9:!4>

3



undefined otherwise. Notationally, any such that 

£ £

EF!JH #?9:!4> OP!J;%QSR HJF%>

 and .

  and  represents that is undefined in . An an-

swer set ¥ (also called a stable model) can then be defined as §

¦ 5 Related work

 .§¦ 0 ¦

a special case by a fixpoint ¦ such that and

£ ¢©£

§ ¥

 ¤£¨¦ ¥ . Traditionally, logic programming proof procedures have been

defined abstractly in terms of derivation and refutation. Ter- ¨ Theorem 4.1 Let be a finite program, a set of proposi-

 mination has been considered a separate, implementation is-

 

 *¨S  tions, and % the goal rewrite system w.r.t. sue. On the one hand, this separation is possible since the un- and ¨ . derlying semantics allows the completeness to be stated with- out resorting to termination. But completeness is rarely guar- Soundness: For any literal L and any rewrite sequence

anteed in an implementation. On the other hand, the separa-



 

 E .K5 0 E .G5 0  

L tion is also necessary since these procedures deal with non-

1 1 ¨

 

ground programs for which the problem of loop-checking

¢ 5 £ £

where each E is either an abducible literal or , if 1

  is undecidable (even for function-free programs [Bol et al.,

 5 is consistent, then there exists a partial stable model

1991]). For answer set programming, however, loop handling

©



£ E  ( ( ( *E L   5 ¢T

of such that  .

1  1 has become a semantic issue: a sound and complete proce- dure cannot be defined without it. Thus, a distinct feature of Completeness: For any set of atoms © ¨ , and any literal

our work is a mechanism of loop handling both for termina-

£ © L in a partial stable model of , there exists a rewrite sequence tion and for the implementation of the underlying semantics.

Completed programs have been used in query answering



 E .K5 0 E .G5 "0  

L in abstract, abductive procedures in [Console et al., 1991;

¨

1  

1 Denecker and Schreye, 1998; Fung and Kowalski, 1997] for



L  5 ¤ ©  %E A( ( (  E

such that  , and .

 1  1 non-ground programs with constraints. These procedures are

We say that a program has no odd loops if there is no defined for the three-valued completion semantics under the odd loop starting with any literals (programs that have no odd certainty mode of reasoning – computing bindings for which loops are also called call-consistent [Dung, 1992]). It is well- an (existential) goal is true in all indented models. In our case

known that if has no odd loops, then the partial stable mod- the reasoning mode is brave – establishing whether a query is els of and the answer sets of coincide. We now show that true in one of the intended models.

LOGIC PROGRAMMING AND THEOREM PROVING 659

Our work is closely related to another abstract procedure, so is the package. Suppose that we have the following propo-

¥¤ 0 ¤ .¥¤ 0 the Eshghi-Kowalski procedure (EKP) [Eshghi and Kowal- sitions: ta . – the truck is at location initially; pa –

ski, 1989], which is sound and complete for ground programs the package is at location ¤ initially; in – the package is in

¦¤6¨§ © 0 ¤

under finite-failure three-valued stable models [Giordano et the truck initially; ta . – the truck is at location after

© .¦¤6 §M © 0 al., 1996]. Besides loop handling and termination, to some the action of moving it from § to is performed; pa

extent, one can say that our goal rewriting system (GRS) sim- – the package is at location ¤ after the action of moving the

© .¥§M¨©0

ulates EKP in a nontrivial way. truck from § to is performed; and in – the package is © 1. GRS incurs no backtracking! Backtracking is simulated in the truck after the action of moving the truck from § to is ¦ performed. We then have the following logic program:

by rewriting disjunctions, e.g., % .

¨

¥  # . 0=(

2. Loops that go through negation are handled in EKP by ta .

&

¥ ¨ # D¨ ,'0 .¥  # . , 0? .¥ # . , 0?(

nested structures while in GRS by a flat structure using pa . ta in §

rewrite chains. &

¥  # . , 0  ,J .¦ 0?A@%BDC .¥ ¨ # D¨ ,'0=(

ta . ta taol

§



These features plus loop handling made it possible to for- &

¥ ¨ # D¨ ,'0   .  D ,'0?( taol . ta

malize our system as a rewriting system benefiting from the &

¥ ¨ # D¨ ,'0 .¥ 0==@B

known properties of term rewriting in the literature. This also §

&

¦  D ,'0   . ¨ # D¨ ,'0=(

distinguishes our use of rewrite systems from that by [Fung paol . pa

&

¥  0 ( and Kowalski, 1997]. It seems remarkable that a form of non- in . in monotonic reasoning is just rewriting, two areas of research that had little connection previously. The first rule is the effect axiom. The second rule is a causal To illustrate these feature, consider the following program rule which says that if a package is in the truck, then the pack-

age should be where the truck is. The rest are frame axioms.

& &

D( L @%BDC0! 7¡ P( # !

7 For instance, the third one is the frame axiom about ta, with

& 3 3 &

! @%BDCP#=@B

, and if one cannot prove that it will be elsewhere after the and the question whether we can prove L . We may answer this

action is performed, then it should still be at . question by the following reasoning: To have L we must have

3 As one can see, the above program, when fully instantiated

@%BDC0! # @B

over any given finite set  of locations, has no odd loops. So requires having ! (r3 and r4). This results in a contradiction.

our rewrite system will generate a cover for any query. Note

Note that in this reasoning we need to remember what was 3

. 0

that in the program we have omitted domain predicate E9 @B

the program refer to locations). Thus, the program is domain

¦ ¦

3 3 3 3

  !  #  !    !

L restricted and needs only to be instantiated for the variable

¨ ¨ ¨ in the fourth and sixth rules over the domain of locations. However, EKP will go through six nested levels, and do it Now let the set ¨ of abducibles be the following set:

twice through backtracking, before the same conclusion can

¥

£  .¥¤ 0? .¥¤ 0 ¤  4( be reached.  in pa ta Rewriting can be applied to function-free programs for The following table shows some of the results for

proving ground goals. The idea is that if every derived goal § 2

 D$,J¨ P ¢P is ground, then all the mechanisms given in this paper apply  :

directly. Obviously, if for every rule in the given program Query Result Time

 D*, ¨ 0

a variable that appears in the body also appears in the head, ta . false 0.0

 *, ¨ 0 then a ground goal will be rewritten to another ground goal. ta . true 0.0

[ ]

 .*, ¨ 0 . 0 

Domain restricted programs Niemela,¨ 1999 can be instan- pa . pa in 0.05

. D$,J 0  . 0 .G, 0 . 0 .¥¢ 0

tiated only on domain predicates over variables that do not  pa pa in pa pa pa 0.2

¨ ¨ ¨

appear in the head so that the resulting non-ground programs ¨

G, *, ¨ 0 .G,'0  pa . pa in 0.08

also satisfy this requirement. For example, the program given

.G,J$,J 0  .G, 0 . 0 . 0 .¥¢ 0

 pa pa in pa pa pa 0.1

¨ ¨ ¨

in the next section is domain restricted. ¨

¦ P*, ¨ 0 . 0

pa . pa in 0.25

¨

.¦ $,J 0   .¦ 0  . 0

 pa in pa in pa

¨ ¨

.G, 0  .¥¢ 0

6 Applications and experimental results  in pa in pa 0.1

¨

We have implemented the writing framework in SWI-Prolog.

 D$,J¨ 0 For instance, the row on pa . says that for it to be true, In the following, we discuss the performance of our imple- the package must initially be at and cannot be inside the mentation on one particular application of abduction in logic truck (otherwise, it would be moved along with the truck),

programming, which is the problem of computing successor

¦ P*, ¨ 0 and the computation took 0.1 seconds.The row on pa . state axioms from a causal action theory [Lin, 2000]. says that for it to be true, either the package was initially at

Consider a logistics domain in which we have a truck and a  or it was inside the truck. The outputs for larger s are package. We know that the truck and the package can each be at only one location at any given time, and that if the package 2On a PIII 1GHz PC with 512MB RAM running SWI-Prolog is in the truck, then when the truck moves to a new location, 3.2.9.

660 LOGIC PROGRAMMING AND THEOREM PROVING

similar. The performance varies for different queries. For [Denecker and Schreye, 1998] M. Denecker and D. De Schr-

 D*, ¨ 0

simple queries like ta . , their covers can be computed eye. Sldnfa: an abductive procedure for normal abductive

 *, ¨ 0

almost in constant time. The hardest one is for pa . programs. J. Logic Programming, 34(2):111–167, 1998.

¥ ¥D§¡ which took 25 minutes when  . [Dershowitz and Jouannaud, 1990] N. Dershowitz and It is interesting to compare our system with an alterna- P. Jouannaud. Rewrite systems. In Handbook of Theo- tive for computing the cover of a query. As we mentioned retical Computer Science, Vol B: Formal Methods and in Section 2, the set of abductive explanations according to Semantics, pages 243–320. North-Holland, 1990. Kakas and Mancarella is actually a cover. One way of com- [Dung, 1992] P.M. Dung. On the relation between stable and puting these abductive explanations is to add, for each propo-

well-founded semantics of logic programs. Theoretical ¨

sition   , the following two clauses ([Satoh and Iwayama, &

& Computer Science, 105:7–25, 1992.

@%BDCJ  @%BDC% 1991]):  and into the original pro- gram, and use the fact that there will be a one to one cor- [Eshghi and Kowalski, 1989] K. Eshghi and R.A. Kowalski. respondence between abductive explanations of ¡ under the Abduction compared with negation by failure. In Proc. 6th original program and answer sets of the new program that ICLP, pages 234–254. MIT Press, 1989. contain ¡ . So one can use an answer set generator, for exam- [Fung and Kowalski, 1997] T. Fung and R. Kowalski. The ple smodel or dlv, to compute a cover of query by generating iff proof procedure for abductive logic programming. J. all the answer sets in the new program that contain the query. Logic Programming, 33(2):151–164, 1997. However, the problem here is that there are too many such [Gelfond and Lifschitz, 1988] M. Gelfond and V. Lifschitz. answer sets in this case. For instance, suppose there are + The stable model semantics for logic programming. In locations, then the number of answer sets that contain any

+ Proc. 5th ICLP, pages 1070–1080. MIT Press, 1988. - particular query is in the order of , , roughly one half of the number of complete hypotheses, even for a very simple query [Giordano et al., 1996] L. Giordano, A. Martelli, and

M. Sapino. Extending negation as failure by abduction:

 D$,J¨ 0 like ta . . We do not know at the moment if there is any efficient way of using an answer set generator to compute a A three-valued stable model semantics. J. Logic Program- cover set of a query. ming, 26(1), 1996. [Kakas and Mancarella, 1990] A. Kakas and P. Mancarella. 7 Future work Generalized stable models: a semantics for abduction. In Proc. 9th European Conf. for AI, 1990. There are several directions for extending this work. One of [Konolige, 1992] K. Konolige. Abduction versus closure in them is to consider rewriting for non-ground logic programs causal theories. Artificial Intelligence, 53:255–272, 1992. for some restricted yet decidable classes of non-ground goals. Another important one is to extend this to programs with con- [Kunen, 1989] K. Kunen. Signed data dependencies in logic straints of the form: programs. J. Logic Programming, 7(3):231–245, 1989.

¢ [Lin, 2000] F. Lin. From causal theories to successor state

&

! A( ( (  !¢ A@%BDCJ# A( ( ( A@%BDCJ#

1 - 1 axioms: bridging the gap between nonmonotonic action Our new definition of abduction can be extended to include theories and STRIPS-like formalisms. In Proc. AAAI ’00, these constraints straightforwardly. The challenge is in ex- 2000. tending our rewriting system accordingly. [Niemela,¨ 1999] I. Niemela.¨ Logic programs with stable model semantics as a constraint programming paradigm. 8 Acknowledgements Ann. Math. and AI, 25(3-4):241–273, 1999. [Poole, 1988] D. Poole. Representing knowledge for logic- We would like to thank Ilkka Niemela¨ for helpful discus- based diagnosis. In Proc. Fifth Generation Computer Sys- sions related to the topics of this paper, and Ken Satoh for tems Conference, pages 1282–1290, 1988. comments on earlier version of this paper. The first author’s work was supported in part by the Research Grants Council [Przymusinski, 1990] T.C. Przymusinski. Extended stable of Hong Kong under Competitive Earmarked Research Grant semantics for normal and disjunctive logic programs. In HKUST6145/98E. The work by the second author was car- Proc. 7th ICLP, pages 459–477. MIT Press, 1990. ried out mainly during his visits to Hong Kong University of [Reiter and de Kleer, 1987] R. Reiter and J. de Kleer. Foun- Science and Technology and the Institute of Information Sci- dations of ATMS. In Proc. AAAI87, 1987. ence, Academia Sinica in Taiwan. [Satoh and Iwayama, 1991] K. Satoh and N. Iwayama. Com- puting abduction using the tms. In Proc. the 8th Interna- References tional Conference on Logic Programming, 1991. [Bol et al., 1991] R. Bol, A. Krzysztof, and J. Klop. An [Satoh and Iwayama, 1992] K. Satoh and R. Iwayama. A analysis of loop checking mechanisms for logic programs. query evaluation method for abductive logic program- Theoretical Computer Science, 86(1):35–39, 1991. ming. In Proc. JICSLP’92, 1992. [ ] [Console et al., 1991] L. Console, D. Theseider, and P. Po- You and Yuan, 1995 J. You and L. Yuan. On the equiva- rasso. On the relationship between abduction and deduc- lence of semantics for normal logic programs. J. Logic tion. J. Logic Programming, 2(5):661–690, 1991. Programming, 22:212–221, 1995.

LOGIC PROGRAMMING AND THEOREM PROVING 661