Abduction Algorithm for First-Order Logic Reasoning Frameworks

André de Almeida Tavares Cruz de Carvalho

Thesis to obtain the Master of Science Degree in

Mathematics and Applications

Advisor: Prof. Dr. Jaime Ramos

Examination Committee
Chairperson: Prof. Dr. Cristina Sernadas
Advisor: Prof. Dr. Jaime Ramos
Member of the Committee: Prof. Dr. Francisco Miguel Dionísio

June 2020

Acknowledgments

I want to thank my family and friends, who helped me overcome many issues that emerged throughout my personal and academic life. Without them, this project would not have been possible. I also want to thank Professor Cristina Sernadas and Professor Jaime Ramos for all the help they provided during this entire process, whether regarding theoretical and practical advice or other tips concerning my academic life.


Abstract

Abduction is a process through which we can formulate hypotheses that justify a set of observed facts, using a background theory as a basis. Even though the process used to formulate these hypotheses may vary, the inherent advantages are universal. Abduction addresses real-life problems within fields like medicine and criminality, while remaining useful within more theoretical subjects, like the fibring of logic proof systems. There are, however, some setbacks, such as the time complexity associated with these algorithms and the expressive power of the formulas used. In this thesis, we tackle this last issue. We study the notions of consequence system and of proof system, and provide relevant examples of these systems. We formalize some theoretical concepts regarding the Resolution principle, including refutation completeness and soundness. We then extend these concepts to First-Order Logic. This representation of logic systems was chosen for three main reasons: to maintain a certain degree of uniformity throughout the entire thesis; to allow a connection between abduction and the fibring of heterogeneous proof systems; and to provide a simple and straightforward manner in which to represent the logic systems needed to demonstrate some important results. Equipped with these theoretical results, we formalize and implement an abduction algorithm for First-Order Logic formulas in Mathematica, based on an already existing algorithm. We explore the applications of this algorithm regarding some practical issues and some theoretical notions. Namely, since this new algorithm accepts formulas not only contained within Description Logic, but also First-Order Logic, it gives way to a better representation of concrete problems, and consequently to a better chance of obtaining correct answers. Moreover, it gives way to the automation of the fibring of proof systems, through its application to the computation of an abduction function for proof systems.

Keywords: Consequence System, Proof System, Abduction Algorithm, TCHF Reasoning Framework, Resolution


Resumo

Abdução é um processo através do qual podemos formular hipóteses que justificam um conjunto de factos observados, usando como base uma teoria. Apesar do processo de formulação destas hipóteses poder variar, as vantagens inerentes são universais. Este processo aborda assuntos reais relacionados com vários campos, como a medicina e a criminalidade, mantendo a sua utilidade em temas mais teóricos, como é o caso da combinação de lógicas. Apesar disto, existem algumas desvantagens, nomeadamente relativas à complexidade temporal e ao poder expressivo das fórmulas utilizadas. Nesta tese, abordamos este último problema. Estudamos as noções de sistema de consequência e de sistema de prova, apresentando também alguns exemplos destes sistemas. Formalizamos alguns conceitos teóricos relacionados com o princípio da Resolução, incluindo a correção e completude de refutação. Estendemos depois este princípio a fórmulas de Primeira Ordem. Esta representação de sistemas lógicos foi escolhida devido a três fatores: para manter um certo grau de uniformidade ao longo de todo o estudo; para permitir uma ligação entre a abdução e a combinação de lógicas heterogéneas; para termos acesso a um modo simples e direto de representar os sistemas lógicos necessários para demonstrações de alguns resultados importantes.
Equipados com estes resultados teóricos e outros estudos, formalizamos e implementamos um algoritmo de abdução para fórmulas de Primeira Ordem em Mathematica, baseado num algoritmo já existente e num conceito estendido do princípio de Resolução. Exploramos as aplicações deste algoritmo no que concerne a assuntos mais práticos e a assuntos mais teóricos. Nomeadamente, como este algoritmo aceita fórmulas não só contidas em Lógica Descritiva, mas também em Primeira Ordem, permite uma melhor representação de problemas concretos e, consequentemente, uma maior probabilidade de as respostas obtidas serem corretas. Mais, permite uma automatização da combinação de sistemas de prova, através da aplicação do algoritmo à determinação de uma função de abdução de um sistema de prova.

Keywords: Sistemas de Consequência, Sistemas de Prova, Combinação de Lógicas, Algoritmo de Abdução, Estrutura de Raciocínio TCHF, Resolução


Contents

List of Tables

List of Figures

1 Introduction
1.1 Background and Motivation
1.2 Goals and Achievements
1.3 Literature Review
1.4 Outline

2 Basic Notions
2.1 Consequence Systems
2.2 Hilbert Calculus Induced Consequence Systems
2.3 Other Consequence Systems
2.4 Proof Systems
2.5 Hilbert Calculus Induced Proof Systems
2.6 Relations between Proof Systems and Consequence Systems

3 Abduction

4 Peirce’s Algorithm
4.1 Main Ideas and Input
4.2 Selection Criterion
4.3 Resolution Principle for PROP Formulas
4.4 Final Algorithm

5 Peirce’s FOL Algorithm
5.1 Main Ideas
5.2 Preliminaries and Procedures
5.2.1 Input
5.2.2 CLAFOL Procedure
5.2.3 ResolutionOP and MGUFIN Procedures
  Generalized Resolution Principle - GRP
  Unification Algorithm
  Refutation Soundness, Completeness and Decidability
  Optimizations
5.2.4 Other Procedures
5.3 Final Theoretical Considerations
5.4 Implementation

5.4.1 Input
5.4.2 CLAFOL Procedure Implementation
  Pre Translation
  Skolemization Translation
  CNF Translation
  Clausal Translation
  Final Procedure
5.4.3 ResolutionOP Procedure Implementation
  Seq1 Procedure
  Seq2 Procedure
  MGUFIN Procedure
  Optimizations Procedures
  Final Procedure
5.4.4 Hypotheses Formulation
5.4.5 Implementation of Peirce’s FOL Algorithm

6 Results Analysis

7 Other Applications
7.1 Proof Systems in FOL
7.2 Fibring

8 Conclusion
8.1 Achievements
8.2 Future Work

Bibliography

A TCHF Reasoning Frameworks #2 Tested and Summarized Results
A.1 TCHF Reasoning Frameworks #2
A.2 Summarized Results
A.3 Example 14

B Logic Definitions

List of Tables

6.1 Results obtained with PEIRCEFOL

A.1 Results obtained with PEIRCEFOL

B.1 Boolean Algebra for ∧ and ∨
B.2 Boolean Algebra for ¬


List of Figures

1.1 Peirce’s Theory of Inquiry


Chapter 1

Introduction

1.1 Background and Motivation

Abduction algorithms are procedures that formulate hypotheses to justify a set of observed facts, using a theory as a basis. They are based on the concept of abduction, in particular Logic-based abduction, first introduced in general terms by Charles Peirce as one of the components of his Theory of Inquiry [1]. This notion denotes a type of reasoning, distinct from inductive and deductive reasoning, that instead of starting from hypotheses with the goal of reaching a conclusion, starts from the conclusions in order to determine hypotheses. While in mathematics this order of reasoning may not be very useful, since the answers obtained are always uncertain, in other fields, such as diagnosis, economics, automated planning and historical linguistics, it has proven to be very helpful.

Figure 1.1: Peirce’s Theory of Inquiry

With the evolution of computing, the inclusion of abductive reasoning within fields like computer science and artificial intelligence research has been an open problem within the scientific community. Many algorithms, like the ABox abduction for Description Logic described in [2] by Klarman et al., the CIFF procedure by Endriss et al. presented in [3], and the THEORIST logic programming system for First-Order Logic (without quantifiers), detailed by Poole in [4], have tried to simplify this process, while at the same time expanding the universe to which it can be applied. Significant progress has been made, in particular in recent years. Namely, the complexity of the logic systems used has increased from simple Propositional Logic to more advanced Description Logic and First-Order Logic, and selection criteria, i.e. the methods through which the best hypotheses are chosen, have been refined. However, the formalization of such complex algorithms has also presented problems, namely in terms

of time and space complexity. According to the article by Eiter et al. [5], the complexity of an abduction algorithm depends on the selection criterion chosen to sort the hypotheses and on the types of formulas used. The most efficient one studied in that article uses Propositional Logic formulas in definite Horn clause form and a very basic selection criterion in which all preliminary solutions are considered to be good hypotheses. It belongs to P - the class of decision problems that can be solved by a deterministic Turing machine in polynomial time [6]. On the other hand, the remaining algorithms studied have a significantly higher complexity, with an algorithm that uses Propositional Logic formulas and an order-based selection criterion belonging to class Π3^P [6]1. Following this train of thought, it is important to mention that one of the main conclusions reached in this article is that the complexity of an abduction algorithm drops by one level if the input formulas are Horn, evidencing the role of the structure of the formulas chosen. In terms of selection criteria, a plethora of options were explored in [5]. Every one of them follows the guiding principle of Occam's Razor [7] - if there are multiple competing hypotheses, select the one that makes the fewest assumptions. Another factor that must be taken into consideration, as was done for example in the book by Genesereth et al. [8] and in the article by Rodrigues et al. [9], is the process used to build the hypotheses. Regarding this issue, there have been several approaches. Some experts, like Poole in [4], have used uniform deductive reasoning to compute them. Others, like Klarman et al. in [2] and Rodrigues et al. in [9], have used specific inference rules, namely Resolution and Factoring.
Of all the approaches taken so far, this last one has proven to be quite efficient in terms of space complexity and, through the application of some optimization procedures, in terms of time complexity. Finally, many different computing systems have been used to implement abduction algorithms. The most prolific is SWI-PROLOG, which itself contains a simple abduction procedure that uses formulas in Horn clause form. An example of an algorithm implemented in this system is the A-System - a combination of ACLP (abductive constraint logic programming), IFF (if and only if proof procedure) and SLDNFA (selective linear definite clause Resolution with negation as failure) - developed by Kakas et al. in [10]. ACLP itself has also been implemented within ECLiPSe, using CPL language. The already mentioned ABox abduction algorithm was formalized in the Java Expert System Shell (JESS). Another subject that we think has a connection with these algorithms is the fibring of logics, a field in which there has been noticeable progress as well. This connection was inspired by the studies by Sernadas et al., presented in the articles [11] and [12]. In theoretical terms, fibring is a meta-logical constructor that joins two logic systems into one. The usefulness of fibring has many strands; we highlight two of them. The first comes from transference results, i.e. the study of the preservation of properties from the original systems. The second comes from the added possibility of mixing symbols of both logic systems within the same formulas, increasing their expressive power. Looking at the big picture, we can divide fibring into two problems. The first corresponds to homogeneous fibring, i.e. the combination of logic systems presented in the same manner, using for example Hilbert calculi. This side of the issue has been explored with significant depth, including in the articles [13], [14], [15] and [16].
On the other hand, the second issue, corresponding to heterogeneous fibring, has received less attention. As the name implies, it consists of the combination of logic systems presented in different manners. For example, it can be applied in the case where one system is a Hilbert calculus and the other is a sequent calculus. This issue was addressed in [11]. In that article, the logic systems adopted - consequence systems and proof systems - are purposely very generic in order to facilitate the process of fibring of logics. Several results are shown, including some transference results within the fibring

1 Π3^P = co-NP(co-NP(co-NP(P))). It symbolizes the set of decision problems for which a negative response can be obtained by a non-deterministic Turing machine in polynomial time, using an oracle of class co-NP(co-NP(P)).

of proof systems. One of the results achieved was abductability within proof systems. It portrays the idea that it is possible to define an abduction function - a map that, given a derivation d from a set Γ of formulas to a formula ϕ, returns a subset of Γ which also makes the derivation d valid - for a certain type of proof systems. Moreover, it was shown that if two proof systems have an abduction function, then their fibring (which was proven to also be a proof system) has an abduction function as well, dependent on the original maps. Taking this result into consideration, one can easily conclude that these functions play relevant roles not only in the computation of the abduction function for the fibred system, but also in the definition of fibring itself. Thus, it would be interesting to see whether we could automate the process of computing an abduction function, in order to connect the theoretical results with the practical side. By looking at both of these topics - abduction algorithms and fibring - one can easily perceive that they are somehow connected, mainly due to the concept of abduction function. Since they both relate to relevant present-day issues, it would be advantageous to study how this connection could be made.

1.2 Goals and Achievements

Throughout the background search, we registered several recurring setbacks with abduction algorithms. In particular, we noticed the following:

• the frameworks used encompass several different types of logic, from Propositional Logic to Modal and Description Logic. However, when we turn to First-Order Logic (including quantified formulas), the options are limited in terms of the input accepted; the most relevant result besides the ABox algorithm for Description Logic, which is a subset of First-Order Logic, is the method developed by Mayer et al. in [17]. Their method can be applied to First-Order Logic using a sequent or tableau calculus based logic system. On many occasions, a high number of conditions are required to be satisfied. Hence, we could try to devise an algorithm that accepts First-Order Logic formulas with fewer restrictions;

• even though some of the algorithms constructed, like the one explored in [18] by Peng et al., have practical applications, the pool of algorithms that effectively reflect real-life events is still limited. Thus, we should try to implement this algorithm based on a framework that allows for a better representation of these events.

With this in mind, and taking into consideration the background study, we want to develop an abduction algorithm for First-Order Logic formulas that computes hypotheses using a Resolution mechanism, following the guiding principle of Occam's Razor in terms of selection criteria. Moreover, we want to implement this algorithm in a simple computing system, in order to facilitate the user-computer interaction. In terms of complexity, due to the studies already referenced, in particular [5], we expect high values for both time and space. We will try to balance these parameters, but our main focus is the increase of the expressive power of the formulas accepted. Considering proof systems and their fibring, we want to connect this algorithm to the definition of an abduction function for proof systems, with a focus on the First-Order Logic case, since this is the scope of the algorithm proposed. Moreover, in order to maintain a uniform view of both subjects throughout this work, we adopt the logic systems described in [11], including some based on Hilbert calculi, even in the discussion of the theoretical concepts behind the computational procedures. Notice that despite introducing notions related to heterogeneous fibring, it is not our intent to develop the theoretical aspects of this field of study. Instead, we simply want to automate one particular case - abduction functions

for proof systems in First-Order Logic. The definitions introduced can thus be seen as an example of a situation where this automation can be useful. To summarize, the main achievements of this thesis are the following:

• translation of the generic theoretical notions of consequence systems and of proof systems, presented in [11], into more concrete examples, in order to simplify the representation of systems based on Propositional Logic and First-Order Logic and to achieve a better understanding of the formulas in play;

• adaptation of the proofs of refutation soundness and completeness of the Resolution principle and of the extended Resolution principle, using notions related to proof systems;

• implementation of a translation mechanism in Mathematica, that translates input formulas in First- Order Logic into formulas that can be processed;

• implementation of a Resolution mechanism in Mathematica for First-Order Logic formulas;

• development and implementation of an abduction algorithm in Mathematica for First-Order Logic formulas;

• introduction of a relation between the concept of abduction function, presented in [11], and the abduction algorithm developed, which in practice simplifies the process of fibring of proof systems.

1.3 Literature Review

The main bibliographic source for the theoretical concepts of consequence system, proof system and fibring was the article by Sernadas et al. [11]. More specific formalizations of these notions, taken from the article by Sernadas et al. [12], were also used as inspiration. The definition of Resolution and the concrete unification algorithm were based on the article by Genesereth et al. [8]. The background algorithm - Peirce's algorithm - was introduced by Rodrigues et al. [9]. The explanations given in this article were useful in the sense that they gave us a clear picture of how the extended algorithm should work. Even though we use a different framework, it is worth mentioning the influence of the procedure developed by Klarman et al. [2], which takes advantage of tableau calculus. Namely, this routine also uses the Resolution principle as a means to formulate hypotheses.

1.4 Outline

In Chapter 2, we present the concepts of consequence system and of proof system, as tuples made up of concrete sets and relations. In particular, these sets include a signature (a family of sets whose elements we call constructors) and, in the case of consequence systems, an extensive, monotonic, idempotent and closed for renaming substitutions map. For proof systems, besides a similar signature set, a set of derivations, a map that composes said derivations and a family of relations connecting a set of formulas, a derivation and a conclusion are also required. We also provide some examples so that these ideas become clearer, and methods of induced construction for these systems, i.e. mechanisms through which we can obtain a proof system or a consequence system based on the signature and rules of a Hilbert calculus, or even based on other consequence systems or proof systems. We introduce some notions related to the interpretation of formulas within the systems. Finally, we describe the connections between all of these systems.

Chapter 3 introduces the problem of abduction. We formalize the idea of an abduction algorithm, present some examples of applications related to this idea and discuss the expressive power of the formulas used, comparing it to the previously developed algorithm that most relates to our problem.
In Chapter 4, we describe Peirce's algorithm, which is used as the background for the development of the new algorithm. We present its pseudocode and analyse it. This analysis starts with a study of the input framework, of the idea behind the processes used and of the selection criterion chosen. Then, it progresses into the discussion of the Resolution principle, which includes the proof of refutation soundness and completeness. To finalize the chapter, we study the algorithm as a whole, i.e. how it produces an output and why it is correct, while also mentioning its limitations.
Chapter 5 is reserved for the development and implementation of Peirce's FOL algorithm, an abduction algorithm that accepts frameworks with First-Order Logic formulas. Similarly to the previous chapter, we study the theoretical background, but this time extending the concepts to First-Order Logic. In particular, we present the mechanisms required, including translations, a unification algorithm and the extended Resolution principle. We show its refutation soundness and completeness, and discuss its decidability. Some optimizations are introduced, as well as an explanation of why this algorithm produces correct answers (in theory). The implementation of the procedures, including their pseudocode and some problems that emerged, is left for the final part of this chapter.
The analysis of the results obtained is detailed in Chapter 6. In it, we present some results regarding time complexity (and, partly, space complexity), the output obtained, and the relations between the solutions and the input framework. Chapter 7 provides an insight into other applications of Peirce's FOL algorithm.
Namely, we relate this algorithm to the problem of determining an abduction function for a proof system. The notion of fibring of proof systems, as well as some properties of these fibred systems, are presented. We finish the chapter by relating our algorithm to the simplification of fibring and other underlying procedures, which can be solved indirectly by some of the programs developed. The summary of the achievements and suggestions for future work are left for Chapter 8. Appendix A contains some other examples of real-life applications, while in Appendix B, important logic definitions are presented.

Chapter 2

Basic Notions

In this chapter, we'll introduce the basic notions concerning consequence systems and proof systems. We'll also present some examples, with which we will work in the next chapters.

2.1 Consequence Systems

We start by introducing a significant amount of important definitions, which will help us to understand the idea behind our goals. Most of them were extracted from [11], [12], [19] and [20], and are presented here with a few changes in terms of notation. The notation will be adjusted along the thesis, and we will mention when and why those changes occur.

Definition 1. (Signature)

A signature C is a family of sets Ck indexed by k ∈ N.

Definition 2. (Constructors)

The elements of Ck are called constructors or connectives of arity k.

Definition 3. (Included Signature)
We say that a signature C is included in another signature C′, denoted by C ⊆ C′, if Ck ⊆ C′k for every k ∈ N.

Definition 4. (Language and Formulas) Let L(C, Ξ) be the free algebra over C generated by the set of schema variables

Ξ = {ψn | n ∈ N}, i.e.

• c ∈ L(C, Ξ) if c ∈ C0;

• ψn ∈ L(C, Ξ) if ψn ∈ Ξ;

• c(ϕ1, ..., ϕk) ∈ L(C, Ξ) if c ∈ Ck and ϕ1, ..., ϕk ∈ L(C, Ξ).

We call L(C, ∅) = L(C) the language and its elements formulas.

As we will see further ahead, the schema variables are used to build schematic rules, which in turn can be used to develop schematic derivations. At any of these stages, the schema variables can be substituted by formulas. When this happens, we drop the qualifier "schematic" from the concepts of schematic rules and schematic derivations.
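To make Definition 4 concrete, the following sketch (not part of the thesis; the signature `SIG`, `SCHEMA_VARS` and all other names are illustrative assumptions) models formulas as nested tuples and checks membership in L(C, Ξ) clause by clause:

```python
# Hypothetical sketch: formulas over a signature C as nested tuples.
# The signature maps each arity k to its set of constructors.
SIG = {0: {"p", "q", "bot"}, 1: {"not"}, 2: {"and", "or", "implies"}}
SCHEMA_VARS = {"psi1", "psi2"}  # a finite stand-in for the set Xi

def in_language(phi, sig, xi=frozenset()):
    """Check phi in L(C, Xi), following the three clauses of Definition 4."""
    if isinstance(phi, str):
        # c in L(C, Xi) if c is a 0-ary constructor; psi_n if it is in Xi
        return phi in sig.get(0, set()) or phi in xi
    head, *args = phi
    # c(phi_1, ..., phi_k) needs c of arity k and each phi_i in L(C, Xi)
    return head in sig.get(len(args), set()) and all(
        in_language(a, sig, xi) for a in args)

print(in_language(("and", "p", ("not", "q")), SIG))   # True: in L(C)
print(in_language(("implies", "p", "psi1"), SIG))     # False: psi1 not in L(C)
print(in_language(("implies", "p", "psi1"), SIG, frozenset(SCHEMA_VARS)))  # True
```

The last two calls illustrate the difference between the language L(C) = L(C, ∅) and the schematic language L(C, Ξ).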

Example 1. We can present the signature and language of Propositional Logic, denoted by PROP, using the previous definitions. In particular, CPROP is defined as follows:

• C0^PROP = Prop ∪ {⊥, ⊤};

• C1^PROP = {¬};

• C2^PROP = {⇒, ∧, ∨};

• Ck^PROP = ∅, for k > 2;
where Prop is the set of propositional variables, denoted by indexed lower case letters of the form ci, where i ∈ N. The language L(CPROP) is the free algebra over CPROP generated by ∅.

There are two important concepts that we also need to introduce. They regard two types of ”basic” formulas, and will be used throughout the thesis.

Definition 5. (PROP Atom and Literal)
A PROP atom is a formula in C0^PROP. A literal in PROP is an atom (positive literal) or a negation of an atom (negative literal).

Before introducing the next example, we need to define an important concept.

Definition 6. (Set of Terms - Term) The set of terms, denoted by Term, is defined recursively as follows:

• ci ∈ Term if ci ∈ χ;

• xi ∈ Term if xi ∈ C;

• fj(t1, ..., tn) ∈ Term if fj ∈ Fn and ti ∈ Term, for i = 1, ..., n; where:

• χ is a countable infinite set of variables, denoted by ci, where i ∈ N;

• C = F0 is a countable set of constant symbols, denoted by xi, where i ∈ N;

• Fi are countable sets of function symbols, denoted by fj, with a fixed arity i, and where j ∈ N; these symbols operate over terms.
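The recursive clauses of Definition 6 can be mirrored directly in code. The sketch below uses a hypothetical encoding (variables as `("var", i)`, constants as `("const", i)`, and an assumed arity table `FUNCS`), not a fragment of the thesis's implementation:

```python
# Hypothetical encoding of Definition 6: variables as ("var", i),
# constants as ("const", i), applications as (f_name, t_1, ..., t_n).
FUNCS = {"f1": 2, "f2": 1}  # assumed function symbols with fixed arities

def is_term(t, funcs=FUNCS):
    """Check membership in Term by the three recursive clauses."""
    if t[0] in ("var", "const"):      # c_i in chi, or x_i in C = F_0
        return isinstance(t[1], int)
    name, args = t[0], t[1:]
    # f_j(t_1, ..., t_n) with f_j of arity n and each t_i itself a term
    return funcs.get(name) == len(args) and all(is_term(a, funcs) for a in args)

print(is_term(("f1", ("var", 0), ("f2", ("const", 3)))))  # True
print(is_term(("f2", ("var", 0), ("var", 1))))            # False: wrong arity
```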

In First-Order Logic (FOL), we’re also able to define atoms and literals, similar to what was done in PROP.

Definition 7. (FOL Atom and Literal)

A FOL atom is a formula of the form P(t1, ..., tn) or ⊤, where:

• P ∈ Pi, where Pi are countable sets of predicate symbols of arity i ∈ N, denoted by upper-case letters such as P, Q, W;

• ti ∈ Term.

A literal in FOL is an atom (positive literal) or a negation of an atom (negative literal).

For representation reasons, we’ll denote the set of literals in FOL by LIT.

Example 2. We can introduce the signature and language of First-Order Logic, denoted by FOL, using the previous definitions. In particular, CFOL is defined as follows:

• C0^FOL = {⊥, ⊤} ∪ LIT;

• C1^FOL = {¬, ∀, ∃};

• C2^FOL = {⇒, ∧, ∨};

• Ck^FOL = ∅, for k > 2;

where the unary operators ∀ and ∃ can only be applied to variables ci. The language L(CFOL) is the free algebra over CFOL generated by ∅.

Now, we present one of the main notions of this chapter.

Definition 8. (Consequence System)
A consequence system is a tuple C = ⟨C, `⟩, where C is a signature and ` : ℘(L(C)) → ℘(L(C)) is a map that satisfies the following properties:

• Extensivity - Γ ⊆ Γ`;

• Monotonicity - if Γ1 ⊆ Γ2 then Γ1` ⊆ Γ2`;

• Idempotence - (Γ`)` ⊆ Γ`;

• Closure for renaming substitutions - ρ(Γ`) ⊆ (ρ(Γ))` for every renaming substitution ρ, i.e. for every substitution ρ such that ρ(ϕ) ∈ Ξ if ϕ ∈ Ξ;
where Γ` = {ϕ | Γ ` ϕ} is the closure of Γ.
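On a finite toy example, the first three properties of Definition 8 can be checked exhaustively. The closure map below is a stand-in built by forward chaining over two made-up rules (all names are assumptions, used only to illustrate extensivity, monotonicity and idempotence):

```python
from itertools import chain, combinations

# Toy stand-in for the closure map: forward chaining over two rules.
# An illustration of Definition 8's properties, not the thesis's `.
RULES = [({"a"}, "b"), ({"a", "b"}, "c")]   # (premises, conclusion)
LANG = {"a", "b", "c"}

def closure(gamma):
    gamma = set(gamma)
    changed = True
    while changed:  # apply rules until a fixed point is reached
        changed = False
        for prem, concl in RULES:
            if prem <= gamma and concl not in gamma:
                gamma.add(concl)
                changed = True
    return gamma

subsets = [set(s) for s in chain.from_iterable(
    combinations(LANG, r) for r in range(len(LANG) + 1))]
# Extensivity, idempotence and monotonicity hold for every subset pair:
assert all(g <= closure(g) for g in subsets)
assert all(closure(closure(g)) == closure(g) for g in subsets)
assert all(closure(g1) <= closure(g2)
           for g1 in subsets for g2 in subsets if g1 <= g2)
print("extensivity, idempotence and monotonicity verified on the toy example")
```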

One immediate result that we obtain from Definition 8 is the following:

Proposition 1. (Γ`)` = Γ`

We can also analyse the sets within Ξ and L(C), and define the consequence systems accordingly.

Definition 9. (Consequence System’s Categories) Let Π ⊆ Ξ, Γ ⊆ L(C). A consequence system is said to be:

• consistent if ϕ ∉ Π`, for every ϕ ∈ Ξ\Π;

• closed for substitution if σ(Γ`) ⊆ (σ(Γ))`, for any substitution σ: L(C, Ξ) → L(C, Ξ);

• compact or finitary if Γ` = ⋃_{Φ ∈ ℘fin(Γ)} Φ`;

• recursive if Γ` is recursively enumerable whenever Γ is recursive;

• recursively enumerable if Γ` is recursively enumerable whenever Γ is recursively enumerable.

2.2 Hilbert Calculus Induced Consequence Systems

Defining a consequence system may be an arduous task if we base our construction procedure on the relation `, since this relation uses itself in its definition. In order to avoid this issue, we introduce a mechanism to build these systems based on a more concrete object - a Hilbert calculus.

Definition 10. (Hilbert Calculus)
A Hilbert calculus is a pair HIL = ⟨C, R⟩, where C is a signature and R is a set of rules ⟨θ, η⟩, with θ ∪ {η} ⊆ L(C, Ξ), Ξ ≠ ∅ and θ finite.

Definition 11. (Axiom)
If θ = ∅ then the rule ⟨θ, η⟩ is said to be an axiom.

Below, we can find a few examples of Hilbert calculi. In them, ψi are schema variables.

Example 3. (Propositional Logic - PROP)
Propositional Logic can be presented as a Hilbert calculus, with PROP = ⟨CPROP, RPROP⟩, such that
RPROP = {⟨∅, ψ1 ⇒ (ψ2 ⇒ ψ1)⟩, ⟨∅, (ψ1 ⇒ (ψ2 ⇒ ψ3)) ⇒ ((ψ1 ⇒ ψ2) ⇒ (ψ1 ⇒ ψ3))⟩, ⟨∅, ((¬ψ1) ⇒ (¬ψ2)) ⇒ (ψ2 ⇒ ψ1)⟩, ⟨{ψ1, ψ1 ⇒ ψ2}, ψ2⟩}
and such that CPROP is the signature introduced in Example 1.

Example 4. [21] (First-Order Logic - FOL)
First-Order Logic can also be presented as a Hilbert calculus, with FOL = ⟨CFOL, RFOL⟩, such that
RFOL = {⟨∀ci(ψ1 ⇒ ψ2), ψ1 ⇒ (∀ciψ2)⟩, ⟨∅, ∀ciψ1(ci) ⇒ ψ1(t)⟩, ⟨ψ1, ∀ciψ1⟩} ∪ RPROP
where:

• fv(ψ1) is the set of free variables in ψ1;

• the relation B will be defined below. If t B ci : ψ1 then we say that t is free for ci in ψ1;

• in the first rule, ci ∉ fv(ψ1);

• in the second rule, t B ci : ψ1(ci);
and such that CFOL is the signature introduced in Example 2.

The definitions of free variables and of the relation B are presented next.

Definition 12. (Free Variables) The map fv : L(CFOL) → ℘(χ) assigning to each formula the set of free variables in it is recursively defined as follows:

• fv(⊥) = fv(>) = ∅;

• fv(P(t1, ..., tn)) = var(t1) ∪ ... ∪ var(tn);

• fv(¬ψ) = fv(ψ);

• fv(ψ1 ⇒ ψ2) = fv(ψ1) ∪ fv(ψ2);

• fv(∀ciψ) = fv(ψ)\{ci};
where var is the map assigning to each term the set of variables occurring in it.

Definition 13. (Free Term for a Variable in a Formula)
The relation B ⊂ Term × χ × L(CFOL) is defined recursively as follows:

• t B ci : ⊥;

• t B ci : >;

• t B ci : P(t1, ..., tn), where P ∈ Pn and t1, ..., tn ∈ Term;

• t B ci :(¬ψ) if t B ci : ψ;

• t B ci :(ψ1 ⇒ ψ2) if t B ci : ψ1 and t B ci : ψ2;

• t B c1 : ∀c2ψ if either:

– c2 is c1;

– t B c1 : ψ and, if c1 ∈ fv(ψ), then c2 ∉ var(t).
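The relation B of Definition 13 (t is free for ci in ψ) can likewise be sketched as a recursive predicate. Under the same kind of hypothetical tuple encoding (terms as in Definition 6, formulas as tagged tuples; all names are assumptions), a term is free for a variable when no quantifier captures its variables:

```python
# Hypothetical sketch of Definition 13; self-contained helper definitions.
def variables(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    return set().union(set(), *(variables(a) for a in t[1:]))

def fv(phi):
    tag = phi[0]
    if tag in ("bot", "top"):
        return set()
    if tag == "not":
        return fv(phi[1])
    if tag in ("implies", "and", "or"):
        return fv(phi[1]) | fv(phi[2])
    if tag == "forall":
        return fv(phi[2]) - {phi[1]}
    return set().union(set(), *(variables(t) for t in phi[1:]))

def free_for(t, ci, phi):
    """t B ci : phi - substituting t for ci causes no variable capture."""
    tag = phi[0]
    if tag == "not":
        return free_for(t, ci, phi[1])
    if tag in ("implies", "and", "or"):
        return free_for(t, ci, phi[1]) and free_for(t, ci, phi[2])
    if tag == "forall":
        c2, body = phi[1], phi[2]
        if c2 == ci:
            return True
        return free_for(t, ci, body) and (
            ci not in fv(body) or c2 not in variables(t))
    return True  # bot, top and atoms: unconditionally free

# Substituting the variable c1 for c0 under "forall c1" would capture it:
print(free_for(("var", 1), 0, ("forall", 1, ("P", ("var", 0)))))  # False
```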

Notice that in Example 4, two of the rules require some conditions to be verified. However, the map fv is only defined on L(CFOL), which does not include schema variables. The relation B suffers from a similar problem. Thus, these provisos are only checked after a substitution from the schema variables into formulas is applied. As we have seen, a consequence system is composed of a signature C and a relation `. We have presented two important examples of Hilbert calculi. As one may conjecture, the signature of the induced consequence systems will be the same as the one used in these calculi. Hence, we only need to introduce a consequence relation. With that in mind, consider the following definition.

Definition 14. (Hilbert-derived Formula) Given a Hilbert calculus HIL = ⟨C, R⟩, we say that the formula ϕ is Hilbert-derived from the set of formulas Γ, denoted by Γ ⊢H ϕ, iff there is a finite sequence ϕ1, ..., ϕn of formulas, called a Hilbert derivation, such that:

• ϕn is ϕ;

• for each i = 1, ..., n, either:

– ϕi ∈ Γ; or

– there exists a rule ⟨θ, η⟩ ∈ R and a substitution σ such that ϕi = σ(η) and σ(θ) ⊆ {ϕ1, ..., ϕi−1}.

Notice that the rules introduced in Example 3 and Example 4 use schema variables. Thus, to obtain a Hilbert derivation, we need to apply a substitution σ : Ξ → L(C, ∅) to the rules. This step corresponds to the last item of Definition 14. The Hilbert calculi introduced can thus be used to construct a consequence system in a simpler manner.
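For rules that have already been instantiated (i.e. are schema-free), Definition 14 can be checked mechanically. The sketch below is an illustrative Python rendering with our own encoding, not the thesis's implementation.

```python
def is_hilbert_derivation(seq, gamma, rules):
    """Check Definition 14 for ground rules: every formula in seq is either in
    gamma or the conclusion of a rule (premises, conclusion) whose premises all
    occur earlier in the sequence; the last formula of seq is the one derived."""
    if not seq:
        return False
    for i, phi in enumerate(seq):
        if phi in gamma:
            continue
        if not any(phi == concl and all(p in seq[:i] for p in prems)
                   for prems, concl in rules):
            return False
    return True

# A modus ponens instance for concrete formulas A and A ⇒ B (our own encoding):
mp = [(['A', ('imp', 'A', 'B')], 'B')]
gamma = {'A', ('imp', 'A', 'B')}
print(is_hilbert_derivation(['A', ('imp', 'A', 'B'), 'B'], gamma, mp))  # True
print(is_hilbert_derivation(['B'], gamma, mp))                          # False
```

The second call fails because the rule's premises do not occur before B in the sequence, exactly as the definition requires.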

Proposition 2. A Hilbert calculus HIL induces a compact and closed-for-substitution consequence system ⟨C, ⊢H⟩ such that Γ⊢H = {ϕ | Γ ⊢H ϕ}.

Example 5. The Hilbert calculus PROP = ⟨CPROP, RPROP⟩ induces a compact and closed-for-substitution consequence system C⊢PROP = ⟨CPROP, ⊢PROP⟩.

Example 6. The Hilbert calculus FOL = ⟨CFOL, RFOL⟩ induces a compact and closed-for-substitution consequence system C⊢FOL = ⟨CFOL, ⊢FOL⟩.

2.3 Other Consequence Systems

As seen in the definition of consequence system, we can use a plethora of consequence relations. In our work, the relevant relations will be ⊨PROP and ⊨FOL, since they portray connections between sets of formulas. In particular, they will allow us to determine the truth value of formulas in PROP and FOL. Hence, we introduce them next. The maps, together with their properties, were adopted from [21], [22] and [23].

Definition 15. (Valuation) A valuation is a function v : Prop → {⊥, ⊤} that assigns truth values to propositional variables.

Definition 16. (Interpretation Function) Given a valuation v, the interpretation function ⟦·⟧v : L(CPROP) → {⊥, ⊤} is defined recursively as follows:

• ⟦⊥⟧v = ⊥;

• ⟦⊤⟧v = ⊤;

• ⟦ci⟧v = v(ci);

• ⟦¬ϕ⟧v = ¬⟦ϕ⟧v;

• ⟦ϕ1 ∧ ϕ2⟧v = ⟦ϕ1⟧v ∧ ⟦ϕ2⟧v;

• ⟦ϕ1 ∨ ϕ2⟧v = ⟦ϕ1⟧v ∨ ⟦ϕ2⟧v;

• ⟦ϕ1 ⇒ ϕ2⟧v = ¬⟦ϕ1⟧v ∨ ⟦ϕ2⟧v.

Notice that the interpretation function depends on the valuation chosen. Moreover, the valuation only operates over propositional variables. Hence, before applying any of these relations to the formulas, we need to substitute any schema variables present with formulas of the language that do not contain schema variables. As such, the final result will depend on the substitutions made. This was to be expected, since the relations we are presenting attribute a truth value to the formulas, whereas schema variables are only used to build schematic rules and schematic derivations, and carry no inherent truth value. Another important aspect is that the constructors ∨, ∧ and ¬ are applied to the booleans ⊥ and ⊤. As such, we must introduce the relations between all of these elements. Such relations can be obtained from what we call Boolean algebra, first presented by Boole in [24]. We describe this algebra in Appendix B. There is, however, an alternative method to interpret PROP formulas. This method uses structures called proposition models.
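Definition 16 can be sketched directly as a recursive evaluator. The encoding below (tuples for connectives, a dict for the valuation, Python booleans for ⊥ and ⊤) is our own illustrative assumption, not the thesis's Mathematica implementation.

```python
def interp(phi, v):
    """⟦phi⟧v of Definition 16; v maps propositional variables to booleans,
    and formulas are strings or tuples such as ('imp', 'c1', 'c2')."""
    if phi == 'bot':
        return False
    if phi == 'top':
        return True
    if isinstance(phi, str):                 # a propositional variable ci
        return v[phi]
    tag = phi[0]
    if tag == 'not':
        return not interp(phi[1], v)
    if tag == 'and':
        return interp(phi[1], v) and interp(phi[2], v)
    if tag == 'or':
        return interp(phi[1], v) or interp(phi[2], v)
    if tag == 'imp':                          # ¬⟦ϕ1⟧v ∨ ⟦ϕ2⟧v
        return (not interp(phi[1], v)) or interp(phi[2], v)

v = {'c1': True, 'c2': False}
print(interp(('imp', 'c1', 'c2'), v))            # False
print(interp(('or', ('not', 'c1'), 'c1'), v))    # True: an excluded-middle instance
```

The implication clause is written exactly as in the last item of the definition, so the evaluator only ever applies ¬, ∧ and ∨ to boolean values.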

Definition 17. (Proposition Model) A proposition model M is a set of propositional variables.

Definition 18. (Satisfaction Relation for PROP) The satisfaction relation, denoted by ⊨PROP ⊆ ℘(Prop) × L(CPROP), is defined recursively as follows:

• M ⊨PROP ci iff ci ∈ M;

• M ⊨PROP ¬ϕ iff M ⊭PROP ϕ;

• M ⊨PROP ϕ1 ∧ ϕ2 iff M ⊨PROP ϕ1 and M ⊨PROP ϕ2;

• M ⊨PROP ϕ1 ∨ ϕ2 iff M ⊨PROP ϕ1 or M ⊨PROP ϕ2;

• M ⊨PROP ϕ1 ⇒ ϕ2 iff M ⊭PROP ϕ1 or M ⊨PROP ϕ2.

The concepts of proposition model and valuation represent two equivalent types of semantics for PROP. They are connected through the following bijections.

Proposition 3. Let v be a valuation. Then v determines the proposition model Mv = {ci ∈ Prop : v(ci) = ⊤} such that Mv ⊨PROP ψ iff ⟦ψ⟧v = ⊤.

Proposition 4. Let M be a proposition model. Then M determines a valuation v such that:

• v(ci) = ⊤ if ci ∈ M and ci ∈ Prop;

• v(ci) = ⊥ if ci ∉ M and ci ∈ Prop.

This result provides an easy method to produce proposition models that allow us to check the satisfiability of a formula. It can be easily demonstrated using induction over the formula ψ.

Definition 19. (Satisfiable PROP Formula) A formula ϕ is satisfiable if there exists a proposition model M such that M ⊨PROP ϕ.

Definition 20. (Valid PROP Formula) A formula ϕ is valid, denoted by ⊨PROP ϕ, iff every proposition model M satisfies ϕ.

Definition 21. (PROP Models of Ψ)

The set of proposition models M in which Ψ is satisfiable is denoted by ModPROP (Ψ).

Definition 22. (Entailed PROP Formula) Let Ψ be a set of formulas, and let ϕ be a formula. We say that Ψ entails ϕ, denoted by Ψ ⊨PROP ϕ, if ϕ is satisfied by every proposition model M that satisfies Ψ, i.e. if ModPROP(Ψ) ⊆ ModPROP(ϕ).
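Since PROP formulas mention only finitely many variables, Definitions 18 to 22 suggest a brute-force semantic check: enumerate all proposition models over those variables and compare the model sets. The sketch below uses our own tuple encoding and function names.

```python
from itertools import product

def sat(M, phi):
    """M ⊨PROP phi for a proposition model M, i.e. a set of variables
    (Definition 18); formulas are strings or tuples, as before."""
    if isinstance(phi, str):
        return phi in M
    tag = phi[0]
    if tag == 'not':
        return not sat(M, phi[1])
    if tag == 'and':
        return sat(M, phi[1]) and sat(M, phi[2])
    if tag == 'or':
        return sat(M, phi[1]) or sat(M, phi[2])
    if tag == 'imp':
        return (not sat(M, phi[1])) or sat(M, phi[2])

def mod(formulas, variables):
    """ModPROP: all models over `variables` satisfying every formula."""
    out = []
    for bits in product([False, True], repeat=len(variables)):
        M = {c for c, b in zip(variables, bits) if b}
        if all(sat(M, f) for f in formulas):
            out.append(frozenset(M))
    return out

def entails(psi, phi, variables):
    """Definition 22: Psi ⊨PROP phi iff ModPROP(Psi) ⊆ ModPROP(phi)."""
    return set(mod(psi, variables)) <= set(mod([phi], variables))

V = ['c1', 'c2']
print(entails([('imp', 'c1', 'c2'), 'c1'], 'c2', V))   # True: a modus ponens instance
print(entails(['c1'], 'c2', V))                         # False
```

The enumeration is exponential in the number of variables, which already hints at the complexity concerns raised for abduction algorithms later in the thesis.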

Example 7. Using the relation ⊨PROP, we can define a consequence system for PROP, since this relation satisfies the conditions introduced in Definition 8. This system will be denoted by C⊨PROP = ⟨CPROP, ⊨PROP⟩.

Looking now towards the last consequence relation, we have the following definitions.

Definition 23. (Interpretation Structure) An interpretation structure over CFOL is a triple I = ⟨Dom, ·F, ·P⟩ where:

• Dom is a non-empty set;

• ·F is a map assigning to each function symbol a function between elements of the domain;

• ·P is a map assigning to each predicate symbol a set of tuples of elements of the domain, with the same arity as the symbol.

Definition 24. (Assignment) An assignment into I is a map µ : χ → Dom.

Definition 25. (Y -equivalent)

Let Y ⊆ χ. The assignments µ1 and µ2 into I are said to be Y-equivalent if µ1(ci) = µ2(ci) for each ci ∉ Y.

Similar to the case of PROP, these relations are not meant to be applied to schema variables, since they carry no truth value. As such, before applying these relations, every occurrence of these variables within the formulas must be substituted by elements of the language that do not contain schema variables. Continuing with the presentation of the consequence relation, consider the following definitions.

Definition 26. (Term Denotation Map) Given an interpretation structure I over CFOL and an assignment µ into I, the term denotation map

⟦·⟧Iµ : Term → Dom

is recursively defined as follows:

• ⟦ci⟧Iµ = µ(ci), if ci ∈ χ;

• ⟦xi⟧Iµ = xiF, if xi is a constant symbol;

• ⟦fj(t1, ..., tn)⟧Iµ = fjF(⟦t1⟧Iµ, ..., ⟦tn⟧Iµ), if fj ∈ Fn and ti ∈ Term.

We can finally approach some very important notions regarding satisfiability and validity. They will take advantage of the maps defined above.

Definition 27. (Local Satisfaction) The relation of local satisfaction of a formula by an interpretation structure I and an assignment µ into I is defined recursively as follows:

• I, µ ⊨FOL ⊤;

• I, µ ⊭FOL ⊥;

• I, µ ⊨FOL P(t1, ..., tn) iff PP(⟦t1⟧Iµ, ..., ⟦tn⟧Iµ) = ⊤;

• I, µ ⊨FOL (¬ϕ) iff I, µ ⊭FOL ϕ;

• I, µ ⊨FOL (ϕ1 ⇒ ϕ2) iff I, µ ⊭FOL ϕ1 or I, µ ⊨FOL ϕ2;

• I, µ ⊨FOL ∀ciϕ iff I, µ′ ⊨FOL ϕ for every assignment µ′ ci-equivalent to µ.

The cases referring to the other operators can be easily obtained through the relations between these operators, mentioned in Appendix B.
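To make Definitions 26 and 27 concrete, here is a hypothetical finite interpretation structure with Dom = {0, 1}, one unary function symbol f (interpreted as "flip") and one unary predicate symbol P (true exactly of 1). The tuple encoding and all names are our own illustrative choices; over an infinite domain the ∀ clause would of course not be directly executable.

```python
Dom = [0, 1]
F = {'f': lambda a: 1 - a}     # ·F: interpretation of function symbols
P = {'P': {(1,)}}              # ·P: interpretation of predicate symbols

def denote(t, mu):
    """Term denotation map of Definition 26, for assignment mu."""
    if isinstance(t, str):
        return mu[t]                               # a variable ci
    return F[t[0]](*[denote(a, mu) for a in t[1:]])

def holds(phi, mu):
    """Local satisfaction I, mu ⊨FOL phi (Definition 27)."""
    tag = phi[0]
    if tag == 'pred':                              # ('pred', 'P', t1, ..., tn)
        return tuple(denote(t, mu) for t in phi[2:]) in P[phi[1]]
    if tag == 'not':
        return not holds(phi[1], mu)
    if tag == 'imp':
        return (not holds(phi[1], mu)) or holds(phi[2], mu)
    if tag == 'forall':    # range over all ci-equivalent assignments
        return all(holds(phi[2], {**mu, phi[1]: d}) for d in Dom)

mu = {'x': 0}
print(denote(('f', 'x'), mu))                                        # 1
print(holds(('forall', 'x', ('pred', 'P', 'x')), mu))                # False: P fails at 0
print(holds(('forall', 'x',
             ('imp', ('pred', 'P', 'x'), ('pred', 'P', 'x'))), mu))  # True
```

The ∀ clause modifies the assignment only at the quantified variable, which is exactly what ci-equivalence of assignments captures.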

Definition 28. (Global Satisfaction) The relation of global satisfaction of a formula by an interpretation structure I is defined as follows: I is said to satisfy ϕ, denoted by I ⊨FOL ϕ, if I, µ ⊨FOL ϕ for every assignment µ into I.

Definition 29. (Valid FOL Formula) A formula ϕ is said to be valid, denoted by ⊨FOL ϕ, if I ⊨FOL ϕ for every interpretation structure I over the signature CFOL.

Definition 30. (Entailed Formula) A formula ϕ over CFOL is entailed by a set Γ of formulas over CFOL, denoted by Γ ⊨FOL ϕ, if, for every interpretation structure I over the signature CFOL, I ⊨FOL ϕ whenever I ⊨FOL γ for every γ ∈ Γ.

Definition 31. (FOL Models of Ψ)

The set of interpretation structures I that satisfy Ψ is denoted by ModFOL(Ψ).

Before producing a consequence system similar to that of Example 7, we need an extra definition.

Definition 32. (Sensible to Renaming Interpretation Structures)

An interpretation structure I ∈ ModFOL (the set of all FOL models) is said to be sensible to renaming if for each renaming substitution ρ there is a map βρ : ModFOL → ModFOL such that I ⊨FOL ρ(ϕ) iff βρ(I) ⊨FOL ϕ.

Example 8. Using the relation ⊨FOL and only sensible-to-renaming interpretation structures, we can define a consequence system for FOL, since this relation satisfies the conditions introduced in Definition 8. This system will be denoted by C⊨FOL = ⟨CFOL, ⊨FOL⟩.

At last, it is worth mentioning that the relations ⊨PROP and ⊨FOL are directly related to the one introduced in Definition 14. As such, the consequence systems of Example 5 and Example 6 are also connected to the consequence systems of Example 7 and Example 8, respectively. These links are made evident in the following propositions.

Proposition 5. (Soundness and Completeness of PROP) The consequence systems C⊨PROP and C⊢PROP are equivalent, i.e. Γ ⊨PROP ϕ iff Γ ⊢PROP ϕ.

Proposition 6. (Soundness and Completeness of FOL) The consequence systems C⊨FOL and C⊢FOL are equivalent, i.e. Γ ⊨FOL ϕ iff Γ ⊢FOL ϕ.

With these definitions, we have properly materialized the concepts needed to understand the ideas behind the algorithms presented ahead.

2.4 Proof Systems

The next step consists in introducing the notion of proof systems, as well as providing a few examples of these systems. They are more abstract than consequence systems, but they still retain the constructive nature of derivations, which is a big advantage.

Definition 33. (Relation) Given two sets A, B, a relation is a map R : A × B → {0, 1}.

Definition 34. (Proof System) A proof system is a tuple P = ⟨C, D, ◦, P⟩, where C is a signature, D is the set of possible derivations, ◦ : ℘(D) × D → D is a map, and P = {PΓ}Γ⊆L(C) is a family of relations PΓ ⊆ D × L(C) satisfying the following properties:

• PΓ(d, ϕ) = 1 if d is a derivation of ϕ from Γ, i.e. an ordered sequence of formulas, where a formula is entailed by a set of previous ones;

• PΓ(E, Ψ) = 1 if ∀ϕ ∈ Ψ∃e ∈ E : PΓ(e, ϕ) = 1;

• Right reflexivity - PΓ(D, Γ) = 1, ∀Γ ⊆ L(C);

• Monotonicity - PΓ1 ≤ PΓ2 , ∀Γ1 ⊆ Γ2 ⊆ L(C);

• Compositionality -

– ∅ ◦ d = d, ∀d ∈ D;

– if E ⊆ D and ∃Ψ ∈ ℘(L(C)) : PΓ(E, Ψ) = 1 ∧ PΨ(d, ϕ) = 1, then PΓ(E ◦ d, ϕ) = 1;

• Variable exchange - PΓ(D, ϕ) = Pρ(Γ)(D, ρ(ϕ)), for all renaming substitutions ρ;

• Falsehood - PΓ(∅, ϕ) = 0, ∀ϕ ∈ L(C);

• Monotonicity on the first argument - PΓ(E1, Ψ) ≤ PΓ(E2, Ψ), ∀E1 ⊆ E2;

• Anti-monotonicity on the second argument - PΓ(E, Ψ1) ≤ PΓ(E, Ψ2), ∀Ψ2 ⊆ Ψ1;

• Union - PΓ(E, Ψ1 ∪ Ψ2) = PΓ(E, Ψ1) × PΓ(E, Ψ2).

Notice that the last four conditions can be obtained from the first ones. Similar to what happened with consequence systems, proof systems can also be classified according to the sets within Ξ and L(C).

Definition 35. (Proof System’s Categories) Let Π ⊆ Ξ, Γ ⊆ L(C). A proof system is said to be:

• consistent if PΠ(D, ϕ) = 0, where ϕ ∈ Ξ\Π;

• compact if, for every Γ and ϕ, there is a finite Φ ⊆ Γ such that PΓ(D, ϕ) ≤ PΦ(D, ϕ);

• decidable if PΓ is decidable for each decidable set Γ ⊆ L(C);

• closed for substitutions if PΓ(D, ϕ) = Pσ(Γ)(D, σ(ϕ)), for all substitutions σ :Ξ → L(C, Ξ);

• recursively enumerable if PΓ is recursively enumerable for each recursively enumerable set Γ ⊆ L(C).

2.5 Hilbert Calculus Induced Proof Systems

Similar to the previous case, we can also build proof systems using a Hilbert calculus like PROP or FOL, obtaining P(PROP) and P(FOL). The following proposition exemplifies how these systems are obtained.

Proposition 7. A Hilbert calculus HIL = ⟨C, R⟩ induces a compact proof system P(HIL) = ⟨C, D, ◦, P⟩ such that:

• D = L(C)∗;

• E ◦ d = dEπ(E);

• PΓ(d, ϕ) = 1 iff d is a Hilbert derivation of ϕ from Γ;

where π(e) denotes the last element of a sequence e ∈ L(C)∗, π(E) = {π(e) : e ∈ E} for E ⊆ L(C)∗, and dEπ(E) is the sequence obtained by replacing in d ∈ L(C)∗ each element of π(E) by the corresponding sequence e ∈ E.

2.6 Relations between Proof Systems and Consequence Systems

Now that we have introduced the notions of consequence system and proof system, we notice that their compositions are somewhat similar. Both possess a signature and a connection between sets of formulas within the signature, either through a consequence relation or through a set of derivations and a family of relations with certain properties. As such, we can try to create bridges between these concepts. We present some of them in the following propositions.

Proposition 8. A proof system P = ⟨C, D, ◦, P⟩ induces a consequence system C(P) = ⟨C, ⊢⟩ where Γ⊢ = {ϕ ∈ L(C) : PΓ(D, ϕ) = 1}.

Proof. Once again, we need to show that ⊢ satisfies the conditions established in Definition 8.

• Extensivity - Since PΓ satisfies the right reflexivity property, the result follows;

• Monotonicity - Suppose that Γ1 ⊆ Γ2 and that ϕ ∈ Γ1⊢. Then PΓ1(D, ϕ) = 1. By the monotonicity of P, PΓ1(D, ϕ) ≤ PΓ2(D, ϕ). Hence, PΓ2(D, ϕ) = 1, and so ϕ ∈ Γ2⊢;

• Idempotence - Suppose that ϕ ∈ (Γ⊢)⊢. Then there is d ∈ D such that PΓ⊢(d, ϕ) = 1. On the other hand, there is E ⊆ D such that PΓ(E, Γ⊢) = 1. By compositionality in P, PΓ(E ◦ d, ϕ) = 1, and hence ϕ ∈ Γ⊢;

• Closure for renaming substitutions - Suppose that ρ is a renaming substitution and ϕ ∈ Γ⊢. Then PΓ(D, ϕ) = 1. By variable exchange in P, Pρ(Γ)(D, ρ(ϕ)) = 1, and so ρ(ϕ) ∈ (ρ(Γ))⊢.

Proposition 9. Let Calc be a Hilbert calculus. Then C(P(Calc)) = C(Calc).

Proof. We know that C(P(Calc)) and C(Calc) share the same signature. Hence, we need to show that the closure of a set Γ ⊆ L(C) is the same in both cases. Let C(P(Calc)) = ⟨C, ⊢1⟩ and C(Calc) = ⟨C, ⊢2⟩.

Then ϕ ∈ Γ⊢1 iff PΓ(D, ϕ) = 1 in P(Calc) iff there is a Calc-derivation of ϕ from Γ in D iff ϕ ∈ Γ⊢2.

Proposition 10. A consequence system C = ⟨C, ⊢⟩ induces a proof system P(C) = ⟨C, D, ◦, P⟩ such that:

• D = {∗};

• E ◦ ∗ = ∗;

• PΓ(∗, ϕ) = 1 iff ϕ ∈ Γ⊢.

Proof. We need to show that it satisfies the conditions required to be a proof system. With that in mind, consider:

• Right reflexivity - Since ⊢ is extensive, Γ ⊆ Γ⊢ for all Γ ⊆ L(C). Hence, PΓ(D, Γ) = 1;

• Monotonicity - Suppose that Γ1 ⊆ Γ2 and PΓ1(D, ϕ) = 1. Then ϕ ∈ Γ1⊢. By monotonicity of C, Γ1⊢ ⊆ Γ2⊢. Hence, ϕ ∈ Γ2⊢, and so PΓ2(D, ϕ) = 1;

• Compositionality - Suppose that PΓ(E, Ψ) = 1 and PΨ(d, ϕ) = 1. Then Ψ ⊆ Γ⊢ and ϕ ∈ Ψ⊢. By monotonicity of C, ϕ ∈ (Γ⊢)⊢ and so, by idempotence of ⊢, ϕ ∈ Γ⊢. Hence, PΓ(E ◦ d, ϕ) = 1;

• Variable exchange - Assume that ρ is a renaming substitution and that PΓ(D, ϕ) = 1. Then ϕ ∈ Γ⊢. Hence ρ(ϕ) ∈ (ρ(Γ))⊢, and so Pρ(Γ)(D, ρ(ϕ)) = 1.

These propositions will play important roles in our thesis. First, they allow us to connect the two basic concepts - consequence systems and proof systems - as well as Hilbert calculus induced systems, quite freely. Second, they provide a method to construct proof systems based on consequence relations, which greatly simplifies some proofs introduced further ahead.

Chapter 3

Abduction

Now that we have introduced some basic theoretical concepts, we move on to the main aspect of this thesis - the development of an abduction algorithm. In this chapter, we introduce the concept of abduction algorithm, present a few examples of applications, and discuss some problems regarding the expressive power of the formulas used. As mentioned before, abduction plays an important role within various fields of study. Depending on the field, the abduction problem can be interpreted in different ways. In general terms, abduction algorithms create hypotheses that justify a set of facts, based on a theory. We formalize this notion in the next definition.

Definition 36. (Abduction Algorithm) An abduction algorithm receives as input a theory and a set of observed facts, and formulates a set of hypotheses that justify the highest number of facts.

To achieve a better understanding of how this notion can be used, consider the next examples.

Example 9. Assume the following scenario. John catches a bus. There are four alternatives: bus 1, bus 2, bus 3 and bus 4. If he takes bus 1, he reaches place A. On the other hand, buses 2, 3 and 4 take him to place B. On this particular occasion, he catches the bus on a Sunday. We know that on this particular weekend there is a marathon, and so bus 2 is not operating. John has reached place B. We want to determine the bus that John took to get to that place. By formalizing these ideas into a certain reasoning framework containing a theory, in which we can include a relation between the hypotheses (i.e. the buses that John could catch) and the conclusions (i.e. the places that John could reach depending on the bus), as well as some conditions (for example, the fact that bus 2 is not operating), we can use an abduction algorithm to discover the bus that John took, with a certain level of uncertainty.

Example 10. Consider the next situation. Mary is sick. She goes to the doctor and explains her symptoms. She has a cough, a fever and a small rash. The doctor, wanting to discover the disease she has, uses the knowledge he has of diseases. He knows that the Flu always causes a fever and sometimes causes a cough, that Smallpox always causes a rash but never a fever, and that a certain tropical disease may cause a rash, which in turn may lead to the Flu. By translating these notions into formulas in FOL, we can once again define a reasoning framework in which we can include the relations between the hypotheses (i.e. the diseases that exist) and the conclusions (i.e. the symptoms that they may cause and that they always cause). Then, we can use an abduction algorithm to discover the disease that Mary is suffering from, with a certain level of uncertainty. Notice that this routine is able to indicate not only the disease that Mary is suffering from, but also other diseases that can indirectly lead to those symptoms.

Both examples are very simple. However, this procedure can be applied to more complex situations. For example, we could expand the universe of diseases that one may catch, or the number of bus lines that exist. Nowadays, the goals achievable by the majority of the available solvers are very limited when compared to the vastness of problems that could be solved with more efficient and encompassing abduction. This is due to the fact that most of these systems use frameworks that only support a very basic language, as is the case of PROP, or a complex language with many constraints, as discussed in Chapter 1. This is a problem, since these obstacles prevent a proper representation of real-life situations, where on many occasions we face the ideas of totality ("it's always the case that...") and partiality ("sometimes it's the case that..."). Hence, in an attempt to solve this problem, we develop and implement a new, original abduction algorithm that supports a framework composed of formulas in L(CFOL) and, in particular, theories, which consist of finite subsets of formulas in L(CFOL). We have chosen this language as the main focus of our algorithm since it is more capable of translating everyday occurrences into mathematical formulas, and thus could help us move towards the goal of enlarging the universe in which this routine can be utilized.

With the development of this new algorithm in mind, we first analyse an already existing algorithm - Peirce's algorithm - which uses a proof system induced by the consequence system ⟨CPROP, ⊢Res⟩, which we also present. On this subject, we explore and expose the ideas behind this algorithm, as well as the procedures involved. Then, taking them into consideration, together with the concepts introduced so far, we develop and implement our own algorithm, which can be applied to the more complex language already mentioned. Since this new algorithm operates over formulas that are not very simple, as they may include constructors like universal and existential quantifiers, a few optimization ideas must also be implemented in order to tackle the complexity problem. At last, we present other subjects in which this algorithm could be used. In particular, we tie the notion of abduction algorithm to the concept of abduction function, presented in [11], and exemplify how this algorithm can be used within the context of proof systems.

Chapter 4

Peirce’s Algorithm

The procedure chosen to be the backbone of our work was Peirce's algorithm, developed by Rodrigues et al. in [9]. Similar to other abduction algorithms introduced in recent years, this procedure was built upon the Resolution Principle [8], developed by Martin Davis and Hilary Putnam (1960) [25] and later improved by John Robinson (1965) [26]. The development of this principle was very important in the sense that, until that point, there was no clear way of dealing with FOL formulas in terms of proof search mechanisms. Resolution was able to partially fill this gap. Moreover, most proof search mechanisms are implemented in such a manner that an exhaustive production of all possible sets of derivations is needed in order for the procedure of searching for a certain proof, i.e. a derivation from a set of hypotheses to a conclusion, to be successful. Resolution does not require such an expensive course of action. The original algorithm is described below. It was built to operate over sets of formulas in PROP, like c1, c2 ∧ c3, ¬c4, c5 ⇒ c6.

Algorithm 1 Peirce's Algorithm
1: procedure PEIRCE(C, T, F, Exp, Comp)
2:   if Consistent(T ∪ C) === True then
3:     R ← CLA(T, C, ¬F)
4:     R ← Resolution(R)
5:     if R does not contain the empty set then
6:       H ← FormulateCandidateHypotheses(R)
7:       H ← RemoveInconsistentHypotheses(T, C, H)
8:       H ← SelectGoodHypotheses(T, C, H, F, Exp, Comp)
9:       return H
10:    else break
11:    end if
12:  else break
13:  end if
14: end procedure

In order to ease the process of understanding the algorithm, we briefly explain what these procedures do. Consistent determines whether a set of formulas is consistent or not. CLA translates a set of formulas in PROP into clausal form. Resolution applies the Resolution mechanism to a set of formulas in clausal form. The final three programs are very straightforward. The first one constructs the set of candidate hypotheses, i.e. the solutions to our problem. The second one takes that set and removes the elements that are inconsistent with the specificities of the problem. The third one takes that purified set and selects the elements that satisfy certain conditions. Those elements are called good hypotheses.

In this chapter, we provide an overview of the procedures involved in this algorithm, as well as a compilation of explanations of the theoretical concepts behind these routines. We also yield a description of how the procedures work with each other to produce a solution, and why that solution is correct.
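To make the control flow concrete, here is a toy propositional sketch of Algorithm 1 in which candidate hypotheses are restricted to single literals and all helper names mirror the procedures described above. These are our own minimal stand-ins, not the implementation of Rodrigues et al.

```python
# Clauses are frozensets of literal strings; '-c1' stands for ¬c1.
def neg(l):
    return l[1:] if l.startswith('-') else '-' + l

def resolvents(a, b):
    return {frozenset((a - {l}) | (b - {neg(l)})) for l in a if neg(l) in b}

def consistent(clauses):
    """The Consistent check: saturate under Resolution; a clause set is
    inconsistent iff the empty clause is eventually derived."""
    clauses = set(clauses)
    while True:
        new = {r for a in clauses for b in clauses for r in resolvents(a, b)}
        if frozenset() in new:
            return False
        if new <= clauses:
            return True
        clauses |= new

def peirce(T, C, F):
    """Candidate hypotheses: literals h with T ∪ C ∪ {h} consistent (line 7)
    that make the facts derivable, checked by refuting ¬F (lines 3-6)."""
    if not consistent(T | C):                 # line 2
        return None
    lits = {l for c in T | C for l in c} | {neg(l) for c in T | C for l in c}
    goal = {frozenset({neg(f)}) for f in F}
    return {h for h in lits
            if consistent(T | C | {frozenset({h})})
            and not consistent(T | C | {frozenset({h})} | goal)}

# Example 11 (the bus scenario), in clausal form:
T = {frozenset({'-c1', 'c5'}), frozenset({'-c2', 'c6'}),
     frozenset({'-c3', 'c6'}), frozenset({'-c4', 'c6'})}
C = {frozenset({'c7'}), frozenset({'-c7', '-c2'})}
print(sorted(peirce(T, C, {'c6'})))   # ['c3', 'c4', 'c6']: bus 2 is ruled out
```

Note that the trivial self-explanation c6 survives this stage; in the full algorithm, the later selection step (SelectGoodHypotheses) is responsible for pruning such uninformative candidates.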

4.1 Main Ideas and Input

The first step is to understand what kind of structure the original algorithm requires its input to have, and how it produces the desired output. Most abduction algorithms use a THF (theory, hypotheses and facts) reasoning framework. Instead, this algorithm uses a TCHF (theory, conditions, hypotheses and facts) reasoning framework. Moreover, it requires the input sets to contain formulas of certain types.

Definition 37. [27] (HF form) A formula ϕ is in HF form if it’s written in one of the following formats:

• a1 ∧ ... ∧ an, where ai are literals;

• a1 ∨ ... ∨ an, where ai are negative literals;

• (a1 ∧ ... ∧ an) ⇒ (b1 ∧ ... ∧ bm) where ai are positive literals and bj are literals;

• (a1 ∨ ... ∨ an) ⇒ (b1 ∨ ... ∨ bm) where ai are literals and bj are negative literals.

Definition 38. (TCHF Reasoning Framework) A TCHF reasoning framework for abduction is a tuple ⟨T, C, H, F⟩ such that:

• T = {t1, ..., tm} is the theory set, denoting t1 ∧ ... ∧ tm, where ti are formulas in PROP in HF form; it represents the hypotheses that must be assumed as True during the reasoning process and that do not change depending on the situation;

• C = {c1, ..., cp} is the conditions set, denoting c1 ∧ ... ∧ cp, where ci are formulas in PROP in HF form; it represents the accepted conditions that must be assumed as True during the reasoning process and that may change depending on the situation;

• H = {h1, ..., hj} is the hypotheses set denoting h1 ∨ ... ∨ hj, where hi are formulas in PROP; it represents the hypotheses that together with T and C explain the facts represented by F ;

• F = {f1, ..., fq} is the facts set denoting f1 ∧ ... ∧ fq, where fi are positive literals in PROP; it represents the facts that must be explained through abductive reasoning.

The addition of the set C to our reasoning framework has multiple advantages. In a THF framework, if we want to add conditions, we have to include them in the theory set. This jeopardises the generality of that set, making it too particular to the problem in question. With the set C in play, we do not clutter the theory set with extra sentences. Besides this very important benefit, it allows an easier representation of two or more instances of abductive reasoning that share a theory set and a facts set, and it allows the explicit definition of conditions linked to context (space), circumstances (time), intention, belief, faith, among others.

Example 11. Consider the scenario presented in Example 9. The TCHF framework for this problem is the following:

• ci denotes ”John catches bus i”, i = 1, 2, 3, 4;

• c5 denotes ”John reaches place A”;

• c6 denotes ”John reaches place B”;

• c7 denotes "It's Sunday";

• T = {c1 ⇒ c5, c2 ⇒ c6, c3 ⇒ c6, c4 ⇒ c6};

• C = {c7, c7 ⇒ ¬c2};

• F = {c6}.

The idea behind Peirce’s algorithm is thus the following: given the theory set T , a conditions set C and a facts set F , we want to create a hypotheses set H such that

T ∪ C ⊭PROP F (4.1)

T ∪ C ∪ {h} ⊨pPROP F, ∀h ∈ H (4.2)

T ∪ C ∪ {h} ⊭PROP ⊥, ∀h ∈ H (4.3)

The relation ⊨pPROP indicates that not all but at least one of the components of F must be entailed for the relation to hold. The meaning of these equations is very important, and as such is worth mentioning. Equation (4.1) indicates that the theory set and the conditions set do not explain the facts by themselves. Equation (4.2) stipulates that the theory set, the conditions set and any hypothesis together must entail at least one fact. The last equation tells us that any hypothesis must not be inconsistent with the theory and conditions sets.
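Conditions (4.1)-(4.3) can be verified by brute force for the bus scenario of Example 11. The sketch below (one lambda per formula over dict-valued models, hypothesis h = c3) uses our own encoding; since F contains a single fact here, the partial relation of (4.2) coincides with plain entailment.

```python
from itertools import product

VARS = ['c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7']
imp = lambda a, b: (not a) or b
T = [lambda m: imp(m['c1'], m['c5']), lambda m: imp(m['c2'], m['c6']),
     lambda m: imp(m['c3'], m['c6']), lambda m: imp(m['c4'], m['c6'])]
C = [lambda m: m['c7'], lambda m: imp(m['c7'], not m['c2'])]
F = [lambda m: m['c6']]
h = lambda m: m['c3']          # "John catches bus 3"

def entails(premises, conclusions):
    """Truth-table entailment: every model of the premises satisfies
    all the conclusions."""
    for bits in product([False, True], repeat=len(VARS)):
        m = dict(zip(VARS, bits))
        if all(p(m) for p in premises) and not all(c(m) for c in conclusions):
            return False
    return True

print(entails(T + C, F))                            # False: (4.1) holds
print(entails(T + C + [h], F))                      # True:  (4.2) holds for h = c3
print(not entails(T + C + [h], [lambda m: False]))  # True:  (4.3) holds (consistency)
```

The same three checks fail for h = c2, since the conditions set already forces ¬c2, which is exactly why bus 2 is discarded as an explanation.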

4.2 Selection Criterion

In addition to the conditions introduced in the previous section, the algorithm uses a selection criterion in order to choose the best solutions amongst the ones present in the hypotheses set.

Definition 39. (Explanatory Power - EP ) The explanatory power of h ∈ H is the ratio between the number of facts it can explain and the total number of facts.

Definition 40. (Complexity - Comp) The complexity of h ∈ H is the number of atomic propositions that h contains.

Definition 41. (Selection Criterion #1)

A hypothesis h ∈ H is considered to be a good hypothesis if h ∈ argmin h′∈Hmax Comp(h′), where Hmax = argmax h∈H EP(h) is the set of hypotheses with maximum explanatory power, and argmin and argmax are the functions that select the arguments minimizing the value of Comp and maximizing the value of EP, respectively.

In other words, this selection criterion selects the hypotheses which have maximum explanatory power and, from that pool, it selects the ones with minimum complexity. There are other selection criteria for abduction algorithms, but in this case the one chosen was Selection Criterion #1. Notice that this criterion is useful in the sense that, in most situations, we do not need a large number of hypotheses or very long hypotheses. We are only looking for a simple and straightforward answer. This is an improvement over the selection criterion used, for example, in [11], which resulted in redundant and useless hypotheses that have little or nothing to do with the facts set.
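A small illustration of Selection Criterion #1, with an invented hypothesis table (the atoms and the facts each hypothesis explains are made up for the example):

```python
facts = {'f1', 'f2'}

def EP(explained):            # Definition 39: explanatory power
    return len(explained) / len(facts)

def Comp(atoms):              # Definition 40: number of atomic propositions
    return len(atoms)

# hypothesis name -> (atoms it contains, facts it explains)
H = {'h1': ({'a'}, {'f1'}),
     'h2': ({'a', 'b'}, {'f1', 'f2'}),
     'h3': ({'a', 'b', 'c'}, {'f1', 'f2'})}

best_ep = max(EP(e) for _, e in H.values())
pool = {h: atoms for h, (atoms, e) in H.items() if EP(e) == best_ep}
min_comp = min(Comp(a) for a in pool.values())
good = sorted(h for h, a in pool.items() if Comp(a) == min_comp)
print(good)   # ['h2']: maximal explanatory power, then minimal complexity
```

h1 is dropped for explaining only half the facts, and h3 for being more complex than h2 at the same explanatory power, which is exactly the two-stage filtering the criterion describes.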

4.3 Resolution Principle for PROP Formulas

The final step before explaining how this algorithm works is to introduce a pivotal principle, already mentioned above. This principle will allow us to check consistency (line 2 of Algorithm 1) and build the hypotheses set (lines 4 to 8 of Algorithm 1).

Definition 42. (Resolution Principle) The Resolution Principle consists in applying the Resolution rule RES

{ϕ, Ψ1}    {¬ϕ, Ψ2}
———————————————
{Ψ1, Ψ2}

where ϕ is a literal, Ψi are sets of literals, {Ψ1, Ψ2} is called the resolvent and {ϕ, Ψ1}, {¬ϕ, Ψ2} are the input clauses. The (possibly empty) set of resolvents is denoted by Res({ϕ, Ψ1}, {¬ϕ, Ψ2}). The empty set is False under any interpretation function.

The application of this principle is, in most situations, very straightforward. There are, however, some cases where it must be applied with some precaution. When the input clauses contain more than one pair of complementary literals, we cannot simply remove all of these pairs from the input clauses and join the remaining literals in the resolvent. Instead, we have to choose a pair arbitrarily and apply the Resolution rule over that single pair. The result of this application will always be a tautology. Moreover, we have to point out that the empty set is not present in L(CPROP). But in order to explain our reasoning, we will have to resort to this particular case. Hence, we will abuse notation and treat the empty set as a formula during the course of the proofs. This principle is a valid inference rule that allows us to remove complementary literals from a given set. Moreover, it can be used as the basis for a procedure to verify the consistency, satisfiability and, by extension, the validity of a set of formulas in PROP. To prove such a fact, we will use the consequence relations mentioned in Section 2.3, and the relations between proof systems and consequence systems introduced in Section 2.6. We start by introducing an alternative to the Hilbert derivation presented before, as well as a useful induced proof system.
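The Resolution rule of Definition 42 can be sketched over clauses represented as frozensets of literals; the string convention '-c1' for ¬c1 is our own. The example also exercises the cautionary case of two complementary pairs, where each choice of pair yields a tautological resolvent.

```python
def neg(l):
    return l[1:] if l.startswith('-') else '-' + l

def resolvents(c1, c2):
    """Res(c1, c2) of Definition 42: one resolvent per complementary pair,
    never removing two pairs at once."""
    return {frozenset((c1 - {l}) | (c2 - {neg(l)})) for l in c1 if neg(l) in c2}

a, b = frozenset({'c1', 'c2'}), frozenset({'-c1', '-c2'})
# Two complementary pairs: resolving on c1 or on c2 gives two resolvents,
# each a tautology, exactly as the caution above describes.
print(sorted(sorted(r) for r in resolvents(a, b)))       # [['-c1', 'c1'], ['-c2', 'c2']]
print(resolvents(frozenset({'c1'}), frozenset({'c2'})))  # set(): no complementary pair
```

Removing both pairs at once would wrongly produce the empty clause from two jointly satisfiable clauses, which is why the rule operates on a single pair at a time.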

Definition 43. (Resolution Derivation)

A Resolution derivation of a possibly empty clause ϕ from a set of clauses Ψ, denoted by Ψ ⊢Res ϕ, is a sequence of clauses Ψ1, ..., Ψn such that:

• Ψi ∈ Ψ or Ψi ∈ Res(Ψj, Ψk), where 1 ≤ j, k < i;

• Ψn = ϕ.

Similar to the case of Hilbert derivation, Resolution derivation can also be used in the context of a proof system. This relation is evident in Example 12.

Example 12. The proof system based on the Resolution principle is the proof system P(⟨CPROP, ⊢Res⟩) induced by the consequence system ⟨CPROP, ⊢Res⟩, built in agreement with Proposition 10. This consequence system is in turn induced by the Hilbert calculus HILRes = ⟨CPROP, RES⟩, in concordance with Proposition 2.

With this consequence relation and this proof system, we can provide a proof for the following propo- sition. It conveys the idea that the application of the Resolution principle preserves the satisfiability of the formulas in play.

Proposition 11. [23] The proof system based on the Resolution principle is sound for PROP, i.e. if {Ψ′1, Ψ′2} ⊢Res Ψ, then {Ψ′1, Ψ′2} ⊨PROP Ψ.

Proof. Suppose that {Ψ′1, Ψ′2} ⊢Res Ψ. Then we know that Ψ′1 and Ψ′2 contain complementary literals. Hence, let Ψ′1 = Ψ1 ∪ {ϕ}, Ψ′2 = Ψ2 ∪ {¬ϕ} and Ψ = Ψ1 ∨ Ψ2, where Ψ1 and Ψ2 are disjunctions of literals. Suppose that

Mv ⊨PROP Ψ′1 ∧ Ψ′2 (4.4)

where Mv is a proposition model determined by a valuation v. By definition of ⊨PROP, we need to show that

Mv ⊨PROP Ψ (4.5)

From (4.4),

Mv ⊨PROP Ψ′1 ∧ Ψ′2 iff Mv ⊨PROP (Ψ1 ∨ ϕ) ∧ (Ψ2 ∨ (¬ϕ)) iff Mv ⊨PROP (Ψ1 ∨ ϕ) and Mv ⊨PROP (Ψ2 ∨ (¬ϕ)) (4.6)

Suppose, by contradiction, that

Mv ⊭PROP Ψ1 ∨ Ψ2 (4.7)

Then

Mv ⊭PROP Ψ1 and Mv ⊭PROP Ψ2 (4.8)

Hence, by (4.6),

Mv ⊨PROP ϕ and Mv ⊨PROP ¬ϕ (4.9)

which cannot happen. As such, Mv ⊨PROP Ψ1 ∨ Ψ2.

Despite its soundness, this proof system is not generatively complete.

Definition 44. (Generative Completeness) A proof system is generatively complete if every formula entailed by a set of formulas can also be derived from that set in the proof system.

This fact is apparent in the next example.

Example 13. Let Ψ1 = {ϕ1}, Ψ2 = {ϕ2} and Ψ = {ϕ1 ∨ ϕ2}. It is obvious that {Ψ1, Ψ2} ⊨_PROP Ψ, yet the Resolution principle cannot derive this conclusion, i.e. {Ψ1, Ψ2} ⊬_Res Ψ.

For this reason, we can conclude the following.

Proposition 12. [23] The proof system based on the Resolution principle is not generatively complete for PROP.

As such, there is no great advantage in using the Resolution principle as the basis of a proof system. On the other hand, the Resolution principle is very useful when we're discussing the consistency of PROP formulas. To verify this, we'll prove that this principle gives way to a sound and complete refutation system, as mentioned in [23], [28] and [29]. As we've stated, within a proof system, soundness indicates that every derived formula can be entailed, while completeness indicates that every entailed formula can be derived. Within a refutation system, these concepts are similar: soundness indicates that if we derive the empty set, then we can entail ⊥; completeness indicates that if we can entail ⊥, then we can derive the empty set. First, we need proper definitions of consistent set and refutation system, which we present next.

Definition 45. (Consistent Set)

A set of formulas Ψ in PROP (or in FOL) is consistent iff Ψ ⊭_PROP ⊥ (or Ψ ⊭_FOL ⊥).

Definition 46. (Refutation System) A refutation system is a proof system used to obtain refutations, i.e. proofs that a certain set of formulas is inconsistent.

The construction of a refutation system is identical to that of a proof system, and thus we can use the procedures introduced in Chapter 2. In particular, the refutation system is the one presented in Example 12, with the caveat that its goal - refutation instead of proof - is different.

Proposition 13. The refutation system based on the Resolution principle is sound for PROP, i.e.

if Ψ ⊢_Res ∅ then Ψ ⊨_PROP ⊥ (4.10)

Proof. From Proposition 11, we know that the proof system based on the Resolution principle is sound, i.e. if {Ψ1, Ψ2} ⊢_Res Res(Ψ1, Ψ2) then {Ψ1, Ψ2} ⊨_PROP Res(Ψ1, Ψ2). Hence, if {Ψ1, Ψ2} ⊢_Res ∅, then {Ψ1, Ψ2} ⊨_PROP ∅. We know that the empty set is False under any interpretation function, and as such, {Ψ1, Ψ2} ⊨_PROP ⊥.

The completeness result for the refutation system is more complex to prove, since we need to define a few concepts beforehand.

Definition 47. (Semantic Tree)

Without loss of generality, assume that Ψ = {ϕ1, ..., ϕn} is an ordered set of PROP formulas. The semantic tree of Ψ is a full binary tree TΨ where the left edge from any node at level i is labelled with ϕi, and where the right edge from any node at level i is labelled with ¬ϕi.

Definition 48. (Partial Interpretation)

Let N be a node in TΨ. Then the partial interpretation ω(N, Ψ): Ψ → {0, 1} is a partial function defined as follows:

• ω(N, Ψ)(ϕi) = 1 if ϕi is on the path from the root to node N;

• ω(N, Ψ)(ϕi) = 0 if ¬ϕi is on the path from the root to node N;

• ω(N, Ψ)(ϕi) is undefined otherwise.

Definition 49. (Partial Model) Let ω be a partial interpretation, and let Ψ be a set of PROP formulas. Then ω is a partial model of Ψ if it has an extension ω* : Ψ → {0, 1} such that ω* ∈ Mod_PROP(Ψ).

Definition 50. (Weak Partial Model) Let ω be a partial interpretation, and let Ψ be a set of PROP formulas. Then ω is a weak partial model of Ψ if, for each ϕi ∈ Ψ, ω has an extension ω*_ϕi : Ψ → {0, 1} such that ω*_ϕi ∈ Mod_PROP(ϕi).

Going back to the semantic tree, we have these additional definitions.

Definition 51. A node N in TΨ is, with respect to Ψ:

• a safe node if ω(N, Ψ) is a weak partial model of Ψ;

• an inference node if it is a safe node, but no descendant of it is a safe node;

• a failure node if it is not a safe node, but every ancestor of it is a safe node.
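Since the formulas in Ψ are clauses in our setting, a partial interpretation can be extended to satisfy a clause exactly when it does not already falsify every literal of that clause. Under that assumption, the classification above can be illustrated with a small Python sketch (the names `falsifies` and `failure_nodes` are ours, not part of the thesis implementation), which enumerates the semantic tree of an ordered set of atoms and collects the failure nodes:

```python
from itertools import product

def falsifies(assign, clause):
    """A partial assignment falsifies a clause iff it makes every literal
    of the clause false (so no extension can satisfy the clause)."""
    def false_lit(lit):
        var = lit.lstrip("~")
        if var not in assign:
            return False               # unassigned: the literal may still be satisfied
        val = assign[var]
        return val if lit.startswith("~") else not val
    return all(false_lit(l) for l in clause)

def failure_nodes(atoms, clauses):
    """Enumerate the semantic tree over the ordered atoms and return the
    failure nodes: partial assignments that falsify some clause while no
    proper prefix (ancestor node) does."""
    out = []
    for depth in range(1, len(atoms) + 1):
        for bits in product([True, False], repeat=depth):
            assign = dict(zip(atoms, bits))
            if any(falsifies(assign, c) for c in clauses):
                parent = dict(zip(atoms, bits[:-1]))
                if not any(falsifies(parent, c) for c in clauses):
                    out.append(assign)
    return out
```

For the unsatisfiable set {p ∨ q, ¬p, ¬q}, every branch of the tree ends in one of three failure nodes, in line with the spirit of Proposition 15.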

Proposition 14. Let N be an inference node for Ψ in TΨ. Let N1, N2 be the children of N. Let ϕi ∈ Ψ be such that ω(N1, Ψ) cannot be extended to a propositional model of ϕi, and let ϕj ∈ Ψ be such that

ω(N2, Ψ) cannot be extended to a propositional model of ϕj. Let ϕk be the proposition on which N1 and

N2 branch. Then ω(N, Ψ) cannot be extended to a propositional model of Res(ϕi, ϕj) on (ϕk, ¬ϕk).

Proof. In [23].

Proposition 15. Let Ψ be a finite set of unsatisfiable clauses. Then every node of TΨ is a failure node with respect to Res(Ψ).

Proof. Suppose, towards a contradiction, that TΨ contains a safe node N. N cannot be a leaf, or else ω(N, Ψ) would be a propositional model for Ψ. Let N1 be a safe node which is furthest from the root. Then N1 must be an inference node, or else one of its children would be safe, contradicting the assumption that N1 is a safe node which is furthest from the root. Since N1 is an inference node, Proposition 14 (together with Definition 51) asserts that it is a failure node for an element of Res(Ψ). This is absurd, and as such N1 cannot exist.

Hence, every node of TΨ is a failure node with respect to Res(Ψ).

This last proposition allows us to conclude the final result.

Proposition 16. The refutation system based on the Resolution principle is complete for PROP, i.e.

if Ψ ⊨_PROP ⊥ then Ψ ⊢_Res ∅ (4.11)

Proof. From Proposition 15, we know that the root node of TΨ must be a failure node with respect to Res(Ψ). This can only happen if ∅ ∈ Res(Ψ).

By joining Proposition 13 and Proposition 16, we obtain the following result.

Proposition 17. The refutation system based on the Resolution principle is sound and complete for PROP, i.e.

Ψ ⊨_PROP ⊥ iff Ψ ⊢_Res ∅ (4.12)

Thus, to check whether a certain set Ψ is consistent or not, we can simply apply the Resolution principle recursively until one of the following cases occur:

• if we reach the empty set ∅, then Ψ ⊨_PROP ⊥, and the set Ψ is inconsistent;

• if we reach a point where the Resolution principle can no longer be applied, and the empty set has not been achieved yet, then Ψ ⊭_PROP ⊥ and hence Ψ is consistent.

This last observation gives way to the following result.

Proposition 18. A refutation system based on the Resolution principle is decidable, i.e. it always determines whether a set is consistent or not.

For these reasons, the Resolution principle will be used as a basis for a refutation system, instead of a regular proof system. Besides the obvious benefits of using this system in consistency checks, we could also use it to show that a certain formula is entailed by a set of formulas. This process is useful, since the algorithm requires us to determine whether a certain hypothesis, together with a theory set and a conditions set, entails the facts. With this issue in mind, we first introduce the following theorem.

Theorem 1. (Deduction Theorem) {Γ, ϕ} ⊨_PROP ψ iff Γ ⊨_PROP (ϕ ⇒ ψ)

From this theorem, we can directly infer that

{Ψ, ¬ψ} ⊨_PROP ⊥ iff Ψ ⊨_PROP ((¬ψ) ⇒ ⊥) iff Ψ ⊨_PROP ψ ∨ ⊥ iff Ψ ⊨_PROP ψ (4.13)

Hence, to check if Ψ PROP ψ, we can use a refutation system based on the Resolution principle with {Ψ, ¬ψ} as a starting set, and verify if we can derive the empty set. If we can, then {Ψ, ¬ψ} is inconsistent, and thus Ψ PROP ψ, according to equation (4.13). Note that these observations are not necessarily true when we’re discussing the application of the Resolution principle to FOL formulas. This problem will be cleared up in the next chapter. At this point, the only subject left to address is clausal form.

Definition 52. (Clausal Form) A formula ϕ is in clausal form if ϕ = c1 ∨ ... ∨ cn, where c1, ..., cn are literals (positive or negative). We usually denote this formula by {c1, ..., cn}.

The process of translating a formula into clausal form will be described later. Every formula in PROP or in FOL can be translated into a logically equivalent or an equisatisfiable formula in clausal form.
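The consistency and entailment checks just described can be summarized in a small Python sketch (an illustration of the method, not the thesis implementation, which is written in Mathematica). Clauses are represented as frozensets of string literals, with '~' marking negation - an assumed convention:

```python
from itertools import combinations

def negate(lit):
    """Complement of a literal: 'p' <-> '~p'."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    """All resolvents of two clauses, resolving on one complementary
    pair at a time (several pairs at once is not allowed)."""
    out = set()
    for lit in c1:
        if negate(lit) in c2:
            out.add(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out

def saturate(clauses):
    """Apply the Resolution principle until no new clause appears.
    Returns True iff the empty set is derived, i.e. the input is
    inconsistent; otherwise saturation is reached and it is consistent."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            new |= resolvents(c1, c2) - clauses
        if frozenset() in new:
            return True
        if not new:
            return False
        clauses |= new

def entails(clauses, negated_fact_clauses):
    """Check entailment by refuting the clauses together with the
    clausal form of the negated fact, as in equation (4.13)."""
    return saturate(set(clauses) | set(negated_fact_clauses))
```

For instance, `saturate` derives the empty set from {p ∨ q, ¬p, ¬q}, confirming that this set is inconsistent, while `entails` checks an entailment Ψ ⊨_PROP ψ by refuting Ψ ∪ {¬ψ}.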

4.4 Final Algorithm

In this section, we’ll explain how the original algorithm produces the good hypotheses. Starting with the TCHF reasoning framework with H = ∅, we check whether the set T ∪ C is consistent or not. This verification will be made with the help of the Resolution principle as explained above, together with the program CLA, which translates the sets into clausal form. If it’s consistent, we translate the set T ∪C ∪¬F into clausal form with the program CLA, and apply Resolution. If ∅ ∈ Resolution(CLA(T,C, ¬F )), then

CLA(T ∪ C ∪ ¬F) ⊨_PROP ⊥ (4.14)

and so

CLA(T ∪ C) ⊨_PROP CLA(F) (4.15)

Hence, there’s no need to produce a hypotheses set, since the theory set and the conditions set already explain the facts set. Suppose instead that

Resolution(CLA(T,C, ¬F )) = {r1, ..., rk} (4.16)

Then CLA(T ∪ C) doesn’t explain F , and we’ll produce a hypotheses set with the program

FormulateCandidateHypotheses. This program starts by negating each ri ∈ R and adding it to H.

Then, it adds all combinations (¬ri) ∧ (¬rj) such that i ≠ j. This continues until (¬r1) ∧ ... ∧ (¬rk) is added to H. In [9], the maximum complexity of the hypotheses generated was defined a priori: if λ = max(Comp(h)), then we simply add combinations of ¬ri whose complexity is less than or equal to λ. The idea behind this procedure is based on Theorem 1 and Definition 52. From (4.16),

CLA(T, C, ¬F) ⊨_PROP r1 ∨ ... ∨ rk (4.17)

Hence

CLA(T, C, ¬F) ⊨_PROP r1 ∨ ... ∨ rk iff CLA(T, C) ∪ {¬(r1 ∨ ... ∨ rk)} ⊨_PROP CLA(F) iff CLA(T, C) ∪ {(¬r1) ∧ ... ∧ (¬rk)} ⊨_PROP CLA(F) iff

CLA(T, C) ∪ {¬r1, ..., ¬rk} ⊨_PROP CLA(F) (4.18)

On the other hand, from (4.16), we also know that CLA(T, C) ⊭_PROP CLA(F). Going back to the idea of the algorithm, we want to find hypotheses with maximal explanatory power but minimal complexity, and not just hypotheses that explain all the facts. Hence, instead of taking H = {(¬r1) ∧ ... ∧ (¬rk)}, which would explain the entire facts set but have maximum complexity, we notice from (4.18) that conjunctions of the ¬ri, together with CLA(T, C), can entail a subset of CLA(F). Hence, we consider H as being formed of tuples made of these elements. After formulating the hypotheses set, we remove the inconsistent hypotheses. This corresponds to removing the hypotheses h ∈ H such that

T ∪ C ∪ {h} ⊢_Res ∅ (4.19)

At last, we determine the good hypotheses using selection criterion #1. In terms of efficiency, it was shown in [9] that the algorithm has complexity O(n^(2+λ)), where λ is the complexity defined a priori, which is a great improvement when compared to other abduction algorithms developed in previous years. However, this complexity was achieved through the use of formulas in HF form, which take constant time to be translated into clausal form, and the use of the Resolution principle, whose complexity when applied to formulas in clausal form translated from HF PROP formulas is O(n^2). Hence, in practice, it is not very useful, since this language does not have much expressive power.
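The generation step performed by FormulateCandidateHypotheses can be sketched in Python as follows (a simplified illustration, not the thesis implementation; here the complexity Comp(h) of a hypothesis is taken to be its number of conjuncts, an assumption on our part):

```python
from itertools import combinations

def formulate_candidate_hypotheses(residuals, max_complexity):
    """Negate each residual clause r_i and form the conjunctions
    (~r_i1) ^ ... ^ (~r_is) for every size s from 1 up to max_complexity.
    A hypothesis is represented as a tuple of negated residuals, the
    tuple standing for their conjunction."""
    negated = [("not", r) for r in residuals]
    hypotheses = []
    for size in range(1, min(max_complexity, len(negated)) + 1):
        hypotheses.extend(combinations(negated, size))
    return hypotheses
```

With residuals {r1, r2, r3} and λ = 2, this yields the three singletons and the three pairwise conjunctions, but not (¬r1) ∧ (¬r2) ∧ (¬r3).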

Chapter 5

Peirce’s FOL Algorithm

In this chapter, we'll develop and implement an adaptation of Peirce's algorithm capable of supporting input sets with formulas in FOL. In particular, we'll explain the procedures used and the theoretical aspects behind them. We'll also provide a few explanations of why this algorithm works in some cases, while remaining uncertain in others. Lastly, we'll present the connection between this algorithm and the abduction problem that arose in Chapter 3, and the peculiarities of the implementation developed. The idea behind this new algorithm is similar to the one used in the previous chapter, but adapted so that it produces fairly good results in the presence of sets of FOL formulas, even though these results are harder to obtain and more incomplete than the ones obtained for PROP formulas. This new algorithm that we've created and developed includes optimizations in the procedure ResolutionOP that we think will balance some of the increased complexity that emerged due to the adaptation to more complex formulas. Additional optimizations are mentioned, but are left for future work. We'll implement the algorithm in Mathematica 12.0, a system that operates with the Wolfram Language. This system was chosen because it allows for a simple and very practical representation of FOL formulas. Moreover, it already possesses some useful built-in functions that facilitate the processes in play. Other systems, like SWI-Prolog or CPython, could also have been used to develop this algorithm. The pseudocode of the algorithm can be found below.

Algorithm 2 Peirce's FOL Algorithm

1: procedure PEIRCEFOL(T, C, F, p, ExpPow, Comp)
2:   CCLA ← CLAFOL(C)
3:   TCLA ← CLAFOL(T)
4:   if TCLA is empty and CCLA is empty then
5:     print "Choose new T and C sets"
6:   else
7:     if Consistent(TCLA ∪ CCLA, p) === True then
8:       Fact ← CLAFOL(¬F)
9:       R ← ResolutionOP(TCLA ∪ CCLA ∪ Fact, p)
10:      if R does not contain the empty set then
11:        H∗ ← FormulateCandidateHypothesesFOL(R, Comp)
12:        H∗ ← RemoveInconsistentHypothesesFOL(TCLA, CCLA, H∗, p)
13:        H∗ ← SelectGoodHypothesesFOL(TCLA, CCLA, H∗, Fact, p, ExpPow, Comp)
14:        return H∗
15:      else
16:        print "The theory set and the conditions set explain the facts set"
17:      end if
18:    else
19:      print "The theory set and the conditions set are inconsistent"
20:    end if
21:  end if
22: end procedure

By looking at this algorithm, we notice many similarities to Peirce’s algorithm. Namely, we still check the consistency of the theory and conditions sets, apply a Resolution mechanism (not the original one), and formulate hypotheses. However, all of these programs were adapted to support certain kinds of formulas, different from HF PROP formulas in clausal form.

5.1 Main Ideas

The idea behind this algorithm is the following: given the theory set T , a facts set F and a conditions set C, we want to create a hypotheses set H such that

T ∪ C ⊭_FOL F (5.1)

T ∪ C ∪ {h} ⊨^p_FOL F, ∀h ∈ H (5.2)

T ∪ C ∪ {h} ⊭_FOL ⊥, ∀h ∈ H (5.3)

{h} ⊭_FOL F, ∀h ∈ H (5.4)

The relation ⊨^p_FOL indicates that not all but at least one of the components of F must be entailed for the relation to hold. We've added the last condition to avoid obtaining facts as hypotheses for themselves, which is not useful. The other equations have a similar meaning to the ones presented in the context of Peirce's algorithm, but they refer to entailed FOL formulas instead of entailed PROP formulas.

5.2 Preliminaries and Procedures

In this section, we'll explain in detail how each component was developed. Many of these procedures were derived from the ones present in the previous algorithm.

5.2.1 Input

The input sets T, C, F contain formulas in FOL built in accordance with the induced consequence system of Example 6. They convey the same idea as before, but are not required to be in HF form. Instead, they only need to be in FOL.

Definition 53. (TCHF Reasoning Framework #2) A TCHF reasoning framework #2 for abduction is a tuple hT,C,H,F i such that:

• T = {t1, ..., tm} is the theory set denoting t1 ∧ ... ∧ tm, where the ti are formulas in FOL; it represents the hypotheses that must be assumed as True during the reasoning process and that do not change depending on the situation;

• C = {c1, ..., cp} is the conditions set denoting c1 ∧ ... ∧ cp, where the ci are formulas in FOL; it represents the accepted conditions that must be assumed as True during the reasoning process and that may change depending on the situation;

• H = {h1, ..., hj} is the hypotheses set denoting h1 ∨ ... ∨ hj, where hi are formulas in FOL; it represents the hypotheses that together with T and C explain the facts represented by F ;

• F = {f1, ..., fq} is the facts set denoting f1 ∧ ... ∧ fq, where fi are formulas in FOL; it represents the facts that must be explained through abductive reasoning.

From Proposition 10, we know that the signature set of the induced FOL proof system (and, conse- quently, of the theories) is the same as the original consequence system. Hence, the structure of the formulas mentioned is also in concordance with the formulas of the induced proof system and of the theories. In the cases where we begin with a large amount of formulas within the reasoning framework, we should try to reduce the elements in T to the ones related directly or indirectly to the facts before running the procedure (when possible), in order to greatly improve the complexity of the algorithm.

Example 14. Recall the scenario of Example 10. The TCHF reasoning framework #2 for this case is the following:

• c1 denotes ”patient”;

• the relation P(·) denotes ”· has the Flu”;

• the relation Q(·) denotes ”· has Smallpox”;

• the relation S(·) denotes ”· has a tropical disease”;

• the relation W(·) denotes ”· has cough”;

• the relation V(·) denotes ”· has a small rash”;

• the relation U(·) denotes ”· has a fever”;

• T = {∀c1(P(c1) ⇒ U(c1)), ∃c1(P(c1) ⇒ W(c1)), ∀c1(Q(c1) ⇒ (V(c1) ∧ ¬U(c1))), ∃c1(S(c1) ⇒ V(c1)), ∃c1(V(c1) ⇒ P(c1))};

• C = ∅;

• F = {∃c1W(c1), ∃c1U(c1), ∃c1V(c1)}.

The goal within this framework is to compute hypotheses that justify all three facts.

The fourth input parameter is p. This value will be used in the context of the modified Resolution principle. As we'll see ahead, when applied to formulas with function and predicate symbols, variables and constants, there's a possibility that the mechanism applying the modified Resolution principle does not terminate. This parameter will work as a guard in those situations. It defines the maximum number of iterations for which this principle will be applied to a set of formulas. The remaining parameters are related to the selection criterion used in this algorithm. In order to combat the expected increase in complexity, we've made some changes to these specifications. In particular, instead of using the explanatory power, we've used a second version of this parameter.

Definition 54. (Explanatory Power #2 - ExpPow) The explanatory power #2 of h ∈ H is the number of facts it can explain.

We’ve also made some changes to the selection criterion itself. Instead of using selection criterion #1, we’ve altered the order of the computations of the complexity and of the explanatory power, and used explanatory power #2. This selection criterion is presented next.

Definition 55. (Selection Criterion #2) Hypothesis h ∈ H is considered to be a good hypothesis if

h ∈ {h ∈ {h ∈ H : Comp(h) ≤ Comp} : ExpPow(h) ≥ ExpPow} (5.5)

The change in the concept of the explanatory power was made with the simple goal of reducing the number of computations executed. Instead of calculating a ratio, we just count the number of satisfied facts. The alteration of the order in which both these parameters are applied was also justified by the complexity of the program. As we will see, checking whether or not a certain hypothesis, together with the theory and conditions sets, explains a fact is a very time consuming operation. As such, by changing the order within the selection criterion, we avoid unnecessary computations over hypotheses that have extra complexity. To summarize, in Peirce’s FOL algorithm, the parameter ExpPow indicates the minimum explanatory power #2 that a given hypothesis must possess in order to be accepted, while the parameter Comp specifies the maximum complexity that a hypothesis can have.
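Selection criterion #2 amounts to two nested filters, applied in the stated order so that the expensive explanatory-power check only runs on hypotheses that are already simple enough. A minimal Python sketch (the scoring functions comp and exp_pow are assumed to be supplied by the caller, and are not part of the thesis code):

```python
def select_good_hypotheses(hypotheses, comp, exp_pow, Comp, ExpPow):
    """Selection criterion #2: first discard hypotheses whose complexity
    exceeds Comp, then keep those whose explanatory power #2 reaches
    ExpPow.  Filtering on complexity first avoids running the costly
    explanatory-power check on hypotheses that are too complex anyway."""
    simple_enough = [h for h in hypotheses if comp(h) <= Comp]
    return [h for h in simple_enough if exp_pow(h) >= ExpPow]
```

Reversing the two filters would give the same result set, but at the cost of evaluating exp_pow on every hypothesis.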

5.2.2 CLAFOL Procedure

The first procedure - CLAFOL - executes the translation of the formulas in FOL into clausal form. This procedure could have been implemented in various ways, but we'll focus on the mechanism described in [8] with a few changes, since we're dependent on the capabilities of the Mathematica system. This translation is crucial, since the Resolution principle we use can only be applied to formulas in clausal form. The first problem we need to tackle is related to the type of formulas that can be translated into clausal form. We only know how to translate formulas in PROP into clausal form. Hence, we could try to translate the FOL formulas into PROP formulas, which in turn could be translated into clausal form. However, in general it is not possible to translate a formula in FOL into an equivalent formula in PROP. Thus, the path that we propose is the one where we translate FOL formulas into equisatisfiable non-quantified formulas that assume the structure of PROP formulas. We'll denote these new formulas by FOL∗ formulas.

Definition 56. (FOL∗ Formulas) A formula is said to be in FOL∗ if it is a non-quantified FOL formula where every variable is assumed to be universally quantified.

The first step in this procedure is to translate the FOL formulas into prenex normal form. Every FOL formula can be translated into a logically equivalent formula in prenex normal form, as stated in [30].

Definition 57. (Prenex Normal Form) A formula ϕ is in prenex normal form if it's written as a string of quantifiers and bound variables (the prefix) followed by a quantifier-free body (the matrix).

To execute this translation, we simply apply translation Pre to every subformula present in a given formula.

Definition 58. (Subformula) Let ϕ be a formula. Then:

• If ϕ is an atomic formula, then the subformula of ϕ is ϕ;

• If ϕ is ¬ψ, then the subformulas of ϕ are ϕ and ψ;

• If ϕ is ψ1 ⇒ ψ2, ψ1 ∨ ψ2 or ψ1 ∧ ψ2, then the subformulas of ϕ are ϕ and the subformulas of ψ1 and ψ2;

• If ϕ is ∀ciψ or ∃ciψ, then the subformulas of ϕ are ϕ and the subformulas of ψ.

Definition 59. (Pre translation) Let ci ∈ χ, let cj be a fresh variable (i.e. a variable that is not part of the language) and let ϕ, ψ ∈ L(C^FOL) be such that ci ∉ fv(ψ). Then Pre : FOL → FOL is the translation defined inductively as follows:

• Pre(ϕ) = ϕ if ϕ is in prenex normal form;

• Pre(¬¬ϕ) = Pre(ϕ);

• Pre(ϕ ⇒ ψ) = (Pre(¬ϕ) ∨ Pre(ψ));

• Pre(¬∀ci(ϕ)) = ∃ci(¬Pre(ϕ));

• Pre(¬∃ci(ϕ)) = ∀ci(¬Pre(ϕ));

• Pre(∀ciϕ) = ∀cj(σ(ϕ)), where σ = {ci → cj} is a substitution;

• Pre(∃ciϕ) = ∃cj(σ(ϕ)), where σ = {ci → cj} is a substitution;

• Pre((∀ciϕ) ∧ ψ) = Pre(∀ci(ϕ ∧ ψ));

• Pre((∀ciϕ) ∨ ψ) = Pre(∀ci(ϕ ∨ ψ));

• Pre((∃ciϕ) ∧ ψ) = Pre(∃ci(ϕ ∧ ψ));

• Pre((∃ciϕ) ∨ ψ) = Pre(∃ci(ϕ ∨ ψ)).

All of these translation rules have the clear goal of "pushing" the quantifier to the prefix part of the formula, while keeping a quantifier-free matrix. We can also notice that, in some cases, the same variable appears quantified in different formulas. In order to avoid confusion, renaming those variables is necessary. This process is called Standardization, and it's included in the Pre translation (6th and 7th rules). It also uses a substitution. As alluded to above, the Pre translation can be applied to any FOL formula with useful results.
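To make the rules concrete, here is a small Python sketch of the Pre translation over formulas encoded as nested tuples (this encoding and the fresh-variable naming scheme are our assumptions; the thesis implementation is in Mathematica). It covers the rules listed above - double negation, implication, quantifier/negation exchange, and quantifier extraction with standardization apart:

```python
import itertools

_fresh = itertools.count()

def fresh_var():
    # fresh variable, assumed not to occur in the input formula
    return "v%d" % next(_fresh)

def rename(formula, old, new):
    """Rename every occurrence of variable old to new (simplified;
    standardization apart keeps bound variables distinct)."""
    if formula == old:
        return new
    if isinstance(formula, tuple):
        return tuple(rename(part, old, new) for part in formula)
    return formula

def pre(formula):
    """Apply the Pre rules: push quantifiers to the prefix, rewrite
    implication and double negation, rename bound variables apart."""
    if not isinstance(formula, tuple):
        return formula
    op = formula[0]
    if op == "not" and isinstance(formula[1], tuple):
        inner = formula[1]
        if inner[0] == "not":                       # Pre(~~phi) = Pre(phi)
            return pre(inner[1])
        if inner[0] == "forall":                    # ~forall x phi -> exists x ~phi
            return ("exists", inner[1], pre(("not", inner[2])))
        if inner[0] == "exists":                    # ~exists x phi -> forall x ~phi
            return ("forall", inner[1], pre(("not", inner[2])))
    if op == "imp":                                 # phi => psi -> ~phi v psi
        return pre(("or", ("not", formula[1]), formula[2]))
    if op in ("and", "or"):
        left, right = pre(formula[1]), pre(formula[2])
        for a, b, swap in ((left, right, False), (right, left, True)):
            if isinstance(a, tuple) and a[0] in ("forall", "exists"):
                q, x, body = a
                new = fresh_var()                   # standardize apart
                body = rename(body, x, new)
                pair = (b, body) if swap else (body, b)
                return (q, new, pre((op, *pair)))
        return (op, left, right)
    if op in ("forall", "exists"):
        return (op, formula[1], pre(formula[2]))
    return formula
```

Applied to (∀x P(x)) ⇒ Q(y), this produces ∃v (¬P(v) ∨ Q(y)) for a fresh variable v, matching the third, fourth and eleventh rules.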

Proposition 19. For any formula in FOL, there exists a logically equivalent formula in prenex normal form.

Proof. Simply use the Pre translation recursively over the original formula and its subformulas. Since all rules preserve satisfiability (trivial to check using interpretation structures), the successive application of the translation also preserves the satisfiability of the formula. Since there is a finite number of subformulas, this process always terminates. Moreover, due to the fact that there is no addition of variables or other symbols to the language, the end result is logically equivalent to the starting formula.

Once we've obtained the logically equivalent formula, we move to a more complex translation used to create an equisatisfiable formula resembling PROP formulas. This translation must eliminate the quantifiers, since these connectives don't belong to the signature of PROP. At this point, we can choose one of two approaches that operate in a similar way - Skolemization or Herbrandization. As stated in [31] and [32], Herbrandization preserves validity, while Skolemization preserves satisfiability. But since we're interested in obtaining equisatisfiable formulas, we'll use Skolemization.

Definition 60. (Skolem Normal Form) A formula in FOL is in Skolem normal form if it’s in prenex normal form and its prefix is composed only of universal quantifiers.

Definition 61. (Skolemization) Skolemization is the process of translating a FOL formula into Skolem normal form, by removing exis- tential quantifiers and adding Skolem functions and constants.

Skolemization will be achieved through the use of Skolemization translation.

Definition 62. [33] (Skolemization translation) Let ϕ be a matrix of a FOL formula in prenex normal form. Then Skolemization : FOL → FOL is the translation defined recursively as follows:

• Skolemization(∃c1...∃cn ϕ) = σ(ϕ), where σ = {c1 → x_i1, ..., cn → x_in} is a substitution from the variables to Skolem constants, denoted by x_i, i ∈ N, and such that x_i ∉ C;

• Skolemization(∃c1...∃cn ∀c_n+1...∀cm ϕ) = Skolemization(∀c_n+1...∀cm (σ(ϕ))), where σ = {c1 → x_i1, ..., cn → x_in} is a substitution from the variables to Skolem constants, denoted by x_i, i ∈ N, and such that x_i ∉ C;

• Skolemization(∀c_m+1...∀ck ∃c1...∃cn ∀c_n+1...∀cm ϕ) = Skolemization(∀c_m+1...∀ck ∀c_n+1...∀cm (σ(ϕ))), where σ = {c1 → f_j1(c_m+1, ..., ck), ..., cn → f_jn(c_m+1, ..., ck)} is a substitution from the variables to Skolem functions, denoted by f_j, j ∈ N, and such that f_j ∉ F_(k−(m+1)).

Since Skolemization uses constants and symbols that are not present in the language, the translated formulas are not logically equivalent. Instead, they’re equisatisfiable.
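The translation can be sketched in Python for a prenex formula represented as a quantifier prefix plus a matrix of nested tuples (an assumed encoding, as is the Skolem symbol naming; this is an illustration, not the thesis' Mathematica code):

```python
import itertools

_sk = itertools.count()

def skolemize(prefix, matrix):
    """Skolemize a prenex formula given as a prefix
    [('forall', x), ('exists', y), ...] and a matrix of nested tuples.
    Each existential variable becomes a Skolem constant (no preceding
    universals) or a Skolem function of the preceding universals."""
    universals = []
    substitution = {}
    for quant, var in prefix:
        if quant == "forall":
            universals.append(var)
        else:  # exists
            if universals:
                # Skolem function applied to the universals quantified before it
                substitution[var] = ("f%d" % next(_sk), tuple(universals))
            else:
                substitution[var] = "x%d" % next(_sk)   # Skolem constant
    def apply(t):
        if isinstance(t, tuple):
            return tuple(apply(p) for p in t)
        return substitution.get(t, t)
    # only the universal prefix remains, and it is then dropped (FOL* form)
    return apply(matrix)
```

For ∀x∃y Q(x, y), the variable y becomes a Skolem function applied to x, while in ∃y∀x P(x, y) it becomes a Skolem constant, mirroring the rules of Definition 62.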

Proposition 20. [32] Let ϕ ∈ L(C^FOL) be in prenex normal form. Then Skolemization(ϕ) is equisatisfiable, i.e. ϕ is unsatisfiable iff Skolemization(ϕ) is unsatisfiable.

Proof. (⇐) By induction over the number n of existential quantifiers in the prefix portion of ϕ removed during the process.

• Base: n = 0. Then ϕ = Skolemization(ϕ), and so Skolemization(ϕ) is also satisfiable;

• Step: Suppose that ϕ = ∀c_m+1...∀ck ∃c1 ψ is satisfiable. Then there exists an interpretation structure I such that I ⊨_FOL ϕ. By definition of I, for every l-tuple (a1, ..., al) ∈ (C^FOL_0)^l, there exists b ∈ C^FOL_0 such that I ⊨_FOL σ(ψ), where σ = {c_m+1 → a1, ..., ck → al, c1 → b} is a substitution. Consider the indexed family of sets {B_(a1,...,al)}, indexed by the tuples (a1, ..., al) ∈ (C^FOL_0)^l, where B_(a1,...,al) = {b ∈ C^FOL_0 : I ⊨_FOL σ(ψ)}. Let f* be a choice function for this family of sets (i.e. for every (a1, ..., al) ∈ (C^FOL_0)^l, f*(a1, ..., al) selects an element of B_(a1,...,al)). Then we construct an interpretation structure I′ such that I′ expands I with I′(f1) = f*. Then I′ ⊨_FOL ∀c_m+1...∀ck σ′(ψ), where σ′ = {c1 → f1(c_m+1, ..., ck)}, and hence Skolemization(ϕ) is satisfiable.

(⇒) Suppose that Skolemization(ϕ) = ∀c_m+1...∀ck σ′(ψ), where σ′ = {c1 → f1(c_m+1, ..., ck)}, and that I′ ⊨_FOL Skolemization(ϕ). Let I be the interpretation structure built such that I restricts I′ to the set of symbols not equal to f1. Then I ⊨_FOL ∀c_m+1...∀ck ∃c1 ψ, since for every l-tuple (a1, ..., al) ∈ (C^FOL_0)^l, I′(f1)(a1, ..., al) provides a specific choice for b ∈ C^FOL_0 such that I ⊨_FOL σ(ψ).

After obtaining the equisatisfiable formulas in Skolem normal form, we can simply remove the universal quantifiers, and assume that every variable is universally quantified. After this step, we've obtained PROP-like formulas equisatisfiable with the original FOL formulas, where each variable is assumed to be universally quantified. As mentioned in the beginning of the chapter, these formulas are said to be in FOL∗. This claim is supported by the following theorem.

Theorem 2. [32] (Herbrand’s Theorem)

A formula in Skolem normal form ∀c1...∀cn ψ is unsatisfiable iff there is m ∈ N and closed instances σ1(ψ), ..., σm(ψ) such that σ1(ψ) ∧ ... ∧ σm(ψ) is unsatisfiable in PROP, where the σi are substitutions.

In the penultimate step, we need to translate the formulas into conjunctive normal form.

Definition 63. (Conjunctive Normal Form - CNF) A formula in FOL∗ is in conjunctive normal form if it’s written as a conjunction of one or more clauses, where a clause is a disjunction of literals (positive or negative).

Definition 64. (CNF translation) Let ϕ1, ϕ2, ϕ3 be FOL∗ formulas. Then CNF : FOL∗ → FOL∗ is the translation defined inductively as follows:

• CNF(ϕ1) = ϕ1 if ϕ1 is in CNF;

• CNF(¬(ϕ1 ∨ ϕ2)) = ¬CNF(ϕ1) ∧ ¬CNF(ϕ2);

• CNF(¬(ϕ1 ∧ ϕ2)) = ¬CNF(ϕ1) ∨ ¬CNF(ϕ2);

• CNF(ϕ1 ∧ (ϕ2 ∨ ϕ3)) = (CNF(ϕ1) ∧ CNF(ϕ2)) ∨ (CNF(ϕ1) ∧ CNF(ϕ3));

• CNF(ϕ1 ∨ (ϕ2 ∧ ϕ3)) = (CNF(ϕ1) ∨ CNF(ϕ2)) ∧ (CNF(ϕ1) ∨ CNF(ϕ3)).

The second and third rules are known as De Morgan's Laws, while the fourth and fifth rules convey the distributive property. If we apply this translation recursively to the subformulas of a FOL∗ formula, we obtain a logically equivalent FOL∗ formula in CNF. The final step consists of translating the formula in CNF into clausal form. For that, we'll use the following translation:

Definition 65. (Clausal translation) Let Clausal : FOL∗ → FOL∗ be the translation defined recursively as follows:

• Clausal(ϕ1 ∨ ... ∨ ϕn) = {ϕ1, ..., ϕn};

• Clausal(ϕ1 ∧ ... ∧ ϕn) = Clausal(ϕ1), ..., Clausal(ϕn).

Another translation that produces a similar result can be found in [30]. However, that procedure is more compact (it has one big step instead of five small steps) and, as such, may be more difficult to apply. The procedure CLAFOL will consist of applying these translations in order. The details will be explained in the implementation chapter. The final important aspect of this procedure is the relation of the original formula with the converted one. As we've seen, since this process introduces new symbols (Skolem constants and Skolem functions) during the Skolemization translation, the final formulas won't be logically equivalent. Instead, by Proposition 20, they'll be equisatisfiable. In the context of our program, this setback will influence how we'll interpret the solutions that we'll obtain, since technically, they could contain symbols outside our language (and system). Hence, the set of hypotheses obtained by Peirce's FOL algorithm will be denoted by H∗, to symbolize this difference. We'll mention a method that can be used to "translate" H∗ into a set H with formulas in FOL.
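The last two steps of CLAFOL - the CNF translation and the Clausal translation - can be sketched together in Python over a nested-tuple encoding of FOL∗ formulas (an assumed representation for illustration, not the thesis' Mathematica code):

```python
def cnf(f):
    """Push negations inwards (De Morgan) and distribute 'or' over 'and',
    yielding a conjunction of clauses.  Formulas are nested tuples."""
    if not isinstance(f, tuple):
        return f
    op = f[0]
    if op == "not":
        g = f[1]
        if isinstance(g, tuple):
            if g[0] == "not":                # ~~a -> a
                return cnf(g[1])
            if g[0] == "or":                 # ~(a v b) -> ~a ^ ~b
                return cnf(("and", ("not", g[1]), ("not", g[2])))
            if g[0] == "and":                # ~(a ^ b) -> ~a v ~b
                return cnf(("or", ("not", g[1]), ("not", g[2])))
        return f
    if op == "and":
        return ("and", cnf(f[1]), cnf(f[2]))
    if op == "or":
        a, b = cnf(f[1]), cnf(f[2])
        if isinstance(a, tuple) and a[0] == "and":   # (a1 ^ a2) v b
            return ("and", cnf(("or", a[1], b)), cnf(("or", a[2], b)))
        if isinstance(b, tuple) and b[0] == "and":   # a v (b1 ^ b2)
            return ("and", cnf(("or", a, b[1])), cnf(("or", a, b[2])))
        return ("or", a, b)
    return f

def clausal(f):
    """Clausal translation: a conjunction becomes a set of clauses,
    each clause a frozenset of literals."""
    if isinstance(f, tuple) and f[0] == "and":
        return clausal(f[1]) | clausal(f[2])
    lits = set()
    def collect(g):
        if isinstance(g, tuple) and g[0] == "or":
            collect(g[1]); collect(g[2])
        else:
            lits.add(g)
    collect(f)
    return {frozenset(lits)}
```

For example, p ∨ (q ∧ r) distributes to (p ∨ q) ∧ (p ∨ r), which the Clausal translation turns into the two clauses {p, q} and {p, r}.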

5.2.3 ResolutionOP and MGUFIN Procedures

Generalized Resolution Principle - GRP

The second procedure that we’ll describe is ResolutionOP. This program will recursively apply a slightly optimized version of a general Resolution principle, based on the one introduced in Definition 42. The initial algorithm used the Resolution principle as an inner mechanism. This principle could only operate over the refutation system of Example 12. In this case, we have formulas in FOL∗ which may contain predicate or function symbols. As such, we have to extend the conditions adjacent to this principle in order to ensure that the satisfiability of the problem is not altered. We start by introducing some new concepts.

Definition 66. (Variable Renaming Substitution) A variable renaming substitution is a map σ: χ → Term, i.e. it maps a variable to a term.

From this point on, we will assume that any substitution mentioned is a variable renaming substitution, until we state otherwise.

Definition 67. (Unifier)

A (variable renaming) substitution σ is a unifier for Ψ1 and Ψ2 iff σ(Ψ1) = σ(Ψ2).

Definition 68. (More General Unifier)

Let σ1, σ2 be two unifiers. σ1 is said to be as general as or more general than σ2 iff there is a substitution σ3 such that σ3(σ1) = σ2.

Definition 69. (Most General Unifier - MGU)

A unifier σ is said to be the most general unifier of Ψ1, Ψ2 if it is as general as or more general than any other unifier.

It’s possible that two expressions have more than one MGU. However, they’re unique up to variable renaming and as such, we can choose one at random to use during GRP.
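The computation of an MGU can be sketched with the classical Robinson-style algorithm in Python (terms are nested tuples, and strings starting with '?' are variables - both assumed conventions for this illustration, independent of the thesis implementation):

```python
def is_var(t):
    """Strings starting with '?' are variables (an assumed convention)."""
    return isinstance(t, str) and t.startswith("?")

def walk(t, sigma):
    """Follow the bindings of a variable in the substitution sigma."""
    while is_var(t) and t in sigma:
        t = sigma[t]
    return t

def occurs(v, t, sigma):
    """Occurs check: does variable v occur in term t under sigma?"""
    t = walk(t, sigma)
    return t == v or (isinstance(t, tuple) and any(occurs(v, p, sigma) for p in t))

def unify(t1, t2, sigma=None):
    """Robinson-style unification of two terms given as nested tuples.
    Returns a most general unifier as a dict, or None if none exists."""
    sigma = {} if sigma is None else sigma
    t1, t2 = walk(t1, sigma), walk(t2, sigma)
    if t1 == t2:
        return sigma
    if is_var(t1):
        return None if occurs(t1, t2, sigma) else {**sigma, t1: t2}
    if is_var(t2):
        return unify(t2, t1, sigma)
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):
            sigma = unify(a, b, sigma)
            if sigma is None:
                return None
        return sigma
    return None
```

Here unify(P(?x, f(?y)), P(a, f(b))) returns the MGU {?x → a, ?y → b}, while unify(?x, f(?x)) fails the occurs check, so no unifier exists.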

Definition 70. (Factor) Let Ψ be a formula in clausal form, and let σ be the most general unifier of a subset of the literals of Ψ. Then Ψ0 = σ(Ψ) is called a factor of Ψ.

All of these notions allow us to formulate the following definition.

Definition 71. (Generalized Resolution Principle - GRP) The generalized Resolution principle consists in applying the generalized Resolution rule GENRES

from the premises {Ψ1} and {Ψ2}, infer the conclusion {σ((Ψ′1\{ϕ1}) ∪ (Ψ′2\{¬ϕ2}))}, where

• Ψ1, Ψ2 are formulas in clausal form;

• ρ is a renaming substitution of Ψ1;

• Ψ′1 is a factor of ρ(Ψ1) and ϕ1 ∈ Ψ′1;

• Ψ′2 is a factor of Ψ2 and ¬ϕ2 ∈ Ψ′2;

• σ is the most general unifier of ϕ1 and ϕ2;

• GenRes(Ψ1, Ψ2) = {σ((Ψ′1\{ϕ1}) ∪ (Ψ′2\{¬ϕ2}))}, which could be empty, is called the general resolvent.

Analogously to what we explained in the scope of the Resolution principle, Peirce's FOL algorithm will use the induced refutation system P(⟨C^FOL∗, ⊢GenRes⟩) as an inner mechanism, built in a similar manner to that of Example 12.

Example 15. The refutation system based on the GRP is the proof system P(⟨C^FOL∗, ⊢GenRes⟩), induced by the consequence system C^FOL∗_⊢GenRes according to Proposition 10. The consequence system is in turn induced by the Hilbert calculus HILGenRes = ⟨C^FOL∗, GENRES⟩, as described in Proposition 2.

The relation ⊢GenRes is very similar to ⊢Res, and is presented next.

Definition 72. (General Resolution Derivation)

A general Resolution derivation of a clause ϕ from a set of clauses Ψ, denoted by Ψ ⊢GenRes ϕ, is a sequence of clauses Ψ1, ..., Ψn such that:

• Ψi ∈ Ψ or Ψi ∈ GenRes(Ψj, Ψk), where 1 ≤ j, k < i;

• Ψn = ϕ.

Besides this detail, we also notice that the empty set is once again present. In this case, it is considered to be False under any interpretation structure. We'll also rely on an abuse of notation throughout the proofs introduced below, by considering the empty set to be part of L(C^FOL). Looking at the GRP, we immediately notice a few modifications, in particular the addition of a few substitutions: the renaming substitution and the MGU. The MGU is necessary since the output of CLAFOL has been through several variable renamings. As we've seen, the Resolution rule can only be applied in the presence of identical formulas. This includes not only the predicate and function symbols, but also the variables and constants contained within their scope. Hence, without an MGU, it would be very difficult to encounter matching constants and symbols in the formulas. Moreover, if we used a simple unifier instead of the MGU, we might overlook some variables that can be renamed, with the obvious consequence of impeding the conclusion of the Resolution procedure. The renaming substitution is needed since the algorithm that computes the MGU cannot differentiate between two variables that have the same name but are not the same (because they occur in different clauses), and two variables that have the same name and are effectively the same. At last, it's important to mention that the closure for variable exchange of the induced refutation system implies that these substitutions do not alter the satisfiability of the formulas.
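Why renaming apart matters can be shown in a couple of lines. The following Python fragment is an illustrative toy (the '?'-prefix convention for variables is our own, not the thesis's Mathematica representation):

```python
def rename_apart(clause, suffix):
    """Append a suffix to every variable (marked with a leading '?') so that
    distinct clauses share no variable names before unification."""
    return [tuple(t + suffix if isinstance(t, str) and t.startswith("?") else t
                  for t in lit)
            for lit in clause]

# Both clauses mention "?x", but those are different variables:
c1 = [("Q", "?x"), ("~R", "?x")]
c2 = [("~Q", "c"), ("S", "?x")]
r1, r2 = rename_apart(c1, "_1"), rename_apart(c2, "_2")
assert r1 == [("Q", "?x_1"), ("~R", "?x_1")]
assert r2 == [("~Q", "c"), ("S", "?x_2")]
# After renaming, an MGU for Q(?x_1) and Q(c) cannot accidentally bind c2's ?x.
```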

Unification Algorithm

There are many algorithms to determine the MGU of two expressions. In [34], an algorithm with complexity O(n·α(n)), where α is a slowly growing function, is presented. The Paterson-Wegman linear unification algorithm, described in [35], can also be used to compute the MGU. Since we're only working with FOL∗ formulas in clausal form, we'll once again use a procedure developed in [8] which is very straightforward to apply. This procedure, which we will name MGUFIN, may not be the most efficient, but since in practice the number of formulas in play is small, we don't need to worry about the asymptotic difference between linear and quasi-linear algorithms. It works as follows:

• Each expression is viewed as a sequence of subexpressions. We have two types of expressions: object constants (consisting of constants, predicate symbols and function symbols, including Skolem constants and Skolem functions) and variables;

• Given two expressions and an initial (empty) substitution, we recursively process the two expressions by comparing their subexpressions;

• If two expressions are identical, then the procedure is done;

• If two expressions are not identical and they’re constants, then there’s no MGU;

• If one of the expressions is a variable, we check if the variable has a binding in the current substitution;

– If it has one, we try to unify that binding with the second expression;

– If it doesn't have one, we check whether or not the second expression contains the variable;

  ∗ If it does, then there's no MGU;

  ∗ If it doesn't, then the substitution becomes the composition of the old substitution and a new substitution in which the variable is bound to the second expression;

• If the two expressions are both sequences, we recursively apply this procedure to the subexpressions.
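The steps above can be sketched as follows; this is a Python approximation of MGUFIN under an ad-hoc encoding (variables are strings prefixed with '?', compound expressions are tuples), not the thesis's Mathematica code:

```python
def is_var(e):
    """Variables are strings spelled with a leading '?' in this toy encoding."""
    return isinstance(e, str) and e.startswith("?")

def substitute(subst, e):
    """Apply (and chase) a substitution over an expression."""
    if is_var(e):
        t = subst.get(e, e)
        return substitute(subst, t) if t != e else t
    if isinstance(e, tuple):
        return tuple(substitute(subst, s) for s in e)
    return e

def occurs(v, e, subst):
    """Occur check: does variable v occur in e under subst?"""
    e = substitute(subst, e)
    if e == v:
        return True
    if isinstance(e, tuple):
        return any(occurs(v, s, subst) for s in e)
    return False

def unify(e1, e2, subst=None):
    """Return an MGU extending subst, or None when no unifier exists."""
    subst = {} if subst is None else subst
    e1, e2 = substitute(subst, e1), substitute(subst, e2)
    if e1 == e2:                                   # identical: done
        return subst
    if is_var(e1):
        if occurs(e1, e2, subst):                  # occur check: no MGU
            return None
        return {**subst, e1: e2}
    if is_var(e2):
        return unify(e2, e1, subst)
    if isinstance(e1, tuple) and isinstance(e2, tuple) and len(e1) == len(e2):
        for s1, s2 in zip(e1, e2):                 # recurse over subexpressions
            subst = unify(s1, s2, subst)
            if subst is None:
                return None
        return subst
    return None                                    # e.g. two distinct constants

# P(?x, f(?x)) unifies with P(c, ?y) via {?x -> c, ?y -> f(c)}
mgu = unify(("P", "?x", ("f", "?x")), ("P", "c", "?y"))
assert substitute(mgu, "?y") == ("f", "c")
# ?x cannot unify with f(?x): the occur check fires
assert unify("?x", ("f", "?x")) is None
```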

One detail that we have to factor in is called the occur check. Once it happens, the procedure cannot move forward.

Definition 73. (Occur Check) Suppose a variable is bound to an expression through a substitution σ. If the variable occurs within that expression, then we're in the presence of an occur check.

Refutation Soundness, Completeness and Decidability

At last, we need to address the problem mentioned when we introduced Proposition 16. Similarly to the Resolution principle, we'll now see that the GRP also produces a complete and sound refutation system. However, while in the original case we had a decidable program, in this case our program will only be semi-decidable. This is not the case when it is applied to certain subsets of FOL formulas, called theories, that are decidable. The first proofs that we'll present are those of refutation completeness and soundness. Both of these demonstrations are quite extensive, but play an important role in our thesis. As such, they'll be fully presented.

Proposition 21. The refutation system based on the GRP is sound for FOL, i.e.

if Ψ ⊢GenRes ∅ then Ψ ⊨FOL ⊥ (5.6)

Proof. We'll prove that the induced proof system based on the GRP is sound, which automatically implies that it is a sound refutation system, similarly to what was done in Proposition 13. As such, let Ψ1 = Ψ′1 ∪ Ψ″1 ∪ {ϕ1} and Ψ2 = Ψ′2 ∪ Ψ″2 ∪ {¬ϕ2}, and let Ψ = σ(ρ(Ψ″1)) ∨ σ(Ψ″2), where σ, ρ, ϕ1 and ¬ϕ2 are the elements present in the definition of the GRP, and Ψ′1, Ψ′2 are the elements removed during factoring. Suppose that

I ⊨FOL Ψ1 ∧ Ψ2 (5.7)

where I is an interpretation structure. We need to show that

I ⊨FOL Ψ (5.8)

From (5.7),

I ⊨FOL ρ(Ψ1) ∧ Ψ2 (5.9)

since the renaming of variables does not change the satisfiability of a formula (recall that one of the assumptions of the proof systems was closure for renaming substitutions). By induction over the structure of the formula, it can also be shown that factoring preserves satisfiability. Hence,

I ⊨FOL (ρ(Ψ″1) ∨ ϕ1) ∧ (Ψ″2 ∨ ¬ϕ2) (5.10)

Applying σ, the MGU of ϕ1 and ϕ2, also preserves satisfiability, since the induced proof system is closed for substitutions. Thus,

I ⊨FOL (σ(ρ(Ψ″1)) ∨ σ(ϕ1)) ∧ (σ(Ψ″2) ∨ σ(¬ϕ2)) (5.11)

By definition, we know that ¬σ(ϕ1) = σ(¬ϕ2). Hence, either I ⊨FOL σ(ρ(Ψ″1)) or I ⊨FOL σ(Ψ″2), since I ⊨FOL σ(ϕ1) or I ⊨FOL σ(¬ϕ2), but not both at the same time. Therefore, I ⊨FOL Ψ. The result follows immediately.

In regards to completeness, we'll use the approach chosen in [32], [36], [37], [38] and [39], based on Herbrand bases, ground atoms and liftings. As such, we'll first prove a few auxiliary lemmas and introduce some definitions.

Definition 74. (Ground Formula) A ground formula is a formula that has no free variables.

Proposition 22. Let C be a signature containing at least one constant and let Ψ = {ψ1, ..., ψn} be a set of ground literals. Then ψ1 ∧ ... ∧ ψn has a FOL model iff Ψ does not contain a complementary pair of literals.

Proof. (⇒) If Ψ contains a complementary pair of literals, it can be trivially concluded that it does not have a FOL model.

(⇐) Suppose that Ψ does not contain a complementary pair of literals. We define a Herbrand algebra HΨ as follows: for each predicate symbol P, we define P_HΨ = {(t1, ..., tn) ∈ Term^n : P(t1, ..., tn) ∈ Ψ}. Clearly HΨ ⊨FOL ψi for each ψi ∈ Ψ, since if ψi = P(t1, ..., tn) then P(t1, ..., tn) ∈ Ψ and (t1, ..., tn) ∈ P_HΨ. Otherwise, if ψi = ¬P(t1, ..., tn) then P(t1, ..., tn) ∉ Ψ and (t1, ..., tn) ∉ P_HΨ; otherwise, it would contradict the assumption that Ψ contains no complementary pair of literals. Hence HΨ ⊨FOL ψi for every ψi ∈ Ψ. The result follows.

Proposition 23. Let Ψ be a set of ground clauses. If Ψ does not have a FOL model, then the empty clause ∅ may be derived by the ground Resolution rule: from {Ψ}, infer {(Ψ\{Ci, Cj}) ∪ {C′ij}},

where C′i, C′j are the clauses Ci, Cj without the complementary literals and C′ij = C′i ∪ C′j.

Note that this rule is equivalent to the Resolution principle of Definition 42. However, this rule is applied only to ground clauses, instead of PROP formulas.

Proof. Let Ψ = {Ci : 1 ≤ i ≤ n, n > 0} be a set of n clauses. If ∅ ∈ Ψ, there's nothing to prove. Hence, assume that ∅ ∉ Ψ. Consider the measure given by

#Ψ = (|C1| + ... + |Cn|) − n (5.12)

It's trivial to check that #Ψ = 0 iff every clause is composed of a single literal. We'll finish our proof by induction on #Ψ:

• Base: #Ψ = 0. Then Ci = {ψi} and Ψ corresponds to ψ1 ∧ ... ∧ ψn. By Proposition 22, we know that Ψ is unsatisfiable iff it contains a complementary pair. By the ground Resolution rule, the resolvent of this complementary pair of literals is ∅;

• Step: Suppose that #Ψ = k > 0. Then there must be at least one clause Ci which contains more than one literal. Hence, let Ci = {ψi} ∪ Di, with ψi ∉ Di and Di ≠ ∅. Let Ψi1 = (Ψ\{Ci}) ∪ {Di} and Ψi2 = (Ψ\{Ci}) ∪ {{ψi}}. It's trivial to check that #Ψi1, #Ψi2 < #Ψ. Moreover, if Ψ does not have a FOL model, then neither do Ψi1, Ψi2. The induction hypothesis states that for all k such that 0 ≤ #Ψ = k < m for some m > 0, if Ψ does not have a FOL model, then ∅ is derivable from Ψ. Hence, by the induction hypothesis, there are ground Resolution derivations R1, R2, using only the rule, from Ψi1, Ψi2, respectively, which derive ∅. Consider the ground Resolution derivation R′1 obtained from R1 by adding the literal ψi to Ψi1 and performing exactly the same sequence of ground Resolutions. We have two cases:

– If the derivation R1 did not involve the use of any of the literals in Di and ∅ was derived, then the same sequence with ψi included would also derive ∅;

– If one or more steps in the ground Resolution derivation R1 involved literals from Di, then the resulting ground Resolution derivation R′1 may derive the clause {ψi} instead of ∅. On the other hand, we know that ∅ is derived from Ψi2 in the ground Resolution derivation R2. This implies that there are ground Resolution steps in R2 involving the literal ψi which derive ∅. Hence, there exists at least one clause containing a literal complementary to ψi in the set of final clauses obtained in R′1. By applying the ground Resolution steps of R2 which do not appear in R′1, ∅ would be derived.
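The ground Resolution rule used in this proof is easy to prototype. The following Python toy (clauses as frozensets of literal strings, with '~' marking negation; the names are ours) saturates a clause set under the rule and reports whether ∅ is derived:

```python
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def ground_resolvents(ci, cj):
    """All clauses obtainable from ci, cj by removing one complementary pair."""
    out = set()
    for lit in ci:
        if negate(lit) in cj:
            out.add(frozenset((ci - {lit}) | (cj - {negate(lit)})))
    return out

def derives_empty(clauses):
    """Saturate under the ground Resolution rule; True iff the empty clause
    is derived, False when a fixpoint is reached without it."""
    clauses = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for ci, cj in combinations(clauses, 2):
            new |= ground_resolvents(ci, cj)
        if frozenset() in new:
            return True
        if new <= clauses:
            return False
        clauses |= new

# {P}, {~P, Q}, {~Q} has no model, so the empty clause is derivable:
assert derives_empty([{"P"}, {"~P", "Q"}, {"~Q"}])
# {P, Q} alone is satisfiable:
assert not derives_empty([{"P", "Q"}])
```

Termination is guaranteed here because only finitely many ground clauses can be formed from the input literals; this is exactly what fails in the first-order case discussed below.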

The next result can be used in many proofs within the scope of the abduction problem, the Resolution principle and the GRP.

Proposition 24. (Lifting Lemma)

Let C1,C2 be clauses and let σ1, σ2, σ be substitutions such that:

• fv(C1) ∩ fv(C2) = ∅;

• fv(σ1(C1)) ∩ fv(σ2(C2)) = ∅;

• C′12 is the resolvent of σ1(C1) and σ2(C2) via a substitution σ, by a single application of the ground Resolution rule.

Then there exists a resolvent C12 of C1 and C2, obtained through a single application of the ground Resolution rule via a substitution σ3, and a substitution σ4 such that C′12 = σ4(C12).

Proof. Let C1 = C′1 ∪ {ψi : 1 ≤ i ≤ m, m > 0} = C′1 ∪ L1 and C2 = C′2 ∪ {ψ′j : 1 ≤ j ≤ n, n > 0} = C′2 ∪ L2, such that σ is the MGU of σ1(L1) and σ2(L′2), where L′2 is the set ¬L2. Since fv(C1) ∩ fv(C2) = ∅, then dom(σ1) ∩ dom(σ2) = ∅. Moreover, since fv(σ1(C1)) ∩ fv(σ2(C2)) = ∅, then fv(range(σ1)) ∩ fv(range(σ2)) = ∅, and thus σ1(C1) = (σ1 ∪ σ2)(C1) and σ2(C2) = (σ1 ∪ σ2)(C2). Since σ is an MGU of σ1(L1) and σ2(L′2), then σ ∘ (σ1 ∪ σ2) is a unifier of L1 ∪ L′2. If L1 ∪ L′2 is unifiable, then it has an MGU σ3 such that C12 = σ3(C′1 ∪ C′2) is the resolvent of C1 and C2. Moreover, since σ3 is the MGU, there is a substitution σ4 such that σ4 ∘ σ3 = σ ∘ (σ1 ∪ σ2), and C′12 = σ4(σ3(C′1 ∪ C′2)) = σ ∘ (σ1 ∪ σ2)(C′1 ∪ C′2).

We’re now ready to demonstrate the following result.

Proposition 25. The refutation system based on the GRP is complete for FOL, i.e.

if Ψ ⊨FOL ⊥ then Ψ ⊢GenRes ∅ (5.13)

Proof. The idea behind the proof is the following: using the lifting lemma, we know that if the substitutions

σ1 and σ2 are ground substitutions, i.e. substitutions that map variables to terms without variables, then there exists a corresponding ground substitution σ4 which produces the same result after ground Resolution. By Herbrand's theorem, we can infer that a set Ψ is unsatisfiable iff a finite subset of ground instances of Ψ is unsatisfiable. Hence, to prove the completeness of GRP refutation it is sufficient to consider only the finite set of ground clauses from which ∅ may be derived. We know that Ψ undergoes a renaming through the substitution ρ, and as such the variables in every clause are disjoint from the variables occurring in any other clause. By Herbrand's theorem, there is a finite set of ground clauses G = {gCi : 1 ≤ i ≤ m} such that G is unsatisfiable. Each gCi ∈ G is obtained by a substitution on some clause, i.e. gCi = σi(Ci). Moreover, for i ≠ j, dom(σi) ∩ dom(σj) = ∅, and since all the clauses in G are ground, the conditions of the lifting lemma are satisfied. At last, through induction on the height of the Resolution proof tree, it can be shown that each application of the ground Resolution rule in G may be lifted to finding an MGU of the appropriate clauses. The result follows.

Joining Proposition 21 and Proposition 25 leads to the following result.

Proposition 26. The refutation system based on the GRP is sound and complete for FOL, i.e.

Ψ ⊨FOL ⊥ iff Ψ ⊢GenRes ∅ (5.14)

We now address the problem mentioned in the beginning of the section. In PROP, we had a decidable problem. However, in FOL, that is not the case. Instead, we have a semi-decidable problem. This means that if a set of clauses is unsatisfiable, then the GRP will always derive the empty clause (due to refutation completeness and soundness). However, if the set is satisfiable, then there's no guarantee that the GRP will ever terminate. To prove this fact, it is sufficient to provide an example.

Example 16. Suppose that Ψ1 = {Q(c1), ¬Q(f1(c1))} and Ψ2 = {¬Q(c2), ¬P(c2)}. These clauses are satisfiable. Applying the GRP to Ψ1, Ψ2 would result in Ψ3 = {¬Q(f1(c2)), ¬P(c2)}. Again, applying the GRP to Ψ1, Ψ3 would result in Ψ4 = {¬Q(f1(f1(c2))), ¬P(c2)}. By continuing to apply the GRP to each new clause together with the first one, we would end up with an endless general Resolution derivation.
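The runaway derivation of Example 16 can be simulated directly. The Python toy below hard-codes the single GRP step of the example (each round nests one more f1) and cuts the derivation off after p steps, mirroring the role of the bound p; the function names are ours:

```python
def resolve_step(clause):
    """One GRP step of Example 16: the clause's ~Q literal resolves against
    Psi1 = {Q(c1), ~Q(f1(c1))} with MGU {c1 -> t}, producing ~Q(f1(t))."""
    q_lit = next(l for l in clause if l.startswith("~Q("))
    t = q_lit[len("~Q("):-1]
    return {f"~Q(f1({t}))", "~P(c2)"}

def bounded_derivation(p):
    """Apply the step at most p times; with no bound this loop never ends."""
    clause, seen = {"~Q(c2)", "~P(c2)"}, []  # start from Psi2
    for _ in range(p):
        clause = resolve_step(clause)
        seen.append(clause)
    return seen

out = bounded_derivation(3)
assert out[0] == {"~Q(f1(c2))", "~P(c2)"}               # Psi3
assert out[2] == {"~Q(f1(f1(f1(c2))))", "~P(c2)"}       # Psi5
```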

For this reason, we've added the parameter p, as explained before. In these situations, the initial hypotheses set, consisting of the tuples of the negations of the clauses obtained through the GRP, and, consequently, the final set H, will be incomplete. In terms of decidability, in practice, the algorithm will mostly be applied to finite subsets of FOL formulas related directly and indirectly to the facts. These finite subsets, known as theories, can be defined according to certain properties. We present below some theories that have been shown to be decidable.

Definition 75. [40] (Bernays-Schönfinkel-Ramsey Class) A formula in FOL is in the Bernays-Schönfinkel-Ramsey class if it does not contain any function symbols and is in the class ∃∗∀∗, i.e. the class of formulas in prenex normal form in which the existential quantifiers precede the universal quantifiers.

Definition 76. (Löwenheim Class) A formula in FOL is in the Löwenheim class if it has no free variables and contains only unary predicates.

Definition 77. (FO2 Class) A formula in FOL is in the FO2 class if it contains only two distinct variables.

Proposition 27. A FOL theory Th is decidable iff ModFOL(Th) is decidable.

Proposition 28. The theories composed of formulas in the Bernays-Schönfinkel-Ramsey, Löwenheim and FO2 classes are decidable.
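Membership in the Bernays-Schönfinkel-Ramsey class is a purely syntactic check. A Python sketch over a small prenex representation of our own devising (quantifier prefix as a list of pairs, matrix as tagged tuples):

```python
def has_functions(e):
    """Function symbols are tagged ('fun', name, args...) in this toy encoding;
    predicates are ('pred', name, args...), connectives ('and'|'or'|'not', ...)."""
    if not isinstance(e, tuple):
        return False
    if e[0] == "fun":
        return True
    return any(has_functions(s) for s in e[1:])

def is_bsr(prefix, matrix):
    """prefix: list of ('exists' | 'forall', var); matrix: quantifier-free tree.
    BSR: prefix lies in exists*forall* and the matrix has no function symbols."""
    seen_forall = False
    for q, _ in prefix:
        if q == "forall":
            seen_forall = True
        elif seen_forall:          # an exists after a forall: not in the class
            return False
    return not has_functions(matrix)

# exists x . forall y . (P(x, y) or not P(y, x)) is in BSR
m = ("or", ("pred", "P", "x", "y"), ("not", ("pred", "P", "y", "x")))
assert is_bsr([("exists", "x"), ("forall", "y")], m)
# forall y . exists x . P(x) is not; neither is exists x . P(f(x))
assert not is_bsr([("forall", "y"), ("exists", "x")], ("pred", "P", "x"))
assert not is_bsr([("exists", "x")], ("pred", "P", ("fun", "f", "x")))
```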

Optimizations

The normal procedure from this point on would be to compare each pair of sets and check if it’s possible to apply the GRP, and continue until no more applications could be done. However, this naive approach is very inefficient in terms of space for many reasons. Some applications may be fruitless since the clauses in play do not contain complementary or unifiable formulas and thus produce no output. Another issue emerges when the same resolvent can be obtained starting from different input clauses, resulting in redundant applications. With the goal of improving the complexity of this procedure in mind, we’ll present a few optimization details which will give way to the procedure ResolutionOP used in Peirce’s FOL algorithm.

• Negative Normal Form: During the initial procedure, we'll use several translations. One of them, Skolemization, includes the addition of several constants and functions. This has a negative effect, not only in terms of time, but also in terms of space. An alternative is to translate formulas into negative normal form before applying Skolemization, instead of applying Pre directly. This change could reduce the number of constants and functions that have to be used;

• Pure Literal Elimination: If a clause contains a pure literal, i.e. a literal that has no instance complementary or possibly unifiable to an instance of another literal, then the GRP won’t be able to derive the empty set from that clause. As such, if that is our goal, we should remove that clause from the equation;

• Tautology Elimination: If a clause contains a literal and its negation, then it is called a tautology. These clauses can also be removed from the set of input clauses, since their presence does not influence the satisfiability of the set, and only increases the time spent in the GRP. The literals in question must be exact complements: we can't apply this optimization to clauses that only become tautologies through unification;

• Factorization: If Ψ′ is a factor of Ψ, then we can simply operate with Ψ′, since they're equisatisfiable. This procedure is necessary for a correct use of the GRP, as we've seen in Definition 71. Moreover, contrary to the case of PROP Resolution, factorization allows us to operate over more than one pair of literals in a single step of Resolution. This is, in itself, an optimization;

• Subsumption Elimination: If there exists a substitution σ such that σ(Ψ1) ⊆ Ψ2, then we say that Ψ1 subsumes Ψ2. We can remove the subsumed clause, since the set without the subsumed clause and the original set are equisatisfiable;

• Unit Resolution: If one of the input clauses contains a single literal, we can implement the Resolution mechanism so that it starts with this clause. This will, in most cases, result in an easier and more efficient implementation;

• Input Resolution: Similarly to the previous case, in this implementation we focus first on the cases where the GRP is applied to at least one of the initial input clauses. It can be shown that this type of Resolution has the same inferential power as Unit Resolution, and as such it may be useful in reducing the complexity;

• Linear Resolution: This process is one of the most useful in terms of efficiency. It consists of prioritizing the application of the GRP to the cases where at least one of the input clauses is one of the original input clauses or an ancestor of the other input clause. It helps remove most of the redundancy cases, since these derive from the application of the GRP to intermediate clauses. Moreover, we can reduce the number of clauses to try as the top clause. If we know that a set Ψ is satisfiable but Ψ ∪ {ϕ} is unsatisfiable, then there's a linear refutation starting with ϕ as an initial input clause.
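Several of these filters reduce to simple set operations. The Python sketch below implements tautology elimination, pure-literal elimination and the ground case of subsumption (σ taken as the identity) under the same ad-hoc literal encoding as in our earlier toys:

```python
def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def drop_tautologies(clauses):
    """Remove clauses containing a literal and its exact complement."""
    return [c for c in clauses if not any(negate(l) in c for l in c)]

def drop_pure(clauses):
    """Remove clauses holding a pure literal, i.e. one whose complement
    occurs nowhere in the clause set."""
    all_lits = {l for c in clauses for l in c}
    return [c for c in clauses if all(negate(l) in all_lits for l in c)]

def drop_subsumed(clauses):
    """Ground subsumption: drop a clause whenever another clause is a
    strict subset of it (sigma = identity)."""
    return [c for c in clauses
            if not any(o <= c and o != c for o in clauses)]

cs = [{"P", "~P"}, {"P", "Q"}, {"~P"}, {"~P", "Q", "R"}]
assert drop_tautologies(cs) == [{"P", "Q"}, {"~P"}, {"~P", "Q", "R"}]
assert drop_pure(cs) == [{"P", "~P"}, {"~P"}]
assert drop_subsumed(cs) == [{"P", "Q"}, {"~P"}]
```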

We'll use some of these optimizations in our implementation of Resolution, thus obtaining the procedure ResolutionOP used throughout the algorithm (including on the occasions where this mechanism is used as a subprogram of a major procedure, like Consistent and PEIRCEFOL itself).

5.2.4 Other Procedures

Apart from these two procedures, we still have four others. Since they'll be very similar to the routines used in Peirce's algorithm, we'll describe them in a very succinct way. The Consistent procedure will once again take advantage of the ResolutionOP program. However, in this case, we'll first need to translate the formulas into clausal form, which will be achieved through the program CLAFOL used in lines 2 and 3. After this translation, we'll apply the GRP to the set TCLA ∪ CCLA and verify whether the empty set is reached. If such an event occurs, by Proposition 26, the set is inconsistent. The remaining three procedures (FormulateCandidateHypothesesFOL, SelectGoodHypothesesFOL and RemoveInconsistentHypothesesFOL) are equal to the ones used in Peirce's algorithm, with the small caveat of using ResolutionOP to check whether or not a hypothesis is inconsistent with the theory and conditions sets, and to infer whether or not the consistent hypotheses explain any fact, again taking advantage of Proposition 26 and of the following deduction theorem.

Theorem 3. (Deduction Theorem for FOL) (Γ ∪ {ϕ} ⊨FOL ψ) iff (Γ ⊨FOL (ϕ ⇒ ψ)), where ϕ is a closed formula, i.e. a formula in which no variable occurs free, and ψ is a formula.

Since we assume that every variable is universally quantified, it is implied that every formula is closed. Hence, the theorem is valid. Suppose, for example, that we want to check whether

TCLA ∪ CCLA ∪ {h} ⊨FOL f (5.15)

where ¬f ∈ Fact and h ∈ H∗. Then we could compute ResolutionOP(TCLA ∪ CCLA ∪ {h} ∪ {¬f}). If we obtain the empty clause, i.e.

TCLA ∪ CCLA ∪ {h} ∪ {¬f} ⊢GenRes ∅ (5.16)

then, by Proposition 26,

TCLA ∪ CCLA ∪ {h} ∪ {¬f} ⊨FOL ⊥ (5.17)

By Theorem 3, this implies that

TCLA ∪ CCLA ∪ {h} ⊨FOL ((¬f) ⇒ ⊥) (5.18)

and thus

TCLA ∪ CCLA ∪ {h} ⊨FOL f (5.19)
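For ground clauses, this refutation-based entailment check can be prototyped in a few lines of Python; the names explains and saturate are ours, and the sketch covers only the ground/propositional case, not the full ResolutionOP:

```python
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def saturate(clauses):
    """Naive ground Resolution saturation; True iff the empty clause is derived."""
    clauses = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for ci, cj in combinations(clauses, 2):
            for lit in ci:
                if negate(lit) in cj:
                    new.add(frozenset((ci - {lit}) | (cj - {negate(lit)})))
        if frozenset() in new:
            return True
        if new <= clauses:
            return False
        clauses |= new

def explains(theory, conditions, hypothesis, fact):
    """T + C + {h} entails f iff T + C + {h} + {not f} is refutable."""
    return saturate(theory + conditions + [hypothesis, {negate(fact)}])

# T = {~Rain v Wet}: with hypothesis {Rain} the fact Wet is explained
assert explains([{"~Rain", "Wet"}], [], {"Rain"}, "Wet")
assert not explains([{"~Rain", "Wet"}], [], {"Cloudy"}, "Wet")
```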

In program RemoveInconsistentHypothesesFOL, we've added a step in which we check whether or not the hypothesis justifies the facts by itself, discarding it if that is the case. Plus, notice that despite being in FOL∗, we use the relation ⊨FOL. This abuse of notation is valid since we can assume that FOL∗ is a subset of FOL.

5.3 Final Theoretical Considerations

Before implementing the algorithm, it's worth presenting a few justifications of why the hypotheses obtained through this method justify the facts. We again begin with a TCHF reasoning framework (in this case a TCHF reasoning framework #2) with H = ∅. Then, we translate T and C into clausal form, obtaining TCLA and CCLA, and check whether or not the set TCLA ∪ CCLA is consistent, using ResolutionOP. If it is, we translate the negated facts into clausal form and again apply ResolutionOP. If ∅ ∈ ResolutionOP(TCLA ∪ CCLA ∪ Fact), then

TCLA ∪ CCLA ∪ Fact ⊢GenRes ∅ (5.20)

By Proposition 26, this is equivalent to

TCLA ∪ CCLA ∪ Fact ⊨FOL ⊥ (5.21)

By Theorem 3, this implies that

TCLA ∪ CCLA ⊨FOL ¬Fact (5.22)

Hence, there’s no need to produce a hypotheses set, since the theory set and the conditions set already explain the facts set. Assume instead that

ResolutionOP(TCLA ∪ CCLA ∪ Fact) = {r1, ..., rk} (5.23)

Then TCLA ∪ CCLA does not explain ¬Fact, and we'll produce a hypotheses set in the same manner as before, i.e. using combinations of conjunctions of ¬ri. Similarly to the justification introduced in Section

4.4, from (5.23),

TCLA ∪ CCLA ∪ Fact ⊨FOL r1 ∨ ... ∨ rk (5.24)

Thus, by Theorem 3,

TCLA ∪ CCLA ∪ {¬(r1 ∨ ... ∨ rk)} ⊨FOL ¬Fact (5.25)

The result follows in the same manner as in Section 3.4, with a small change in terms of selection criterion. Again, we use the relation ⊨FOL when discussing FOL∗ formulas. This remains valid for reasons already explained.

5.4 Implementation

In this section, we'll elaborate on how these procedures were implemented. We'll provide the pseudocode of the programs developed using Mathematica, as well as clarify the choices made in terms of commands used, the inputs and outputs, and the goals of the programs themselves. In terms of program construction, we chose an imperative construction (and, in very specific cases, a functional construction or a mix of recursive and imperative constructions) for our programs, in which we used the dynamic scoping construct Block. This scoping localizes the values of the variables and not their names, as is the case with Module. For that reason, it improves the time complexity of our programs.

5.4.1 Input

To build the sets T, C, F of formulas, we'll use the Mathematica commands ForAll, Exists, Implies, And, Or and Not. For predicate symbols, we'll use upper case letters P, Q, S, U, among others (excluding H and R). For function symbols, we'll use lower case letters followed by a natural number, like f1, f2, f3, among others. For variables, we'll use lower case letters x, y, w, z, among others, while for constants we'll use expressions of the form xi, i ∈ N (the same as Skolem constants). The sets T, C and F will be represented as lists of FOL formulas built in this manner. Mathematica automatically simplifies some negations by "pushing" them inwards, in particular if they are applied over quantifiers. A small detail that influences the expressive power of the procedure is that Mathematica requires that a quantified variable be present in the body of the formula. Moreover, in order for the procedure to work efficiently, we require that every variable used in a formula be bound. The choice of representing the sets as lists of formulas was not made at random. Lists are easier to manipulate, since Mathematica has many built-in functions that allow us, for example, to select elements (or indexes of elements) of a list that satisfy a certain pattern or condition, remove elements, join, unite and intersect lists, determine the number of elements in a list, and apply functions to every element of a list (distributive property), among other procedures.

5.4.2 CLAFOL Procedure Implementation

We start with the program CLAFOL, responsible for the translation of a set of FOL formulas into equisatisfiable clausal form formulas in FOL∗. This program is composed of several auxiliary programs, which we also present. These auxiliary programs implement the translations introduced in Section 5.2.2.

Pre Translation

The first auxiliary program executes the Pre translation. It takes advantage of an undocumented built-in function, Reduce`ToPrenexForm, which was found through a search of the symbols associated with the Reduce` package of Mathematica. This command applies the Pre translation to each formula in the set of formulas, renames each quantified variable as ci, i ∈ N, and executes standardization.

Skolemization Translation

The second auxiliary program executes the Skolemization translation. In this case, Mathematica does not have any built-in function that executes the requested translation. As such, we had to implement an algorithm, whose pseudocode we present below.

Algorithm 3 Skolemization Translation
1: procedure Skolemization(S)
2:   form ← S
3:   ex, fa ← False
4:   varex, varfa ← {}
5:   while True do
6:     if form[[0]] is not a quantifier then
7:       break
8:     end if
9:     if form[[0]] is a universal quantifier then
10:      fa ← True
11:      if ex === True then
12:        ex ← False
13:      else
14:        varex ← {}
15:      end if
16:      varfa ← varfa ∪ form[[1]]
17:      form ← form[[2]]
18:    end if
19:    if form[[0]] is an existential quantifier then
20:      ex ← True
21:      varex ← varex ∪ form[[1]]
22:      form ← form[[2]]
23:      if fa === False then
24:        for j ← 1, j ≤ Length[varex], j ← j+1 do
25:          form ← form /. {varex[[j]] → Unique["x"]}
26:        end for
27:      else
28:        for j ← 1, j ≤ Length[varex], j ← j+1 do
29:          form ← form /. {varex[[j]] → Unique["f"][Sequence@@varfa]}
30:        end for
31:      end if
32:    end if
33:  end while
34:  return form
35: end procedure

Let’s observe how this algorithm works.

• Line 1: The input is a set of formulas in prenex normal form, obtained through the program Pre;

• Lines 2 to 5: We initialize the auxiliary lists varex and varfa, which will contain the existentially quantified variables and the universally quantified variables and Skolem constants, respectively, and the indicators ex and fa, which stipulate whether an existential quantifier or a universal quantifier, respectively, has occurred before the current computation. Also, due to the computation specifications of Mathematica and its local environment Block, we need to create a new object form which will hold the value of formula S and over which we can make the changes required in our program;

• Lines 6 and 7: If the formula is not quantified, we stop the cycle;

• Lines 9 to 17: If the formula is universally quantified, we change the indicator fa to True. Then we check if an existential quantifier had occurred before this quantifier (i.e. if ex is True). If that was the case, we change the indicator ex back to False. Else, we remove all existentially quantified variables from the auxiliary list varex. Then, we add the universally quantified variable to varfa and remove that quantifier and variable from the formula;

• Lines 19 to 22: If the formula is existentially quantified, we change the indicator ex to T rue, add the quantified variable to list varex and remove the quantifier and quantified variable from the formula;

• Lines 23 to 26: If the existentially quantified variables were not previously universally quantified, we replace each existentially quantified variable with a new unique Skolem constant of the form xi, where i ∈ N, using the Mathematica command Unique. This corresponds to steps 1 and 2 of the Skolemization translation, and is completed with the use of the command /., which applies a rule or list of rules in an attempt to transform each subexpression of the formula;

• Lines 28 to 30: If the existentially quantified variables were previously universally quantified, we replace each existentially quantified variable with a new unique Skolem function of the form fi[...], where i ∈ N, dependent on the variables and Skolem constants present in the list of universally quantified variables varfa, using the Mathematica commands Unique and Sequence. This corresponds to step 3 of the Skolemization translation, and is once again completed with the command /..
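Outside Mathematica, the same Skolemization loop can be sketched in Python over a prenex representation of our own (a counter stands in for Unique; this is a simplification in which each existential variable depends on all universal variables to its left, which is what the algorithm's bookkeeping achieves on prenex input):

```python
from itertools import count

def replace(e, env):
    """Substitute variables by terms in a nested-tuple expression tree."""
    if isinstance(e, tuple):
        return tuple(replace(s, env) for s in e)
    return env.get(e, e)

def skolemize(prefix, matrix, fresh=None):
    """prefix: list of ('forall' | 'exists', var); matrix: quantifier-free tree.
    Each existential variable becomes a Skolem constant x<i> when no universal
    precedes it, and otherwise a Skolem term ('fun', 'f<i>', v1, ..., vk) over
    the universal variables to its left."""
    fresh = fresh or count(1)
    env, universals = {}, []
    for q, v in prefix:
        if q == "forall":
            universals.append(v)
        else:
            i = next(fresh)
            env[v] = f"x{i}" if not universals else ("fun", f"f{i}", *universals)
    return universals, replace(matrix, env)

# forall y . exists z . P(y, z)   becomes   forall y . P(y, f1(y))
uvars, m = skolemize([("forall", "y"), ("exists", "z")], ("pred", "P", "y", "z"))
assert uvars == ["y"]
assert m == ("pred", "P", "y", ("fun", "f1", "y"))
```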

CNF Translation

The third auxiliary program computes the CNF translation. Similarly to the Pre translation, Mathematica already possesses a built-in routine which implements this transformation: BooleanConvert. This function allows us to choose the form into which the formulas are going to be translated. In our case, we will choose the option "CNF", indicating the conjunctive normal form.

Clausal Translation

The fourth auxiliary program executes the last step in our precomputation routine. It translates the formulas in CNF into clausal form. We present the pseudocode below.

Algorithm 4 Clausal Translation
1: procedure Clausal(S)
2:   TEO ← S
3:   Cla ← {}
4:   b ← True
5:   if TEO[[0]] is And then
6:     b ← False
7:     for j ← 1, j ≤ Length[TEO], j ← j+1 do
8:       Cla ← Cla ∪ {Clausal(TEO[[j]])}
9:     end for
10:  end if
11:  if TEO[[0]] is Or then
12:    b ← False
13:    Cla ← Cla ∪ {Table(TEO[[j]], j=1,...,Length[TEO])}
14:  end if
15:  if b is True then
16:    Cla ← Cla ∪ {S}
17:  end if
18:  return Cla
19: end procedure

This algorithm was adapted from Definition 65 as follows:

• Lines 2 to 4: The initialization is completed. Since, once again, we will be working in the Block environment, we need to create an object TEO to hold the value of the formula. Moreover, in order to simplify the modifications being made, we'll create an object Cla, initialized as an empty list, in which we'll build our formula in clausal form. At last, the object b will act as an indicator of whether or not the formula being checked has any conjunctions or disjunctions;

• Lines 5 to 10: Corresponds to the second case of the Clausal translation. If the formula is a conjunction, we simply translate each subformula into clausal form;

• Lines 11 to 14: Corresponds to the first case of the Clausal translation. If the formula is a disjunction, we add each subformula to Cla;

• Lines 15 to 16: If the formula being checked is neither a conjunction nor a disjunction, we just add the formula to Cla, since it already is in clausal form.
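The case analysis above can be mirrored in a few lines of Python. This is a hedged sketch over an assumed nested-tuple representation, not the Mathematica original:

```python
# Sketch of the clausal translation: a CNF formula is an atom string,
# ('and', f, g) or ('or', f, g); the result is a list of clauses.
def clausal(f):
    if isinstance(f, str):
        return [[f]]                      # neither And nor Or: a unit clause
    if f[0] == 'and':                     # conjunction: clauses of both sides
        return clausal(f[1]) + clausal(f[2])
    # 'or' in CNF: each side yields exactly one clause; merge them into one
    (left,), (right,) = clausal(f[1]), clausal(f[2])
    return [left + right]
```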

Final Procedure

Equipped with these auxiliary programs, we’re now able to develop the final program CLAFOL, which translates a set of formulas in FOL into formulas in FOL∗ in clausal form. The pseudocode is below.

Algorithm 5 CLAFOL Translation
1: procedure CLAFOL(S)
2:     return Flatten(Table(Clausal(BooleanConvert(Skolemization(Pre(Resolve(S[[i]]))),
3:         "CNF")), i=1,...,Length[S]), 1)
4: end procedure

There are a few explanations that we need to give before moving on to more complex programs. First is the use of the built-in function Resolve. If we simply tried to apply the auxiliary programs to the input formulas, we would not be able to properly manipulate them and, in particular, manipulate the variables. Resolve eliminates this problem. Second is the use of Flatten. For every formula in S,

the program creates a list of the formulas in clausal form. Since we're not interested in analysing each formula separately, we flatten all these lists, in order to obtain a single list of clauses.
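The level-1 flattening can be illustrated with a small sketch, a Python list comprehension standing in for the Mathematica call (the clause lists below are hypothetical):

```python
# Each input formula yields its own list of clauses; Flatten[..., 1] merges
# them into a single list of clauses, which the comprehension reproduces.
per_formula = [[['P', 'Q']], [['R'], ['S', 'T']]]
clauses = [clause for group in per_formula for clause in group]
```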

5.4.3 ResolutionOP Procedure Implementation

The second program is the more complex one. It requires the development of several auxiliary programs, which we'll also detail. It executes the GRP, with some optimizations. Other optimizations will be added to other procedures outside of the scope of ResolutionOP, but will influence the efficiency of this program.

Seq1 Procedure

We start with program Seq1. It has a very simple goal: given a formula, it returns a list of the subformulas in the immediate sublevel.

Algorithm 6 Seq1 Procedure
1: procedure Seq1(formula)
2:     if formula is a variable or a constant then
3:         return {formula}
4:     else
5:         return Table(formula[[i]], i=0,...,Length[formula])
6:     end if
7: end procedure

To verify if a formula is a variable, we have to check if it satisfies the condition Length[formula]==1 and formula[[0]]==C. To test if it is a constant, we assess if Length[formula]==0.
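A hedged Python analogue of Seq1 makes the behaviour explicit, assuming terms are atom strings or tuples whose element 0 is the head, mirroring Mathematica's part 0:

```python
# Sketch of Seq1: atoms yield themselves; compound terms yield the head
# followed by the immediate subexpressions.
def seq1(term):
    if isinstance(term, str):      # a variable or a constant
        return [term]
    return list(term)              # head at position 0, then the arguments
```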

Seq2 Procedure

This program has a goal very similar to that of Seq1. However, it operates over a pair of formulas. The goal is to obtain two lists with the same number of subexpressions, so that they can be compared. It recursively applies procedure Seq1 to the formulas and their subformulas, until one of the lists cannot be expanded any more.

Algorithm 7 Seq2 Procedure
1: procedure Seq2(formula1,formula2)
2:     form1 ← {formula1}
3:     form2 ← {formula2}
4:     while True do
5:         seq1 ← Map[Seq1,form1]
6:         seq2 ← Map[Seq1,form2]
7:         if seq1 === form1 or seq2 === form2 then
8:             break
9:         else
10:             form1 ← seq1
11:             form2 ← seq2
12:         end if
13:     end while
14:     return {form1, form2}
15: end procedure

MGUFIN Procedure

We’ve now reached a critical part of our work - the MGUFIN procedure. This quasi-linear routine, based on the one introduced in Section 5.2.3, allows us to compute the MGU of two expressions, if it exists. If it does not exist, it returns that indication. In our case, it will return 0. Since this procedure involves many steps, we’ve divided the program into four small steps, which were then gathered into the general program. Each of these auxiliary programs corresponds to a case of the general procedure, as we’ll see next. The procedures will receive as input and return as output a set composed of five elements. The first two are the expressions being compared. The third component corresponds to the current MGU. The final two elements are booleans indicating whether a certain procedure has occurred or not, and whether the computation of an MGU is still possible or not. This first auxiliary MGU program determines if two expressions are equal. If they are, then we simply move on to the next expression, i.e. make the boolean next False. Notice that once again, since we’re working in a Block environment, we need to create a new object n that represents the local value of next during the procedure. It corresponds to the third point of the unification algorithm.

Algorithm 8 MGU1 Procedure
1: procedure MGU1(s1,s2,sigma,next,iter)
2:     n ← next
3:     if s1 === s2 then
4:         n ← False
5:     end if
6:     return {s1,s2,sigma,n,iter}
7: end procedure

The second auxiliary MGU procedure corresponds to the fourth case of the algorithm introduced before. In this case, we check whether or not the two expressions are equal and whether or not they’re object constants or are headed by an object constant. In our work, an expression exp is an object constant if it is a predicate or function symbol, a Skolem constant or a Skolem function. Hence, to test whether or not two expressions satisfy these conditions, we have to determine if one of the following cases occurs:

• Length[s1]===0 and Length[s2]===0, i.e. both expressions are constants or predicate symbols;

• Length[s1]===0 and s2 begins with f, i.e. the first expression is a constant or predicate symbol, and the second one is headed by a function symbol;

• Length[s2]===0 and s1 begins with f, i.e. the second expression is a constant or predicate symbol, and the first one is headed by a function symbol;

• Both s1 and s2 start with f but then differ, i.e. both are headed by different function symbols.

To materialize the last three cases, we need a way to verify if the expressions begin with f. We could try to simply check if the head of an expression is f. However, we’re dealing with Skolem function constants, which are of the form fi, where i ∈ N. Mathematica considers the entire expression fi as the head, and so the verification would return False, even though the expression starts with f. On the other hand, implementing a verification for every fi would not be practical. To avoid this problem, we’ll first translate the expression into a string, using the command ToString. For each character in the string, Mathematica has a unique corresponding code - a natural number - which can be obtained through the command ToCharacterCode. As such, we translate the string into a list of codes using the aforementioned function, and check if the first element of the list is the code 102, corresponding to f.
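The character-code trick is language-independent; a Python sketch of the same check (our illustration, with `ord` standing in for ToCharacterCode):

```python
# Translate the head to a string and test whether its first character code
# is 102, the code for 'f' - the analogue of ToString + ToCharacterCode.
def starts_with_f(head):
    codes = [ord(ch) for ch in str(head)]
    return bool(codes) and codes[0] == 102
```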

The last case has an additional catch. Not only do we have to check if the first element of the expressions is f, but we also have to check if the second elements (i.e. the natural numbers associated with the function symbol) are different as well. With that in mind, we first test if the first element of the expressions is f using the same method as before. Then, we just have to verify if the codes of the second elements are different, since the strings of the natural numbers are also associated with unique codes. If any of the four cases occurs, then there is no MGU between the two expressions. As such, we terminate the procedure, by changing the indicator n to False, which impedes the continuation of the iteration, and by changing the boolean iter2 to False, terminating the cycle. The pseudocode can be found below.

Algorithm 9 MGU2 Procedure
1: procedure MGU2(s1,s2,sigma,next,iter)
2:     n ← next
3:     iter2 ← iter
4:     if s1 ≠ s2 and s1, s2 are both object constants or expressions starting with object constants then
5:         n ← False
6:         iter2 ← False
7:     end if
8:     return {s1,s2,sigma,n,iter2}
9: end procedure

Before moving on to the third auxiliary MGU procedure, we need to introduce a routine that indicates whether a certain substitution is acceptable or not. This procedure is essential during the auxiliary program, since we need to compose substitutions, as described in the original algorithm.

Definition 78. (Acceptable Substitution) A substitution σ is acceptable if, for each ϕ, there is exactly one ψ such that σ(ϕ) = ψ, and when applied over itself, it doesn’t result in an endless loop.

Definition 79. (Composable Substitutions) Two substitutions σ1, σ2 are composable if σ1 ∪ σ2 is an acceptable substitution.

We’ve developed program AcceptableQ, which indicates the acceptability of a substitution. The pseudocode is presented next.

Algorithm 10 AcceptableQ Procedure
1: procedure AcceptableQ(sigma)
2:     sig ← sigma
3:     aux ← Graph[sig]
4:     b ← AcyclicGraphQ[aux] && Select[VertexOutDegree[aux], #>1 &] === {}
5:     if b === True then
6:         for i ← 1, i ≤ Length[sig], i ← i+1 do
7:             aux ← sig\sig[[i]]
8:             auxi ← {sig[[i]]}//.aux
9:             if there’s a variable in auxi that triggers OccurCheck then
10:                 b ← False
11:             end if
12:         end for
13:     end if
14:     return b
15: end procedure

Let’s describe how this auxiliary program works.

• Lines 2 to 3: It creates an object sig, which will represent the local value of the substitution sigma in our Block environment. Then, it transforms the substitution into a graph, where the vertices are the elements being bound, and the edges are oriented according to the substitution;

• Line 4: The first test is to check if the graph is acyclic, i.e. if the substitution doesn’t originate an endless loop when applied to itself. Then, we test whether or not there is any vertex with more than one edge originating from it, which is also a violation of the conditions established in Definition 78, since it means that at least one variable is bound to more than one expression;

• Lines 5 to 10: If the previous test fails, then there’s no need to continue the procedure. Otherwise, there’s a case not caught by that test that we need to consider. If a variable that is bound to an expression is present in an expression bound to another variable, an endless loop may occur. For example, in the substitution σ = {c1 → c2, c2 → f1[c1]}, the variable c1 is bound to c2, which in turn is bound to f1[c1]. If we applied the substitution to itself, it would result in an endless loop. However, the graph obtained from this substitution is acyclic, and has no vertices with more than one outgoing edge, so the test in line 4 does not detect the problem. To test if this case occurs in a substitution, for each sub-substitution, we apply the remainder of the substitution until no more changes transpire, using the command //.. Then, we check if the result of that operation triggers an OccurCheck, that is, if a variable is present in both members of the substitution. In our example, if we take the first sub-substitution {c1 → c2} and apply {c2 → f1[c1]} to it, we obtain {c1 → f1[c1]}, which triggers OccurCheck;

• Line 14: Returns the result - False if the substitution is not acceptable, or True if it is;

• To use the commands AcyclicGraphQ and VertexOutDegree, we need to call the package GraphUtilities, using Needs.
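In spirit, the acceptability test amounts to cycle detection on the dependency graph of the substitution. A compact Python sketch under our own assumptions (substitutions as dicts of variable to term, terms as atom strings or head-first tuples; a dict already enforces one binding per variable, covering the out-degree test, and a depth-first search replaces the graph commands):

```python
# A substitution is acceptable when no variable can reach itself through
# the bound terms, so that repeated application (//.) terminates.
def vars_in(term, varset):
    """Variables of varset occurring in a term."""
    if isinstance(term, str):
        return {term} if term in varset else set()
    return set().union(*(vars_in(t, varset) for t in term[1:]))

def acceptable(sigma):
    deps = {v: vars_in(t, set(sigma)) for v, t in sigma.items()}
    state = {}                                   # 1 = visiting, 2 = finished
    def dfs(v):
        if state.get(v) == 1:
            return False                         # back edge: endless loop
        if state.get(v) == 2:
            return True
        state[v] = 1
        ok = all(dfs(w) for w in deps.get(v, ()))
        state[v] = 2
        return ok
    return all(dfs(v) for v in sigma)
```

Note that the thesis example {c1 → c2, c2 → f1[c1]} is rejected here because c1 depends on c2, which depends back on c1.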

We’re now able to introduce the third auxiliary MGU program, which is used when the first expression is a variable. This program works as follows:

• Lines 2 to 4: Initialization takes place. We define the local values of the objects n, iter3 and sig, which will be used in the Block environment;

• Lines 5 to 7: If the first expression is a variable, we change the indicator n to False, indicating that the iteration is over. Then, it tries to create a list aux containing the expression already bound to the variable in sig;

• Lines 8 to 15: If aux is not empty, then there’s one expression bound to the variable. Hence, we try to compute the MGU between that expression and the second expression using program MGU, which will be presented later. If there is an MGU, indicated by the fifth output element, we check if the unification of that MGU with the already existing one is possible. If it’s not possible, then no general MGU can be computed. As such, the indicator iter3 is changed to False. Otherwise, the new substitution is obtained by unifying the new substitution with the old one;

• Lines 18 to 24: If there is no binding of the variable in the substitution, we test if an OccurCheck arises. If it does, then computing the MGU is impossible. Otherwise, we unify the old substitution with the one that binds the variable to the second expression, if it’s acceptable.

The fourth and final auxiliary MGU program, MGU4, works exactly like MGU3, but is triggered if the second expression is a variable. As such, we’ll omit its description. At last, we reach the general program MGU, which computes the MGU between two expressions, or indicates if such computation is impossible.

Algorithm 11 MGU3 Procedure
1: procedure MGU3(s1,s2,sigma,next,iter)
2:     n ← next
3:     iter3 ← iter
4:     sig ← sigma
5:     if s1 is a variable then
6:         n ← False
7:         aux ← {y1} where y1 is bound to s1 in sig
8:         if aux is not empty then
9:             m ← MGU(aux[[1]],s2,sig,True,True)
10:             if m[[5]] === True then
11:                 if AcceptableQ(sig ∪ m[[3]]) === False then
12:                     iter3 ← False
13:                 else
14:                     sig ← sig ∪ m[[3]]
15:                 end if
16:             end if
17:         else
18:             if OccurCheck(s1,s2) === True then
19:                 iter3 ← False
20:             else
21:                 if AcceptableQ(sig ∪ {s1 → s2}) === True then
22:                     sig ← sig ∪ {s1 → s2}
23:                 else
24:                     iter3 ← False
25:                 end if
26:             end if
27:         end if
28:     end if
29:     return {s1,s2,sig,n,iter3}
30: end procedure

Algorithm 12 MGU Procedure
1: procedure MGU(s1,s2,sigma,next,iter)
2:     sig ← sigma
3:     n ← next
4:     iter0 ← iter
5:     sub1 ← {y1, ..., yn} where y1, ..., yn are the subexpressions of s1
6:     sub2 ← {z1, ..., zn} where z1, ..., zn are the subexpressions of s2
7:     for i ← 1, i ≤ Length[sub1] and iter0 === True, i ← i+1 do
8:         m1 ← MGU1(sub1[[i]],sub2[[i]],sig,n,iter0)
9:         n ← m1[[4]]
10:         if n === True then
11:             m2 ← MGU2(sub1[[i]],sub2[[i]],sig,n,iter0)
12:             n ← m2[[4]]
13:             iter0 ← m2[[5]]
14:             if n === True then
15:                 m3 ← MGU3(sub1[[i]],sub2[[i]],sig,n,iter0)
16:                 n ← m3[[4]]
17:                 iter0 ← m3[[5]]
18:                 sig ← sig ∪ m3[[3]]
19:                 if n === True then
20:                     m4 ← MGU4(sub1[[i]],sub2[[i]],sig,n,iter0)
21:                     n ← m4[[4]]
22:                     iter0 ← m4[[5]]
23:                     sig ← sig ∪ m4[[3]]
24:                 end if
25:             end if
26:         end if
27:     end for
28:     return {s1,s2,sig,n,iter0}
29: end procedure

This algorithm creates a chain of the previous four auxiliary programs, computing the MGU, if possible. In the cycle, it checks at every step if the previous auxiliary program was triggered, represented by the boolean n, and if an MGU computation is still possible, represented by the boolean iter0. When n becomes False, we skip to the next iteration, and when iter0 becomes False, we stop the algorithm altogether. To create the lists of the subexpressions, we’ll use a combination of Seq1 and Seq2. At last, we developed algorithm MGUFIN, which receives two formulas as input, and returns the MGU between the two, if it exists, or 0, if it doesn’t. This procedure takes into consideration that we need to "remove" the negations from a pair of formulas before computing the MGU. Notice that we did not add a guard that stops the process if both formulas are negated or if both formulas are not negated, since this program will be used in other cases where that condition does not have to be verified.
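The overall behaviour of this chain corresponds to a standard unification algorithm. The following self-contained Python sketch is our illustration, not the Mathematica code; it assumes, as elsewhere, that variables are strings 'c1', 'c2', ... and compound terms are tuples headed by their symbol:

```python
def is_var(t):
    # assumption: variables are strings of the form 'c<i>', as in the thesis
    return isinstance(t, str) and t[0] == 'c' and t[1:].isdigit()

def subst(t, s):
    """Apply substitution s (dict) to term t, chasing variable bindings."""
    if is_var(t):
        return subst(s[t], s) if t in s else t
    if isinstance(t, str):
        return t
    return tuple(subst(a, s) for a in t)

def occurs(v, t, s):
    """OccurCheck: does variable v occur in t under substitution s?"""
    t = subst(t, s)
    if t == v:
        return True
    return not isinstance(t, str) and any(occurs(v, a, s) for a in t)

def unify(t1, t2, s=None):
    """Return an MGU as a dict, or None (MGUFIN returns 0 in this case)."""
    s = dict(s or {})
    t1, t2 = subst(t1, s), subst(t2, s)
    if t1 == t2:                          # MGU1 case: equal, nothing to do
        return s
    if is_var(t1):                        # MGU3 case: bind, with occur check
        if occurs(t1, t2, s):
            return None
        s[t1] = t2
        return s
    if is_var(t2):                        # MGU4 case: symmetric
        return unify(t2, t1, s)
    if isinstance(t1, str) or isinstance(t2, str):
        return None                       # MGU2 case: clashing constants
    if t1[0] != t2[0] or len(t1) != len(t2):
        return None                       # MGU2 case: different heads
    for a, b in zip(t1[1:], t2[1:]):      # descend into subexpressions
        s = unify(a, b, s)
        if s is None:
            return None
    return s
```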

Optimization Procedures

The next step is to introduce the implementations of some of the optimizations mentioned in Section 5.2.3. As was also stated, there are other optimization routines that could have been implemented. However, since our goal is to develop an algorithm for FOL abduction, and not a fully optimized one, we leave some of these for future work. The first optimization procedure developed was TautologyElimination. It receives as input a set of formulas S, and returns a set without the tautologies. To achieve this result, we used the built-in function TautologyQ, which receives a formula and indicates whether that formula is a tautology or not. Since set S is composed of formulas in clausal form, to check if a clause is a tautology according to our notion, we test if the disjunction of the formulas in each clause results in a True output from TautologyQ. We’ll omit the pseudocode since it is very straightforward. The second procedure implemented was PureLiteralElimination. The pseudocode is below.

Algorithm 13 Pure Literal Elimination Procedure
1: procedure PureLiteralElimination(S)
2:     TEO ← {}
3:     list1 ← {y1, ..., yn} where y1, ..., yn are the subexpressions of the formulas in ¬S
4:     for each subexpression yi do
5:         b ← 0
6:         for each element yik of list1[[i]] and b < 1 do
7:             if yik is negated then
8:                 if there is a positive literal with the same outer predicate or function symbol then
9:                     b ← b+1
10:                 end if
11:             else
12:                 if there is a negative literal with the same outer predicate or function symbol then
13:                     b ← b+1
14:                 end if
15:             end if
16:         end for
17:         if b ≥ 1 then
18:             TEO ← TEO ∪ S[[i]]
19:         end if
20:     end for
21:     return TEO
22: end procedure

This program proceeds as follows:

• Lines 2 to 3: We create an object TEO, which is locally empty. Then, we negate each formula in S and build a list list1 of lists consisting of the negated subformulas;

• Lines 4 to 14: We initialize a counter b which will indicate how many literals, complementary in terms of negations, start with the same function or predicate symbol, i.e. may be Resoluted. We only need to find the first one, since the goal is to eliminate any clause that can’t be Resoluted, and as such, the guard only requires that b < 1. In the For cycle, for each element of list1[[i]], we first see if it’s negated. If it is, we search for a positive literal with the same outer predicate or function symbol, which might be unifiable with it. Otherwise, we instead search for a negative literal with the same properties. When one of these cases occurs, we increment the counter;

• Lines 17 to 19: If the counter was incremented, then there’s at least one complementary literal in terms of negations with the same outer symbol. As such, that clause is added to the final set TEO, which will be outputted. Notice that we don’t analyse the subformulas in order to determine if the expressions are unifiable. As such, some expressions with the same outer predicate but with subexpressions that are not unifiable with those of the other expressions will still be maintained. This was done because the ResolutionOP procedure that we’ll see ahead makes this verification, and so we would be duplicating the number of computations related to this issue.
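The pruning criterion just described can be sketched in Python for the propositional case. This is our own simplification (literals as strings with '~' marking negation; as in the pseudocode, we do not distinguish whether the complementary occurrence comes from the same clause):

```python
def literal_parts(lit):
    """Split a literal string into (negated?, outer symbol)."""
    return (True, lit[1:]) if lit.startswith('~') else (False, lit)

def prune_unresolvable(clauses):
    """Keep only clauses having at least one literal whose complement
    (same outer symbol, opposite sign) occurs somewhere in the set."""
    all_lits = {literal_parts(l) for c in clauses for l in c}
    def resolvable(clause):
        return any((not neg, sym) in all_lits
                   for neg, sym in map(literal_parts, clause))
    return [c for c in clauses if resolvable(c)]
```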

The third auxiliary program developed was SubsumedElimination. This program removes the clauses that are subsumed by another clause. The idea behind the procedure is the following: for each clause, we build a table with the MGUs of the cartesian product of the subexpressions. Then, we build another table whose elements are the first expression with the various MGUs applied to it. Afterwards, we check if the first expression subsumes the second one, by testing if the second expression contains all elements of the substituted first expression. If it does, then we remove this clause. The pseudocode can be found below. To simplify the understanding of this problem, we’ve built an auxiliary program denoted by SubsumedEliminationAux.

Algorithm 14 Subsumed Elimination Auxiliary Procedure
1: procedure SubsumedEliminationAux(s1,s2)
2:     t1 ← {m1, ..., mk}, where mi are the MGUs of the cartesian product of the subexpressions of s1 and s2
3:     tup ← Tuples[t1]
4:     tupa ← {}
5:     for each tuple in tup do
6:         if AcceptableQ(Flatten[tup[[i]]]) === True then
7:             tupa ← tupa ∪ Flatten[tup[[i]]]
8:         end if
9:     end for
10:     t2 ← {y1, ..., yn} where yi are the results of applying each element of tupa to the first expression
11:     b ← False
12:     for each element of t2 and b not True do
13:         if s2 contains all the elements of t2[[i]] then
14:             b ← True
15:         end if
16:     end for
17:     return b
18: end procedure

This procedure was implemented as follows:

• Lines 2 to 4: We build a list t1 of the possible MGUs between each pair of subexpressions of expressions s1 and s2. Then, we use the built-in function Tuples, which computes tuples of elements of t1. This is done since more than one pair of subexpressions might be unifiable at the same time. Lastly, we define a new object tupa, in which we will add acceptable substitutions resulting from the unification of the various tuples in tup;

• Lines 5 to 7: For each tuple in tup, we check if the union of all of its elements is an acceptable substitution. If it is, we add this union to tupa;

• Lines 10 to 14: After obtaining the list of the substitutions that might imply the existence of a subsumed clause, we test if any of these produce the desired result. With that in mind, we build list t2 whose elements are obtained by applying each substitution in tupa to the first expression. Then, we check if any of these elements is contained within the second expression. If that is the case, we can conclude that the first expression subsumes the second one, by definition, and return the boolean b indicating this fact.

We thus developed the final procedure, whose pseudocode is next.

Algorithm 15 Subsumed Elimination Procedure
1: procedure SubsumedElimination(S)
2:     TEO ← SortBy[S,Length]
3:     pos, TEO1 ← {}
4:     for each clause TEO[[i]] do
5:         for each clause TEO[[j]] with j > i do
6:             sea ← SubsumedEliminationAux(TEO[[i]],TEO[[j]])
7:             if sea === True then
8:                 pos ← pos ∪ j
9:             else
10:                 if Length[TEO[[i]]] === Length[TEO[[j]]] then
11:                     sea ← SubsumedEliminationAux(TEO[[j]],TEO[[i]])
12:                     if sea === True then
13:                         pos ← pos ∪ i
14:                     end if
15:                 end if
16:             end if
17:         end for
18:     end for
19:     for each position i of TEO do
20:         if i ∉ pos then
21:             TEO1 ← TEO1 ∪ TEO[[i]]
22:         end if
23:     end for
24:     return TEO1
25: end procedure

This program works as follows:

• Lines 2 to 3: Initialization process. We order the set S by length of the expressions. This optimizes the procedure since only a shorter or equally long expression can subsume another expression. Hence, instead of testing whether or not each pair of expressions triggers a subsumed elimination, we’ll just check if this happens for a pair of expressions in which the first one is shorter than or equally long as the second one. We also initialize two auxiliary sets - pos and TEO1 - which will be used later;

• Lines 4 to 9: We scan the list of clauses ordered by length, checking if a clause subsumes any clause located later in the list, using SubsumedEliminationAux. If that is the case, we add the position of the clause that is subsumed by another one to the list pos;

• Lines 10 to 13: If the length of the clauses is the same, and the first one does not subsume the second one, there might be the case that the second clause subsumes the first. Hence, we proceed similarly to lines 4 to 9, but testing if the second clause triggers a subsumed elimination over the first one;

• Lines 19 to 21: For each position i in pos, we add the element TEO[[i]] to the output set TEO1 if it is not subsumed by any other clause, i.e. if it is not present in pos.
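For ground clauses the subsumption test collapses to set inclusion, which makes the structure of the procedure easy to see in a Python sketch (our simplification; the FOL version additionally searches for substitutions, as described above):

```python
def subsumes(c, d):
    """Ground sketch: clause c subsumes d when every literal of c is in d."""
    return set(c) <= set(d)

def subsumed_elimination(clauses):
    """Sort by length first, as in the thesis procedure: only a shorter or
    equally long clause can subsume another. Keep a clause only if no
    already-kept clause subsumes it."""
    ordered = sorted(clauses, key=len)
    kept = []
    for c in ordered:
        if not any(subsumes(k, c) for k in kept):
            kept.append(c)
    return kept
```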

Final Procedure

At last, we’ve reached the main subject of this section - the definition of procedure ResolutionOP, which applies the GRP with some optimizations to a set of formulas in clausal form. According to Definition 71, the first step is to rename the variables of the first clause so that no intersection occurs with the second clause. With that goal in mind, we’ve developed two auxiliary programs - VariableInT, which receives a set of clauses S and returns a list of the variables present in the formulas, and Rename, which receives two clauses and renames the first one so that no variable is present in both. The pseudocode for the second algorithm is presented below.

Algorithm 16 Rename Procedure
1: procedure Rename(s1,s2)
2:     notvar ← {a1, ..., an} where ai are the variables already used in s1 and s2
3:     var ← {b1, ..., bm} where bi are the variables that occur simultaneously in both clauses
4:     seq1 ← s1
5:     if var is not empty then
6:         for each variable in var do
7:             seq1 ← seq1 //. {var[[i]] → c_{Max[notvar]+i}}
8:         end for
9:     end if
10:     return {seq1,s2}
11: end procedure

The routine was implemented as follows:

• Lines 2 to 4: We build two lists - notvar, composed of the variables used in s1 and s2, and var, whose elements are the variables that intersect both clauses. The first list was obtained with the procedure VariableInT, while the second one was derived using the built-in function Intersection. We also create an object seq1 to represent the local value of the first expression;

• Lines 5 to 9: If there are variables in var, then the occurrences of those variables in the first expression have to be renamed. To that end, we exploit the properties of the variables. Since they are all numbered, to ensure that no variable is substituted by an already existing variable, we compute the maximum index of all the variables present in both expressions. Then, we simply replace the var variables with consecutively numbered ones, starting with the immediate successor of the maximum index found.
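A hedged Python sketch of this renaming step, under our assumed conventions (clauses as lists of literal strings, variables written 'c<i>'):

```python
import re

VAR = re.compile(r'c\d+')

def rename_apart(clause1, clause2):
    """Rename the variables of clause1 shared with clause2, using fresh
    indices past the maximum index occurring in either clause."""
    v1 = set(VAR.findall(' '.join(clause1)))
    v2 = set(VAR.findall(' '.join(clause2)))
    shared = v1 & v2
    top = max((int(v[1:]) for v in v1 | v2), default=0)
    mapping = {v: 'c%d' % (top + k) for k, v in enumerate(sorted(shared), 1)}
    renamed = [VAR.sub(lambda m: mapping.get(m.group(), m.group()), lit)
               for lit in clause1]
    return renamed, clause2
```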

Again looking at Definition 71, we see that the next step is to compute the factors of the second clause and of the renamed first clause. Hence, we’ve formulated procedures RemoveFactor and RemoveFactors. The first one receives as input a clause and a boolean, and returns the attempt of removing a factored element of the clause, and the indication of whether that attempt was successful or not. The second one receives the set of formulas S and determines the factors of each clause, using RemoveFactor as an auxiliary program.

Algorithm 17 Remove Factor Procedure
1: procedure RemoveFactor(S,b)
2:     iter ← b
3:     res ← S
4:     i ← 1
5:     while iter === True and i ≤ Length[res] do
6:         resi ← res\res[[i]]
7:         j ← 1
8:         while iter === True and j ≤ Length[resi] do
9:             if Head[res[[i]]] === Head[resi[[j]]] then
10:                 m ← MGUFIN(res[[i]],resi[[j]])
11:                 if m ≠ 0 then
12:                     iter ← False
13:                     res ← DeleteDuplicates[res//.m]
14:                 end if
15:             end if
16:             j ← j+1
17:         end while
18:         i ← i+1
19:     end while
20:     return {res,iter}
21: end procedure

It works as follows:

• Lines 2 to 4: Initialization process. We attribute the values of clause S and boolean b to objects used within Block;

• Lines 5 to 6: We scan res, defining for each position i a set resi, consisting of the original clause but without the element in position i;

• Lines 8 to 16: For each element in resi, we check if the head of the formula is the same as the head of the formula in position i that is being tested. If it is, then we try to unify them, using MGUFIN. If there is an MGU of the two expressions, then one of the expressions is a factor of the other. Hence, we change the boolean iter to indicate that a factor was removed (but not necessarily all of them), and return the new clause without the factor.

The pseudocode of the second procedure can be seen below.

Algorithm 18 Remove Factors Procedure
1: procedure RemoveFactors(S)
2:     res ← {}
3:     for each element of S do
4:         iter ← True
5:         a ← S[[i]]
6:         while iter === True do
7:             rf ← RemoveFactor(a,iter)
8:             if rf[[2]] === True then
9:                 iter ← False
10:             else
11:                 a ← rf[[1]]
12:             end if
13:         end while
14:         res ← res ∪ rf[[1]]
15:     end for
16:     return res
17: end procedure

This procedure is very simple to understand. For each clause of the set, we recursively apply the RemoveFactor procedure. If the second element of the output of this procedure is True, then no factor was or can be removed, and so we move on to the next clause. On the other hand, if that boolean is False, then a factor was removed. There might be, however, more factors to be removed. As such, we again apply RemoveFactor to the first element of the output. After finishing the implementation of the auxiliary procedures, we developed a program to compute the GRP. We’ve divided the implementation of this principle into three parts. We start with program Resolution, which receives two clauses and returns the result of applying the GRP to both clauses.

Algorithm 19 Resolution Procedure
1: procedure Resolution(clause1,clause2)
2:     clause ← Rename(clause1,clause2)
3:     cl1 ← RemoveFactors({clause[[1]]})
4:     cl2 ← RemoveFactors({clause[[2]]})
5:     for each element of cl1 do
6:         listi ← {a1, ..., an} where ak are the elements of cl2 that have the same head as cl1[[i]] and are complementary in terms of negations
7:         list ← {b1, ..., bn} where bk are lists whose first element is the position i of cl1[[i]] and second element is a list of indexes of possible complementary literals
8:     end for
9:     res ← {}
10:     if list is not empty then
11:         for each element of list do
12:             for each element of list[[k,2]] do
13:                 mgu ← MGUFIN(cl1[[list[[k,1]]]],cl2[[list[[k,2,j]]]])
14:                 if mgu ≠ 0 then
15:                     res ← res ∪ {cl1∗ ∪ cl2∗}
16:                 end if
17:             end for
18:         end for
19:     end if
20:     if res is empty then
21:         return 0
22:     else
23:         return res
24:     end if
25: end procedure

It was developed as follows:

• Lines 2 to 4: The first steps of the GRP are executed. First, we rename the clauses so that no variable occurs in both clauses simultaneously. Then, we factor both clauses, using RemoveFactors. Notice once again that in the case of FOL abduction, more than one complementary pair can be eliminated in a single step, due to the factoring procedure;

• Lines 5 to 8: For each element of the first clause, we build two lists: an auxiliary list listi, in which we store the elements that are complementary to the test subject in terms of negations and outer symbols, and as such may be Resoluted. Then, we add to the set list a list with two elements. The first one is the position of the test subject, and the second one is a list of the indexes of possibly unifiable complementary literals in cl2. Notice that more than one literal in cl2 may be unifiable and complementary, and as such, the GRP will return all clauses that can be obtained through this process;

• Lines 10 to 15: For each member of list (if it is not empty), we try to compute the MGU between the first formula and one of the formulas whose position is in the second element. If there is an MGU, then we add to the output set the union of the clauses cl1∗, cl2∗, consisting of the clauses cl1, cl2 without the complementary literals, after replacing the remaining elements with the substitutes obtained through the MGU.
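The core of this step, restricted to the propositional case for readability, can be sketched as follows (our illustration; the thesis version additionally renames, factors, and unifies via MGUFIN):

```python
def resolve(c1, c2):
    """All propositional resolvents of two clauses, with literals as
    strings and '~' marking negation."""
    out = []
    for lit in c1:
        comp = lit[1:] if lit.startswith('~') else '~' + lit
        if comp in c2:
            # union of both clauses without the complementary pair
            merged = [l for l in c1 if l != lit] + [l for l in c2 if l != comp]
            out.append(sorted(set(merged)))
    return out
```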

The second procedure - ResolutionStep - receives as input two sets S and S2, and applies one step of the GRP to each pair whose first element is a member of S and whose second element is a member of S2. This detail allows us to save some computation time. At the end of each step, as mentioned before, we need to check for the existence of tautologies, using TautologyElimination. The output is a set of two elements, where the first one is the set of Resoluted clauses, and the second one is a list of the formulas over which Resolution was applied with success. To conclude the implementation of ResolutionOP, we introduce the program itself, which receives a set of clauses S and a value maxiter, and returns either 0, if the empty set was derived, or a set of Resoluted elements to which the GRP was not applied, obtained after maxiter iterations. The value maxiter derives from parameter p, and will play the same role as this value, which was explained at the beginning of the chapter.

Algorithm 20 ResolutionOP Procedure
1: procedure ResolutionOP(S,maxiter)
2:     list ← {}
3:     aux, empty ← False
4:     iter ← maxiter
5:     res ← S
6:     auxi ← {}
7:     while empty === False and aux === False and iter ≠ 0 do
8:         res0 ← ResolutionStep(res,auxi)
9:         list ← list ∪ res0[[2]]
10:         if res0[[1]] is empty then
11:             aux ← True
12:         end if
13:         if res0[[1]] contains the empty set then
14:             empty ← True
15:         end if
16:         auxi ← res0[[1]]
17:         res ← res ∪ res0[[1]]
18:         iter ← iter-1
19:     end while
20:     if empty === True then
21:         return 0
22:     else
23:         if iter === 0 then
24:             print "Resolution has reached the maximum number of iterations. All solutions may not have been found."
25:             return res\list
26:         else
27:             return res\list
28:         end if
29:     end if
30: end procedure

This algorithm is also very straightforward. Starting with a set of formulas in clausal form, we apply ResolutionStep, and add the results to a list, denoted by list. Then, we check if one of the following cases occurs:

• If res0[[1]] is empty, then by definition of ResolutionStep the GRP cannot be applied to any clause. As such, we stop the iteration;

• If res0[[1]] contains the empty set, then the original set is inconsistent, and we again stop the iteration.

If none of these cases occurs, then we continue the procedure. This consists of decreasing the iterator iter, which contains the local value of maxiter, and applying ResolutionStep to the sets res - the original clauses plus the Resoluted clauses that have been computed up to that point - and res0[[1]] - the Resoluted clauses that were computed in the previous step.

Lastly, we output 0 if the empty set was derived, or the set of Resoluted clauses over which the GRP was not applied, i.e. the set res\list, otherwise.
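The control flow of ResolutionOP can be sketched in a simplified propositional setting. This is a hypothetical stand-in, not the thesis's Mathematica routine: real FOL clauses would additionally require MGU computation, clause renaming and factor removal. Clauses are modelled as frozensets of (name, polarity) literals.

```python
from itertools import product

def resolve(c1, c2):
    """Return the set of resolvents of two propositional clauses."""
    resolvents = set()
    for lit in c1:
        comp = (lit[0], not lit[1])
        if comp in c2:
            resolvents.add(frozenset((c1 - {lit}) | (c2 - {comp})))
    return resolvents

def resolution_op(clauses, maxiter):
    """Bounded saturation mirroring ResolutionOP: returns 0 if the empty
    clause is derived, otherwise the clauses not consumed by resolution."""
    res = set(clauses)          # all clauses seen so far
    new = set(clauses)          # clauses produced in the previous step
    used = set()                # parents resolved with success (the 'list' set)
    for _ in range(maxiter):
        step = set()
        for c1, c2 in product(res, new):  # pair only against fresh clauses
            r = resolve(c1, c2)
            if r:
                used |= {c1, c2}
                step |= r
        step -= res                  # keep genuinely new clauses
        if frozenset() in step:
            return 0                 # empty clause: input set inconsistent
        if not step:
            break                    # saturation reached
        new, res = step, res | step
    return res - used

# Example: {p}, {¬p ∨ q}, {¬q} is inconsistent
p, np = ("p", True), ("p", False)
q, nq = ("q", True), ("q", False)
print(resolution_op({frozenset({p}), frozenset({np, q}), frozenset({nq})}, 10))  # → 0
```

Pairing each fresh clause against the accumulated set, instead of re-pairing all clauses, is the computation-time saving mentioned above for ResolutionStep.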

5.4.4 Hypotheses Formulation

In the last step of our implementation section, we'll introduce the remaining procedures, mainly concerning the ones that will give origin to the hypotheses set. We've built three auxiliary programs that work as explained in Section 5.2.4 - FormulateCandidateHypothesesFOL, RemoveInconsistentHypothesesFOL and SelectGoodHypothesesFOL.

Algorithm 21 Formulate Candidate Hypotheses FOL
1: procedure FormulateCandidateHypothesesFOL(H, Comp)
2: return {a1, ..., an}, where each ai is a subset of ¬H with length between 1 and Comp
3: end procedure
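The enumeration performed by FormulateCandidateHypothesesFOL can be sketched as follows (an illustrative Python rendering; literals are assumed to be (name, polarity) pairs, and negation flips the polarity flag):

```python
from itertools import combinations

def formulate_candidate_hypotheses(h_literals, comp):
    """Return every subset of the negated literals with size 1..comp."""
    negated = [(name, not pol) for (name, pol) in h_literals]
    return [set(c) for k in range(1, comp + 1)
                   for c in combinations(negated, k)]

# Literals ¬P(c1) and ¬W(c2) are negated to P(c1) and W(c2)
cands = formulate_candidate_hypotheses([("P(c1)", False), ("W(c2)", False)], 2)
print(cands)  # the two singletons, then the pair
```

With n literals and bound Comp, the number of candidates is the sum of the binomial coefficients C(n, 1) through C(n, Comp), which is why the complexity parameter has a direct impact on the running time.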

Algorithm 22 Remove Inconsistent Hypotheses FOL
1: procedure RemoveInconsistentHypothesesFOL(T, C, F, H, p)
2: list ← {}
3: for each element in H do
4: res ← ResolutionOP(SubsumedElimination(PureLiteralElimination(T ∪ C ∪ H[[i]])), p)
5: res1 ← ResolutionOP(SubsumedElimination(PureLiteralElimination(H[[i]] ∪ F)), p)
6: if res ≠ 0 and res1 ≠ 0 then
7: list ← list ∪ H[[i]]
8: end if
9: end for
10: return list
11: end procedure

Algorithm 23 Select Good Hypotheses FOL
1: procedure SelectGoodHypothesesFOL(T, C, H, F, Exp, Comp, p)
2: list ← {}
3: for each element in H do
4: if Length[H[[i]]] ≤ Comp then
5: count ← 0
6: for each element in F do
7: if ResolutionOP(SubsumedElimination(PureLiteralElimination(T ∪ C ∪ H[[i]] ∪ F[[j]])), p) === 0 then
8: count ← count + 1
9: end if
10: end for
11: if count ≠ 0 then
12: list ← list ∪ {{H[[i]], count}}
13: end if
14: end if
15: end for
16: return Select[list, Last[#] ≥ Exp &]
17: end procedure

In Algorithms 22 and 23, the goal is to conclude whether or not a set is inconsistent. Hence, before starting the computation, we apply two optimization procedures developed before - SubsumedElimination and PureLiteralElimination. The latter removes all occurrences of a literal within a clause that does not have a complementary literal in any other clause, and therefore cannot be Resoluted to obtain an empty clause; the former transforms the input set into a set of clauses with no subsumed clauses, which reduces the number of clauses to be tested without compromising the consistency of the set. Notice that in line 9 of Peirce's FOL algorithm, the application of ResolutionOP has a different goal. Instead of searching for an empty clause, we want to apply the GRP to every possible pair of clauses in order to obtain a set R. Hence, even if a clause contains a pure literal, other literals within that clause can still be Resoluted, and the resulting clause may be appropriate for building a hypothesis. For this reason, we won't add these optimization procedures directly to the ResolutionOP routine. The same situation occurs in Consistent.
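The two optimization steps can be sketched in a minimal propositional form (an illustrative rendering, with clauses as sets of (name, polarity) literals; the pure-literal rule is shown here in its classical clause-deleting form, which likewise preserves consistency):

```python
def pure_literal_elimination(clauses):
    """Classical pure-literal rule: a literal whose complement appears in no
    clause can never be resolved away, so every clause containing it is
    satisfiable on its own and can be dropped without affecting consistency."""
    literals = {lit for clause in clauses for lit in clause}
    pure = {lit for lit in literals if (lit[0], not lit[1]) not in literals}
    return [c for c in clauses if not (c & pure)]

def subsumed_elimination(clauses):
    """Drop any clause that is a proper superset of another clause:
    the smaller clause already subsumes it."""
    return [c for c in clauses
            if not any(other < c for other in clauses)]

p, np, q = ("p", True), ("p", False), ("q", True)
print(pure_literal_elimination([{p}, {np, q}]))  # q is pure → [{('p', True)}]
print(subsumed_elimination([{p}, {p, q}]))       # {p} subsumes {p, q}
```

Both passes shrink the input to ResolutionOP before the consistency checks of Algorithms 22 and 23, which is where most of the running time is spent.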

5.4.5 Implementation of Peirce’s FOL Algorithm

After implementing all these procedures, we're now capable of constructing the final Peirce's FOL algorithm. The pseudocode was introduced in the beginning of the chapter. The explanation of how this algorithm produces its output is similar to the one given for Peirce's algorithm, with the differences already mentioned and explained. As such, we'll only mention a few details.

• Lines 2 to 3: The sets are required to contain some formulas, and these formulas must be related, directly or indirectly through other formulas in the sets, to the facts set. Otherwise, the algorithm is not capable of formulating hypotheses (in the case where T, C = ∅) or relating the hypotheses to the facts. The most interesting hypotheses will obviously consist of formulas not related directly to F;

• Lines 4 to 5: When we apply the CLAFOL procedure to the sets, it may be the case that the formulas used to define the theory and the conditions set are tautologies, i.e. satisfiable in every interpretation structure, or plain contradictions. As such, the result of the translation into clausal form will be the formula True or False. Since we cannot operate over these types of formulas, we have to remove them from the set. If all translated formulas are removed, then both sets become empty. Hence, we added a guard that stops the procedure and requests that the user insert more formulas into the sets. This may be problematic, since if every formula in the theory set (for example) is an axiom, then the translation will result in the empty set. Hence, to explain certain situations, alternate theories must be developed;

• Line 7: The Consistent procedure was not properly defined. Instead, it just consists of an application of the ResolutionOP procedure to the set TCLA ∪ CCLA, which was run through the SubsumedElimination and PureLiteralElimination procedures beforehand. The reason for this was explained in the previous section;

• As seen and demonstrated before, when the set is satisfiable, the application of ResolutionOP may require the activation of the guard maxiter = p, denoting the maximum number of iterations allowed. This stoppage may result in three cases, two of which were already mentioned. First, not all solutions may be obtained, resulting in an incomplete hypotheses set. Secondly, all solutions may be obtained, resulting in a complete hypotheses set. Lastly, we may erroneously conclude that a set is consistent even though it is not. The answers in each of these cases must be tested before being considered correct;

• In the optimization procedures, we mentioned a few types of Resolution - Unit, Input and Linear. However, the routine we've developed does not follow any of these, due to the fact that they are oriented towards finding an empty clause, while our program has more than one goal attached to it. A different optimization was applied, reducing the number of clauses being compared in each iteration;

• In terms of output, the formulas obtained are not in FOL. Instead, they're in FOL∗. Hence, we cannot directly conclude that the hypotheses we've obtained explain the facts, since they may contain variables or constants that are not part of the original language of the system. We could try to apply a reverse Skolemization algorithm (e.g. the one presented in [41]). This process is, however, very time-consuming, since it requires a significant number of guesses and comparisons. Thus, we'll instead present a simplification procedure that will allow us to have a better visualization of the final result. This simplification - which we can also consider to be a kind of reverse Skolemization - is based on the following pattern:

– The output formula contains only variables ci. Then we universally quantify these variables, and possibly apply a substitution in order to translate the output formula into a formula contained in the theory set;

– The output formula contains only constants xi. Then we either existentially quantify these constants, possibly replacing them with variables present in the theory set, or simply keep the formula untouched;

– The output formula contains both variables ci and constants xi. Then we operate over the variables according to the first item, and over the constants according to the second item;

– If a formula contains a function symbol fj applied over variables and constants, we first existentially quantify a variable ci representing the function, and then universally quantify both variables and constants. We then apply the first and second cases to the variables and constants not yet analysed;

– If a formula contains two function symbols with the same arity, then we have to apply the previous step starting with one function symbol at a time. This results in a branched translation. We'll present an example further ahead.
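The first two cases of this pattern can be sketched as a simple string transformation. This is a hypothetical illustration, assuming the naming convention visible in the results: names c1, c2, ... are quantified universally and Skolem names x874, ... existentially; the regular expressions are part of the assumption.

```python
import re

def simplify(formula):
    """Sketch of the first two simplification cases: c-names are universally
    quantified and x-names existentially quantified (assumed convention)."""
    cs = sorted(set(re.findall(r"\bc\d+\b", formula)))
    xs = sorted(set(re.findall(r"\bx\d+\b", formula)))
    prefix = "".join(f"∀{c} " for c in cs) + "".join(f"∃{x} " for x in xs)
    return prefix + formula

print(simplify("P(c1)"))     # ∀c1 P(c1)
print(simplify("W(x2390)"))  # ∃x2390 W(x2390)
```

A renaming substitution such as {c1 → y} can then be applied by hand to match the variables of the original theory set, as in the worked example at the end of the next chapter.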

Chapter 6

Results Analysis

In this chapter, we’ll provide and discuss a few examples of the application of Peirce’s FOL algorithm. We’ll also discuss some computation times, even though the first goal of this thesis was already reached. Some results can be found in the table below. The timing was tested using the RepeatedTiming built-in function, preceded by the ClearSystemCache[] command. Due to space constraints, we’ll present other examples in Appendix A. We can analyse the results based on the positive or negative influence over the running time of the parameters |T |, |C|, |F |, maxiter, ExpPow and Comp. We can also discuss the hypotheses obtained (if any).

• In terms of |T | and |C|, despite playing different roles in the overall problem, they play the same one within the scope of the algorithm. Hence, they influence the running time in the same manner. From the tests, we’ve noticed that the running time increases when |T | and |C| increase. Rows 1 through 3 of Table 6.1 are an example of this. Moreover, in terms of the structure of the formulas within these sets, we’ve noticed that the processing of universally quantified formulas is slightly more costly than the processing of existentially quantified formulas. In terms of the order of the quantifiers, we’ve verified that when the head quantifier is universal, the time increases. The presence of formulas with conjunctions also leads to a higher running time, since their translation to clausal form results in most cases in more than one clause, which in turn increases the number of computations in ResolutionOP;

• In terms of |F |, the results are similar to the ones above. The higher the number of facts, the higher the running time, since the program SelectGoodHypothesesFOL will have to apply ResolutionOP for each pair made up of hypothesis and fact. Rows 8 and 9 are an example;

• In terms of maxiter, it’s obvious that the running time is only influenced when the set is satisfiable, and thus Resolution may not terminate, or when the value chosen does not allow the full application of the GRP, triggering this guard. In this case, the higher the value of maxiter, the higher the running time of the algorithm, due to the fact that the printing of the error message consumes a significant amount of computation time. One interesting case occurs when the value of maxiter is increased until the guard is no longer triggered. In such an event, the time reduces, since there’s no need to output the message, and the number of formulas over which the algorithm works is fully reduced with ResolutionOP. This case can be observed in Rows 5 and 6 of Table 6.1. In Row 5, the guard is triggered, while in Row 6, the program fully terminates;

• In terms of ExpPow and Comp, only the latter has a negative influence on the running time, since higher complexity implies a higher number of clauses that must be tested with the ResolutionOP

Columns: T; C; F; H; maxiter; ExpPow; Comp; Average Time (sec).

Row 1: T = {∀y(B(y) ⇒ W(y)), ∀y(S(y) ⇒ W(y)), ∀y(V(y) ⇒ B(y)), ∀y(Q(y) ⇒ P(y))}; C = {S(c1), V(c1), ∃yL(y)}; F = {∃zW(z)}; H = ∅; maxiter = 4; ExpPow = 1; Comp = 1; time = 0.1340
Row 2: T as in Row 1 plus ∀xL(x) ⇒ ∀x(¬B(x)); C = {S(c1), V(c1), ∃yL(y)}; F = {∃zW(z)}; maxiter = 4; ExpPow = 1; Comp = 1; time = 15.8600
Row 3: T as in Row 2 plus ∀xL(x) ⇒ ∀x(¬S(x)); C = {S(c1), V(c1), ∃yL(y)}; F = {∃zW(z)}; maxiter = 4; ExpPow = 1; Comp = 1; time = 18.0000
Row 4: T = {∀xS(x), ∃x∃y(S(x) ⇒ (¬P(x) ∧ U(x, y)))}; C = {∃x∃yU(x, y) ⇒ ∀yQ(y)}; F = {∃zQ(z)}; H: TCLA and CCLA already explain the facts; maxiter = 4; ExpPow = 1; Comp = 1; time = 0.0410
Row 5: T = {∀xS(x), ∃x∃y(S(x) ⇒ (¬P(x) ∨ U(x, y)))}; C = {∃x∃yU(x, y) ⇒ ∀yQ(y)}; F = {∃zQ(z)}; H = {P(x874)}; maxiter = 4; ExpPow = 1; Comp = 1; time = 0.2000
Row 6: as Row 5 with maxiter = 8; H = {P(x913)}; time = 0.1810
Row 7: T = {∀xP(x)}; C = {∀x(P(x) ⇒ V(x))}; F = {∃zQ(z)}; H = ∅; maxiter = 4; ExpPow = 1; Comp = 1; time = 0.0162
Row 8: T = {∀x(P(x) ⇒ Q(x)), ∃x(W(x) ⇒ Q(x)), ∀x∃y(S(x) ⇒ Q(y))}; C = ∅; F = {∃zQ(z)}; H = {P(c1), W(x2390), S(f2391(c1))}; maxiter = 4; ExpPow = 1; Comp = 1; time = 0.096
Row 9: T = {∀x(P(x) ⇒ Q(x)), ∃x(W(x) ⇒ Q(x)), ∀x∃y(S(x) ⇒ Q(y))}; C = ∅; F = {∃zQ(z), ∀zQ(z)}; H = {P(x957), W(x955), S(c2)}; maxiter = 4; ExpPow = 1; Comp = 1; time = 0.9100
Row 10: as Row 9 with ExpPow = 2; H = {P(c1), P(x999)}; time = 0.7130

Table 6.1: Results obtained with PEIRCEFOL

procedure than the ones needed for less complexity. Since ExpPow is only used to select the final hypotheses, its value does not have a significant influence over the running time. It only influences the time spent outputting the hypotheses, which is residual compared to the total time of the algorithm. The difference can be observed, for example, in the last two rows of Table 6.1;

• In terms of output, we have many possible cases. When the number of formulas in the conditions set is higher, the number of hypotheses outputted tends to decrease, since they are required to satisfy more conditions. When the complexity increases, the number of hypotheses outputted can also increase, since different combinations of formulas can satisfy a bigger subset of the facts set. When the explanatory power #2 increases, the number of hypotheses outputted can decrease, since they are required to satisfy more facts;

• Throughout the thesis, we’ve mentioned multiple times that the formulas outputted are not in FOL, but in FOL∗. They are still in clausal normal form. In order to determine whether or not these formulas actually solve our problem, i.e. satisfy equations (5.1), (5.2), (5.3) and (5.4), we have to use the reverse Skolemization procedure mentioned in the previous chapter. For example, in Row 8, H = {{P(c1)}, {W(x2390)}, {S(f2391(c1))}}. Using the idea introduced above, the proper hypotheses set would be H∗ = {{∀xP(x)}, {∃xW(x)}, {∀x∃yS(x)}}. Notice that the substitutions have to be constructed by looking at the original input sets, since the goal is to obtain formulas contained within them. This justifies not implementing this procedure, since the number of substitutions that we could possibly apply would grow exponentially with the number of formulas in the input sets. The time consumed by this procedure would not compensate for its usefulness, since one can look at H∗ and determine the best substitutions to use manually. A special case that we must mention occurs when a formula in the hypotheses set contains two function symbols with the same arity. In this case, we have branched solutions. For example, if {P(c1, f1(c2, c3), f2(c4, c5))} ∈ H∗, then {∀x∀w∀v∃t∀y∀z∃uP(x, u, t), ∀x∀y∀z∃t∀w∀v∃uP(x, t, u)} ∈ H. All the variables can be renamed to variables within the system through a renaming substitution, due to the fact that the proof systems used are closed for substitutions;

• The hypotheses set generated can also have more formulas than the original set. This is exemplified in Row 9, where the formula ∀x(P(x) ⇒ Q(x)) generates two hypotheses - P(c1) and P(x2449) - where the first satisfies both facts, while the second one only satisfies the last fact, after the reverse Skolemization;

• When the facts are not directly or indirectly related to the theory and conditions sets, the algorithm is not capable of producing hypotheses. Row 7 is an example of this case;

• Sometimes, even if the theory and conditions sets are related to the facts, all hypotheses generated may fail to satisfy equation (5.2). In these cases, we are left with the set H∗ = H = ∅. One alteration that may change this outcome is an increase in the complexity Comp used for the selection criterion #2;

• The general running time of this algorithm in relation to the original one is higher due to two factors. Firstly, we have the extra translation from FOL to FOL∗, while in the original case we only had to execute the CNF translation. Secondly, the Resolution principle used for PROP formulas requires only a simple comparison between the formulas, while in the case of the GRP implementation in ResolutionOP, there are several steps to be taken, including finding MGUs, renaming clauses and removing factors, all of which require several comparisons (evidenced by the number of For and While loops used during the implementation). As mentioned, we tried to balance this increase in the running time with some optimizations, namely subsumed elimination, tautology elimination and a reduction of the number of clauses compared in each ResolutionStep. The average running time remains, however, higher than the original one (as it should);

• In general, the solutions obtained for practical problems can be easily traced back to the original formula. Hence, even though the average running time of this algorithm is higher than the original one, the increase in the number of concrete situations to which this new algorithm can be applied, in relation to the first one, is a big advantage.

To conclude this chapter, we’ll provide an example of how the final result is obtained. Consider Row 1 of Table 6.1. After applying Peirce’s FOL algorithm, we obtain the hypotheses set H∗ = {{P(c1)}, {V(c1)}}. Using the reverse Skolemization idea, we get H = {{∀c1 P(c1)}, {∀c1 V(c1)}}.

By looking at the original set, we can conclude that the renaming substitution ρ = {c1 → y} is very useful, since it leads to ρ(H) = {{∀yP(y)}, {∀yV(y)}}. We now need to show that

T ∪ C ∪ {∀yP(y)} ⊨FOL F (6.1)

T ∪ C ∪ {∀yV(y)} ⊨FOL F (6.2)

From the definition of ⊨FOL, we need to show that

if I, µ ⊨FOL T ∪ C ∪ {∀yP(y)} then I, µ ⊨FOL F (6.3)

if I, µ ⊨FOL T ∪ C ∪ {∀yV(y)} then I, µ ⊨FOL F (6.4)

where I is an interpretation structure over CFOL and µ is an assignment into I. We’ll show the first case, since the second is analogous. Suppose that I, µ ⊨FOL T ∪ C ∪ {∀yP(y)}. Then, in particular,

I, µ ⊨FOL ∀yP(y) (6.5)

I, µ ⊨FOL ∀y(P(y) ⇒ Q(y)) (6.6)

By definition, from (6.6), we know that

I, µ′ ⊭FOL P(y) or I, µ′ ⊨FOL Q(y) (6.7)

where µ′ is y-equivalent to µ. On the other hand, from (6.5),

I, µ′′ ⊨FOL P(y) (6.8)

where µ′′ is y-equivalent to µ. Taking µ′′ = µ′, we conclude from (6.7) that

I, µ′ ⊨FOL Q(y) (6.9)

which by definition happens only if

I, µ ⊨FOL ∀yQ(y) (6.10)

Chapter 7

Other Applications

Besides its main application, this algorithm can also be used in other contexts. In particular, we’ll use it to automate the definition of abduction functions for certain kinds of proof systems.

7.1 Proof Systems in FOL

When we look at the definition of proof system, we notice that it contains a set D of derivations and a family P of relations that, given a derivation d, a set of formulas Γ and a formula ϕ, indicates whether d is a derivation of ϕ from Γ. As such, one may conjecture that there exists a subset ∆ of Γ that also gives way to a derivation of ϕ. If this subset has smaller cardinality than Γ, then we’ve gained an advantage, in the sense that a smaller number of formulas have to be valid in order for P∆(d, ϕ) = 1. The existence of this set is demonstrated in the next theorem, taken from [42].

Theorem 4. (Compactness Theorem) Suppose that Γ ⊨FOL ϕ. Then there exists a finite ∆ ⊆ Γ such that ∆ ⊨FOL ϕ.

In this sense, it would be useful to have a way of computing that set, since a smaller set would take less time to process in any procedure. In [11], a function to compute this set is defined.

Definition 80. (Abduction Function) Let P = ⟨C, D, ◦, P⟩ be a proof system. An abduction function for P is a computable function

Abd : D → ℘fin(℘fin(L(C)))

such that for any Γ ⊆ L(C), if PΓ(d, ϕ) = 1, then there is ∆ ∈ Abd(d) such that P∆(d, ϕ) = 1 and ∆ ⊆ Γ.

In a simpler manner, given a set of formulas Γ, an abduction function receives a derivation of some formula as input and returns a subset ∆ ⊆ Γ of formulas that also gives way to a derivation of the same formula. The idea is to obtain simpler explanations of a certain formula. Not all proof systems have an abduction function. The ones that do are named as follows.

Definition 81. (Hypothesis-Abductible Proof System) A proof system P = ⟨C, D, ◦, P⟩ is said to be hypothesis-abductible if there is an abduction function for P.

Example 17. The proof systems P(PROP) and P(FOL) are hypothesis-abductible.
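The contract in Definition 80 can be expressed as a checkable predicate over a toy proof system. This is a hypothetical illustration: the callables `proof_system` and `abd`, and the toy instances below, are not part of the formalism.

```python
def satisfies_abduction_contract(proof_system, abd, d, phi, gamma):
    """Check Definition 80 on one instance: if gamma yields a derivation d
    of phi, some Delta in Abd(d) must be a subset of gamma that still does.
    proof_system(delta, d, phi) plays the role of P_Delta(d, phi)."""
    if not proof_system(gamma, d, phi):
        return True  # the contract only constrains the positive case
    return any(set(delta) <= set(gamma) and proof_system(delta, d, phi)
               for delta in abd(d))

# Toy system: a derivation d "is" its conclusion, and phi is derivable
# from gamma exactly when phi is already in gamma.
toy = lambda gamma, d, phi: phi in gamma
abd = lambda d: [[d]]
print(satisfies_abduction_contract(toy, abd, "q", "q", {"p", "q"}))  # True
```

Here the toy abduction function returns the smallest possible explanation, the singleton {ϕ} itself, which trivially satisfies the contract.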

With that in mind, we’ll try to use Peirce’s FOL algorithm to compute an abduction function for P(FOL), since this proof system has more expressive power. As we’ve mentioned, the algorithm receives as input a TCHF reasoning framework #2. Hence, we need to introduce a way of connecting the proof system with the framework. This will be done through the following definition.

Definition 82. (TCHF Reasoning Framework #2 for P(⟨CFOL, `FOL⟩))

Let Γ be a set of formulas in FOL, and d = ⟨ϕ1, ..., ϕn⟩ a Hilbert-derivation within the proof system P(FOL). Then d′ = (T, C, F) is inductively defined as follows:

• F = {ϕn};

• if ϕi ∈ Γ then ϕi ∈ T ;

• if ϕi is Hilbert-derived from ϕj and ϕj ∈ {ϕ1, ..., ϕi−1}, then {ϕj ⇒ ϕi} ∈ C;

• if ϕi is ϕn and {ϕj ⇒ ϕi} ∈ C, then ϕj ∉ T.

This new definition of derivation may seem strange, especially the last item. But this definition was not made at random. The general goal of Peirce’s algorithm is to find a set of hypotheses capable of justifying at least one of the facts, together with the theory and the conditions set, assuming that those facts are not entailed by the background sets T and C. However, in the scope of our proof systems and derivations, this last condition is required to be true. Hence, we need to alter the background sets in such a way that the facts are no longer directly entailed by them, but are still related to them through other operations such as implications, disjunctions and conjunctions. Thus the strange definition of derivation introduced above. Notice that the removal of formulas in item 4 does not imply that some hypothesis might be overlooked, since the implication containing this literal is still present in the conditions set. Regarding the application of Peirce’s FOL algorithm in the context of these proof systems, suppose that we are given an induced proof system P(FOL), and a derivation within that proof system - d ∈ D - such that PΓ(d, ϕ) = 1, where Γ is a set of formulas in FOL and ϕ is a formula in FOL. To determine ∆ ∈ Abd(d) such that ∆ ⊆ Γ and P∆(d, ϕ) = 1, we determine d′ and apply Peirce’s FOL algorithm. However, from the definition of abduction function, the hypotheses are required to satisfy all of the facts, and not only some of them. Hence, we’ll impose this condition. The output will be a hypotheses set H∗ with formulas in FOL∗. This set must be manually translated to a set H of hypotheses in FOL, which in turn must be crossed with the original theory set in order to determine a set ∆ ⊆ Γ such that

P∆(d, ϕ) = 1.

7.2 Fibring

Another important notion intrinsic to proof systems is fibring. In basic terms, the fibring of two proof systems consists of a system that has properties of both original systems (but not necessarily all of them). As we’ll see, these new systems will also be proof systems. In order to properly define the fibring of two proof systems, we’ll first introduce a mechanism that’ll allow us to translate formulas from one signature into another, as well as a few notions related to the closure of sets of formulas.

Definition 83. (Translation) Assume that C ⊆ C′ and let g : L(C′) → N be a bijection. The translation τg : L(C′) → L(C) is a map defined recursively as follows:

• τg(ϕi) = ϕ2i+1, for ϕi ∈ Ξ;

72 • τg(c) = c, for c ∈ C0;

• τg(c(ϕ1, ..., ϕk)) = c(τg(ϕ1), ..., τg(ϕk)), for c ∈ Ck and ϕ1, ..., ϕk ∈ L(C′);

• τg(c(ϕ1, ..., ϕk)) = ϕ2g(c(ϕ1,...,ϕk)), for c ∈ C′k \ Ck and ϕ1, ..., ϕk ∈ L(C′).

Definition 84. (Inverse Translation) Let τg⁻¹ : Ξ → L(C′) be the following substitution:

• τg⁻¹(ϕ2i+1) = ϕi, for ϕi ∈ Ξ;

• τg⁻¹(ϕ2i) = g⁻¹(i).

It’s trivial to notice that τg⁻¹ ◦ τg = τg ◦ τg⁻¹ = id, assuming again that C ⊆ C′. In Definition 8, we presented the notion of closure of a set of formulas. In terms of fibrings, the closure of Γ is based on the same idea, but is defined not in terms of consequence relations, but instead in terms of the translations introduced above.
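The translation and its inverse can be sketched over a tiny term representation. This is an illustrative rendering, not part of the formalism: ('var', i) stands for the variable ϕi, (c, arg, ...) applies a constructor c, and the bijection g is realised lazily as a memo table.

```python
def make_translation(shared_constructors):
    """Sketch of Definitions 83/84: shared constructors are translated
    recursively, foreign ones are encoded as even-indexed variables."""
    g, g_inv = {}, {}

    def tau(phi):
        if phi[0] == "var":                      # ϕ_i ↦ ϕ_{2i+1}
            return ("var", 2 * phi[1] + 1)
        c, args = phi[0], phi[1:]
        if c in shared_constructors:             # shared c: translate below
            return (c,) + tuple(tau(a) for a in args)
        if phi not in g:                         # foreign c: ϕ_{2g(φ)}
            g[phi] = len(g)
            g_inv[g[phi]] = phi
        return ("var", 2 * g[phi])

    def tau_inv(phi):                            # extended homomorphically
        if phi[0] == "var":
            i = phi[1]
            return ("var", i // 2) if i % 2 else g_inv[i // 2]
        return (phi[0],) + tuple(tau_inv(a) for a in phi[1:])
    return tau, tau_inv

tau, tau_inv = make_translation({"and"})
phi = ("and", ("box", ("var", 0)), ("var", 1))   # "box" is foreign
assert tau_inv(tau(phi)) == phi                  # τg⁻¹ ◦ τg = id
```

The foreign subformula is stored whole by g, exactly as in the last clause of Definition 83, which is what makes the round trip the identity.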

Definition 85. (Closure of Γ′) Assume that C ⊆ C′ and let C = ⟨C, `⟩ be a consequence system. The closure of Γ′ ⊆ L(C′) is given by Γ′` = τg⁻¹((τg(Γ′))`), where τg is a translation from L(C′) into L(C).

We can hence present the notion of fibring, starting with the fibring of consequence systems.

Definition 86. (Fibring of Consequence Systems) Let C′ = ⟨C′, `′⟩ and C″ = ⟨C″, `″⟩ be consequence systems. The fibring of these systems, denoted by C′ ] C″ = ⟨C, `⟩, is such that:

• C = C′ ∪ C″;

• ` : ℘(L(C)) → ℘(L(C)), where, for each Γ ⊆ L(C), Γ` is recursively defined as follows:

– Γ ⊆ Γ`;

– If ∆ ⊆ Γ`, then ∆`′ ∪ ∆`″ ⊆ Γ`;

where ∆`′ and ∆`″ are defined using the translations τg′ : L(C′) ∪ L(C″) → L(C′) and τg″ : L(C′) ∪ L(C″) → L(C″), and the bijection g : L(C′) ∪ L(C″) → N.

For purposes related to the next proof, we need to define fibring using a different construction method. To show that this construction will actually produce the same result as the fibring defined above, we’ll need to use a version of Tarski’s fixed point theorem, which we state next.

Definition 87. (Complete Lattice) A complete lattice is a partially ordered set in which every subset has both a supremum and an infimum.

Definition 88. (Monotone Function) A monotone function is a function between ordered sets that preserves or reverses the given order.

Theorem 5. (Tarski’s Fixed Point Theorem) Let ⟨L, ≤⟩ be a complete lattice, and let f : L → L be a monotone function. Then the set P of all fixed points of f is also a complete lattice ⟨P, ≤⟩, and

⋀ P = ⋀ {x ∈ L : f(x) ≤ x} (7.1)

is the least fixed point of f.
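In the finite case, the least fixed point guaranteed by the theorem can be reached by simple iteration from the bottom element, with no transfinite stages needed. The following sketch (the operator f is illustrative) shows this for a closure-style operator on a finite powerset lattice.

```python
def least_fixed_point(f, bottom=frozenset()):
    """Iterate a monotone operator f from the bottom element of a finite
    powerset lattice until it stabilises; by Tarski's theorem the result
    is the least fixed point of f."""
    x = bottom
    while True:
        y = f(x)
        if y == x:
            return x
        x = y

# Closure under one 'rule': always add 1 and 2; if both are present, add 3
f = lambda s: s | frozenset({1, 2}) | (frozenset({3}) if {1, 2} <= s else frozenset())
print(least_fixed_point(f))  # frozenset({1, 2, 3})
```

This is the same shape of computation as the transfinite sequences used below for `α and Dα: each stage applies the operator to the previous stage until nothing new is produced.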

Equipped with this theorem, we can prove the following proposition.

Proposition 29. Given two consequence systems C′ = ⟨C′, `′⟩ and C″ = ⟨C″, `″⟩, consider the transfinite sequence of quasi consequence systems defined as follows:

• C0 = ⟨C′ ∪ C″, `0⟩, where Γ`0 = Γ for every Γ ⊆ L(C);

• Ci+1 = ⟨C′ ∪ C″, `i+1⟩, where Γ`i+1 = τg′⁻¹((τg′(Γ`i))`′) ∪ τg″⁻¹((τg″(Γ`i))`″);

• Cα = ⟨C′ ∪ C″, `α⟩, where Γ`α = ⋃i<α Γ`i if α is a limit ordinal.

Then C′ ] C″ = Cα.

Proof. Let Υ : ℘(L(C)) → ℘(L(C)) be the operator such that Υ(∆) = ∆`′ ∪ ∆`″. This operator is such that Γ ⊆ Υ(Γ) over the complete lattice ⟨℘(L(C)), ⊆⟩, and hence it’s monotonic and extensive. By Tarski’s fixed point theorem, for each Γ there exists a least fixed point Γ`α. We then notice that Γ`α = ⋀{Γ ∈ ℘(L(C)) : Υ(Γ) ⊆ Γ} = Γ`. The result follows.

Another important result is presented in Proposition 30.

Proposition 30. The fibring C′ ] C″ is a consequence system.

Proof. We need to show that the fibring system satisfies the conditions established in Definition 8.

• Extensivity - We want to show that Γ ⊆ Γ`. Let γ ∈ Γ. Then τg′(γ) ∈ τg′(Γ). By extensivity of `′, τg′(γ) ∈ (τg′(Γ))`′. Therefore, τg′⁻¹(τg′(γ)) ∈ τg′⁻¹((τg′(Γ))`′). Hence, γ ∈ τg′⁻¹((τg′(Γ))`′), which implies that γ ∈ Γ`′. We conclude that γ ∈ Γ`;

• Monotonicity - We want to show that if Γ1 ⊆ Γ2 then Γ1` ⊆ Γ2`. Assume that Γ1 ⊆ Γ2. Then τg′(Γ1) ⊆ τg′(Γ2) and τg″(Γ1) ⊆ τg″(Γ2). By monotonicity of `′ and `″, (τg′(Γ1))`′ ⊆ (τg′(Γ2))`′ and (τg″(Γ1))`″ ⊆ (τg″(Γ2))`″. Hence, τg′⁻¹((τg′(Γ1))`′) ⊆ τg′⁻¹((τg′(Γ2))`′) and τg″⁻¹((τg″(Γ1))`″) ⊆ τg″⁻¹((τg″(Γ2))`″). As such, Γ1` ⊆ Γ2`;

• Idempotence - We want to show that (Γ`)` ⊆ Γ`. By Proposition 29, we know that there’s an α such that (Γ`)` = (Γ`)`α. Hence, we need to show that (Γ`)`α ⊆ Γ`, for every α. We can proceed by induction over α:

– Base: α = 0. Then (Γ`)`0 ⊆ Γ`;

– Step: α = β + 1. By induction hypothesis, (Γ`)`β ⊆ Γ`. By definition of Γ`, ((Γ`)`β)`′ ∪ ((Γ`)`β)`″ ⊆ Γ`. Hence, by Definition 86,

((Γ`)`β)`′ ∪ ((Γ`)`β)`″ = τg′⁻¹((τg′((Γ`)`β))`′) ∪ τg″⁻¹((τg″((Γ`)`β))`″) = (Γ`)`β+1 ⊆ Γ`

– Step: α is a limit ordinal. Then, by definition, (Γ`)`α = ⋃β<α (Γ`)`β. By induction hypothesis, ⋃β<α (Γ`)`β ⊆ ⋃β<α Γ` ⊆ Γ`. The result follows;

• Closure for renaming substitutions - We want to show that ρ(Γ`) ⊆ (ρ(Γ))` for every renaming substitution ρ. Let ρ be a renaming substitution. For each Γ ⊆ L(C), there’s an α such that Γ` = Γ`α. We’ll use induction over α, showing that ρ(Γ`α) ⊆ (ρ(Γ))`:

– Base: α = 0. Then ρ(Γ`0) = ρ(Γ) ⊆ (ρ(Γ))`, by extensivity;

– Step: α = β + 1. Since C′ and C″ are closed for renaming substitutions, ρ(Γ`α) = ρ(((Γ`β)`′) ∪ ((Γ`β)`″)) ⊆ (ρ(Γ`β))`′ ∪ (ρ(Γ`β))`″. By induction hypothesis, ρ(Γ`β) ⊆ (ρ(Γ))`. The result follows;

– Step: α is a limit ordinal. Straightforward.

The next step is to define the fibring of proof systems. This construction will be very similar to the one used for the fibring of consequence systems, while not forgetting the particularities of these abstract systems.

Definition 89. (Fibring of Proof Systems) The fibring of two proof systems P′ = ⟨C′, D′, ◦′, P′⟩ and P″ = ⟨C″, D″, ◦″, P″⟩ is a tuple P′ ] P″ = ⟨C, D, ◦, P⟩ such that:

• C = C′ ∪ C″;

• D = ⋃n∈N Dn, where:

– D0 = D′ ∪ D″;

– Dn+1 = {⟨E, d⟩ : E ⊆ Dn, d ∈ D′ ∪ D″};

• E ◦ d = ⟨E, d⟩, if E ≠ ∅;

• E ◦ d = d, if E = ∅;

• PΓ(d′, ϕ) = 1 if P′τg′(Γ)(d′, τg′(ϕ)) = 1, for d′ ∈ D′;

• PΓ(d″, ϕ) = 1 if P″τg″(Γ)(d″, τg″(ϕ)) = 1, for d″ ∈ D″;

• PΓ(⟨E, d⟩, ϕ) = 1 if there is a set Ψ ⊆ L(C) for which PΨ(d, ϕ) = 1 and PΓ(E, Ψ) = 1.

Proposition 31. Given two proof systems P′ = ⟨C′, D′, ◦′, P′⟩ and P″ = ⟨C″, D″, ◦″, P″⟩, and their fibring P = ⟨C, D, ◦, P⟩, consider the transfinite sequence of quasi proof systems where:

• D0 = D′ ∪ D″;

• Dβ+1 = Dβ ∪ (℘(Dβ) × (D′ ∪ D″));

• Dα = ⋃β<α Dβ, where α is a limit ordinal.

Then D = Dα for some ordinal α.

Proof. Let Υ be the operator such that Υ(∆) = ∆ ∪ (℘(∆) × (D′ ∪ D″)). This operator satisfies (D′ ∪ D″) ⊆ Υ(D′ ∪ D″), i.e., it’s extensive. Moreover, it’s monotonic over the complete lattice of sets of derivations ordered by ⊆, and as such it satisfies the conditions of Tarski’s fixed point theorem. Hence, it has a least fixed point Dα. Similarly to the previous case, we notice that Dα = D.

The next main result follows from this proposition.

Proposition 32. The fibring of proof systems P = P′ ] P″ is a proof system.

Proof. We’ll need to show that P satisfies the properties of Definition 34. More specifically, we’ll need to show that this system satisfies right reflexivity, monotonicity, compositionality and variable exchange. The other conditions follow from these ones.

• Right reflexivity - We want to show that PΓ(D, Γ) = 1 for all Γ ⊆ L(C). Let ϕ ∈ Γ. Then τg′(ϕ) ∈ τg′(Γ). By definition, P′τg′(Γ)(d′, τg′(ϕ)) = 1 for some d′ ∈ D′. Hence, PΓ(d′, ϕ) = 1, and so PΓ(D, Γ) = 1;

• Monotonicity - We want to show that PΓ1 ≤ PΓ2 for all Γ1 ⊆ Γ2 ⊆ L(C). Suppose that Γ1 ⊆ Γ2 and that PΓ1(D, ϕ) = 1. By Proposition 31, there exists an α such that PΓ1(D, ϕ) = PΓ1(Dα, ϕ) = 1. We’ll show by induction over α that PΓ2(Dα, ϕ) = 1:

– Base: α = 0. Without loss of generality, assume that there exists d′ ∈ D′ such that P′τg′(Γ1)(d′, τg′(ϕ)) = 1. By monotonicity of P′, P′τg′(Γ2)(d′, τg′(ϕ)) = 1. Hence PΓ2(d′, ϕ) = 1. Thus, PΓ2(D0, ϕ) = 1;

– Step: α = β + 1. Assume that PΓ1(Dβ+1, ϕ) = 1. Then there exists Ψ such that PΓ1(Dβ, Ψ) = 1 and PΨ(D, ϕ) = 1. By induction hypothesis, PΓ2(Dβ, Ψ) = 1. By definition of D, PΓ2(Dα, ϕ) = 1;

– Step: α is a limit ordinal. Straightforward;

• Compositionality - Follows immediately from the definition of ◦;

• Variable exchange - We want to show that PΓ(D, ϕ) = Pρ(Γ)(D, ρ(ϕ)), where ρ is a renaming substitution. Suppose that PΓ(D, ϕ) = 1. Then there exists an α such that PΓ(Dα, ϕ) = 1. We’ll show, by induction over α, that Pρ(Γ)(Dα, ρ(ϕ)) = 1:

– Base: α = 0. Suppose that d = d′ ∈ D′. Then P′τg′(Γ)(d′, τg′(ϕ)) = 1. We have to show that there’s e′ ∈ D′ such that P′τg′(ρ(Γ))(e′, τg′(ρ(ϕ))) = 1. Let ρ′ : Ξ → L(C′) be a renaming substitution such that ρ′(ψ) = τg′(ρ(τg′⁻¹(ψ))). By the variable exchange property of P′, there’s e′ ∈ D′ such that P′ρ′(τg′(Γ))(e′, ρ′(τg′(ϕ))) = 1. Since for every ψ ∈ L(C), ρ′(τg′(ψ)) = τg′(ρ(ψ)), we get P′τg′(ρ(Γ))(e′, τg′(ρ(ϕ))) = 1. The case where d = d″ ∈ D″ is similar. The result follows;

– Step: α = β + 1. Since PΓ(Dβ+1, ϕ) = 1, there’s Ψ such that PΓ(E, Ψ) = 1 and PΨ(D, ϕ) = 1, for some E ⊆ Dβ. For each ψ ∈ Ψ, there exists e ∈ Dβ such that PΓ(e, ψ) = 1. By induction hypothesis, there’s e′(ψ) ∈ Dβ such that Pρ(Γ)(e′(ψ), ρ(ψ)) = 1. Hence, Pρ(Γ)(E′, ρ(Ψ)) = 1, for E′ = {e′(ψ) : ψ ∈ Ψ}. Using a similar reasoning to the one used in the base step, we conclude that Pρ(Ψ)(D, ρ(ϕ)) = 1. Thus, Pρ(Γ)(Dα, ρ(ϕ)) = 1;

– Step: α is a limit ordinal. Straightforward.

Finally, we reach the notion of abduction in terms of fibring. In particular, we can discuss its use in the context of the fibring of proof systems and of their decidability.

Proposition 33. The finite-derivation fibring of hypothesis-abductible proof systems is hypothesis-abductible.

Proof. Let P′ and P″ be proof systems with abduction functions Abd′ and Abd″. Let P = P′ ⊎ P″ be their finite-derivation fibring. Then an abduction function Abd for P can be defined recursively as follows:

• If d ∈ D′ ∩ D″, then Abd(d) = τ′g⁻¹(Abd′(d)) ∪ τ″g⁻¹(Abd″(d));

• If d ∈ D′, then Abd(d) = τ′g⁻¹(Abd′(d));

• If d ∈ D″, then Abd(d) = τ″g⁻¹(Abd″(d));

• If d = ⟨E, d0⟩, then Abd(d) contains all sets generated as follows: for each E′ = {e1, ..., en} ⊆ E, let Abd(ei) = {Ψi,1, ..., Ψi,ki}. Then Ψ1,i1 ∪ ... ∪ Ψn,in ∈ Abd(d) for all values of i1, ..., in for which the previous expression makes sense.
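The last clause can be sketched in Python (a language choice of ours; the thesis's implementation is in Mathematica). The helper name `combine_abduction` and the toy abduction function are hypothetical, introduced only for illustration:

```python
from itertools import product

def combine_abduction(abd, E):
    """Hypothetical sketch of the last clause above: given an abduction
    function `abd` (mapping a premise to its list of hypothesis sets)
    and E = [e1, ..., en], build every union Psi_{1,i1} | ... | Psi_{n,in},
    picking one hypothesis set per premise."""
    results = []
    # product enumerates all index tuples (i1, ..., in)
    for choice in product(*(abd(e) for e in E)):
        results.append(set().union(*choice))
    return results

# toy abduction function over two premises
toy = {"e1": [{"p"}, {"q"}], "e2": [{"r"}]}
print(combine_abduction(toy.get, ["e1", "e2"]))  # two unions: {p, r} and {q, r}
```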

Proposition 34. Let P′ and P″ be hypothesis-abductible and decidable proof systems. Then P′ ⊎ P″ is a decidable proof system.

Proof. Let P′ and P″ be decidable proof systems, and let Abd′ and Abd″ be their abduction functions. Let Abd be the abduction function for P = P′ ⊎ P″. Then PΓ(d, ϕ) can be computed inductively as follows:

• If d ∈ D′ and P′τ′g(Γ)(d, τ′g(ϕ)) = 1, then PΓ(d, ϕ) = 1;

• If d ∈ D″ and P″τ″g(Γ)(d, τ″g(ϕ)) = 1, then PΓ(d, ϕ) = 1;

• If d ≠ ⟨E, d0⟩, then PΓ(d, ϕ) = 0;

• If d = ⟨E, d0⟩, let Abd(d) = {Ψ1, ..., Ψn}. For i = 1, ..., n:

– If PΨi(d0, ϕ) = 1 and PΓ(E, Ψi) = 1, then PΓ(d, ϕ) = 1;
– Otherwise, increment i;

• If none of the options is true, then PΓ(d, ϕ) = 0.
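The inductive computation above can be sketched as follows. This is a Python sketch under assumed interfaces: `decide_prime` and `decide_dprime` are hypothetical stand-ins for the component decision procedures already composed with the translations τ′g and τ″g, and `abd` stands for the combined abduction function; none of these names come from the thesis.

```python
def decide(d, phi, Gamma, decide_prime, decide_dprime, abd):
    """Sketch of the inductive definition of P_Gamma(d, phi) for the
    fibring P = P' (+) P''.  A compound derivation is represented as a
    pair (E, d0) with E a list of sub-derivations; anything else is
    treated as an element of D' or D''."""
    if not isinstance(d, tuple):
        # d in D' or D'': delegate to the component systems
        return decide_prime(d, phi, Gamma) or decide_dprime(d, phi, Gamma)
    E, d0 = d
    for Psi in abd(d):  # Abd(d) = {Psi_1, ..., Psi_n}
        # check P_{Psi_i}(d0, phi) = 1 and P_Gamma(E, Psi_i) = 1
        proves_phi = decide(d0, phi, frozenset(Psi),
                            decide_prime, decide_dprime, abd)
        covers_Psi = all(any(decide(e, psi, Gamma,
                                    decide_prime, decide_dprime, abd)
                             for e in E)
                         for psi in Psi)
        if proves_phi and covers_Psi:
            return True
    return False

# toy instantiation: an atomic derivation d proves exactly the formula d
prime = lambda d, phi, G: d == phi
never = lambda d, phi, G: False
print(decide((["a"], "b"), "b", frozenset(), prime, never, lambda d: [{"a"}]))
```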

Looking at the propositions presented, we can see that some of them resort to abduction functions within their proofs. Thus, if we use our Peirce's FOL algorithm in a manner similar to that of the previous section, we can automate the process of:

• building the fibring of proof systems according to Definition 87, since it facilitates the computation of the set Ψ of FOL formulas;

• building the fibring of finite-derivation hypothesis-abductible proof systems when one of them uses FOL formulas, as well as the decidable proof system obtained through the fibring of two hypothesis-abductible decidable proof systems where one of them uses FOL formulas (recall that even though FOL is only semi-decidable, finite subsets known as theories are decidable), since it allows us to compute the abduction function of one of the proof systems being fibred;

• indirectly determining the value of PΓ(d, ϕ), by using the ResolutionOP procedure developed within its scope, when possible.

Chapter 8

Conclusion

8.1 Achievements

The major achievement of this thesis was the development and computational implementation in Mathematica of an abduction algorithm for FOL formulas within a TCHF reasoning framework #2, intrinsically containing an implementation of a translation from FOL formulas to FOL∗ formulas in clausal normal form, a unification algorithm, and an implementation of a Resolution principle (in our case the GRP). We have also introduced a theoretical translation of the results obtained from this algorithm back to FOL. The hypotheses obtained are more useful than the ones presented in [11], in the sense that they do not consist of the entire set of hypotheses of the original derivation, but instead of a finite subset contained in the original set. The implementation was based not only on the original Peirce's algorithm, but also on the structure of the formulas generated within the proof systems (and derived theories) defined in Chapter 2. Still, we consider this representation fairly generic, in the sense that a vast number of abduction problems can be "translated" into formulas contained within these systems, which in turn can be easily included in a TCHF reasoning framework #2.

The auxiliary programs developed can also be regarded as accomplishments on their own. In particular, the implementation of the GRP can also be used to compute the value of the relation PΓ(d, ϕ), which is useful within the proofs of the relations between proof systems and their fibring. Regarding theoretical results, besides the connections already mentioned in the previous section and in the previous paragraph, we can also count among our achievements a compilation of completeness and soundness results for the main procedures of the algorithm (the Resolution principle and the GRP). The ability to add this algorithm to a pool of previously developed algorithms, to use, for example, in benchmark testing, is another useful result. On the other hand, there are some important aspects that we must consider.
As discussed, we assumed that all our theories represent practical issues and, as such, are finite. However, in theoretical terms, we mostly consider the entire system. Since the algorithm is developed in a system that has space constraints, to prove theoretical results we would need to work through specific cases instead of a general one, which in a sense removes the generality of the proofs. Moreover, the way the Mathematica system works makes it impossible to apply this algorithm directly to axioms and some other formulas, since the system automatically translates them to True or False, representations which are promptly removed before the main procedure starts. Finally, in terms of comparison with other algorithms, to our knowledge there is no official abduction algorithm built in Mathematica, and thus we cannot compare results within this system. Comparisons with other algorithms are not accurate, since they work over different frameworks, within different computational systems. As mentioned, some work has already been done in terms of benchmark testing, and can be found in [5]. Comparing with the original algorithm, we notice that despite the multiple optimizations implemented, the developed procedure has a higher average time in theory, for reasons already explained.

8.2 Future Work

There are several changes that can be made in regards to the implementation of this algorithm that might result in a lower complexity:

• the program CLAFOL was implemented according to the translations stated. There are, however, other sets of translations that lead to the same result, such as the ones including the NNF (negation normal form) translation. We could try implementing them in order to test whether the running time is affected;

• we could compare the results obtained in terms of complexity with other automatized methods that use reasoning systems other than the GRP;

• in regards to the structure of the input sets, this algorithm only supports FOL formulas. Hence, we could implement some changes in order to accommodate formulas in PROP, as well as formulas built with different syntaxes;

• despite providing several optimizations related to the GRP procedure (Unit, Linear and Input Resolution), we only embedded one - comparing only pairs of clauses that have not been compared before - in the general ResolutionOP routine. As explained, this was due to the fact that its goal differs depending on the environment in which it is being used. We could implement different GRP procedures that use the multiple optimizations depending on the goal;

• as stated, there are several algorithms available to compute the MGU. Thus, we could use benchmark testing to choose the best one for each problem. We would have to implement the multiple available algorithms, or at least translate them to the same language;

• we’ve implemented this algorithm in Mathematica, using Wolfram language. The natural path from this point would be to implement this algorithm in different languages, like Python, PROLOG or C++, in order to achieve a better comparison with other algorithms developed in them;

• we could implement the theoretical translation known as reverse Skolemization in order to automate the full process;

• improve the programs themselves, in order to reduce the computation time.

Bibliography

[1] C. Peirce, C. Hartshorne, P. Weiss, and A. Burks, Collected Papers of Charles Sanders Peirce. The Belknap Press of Harvard University Press, 1931-1958.

[2] S. Klarman, U. Endriss, and S. Schlobach, “ABox abduction in the description logic ALC,” Journal of Automated Reasoning, vol. 46, no. 1, pp. 43–80, 2011.

[3] U. Endriss, P. Mancarella, F. Sadri, G. Terreni, and F. Toni, “The CIFF proof procedure for abductive logic programming with constraints.”

[4] D. Poole, R. Goebel, and R. Aleliunas, “Theorist: A logical reasoning system for defaults and diagnosis,” The Knowledge Frontier, pp. 331–352, 1987.

[5] T. Eiter and G. Gottlob, “The complexity of logic-based abduction,” Journal of the ACM, vol. 42, no. 1, pp. 3–42, 1995.

[6] J. F. Costa, “Complexidade.”

[7] I. Dillig, “Abductive inference and its applications in program analysis, verification, and synthesis.”

[8] M. Genesereth and E. Kao, Introduction to Logic. Morgan & Claypool, 2013.

[9] F. Rodrigues, C. Oliveira, and O. Oliveira, “Peirce: an algorithm for abductive reasoning operating with a quaternary reasoning framework,” Research in Computing Science, vol. 148, no. 5, pp. 53–66, 2014.

[10] A. Kakas and B. V. Nuffelen, “A-system: Declarative programming with abduction,” Lecture Notes in Computer Science, vol. 2173, pp. 393–397, 2001.

[11] L. Cruz-Felipe, A. Sernadas, and C. Sernadas, “Heterogeneous fibring of deductive systems via abstract proof systems,” Logic Journal of the IGPL, vol. 16, no. 2, pp. 121–153, 2008.

[12] C. Sernadas, C. Caleiro, J. Rasga, and W. Carnielli, “Fibring of logics as a universal construction,” Handbook of Philosophical Logic, vol. 13, no. 2, pp. 123–187, 2005.

[13] A. Sernadas, C. Sernadas, and A. Zanardo, “Fibring modal first-order logics: Completeness preservation.”

[14] A. Sernadas, C. Sernadas, and J. Rasga, “Fibring labelled deduction systems.”

[15] C. Caleiro, W. Carnielli, M. Coniglio, A. Sernadas, and C. Sernadas, “Fibring non-truth-functional logics: Completeness preservation.”

[16] M. Coniglio, A. Sernadas, and C. Sernadas, “Fibring logics with topos semantics.”

[17] M. C. Mayer and F. Pirri, “First-order abduction via tableau and sequent calculi,” Logic Journal of the IGPL, vol. 1, no. 1, pp. 99–117, 1993.

[18] Y. Peng and J. Reggia, “Abductive inference models for diagnostic problem solving.”

[19] S. Shapiro and T. K. Kissel, Classical Logic. Metaphysics Research Lab, Stanford University, 2018.

[20] M. Hazewinkel, Encyclopaedia of Mathematics, Supplement III. Springer Science Business Media, 2007.

[21] Departamento de Informática, Universidade da Beira Interior, “A Revision of Propositional and First-Order Logics.”

[22] A. Sernadas and C. Sernadas, Foundations of Logic and Theory of Computation. Instituto Superior Técnico, 2012.

[23] Department of Computing Science, Umeå University, “The Resolution Proof System for Propositional Logic.”

[24] G. Boole, The Laws of Thought. Prometheus Books, 2003.

[25] M. Davis and H. Putnam, “A computing procedure for quantification theory,” Journal of the ACM, vol. 7, no. 3, p. 210, 1960.

[26] J. Robinson, “A machine-oriented logic based on the resolution principle,” Journal of the ACM, vol. 12, no. 1, pp. 23–41, 1965.

[27] M. Hauskrecht, “Propositional logic: Horn clauses.”

[28] P. Lucas, “Lecture notes in logic and resolution (representation and reasoning).”

[29] F. Oliehoek, “Lecture notes in artificial intelligence.”

[30] J. Vindel, Ángeles Riesco, and F. Vegas, Lógica Computacional. UNED, 2003.

[31] J. Goubault-Larrecq and I. Mackie, Proof Theory and Automated Deduction. Kluwer Academic Publishers, 1997.

[32] J. B. Almeida, M. J. Frade, J. S. Pinto, and S. M. de Sousa, Rigorous Software Development: An Introduction to Program Verification. Springer-Verlag London Limited, 2011.

[33] V. Goranko, First Order Logic: Prenex Normal Form, Skolemization, Clausal Form. DTU Informat- ics, 2010.

[34] F. Baader and W. Snyder, Handbook of Automated Reasoning. Elsevier Science Publishers, 2001.

[35] D. D. Champeaux, “About the Paterson-Wegman linear unification algorithm,” Journal of Computer and System Sciences, vol. 32, no. 1, pp. 79–90, 1986.

[36] Carnegie Mellon University School of Computer Science, “Lecture notes in computational properties of resolution and first-order logic.”

[37] A. Leitsch, The Resolution Calculus. Springer, 1998.

[38] R. Piskac, “First-order logic - syntax, semantics, resolution.”

[39] S. Arun-Kumar, “Lecture notes in logic for computer science.”

[40] S. Ghilardi and R. Sebastiani, “Combination of theories for decidable fragments of first-order logic,” Lecture Notes in Computer Science, vol. 5749, pp. 263–278, 2009.

[41] P. Cox and T. Pietrzykowski, “A complete, nonredundant algorithm for reversed skolemization,” Theoretical Computer Science, vol. 28, no. 3, pp. 239–261, 1983.

[42] University of Calgary, “Lecture notes in the completeness theorem.”

[43] A. Kakas, “ACLP: Abductive constraint logic programming,” The Journal of Logic Programming, vol. 44, no. 3, pp. 129–177, 2000.

[44] A. Sernadas, C. Sernadas, and C. Caleiro, “Fibring of logics as a categorial construction,” Journal of Logic and Computation, vol. 9, no. 2, pp. 149–179, 1999.

[45] Carnegie Mellon University School of Computer Science, “Lecture notes in resolution in first-order logic.”

[46] C. Caleiro and R. Gonçalves, “Equipollent logical systems.”

[47] J. Y. Béziau and A. Costa-Leite, Perspectives on Universal Logic. Polimetrica, International Scientific Publisher, 2009.

[48] W. Carnielli, M. Coniglio, D. M. Gabbay, P. Gouveia, and C. Sernadas, Analysis and Synthesis of Logics: How to Cut and Paste Reasoning Systems. Springer-Verlag New York Inc., 2008.

[49] A. Nerode and R. Shore, Logic for Applications. Springer, 2012.

[50] J. Buss, “Lecture notes in resolution.”

Appendix A

TCHF Reasoning Frameworks #2 Tested and Summarized Results

A.1 TCHF Reasoning Frameworks #2

• T 1 = {∀x(P (x) ⇒ V (x)), ∃x(V (x) ⇒ Q(x)), ∀x(B(x) ∨ S(x) ⇒ Q(x)), ∀x(L(x) ∧ W (x) ⇒ Q(x))} C1 = {} F 1 = {∀zQ(z), ∃zQ(z)}

• T 2 = {∃x(A(x)∧B(x) ⇒ P (x)), ∀y(W (y) ⇒ Q(y)), ∀x∀y(U(x, y) ⇒ P (x)), ∃y(L(y) ⇒ ∀xQ(x)), ∀xG(x), ∃y(P (y) ⇒ S(y))} C2 = {∀xG(x) ⇒ ∀y(¬L(y))} F 2 = {∃zS(z)}

• T 3 = {∃x(A(x)∧B(x) ⇒ P (x)), ∀y(W (y) ⇒ Q(y)), ∀x∀y(U(x, y) ⇒ P (x)), ∃y(L(y) ⇒ ∀xQ(x)), ∀xG(x), ∃y(P (y) ⇒ Q(y) ∨ S(y))} C3 = {∀xG(x) ⇒ ∀y(¬L(y))} F 3 = {∃zS(z)}

• T 4 = {∀x(S(x) ⇒ W (x)), ∀x(RA(x) ⇒ W (x)), ∀xT A(x)} C4 = {∀xT A(x) ⇒ ∀x(¬S(x))} F 4 = {∃zW (z)}1

• T 5 = {∀x(U(x) ⇒ A(x) ∧ B(x) ∧ G(x) ∧ J(x)), ∀x(U(x) ⇒ A(x) ∧ B(x) ∧ G(x) ∧ S(x))} C5 = {} F 5 = {∃zA(z), ∃zB(z), ∃zG(z), ∃zP (z)}2

1This reasoning framework is a FOL representation of Example 2 of [9]. 2This reasoning framework is a FOL representation of Example 4 of [9].

A.2 Summarized Results

All of these results were obtained with the RepeatedTiming built-in function of Mathematica, preceded by ClearSystemCache[].

T    C    F    H = ∅   maxiter   ExpPow   Comp   maxiter reached   Average Time (sec)
T1   C1   F1   No      4         1        1      No                0.9900
T1   C1   F1   No      4         1        2      No                4.1100
T1   C1   F1   No      4         1        3      No                10.3000
T1   C1   F1   No      4         2        1      No                0.9630
T1   C1   F1   No      4         2        2      No                4.1280
T1   C1   F1   No      4         2        3      No                10.4000
T1   C1   F1   No      2         2        2      Yes               4.3100
T1   C1   F1   No      6         2        2      No                4.1110
T1   C1   F1   No      10        2        2      No                4.1350
T1   C1   F1   No      50        2        2      No                4.1110
T1   C1   F1   No      100       2        2      No                4.1300
T2   C2   F2   No      4         1        1      Yes               5.1000
T2   C2   F2   No      5         1        1      Yes               37.0000
T2   C2   F2   No      4         1        2      Yes               38.6300
T2   C2   F2   No      4         1        3      Yes               82.0200
T3   C3   F3   Yes     4         1        1      Yes               20.4678
T3   C3   F3   Yes     4         1        2      Yes               544.9830
T3   C3   F3   Yes     4         1        3      Yes               2902.4300
T4   C4   F4   No      5         1        5      No                0.1760
T4   C4   F4   No      10        1        5      No                0.1770
T4   C4   F4   No      15        1        5      No                0.1760
T4   C4   F4   No      5         1        1      No                0.0901
T4   C4   F4   No      5         1        2      No                0.1520
T5   C5   F5   No      5         1        5      Yes               20.2725
T5   C5   F5   No      5         2        5      Yes               20.3700
T5   C5   F5   No      5         3        5      Yes               20.3200
T5   C5   F5   Yes     5         4        5      Yes               20.4000
T5   C5   F5   No      10        1        5      Yes               56.5000
T5   C5   F5   No      10        2        5      Yes               57.1000
T5   C5   F5   No      10        3        5      Yes               56.9600
T5   C5   F5   Yes     10        4        5      Yes               57.0000
T5   C5   F5   No      5         1        1      Yes               1.8100
T5   C5   F5   No      5         1        2      Yes               8.2000
T5   C5   F5   No      5         1        3      Yes               15.3900
T5   C5   F5   No      5         1        4      Yes               19.6000
T5   C5   F5   No      5         1        5      Yes               20.5000

Table A.1: Results obtained with PEIRCEFOL

Let’s observe some of these results.

• In the first reasoning framework, we’re able to construct a hypotheses set in every test. We notice that the average time increases with the complexity, while a variation of the explanatory power #2 does not cause big changes in the running time, as expected. The increase in the value of maxiter also does not alter the running time, since it is never reached;

• In the second reasoning framework, we notice the same pattern as in the first case. However, in this case, maxiter is reached every time. As such, the higher this value, the higher the computation time. Again, this is due to the exponential increase in the number of clauses that have to be compared in each iteration, without the empty clause ever being reached;

• In the third reasoning framework, the hypotheses set obtained was empty, and maxiter was reached every time. As expected, the running time increases exponentially with the complexity, going from 20.4678 seconds for Comp = 1 to 2902.4300 seconds (about 48 minutes) for Comp = 3, since the number of clauses also increases exponentially with each iteration;

• In the fourth reasoning framework, we observe another example with the same results as the first one. However, since the number of formulas in play is lower than in the first case, and the formulas are simpler (do not contain conjunctions or disjunctions), the output is obtained almost instantaneously;

• In the fifth reasoning framework, the important aspect not yet referenced in the previous examples is the influence of ExpPow over the arity of the hypotheses set. When we require that the hypotheses satisfy all the facts, the algorithm is not capable of producing such hypotheses, which is an obvious conclusion derived from the reasoning framework. However, since maxiter was not triggered, the running time is not influenced by this occurrence. This is trivial to check if we observe the insignificant difference between the average running times of Rows 24 to 27.

A.3 Example 14

In this section, we present the solution obtained for the TCHF reasoning framework #2 of Example 14, related to the disease that Mary has. We introduced the TCHF reasoning framework #2 into PEIRCEFOL, with p = 5. We started with ExpPow = 1 and Comp = 1.

T13 = {ForAll[x, Implies[P[x], U[x]]], Exists[x, Implies[P[x], W[x]]],
   ForAll[x, Implies[Q[x], V[x] && ! U[x]]], Exists[x, Implies[S[x], V[x]]],
   Exists[x, Implies[V[x], P[x]]]};
C13 = {};
F13 = {Exists[x, W[x]], Exists[x, U[x]], Exists[x, V[x]]};
ClearSystemCache[];
AbsoluteTiming[PEIRCEFOL[T13, C13, F13, 5, 1, 1]]

{1.12579, {{{{P[x38]}}, 2}, {{{S[x39]}}, 1}}}

The result obtained indicates that two of the symptoms can be derived from a particular case of the Flu, while one of the symptoms may be caused by a particular case of a tropical disease. On the other hand, if we want to determine a single disease that causes all three symptoms, the algorithm cannot find an answer. To confirm this assertion, we changed the value of ExpPow to 3.

T13 = {ForAll[x, Implies[P[x], U[x]]], Exists[x, Implies[P[x], W[x]]],
   ForAll[x, Implies[Q[x], V[x] && ! U[x]]], Exists[x, Implies[S[x], V[x]]],
   Exists[x, Implies[V[x], P[x]]]};
C13 = {};
F13 = {Exists[x, W[x]], Exists[x, U[x]], Exists[x, V[x]]};
ClearSystemCache[];
AbsoluteTiming[PEIRCEFOL[T13, C13, F13, 5, 3, 1]]

{1.16775, {}}

If we increase the complexity of the answers to 2, we get more useful results.

T13 = {ForAll[x, Implies[P[x], U[x]]], Exists[x, Implies[P[x], W[x]]],
   ForAll[x, Implies[Q[x], V[x] && ! U[x]]], Exists[x, Implies[S[x], V[x]]],
   Exists[x, Implies[V[x], P[x]]]};
C13 = {};
F13 = {Exists[x, W[x]], Exists[x, U[x]], Exists[x, V[x]]};
ClearSystemCache[];
AbsoluteTiming[PEIRCEFOL[T13, C13, F13, 5, 3, 2]]

{2.21156, {{{{P[x44]}, {S[x45]}}, 3}}}

In this case, a particular combination of Flu and tropical disease explains all three symptoms.

Appendix B

Logic Definitions

In this appendix, we present the definitions of logic notions used throughout the thesis.

Definition 90. (Boolean Algebra) The boolean algebra is a set of laws that indicate the relation between the truth value of a formula and the truth values of its subformulas. For Propositional Logic, the relation can be found in the tables below.

ϕ1   ϕ2   ϕ1 ∧ ϕ2   ϕ1 ∨ ϕ2
⊥    ⊥    ⊥         ⊥
⊥    ⊤    ⊥         ⊤
⊤    ⊥    ⊥         ⊤
⊤    ⊤    ⊤         ⊤

Table B.1: Boolean Algebra for ∧ and ∨

ϕ1   ¬ϕ1
⊥    ⊤
⊤    ⊥

Table B.2: Boolean Algebra for ¬

Definition 91. (Bound Variables) The map bv : L(CFOL) → ℘(χ) assigning to each formula the set of bound variables in it is recursively defined as follows:

• bv(P (t1, ..., tn)) = ∅;

• bv(¬ϕ) = bv(ϕ);

• bv(ϕ1 ⇒ ϕ2) = bv(ϕ1) ∪ bv(ϕ2);

• bv(∀ciϕ) = {ci} ∪ bv(ϕ), where var is the map assigning to each formula the set of variables occurring in it.
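As an illustration, the recursion of Definition 91 can be transcribed directly. The following is a Python sketch over a toy tuple representation of formulas; the constructor tags ("pred", "not", "implies", "forall") are our own, not the thesis's data structure:

```python
def bv(phi):
    """Recursive map bv: collects the bound variables of a formula,
    following the clauses of Definition 91 on a toy representation."""
    tag = phi[0]
    if tag == "pred":                      # bv(P(t1, ..., tn)) = {}
        return set()
    if tag == "not":                       # bv(~phi) = bv(phi)
        return bv(phi[1])
    if tag == "implies":                   # union of both subformulas
        return bv(phi[1]) | bv(phi[2])
    if tag == "forall":                    # {ci} joined with bv(phi)
        return {phi[1]} | bv(phi[2])
    raise ValueError("unknown constructor: %r" % tag)

phi = ("forall", "x", ("implies", ("pred", "P", "x"), ("not", ("pred", "Q", "y"))))
print(bv(phi))  # the only bound variable is x
```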

Proposition 35. The following inference rules are valid:

• ¬> ⇔ ⊥;

• ∀ciϕ ⇔ ¬(∃ci(¬ϕ));

• ϕ1 ∧ ϕ2 ⇔ ¬(ϕ1 ⇒ ¬ϕ2);

• ϕ1 ∨ ϕ2 ⇔ ¬((¬ϕ1) ∧ (¬ϕ2));

• ¬¬ϕ ⇔ ϕ;

where ϕ ∈ L(CFOL) and ci ∈ Term.
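The propositional instances of these equivalences can be checked exhaustively against the boolean algebra of Tables B.1 and B.2 (the quantifier equivalence is not finitely checkable this way). The following Python check is ours, not part of the thesis:

```python
from itertools import product

def implies(a, b):
    # truth table of phi1 => phi2
    return (not a) or b

# exhaustive truth-table check of the propositional equivalences
for p, q in product([False, True], repeat=2):
    assert (p and q) == (not implies(p, not q))     # phi1 /\ phi2 <=> ~(phi1 => ~phi2)
    assert (p or q) == (not ((not p) and (not q)))  # phi1 \/ phi2 <=> ~(~phi1 /\ ~phi2)
    assert (not (not p)) == p                       # ~~phi <=> phi
assert (not True) == False                          # ~T <=> F
print("propositional equivalences verified")
```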
