
Logic and Proof

Computer Science Tripos Part IB Michaelmas

Lawrence C. Paulson
Computer Laboratory
University of Cambridge

[email protected]

Copyright © 2013 by Lawrence C. Paulson

Contents

1 Introduction and Learning Guide 1

2 Propositional Logic 2

3 Proof Systems for Propositional Logic 5

4 First-order Logic 8

5 Formal Reasoning in First-Order Logic 11

6 Clause Methods for Propositional Logic 13

7 Skolem Functions, Herbrand’s Theorem and Unification 17

8 First-Order Resolution and Prolog 21

9 Decision Procedures and SMT Solvers 24

10 Binary Decision Diagrams 26

11 Modal Logics 27

12 Tableaux-Based Methods 30


1 Introduction and Learning Guide

This course gives a brief introduction to logic, including the resolution method of theorem-proving and its relation to the programming language Prolog. Formal logic is used for specifying and verifying computer systems and (sometimes) for representing knowledge in Artificial Intelligence programs.

The course should help you to understand the Prolog language, and its treatment of logic should be helpful for understanding other theoretical courses. Try to avoid getting bogged down in the details of how the various proof methods work, since you must also acquire an intuitive feel for logical reasoning.

The most suitable course text is this book:

    Michael Huth and Mark Ryan, Logic in Computer Science: Modelling and Reasoning about Systems, 2nd edition (CUP, 2004)

It covers most aspects of this course with the exception of resolution theorem proving. It includes material (symbolic model checking) that could be useful later.

The following book may be a useful supplement to Huth and Ryan. It covers resolution, as well as much else relevant to Logic and Proof.

    Mordechai Ben-Ari, Mathematical Logic for Computer Science, 2nd edition (Springer, 2001)

The following book provides a different perspective on modal logic, and it develops propositional logic carefully. Finally available in paperback.

    Sally Popkorn, First Steps in Modal Logic (CUP, 2008)

There are numerous exercises in these notes, and they are suitable for supervision purposes. Most old examination questions for Foundations of Logic Programming (the former name of this course) are still relevant. As of 2013/14, Herbrand's theorem has been somewhat deprecated: still mentioned, but no longer in detail. Some unification theory has also been removed. These changes created space for a new lecture on Decision Procedures and SMT Solvers.

• 2013 Paper 5 Q5: DPLL, sequent or tableau calculus
• 2013 Paper 6 Q6: resolution problems
• 2012 Paper 5 Q5: (Herbrand: deprecated)
• 2012 Paper 6 Q6: sequent or tableau calculus, modal logic, BDDs
• 2011 Paper 5 Q5: resolution, linear resolution, BDDs
• 2011 Paper 6 Q6: unification, modal logic
• 2010 Paper 5 Q5: BDDs and models
• 2010 Paper 6 Q6: sequent or tableau calculus, DPLL. Note: the formula in the first part of this question should be (∃x P(x) → Q) → ∀x (P(x) → Q).
• 2009 Paper 6 Q7: modal logic (Lect. 11)
• 2009 Paper 6 Q8: resolution, tableau calculi
• 2008 Paper 3 Q6: BDDs, DPLL, sequent calculus
• 2008 Paper 4 Q5: proving or disproving first-order formulas, resolution
• 2007 Paper 5 Q9: propositional methods, resolution, modal logic
• 2007 Paper 6 Q9: proving or disproving first-order formulas
• 2006 Paper 5 Q9: proof and disproof in FOL and modal logic
• 2006 Paper 6 Q9: BDDs, Herbrand models, resolution (Lect. 6–8)
• 2005 Paper 5 Q9: resolution (Lect. 6, 8, 9)
• 2005 Paper 6 Q9: DPLL, BDDs, tableaux (Lect. 6, 10, 12)
• 2004 Paper 5 Q9: semantics and proof in FOL (Lect. 4, 5)
• 2004 Paper 6 Q9: ten true or false questions
• 2003 Paper 5 Q9: BDDs; clause-based proof methods (Lect. 6, 10)
• 2003 Paper 6 Q9: (Lect. 5)
• 2002 Paper 5 Q11: semantics of propositional and first-order logic (Lect. 2, 4)
• 2002 Paper 6 Q11: resolution; proof systems
• 2001 Paper 5 Q11: satisfaction relation; logical equivalences
• 2001 Paper 6 Q11: clause-based proof methods; ordered ternary decision diagrams (Lect. 6, 10)
• 2000 Paper 5 Q11: checking; propositional sequent calculus (Lect. 2, 3, 10)
• 2000 Paper 6 Q11: unification and resolution (Lect. 8, 9)
• 1999 Paper 5 Q10: Prolog resolution versus general resolution
• 1999 Paper 6 Q10: Herbrand models and clause form
• 1998 Paper 5 Q10: BDDs, sequent calculus, etc. (Lect. 3, 10)
• 1998 Paper 6 Q10: modal logic (Lect. 11); resolution
• 1997 Paper 5 Q10: first-order logic (Lect. 4)
• 1997 Paper 6 Q10: sequent rules for quantifiers (Lect. 5)
• 1996 Paper 5 Q10: sequent calculus (Lect. 3, 5, 10)
• 1996 Paper 6 Q10: DPLL versus Resolution (Lect. 9)
• 1995 Paper 5 Q9: BDDs (Lect. 10)
• 1995 Paper 6 Q9: outline logics; sequent calculus (Lect. 3, 5, 11)
• 1994 Paper 5 Q9: resolution versus Prolog (Lect. 8)
• 1994 Paper 6 Q9: Herbrand models (Lect. 7)
• 1994 Paper 6 Q9: most general unifiers and resolution (Lect. 9)
• 1993 Paper 3 Q3: resolution and Prolog (Lect. 8)

Acknowledgements. Chloë Brown, Jonathan Davies and Reuben Thomas pointed out numerous errors in these notes. David Richerby and Ross Younger made detailed suggestions. Thanks also to Thomas Forster, Simon Frankau, Adam Hall, Ximin Luo, Steve Payne, Tom Puverle, Max Spencer, Ben Thorner, Tjark Weber and John Wickerson.

2 Propositional Logic

Propositional logic deals with truth values and the logical connectives and, or, not, etc. Most of the concepts in propositional logic have counterparts in first-order logic. Here are the most fundamental concepts.

Syntax refers to the formal notation for writing assertions. It also refers to the data structures that represent assertions in a computer. At the level of syntax, 1 + 2 is a string of three symbols, or a tree with a node labelled + and having two children labelled 1 and 2.

Semantics expresses the meaning of a formula in terms of mathematical or real-world entities. While 1 + 2 and 2 + 1 are syntactically distinct, they have the same semantics, namely 3. The semantics of a logical statement will typically be true or false.

Proof theory concerns ways of proving statements, at least the true ones. Typically we begin with axioms and arrive at other true statements using inference rules. Formal proofs are typically finite and mechanical: their correctness can be checked without understanding anything about the subject matter.

Syntax can be represented in a computer. Proof methods are syntactic, so they can be performed by computer. On the other hand, as semantics is concerned with meaning, it exists only inside people's heads. This is analogous to the way computers handle digital photos: the computer has no conception of what your photos mean to you, and internally they are nothing but bits.

2.1 Syntax of propositional logic

Take a set of propositional symbols P, Q, R, .... A formula consisting of a propositional symbol is called atomic. We use t and f to denote true and false.

Formulæ are constructed from atomic formulæ using the logical connectives¹

    ¬  (not)
    ∧  (and)
    ∨  (or)
    →  (implies)
    ↔  (if and only if)

These are listed in order of precedence; ¬ is highest. We shall suppress needless parentheses, writing, for example,

    (((¬P) ∧ Q) ∨ R) → ((¬P) ∨ Q)   as   ¬P ∧ Q ∨ R → ¬P ∨ Q.

In the metalanguage (these notes), the letters A, B, C, ... stand for arbitrary formulæ. The letters P, Q, R, ... stand for atomic formulæ.

2.2 Semantics

Propositional Logic is a formal language. Each formula has a meaning (or semantics) — either 1 or 0 — relative to the meaning of the propositional symbols it contains. The meaning can be calculated using the standard truth tables.

    A B | ¬A | A ∧ B | A ∨ B | A → B | A ↔ B
    1 1 |  0 |   1   |   1   |   1   |   1
    1 0 |  0 |   0   |   1   |   0   |   0
    0 1 |  1 |   0   |   1   |   1   |   0
    0 0 |  1 |   0   |   0   |   1   |   1

By inspecting the table, we can see that A → B is equivalent to ¬A ∨ B and that A ↔ B is equivalent to (A → B) ∧ (B → A). (The latter is also equivalent to ¬(A ⊕ B), where ⊕ is exclusive-or.)

Note that we are using t and f in the language as symbols to denote the truth values 1 and 0. The former belongs to syntax, the latter to semantics. When it comes to first-order logic, we shall spend some time on the distinction between symbols and their meanings.

We now make some definitions that will be needed throughout the course.

Definition 1 An interpretation, or truth assignment, for a set of formulæ is a function from its set of propositional symbols to {1, 0}.

An interpretation satisfies a formula if the formula evaluates to 1 under the interpretation.

A set S of formulæ is valid (or a tautology) if every interpretation for S satisfies every formula in S.

A set S of formulæ is satisfiable (or consistent) if there is some interpretation for S that satisfies every formula in S.

A set S of formulæ is unsatisfiable (or inconsistent) if it is not satisfiable.

A set S of formulæ entails A if every interpretation that satisfies all elements of S also satisfies A. Write S ⊨ A.

Formulæ A and B are equivalent, A ≃ B, provided A ⊨ B and B ⊨ A.

It is usual to write A ⊨ B instead of {A} ⊨ B. We similarly identify a one-element set with its formula in the other definitions.

Note that ⊨ and ≃ are not logical connectives but relations between formulæ. They belong not to the logic but to the metalanguage: they are symbols we use to discuss the logic. They therefore have lower precedence than the logical connectives. No parentheses are needed in A ∧ A ≃ A because the only possible reading is (A ∧ A) ≃ A. We may not write A ∧ (A ≃ A) because A ≃ A is not a formula.

In propositional logic, a valid formula is also called a tautology. Here are some examples of these definitions.

• The formulæ A → A and ¬(A ∧ ¬A) are valid for every formula A.

• The formulæ P and P ∧ (P → Q) are satisfiable: they are both true under the interpretation that maps P and Q to 1. But they are not valid: they are both false under the interpretation that maps P and Q to 0.

• If A is a valid formula then ¬A is unsatisfiable.

• This set of formulæ is unsatisfiable: {P, ¬Q, ¬P ∨ Q}.

Exercise 1 Is the formula P → ¬P satisfiable, or valid?

¹Using ⊃ for implies and ≡ for if-and-only-if is archaic.
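These definitions can be animated by brute-force enumeration of interpretations. Exercise 5 later asks for an ML version; here is a sketch in Python instead. The tuple encoding (tags like "atom", "imp") is our own choice, not from the notes.

```python
import itertools

# A formula is a nested tuple: ("atom", "P"), ("not", A), ("and", A, B),
# ("or", A, B), ("imp", A, B), ("iff", A, B).

def atoms(f):
    """The set of propositional symbols occurring in formula f."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(atoms(sub) for sub in f[1:]))

def value(f, interp):
    """Truth value of f under an interpretation: a dict mapping each
    propositional symbol to True or False (Definition 1)."""
    tag = f[0]
    if tag == "atom":
        return interp[f[1]]
    if tag == "not":
        return not value(f[1], interp)
    a, b = value(f[1], interp), value(f[2], interp)
    if tag == "and":
        return a and b
    if tag == "or":
        return a or b
    if tag == "imp":
        return (not a) or b
    if tag == "iff":
        return a == b
    raise ValueError(tag)

def interpretations(f):
    """All interpretations for the symbols of f."""
    syms = sorted(atoms(f))
    for bits in itertools.product([True, False], repeat=len(syms)):
        yield dict(zip(syms, bits))

def satisfiable(f):
    return any(value(f, i) for i in interpretations(f))

def valid(f):
    return all(value(f, i) for i in interpretations(f))

P = ("atom", "P")
# Exercise 1: P -> ~P is satisfiable (make P false) but not valid.
ex1 = ("imp", P, ("not", P))
print(satisfiable(ex1), valid(ex1))  # True False
```

Note how validity and satisfiability are dual: `valid(f)` holds exactly when `satisfiable(("not", f))` fails.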

2.3 Applications of propositional logic

In hardware design, propositional logic has long been used to minimize the number of gates in a circuit, and to show the equivalence of combinational circuits. There now exist highly efficient tautology checkers, such as BDDs (Binary Decision Diagrams), which can verify complex combinational circuits. This is an important branch of hardware verification. The advent of efficient SAT solvers has produced an explosion of applications involving approximating various phenomena as large propositional formulas, typically through some process of iterative refinement.

Chemical synthesis is a more offbeat example.² Under suitable conditions, the following chemical reactions are possible:

    HCl + NaOH → NaCl + H2O
    C + O2 → CO2
    CO2 + H2O → H2CO3

Show we can make H2CO3 given supplies of HCl, NaOH, O2, and C.

Chang and Lee formalize the supplies of chemicals as four axioms and prove that H2CO3 logically follows. The idea is to formalize each compound as a propositional symbol and express the reactions as implications:

    HCl ∧ NaOH → NaCl ∧ H2O
    C ∧ O2 → CO2
    CO2 ∧ H2O → H2CO3

Note that this involves an ideal model of chemistry. What if the reactions can be inhibited by the presence of other chemicals? Proofs about the real world always depend upon general assumptions. It is essential to bear these in mind when relying on such a proof.

2.4 Equivalences

Note that A ↔ B and A ≃ B are different kinds of assertions. The formula A ↔ B refers to some fixed interpretation, while the metalanguage statement A ≃ B refers to all interpretations. On the other hand, ⊨ A ↔ B means the same thing as A ≃ B. Both are metalanguage statements, and A ≃ B is equivalent to saying that the formula A ↔ B is a tautology.

Similarly, A → B and A ⊨ B are different kinds of assertions, while ⊨ A → B and A ⊨ B mean the same thing. The formula A → B is a tautology if and only if A ⊨ B.

Here is a listing of some of the more basic equivalences of propositional logic. They provide one means of reasoning about propositions, namely by transforming one proposition into an equivalent one. They are also needed to convert propositions into various normal forms.

idempotency laws
    A ∧ A ≃ A
    A ∨ A ≃ A

commutative laws
    A ∧ B ≃ B ∧ A
    A ∨ B ≃ B ∨ A

associative laws
    (A ∧ B) ∧ C ≃ A ∧ (B ∧ C)
    (A ∨ B) ∨ C ≃ A ∨ (B ∨ C)

distributive laws
    A ∨ (B ∧ C) ≃ (A ∨ B) ∧ (A ∨ C)
    A ∧ (B ∨ C) ≃ (A ∧ B) ∨ (A ∧ C)

de Morgan laws
    ¬(A ∧ B) ≃ ¬A ∨ ¬B
    ¬(A ∨ B) ≃ ¬A ∧ ¬B

other negation laws
    ¬(A → B) ≃ A ∧ ¬B
    ¬(A ↔ B) ≃ (¬A) ↔ B ≃ A ↔ (¬B)

laws for eliminating certain connectives
    A ↔ B ≃ (A → B) ∧ (B → A)
    ¬A ≃ A → f
    A → B ≃ ¬A ∨ B

simplification laws
    A ∧ f ≃ f
    A ∧ t ≃ A
    A ∨ f ≃ A
    A ∨ t ≃ t
    ¬¬A ≃ A
    A ∨ ¬A ≃ t
    A ∧ ¬A ≃ f

Propositional logic enjoys a principle of duality: for every equivalence A ≃ B there is another equivalence A′ ≃ B′, derived by exchanging ∧ with ∨ and t with f. Before applying this rule, remove all occurrences of → and ↔, since they implicitly involve ∧ and ∨.

Exercise 2 Verify some of these laws using truth tables.

²Chang and Lee, page 21, as amended by Ross Younger, who knew more about Chemistry!

2.5 Normal forms

The language of propositional logic has much redundancy: many of the connectives can be defined in terms of others. By repeatedly applying certain equivalences, we can transform a formula into a normal form. A typical normal form eliminates certain connectives and uses others in a restricted manner. The restricted structure makes the formula easy to process, although the normal form may be much larger than the original formula, and unreadable.

Definition 2 (Normal Forms)

• A literal is an atomic formula or its negation. Let K, L, L′, ... stand for literals.

• A formula is in Negation Normal Form (NNF) if the only connectives in it are ∧, ∨, and ¬, where ¬ is only applied to atomic formulæ.

• A formula is in Conjunctive Normal Form (CNF) if it has the form A1 ∧ · · · ∧ Am, where each Ai is a disjunction of one or more literals.

• A formula is in Disjunctive Normal Form (DNF) if it has the form A1 ∨ · · · ∨ Am, where each Ai is a conjunction of one or more literals.

An atomic formula like P is in all the normal forms NNF, CNF, and DNF. The formula

    (P ∨ Q) ∧ (¬P ∨ S) ∧ (R ∨ P)

is in CNF. Unlike in some hardware applications, the disjuncts in a CNF formula do not have to mention all the variables. On the contrary, they should be as simple as possible. Simplifying the formula

    (P ∨ Q) ∧ (¬P ∨ Q) ∧ (R ∨ S)

to Q ∧ (R ∨ S) counts as an improvement, because it will make our proof procedures run faster. For examples of DNF formulæ, exchange ∧ and ∨ in the examples above. As with CNF, there is no need to mention all combinations of variables.

NNF can reveal the underlying nature of a formula. For example, converting ¬(A → B) to NNF yields A ∧ ¬B. This reveals that the original formula was effectively a conjunction. Every formula in CNF or DNF is also in NNF, but the NNF formula ((¬P ∧ Q) ∨ R) ∧ P is in neither CNF nor DNF.

2.6 Translation to normal form

Every formula can be translated into an equivalent formula in NNF, CNF, or DNF by means of the following steps.

Step 1. Eliminate ↔ and → by repeatedly applying the following equivalences:

    A ↔ B ≃ (A → B) ∧ (B → A)
    A → B ≃ ¬A ∨ B

Step 2. Push negations in until they apply only to atoms, repeatedly replacing by the equivalences

    ¬¬A ≃ A
    ¬(A ∧ B) ≃ ¬A ∨ ¬B
    ¬(A ∨ B) ≃ ¬A ∧ ¬B

At this point, the formula is in Negation Normal Form.

Step 3. To obtain CNF, push disjunctions in until they apply only to literals. Repeatedly replace by the equivalences

    A ∨ (B ∧ C) ≃ (A ∨ B) ∧ (A ∨ C)
    (B ∧ C) ∨ A ≃ (B ∨ A) ∧ (C ∨ A)

These two equivalences obviously say the same thing, since disjunction is commutative. In fact, we have

    (A ∧ B) ∨ (C ∧ D) ≃ (A ∨ C) ∧ (A ∨ D) ∧ (B ∨ C) ∧ (B ∨ D).

Use this equivalence when you can, to save writing.

Step 4. Simplify the resulting CNF by deleting any disjunction that contains both P and ¬P, since it is equivalent to t. Also delete any conjunct that includes another conjunct (meaning, every literal in the latter is also present in the former). This is correct because (A ∨ B) ∧ A ≃ A. Finally, two disjunctions of the form P ∨ A and ¬P ∨ A can be replaced by A, thanks to the equivalence

    (P ∨ A) ∧ (¬P ∨ A) ≃ A.

This simplification is related to the resolution rule, which we shall study later.

Since ∨ is commutative, a conjunct of the form A ∨ B could denote any possible way of arranging the literals into two parts. This includes A ∨ f, since one of those parts may be empty and the empty disjunction is false. So in the last simplification above, two conjuncts of the form P and ¬P can be replaced by f.

Steps 3′ and 4′. To obtain DNF, apply instead the other distributive law:

    A ∧ (B ∨ C) ≃ (A ∧ B) ∨ (A ∧ C)
    (B ∨ C) ∧ A ≃ (B ∧ A) ∨ (C ∧ A)

Exactly the same simplifications can be performed for DNF as for CNF, exchanging the roles of ∧ and ∨.

2.7 Tautology checking using CNF

Here is a (totally impractical) method of proving theorems in propositional logic. To prove A, reduce it to CNF. If the simplified CNF formula is t then A is valid because each transformation preserves logical equivalence. And if the CNF formula is not t, then A is not valid.

To see why, suppose the CNF formula is A1 ∧ · · · ∧ Am. If A is valid then each Ai must also be valid. Write Ai as L1 ∨ · · · ∨ Ln, where the Lj are literals. We can make an interpretation I that falsifies every Lj, and therefore falsifies Ai. Define I such that, for every propositional letter P,

    I(P) = 0 if Lj is P for some j
           1 if Lj is ¬P for some j

This definition is legitimate because there cannot exist literals Lj and Lk such that Lj is ¬Lk; if there did, then simplification would have deleted the disjunction Ai.

The powerful BDD method is based on similar ideas, but uses an if-then-else data structure, an ordering on the propositional letters, and some standard algorithmic techniques (such as hashing) to gain efficiency.
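The translation steps can be transcribed almost directly into code. Here is a Python sketch of Steps 1–3, using a hypothetical tuple encoding of formulas (("atom","P"), ("not",A), ("and",A,B), ("or",A,B), ("imp",A,B), ("iff",A,B) are our own tags); the Step 4 simplifications are omitted.

```python
def elim(f):
    """Step 1: eliminate <-> and -> using the equivalences
    A <-> B == (A -> B) & (B -> A) and A -> B == ~A | B."""
    tag = f[0]
    if tag == "atom":
        return f
    if tag == "not":
        return ("not", elim(f[1]))
    a, b = elim(f[1]), elim(f[2])
    if tag == "imp":
        return ("or", ("not", a), b)
    if tag == "iff":
        return ("and", ("or", ("not", a), b), ("or", ("not", b), a))
    return (tag, a, b)

def nnf(f):
    """Step 2: push negations in (assumes -> and <-> already gone)."""
    tag = f[0]
    if tag == "atom":
        return f
    if tag == "not":
        g = f[1]
        if g[0] == "atom":
            return f
        if g[0] == "not":
            return nnf(g[1])                      # ~~A == A
        dual = "or" if g[0] == "and" else "and"   # de Morgan laws
        return (dual, nnf(("not", g[1])), nnf(("not", g[2])))
    return (tag, nnf(f[1]), nnf(f[2]))

def cnf(f):
    """Step 3: distribute | over & (input must be in NNF)."""
    tag = f[0]
    if tag == "and":
        return ("and", cnf(f[1]), cnf(f[2]))
    if tag == "or":
        a, b = cnf(f[1]), cnf(f[2])
        if a[0] == "and":                         # (B & C) | A distributes
            return ("and", cnf(("or", a[1], b)), cnf(("or", a[2], b)))
        if b[0] == "and":                         # A | (B & C) distributes
            return ("and", cnf(("or", a, b[1])), cnf(("or", a, b[2])))
        return ("or", a, b)
    return f

P, Q, R = ("atom", "P"), ("atom", "Q"), ("atom", "R")
# Example 1 below: P | Q -> Q | R becomes, before simplification,
# (~P | (Q | R)) & (~Q | (Q | R)).
print(cnf(nnf(elim(("imp", ("or", P, Q), ("or", Q, R))))))
```

The repeated distribution in `cnf` is exactly what makes CNF conversion blow up exponentially on formulas such as (A1 ∧ B1) ∨ · · · ∨ (An ∧ Bn).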

Example 1 Start with

    P ∨ Q → Q ∨ R.

Step 1, eliminate →, gives

    ¬(P ∨ Q) ∨ (Q ∨ R).

Step 2, push negations in, gives

    (¬P ∧ ¬Q) ∨ (Q ∨ R).

Step 3, push disjunctions in, gives

    (¬P ∨ Q ∨ R) ∧ (¬Q ∨ Q ∨ R).

Simplifying yields (¬P ∨ Q ∨ R) ∧ t and then

    ¬P ∨ Q ∨ R.

The interpretation P ↦ 1, Q ↦ 0, R ↦ 0 falsifies this formula, which is equivalent to the original formula. So the original formula is not valid.

Example 2 Start with

    P ∧ Q → Q ∧ P.

Step 1, eliminate →, gives

    ¬(P ∧ Q) ∨ (Q ∧ P).

Step 2, push negations in, gives

    (¬P ∨ ¬Q) ∨ (Q ∧ P).

Step 3, push disjunctions in, gives

    (¬P ∨ ¬Q ∨ Q) ∧ (¬P ∨ ¬Q ∨ P).

Simplifying yields t ∧ t, which is t. Both conjuncts are valid since they contain a formula and its negation. Thus P ∧ Q → Q ∧ P is valid.

Example 3 Peirce's law is another example. Start with

    ((P → Q) → P) → P.

Step 1, eliminate →, gives

    ¬(¬(¬P ∨ Q) ∨ P) ∨ P.

Step 2, push negations in, gives

    (¬¬(¬P ∨ Q) ∧ ¬P) ∨ P
    ((¬P ∨ Q) ∧ ¬P) ∨ P.

Step 3, push disjunctions in, gives

    (¬P ∨ Q ∨ P) ∧ (¬P ∨ P).

Simplifying again yields t. Thus Peirce's law is valid.

There is a dual method of refuting A (proving inconsistency). To refute A, reduce it to DNF, say A1 ∨ · · · ∨ Am. If A is inconsistent then so is each Ai. Suppose Ai is L1 ∧ · · · ∧ Ln, where the Lj are literals. If there is some literal L′ such that the Lj include both L′ and ¬L′, then Ai is inconsistent. If not then there is an interpretation that verifies every Lj, and therefore Ai.

To prove A, we can use the DNF method to refute ¬A. The steps are exactly the same as the CNF method because the extra negation swaps every ∨ and ∧. Gilmore implemented a theorem prover based upon this method in 1960.

Exercise 3 Each of the following formulæ is satisfiable but not valid. Exhibit an interpretation that makes the formula true and another that makes the formula false.

    P → Q        P ∨ Q → P ∧ Q
    ¬(P ∨ Q ∨ R)        (¬P ∧ Q) ∧ (¬Q ∨ R) ∧ (P ∨ R)

Exercise 4 Convert each of the following propositional formulæ into Conjunctive Normal Form and also into Disjunctive Normal Form. For each formula, state whether it is valid, satisfiable, or unsatisfiable; justify each answer.

    (P → Q) ∧ (Q → P)
    ((P ∧ Q) ∨ R) ∧ (¬((P ∨ R) ∧ (Q ∨ R)))
    ¬(P ∨ Q ∨ R) ∨ ((P ∧ Q) ∨ R)

Exercise 5 Using ML, define datatypes for representing propositions and interpretations. Write a function to test whether or not a proposition holds under an interpretation (both supplied as arguments). Write a function to convert a proposition to Negation Normal Form.

3 Proof Systems for Propositional Logic

We can verify any tautology by checking all possible interpretations, using the truth tables. This is a semantic approach, since it appeals to the meanings of the connectives.

The syntactic approach is formal proof: generating theorems, or reducing a conjecture to a known theorem, by applying syntactic transformations of some sort. We have already seen a proof method based on CNF. Most proof methods are based on axioms and inference rules.

What about efficiency? Deciding whether a propositional formula is satisfiable is an NP-complete problem (Aho, Hopcroft and Ullman 1974, pages 377–383). Thus all approaches are likely to be exponential in the length of the formula. Technologies such as BDDs and SAT solvers, which can decide huge problems in propositional logic, are all the more stunning because their success was wholly unexpected. But even they require a "well-behaved" input formula, and are exponential in the worst case.

3.1 A Hilbert-style proof system

Here is a simple proof system for propositional logic. There are countless similar systems. They are often called Hilbert systems after the logician David Hilbert, although they existed before him.

This proof system provides rules for implication only. The other logical connectives are not taken as primitive. They are instead defined in terms of implication:

    ¬A      def=  A → f
    A ∨ B   def=  ¬A → B
    A ∧ B   def=  ¬(¬A ∨ ¬B)

Obviously, these definitions apply only when we are discussing this proof system!
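Since a proof system must respect the truth tables, these definitions can be checked semantically by enumerating all interpretations. A quick Python sanity check (the helper names imp and f are our own, not from the notes):

```python
from itertools import product

# Check that the Hilbert system's definitions of ~, | and & in terms of
# -> and f agree with the usual truth tables, by trying every
# interpretation of A and B.

def imp(a, b):
    """Material implication on Python booleans."""
    return (not a) or b

f = False  # the constant falsity

for a, b in product([True, False], repeat=2):
    assert (not a) == imp(a, f)                      # ~A    == A -> f
    assert (a or b) == imp(not a, b)                 # A | B == ~A -> B
    assert (a and b) == (not ((not a) or (not b)))   # A & B == ~(~A | ~B)

print("definitions agree with the truth tables")
```

This is the easy, semantic half of soundness; it says nothing about what the axioms K, S and DN can actually derive.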

Note that A → (B → A) is a tautology. Call it Axiom K. Also,

    (A → (B → C)) → ((A → B) → (A → C))

is a tautology. Call it Axiom S. The Double-Negation Law

    ¬¬A → A

is a tautology. Call it Axiom DN.

These axioms are more properly called axiom schemes, since we assume all instances of them that can be obtained by substituting formulæ for A, B and C. For example, Axiom K is really an infinite set of formulæ.

Whenever A → B and A are both valid, it follows that B is valid. We write this as the inference rule

    A → B    A
    ----------
        B

This rule is traditionally called Modus Ponens. Together with Axioms K, S, and DN and the definitions, it suffices to prove all tautologies of (classical) propositional logic.³ However, this formalization of propositional logic is inconvenient to use. For example, try proving A → A!

A variant of this proof system replaces the Double-Negation Law by the Contrapositive Law:

    (¬B → ¬A) → (A → B)

Another formalization of propositional logic consists of the Modus Ponens rule plus the following axioms:

    A ∨ A → A
    A → A ∨ B
    A ∨ B → B ∨ A
    (B → C) → (A ∨ B → A ∨ C)

Here A ∧ B and A → B are defined in terms of ¬ and ∨.

Where do truth tables fit into all this? Truth tables define the semantics, while proof systems define what is sometimes called the proof theory. A proof system must respect the truth tables. Above all, we expect the proof system to be sound: every theorem it generates must be a tautology. For this to hold, every axiom must be a tautology and every inference rule must yield a tautology when it is applied to tautologies.

The converse property is completeness: the proof system can generate every tautology. Completeness is harder to achieve and show. There are complete proof systems even for first-order logic. (And Gödel's incompleteness theorem uses the word "completeness" with a different technical meaning.)

3.2 Gentzen's Natural Deduction Systems

Natural proof systems do exist. Natural deduction, devised by Gerhard Gentzen, is based upon three principles:

1. Proof takes place within a varying context of assumptions.

2. Each logical connective is defined independently of the others. (This is possible because item 1 eliminates the need for tricky uses of implication.)

3. Each connective is defined by introduction and elimination rules.

For example, the introduction rule for ∧ describes how to deduce A ∧ B:

    A    B
    ------ (∧i)
    A ∧ B

The elimination rules for ∧ describe what to deduce from A ∧ B:

    A ∧ B            A ∧ B
    ----- (∧e1)      ----- (∧e2)
      A                B

The elimination rule for → says what to deduce from A → B. It is just Modus Ponens:

    A → B    A
    ---------- (→e)
        B

The introduction rule for → says that A → B is proved by assuming A and deriving B:

    [A]
     .
     .
     B
    ----- (→i)
    A → B

For simple proofs, this notion of assumption is pretty intuitive. Here is a proof of the formula A ∧ B → A:

    [A ∧ B]
    ------- (∧e1)
       A
    --------- (→i)
    A ∧ B → A

The key point is that rule (→i) discharges its assumption: the assumption could be used to prove A from A ∧ B, but is no longer available once we conclude A ∧ B → A.

The introduction rules for ∨ are straightforward:

      A             B
    ----- (∨i1)   ----- (∨i2)
    A ∨ B         A ∨ B

The elimination rule says that to show some C from A ∨ B there are two cases to consider, one assuming A and one assuming B:

             [A]   [B]
              .     .
              .     .
    A ∨ B     C     C
    ----------------- (∨e)
            C

The scope of assumptions can get confusing in complex proofs. Let us switch attention to the sequent calculus, which is similar in spirit but easier to use.

3.3 The sequent calculus

The sequent calculus resembles natural deduction, but it makes the set of assumptions explicit. Thus, it is more concrete.

A sequent has the form Γ ⇒ Δ, where Γ and Δ are finite sets of formulæ.⁴ These sets may be empty. The sequent

    A1, ..., Am ⇒ B1, ..., Bn

³If the Double-Negation Law is omitted, only the intuitionistic tautologies are provable. This axiom system is connected with the combinators S and K and the λ-calculus.
⁴With minor changes, sequents can instead be lists or multisets.

is true (in a particular interpretation) if A1 ∧ ... ∧ Am implies B1 ∨ ... ∨ Bn. In other words, if each of A1, ..., Am are true, then at least one of B1, ..., Bn must be true. The sequent is valid if it is true in all interpretations.

A basic sequent is one in which the same formula appears on both sides, as in P, B ⇒ B, R. This sequent is valid because, if all the formulæ on the left side are true, then in particular B is; so, at least one right-side formula (B again) is true. Our calculus therefore regards all basic sequents as proved.

Every basic sequent might be written in the form {A} ∪ Γ ⇒ {A} ∪ Δ, where A is the common formula and Γ and Δ are the other left- and right-side formulæ, respectively. The sequent calculus identifies the one-element set {A} with its element A and denotes union by a comma. Thus, the correct notation for the general form of a basic sequent is A, Γ ⇒ A, Δ.

Sequent rules are almost always used backward. We start with the sequent that we would like to prove. We view the sequent as saying that A1, ..., Am are true, and we try to show that one of B1, ..., Bn is true. Working backwards, we use sequent rules to reduce it to simpler sequents, stopping when those sequents become trivial. The forward direction would be to start with known facts and derive new facts, but this approach tends to generate random theorems rather than ones we want.

    basic sequent:   A, Γ ⇒ A, Δ

    Negation rules:

    Γ ⇒ Δ, A            A, Γ ⇒ Δ
    --------- (¬l)      --------- (¬r)
    ¬A, Γ ⇒ Δ           Γ ⇒ Δ, ¬A

    Conjunction rules:

    A, B, Γ ⇒ Δ         Γ ⇒ Δ, A    Γ ⇒ Δ, B
    ------------ (∧l)   --------------------- (∧r)
    A ∧ B, Γ ⇒ Δ            Γ ⇒ Δ, A ∧ B

    Disjunction rules:

    A, Γ ⇒ Δ    B, Γ ⇒ Δ         Γ ⇒ Δ, A, B
    --------------------- (∨l)   ------------ (∨r)
        A ∨ B, Γ ⇒ Δ             Γ ⇒ Δ, A ∨ B

    Implication rules:

    Γ ⇒ Δ, A    B, Γ ⇒ Δ         A, Γ ⇒ Δ, B
    --------------------- (→l)   ------------ (→r)
        A → B, Γ ⇒ Δ             Γ ⇒ Δ, A → B

    Figure 1: Sequent Rules for Propositional Logic

Sequent rules are classified as right or left, indicating which side of the ⇒ symbol they operate on. Rules that operate on the right side are analogous to natural deduction's introduction rules, and left rules are analogous to elimination rules.

The sequent calculus analogue of (→i) is the rule

    A, Γ ⇒ Δ, B
    ------------ (→r)
    Γ ⇒ Δ, A → B

Working backwards, this rule breaks down some implication on the right side of a sequent; Γ and Δ stand for the sets of formulæ that are unaffected by the inference. The analogue of the pair (∨i1) and (∨i2) is the single rule

    Γ ⇒ Δ, A, B
    ------------ (∨r)
    Γ ⇒ Δ, A ∨ B

This breaks down some disjunction on the right side, replacing it by both disjuncts. Thus, the sequent calculus is a kind of multiple-conclusion logic. Figure 1 summarises the rules.

Let us prove that the rule (∨l) is sound. We must show that if both premises are valid, then so is the conclusion. For contradiction, assume that the conclusion, A ∨ B, Γ ⇒ Δ, is not valid. Then there exists an interpretation I under which the left side is true while the right side is false; in particular, A ∨ B and Γ are true while Δ is false. Since A ∨ B is true under interpretation I, either A is true or B is. In the former case, A, Γ ⇒ Δ is false; in the latter case, B, Γ ⇒ Δ is false. Either case contradicts the assumption that the premises are valid.

3.4 Examples of Sequent Calculus Proofs

To illustrate the use of multiple formulæ on the right, let us prove the classical theorem (A → B) ∨ (B → A). Working backwards (or upwards), we reduce this formula to a basic sequent:

    ------------
    A, B ⇒ B, A
    ------------------- (→r)
    A ⇒ B, B → A
    ------------------- (→r)
    ⇒ A → B, B → A
    ------------------- (∨r)
    ⇒ (A → B) ∨ (B → A)

The basic sequent has a line over it to emphasize that it is provable.

This example is typical of the sequent calculus: start with the desired theorem and work upward. Notice that inference rules still have the same logical meaning, namely that the premises (above the line) imply the conclusion (below the line). Instead of matching a rule's premises with facts that we know, we match its conclusion with the formula we want to prove. That way, the form of the desired theorem controls the proof search.

Here is part of a proof of the distributive law A ∨ (B ∧ C) ≃ (A ∨ B) ∧ (A ∨ C):

                    ------------
                    B, C ⇒ A, B
    ------------    ------------ (∧l)
    A ⇒ A, B        B ∧ C ⇒ A, B
    ----------------------------- (∨l)
         A ∨ (B ∧ C) ⇒ A, B
         ------------------- (∨r)
         A ∨ (B ∧ C) ⇒ A ∨ B              similar
         --------------------------------------- (∧r)
         A ∨ (B ∧ C) ⇒ (A ∨ B) ∧ (A ∨ C)

The second, omitted proof tree proves A ∨ (B ∧ C) ⇒ A ∨ C similarly.

Finally, here is a failed proof of the invalid formula A ∨ B → B ∨ C:

                   ------------
    A ⇒ B, C       B ⇒ B, C
    ------------------------ (∨l)
         A ∨ B ⇒ B, C
         -------------- (∨r)
         A ∨ B ⇒ B ∨ C
         ---------------- (→r)
         ⇒ A ∨ B → B ∨ C

The sequent A ⇒ B, C has no line over it because it is not valid! The interpretation A ↦ 1, B ↦ 0, C ↦ 0 falsifies it. We have already seen this as Example 1 (page 5).
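The backward proof search illustrated above is mechanical enough to sketch in a few lines of Python. This is a naive prover for the rules of Figure 1 (not the folderol.ML prover mentioned later); sequents are pairs of sets, and formulas use a hypothetical tuple encoding of our own. Because every rule of Figure 1 is invertible when sequents are sets, the prover may commit to the first applicable rule.

```python
# Formulas: ("atom", "P"), ("not", A), ("and", A, B), ("or", A, B),
# ("imp", A, B).  A sequent is a pair (gamma, delta) of formula sets.

def provable(gamma, delta):
    """Backward proof search for the propositional sequent calculus."""
    gamma, delta = frozenset(gamma), frozenset(delta)
    if gamma & delta:                   # basic sequent: A on both sides
        return True
    for f in gamma:                     # try a left rule
        if f[0] != "atom":
            g = gamma - {f}
            if f[0] == "not":           # (~l)
                return provable(g, delta | {f[1]})
            if f[0] == "and":           # (&l)
                return provable(g | {f[1], f[2]}, delta)
            if f[0] == "or":            # (|l): two premises
                return (provable(g | {f[1]}, delta) and
                        provable(g | {f[2]}, delta))
            if f[0] == "imp":           # (->l): two premises
                return (provable(g, delta | {f[1]}) and
                        provable(g | {f[2]}, delta))
    for f in delta:                     # try a right rule
        if f[0] != "atom":
            d = delta - {f}
            if f[0] == "not":           # (~r)
                return provable(gamma | {f[1]}, d)
            if f[0] == "and":           # (&r): two premises
                return (provable(gamma, d | {f[1]}) and
                        provable(gamma, d | {f[2]}))
            if f[0] == "or":            # (|r)
                return provable(gamma, d | {f[1], f[2]})
            if f[0] == "imp":           # (->r)
                return provable(gamma | {f[1]}, d | {f[2]})
    return False                        # only atoms left, no overlap

A, B, C = ("atom", "A"), ("atom", "B"), ("atom", "C")
print(provable([], [("or", ("imp", A, B), ("imp", B, A))]))  # True
print(provable([], [("imp", ("or", A, B), ("or", B, C))]))   # False
```

The two test sequents are the worked example and the failed proof above. Termination is easy to see: every rule application removes one connective.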

3.5 Further Sequent Calculus Rules

Structural rules concern sequents in general rather than particular connectives. They are little used in this course, because they are not useful for proof procedures. However, a brief mention is essential in any introduction to the sequent calculus.

The weakening rules allow additional formulæ to be inserted on the left or right side. Obviously, if Γ ⇒ Δ holds then the sequent continues to hold after further assumptions or goals are added:

      Γ ⇒ Δ                       Γ ⇒ Δ
    ---------- (weaken:l)       ---------- (weaken:r)
    A, Γ ⇒ Δ                    Γ ⇒ Δ, A

Exchange rules allow formulæ in a sequent to be re-ordered. We do not need them because our sequents are sets rather than lists. Contraction rules allow formulæ to be used more than once:

    A, A, Γ ⇒ Δ                   Γ ⇒ Δ, A, A
    ------------ (contract:l)    ------------- (contract:r)
     A, Γ ⇒ Δ                     Γ ⇒ Δ, A

Because the sets {A} and {A, A} are identical, we do not need contraction rules either. Moreover, it turns out that we almost never need to use a formula more than once. Exceptions are ∀x A (when it appears on the left) and ∃x A (when it appears on the right).

The cut rule allows the use of lemmas. Some formula A is proved in the first premise, and assumed in the second premise. A famous result, the cut-elimination theorem, states that this rule is not required: all uses of it can be removed from any proof, but the proof could get exponentially larger.

    Γ ⇒ Δ, A    A, Γ ⇒ Δ
    ---------------------- (cut)
            Γ ⇒ Δ

This special case of cut may be easier to understand. We prove lemma A from Γ and use A and Γ together to reach the conclusion B:

    Γ ⇒ B, A    A, Γ ⇒ B
    ----------------------
            Γ ⇒ B

Since Γ contains as much information as A, it is natural to expect that such lemmas should not be necessary, but the cut-elimination theorem is hard to prove.

Note: On the course website, there is a simple theorem prover called folderol.ML. It can prove easy first-order theorems using the sequent calculus, and outputs a summary of each proof. The file begins with very basic instructions describing how to run it. The file testsuite.ML contains further instructions and numerous examples.

Exercise 6 Prove the following sequents:

    ¬¬A ⇒ A
    A ∧ B ⇒ B ∧ A
    A ∨ B ⇒ B ∨ A

Exercise 7 Prove the following sequents:

    (A ∧ B) ∧ C ⇒ A ∧ (B ∧ C)
    (A ∨ B) ∧ (A ∨ C) ⇒ A ∨ (B ∧ C)
    ¬(A ∨ B) ⇒ ¬A ∧ ¬B

4 First-order Logic

First-order logic (FOL) extends propositional logic to allow reasoning about the members (such as numbers) of some non-empty universe. It uses the quantifiers ∀ (for all) and ∃ (there exists). First-order logic has variables ranging over so-called individuals, but not over functions or predicates; such variables are found in second- or higher-order logic.

4.1 Syntax of First-order Logic

Terms stand for individuals while formulæ stand for truth values. We assume there is an infinite supply of variables x, y, ... that range over individuals. A first-order language specifies symbols that may appear in terms and formulæ. A first-order language L contains, for all n ≥ 0, a set of n-place function symbols f, g, ... and n-place predicate symbols P, Q, .... These sets may be empty, finite, or infinite. Constant symbols a, b, ... are simply 0-place function symbols. Intuitively, they are names for fixed elements of the universe. It is not required to have a constant for each element; conversely, two constants are allowed to have the same meaning.

Predicate symbols are also called relation symbols. Prolog programmers refer to function symbols as functors.

Definition 3 The terms t, u, ... of a first-order language are defined recursively as follows:

• A variable is a term.
• A constant symbol is a term.
• If t1, ..., tn are terms and f is an n-place function symbol then f(t1, ..., tn) is a term.

Definition 4 The formulæ A, B, ... of a first-order language are defined recursively as follows:

• If t1, ..., tn are terms and P is an n-place predicate symbol then P(t1, ..., tn) is a formula (called an atomic formula).
• If A and B are formulæ then ¬A, A ∧ B, A ∨ B, A → B, A ↔ B are also formulæ.
• If x is a variable and A is a formula then ∀x A and ∃x A are also formulæ.

Brackets are used in the conventional way for grouping. Terms and formulæ are tree-like data structures, not strings. The quantifiers ∀x A and ∃x A bind tighter than the binary connectives; thus ∀x A ∧ B is equivalent to (∀x A) ∧ B. Frequently, you will see an alternative quantifier syntax, ∀x . A and ∃x . B, which binds more weakly than the binary connectives: ∀x . A ∧ B is equivalent to ∀x (A ∧ B). The dot is the give-away; look out for it! Nested quantifications such as ∀x ∀y A are abbreviated to ∀xy A.
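Definitions 3 and 4 describe terms and formulæ as tree-like data structures. As an illustrative sketch only (the class names are my own, not from the course or from folderol.ML), such trees might be represented like this:

```python
from dataclasses import dataclass

# Terms (Definition 3): variables, and function applications.
# A constant is a 0-place function application, as in the notes.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Fn:
    symbol: str
    args: tuple = ()      # empty tuple: a constant symbol

# Formulae (Definition 4): atoms, connectives, quantifiers.
@dataclass(frozen=True)
class Atom:               # P(t1, ..., tn)
    symbol: str
    args: tuple = ()

@dataclass(frozen=True)
class Not:
    body: object

@dataclass(frozen=True)
class BinOp:              # op is one of "and", "or", "imp", "iff"
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Quant:              # kind is "all" or "ex"; binds one variable
    kind: str
    var: str
    body: object

# The term (x + 3) - y from Example 4, written as a tree in prefix form:
t = Fn("-", (Fn("+", (Var("x"), Fn("3"))), Var("y")))
```

Representing terms as trees rather than strings makes the "brackets for grouping" convention irrelevant: grouping is explicit in the structure.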

Example 4 A language for arithmetic might have the constant symbols 0, 1, 2, ..., the function symbols +, −, ×, /, and the predicate symbols =, <, >, .... We may informally adopt an infix notation for the function and predicate symbols. Terms include 0 and (x + 3) − y; formulæ include y = 0 and x + y < y + z.

4.2 Examples of First-order Statements

Here are some sample formulæ with a rough English translation. English is easier to understand but is too ambiguous for long derivations.

All professors are brilliant:

    ∀x (professor(x) → brilliant(x))

The income of any banker is greater than the income of any bedder:

    ∀xy (banker(x) ∧ bedder(y) → income(x) > income(y))

Note that > is a 2-place relation symbol. The infix notation is simply a convention.

Every student has a supervisor:

    ∀x (student(x) → ∃y supervises(y, x))

This does not preclude a student having several supervisors.

Every student's tutor is a member of the student's College:

    ∀xy (student(x) ∧ college(y) ∧ member(x, y) → member(tutor(x), y))

Formalising the notion of tutor as a function incorporates the assumption that every student has exactly one tutor.

A mathematical example: there exist infinitely many Pythagorean triples:

    ∀n ∃ijk (i > n ∧ i² + j² = k²)

Here the superscript 2 refers to the squaring function. Equality (=) is just another relation symbol (satisfying suitable axioms), but there are many special techniques for it.

First-order logic requires a non-empty domain: thus ∀x P(x) implies ∃x P(x). If the domain could be empty, even ∃x t could fail to hold. Note also that ∀x ∃y y² = x is true if the domain is the complex numbers, and is false if the domain is the integers or reals. We determine properties of the domain by asserting the set of statements it must satisfy.

There are many other forms of logic. Many-sorted first-order logic assigns types to each variable, function symbol and predicate symbol, with straightforward type checking; types are called sorts and denote non-empty domains. Second-order logic allows quantification over functions and predicates. It can express mathematical induction by

    ∀P [P(0) ∧ ∀k (P(k) → P(k + 1)) → ∀n P(n)],

using quantification over the unary predicate P. In second-order logic, these functions and predicates must themselves be first-order, taking no functions or predicates as arguments. Higher-order logic allows unrestricted quantification over functions and predicates of any order. The list of logics could be continued indefinitely.

4.3 Formal Semantics of First-order Logic

Let us rigorously define the meaning of formulæ. An interpretation of a language maps its function symbols to actual functions, and its relation symbols to actual relations. For example, the predicate symbol "student" could be mapped to the set of all students currently enrolled at the University.

Definition 5 Let L be a first-order language. An interpretation I of L is a pair (D, I). Here D is a nonempty set, the domain or universe. The operation I maps symbols to individuals, functions or sets:

• if c is a constant symbol (of L) then I[c] ∈ D
• if f is an n-place function symbol then I[f] ∈ Dⁿ → D (which means I[f] is an n-place function on D)
• if P is an n-place relation symbol then I[P] ∈ Dⁿ → {1, 0} (equivalently, I[P] ⊆ Dⁿ, which means I[P] is an n-place relation on D)

It is natural to regard predicates as truth-valued functions. But in first-order logic, relations and functions are distinct concepts because no term can denote a truth value. One of the benefits of higher-order logic is that relations are a special case of functions, and formulæ are simply boolean-valued terms.

An interpretation does not say anything about variables. An environment or valuation can be used to represent the values of variables.

Definition 6 A valuation V of L over D is a function from the variables of L into D. Write IV[t] for the value of t with respect to I and V, defined by

    IV[x] ≝ V(x)   if x is a variable
    IV[c] ≝ I[c]
    IV[f(t1, ..., tn)] ≝ I[f](IV[t1], ..., IV[tn])

Write V{a/x} for the valuation that maps x to a and is otherwise the same as V. Typically, we modify a valuation one variable at a time. This is a semantic analogue of substitution for the variable x.

4.4 What is truth?

We can define truth in first-order logic. This formidable definition formalizes the intuitive meanings of the connectives, so it almost looks like a tautology. It effectively specifies each connective by English descriptions. Valuations help specify the meanings of quantifiers. Alfred Tarski first defined truth in this manner.

Definition 7 Let A be a formula. Then for an interpretation I = (D, I) write ⊨_{I,V} A to mean that A is true in I under V. This is defined by cases on the construction of the formula A:

• ⊨_{I,V} P(t1, ..., tn) is defined to hold if I[P](IV[t1], ..., IV[tn]) = 1 (that is, the actual relation I[P] is true for the given values)
• ⊨_{I,V} t = u if IV[t] equals IV[u] (if = is a predicate symbol of the language, then we insist that it really denotes equality)
• ⊨_{I,V} ¬B if ⊨_{I,V} B does not hold
• ⊨_{I,V} B ∧ C if ⊨_{I,V} B and ⊨_{I,V} C
• ⊨_{I,V} B ∨ C if ⊨_{I,V} B or ⊨_{I,V} C
• ⊨_{I,V} B → C if ⊨_{I,V} B does not hold or ⊨_{I,V} C holds
• ⊨_{I,V} B ↔ C if ⊨_{I,V} B and ⊨_{I,V} C both hold or neither holds
• ⊨_{I,V} ∃x B if there exists some m ∈ D such that ⊨_{I,V{m/x}} B holds (that is, B holds when x has the value m)
• ⊨_{I,V} ∀x B if ⊨_{I,V{m/x}} B holds for all m ∈ D

The cases for ∧, ∨, → and ↔ follow the propositional truth tables.

Write ⊨_I A provided ⊨_{I,V} A for all V. Clearly, if A is closed (contains no free variables) then its truth is independent of the valuation. The definitions of valid, satisfiable, etc. carry over almost verbatim from Sect. 2.2.

Definition 8 Let A be a formula having no free variables. (Remember, free variables in effect are universally quantified, by the definition of ⊨_I A.)

• An interpretation I satisfies a formula if ⊨_I A holds.
• A set S of formulæ is valid if every interpretation of S satisfies every formula in S.
• A set S of formulæ is satisfiable (or consistent) if there is some interpretation that satisfies every formula in S.
• A set S of formulæ is unsatisfiable (or inconsistent) if it is not satisfiable. (Each interpretation falsifies some formula of S.)
• A model of a set S of formulæ is an interpretation that satisfies every formula in S. We also consider models that satisfy a single formula.

Unlike in propositional logic, models can be infinite and there can be an infinite number of models. There is no chance of proving validity by checking all models. We must rely on proof.

Example 5 The formula P(a) ∧ ¬P(b) is satisfiable. Consider the interpretation with D = {London, Paris} and I defined by

    I[a] = Paris
    I[b] = London
    I[P] = {Paris}

On the other hand, ∀xy (P(x) ∧ ¬P(y)) is unsatisfiable because it requires P(x) to be both true and false for all x. Also unsatisfiable is P(x) ∧ ¬P(y): its free variables are taken to be universally quantified, so it is equivalent to ∀xy (P(x) ∧ ¬P(y)).

The formula (∃x P(x)) → P(c) holds in the interpretation (D, I) where D = {0, 1}, I[P] = {0}, and I[c] = 0. (Thus P(x) means "x equals 0" and c denotes 0.) If we modify this interpretation by making I[c] = 1 then the formula no longer holds. Thus it is satisfiable but not valid.

The formula (∀x P(x)) → (∀x P(f(x))) is valid, for let (D, I) be an interpretation. If ∀x P(x) holds in this interpretation then P(x) holds for all x ∈ D, thus I[P] = D. The symbol f denotes some actual function I[f] ∈ D → D. Since I[P] = D and I[f](x) ∈ D for all x ∈ D, the formula ∀x P(f(x)) holds.

The formula ∀xy x = y is satisfiable but not valid; it is true in every domain that consists of exactly one element. (The empty domain is not allowed in first-order logic.)

Example 6 Let L be the first-order language consisting of the constant 0 and the (infix) 2-place function symbol +. An interpretation I of this language is any non-empty domain D together with values I[0] and I[+], with I[0] ∈ D and I[+] ∈ D × D → D. In the language L we may express the following axioms:

    x + 0 = x
    0 + x = x
    (x + y) + z = x + (y + z)

One model of these axioms is the set of natural numbers, provided we give 0 and + the obvious meanings. But the axioms have many other models.⁵ Below, let A be some set.

1. The set of all strings (in ML, say), letting 0 denote the empty string and + string concatenation.
2. The set of all subsets of A, letting 0 denote the empty set and + union.
3. The set of functions in A → A, letting 0 denote the identity function and + composition.

Exercise 8 To test your understanding of quantifiers, consider the following formulæ: everybody loves somebody vs there is somebody that everybody loves:

    ∀x ∃y loves(x, y)    (1)
    ∃y ∀x loves(x, y)    (2)

Does (1) imply (2)? Does (2) imply (1)? Consider both the informal meaning and the formal semantics defined above.

Exercise 9 Describe a formula that is true in precisely those domains that contain at least m elements. (We say it characterises those domains.) Describe a formula that characterises the domains containing at most m elements.

⁵ Models of these axioms are called monoids.
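Definition 7 can be executed directly over a finite domain. Below is an illustrative sketch (not the course's folderol.ML), assuming a toy formula representation of nested tuples such as ("atom", "P", "a"), ("not", A), ("and", A, B), ("all", "x", A) and ("ex", "x", A), with terms being variable or constant names:

```python
def term_val(t, intp, val):
    """IV[t]: look up a variable in the valuation V, a constant in I."""
    return val[t] if t in val else intp[t]

def holds(A, dom, intp, val):
    """The truth definition of Definition 7, case by case."""
    tag = A[0]
    if tag == "atom":                          # I,V |= P(t1,...,tn)
        _, p, *ts = A
        return tuple(term_val(t, intp, val) for t in ts) in intp[p]
    if tag == "not":
        return not holds(A[1], dom, intp, val)
    if tag == "and":
        return holds(A[1], dom, intp, val) and holds(A[2], dom, intp, val)
    if tag == "or":
        return holds(A[1], dom, intp, val) or holds(A[2], dom, intp, val)
    if tag == "imp":
        return (not holds(A[1], dom, intp, val)) or holds(A[2], dom, intp, val)
    if tag == "all":                           # B holds for every m in D
        _, x, B = A
        return all(holds(B, dom, intp, {**val, x: m}) for m in dom)
    if tag == "ex":                            # B holds for some m in D
        _, x, B = A
        return any(holds(B, dom, intp, {**val, x: m}) for m in dom)
    raise ValueError(tag)

# Example 5: D = {London, Paris}, I[a] = Paris, I[b] = London, I[P] = {Paris}.
dom = {"London", "Paris"}
intp = {"a": "Paris", "b": "London", "P": {("Paris",)}}
A = ("and", ("atom", "P", "a"), ("not", ("atom", "P", "b")))
print(holds(A, dom, intp, {}))   # P(a) & ~P(b) is satisfied here: True
```

The dictionary update `{**val, x: m}` is exactly V{m/x}: the valuation that maps x to m and is otherwise the same as V.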

Exercise 10 Let ≈ be a 2-place predicate symbol, which we write using infix notation as x ≈ y instead of ≈(x, y). Consider the axioms

    ∀x x ≈ x                              (1)
    ∀xy (x ≈ y → y ≈ x)                   (2)
    ∀xyz (x ≈ y ∧ y ≈ z → x ≈ z)          (3)

Let the universe be the set of natural numbers, N = {0, 1, 2, ...}. Which axioms hold if I[≈] is

• the empty relation, ∅?
• the universal relation, {(x, y) | x, y ∈ N}?
• the equality relation, {(x, x) | x ∈ N}?
• the relation {(x, y) | x, y ∈ N ∧ x + y is even}?
• the relation {(x, y) | x, y ∈ N ∧ x + y = 100}?
• the relation {(x, y) | x, y ∈ N ∧ x ≡ y (mod 16)}?

Exercise 11 Taking = and R as 2-place relation symbols, consider the following axioms:

    ∀x ¬R(x, x)                                   (1)
    ∀xy ¬(R(x, y) ∧ R(y, x))                      (2)
    ∀xyz (R(x, y) ∧ R(y, z) → R(x, z))            (3)
    ∀xy (R(x, y) ∨ (x = y) ∨ R(y, x))             (4)
    ∀xz (R(x, z) → ∃y (R(x, y) ∧ R(y, z)))        (5)

Exhibit two interpretations that satisfy axioms 1–5. Exhibit two interpretations that satisfy axioms 1–4 and falsify axiom 5. Exhibit two interpretations that satisfy axioms 1–3 and falsify axioms 4 and 5. Consider only interpretations that make = denote the equality relation. (This exercise asks whether you can make the connection between the axioms and typical mathematical objects satisfying them. A start is to say that R(x, y) means x < y, but on what domain?)

5 Formal Reasoning in First-Order Logic

This section reviews some syntactic notations: free variables versus bound variables and substitution. It lists some of the main equivalences for quantifiers. Finally it describes and illustrates the quantifier rules of the sequent calculus.

5.1 Free vs Bound Variables

The notion of bound variable occurs widely in mathematics: consider the role of x in ∫ f(x) dx and the role of k in lim_{k→∞} a_k. Similar concepts occur in the λ-calculus. In first-order logic, variables are bound by quantifiers (rather than by λ).

Definition 9 An occurrence of a variable x in a formula is bound if it is contained within a subformula of the form ∀x A or ∃x A. An occurrence of the form ∀x or ∃x is called the binding occurrence of x. An occurrence of a variable is free if it is not bound. A closed formula is one that contains no free variables. A ground term, formula, etc. is one that contains no variables at all.

In ∀x ∃y R(x, y, z), the variables x and y are bound while z is free.

In (∃x P(x)) ∧ Q(x), the occurrence of x just after P is bound, while that just after Q is free. Thus x has both free and bound occurrences. Such situations can be avoided by renaming bound variables, for example obtaining (∃y P(y)) ∧ Q(x). Renaming can also ensure that all bound variables in a formula are distinct. The renaming of bound variables is sometimes called α-conversion.

Example 7 Renaming bound variables in a formula preserves its meaning, if done properly. Consider the following renamings of ∀x ∃y R(x, y, z):

    ∀u ∃y R(u, y, z)    OK
    ∀x ∃w R(x, w, z)    OK
    ∀u ∃y R(x, y, z)    not done consistently
    ∀y ∃y R(y, y, z)    clash with bound variable y
    ∀z ∃y R(z, y, z)    clash with free variable z

5.2 Substitution

If A is a formula, t is a term, and x is a variable, then A[t/x] is the formula obtained by substituting t for x throughout A. The substitution only affects the free occurrences of x. Pronounce A[t/x] as "A with t for x". We also use u[t/x] for substitution in a term u and C[t/x] for substitution in a clause C (clauses are described in Sect. 6 below).

Substitution is only sensible provided all bound variables in A are distinct from all variables in t. This can be achieved by renaming the bound variables in A. For example, if ∀x A holds then A[t/x] is true for all t; the formula holds when we drop the ∀x and replace x by any term. But ∀x ∃y x = y is true in all models, while ∃y y + 1 = y is not. We may not replace x by y + 1, since the free occurrence of y in y + 1 gets captured by the ∃y. First we must rename the bound y, getting say ∀x ∃z x = z; now we may replace x by y + 1, getting ∃z y + 1 = z. This formula is true in all models, regardless of the meaning of the symbols + and 1.

5.3 Equivalences Involving Quantifiers

These equivalences are useful for transforming and simplifying quantified formulæ. They can be used to convert formulæ into prenex normal form, where all quantifiers are at the front, or conversely to move quantifiers into the smallest possible scope.

pulling quantifiers through negation (infinitary de Morgan laws)

    ¬(∀x A) ≃ ∃x ¬A
    ¬(∃x A) ≃ ∀x ¬A

pulling quantifiers through conjunction and disjunction (provided x is not free in B)

    (∀x A) ∧ B ≃ ∀x (A ∧ B)
    (∀x A) ∨ B ≃ ∀x (A ∨ B)
    (∃x A) ∧ B ≃ ∃x (A ∧ B)
    (∃x A) ∨ B ≃ ∃x (A ∨ B)

distributive laws

    (∀x A) ∧ (∀x B) ≃ ∀x (A ∧ B)
    (∃x A) ∨ (∃x B) ≃ ∃x (A ∨ B)

Exercise 12 Verify these equivalences by appealing to the truth definition for first-order logic:

    ¬(∃x P(x)) ≃ ∀x ¬P(x)
    (∀x P(x)) ∧ R ≃ ∀x (P(x) ∧ R)
    (∃x P(x)) ∨ (∃x Q(x)) ≃ ∃x (P(x) ∨ Q(x))

Exercise 13 Explain why the following are not equivalences. Are they implications? In which direction?

    (∀x A) ∨ (∀x B) ≄ ∀x (A ∨ B)
    (∃x A) ∧ (∃x B) ≄ ∃x (A ∧ B)
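Equivalences like these can be spot-checked by brute force over a small finite domain: let P range over every unary predicate on D and B over both truth values, and compare the two sides in every case. An illustrative sketch only; a finite check does not, of course, prove the general laws:

```python
from itertools import product

D = [0, 1, 2]

def forall(p):  return all(p(x) for x in D)
def exists(p):  return any(p(x) for x in D)

# Every unary predicate on D, as a dict from elements to truth values.
all_preds = [dict(zip(D, bits)) for bits in product([False, True], repeat=len(D))]

# de Morgan pulling law:  not (forall x. P(x))  ==  exists x. not P(x)
law1 = all((not forall(lambda x: P[x])) == exists(lambda x: not P[x])
           for P in all_preds)

# (forall x. P(x)) or B  ==  forall x. (P(x) or B), provided x not free in B
law2 = all((forall(lambda x: P[x]) or B) == forall(lambda x: P[x] or B)
           for P in all_preds for B in [False, True])

print(law1, law2)   # both checks succeed: True True
```

The side condition "x not free in B" is modelled by making B a plain truth value that cannot depend on x; dropping that restriction is exactly what breaks the non-equivalences of Exercise 13.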

implication (writing A → B as ¬A ∨ B; provided x is not free in B)

    (∀x A) → B ≃ ∃x (A → B)
    (∃x A) → B ≃ ∀x (A → B)

expansion: ∀ and ∃ as infinitary conjunction and disjunction

    ∀x A ≃ (∀x A) ∧ A[t/x]
    ∃x A ≃ (∃x A) ∨ A[t/x]

With the help of the associative and commutative laws for ∧ and ∨, a quantifier can be pulled out of any conjunct or disjunct. The distributive laws differ from pulling: they replace two quantifiers by one. (Note that the quantified variables will probably have different names, so one of them will have to be renamed.) Depending upon the situation, using distributive laws can be either better or worse than pulling. There are no distributive laws for ∀ over ∨ and ∃ over ∧. If in doubt, do not use distributive laws!

Two substitution laws do not involve quantifiers explicitly, but let us use x = t to replace x by t in a restricted context:

    (x = t ∧ A) ≃ (x = t ∧ A[t/x])
    (x = t → A) ≃ (x = t → A[t/x])

Many first-order formulæ have easy proofs using equivalences:

    ∃x (x = a ∧ P(x)) ≃ ∃x (x = a ∧ P(a))
                      ≃ ∃x (x = a) ∧ P(a)
                      ≃ P(a)

The following formula is quite hard to prove using the sequent calculus, but using equivalences it is simple:

    ∃z (P(z) → P(a) ∧ P(b)) ≃ ∀z P(z) → P(a) ∧ P(b)
                            ≃ ∀z P(z) ∧ P(a) ∧ P(b) → P(a) ∧ P(b)
                            ≃ t

If you are asked to prove a formula, but no particular formal system (such as the sequent calculus) has been specified, then you may use any convincing argument. Using equivalences as above can shorten the proof considerably. Also, take advantage of symmetries; in proving A ∧ B ≃ B ∧ A, it obviously suffices to prove A ∧ B ⊨ B ∧ A.

5.4 Sequent Rules for the Universal Quantifier

Here are the sequent rules for ∀:

    A[t/x], Γ ⇒ Δ             Γ ⇒ Δ, A
    --------------- (∀l)      ------------ (∀r)
     ∀x A, Γ ⇒ Δ              Γ ⇒ Δ, ∀x A

Rule (∀r) holds provided x is not free in the conclusion! Note that if x were indeed free somewhere in Γ or Δ, then the sequent would be assuming properties of x. This restriction ensures that x is a fresh variable, which therefore can denote an arbitrary value. Contrast the proof of the theorem ∀x [P(x) → P(x)] with an attempted proof of the invalid formula P(x) → ∀x P(x). Since x is a bound variable, you may rename it to get around the restriction, and obviously P(x) → ∀y P(y) should have no proof.

Rule (∀l) lets us create many instances of ∀x A. The exercises below include some examples that require more than one copy of the quantified formula. Since we regard sequents as consisting of sets, we may regard them as containing unlimited quantities of each of their elements. But except for the two rules (∀l) and (∃r) (see below), we only need one copy of each formula.

Example 8 In this elementary proof, rule (∀l) is applied to instantiate the bound variable x with the term f(y). The application of (∀r) is permitted because y is not free in the conclusion (which, in fact, is closed).

    P(f(y)) ⇒ P(f(y))
    ---------------------- (∀l)
     ∀x P(x) ⇒ P(f(y))
    ---------------------- (∀r)
    ∀x P(x) ⇒ ∀y P(f(y))

Example 9 This proof concerns part of the law for pulling universal quantifiers out of conjunctions. Rule (∀l) just discards the quantifier, since it instantiates the bound variable x with the free variable x.

        P(x), Q ⇒ P(x)
    ------------------------ (∧l)
       P(x) ∧ Q ⇒ P(x)
    ------------------------ (∀l)
     ∀x (P(x) ∧ Q) ⇒ P(x)
    ------------------------- (∀r)
    ∀x (P(x) ∧ Q) ⇒ ∀x P(x)

Example 10 The sequent ∀x (A → B) ⇒ A → ∀x B is valid provided x is not free in A. That condition is required for the application of (∀r) below:

    A ⇒ A, B    A, B ⇒ B
    ------------------------ (→l)
        A, A → B ⇒ B
    ------------------------ (∀l)
       A, ∀x (A → B) ⇒ B
    ------------------------- (∀r)
      A, ∀x (A → B) ⇒ ∀x B
    ------------------------- (→r)
     ∀x (A → B) ⇒ A → ∀x B

What if the condition fails to hold? Let A and B both be the formula x = 0. Then ∀x (x = 0 → x = 0) is valid, but x = 0 → ∀x (x = 0) is not valid (it fails under any valuation that sets x to 0).

Note. The proof on the slides of

    ∀x (P → Q(x)) ⇒ P → ∀y Q(y)

is essentially the same as the proof above, but uses different variable names so that you can see how a quantified formula like ∀x (P → Q(x)) is instantiated to produce P → Q(y). The proof given above is also correct: the variable names are identical, the instantiation is trivial and ∀x (A → B) simply produces A → B. In our example, B may be any formula possibly containing the variable x; the proof on the slides uses the specific formula Q(x).

Example 11 Figure 2 presents half of the ∃ distributive law. Rule (∃r) just discards the quantifier, instantiating the bound variable x with the free variable x. In the general case, it can instantiate the bound variable with any term.

The restriction on the sequent rules, namely "x is not free in the conclusion", can be confusing when you are building a sequent proof working backwards. One simple way to avoid problems is always to rename a quantified variable if the same variable appears free in the sequent. For example, when you see the sequent P(x), ∃x Q(x) ⇒ Δ, replace it immediately by P(x), ∃y Q(y) ⇒ Δ.

Example 12 The sequent

    ∃x P(x) ∧ ∃x Q(x) ⇒ ∃x (P(x) ∧ Q(x))

is not valid: the value of x that makes P(x) true could differ from the value of x that makes Q(x) true. This comes out clearly in the proof attempt, where we are not allowed to apply (∃l) twice with the same variable name, x. As soon as we are forced to rename the second variable to y, it becomes obvious that the two values could differ. Turning to the right side of the sequent, no application of (∃r) can lead to a proof. We have nothing to instantiate x with:

        P(x), Q(y) ⇒ P(x) ∧ Q(x)
    -------------------------------- (∃r)
      P(x), Q(y) ⇒ ∃x (P(x) ∧ Q(x))
    ----------------------------------- (∃l)
     P(x), ∃x Q(x) ⇒ ∃x (P(x) ∧ Q(x))
    -------------------------------------- (∃l)
     ∃x P(x), ∃x Q(x) ⇒ ∃x (P(x) ∧ Q(x))
    -------------------------------------- (∧l)
     ∃x P(x) ∧ ∃x Q(x) ⇒ ∃x (P(x) ∧ Q(x))

Exercise 14 Prove ¬∀y [(Q(a) ∨ Q(b)) ∧ ¬Q(y)] using equivalences, and then formally using the sequent calculus.

Exercise 15 Prove the following using the sequent calculus. Note that the last one requires two uses of (∀l)!

    (∀x P(x)) ∧ (∀x Q(x)) ⇒ ∀y (P(y) ∧ Q(y))
    ∀x (P(x) ∧ Q(x)) ⇒ (∀y P(y)) ∧ (∀y Q(y))
    ∀x [P(x) → P(f(x))] ⇒ ∀x [P(x) → P(f(f(x)))]

Exercise 16 Prove the equivalence ∀x [P(x) ∨ P(a)] ≃ P(a).

Exercise 17 Prove the following using the sequent calculus. The last one is difficult and requires two uses of (∃r).

    P(a) ∨ ∃x P(f(x)) ⇒ ∃y P(y)
    ∃x (P(x) ∨ Q(x)) ⇒ (∃y P(y)) ∨ (∃y Q(y))
    ⇒ ∃z (P(z) → P(a) ∧ P(b))
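The renaming discipline of Sect. 5.2 — rename a bound variable before substituting whenever it occurs in the substituted term — can be mechanised. Below is an illustrative sketch on the same toy tuple representation used earlier; the fresh-name scheme (priming the variable) is a simplification of my own, not from the course:

```python
def term_vars(u):
    """Variables of a term; over-approximates by including constant
    names, which is harmless (it only forces extra renaming)."""
    if isinstance(u, tuple):                    # ("fn", f, t1, ..., tn)
        return set().union(*[term_vars(v) for v in u[2:]] or [set()])
    return {u}

def subst_term(u, t, x):
    """u[t/x] on terms."""
    if isinstance(u, tuple):
        return (u[0], u[1]) + tuple(subst_term(v, t, x) for v in u[2:])
    return t if u == x else u

def subst(A, t, x):
    """A[t/x] on formulae, renaming bound variables to avoid capture."""
    tag = A[0]
    if tag == "atom":
        return (A[0], A[1]) + tuple(subst_term(u, t, x) for u in A[2:])
    if tag == "not":
        return (tag, subst(A[1], t, x))
    if tag in ("and", "or", "imp", "iff"):
        return (tag, subst(A[1], t, x), subst(A[2], t, x))
    if tag in ("all", "ex"):
        _, y, B = A
        if y == x:                  # x is bound here: no free occurrences
            return A
        if y in term_vars(t):       # rename bound y to avoid capture
            z = y + "'"
            B, y = subst(B, z, y), z
        return (tag, y, subst(B, t, x))
    raise ValueError(tag)

# The example from Sect. 5.2: in  exists y. x = y,  replacing x by y + 1
# must first rename the bound y, giving  exists y'. y + 1 = y'.
A = ("ex", "y", ("atom", "=", "x", "y"))
t = ("fn", "+", "y", "1")
print(subst(A, t, "x"))
```

A production implementation would need a genuinely fresh name (priming could itself clash); the sketch only illustrates why the ∃y in the example must be renamed before the substitution goes through.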

5.5 Sequent Rules for Existential Quantifiers

Here are the sequent rules for ∃:

     A, Γ ⇒ Δ               Γ ⇒ Δ, A[t/x]
    ------------ (∃l)       --------------- (∃r)
    ∃x A, Γ ⇒ Δ               Γ ⇒ Δ, ∃x A

Rule (∃l) holds provided x is not free in the conclusion — that is, not free in the formulæ of Γ or Δ. These rules are strictly dual to the ∀-rules; any example involving ∀ can easily be transformed into one involving ∃ and having a proof of precisely the same form. For example, the sequent ∀x P(x) ⇒ ∀y P(f(y)) can be transformed into ∃y P(f(y)) ⇒ ∃x P(x).

If you have a choice, apply rules that have provisos — namely (∃l) and (∀r) — before applying the other quantifier rules as you work upwards. The other rules introduce terms and therefore new variables to the sequent, which could prevent you from applying (∃l) and (∀r) later.

        P(x) ⇒ P(x), Q(x)
    ----------------------- (∨r)
       P(x) ⇒ P(x) ∨ Q(x)
    ---------------------------- (∃r)
      P(x) ⇒ ∃x (P(x) ∨ Q(x))
    ----------------------------- (∃l)                          (similar)
     ∃x P(x) ⇒ ∃x (P(x) ∨ Q(x))         ∃x Q(x) ⇒ ∃x (P(x) ∨ Q(x))
    ----------------------------------------------------------------- (∨l)
              ∃x P(x) ∨ ∃x Q(x) ⇒ ∃x (P(x) ∨ Q(x))

    Figure 2: Half of the ∃ distributive law

6 Clause Methods for Propositional Logic

This section discusses two proof methods in the context of propositional logic.

The Davis-Putnam-Logeman-Loveland procedure dates from 1960, and its application to first-order logic has been regarded as obsolete for decades. However, the procedure has been rediscovered and high-performance implementations built. In the 1990s, these "SAT solvers" were applied to obscure problems in combinatorial mathematics, such as the existence of Latin squares. Recently, they have been generalised to SMT solvers, also handling arithmetic, with an explosion of serious applications.

Resolution is a powerful proof method for first-order logic. We first consider ground resolution, which works for propositional logic. Though of little practical use, ground resolution introduces some of the main concepts. The resolution method is not natural for hand proofs, but it is easy to automate: it has only one inference rule and no axioms.

Both methods require the original formula to be negated, then converted into CNF. Recall that CNF is a conjunction of disjunctions of literals. A disjunction of literals is called a clause, and written as a set of literals. Converting the negated formula to CNF yields a set of such clauses. Both methods seek a contradiction in the set of clauses; if the clauses are unsatisfiable, then so is the negated formula, and therefore the original formula is valid.

To refute a set of clauses is to prove that they are inconsistent. The proof is called a refutation.

6.1 Clausal Notation

Definition 10 A clause is a disjunction of literals

    ¬K1 ∨ ··· ∨ ¬Km ∨ L1 ∨ ··· ∨ Ln,

written as a set

    {¬K1, ..., ¬Km, L1, ..., Ln}.

A clause is true (in some interpretation) just when one of the literals is true. Thus the empty clause, namely {}, indicates contradiction. It is normally written □.

Since ∨ is commutative, associative, and idempotent, the order of literals in a clause does not matter. The above clause is logically equivalent to the implication

    (K1 ∧ ··· ∧ Km) → (L1 ∨ ··· ∨ Ln)

Kowalski notation abbreviates this to

    K1, ···, Km → L1, ···, Ln

and when n = 1 we have the familiar Prolog clauses, also known as definite or Horn clauses.

6.2 The DPLL Method

The Davis-Putnam-Logeman-Loveland (DPLL) method is based upon some obvious identities:

    t ∧ A ≃ A
    A ∧ (A ∨ B) ≃ A
    A ∧ (¬A ∨ B) ≃ A ∧ B
    A ≃ (A ∧ B) ∨ (A ∧ ¬B)

Here is an outline of the algorithm:

1. Delete tautological clauses: {P, ¬P, ...}
2. Unit propagation: for each unit clause {L},
   • delete all clauses containing L
   • delete ¬L from all clauses
3. Delete all clauses containing pure literals. A literal L is pure if there is no clause containing ¬L.
4. If the empty clause is generated, then we have a refutation. Conversely, if all clauses are deleted, then the original clause set is satisfiable.
5. Perform a case split on some literal L, and recursively apply the algorithm to the L and ¬L subcases. The clause set is satisfiable if and only if one of the subcases is satisfiable.

This is a decision procedure. It must terminate because each case split eliminates a propositional symbol. Modern implementations such as zChaff and MiniSat add various heuristics. They also rely on carefully designed data structures that improve performance by reducing the number of cache misses, for example.

Historical note. Davis and Putnam introduced the first version of this procedure. Logeman and Loveland introduced the splitting rule, and their version has completely superseded the original Davis-Putnam method.

Tautological clauses are deleted because they are always true, and thus cannot participate in a contradiction. A pure literal can always be assumed to be true; deleting the clauses containing it can be regarded as a degenerate case split, in which there is only one case.

Example 13 DPLL can show that a formula is not a theorem. Consider the formula P ∨ Q → Q ∨ R. After negating this and converting to CNF, we obtain the three clauses {P, Q}, {¬Q} and {¬R}. The DPLL method terminates rapidly:

    {P, Q}  {¬Q}  {¬R}    initial clauses
    {P}     {¬R}          unit ¬Q
    {¬R}                  unit P (also pure)
                          unit ¬R (also pure)

All clauses have been deleted, so execution terminates. The clauses are satisfiable by P ↦ 1, Q ↦ 0, R ↦ 0. This interpretation falsifies P ∨ Q → Q ∨ R.

Example 14 Here is an example of a case split. Consider the clause set

    {¬Q, R}  {¬R, P}  {¬R, Q}
    {¬P, Q, R}  {P, Q}  {¬P, ¬Q}

There are no unit clauses or pure literals, so we arbitrarily select P for case splitting:

    {¬Q, R}  {¬R, Q}  {Q, R}  {¬Q}    if P is true
    {¬R}  {R}                          unit ¬Q
    {}                                 unit R

    {¬Q, R}  {¬R}  {Q}                 if P is false
    {¬Q}  {Q}                          unit ¬R
    {}                                 unit Q

The empty clause is written {} above to make the pattern of execution clearer; traditionally, however, the empty clause is written □. When we encounter a contradiction, we abandon the current case and consider any remaining cases. If all cases are contradictory, then the original set of clauses is inconsistent. If they arose from some negated formula ¬A, then A is a theorem.

You might find it instructive to download MiniSat⁶, which is a very concise open-source SAT solver. It is coded in C++. These days, SAT solvers are largely superseded by SMT solvers, which also handle arithmetic, arrays, bit vectors, etc.

⁶ http://minisat.se/
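A minimal DPLL sketch in the spirit of the outline in Sect. 6.2 — unit propagation, the pure-literal rule, then a case split. Clauses are frozensets of integer literals (a negative integer is a negated symbol), tautologies are assumed deleted beforehand, and there are none of the heuristics or clever data structures of real solvers such as zChaff or MiniSat:

```python
def assign(clauses, lit):
    """Assume lit is true: delete clauses containing lit, delete -lit."""
    return [c - {-lit} for c in clauses if lit not in c]

def dpll(clauses):
    """Return True iff the clause set is satisfiable."""
    # Step 2: unit propagation
    while True:
        unit = next((next(iter(c)) for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        clauses = assign(clauses, unit)
        if frozenset() in clauses:           # empty clause: refutation
            return False
    # Step 3: pure-literal deletion
    lits = {l for c in clauses for l in c}
    for l in lits:
        if -l not in lits:
            clauses = [c for c in clauses if l not in c]
    # Step 4: termination tests
    if not clauses:
        return True                           # all clauses deleted
    if frozenset() in clauses:
        return False
    # Step 5: case split on some literal
    l = next(iter(clauses[0]))
    return dpll(assign(clauses, l)) or dpll(assign(clauses, -l))

# Example 13, with P, Q, R as 1, 2, 3: clauses {P}, {not Q}, {not R}
# are satisfiable, so P or Q -> Q or R is not a theorem.
cs = [frozenset({1}), frozenset({-2}), frozenset({-3})]
print(dpll(cs))   # True
```

Termination is guaranteed for the same reason as in the notes: each recursive call eliminates a propositional symbol from every clause.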

Exercise 18 Apply the DPLL procedure to the clause set (P Q Q P) ¬ ∧ → ∧ P, Q P, Q P, Q P, Q . { } {¬ }{ ¬ } {¬ ¬ } We can combine steps 1 (eliminate ) and 2 (push nega- tions in) using the law (A B) →A B: 6.3 Introduction to resolution ¬ → ' ∧ ¬ (P Q) (Q P) ∧ ∧ ¬ ∧ Resolution is essentially the following : (P Q) ( Q P) ∧ ∧ ¬ ∨ ¬ B A B C ∨ A ¬C ∨ Step 3, push disjunctions in, has nothing to do. The clauses ∨ are To convince yourself that this rule is sound, note that B P Q Q, P must be either false or true. { }{ } {¬ ¬ } We resolve P and Q, P as follows: { } {¬ ¬ } if B is false, then B A is equivalent to A, so we get • ∨ P P, Q A C { } {¬ ¬ } ∨ Q if B is true, then B C is equivalent to C, so we get {¬ } • A C ¬ ∨ The resolvent is Q . Resolving Q with this new clause ∨ gives {¬ } { } You might also understand this rule via transitivity of Q Q → { } {¬ } A BB C {} ¬ →A C→ ¬ → The resolvent is the empty clause, properly written as . We have proved P Q Q P by assuming its negation A special case of resolution is when A and C are empty: and deriving a contradiction.∧ → ∧ B B It is nicer to draw a tree like this: f¬ P Q, P This detects contradictions. { }{¬¬ } Resolution works with disjunctions. The aim is to prove Q Q a contradiction, refuting a formula. Here is the method for { }{¬} proving a formula A: 

1. Translate A into CNF as A1 Am. ¬ ∧ · · · ∧ Another example is (P Q) (Q P). The steps 2. Break this into a set of clauses: A1, ..., Am. of the conversion to clauses↔ is left as↔ an exercise;↔ remember 3. Repeatedly apply the resolution rule to the clauses, to negate the formula first! The final clauses are producing new clauses. These are all consequences of P, Q P, Q P, Q P, Q A. { } {¬ }{ ¬ } {¬ ¬ } ¬ 6http://minisat.se/ A tree for the resolution proof is 6 CLAUSE METHODS FOR PROPOSITIONAL LOGIC 16

    {P, Q}  {¬P, Q}      {P, ¬Q}  {¬P, ¬Q}
        \   /                \    /
         {Q}                 {¬Q}
            \                /
                    □

Note that the tree contains {Q} and {¬Q} rather than {Q, Q} and {¬Q, ¬Q}. If we forget to suppress repeated literals, we can get stuck. Resolving {Q, Q} and {¬Q, ¬Q} (keeping repetitions) gives {Q, ¬Q}, a tautology. Tautologies are useless. Resolving this one with the other clauses leads nowhere. Try it.

These examples could mislead. Must a proof use each clause exactly once? No! A clause may be used repeatedly, and many problems contain redundant clauses. Here is an example:

    {¬P, R}  {P}      {¬Q, R}
        \    /        (unused)
         {R}    {¬R}
            \   /
              □

Redundant clauses can make the theorem-prover flounder; this is a challenge facing the field.

Exercise 19 Prove (A → B ∨ C) → [(A → B) ∨ (A → C)] using resolution.

6.5 A proof using a set of assumptions

In this example we assume

    H → M ∨ N    M → K ∧ P    N → L ∧ P

and prove H → P. It turns out that we can generate clauses separately from the assumptions (taken positively) and the conclusion (negated).

If we call the assumptions A1, ..., Ak and the conclusion B, then the desired theorem is

    (A1 ∧ ··· ∧ Ak) → B

Try negating this and converting to CNF. Using the law ¬(A → B) ≃ A ∧ ¬B, the negation converts in one step to

    A1 ∧ ··· ∧ Ak ∧ ¬B

Since the entire formula is a conjunction, we can separately convert A1, ..., Ak, and ¬B to clause form and pool the clauses together.

Assumption H → M ∨ N is essentially in clause form already:

    {¬H, M, N}

Assumption M → K ∧ P becomes two clauses:

    {¬M, K}  {¬M, P}

Assumption N → L ∧ P also becomes two clauses:

    {¬N, L}  {¬N, P}

The negated conclusion, ¬(H → P), becomes two clauses:

    {H}  {¬P}

A tree for the resolution proof is

    {H}  {¬H, M, N}
       \   /
      {M, N}   {¬M, P}
           \   /
          {N, P}   {¬N, P}
               \   /
               {P}   {¬P}
                  \   /
                    □

The clauses were not tried at random. Here are some points of proof strategy.

Ignoring irrelevance. Clauses {¬M, K} and {¬N, L} lead nowhere, so they were not tried. Resolving with one of these would make a clause containing K or L. There is no way of getting rid of either literal, for no clause contains ¬K or ¬L. So this is not a way to obtain the empty clause. (K and L are pure literals.)

Working from the goal. In each resolution step, at least one clause involves the negated conclusion (possibly via earlier resolution steps). We do not attempt to find a contradiction in the assumptions alone, provided (as is often the case) we know them to be consistent: any contradiction must involve the negated conclusion. This strategy is called set of support. Although largely obsolete, it's very useful when working problems by hand.

Linear resolution. The proof has a linear structure: each resolvent becomes the parent clause for the next resolution step. Furthermore, the other parent clause is always one of the original set of clauses. This simple structure is very efficient because only the last resolvent needs to be saved. It is similar to the execution strategy of Prolog.

Exercise 20 Explain in more detail the conversion of this example into clauses.

Exercise 21 Prove Peirce's law, ((P → Q) → P) → P, using resolution.

Exercise 22 Use resolution to prove these formulas:

    (Q → R) ∧ (R → P ∧ Q) ∧ (P → Q ∨ R) → (P ↔ Q)
    (P ∧ Q → R) ∧ (P ∨ Q ∨ R) → ((P ↔ Q) → R)

Show the steps of converting the formula into clauses.

Exercise 23 Prove that (P ∧ Q) → (R ∧ S) follows from P → R and R ∧ P → S, using linear resolution.

Exercise 24 Convert these axioms to clauses, showing all steps. Then prove Winterstorm → Miserable by resolution:

    Rain ∧ (Windy ∨ ¬Umbrella) → Wet
    Winterstorm → Storm ∧ Cold
    Wet ∧ Cold → Miserable
    Storm → Rain ∧ Windy
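The ground resolution loop of this section is easy to sketch in code. The following Python fragment is my own illustration (not part of the notes): each clause is a frozenset of literal strings, with "~" marking negation, and the clause set is saturated under resolution until the empty clause appears or nothing new can be derived.

```python
# Ground (propositional) resolution by naive saturation — a sketch only.
# A clause is a frozenset of literals; "~P" is the negation of "P".

def negate(lit):
    """Return the complement of a literal."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Yield every resolvent of two clauses (repeated literals suppressed
    automatically by the set representation)."""
    for lit in c1:
        if negate(lit) in c2:
            yield (c1 - {lit}) | (c2 - {negate(lit)})

def refute(clauses):
    """Saturate under resolution; True if the empty clause is derived."""
    known = set(clauses)
    while True:
        new = set()
        for c1 in known:
            for c2 in known:
                for r in resolve(c1, c2):
                    if not r:
                        return True        # empty clause: contradiction found
                    if r not in known:
                        new.add(r)
        if not new:
            return False                   # saturated without a contradiction
        known |= new

# Negating P ∧ Q → Q ∧ P gives the clauses {P}, {Q}, {¬Q, ¬P}:
clauses = [frozenset({"P"}), frozenset({"Q"}), frozenset({"~Q", "~P"})]
print(refute(clauses))   # → True
```

A real prover would add the strategies discussed above (deleting tautologies, set of support, subsumption); this sketch blindly resolves every pair of clauses.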

7 Skolem Functions, Herbrand's Theorem and Unification

Propositional logic is the basis of many proof methods for first-order logic. Eliminating the quantifiers from a first-order formula reduces it "almost" to propositional logic. This section describes how to do so.

7.1 Removing quantifiers: Skolem form

Skolemisation replaces every existentially bound variable by a Skolem constant or function. This transformation does not preserve the meaning of a formula. However, it does preserve inconsistency, which is the critical property: resolution works by detecting contradictions.

How to Skolemize a formula

Take a formula in negation normal form. Starting from the outside, follow the nesting structure of the quantifiers. If the formula contains an existential quantifier, then the series of quantifiers must have the form

    ∀x1 ··· ∀x2 ··· ∀xk ··· ∃y A

where A is a formula, k ≥ 0, and ∃y is the leftmost existential quantifier. Choose a k-place function symbol f not present in A (that is, a new function symbol). Delete the ∃y and replace all other occurrences of y by f(x1, x2, ..., xk). The result is another formula:

    ∀x1 ··· ∀x2 ··· ∀xk ··· A[f(x1, x2, ..., xk)/y]

If some existential quantifier is not enclosed in any universal quantifiers, then the formula contains simply ∃y A as a subformula. Then this quantifier is deleted and occurrences of y are replaced by a new constant symbol c. The resulting subformula is A[c/y].

Then repeatedly eliminate all remaining existential quantifiers as above. The new symbols are called Skolem functions (or Skolem constants).

After Skolemization, the formula has only universal quantifiers. The next step is to throw the remaining quantifiers away. This step is correct because we are converting to clause form, and a clause implicitly includes universal quantifiers over all of its free variables.

We are almost back to the propositional case, except the formula typically contains terms. We shall have to handle constants, function symbols, and variables.

Prenex normal form — where all quantifiers are moved to the front of the formula — would make things easier to follow. However, increasing the scope of the quantifiers prior to Skolemization makes proofs much more difficult. Pushing quantifiers in as far as possible, instead of pulling them out, yields a better set of clauses.

Examples of Skolemization

For simplicity, we start with prenex normal form.

Example 15 Start with

    ∃x ∀y ∃z R(x, y, z)

Eliminate the ∃x using the Skolem constant a:

    ∀y ∃z R(a, y, z)

Eliminate the ∃z using the 1-place Skolem function f:

    ∀y R(a, y, f(y))

Finally, drop the ∀y and convert the remaining formula to a clause:

    {R(a, y, f(y))}

Example 16 Start with

    ∃u ∀v ∃w ∃x ∀y ∃z ((P(h(u, v)) ∨ Q(w)) ∧ R(x, h(y, z)))

Eliminate the ∃u using the Skolem constant c:

    ∀v ∃w ∃x ∀y ∃z ((P(h(c, v)) ∨ Q(w)) ∧ R(x, h(y, z)))

Eliminate the ∃w using the 1-place Skolem function f:

    ∀v ∃x ∀y ∃z ((P(h(c, v)) ∨ Q(f(v))) ∧ R(x, h(y, z)))

Eliminate the ∃x using the 1-place Skolem function g:

    ∀v ∀y ∃z ((P(h(c, v)) ∨ Q(f(v))) ∧ R(g(v), h(y, z)))

Eliminate the ∃z using the 2-place Skolem function j (the function h is already used!):

    ∀v ∀y ((P(h(c, v)) ∨ Q(f(v))) ∧ R(g(v), h(y, j(v, y))))

Dropping the universal quantifiers yields a set of clauses:

    {P(h(c, v)), Q(f(v))}  {R(g(v), h(y, j(v, y)))}

Each clause is implicitly enclosed by universal quantifiers over each of its variables. So the occurrences of the variable v in the two clauses are independent of each other.

Let's try this example again, first pushing quantifiers in to the smallest possible scopes:

    (∃u ∀v P(h(u, v)) ∨ ∃w Q(w)) ∧ ∃x ∀y ∃z R(x, h(y, z))

Now the Skolem functions take fewer arguments:

    {P(h(c, v)), Q(d)}  {R(e, h(y, f(y)))}

The difference between this set of clauses and the previous set may seem small, but in practice it can be huge.

Correctness of Skolemization

Skolemization does not preserve meaning. The version presented above does not even preserve validity! For example,

    ∃x (P(a) → P(x))

is valid. (Because the required x is just a.) Replacing the ∃x by the Skolem constant b gives

    P(a) → P(b)

This has a different meaning since it refers to a constant b not previously mentioned. And it is not valid! For example, it is false in the interpretation where P(x) means "x equals 0" and a denotes 0 and b denotes 1.

Skolemization preserves consistency.
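The Skolemization procedure above can be sketched as a small program. The following Python fragment is my own illustration only: the tuple representation of formulas and the Skolem names sk0, sk1, ... are assumptions of mine, and it relies on bound-variable names being distinct (no shadowing).

```python
# A sketch of Skolemisation on formulas in negation normal form.
# A formula is a nested tuple: ("forall", "y", body), ("exists", "z", body),
# ("and"/"or", left, right), or an atom such as ("R", "x", "y").
# A term is a variable name or a tuple like ("f", "y"); a constant is a
# 0-argument tuple such as ("sk0",).

from itertools import count

def subst(fm, v, t):
    """Replace variable v by term t everywhere (assumes no shadowing)."""
    if fm == v:
        return t
    if isinstance(fm, tuple):
        return tuple(subst(a, v, t) for a in fm)
    return fm

def skolemize(fm, univ=(), fresh=count()):
    """Replace each existential variable by a new Skolem function applied
    to the universally quantified variables enclosing it."""
    if fm[0] == "forall":
        _, v, body = fm
        return ("forall", v, skolemize(body, univ + (v,), fresh))
    if fm[0] == "exists":
        _, v, body = fm
        sk = ("sk%d" % next(fresh),) + univ   # k-place Skolem function
        return skolemize(subst(body, v, sk), univ, fresh)
    if fm[0] in ("and", "or"):
        return (fm[0], skolemize(fm[1], univ, fresh),
                       skolemize(fm[2], univ, fresh))
    return fm                                  # atom: nothing to do

# Example 15: ∃x ∀y ∃z R(x, y, z)
fm = ("exists", "x", ("forall", "y", ("exists", "z", ("R", "x", "y", "z"))))
print(skolemize(fm))
# → ('forall', 'y', ('R', ('sk0',), 'y', ('sk1', 'y')))
```

As in Example 15, x becomes a Skolem constant (a 0-place function) while z becomes a 1-place function of the enclosing universal variable y.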

The formula ∀x ∃y A is consistent

• iff it holds in some interpretation I = (D, I)
• iff for all x ∈ D there is some y ∈ D such that A holds
• iff there is some function on D, say f̂ ∈ D → D, such that for all x ∈ D, if y = f̂(x) then A holds
• iff the formula ∀x A[f(x)/y] is consistent.

If a formula is consistent then so is the Skolemized version. If it is inconsistent then so is the Skolemized version. And resolution works by proving that a formula is inconsistent.

7.2 Herbrand interpretations

An Herbrand interpretation basically consists of all terms that can be written using just the constant and function symbols in a set of clauses S (or quantifier-free formula). The data processed by a Prolog program S is simply its Herbrand universe.

To define the Herbrand universe for the set of clauses S we start with sets of the constant and function symbols in S, including Skolem functions.

Definition 11 Let C be the set of all constants in S. If there are none, let C = {a} for some constant symbol a of the first-order language. For n > 0 let Fn be the set of all n-place function symbols in S and let Pn be the set of all n-place predicate symbols in S.

The Herbrand universe is the set H = ∪ i≥0 Hi, where

    H0 = C
    Hi+1 = Hi ∪ {f(t1, ..., tn) | t1, ..., tn ∈ Hi and f ∈ Fn}

Thus, H consists of all the terms that can be written using only the constants and function symbols present in S. There are no variables: the elements of H are ground terms. Formally, H turns out to satisfy the recursive equation

    H = {f(t1, ..., tn) | t1, ..., tn ∈ H and f ∈ Fn}

The definition above ensures that C is non-empty. It follows that H is also non-empty, which is necessary. The elements of H are ground terms.

An interpretation (H, IH) is a Herbrand interpretation provided IH[t] = t for all ground terms t. The point of this peculiar exercise is that we can give meanings to the symbols of S in a purely syntactic way.

Example 17 Given the two clauses

    {P(a)}  {Q(g(y, z)), ¬P(f(x))}

Then C = {a}, F1 = {f}, F2 = {g} and

    H = {a, f(a), g(a, a), f(f(a)), g(f(a), a), g(a, f(a)), g(f(a), f(a)), ...}

Every Herbrand interpretation IH defines each n-place predicate symbol P to denote some truth-valued function IH[P] ∈ H^n → {1, 0}. We take

    IH[P(t1, ..., tn)] = 1

if and only if P(t1, ..., tn) holds in our desired "real" interpretation I of the clauses. In other words, any specific interpretation I = (D, I) over some universe D can be mimicked by an Herbrand interpretation. One can show the following two results:

Lemma 12 Let S be a set of clauses. If an interpretation satisfies S, then an Herbrand interpretation satisfies S.

Theorem 13 A set S of clauses is unsatisfiable if and only if no Herbrand interpretation satisfies S.

Equality behaves strangely in Herbrand interpretations. Given an interpretation I, the denotation of = is the set of all pairs of ground terms (t1, t2) such that t1 = t2 according to I. In a context of the natural numbers, the denotation of = could include pairs like (1 + 1, 2) — the two components need not be the same.

7.3 The Skolem-Gödel-Herbrand Theorem

This theorem tells us that unsatisfiability can always be detected by a finite process. It provided the original motivation for research into automated theorem proving.

Definition 14 An instance of a clause C is a clause that results by replacing variables in C by terms. A ground instance of C is an instance of C that contains no variables. Since the variables in a clause are taken to be universally quantified, every instance of C is a logical consequence of C.

Theorem 15 (Herbrand) A set S of clauses is unsatisfiable if and only if there is a finite unsatisfiable set S0 of ground instances of clauses of S.

The theorem is valuable because the new set S0 expresses the inconsistency in a finite way. However, it only tells us that S0 exists; it does not tell us how to derive S0. So how do we generate useful ground instances of clauses? One answer, outlined below, is unification.

Example 18 To demonstrate the Skolem-Gödel-Herbrand theorem, consider proving the formula

    ∀x P(x) ∧ ∀y [P(y) → Q(y)] → Q(a) ∧ Q(b).

If we negate this formula, we trivially obtain the following set S of clauses:

    {P(x)}  {¬P(y), Q(y)}  {¬Q(a), ¬Q(b)}

This set is inconsistent. Here is a finite set of ground instances of clauses in S:

    {P(a)}  {P(b)}  {¬P(a), Q(a)}  {¬P(b), Q(b)}  {¬Q(a), ¬Q(b)}

This set reflects the intuitive proof of the theorem. We obviously have P(a) and P(b); using ∀y [P(y) → Q(y)] with a and b, we obtain Q(a) and Q(b). If we can automate this procedure, then we can generate such proofs automatically.
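The levels H0, H1, ... of Definition 11 can be enumerated mechanically. This small sketch (my own illustration, with terms printed as strings) uses the signature of Example 17: one constant a, a 1-place function f and a 2-place function g.

```python
# Enumerate the levels of the Herbrand universe for the signature of
# Example 17: constant a, 1-place f, 2-place g.  (Illustration only.)

from itertools import product

CONSTANTS = ["a"]
FUNCTIONS = [("f", 1), ("g", 2)]

def next_level(h):
    """H_{i+1} = H_i ∪ {f(t1,...,tn) | t1,...,tn ∈ H_i}."""
    new = set(h)
    for f, n in FUNCTIONS:
        for args in product(sorted(h), repeat=n):
            new.add("%s(%s)" % (f, ",".join(args)))
    return new

h = set(CONSTANTS)            # H0 = C
for i in range(2):
    h = next_level(h)
print(sorted(h))              # the 13 ground terms of H2
```

The levels grow very quickly — H1 already has 3 terms and H2 has 13 — which is one reason naive ground instantiation is hopeless and unification is needed instead.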

Exercise 25 Consider a first-order language with 0 and 1 as constant symbols, with − as a 1-place function symbol and + as a 2-place function symbol, and with < as a 2-place predicate symbol.

(a) Describe the Herbrand Universe for this language.
(b) The language can be interpreted by taking the integers for the universe and giving 0, 1, −, +, and < their usual meanings over the integers. What do those symbols denote in the corresponding Herbrand model?

7.4 Unification

Unification is the operation of finding a common instance of two or more terms. Consider a few examples. The terms f(x, b) and f(a, y) have the common instance f(a, b), replacing x by a and y by b. The terms f(x, x) and f(a, b) have no common instance, assuming that a and b are distinct constants. The terms f(x, x) and f(y, g(y)) have no common instance, since there is no way that x can have the form y and g(y) at the same time — unless we admit the infinite term g(g(g(···))).

Only variables may be replaced by other terms. Constants are not affected (they remain constant!). Instances of the term f(t, u) must have the form f(t0, u0), where t0 is an instance of t and u0 is an instance of u.

Definition 16 A substitution is a finite set of replacements

    [t1/x1, ..., tk/xk]

where x1, ..., xk are distinct variables such that ti ≠ xi for all i = 1, ..., k. We use Greek letters φ, θ, σ to stand for substitutions.

A substitution θ = [t1/x1, ..., tk/xk] defines a function from the variables x1, ..., xk to terms. Postfix notation is usual for applying a substitution; thus, for example, xiθ = ti. Substitution on terms, literals and clauses is defined recursively in the obvious way.

Example 19 The substitution θ = [h(y)/x, b/y] says to replace x by h(y) and y by b. The replacements occur simultaneously; it does not have the effect of replacing x by h(b). Applying this substitution gives

    f(x, g(u), y)θ = f(h(y), g(u), b)
    R(h(x), z)θ = R(h(h(y)), z)
    {P(x), ¬Q(y)}θ = {P(h(y)), ¬Q(b)}

Definition 17 If φ and θ are substitutions then so is their composition φ ◦ θ, which satisfies

    t(φ ◦ θ) = (tφ)θ  for all terms t

Example 20 Let

    φ = [j(x)/u, 0/y]  and  θ = [h(z)/x, g(3)/y].

Then φ ◦ θ = [j(h(z))/u, h(z)/x, 0/y]. Notice that y(φ ◦ θ) = (yφ)θ = 0θ = 0; the replacement g(3)/y has disappeared.

7.5 Unifiers

Definition 18 A substitution θ is a unifier of terms t1 and t2 if t1θ = t2θ. More generally, θ is a unifier of terms t1, t2, ..., tm if t1θ = t2θ = ··· = tmθ. The term t1θ is the common instance.

Two terms can only be unified if they have similar structure apart from variables. The terms f(x) and h(y, z) are clearly non-unifiable since no substitution can do anything about the differing function symbols. It is easy to see that θ unifies f(t1, ..., tn) and f(u1, ..., un) if and only if θ unifies ti and ui for all i = 1, ..., n.

Example 21 The substitution [3/x, g(3)/y] unifies the terms g(g(x)) and g(y). The common instance is g(g(3)). These terms have many other unifiers, such as these:

    unifying substitution        common instance
    [f(u)/x, g(f(u))/y]          g(g(f(u)))
    [z/x, g(z)/y]                g(g(z))
    [g(x)/y]                     g(g(x))

Note that g(g(3)) and g(g(f(u))) are both instances of g(g(x)). Thus g(g(x)) is more general than g(g(3)) and g(g(f(u))). Certainly g(g(3)) seems to be arbitrary — neither of the original terms mentions 3! Also important: g(g(x)) is as general as g(g(z)), despite the different variable names. Let us formalize these intuitions.

Definition 19 The substitution θ is more general than φ if φ = θ ◦ σ for some substitution σ.

Example 22 Recall the unifiers of g(g(x)) and g(y). The unifier [g(x)/y] is more general than the others listed, for

    [3/x, g(3)/y] = [g(x)/y] ◦ [3/x]
    [f(u)/x, g(f(u))/y] = [g(x)/y] ◦ [f(u)/x]
    [z/x, g(z)/y] = [g(x)/y] ◦ [z/x]
    [g(x)/y] = [g(x)/y] ◦ []

The last line above illustrates that every substitution θ is more general than itself because θ = θ ◦ [].

Definition 20 A substitution θ is a most general unifier (MGU) of terms t1, ..., tm if θ unifies t1, ..., tm, and θ is more general than every other unifier of t1, ..., tm.

Thus if θ is an MGU of terms t1 and t2 and t1φ = t2φ then φ = θ ◦ σ for some substitution σ. The natural unification algorithm returns a most general unifier of two terms.

7.6 A simple unification algorithm

Unification is often presented as operating on the concrete syntax of terms, scanning along character strings. But terms are really tree structures and are so represented in a computer. Unification should be presented as operating on trees. In fact, we need consider only binary trees, since these can represent n-ary branching trees.

Our trees have three kinds of nodes:

• A variable x, y, ... — can be modified by substitution

• A constant a, b, ... — handles function symbols also
• A pair (t, u) — where t and u are terms

Unification of two terms considers nine cases, most of which are trivial. It is impossible to unify a constant with a pair; in this case the algorithm fails. When trying to unify two constants a and b, if a = b then the most general unifier is []; if a ≠ b then unification is impossible. The interesting cases are variable-anything and pair-pair.

Unification with a variable

When unifying a variable x with a term t, where x ≠ t, we must perform the occurs check. If x does not occur in t then the substitution [t/x] has no effect on t, so it does the job trivially:

    x[t/x] = t = t[t/x]

If x does occur in t then no unifier exists, for if xθ = tθ then the term xθ would be a proper subterm of itself.

Example 23 The terms x and f(x) are not unifiable. If xθ = u then f(x)θ = f(u). Thus xθ = f(x)θ would imply u = f(u). We could introduce the infinite term

    u = f(f(f(f(f(···)))))

as a unifier, but this would require a rigorous definition of the syntax and semantics of infinite terms.

Unification of two pairs

Unifying the pair (t1, t2) with (u1, u2) requires two recursive calls of the unification algorithm. If θ1 unifies t1 with u1 and θ2 unifies t2θ1 with u2θ1 then θ1 ◦ θ2 unifies (t1, t2) with (u1, u2).

It is possible to prove that this process terminates, and that if θ1 and θ2 are most general unifiers then so is θ1 ◦ θ2. If either recursive call fails then the pairs are not unifiable. As given above, the algorithm works from left to right, but choosing the reverse direction makes no real difference.

Examples of unification

In most of these examples, the two terms have no variables in common. Most uses of unification (including resolution and Prolog) rename variables in one of the terms to ensure this. Such renaming is not part of unification itself.

Example 24 Unify f(x, b) with f(a, y). Steps:

    Unify x and a getting [a/x].
    Try to unify b[a/x] and y[a/x]. These are b and y, so unification succeeds with [b/y].
    Result is [a/x] ◦ [b/y], which is [a/x, b/y].

Example 25 Unify f(x, x) with f(a, b). Steps:

    Unify x and a getting [a/x].
    Try to unify x[a/x] and b[a/x]. These are distinct constants, a and b. Fail.

Example 26 Unify f(x, g(y)) with f(y, x). Steps:

    Unify x and y getting [y/x].
    Try to unify g(y)[y/x] and x[y/x]. These are g(y) and y, violating the occurs check. Fail.

But we can unify f(x, g(y)) with f(y0, x0). The failure was caused by having the same variables in both terms.

Example 27 Unify f(x, x) with f(y, g(y)). Steps:

    Unify x and y getting [y/x].
    Try to unify x[y/x] and g(y)[y/x]. But these are y and g(y), where y occurs in g(y). Fail.

How can f(x, x) and f(y, g(y)) have a common instance when the arguments of f must be identical in the first case and different in the second?

Example 28 Unify j(w, a, h(w)) with j(f(x, y), x, z). Steps:

    Unify w and f(x, y) getting [f(x, y)/w].
    Unify a and x (the substitution has no effect) getting [a/x].
    Then unify h(w)[f(x, y)/w][a/x] and z[f(x, y)/w][a/x], namely h(f(a, y)) and z; unifier is [h(f(a, y))/z].
    Result is [f(x, y)/w] ◦ [a/x] ◦ [h(f(a, y))/z]. Performing the compositions, this simplifies to [f(a, y)/w, a/x, h(f(a, y))/z].

Example 29 Unify j(w, a, h(w)) with j(f(x, y), x, y). This is the previous example but with a y in place of a z. Steps:

    Unify w and f(x, y) getting [f(x, y)/w].
    Unify a and x getting [a/x].
    Then unify h(w)[f(x, y)/w][a/x] and y[f(x, y)/w][a/x]. These are h(f(a, y)) and y, but y occurs in h(f(a, y)). Fail.

In the diagrams, the lines indicate variable replacements:

    j(w, a, h(w))        j(f(x, y), x, y)
        a/x
        f(a, y)/w
        h(f(a, y))/y ???

Implementation remarks

To unify terms t1, t2, ..., tm for m > 2, compute a unifier θ of t1 and t2, then recursively compute a unifier σ of the terms t2θ, ..., tmθ. The overall unifier is then θ ◦ σ. If any unification fails then the set is not unifiable.

A real implementation does not need to compose substitutions. Most represent variables by pointers and effect the substitution [t/x] by updating pointer x to t. The compositions are cumulative, so this works. However, if unification fails at some point, the pointer assignments must be undone! The algorithm sketched here can take exponential time in unusual cases. Faster algorithms exist, but they are more complex and seldom adopted.

Prolog systems, for the sake of efficiency, omit the occurs check. This can result in circular data structures and looping. It is unsound for theorem proving.

Exercise 26 For each of the following pairs of terms, give a most general unifier or explain why none exists. Do not rename variables prior to performing the unification.

    f(g(x), z)       f(y, h(y))
    j(x, y, z)       j(f(y, y), f(z, z), f(a, a))
    j(x, z, x)       j(y, f(y), z)
    j(f(x), y, a)    j(y, z, z)
    j(g(x), a, y)    j(z, x, f(z, z))
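The unification algorithm of this section can be sketched in Python. The representation below is my own assumption: a variable is a plain string, and an application f(t1, ..., tn) is the tuple ("f", t1, ..., tn), so a constant is a 0-argument application. In the spirit of the implementation remarks above, substitutions are dicts whose bindings are resolved lazily rather than composed explicitly.

```python
# A sketch of unification with the occurs check (illustration only).
# Variables are plain strings; f(t1,...,tn) is the tuple ("f", t1, ..., tn),
# so the constant a is the 0-argument application ("a",).

def apply_subst(t, s):
    """Apply substitution s (a dict: variable -> term) to term t."""
    if isinstance(t, str):
        return apply_subst(s[t], s) if t in s else t
    return (t[0],) + tuple(apply_subst(a, s) for a in t[1:])

def occurs(x, t, s):
    """The occurs check: does variable x appear in t under s?"""
    t = apply_subst(t, s)
    if isinstance(t, str):
        return t == x
    return any(occurs(x, a, s) for a in t[1:])

def unify(t1, t2, s=None):
    """Return a most general unifier extending s, or None on failure."""
    s = {} if s is None else s
    t1, t2 = apply_subst(t1, s), apply_subst(t2, s)
    if isinstance(t1, str) or isinstance(t2, str):
        x, t = (t1, t2) if isinstance(t1, str) else (t2, t1)
        if x == t:
            return s
        if occurs(x, t, s):
            return None                    # occurs check fails
        return {**s, x: t}
    if t1[0] != t2[0] or len(t1) != len(t2):
        return None                        # clashing function symbols
    for a, b in zip(t1[1:], t2[1:]):       # left-to-right recursive calls
        s = unify(a, b, s)
        if s is None:
            return None
    return s

# Example 24: unify f(x, b) with f(a, y) — result [a/x, b/y].
print(unify(("f", "x", ("b",)), ("f", ("a",), "y")))
# → {'x': ('a',), 'y': ('b',)}
```

Examples 26 and 27 fail under this sketch exactly because of the occurs check, and Example 28 succeeds with bindings equivalent to [f(a, y)/w, a/x, h(f(a, y))/z] once resolved by `apply_subst`.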

7.7 Examples of theorem proving

These two examples are fundamental. They illustrate how the occurs check enforces correct quantifier reasoning.

Example 30 Consider a proof of

    (∃y ∀x R(x, y)) → (∀x ∃y R(x, y)).

For simplicity, produce clauses separately for the antecedent and for the negation of the consequent.

• The antecedent is ∃y ∀x R(x, y); replacing y by the Skolem constant a yields the clause {R(x, a)}.
• In ¬(∀x ∃y R(x, y)), pushing in the negation produces ∃x ∀y ¬R(x, y). Replacing x by the Skolem constant b yields the clause {¬R(b, y)}.

Unifying R(x, a) with R(b, y) detects the contradiction R(b, a) ∧ ¬R(b, a).

Example 31 In a similar vein, let us try to prove

    (∀x ∃y R(x, y)) → (∃y ∀x R(x, y)).

• Here the antecedent is ∀x ∃y R(x, y); replacing y by the Skolem function f yields the clause {R(x, f(x))}.
• The negation of the consequent is ¬(∃y ∀x R(x, y)), which becomes ∀y ∃x ¬R(x, y). Replacing x by the Skolem function g yields the clause {¬R(g(y), y)}.

Observe that R(x, f(x)) and R(g(y), y) are not unifiable because of the occurs check. And so it should be, because the original formula is not a theorem! The best way to demonstrate that a formula is not a theorem is to exhibit a counterexample. Here are two:

• The domain is the set of all people who have ever lived. The relation R(x, y) holds if x loves y. The function f(x) denotes the mother of x, and {R(x, f(x))} holds because everybody loves their mother. The function g(x) denotes the landlord of x, and {¬R(g(y), y)} holds.
• The domain is the set of integers. The relation R(x, y) holds whenever x = y. The function f is defined by f(x) = x and {R(x, f(x))} holds. The function g is defined by g(x) = x + 1 and so {¬R(g(y), y)} holds.

8 First-Order Resolution and Prolog

By means of unification, we can extend resolution to first-order logic. As a special case we obtain Prolog. Other theorem provers are also based on unification. Other applications include polymorphic type checking.

A number of resolution theorem provers can be downloaded for free from the Internet. Some of the main ones include Vampire (http://www.vprover.org), SPASS (http://www.spass-prover.org) and E (http://www.eprover.org). It might be instructive to download one of them and experiment with it.

8.1 Binary resolution

We now define the binary resolution rule with unification:

    {B, A1, ..., Am}    {¬D, C1, ..., Cn}
    -------------------------------------  if Bσ = Dσ
        {A1, ..., Am, C1, ..., Cn}σ

As before, the first clause contains B and other literals, while the second clause contains ¬D and other literals. The substitution σ is a unifier of B and D (almost always a most general unifier). This substitution is applied to all remaining literals, producing the conclusion.

The variables in one clause are renamed before resolution to prevent clashes with the variables in the other clause. Renaming is sound because the scope of each variable is its clause. Resolution is sound because it takes an instance of each clause — the instances are valid, because the clauses are universally valid — and then applies the propositional resolution rule, which is sound. For example, the two clauses

    {P(x)}  and  {¬P(g(x))}

yield the empty clause in a single resolution step. This works by renaming variables — say, x to y in the second clause — and unifying P(x) with P(g(y)). Forgetting to rename variables is fatal, because P(x) cannot be unified with P(g(x)).

8.2 Factoring

In general, resolution must be used alongside another rule, factoring. Factoring takes a clause and unifies some literals within it (which must all have the same sign), yielding a new clause. For example, starting with the clause {P(x, b), P(a, y)}, factoring can derive {P(a, b)}, since P(a, b) is the result of unifying P(x, b) with P(a, y). This new clause is a unit clause and therefore especially useful.

Factoring is necessary for completeness. Resolution by itself tends to make clauses longer and longer. Only short clauses are likely to lead to a contradiction. If every clause has at least two literals, then the only way to reach the empty clause is with the help of factoring.

The search space is huge. Resolution and factoring can be applied in many different ways, every time. Modern resolution systems use highly complex heuristics to limit the search.

Example 32 Prove ∀x ∃y ¬(P(y, x) ↔ ¬P(y, y)).

Negate and expand the ↔, getting

    ¬∀x ∃y [(¬P(y, x) ∨ ¬P(y, y)) ∧ (¬¬P(y, y) ∨ P(y, x))]

Its negation normal form is

    ∃x ∀y [(¬P(y, x) ∨ ¬P(y, y)) ∧ (P(y, y) ∨ P(y, x))]

Skolemization yields

    (¬P(y, a) ∨ ¬P(y, y)) ∧ (P(y, y) ∨ P(y, a))

The clauses are

    {¬P(y, a), ¬P(y, y)}  {P(y, y), P(y, a)}

We can apply the factoring rule to both of these clauses, obtaining two new clauses:

    {¬P(a, a)}  {P(a, a)}

These are complementary unit clauses, so resolution yields the empty clause. This proof is trivial! However, the use of factoring is essential, since the 2-literal clauses must be reduced to unit clauses.

Observe what happens if we try to prove it without factoring. We can resolve the two original clauses on the literal P(y, a). We obtain {¬P(y, y), P(y, y)}, which is a tautology and therefore worthless. Tautologies are useless. Resolving this one with the other clauses leads nowhere. Try it.

Example 33 Let us prove ∃x [P → Q(x)] ∧ ∃x [Q(x) → P] → ∃x [P ↔ Q(x)]. The clauses are

    {¬P, Q(b)}  {P, Q(x)}  {¬P, ¬Q(x)}  {P, ¬Q(a)}

A short resolution proof follows:

    Resolve {¬P, Q(b)} with {¬P, ¬Q(x)} getting {¬P}
    Resolve {P, Q(x)} with {P, ¬Q(a)} getting {P}
    Resolve {P} with {¬P} getting □

This proof relies on the set identity {P, P} = {P}. We can view it as a degenerate case of factoring where the literals are identical, with no need for unification.

Exercise 27 Show the steps of converting ∃x [P → Q(x)] ∧ ∃x [Q(x) → P] → ∃x [P ↔ Q(x)] into clauses. Then show two resolution proofs different from the one shown above.

Exercise 28 Is the clause {P(x, b), P(a, y)} logically equivalent to the unit clause {P(a, b)}? Is the clause {P(y, y), P(y, a)} logically equivalent to {P(y, a)}? Explain both answers.

8.3 Redundant clauses and subsumption

During resolution, the number of clauses builds up dramatically; it is important to delete all redundant clauses. Each new clause is a consequence of the existing clauses. A contradiction can only be derived if the original set of clauses is inconsistent. A clause can be deleted if it does not affect the consistency of the set. A tautology should always be deleted: it is true in all interpretations.

Here is a subtler case. Consider the clauses

    {S, R}  {P, ¬S}  {P, Q, R}

Resolving the first two yields {P, R}. Since each clause is a disjunction, any interpretation that satisfies {P, R} also satisfies {P, Q, R}. Thus {P, Q, R} cannot cause inconsistency, and should be deleted. Put another way, P ∨ R implies P ∨ Q ∨ R. Anything that could be derived from P ∨ Q ∨ R could also be derived from P ∨ R. This sort of deletion is called subsumption; clause {P, R} subsumes {P, Q, R}.

Subsumption typically involves unification. In the general case, a clause C subsumes a clause D if Cθ ⊆ D for some substitution θ (if some instance of C implies D). For example, {P(x)} subsumes {P(a), Q(y)}: since x is a variable, P(x) implies every formula of the form P(t).

8.4 Prolog clauses

Prolog clauses, also called Horn clauses, have at most one positive literal. A definite clause is one of the form

    {¬A1, ..., ¬Am, B}

It is logically equivalent to (A1 ∧ ··· ∧ Am) → B. Prolog's notation is

    B ← A1, ..., Am.

If m = 0 then the clause is simply written as B and is sometimes called a fact.

A negative or goal clause is one of the form

    {¬A1, ..., ¬Am}

Prolog permits just one of these; it represents the list of unsolved goals. Prolog's notation is

    ← A1, ..., Am.

A Prolog database consists of definite clauses. Observe that definite clauses cannot express negative assertions, since they must contain a positive literal. From a mathematical point of view, they have little expressive power; every set of definite clauses is consistent! Even so, definite clauses are a natural notation for many problems.

Exercise 29 Show that every set of definite clauses is consistent. (Hint: first consider propositional logic, then extend your argument to first order logic.)
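The claim that every set of definite clauses is consistent is easy to check mechanically in the propositional case; the following sketch (my own illustration, following the hint of Exercise 29) shows why: the valuation making every atom true satisfies any clause that contains a positive literal, and every definite clause contains one.

```python
# Why every set of (propositional) definite clauses is consistent:
# under the all-true valuation, a clause is satisfied iff it has at
# least one positive literal — which definite clauses always do.
# Clauses are sets of literal strings, with "~" marking negation.

def satisfied_all_true(clause):
    """True iff the clause holds when every atom is assigned true."""
    return any(not lit.startswith("~") for lit in clause)

# Three definite clauses: C <- A, B;  B;  D <- C.
definite = [{"~A", "~B", "C"}, {"B"}, {"~C", "D"}]
print(all(satisfied_all_true(c) for c in definite))   # → True
```

A goal clause {¬A1, ..., ¬Am}, by contrast, has no positive literal and is falsified by this valuation — which is exactly why adding a goal clause can make the set inconsistent.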

8.5 Prolog computations definite clause goal clause A Prolog computation takes a database of definite clauses together with one goal clause. It repeatedly resolves the goal clause with some definite clause to produce a new goal  os(y , x ), k(x ), k(y ) k(henryVIII) clause. If resolution produces the empty goal clause, then {¬ 1 1 ¬ 1 1 }{¬} execution succeeds. Here is a diagram of a Prolog computation step:

os(henryVIII, henryVII) os(henryVIII, x ), k(x ) definite clause goal clause { }{¬ 1 ¬ 1 } A ,..., A , B B ,..., B {¬ 1 ¬ n }{¬1 ¬ m}

where σ = unify(B, B₁). The new goal clause is {¬A₁σ, ..., ¬Aₙσ, ¬B₂σ, ..., ¬Bₘσ}.

This is a linear resolution (§6). Two program clauses are never resolved with each other. The result of each resolution step becomes the next goal clause; the previous goal clause is discarded after use.

Prolog resolution is efficient, compared with general resolution, because it involves less search and storage. General resolution must consider all possible pairs of clauses; it adds their resolvents to the existing set of clauses; it spends a great deal of effort getting rid of subsumed (redundant) clauses and probably useless clauses. Prolog always resolves some program clause with the goal clause. Because goal clauses do not accumulate, Prolog requires little storage. Prolog never uses factoring and does not even remove repeated literals from a clause.

Prolog has a fixed, deterministic execution strategy. The program is regarded as a list of clauses, not a set; the clauses are tried strictly in order. Within a clause, the literals are also regarded as a list. The literals in the goal clause are proved strictly from left to right. The goal clause's first literal is replaced by the literals from the unifying program clause, preserving their order.

Prolog's search strategy is depth-first. To illustrate what this means, suppose that the goal clause is simply ←P and that the program clauses are P←P and P←. Prolog will resolve P←P with ←P to obtain a new goal clause, which happens to be identical to the original one. Prolog never notices the repeated goal clause, so it repeats the same useless resolution over and over again. Depth-first search means that at every choice point, such as between using P←P and P←, Prolog will explore every avenue arising from its first choice before considering the second choice. Obviously, the second choice would prove the goal trivially, but Prolog never notices this.

8.6 Example of Prolog execution

Here are axioms about the English succession: how y can become King after x.

∀x ∀y (oldestson(y, x) ∧ king(x) → king(y))
∀x ∀y (defeat(y, x) ∧ king(x) → king(y))
king(richardIII)
defeat(henryVII, richardIII)
oldestson(henryVIII, henryVII)

The goal is to prove king(henryVIII). Now here is the same problem in the form of definite clauses:

{¬oldestson(y, x), ¬king(x), king(y)}
{¬defeat(y, x), ¬king(x), king(y)}
{king(richardIII)}
{defeat(henryVII, richardIII)}
{oldestson(henryVIII, henryVII)}

The goal clause is {¬king(henryVIII)}.

Figure 3 shows the execution. The subscripts in the clauses are to rename the variables. [Figure 3: Execution of a Prolog program (os = oldestson, k = king). The clauses appearing in the figure include {¬defeat(y₂, x₂), ¬k(x₂), k(y₂)}, {¬k(henryVII)}, {defeat(henryVII, richardIII)}, {¬defeat(henryVII, x₂), ¬k(x₂)}, {k(richardIII)} and {¬k(richardIII)}.]

Note how crude this formalization is. It says nothing about the passage of time, about births and deaths, about not having two kings at once. The oldest son of Henry VII, Arthur, died aged 15, leaving the future Henry VIII as the oldest surviving son. All formal models must omit some real-world details: reality is overwhelmingly complex.

The Frame Problem in Artificial Intelligence reveals another limitation of logic. Consider writing an axiom system to describe a robot's possible actions. We might include an axiom to state that if the robot lifts an object at time t, then it will be holding the object at time t + 1. But we also need to assert that the positions of everything else remain the same as before. Then we must consider the possibility that the object is a table and has other things on top of it. Separation Logic, a variant of Hoare logic, was invented to solve the frame problem, especially for reasoning about linked data structures.

Prolog is a powerful and useful programming language, but it is seldom logic.

And problem classes that were decidable turned out to be very difficult: propositional satisfiability is NP-complete (and probably exponential), while many other decision procedures are of hyper-exponential
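The execution strategy just described can be sketched in a few dozen lines. The following Python sketch is my own illustration (the term representation and all function names are not part of the course material): clauses are tried in program order, goal literals are proved left to right, and the search is depth-first, run here on the royal-succession program.

```python
# A minimal SLD-resolution interpreter in the style described above.
# Terms: a bare string is a variable; a tuple (functor, arg, ...) is compound;
# a one-element tuple is a constant.  No occurs check, as in real Prolog.

def walk(t, s):
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def unify(t1, t2, s):
    """Return an extension of substitution s unifying t1 with t2, or None."""
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if isinstance(t1, str):                 # variable
        return {**s, t1: t2}
    if isinstance(t2, str):
        return {**s, t2: t1}
    if t1[0] != t2[0] or len(t1) != len(t2):
        return None
    for x, y in zip(t1[1:], t2[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s

def rename(t, n):
    """Rename variables apart, like the subscripts in Figure 3."""
    if isinstance(t, str):
        return t + "_" + str(n)
    return (t[0],) + tuple(rename(a, n) for a in t[1:])

def solve(goals, program, s, depth=0):
    """Yield substitutions proving all goals, depth-first, left to right."""
    if not goals:
        yield s
        return
    first, rest = goals[0], goals[1:]
    for head, body in program:              # clauses tried strictly in order
        head, body = rename(head, depth), [rename(b, depth) for b in body]
        s1 = unify(first, head, s)
        if s1 is not None:
            # the first literal is replaced by the clause body, order kept
            yield from solve(body + rest, program, s1, depth + 1)

program = [
    (('king', 'y'), [('oldestson', 'y', 'x'), ('king', 'x')]),
    (('king', 'y'), [('defeat', 'y', 'x'), ('king', 'x')]),
    (('king', ('richardIII',)), []),
    (('defeat', ('henryVII',), ('richardIII',)), []),
    (('oldestson', ('henryVIII',), ('henryVII',)), []),
]
goal = ('king', ('henryVIII',))
print(next(solve([goal], program, {}), None) is not None)   # True: provable
```

The derivation found is exactly that of Figure 3: oldestson reduces the goal to king(henryVII), then defeat reduces it to king(richardIII), which is a fact.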
complexity, limiting them to the smallest problems.

But as we have seen before, worst-case results can be overly pessimistic. DPLL solves propositional satisfiability for very large formulas. Used in conjunction with other techniques, it yields highly effective program analysis tools that can, in particular, identify program loops that could fail to terminate. Below we shall briefly consider some simple methods for solving systems of arithmetic constraints. These decision procedures can be combined with the DPLL method. A modern Satisfiability Modulo Theories (SMT) solver brings together a large number of separate, small reasoning subsystems to solve very large and difficult problems. As before, we work with negated problems, which we attempt to show unsatisfiable or else to construct a model.

A number of decidable subcases of first-order logic were identified in the first half of the 20th century. One of the more interesting ones was Presburger arithmetic: the first-order theory of the natural numbers with addition (and subtraction, and multiplication by constants). There is an algorithm to determine whether a given sentence in Presburger arithmetic holds or not. Real numbers behave differently from the natural numbers (where m < n ⟺ m + 1 ≤ n) and require their own algorithms. Once again, the only allowed operations are addition, subtraction and constant multiplication. Such a language is called linear arithmetic. Unrestricted arithmetic on the reals is actually decidable, but the algorithms are too complicated and expensive for widespread use. Even Euclidean geometry can be reduced to problems on the reals, and is therefore also decidable. Practical decision procedures exist for simple data structures such as arrays and lists.

Most Prolog programs rely on special predicates that affect execution but have no logical meaning. There is a huge gap between the theory and practice of logic programming.

Exercise 30 Convert these formulæ into clauses, showing each step: negating the formula, eliminating → and ↔, pushing in negations, moving the quantifiers, Skolemizing, dropping the universal quantifiers, and converting the resulting formula into CNF.

(∀x ∃y R(x, y)) → (∃y ∀x R(x, y))
(∃y ∀x R(x, y)) → (∀x ∃y R(x, y))
∃x ∀y z ((P(y) → Q(z)) → (P(x) → Q(x)))
¬∃y ∀x (R(x, y) ↔ ¬∃z (R(x, z) ∧ R(z, x)))

Exercise 31 Consider the Prolog program consisting of the definite clauses

P(f(x, y)) ← Q(x), R(y)
Q(g(z)) ← R(z)
R(a) ←

Describe the Prolog computation starting from the goal clause ← P(v). Keep track of the substitutions affecting v to determine what answer the Prolog system would return.

Exercise 32 Find a refutation from the following set of clauses using resolution with factoring.

{¬P(x, a), ¬P(x, y), ¬P(y, x)}
{P(x, f(x)), P(x, a)}
{P(f(x), x), P(x, a)}

Exercise 33 Prove the following formulæ by resolution, showing all steps of the conversion into clauses. Remember to negate first!

9.1 Fourier-Motzkin variable elimination

Fourier-Motzkin variable elimination is a classic decision procedure for real (or rational) linear arithmetic. It dates from 1826 and is very inefficient in general, but relatively easy to understand. In the general case, it deals with conjunctions of linear constraints over the reals or rationals:

∧_{i=1..m} Σ_{j=1..n} a_ij x_j ≤ b_i      (6)

It works by eliminating variables in succession. Eventually either a contradiction or a trivial constraint will remain.

The key idea is a technique known as quantifier elimination (QE). We have already seen Skolemization, which removes quantifiers from a formula, but that technique does not preserve the formula's meaning. QE replaces a formula with an equivalent but quantifier-free formula. It is only possible for specific theories, and is generally very expensive.

(The formulæ for Exercise 33:)

∀x (P ∨ Q(x)) → (P ∨ ∀x Q(x))
∃x y (R(x, y) → ∀z w R(z, w))

Note that P is just a predicate symbol, so in particular, x is not free in P.

9 Decision Procedures and SMT Solvers

One of the original objectives of formal logic was to replace argumentation with calculation: to solve mathematical questions mechanically. Unfortunately, this reductionistic approach is unrealistic. Researchers soon found out that most fundamental problems were insoluble, in general. The Halting Problem was found to be undecidable. Gödel proved that no reasonable system of axioms yielded a complete theory for arithmetic. And problem classes that were

For the reals, existential quantifiers can be eliminated as follows:

∃x (∧_{i=1..m} a_i ≤ x ∧ ∧_{j=1..n} x ≤ b_j) ⟺ ∧_{i=1..m} ∧_{j=1..n} a_i ≤ b_j
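This elimination step is easy to implement directly: normalise each constraint so that the eliminated variable's coefficient is ±1, then add every lower bound to every upper bound. The sketch below is my own illustration (the representation and function names are not from the notes); it refutes the small constraint set x ≤ y, x ≤ z, −x + y + 2z ≤ 0, −z ≤ −1 that is worked through informally in this section.

```python
# A small Fourier-Motzkin eliminator over rational constraints, each written
# as (coeffs, b) meaning sum(coeffs[v] * v) <= b.
from fractions import Fraction as F

def eliminate(constraints, var):
    """Remove `var` from a list of constraints by Fourier-Motzkin."""
    uppers, lowers, rest = [], [], []
    for coeffs, b in constraints:
        a = F(coeffs.get(var, 0))
        if a == 0:
            rest.append((coeffs, b))
            continue
        # divide through by |a| so the coefficient of var is +1 or -1
        other = {v: F(c) / abs(a) for v, c in coeffs.items() if v != var}
        bound = F(b) / abs(a)
        (uppers if a > 0 else lowers).append((other, bound))
    # combine every lower bound with every upper bound
    for lo_c, lo_b in lowers:            # -var + lo_c <= lo_b
        for up_c, up_b in uppers:        #  var + up_c <= up_b
            coeffs = {v: lo_c.get(v, 0) + up_c.get(v, 0)
                      for v in set(lo_c) | set(up_c)}
            rest.append((coeffs, lo_b + up_b))
    return rest

def contradictory(constraints):
    # a variable-free constraint 0 <= b with b < 0 is a contradiction
    return any(all(c == 0 for c in coeffs.values()) and b < 0
               for coeffs, b in constraints)

# x <= y,  x <= z,  -x + y + 2z <= 0,  -z <= -1
cs = [({'x': 1, 'y': -1}, 0),
      ({'x': 1, 'z': -1}, 0),
      ({'x': -1, 'y': 1, 'z': 2}, 0),
      ({'z': -1}, -1)]
cs = eliminate(cs, 'x')
cs = eliminate(cs, 'z')
print(contradictory(cs))   # True: the constraints are unsatisfiable
```

Eliminating x produces z ≤ 0 and y + z ≤ 0, and eliminating z then yields the contradictory 0 ≤ −1, exactly as in the worked example below.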

A system of constraints has many lower bounds, {a_i} for i = 1..m, and upper bounds, {b_j} for j = 1..n. To eliminate a variable, we need to generalise to include every combination of a lower bound with an upper bound. Observe that removing one quantifier replaces a conjunction of m + n inequalities by a conjunction of m × n inequalities.

Given a problem in the form (6), to eliminate the variable x_n, we examine each of the m constraints based on the sign of the relevant coefficient, a_in. If a_in = 0 then it does not constrain x_n at all. Otherwise, define β_i = b_i − Σ_{j=1..n−1} a_ij x_j. Then

if a_in > 0 then x_n ≤ β_i / a_in
if a_in < 0 then x_n ≥ β_i / a_in

The first case yields an upper bound for x_n while the second case yields a lower bound. Now every pair of constraints i and i′ involving opposite signs can be added, writing the lower bound case as −x_n ≤ −β_i′ / a_i′n, to obtain

0 ≤ β_i / a_in − β_i′ / a_i′n.

Consider the following small set of constraints:

x ≤ y    x ≤ z    −x + y + 2z ≤ 0    −z ≤ −1

Let's work through the algorithm very informally. The first two constraints give upper bounds for x, while the third constraint gives a lower bound, and can be rewritten as −x ≤ −y − 2z. Adding it to x ≤ y yields 0 ≤ −2z, which can be rewritten as z ≤ 0. Doing the same thing with x ≤ z yields y + z ≤ 0, which can be rewritten as z ≤ −y. This leaves us with a new set of constraints, where we have eliminated the variable x:

z ≤ 0    z ≤ −y    −z ≤ −1

Now we have two separate upper bounds for z, as well as a lower bound, because we know z ≥ 1. There are again two possible combinations of a lower bound with an upper bound, and we derive 0 ≤ −1 and 0 ≤ −y − 1. Because 0 ≤ −1 is contradictory, Fourier-Motzkin variable elimination has refuted the original set of constraints.

9.2 Other decidable theories

One of the most dramatic examples of quantifier elimination concerns the domain of real polynomial arithmetic:

∃x [ax² + bx + c = 0] ⟺ b² ≥ 4ac ∧ (c = 0 ∨ a ≠ 0 ∨ b² > 4ac)

The left-hand side asks when a quadratic equation can have solutions, and the right-hand side gives necessary and sufficient conditions, including degenerate cases (a = c = 0). But this neat formula is the exception, not the rule. In general, QE yields gigantic formulas. Applying QE to a formula containing no free variables yields a sentence, which simplifies to t or f, but even then, runtime is typically hyperexponential.

For linear integer arithmetic, every decision algorithm has a worst-case runtime of at least 2^(2^(cn)). People typically use the Omega test or Cooper's algorithm.

There exist decision procedures for arrays, for at least the trivial theory of lookup and update (where lk is lookup and up is update):

lk(up(a, i, v), j) = v          if i = j
lk(up(a, i, v), j) = lk(a, j)   if i ≠ j

The theory of lists with head, tail, cons is also decidable. Combinations of decidable theories remain decidable under certain circumstances, e.g., the theory of arrays with linear arithmetic subscripts. The seminal publication, still cited today, is Nelson and Oppen [1980].

9.3 Satisfiability modulo theories

Many decision procedures operate on existentially quantified conjunctions of inequalities. An arbitrary formula can be solved by translating it into disjunctive normal form (as opposed to the more usual conjunctive normal form) and by eliminating universal quantifiers in favour of negated existential quantifiers. However, these transformations typically cause exponential growth and may need to be repeated as each variable is eliminated.

Satisfiability modulo theories (SMT) is an extension of DPLL to make use of decision procedures, extending their scope while avoiding the problems mentioned above. The idea is that DPLL handles the logical part of the problem while delegating reasoning about particular theories to the relevant decision procedures.

We extend the language of propositional satisfiability to include atomic formulas belonging to our decidable theory
(or theories). For the time being, these atomic formulas are not interpreted, so a literal such as a < 0 is wholly unrelated to a > 0. But the decision procedures are invoked during the execution of DPLL; if we have already asserted a < 0, then the attempt to assert a > 0 will be rejected by the decision procedure, causing backtracking. Information can be fed back from the decision procedure to DPLL in the form of a new clause, such as ¬(a < 0) ∨ ¬(a > 0).

[Remark: the Fourier-Motzkin decision procedure eliminates variables, but all decision procedures in actual use can deal with constants such as a as well, and satisfiability-preserving transformations exist between formulas involving constants and those involving quantified variables.]

Many other decision procedures exist, frequently for more restrictive problem domains, aiming for greater efficiency and better integration with other reasoning tools. Difference arithmetic is an example: arithmetic constraints are restricted to the form x − y ≤ c, where c is an integer constant. Satisfiability of a set of difference arithmetic constraints can be determined very quickly by constructing a graph and invoking the Bellman-Ford algorithm to look for a cycle representing the contradiction 0 ≤ −1. In the other direction, harder decision problems can handle more advanced applications but require much more computer power.
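The graph construction just mentioned can be sketched concretely. In the illustration below (my own encoding, not from the notes), each constraint x − y ≤ c becomes an edge y → x of weight c, and the set is unsatisfiable exactly when the graph has a negative-weight cycle, which Bellman-Ford relaxation detects.

```python
# Difference-arithmetic satisfiability via Bellman-Ford negative-cycle search.
def diff_sat(constraints):
    """constraints: list of (x, y, c) meaning x - y <= c."""
    nodes = {v for x, y, _ in constraints for v in (x, y)}
    dist = {v: 0 for v in nodes}          # virtual source at distance 0
    edges = [(y, x, c) for x, y, c in constraints]
    for _ in range(len(nodes)):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:     # relax edge u -> v
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return True                   # stable: the dist values are a model
    return False                          # still relaxing: negative cycle

# x - y <= 2, y - z <= -3, z - x <= 0 sums around the cycle to 0 <= -1: unsat
print(diff_sat([('x', 'y', 2), ('y', 'z', -3), ('z', 'x', 0)]))  # False
print(diff_sat([('x', 'y', 2), ('y', 'z', -3), ('z', 'x', 1)]))  # True
```

When the relaxation stabilises, the distance values themselves satisfy every constraint, so the procedure also produces a model, not just a yes/no answer.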

9.4 SMT example

Let's consider an example. Suppose we start with the following four clauses. Note that a, b, c are constants: variables are not possible with this sort of proof procedure.

{c = 0, 2a < b}    {b < a}    {3a > 2, a < 0}    {c ≠ 0, ¬(b < a)}

Unit propagation using b < a yields three clauses:

{c = 0, 2a < b}    {3a > 2, a < 0}    {c ≠ 0}

Unit propagation using c ≠ 0 yields two clauses:

{2a < b}    {3a > 2, a < 0}

Unit propagation using 2a < b yields just this:

{3a > 2, a < 0}

Now a case split on the literal 3a > 2 returns a "model":

b < a ∧ c ≠ 0 ∧ 2a < b ∧ 3a > 2

But the decision procedure finds these contradictory and returns a new clause:

{¬(b < a), ¬(2a < b), ¬(3a > 2)}

Finally, we get a true model:

b < a ∧ c ≠ 0 ∧ 2a < b ∧ a < 0

Case splitting operates as usual for DPLL. But note that pure literal elimination would make no sense here, as there are connections between literals (consider a < 0 and a > 0 again) that are not visible at the propositional level.

9.5 Final remarks

We have seen here the concepts of over-approximation and counterexample-driven refinement. It is frequently used to extend SAT solving to richer domains than propositional logic. By over-approximation we mean that every model of the original problem assigns truth values to the enriched "propositional letters" (such as a > 0), yielding a model of the propositional clauses obtained by ignoring the underlying meaning of the propositional atoms. As above, any claimed model is then somehow checked against the richer problem domain, and the propositional model is then iteratively refined. But if the propositional clauses are inconsistent, then so is the original problem.

SMT solvers are the focus of great interest at the moment, and have largely superseded SAT solvers (which they incorporate and generalise). One of the most popular SMT solvers is Z3, a product of Microsoft Research but free for non-commercial use. Others include Yices and CVC4. They are applied to a wide range of problems, including hardware and software verification, program analysis, symbolic software execution, and hybrid systems verification.

Exercise 34 In Fourier-Motzkin variable elimination, any variable not bounded both above and below is deleted from the problem. For example, given the set of constraints

3x ≥ y    x ≥ 0    y ≥ z    z ≤ 1    z ≥ 0

the variables x and then y can be removed (with their constraints), reducing the problem to z ≤ 1 ∧ z ≥ 0. Explain how this happens and why it is correct.

Exercise 35 Apply Fourier-Motzkin variable elimination to the set of constraints

x ≥ z    y ≥ 2z    z ≥ 0    x + y ≤ z.

Exercise 36 Apply Fourier-Motzkin variable elimination to the set of constraints

x ≤ 2y    x ≤ y + 3    z ≤ x    0 ≤ z    y ≤ 4x.

Exercise 37 Apply the SMT algorithm sketched above to the following set of clauses:

{c = 0, c > 0}    {a < b, b < a}    {c < 0, a = b}

10 Binary Decision Diagrams

A binary decision tree represents a propositional formula by binary decisions, namely if-then-else expressions over the propositional letters. (In the relevant literature, propositional letters are called variables.) A tree may contain much redundancy; a binary decision diagram is a directed graph, sharing identical subtrees. An ordered binary decision diagram is based upon giving an ordering < to the variables: they must be tested in order. Further refinements ensure that each propositional formula is mapped to a unique diagram, for a given ordering.

The acronym BDD for binary decision diagram is well-established in the literature. However, many earlier papers use OBDD or even ROBDD (for "reduced ordered binary decision diagram") synonymously.

A BDD must satisfy the following conditions:

• ordering: if P is tested before Q, then P < Q (thus in particular, P cannot be tested more than once on a single path)
• uniqueness: identical subgraphs are stored only once (to do this efficiently, hash each node by its variable and pointer fields)
• irredundancy: no test leads to identical subgraphs in the 1 and 0 cases (thanks to uniqueness, redundant tests can be detected by comparing pointers)

For a given variable ordering, the BDD representation of each formula is unique: BDDs are a canonical form. Canonical forms usually lead to good algorithms; for a start, you can test whether two things are equivalent by comparing their canonical forms.
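Stepping back to the SMT example of §9.4, the propagate, split, check, block loop can be sketched as follows. This is my own toy illustration, not a real solver: the DPLL core treats theory atoms as opaque strings (with c ≠ 0 encoded as the negation of c = 0), and a stub "decision procedure" knows only that b < a, 2a < b and 3a > 2 are jointly contradictory, returning a blocking clause.

```python
# A toy SMT loop: DPLL over opaque atoms, plus a stub theory checker.
def neg(l):
    return l[1:] if l.startswith('~') else '~' + l

def dpll(clauses, model=()):
    clauses = [c for c in clauses if not any(l in model for l in c)]
    clauses = [tuple(l for l in c if neg(l) not in model) for c in clauses]
    if any(len(c) == 0 for c in clauses):
        return None                         # conflict
    if not clauses:
        return model
    unit = next((c[0] for c in clauses if len(c) == 1), None)
    lit = unit if unit is not None else clauses[0][0]
    for choice in ([lit] if unit else [lit, neg(lit)]):   # case split
        m = dpll(clauses, model + (choice,))
        if m is not None:
            return m
    return None

def theory_conflict(model):
    # stub decision procedure: b<a and 2a<b and 3a>2 cannot all hold
    bad = ('b<a', '2a<b', '3a>2')
    return bad if all(l in model for l in bad) else None

def smt(clauses):
    while True:
        model = dpll(clauses)
        if model is None:
            return None
        bad = theory_conflict(model)
        if bad is None:
            return model
        clauses = clauses + [tuple(neg(l) for l in bad)]  # blocking clause

clauses = [('c=0', '2a<b'), ('b<a',), ('3a>2', 'a<0'), ('~c=0', '~b<a')]
print(smt(clauses))
```

Running this reproduces the walkthrough above: the first candidate model asserts 3a > 2, the stub rejects it with the blocking clause, and the second round returns a model containing a < 0 instead.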

The BDD of a tautology is 1. Similarly, that of any inconsistent formula is 0. To check whether two formulæ are logically equivalent, convert both to BDDs and then, thanks to uniqueness, simply compare the pointers.

A recursive algorithm converts a formula to a BDD. All the logical connectives can be handled directly, including → and ↔. (Exclusive-or is also used, especially in hardware examples.) The expensive transformation of A ↔ B into (A → B) ∧ (B → A) is unnecessary.

Here is how to convert a conjunction A ∧ A′ to a BDD. In this algorithm, X P Y denotes a decision node that tests the variable P, with a true-link to X and a false-link to Y. In other words, X P Y is the BDD equivalent of the decision "if P then X else Y".

[Diagrams: the intermediate BDDs for the example, with decision nodes on P, Q and R and leaves 0 and 1.] The new construction is shown in grey. In both of these examples, it appears over the rightmost formula because its variables come later in the ordering.

The final diagram indicates that the original formula is

always true except if P is true while Q and R are false. When you have such a simple BDD, you can easily check that it is correct. For example, this BDD suggests the formula evaluates to 1 when P is false, and indeed we find that the formula simplifies to Q → Q ∨ R, which simplifies further to 1.

Huth and Ryan [2004] present a readable introduction to BDDs. A classic but more formidable source of information is Bryant [1992].

1. Recursively convert A and A′ to BDDs Z and Z′.

2. Check for trivial cases. If Z = Z′ (pointer comparison) then the result is Z; if either operand is 0, then the result is 0; if either operand is 1, then the result is the other operand.

3. In the general case, let Z = X P Y and Z′ = X′ P′ Y′. There are three possibilities:

(a) If P = P′ then build the BDD (X ∧ X′) P (Y ∧ Y′) recursively. This means convert X ∧ X′ and Y ∧ Y′ to BDDs U and U′, then construct a new decision node from P to them. Do the usual simplifications. If U = U′ then the resulting BDD for the conjunction is U.

(b) If P < P′ then build the BDD (X ∧ Z′) P (Y ∧ Z′). When building BDDs on paper, it is easier to pretend that the second decision node also starts with P: assume that it has the redundant decision Z′ P Z′ and proceed as in (a).

(c) If P > P′, the approach is analogous to (b).

Other connectives, even ⊕, are treated similarly, differing only in the base cases. The negation of the BDD X P Y is ¬X P ¬Y. In essence we copy the BDD, and when we reach the leaves, exchange 1 and 0. The BDD of Z → f is the same as the BDD of ¬Z.

During this processing, the same input (consisting of a connective and two BDDs) may be transformed into a BDD repeatedly. Efficient implementations therefore have an additional hash table, which associates inputs to the corresponding BDDs. The result of every transformation is stored in the hash table so that it does not have to be computed again.

Exercise 38 Compute the BDD for each of the following formulæ, taking the variables as alphabetically ordered:

P ∧ Q → Q ∧ P
P ∨ Q → P ∧ Q
¬(P ∨ Q) ∨ P
¬(P ∧ Q) ↔ (P ∨ R)

Exercise 39 Verify these equivalences using BDDs:

(P ∧ Q) ∧ R ≃ P ∧ (Q ∧ R)
(P ∨ Q) ∨ R ≃ P ∨ (Q ∨ R)
P ∨ (Q ∧ R) ≃ (P ∨ Q) ∧ (P ∨ R)
P ∧ (Q ∨ R) ≃ (P ∧ Q) ∨ (P ∧ R)

Exercise 40 Verify these equivalences using BDDs:

¬(P ∧ Q) ≃ ¬P ∨ ¬Q
(P ↔ Q) ↔ R ≃ P ↔ (Q ↔ R)
(P ∨ Q) → R ≃ (P → R) ∧ (Q → R)

11 Modal Logics

Modal logic allows us to reason about statements being "necessary" or "possible". Some variants are effectively about time (temporal logic), where a statement might hold "henceforth" or "eventually". There are many forms of modal logic. Each one is based upon two parameters:

Example 34 We apply the BDD Canonicalisation Algorithm to P ∨ Q → Q ∨ R. First, we make tiny BDDs for P and Q. Then, we combine them using ∨ to make a small BDD for P ∨ Q:

[Diagrams: the one-node BDDs for P and Q, and the BDD for P ∨ Q, with leaves 0 and 1.]

The BDD for Q ∨ R has a similar construction, so we omit it. We combine the two small BDDs using →, then simplify (removing a redundant test on Q) to obtain the final BDD.

• W is the set of possible worlds (machine states, future times, ...)
• R is the accessibility relation between worlds (state transitions, flow of time, ...)

The pair (W, R) is called a modal frame. The two modal operators, or modalities, are □ and ◇:

• □A means A is necessarily true
• ◇A means A is possibly true

Here "necessarily true" means "true in all worlds accessible from the present one". The modalities are related by the law ¬◇A ≃ □¬A; in words, "it is not possible that A is true" is equivalent to "A is necessarily false".

Complex modalities are made up of strings of the modal operators, such as □□A, □◇A, ◇□A, etc. Typically many of these are equivalent to others; in S4, an important modal logic, □□A is equivalent to □A.

11.1 Semantics of propositional modal logic

Here are some basic definitions, with respect to a particular frame (W, R):

An interpretation I maps the propositional letters to subsets of W. For each letter P, the set I(P) consists of those worlds in which P is regarded as true.

If w ∈ W and A is a modal formula, then w ⊩ A means A is true in world w. This relation is defined as follows:

w ⊩ P      ⟺  w ∈ I(P)
w ⊩ □A     ⟺  v ⊩ A for all v such that R(w, v)
w ⊩ ◇A     ⟺  v ⊩ A for some v such that R(w, v)
w ⊩ A ∨ B  ⟺  w ⊩ A or w ⊩ B
w ⊩ A ∧ B  ⟺  w ⊩ A and w ⊩ B
w ⊩ ¬A     ⟺  w ⊩ A does not hold

This definition of truth is more complex than we have seen previously (§2.2), because of the extra parameters W and R. We shall not consider quantifiers at all; they really complicate matters, especially if the universe is allowed to vary from one world to the next.

For a particular frame (W, R), further relations can be defined in terms of w ⊩ A:

⊨_{W,R,I} A  means  w ⊩ A for all w under interpretation I
⊨_{W,R} A    means  w ⊩ A for all w and all I

and the necessitation rule: from A infer □A.

There are no axioms or inference rules for ◇. The modality is viewed simply as an abbreviation:

◇A ≝ ¬□¬A

The distribution axiom clearly holds in our semantics. The propositional connectives obey their usual truth tables in each world. If A holds in all worlds, and A → B holds in all worlds, then B holds in all worlds. Thus if □A and □(A → B) hold then so does □B, and that is the essence of the distribution axiom.

The necessitation rule states that all theorems are necessarily true. In more detail, if A can be proved, then it holds in all worlds; therefore □A is also true.

The modal logic that results from adding the distribution axiom and necessitation rule is called K. It is a pure modal logic, from which others are obtained by adding further axioms. Each axiom corresponds to a property that is assumed to hold of all accessibility relations. Here are just a few of the main ones:

T: □A → A      (reflexive)
4: □A → □□A    (transitive)
B: A → □◇A     (symmetric)

Logic T includes axiom T: reflexivity. Logic S4 includes axioms T and 4: reflexivity and transitivity. Logic S5 includes axioms T, 4 and B: reflexivity, transitivity and symmetry; these imply that the accessibility relation is an equivalence relation, which is a strong condition.

Other conditions on the accessibility relation concern forms of confluence. One such condition might state that if w₁ and w₂ are both accessible from w then there exists some v that is accessible from both w₁ and w₂.

11.3 Sequent Calculus Rules for S4

We shall mainly look at S4, which is one of the mainstream modal logics. As mentioned above, S4 assumes that the accessibility relation is reflexive and transitive. If you want an intuition, think of the flow of time. Here are some S4 statements with their intuitive meanings:

• □A means "A will be true from now on".
• ◇A means "A will be true at some point in the future", where the future includes the present moment.
• □◇A means "◇A will be true from now on". At any future time, A must become true some time afterwards. In short, A will be true infinitely often.

Now ⊨ A means ⊨_{W,R} A for all frames. We say that A is universally valid. In particular, all tautologies of propositional logic are universally valid.

Typically we make further assumptions on the accessibility relation. We may assume, for example, that R is transitive, and consider whether a formula holds under all such frames. More formulæ become universally valid if we restrict the accessibility relation, as the restrictions exclude some modal frames from consideration. The purpose of such assumptions is to better model the task at hand. For instance, to model the passage of time, we might want R to be reflexive and transitive; we could even make it a linear ordering, though branching-time temporal logic is popular.

11.2 Hilbert-style modal proof systems

Start with any proof system for propositional logic. Then add the distribution axiom

□(A → B) → (□A → □B)

• □□A means "□A will be true from now on". At any future time, A will continue to be true. So □□A and □A have the same meaning in S4.

The "time" described by S4 allows multiple futures, which can be confusing. For example, ◇□A intuitively


means "eventually A will be true forever". You might expect ◇□A and ◇□B to imply ◇□(A ∧ B), since eventually A and B should both have become true. However, this property fails because time can split, with A becoming true in one branch and B in the other (Fig. 4). Note in particular that □◇□A is stronger than ◇□A, and means "in all futures, eventually A will be true forever".

[Figure 4: Counterexample to ◇□A ∧ ◇□B → ◇□(A ∧ B)]

The sequent calculus for S4 extends the usual sequent rules for propositional logic with additional ones for □ and ◇. Four rules are required because the modalities may occur on either the left or right side of a sequent.

(□l): from A, Γ ⇒ Δ infer □A, Γ ⇒ Δ
(□r): from Γ* ⇒ Δ*, A infer Γ ⇒ Δ, □A
(◇l): from A, Γ* ⇒ Δ* infer ◇A, Γ ⇒ Δ
(◇r): from Γ ⇒ Δ, A infer Γ ⇒ Δ, ◇A

The (□r) rule is analogous to the necessitation rule. But now A may be proved from other formulæ. This introduces complications. Modal logic is notorious for requiring strange conditions in inference rules. The symbols Γ* and Δ* stand for sets of formulæ, defined as follows:

Γ* ≝ {□B | □B ∈ Γ}
Δ* ≝ {◇B | ◇B ∈ Δ}

In effect, applying rule (□r) in a backward proof throws away all left-hand formulæ that do not begin with a □ and all right-hand formulæ that do not begin with a ◇.

If you consider why the (□r) rule actually holds, it is not hard to see why those formulæ must be discarded. If we forgot about the restriction, then we could use (□r) to infer A ⇒ □A from A ⇒ A, which is ridiculous. The restriction ensures that the proof of A in the premise is independent of any particular world.

The rule (◇l) is an exact dual of (□r). The obligation to discard formulæ forces us to plan proofs carefully. If rules are applied in the wrong order, vital information may have to be discarded and the proof will fail.

11.4 Some sample proofs in S4

A few examples will illustrate how the S4 sequent calculus is used. The distribution axiom is assumed in the Hilbert-style proof system. Using the sequent calculus, we can prove it (I omit the (→r) steps):

A ⇒ A    B ⇒ B
A → B, A ⇒ B          (→l)
A → B, □A ⇒ B         (□l)
□(A → B), □A ⇒ B      (□l)
□(A → B), □A ⇒ □B     (□r)

Intuitively, why is this sequent true? We assume □(A → B): from now on, if A holds then so does B. We assume □A: from now on, A holds. Obviously we can conclude that B will hold from now on, which we write formally as □B.

The order in which you apply rules is important. Working backwards, you must first apply rule (□r). This rule discards non-□ formulæ, but there aren't any. If you first apply (□l), removing the boxes from the left side, then you will get stuck:

⇒ B                   (now what?)
A → B, A ⇒ □B         (□r)
A → B, □A ⇒ □B        (□l)
□(A → B), □A ⇒ □B     (□l)

Applying (□r) before (□l) is analogous to applying (∀r) before (∀l). The analogy holds because □A has an implicit universal quantifier: for all accessible worlds.

The following two proofs establish the modal equivalence □◇□◇A ≃ □◇A. Strings of modalities, like □◇□◇ and □◇, are called operator strings. So the pair of results establish an operator string equivalence. The validity of this particular equivalence is not hard to see. Recall that □◇A means that A holds infinitely often. So □◇□◇A means that □◇A holds infinitely often; but that can only mean that A holds infinitely often, which is the meaning of □◇A.

Now, let us prove the equivalence. Here is the first half of the proof. As usual we apply (□r) before (□l). Dually, and analogously to the treatment of the ∃ rules, we apply (◇l) before (◇r):

◇A ⇒ ◇A
□◇A ⇒ ◇A       (□l)
◇□◇A ⇒ ◇A      (◇l)
□◇□◇A ⇒ ◇A     (□l)
□◇□◇A ⇒ □◇A    (□r)

The opposite entailment is easy to prove:

□◇A ⇒ □◇A
□◇A ⇒ ◇□◇A     (◇r)
□◇A ⇒ □◇□◇A    (□r)

Logic S4 enjoys many operator string equivalences, including □□A ≃ □A. And for every operator string equivalence, its dual (obtained by exchanging □ with ◇) also holds. In particular, ◇◇A ≃ ◇A and ◇□◇□A ≃ ◇□A hold. So we only need to consider operator strings in which the boxes and diamonds alternate, and whose length does not exceed three. The distinct S4 operator strings are therefore □, ◇, □◇, ◇□, □◇□ and ◇□◇.
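The equivalence □◇□◇A ≃ □◇A can also be spot-checked semantically. The self-contained sketch below (my own illustration) evaluates the two operator strings over every interpretation of A on one concrete reflexive, transitive frame; it is only a check on a single frame, not a proof.

```python
# Spot check of an S4 operator-string equivalence on a three-world preorder.
from itertools import product

W = [0, 1, 2]
R = {(u, v) for u in W for v in W if u <= v}   # reflexive and transitive

def box(S):   # worlds where "box" of (the set of worlds) S holds
    return {w for w in W if all(v in S for v in W if (w, v) in R)}

def dia(S):   # worlds where "dia" of S holds
    return {w for w in W if any(v in S for v in W if (w, v) in R)}

# check box-dia A == box-dia-box-dia A for every interpretation of A
for bits in product([False, True], repeat=len(W)):
    A = {w for w, b in zip(W, bits) if b}
    assert box(dia(A)) == box(dia(box(dia(A))))
print("box-dia A agrees with box-dia-box-dia A on this frame")
```

The same two functions can be used to test the other alternating strings, or to find countermodels on frames that are not reflexive or not transitive.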

Finally, here are two attempted proofs that fail, because their conclusions are not theorems! The modal sequent A ⇒ □◇A states that if A holds now then it necessarily holds again: from each accessible world, another world is accessible in which A holds. This formula is valid if the accessibility relation is symmetric; then one could simply return to the original world. The formula is therefore a theorem of S5 modal logic, but not S4.

⇒ A
⇒ ◇A          (◇r)
A ⇒ □◇A       (□r)

Here, the modal sequent ◇A, ◇B ⇒ ◇(A ∧ B) states that if A holds in some accessible world, and B holds in some accessible world, then both A and B hold in some accessible world. It is a fallacy because those two worlds need not coincide. The (◇l) rule prevents us from removing the diamonds from both ◇A and ◇B; if we choose one we must discard the other:

B ⇒ A ∧ B
B ⇒ ◇(A ∧ B)          (◇r)
◇A, ◇B ⇒ ◇(A ∧ B)     (◇l)

The topmost sequent may give us a hint as to why the conclusion fails. Here we are in a world in which B holds, and we are trying to show A ∧ B, but there is no reason why A should hold in that world.

The sequent ◇□A, ◇□B ⇒ ◇□(A ∧ B) is not valid because A and B can become true in different futures. However, the sequents ◇□A, □◇□B ⇒ ◇□(A ∧ B) and □◇□A, □◇□B ⇒ □◇□(A ∧ B) are both valid.

Exercise 41 Why does the dual of an operator string equivalence also hold?

Exercise 42 Prove the sequents ◇(A ∨ B) ⇒ ◇A, ◇B and ◇A ∨ ◇B ⇒ ◇(A ∨ B), thus proving the equivalence ◇(A ∨ B) ≃ ◇A ∨ ◇B.

Exercise 43 Prove the sequent ◇(A → B), □A ⇒ ◇B.

Exercise 44 Prove the equivalence □(A ∧ B) ≃ □A ∧ □B.

Exercise 45 Prove □◇□A, □◇□B ⇒ □◇□(A ∧ B).

12 Tableaux-Based Methods

There is a lot of redundancy among the connectives ¬, ∧, ∨, →, ↔, ∀, ∃. We could get away with using only three of them (two if we allowed exclusive-or), but use the full set for readability. There is also a lot of redundancy in the sequent calculus, because it was designed to model human reasoning, not to be as small as possible.

One approach to removing redundancy results in the resolution method. Clause notation replaces the connectives, and there is only one inference rule. A less radical approach still removes much of the redundancy, while preserving much of the natural structure of formulæ. The resulting formalism, known as the tableau calculus, is often adopted by proof theorists because of its logical simplicity. Adding unification produces yet another formalism known as free-variable tableaux; this form is particularly amenable to implementation. Both formalisms use proof by contradiction.

12.1 Simplifying the sequent calculus

The usual formalisation of first-order logic involves seven connectives, or nine in the case of modal logic. For each connective the sequent calculus has a left and a right rule. So, apart from the structural rules (basic sequent and cut), there are 14 rules, or 18 for modal logic.

Suppose we allow only formulæ in negation normal form. This immediately disposes of the connectives → and ↔. Really ¬ is discarded also, as it is allowed only on propositional letters. So only four connectives remain, six for modal logic.

The greatest simplicity gain comes in the sequent rules. The only sequent rules that move formulæ from one side to the other (across the ⇒ symbol) are the rules for the connectives that we have just discarded. Half of the sequent rules can be discarded too. It makes little difference whether we discard the left-side rules or the right-side rules.

Let us discard the right-side rules. The resulting system allows sequents of the form A ⇒. It is a form of refutation system (proof by contradiction), since the formula ¬A has the same meaning as the sequent A ⇒. Moreover, a basic sequent has the form of a contradiction. We have created a new formal system, known as the tableau calculus.

(basic): ¬A, A, Γ ⇒
(cut):   from ¬A, Γ ⇒ and A, Γ ⇒ infer Γ ⇒
(∧l):    from A, B, Γ ⇒ infer A ∧ B, Γ ⇒
(∨l):    from A, Γ ⇒ and B, Γ ⇒ infer A ∨ B, Γ ⇒
(∀l):    from A[t/x], Γ ⇒ infer ∀x A, Γ ⇒
(∃l):    from A, Γ ⇒ infer ∃x A, Γ ⇒

Rule (∃l) has the usual proviso: it holds provided x is not free in the conclusion!

We can extend the system to S4 modal logic by adding just two further rules, one for □ and one for ◇:

(□l): from A, Γ ⇒ infer □A, Γ ⇒
(◇l): from A, Γ* ⇒ infer ◇A, Γ ⇒

As previously, Γ* is defined to erase all non-□ formulæ:

Γ* ≝ {□B | □B ∈ Γ}

We have gone from 14 rules to four, ignoring the structural rules. For modal logic, we have gone from 18 rules to six.

A simple proof will illustrate how the tableau calculus works. Let us prove ∀x (A → B) → (A → ∀x B), where x is not free in A. We must negate the formula, convert it to NNF and finally put it on the left side of the arrow. The resulting sequent is A ∧ ∃x ¬B, ∀x (¬A ∨ B) ⇒. Elaborate explanations should not be necessary because this tableau calculus is essentially a subset of the sequent calculus described in §5.

¬A, ¬B, A ⇒ (basic)    B, ¬B, A ⇒ (basic)
¬A ∨ B, ¬B, A ⇒           (∨l)
∀x (¬A ∨ B), ¬B, A ⇒      (∀l)
∀x (¬A ∨ B), ∃x ¬B, A ⇒   (∃l)
∀x (¬A ∨ B), A ∧ ∃x ¬B ⇒  (∧l)

12.2 The free-variable tableau calculus

Some proof theorists adopt the tableau calculus as their formalisation of first-order logic. It has the advantages of the sequent calculus, without the redundancy. But can we use it as the basis for a theorem prover? Implementing the calculus (or indeed, implementing the full sequent calculus) requires a treatment of quantifiers. As with resolution, a good computational approach is to combine unification with Skolemization.

First, consider how to add unification. The rule (∀l) substitutes some term for the bound variable. Since we do not know in advance what the term ought to be, instead substitute a free variable. The variable ought to be fresh, not used elsewhere in the proof:

(∀l): from A[z/x], Γ ⇒ infer ∀x A, Γ ⇒

Then allow unification to instantiate variables with terms. This should occur when trying to solve any goal containing two formulæ, ¬A and B: try to unify A with B, producing a basic sequent. Instantiating a variable updates the entire proof tree.

Rule (∃l), used in backward proof, must create a fresh variable. That will no longer do, in part because we now allow variables to become instantiated by terms.

In the following derivation, the topmost sequent can be made into a basic sequent by unification; there are two distinct ways of doing so: z ↦ f(y) or y ↦ f(z).

P(y), ¬P(f(y)), P(z), ¬P(f(z)) ⇒          (basic after unification)
P(y), ¬P(f(y)), P(z) ∧ ¬P(f(z)) ⇒         (∧l)
P(y), ¬P(f(y)), ∀x [P(x) ∧ ¬P(f(x))] ⇒    (∀l)
P(y) ∧ ¬P(f(y)), ∀x [P(x) ∧ ¬P(f(x))] ⇒   (∧l)
∀x [P(x) ∧ ¬P(f(x))] ⇒                    (∀l)

In the first inference from the bottom, the universal formula is retained because it must be used again. In principle, universally quantified formulæ ought always to be retained, as they may be used any number of times. I normally erase them to save space.

Pulling quantifiers to the front is not merely unnecessary; it can be harmful. Skolem functions should have as few arguments as possible, as this leads to shorter proofs. Attaining this requires that quantifiers should have the smallest possible scopes; we ought to push quantifiers in, not pull them out. This is sometimes called miniscope form.

For example, the formula ∃x ∀y [P(x) → P(y)] is tricky to prove, as we have just seen. But putting it in miniscope form makes its proof trivial. Let us do this step by step:

Negate; convert to NNF:  ∀x ∃y [P(x) ∧ ¬P(y)]
Push in the ∃y:          ∀x [P(x) ∧ ∃y ¬P(y)]
Push in the ∀x:          ∀x P(x) ∧ ∃y ¬P(y)
Skolemize:               ∀x P(x) ∧ ¬P(a)

The formula ∀x P(x) ∧ ¬P(a) is obviously inconsistent. Here is its refutation in the free-variable tableau calculus:

P(y), ¬P(a) ⇒        (basic, with y ↦ a)
∀x P(x), ¬P(a) ⇒     (∀l)
∀x P(x) ∧ ¬P(a) ⇒    (∧l)

A failed proof is always illuminating. Let us try to prove
We have the invalid formula a choice of techniques, but the simplest is to Skolemize the formula. All existential quantifiers disappear, so we can dis- x [P(x) Q(x)] x P(x) x Q(x). card rule ( l). This version of the tableau method is known ∀ ∨ ⇒ ∀ ∨ ∀ ∃ as the free-variable tableau calculus. Negation and conversion to NNF gives x P(x) ∃ ¬ ∧ Warning: if you wish to use unification, you absolutely x Q(x), x [P(x) Q(x)]. ∃ Skolemization¬ ∀ gives∨ P(a) Q(b), x [P(x) Q(x)]. must also use Skolemization. If you use unification without ¬ ∧¬ ∀ ∨ Skolemization, then you are trying to use two formalisms at The proof fails because a and b are distinct constants. It the same time and your proofs will be nonsense! This is be- is impossible to instantiate y to both simultaneously. cause unification is likely to introduce variable occurrences y a y b??? in places where they are forbidden by the side condition of 7→ 7→ P(a), Q(b), P(y) P(a), Q(b), Q(y) the existential rule. ¬ ¬ ⇒ ¬ ¬ ⇒ ( l) P(a), Q(b), P(y) Q(y) ∨ The Skolemised version of y z Q(y, z) x P(x) is ¬ ¬ ∨ ⇒ ( l) ∀ ∃ ∧ ∃ P(a), Q(b), x [P(x) Q(x)] ∀ y Q(y, f (y)) P(a). The subformula x P(x) goes to ¬ ¬ ∀ ∨ ⇒ ( l) ∀ ∧ ∃ P(a) Q(b), x [P(x) Q(x)] ∧ P(a) and not to P(g(y)) because it is outside the scope of ¬ ∧ ¬ ∀ ∨ ⇒ the y. ∀ 12.4 Tableaux-based theorem provers 12.3 Proofs using free-variable tableaux A tableau represents a partial proof as a set of branches of formulæ. Each formula on a branch is expanded until this Let us prove the formula x y [P(x) P(y)]. First is no longer possible (and the proof fails) or until the proof negate it and convert to NNF,∃ getting∀ x y→[P(x) P(y)]. succeeds. Then Skolemize the formula, and finally∀ ∃ put it on∧¬ the left Expanding a conjunction A B on a branch replaces it side of the arrow. The sequent to be proved is x [P(x) by the two conjuncts, A and B∧. Expanding a disjunction P( f (x))] . 
Unification completes the proof∀ by creating∧ A B splits the branch in two, with one branch containing A ¬ ⇒ ∨ REFERENCES 32 and the other branch B. Expanding the quantification x A extends the branch by a formula of the form A[t/x].∀ If a branch contains both A and A then it is said to be closed. When all branches are closed,¬ the proof has succeeded. A tableau can be viewed as a compact, graph-based rep- resentation of a set of sequents. The branch operations de- scribed above correspond to our sequent rules in an obvious way. Quite a few theorem provers have been based upon free- variable tableaux. The simplest is due to Beckert and Posegga [1994] and is called leanTAP. The entire program appears below! Its deductive system is similar to the re- duced sequent calculus we have just studied. It relies on some Prolog tricks, and is certainly not pure Prolog code. It demonstrates just how simple a theorem prover can be. leanTAP does not outperform big resolution systems. But it quickly proves some fairly hard theorems. prove((A,B),UnExp,Lits,FreeV,VarLim) :- !, prove(A,[B|UnExp],Lits,FreeV,VarLim). prove((A;B),UnExp,Lits,FreeV,VarLim) :- !, prove(A,UnExp,Lits,FreeV,VarLim), prove(B,UnExp,Lits,FreeV,VarLim). prove(all(X,Fml),UnExp,Lits,FreeV,VarLim) :- !, \+ length(FreeV,VarLim), copy_term((X,Fml,FreeV),(X1,Fml1,FreeV)), append(UnExp,[all(X,Fml)],UnExp1), prove(Fml1,UnExp1,Lits,[X1|FreeV],VarLim). prove(Lit,_,[L|Lits],_,_) :- (Lit = -Neg; -Lit = Neg) -> (unify(Neg,L); prove(Lit,[],Lits,_,_)). prove(Lit,[Next|UnExp],Lits,FreeV,VarLim) :- prove(Next,UnExp,[Lit|Lits],FreeV,VarLim).

The first clause handles conjunctions, the second disjunctions, the third universal quantification. The fourth clause handles literals, including negation. The fifth clause brings in the next formula to be analyzed. You are not expected to memorize this program or to understand how it works in detail.
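For readers who find the Prolog opaque, the same recursive control structure can be sketched for the propositional fragment in Python. This is a hypothetical illustration, not part of leanTAP: formulas are assumed to be in NNF, represented as nested tuples ('and', A, B), ('or', A, B) and ('not', P), with atoms as strings.

```python
def prove(formulas, lits=frozenset()):
    # Refute a list of formulas read conjunctively, i.e. the left side of a
    # sequent "Gamma =>". Returns True exactly when every branch closes.
    if not formulas:
        return False                       # open branch: nothing left to expand
    f, rest = formulas[0], formulas[1:]
    if isinstance(f, tuple) and f[0] == 'and':
        # (and-l): replace the conjunction by its two conjuncts
        return prove([f[1], f[2]] + rest, lits)
    if isinstance(f, tuple) and f[0] == 'or':
        # (or-l): split the branch; both halves must close
        return prove([f[1]] + rest, lits) and prove([f[2]] + rest, lits)
    # f is a literal: close the branch if its complement is already present
    comp = f[1] if isinstance(f, tuple) and f[0] == 'not' else ('not', f)
    if comp in lits:
        return True                        # basic sequent: contradiction found
    return prove(rest, lits | {f})
```

For example, prove([('not', 'A'), ('or', 'A', 'B'), ('not', 'B')]) succeeds because both branches of the disjunction close, while prove([('not', 'A'), ('or', 'A', 'B')]) fails on the open branch containing B.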

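leanTAP delegates unification to Prolog's built-in mechanism. As a standalone sketch, here is a minimal syntactic unification routine with an occurs check; the conventions (variables are capitalised strings, compound terms are tuples whose first element is the function symbol) are assumptions of this illustration, not notation from the notes.

```python
def is_var(t):
    # convention: a variable is a string starting with an upper-case letter
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    # follow variable bindings recorded in substitution s
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    # occurs check: does variable v appear inside term t under s?
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(t1, t2, s=None):
    # extend substitution s so that t1 and t2 become equal; None on failure
    s = dict(s or {})
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if is_var(t1):
        return None if occurs(t1, t2, s) else {**s, t1: t2}
    if is_var(t2):
        return unify(t2, t1, s)
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        for a, b in zip(t1[1:], t2[1:]):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None
```

Unifying P(z) with P(f(y)) yields the binding z ↦ f(y), one of the two closing substitutions in the proof of §12.3; unifying the distinct constants a and b fails, which is exactly why the invalid formula's tableau stays open.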
Exercise 46 Use a tableau calculus (standard or free variable) to prove examples given in previous sections.
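Working such examples by free-variable tableaux requires the preprocessing of §12.3: negate, convert to NNF, then Skolemize. A rough Python sketch of the last two steps, using hypothetical tuple representations ('not', A), ('and', A, B), ('or', A, B), ('forall', x, A), ('exists', x, A), and assuming → and ↔ have already been eliminated:

```python
def nnf(f):
    # push negations inward until they rest on atoms only
    if isinstance(f, tuple) and f[0] == 'not':
        g = f[1]
        if isinstance(g, tuple):
            if g[0] == 'not':
                return nnf(g[1])                                   # double negation
            if g[0] == 'and':
                return ('or', nnf(('not', g[1])), nnf(('not', g[2])))
            if g[0] == 'or':
                return ('and', nnf(('not', g[1])), nnf(('not', g[2])))
            if g[0] == 'forall':
                return ('exists', g[1], nnf(('not', g[2])))
            if g[0] == 'exists':
                return ('forall', g[1], nnf(('not', g[2])))
        return f                                                   # negated atom
    if isinstance(f, tuple) and f[0] in ('and', 'or'):
        return (f[0], nnf(f[1]), nnf(f[2]))
    if isinstance(f, tuple) and f[0] in ('forall', 'exists'):
        return (f[0], f[1], nnf(f[2]))
    return f

def subst(f, x, t):
    # replace free occurrences of variable x by term t
    if f == x:
        return t
    if isinstance(f, tuple):
        if f[0] in ('forall', 'exists') and f[1] == x:
            return f                                               # x is rebound here
        if f[0] in ('forall', 'exists'):
            return (f[0], f[1], subst(f[2], x, t))
        return (f[0],) + tuple(subst(a, x, t) for a in f[1:])
    return f

def skolemize(f, univ=()):
    # assumes NNF; each existential variable becomes a Skolem term applied
    # to the universal variables in scope (names like sk_z are made up here)
    if isinstance(f, tuple):
        if f[0] == 'forall':
            return ('forall', f[1], skolemize(f[2], univ + (f[1],)))
        if f[0] == 'exists':
            sk = ('sk_' + f[1],) + univ if univ else 'sk_' + f[1]
            return skolemize(subst(f[2], f[1], sk), univ)
        if f[0] in ('and', 'or'):
            return (f[0], skolemize(f[1], univ), skolemize(f[2], univ))
    return f
```

On the example of §12.2, skolemize maps ∀y ∃z Q(y, z) ∧ ∃x P(x) to ∀y Q(y, sk_z(y)) ∧ P(sk_x), matching the text's f(y) and a up to renaming; note the sketch does not attempt miniscoping, which §12.3 shows can shorten proofs further.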

References

B. Beckert and J. Posegga. leanTAP: Lean, tableau-based theorem proving. In A. Bundy, editor, Automated Deduction — CADE-12 International Conference, LNAI 814, pages 793–797. Springer, 1994.

R. E. Bryant. Symbolic boolean manipulation with ordered binary-decision diagrams. Computing Surveys, 24(3):293–318, Sept. 1992.

M. Huth and M. Ryan. Logic in Computer Science: Modelling and Reasoning about Systems. Cambridge University Press, 2nd edition, 2004.

G. Nelson and D. C. Oppen. Fast decision procedures based on congruence closure. J. ACM, 27(2):356–364, 1980.