
Automated Reasoning∗

Marek Kosta Christoph Weidenbach Summer Term 2012

What is it about?

Theory, Graphics, Data Bases, Programming Languages, Hardware, Bioinformatics, Verification

What is Automated Deduction about?

Generic Problem Solving by a Computer Program.

∗This document contains the text of the lecture slides (almost verbatim) plus some additional information, mostly proofs of theorems that are presented on the blackboard during the course. It is not a full script and does not contain the examples and additional explanations given during the lecture. Moreover it should not be taken as an example of how to write a research paper – neither stylistically nor typographically.

1 Introductory Example: Solving 4 × 4 Sudoku

Start:
    2 1 . .
    . . . .
    . . 3 1
    1 . 2 .

Solution:
    2 1 4 3
    3 4 1 2
    4 2 3 1
    1 3 2 4

Formal Model

Represent board by a function f(x, y) mapping cells to their value.


N = f(1, 1) ≈ 2 ∧ f(1, 2) ≈ 1 ∧ f(3, 3) ≈ 3 ∧ f(3, 4) ≈ 1 ∧ f(4, 1) ≈ 1 ∧ f(4, 3) ≈ 2

∧ is conjunction and ⊤ the empty conjunction. A state is described by a triple (N; D; r) where
• N contains the equations for the starting Sudoku,
• D is a conjunction of further equations computed by the algorithm,
• r ∈ {⊤, ⊥}.
The initial state is (N; ⊤; ⊤). A square f(x, y), x, y ∈ {1, 2, 3, 4}, is called defined by N ∧ D if there is an equation f(x, y) ≈ z, z ∈ {1, 2, 3, 4}, in N or D. Otherwise f(x, y) is called undefined.

Rule-Based Algorithm

Deduce (N; D; ⊤) ⇒ (N; D ∧ f(x, y) ≈ 1; ⊤)
provided f(x, y) is undefined in N ∧ D, for any x, y ∈ {1, 2, 3, 4}.

Conflict (N; D; ⊤) ⇒ (N; D; ⊥)
provided (i) f(x, y) = f(x, z) for f(x, y), f(x, z) defined in N ∧ D, for some x, y, z with y ≠ z, or (ii) f(y, x) = f(z, x) for f(y, x), f(z, x) defined in N ∧ D, for some x, y, z with y ≠ z, or (iii) f(x, y) = f(x′, y′) for f(x, y), f(x′, y′) defined in N ∧ D with [x, x′ ∈ {1, 2} or x, x′ ∈ {3, 4}] and [y, y′ ∈ {1, 2} or y, y′ ∈ {3, 4}] and (x ≠ x′ or y ≠ y′).

Backtrack (N; D′ ∧ f(x, y) ≈ z ∧ D″; ⊥) ⇒ (N; D′ ∧ f(x, y) ≈ z + 1; ⊤)
provided z < 4 and (D″ = ⊤ or D″ contains only equations of the form f(x′, y′) ≈ 4).

Fail (N; D; ⊥) ⇒ (N; ⊤; ⊥)
provided D ≠ ⊤ and D contains only equations of the form f(x, y) ≈ 4.

Properties: the rules are applied don't-care non-deterministically.
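The rules can be executed directly. Below is a minimal Python sketch (my own illustration, not part of the lecture): N and D hold the equations, the conflict test implements conditions (i)-(iii), and the loop applies Deduce, Backtrack and Fail in the obvious don't-care fashion.

    # Sketch of the rule-based 4x4 Sudoku solver; names are illustrative.

    def conflict(N, D):
        """Rule Conflict: a value repeats in a row, column or 2x2 box."""
        assign = dict(N)
        assign.update(D)
        cells = list(assign.items())
        for i, ((x, y), v) in enumerate(cells):
            for (x2, y2), v2 in cells[i + 1:]:
                if v != v2:
                    continue
                same_row = x == x2
                same_col = y == y2
                same_box = ((x - 1) // 2 == (x2 - 1) // 2
                            and (y - 1) // 2 == (y2 - 1) // 2)
                if same_row or same_col or same_box:
                    return True
        return False

    def solve(N):
        """Apply Deduce/Conflict/Backtrack/Fail until a final state."""
        D = []  # computed equations as a list of ((x, y), value) pairs
        while True:
            if conflict(N, D):
                # Backtrack: drop trailing equations f(x', y') = 4, then
                # increment the last remaining equation (its value is < 4).
                while D and D[-1][1] == 4:
                    D.pop()
                if not D:
                    return None          # Fail: no solution
                (x, y), v = D.pop()
                D.append(((x, y), v + 1))
            else:
                undefined = [(x, y) for x in range(1, 5) for y in range(1, 5)
                             if (x, y) not in N
                             and all(c != (x, y) for c, _ in D)]
                if not undefined:
                    return {**N, **dict(D)}   # final state: a solution
                D.append((undefined[0], 1))   # Deduce

    start = {(1, 1): 2, (1, 2): 1, (3, 3): 3, (3, 4): 1, (4, 1): 1, (4, 3): 2}
    print(solve(start))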

An algorithm (set of rules) is sound if whenever it declares having found a solution it actually has computed a solution. It is complete if it finds a solution if one exists. It is terminating if it never runs forever.

Proposition 0.1 (Soundness) The rules Deduce, Conflict, Backtrack and Fail are sound. Starting from an initial state (N; ⊤; ⊤): (i) for any final state (N; D; ⊤), the equations in N ∧ D are a solution, and (ii) for any final state (N; ⊤; ⊥) there is no solution to the initial problem.

Proof. (i) Assume a final state (N; D; ⊤) such that no rule is applicable. In particular, this means that for all x, y ∈ {1, 2, 3, 4} the square f(x, y) is defined in N ∧ D, since otherwise Deduce would be applicable, contradicting that (N; D; ⊤) is a final state. So all squares are defined by N ∧ D. What remains to be shown is that those assignments actually constitute a solution to the Sudoku. However, if some assignment in N ∧ D results in a repetition of a number in some column, row or 2 × 2 box of the Sudoku, then rule Conflict is applicable, contradicting that (N; D; ⊤) is a final state. In sum, (N; D; ⊤) is a solution to the Sudoku and hence the rules Deduce, Conflict, Backtrack and Fail are sound.

(ii) Assume that the initial problem (N; ⊤; ⊤) has a solution. We prove that in this case we cannot reach a state (N; ⊤; ⊥). Let (N; D; ⊤) be an arbitrary state still having a solution. This includes the initial state if D = ⊤. We prove that we can correctly decide the next square. Since (N; D; ⊤) still has a solution, the only applicable rule is Deduce and we generate (N; D ∧ f(x, y) ≈ 1; ⊤) for some x, y ∈ {1, 2, 3, 4}. If (N; D ∧ f(x, y) ≈ 1; ⊤) still has a solution we are done. So assume (N; D ∧ f(x, y) ≈ 1; ⊤) does not have a solution anymore. But then eventually we will apply Conflict and Backtrack to a state (N; D ∧ f(x, y) ≈ 1 ∧ D′; ⊥) where D′ only contains equations of the form f(x′, y′) ≈ 4, resulting in (N; D ∧ f(x, y) ≈ 2; ⊤). Now repeating the argument we will eventually reach a state (N; D ∧ f(x, y) ≈ k; ⊤) that has a solution. □

Proposition 0.2 (Completeness) The rules Deduce, Conflict, Backtrack and Fail are complete. For any solution N ∧ D of the Sudoku there is a sequence of rule applications such that (N; D; ⊤) is a final state.

Proof. A particular strategy for the rule applications is needed to indeed generate (N; D; ⊤) out of (N; ⊤; ⊤) for some solution N ∧ D. Without loss of generality we assume the assignments in D to be sorted such that assignments to a number k ∈ {1, 2, 3, 4} precede any assignment to some number l > k. So if, for example, N does not assign all four values 1, then the first assignment in D is of the form f(x, y) ≈ 1 for some x, y. Now we apply the following strategy, subsequently adding all assignments from D to (N; ⊤; ⊤). Let us assume we have already achieved state (N; D′; ⊤) and the next assignment from D to be established is f(x, y) ≈ k, meaning f(x, y) is not defined in N ∧ D′. Then until l = k we do the following, starting from l = 1. We apply Deduce, adding the assignment f(x, y) ≈ l. If Conflict is applicable to this assignment, we apply Backtrack, generating the new assignment f(x, y) ≈ l + 1, and continue. We need to show that this strategy in fact eventually adds f(x, y) ≈ k to D′. As long as l < k, the assignment f(x, y) ≈ l conflicts with an assignment already present in N ∧ D′: the solution N ∧ D assigns l to a different square of the row, column or box of f(x, y), and by the assumed ordering of D this assignment is already contained in N ∧ D′. Hence Conflict and Backtrack apply and produce f(x, y) ≈ l + 1. Once l = k, no conflict arises, because N ∧ D′ ∧ f(x, y) ≈ k is part of the solution N ∧ D. So f(x, y) ≈ k is established, and repeating the argument for all assignments of D eventually yields the final state (N; D; ⊤). □

Proposition 0.3 (Termination) The rules Deduce, Conflict, Backtrack and Fail terminate on any input state (N; ⊤; ⊤).

Proof. Once the rule Fail is applicable, no other rule is applicable on the result anymore. So we do not need to consider rule Fail for termination. The idea of the proof is that we assign a measure over the natural numbers to every state such that each rule

strictly decreases the measure and that the measure cannot get below 0. For any given state S = (N; D; ∗) with ∗ ∈ {⊤, ⊥} and D = f(x1, y1) ≈ k1 ∧ ... ∧ f(xn, yn) ≈ kn we assign the measure µ(S) by µ(S) = 2^49 − p − Σ_{i=1}^{n} k_i · 2^(49−3i), where p = 0 if ∗ = ⊤ and p = 1 otherwise. The measure µ(S) is well-defined and cannot become negative as n ≤ 16, p ≤ 1, and 1 ≤ k_i ≤ 4 for any D. In particular, the former holds because the rule Deduce only adds values for undefined squares and the overall number of squares is bound by 16. What remains to be shown is that each rule application decreases µ. We do this by a case analysis over the rules.

Deduce: µ((N; D; ⊤)) = 2^49 − Σ_{i=1}^{n} k_i · 2^(49−3i) > 2^49 − Σ_{i=1}^{n} k_i · 2^(49−3i) − 1 · 2^(49−3(n+1)) = µ((N; D ∧ f(x, y) ≈ 1; ⊤))

Conflict: µ((N; D; ⊤)) = 2^49 − Σ_{i=1}^{n} k_i · 2^(49−3i) > 2^49 − 1 − Σ_{i=1}^{n} k_i · 2^(49−3i) = µ((N; D; ⊥))

Backtrack: µ((N; D′ ∧ f(x_l, y_l) ≈ k_l ∧ D″; ⊥)) = 2^49 − 1 − (Σ_{i=1}^{l−1} k_i · 2^(49−3i)) − k_l · 2^(49−3l) − Σ_{i=l+1}^{n} k_i · 2^(49−3i) > 2^49 − (Σ_{i=1}^{l−1} k_i · 2^(49−3i)) − (k_l + 1) · 2^(49−3l) = µ((N; D′ ∧ f(x_l, y_l) ≈ k_l + 1; ⊤)), in particular because 2^(49−3l) > Σ_{i=l+1}^{n} k_i · 2^(49−3i) + 1.

Confluence

Another important property of don't-care non-deterministic rule-based definitions of algorithms is confluence. It means that whenever several rules are applicable to a given state, the respective results can be rejoined by further rule applications to a common problem state.

Proposition 0.4 (Deduce and Conflict are Locally Confluent) Given a state (N; D; ⊤) out of which two different states (N; D1; ⊤) and (N; D2; ⊥) can be generated by Deduce and Conflict in one step, respectively, then the two states can be rejoined to a state (N; D′; ∗) via further rule applications.

Proof. Consider an application of Deduce and of Conflict to a state (N; D; ⊤), resulting in (N; D ∧ f(x, y) ≈ 1; ⊤) and (N; D; ⊥), respectively. We now show that we can in fact rejoin the two states. Notice that since Conflict is applicable to (N; D; ⊤), it is also applicable to (N; D ∧ f(x, y) ≈ 1; ⊤). So the first sequence of rejoin steps is (N; D ∧ f(x, y) ≈ 1; ⊤) → (N; D ∧ f(x, y) ≈ 1; ⊥) → (N; D ∧ f(x, y) ≈ 2; ⊤) →∗ (N; D ∧ f(x, y) ≈ 4; ⊥), where we subsequently applied Conflict and Backtrack to reach the state (N; D ∧ f(x, y) ≈ 4; ⊥) and →∗ abbreviates that finite number of rule applications. Finally, applying Backtrack (or Fail) to (N; D; ⊥) and to (N; D ∧ f(x, y) ≈ 4; ⊥) results in the same state. □

Result

It works. But: it looks like a lot of effort for a problem that one can solve with a little bit of thinking. Reason: our approach is very general; it can actually be used to "potentially solve" any problem in computer science. This difference is also important for automated reasoning: • For problems that are well-known and frequently used, we can develop optimal specialized methods. ⇒ Algorithms & Data Structures

• For new/unknown/changing problems, we have to develop generic methods that do “something useful”. ⇒ this lecture: Logic + Calculus + Implementation • Combining the two approaches ⇒ Automated Reasoning II (next semester): Logic modulo Theory + Calculus + Implementation

Topics of the Course

Preliminaries
• math repetition
• computer science repetition
• orderings
• induction (repetition)
• rewrite systems

Propositional logic
• logic: syntax, semantics
• calculi: superposition, CDCL
• implementation: 2-watched literal, clause learning

First-order predicate logic
• logic: syntax, semantics, model theory
• calculus: superposition
• implementation: sharing, indexing

First-order predicate logic with equality
• equational logic (unit equations): term rewriting systems, Knuth-Bendix completion; implementation: dependency pairs
• first-order logic with equality: superposition; implementation: rewriting

Literature

Literature is a bit of a problem; actually, you are the "guinea pigs" for a new textbook.

Franz Baader and Tobias Nipkow: Term Rewriting and All That, Cambridge University Press, 1998. (Textbook on equational reasoning)

Armin Biere, Marijn Heule, Hans van Maaren and Toby Walsh (editors): Handbook of Satisfiability, IOS Press, 2009. (Be careful: handbook, hard to read)

Alan Robinson and Andrei Voronkov (editors): Handbook of Automated Reasoning, Vol. I & II, Elsevier, 2001. (Be careful: handbook, very hard to read)

1 Preliminaries

• math repetition
• computer science repetition
• orderings
• induction (repetition)
• rewrite systems

1.1 Mathematical Prerequisites

N = {0, 1, 2, ...} is the set of natural numbers.
N+ is the set of positive natural numbers (without 0).
Z, Q, R denote the integers, rational numbers and real numbers, respectively.

7 Multisets

Given a set M, a multi-set S over M is a mapping S : M → N, where S specifies the number of occurrences of elements m of the base set M within the multiset S. We use the standard set notations ∈, ⊂, ⊆, ∪, ∩ with the analogous meaning for multisets, e.g., (S1 ∪ S2)(m)= S1(m)+ S2(m). We also write multi-sets in a set like notation, e.g., the multi-set S = {1, 2, 2, 4} denotes a multi-set over the set {1, 2, 3, 4} where S(1) = 1, S(2) = 2, S(3) = 0, and S(4) = 1. A multi-set S over a set M is finite if {m ∈ M | S(m) > 0} is finite. In this lecture we only consider finite multi-sets.

Relations

An n-ary relation R over some set M is a subset of M^n: R ⊆ M^n. For two n-ary relations R, Q over some set M, their union (∪) or intersection (∩) is again an n-ary relation, where

R ∪ Q := {(m1, ..., mn) ∈ M^n | (m1, ..., mn) ∈ R or (m1, ..., mn) ∈ Q}
R ∩ Q := {(m1, ..., mn) ∈ M^n | (m1, ..., mn) ∈ R and (m1, ..., mn) ∈ Q}

A relation Q is a subrelation of a relation R if Q ⊆ R. The characteristic function of a relation R, sometimes called its predicate, indicates membership. In addition to writing (m1, ..., mn) ∈ R we also write R(m1, ..., mn). So the predicate R(m1, ..., mn) holds or is true if in fact (m1, ..., mn) belongs to the relation R.

Words

Given a nonempty alphabet Σ, the set Σ∗ of finite words over Σ is defined by (i) the empty word ǫ ∈ Σ∗, (ii) for each letter a ∈ Σ also a ∈ Σ∗, and (iii) if u, v ∈ Σ∗ then uv ∈ Σ∗, where uv denotes the concatenation of u and v. The length |u| of a word u ∈ Σ∗ is defined by (i) |ǫ| := 0, (ii) |a| := 1 for any a ∈ Σ, and (iii) |uv| := |u| + |v| for any u, v ∈ Σ∗.

1.2 Computer Science Prerequisites

A little bit on computational complexity theory.

Big O

Let f(n) and g(n) be functions from the naturals into the non-negative reals. Then

O(f(n)) = {g(n) | ∃ c > 0 ∃ n0 ∈ N+ ∀ n ≥ n0 : g(n) ≤ c · f(n)}

We use ∀, reads “for all”, and ∃, reads “exists”, on the object and meta level.

Decision Problem

A decision problem is a subset L ⊆ Σ∗ for some fixed finite alphabet Σ. The function chr(L, u) denotes the characteristic function of the decision problem L and is defined by chr(L, u) = 1 if u ∈ L and chr(L, u) = 0 otherwise. A decision problem is solvable in polynomial time iff its characteristic function can be computed in polynomial time. The class P denotes all polynomial-time decision problems.

NP

A decision problem L is in NP iff there is a predicate Q(x, y) and a polynomial p(n) such that for all u ∈ Σ∗ we have (i) u ∈ L iff there is a v ∈ Σ∗ with |v| ≤ p(|u|) and Q(u, v) holds, and (ii) the predicate Q is in P.

Reducible, NP-Hard, NP-Complete

A decision problem L is polynomial-time reducible to a decision problem L′ if there is a function g ∈ P such that for all u ∈ Σ∗ we have u ∈ L iff g(u) ∈ L′. For example, if L is reducible to L′ and L′ ∈ P then L ∈ P. A decision problem is NP-hard if every problem in NP is polynomial-time reducible to it. A decision problem is NP-complete if it is NP-hard and in NP.

1.3 Ordering

Termination of rewrite systems and proof theory are strongly related to the concept of (well-founded) orderings. An ordering R is a binary relation on some set M. Relevant properties of such relations include:

(reflexivity) ∀ x ∈ M R(x, x)
(irreflexivity) ∀ x ∈ M ¬R(x, x)
(antisymmetry) ∀ x, y ∈ M (R(x, y) ∧ R(y, x) → x = y)
(transitivity) ∀ x, y, z ∈ M (R(x, y) ∧ R(y, z) → R(x, z))
(totality) ∀ x, y ∈ M (R(x, y) ∨ R(y, x))

where = is the identity relation on M. The boolean connectives ∧, ∨, and → read "and", "or", and "implies", respectively.

Partial Ordering

A strict partial ordering ≻ on a set M is a transitive and irreflexive binary relation on M. An a ∈ M is called minimal if there is no b ∈ M such that a ≻ b. An a ∈ M is called smallest if b ≻ a for all b ∈ M different from a.
Notation: ≺ denotes the inverse relation ≻−1; ≽ denotes the reflexive closure (≻ ∪ =) of ≻.

Well-Foundedness

A strict partial ordering ≻ on M is called well-founded (Noetherian), if there is no infinite descending chain a0 ≻ a1 ≻ a2 ≻ ... with ai ∈ M.

Well-Foundedness and Termination

Let →, > be binary relations on the same set.

Lemma 1.1 If > is a well-founded partial ordering and → ⊆ >, then → is terminating.

Lemma 1.2 If → is a terminating binary relation over A, then →+ is a well-founded partial ordering.

Proof. Transitivity of →+ is obvious; irreflexivity and well-foundedness follow from termination of →. □

Well-Founded Orderings: Examples

Natural numbers. (N,>)

Lexicographic orderings. Let (M1, ≻1), (M2, ≻2) be well-founded orderings. Then let their lexicographic combination

≻ =(≻1, ≻2)lex

on M1 × M2 be defined as

(a1, a2) ≻ (b1, b2):⇔ a1 ≻1 b1 or (a1 = b1 and a2 ≻2 b2)

(analogously for more than two orderings) This again yields a well-founded ordering (proof below).

Length-based ordering on words. For alphabets Σ with a well-founded ordering >Σ, the relation ≻ defined as

w ≻ w′ :⇔ |w| > |w′| or (|w| = |w′| and w >Σ,lex w′)

is a well-founded ordering on Σ∗ (Exercise).
Counterexamples: (Z, >); (N, <); the lexicographic ordering on Σ∗.

Basic Properties of Well-Founded Orderings

Lemma 1.3 (M, ≻) is well-founded if and only if every ∅ ⊂ M ′ ⊆ M has a minimal element.

Lemma 1.4 (M1, ≻1) and (M2, ≻2) are well-founded if and only if (M1 × M2, ≻) with ≻ =(≻1, ≻2)lex is well-founded.

Proof. (i) “⇒”: Suppose (M1 × M2, ≻) is not well-founded. Then there is an infinite sequence (a0, b0) ≻ (a1, b1) ≻ (a2, b2) ≻ ... .

Let A = { ai | i ≥ 0 } ⊆ M1. Since (M1, ≻1) is well-founded, A has a minimal element an. But then B = { bi | i ≥ n } ⊆ M2 cannot have a minimal element, contradicting the well-foundedness of (M2, ≻2). (ii) "⇐": obvious. □

Monotone Mappings

Let (M1,>1) and (M2,>2) be strict partial orderings. A mapping ϕ : M1 → M2 is called monotone, if a>1 b implies ϕ(a) >2 ϕ(b) for all a, b ∈ M1.

Lemma 1.5 If ϕ is a monotone mapping from (M1,>1) to (M2,>2) and (M2,>2) is well-founded, then (M1,>1) is well-founded.

Multiset Orderings

Lemma 1.6 (König's Lemma) Every finitely branching tree with infinitely many nodes contains an infinite path.

Let (M, ≻) be a strict partial ordering. The multiset extension of ≻ to multisets over M is defined by

S1 ≻mul S2 :⇔

S1 ≠ S2 and

∀ m ∈ M: S2(m) > S1(m) ⇒ ∃ m′ ∈ M: m′ ≻ m and S1(m′) > S2(m′)
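As a sanity check one can program this definition directly. A small Python sketch (my own naming), assuming multisets are represented as collections.Counter and the base ordering is passed in as a function:

    from collections import Counter

    def mul_greater(s1, s2, greater):
        """S1 >mul S2: S1 != S2 and for every m occurring more often in S2,
        some m' > m occurs more often in S1 (definition above)."""
        if s1 == s2:
            return False
        elems = set(s1) | set(s2)
        for m in elems:
            if s2[m] > s1[m]:
                if not any(greater(mp, m) and s1[mp] > s2[mp] for mp in elems):
                    return False
        return True

    # Example over (N, >): {5, 3} >mul {4, 4, 1}
    print(mul_greater(Counter([5, 3]), Counter([4, 4, 1]), lambda a, b: a > b))  # True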

1.4 Induction

More or less all sets of objects in computer science or logic are defined inductively. Typically, this is done in a bottom-up way, where starting with some definite set, it is closed under a given set of operations.

Example 1.7 (Inductive Sets)
1. The set of all Sudoku problem states consists of the set of start states (N; ⊤; ⊤) for consistent assignments N plus all states that can be derived from the start states by the rules Deduce, Conflict, Backtrack, and Fail. This is a finite set.
2. The set N of the natural numbers consists of 0 plus all numbers that can be computed from 0 by adding 1. This is an infinite set.
3. The set of all strings Σ∗ over a finite alphabet Σ, where all letters of Σ are contained in Σ∗ and if u and v are words out of Σ∗ then so is the word uv. This is an infinite set.

All the previous examples have in common that there is an underlying well-founded ordering on the sets induced by the construction. The minimal elements for the Sudoku are the problem states (N; ⊤; ⊤), for the natural numbers it is 0, and for the set of strings it is the empty word. Now if we want to prove a property of an inductive set, it is sufficient to prove it (i) for the minimal element(s) and (ii), assuming the property for an arbitrary set of elements, to prove that it holds for all elements that can be constructed "in one step" out of those elements. This is the principle of Noetherian induction.

Theorem 1.8 (Noetherian Induction) Let (M, ≻) be a well-founded ordering and let Q be a property of elements of M. If for all m ∈ M the implication

if Q(m′) for all m′ ∈ M such that m ≻ m′ (induction hypothesis),
then Q(m) (induction step)

is satisfied, then the property Q(m) holds for all m ∈ M.

Proof. Let X = { m ∈ M | Q(m) false }. Suppose X ≠ ∅. Since (M, ≻) is well-founded, X has a minimal element m1. Hence for all m′ ∈ M with m′ ≺ m1 the property Q(m′) holds. On the other hand, the implication which is presupposed for this theorem holds in particular also for m1, hence Q(m1) must be true, so that m1 cannot be in X. Contradiction. □


Theorem 1.9 (Properties of the Multiset Ordering) (a) ≻mul is a strict partial ordering. (b) ≻ well-founded ⇒ ≻mul well-founded. (c) ≻ total ⇒ ≻mul total.

Proof. See Baader and Nipkow, pages 22–24. □

1.5 Rewrite Systems

A rewrite system is a pair (A, →), where A is a set, → ⊆ A × A is a binary relation on A. The relation → is usually written in infix notation, i. e., a → b instead of (a, b) ∈→.

Let →′ ⊆ A × B and →′′ ⊆ B × C be two binary relations. Then the binary relation (→′ ◦→′′) ⊆ A × C is defined by

a (→′ ◦→′′) c if and only if a →′ b and b →′′ c for some b ∈ B.

→0 = { (a, a) | a ∈ A } (identity)
→i+1 = →i ◦ → ((i+1)-fold composition)
→+ = ⋃_{i>0} →i (transitive closure)
→∗ = ⋃_{i≥0} →i = →+ ∪ →0 (reflexive transitive closure)
→= = → ∪ →0 (reflexive closure)
→−1 = ← = { (b, c) | c → b } (inverse)
↔ = → ∪ ← (symmetric closure)
↔+ = (↔)+ (transitive symmetric closure)
↔∗ = (↔)∗ (reflexive transitive symmetric closure)

b ∈ A is reducible, if there is a c such that b → c. b is in normal form (irreducible), if it is not reducible. c is a normal form of b, if b →∗ c and c is in normal form. Notation: c = b↓ (if the normal form of b is unique).

A relation → is called

terminating, if there is no infinite descending chain b0 → b1 → b2 → ... . normalizing, if every b ∈ A has a normal form.

Lemma 1.10 If → is terminating, then it is normalizing.

Note: The reverse implication does not hold. For example, the relation → = {(a, a), (a, b)} is normalizing (b is a normal form of a) but not terminating (a → a → ···).
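For finite relations these notions are easy to experiment with. The following Python sketch (my own helper, which loops forever on non-terminating relations) computes the set of normal forms reachable from an element:

    def normal_forms(rel, a):
        """All normal forms reachable from a, for a finite terminating
        relation rel given as a set of pairs."""
        succ = [c for (b, c) in rel if b == a]
        if not succ:                       # a is irreducible
            return {a}
        return set().union(*(normal_forms(rel, c) for c in succ))

    # a -> b, a -> c, c -> b: confluent, unique normal form b
    print(normal_forms({("a", "b"), ("a", "c"), ("c", "b")}, "a"))  # {'b'}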

Confluence

Let (A, →) be a rewrite system. b, c ∈ A are joinable, if there is an a such that b →∗ a ∗← c. Notation: b ↓ c. The relation → is called
Church-Rosser, if b ↔∗ c implies b ↓ c,
confluent, if b ∗← a →∗ c implies b ↓ c,
locally confluent, if b ← a → c implies b ↓ c,
convergent, if it is confluent and terminating.

For a rewrite system (M, →) consider a sequence of elements ai that are pairwise con- nected by the symmetric closure, i.e., a1 ↔ a2 ↔ a3 ... ↔ an. We say that ai is a peak in such a sequence, if actually ai−1 ← ai → ai+1.

Theorem 1.11 The following properties are equivalent: (i) → has the Church-Rosser property. (ii) → is confluent.

Proof. (i)⇒(ii): trivial. (ii)⇒(i): by induction on the number of peaks in the derivation b ↔∗ c. □

Lemma 1.12 If → is confluent, then every element has at most one normal form.

Proof. Suppose that some element a ∈ A has normal forms b and c; then b ∗← a →∗ c. If → is confluent, then b →∗ d ∗← c for some d ∈ A. Since b and c are normal forms, both derivations must be empty, hence b →0 d 0← c, so b, c, and d must be identical. □

Corollary 1.13 If → is normalizing and confluent, then every element b has a unique normal form.

Proposition 1.14 If → is normalizing and confluent, then b ↔∗ c if and only if b↓ = c↓.

Proof. Either using Thm. 1.11 or directly by induction on the length of the derivation of b ↔∗ c. □

Confluence and Local Confluence

Theorem 1.15 (“Newman’s Lemma”) If a terminating relation → is locally conflu- ent, then it is confluent.

Proof. Let → be a terminating and locally confluent relation. Then →+ is a well-founded ordering. Define Q(a) :⇔ ∀ b, c: b ∗← a →∗ c ⇒ b ↓ c. We prove Q(a) for all a ∈ A by well-founded induction over →+:
Case 1: b 0← a →∗ c: trivial.
Case 2: b ∗← a →0 c: trivial.
Case 3: b ∗← b′ ← a → c′ →∗ c: use local confluence, then use the induction hypothesis. □

2 Propositional Logic

Propositional logic
• logic of truth values
• decidable (but NP-complete)
• can be used to describe functions over a finite domain
• industry standard for many analysis/verification tasks
• growing importance for discrete optimization problems (Automated Reasoning II)

2.1 Syntax

• propositional variables
• logical connectives
⇒ Boolean connectives and constants

Propositional Variables

Let Σ be a set of propositional variables, also called the signature of the (propositional) logic. We use letters P, Q, R, S to denote propositional variables.

Propositional Formulas

PROP(Σ) is the set of propositional formulas over Σ, inductively defined as follows:

φ, ψ ::= ⊥ (falsum)
| ⊤ (verum)
| P, P ∈ Σ (atomic formula)
| (¬φ) (negation)
| (φ ∧ ψ) (conjunction)
| (φ ∨ ψ) (disjunction)
| (φ → ψ) (implication)
| (φ ↔ ψ) (equivalence)

17 Notational Conventions

As a notational convention we assume that ¬ binds strongest, and we remove outermost parentheses, so ¬P ∨ Q is actually a shorthand for ((¬P) ∨ Q). For all other logical connectives we will explicitly put parentheses when needed. From the semantics we will see that ∧ and ∨ are associative and commutative. Therefore instead of ((P ∧ Q) ∧ R) we simply write P ∧ Q ∧ R. Automated reasoning is very much formula manipulation. In order to precisely represent the manipulation of a formula, we introduce positions.

Formula Manipulation

A position is a word over N. The set of positions of a formula φ is inductively defined by

pos(φ) := {ǫ} if φ ∈ {⊤, ⊥} or φ ∈ Σ
pos(¬φ) := {ǫ} ∪ {1p | p ∈ pos(φ)}
pos(φ ◦ ψ) := {ǫ} ∪ {1p | p ∈ pos(φ)} ∪ {2p | p ∈ pos(ψ)}

where ◦ ∈ {∧, ∨, →, ↔}. The prefix order ≤ on positions is defined by p ≤ q if there is some p′ such that pp′ = q. Note that the prefix order is partial, e.g., the positions 12 and 21 are not comparable, they are "parallel", see below. By < we denote the strict part of ≤, i.e., p < q if p ≤ q but not q ≤ p. By ‖ we denote incomparable positions, i.e., p ‖ q if neither p ≤ q nor q ≤ p. Then we say that p is above q if p ≤ q, p is strictly above q if p < q, and p and q are parallel if p ‖ q. The size of a formula φ is given by the cardinality of pos(φ): |φ| := |pos(φ)|. The subformula of φ at position p ∈ pos(φ) is recursively defined by

φ|ǫ := φ
(¬φ)|1p := φ|p
(φ1 ◦ φ2)|ip := φi|p where i ∈ {1, 2}

◦∈{∧, ∨, →, ↔}. Finally, the replacement of a subformula at position p ∈ pos(φ) by a formula ψ is recursively defined by

φ[ψ]ǫ := ψ
(¬φ)[ψ]1p := ¬(φ[ψ]p)
(φ1 ◦ φ2)[ψ]1p := (φ1[ψ]p ◦ φ2)
(φ1 ◦ φ2)[ψ]2p := (φ1 ◦ φ2[ψ]p)

where ◦ ∈ {∧, ∨, →, ↔}.

Example 2.1 The set of positions for the formula φ =(A ∧ B) → (A ∨ B) is pos(φ)= {ǫ, 1, 11, 12, 2, 21, 22}. The subformula at position 22 is B, φ|22 = B and replacing this formula by A ↔ B results in φ[A ↔ B]22 =(A ∧ B) → (A ∨ (A ↔ B)).

A further prerequisite for efficient formula manipulation is the polarity of a subformula ψ of φ. The polarity determines the number of "negations" starting from φ down to ψ. It is 1 for an even number along the path, −1 for an odd number, and 0 if there is at least one equivalence connective along the path. The polarity of a subformula of φ at position p is recursively defined by (i ∈ {1, 2}):

pol(φ, ǫ) := 1
pol(¬φ, 1p) := −pol(φ, p)
pol(φ1 ◦ φ2, ip) := pol(φi, p) if ◦ ∈ {∧, ∨}
pol(φ1 → φ2, 1p) := −pol(φ1, p)
pol(φ1 → φ2, 2p) := pol(φ2, p)
pol(φ1 ↔ φ2, ip) := 0

Example 2.2 We reuse the formula φ = (A ∧ B) → (A ∨ B). Then pol(φ, 1) = pol(φ, 11) = −1 and pol(φ, 2) = pol(φ, 22) = 1. For the formula φ′ = (A ∧ B) ↔ (A ∨ B) we get pol(φ′, ǫ) = 1 and pol(φ′, p) = 0 for all other p ∈ pos(φ′), p ≠ ǫ.
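Positions, subformulas and polarity translate directly into code. A Python sketch under the assumption that formulas are nested tuples and positions are strings over {'1', '2'} (this encoding and all names are mine, not the lecture's):

    # Formulas as nested tuples, e.g. ('->', ('and', 'A', 'B'), ('or', 'A', 'B')).
    # '' plays the role of the empty position.

    def pos(phi):
        if isinstance(phi, str):                 # atom, 'T' or 'F'
            return {''}
        if phi[0] == 'not':
            return {''} | {'1' + p for p in pos(phi[1])}
        return ({''} | {'1' + p for p in pos(phi[1])}
                     | {'2' + p for p in pos(phi[2])})

    def at(phi, p):                              # subformula phi|p
        for i in p:
            phi = phi[int(i)]
        return phi

    def pol(phi, p):                             # polarity of phi at p
        if p == '':
            return 1
        i, rest = int(p[0]), p[1:]
        if phi[0] == 'not':
            return -pol(phi[1], rest)
        if phi[0] == '<->':
            return 0
        if phi[0] == '->' and i == 1:
            return -pol(phi[1], rest)
        return pol(phi[i], rest)

    phi = ('->', ('and', 'A', 'B'), ('or', 'A', 'B'))
    print(sorted(pos(phi)))           # ['', '1', '11', '12', '2', '21', '22']
    print(at(phi, '22'), pol(phi, '11'), pol(phi, '22'))   # B -1 1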

2.2 Semantics

In classical logic (dating back to Aristotle) there are "only" two truth values "true" and "false", which we shall denote, respectively, by 1 and 0. There are multi-valued logics having more than two truth values.

Valuations

A propositional variable has no intrinsic meaning. The meaning of a propositional variable has to be defined by a valuation. A Σ-valuation is a map

A : Σ → {0, 1}, where {0, 1} is the set of truth values.

Truth Value of a Formula in A

Given a Σ-valuation A, the function can be extended to A : PROP(Σ) →{0, 1} by:

A(⊥) = 0
A(⊤) = 1
A(¬φ) = 1 − A(φ)
A(φ ∧ ψ) = min({A(φ), A(ψ)})
A(φ ∨ ψ) = max({A(φ), A(ψ)})
A(φ → ψ) = max({(1 − A(φ)), A(ψ)})
A(φ ↔ ψ) = if A(φ) = A(ψ) then 1 else 0
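With the tuple encoding of the previous sketch, this table becomes a direct evaluator, which also yields the truth-table method used below: a formula is valid iff it evaluates to 1 under all 2^n valuations. A sketch (names are mine):

    import itertools

    def value(A, phi):
        """Truth value of phi under the valuation A (a dict variable -> 0/1)."""
        if phi == 'F': return 0
        if phi == 'T': return 1
        if isinstance(phi, str): return A[phi]
        op = phi[0]
        if op == 'not': return 1 - value(A, phi[1])
        a, b = value(A, phi[1]), value(A, phi[2])
        return {'and': min(a, b), 'or': max(a, b),
                '->': max(1 - a, b), '<->': int(a == b)}[op]

    phi = ('->', ('and', 'A', 'B'), ('or', 'A', 'B'))
    vs = ['A', 'B']
    # validity check: true under all 2^n valuations
    print(all(value(dict(zip(vs, bits)), phi)
              for bits in itertools.product([0, 1], repeat=len(vs))))  # True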

2.3 Models, Validity, and Satisfiability

φ is valid in A (A is a model of φ; φ holds under A):

A|= φ :⇔ A(φ)=1

φ is valid (or is a tautology):

|= φ :⇔ A|= φ for all Σ-valuations A

φ is called satisfiable if there exists an A such that A |= φ. Otherwise φ is called unsatisfiable (or contradictory).

20 Entailment and Equivalence

φ entails (implies) ψ (or ψ is a consequence of φ), written φ |= ψ, if for all Σ-valuations A we have A|= φ ⇒ A|= ψ. φ and ψ are called equivalent, written φ |=| ψ, if for all Σ-valuations A we have A |= φ ⇔ A|= ψ.

Proposition 2.3 φ |= ψ if and only if |=(φ → ψ).

Proof. (⇒) Suppose that φ entails ψ. Let A be an arbitrary Σ-valuation. We have to show that A |= φ → ψ. If A(φ) = 1, then A(ψ) = 1 (since φ |= ψ), and hence A(φ → ψ) = 1. Otherwise, if A(φ) = 0, then A(φ → ψ) = max({1, A(ψ)}) = 1 independently of A(ψ). In both cases, A |= φ → ψ.
(⇐) Suppose that φ does not entail ψ. Then there exists a Σ-valuation A such that A |= φ, but not A |= ψ. Consequently, A(φ → ψ) = max({(1 − A(φ)), A(ψ)}) = max({0, 0}) = 0, so (φ → ψ) does not hold in A. □

Proposition 2.4 φ |=| ψ if and only if |=(φ ↔ ψ).

Proof. Analogously to Prop. 2.3. □

Entailment is extended to sets of formulas N in the "natural way": N |= φ if for all Σ-valuations A: if A |= ψ for all ψ ∈ N, then A |= φ. Note: formulas are always finite objects, but sets of formulas may be infinite. Therefore, it is in general not possible to replace a set of formulas by the conjunction of its elements.

Validity vs. Unsatisfiability

Validity and unsatisfiability are just two sides of the same coin, as explained by the following proposition.

Proposition 2.5 φ is valid if and only if ¬φ is unsatisfiable.

Proof. (⇒) If φ is valid, then A(φ) = 1 for every valuation A. Hence A(¬φ) = (1 − A(φ)) = 0 for every valuation A, so ¬φ is unsatisfiable. (⇐) Analogously. □

Hence in order to design a theorem prover (validity checker) it is sufficient to design a checker for unsatisfiability. In a similar way, entailment N |= φ can be reduced to unsatisfiability:

Proposition 2.6 N |= φ if and only if N ∪ {¬φ} is unsatisfiable.

Checking Unsatisfiability

Every formula φ contains only finitely many propositional variables. Obviously, A(φ) depends only on the values of those finitely many variables in φ under A. If φ contains n distinct propositional variables, then it is sufficient to check 2^n valuations to see whether φ is satisfiable or not ⇒ truth table. So the satisfiability problem is clearly decidable (but, by Cook's Theorem, NP-complete). Nevertheless, in practice, there are (much) better methods than truth tables to check the satisfiability of a formula (more on this later).

Truth Table

Let φ be a propositional formula over variables P1, ..., Pn and k = |pos(φ)|. Then a complete truth table for φ is a table with n + k columns and 2^n + 1 rows of the form

P1 ... Pn    φ|p1 ... φ|pk
0  ... 0     A1(φ|p1) ... A1(φ|pk)
...          ...
1  ... 1     A_{2^n}(φ|p1) ... A_{2^n}(φ|pk)

such that the Ai are exactly the 2^n different valuations for P1, ..., Pn, and either pi ‖ pi+j or pi ≥ pi+j for all i, j ≥ 0, i + j ≤ k; in particular pk = ǫ and φ|pk = φ. Truth tables can be used to check validity, satisfiability or unsatisfiability of a formula in a systematic way. They have the nice property that if the rows are filled from left to right, then in order to compute Ai(φ|pj) the values under Ai of the φ|pjh, h ∈ {1, 2}, are already computed.

Substitution Theorem

Proposition 2.7 Let φ1 and φ2 be equivalent formulas, and ψ[φ1]p be a formula in which φ1 occurs as a subformula at position p.

Then ψ[φ1]p is equivalent to ψ[φ2]p.

Proof. The proof proceeds by induction over the formula structure of ψ. Each of the formulas ⊥, ⊤, and P for P ∈ Σ contains only one subformula, namely itself. Hence, if ψ = ψ[φ1]ǫ equals ⊥, ⊤, or P , then ψ[φ1]ǫ = φ1, ψ[φ2]ǫ = φ2, and we are done by assumption.

If ψ = ψ1 ∧ ψ2, then either p = ǫ (this case is treated as above), or φ1 is a subformula of ψ1 or ψ2 at position 1p′ or 2p′, respectively. Without loss of generality, assume that φ1 is a subformula of ψ1, so ψ = ψ1[φ1]p′ ∧ ψ2. By the induction hypothesis, ψ1[φ1]p′ and ψ1[φ2]p′ are equivalent. Hence, for any valuation A, A(ψ[φ1]1p′) = A(ψ1[φ1]p′ ∧ ψ2) = min({A(ψ1[φ1]p′), A(ψ2)}) = min({A(ψ1[φ2]p′), A(ψ2)}) = A(ψ1[φ2]p′ ∧ ψ2) = A(ψ[φ2]1p′). The other boolean connectives are handled analogously. □

Equivalences

Proposition 2.8 The following equivalences are valid for all formulas φ,ψ,χ:

(φ ∧ φ) ↔ φ   Idempotency ∧
(φ ∨ φ) ↔ φ   Idempotency ∨
(φ ∧ ψ) ↔ (ψ ∧ φ)   Commutativity ∧
(φ ∨ ψ) ↔ (ψ ∨ φ)   Commutativity ∨
(φ ∧ (ψ ∧ χ)) ↔ ((φ ∧ ψ) ∧ χ)   Associativity ∧
(φ ∨ (ψ ∨ χ)) ↔ ((φ ∨ ψ) ∨ χ)   Associativity ∨
(φ ∧ (ψ ∨ χ)) ↔ ((φ ∧ ψ) ∨ (φ ∧ χ))   Distributivity ∧∨
(φ ∨ (ψ ∧ χ)) ↔ ((φ ∨ ψ) ∧ (φ ∨ χ))   Distributivity ∨∧

(φ ∧ (φ ∨ ψ)) ↔ φ   Absorption ∧∨
(φ ∨ (φ ∧ ψ)) ↔ φ   Absorption ∨∧
(φ ∧ ¬φ) ↔ ⊥   Introduction ⊥
(φ ∨ ¬φ) ↔ ⊤   Introduction ⊤

¬(φ ∨ ψ) ↔ (¬φ ∧ ¬ψ)   De Morgan ¬∨
¬(φ ∧ ψ) ↔ (¬φ ∨ ¬ψ)   De Morgan ¬∧
¬⊤ ↔ ⊥   Propagate ¬⊤
¬⊥ ↔ ⊤   Propagate ¬⊥

(φ ∧ ⊤) ↔ φ   Absorption ⊤∧
(φ ∨ ⊥) ↔ φ   Absorption ⊥∨
(φ → ⊥) ↔ ¬φ   Eliminate ⊥→
(φ ↔ ⊥) ↔ ¬φ   Eliminate ⊥↔
(φ ↔ ⊤) ↔ φ   Eliminate ⊤↔
(φ ∨ ⊤) ↔ ⊤   Propagate ⊤
(φ ∧ ⊥) ↔ ⊥   Propagate ⊥

(φ → ψ) ↔ (¬φ ∨ ψ)   Eliminate →
(φ ↔ ψ) ↔ ((φ → ψ) ∧ (ψ → φ))   Eliminate1 ↔
(φ ↔ ψ) ↔ ((φ ∧ ψ) ∨ (¬φ ∧ ¬ψ))   Eliminate2 ↔

For simplification purposes the equivalences are typically applied as left to right rules.

2.4 Normal Forms

We define conjunctions of formulas as follows:
⋀_{i=1}^{0} φi = ⊤
⋀_{i=1}^{1} φi = φ1
⋀_{i=1}^{n+1} φi = ⋀_{i=1}^{n} φi ∧ φ_{n+1}
and analogously disjunctions:
⋁_{i=1}^{0} φi = ⊥
⋁_{i=1}^{1} φi = φ1
⋁_{i=1}^{n+1} φi = ⋁_{i=1}^{n} φi ∨ φ_{n+1}

Literals and Clauses

A literal is either a propositional variable P or a negated propositional variable ¬P . A clause is a (possibly empty) disjunction of literals.

CNF and DNF

A formula is in conjunctive normal form (CNF, clause normal form), if it is a conjunction of disjunctions of literals (or in other words, a conjunction of clauses). A formula is in disjunctive normal form (DNF), if it is a disjunction of conjunctions of literals. Warning: definitions in the literature differ: are complementary literals permitted? are duplicated literals permitted? are empty disjunctions/conjunctions permitted? Checking the validity of CNF formulas or the unsatisfiability of DNF formulas is easy: A formula in CNF is valid, if and only if each of its disjunctions contains a pair of complementary literals P and ¬P . Conversely, a formula in DNF is unsatisfiable, if and only if each of its conjunctions contains a pair of complementary literals P and ¬P . On the other hand, checking the unsatisfiability of CNF formulas or the validity of DNF formulas is known to be coNP-complete.

Conversion to CNF/DNF

Proposition 2.9 For every formula there is an equivalent formula in CNF (and also an equivalent formula in DNF).

Proof. We consider the case of CNF and propose a naive algorithm. Apply the following rules as long as possible (modulo associativity and commutativity of ∧ and ∨): Step 1: Eliminate equivalences:

φ[(ψ1 ↔ ψ2)]p ⇒ECNF φ[(ψ1 → ψ2) ∧ (ψ2 → ψ1)]p

Step 2: Eliminate implications:

φ[(ψ1 → ψ2)]p ⇒ECNF φ[(¬ψ1 ∨ ψ2)]p

25 Step 3: Push negations downward:

φ[¬(ψ1 ∨ ψ2)]p ⇒ECNF φ[(¬ψ1 ∧ ¬ψ2)]p

φ[¬(ψ1 ∧ ψ2)]p ⇒ECNF φ[(¬ψ1 ∨ ¬ψ2)]p

Step 4: Eliminate multiple negations:

φ[¬¬ψ]p ⇒ECNF φ[ψ]p

Step 5: Push disjunctions downward:

φ[(ψ1 ∧ ψ2) ∨ χ]p ⇒ECNF φ[(ψ1 ∨ χ) ∧ (ψ2 ∨ χ)]p

Step 6: Eliminate ⊤ and ⊥:

φ[(ψ ∧ ⊤)]p ⇒ECNF φ[ψ]p

φ[(ψ ∧ ⊥)]p ⇒ECNF φ[⊥]p

φ[(ψ ∨ ⊤)]p ⇒ECNF φ[⊤]p

φ[(ψ ∨ ⊥)]p ⇒ECNF φ[ψ]p

φ[¬⊥]p ⇒ECNF φ[⊤]p

φ[¬⊤]p ⇒ECNF φ[⊥]p

Proving termination is easy for steps 2, 4, and 6; steps 1, 3, and 5 are a bit more complicated. For step 1, we can prove termination in the following way: we define a function µ from formulas to positive integers such that µ(⊥) = µ(⊤) = µ(P) = 1, µ(¬φ) = µ(φ), µ(ψ1 ∧ ψ2) = µ(ψ1 ∨ ψ2) = µ(ψ1 → ψ2) = µ(ψ1) + µ(ψ2), and µ(φ ↔ ψ) = 2µ(φ) + 2µ(ψ) + 1. Observe that µ is constructed in such a way that µ(φ1) > µ(φ2) implies µ(ψ[φ1]p) > µ(ψ[φ2]p) for all formulas φ1, φ2, and ψ and positions p. Using this property, we can show that whenever a formula χ′ is the result of applying the rule of step 1 to a formula χ, then µ(χ) > µ(χ′). Since µ takes only positive integer values, step 1 must terminate.

Termination of steps 3 and 5 is proved similarly. For step 3, we use a function µ from formulas to positive integers such that µ(⊥) = µ(⊤) = µ(P) = 1, µ(¬φ) = 2µ(φ), and µ(φ ∧ ψ) = µ(φ ∨ ψ) = µ(φ → ψ) = µ(φ ↔ ψ) = µ(φ) + µ(ψ) + 1. Whenever a formula χ′ is the result of applying a rule of step 3 to a formula χ, then µ(χ) > µ(χ′). Since µ takes only positive integer values, step 3 must terminate.

For step 5, we use a function µ from formulas to positive integers such that µ(⊥) = µ(⊤) = µ(P) = 1, µ(¬φ) = µ(φ) + 1, µ(φ ∧ ψ) = µ(φ → ψ) = µ(φ ↔ ψ) = µ(φ) + µ(ψ) + 1, and µ(φ ∨ ψ) = 2µ(φ)µ(ψ). Again, if a formula χ′ is the result of applying a rule of step 5 to a formula χ, then µ(χ) > µ(χ′). Since µ takes only positive integer values, step 5 terminates, too.

The resulting formula is equivalent to the original one and in CNF. The conversion of a formula to DNF works in the same way, except that conjunctions have to be pushed downward in step 5. □
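The six steps can be coded almost verbatim. A naive Python sketch of the CNF conversion (my own encoding as in the earlier sketches; step 6 is omitted, so the input is assumed to contain no ⊤ or ⊥), exhibiting the usual worst-case blow-up:

    def cnf(phi):
        if isinstance(phi, str):
            return phi
        op = phi[0]
        if op == '<->':                      # step 1: eliminate equivalences
            a, b = phi[1], phi[2]
            return cnf(('and', ('->', a, b), ('->', b, a)))
        if op == '->':                       # step 2: eliminate implications
            return cnf(('or', ('not', phi[1]), phi[2]))
        if op == 'not':
            a = phi[1]
            if isinstance(a, str):
                return phi                   # literal
            if a[0] == 'not':                # step 4: double negation
                return cnf(a[1])
            if a[0] == 'or':                 # step 3: De Morgan
                return cnf(('and', ('not', a[1]), ('not', a[2])))
            if a[0] == 'and':
                return cnf(('or', ('not', a[1]), ('not', a[2])))
            return cnf(('not', cnf(a)))      # first remove ->, <-> below the negation
        a, b = cnf(phi[1]), cnf(phi[2])
        if op == 'or':                       # step 5: push disjunctions down
            if isinstance(a, tuple) and a[0] == 'and':
                return cnf(('and', ('or', a[1], b), ('or', a[2], b)))
            if isinstance(b, tuple) and b[0] == 'and':
                return cnf(('and', ('or', a, b[1]), ('or', a, b[2])))
        return (op, a, b)

    print(cnf(('<->', 'A', 'B')))
    # ('and', ('or', ('not', 'A'), 'B'), ('or', ('not', 'B'), 'A'))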

Complexity

Conversion to CNF (or DNF) may produce a formula whose size is exponential in the size of the original one.

Negation Normal Form (NNF)

The formula obtained after application of step 4 is said to be in negation normal form, i.e., it contains neither → nor ↔, and negation symbols occur only in front of propositional variables (atoms).

Satisfiability-preserving Transformations

The goal "find a formula ψ in CNF such that φ |=| ψ" is impractical. But if we relax the requirement to "find a formula ψ in CNF such that φ |= ⊥ ⇔ ψ |= ⊥" we can get an efficient transformation.

Idea: A formula ψ[φ]p is satisfiable if and only if ψ[P]p ∧ (P ↔ φ) is satisfiable, where P is a new propositional variable that does not occur in ψ and works as an abbreviation for φ. We can use this rule recursively for all subformulas in the original formula (this introduces a linear number of new propositional variables). Conversion of the resulting formula to CNF increases the size only linearly (each formula P ↔ φ gives rise to at most one application of the distributivity law).
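A sketch of this renaming in Python (the naive variant that introduces a definition P ↔ φ for every composite subformula; the polarity-optimised definitions of Proposition 2.10 could be substituted; all names are mine):

    import itertools

    fresh = (f'_P{i}' for i in itertools.count())

    def rename(phi, defs):
        """Return a variable standing for phi, collecting definitions in defs."""
        if isinstance(phi, str):
            return phi
        if phi[0] == 'not':
            args = ('not', rename(phi[1], defs))
        else:
            args = (phi[0], rename(phi[1], defs), rename(phi[2], defs))
        p = next(fresh)
        defs.append(('<->', p, args))        # definition P <-> subformula
        return p

    defs = []
    top = rename(('->', ('and', 'A', 'B'), ('or', 'A', 'B')), defs)
    # Satisfiability-equivalent problem: the top variable plus all definitions
    print(top, defs)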

Optimized Transformations

A further improvement is possible by taking the polarity of the subformula into account.

For example if ψ[φ1 ↔ φ2]p and pol(ψ,p)= −1 then for CNF transformation do ψ[(φ1 ∧ φ2) ∨ (¬φ1 ∧ ¬φ2)]p.

Proposition 2.10 Let P be a propositional variable not occurring in ψ[φ]p.

If pol(ψ,p)=1, then ψ[φ]p is satisfiable if and only if ψ[P ]p ∧ (P → φ) is satisfiable.

If pol(ψ,p)= −1, then ψ[φ]p is satisfiable if and only if ψ[P ]p ∧ (φ → P ) is satisfiable.

If pol(ψ,p)=0, then ψ[φ]p is satisfiable if and only if ψ[P ]p ∧ (P ↔ φ) is satisfiable.

Proof. Exercise. □

The number of eventually generated clauses is a good indicator for useful CNF transformations:

ψ         ν(ψ)                          ν̄(ψ)
φ1 ∧ φ2   ν(φ1) + ν(φ2)                 ν̄(φ1) · ν̄(φ2)
φ1 ∨ φ2   ν(φ1) · ν(φ2)                 ν̄(φ1) + ν̄(φ2)
φ1 → φ2   ν̄(φ1) · ν(φ2)                 ν(φ1) + ν̄(φ2)
φ1 ↔ φ2   ν(φ1)ν̄(φ2) + ν̄(φ1)ν(φ2)       ν(φ1)ν(φ2) + ν̄(φ1)ν̄(φ2)
¬φ1       ν̄(φ1)                         ν(φ1)
P, ⊤, ⊥   1                             1

Optimized CNF

Step 1: Exhaustively apply modulo C of ↔, AC of ∧, ∨:

φ[(ψ ∧ ⊤)]p ⇒OCNF φ[ψ]p

φ[(ψ ∨ ⊥)]p ⇒OCNF φ[ψ]p

φ[(ψ ↔ ⊥)]p ⇒OCNF φ[¬ψ]p

φ[(ψ ↔ ⊤)]p ⇒OCNF φ[ψ]p

φ[(ψ ∨ ⊤)]p ⇒OCNF φ[⊤]p

φ[(ψ ∧ ⊥)]p ⇒OCNF φ[⊥]p

φ[(ψ ∧ ψ)]p ⇒OCNF φ[ψ]p

φ[(ψ ∨ ψ)]p ⇒OCNF φ[ψ]p

φ[(ψ1 ∧ (ψ1 ∨ ψ2))]p ⇒OCNF φ[ψ1]p

φ[(ψ1 ∨ (ψ1 ∧ ψ2))]p ⇒OCNF φ[ψ1]p

φ[(ψ ∧ ¬ψ)]p ⇒OCNF φ[⊥]p

φ[(ψ ∨ ¬ψ)]p ⇒OCNF φ[⊤]p

φ[¬⊤]p ⇒OCNF φ[⊥]p

φ[¬⊥]p ⇒OCNF φ[⊤]p

φ[(ψ → ⊥)]p ⇒OCNF φ[¬ψ]p

φ[(ψ → ⊤)]p ⇒OCNF φ[⊤]p

φ[(⊥→ ψ)]p ⇒OCNF φ[⊤]p

φ[(⊤→ ψ)]p ⇒OCNF φ[ψ]p

Step 2: Introduce top-down fresh variables for beneficial subformulas:

ψ[φ]p ⇒OCNF ψ[P ]p ∧ def(ψ,p,P )

where P is new to ψ[φ]p, def(ψ, p, P) is defined polarity-dependently according to Proposition 2.10, and ν(ψ[φ]p) > ν(ψ[P]p ∧ def(ψ, p, P)).

Remark: Although computing ν is not practical in general, the test ν(ψ[φ]p) > ν(ψ[P]p ∧ def(ψ, p, P)) can be computed in constant time.

Step 3: Eliminate equivalences polarity-dependently:

φ[ψ1 ↔ ψ2]p ⇒OCNF φ[(ψ1 → ψ2) ∧ (ψ2 → ψ1)]p if pol(φ,p) = 1 or pol(φ,p)=0

φ[ψ1 ↔ ψ2]p ⇒OCNF φ[(ψ1 ∧ ψ2) ∨ (¬ψ2 ∧ ¬ψ1)]p if pol(φ,p)= −1

Step 4: Apply steps 2, 3, 4, 5 of ⇒ECNF.

Remark: The ⇒OCNF algorithm is already close to a state-of-the-art algorithm. Missing are further redundancy tests and simplification mechanisms that we will discuss later on in this section.

2.5 Superposition for PROP(Σ)

Superposition for PROP(Σ) is:
• resolution (Robinson 1965) +
• ordering restrictions (Bachmair & Ganzinger 1990) +
• abstract redundancy criterion (B & G 1990) +
• partial model construction (B & G 1990) +
• partial-model based inference restriction (Weidenbach)

Resolution for PROP(Σ)

A calculus is a set of inference and reduction rules for a given logic (here PROP(Σ)). We only consider calculi operating on a set of clauses N. Inference rules add new clauses to N whereas reduction rules remove clauses from N or replace clauses by "simpler" ones. We are only interested in unsatisfiability, i.e., the considered calculi test whether a clause set N is unsatisfiable. So, in order to check validity of a formula φ we check unsatisfiability of the clauses generated from ¬φ. For clauses we switch between the notation as a disjunction, e.g., P ∨ Q ∨ P ∨ ¬R, and the notation as a multiset, e.g., {P, Q, P, ¬R}. This makes no difference as we consider ∨ in the context of clauses always modulo AC. Note that ⊥, the empty disjunction, corresponds to ∅, the empty multiset. For literals we write L, possibly with subscript. If L = P then L̄ = ¬P and if L = ¬P then L̄ = P; so the bar flips the negation of a literal. Clauses are typically denoted by letters C, D, possibly with subscript. The resolution calculus consists of the inference rules resolution and factoring:

Resolution
    C1 ∨ P    C2 ∨ ¬P
    ----------------- I
         C1 ∨ C2

Factoring
    C ∨ L ∨ L
    --------- I
      C ∨ L

where C1, C2, C always stand for clauses and all inference/reduction rules are applied with respect to AC of ∨. Given a clause set N, the schema above the inference bar is mapped to N and the resulting clauses below the bar are then added to N. The reduction rules are subsumption and tautology deletion:

Subsumption
    C1    C2
    --------- R
       C1

Tautology Deletion
    C ∨ P ∨ ¬P
    ---------- R

where for subsumption we assume C1 ⊆ C2. Given a clause set N, the schema above the reduction bar is mapped to N and the resulting clauses below the bar replace the clauses above the bar in N. Clauses that can be removed are called redundant. So, if we consider clause sets N as states, where ⊎ is disjoint union, we get the rules

Resolution (N ⊎ {C1 ∨ P, C2 ∨ ¬P}) ⇒ (N ∪ {C1 ∨ P, C2 ∨ ¬P} ∪ {C1 ∨ C2})

Factoring (N ⊎{C ∨ L ∨ L}) ⇒ (N ∪{C ∨ L ∨ L}∪{C ∨ L})

Subsumption (N ⊎{C1,C2}) ⇒ (N ∪{C1}) provided C1 ⊆ C2

Tautology Deletion (N ⊎{C ∨ P ∨ ¬P }) ⇒ (N)

We need more structure than just (N) in order to define a useful rewrite system. We fix this later on.
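Ignoring that extra structure for a moment, the four rules above can be run as a naive saturation loop. A Python sketch (my own representation: clauses as frozensets of (variable, sign) literals, so Factoring is implicit in the set representation, and redundancy handling is limited to tautology deletion):

    from itertools import combinations

    def neg(l):
        return (l[0], not l[1])

    def resolvents(c1, c2):
        for l in c1:
            if neg(l) in c2:
                yield frozenset((c1 - {l}) | (c2 - {neg(l)}))

    def saturate(clauses):
        n = {c for c in clauses if not any(neg(l) in c for l in c)}  # drop tautologies
        while True:
            new = {r for c1, c2 in combinations(n, 2) for r in resolvents(c1, c2)
                   if not any(neg(l) in r for l in r)}
            if new <= n:
                return n
            n |= new

    # N = {P, ~P v Q, ~Q}: unsatisfiable, the empty clause is derived
    N = [frozenset({('P', True)}),
         frozenset({('P', False), ('Q', True)}),
         frozenset({('Q', False)})]
    print(frozenset() in saturate(N))  # True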

Theorem 2.11 The resolution calculus is sound and complete: N is unsatisfiable iff N ⇒∗ {⊥}

Proof. Will be a consequence of soundness and completeness of superposition. □

Ordering restrictions

Let ≺ be a total ordering on Σ.

We lift ≺ to a total ordering on literals by ≺⊆≺L and P ≺L ¬P and ¬P ≺L Q for all P ≺ Q.

We further lift ≺L to a total ordering on clauses ≺C by considering the multiset extension of ≺L for clauses.

Eventually, we overload ≺ with ≺L and ≺C. We define N^≺C = {D ∈ N | D ≺ C}. Later we will restrict inferences to maximal literals with respect to ≺.

Abstract Redundancy

A clause C is redundant with respect to a clause set N if N^≺C |= C. Tautologies are redundant. Subsumed clauses are redundant if ⊆ is strict. Remark: Note that for finite N, N^≺C |= C can be decided for PROP(Σ) but is as hard as testing unsatisfiability for a clause set N.

Partial Model Construction

Given a clause set N and an ordering ≺ we can construct a (partial) model NI for N as follows:

NC := ⋃_{D≺C} δD

δD := {P} if D = D′ ∨ P with P strictly maximal and ND ⊭ D; δD := ∅ otherwise

NI := ⋃_{C∈N} δC

Clauses C with δC ≠ ∅ are called productive. Some properties of the partial model construction:

Proposition 2.12
1. For every D with (C ∨ ¬P) ≺ D we have δD ≠ {P}.
2. If δC = {P} then NC ∪ δC |= C.
3. If NC |= D then for all C′ with C ≺ C′ we have NC′ |= D and in particular NI |= D.

Notation: N, N^≺C, NI, NC

Please properly distinguish: • N is a set of clauses interpreted as the conjunction of all clauses. • N^≺C is the set of clauses from N strictly smaller than C with respect to ≺.

• NI , NC are sets of atoms, often called Herbrand Interpretations. NI is the overall (partial) model for N, whereas NC is generated from all clauses from N strictly smaller than C.

• Validity is defined by NI |= P if P ∈ NI and NI |= ¬P if P ∉ NI, accordingly for NC.
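The construction is executable for finite clause sets. A Python sketch (my own representation; since clauses are sets, a maximal positive literal is automatically strictly maximal, and the empty clause is assumed absent):

    def model(clauses, order):
        """Partial model N_I: process clauses in ascending clause order
        (multiset extension of the literal ordering) and make a clause
        productive when it is false so far and its maximal literal is
        positive. Clauses are frozensets of (atom, sign) pairs."""
        # P <_L ~P <_L Q whenever P < Q in the atom ordering
        key = lambda l: (order[l[0]], 0 if l[1] else 1)
        clause_key = lambda c: sorted(map(key, c), reverse=True)
        NI = set()
        for c in sorted(clauses, key=clause_key):
            if any((p in NI) == sign for (p, sign) in c):
                continue                  # c is true in N_C, delta_C is empty
            p, sign = max(c, key=key)
            if sign:
                NI.add(p)                 # productive clause: delta_C = {p}
        return NI

    order = {'P': 0, 'Q': 1, 'R': 2}
    N = [frozenset({('P', True)}),
         frozenset({('P', False), ('Q', True)})]
    print(model(N, order))                # {'P', 'Q'}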

Superposition

The superposition calculus consists of the inference rules superposition left and factoring:

Superposition Left (N ⊎ {C1 ∨ P, C2 ∨ ¬P}) ⇒ (N ∪ {C1 ∨ P, C2 ∨ ¬P} ∪ {C1 ∨ C2})
where P is strictly maximal in C1 ∨ P and ¬P is maximal in C2 ∨ ¬P

Factoring (N ⊎ {C ∨ P ∨ P}) ⇒ (N ∪ {C ∨ P ∨ P} ∪ {C ∨ P})
where P is maximal in C ∨ P ∨ P

Examples for specific redundancy rules are:

Subsumption (N ⊎ {C1, C2}) ⇒ (N ∪ {C1})
provided C1 ⊂ C2

Tautology Deletion (N ⊎{C ∨ P ∨ ¬P }) ⇒ (N)

Subsumption Resolution (N ⊎ {C1 ∨ L, C2 ∨ L̄}) ⇒ (N ∪ {C1 ∨ L, C2}) where C1 ⊆ C2

Theorem 2.13 If all possible superposition inferences from a clause set N are redundant and ⊥ ∉ N, then N is satisfiable and NI |= N.

Proof. The proof is by contradiction. So assume that any clause C derived by superposition left or factoring from N is redundant, i.e., N^≺C |= C. Furthermore, we assume ⊥ ∉ N but NI ⊭ N. Then there is a minimal, with respect to ≺, clause C1 ∨ L ∈ N such that NI ⊭ C1 ∨ L and L is a maximal literal in C1 ∨ L. This clause must exist because ⊥ ∉ N.

(i) Note that because C1 ∨ L is minimal it is not redundant. For otherwise, N^≺(C1∨L) |= C1 ∨ L and hence NI |= C1 ∨ L, a contradiction.

(ii) We distinguish the case whether L is a positive or a negative literal. Firstly, let us assume L is positive, i.e., L = P for some propositional variable P. Now if P is strictly maximal in C1 ∨ P then actually δC1∨P = {P} and hence NI |= C1 ∨ P, a contradiction. So P is not strictly maximal. But then actually C1 ∨ P has the form C1′ ∨ P ∨ P and by factoring we can derive C1′ ∨ P where (C1′ ∨ P) ≺ (C1′ ∨ P ∨ P). Now C1′ ∨ P is not redundant (analogously to (i)) and strictly smaller than C1 ∨ L; we have C1′ ∨ P ∈ N and NI ⊭ C1′ ∨ P, a contradiction against the choice of C1 ∨ L.

Secondly, let us assume L is negative, i.e., L = ¬P for some propositional variable P. Then, since NI ⊭ C1 ∨ ¬P we know P ∈ NI. So there is a clause C2 ∨ P ∈ N where δC2∨P = {P} and P is strictly maximal in C2 ∨ P and (C2 ∨ P) ≺ (C1 ∨ ¬P). So by superposition left we can derive C1 ∨ C2 where (C1 ∨ C2) ≺ (C1 ∨ ¬P). The derived clause C1 ∨ C2 cannot be redundant, because otherwise either N^≺(C2∨P) |= C2 ∨ P or N^≺(C1∨¬P) |= C1 ∨ ¬P. So C1 ∨ C2 ∈ N and NI ⊭ C1 ∨ C2, a contradiction against the choice of C1 ∨ L. □

So the proof actually tells us that at any point in time we need only to consider either a superposition left inference between a minimal false clause and a productive clause or a factoring inference on a minimal false clause.

A Superposition Theorem Prover STP

Three clause sets:
• N(ew) contains newly inferred clauses.
• U(sable) contains reduced newly inferred clauses.
• Clauses get into W(orked) O(ff) once their inferences have been computed.
Strategy: inferences are only computed when there are no possibilities for simplification.

Rewrite Rules for STP

Tautology Deletion (N ⊎{C}; U; WO) ⇒STP (N; U; WO) if C is a tautology

Forward Subsumption (N ⊎{C}; U; WO) ⇒STP (N; U; WO) if some D ∈ (U ∪ WO) subsumes C

Backward Subsumption U (N ⊎ {C}; U ⊎ {D}; WO) ⇒STP (N ∪ {C}; U; WO)
if C strictly subsumes D (C ⊂ D)

Backward Subsumption WO (N ⊎ {C}; U; WO ⊎ {D}) ⇒STP (N ∪ {C}; U; WO)
if C strictly subsumes D (C ⊂ D)

Forward Subsumption Resolution (N ⊎ {C1 ∨ L}; U; WO) ⇒STP (N ∪ {C1}; U; WO)
if there exists C2 ∨ L̄ ∈ (U ∪ WO) such that C2 ⊆ C1

Backward Subsumption Resolution U (N ⊎ {C1 ∨ L}; U ⊎ {C2 ∨ L̄}; WO) ⇒STP (N ∪ {C1 ∨ L}; U ⊎ {C2}; WO)
if C1 ⊆ C2

Backward Subsumption Resolution WO (N ⊎ {C1 ∨ L}; U; WO ⊎ {C2 ∨ L̄}) ⇒STP (N ∪ {C1 ∨ L}; U; WO ⊎ {C2})
if C1 ⊆ C2

Clause Processing (N ⊎{C}; U; WO) ⇒STP (N; U ∪{C}; WO)

Inference Computation (∅; U ⊎{C}; WO) ⇒STP (N; U; WO ∪{C}) where N is the set of clauses derived by superposition inferences from C and clauses in WO.

Soundness and Completeness

Theorem 2.14

N |= ⊥ ⇔ (N; ∅; ∅) ⇒∗STP (N′ ∪ {⊥}; U; WO)

Proof in L. Bachmair, H. Ganzinger: Resolution Theorem Proving, in: Handbook of Automated Reasoning, 2001.

Termination

Theorem 2.15 For finite N and a strategy where the reduction rules Tautology Deletion, the two Subsumption rules and the two Subsumption Resolution rules are always exhaustively applied before Clause Processing and Inference Computation, the rewrite relation ⇒STP is terminating on (N; ∅; ∅).

Proof: think of it (more later on).

Fairness

Problem:

If N is inconsistent, then (N; ∅; ∅) ⇒∗STP (N′ ∪ {⊥}; U; WO). Does this imply that every derivation starting from an inconsistent set N eventually produces ⊥? No: a clause could be kept in U without ever being used for an inference. We need in addition a fairness condition: if an inference is possible forever (that is, none of its premises is ever deleted), then it must be computed eventually. One possible way to guarantee fairness: implement U as a queue (there are other techniques to guarantee fairness). With this additional requirement, we get a stronger result: if N is inconsistent, then every fair derivation will eventually produce ⊥.

2.6 The CDCL Procedure

Goal: Given a propositional formula in CNF (or alternatively, a finite set N of clauses), check whether it is satisfiable (and optionally: output one solution, if it is satisfiable). Assumption: Clauses contain neither duplicated literals nor complementary literals. CDCL: Conflict Driven Clause Learning

Satisfiability of Clause Sets

A |= N if and only if A |= C for all clauses C in N. A |= C if and only if A |= L for some literal L ∈ C.

Partial Valuations

Since we will construct satisfying valuations incrementally, we consider partial valuations (that is, partial mappings A : Σ → {0, 1}). Every partial valuation A corresponds to a set M of literals that does not contain complementary literals, and vice versa:
A(L) is true, if L ∈ M.
A(L) is false, if L̄ ∈ M.
A(L) is undefined, if neither L ∈ M nor L̄ ∈ M.
We will use A and M interchangeably. Note that truth of a literal with respect to M is defined differently than for NI. A clause is true under a partial valuation A (or under a set M of literals) if one of its literals is true; it is false (or "conflicting") if all its literals are false; otherwise it is undefined (or "unresolved").

Unit Clauses

Observation: Let A be a partial valuation. If the set N contains a clause C such that all literals but one in C are false under A, then the following properties are equivalent:
• there is a valuation that is a model of N and extends A.
• there is a valuation that is a model of N and extends A and makes the remaining literal L of C true.
C is called a unit clause; L is called a unit literal.

Pure Literals

One more observation: Let A be a partial valuation and P a variable that is undefined under A. If P occurs only positively (or only negatively) in the unresolved clauses in N, then the following properties are equivalent:
• there is a valuation that is a model of N and extends A.
• there is a valuation that is a model of N and extends A and assigns 1 (0) to P.
P is called a pure literal.

The Davis-Putnam-Logemann-Loveland Procedure

boolean DPLL(literal set M, clause set N) {
    if (all clauses in N are true under M) return true;
    elsif (some clause in N is false under M) return false;
    elsif (N contains unit clause P) return DPLL(M ∪ {P}, N);
    elsif (N contains unit clause ¬P) return DPLL(M ∪ {¬P}, N);
    elsif (N contains pure literal P) return DPLL(M ∪ {P}, N);
    elsif (N contains pure literal ¬P) return DPLL(M ∪ {¬P}, N);
    else {
        let P be some undefined variable in N;
        if (DPLL(M ∪ {¬P}, N)) return true;
        else return DPLL(M ∪ {P}, N);
    }
}

Initially, DPLL is called with an empty literal set and the clause set N.

2.7 From DPLL to CDCL

In practice, there are several changes to the procedure: the pure literal check is only done during preprocessing (otherwise it is too expensive). The branching variable is not chosen randomly. The algorithm is implemented iteratively; the backtrack stack is managed explicitly (it may be possible and useful to backtrack more than one level). CDCL = DPLL + information reuse by learning + restart + specific data structures.

Branching Heuristics

Choosing the right undefined variable to branch on is important for efficiency, but the branching heuristic may itself be expensive. State of the art: use branching heuristics that need not be recomputed too frequently. In general: choose variables that occur frequently, and prefer variables from recent conflicts.

The Deduction Algorithm

For applying the unit rule, we need to know the number of literals in a clause that are not false. Maintaining this number is expensive, however. Better approach: “Two watched literals”: In each clause, select two (currently undefined) “watched” literals. For each variable P , keep a list of all clauses in which P is watched and a list of all clauses in which ¬P is watched. If an undefined variable is set to 0 (or to 1), check all clauses in which P (or ¬P ) is watched and watch another literal (that is true or undefined) in this clause if possible. Watched literal information need not be restored upon backtracking.
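A Python sketch of the scheme (my own data layout; clauses are assumed to have at least two literals, and literals are nonzero integers in DIMACS style, so 3 means P3 and -3 means ¬P3):

    def is_false(l, A):
        """Literal l is false under the partial assignment A (variable -> bool)."""
        v = A.get(abs(l))
        return v is not None and v == (l < 0)

    class TwoWatched:
        def __init__(self, clauses):
            self.clauses = [list(c) for c in clauses]  # watches at c[0], c[1]
            self.watchers = {}                         # literal -> clause indices
            for i, c in enumerate(self.clauses):
                for l in c[:2]:
                    self.watchers.setdefault(l, []).append(i)

        def propagate(self, lit, A):
            """lit just became false under A; return the other watched literal
            of every clause that lost its replacement search (unit or conflict)."""
            units = []
            for i in list(self.watchers.get(lit, [])):
                c = self.clauses[i]
                j = c.index(lit)                       # 0 or 1
                for k in range(2, len(c)):             # look for a non-false watch
                    if not is_false(c[k], A):
                        c[j], c[k] = c[k], c[j]        # move the watch
                        self.watchers[lit].remove(i)
                        self.watchers.setdefault(c[j], []).append(i)
                        break
                else:
                    units.append(c[1 - j])             # no replacement found
            return units

    W = TwoWatched([[1, 2, 3]])
    print(W.propagate(1, {1: False}))             # [] (watch moved to 3)
    print(W.propagate(2, {1: False, 2: False}))   # [3] -> clause is unit

Note that, exactly as stated above, nothing has to be undone upon backtracking: a watch never points at a false literal while the clause is unresolved, and stale watches are repaired lazily.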

Conflict Analysis and Learning

Goal: Reuse information that is obtained in one branch in further branches. Method: Learning: If a conflicting clause is found, derive a new clause from the conflict and add it to the current set of clauses. Problem: This may produce a large number of new clauses; therefore it may become necessary to delete some of them afterwards to save space.

Backjumping

Related technique: non-chronological backtracking (“backjumping”): If a conflict is independent of some earlier branch, try to skip over that backtrack level.

Restart

Runtimes of DPLL-style procedures depend extremely on the choice of branching variables. If no solution is found within a certain time limit, it can be useful to restart from scratch with an adapted variable selection heuristic; learned clauses are kept. In particular, after learning a unit clause a restart is done.

Formalizing DPLL with Refinements

The DPLL procedure is modeled by a transition relation ⇒DPLL on a set of states.
States:
• fail
• (M; N), where M is a list of annotated literals and N is a set of clauses.
We use + to add a literal or a list of literals to the right end of M.
Annotated literals:
• L: deduced literal, due to unit propagation.
• L^d: decision literal (guessed literal).

Unit Propagate:

(M; N ∪ {C ∨ L}) ⇒DPLL (M + L; N ∪ {C ∨ L})
if C is false under M and L is undefined under M.

Decide:

(M; N) ⇒DPLL (M + L^d; N)
if L is undefined under M and contained in N.

Fail:

(M; N ∪ {C}) ⇒DPLL fail
if C is false under M and M contains no decision literals.

Backjump:

(M′ + L^d + M″; N) ⇒DPLL (M′ + L′; N)
if there is some "backjump clause" C ∨ L′ such that N |= C ∨ L′, C is false under M′, and L′ is undefined under M′.
We will see later that the Backjump rule is always applicable, if the list of literals M contains at least one decision literal and some clause in N is false under M.

There are many possible backjump clauses. One candidate: L̄1 ∨ ... ∨ L̄n, where the Li are all the decision literals in M′ + L^d + M″. (But usually there are better choices.)

Lemma 2.16 If we reach a state (M; N) starting from (nil; N), then: (1) M does not contain complementary literals. (2) Every deduced literal L in M follows from N and decision literals occurring before L in M.

Proof. By induction on the length of the derivation. □

Lemma 2.17 Every derivation starting from (nil; N) terminates.

Proof. (Idea) Consider a DPLL derivation step (M; N) ⇒DPLL (M′; N′) and a decomposition M0 + L1^d + M1 + ... + Lk^d + Mk of M (accordingly M′0, ..., M′k′ for M′). Let n be the number of distinct propositional variables in N. Then k, k′ and the lengths of M, M′ are always smaller than or equal to n. We define f(M) = n − length(M) and finally

(M; N) ≻ (M′; N′) if

(i) f(M0) = f(M′0), ..., f(Mi−1) = f(M′i−1) and f(Mi) > f(M′i) for some i, or (ii) f(M0) = f(M′0), ..., f(Mk) = f(M′k) and f(M) > f(M′).

Each rule application is strictly decreasing with respect to the well-founded relation ≻, hence every derivation terminates. □

Lemma 2.18 Suppose that we reach a state (M; N) starting from (nil; N) such that some clause D ∈ N is false under M. Then: (1) If M does not contain any decision literal, then “Fail” is applicable. (2) Otherwise, “Backjump” is applicable.

Proof. (1) Obvious.

(2) Let L1, ..., Ln be the decision literals occurring in M (in this order). Since M |= ¬D, we obtain, by Lemma 2.16, N ∪ {L1, ..., Ln} |= ¬D. Since D ∈ N, this is a contradiction, so N ∪ {L1, ..., Ln} is unsatisfiable. Consequently, N |= L̄1 ∨ ... ∨ L̄n. Now let C = L̄1 ∨ ... ∨ L̄n−1 and L′ = L̄n, and let M′ be the list of all literals of M occurring before Ln; then the condition of "Backjump" is satisfied. □

Theorem 2.19 (1) If we reach a final state (M; N) starting from (nil; N), then N is satisfiable and M is a model of N. (2) If we reach a final state fail starting from (nil; N), then N is unsatisfiable.

Proof. (1) Observe that the "Decide" rule is applicable as long as literals are undefined under M. Hence, in a final state, all literals must be defined. Furthermore, in a final state, no clause in N can be false under M, otherwise "Fail" or "Backjump" would be applicable. Hence M is a model of every clause in N. (2) If we reach fail, then in the previous step we must have reached a state (M; N) such that some C ∈ N is false under M and M contains no decision literals. By part (2) of Lemma 2.16, every literal in M follows from N. On the other hand, C ∈ N, so N must be unsatisfiable. □

Getting Better Backjump Clauses

Suppose that we have reached a state (M; N) such that some clause C ∈ N (or following from N) is false under M. Consequently, every literal of C is the complement of some literal in M.

(1) If every literal in C is the complement of a decision literal of M, then C is a backjump clause.

(2) Otherwise, C = C′ ∨ ¬L, where L is a deduced literal in M. For every deduced literal L there is a clause D ∨ L such that N |= D ∨ L and D is false under M. Then N |= D ∨ C′, and D ∨ C′ is also false under M. (D ∨ C′ is a resolvent of C′ ∨ ¬L and D ∨ L.)

By repeating this process, we will eventually obtain a clause that consists only of complements of decision literals and can be used in the “Backjump” rule. Moreover, such a clause is a good candidate for learning.
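A small sketch of this resolution process, assuming each deduced trail literal stores the clause that propagated it in a dictionary reasons (clauses as sets of integer literals; −l is the complement of l; all names illustrative):

def backjump_clause(conflict, decisions, reasons):
    # conflict: a clause that is false under the current trail
    # decisions: the set of decision literals on the trail
    # reasons: maps each deduced trail literal M to its clause D ∨ M
    C = set(conflict)
    while True:
        # literals of C whose complement is a deduced (non-decision) literal
        deduced = [l for l in C if -l in reasons and -l not in decisions]
        if not deduced:
            return C  # only complements of decision literals remain
        L = deduced[0]
        D = set(reasons[-L])        # the clause D ∨ (−L) that propagated −L
        C = (C - {L}) | (D - {-L})  # resolvent of C and the reason on L

# Example: decide -1, propagate 2 from (1 ∨ 2); the clause (¬2 ∨ 1) is false:
print(backjump_clause({-2, 1}, decisions={-1}, reasons={2: {1, 2}}))  # {1}

In the example the resulting clause {1} consists of the complement of the decision literal −1, as claimed above.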

Learning Clauses

The DPLL system can be extended by two rules to learn and to forget clauses:

Learn:

(M; N) ⇒DPLL (M; N ∪ {C}) if N |= C.

Forget:

(M; N ⊎ {C}) ⇒DPLL (M; N) if N |= C.

If we ensure that no clause is learned infinitely often, then termination is guaranteed. The other properties of the basic DPLL system hold also for the extended system.

Restart

Part of the CDCL system is the restart rule:

Restart:

(M; N) ⇒DPLL (nil; N)

The restart rule is typically applied after a certain number of clauses have been learned or a unit is derived. It is closely coupled with the variable order heuristic. If Restart is only applied finitely often, termination is guaranteed.

Variable Order Heuristic

For every propositional variable Pi there is a positive score ki. At start ki may for example be the number of occurrences of Pi in N.

The variable order is then the descending ordering of the Pi according to the ki.

The scores ki are adjusted during a CDCL run:
• Every time a learned clause is computed after a conflict, the involved propositional variables obtain a bonus b, i.e., ki = ki + b.
• After each restart, the variable order is recomputed, using the new scores.
• After each j-th restart, the scores are leveled: ki = ki/l for some l.
The purpose of these mechanisms is to keep the search focused. Parameter b directs the search around the conflict; parameter j decides how many learned clauses are “sufficient” before the search moves away from this conflict at the “speed” given by parameter l; see the sketch below.
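A minimal sketch of this score bookkeeping (class and parameter names are illustrative, not from any particular solver):

class VariableOrder:
    def __init__(self, clauses, b=1.0, j=5, l=2.0):
        self.b, self.j, self.l = b, j, l
        self.score = {}
        for C in clauses:  # initial k_i: number of occurrences of P_i in N
            for lit in C:
                v = abs(lit)
                self.score[v] = self.score.get(v, 0.0) + 1.0
        self.restarts = 0

    def on_learned_clause(self, clause):
        for lit in clause:  # bonus b for the involved variables
            self.score[abs(lit)] += self.b

    def on_restart(self):
        self.restarts += 1
        if self.restarts % self.j == 0:  # level the scores every j-th restart
            for v in self.score:
                self.score[v] /= self.l

    def order(self):  # descending ordering of the P_i according to the k_i
        return sorted(self.score, key=self.score.get, reverse=True)

This is essentially the VSIDS family of heuristics used in CDCL solvers; real implementations usually bump activities and rescale lazily instead of iterating over all variables.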

Preprocessing

Before DPLL search, and computation of the variable order heuristics, a number of preprocessing steps are performed:

(i) Subsumption (non-strict version).
(ii) Purity Deletion: delete all clauses containing a literal L such that the complement ¬L does not occur in the clause set.
(iii) Subsumption Resolution

(iv) Tautology Deletion
(v) Literal Elimination: do all possible resolution steps on a literal L and then throw away all clauses containing L or ¬L; repeat this as long as |N| does not grow.

Further Information

The ideas described so far have been implemented in all modern SAT solvers: zChaff, miniSAT, picoSAT. Because of clause learning the algorithm is now called CDCL: Conflict Driven Clause Learning. It has been shown in 2009 that CDCL can polynomially simulate resolution, a long-standing open question:
Knot Pipatsrisawat, Adnan Darwiche: On the Power of Clause-Learning SAT Solvers with Restarts. CP 2009, 654–668.

Literature

Lintao Zhang and Sharad Malik: The Quest for Efficient Boolean Satisfiability Solvers; Proc. CADE-18, LNAI 2392, pp. 295–312, Springer, 2002.
Robert Nieuwenhuis, Albert Oliveras, Cesare Tinelli: Solving SAT and SAT Modulo Theories: From an Abstract Davis-Putnam-Logemann-Loveland Procedure to DPLL(T); Journal of the ACM, 53(6), pp. 937–977, 2006.
Armin Biere, Marijn Heule, Hans van Maaren, Toby Walsh (eds.): Handbook of Satisfiability; IOS Press, 2009.
Daniel Le Berre’s slides at VTSA’09: http://www.mpi-inf.mpg.de/vtsa09/.

2.8 Example: Sudoku

[Figure: a 9 × 9 Sudoku start board and its solution state.]

Idea: p^d_{i,j} = true iff the value of square (i, j) is d. For example, p^d_{3,5} = true expresses that square (3, 5) carries the value d.

Coding Sudoku by Propositional Clauses

• Concrete values result in units: p^d_{i,j}
• For every square (i, j) we generate p^1_{i,j} ∨ ... ∨ p^9_{i,j}
• For every square (i, j) and pair of values d < d′ we generate ¬p^d_{i,j} ∨ ¬p^{d′}_{i,j}

• For every value d, column i, and pair of rows j < j′ we generate ¬p^d_{i,j} ∨ ¬p^d_{i,j′} (analogously for rows and 3 × 3 boxes)
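A sketch generating exactly these clause schemas, assuming the (arbitrary) integer encoding of p^d_{i,j} shown in var below:

def var(i, j, d):
    # encode p^d_{i,j} as an integer in 1..729 (an illustrative choice)
    return 81 * (i - 1) + 9 * (j - 1) + d

def sudoku_clauses(start):
    # start: dict mapping (i, j) to a concrete value d
    N = [[var(i, j, d)] for (i, j), d in start.items()]       # unit clauses
    for i in range(1, 10):
        for j in range(1, 10):
            N.append([var(i, j, d) for d in range(1, 10)])    # some value
            N.extend([[-var(i, j, d), -var(i, j, e)]          # at most one
                      for d in range(1, 10) for e in range(d + 1, 10)])
    for d in range(1, 10):
        for i in range(1, 10):                                # columns
            N.extend([[-var(i, j, d), -var(i, k, d)]
                      for j in range(1, 10) for k in range(j + 1, 10)])
        for j in range(1, 10):                                # rows
            N.extend([[-var(i, j, d), -var(k, j, d)]
                      for i in range(1, 10) for k in range(i + 1, 10)])
        for bi in range(3):
            for bj in range(3):                               # 3 × 3 boxes
                cells = [(3 * bi + a, 3 * bj + b)
                         for a in range(1, 4) for b in range(1, 4)]
                N.extend([[-var(i, j, d), -var(k, l, d)]
                          for (i, j) in cells for (k, l) in cells
                          if (i, j) < (k, l)])
    return N

The box constraints duplicate some row/column pairs; this is harmless for satisfiability and is usually filtered in a real encoder.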

Constraint Propagation is Unit Propagation

[Figure: the Sudoku board from above after constraint propagation; square (5, 7) now carries the value 7.]

From ¬p^3_{1,7} ∨ ¬p^3_{5,7} and p^3_{1,7} we obtain by unit propagation ¬p^3_{5,7}, and further from p^1_{5,7} ∨ p^2_{5,7} ∨ p^3_{5,7} ∨ p^4_{5,7} ∨ ... ∨ p^9_{5,7} we get p^1_{5,7} ∨ p^2_{5,7} ∨ p^4_{5,7} ∨ ... ∨ p^9_{5,7} (and finally p^7_{5,7}).

2.9 Other Calculi

OBDDs (Ordered Binary Decision Diagrams): minimized graph representation of decision trees, based on a fixed ordering on propositional variables ⇒ canonical representation of formulas. See the script of the Computational Logic course, or Chapter 6.1/6.2 of Michael Huth and Mark Ryan: Logic in Computer Science: Modelling and Reasoning about Systems, Cambridge Univ. Press, 2000.

FRAIGs (Fully Reduced And-Inverter Graphs): minimized graph representation of Boolean circuits ⇒ semi-canonical representation of formulas. Implementations need DPLL (and OBDDs) as subroutines.

Further calculi:
• Tableau calculus
• Hilbert calculus
• Sequent calculus
• Natural deduction

2.10 Superposition Versus CDCL

We establish a relationship between Superposition and CDCL. For CDCL we assume eager propagation and false clause detection.

Superposition: Is based on an ordering ≺. It computes a model assumption NI. Either NI is a model, N contains the empty clause, or there is an inference on the minimal false clause with respect to ≺. CDCL: Is based on a variable selection heuristic. It computes a model assumption via decision variables and propagation. Either this assumption is a model of N, N contains the empty clause, or there is a backjump clause that is learned.

Proposition 2.20 Let (L1 + L2 + ... + Lk; N) be a CDCL state, where some of the Li may be decision literals, and let the corresponding propositional variables be P1, ..., Pk. Furthermore, assume that L1 + ... + Lk−1 is a partial valuation that does not falsify any clause in N, whereas L1 + L2 + ... + Lk falsifies some clause C ∨ ¬Lk ∈ N. Then

(a) Lk is a propagated literal.

(b) The resolvent between C ∨ ¬Lk and the clause propagating Lk is a superposition inference and the conclusion is not redundant with respect to the ordering P1 ≺ P2 ≺ ... ≺ Pk.

Proof. (a) The clause C ∨ ¬Lk propagates ¬Lk with respect to L1 + ... + Lk−1, so with eager propagation the literal Lk cannot be a decision literal but was propagated by a clause C′ ∨ Lk ∈ N.
(b) Both C and C′ only contain literals with variables from P1, ..., Pk−1. Since we assume duplicate literals to be removed and tautologies to be deleted, the literal ¬Lk is strictly maximal in C ∨ ¬Lk and Lk is strictly maximal in C′ ∨ Lk. So resolving on Lk is a superposition inference with respect to the variable ordering P1 ≺ P2 ≺ ... ≺ Pk. Now assume C ∨ C′ is redundant, i.e., there are clauses D1, ..., Dn from N with Di ≺ C ∨ C′ and D1, ..., Dn |= C ∨ C′. Since C ∨ C′ is false in L1 + ... + Lk−1 there is at least one Di that is also false in L1 + ... + Lk−1. This contradicts the assumption that L1 + ... + Lk−1 does not falsify any clause in N. □

Proposition 2.21 The 1UIP backjump clause is not redundant.

Proof. By Proposition 2.20 a 1UIP backjump clause obtained by a single resolution step has this property. The argument in the proof of Proposition 2.20 can be repeated until we reach the first decision literal Lm by resolving away Lk, Lk−1, ..., Lm+1. □

Proposition 2.22 Let (L1 + L2 + ... + Lk; N) be a CDCL state. We assume that all decision literals among the Li are negative and let the corresponding propositional variables be P1, ..., Pk. Furthermore, let us assume that L1 + ... + Lk is a partial valuation that does not falsify any clause in N. Then N_I^{≺Pk+1} = {P1, ..., Pk} ∩ {L1, ..., Lk} with ordering P1 ≺ P2 ≺ ... ≺ Pk+1.

Proof. We assume that there is a variable Pk+1 ∈ Σ, for otherwise it can be added. By induction on k. For the base case k = 1 we distinguish two cases. If L1 is propagated then there is a unit clause L1 ∈ N. In case L1 is positive it is also productive and L1 ∈ N_I^{≺P2}. If it is negative then there cannot be a clause P1 ∈ N, so P1 ∉ N_I^{≺P2}.

For the induction step assume N_I^{≺Pk} = {P1, ..., Pk−1} ∩ {L1, ..., Lk−1}. If Lk is propagated and positive, then there is a clause C ∨ Lk where all atoms in C are from P1, ..., Pk−1 and hence Lk is strictly maximal in C ∨ Lk; the clause C is false in N_I^{≺Pk} and therefore Lk is produced, proving N_I^{≺Pk+1} = {P1, ..., Pk} ∩ {L1, ..., Lk}.

If Lk is propagated and negative, then there cannot be a clause C ∨ Pk ∈ N with C false in N_I^{≺Pk}, for otherwise L1 + ... + Lk would falsify a clause in N. So there is no clause in N producing Pk and hence N_I^{≺Pk+1} = {P1, ..., Pk} ∩ {L1, ..., Lk}.

If Lk is a decision literal and therefore negative, there cannot be a clause C ∨ Pk ∈ N with C false in N_I^{≺Pk}, because we assume eager propagation, and so again N_I^{≺Pk+1} = {P1, ..., Pk} ∩ {L1, ..., Lk}. □

3 First-Order Logic

First-order logic
• formalizes fundamental mathematical concepts,
• is expressive (Turing-complete),
• is not too expressive (e.g. not axiomatizable: natural numbers, uncountable sets),
• has a rich structure of decidable fragments,
• has a rich model and proof theory.
First-order logic is also called (first-order) predicate logic.

3.1 Syntax

Syntax: • non-logical symbols (domain-specific) ⇒ terms, atomic formulas • logical connectives (domain-independent) ⇒ Boolean combinations, quantifiers

Signature

A signature Σ = (Ω, Π) fixes an alphabet of non-logical symbols, where • Ω is a set of function symbols f with arity n ≥ 0, written arity(f)= n, • Π is a set of predicate symbols P with arity m ≥ 0, written arity(P )= m. Function symbols are also called operator symbols. If n = 0 then f is also called a constant (symbol). If m = 0 then P is also called a propositional variable. We will usually use b, c, d for constant symbols, f, g, h for non-constant function symbols, P , Q, R, S for predicate symbols. Convention: We will usually write f/n ∈ Ω instead of f ∈ Ω, arity(f)= n (analogously for predicate symbols). Refined concept for practical applications: many-sorted signatures (corresponds to simple type systems in programming languages); not so interesting from a logical point of view.

Variables

Predicate logic admits the formulation of abstract, schematic assertions. (Object) vari- ables are the technical tool for schematization. We assume that X is a given countably infinite set of symbols which we use to denote variables.

Context-Free Grammars

We define many of our notions on the bases of context-free grammars. Recall that a context-free grammar G =(N,T,P,S) consists of: • a set of non-terminal symbols N • a set of terminal symbols T • a set P of rules A ::= w where A ∈ N and w ∈ (N ∪ T )∗ • a start symbol S where S ∈ N

For rules A ::= w1, A ::= w2 we write A ::= w1 | w2

Terms

Terms over Σ and X (Σ-terms) are formed according to these syntactic rules: s,t,u,v ::= x , x ∈ X (variable) | f(s1, ..., sn), f/n ∈ Ω (functional term)

By TΣ(X) we denote the set of Σ-terms (over X). A term not containing any variable is called a ground term. By TΣ we denote the set of Σ-ground terms. In other words, terms are formal expressions with well-balanced brackets which we may also view as marked, ordered trees. The markings are function symbols or variables. The nodes correspond to the subterms of the term. A node v that is marked with a function symbol f of arity n has exactly n subtrees representing the n immediate subterms of v.

Atoms

Atoms (also called atomic formulas) over Σ are formed according to this syntax:

A, B ::= P (s1,...,sm), P/m ∈ Π (non-equational atom) | (s ≈ t) (equation)

Whenever we admit equations as atomic formulas we are in the realm of first-order logic with equality. Admitting equality does not really increase the expressiveness of first-order logic (cf. exercises). But deductive systems where equality is treated specifically are much more efficient.

Literals

L ::= A (positive literal) | ¬A (negative literal)

Clauses

C,D ::= ⊥ (empty clause) | L1 ∨ ... ∨ Lk, k ≥ 1 (non-empty clause)

General First-Order Formulas

FΣ(X) is the set of first-order formulas over Σ defined as follows: φ,ψ,χ ::= ⊥ (falsum) | ⊤ (verum) | A (atomic formula) | ¬φ (negation) | (φ ∧ ψ) (conjunction) | (φ ∨ ψ) (disjunction) | (φ → ψ) (implication) | (φ ↔ ψ) (equivalence) | ∀xφ (universal quantification) | ∃xφ (existential quantification)

Notational Conventions

We omit brackets according to the conventions for propositional logic.

Furthermore, ∀x1,...,xn φ (∃x1,...,xn φ) abbreviates ∀x1 ... ∀xn φ (∃x1 ... ∃xn φ). We use infix-, prefix-, postfix-, or mixfix-notation with the usual operator precedences. Examples: s + t ∗ u for +(s, ∗(t, u)) s ∗ u ≤ t + v for ≤ (∗(s,u), +(t, v)) −s for −(s) 0 for 0()

Example: Peano Arithmetic

ΣPA = (ΩPA, ΠPA)
ΩPA = {0/0, +/2, ∗/2, s/1}
ΠPA = {≤/2, </2}
with notational precedence ∗ >p + >p < >p ≤.
Examples of formulas over this signature are:
∀x, y (x ≤ y ↔ ∃z (x + z ≈ y))
∃x∀y (x + y ≈ y)
∀x, y (x ∗ s(y) ≈ x ∗ y + x)
∀x, y (s(x) ≈ s(y) → x ≈ y)
∀x∃y (x < y ∧ ∀z (x < z → y ≤ z))

Remarks About the Example

We observe that the symbols ≤, <, 0, s are redundant, as they can be defined in first-order logic with equality just with the help of +: the first formula defines ≤, the second defines zero, and the last defines s. Eliminating the existential quantifiers by Skolemization (cf. below) reintroduces the “redundant” symbols. Consequently there is a trade-off between the complexity of the quantification structure and the complexity of the signature.

Positions in Terms and Formulas

The set of positions is extended from propositional logic to first-order logic. The positions of a term s (formula φ):
pos(x) = {ε},
pos(f(s1, ..., sn)) = {ε} ∪ ⋃_{i=1..n} { ip | p ∈ pos(si) },
pos(P(t1, ..., tn)) = {ε} ∪ ⋃_{i=1..n} { ip | p ∈ pos(ti) },
pos(∀x φ) = {ε} ∪ { 1p | p ∈ pos(φ) },
pos(∃x φ) = {ε} ∪ { 1p | p ∈ pos(φ) }.
The prefix order ≤, the subformula (subterm) operator, the formula (term) replacement operator and the size operator are extended accordingly. See the definitions in the propositional logic section.
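A tiny sketch of pos on terms, assuming terms encoded as ('var', x) or ('fun', f, args) and positions as strings of digits with ε represented by the empty string (all encoding choices illustrative):

def pos(t):
    if t[0] == 'var':
        return {""}
    return {""} | {str(i + 1) + p
                   for i, s in enumerate(t[2]) for p in pos(s)}

# Example: pos(f(x, g(y))) = {"", "1", "2", "21"}
x, y = ('var', 'x'), ('var', 'y')
print(pos(('fun', 'f', (x, ('fun', 'g', (y,))))))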

Bound and Free Variables

In Qxφ, Q ∈ {∃, ∀}, we call φ the scope of the quantifier Qx. An occurrence of a variable x is called bound, if it is inside the scope of a quantifier Qx. Any other occurrence of a variable is called free. Formulas without free variables are also called closed formulas or sentential forms. Formulas without variables are called ground. Example:

∀y (∀x P(x) → Q(x, y))
Here the scope of ∀y is the whole implication, the scope of ∀x is P(x). The occurrence of y is bound, as is the first occurrence of x. The second occurrence of x is a free occurrence.

Substitutions

Substitution is a fundamental operation on terms and formulas that occurs in all infer- ence systems for first-order logic. In general, substitutions are mappings

σ : X → TΣ(X)
such that the domain of σ, that is, the set dom(σ) = { x ∈ X | σ(x) ≠ x }, is finite. The set of variables introduced by σ, that is, the set of variables occurring in one of the terms σ(x), with x ∈ dom(σ), is denoted by codom(σ).

Substitutions are often written as {x1 → s1,...,xn → sn}, with xi pairwise distinct, and then denote the mapping

{x1 → s1, ..., xn → sn}(y) = si, if y = xi; and y, otherwise.
We also write xσ for σ(x). The modification of a substitution σ at x is defined as follows:
σ[x → t](y) = t, if y = x; and σ(y), otherwise.

Why Substitution is Complicated

We define the application of a substitution σ to a term t or formula φ by structural induction over the syntactic structure of t or φ by the equations depicted on the next page. In the presence of quantification it is surprisingly complex: We need to make sure that the (free) variables in the codomain of σ are not captured upon placing them into the scope of a quantifier Qy, hence the bound variable must be renamed into a “fresh”, that is, previously unused, variable z. Why this definition of substitution is well-defined will be discussed below.

Application of a Substitution

“Homomorphic” extension of σ to terms and formulas:

f(s1,...,sn)σ = f(s1σ,...,snσ) ⊥σ = ⊥ ⊤σ = ⊤

P(s1, ..., sn)σ = P(s1σ, ..., snσ)
(u ≈ v)σ = (uσ ≈ vσ)
(¬φ)σ = ¬(φσ)
(φ ρ ψ)σ = (φσ ρ ψσ) for each binary connective ρ
(Qx φ)σ = Qz (φ σ[x → z]) with z a fresh variable
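A term-only sketch of this homomorphic extension (the quantifier case with its fresh variable z is deliberately omitted); terms are encoded as ('var', x) or ('fun', f, (t1, ..., tn)), an illustrative encoding, not from the lecture:

def apply_subst(t, sigma):
    if t[0] == 'var':
        return sigma.get(t[1], t)  # xσ; identity outside dom(σ)
    _, f, args = t
    return ('fun', f, tuple(apply_subst(s, sigma) for s in args))

# Example: g(x, x){x → f(y)} = g(f(y), f(y))
x, y = ('var', 'x'), ('var', 'y')
print(apply_subst(('fun', 'g', (x, x)), {'x': ('fun', 'f', (y,))}))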

Structural Induction

Proposition 3.1 Let G = (N,T,P,S) be a context-free grammar (possibly infinite) and let q be a property of T ∗ (the words over the alphabet T of terminal symbols of G). q holds for all words w ∈ L(G), whenever one can prove the following two properties:

1. (base cases) q(w′) holds for each w′ ∈ T∗ such that X ::= w′ is a rule in P.
2. (step cases) If X ::= w0 X0 w1 ... wn Xn wn+1 is in P with Xi ∈ N, wi ∈ T∗, n ≥ 0, then for all w′i ∈ L(G, Xi), whenever q(w′i) holds for 0 ≤ i ≤ n, then also q(w0 w′0 w1 ... wn w′n wn+1) holds.

Here L(G, Xi) ⊆ T∗ denotes the language generated by the grammar G from the non-terminal Xi.

Structural Recursion

Proposition 3.2 Let G = (N, T, P, S) be an unambiguous (why?) context-free grammar. A function f is well-defined on L(G) (that is, unambiguously defined) whenever these two properties are satisfied:
1. (base cases) f is well-defined on the words w′ ∈ T∗ for each rule X ::= w′ in P.
2. (step cases) If X ::= w0 X0 w1 ... wn Xn wn+1 is a rule in P then f(w0 w′0 w1 ... wn w′n wn+1) is well-defined, assuming that each of the f(w′i) is well-defined.

Substitution Revisited

Q: Does Proposition 3.2 justify that our homomorphic extension

apply : FΣ(X) × (X → TΣ(X)) → FΣ(X), with apply(φ, σ) denoted by φσ, of a substitution is well-defined?
A: We have two problems here. One is that “fresh” is (deliberately) left unspecified. That can be easily fixed by adding an extra variable-counter argument to the apply function. The second problem is that Proposition 3.2 applies to unary functions only. The standard solution to this problem is to curry, that is, to consider the binary function as a unary function producing a unary (residual) function as a result:

apply :FΣ(X) → ((X → TΣ(X)) → FΣ(X)) where we have denoted (apply(φ))(σ) as φσ.

3.2 Semantics

To give semantics to a logical system means to define a notion of truth for the formulas. The concept of truth that we will now define for first-order logic goes back to Tarski. As in the propositional case, we use a two-valued logic with truth values “true” and “false” denoted by 1 and 0, respectively.

Structures

A Σ-algebra (also called Σ-interpretation or Σ-structure) is a triple

A = (UA, (fA : UA^n → UA)_{f/n ∈ Ω}, (PA ⊆ UA^m)_{P/m ∈ Π})
where UA ≠ ∅ is a set, called the universe of A. By Σ-Alg we denote the class of all Σ-algebras.

Assignments

A variable has no intrinsic meaning. The meaning of a variable has to be defined externally (explicitly or implicitly in a given context) by an assignment. A (variable) assignment, also called a valuation (over a given Σ-algebra A), is a map β : X → UA. Variable assignments are the semantic counterparts of substitutions.

Value of a Term in A with Respect to β

By structural induction we define

A(β) : TΣ(X) → UA as follows: A(β)(x)= β(x), x ∈ X A(β)(f(s1,...,sn)) = fA(A(β)(s1),..., A(β)(sn)), f/n ∈ Ω

In the scope of a quantifier we need to evaluate terms with respect to modified assignments. To that end, let β[x → a] : X → UA, for x ∈ X and a ∈ UA, denote the assignment

β[x → a](y) = a, if y = x; and β(y), otherwise.

Truth Value of a Formula in A with Respect to β

A(β):FΣ(X) →{0, 1} is defined inductively as follows:

A(β)(⊥)=0 A(β)(⊤)=1

A(β)(P (s1,...,sn))=1 ⇔ (A(β)(s1),..., A(β)(sn)) ∈ PA A(β)(s ≈ t)=1 ⇔ A(β)(s)= A(β)(t) A(β)(¬φ)=1 ⇔ A(β)(φ)=0

A(β)(φρψ)= Bρ(A(β)(φ), A(β)(ψ))

with Bρ the Boolean function associated with ρ
A(β)(∀x φ) = min_{a ∈ UA} { A(β[x → a])(φ) }
A(β)(∃x φ) = max_{a ∈ UA} { A(β[x → a])(φ) }
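A sketch of A(β) over a finite universe (finiteness makes the min/max over UA computable); the formula encoding and all names are illustrative assumptions:

def eval_term(t, funcs, beta):
    if t[0] == 'var':
        return beta[t[1]]  # A(β)(x) = β(x)
    _, f, args = t
    return funcs[f](*(eval_term(s, funcs, beta) for s in args))

def eval_formula(phi, U, funcs, preds, beta):
    op = phi[0]
    if op == 'atom':  # P(s1, ..., sn)
        _, P, args = phi
        return tuple(eval_term(s, funcs, beta) for s in args) in preds[P]
    if op == 'not':
        return not eval_formula(phi[1], U, funcs, preds, beta)
    if op == 'and':
        return all(eval_formula(p, U, funcs, preds, beta) for p in phi[1:])
    if op == 'or':
        return any(eval_formula(p, U, funcs, preds, beta) for p in phi[1:])
    if op in ('forall', 'exists'):  # the min/max over the universe
        _, x, body = phi
        vals = (eval_formula(body, U, funcs, preds, {**beta, x: a}) for a in U)
        return all(vals) if op == 'forall' else any(vals)
    raise ValueError(op)

# Example over U = {0, 1}: ∀x ∃y leq(x, y) evaluates to 1 (True)
U = {0, 1}
preds = {'leq': {(a, b) for a in U for b in U if a <= b}}
phi = ('forall', 'x', ('exists', 'y',
       ('atom', 'leq', (('var', 'x'), ('var', 'y')))))
print(eval_formula(phi, U, {}, preds, {}))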

Example

The “Standard” Interpretation for Peano Arithmetic:

UN = {0, 1, 2,...}

0N = 0

sN : n → n +1

+N : (n, m) → n + m

∗N : (n, m) → n ∗ m

≤N = { (n, m) | n less than or equal to m }

Note that N is just one out of many possible ΣPA-interpretations. Values over N for Sample Terms and Formulas: Under the assignment β : x → 1,y → 3 we obtain

N(β)(s(x) + s(0)) = 3
N(β)(x + y ≈ s(y)) = 1
N(β)(∀x, y (x + y ≈ y + x)) = 1
N(β)(∀z (z ≤ y)) = 0
N(β)(∀x∃y (x < y)) = 1

3.3 Models, Validity, and Satisfiability

φ is valid in A under assignment β:

A, β |= φ :⇔ A(β)(φ)=1

φ is valid in A (A is a model of φ):

A|= φ :⇔ A, β |= φ, for all β ∈ X → UA

φ is valid (or is a tautology):

|= φ :⇔ A|= φ, for all A ∈ Σ-Alg

φ is called satisfiable iff there exist A and β such that A, β |= φ. Otherwise φ is called unsatisfiable.

Substitution Lemma

The following propositions, to be proved by structural induction, hold for all Σ-algebras A, assignments β, and substitutions σ.

Lemma 3.3 For any Σ-term t

A(β)(tσ) = A(β ◦ σ)(t), where β ◦ σ : X → UA is the assignment β ◦ σ(x) = A(β)(xσ).

Proposition 3.4 For any Σ-formula φ, A(β)(φσ)= A(β ◦ σ)(φ).

Corollary 3.5 A, β |= φσ ⇔ A, β ◦ σ |= φ

These theorems basically express that the syntactic concept of substitution corresponds to the semantic concept of an assignment.

Entailment and Equivalence

φ entails (implies) ψ (or ψ is a consequence of φ), written φ |= ψ, if for all A ∈ Σ-Alg and β ∈ X → UA, whenever A, β |= φ, then A, β |= ψ.

φ and ψ are called equivalent, written φ |=| ψ, if for all A ∈ Σ-Alg and β ∈ X → UA we have A, β |= φ ⇔ A, β |= ψ.

Proposition 3.6 φ entails ψ iff (φ → ψ) is valid

Proposition 3.7 φ and ψ are equivalent iff (φ ↔ ψ) is valid.

Extension to sets of formulas N in the “natural way”, e. g., N |= φ

:⇔ for all A ∈ Σ-Alg and β ∈ X → UA: if A, β |= ψ, for all ψ ∈ N, then A, β |= φ.

Validity vs. Unsatisfiability

Validity and unsatisfiability are just two sides of the same coin, as explained by the following proposition.

Proposition 3.8 Let φ and ψ be formulas, let N be a set of formulas. Then (i) φ is valid if and only if ¬φ is unsatisfiable. (ii) φ |= ψ if and only if φ ∧ ¬ψ is unsatisfiable. (iii) N |= ψ if and only if N ∪ {¬ψ} is unsatisfiable.

Hence in order to design a theorem prover (validity checker) it is sufficient to design a checker for unsatisfiability.

Theory of a Structure

Let A ∈ Σ-Alg. The (first-order) theory of A is defined as

Th(A) = { ψ ∈ FΣ(X) | A |= ψ }

Problem of axiomatizability: For which structures A can one axiomatize Th(A), that is, can one write down a formula φ (or a recursively enumerable set φ of formulas) such that Th(A) = { ψ | φ |= ψ }? Analogously for sets of structures.

Two Interesting Theories

Let ΣPres = ({0/0, s/1, +/2}, ∅) and Z+ = (Z, 0, s, +) its standard interpretation on the integers. Th(Z+) is called Presburger arithmetic (M. Presburger, 1929). (There is no essential difference when one, instead of Z, considers the natural numbers N as standard interpretation.) Presburger arithmetic is decidable in 3EXPTIME (D. Oppen, JCSS, 16(3):323–332, 1978), and in 2EXPSPACE, using automata-theoretic methods (and there is a constant c ≥ 0 such that Th(Z+) ∉ NTIME(2^{2^{cn}})).

However, N∗ =(N, 0, s, +, ∗), the standard interpretation of ΣPA =({0/0,s/1, +/2, ∗/2}, ∅), has as theory the so-called Peano arithmetic which is undecidable, not even recursively enumerable. Note: The choice of signature can make a big difference with regard to the computational complexity of theories.

3.4 Algorithmic Problems

Validity(φ): |= φ ? Satisfiability(φ): φ satisfiable? Entailment(φ,ψ): does φ entail ψ? Model(A,φ): A|= φ? Solve(A,φ): find an assignment β such that A, β |= φ. Solve(φ): find a substitution σ such that |= φσ. Abduce(φ): find ψ with “certain properties” such that ψ |= φ.

Gödel’s Famous Theorems

1. For most signatures Σ, validity is undecidable for Σ-formulas. (Later by Turing: Encode Turing machines as Σ-formulas.) 2. For each signature Σ, the set of valid Σ-formulas is recursively enumerable. (We will prove this by giving complete deduction systems.)

3. For Σ = ΣPA and N∗ = (N, 0, s, +, ∗), the theory Th(N∗) is not recursively enumerable.

These complexity results motivate the study of subclasses of formulas (fragments) of first-order logic.
Q: Can you think of any fragments of first-order logic for which validity is decidable?

Some Decidable Fragments

Some decidable fragments: • Monadic class: no function symbols, all predicates unary; validity is NEXPTIME- complete. • Variable-free formulas without equality: satisfiability is NP-complete. (why?) • Variable-free Horn clauses (clauses with at most one positive atom): entailment is decidable in linear time. • Finite model checking is decidable in time polynomial in the size of the structure and the formula.

Plan

Lift superposition from propositional logic to first-order logic.

3.5 Normal Forms and Skolemization

Study of normal forms motivated by • reduction of logical concepts, • efficient data structures for theorem proving, • satisfiability preserving transformations (renaming), • Skolem’s and Herbrand’s theorem. The main problem in first-order logic is the treatment of quantifiers. The subsequent normal form transformations are intended to eliminate many of them.

Prenex Normal Form (Traditional)

Prenex formulas have the form

Q1x1 ...Qnxn φ, where φ is quantifier-free and Qi ∈ {∀, ∃}; we call Q1x1 ...Qnxn the quantifier prefix and φ the matrix of the formula.

Computing prenex normal form by the rewrite system ⇒P :

φ[(ψ1 ↔ ψ2)]p ⇒P φ[(ψ1 → ψ2) ∧ (ψ2 → ψ1)]p
φ[¬Qx ψ1]p ⇒P φ[Q̄x ¬ψ1]p
φ[((Qx ψ1) ρ ψ2)]p ⇒P φ[Qy (ψ1{x → y} ρ ψ2)]p, ρ ∈ {∧, ∨}
φ[((Qx ψ1) → ψ2)]p ⇒P φ[Q̄y (ψ1{x → y} → ψ2)]p
φ[(ψ1 ρ (Qx ψ2))]p ⇒P φ[Qy (ψ1 ρ ψ2{x → y})]p, ρ ∈ {∧, ∨, →}

Here y is always assumed to be some fresh variable and Q̄ denotes the quantifier dual to Q, i.e., ∀̄ = ∃ and ∃̄ = ∀.

Skolemization

Intuition: replacement of ∃y by a concrete choice function computing y from all the arguments y depends on.

Transformation ⇒S (to be applied outermost, not in subformulas):

∀x1,...,xn∃yφ ⇒S ∀x1,...,xn φ{y → f(x1,...,xn)} where f/n is a new function symbol (Skolem function).

Together: φ ⇒*P ψ ⇒*S χ, where ψ is prenex and χ is prenex without ∃.
Theorem 3.9 Let φ, ψ, and χ be as defined above and closed. Then (i) φ and ψ are equivalent. (ii) χ |= ψ but the converse is not true in general. (iii) ψ satisfiable (Σ-Alg) ⇔ χ satisfiable (Σ′-Alg) where Σ′ = (Ω ∪ SKF, Π), if Σ = (Ω, Π).
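A small sketch of ⇒S on closed prenex formulas, represented as a pair (prefix, matrix) with prefix a list like [('forall', 'x'), ('exists', 'y')]; the substitution function subst on the matrix is assumed given, and all names are illustrative:

from itertools import count

_fresh = count()

def skolemize_prenex(prefix, matrix, subst):
    universals = []
    for quant, x in prefix:
        if quant == 'forall':
            universals.append(x)
        else:
            # ∃x: replace x by f(x1, ..., xn) over the ∀-variables so far
            f = "sk%d" % next(_fresh)  # new Skolem function symbol f/n
            sk_term = ('fun', f, tuple(('var', y) for y in universals))
            matrix = subst(matrix, x, sk_term)
    return [('forall', y) for y in universals], matrix

Note that the arity of each Skolem function is exactly the number of universal quantifiers preceding the eliminated ∃, which is why the optimizations of Section 3.6 aim at small arities.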

The Complete Picture

φ ⇒*P Q1y1 ... Qnyn ψ (ψ quantifier-free)
⇒*S ∀x1, ..., xm χ (m ≤ n, χ quantifier-free)
⇒*OCNF ∀x1, ..., xm ⋀_{i=1..k} ⋁_{j=1..ni} Lij
Leaving out the quantifier prefix and the conjunction yields the clauses Ci = Li1 ∨ ... ∨ Lini; the resulting formula is denoted φ′.

N = {C1,...,Ck} is called the clausal (normal) form (CNF) of φ. Note: the variables in the clauses are implicitly universally quantified.

Theorem 3.10 Let φ be closed. Then φ′ |= φ. (The converse is not true in general.)

Theorem 3.11 Let φ be closed. Then φ is satisfiable iff φ′ is satisfiable iff N is satisfiable.

Optimization

The normal form algorithm described so far leaves lots of room for optimization. Note that we only can preserve satisfiability anyway due to Skolemization. • size of the CNF is exponential when done naively; the transformations we intro- duced already for propositional logic avoid this exponential growth; • we want to preserve the original formula structure; • we want small arity of Skolem functions (see next section).

3.6 Getting Small Skolem Functions

A clause set that is better suited for automated theorem proving can be obtained using the following steps: • rename beneficial subformulas • produce a negation normal form (NNF) • apply miniscoping • rename all variables • skolemize

Formula renaming

We extend the machinery from propositional to first-order logic: ν(∀xφ) = ν(∃xφ) = ν(φ) and ν̄(∀xφ) = ν̄(∃xφ) = ν̄(φ).
Introduce top-down fresh predicates for beneficial subformulas:
ψ[φ]p ⇒OCNF ψ[P(x1, ..., xn)]p ∧ def(ψ, p, P)
where {x1, ..., xn} are the free variables in φ, P/n is a predicate new to ψ[φ]p, ν(ψ[φ]p) > ν(ψ[P]p ∧ def(ψ, p, P)), and def(ψ, p, P) is defined polarity-dependent analogously to the propositional case:
def(ψ, p, P) := ∀x1, ..., xn [ψ|p ◦ P(x1, ..., xn)] where ◦ ∈ {→, ↔, ←}.

Negation Normal Form (NNF)

Apply the rewrite system ⇒NNF:

φ[ψ1 ↔ ψ2]p ⇒NNF φ[(ψ1 → ψ2) ∧ (ψ2 → ψ1)]p if pol(φ,p) = 1 or pol(φ,p)=0

φ[ψ1 ↔ ψ2]p ⇒NNF φ[(ψ1 ∧ ψ2) ∨ (¬ψ2 ∧ ¬ψ1)]p if pol(φ,p)= −1

φ[¬Qx ψ]p ⇒NNF φ[Q̄x ¬ψ]p
φ[¬(ψ1 ∨ ψ2)]p ⇒NNF φ[¬ψ1 ∧ ¬ψ2]p
φ[¬(ψ1 ∧ ψ2)]p ⇒NNF φ[¬ψ1 ∨ ¬ψ2]p
φ[ψ1 → ψ2]p ⇒NNF φ[¬ψ1 ∨ ψ2]p
φ[¬¬ψ]p ⇒NNF φ[ψ]p

Miniscoping

Apply the rewrite relation ⇒MS. For the rules below we assume that x occurs freely in ψ1, ψ3, but x does not occur freely in ψ2:

φ[Qx (ψ1 ∧ ψ2)]p ⇒MS φ[(Qx ψ1) ∧ ψ2]p
φ[Qx (ψ1 ∨ ψ2)]p ⇒MS φ[(Qx ψ1) ∨ ψ2]p
φ[∀x (ψ1 ∧ ψ3)]p ⇒MS φ[(∀x ψ1) ∧ (∀x ψ3)]p
φ[∃x (ψ1 ∨ ψ3)]p ⇒MS φ[(∃x ψ1) ∨ (∃x ψ3)]p

Variable Renaming

Rename all variables in φ such that there are no two different positions p, q with φ|p = Qx ψ1 and φ|q = Q′x ψ2.

Standard Skolemization

Apply the rewrite rule:

φ[∃x ψ]p ⇒SK φ[ψ{x → f(y1, ..., yn)}]p
where p has minimal length, {y1, ..., yn} are the free variables in ∃x ψ, and f/n is a function symbol new to φ.

3.7 Herbrand Interpretations

From now on we shall consider FOL without equality. We assume that Ω contains at least one constant symbol. A Herbrand interpretation (over Σ) is a Σ-algebra A such that

• UA = TΣ (= the set of ground terms over Σ)

• fA :(s1,...,sn) → f(s1,...,sn), f/n ∈ Ω

fA(△, ..., △) = f(△, ..., △), i.e., fA constructs the term with top symbol f over its argument terms.

In other words, values are fixed to be ground terms and functions are fixed to be the term constructors. Only predicate symbols P/m ∈ Π may be freely interpreted as relations PA ⊆ TΣ^m.

Proposition 3.12 Every set of ground atoms I uniquely determines a Herbrand interpretation A via

(s1,...,sn) ∈ PA :⇔ P (s1,...,sn) ∈ I

Thus we shall identify Herbrand interpretations (over Σ) with sets of Σ-ground atoms.

Example: ΣPres = ({0/0, s/1, +/2}, {</2, ≤/2})

N as Herbrand interpretation over ΣPres:
I = { 0 ≤ 0, 0 ≤ s(0), 0 ≤ s(s(0)), ...,
0 + 0 ≤ 0, 0 + 0 ≤ s(0), ...,
(s(0) + 0) + s(0) ≤ s(0) + (s(0) + s(0)),
...,
s(0) + 0 < s(0) + s(0),
... }

Existence of Herbrand Models

A Herbrand interpretation I is called a Herbrand model of φ, if I |= φ.

Theorem 3.13 (Herbrand) Let N be a set of Σ-clauses. N satisfiable ⇔ N has a Herbrand model (over Σ) ⇔ GΣ(N) has a Herbrand model (over Σ) where GΣ(N) = { Cσ ground clause | C ∈ N, σ : X → TΣ } is the set of ground instances of N.

[The proof will be given below in the context of the completeness proof for superposition.]

Example of a GΣ

For ΣPres one obtains, for a clause C containing variables x and y, the infinite set GΣ(C) of ground instances by instantiating x and y with all ground terms 0, s(0), 0 + 0, s(0) + 0, s(s(0)), ...

3.8 Inference Systems and Proofs

Inference systems Γ (proof calculi) are sets of tuples

(φ1, ..., φn, φn+1), n ≥ 0, called inferences, and written

φ1 ... φn
------------
φn+1

with premises φ1, ..., φn and conclusion φn+1.

Clausal inference system: premises and conclusions are clauses. One also considers inference systems over other data structures.

Inference Systems

Inference systems Γ are shorthands for rewrite systems over sets of formulas. If N is a set of formulas, then

φ1 ... φn
------------ if side condition
φn+1

is a shorthand for

N ∪ {φ1, ..., φn} ⇒Γ N ∪ {φ1, ..., φn} ∪ {φn+1} if side condition.

Proofs

A proof in Γ of a formula φ from a set of formulas N (called assumptions) is a sequence φ1, ..., φk of formulas where

(i) φk = φ,

(ii) for all 1 ≤ i ≤ k: φi ∈ N, or else there exists an inference

φi1 ... φini
------------
φi

in Γ, such that 0 ≤ ij < i, for 1 ≤ j ≤ ni.

Soundness and Completeness

Provability ⊢Γ of φ from N in Γ: N ⊢Γ φ if there exists a proof in Γ of φ from N.
Γ is called sound :⇔

φ1 ... φn
------------
φ

∈ Γ implies φ1, ..., φn |= φ.

Γ is called complete

N |= φ implies N ⊢Γ φ

Γ is called refutationally complete

N |= ⊥ implies N ⊢Γ ⊥

Proposition 3.14

(i) Let Γ be sound. Then N ⊢Γ φ implies N |= φ

(ii) N ⊢Γ φ implies there exist finitely many clauses φ1,...,φn ∈ N such that φ1,...,φn ⊢Γ φ

Proofs as Trees

markings = formulas; leaves = assumptions; other nodes = inferences, with the conclusion as ancestor and the premises as direct descendants.

[Proof tree: from the leaves P(f(c)) ∨ Q(b) and ¬P(f(c)) ∨ ¬P(f(c)) ∨ Q(b) derive ¬P(f(c)) ∨ Q(b) ∨ Q(b), factor to ¬P(f(c)) ∨ Q(b), resolve again with P(f(c)) ∨ Q(b) to Q(b) ∨ Q(b), factor to Q(b); resolve with the leaf ¬P(f(c)) ∨ ¬Q(b) to ¬P(f(c)); finally resolve with the leaf P(f(c)) to ⊥.]

3.9 Ground Superposition

We observe that propositional clauses and ground clauses are essentially the same, as long as we do not consider equational atoms. In this section we only deal with ground clauses and recall partly the material from Section 2.5 for first-order ground clauses.

The Resolution Calculus Res

Resolution inference rule:

D ∨ A    ¬A ∨ C
------------
D ∨ C

Terminology: D ∨ C is the resolvent; A is the resolved atom. For Superposition (Sup): A strictly maximal, ¬A maximal.
(Positive) factorization inference rule:

C ∨ A ∨ A
------------
C ∨ A

For Superposition (Sup): A maximal.
These are schematic inference rules; for each substitution of the schematic variables C, D, and A by ground clauses and ground atoms, respectively, we obtain an inference. We treat “∨” as associative and commutative, hence A and ¬A can occur anywhere in the clauses; moreover, when we write C ∨ A, etc., this includes unit clauses, that is, C = ⊥.

Sample Refutation

1. ¬P(f(c)) ∨ ¬P(f(c)) ∨ Q(b) (given)
2. P(f(c)) ∨ Q(b) (given)
3. ¬P(g(b, c)) ∨ ¬Q(b) (given)
4. P(g(b, c)) (given)
5. ¬P(f(c)) ∨ Q(b) ∨ Q(b) (Res. 2. into 1.)
6. ¬P(f(c)) ∨ Q(b) (Fact. 5.)
7. Q(b) ∨ Q(b) (Res. 2. into 6.)
8. Q(b) (Fact. 7.)
9. ¬P(g(b, c)) (Res. 8. into 3.)
10. ⊥ (Res. 4. into 9.)

Soundness of Resolution

Theorem 3.15 Propositional resolution is sound.

Proof. Let B ∈ Σ-Alg. To be shown:
(i) for resolution: B |= D ∨ A, B |= C ∨ ¬A ⇒ B |= D ∨ C
(ii) for factorization: B |= C ∨ A ∨ A ⇒ B |= C ∨ A
(i): Assume the premises are valid in B. Two cases need to be considered: If B |= A, then B |= C, hence B |= D ∨ C. Otherwise B |= ¬A, then B |= D, and again B |= D ∨ C.
(ii): even simpler. □

Note: In propositional logic (ground clauses) we have:

1. B|= L1 ∨ ... ∨ Ln iff there exists i: B|= Li. 2. B|= A or B|= ¬A. This does not hold for formulas with variables!

Closure of Clause Sets under Res

Res(N) = { C | C is the conclusion of an inference in Res with premises in N }
Res^0(N) = N
Res^{n+1}(N) = Res(Res^n(N)) ∪ Res^n(N), for n ≥ 0
Res*(N) = ⋃_{n≥0} Res^n(N)

N is called saturated (w. r. t. resolution), if Res(N) ⊆ N.

Proposition 3.16 (i) Res∗(N) is saturated. (ii) Res is refutationally complete, iff for each set N of ground clauses: N |= ⊥ iff ⊥ ∈ Res∗(N)
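A naive sketch of computing Res*(N) for ground/propositional clauses given as frozensets of integer literals (with the set representation, factorization is implicit); all names are illustrative:

def resolvents(C, D):
    out = set()
    for a in C:
        if -a in D:  # resolve on the atom of a
            out.add(frozenset((C - {a}) | (D - {-a})))
    return out

def res_star(N, limit=10_000):
    N = set(map(frozenset, N))
    while len(N) < limit:
        new = set()
        for C in N:
            for D in N:
                new |= resolvents(C, D)
        if new <= N:
            return N  # saturated: Res(N) ⊆ N
        N |= new
    raise RuntimeError("clause limit reached")

# N |= ⊥ iff the empty clause is in Res*(N):
print(frozenset() in res_star([{1, 2}, {-1, 2}, {-2}]))  # True: unsatisfiable

This quadratic fixed-point computation is only meant to mirror the definition; real provers apply the ordering, selection, and redundancy restrictions discussed below.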

Construction of Interpretations

Done the same way as for propositional logic: ground atoms play the role of propositional variables.

Model Existence Theorem

Theorem 3.17 (Bachmair & Ganzinger 1990) Let ≻ be a clause ordering, let N be saturated w.r.t. Res (or Sup), and suppose that ⊥ ∉ N. Then N_I^≻ |= N.

Corollary 3.18 Let N be saturated w.r.t. Res. Then N |= ⊥ ⇔ ⊥ ∈ N.

Proof of Theorem 3.17. Suppose ⊥ ∉ N, but N_I^≻ ⊭ N. Let C ∈ N be minimal (in ≻) such that N_I^≻ ⊭ C. Since C is false in N_I^≻, C is not productive. As C ≠ ⊥ there exists a maximal atom A in C.
Case 1: C = ¬A ∨ C′ (i.e., the maximal atom occurs negatively)
⇒ N_I |= A and N_I ⊭ C′
⇒ some D = D′ ∨ A ∈ N produces A. Since there is an inference

D′ ∨ A    ¬A ∨ C′
------------
D′ ∨ C′

we infer that D′ ∨ C′ ∈ N, and C ≻ D′ ∨ C′ and N_I ⊭ D′ ∨ C′. This contradicts the minimality of C.

Case 2: C = C′ ∨ A ∨ A. There is an inference

C′ ∨ A ∨ A
------------
C′ ∨ A

that yields a smaller counterexample C′ ∨ A ∈ N. This contradicts the minimality of C. □

Compactness of Propositional Logic

Theorem 3.19 (Compactness) Let N be a set of propositional (or first-order ground) formulas. Then N is unsatisfiable if and only if some finite subset M ⊆ N is unsatisfiable.

Proof. “⇐”: trivial.
“⇒”: Let N be unsatisfiable.
⇒ Res*(N) unsatisfiable
⇒ ⊥ ∈ Res*(N) by refutational completeness of resolution
⇒ ∃n ≥ 0 : ⊥ ∈ Res^n(N)
⇒ ⊥ has a finite resolution proof P; choose M as the set of assumptions in P. □

3.10 General Resolution

Propositional (ground) resolution:
• refutationally complete,
• in its most naive version: not guaranteed to terminate for satisfiable sets of clauses (improved versions do terminate, however),
• inferior to the DPLL procedure.
But: in contrast to the DPLL procedure, resolution can be easily extended to non-ground clauses.

General Resolution through Instantiation

Idea: instantiate clauses appropriately:

Clauses: P(z′, z′) ∨ ¬Q(z), ¬P(a, y), P(x′, b) ∨ Q(f(x′, x)).
Instantiating with [a/z′, f(a, b)/z], [a/y], [b/y], and [a/x′, b/x] yields the ground clauses

P(a, a) ∨ ¬Q(f(a, b)), ¬P(a, a), ¬P(a, b), P(a, b) ∨ Q(f(a, b))

from which resolution derives ¬Q(f(a, b)) and Q(f(a, b)), and hence ⊥.

Problems: More than one instance of a clause can participate in a proof. Even worse: there are infinitely many possible instances.
Observation: Instantiation must produce complementary literals (so that inferences become possible).
Idea: Do not instantiate more than necessary to get complementary literals.

Instantiating only with [a/z′], [a/y], [b/y], and [a/x′] yields

P(a, a) ∨ ¬Q(z), ¬P(a, a), ¬P(a, b), P(a, b) ∨ Q(f(a, x))

from which resolution derives ¬Q(z) and Q(f(a, x)); instantiating with [f(a, x)/z] produces the complementary pair ¬Q(f(a, x)) and Q(f(a, x)), and hence ⊥.

Lifting Principle

Problem: Make saturation of infinite sets of clauses, as they arise from taking the (ground) instances of finitely many general clauses (with variables), effective and efficient.
Idea (Robinson 1965): resolution for general clauses:
• Equality of ground atoms is generalized to unifiability of general atoms;
• Only compute most general (minimal) unifiers (mgu).

Significance: The advantage of the method in (Robinson 1965) compared with (Gilmore 1960) is that unification enumerates only those instances of clauses that participate in an inference. Moreover, clauses are not right away instantiated into ground clauses. Rather they are instantiated only as far as required for an inference. Inferences with non-ground clauses in general represent infinite sets of ground inferences which are computed simultaneously in a single step.

Resolution for General Clauses

General binary resolution Res:

D ∨ B    C ∨ ¬A
------------ if σ = mgu(A, B) [resolution]
(D ∨ C)σ

C ∨ A ∨ B
------------ if σ = mgu(A, B) [factorization]
(C ∨ A)σ

For inferences with more than one premise, we assume that the variables in the premises are (bijectively) renamed such that they become different to any variable in the other premises. We do not formalize this. Which names one uses for variables is otherwise irrelevant.

Unification

Let E = {s1 ≐ t1, ..., sn ≐ tn} (si, ti terms or atoms) be a multiset of equality problems. A substitution σ is called a unifier of E if siσ = tiσ for all 1 ≤ i ≤ n. If a unifier of E exists, then E is called unifiable. A substitution σ is called more general than a substitution τ, denoted by σ ≤ τ, if there exists a substitution ρ such that ρ ◦ σ = τ, where (ρ ◦ σ)(x) := (xσ)ρ is the composition of σ and ρ as mappings. (Note that ρ ◦ σ has a finite domain as required for a substitution.)

If a unifier of E is more general than any other unifier of E, then we speak of a most general unifier of E, denoted by mgu(E).

Proposition 3.20 (i) ≤ is a quasi-ordering on substitutions, and ◦ is associative. (ii) If σ ≤ τ and τ ≤ σ (we write σ ∼ τ in this case), then xσ and xτ are equal up to (bijective) variable renaming, for any x in X.

A substitution σ is called idempotent, if σ ◦ σ = σ.

Proposition 3.21 σ is idempotent iff dom(σ) ∩ codom(σ)= ∅.

Rule-Based Naive Standard Unification

t ≐ t, E ⇒SU E
f(s1, ..., sn) ≐ f(t1, ..., tn), E ⇒SU s1 ≐ t1, ..., sn ≐ tn, E
f(...) ≐ g(...), E ⇒SU ⊥
x ≐ t, E ⇒SU x ≐ t, E{x → t} if x ∈ var(E), x ∉ var(t)
x ≐ t, E ⇒SU ⊥ if x ≠ t, x ∈ var(t)
t ≐ x, E ⇒SU x ≐ t, E if t ∉ X
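A sketch of these rules as a worklist algorithm on terms ('var', x) / ('fun', f, args), returning an mgu as a dict or None; the encoding and names are illustrative, and the returned bindings are triangular (fully resolved by the helper apply):

def occurs(x, t):
    return (t[0] == 'var' and t[1] == x) or \
           (t[0] == 'fun' and any(occurs(x, s) for s in t[2]))

def apply(t, sigma):
    if t[0] == 'var':
        s = sigma.get(t[1])
        return apply(s, sigma) if s is not None else t
    return ('fun', t[1], tuple(apply(s, sigma) for s in t[2]))

def unify(E):
    sigma = {}
    E = list(E)  # list of (s, t) pairs, the equality problems s ≐ t
    while E:
        s, t = E.pop()
        s, t = apply(s, sigma), apply(t, sigma)
        if s == t:  # rule: t ≐ t, E ⇒ E
            continue
        if s[0] == 'fun' and t[0] == 'fun':
            if s[1] != t[1] or len(s[2]) != len(t[2]):
                return None  # clash: f(...) ≐ g(...) ⇒ ⊥
            E.extend(zip(s[2], t[2]))  # decomposition
        else:
            if t[0] == 'var':  # orient: t ≐ x ⇒ x ≐ t
                s, t = t, s
            if occurs(s[1], t):
                return None  # occur check: x ≐ t with x ∈ var(t) ⇒ ⊥
            sigma[s[1]] = t  # eliminate x
    return sigma

# Example: mgu of f(x, b) ≐ f(a, y) is {x → a, y → b}
x, y = ('var', 'x'), ('var', 'y')
a, b = ('fun', 'a', ()), ('fun', 'b', ())
print(unify([(('fun', 'f', (x, b)), ('fun', 'f', (a, y)))]))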

SU: Main Properties

If E = x1 ≐ u1, ..., xk ≐ uk, with the xi pairwise distinct and xi ∉ var(uj), then E is called an (equational problem in) solved form representing the solution σE = {x1 → u1, ..., xk → uk}.

Proposition 3.22 If E is a solved form then σE is an mgu of E.

Theorem 3.23

1. If E ⇒SU E′ then σ is a unifier of E iff σ is a unifier of E′.
2. If E ⇒*SU ⊥ then E is not unifiable.
3. If E ⇒*SU E′ with E′ in solved form, then σE′ is an mgu of E.

Proof. (1) We have to show this for each of the rules. Let’s treat the case for the 4th rule here. Suppose σ is a unifier of x ≐ t, that is, xσ = tσ. Thus, σ ◦ {x → t} = σ[x → tσ] = σ[x → xσ] = σ. Therefore, for any equation u ≐ v in E: uσ = vσ iff u{x → t}σ = v{x → t}σ. (2) and (3) follow by induction from (1) using Proposition 3.22. □

Main Unification Theorem

Theorem 3.24 E is unifiable if and only if there is a most general unifier σ of E, such that σ is idempotent and dom(σ) ∪ codom(σ) ⊆ var(E).

Proof.

• ⇒SU is Noetherian. A suitable lexicographic ordering on the multisets E (with ⊥ minimal) shows this. Compare in this order:
1. the number of defined variables (i.e., variables x in equations x ≐ t with x ∉ var(t)) which also occur outside their definition elsewhere in E;
2. the multiset ordering induced by (i) the size (number of symbols) of an equation; (ii) if sizes are equal, consider x ≐ t smaller than t ≐ x, if t ∉ X.

• A system E that is irreducible w.r.t. ⇒SU is either ⊥ or a solved form.
• Therefore, reducing any E by SU will end (no matter what reduction strategy we apply) in an irreducible E′ having the same unifiers as E, and we can read off the mgu (or non-unifiability) of E from E′ (Theorem 3.23, Proposition 3.22).
• σ is idempotent because of the substitution in rule 4; dom(σ) ∪ codom(σ) ⊆ var(E), as no new variables are generated. □

Rule-Based Polynomial Unification

Problem: using ⇒SU , an exponential growth of terms is possible. The following unification algorithm avoids this problem, at least if the final solved form is represented as a DAG.

t ≐ t, E ⇒PU E
f(s1, ..., sn) ≐ f(t1, ..., tn), E ⇒PU s1 ≐ t1, ..., sn ≐ tn, E
f(...) ≐ g(...), E ⇒PU ⊥
x ≐ y, E ⇒PU x ≐ y, E{x → y} if x ∈ var(E), x ≠ y
x1 ≐ t1, ..., xn ≐ tn, E ⇒PU ⊥ if there are positions pi with ti/pi = xi+1, tn/pn = x1 and some pi ≠ ε
x ≐ t, E ⇒PU ⊥ if x ≠ t, x ∈ var(t)
t ≐ x, E ⇒PU x ≐ t, E if t ∉ X
x ≐ t, x ≐ s, E ⇒PU x ≐ t, t ≐ s, E if t, s ∉ X and |t| ≤ |s|

Properties of PU

Theorem 3.25

1. If E ⇒PU E′ then σ is a unifier of E iff σ is a unifier of E′.
2. If E ⇒*PU ⊥ then E is not unifiable.
3. If E ⇒*PU E′ with E′ in solved form, then σE′ is an mgu of E.

Note: The solved form of ⇒PU is different from the solved form obtained from ⇒SU. In order to obtain the unifier σE′, we have to sort the list of equality problems xi ≐ ti in such a way that xi does not occur in tj for j < i, and then we have to compose the substitutions {x1 → t1} ◦ ... ◦ {xk → tk}.

Lifting Lemma

Lemma 3.26 Let C and D be variable-disjoint clauses. If Dσ and Cρ have a propositional resolvent C′,

Dσ    Cρ
------------ [propositional resolution]
C′

then there exists a substitution τ such that D and C have a general resolvent C′′ with C′ = C′′τ:

D    C
------------ [general resolution]
C′′

An analogous lifting lemma holds for factorization.

Saturation of Sets of General Clauses

Corollary 3.27 Let N be a set of general clauses saturated under Res, i. e., Res(N) ⊆ N. Then also GΣ(N) is saturated, that is,

Res(GΣ(N)) ⊆ GΣ(N).

Proof. W.l.o.g. we may assume that clauses in N are pairwise variable-disjoint. (Otherwise make them disjoint; this renaming process changes neither Res(N) nor GΣ(N).) Let C′ ∈ Res(GΣ(N)), meaning (i) there exist resolvable ground instances Dσ and Cρ of N with resolvent C′, or else (ii) C′ is a factor of a ground instance Cσ of C.
Case (i): By the Lifting Lemma, D and C are resolvable with a resolvent C′′ with C′′τ = C′, for a suitable substitution τ. As C′′ ∈ N by assumption, we obtain that C′ ∈ GΣ(N).
Case (ii): Similar. □

Herbrand’s Theorem

Lemma 3.28 Let N be a set of Σ-clauses, let A be an interpretation. Then A |= N implies A|= GΣ(N).

Lemma 3.29 Let N be a set of Σ-clauses, let A be a Herbrand interpretation. Then A|= GΣ(N) implies A|= N.

Theorem 3.30 (Herbrand) A set N of Σ-clauses is satisfiable if and only if it has a Herbrand model over Σ.

Proof. The “⇐” part is trivial. For the “⇒” part let N ⊭ ⊥.

N ⊭ ⊥ ⇒ ⊥ ∉ Res*(N) (resolution is sound)
⇒ ⊥ ∉ GΣ(Res*(N))
⇒ GΣ(Res*(N))_I |= GΣ(Res*(N)) (Thm. 3.17; Cor. 3.27)
⇒ GΣ(Res*(N))_I |= Res*(N) (Lemma 3.29)
⇒ GΣ(Res*(N))_I |= N (N ⊆ Res*(N)) □

The Theorem of Löwenheim–Skolem

Theorem 3.31 (Löwenheim–Skolem) Let Σ be a countable signature and let S be a set of closed Σ-formulas. Then S is satisfiable iff S has a model over a countable universe.

Proof. If both X and Σ are countable, then S can be at most countably infinite. Now generate, maintaining satisfiability, a set N of clauses from S. This extends Σ by at most countably many new Skolem functions to Σ′. As Σ′ is countable, so is TΣ′, the universe of Herbrand interpretations over Σ′. Now apply Theorem 3.30. □

Refutational Completeness of General Resolution

Theorem 3.32 Let N be a set of general clauses where Res(N) ⊆ N. Then

N |= ⊥ ⇔ ⊥ ∈ N.

Proof. Let Res(N) ⊆ N. By Corollary 3.27: Res(GΣ(N)) ⊆ GΣ(N)

N |= ⊥ ⇔ GΣ(N) |= ⊥ (Lemma 3.28/3.29; Theorem 3.30)
⇔ ⊥ ∈ GΣ(N) (propositional resolution sound and complete)
⇔ ⊥ ∈ N □

Compactness of Predicate Logic

Theorem 3.33 (Compactness Theorem for First-Order Logic) Let S be a set of first-order formulas. S is unsatisfiable iff some finite subset S′ ⊆ S is unsatisfiable.

Proof. The “⇐” part is trivial. For the “⇒” part let S be unsatisfiable and let N be the set of clauses obtained by Skolemization and CNF transformation of the formulas in S. Clearly Res*(N) is unsatisfiable. By Theorem 3.32, ⊥ ∈ Res*(N), and therefore ⊥ ∈ Res^n(N) for some n ∈ N. Consequently, ⊥ has a finite resolution proof B of depth ≤ n. Choose S′ as the subset of formulas in S such that the corresponding clauses contain the assumptions (leaves) of B. □

3.11 First-Order Superposition with Selection

Motivation: Search space for Res very large. Ideas for improvement: 1. In the completeness proof (Model Existence Theorem 2.13) one only needs to resolve and factor maximal atoms ⇒ if the calculus is restricted to inferences involving maximal atoms, the proof remains correct ⇒ ordering restrictions 2. In the proof, it does not really matter with which negative literal an inference is performed ⇒ choose a negative literal don’t-care-nondeterministically ⇒ selection

Selection Functions

A selection function is a mapping

sel : C → set of occurrences of negative literals in C

Example of selection (in the slides the selected literals are marked by boxes): e.g., one occurrence of ¬A may be selected in ¬A ∨ ¬A ∨ B, and a negative literal such as ¬B1 may be selected in ¬B0 ∨ ¬B1 ∨ A.

Intuition: • If a clause has at least one selected literal, compute only inferences that involve a selected literal. • If a clause has no selected literals, compute only inferences that involve a maximal literal.

Orderings for Terms, Atoms, Clauses

For first-order logic an ordering on the signature symbols is not sufficient to compare atoms, e.g., how to compare P (a) and P (b)? We propose the Knuth-Bendix Ordering for terms, atoms (with variables) which is then lifted as in the propositional case to literals and clauses.

The Knuth-Bendix Ordering (Simple)

Let Σ = (Ω, Π) be a finite signature, let ≻ be a total ordering (“precedence”) on Ω ∪ Π, and let w : Ω ∪ Π ∪ X → R+ be a weight function satisfying w(x) = w0 ∈ R+ for all variables x ∈ X and w(c) ≥ w0 for all constants c ∈ Ω. The weight function w can be extended to terms (atoms) as follows:

w(f(t1, ..., tn)) = w(f) + Σ_{1≤i≤n} w(ti)
w(P(t1, ..., tn)) = w(P) + Σ_{1≤i≤n} w(ti)
The Knuth-Bendix ordering ≻kbo on TΣ(X) (atoms) induced by ≻ and w is defined by: s ≻kbo t iff
(1) #(x, s) ≥ #(x, t) for all variables x and w(s) > w(t), or
(2) #(x, s) ≥ #(x, t) for all variables x, w(s) = w(t), and

(a) s = f(s1,...,sm), t = g(t1,...,tn), and f ≻ g, or

(b) s = f(s1, ..., sm), t = f(t1, ..., tm), and (s1, ..., sm) (≻kbo)lex (t1, ..., tm),
where #(s, t) = |{p | t|p = s}|.

Proposition 3.34 The Knuth-Bendix ordering ≻kbo is (1) a strict partial well-founded ordering on terms (atoms).

(2) stable under substitution: if s ≻kbo t then sσ ≻kbo tσ for any σ. (3) total on ground terms (ground atoms).
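A sketch of this simple KBO on terms ('var', x) / ('fun', f, args), with the weight w (a dict), w0 and the precedence prec (a dict mapping symbols to ranks) as parameters; all names illustrative:

def var_count(t, counts=None):
    counts = {} if counts is None else counts
    if t[0] == 'var':
        counts[t[1]] = counts.get(t[1], 0) + 1
    else:
        for s in t[2]:
            var_count(s, counts)
    return counts

def weight(t, w, w0):
    if t[0] == 'var':
        return w0
    return w[t[1]] + sum(weight(s, w, w0) for s in t[2])

def kbo_greater(s, t, w, w0, prec):
    cs, ct = var_count(s), var_count(t)
    if any(cs.get(x, 0) < n for x, n in ct.items()):
        return False  # variable condition: #(x, s) ≥ #(x, t) for all x
    ws, wt = weight(s, w, w0), weight(t, w, w0)
    if ws > wt:
        return True  # case (1)
    if ws < wt or s[0] == 'var' or t[0] == 'var':
        return False  # case (2) requires two function terms
    if s[1] != t[1]:
        return prec[s[1]] > prec[t[1]]  # case (2a): f ≻ g
    for si, ti in zip(s[2], t[2]):  # case (2b): lexicographic comparison
        if si != ti:
            return kbo_greater(si, ti, w, w0, prec)
    return False

# Example with w ≡ 1, w0 = 1 and precedence f ≻ g ≻ a: f(a) ≻kbo g(a)
a = ('fun', 'a', ())
print(kbo_greater(('fun', 'f', (a,)), ('fun', 'g', (a,)),
                  {'f': 1, 'g': 1, 'a': 1}, 1, {'f': 2, 'g': 1, 'a': 0}))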

The Superposition Calculus Sup^≻_sel

The superposition calculus Sup^≻_sel is parameterized by
• a selection function sel
• and a total and well-founded atom ordering ≻.

In the completeness proof, we talk about (strictly) maximal literals of ground clauses. In the non-ground calculus, we have to consider those literals that correspond to (strictly) maximal literals of ground instances: A literal L is called [strictly] maximal in a clause C if and only if there exists a ground substitution σ such that Lσ is [strictly] maximal in Cσ (i.e., if for no other L′ in C: Lσ ≺ L′σ [Lσ ⪯ L′σ]).

D ∨ B    C ∨ ¬A
------------ [Superposition Left with Selection]
(D ∨ C)σ

if the following conditions are satisfied:
(i) σ = mgu(A, B);
(ii) Bσ is strictly maximal in Dσ ∨ Bσ;
(iii) nothing is selected in D ∨ B by sel;
(iv) either ¬A is selected, or else nothing is selected in C ∨ ¬A and ¬Aσ is maximal in Cσ ∨ ¬Aσ.

C ∨ A ∨ B
------------ [Factoring]
(C ∨ A)σ

if the following conditions are satisfied:
(i) σ = mgu(A, B);
(ii) Aσ is maximal in Cσ ∨ Aσ ∨ Bσ;
(iii) nothing is selected in C ∨ A ∨ B by sel.

Special Case: Propositional Logic

For ground clauses the superposition inference rule simplifies to

D ∨ P    C ∨ ¬P
------------
D ∨ C

if the following conditions are satisfied:
(i) P ≻ D;
(ii) nothing is selected in D ∨ P by sel;
(iii) ¬P is selected in C ∨ ¬P, or else nothing is selected in C ∨ ¬P and ¬P ⪰ max(C).

Note: For positive literals, P ≻ D is the same as P ≻ max(D). Analogously, the factoring rule simplifies to

C ∨ P ∨ P
------------
C ∨ P

if the following conditions are satisfied:
(i) P is the largest literal in C ∨ P ∨ P;
(ii) nothing is selected in C ∨ P ∨ P by sel.

Search Spaces Become Smaller

We assume P ≻ Q and sel as indicated (in the slides the selected literals are boxed and the maximal literal in a clause is depicted in red):
1. P ∨ Q (given)
2. P ∨ ¬Q (given)
3. ¬P ∨ Q (given)
4. ¬P ∨ ¬Q (given)
5. Q ∨ Q (Res 1, 3)
6. Q (Fact 5)
7. ¬P (Res 6, 4)
8. P (Res 6, 2)
9. ⊥ (Res 8, 7)
With this ordering and selection function the refutation proceeds strictly deterministically in this example. Generally, proof search will still be non-deterministic but the search space will be much smaller than with unrestricted resolution.

Avoiding Rotation Redundancy

From

C1 ∨ P    C2 ∨ ¬P ∨ Q
------------
C1 ∨ C2 ∨ Q    C3 ∨ ¬Q
------------
C1 ∨ C2 ∨ C3

we can obtain by rotation

C2 ∨ ¬P ∨ Q    C3 ∨ ¬Q
------------
C1 ∨ P    C2 ∨ ¬P ∨ C3
------------
C1 ∨ C2 ∨ C3

another proof of the same clause. In large proofs many rotations are possible. However, if P ≻ Q, then the second proof does not fulfill the ordering restrictions.
Conclusion: In the presence of ordering restrictions (however one chooses ≻) no rotations are possible. In other words, orderings identify exactly one representative in any class of rotation-equivalent proofs.

Lifting Lemma for Sup^≻_sel

Lemma 3.35 Let D and C be variable-disjoint clauses. If Dσ and Cρ have a propositional Sup^≻_sel inference with conclusion C′, and if sel(Dσ) ≃ sel(D), sel(Cρ) ≃ sel(C) (that is, “corresponding” literals are selected), then there exists a substitution τ such that D and C have a Sup^≻_sel inference with conclusion C′′ and C′ = C′′τ.
An analogous lifting lemma holds for factorization.

Saturation of General Clause Sets

Corollary 3.36 Let N be a set of general clauses saturated under Sup^≻_sel, i.e., Sup^≻_sel(N) ⊆ N. Then there exists a selection function sel′ such that sel|N = sel′|N and GΣ(N) is also saturated, i.e.,
Sup^≻_sel′(GΣ(N)) ⊆ GΣ(N).

Proof. We first define the selection function sel′ such that sel′(C) = sel(C) for all clauses C ∈ GΣ(N) ∩ N. For C ∈ GΣ(N) \ N we choose a fixed but arbitrary clause D ∈ N with C ∈ GΣ(D) and define sel′(C) to be those occurrences of literals that are ground instances of the occurrences selected by sel in D. Then proceed as in the proof of Cor. 3.27 using the above lifting lemma. □

Soundness and Refutational Completeness

Theorem 3.37 Let ≻ be an atom ordering and sel a selection function such that Sup^≻_sel(N) ⊆ N. Then N |= ⊥ ⇔ ⊥ ∈ N.

Proof. The “⇐” part is trivial. For the “⇒” part consider the propositional level: construct a candidate interpretation N_I as for superposition without selection, except that clauses C in N that have selected literals are not productive, even when they are false in N_C and when their maximal atom occurs only once and positively. The result then follows by Corollary 3.36. □

Craig-Interpolation

A theoretical application of superposition is Craig-Interpolation:

Theorem 3.38 (Craig 1957) Let φ and ψ be two propositional formulas such that φ |= ψ. Then there exists a formula χ (called the interpolant for φ |= ψ), such that χ contains only prop. variables occurring both in φ and in ψ, and such that φ |= χ and χ |= ψ.

Proof. Translate φ and ¬ψ into CNF. Let N and M, respectively, denote the resulting clause sets. Choose an atom ordering ≻ for which the propositional variables that occur in φ but not in ψ are maximal. Saturate N into N* w.r.t. Sup^≻_sel with an empty selection function sel. Then saturate N* ∪ M w.r.t. Sup^≻_sel to derive ⊥. As N* is already saturated, due to the ordering restrictions only inferences need to be considered where premises, if they are from N*, only contain symbols that also occur in ψ. The conjunction of these premises is an interpolant χ. The theorem also holds for first-order formulas. For universal formulas the above proof can be easily extended. In the general case, a proof based on superposition technology is more complicated because of Skolemization. □

Redundancy

So far: local restrictions of the resolution inference rules using orderings and selection functions. Is it also possible to delete clauses altogether? Under which circumstances are clauses unnecessary? (Conjecture: e. g., if they are tautologies or if they are subsumed by other clauses.) Intuition: If a clause is guaranteed to be neither a minimal counterexample nor produc- tive, then we do not need it.

A Formal Notion of Redundancy

Recall: Let N be a set of ground clauses and C a ground clause (not necessarily in N). C is called redundant w. r. t. N, if there exist C1,...,Cn ∈ N, n ≥ 0, such that Ci ≺ C and C1,...,Cn |= C. Redundancy for general clauses: C is called redundant w. r. t. N, if all ground instances Cσ of C are redundant w. r. t. GΣ(N). Note: The same ordering ≺ is used for ordering restrictions and for redundancy (and for the completeness proof).

Examples of Redundancy

Proposition 3.39 Recall the redundancy criteria:
• C tautology (i.e., |= C) ⇒ C redundant w.r.t. any set N. (Tautology Deletion)
• Cσ ⊂ D ⇒ D redundant w.r.t. N ∪ {C}. (Subsumption)
• Cσ ⊆ D ⇒ D ∨ ¬(Lσ) redundant w.r.t. N ∪ {C ∨ L, D}. (Subsumption Resolution)

Saturation up to Redundancy

N is called saturated up to redundancy (w.r.t. Sup^≻_sel)

:⇔ Sup^≻_sel(N \ Red(N)) ⊆ N ∪ Red(N)

Theorem 3.40 Let N be saturated up to redundancy. Then

N |= ⊥⇔⊥∈ N

Proof (Sketch). (i) Ground case:

• consider the construction of the candidate interpretation N_I^≻ for Sup^≻_sel
• redundant clauses are not productive
• redundant clauses in N are not minimal counterexamples for N_I^≻
The premises of “essential” inferences are either minimal counterexamples or productive.
(ii) Lifting: no additional problems over the proof of Theorem 3.37. □

Monotonicity Properties of Redundancy

Theorem 3.41 (i) N ⊆ M ⇒ Red(N) ⊆ Red(M) (ii) M ⊆ Red(N) ⇒ Red(N) ⊆ Red(N \ M)

We conclude that redundancy is preserved when, during a theorem proving process, one adds (derives) new clauses or deletes redundant clauses. Recall that Red(N) may include clauses that are not in N.

A First-Order Superposition Theorem Prover

Straightforward extension of the propositional STP prover.

Three clause sets:
• N(ew), containing new inferred clauses;
• U(sable), containing reduced new inferred clauses;
• W(orked) O(ff): clauses get into this set once their inferences have been computed.
Strategy: inferences will only be computed when there are no possibilities for simplification.

Rewrite Rules for FSTP

Tautology Deletion
(N ⊎ {C}; U; WO) ⇒FSTP (N; U; WO)
if C is a tautology

Forward Subsumption
(N ⊎ {C}; U; WO) ⇒FSTP (N; U; WO)
if some D ∈ (U ∪ WO) subsumes C (Dσ ⊆ C)

Backward Subsumption U
(N ⊎ {C}; U ⊎ {D}; WO) ⇒FSTP (N ∪ {C}; U; WO)
if C strictly subsumes D (Cσ ⊂ D)
Backward Subsumption WO
(N ⊎ {C}; U; WO ⊎ {D}) ⇒FSTP (N ∪ {C}; U; WO)
if C strictly subsumes D (Cσ ⊂ D)

Forward Subsumption Resolution
(N ⊎ {C1 ∨ L}; U; WO) ⇒FSTP (N ∪ {C1}; U; WO)
if there is a clause C2 ∨ L′ ∈ (U ∪ WO) such that C2σ ⊆ C1 and L′σ = ¬L

Backward Subsumption Resolution U
(N ⊎ {C1 ∨ L′}; U ⊎ {C2 ∨ L}; WO) ⇒FSTP (N ∪ {C1 ∨ L′}; U ⊎ {C2}; WO)
if C1σ ⊆ C2 and L′σ = ¬L
Backward Subsumption Resolution WO
(N ⊎ {C1 ∨ L′}; U; WO ⊎ {C2 ∨ L}) ⇒FSTP (N ∪ {C1 ∨ L′}; U; WO ⊎ {C2})
if C1σ ⊆ C2 and L′σ = ¬L

Clause Processing
(N ⊎ {C}; U; WO) ⇒FSTP (N; U ∪ {C}; WO)

Inference Computation
(∅; U ⊎ {C}; WO) ⇒FSTP (N; U; WO ∪ {C})
where N is the set of clauses derived by first-order superposition inferences from C and clauses in WO.
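A simplified sketch of the resulting given-clause loop; it omits subsumption resolution and the re-processing of backward-subsumed clauses, and the functions infer, subsumed and tautology stand for the machinery of the rules above (all names illustrative):

def fstp(initial, infer, subsumed, tautology):
    new, usable, worked_off = list(initial), [], []
    while True:
        while new:  # reduce new clauses before computing any inference
            C = new.pop()
            if tautology(C) or any(subsumed(D, C) for D in usable + worked_off):
                continue  # Tautology Deletion / Forward Subsumption
            usable = [D for D in usable if not subsumed(C, D)]
            worked_off = [D for D in worked_off if not subsumed(C, D)]
            usable.append(C)  # Clause Processing
        if any(len(C) == 0 for C in usable):
            return "unsatisfiable"  # the empty clause was derived
        if not usable:
            return worked_off  # saturated (up to the implemented reductions)
        C = usable.pop()  # Inference Computation: move C to Worked Off
        worked_off.append(C)
        new = infer(C, worked_off)

The invariant mirrors the strategy stated above: inferences are computed only when no simplification applies, i.e., when N is empty.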

Implementation

Although first-order and propositional subsumption just differ in the matcher σ, propositional subsumption between two clauses C and D can be decided in O(n), n = |C| + |D|, whereas first-order subsumption is NP-complete.

Hyperresolution

There are many variants of resolution. (We refer to [Bachmair, Ganzinger: Resolution Theorem Proving] for further reading.) One well-known example is hyperresolution (Robinson 1965): Assume that several negative literals are selected in a clause C. If we perform an inference with C, then one of the selected literals is eliminated. Suppose that the remaining selected literals of C are again selected in the conclusion. Then we must eliminate the remaining selected literals one by one by further resolution steps.

Hyperresolution replaces these successive steps by a single inference. As for Sup≻sel, the calculus is parameterized by an atom ordering ≻ and a selection function sel.

D1 ∨ B1 ...Dn ∨ Bn C ∨ ¬A1 ∨ ... ∨ ¬An

(D1 ∨ ... ∨ Dn ∨ C)σ

with σ = mgu(A1 ≐ B1, ..., An ≐ Bn), if

(i) Biσ strictly maximal in Diσ, 1 ≤ i ≤ n;

(ii) nothing is selected in Di;

(iii) the indicated occurrences of the ¬Ai are exactly the ones selected by sel, or else nothing is selected in the right premise and n = 1 and ¬A1σ is maximal in Cσ.

Similarly to superposition (resolution), hyperresolution has to be complemented by a factorization inference. As we have seen, hyperresolution can be simulated by iterated binary superposition. However, this yields intermediate clauses which HR might not derive, and many of them might not be extendable into a full HR inference.

3.12 Summary: Superposition Theorem Proving

• Superposition is a machine calculus. • Subtle interleaving of enumerating instances and proving inconsistency through the use of unification. • Parameters: atom ordering ≻ and selection function sel. On the non-ground level, ordering constraints can (only) be solved approximatively. • Completeness proof by constructing candidate interpretations from productive clauses C ∨ A, A ≻ C; inferences with those reduce counterexamples.

• Local restrictions of inferences via ≻ and sel ⇒ fewer proof variants. • Global restrictions of the search space via elimination of redundancy ⇒ computing with “smaller” clause sets; ⇒ termination on many decidable fragments. • However: not good enough for dealing with orderings, equality and more specific algebraic theories (lattices, abelian groups, rings, fields) or arithmetic ⇒ further specialization of inference systems required.

Other Inference Systems

• Tableaux
• Instantiation-based methods: resolution-based instance generation, the disconnection calculus, ...
• Natural deduction
• Sequent calculus/Gentzen calculus
• Hilbert calculus

One major problem with all those calculi concerning automation is that they contain a rule that either guesses instances or limits the use of formulas. So the procedure has to guess instances and/or the number of copies of formulas. For example, rules like:
Universal Quantification: S ∪ {∀xφ} ⇒ S ∪ {∀xφ} ∪ {φ{x → t}} for some ground term t ∈ TΣ
Existential Quantification: S ∪ {∃xφ} ⇒ S ∪ {∃xφ} ∪ {φ{x → a}} for some constant a new to φ

4 First-Order Logic with Equality

Equality is the most important relation in mathematics and functional programming. In principle, problems in first-order logic with equality can be handled by any prover for first-order logic without equality:

4.1 Handling Equality Naively

Proposition 4.1 Let φ be a closed first-order formula with equality. Let ∼ ∉ Π be a new predicate symbol. The set Eq(Σ) contains the formulas

∀x (x ∼ x)
∀x, y (x ∼ y → y ∼ x)
∀x, y, z (x ∼ y ∧ y ∼ z → x ∼ z)
∀x1,...,xn, y1,...,yn (x1 ∼ y1 ∧ ... ∧ xn ∼ yn → f(x1,...,xn) ∼ f(y1,...,yn))
∀x1,...,xm, y1,...,ym (x1 ∼ y1 ∧ ... ∧ xm ∼ ym ∧ P(x1,...,xm) → P(y1,...,ym))

for every f ∈ Ω and P ∈ Π. Let φ̃ be the formula that one obtains from φ if every occurrence of ≈ is replaced by ∼. Then φ is satisfiable if and only if Eq(Σ) ∪ {φ̃} is satisfiable.

Proof. Let Σ = (Ω, Π), let Σ1 = (Ω, Π ∪ {∼}). For the “only if” part assume that φ is satisfiable and let A be a Σ-model of φ. Then we define a Σ1-algebra B in such a way that B and A have the same universe, fB = fA for every f ∈ Ω, pB = pA for every p ∈ Π, and ∼B is the identity relation on the universe. It is easy to check that B is a model of both φ̃ and of Eq(Σ). The proof of the “if” part consists of two steps.

Assume that the Σ1-algebra B = (UB, (fB : UBⁿ → UB)f∈Ω, (pB ⊆ UBᵐ)p∈Π∪{∼}) is a model of Eq(Σ) ∪ {φ̃}. In the first step, we show that the interpretation ∼B of ∼ in B is a congruence relation. We prove this for the symmetry property; the other properties of congruence relations, that is, reflexivity, transitivity, and congruence with respect to functions and predicates, are shown analogously. Let a, a′ ∈ UB such that a ∼B a′. We have to show that a′ ∼B a. Since B is a model of Eq(Σ), B(β)(∀x, y (x ∼ y → y ∼ x)) = 1 for every β, hence B(β[x → b1, y → b2])(x ∼ y → y ∼ x) = 1 for every β and every b1, b2 ∈ UB. Set b1 = a and b2 = a′; then 1 = B(β[x → a, y → a′])(x ∼ y → y ∼ x) = (a ∼B a′ → a′ ∼B a), and since a ∼B a′ holds by assumption, a′ ∼B a must also hold.
In the second step, we construct a Σ-algebra A from B and the congruence relation ∼B. Let [a] be the congruence class of an element a ∈ UB with respect to ∼B. The universe UA of A is the set { [a] | a ∈ UB } of congruence classes of the universe of B. For a function symbol f ∈ Ω, we define fA([a1],...,[an]) = [fB(a1,...,an)], and for a predicate symbol p ∈ Π, we define ([a1],...,[an]) ∈ pA if and only if (a1,...,an) ∈ pB. Observe that this is well-defined: if we take different representatives of the same congruence class, we get the same result by congruence of ∼B. Now for every Σ-term t and every B-assignment β, [B(β)(t)] = A(γ)(t), where γ is the A-assignment that maps every variable x to [β(x)], and analogously for every Σ-formula ψ, B(β)(ψ̃) = A(γ)(ψ). Both properties can easily be shown by structural induction. Consequently, A is a model of φ. □

By giving the equality axioms explicitly, first-order problems with equality can in principle be solved by FSTP. But this is unfortunately not efficient, mainly due to the transitivity axiom. Equality is theoretically difficult: first-order functional programming is Turing-complete. But: FSTP cannot even solve equational problems that are intuitively easy. Consequence: to handle equality efficiently, knowledge about equality must be integrated into the theorem prover.

Roadmap

How to proceed:

• Term rewrite systems
• Expressing semantic consequence syntactically
• Knuth-Bendix completion
• Entailment for equations
• (Superposition for first-order clauses with equality)

4.2 Term Rewrite Systems

Let E be a set of (implicitly universally quantified) equations.

The rewrite relation →E ⊆ TΣ(X) × TΣ(X) is defined by

s →E t iff there exist (l ≈ r) ∈ E, p ∈ pos(s), and σ : X → TΣ(X), such that s|p = lσ and t = s[rσ]p. An instance of the lhs (left-hand side) of an equation is called a redex (reducible expression). Contracting a redex means replacing it with the corresponding instance of the rhs (right-hand side) of the rule. An equation l ≈ r is also called a rewrite rule if l is not a variable and vars(l) ⊇ vars(r). Notation: l → r. A set of rewrite rules is called a term rewrite system (TRS).
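As a concrete illustration (our own sketch, not part of the script), a single rewrite step s →E t for terms encoded as nested tuples — variables are strings, an application f(t1,...,tn) is the tuple ('f', t1, ..., tn):

    def match(pattern, term, subst):
        """Extend subst so that pattern·subst = term, or return None."""
        if isinstance(pattern, str):                     # variable
            if pattern in subst:
                return subst if subst[pattern] == term else None
            return {**subst, pattern: term}
        if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
            return None
        for p_arg, t_arg in zip(pattern[1:], term[1:]):
            subst = match(p_arg, t_arg, subst)
            if subst is None:
                return None
        return subst

    def apply(term, subst):
        if isinstance(term, str):
            return subst.get(term, term)
        return (term[0],) + tuple(apply(a, subst) for a in term[1:])

    def rewrite_step(term, rules):
        """Return some t with term →R t, or None if term is in normal form."""
        for lhs, rhs in rules:
            subst = match(lhs, term, {})
            if subst is not None:                        # redex at the root
                return apply(rhs, subst)
        if not isinstance(term, str):                    # otherwise try subterms
            for i, arg in enumerate(term[1:], start=1):
                reduced = rewrite_step(arg, rules)
                if reduced is not None:
                    return term[:i] + (reduced,) + term[i+1:]
        return None

    # Example: the rule plus(x, zero) → x applied to plus(a, zero) yields a.
    rules = [(('plus', 'x', ('zero',)), 'x')]
    assert rewrite_step(('plus', ('a',), ('zero',)), rules) == ('a',)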

We say that a set of equations E or a TRS R is terminating if the rewrite relation →E or →R has this property. (Analogously for other properties of (abstract) rewrite systems.) Note: If E is terminating, then it is a TRS.

Rewrite Relations

Corollary 4.2 If E is convergent (i. e., terminating and confluent), then s ≈E t if and only if s ↔∗E t if and only if s↓E = t↓E.

Corollary 4.3 If E is finite and convergent, then ≈E is decidable.

Reminder: If E is terminating, then it is confluent if and only if it is locally confluent.
Problems:
• Show local confluence of E.
• Show termination of E.
• Transform E into an equivalent set of equations that is locally confluent and terminating.

E-Algebras

Let E be a set of universally quantified equations. A model of E is also called an E-algebra.

If E |= ∀x(s ≈ t), i. e., ∀x(s ≈ t) is valid in all E-algebras, we write this also as s ≈E t. Goal: Use the rewrite relation →E to express the semantic consequence relation syntactically:

s ≈E t if and only if s ↔∗E t.

Let E be a set of equations over TΣ(X). The following inference system allows to derive consequences of E:

(Reflexivity)   derive t ≈ t (no premises)
(Symmetry)   from t ≈ t′ derive t′ ≈ t
(Transitivity)   from t ≈ t′ and t′ ≈ t′′ derive t ≈ t′′
(Congruence)   from t1 ≈ t1′, ..., tn ≈ tn′ derive f(t1,...,tn) ≈ f(t1′,...,tn′), for any f/n
(Instance)   from t ≈ t′ derive tσ ≈ t′σ, for any substitution σ

Lemma 4.4 The following properties are equivalent:

(i) s ↔∗E t.
(ii) E ⇒∗ s ≈ t is derivable,
where E ⇒∗ s ≈ t is an abbreviation for E ⇒∗ E′ and s ≈ t ∈ E′. Recall that inference rules of the form A1 ... Ak / B are abbreviations for rewrite rules E ⊎ {A1,...,Ak} ⇒ E ∪ {A1,...,Ak, B}.

Proof. (i)⇒(ii): s ↔E t implies E ⇒∗ s ≈ t by induction on the depth of the position where the rewrite rule is applied; then s ↔∗E t implies E ⇒∗ s ≈ t by induction on the number of rewrite steps in s ↔∗E t. (ii)⇒(i): By induction on the size (number of symbols) of the derivation for E ⇒∗ s ≈ t. □

Constructing a quotient algebra: Let X be a set of variables.

For t ∈ TΣ(X) let [t] = { t′ ∈ TΣ(X) | E ⇒∗ t ≈ t′ } be the congruence class of t.

Define a Σ-algebra TΣ(X)/E (abbreviated by T ) as follows:

UT = { [t] | t ∈ TΣ(X) }.

fT ([t1],..., [tn]) = [f(t1,...,tn)] for f ∈ Ω.

Lemma 4.5 fT is well-defined: if [ti] = [ti′], then [f(t1,...,tn)] = [f(t1′,...,tn′)].

Proof. Follows directly from the Congruence rule for ⇒∗. 2

Lemma 4.6 T = TΣ(X)/E is an E-algebra.

Proof. Let ∀x1 ...xn(s ≈ t) be an equation in E; let β be an arbitrary assignment. We have to show that T (β)(∀x(s ≈ t)) = 1, or equivalently, that T (γ)(s)= T (γ)(t) for all γ = β[ xi → [ti] | 1 ≤ i ≤ n ] with [ti] ∈ UT .

Let σ = {x1 → t1,...,xn → tn}, then sσ ∈ T (γ)(s) and tσ ∈ T (γ)(t). By the Instance rule, E ⇒∗ sσ ≈ tσ is derivable, hence T (γ)(s) = [sσ] = [tσ]= T (γ)(t). 2

Lemma 4.7 Let X be a countably infinite set of variables; let s, t ∈ TΣ(X). If TΣ(X)/E |= ∀x(s ≈ t), then E ⇒∗ s ≈ t is derivable.

Proof. Assume that T |= ∀x(s ≈ t), i. e., T (β)(∀x(s ≈ t)) = 1. Consequently, T (γ)(s) = T (γ)(t) for all γ = β[ xi → [ti] | 1 ≤ i ≤ n ] with [ti] ∈ UT. Choose ti = xi; then [s] = T (γ)(s) = T (γ)(t) = [t], so E ⇒∗ s ≈ t is derivable by definition of T. □

Theorem 4.8 (“Birkhoff’s Theorem”) Let X be a countably infinite set of variables, let E be a set of (universally quantified) equations. Then the following properties are equivalent for all s, t ∈ TΣ(X):

(i) s ↔∗E t.

(ii) E ⇒∗ s ≈ t is derivable.

(iii) s ≈E t, i. e., E |= ∀x(s ≈ t).

(iv) TΣ(X)/E |= ∀x(s ≈ t).

Proof. (i)⇔(ii): Lemma 4.4. (ii)⇒(iii): By induction on the size of the derivation for E ⇒∗ s ≈ t.

(iii)⇒(iv): Obvious, since T = TΣ(X)/E is an E-algebra. (iv)⇒(ii): Lemma 4.7. □

Universal Algebra

TΣ(X)/E = TΣ(X)/≈E = TΣ(X)/↔∗E is called the free E-algebra with generating set X/≈E = { [x] | x ∈ X }:

Every mapping ϕ : X/≈E → B for some E-algebra B can be extended to a homomorphism ϕ̂ : TΣ(X)/E → B.

TΣ(∅)/E = TΣ(∅)/≈E = TΣ(∅)/↔∗E is called the initial E-algebra.

≈E = { (s, t) | E |= s ≈ t } is called the equational theory of E. ≈IE = { (s, t) | TΣ(∅)/E |= s ≈ t } is called the inductive theory of E. Example:

Let E = {∀x(x + 0 ≈ x), ∀x∀y(x + s(y) ≈ s(x + y))}. Then x + y ≈IE y + x, but x + y ≉E y + x. (In the initial algebra — the natural numbers — addition is commutative, but commutativity does not hold in all E-algebras.)

4.3 Critical Pairs

Showing local confluence (Sketch):

Problem: If t1 E← t0 →E t2, does there exist a term s such that t1 →∗E s ∗E← t2?
If the two rewrite steps happen in different subtrees (disjoint redexes): yes.
If the two rewrite steps happen below each other (overlap at or below a variable position): yes.
If the left-hand sides of the two rules overlap at a non-variable position: needs further investigation.

Question: Are there rewrite rules l1 → r1 and l2 → r2 such that some subterm l1|p of l1 and l2 have a common instance (l1|p)σ1 = l2σ2? Observation: If we assume w. l. o. g. that the two rewrite rules have no common variables, then only a single substitution is necessary: (l1|p)σ = l2σ. Further observation: The mgu of l1|p and l2 subsumes all unifiers σ of l1|p and l2.

Let li → ri (i =1, 2) be two rewrite rules in a TRS R whose variables have been renamed such that vars(l1) ∩ vars(l2)= ∅. (Remember that vars(li) ⊇ vars(ri).)

Let p ∈ pos(l1) be a position such that l1|p is not a variable and σ is an mgu of l1|p and l2.

Then r1σ ← l1σ → (l1σ)[r2σ]p.

r1σ, (l1σ)[r2σ]p is called a critical pair of R.

The critical pair is joinable (or: converges), if r1σ ↓R (l1σ)[r2σ]p.
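The computation of critical pairs can be sketched as follows (our own illustration, reusing the tuple term representation from Section 4.2; unify is a naive textbook syntactic unification). Per the note after Theorem 4.9 below, overlaps of a rule with a renamed variant of itself at the root position would have to be excluded; this sketch does not do so.

    def unify(s, t, subst=None):
        """Most general unifier of s and t (as a triangular substitution), or None."""
        if subst is None:
            subst = {}
        s, t = walk(s, subst), walk(t, subst)
        if s == t:
            return subst
        if isinstance(s, str):
            return {**subst, s: t} if not occurs(s, t, subst) else None
        if isinstance(t, str):
            return unify(t, s, subst)
        if s[0] != t[0] or len(s) != len(t):
            return None
        for a, b in zip(s[1:], t[1:]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst

    def walk(t, subst):
        while isinstance(t, str) and t in subst:
            t = subst[t]
        return t

    def occurs(v, t, subst):
        t = walk(t, subst)
        if t == v:
            return True
        return not isinstance(t, str) and any(occurs(v, a, subst) for a in t[1:])

    def resolve(t, subst):
        t = walk(t, subst)
        if isinstance(t, str):
            return t
        return (t[0],) + tuple(resolve(a, subst) for a in t[1:])

    def replace(t, pos, u):
        if not pos:
            return u
        i = pos[0]
        return t[:i] + (replace(t[i], pos[1:], u),) + t[i+1:]

    def subterm(t, pos):
        for i in pos:
            t = t[i]
        return t

    def positions(t, pos=()):
        """All non-variable positions of t."""
        if isinstance(t, str):
            return
        yield pos
        for i, arg in enumerate(t[1:], start=1):
            yield from positions(arg, pos + (i,))

    def critical_pairs(rule1, rule2):
        """Overlaps of rule2 into non-variable positions of rule1's lhs.
        Assumes the two rules are already variable-disjoint."""
        l1, r1 = rule1
        l2, r2 = rule2
        for p in positions(l1):
            subst = unify(subterm(l1, p), l2, {})
            if subst is not None:
                yield (resolve(r1, subst), resolve(replace(l1, p, r2), subst))

    # Example: R = { f(g(x)) → x, g(a) → b } has the critical pair ⟨a, f(b)⟩.
    R1 = (('f', ('g', 'x')), 'x')
    R2 = (('g', ('a',)), ('b',))
    print(list(critical_pairs(R1, R2)))   # [(('a',), ('f', ('b',)))]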

Theorem 4.9 (“Critical Pair Theorem”) A TRS R is locally confluent if and only if all its critical pairs are joinable.

Proof. “only if”: obvious, since joinability of a critical pair is a special case of local confluence.

“if”: Suppose s rewrites to t1 and t2 using rewrite rules li → ri ∈ R at positions pi ∈ pos(s), where i =1, 2. Without loss of generality, we can assume that the two rules are variable disjoint, hence s|pi = liθ and ti = s[riθ]pi .

We distinguish between two cases: Either p1 and p2 are in disjoint subtrees (p1 || p2), or one is a prefix of the other (w.o.l.o.g., p1 ≤ p2).

Case 1: p1 || p2.

Then s = s[l1θ]p1 [l2θ]p2 , and therefore t1 = s[r1θ]p1 [l2θ]p2 and t2 = s[l1θ]p1 [r2θ]p2 .

Let t0 = s[r1θ]p1 [r2θ]p2 . Then clearly t1 →R t0 using l2 → r2 and t2 →R t0 using l1 → r1.

Case 2: p1 ≤ p2.

Case 2.1: p2 = p1q1q2, where l1|q1 is some variable x. In other words, the second rewrite step takes place at or below a variable position of the first rule. Suppose that x occurs m times in l1 and n times in r1 (where m ≥ 1 and n ≥ 0). Let θ′ = θ[x → (xθ)[r2θ]q2] and t0 = s[r1θ′]p1. Then t1 →∗R t0 by applying l2 → r2 at all positions p1q′q2, where q′ is a position of x in r1. Conversely, t2 →∗R t0 by applying l2 → r2 at all positions p1qq2, where q is a position of x in l1 different from q1, and by applying l1 → r1 at p1 with the substitution θ′.

Case 2.2: p2 = p1p, where p is a non-variable position of l1.

Then s|p2 = l2θ and s|p2 = (s|p1)|p = (l1θ)|p = (l1|p)θ, so θ is a unifier of l2 and l1|p.

Let σ be the mgu of l2 and l1|p; then θ = τ ◦ σ and ⟨r1σ, (l1σ)[r2σ]p⟩ is a critical pair. By assumption, it is joinable, so r1σ →∗R v ∗R← (l1σ)[r2σ]p. Consequently, t1 = s[r1θ]p1 = s[r1στ]p1 →∗R s[vτ]p1 and t2 = s[r2θ]p2 = s[(l1θ)[r2θ]p]p1 = s[(l1στ)[r2στ]p]p1 = s[((l1σ)[r2σ]p)τ]p1 →∗R s[vτ]p1. This completes the proof of the Critical Pair Theorem. □

Note: Critical pairs between a rule and (a renamed variant of) itself must be considered – except if the overlap is at the root (i. e., p = ε).

Corollary 4.10 A terminating TRS R is confluent if and only if all its critical pairs are joinable.

Proof. By Newman’s Lemma and the Critical Pair Theorem. 2

Corollary 4.11 For a finite terminating TRS, confluence is decidable.

Proof. For every pair of rules and every non-variable position in the first rule there is at most one critical pair ⟨u1, u2⟩. Reduce every ui to some normal form ui′. If u1′ = u2′ for every critical pair, then R is confluent; otherwise there is some non-confluent situation u1′ ∗R← u1 ←R s →R u2 →∗R u2′. □

4.4 Termination

Termination problems:
• Given a finite TRS R and a term t, are all R-reductions starting from t terminating?
• Given a finite TRS R, are all R-reductions terminating?

Proposition 4.12 Both termination problems for TRSs are undecidable in general.

Proof. Encode Turing machines using rewrite rules and reduce the (uniform) halting problems for TMs to the termination problems for TRSs. 2

Consequence: Decidable criteria for termination are not complete.

Reduction Orderings

Goal: Given a finite TRS R, show termination of R by looking at the finitely many rules l → r ∈ R, rather than at infinitely many possible replacement steps s →R s′.

A binary relation ⊐ over TΣ(X) is called compatible with Σ-operations if s ⊐ s′ implies f(t1,...,s,...,tn) ⊐ f(t1,...,s′,...,tn) for all f ∈ Ω and s, s′, ti ∈ TΣ(X).

Lemma 4.13 The relation ⊐ is compatible with Σ-operations if and only if s ⊐ s′ implies t[s]p ⊐ t[s′]p for all s, s′, t ∈ TΣ(X) and p ∈ pos(t).

Note: compatible with Σ-operations = compatible with contexts.

A binary relation ⊐ over TΣ(X) is called stable under substitutions if s ⊐ s′ implies sσ ⊐ s′σ for all s, s′ ∈ TΣ(X) and substitutions σ. A binary relation ⊐ is called a rewrite relation if it is compatible with Σ-operations and stable under substitutions.

Example: If R is a TRS, then →R is a rewrite relation.

A strict partial ordering over TΣ(X) that is a rewrite relation is called rewrite ordering. A well-founded rewrite ordering is called reduction ordering.

Theorem 4.14 A TRS R terminates if and only if there exists a reduction ordering ≻ such that l ≻ r for every rule l → r ∈ R.

Proof. “if”: s →R s′ if and only if s = t[lσ]p, s′ = t[rσ]p. If l ≻ r, then lσ ≻ rσ and therefore t[lσ]p ≻ t[rσ]p. This implies →R ⊆ ≻. Since ≻ is a well-founded ordering, →R is terminating. “only if”: Define ≻ = →+R. If →R is terminating, then ≻ is a reduction ordering. □

Two Different Scenarios

Depending on the application, the TRS whose termination we want to show can be (i) fixed and known in advance, or (ii) evolving (e.g., generated by some saturation process). Methods for case (ii) are also usable for case (i). Many methods for case (i) are not usable for case (ii). We will first consider case (ii); additional techniques for case (i) will be considered later.

The Interpretation Method

Proving termination by interpretation: Let A be a Σ-algebra; let ≻ be a well-founded strict partial ordering on its universe.

Define the ordering ≻A over TΣ(X) by s ≻A t iff A(β)(s) ≻ A(β)(t) for all assignments β : X → UA.

Is ≻A a reduction ordering?

Lemma 4.15 ≻A is stable under substitutions.

Proof. Let s ≻A s′, that is, A(β)(s) ≻ A(β)(s′) for all assignments β : X → UA. Let σ be a substitution. We have to show that A(γ)(sσ) ≻ A(γ)(s′σ) for all assignments γ : X → UA. Choose β = γ ◦ σ; then by the substitution lemma, A(γ)(sσ) = A(β)(s) ≻ A(β)(s′) = A(γ)(s′σ). Therefore sσ ≻A s′σ. □

A function f : UAⁿ → UA is called monotone (with respect to ≻) if a ≻ a′ implies f(b1,...,a,...,bn) ≻ f(b1,...,a′,...,bn) for all a, a′, bi ∈ UA.

Lemma 4.16 If the interpretation fA of every function symbol f is monotone w. r. t. ≻, then ≻A is compatible with Σ-operations.

Proof. Let s ≻A s′, that is, A(β)(s) ≻ A(β)(s′) for all β : X → UA. Let β : X → UA be an arbitrary assignment. Then

A(β)(f(t1,...,s,...,tn)) = fA(A(β)(t1),..., A(β)(s),..., A(β)(tn))
  ≻ fA(A(β)(t1),..., A(β)(s′),..., A(β)(tn))
  = A(β)(f(t1,...,s′,...,tn))

Therefore f(t1,...,s,...,tn) ≻A f(t1,...,s′,...,tn). □

Theorem 4.17 If the interpretation fA of every function symbol f is monotone w. r. t. ≻, then ≻A is a reduction ordering.

Proof. By the previous two lemmas, ≻A is a rewrite relation. If there were an infinite chain s1 ≻A s2 ≻A ... , then it would correspond to an infinite chain A(β)(s1) ≻ A(β)(s2) ≻ ... (with β chosen arbitrarily). Thus ≻A is well-founded. Irreflexivity and transitivity are proved similarly. 2

Polynomial Orderings

Polynomial orderings: Instance of the interpretation method:

The carrier set UA is N or some subset of N.

To every function symbol f with arity n we associate a polynomial Pf (X1,...,Xn) ∈ N[X1,...,Xn] with coefficients in N and indeterminates X1,...,Xn. Then we define fA(a1,...,an)= Pf (a1,...,an) for ai ∈ UA. Requirement 1:

If a1,...,an ∈ UA, then fA(a1,...,an) ∈ UA. (Otherwise, A would not be a Σ- algebra.) Requirement 2:

fA must be monotone (w. r. t. ≻).

From now on:

UA = { n ∈ N | n ≥ 1 }.

If arity(f) = 0, then Pf is a constant ≥ 1.

If arity(f) = n ≥ 1, then Pf is a polynomial P(X1,...,Xn) such that every Xi occurs in some monomial with exponent at least 1 and non-zero coefficient. ⇒ Requirements 1 and 2 are satisfied. The mapping from function symbols to polynomials can be extended to terms: a term t containing the variables x1,...,xn yields a polynomial Pt with indeterminates X1,...,Xn (where Xi corresponds to β(xi)). Example: Ω = {b/0, f/1, g/3}, Pb = 3, Pf(X1) = X1², Pg(X1, X2, X3) = X1 + X2·X3. Let t = g(f(b), f(x), y); then Pt(X, Y) = 9 + X²·Y.
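For the example interpretation above, the value of Pt under an assignment can be computed recursively (a small sketch of ours; terms as nested tuples). Note that evaluating at sample points ≥ 1 can refute Pl > Pr but, in general, not prove it:

    POLY = {
        'b': lambda: 3,
        'f': lambda x1: x1 ** 2,
        'g': lambda x1, x2, x3: x1 + x2 * x3,
    }

    def poly_value(term, env):
        """Value of Pt for an assignment env of the term's variables (all ≥ 1)."""
        if isinstance(term, str):          # variable
            return env[term]
        head, *args = term
        return POLY[head](*(poly_value(a, env) for a in args))

    t = ('g', ('f', ('b',)), ('f', 'x'), 'y')
    assert poly_value(t, {'x': 2, 'y': 5}) == 9 + 2**2 * 5   # Pt(X, Y) = 9 + X²·Y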

If P, Q are polynomials in N[X1,...,Xn], we write P > Q if P (a1,...,an) > Q(a1,...,an) for all a1,...,an ∈ UA.

Clearly, l ≻A r iff Pl >Pr iff Pl − Pr > 0.

Question: Can we check Pl − Pr > 0 automatically? Hilbert’s 10th Problem:

Given a polynomial P ∈ Z[X1,...,Xn] with integer coefficients, is P = 0 for some n-tuple of natural numbers?

Theorem 4.18 Hilbert’s 10th Problem is undecidable.

Proposition 4.19 Given a polynomial interpretation and two terms l, r, it is undecidable whether Pl > Pr.

Proof. By reduction of Hilbert’s 10th Problem. 2

One easy case:

If we restrict to linear polynomials, deciding whether Pl − Pr > 0 is trivial: writing Pl − Pr = k1X1 + ... + knXn + k,

k1a1 + ... + knan + k > 0 for all a1,...,an ≥ 1

if and only if

ki ≥ 0 for all i ∈ {1,...,n}, and k1 + ... + kn + k > 0.
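The linear criterion is easy to implement (a sketch of ours; the coefficient-list encoding is our own):

    def linear_positive(ks, k):
        """Does k1*a1 + ... + kn*an + k > 0 hold for all a1,...,an ≥ 1?"""
        return all(ki >= 0 for ki in ks) and sum(ks) + k > 0

    # Example: X + 2Y - 1 > 0 holds on arguments ≥ 1; X - Y does not.
    assert linear_positive([1, 2], -1)
    assert not linear_positive([1, -1], 0)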

Another possible solution:

Test whether Pl(a1,...,an) >Pr(a1,...,an) for all a1,...,an ∈{ x ∈ R | x ≥ 1 }.

This is decidable (but hard). Since UA ⊆{ x ∈ R | x ≥ 1 }, it implies Pl >Pr. Alternatively: Use fast overapproximations.

Simplification Orderings

The proper subterm ordering ⊲ is defined by: s ⊲ t if and only if s|p = t for some position p ≠ ε of s.

A rewrite ordering ≻ over TΣ(X) is called simplification ordering, if it has the subterm property: s ⊲ t implies s ≻ t for all s, t ∈ TΣ(X). Example:

Let Remb be the rewrite system Remb = { f(x1,...,xn) → xi | f ∈ Ω, 1 ≤ i ≤ n = arity(f) }.

Define ⊲emb = →+Remb and ⊵emb = →∗Remb (“homeomorphic embedding relation”).

⊲emb is a simplification ordering.

Lemma 4.20 If ≻ is a simplification ordering, then s ⊲emb t implies s ≻ t and s ⊵emb t implies s ⪰ t.

Proof. Since ≻ is transitive and ⪰ is transitive and reflexive, it suffices to show that s →Remb t implies s ≻ t. By definition, s →Remb t if and only if s = s[lσ] and t = s[rσ] for some rule l → r ∈ Remb. Obviously, l ⊲ r for all rules in Remb, hence l ≻ r. Since ≻ is a rewrite relation, s = s[lσ] ≻ s[rσ] = t. □

Goal: Show that every simplification ordering is well-founded (and therefore a reduction ordering). Note: This works only for finite signatures! To fix this for infinite signatures, the definition of simplification orderings and the definition of embedding have to be modified.

Theorem 4.21 (“Kruskal’s Theorem”) Let Σ be a finite signature, let X be a finite set of variables. Then for every infinite sequence t1, t2, t3, ... there are indices j > i such that tj ⊵emb ti. (⊵emb is called a well-partial-ordering (wpo).)

Proof. See Baader and Nipkow, page 113–115. 2

Theorem 4.22 (Dershowitz) If Σ is a finite signature, then every simplification ordering ≻ on TΣ(X) is well-founded (and therefore a reduction ordering).

Proof. Suppose that t1 ≻ t2 ≻ t3 ≻ ... is an infinite descending chain.

First assume that there is an x ∈ vars(ti+1) \ vars(ti). Let σ = {x → ti}; then ti+1σ ⊵ xσ = ti and therefore ti = tiσ ≻ ti+1σ ⪰ ti, contradicting irreflexivity.

Consequently, vars(ti) ⊇ vars(ti+1) and ti ∈ TΣ(V) for all i, where V is the finite set vars(t1). By Kruskal’s Theorem, there are i < j with tj ⊵emb ti. Hence tj ⪰ ti, contradicting ti ≻ tj. □

There are reduction orderings that are not simplification orderings and terminating TRSs that are not contained in any simplification ordering. Example: Let R = {f(f(x)) → f(g(f(x)))}.

R terminates and →+R is therefore a reduction ordering.

Assume that →R were contained in a simplification ordering ≻. Then f(f(x)) →R f(g(f(x))) implies f(f(x)) ≻ f(g(f(x))), and f(g(f(x))) ⊵emb f(f(x)) implies f(g(f(x))) ⪰ f(f(x)), hence f(f(x)) ≻ f(f(x)), contradicting irreflexivity.

Path Orderings

Let Σ = (Ω, Π) be a finite signature, let ≻ be a strict partial ordering (“precedence”) on Ω.

The lexicographic path ordering ≻lpo on TΣ(X) induced by ≻ is defined by: s ≻lpo t iff (1) t ∈ vars(s) and t ≠ s, or

(2) s = f(s1,...,sm), t = g(t1,...,tn), and

(a) si ⪰lpo t for some i, or

(b) f ≻ g and s ≻lpo tj for all j, or

(c) f = g, s ≻lpo tj for all j, and (s1,...,sm) (≻lpo)lex (t1,...,tn).
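A compact (and unoptimized) sketch of ≻lpo for the tuple term representation used earlier; prec maps function symbols to numbers, with a larger number meaning larger in the precedence (our own illustration, not from the script):

    def vars_of(t):
        if isinstance(t, str):
            return {t}
        vs = set()
        for a in t[1:]:
            vs |= vars_of(a)
        return vs

    def lpo(s, t, prec):
        """s ≻lpo t?"""
        if isinstance(s, str):
            return False                       # a variable is never greater
        if isinstance(t, str):
            return t in vars_of(s)             # case (1): t ∈ vars(s), t ≠ s
        f, ss = s[0], s[1:]
        g, ts = t[0], t[1:]
        if any(si == t or lpo(si, t, prec) for si in ss):
            return True                        # case (2)(a): some si ⪰lpo t
        if prec[f] > prec[g]:
            return all(lpo(s, tj, prec) for tj in ts)      # case (2)(b)
        if f == g:
            if not all(lpo(s, tj, prec) for tj in ts):
                return False
            for si, ti in zip(ss, ts):         # case (2)(c): lexicographic
                if si != ti:
                    return lpo(si, ti, prec)
        return False

    # Example: with precedence f > g > a, we get f(a) ≻lpo g(a).
    prec = {'f': 2, 'g': 1, 'a': 0}
    assert lpo(('f', ('a',)), ('g', ('a',)), prec)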

Lemma 4.23 s ≻lpo t implies vars(s) ⊇ vars(t).

Proof. By induction on |s| + |t| and case analysis. 2

Theorem 4.24 ≻lpo is a simplification ordering on TΣ(X).

Proof. Show transitivity, subterm property, stability under substitutions, compatibility with Σ-operations, and irreflexivity, usually by induction on the sum of the term sizes and case analysis. Details: Baader and Nipkow, page 119/120. 2

Theorem 4.25 If the precedence ≻ is total, then the lexicographic path ordering ≻lpo is total on ground terms, i. e., for all s, t ∈ TΣ(∅): s ≻lpo t ∨ t ≻lpo s ∨ s = t.

Proof. By induction on |s| + |t| and case analysis. 2

Recapitulation: Let Σ = (Ω, Π) be a finite signature, let ≻ be a strict partial ordering (“precedence”) on Ω. The lexicographic path ordering ≻lpo on TΣ(X) induced by ≻ is defined by: s ≻lpo t iff (1) t ∈ vars(s) and t ≠ s, or

(2) s = f(s1,...,sm), t = g(t1,...,tn), and

(a) si ⪰lpo t for some i, or

(b) f ≻ g and s ≻lpo tj for all j, or

(c) f = g, s ≻lpo tj for all j, and (s1,...,sm) (≻lpo)lex (t1,...,tn).
There are several possibilities to compare subterms in (2)(c):
• compare the list of subterms lexicographically left-to-right (“lexicographic path ordering (lpo)”, Kamin and Lévy)
• compare the list of subterms lexicographically right-to-left (or according to some permutation π)
• compare the multiset of subterms using the multiset extension (“multiset path ordering (mpo)”, Dershowitz)

• to each function symbol f with arity(f) = n ≥ 1 associate a status ∈ {mul} ∪ { lexπ | π : {1,...,n} → {1,...,n} } and compare according to that status (“recursive path ordering (rpo) with status”)

The Knuth-Bendix Ordering

Let Σ = (Ω, Π) be a finite signature, let ≻ be a strict partial ordering (“precedence”) on Ω, let w : Ω ∪ X → R⁺₀ be a weight function, such that the following admissibility conditions are satisfied:

• w(x) = w0 ∈ R⁺ for all variables x ∈ X; w(c) ≥ w0 for all constants c ∈ Ω.
• If w(f) = 0 for some f ∈ Ω with arity(f) = 1, then f ⪰ g for all g ∈ Ω.

The weight function w can be extended to terms as follows:

w(t) = Σx∈vars(t) w(x) · #(x, t) + Σf∈Ω w(f) · #(f, t).

The Knuth-Bendix ordering ≻kbo on TΣ(X) induced by ≻ and w is defined by: s ≻kbo t iff

(1) #(x, s) ≥ #(x, t) for all variables x and w(s) > w(t), or

(2) #(x, s) ≥ #(x, t) for all variables x, w(s) = w(t), and

(a) t = x, s = fⁿ(x) for some n ≥ 1, or

(b) s = f(s1,...,sm), t = g(t1,...,tn), and f ≻ g, or

(c) s = f(s1,...,sm), t = f(t1,...,tm), and (s1,...,sm) (≻kbo)lex (t1,...,tm).
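The weight function extends to terms by the sum above; a small sketch of ours (the admissibility conditions are assumed, not checked):

    def weight(t, w, w0):
        """w(t) = Σ w(x)·#(x,t) + Σ w(f)·#(f,t), with w(x) = w0 for all variables."""
        if isinstance(t, str):         # variable occurrence
            return w0
        return w[t[0]] + sum(weight(a, w, w0) for a in t[1:])

    # Example: w(f) = 0 for unary f, w(g) = 1, w0 = 1:
    # the weight of f(g(x)) is 0 + 1 + 1 = 2.
    assert weight(('f', ('g', 'x')), {'f': 0, 'g': 1}, 1) == 2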

Theorem 4.26 The Knuth-Bendix ordering induced by ≻ and w is a simplification ordering on TΣ(X).

Proof. Baader and Nipkow, pages 125–129. 2

Remark

If Π ≠ ∅, then all the term orderings described in this section can also be used to compare non-equational atoms by treating predicate symbols like function symbols. Defining a weight w(f) = 0 for some unary function symbol f was in particular introduced for the application of KBO to equational systems defining groups.

4.5 Knuth-Bendix Completion

Completion: Goal: Given a set E of equations, transform E into an equivalent convergent set R of rewrite rules. (If R is finite: decision procedure for E.) How to ensure termination?

Fix a reduction ordering ≻ and construct R in such a way that →R ⊆ ≻ (i. e., l ≻ r for every l → r ∈ R). How to ensure confluence? Check that all critical pairs are joinable.

Knuth-Bendix Completion: Inference Rules

The completion procedure is itself presented as a set of rewrite rules working on a pair of equations E and rules R: (E0; R0) ⇒ (E1; R1) ⇒ (E2; R2) ⇒ ...

At the beginning, E = E0 is the input set and R = R0 is empty. At the end, E should be empty; then R is the result. For each step (E; R) ⇒ (E′; R′), the equational theories of E ∪ R and E′ ∪ R′ agree: ≈E∪R = ≈E′∪R′. Notations: The formula s ≈̇ t denotes either s ≈ t or t ≈ s. CP(R) denotes the set of all critical pairs between rules in R.

Orient (E ⊎ {s ≈̇ t}; R) ⇒KBC (E; R ∪ {s → t}) if s ≻ t
Note: There are equations s ≈ t that cannot be oriented, i. e., neither s ≻ t nor t ≻ s.

Trivial equations cannot be oriented – but we don’t need them anyway: Delete (E ⊎{s ≈ s}; R) ⇒KBC (E; R)

Critical pairs between rules in R are turned into additional equations: Deduce (E; R) ⇒KBC (E ∪ {s ≈ t}; R) if ⟨s, t⟩ ∈ CP(R)

Note: If ⟨s, t⟩ ∈ CP(R) then s R← u →R t and hence R |= s ≈ t.

The following inference rules are not absolutely necessary, but very useful (e. g., to get rid of joinable critical pairs and to deal with equations that cannot be oriented):
Simplify-Eq (E ⊎ {s ≈̇ t}; R) ⇒KBC (E ∪ {u ≈̇ t}; R) if s →R u

Simplification of the right-hand side of a rule is unproblematic. R-Simplify-Rule (E; R ⊎{s → t}) ⇒KBC (E; R ∪{s → u}) if t →R u

Simplification of the left-hand side may influence orientability and orientation. Therefore, it yields an equation: L-Simplify-Rule (E; R ⊎ {s → t}) ⇒KBC (E ∪ {u ≈ t}; R) if s →R u using a rule l → r ∈ R such that s ⊐ l (see below).

For technical reasons, the lhs of s → t may only be simplified using a rule l → r if l → r cannot be simplified using s → t, that is, if s ⊐ l, where the encompassment quasi-ordering ⊒ is defined by: s ⊒ l if s|p = lσ for some p and σ, and ⊐ = ⊒ \ ⊑ is the strict part of ⊒.

Lemma 4.27 ⊐ is a well-founded strict partial ordering.

Lemma 4.28 If (E; R) ⇒KBC (E′; R′), then ≈E∪R = ≈E′∪R′.

Lemma 4.29 If (E; R) ⇒KBC (E′; R′) and →R ⊆ ≻, then →R′ ⊆ ≻.

110 Knuth-Bendix Completion: Correctness Proof

If we run the completion procedure on a set E of equations, different things can hap- pen: (1) We reach a state where no more inference rules are applicable and E is not empty. ⇒ Failure (try again with another ordering?) (2) We reach a state where E is empty and all critical pairs between the rules in the current R have been checked. (3) The procedure runs forever. In order to treat these cases simultaneously, we need some definitions.

A (finite or infinite) sequence (E0; R0) ⇒KBC (E1; R1) ⇒KBC (E2; R2) ⇒KBC ... with R0 = ∅ is called a run of the completion procedure with input E0 and ≻.

For a run, E∞ = ∪i≥0 Ei and R∞ = ∪i≥0 Ri.

The sets of persistent equations or rules of the run are E∗ = ∪i≥0 ∩j≥i Ej and R∗ = ∪i≥0 ∩j≥i Rj. Note: If the run is finite and ends with En, Rn, then E∗ = En and R∗ = Rn.

A run is called fair, if CP (R∗) ⊆ E∞ (i. e., if every critical pair between persisting rules is computed at some step of the derivation). Goal:

Show: If a run is fair and E∗ is empty, then R∗ is convergent and equivalent to E0.

In particular: If a run is fair and E∗ is empty, then ≈E0 = ≈E∞∪R∞ = ↔∗E∞∪R∞ = ↓R∗.

General assumptions from now on:

(E0; R0) ⇒KBC (E1; R1) ⇒KBC (E2; R2) ⇒KBC ... is a fair run.

R0 and E∗ are empty.

A proof of s ≈ t in E∞ ∪ R∞ is a finite sequence (s0,...,sn) such that s = s0, t = sn, and for all i ∈ {1,...,n}:

(1) si−1 ↔E∞ si, or

(2) si−1 →R∞ si, or

(3) si−1 R∞← si.

The pairs (si−1,si) are called proof steps.

A proof is called a rewrite proof in R∗ if there is a k ∈ {0,...,n} such that si−1 →R∗ si for 1 ≤ i ≤ k and si−1 R∗← si for k + 1 ≤ i ≤ n.

Idea (Bachmair, Dershowitz, Hsiang): Define a well-founded ordering on proofs, such that for every proof that is not a rewrite proof in R∗ there is an equivalent smaller proof.

Consequence: For every proof there is an equivalent rewrite proof in R∗.

We associate a cost c(si−1,si) with every proof step as follows:

(1) If si−1 ↔E∞ si, then c(si−1,si)=({si−1,si}, −, −), where the first component is a multiset of terms and − denotes an arbitrary (irrelevant) term.

(2) If si−1 →R∞ si using l → r, then c(si−1,si)=({si−1},l,si).

(3) If si−1 R∞← si using l → r, then c(si−1,si)=({si},l,si−1). Proof steps are compared using the lexicographic combination of the multiset extension of the reduction ordering ≻, the encompassment ordering ⊐, and the reduction ordering ≻. The cost c(P ) of a proof P is the multiset of the costs of its proof steps.

The proof ordering ≻C compares the costs of proofs using the multiset extension of the proof step ordering.

Lemma 4.30 ≻C is a well-founded ordering.

Lemma 4.31 Let P be a proof in E∞ ∪ R∞. If P is not a rewrite proof in R∗, then there exists an equivalent proof P′ in E∞ ∪ R∞ such that P ≻C P′.

Proof. If P is not a rewrite proof in R∗, then it contains

(a) a proof step that is in E∞, or (b) a proof step that is in R∞ \ R∗, or

(c) a subproof si−1 R∗← si →R∗ si+1 (peak).
We show that in all three cases the proof step or subproof can be replaced by a smaller subproof:
Case (a): A proof step using an equation s ≈̇ t is in E∞. This equation must be deleted during the run.
If s ≈̇ t is deleted using Orient:

... si−1 ↔E∞ si ... =⇒ ... si−1 →R∞ si ...
If s ≈̇ t is deleted using Delete:
... si−1 ↔E∞ si−1 ... =⇒ ... si−1 ...
If s ≈̇ t is deleted using Simplify-Eq:
... si−1 ↔E∞ si ... =⇒ ... si−1 →R∞ s′ ↔E∞ si ...

Case (b): A proof step using a rule s → t is in R∞ \ R∗. This rule must be deleted during the run.
If s → t is deleted using R-Simplify-Rule:
... si−1 →R∞ si ... =⇒ ... si−1 →R∞ s′ R∞← si ...
If s → t is deleted using L-Simplify-Rule:
... si−1 →R∞ si ... =⇒ ... si−1 →R∞ s′ ↔E∞ si ...

Case (c): A subproof has the form si−1 R∗← si →R∗ si+1.
If there is no overlap or a non-critical overlap:
... si−1 R∗← si →R∗ si+1 ... =⇒ ... si−1 →∗R∗ s′ ∗R∗← si+1 ...
If there is a critical pair that has been added using Deduce:
... si−1 R∗← si →R∗ si+1 ... =⇒ ... si−1 ↔E∞ si+1 ...
In all cases, checking that the replacement subproof is smaller than the replaced subproof is routine. □

Theorem 4.32 Let (E0; R0) ⇒KBC (E1; R1) ⇒KBC (E2; R2) ⇒KBC ... be a fair run and let R0 and E∗ be empty. Then

(1) every proof in E∞ ∪ R∞ is equivalent to a rewrite proof in R∗,

(2) R∗ is equivalent to E0, and

(3) R∗ is convergent.

Proof. (1) By well-founded induction on ≻C using the previous lemma.

(2) Clearly ≈E∞∪R∞ = ≈E0. Since R∗ ⊆ R∞, we get ≈R∗ ⊆ ≈E∞∪R∞. On the other hand, by (1), ≈E∞∪R∞ ⊆ ≈R∗.
(3) Since →R∗ ⊆ ≻, R∗ is terminating. By (1), R∗ is confluent. □

4.6 Unfailing Completion

Classical completion: Try to transform a set E of equations into an equivalent convergent TRS. Fail, if an equation can neither be oriented nor deleted. Unfailing completion (Bachmair, Dershowitz and Plaisted): If an equation cannot be oriented, we can still use orientable instances for rewriting. Note: If ≻ is total on ground terms, then every ground instance of an equation is trivial or can be oriented. Goal: Derive a ground convergent set of equations. Let E be a set of equations, let ≻ be a reduction ordering.

We define the relation →E≻ by

s →E≻ t iff there exist (u ≈ v) ∈ E or (v ≈ u) ∈ E, p ∈ pos(s), and σ : X → TΣ(X), such that s|p = uσ and t = s[vσ]p and uσ ≻ vσ.

Note: →E≻ is terminating by construction. From now on let ≻ be a reduction ordering that is total on ground terms.

E is called ground convergent w. r. t. ≻ if for all ground terms s and t with s ↔∗E t there exists a ground term v such that s →∗E≻ v ∗E≻← t. (Analogously for E ∪ R.) As for standard completion, we establish ground convergence by computing critical pairs. However, the ordering ≻ is not total on non-ground terms. Since sθ ≻ tθ implies s ⋠ t, we approximate ≻ on ground terms by ⋠ on arbitrary terms.
Let ui ≈̇ vi (i = 1, 2) be equations in E whose variables have been renamed such that vars(u1 ≈̇ v1) ∩ vars(u2 ≈̇ v2) = ∅. Let p ∈ pos(u1) be a position such that u1|p is not a variable, σ is an mgu of u1|p and u2, and uiσ ⋠ viσ (i = 1, 2). Then ⟨v1σ, (u1σ)[v2σ]p⟩ is called a semi-critical pair of E with respect to ≻.

The set of all semi-critical pairs of E is denoted by SP≻(E).

Semi-critical pairs of E ∪ R are defined analogously. If →R ⊆ ≻, then CP(R) and SP≻(R) agree. Note: In contrast to critical pairs, it may be necessary to consider overlaps of a rule with itself at the top. For instance, if E = {f(x) ≈ g(y)}, then ⟨g(y), g(y′)⟩ is a non-trivial semi-critical pair.

The Deduce rule now takes the following form: Deduce (E; R) ⇒UKBC (E ∪ {s ≈ t}; R) if ⟨s, t⟩ ∈ SP≻(E ∪ R)

The other rules are inherited from ⇒KBC . The fairness criterion for runs is replaced by

SP≻(E∗ ∪ R∗) ⊆ E∞

(i. e., if every semi-critical pair between persisting rules or equations is computed at some step of the derivation). Analogously to Thm. 4.32 we now obtain the following theorem:

Theorem 4.33 Let (E0; R0) ⇒UKBC (E1; R1) ⇒UKBC (E2; R2) ⇒UKBC ... be a fair run; let R0 = ∅. Then

(1) E∗ ∪ R∗ is equivalent to E0, and

(2) E∗ ∪ R∗ is ground convergent.

Moreover one can show that, whenever there exists a reduced convergent R such that

≈E0 = ↓R and →R ⊆ ≻, then for every fair and simplifying run E∗ = ∅ and R∗ = R up to variable renaming. Here R is called reduced if, for every l → r ∈ R, both l and r are irreducible w. r. t. R \ {l → r}. A run is called simplifying if R∗ is reduced and, for all equations u ≈ v ∈ E∗, u and v are incomparable w. r. t. ≻ and irreducible w. r. t. R∗. Unfailing completion is refutationally complete for equational theories:

Theorem 4.34 Let E be a set of equations, let ≻ be a reduction ordering that is total on ground terms. For any two terms s and t, let ŝ and t̂ be the terms obtained from s and t by replacing all variables by Skolem constants. Let eq/2, true/0 and false/0 be new operator symbols, such that true and false are smaller than all other terms. Let E0 = E ∪ {eq(ŝ, t̂) ≈ true, eq(x, x) ≈ false}. If (E0; ∅) ⇒UKBC (E1; R1) ⇒UKBC (E2; R2) ⇒UKBC ... is a fair run of unfailing completion, then s ≈E t iff some Ei ∪ Ri contains true ≈ false.

Outlook: Combine non-equational superposition resolution and unfailing completion to get a calculus for equational clauses:

compute inferences between (strictly) maximal literals as in ordered resolution, compute overlaps between maximal sides of equations as in unfailing completion ⇒ Superposition calculus.

5 Implementing Saturation Procedures

Problem: Refutational completeness is nice in theory, but ... it guarantees only that proofs will be found eventually, not that they will be found quickly. Even though orderings and selection functions reduce the number of possible inferences, the search space problem is enormous. First-order provers “look for a needle in a haystack”: it may be necessary to make millions of inferences to find a proof that is only a few dozen steps long.

Coping with Large Sets of Formulas

Consequently: • We must deal with large sets of formulas. • We must use efficient techniques to find formulas that can be used as partners in an inference. • We must simplify/eliminate as many formulas as possible. • We must use efficient techniques to check whether a formula can be simplified/eliminated. Note: Often there are several competing implementation techniques. Design decisions are not independent of each other. Design decisions are not independent of the particular class of problems we want to solve. (FOL without equality/FOL with equality/unit equations, size of the signature, special algebraic properties like AC, etc.)

5.1 The Main Loop

Standard approach: Select one clause (“Given clause”). Find many partner clauses that can be used in inferences together with the “given clause” using an appropriate index data structure. Compute the conclusions of these inferences; add them to the set of clauses. Consequently: split the set of clauses into two subsets. • Wo = “Worked-off” (or “active”) clauses: Have already been selected as “given clause”. (So all inferences between these clauses have already been computed.) • Us = “Usable” (or “passive”) clauses: Have not yet been selected as “given clause”. During each iteration of the main loop: Select a new given clause C from Us; Us := Us \{C}.

Find partner clauses Di from Wo; New = Infer({ Di | i ∈ I },C); Us = Us ∪ New; Wo = Wo ∪{C} Additionally: Try to simplify C using Wo. (Skip the remainder of the iteration, if C can be elimi- nated.) Try to simplify (or even eliminate) clauses from Wo using C. Design decision: should one also simplify Us using Wo ? yes ; “Full Reduction”: Advantage: simplifications of Us may be useful to derive the empty clause. no ; “Lazy Reduction”: Advantage: clauses in Us are really passive; only clauses in Wo have to be kept in index data structure. (Hence: can use index data structure for which retrieval is faster, even if update is slower and space consumption is higher.)

Main Loop Full Reduction

Us = N; Wo = ∅;
while (Us ≠ ∅ && ⊥ ∉ Us) {
  Given = select clause from Us and move it from Us to Wo;
  New = all inferences between Given and Wo;
  Reduce New together with Wo and Us;
  Us = Us ∪ New;
}
if (⊥ ∈ Us) return “unsatisfiable”;
else return “satisfiable”;
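In Python, the same loop might look as follows (a sketch of ours; infer and reduce_new stand for the calculus-specific inference and simplification procedures and are assumed, not defined here; termination is of course not guaranteed):

    BOTTOM = frozenset()   # the empty clause; clauses are frozensets of literals

    def saturate(clauses, infer, reduce_new, select=lambda us: min(us, key=len)):
        us, wo = set(clauses), set()
        while us and BOTTOM not in us:
            given = select(us)              # heuristic choice of the given clause
            us.discard(given)
            wo.add(given)
            new = infer(given, wo)          # all inferences between Given and Wo
            us |= reduce_new(new, wo, us)   # simplify New together with Wo and Us
        return 'unsatisfiable' if BOTTOM in us else 'satisfiable'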

5.2 Term Representations

The obvious data structure for terms: Trees

f(g(x1), f(g(x1), x2)) [tree diagram omitted: the root f has the children g(x1) and f(g(x1), x2)]

optionally: (full) sharing

An alternative: Flatterms

f(g(x1), f(g(x1), x2))

f g x1 f g x1 x2

need more memory; but: better suited for preorder term traversal and easier memory management.

5.3 Index Data Structures

Problem: For a term t, we want to find all terms s such that
• s is an instance of t,
• s is a generalization of t (i. e., t is an instance of s),
• s and t are unifiable,
• s is a generalization of some subterm of t,
• ...
Requirements: fast insertion, fast deletion, fast retrieval, small memory consumption.

Note: In applications like functional or logic programming, the requirements are different (insertion and deletion are much less important).
Many different approaches:
• Path indexing
• Discrimination trees
• Substitution trees
• Context trees
• Feature vector indexing
• ...
Perfect filtering: The indexing technique returns exactly those terms satisfying the query.
Imperfect filtering: The indexing technique returns some superset of the set of all terms satisfying the query. Retrieval operations must be followed by an additional check, but the index can often be implemented more efficiently.
Frequently: All occurrences of variables are treated as different variables.

Path Indexing

Path indexing: Paths of terms are encoded in a trie (“retrieval tree”). A star ∗ represents arbitrary variables. Example: Paths of f(g(∗, b), ∗): f.1.g.1.∗, f.1.g.2.b, f.2.∗. Each leaf of the trie contains the set of (pointers to) all terms that contain the respective path.

Examples: Path index tries for {f(g(d, ∗), c)}, then successively adding f(g(∗, b), ∗), f(g(d, b), c), f(g(∗, c), b), and f(∗, ∗). [Trie diagrams omitted; each leaf stores the set of all indexed terms that contain the corresponding path.]

Advantages: Uses little space. No backtracking for retrieval. Efficient insertion and deletion. Good for finding instances. Disadvantages: Retrieval requires combining intermediate results for subterms.
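A toy path index (our own sketch) that maps each path to the set of term identifiers, using a flat dictionary in place of an explicit trie:

    from collections import defaultdict

    class PathIndex:
        def __init__(self):
            self.leaves = defaultdict(set)   # path (tuple) -> set of term ids

        def paths(self, term, prefix=()):
            if isinstance(term, str):        # variable: path ends in ∗
                yield prefix + ('*',)
            elif len(term) == 1:             # constant: path ends in the symbol
                yield prefix + (term[0],)
            else:
                for i, arg in enumerate(term[1:], start=1):
                    yield from self.paths(arg, prefix + (term[0], i))

        def insert(self, term_id, term):
            for p in self.paths(term):
                self.leaves[p].add(term_id)

    index = PathIndex()
    index.insert(1, ('f', ('g', ('d',), 'x'), ('c',)))   # f(g(d, ∗), c)
    print(index.leaves[('f', 1, 'g', 1, 'd')])           # {1}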

Discrimination Trees

Discrimination trees: Preorder traversals of terms are encoded in a trie. A star ∗ represents arbitrary variables. Example: String of f(g(∗, b), ∗): f.g.∗.b.∗ Each leaf of the trie contains (a pointer to) the term that is represented by the path.

Examples: Discrimination trees for {f(g(d, ∗), c)}, then successively adding f(g(∗, b), ∗), f(g(d, b), c), f(g(∗, c), b), and f(∗, ∗). [Trie diagrams omitted; each leaf stores the one term whose preorder string leads to it.]

Advantages: Each leaf yields one term, hence retrieval does not require intersections of intermediate results for subterms. Good for finding generalizations. Disadvantages: Uses more storage than path indexing (due to less sharing). Uses still more storage, if jump lists are maintained to speed up the search for instances or unifiable terms. Backtracking required for retrieval.

Feature Vector Indexing

Goal: C′ is subsumed by C if C′ = Cσ ∨ D. Find all clauses C′ for a given C or vice versa.

If C′ is subsumed by C, then
• C′ contains at least as many literals as C.
• C′ contains at least as many positive literals as C.
• C′ contains at least as many negative literals as C.
• C′ contains at least as many function symbols as C.
• C′ contains at least as many occurrences of f as C.
• C′ contains at least as many occurrences of f in negative literals as C.
• the deepest occurrence of f in C′ is at least as deep as in C.
• ...
Idea: Select a list of these “features”. Compute the “feature vector” (a list of natural numbers) for each clause and store it in a trie.
When searching for a subsuming clause: traverse the trie, check all clauses for which all features are smaller or equal. (Stop if a subsuming clause is found.)
When searching for subsumed clauses: traverse the trie, check all clauses for which all features are larger or equal.
Advantages: Works on the clause level rather than on the term level. Specialized for subsumption testing.
Disadvantages: Needs to be complemented by other index structures for other operations.
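A minimal sketch of ours, using three of the features listed above as a necessary-condition prefilter:

    def features(clause):
        """clause: iterable of (atom, polarity) literals."""
        pos = sum(1 for _, polarity in clause if polarity)
        return (len(clause), pos, len(clause) - pos)

    def may_subsume(c, d):
        """Necessary condition for 'c subsumes d': every feature of c ≤ that of d."""
        return all(fc <= fd for fc, fd in zip(features(c), features(d)))

    c = [('P', True)]
    d = [('P', True), ('Q', False)]
    assert may_subsume(c, d) and not may_subsume(d, c)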

Literature

Literature: R. Sekar, I. V. Ramakrishnan, and Andrei Voronkov: Term Indexing, Ch. 26 in Robinson and Voronkov (eds.), Handbook of Automated Reasoning, Vol. II, Elsevier, 2001. Christoph Weidenbach: Combining Superposition, Sorts and Splitting, Ch. 27 in Robin- son and Voronkov (eds.), Handbook of Automated Reasoning, Vol. II, Elsevier, 2001.

6 Termination Revisited

So far: Termination as a subordinate task for entailment checking. The TRS is generated by some saturation process; the ordering must be chosen before the saturation starts. Now: Termination as a main task (e. g., for program analysis). The TRS is fixed and known in advance. Literature: Nao Hirokawa and Aart Middeldorp: Dependency Pairs Revisited, RTA 2004, pp. 249–268 (in particular Sect. 1–4). Thomas Arts and Jürgen Giesl: Termination of Term Rewriting Using Dependency Pairs, Theoretical Computer Science, 236:133–178, 2000.

6.1 Dependency Pairs

Invented by T. Arts and J. Giesl in 1996, many refinements since then. Given: finite TRS R over Σ = (Ω, ∅).

T0 := { t ∈ TΣ(X) | there is an infinite derivation t →R t1 →R t2 →R ... }.

T∞ := { t ∈ T0 | ∀p > ε : t|p ∉ T0 } = the minimal elements of T0 w. r. t. ⊲. For every t ∈ T0 there exists a t′ ∈ T∞ such that t ⊵ t′.

R is non-terminating iff T0 ≠ ∅ iff T∞ ≠ ∅.

Assume that T∞ ≠ ∅ and consider some non-terminating derivation starting from t ∈ T∞. Since all subterms of t allow only finite derivations, at some point a rule l → r ∈ R must be applied at the root of t (possibly preceded by rewrite steps below the root):

t = f(t1,...,tn) −→>ε,∗R f(s1,...,sn) = lσ −→εR rσ.

In particular, root(t)= root(l), so we see that the root symbol of any term in T∞ must be contained in D := { root(l) | l → r ∈ R }. D is called the set of defined symbols of R; C := Ω \ D is called the set of constructor symbols of R.

The term rσ is contained in T0, so there exists a v ∈ T∞ such that rσ ⊵ v.

If v occurred in rσ at or below a variable position of r, then xσ|p = v for some x ∈ var(r) ⊆ var(l), hence si ⊵ xσ for some i and there would be an infinite derivation starting from some ti. This contradicts t ∈ T∞, though.

Therefore, v = uσ for some non-variable subterm u of r. As v ∈ T∞, we see that root(u) = root(v) ∈ D. Moreover, u cannot be a proper subterm of l, since otherwise again there would be an infinite derivation starting from some ti. Putting everything together, we obtain

t = f(t1,...,tn) −→>ε,∗R f(s1,...,sn) = lσ −→εR rσ ⊵ uσ, where r ⊵ u, root(u) ∈ D, l ⋪ u, and u is not a variable.

Since uσ ∈ T∞, we can continue this process and obtain an infinite sequence. If we define S := { l → u | l → r ∈ R, r ⊵ u, root(u) ∈ D, l ⋪ u, u ∉ X }, we can combine the rewrite step at the root and the subterm step and obtain

t −→>ε,∗R lσ −→εS uσ.

To get rid of the superscripts ε and >ε, it turns out to be useful to introduce a new set of function symbols f ♯ that are only used for the root symbols of this derivation: Ω♯ := { f ♯/n | f/n ∈ Ω }.

For a term t = f(t1,...,tn) we define t♯ := f♯(t1,...,tn); for a set of terms T we define T♯ := { t♯ | t ∈ T }. The set of dependency pairs of a TRS R is then defined by DP(R) := { l♯ → u♯ | l → r ∈ R, r ⊵ u, root(u) ∈ D, l ⋪ u, u ∉ X }.

For t ∈ T∞, the sequence using the S-rule now corresponds to

t♯ →∗R l♯σ →DP(R) u♯σ

where t♯ ∈ T♯∞ and u♯σ ∈ T♯∞. (Note that rules in R do not contain symbols from Ω♯, whereas all roots of terms in DP(R) come from Ω♯, so rules from R can only be applied below the root and rules from DP(R) can only be applied at the root.)

Since u♯σ is again in T♯∞, we can continue the process in the same way. We obtain: R is non-terminating iff there is an infinite sequence

t1 →∗R t2 →DP(R) t3 →∗R t4 →DP(R) ...

with ti ∈ T♯∞ for all i. Moreover, if there exists such an infinite sequence, then there exists an infinite sequence in which all DPs that are used are used infinitely often. (If some DP is used only finitely often, we can cut off the initial part of the sequence up to the last occurrence of that DP; the remainder is still an infinite sequence.)

Dependency Graphs

Such infinite sequences correspond to “cycles” in the “dependency graph”:

Dependency graph DG(R) of a TRS R: a directed graph with
• nodes: the dependency pairs s → t ∈ DP(R)
• edges: from s → t to u → v if there are σ, τ such that tσ →∗R uτ.
Intuitively, we draw an edge between two dependency pairs if these two dependency pairs can be used one after another in an infinite sequence (with some R-steps in between). While this relation is undecidable in general, there are reasonable overapproximations: the functions cap and ren are defined by:

cap(x) = x
cap(f(t1,...,tn)) = y (a fresh variable) if f ∈ D
cap(f(t1,...,tn)) = f(cap(t1),...,cap(tn)) if f ∈ C ∪ D♯
ren(x) = y, y fresh
ren(f(t1,...,tn)) = f(ren(t1),...,ren(tn))
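A sketch of cap and ren (our own illustration). The set of defined symbols D is passed in as a set of unmarked symbols, so a marked root f♯ is never replaced, matching the f ∈ C ∪ D♯ case; unify is the sketch from Section 4.3:

    import itertools

    FRESH = itertools.count()

    def cap(t, defined):
        """Replace subterms with a defined root symbol by fresh variables."""
        if isinstance(t, str):
            return t
        if t[0] in defined:
            return f'_v{next(FRESH)}'
        return (t[0],) + tuple(cap(a, defined) for a in t[1:])

    def ren(t):
        """Replace every variable occurrence by a fresh variable."""
        if isinstance(t, str):
            return f'_v{next(FRESH)}'
        return (t[0],) + tuple(ren(a) for a in t[1:])

    # Overapproximated edge from s → t to u → v: ren(cap(t)) unifiable with u.
    def edge(t, u, defined):
        return unify(ren(cap(t, defined)), u, {}) is not None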

The overapproximated dependency graph contains an edge from s → t to u → v if ren(cap(t)) and u are unifiable. A cycle in the dependency graph is a non-empty subset K ⊆ DP (R) such that there is a non-empty path from every DP in K to every DP in K (the two DPs may be identical). Let K ⊆ DP (R). An infinite rewrite sequence in R ∪ K of the form

t1 →∗R t2 →K t3 →∗R t4 →K ...

with ti ∈ T♯∞ is called K-minimal if all rules in K are used infinitely often. R is non-terminating iff there is a cycle K ⊆ DP(R) and a K-minimal infinite rewrite sequence.

6.2 Subterm Criterion

Our task is to show that there are no K-minimal infinite rewrite sequences. Suppose that every dependency pair symbol f♯ in K has positive arity (i. e., no constants). A simple projection π is a mapping π : Ω♯ → N such that π(f♯) = i ∈ {1,...,arity(f♯)}.

We define π(f♯(t1,...,tn)) = tπ(f♯).
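The condition used in the following theorem can be checked mechanically; a sketch of ours (the marker ♯ is spelled '#' in the code, and the example anticipates the minus♯ dependency pair appearing in Section 6.3):

    def subterm(s, t):
        """s ⊵ t in the subterm ordering?"""
        if s == t:
            return True
        return not isinstance(s, str) and any(subterm(a, t) for a in s[1:])

    def subterm_criterion(K, pi):
        """π(l) ⊵ π(r) for all pairs in K, and π(l) ⊲ π(r) for at least one?"""
        proj = lambda t: t[pi[t[0]]]
        strict = False
        for l, r in K:
            pl, pr = proj(l), proj(r)
            if not subterm(pl, pr):
                return False
            strict |= pl != pr
        return strict

    # Example: the DP minus♯(s(x), s(y)) → minus♯(x, y) with π(minus♯) = 1.
    K = [(('minus#', ('s', 'x'), ('s', 'y')), ('minus#', 'x', 'y'))]
    assert subterm_criterion(K, {'minus#': 1})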

Theorem 6.1 (Hirokawa and Middeldorp) Let K be a cycle in DG(R). If there is a simple projection π for K such that π(l) ⊵ π(r) for every l → r ∈ K and π(l) ⊲ π(r) for some l → r ∈ K, then there are no K-minimal sequences.

Proof. Suppose that

t1 →∗R u1 →K t2 →∗R u2 →K ... is a K-minimal infinite rewrite sequence. Apply π to every ti:

Case 1: ui →K ti+1. There is an l → r ∈ K such that ui = lσ, ti+1 = rσ. Then π(ui) = π(l)σ and π(ti+1) = π(r)σ. By assumption, π(l) ⊵ π(r). If π(l) = π(r), then π(ui) = π(ti+1). If π(l) ⊲ π(r), then π(ui) = π(l)σ ⊲ π(r)σ = π(ti+1). In particular, π(ui) ⊲ π(ti+1) for infinitely many i (since every DP is used infinitely often). Case 2: ti →∗R ui. Then π(ti) →∗R π(ui).

By applying π to every term in the K-minimal infinite rewrite sequence, we obtain an infinite (→R ∪ ⊲)-sequence containing infinitely many ⊲-steps. Since ⊲ is well-founded, there must also exist infinitely many →R-steps (otherwise the infinite sequence would have an infinite tail consisting only of ⊲-steps, contradicting well-foundedness.)

Now note that ⊲ ◦→R ⊆ →R ◦ ⊲. Therefore we can commute ⊲-steps and →R-steps and move all →R-steps to the front. We obtain an infinite →R-sequence that starts with π(t1). However t1 ⊲ π(t1) and t1 ∈ T∞, so there cannot be an infinite →R-sequence starting from π(t1). 2 Problem: The number of cycles in DG(R) can be exponential. Better method: Analyze strongly connected components (SCCs). SCC of a graph: maximal subgraph in which there is a non-empty path from every node to every node. (The two nodes can be identical.)3 Important property: Every cycle is contained in some SCC. Idea: Search for a simple projection π such that π(l) π(r) for all DPs l → r in the SCC. Delete all DPs in the SCC for which π(l) ⊲ π(r) (by the previous theorem, there cannot be any K-minimal infinite rewrite sequences using these DPs). Then re-compute SCCs for the remaining graph and re-start. No SCCs left ⇒ no cycles left ⇒ R is terminating. Example: See Ex. 13 from Hirokawa and Middeldorp.

3There are several definitions of SCCs that differ in the treatment of edges from a node to itself.

6.3 Reduction Pairs and Argument Filterings

Goal: Show the non-existence of K-minimal infinite rewrite sequences

t1 →∗R u1 →K t2 →∗R u2 →K ...

using well-founded orderings. We observe that the requirements for the orderings used here are less restrictive than for reduction orderings: K-rules are only used at the top, so we need stability under substitutions, but compatibility with contexts is unnecessary.

While →K-steps should be decreasing, for →R-steps it would be sufficient to show that they are not increasing. This motivates the following definitions:
Rewrite quasi-ordering ⪰: a reflexive and transitive binary relation that is stable under substitutions and compatible with contexts.
Reduction pair (⪰, ≻): ⪰ is a rewrite quasi-ordering; ≻ is a well-founded ordering that is stable under substitutions; ⪰ and ≻ are compatible: ⪰ ◦ ≻ ⊆ ≻ or ≻ ◦ ⪰ ⊆ ≻. (In practice, ≻ is almost always the strict part of the quasi-ordering ⪰.)
Clearly, for any reduction ordering ≻, (⪰, ≻) is a reduction pair, taking ⪰ to be the reflexive closure of ≻. More general reduction pairs can be obtained using argument filterings:
Argument filtering π : Ω ∪ Ω♯ → N ∪ (lists over N):
π(f) = i ∈ {1,...,arity(f)}, or
π(f) = [i1,...,ik], where 1 ≤ i1 < ... < ik ≤ arity(f), 0 ≤ k ≤ arity(f).
Extension to terms: π(x) = x

π(f(t1,...,tn)) = π(ti), if π(f)= i

π(f(t1,...,tn)) = f′(π(ti1),...,π(tik)), if π(f) = [i1,...,ik], where f′/k is a new function symbol.

Let ≻ be a reduction ordering, let π be an argument filtering. Define s ≻π t iff π(s) ≻ π(t) and s ⪰π t iff π(s) ⪰ π(t).

Lemma 6.2 (⪰π, ≻π) is a reduction pair.

Proof. Follows from the following two properties:

π(sσ) = π(s)σπ, where σπ is the substitution that maps x to π(σ(x)).
π(s[u]p) = π(s), if p does not correspond to any position in π(s);
π(s[u]p) = π(s)[π(u)]q, if p corresponds to q in π(s). □

For interpretation-based orderings (such as polynomial orderings) the idea of “cutting out” certain subterms can be included directly in the definition of the ordering: Reduction pairs by interpretation: Let A be a Σ-algebra; let ≻ be a well-founded strict partial ordering on its universe.

Assume that all interpretations fA of function symbols are weakly monotone, i. e., ai ⪰ bi implies fA(a1,...,an) ⪰ fA(b1,...,bn) for all ai, bi ∈ UA.

Define s ⪰A t iff A(β)(s) ⪰ A(β)(t) for all assignments β : X → UA; define s ≻A t iff A(β)(s) ≻ A(β)(t) for all assignments β : X → UA.

Then (⪰A, ≻A) is a reduction pair. For polynomial orderings, this definition permits interpretations of function symbols in which some variable does not occur at all (e. g., Pf(X, Y) = 2X + 1 for a binary function symbol). It is no longer required that every variable occur with some positive coefficient.

Theorem 6.3 (Arts and Giesl) Let K be a cycle in the dependency graph of the TRS R. If there is a reduction pair (⪰, ≻) such that
• l ⪰ r for all l → r ∈ R,
• l ⪰ r or l ≻ r for all l → r ∈ K,
• l ≻ r for at least one l → r ∈ K,
then there is no K-minimal infinite sequence.

Proof. Assume that

t1 →∗R u1 →K t2 →∗R u2 →K ... is a K-minimal infinite rewrite sequence.

As l ⪰ r for all l → r ∈ R, we obtain ti ⪰ ui by stability under substitutions, compatibility with contexts, reflexivity and transitivity.

As l ⪰ r or l ≻ r for all l → r ∈ K, we obtain ui (⪰ ∪ ≻) ti+1 by stability under substitutions. So we get an infinite (⪰ ∪ ≻)-sequence containing infinitely many ≻-steps (since every DP in K, in particular the one for which l ≻ r holds, is used infinitely often). By compatibility of ⪰ and ≻, we can transform this into an infinite ≻-sequence, contradicting well-foundedness. □

The idea can be extended to SCCs in the same way as for the subterm criterion: Search for a reduction pair (⪰, ≻) such that l ⪰ r for all l → r ∈ R and l ⪰ r or l ≻ r for all DPs l → r in the SCC. Delete all DPs in the SCC for which l ≻ r. Then re-compute SCCs for the remaining graph and re-start. Example: Consider the following TRS R from [Arts and Giesl]:

minus(x, 0) → x (1) minus(s(x),s(y)) → minus(x, y) (2) quot(0,s(y)) → 0 (3) quot(s(x),s(y)) → s(quot(minus(x, y),s(y))) (4)

(R is not contained in any simplification ordering, since the left-hand side of rule (4) is embedded in the right-hand side after instantiating y by s(x).) R has three dependency pairs:

minus♯(s(x), s(y)) → minus♯(x, y) (5)
quot♯(s(x), s(y)) → quot♯(minus(x, y), s(y)) (6)
quot♯(s(x), s(y)) → minus♯(x, y) (7)

The dependency graph of R has the nodes (5), (6), (7) and the edges (5) → (5), (7) → (5), (6) → (7), (6) → (6).

There are exactly two SCCs (and also two cycles). The cycle at (5) can be handled using the subterm criterion with π(minus♯) = 1. For the cycle at (6) we can use an argument filtering π that maps minus to 1 and leaves all other function symbols unchanged (that is, π(g) = [1, ..., arity(g)] for every g different from minus). After applying the argument filtering, we compare left- and right-hand sides using an LPO with precedence quot > s (the precedence of the other symbols is irrelevant). We obtain l ≻ r for (6) and l ≽ r for (1), (2), (3), (4), so the previous theorem can be applied.

DP Processors

The methods described so far are particular cases of DP processors: A DP processor

(G, R)
(G1, R1), ..., (Gn, Rn)

takes a graph G and a TRS R as input and produces a set of pairs consisting of a graph and a TRS. It is sound and complete if there are K-minimal infinite sequences for G and R if and only if there are K-minimal infinite sequences for at least one of the pairs (Gi, Ri). Examples:

(G, R)
(SCC1, R), ..., (SCCn, R)

where SCC1, ..., SCCn are the strongly connected components of G.

(G, R)
(G \ N, R)

if there is an SCC of G and a simple projection π such that π(l) ⊵ π(r) for all DPs l → r in the SCC, and N is the set of DPs of the SCC for which π(l) ⊳ π(r). (and analogously for reduction pairs)
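As a small executable sketch of the first processor above (our own encoding: DPs are numbered nodes, the graph an adjacency dict; Tarjan's algorithm computes the SCCs, and only nontrivial SCCs, i.e., those containing at least one edge, are returned as subproblems):

```python
def sccs(nodes, edges):
    """Tarjan's algorithm: return the strongly connected components."""
    index, low, stack, on, out, cnt = {}, {}, [], set(), [], [0]
    def dfs(v):
        index[v] = low[v] = cnt[0]; cnt[0] += 1
        stack.append(v); on.add(v)
        for w in edges.get(v, ()):
            if w not in index:
                dfs(w); low[v] = min(low[v], low[w])
            elif w in on:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:             # v is the root of an SCC
            comp = set()
            while True:
                w = stack.pop(); on.discard(w); comp.add(w)
                if w == v: break
            out.append(comp)
    for v in nodes:
        if v not in index: dfs(v)
    return out

def scc_processor(G, R):
    """DP processor (G, R) => (SCC1, R), ..., (SCCn, R); nontrivial SCCs only."""
    nodes, edges = G
    return [(c, R) for c in sccs(nodes, edges)
            if len(c) > 1 or any(v in edges.get(v, ()) for v in c)]

# Dependency graph of the quot/minus example:
G = ({5, 6, 7}, {5: {5}, 6: {6, 7}, 7: {5}})
print([c for c, _ in scc_processor(G, None)])   # [{5}, {6}]
```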

Innermost Termination

The dependency method can also be used for proving termination of innermost rewriting: s →ⁱR t if s →R t at a position p and no rule of R can be applied at a position strictly below p. (DP processors for innermost termination are more powerful than for ordinary termination, and for program analysis, innermost termination is usually sufficient.)

6.4 Superposition

Goal: Combine the ideas of superposition for first-order logic without equality (overlap maximal literals in a clause) and Knuth-Bendix completion (overlap maximal sides of equations) to get a calculus for equational clauses.

Observation

It is possible to encode an arbitrary predicate P using a function fP and a new constant tt:

P(t1, ..., tn) ⇝ fP(t1, ..., tn) ≈ tt
¬ P(t1, ..., tn) ⇝ ¬ fP(t1, ..., tn) ≈ tt

In equational logic it is therefore sufficient to consider the case that Π = ∅, i. e., equality is the only predicate symbol. Abbreviation: s ≉ t instead of ¬ s ≈ t.
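A small sketch of this encoding in Python (our own representation: atoms as tuples, literals as sign/atom pairs; the names fP and tt follow the text):

```python
TT = ("tt",)    # the new constant

def encode_literal(lit):
    """[¬] P(t1,...,tn) becomes [¬] f_P(t1,...,tn) ≈ tt."""
    sign, atom = lit
    p, *args = atom
    return (sign, ("≈", ("f_" + p,) + tuple(args), TT))

clause = [(True, ("P", "x")), (False, ("Q", ("g", "x")))]
print([encode_literal(l) for l in clause])
# [(True, ('≈', ('f_P', 'x'), ('tt',))),
#  (False, ('≈', ('f_Q', ('g', 'x')), ('tt',)))]
```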

The Superposition Calculus – Informally

Conventions: From now on: Π = ∅ (equality is the only predicate). Inference rules are to be read modulo symmetry of the equality symbol. We will first explain the ideas and motivations behind the superposition calculus and its completeness proof. Precise definitions will be given later.

Ground inference rules:

Superposition Right:
D′ ∨ t ≈ t′    C′ ∨ s[t] ≈ s′
D′ ∨ C′ ∨ s[t′] ≈ s′

Superposition Left:
D′ ∨ t ≈ t′    C′ ∨ s[t] ≉ s′
D′ ∨ C′ ∨ s[t′] ≉ s′

Equality Resolution:
C′ ∨ s ≉ s
C′

(Note: We will need one further inference rule.)

Ordering restrictions: Some considerations: The literal ordering must depend primarily on the larger term of an equation. As in the resolution case, negative literals must be a bit larger than the corresponding positive literals. Additionally, we need the following property: If s ≻ t ≻ u, then s ≉ u must be larger than s ≈ t. In other words, we must compare first the larger term, then the polarity, and finally the smaller term.

The following construction has the required properties: Let ≻ be a reduction ordering that is total on ground terms. To a positive literal s ≈ t, we assign the multiset {s, t}, to a negative literal s ≉ t the multiset {s, s, t, t}. The literal ordering ≻L compares these multisets using the multiset extension of ≻.

The clause ordering ≻C compares clauses by comparing their multisets of literals using the multiset extension of ≻L.
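The following Python sketch (our own encoding; gt is an arbitrary ground reduction ordering given as a function) implements the Dershowitz–Manna multiset extension and the induced literal and clause orderings. Note how it confirms the property above: with b ≻ c ≻ d, the negative literal b ≉ d is ≻L-larger than the positive literal b ≈ c.

```python
from collections import Counter

def mul_gt(gt, M, N):
    """M >mul N: cancel common elements; every remaining element of N
    must be dominated by some remaining element of M."""
    M, N = Counter(M), Counter(N)
    M, N = M - N, N - M
    if not M and not N:
        return False
    return all(any(gt(m, n) for m in M) for n in N)

def lit_multiset(lit):
    pos, s, t = lit          # (True, s, t) is s ≈ t, (False, s, t) is s ≉ t
    return [s, t] if pos else [s, s, t, t]

def lit_gt(gt, l1, l2):
    return mul_gt(gt, lit_multiset(l1), lit_multiset(l2))

def clause_gt(gt, c1, c2):   # clauses are lists (multisets) of literals
    return mul_gt(lambda a, b: lit_gt(gt, a, b), c1, c2)

prec = {"b": 2, "c": 1, "d": 0}          # b ≻ c ≻ d on constants
gt = lambda s, t: prec[s] > prec[t]
print(lit_gt(gt, (False, "b", "d"), (True, "b", "c")))   # True
```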

Ordering restrictions: Ground inferences are necessary only if the following conditions are satisfied:
– In superposition inferences, the left premise is smaller than the right premise.
– The literals that are involved in the inferences are maximal in the respective clauses (strictly maximal for positive literals in superposition inferences).
– In these literals, the lhs is greater than or equal to the rhs (in superposition inferences: greater than the rhs).

Model construction: We want to use roughly the same ideas as in the completeness proof for superposition on first-order logic without equality. But a Herbrand interpretation does not work for equality: the equality symbol ≈ must be interpreted by actual equality in the interpretation.

Solution: Define a set E of ground equations and take TΣ(∅)/E = TΣ(∅)/≈E as the universe.

Then two ground terms s and t are equal in the interpretation if and only if s ≈E t.

If E is a terminating and confluent rewrite system R, then two ground terms s and t are equal in the interpretation if and only if s ↓R t.
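For ground rules this is directly executable: the following sketch (ground rewrite rules as a dict from left-hand sides to right-hand sides, terms as nested tuples, as in the earlier sketches) decides s ↓R t for a convergent R by comparing normal forms.

```python
def rewrite_step(R, t):
    """Return some one-step R-successor of t, or None if t is irreducible."""
    if t in R:
        return R[t]
    if isinstance(t, tuple):
        for i, a in enumerate(t[1:], 1):
            u = rewrite_step(R, a)
            if u is not None:
                return t[:i] + (u,) + t[i + 1:]
    return None

def normal_form(R, t):
    while True:
        u = rewrite_step(R, t)
        if u is None:
            return t
        t = u

def joinable(R, s, t):               # s ↓_R t
    return normal_form(R, s) == normal_form(R, t)

# b → c, c → d: then b ≈ d is true in the induced interpretation.
R = {("b",): ("c",), ("c",): ("d",)}
print(joinable(R, ("b",), ("d",)))   # True
```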

One problem: In the completeness proof for the resolution calculus, the following property holds: If C = C′ ∨ A with a strictly maximal and positive literal A is false in the current interpretation, then adding A to the current interpretation cannot make any literal of C′ true. This does not hold for superposition: Let b ≻ c ≻ d. Assume that the current rewrite system (representing the current interpretation) contains the rule c → d. Now consider the clause b ≈ c ∨ b ≈ d.

We need a further inference rule to deal with clauses of this kind, either the "Merging Paramodulation" rule of Bachmair and Ganzinger or the following "Equality Factoring" rule due to Nieuwenhuis:

Equality Factoring:
C′ ∨ s ≈ t′ ∨ s ≈ t
C′ ∨ t ≉ t′ ∨ s ≈ t′

Note: This inference rule subsumes the usual factoring rule.

What do the non-ground versions of the inference rules for superposition look like? Main idea as in the non-equational first-order case: Replace identity by unifiability. Apply the mgu to the resulting clause. In the ordering restrictions, replace ≻ by ⋠.

However: As in Knuth-Bendix completion, we do not want to consider overlaps at or below a variable position. Consequence: there are inferences between ground instances Dθ and Cθ of clauses D and C which are not ground instances of inferences between D and C. Such inferences have to be treated in a special way in the completeness proof.

The Superposition Calculus – Formally

Until now, we have seen most of the ideas behind the superposition calculus and its completeness proof. We will now start again from the beginning, giving precise definitions and proofs. Inference rules are applied with respect to the commutativity of equality ≈.

Inference rules:

Superposition Right:
D′ ∨ t ≈ t′    C′ ∨ s[u] ≈ s′
(D′ ∨ C′ ∨ s[t′] ≈ s′)σ
where σ = mgu(t, u) and u is not a variable.

Superposition Left:
D′ ∨ t ≈ t′    C′ ∨ s[u] ≉ s′
(D′ ∨ C′ ∨ s[t′] ≉ s′)σ
where σ = mgu(t, u) and u is not a variable.

Equality Resolution:
C′ ∨ s ≉ s′
C′σ
where σ = mgu(s, s′).

Equality Factoring:
C′ ∨ s′ ≈ t′ ∨ s ≈ t
(C′ ∨ t ≉ t′ ∨ s ≈ t′)σ
where σ = mgu(s, s′).
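To illustrate the lifting, here is a sketch of Equality Resolution with a naive Robinson-style unification (including the occurs-check); the ordering and selection restrictions given below are omitted, and the term encoding is the same hypothetical one as in the earlier sketches.

```python
def subst(sigma, t):
    if isinstance(t, str):
        return subst(sigma, sigma[t]) if t in sigma else t
    return (t[0],) + tuple(subst(sigma, a) for a in t[1:])

def occurs(x, t):
    return x == t if isinstance(t, str) else any(occurs(x, a) for a in t[1:])

def mgu(s, t, sigma=None):
    sigma = {} if sigma is None else sigma
    s, t = subst(sigma, s), subst(sigma, t)
    if s == t:
        return sigma
    if isinstance(s, str):
        if occurs(s, t):
            return None
        sigma[s] = t
        return sigma
    if isinstance(t, str):
        return mgu(t, s, sigma)
    if s[0] != t[0] or len(s) != len(t):
        return None
    for a, b in zip(s[1:], t[1:]):
        sigma = mgu(a, b, sigma)
        if sigma is None:
            return None
    return sigma

def equality_resolution(clause):
    """Yield C'σ for each negative literal s ≉ s' with σ = mgu(s, s')."""
    for i, (pos, s, s1) in enumerate(clause):
        if not pos:
            sigma = mgu(s, s1)
            if sigma is not None:
                rest = clause[:i] + clause[i + 1:]
                yield [(p, subst(sigma, a), subst(sigma, b)) for p, a, b in rest]

# ¬ f(x) ≈ f(g(y)) ∨ x ≈ y   yields   g(y) ≈ y:
clause = [(False, ("f", "x"), ("f", ("g", "y"))), (True, "x", "y")]
print(list(equality_resolution(clause)))   # [[(True, ('g', 'y'), 'y')]]
```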

Theorem 6.4 All inference rules of the superposition calculus are correct, i. e., for every rule

C1 ... Cn
C0

we have {C1, ..., Cn} |= C0.

Proof. Exercise. □

Orderings: Let ≻ be a reduction ordering that is total on ground terms. To a positive literal s ≈ t, we assign the multiset {s, t}, to a negative literal s ≉ t the multiset {s, s, t, t}. The literal ordering ≻L compares these multisets using the multiset extension of ≻.

The clause ordering ≻C compares clauses by comparing their multisets of literals using the multiset extension of ≻L.

Inferences have to be computed only if the following ordering restrictions are satisfied:
– In superposition inferences, after applying the unifier to both premises, the left premise is not greater than or equal to the right one.
– The last literal in each premise is maximal in the respective premise, i. e., there exists no greater literal (strictly maximal for positive literals in superposition inferences, i. e., there exists no greater or equal literal).
– In these literals, the lhs is not smaller than the rhs (in superposition inferences: neither smaller nor equal).

Superposition Left in Detail:

D′ ∨ t ≈ t′    C′ ∨ s[u] ≉ s′
(D′ ∨ C′ ∨ s[t′] ≉ s′)σ

where σ = mgu(t, u), u is not a variable, tσ ⋠ t′σ, sσ ⋠ s′σ,
(t ≈ t′)σ strictly maximal in (D′ ∨ t ≈ t′)σ, nothing selected,
(s ≉ s′)σ maximal in (C′ ∨ s ≉ s′)σ or selected.

Superposition Right in Detail:

D′ ∨ t ≈ t′    C′ ∨ s[u] ≈ s′
(D′ ∨ C′ ∨ s[t′] ≈ s′)σ

where σ = mgu(t, u), u is not a variable, tσ ⋠ t′σ, sσ ⋠ s′σ,
(t ≈ t′)σ strictly maximal in (D′ ∨ t ≈ t′)σ, nothing selected,
(s ≈ s′)σ strictly maximal in (C′ ∨ s ≈ s′)σ, nothing selected.

Equality Resolution in Detail:

C′ ∨ s ≉ s′
C′σ

where σ = mgu(s, s′), (s ≉ s′)σ maximal in (C′ ∨ s ≉ s′)σ or selected.

Equality Factoring in Detail:

C′ ∨ s′ ≈ t′ ∨ s ≈ t
(C′ ∨ t ≉ t′ ∨ s ≈ t′)σ

where σ = mgu(s, s′), s′σ ⋠ t′σ, sσ ⋠ tσ, (s ≈ t)σ maximal in (C′ ∨ s′ ≈ t′ ∨ s ≈ t)σ, nothing selected.

A ground clause C is called redundant w. r. t. a set of ground clauses N, if it follows from clauses in N that are smaller than C. A clause is redundant w. r. t. a set of clauses N, if all its ground instances are redundant w. r. t. GΣ(N). The set of all clauses that are redundant w. r. t. N is denoted by Red(N). N is called saturated up to redundancy, if the conclusion of every inference from clauses in N \ Red(N) is contained in N ∪ Red(N).

Superposition: Refutational Completeness

For a set E of ground equations, TΣ(∅)/E is an E-interpretation (or E-algebra) with universe { [t] | t ∈ TΣ(∅) }. One can show (similar to the proof of Birkhoff's Theorem) that for every ground equation s ≈ t we have TΣ(∅)/E |= s ≈ t if and only if s ↔∗E t. In particular, if E is a convergent set of rewrite rules R and s ≈ t is a ground equation, then TΣ(∅)/R |= s ≈ t if and only if s ↓R t. By abuse of terminology, we say that an equation or clause is valid (or true) in R if and only if it is true in TΣ(∅)/R.

Construction of candidate interpretations (Bachmair & Ganzinger 1990): Let N be a set of clauses not containing ⊥. Using induction on the clause ordering we define sets of rewrite rules EC and RC for all C ∈ GΣ(N) as follows:

Assume that ED has already been defined for all D ∈ GΣ(N) with D ≺C C. Then RC = ⋃ { ED | D ≺C C }.

The set EC contains the rewrite rule s → t, if
(a) C = C′ ∨ s ≈ t.
(b) s ≈ t is strictly maximal in C.
(c) s ≻ t.
(d) C is false in RC.
(e) C′ is false in RC ∪ {s → t}.
(f) s is irreducible w. r. t. RC.
(g) no negative literal is selected in C′.

In this case, C is called productive. Otherwise EC = ∅.

Finally, R∞ = ⋃ { ED | D ∈ GΣ(N) }.

Lemma 6.5 If EC = {s → t} and ED = {u → v}, then s ≻ u if and only if C ≻C D.
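For finite sets of ground clauses the construction is effectively executable. The following sketch reuses joinable and normal_form from the rewriting sketch above and assumes a binary literal ordering lit_gt (e.g., obtained from the earlier ordering sketch by fixing gt); selection is ignored, i.e., condition (g) is dropped, and equations are assumed to be written with the larger side first.

```python
def true_in(R, clause):
    # a positive literal is true iff joinable, a negative one iff not
    return any(pos == joinable(R, s, t) for pos, s, t in clause)

def strictly_maximal_pos(clause, i, lit_gt):
    # on ground literals this amounts to: strictly greater than all others
    pos, s, t = clause[i]
    return pos and all(lit_gt(clause[i], l)
                       for j, l in enumerate(clause) if j != i)

def candidate_interpretation(clauses_ascending, gt, lit_gt):
    """Build R_C step by step over the clauses, in ascending order ≻C."""
    R = {}
    for C in clauses_ascending:
        if true_in(R, C):                              # (d) fails
            continue
        for i, (pos, s, t) in enumerate(C):
            if (strictly_maximal_pos(C, i, lit_gt)     # (a), (b)
                    and gt(s, t)                       # (c)
                    and normal_form(R, s) == s):       # (f)
                R1 = dict(R); R1[s] = t
                rest = C[:i] + C[i + 1:]
                if not true_in(R1, rest):              # (e): C' false
                    R[s] = t                           # C is productive
                    break
    return R
```

With b ≻ c ≻ d, the unit clause c ≈ d produces c → d, whereas b ≈ c ∨ b ≈ d then produces nothing: condition (e) fails for b ≈ c, and b ≈ d is not strictly maximal; this is exactly the situation that motivated the Equality Factoring rule.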

Corollary 6.6 The rewrite systems RC and R∞ are convergent.

Proof. Obviously, s ≻ t for all rules s → t in RC and R∞. Furthermore, it is easy to check that there are no critical pairs between any two rules: Assume that there are rules u → v in ED and s → t in EC such that u is a subterm of s. As ≻ is a reduction ordering that is total on ground terms, we get u ≺ s and therefore D ≺C C and ED ⊆ RC. But then s would be reducible by RC, contradicting condition (f). □

Lemma 6.7 If D ≼C C and EC = {s → t}, then s ≻ u for every term u occurring in a negative literal in D and s ≽ v for every term v occurring in a positive literal in D.

Corollary 6.8 If D ∈ GΣ(N) is true in RD, then D is true in R∞ and RC for all C ≻C D.

Proof. If a positive literal of D is true in RD, then this is obvious.

Otherwise, some negative literal s ≉ t of D must be true in RD, i. e., s and t are not joinable w. r. t. RD. As the rules in R∞ \ RD have left-hand sides that are larger than s and t, they cannot be used in a rewrite proof of s ↓ t; hence s and t are not joinable w. r. t. RC or R∞ either. □

Corollary 6.9 If D = D′ ∨ u ≈ v is productive, then D′ is false and D is true in R∞ and RC for all C ≻C D.

Proof. Obviously, D is true in R∞ and RC for all C ≻C D.

Since all negative literals of D are false in RD, it is clear that they are false in R∞ and RC. For the positive literals u′ ≈ v′ of D′, condition (e) ensures that they are false in RD ∪ {u → v}. Since u ≽ u′ and u ≽ v′ and all rules in R∞ \ (RD ∪ {u → v}) have left-hand sides that are larger than u, these rules cannot be used in a rewrite proof of u′ ↓ v′; hence u′ and v′ are not joinable w. r. t. RC or R∞ either. □

Lemma 6.10 (“Lifting Lemma”) Let C be a clause and let θ be a substitution such that Cθ is ground. Then every equality resolution or equality factoring inference from Cθ is a ground instance of an inference from C.

Proof. Exercise. □

Lemma 6.11 (“Lifting Lemma”) Let D = D′ ∨ u ≈ v and C = C′ ∨ [¬] s ≈ t be two clauses (without common variables) and let θ be a substitution such that Dθ and Cθ are ground. If there is a superposition inference between Dθ and Cθ where uθ and some subterm of sθ are overlapped, and uθ does not occur in sθ at or below a variable position of s, then the inference is a ground instance of a superposition inference from D and C.

Proof. Exercise. □

Theorem 6.12 (“Model Construction”) Let N be a set of clauses that is saturated up to redundancy and does not contain the empty clause. Then we have for every ground clause Cθ ∈ GΣ(N):

(i) ECθ = ∅ if and only if Cθ is true in RCθ.

(ii) If Cθ is redundant w. r. t. GΣ(N), then it is true in RCθ.

(iii) Cθ is true in R∞ and in RD for every D ∈ GΣ(N) with D ≻C Cθ.

Proof. We prove the theorem without considering selection. We use induction on the clause ordering ≻C and assume that (i)–(iii) are already satisfied for all clauses in GΣ(N) that are smaller than Cθ. Note that the "if" part of (i) is obvious from the construction and that condition (iii) follows immediately from (i) and Corollaries 6.8 and 6.9. So it remains to show (ii) and the "only if" part of (i).

Case 1: Cθ is redundant w. r. t. GΣ(N).

If Cθ is redundant w. r. t. GΣ(N), then it follows from clauses in GΣ(N) that are smaller than Cθ. By part (iii) of the induction hypothesis, these clauses are true in RCθ. Hence Cθ is true in RCθ.

Case 2: xθ is reducible by RCθ.

Suppose there is a variable x occurring in C such that xθ is reducible by RCθ, say xθ →RCθ w. Let the substitution θ′ be defined by xθ′ = w and yθ′ = yθ for every variable y ≠ x. The clause Cθ′ is smaller than Cθ. By part (iii) of the induction hypothesis, it is true in RCθ. By congruence, every literal of Cθ is true in RCθ if and only if the corresponding literal of Cθ′ is true in RCθ; hence Cθ is true in RCθ.

Case 3: Cθ contains a maximal negative literal.

Suppose that Cθ does not fall into Case 1 or 2 and that Cθ = C′θ ∨ sθ ≉ s′θ, where sθ ≉ s′θ is maximal in Cθ. If sθ ≈ s′θ is false in RCθ, then Cθ is clearly true in RCθ and we are done. So assume that sθ ≈ s′θ is true in RCθ, that is, sθ ↓RCθ s′θ. Without loss of generality, sθ ≽ s′θ.

Case 3.1: sθ = s′θ.

If sθ = s′θ, then there is an equality resolution inference

C′θ ∨ sθ ≉ s′θ
C′θ

As shown in the Lifting Lemma, this is an instance of an equality resolution inference

C′ ∨ s ≉ s′
C′σ

where C = C′ ∨ s ≉ s′ is contained in N and θ = σ ∘ ρ. Without loss of generality, σ is idempotent, therefore C′θ = C′σρ = C′σσρ = C′σθ, so C′θ is a ground instance of C′σ. Since Cθ is not redundant w. r. t. GΣ(N), C is not redundant w. r. t. N. As N is saturated up to redundancy, the conclusion C′σ of the inference from C is contained in N ∪ Red(N). Therefore, C′θ is either contained in GΣ(N) and smaller than Cθ, or it follows from clauses in GΣ(N) that are smaller than itself (and therefore smaller than Cθ). By the induction hypothesis, clauses in GΣ(N) that are smaller than Cθ are true in RCθ, thus C′θ and Cθ are true in RCθ.

Case 3.2: sθ ≻ s′θ.

If sθ ↓RCθ s′θ and sθ ≻ s′θ, then sθ must be reducible by some rule in some EDθ ⊆ RCθ. (Without loss of generality we assume that C and D are variable disjoint; so we can use the same substitution θ.) Let Dθ = D′θ ∨ tθ ≈ t′θ with EDθ = {tθ → t′θ}. Since Dθ is productive, D′θ is false in RCθ. Besides, by part (ii) of the induction hypothesis, Dθ is not redundant w. r. t. GΣ(N), so D is not redundant w. r. t. N. Note that tθ cannot occur in sθ at or below a variable position of s, say xθ = w[tθ], since otherwise Cθ would be subject to Case 2 above. Consequently, the left superposition inference

D′θ ∨ tθ ≈ t′θ    C′θ ∨ sθ[tθ] ≉ s′θ
D′θ ∨ C′θ ∨ sθ[t′θ] ≉ s′θ

is a ground instance of a left superposition inference from D and C. By saturation up to redundancy, its conclusion is either contained in GΣ(N) and smaller than Cθ, or it follows from clauses in GΣ(N) that are smaller than itself (and therefore smaller than Cθ). By the induction hypothesis, these clauses are true in RCθ, thus D′θ ∨ C′θ ∨ sθ[t′θ] ≉ s′θ is true in RCθ. Since D′θ and sθ[t′θ] ≉ s′θ are false in RCθ, both C′θ and Cθ must be true.

Case 4: Cθ does not contain a maximal negative literal.

Suppose that Cθ does not fall into Cases 1 to 3. Then Cθ can be written as C′θ ∨ sθ ≈ s′θ, where sθ ≈ s′θ is a maximal literal of Cθ. If ECθ = {sθ → s′θ}, or C′θ is true in RCθ, or sθ = s′θ, then there is nothing to show, so assume that ECθ = ∅ and that C′θ is false in RCθ. Without loss of generality, sθ ≻ s′θ.

Case 4.1: sθ ≈ s′θ is maximal in Cθ, but not strictly maximal.

If sθ ≈ s′θ is maximal in Cθ, but not strictly maximal, then Cθ can be written as C′′θ ∨ tθ ≈ t′θ ∨ sθ ≈ s′θ, where tθ = sθ and t′θ = s′θ. In this case, there is an equality factoring inference

C′′θ ∨ tθ ≈ t′θ ∨ sθ ≈ s′θ
C′′θ ∨ t′θ ≉ s′θ ∨ tθ ≈ t′θ

This inference is a ground instance of an inference from C. By the induction hypothesis, its conclusion is true in RCθ. Trivially, t′θ = s′θ implies t′θ ↓RCθ s′θ, so t′θ ≉ s′θ must be false and Cθ must be true in RCθ.

Case 4.2: sθ ≈ s′θ is strictly maximal in Cθ and sθ is reducible.

Suppose that sθ ≈ s′θ is strictly maximal in Cθ and sθ is reducible by some rule in EDθ ⊆ RCθ. Let Dθ = D′θ ∨ tθ ≈ t′θ and EDθ = {tθ → t′θ}. Since Dθ is productive, Dθ is not redundant and D′θ is false in RCθ. We can now proceed in essentially the same way as in Case 3.2: If tθ occurred in sθ at or below a variable position of s, say xθ = w[tθ], then Cθ would be subject to Case 2 above. Otherwise, the right superposition inference

D′θ ∨ tθ ≈ t′θ    C′θ ∨ sθ[tθ] ≈ s′θ
D′θ ∨ C′θ ∨ sθ[t′θ] ≈ s′θ

is a ground instance of a right superposition inference from D and C. By saturation up to redundancy, its conclusion is true in RCθ. Since D′θ and C′θ are false in RCθ, sθ[t′θ] ≈ s′θ must be true in RCθ. On the other hand, tθ ≈ t′θ is true in RCθ, so by congruence, sθ[tθ] ≈ s′θ and Cθ are true in RCθ.

Case 4.3: sθ ≈ s′θ is strictly maximal in Cθ and sθ is irreducible.

Suppose that sθ ≈ s′θ is strictly maximal in Cθ and sθ is irreducible by RCθ. Then there are three possibilities: Cθ can be true in RCθ, or C′θ can be true in RCθ ∪ {sθ → s′θ}, or ECθ = {sθ → s′θ}. In the first and the third case, there is nothing to show. Let us therefore assume that Cθ is false in RCθ and C′θ is true in RCθ ∪ {sθ → s′θ}. Then C′θ = C′′θ ∨ tθ ≈ t′θ, where the literal tθ ≈ t′θ is true in RCθ ∪ {sθ → s′θ} and false in RCθ. In other words, tθ ↓RCθ∪{sθ→s′θ} t′θ, but not tθ ↓RCθ t′θ. Consequently, there is a rewrite proof tθ →∗ u ←∗ t′θ by RCθ ∪ {sθ → s′θ} in which the rule sθ → s′θ is used at least once. Without loss of generality we assume that tθ ≽ t′θ. Since sθ ≈ s′θ ≻L tθ ≈ t′θ and sθ ≻ s′θ we can conclude that sθ ≽ tθ ≻ t′θ. But then there is only one possibility how the rule sθ → s′θ can be used in the rewrite proof: We must have sθ = tθ, and the rewrite proof must have the form tθ → s′θ →∗ u ←∗ t′θ, where the first step uses sθ → s′θ and all other steps use rules from RCθ. Consequently, s′θ ≈ t′θ is true in RCθ. Now observe that there is an equality factoring inference

C′′θ ∨ tθ ≈ t′θ ∨ sθ ≈ s′θ
C′′θ ∨ t′θ ≉ s′θ ∨ tθ ≈ t′θ

whose conclusion is true in RCθ by saturation. Since the literal t′θ ≉ s′θ must be false in RCθ, the rest of the clause must be true in RCθ, and therefore Cθ must be true in RCθ, contradicting our assumption. This concludes the proof of the theorem. □

A Σ-interpretation A is called term-generated, if for every b ∈ UA there is a ground term t ∈ TΣ(∅) such that b = A(β)(t).

Lemma 6.13 Let N be a set of (universally quantified) Σ-clauses and let A be a term- generated Σ-interpretation. Then A is a model of GΣ(N) if and only if it is a model of N.

Proof. (⇒): Let A |= GΣ(N); let (∀x1 ... xn C) ∈ N. Then A |= ∀x1 ... xn C iff A(γ[xi → ai])(C) = 1 for all γ and all ai. Choose ground terms ti such that A(γ)(ti) = ai; define θ such that xiθ = ti; then A(γ[xi → ai])(C) = A(γ ∘ θ)(C) = A(γ)(Cθ) = 1, since Cθ ∈ GΣ(N).

(⇐): Let A be a model of N; let C ∈ N and Cθ ∈ GΣ(N). Then A(γ)(Cθ) = A(γ ∘ θ)(C) = 1 since A |= N. □

Theorem 6.14 (Refutational Completeness: Static View) Let N be a set of clauses that is saturated up to redundancy. Then N has a model if and only if N does not contain the empty clause.

Proof. If ⊥ ∈ N, then obviously N does not have a model. If ⊥ ∉ N, then the interpretation R∞ (that is, TΣ(∅)/R∞) is a model of all ground instances in GΣ(N) according to part (iii) of the model construction theorem. As TΣ(∅)/R∞ is term-generated, it is a model of N. □

So far, we have considered only inference rules that add new clauses to the current set of clauses (corresponding to the Deduce rule of Knuth-Bendix Completion).

In other words, we have derivations of the form N0 ⊢ N1 ⊢ N2 ⊢ ... , where each Ni+1 is obtained from Ni by adding the consequence of some inference from clauses in Ni. Under which circumstances are we allowed to delete (or simplify) a clause during the derivation?

A run of the superposition calculus is a sequence N0 ⊢ N1 ⊢ N2 ⊢ ... , such that (i) Ni |= Ni+1, and (ii) all clauses in Ni \ Ni+1 are redundant w. r. t. Ni+1. In other words, during a run we may add a new clause if it follows from the old ones, and we may delete a clause, if it is redundant w. r. t. the remaining ones.

For a run, N∞ = ⋃i≥0 Ni and N∗ = ⋃i≥0 ⋂j≥i Nj. The set N∗ of all persistent clauses is called the limit of the run.

Lemma 6.15 If N ⊆ N ′, then Red(N) ⊆ Red(N ′).

Proof. Obvious. □

Lemma 6.16 If N ′ ⊆ Red(N), then Red(N) ⊆ Red(N \ N ′).

Proof. Follows from the compactness of first-order logic and the well-foundedness of the multiset extension of the clause ordering. □

Lemma 6.17 Let N0 ⊢ N1 ⊢ N2 ⊢ ... be a run. Then Red(Ni) ⊆ Red(N∞) and Red(Ni) ⊆ Red(N∗) for every i.

Proof. Exercise. □

Corollary 6.18 Ni ⊆ N∗ ∪ Red(N∗) for every i.

Proof. If C ∈ Ni \ N∗, then there is a k ≥ i such that C ∈ Nk \ Nk+1, so C must be redundant w. r. t. Nk+1. Consequently, C is redundant w. r. t. N∗. □

A run is called fair, if the conclusion of every inference from clauses in N∗ \ Red(N∗) is contained in some Ni ∪ Red(Ni).

Lemma 6.19 If a run is fair, then its limit is saturated up to redundancy.

Proof. If the run is fair, then the conclusion of every inference from non-redundant clauses in N∗ is contained in some Ni ∪ Red(Ni), and therefore contained in N∗ ∪ Red(N∗). Hence N∗ is saturated up to redundancy. □

Theorem 6.20 (Refutational Completeness: Dynamic View) Let N0 ⊢ N1 ⊢ N2 ⊢ ... be a fair run, let N∗ be its limit. Then N0 has a model if and only if ⊥ ∉ N∗.

Proof. (⇐): By fairness, N∗ is saturated up to redundancy. If ⊥ ∉ N∗, then it has a term-generated model. Since every clause in N0 is contained in N∗ or redundant w. r. t. N∗, this model is also a model of GΣ(N0) and therefore a model of N0.

(⇒): Obvious, since N0 |= N∗. □

Superposition: Extensions

Extensions and improvements: simplification techniques, selection functions (when, what), redundancy for inferences, constraint reasoning, decidable first-order fragments.

Theory Reasoning

Superposition vs. resolution + equality axioms:
– specialized inference rules, thus no inferences with theory axioms,
– computation modulo symmetry,
– stronger ordering restrictions,
– no variable overlaps,
– stronger redundancy criterion.

Similar techniques can be used for other theories: transitive relations, dense total orderings without endpoints, commutativity, associativity and commutativity, abelian monoids, abelian groups, divisible torsion-free abelian groups.

7 Outlook

Further topics in automated reasoning.

7.1 Satisfiability Modulo Theories (SMT)

CDCL checks satisfiability of propositional formulas. CDCL can also be used for ground first-order formulas without equality: Ground first-order atoms are treated like propositional variables. Truth values of P(a), Q(a), Q(f(a)) are independent. For ground formulas with equality, independence is lost: If b ≈ c is true, then f(b) ≈ f(c) must also be true. Similarly for other theories, e. g. linear arithmetic: b > 5 implies b > 3.
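The equality case can be handled by congruence closure. A tiny union-find-based sketch (ground terms as nested tuples; naive propagation, quadratic in the number of terms) shows how f(b) ≈ f(c) is derived from b ≈ c:

```python
def congruence_closure(terms, equations):
    parent = {}
    def find(t):
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]    # path halving
            t = parent[t]
        return t
    def union(s, t):
        parent.setdefault(s, s); parent.setdefault(t, t)
        parent[find(s)] = find(t)
    for s, t in equations:
        union(s, t)
    changed = True
    while changed:          # congruence: same symbol and equal arguments
        changed = False     # imply equal terms
        for s in terms:
            for t in terms:
                if (isinstance(s, tuple) and isinstance(t, tuple)
                        and s[0] == t[0] and len(s) == len(t)
                        and find(s) != find(t)
                        and all(find(a) == find(b)
                                for a, b in zip(s[1:], t[1:]))):
                    union(s, t); changed = True
    return find

terms = [("b",), ("c",), ("f", ("b",)), ("f", ("c",))]
eq = congruence_closure(terms, [(("b",), ("c",))])
print(eq(("f", ("b",))) == eq(("f", ("c",))))   # True
```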

We can still use CDCL, but we must combine it with a decision procedure for the theory part T:

M |=T C: M and the theory axioms T entail C.

New CDCL rules:

T-Propagate:

M ‖ N ⇒CDCL(T) M L ‖ N

if M |=T L, where L is undefined in M and L or ¬L occurs in N.

T-Learn:

M ‖ N ⇒CDCL(T) M ‖ N ∪ {C}

if N |=T C and each atom of C occurs in N or M.

T-Backjump:

M Lᵈ M′ ‖ N ∪ {C} ⇒CDCL(T) M L′ ‖ N ∪ {C}

if M Lᵈ M′ |= ¬C and there is some "backjump clause" C′ ∨ L′ such that N ∪ {C} |=T C′ ∨ L′ and M |= ¬C′, L′ is undefined under M, and L′ or ¬L′ occurs in N or in M Lᵈ M′.

7.2 Sorted Logics

So far, we have considered only unsorted first-order logic. In practice, one often considers many-sorted logics:

read/2 becomes read : array × nat → data.
write/3 becomes write : array × nat × data → array.
Variables: x : data

Only one declaration per function/predicate/variable symbol. All terms, atoms, substitutions must be well-sorted.

Algebras:

Instead of universe UA, one set per sort: arrayA, natA. Interpretations of function and predicate symbols correspond to their declarations:

readA : arrayA × natA → dataA

Proof theory, calculi, etc.: Essentially as in the unsorted case. More difficult: subsorts, overloading. These are better treated via relativization:

∀xS φ ⇒ ∀y (S(y) → φ{xS → y})

7.3 Splitting

Tableau-like rule within resolution to eliminate variable-disjoint (positive) disjunctions:

N ∪ {C1 ∨ C2}
N ∪ {C1} | N ∪ {C2}

if var(C1) ∩ var(C2) = ∅.

Split clauses are smaller and more likely to be usable for simplification. The splitting tree is explored using intelligent backtracking.
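The applicability condition is easy to check; a short sketch in the term and literal encoding of the earlier sketches:

```python
def variables(t):
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:])) if len(t) > 1 else set()

def clause_vars(C):
    return set().union(*(variables(s) | variables(t) for _, s, t in C)) if C else set()

def splittable(C1, C2):
    """C1 ∨ C2 may be split iff the two parts share no variables."""
    return clause_vars(C1).isdisjoint(clause_vars(C2))

C1 = [(True, ("p", "x"), ("tt",))]
C2 = [(True, ("q", "y"), ("tt",))]
print(splittable(C1, C2))    # True: N ∪ {C1 ∨ C2} can be split
                             # into N ∪ {C1} | N ∪ {C2}
```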

7.4 Integrating Theories into Superposition

Certain kinds of theories/axioms are important in practice, but difficult for theorem provers. So far the most important case has been equality, but also: transitivity, arithmetic, ...

Idea: Combine Superposition and Constraint Reasoning. Superposition Left Modulo Theories:

Λ1 ‖ C1 ∨ t ≈ t′    Λ2 ‖ C2 ∨ s[u] ≉ s′
(Λ1, Λ2 ‖ C1 ∨ C2 ∨ s[t′] ≉ s′)σ

where σ = mgu(t, u), ...

Advertisements

Interested in a Bachelor/Master/PhD Thesis?

Automated Reasoning: contact Christoph Weidenbach (MPI-INF, MPI-SWS Building, 6th floor)
Hybrid System Verification: contact Uwe Waldmann
Arithmetic Reasoning (Quantifier Elimination): contact Thomas Sturm

Next semester: Automated Reasoning II
Content: Integration of Theories (Arithmetic)
Lecture: Block Course
Tutorials: TBA
