<<

Notes on propositional and first order logic

M. Viale

Contents

1 Propositional logic3 1.1 Semantics...... 3 1.2 Disjunctive normal forms...... 8 1.3 Proof systems...... 9 1.4 The sequent calculs LK ...... 9 1.5 Exercises on propositional logic and LK-calculus...... 22

2 Basics of first order logic 23 2.1 Examples of first order languages...... 24 2.2 Syntax and semantics for arbitrary first order languages...... 31 2.3 Free and bounded variables and substitution of symbols inside formulae: what are the problems to match?...... 36 2.4 Syntactic complexity of terms and formulae...... 39 2.5 Free and bounded variables of a formula...... 40 2.6 Basic rules for logic equivalence and prenex normal forms of formulae... 45 2.7 Substitution of terms inside formulae...... 48 2.8 Geometric of formulae...... 49 2.9 Definable sets with and without parameters...... 52 2.10 Exercises on first order logic...... 54

3 More on first order logic 55 3.1 First order LK-calculus...... 55 3.2 Satisfiable theories and compactness...... 57 3.3 Classes of L-structures...... 59 3.4 Substructures, morphisms, and products...... 60 3.5 Elementary equivalence and completeness...... 68 3.6 More exercises on first order logic...... 69

4 Compactness 69 4.1 Proof of the compactness theorem...... 69

5 First order logic and set theory 79

1 WARNING: These notes are a compendium to Alessandro Andretta’s textbook [1] meant as a complement to [1, Chapter I.3]. In particular sections2 and3 of these notes consists of the material covered in [1, Chapter I.3] and our presentation draws heavily from it. We invite the reader to look at [1, Chapter I.3] as a further source of reference for our treatment of the basics of first order logic.

2 1 Propositional logic

Propositional logic formalizes in a mathematically rigorous theory certain concepts and procedures which rule our way to reason about mathematics. For example by means of propositional logic we can give a mathematically precise counterpart of the concepts of theorem, mathematical truth, contradiction, logical deduction, equivalence of meaning between two different linguistic expressions, etc.... Nonethe- less propositional logic is still a theory too weak to develop a mathematical theory which reflects all kind of mathematical reasoning. This can be accomplished to a really satisfactory extent by means of first order logic, whose basic properties will be introduced in the second part of these notes. We believe that a short introduction to propositional logic can help to understand the key ideas which leads to develop a rigorous and effective mathematical theory of mathematical reasoning.

Definition 1.1. Given an infinite list of propositional variables PropVar = {Ai : i ∈ N}, a is defined by induction using the following clauses out of propositional variables and connectives {¬, ∧, ∨, →}:

• each Ai is a formula, • if φ is a formula (¬φ) is a formula, • if φ, ψ are formulae, also (φ ∧ ψ), (φ ∨ ψ), (φ → ψ) are formulae. We let Form be the set of propositional formulae. Given a formula φ the set of its propositional variables propvar(φ) is given by the propositional letters occurring in φ.

1.1 Semantics We assign truth values to propositional formulae according to the following defini- tion: Definition 1.2. Let v : PropVar → {0, 1}. We extend v to a unique map v : Form → {0, 1} satisfying: • v((¬φ)) = 1 − v(φ), • v((φ ∧ ψ)) = v(φ) · v(ψ), • v((φ ∨ ψ)) = max {v(φ), v(ψ)}, • v((φ → ψ)) = v(¬φ ∨ ψ). The intended meaning of the above definition being: we regard 1 as a truth and 0 as a falsity. We are interested to study only which have a definite , such as “There are infinitely many prime numbers”, “Every continuous functions f : R → R is bounded”, “Every continuous functions f : [0, 1] → R is bounded”,. . . , we know that the first and the third statements are true, while the second is false (as witnessed for example by f(x) = x2). Propositional logic is not suited to study propositions for which we are not able to assign a definite truth value among true and false; an example of a statement on which propositional logic is not able to say much is the following: “This sentence

3 is false”. The latter statement can be neither true (otherwise it asserts its falsity), nor false (otherwise it would be true). A basic intuition regarding valuations is that a propositional variable A can range among all propositions which are either true or false and that a v decides whether we assign to A a true or a false one. Under the decision made by v regarding the propositional variables, the connectives give a truth value to the other formulae reflecting certain propositional constructions typical of the natural language:

• (¬φ) stands for the of the formula/proposition φ,

• (φ ∧ ψ) stands for the conjuction of the formulae/propositions φ, ψ,

• (φ ∨ ψ) stands for the disjuction of the formulae/propositions φ, ψ,

• (φ → ψ) stands for the statement “Whenever φ holds also ψ holds”.

Remark 1.3. A key observation is that the truth value assigned to a formula φ by a valution v depends only on the truth value the valuation assigns to the free variable of φ, i.e.: v0  varprop(φ) = v1  varprop(φ) entails that v0(φ) = v1(φ). This allows to define finite truth tables for each formula φ with 2|varprop(φ)| rows1 (one row for each possible assignment of 0, 1 to the propositional variables of φ), and |varprop(φ)| + 1 columns (one column for φ and another column for each of the propositional variables of φ). For example let φ be the formula ((B → A) ∧ ((B ∨ C) → A)) with propositional variables A, B, C. Its can be computed as follows:

ABC (B → A) (B ∨ C) ((B ∨ C) → A) φ 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 1 0 0 1 0 1 1 0 1 1 0 1 0 0 0 1 0 0 1 0 0 0 0 1 1 1 0 0 0 0 0 1 0 1 1

We may omit the truth values of the columns for the subformulae of φ, which are (B → A), (B ∨ C), ((B ∨ C) → A) and obtain the smaller table in 4 columns (3 for the propositional variables A, B, C occurring in φ and 1 for φ), and 8 rows (as many as the 23 possible disinct as- signments of truth values to the three propositional variables A, B, C occurring in φ):

1|X| denotes the number of elements of the set X

4 ABC φ 1 1 1 1 1 1 0 1 1 0 1 1 1 0 0 1 0 1 1 0 0 1 0 0 0 0 1 0 0 0 0 1 Remark however that the truth table of all subformulae of φ needs to be computed in order to be able to compute the one of φ. Remark 1.4. A formula φ is built in a finite number of stages starting from the propositional variables and introducing at each stage some propositional connec- tive attached to one or two simpler sub-formulae of φ; parentheses are needed to understand in which order the process of building φ occurs. φ ≡ ((A ∧ B) ∧ C) is a formula built in two steps out of the three variables A, B, C: first we build ψ ≡ (A ∧ B) out of the propositional variables A, B and then φ ≡ (ψ ∧ C) out of ψ and C. Due to the fact that we are interested in formulae up to logical equivalence (i.e. we identify formulae having the same truth table, see Def. 1.11), and we do not want to burden our notation, we will drop parentheses when we believe this cannot cause confusion in the intended meaning of the formula. For example A ∧ B ∧ C can stand either for ((A ∧ B) ∧ C) or for (A ∧ (B ∧ C)), since the two formulae are logically equivalent. On the other hand a writing of the form A∧B∨C is ambiguous, since it could stand either for the formula θ0 ≡ ((A ∧ B) ∨ C) or for the formula θ1 ≡ (A ∧ (B ∨ C)), which are not logically equivalent. In such cases we will keep enough parentheses to avoid ambiguities, i.e. for θ0 we will write (A∧B)∨C and for θ1 we will write A ∧ (B ∨ C) dropping in both cases the most external parentheses but keeping the ones that clarifies the priorities on the order of introduction of the connectives. So from now on we will write ¬φ rather than (¬φ), φ ∧ ψ, φ ∨ ψ, φ → ψ rather than (φ∧ψ), (φ∨ψ), (φ → ψ), φ1 ∧...∧φn rather than (... ((φ1 ∧φ2)∧...)∧φn) (or any of the possible rearrangements of the parentheses yielding a formula logically equivalent to the conjunction of all the φi). Definition 1.5. A propositional formula φ is: • a if v(φ) = 1 for all valuations v : VarProp → {0, 1}, • a contradiction if v(φ) = 0 for all valuations v : VarProp → {0, 1}, • satisfiable if v(φ) = 1 for some valuation v : VarProp → {0, 1}. Remark 1.6. φ is a tautology if and only if ¬φ is a contradiction, φ is satisfiable if and only if it is not a contradiction. A tautology is a proposition which is true regardless of the context in which it is meaningfully interpreted, a contradiction is a proposition which is false regardless of the context in which it is meaningfully interpreted, a satisfiable proposition is a proposition which is true in some of the contexts in which it can be meaningfully interpreted.

5 Example 1.7. The following are examples of tautologies: Peirce’s law: ((A → B) → A) → A

AB A → B (A → B) → A ((A → B) → A) → A 1 1 1 1 1 1 0 0 1 1 0 1 1 0 1 0 0 1 0 1

Dummet’s law: (A → B) ∨ (B → A)

AB A → B B → A (A → B) ∨ (B → A) 1 1 1 1 1 1 0 0 1 1 0 1 1 0 1 0 0 1 1 1

The formula

(A → ¬B) ∧ (A → B) ∧ (B → A) ∧ (¬B → A) is an example of a contradiction.

AB A → ¬B A → B B → A B → ¬A (A → ¬B) ∧ (A → B) ∧ (B → A) ∧ (B → ¬A) 1 1 0 1 1 0 0 1 0 1 0 1 1 0 0 1 1 1 0 1 0 0 0 0 1 1 0 0

The formula (B → A) ∧ ((B ∨ C) → A) whose truth table we already computed is satisfiable (and thus it is not a contradiction), but it is not a tautology having value 0 for some of its possible valuations. Now we want to formalize in our semantic the concept of logical consequence and ultimately the concept of being a theorem. Notation 1.8. A set is given by providing its elements. We do not care neither on the ordering by which these elements are provided nor on the possible repetitions. For example for us

{a, b, c} , {b, c, a} , {a, b, b, c} , {b, a, b, c} , are all different way to describe the unique set whose elements are the objects a, b, c. ∅ denotes the set with no elements. Given a finite set of propositional formulae Γ = {φ1, . . . , φn} ^ Γ = φ1 ∧ ... ∧ φn,

_ Γ = φ1 ∨ ... ∨ φn.

6 We take the convention that ^ ∅, denotes a true assertion (given that a conjunction is true iff all its conjuncts are, hence, if there are none, it is vacuously true), and that v(V ∅) = 1 for all valuations v. We also stipulate that _ ∅, denotes a contradiction, and that v(W ∅) = 0 for all valuations v (given that a disjunction is true iff at least one of its disjuncts is true, hence, if there are no disjuncts, it cannot be true). Definition 1.9. Let Γ be a finite set of formulae and φ be a formula. Γ |= φ (to be read as “φ is a logical consequence of Γ” or as ‘Γ models φ”) if all valuations v which make true all the formulae in Γ make also true φ; equivalently Γ |= φ if and only if any of the two following conditions is met: • min {v(ψ): ψ ∈ Γ} ≤ v(φ) for all valuations v, • (V Γ) → φ is a tautology. Let ∆ be another finite set of formulae. Γ |= ∆ (to be read as “∆ is a logical consequence of Γ” or as “Γ models ∆” holds if all valuations v which make true all the formulae in Γ make also true some formula in ∆; equivalently Γ |= ∆ if and only if any of the following conditions is met: • min {v(φ): φ ∈ Γ} ≤ max {v(ψ): ψ ∈ ∆} for all valuations v, • (V Γ) → (W ∆) is a tautology. Remark 1.10. Note the following: • ∅ 6|= ∅, since V ∅ stands for a tautology while W ∅ stands for a contradiction: it is clear that the truth cannot have as a logical consequence a falsity. • More generally Γ |= ∅ (or more succinctly Γ |=) holds if and only if V Γ is a contradiction: there cannot be a valuation making all the formula in Γ true, since such a valuation should make W ∅ true as well, which cannot be the case, since W ∅ is a contradiction. •∅| = ∆ (or more succinctly |= ∆) holds if and only if W ∆ is a tautology, since by our conventions v(V ∅) = 1 for all valuations v, hence v(W ∆) = 1 for all valuations v. Our intuition is that a theorem is a statement which can be formalized in our propositional logic as Γ |= φ, where Γ is the set of premises of the theorem and φ is the thesis, or equivalently such that the formula (V Γ) → φ is a tautology of the propositional logic. A statement of the form Γ |= ∆ formalizes a theorem of the form: “If all the hypothesis in Γ hold, then at least one of the possibilities in ∆ occurs”.

7 Definition 1.11. Let φ, ψ be propositional formulae. We say that φ, ψ are logically equivalent if and only if φ |= ψ |= φ equivalently if:

• φ ↔ ψ is a tautology, (where φ ↔ ψ is a shorthand for (φ → ψ) ∧ (ψ → φ)),

• φ and ψ have the same truth table when computing this table over the propo- sitional variables appearing either in φ or in ψ.

1.2 Disjunctive normal forms We seek for canonical representatives of the equivalence classes induced on the set of formulae by the notion of logical equivalence. These representatives are provided by formulae in disjunctive normal form.

Definition 1.12. A literal is either a propositional variable or the negation of a propositional variable A normal conjunction is a formula of type V Γ with Γ a finite set of literals (for example A ∧ ¬B ∧ C is a normal conjuction A ∧ ¬¬B, A ∧ (A → B) are not). A formula is in disjunctive normal form (also written DNF) if it is of the form W ∆ with ∆ a finite set of normal conjuctions (for example A, ¬A, A ∧ ¬B, A ∨ ¬A, (A ∧ ¬B ∧ C) ∨ (D ∧ B ∧ ¬C) ∨ ¬C are formulae in DNF).

Theorem 1.13. Every propositional formula φ is logically equivalent to a formula in DNF.

Proof. Let us prove a specific instance of the theorem. We leave to the reader to understand how the theorem can be proved in general. Let us take for example the formula φ ≡ ((B → A) ∧ ((B ∨ C) → A)) whose truth table has been already computed and is:

ABC φ 1 1 1 1 1 1 0 1 1 0 1 1 1 0 0 1 0 1 1 0 0 1 0 0 0 0 1 0 0 0 0 1

Let us find a ψ in DNF logically equivalent to φ, i.e. with the same truth table. We take for each row of the above table in which a 1 appears, the conjuctions of literals according to the following rule: If a propositional variable X has value 0 we take the literal ¬X, if it has value 1 we take the literal X.

8 Hence we get the following five normal conjuctions one for each of the five rows of the above table in which a 1 appears on the column for φ (rows 1,2,3,4,8):

ψ1 ≡ A ∧ B ∧ C for row 1

ψ2 ≡ A ∧ B ∧ ¬C for row 2

ψ3 ≡ A ∧ ¬B ∧ C for row 3

ψ4 ≡ A ∧ ¬B ∧ ¬C for row 4

ψ8 ≡ ¬A ∧ ¬B ∧ ¬C for row 8

Now observe that the unique truth assignment which makes true the normal con- juction ψi is that of row i, while all other truth assignments make ψi false. Hence

ψ ≡ ψ1 ∨ ψ2 ∨ ψ3 ∨ ψ4 ∨ ψ8 is true only on rows 1,2,3,4,8, exactly as φ. Therefore φ and ψ are logically equiva- lent. Observe that ψ is in DNF.

1.3 Proof systems In the previous section we have given some arguments to assert that the notion of logical consequence Γ |= φ gives a counterpart in propositional logic of the concept of a theorem. Is it possible to convey in a rigorous mathematical definition a coun- terpart of the notion of proof? So far we have an operational method: a “proof” of Γ |= φ amounts to show that (V Γ) → φ has a truth table consisting just of 1 in its column. However this operational method does not seem to reflect in any way our notion of proof. The general mathematical practice to prove a theorem of the form Γ |= φ is to start from the premises Γ of the theorem and, by means of basic inference rules, start to derive from the premises Γ more and more of their logical consequences up to a stage in which φ is obtained among the logical consequences of Γ. In particular the notion of logical consequence Γ |= φ captures the concept that Γ |= φ is a theorem, but it doesn’t give any hint on how we should find a “proof” of this theorem: the computation of the truth table of (V Γ) → φ doesn’t seem to be the natural notion of a proof which from premises Γ yields the conclusion φ. For reasons that will become transparent when we will analyze first order logic, it is better for us to dispose of another characterization of the concept of theorem which is rooted in the formalization of the concept of mathematical proof rather than in the formalization of the notion of logical consequence.

1.4 The sequent calculs LK Let us write Γ ` ∆ to be read as “Γ proves W ∆”, and to signify that we have a proof (whatever that means) of W ∆ from premises Γ. Notation 1.14. To simplify notation in the remainder of these notes we will write Γ, φ to denote the set consisting of all elements in Γ and φ, i.e. if Γ = {φ1, . . . , φn}, Γ, φ is a shorthand for the set {φ1, . . . , φn, φ}. Hence it may occur that we confuse a formula φ with the set {φ} whose unique element is φ. Similarly for Γ = {φ1, . . . , φn}

9 and ∆ = {ψ1, . . . , ψm} finite sets of formulae, Γ, ∆ is a shorthand for Γ ∪ ∆ = {φ1, . . . , φn, ψ1, . . . , ψm}. We introduce the sequent calculus LK. Definition 1.15. A sequent is a string of the form Γ ` ∆ with Γ, ∆ finite sets of formulae. The LK-calculus has the following LK- and LK-deduction rules:

• LK-Axioms: for all formulae φ the sequent φ ` φ is an of LK. • LK-Structural rules:

Γ ` φ, ∆ Γ, φ ` ∆ (Cut) Γ ` ∆

Γ ` ∆ (Weakening) Σ, Γ ` ∆, Π

• LK-Logical rules:

Γ ` φ, ∆ Γ ` ψ, ∆ (∧-R) Γ ` φ ∧ ψ, ∆

Γ, φ, ψ ` ∆ (∧-L) Γ, φ ∧ ψ ` ∆

Γ, φ ` ∆ Γ, ψ ` ∆ (∨-L) Γ, φ ∨ ψ ` ∆

Γ ` φ, ψ, ∆ (∨-R) Γ ` φ ∨ ψ, ∆

Γ ` φ, ∆ (¬-L) Γ, ¬φ ` ∆

Γ, φ ` ∆ (¬-R) Γ ` ¬φ, ∆

Γ ` φ, ∆ Γ, ψ ` ∆ (→-L) Γ, φ → ψ ` ∆

10 Γ, φ ` ψ, ∆ (→-R) Γ` φ → ψ, ∆

An LK-deduction is obtained by any finite repeated applications of LK-deduction rules, either to axioms, or to sequents which are at the bottom of some previously applied deduction rule. In other terms an LK-deduction is a finite tree whose leafs are LK-axioms and whose other nodes are always the bottom sequent whose immediate successor(s) is (are) the top sequent(s) of a deduction rule of LK. Γ ` ∆ is an LK-derivable sequent if there is an LK-deduction whose bottom sequent is Γ ` ∆ (i.e. Γ ` ∆ is the root of the tree given by this LK-deduction). We give several examples of LK-deductions so to get the reader acquainted with this notion. Example 1.16. A proof of Dummet’s law (A → B) ∨ (B → A) (we show that ` (A → B) ∨ (B → A) is the bottom sequent of an LK-deduction, this suffices by Theorem 1.18):

A ` A (Weakening) B,A ` A, B (→-R) B ` A, A → B (→-R) ` B → A, A → B (∨-R) ` (B → A) ∨ (A → B) A proof of De Morgan’s law ¬(A ∧ B) ≡ ¬A ∨ ¬B. More precisely we show that ¬(A ∧ B) ` ¬A ∨ ¬B and ¬A ∨ ¬B ` ¬(A ∧ B). This suffices by Theorem 1.18:

A ` A (Weakening) B ` B (Weakening) B,A ` A B,A ` B (∧-R) B,A ` A ∧ B (¬-L) ¬(A ∧ B),B,A ` (¬-R) ¬(A ∧ B),A ` ¬B (¬-R) ¬(A ∧ B) ` ¬A, ¬B (∨-R) ¬(A ∧ B) ` ¬A ∨ ¬B

A ` A (Weakening) B ` B (Weakening) B,A ` A B,A ` B (∧-L) (∧-L) B ∧ A ` A (¬-L) B ∧ A ` B (¬-L) ¬A, B ∧ A ` ¬B,B ∧ A ` (¬-R) (¬-R) ¬A ` ¬(B ∧ A) ¬B ` ¬(B ∧ A) (∨-R) ¬A ∨ ¬B ` ¬(A ∧ B) Exercise 1.17. Find LK-derivations of A ∧ B ` B ∧ A and of A ∨ B ` B ∨ A. This is the main result we aim for: Theorem 1.18. Let Γ, ∆ be finite sets of propositional formulae. The following are equivalent:

11 1. Γ |= ∆ 2. Γ ` ∆ is an LK-derivable sequent. 2 implies1 is usually referred to as “the soundness theorem”, while1 implies2 is usually referred to as “the completeness theorem”. Exercise 1.19. Is A → B ` B → A a LK-derivable sequent? (HINT: use the above theorem). To simplify our discussions we will often say that Γ ` ∆ is a valid sequent to signify that Γ |= ∆ holds. The keys for the proof of the above theorem are given by the following Lemma: Lemma 1.20. Let v : VarProp → {0, 1} be a valution. Then the followig holds: V W 1. Assume v( Γ0) ≤ v( ∆0) and

Γ0 ` ∆0 (R) Γ ` ∆ is an LK-structural rule with one top sequent (i.e. the Weakening rule). Then v(V Γ) ≤ v(W ∆). V W 2. Assume v( Γi) ≤ v( ∆i) for i = 0, 1 and

Γ0 ` ∆0 Γ1 ` ∆1 (R) Γ ` ∆ is an LK-structural rule with two top sequents (i.e. the Cut rule). Then v(V Γ) ≤ v(W ∆). 3. Assume

Γ0 ` ∆0 (R) Γ ` ∆ is an LK-logical rule with one top sequent (i.e. ∧-L, ∨-R, →-R, ¬-L, ¬-R). V W V W Then v( Γ0) ≤ v( ∆0) if and only if v( Γ) ≤ v( ∆). 4. Assume

Γ0 ` ∆0 Γ1 ` ∆1 (R) Γ ` ∆ V is an LK-logical rule with two top sequents (i.e. ∧-R, ∨-L, →-L). Then v( Γ0) ≤ W V W V v( ∆0) and v( Γ1) ≤ v( ∆1) hold simultaneously if and only if v( Γ) ≤ v(W ∆). The above Lemma has the following immediate corollaries:

Lemma 1.21. Let R be an LK-structural rule. Let Γi ` ∆i be the top sequent(s) of R for i = 0 (i = 0, 1 if the rule has two sequents above the horizontal line). Let Γ ` ∆ be the bottom sequent of R (i.e. the sequent below the horizontal line). Then Γ |= ∆ holds whenever all the sequents Γi ` ∆i on top of the horizontal line are such that Γi |= ∆i.

12 Lemma 1.22. Let R be an LK-logical rule. Let Γi ` ∆i be the top sequent(s) of R for i = 0 (i = 0, 1 if the rule has two sequents above the horizontal line). Let Γ ` ∆ be the bottom sequent of R (i.e. the sequent below the horizontal line). Then Γ |= ∆ if and only if all the sequents Γi ` ∆i on top of the horizontal line are such that Γi |= ∆i. We defer the proof of the above Lemmas to a later stage and for now we assume they all hold. Now we can prove Theorem 1.18 Proof. We prove both implications by a suitable induction on the complexity of proofs or of the formulae in Γ, ∆. 2 implies1. The proof is done by induction on the height of an LK-derivation whose bottom sequent (or root) is Γ ` ∆. We define the height of an LK-derivation as the longest path connecting its bottom sequent to one if its leaves, where a path from the root to a leaf is the number of horizontal lines one crosses to go from the root of the LK-derivation to the given leaf. In the examples of 1.16: • The LK-derivation corresponding to Dummet’s law has root ` (A → B) ∨ (B → A), one leaf A ` A, and a unique path of length 5 connecting the root to this leaf, hence its height is 5. • Both of the LK-derivation corresponding to DeMorgan’s law have two leaves and two paths both of the same length which are respectively 7 - for the topmost LK-derivation - and 6 - for the lower LK-derivation -). Hence these LK-derivation have height respectively 7 and 6. • The LK-derivation in Example 1.24 which establishes that

φ ≡ (P → (Q → R)) ∧ ¬((P → Q) → (P → R)).

is a contradiction has 4 leaves (from left to right R ` R, Q ` Q, P ` P , P ` P ) and the corresponding paths from the root φ ` ∅ to the leaves have length 8, 8, 7, 6. Hence this LK-derivation has height 8. An axiom φ ` φ is an LK-derivation of height 0 and it is trivial to check that φ |= φ. Hence2 implies1 holds for the axioms of LK, i.e. for all trees of an LK-derivation of height 0. Now observe that the Lemmas 1.21 and 1.22 grant that whenever the premises of a deduction rule are valid sequents, so is the conclusion. So assume we pick an LK-derivation of height n + 1 and we know that2 implies1 holds for all LK-derivation of height at most n. The LK-derivation will either look like

T ...... 0 Γ0 ` ∆0 (R) Γ ` ∆

with T0 an LK-derivation of height n whose bottom sequent is Γ0 ` ∆0, or it will look like

13 T T ...... 1 ...... 2 Γ1 ` ∆1 Γ2 ` ∆2 (R) Γ ` ∆

where both Ti are LK-derivations of height at most n whose bottom sequent is Γi ` ∆i. We can apply the induction hypothesis to the LK-derivations Ti for i = 0, 1, 2, since these trees have height at most n. In either cases the induction hyptotheses give that2 implies1 for Γ i, ∆i for i = 0 or for i = 1, 2. Hence we conclude that Γi |= ∆i for i = 0, 1, 2, given that the LK-derivations Ti witness that Γi ` ∆i for i = 0, 1, 2. By Lemmas 1.21 or 1.22 (according to whether R is an LK-structural rule or an LK-logical rule), we conclude that Γ |= ∆. 1 implies2. We suppose that Γ |= ∆, and we must find an LK-derivation of Γ ` ∆. In this case we proceed by induction on the number of connectives appearing in the finite set of formulae Γ ∪ ∆: Given a formula φ, we let c(φ) be the number of propositional con- nectives appearing in φ, for example if φ ≡ (A ∧ ¬B) → C, we have c(φ) = 3, since the 3 connectives appearing in φ are ∧, ¬, →. P Given a finite set of formulae Σ = {φ1, . . . , φn} we let c(Σ) = i=1,...,n c(φi). We prove by induction on c(Γ ∪ ∆) that Γ |= ∆ entails that there is an LK- derivation with root Γ ` ∆. Assume c(Γ ∪ ∆) = 0. This occurs only if Γ ∪ ∆ consists of a finite set of propositional variables. Thus assume that Γ is the finite set of proposi- tional variables {A1,...,An} and ∆ is the finite set of propositional variables {B1,...,Bm}. Assume towards a contradiction that Γ ∩ ∆ = ∅, i.e. Ai 6= Bj for all i = 1, . . . , n and j = 1, . . . , m. Set v(Ai) = 1 for all i and v(Bj) = 0 for all j. Then v(V Γ) = 1 > 0 = v(W ∆), hence v witnesses that Γ 6|= ∆, a contradiction. Hence Ai = Bj = A for some i, j. We obtain an LK-derivation of Γ ` ∆ by means of the weakening rule as follows:

A ` A (Weakening) Γ ` ∆ Now assume that that Γ |= ∆ entails that there is an LK derivation with root Γ ` ∆ for all finite sets of formulae Γ, ∆ such that c(Γ ∪ ∆) ≤ n. Let us prove that this holds also for all finite sets of formulae Γ, ∆ such that c(Γ∪∆) = n+1. Since c(Γ ∪ ∆) = n + 1 > 0, there is at least one formula φ ∈ Γ ∪ ∆ such that c(φ) > 0. If φ appears both in Γ, ∆, then as before Γ ` ∆ can be obtained from the axiom φ ` φ by an instance of the weakening rule. Hence we can suppose that φ belongs just to one of the sets Γ, ∆ and contains at least one connective. For the sake of the discussion, let us assume that φ = ψ ∧ θ is an element of ∆ but not of Γ. Let ∆0 = ∆ \{φ}. Then we have that φ 6∈ Γ, φ 6∈ ∆0 and

14 Γ ` ψ, ∆0 Γ ` θ, ∆0 (∧-R) Γ ` ∆ is an instance of the deduction rule (∧-R) applied to the sequents Γ ` ψ, ∆0, Γ ` θ, ∆0.

By Lemma 1.22, we get that Γ |= ∆ if and only if Γ |= ψ, ∆0 and Γ |= θ, ∆0 hold simultaneously. But our assumption is that Γ |= ∆, hence we conclude that Γ |= ψ, ∆0 and Γ |= θ, ∆0 hold simultaneously. Now observe that

c(Γ ∪ {ψ} ∪ ∆0), c(Γ ∪ {θ} ∪ ∆0) < c(Γ ∪ ∆) = n + 1. This occurs since c(φ) = c(ψ) + c(θ) + 1, and φ is an element neither of Γ nor of ∆0, hence

c(Γ ∪ {ψ} ∪ ∆0) ≤ c(ψ) + c(Γ ∪ ∆0) < c(φ) + c(Γ ∪ ∆0) = c(Γ ∪ ∆), and similarly for θ (the first inequality may be strict in case ψ belongs to Γ∪∆0, in this case c(Γ ∪ {ψ} ∪ ∆0) = c(Γ ∪ ∆0)). Hence by induction hypotheses, we can find LK-derivations

T ...... 0 Γ ` ψ, ∆0

T ...... 1 Γ ` θ, ∆0 of Γ ` ψ, ∆0 and of Γ ` θ, ∆0. Now we can find a derivation of Γ ` φ as follows:

T T ...... 0 ...... 1 Γ ` ψ, ∆0 Γ ` θ, ∆0 (∧-R) Γ ` ∆ We leave to the reader to handle the other possible cases which are: • φ = ψ ∧ θ ∈ Γ, • φ = ψ ∨ θ ∈ Γ, • φ = ψ ∨ θ ∈ ∆, • φ = ¬ψ ∈ Γ, • φ = ¬ψ ∈ ∆, • φ = ψ → θ ∈ Γ, • φ = ψ → θ ∈ ∆. The proofs are all the same and use the following observations: • Γ ` ∆ is the bottom sequent of an LK-logical rule R introducing φ either in Γ or in ∆.

15 • The fact that for all LK-logical rules R the bottom sequent of the rule is valid if and only if the top sequent(s) is (are). • The inductive assumptions to find LK-derivation(s) of the top sequent(s) of R, which must be valid by the above observations.

In all cases one can patch this (these) LK-derivations together on top of the rule R to yield the desired LK-derivation of Γ ` ∆. This concludes the proof of1 implies2.

The proof of Theorem 1.18 is completed (modulo the proofs of Lemmas 1.20, 1.21, 1.22).

So we are left with the proof of Lemmas 1.20, 1.21, 1.22. We prove Lemma 1.20 and leave the proof of the other two to the reader.

Proof. We divide the proof in two parts: one for the LK-structural rules (i.e the first two items of the Lemma), and one for the LK-logical rules (i.e the last two items of the Lemma).

LK-structural rules: Assume R is the weakening rule

Γ ` ∆ (Weakening) Σ, Γ ` ∆, Π

and v(V Γ) ≤ v(W ∆). We must show that v(V(Σ ∪ Γ)) ≤ v(W(∆ ∪ Π)). But ^ ^ _ _ v( (Σ ∪ Γ)) ≤ v( Γ) ≤ v( ∆) ≤ v( (∆ ∪ Π)).

Hence the thesis holds for the Weakening rule.

Assume R is the cut rule

Γ ` φ, ∆ Σ, φ ` Π (Cut) Σ, Γ ` ∆, Π

and ^ _ v( Γ) ≤ v(φ ∨ ∆), ^ _ v(φ ∧ Σ) ≤ v( Π) simultaneously hold. So assume v(V(Σ∪Γ)) = 1. We must show that v(W(∆∪ Π)) = 1: Clearly ^ ^ _ 1 = v( (Σ ∪ Γ)) ≤ v( Γ) ≤ v(φ ∨ ∆),

and ^ ^ 1 = v( (Σ ∪ Γ)) ≤ v( Σ). Hence we get that v(φ ∨ W ∆) = 1. Now there are two cases:

16 • v(φ) = 0. Then _ n _ o _ _ 1 = v(φ ∨ ∆) = max v(φ), v( ∆) = v( ∆) ≤ v( (∆ ∪ Π)).

• v(φ) = 1. Then n ^ o ^ _ _ 1 = min v(φ), v( Σ) = v(φ ∧ Σ) ≤ v( Π) ≤ v( (∆ ∪ Π)).

In either cases we proved that if v(V(Σ ∪ Γ)) = 1, then v(W(∆ ∪ Π)) = 1. Hence the thesis holds also for the Cut rule. LK-logical rules: We provide the proof for two of the eight logical rules we intro- duced, and we invite the reader to provide proofs for the remaining ones. We choose an “easy” one (¬-R) and a “more difficult” one (∨-L). So let us pick the LK-logical rule

Γ, φ ` ∆ (¬-R) Γ ` ∆, ¬φ

and a valuation v. We must show that ^ _ ^ _ v( Γ) ≤ v(¬φ ∨ ∆) if and only if v(φ ∧ Γ) ≤ v( ∆)

First assume v(V Γ) ≤ v(¬φ∨W ∆). We must show that v(φ∧V Γ) ≤ v(W ∆). If v(φ) = 0, this is trivially the case, since ^ _ v(φ ∧ Γ) = 0 ≤ v( ∆).

So let us consider the case v(φ) = 1, hence ^ ^ _ v(φ ∧ Γ) = v( Γ) ≤ v(¬φ ∨ ∆) = n _ o n _ o _ = max v(¬φ), v( ∆) = max 0, v( ∆) = v( ∆),

and we are done also in this case. Now assume v(φ ∧ V Γ) ≤ v(W ∆). We must show that v(V Γ) ≤ v(¬φ ∨ W ∆). If v(φ) = 0, this is trivially the case, since ^ _ v( Γ) ≤ 1 = v(¬φ) ≤ v(¬φ ∨ ∆).

So let us consider the case v(φ) = 1, as before ^ n ^ o ^ _ _ v( Γ) = min 1, v( Γ) = v(φ ∧ Γ) ≤ v( ∆) ≤ v(¬φ ∨ ∆),

and we are done also in this case.

Now let us deal with the LK-logical rule

17 Γ, φ ` ∆ Γ, ψ ` ∆ (∨-L) Γ, φ ∨ ψ ` ∆

and a valuation v. We must show that ^ _ v((φ ∨ ψ) ∧ Γ) ≤ v( ∆) if and only if n ^ ^ o _ max v(φ ∧ Γ), v(ψ ∧ Γ) ≤ v( ∆).

Let Γ = {θ1, . . . , θn}. Then we have ^ v((φ ∨ ψ) ∧ Γ) = min {max {v(φ), v(ψ)} , v(θ1), . . . , v(θn)} =

= max {min {v(φ), v(θ1), . . . , v(θn)} , min {v(ψ), v(θ1), . . . , v(θn)}} = n ^ ^ o = max v(φ ∧ Γ), v(ψ ∧ Γ) .

The desired conclusion is now immediate.

Remark 1.23. LK-structural rules do not enjoy the stronger properties which can be inferred for the LK-logical rules (i.e. inequalities are not only downward preserved but also upward preserved along an LK-logical rule). There can be valuations v for which v(V Γ) ≤ v(W ∆) holds for the bottom sequent of an LK-structural rule but fails for at least one of the top sequents: For example let v(A) = 1, v(B) = 0. Then this is the case for the following instances of the Cut rule with Γ = Π = {A} , ∆ = Σ = ∅

A ` BB ` A (Cut) A ` A and of the Weakening rule:

A ` B (Weakening) A, B ` B We leave to the reader to provide a proof of Lemmas 1.21, 1.22.

LK as a tool to solve satisfiability problems It is in general much faster to tackle the problem of checking whether a given proposi- tional formula is satisfiable, or a tautology, or a contradiction using the LK-calculus, rather than resorting to the computation of its truth table. Lemma 1.20 and the proof of the completeness theorem give an efficient algorithm to check whether Γ |= ∆ holds or not. We give a number of examples below. Example 1.24. Show that (A → B) → [(B → C) → (A → C)] is a tautology. This amounts to show that the sequent ` (A → B) → [(B → C) → (A → C)] is LK-derivable.

18 We follow the proof of the completeness theorem: we seek a proof of the above sequent backward, starting from the root of a possible derivation of the above se- quent. By means of successive applications of LK-logical rules, we try to build with a bottom-up procedure an LK-derivation of this sequent as follows:

STEP 1: We apply →-R to the sequent ` (A → B) → [(B → C) → (A → C)] with Γ = ∅, ∆ = ∅:

A → B ` (B → C) → (A → C) (→-R) ` (A → B) → [(B → C) → (A → C)] STEP 2: We apply →-R to the sequent A → B ` (B → C) → (A → C) with Γ = {A → B} , ∆ = ∅:

A → B,B → C ` A → C (→-R) A → B ` (B → C) → (A → C) (→-R: Γ = ∅, ∆ = ∅) ` (A → B) → [(B → C) → (A → C)] STEP 3: We apply →-R to the sequent A → B,B → C ` A → C with Γ = {B → C,A → B} , ∆ = ∅

A → B,B → C,A ` C (→-R) A → B,B → C ` A → C (→-R: Γ = {A → B} , ∆ = ∅) A → B ` (B → C) → (A → C) (→-R: Γ = ∅, ∆ = ∅) ` (A → B) → [(B → C) → (A → C)] STEP 4: We apply →-L to the sequent A → B,B → C,A ` C with Γ = {B → C,A} , ∆ = {C}

B → C,A ` A, C B, B → C,A ` C (→-L) A → B,B → C,A ` C (→-R: Γ = {B → C,A → B} , ∆ = ∅) A → B,B → C ` A → C (→-R: Γ = {A → B} , ∆ = ∅) A → B ` (B → C) → (A → C) (→-R: Γ = ∅, ∆ = ∅) ` (A → B) → [(B → C) → (A → C)] STEP 5: We apply →-L to the sequent B,B → C,A ` C with Γ = {B,A} , ∆ = {C}

19 B,A ` B, C B, A, C ` C (→-L) B → C,A ` A, C B,B → C,A ` C (→-L: Γ = {B → C,A} , ∆ = {C}) A → B,B → C,A ` C (→-R: Γ = {B → C,A → B} , ∆ = ∅) A → B,B → C ` A → C (→-R: Γ = {A → B} , ∆ = ∅) A → B ` (B → C) → (A → C) (→-R: Γ = ∅, ∆ = ∅) ` (A → B) → [(B → C) → (A → C)]

FINAL STEP: We apply the relevant weakening rules to obtain the sequents

B → C,A ` A, C

B,A ` B,C B, A, C ` C as weakenings of the axioms A ` A, B ` B, C ` C. The LK-derivation of

` (A → B) → [(B → C) → (A → C)] we constructed is:

B ` B (Weakening) C ` C (Weakening) B,A ` B,C B, A, C ` C A ` A (Weakening) (→-L: Γ = {B,A} , ∆ = {C}) B → C,A ` A, C B,B → C,A ` C (→-L: Γ = {B → C,A} , ∆ = {C}) A → B,B → C,A ` C (→-R: Γ = {B → C,A → B} , ∆ = ∅) A → B,B → C ` A → C (→-R: Γ = {A → B} , ∆ = ∅) A → B ` (B → C) → (A → C) (→-R: Γ = ∅, ∆ = ∅) ` (A → B) → [(B → C) → (A → C)]

Now let’ s check the following formula is a contradiction:

φ ≡ (P → (Q → R)) ∧ ¬((P → Q) → (P → R)).

φ is a contradiction iff φ |= ∅ iff φ ` ∅ is LK-derivable. Thus we must find a LK-derivation of φ ` ∅. We proceed as before to build such a derivation and we get:

R ` R Q ` Q Q, P, R ` R Q, P ` R,Q P ` P Q, Q → R,P ` R Q → R,P ` P,R P ` P P → Q, Q → R,P ` R P → Q, P ` P,R P → (Q → R),P → Q, P ` R P → (Q → R),P → Q ` P → R P → (Q → R) ` (P → Q) → (P → R) P → (Q → R), ¬((P → Q) → (P → R)) ` (P → (Q → R)) ∧ ¬((P → Q) → (P → R)) `

20 We leave to the reader to check which rules have been applied in each of the steps of the above derivation. Now we want to address the satisfiability problem. We want to check whether φ ≡ [¬A ∧ ¬C] ∧ [(D → A) ∨ Q] is a satisfiable formula. One can check this computing its truth table, but this is an awkward task given that we have 4 propositional variables in φ, hence 24 = 16 rows in its truth table. Moreover to compute the truth table of φ, along the way, we must also compute the truth tables of all the formula concurring to the construction of φ, i.e.: ¬A ∧ ¬C, (D → A) ∨ Q, ¬A, ¬C, D → A. There is a more efficient strategy to find a valuation witnessing the satisfiability of φ by means of Lemma 1.20, which goes as follows: We start by building a derivation of φ ` ∅ as in the previous examples, using just LK-logical rules, and not using in the last step the weakening rule. This will give us a tree whose leaves j are of the form Γj ` ∆j with Γj ∪ ∆j contained in the finite sets of propositional variables of φ. If for all the leaves j there is some propositional variable Aj in Γj ∩ ∆j, then all the sequents Γj ` ∆j can be obtained from Aj ` Aj by means of the weakening rule, hence φ ` ∅ admits an LK-derivation, i.e. φ is a contradiction. Otherwise some leaf j is such that Γj ∩ ∆j is the emptyset. In which case we can let v(X) = 1 for all X ∈ Γj and v(Y ) = 0 for all Y ∈ ∆j, giving that V W v( Γj) = 1 > 0 = v( ∆j). By repeated application of Lemma 1.20 to the sequents Γ ` ∆ appearing along V W the path from φ ` ∅ to the leaf Γj ` ∆j, the inequality v( Γ) > v( ∆) holds for all such sequents (given that along these paths we have only used LK-logical rules), hence v(φ) > v(W ∅) = 0. v is a valuation witnessing the satisfiability of φ. So let us use this strategy to check whether φ is satisfiable. The tentative con- struction of an LK-derivation of φ ` ∅ gives us the following tree:

Q ` A, C (D → A) ` A, C (D → A) ∨ Q ` A, C ¬C, (D → A) ∨ Q ` A ¬A, ¬C, (D → A) ∨ Q ` ¬A ∧ ¬C, (D → A) ∨ Q ` [¬A ∧ ¬C] ∧ [(D → A) ∨ Q] ` We can stop at this stage of the construction of a possible LK-derivation of φ ` ∅: the topmost left sequent Q ` A, C consists just of propositional variables, and no such variable appears on both sides of the ` symbol. Hence a valuation v such that v(Q) = 1, v(A) = v(C) = 0 is such that v(φ) = 1. The direct computation of the two rows2 of the truth table of φ induced by valuations satisfying the above constraint show that this is indeed the case. 2There at least two valuations witnessing the satisfiability of φ, since the above constraints leave a complete freedom on the choice of v(D). Other leaves which cannot be obtained by weakenings of axioms may provide other valuations realizing φ.

21 In particular from the above tentative construction of an LK-derivation of φ ` ∅, we have been guided to the definition of a valuation witnessing the satisfiability of φ. To appreciate why this method is efficient, we invite the reader to solve this same satisfiability problem by means of truth tables.

1.5 Exercises on propositional logic and LK-calculus Here is a list of suggestions for exercises on : • The webpage https:pythonism.wordpress.com20100913propositional-logic-and- some-tautologies contains a very rich list of propositional tautologies. Find an LK-derivation of at least five of them (˜P stands for ¬P , to deal with LK- derivations of formulae with ↔ among its connectives, replace φ ↔ ψ by the equivalent formula φ → ψ ∧ ψ → φ, or else look at the third item below). • Take randomly four formulae of propositional calculus each one containing at least 4 propositional variables and 6 connectives so that all the connectives ¬, ∧, ∨, → appear in any of these formulae. Check whether the formulae you chose are satisfiable, tautologies, or contradictions. • Find LK-logical rules (↔-L) and (↔-R) satisfying items 3 or 4 of Lemma 1.20 for the propositional connective ↔. (HINT: use that

A ↔ B ≡ (A ∧ B) ∨ (¬A ∧ ¬B) ≡ (A → B) ∧ (B → A)

and try to find an LK-derivation of Γ, (A ∧ B) ∨ (¬A ∧ ¬B) ` ∆ by using LK- logical rules acting only on (A ∧ B) ∨ (¬A ∧ ¬B) or its subformulae. Proceed this way until you reach a stage in which the sequents Σ ` Π you are handling are such that Σ ∪ Π = Γ ∪ ∆ ∪ {A, B}. These sequents together with the one you started with should suggest you what is the LK-logical rules (↔-L). To find (↔-R) proceed in the same way but starting with Γ ` (A → B) ∧ (B → A), ∆.) Prove that Lemma 1.20 holds for the rules you introduced for this connective. • Choose your favourite formula φ in two propositional variables A, B. Define v(A ×φ B) = v(φ) for any valuation v. This gives you the truth table of a propositional connective ×φ binding together two subformulae. Find LK- logical rules (×φ-L) and (×φ-R) satisfying items 3 or 4 of Lemma 1.20 for ×φ and prove that Lemma 1.20 holds for the rules you introduced for this connective (note that your rules may have more than two premises). • Show that the LK-rule

Γ ` φ, ∆ Σ, φ ` Π Γ, Σ ` ∆, Π

can be obtained by means of the cut and weakening rules. • Prove that Lemma 1.21 holds for the following rules (for some of these the proof is a self-evident variation of what has been already proved, for others it is more delicate):

22 Γ ` ∆, φ ∧ ψ (∧-elimination) Γ ` ∆, φ

Γ, ¬φ ` ∆ (Proof by contradiction) Γ ` ∆, φ

Γ ` ∆, ¬φ → ¬ψ (Proof by contraposition) Γ ` ∆, ψ → φ

Γ ` ∆, φ → ψ Γ ` ∆, φ (Modus Ponens) Γ ` ∆, ψ

Γ` ∆, φ ∨ ψ Γ, φ ` ∆, θ Γ, ψ ` ∆, θ (∨-elimination) Γ ` ∆, θ

In general it is worth to introduce a definition of what is a correct logical rule and a complete logical rule.

Definition 1.25. Let Form denote the set of all formulae and Seq = Form

• An n-ary LK-rule is a map R : Seqn → Seq.

• An n-ary LK-rule R is correct if the following holds: V W For all valuations v and h(Γ1, ∆1),..., (Γn, ∆n)i, if v( Γi) ≤ v( ∆i) for all i = 1, . . . , n, we also have that v(V Γ) ≤ v(W ∆), where (Γ, ∆) = R(h(Γ1, ∆1),..., (Γn, ∆n)i).

• An n-ary LK-rule R is complete if the following holds:

For all valuations v and h(Γ1, ∆1),..., (Γn, ∆n)i, letting (Γ, ∆) = V W R(h(Γ1, ∆1),..., (Γn, ∆n)i), v( Γi) ≤ v( ∆i) for all i = 1, . . . , n, whenever v(V Γ) ≤ v(W ∆).

Lemma 1.20 states that the structural LK-rules are correct and that the LK- logical rules are correct and complete. The above exercises provide further examples of correct and/or complete LK-rules.

2 Basics of first order logic

We start to define formulae and semantics for one of the simplest first order lan- guages, a language which is already suited to express the basic properties of groups. First of all we need a piece of notation. Let us stick to the following convention:

23 Notation 2.1. A vocabulary V is a certain set of symbols: for example the set V = {(, ), ∧, ¬, A, B, →} is a vocabulary. A string on V is a finite sequence of elements of V : for example the propositional formulae (¬(A ∧ B)) and (A → ((¬A) ∧ (¬B))) are strings on the vocabulary V . An occurrence of a symbol u of V in a string s is any appearance of the symbol in the string. For example A occurs once in the third position in the string (¬(A∧B)), and twice in the second and seventh position in the string (A → ((¬A) ∧ (¬B))). A substring of a given string s is any consecutive list of symbols of s, for example ¬(A∧ and (A ∧ B) are both substrings of (¬(A ∧ B)). Any propositional formula φ is a string of symbols taken from the vocabulary {An : n ∈ ω} ∪ {(, ), ¬, ∨, ∧, →}. The notion of string and that of set diverge: the notion of string forces us to pay attention not just to which symbols occur in it, but also on the relative position these symbols occupy in it. On the other hand a set is uniquely defined by its elements, and the order and the number of occurrences of an element inside a set does not matter: {a, b, c} = {a, c, b} = {a, a, b, c, b} are three different descriptions of the unique set whose three elements are a, b, c; while the strings abc, acb, aabcb are pairwise distinct, though the symbols occurring in them are the same (those in the vocabulary {a, b, c}). Another example is given by the propositional formulae (¬(A ∧ B)) and ((¬A) ∧ (¬B)) which are distinct strings containing the same set of symbols {A, B, (, ), ¬, ∧}, their difference is made explicit by the different positions and the different number of occurrences each of the symbols in the above set has in each of the two strings.

2.1 Examples of first order languages First example of a first order language . Definition 2.2. Let L0 = {∗, =}, and {xn : n ∈ N} be an infinite set of variables. The L0-terms and L0-formulae are defined as strings over the vocabulary

{∧, ¬, ∨, →, ∀, ∃, (, )} ∪ L0 ∪ {xn : n ∈ N} according to the following rules: Terms: an L0-term is a string defined as follows:

• each variable xn is an L0-term,

• if t, s are L0-terms, also (t ∗ s) is an L0-term, • strings which cannot be obtained by finitely many repeated applications of the above two rules are not L0-terms.

Formulae: an L0-formula is a string defined as follows: . • if t, s are L0-terms, (t = s) is an L0-formula,

• if φ, ψ are L0-formulae and xi is a variable, also (¬φ), (φ∧ψ), (φ → ψ), (φ∨ψ), (∀xiφ), (∃xiφ) are L0-formulae. • strings which cannot be obtained by finitely many repeated applications of the above two rules are not L0-formulae.

24 The basic idea behind the above definition is the following: ∗ is a symbol rep- resenting a given binary operation on some set M, and the variables range over elements of M, an L-term will denote a certain element of M which can be exactly determined using the operation denoted by ∗, once a definite value is assigned to the variables occurring in this term. The simplest formulae are equations stating that a certain term is equal to an- other term, certain assignment of the variables can make an equation true, others can make the equation false. By means of propositional connectives and quantifiers more complicated state- ments (other than equations) expressing the relations subsisting between different terms can be expressed: propositional connective can say something like the given . equation is false (with a formula of type ¬(s = t)), both equations are true (with a . . formula of type (s = t) ∧ (u = v)), at least one equation is true (with a formula of . . type (s = t) ∨ (u = v)), or more complicated such statements which can be obtained by means of repeated use of propositional connectives over simpler formulae. . For what concerns quantifiers, for example a formula of type ∃x(s = t) holds true in M for a given assignment of the variables different from x occurring in the equation, if some element of M satisfies the equation when assigned to x, a formula of type ∀xφ holds true in M if all elements of M satisfy the property φ. We now want to give a precise mathematical meaning to these vague observations.

2 Definition 2.3. Let M be a set and ·M : M → M be a binary operation on M. Fix v : var → M be a function. Given an L0-term t, we define v(t) as follows: • v(t) = v(x) if t is the variable x,

• v((u ∗ s)) = v(u) ·M v(s) if t is the term (u ∗ s).

Given an L0-formula φ we define (M, ·M )  φ[v] as follows: . • (M, ·M )  φ[v] iff v(t) = v(s) and φ is the formula (t = s),

• (M, ·M )  ψ ∧ θ[v] iff (M, ·M )  ψ[v] and (M, ·M )  θ[v],

• (M, ·M )  ψ ∨ θ[v] iff (M, ·M )  ψ[v] or (M, ·M )  θ[v],

• (M, ·M )  ¬ψ[v] iff (M, ·M ) 6 ψ[v],

• (M, ·M )  ∃xψ[v] iff (M, ·M )  ψ[vx/b] for some b ∈ M,

• (M, ·M )  ∀xψ[v] iff (M, ·M )  ψ[vx/b] for all b ∈ M, 0 0 where vx/b denotes the function v : var → M such that v (y) = v(y) for all variables y 6= x and v0(x) = b.

Let M = (M, ·M ), we say that φ holds in the L0-structure M for a valutation v or that the L0-structure M models (satisfies) φ with valuation v when it is the case that M  φ[v].

Example 2.4. Consider the L0-structure (Z, +) and the function v : var → Z such that v(x) = 2, v(y) = 5, v(z) = 8. Let t be the L0-term (x∗z), s be the L0-term (y∗y), u be the L0-term ((z∗y)∗z). Then

25 • v(t) = v(x) + v(z) = 2 + 8 = 10, • v(s) = v(y) + v(y) = 5 + 5 = 10, • v(u) = v((z ∗ y)) + v(z) = (v(z) + v(y)) + v(z) = (8 + 5) + 8 = 21. Hence: . (Z, +)  (t = s)[v] given that v(t) = 10 = v(s). . (Z, +) 6 (t = u)[v] given that v(t) = 10 6= 21 = v(u). For what concerns quantifiers: . (Z, +)  (∃y((y ∗ y) = ((z ∗ y) ∗ z)))[v] Since: . (Z, +)  (∃y((y ∗ y) = ((z ∗ y) ∗ z)))[v] if and only if . (Z, +)  ((y ∗ y) = ((z ∗ y) ∗ z)))[vy/n] for some n ∈ Z if and only if

vy/n((y ∗ y) = vy/n((z ∗ y) ∗ z))) for some n ∈ Z if and only if

2n = vy/n((y ∗ y)) = vy/n((z ∗ y) ∗ z) = n + 16 for some n ∈ Z.

The latter equation 2n = n + 16 is satisfied uniquely by n = 16. Hence . (Z, +)  ((y ∗ y) = ((z ∗ y) ∗ z)))[vy/16] and 16 is the natural number n witnessing that . (Z, +)  (∃y((y ∗ y) = ((z ∗ y) ∗ z)))[v]. On the other hand: . (Z, +) 6 (∀y((y ∗ y) = ((z ∗ y) ∗ z)))[v] Since: . (Z, +)  (∀y((y ∗ y) = ((z ∗ y) ∗ z)))[v] if and only if . (Z, +)  ((y ∗ y) = ((z ∗ y) ∗ z)))[vy/n] for all n ∈ Z if and only if

vy/n((y ∗ y) = vy/n((z ∗ y) ∗ z))) for all n ∈ Z if and only if

2n = vy/n((y ∗ y)) = vy/n((z ∗ y) ∗ z) = n + 16 for all n ∈ Z.

26 But the latter equation is not satisfied for n 6= 16, hence any n 6= 16 (for example n = 3) witnesses that . (Z, +) 6 (∀y((y ∗ y) = ((z ∗ y) ∗ z)))[v], given that (for example) . (Z, +) 6 ((y ∗ y) = ((z ∗ y) ∗ z)))[vy/3]. We adopt the following conventions: Notation 2.5. We drop parenthesis when this will not generate confusion, hence . . we will write t ∗ s rather than (t ∗ s), t = s rather than (t = s), ∃xφ rather than (∃xφ) and so on so forth. Parentheses will be kept when confusion on the priority of the connectives may occur as in ∃xφ ∧ ψ which could stand either for (∃xφ) ∧ ψ or for ∃x(φ ∧ ψ), expressions which can have a very distinct meaning. Moreover for the sake of readability of formulae, we will feel free to use at times the parentheses [, ] in the place of (, ). Here are some examples to appreciate the expressive power of this language:

Exercise 2.6. Find for each natural number n a formula φn such that an L0 structure (M, · ) satisfies φ with any valuation v : var → M if and only if M has at least n M n . . elements. (HINT: φ1 ≡ ∃x(x = x), φ2 ≡ ∃x∃y¬(x = y),.....).

We introduce the group axioms as the following formulae of the language L0:

Associativity law: ∀x∀y∀z((x ∗ y) ∗ z) = (x ∗ (y ∗ z)) . . Neutral element: ∃e∀y(e ∗ y = y ∧ y ∗ e = e) . . Inverse: ∃e[∀y(e ∗ y = y ∧ y ∗ e = e) ∧ ∀x∃z(x ∗ z = e ∧ z ∗ x = e)]

Exercise 2.7. Show the following:

• Choose your favourite group (G, ·G) (for example (Z, +), (Q, +), (R \{0} , ·), (GL2,2(R), ·) — the latter is the group of 2×2 matrix with real coefficients and non-zero determinant, with · the multiplication of matrices) and show that the three formulae above hold in (G, ·G).

• Show that the first two formulae hold in (N, +), but the third fails (HINT:the inverse of a positive natural number with respect to the sum is a negative integer number, which does not belong to N....).

• Show that the first two formulae hold in (GL2,2(Z), ·), but the third fails (where GL2,2(Z) is the family of 2 × 2 matrix with integer coefficients and non-zero determinant, with · being the multiplication of matrices). • Show that the

Commutativity law: ∀x∀y(x ∗ y = y ∗ x)

holds in (Z, +), (Q, +), (R \{0} , ·), and fails in (GL2,2(R), ·).

27 It should be transparent to the reader who has gained familiarity with the se- mantics of L0-formulae that:

2 • A structure (G, ·G) with G a set and ·G : G → G a binary operation is a group if for any valuation v : var → G

(G, ·G)  Associativity law ∧ Inverse[v] and it is not a group otherwise.

2 • A structure (G, ·G) with G a set and ·G : G → G a binary operation is a commutativite group if for any valuation v : var → G

(G, ·G)  Associativity law ∧ Inverse ∧ Commutativity law[v]

Second example of a first language It is somewhat inconvenient to express the existence of the inverse and of the neutral . element for the group operation in the language {∗, =}. This leads us to expand our notion of first order language by introducing constant symbols and unary operation symbols. The expansion of the language allows to express in a simpler fashion the existence of a neutral element and of an inverse for the group operation. . Definition 2.8. Let L1 = {∗, =, I, e}. A string t is a L1-term if: • t is a variable x, • t is the constant symbol e,

• t = (s ∗ u) with s, u L1-terms,

• t = I(s) with s an L1-term. • Strings which cannot be obtained by a finite number of applications of the above rules are not L1-terms. L formulae are built over L -terms by the same rules we used to define L-formulae, 1 1 . i.e. the simplest formulae are equations of the form (t = s) with t, s L1-terms; more complex formulae are built over these equations by means of quantifiers and logical connectives.

Definition 2.9. (G, ·G,IG, eG) is an L1-structure if

• eG ∈ G,

2 •· G : G → G is a binary operation,

• IG : G → G is a unary operation.

Let v : var → G, and t an L1-term. We define v(t) as follows: • v(t) = v(x) if t is the variable x.

• v(t) = eG if t is the constant symbol e.

• v(t) = IG(v(s)) if t is the term I(s).

28 • v(t) = v(s) ·G v(u) if t is the term (s ∗ u).

We define (G, ·G,IG, eG)  φ[v] according to the same rules we gave for L- formulae, for any given L1-formula φ.

We introduce the group axioms as the following formulae of the language L1:

Associativity law: ∀x∀y∀z((x ∗ y) ∗ z) = (x ∗ (y ∗ z)) . . e is the neutral element: ∀y(e ∗ y = y ∧ y ∗ e = e) Inverse: ∀x(x ∗ I(x) = e ∧ I(x) ∗ x = e)

Exercise 2.10. Let (G, ·G,IG, eG) stands for one of the following structures:

• (Z, +,IZ, 0) with IZ : Z → Z given by n 7→ −n,

• (Q, +,IZ, 0), with IQ : Q → Q given by q 7→ −q, ∗ ∗ ∗ • (R = R \{0} , ·,IR, 1), with IR : R → R given by r 7→ 1/r, −1 • (GL2,2(R), ·, I, Id) with I : GL2,2(R) → GL2,2(R) given by A 7→ A , and Id the identity matrix. Show that for all v : var → G

(G, ·G,IG, eG)  Associativity law ∧ e is the neutral element ∧ Inverse[v].

Show that (Q, ·,I0, 1) with I0(a) = 1/a if a 6= 0 and I0(0) = 0 does not satisfy neither that 1 is the neutral element of · nor the existence of multiplicative inverses for all a ∈ Q. (HINT: 0 witnesses the failure that 1 is the neutral element for multiplication, moreover 0 · I0(0) = 0 6= 1, hence I(0) is not the inverse of 0 for ·).

We can define a structure (G, ·G,IG, eG) to be a group with neutral element eG and inverse operation IG if and only if

(G, ·G,IG, eG)  Associativity law ∧ e is the neutral element ∧ Inverse[v] for all v : var → G.

Third example of a first order language We introduce the last natural example of a first order language, so to be able to express the properties of fields and rings with an ordering: . Definition 2.11. Let L = {∗, ⊕, =, , 0¯, 1¯} with ⊕, ∗ symbols for binary opera- . 2 l tions, =, l symbols for binary relations, 0¯, 1¯ constant symbols. A string t is a L2-term if: • t is a variable x, • t is the constant symbol 0¯ or the constant symbol 1,¯

• t = (s ∗ u) with s, u L2-terms,

• t = (s ⊕ u) with s L2-terms.

29 • Strings which cannot be obtained by a finite number of applications of the above rules are not L2-terms.

The atomic L2-formulae are the following:

• (s l u) with s, u L2-terms, . • (s = u) with s, u L2-terms.

The other L2-formulae are defined over L2-atomic formulae using propositional con- nectives and quantifiers.

Definition 2.12. A structure (M,

• v(s ∗ u) = v(s) ·M v(u) for s, u L2-terms,

• v(s ⊕ u) = v(s) +M v(u) for s, u L2-terms,

• v(0)¯ = 0M ,

• v(1)¯ = 1M .

The L2-semantics is defined as follows:

• (M,

• The semantics of other L2-formulae obeys the usual definition given for propo- sitional connectives and quantifiers. Exercise 2.13. Consider the structure R = ( , ·, +, 0, 1, <) and the assignment ⊕ to . R +, ∗ to ·, = to the equality relation, l to the strict order relation {(a, b) ∈ R2 : a < b} ⊆ 2 ¯ ¯ R , 0 to 0, 1 to 1. With this assignment R is an L2-structure.

• Find L2-formulae which express that (R, +, ·, 0, 1) is a field with 0 neutral element for the sum and 1 neutral element for the product on R∗. I.e. find the natural counterpart as L2-formulae of the axioms expressing in the semi- formal mathematical language we are accustomed to work with that (R∗, ·, 1) and (R, +, 0) are commutative groups and the usual distributivity properties of sum and product. (CAUTION: to express that (R∗, ·, 1) is a group one has to specify that all non-zero elements of are invertible, this can be expressed . R . by means of the formula ∀x(¬(x = 0)¯ → ∃y(x ∗ y = 1))¯ .

2 • Find also L2-formulae expressing that < is a strict order relation on R , i.e. L2- formulae formalizing in L2 that < is an antireflexive, antisymmetric, transitive binary relation on R2. Recall that a binary relation R ⊆ R2 is: – antireflexive if ∆ = {(a, a): a ∈ R} ∩ R = ∅, – antysymmetric if at most one among (a, b) and (b, a) belongs to R for all a, b ∈ R,

30 – transitive if for all a, b, c ∈ R such that (a, b), (b, c) ∈ R we also have that (a, c) ∈ R. • Check also that the formula . . ∀x∀y[x l y ↔ ∃z((¬0¯ = z) ∧ x ⊕ (z ∗ z) = y)]

holds in R for any valuation v : var → R.

We define an L2-structure M to be an ordered field if it satisfies all the L2-formulae in the above list we asked you to find. We remark how certain familiar type of linguistic expressions we use in mathe- matics can be rendered in first order logic. “All positive real number are the square of some other real number”.

We can express it in L2 as: . ∀x(0¯ l x → ∃y(y ∗ y = 0))¯

“There exists a positive solution for the polynomial x2 − 1 = 0”.

We can express it in L2 as: . ∃x(0¯ l x ∧ x ∗ x = 1)¯

Actually it is possible to formalize any mathematical reasoning in first order logic. We are going to explore this at length in the sequel.

2.2 Syntax and semantics for arbitrary first order languages We are now ready to define what is an arbitrary first order language and what is its natural semantic. The three examples above will be just special instantiation of this general definition. Definition 2.14. A set . L = {=} ∪ {Ri : i ∈ I} ∪ {fj : j ∈ J} ∪ {ck : k ∈ K} is a first order signature, where Ri denotes a relation symbol of ariety ni ∈ N for each i ∈ I, each fj denotes a function symbol of ariety nj ∈ N for each j ∈ J, each ck denotes a constant symbol for each k ∈ K. Fix {xn : n ∈ N} infinite set of variables (disjoint from L). The L-terms and L-formulae are defined as strings over the vocabulary

{∧, ¬, ∨, →, ↔, ∀, ∃, (, )} ∪ {, } ∪ L ∪ {xn : n ∈ N} according to the following rules:

Terms: An L-term is a string defined as follows:

• each variable xn is an L-term,

• each constant symbol ck for k ∈ K is an L-term,

• if t1, . . . , tnj are L-terms and fj is a function symbol of arity nj, also fj(t1, . . . , tnj) is an L-term, for each j ∈ J,

• Strings which cannot be obtained by finitely many repeated applications of the above rules are not L-terms.

Formulae: An L-formula is a string defined as follows:

• if t, s are L-terms, (t = s) is an (atomic) L-formula,

• if t1, . . . , tni are L-terms and Ri is a relation symbol of arity ni, also Ri(t1, . . . , tni) is an (atomic) L-formula, for each i ∈ I,

• if φ, ψ are L-formulae and x is a variable, also (¬φ), (φ ∧ ψ), (φ → ψ), (φ ↔ ψ), (φ ∨ ψ), (∀xφ), (∃xφ) are L-formulae,

• Strings which cannot be obtained by finitely many repeated applications of the above rules are not L-formulae.
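The inductive clauses above can also be read as the definition of a recursive data type. The following Python fragment is a small illustration of ours (not part of the definition): it fixes a toy signature and checks whether a nested tuple is a well-formed L-term or atomic L-formula; the encoding (variables and constants as strings, composite terms as tuples headed by their function symbol) is an assumption made only for this sketch.

# Illustration only: a toy signature and recognisers for L-terms and atomic
# L-formulae, following the inductive clauses of Definition 2.14.
FUN_ARITY = {'f': 2, 'g': 1}          # hypothetical function symbols with their arities
REL_ARITY = {'R': 2}                  # hypothetical relation symbols with their arities
CONSTANTS = {'c'}                     # hypothetical constant symbols
VARIABLES = {f'x{n}' for n in range(100)}

def is_term(t):
    # a term is a variable, a constant, or (f, t1, ..., tn) with f of arity n
    if isinstance(t, str):
        return t in VARIABLES or t in CONSTANTS
    head, *args = t
    return head in FUN_ARITY and len(args) == FUN_ARITY[head] and all(map(is_term, args))

def is_atomic(phi):
    # an atomic formula is (t = s) or (R, t1, ..., tn) with R of arity n
    head, *args = phi
    if head == '=':
        return len(args) == 2 and all(map(is_term, args))
    return head in REL_ARITY and len(args) == REL_ARITY[head] and all(map(is_term, args))

print(is_term(('f', 'x0', ('g', 'c'))))           # True
print(is_atomic(('R', 'x1', ('f', 'c', 'x2'))))   # True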

Notation 2.15. A caveat is in order: we are used to using the infix notation for binary operations and relations, i.e. we are accustomed to writing x + y rather than +(x, y) to denote the sum of x and y, and x R y rather than R(x, y) to denote that x, y are in the binary relation R. However we run into a problem with this type of notation when we are dealing with functions and relations which are not binary. For example if we want to define a ternary function f : M^3 → M on some set M, we write f(x, y, z) to denote the output of f on the input (x, y, z); in this latter case we use a prefix notation to describe the function f and its action on its inputs. It turns out that the prefix notation is best suited to describe arbitrary terms of a first order language. In particular if we choose L to be L0 = {∗}, according to the above definition of L-terms the L-term t obtained applying ∗ to previously defined L-terms s, u is ∗(s, u) rather than (s ∗ u). This is due to the fact that in the above definition of L-terms and atomic formulae we decided to conform at all times (except for =) to the conventions imposed by the adoption of a prefix notation.

Definition 2.16. Let

L = {=} ∪ {Ri : i ∈ I} ∪ {fj : j ∈ J} ∪ {ck : k ∈ K}

be a first order signature.

M = (M, Ri^M : i ∈ I, fj^M : j ∈ J, ck^M : k ∈ K) is an L-structure with domain M if

• ck^M ∈ M for each constant symbol ck, k ∈ K,

• Ri^M ⊆ M^ni for each relation symbol Ri ∈ L of arity ni, i ∈ I,

• fj^M : M^nj → M is a function for each function symbol fj ∈ L of arity nj, j ∈ J.

Fix a function v : var → M. Given an L-term t, we define v(t) as follows:

• v(t) = v(x) if t is the variable x,

• v(ck) = ck^M for each constant symbol ck,

• v(fj(t1, . . . , tnj)) = fj^M(v(t1), . . . , v(tnj)) for each function symbol fj of arity nj.

Given an L-formula φ we define M  φ[v] as follows:

• M  (t = s)[v] iff v(t) = v(s),

• M  Ri(t1, . . . , tni)[v] iff Ri^M(v(t1), . . . , v(tni)) holds (i.e. the ni-tuple (v(t1), . . . , v(tni)) is in the relation Ri^M ⊆ M^ni),

• M  ψ ∧ θ[v] iff M  ψ[v] and M  θ[v],

• M  ψ ∨ θ[v] iff M  ψ[v] or M  θ[v],

• M  ¬ψ[v] iff M 6 ψ[v],

• M  ψ → θ[v] iff M  ¬ψ[v] or M  θ[v],

• M  ψ ↔ θ[v] iff M  ψ → θ[v] and M  θ → ψ[v],

•M  ∃xψ[v] iff M  ψ[vx/b] for some b ∈ M,

• M  ∀xψ[v] iff M  ψ[vx/b] for all b ∈ M,

where vx/b denotes the function v′ : var → M such that v′(y) = v(y) for all variables y ≠ x and v′(x) = b.

Exercise 2.17. Check that:

• L0 is the L-language given by just one function symbol ∗ of arity 2 and no constant symbols and no relation symbols other than = (which is a relation symbol of arity 2 and the unique one for which we conform to the infix notation).

• L1 is the L-language given by one function symbol ∗ of arity 2, one function symbol I of arity 1, one constant symbol e, and no relation symbols other than =.

• L2 is the L-language given by two function symbols ∗, ⊕ of arity 2, two constant symbols 0̄, 1̄, one relation symbol l of arity 2, and the relation symbol = of arity 2.

Check also that, modulo the reframing of the notions of term and formula determined by the switch from the infix notation to the prefix notation, the L-semantics defined above is the Li-semantics we previously defined for each of the signatures Li, i = 0, 1, 2.

Remark 2.18. Given an L-structure M with domain M and v : var → M, set

v : L-Form → {0, 1} φ 7→ 1 if and only if M |= φ[v]

Then: • v(φ ∧ ψ) = min {v(φ), v(ψ)},

• v(φ ∨ ψ) = max {v(φ), v(ψ)},

• v(φ → ψ) = max {1 − v(φ), v(ψ)},

• v(¬φ) = 1 − v(φ),

• v(∃xφ) = max {vx/a(φ) : a ∈ M},

• v(∀xφ) = min {vx/a(φ) : a ∈ M}.

Hence the semantics just defined is the natural generalization of the semantics for propositional formulae. We can, as in the case of propositional logic, introduce the notions of tautology, contradiction, and satisfiable formula:

Definition 2.19. Let L be a first order signature and φ an L-formula.

• φ is a tautology if for all L-structures M with domain M and all v : var → M we have that M  φ[v].

• φ is a contradiction if for all L-structures M with domain M and all v : var → M we have that M 6 φ[v].

• φ is satisfiable if for some L-structure M with domain M and some v : var → M, we have that M  φ[v].

A theory T over L is a family of L-formulae. T is satisfiable if for some L-structure M with domain M and some v : var → M, we have that

M  φ[v] for all φ ∈ T . Finally we introduce a key notion in our analysis of first order logic: that of logical consequence. This notion gives a mathematically rigorous definition of what is a theorem of a mathematical theory T . We say that M  T [v] if and only if for all formulae φ ∈ T

M  φ[v]. If T is a theory such that some M  T [v], we say that T is a satisfiable theory. Otherwise we say it is a contradictory theory. Definition 2.20. Let L be a first order signature. Given an L-structure M with domain M, a valuation v : var → M, and a theory T , We say that φ is a logical consequence of T , and write T |= φ, if for all L-structures M with domain M, and all valuations v : var → M, if

M  T [v],

then M  φ[v].

We write φ |= ψ to signify that T |= ψ where T = {φ}. We say that φ and ψ are logically equivalent, and write φ ≡ ψ, if φ |= ψ and ψ |= φ. We also write φ |=T ψ if {φ} ∪ T |= ψ, and φ ≡T ψ (φ and ψ are logically equivalent over the theory T ) if φ |=T ψ and ψ |=T φ.

The above definition captures our concept that a certain mathematical statement φ is a consequence of the axioms of the theory T , i.e. that there is a mathematical theorem stating that under the assumptions T the thesis φ holds. Let's give a concrete example:

Example 2.21. Let T be the theory of groups and let φ state that there exists a unique neutral element; we know that φ is a consequence of T exactly because we are able to show that in any group there is exactly one neutral element. Now let us translate this into our formalism using the language L0. The L0-formula

φ ≡ ∀u∀z[(∀y((u ∗ y = y) ∧ (z ∗ y = y))) → u = z]

formalizes that any two neutral elements for multiplication on the right are equal, i.e. that there can be at most one neutral element for multiplication on the right. φ is a logical consequence of the axioms of the theory of groups as formalized in the language L0, i.e. Associativity law ∧ Inverse.

Exercise 2.22. Prove that

Associativity law ∧ Inverse |= ∀u∀z[(∀y((u ∗ y = y) ∧ (z ∗ y = y))) → u = z]

holds.
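When the domain of M is finite, the clauses of Definition 2.16 can be evaluated mechanically. The following Python sketch is ours (formulae are encoded as nested tuples, as in the earlier fragment) and checks the sentence of Example 2.21 in the single structure (Z3, +), where ∗ is interpreted as addition modulo 3; verifying one structure only illustrates the satisfaction relation and is of course not a proof of the logical consequence asked for in Exercise 2.22.

# Illustration only: Tarskian satisfaction over a finite domain, following the
# clauses of Definition 2.16, for the signature L0 = {*}.
def eval_term(t, op, v):
    # t is a variable (a string) or ('*', t1, t2); op interprets *; v is a valuation
    if isinstance(t, str):
        return v[t]
    _, t1, t2 = t
    return op(eval_term(t1, op, v), eval_term(t2, op, v))

def sat(dom, op, phi, v):
    # decides M |= phi[v] for the structure with domain dom and * interpreted by op
    kind = phi[0]
    if kind == 'eq':
        return eval_term(phi[1], op, v) == eval_term(phi[2], op, v)
    if kind == 'not':
        return not sat(dom, op, phi[1], v)
    if kind == 'and':
        return sat(dom, op, phi[1], v) and sat(dom, op, phi[2], v)
    if kind == 'imp':
        return (not sat(dom, op, phi[1], v)) or sat(dom, op, phi[2], v)
    if kind == 'forall':
        return all(sat(dom, op, phi[2], {**v, phi[1]: b}) for b in dom)
    if kind == 'exists':
        return any(sat(dom, op, phi[2], {**v, phi[1]: b}) for b in dom)
    raise ValueError(kind)

# the sentence of Example 2.21: any two right-neutral elements are equal
phi = ('forall', 'u', ('forall', 'z',
        ('imp', ('forall', 'y', ('and', ('eq', ('*', 'u', 'y'), 'y'),
                                         ('eq', ('*', 'z', 'y'), 'y'))),
                ('eq', 'u', 'z'))))
print(sat(range(3), lambda a, b: (a + b) % 3, phi, {}))   # True in (Z_3, +)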

Exercise 2.23. Show that the L0-formulae x ∗ y = z and y ∗ x = z are logically equivalent over the theory T = {Commutativity law} but are inequivalent otherwise. (HINT: to show that they are inequivalent in general, take the group GL2(R) and three 2×2-matrices A, B, C such that A · B = C ≠ B · A, and consider an assignment v : x 7→ A, y 7→ B, z 7→ C.)

Exercise 2.24. Show that ≡ and ≡T are equivalence relations on the set of L-formulae, i.e. that for all L-formulae φ, ψ, θ:
• φ ≡ φ,
• φ ≡ ψ entails that ψ ≡ φ,
• φ ≡ ψ and ψ ≡ θ entail that φ ≡ θ,
and that the same holds replacing ≡ with ≡T for any theory T . (HINT: to this aim it suffices to show that |= is a reflexive and transitive relation, and that the same occurs for |=T for any theory T ; the symmetry of ≡ and ≡T comes for free from the fact that φ ≡ ψ iff φ |= ψ |= φ, and φ ≡T ψ iff φ |=T ψ |=T φ.)

We are interested in analyzing L-formulae up to ≡ or ≡T . For example in commutative groups it is irrelevant whether we study the equation x ∗ y = z or the equation y ∗ x = z, since the two express the same concept (if we assume the commutativity of the operation ∗). In general, we will proceed as follows in the analysis of a mathematical theory T (for example T could be the theory of groups):

• We look for a first order language L in which we are able to express the axioms of T as L-formulae.

• We then try to analyze whether a certain property of the structures satisfying these axioms (in our case groups) can be expressed by means of an L-formula φ.

• In case this is possible, we can freely choose which among the many L-formulae ψ logically equivalent to φ is most convenient for our analysis of the property described by φ.

For this reason it is good to sort out the basic properties of the equivalence relations ≡ and ≡T and of the relation of logical consequence |=. This is what we propose to do in the sequel.

Exercise 2.25. Let × be a binary connective and Q one among the symbols ∃, ∀. Prove that φ0 × ψ0 ≡ φ1 × ψ1 if φ0 ≡ φ1 and ψ0 ≡ ψ1. Prove that φ ≡ ψ if and only if ¬φ ≡ ¬ψ. Prove also that Qxφ ≡ Qxψ if φ ≡ ψ.

Remark 2.26. The same does not hold for an arbitrary theory T if in all the above expressions we replace ≡ by ≡T . The problem occurs for example in case φ ≡T ψ with T = {(x = e)}, for L = {e} with e a constant symbol, φ being (x = e) and ψ being (x = x). Then any L-structure M satisfies M |= ψ[v] for every valuation v, and any structure M with a valuation v such that M |= T [v] (i.e. such that v(x) = e^M) is also such that M |= φ[v]; hence φ ≡T ψ. On the other hand, if M, the domain of M, has at least two elements, we have that M |= ∀x(x = x)[v] and M 6|= ∀x(x = e)[v], so ∀xφ 6≡T ∀xψ. The problem does not arise if T is made up just of sentences (i.e. formulae without free variables). But we first need to define what these sentences are, i.e. what the free variables of a formula are. This will be done in the next section.

Exercise 2.27. Show that there are some theory T and formulae φ, ψ such that φ ≡T ψ but ∃xφ 6≡T ∃xψ.

2.3 Free and bounded variables and substitution of symbols inside formulae: what are the problems to match?

To proceed in our analysis of the semantics of first order logic and of the notion of logical consequence, we need to introduce some technical definitions which will simplify some of our computations regarding formulae and their semantics. The following basic examples, based on our practice with integration, outline all the kinds of problems we can encounter in manipulating first order formulae. Consider the following function:

f(u) = ∫_0^1 (y · u) dy

Even though y appears in the expression of f(u), in order to compute f(u) we do not need to assign a value to y, i.e. y is a bounded variable in the above expression

and f is a function just of the variable u and not of the variable y. Exactly in the same way, y is bounded in ∃y(y ∗ y = x): in order to compute whether

(N, +)  ∃y(y ∗ y = x)[v] holds for some valuation v : var → N, we just need to see whether v(x) is an even or an odd number. (N, +)  ∃y(y ∗ y = x)[v] holds if and only if v(x) is even. In particular the assignment of v to variables differ- ent from x (in particular to the quantified variable y) is irrelevant in the computation of the validity or not of the expression

(N, +)  ∃y(y ∗ y = x)[v]. So our first observation is: Observation (1): The validity of an expression of type

M  φ[v]

depends only on the assignment v gives to the non-bounded variables occurring in the string φ.

Now let us come back to the expression f(u) = ∫_0^1 (y · u) dy. We can change all occurrences of y with any other variable different from u without changing the meaning of f(u):

f(u) = ∫_0^1 (y · u) dy = ∫_0^1 (z · u) dz.

But if we replace y with u we change the meaning of the expression:

f(u) = u/2 = ∫_0^1 (y · u) dy ≠ ∫_0^1 (u · u) du = 1/3.

Similarly in the structure (N, +), the formulae ∃y(y ∗ y = x) and ∃z(z ∗ z = x) are two equivalent expressions which are true when x is assigned to an even number, but ∃x(x ∗ x = x) is a different kind of statement which is always true in (N, +) (for 0 is a number witnessing the truth of ∃x(x ∗ x = x) according to our semantics). So our second observation is:

Observation (2): In a formula φ we can safely replace all occurrences of a variable y which are bounded by a quantifier of the form ∃y or ∀y by any other variable z, provided that z is a variable never occurring in φ.

Finally it is often the case that in the expression

f(u) = ∫_0^1 (y · u) dy = ∫_0^1 (z · u) dz,

u is a shorthand for some other expression, for example u could be a function of x, z, i.e. u = g(x, z). In certain types of computations it is convenient for us to treat g(x, z) as a variable u and manipulate the expression ∫_0^1 (y · u) dy, and at a certain point replace u with g(x, z) in our expression. This is feasible and correct, provided that u is not a function of the variable y (which is bounded by the differential sign dy in the expression ∫_0^1 (y · u) dy). For example let g(z, w) = 1/(z · w); then letting u = g(z, w):

f(u) = ∫_0^1 (y · u) dy = ∫_0^1 y/(z · w) dy = 1/(2(z · w)).

On the other hand if we used the expression f(u) = ∫_0^1 (z · u) dz, which we saw to represent the function f(u) equally well as the expression f(u) = ∫_0^1 (y · u) dy, and we substitute g(z, w) in the place of u, we would get

f(u) = ∫_0^1 (z · u) dz = ∫_0^1 z/(z · w) dz = 1/w ≠ 1/(2(z · w)).

It is clear that we should consider this second expression the wrong one, since in the process of substituting u with g(z, w) we transformed the free variable z into a bound variable: our final expression is not a function of the two variables z, w as it should be, but just of the one variable w. Similarly if we substitute (z ∗ w) for x in ∃y(y ∗ y = x), we get the expression ∃y(y ∗ y = z ∗ w), which is true in (N, +) if z ∗ w is assigned to an even number; but if we replace (z ∗ w) for x in ∃z(z ∗ z = x), we get the expression ∃z(z ∗ z = z ∗ w), which is true in (N, +) for any assignment of w to a natural number. In particular the first substitution shows that the formula ∃y(y ∗ y = z ∗ w) predicates of the term z ∗ w the same properties the formula ∃y(y ∗ y = x) predicates of the variable x, while the formula ∃z(z ∗ z = z ∗ w) has significantly changed the situation, and is now a property just of the variable w, and not of the variables w, z. This leads to our third observation:

Observation (3): In a formula φ we can safely replace a non-bounded variable of φ by some other term, provided the term we consider does not have variables which fall under the scope of a quantifier in φ.

Finally observe that ∫_0^1 ∫_0^1 (y ∗ x) dy dz = ∫_0^1 (y ∗ x) dy, i.e. the integral in dz does not affect at all the meaning of the expression ∫_0^1 (y ∗ x) dy. Similarly one can check that ∃z(∃y(y ∗ y = x)) and ∃y(y ∗ y = x) are logically equivalent. Hence our fourth observation is:

Observation (4): Quantifying with a variable z never occurring in a formula φ does not change the meaning of φ.

We need to make these observations rigorous mathematical properties of first order logic. So we define the notions of free and bounded occurrence of a variable inside a formula and then formulate the relevant facts which express these observations in a rigorous mathematical form.

2.4 Syntactic complexity of terms and formulae

In the following we will make proofs by induction on the complexity of a term or of a formula, hence we must define a measure of complexity for such objects, and also give the right terminology to manipulate them.

Definition 2.28. Let L = {Ri : i ∈ I, fj : j ∈ J, ck : k ∈ K} be a first order lan- guage. • Let t be a L-term; the tree T (t) associated to t is defined as follows:

– T (t) = t if t is a variable x or a constant symbol ck.

– Assume t = fj(t1, . . . , tnj ) and T (ti) has been defined for i = 1, . . . , nj. Then T (t) is the tree:

T (t1) . . . T (tnj)
        t

• Let φ be an L-formula; φ is atomic if it is of the form (t = s) with t, s L-terms

or of the form Ri(t1, . . . , tni), with Ri (i ∈ I) a relation symbol of L of arity ni

and t1, . . . , tni L-terms. The tree T (φ) associated to φ is defined as follows:

– T (φ) = φ if φ is atomic.

– Assume φ is (θ ◦ ψ) with ◦ ∈ {∨, ∧, →} and T (θ), T (ψ) have been defined. Then T (φ) is the tree:

T (θ)   T (ψ)
      φ

– Assume φ is (¬ψ) or (∀xψ) or (∃xψ) for some variable x. Then T (φ) is the tree:

T (ψ)
  φ

• The complexity of a formula φ or of a term t is the height of the associated tree.

• The subformulae of φ are all the formulae appearing in some node of T (φ). subfm(φ) is the set of subformulae of φ.

For example the L0-formula φ:

(∃x3((∀x1((x1 ∗ x2) = x0)) ∧ (¬(∃x2(x1 = x2)))))

has as T (φ) the tree (with the leaves on top and the root φ at the bottom):

(x1 = x2)
((x1 ∗ x2) = x0)        (∃x2(x1 = x2))
(∀x1((x1 ∗ x2) = x0))   (¬(∃x2(x1 = x2)))
((∀x1((x1 ∗ x2) = x0)) ∧ (¬(∃x2(x1 = x2))))
φ

The complexity of φ is 4, and its set of subformulae is the following set of substrings of φ:

subfm(φ) = {φ, ((∀x1((x1 ∗ x2) = x0)) ∧ (¬(∃x2(x1 = x2)))), (∀x1((x1 ∗ x2) = x0)), (¬(∃x2(x1 = x2))), ((x1 ∗ x2) = x0), (∃x2(x1 = x2)), (x1 = x2)}
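On the nested-tuple encoding used in the earlier sketches, the height of T(φ) and the set of subformulae can be computed by a direct recursion; the following Python fragment is only an illustration of ours of Definition 2.28 under that hypothetical encoding.

# Illustration of Definition 2.28 on the nested-tuple encoding:
# complexity(phi) is the height of T(phi), subformulae(phi) the labels of its nodes.
def children(phi):
    head = phi[0]
    if head in ('and', 'or', 'imp', 'iff'):
        return [phi[1], phi[2]]
    if head == 'not':
        return [phi[1]]
    if head in ('forall', 'exists'):
        return [phi[2]]            # (quantifier, variable, body)
    return []                      # atomic formulae label the leaves

def complexity(phi):
    kids = children(phi)
    return 0 if not kids else 1 + max(complexity(psi) for psi in kids)

def subformulae(phi):
    out = {phi}
    for psi in children(phi):
        out |= subformulae(psi)
    return out

# the formula of the example above; its complexity evaluates to 4 as stated
phi = ('exists', 'x3',
       ('and', ('forall', 'x1', ('eq', ('*', 'x1', 'x2'), 'x0')),
               ('not', ('exists', 'x2', ('eq', 'x1', 'x2')))))
print(complexity(phi), len(subformulae(phi)))   # 4 7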

Observe for example that the symbol x1 occurs thrice in φ, in the 6-th, 9-th, and 24-th positions (starting to count from 0 from left to right, i.e. letting the 0-th symbol of the string φ be its leftmost symbol — the parenthesis ( — and enumerating step by step from left to right, till we reach the rightmost symbol — the parenthesis ) — placed in the 31-st position). From now on along this section it will be convenient to be extremely cautious in our description of strings, hence we adopt the following terminology:

Notation 2.29. We identify a string φ of length nφ with the function sφ : nφ = {0, . . . , nφ − 1} → L0 (where by L0 we mean in this context the set of possible symbols which can occur in some L0-formula) which associates to each number j < nφ the symbol occurring in the j-th position of φ (starting to count from 0 from left to right). Each element of a string φ can be uniquely identified by a pair ⟨k, S⟩ where S denotes the symbol occurring in position k along the string, i.e. ⟨k, S⟩ is the occurrence in position k of the symbol S in φ if and only if sφ(k) = S. For example take the L0-formula φ:

(∃x3((∀x1((x1 ∗ x2) = x0)) ∧ (¬(∃x2(x1 = x2)))))

For this formula:

• nφ = 32

• sφ : 32 → L0

• sφ(0) = (

• sφ(31) =)

• sφ(6) = sφ(9) = sφ(24) = x1

• the occurrences of x1 in φ are the pairs ⟨6, x1⟩, ⟨9, x1⟩, ⟨24, x1⟩.
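Viewing φ as the finite sequence sφ of its symbols makes "occurrence" a purely positional notion that is easy to manipulate mechanically. A small Python illustration of ours (the formula is entered by hand as a list of symbols; this tokenization is our own assumption):

# The string phi viewed as the function s_phi : {0, ..., n_phi - 1} -> symbols.
phi = ['(', '∃', 'x3', '(', '(', '∀', 'x1', '(', '(', 'x1', '*', 'x2', ')', '=', 'x0',
       ')', ')', '∧', '(', '¬', '(', '∃', 'x2', '(', 'x1', '=', 'x2', ')', ')', ')', ')', ')']
print(len(phi))                                           # 32, i.e. n_phi
print([(k, s) for k, s in enumerate(phi) if s == 'x1'])   # [(6, 'x1'), (9, 'x1'), (24, 'x1')]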

2.5 Free and bounded variables of a formula

We are now ready to define precisely what the free and bounded variables of a formula are.

Definition 2.30. Let L be a first order signature.

• Let t be an L-term. The variables occurring in t are the variables which occurs in at least some place in the string t. We write t(x1, . . . , xn) to denote that the variables occurring in t are all among the set of variables {x1, . . . , xn}.

• Given a formula φ, let us identify φ with the sequence sφ : n → L which enumerates the elements of the string φ. Let Q = sφ(k) be a quantifier symbol (i.e. one among ∀, ∃) occurring in the string φ.

A. The scope of the occurrence ⟨k, Q⟩ of the quantifier symbol Q is the unique substring ψk,Q = sφ(k − 1)Q . . . sφ(l) of φ such that ψk,Q is a subformula of φ (see footnote 3 below).

B. An occurrence ⟨j, x⟩ of the variable x = sφ(j) in φ is under the scope of the occurrence ⟨k, Q⟩ of the quantifier symbol Q if it belongs to the string ψk,Q and sφ(k + 1) = x (see footnote 4 below).

C. An occurrence ⟨j, x⟩ of the variable x in φ is bounded by the occurrence of the quantifier symbol Q = sφ(k) in φ if it is under the scope of ⟨k, Q⟩, and for no m with j > m > k is it under the scope of the occurrence ⟨m, Q′⟩ of some quantifier symbol Q′ = sφ(m).

D. An occurrence of the variable x in φ is free if it is not under the scope of any occurrence of a quantifier symbol in φ.

• We will write φ(x1, . . . , xn) to denote that all occurrences of variables in the formula φ which are free belong to the set of variables {x1, . . . , xn} (we do not exclude the case that there could be some variables among x1, . . . , xn which do not occur free in φ, or some occurrences of xj which occur free in φ and some other occurrences of xj which occur bounded in φ) • We say that φ is a closed formula or a sentence if (as a string) none of the variables occurring in it is free.

Example 2.31. In the formula

φ ≡ (∃x3((∀x1((x1 ∗ x2) = x0)) ∧ (¬(∃x2(x1 = x2))))),

the first occurrence of x1, in the 6-th position of φ, is bounded by ⟨5, ∀⟩, and the second occurrence of x1, in the 9-th position, is bounded by ⟨5, ∀⟩ (in both cases ψ5,∀ is the subformula (∀x1((x1 ∗ x2) = x0)) of φ); the third occurrence of x1, in the 24-th position of φ, is free. The first occurrence of x2, in the 11-th position, is free, while the second, in the 22-nd position, and the third, in the 26-th position, are bounded (this is witnessed by the subformula ψ21,∃ of φ given by (∃x2(x1 = x2))). The free variables of φ are {x0, x1, x2}. Any set of variables X ⊇ {x0, x1, x2} is such that φ contains all occurrences of its free variables in the set X.
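On the tree representation of formulae, the free variables can also be computed by a short recursion, without any bookkeeping of positions and scopes; the following Python sketch of ours (on the same hypothetical nested-tuple encoding as before) recovers the free variables of the formula of Example 2.31.

# Illustration only: free variables of a formula in the nested-tuple encoding.
VARIABLES = {f'x{n}' for n in range(100)}   # the hypothetical pool of variables

def term_vars(t):
    if isinstance(t, str):
        return {t} & VARIABLES              # constants are filtered out here
    return set().union(*(term_vars(s) for s in t[1:]))

def free_vars(phi):
    head = phi[0]
    if head in ('and', 'or', 'imp', 'iff'):
        return free_vars(phi[1]) | free_vars(phi[2])
    if head == 'not':
        return free_vars(phi[1])
    if head in ('forall', 'exists'):
        return free_vars(phi[2]) - {phi[1]}
    return set().union(*(term_vars(t) for t in phi[1:]))   # atomic case

phi = ('exists', 'x3',
       ('and', ('forall', 'x1', ('eq', ('*', 'x1', 'x2'), 'x0')),
               ('not', ('exists', 'x2', ('eq', 'x1', 'x2')))))
print(free_vars(phi))   # {'x0', 'x1', 'x2'}, as in Example 2.31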

The requirement (C) in the definition of bounded occurrence of a variable is set up to decide which among the quantifiers under whose scope the occurrence of

3 In order to understand the scope of a quantifier in a formula φ, we must be able to pair a left parenthesis ( occurring in φ with the corresponding right parenthesis ) occurring in φ. For example if a quantifier symbol Q occurring in the formula φ is sφ(k), we need to know which ) symbol is paired to the parenthesis sφ(k − 1) = (. Here is an algorithm to compute which j > k is such that sφ(j) = ) is paired with sφ(k − 1): start a counter with 1 at sφ(k − 1) = ( and, proceeding rightward on the string sφ, increase the counter by 1 at position l if sφ(l) = ( and decrease it by 1 at position l whenever sφ(l) = ). When the counter reaches 0 in position j we get that sφ(j) = ) is the parenthesis coupled with sφ(k − 1) = (. For example in

φ := (∃x3((∀x1((x1 ∗ x2) = x0)) ∧ (¬(∃x2(x1 = x2)))))

we want to know the scope of sφ(5) = ∀: we let the counter start with 1 at sφ(4) = ( and, proceeding rightward, the counter gets value 2 at position 7, value 3 at position 8, value 2 at position 12, value 1 at position 15 and value 0 at position 16. Hence the scope of ⟨5, ∀⟩ is the substring of φ starting in sφ(4) and finishing in sφ(16), i.e. the subformula of φ given by (∀x1((x1 ∗ x2) = x0)).

4 Here j > k + 1 is well possible and often the case, see the example below for clarifications!

the variable falls is really binding the variable. For example consider the following formula:

(∃x(∀x(x ∗ x = y)))

In this case it is correct to argue that the most external quantifier ∃x does not bind any occurrence of the variable x in the formula (x ∗ x = y), since these occurrences of the variable x are already under the scope of the quantifier ∀x. (C) grants that this is indeed the case.

Remark 2.32. The semantics we have given to formulae has the property that any formula φ is logically equivalent to (i.e. it has the same meaning as) a formula φ′ with the property that no variable can occur in some place of φ′ as bounded and in some other place as free; for example

φ ≡ (∃x3((∀x1((x1 ∗ x2) = x0)) ∧ (¬(∃x0(x1 = x2)))))

and

φ′ ≡ (∃x3((∀x4((x4 ∗ x2) = x0)) ∧ (¬(∃x5(x1 = x2)))))

can be shown to be logically equivalent. Observe that in φ′ no variable can occur in distinct places of the formula as free and as bounded, and there are no two occurrences of quantifier symbols which are followed by the same variable. We will come back to this point later on.

The following fact gives a rigorous mathematical formulation of our first observation:

Fact 2.33. Let φ be an L-formula with free variables among x1, . . . , xn, let M = (M, . . . ) be an L-structure and let v, v′ : var → M be two valuations. Assume v(xi) = v′(xi) for all i = 1, . . . , n. Then

M  φ[v] if and only if M  φ[v′].

I.e. the validity of a formula φ under a valuation v in a structure M depends just on the assignment v gives to the free variables of φ.

Proof. The proof is by induction on the logical complexity of φ. If φ is atomic, then the fact is almost self-evident. If φ is a boolean combination of ψ, θ by means of a propositional connective, then the fact is also easily established. The delicate case is when φ is of the form Qxψ with Q among ∃, ∀, but we won't enter the details of the argument in this case.

The following exercise should convince you that the above fact is indeed true:

Exercise 2.34. Check that the fact is easily proved for atomic formulae; for example check that

(M, ·)  x ∗ y = z[v] if and only if (M, ·)  x ∗ y = z[v′]

whenever v, v′ agree on x, y, z. For more complex formulae, try to prove the fact for the formula

φ = (∀x1(x1 ∗ x2 = x0)) ∧ (¬(∃x2(x1 = x2))),

i.e. for any L0-structure (M, ·) and valuations v, v′ which agree on the values assigned to x0, x1, x2 (which are the free variables occurring in φ)

(M, ·)  φ[v] if and only if (M, ·)  φ[v′].

The following fact gives a rigorous mathematical formulation of our fourth observation:

Fact 2.35. Assume x does not occur free in φ. Then ∃xφ ≡ ∀xφ ≡ φ.

Proof. Notice that ∃xφ and φ have the same free variables. Notice also that M = (M, . . . ) |= ∃xφ[v] if and only if M |= φ[vx/b] for some b ∈ M. Since x does not occur free in φ we get that

M |= φ[vx/b] if and only if M |= φ[v],

given that v and vx/b agree on all free variables of φ. Hence the thesis.

Exercise 2.36. Complete the proof of the above fact for the case ∀xφ.

Notation 2.37. Given a string s on a vocabulary V and some x, y ∈ V , we denote by s[x/y] the string t obtained from s by systematically replacing the symbol x with the symbol y. For example let s = abbbccmab; then s[a/b] = bbbbccmbb and s[a/e] = ebbbccmeb. Given a string s and a substring t of s, we also let s[a/b ↾ t] denote the string obtained from s by replacing a with b just on the occurrences of a which are in t. For example if s = abbbccmab and t = bbccma, then s[a/b ↾ t] = abbbccmbb.

The following fact gives a rigorous mathematical formulation of our second observation. We will not prove it, but we will use it on several occasions:

Fact 2.38. Let φ be an L-formula and y a variable never occurring in the string φ. Let θ = (∃xψ) be a subformula of φ. Then φ[x/y ↾ θ] ≡ φ.

The following provides an example to convince you why the above fact is true:

Exercise 2.39. Prove that the string φ

(∀x1((x1 ∗ x2) = x0)) ∧ (¬(∃x2(x1 = x2)))

is logically equivalent to the string ψ

(∀y((y ∗ x2) = x0)) ∧ (¬(∃z(x1 = z))).

Notice that ψ = φ[x1/y ↾ (∀x1((x1 ∗ x2) = x0))][x2/z ↾ (∃x2(x1 = x2))].

Proof. It is enough to prove that (∀x1((x1 ∗ x2) = x0)) ≡ (∀y((y ∗ x2) = x0)) and that (¬(∃x2(x1 = x2))) ≡ (¬(∃z(x1 = z))), and then appeal to the fact that φ0 ≡ ψ0 and φ1 ≡ ψ1 entail that φ0 ∧ φ1 ≡ ψ0 ∧ ψ1. So let us prove that

(∀x1((x1 ∗ x2) = x0)) ≡ (∀y((y ∗ x2) = x0)).

For any structure M = (M, ·^M) and valuation v : var → M, we have that

M  ∀x1((x1 ∗ x2) = x0)[v]

if and only if for all a ∈ M we have that

M  (x1 ∗ x2) = x0[vx1/a].

Now consider the formula (y ∗ x2) = x0. Take the valuation v′ such that v′(x0) = v(x0), v′(x2) = v(x2) and v′(y) = v(x1). Observe that v′(y) · v′(x2) = v(x1) · v(x2) and v′(x0) = v(x0). Hence v′(y) · v′(x2) = v′(x0) if and only if v(x1) · v(x2) = v(x0). Assume

M  ∀x1(x1 ∗ x2 = x0)[v].

Then for all a ∈ M we have that M  x1 ∗ x2 = x0[vx1/a]. Hence for all a ∈ M we have M  y ∗ x2 = x0[v′y/a]. We conclude that

M  ∀y(y ∗ x2 = x0)[v′].

But now observe that v′(x0) = v(x0) and v′(x2) = v(x2), and x0, x2 are the only free variables in ∀y(y ∗ x2 = x0). Hence we get that

M  ∀y(y ∗ x2 = x0)[v].

In particular we have shown that if M  ∀x1(x1 ∗ x2 = x0)[v], then M  ∀y(y ∗ x2 = x0)[v].

We can repeat verbatim the same argument switching in all places y with x1 and x1 with y to get that if M  ∀y(y ∗ x2 = x0)[v], then M  ∀x1(x1 ∗ x2 = x0)[v]. Since the above argument is independent of the choice of M and v, we have proved that

(∀x1((x1 ∗ x2) = x0)) ≡ (∀y((y ∗ x2) = x0)).

We leave it to the reader to prove that (¬(∃x2(x1 = x2))) ≡ (¬(∃z(x1 = z))) along the same lines, so as to complete the exercise.

Definition 2.40. Given an L-formula φ(x1, . . . , xn) with free variables x1, . . . , xn, its universal closure is ∀x1 . . . ∀xnφ(x1, . . . , xn) and its existential closure is ∃x1 . . . ∃xnφ(x1, . . . , xn). We denote by ∀φ (respectively ∃φ) the universal (existential) closure of φ.

Fact 2.41. An L-formula φ(x1, . . . , xn) is a tautology if and only if its universal closure is, and is satisfiable if and only if its existential closure is.

Exercise 2.42. Prove the fact.

Exercise 2.43. Let φ be an L-formula and T an L-theory made up just of sentences. Prove the following:

1. M |= T [v] if and only if M |= T [v′], for all valuations v, v′ with target the domain of M.

2. T |= φ if and only if T |= ∀φ.

3. T ∪ {φ} is satisfiable if and only if T ∪ {∃φ} is.

4. Prove exercise 2.25 for ≡T .

2.6 Basic rules for logic equivalence and prenex normal forms of formulae

We are now ready to prove the basic logical equivalences:

Fact 2.44. We have the following for all formulae φ, ψ:

• ∃x(φ ∨ ψ) ≡ (∃xφ) ∨ (∃xψ),

• ∀x(φ ∧ ψ) ≡ (∀xφ) ∧ (∀xψ),

• ∀x¬φ ≡ ¬∃xφ,

• ∃x¬φ ≡ ¬∀xφ,

• ∃x(φ ∧ ψ) ≡ φ ∧ (∃xψ) if x does not occur free in φ,

• ∀x(φ ∨ ψ) ≡ φ ∨ (∀xψ) if x does not occur free in φ,

• ∃x(φ → ψ) ≡ φ → (∃xψ) if x does not occur free in φ,

• ∀x(φ → ψ) ≡ φ → (∀xψ) if x does not occur free in φ,

• ∀x(φ → ψ) ≡ (∃xφ) → ψ if x does not occur free in ψ,

• ∃x(φ → ψ) ≡ (∀xφ) → ψ if x does not occur free in ψ.

Proof. The first four are left completely to the reader. We prove in detail the fifth and leave the remaining ones as an exercise to the reader. To prove the first four use the same strategy, based on an analysis of the definition of ≡, as for the fifth below; for the remaining ones it is worth using what has already been proved about ≡ in the first five items above, in combination with the results of Exercise 2.25, the logical equivalence φ → ψ ≡ ¬φ ∨ ψ, and the De Morgan laws ¬(φ ∧ ψ) ≡ ¬φ ∨ ¬ψ, ¬(φ ∨ ψ) ≡ ¬φ ∧ ¬ψ. We must show that

∃x(φ ∧ ψ) ≡ φ ∧ (∃xψ)

if x does not occur free in φ. Choose an L-structure M with domain M and a valuation v : var → M such that M  ∃x(φ ∧ ψ)[v]. This occurs if and only if for some a ∈ M

M  φ ∧ ψ[vx/a] which is the case if and only if M  φ[vx/a] and M  ψ[vx/a]. This gives that M  φ[vx/a] and M  ∃xψ[v]. Since x does not occurs free in φ we have that

M  φ[vx/a] if and only if M  φ[v] since v and vx/a differ just on the value assigned to x which is not free in φ. Hence we get that M  φ[v] and also that M  ∃xψ[v], from which we can infer M  φ ∧ ∃xψ[v]. Since this argument is independent of the choice of M, v, we proved that

∃x(φ ∧ ψ) |= φ ∧ ∃xψ.

For the converse assume M  φ ∧ ∃xψ[v]. Then M  φ[v] and M  ∃xψ[v]. This gives that for some a ∈ M

M  ψ[vx/a] and M  φ[v].

Since x does not occur free in φ we have that

M  φ[vx/a] if and only if M  φ[v] since v and vx/a differ just on the value assigned to x which is not free in φ. Hence we get that M  φ[vx/a] and also that M  ψ[vx/a], from which we can infer M  φ ∧ ψ[vx/a]. This a thus witnesses that M  ∃x(φ ∧ ψ)[v]. Since this argument is independent of the choice of M, v we proved that

φ ∧ ∃xψ |= ∃x(φ ∧ ψ).

With these equivalences at hand, we can now prove the existence of canonical representative in the equivalence class of a formula:

Definition 2.45. An L-formula φ is in prenex normal form if φ is of the form Q1x1 ...Qnxnψ with each Qi a symbol among ∀, ∃ and ψ a quantifier free formula.

Theorem 2.46. Every formula φ is equivalent to a formula ψ in prenex normal form.

Proof. SKETCH: We use the above equivalences to systematically pull quantifiers out of subformulae of φ. For example if φ is θ ∧ ∃xψ we pick a variable y occurring neither in θ nor in ψ. Then φ ≡ θ ∧ ∃y(ψ[x/y]) and by the above equivalences (since y does not occur free in θ), we get that φ ≡ ∃y(θ ∧ ψ[x/y]). Repeating this procedure for all binary connectives occurring in φ and using the equivalences ¬∃xψ ≡ ∀x¬ψ, ¬∀xψ ≡ ∃x¬ψ, after finitely many steps we find a formula ψ ≡ φ in prenex normal form.
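As an illustration of the procedure (this worked example is ours), consider the L0-formula φ := (∀x(x ∗ y = y)) → (∃x(x ∗ x = y)). Renaming the bound x of the consequent to a fresh variable z (Fact 2.38) gives φ ≡ (∀x(x ∗ y = y)) → (∃z(z ∗ z = y)). Since z does not occur free in the antecedent, the equivalence ∃z(θ → ψ) ≡ θ → (∃zψ) gives φ ≡ ∃z((∀x(x ∗ y = y)) → (z ∗ z = y)); and since x does not occur free in (z ∗ z = y), the equivalence ∃x(θ → ψ) ≡ (∀xθ) → ψ gives φ ≡ ∃z∃x((x ∗ y = y) → (z ∗ z = y)), which is in prenex normal form. Different orders of extraction may produce different, but logically equivalent, prenex forms.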

Exercise 2.47. Choose an L0-formula φ with at least four quantifiers and six logical connectives and find a ψ ≡ φ in prenex normal form.

Fact 2.48. Every formula φ is logically equivalent to a formula in which only the boolean connectives ∧, ¬ and the quantifier ∃ appear.

Proof. Proceed by induction on the logical complexity of φ to remove the occurrences of ∨, ∀, →, ↔, using the De Morgan law ψ ∨ θ ≡ ¬(¬ψ ∧ ¬θ), the equivalence ∀xφ ≡ ¬∃x¬φ, and the equivalences φ → ψ ≡ ¬(φ ∧ ¬ψ), φ ↔ ψ ≡ (φ → ψ) ∧ (ψ → φ).

2.7 Substitution of terms inside formulae

We now want to define a general procedure to substitute a free variable by a term inside a formula and thus give a rigorous mathematical form to our third observation.

Definition 2.49. Let L be a first order signature, φ(x1, . . . , xn) a formula with free variables among {x1, . . . , xn}, and tj(y1, . . . , ykj) terms with free variables among {y1, . . . , ykj} for j = 1, . . . , n. We define φ⟦x1/t1, . . . , xn/tn⟧ by the following procedure:

• By means of repeated applications of Fact 2.38 we replace all the occurrences of bounded variables inside φ with new variables (never occurring in φ) which are not in the set

X = {x1, . . . , xn} ∪ ⋃_{j=1,...,n} {y1, . . . , ykj}.

By Fact 2.38, we obtain a formula φ′ ≡ φ with the same free variables and such that all the quantifiers of φ′ quantify on distinct variables which are not in X.

• We let φ⟦x1/t1, . . . , xn/tn⟧ be the string obtained by replacing each occurrence of xi in φ′ with the string ti.

Example 2.50. Let us consider the following example: φ(x0, x1, x2) is the formula

(∃x3((∀x1((x1 ∗ x2) = x0)) ∧ (¬(∀x2(x1 = x2)))))

and t0 = x1 ∗ x2, t1 = (x0 ∗ x0) ∗ x1, t2 = x3 ∗ (x2 ∗ x1). We choose variables y, z, w and we change φ to the formula

ψ = φ[x3/y][x1/z ↾ (∀x1((x1 ∗ x2) = x0))][x2/w ↾ (∀x2(x1 = x2))].

We get that ψ is the formula

(∃y((∀z((z ∗ x2) = x0)) ∧ (¬(∀w(x1 = w))))).

Now we substitute the terms ti for the occurrences of the variables xi in ψ (which are all free) to get φ⟦x0/t0, x1/t1, x2/t2⟧:

(∃y((∀z((z ∗ (x3 ∗ (x2 ∗ x1))) = x1 ∗ x2)) ∧ (¬(∀w((x0 ∗ x0) ∗ x1 = w))))).
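The two-step procedure of Definition 2.49 (first rename the bounded variables away from the variables of the terms, then replace) can be phrased on the tree representation as well. The following rough Python sketch is ours and uses the same hypothetical nested-tuple encoding as the earlier fragments; the supply of fresh variables y0, y1, . . . plays the role of the "new variables never in X".

# Rough illustration of Definition 2.49 on the nested-tuple encoding:
# first rename all bounded variables to fresh ones, then replace the free xi by ti.
from itertools import count

fresh = (f'y{n}' for n in count())    # variables assumed never to occur elsewhere

def subst_term(t, sub):
    # replace variables according to the dictionary sub inside a term (or atomic formula)
    if isinstance(t, str):
        return sub.get(t, t)
    return (t[0],) + tuple(subst_term(s, sub) for s in t[1:])

def rename_bound(phi, ren):
    head = phi[0]
    if head in ('forall', 'exists'):
        new = next(fresh)
        return (head, new, rename_bound(phi[2], {**ren, phi[1]: new}))
    if head in ('and', 'or', 'imp', 'iff'):
        return (head, rename_bound(phi[1], ren), rename_bound(phi[2], ren))
    if head == 'not':
        return (head, rename_bound(phi[1], ren))
    return subst_term(phi, ren)       # atomic case: rename inside the terms

def substitute(phi, sub):
    # phi[[x1/t1, ..., xn/tn]]: after renaming, no variable of the ti can be captured
    return subst_term(rename_bound(phi, {}), sub)

# the data of Example 2.50: t0 = x1*x2, t1 = (x0*x0)*x1, t2 = x3*(x2*x1)
phi = ('exists', 'x3',
       ('and', ('forall', 'x1', ('eq', ('*', 'x1', 'x2'), 'x0')),
               ('not', ('forall', 'x2', ('eq', 'x1', 'x2')))))
print(substitute(phi, {'x0': ('*', 'x1', 'x2'),
                       'x1': ('*', ('*', 'x0', 'x0'), 'x1'),
                       'x2': ('*', 'x3', ('*', 'x2', 'x1'))}))

Up to the names y0, y1, y2 of the fresh variables, the printed tuple is the formula obtained at the end of Example 2.50.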

Fact 2.51. Let L be a first order signature, φ(x1, . . . , xn) a formula with free variables among {x1, . . . , xn}, and tj(y1, . . . , ykj) terms with free variables among {y1, . . . , ykj} for j = 1, . . . , n. Let M be an L-structure with domain M, and v, v′ : var → M be valuations. Assume v(xj) = v′(tj) for all j = 1, . . . , n. Then

M |= φ(x1, . . . , xn)[v] if and only if M |= φ⟦x1/t1, . . . , xn/tn⟧[v′].

We don't prove the fact but we give an example to explain why this holds:

Example 2.52. Consider again the formula φ(x0, x1, x2) of the previous example and the substitution of its free variables by the same terms. Let M = (N, ·) and v(x0) = v(x2) = 0, v(x1) = 27. We leave it to the reader to check that

(N, ·)  (∃x3((∀x1((x1 ∗ x2) = x0)) ∧ (¬(∀x2(x1 = x2)))))[v].

This is the case since the outer quantifier ∃x3 is irrelevant, and of the two conjuncts (∀x1((x1 ∗ x2) = x0)), ¬(∀x2(x1 = x2)), the first is true since n · 0 = 0 for all natural numbers n, and the second is true because any number different from 27 can be used to witness that (N, ·)  ∀x2(x1 = x2)[v] does not hold. After the substitution of xi with ti inside φ we have the formula

(∃y((∀z((z ∗ (x3 ∗ (x2 ∗ x1))) = x1 ∗ x2)) ∧ (¬(∀w((x0 ∗ x0) ∗ x1 = w))))).

Let v′(x0) = v′(x1) = 3, v′(x3) = 2, and v′(x2) = 0. Then:

• v′(t0) = v′(x1 ∗ x2) = 3 · 0 = 0 = v(x0),

• v′(t1) = v′((x0 ∗ x0) ∗ x1) = (3 · 3) · 3 = 27 = v(x1),

• v′(t2) = v′(x3 ∗ (x2 ∗ x1)) = 2 · (0 · 3) = 0 = v(x2).

We leave it to the reader to check that

(N, ·)  (∃y((∀z((z ∗ (x3 ∗ (x2 ∗ x1))) = x1 ∗ x2)) ∧ (¬(∀w((x0 ∗ x0) ∗ x1 = w)))))[v′].

This witnesses the truth of the fact in this particular instance.

2.8 Geometric interpretation of formulae

Much in the same fashion as one draws the set of solutions of a (dis)equation in two variables as a subset of the plane, one can do the same kind of representation for the set of tuples satisfying some first order formula with some valuation. Here is how:

Definition 2.53. Let L be a first order signature, M an L-structure with domain M, and φ(x1, . . . , xn) an L-formula with all the free variables occurring in φ among x1, . . . , xn. We define the following subset of M^n:

T^M_{φ(x1,...,xn)} = {(a1, . . . , an) ∈ M^n : M  φ[vx1/a1,...,xn/an]}.

Example 2.54. Take the language L2 and the structure R = (R, +, ·, <, 0, 1). The curve y = 1/x in the plane R^2 can be described as

T^R_{φ(x,y)} = {(a1, a2) ∈ R^2 : R  φ[vx/a1,y/a2]},

where φ(x, y) := ((y ∗ x) = 1̄). More generally one can prove that every polynomial with coefficients in N can be described by an L2-term (see the exercise below). Sets of solutions of equations and disequations with rational coefficients can also be described by means of L2-formulae. The above example for the equation y = 1/x is a simple illustration of this fact.

Remark 2.55. It is crucial when defining the sets T^M_{φ(x1,...,xn)} to specify not only the formula and the structure, but also the set (or more importantly the number) of distinct free variables x1, . . . , xn one aims to consider. This last information specifies the dimension of the space M^n of which T^M_{φ(x1,...,xn)} is going to be a subset. For example for φ being ∃z((z ⊕ 1̄ = 0̄) ∧ ((y ∗ x) ⊕ z = 0̄)),

T^R_{φ(x,y)} = {(a1, a2) ∈ R^2 : R  φ[vx/a1,y/a2]}

is a subset of the plane R^2 (of dimension one as an algebraic variety), but

T^R_{φ(x,y,z)} = {(a1, a2, a3) ∈ R^3 : R  φ[vx/a1,y/a2,z/a3]}

is a (two dimensional) subset of R^3 (obtained by a translation along the z-axis of the curve y = 1/x).

Exercise 2.56. Show that for any polynomial p(x) with coefficients in N, there is an L2-term t(x) such that vx/a(t) = p(a) for all real numbers a and valuations v : var → R, and conversely. (HINT: proceed by induction on the degree of the polynomial. Let p(x) = Σ_{i=0}^{n+1} ai x^i, and assume that for all polynomials of degree n one can find the required term. Let q(x) = Σ_{i=0}^{n} ai x^i and let s(x) be the term such that vx/a(s) = q(a) for all real numbers a and all valuations v. Now let t(x) = s(x) ⊕ (1̄ ⊕ · · · ⊕ 1̄) ∗ (x ∗ · · · ∗ x) with the appropriate numbers of repetitions of the strings 1̄⊕ and x∗. Check that this works. For the converse assume that t(x) = (s ∗ u) and that there are such polynomials q(x) for s(x) and r(x) for u(x). Let p(x) = q(x) · r(x) and check that it works.) More generally show that for any polynomial p(x1, . . . , xn) with coefficients in N, there is an L2-term t(x1, . . . , xn) such that for all real numbers a1, . . . , an we have vx1/a1,...,xn/an(t) = p(a1, . . . , an), and conversely. Even more generally prove that for any polynomial p(x1, . . . , xn) with integer or even rational coefficients there are terms t1(x1, . . . , xn), t2(x1, . . . , xn) such that for every n-tuple of real numbers (a1, . . . , an) and real number a we have that p(a1, . . . , an) = a if and only if vx1/a1,...,xn/an(t1) = vx1/a1,...,xn/an(t2) + a. (HINT: start with the case n = 1 and a = 0, and look at the example of the hyperbola given above and at what has been done to handle the case of polynomials with coefficients in the natural numbers.)

Fact 2.57. The following holds for any L-structure M with domain M and any L-formulae φ, ψ with free variables among x1, . . . , xn:

• T^M_{(φ∧ψ)(x1,...,xn)} = T^M_{φ(x1,...,xn)} ∩ T^M_{ψ(x1,...,xn)},

• T^M_{(φ∨ψ)(x1,...,xn)} = T^M_{φ(x1,...,xn)} ∪ T^M_{ψ(x1,...,xn)},

• T^M_{(¬φ)(x1,...,xn)} = M^n \ T^M_{φ(x1,...,xn)},

• T^M_{(∃xj φ)(x1,...,xn)} = πj[T^M_{φ(x1,...,xn)}], where πj : M^n → M^{n−1} is defined by (a1, . . . , an) 7→ (a1, . . . , aj−1, aj+1, . . . , an).

Proof. The proofs of all but the last item are left to the reader. For the last item: assume (a1, . . . , aj−1, aj+1, . . . , an) ∈ T^M_{(∃xj φ)(x1,...,xn)}; then

M  ∃xjφ[vx1/a1,...,xj−1/aj−1,xj+1/aj+1,...,xn/an ]

Figure 1: Existential quantification and projection maps

which occurs if and only if for some a ∈ M

M  φ[vx1/a1,...,xj−1/aj−1,xj /a,xj+1/aj+1,...,xn/an ] giving that M (a1, . . . , aj−1, a, aj+1, . . . , an) ∈ Tφ(x1,...,xn). Now observe that

πj(a1, . . . , aj−1, a, aj+1, . . . , an) = (a1, . . . , aj−1, aj+1, . . . , an), hence (a1, . . . , aj−1, aj+1, . . . , an) ∈ πj[T^M_{φ(x1,...,xn)}]. Since (a1, . . . , aj−1, aj+1, . . . , an) was chosen arbitrarily in T^M_{(∃xj φ)(x1,...,xn)}, we conclude that T^M_{(∃xj φ)(x1,...,xn)} ⊆ πj[T^M_{φ(x1,...,xn)}]. We leave the proof of the converse inclusion to the reader.

Exercise 2.58. Express T^M_{(∀xj φ)(x1,...,xn)} in terms of T^M_{φ(x1,...,xn)} using the projection maps and the set theoretic operation of complementation.

Exercise 2.59. Fix a first order signature L. Show that

• φ is satisfiable if and only if we can find some n > 0, some set x1, . . . , xn containing the free variables of φ, and an L-structure M such that T^M_{φ(x1,...,xn)} ≠ ∅.

• φ is a tautology if and only if we can find some n > 0 and some set x1, . . . , xn containing the free variables of φ such that for all L-structures M with domain M we have that T^M_{φ(x1,...,xn)} = M^n.

Figure 2: Universal quantification and its sections

• φ is a contradiction if and only if we can find some n > 0 and some set x1, . . . , xn containing the free variables of φ such that for all L-structures M we have that T^M_{φ(x1,...,xn)} = ∅.

Exercise 2.60. Fix a first order signature L. Show that for all L-formulae φ, ψ we have that φ |= ψ if and only if we can find some n > 0 and some set x1, . . . , xn containing the free variables of φ, ψ such that for all L-structure M

T^M_{φ(x1,...,xn)} ⊆ T^M_{ψ(x1,...,xn)}.
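For a finite structure the sets T^M_{φ} can be listed by brute force, and the projection clause of Fact 2.57 can then be checked directly on examples. A small Python illustration of ours (the structure (Z5, ·) and the formulae φ(x, y, z) := x ∗ z = y and ∃z φ are chosen arbitrarily):

# Brute-force illustration of Definition 2.53 and of the projection clause of Fact 2.57
# in the finite structure (Z_5, *) with a * b = a·b mod 5 (an arbitrary choice).
from itertools import product

M = range(5)
mul = lambda a, b: (a * b) % 5

# phi(x, y, z) : x * z = y          psi(x, y) : exists z (x * z = y)
T_phi = {(a, b, c) for a, b, c in product(M, repeat=3) if mul(a, c) == b}
T_psi = {(a, b) for a, b in product(M, repeat=2) if any(mul(a, c) == b for c in M)}

# Fact 2.57: T_{exists z phi} is the projection of T_phi forgetting the z-coordinate.
projection = {(a, b) for (a, b, c) in T_phi}
print(T_psi == projection)   # True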

2.9 Definable sets with and without parameters

In many cases we are interested in sets which need, in order to be defined, some arguments not expressible in a first order language. For example we want to study the set of solutions of equations and disequations on real numbers for which the coefficients of the relevant polynomials are neither rational, nor even solutions of polynomial equations. It is hard to imagine how to express these coefficients by means of closed terms, or even of formulae in the language L = {·, +, 0, 1, <, =}. In most cases it is even impossible to talk about them (for example it can be proved that the number π is such that for no formula φ(x) in one free variable in the language L = {·, +, 0, 1, <, =} do we have ⟨R, ·, +, 0, 1, <, =⟩ |= φ(x)[x/a] if and only if a = π; the same holds for e). Nonetheless we want to be able to freely use all real numbers in our analysis of the structure ⟨R, ·, +, 0, 1, <, =⟩, not just the ones obtained as the interpretation of some closed term t, or as the unique number c such that, for some φ(x) in free variable x, ⟨R, ·, +, 0, 1, <, =⟩ |= φ(x)[x/a] if and only if a = c. This leads us to the notion of parameter and of subset of a structure M = ⟨M, · · ·⟩ definable with parameters.

Definition 2.61. Let L be a first order signature, M be an L-structure with domain M, b1, . . . , bn ∈ M, and φ(x1, . . . , xk, y1, . . . , yn) be an L-formula with displayed free variables. Set

T^M_{φ(x1,...,xk,y1,...,yn),⟨b1,...,bn⟩} = {⟨a1, . . . , ak⟩ ∈ M^k : M |= φ(x1, . . . , xk, y1, . . . , yn)[xi/ai, yj/bj]}.

• A ⊆ M^k is M-definable with parameters b1, . . . , bn ∈ M if for some L-formula φ(x1, . . . , xk, y1, . . . , yn) with displayed free variables we have that A = T^M_{φ(x1,...,xk,y1,...,yn),⟨b1,...,bn⟩}.

• A ⊆ M^k is M-definable without parameters if for some L-formula φ(x1, . . . , xk) with displayed free variables we have that A = T^M_{φ(x1,...,xk)}.

• a ∈ M is M-definable (with parameters b1, . . . , bn ∈ M) if {a} is definable (with parameters b1, . . . , bn ∈ M).

Example 2.62. Here are a few examples:

• The interval (−π; e) is R-definable with parameters π, e, where R = ⟨R, <, +, 0⟩, by the formula φ := (0 < y0 + x) ∧ (x < y1).

• Each natural number n is definable in the structure (N, +, ·) (without parameters): 0 is definable using the formula ∀z(z + x = z); 1 using the formula φ1(x) := ∀z(z · x = z); given the formula φn(x) defining n we let

φn+1(x) := ∃z∃w(φn(x)⟦x/z⟧ ∧ φ1(x)⟦x/w⟧ ∧ x = z + w).

Exercise 2.63. Show the following:

• For any structure M with domain M, all finite and cofinite (i.e. of the form M \ X with X a finite subset of M) subsets of M are M-definable with parameters.

• All finite and cofinite sets A ⊆ N are (N, +, ·)-definable without parameters. • The set of solutions of any finite set of polynomial equations and disequations with real coefficients is (R, +, ·, 0, 1, <)-definable with parameters.

Predicates, functions, and constants definable in a structure

Given an L-structure M with domain M, it is often the case that certain functions f : M^k → M, elements c ∈ M, and predicates R ⊆ M^n are M-definable with (or without) parameters from M. To increase the readability of formulae which talk about properties of these M-definable objects, it is often convenient to expand L with symbols which can denote these objects (i.e. a new function symbol f̄ of arity k to denote the definable function f : M^k → M, a new constant symbol c̄ to denote the definable element c ∈ M, a new relation symbol R̄ of arity n to denote the definable relation R ⊆ M^n). In this case we enrich the relevant theory of which M is a model by sentences in the new language which explain the meaning of the new symbols added to the language by means of their defining properties in the original language. The following example should clarify what we mean:

Example 2.64. Consider the L-model (R, +, ·) for the language L = {⊕, ⊗}. Then 0 is definable by the formula φ0(x) := ∀y(y ⊗ x = x), 1 is definable by the formula φ1(x) := ∀y(y ⊗ x = y), the order relation a ≤ b is definable by the formula φ≤(x, y) := ∃z[x ⊕ (z ⊗ z) = y], and the successor operation a 7→ a + 1 is definable by the formula φS(x, y) := ∃z[φ1(x)⟦x/z⟧ ∧ y = x ⊕ z]. We expand the language L adding the binary relation symbol ≺, the function symbol S̄ and the constant symbols 0̄, 1̄. The axioms

• ψ0 := φ0(x)⟦x/0̄⟧,

• ψ1 := φ1(x)⟦x/1̄⟧,

• ψ2 := ∀x∀y[(x ≺ y) ↔ φ≤(x, y)],

• ψ3 := ∀xφS(x, S̄(x))

provide the definitions of these properties and force their interpretation to be the desired one in R, i.e. if we ask that (R, +, ·, c0, c1, ≤∗, S∗) is a model of ψj for j = 0, . . . , 3, it must be the case that S∗ is the successor operation a 7→ a + 1, c0 = 0, c1 = 1, and ≤∗ is the usual order relation on R. One advantage of doing so is that many properties of R which required complex formulae to be formalized in L are now simply expressible in ⟨R, +, ·, 0, 1, ≤⟩; consider for example the property that holds of a, b ∈ R if and only if b = a + 3. Write down an L-formula φ(x, y) which holds for a, b if and only if a + 3 = b and compare it with the formula y = S̄(S̄(S̄(x))) of the expanded language.

The following fact is trivial but must be checked, the proof goes by induction on the complexity of formulae and is left to the reader:

Fact 2.65. Let L = {Ri : i ∈ I, fj : j ∈ J, ck : k ∈ K} and L′ = L ∪ {Si : i ∈ I′, hj : j ∈ J′, dk : k ∈ K′} be two first order languages. Assume φ is an L-sentence and

M′ = ⟨M, Ri^M′ : i ∈ I, fj^M′ : j ∈ J, ck^M′ : k ∈ K, Si^M′ : i ∈ I′, hj^M′ : j ∈ J′, dk^M′ : k ∈ K′⟩ is an L′-structure such that M′ |= φ. Let

M = ⟨M, Ri^M′ : i ∈ I, fj^M′ : j ∈ J, ck^M′ : k ∈ K⟩.

Then M is an L-structure and models φ as well.

2.10 Exercises on first order logic Here is a list of exercises on first order logic:

• Consider the following LK-rule for the elimination of the universal quantifier:

Γ ` ∀xφ(x, x1, . . . , xn), ∆
------------------------------------ (∀-elimination)
Γ ` φ(x, x1, . . . , xn)⟦x/t⟧, ∆

for a term t. Show that Lemma 1.21 holds for the above rule. More generally: take the logical rules of Section 15.5, Definition 15.21 of the notes of Berarducci linked on the Moodle page of the course, and prove that Lemma 1.21 holds for all these rules.

• Take the set of notes on logic of prof. Andretta available on the Moodle page of this course and do some among the exercises: 3.55, 3.56, 3.57, 3.58, 3.59, 3.63, 3.64, 3.65, 3.66 ((i)*, (iii)**), 3.67 ((vi)*), 3.68 ((ii)*), 3.69, 3.70, 3.71 ((ii)*, (iii)**). (* means the exercise is difficult, ** means the exercise is even more difficult.)

3 More on first order logic

We now start a deeper analysis of what can be done by means of first order logic. In particular we outline how first order logic provides: • a fully satisfactory mathematical formulation of the notion of theorem and of proof (by means of the completeness theorem); • the right framework to generalize the notions of algebraic structures and mor- phism between them, while tying their algebraic properties to syntactic prop- erties of the first order axioms used to define them; • a variety of “exotic” algebraic structures (by means of the compactness theo- rem).

3.1 First order LK-calculus We add to the LK-rules for propositional logic rules which reflect in our calculus the logical meaning of quantifiers and axioms which incorporate the properties of the . symbol = for equality:

Definition 3.1. Let L = {Ri : i ∈ I, fj : j ∈ J, ck : k ∈ K} be a first order signature with Ri a relation symbol of arity ni for i ∈ I and fj a function symbol of arity nj for j ∈ J. The axioms of the LK-calculus for L are:

• φ ` φ for φ an L-formula,

• ` t = t for all L-terms t,

• t = s ` s = t for all L-terms s, t,

• s = t, t = u ` s = u for all L-terms s, t, u,

• x1 = t1, . . . , xnj = tnj ` fj(x1, . . . , xnj) = fj(t1, . . . , tnj) for all variables x1, . . . , xnj and L-terms t1, . . . , tnj,

• x1 = t1, . . . , xn = tn, φ ` φ⟦x1/t1, . . . , xn/tn⟧ for all variables x1, . . . , xn, L-terms t1, . . . , tn, and L-formulae φ.

Let Γ, ∆ be finite sets of L-formulae. The rules of the LK-calculus for L are the ones already introduced for propositional logic (see Def. 1.15) and the following rules for quantifiers:

Γ ` φ⟦x/t⟧, ∆
------------------------------------ (∃-R)
Γ ` ∃xφ, ∆

Γ, φ⟦x/t⟧ ` ∆
------------------------------------ (∀-L)
Γ, ∀xφ ` ∆

Γ, φ ` ∆
------------------------------------ (∃-L)
Γ, ∃xφ ` ∆

Γ ` φ, ∆
------------------------------------ (∀-R)
Γ ` ∀xφ, ∆

where in the last two rules it is required that x is not free in any formula belonging to Γ ∪ ∆.

Definition 3.2. Let L be a first order signature, T a theory in the language L, and φ an L-formula. T ` φ if there is a finite set Γ ⊆ T such that Γ ` φ is LK-derivable, i.e. it is the root of an LK-derivation tree. More generally, for finite or infinite sets of L-formulae Γ, ∆ we let Γ ` ∆ if there are finite sets of formulae Γ0 ⊆ Γ and ∆0 ⊆ ∆ such that Γ0 ` ∆0 is LK-derivable.

Example 3.3. We give an LK-derivation of ∃y∀xφ ` ∀x∃yφ and argue why the above restriction is necessary in the last two deduction rules.

φ ` φ
------------------------------------ (∃-R)
φ ` ∃yφ
------------------------------------ (∀-L)
∀xφ ` ∃yφ
------------------------------------ (∃-L)
∃y∀xφ ` ∃yφ
------------------------------------ (∀-R)
∃y∀xφ ` ∀x∃yφ

The last two applications of (∃-L) and (∀-R) are correct because y is no longer free in ∃yφ, and x is no longer free in ∃y∀xφ. On the other hand assume x, y occur as free variables in φ. Then the following derivation of ∀x∃yφ ` ∃y∀xφ is flawed:

φ ` φ
------------------------------------ (∃-L)
∃yφ ` φ
------------------------------------ (∀-L)
∀x∃yφ ` φ
------------------------------------ (∀-R)
∀x∃yφ ` ∀xφ
------------------------------------ (∃-R)
∀x∃yφ ` ∃y∀xφ

The use of (∃-L) in the first step of the above tree is not allowed by the LK-rules, since y occurs free in φ (which belongs to ∆ = {φ}).

Exercise 3.4. Show by means of a counterexample that ∀x∃yφ 6|= ∃y∀xφ. Show also that ∃y∀xφ |= ∀x∃yφ.

The above calculus works as expected (though the proof of the theorem below is significantly more intricate than the corresponding proof for propositional logic).

Theorem 3.5 (Gödel's Completeness Theorem, 1930). Let L be a first order signature, T a theory in the language L, and φ an L-formula. The following are equivalent:

• T ` φ,

• T |= φ.

3.2 Satisfiable theories and compactness

Fix throughout this section a first order signature L. From now on, unless otherwise specified, we will assume our theories T consist of L-sentences, i.e. of formulae with no free variables. Recall that if φ is an L-sentence and M is an L-structure with domain M, the truth value of φ is independent of the valuation v : var → M chosen (all valuations coincide on the empty set, which is the set of free variables of φ, hence we can apply Fact 2.33 to φ). Thus for L-sentences φ we will write

M  φ, rather than

M  φ[v] for some (equivalently all) valuations v : var → M. We also remark that for all L-sentences φ and L-structure M either

M  φ, or M  ¬φ.

We will assume our theories consist of L-sentences also in view of the following observations:

Exercise 3.6. Let T be a theory and M an L-structure; let us denote by ∃T (respectively ∀T ) the theory obtained by taking the existential (universal) closure of all formulae in T . ∀T and ∃T are L-theories consisting of sentences. Prove the following:

1. M |= ∀T if and only if for all valuations v and formulae φ ∈ T we have M |= φ[v].

2. M |= ∃T if and only if there exists a valuation v such that for all formulae φ ∈ T we have M |= φ[v].

Therefore if we want to study whether T holds in some structure for some valuation, it suffices to prove that ∃T holds in some structure. Moreover the usual mathematical practice considers just theories consisting of sentences (such is the case for example for the theories we introduced so far in our examples), hence it is natural to focus our attention on this type of theories.

Definition 3.7. Let L be a first order signature. A theory T consisting of L-sentences is:

• satisfiable if there exists an L-structure M such that

M  φ for all φ ∈ T ; • closed under logical consequences, if for all L-sentences φ such that T |= φ, we have that φ ∈ T ; • complete, if for all L-sentences φ, we either have that T |= φ or that T |= ¬φ. Given a set of L-sentences S, its closure under logical consequences CCL(S) is the least theory T which contains all L-sentences φ such that S |= φ. Σ is a set of axioms for T , if CCL(T ) = CCL(Σ). T is finitely axiomatizable if there exists a finite set of sentences Σ such that CCL(T ) = CCL(Σ). An immediate corollary of the completeness theorem is the following fundamental result: Theorem 3.8 (Compactness Theorem). Assume T is a finitely satisfiable theory in the language L (i.e. for every finite set Γ ⊆ T there is an L-structure M satisfying all formulae in Γ). Then T is satisfiable. Proof. If T is satisfiable it is clearly finitely satisfiable, hence only one direction is non trivial. We prove it: If T is not satisfiable it does not hold in any model, hence T |= ψ for any ψ, in particular T |= φ ∧ ¬φ. By the completeness theorem, we get that T ` φ ∧ ¬φ. This holds if and only if there is a finite set Γ0 ⊆ T such that Γ0 ` φ ∧ ¬φ is the root of an LK derivation tree. The completeness theorem gives that Γ0 |= φ ∧ ¬φ, i.e. Γ0 is not satisfiable. Hence T is not finitely satisfiable as witnessed by Γ0. A second equivalent formulation of compactness is the following: Theorem 3.9 (Compactness II). Assume T |= φ. Then there exists a finite Σ ⊆ T such that Σ |= φ. Proof. An immediate byproduct of the definitions and of the completeness theorem.

We will give a self-contained proof of the compactness theorem in the last part of these notes. Before then we will outline some of its applications.

Proposition 3.10. Assume T is finitely axiomatizable. Then there exists a finite set Σ ⊆ T which is a set of axioms for T .

Proof. Assume Σ0 is a finite set (not necessarily contained in T ) such that CCL(Σ0) = CCL(T ). Then T |= ⋀Σ0 and for all formulae φ ∈ T we have that Σ0 |= φ. By the compactness theorem we get that some finite Σ ⊆ T is such that Σ |= ⋀Σ0. On the other hand we also get that Σ0 |= ⋀Σ, since Σ0 |= φ for all φ ∈ Σ. Therefore ⋀Σ ≡ ⋀Σ0. We conclude that Σ ⊆ T is a finite set of axioms for T , since for any ψ ∈ CCL(T ) we have Σ |= ⋀Σ0 and Σ0 |= ψ.

3.3 Classes of L-structures

Definition 3.11. Let L be a first order signature. We denote by ModL the class of all L-structures. Given a theory T in the language L

ModL(T ) = {M ∈ ModL : M |= T } .

A family C of L-structures is axiomatizable if C = ModL(T ) for some L-theory T , and finitely axiomatizable if C = ModL(T ) for some theory T given by a finite set of axioms.

Example 3.12. Groups, rings, and fields are all examples of finitely axiomatizable classes of L-structures over the appropriate language. An example of a class which is axiomatizable but not finitely axiomatizable is that of fields of characteristic 0. We will prove this as an application of the compactness theorem.

Exercise 3.13. Prove that groups, rings, and fields are finitely axiomatizable, and also that fields of characteristic 0 are axiomatizable.

Fact 3.14. Let L be a first order signature and T0,T1 be theories in the language L. Then:

1. ModL = ModL(∅),

2. ModL(T0) ∩ ModL(T1) = ModL(T0 ∪ T1).

3. Assume T0 is finitely axiomatizable. Then ModL \ ModL(T0) is finitely axiom- atizable.

4. Assume ModL\ModL(T0) is axiomatizable. Then ModL(T0) and ModL\ModL(T0) are both finitely axiomatizable. Hence a partition of ModL in two pieces is such that either both pieces are finitely axiomatizable or none of the pieces is axiom- atizable.

Proof. 1, 2, 3 are useful exercises for the reader. We prove 4: if one of the two pieces is finitely axiomatizable, by 3 so is the other. So assume both of them are axiomatizable but not finitely axiomatizable. Let T1 be such that ModL(T1) = ModL \ ModL(T0). Then for any finite Σ ⊆ T1 there is an L-structure M which models Σ but not T1. Since ModL(T0) = ModL \ ModL(T1), we get that M is a model of Σ ∪ T0. This gives that the theory T0 ∪ T1 is finitely satisfiable. Therefore it has a model N . Then

N ∈ ModL(T0 ∪ T1) = ModL(T0) ∩ ModL(T1) = ∅.

We can also relativize the above properties to axiomatizable classes:

Definition 3.15. Let L be a first order signature. Given a theory T in the language L, we say that C ⊆ ModL(T) is (finitely) axiomatizable modulo T if C = ModL(T ∪ T0) with T0 a (finite) set of L-sentences.

Fact 3.16. Assume C ⊆ ModL(T) is axiomatizable modulo T. Then C is finitely axiomatizable modulo T if and only if so is ModL(T) \ C. Hence a partition of ModL(T) in two pieces is such that either both pieces are finitely axiomatizable modulo T or none of the pieces is axiomatizable modulo T.

Exercise 3.17. Prove the fact.

Example 3.18. As a further application of compactness, let us prove that the class of fields of characteristic 0 is L-axiomatizable, but not finitely axiomatizable, where L = {0, 1, +, ·}. This will also give that the class of fields of non-zero characteristic is not axiomatizable, by Fact 3.16. The theory of fields is finitely axiomatizable in L by a finite set of axioms T0 (prove it). We can isolate the fields of characteristic 0 in ModL(T0) as those fields which moreover satisfy the sentences ¬(n̄ = 0) for all n ≥ 1, where n̄ is the closed term 1 + 1 + · · · + 1 in which 1 is repeated n times. Let T1 = T0 ∪ {¬(n̄ = 0) : n ≥ 1}. Now assume that ModL(T1) is finitely axiomatizable, as witnessed by a finite set Σ. By Proposition 3.10 we can assume Σ ⊆ T1. Find a prime number p large enough so that every sentence of type ¬(n̄ = 0) appearing in Σ has n < p. Then Zp is a model of Σ and is a field of non-zero characteristic, a contradiction.
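To see the counterexample at work, here is a minimal computational sketch (in Python; the helper names is_field_Zp and nbar_is_nonzero are ad hoc and not part of the formal argument) checking that Zp satisfies representative field axioms together with every sentence ¬(n̄ = 0) for n < p, while p̄ = 0 holds in it:

```python
# Sanity check: for a prime p, the field Z_p satisfies "not(n-bar = 0)" for all n < p,
# so no finite fragment of T_1 mentioning only such sentences can exclude it.
def is_field_Zp(p):
    elems = range(p)
    add = lambda a, b: (a + b) % p
    mul = lambda a, b: (a * b) % p
    # every non-zero element has a multiplicative inverse
    has_inverses = all(any(mul(a, b) == 1 for b in elems) for a in elems if a != 0)
    # one representative axiom; the remaining field axioms are checked in the same way
    distributive = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
                       for a in elems for b in elems for c in elems)
    return has_inverses and distributive

def nbar_is_nonzero(n, p):
    # n-bar is the closed term 1 + 1 + ... + 1 (n times); in Z_p it evaluates to n mod p
    return n % p != 0

p = 7
print(is_field_Zp(p))                                   # True: Z_7 is a field
print(all(nbar_is_nonzero(n, p) for n in range(1, p)))  # True: ¬(n-bar = 0) for all n < p
print(nbar_is_nonzero(p, p))                            # False: p-bar = 0, so char(Z_7) = 7
```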

3.4 Substructures, morphisms, and products

We now generalize to arbitrary L-structures the usual notions of homomorphism, product, and substructure we have already encountered in algebra. We will show that many algebraic properties can be characterized in terms of syntactic properties of the formulae satisfied by the relevant algebraic structures. First of all we need to define two special subclasses of formulae: the positive ones and the quantifier free ones.

Definition 3.19. A formula is positive if the unique boolean connectives occurring in it are ∨, ∧. It is quantifier free if no quantifier symbol occurs in it.

Fact 3.20. Let L be a first order signature. Every quantifier free L-formula ψ is logically equivalent to one of the form

⋁_{i∈I} (⋀ Γi)

with I a finite set and each Γi a finite set of atomic formulae or negations of atomic formulae.

Proof. SKETCH: run the proof of Theorem 1.13 with atomic formulae taking the role of propositional variables.

Substructures

Definition 3.21. Let M be a set and f : M^n → M. Given B ⊆ M, B is f-closed if f(a1, . . . , an) ∈ B for all a1, . . . , an ∈ B. Assume F is a family of finitary operations on M; B ⊆ M is F-closed if f(a1, . . . , an) ∈ B for all a1, . . . , an ∈ B and f ∈ F. For any A ⊆ M, ClF(A) (the closure of A under F) is the intersection of all F-closed sets B ⊇ A.

Definition 3.22. Let L = {Ri : i ∈ I, fj : j ∈ J, ck : k ∈ K} be a first order signature and M = ⟨M, Ri^M : i ∈ I, fj^M : j ∈ J, ck^M : k ∈ K⟩ be an L-structure with domain M.
N ⊆ M is an L-substructure of M if it contains {ck^M : k ∈ K} and is {fj^M : j ∈ J}-closed. The L-substructure of M generated by X ⊆ M is the intersection of all L-substructures of M containing X, i.e. the {fj^M : j ∈ J}-closure of X ∪ {ck^M : k ∈ K}. An L-substructure N of M is generated by a subset X if N = Cl{fj^M : j ∈ J}(X ∪ {ck^M : k ∈ K}), and is finitely generated if it can be generated by a finite subset X.

Example 3.23. A substructure of the L2-structure ⟨R, <, +, ·, −, 0, 1⟩ is any subring A ⊆ R (i.e. any subset containing 0, 1 and closed under +, ·, −), and the smallest such substructure is Z. Z is finitely generated inside R, since Z = Cl{+,−,·}({0, 1}).
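The closure operator ClF can be approximated by direct computation. The following is a minimal sketch (assuming binary subtraction stands in for the − operation, and with a step bound, since Cl{+,−,·}({0, 1}) = Z is infinite):

```python
from itertools import product

# A minimal sketch of Cl_F(A): repeatedly apply the operations until nothing new
# appears (or a step bound is reached, since the closure may be infinite, as for Z).
def closure(generators, operations, steps=3):
    closed = set(generators)
    for _ in range(steps):
        new = {f(*args) for f, arity in operations
                        for args in product(closed, repeat=arity)}
        if new <= closed:
            break                      # an F-closed set has been reached
        closed |= new
    return closed

ops = [(lambda a, b: a + b, 2),        # +
       (lambda a, b: a - b, 2),        # binary subtraction stands in for -
       (lambda a, b: a * b, 2)]        # *
approx = closure({0, 1}, ops)
# after a few steps every integer in a small window around 0 already appears,
# illustrating Cl_{+,-,*}({0,1}) = Z
print(sorted(x for x in approx if -5 <= x <= 5))
```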

Proposition 3.24. Let L = {Ri : i ∈ I, fj : j ∈ J, ck : k ∈ K} be a first order signature and M = ⟨M, Ri^M : i ∈ I, fj^M : j ∈ J, ck^M : k ∈ K⟩ be an L-structure with domain M. Assume N is the L-substructure of M generated by X. Then

N = {v(t(x1, . . . , xn)) : t(x1, . . . , xn) an L-term and v : var → X a valuation}

(in case X = ∅, N = {v(t) : t a closed L-term}, i.e. an L-term in which no variable occurs).

Proof. We have that N = Cl{fj^M : j ∈ J}(X ∪ {ck^M : k ∈ K}).
It is enough to show that D = {v(t(x1, . . . , xn)) : t an L-term, v : var → X a valuation} is contained in Cl{fj^M : j ∈ J}(X ∪ {ck^M : k ∈ K}), contains X ∪ {ck^M : k ∈ K}, and is {fj^M : j ∈ J}-closed. First of all, ck^M = v(ck) ∈ D, and a = v(x) ∈ D for all a ∈ X, choosing a valuation v which maps all variables to a. Now let t1, . . . , tn be L-terms and v : var → X be a valuation. Assume that v(tl) ∈ Cl{fj^M : j ∈ J}(X ∪ {ck^M : k ∈ K}) for all l = 1, . . . , n. Then for any t of the form fj(t1, . . . , tn) we have that

v(t) = v(fj(t1, . . . , tn)) = fj^M(v(t1), . . . , v(tn)) ∈ Cl{fj^M : j ∈ J}(X ∪ {ck^M : k ∈ K}).

An induction on the complexity of a term t shows at the same time that D ⊆ Cl{fj^M : j ∈ J}(X ∪ {ck^M : k ∈ K}) and that D is {fj^M : j ∈ J}-closed, giving the desired thesis.

Exercise 3.25.

• Prove that the substructure generated by {−1} inside (Z, +, 0) is the family of negative or null integers (HINT: prove by induction: (a) for any negative or null integer number n there is a term t(x) such that v(t(x)) = n for any (the unique) valuation v : var → {−1}, (b) for any term t, v(t) ≤ 0).

• What is the substructure generated by {n} for some fixed non null n ∈ Z?

• What is the substructure generated by a finite set {n1, . . . , nk} of elements of Z? (HINT: it is the substructure given by the numbers of type Σ_{i=1}^{k} mi · ni with each mi ∈ N. How would you prove that this is the correct answer?)

• What is the substructure generated by an arbitrary subset of Z?

Fact 3.26. A substructure of M preserves the truth of all quantifier free formulae.

Proof. Every quantifier free formula is logically equivalent to a formula φ of type

⋁ {⋀ Γi : i ∈ I}

with each Γi a finite set consisting either of atomic formulae or of their negations. Now it is easy to check that the fact holds for an atomic formula or for its negation, and also that if it holds for φ, ψ, it also holds for φ ∨ ψ and φ ∧ ψ, hence the thesis.

Remark 3.27. A substructure may not preserve existential formulae or universal formulae. For example: consider the partial order Z² with the ordering defined componentwise ((a, b) < (c, d) if and only if a < c and b < d). Then ⟨Z², <⟩ satisfies the existential sentence asserting the existence of two incomparable elements, while its substructure given by the diagonal {(n, n) : n ∈ Z} does not; dually, the universal sentence asserting that any two elements are comparable holds in this substructure but fails in ⟨Z², <⟩.

Products

Recall that for a family of non-empty sets ⟨Ml : l ∈ L⟩, its product ∏_{l∈L} Ml is the set of functions h : L → ⋃_{l∈L} Ml such that h(l) ∈ Ml for all l ∈ L.

Definition 3.28. Let L = {Ri : i ∈ I, fj : j ∈ J, ck : k ∈ K} be a first order language and ⟨Ml = ⟨Ml, Ri^{Ml} : i ∈ I, fj^{Ml} : j ∈ J, ck^{Ml} : k ∈ K⟩ : l ∈ L⟩ be a family of L-structures. The product of this family M = ∏_{l∈L} Ml is the L-structure defined as follows:

• its domain is the set ∏_{l∈L} Ml;

• Ri^M(h1, . . . , hn) holds if and only if Ri^{Ml}(h1(l), . . . , hn(l)) holds for all l ∈ L, for each i ∈ I;

• fj^M(h1, . . . , hn)(l) = fj^{Ml}(h1(l), . . . , hn(l)) for all l ∈ L and j ∈ J;

• ck^M(l) = ck^{Ml} for all l ∈ L and k ∈ K.

Example 3.29. The product of two copies of the structure ⟨C, +, ·, −, ⁻¹, 0, 1⟩ is the structure C² endowed with the operations defined pointwise, where − and ⁻¹ denote the inverse operations of the additive and multiplicative group structures on C. Notice that the product does not preserve the truth of disjunctions: for example, let L = {⊕, ⊗, I⊕, I⊗, 0, 1} be the language used to interpret the above symbols as the operations +, ·, −, ⁻¹, 0, 1 on C. Then

C = ⟨C, +, ·, −, ⁻¹, 0, 1⟩ |= (x = 0 ∨ x ⊗ I⊗(x) = 1)[v]

for any v : var → C, while

C × C ⊭ (x = 0 ∨ x ⊗ I⊗(x) = 1)[v]

for v(x) = (0, 1).

Fact 3.30. Assume φ is positive with no disjunction symbol ∨ occurring in it. Let ⟨Ml = ⟨Ml, Ri^{Ml} : i ∈ I, fj^{Ml} : j ∈ J, ck^{Ml} : k ∈ K⟩ : l ∈ L⟩ be a family of L-structures. Then the product M = ∏_{l∈L} Ml satisfies φ with a valuation v : var → ∏_{l∈L} Ml if and only if Ml |= φ[vl]

(where vl(x) = v(x)(l)) for all l ∈ L.

Proof. The fact holds for atomic formulae by definition. We prove the clause for existential formulae and leave the remaining ones to the reader.
Assume that M |= φ[v] if and only if (Ml |= φ[vl] for all l ∈ L). Then M |= ∃xφ[v] if and only if there is h ∈ ∏_{l∈L} Ml such that

M |= φ[vx/h],

if and only if (by the inductive assumption) for all l ∈ L

Ml |= φ[(vl)x/h(l)]    (note that (vl)x/h(l) = (vx/h)l),

giving that Ml |= ∃xφ[vl] for all l ∈ L, as witnessed by h(l). Conversely, assume that Ml |= ∃xφ[vl] for all l ∈ L. Then for all l ∈ L there is some al ∈ Ml such that

Ml |= φ[(vl)x/al];

let h ∈ ∏_{l∈L} Ml be defined by h(l) = al; then (by the inductive assumption on φ)

M |= φ[vx/h],

giving that M |= ∃xφ[v].

Remark 3.31. The above fact fails for negations of positive formulae. For example, let φ ≡ ¬(x = y). Then

Z² |= ¬(x = y)[x/(2, 3), y/(4, 3)],

but it is not the case that Z |= ¬(x = y)[x/3, y/3] (the valuation obtained by projecting on the second coordinate).

Exercise 3.32. In the language L = {+, ·, 0, 1} give axioms for the theory of integral domains. Prove that the product of two integral domains is never an integral domain. Which axiom is never preserved?
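For Exercise 3.32, the failing axiom can be exhibited by a one-line computation on the product Z × Z with the componentwise operations of Definition 3.28 (the code below is only an illustration, with ad hoc names):

```python
# A minimal sketch: the componentwise product of two integral domains always has
# zero divisors, so the axiom "x*y = 0 -> x = 0 or y = 0" fails in the product.
def product_mul(a, b):
    return tuple(x * y for x, y in zip(a, b))   # multiplication in Z x Z, pointwise

zero = (0, 0)
a, b = (1, 0), (0, 1)              # both non-zero in Z x Z
print(product_mul(a, b) == zero)   # True: (1,0)*(0,1) = (0,0), a pair of zero divisors
```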

Morphisms

The examples we gave of L-structures were all drawn from familiar algebraic structures. First order logic is a good setting in which the notion of morphism can be generalized to a much wider class of structures.

Definition 3.33. Let L = {Ri : i ∈ I, fj : j ∈ J, ck : k ∈ K} be a first order language and let M = ⟨M, Ri^M : i ∈ I, fj^M : j ∈ J, ck^M : k ∈ K⟩, N = ⟨N, Ri^N : i ∈ I, fj^N : j ∈ J, ck^N : k ∈ K⟩ be L-structures. A map h : M → N is a:

• morphism if

h(ck^M) = ck^N,
Ri^M(a1, . . . , an) ⇒ Ri^N(h(a1), . . . , h(an)),
fj^N(h(a1), . . . , h(an)) = h(fj^M(a1, . . . , an)),

for all i ∈ I, j ∈ J, k ∈ K and a1, . . . , an ∈ M;

• an embedding if it is a morphism such that

Ri^M(a1, . . . , an) ⇔ Ri^N(h(a1), . . . , h(an))

holds for all i ∈ I and a1, . . . , an ∈ M;

• an elementary embedding if for all formulae φ(x1, . . . , xn) with displayed free variables and all v : var → M, letting

vh(x) = h(v(x)), (1)

we get that

M |= φ(x1, . . . , xn)[v] ⇔ N |= φ(x1, . . . , xn)[vh];

• an isomorphism if it is a surjective embedding.

We say that h : M → N preserves a formula φ(x1, . . . , xn) if for all valuations v : var → M we have that

M |= φ(x1, . . . , xn)[v] ⇒ N |= φ(x1, . . . , xn)[vh];
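For finite structures the first two clauses of Definition 3.33 can be checked mechanically. The following is a minimal sketch for a signature with one binary function symbol f and one binary relation symbol R; the dictionary encoding of structures and the helpers is_morphism, is_embedding are ad hoc choices, not notation from these notes:

```python
from itertools import product

# Check the morphism/embedding clauses for finite structures with one binary
# function "f" and one binary relation "R" (encoded as dictionaries/sets).
def is_morphism(h, M, N):
    dom = M["domain"]
    preserves_f = all(h[M["f"][(a, b)]] == N["f"][(h[a], h[b])]
                      for a, b in product(dom, repeat=2))
    preserves_R = all((h[a], h[b]) in N["R"]
                      for a, b in M["R"])          # R^M(a,b) implies R^N(h(a),h(b))
    return preserves_f and preserves_R

def is_embedding(h, M, N):
    reflects_R = all((a, b) in M["R"]
                     for a, b in product(M["domain"], repeat=2)
                     if (h[a], h[b]) in N["R"])    # the converse implication for R
    injective = len(set(h.values())) == len(h)     # follows from the equality symbol
    return is_morphism(h, M, N) and reflects_R and injective

# the map n |-> 2n from (Z/2, +, <=) into (Z/4, +, <=)
M = {"domain": [0, 1],
     "f": {(a, b): (a + b) % 2 for a in range(2) for b in range(2)},
     "R": {(0, 0), (0, 1), (1, 1)}}
N = {"domain": [0, 1, 2, 3],
     "f": {(a, b): (a + b) % 4 for a in range(4) for b in range(4)},
     "R": {(a, b) for a in range(4) for b in range(4) if a <= b}}
h = {0: 0, 1: 2}
print(is_morphism(h, M, N), is_embedding(h, M, N))   # True True
```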

Example 3.34. The inclusion map of N into Z is an L0-embedding for the language L0 = {∗} and the structures (N, +), (Z, +). It is not an elementary embedding since the first structure does not satisfy the group axioms, while the second does.
The inclusion map of Z into R is an L2-embedding for the language L2 = {+, ·, 0, 1, <} and the structures (Z, +, ·, 0, 1, <), (R, +, ·, 0, 1, <). It is not elementary since (R, +, ·, 0, 1) is a field and (Z, +, ·, 0, 1) is not.
It can be shown (but it is a rather deep result) that the inclusion map of (Q, <) into (R, <) is an elementary embedding for the language {<}.

Exercise 3.35. Show that the inclusion map of the structure (Q, +, ·, 0, 1, <) into (R, +, ·, 0, 1, <) is an embedding but not an elementary embedding (HINT: there are polynomials with integer coefficients whose real roots are not rational).

The following exercise gives some of the basic properties of homomorphisms.

Exercise 3.36. Prove the following facts:

• For the language L∞ = {∗, =} introduced before, the notion of morphism introduced above corresponds to the notion of group homomorphism whenever M, N are groups.

• The notion of ring homomorphism corresponds to the notion of morphism in the language L = {∗, ⊕, =, 0̄, 1̄} whenever M, N are rings.

• An embedding is always an injective morphism (apply the definition of embedding to the equality relation symbol).

• The inclusion map of a substructure into a structure is always an embedding.

• An elementary embedding is also an embedding (apply the definition of elementary embedding to the atomic formulae of type R(x1, . . . , xn) and their negations).

The following can be proved by induction on the complexity of terms:

Fact 3.37. Let L be a first order language. Let M, N be L-structures with domains M, N respectively. Assume h : M → N is a morphism between M and N. Then h(v(t)) = vh(t) for all L-terms t and all valuations v : var → M (recall (1) for the definition of vh).

Proof. If t is a constant or a variable, the thesis trivially holds; next assume the thesis holds for t1, . . . , tm and t = fj(t1, . . . , tm). Then

h(v(t)) = h(v(fj(t1, . . . , tm))) =
= h(fj^M(v(t1), . . . , v(tm))) =
= fj^N(h(v(t1)), . . . , h(v(tm))) =
= fj^N(vh(t1), . . . , vh(tm)) =
= vh(t),

where in the second-to-last equality we used the inductive assumption on t1, . . . , tm.

There are several conclusions that can be drawn from Fact 3.37:

Fact 3.38. Let h : M → N be a morphism between the L-structures M, N with domains respectively M, N.

1. Assume h preserves φ, ψ. Then h preserves ∃xφ, φ ∨ ψ, φ ∧ ψ.

2. Assume h is surjective and preserves φ. Then h preserves also ∀xφ.

3. Assume h preserves φ; then it preserves any formula ψ logically equivalent to φ.

Proof. We leave everything as a useful exercise for the reader, except the proof that whenever φ is preserved by h so is ∃xφ: assume M |= ∃xφ(x, x1, . . . , xn)[v]. By definition there is a ∈ M such that

M |= φ(x, x1, . . . , xn)[vx/a].

By the inductive assumption applied to φ(x, x1, . . . , xn) and v∗ (letting v∗ = vx/a), we conclude that

N |= φ(x, x1, . . . , xn)[(v∗)h];

by definition we get that

N |= ∃xφ(x, x1, . . . , xn)[(v∗)h].

Since (v∗)h and vh agree on all free variables of ∃xφ(x, x1, . . . , xn), we conclude that

N |= ∃xφ(x, x1, . . . , xn)[vh].

Proposition 3.39. Let h : M → N be a morphism between the L-structures M, N with domains respectively M, N. The following hold:

1. h preserves all positive formulae in which no universal quantifier symbol ∀ appears.

2. If h is a surjective morphism, it preserves all positive formulae.

3. If h is an embedding, it preserves all quantifier free formulae.

4. If h is an isomorphism, it is also an elementary embedding.

Proof. 1. By the previous fact (using that h is a morphism) we obtain that for any v : var → M

Ri^M(v(t1), . . . , v(tm)) ⇒ Ri^N(vh(t1), . . . , vh(tm)).

Hence M |= Ri(t1, . . . , tm)[v] if and only if Ri^M(v(t1), . . . , v(tm)), which implies Ri^N(vh(t1), . . . , vh(tm)), which holds if and only if N |= Ri(t1, . . . , tm)[vh]. Now a simple inductive argument handles the cases of formulae built up over atomic formulae just by means of ∨, ∧, ∃.

2. If h : M → N is surjective we can handle also the inductive argument for the universal quantifier ∀.

3. If h : M → N is an embedding, for any v : var → M (recall (1) for the definition of vh)

Ri^M(v(t1), . . . , v(tm)) ⇔ Ri^N(vh(t1), . . . , vh(tm)).

Hence M |= Ri(t1, . . . , tm)[v] if and only if Ri^M(v(t1), . . . , v(tm)) if and only if Ri^N(vh(t1), . . . , vh(tm)) if and only if N |= Ri(t1, . . . , tm)[vh]. Therefore h preserves atomic formulae and their negations. Now any quantifier free L-formula φ is logically equivalent to a formula ψ of the form ⋁_{i∈I} (⋀ Γi), where each Γi is a finite set consisting just of atomic formulae or negations of atomic formulae. By the first item of this proposition and the last item of the previous fact we obtain that h must preserve all quantifier free formulae.

4. Assume h : M → N is an isomorphism between the L-structures M, N with domains M, N. Since h is an embedding, the above argument applies to atomic formulae and their negations. The same argument of the previous item handles the inductive cases for formulae whose principal connective is boolean. We are left with the treatment of quantifiers, which can be handled by means of Fact 3.38.

The above results link algebraic properties of morphisms to their logical properties. There is a celebrated theorem of Birkhoff which gives a nice algebraic characterization of which theories can be axiomatized by sentences of type

∀x1 . . . ∀xn ∀y1 . . . ∀ym t(x1, . . . , xn) = s(y1, . . . , ym).

Notation 3.40. Theories which can be axiomatized by this type of axioms are called equational theories.

This is the case for example for the theory of groups in the language L1 = {∗, I, e} and for the theory of rings in the language L = {⊕, ⊗, 0, 1, I⊕} (I⊕ is a unary function symbol which can be used to define the inverse of the group operation given by the sum).

Exercise 3.41. Show that the theory of groups and the theory of rings are equational.

Exercise 3.42. Show that the class of groups and the class of rings are closed under:

• homomorphic images (i.e. if (G, ·^G, I^G, e^G), (H, ·^H, I^H, e^H) are L1 = {∗, I, e}-structures, G is a group, and h : G → H is a surjective L1-morphism, then H is a group as well; and similarly for rings);

• substructures;

• products.

Exercise 3.43. Show that the class of fields is not closed under products.

In general it is not too hard to see that whenever a theory T in a language L is equational, the class CT of L-structures which are models of T is closed under products, homomorphic images, and substructures. The converse is Birkhoff's theorem:

Theorem 3.44 (Birkhoff). Let L be a first order language with no relation symbols and C a class of L-structures. The following are equivalent:

• C is closed under products, substructures, and homomorphic images.

• There is a language L1 ⊇ L such that C is the family of L1-structures satisfying an equational theory T (more precisely: each M ∈ C can be expanded to an L1-structure M1 with the same domain, the same interpretation of the symbols in L, and such that M1 |= T).

By Birkhoff's theorem we get that the class of fields cannot be axiomatized by an equational theory (in any expanded language), since it is not closed under products.

3.5 Elementary equivalence and completeness

Definition 3.45. Let M, N be L-structures. M, N are elementarily equivalent (M ≡ N) if for all L-sentences φ

M |= φ if and only if N |= φ.

Lemma 3.46. Assume h : M → N is an elementary embedding between the L-structures M, N with domains M, N. Then M and N are elementarily equivalent.

Proof. Exercise for the reader.

Remark 3.47. The converse does not hold: two structures can be elementarily equivalent without there being any elementary embedding of one into the other; an example in the language {<} can be obtained using the linear order ⟨[0, 1] ∪ (Q ∩ (1, 2]), <⟩.

Given an L-structure M, the theory of M is the set of L-sentences

TM = {φ : φ is an L-sentence and M |= φ} .

Remark 3.50. TM is a satisfiable, complete theory, closed under logical consequence, as witnessed by M.

Proposition 3.51. Let T be a satisfiable L-theory made up of L-sentences and closed under logical consequence. The following are equivalent:

1. T is complete.

2. Every two L-structures M, N which are models of T are elementarily equivalent.

3. T = TM for some M which satisfies T .


Proof. We prove these equivalences as follows:

(1) implies (2): Assume M |= φ. Then T ⊭ ¬φ, as witnessed by M; since T is complete, T |= φ. Hence N |= φ as well, being a model of T. (This implication does not need that T is closed under logical consequences.)

(2) implies (3): Notice that T ⊆ TM for any model M of T. By (2) all models of T satisfy exactly the same set of sentences, i.e. TM = TN for all M, N which model T. Assume T ≠ TM for some M which models T. Then there exists φ ∈ TM \ T. Since T is closed under logical consequences, T ⊭ φ (otherwise φ ∈ T). This gives that T ∪ {¬φ} is satisfiable, say by some N. But this contradicts (2), since TM and TN disagree on φ.

(3) implies (1): For any sentence φ and any structure M, either M |= φ or M |= ¬φ. Hence TM is complete (and closed under logical consequences).

3.6 More exercises on first order logic

Take the set of notes on logic of prof. Andretta available on the Moodle page of this course and do some of the following exercises: 3.72 (**) (ignore the request to prove elementary equivalence, but — if you think the two models are not elementarily equivalent — try to prove it), 3.73 (**) (ignore the request to prove elementary equivalence, but — if you think the two models are not elementarily equivalent — try to prove it), 3.74 ((i)*, (ii)**, (iii)**), 3.75, 3.76 (** — a structure is rigid if it does not admit non-trivial automorphisms), 3.77 (* — use 3.76), 3.78 (*), 3.79 (*), 3.80 (*), 3.81 (*), 3.84 (** — an independent set of axioms ∆ for a theory Σ is a set of axioms for Σ such that CCL(∆ \ {φ}) ≠ CCL(∆) for all φ ∈ ∆). (* means the exercise is difficult, ** means the exercise is even more difficult.)

4 Compactness

Throughout this section the first order theories we consider consist just of sentences. Recall the compactness theorem:

Theorem. Let L be a first order signature and T an L-theory. Then T is finitely satisfiable if and only if it is satisfiable.

4.1 Proof of the compactness theorem

The proof is based on two lemmas and two definitions.

Definition 4.1. Let L be a first order signature and T be an L-theory. T has existential witnesses (or the Henkin property) if for all L-formulae φ such that T |= ∃xφ, there is some constant symbol c ∈ L such that T |= φ⟦x/c⟧.

In most cases theories do not have existential witnesses simply because there are not enough constants.

Example 4.2. Let L = {⊕, ⊗, c0, c1} with c0, c1 constant symbols and ⊕, ⊗ binary function symbols. Consider TN where

N = ⟨N, +, ·, 0, 1⟩.

Then TN |= ∃x(x = c1 ⊕ c1), but there is no constant symbol e such that TN |= e = c1 ⊕ c1.

In this example we still have a closed term t such that TN |= ∃x(x = c1 ⊕ c1) if and only if TN |= (t = c1 ⊕ c1); it suffices to take t = c1 ⊕ c1. On the other hand, in more complicated structures this is no longer the case.

Example 4.3. Consider the L = {⊕, ⊗, c0, c1}-structure R = ⟨R, +, ·, 0, 1⟩ and the L-sentence ∃x(x ⊗ x = c1 ⊕ c1). This sentence holds in R, as witnessed by any assignment α satisfying α(x) = √2, but there is no closed L-term t such that R |= t ⊗ t = c1 ⊕ c1 (every closed L-term denotes a natural number).

Let us expand L = {⊕, ⊗, c0, c1} to L′ = {⊕, ⊗, cn : n ∈ N}. Let

N′ = ⟨N, +, ·, n : n ∈ N⟩

(where each n ∈ N is the interpretation in N′ of the constant symbol cn). Then TN′ clearly has existential witnesses and, for all L-sentences φ, φ ∈ TN′ if and only if φ ∈ TN.
Similarly we can expand L = {⊕, ⊗, c0, c1} to L∗ = {⊕, ⊗, cr : r ∈ R}, let R∗ = ⟨R, +, ·, r : r ∈ R⟩ (where each r ∈ R is the interpretation in R∗ of the constant symbol cr), and conclude that TR∗ has existential witnesses and φ ∈ TR∗ if and only if φ ∈ TR for all L-sentences φ.

Exercise 4.4. Prove that TN′ has the Henkin property and that φ ∈ TN′ if and only if φ ∈ TN for all L-sentences φ. Prove that the same holds for TR∗ with respect to TR.

More generally we can perform the same construction over any L-theory T:

Proposition 4.5. Let L = {Ri : i ∈ I, fj : j ∈ J, ck : k ∈ K} be a first order signature and T be an L-theory. Assume M |= T with

M = ⟨M, Ri^M : i ∈ I, fj^M : j ∈ J, ck^M : k ∈ K⟩

an L-structure with domain M. Let L′ = L ∪ {em : m ∈ M} and

T′ = T ∪ {∃xφ → φ⟦x/em⟧ : M |= φ[x/m], ∃xφ an L′-sentence} .

Then T′ has the Henkin property. Moreover M′ |= T′, where M′ is the L′-structure

M′ = ⟨M, Ri^M : i ∈ I, fj^M : j ∈ J, ck^M : k ∈ K, em^{M′} : m ∈ M⟩

with em^{M′} = m for all m ∈ M.

Definition 4.6. Let L be a first order signature, and T be an L-theory consisting of sentences. Let L′ ⊇ L be another first order signature, and T′ be an L′-theory also consisting of L′-sentences.

• T′ is a consistent extension of T over L if φ ∈ T entails φ ∈ T′ for all L-sentences φ.

• T′ is a conservative extension of T over L if φ ∈ T if and only if φ ∈ T′ for all L-sentences φ.

Exercise 4.4 shows that TN′ is a conservative extension of TN with the Henkin property. Proposition 4.5 generalizes this result to the theory of any L-structure. One of the main ingredients in the proof of the compactness theorem essentially amounts to proving a converse of 4.5, i.e. to showing that complete finitely satisfiable theories with the Henkin property have a model. Towards this aim it is important to give a different characterization of complete theories which is based on a maximality property with respect to finite satisfiability.

Definition 4.7. A theory T is maximally satisfiable if it is finitely satisfiable and there is no finitely satisfiable theory S properly containing T.

Fact 4.8. Assume T is maximally satisfiable. Then:

(a) for all sentences φ exactly one among φ and ¬φ is in T,

(b) for all φ1, . . . , φn ∈ T and all φ such that φ1, . . . , φn |= φ, we have that φ ∈ T.

Proof.

(a) If for some φ both φ and ¬φ are in T , T is not finitely satisfiable. Assume none of them is in T . Since T is maximally satisfiable we get that T ∪ {φ} and T ∪ {¬φ} are both not finitely satisfiable. Hence there are φ1, . . . , φn, ψ1 . . . , ψm all in T such that {φ1, . . . , φn, φ}, {ψ1, . . . , ψm, ¬φ} are both not satisfiable. Hence also {φ1, . . . , φn, ψ1, . . . , ψm, φ} and {φ1, . . . , φn, ψ1, . . . , ψm, ¬φ} are both not satisfiable.

Now {φ1, . . . , φn, ψ1, . . . , ψm} ⊆ T, hence it is satisfiable (being a finite subset of the finitely satisfiable theory T). Find a model M of ⋀_{i=1}^{n} φi ∧ ⋀_{j=1}^{m} ψj. Then

⋀_{i=1}^{n} φi ∧ ⋀_{j=1}^{m} ψj ≡ (⋀_{i=1}^{n} φi ∧ ⋀_{j=1}^{m} ψj) ∧ (φ ∨ ¬φ) ≡ (⋀_{i=1}^{n} φi ∧ ⋀_{j=1}^{m} ψj ∧ φ) ∨ (⋀_{i=1}^{n} φi ∧ ⋀_{j=1}^{m} ψj ∧ ¬φ).

Therefore

M |= (⋀_{i=1}^{n} φi ∧ ⋀_{j=1}^{m} ψj ∧ φ) ∨ (⋀_{i=1}^{n} φi ∧ ⋀_{j=1}^{m} ψj ∧ ¬φ).

This gives that either

M |= ⋀_{i=1}^{n} φi ∧ ⋀_{j=1}^{m} ψj ∧ φ

or

M |= ⋀_{i=1}^{n} φi ∧ ⋀_{j=1}^{m} ψj ∧ ¬φ.

In one case we contradict the unsatisfiability of

{φ1, . . . , φn, ψ1, . . . , ψm, φ},

in the other that of

{φ1, . . . , φn, ψ1, . . . , ψm, ¬φ}.

(b) Assume φ1, . . . , φn ∈ T and φ are such that φ1, . . . , φn |= φ. If φ ∉ T, then ¬φ ∈ T by (a); but φ1, . . . , φn |= φ if and only if {φ1, . . . , φn, ¬φ} is not satisfiable (see Fact 3.48). Then T is not finitely satisfiable, a contradiction.

We will not need the following remark in the remainder of the proof, but it is worth stating.

Remark 4.9. Assume T is a maximally satisfiable L-theory consisting of L-sentences. Then any finitely satisfiable consistent extension of T over L is also a finitely satisfiable conservative extension.

The first key lemma is the following:

Lemma 4.10 (Henkin's lemma). Let L be a first order signature and T be a finitely satisfiable L-theory. There is a language L′ = L ∪ {el : l ∈ L} containing just new constant symbols, and a maximally satisfiable L′-theory T′ which has the Henkin property and is a consistent extension of T over L.

The second key lemma is the following characterization of the theories of the form TM:

Lemma 4.11. Assume T is a finitely satisfiable L-theory which has the Henkin property. The following are equivalent:

• T is maximally satisfiable.

• T = TM for some L-structure M.

Assume the two lemmas have been proved. The proof of the compactness theorem is given by the following argument:

Proof. Let T be a finitely satisfiable L-theory with L = {Ri : i ∈ I, fj : j ∈ J, ck : k ∈ K}. By Lemma 4.10 there is a signature L′ = L ∪ {el : l ∈ L} containing the new constant symbols {el : l ∈ L} and a maximally satisfiable L′-theory T′ which is a consistent extension of T and has the Henkin property. By Lemma 4.11, T′ = TN for some L′-structure

N = ⟨M, Ri^N : i ∈ I, fj^N : j ∈ J, ck^N : k ∈ K, el^N : l ∈ L⟩,

hence T′ is satisfiable. It is clear that (see Fact 2.65)

M = ⟨M, Ri^N : i ∈ I, fj^N : j ∈ J, ck^N : k ∈ K⟩ |= T.

We are left with the proof of the two key lemmas.

Proof of Lemma 4.11

Proof. Let L = {Ri : i ∈ I, fj : j ∈ J, ck : k ∈ K}. If T = TM for some L-structure M, we easily get that T is maximally satisfiable (useful exercise for the reader). Therefore we are left with the converse implication. We prove it. Let CTL be the set of closed L-terms t (i.e. those L-terms which do not have variable symbols occurring in them and whose interpretation does not depend on the assignment of a value to free variables). Define the relation ≈T on CTL by t ≈T s if and only if (t = s) ∈ T .

Claim 4.11.1. ≈T has the following properties:

(a) it is an equivalence relation on CTL;

(b) for all terms t ∈ CTL there is some constant symbol c such that c ≈T t;

(c) for all t1, . . . , tn ∈ CTL and any relation symbol R in L, either R(t1, . . . , tn) ∈ T or ¬R(t1, . . . , tn) ∈ T;

(d) for any relation symbol R in L of arity n, and all t1, . . . , tn, s1, . . . , sn ∈ CTL such that si ≈T ti for all i = 1, . . . , n,

R(t1, . . . , tn) ∈ T if and only if R(s1, . . . , sn) ∈ T;

(e) for any function symbol f in L of arity n, and all t1, . . . , tn, s1, . . . , sn ∈ CTL such that si ≈T ti for all i = 1, . . . , n,

(f(t1, . . . , tn) = f(s1, . . . , sn)) ∈ T.

Proof.

(a) Assume (t1 = t2), (t2 = t3) ∈ T; since t1 = t2, t2 = t3 |= t1 = t3, by Fact 4.8(b) we get that t1 = t3 ∈ T as well. Similarly we can check also the reflexivity and symmetry of ≈T.

(b) Let t be any closed L-term. Remark that t = t |= ∃x(x = t) for any closed term t ∈ CTL. Hence ∃x(x = t) ∈ T (t = t being a logical truth, hence an element of T by Fact 4.8(b)). By the Henkin property of T, we get that there is some constant symbol c such that (c = t) ∈ T, hence c ≈T t.

(c) By Fact 4.8(a), at least one among R(t1, . . . , tn) and ¬R(t1, . . . , tn) is in T.

(d) By Fact 4.8(b), since

t1 = s1, . . . , tn = sn,R(t1, . . . , tn) |= R(s1, . . . , sn),

R(t1, . . . , tn) ∈ T implies R(s1, . . . , sn) ∈ T . Similarly we can prove the converse implication.

(e) By Fact 4.8(b), since t1 = s1, . . . , tn = sn |= f(t1, . . . , tn) = f(s1, . . . , sn),

f(t1, . . . , tn) = f(s1, . . . , sn) ∈ T.

Let MT = CTL/≈T and let [t] denote the equivalence class in MT of t ∈ CTL. Set, for any function symbol f of arity n,

f∗ : MT^n → MT
⟨[t1], . . . , [tn]⟩ ↦ [f(t1, . . . , tn)],

and, for any relation symbol R of arity n,

⟨[t1], . . . , [tn]⟩ ∈ R∗ if and only if R(t1, . . . , tn) ∈ T.

Claim 4.11.2. ⟨MT, Ri∗ : i ∈ I, fj∗ : j ∈ J, [ck] : k ∈ K⟩ is an L-structure.

Proof. By Claim 4.11.1, the functions fj∗ are well defined for all j ∈ J, and Ri∗ is a well defined ni-ary relation on MT for all i ∈ I.

Claim 4.11.3. Let

M = ⟨MT, Ri∗ : i ∈ I, fj∗ : j ∈ J, [ck] : k ∈ K⟩.

Then, for every L-formula φ(x1, . . . , xn) and all constant symbols c1, . . . , cn of L,

M |= φ(x1, . . . , xn)[x1/[c1], . . . , xn/[cn]]

if and only if

φ⟦x1/c1, . . . , xn/cn⟧ ∈ T.

By the Claim we get that T = TM, concluding the proof. We prove the Claim.

Proof. Let φ be an atomic formula of type R(t1(x1, . . . , xm), . . . , tn(x1, . . . , xm)). Then

M |= R(t1(x1, . . . , xm), . . . , tn(x1, . . . , xm))[xi/[ci] : i = 1, . . . , m]

if and only if (since the closed terms t1[x1/c1, . . . , xm/cm], . . . , tn[x1/c1, . . . , xm/cm] get the same evaluation as the terms t1(x1, . . . , xm), . . . , tn(x1, . . . , xm) when each xi is assigned the value [ci] for i = 1, . . . , m, see Fact 2.51)

M |= R(t1[x1/c1, . . . , xm/cm], . . . , tn[x1/c1, . . . , xm/cm])

if and only if

⟨[t1[x1/c1, . . . , xm/cm]], . . . , [tn[x1/c1, . . . , xm/cm]]⟩ ∈ R∗

if and only if

R(t1[x1/c1, . . . , xm/cm], . . . , tn[x1/c1, . . . , xm/cm]) ∈ T.

Hence the Claim holds for atomic formulae. Now we proceed by induction on the logical complexity of φ. Assume the Claim holds for φ, ψ. We handle each logical operator as follows:

Case for φ ∧ ψ:

M |= (φ ∧ ψ)[xi/[ci] : i = 1, . . . , m]

if and only if

(M |= φ[xi/[ci] : i = 1, . . . , m] and M |= ψ[xi/[ci] : i = 1, . . . , m])

if and only if (by the inductive assumptions on φ, ψ)

(φ⟦x1/c1, . . . , xm/cm⟧ ∈ T and ψ⟦x1/c1, . . . , xm/cm⟧ ∈ T)

if and only if (by Fact 4.8(b))

(φ ∧ ψ)⟦x1/c1, . . . , xm/cm⟧ ∈ T.

Case for ¬φ:

M |= ¬φ[xi/[ci] : i = 1, . . . , m]

if and only if

M ⊭ φ[xi/[ci] : i = 1, . . . , m]

if and only if (by the inductive assumption on φ)

φ⟦x1/c1, . . . , xm/cm⟧ ∉ T

if and only if (by Fact 4.8(a))

(¬φ)⟦x1/c1, . . . , xm/cm⟧ ∈ T.

Case for ∃xφ:

M |= ∃xφ[xi/[ci] : i = 1, . . . , m]

if and only if

for some [c] ∈ MT, M |= φ[x/[c], xi/[ci] : i = 1, . . . , m]

if and only if (by the inductive assumption on φ)

for some [c] ∈ MT, φ⟦x/c, x1/c1, . . . , xm/cm⟧ ∈ T

if and only if (for ⇐ use the Henkin property of T, for ⇒ use Fact 4.8(b))

(∃xφ)⟦x1/c1, . . . , xm/cm⟧ ∈ T.

By Fact 2.48 this suffices.

The Lemma is proved in all its parts.

Proof of Lemma 4.10

The proof of this lemma uses a recursive procedure to define theories Tn and languages Ln such that:

• T0 = T.

• L0 = L.

• At odd stages 2n + 1 we extend the Ln-theory T2n to a maximally satisfiable Ln-theory T2n+1.

• At even stages 2n + 2 we extend Ln to

Ln+1 = Ln ∪ {e^n_φ : φ an Ln-formula in at most one free variable},

and the maximally satisfiable Ln-theory T2n+1 to an Ln+1-theory T2n+2 ⊇ T2n+1 which:

– has the Henkin property with respect to all existential formulae in T2n+1 (i.e. for all Ln-formulae φ in at most one free variable x such that T2n+1 |= ∃xφ, there is e^n_φ such that φ⟦x/e^n_φ⟧ ∈ T2n+2),

– is finitely satisfiable,

– is a consistent extension of T2n+1.

Assume this construction can be carried out. Then the following holds:

Claim 4.11.4. Let L′ = ⋃_{n∈N} Ln and T′ = ⋃_{n∈N} Tn. Then T′ is a maximally satisfiable L′-theory with the Henkin property which is a consistent extension of T.

Proof.

T′ is a consistent extension of T: clear, since T = T0 ⊆ T′.

T′ is finitely satisfiable: any finite set of formulae Γ ⊆ T′ is contained in some T2n, which is finitely satisfiable; hence Γ has a model M, an Ln-structure with domain M. We can interpret the remaining constants of L′ \ Ln in M as we like, so as to extend M to an L′-structure which is still a model of Γ.

T′ has the Henkin property: any sentence in T′ appears in some Tn, hence at the even stages existential witnesses for it have been added.

T′ is maximally satisfiable: if S ⊋ T′, find ψ ∈ S \ T′. Then ψ is an Ln-sentence for some n; since T2n+1 is a maximally satisfiable Ln-theory and ψ ∉ T′ ⊇ T2n+1, by Fact 4.8(a) we get ¬ψ ∈ T2n+1 ⊆ T′. This gives that ψ, ¬ψ ∈ S, therefore S is not finitely satisfiable.

To complete the proof of Lemma 4.10 we must show that the above construction can be performed. The following proposition handles the odd stages of the construction:

Proposition 4.12. Let L be a first order signature and T be a finitely satisfiable L-theory. Then T can be extended to a maximally satisfiable L-theory T′.

We give two proofs of this proposition.

First proof.

Proof. Let A = {S ⊇ T : S is a finitely satisfiable L-theory}.

Exercise 4.13. Show that whenever F ⊆ A is such that (F, ⊆) is a linear order, ⋃F ∈ A.

By Zorn's lemma A has a maximal element T′. Clearly T′ ⊇ T is maximally satisfiable.

Second proof.

Proof. Let T be a finitely satisfiable theory over the language L. Consider the equivalence relation on the set of L-sentences given by logical equivalence. Let [φ] denote the equivalence class of an L-sentence φ. Let BL be the

{[φ]: φ an L-sentence} with boolean operations [φ] ∧ [ψ] = [φ ∧ ψ], [φ] ∨ [ψ] = [φ ∨ ψ], ¬[φ] = [¬φ].

Exercise 4.14. Check that BL with boolean operations defined above is a boolean algebra with top element [φ ∨ ¬φ] and bottom element [φ ∧ ¬φ] for some (any) φ.

Check also that [φ] ≤BL [ψ] if and only if φ |= ψ.

Exercise 4.15. For any set S of L-sentences, let IS = {[φ]: S |= ¬φ}. Prove that IS is a proper ideal on BL if and only if S is finitely satisfiable.

By the Prime Ideal Theorem [2, Thm 2.2.30], let I ⊇ IT be a prime ideal on BL. Let T′ = {φ : [¬φ] ∈ I}.

Claim 4.15.1. T′ ⊇ T is a maximally satisfiable theory.

Proof. Clearly T′ ⊇ T.

T′ is finitely satisfiable: if not, there are φ1, . . . , φn ∈ T′ such that φ1, . . . , φn |= φ ∧ ¬φ. Then [¬φ1], . . . , [¬φn] ∈ I, and [¬(φ ∧ ¬φ)] = [¬(φ1 ∧ · · · ∧ φn)] = [¬φ1] ∨ · · · ∨ [¬φn] ∈ I; since [¬(φ ∧ ¬φ)] is the top element of BL, this contradicts that I is a proper ideal.

T′ is maximally satisfiable: assume S ⊋ T′ is finitely satisfiable with φ ∈ S \ T′. Then IS = {[ψ] : S |= ¬ψ} is a proper ideal strictly containing I, since [¬φ] ∈ IS \ I. This contradicts the maximality of I (on a boolean algebra prime ideals are exactly the maximal proper ideals).

The Claim is proved.

The Proposition is proved.
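The odd stages can be visualized in the propositional setting, where finite satisfiability is decidable by truth tables. The sketch below (an ad hoc illustration over a finite list of candidate sentences, not a proof) adds each candidate or its negation while preserving satisfiability, mirroring the maximality argument above:

```python
from itertools import product

# Propositional stand-in for the extension step: sentences are Python predicates
# on a truth assignment, and satisfiability of a finite set is checked by brute force.
def satisfiable(sentences, variables):
    return any(all(s(v) for s in sentences)
               for v in (dict(zip(variables, bits))
                         for bits in product([False, True], repeat=len(variables))))

def extend_maximally(T, candidates, variables):
    T = list(T)
    for phi in candidates:                            # enumerate the sentences one by one
        if satisfiable(T + [phi], variables):
            T.append(phi)                             # keep phi if satisfiability is preserved
        else:
            T.append(lambda v, phi=phi: not phi(v))   # otherwise keep its negation
    return T

A = lambda v: v["A"]
B = lambda v: v["B"]
T = [lambda v: A(v) or B(v)]                          # the theory {A or B}
extended = extend_maximally(T, [A, B, lambda v: A(v) and B(v)], ["A", "B"])
print(satisfiable(extended, ["A", "B"]))              # True: the extension is still satisfiable
```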

The following proposition handles the even stages of the recursive construction.

Proposition 4.16. Let L be a first order signature and T be a finitely satisfiable L-theory. Let

L′ = L ∪ {eφ : φ an L-formula in at most one free variable x},

with C′ = {eφ : φ an L-formula in at most one free variable x} a new set of constant symbols disjoint from L. Define

T′ = T ∪ {(∃xφ) → (φ⟦x/eφ⟧) : eφ ∈ C′} .

Then:

• T′ has the Henkin property with respect to all existential L-sentences (i.e. for all L-sentences ∃xφ such that T′ |= ∃xφ, we have T′ |= φ⟦x/eφ⟧),

• T′ is finitely satisfiable,

• T′ is a consistent extension of T.

Proof. The only delicate point is to check that T′ is finitely satisfiable. Given a finite subset Γ of T′, find a model M of

∆ = Γ ∩ T

with domain M (M exists by the finite satisfiability of T). For each L-formula φ in at most one free variable x such that M |= ∃xφ, let aφ ∈ M be such that M |= φ[x/aφ]. Consider the L′-structure M∗ with domain M, with the interpretation of all symbols in L equal to the interpretation of these symbols in M, and with the evaluation of the new constant symbols eφ ∈ C′ given by the requirement that

M∗ • eφ = a for some fixed a ∈ M otherwise. Then M∗ |= Γ: On the one hand it models ∆. On the other hand M∗ models all 0 sentences (∃xφ) → (φ x/eφ ) for all eφ ∈ C , since : J K • either M∗ 6|= ∃xφ (since M 6|= ∃xφ),

∗ M0 • or M |= φ x/eφ (since M |= φ[x/aφ] and eφ = aφ). J K Since Γ was chosen arbitrarily among the finite subsets of T 0, we get that T 0 is finitely satisfiable.

By these two propositions we can perform the construction sketched at the be- ginning of this section and complete the proof of Lemma 4.10.

78 Sketch of proof for the completeness theorem A proof of the completeness theorem can go as follows:

• A L-theory T is consistent if T 6` φ ∧ ¬φ.

• A L-theory T is maximally consistent if no S ⊇ T is consistent.

In all the results and definitions of the previous section replace finitely satisfiable with consistent and maximally satisfiable with maximally consistent, in most ar- guments the semantic notion of logical consequence |= must be replaced with the syntactic notion of derivability `. Check that all proofs of the relevant claims go through mutatis-mutandis. This gives that T is satisfiable if T is consistent i.e. the completeness theorem.

5 First order logic and set theory

We give some brief comments on how first order logic can be developed inside set theory. First of all L-formulae and L-terms are strings over the alphabet L ∪ Symb where

Symb = {∧, ∨, →, (, ), ∀, ∃, ¬} ∪ {,} ∪ {xn : n ∈ N}.

Throughout these notes (with the exception of the preceding section) we just focused on countable (or even finite) languages. We may assume that a first order signature L is identified with a subset of N × {0, 1, 2, 3} × N, with the convention that a triple of type (n, 0, m) is a relation symbol of arity m, a triple of type (n, 1, m) is a function symbol of arity m, a triple of type (n, 2, m) is a constant symbol, and a triple of type (n, 3, m) is a symbol in Symb.

• Strings over the alphabet L ∪ Symb correspond to the elements of the set (L ∪ Symb)^{<ω} of finite sequences from L ∪ Symb.

• L-formulae and L-terms can be recognized inside (L ∪ Symb)^{<ω} as the strings built according to the formation rules for terms and formulae.

• Finitely satisfiable theories form a proper subset of P((L ∪ Symb)^{<ω}).

• An L-structure with domain M can be seen as a function M with domain {∗} ∪ L which assigns to ∗ the domain M, to each relation symbol of arity n a subset of M^n, to each function symbol of arity n a function M^n → M, and to each constant symbol an element of M.

• Over any set M one can define an L-structure having that set as domain. Hence it is easy to check that the family of L-structures is always a proper class.

All the results presented in the previous sections can be formulated as set theoretical statements asserting that certain sets and classes exist and have certain properties.
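As an illustration of this coding (the particular triples chosen for the symbols below are an arbitrary instance of the convention, not a canonical assignment):

```python
# Symbols are triples of natural numbers following the convention above;
# strings over L ∪ Symb are finite sequences, here Python tuples.
REL, FUN, CONST, LOGIC = 0, 1, 2, 3

L = {"R": (0, REL, 2),        # a binary relation symbol
     "f": (0, FUN, 1),        # a unary function symbol
     "c": (0, CONST, 0)}      # a constant symbol
Symb = {s: (n, LOGIC, 0)
        for n, s in enumerate(["∧", "∨", "→", "(", ")", "∀", "∃", "¬", ",", "x0"])}

def encode(string_of_symbols):
    """Turn a list of symbol names into a finite sequence over L ∪ Symb."""
    table = {**L, **Symb}
    return tuple(table[s] for s in string_of_symbols)

# the (unparsed) string "R ( f ( c ) , x0 )" as an element of (L ∪ Symb)^{<ω}
print(encode(["R", "(", "f", "(", "c", ")", ",", "x0", ")"]))
```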

References

[1] Alessandro Andretta, Elementi di logica matematica, 2014.
[2] Matteo Viale, Notes on forcing, 2017.
